8 thoughts on “Week 9 comments”

  1. Covadonga Sanchez

    In the article “Serial Harmonic Grammar and Berber Syllabification”, we are given a meticulous analysis of Berber syllabification intended to demonstrate the advantages of Harmonic Grammar. In Harmonic Grammar the constraints are weighted, and in the serial version the representation is changed and re-evaluated iteratively; Serial Harmonic Grammar is thus a serial, weighted-constraint relative of OT that solves some of the problems faced by a parallel approach. One of the positive aspects of the serial approach is derivational “myopia”, a concept introduced by Wilson (2003) that refers to the fact that the analysis cannot look ahead to the final result, so the locally optimal outcome may not be the best one when considered globally. The article shows both the derivation that leads to the optimal candidate under the serial analysis and the choice that a parallel analysis would make: examples (15) through (17) lay out the steps that yield the optimal candidate (K)(šM) from the input /kšm/, whereas (18) shows the candidate (kŠm) that would win if all candidates were considered at once. The serial analysis thus picks the best option at each step of the derivation without considering how the final outcome will globally satisfy the constraints. As the examples in the article show, by doing this the serial analysis succeeds at choosing the correct form in cases where parallel OT fails.
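    To make the “myopia” point concrete, here is a minimal sketch of a stepwise weighted-constraint derivation. The forms, constraints, and weights are invented for illustration (they are not the paper’s Berber examples): at each step only candidates one change away from the current form compete, so the derivation can converge on a form that is not the best one overall.

        # A toy serial derivation with weighted constraints: at each step the
        # current form competes only against candidates one change away, the
        # highest-harmony candidate wins, and the loop stops when no change
        # improves harmony. All names and numbers are hypothetical.

        WEIGHTS = {"C1": 2.0, "C2": 1.0}

        VIOLATIONS = {                         # hypothetical violation profiles
            "start":  {"C1": 2, "C2": 1},
            "left":   {"C1": 2, "C2": 0},      # locally worse first step...
            "right":  {"C1": 1, "C2": 1},      # ...than this one,
            "left+":  {"C1": 0, "C2": 0},      # ...but it leads to the global optimum
            "right+": {"C1": 0, "C2": 2},
        }
        ONE_STEP = {"start": ["left", "right"], "left": ["left+"],
                    "right": ["right+"], "left+": [], "right+": []}

        def harmony(form):
            # Harmony is the negated weighted sum of constraint violations.
            return -sum(WEIGHTS[c] * n for c, n in VIOLATIONS[form].items())

        def derive(form):
            while True:
                best = max([form] + ONE_STEP[form], key=harmony)
                if best == form:               # converged: no single change helps
                    return form
                form = best

        print(derive("start"))   # -> "right+" (harmony -2.0), even though "left+"
                                 #    (harmony 0.0) is the best form overall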

  2. Fiona Dixon

    The purpose of Pater’s (2012) “Serial Harmonic Grammar and Berber Syllabification” is to argue for the benefits of serialism and weighted constraints in OT and HG over Smolensky’s original choice of constraint ranking and parallel evaluation. The analysis of Imdlawn Tashlhiyt Berber (ITB) is the justification for this argument. Parallel OT cannot correctly analyze ITB because its fixed ranking chooses the surface representation by minimizing violations of the highest-ranked constraints, as in the example on p. 6. This does not work for ITB, which seems to require a serial analysis.
    Within this proposal, I am most interested in the use of weighted constraints and the gang effect. I have always been of the opinion that constraints are likely to work together at some point, even when the seemingly highest-ranked constraint would suggest otherwise. I think this view will be very useful for future analyses.
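    To make the gang effect concrete, here is a small worked example with invented constraint names and weights (not the paper’s): two violations of a lower-weighted constraint jointly outweigh a single violation of a higher-weighted one, which is exactly what a strict ranking cannot replicate.

        # A gang effect with hypothetical weights: candidate A violates the heavier
        # constraint once; candidate B violates the lighter constraint twice.
        # Under strict ranking (HEAVY >> LIGHT) B would win; under weighting,
        # the two lighter violations gang up and A wins instead.

        WEIGHTS = {"HEAVY": 3.0, "LIGHT": 2.0}

        CANDIDATES = {
            "A": {"HEAVY": 1, "LIGHT": 0},   # one violation of the heavier constraint
            "B": {"HEAVY": 0, "LIGHT": 2},   # two violations of the lighter constraint
        }

        def harmony(violations):
            return -sum(WEIGHTS[c] * n for c, n in violations.items())

        scores = {name: harmony(v) for name, v in CANDIDATES.items()}
        print(scores)                        # {'A': -3.0, 'B': -4.0}
        print(max(scores, key=scores.get))   # 'A'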

  3. Ethan Poole

    These two articles were beneficial to read because they succinctly highlighted some of the benefits of serialism over parallelism and of numerically weighted constraints over ranked constraints. I will focus on weighted constraints, which I see as having greater descriptive and explanatory adequacy than ranked constraints. As a quick aside, serialism seems preferable to parallelism because the candidate set at each step is finite.

    The descriptive adequacy is illustrated in the article on Berber syllabification. By using serialism and weighted constraints, you can straightforwardly produce the syllabification systems in Berber, English, and French, which parallel and serial OT cannot do without additional constraints which make inaccurate typological predictions. (You might also argue that typological predictions relate to explanatory adequacy as some kind of evaluation metric.)

    The explanatory adequacy is illustrated in the article on variation. The key advantage of weighted constraints is that they lend themselves to quantitative modelling and learnability algorithms more easily than ranked constraints do. Even stochastic OT ends up assigning numeric ranking values to its constraints in order to model learning. Although I am sure other methods exist, all of the work on ranking algorithms that I am familiar with relies on numeric weights (this is, roughly, what Google uses to rank search results). Using weighted constraints allows you to draw on all of this knowledge from computer science.
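    As a purely schematic illustration of why numeric weights plug into standard learning machinery so easily, here is the shape of an error-driven weight update of the kind used in HG learning models (a perceptron-style step; the constraint names, data, and learning rate are invented):

        # When the learner's current winner differs from the observed form, each
        # weight moves by (learner's violations - observed violations) times a
        # learning rate, so constraints violated by the error gain weight and
        # constraints violated by the observed form lose weight.

        def update(weights, observed_viols, learner_viols, rate=0.1):
            new = dict(weights)
            for c in weights:
                new[c] += rate * (learner_viols.get(c, 0) - observed_viols.get(c, 0))
                new[c] = max(new[c], 0.0)    # keep weights non-negative
            return new

        weights  = {"*CODA": 1.0, "PARSE-SEG": 1.0}
        observed = {"*CODA": 1, "PARSE-SEG": 0}   # observed form parses the coda
        learner  = {"*CODA": 0, "PARSE-SEG": 1}   # learner wrongly leaves it unparsed
        print(update(weights, observed, learner))
        # {'*CODA': 0.9, 'PARSE-SEG': 1.1}: the grammar is nudged toward parsing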

  4. Amanda Rysling

    Could we use information from frequency of variation as evidence for the status of certain relations or feature values that we cannot otherwise discern? For example, in the case of variation in ITB word-final sonorants (i.e. coda vs. nucleus), if we find that, e.g., /r/ is a nucleus more frequently than /l/ in the crucial cases, could we interpret this as evidence that, at least for some speakers, /r/ is somehow [slightly] more syllabic than /l/? Could this be possible in some cases, but not others, perhaps dependent on the magnitudes of the frequencies of variants (or something similar)?

  5. Jon Ander Mendia

    Syllabification in Berber seems to be the perfect case for applying a theory with weighted constraints: there is a natural scale (sonority) along which potential nuclei are ordered, and constraint weighting can replicate the intended effect (that is, we can make the constraints favor candidates in either direction of the scale, “up” or “down”). Moreover, the scale is the same for every language, whereas the weights of the constraints vary, and this allows an interesting interplay, mainly with respect to typology. But are there any other “natural” scales for which weighted constraints have proven useful?

    Moreover, for some constraints there are natural limits that the weights should respect (or so I would expect, intuitively). For example, if the constraints we discussed in class, *Nuc-Stop, *Nuc-Fric, and *Nuc-Nasal, are all present, I would expect the weight of *Nuc-Fric to be sandwiched between the weights of *Nuc-Stop and *Nuc-Nasal. I wonder whether that intuitive observation actually holds.
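    A toy version of that check, with invented weights (whether weights fitted to real data come out this way is exactly the empirical question):

        # Verify that an invented weighting respects the sonority scale: *NUC-FRIC
        # sits between *NUC-STOP and *NUC-NASAL, so between any two segments the
        # more sonorous one is always the cheaper nucleus.

        WEIGHTS = {"*NUC-STOP": 5.0, "*NUC-FRIC": 3.0, "*NUC-NASAL": 1.0}
        NUC_CONSTRAINT = {"t": "*NUC-STOP", "s": "*NUC-FRIC", "m": "*NUC-NASAL"}
        SONORITY = {"t": 1, "s": 2, "m": 3}     # schematic sonority ranks

        assert WEIGHTS["*NUC-STOP"] > WEIGHTS["*NUC-FRIC"] > WEIGHTS["*NUC-NASAL"]

        def nucleus_cost(segment):
            return WEIGHTS[NUC_CONSTRAINT[segment]]

        for a in SONORITY:
            for b in SONORITY:
                if SONORITY[a] > SONORITY[b]:
                    assert nucleus_cost(a) < nucleus_cost(b)
        print("weights consistent with the sonority scale")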

  6. Hsin-Lun Huang

    In terms of capturing the relation between general preferences and hard restrictions in syllable structure, serial HG does a better job than parallel OT. With weighted constraints and the “gang effect”, we can successfully account for the preference for selecting certain sonorant consonants as better nuclei in ITB and, at the same time, explain the prohibition against consonantal nuclei in syllabification, as in French. We do not need to break the constraints up into different sets and posit language-specific faithfulness constraints to solve the problem of insufficient typological predictions in parallel OT, as Prince and Smolensky did for the restrictions on nucleus projection in French and English. In this respect, serial HG really seems to me the more elegant theory, because it offers a more general point of view for dealing with syllabification in different languages and for making typological predictions. It seems that the phonological issues of all languages could be accounted for within the framework without resorting to extra mechanisms. Maybe this claim is a bit too strong, but until we run into a language that resists analysis in serial HG, I believe we can have some confidence in it.

    While I am content with the generality and better problem-solving ability of serial HG, I nonetheless have some questions about how the serial derivations work in terms of weighted constraint interactions. On p. 14, in step 1 of the derivation, the creation of either the CV syllable (ra) or (lu) violates no constraints. The paper explains that the full interpretation of *C-NUC would select the candidate with (ra) as the optimum because of the higher sonority of “a”. However, taking the candidate with (lu) as the input for the next step does not seem to influence the possible outcome of the derivation. Also, in steps 3 and 4, the competing candidates tie in their weighted penalties, so we can take either direction in the derivation. Does this mean that, as long as we reach the correct outcome in the end, it does not matter what path we choose in the derivation? If we had a more complex derivation with several tied candidates at one step, would we be safe to make any one of them optimal and still reach the correct outcome?

    Finally, I have some questions about parts of the paper that look to me like mistakes, though it could be a misunderstanding on my part that needs clarification.
    1. On p. 8, tableau (23), shouldn’t the violation count of the first candidate (tæ)(bL) under *C-NUC be 1, since the nucleus of the second syllable is L, a sonorant consonant?
    2. On p. 16, first paragraph, third line, it says that “With our two constraints, the first application of adjunction will beat the fully faithful candidate if and only if *CODA has a greater weight than PARSE-SEG.” Why isn’t it the other way around? Shouldn’t PARSE-SEG have a greater weight than *CODA for the candidate with adjunction to beat the fully faithful candidate? In the case of (a)p vs. (ap), each candidate violates one of the two constraints. If *CODA has a greater weight than PARSE-SEG, we would have (a)p as the optimum, and in that case it is the fully faithful candidate that beats the candidate with adjunction.
    3. Also on p. 16, the last three lines of the last paragraph say that “Parallel HG can also generate the extra languages alluded to above, in which a lower bound is placed on coda size: a coda is formed to parse minimally two, or three segments.” If a coda is formed to parse minimally two segments, shouldn’t the second candidate in the third row be (apt) instead of (ap)t?
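    On the second question, the arithmetic can be checked directly. This little sketch just encodes the two candidates as described above ((ap), with the segment adjoined as a coda, violating *CODA; (a)p, leaving the segment unparsed, violating PARSE-SEG) and shows which one wins under each of the two possible weightings; the weights themselves are arbitrary:

        # Which candidate wins under each weighting of *CODA and PARSE-SEG?
        CANDIDATES = {
            "(ap)": {"*CODA": 1, "PARSE-SEG": 0},   # adjunction: coda formed
            "(a)p": {"*CODA": 0, "PARSE-SEG": 1},   # faithful: segment left unparsed
        }

        def winner(weights):
            def harmony(viols):
                return -sum(weights[c] * n for c, n in viols.items())
            return max(CANDIDATES, key=lambda cand: harmony(CANDIDATES[cand]))

        print(winner({"*CODA": 2.0, "PARSE-SEG": 1.0}))  # -> "(a)p": adjunction loses
        print(winner({"*CODA": 1.0, "PARSE-SEG": 2.0}))  # -> "(ap)": adjunction wins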

  7. Yohei Oseki

    Pater (2012) ‘Serial Harmonic Grammar and Berber Syllabification’ points out some deficiencies of the standard version of Optimality Theory and of Harmonic Serialism, and proposes a new theoretical framework called “Serial Harmonic Grammar”. This theory, unlike its predecessors, is equipped with numerically weighted constraints (rather than strictly ranked ones) and a serial phonological derivation (rather than a parallel one); that is, it inherits from Harmonic Grammar and from serialism the characteristic advantages needed to overcome problems in the previous theories.
    I have one question about the discussion of the so-called “gang effect” (p. 12). To demonstrate the relevance of the gang effect, a phrase-final stop nucleus (32) and a phrase-final sonorant nucleus are compared there. In these cases, the hierarchy of *C-NUC violations is crucial: to show the gang effect, stop consonants must incur three violation marks while sonorant consonants incur just one. But what about fricative consonants? Since fricative consonants get two violation marks, the phrase-final fricative cases end up in a tie. Then, according to the analysis of the tie case on p. 13, this can be interpreted as optionality. So the question is whether this prediction is empirically borne out. I am curious about the facts of Berber: are phrase-final fricative nuclei attested, unattested, or are both variants found?
    Related to this, I have another question. In Pater (2012, p. 13), optional prepausal annexation is analyzed as a tie, with the remark that “Actual theories of variation in HG directly generate a probability distribution over candidates…”. If this assumption is correct, then from a statistical perspective, given that (rat)(lult) has harmony -4 and (rat)(lu)(lT) has harmony -5, forms with a phrase-final stop nucleus CAN appear in Berber alongside forms without one, though their frequency is expected to be lower. I am not sure how to resolve this issue within the Serial Harmonic Grammar framework.
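    One way to see what such a probability distribution could look like is a MaxEnt-style sketch (this is only one of the theories of variation in HG that Pater could have in mind), using the harmonies quoted above:

        # MaxEnt-style probabilities: each candidate's probability is proportional
        # to exp(harmony). The -4 candidate is preferred but the -5 candidate still
        # receives some probability mass; an exact tie comes out 50/50.
        import math

        def maxent_probs(harmonies):
            expd = {cand: math.exp(h) for cand, h in harmonies.items()}
            total = sum(expd.values())
            return {cand: v / total for cand, v in expd.items()}

        print(maxent_probs({"(rat)(lult)": -4, "(rat)(lu)(lT)": -5}))
        # ~{'(rat)(lult)': 0.73, '(rat)(lu)(lT)': 0.27}

        print(maxent_probs({"tied-A": -3, "tied-B": -3}))
        # {'tied-A': 0.5, 'tied-B': 0.5}: a tie predicts free optionality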

  8. Sean Bethard

    Some comments on the generality of faithfulness constraints and lexically conditioned variation.

    That faithfulness constraints are general and symmetric is important to the overall theory of OT since phonological processes emerge from conflicting markedness and faithfulness constraints. It follows then that individuating faithfulness constraints is potentially detrimental to the theory if the result trivializes ranking relations. The goal of an OT analysis then is to meet descriptive needs while maintaining generality of faithfulness constraints.

    Ito and Mester (2003) argue that the differentiation of faithfulness constraints is key to the stratification of the lexicon. Knowledge of lexical stratification is necessary to account for certain idiosyncrasies that arise when a phonological process applies in some situations but not in others. An example is the process of sequential voicing in Japanese, which only affects certain parts of the lexicon. To account for this nonuniformity, they use a model that limits stratal indexation to faithfulness, i.e. a model that inserts faithfulness constraints at specific points in a set of fixed markedness rankings (similar to the situation of harmonic completeness in phonetic inventories).

    Coetzee and Pater (2011) propose a way of dealing with lexically conditioned variation in HG in which faithfulness constraints are specific to individual lexical items. This approach has implications for learning and suggests that the lexicon and the phonological component are highly interactive. But are there types of lexical conditioning that are not easily handled by this theory? And how does an analyst or a learner find the constraint weightings that generate the appropriate patterns?
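    For concreteness, here is one schematic way lexically specific faithfulness could work in a weighted system (an illustration, not Coetzee and Pater’s exact formalization): words indexed to a lexically specific clone of FAITH get its weight added on top of the general FAITH weight, so the same markedness pressure wins against unindexed words but loses against indexed ones.

        # Schematic lexically indexed faithfulness (invented constraints/weights):
        # M is a markedness constraint, FAITH a general faithfulness constraint,
        # and FAITH-L its lexically indexed clone, violable only by indexed words.
        WEIGHTS = {"M": 2.0, "FAITH": 1.5, "FAITH-L": 1.0}

        def harmony(violations):
            return -sum(WEIGHTS[c] * n for c, n in violations.items())

        def winner(indexed):
            faithful   = {"M": 1, "FAITH": 0, "FAITH-L": 0}
            unfaithful = {"M": 0, "FAITH": 1, "FAITH-L": 1 if indexed else 0}
            return "faithful" if harmony(faithful) > harmony(unfaithful) else "unfaithful"

        print(winner(indexed=False))  # -> "unfaithful": M (2.0) beats FAITH (1.5)
        print(winner(indexed=True))   # -> "faithful": FAITH + FAITH-L (2.5) beats M

    In a noisy or MaxEnt version of the same idea, the weight difference would translate into different rates of the process for indexed and unindexed words rather than a categorical split.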

