Please add your Week 2 discussion comment/question here. Week 2 includes the classes of Thursday July 14th and Monday July 18th, whose topics are OT/HG typology with different constraint sets, serial OT/HG, and the introduction to probabilistic models and variation.
The handouts and demo files for these classes (handouts 3 and 4) are now available here.
16 replies on “Week 2 Discussion”
This comment relates to my last one about obvious and non-obvious cumulativity. In her CLS paper, Jesney claims that a particular case of conjunctive licensing (i.e., /badnabad/ -> [bat.na.pat], where distinctive voicing is allowed only in the onset of the initial syllable) cannot be accounted for in OT without positional faithfulness. OT can do this with positional markedness alone when the constraints are formulated and ranked as follows (the constraints included generate a factorial typology consistent with Jesney’s example (3)):
VoiceOnset/Wk, VoiceOnset, *Voice/Wk >> Id-Voice >> *Voice
The difference between this analysis and Jesney’s is that I’ve gotten rid of Voice-Syl1, which requires that every instance of [+voice] on an obstruent be realized in the initial syllable, and I’ve added versions of VoiceOnset ([+voice] is associated to an obstruent in onset position) and *Voice that are relativized to weak (non-initial syllable) positions.
Maybe this is an undesirable translation of positional markedness theory into a constraint-based formalism, but the implication is interesting as a case of non-obvious cumulativity. Descriptively, it’s the case that a marked feature can occur only when two unmarked environments coincide, the initial syllable and the syllable onset. Formally, on the other hand, there is no additive effect. Here we have a mismatch between descriptive and formal cumulativity.
The difference between Jesney’s analysis and the one proposed here seems to stem from the degree of specificity in the constraints. In my last comment I suggested that valuable constraints are those that can be evidenced based on surface-observable facts (at least where additivity is concerned). While it’s true that Jesney’s constraints (VoiceOnset, Voice->Syl1) find typological support, the constraints I propose do the same but they sacrifice a level of generality by referring specifically to strong and weak syllables. I don’t know which option is preferable.
See reply linked here.
The conjunctive /badnabad/ -> [batnapat] (onset and word-initial syllable) can indeed be accounted for in OT with only positional markedness constraints. I’m not sure I understand the difference between this analysis and the one that uses the constraints in the Potts et al. paper. The problem is the disjunctive /badnabad/ -> [badnabat]. Along the lines of your discussion, there is a solution with positional markedness in OT, but it would require a constraint that specifically bans voicing in codas of non-initial syllables. This points up a general issue in comparing HG and OT – we can always mimic HG cumulativity by adding more specific constraints.
The poster that discussed contrast in the weak environment was Purnima Thakur’s, called “Sibilants in Gujarati Phonology”, near the bottom of the page here:
http://www.ling.ohio-state.edu/LSAinfotheory/
To clarify: She shows evidence that some speakers do contrast /s/ and /ʃ/ only before high front vowels in Gujarati, focusing on the speaker variation, not typology.
I’m still a beginner with all this phonological theory and trying to get a handle on what these typological models are saying in terms of cognition.
While the graded scores that HG gives to candidates make more sense to me than OT’s categorical decisions, I continue to be concerned about the way every violation of every constraint affects all of the others. As I said last week, it is fairly easy to be convinced that a limited set of constraints and candidates is relevant to a decision in OT, but that isn’t as obvious to me in HG.
Assuming finite data, is it the case in HG that if a tableau covering a subset of the data picks a winner, we get the same winner in the larger data set once all the rest of the constraints are included?
I suspect a counterexample would be easy to formulate. If so, then what should we do to be sure that enough of the constraints are included?
See reply linked here.
In reply to Jim White.
It’s not clear to me that the situation is any different in OT than in HG here. In either case, missing constraints or candidates can spoil an analysis. Unfortunately, there’s no simple way of ensuring we have all the relevant constraints. John McCarthy’s book Doing OT gives some useful practical pointers on avoiding mistakes of this type.
Can you give me a concrete example of where you think HG is different?
See reply linked here.
I meant to ask this in our last class. When we were talking about serialism, you used a couple of examples: one where we had +—– and another where we had +– (I think), where the final minus was protected. If I remember correctly, in the first case you argued that each of the minuses would become a plus, but in the second this wouldn’t happen because the final one is protected. What I was wondering is why the plus doesn’t become a minus in either case, but especially in the second.
See reply linked here.
In reply to Sam Perdue:
This is in fact a tricky question for the analysis of harmony processes. Generally, when harmony is blocked, either by a co-occurrence constraint or by an exceptional morpheme, we don’t see the spreading feature delink. For example, if underlying + – is +Back -Back, and we generally have progressive (L-R) backness harmony, but in this particular case the -Back feature can’t change, we don’t see the first one becoming -Back. There is a variety of answers. One might have a special faithfulness constraint that protects the first segment (e.g. Root/Stem faithfulness), or one might ban regressive spreading in some other way. This likely relates to what’s called the “too-many-repairs” problem: it may well be that for many kinds of harmony, this never happens (while things like this do seem to occur for ATR harmony, they don’t for backness, for example). If that’s true, blocking this outcome with a rerankable constraint won’t get us the typological facts…
From handout 3: I was a little confused about the reasoning for not using the number of languages as a measure of restrictiveness. You mention that infinite unattested languages in HG and finite unattested languages in OT can both be removed by going to a serial framework. It seems a little odd to refer to a distinct theory (serial HG) to argue for the relative typological responsibility of HG itself. The infiniteness issue (across problems) doesn’t go away entirely in this move, either. It seems more like arguments in favor of ignoring the infinities have to come from elsewhere…
See reply here.
About handout 4: Can we make sense of MaxEnt as a theory of typology divorced from a theory of learning/language change? We can predict which forms can occur with the majority of the probability mass (i.e. the normal HG winners), but is that enough? We have a lot of other variables to contend with. Is there anything we can say about restrictiveness from the model itself?
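To make the "probability mass" point concrete, here is a minimal sketch of how a MaxEnt grammar turns harmony scores into probabilities. The two candidates, the constraints A and B, and their weights are invented for illustration and do not come from the handout.

```python
import math

def maxent_probs(tableau, weights):
    """Map each candidate to its MaxEnt probability.

    tableau: {candidate: {constraint: number of violations}}
    weights: {constraint: non-negative weight}
    Harmony is the negated weighted sum of violations, and a candidate's
    probability is proportional to exp(harmony).
    """
    harmonies = {
        cand: -sum(weights[c] * n for c, n in viols.items())
        for cand, viols in tableau.items()
    }
    z = sum(math.exp(h) for h in harmonies.values())  # normalizing constant
    return {cand: math.exp(h) / z for cand, h in harmonies.items()}

# Hypothetical tableau: "x" violates A once, "y" violates B once.
tableau = {"x": {"A": 1}, "y": {"B": 1}}
weights = {"A": 3.0, "B": 2.0}
probs = maxent_probs(tableau, weights)
```

With these assumed weights, "y" is the HG winner and also receives the larger share of the probability, but "x" keeps a non-zero share, and how large that share is depends on the weight difference rather than on the ordering alone.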
See reply here.
I guess my queasiness really continues to be about the way all constraints interact with all of the rest.
In an OT tableau it is usually not too difficult to convince oneself that a sufficient variety of candidates is being checked and that all of the relevant constraints are in the tableau or have been deemed higher ranked. No matter what new or unknown constraints come along, the analysis should continue to be good, unless a new constraint turns out to need to be ranked above your lowest constraint.
With HG, it is not obvious to me that you can be certain that some lower-weighted constraints not included in the tableau won’t affect the analysis. In the languages where there is an OT solution as well as an HG solution, I can see how it would work out that adding lower-weighted constraints will permit the same analysis, because HG can do anything OT can (for finite data) and the weights just scale up as needed. But I’m less certain about the languages where there is an HG solution but no OT solution.
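One way to make this worry concrete is a toy comparison, sketched below. Everything in it is an assumption made up for illustration: the candidates "x" and "y", the constraints A, B and C, and the weights. It only shows that a constraint left out of an HG tableau can flip the winner even when its weight is the lowest, while adding the same constraint at the bottom of an OT ranking cannot.

```python
def hg_winner(tableau, weights):
    """Return the candidate with the highest harmony (lowest weighted violation sum)."""
    def harmony(cand):
        return -sum(w * tableau[cand].get(c, 0) for c, w in weights.items())
    return max(tableau, key=harmony)

def ot_winner(tableau, ranking):
    """Return the candidate whose violation vector is lexicographically smallest."""
    def profile(cand):
        return tuple(tableau[cand].get(c, 0) for c in ranking)
    return min(tableau, key=profile)

# Hypothetical tableau: "x" violates A once; "y" violates B and C once each.
tableau = {"x": {"A": 1}, "y": {"B": 1, "C": 1}}

# HG with C left out of the analysis, then with C included at the lowest weight.
hg_without_c = hg_winner(tableau, {"A": 3.0, "B": 2.0})
hg_with_c = hg_winner(tableau, {"A": 3.0, "B": 2.0, "C": 1.5})

# OT with C left out, then with C added at the bottom of the ranking.
ot_without_c = ot_winner(tableau, ["A", "B"])
ot_with_c = ot_winner(tableau, ["A", "B", "C"])
```

Here the HG winner changes from "y" to "x" once C is counted, because the violations of B and C gang up against A. The OT winner stays "y" either way. This does not show that OT is safe from missing constraints in general: a missing constraint that belongs above A would change the OT result too.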
I don’t doubt that this question is not an issue for typological theory. This surely has more to do with my interest in seeing a model where cognitive processes or modules are more explicitly represented. Software engineer that I am, my perspective tends to be towards looking for linguistic description or theory that could be used to implement systems that learn languages.
See reply here.
As a beginner to linguistics, I am still trying to figure out how changing to harmonic serialism would help in the case discussed in handout 3. If serialism does help, what are the rules for deciding which step should be processed first in GEN, which second, and so on? Or do we in fact not care about the processing order at all?
See reply linked here.
Another thing is that the last demo in handout 2 did not work in class (if I remember correctly; I should have posted this earlier). I think the problem is in the input files: the input numbers in the two files are not equal. Simply make them equal and it will work.
Thanks!
In reply to Robert Staubs:
About counting numbers of languages: The general point I was trying to make here was that it would be a mistake to simply run the original HG/OT typology comparison, see that HG generated many more languages, and stop there. In many cases, without looking closer, it’s not apparent that there is a problem underlying both typologies, whose solution gets rid of the seemingly problematic OT/HG difference.
About infinity: Is infinity an issue? In an earlier version of the paper that eventually appeared under my name in Cognitive Science, Rajesh Bhatt, Chris Potts and I worked up a proof that HG would generate only a finite number of languages under certain assumptions. This was in response to Legendre et al.’s demonstration that the Align/Weight-to-Stress interaction generated an infinite typology. This is all interesting, but I think it’s worth giving some thought to whether there is a problem to be solved in the first place. In the abstract, is an infinite typology a problem, so long as it gets the right result in the finite space of observable languages?
In reply to Robert Staubs.
About MaxEnt models and restrictiveness: I don’t have any more to say myself yet about what patterns of variation MaxEnt models can and cannot capture: I think the possibility that there is a difference between simple and collective harmonic bounding in terms of the required weights is worth further investigation. As you allude to, I suspect that the real interest of these models in terms of typology will be how their implementations as models of learning can help to explain patterns of change, and as a result, typology.
In reply to Xu Zhao.
Here’s one way of understanding how HS limits HG/OT typological differences. In HS, our candidates are limited to a single change at a time. Therefore, insofar as this single change limits the number of possible differences between the candidates, we also limit the sorts of tradeoffs that we get. So far, it looks like the gang effects that we lose are ones that are unattested.
About the ordering of operations. In its basic form HS does not impose an order: all operations are available to apply at any time in creating the candidate set at each step of the derivation. One might well consider a variant that does force an ordering as a way of dealing with opacity: this is how McCarthy’s OT with Candidate Chains works, for example, and I spent some time thinking about another approach (see slides linked here).
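The single-change idea can be sketched as a small loop. This is only a toy under loud assumptions: forms are strings of + and -, the only GEN operation is rightward spreading of + by one segment, and the constraints Spread, Ident and IdentFinal with their weights are hypothetical stand-ins rather than the analysis from class. Faithfulness is evaluated against the input to each step, as in HS.

```python
def gen(form):
    """Faithful candidate plus every form that is one spreading step away.

    Assumed operation: a '-' immediately after a '+' may become '+'.
    """
    candidates = [form]
    for i in range(1, len(form)):
        if form[i - 1] == "+" and form[i] == "-":
            candidates.append(form[:i] + "+" + form[i + 1:])
    return candidates

def violations(step_input, cand):
    """Violation counts for the three hypothetical constraints."""
    first_plus = cand.find("+")
    return {
        # one mark per '-' that follows the first '+'
        "Spread": cand[first_plus:].count("-") if first_plus != -1 else 0,
        # one mark per segment changed relative to this step's input
        "Ident": sum(a != b for a, b in zip(step_input, cand)),
        # one mark if the final segment changed (the "protected" segment)
        "IdentFinal": int(step_input[-1] != cand[-1]),
    }

def harmony(step_input, cand, weights):
    marks = violations(step_input, cand)
    return -sum(weights[c] * marks[c] for c in weights)

def derive(underlying, weights):
    """Pick the best single-change candidate repeatedly until the faithful one wins."""
    path = [underlying]
    current = underlying
    while True:
        best = max(gen(current), key=lambda c: harmony(current, c, weights))
        if best == current:  # convergence: no single change improves harmony
            return path
        current = best
        path.append(current)

free = derive("+---", {"Spread": 2, "Ident": 1, "IdentFinal": 0})
protected = derive("+---", {"Spread": 2, "Ident": 1, "IdentFinal": 5})
```

With these assumed weights, the first derivation spreads one segment per step until every segment is +, and the second stops one segment short because changing the final segment costs too much. The initial + never becomes - here only because this toy GEN has no operation that would do it, which is one of several possible choices, not a claim about the right analysis.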
In reply to Jim White.
I am extremely interested in explaining typology in terms of explicit representation of linguistic processes, and I’ve been encouraged in working toward that goal by using weighted constraint models.