Please post comments or questions for the papers in Week 2 here by the end of the day Sunday (Jan. 29th).
4 thoughts on “Week 2 comments”
Alex
Regarding Pater (to appear):
One of the most important prerequisites for the emergence of systemic simplicity in a constraint-based framework seems to be the presence of general constraints, that is, constraints stated over entire classes of categories. An interesting follow-up question is why learners should postulate such constraints, supposing that constraints are learned rather than innate.
If constraints are learned, it seems possible to represent phonological patterns with very specific constraints alone, without postulating broad, general ones. Hence, there must be some independent factor motivating the learning of general rather than specific constraints. Such a mechanism is built into the MaxEnt learner used by Hayes & Wilson (2008).
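To see the point concretely, here is a toy MaxEnt sketch (not Hayes & Wilson's implementation; the forms, constraint names, and weights are invented for illustration): a single general constraint over the voiced-stop class reproduces exactly what three specific constraints do with equal weights, so descriptive adequacy alone never forces the general constraint, and the pressure toward it has to come from the learner.

```python
import math

# Toy MaxEnt grammar: P(form) is proportional to exp(harmony), where
# harmony = -(weighted sum of constraint violations).
# Forms, constraints, and weights are invented for illustration.

forms = ["pa", "ta", "ka", "ba", "da", "ga"]

def violations_specific(form):
    # Three specific constraints: *b, *d, *g
    return {"*b": form.startswith("b"),
            "*d": form.startswith("d"),
            "*g": form.startswith("g")}

def violations_general(form):
    # One general constraint over the whole voiced-stop class
    return {"*VoicedStop": form[0] in "bdg"}

def maxent_probs(weights, viol_fn):
    harmony = {f: -sum(weights[c] * int(v) for c, v in viol_fn(f).items())
               for f in forms}
    z = sum(math.exp(h) for h in harmony.values())
    return {f: math.exp(h) / z for f, h in harmony.items()}

# Equal weights on the three specific constraints...
p_specific = maxent_probs({"*b": 2.0, "*d": 2.0, "*g": 2.0}, violations_specific)
# ...are reproduced exactly by one general constraint with the same weight.
p_general = maxent_probs({"*VoicedStop": 2.0}, violations_general)

for f in forms:
    print(f, round(p_specific[f], 3), round(p_general[f], 3))
```

The two columns come out identical, which is just the point: the general constraint buys the same distribution with one parameter instead of three, so a preference for generality must be built into constraint selection.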
However, other work on inductive generalization in language has argued that generalization must be minimal instead of maximal (for instance, Albright & Hayes’ Minimal Generalization Learner). This work is not couched in constraint-based terms. Is the rule-based notion of minimal generalization in conflict with the constraint-based notion of maximal generalization mentioned above, or are these two notions basically independent? If they are not independent, how can we give both of them a place in a balanced model?
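For concreteness, here is a toy sketch of the minimal-generalization idea (only in the spirit of Albright & Hayes' learner, not their implementation; the feature specifications are invented): two attested contexts are collapsed to exactly their shared feature values, yielding the smallest covering natural class rather than a jump to the broadest one.

```python
# Toy minimal generalization: collapse two attested rule contexts to
# the feature values they share. Feature specifications are invented.

context_p = {"consonantal": True, "voice": False, "place": "labial"}   # e.g. /p/
context_t = {"consonantal": True, "voice": False, "place": "coronal"}  # e.g. /t/

def minimal_generalization(a, b):
    """Keep exactly the shared feature values; drop conflicting ones."""
    return {feat: val for feat, val in a.items() if b.get(feat) == val}

print(minimal_generalization(context_p, context_t))
# {'consonantal': True, 'voice': False} -> voiceless consonants,
# not "all consonants" and not "all segments"
```

A maximal strategy would instead posit the broadest class compatible with the data right away; whether the two pressures can coexist in one model is exactly the question above.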
In general, what does research in non-linguistic psychology have to say about the nature of human generalization – is it rather “minimal” or “maximal”, and in which sense?
I am wondering about the experiment described in section 5. Applied to the question of language change, would it imply that there is a tendency to simplify one’s phonological contrasts when possible? Is there evidence for that?
My first question is about the initial state assumed on p. 5. Does this assumption have any effect on the bdg learner's advantage? I haven't been able to think of a reason it would, but I'm not sure.
Also, what's the status of the stress constraints in (8b)? They don't enjoy the same initial-state bias as the realization constraints for the first simulation.
My second question is about the lack of a general faithfulness constraint in the first simulation. Would including Ident-Voi have any effect?
For simulation #1 (ptg–bdg), what happens when you have four places of articulation with two voiced stops in the mismatch language? So you have [pi, ti, ki, qi, pu, tu, gu, Gu] vs. [pi, ti, ki, qi, bu, du, gu, Gu]. Do you get the same results, but with a smaller difference between the languages?
The first simulation's results seem to depend on random sampling and an imbalance in the ratio of voiced to voiceless consonants, as well as on the constraint set and update rule. If sampling in the mismatch language were weighted by usage frequency and the voiced [gu] had a very high weight (because it happened to be frequent), or if there were many more [g]-initial words than voiceless-initial words, then the learning rate could be the same, or could even favor [p, t, k, g] over [p, t, k, b, d, g].
Right?
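To make the question concrete, here is a much-simplified sketch of frequency-weighted sampling in an error-driven HG learner with a perceptron-style update. Everything in it is a stand-in assumption rather than the paper's actual simulation: the two-constraint set (one faithfulness, one markedness constraint), the markedness-high initial weights, the learning rate, the lexicon, and the token frequencies.

```python
import random

# Simplified error-driven HG learner with a perceptron-style update.
# Constraint set, initial weights, learning rate, lexicon, and token
# frequencies are all invented for this sketch.

VOICE = dict(zip("ptkq", "bdgG"))
DEVOICE = {v: k for k, v in VOICE.items()}

def flip(form):
    """Candidate with opposite voicing on the initial stop."""
    c = form[0]
    return (VOICE.get(c) or DEVOICE[c]) + form[1:]

def viols(inp, out):
    """Violation counts: [Ident(voice), *VoicedObstruent]."""
    return [int(inp[0] != out[0]), int(out[0] in "bdgG")]

def train(lexicon, freqs, rate=0.1, trials=2000, seed=1):
    random.seed(seed)
    w = [0.0, 10.0]          # markedness-high initial state (an assumption)
    last_error = 0
    for t in range(1, trials + 1):
        winner = random.choices(lexicon, weights=freqs)[0]  # frequency-weighted sampling
        cands = [winner, flip(winner)]
        # The learner's current optimum: the highest-harmony candidate
        loser = max(cands, key=lambda c: -sum(wi * v
                    for wi, v in zip(w, viols(winner, c))))
        if loser != winner:                                 # error-driven update
            last_error = t
            lv, wv = viols(winner, loser), viols(winner, winner)
            w = [wi + rate * (l - v) for wi, l, v in zip(w, lv, wv)]
    return w, last_error

ptg = ["pi", "ti", "gu"]             # stand-in mismatch lexicon
print(train(ptg, [1, 1, 1]))         # uniform sampling
print(train(ptg, [1, 1, 10]))        # the lone voiced form is very frequent
```

On this toy setup, making [gu] very frequent moves the last error-driven update much earlier in training, which is the sense in which token frequency could offset (or even reverse) the mismatch language's disadvantage.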
The loss and/or gain of lexical items (perhaps through borrowing) could then influence the phonological grammar.
If this is true and frequency (whether type or token) does matter, I think it speaks against a Principles and Parameters approach, in which a grammatical parameter can be set by a single input token even given very sparse (and imperfect) input. There, the superset grammar is assumed by the learner until they encounter a token that can only be analyzed with the subset grammar, at which point the parameter is set to the subset grammar.