Rumelhart and McClelland 1986

Please post comments here.

5 thoughts on “Rumelhart and McClelland 1986”

Joe Pater (Post author) November 1, 2011 at 1:16 pm

From Robert:

I wonder about how they establish distinct epochs for training/testing in which words of a certain frequency are simply present or absent. Would we expect the same effects if we actually sampled according to the words’ frequency distributions? Was there any reason (besides maybe something practical about computational power) to do things this way?
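The contrast raised here can be sketched in a few lines of Python. This is only an illustration of the two training regimes, not the authors' procedure: the verbs, the frequency counts, the threshold, and the function names `staged_schedule` and `sampled_schedule` are all hypothetical.

```python
import random

# Hypothetical token frequencies, made up for illustration only.
FREQ = {"go": 50, "come": 30, "walk": 5, "jump": 3, "cling": 1}

def staged_schedule(freq, threshold, early_epochs, late_epochs):
    """Words are simply present or absent: high-frequency verbs first, then all verbs."""
    high = [verb for verb, count in freq.items() if count >= threshold]
    everything = list(freq)
    return [high] * early_epochs + [everything] * late_epochs

def sampled_schedule(freq, epochs, tokens_per_epoch, seed=0):
    """Every epoch draws tokens in proportion to frequency, so rare verbs can appear early."""
    rng = random.Random(seed)
    verbs = list(freq)
    weights = [freq[verb] for verb in verbs]
    return [rng.choices(verbs, weights=weights, k=tokens_per_epoch) for _ in range(epochs)]

staged = staged_schedule(FREQ, threshold=30, early_epochs=2, late_epochs=3)
sampled = sampled_schedule(FREQ, epochs=4, tokens_per_epoch=20)
print(staged[0])  # → ['go', 'come']
```

Under the staged regime the low-frequency verbs are categorically absent from the early epochs; under the sampled regime they are merely rare, which is the difference the question is about.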

Alex November 1, 2011 at 10:24 pm

1. Regarding Wickelphones/Wickelfeatures: this kind of representation seems quite interesting, because it could incorporate coarticulation and thus represent phonetics quite closely in a form that can be manipulated by phonological computation. Has anyone tried to work out a phonological model that makes use of such representations?

2. Rumelhart & McClelland’s model is unidirectional: the present tense is the input and the past tense is the output. Even granting that this is a simplification of what the model is actually intended to do, I’m wondering whether such a unidirectional model, with sound features duplicated at the input and output levels, can be integrated into a larger model of phonology and morphology. Even if inflectional morphology is seen as a mapping from an input (base) to an output (another form in the paradigm), it’s not clear how derivational morphology could be handled this way.

Suppose that we have a language with a suffixing nominal inflectional paradigm (with an accusative marked by adding -u after labials and -i otherwise), but stems can optionally take a prefix “ga-” indicating diminution.

tan “dog”
tan-i “dog.ACC”
rom “cat”
rom-u “cat.ACC”

ga-tan “DIM.dog”
ga-tan-i “DIM.dog.ACC”
ga-rom “DIM.cat”
ga-rom-u “DIM.cat.ACC”

Suppose that there are connections from a nominal base input level to an accusative output level, and connections from the same nominal base input to a diminutive output level.
Then, to make the accusative of a diminutive stem like “ga-rom-”, there should also be connections from the diminutive level to the accusative level. However, these connections need to be completely independent of the connections from the nominal base level to the accusative level. This need not be a problem in all cases, because supervised learning should make it possible to just have two sets of almost identical associations, one in the base-to-accusative network and one in the diminutive-to-accusative network.

However, suppose that accusative formation has lexically idiosyncratic cases. Then, if diminutive formation is reasonably innovative, some idiosyncratic noun bases may never have had a diminutive form. In that case, we would predict that, if an exceptional noun “kam, kam-i” (exceptional because stems ending in a labial normally take -u) got the diminutive form “ga-kam-”, the newly created diminutive would get a regularized accusative “ga-kam-u”. Thus, there will always be loss of lexical exceptionality under prefixation.

However, such loss of exceptionality is not always seen – it seems to be the case that lexical exceptions sometimes do carry over from base to derived form (so that we would get “ga-kam-i” instead of the regularized “ga-kam-u”). This means that the tuning of the base-to-accusative network should have some kind of influence on the tuning of the diminutive-to-accusative network; but this seems completely impossible to model in the kind of connectionist model used by Rumelhart & McClelland.

TL;DR: It seems that a unidirectional connectionist model of morphology would predict that lexical exceptions will always be lost under derivational affixation, which is at odds with what we find in languages.

3. Finally, I remember hearing in my language acquisition class in Leiden that there was this study that argued against U-curve learning of irregular past tenses in English. I don’t remember the details or the reference, but if it’s not true that children first learn irregular forms correctly and then start regularizing them, then the entire argument made here for connectionism as a reasonable model of learning collapses.
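To make the prediction in point 2 concrete, here is a minimal sketch in Python. It is not a connectionist network: each "network" is replaced by a regular rule plus a lookup table of stored exceptions, which is enough to show why two independent mappings cannot share an exception. The set of labials and all function names are assumptions made for the toy language above.

```python
# Assumed set of labial segments for the toy language (hypothetical).
LABIALS = {"p", "b", "m", "f", "v", "w"}

def regular_accusative(stem):
    """Regular pattern: -u after a stem-final labial, -i otherwise."""
    return stem + ("-u" if stem[-1] in LABIALS else "-i")

def diminutive(stem):
    """Diminutive formation: prefix ga-."""
    return "ga-" + stem

# Each mapping stores only the exceptions it was itself trained on.
BASE_EXCEPTIONS = {"kam": "kam-i"}  # learned from the base-to-accusative pairs
DIM_EXCEPTIONS = {}                 # "ga-kam" is new, so nothing has been stored

def accusative(stem, exceptions):
    """Use a stored exceptional form if there is one, else the regular pattern."""
    return exceptions.get(stem, regular_accusative(stem))

print(accusative("rom", BASE_EXCEPTIONS))              # → rom-u
print(accusative("kam", BASE_EXCEPTIONS))              # → kam-i
print(accusative(diminutive("kam"), DIM_EXCEPTIONS))   # → ga-kam-u
```

Because `DIM_EXCEPTIONS` has no access to `BASE_EXCEPTIONS`, the new diminutive is regularized; getting “ga-kam-i” would require some link between the two mappings.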

Lena November 2, 2011 at 3:21 am

I was wondering how this model would capture the Semitic verb paradigm. I guess that the creators might need to think about the notion of the base form, and also to choose a theory of how the verbs are stored in the UR: as a consonantal root + vocalic patterns/templates, or as some more abstract base. I am not sure whether, with the way that the authors encode their data (encoding the phonemes that precede and follow some phoneme), they will be able to capture the transition from ktv to kotev – write/he writes (Hebrew).
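The worry about ktv and kotev can be illustrated with a small Python sketch of Wickelphones, i.e. each phoneme encoded together with its left and right neighbour, with `#` marking the word boundary. It assumes one letter per segment and ignores the further decomposition into Wickelfeatures that the model actually uses; the function name is hypothetical.

```python
def wickelphones(word, boundary="#"):
    """Return the context-sensitive triples (left neighbour, phoneme, right neighbour)."""
    padded = boundary + word + boundary
    return [padded[i - 1:i + 2] for i in range(1, len(padded) - 1)]

root = wickelphones("ktv")
form = wickelphones("kotev")
shared = set(root) & set(form)

print(root)    # → ['#kt', 'ktv', 'tv#']
print(form)    # → ['#ko', 'kot', 'ote', 'tev', 'ev#']
print(shared)  # → set()
```

In this encoding the root and the inflected form have no units in common, because interleaving the vowels changes the neighbours of every consonant.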

The model also seems to imitate the later stages of acquisition, when the children already have divided the stream of sounds into separate morphemes. Out of curiosity, were there attempts to model the earlier stage? Thinking about Semitic morphology again, they would not only have to figure out the whole word, and divide it into a stem & inflectional suffixes (gender, number), but also make the extra step and separate consonants and vowels, which might not be so hard due to the regularities in their patterns. I wonder whether people tried to feed this “problem” to computers.

Minta November 2, 2011 at 11:27 pm

Ok, so the English case is “easy”, in the sense that for each verb base, there is only one past tense form. But some languages are not so “easy”. For example, in Spanish, the past tense inflection depends on the person/number of the subject. In addition, while all past tense forms differ from the present tense by the use of a past-tense suffix, for some verbs, the past tense also differs by a vowel change in the stem (3rd person) or a vowel change and a consonant change in the stem (1st person singular).

So how would this model accommodate languages like Spanish? Can it implement a one (base) to many (past tense forms) mapping? Or does each present tense form serve as the base for its past-tense counterpart of the same person/number? Also, since in at least some cases the suffix carries information about both person/number and tense, does it have to be present in the input? And if so, how would that work?

Minta November 2, 2011 at 11:37 pm

Ignore my previous comment; I think this one is better.

Ok, so the authors suggest (p 257) that Type VIII verbs are hardest to learn because the relationship between past and present tense is the “most idiosyncratic” for these verbs.

I’m trying to figure out whether this statement is contradictory, given that the authors are promoting a model in which rules don’t exist. Type VIII rules are “most idiosyncratic” compared to what? If the learner does not rely on any formally stated rules, then why does it matter that a particular pairing between the base and the past tense doesn’t follow any particular rule?

linguist730-pater's blog
