This talk summarizes several ways in which machine-learning techniques identify regularities within musical corpora, framing them through three analogies with spoken language. First, I show that grouping chords that share structural content and appear in similar contexts creates equivalence classes, which provide a constrained chord vocabulary to which surface events reduce or conform. I liken this vocabulary to linguistic “words.” Second, I show that Hidden Markov Models trained on musical bigrams identify contextual categories within music; I suggest that these groupings are roughly analogous to linguistic parts of speech. Third, I examine the groupings made by these techniques and show that some groups rely primarily on the frequency with which chords occur, while others form categories based on the contextual position of chords relative to these most-frequent harmonies. I relate this dichotomy to the distinction between so-called entity and relationship concepts in cognitive linguistics: the former are defined by their intrinsic properties, and the latter by their relationships to entity concepts.
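
As a rough illustration of the first point (grouping chords that appear in similar contexts), the following minimal sketch builds context vectors from chord bigrams and compares them with cosine similarity. It is not the method used in the talk: the toy corpus, the function names, and the choice of cosine similarity are all assumptions made for illustration, and the sketch ignores shared structural content entirely.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy corpus: each progression is a list of Roman-numeral chords.
corpus = [
    ["I", "IV", "V", "I"],
    ["I", "ii", "V", "I"],
    ["I", "IV", "V7", "I"],
    ["I", "ii", "V7", "I"],
]

def context_vectors(progressions):
    """Map each chord to counts of the chords that precede and follow it."""
    vectors = defaultdict(Counter)
    for prog in progressions:
        # zip pairs each chord with its successor, yielding the bigrams.
        for prev, nxt in zip(prog, prog[1:]):
            vectors[prev][("next", nxt)] += 1
            vectors[nxt][("prev", prev)] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

vectors = context_vectors(corpus)
sim_v_v7 = cosine(vectors["V"], vectors["V7"])  # chords sharing all contexts
sim_v_i = cosine(vectors["V"], vectors["I"])    # chords sharing no contexts
print(f"V vs V7: {sim_v_v7:.2f}, V vs I: {sim_v_i:.2f}")
```

In this toy corpus, V and V7 occur in identical contexts and so would fall into the same equivalence class under any similarity threshold, whereas V and I share no contexts. A real corpus would need a clustering step and some measure of shared chord content on top of this.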