Course Description

Distributional models, or word embeddings, automatically learn representations of word meanings from corpus co-occurrence patterns. They let us view meaning as a semantic space in which similar meanings are located close together, so that word meanings, as they reveal themselves through corpus patterns, can be explored visually as neighborhoods in that space. This course provides a general introduction to distributional models and word embeddings, from simple count-based approaches to large language models. The course includes a hands-on session on how to compute such representations, but its main focus will be on how to interpret and use them. These representations can be viewed as an aggregation of utterances from many speakers, and they can be probed for what co-occurrence patterns say about both lexical semantics and the social contexts of words. We will also discuss a recent method for finding meaningful directions in semantic space, a method that is being used in lexical semantics, cognition, and sociolinguistics.
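
To make the count-based starting point concrete, the sketch below is a minimal, self-contained illustration, not course material: the toy corpus, window size, PPMI weighting, and seed words are all illustrative assumptions. It builds a word-context co-occurrence matrix, finds nearest neighbors by cosine similarity (neighborhoods in space), and projects words onto an axis defined as the difference between seed-word vectors, one common way "meaningful directions" are operationalized.

    # A minimal sketch of a count-based distributional model (illustrative
    # only, not course material): build a word-context co-occurrence matrix
    # from a toy corpus, weight it with PPMI, and explore the resulting
    # semantic space via cosine similarity.
    import numpy as np

    # Hypothetical toy corpus; a real model would use millions of tokens.
    corpus = [
        "the cat chased the mouse".split(),
        "the dog chased the cat".split(),
        "the mouse ate the cheese".split(),
        "the dog ate the bone".split(),
    ]

    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}
    window = 2  # symmetric context window (an illustrative choice)

    # Count how often each target word co-occurs with each context word.
    counts = np.zeros((len(vocab), len(vocab)))
    for sent in corpus:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[idx[w], idx[sent[j]]] += 1

    # Positive pointwise mutual information (PPMI) weighting:
    # ppmi(w, c) = max(0, log P(w, c) / (P(w) P(c)))
    total = counts.sum()
    p_w = counts.sum(axis=1, keepdims=True) / total
    p_c = counts.sum(axis=0, keepdims=True) / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log((counts / total) / (p_w * p_c))
    ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

    def cosine(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

    # Neighborhoods in space: the nearest neighbors of "cat" are the words
    # whose PPMI rows point in the most similar direction.
    target = ppmi[idx["cat"]]
    neighbors = sorted(((cosine(target, ppmi[idx[w]]), w)
                        for w in vocab if w != "cat"), reverse=True)
    print("neighbors of 'cat':", neighbors[:3])

    # "Meaningful directions": one common operationalization is an axis
    # defined by the difference between seed-word vectors. The seed pair
    # below is a hypothetical toy choice, not the specific method taught
    # in this course.
    axis = ppmi[idx["cat"]] - ppmi[idx["cheese"]]
    for w in ["dog", "mouse", "bone"]:
        print(w, round(float(cosine(ppmi[idx[w]], axis)), 3))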

Area Tags: Semantics, Computational Linguistics

(Session 1) Tuesday/Friday 9:00-10:20

Location: ILC S331

Instructor: Katrin Erk

Katrin Erk is a professor in the Department of Linguistics at the University of Texas at Austin. Her research expertise is in computational linguistics, especially semantics. Her work focuses on distributed, flexible approaches to describing word meaning and on combining them with logic-based representations of sentences and other larger structures. At the word level, she studies flexible representations of meaning in context, independent of word sense lists. At the sentence level, she is looking into probabilistic frameworks that can draw weighted inferences from combined logical and distributed representations. She completed her dissertation on tree description languages and ellipsis at Saarland University in 2002.