Category Archives: Computational linguistics

David Smith talk, Monday Nov 18

David Smith (https://www.khoury.northeastern.edu/people/david-smith/) will present “Textual Criticism as Language Modeling: Viral Texts, Networked Authors, and Computational Models of Information Propagation” at 4 pm Monday Nov. 18th in ILC N400. An abstract is below.

This presentation is to a joint meeting of the Initiative for Data Science in the Humanities, and the Data Science tea. If you have any questions, contact Joe Pater at pater@umass.edu. David will be available for half hour meetings from 1 – 3:30 in the Linguistics department – sign up here.

Abstract

The era of mass digitization seems to provide a mountain of source material for scholarship, but its foundations are constantly shifting. Selective archiving and digitization obscures data provenance, metadata fails to capture the presence of texts of mutable genres and uncertain authorship embedded within the archive, and automatic optical character recognition (OCR) transcripts contain word error rates above 30% for even eighteenth-century English. The condition of the mass-digitized text is thus closer to the manuscript sources of an edition than to a scholarly publication. On the computational side, models that treat collections as sets of independent documents fail to capture the processes by which new texts are generated from existing ones.

In this talk, I will discuss several aspects of our work on “speculative bibliography” with computational methods. Starting from a simple model of the composition of historical newspaper pages, with applications to text denoising, I describe models of how texts transform their sources, applied to modern science journalism, medieval Arabic historians, and the generically hybrid forms in nineteenth-century newspapers. I conclude by discussing methods for inferring network structure and mapping information propagation among texts and publications.

This is joint work with Ryan Cordell, Rui Dong, Ansel MacLaughlin, Abby Mullen, Ryan Muther, and Shaobin Xu.

Frank colloquium Friday Oct 11 at 3:30

Bob Frank, Yale University, will present “Inductive Bias in Language Acquisition: UG vs. Deep Learning” in the Linguistics colloquium series at 3:30 Fri. Oct 11. An abstract follows. All are welcome!

Abstract: Generative approaches to language acquisition emphasize the need for language-specific inductive bias, Universal Grammar (UG), to guide learners in the face of limited data. In contrast, computational models of language learning, particularly those rooted in contemporary neural network models, have achieved high levels of performance on practical NLP tasks, largely without the imposition of any such bias. While UG-based approaches have led to important insights into the stages and processes underlying language acquisition, they have not yielded a concrete, mechanistic model of the process by which language is learned. At the same time, practical computational models have not been widely tested with respect to their ability to extract linguistically significant generalizations from training data. As a result the ability of such systems to face the challenges identified in the generative tradition remains unproven. In this talk, I will review several experiments that explore the ability of network models to take on such challenges. Looking at question formation and subject-verb agreement, we find that there is considerable variety in the degree to which network architectures are capable of learning significant grammatical generalizations through gradient descent learning, suggesting that the architectures themselves may be able to impose some of the necessary bias that is often assumed to motivate the need for UG. Inadequacies remain in the generalizations acquired, however, which points to the need for hybrid models that integrate language specific information into network models.

Language and Music Workshop this Sunday May 12th

The UMass Amherst Department of Linguistics and the Department of Music and Dance, with additional support from the Interdisciplinary Studies Institute, will host a Language and Music Workshop on the afternoon of Sunday May 12th. The event will take place from noon until 5:45 in N400 in the Integrative Learning Center. Parking is free in permit lots on Sunday; the ILC is at the top corner of the pond on this map.

There are five invited speakers, and five poster presentations listed below. Please join us for lunch beforehand!

Questions? Please e-mail Joe Pater at pater@umass.edu.

Schedule

Noon – Catered lunch

1:00 Bob Ladd – University of Edinburgh

Two problems in theories of tone-melody matching (Abstract)

1:45 François Dell – Centre de Recherches Linguistiques sur l’Asie Orientale (CRLAO) CNRS / EHESS, Paris

Delivery design: towards a typology (Abstract)

2:30 Laura McPherson – Dartmouth College

Tonal adaptation across musical modality: A comparison of Sambla vocal music and speech surrogates (Abstract)

3:15 Poster session (see below for a list of posters)

4:15 Christopher White – University of Massachusetts Amherst

Analogies with Language in Machine-learned Musical Grammars

5:00 Mara Breen – Mount Holyoke College

The Cat in the Hat: Musical and linguistic metric structure realization in child-directed poetry (Abstract)

5:45 Goodbye.

Posters

Ellie Abrams, Laura Gwilliams, Alec Marantz (NYU, NYU Abu Dhabi)

Tracking the building blocks of pitch perception in auditory cortex (Abstract)

Kyle Marcos Allphin, Smith College ’19

Perception of Emotional Characteristics in Diatonic Chords (Abstract)

Ahren B. Fitzroy (Mount Holyoke College, University of Massachusetts, Amherst) and Mara Breen (Mount Holyoke College)

Implicit metric structure in aprosodic productions of The Cat in the Hat modulates auditory processing (Abstract)

Bronwen Garand-Sheridan, Yale University

Sound-symbolic semantics of pitch space (Abstract)

Emily Schwitzgebel, UMass Amherst and Will Evans, UMass Amherst

Subtle Violations in Harmonic Expectancy (Abstract)