
Prickett in Phonology

Brandon Prickett has just published “Learning biases in opaque interactions” in the latest issue of Phonology. Congratulations Brandon!

https://doi.org/10.1017/S0952675719000320

Abstract
This study uses an artificial language learning experiment and computational modelling to test Kiparsky’s claims about Maximal Utilisation and Transparency biases in phonological acquisition. A Maximal Utilisation bias would prefer phonological patterns in which all rules are maximally utilised, and a Transparency bias would prefer patterns that are not opaque. Results from the experiment suggest that these biases affect the learnability of specific parts of a language, with Maximal Utilisation affecting the acquisition of individual rules, and Transparency affecting the acquisition of rule orderings. Two models were used to simulate the experiment: an expectation-driven Harmonic Serialism learner and a sequence-to-sequence neural network. The results from these simulations show that both models’ learning is affected by these biases, suggesting that the biases emerge from the learning process rather than any explicit structure built into the model.

SENSUS at UMass, April 18-19, 2020

UMass is hosting “Sensus: Constructing meaning in Romance” on April 18-19, 2020. This is a conference on the formal semantics and pragmatics of Romance languages.

Areas: theoretical semantics and pragmatics and their interfaces with other domains, experimental methodologies, fieldwork, the study of variation, and computational approaches

Venue: Integrative Learning Center at UMass Amherst (the ILC is a fully accessible building)

Invited speakers:

Luis Alonso-Ovalle
(McGill University)

Mariapaola D’Imperio
(Rutgers University)

Donka Farkas
(UC Santa Cruz)

Organizers: Ana Arregui, María Biezma, Vincent Homer and Deniz Özyıldız

Event sponsored by the Department of Linguistics and the Department of Languages, Literatures and Cultures of UMass Amherst

Contact us at sensus@umass.edu

Details can be found here: http://websites.umass.edu/sensus/

David Smith talk, Monday Nov 18

David Smith (https://www.khoury.northeastern.edu/people/david-smith/) will present “Textual Criticism as Language Modeling: Viral Texts, Networked Authors, and Computational Models of Information Propagation” at 4 pm Monday Nov. 18th in ILC N400. An abstract is below.

This presentation is for a joint meeting of the Initiative for Data Science in the Humanities and the Data Science tea. If you have any questions, contact Joe Pater at pater@umass.edu. David will be available for half-hour meetings from 1 to 3:30 in the Linguistics department – sign up here.

Abstract

The era of mass digitization seems to provide a mountain of source material for scholarship, but its foundations are constantly shifting. Selective archiving and digitization obscures data provenance, metadata fails to capture the presence of texts of mutable genres and uncertain authorship embedded within the archive, and automatic optical character recognition (OCR) transcripts contain word error rates above 30% for even eighteenth-century English. The condition of the mass-digitized text is thus closer to the manuscript sources of an edition than to a scholarly publication. On the computational side, models that treat collections as sets of independent documents fail to capture the processes by which new texts are generated from existing ones.

In this talk, I will discuss several aspects of our work on “speculative bibliography” with computational methods. Starting from a simple model of the composition of historical newspaper pages, with applications to text denoising, I describe models of how texts transform their sources, applied to modern science journalism, medieval Arabic historians, and the generically hybrid forms in nineteenth-century newspapers. I conclude by discussing methods for inferring network structure and mapping information propagation among texts and publications.

This is joint work with Ryan Cordell, Rui Dong, Ansel MacLaughlin, Abby Mullen, Ryan Muther, and Shaobin Xu.