Category Archives: Computational linguistics

CLC Talk on Unsupervised Learning of Phrase Structure – November 15 @ 4pm

The first CLC (Computational Linguistics Community) event of the semester will be a talk on unsupervised learning of phrase structure. The talk will be at 4pm on November 15th and will take place as part of the new Neurolinguistics Reading group. All are welcome! Please see below for more details.

TITLE:
Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Auto-Encoders

AUTHORS:
Andrew Drozdov*, Pat Verga*, Mohit Yadav*, Mohit Iyyer, Andrew McCallum

ABSTRACT:
Syntax is a powerful abstraction for language understanding. Many downstream tasks require segmenting input text into meaningful constituent chunks (e.g., noun phrases or entities); more generally, models for learning semantic representations of text benefit from integrating syntax in the form of parse trees (e.g., tree-LSTMs). Supervised parsers have traditionally been used to obtain these trees, but lately interest has increased in unsupervised methods that induce syntactic representations directly from unlabeled text. To this end, we propose the deep inside-outside recursive auto-encoder (DIORA), a fully-unsupervised method for discovering syntax that simultaneously learns representations for constituents within the induced tree. Unlike many prior approaches, DIORA does not rely on supervision from auxiliary downstream tasks and is thus not constrained to particular domains. Furthermore, competing approaches do not learn explicit phrase representations along with tree structures, which limits their applicability to phrase-based tasks. Extensive experiments on unsupervised parsing, segmentation, and phrase clustering demonstrate the efficacy of our method. DIORA achieves the state of the art in unsupervised parsing (48.7 F1) on the benchmark WSJ dataset.

LOCATION:
ILC N400

UMass Linguistics at NELS 49 at Cornell, October 5-7, 2018

UMass Linguistics was well represented at NELS 49 at Cornell. Cutting and pasting from the NELS website, I find:

The Reversible Core of ObjExp, Location, and Govern-Type Verbs.
Michael Wilson.
Besides Exceptives.
Ekaterina Vostrikova.
Phase Sensitive Morphology and Dependent Case.
Kimberly Johnson.
Don’t give me that attitude! Anti-De Se and Feature Matching of German D-Pronouns.
Alexander Göbel.
A secondary crossover effect in Hindi and the typology of movement.
Rajesh Bhatt and Stefan Keine.
Complementizers in Laz are attitude sensitive.
Omer Demirok, Deniz Ozyildiz and Balkiz Ozturk.
Romanian loves Me: Clitic Clusters, Ethics & Cyclic AGREE.
Rudmila-Rodica Ivan.

UMass Alum Maria Gouskova was one of the invited speakers. There were enough of us to justify a group picture.

 

UMass Linguistics at CreteLing 2018 Part 3: [Distributed Group Photos]

There was frost outside this morning. So it might be a good time to think about summer. This summer the UMass Linguistics department was very well represented at the CreteLing 2018 summer school in Rethymnou, Crete. Since there are a lot of pictures, I’ll break them into three parts. The third part is distributed group photos. It was difficult to get everyone into one picture, so there are many pictures.

In the big group picture you can see Elena Benedicto, Rajesh Bhatt, Satoshi Tomioka, Kai von Fintel, Petr Kusliy, William Quirk, Bobby Tosswill, Ede Zimmerman [partially], Caroline Fery, Winnie Lechner, Katia Vostrikova, Zahra Mirrazi, Rodica Ivan, Leah Chapman, Kyle Johnson, and Deniz Özyildiz.

 

 

UMass Linguistics at CreteLing 2018 Part 2: [Extracurricular Activities]

There was frost outside this morning. So it might be a good time to think about summer. This summer the UMass Linguistics department was very well represented at the CreteLing 2018 summer school in Rethymnou, Crete. Since there are a lot of pictures, I’ll break them into three parts. The second part is extracurricular activities.

Commentaries and blog discussion for Pater 2019

From Joe Pater

I’ve set up a discussion blog post with links to the final (pre-copyedited) versions of my paper “Generative Linguistics and Neural Networks at 60: Foundation, Friction and Fusion” and the commentaries here: https://websites.umass.edu/cogsci/2018/10/12/discussion-generative-linguistics-and-neural-networks-at-60/. Direct links to the commentaries are also below. The paper and commentaries will appear in the March 2019 volume of Language.

Iris Berent and Gary Marcus. No integration without structured representations: reply to Pater.

Ewan Dunbar. Generative grammar, neural networks, and the implementational mapping problem.

Tal Linzen. What can linguistics and deep learning contribute to each other?

Lisa Pearl. Fusion is great, and interpretable fusion could be exciting for theory generation.

Chris Potts. A case for deep learning in semantics.

Jonathan Rawski and Jeff Heinz. No Free Lunch in Linguistics or Machine Learning.

 

Su-Lin Blodgett featured in UMass news

Su-Lin Blodgett, a PhD student in the College of Information and Computer Sciences, was recently featured in a UMass news article, “Data Science Student Aims to Improve Inclusion of African-American English”, which discusses her recent ACL paper: https://www.umass.edu/newsoffice/article/data-science-student-aims-improve

It quotes Blodgett as saying:

“By expanding the linguistic coverage of NLP tools to include minority and colloquial dialects, the thoughts and ideas of more individuals and groups can be included in areas such as opinion and sentiment analysis.”

Blodgett’s research on the use of African American English on Twitter has been done in collaboration with our own Lisa Green as well as Brendan O’Connor of CICS. The ACL paper was in collaboration with undergraduate Johnny Tian-Zhen Wei and O’Connor.

Jarosz receives NSF grant for SCiL special sessions

Gaja Jarosz has received a conference grant from the NSF to fund three special sessions at the next meeting of the Society for Computation in Linguistics, to be held alongside the LSA annual meeting January 3-6, 2019: a plenary session on Hidden Structure in Language Learning, a special panel on “What should linguists know about Natural Language Processing and Machine Learning?”, and tutorials for linguists on selected aspects of NLP/ML.

The first tutorial will be offered by Allyson Ettinger (Toyota Technological Institute at Chicago) and will be on vector space models for syntax and semantics. The second tutorial will be offered by Kasia Hitczenko (University of Maryland) and will provide an introduction to Bayesian modeling, with an emphasis on applications in the domains of phonetics and phonology.

The panelists will be Sam Bowman (Department of Linguistics and Center for Data Science, NYU), Chris Dyer (DeepMind), Allyson Ettinger (TTI), and Noah Smith (Computer Science and Engineering, University of Washington). The special panel discussion will address the general topic of communication across Linguistics and NLP/ML: what misconceptions there may be on both sides, how goals and evaluative criteria may differ or overlap, what training and skills are most important for pursuing NLP/ML research or career paths, and what to expect when seeking to establish cross-disciplinary collaborative research projects.

Finally, the plenary session on hidden structure will feature talks by Gaja Jarosz, and by Mark Johnson of Macquarie University.

 

 

SCiL 2019 to be held in NYC *and* Paris

The meeting of the Society for Computation in Linguistics is again being organized by Gaja Jarosz and Joe Pater, this year with help from Max Nelson and Brendan O’Connor of Computer Science. Jarosz is also one of the invited speakers, along with Mark Johnson of Macquarie University. This year will also feature an exciting experiment in conference design: there will be two simultaneous locations, with at least some of the talks being held jointly by videoconferencing. SCiL 2019 will again be co-located with the LSA, this time in NYC Jan. 3-6, and it will also take place in Paris over the same dates, at the Laboratoire de linguistique formelle at l’université Paris 7 Diderot. The abstract and paper deadline is Aug. 1; the full call can be found here: http://websites.umass.edu/scil/scil-2019/call-2019/.

Summer Dialect Research Project 2018

Four undergraduate students participated in the Summer Dialect Research Project (SDRP) at UMass in June, hosted by the Center for the Study of African American Language (CSAAL). The Center, directed by Lisa Green, fosters and integrates research on language in African American speech communities and applications of that research in different realms. Three students, Christian Muxica, Alexander Santos, and Emily Smith, are enrolled at UMass and majoring in linguistics. Janiya Gilbert is a sophomore at North Carolina A&T University in the animal science program and has interests in language-related and social justice fields.

The participants gained research experience in the study of African American English (AAE), a linguistic variety spoken by some African Americans. They worked on their skills in linguistics while also building broader analytical, argumentation, and collaboration skills. They completed group critical review projects and individual research projects that required analysis of data sets from AAE.

During the three-week program, participants attended lecture/discussion sessions with UMass faculty, researchers, and graduate students, who covered topics in syntax, phonology, acquisition, psycholinguistics, and natural language processing. Professors Joe Pater and Kristine Yu worked with the participants in interactive sessions on topics related to sound patterns of AAE, such as the production of word-final -t/-d and prosodic properties of yes-no questions in the variety. Professors Tom Roeper and Brian Dillon shared research on topics in acquisition and language processing and linked that research to data in AAE. For instance, Professor Dillon made the connection between research on garden path sentences and subject relative constructions in AAE.

In other sessions, researchers discussed ways in which work in linguistics relates to other disciplines. Dr. Barbara Pearson, former research associate at UMass, demonstrated how research used to develop assessments in communication disorders for children who speak all varieties of English, including AAE, has drawn on linguistics. Computer science graduate student Su-Lin Blodgett presented her research on natural language processing, AAE, and Twitter. One participant summed up his experience in the program in the following way: “I really enjoyed these three weeks and got a lot out of our work and hope to shape my senior year linguistic work around some of this research and our projects.”