Author Archives: Joseph Pater

Sadil in Cognitive Brown Bag 3/27 at noon

The next cognitive brown bag speaker (3/27, 12:00, Tobin 521B) is Patrick Sadil (https://www.umass.edu/pbs/people/patrick-sadil).  Title and abstract are below.  All are welcome.

A (largely) hierarchical Bayesian model for inferring neural tuning functions from voxel tuning functions

Inferring neural properties from the hemodynamic signal provided by fMRI remains challenging. It is tempting to simply assume that the dynamics of individual neurons or neural subpopulations are reflected in the hemodynamic signal, and in apparent support of this assumption, important features of neural activity — such as the ‘tuning’ to different stimulus features (e.g., the pattern of activity in response to different orientations, colors, or motion) — are observable in fMRI. However, fMRI measures the aggregated activity of a heterogeneous collection of neural subpopulations, and this aggregated activity may mislead inferences about the behavior of each individual subpopulation. In particular, extant analysis methods can lead to erroneous conclusions about how neural tuning functions are altered by interactions between stimulus features (e.g., changes in the contrast of a stimulus), or between the tuning curves and different cognitive states (e.g., with or without attention). I will present a statistical modeling approach that attempts to remove these limitations. The approach is validated by using it to infer alterations to neural tuning curves from fMRI data, in a circumstance where the ground truth of the alteration has been provided by electrophysiology.
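To make the aggregation problem concrete, here is a minimal toy simulation (an assumption for illustration only, not the speaker's model): a voxel's tuning function is treated as a gain-weighted sum of von Mises tuning curves from two hypothetical subpopulations, and a condition-dependent gain change shifts the voxel's apparent preferred orientation even though neither subpopulation's tuning changes.

```python
import numpy as np

# Hypothetical toy example: a voxel's "tuning function" is the gain-weighted sum
# of the tuning curves of the neural subpopulations it contains, so a change in
# one subpopulation's gain can make the voxel appear to change its preference
# even though no individual tuning curve changed. All numbers are made up.

orientations = np.linspace(0, np.pi, 181)            # stimulus orientation (rad)

def von_mises(theta, pref, kappa=2.0):
    """Orientation tuning curve peaked at the preferred orientation `pref`."""
    return np.exp(kappa * np.cos(2 * (theta - pref)))

# Two subpopulations sharing one voxel, with different preferred orientations.
pop_a = von_mises(orientations, pref=np.radians(45))
pop_b = von_mises(orientations, pref=np.radians(135))

def voxel_tuning(gain_a, gain_b):
    """Aggregate (hemodynamic-like) response of the voxel."""
    return gain_a * pop_a + gain_b * pop_b

def apparent_preference_deg(curve):
    return np.degrees(orientations[np.argmax(curve)])

baseline  = voxel_tuning(gain_a=1.0, gain_b=1.5)      # e.g., low contrast
boosted_a = voxel_tuning(gain_a=3.0, gain_b=1.5)      # e.g., high contrast / attention

print("apparent voxel preference, baseline:", round(apparent_preference_deg(baseline)), "deg")
print("apparent voxel preference, boosted :", round(apparent_preference_deg(boosted_a)), "deg")
# Roughly 135 deg vs 45 deg, although each subpopulation's tuning is unchanged.
```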

 

Misra in Machine Learning and Friends Thurs. 3/28 at 11:45

who: Ishan Misra (Facebook AI Research, NY)
when: 03/28 (Thursday) 11:45a – 1:15p
where: Computer Science Building Rm 150
food: Athena’s Pizza

“Scaling Self-supervised Visual Representation Learning”

Abstract: Self-supervised learning aims to learn representations from the data itself without explicit manual supervision. Existing efforts ignore a crucial aspect of self-supervised learning – the ability to scale to large amounts of data, because self-supervision requires no manual labels. In this work, we revisit this principle and scale two popular self-supervised approaches to 100 million images. Scaling these methods also provides many interesting insights into the limitations of current self-supervised techniques and evaluations. We conclude that current self-supervised methods are not complex enough to take full advantage of large-scale data and do not seem to learn effective high-level semantic representations. Finally, we show how scaling current self-supervised methods provides state-of-the-art results that sometimes match or surpass supervised representations on tasks such as object detection, surface normal estimation, and visual navigation.
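As a rough illustration of what "no manual labels" means in practice, the sketch below uses rotation prediction, one widely used pretext task; it is assumed for exposition and is not necessarily one of the two approaches scaled in the talk. The tiny encoder, image sizes, and single training step are placeholders.

```python
import torch
import torch.nn as nn

# Labels are manufactured from the images themselves (here, how much each image
# was rotated), so no manual annotation is needed. This is a generic pretext-task
# sketch, not the specific methods discussed in the talk.

class SmallEncoder(nn.Module):
    """Tiny stand-in for a real backbone such as a ResNet."""
    def __init__(self, num_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(16, num_classes)   # predicts rotation: 0/90/180/270

    def forward(self, x):
        return self.head(self.features(x))

def make_rotation_batch(images):
    """Create (rotated image, rotation label) pairs from unlabeled images."""
    rotated, labels = [], []
    for k in range(4):                            # k quarter-turns
        rotated.append(torch.rot90(images, k, dims=(2, 3)))
        labels.append(torch.full((images.size(0),), k, dtype=torch.long))
    return torch.cat(rotated), torch.cat(labels)

model = SmallEncoder()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

unlabeled = torch.randn(8, 3, 32, 32)             # stand-in for unlabeled images
inputs, targets = make_rotation_batch(unlabeled)
loss = loss_fn(model(inputs), targets)
loss.backward()
optimizer.step()
print("pretext-task loss:", float(loss))
```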

Bio: Ishan is a Research Scientist at Facebook AI Research. He graduated from Carnegie Mellon University, where his PhD thesis, “Visual Learning with Minimal Human Supervision,” was runner-up for the SCS Distinguished Dissertation Award. This work was about learning recognition models with minimal supervision by exploring structure and biases in the labels (multi-task), classifiers (meta-learning), and data (self-supervision). His current research interests are in self-supervised approaches, understanding vision and language models, and compositional models for small-sample learning.

Website – http://imisra.github.io/

Iyyer in Cognitive Brown Bag Weds. March 20th at noon

The cognitive brown bag speaker on Wednesday, March 20 will be Mohit Iyyer of UMass Computer Science (https://people.cs.umass.edu/~miyyer/).  Title and abstract are below.  As always, the talk is in Tobin 521B at 12:00.  All are welcome.

Title: Towards Understanding Narratives with Artificial Intelligence

Abstract:

One of the fundamental goals of artificial intelligence is to build computers that understand language at a human level. Recent progress towards this goal has been fueled by deep learning, which represents words, sentences, and even documents with learned vectors of real-valued numbers. However, creative language—the sort found in novels, film, and comics—poses an immense challenge for such models because it contains a wide range of linguistic phenomena, from phrasal and sentential syntactic complexity to high-level discourse structures such as narrative and character arcs. In this talk, I discuss our recent work on applying deep learning to creative language understanding, as well as the challenges that must be solved before further progress can be made. I begin with a general overview of deep learning before presenting model architectures for two tasks involving creative language understanding: 1) modeling dynamic relationships between fictional characters in novels, and 2) predicting dialogue and artwork from comic book panels. For both tasks, our models only achieve a surface-level understanding, limited by a lack of world knowledge, an inability to perform commonsense reasoning, and a reliance on huge amounts of data. I conclude by proposing ideas on how to push these models to produce deeper insights from creative language that might be of use to humanities researchers.
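To ground the claim that words and documents are represented as learned real-valued vectors, here is a minimal sketch that embeds tokens, averages them into a single document vector, and classifies from that vector. The vocabulary size, dimensions, and classification task are assumptions for illustration only, not the models presented in the talk.

```python
import torch
import torch.nn as nn

# Minimal sketch of text-as-vectors: embed each word, average the embeddings into
# one document vector, and feed it to a small classifier. All sizes are made up.

class AveragingTextModel(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.classifier = nn.Sequential(
            nn.Linear(embed_dim, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, token_ids):
        # token_ids: (batch, sequence_length) integer word indices
        doc_vectors = self.embed(token_ids).mean(dim=1)   # one vector per document
        return self.classifier(doc_vectors)

model = AveragingTextModel()
fake_batch = torch.randint(0, 10_000, (4, 12))            # 4 "documents", 12 tokens each
print(model(fake_batch).shape)                             # torch.Size([4, 2])
```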

Morley in Linguistics on Fri. March 8th at 3:30

Rebecca Morley (OSU) will present a colloquium in Linguistics on Friday March 8th at 3:30 in ILC N400. All are welcome!

Title: Phonological contrast as an evolving function of local predictability

Abstract:

In this talk I conceptualize phoneme identification as the result of a phonological parse that maps acoustic input to a series of discrete abstract structures. As has been proposed for syntactic processing, the phonological parse is built up incrementally as the speech signal is received, and the highest-probability parse available is selected at each point. As the parse proceeds, listener expectations develop regarding future input. If those expectations fail to be met, the phonological parser can be “garden-pathed” just as a syntactic parse would be. The primary difference between the two domains is that the input to the syntactic parser is typically assumed to consist of already segmented sequences of words, and the induction problem is one of determining the hierarchical groupings among those words. The input to the phonological parser, on the other hand, will be assumed to consist of a stream of continuously valued acoustic cues, and the induction problem to be literal segmentation: attributing perceived cues to sequentially ordered discrete segments.
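A toy sketch of the incremental, probability-ranked parsing idea (an assumed illustration, not Morley's model): two hypothetical segment categories with Gaussian cue distributions compete as continuously valued cue frames arrive, and the currently preferred parse can flip partway through the stream, an analogue of the garden path described above.

```python
import numpy as np
from scipy.stats import norm

# Two candidate segment categories differ in their expected cue values; the
# "parser" commits at each frame to whichever category currently has the higher
# posterior, so early misleading frames can garden-path it. Category names and
# numbers are invented for illustration.

categories = {
    "short-vowel": norm(loc=80, scale=15),    # expected cue value (e.g., ms)
    "long-vowel":  norm(loc=120, scale=15),
}
prior = {"short-vowel": 0.5, "long-vowel": 0.5}

# Incoming cue frames: the first two look "short", the rest look "long".
cue_frames = [78, 85, 118, 122, 125]

log_post = {c: np.log(p) for c, p in prior.items()}
for t, cue in enumerate(cue_frames, start=1):
    for c, dist in categories.items():
        log_post[c] += dist.logpdf(cue)       # accumulate cue likelihoods
    best = max(log_post, key=log_post.get)    # highest-probability parse so far
    print(f"after frame {t}: preferred parse = {best}")
# The preferred parse flips partway through the stream.
```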

This proposal is illustrated through a re-analysis of the well-known, and well-researched, phenomenon of vowel lengthening in American English. I will argue that no actual lengthening of vowels before voiced obstruents occurs (nor shortening before voiceless obstruents), but that the effect is an epiphenomenon of speaking rate and prosodic lengthening. I take the results of production experiments to argue for an underlying specification of /short/ for English “voiced” obstruents. And I show that the categorical perception results (in which vowel duration is found to be a sufficient cue to “voicing” on word-final obstruents) can be derived from general properties of the proposed phonological parser. The implications for theories of contrast, diagnostics of contrastive features, and theories of sound change will be discussed.

Burnsky in Cognitive bag lunch at noon on Wednesday

The next cognitive brown bag is Weds. 3/6 at 12:00 in Tobin 521B.  The speaker is Jon Burnsky (UMass PBS); title and abstract are below.

What does it mean to predict a word and what can predictions tell us?

I will present data from three experiments investigating prediction in language comprehension. First, I will discuss an eyetracking experiment providing suggestive (though inconclusive) evidence that predicted words that are not encountered are activated similarly to words that are actually encountered. Then, I will discuss two experiments using the cloze task in which comprehenders’ predictions are used as a tool to probe their syntactic or thematic representations of complex sentences. The results suggest that non-veridical representations are computed online when doing so yields a more plausible interpretation of the sentence, perhaps by way of Bayesian inference on the part of the comprehender.
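As a hedged illustration of the Bayesian trade-off mentioned at the end of the abstract, the numbers below are invented: a veridical but implausible reading competes with an edited, more plausible one, and the posterior favors the edited reading when its higher prior plausibility outweighs the cost of positing a misperception.

```python
# Made-up numbers illustrating the kind of Bayesian trade-off alluded to above:
# the comprehender weighs an interpretation's prior plausibility against the
# probability that the perceived sentence arose from it.

interpretations = {
    # interpretation: (prior plausibility, P(perceived sentence | interpretation))
    "veridical but implausible reading": (0.02, 0.95),
    "edited, more plausible reading":    (0.60, 0.10),
}

evidence = sum(prior * lik for prior, lik in interpretations.values())
for name, (prior, lik) in interpretations.items():
    posterior = prior * lik / evidence
    print(f"{name}: posterior = {posterior:.2f}")
# With these assumed numbers the edited (non-veridical) reading wins.
```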

Foley Psycholinguistics talk *today*, Fri. March 1, 2019

Steven Foley (UCSC) will present “Why are ergatives hard to process? Reading-time evidence from Georgian” in ILC N400 at 3:30. All are welcome!

ABSTRACT: How easily a filler–gap dependency is processed can depend on the syntactic position of its gap: in many languages, for example, subject-gap relative clauses are generally easier to process than object-gap relatives (Kwon et al. 2013). One possible explanation for this is that certain syntactic positions might be intrinsically more accessible for extraction than others (Keenan & Comrie 1977). Alternatively, processing difficulty might correlate with the relative informativity of morphosyntactic cues (e.g., case) ambient to the gap (Polinsky et al. 2012; cf. Hale 2006). Ergative languages are ideal for disentangling these two theories, since they decouple case morphology (ergative ~ absolutive) and syntactic role (subject ~ object). This talk presents reading-time data from Georgian, a split-ergative language, which suggest that case may indeed be a crucial factor affecting real-time comprehension. Across four self-paced reading experiments, ergative DPs in different configurations are read consistently more slowly than absolutive ones — bearing out the predictions of the informativity hypothesis. However, the case is not closed: accusative morphology, at least in Japanese and Korean, does not seem to be associated with a processing cost, even though it is just as informative as ergative. To reconcile this ergative–accusative processing asymmetry, I turn to the debate in formal syntax between different modalities of case assignment, and argue that a theory in which case is assigned by functional heads (Chomsky 2000, 2001) gives us better traction for understanding both Georgian-internal and crosslinguistic processing data than does a configurational theory of case (Marantz 1991).

Zobel in Cognitive bag lunch Weds. Feb. 20 at noon

The next cognitive brown bag is on 2/20 (12:00, Tobin 521B). The speaker is Ben Zobel (UMass PBS); title and abstract are below. Please note that there is no cognitive brown bag on 2/27.

Effects of Age on Spatial Release from Informational Masking

Spatial release from informational masking (SRIM) describes the reduction in perceptual/cognitive confusion between relevant speech (target) and irrelevant speech (masker) when target and masker are perceived as spatially separated compared to spatially co-located. Under complex listening conditions in which peripheral (head shadow) and low-level binaural (interaural time differences) cues are washed out by multiple noise sources and reverberation, SRIM is a crucial mechanism for successful speech processing. It follows that any age-related declines in SRIM would contribute to the speech-processing difficulties older adults often report within noisy environments. Some research indicates age-related declines in SRIM (e.g., Gallun et al., 2013) while other research does not (e.g., Li et al., 2004). This talk will describe research designed to add clarity to two fundamental questions: 1) Does SRIM decline with age and, if so, 2) does age predict this decline independent of hearing loss? To answer these questions, younger and older participants listened to low-pass-filtered noise-vocoded speech and were asked to detect whether a target talker was presented along with two-talker masking babble. Spatial separation was perceptually manipulated without changing peripheral and low-level cues (Freyman et al., 1999). Results showed that detection thresholds were nearly identical across age groups in the co-located condition but markedly higher for older adults compared to younger adults when target and masker were spatially separated. Multiple regression analysis showed that age predicted a decline in SRIM controlling for hearing loss (based on pure-tone audiometry), while there was no indication that hearing loss predicted a decline in SRIM controlling for age. These results provide strong evidence that SRIM declines with age, and that the source of this decline may begin at higher perceptual/cognitive levels of auditory processing. Such declines are likely to contribute to the greater speech-processing difficulties older adults often experience in complex, noisy environments.
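The regression logic described here (age and hearing loss entered as simultaneous predictors of SRIM, so each coefficient is a partial effect controlling for the other) can be sketched as follows. The data are simulated placeholders, not the study's data, and the variable names and effect sizes are assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated illustration of the multiple regression described in the abstract:
# SRIM (e.g., the threshold difference between co-located and separated
# conditions) regressed on age and hearing loss simultaneously.

rng = np.random.default_rng(0)
n = 60
age = rng.uniform(20, 80, n)
hearing_loss = 5 + 0.2 * age + rng.normal(0, 5, n)        # dB HL, correlated with age
srim = 12 - 0.08 * age + 0.0 * hearing_loss + rng.normal(0, 1.5, n)

df = pd.DataFrame({"srim": srim, "age": age, "hearing_loss": hearing_loss})
model = smf.ols("srim ~ age + hearing_loss", data=df).fit()
print(model.params)    # partial effect of age on SRIM, controlling for hearing loss
print(model.pvalues)
```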

Syrett in Linguistics Fri. Feb. 22 at 3:30

Kristen Syrett of Rutgers University (https://sites.rutgers.edu/kristen-syrett/) will present “Playing with semantic building blocks: Acquiring the lexical representations of verbs and adjectives” in ILC N400 Friday Feb. 22 at 3:30. All are welcome!

ABSTRACT: Early lexicons and initial child productions reflect a preponderance of object-denoting lexical items (nouns), while those that denote properties of objects or events (adjectives and verbs) lag behind. If nouns are the “Marsha” of the Brady Bunch, adjectives and verbs compete for the role of “Jan.” In many ways, this asymmetry privileging nouns makes sense: it’s much easier to track event participants than to track ephemeral events and the properties of those participants, which are much less stable, and both verbs and adjectives require nominal elements both syntactically and semantically. But the process of language acquisition is rapid: within a matter of a few years, the child achieves enough proficiency to appreciate polysemy or word play. Given this state of affairs, we might ask two questions about the acquisition of these predicates: (1) What strategies or information sources do children recruit to pin down the lexical meaning of verbs and adjectives?, and (2) When they enter into the lexicon, how rich is children’s semantic knowledge of these words? In this talk, I provide one answer to (1), showcasing the role of the linguistic context. I then highlight a set of examples in response to (2), illustrating children’s early command of selectional restrictions for both categories. In doing so, I also demonstrate that once these words are established as part of the children’s receptive and productive vocabulary, there are certain advantages afforded to the language learner—although here, we uncover an asymmetry between verbs and adjectives implicating other aspects of the grammar and the context. Together, what this body of work reveals is the complex, interrelated process of acquiring and assembling the semantic building blocks of language.