Events
Live and recorded talks from the researchers shaping this domain.
LLMs and Human Language Processing
This webinar convened researchers at the intersection of Artificial Intelligence and Neuroscience to investigate how large language models (LLMs) can serve as valuable “model organisms” for understanding human language processing. Presenters showcased evidence that brain recordings (fMRI, MEG, ECoG) acquired while participants read or listened to unconstrained speech can be predicted by representations extracted from state-of-the-art text- and speech-based LLMs. In particular, text-based LLMs tend to align better with higher-level language regions, capturing more semantic aspects, while speech-based LLMs excel at explaining early auditory cortical responses. However, purely low-level features can drive part of these alignments, complicating interpretations. New methods, including perturbation analyses, highlight which linguistic variables matter for each cortical area and time scale. Further, “brain tuning” of LLMs—fine-tuning on measured neural signals—can improve semantic representations and downstream language tasks. Despite open questions about interpretability and exact neural mechanisms, these results demonstrate that LLMs provide a promising framework for probing the computations underlying human language comprehension and production at multiple spatiotemporal scales.
Speaker
Mariya Toneva, Ariel Goldstein, Jean-Rémi King • Max Planck Institute for Software Systems; Hebrew University; École Normale Supérieure
Scheduled for
Nov 28, 2024, 2:00 PM
Timezone
GMT+1
Llama 3.1 Paper: The Llama Family of Models
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
Speaker
Vibhu Sapra
Scheduled for
Jul 28, 2024, 10:00 AM
Timezone
GMT+2
Improving Language Understanding by Generative Pre-Training
Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering, semantic similarity assessment, and document classification. Although large unlabeled text corpora are abundant, labeled data for learning these specific tasks is scarce, making it challenging for discriminatively trained models to perform adequately. We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture. We demonstrate the effectiveness of our approach on a wide range of benchmarks for natural language understanding. Our general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied. For instance, we achieve absolute improvements of 8.9% on commonsense reasoning (Stories Cloze Test), 5.7% on question answering (RACE), and 1.5% on textual entailment (MultiNLI).
Speaker
Amgad Hasan
Scheduled for
Apr 22, 2024, 10:00 AM
Timezone
GMT+2
A Comprehensive Overview of Large Language Models
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to provide not only a systematic survey but also a quick, comprehensive reference for researchers and practitioners to draw insights from extensive, informative summaries of existing work and advance LLM research.
Speaker
Ivan Leo
Scheduled for
Mar 14, 2024, 10:00 AM
Timezone
GMT+2
Dyslexia, Rhythm, Language and the Developing Brain
Recent insights from auditory neuroscience provide a new perspective on how the brain encodes speech. Using these recent insights, I will provide an overview of key factors underpinning individual differences in children’s development of language and phonology, providing a context for exploring atypical reading development (dyslexia). Children with dyslexia are relatively insensitive to acoustic cues related to speech rhythm patterns. This lack of rhythmic sensitivity is related to the atypical neural encoding of rhythm patterns in speech by the brain. I will describe our recent data from infants as well as children, demonstrating developmental continuity in the key neural variables.
Speaker
Usha Goswami CBE • University of Cambridge
Scheduled for
Feb 21, 2024, 1:00 PM
Timezone
GMT
Enhancing Qualitative Coding with Large Language Models: Potential and Challenges
Qualitative coding is the process of categorizing and labeling raw data to identify themes, patterns, and concepts within qualitative research. This process requires significant time, reflection, and discussion, and is often characterized by inherent subjectivity and uncertainty. Here, we explore the possibility of leveraging large language models (LLMs) to enhance the process and assist researchers with qualitative coding. LLMs, trained on extensive human-generated text, possess an architecture that renders them capable of understanding the broader context of a conversation or text. This allows them to extract patterns and meaning effectively, making them particularly useful for the accurate extraction and coding of relevant themes. In our current approach, we employed the GPT-3.5 Turbo API, integrating it into the qualitative coding process for data from the SWISS100 study, specifically focusing on data derived from centenarians' experiences during the COVID-19 pandemic, as well as a systematic centenarian literature review. We provide several instances illustrating how our approach can assist researchers with extracting and coding relevant themes. With data from human coders on hand, we highlight points of convergence and divergence between AI and human thematic coding in the context of these data. Moving forward, our goal is to enhance the prototype and integrate it with an LLM designed for local storage and operation (LLaMA). Our initial findings highlight the potential of AI-enhanced qualitative coding, yet they also pinpoint areas requiring attention. Based on these observations, we formulate tentative recommendations for the optimal integration of LLMs in qualitative coding research. Further evaluations using varied datasets and comparisons among different LLMs will shed more light on the question of whether and how to integrate these models into this domain.
Speaker
Kim Uittenhove & Olivier Mucchiut • AFC Lab / University of Lausanne
Scheduled for
Oct 15, 2023, 10:30 AM
Timezone
GMT+1
Prosody in the voice, face, and hands changes which words you hear
Speech may be characterized as conveying both segmental information (i.e., about vowels and consonants) as well as suprasegmental information - cued through pitch, intensity, and duration - also known as the prosody of speech. In this contribution, I will argue that prosody shapes low-level speech perception, changing which speech sounds we hear. Perhaps the most notable example of how prosody guides word recognition is the phenomenon of lexical stress, whereby suprasegmental F0, intensity, and duration cues can distinguish otherwise segmentally identical words, such as "PLAto" vs. "plaTEAU" in Dutch. Work from our group showcases the vast variability in how different talkers produce stressed vs. unstressed syllables, while also unveiling the remarkable flexibility with which listeners can learn to handle this between-talker variability. It also emphasizes that lexical stress is a multimodal linguistic phenomenon, with the voice, lips, and even hands conveying stress in concert. In turn, human listeners actively weigh these multisensory cues to stress depending on the listening conditions at hand. Finally, lexical stress is presented as having a robust and lasting impact on low-level speech perception, even down to changing vowel perception. Thus, prosody - in all its multisensory forms - is a potent factor in speech perception, determining what speech sounds we hear.
Speaker
Hans Rutger Bosker • Donders Institute of Radboud University
Scheduled for
May 22, 2023, 5:00 PM
Timezone
GMT+3
Learning through the eyes and ears of a child
Young children have sophisticated representations of their visual and linguistic environment. Where do these representations come from? How much knowledge arises through generic learning mechanisms applied to sensory data, and how much requires more substantive (possibly innate) inductive biases? We examine these questions by training neural networks solely on longitudinal data collected from a single child (Sullivan et al., 2020), consisting of egocentric video and audio streams. Our principal findings are as follows: 1) Based on visual only training, neural networks can acquire high-level visual features that are broadly useful across categorization and segmentation tasks. 2) Based on language only training, networks can acquire meaningful clusters of words and sentence-level syntactic sensitivity. 3) Based on paired visual and language training, networks can acquire word-referent mappings from tens of noisy examples and align their multi-modal conceptual systems. Taken together, our results show how sophisticated visual and linguistic representations can arise through data-driven learning applied to one child’s first-person experience.
Speaker
Brenden Lake • NYU
Scheduled for
Apr 20, 2023, 12:30 PM
Timezone
EDT
Verb metaphors are processed as analogies
Metaphor is a pervasive phenomenon in language and cognition. To date, the vast majority of psycholinguistic research on metaphor has focused on noun-noun metaphors of the form An X is a Y (e.g., My job is a jail). Yet there is evidence that verb metaphor (e.g., I sailed through my exams) is more common. Despite this, comparatively little work has examined how verb metaphors are processed. In this talk, I will propose a novel account for verb metaphor comprehension: verb metaphors are understood in the same way that analogies are—as comparisons processed via structure-mapping. I will discuss the predictions that arise from applying the analogical framework to verb metaphor and present a series of experiments showing that verb metaphoric extension is consistent with those predictions.
Speaker
Daniel King • Northwestern University
Scheduled for
Mar 8, 2023, 11:00 AM
Timezone
CDT
Modelling metaphor comprehension as a form of analogizing
What do people do when they comprehend language in discourse? According to many psychologists, they build and maintain cognitive representations of utterances in four complementary mental models for discourse that interact with each other: the surface text, the text base, the situation model, and the context model. When people encounter metaphors in these utterances, they need to incorporate them into each of these mental representations for the discourse. Since influential metaphor theories define metaphor as a form of (figurative) analogy, involving cross-domain mapping of a smaller or greater extent, the general expectation has been that metaphor comprehension is also based on analogizing. This expectation, however, has been partly borne out by the data, but not completely. There is no one-to-one relationship between metaphor as (conceptual) structure (analogy) and metaphor as (psychological) process (analogizing). According to Deliberate Metaphor Theory (DMT), only some metaphors are handled by analogy. Instead, most metaphors are presumably handled by lexical disambiguation. This is a hypothesis that brings together most metaphor research in a provocatively new way: it means that most metaphors are not processed metaphorically, which produces a paradox of metaphor. In this talk I will sketch out how this paradox arises and how it can be resolved by a new version of DMT, which I have described in my forthcoming book Slowing metaphor down: Updating Deliberate Metaphor Theory (currently under review). In this theory, the distinction between, but also the relation between, analogy in metaphorical structure versus analogy in metaphorical process is of central importance.
Speaker
Gerard Steen • University of Amsterdam
Scheduled for
Nov 30, 2022, 3:00 AM
Timezone
CDT
Do large language models solve verbal analogies like children do?
Analogical reasoning - learning about new things by relating them to previous knowledge - lies at the heart of human intelligence and creativity and forms the core of educational practice. Children start creating and using analogies early on, making incredible progress moving from associative processes to successful analogical reasoning. For example, if we ask a four-year-old "Horse belongs to stable like chicken belongs to …?" they may use association and reply "egg", whereas older children will likely give the intended relational response "chicken coop" (or another term for a chicken's home). Interestingly, despite state-of-the-art AI language models having superhuman encyclopedic knowledge and superior memory and computational power, our pilot studies show that these large language models often make mistakes, providing associative rather than relational responses to verbal analogies. For example, when we asked four- to eight-year-olds to solve the analogy "body is to feet as tree is to …?" they responded "roots" without hesitation, but large language models tend to provide more associative responses such as "leaves". In this study we examine the similarities and differences between children's and six large language models' (Dutch/multilingual models: RobBERT, BERT-je, M-BERT, GPT-2, M-GPT, Word2Vec and Fasttext) responses to verbal analogies extracted from an online adaptive learning environment, where more than 14,000 7- to 12-year-olds from the Netherlands solved 20 or more items from a database of 900 Dutch-language verbal analogies.
Speaker
Claire Stevenson • University of Amsterdam
Scheduled for
Nov 16, 2022, 11:00 AM
Timezone
CDT
Children’s inference of verb meanings: Inductive, analogical and abductive inference
Children need inference in order to learn the meanings of words. They must infer the referent from the situation in which a target word is said. Furthermore, to be able to use the word in other situations, they also need to infer what other referents the word can be generalized to. As verbs refer to relations between arguments, verb learning requires relational analogical inference, something which is challenging for young children. To overcome this difficulty, young children recruit a diverse range of cues in their inference of verb meanings, including, but not limited to, syntactic cues, social and pragmatic cues, and statistical cues. They also utilize perceptual similarity (object similarity) in progressive alignment to extract relational verb meanings and, further, to gain insights about them. However, just having a list of these cues is not useful: the cues must be selected, combined, and coordinated to produce the optimal interpretation in a particular context. This process involves abductive reasoning, similar to what scientists do to form hypotheses from a range of facts or evidence. In this talk, I discuss how children use a chain of inferences to learn the meanings of verbs. I consider not only the process of analogical mapping and progressive alignment, but also how children use abductive inference to find the source of analogy and gain insights into the general principles underlying verb learning. I also present recent findings from my laboratory showing that prelinguistic human infants use a rudimentary form of abductive reasoning, which enables the first step of word learning.
Speaker
Mutsumi Imai • Keio University
Scheduled for
May 18, 2022, 6:00 AM
Timezone
CDT
Language Representations in the Human Brain: A naturalistic approach
Natural language is strongly context-dependent and can be perceived through different sensory modalities. For example, humans can easily comprehend the meaning of complex narratives presented through auditory speech, written text, or visual images. To understand how complex language-related information is represented in the human brain, we need to map the different linguistic and non-linguistic information perceived under different modalities across the cerebral cortex. To map this information to the brain, I suggest following a naturalistic approach: observing the human brain performing tasks in its naturalistic setting, designing quantitative models that transform real-world stimuli into specific hypothesis-related features, and building predictive models that relate these features to brain responses. In my talk, I will present models of brain responses collected using functional magnetic resonance imaging while human participants listened to or read natural narrative stories. Using natural text and vector representations derived from natural language processing tools, I will show how we can study language processing in the human brain across modalities, at different levels of temporal granularity, and across different languages.
Speaker
Fatma Deniz • TU Berlin & Berkeley
Scheduled for
Apr 26, 2022, 2:00 PM
Timezone
GMT+1
Analogical reasoning and metaphor processing in autism - Similarities & differences
In this talk, I will present the results of two recent systematic reviews and meta-analyses related to analogical reasoning and metaphor processing in autism, together with the results of a study that investigated verbal analogical reasoning and metaphor processing in the same sample of participants. Both metaphors and analogies rely on exploiting similarities, and they necessitate contextual processing. Nevertheless, our findings relating to metaphor processing and analogical reasoning showed distinct patterns. Whereas analogical reasoning emerged as a relative strength in autism, metaphor processing was found to be a relative weakness. Additionally, both meta-analytic studies investigated the relations between the level of intelligence of participants included in the studies and the effect size of group differences between the autistic and typically developing (TD) samples. In the case of analogical reasoning, these analyses suggested that the relative advantage of autistic participants might only be present among individuals with lower levels of intelligence. By contrast, impairments in metaphor processing appeared to be more pronounced among individuals with relatively lower levels of (verbal) intelligence. In our experimental study, we administered both verbal analogies and metaphors to the same sample of high-functioning autistic participants and TD controls. The two groups were matched on age, verbal IQ, working memory and educational background. Our aim was to understand better the similarities and differences between processing analogies and metaphors, and to see whether the advantage in analogical reasoning and the disadvantage in metaphor processing are universal in autism.
Speaker
Kinga Morsanyi • Loughborough University
Scheduled for
May 5, 2021, 5:00 PM
Timezone
GMT
Decoding the neural processing of speech
Understanding speech in noisy backgrounds requires selective attention to a particular speaker. Humans excel at this challenging task, while current speech recognition technology still struggles when background noise is loud. The neural mechanisms by which we process speech remain, however, poorly understood, not least due to the complexity of natural speech. Here we describe recent progress obtained by applying machine learning to neuroimaging data from humans listening to speech in different types of background noise. In particular, we develop statistical models that relate characteristic features of speech, such as pitch, amplitude fluctuations, and linguistic surprisal, to neural measurements. We find neural correlates of speech processing both at the subcortical level, related to pitch, and at the cortical level, related to amplitude fluctuations and linguistic structures. We also show that some of these measures can be used to diagnose disorders of consciousness. Our findings may be applied in smart hearing aids that automatically adjust speech processing to assist a user, as well as in the diagnosis of brain disorders.
Speaker
Tobias Reichenbach • Friedrich-Alexander-Universität Erlangen-Nürnberg
Scheduled for
Mar 22, 2021, 12:00 PM
Timezone
GMT
Kamala Harris and the Construction of Complex Ethnolinguistic Political Identity
Over the past 50 years, sociolinguistic studies on black Americans have expanded in both theoretical and technical scope, and newer research has moved beyond seeing speakers, especially black speakers, as a monolithic sociolinguistic community (Wolfram 2007, Blake 2014). Yet there remains a dearth of critical work on complex identities existing within black American communities as well as how these identities are reflected and perceived in linguistic practice. At the same time, linguists have begun to take greater interest in the ways in which public figures, such as politicians, may illuminate the wider social meaning of specific linguistic variables. In this talk, I will present results from analyses of multiple aspects of ethnolinguistic variation in the speech of Vice President Kamala Harris during the 2019-2020 Democratic Party Primary debates. Together, these results show how VP Harris expertly employs both enregistered and subtle linguistic variables, including aspects of African American Language morphosyntax, vowels, and intonational phonology in the construction and performance of a highly specific sociolinguistic identity that reflects her unique positions politically, socially, and racially. The results of this study expand our knowledge about how the complexities of speaker identity are reflected in sociolinguistic variation, as well as press on the boundaries of what we know about how speakers in the public sphere use variation to reflect both who they are and who we want them to be.
Speaker
Nicole Holliday • University of Pennsylvania
Scheduled for
Feb 25, 2021, 12:00 PM
Timezone
EDT
Theory-driven probabilistic modeling of language use: a case study on quantifiers, logic and typicality
Theoretical linguistics postulates abstract structures that successfully explain key aspects of language. However, the precise relation between abstract theoretical ideas and empirical data from language use is not always apparent. Here, we propose to empirically test abstract semantic theories through the lens of probabilistic pragmatic modelling. We consider the historically important case of quantity words (e.g., 'some', 'all'). Data from a large-scale production study seem to suggest that quantity words are understood via prototypes. But based on statistical and empirical model comparison, we show that a probabilistic pragmatic model that embeds a strict truth-conditional notion of meaning explains the data just as well as a model that encodes prototypes into the meaning of quantity words.
Speaker
Michael Franke • University of Osnabrück
Scheduled for
Feb 2, 2021, 3:20 PM
Timezone
GMT+1
Monkey Talk – what studies about nonhuman primate vocal communication reveal about the evolution of speech
The evolution of speech is considered to be one of the hardest problems in science. Studies of the communicative abilities of our closest living relatives, the nonhuman primates, aim to contribute to a better understanding of the emergence of this uniquely human capability. Following a brief introduction to the key building blocks that make up the human speech faculty, I will focus on the question of meaning in nonhuman primate vocalizations. While nonhuman primate calls may be highly context specific, thus giving rise to the notion of ‘referentiality’, comparisons across closely related species suggest that this specificity is evolved rather than learned. Yet, as in humans, the structure of calls varies with arousal and affective state, and there is some evidence for effects of sensory-motor integration in vocal production. Thus, the vocal production of nonhuman primates bears little resemblance to the symbolic and combinatorial features of human speech, while basic production mechanisms are shared. Listeners, in contrast, are able to learn the meaning of new sounds. A recent study using an artificial predator shows that this learning may be extremely rapid. Furthermore, listeners are able to integrate information from multiple sources to make adaptive decisions, which renders the vocal communication system as a whole relatively flexible and powerful. In conclusion, constraints on the side of vocal production, including limits in social cognition and the motivation to share experiences, rather than constraints on the side of the recipient, explain the differences in communicative abilities between humans and other animals.
Speaker
Julia Fischer • German Primate Center
Scheduled for
Oct 20, 2020, 12:00 PM
Timezone
EDT
Unsupervised deep learning identifies semantic disentanglement in single inferotemporal neurons
Irina Higgins is a research scientist at DeepMind, where she works in the Frontiers team. Her work aims to bring together insights from the fields of neuroscience and physics to advance general artificial intelligence through improved representation learning. Before joining DeepMind, Irina was a British Psychological Society Undergraduate Award winner for her achievements as an undergraduate student in Experimental Psychology at Westminster University, followed by a DPhil at the Oxford Centre for Computational Neuroscience and Artificial Intelligence, where she focused on understanding the computational principles underlying speech processing in the auditory brain. During her DPhil, Irina also worked on developing poker AI, applying machine learning in the finance sector, and speech recognition at Google Research. Paper: https://arxiv.org/pdf/2006.14304.pdf
Speaker
Irina Higgins • Google DeepMind
Scheduled for
Jul 14, 2020, 2:00 PM
Timezone
GMT
Predicting Patterns of Similarity Among Abstract Semantic Relations
In this talk, I will present some data showing that people's similarity judgments among word pairs reflect distinctions between abstract semantic relations like contrast, cause-effect, or part-whole. Further, the extent to which individual participants' similarity judgments discriminate between abstract semantic relations was linearly associated with both fluid and crystallized verbal intelligence, albeit more strongly with fluid intelligence. Finally, I will compare three models according to their ability to predict these similarity judgments. All models take as input vector representations of individual word meanings, but they differ in their representation of relations: one model does not represent relations at all, a second model represents relations implicitly, and a third model represents relations explicitly. Across the three models, the third model served as the best predictor of human similarity judgments, suggesting the importance of explicit relation representation to fully account for human semantic cognition.
Speaker
Nick Ichien • UCLA
Scheduled for
Jul 8, 2020, 5:00 PM
Timezone
GMT