Over the past decade, Natural Language Processing (NLP) has undergone a transformative journey, marked by profound changes, particularly in the development of Large Language Models (LLMs). While some applications of LLMs, such as dialogue agents, have become a common part of our daily lives, their underlying complexities can go unnoticed. This talk focuses on one key aspect of language comprehension: affect. Affective traits encompass factors such as emotions, humor, sarcasm, and moral values, all of which are essential for fully understanding what is being communicated. Our work examines these subtle elements, aiming to enhance the interpretative abilities of LLMs by deepening their understanding of these traits in language, contributing to more meaningful human-machine interactions.
Large Language Models (LLMs) are often seen as powerful yet static entities, their knowledge frozen after training, disconnected from the ever-evolving world. In this talk, we will explore the challenge of updating these models without retraining them from scratch. We’ll examine current techniques such as fine-tuning, parameter-efficient fine-tuning (PEFT), Retrieval-Augmented Generation (RAG), model editing, and continual-learning approaches like Elastic Weight Consolidation (EWC), each with its own trade-offs in scalability, consistency, and memory retention.
The integration of machine learning, particularly large language models (LLMs), into medical applications offers great potential to streamline clinical documentation. This study explores the feasibility and effectiveness of generating structured medical letters exclusively from conversational data between physicians and patients. Using only local models, such as the Whisper speech-to-text models for transcription and a local instance of Phi-4 for summarization, we aim to automate the creation of clinical documentation while also generating free-to-use gold-standard datasets for future research. The methodology involves recording 100 real-world physician-patient consultations in clinical settings, transcribing the conversations into text, and generating clinical letters using only local models.
Large language models (LLMs) have demonstrated remarkable capabilities, but their high computational costs and reliance on extensive labeled data limit their practical deployment in resource-constrained settings. This talk explores strategies for efficiently adapting and leveraging smaller, more deployable models while minimizing reliance on human annotations.
Scaling up language models has led to increasingly advanced capabilities for those who can afford to train them. To enable community-tailored models for the rest of us, we will examine cross-model consistencies in how LMs acquire their linguistic knowledge, from fundamental syntax and semantics up to higher-level pragmatic features such as culture. By identifying these consistencies across different models, we highlight how they can enable dynamic personalization approaches that improve the accessibility of language technologies for underserved communities, in which collecting sufficient training data is physically impossible.
In this talk, I will present CARAG, a Context-Aware Retrieval Augmented Generation framework that improves Automated Fact Verification (AFV) by incorporating both local and global explanations. Unlike traditional fact-checking methods that focus on isolated claims, CARAG leverages thematic embedding aggregation to verify claims in a broader contextual landscape. I will also introduce CARAG-u, an unsupervised extension that eliminates the need for predefined thematic annotations, dynamically deriving contextually relevant evidence clusters from unstructured data. CARAG-u maintains strong performance while increasing adaptability and scalability. Through benchmarks on the FactVer dataset, I will demonstrate how these frameworks enhance explainability and thematic coherence, advancing the role of AI in trustworthy, transparent fact verification.
In the first part of my talk, I will discuss the joys and challenges of my master’s research on generating the script of a full-length play using GPT-2. Namely, I will share some of the strategies we used to work around the limited context length of the model, give the characters consistent personas, and, above all, make the play interesting for the audience to watch. In the second part, I will share my ongoing doctoral research on evaluating natural language generation (NLG). I will discuss our work on data contamination, present an overview of how NLG is evaluated across different specific tasks, and share the challenges of evaluating the semantic accuracy of summarization at scale when no reference is available.
In this lecture, we will journey through the core principles of AI agents, building a conceptual bridge from foundational theories to cutting-edge practical implementations. Attendees will gain insights into how autonomous agents operate, starting with basic AI agent architectures and evolving into sophisticated web automation systems. Highlighting our latest research with WebPilot, the lecture will showcase how integrating Monte Carlo Tree Search with a dual optimization strategy addresses the complexities of dynamic web tasks, mitigating vast action spaces and uncertainty through strategic exploration and adaptive decision-making.
The Teuken 7B model, a large language model for *European languages*, has recently made the news. If you’re interested in knowing how such models are trained, this week’s speaker is one of the lead scientists who’s done it.
As part of the Lamarr NLP monthly meetings, this week we have the pleasure of hosting Dr. Mehdi Ali from Fraunhofer IAIS, who will give a guest lecture titled "How To Train A Multilingual Large Language Model?".
We envision a world where AI agents (assistants) are widely used for complex tasks in our digital and physical worlds and are broadly integrated into our society. To move towards such a future, we need an environment for a robust evaluation of agents’ capability, reliability, and trustworthiness.