Adversarial Text: Detection, Quality Enhancement, and Future Challenges in the LLM Era

Adversarial text, carefully crafted inputs designed to mislead or degrade the performance of NLP systems, poses a growing challenge across a range of language technologies. In this talk, I will present my work on adversarial text detection and methods for improving the quality and stability of such texts once identified. I will discuss the linguistic and structural characteristics of adversarial inputs, outline current approaches for automatic detection, and introduce techniques for refining adversarial examples to make them more semantically coherent. While the primary focus will be on traditional NLP systems, I will also reflect on how these techniques might evolve to address the emerging complexities of large language models (LLMs). Looking ahead, I will highlight how adversarial methods could be leveraged not only for defence but also as diagnostic tools for probing and improving LLM robustness, interpretability, and trustworthiness.

Bridging Language and Cognition with Computational Models of Morality and Media Framing

When people comprehend, interpret, or communicate about their environment, they draw on “mental schemata” that encode common knowledge and associations based on experiences, moral values, or beliefs.
New information that aligns with existing mental schemata is much more readily understood and accepted. This talk will present two projects that explore how media framing and moral understanding manifest in humans and in LLMs. First, I will introduce “narrative media framing,” a conceptualization of framing grounded in the social sciences that links media framing devices with cognitively salient narrative representations. Second, I will present our recent work, in which we propose a robust method for probing representations of morality in LLMs through word associations.

Understanding AI Sentience

No artificial intelligence (AI) has yet been scientifically recognized as sentient. However, the concept of “sentient AI” continues to evoke a spectrum of fears, from valid concerns to misconceptions shaped by fiction. To distinguish genuine risks from misperceptions, I introduce a dual-index framework. The Sentience Index measures an AI’s objective sentience-relevant capacities, while the Human Perception Index measures the gap between reality and human perception of AI sentience, shaped by individual and collective narratives. This approach transforms fear into informed action by fostering evidence-based, philosophically grounded discourse on AI sentience and preparing society for its ontological and ethical implications.

NoLiMa: Long-Context Evaluation Beyond Literal Matching

Recent large language models (LLMs) support long contexts ranging from 128K to 1M tokens. A popular method for evaluating these capabilities is the needle-in-a-haystack (NIAH) test, which involves retrieving a “needle” (relevant information) from a “haystack” (long irrelevant context). Extensions of this approach include increasing distractors, fact chaining, and in-context reasoning.
However, in these benchmarks, models can exploit existing literal matches between the needle and haystack to simplify the task. To address this, we introduce NoLiMa, a benchmark extending NIAH with a carefully designed needle set, where questions and needles have minimal lexical overlap, requiring models to infer latent associations to locate the needle within the haystack. We evaluate 12 popular LLMs that claim to support contexts of at least 128K tokens.
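To make the idea of "minimal lexical overlap" concrete, the following is a minimal sketch, not the benchmark's actual tooling. It assumes a simple Jaccard overlap over content words as the overlap measure; the stopword list, the helper names (`content_words`, `lexical_overlap`, `build_haystack`), and the example needle and questions are all invented for illustration.

```python
import re

# Tiny illustrative stopword list (an assumption, not taken from the benchmark).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "is", "was",
             "who", "which", "has", "been", "and"}


def content_words(text):
    """Return the set of lowercased alphabetic tokens that are not stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def lexical_overlap(question, needle):
    """Jaccard overlap between the content words of a question and a needle."""
    q, n = content_words(question), content_words(needle)
    if not q or not n:
        return 0.0
    return len(q & n) / len(q | n)


def build_haystack(filler_sentences, needle, depth):
    """Insert the needle at a relative depth (0.0 = start, 1.0 = end)."""
    pos = round(depth * len(filler_sentences))
    return " ".join(filler_sentences[:pos] + [needle] + filler_sentences[pos:])


# Hypothetical needle and two questions about it.
needle = "Mara lives beside the Eiffel Tower."
literal_question = "Who lives beside the Eiffel Tower?"    # shares surface words
latent_question = "Which character has been to Paris?"     # needs world knowledge

literal_score = lexical_overlap(literal_question, needle)
latent_score = lexical_overlap(latent_question, needle)
```

In this toy setup the literal question shares most of its content words with the needle, while the latent question shares none, so a model cannot locate the needle by string matching alone and has to connect "Eiffel Tower" with "Paris".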

Affective Traits of Natural Language

Over the past decade, Natural Language Processing (NLP) has undergone a transformative journey, marked by profound changes, particularly in the development of Large Language Models (LLMs). While some applications of LLMs, such as dialogue agents, have become a common part of our daily lives, their underlying complexities can go unnoticed. This talk focuses on one key aspect of language comprehension: affect. Affective traits encompass factors such as emotions, humor, sarcasm, and moral values, all of which are essential for fully understanding what is being communicated. Our work examines these subtle elements, aiming to enhance the interpretative abilities of LLMs by deepening their understanding of these traits in language, contributing to more meaningful human-machine interactions.

Waking LLMs from CryoSleep with Continual Learning

Large Language Models (LLMs) are often seen as powerful yet static entities, their knowledge frozen after training, disconnected from the ever-evolving world. In this talk, we will explore the challenge of updating these models without retraining them from scratch. We’ll examine current techniques such as fine-tuning, parameter-efficient fine-tuning (PEFT), Retrieval-Augmented Generation (RAG), model editing, and regularization-based continual learning approaches like Elastic Weight Consolidation (EWC), each with its own trade-offs in scalability, consistency, and memory retention.
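As one illustration of the memory-retention trade-off, EWC adds a quadratic penalty that discourages parameters important for earlier knowledge from drifting during an update. The sketch below assumes a diagonal Fisher information estimate and plain Python lists in place of real model tensors; the function names and the numbers are hypothetical and only show the shape of the penalty, not any particular implementation discussed in the talk.

```python
def ewc_penalty(params, anchor_params, fisher, lam):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    params        -- current parameter values (theta)
    anchor_params -- values after the earlier training stage (theta*)
    fisher        -- diagonal Fisher information, one importance weight per parameter
    lam           -- strength of the pull back toward the anchor
    """
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher)
    )


def ewc_loss(new_task_loss, params, anchor_params, fisher, lam=1.0):
    """Total objective: loss on the new data plus the EWC penalty."""
    return new_task_loss + ewc_penalty(params, anchor_params, fisher, lam)


# Toy numbers: the first parameter moved by 1.0 and has importance 4.0;
# the second did not move, so it contributes nothing despite high importance.
penalty = ewc_penalty([1.0, 2.0], [0.0, 2.0], [4.0, 10.0], lam=0.5)
total = ewc_loss(0.25, [1.0, 2.0], [0.0, 2.0], [4.0, 10.0], lam=0.5)
```

A larger `lam` favours retention of old knowledge at the cost of slower adaptation to new information; a smaller one does the opposite.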

Structured Summarization of German Clinical Dialogue in Orthopedy

The integration of machine learning, particularly large language models (LLMs), into medical applications offers great potential for supporting clinical documentation. This study explores the feasibility and effectiveness of generating structured medical letters exclusively from conversational data between physicians and patients. Using only local models, namely Whisper speech-to-text models for transcription and a local instance of Phi-4 for summarization, we aim to automate the creation of clinical documentation while also generating free-to-use gold-standard datasets for future research. The methodology involves recording 100 real-world physician-patient consultations in clinical settings, transcribing the conversations into text, and generating clinical letters using only local models.