Adversarial text (carefully crafted inputs designed to mislead or degrade the performance of NLP systems) poses a growing challenge across a range of language technologies. In this talk, I will present my work on adversarial text detection and methods for improving the quality and stability of such texts once identified. I will discuss the linguistic and structural characteristics of adversarial inputs, outline current approaches for automatic detection, and introduce techniques for refining adversarial examples to make them more semantically coherent. While the primary focus will be on traditional NLP systems, I will also reflect on how these techniques might evolve to address the emerging complexities of large language models (LLMs). Looking ahead, I will highlight how adversarial methods could be leveraged not only for defence but also as diagnostic tools for probing and improving LLM robustness, interpretability, and trustworthiness.
When people comprehend, interpret, or communicate about their environment, they draw on “mental schemata” that encode common knowledge and associations based on experiences, moral values, or beliefs.
New information that aligns with existing mental schemata is much more readily understood and accepted. This talk will present two projects that explore how media framing and human moral understanding manifest in LLMs. First, I will introduce “narrative media framing,” a conceptualization of framing grounded in the social sciences that links media framing devices with cognitively salient narrative representations. Second, I will present our recent work, in which we propose a robust method for probing representations of morality in LLMs through word associations.
No artificial intelligence (AI) has yet been scientifically recognized as sentient. However, the concept of “sentient AI” continues to evoke a spectrum of fears, from valid concerns to misconceptions shaped by fiction. To distinguish genuine risks from misperceptions, I introduce a dual-index framework. The Sentience Index measures an AI’s objective sentience-relevant capacities, while the Human Perception Index measures the gap between reality and human perception of AI sentience, shaped by individual and collective narratives. This approach transforms fear into informed action by fostering evidence-based, philosophically grounded discourse on AI sentience and preparing society for its ontological and ethical implications.
Recent large language models (LLMs) support long contexts ranging from 128K to 1M tokens. A popular method for evaluating these capabilities is the needle-in-a-haystack (NIAH) test, which involves retrieving a “needle” (relevant information) from a “haystack” (long irrelevant context). Extensions of this approach include increasing distractors, fact chaining, and in-context reasoning.
However, in these benchmarks, models can exploit existing literal matches between the needle and haystack to simplify the task. To address this, we introduce NoLiMa, a benchmark extending NIAH with a carefully designed needle set, where questions and needles have minimal lexical overlap, requiring models to infer latent associations to locate the needle within the haystack. We evaluate 12 popular LLMs that claim to support contexts of at least 128K tokens.
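To make the idea of "minimal lexical overlap" concrete, here is a minimal Python sketch. It is not the NoLiMa implementation: the tokenizer, the stopword list, the Jaccard overlap measure, and the example sentences are all assumptions chosen for illustration. It places a needle among filler sentences and compares a question that literally matches the needle with one that requires a latent association (knowing that the Eiffel Tower is in Paris).

```python
import re

# Hypothetical stopword list, kept tiny for illustration only.
STOPWORDS = frozenset({"the", "a", "an", "to", "of", "is", "who", "which", "has", "been"})


def tokenize(text):
    """Lowercase alphanumeric tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_haystack(filler_sentences, needle, position):
    """Insert the needle sentence at the given index among the filler sentences."""
    sentences = list(filler_sentences)
    sentences.insert(position, needle)
    return " ".join(sentences)


def lexical_overlap(question, needle, stopwords=STOPWORDS):
    """Jaccard overlap between the content-word sets of question and needle."""
    q = set(tokenize(question)) - stopwords
    n = set(tokenize(needle)) - stopwords
    if not q or not n:
        return 0.0
    return len(q & n) / len(q | n)


# Made-up example data (assumption, not taken from the benchmark).
filler = [
    "The weather was mild that week.",
    "Markets closed slightly higher.",
    "A new bridge opened downtown.",
]
needle = "Mara lives beside the Eiffel Tower."
literal_question = "Who lives beside the Eiffel Tower?"
latent_question = "Which character has been to Paris?"

haystack = build_haystack(filler, needle, position=1)
literal_score = lexical_overlap(literal_question, needle)
latent_score = lexical_overlap(latent_question, needle)

print(f"literal overlap: {literal_score:.2f}")
print(f"latent overlap:  {latent_score:.2f}")
```

In this toy setting the literal question shares most of its content words with the needle, while the latent question shares none, so a model cannot locate the needle by surface matching alone.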
The hackathon at the Bonn Surgical Technology Center (BOSTER) of the University Hospital Bonn (UKB) brought together participants to explore the role of machine learning in medicine using real clinical data. Prof. Fröhlich from b-it and his team supported the event as advisors and jury members, emphasizing the educational value and impressive outcomes for the students.
On June 23 and 24, 2025, the second edition of “AI in the Life Sciences – An Industry Symposium” will take place at Schloss Birlinghoven in Sankt Augustin, Germany. The symposium, organized by Fraunhofer SCAI and the Bonn-Aachen International Center for Information Technology (b-it), brings together leading experts in artificial intelligence (AI) and life sciences.
Over the past decade, Natural Language Processing (NLP) has undergone a transformative journey, marked by profound changes, particularly in the development of Large Language Models (LLMs). While some applications of LLMs, such as dialogue agents, have become a common part of our daily lives, their underlying complexities can go unnoticed. This talk focuses on one key aspect of language comprehension: affect. Affective traits encompass factors such as emotions, humor, sarcasm, and moral values, all of which are essential for fully understanding what is being communicated. Our work examines these subtle elements, aiming to enhance the interpretative abilities of LLMs by deepening their understanding of these traits in language, contributing to more meaningful human-machine interactions.
Large Language Models (LLMs) are often seen as powerful yet static entities, their knowledge frozen after training, disconnected from the ever-evolving world. In this talk, we will explore the challenge of updating these models without retraining them from scratch. We’ll examine current techniques such as fine-tuning, parameter-efficient methods (PEFT), Retrieval-Augmented Generation (RAG), and model editing approaches like Elastic Weight Consolidation (EWC), each with its own trade-offs in scalability, consistency, and memory retention.
The integration of machine learning, particularly large language models (LLMs), into medical applications offers great potential for supporting clinical documentation. This study explores the feasibility and effectiveness of generating structured medical letters exclusively from conversational data between physicians and patients. Using only local models, such as the Whisper speech-to-text models for transcription and a local instance of Phi-4 for summarization, we aim to automate the creation of clinical documentation while also generating free-to-use gold-standard datasets for future research. The methodology involves recording 100 real-world physician-patient consultations in clinical settings, transcribing the conversations into text, and generating clinical letters using only local models.
Large language models (LLMs) have demonstrated remarkable capabilities, but their high computational costs and reliance on extensive labeled data limit their practical deployment in resource-constrained settings. This talk explores strategies for efficiently adapting and leveraging smaller, more deployable models while minimizing reliance on human annotations.