Adversarial Text: Detection, Quality Enhancement, and Future Challenges in the LLM Era

Shakila Mahjabin Tonni (Data61, CSIRO, Australia)

27. July 2025
11:00 – 12:00

Abstract:

Adversarial text (carefully crafted inputs designed to mislead or degrade the performance of NLP systems) poses a growing challenge across a range of language technologies. In this talk, I will present my work on adversarial text detection and methods for improving the quality and stability of such texts once identified. I will discuss the linguistic and structural characteristics of adversarial inputs, outline current approaches for automatic detection, and introduce techniques for refining adversarial examples to make them more semantically coherent. While the primary focus will be on traditional NLP systems, I will also reflect on how these techniques might evolve to address the emerging complexities of large language models (LLMs). Looking ahead, I will highlight how adversarial methods could be leveraged not only for defence but also as diagnostic tools for probing and improving LLM robustness, interpretability, and trustworthiness.

I will focus on these two papers of mine and discuss some future research directions I am interested in:

1. “What Learned Representations and Influence Functions Can Tell Us About Adversarial Examples” https://aclanthology.org/2023.findingsijcnlp.35.pdf

2. “Graded Suspiciousness of Adversarial Texts to Humans” https://direct.mit.edu/coli/article/doi/10.1162/coli_a_00555/128185

Bio:

Shakila Mahjabin Tonni is a Postdoctoral Research Fellow at Data61, CSIRO, Australia. She earned her PhD in Computer Science from Macquarie University, specialising in Natural Language Processing. Her doctoral research focused on adversarial text generation and detection, along with human perception of these methods. She is currently working on evaluating and benchmarking large language models for question answering in genomic research.
