Project C05 — Investigating Noise and Silence in Large Language Models

Background
Large language models (LLMs) based on transformer architectures have become central to natural language processing, yet their internal representational processes remain difficult to interpret. In particular, noise and silence are usually treated as undesirable side effects—either as errors or as missing information. Project C05 adopts a different perspective by analyzing both phenomena as structurally meaningful components of model behavior.
Objectives and Approach
The project investigates how noise and silence emerge and propagate within LLMs. Noise is studied as representational instability, for example in hallucinations, token ambiguity, or information loss across layers. Silence is conceptualized not as the absence of output, but as an implicit discourse signal relevant to phenomena such as turn-taking, hesitation, or underspecification.
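One way to make "information loss across layers" concrete is to measure how much a token's hidden representation drifts between adjacent layers. The following is a minimal sketch of such a measurement, assuming a HuggingFace transformers model; the model name (gpt2) and the cosine-based drift metric are illustrative assumptions, not the project's actual method.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "gpt2"  # assumption: any HuggingFace model works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()

text = "The meeting fell silent after the question."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# hidden_states is a tuple of (num_layers + 1) tensors of shape
# (batch, seq_len, hidden_dim): the embedding layer plus each block.
hidden_states = outputs.hidden_states

# Per-layer drift: 1 - cosine similarity between consecutive layers,
# averaged over token positions. High drift can flag unstable spans;
# this heuristic is a placeholder, not the project's metric.
for i in range(len(hidden_states) - 1):
    a, b = hidden_states[i][0], hidden_states[i + 1][0]
    cos = torch.nn.functional.cosine_similarity(a, b, dim=-1)
    print(f"layer {i} -> {i + 1}: mean drift = {(1 - cos).mean():.4f}")
```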
Methodologically, the project combines two strands: fine-tuning models on dialogue corpora annotated for silence-related events, and visual analytics techniques that expose internal model processes, such as embeddings and attention patterns. Together these support explainable AI by making latent structures accessible to linguistic and computational analysis, as the sketch below illustrates.
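As one illustration of surfacing attention patterns for visual inspection, the sketch below renders a single attention head as a heatmap. It assumes a HuggingFace model queried with output_attentions enabled; the model name (bert-base-uncased), the chosen layer/head indices, and the matplotlib plot are placeholder choices, not the project's actual tooling.

```python
import torch
import matplotlib.pyplot as plt
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"  # assumption: any attention-exposing model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_attentions=True)
model.eval()

text = "She paused and said nothing."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# attentions is a tuple of num_layers tensors, each of shape
# (batch, num_heads, seq_len, seq_len).
layer, head = 5, 3  # arbitrary indices, chosen only for illustration
attn = outputs.attentions[layer][0, head]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

# Heatmap of attention weights from each token (rows) to each token (cols).
fig, ax = plt.subplots()
ax.imshow(attn, cmap="viridis")
ax.set_xticks(range(len(tokens)), tokens, rotation=90)
ax.set_yticks(range(len(tokens)), tokens)
ax.set_title(f"Attention, layer {layer}, head {head}")
plt.tight_layout()
plt.show()
```

In practice such heatmaps are the simplest entry point for visual analytics over attention; richer tooling would aggregate across heads and layers, but the same output_attentions hook supplies the underlying data.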
Funding
German Research Foundation (DFG), SFB 1760