Hallucination
NLP

When an AI model generates plausible-sounding but factually incorrect or fabricated information with apparent confidence.
Hallucination occurs when large language models produce outputs that are fluent and confident but factually wrong. The model might fabricate citations, invent historical events, or provide detailed but entirely fictional instructions. This happens because LLMs are trained to predict statistically likely next tokens, not to determine truth -- they optimize for fluency and plausibility, not accuracy.
Several factors cause hallucinations. LLMs have no grounding in truth; their training data mixes facts, opinions, fiction, and errors with nothing labeling which is which. Maximum likelihood training means the model learns to produce plausible text, not truthful text. The model also overgeneralizes: having learned patterns like 'X is the capital of Y', it can confidently generate fictional capitals for fictional countries. Critically, LLMs have no built-in mechanism to express uncertainty -- they will produce the most likely token even when every option is unlikely.
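The last point can be seen in a toy sketch of next-token decoding. The vocabulary and logits below are invented for illustration; the point is that greedy decoding returns an argmax token whether the distribution is peaked or nearly flat, so the output text reads equally confident either way.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token logits over a tiny vocabulary.
vocab = ["Paris", "Lyon", "Zorbania", "Atlantis"]
confident = softmax([8.0, 1.0, 0.5, 0.2])   # one clear winner
uncertain = softmax([1.1, 1.0, 0.9, 1.0])   # nearly flat: model has no real preference

# Greedy decoding picks the argmax token in both cases; the emitted
# word carries no signal that the second distribution was almost uniform.
for name, probs in [("confident", confident), ("uncertain", uncertain)]:
    best = max(range(len(vocab)), key=lambda i: probs[i])
    print(name, vocab[best], round(probs[best], 2))
```

A system that wanted to abstain would have to inspect the distribution itself (e.g. its entropy or max probability), which standard generation loops discard.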
Mitigations include Retrieval-Augmented Generation (RAG), which grounds responses in retrieved documents; instruction tuning to teach models to say 'I don't know'; RLHF to penalize false statements; and chain-of-thought prompting to surface reasoning steps. However, no complete fix exists because the fundamental architecture is a pattern matcher, not a truth engine. Production systems must always include verification, fallback logic, and appropriate disclaimers.
Last updated: February 22, 2026