Citation Hallucination

Citation hallucination is the failure mode in which LLMs produce unreliable references: inventing plausible-looking author/title/year combinations, getting one or more bibliographic fields wrong on an otherwise-real source, or citing real references in support of claims the cited papers do not actually make.

Context & Background

Zhao et al. (2026-05) audited 111M references across 2.5M papers on arXiv, bioRxiv, SSRN, and PMC and conservatively estimated 146,932 hallucinated citations for 2025 alone, with a mid-2024 inflection. A five-type taxonomy is now in common use:

TF — Total Fabrication: the cited work does not exist
PAC — Plausible Author Composition: real author, but never wrote the cited paper
IH — Identity Hallucination: real paper attributed to wrong authors or year
PH — Partial Hallucination: real paper but title/venue/year garbled
SH — Source Hallucination: real reference used to support a claim it does not actually make (most insidious — passes Tier 0 existence checks)

Standard mitigation is multi-tier verification. A programmatic Tier 0 check against the Semantic Scholar, CrossRef, or OpenAlex APIs catches TF, PAC, IH, and PH at low cost; SH requires reading the cited source and is much harder to automate.
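A minimal sketch of what such a programmatic check could look like, assuming a single CrossRef lookup per reference. The `Ref` type, the `triage` function, and the 0.9 / 0.6 similarity thresholds are hypothetical choices for illustration, not part of any tool named here. The sketch also simplifies the taxonomy: a single lookup cannot separate PAC from TF, so both come back as "TF", and any author or year mismatch on a well-matched title is labeled "IH".

```python
import json
import re
import urllib.parse
import urllib.request
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Ref:
    """Hypothetical minimal bibliographic record used for comparison."""
    title: str
    authors: frozenset  # lowercased family names
    year: Optional[int]


def _norm(text: str) -> str:
    # Lowercase and collapse punctuation/whitespace so cosmetic differences don't count.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def crossref_lookup(title: str) -> Optional[Ref]:
    """Fetch the top CrossRef hit for a title (network call, shown for context)."""
    url = "https://api.crossref.org/works?" + urllib.parse.urlencode(
        {"query.bibliographic": title, "rows": 1}
    )
    with urllib.request.urlopen(url, timeout=10) as resp:
        items = json.load(resp)["message"]["items"]
    if not items:
        return None
    item = items[0]
    date_parts = item.get("issued", {}).get("date-parts", [[None]])
    year = date_parts[0][0] if date_parts and date_parts[0] else None
    return Ref(
        title=(item.get("title") or [""])[0],
        authors=frozenset(
            a["family"].lower() for a in item.get("author", []) if "family" in a
        ),
        year=year,
    )


def triage(claimed: Ref, found: Optional[Ref],
           exact: float = 0.9, partial: float = 0.6) -> str:
    """Compare a claimed reference with the best database hit (assumed thresholds)."""
    if found is None:
        return "TF"
    sim = SequenceMatcher(None, _norm(claimed.title), _norm(found.title)).ratio()
    if sim < partial:
        return "TF"  # best hit is a different work; could also be PAC
    if sim < exact:
        return "PH"  # title is close but garbled
    if claimed.authors != found.authors or claimed.year != found.year:
        return "IH"  # right title, wrong authors or year
    return "OK"
```

In use, `triage(claimed, crossref_lookup(claimed.title))` would give a first-pass label for each reference. Because a search API tends to return some top hit for almost any query, the title-similarity threshold does most of the work, and a result of "OK" says nothing about SH.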

Practical Implications

Never trust LLM parametric memory for citations. Verify every reference against an authoritative bibliographic database before publication.
Use multi-source verification: Semantic Scholar, CrossRef, and OpenAlex disagree often enough that triangulation matters.
External post-publication audit beats internal checks. ARS (Academic Research Skills for Claude Code) reports a real case where post-publication WebSearch verification found issues in 21 of 68 references (a 31% error rate) that had survived three rounds of integrity checks.
Anti-leakage protocol: when an AI agent has a session corpus, force it to prefer session materials over its memory; flag [MATERIAL GAP] for missing content rather than filling from training data.
Don't aggregate citations across model sessions without re-verification: a reference verified in one session does not count as verified in the next.
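The triangulation and per-session points above can be sketched as follows. This is an illustrative sketch only: `triangulate`, `SessionLedger`, the quorum of two, and the verdict strings are all hypothetical names and choices, and the per-source verdicts are assumed to come from separate lookups against each database.

```python
from collections import Counter
from dataclasses import dataclass, field


def triangulate(verdicts: dict, quorum: int = 2) -> str:
    """Combine per-source verdicts; accept a verdict only if `quorum` sources agree."""
    if not verdicts:
        return "UNVERIFIED"
    verdict, count = Counter(verdicts.values()).most_common(1)[0]
    return verdict if count >= quorum else "NEEDS_REVIEW"


@dataclass
class SessionLedger:
    """Verification state scoped to one session; nothing carries over by default."""
    session_id: str
    verified: set = field(default_factory=set)

    def record(self, ref_key: str, verdicts: dict) -> str:
        outcome = triangulate(verdicts)
        if outcome == "OK":
            self.verified.add(ref_key)
        return outcome

    def unverified(self, ref_keys) -> list:
        # References that still need checking in THIS session.
        return [k for k in ref_keys if k not in self.verified]


# Usage: a reference verified in session s1 is still unverified in session s2.
s1 = SessionLedger("s1")
s1.record("vaswani2017", {"crossref": "OK", "semantic_scholar": "OK", "openalex": "IH"})
s2 = SessionLedger("s2")
pending = s2.unverified(["vaswani2017"])
```

Keeping the ledger per session makes re-verification the default: anything merged from another session shows up in `unverified` until it is checked again.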

Key Sources

Academic Research Skills for Claude Code — Tier 0 Semantic Scholar API verification, 5-type taxonomy, post-publication audit case
KatmerCode — /cite-verify checks every reference against CrossRef, Semantic Scholar, OpenAlex
Reviewer (Haaland) — Schema-validated reviewer outputs with parser-quality preflight
Read the Paper, Write the Code — ETH benchmark on AI replication fidelity
