Checking whether AI citations actually support the claim

TL;DR

AI tools produce fabricated or misattributed citations at rates between 20% and 47% in controlled studies, including real URLs linking to sources that contradict or simply don't contain the attributed claim. The ICO, FCA, and NCSC all confirm that accuracy obligations rest with the organisation using the output, not the tool. A practical check means clicking through every link, searching the document for the specific claim, and verifying date and jurisdiction before anything goes external.

Key takeaways

- AI tools produce fabricated or mismatched citations at rates of 20-47% depending on the domain, including real URLs that link to sources that say something different or don't contain the claim at all.
- The ICO confirms that relying on AI does not reduce your accountability obligations under UK GDPR, and organisations must maintain documentation showing how AI outputs were checked before influencing decisions.
- The FCA holds financial services firms fully responsible for the accuracy of AI-assisted client communications, regardless of which tool produced the content.
- A practical citation check has four steps: click the link, search the document for the specific claim using Ctrl+F, check publication date and jurisdiction, and document any corrections.
- Sector-specific tools drawing from curated document sets reduce citation risk but do not eliminate it. A human confirmation step is required before any cited AI output goes to a client or influences a significant decision.

A team member uses ChatGPT to pull together background research for a client proposal. The output arrives with five citations, all linking to credible-looking sources. The proposal goes out. Two days later the client responds: one statistic does not appear in the linked report at all, and a second URL leads to a document that contradicts the claim entirely.

This kind of mismatch is well enough documented that UK regulators, the AI vendors themselves, and independent researchers have all written about it explicitly. The pattern is commonly called hallucination, and it happens often enough to warrant a simple default rule: treat every AI citation as provisional until you have confirmed it against the source.

What does it mean to check whether a citation actually supports the claim?

When an AI tool provides a citation, it may be fabricating the source entirely, linking to a real document that says something different, or referencing material from the wrong jurisdiction or date. Checking whether a citation supports the claim means going beyond confirming the URL loads: it means confirming the specific number, conclusion, or assertion attributed to that source actually appears in it.

Controlled studies have found fabrication or mismatch rates between 20% and 47% depending on the domain and how the question is posed. A 2023 study tested ChatGPT on medical questions and found that 20 to 30% of references it provided were either fabricated or failed to support the claim made. A 2024 evaluation found false citation rates as high as 47% when models were asked to supply references in unfamiliar domains.

The AI vendors themselves are direct about this. OpenAI’s usage policies state that outputs may be inaccurate and recommend human review before relying on content for factual purposes. Microsoft’s Copilot documentation warns that it can “get things wrong” and that users should verify important information independently. Both warnings reflect a fundamental limitation in how language models produce text. When the companies selling these tools recommend independent verification of every output, the recommendation is grounded in how the systems actually work.

Why does this matter for your business?

An unchecked AI citation creates three types of risk. Operational: a decision built on a wrong number takes your team in the wrong direction. Reputational: a client who clicks the source and finds a mismatch loses confidence quickly. Regulatory: UK law places the accuracy obligation on the organisation using the AI output, not on the tool that produced it.

The UK Information Commissioner’s Office is explicit on this point. Its guidance on AI and data protection states that organisations must ensure AI outputs are “sufficiently accurate for their intended purpose” and must maintain processes to detect and correct errors. The guidance confirms that relying on AI “does not remove or reduce your accountability obligations,” and that organisations must maintain documentation showing how AI outputs were checked before being used in decisions affecting individuals.

For owner-managed businesses in regulated sectors the exposure is higher. The FCA’s guidance on machine learning in financial services confirms that firms remain fully responsible for the accuracy of any AI-driven analysis or advice provided to customers. An AI citation in a client communication is an organisational output, not a technical artefact, and regulators treat it accordingly.

The 2023 US case Mata v. Avianca illustrated the consequences at their sharpest. Two lawyers filed a court brief containing six non-existent cases generated by ChatGPT. The judge fined them US$5,000 for submitting “non-existent judicial opinions with fake quotes and citations.” UK law firms have since issued internal guidance forbidding unchecked AI research for court submissions. The jurisdictional distance does not reduce the relevance of that lesson.

Where will you actually encounter this?

Citation checking becomes relevant any time your team uses a general-purpose AI tool to produce content that includes references, statistics, or regulatory details. The most common situations in owner-managed businesses are client-facing documents, internal reports used to justify decisions, staff-facing guidance that cites law or policy, and any marketing content that references research or sector data.

A 2024 CIPD survey found that 20% of UK employers were already using generative AI tools at work, while only 19% had provided any guidance or training on their use. That gap is where citation problems develop. Staff are producing research-backed material without a shared understanding of whether the references hold up under scrutiny.

The National Cyber Security Centre’s guidance on using public generative AI safely recommends treating all AI outputs as “unverified” by default, and checking critical information against trusted sources, particularly where legal, financial, or security implications are present. The NCSC frames citation checking as security hygiene, not a quality nicety.

Retrieval-augmented generation tools, where the AI draws answers from a specific document set, reduce citation risk but do not eliminate it. Even when a tool is working from your own knowledge base, it can misstate what a document says. The claim and the source still need to be reconciled by a human before anything goes external.

When do you need to check, and when can you reasonably take a lighter approach?

The answer depends on what the output is used for and who sees it. External communications, anything influencing a decision about a person or a regulated product, and anything taken as authoritative by a client all require a full citation check. Internal drafts used as a starting point for further research can carry lighter-touch review, provided the team knows the citations are provisional.

A practical citation check has four steps. Click every link and confirm the URL loads and is the type of source the AI described. Search within the page (Ctrl+F or Cmd+F) for the specific number, phrase, or conclusion the AI attributed to it. Check the publication date and jurisdiction: a UK regulatory claim needs a UK source, and guidance from several years ago may not reflect current rules. Document any corrections you find, as this gives you a record if a decision is ever questioned and supports your accountability obligations under UK GDPR.
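
As a rough illustration, steps two and three can be sketched in Python. This is a minimal sketch, not a finished tool: the names `CitationCheck` and `check_citation`, the three-year `max_age_years` cut-off, and the example URL are all assumptions made for the example. It assumes a person has already opened the link (step one) and pasted the source text in. A text match only shows that the words appear somewhere in the source, so a flagged or unflagged result still needs a human to read the passage in context.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class CitationCheck:
    """Outcome of checking one AI-supplied citation (illustrative structure)."""
    url: str
    missing_terms: list = field(default_factory=list)
    notes: list = field(default_factory=list)

    @property
    def needs_human_follow_up(self) -> bool:
        # Any missing term or date/jurisdiction note means someone must look closer.
        return bool(self.missing_terms or self.notes)


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so line breaks do not hide a match."""
    return " ".join(text.lower().split())


def check_citation(url, source_text, claim_terms, published_year,
                   source_jurisdiction, expected_jurisdiction="UK",
                   max_age_years=3, today=None):
    """Flag claim terms absent from the source, plus date and jurisdiction concerns."""
    today = today or date.today()
    result = CitationCheck(url=url)
    haystack = normalise(source_text)

    # Step two: the Ctrl+F equivalent for each number or phrase the AI attributed.
    for term in claim_terms:
        if normalise(term) not in haystack:
            result.missing_terms.append(term)

    # Step three: publication date and jurisdiction (the age limit is an assumption).
    if today.year - published_year > max_age_years:
        result.notes.append(f"source is from {published_year}; may be outdated")
    if source_jurisdiction != expected_jurisdiction:
        result.notes.append(
            f"source is {source_jurisdiction}; claim needs {expected_jurisdiction}"
        )
    return result


# Example with made-up source text: one term is present, one is not.
source = "Our survey found that 20% of employers\nuse generative AI at work."
check = check_citation(
    url="https://example.com/report",
    source_text=source,
    claim_terms=["20% of employers", "47%"],
    published_year=2024,
    source_jurisdiction="UK",
    today=date(2025, 1, 1),
)
print(check.missing_terms)          # ['47%']
print(check.needs_human_follow_up)  # True
```

Anything listed in `missing_terms` or `notes` is a prompt for the fourth step: correct the claim or remove it, and record what was changed.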

For high-stakes contexts, an AI tool can assist with the initial verification pass. Share the paragraph and the source document with a tool and ask it to locate where the specific claim appears. Use the response as a starting point. Final verification rests with a human reading the primary source.

A simple internal rule covers the large majority of cases: all AI-generated citations and factual claims must be checked against primary sources before anything goes external. For owner-managed businesses this does not require a compliance function. It requires a shared habit.

What connects to citation checking in a broader output evaluation practice?

Citation checking is one part of a broader evaluation discipline for AI output. It connects to the practice of spotting AI outputs that are confidently wrong, to understanding when AI-generated numbers and statistics cannot be trusted, and to the question of what a proportionate review workflow looks like when your team is handling different types of AI output at volume.

The ICO and the FCA both point toward maintaining an audit trail: being able to demonstrate how AI outputs were reviewed before they influenced decisions, not just that individual outputs were spot-checked. The Competition and Markets Authority’s 2023 review of foundation models noted that inaccurate AI outputs can harm consumers, and that developers and deployers share responsibility for ensuring accurate information reaches users.
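
For a small team, an audit trail can be as simple as one log entry per checked claim. The Python sketch below shows one possible shape for such an entry. The field names, the outcome labels, and the helper names `review_record` and `append_record` are illustrative assumptions, not a format prescribed by the ICO or the FCA.

```python
import io
import json
from datetime import datetime, timezone


def review_record(document, claim, source_url, reviewer, outcome, correction=""):
    """Build one audit-trail entry (field names are illustrative)."""
    return {
        "reviewed_at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "document": document,
        "claim": claim,
        "source_url": source_url,
        "reviewer": reviewer,
        "outcome": outcome,  # assumed labels: "confirmed", "corrected", "removed"
        "correction": correction,
    }


def append_record(log_file, record):
    """Write the entry as one JSON line so the log stays append-only and searchable."""
    log_file.write(json.dumps(record) + "\n")


# Example using an in-memory buffer; in practice this could be a file opened in append mode.
log = io.StringIO()
append_record(log, review_record(
    document="client-proposal.docx",
    claim="20% of UK employers use generative AI at work",
    source_url="https://example.com/report",
    reviewer="A. Reviewer",
    outcome="confirmed",
))
print(log.getvalue())
```

Recording who checked each claim, against which source, and with what outcome is what turns individual spot checks into something that can be shown later if a decision is questioned.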

For owner-managed businesses with operations in the EU, the EU AI Act introduces documentation requirements for high-risk AI systems. If your business offers AI-driven services to EU clients, the documentation expectations around how your system sources and cites evidence are likely to be more demanding than current UK requirements alone.

If your team is using AI tools heavily enough that manual citation checking is becoming a bottleneck, the right conversation is about a structured output evaluation workflow. That conversation starts with the same discipline: confirming the source actually contains what the AI reported it contains, before the output leaves your business.

Sources

- ICO (2023). Guidance on AI and Data Protection. Confirms that organisations must ensure AI outputs are sufficiently accurate for their intended purpose and that accountability obligations remain with the data controller, not the AI tool. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/
- ICO (2022). Guidance on AI and Data Protection (detailed PDF). Sets out documentation and audit-trail obligations when AI outputs influence decisions about individuals. https://ico.org.uk/media/for-organisations/2617219/guidance-on-ai-and-data-protection.pdf
- FCA (2022). Machine Learning in Financial Services. FCA paper confirming firms remain fully accountable for the accuracy of AI-driven analysis and advice in client-facing contexts. https://www.fca.org.uk/insight/machine-learning-big-data-financial-services
- NCSC (2023). Guidelines for Secure Use of Generative AI. Advises organisations to treat AI outputs as unverified by default and check critical information against trusted sources, particularly where legal, financial, or security implications are present. https://www.ncsc.gov.uk/blog-post/guidelines-for-secure-use-of-generative-ai
- CIPD (2024). Generative AI in the World of Work. Survey finding that 20% of UK employers were using generative AI at work while only 19% had provided guidance or training on its use. https://www.cipd.org/uk/knowledge/reports/generative-ai-work/
- Chen et al. (2023). Peer-reviewed study on ChatGPT citation accuracy in medical contexts, finding 20-30% of references were either fabricated or failed to support the specific claim made. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10090306/
- Zheng et al. (2024). Evaluation of LLM-generated bibliographies finding false citation rates as high as 47% when models were prompted for references in unfamiliar domains. https://arxiv.org/abs/2401.01286
- Mata v. Avianca Inc. (2023). US court case in which lawyers were sanctioned US$5,000 for filing a brief containing six non-existent ChatGPT-generated cases; widely referenced in AI governance training globally. https://www.courtlistener.com/docket/67678263/mata-v-avianca-inc/
- UK GDPR Article 5(1)(d) (Retained EU Law 2016/679). Establishes that personal data must be accurate and kept up to date, the legal foundation for the ICO's accuracy obligations applied to AI outputs. https://www.legislation.gov.uk/eur/2016/679/article/5
- CMA (2023). Response to Foundation Models Review. Notes that inaccurate AI outputs can harm consumers and that developers and deployers share responsibility for ensuring accurate information reaches users. https://www.gov.uk/government/news/cma-publishes-response-to-foundation-models-review

Frequently asked questions

How often do AI tools get citations wrong?

Controlled studies report fabrication or mismatch rates of 20% to 47% depending on the domain and how the prompt is phrased. A 2023 study found that 20-30% of ChatGPT's medical references were either fabricated or failed to support the specific claim. A 2024 evaluation found rates as high as 47% when models were asked for citations in unfamiliar areas. Treat every AI citation as provisional until you have confirmed it against the source document.

If I use an AI-generated citation that turns out to be wrong, who is responsible?

Your organisation is responsible, not the AI tool. The ICO's guidance on AI and data protection is explicit: relying on AI does not reduce your accountability obligations under UK GDPR. The FCA takes the same position for financial services firms. In professional services, the individual who signs or authorises the document carries responsibility for accuracy, irrespective of how the content was drafted.

What is the quickest way to check whether an AI citation actually supports the claim?

Open the linked source and use your browser's find function (Ctrl+F or Cmd+F) to search for the specific phrase, statistic, or conclusion the AI attributed to it. If the claim does not appear in the document, the citation does not support it, even if the URL is real and the source covers the right general topic. Also check publication date and jurisdiction: a US regulatory source does not confirm a UK legal position.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
