Invented stats, fake quotes, made-up citations: an owner's field guide

TL;DR

AI tools regularly invent statistics with confident precision, attribute quotes to people who never said them, and cite sources that do not exist. For a small business, the real exposure sits in pitch decks, thought-leadership content, and client deliverables. The proportionate defence is a three-minute verification routine on any cited claim before the work leaves the firm, paired with a clear rule that the named author owns every fact, not the AI.

Key takeaways

- Fabrication clusters in three forms: invented statistics with suspicious precision, quotes that were never said, and citations to sources that do not exist.
- The risk for small firms concentrates in three places: pitch decks, thought-leadership content, and client deliverables, where one false claim can cost a deal or a relationship.
- Across six independent studies of AI-generated citations, roughly 51% were entirely fabricated, and only 7% of medical citations were both real and accurately cited.
- The fix is a three-minute routine on any cited claim: source exists, date confirms, attribution holds. Apply it to high-stakes content only, not to every internal draft.
- The team rule that makes this stick: any cited fact in client-facing work is the responsibility of the named author, not the AI. Courts have already confirmed this.

A managing director I spoke with last month had a quiet realisation a fortnight after winning a six-figure pitch. Her business partner had quoted a McKinsey statistic in the deck, a precise one, the kind that lands well on slide three. She had asked, on the train home, where it was from. He could not remember. They went back to the source. The statistic was not in any McKinsey publication. It had been generated by an AI tool a week earlier and dropped into the deck without a check.

Nothing bad happened. The client never asked. But she has been quietly uncomfortable about it ever since, because she knows the next one might land in front of a CFO who does ask.

What does AI actually invent, and in what shapes?

AI tools invent in three reliable shapes: statistics with confident precision, quotes attributed to named people, and citations to sources that do not exist. The reason is structural rather than incidental. Language models generate the next plausible word rather than looking facts up, and the three shapes share a profile of high authority and low traceability, which is why they cluster.

A precise figure carries the look of evidence. A 23.5% growth rate reads as more credible than “significant growth” even when it was generated to fit the slide rather than drawn from real data. A named quote sounds like testimony. A citation in the right shape looks like proof. The model is fluent in all three forms because they appear constantly in its training data, so it generates new versions of them with high confidence regardless of whether the underlying claim is true.

The scale is not trivial. A systematic review across six studies found approximately 51% of AI-generated citations were entirely fabricated. A 2024 analysis of ChatGPT medical references found only 7% were both real and accurately cited, with the remainder either invented outright or attached to real papers that did not make the claim. The Stanford HAI 2026 AI Index shows hallucination rates across 26 leading models in 2025 ranging from 22% to 94% depending on task. The models that fabricate least still fabricate, and they do it on the forms that look most authoritative, which is exactly where the damage lands.

Why does this matter more for a small firm than a large one?

A small firm cannot absorb the reputational cost of a single public error the way an enterprise can. A 200-person business can quietly issue a correction and move on. A 20-person services firm cannot. One discovered fabrication in a pitch, a proposal, or a published article can lose a client, sink a deal, or trigger a complaint that takes a quarter to recover from.

The legal direction of travel reinforces this. In Mata v Avianca, two US lawyers were fined $5,000 for submitting a brief with ChatGPT-invented case citations. A California court later fined two firms a combined $31,000 for the same pattern. The principle in both rulings was identical: the professional who signed the work bears responsibility, not the AI vendor, not the tool, not the assistant who ran the prompt. Professional indemnity insurance typically does not cover claims arising from unverified AI output, because the named author failed in their duty of care. Courts have made clear that AI use does not absolve professionals of accuracy obligations, and small firms have the same duty as large ones with much less margin for getting it wrong.

Where in your business will you actually meet this?

The risk concentrates in three places, in roughly descending order of cost. Pitch decks and funding proposals come first. Thought-leadership content published under your name comes second. Client deliverables and proposals come third. The shared feature of all three is that the work leaves the firm carrying cited claims, and the audience has both the ability and the motivation to check.

Pitch decks are first because an invented market-size figure, a fabricated competitor benchmark, or a regulatory reference that does not exist can lose a deal outright or create liability when the claim is later discovered. Thought-leadership content is second because a single invented quote or non-existent study, once spotted by a sophisticated reader, destroys the authority the article was built on. Client deliverables are third because an analysis citing a regulation that is not real or a financial benchmark that is invented leaves the firm liable for the error even though the AI generated it.

Internal drafts, brainstorming documents, and background research sit in a different category. The fabrication risk is still there, but the cost of being wrong is contained inside the team. The job is to mark the threshold clearly. Anything moving from internal to external should pass a check. Anything that stays internal does not need one until it is promoted into client-facing work, at which point it joins the same routine the rest of the firm follows.

When should you verify, and when can you skip it?

Verify when the cost of being wrong is high and the audience has both the ability and the motive to check. Skip the verification step when the content stays inside the team and never gets quoted externally. The cost is roughly three minutes per cited claim, which is the cheapest insurance policy a small firm can buy on its own credibility.

The three-minute routine has three checks. First, source existence: open the original report or paper and confirm the number, quote, or citation is really there. Second, date confirmation: AI often cites real sources with invented publication years that create a false sense of recency, so confirm the date against the publication itself. Third, attribution accuracy: this is where errors most commonly hide, because the source is real but the claim is misrepresented, qualified differently, or directly contradicted by the original text. The first two checks take thirty to sixty seconds each. The third takes a little longer because it requires reading the relevant section, not just confirming the source exists.

Run the routine on every cited claim in pitch decks, in published thought-leadership under your name or the firm’s, in client deliverables, in regulator submissions, and in anything quoted to the press. Skip it on internal drafts, exploratory notes, and AI-assisted research that shapes your thinking but is not quoted externally. The rule that holds the line long-term is simple and worth writing into the team’s quality standard. Any cited fact in client-facing work is the responsibility of the named author, not the AI. If you do not have time to verify it, you do not have time to send it.

The closest companion pieces sit alongside this one in the evaluating-AI-output cluster. "What is an AI hallucination?" covers the underlying mechanism that produces fabricated statistics, quotes, and citations. "Hallucinations as a business risk" frames the proportionate-controls argument at a firm level rather than at the level of an individual piece of content.

Read those for the wider picture. This post is the field guide for the specific moment when a piece of AI-generated work is about to leave the building with a cited claim in it, and someone has to decide whether the claim has been checked. The answer is always the same: if the work carries the firm's name in front of an audience that might verify it, three minutes of checking is cheaper than any other outcome.

If you want to talk through where the verification threshold should sit in your firm, and how to write it into the team’s quality standard so it sticks, book a conversation.

Sources

- Stanford HAI (2026). 2026 AI Index Report, Responsible AI chapter. Documents hallucination rates across 26 top models in 2025, ranging from 22% to 94%. https://hai.stanford.edu/ai-index/2026-ai-index-report/responsible-ai
- PubMed Central (2024). The use of artificial intelligence in writing scientific review articles. Found 47% of ChatGPT-generated medical citations entirely fabricated, 46% inaccurate, only 7% real and correctly cited. https://pmc.ncbi.nlm.nih.gov/articles/PMC10277170/
- PubMed Central (2026). Systematic review of fabricated citations in generative AI outputs across six studies, approximately 51% fabricated. https://pmc.ncbi.nlm.nih.gov/articles/PMC12826005/
- Thomson Reuters Legal (2025). From "trust but verify" to "do not trust until verified": how the legal profession is redefining AI accountability. The named-author duty to verify AI-generated content. https://legal.thomsonreuters.com/blog/from-trust-but-verify-to-do-not-trust-until-verified-how-the-legal-profession-is-redefining-ai-accountability/
- Spellbook (2024). Lawyer fined for using AI fake legal citations: Mata v Avianca and subsequent cases. Lawyers sanctioned for failure to verify, not for the AI's error. https://www.spellbook.legal/learn/lawyer-fined-using-ai-legal-fake-citations
- Science (2024). AI hallucinates because it is trained on fake answers it does not know. Plain-English account of the prediction mechanism behind invented statistics, quotes, and citations. https://www.science.org/content/article/ai-hallucinates-because-it-s-trained-fake-answers-it-doesn-t-know
- Harvard Kennedy School Misinformation Review (2024). New sources of inaccuracy: a conceptual framework for studying AI hallucinations. Why fabrication clusters around high-authority, low-traceability forms. https://misinforeview.hks.harvard.edu/article/new-sources-of-inaccuracy-a-conceptual-framework-for-studying-ai-hallucinations/
- Alera Group (2024). AI is not liable for the mistake, you are. Professional liability exposure for SMEs relying on unverified AI output. https://aleragroup.com/insights/ai-isnt-liable-mistake-you-are
- Data Journalism Handbook (2024). Creating a verification process and checklists. The source-date-attribution model adapted for AI-assisted content. https://datajournalism.com/read/handbook/verification-1/creating-a-verification-process-and-checklists/9-creating-a-verification-process-and-checklists
- Information Commissioner's Office (2024). Guidance on AI and data protection: accuracy and statistical accuracy. UK GDPR's accuracy principle applies to AI-generated content about people. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/what-do-we-need-to-know-about-accuracy-and-statistical-accuracy/

Frequently asked questions

How often do AI tools actually invent citations and statistics?

Often enough that you cannot trust an output that contains them without checking. A 2024 analysis of ChatGPT-generated medical content found 47% of references were entirely fabricated, 46% were real papers cited for claims they did not actually make, and only 7% were both real and accurately cited. Hallucination rates for specialised legal research tools run between 17% and 33% on real citation tasks. The rate is not zero on any frontier model, and it is higher than most owners assume.

Do I need to verify every piece of AI-generated content I produce?

No, and trying to will burn out the team. Apply the three-minute routine to content that carries real consequence: pitch decks, client deliverables, thought-leadership pieces published under your name, and anything sent to a regulator. Internal drafts, brainstorming notes, and background research can stay unverified at point of creation, as long as they are checked before being promoted into client-facing work.

What about specialised tools that claim to be hallucination-free?

Treat the claim with scepticism. Stanford Law audited LexisNexis and Thomson Reuters, both marketed as hallucination-free for legal research, and found both hallucinated more than 17% of the time on real legal-citation tasks. If a vendor uses the phrase, ask for the measured rate on your use case, the methodology behind that number, and what recourse you have when it fails. A vendor who cannot answer is selling marketing.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30-minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
