How AI handles stale documents and version drift

A firm sets up Microsoft 365 Copilot and, over the following weeks, staff begin using it for routine questions. The first sign that something is wrong comes from a client. The pricing quoted in an email does not match the current schedule. Copilot had pulled its answer from a proposal sitting in SharePoint, one that predated a fee increase made six months earlier. The file was genuine. It was simply no longer the current version.

This is not a hallucination. The document existed and the words were real. The problem has a different name, a different cause, and a different fix.

What actually happens when AI reads an old document?

Version drift is when an AI system retrieves a genuine document that is no longer the current, authoritative version. In a retrieval-augmented generation system, the AI searches your file store to answer a question, then generates a response from what it finds. The system has no built-in sense of which version takes precedence unless you build that rule into the retrieval layer.
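The retrieval step can be sketched in a few lines of Python. This is a toy illustration, not how Microsoft 365 Copilot or any other product ranks files: the `Doc` class, the word-overlap score, the file paths and the figures are all invented for the example. The point it shows is that the file's date is available but plays no part in the ranking.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    path: str       # hypothetical file location
    text: str       # hypothetical file content
    modified: date  # stored, but never consulted by the retriever below


def overlap_score(query: str, doc: Doc) -> int:
    """Count the distinct query words that appear in the document.

    A crude stand-in for the similarity score a real retriever would use.
    """
    doc_words = set(doc.text.lower().split())
    return sum(1 for word in set(query.lower().split()) if word in doc_words)


def naive_retrieve(query: str, docs: list[Doc]) -> Doc:
    """Return the best-scoring document. Dates and status are ignored."""
    return max(docs, key=lambda d: overlap_score(query, d))


docs = [
    Doc("proposals/2023-proposal.docx",
        "our standard day rate is 800 per day for advisory work",
        date(2023, 3, 1)),
    Doc("pricing/current-schedule.docx",
        "fee schedule advisory 950",
        date(2024, 9, 1)),
]

hit = naive_retrieve("what is our standard day rate for advisory work", docs)
print(hit.path)  # proposals/2023-proposal.docx
```

The older proposal wins because its wording happens to match the question more closely. Nothing in the ranking asks which file is current.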

This is different from hallucination, where the model invents content with no source. Version drift produces an answer grounded in a real document; the document is simply superseded. A related failure is contradiction. If your file store holds two versions of the same policy or process template, the AI may blend them or choose one based on retrieval ranking. Ask the same question twice in slightly different words and you may get different answers, each referencing a real file.

Prompt engineering does not solve this. A more precise question changes how the model constructs its answer but does not force the retrieval system to prefer the newer file when both are indexed. Post-generation fact-checking has similar limits. A checker can confirm that the cited document exists, not that it is the current version. A superseded but genuine policy can pass a basic accuracy check while producing an operationally wrong answer.
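The limit of a basic fact-check can be shown with a small sketch. It assumes a hypothetical document store in which each file carries a status label. The function names, paths and labels are invented for illustration and do not describe any particular checking tool.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    path: str
    status: str  # hypothetical labels: "approved" or "superseded"


def citation_exists(cited_path: str, store: dict[str, Policy]) -> bool:
    """Basic accuracy check: is the cited file really in the store?"""
    return cited_path in store


def citation_is_current(cited_path: str, store: dict[str, Policy]) -> bool:
    """Stricter check: the file exists and is the approved version."""
    policy = store.get(cited_path)
    return policy is not None and policy.status == "approved"


store = {
    "policies/complaints-v1.docx": Policy("policies/complaints-v1.docx", "superseded"),
    "policies/complaints-v2.docx": Policy("policies/complaints-v2.docx", "approved"),
}

cited = "policies/complaints-v1.docx"
print(citation_exists(cited, store))      # True
print(citation_is_current(cited, store))  # False
```

The stricter check is only possible because the store records a status. Without that label, a checker has nothing to compare against.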

Why does version drift matter for your business?

Small service firms typically hold years of documents in shared drives, email threads, and copied folders, with old proposals, superseded contracts and previous process templates sitting alongside current ones. An AI system ingesting that environment has no automatic preference for the newer file. Poor document hygiene directly determines how often the AI retrieves the wrong version and returns it as confident, authoritative fact.

UK GDPR requires that personal data be accurate and, where necessary, kept up to date. The Information Commissioner’s Office makes clear this is a legal obligation, not a discretionary hygiene standard. When an AI system is summarising client records, case notes or staff files, version drift converts a document governance failure into a data protection failure. That applies to any service firm handling personal data, not only those in regulated sectors.

For regulated firms, the exposure goes further. The FCA’s research on AI in UK financial services identifies data quality and governance as the core controls for firms deploying AI in advice, compliance or client communications. A firm whose AI assistant draws on superseded compliance templates when handling a complaint or a know-your-customer query has a governance failure, not just a bad output.

The Competition and Markets Authority has also noted that AI-enabled tools must not mislead consumers. If a client-facing system draws on stale internal documents to generate pricing or service scope information, the consequences extend beyond a remedial email.

Where will you actually meet this in your business?

Version drift surfaces wherever AI reads from your internal document store rather than a clean, controlled source. Tools like Microsoft 365 Copilot and Google Workspace Gemini sit directly on top of your files, emails and chats, so document discipline matters as much as model quality. Staff questions about current policies, pricing or processes are where the exposure is sharpest.

The highest-risk situations share three features. First, your firm holds the same document in multiple locations: one version in SharePoint, a copy forwarded by email, a third saved to a local or shared drive. Second, files are never formally archived or labelled as superseded, so every version looks active to the retrieval system. Third, staff are using AI to answer questions they would previously have asked a senior colleague, who would have known intuitively which version applied.
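A rough way to spot the first feature is to group file names that look like the same document. The sketch below is a heuristic only. The paths are invented, and the rule for stripping "copy", "final", version numbers and "(1)" suffixes is an assumption that would need adjusting to your own naming habits.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath


def normalise(path: str) -> str:
    """Reduce a path to a rough document name.

    Drops the folders and extension, then strips trailing markers such as
    " (1)", "-copy", "-final" or " v2", and tidies the separators.
    """
    stem = PurePosixPath(path).stem.lower()
    stem = re.sub(r"(\s*\(\d+\)|[\s_-]*(copy|final|v\d+))+$", "", stem)
    return re.sub(r"[\s_-]+", " ", stem).strip()


def find_duplicates(paths: list[str]) -> dict[str, list[str]]:
    """Group paths by normalised name and keep names found more than once."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path in paths:
        groups[normalise(path)].append(path)
    return {name: found for name, found in groups.items() if len(found) > 1}


# Hypothetical file listing from three locations.
paths = [
    "sharepoint/policies/Complaints Procedure v2.docx",
    "email-attachments/Complaints_Procedure (1).docx",
    "shared-drive/old/complaints-procedure-final-copy.docx",
    "sharepoint/pricing/Pricing Schedule.docx",
]

for name, found in find_duplicates(paths).items():
    print(name, "->", len(found), "copies")
# complaints procedure -> 3 copies
```

A list like this does not say which copy is current. It only shows where someone needs to decide.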

The problem scales with the sensitivity of the use case. A team using AI to draft first-pass emails or summarise meeting notes faces lower exposure than one using it to answer questions about client obligations, complaint procedures or pricing. The NCSC’s guidance on safe AI adoption frames the management of document inputs as part of the broader trust and security problem around AI deployment, not simply a productivity concern.

When does this apply, and when is it not your problem?

If your team uses AI only for drafting and brainstorming without a connected document store, version drift does not apply. The risk is live the moment you connect AI to a knowledge base, SharePoint, Google Drive or a similar system. It scales with how much staff rely on the answers for operational decisions, client communications or compliance work.

Three situations keep the risk low. The first is that your AI use is confined to generic text generation with no internal document retrieval. The second is that your firm already maintains one controlled, versioned document store with clear archive rules. The third is that AI outputs are never used for client commitments, compliance decisions or operational procedures. Each of these reduces the version problem to a background consideration rather than an active failure mode.

The EU AI Act, adopted in 2024, treats data quality and governance as required controls in higher-risk AI deployments. Even before any direct regulatory requirement applies to a small UK firm, the underlying logic is the same. What you give the system determines what comes out. A firm that has already built document lifecycle practices, with status fields, review dates and archive processes, may find version drift is a residual rather than an active risk. The gap, for small service firms, is usually that these processes exist informally in someone’s head rather than in the file system itself.

What can you do about it from Monday?

The cheapest intervention for any owner-operated firm is to fix the document layer before adding any AI on top. Give each policy or template one home with a visible status label such as approved, draft or superseded, and move obsolete files out of the searchable index. Industry estimates put up to 80 per cent of AI project time in data preparation rather than model work, and the same emphasis applies here: most of the effort goes into the documents, not the model.
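The "one home, one status" rule can be checked against a simple document register. The sketch below assumes a hypothetical register, for example one exported from a spreadsheet, with a title, a path and one of the three status labels mentioned above. The entries are invented.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Entry:
    title: str
    path: str
    status: str  # "approved", "draft" or "superseded"


def audit(register: list[Entry]) -> tuple[list[str], list[str]]:
    """Return (titles without exactly one approved file, paths to archive)."""
    approved = Counter(e.title for e in register if e.status == "approved")
    titles = {e.title for e in register}
    problems = sorted(t for t in titles if approved[t] != 1)
    to_archive = [e.path for e in register if e.status == "superseded"]
    return problems, to_archive


# Hypothetical register entries.
register = [
    Entry("Fee schedule", "pricing/fee-schedule-2024.docx", "approved"),
    Entry("Fee schedule", "pricing/fee-schedule-2023.docx", "superseded"),
    Entry("Complaints procedure", "policies/complaints-a.docx", "approved"),
    Entry("Complaints procedure", "policies/complaints-b.docx", "approved"),
    Entry("Onboarding checklist", "templates/onboarding.docx", "draft"),
]

problems, to_archive = audit(register)
print(problems)    # ['Complaints procedure', 'Onboarding checklist']
print(to_archive)  # ['pricing/fee-schedule-2023.docx']
```

The first list shows titles with two approved versions or none. The second shows files to move out of whatever the AI tool is allowed to search.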

Version metadata is the next layer. Modern document stores allow you to tag files with creation dates, review dates and document status. When an AI system retrieves from that store, it can be configured to prefer current-approved files and exclude superseded ones from retrieval. This is not a default setting in all workplace AI tools, so it is worth asking your vendor or IT provider whether it is active.
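In outline, a metadata-aware retrieval rule looks like the sketch below. It is a toy model under stated assumptions: the `status` and `reviewed` fields, the word-overlap score and the sample files are invented, and real products expose this kind of filtering through their own settings rather than through code like this.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Doc:
    path: str
    text: str
    status: str     # hypothetical labels: "approved", "draft", "superseded"
    reviewed: date  # hypothetical last review date


def score(query: str, doc: Doc) -> int:
    """Count distinct query words found in the document (toy similarity)."""
    doc_words = set(doc.text.lower().split())
    return sum(1 for word in set(query.lower().split()) if word in doc_words)


def retrieve_current(query: str, docs: list[Doc]) -> Optional[Doc]:
    """Search approved files only; break score ties by latest review date."""
    candidates = [d for d in docs if d.status == "approved"]
    if not candidates:
        return None  # better to return nothing than a superseded file
    return max(candidates, key=lambda d: (score(query, d), d.reviewed))


docs = [
    Doc("proposals/2023-proposal.docx",
        "our standard day rate is 800 per day for advisory work",
        "superseded", date(2023, 3, 1)),
    Doc("pricing/current-schedule.docx",
        "fee schedule advisory day rate 950",
        "approved", date(2024, 9, 1)),
]

best = retrieve_current("what is our standard day rate for advisory work", docs)
print(best.path)  # pricing/current-schedule.docx
```

The superseded proposal still matches the question more closely, but it is never considered. The filter runs before the ranking, which is why the status label has to exist in the store first.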

Testing for contradiction costs nothing. Ask the AI the same policy question in three different ways and compare the answers. Inconsistent responses are the first visible sign that the knowledge base holds competing versions of the same document.
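The same test can be scripted if you have programmatic access to the assistant. In the sketch below, `ask` is a placeholder for however your tool is queried, and `fake_assistant` is a stand-in with invented answers so the example runs on its own. Comparing only the figures in each answer is a simplification that suits pricing questions and little else.

```python
import re
from typing import Callable


def figures(answer: str) -> frozenset[str]:
    """Extract the numbers quoted in an answer, ignoring thousands commas."""
    return frozenset(re.findall(r"\d+(?:\.\d+)?", answer.replace(",", "")))


def contradiction_check(
    ask: Callable[[str], str], phrasings: list[str]
) -> tuple[bool, dict[str, str]]:
    """Ask every phrasing and report whether the quoted figures differ."""
    answers = {p: ask(p) for p in phrasings}
    distinct = {figures(a) for a in answers.values()}
    return len(distinct) > 1, answers


def fake_assistant(question: str) -> str:
    """Stand-in for a real assistant; the answers are invented."""
    if "day rate" in question.lower():
        return "The standard day rate is 800."
    return "Advisory work is charged at 950 per day."


phrasings = [
    "What is our day rate for advisory work?",
    "How much do we charge per day for advisory?",
    "What do clients pay for a day of advisory?",
]

inconsistent, answers = contradiction_check(fake_assistant, phrasings)
print(inconsistent)  # True
```

A `True` result does not say which answer is right. It says the knowledge base probably holds more than one version and needs a human to look.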

For client-facing or regulated outputs, keep humans in the review loop until your document governance is confirmed clean. The NCSC frames the management of data inputs as an organisational responsibility, not a technical one. The fix for version drift sits with whoever owns the document lifecycle in your firm, not with whoever chose the model.

