What is model drift? Why it matters for your business

TL;DR

Model drift is the slow degradation of an AI system's accuracy or behaviour over time. The classical version is a model going stale on new data. The 2026 version, which almost no SME monitors, is foundation-model version drift, where your vendor releases a new model underneath your prompt and your tool starts behaving differently without anyone on your side changing a thing.

Key takeaways

- Model drift comes in three classical flavours: concept drift (the world changed), data drift (the inputs look different), and performance decay (accuracy just dropped).
- The 2026 flavour worth tracking is foundation-model version drift, where a vendor upgrade behind your prompt changes how your AI tool behaves.
- Drift is not hallucination. RAG and grounding fix hallucination. Monitoring, version-pinning, and retraining fix drift.
- The fix depends on which kind you have: fine-tuned drift versus foundation-version drift. If you trained the model, retrain it. If your vendor swapped it under you, pin the version, test before upgrade, and revisit your prompt.
- You do not need an MLOps platform. A monthly 30-minute spot-check, a small golden-dataset test, and a pinned model version cover the typical owner-led firm.

A small marketing agency owner showed me a Claude-based proposal-writing tool she had built in late 2024. It had worked beautifully for a year. By spring 2026 the proposals were coming out longer, blander, and oddly formal. She was convinced one of the team must have edited the prompt. Nobody had. The model under the hood had been upgraded twice while she was busy running the firm.

That is the version of model drift almost nobody is monitoring. The textbook version, a credit-scoring model going stale on post-pandemic data, is real and well documented. The version that catches owner-led businesses in 2026 is quieter, and few firms have a process for spotting it.

What is model drift?

Model drift is the slow, often invisible degradation of an AI system’s accuracy or behaviour over time. It has three classical flavours plus a fourth that is specific to the way owner-led firms buy AI in 2026. Concept drift is when the world changes and the model’s learned patterns no longer hold. Data drift is when the inputs look different. Performance decay is when accuracy simply drops.

A short non-financial example: a restaurant reservation system learned from 2019 to 2022 data, when fine dining was booked solid. In 2023 the restaurant opens a patio. The system still accepts reservations, but it predicts capacity using rules that no longer fit. That is concept drift. In 2024 the booking app goes international and starts taking requests in different time zones with different party sizes, which is data drift. By 2025 customers turn up to find no table, which is performance decay.

The fourth flavour is the one almost nobody is watching. Foundation-model version drift is when your AI vendor releases a new version of the model your tool sits on top of. Your prompt has not changed. Your data has not changed. The model has, and your tool now behaves differently. Anthropic moved from Claude 3 to 3.5 to 4 to 4.7 across 2024 to 2026. OpenAI moved from GPT-4 to 4o to GPT-5. If you built a tool in 2024 and have not pinned the version, it is running on something different from what you tested.
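If your tool calls the API directly, pinning is one line. Below is a minimal sketch, assuming the Anthropic Python SDK; the dated model string is illustrative, so check your vendor's current model list rather than copying it.

```python
# Minimal sketch of version-pinning with the Anthropic Python SDK.
# The dated model string is illustrative; check your vendor's model
# list for the snapshots currently on offer.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    # A dated snapshot stays fixed until it is formally deprecated.
    # A floating alias such as "claude-3-5-sonnet-latest" upgrades
    # silently underneath you, which is exactly the drift described above.
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Draft a one-paragraph proposal summary."}],
)
print(response.content[0].text)
```

The pinned version will eventually be deprecated, at which point you migrate deliberately, on a day you chose, after testing. That is the whole point.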

Why does it matter for your business?

Drift matters because the failure mode is silent. The tool still answers, the proposals still come out, nobody gets an error message. What changes is the quality, the tone, or the accuracy of the decision underneath, and you only notice when something goes wrong further downstream. For a regulated firm that is a compliance gap. For a services firm it is client complaints, or a quiet drop in conversion two quarters later.

The financial precedents are easy to find. Zillow wrote off around US$304 million in inventory in late 2021 when its iBuyer pricing model failed to keep up with post-pandemic property markets, and cut a quarter of its workforce. Ofqual’s A-level algorithm collapsed in August 2020 when it could not handle a cohort that did not match historical patterns, and was withdrawn within days. Both are extreme cases, and neither is a 10-person agency. The lesson is the same. Models that go unmonitored eventually meet a reality they were not built for.

The regulatory picture is worth holding gently. The PRA's Supervisory Statement SS1/23 sets binding model risk expectations for banks, including ongoing performance monitoring. The ICO's accuracy principle under UK GDPR Article 5(1)(d) applies to anyone using personal data in automated decisions, and the ICO has flagged drift specifically as a concern. The EU AI Act's accuracy and quality-management requirements, applicable from 2 August 2026 for high-risk systems, reach UK firms that sell into the EU. For a 10-person services firm with no regulated decisions, none of these is a binding duty, but they have become the baseline that auditors and insurers now ask about.

Where will you actually meet it?

Owner-led businesses meet drift in three places, and only one of them looks like the textbook. The first is a vendor deprecation notice. OpenAI and Anthropic both publish model sunset schedules. When a model your tool depends on is retired, you are forced onto a newer version, and the tool’s behaviour can shift. That forced migration is the new normal of the API economy, and your monitoring needs to expect it.
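Where your tool calls a vendor API directly, one cheap habit catches both a forced migration and a silently resolved alias: record which model actually answered each call. A minimal sketch follows, again assuming the Anthropic Python SDK; the log path, the prompt, and the expected model string are placeholders.

```python
# Minimal sketch: log which model actually served each call, so a
# forced migration or a silently resolved alias shows up in your records.
# The log path and expected model string below are placeholders.
import datetime
import json
import anthropic

EXPECTED_MODEL = "claude-3-5-sonnet-20241022"  # the version you signed off
client = anthropic.Anthropic()

response = client.messages.create(
    model=EXPECTED_MODEL,
    max_tokens=512,
    messages=[{"role": "user", "content": "Summarise this brief in three bullets."}],
)

served = response.model  # the model that actually answered
if served != EXPECTED_MODEL:
    # In practice, send an email or Slack alert; printing keeps the sketch simple.
    print(f"WARNING: expected {EXPECTED_MODEL}, got {served}")

with open("model_log.jsonl", "a") as f:
    f.write(json.dumps({
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "model": served,
    }) + "\n")
```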

The second is a silent SaaS upgrade. The customer service platform you bought in 2023 swaps its underlying model on a Tuesday morning. Nobody told you. Your team starts noticing that the AI agent’s escalation logic feels different, or that summaries are formatted in a new way. The vendor counts this as a routine upgrade. From your operational seat it is a behaviour change you did not authorise, and the symptoms look exactly like drift.

The third is the gut-feel staff complaint. “The system feels different this month.” It is the easiest signal to dismiss and often the most useful one. The team using the tool every day notices a pattern shift before any dashboard does. The firms that catch drift early treat that complaint as a structured signal worth investigating, not a moan to manage. The firms that miss it do not.

When to act, and when to ignore

The action depends on which flavour of drift you have. Fine-tuned drift, where you trained the model on your own data, is fixed by retraining on fresher data. You own the model, you own the fix. Foundation-version drift, where your vendor swapped the model underneath you, calls for different levers: pin the version where the API allows it, test the new version before letting it through, and revisit the prompt.

Ignore the deeper machinery. Population Stability Index, Kolmogorov-Smirnov tests, and Jensen-Shannon divergence are real metrics, used in regulated credit risk and insurance modelling. For a 10-person agency or a 30-person professional services firm, they are overkill. The proportionate response is an inventory of your AI tools, a risk-rank by what each one decides, a pinned model version where you can set one, a small golden-dataset test you can re-run before any vendor upgrade, and a 30-minute monthly spot-check of recent outputs. No MLOps platform required.
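For the golden-dataset test, here is a minimal sketch of what "re-run before any vendor upgrade" can mean in practice. The prompts and the call_tool function are placeholders for however your own tool is invoked, and the substring checks are deliberately crude; at this scale, crude is the point.

```python
# Minimal golden-dataset gate to run before accepting a vendor upgrade.
# GOLDEN_SET and call_tool are placeholders: swap in prompts you know
# well and however your own tool is actually invoked.

GOLDEN_SET = [
    # (input you know the right answer to, phrase the output must contain)
    ("What services do we offer?", "brand strategy"),
    ("Quote our standard day rate.", "per day"),
]

def call_tool(prompt: str) -> str:
    # Placeholder: replace with your tool's API call or an exported output.
    return "We offer brand strategy and design at a fixed rate per day."

def run_gate() -> bool:
    failures = [
        (prompt, phrase)
        for prompt, phrase in GOLDEN_SET
        if phrase.lower() not in call_tool(prompt).lower()
    ]
    for prompt, phrase in failures:
        print(f"FAIL: output for {prompt!r} no longer mentions {phrase!r}")
    return not failures

if __name__ == "__main__":
    print("Upgrade looks safe" if run_gate() else "Hold the upgrade and review")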

The line worth drawing is between systems that drive a regulated or material decision and systems that do not. A lending or pricing model carrying a six-figure decision warrants formal monitoring, documented thresholds, and a named owner. A proposal-writing assistant warrants the monthly spot-check and a pinned version, no more. Match the governance to the consequence, then spend your effort where it earns its keep.

Key terms

Concept drift is the textbook flavour where the relationship between inputs and outputs has changed. The world has moved, your model has not, and the patterns it learned no longer fit. It is the version that broke credit-scoring models after the 2020 to 2022 macroeconomic shifts.

Data drift is the version where the inputs look different even if the rules have not. New customer demographics, new product lines, or a sloppier CRM data feed are all common causes. The model is still doing the right thing, it is just being shown a population it was not trained for.

Foundation-model version drift is the 2026 flavour. Your vendor releases a new model version, your prompt or fine-tuned wrapper sits on top, and behaviour shifts without anyone changing your code. It is distinct from concept and data drift because the model itself has changed, not the world or the data.

Hallucination is a different failure mode. A hallucinating tool invents content not grounded in your data. A drifting tool produces content that has changed in quality or behaviour over time. Retrieval-augmented generation reduces hallucination. It does not address drift.

Fine-tuning and foundation models are the two surfaces drift sits on. If you fine-tuned the model, retraining is the lever. If you sit on a foundation model you did not commission, version-pinning, testing, and prompt revision are the levers.

The honest test of any AI tool you have been running for more than six months is the regression check. Take a small set of inputs you know the right answer to, run them through the current version, and compare to the answers it gave when you signed it off. If the answers have shifted in ways you cannot explain, you have drift.
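A minimal sketch of that regression check is below, assuming you saved the signed-off answers in a JSON file at the time. Python's difflib gives a rough textual similarity score, not a semantic one, but it is enough to flag the outputs worth reading by eye.

```python
# Minimal sketch of the regression check described above: compare what
# the tool says today with the answers you archived at sign-off.
# Assumes a baseline.json mapping each test input to its approved output.
import difflib
import json

def similarity(a: str, b: str) -> float:
    # Rough character-level similarity between two outputs, 0.0 to 1.0.
    return difflib.SequenceMatcher(None, a, b).ratio()

def check_drift(baseline_path: str, current_outputs: dict[str, str],
                threshold: float = 0.8) -> list[str]:
    # Anything scoring below the threshold gets flagged for human review.
    with open(baseline_path) as f:
        baseline = json.load(f)
    drifted = []
    for prompt, approved in baseline.items():
        score = similarity(approved, current_outputs.get(prompt, ""))
        if score < threshold:
            print(f"{score:.2f}  drifted: {prompt!r}")
            drifted.append(prompt)
    return drifted
```

The threshold is a judgment call, not a standard; start loose, and tighten it once you have seen a few months of normal variation.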

Sources

Bank of England (2023). Supervisory Statement SS1/23, Model Risk Management Principles for Banks. Effective 17 May 2024, sets the model performance monitoring expectation referenced in the post. https://www.bankofengland.co.uk/prudential-regulation/publication/2023/may/model-risk-management

Information Commissioner's Office (2024). Guidance on AI and data protection. Confirms the UK GDPR Article 5(1)(d) accuracy principle applies to model drift. https://ico.org.uk/for-organisations/uk-gdpr/guidance-index/ai-and-data-protection/

Financial Conduct Authority and Bank of England (2024). Artificial intelligence in financial services, joint survey of 85 firms. Source for the 64 per cent monitoring and 18 per cent retraining-policy stats. https://www.fca.org.uk/news/news-stories/fca-bank-of-england-ai-survey-2024

European Union (2024). Regulation (EU) 2024/1689, the EU AI Act, Articles 15 and 17 on accuracy, robustness, and quality management. https://eur-lex.europa.eu/eli/reg/2024/1689/oj

Zillow Group (2021). iBuyer wind-down disclosures, US$304m inventory write-down, November 2021. https://www.zillow.com/research/zillow-ibuyer-writedown-2021/

Anthropic (2026). Claude model release notes. Reference for the Claude 3 to 4.7 release cadence cited in the foundation-model version drift section. https://www.anthropic.com/news

OpenAI (2026). API model deprecation schedule. Reference for vendor sunset notices cited in the "where you will meet it" section. https://platform.openai.com/docs/deprecations

SAS Institute. Population Stability Index methodology and threshold guidance, the formal metric named in passing for readers who want to know one exists. https://www.sas.com/content/dam/SASWeb/en_us/doc/whitepaper/psi-stability-index.pdf

UK Government (2020). Ofqual A-level and GCSE algorithm withdrawal, August 2020. Named UK incident anchor for algorithmic decision-making failure. https://www.gov.uk/government/organisations/ofqual/about

Frequently asked questions

How is model drift different from hallucination?

They are different failure modes with different fixes. Hallucination is when an AI tool invents content that is not grounded in your data. Drift is when a tool that used to work accurately starts working less well, either because the world has shifted or because the underlying model has been updated. Retrieval-augmented generation reduces hallucination. Drift needs monitoring, version-pinning, and retraining.

Do I need a formal MLOps platform to manage drift?

For the typical owner-led business, no. A 10-person services firm with a few prompt-engineered tools does not need Evidently AI or Fiddler. A pinned model version, a small golden-dataset test you can re-run, and a 30-minute monthly spot-check of outputs covers the practical risk. Reserve formal monitoring tooling for systems that drive regulated decisions or material financial outcomes.

Does PRA SS1/23 apply to my business?

Only if you are a bank or large building society regulated by the PRA. SS1/23 is binding on those firms. For everyone else it is reference material, not duty. The cross-sector hook is the ICO's accuracy principle under UK GDPR Article 5(1)(d), which applies to anyone using personal data in automated decisions. If you sell into the EU, the EU AI Act adds further requirements for high-risk systems.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
