The operations manager comes back for the third time this quarter. The AI tool, the one you spent six months specifying and two months rolling out, is “not working again”. She wants to know whether you should switch vendors. You have a meeting booked with a competing supplier, a list of features that look better than the current one, and a quiet feeling that this is the same conversation you had eighteen months ago about the previous tool.
The instinct is to blame the technology. It is almost always the wrong call. The published failure data on AI projects points at the data, not the model.
What does the research actually show about why AI projects fail?
The studies that try to count failures converge on data, not algorithms, as the cause. RAND’s analysis, drawn from interviews with sixty-five experienced data scientists, puts the headline at roughly 85 per cent of AI projects failing because of poor data quality or a lack of relevant data. Gartner’s Q3 2024 survey of 248 data management leaders predicts that 60 per cent of AI projects built on non-AI-ready data will be abandoned through 2026.
Informatica’s 2025 survey of 600 Chief Data Officers found 67 per cent unable to move even half their generative AI pilots into production. The economic impact follows the same shape. Gartner puts the average annual cost of poor data quality at $12.9 million per organisation. Scaled down to a twenty-person services business, that is roughly a quarter of a million pounds a year in operational waste, and the waste was there before any AI tool arrived. The tool simply made it visible.
What three questions tell you whether it is a tool problem or a data problem?
Three questions, fifteen minutes, no vendor meeting required. They will not fix the underlying issue, but they will stop you spending six months solving the wrong one. The questions work because they force the failure back from the visible output into the inputs the tool was actually given. By the time you have answered all three, you usually know what kind of problem you have.
First, when the tool rejects a record, can you verify the rejected record manually? Pull a sample of five or ten rejections. Open the source data. If the records genuinely violate a business rule, the tool is doing its job. If they look fine to a human reader, the tool is applying a rule that your data satisfies in one part of your system and breaks in another. That is a schema problem, and a different vendor will catch the same records.
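If you want to make that sampling step repeatable, here is a minimal sketch in Python. It assumes the tool exports rejections as a CSV and that the business rule is something as simple as a malformed email address; the file name, column names, and the rule itself are all stand-ins for whatever your tool actually enforces.

```python
import csv
import random

REJECTIONS_FILE = "rejections.csv"  # stand-in: use your tool's rejection export

def violates_rule(record: dict) -> bool:
    # Stand-in rule: an email must contain exactly one "@".
    # Replace with the rule the tool claims the record broke.
    return record.get("email", "").count("@") != 1

with open(REJECTIONS_FILE, newline="") as f:
    rejected = list(csv.DictReader(f))

# Five to ten rejections at random, as suggested above.
for record in random.sample(rejected, k=min(10, len(rejected))):
    verdict = "genuine violation" if violates_rule(record) else "looks fine to a human"
    print(record.get("id", "?"), "->", verdict)
```

If most of the sample prints "looks fine to a human", the inconsistency lives in your data, not in the tool.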
Second, can you trace the tool’s output back to the data it was trained on? If you cannot inspect what the model saw, you have no way to know whether the model is broken or the training data was. Pull the historical records. Are company names merged, titles out of date, the same contacts appearing twice under different IDs? A model trained on that history learned the corruption. A new tool, given the same history, will learn the same thing.
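A rough way to check the history for that kind of corruption, assuming a CSV export with id, name, and email columns (hypothetical names; substitute your own):

```python
import csv
from collections import defaultdict

def norm(text: str) -> str:
    return " ".join(text.lower().split())

# Group contacts by normalised name and email; any group holding
# more than one ID is the same person under different records.
ids_by_key = defaultdict(set)
with open("crm_history.csv", newline="") as f:  # hypothetical export
    for row in csv.DictReader(f):
        ids_by_key[(norm(row["name"]), norm(row["email"]))].add(row["id"])

for (name, email), ids in ids_by_key.items():
    if len(ids) > 1:
        print(f"{name} <{email}> appears under IDs {sorted(ids)}")
```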
Third, is the failure in the tool itself or in the integration layer between the tool and your systems? The tool may produce correct output that does not map cleanly to your ERP or CRM. Field names disagree. One side expects CSV, the other JSON. Replacing the tool will not fix the mismatch.
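The integration check is often no more than comparing two field lists. A sketch, with both schemas as stand-ins for your own:

```python
import json

# Stand-ins: one record as your AI tool emits it, and the field
# names your CRM import actually expects.
tool_record = json.loads('{"CompanyName": "Acme Ltd", "contact_email": "a@acme.com"}')
crm_fields = {"company_name", "email", "owner_id"}

emitted = set(tool_record)
print("expected by the CRM, never emitted by the tool:", crm_fields - emitted)
print("emitted by the tool, silently dropped by the CRM:", emitted - crm_fields)
```

Anything printed on either line is an integration fault, and replacing the tool just moves the mismatch.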
Where will you actually meet this in practice?
When AI tools break in small and mid-sized businesses, four data patterns account for the majority of failures. Each one produces a symptom that looks like a model fault and almost always traces back upstream. The four are stale records, duplicate identifiers, undefined terms, and missing required fields. They turn up in roughly that order of frequency in the engagements I see.
The first is stale source records. Apollo’s research on B2B contact data finds decay running at 30 to 40 per cent per year in normal cases and as high as 70 per cent in fast-moving sectors. Your lead-scoring tool faithfully scores a contact who moved on from the company three quarters ago. That looks like a hallucination; it is the tool reading what your CRM still says is true. The same Apollo work shows mid-market sales reps spending 28 per cent of their week on non-selling activities, much of it data verification.
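A staleness pass over the CRM takes a few lines. This sketch assumes each contact carries a last_activity date (a hypothetical field name) and flags anything untouched for roughly three quarters:

```python
from datetime import datetime, timedelta

# Hypothetical records; in practice, read these from your CRM export.
contacts = [
    {"name": "J. Patel", "last_activity": "2023-01-15"},
    {"name": "M. Okoye", "last_activity": "2025-06-02"},
]

cutoff = datetime.now() - timedelta(days=270)  # about three quarters
for c in contacts:
    if datetime.strptime(c["last_activity"], "%Y-%m-%d") < cutoff:
        print(c["name"], "is likely stale: verify before the tool scores it")
```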
The second is duplicate identifiers. A supplier entered three times under slightly different spellings. A contact appearing once in the CRM and once in the marketing platform with no link between them. The tool sees two records and treats them as separate entities. The Informatica survey cited earlier names data quality and completeness as the primary reason 67 per cent of organisations cannot get their generative AI pilots into production.
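Near-duplicates under slightly different spellings are easy to surface with nothing more than the standard library. A sketch, with a hypothetical supplier list and a similarity threshold you would tune on your own data:

```python
from difflib import SequenceMatcher
from itertools import combinations

suppliers = ["Smith & Sons Ltd", "Smith and Sons Limited", "Jones Plumbing"]

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Pairs above the threshold are candidates for the "entered three
# times under slightly different spellings" problem.
for a, b in combinations(suppliers, 2):
    score = similarity(a, b)
    if score > 0.75:
        print(f"possible duplicate: {a!r} vs {b!r} (similarity {score:.2f})")
```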
The third is undefined terms. Revenue in your sales system means booked orders. Revenue in your ERP means shipped and invoiced. Revenue in your financial planning tool means recognised revenue under ASC 606. When the AI tries to predict revenue, it is averaging three different definitions of the same word. Alation’s research on semantic consistency frames this as 70 per cent organisational and 30 per cent technical. A different vendor cannot invent a definition you have not agreed.
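You can make the disagreement visible before any modelling. A sketch with made-up figures, one per definition of the word:

```python
# Made-up monthly "revenue" figures from three systems that each
# define the word differently, as described above.
revenue = {
    "crm_booked": 118_000,     # booked orders
    "erp_invoiced": 96_500,    # shipped and invoiced
    "fpa_recognised": 88_200,  # recognised under ASC 606
}

low, high = min(revenue.values()), max(revenue.values())
spread = (high - low) / high
print(f"spread across definitions: {spread:.0%}")
if spread > 0.05:
    print("agree one definition before asking a model to predict it")
```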
The fourth is missing required fields. The tool was built to populate a field that was historically optional. Records written before the change do not have it. New records pass through, old records get rejected, and the team sees the AI tool as an obstacle. The ICO’s small-business guidance flags inconsistent field ownership as one of the commonest causes of gaps like this.
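The pattern shows up clearly if you count the missing field by record creation year. A sketch, assuming an export with a created date and the now-mandatory field (cost_centre is a stand-in name):

```python
import csv
from collections import Counter

REQUIRED_FIELD = "cost_centre"  # stand-in for the field made mandatory later

missing_by_year = Counter()
with open("records.csv", newline="") as f:  # hypothetical export
    for row in csv.DictReader(f):
        if not row.get(REQUIRED_FIELD, "").strip():
            missing_by_year[row["created"][:4]] += 1

# A wall of misses before one year and almost none after is the
# "optional field made mandatory" pattern, not a tool fault.
for year in sorted(missing_by_year):
    print(year, missing_by_year[year])
```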
When should you fix the data and when can you work around it?
Not every data problem is worth fixing before you let the AI tool run. The proportionate move depends on what you are using the tool for, how much corruption is actually in the dataset, and how directly the output drives decisions or customer-facing actions. Get this judgement wrong in either direction and you waste months. The rule of thumb below covers the bulk of small-business cases.
Fix the data when the tool’s output is consumed by people who need to trust it. Sales scoring, customer retention prediction, financial forecasting, anything where staff or customers will see and act on the model’s answer. Trust degrades fast in these contexts and rebuilds slowly. McKinsey’s 2025 State of AI survey found that organisations getting the most value from AI redesigned workflows around what the tool could deliver before selecting it, which only works when the underlying data is trustworthy enough to redesign around.
Work around the data when the tool is used in a narrow, supervised context where a human reviews each output before it acts. A drafting assistant on top of email, a summariser for meeting notes, a research helper. In those cases the human is the data quality layer. You need a tight enough scope that the tool’s limitations do not matter, not clean records.
The mistake is applying the wrong rule. Founders chasing a quick generative-AI win sometimes invest months in clean-up they did not need. Founders rolling out a customer-facing model sometimes skip clean-up entirely because the pilot looked fine. Both end up disappointed.
Related concepts and what to read next
Data readiness is the foundation, not the whole picture. The post on the data readiness prerequisite sets out the principle for any AI implementation. The finance-specific version, cleaning data before financial AI, walks through the diagnostic for forecasting and reporting tools. The post on knowledge bases going stale at six months covers the same decay pattern in document-based systems. The Plain-English AI explainers on RAG, embeddings, and vector databases cover the AI side.
The proportionate first move, when an AI tool is reported as broken, is an hour with 20 to 30 records rather than a vendor call. Check completeness, accuracy, and consistency across systems. If more than 15 per cent of the sample is broken, you have a data problem that a new tool will not fix. Loqate’s analysis of the 1-10-100 rule puts the cost of preventing a bad record at $1, of cleaning it later at $10, and of leaving it in the system at $100. When that record also feeds a model, the $100 compounds.
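The hour-long audit is mechanical enough to script. A sketch, with placeholder checks standing in for your own completeness, accuracy, and consistency rules:

```python
import csv
import random

def is_broken(record: dict) -> bool:
    # Placeholder checks; substitute the rules that matter to you.
    no_email = not record.get("email", "").strip()
    no_owner = not record.get("owner", "").strip()
    stale = record.get("last_activity", "9999") < "2024-01-01"  # ISO dates compare as strings
    return no_email or no_owner or stale

with open("crm_export.csv", newline="") as f:  # hypothetical export
    records = list(csv.DictReader(f))

sample = random.sample(records, k=min(30, len(records)))
broken = sum(is_broken(r) for r in sample)
print(f"{broken}/{len(sample)} broken ({broken / len(sample):.0%})")
if broken / len(sample) > 0.15:
    print("data problem: a new tool will not fix it")
```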
If you would rather work through the diagnostic with someone who has helped other owner-managed firms do it, book a conversation.



