Why most AI projects fail at data, not at AI

TL;DR

When AI tools underperform in small businesses, the diagnosis usually misses the real problem. Roughly 85 per cent of AI projects fail because of poor data quality rather than weak models, and Gartner expects 60 per cent of projects built on non-AI-ready data to be abandoned by 2026. Before switching vendors, spend an hour auditing the data the tool was actually built on.

Key takeaways

- Around 85 per cent of AI projects fail because of poor data quality or lack of relevant data, not because the model itself was wrong.
- Four data patterns consistently masquerade as AI failures: stale source records, duplicate identifiers, undefined terms, and missing required fields.
- Three diagnostic questions distinguish a tool problem from a data problem in fifteen minutes flat.
- Fixing data tends to show up in rejection rates first (one to two weeks), then output quality, then adoption, then integration friction.
- The proportionate first move when an AI tool is not working is an hour with 20 to 30 records, not a call to a different vendor.

The operations manager comes back for the third time this quarter. The AI tool, the one you spent six months specifying and two months rolling out, is “not working again”. She wants to know whether you should switch vendors. You have a meeting booked with a competing supplier, a list of features that look better than the current one, and a quiet feeling that this is the same conversation you had eighteen months ago about the previous tool.

The instinct is to blame the technology. It is almost always the wrong call. The published failure data on AI projects points at the data, not the model.

What does the research actually show about why AI projects fail?

The studies that attempt to quantify the failures converge on data, not algorithms, as the cause. RAND’s analysis, drawn from interviews with sixty-five experienced data scientists, puts the headline at roughly 85 per cent of AI projects failing because of poor data quality or lack of relevant data. Gartner’s Q3 2024 survey of 248 data management leaders forecasts that 60 per cent of AI projects built on non-AI-ready data will be abandoned through 2026.

Informatica’s 2025 survey of 600 Chief Data Officers found 67 per cent unable to move even half their generative AI pilots into production. The economic impact follows the same shape. Gartner puts the average annual cost of poor data quality at $12.9 million per organisation. For a twenty-person services business, that scales down to roughly a quarter of a million pounds a year in operational waste, incurred before any AI tool arrived. The tool simply made it visible.

What three questions tell you whether it is a tool problem or a data problem?

Three questions, fifteen minutes, no vendor meeting required. They will not fix the underlying issue, but they will stop you spending six months solving the wrong one. The questions work because they force the failure back from the visible output into the inputs the tool was actually given. By the time you have answered all three, you usually know what kind of problem you have.

First, when the tool rejects a record, can you verify the rejected record manually? Pull a sample of five or ten rejections. Open the source data. If the records genuinely violate a business rule, the tool is doing its job. If they look fine to a human reader, the tool is applying a rule that your data satisfies in one part of your system and violates in another. That is a schema problem, and a different vendor will catch the same records.

Second, can you trace the tool’s output back to the data it was trained on? If you cannot inspect what the model saw, you have no way to know whether the model is broken or the training data was. Pull the historical records. Are company names merged, titles out of date, the same contacts appearing twice under different IDs? A model trained on that history learned the corruption. A new tool, given the same history, will learn the same thing.

Third, is the failure in the tool itself or in the integration layer between the tool and your systems? The tool may produce correct output that does not map cleanly to your ERP or CRM. Field names disagree. One side expects CSV, the other JSON. Replacing the tool will not fix the mismatch.
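The first of those three checks, pulling a sample of rejections and re-applying the business rule by hand, can be sketched in a few lines. The record shape, field names, and rule below are invented for illustration; substitute whatever rule your tool actually enforces.

```python
# Hypothetical sketch of the first diagnostic: sample rejected records
# and re-apply the business rule the tool claims to enforce.
# Field names ("email", "company") are illustrative, not from any real tool.
import random

def sample_rejections(rejected_records, n=10, seed=42):
    """Pull a small random sample of rejections for manual review."""
    random.seed(seed)
    return random.sample(rejected_records, min(n, len(rejected_records)))

def violates_rule(record):
    """Example rule: a record needs both an email and a company name."""
    return not record.get("email") or not record.get("company")

rejections = [
    {"id": 1, "email": "a@example.com", "company": "Acme Ltd"},  # looks fine
    {"id": 2, "email": "", "company": "Beta plc"},               # genuinely broken
    {"id": 3, "email": "c@example.com", "company": ""},          # genuinely broken
]

sample = sample_rejections(rejections, n=3)
false_rejections = [r for r in sample if not violates_rule(r)]
print(f"{len(false_rejections)} of {len(sample)} rejections look fine to a human")
```

If the false-rejection count is high, the rule, not the tool, is where the inconsistency lives.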

Where will you actually meet this in practice?

When AI tools break in small and mid-sized businesses, four data patterns account for the majority of failures. Each one produces a symptom that looks like a model fault and almost always traces back upstream. The four are stale records, duplicate identifiers, undefined terms, and missing required fields. They turn up in roughly that order of frequency in the engagements I see.

The first is stale source records. Apollo’s research on B2B contact data finds decay running at 30 to 40 per cent per year in normal cases and as high as 70 per cent in fast-moving sectors. Your lead-scoring tool faithfully scores an inactive contact at a company where the person moved on three quarters ago. That looks like a hallucination. The tool is reading what your CRM still says is true. The same Apollo work shows mid-market sales reps spending 28 per cent of their week on non-selling activities, much of it data verification.
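A minimal way to surface that staleness, assuming your CRM export carries a last-verified date. The field name and the nine-month window are illustrative assumptions, not Apollo's methodology.

```python
# Illustrative sketch: flag CRM contacts whose last verified date is
# older than a chosen staleness window.
from datetime import date, timedelta

STALE_AFTER = timedelta(days=270)  # roughly three quarters; tune to your sector

def stale_contacts(contacts, today=None):
    """Return contacts not verified within the staleness window."""
    today = today or date.today()
    return [c for c in contacts if today - c["last_verified"] > STALE_AFTER]

crm = [
    {"name": "J. Smith", "last_verified": date(2025, 1, 10)},
    {"name": "R. Patel", "last_verified": date(2023, 11, 2)},
]
flagged = stale_contacts(crm, today=date(2025, 6, 1))
print([c["name"] for c in flagged])
```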

The second is duplicate identifiers. A supplier entered three times under slightly different spellings. A contact appearing once in the CRM and once in the marketing platform with no link between them. The tool sees two records and treats them as separate entities. Informatica’s CDO survey names data quality and completeness as the primary reason 67 per cent of organisations cannot get their generative AI pilots into production.
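A cheap first pass at finding those duplicates is to normalise names and group records that collapse to the same key. A real de-duplication pipeline would add fuzzy matching; this sketch only catches the spelling and punctuation variants described above, and all the names are invented.

```python
# Sketch of a cheap duplicate check: normalise supplier names and group
# records that collapse to the same key.
import re
from collections import defaultdict

def normalise(name):
    """Lowercase, strip punctuation and common suffixes like Ltd/plc."""
    key = re.sub(r"[^\w\s]", "", name.lower())
    key = re.sub(r"\b(ltd|limited|plc|inc)\b", "", key)
    return " ".join(key.split())

suppliers = ["Acme Ltd", "ACME Limited", "acme, ltd.", "Beta plc"]
groups = defaultdict(list)
for s in suppliers:
    groups[normalise(s)].append(s)

duplicates = {k: v for k, v in groups.items() if len(v) > 1}
print(duplicates)  # the three Acme spellings collapse to one key
```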

The third is undefined terms. Revenue in your sales system means booked orders. Revenue in your ERP means shipped and invoiced. Revenue in your financial planning tool means recognised revenue under ASC 606. When the AI tries to predict revenue, it is averaging three different definitions of the same word. Alation’s research on semantic consistency frames this as 70 per cent organisational and 30 per cent technical. A different vendor cannot invent a definition you have not agreed.
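A toy illustration of why this breaks a model: the same three orders yield three different "revenue" totals depending on whose definition you apply. The statuses and amounts are invented.

```python
# Three systems, three definitions of "revenue", one set of orders.
orders = [
    {"amount": 100, "booked": True, "shipped": True,  "recognised": True},
    {"amount": 200, "booked": True, "shipped": True,  "recognised": False},
    {"amount": 300, "booked": True, "shipped": False, "recognised": False},
]

revenue = {
    "sales system (booked)":      sum(o["amount"] for o in orders if o["booked"]),
    "ERP (shipped and invoiced)": sum(o["amount"] for o in orders if o["shipped"]),
    "FP&A (recognised)":          sum(o["amount"] for o in orders if o["recognised"]),
}
print(revenue)  # three different answers to "what was revenue?"
```

A model averaging across those three answers is not hallucinating; it is faithfully reflecting a definition your organisation never agreed.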

The fourth is missing required fields. The tool was built to populate a field that historically was optional. Records written before the change do not have it. New records pass through, old records get rejected, and the team sees the AI tool as an obstacle. The ICO’s small-business guidance flags inconsistent ownership of data fields as one of the commonest causes of this kind of gap.

When should you fix the data and when can you work around it?

Not every data problem is worth fixing before you let the AI tool run. The proportionate move depends on what you are using the tool for, how much corruption is actually in the dataset, and how directly the output drives decisions or customer-facing actions. Get this judgement wrong in either direction and you waste months. The rule of thumb below covers the bulk of small-business cases.

Fix the data when the tool’s output is consumed by people who need to trust it. Sales scoring, customer retention prediction, financial forecasting, anything where staff or customers will see and act on the model’s answer. Trust degrades fast in these contexts and rebuilds slowly. McKinsey’s 2025 State of AI survey found that organisations getting the most value from AI redesigned workflows around what the tool could deliver before selecting it, which only works when the underlying data is trustworthy enough to redesign around.

Work around the data when the tool is used in a narrow, supervised context where a human reviews each output before it acts. A drafting assistant on top of email, a summariser for meeting notes, a research helper. In those cases the human is the data quality layer. You need a tight enough scope that the tool’s limitations do not matter, not clean records.

The mistake is applying the wrong rule. Founders chasing a quick generative-AI win sometimes invest months in clean-up they did not need. Founders rolling out a customer-facing model sometimes skip clean-up entirely because the pilot looked fine. Both end up disappointed.

Data readiness is the foundation, not the whole picture. The post on the data readiness prerequisite sets out the principle for any AI implementation. The finance-specific version, cleaning data before financial AI, walks through the diagnostic for forecasting and reporting tools. The post on knowledge bases going stale at six months covers the same decay pattern in document-based systems. The Plain-English AI explainers on RAG, embeddings, and vector databases cover the AI side.

The proportionate first move, when an AI tool is reported as broken, is an hour with 20 to 30 records rather than a vendor call. Check completeness, accuracy, consistency across systems. If more than 15 per cent of the sample is broken, you have a data problem that a new tool will not fix. Loqate’s analysis of the 1-10-100 rule puts the cost of preventing a bad record at $1, cleaning it later at $10, and leaving it in the system at $100. When that record also feeds a model, the $100 compounds.
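That one-hour audit can be sketched as a completeness check over the sample, compared against the 15 per cent threshold. The required field names here are illustrative assumptions; use whatever fields your tool depends on.

```python
# Minimal sketch of the one-hour audit: count broken records in a small
# sample and compare the rate against the 15 per cent threshold.
REQUIRED = ("email", "company", "owner")  # assumed fields for illustration

def broken(record):
    """A record is broken if any required field is missing or empty."""
    return any(not record.get(f) for f in REQUIRED)

def audit(sample, threshold=0.15):
    """Return the broken-record rate and whether it exceeds the threshold."""
    rate = sum(broken(r) for r in sample) / len(sample)
    return rate, rate > threshold

sample = [
    {"email": "a@x.com", "company": "Acme", "owner": "sales"},
    {"email": "", "company": "Beta", "owner": "ops"},
    {"email": "c@x.com", "company": "", "owner": ""},
    {"email": "d@x.com", "company": "Delta", "owner": "ops"},
]
rate, data_problem = audit(sample)
print(f"broken rate {rate:.0%}; data problem: {data_problem}")
```

In practice you would also spot-check accuracy against a second system, but the completeness pass alone usually settles whether a new vendor would hit the same wall.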

If you would rather work through the diagnostic with someone who has helped other owner-managed firms do it, book a conversation.

Sources

- MIT (2025). State of AI in Business 2025. Source for the 95 per cent generative AI pilot failure rate. https://trullion.com/blog/why-95-of-ai-projects-fail-and-why-the-5-that-survive-matter/
- Gartner (2025). Q3 2024 Data Management Leader Survey, 248 data management leaders. Source for the projection that 60 per cent of AI projects built on non-AI-ready data will be abandoned through 2026. https://mybusinessfuture.com/en/data-quality-in-smes-why-ai-fails-without-clean-data/
- RAND Corporation (2024). Root cause analysis of enterprise AI failure rates, 65 data scientists and engineers interviewed. Source for the 85 per cent figure and the data-related root causes. https://talyx.ai/insights/enterprise-ai-implementation-failure
- Informatica (2025). CDO Insights 2025, global survey of 600 Chief Data Officers. Source for the finding that 67 per cent of organisations move less than half of their GenAI pilots into production, and the 43 per cent data-readiness obstacle figure. https://www.informatica.com/lp/cdo-insights-2025_5039.html
- Apollo (2024). How stale CRM data hurts sales productivity. Source for B2B contact data decay rates of 30 to 40 per cent annually and the 28 per cent of sales rep time spent on data reconciliation. https://www.apollo.io/insights/how-does-stale-crm-data-hurt-mid-market-sales-team-productivity-and-pipeline-quality
- Gartner (cited via ORM-Tech, 2026). $1.5 Trillion AI Spend Faces Data Quality Barriers. Source for the $12.9 million average annual cost of poor data quality. https://orm-tech.com/news/20260501-gartner-1-5-trillion-ai-spend-faces-data-quality-barriers/
- Alation (2024). Semantic Consistency in Data: Definition, Challenges and Best Practices. Source for the 70/30 split between organisational and technical causes of semantic inconsistency. https://www.alation.com/glossary/semantic-consistency/
- Information Commissioner's Office (2024). Information governance for your small business. Source for the small-business guidance on document naming, storage, and ownership of data fields. https://ico.org.uk/media2/migrated/4020350/information-governance-for-your-small-business-v-1-0.docx
- Loqate (2024). The 1-10-100 rule: the real impact of poor data. Source for the cost ratio of preventing, cleaning, and ignoring bad records. https://www.loqate.com/en-gb/blog/the-1-10-100-rule-the-real-impact-of-poor-data/
- McKinsey (2025). The State of AI: Global Survey 2025. Source for the finding that organisations capturing the most AI value redesigned workflows before selecting technology. https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai

Frequently asked questions

My operations manager keeps telling me the AI tool is broken. Should I switch vendors?

Not yet. The published failure rate for AI projects sits around 85 per cent and the dominant cause is data, not the tool. Spend an hour first. Pull 20 to 30 records the tool was processing and check completeness, accuracy, and consistency against your other systems. If more than 15 per cent have missing fields or duplicates, a different vendor will hit the same wall.

How do I tell the difference between a model that has learned badly and a model fed bad data?

Try to trace the tool's output back to the data it was built on. If you can see the training records and they contain stale contacts, merged companies, or contradictory revenue figures, the model is faithfully learning from corrupted input. The output looks wrong because the input was wrong. Cleaning and retraining usually fixes this within a couple of weeks of the next sync.

How long does it take to see results once we actually fix the data?

In a small business with daily syncs, rejection rates drop within one to two weeks. Output quality improves after the next retraining cycle. Adoption picks up at four to six weeks once the team starts trusting the numbers. Measurable ROI tends to show up between eight and twelve weeks if the underlying process was sound and data was the only blocker.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
