A 35-staff UK accountancy firm has four vendor proposals open on the managing director’s desk, all for the same job: routing 200 support tickets a day to the right specialist. Vendor 1 quotes £3,000 for a rules engine. Vendor 2 quotes £45,000 for a custom ML model. Vendor 3 quotes £80,000 for an LLM-powered agent. Vendor 4 sells a £4,800-a-year ML feature inside the CRM the firm already pays for.
All four call themselves “AI-powered”. The MD has read enough vendor decks to know that tells her nothing. She wants to know which is the right shape for a problem where the routing logic is mostly “billing goes to finance, technical goes to engineering, complaints go to a manager”.
The unlock here is a language fix. Once you can name what each vendor is actually selling, the four proposals start to look like apples and oranges and a couple of pears, and the right call becomes obvious.
What is the choice you’re really facing?
The choice is not “AI or not AI”. AI is the umbrella for any system doing the kind of reasoning a person would otherwise do. Inside that umbrella sit three different shapes you actually need to distinguish: rule-based systems where a person writes the logic, machine learning where the system learns patterns from historical data, and generative AI where a large language model produces new content. Each carries a different cost, data, and maintenance profile.
A rule-based system is rigid but transparent. The logic is written by a human and does exactly what the human specified, until the human changes it. That is what the £3,000 ticket-routing engine actually is. Machine learning is flexible but opaque: it learns relationships from historical data that a person may not have articulated, and it adapts as new data arrives. Generative AI sits on top of both, handling ambiguity and novel language, and it costs the most to run at scale.
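To make the distinction concrete, here is a minimal sketch in Python of what the £3,000 rules engine and the £45,000 ML model amount to underneath. The category names and example tickets are invented for illustration; the point is that the first block is logic a person wrote, and the second is a mapping learned from whatever labelled history you can supply.

```python
# A sketch, not a product. Categories and tickets are invented for illustration.

# 1. Rule-based routing: a human writes the logic, and it does exactly what was written.
def route_by_rules(subject: str) -> str:
    text = subject.lower()
    if "invoice" in text or "billing" in text:
        return "finance"
    if "error" in text or "password" in text:
        return "engineering"
    if "complaint" in text or "unhappy" in text:
        return "manager"
    return "triage"  # anything the rules don't anticipate falls back to a human

# 2. ML routing: the mapping is learned from historical tickets rather than written by hand.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

past_tickets = [
    "Invoice shows the wrong VAT rate",
    "Direct debit failed again this month",
    "Password reset link never arrives",
    "Portal throws an error on upload",
    "Very unhappy with how my query was handled",
    "Want to complain about response times",
]
past_teams = ["finance", "finance", "engineering", "engineering", "manager", "manager"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(past_tickets, past_teams)

print(route_by_rules("Billing query on last month's invoice"))   # finance, because a rule says so
print(model.predict(["VAT looks wrong on my invoice"])[0])       # expected: finance, learned from the examples
```

The first block you can read, audit, and change this afternoon. The second needs history to learn from and retraining when that history drifts, which is where the ongoing cost lives.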
AWS, Google Cloud, and IBM all describe essentially this same three-layer stack in their own explainers. Any vendor unwilling to say which layer their product sits on is either hiding a thinner architecture than the marketing implies or reselling something they do not fully understand. Settle the language and the procurement decisions get cleaner.
When is the broader “AI” framing the right one?
The umbrella term is right for strategy, board-level positioning, regulatory exposure, and stacks that genuinely combine rules, ML, and generative AI in one workflow. If the question is “should we be investing in AI?”, the answer is almost certainly yes. What you are actually buying might be any of the three. The framing serves the resource-planning conversation, not the procurement one.
Regulators use the umbrella deliberately. The EU AI Act treats “AI system” as a broad category covering rule-based systems, machine learning, and generative models. The UK ICO and FCA do the same. When you are assessing vendor compliance with regulatory expectations, “AI” is the right starting term, and the specifics come next.
The umbrella is also honest when a single product genuinely combines all three. Salesforce Einstein blends predictive ML for forecasting and lead scoring with generative AI for drafted emails and an agentic layer that executes tasks under human oversight. Microsoft 365 Copilot layers LLM-driven content generation over traditional ML for personalisation and rule-based approval workflows. Both are genuinely “AI” at the strategic-purchase level. At the implementation level they are three technologies stacked together with three maintenance profiles.
When is the specific “ML” framing the right one?
Machine learning is the right frame when the problem is predictive, the historical data is there, and a domain expert can sense the pattern but cannot write it down as rules. Familiar cases in any owner-managed services business: which customers will churn, which leads will convert, which suppliers will miss a delivery window. History contains signals that predict outcomes, and the question is whether the signals can be learned.
A workable rule of thumb: if a domain expert can write “if X and Y and Z then do this”, that is a rule. If the same expert says “I know it when I see it, but I cannot fully explain the pattern”, that is ML territory. Procurement teams describe supplier reliability that way. Two thousand historical relationships with delivery, contract-adherence, and escalation data give a model enough to learn the weight of each signal.
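As an illustration of what “learning the weight of each signal” means, here is a hedged sketch in Python. The column names and the data are stand-ins, generated at random so the example runs; in practice they would come from your procurement records, and the outcome column from your record of missed delivery windows.

```python
# A sketch only: the data below is a random stand-in for your own supplier history.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000  # roughly the two thousand historical relationships mentioned above

history = pd.DataFrame({
    "late_deliveries_last_year": rng.poisson(1.5, n),
    "contract_changes": rng.poisson(0.8, n),
    "escalations": rng.poisson(0.5, n),
    "avg_response_days": rng.gamma(2.0, 1.5, n),
})
# Stand-in outcome: in reality this comes from your records, not a formula.
risk = 0.4 * history["late_deliveries_last_year"] + 0.8 * history["escalations"] - 2.0
history["missed_window"] = (rng.random(n) < 1 / (1 + np.exp(-risk))).astype(int)

model = LogisticRegression(max_iter=1000)
model.fit(history.drop(columns="missed_window"), history["missed_window"])

# The learned coefficients are the "weight of each signal" the expert senses but cannot write down.
for name, weight in zip(model.feature_names_in_, model.coef_[0]):
    print(f"{name}: {weight:+.2f}")
```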
Naming the frame as “ML” forces the right vendor questions, which is half the value of the language fix. How much training data do you need? What happens when my data shifts away from yours? Can I audit a specific decision the model made? Is it fine-tuned for my industry, or trained on something more generic? A vendor with clear answers is being honest. A vendor who deflects with “it uses AI, and AI is magic” is selling. ML projects typically run £25,000 to £250,000 to build, with £10,000 to £50,000 of annual maintenance once a model is live.
What does it cost to get the framing wrong?
Four expensive failure modes show up repeatedly. The first is overengineering with ML on a problem rules would have solved cleanly. The £45,000 ticket-routing model in the opening example might end up 96% accurate against a rules engine’s 94%, and the marginal gain rarely justifies the build cost, the data scientist on retainer, and the retraining cycle that comes with it.
The hidden tax is the part-time data scientist who keeps the model honest, often £35,000 to £50,000 a year on top of the build cost.
The second failure mode is the opposite: writing hundreds of fragile if-then rules for a problem that genuinely needs ML. Churn prediction is the textbook case. Real churn signals involve usage frequency, support sentiment, payment behaviour, feature adoption velocity, and seasonality inside the customer’s own industry. A human writing rules for all those interactions ends up with a brittle rulebook that misses real churn or breaks every time conditions shift. The cost is lost revenue, and that cost is silent.
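To see why the rulebook goes brittle, here is what hand-written churn rules tend to look like in practice. Every threshold below is invented, which is exactly the problem: each one is a guess somebody has to revisit whenever customer behaviour shifts.

```python
# Invented thresholds: each is a guess a human must revisit whenever behaviour shifts.
def churn_risk(logins_last_month: int, open_tickets: int, days_since_payment: int,
               new_features_adopted: int, month: int) -> bool:
    if logins_last_month < 4 and open_tickets > 2:
        return True
    if days_since_payment > 45 and new_features_adopted == 0:
        return True
    if month in (12, 1) and logins_last_month < 8:   # "quiet season" exception, until it isn't
        return True
    # ...and so on, one fragile guess per interaction the expert can think of.
    return False
```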
The third is using generative AI where ML or rules would suffice. Classifying support tickets is a bounded classification problem that ML solves at around 95% accuracy for a £15,000 to £40,000 one-off build. Pushing the same tickets through an LLM API might lift accuracy to 98%, at £100 to £500 a month in inference fees on 10,000 tickets. That is £1,200 to £6,000 a year for a marginal gain on a problem that is already solved.
The fourth is trust erosion. When a misframed AI project under-delivers, internal confidence in the technology collapses. The next AI initiative gets killed in the budgeting meeting before it has a chance, even if the new idea is the right shape for a real problem.
What should you ask before you decide?
Work through five steps before committing budget. First, can you describe the logic as an if-then rule a domain expert would recognise, and is that rule stable? If so, use rules. Second, do you have 12 to 24 months of historical data with a clean outcome and stable signals? Then ML is viable. Third, is the problem fundamentally about language or novel content? That is generative AI territory. Fourth, how much error can you tolerate? Fifth, what is the three-year cost of ownership of each option?
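If it helps to see those steps laid out, here is a toy sketch of the triage as code. The answers are judgement calls rather than booleans, so treat it as a thinking aid, not something to automate.

```python
# A toy sketch of the five-step triage above, not a decision engine.
def suggest_approach(rule_is_writable: bool, rule_is_stable: bool,
                     has_12_to_24_months_history: bool, outcome_is_clean: bool,
                     problem_is_language_or_novel_content: bool) -> str:
    if rule_is_writable and rule_is_stable:
        return "rules"
    if has_12_to_24_months_history and outcome_is_clean:
        return "machine learning"
    if problem_is_language_or_novel_content:
        return "generative AI"
    return "unclear: revisit the problem definition before buying anything"

# Steps 4 and 5, error tolerance and three-year cost of ownership, then decide between
# whichever candidates survive the first three questions.
print(suggest_approach(True, True, False, False, False))   # the ticket-routing example: rules
```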
Then put five questions to every vendor. Is this rule-based, ML, or LLM-driven, and which model specifically? What data was the system trained on, and what happens when my data drifts? Can you produce an audit trail for a specific decision? Where does my data live, and will you use it to retrain your base model? What is your model-deprecation and version-update policy? Fisher Phillips and Morgan Lewis publish longer versions of this checklist for legal-risk teams; the discipline travels well into SME procurement.
For a typical UK service business turning over £1m to £10m, the answer is often a hybrid. Rules handle the 60-70% of cases that are deterministic. ML handles the 20-30% that are genuinely learnable. Generative AI sits on top for content drafting and natural-language interpretation where ambiguity is unavoidable. That mix is cheaper than a single enterprise platform, more effective than picking one technology for everything, and more maintainable because you own the decision logic rather than renting it.
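What that hybrid looks like in practice, sketched against the ticket-routing example: rules first, the learned model for what rules cannot anticipate, and an escalation path for genuinely ambiguous language. The confidence threshold and the escalation function are placeholders, and the sketch reuses the rules function and classifier from the earlier example.

```python
# A sketch of the hybrid shape described above; thresholds and the escalation step are
# placeholders, not a recommendation of any particular product.
def ask_llm_or_human(ticket_text: str) -> str:
    # Placeholder: in practice this is an LLM call or a manual triage queue.
    return "manual triage"

def handle_ticket(ticket_text: str) -> str:
    # 1. Deterministic cases: cheap, auditable, owned by you.
    team = route_by_rules(ticket_text)            # the rules engine from the first sketch
    if team != "triage":
        return team

    # 2. Learnable cases: the ML model covers what the rules could not anticipate,
    #    but only when it is reasonably confident.
    probabilities = model.predict_proba([ticket_text])[0]
    if probabilities.max() >= 0.8:                # the confidence threshold is a design choice
        return model.classes_[probabilities.argmax()]

    # 3. Genuinely ambiguous language: hand off to a generative model or a person.
    return ask_llm_or_human(ticket_text)
```

The design choice worth noticing is ownership: the rules and the confidence threshold stay in your hands, so the part of the system that makes most of the decisions is the part you can read and change.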
If the vendor language is doing more work than the technology behind it, book a conversation and we will help you read what each proposal is really selling.



