The MD of a 60-staff UK managed services firm had been pitched two AI approaches in the same week. One vendor wanted to build a supervised churn model on 2,000 historical customer records, with 80 hours of admin time to verify the labels and £4,000 of tooling on top. Total bill £6,400 over six to twelve weeks, output a churn-probability score the sales team could act on. The other vendor sold a £900 a month unsupervised segmentation add-on. No labelling, results in days, output five customer clusters the firm did not know it had.
She was trying to decide which was better. The sharper question was which one fit her actual problem, and whether her budget supported the labelling the supervised path needed. Vendors rarely lead with the labelling cost. It is the line that decides the project.
What is the difference between supervised and unsupervised learning?
Supervised learning is taught with answers. You give the algorithm historical examples already labelled with the correct outcome (churned or retained, fraud or legitimate) and it learns to predict the same outcome on new records. Unsupervised learning is given raw data with no labels and asked to find structure on its own: clusters, anomalies or hidden patterns. Semi-supervised learning sits between them, labelling a small portion of the data to reduce the burden.
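For the technically curious, the distinction fits in a few lines of Python. This is a toy sketch, not a production model: the ticket counts, the nearest-centroid classifier and the one-dimensional k-means are all invented for illustration.

```python
# Toy data: monthly support tickets per customer (one feature, for brevity).
# Supervised: each historical record arrives with a label we already know.
labelled = [(2, "retained"), (3, "retained"), (4, "retained"),
            (11, "churned"), (13, "churned"), (15, "churned")]

def predict(tickets, history):
    # Nearest-centroid classifier: average each class, pick the closer one.
    centroids = {}
    for label in {lab for _, lab in history}:
        vals = [x for x, lab in history if lab == label]
        centroids[label] = sum(vals) / len(vals)
    return min(centroids, key=lambda lab: abs(tickets - centroids[lab]))

print(predict(12, labelled))  # a new customer with 12 tickets -> "churned"

# Unsupervised: the same numbers with NO labels - ask for structure instead.
unlabelled = [2, 3, 4, 11, 13, 15]

def kmeans_1d(data, k=2, iters=20):
    centres = data[:k]  # naive initialisation, fine for a toy
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in data:
            groups[min(range(k), key=lambda i: abs(x - centres[i]))].append(x)
        centres = [sum(g) / len(g) if g else centres[i]
                   for i, g in enumerate(groups)]
    return sorted(tuple(sorted(g)) for g in groups)

print(kmeans_1d(unlabelled))  # two groups the data "discovered" itself
```

The supervised function needs the answers up front; the unsupervised one finds the same split without ever being told which customers churned. That, at toy scale, is the whole labelling-cost argument.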
The practical question is whether your problem is “predict a known outcome” or “discover unknown structure”. A churn model needs labels because the algorithm has to see what churn looks like before it can predict it. A customer segmentation does not, because you are asking the data to tell you what types of customer you actually have. A fraud detector needs labels if you want to learn known fraud patterns; unsupervised anomaly detection helps where the fraud type is novel and has never been labelled.
Active learning and weak supervision sit on the hybrid line and can cut labelling cost by 50 to 70 per cent according to published Snorkel and Cleanlab work. The technical names matter less than the shape of the question.
When is supervised learning the right answer?
Supervised learning is the right answer when you can articulate the specific outcome you want to predict, you have or can afford historical labelled examples, and the cost of getting a prediction wrong is high enough to justify the labelling investment. The four common SME applications all share that shape, and embedded vendor tooling now does the heavy lifting on training and retraining.
Lead scoring is the clearest case. HubSpot AI lead scoring trains on your past conversion data, and once 500 to 2,000 historical leads are labelled the system scores new prospects automatically. Payback typically lands inside six to twelve months because sales time gets redirected from cold outreach to leads with 60 to 80 per cent conversion probability. Churn prediction follows the same logic, and SME churn models built on 2,000 to 5,000 labelled records hit 75 to 85 per cent accuracy in published work, enough to drive intervention calls.
Fraud detection on labelled transactions is a third pattern. Labelling sits at roughly £2 to £10 per transaction for complex cases, and weak supervision rules can auto-label the obvious examples to cut the cost in half. Invoice classification is the fourth. Tools like Oracle’s invoice classifier reduce manual processing time by 40 to 60 per cent once 500 example invoices are labelled at £0.50 to £2.00 each. The common thread is a known specific outcome, historical examples that exist, and a real cost of error.
When is unsupervised learning the right answer?
Unsupervised learning is the right answer when you do not know what categories exist in your data yet, when labelling would be too expensive or too subjective to be reliable, or when your goal is exploration rather than prediction. With no labels to buy, infrastructure is the main cost, and SaaS platforms have made the entry point low enough that an SME can run segmentation in days.
Customer segmentation is the canonical case. You feed behavioural data into a clustering algorithm and it discovers natural groups without you defining them in advance. Salesforce Einstein, Klaviyo, Mixpanel and Optimove all ship clustering built in, with cost sitting at £500 to £2,000 a month in platform fees and zero per-record labelling spend. Anomaly detection works the same way for fraud or equipment failures where the failure type has never been labelled, because Isolation Forest and similar methods learn what normal looks like and flag what deviates.
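The "learn what normal looks like" idea can be sketched without any platform. Isolation Forest itself needs a library, so this toy stands in with a simpler distance-from-typical rule (a median plus a spread estimate, both choices are assumptions) to make the same point: no fraud labels are ever supplied.

```python
# Daily transaction values for one account - no labels, just history.
history = [120, 115, 130, 125, 118, 122, 128, 119, 121, 124]

def fit_normal(values):
    # Learn "normal" as the median plus a robust spread estimate (MAD).
    s = sorted(values)
    median = s[len(s) // 2]
    mad = sorted(abs(v - median) for v in values)[len(values) // 2]
    return median, mad

def is_anomaly(value, median, mad, threshold=5.0):
    # Flag anything too many spread-units away from typical behaviour.
    return abs(value - median) > threshold * max(mad, 1)

median, mad = fit_normal(history)
print(is_anomaly(123, median, mad))  # an ordinary day -> False
print(is_anomaly(900, median, mad))  # a never-seen-before spike -> True
```

The £900 spike gets flagged even though no one ever labelled a fraudulent transaction, which is exactly the property that makes unsupervised methods useful against novel fraud types.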
Document and ticket clustering is the third common pattern, useful for support teams that want to find the natural groups in their inbox without predefining categories. Exploratory analysis before a supervised project is the fourth. A few weeks of clustering will tell you whether your churned customers fall into recognisable segments, which decides whether a 5,000-record labelling project is worth committing to. The trade-off is interpretation. Unsupervised learning hands you groups, then your domain expertise has to decide what they mean.
What does the wrong choice cost?
The wrong framing burns money in four different ways, and the failure modes are well documented. Attempting supervised learning on a problem with no available labels and no budget to create them is the first trap. Firms commit £5,000 to labelling, discover the categories they chose do not match what the business actually needs, and start over. The labels become useless and the spend is sunk.
Attempting unsupervised learning when you actually need a prediction is the mirror failure. The clustering returns ten customer segments and the sales team still cannot answer “who will buy”. The output describes the customer base but does not prioritise it. Poor-quality labels are the third trap. IBM puts the average organisational cost of poor data quality at over £5 million annually, and one Unity Technologies case lost £110 million in revenue when corrupted training data spread through their advertising system. Cheap crowdsourced labelling with a 20 per cent error rate costs more once retraining and rework are counted in.
A fourth trap hits regulated firms directly. Unsupervised models cannot explain why a customer was placed in a “high churn risk” segment, because there is no decision rule to audit. ICO guidance on automated decision-making expects firms to explain how a model reached its conclusion about an individual, and FCA-supervised firms face the same scrutiny on consumer outcomes. Choosing unsupervised for any decision a regulator might inspect creates compliance risk that translates into fines, rejection by auditors, or both. Supervised models, by contrast, can report which features drove a prediction, which is the audit trail regulators look for.
What should you ask before you decide?
Five questions carry the bulk of the decision for an SME. Run them in order and the answer usually becomes obvious before any procurement conversation starts. The cases where they do not are the ones that warrant a deeper data-science assessment rather than a quick board discussion.
Can you articulate the specific outcome you want to predict? If yes, supervised is the right shape; if you cannot name the outcome, start unsupervised and let exploration sharpen the question. Can you afford the labelling at your scale? Multiply the expected record count by the per-record cost at current 2026 rates (£0.05 to £0.50 semi-automated, £0.50 to £5.00 fully manual) and compare the total against the business value of better predictions. If labelling exceeds value, consider active learning or weak supervision to cut the burden, or stop. Is the decision regulated or auditable? If yes, supervised only.
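The affordability arithmetic is simple enough to sanity-check on the back of an envelope. A sketch, where the record count and the £25,000 value-of-predictions figure are hypothetical; only the per-record rate bands come from above.

```python
# Hypothetical churn project: 5,000 historical records to label.
records = 5_000
rate_semi_auto = (0.05, 0.50)   # £ per record, semi-automated
rate_manual = (0.50, 5.00)      # £ per record, fully manual
value_of_predictions = 25_000   # £ - assumed annual value of better retention

def labelling_cost(n, rate):
    low, high = rate
    return n * low, n * high

semi = labelling_cost(records, rate_semi_auto)
manual = labelling_cost(records, rate_manual)

print(f"Semi-automated: £{semi[0]:,.0f} to £{semi[1]:,.0f}")
print(f"Fully manual:   £{manual[0]:,.0f} to £{manual[1]:,.0f}")
# Worst-case manual labelling eats the entire value of the predictions -
# the signal to reach for weak supervision, active learning, or to stop.
print("Worth it at worst case?", manual[1] < value_of_predictions)
```

In this invented scenario the semi-automated path clears the bar comfortably and the fully manual path does not, which is the whole decision in two lines of arithmetic.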
What is the cost of a prediction error? Asymmetric error costs favour supervised models because they let you tune the trade-off between precision and recall. Can you afford ongoing labelling for retraining? Supervised models drift as customer behaviour changes, and a budget that covers the first build but not annual retraining will produce a model that gets quietly worse. Embedded platforms like HubSpot and Salesforce Einstein handle retraining on your own labelled data, but you still pay the labelling time.
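The precision-recall tuning point can be made concrete with a toy: ten scored customers with known outcomes (all numbers invented) and two different decision thresholds applied to the same model scores.

```python
# Model scores (churn probability) paired with what actually happened.
# All values invented for illustration.
scored = [(0.95, True), (0.90, True), (0.80, False), (0.70, True),
          (0.60, False), (0.40, True), (0.30, False), (0.20, False),
          (0.10, False), (0.05, False)]

def precision_recall(data, threshold):
    predicted = [(score >= threshold, actual) for score, actual in data]
    tp = sum(1 for p, a in predicted if p and a)        # correct alerts
    fp = sum(1 for p, a in predicted if p and not a)    # false alarms
    fn = sum(1 for p, a in predicted if not p and a)    # missed churners
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return round(precision, 2), round(recall, 2)

# High threshold: few alerts, most of them right (precision over recall).
print(precision_recall(scored, 0.85))  # (1.0, 0.5)
# Low threshold: catch every churner, tolerate some false alarms.
print(precision_recall(scored, 0.35))  # (0.67, 1.0)
```

If a missed churner costs far more than a wasted intervention call, you pick the low threshold; if sales time is the scarce resource, the high one. That dial only exists because the model was trained on labelled outcomes.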
Default to the cheapest path that answers your question. For a service-led SME between £1 million and £10 million turnover, that is usually unsupervised exploration first to understand the shape of the data, then supervised on the one or two outcomes where prediction has real money behind it.
If you want to talk through which path your data actually supports, book a conversation.