A 40-staff specialist insurance broker has three options on her desk. A London consultancy has quoted £40,000 for a “bespoke AI” build. A smaller technical partner has quoted £6,000 to fine-tune an open-source 7B model on 2,000 historical claims documents, plus £450 a month in inference cost. Her third option is to do nothing and add another senior claims handler. The broker processes around 50,000 claims documents a year, each currently taking a senior handler 4.8 hours to read, classify and route.
The middle option, she has been told, is “transfer learning”. She wants to know whether the £6,000 figure is realistic, what 2,000 labelled documents actually cost to produce, whether the resulting model will hold up against the variability of real claims, and whether she should be choosing between fine-tuning, RAG and prompt engineering or quietly combining all three.
This is the question worth answering, because transfer learning is the technique that explains why a 30-staff UK firm can run a custom AI capability in 2026 without a data science team or a six-figure compute budget. The cost economics shifted in the last eighteen months. The vocabulary has not caught up.
What is transfer learning?
Transfer learning is the technique of taking a model already trained on millions of examples and adapting it to your specific task using a fraction of the data and compute. The pre-trained model has learned the hard parts: how to recognise edges in images, how grammar works in language, how concepts relate. Your job is to teach it the narrow patterns of your business, often with a few hundred examples.
The lower layers of a model hold general knowledge that transfers across tasks. The higher layers capture task-specific patterns, and those are the layers you retrain or replace. Two main implementations exist. Feature extraction freezes the base model and trains a new output head on top. Fine-tuning unfreezes some upper layers and continues training at a slower learning rate. The GeeksforGeeks reference and the Weights and Biases write-up both frame it the same way: transfer learning is the principle, fine-tuning is one implementation of it.
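The two implementations differ in little more than one flag. A minimal PyTorch sketch, using a toy stand-in for the pre-trained base (the layer sizes and the five-class head are illustrative, not a real model):

```python
import torch.nn as nn

# Toy stand-in for a pre-trained base; in practice this would be a
# torchvision or Hugging Face model loaded with learned weights.
base = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))

# Feature extraction: freeze every base parameter...
for p in base.parameters():
    p.requires_grad = False

# ...and train only a new task-specific output head.
head = nn.Linear(32, 5)  # 5 = number of business categories (assumed)
model = nn.Sequential(base, head)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"training {trainable} of {total} parameters")

# Fine-tuning instead: additionally unfreeze the top base layer and
# train it, usually at a lower learning rate than the new head.
for p in base[2].parameters():
    p.requires_grad = True
```

The `requires_grad` flag is the whole distinction: feature extraction touches only the head, while fine-tuning unfreezes some upper layers too, typically with a learning rate an order of magnitude lower to avoid destroying what the base already knows.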
Why does it matter for your business?
It matters because the cost has collapsed. Training a foundation model from scratch can exceed £400,000 in compute alone and take eighteen months or more, per the SmartDev reference. Fine-tuning a 7B open-source model like Mistral 7B runs £0.48 to £1 per million training tokens. A typical SME 2,000-record fine-tune lands at £50 to £500 in GPU time, with inference at roughly £37 per 10,000 queries on cost-optimised models.
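The per-token figure is small enough to sanity-check on the back of an envelope. A sketch in plain Python, where the tokens-per-record and epoch count are assumptions (real claims documents and training schedules vary):

```python
records = 2_000
tokens_per_record = 1_500      # assumed average length of a claims document
epochs = 3                     # assumed number of training passes
price_per_m_tokens = 1.00      # upper end of the £0.48-£1 range

training_tokens = records * tokens_per_record * epochs
token_cost = training_tokens / 1_000_000 * price_per_m_tokens
print(f"{training_tokens:,} tokens -> £{token_cost:.0f}")
```

Even at the top of the range, raw token cost comes out in single digits. The £50 to £500 figure reflects GPU-hour rental and experimentation overhead, and the gap between either number and a £6,000 quote is labelling, evaluation and integration labour rather than compute.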
Precedence Research forecasts the global transfer learning market to grow from £2.93bn in 2025 to £3.61bn in 2026, with a 23 per cent compound annual growth rate through 2035. The figures matter less than the direction. Smaller organisations now find the economics workable, which is why the technique is moving from research labs into the day-to-day procurement decisions of UK SMEs.
The use cases are concrete. A specialist insurance firm fine-tuned a transformer on 2,000 labelled claims and cut classification time from 4.8 hours to 3.2 minutes per document at 94.7 per cent accuracy, per the Artificio case study. Harvey fine-tuned models on 10 billion tokens of case law and now serves 42 per cent of the top 100 US law firms. A small manufacturer can take 200 quality control photos, augment them to 2,000-plus variations, and catch defects that human inspectors miss 5 per cent of the time. A recruitment consultancy can fine-tune on 300 historical applications and cut CV screening time by 60 per cent.
Where will you actually meet it?
You will meet it embedded inside vendor pitches that say “fine-tuned for your business” without naming the technique. Vendors are typically selling one of three things: a full fine-tune of a small open-source model, a parameter-efficient fine-tune (LoRA or QLoRA) of a larger one, or a wrapper around someone else’s API with a custom prompt. The economics differ by an order of magnitude, so the question to ask is which one you are actually buying.
You will also meet it on Hugging Face, which now hosts 2.8 million pre-trained models and 500,000 datasets per its Spring 2026 state-of-the-platform post. AWS SageMaker JumpStart, Google Vertex AI and Azure Machine Learning all offer pre-configured fine-tuning templates. Open-source platforms like SiliconFlow, LLaMA-Factory and Unsloth lower the bar further. Unsloth in particular fine-tunes models on just 3GB of RAM in a free Colab notebook. None of this means you should run the project yourself. It means the supply side is no longer the bottleneck.
The interesting cost lever is parameter-efficient fine-tuning. Full fine-tuning retrains every parameter in a model; for a 7B model that means updating seven billion numbers, which is expensive and slow. LoRA adds small adapter matrices alongside the frozen base and trains only the adapters, typically well under one per cent of the original parameter count. QLoRA quantises the base model to 4-bit precision and shrinks the memory footprint by 75 per cent. The combination is fine-tuning that costs ten to a hundred times less and trains five to ten times faster, with comparable accuracy.
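The arithmetic behind LoRA is easiest to see on a single weight matrix. A NumPy sketch, with the 2048-by-2048 shape and rank 8 chosen as illustrative values rather than taken from any particular model:

```python
import numpy as np

d, k, r = 2048, 2048, 8            # base weight shape and LoRA rank (illustrative)

W = np.random.randn(d, k)          # frozen pre-trained weight, never updated
A = np.zeros((d, r))               # trainable adapter, zero-initialised...
B = np.random.randn(r, k) * 0.01   # ...so training starts exactly from W

W_eff = W + A @ B                  # effective weight; only A and B get gradients

full_params = W.size
adapter_params = A.size + B.size
print(f"adapters are {adapter_params / full_params:.2%} of the full matrix")
```

Because one adapter is zero-initialised, the effective weight equals the pre-trained weight at the start of training, and gradient updates flow only through the two small matrices. On this toy matrix the adapters are under one per cent of the parameters; the same ratio is what makes a 7B fine-tune affordable.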
When to ask for it versus when to ignore it
Ask for transfer learning when your data is stable, when you need consistent tone, terminology or reasoning style, and when you have at least 200 to 500 labelled examples in the domain. Customer support classification, document routing, contract drafting in a house style, claims triage: all good fits.
Ignore it and reach for retrieval-augmented generation when your data changes frequently, when you need to cite the source of an answer, or when the knowledge base is large. Product catalogues, policy documents, pricing sheets and case law libraries all want RAG, because the model never has to learn the content. The IBM RAG-vs-fine-tuning explainer is the cleanest reference if you want a longer treatment.
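The contrast with fine-tuning is clearest in code: a RAG system never updates model weights, it fetches the relevant source at query time and places it in the prompt. A toy sketch in which naive keyword overlap stands in for embedding search (the documents and function names are invented for illustration):

```python
# The knowledge base lives outside the model; updating it needs no retraining.
documents = {
    "pricing": "Standard cover is £120 per month; fleet discounts apply over 5 vehicles.",
    "claims": "Claims must be notified within 30 days of the incident.",
}

def retrieve(query: str) -> str:
    """Return the document sharing the most words with the query.
    A real system would use embedding similarity and a vector store."""
    q = set(query.lower().split())
    return max(documents.values(), key=lambda d: len(q & set(d.lower().split())))

def build_prompt(query: str) -> str:
    # The retrieved passage travels with the question, so the answer
    # can cite its source.
    return f"Answer using only this source:\n{retrieve(query)}\n\nQuestion: {query}"

print(build_prompt("when must claims be notified"))
```

When the pricing sheet changes, you edit `documents` and the next answer uses the new figure, which is why frequently changing data wants RAG rather than a retrain. The retrieved passage arriving alongside the answer is also what makes source citation possible.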
Ignore it and reach for prompt engineering when you are prototyping, when you have fewer than 100 examples, or when you need to deploy in hours not weeks. Many production systems eventually combine all three. A fine-tune for tone and reasoning, RAG for current facts, prompt engineering on top to handle edge cases.
Two boundaries worth flagging. Source and target tasks must be related. A vision model transfers to other vision tasks; a language model transfers to other language tasks. Cross-modal transfer is possible but expensive. And fine-tuning on customer or employee data triggers UK GDPR obligations and EU AI Act duties, fully applicable from 2 August 2026 per the European Commission’s regulatory framework. The ICO guidance on AI and data protection is the right starting point. SMEs under 250 staff or £50m turnover get simplified pathways, but compliance still has to be costed in.
Related concepts
Fine-tuning is the most common implementation of transfer learning. Foundation models are the things you transfer-learn from. Retrieval-augmented generation is the adjacent adaptation method that suits frequently changing data. Prompt engineering is the lightest-touch option, and the prompt engineering versus fine-tuning decision guide walks through the trade-off in more depth. Machine learning is the parent discipline, since transfer learning is fundamentally a machine learning method.
The broker’s question, in the end, was the right one. The £6,000 figure is realistic for a 7B fine-tune on 2,000 labelled records if the labelling work is in hand. The 2,000 documents are the constraint, not the compute. And she does not have to choose between fine-tuning, RAG and prompt engineering. She picks the lightest option that solves the problem in front of her, then layers the others as the use case earns them.