What is a foundation model? Why it matters for your business

TL;DR

A foundation model is a large pre-trained AI system that vendors adapt to build specific products on top of. GPT-5, Claude Opus, Gemini, Llama and Mistral are all foundation models. Every LLM is a foundation model, but foundation models also include vision and multimodal systems. The business question is not whether your tool is built on one but which one, who owns it, and what happens when that model is retired.

Key takeaways

- A foundation model is a large pre-trained AI system that other tools are built on top of.
- Every LLM is a foundation model. Not every foundation model is an LLM. Vision and multimodal systems also qualify.
- Vendors choose a foundation model and wrap it. Their product inherits the model's strengths, costs and lifecycle.
- "Model-agnostic" is mostly marketing. True model-agnostic platforms use a gateway and a unified API.
- Foundation models are deprecated on the provider's schedule, not yours. Plan for migration before you need it.

A founder I work with watched a salesperson move from “powered by AI” on slide one to “built on a leading foundation model” on slide two without naming the model. He asked me afterwards whether that was a meaningful claim or a way of avoiding the question. It was the latter.

By 2026 nearly every business AI tool is built on top of a foundation model, and the vendor who will not name it is the vendor whose pricing, continuity and capability are at someone else’s mercy. The plain-English version of the term tells you when the answer matters and when it does not.

What is a foundation model?

A foundation model is a large AI system, pre-trained at enormous cost on huge volumes of data, that other tools are built on top of. The term was coined by Stanford researchers in 2021 to describe a paradigm shift in how AI is created. Before foundation models, machine learning teams built specialist systems for each narrow task. Foundation models invert that. One model is trained once on vast, diverse data, and then adapted by downstream users to do many different things.

GPT-5 from OpenAI, Claude Opus from Anthropic, Gemini from Google, Llama from Meta, Mistral from Mistral AI and Grok from xAI are all foundation models. They arrive pre-trained, and your business or your vendor adapts them by adding instructions, by feeding in your documents at query time, or by fine-tuning them on your examples.

The category is broader than LLMs. Every LLM is a foundation model, but foundation models also include vision systems like CLIP and SAM, and multimodal systems like GPT-4o and Gemini. When a vendor says “built on an LLM” they mean text. When they say “built on a foundation model” they may mean text, vision, audio or all three. Worth asking which.

The economics matter too. Pre-training a foundation model from scratch costs tens of millions of pounds and is the preserve of a handful of labs. Adapting one is cheap and fast. That is why nearly every SME-facing AI tool in 2026 wraps an existing foundation model rather than training a proprietary one. The vendor’s value is in the wrapping, the data, the workflow integration and the user interface. The capability ceiling is set by the foundation model underneath.

Why it matters for your business

The first thing it changes is cost transparency. When a vendor charges a flat per-seat fee, they absorb the per-token API cost their foundation model provider charges them. OpenAI raised input token pricing on GPT-5.5 by 100% in May 2026, and costs for long-context workloads rose by between 49% and 92%. Your vendor either ate that increase, throttled you to a cheaper model, or passed it on. Knowing the foundation model lets you anticipate the next move.
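To make the pass-through concrete, here is a back-of-envelope sketch. All prices, query volumes and token counts below are hypothetical, invented purely for illustration; they are not GPT-5.5's actual rates. The point is the shape of the arithmetic: when input tokens are half the bill, doubling the input price raises the total by 50%, not 100%.

```python
# Back-of-envelope sketch of how a provider's per-token price rise flows
# through to a vendor's monthly bill. All numbers here are hypothetical.

def monthly_cost(queries: int, input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Monthly cost in pounds, given per-million-token prices."""
    per_query = (input_tokens * input_price_per_m
                 + output_tokens * output_price_per_m) / 1_000_000
    return queries * per_query

# 50,000 queries/month, 2,000 input and 500 output tokens per query.
before = monthly_cost(50_000, 2_000, 500,
                      input_price_per_m=2.0, output_price_per_m=8.0)
after = monthly_cost(50_000, 2_000, 500,
                     input_price_per_m=4.0, output_price_per_m=8.0)  # input price doubled

print(round(before, 2), round(after, 2))  # before ≈ £400, after ≈ £600
```

With these made-up figures, a 100% rise in the input price lifts the vendor's total cost by 50%, because output tokens were the other half of the bill. The vendor's real exposure depends on their input/output mix, which is one more reason to ask which model sits underneath.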

The second is continuity. OpenAI announced GPT-4.5 in February 2025 and deprecated it within two months because the inference cost was uneconomic. Tools anchored to that model had to migrate. Prompts that worked on the old model behaved differently on the replacement. By 2026 this cycle is normal. Foundation models are retired roughly every twelve to eighteen months, and the lifecycle is set by the provider, not by you or your vendor.

The third is capability inheritance. Your product cannot do anything the underlying model cannot do, regardless of marketing. If the model hallucinates, the product hallucinates. If the model has a knowledge cut-off, the product does too unless retrieval is bolted on. The UK National Cyber Security Centre is clear that hallucination, bias and prompt injection are intrinsic to how the technology works, not vendor flaws to be fixed.

Where you will meet it

You will meet “foundation model” in vendor pitches where the salesperson does not want to commit to a specific name. “Built on a leading foundation model” or “built on a state-of-the-art foundation model” both translate to “I would rather not tell you exactly which one.” Sometimes that reflects a genuine model-routing setup behind the scenes. More often it is hedging.

You will meet “model-agnostic” in pitches where the vendor wants to neutralise the lock-in concern. The claim is that their platform can swap one foundation model for another without breaking your application. In a few cases this is true and the vendor has built a real abstraction layer, often called a gateway, that translates between your business logic and whichever model is underneath. In most cases the claim means they have built separate integrations to several models, which is fragmentation rather than agnosticism. The test question is “can I switch from GPT to Claude without changing my prompts or my code?” A truthful answer of “configuration change only” indicates the real thing.
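The shape of a genuine gateway can be sketched in a few lines. Everything below is illustrative: the `Gateway` class, the adapter functions and the provider names are invented to show the abstraction, not any vendor's real code. The key property is that business logic calls one unified interface, and swapping the underlying model is a configuration change rather than a rewrite.

```python
# Illustrative sketch of a model gateway. Business logic calls one unified
# interface; the underlying model is selected by configuration alone.
# Provider names and adapters are invented stubs, not real SDK calls.

from dataclasses import dataclass
from typing import Callable, Dict

# Each adapter translates the unified request into one provider's API.
# A real adapter would call that provider's SDK; these are stand-ins.
def _call_gpt(prompt: str) -> str:
    return f"[gpt] {prompt}"

def _call_claude(prompt: str) -> str:
    return f"[claude] {prompt}"

ADAPTERS: Dict[str, Callable[[str], str]] = {
    "gpt": _call_gpt,
    "claude": _call_claude,
}

@dataclass
class Gateway:
    provider: str  # the only thing that changes when you switch models

    def complete(self, prompt: str) -> str:
        return ADAPTERS[self.provider](prompt)

# Switching from GPT to Claude is a configuration change only; the
# application code and prompts stay the same.
gw = Gateway(provider="gpt")
print(gw.complete("Summarise this meeting"))
gw = Gateway(provider="claude")
print(gw.complete("Summarise this meeting"))
```

A vendor with separate, hand-built integrations to each model cannot do this; their prompts and workflows are entangled with one provider's API, which is the fragmentation the test question is designed to expose.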

You will also meet foundation model language in regulated contexts. Under the EU AI Act, providers of general-purpose AI models placed on the market from 2 August 2025 must publish a summary of training data and disclose how they have evaluated systemic risks. If you operate in or serve the EU, the foundation model your tool uses is part of your compliance picture, and the provider’s documentation is what you point at when an auditor asks.

When to ask about it, when to ignore it

Ask hard questions when your business outcomes depend directly on the model’s capability, cost or availability. A customer service product where the foundation model is the engine of every response deserves a specific name and a specific version. A bid-writing tool used by your team daily deserves the same. Ask the vendor four questions in sequence:

- Which foundation model and version do you use?
- What is your migration plan when that model is deprecated?
- What is your data handling and residency policy?
- How much work would it take for me to switch you out?

Ask hard questions when you are in a regulated industry. Healthcare, finance and legal services all carry obligations around the provenance and explainability of automated decisions. The FCA’s published approach to AI is explicit about validation and explainability. The ICO’s guidance is explicit about lawful basis and Data Processing Addenda. Both presuppose you know which foundation model is doing the work.

Ignore the term when you are using a low-stakes tool in a one-off way. The model behind a meeting summariser that drafts your team’s notes is, at this point, an implementation detail. Whether it is GPT, Claude or Llama matters less than whether the summaries are good enough that the team uses them. The product is the user experience, not the architecture diagram.

Ignore “model-agnostic” claims that come without an architecture answer. If the vendor cannot describe the abstraction layer or the gateway, the claim is marketing. Do not buy on it.

Related terms

An LLM, or large language model, is a foundation model specialised in text. Every LLM is a foundation model. Not every foundation model is an LLM. The distinction matters when the product handles images, audio or video as well as text.

Base model is a near-synonym for foundation model in most vendor language. When a vendor says “built on the Llama base model” they mean the same as “built on the Llama foundation model”.

Fine-tuning is a way of adapting a foundation model to a specific task by adjusting the model’s weights using your own data. More expensive and slower than prompting or retrieval, but the right call when you need consistent behaviour the base model cannot deliver from instruction alone. Fine-tuning has its own explainer in this series.

Frontier model is a label, not a category, for the most capable models available at any given time. In May 2026 the frontier set includes GPT-5, Claude Opus 4.6 and Gemini 3 Pro. Frontier models are powerful but expensive and tend to be revised on shorter cycles. The UK AI Security Institute uses the term in its evaluations of model risk.

Open-weight model is a foundation model whose parameters have been published openly so that anyone can download and run them. Llama, Mistral and DeepSeek’s variants are open-weight. They cost less per query at high volume and give you full control, in exchange for running the infrastructure yourself.

The point of the vocabulary is not to make you a model expert. It is to give you enough purchase that the next time a vendor says “built on a foundation model” without naming one, you can ask the question that turns the marketing into a contract conversation.

Frequently asked questions

Is a foundation model the same thing as an LLM?

An LLM is a type of foundation model. Every LLM is a foundation model, but not every foundation model is an LLM. Vision models like CLIP and SAM, and multimodal models like GPT-4o and Gemini, are also foundation models. The distinction matters when a vendor says "built on a foundation model" without naming the specific model or the modalities it covers.

Should I worry about which foundation model my vendor uses?

Yes, in two scenarios. First, when the vendor's pricing depends on the underlying model and a price change at the model provider will pass through to you. Second, when the vendor's continuity depends on the model staying available. OpenAI deprecated GPT-4.5 within two months of release in 2025 and raised input pricing on GPT-5.5 by 100% in May 2026. Both events ripple through tools built on those models.

Is "model-agnostic" a real thing or marketing?

Both. Genuinely model-agnostic platforms use an abstraction layer (often called a gateway) so the same application code can call any model with a configuration change. Most vendors who claim agnosticism have built separate integrations to two or three models, which is fragmentation rather than agnosticism. The test question is whether you can switch from GPT to Claude without re-engineering your prompts and workflows.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
