What is a frontier model? When to pay frontier prices and when not to

TL;DR

A frontier model is one of the small set of largest, most capable general-purpose AI systems at the cutting edge of training compute. In May 2026 that means GPT-5.5, Claude Opus 4.7, Gemini 3.1 Pro, and the open-weight border with Llama 4, DeepSeek V4 and Mistral Large 3. They are real, they are expensive, and for the bulk of routine SME work they are overkill. The procurement question is whether your workload genuinely needs frontier reasoning or whether a mid-tier model at one fifth the price does the job.

Key takeaways

- A frontier model is one of the small set of general-purpose AI systems at the absolute leading edge of training compute and capability.
- Regulators define frontier with compute thresholds: 10^25 FLOPs in the EU AI Act, 10^26 FLOPs in the US frameworks. Those rules apply to the model providers, not to you using the API.
- The cost gap is large. Claude Opus 4.7 is roughly 18 times more expensive per token than Claude Haiku. For high-volume work that multiplier compounds fast.
- Frontier capability earns its keep on reasoning-heavy, autonomous, or high-value work. It is overkill on routine, bounded, templated tasks.
- Test with a mid-tier model first. Route only the hardest 10 to 20% of queries to a frontier model. This typically cuts API spend by 40 to 60%.

A 30-consultant management practice with £4 million turnover sat through a vendor pitch for “frontier-grade proposal writing” last month. The pricing slide quoted £15 input and £75 output per million tokens for Claude Opus 4.7. The owner did the maths on a typical proposal of around 200,000 tokens in and 30,000 out. About £5.25 per proposal at frontier rates. Around £1.40 at mid-tier. Below 20p with retrieval and a budget model.
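The arithmetic behind those per-proposal figures is worth having on hand for your own workloads. A minimal sketch, using the token counts and prices quoted above (the tier labels and the Sonnet/Haiku rates are the article's illustrative figures, not a live price list, and the article's "mid-tier" price point may refer to a differently priced model):

```python
# Rough per-task cost at three pricing tiers.
# Prices are in GBP per million tokens, as quoted in the article.

PRICES = {
    "frontier (Opus 4.7)":   {"in": 15.00, "out": 75.00},
    "mid-tier (Sonnet 4.6)": {"in": 3.00,  "out": 15.00},
    "budget (Haiku 4.5)":    {"in": 0.80,  "out": 4.00},
}

def cost_per_task(tokens_in: int, tokens_out: int, price: dict) -> float:
    """Cost in pounds for one task at a given price tier."""
    return (tokens_in * price["in"] + tokens_out * price["out"]) / 1_000_000

# The proposal from the example: 200,000 tokens in, 30,000 out.
for tier, price in PRICES.items():
    print(f"{tier}: £{cost_per_task(200_000, 30_000, price):.2f}")
```

Swap in your own token counts and your provider's current prices; the point is that the multiplier between tiers, not the absolute number, is what compounds at volume.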

His follow-up question was the right one: "Show me a side-by-side on three real briefs from our practice, frontier versus mid-tier." The vendor had not run that comparison. "Frontier-grade" is the start of a procurement conversation, not the end of one. The plain-English version of the term tells you when the answer matters and when it does not.

What is a frontier model?

A frontier model is one of the small set of largest, most capable general-purpose AI systems at the leading edge of training compute. NVIDIA describes them as research systems sitting at the boundary of what AI can do at any given moment, often still being refined during deployment. The named frontier models in May 2026 are GPT-5.5, Claude Opus 4.7 and Gemini 3.1 Pro.

The open-weight border sits just behind, with Llama 4, DeepSeek V4 and Mistral Large 3 closing the capability gap on a six-to-twelve-month lag.

The label is also regulatory. Under the EU AI Act, a general-purpose model with systemic risk is one trained with more than 10^25 floating-point operations of compute. The US executive order 14110, California’s TFAIA and New York’s RAISE Act use a higher threshold of 10^26 FLOPs paired with a $100 million training-cost floor. The UK AI Security Institute, renamed from the AI Safety Institute in February 2025, takes a case-by-case approach focused on cyber, CBRN and loss-of-control risks rather than a single FLOP number.

Those thresholds determine which model providers face safety-testing and reporting obligations. They do not, in the typical case, apply to the SME using the resulting API. Your job is to use the model responsibly within its terms. The provider publishes the safety protocols, runs the evaluations and absorbs the regulatory cost. That is one of the reasons frontier API access is expensive.

Why does it matter for your business?

Because the cost gap between frontier and mid-tier is large enough that getting the choice wrong shows up in your unit economics within months. Per million tokens, Claude Opus 4.7 runs roughly £15 input and £75 output. Claude Sonnet 4.6 is £3 and £15. Claude Haiku 4.5 is £0.80 and £4. The Opus-to-Haiku ratio is around 18 times. For volume work, that multiplier compounds fast.

Epoch AI’s research shows frontier capability is also a moving target. Within 6 to 12 months, models matching the previous year's frontier capability can run on a single consumer graphics card. GPT-4 originally cost £15 per million input tokens in early 2023. By May 2024, GPT-4o, measurably more capable, cost £2.50 per million tokens. A six-times reduction in 16 months. The same compression keeps repeating across the Claude and Gemini families.

The implication for your business is architectural rather than contractual. If you build a product on GPT-5.5 today at £30 per million output tokens, expect an open-weight equivalent to be available at one fifth the price within 12 to 18 months. Design the application so swapping models is a configuration change, not a rewrite. Use a router or an abstraction layer. Plan for migration before you need it.
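What "a configuration change, not a rewrite" looks like in practice: call sites name the job to be done, and a config table maps jobs to models. A minimal sketch, where the provider names, model IDs, prices and the `complete()` stub are illustrative assumptions rather than any real SDK:

```python
# Sketch of a model abstraction layer: swap the production model by
# editing one config entry, not every call site.

from dataclasses import dataclass

@dataclass
class ModelConfig:
    provider: str
    model_id: str
    price_in: float   # £ per million input tokens (illustrative)
    price_out: float  # £ per million output tokens (illustrative)

# Call sites reference the task; only this table knows which model serves it.
MODELS = {
    "drafting":  ModelConfig("anthropic", "claude-sonnet-4-6", 3.00, 15.00),
    "reasoning": ModelConfig("openai",    "gpt-5.5",           15.00, 30.00),
}

def complete(task: str, prompt: str) -> str:
    cfg = MODELS[task]
    # In production this would dispatch to the provider's SDK via cfg.provider.
    return f"[{cfg.provider}:{cfg.model_id}] response ({len(prompt)} chars of prompt)"

print(complete("drafting", "Summarise the attached proposal brief."))
```

When the cheaper open-weight equivalent lands in 12 to 18 months, the migration is a one-line edit to `MODELS` plus a regression test, not a rebuild.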

Where will you actually meet it?

You will meet “frontier-grade” most often in vendor demos and investor conversations, where the term does marketing work as much as technical work. Your CRM provider, your accounting software and your customer-service platform are all running frontier-model evaluations internally. Some will pass capability through in new features. Many will simply charge you for the model whether or not your workload needs frontier reasoning.

Four places it shows up. First, vendor pitches that lead with the model name to signal sophistication without showing side-by-side output against a cheaper alternative. Second, cloud-provider sales calls where AWS, Azure and Google Cloud frame frontier adoption as part of “responsible AI strategy”. Third, board and investor calls where someone asks “are we on the latest model” without specifying the workload. Fourth, peer conversations where another founder has a frontier subscription and assumes you should too.

In every case the same procurement question applies. Show me a side-by-side on three real queries from my business, frontier versus mid-tier, with quality scored against my actual standard and cost shown per task. If the vendor has not run that comparison, the frontier claim is unverified. If the mid-tier handles 80% of your queries acceptably, route the hard 20% to the frontier tier and pocket the saving.
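The side-by-side the paragraph asks vendors for fits in a few lines once you have scored the outputs yourself. A sketch under stated assumptions: the queries, the scores out of 10 and the per-task costs below are placeholders you would replace with your own reviewers' numbers, with the costs taken from the proposal example earlier in the article:

```python
# Side-by-side skeleton: same real queries through two tiers, quality scored
# against your own bar, cost shown per task. All figures are placeholders.

# (query, mid_tier_score, frontier_score) — scores out of 10, from your reviewers
RESULTS = [
    ("Draft engagement letter for retail client", 8, 9),
    ("Summarise 40-page due-diligence pack",      7, 9),
    ("Model three-year cashflow scenarios",       5, 8),
]

COST_PER_TASK = {"mid-tier": 1.40, "frontier": 5.25}  # £, the article's example
QUALITY_BAR = 7  # your minimum acceptable score

for query, mid, frontier in RESULTS:
    verdict = "mid-tier is enough" if mid >= QUALITY_BAR else "escalate to frontier"
    print(f"{query:44} mid={mid} frontier={frontier} -> {verdict}")
```

If two of three real queries clear your bar on the mid-tier, the frontier claim is doing marketing work, and you have the numbers to say so in the contract conversation.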

When to ask vs when to ignore

Ask hard questions when reasoning depth carries economic weight. Legal analysis, financial modelling, regulated compliance research and complex strategic synthesis all benefit from the marginal accuracy that frontier models deliver. So do autonomous multi-step workflows, where the model needs to maintain state across hours, call tools and recover from intermediate failures. So does novel problem-solving where your team lacks deep domain expertise. In these scenarios the cost premium often earns its keep.

Ignore the frontier label for routine bounded tasks. Customer-service triage, document classification, scheduled reporting, templated writing, data extraction. None of these need frontier reasoning, all of them run faster and cheaper on smaller models, and the quality difference on your actual data is usually below your noise floor. Ignore it also for high-volume low-margin work where unit economics are paramount, for latency-sensitive interactive applications, and for data-sovereign or regulated workloads where an open-weight self-hosted model is the right answer regardless of capability tier.

The decision rule. Test with a mid-tier model first, on real data from your business. Measure quality and cost. Only escalate to the frontier tier if mid-tier output is good but insufficient on a measurable share of queries, then route only those queries to the frontier model. This pattern typically reduces total API costs by 40 to 60% versus a single-frontier-model architecture. If your monthly API spend is below £1,500, frontier capability is almost certainly the wrong default.
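The escalation pattern in the decision rule can be sketched as a simple router: answer on the mid-tier by default and escalate only when a cheap quality check fails. The model calls and the confidence heuristic below are stand-ins, not real APIs; in production the difficulty signal might be a classifier, a self-reported confidence score, or a failed validation of the mid-tier output:

```python
# Two-tier routing sketch: mid-tier by default, frontier only on the hard minority.

def mid_tier(query: str) -> tuple[str, float]:
    """Hypothetical mid-tier call returning (answer, confidence estimate)."""
    hard = "multi-jurisdiction" in query   # stand-in for a real difficulty signal
    return ("mid-tier answer", 0.4 if hard else 0.9)

def frontier(query: str) -> str:
    """Hypothetical frontier call, ~5x the per-token cost."""
    return "frontier answer"

def route(query: str, threshold: float = 0.7) -> tuple[str, str]:
    answer, confidence = mid_tier(query)
    if confidence >= threshold:
        return "mid-tier", answer          # the cheap path, the bulk of traffic
    return "frontier", frontier(query)     # escalate only when the check fails

tier, _ = route("Classify this support ticket")
print(tier)
```

The 40 to 60% saving quoted above comes from exactly this shape: most queries never touch the expensive tier, and the threshold is a dial you tune against your own quality bar.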

Related terms worth knowing

Foundation model is the broader category. Every frontier model is a foundation model. Not every foundation model is a frontier model. The foundation category includes the established mid-tier and budget tiers. Frontier is the small leading-edge subset.

Model tier is the operational language for the cost-capability ladder underneath frontier. Advanced mid-tier (Claude Sonnet, Gemini 2.5 Pro, GPT-5.2) handles 10 to 15% of business use cases. Mid-tier (Claude Haiku, GPT-5.1 mini, Gemini 2.5 Flash) handles 30 to 40%. Budget tier (GPT-5 nano, DeepSeek V3) handles 40 to 50%. The typical SME should run a multi-model architecture across tiers rather than defaulting to frontier for everything.

Mixture of Experts is an architecture pattern that lets a large model use only a fraction of its parameters per query. DeepSeek V4 has roughly 1 trillion total parameters but only 37 billion active per token, which is why it competes on capability with Opus and GPT-5.5 at a fraction of the operating cost. Worth knowing when an open-weight deployment becomes plausible.
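The arithmetic behind that cost claim is simple enough to check, using the figures quoted above:

```python
# Activation fraction for the MoE example above: ~1T total parameters,
# ~37B active per token. Compute scales with the active count, not the total.
total_params = 1_000_000_000_000
active_params = 37_000_000_000
print(f"{active_params / total_params:.1%} of parameters active per token")
```

Roughly 4% of the model does the work on any given token, which is why a trillion-parameter open-weight model can be priced like something far smaller.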

The AGI horizon claim is the marketing layer that often sits on top of frontier announcements. Treat the AGI framing as separate from the procurement question. Whether GPT-5.5 is “approaching AGI” is interesting for researchers and unhelpful for owners deciding whether to pay frontier prices for templated proposal writing. Keep the two conversations apart.

The point of the vocabulary is to give you enough purchase that the next time a vendor leads with “frontier-grade”, you can ask for the side-by-side that turns the marketing into a contract conversation. You do not need to be a model expert to do that.

Sources

- NVIDIA (2024). What are frontier models? The canonical plain-English definition. https://www.nvidia.com/en-us/glossary/frontier-models/
- Epoch AI (2024). How much does it cost to train frontier AI models? Authoritative source on training costs and the 2.4x annual growth trend. https://epoch.ai/blog/how-much-does-it-cost-to-train-frontier-ai-models
- UK AI Security Institute (2024). Frontier AI Trends Report. The UK regulator's view on capability evaluations and case-by-case risk assessment. https://www.aisi.gov.uk/frontier-ai-trends-report
- European Commission (2025). Guidelines for providers of general-purpose AI models under the EU AI Act, including the 10^25 FLOPs systemic-risk threshold. https://digital-strategy.ec.europa.eu/en/news/learn-more-about-guidelines-providers-general-purpose-ai-models
- Anthropic (2026). Introducing Claude Opus 4.7. Vendor announcement and capability claims for the current Anthropic flagship. https://www.anthropic.com/news/claude-opus-4-7
- OpenAI (2024). Introducing GPT-5. The foundational announcement for the GPT-5 family. https://openai.com/index/introducing-gpt-5/
- Mean CEO Blog (2024). Small language models: stop paying for frontier-model ego. The case for smaller models on the bulk of SME workloads. https://blog.mean.ceo/small-language-models/
- Matt Hopkins (2024). Why most AI work does not need a frontier model. Practical SME framing on when to escalate and when not to. https://matthopkins.com/technology/most-ai-work-doesnt-need-frontier-model/
- Jones Walker (2026). New York's RAISE Act: what frontier model developers need to know. The 10^26 FLOPs US threshold and adjacent state legislation. https://www.joneswalker.com/en/insights/blogs/ai-law-blog/new-yorks-raise-act-what-frontier-model-developers-need-to-know.html

Frequently asked questions

How do I know if my business actually needs a frontier model?

Run a side-by-side test. Take three real queries from your business, run them through Claude Sonnet or Gemini 2.5 Pro at one end and Claude Opus 4.7 or GPT-5.5 at the other. Score the outputs against your own quality bar. If the mid-tier results are good enough on the bulk of queries, you have your answer. If a measurable share genuinely fail, route only that share to the frontier tier.

What is the difference between a foundation model and a frontier model?

A foundation model is any large pre-trained system that other tools are built on top of. A frontier model is the small subset of foundation models at the current cutting edge of capability and training compute. Frontier is a moving label, not a permanent category. Today's frontier model becomes a mid-tier offering in 12 to 18 months as newer, larger models replace it.

Should I worry about the EU AI Act if my vendor uses a frontier model?

The systemic-risk obligations in the EU AI Act fall on the model provider, not on you using their API. Your vendor inherits those duties from OpenAI, Anthropic or Google. You only carry direct AI Act exposure if you deploy AI in a high-risk context defined in the Act, such as hiring, credit scoring or safety-critical systems. Ask your vendor for their compliance documentation.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
