What is open-source vs closed-source AI? The procurement choice in front of you

TL;DR

Closed-source AI keeps model weights inside the vendor's infrastructure and you pay per token to use it via an API. Open-weight AI publishes the trained parameters as downloadable files you can run on your own hardware. The 2026 capability gap is small. The decision turns on five practical factors: data sovereignty, customisation depth, cost predictability, support burden and regulatory positioning. For many growing services firms the answer is a hybrid architecture, not one or the other.

Key takeaways

- Closed-source AI runs the model inside the vendor's infrastructure and bills you per token. Open-weight AI ships the trained parameters as files you can download and host yourself.
- The capability gap is small in 2026. Open-weight models reach roughly 89.6% of closed-source flagship performance at launch and close the gap within weeks.
- Token economics differ by an order of magnitude. Llama 4 Maverick on a serverless host runs around 29 to 41 times cheaper per million tokens than Claude Opus 4.6, and around 10 to 23 times cheaper than GPT-5.2.
- Below £1,500 a month of API spend, closed-source is almost always cheaper once DevOps overhead is included. Above £2,500 a month and predictable, self-hosted infrastructure starts to make financial sense.
- Data residency is the underrated procurement driver. UK firms handling regulated personal data on US-headquartered cloud APIs are in a legal grey zone under GDPR Article 48 and the US CLOUD Act.

A founder I spoke with last week was choosing the model behind a new automation product for his 35-person customer-support firm. Two paths sat on his desk. Anthropic’s Claude API, integrated in a fortnight, billed per token at roughly £2,500 a month at projected volume. Or a self-hosted Llama 4 deployment on a managed European GPU host, infrastructure at around £1,500 monthly plus another £1,000 of internal engineering time, and eight to ten weeks before it was production-ready.

He wasn’t picking a religion. He was picking a deployment mode that fit his customers, his regulatory exposure and his team’s actual capability. The right answer turned out to be both. Closed-source for the prototype. Open-weight by month twelve. The architecture had to be designed for that from day one.

What is open-source vs closed-source AI?

Closed-source AI keeps the model’s parameters inside the vendor’s infrastructure and you reach the model through a paid API. OpenAI’s GPT-5, Anthropic’s Claude Opus 4.6 and Google’s Gemini 3.1 Pro are the dominant closed-source frontier models in 2026. Open-weight AI publishes the trained parameters as downloadable files. Meta’s Llama 4, Mistral Large 3, DeepSeek V3 and Alibaba’s Qwen are the leading open-weight families. You run them on your own GPU hardware or a third-party host.

The terminology is imprecise. The Open Source Initiative draws a line between true open-source AI, which would also publish the training code and the data, and open-weight, which only publishes the trained parameters. Almost every model the press calls “open-source” today is technically open-weight. The distinction matters when you read the licence, because permissions on the weights do not always extend to the training pipeline or to derivative-model training.

The 2026 capability and cost reality

The capability gap has narrowed sharply. MIT Sloan’s 2026 analysis puts open-weight performance at roughly 89.6% of closed-source flagship benchmarks at launch, closing to parity within around thirteen weeks. For coding, customer service, content generation and the bulk of business automation, the open-weight options are competitive on quality and far cheaper. For frontier reasoning benchmarks the closed-source labs still lead by three to eight points, and that gap matters for a small share of workloads.

The cost picture is where the procurement conversation usually lands. OpenAI’s GPT-5.2 is around $1.75 per million input tokens and $14.00 per million output tokens. Claude Opus 4.6 is $5.00 in and $25.00 out. Llama 4 Maverick on a serverless host such as DeepInfra runs around $0.17 in and $0.60 out, roughly 29 to 41 times cheaper than Opus 4.6 and 10 to 23 times cheaper than GPT-5.2 at the API level. Self-hosted infrastructure is largely fixed, while API costs scale linearly with usage. The crossover point for a typical UK services firm sits somewhere between £1,500 and £2,500 of monthly API spend, once you cost in the engineering time to run a GPU.
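The crossover arithmetic is simple enough to sketch. The per-million-token prices below are the figures quoted above; the fixed self-hosting cost and the exchange rate are assumptions for the example, not vendor numbers.

```python
# Illustrative crossover arithmetic for closed-source API pricing vs a
# fixed self-hosted bill. Per-million-token prices are the figures quoted
# above; the fixed monthly self-hosting cost and the exchange rate are
# assumptions for the sketch, not vendor numbers.

GPT52 = (1.75, 14.00)     # ($ per M input tokens, $ per M output tokens)
OPUS46 = (5.00, 25.00)
MAVERICK = (0.17, 0.60)   # open-weight model on a serverless host

SELF_HOST_FIXED_GBP = 2500.0  # assumed: infra (~£1,500) + DevOps time (~£1,000)
USD_PER_GBP = 1.25            # assumed exchange rate

def api_cost_usd(m_in: float, m_out: float, prices: tuple) -> float:
    """Monthly API bill for m_in / m_out million input / output tokens."""
    return m_in * prices[0] + m_out * prices[1]

def crossover_millions(in_price: float, out_price: float,
                       out_ratio: float = 0.25) -> float:
    """Million input tokens per month (with out_ratio output tokens per
    input token) at which the API bill equals the fixed self-hosting cost."""
    fixed_usd = SELF_HOST_FIXED_GBP * USD_PER_GBP
    return fixed_usd / (in_price + out_ratio * out_price)

# The 29-41x spread quoted above is Opus 4.6 vs Maverick:
print(round(OPUS46[0] / MAVERICK[0], 1))  # input-side ratio: 29.4
print(round(OPUS46[1] / MAVERICK[1], 1))  # output-side ratio: 41.7
# Under these assumptions, GPT-5.2 spend meets the fixed cost around:
print(round(crossover_millions(*GPT52)), "M input tokens/month")
```

The output ratio (`out_ratio`) is the lever worth testing against your own traffic: chat-heavy workloads generate far more output tokens per input token than classification or extraction, which pushes the crossover point lower.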

Where does data sovereignty change the procurement calculus?

Data residency is the underrated procurement driver. GDPR Article 48 provides that a third-country authority's demand for personal data is enforceable only if it rests on an international agreement such as a mutual legal assistance treaty, and unlawful transfers carry fines of up to 4% of global turnover. The US CLOUD Act lets US law enforcement compel American companies to hand over data stored abroad regardless of location. Together these create a sovereignty gap that closed-source US-headquartered providers cannot fully close with contract clauses.

In practice that means AWS Bedrock, Azure OpenAI and Google Vertex AI in EU regions are operating in a legal grey zone for regulated personal data. Self-hosting Llama 4 or Mistral on UK or EU infrastructure resolves the question at the infrastructure layer, not in a contract footnote. The UK is accelerating sovereign-AI plans precisely because of that dependence, and government and regulated-sector RFPs increasingly carry “strategic autonomy” language. For a firm tendering into healthcare, financial services or government work, an EU-resident open-weight deployment can shift from “nice to have” to a procurement requirement.

When to default to closed, when to self-host, when to go hybrid

Default to closed-source when you are prototyping, your team has no GPU capability, your monthly AI spend is under £1,500 and your data is not regulated. The vendor maintains guardrails, abuse monitoring and incident response, and you trade a higher per-token price for speed to market. Lloyds Banking Group’s 2026 survey found the typical UK SME spends under £25,000 a year on AI, consistent with closed-source APIs being the right default for many firms.

Default to open-weight self-hosting when data residency is mandatory under GDPR or sector rules, your monthly API spend is above £2,500 and predictable, your competitive edge depends on fine-tuning a model on your own data, or your customer base includes government and “strategic autonomy” procurement criteria. The UK AI Security Institute’s analysis is honest about the trade-off. Open-weight safeguards can be removed by adversarial fine-tuning, so when you self-host you own runtime monitoring, content filtering and incident response in a way the API providers do not require of you.

Many growing services firms end up hybrid. Closed-source for frontier reasoning and prototyping, open-weight for cost-sensitive volume work and compliance-sensitive deployments. The architecture pattern that holds this together is a model abstraction layer such as LiteLLM or Ollama. With one in place, switching a workload between vendors is a configuration change rather than a rewrite. Re-evaluate the architecture quarterly. The worst decision is locking the business into one vendor’s roadmap and discovering eighteen months in that switching has become ruinous.
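The abstraction-layer pattern itself fits in a few lines. The sketch below is not LiteLLM's actual API; it is a minimal illustration of the idea, with hypothetical provider names, model IDs and stub backends. The point is structural: workloads are named in config, so moving one between a closed-source API and a self-hosted endpoint is a config edit, not a rewrite.

```python
# Minimal sketch of a model abstraction layer: workloads map to backends
# in config, so switching a workload is a config edit, not a code change.
# Provider names, model IDs and the stub backends are illustrative only --
# this shows the pattern, not LiteLLM's real API.

ROUTES = {
    # workload              (provider,      model id -- hypothetical)
    "frontier-reasoning": ("anthropic",   "claude-opus"),
    "bulk-summarisation": ("self-hosted", "llama-4-maverick"),
    "customer-replies":   ("self-hosted", "llama-4-maverick"),
}

def complete(workload: str, prompt: str) -> str:
    """Route a prompt to whichever backend the config names for this workload."""
    provider, model = ROUTES[workload]
    if provider == "self-hosted":
        return call_local(model, prompt)          # e.g. an OpenAI-compatible endpoint
    return call_vendor(provider, model, prompt)   # e.g. the vendor's paid API

# Stub backends so the sketch runs; real code would make HTTP calls here.
def call_local(model: str, prompt: str) -> str:
    return f"[{model} @ self-hosted] {prompt}"

def call_vendor(provider: str, model: str, prompt: str) -> str:
    return f"[{model} @ {provider}] {prompt}"

# Moving "customer-replies" back to the vendor API is one config line:
ROUTES["customer-replies"] = ("anthropic", "claude-opus")
print(complete("customer-replies", "Draft a refund reply"))
```

In production the same idea usually lives in a YAML or environment-driven config read by the router at startup, which is what makes the quarterly re-evaluation cheap: the comparison is a redeploy, not a migration project.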

Related concepts

Foundation model is the broader category that both closed-source and open-weight systems sit inside. Every LLM is a foundation model; Llama, Claude, GPT-5 and Gemini all qualify. Whether the weights ship as a downloadable file or stay inside the vendor’s infrastructure is the open-vs-closed question.

Fine-tuning is the customisation lever that open-weight models unlock most fully. Closed-source providers offer API-level fine-tuning that adjusts behaviour without touching the underlying weights. Open-weight self-hosting allows weight-level fine-tuning on your own data, which is the stronger option when domain-specific performance is part of your competitive edge.

SaaS AI vs self-hosted AI is the deployment-mode decision guide that sits next to this post. The open-vs-closed question is upstream. The deployment-mode question is what to do about it once you have decided which way to go.

The EU AI Act sits in the background of any sovereign-AI conversation, particularly for general-purpose AI providers and high-risk deployments. The 12 questions to ask an AI vendor is the procurement checklist that closes the loop on whichever model you end up running, and the one-page AI risk register is the governance step that follows once the deployment-mode decision is made. If the next step is mapping which workloads belong on which model, book a conversation.

Sources

- MindStudio (2026). Open Source AI vs Closed Source: business-model implications. Procurement-frame primer for SMEs choosing between deployment modes. https://www.mindstudio.ai/blog/open-source-ai-vs-closed-source-business-model/
- Meta (2025). Llama 4 multimodal intelligence announcement. The leading open-weight frontier-class family in 2026. https://ai.meta.com/blog/llama-4-multimodal-intelligence/
- MIT Sloan (2026). AI open models have benefits, so why aren't they more widely used. The 89.6% capability gap and 87% cost-saving figure. https://mitsloan.mit.edu/ideas-made-to-matter/ai-open-models-have-benefits-so-why-arent-they-more-widely-used
- Open Source Initiative (2024). Open Weights Definition. The formal distinction between open-source and open-weight, used in procurement and licensing. https://opensource.org/ai/open-weights
- UK AI Security Institute (2025). Managing risks from increasingly capable open-weight AI systems. Regulator-side view on the safety and runtime-monitoring trade-off. https://www.aisi.gov.uk/blog/managing-risks-from-increasingly-capable-open-weight-ai-systems
- UK Information Commissioner's Office (2024). Guidance on AI and Data Protection. The GDPR anchor for data residency procurement decisions. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/
- Linux Foundation (2025). The open-source legacy and AI's licensing challenge. Background on OpenMDW and the standardisation gap in AI model licences. https://www.linuxfoundation.org/blog/the-open-source-legacy-and-ais-licensing-challenge
- Lloyds Banking Group (2026). Impact of AI adoption on UK business. SME baseline data: 87% report productivity gains, the typical firm spends below £25,000 annually on AI. https://www.lloydsbankinggroup.com/media/press-releases/2026/lloyds/impact-of-ai-adoption-on-business.html
- ITPro (2025). UK firms accelerate sovereign AI plans amid concerns over dependence on overseas tech. Procurement context for sovereign-AI language in RFPs. https://www.itpro.com/technology/artificial-intelligence/uk-firms-accelerate-sovereign-ai-plans-amid-concerns-over-dependence-on-overseas-tech

Frequently asked questions

Is open-source AI safe to use commercially?

It can be, but the licence is the thing to read. Llama 4 ships under the Llama Community License, not Apache 2.0 or MIT, and earlier Qwen versions capped commercial deployment at 100 million users. Some licences include URLs to external documents that can be amended without notification. Read every model's licence before procurement sign-off, especially if you plan commercial deployment or want to fine-tune the model on your own data.

What does it actually cost to self-host an open-weight model?

Entry-level infrastructure starts around £1,500 a month for a single A10G GPU on a managed European host, plus internal DevOps time costed at roughly £1,000 a month. Below that combined £2,500 monthly figure, closed-source APIs are almost always cheaper once you include the operational burden. Above £2,500 monthly and predictable, the cost crossover starts to favour self-hosting, and the gap widens as volume grows.

Should we just pick one and stick with it?

For many growing services firms the honest answer is hybrid. Use closed-source for prototyping and frontier reasoning, where vendor-managed safety and speed to market matter. Use open-weight self-hosting for high-volume, cost-sensitive work and for compliance-sensitive deployments where data residency is a regulatory requirement. Build the integration layer with a model abstraction (LiteLLM, Ollama or similar) so switching between models is a configuration change, not a rewrite.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
