A 30-staff management consulting firm sat down last month to pick an AI plan for the team. Two options on the table. Option one was ChatGPT Team at £24 per user per month, all 30 seats, total £720 a month. Option two was a small custom integration on the OpenAI API at usage rates, no per-seat fee, projected £25 to £40 a month based on the firm’s actual estimated workload.
The senior partners read both numbers and got suspicious. How can option two cost so little against an option marketed at the same firm size? The answer is unromantic. Only 13 of the 30 staff would actively use AI in the first six months. The £720 figure was paying for 17 seats that would sit idle, dressed up as adoption insurance. The real choice was not “which is cheaper” but “is the simplicity of per-seat worth roughly £8,000 a year (the £8,640 annual seat bill less £300 to £480 of API spend), or do we have the engineering depth to manage an API?”
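The arithmetic behind that suspicion is short enough to write down. The sketch below uses only the figures from the scenario above; the variable names are illustrative, not part of any vendor tooling.

```python
# Figures taken from the consulting-firm scenario; names are illustrative.
SEAT_PRICE_GBP = 24          # ChatGPT Team, per user per month
LICENSED_SEATS = 30          # full headcount
ACTIVE_USERS = 13            # staff expected to use AI in the first six months
API_MONTHLY_GBP = (25, 40)   # projected API spend, low and high estimate

# What the per-seat plan bills each month, and how much of it is idle.
seat_bill = SEAT_PRICE_GBP * LICENSED_SEATS
idle_spend = SEAT_PRICE_GBP * (LICENSED_SEATS - ACTIVE_USERS)

# The price per person who actually uses the tool.
cost_per_active_user = seat_bill / ACTIVE_USERS

# Annual premium for per-seat simplicity, against the high and low API estimates.
annual_premium_low = (seat_bill - API_MONTHLY_GBP[1]) * 12
annual_premium_high = (seat_bill - API_MONTHLY_GBP[0]) * 12

print(f"Monthly seat bill:        £{seat_bill}")
print(f"Spent on idle seats:      £{idle_spend}")
print(f"Cost per active user:     £{cost_per_active_user:.2f}")
print(f"Annual per-seat premium:  £{annual_premium_low} to £{annual_premium_high}")
```

The cost per active user is the number to watch: the headline price is £24, but the firm would be paying more than double that for each person who actually opens the tool.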
What is per-seat vs usage-based pricing?
Per-seat AI pricing charges a fixed monthly fee for every licensed user, regardless of activity. Microsoft 365 Copilot at £19 to £30 per user, ChatGPT Team at $30 (about £24), and Claude Team in the £18 to £25 range are the canonical examples. Usage-based pricing charges only for what you consume, billed in tokens or requests, with no fee for inactive users.
The shapes have different incentives. Per-seat is predictable, simple, and aligned with how SME finance teams already buy software. The bill is the same in a quiet month as a busy one, which makes annual budgeting straightforward and protects the cash plan. Usage-based bills track activity directly, so a quiet month produces a small bill and a busy month produces a larger one, with no padding either way.
The trade-off is structural. Per-seat trades cost-efficiency for predictability and ease of management. Usage-based trades predictability for cost-efficiency, and it needs someone on the team who can run a metered API. Neither is universally right, and a vendor’s salesperson will usually push the one their firm makes the strongest margin on rather than the one that fits your usage shape.
Why does it matter for your business?
The pricing model you pick determines three commercial outcomes: how predictable your bill is, how well your costs track your actual value from AI, and what behaviour the model encourages from staff. Picking the wrong shape for your usage means either paying for capacity that sits idle or facing a bill that varies more than your finance team can comfortably plan around.
Per-seat creates a sunk-cost effect. Once the seat is paid for, the marginal cost of running another prompt feels like zero, so adoption metrics rise but value-per-prompt rarely gets examined. That is a feature when your concern is getting the team to use AI at all and a bug when your concern is whether the AI is doing useful work or just making people feel productive.
Usage-based creates the opposite incentive. Every prompt has a measurable cost, which encourages disciplined use and rewards prompt engineering. For agentic workflows or batch processing where hundreds of API calls fire autonomously, usage-based pricing surfaces inefficiency immediately, and the engineering team has a feedback loop on whether the agent is producing more value than the tokens it consumes. Per-seat hides that signal entirely.
Where will you actually meet it?
You will meet per-seat on the marketing pages of vendors selling to SMEs and large enterprises, and usage-based on the developer pricing pages of the same vendors. Microsoft 365 Copilot, ChatGPT Team, ChatGPT Enterprise, Claude Team, Cursor Pro and Perplexity Pro Teams all sell per-seat, with prices stable on a published table. The OpenAI API, Anthropic API, AWS Bedrock and Google Gemini API all sell usage-based, quoted per million tokens.
The 2026 numbers are worth carrying. Per-seat, per user per month: Copilot £19 to £30, ChatGPT Team about £24, ChatGPT Enterprise about £48, Claude Team £18 to £25, Cursor Pro about £16, Perplexity Pro Teams about £16. Usage-based on the API side, per million tokens: GPT-4o at $5 input and $15 output, Claude Sonnet 4.6 at $3 and $15, Claude Opus at $15 and $75, Gemini 2.5 Pro at $1.25 and $10 below 200,000 tokens of context.
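To turn the API rate card into a monthly figure, multiply millions of tokens by the per-million rate for input and output separately. The sketch below does that with the rates quoted above. The workload (10 million input and 2 million output tokens) is a made-up example, and the exchange rate is an assumption borrowed from this article's own "$30, about £24" conversion.

```python
# Rates quoted in the article, in USD per million tokens: (input, output).
RATES_USD_PER_MILLION = {
    "GPT-4o": (5.00, 15.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Opus": (15.00, 75.00),
    "Gemini 2.5 Pro": (1.25, 10.00),  # below 200,000 tokens of context
}

# Assumption: implied by the article's "$30 (about £24)" conversion.
USD_PER_GBP = 1.25


def monthly_api_cost_usd(model: str, input_millions: float, output_millions: float) -> float:
    """Return the monthly bill in USD for a given token volume on one model."""
    input_rate, output_rate = RATES_USD_PER_MILLION[model]
    return input_millions * input_rate + output_millions * output_rate


# Hypothetical workload, chosen only to illustrate the calculation.
INPUT_MILLIONS = 10
OUTPUT_MILLIONS = 2

for model in RATES_USD_PER_MILLION:
    usd = monthly_api_cost_usd(model, INPUT_MILLIONS, OUTPUT_MILLIONS)
    print(f"{model}: ${usd:.2f} (about £{usd / USD_PER_GBP:.2f})")
```

Swapping in your own estimated volumes is the whole exercise; the function has no vendor dependency and is only as accurate as the rate table fed into it.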
Two consumption shapes tell you what to expect on the bill. A 30-person consulting firm with 13 light users using GPT-4o for emails and research will rarely pay more than £40 a month on the API, against £720 a month on ChatGPT Team for the full headcount. A 10-person dev shop where three engineers each consume about 5 million tokens of Claude Sonnet a month, at roughly five input tokens for every output token, will pay around £60, against £240 for ten ChatGPT Team seats. The shape of usage moves the numbers more than the per-token rate does.
When does each model fit?
Per-seat fits firms with broadly distributed light usage, low technical capacity, and a finance culture that values certainty over optimisation. A 15-person legal practice where every fee earner uses AI for legal research and contract drafting inside Microsoft 365 is a clean per-seat case. There is no API to manage and no engineer on call; the simplicity is paying for itself.
Usage-based fits firms with concentrated power-user demand, agentic or batch workloads, technical capacity to manage APIs, and variable consumption that defies seat forecasting. A 10-person dev shop building AI features into its own product, where three engineers drive 80% of consumption, is a clean usage-based case. So is any team running automation that calls the API thousands of times a day; per-seat plans simply throttle that workload before it produces value.
The decision rule is mechanical once you have honest numbers. If projected active users are under 70% of licensed headcount and you have engineering capacity, default to usage-based. If 90% or more of staff are active and adoption matters more than cost, default to per-seat. If usage is concentrated among two or three power users, default to usage-based. If you are running batch processing or agentic workflows at any scale, usage-based is essentially mandatory because per-seat throttling kills the workload.
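The rule above can be written as a small function. This is a sketch, not a procurement tool: the function name, parameter names and the order in which the checks run are assumptions, and the article gives no default for firms between 70% and 90% active or for firms without engineering capacity, so the sketch returns "no clear default" rather than guessing.

```python
def recommend_pricing_model(
    active_share: float,
    has_engineering_capacity: bool,
    adoption_over_cost: bool = False,
    concentrated_power_users: bool = False,
    batch_or_agentic: bool = False,
) -> str:
    """Apply the article's decision rule.

    active_share is projected active users divided by licensed headcount,
    as a fraction between 0 and 1.
    """
    # Batch or agentic workloads: per-seat throttling rules per-seat out.
    if batch_or_agentic:
        return "usage-based"
    # Usage concentrated among two or three power users.
    if concentrated_power_users:
        return "usage-based"
    # Under 70% active, with someone able to run a metered API.
    if active_share < 0.70 and has_engineering_capacity:
        return "usage-based"
    # 90% or more active, and adoption matters more than cost.
    if active_share >= 0.90 and adoption_over_cost:
        return "per-seat"
    # Assumption: the rule does not cover this case, so flag it for judgement.
    return "no clear default"


# The consulting firm from the opening: 13 of 30 active, with engineering depth.
print(recommend_pricing_model(13 / 30, has_engineering_capacity=True))
# A fully adopted legal practice with no engineer on call.
print(recommend_pricing_model(1.0, has_engineering_capacity=False, adoption_over_cost=True))
```

Putting the workload checks first reflects the article's claim that batch and agentic use makes usage-based "essentially mandatory"; a firm that weighs the criteria differently would reorder them.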
Related concepts you will meet next
A token is the unit usage-based plans bill in: roughly four characters of English text. Output tokens cost more than input tokens at every major vendor, from three times as much for GPT-4o to eight times for Gemini 2.5 Pro on the rates quoted above. If you are evaluating a usage-based plan, the input-to-output ratio of your workload is the number to design around, not the headline per-token rate.
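A rough sketch shows why the ratio matters more than the headline rate. It uses the four-characters-per-token rule of thumb from the paragraph above (real tokenizers vary by model and language, so treat the estimate as approximate) and the Claude Sonnet rates quoted earlier. The function names and the output shares are illustrative assumptions.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb for English text; real tokenizers differ


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a piece of English text."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def blended_rate(input_rate: float, output_rate: float, output_share: float) -> float:
    """Average cost per million tokens when output_share of all tokens are output."""
    return (1 - output_share) * input_rate + output_share * output_rate


# Claude Sonnet rates quoted earlier: $3 input, $15 output per million tokens.
SONNET_INPUT, SONNET_OUTPUT = 3.0, 15.0

print(estimate_tokens("x" * 2000), "tokens in a 2,000-character email")

# The same model costs very different amounts depending on the workload mix.
for share in (0.1, 0.3, 0.5):
    rate = blended_rate(SONNET_INPUT, SONNET_OUTPUT, share)
    print(f"{share:.0%} output: ${rate:.2f} per million tokens")
```

A summarisation workload that reads a lot and writes a little sits near the input rate, while a drafting workload that writes as much as it reads costs more than twice as much per token on the same model.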
Hybrid pricing is what the leading per-seat plans actually are once you read the fair-use clause. ChatGPT Team caps advanced reasoning messages per user per month, Claude Team uses unspecified fair-use language, and Microsoft 365 Copilot rate-limits high consumers without publishing the threshold. The plan you bought as “unlimited per-seat” behaves as per-seat plus consumption-based overage in practice, and your power users will find the ceiling before your finance team does.
Vendor lock-in is sharper on usage-based plans than on per-seat plans. Per-seat is a contract you can cancel monthly with little switching cost. A usage-based integration runs on vendor-specific API shapes, prompt formats and authentication, and migrating six months of prompt engineering from one vendor to another is non-trivial. The annual saving of usage-based pricing is real, but so is the build cost of switching when the vendor changes its rate card.
The procurement question to take into any vendor conversation is one line. “Show me what 70% adoption of our headcount looks like at your per-seat rate, and what our actual estimated token volume looks like on your API rate, then tell me which shape your usage data says fits us best.” If they cannot answer, they do not know their own unit economics for your workload, which is itself a procurement signal.
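It helps to run both halves of that question yourself before the meeting. The sketch below computes what a seat really costs per active user at a given adoption rate, and how many tokens a month the firm would need to burn before the API bill matched the seat bill. The function names, the $7.50 blended rate (GPT-4o with a quarter of tokens as output) and the exchange rate implied by "$30, about £24" are all assumptions for illustration.

```python
# Assumption: implied by the article's "$30 (about £24)" conversion.
USD_PER_GBP = 1.25


def seat_cost_per_active_user(seat_price_gbp: float, headcount: int, adoption: float) -> float:
    """Monthly per-seat bill divided by the number of staff actually using it."""
    active_users = round(headcount * adoption)
    return seat_price_gbp * headcount / active_users


def breakeven_tokens_millions(seat_bill_gbp: float, blended_usd_per_million: float) -> float:
    """Monthly token volume (in millions) at which API spend equals the seat bill."""
    return seat_bill_gbp * USD_PER_GBP / blended_usd_per_million


# 30 staff on ChatGPT Team at £24, with 70% of headcount active.
per_active = seat_cost_per_active_user(24, 30, 0.70)
print(f"Per active user at 70% adoption: £{per_active:.2f}")

# Hypothetical blended rate: GPT-4o at $5 in and $15 out, 25% of tokens as output.
breakeven = breakeven_tokens_millions(24 * 30, 7.50)
print(f"API matches the seat bill at about {breakeven:.0f} million tokens a month")
```

If your estimated volume is a small fraction of the breakeven figure, the per-seat quote is mostly buying simplicity; if it is close to or above it, per-seat is the cheaper shape.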



