An owner I sat with last month put her last year of AI invoices on the table in date order. £200 a month at the start, £4,000 by the end. She had not consciously decided to grow the bill; the tools had simply earned their way in, one workflow at a time. Her capacity plan for the next quarter was a staffing sheet and a holiday rota. The AI line did not appear anywhere on it. Her forecasts had started to be wrong by 10 to 15 per cent and she could not work out why.
She is not unusual. AI tooling has crept up to become the second-largest variable cost line in many owner-managed services firms, behind staff and ahead of premises and software combined. Mavvrik’s 2026 cost-statistics analysis reports that companies plan to spend an average of 1.7 per cent of revenue on AI in 2026, against 0.8 per cent in 2025, with AI expected to consume 25 to 50 per cent of IT budgets within a few years. At a firm with £1m of annual revenue, a £4,000 monthly AI bill is already 4.8 per cent of revenue, about three times that cross-sector average.
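The arithmetic behind that comparison is short enough to check in a few lines. This is a minimal sketch using the figures quoted above; the function name is illustrative, and it assumes the £4,000 monthly bill holds for a full twelve months.

```python
def ai_share_of_revenue(monthly_ai_bill: float, annual_revenue: float) -> float:
    """Return annualised AI spend as a percentage of annual revenue."""
    return monthly_ai_bill * 12 / annual_revenue * 100


# Figures from the example above: £4,000 a month at a £1m firm.
share = ai_share_of_revenue(4_000, 1_000_000)

# 1.7 per cent is the reported cross-sector average for 2026.
multiple = share / 1.7

print(f"{share:.1f}% of revenue, {multiple:.1f}x the cross-sector average")
```

Running the same two lines against your own trailing twelve months of invoices is the quickest way to see where you sit against that average.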
The capacity question is no longer “how many people”; it is “how many people plus how much AI tooling.” The firms that build the combined model see capacity coming. The firms that do not run on lag.
What is capacity planning when AI is a variable cost?
It is capacity planning that treats AI tooling as a delivery resource with its own throughput, its own constraints, and its own variable cost per unit of work, rather than as overhead software. AI is no longer a fixed subscription you can ignore. It is metered consumption that scales with revenue-generating activity, so its capacity and bottlenecks need to be planned alongside staff hours.
Deloitte’s CFO guide to AI token economics argues that tokens, not users or time, are now the fundamental unit of consumption, and that finance functions need to track AI cost per workflow, per client, and per active user rather than per licence. That reclassification is the underlying move. Once AI sits in cost of goods sold rather than overhead, it has to be planned as capacity.
Why does this matter for your business?
It matters because the bottleneck in your firm has probably shifted, and the staffing-only plan no longer tells you where it is. The clearest tell is forecasts that are wrong by amounts that do not track headcount. Glean’s AI total cost of ownership work reports first-year budget overruns of 30 to 40 per cent when AI is not modelled explicitly, and Everest Group calls the pattern the “token cost illusion”.
For an owner-managed firm the second risk is gross margin erosion that the management accounts do not flag. The SaaS CFO’s analysis shows that embedding AI features into a service without revisiting pricing and cost classification can pull gross margin from 80 per cent to 65 per cent, fifteen points of margin moving silently from overhead into variable COGS. The capacity plan that treats AI as a delivery resource is what makes that movement visible early enough to act on.
Where AI capacity is the same as staff capacity, and where it is not
AI capacity differs from staff in three ways that matter to a capacity plan. It scales almost instantly: when you need more, you add seats, tokens or concurrent runs rather than recruiting. It has no notice period, holiday calendar, sickness, or Monday-morning ramp-up. And it can run overnight and at weekends without overtime, which is useful when the time window is the constraint.
AI capacity is the same as staff in four ways owners commonly miss. It still comes with a fixed monthly tier you have committed to, paid whether the work shows up or not. It is still constrained at the top: OpenAI’s rate-limit documentation defines requests-per-minute, tokens-per-minute and tokens-per-day caps tied to spend tier, and Red Hat’s TokenRateLimitPolicy shows the same pattern in enterprise gateways, with HTTP 429 errors when daily quotas are exhausted. It still has concurrency caps that limit parallel runs. And it still has an upkeep cost (prompt design, workflow maintenance, evaluation and oversight) that does not appear on the licence invoice but does show up in staff time.
What is the four-line capacity model?
The four-line model is a spreadsheet, not a methodology. Line one is staff hours by capability: total hours available per quarter by role and skill, adjusted for holidays and non-billable commitments. Line two is AI tooling capacity by use case: effective throughput translated from vendor rate limits and concurrency caps into work units such as documents drafted per hour or first-pass research tasks per day.
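Line two is the one owners tend not to have seen before, so here is a minimal sketch of the translation from vendor caps to work units. Every number is hypothetical, the class and function names are illustrative, and the calculation assumes work is spread evenly across the day and ignores concurrency caps and retries.

```python
from dataclasses import dataclass


@dataclass
class RateLimits:
    """Vendor caps for one spend tier (all values used below are hypothetical)."""
    requests_per_minute: int
    tokens_per_minute: int
    tokens_per_day: int


def max_tasks_per_day(limits: RateLimits, tokens_per_task: int,
                      requests_per_task: int, working_minutes: int) -> int:
    """Return the most tasks per day that the tightest of the three caps allows."""
    by_requests = limits.requests_per_minute * working_minutes // requests_per_task
    by_minute_tokens = limits.tokens_per_minute * working_minutes // tokens_per_task
    by_daily_tokens = limits.tokens_per_day // tokens_per_task
    return min(by_requests, by_minute_tokens, by_daily_tokens)


# Hypothetical tier and a hypothetical "first-pass research task".
limits = RateLimits(requests_per_minute=500,
                    tokens_per_minute=200_000,
                    tokens_per_day=5_000_000)
capacity = max_tasks_per_day(limits, tokens_per_task=20_000,
                             requests_per_task=4, working_minutes=480)
print(f"Effective AI capacity: {capacity} tasks per day")
```

With these made-up figures the daily token cap is the binding constraint, not the per-minute limits. That is the kind of thing the spreadsheet row should make obvious.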
Line three is variable cost per unit of work for each capacity type. For staff this is fully-loaded hourly cost divided by units, with rework counted. For AI it is either vendor tokens-per-task multiplied by published price, or total monthly AI spend divided by mediated tasks if your use case sits behind a SaaS wrapper. Line four is sensitivity to mix shifts: what total cost and capacity look like at different AI-versus-staff splits of the same workload.
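The three line-three calculations can be sketched as follows. The function names and all figures are illustrative assumptions, and "with rework counted" is interpreted here as including rework hours in the cost while counting only accepted units in the divisor.

```python
def staff_cost_per_unit(hourly_cost: float, hours_including_rework: float,
                        accepted_units: int) -> float:
    """Fully-loaded staff cost per accepted unit, with rework hours included."""
    return hourly_cost * hours_including_rework / accepted_units


def ai_cost_per_unit_metered(tokens_per_task: int,
                             price_per_million_tokens: float) -> float:
    """AI cost per task when you pay the vendor directly per token."""
    return tokens_per_task / 1_000_000 * price_per_million_tokens


def ai_cost_per_unit_wrapped(monthly_spend: float, mediated_tasks: int) -> float:
    """AI cost per task when the model sits behind a SaaS wrapper."""
    return monthly_spend / mediated_tasks


# Hypothetical inputs, for illustration only.
staff = staff_cost_per_unit(hourly_cost=60.0, hours_including_rework=400.0,
                            accepted_units=200)
metered = ai_cost_per_unit_metered(tokens_per_task=20_000,
                                   price_per_million_tokens=8.0)
wrapped = ai_cost_per_unit_wrapped(monthly_spend=4_000.0, mediated_tasks=2_500)

print(f"Staff: £{staff:.2f}  Metered AI: £{metered:.2f}  Wrapped AI: £{wrapped:.2f}")
```

The wrapped figure is usually the more honest of the two AI numbers, because it picks up seat fees and platform margin that a token price alone leaves out.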
A small worked example makes the model concrete. A consultancy expects 1,000 standard analysis tasks in a quarter. In a hybrid mix, AI handles 70 per cent at low marginal cost; analysts handle the remaining 30 per cent and oversee the AI-handled work. The four-line view shows required staff hours, required AI throughput against vendor limits, blended cost per task, and what happens if the mix shifts to 50-50 or 90-10. WorkflowMax’s professional services capacity forecasting paints the same picture for staff; the four-line model adds the AI side alongside rather than treating it as separate.
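A minimal sketch of that sensitivity view follows. Only the 1,000 tasks and the three mixes come from the example; the hours per task, oversight time, hourly cost and AI cost per task are hypothetical placeholders you would replace with your own line-three figures.

```python
def mix_view(total_tasks: int, ai_share: float, manual_hours_per_task: float,
             oversight_hours_per_ai_task: float, staff_hourly_cost: float,
             ai_cost_per_task: float) -> dict:
    """Return staff hours, AI task volume and blended cost for one AI-versus-staff mix."""
    ai_tasks = round(total_tasks * ai_share)
    manual_tasks = total_tasks - ai_tasks
    # Staff time covers fully manual tasks plus oversight of AI-handled ones.
    staff_hours = (manual_tasks * manual_hours_per_task
                   + ai_tasks * oversight_hours_per_ai_task)
    total_cost = staff_hours * staff_hourly_cost + ai_tasks * ai_cost_per_task
    return {
        "ai_tasks": ai_tasks,
        "staff_hours": staff_hours,
        "total_cost": total_cost,
        "blended_cost_per_task": total_cost / total_tasks,
    }


# Hypothetical unit figures; only the task count and mixes come from the example.
for share in (0.5, 0.7, 0.9):
    view = mix_view(total_tasks=1_000, ai_share=share,
                    manual_hours_per_task=2.0,
                    oversight_hours_per_ai_task=0.25,
                    staff_hourly_cost=60.0, ai_cost_per_task=1.60)
    print(f"AI share {share:.0%}: {view['staff_hours']:.0f} staff hours, "
          f"{view['ai_tasks']} AI tasks, "
          f"£{view['blended_cost_per_task']:.2f} blended cost per task")
```

The AI task count in each row is what you compare against vendor limits, and the staff hours are what you compare against line one.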
What are the early warning signs of capacity mismatch?
There are three signs that capacity and demand have fallen out of alignment, all visible in the management pack if you know where to look. The first is AI tooling bills growing faster than revenue over two or three quarters running. Mavvrik’s data finds 80 to 85 per cent of firms miss AI forecasts by more than 25 per cent, so the pattern is common. When the bill outpaces revenue, the variable cost is eroding gross margin rather than expanding capacity.
The second is staff time on AI oversight growing faster than AI tooling cost itself. Charter’s work on AI tool bloat describes the diminishing marginal productivity that comes from juggling multiple overlapping tools and the “toggle tax” of context switching. When the oversight hours rise faster than the licence bill, the firm is paying twice, once for the tool and once for the people behind it, and the substitution AI was meant to deliver is not happening. The hybrid AI economics literature suggests that even well-designed systems still require 10 to 20 per cent of the human resources the fully-manual version would have used. Anything materially above that is a flag.
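The first two signs are simple growth comparisons, so they can be checked mechanically. This is a rough sketch on hypothetical quarterly figures: the function names, the two-quarter window and the manual baseline are assumptions, and the 20 per cent threshold is the top of the band quoted above.

```python
def growth_rates(series):
    """Quarter-on-quarter growth for a list of quarterly totals."""
    return [(later - earlier) / earlier
            for earlier, later in zip(series, series[1:])]


def outpaces(fast, slow, quarters=2):
    """True if `fast` grew faster than `slow` in each of the last `quarters` periods."""
    f = growth_rates(fast)[-quarters:]
    s = growth_rates(slow)[-quarters:]
    return (len(f) == quarters and len(s) == quarters
            and all(a > b for a, b in zip(f, s)))


# Hypothetical figures for three consecutive quarters.
ai_spend = [6_000, 8_000, 12_000]
revenue = [250_000, 260_000, 265_000]
oversight_hours = [40, 60, 100]
manual_baseline_hours = 400  # assumed hours a fully manual version would need

sign_one = outpaces(ai_spend, revenue)          # AI bill growing faster than revenue
sign_two = outpaces(oversight_hours, ai_spend)  # oversight growing faster than the AI bill
oversight_share = oversight_hours[-1] / manual_baseline_hours
above_band = oversight_share > 0.20             # above the 10 to 20 per cent band

print(sign_one, sign_two, f"{oversight_share:.0%}", above_band)
```

Three quarters of data is the minimum for a two-period comparison. With fewer, the function returns False rather than guessing.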
The third sign is recurring small errors that capacity slack would once have caught. Mis-filed documents, slightly off data extractions, tone-deaf client emails, mis-prioritised tasks. When firms run staff capacity hot and lean on AI for throughput, there is less human slack to spot and correct minor issues before they reach clients. The fix is often a narrower set of AI use cases and a better oversight loop rather than more headcount, but the trigger is the same: slack has gone and small errors are accumulating.
Sister posts walk the related ground. The hidden margin tax of AI sizes the oversight time more precisely. Where AI moves margins sorts which use cases are pulling their weight. The growth and profit dashboard owner-operators need pulls these signals into the eight or ten numbers you actually review each month.
If you want to build the four-line capacity model for your own firm and walk through the warning signs against your last two quarters, book a conversation.



