The billable-hour squeeze: agency pricing when AI halves production time

TL;DR

When AI compresses agency production time, the billing model is the first casualty if nothing changes. Proposal writing, first drafts, and production passes are the workflows that take the hit first, and time-based invoicing exposes the gain immediately once clients notice. Agencies that reset pricing before clients do the maths move from defending timesheets to selling outcomes. The window to do that closes faster than many founders expect.

Key takeaways

- AI compresses the exact workflows that fill agency timesheets, including proposal writing, first drafts, and production passes on document-heavy deliverables.
- Agencies running a pyramid staffing model face a specific exposure: the junior-billable layer that AI compresses fastest is also where blended margin is made.
- The practical threshold for repricing is before a client with a spreadsheet notices that hours are dropping on recurring work.
- Presenting the pricing reset alongside the AI adoption story reduces team resistance; separating them turns efficiency into a perceived threat to day rates.
- Outcome-based fees, retainer models, and project fees each let agencies capture the efficiency gain rather than passing it to clients by default.

The ops lead at a 40-person creative agency proved on a Tuesday afternoon that AI could cut their proposal drafting time from six hours to two and a half. The numbers were clear. They took them to the founder expecting to talk about rolling it out to the business development team. Instead, the founder looked at the data and raised a question no one had prepared for. If proposals now take two and a half hours, what does that do to the invoice?

That question is worth preparing for before a client gets there first.

What is the billable-hour squeeze?

The billable-hour squeeze is what happens when AI compresses production time but the pricing model stays put. An agency billing by the hour for work that took six hours can either charge for 2.5 hours and take the revenue hit, or charge for six hours and wait for the client to notice the timesheet. The gap between those two numbers, left unresolved, is the squeeze.
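
To put a number on that gap, here is a minimal sketch in Python. The six-hour and 2.5-hour figures come from the example above; the hourly rate of 100 is a hypothetical figure chosen purely for illustration.

```python
def squeeze_gap(hourly_rate: float, hours_before: float, hours_after: float) -> float:
    """Revenue difference per deliverable between billing the old hours and the new hours."""
    return hourly_rate * (hours_before - hours_after)


# Hypothetical rate of 100 per hour (an assumption, not a benchmark).
gap = squeeze_gap(hourly_rate=100.0, hours_before=6.0, hours_after=2.5)
print(f"Unresolved gap per deliverable: {gap:.2f}")
```

At that assumed rate, each deliverable carries a gap of 350 that is either given away as a discount or left sitting on the invoice for the client to find.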

The squeeze does not announce itself. Production timesheets start looking thin. A client mentions during a renewal call that the project hours were lower than expected. By the time the gap is obvious to everyone, the agency has already absorbed the efficiency gain as an unplanned discount.

The risk runs in both directions. Cutting prices before understanding what clients actually value gives away margin on services where the client may have no idea AI is involved. Holding prices for too long means clients start asking questions that are much harder to answer once the compression is visible on the invoice.

The core commercial question is what the agency charges for when AI handles the production work.

Why does it land harder on agencies than other professional services?

Agencies sell time-based output more directly than almost any other professional service. A solicitor’s advice or an accountant’s judgment justifies time-based billing because the expertise takes years to build. A first draft, a research pass, or an RFP response is different. That work is measurable in hours, comparable across firms, and exactly the kind of work AI handles most reliably. The billable unit itself is the thing under pressure.

Harvard Business Review reported in 2025 that AI is reshaping the structure of consulting firms, automating tasks traditionally handled by junior staff including research, modelling, and analysis. HBR describes the result as a leaner “obelisk” model with fewer layers and smaller teams. Agencies face the same shift. The junior-billable layer, the hours billed at lower rates for production work, is where AI delivers the fastest compression.

That matters commercially because the typical mid-size agency runs on a pyramid staffing model. Junior staff handle the high-volume production work, senior staff handle the high-value thinking, and the blended billing rate across both is where margin is made. If AI compresses the junior layer without any adjustment to how the agency prices, the margin that depended on that layer erodes with every engagement.
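
A rough sketch of how that erosion works, in Python. Every figure here is hypothetical: the hours, bill rates, and cost rates are invented for illustration, and the model assumes cost scales with billed hours, which is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    billed_hours: float
    bill_rate: float  # what the client pays per hour (hypothetical)
    cost_rate: float  # what that hour costs the agency (hypothetical)


def total_margin(layers: list[Layer]) -> float:
    """Sum of (bill rate - cost rate) x billed hours across all layers."""
    return sum(l.billed_hours * (l.bill_rate - l.cost_rate) for l in layers)


# Before AI: the junior layer bills 1,000 hours.
before = [Layer("junior", 1000, 80, 40), Layer("senior", 300, 180, 110)]
# After AI: junior billable hours halve; rates and senior hours are unchanged.
after = [Layer("junior", 500, 80, 40), Layer("senior", 300, 180, 110)]

margin_before = total_margin(before)
margin_after = total_margin(after)
print(f"Margin before: {margin_before}, after: {margin_after}")
```

With these assumed numbers the junior layer supplies about two-thirds of the margin, so halving its billed hours removes roughly a third of the total even though nothing changed at the senior level.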

Where will you actually see the compression in your agency’s work?

The compression shows up most clearly in proposal and RFP writing, first-draft content, and production passes on document-heavy deliverables. A Cobl.ai case study of a software development firm found proposal response time falling from six hours to 2.5 hours per proposal, a reduction of around 58 per cent. For an agency writing 15 proposals a month, that is just over 50 hours a month reclaimed before any client-facing change has been made.
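
The arithmetic behind those figures, as a short Python check. All inputs are the numbers quoted above; nothing is assumed beyond them.

```python
hours_before = 6.0        # hours per proposal before AI
hours_after = 2.5         # hours per proposal with AI
proposals_per_month = 15

saved_per_proposal = hours_before - hours_after
reduction_pct = saved_per_proposal / hours_before * 100
monthly_hours_reclaimed = saved_per_proposal * proposals_per_month

print(f"Saved per proposal: {saved_per_proposal} h")
print(f"Reduction: {reduction_pct:.1f} per cent")
print(f"Reclaimed per month: {monthly_hours_reclaimed} h")
```

The saving is 3.5 hours per proposal, a reduction of 58.3 per cent, and the monthly figure works out to 52.5 hours.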

The same pattern appears in brief writing, market research summaries, and copy iterations. The British Chambers of Commerce reported in early 2026 that half of UK owner-managed businesses are now using AI in some form. Agencies are among the early adopters on production workflows, which is where the billing exposure opens up first.

The workflow question for a delegate is which of these their agency has already started using AI on, and whether any of that work is currently billed to clients. If the answer is yes, the next question is whether the billing rates reflect what is actually happening. In many cases, they do not, because the AI rollout happened before anyone reviewed the commercial model against it.

McKinsey’s 2025 Global Survey found that roughly a third of professional services firms are regularly using AI in at least one business function. Knowledge-work firms, including management consultancies and agencies, are among the highest adopters. The agencies ahead on the repricing question are the ones using that adoption window to set the commercial terms of the shift, rather than waiting for a client to notice first.

When should you change your pricing model, and when is it too early?

The right threshold is when the hours on a recurring deliverable have dropped far enough that a client reviewing the invoice would notice. At that point, the pricing model needs to change before the next renewal. Once a client sees lower timesheets three months running, they will ask why the rate has not moved, and answering reactively is considerably harder than resetting the model in advance.
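
One way to spot that threshold early is to scan timesheets for recurring deliverables whose hours have stayed well below their old baseline. The sketch below is illustrative only: the 20 per cent drop threshold, the two-month run, and the sample deliverables are all assumptions, set so the flag fires before a client has seen three thin months in a row.

```python
def needs_repricing(
    baseline_hours: float,
    monthly_hours: list[float],
    drop_threshold: float = 0.20,   # assumed: a 20% drop counts as noticeable
    consecutive_months: int = 2,    # assumed: flag before a three-month run
) -> bool:
    """True if the most recent months all sit at least drop_threshold below baseline."""
    if len(monthly_hours) < consecutive_months:
        return False
    limit = baseline_hours * (1 - drop_threshold)
    return all(h <= limit for h in monthly_hours[-consecutive_months:])


# Hypothetical timesheet data: (baseline hours, hours logged in recent months).
timesheets = {
    "monthly report": (8.0, [8.0, 7.5, 5.0, 4.0, 3.0]),
    "social calendar": (10.0, [10.0, 9.5, 9.0, 9.5, 9.0]),
}

flagged = [
    name for name, (baseline, hours) in timesheets.items()
    if needs_repricing(baseline, hours)
]
print(flagged)
```

Here only the monthly report is flagged, because its hours have fallen from eight to three while the social calendar has barely moved.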

The internal pressure makes this harder to time well. A team that suspects AI will be used to justify a lower day rate will resist adopting it. The pricing reset and the AI adoption story need to be presented together. If the agency frames AI as a way to reduce hours without explaining how the commercial model compensates, the team hears “efficiency gains” as a disguised pay cut.

A delegate’s job is to get ahead of that narrative before the founder frames it less carefully. That means having a clear answer to two questions. What does the agency charge for instead of hours, and how does that change protect both the margin and the team’s sense of the value they are delivering?

Gallup’s 2026 research found that only about one in ten employees in AI-adopting organisations strongly agreed that AI had changed how work gets done. For many agencies, this means the internal adoption window is still open. The time to reset the commercial model is while the team is still proving the tools work, not after clients have started asking about the timesheets.

What pricing models are agencies actually moving to?

Three approaches are gaining ground in agencies that have moved past hourly billing. Outcome-based fees charge for the delivered result rather than the time taken. Retainer models bundle access to expertise and capacity at a fixed monthly fee. Project fees quote a flat rate per deliverable, with AI efficiency absorbed on the agency’s side. None is universally better; the choice depends on where the agency’s differentiation actually sits.

Outcome-based fees work best when the agency can define and measure the result, whether that is a specific number of leads, a campaign conversion rate, or a content output. This model shifts more risk toward the agency but allows it to price for the value delivered rather than the hours spent. For agencies where competitive advantage lies in the quality of the thinking, this tends to be the highest-margin option once the pricing is calibrated correctly.

Retainer models tie the fee to access and availability rather than production output. A client pays a fixed monthly amount for a defined level of expertise and capacity. This is resilient to AI efficiency gains because the fee is not tied to hours worked. The risk is scope creep when clients assume AI-enabled capacity means unlimited requests without additional cost.

Project fees quote a flat rate per deliverable. If a project that used to take 40 hours now takes 25, the quoted fee stays the same and the margin improves. The client sees a predictable price; the agency retains the efficiency gain rather than exposing it on a timesheet.
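
A minimal Python sketch of that margin effect. The 40-hour and 25-hour figures are from the paragraph above; the flat fee of 4,000, the internal cost of 60 per hour, and the equivalent hourly rate of 100 are hypothetical numbers used only to make the comparison concrete.

```python
def project_margin(fee: float, hours: float, cost_per_hour: float) -> float:
    """Margin on a flat-fee project: the fee minus the cost of the hours worked."""
    return fee - hours * cost_per_hour


FEE = 4000.0           # hypothetical flat fee
COST_PER_HOUR = 60.0   # hypothetical internal cost rate
HOURLY_RATE = 100.0    # hypothetical hourly rate (the fee spread over 40 hours)

flat_before = project_margin(FEE, hours=40, cost_per_hour=COST_PER_HOUR)
flat_after = project_margin(FEE, hours=25, cost_per_hour=COST_PER_HOUR)

# Same 25 hours billed by the hour instead: revenue falls with the timesheet.
hourly_after = project_margin(HOURLY_RATE * 25, hours=25, cost_per_hour=COST_PER_HOUR)

print(f"Flat fee, 40 h: {flat_before}")
print(f"Flat fee, 25 h: {flat_after}")
print(f"Hourly, 25 h:   {hourly_after}")
```

Under these assumptions the flat fee lifts margin from 1,600 to 2,500 as hours fall, while the same 25 hours billed hourly would leave a margin of 1,000.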

The OECD’s 2025 research on AI adoption in owner-managed businesses confirms that productivity gains from AI in professional services are real and measurable. Which model the agency chooses matters less than the act of choosing deliberately, before clients make that decision for them.

If a delegate has already proved the efficiency gain is real, the commercial model is the next thing that needs to change. The agency that prices deliberately, before clients notice, is the one that gets to define what AI-enabled work is worth.

Sources

- McKinsey (2025). The State of AI: Global Survey. Reports roughly a third of professional services firms actively using AI in at least one business function; two-thirds still in pilot phase rather than scaling. https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai
- Federal Reserve (2026). Monitoring AI Adoption in the U.S. Economy. Professional services sector adoption at approximately 33 per cent; among the highest-adoption sectors alongside financial services. https://www.federalreserve.gov/econres/notes/feds-notes/monitoring-ai-adoption-in-the-u-s-economy-20260403.html
- OECD (2025). AI Adoption by Small and Medium-Sized Enterprises. Productivity gains from AI in professional services are measurable; adoption barriers at small-firm scale; gap between adoption and scaling widening between larger and smaller firms. https://www.oecd.org/content/dam/oecd/en/publications/reports/2025/12/ai-adoption-by-small-and-medium-sized-enterprises_9c48eae6/426399c1-en.pdf
- Harvard Business Review (2025). AI Is Changing the Structure of Consulting Firms. AI automating junior consultant tasks including research and analysis; the obelisk model; fewer layers in knowledge-work firms. https://hbr.org/2025/09/ai-is-changing-the-structure-of-consulting-firms
- British Chambers of Commerce (2026). Half of SMEs using AI with limited headcount impact so far. Half of UK owner-managed businesses now using AI in some form; productivity gains not yet fully realised at small-firm scale. https://www.britishchambers.org.uk/news/2026/03/half-of-smes-using-ai-with-limited-headcount-impact-so-far/
- Gallup (2026). Rising Adoption Spurs Workforce Changes. Only one in ten employees strongly agrees AI has changed how work gets done; change management as the primary scaling constraint in AI-adopting organisations. https://www.gallup.com/workplace/704225/rising-adoption-spurs-workforce-changes.aspx
- SmartDev (2025). AI Use Cases in Professional Services. McKinsey's internal AI tool used by 70 per cent of 45,000 staff; material cuts in research and proposal generation time in knowledge-work firms. https://smartdev.com/ai-use-cases-in-professional-services/
- Cobl.ai (2025). 5 Ways AI Saves Your Sales Team 10 Hours a Week on Proposals. Case evidence from a software development firm: proposal response time reduced from 6 hours to 2.5 hours per proposal, a 58 per cent reduction. https://www.cobl.ai/blog/5-ways-ai-saves-your-sales-team-10-hours-a-week-on-proposals

Frequently asked questions

What is the billable-hour squeeze in agency work?

The billable-hour squeeze describes what happens when AI compresses the time needed to produce billable work, but the pricing model stays the same. For an agency that charges by the hour, this creates an immediate tension. The hours on the timesheet are genuine but fewer, and the client will notice eventually. The agency must decide whether to pass the efficiency gain to the client or rebuild its pricing around outcomes and expertise rather than production time.

How do I know when it is time to change my agency's pricing model?

The right moment is before clients notice, not after. If a deliverable that regularly took eight hours now takes three, and that deliverable appears on a recurring invoice, you have a window of roughly two to three renewal cycles before the client starts asking questions. Use that window to propose a new pricing approach rather than waiting to defend a shrinking timesheet. The harder conversation is always the reactive one.

What pricing models work best when AI cuts production time?

Three approaches are commonly used. Outcome-based fees tie the charge to the result delivered, not the time taken, and work well when the output is measurable. Retainer models bill for access to expertise and capacity at a fixed monthly rate, which insulates the agency from time compression. Project fees quote a flat rate per deliverable, letting the agency absorb efficiency gains as improved margin rather than reduced invoices.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.
