PwC, EY, Deloitte, KPMG: what their AI rollouts tell SMEs

TL;DR

PwC, EY, Deloitte and KPMG have invested heavily in generative AI, and much of the public coverage is partly vendor marketing. Three patterns under all four firms' stories transfer cleanly to owner-managed firms at a hundredth the scale: internal productivity before customer-facing AI, bespoke shaping of vendor tools to the firm's actual work, and long horizons measured in years not quarters. The budget gap is real; the play is the same.

Key takeaways

- The Big Four are investing enormous sums in generative AI and the published case studies are partly vendor PR. The useful question is what actually changed inside the firm, not what the press release said it would change.
- Three patterns repeat under all four firms' rollouts: internal productivity before client-facing AI, bespoke build on top of vendor models rather than vendor models alone, and rollout horizons measured in years not quarters.
- The internal-productivity-first move is the most portable for owner-managed firms. The MIT NANDA research independently identifies back-office automation as the higher-ROI move and confirms that more than half of generative AI budgets are still pointed at the wrong target.
- What does not transfer: the bespoke vendor partnerships, the dedicated AI engineering teams, the internal AI platforms, the dedicated change-management functions. Stop comparing budget lines and start comparing principles.
- The honest read of any Big Four AI case study is that vendor and firm both benefit from a glowing write-up and the published productivity figures sit at the top of the distribution. The useful filter is what the firm actually does differently as a result.

The owner of a thirty-person professional services firm is reading another LinkedIn post about a Big Four AI deployment over her second coffee of the morning. PwC and Anthropic, an expanded partnership, twenty to fifty per cent productivity gains in development work, agentic build across financial services and pharma and healthcare. EY rolling AI capabilities across one hundred and sixty thousand audit engagements. Deloitte’s GenAI audit platform. KPMG’s healthcare generative AI report. The numbers are vast, the language is glowing, and the gap between her resources and theirs is the size of the Atlantic.

She closes the tab. Her frustration is not with the work itself, plenty of it is genuinely impressive, but with the coverage, which seems written for nobody her size. The case studies read like vendor marketing because they partly are vendor marketing. Anthropic benefits from a glowing PwC write-up. PwC benefits from a glowing Anthropic write-up. Both parties have an interest in the figures being at the top of the distribution.

The useful question is not what the Big Four are doing in the abstract, but what they are doing that an owner-managed firm could borrow at a hundredth the scale. Three patterns repeat under all four firms’ public stories and each one is portable.

What are the Big Four actually doing with AI right now?

The four firms are pursuing similar strategies at different speeds. PwC has deepened its alliance with Anthropic into three areas, agentic build using Claude Code to ship production software in weeks, AI-native deal-making that compresses transactions end-to-end, and reinvention of enterprise functions through bespoke internal applications. The firm reports twenty to fifty per cent productivity gains in development work and plans a full AI-driven audit solution in 2026.

EY launched an integrated AI platform in 2023 that spans strategy, transactions, risk, insurance and tax, with subsequent rollouts of AI capabilities supporting one hundred and sixty thousand global audit engagements. Deloitte has built generative AI into its audit-documentation review and publishes its annual State of AI in the Enterprise research as part of its public positioning. KPMG has gone deep on sector-specific applications, including its generative AI in healthcare report and parallel work in financial services and tax.

The coverage of all four follows a recognisable shape. A vendor partnership announcement, a headline productivity figure, a list of functions touched, a forward statement about the next phase. The case studies are useful inputs and marketing assets at the same time. The first move when reading them is to separate the underlying implementation from the press release describing it.

Why does this matter for an owner-managed firm?

It matters because three patterns under the four firms’ stories are genuinely portable and the LinkedIn coverage usually skips them. The patterns sit underneath the budget numbers and the partnership announcements. They show up in every credible Big Four AI rollout, they show up in the MIT NANDA research on why ninety-five per cent of generative AI pilots fail to deliver measurable impact, and they are independent of firm scale.

The first pattern is internal productivity before client-facing AI. PwC’s engineering productivity work and EY’s audit-engagement integration are both internal-first moves, the AI is deployed on the firm’s own back office before it touches a client. The MIT research is clear that this is the higher-ROI pattern and that more than half of generative AI budgets are still pointed at sales and marketing rather than back-office automation. The Big Four are not making that mistake. Many smaller firms are.

The second pattern is bespoke build on top of vendor models. PwC’s in-house applications layered on top of Claude are the cleanest example, but each of the four firms is shaping vendor tools to its specific workflow rather than expecting an off-the-shelf model to learn the firm’s work unaided.

The third pattern is long horizons. None of these rollouts are quarterly. PwC’s audit solution is a 2026 milestone. EY’s platform integration is a multi-year programme. The Big Four are operating on a horizon that gives the technology time to compound.

Where will you actually meet these patterns in your own firm?

You meet the internal-productivity-first pattern the moment you ask where AI should go first in your firm. The instinct for many owner-operators is to point AI at the client-facing surface, the website chatbot, the proposal generator, the marketing copy. The MIT NANDA evidence and the Big Four behaviour both say the opposite. The largest measurable ROI sits in the back office, in the boring internal places, not the visible external ones.

You meet the bespoke-on-top-of-vendor pattern as soon as your first general-purpose AI tool stalls in real use. ChatGPT and Claude on their own are powerful for an individual contributor and weak as enterprise infrastructure, because they do not learn from or adapt to your specific workflow. The fix at PwC’s scale is an engineering team building applications on top of the vendor model. The fix at your scale is a thoughtfully written system prompt, a small library of templates codifying how your firm does the work, and a clear human-in-the-loop step. Same principle, different scale.
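At small-firm scale, the three ingredients above fit in a shared document or a few lines of code. What follows is a minimal sketch in Python, not a recommended implementation. Every name in it (`FIRM_SYSTEM_PROMPT`, `TEMPLATES`, `build_request`) is hypothetical, and it calls no vendor API: it only assembles the text you would send to whichever tool you use and marks the draft for human review.

```python
from dataclasses import dataclass

# Hypothetical house rules; replace with how your firm actually works.
FIRM_SYSTEM_PROMPT = (
    "You draft documents for a thirty-person professional services firm. "
    "Use plain English, follow the supplied template exactly, "
    "and mark anything you are unsure of with [CHECK]."
)

# Hypothetical template library: one entry per recurring task.
TEMPLATES = {
    "engagement_letter": "Client: {client}\nScope: {scope}\nFee basis: {fee_basis}",
    "meeting_summary": "Meeting with: {client}\nDecisions: {decisions}\nActions: {actions}",
}


@dataclass
class DraftRequest:
    system_prompt: str
    user_prompt: str
    needs_human_review: bool = True  # the human-in-the-loop step is on by default


def build_request(task: str, **fields: str) -> DraftRequest:
    """Combine the firm's system prompt with a filled-in task template."""
    if task not in TEMPLATES:
        raise KeyError(f"No template for task: {task}")
    filled = TEMPLATES[task].format(**fields)
    return DraftRequest(system_prompt=FIRM_SYSTEM_PROMPT, user_prompt=filled)


# Illustrative values only.
request = build_request(
    "meeting_summary",
    client="Example Ltd",
    decisions="Move month-end close to day 5",
    actions="Finance lead to draft new checklist",
)
print(request.user_prompt)
```

The design choice worth noting is the default: a draft is flagged for review unless someone deliberately decides otherwise, which keeps the person in the loop without relying on memory.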

You meet the long-horizon pattern when you set the success criterion for your first AI rollout. If the criterion is a quarterly productivity win, the firm will compress the timeline and the work will fail. If the criterion is a twelve-to-twenty-four-month compounding capability with checkpoint reviews, the firm gives the technology the time it actually needs. The Big Four are giving themselves years. A smaller firm can afford less budget; it cannot afford less patience.

When should you copy the Big Four and when should you ignore them?

Copy the principles, ignore the budget lines. The internal-productivity-first move is the most portable and the one to start with. Pick the highest-friction back-office task in your firm, the one a senior person grumbles about every week, deploy AI there, measure honestly against a real baseline, and build the next thing on top of what you learned. The Big Four are running this play with three more zeroes of budget. The play is the same.

Ignore the bespoke vendor partnerships. Anthropic does not sign strategic alliances with thirty-person firms and you do not need one. Ignore the dedicated AI engineering teams, the internal AI platforms, the dedicated change-management functions, the multi-year reinvention programmes. These are the artefacts of operating at Big Four scale and they are not the source of the value. The principles are the source of the value. The artefacts are downstream of the principles.

Ignore the published productivity figures as targets. Twenty to fifty per cent development productivity is the top end of a wide distribution, reported by parties with an interest in the figure being high. The British Chambers of Commerce evidence on UK SME AI adoption is more useful as a calibration: fifty-four per cent of UK firms now use AI, ninety-five per cent report no workforce reduction, and the actual experience is incremental rather than headline-grabbing. Set your own baseline and measure your own delta against it.
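Measuring your own delta is simple arithmetic once the baseline has been timed honestly. A minimal sketch in Python, assuming you have timed one back-office task before and after the change; the function name `weekly_delta` and all the numbers are illustrative and come from none of the cited sources.

```python
def weekly_delta(baseline_minutes: float, with_ai_minutes: float, runs_per_week: int) -> dict:
    """Compare measured task time before and after the change, per week."""
    if baseline_minutes <= 0:
        raise ValueError("baseline_minutes must be positive")
    saved_per_run = baseline_minutes - with_ai_minutes
    return {
        "saved_minutes_per_week": saved_per_run * runs_per_week,
        "percent_change": round(100 * saved_per_run / baseline_minutes, 1),
    }


# Illustrative: a 90-minute weekly task done five times, now taking 72 minutes.
result = weekly_delta(baseline_minutes=90, with_ai_minutes=72, runs_per_week=5)
print(result)  # {'saved_minutes_per_week': 90, 'percent_change': 20.0}
```

A negative result is as informative as a positive one: if the tool makes the task slower, the baseline is what tells you.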

What does this mean for how you read AI case studies generally?

The honest filter for any Big Four AI case study is to ask what the firm actually does differently as a result, rather than what the announcement said it would do. The productivity figures sit at the top of the distribution. The strategic narratives are co-written with the vendor. The internal implementation work is usually real. Reading for the patterns, not the figures, makes the coverage useful at your scale.

Three discipline moves help when reading AI coverage. Discount the headline numbers by at least half until you find a non-vendor source corroborating them. Look for the operational specifics buried near the bottom of the case study, the function rolled out, the team affected, the time horizon, those are the portable details. Cross-reference against the MIT NANDA evidence on where AI investment actually produces measurable ROI, because the vendor coverage and the ROI evidence often point in different directions.
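The first of those moves can be written down as a rule of thumb. The sketch below, in Python, is one reading of "at least half": it halves a vendor-reported figure unless an independent source backs it up, so the result is a ceiling for planning rather than an estimate. The function name `working_ceiling` is hypothetical.

```python
def working_ceiling(headline_percent: float, independently_corroborated: bool) -> float:
    """Halve a vendor-reported figure unless a non-vendor source corroborates it."""
    if independently_corroborated:
        return float(headline_percent)
    return headline_percent / 2


# The twenty to fifty per cent range is the PwC figure quoted in this article.
low, high = 20, 50
print(working_ceiling(low, False), working_ceiling(high, False))  # 10.0 25.0
```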

The Big Four are useful precisely because their rollouts are at sufficient scale and duration to test the underlying patterns. The principles survive at your scale. The methods do not. If you want to think through which back-office task in your firm is the right first move and what the twelve-month horizon should look like, book a conversation.

Sources

- Anthropic (2025). PwC and Anthropic expanded partnership announcement, agentic build, AI-native deal-making, and twenty to fifty per cent productivity gains in PwC engineering teams using Claude Code. https://www.anthropic.com/news/pwc-expanded-partnership
- Thomson Reuters Tax & Accounting (2024). How do different accounting firms use AI, sector-wide survey covering EY, Deloitte, PwC and KPMG AI rollouts in audit, tax and advisory. https://tax.thomsonreuters.com/blog/how-do-different-accounting-firms-use-ai-tri/
- KPMG US (2024). Generative AI in healthcare report, sector-specific applications of generative AI from one of the Big Four. https://kpmg.com/kpmg-us/content/dam/kpmg/pdf/2024/generative-ai-poised-transform-healthcare.pdf
- Deloitte (2024). State of AI in the Enterprise, Deloitte's annual primary research on enterprise AI adoption patterns and ROI. https://www.deloitte.com/us/en/what-we-do/capabilities/applied-artificial-intelligence/content/state-of-ai-in-the-enterprise.html
- MIT NANDA (2025). The GenAI Divide, ninety-five per cent of generative AI pilots fail to deliver measurable bottom-line impact, more than half of budgets target sales and marketing while back-office automation carries the higher ROI. https://fortune.com/2025/08/18/mit-report-95-percent-generative-ai-pilots-at-companies-failing-cfo/
- Orgvue (2025). Ninety-two per cent of organisations have invested in AI but seventy-eight per cent report stalled or failed projects, change-management gaps and shadow AI as the dominant failure modes. https://www.orgvue.com/news/92-of-organizations-have-invested-in-ai-but-78-say-projects-have-either-stalled-or-failed/
- British Chambers of Commerce (2026). Half of SMEs now using AI with limited headcount impact, fifty-four per cent of UK firms actively using AI, ninety-five per cent reporting no workforce reduction. https://www.britishchambers.org.uk/news/2026/03/half-of-smes-using-ai-with-limited-headcount-impact-so-far/
- Boston Consulting Group (2025). Are You Generating Value from AI, the widening gap between AI usage and measurable business impact and the disciplines that close it. https://www.bcg.com/publications/2025/are-you-generating-value-from-ai-the-widening-gap
- Harvey AI (2024). Hengeler Mueller goes firmwide with Harvey, professional-services rollout pattern that mirrors the Big Four play at smaller scale. https://www.harvey.ai/blog/hengeler-mueller-expands-with-harvey-for-firmwide-legal-ai-adoption

Frequently asked questions

Are the Big Four AI rollouts genuinely useful examples for a thirty-person firm or are they mostly marketing?

They are both. The published figures, twenty to fifty per cent development productivity gains at PwC, AI capabilities across one hundred and sixty thousand audit engagements at EY, sit at the top end of what is achievable and are reported by parties who benefit from the headline number. The underlying implementation work is real and the three patterns under the rollouts transfer cleanly to a smaller firm. Read the case studies for the patterns, discount the figures.

Which Big Four AI move should an owner-managed firm copy first?

The internal-productivity-first move. Pick the highest-friction back-office task in the firm and deploy AI there before anything customer-facing. The MIT NANDA research is clear that more than half of generative AI budgets currently point at sales and marketing while the largest measurable ROI sits in back-office automation. The Big Four are doing the same play with three more zeroes on the budget. The play is the play.

Should I be building bespoke AI on top of vendor models like PwC does with Anthropic?

Not at PwC's scale, no. The principle is portable, the scale is not. The principle is to shape the tool to your specific workflow rather than expecting an off-the-shelf model to learn it. For an owner-managed firm that means a thoughtfully written system prompt, a small set of templates that codify how your firm actually does the work, and a clear definition of where the tool stops and a person takes over. PwC has an engineering team building agentic applications on top of Claude. You have a Friday afternoon, a careful prompt and a small library of examples. The principle is the same.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
