Where you sit on the AI ROI maturity ladder

TL;DR

UK SMEs sit on a five-level AI ROI maturity ladder. About half are at Level 1, with adoption tracked and impact not. Level 3, the threshold for board credibility, takes 40 to 60 hours of internal work over 12 to 18 months.

Key takeaways

- Level 1 (anecdotal) is the base case for 50 to 60 percent of UK SMEs that have bought AI.
- Level 2 (hours-saved by survey) covers another 25 to 35 percent. The number exists but is not validated.
- Level 3 (defendable methodology) is the floor for board and CFO credibility, occupied by 10 to 15 percent of firms.
- Level 4 (decision-grade ROI with formal review) and Level 5 (portfolio-wide governance) are rare and rarer.
- Climbing Level 1 to Level 3 takes around 40 to 60 hours of internal work over 12 to 18 months, no consulting required.

Picture a managing partner I’ll call Mark. Forty-fee-earner firm, five-figure annual AI spend across Copilot and a document-review tool, both rolled out in the last twelve months. Last Tuesday his finance director asked him for the ROI on the AI rollout for the next board pack. He sat down to write the answer and noticed what he had. The operations lead’s recollection of last quarter. A vendor case study from a peer firm. A general feeling that adoption was decent. The board meeting is in nine days and that pile is what he has to defend.

He’s at Level 1 on a five-level ladder. Most firms are. The ladder is diagnostic, not a moral grading. Until a partner can locate themselves on it accurately, the question of how to climb is the wrong question to be asking.

What are the five levels actually?

The ladder is drawn from technology-investment maturity research, adapted to AI specifically. Level 1 is anecdotal: adoption tracked, impact not. Level 2 is hours-saved by survey, with a number that exists but lacks validation. Level 3 is defendable methodology: time-study, quality assessment, leakage tracked, documented protocol. Level 4 is decision-grade ROI with formal review. Level 5 is portfolio-wide governance across all technology investments.

Each level has its own characteristic sentence. A Level 1 firm says: “We implemented this six months ago and we think it’s working, but we haven’t formally measured the impact.” A Level 2 firm says: “We measure hours-saved monthly by asking users, and we know the numbers aren’t precise.” A Level 3 firm says: “We measured time-saved through a two-week time-study before and after deployment, with five to ten professionals. Our methodology has estimated error bars of plus or minus 15 to 20 percent, but the conclusion holds within that range.” A Level 4 firm has a quarterly cadence and a 12-month review with explicit go-forward decisions. A Level 5 firm runs the same discipline across every technology investment in the portfolio.

The levels describe measurement reality, not aspirational steps.

Where does most of the market actually sit?

About 50 to 60 percent of UK SMEs that have bought AI sit at Level 1. They have the tool. They have adoption metrics. They don’t have impact metrics. The number who can answer “what did the AI deliver in pounds last quarter?” with anything more than impression and inference is small. The honest reading is that most owner-led firms work from numbers that would not survive forty minutes of CFO scrutiny.

Roughly 25 to 35 percent are at Level 2. The hours-saved survey has been done. The number sits in a cell on a spreadsheet somewhere. It is articulated when asked. It has not been validated by anyone, the methodology has not been written down, and if the CFO probed for an hour the number would dissolve. Most of the SMEs reporting AI ROI in vendor surveys are working from Level 2 numbers.

Approximately 10 to 15 percent are at Level 3. Time-study or activity-log measurement, rubric-based quality assessment, value leakage tracked. The methodology is documented. The error bars are explicit. This is the floor for board credibility.

Level 4 is rare, around 3 to 5 percent of SMEs. Level 5 is rarer still, under 1 percent, and most large enterprises do not reach it either.

That distribution is the base case, not a failure. The question is whether the firm wants to keep operating at the base case or move.

Why is Level 3 the credibility floor?

Levels 1 and 2 share a common feature: the measurement methodology was either absent or informal. A CFO who asks “how did you measure that?” gets either no answer or a description of a survey. Both produce the same outcome. The number gets discounted, the case for the AI weakens, and the next renewal happens on gut feel rather than evidence.

Level 3 is structurally different. The methodology is written down. It can be audited. Its error bars are explicit, which means the conclusions can be tested against them. A CFO asking “how did you measure that?” gets an answer that holds up: “two-week time-study before and after, blinded quality assessment on thirty samples, leakage tracked through a survey of where the freed-up hours went.” The CFO does not have to take the AI’s value on faith. They can see the firm has measured it carefully.

The CMM-derived research finding underneath this is concrete. Level 3 firms achieve roughly 20 to 30 percent higher ROI on technology investments than Level 1 firms. The gain comes from better selection, because Level 3 firms catch unsuitable technologies in pilot. It comes from faster correction when something is off. And it comes from board confidence that releases capital for the next investment when the case is real.

A firm climbing from Level 1 to Level 3 measures better. The downstream effect is better decisions about what to deploy and what to kill. The measurement discipline becomes the operating discipline.

What does the climb to Level 3 actually involve?

The work is concrete and modest. About 40 to 60 hours of internal effort, conducted over 12 to 18 months. That works out at an hour or two a fortnight, with finance-manager support. Most SMEs have the capacity if they have decided the discipline is worth having.

There are five components. First, a baseline measurement of current state for each AI use case before deployment: hours per task, error rates, customer satisfaction, the numbers that anchor everything else. Second, a structured time-study or activity-log methodology applied four to six weeks after deployment, with a defined sample of five to ten people across two weeks. Third, a rubric-based quality assessment on a stratified sample of around thirty items, blinded, scored by someone not involved in the AI procurement. Fourth, leakage tracking, asking where the freed-up hours went: cost reduction, work expansion, or slack. Fifth, written documentation of the methodology so the next review can replicate it.
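The arithmetic those components feed is simple enough to sketch. A minimal illustration in Python, with entirely invented figures; the helper name `roi_with_error_bars` and every number below are placeholders for this post, not benchmarks or a prescribed tool:

```python
# Illustrative Level 3 arithmetic: time-study savings, leakage adjustment,
# and an explicit error band around the conclusion. All inputs are invented.

def roi_with_error_bars(
    hours_before: float,       # avg hours per task, pre-deployment baseline
    hours_after: float,        # avg hours per task from the post-deployment time-study
    tasks_per_year: int,       # annual task volume across the measured group
    hourly_rate: float,        # blended cost per professional hour, GBP
    leakage: float,            # share of freed-up hours lost to slack (0.0 to 1.0)
    annual_spend: float,       # annual AI licence and rollout cost, GBP
    error_margin: float = 0.20,  # top of the article's 15 to 20 percent error range
):
    hours_saved = (hours_before - hours_after) * tasks_per_year
    realised_hours = hours_saved * (1 - leakage)      # leakage-tracking step
    gross_value = realised_hours * hourly_rate
    net = gross_value - annual_spend
    band = (gross_value * (1 - error_margin) - annual_spend,
            gross_value * (1 + error_margin) - annual_spend)
    return net, band

net, (low, high) = roi_with_error_bars(
    hours_before=3.0, hours_after=2.0, tasks_per_year=500,
    hourly_rate=60.0, leakage=0.3, annual_spend=15_000,
)
print(f"Net value: £{net:,.0f} (range £{low:,.0f} to £{high:,.0f})")
# → Net value: £6,000 (range £1,800 to £10,200)
```

Note that in this made-up example the net value stays positive across the whole error band, which is exactly the Level 3 sentence from earlier: the conclusion holds within the stated range.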

None of this requires consulting support. None of it requires expensive tooling. What it requires is a decision that the firm will operate on measured evidence rather than impression. The hours-cost is real but bounded. The pay-off is access to the question Mark needed to answer last Tuesday: what did the spend deliver?

If you are sitting where Mark was, with a board pack to write and a pile of impressions to draw from, that is the diagnostic. The next twelve months can put you at Level 3, or they can leave you at Level 1 with a bigger AI estate and the same defensibility problem you had before.

If you’d like to talk through what climbing the ladder looks like for your firm specifically, book a conversation.

Sources

  • Software Engineering Institute, Carnegie Mellon University (1993). Capability Maturity Model for Software, Version 1.1. The foundational five-level process maturity framework that underpins technology-investment maturity adapted to AI.
  • Software Engineering Institute, Carnegie Mellon University (2006). Calculating CMMI-Based ROI. Empirical evidence that organisations moving from Level 1 to Level 3 process maturity reduce cost of quality from 65 per cent to 40 per cent and gain 50 per cent in productivity, the basis for the 20 to 30 per cent ROI uplift figure.
  • MIT CISR (Woerner, Sebastian, Weill and Kaganer, 2025). Grow Enterprise AI Maturity for Bottom-Line Impact. Stage 3 enterprises achieve growth 11.3 percentage points and profit 8.7 percentage points above industry average, while Stage 1 firms underperform on both.
  • McKinsey & Company (2025). The State of AI Global Survey. 88 per cent of organisations now use AI in at least one function, but only 39 per cent report any enterprise-level EBIT impact, the measurement gap the maturity ladder addresses.
  • McKinsey & Company (2024). From Promise to Impact: How Companies Can Measure and Realise the Full Value of AI. Five-layer measurement framework spanning technical performance, adoption, operational KPIs, strategic outcomes and financial impact.
  • Boston Consulting Group (2025). Are You Generating Value from AI? The Widening Gap. Five per cent of "future-built" firms achieve five times the revenue gains and three times the cost reductions of peers, while 60 per cent of firms report almost no material value from AI investment.
  • Standish Group, CHAOS Report (2020). Long-running benchmark of IT-project outcomes: 31 per cent succeed on contemporary definitions, 50 per cent are challenged and 19 per cent fail outright, the historical baseline for technology-investment measurement maturity.
  • Kaplan, R. and Norton, D. (1992). The Balanced Scorecard: Measures That Drive Performance, Harvard Business Review. Foundational article establishing multi-dimensional performance measurement across financial, customer, internal-process and learning perspectives.

Frequently asked questions

What is the AI ROI maturity ladder for SMEs?

A five-level diagnostic that locates a firm's AI measurement maturity. Level 1 is anecdotal, with adoption tracked but impact not. Level 2 is hours-saved by survey. Level 3 is defendable methodology with time-study and quality assessment. Level 4 is decision-grade ROI with formal review. Level 5 is portfolio-wide governance across all technology investments.

Why does Level 3 matter for board defence of AI spend?

Levels 1 and 2 produce numbers that do not survive CFO scrutiny because the methodology was absent or informal. Level 3 has a written methodology, explicit error bars, and an audit trail. The CFO can interrogate it and the case still holds. That is what board credibility actually requires.

How long does it take to move from Level 1 to Level 3 on AI ROI measurement?

About 40 to 60 hours of internal work, over 12 to 18 months. The work covers baseline measurement, time-study or activity-log methodology after deployment, rubric-based quality assessment, leakage tracking, and written documentation. Most SMEs can do this without consulting support.

What ROI uplift do firms see at higher maturity levels?

Capability Maturity Model research applied to technology investments shows Level 3 firms achieve roughly 20 to 30 percent higher ROI than Level 1 firms. The gain comes from better selection (avoiding unsuitable tools), faster correction when something is off, and board confidence that releases capital for the next deployment.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30-minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
