The six-dimension diagnostic audit you should run before a second AI engagement

[Image: a founder at a kitchen table, three printed documents fanned out, a notebook open to a hand-drawn six-cell grid, pen mid-stroke]
TL;DR

A diagnostic audit before a second AI engagement should interrogate six dimensions of the organisation (people, process, tooling, data, vendor, governance) across three phases: documentation review, structured stakeholder interviews, and synthesis into business context. The first engagement failed for reasons that exist in the organisation, not just the engagement. Without the audit, the second engagement inherits the same constraints.

Key takeaways

- The audit interrogates six dimensions: people, process, tooling, data, vendor, governance. The first engagement's failure usually concentrates in two or three of them.
- Three phases: document review (business case, SOW, success metrics, project plan, post-implementation reviews), structured stakeholder interviews, and synthesis into financial and operational risk.
- The contradictions between the document phase and the interview phase carry the diagnostic value. One audit found a 48-hour client intake claimed in the SOP versus a two-week reality reported in interviews, exposing six figures of annual operational waste.
- Specific questions beat generic ones. "What was the training plan, who was responsible, how many staff completed it" produces evidence. "Was the training adequate" produces narrative.
- The audit should not be run by the team who ran the first engagement. They are too invested in the narrative.
- The output is a prioritised implementation roadmap, not a shelf report. PMI, ITIL, Gartner's AI Maturity Model, and Cognizant's framework all contribute scaffolding.

The owner of a thirty-person services firm has finished reading the failure-data piece and accepted that the engagement-design diagnostic from the parent post is sound but incomplete. She is staring at a blank document, asking what she actually looks at, and in what order, before she commits to round two. She does not have a procurement team or an internal audit function. What she needs is a structure she can run on a Wednesday afternoon with whoever inside the firm was closest to the first engagement.

The diagnostic that follows is what experienced operators run when the engagement-design questions do not fully explain the outcome. It is structured, finite, and inexpensive. The work is afternoon-scale, not consulting-engagement-scale. The point is to know what is actually in the way before the second engagement starts.

Which dimensions does the audit cover?

Six dimensions, every one of them load-bearing: people, process, tooling, data, vendor, and governance. Most failed first engagements concentrate in two or three of them, not all six. The audit’s job is to find which two or three. The Audity diagnostic playbook frames each dimension as a three-way cross-check: what the organisation claims to do, what it actually does, and what the gap costs.

What each dimension actually covers (a checklist sketch follows the list):

  • People. Who was trained, who was named accountable, who left during the engagement, who was overcommitted.
  • Process. Which workflows the AI was supposed to change, and where the documented version diverges from the observed one.
  • Tooling. What was deployed, what is in use, what overlaps with other tools the team is already running.
  • Data. What the AI was asked to use, where it lives, whether it was reconciled across systems.
  • Vendor. What was promised, what was delivered, what post-sales support actually looks like by week 18.
  • Governance. Who owned decisions, who reviewed progress, what the steering structure looked like in practice.
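
The three-way cross-check is easier to hold steady if it is written down as a structure before the first interview. The sketch below is a minimal illustration, assuming a simple claimed/observed/cost record per dimension; the field names are assumptions, not the Audity playbook's own schema.

```python
# A minimal sketch of the six-dimension audit as a structure. The dimension
# names come from the article; the claimed/observed/gap_cost fields mirror
# the three-way cross-check but are illustrative, not Audity's own schema.
from dataclasses import dataclass

@dataclass
class DimensionProbe:
    dimension: str      # one of the six dimensions
    claimed: str = ""   # what the organisation says it does (phase one, documents)
    observed: str = ""  # what it actually does (phase two, interviews)
    gap_cost: str = ""  # what the gap costs, in real numbers (phase three)

AUDIT_DIMENSIONS = ["people", "process", "tooling", "data", "vendor", "governance"]

def blank_audit() -> list[DimensionProbe]:
    # One empty probe per dimension; expect only two or three to fill up.
    return [DimensionProbe(dimension=d) for d in AUDIT_DIMENSIONS]
```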

How do the three phases work?

Phase one is document analysis. The original business case, the statement of work, the success metrics agreed in writing, the project plan and timeline, the post-implementation review or lessons-learned doc if one exists, the vendor’s statement of practice. This phase typically surfaces contradictions immediately.

A business case might describe automating contract review with claims of 80 percent time savings; the SOW might reference a proof-of-concept limited to ten documents over 30 days; the project plan might allocate three weeks for data preparation. The contradictions land before any conversation begins.

Phase two is stakeholder interviews. Five to ten conversations of 30 to 45 minutes each, with the project sponsor, the team trained on the tool, the operational staff whose workflows were supposed to change, the IT lead, and, ideally, the vendor’s account contact. The investigator asks for specific recollections of key moments: when the team first understood the scope, when success began to feel uncertain, what specific obstacles surfaced. These conversations typically surface a narrative significantly different from the documented record.

Phase three is synthesis. The investigator maps the contradictions between phase one and phase two to financial and operational exposure. Audity documents one audit where the formal SOP described a client intake process at approximately 48 hours, while every paralegal interviewed reported the real number at closer to two weeks. That single gap exposed a six-figure annual operational waste, invisible to dashboards and unsurfaced by vendor performance reviews.
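
To make phase three concrete, here is a back-of-envelope sketch of how a cycle-time gap like that one converts into annual waste. Audity's published example reports only the 48-hour claim, the two-week reality, and the six-figure outcome; every input below is an assumption for illustration.

```python
# Back-of-envelope: converting a claimed-vs-observed cycle-time gap into
# annual waste. All inputs are illustrative assumptions, not Audity's data.
HOURS_CLAIMED = 48                # intake time per the formal SOP
HOURS_OBSERVED = 14 * 24          # two weeks, as reported in interviews
INTAKES_PER_YEAR = 120            # hypothetical volume
STAFF_HOURS_PER_EXCESS_DAY = 1.5  # hypothetical effort spent per slipped day
COST_PER_STAFF_HOUR = 65.0        # hypothetical loaded hourly rate

excess_days = (HOURS_OBSERVED - HOURS_CLAIMED) / 24
waste_per_intake = excess_days * STAFF_HOURS_PER_EXCESS_DAY * COST_PER_STAFF_HOUR
annual_waste = waste_per_intake * INTAKES_PER_YEAR
print(f"Estimated annual waste: ${annual_waste:,.0f}")  # ~$140,000 at these inputs
```

At those inputs the estimate lands around $140,000 a year, which is exactly the kind of number a dashboard never shows, because no system records the gap between the SOP and the floor.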

Why do specific questions matter more than thorough ones?

Generic questions produce narrative the team has rehearsed. Specific questions produce evidence. “Was the training adequate” generates a debatable answer. “What was the training plan, who was responsible for delivering it, how many staff completed it, what did staff report they learned, and what changed in observed behaviour post-training” generates an evidentiary record.

The same shift applies across the six dimensions. “Did adoption happen” is a verdict. “What does adoption mean in this context, how is it being measured, which teams had the highest adoption and which had the lowest, and what was different between them” is a useable line of investigation. The investigator is looking for testable claims, not for someone to agree with the impression they walked in with.

Valid evidence in this phase includes quantitative metrics (login frequency, feature usage, time spent in the tool, ROI against the original case), qualitative evidence (staff interviews, change management records), and comparative evidence (claimed processes versus observed practice). When a vendor or sponsor offers narrative, the investigator asks to see the underlying numbers. The audit is a pattern-recognition exercise grounded in observable fact.
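
One way to keep those three evidence types from blurring together in interview notes is a record that forces every claim to carry its kind and its source. The structure and the example rows below are illustrative, not part of any named framework.

```python
# A sketch of recording findings as evidence rather than verdicts. The three
# kinds mirror the article's taxonomy; the example rows are invented.
from dataclasses import dataclass
from typing import Literal

@dataclass
class Evidence:
    claim: str    # the testable claim under investigation
    kind: Literal["quantitative", "qualitative", "comparative"]
    detail: str   # the metric, interview note, or claimed-vs-observed pair
    source: str   # where the underlying numbers or records live

adoption_file = [
    Evidence("Team A adopted the tool; Team B did not", "quantitative",
             "logins and feature usage per user per week, weeks 1-18",
             "tool usage export"),
    Evidence("Training landed unevenly across teams", "qualitative",
             "staff interviews and change management records",
             "interview notes"),
    Evidence("Client intake takes 48 hours", "comparative",
             "SOP claims 48h; every paralegal interviewed reports ~2 weeks",
             "SOP vs interview transcripts"),
]
```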

Who should run the diagnostic?

Not the team who ran the first engagement. They are too invested in the existing narrative of why it stalled. The IT team that integrated the tool may be invested in framing the failure as a business problem. The project sponsor may be invested in citing external factors like staff turnover or budget cuts rather than internal change-management gaps.

The work goes faster and reaches further when conducted by someone with no stake in the outcome. That can be an external adviser brought in specifically for the diagnostic, or an internal leader from a different part of the organisation who has bandwidth and credibility to ask hard questions. Either works. What matters is the absence of investment in the existing explanation.

This shift is also easier than it sounds. A neutral party brings two things at once: permission to ask awkward questions, and pattern recognition from having seen similar failures before. The team being interviewed often welcomes this. They have been holding the contradictions privately. Naming them out loud is often the first relief in months.

Which named frameworks contribute scaffolding?

Several mature frameworks exist for post-implementation review and AI maturity assessment. None are off-the-shelf for SMEs. Each contributes a useful checklist when borrowed selectively. PMI’s Post-Implementation Review process examines whether functional and operational tests were conducted, whether user acceptance testing occurred, whether the original objectives were met. ITIL’s service review process interrogates whether agreed service levels were delivered and whether change management around the deployment was adequate.

Gartner’s AI Maturity Model sorts organisations into five levels: Awareness, Repeatable, Defined, Managed, Optimized. Most failed-first SMEs sit at Awareness or Repeatable. A second engagement that assumes a higher maturity level than actually exists is predestined to fail in the same way as the first. Cognizant’s AI/ML maturity model focuses on ten pillars, including data strategy, talent cultivation, and business integration, and sets up the audit as iterative rather than one-shot.

The point of borrowing from these is the discipline they enforce, not their internal vocabularies. A small audit is not a Gartner consulting engagement. It is half a day of structured questions informed by what experienced operators have learned to look for.

What should the output actually look like?

A prioritised implementation roadmap, on one page if it can be. Three to five issues named with specific evidence behind each. For each issue, the financial or operational cost in real numbers, the dependency on any second-engagement design, and the work needed to address it. The order matters. The audit names what to fix first, why, and what happens if the second engagement starts before that issue is addressed.

This is what separates a useful audit from a shelf report. A shelf report describes the situation. A useful audit produces decisions. Three priorities. One page. Owned by named people with named timelines. The second engagement gets scoped against the audit’s priority list, not against a generic vendor SOW.
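
If it helps to keep the roadmap honest, it can be written as a structure in which no field is optional: an issue without a cost, an owner, or a timeline cannot be constructed at all. The field names follow the article's list; the example row is invented, and the cost figure reuses the illustrative estimate from earlier.

```python
# A sketch of the one-page roadmap as a structure with no optional fields.
# The example row is invented; the ~$140k figure is the illustrative
# estimate computed above, not a real audit finding.
from dataclasses import dataclass

@dataclass
class RoadmapItem:
    priority: int    # fix-first ordering; the order is the point
    issue: str       # the named issue
    evidence: str    # the specific evidence behind it
    cost: str        # financial or operational cost, in real numbers
    dependency: str  # what in the second-engagement design depends on this
    owner: str       # a named person
    timeline: str    # a named deadline

roadmap = [
    RoadmapItem(1, "Client intake runs two weeks, not the 48 hours the SOP claims",
                "SOP versus every paralegal interviewed",
                "~$140k/yr at the illustrative inputs above",
                "any intake automation scoped against the SOP inherits the gap",
                "COO", "end of next quarter"),
]
```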

The next post in this cluster covers the consolidation work that often shows up as the top priority when the audit lands honestly. The parent piece on the second-time buyer’s situation is the on-ramp if you have not yet done the four-question engagement-design diagnostic.

If you would like a neutral party to run this audit on a stalled engagement, book a conversation.

Sources

- Audity 2025: three-phase diagnostic audit, six-dimension scope, six-figure waste example. https://auditynow.com/blog/ai-implementation-failure-audit-diagnostic
- PMI Post-Implementation Review process and checklist. https://wiki.en.it-processmaps.com/index.php/Checklist_Post_Implementation_Review_(PIR)
- Gartner AI Maturity Model: Awareness, Repeatable, Defined, Managed, Optimized. Source via BMC. https://www.bmc.com/blogs/ai-maturity-models/
- Cognizant AI/ML maturity model: ten pillars, three-stage process. https://www.cognizant.com/blx/en/documents/2261766-ai-ml.pdf
- Atlassian post-implementation review playbook. https://www.atlassian.com/work-management/project-management/post-implementation-review
- IBM Think 2025: most enterprise AI projects stall before scale because the organisation is operating below the maturity tier the engagement assumed. https://www.ibm.com/think/insights/why-most-enterprise-ai-projects-stall-before-scale

Frequently asked questions

What does a diagnostic audit before a second AI engagement actually look at?

Six dimensions of the organisation (people, process, tooling, data, vendor, governance), across three phases: document analysis (business case, statement of work, success metrics, post-implementation review), structured stakeholder interviews, and synthesis into financial and operational risk. The output is a prioritised roadmap that names what to fix first.

Who should run the diagnostic?

Someone who was not involved in the first engagement and has no investment in the narrative. This is typically an external adviser brought in for the diagnostic phase, or an internal leader from a different part of the organisation. The team who ran the first engagement is too close to the explanation they have already settled on.

How long does a proper diagnostic audit take?

An afternoon to a week, depending on size. The document phase takes a few hours of reading and cross-referencing. Stakeholder interviews take five to ten conversations of 30 to 45 minutes each. Synthesis into a prioritised roadmap takes a day. The work is structured, not exhaustive. Bigger and slower is rarely better.

What is the difference between this audit and the four-question diagnostic in the parent post?

The four-question diagnostic checks whether the engagement design was sound: was the problem defined, was a measurable outcome agreed, did discovery come before tools, was someone accountable for adoption. This audit checks whether the organisation can support any AI engagement: people, process, tooling, data, vendor, governance. The two are complementary.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
