Cost of quality with AI in the workflow, what it costs to ship clean work

TL;DR

Cost of quality is the total cost of getting work to a shippable standard, split into prevention, appraisal, internal failure, and external failure. AI in the workflow lowers prevention cost, raises appraisal cost because more review is needed, and raises external failure cost sharply when a hallucinated citation or wrong answer reaches a client. Firms that do not track cost of quality end up paying more for AI productivity, not less.

Key takeaways

- Cost of quality has four categories, prevention (training, policy, data), appraisal (review, fact-check), internal failure (rework before delivery), and external failure (refunds, sanctions, claims after delivery)
- AI lowers prevention cost on entry-level drafting but raises appraisal cost, recent surveys put verification time at around 4.3 hours per employee per week
- The asymmetry that bites at SME scale is external failure, one hallucinated citation or wrong chatbot answer can cost ten to fifty times the productivity gain it sat behind
- Tracking cost of quality at SME scale needs only a quarterly tally of error counts by category and a ninety-day external-failure log, not a full quality management system
- The tracking forces a clean answer on which AI use cases are net-positive, which need narrower scope, and which raise quality cost more than they lower it

An owner sat across from me last month with two numbers she could not reconcile. Her team was producing thirty per cent more client-facing work than the year before. Her external complaint count had tripled. The complaints were small, a wrong figure in a paragraph, a citation that did not check out, a paragraph that read fluently and meant nothing. None of them alone was a disaster. Together they were costing her partner-level time, two client relationships, and a quiet wobble in how the firm felt to run.

She had introduced AI tooling nine months earlier. Her productivity numbers told her it was working. Her complaint log told her something else. Neither side of that picture was wrong. What was missing was a frame that joined them up.

The frame is an old one from manufacturing called cost of quality. It translates almost directly into AI-assisted professional services, and once you see it, you cannot un-see what AI has done to the shape of the bill.

What is cost of quality?

Cost of quality is the total cost of getting work to a shippable standard, split into four categories. Prevention is what you spend before the work starts, on policy, training, and tool selection. Appraisal is what you spend checking the work. Internal failure is what you spend correcting errors before delivery. External failure is what you spend when an error reaches the client or regulator. The American Society for Quality codified this framework.

Why does it matter for your business?

It matters because AI changes the shape of cost of quality in a predictable direction, and owners commonly see only part of the picture. Prevention cost falls because AI absorbs entry-level work. Appraisal cost rises because probabilistic output needs more verification. Internal failure rises because more errors come through to catch. External failure rises sharply when one slips. The productivity gain is visible. The new costs are not.

The asymmetry is what bites at SME scale. A four-person consultancy that saves five hours a week on drafting feels the win immediately. The same consultancy that ships one fabricated citation in a client deliverable can face a refund, a lost retainer, and a PI insurance conversation that costs ten times the annual productivity gain. Censinet’s review of healthcare AI failures found that a single major AI incident can erase ten to fifty times the anticipated savings. The numbers in professional services are smaller, the shape is the same.
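The asymmetry is easier to see as rough arithmetic. The sketch below is illustrative only: the five hours a week comes from the consultancy example above, while the working weeks, the hourly rate, and the function names are assumptions made up for the example, not figures from any of the sources.

```python
# Illustrative arithmetic only. The weeks and rate are assumed values.
HOURS_SAVED_PER_WEEK = 5       # from the four-person consultancy example
WORKING_WEEKS_PER_YEAR = 46    # assumption
BLENDED_HOURLY_RATE = 100.0    # assumption, in any currency


def annual_gain(hours_per_week: float, weeks: int, rate: float) -> float:
    """Value of the drafting time saved over one year."""
    return hours_per_week * weeks * rate


def incident_cost(gain: float, multiple: float) -> float:
    """Cost of one external failure, as a multiple of the annual gain."""
    return gain * multiple


def break_even_incident_rate(multiple: float) -> float:
    """Yearly incident probability at which expected loss equals the gain."""
    return 1.0 / multiple


gain = annual_gain(HOURS_SAVED_PER_WEEK, WORKING_WEEKS_PER_YEAR, BLENDED_HOURLY_RATE)
for multiple in (10, 50):
    cost = incident_cost(gain, multiple)
    rate = break_even_incident_rate(multiple)
    print(f"{multiple}x incident: cost {cost:,.0f}, break-even rate {rate:.0%} per year")
```

Under these assumed inputs the annual gain is 23,000, so a ten-times incident costs 230,000. Put another way, if an incident of that size is likely more often than once in ten years, the expected loss outweighs the productivity gain.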

Where will you actually meet this in your workflow?

You meet it first in appraisal time. Recent industry research puts verification time at around 4.3 hours per employee per week for staff using AI tools heavily, and Workday’s 2026 research with three thousand two hundred business leaders found that nearly forty per cent of the time AI saves on drafting is immediately lost to rework. That fee-earner time spent checking the AI’s homework lives quietly inside the productivity number rather than outside it.

You meet it next in internal rework. The hallucinated citation caught by a senior associate. The analysis that looks polished and turns out to be wrong on the second number down. The chatbot reply that promises a refund policy the firm does not actually offer. Each one is caught and none of them is catastrophic, but all of them eat time that the AI tooling was supposed to free up. Researchers at BetterUp Labs and Stanford, writing in Harvard Business Review, have begun calling this “workslop”, AI-generated work that looks finished and is not.

You meet it last, and most expensively, in external failure. A US court has fined two law firms thirty-one thousand dollars for fake citations produced by a combination of Google Gemini and Westlaw Precision. Deloitte Australia partially refunded a four hundred and forty thousand Australian dollar government contract after fabricated citations and non-existent academic papers surfaced in the deliverable. Air Canada’s chatbot promised a customer a bereavement-fare refund the airline’s policy did not offer, and the tribunal made the airline pay it anyway. The Lloyd’s Market Association’s AI exposure survey now treats “AI produces erroneous advice or service to clients, causing a loss” as a plausible professional indemnity scenario with medium potential impact.

When should you track cost of quality, and when can you ignore it?

Track it whenever AI is doing material work in client-facing output and your complaint count, refund count, or internal correction time is moving in the wrong direction. Ignore it if AI is confined to internal-only work or pattern-matching tasks where a wrong answer never reaches a client. The discipline scales to the risk, not the tool. A solicitor using AI for pleadings and a marketer using AI for headlines sit in two different cost-of-quality categories.

You do not need a full quality management system to do this. Two lightweight habits cover most of it. Keep a ninety-day external-failure log that captures any client-visible error where AI played a material role, with rough direct and indirect cost estimates. Keep a quarterly tally of verification time and internal rework by use case, even a rough one, so you can see which AI use cases are net-positive and which are quietly costing more than they save. Resolver’s incident-costing methodology and the SRA’s guidance on technology adoption both point in the same direction, name the cost, even imprecisely, so it stops hiding.
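The two habits above fit in a spreadsheet, but a short sketch makes the logic explicit. Everything here is hypothetical: the record fields, the use-case names, the hourly rate, and the sample figures are assumptions for illustration, not a method prescribed by Resolver or the SRA.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ExternalFailure:
    occurred: date
    use_case: str
    direct_cost: float    # refunds, fee write-offs
    indirect_cost: float  # rough estimate: partner time, relationship damage


@dataclass
class QuarterlyTally:
    use_case: str
    hours_saved: float
    verification_hours: float
    rework_hours: float


def recent_failures(log, today, window_days=90):
    """Keep only failures inside the rolling window (ninety days by default)."""
    cutoff = today - timedelta(days=window_days)
    return [f for f in log if cutoff <= f.occurred <= today]


def net_value_by_use_case(tallies, failures, hourly_rate):
    """Net hours valued at the hourly rate, minus external-failure cost."""
    failure_cost = defaultdict(float)
    for f in failures:
        failure_cost[f.use_case] += f.direct_cost + f.indirect_cost
    result = {}
    for t in tallies:
        net_hours = t.hours_saved - t.verification_hours - t.rework_hours
        result[t.use_case] = net_hours * hourly_rate - failure_cost[t.use_case]
    return result


# Made-up sample data for one quarter.
log = [
    ExternalFailure(date(2025, 3, 1), "client reports", 2000.0, 1500.0),
    ExternalFailure(date(2024, 10, 1), "client reports", 500.0, 0.0),  # outside window
]
tallies = [
    QuarterlyTally("client reports", 60, 25, 10),
    QuarterlyTally("internal notes", 30, 5, 2),
]
recent = recent_failures(log, today=date(2025, 3, 31))
net = net_value_by_use_case(tallies, recent, hourly_rate=100.0)
for use_case, value in net.items():
    print(f"{use_case}: {'net-positive' if value > 0 else 'costing more than it saves'}")
```

The estimates do not need to be precise. Even rough numbers are enough to separate the use cases worth expanding from the ones that need narrower scope.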

The adjacent ideas worth holding in mind are the hidden margin tax of AI subscriptions, pricing models when productivity is variable, the AI-discount conversation with clients, and how AI changes professional indemnity exposure. The Information Commissioner’s Office and the Solicitors Regulation Authority have both published guidance that lives mostly inside the prevention category, and the EU AI Act will add an enforcement layer on top from August 2026 for high-risk use cases.

The diagnostic I run with owners is simple. First, what is your verification time per person per week on AI-assisted work, even roughly? If you cannot answer, the appraisal cost is invisible and probably understated. Second, in the last ninety days, how many client-visible errors involved AI, and what did each one cost in time, fee write-offs, or relationship friction? If the answer is “I do not know”, the external failure cost is invisible too. Third, which AI use cases in the firm are clearly net-positive once those costs are counted, and which are not? If you cannot separate them, the cost of quality is hiding the answer.

None of this is anti-AI in client work. The point is that AI repays a clean quality discipline and punishes a sloppy one. Track the four categories honestly, even at low resolution, and the answer on which AI to expand and which to narrow becomes obvious. Skip the tracking and the small errors compound into a complaint log that does not match the productivity dashboard, which is the conversation the owner I sat with last month had been having with herself for nine months.

If this is where you are, book a conversation.

Sources

- American Society for Quality (n.d.). Cost of Quality, the four-part framework of prevention, appraisal, internal failure, and external failure costs. https://asq.org/quality-resources/cost-of-quality
- Six Sigma (2024). Cost of Poor Quality (COPQ), how internal and external failure costs compound when prevention and appraisal are underfunded. https://www.6sigma.us/process-improvement/copq-cost-of-poor-quality/
- Mata v Avianca, Inc. and follow-on cases, JD Supra (2025). Federal court sanctions on hallucinated AI-generated citations in legal filings. https://www.jdsupra.com/legalnews/federal-court-turns-up-the-heat-on-1849454/
- Cloud Security Alliance (2024). Lessons from the Air Canada chatbot ruling on company accountability for AI-generated customer service answers. https://cloudsecurityalliance.org/blog/2024/06/05/the-risks-of-relying-on-ai-lessons-from-air-canada-s-chatbot-debacle
- Fortune (2025). Deloitte refunds Australian government contract after AI-generated errors and fabricated citations surfaced in a four hundred and forty thousand dollar report. https://fortune.com/2025/11/25/deloitte-caught-fabricated-ai-generated-research-million-dollar-report-canada-government/
- Censinet (2024). Risk Quantified, Measuring the True Cost of AI Failures in Healthcare, including the ten to fifty times savings-erasure finding for a single major AI incident. https://censinet.com/perspectives/risk-quantified-measuring-true-cost-ai-failures-healthcare
- Four Dots (2024). Business Impact of AI Hallucinations, verification time of around 4.3 hours per employee per week and per-incident cost ranges across customer service, finance, and healthcare. https://fourdots.com/business-impact-of-ai-hallucinations-rates-and-ranks
- Solicitors Regulation Authority (2024). Compliance tips for solicitors adopting new technology, governance, training, and oversight expectations when AI is introduced into client work. https://www.sra.org.uk/solicitors/resources/innovate/compliance-tips-for-solicitors/
- Information Commissioner's Office (n.d.). Guidance on AI and data protection, prevention-stage expectations on lawful basis, data minimisation, and human oversight in AI-assisted decisions. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/
- Lloyd's Market Association (2024). Understanding AI Exposures, AI Loss Scenarios Survey Results, including the professional indemnity scenario of erroneous AI-generated advice to clients. https://lmalloyds.com/campaigns/understanding-ai-exposures-ai-loss-scenarios-survey-results/

Frequently asked questions

What is cost of quality and why does it matter when AI is in the workflow?

Cost of quality is the total cost of getting work to a shippable standard. It splits into four categories, prevention, appraisal, internal failure, and external failure. AI shifts the balance between these categories rather than removing them. Prevention costs fall because AI handles more of the entry-level draft. Appraisal and failure costs rise because probabilistic output needs more checking, and any error that escapes the firm tends to land harder when it does.

How big is the external failure cost when AI gets it wrong in a professional services context?

Larger than many owners expect. A US court has fined law firms thirty-one thousand dollars for fake citations produced by an AI tool. Deloitte Australia partially refunded a four hundred and forty thousand Australian dollar government contract over AI-generated errors. Censinet's review of healthcare AI failures finds a single major incident can erase ten to fifty times the anticipated savings. The downside is asymmetric to the upside.

How does an owner-managed firm track cost of quality without becoming a quality management department?

Two lightweight habits cover most of it. First, a ninety-day external-failure log that captures any client-visible error where AI played a material role, with direct and indirect cost estimates. Second, a quarterly tally of internal rework and verification time by use case. Those two records, kept honestly, are enough to see which AI use cases are net-positive and which are quietly costing more than they save.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
