Triage before draft: the inbox AI sequence most SMEs get wrong

[Image: a practice manager at a desk, coloured sticky notes on a screen labelling email categories, a notebook open to a triage flowchart]
TL;DR

Inbox AI works in three layers: triage (classification and routing), briefing (summarisation), and draft (response generation). The risk and value are unevenly distributed. SMEs that start with triage see fast, low-risk wins. SMEs that start with autonomous response introduce client-facing risk they cannot quantify. Klarna's two-thirds-autonomous chatbot is not the SME services-firm benchmark; 30 to 40 percent is.

Key takeaways

- The three-layer model: triage (classification), briefing (summarisation), draft (response generation). Each layer carries different risk and different value. Treating them as one decision is the most common deployment mistake.
- Triage is the safest first move: 90 to 95 percent classification accuracy on well-trained systems, 30 to 60 minutes of routing time saved per team member per day, no client-facing risk.
- The briefing layer adds 1.5 to 4 hours a day of reading-and-context time saved across a team handling 50-plus emails daily. Still no client-facing risk.
- The draft layer is highest-value and highest-risk. Routine inquiries (billing, status, document requests) see 40 to 60 percent time savings. Complex inquiries are often unusable without significant editing.
- Klarna's 60 to 70 percent autonomous handling is real and documented. The SME equivalent for professional services is 30 to 40 percent, because most professional inquiries are context-specific and nuanced.
- The protocol that works at SME scale: pilot triage on one team member for one week, get to 95 percent classification accuracy through correction loops, then scale, then add briefing, then add draft generation.

A 10-person accountancy practice piloted ChatGPT-drafted client replies before they had set up any email triage. Two weeks in, a partner caught a draft that misstated a tax-treatment assumption to a client. The pilot stopped that afternoon. Six months later, the same practice deployed Crisp for email triage only. They freed five hours a day across the team and the partner could not point to a single client risk introduced. The tool was different, but more importantly, the deployment order was different. They started at the layer with no client-facing exposure.

This is the inbox AI mistake most SMEs make in their first attempt. The Klarna story makes autonomous response generation feel inevitable. The right starting move is the layer Klarna is not advertising: classification and routing, the part that has nothing to do with the client and everything to do with how email moves through the firm.

What are the three layers of inbox AI?

Inbox AI works in three distinct layers, each with different risk and different value. Triage classifies and routes incoming emails. Briefing summarises long threads. Drafting generates first-pass replies. The risk concentrates in the draft layer, where AI directly touches client communication. The value distributes across all three. Most owners deploy them in reverse order.

The triage layer is mechanical. The AI reads each incoming email and decides which category it belongs to (billing, status, document request, complaint, scheduling, escalation). It does not write anything. It does not reach the client. It moves the message to the right team member or the right priority lane. Classification accuracy on well-trained systems lands at 90 to 95 percent, and errors are caught as soon as the wrong person opens the email.
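In spirit, the triage layer is no more than a classify-then-route step. A minimal sketch, assuming keyword rules and a routing table (the category names come from the article; the keywords, addresses, and function names are illustrative, not any product's API — real tools use trained classifiers, not keyword matching):

```python
# Minimal triage sketch: classify an email, then route it. No reply is ever
# generated here, which is why this layer carries no client-facing risk.
ROUTES = {
    "billing": "accounts@firm.example",
    "status": "case-team@firm.example",
    "document_request": "admin@firm.example",
    "scheduling": "admin@firm.example",
    "complaint": "partner@firm.example",    # complaints escalate to a partner
    "escalation": "partner@firm.example",
}

KEYWORDS = {
    "billing": ("invoice", "payment", "bill", "fee"),
    "status": ("update", "progress", "status"),
    "document_request": ("attach", "copy of", "send me", "document"),
    "complaint": ("unhappy", "complaint", "dissatisfied"),
    "scheduling": ("meeting", "appointment", "reschedule"),
}

def classify(subject: str, body: str) -> str:
    """Return the first category whose keywords appear; else escalate to a human."""
    text = f"{subject} {body}".lower()
    for category, words in KEYWORDS.items():
        if any(word in text for word in words):
            return category
    return "escalation"  # anything unrecognised goes to a person

def route(subject: str, body: str) -> str:
    """Map a classified email to the inbox or priority lane that owns it."""
    return ROUTES[classify(subject, body)]
```

The design point is the fallback: an email the system cannot place defaults to a human, never to an automated action.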

The briefing layer is also low risk. AI summarises long email threads or attached documents so a team member can read a paragraph instead of fifteen messages. It does not produce client-facing output. The drafting layer is where the risk lives, because the AI is now writing on behalf of the firm.

Why is triage the right place to start?

Triage delivers immediate value at low risk. For a 10-person practice receiving 20 to 30 support emails a day, eliminating the manual sorting step (which typically eats 30 to 60 minutes per day per team member) yields 4 to 8 hours a week of recovered team time. None of that touches the client. None of it produces a document the client will read. The win is purely operational.

IBM research shows AI-driven routing can cut average response times by up to 99 percent in scenarios where customers were previously waiting hours for a reply. The reduction comes from each email reaching the right person immediately on receipt instead of sitting in a shared inbox; no AI-generated reply is involved.

The protocol that works: pilot the triage system on one team member for one week, with the AI classifying and routing while the team member verifies and corrects misclassifications. Track classification accuracy. Expect 85 to 90 percent at week one, rising to 95-plus percent at week two through correction loops. When the correction effort drops below five minutes a day, scale to the team.
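The week-two gate in that protocol can be written down as a small calculation. A sketch, using the article's thresholds (95 percent accuracy, under five minutes of correction effort a day); the 20-second-per-correction estimate and the function names are assumptions for illustration:

```python
def pilot_metrics(total_emails: int, corrections: int,
                  seconds_per_correction: int = 20) -> tuple[float, float]:
    """Accuracy and daily correction effort (minutes) from one day of the pilot.

    `corrections` is the number of misclassifications the team member fixed;
    20 seconds per fix is an assumed average, not a measured figure.
    """
    accuracy = 1 - corrections / total_emails
    daily_correction_minutes = corrections * seconds_per_correction / 60
    return accuracy, daily_correction_minutes

def ready_to_scale(accuracy: float, daily_correction_minutes: float) -> bool:
    """The article's gate: 95%+ accuracy and under five minutes of fixes a day."""
    return accuracy >= 0.95 and daily_correction_minutes < 5
```

For example, 25 emails with one correction gives 96 percent accuracy and well under a minute of correction effort: past the gate. 25 emails with three corrections is 88 percent: keep training.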

What does the briefing layer add?

Briefing turns a 15-message thread into a paragraph. For a team handling 50-plus emails a day, the time saved on reading and contextualising falls from 2 to 5 minutes per email to 30 to 60 seconds. That is 1.5 to 4 hours a day of reading time recovered across the team. The briefings are reviewed before any reply is sent, so client-facing risk stays at zero.

Briefings work best on long threads, attached documents, and inquiries with multiple back-and-forth exchanges. They do not replace the team member's judgement. They give the team member a faster way to absorb context before they apply judgement. Teams using briefing tools report measurably reduced mental fatigue in high-volume email environments.

Add briefing once triage has been stable for three to four weeks. The team is already used to the AI moving emails. Adding a layer that summarises content the team will read is a small step from there.

When does draft generation become safe?

Draft generation becomes safe when the firm has documented review protocols, established categories of inquiry where AI drafts work well, and accepted that complex inquiries will not be drafted by AI in the first wave. Routine inquiries (billing questions, appointment status, document requests, simple updates) see 40 to 60 percent time savings on drafting. AI takes 20 to 30 seconds; manual drafting takes 2 to 3 minutes.

Complex or nuanced inquiries are different. AI drafts on these are often unusable without significant editing, and the editing time can match or exceed manual drafting time. The discipline is to limit AI drafting to a defined set of categories and route everything else to a human. The triage layer makes this routing reliable.
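That discipline reduces to an allowlist check sitting on top of the triage output. A sketch under the article's category split (the category and outcome names are illustrative; the point is that anything outside the routine set is drafted by a human):

```python
# Draft gate sketch: AI drafts only for an allowlist of routine categories
# established during the triage phase. Everything else is written by a person.
ROUTINE_CATEGORIES = {"billing", "status", "document_request", "simple_update"}

def drafting_path(category: str) -> str:
    """Decide who writes the first draft for a triaged email."""
    if category in ROUTINE_CATEGORIES:
        return "ai_draft_for_review"  # AI drafts; a human still reviews and sends
    return "human_draft"              # complex or nuanced: no AI first pass
```

Note that even the allowlisted path ends in human review; the gate controls who writes the first draft, not who approves it.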

End-to-end response time on routine inquiries can drop from 4 to 6 hours to 15 to 30 minutes. Off-hours ticket abandonment drops by over 50 percent. The client experience improves because clients get faster, more consistent responses, not because the firm has eliminated humans from the process.

What does Klarna actually tell SMEs?

Klarna's AI assistant handles two-thirds of customer service chats. In its first month that was 2.3 million conversations, with customer satisfaction on par with human agents and a 25 percent drop in repeat inquiries, and Klarna estimated a $40m profit improvement for 2024. The numbers are real and the success is documented.

The relevant lesson for an SME is not the percentage. Klarna handles a high volume of routine, transactional, well-categorised inquiries (refund status, payment plans, account questions). A professional services firm receives a different mix: questions specific to a client's matter, requests for advice, sensitive negotiations, status updates on bespoke engagements. The autonomous-handling rate for that mix is 30 to 40 percent, not two-thirds.

The principle that translates is the staged deployment, not the percentage. Klarna built the autonomous chatbot on top of years of email and chat triage. SMEs should expect the same sequence: classification first, briefing next, drafting on routine inquiries last, autonomous response only for the small subset where context-specific risk is low.

What compliance gates does the regulated sector hit?

For legal practices, email correspondence is potentially privileged. AI processing of client emails creates a confidentiality risk if the AI platform retains, processes, or shares the communication outside the firm-client relationship. The SRA requires confidentiality. Most professional-grade tools (Crisp, Zendesk) have Data Processing Agreements and claim GDPR compliance. Consumer-grade tools (free ChatGPT, free Copilot) do not, and are not suitable for processing client personal data.

For accountancy firms, ICO and UK GDPR rules govern processing of personal data in client emails. Legitimate basis for processing is required, and a DPA with the AI vendor is the working standard. For healthcare clinics, NHS Digital governance plus GDPR plus HIPAA-equivalent UK protections apply. For financial services firms, FCA evidence-of-communication requirements mean records of who reviewed an AI-generated response and when must be maintained.

The practical gate is consistent: AI does not autonomously respond to client emails in regulated sectors without human review and a documented sign-off process. Triage, briefing, and draft generation are all acceptable within proper governance. Autonomous response is the last and highest-risk move, not the first.
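The FCA's evidence-of-communication requirement above implies a minimal record: who reviewed each AI-assisted reply, whether they approved it, and when. A sketch of what that sign-off log might look like (the field and function names are illustrative, not a prescribed schema):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class SignOff:
    """One review record for an AI-assisted reply: who, what, when."""
    email_id: str
    reviewer: str
    approved: bool
    reviewed_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)  # timestamped in UTC
    )

def record_review(log: list, email_id: str, reviewer: str, approved: bool) -> SignOff:
    """Append a sign-off entry; nothing is sent without one of these existing."""
    entry = SignOff(email_id, reviewer, approved)
    log.append(entry)
    return entry
```

Whatever tool holds the inbox, the governance test is the same: for any AI-assisted reply, the firm can produce the reviewer, the decision, and the timestamp on demand.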

If you are working out which inbox layer to deploy first, and how to keep both the regulator and the client relationship clean, the deployment order is the part most vendors will not write down for you. Book a conversation.

Sources

  • Crisp, automating email responses with AI.
  • Klarna, AI assistant handles two-thirds of customer service chats in its first month.
  • Tess Group, AI compliance UK businesses 2026 guide.
  • Law Society of Ireland, generative AI guidance.
  • Brynjolfsson, E., Li, D. and Raymond, L. (2023). Generative AI at Work, NBER Working Paper 31161. Empirical productivity study showing a 14 per cent average gain, with 34 per cent for low-skilled workers; the basis for sector-specific AI productivity claims.
  • McKinsey & Company (2024). From Promise to Impact: How Companies Can Measure and Realise the Full Value of AI. Five-layer measurement framework for evaluating sector AI deployments.
  • Goldman Sachs (2023). Generative AI could raise global GDP by 7 per cent. Cross-sector productivity-paradox research; the macroeconomic context for sector-level AI ROI claims.
  • Boston Consulting Group (2026). When Using AI Leads to Brain Fry. Study of 1,488 US workers across large companies on AI oversight load, error rates, decision overload, and intent to quit.

Frequently asked questions

Why deploy triage before draft generation?

Triage is lower risk and delivers immediate value. Classification and routing are mechanical decisions that do not touch client communication. Drafting touches the client directly and introduces accuracy, tone, and compliance risk that takes documented controls to manage. Starting with triage builds team confidence and operational clarity before higher-risk layers are added.

Is the Klarna case study relevant to a 10-person services firm?

Partially. Klarna's two-thirds-autonomous chatbot result is real (2.3 million conversations in the first month, customer satisfaction on par with human agents, a 25 percent drop in repeat inquiries). The SME services-firm equivalent is 30 to 40 percent autonomous handling, because professional inquiries are typically context-specific. The principle of efficient automation improving customer experience translates. The percentage does not.

Which tools work at SME scale?

- Crisp and Zendesk AI, £50 to £300 per user per month, for email-focused triage.
- Salesforce Service Cloud and HubSpot Service Hub, £50 to £200 per user per month, if you are already on the CRM.
- ChatGPT, Claude, or Copilot, £20 to £30 per user per month, for drafting only with no integration.

For under 30 inquiries a day, basic email rules combined with ChatGPT for drafting is often adequate.

What compliance gates apply to inbox AI in regulated sectors?

SRA confidentiality, ICO and UK GDPR processing, NHS Digital data governance, FCA evidence-of-communication requirements. Autonomous client-facing responses are not acceptable in regulated sectors without documented review-and-sign-off. Triage, summarisation, and draft generation are acceptable with proper governance and Data Processing Agreements.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
