The support manager of a 30-staff insurance brokerage has last month’s ticket export in front of her. She has spent the morning tagging every line by intent. Sixty percent are tier-1 structured: password resets, policy document requests, payment status checks, the same three-line answer typed two hundred times. Twenty percent are genuine complaints. Twenty percent are queries the AI on yesterday’s demo call claimed it could handle, and she suspects it cannot.
She has £5k of budget. Her CFO has asked what the team will do with the freed time. The decision in front of her is which slice of those 200 monthly tickets to hand to AI first, and which slice to keep well clear.
This is the right frame for the function in 2026. The evidence is now dense enough to answer the question precisely.
What jobs does AI do well in customer service today?
Six jobs are reliably deployable in 2026 with strong named precedent. Structured-intent deflection on password resets, refund status and order tracking lands at 70 to 90 percent across Zendesk, Tidio and Click4Assistance benchmarks. Sentiment-aware ticket routing reaches 90 to 96 percent accuracy versus 77 percent for human triage. Add sub-two-second triage, knowledge-base article generation, multilingual auto-detection, and real-time agent assist that cuts handle time by 15 to 20 percent.
Each has the same shape. The problem is high-volume, the answer lives in a database or a help article, and the customer wants it inside thirty seconds rather than four hours. Fini reports 80 percent autonomous resolution across its base. Sentisum’s deployment at James Villas cut first-reply time by 46 percent in weeks. McKinsey’s 2025-2026 customer-care research puts speech-analytics savings at 20 to 30 percent of support cost.
The pattern across the data is consistent. AI handles the structured 55 to 60 percent of inbound traffic that a services-led firm typically finds when it tags a month of tickets. The remaining 40 to 45 percent stays with the human team because the work is genuinely different.
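The blended figure is worth sanity-checking against the ranges quoted above. A minimal sketch, assuming the opening scenario's 60/40 split and, as a simplification, applying the complaint-tier deflection range to all non-structured traffic:

```python
# Blended deflection estimate from the ranges quoted in the article.
# Applying the 19-31% range to everything non-structured is an assumption
# for illustration, not a figure from the text.
mix = {
    "structured":     (0.60, 0.70, 0.90),  # resets, policy docs, payment status
    "non_structured": (0.40, 0.19, 0.31),  # complaints, disputes, judgment calls
}

low  = sum(share * lo for share, lo, _ in mix.values())
high = sum(share * hi for share, _, hi in mix.values())
print(f"Blended deflection: {low:.0%} to {high:.0%}")
```

The result, roughly 50 to 66 percent overall, is consistent with the pilot targets quoted later in the rollout plan.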
Where are the leaders actually using it?
The named-precedent evidence sits across UK and US platforms at honest price points. Click4Assistance, UK-built, serves an estimated 25 percent of UK universities in admissions; Pounce at Georgia State cut summer melt by 22 percent; Becky at Leeds Beckett lifted prospective inquiries by 40 percent. Tidio Lyro at £49 a month achieves 67 percent average resolution, with two named cases reaching 86 and 89 percent after knowledge-base tuning.
Higher up the pricing curve, Intercom Fin sits at £99 a month plus £0.99 per resolution; one named customer doubled its user base while answering 45 percent fewer email inquiries. Zendesk now charges roughly £1.50 to £2.00 per automated resolution, only counted after 72 hours of inactivity with LLM verification, which ties cost to outcome rather than capacity. HubSpot Breeze and Freshworks Freddy operate as agent-assist tools native to existing CRMs. Moneypenny’s UK voice-agent product handles intent recognition and frustration detection inline. Ada is enterprise-priced at £20k-plus annually, named here as a ceiling reference rather than a serious SME option.
The takeaway for an owner-managed firm: your platform shortlist runs at £49 to £100 a month, never £20k a year, until you are well past £15k of monthly recurring revenue from support-driven services.
Where does AI fall short in customer service today?
Three boundaries the brochure tends to omit. Sentiment-heavy and dispute-driven inquiries deflect at only 19 to 31 percent even in top-quartile setups, against 70 to 90 percent on password resets. The asymmetry is not a model upgrade away; complex complaints need empathy and discretion. NobelBiz’s 2026 research finds 60 percent of consumers still prefer human support for sensitive issues.
The second boundary is hallucination. Industry benchmarks put chatbot hallucination rates at 3 to 5 percent. In a regulated firm, a confidently incorrect refund policy is a compliance liability, not just a satisfaction problem. The third is regulatory. The CMA’s March 2026 guidance on agentic AI requires AI use disclosure, regular human-led review of agent outputs, and prompt remediation when problems arise. The FCA expects human judgment in regulated decisions. The ICO’s draft automated decision-making guidance, following the Data Use and Access Act 2025, requires the escalation logic to be auditable. The work this leaves with a human is real complaint handling, regulated advice and any decision that materially affects an individual.
If the function you are handing to AI involves trust, judgment or regulated outcomes, this is the territory the post on AI, client communication and trust erosion covers in more depth, and it is the right second read before procuring.
What does a 90-day starter rollout actually look like?
Five phases, with real numbers.

Weeks 1 to 2 (8 to 12 staff hours, no spend): tag last month’s tickets by intent, identify the structured 55 to 60 percent of volume where AI deflects, and write the top 30 questions and answers as the seed knowledge base.

Weeks 2 to 3 (20 to 30 hours, £25 to £100 a month subscription): pick the platform on volume, not features. Under 200 monthly inquiries, Tidio Lyro or Crisp; 200 to 500, Freshworks Freddy or Zendesk; already on HubSpot or Salesforce, the native agent.

Weeks 3 to 4 (15 to 20 hours): run a single-channel pilot with AI in parallel to human support, targeting 50 percent deflection in week one and 65 to 70 percent by week four as the knowledge base learns.

Weeks 4 to 6 (25 to 35 hours): integrate the CRM, expand the knowledge base, configure sentiment-routing thresholds, and expand to additional channels.

Weeks 6 to 12 (10 to 15 hours a week): document the human handoff so the agent inherits the AI’s history rather than restarting the conversation, the most cited transition failure in 2026 implementation research.
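The weeks 1 to 2 tagging exercise is a spreadsheet job, but it can equally be a few lines of script. A minimal sketch, assuming a ticket export with a hand-tagged intent column (the column name and intent labels here are hypothetical, not from any specific platform):

```python
import csv
from collections import Counter

# Intents treated as "structured" for the deflection estimate.
# These labels are illustrative; use whatever taxonomy your tagging produced.
STRUCTURED = {"password_reset", "policy_document", "payment_status", "order_tracking"}

def structured_share(path: str) -> tuple[Counter, float]:
    """Count tickets per intent and return the structured share of volume."""
    with open(path, newline="") as f:
        intents = Counter(row["intent"] for row in csv.DictReader(f))
    total = sum(intents.values())
    structured = sum(n for intent, n in intents.items() if intent in STRUCTURED)
    return intents, structured / total if total else 0.0
```

If the structured share lands near the 55 to 60 percent the benchmarks predict, the pilot deflection targets above are realistic for your mix; if it is much lower, revise the business case before spending anything.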
Total 90-day cost for a representative £2m-revenue firm with a 5-person support team: about £2,500 to £3,700, including staff time at a £30 blended rate. Ongoing cost: £49 to £100 a month. Payback: three to four months on freed support capacity alone.
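The payback claim can be reproduced back-of-envelope. In this sketch the blended rate, ticket volume and subscription band come from the article; the minutes saved per deflected ticket is an assumption chosen for illustration, so treat the output as a shape check, not a forecast:

```python
# Back-of-envelope payback on the article's figures.
BLENDED_RATE = 30.0        # £/hour, from the article
MONTHLY_TICKETS = 200      # from the opening scenario
DEFLECTION = 0.60          # structured share handled by AI
MINUTES_PER_TICKET = 15    # ASSUMED handle time freed per deflected ticket
SUBSCRIPTION = 75.0        # £/month, mid-range of the £49-£100 band

monthly_saving = MONTHLY_TICKETS * DEFLECTION * MINUTES_PER_TICKET / 60 * BLENDED_RATE
net = monthly_saving - SUBSCRIPTION

for one_off in (2500, 3700):
    print(f"£{one_off} one-off: payback {one_off / net:.1f} months")
```

Under these assumptions the £2,500 to £3,700 range pays back in roughly three to four and a half months, in line with the figure above; halve the minutes-per-ticket assumption and the payback roughly doubles, which is the sensitivity worth testing with your own numbers.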
What should you ask a vendor before you commit?
Five procurement questions separate a serious vendor from a marketing pitch. First, what is the deflection ceiling on my actual ticket mix, not your reference-case mix? Insist on a sandbox pilot with your tickets, never the vendor’s demo. Second, what is the per-resolution cost at my projected volume, with an override floor if my month spikes? Honest per-seat or per-resolution pricing is fine; opaque enterprise pricing is a flag.
Third, how does the platform document its own decisions for the ICO and the CMA, audit-trail format, retention period, customer-access workflow? Fourth, what is the handoff design when the AI escalates? Ask to see the agent-side UI, not the customer-side, because the customer’s experience is downstream of whether the agent inherits context. Fifth, does the platform support multilingual auto-detection out of the box if you have any non-English customer surface? This is now table stakes at SME pricing; do not pay extra for it.
The function has crossed the line from interesting to operational. The owner’s decision in 2026 is which two jobs to start with, which two to keep the AI well clear of, and which vendor will let you run a pilot with your own tickets before you sign anything. If you want to talk that through against your specific ticket mix, book a conversation.