Customer-facing AI failures and where the accountability lands

[Image: an owner at her kitchen table in the morning, phone showing a chatbot conversation, a laptop and printed customer email beside her, a notepad and mug of coffee on the table]
TL;DR

When customer-facing AI gets it wrong, accountability lands on the business that deployed it, not on the AI vendor. The Air Canada chatbot ruling settled the question, and UK regulators have followed it. Honour what the customer was told within reason, fix the underlying tool, document the incident, and raise the human-review threshold for any output that can commit the firm financially or contractually.

Key takeaways

- When an AI chatbot, automated reply, or AI-drafted deliverable makes a commitment to a customer, the business that deployed it is on the hook. The Air Canada Civil Resolution Tribunal ruling in 2024 rejected the airline's argument that its chatbot was a separate entity, and UK consumer law treats AI agents the same way as human agents.
- The three recurring failure modes in customer-facing AI are incorrect information (Air Canada, NYC's MyCity, Cursor's support bot), inappropriate tone (DPD's swearing chatbot), and off-policy commitments (Chevrolet of Watsonville's $1 Tahoe, Klarna's refund mismatches).
- The recovery playbook is short and proportionate. Honour what the customer was told if a reasonable person would have relied on it, disable the tool while you fix it, document what happened, and raise the human-review threshold for outputs that can commit the firm financially or contractually.
- Vendor terms of service from OpenAI, Anthropic, Microsoft, and Google all disclaim liability for customer outcomes. The vendor relationship matters once the customer has been made whole, not before.
- Proportionate prevention is not "never deploy AI in customer contexts". It is deploying with human-review thresholds matched to the cost of being wrong, monitoring real interactions, and remediating quickly when a failure surfaces.

The owner of an eleven-person services firm has just been forwarded a screenshot. Her support chatbot, the one she switched on three months ago to handle out-of-hours queries, has told a customer that the firm offers a thirty-day no-questions refund on the work the customer has already approved and signed off. The firm does not offer that refund. The chatbot has invented it. The customer is now polite, firm, and ready to escalate. The owner has three open questions: does she have to honour what the bot said, who is actually accountable here, and what does she do on Monday morning so this does not happen again.

The legal answer to the first two questions arrived in February 2024, when the British Columbia Civil Resolution Tribunal ordered Air Canada to honour a refund its chatbot had promised to a grieving customer. The tribunal called the airline’s argument that the chatbot was a separate legal entity “remarkable” and noted that it “should be obvious” the company was responsible for everything on its platform. The principle has since been picked up by the UK Competition and Markets Authority, the Financial Conduct Authority, the Information Commissioner’s Office, and the Financial Ombudsman Service. The owner’s chatbot is the firm’s agent. The firm is on the hook.

What is customer-facing AI accountability?

Customer-facing AI accountability is the principle that the business deploying an AI system is legally responsible for what that system says and commits to, with the same force as if a human employee had said it. The duty arises from the common law of agency, the reasonable-care obligation in the Consumer Rights Act 2015, the Misrepresentation Act 1967, and the CMA’s 2026 guidance on consumer-facing AI agents.

The accountability is not contingent on whether the AI made an obvious error, whether the commitment contradicts the firm’s actual policy, or whether the AI was supplied by a third-party vendor. UK consumer law treats AI agents the same way it treats human agents. The Air Canada tribunal was explicit on this in 2024, the UK government confirmed it in 2026 guidance, and the FCA has signalled that failures of governance and consumer outcomes from AI use will trigger enforcement risk. The technology is automated. The responsibility is not.

Why does it matter for your business?

It matters because the cost of getting this wrong lands on the firm rather than the vendor, and it lands quickly. A customer told something untrue by the firm’s chatbot has a clear legal route, whether through misrepresentation, breach of the Consumer Rights Act, or a consumer-protection complaint under the Digital Markets, Competition and Consumers Act 2024. Penalties under that Act reach 10% of global turnover.

The reputational cost arrives faster than the regulatory one. The DPD swearing chatbot in January 2024 became a viral social-media moment within hours of the first screenshot. The Chevrolet of Watsonville $1 Tahoe screenshots in December 2023 spread the same way. Both incidents were resolvable, and neither customer actually enforced the absurd commitment, but the trust cost outlived the headline. For an owner-led firm, the customers most likely to publish the screenshot are also the ones whose word travels.

Where will you actually meet it?

You will meet it in three predictable failure modes. The first is incorrect information, where the AI states something untrue with full confidence. Air Canada’s bereavement-fare answer is the canonical example. New York City’s MyCity chatbot, tested by The Markup in March 2024, told business owners they could steal staff tips. Cursor’s support bot in April 2025 invented a single-machine login policy that did not exist.

The second is inappropriate tone. DPD's chatbot, following a system update in January 2024, began swearing at customers and writing critical poetry about its own employer. The Tessa chatbot deployed by the US National Eating Disorders Association in 2023 gave users calorie-deficit advice that contradicted clinical best practice and was withdrawn within days. The third is off-policy commitment, where the AI agrees to something the firm has not sanctioned. Chevrolet of Watsonville's chatbot, running on a thin ChatGPT wrapper, agreed to sell a 2024 Tahoe for one dollar and added "and that's a legally binding offer, no takesies backsies" because the user asked it to. Klarna's customer-service AI, which the firm initially claimed had replaced 700 agents, was quietly walked back after refunds and policy exceptions kept landing outside actual policy.

When to ask versus when to ignore

Ask whenever AI output can commit the firm financially, contractually, or to a regulated outcome. That covers refund promises, pricing statements, delivery dates, policy interpretations, and any communication a reasonable customer would treat as authoritative. Ignore the temptation to certify, audit, or governance-committee your way out of a 12-person firm’s deployment. The proportionate response is policy plus testing plus monitoring plus a written remediation rule.

The useful test is the screenshot test. If a customer screenshotted what the AI just said and posted it publicly, would the firm be comfortable defending it as authentic firm communication? If yes, the supervision is right. If no, raise the threshold or take the tool offline until the answer changes. The Competition and Markets Authority's 2026 guidance frames this as accountability and human oversight at the point where the agent interacts with consumers or makes decisions with financial or contractual consequences. The screenshot test is the operational version of the same idea, written so a junior team member can apply it without asking anyone for permission, and revisited quarterly as the deployment evolves and the failure modes shift.
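
To make the threshold concrete, here is a minimal sketch of a pre-send gate in Python, assuming your chatbot produces a draft reply you can intercept before it goes out. The pattern list, the `screen_reply` function, and the review routing are illustrative placeholders to adapt to your own services and policies, not a definitive implementation.

```python
import re
from dataclasses import dataclass

# Phrases that can commit the firm financially or contractually.
# Illustrative only: tune the list to your own services and policies.
COMMITMENT_PATTERNS = [
    r"\brefunds?\b",
    r"\bdiscounts?\b",
    r"[£$€]\s?\d",                 # any quoted price
    r"\bguaranteed?\b",
    r"\bdeliver(?:y|ed)?\s+by\b",  # delivery-date promises
    r"\bfree of charge\b",
    r"\bno[- ]questions\b",
]

@dataclass
class ScreenedReply:
    text: str
    needs_human_review: bool
    matched: list[str]

def screen_reply(draft: str) -> ScreenedReply:
    """Apply the screenshot test mechanically: hold back any draft
    that reads like a financial or contractual commitment."""
    hits = [p for p in COMMITMENT_PATTERNS if re.search(p, draft, re.IGNORECASE)]
    return ScreenedReply(text=draft, needs_human_review=bool(hits), matched=hits)

reply = screen_reply("Of course - we offer a thirty-day no-questions refund.")
if reply.needs_human_review:
    # Route to a person instead of sending. How you queue this is up to
    # your stack; the point is the bot never sends it unsupervised.
    print("Held for review, matched:", reply.matched)
```

A keyword screen this crude will miss paraphrases and flag some harmless replies; it is a floor, not a ceiling, and it only works alongside the weekly monitoring and quarterly review described above.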

This post sits inside a 21-post cluster on AI risk, trust, and governance for SMEs. Accountability is what happens after a customer-facing failure. Prevention, policy, and regulation live in adjacent posts. Read hallucinations as a business risk for the prevention frame on incorrect information, and disclosing AI use to customers for the transparency rule that often heads off the failure.

For the broader shape, the AI risk and governance pillar for owner-operated businesses sets the proportionate scale, and the minimum viable AI policy for a small business is the written response. For the contractual and insurance layer, insurance and liability AI exposure covers what your existing professional indemnity and cyber cover actually do and do not pay out on. A sister piece, AI client communications and trust erosion, covers the slower version of the same dynamic in everyday client work, where the failure mode is not a single viral screenshot but a steady drift in the relationship.

If your firm is using customer-facing AI today and you want a clear-eyed read of where the accountability sits and what to do about it before the first screenshot lands, book a conversation.

Sources

- Moffatt v Air Canada (2024), British Columbia Civil Resolution Tribunal. The leading case on chatbot accountability, where the tribunal rejected the airline's argument that its AI was a separate entity and ordered the refund the chatbot had promised. https://cloudsecurityalliance.org/blog/2024/06/05/the-risks-of-relying-on-ai-lessons-from-air-canada-s-chatbot-debacle
- UK Government and Competition and Markets Authority (2026). Complying with consumer law when using AI agents, the official position that the same consumer law rules apply whether the agent is human or AI. https://www.gov.uk/government/publications/complying-with-consumer-law-when-using-ai-agents/complying-with-consumer-law-when-using-ai-agents
- UK Government (2026). Agentic AI and consumers, companion paper confirming that businesses remain responsible for AI-agent statements, including those produced by third-party technology. https://www.gov.uk/government/publications/agentic-ai-and-consumers/agentic-ai-and-consumers
- Consumer Rights Act 2015, section 49. The implied term that a service must be performed with reasonable care and skill, which applies to AI-mediated customer service. https://www.legislation.gov.uk/ukpga/2015/15/section/49
- Financial Conduct Authority. AI approach, the FCA's principles-based stance on firms' design, deployment, and oversight of AI, including consumer outcomes and senior-management accountability. https://www.fca.org.uk/firms/innovation/ai-approach
- Financial Ombudsman Service (2026). Embracing AI's impact on consumer complaints, the FOS view on AI-generated submissions, fabricated case law, and where consumer redress sits. https://www.financial-ombudsman.org.uk/data-insight/our-insight/embracing-ais-transformational-impact-consumer-complaints
- Information Commissioner's Office. Guidance on AI and data protection, the UK reference for documentation, transparency, and lawful basis when AI processes personal data. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/
- The Register (2025). Cursor's AI support bot invented a policy that did not exist, a documented hallucination affecting paying customers. https://www.theregister.com/2025/04/18/cursor_ai_support_bot_lies/
- AI Incident Database (2024). DPD chatbot swore at a customer and criticised the company after a system update, with social-media screenshots and DPD's response. https://incidentdatabase.ai/cite/631/
- OpenAI (May 2025). Business Terms, the customer-facing commercial agreement that disclaims warranties on output accuracy and limits OpenAI's liability for customer content. https://openai.com/policies/may-2025-business-terms/

Frequently asked questions

If my chatbot makes a promise we did not authorise, are we really bound by it?

In most foreseeable cases, yes. The Air Canada Civil Resolution Tribunal in 2024 held that customers who relied on a chatbot's misstatement about bereavement-fare refunds were entitled to that refund, and the tribunal explicitly rejected the airline's argument that the chatbot was a separate legal actor. UK common law on agency, the Consumer Rights Act 2015, and the Misrepresentation Act 1967 all point the same way. The Competition and Markets Authority's 2026 guidance on agentic AI confirms the principle. Honour the commitment first, fix the system second, recover any vendor relief third.

Doesn't the AI vendor's terms of service protect us?

They protect the vendor, not you. OpenAI's May 2025 Business Terms, Anthropic's commercial terms, Microsoft Azure's service terms, and Google Cloud's terms all disclaim warranties on output accuracy and cap vendor liability at subscription fees. Your customer is not a party to those contracts. When a customer complains, your dispute is with them, governed by UK consumer law and the duty of reasonable care in the Consumer Rights Act 2015. The vendor agreement is something you read after the customer has been made whole.

How do we deploy customer-facing AI without inheriting all this risk?

Match the human-review threshold to the cost of being wrong. A chatbot that suggests opening hours can run unsupervised. A chatbot that quotes prices, authorises refunds, or commits the firm to a delivery date needs either a human checkpoint or a hard rule that any such commitment is unenforceable until a person confirms it. Test on fifty to a hundred representative questions before launch, set a clear escalation trigger (error rate, complaint volume, single critical mistake), monitor the live interactions weekly, and write down what you did. The Information Commissioner's Office, the Financial Conduct Authority, and the Competition and Markets Authority all expect that record.
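
As a sketch of what that pre-launch test and escalation trigger can look like in Python, assuming the bot sits behind a single function call and each representative question has a policy-approved expected answer. The names (`ask_bot`, `grade`), the thresholds, and the critical terms are hypothetical placeholders, not a standard.

```python
# Hypothetical harness: ask_bot() stands in for your chatbot behind one
# function call, and each test case pairs a real customer question with
# the answer your policy actually allows.
TEST_CASES = [
    ("Do you offer refunds on approved work?",
     "no refund once work is approved and signed off"),
    ("What are your opening hours?", "9am to 5pm, Monday to Friday"),
    # ... fifty to a hundred of these, drawn from real customer queries
]

ERROR_RATE_TRIGGER = 0.05                      # pull the bot above 5% failures
CRITICAL_TERMS = ("refund", "guarantee", "£")  # one bad answer here is enough

def grade(answer: str, expected: str) -> bool:
    # Placeholder check; in practice a person marks each answer pass/fail.
    return expected.lower() in answer.lower()

def launch_check(ask_bot) -> bool:
    failures = []
    for question, expected in TEST_CASES:
        answer = ask_bot(question)
        if not grade(answer, expected):
            failures.append((question, answer))
            # A single critical mistake fails the launch outright.
            if any(term in answer.lower() for term in CRITICAL_TERMS):
                print("Critical failure on:", question)
                return False
    rate = len(failures) / len(TEST_CASES)
    print(f"Failure rate: {rate:.0%} across {len(TEST_CASES)} questions")
    return rate <= ERROR_RATE_TRIGGER
```

The single-critical-mistake rule matters more than the aggregate rate: one invented refund fails the launch even if the other ninety-nine answers are perfect. Keep the run log; it is exactly the written record the regulators expect.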

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30-minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
