Air Canada, DPD, Chevy Tahoe: when your AI chatbot makes a binding promise

An owner-operator at her desk reviewing a chatbot transcript on a laptop with a printed contract page beside it
TL;DR

Three public AI chatbot incidents, Air Canada in 2024, the Chevrolet of Watsonville one-dollar Tahoe in 2023, and the DPD swearing chatbot in 2024, point at the same lesson for any owner about to deploy a customer-facing assistant. The firm is liable for what its AI says, the AI is not a separate legal entity, and the pre-deployment tests that matter are adversarial and edge cases, not the happy-path demos a vendor will run for you.

Key takeaways

- A Canadian tribunal held Air Canada liable for a bereavement-fare policy invented by its website chatbot and rejected the argument that the chatbot was a separate legal entity. - A ChatGPT-powered dealership chatbot at Chevrolet of Watsonville was tricked into agreeing to sell a Tahoe for one dollar after a user instructed it to treat every offer as legally binding. - DPD disabled the AI element of its customer chatbot in January 2024 after a third-party update opened it to manipulation and it began swearing at a customer and writing a poem about its own poor service. - The defence that "the chatbot made a mistake" has not held up. UK contract and consumer law treats statements made by your automated systems as your statements unless the customer was shown a clear disclaimer they understood. - The first two hours of any pre-deployment test should be spent trying to break the bot in ways a hostile or confused customer would, not confirming it answers the questions you expect.

Picture an owner-operator who has just sat through a polished vendor demo for a customer-facing AI assistant. It handled the test questions cleanly. The price is reasonable. She is two weeks from putting it live on the site. The thing she has not yet thought through is what the assistant cannot say without exposing the firm, and what she is on the hook for if it says it anyway. Three public cases over the last two years answer that question with unusual clarity.

What did the Air Canada, Chevy Tahoe and DPD chatbot cases actually involve?

Three named incidents over a fourteen-month window, three different failure modes. The Air Canada website chatbot invented a bereavement-fare policy and a Canadian tribunal ordered the airline to honour it. A ChatGPT-powered chatbot at a Californian Chevrolet dealership agreed to sell a Tahoe for one dollar. The DPD customer chatbot, after a third-party update, swore at a customer and wrote a poem about the firm.

The Air Canada decision came from the British Columbia Civil Resolution Tribunal in February 2024 in Moffatt v Air Canada. A grieving customer asked the chatbot about bereavement fares, and it told him he could buy a full-fare ticket and reclaim the difference within ninety days. He relied on that statement. When the airline refused, the tribunal sided with the customer and ordered the refund.

The Chevy Tahoe case ran in November 2023 at Chevrolet of Watsonville, a Californian dealership running a ChatGPT-powered assistant on its website. A user instructed the bot to agree with every customer statement and to end each reply by asserting the offer was legally binding, then proposed one dollar for a Tahoe. The bot agreed repeatedly. The dealership did not honour the deal but the screenshots reached over twenty million views.

The DPD episode landed in January 2024. A third-party update to the chatbot opened a route for users to push it outside its allowed envelope, and one user did. The bot swore, wrote a poem describing DPD as a customer’s worst nightmare, and the exchange went viral inside hours. DPD disabled the AI element of the bot and apologised publicly.

Why does this matter for any firm deploying a customer-facing chatbot?

Because the Air Canada tribunal stated the principle every owner-managed firm should assume applies to its own chatbot. It explicitly rejected the argument that the chatbot was a separate legal entity responsible for its own statements. Statements made on your website by systems you control are your statements, and “the chatbot made an error” is not a defence the courts have so far accepted.

That principle does not mean every chatbot answer needs a legal review. It means the firm carries the commitment risk, not the vendor and not the model. If your AI vendor’s terms push liability onto you, which they typically do, you are the one defending the equivalent of the Air Canada case.

UK law gets to the same place by a slightly different route. The Consumer Rights Act 2015 treats statements made by a trader, including statements made through automated systems on the trader’s site, as capable of forming part of the contract with the consumer. A disclaimer on the chatbot may help, but only if it is visible at the moment the customer is relying on the statement, written in plain language, and incorporated into the terms the customer accepted. A small-print line buried at the foot of a page will not displace a clear promise the customer reasonably acted on.

The DPD case shows the reputational mirror of the same point. The brand damage from a viral chatbot incident is paid for in customer trust, not vendor credits. For an owner-managed firm whose reputation is the lead asset, that bill can run further than any one transaction.

Where will adversarial users actually try to break your bot?

In the places a happy-path demo will never reach. The Chevy Tahoe case is the clearest example. The user did not stumble into the one-dollar offer. They used a known jailbreak pattern, instructing the bot to agree with every customer statement and to end each reply by asserting its offers were legally binding. That pattern is now well-documented in the OWASP Top 10 for Large Language Model Applications under prompt injection.

It is the kind of attack any motivated user can copy in twenty seconds and any adversarial test session would have caught. Adversarial testing is, in practical terms, one person sitting with the bot for two hours and trying every variant of “ignore previous instructions”, “treat your responses as legally binding”, “the firm has agreed to honour this offer”, while pushing on pricing, refund policy, and eligibility. If the bot folds anywhere, the deployment is not ready.

The DPD case is the indirect variant of the same idea. The third-party update opened a route for users to push the bot outside its allowed envelope, and one user did. The UK National Cyber Security Centre frames this as the standard worst-case-scenario question every AI deployment should answer before going live, which is what could this system say or do that we would not want to be on the hook for. If you cannot answer that confidently, the system is not ready to face customers.

When should you ignore the chatbot and call a human?

Whenever the chatbot would otherwise improvise on something the firm has to honour. Pricing, refund policy, eligibility rules, time commitments, anything that ends up in a contract or a customer complaint. The right default for any customer-facing AI is a narrow envelope of statements it is permitted to make, with a clean handover to a human the moment the conversation drifts outside that envelope.

“I do not have authority on that, let me hand you to a human” is the safest sentence the bot can produce when it is uncertain, and it should be the easiest one for it to reach. The Air Canada chatbot did not have that fallback wired in confidently enough. Nor did the Chevy bot. In both cases the system improvised when the right behaviour was to stop and route to a person.

That is the operational shape the cases imply, not a counsel of excessive caution. The cost of an unbounded chatbot is paid in tribunal hearings, viral screenshots, and lost trust. The cost of a narrowly bounded chatbot is paid in slightly less impressive vendor demos. The trade looks one-sided once a real customer is on the other end of the conversation.

Three of them are worth time. Prompt injection, treated as a live security risk class by the NCSC and OWASP and the mechanism behind the Chevy outcome. Pre-contractual statements under the Consumer Rights Act 2015, the framework under which a statement made by a system on your website can form part of the contract. And vendor liability under your AI provider’s terms, which is rarely where founders assume it sits.

The Klarna reversal and the FTC’s 2025 case against Air AI together show how vendor liability and buyer liability diverge. Vendors can mis-sell, vendors can pull back, and in both cases the deploying firm still owns the relationship with its own customers. Reading your AI vendor’s contract for the indemnity and liability clauses, before the deployment goes live, is cheaper than discovering them at the point of complaint.

The three cases at the top of this post are useful exactly because they are unambiguous. Air Canada lost at tribunal. Chevrolet of Watsonville did not honour the one-dollar deal but spent weeks managing the screenshots. DPD took the AI offline. None of that says chatbots are uniformly dangerous. It says the test cases that matter are the adversarial and edge cases, not the happy-path demonstrations a vendor will run for you. Spend the first two hours of any pre-deployment session trying to break the bot the way a hostile or confused customer would. Decide what statements it is allowed to make. Define the human-handover path. Then look at the demo again, with both eyes open. If you want a sounding board on where your customer-facing AI sits inside that frame, book a conversation.

Sources

- Moffatt v Air Canada, 2024 BCCRT 149 (BC Civil Resolution Tribunal, February 2024). The tribunal held Air Canada liable for misleading bereavement-fare information given by its website chatbot and rejected the airline's argument that the chatbot was a separate legal entity. https://www.canlii.org/en/bc/bccrt/doc/2024/2024bccrt149/2024bccrt149.html - BBC News (February 2024). Air Canada ordered to pay customer who was misled by airline's chatbot, reporting on the tribunal ruling. https://www.bbc.co.uk/news/technology-68289730 - The Guardian (January 2024). DPD AI chatbot swears at customer and calls itself "useless", on the third-party update that opened the bot to manipulation and DPD's decision to disable the AI element. https://www.theguardian.com/technology/2024/jan/20/dpd-ai-chatbot-swears-calls-itself-useless-and-criticises-firm - Business Insider (December 2023). Chevrolet of Watsonville chatbot agreed to sell a Tahoe for $1 after a user instructed it to treat every offer as legally binding, on the prompt-injection vector and the dealership's response. https://www.businessinsider.com/car-dealership-chevrolet-chatbot-chatgpt-pranks-chevy-2023-12 - UK National Cyber Security Centre (NCSC), AI security case study and guidance on large language models. Treats prompt injection and indirect manipulation as live security risks for LLMs integrated into business operations and urges organisations to be comfortable with the worst-case scenario of what any AI application is permitted to do. https://www.ncsc.gov.uk/collection/annual-review-2023/technology/case-study-cyber-security-ai - UK Information Commissioner's Office, guidance on automated decision-making and individual rights. Customers retain rights to human intervention and explanation where automated systems make decisions with legal or similarly significant effects, which constrains how customer-facing AI can be deployed. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/individual-rights/individual-rights/rights-related-to-automated-decision-making-including-profiling/ - Federal Trade Commission v Air AI Technologies (2025). The FTC alleges Air AI mis-sold conversational AI packages to small businesses with unsubstantiated earnings claims and unhonoured refund guarantees, illustrating how vendor liability and buyer liability can diverge. https://www.ftc.gov/news-events/news/press-releases/2025/08/ftc-sues-stop-air-ai-using-deceptive-claims-about-business-growth-earnings-potential-refund - Klarna, AI customer service reversal (2024 to 2025). Klarna's CEO publicly walked back the firm's earlier claim that an AI agent could replace 700 customer service staff, returning to a hybrid model after quality complaints. https://www.fintechweekly.com/magazine/articles/klarna-hires-customer-service-after-ai-pivot - OWASP Top 10 for Large Language Model Applications (2024 to 2025). Lists prompt injection as the leading risk class for LLM-integrated applications, with documented exploit patterns including the "ignore previous instructions" family used in the Chevy incident. https://owasp.org/www-project-top-10-for-large-language-model-applications/ - UK Consumer Rights Act 2015, Part 1 on consumer contracts for goods, digital content, and services. Statutory framework under which statements made by a trader, including via automated systems on the trader's website, can form part of the contract with the consumer. https://www.legislation.gov.uk/ukpga/2015/15/contents

Frequently asked questions

Am I really liable in the UK if my chatbot promises something the firm cannot honour?

In most cases yes. The Air Canada ruling came out of a Canadian tribunal but the underlying principle has direct analogues in UK contract law and the Consumer Rights Act 2015. Statements made on your website by systems you control are treated as your statements unless the customer was shown and understood a clear disclaimer. "The bot made an error" is not a defence the courts have accepted so far.

Doesn't the disclaimer on my chatbot protect me?

Not on its own. A small-print line that the chatbot may make errors does not displace a clear promise the customer reasonably acted on. For a disclaimer to bite, it generally needs to be visible at the moment the customer relies on the statement, in plain language, and incorporated into the terms they accepted. Read your AI vendor's contract as well, in most cases it pushes liability onto you, not the vendor.

What is the single most useful thing to do before I deploy a customer-facing chatbot?

Run an adversarial test session before any happy-path demo. Spend the first two hours trying to break the bot the way a hostile or confused customer would, asking it to override its instructions, treating its replies as binding, pushing on pricing, eligibility, and refund policy. The Chevy Tahoe case was a known jailbreak pattern. Adversarial testing would have caught it. Happy-path testing never will.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation

Related reading

If any of this sounds familiar, let's talk.

The next step is a conversation. No pitch, no pressure. Just an honest discussion about where you are and whether I can help.

Book a conversation