What goes wrong when a chatbot gives inaccurate advice

TL;DR

When a chatbot gives inaccurate advice, the UK SME using it carries the legal, financial and regulatory consequences. A 2025 BBC and European Broadcasting Union study found that around 45% of chatbot answers to news queries contained errors, half of the UK accountants in a 2026 Dext survey had seen clients suffer financial losses from chatbot tax advice, and regulators including the ICO and FCA treat the deploying business as responsible. The practical fix is policy, testing, human review and audit logs, not better prompts.

Key takeaways

- A 2025 BBC and European Broadcasting Union study found around 45% of news queries to ChatGPT, Copilot, Gemini and Perplexity contained errors, and described the systems as dangerously self-confident when wrong.
- A 2026 Dext survey of 500 UK accountants found 50% had clients who lost money after following ChatGPT-style tax advice instead of speaking to their accountant.
- UK law and SME insurers (FSB Insurance Service, Markel) place liability for inaccurate AI outputs on the deploying business, not the model vendor.
- The ICO requires lawful basis, accuracy checks and human oversight when generative AI processes personal data, and the FCA applies the same rules to AI used in regulated activities.
- The practical fix is a written policy on what the bot can answer, a test library of 20 to 30 real questions, human review on anything client-facing, and an audit trail of what the bot said.

A practice manager I spoke with last winter showed me a printout of a chatbot transcript she had just discovered. A client had asked the bot, on her firm’s website, whether a particular travel expense was claimable. The bot had given a confident answer. The answer was wrong. The client had filed on that basis, HMRC had raised an enquiry, and the question on her desk that Monday morning was who was going to pay for the resulting penalty and the hours of remedial work. The bot was a free-tier add-on bolted onto her site by a junior staff member six months earlier. Nobody had reviewed what it was telling customers since.

That is the shape of the problem this post is about. Not whether AI is good or bad in the abstract, but what happens to a small UK service firm on the day a chatbot gives a customer inaccurate advice. The short version is that the bill, the regulatory exposure and the reputational damage all land on the business that deployed the bot, not the company that built the model. The longer version, with numbers and named regulators, is what follows.

How often do general chatbots actually get things wrong?

Often enough that you should not treat them as professional advisers without checks. A 2025 BBC and European Broadcasting Union study tested ChatGPT, Microsoft Copilot, Gemini and Perplexity on news queries and found around 45% of answers contained errors. The researchers called the systems dangerously self-confident when wrong, because they delivered bad answers in the same tone as the correct ones.

That was on straightforward factual questions, not the harder territory of UK tax or contract law. If a model makes errors on nearly half of simple news queries, the assumption that it will reliably handle a question about VAT thresholds, employment law or your refund policy needs evidence behind it, not optimism. General chatbots are trained on global data, with limited UK regulatory specificity and no awareness of last quarter’s HMRC update. They will still answer fluently, because that is what they are built to do.

What is the actual cost when this goes wrong in a UK SME?

A 2026 Dext survey of 500 UK accountants gives the clearest picture. Half of the firms surveyed had clients who suffered direct financial losses after following ChatGPT-style tax advice instead of speaking to their accountant. The errors clustered into a familiar set: incorrect business expenses claims at 46% of firms, VAT miscalculations at 41%, flawed personal tax planning at 35% and payroll and business tax errors at 34%.

The firms reported spending up to 10 hours per affected client cleaning up the mess. The clients themselves paid the cost in overpaid tax, missed allowances and HMRC penalties for failure to take reasonable care. None of those penalties get cancelled because the wrong answer came from an AI. HMRC’s own guidance is clear that “the bot said it” does not meet the reasonable care standard. The same logic applies whether the bot is sitting in a client’s browser tab or on your firm’s website.

Where does UK law actually place the liability?

On you, almost always. FSB Insurance Service, which covers a large share of UK small businesses, is explicit: if you use AI in your firm, you are responsible for what it does, even if you did not build the tool. That covers chatbots giving wrong financial or professional advice, AI-generated marketing with false claims, and customer service bots leaking private data. Markel, the professional indemnity insurer, takes the same line on advisers using AI.

The Information Commissioner’s Office adds the data protection layer. If your chatbot processes personal data, you are the controller under UK GDPR. That means a lawful basis, accuracy checks where the output affects individuals, and appropriate human oversight, especially where the answer significantly affects someone. The Article 5(1)(d) accuracy principle does not bend for AI. For regulated firms, the Financial Conduct Authority has been clear that AI used in regulated activities sits inside existing conduct rules, including the Consumer Duty. A bot that strays into personalised product suggestions can be treated as giving advice, with all the suitability obligations that brings.

Why do these bots get deployed without anyone testing them?

Because the install path is friction-free and the testing path is not. Advisory work on UK SME AI projects, echoed by National Cyber Security Centre guidance, finds that they commonly fail on planning and oversight, not technology. Chatbots get put in front of customers without anyone running real FAQs through them, staff paste outputs into client work with no review, and nobody logs what the bot has said. When a complaint arrives, the firm cannot reconstruct what the bot actually told the customer.

Harvard Business School work on AI decision making found that people tend to over-trust confident AI outputs, even when those outputs are wrong. In their experiments, adding a simple alert when the AI was operating outside its comfort zone cut user errors by nearly 50%. Few off-the-shelf chatbots include that kind of uncertainty signal by default. The result for a small firm is rarely a single dramatic failure. It is a steady drip of slightly wrong recommendations that compound into mis-priced jobs, mis-classified complaints and lost margin, with no clear evidence trail pointing at the bot as the cause.

What does a sensible response look like for an owner-managed firm?

A short written policy, a real test library, a human review step, and an audit log. Antek Automation, writing on UK SME chatbot deployment, recommends building a library of 20 to 30 real customer questions covering pricing, service areas, availability and edge cases, then running each one through the bot in different wording. Document the responses, look for inconsistencies, and only go live when the bot consistently meets your standard.
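For firms with someone comfortable running a script, the test library can be sketched roughly as follows. This is an illustration under assumptions, not a tool from any of the sources: `ask_bot` stands in for however your chatbot is actually called, `QuestionCase`, `must_contain` and `fake_bot` are hypothetical names, and checking for required phrases is only a crude first filter. A person still needs to read the recorded answers.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class QuestionCase:
    """One real customer question, asked in several wordings."""
    topic: str
    wordings: list[str]
    must_contain: list[str]  # phrases a correct answer has to include


@dataclass
class CaseResult:
    topic: str
    answers: dict[str, str] = field(default_factory=dict)  # wording -> bot reply
    failures: list[str] = field(default_factory=list)      # wordings that failed

    @property
    def passed(self) -> bool:
        return not self.failures


def run_test_library(
    ask_bot: Callable[[str], str], cases: list[QuestionCase]
) -> list[CaseResult]:
    """Ask every wording of every case and record which replies miss the standard."""
    results = []
    for case in cases:
        result = CaseResult(topic=case.topic)
        for wording in case.wordings:
            reply = ask_bot(wording)
            result.answers[wording] = reply  # keep the full reply for human review
            lowered = reply.lower()
            if not all(phrase.lower() in lowered for phrase in case.must_contain):
                result.failures.append(wording)
        results.append(result)
    return results


def fake_bot(question: str) -> str:
    # Hypothetical stand-in for the real chatbot; deliberately inconsistent.
    if "open" in question.lower():
        return "We are open 9am to 5pm, Monday to Friday."
    return "Please call us for details."


cases = [
    QuestionCase(
        topic="opening hours",
        wordings=["When are you open?", "What are your hours?"],
        must_contain=["9am", "5pm"],
    )
]
results = run_test_library(fake_bot, cases)
for r in results:
    status = "PASS" if r.passed else "FAIL"
    print(f"{status} {r.topic}: {len(r.failures)} of {len(r.answers)} wordings failed")
```

Here the stand-in bot answers one wording correctly and deflects the other, which is exactly the inconsistency the test library exists to surface before go-live.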

Then decide in writing what the bot is allowed to answer. Opening hours, process steps, product features, yes. Tax, legal, regulated financial advice, no, with the bot configured to refuse or escalate. Anything client-facing or contractually material gets reviewed by a human before it goes out. Conversation logging stays on. You speak to your broker about how AI use affects your professional indemnity and cyber cover, and you update your client terms to reflect that AI is used as drafting support, with final responsibility resting with the firm. None of this is exotic. It is the same basic discipline you would apply to a new junior hire who is fluent, fast, and wrong about 45% of the time.
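The refuse-or-escalate rule and the conversation log can be sketched in a few lines. Treat this as a minimal illustration under assumptions: the keyword lists, `ESCALATION_REPLY`, `answer_with_policy` and `fake_bot` are all hypothetical, and whole-word keyword matching is crude, so it will miss rephrased questions. Where your chatbot platform offers its own topic restrictions and logging, those are likely the better place to enforce the policy.

```python
import io
import json
import re
from datetime import datetime, timezone
from typing import Callable, Optional, TextIO

# Hypothetical keyword lists for topics the written policy puts off-limits.
BLOCKED_KEYWORDS = {
    "tax": {"tax", "vat", "hmrc", "payroll"},
    "legal": {"contract", "legal", "tribunal"},
    "regulated finance": {"pension", "mortgage", "investment"},
}

ESCALATION_REPLY = (
    "I can't advise on that. I've passed your question to a member of the team."
)


def blocked_topic(question: str) -> Optional[str]:
    """Return the first off-limits topic the question touches, if any."""
    words = set(re.findall(r"[a-z]+", question.lower()))  # whole words only
    for topic, keywords in BLOCKED_KEYWORDS.items():
        if words & keywords:
            return topic
    return None


def answer_with_policy(
    question: str, ask_bot: Callable[[str], str], log: TextIO
) -> str:
    """Answer or escalate according to policy, and log the exchange either way."""
    topic = blocked_topic(question)
    if topic is None:
        reply, action = ask_bot(question), "answered"
    else:
        reply, action = ESCALATION_REPLY, "escalated"
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "reply": reply,
        "action": action,
        "topic": topic,
    }
    log.write(json.dumps(entry) + "\n")  # one JSON record per line
    return reply


def fake_bot(question: str) -> str:
    # Hypothetical stand-in for the real chatbot.
    return "We are open 9am to 5pm, Monday to Friday."


audit_log = io.StringIO()  # in practice, an append-only file or database table
reply_tax = answer_with_policy("Can I claim VAT on a train ticket?", fake_bot, audit_log)
reply_hours = answer_with_policy("What are your opening hours?", fake_bot, audit_log)
entries = [json.loads(line) for line in audit_log.getvalue().splitlines()]
```

Logging escalations as well as answers matters: when a complaint arrives, the record shows both what the bot said and what it declined to say.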

If you want to talk through where AI sits sensibly in your business and where it does not, book a conversation.

Sources

- Bersin, J. (2025). BBC and European Broadcasting Union study on AI news query errors. Used for the 45% error rate across major chatbots. https://joshbersin.com/2025/10/bbc-finds-that-45-of-ai-queries-produce-erroneous-answers/
- Accountancy Age (2026). AI slop in the books, the rising cost of fixing chatbot errors (Dext survey of 500 UK accountants). Used for the 50% loss figure and the expense, VAT, tax and payroll error breakdown. https://accountancyage.com/2026/01/08/ai-slop-in-the-books-the-rising-cost-of-fixing-chatbot-errors/
- FSB Insurance Service (2024). If AI goes wrong, who's liable? Used for the UK SME liability and insurance position, including the Markel professional indemnity guidance. https://fsb-insurance-service.com/fsb-insurance-service-blog/cyber/if-ai-goes-wrong-whos-liable/
- Information Commissioner's Office. Generative AI and data protection guidance. Used for the controller responsibility, lawful basis and accuracy obligations under UK GDPR. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/generative-ai/
- Information Commissioner's Office. AI and data protection guidance. Used for the human oversight requirement and the Article 5(1)(d) accuracy principle. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/ai-and-data-protection/
- Financial Conduct Authority (2024). Our approach to artificial intelligence. Used for the regulated-activity and Consumer Duty implications when AI gives personalised product suggestions. https://www.fca.org.uk/news/speeches/our-approach-artificial-intelligence
- Harvard Business School Working Knowledge. How decision makers can catch generative AI's bad advice. Used for the over-trust pattern and the finding that simple uncertainty alerts cut user errors by nearly half. https://www.library.hbs.edu/working-knowledge/how-decision-makers-can-catch-gen-ais-bad-advice
- Antek Automation. AI chatbot accuracy issues, what UK businesses need to know. Used for the practical test library approach and vendor selection criteria (documented accuracy, audit trail, override). https://blog.antekautomation.com/ai-chatbot-accuracy-issues-what-uk-businesses-need-to-know/
- Shoosmiths. AI and professional negligence, where does liability lie. Used for the reasonable professional standard under UK negligence law when AI outputs are used in client work. https://www.shoosmiths.co.uk/insights/articles/ai-and-professional-negligence-where-does-liability-lie
- National Cyber Security Centre. AI advice for organisations. Used for the testing, monitoring and fallback recommendations when deploying AI systems. https://www.ncsc.gov.uk/collection/ai-advice-for-organisations

Frequently asked questions

Who is legally responsible if my chatbot gives a customer wrong advice?

You are. FSB Insurance Service and professional indemnity insurer Markel are explicit: if you deploy AI in your business, you are responsible for what it does, even if you did not build it. That covers wrong financial or professional advice, misleading marketing copy generated by AI, and customer service bots leaking private data. The model vendor's terms typically push liability to you, and UK consumer protection rules already apply to AI-generated content.

What practical errors are accountants seeing from ChatGPT-style tax advice?

The 2026 Dext survey of 500 UK accountants reported incorrect business expenses claims (46% of firms), VAT miscalculations (41%), flawed personal tax planning (35%) and payroll and business tax errors (34%). Accountants said they spent up to 10 hours per client correcting the mess, and the errors showed up as overpaid tax, missed allowances and HMRC penalties for failure to take reasonable care.

What is the minimum I should do before letting a chatbot answer customers?

Write a short policy listing what the bot can and cannot answer, especially anything tax, legal or regulated. Build a test library of 20 to 30 real customer questions and run them through the bot, scoring accuracy and consistency. Keep a human review step for anything client-facing. Turn on conversation logging so you have an audit trail. Speak to your insurer about how AI use affects your professional indemnity and cyber cover.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

