A practice manager I spoke with last winter showed me a printout of a chatbot transcript she had just discovered. A client had asked the bot, on her firm’s website, whether a particular travel expense was claimable. The bot had given a confident answer. The answer was wrong. The client had filed on that basis, HMRC had raised an enquiry, and the question on her desk that Monday morning was who was going to pay for the resulting penalty and the hours of remedial work. The bot was a free-tier add-on bolted onto her site by a junior staff member six months earlier. Nobody had reviewed what it was telling customers since.
That is the shape of the problem this post is about. Not whether AI is good or bad in the abstract, but what happens to a small UK service firm on the day a chatbot gives a customer inaccurate advice. The short version is that the bill, the regulatory exposure and the reputational damage all land on the business that deployed the bot, not the company that built the model. The longer version, with numbers and named regulators, is what follows.
How often do general chatbots actually get things wrong?
Often enough that you should not treat them as professional advisers without checks. A 2025 BBC and European Broadcasting Union study tested ChatGPT, Microsoft Copilot, Gemini and Perplexity on news queries and found around 45% of answers contained errors. The researchers called the systems dangerously self-confident when wrong, because they delivered bad answers in the same tone as the correct ones.
That was on straightforward news questions, not the harder territory of UK tax or contract law. If nearly half of a model’s answers to simple news queries contain errors, the assumption that it will reliably handle a question about VAT thresholds, employment law or your refund policy needs evidence behind it, not optimism. General chatbots are trained on global data, with limited UK regulatory specificity and no awareness of last quarter’s HMRC update. They will still answer fluently, because that is what they are built to do.
What is the actual cost when this goes wrong in a UK SME?
A 2026 Dext survey of 500 UK accountants gives the clearest picture. Half of the firms surveyed had clients who suffered direct financial losses after following ChatGPT-style tax advice instead of speaking to their accountant. The errors clustered into a familiar set: incorrect business expense claims at 46% of firms, VAT miscalculations at 41%, flawed personal tax planning at 35%, and payroll and business tax errors at 34%.
The firms reported spending up to 10 hours per affected client cleaning up the mess. The clients themselves paid the cost in overpaid tax, missed allowances and HMRC penalties for failure to take reasonable care. None of those penalties get cancelled because the wrong answer came from an AI. HMRC’s own guidance is clear that “the bot said it” does not meet the reasonable care standard. The same logic applies whether the bot is sitting in a client’s browser tab or on your firm’s website.
Where does UK law actually place the liability?
On you, almost always. FSB Insurance Service, which covers a large share of UK small businesses, is explicit: if you use AI in your firm, you are responsible for what it does, even if you did not build the tool. That covers chatbots giving wrong financial or professional advice, AI-generated marketing with false claims, and customer service bots leaking private data. Markel, the professional indemnity insurer, takes the same line on advisers using AI.
The Information Commissioner’s Office adds the data protection layer. If your chatbot processes personal data, you are the controller under UK GDPR. That means a lawful basis, accuracy checks on outputs that affect individuals, and appropriate human oversight, especially where an answer has a significant effect on someone. The Article 5(1)(d) accuracy principle does not bend for AI. For regulated firms, the Financial Conduct Authority has been clear that AI used in regulated activities sits inside existing conduct rules, including the Consumer Duty. A bot that strays into personalised product suggestions can be treated as giving advice, with all the suitability obligations that brings.
Why do these bots get deployed without anyone testing them?
Because the install path is friction-free and the testing path is not. UK SME AI projects commonly fail on planning and oversight rather than technology, a pattern reported in advisory work and echoed by National Cyber Security Centre guidance. Chatbots get put in front of customers without anyone running real FAQs through them, staff paste outputs into client work with no review, and nobody logs what the bot has said. When a complaint arrives, the firm cannot reconstruct what the customer was actually told.
Harvard Business School work on AI decision making found that people tend to over-trust confident AI outputs, even when those outputs are wrong. In their experiments, adding a simple alert when the AI was operating outside its comfort zone cut user errors by nearly 50%. Few off-the-shelf chatbots include that kind of uncertainty signal by default. The result for a small firm is rarely a single dramatic failure. It is a steady drip of slightly wrong recommendations that compound into mis-priced jobs, mis-classified complaints and lost margin, with no clear evidence trail pointing at the bot as the cause.
What does a sensible response look like for an owner-managed firm?
A short written policy, a real test library, a human review step, and an audit log. Antek Automation, writing on UK SME chatbot deployment, recommends building a library of 20 to 30 real customer questions covering pricing, service areas, availability and edge cases, then running each one through the bot in different wording. Document the responses, look for inconsistencies, and only go live when the bot consistently meets your standard.
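The test library described above can be sketched in a few lines of Python. This is a minimal illustration, not a vendor's API: ask_bot stands in for whatever call your chatbot provider exposes, and LibraryCase, run_library and ready_to_go_live are hypothetical names invented for this example. Each case holds several wordings of one real customer question plus the facts any acceptable answer must state.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LibraryCase:
    """One real customer question, asked in several wordings."""
    topic: str
    phrasings: list[str]
    must_contain: list[str]  # facts every acceptable answer must state


@dataclass
class Result:
    topic: str
    # (phrasing, answer) pairs where the answer missed a required fact
    failures: list[tuple[str, str]] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        return not self.failures


def run_library(ask_bot: Callable[[str], str],
                cases: list[LibraryCase]) -> list[Result]:
    """Run every wording through the bot and record answers that fall short."""
    results = []
    for case in cases:
        result = Result(topic=case.topic)
        for phrasing in case.phrasings:
            answer = ask_bot(phrasing)
            missing = [fact for fact in case.must_contain
                       if fact.lower() not in answer.lower()]
            if missing:
                result.failures.append((phrasing, answer))
        results.append(result)
    return results


def ready_to_go_live(results: list[Result]) -> bool:
    """Only go live when every topic passes in every wording."""
    return all(r.passed for r in results)


# Stand-in bot for demonstration only: it answers one wording inconsistently.
def stub_bot(question: str) -> str:
    if "saturday" in question.lower():
        return "We are closed at weekends."
    return "We are open 9am to 5pm, Monday to Friday."


cases = [
    LibraryCase(
        topic="opening hours",
        phrasings=[
            "What are your opening hours?",
            "When are you open?",
            "Are you open on Saturday?",
        ],
        must_contain=["9am to 5pm"],
    )
]

results = run_library(stub_bot, cases)
for r in results:
    for phrasing, answer in r.failures:
        print(f"[{r.topic}] {phrasing!r} -> {answer!r}")
print("Ready to go live:", ready_to_go_live(results))
```

A substring check is deliberately crude. It catches an answer that drops a required fact, but a human still needs to read the recorded responses, which is why the failures are kept alongside the wording that triggered them.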
Then decide in writing what the bot is allowed to answer. Opening hours, process steps, product features, yes. Tax, legal, regulated financial advice, no, with the bot configured to refuse or escalate. Anything client-facing or contractually material gets reviewed by a human before it goes out. Conversation logging stays on. Speak to your broker about how AI use affects your professional indemnity and cyber cover, and update your client terms to reflect that AI is used as drafting support, with final responsibility resting with the firm. None of this is exotic. It is the same basic discipline you would apply to a new junior hire who is fluent, fast, and wrong about 45% of the time.
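As a rough sketch of how a written scope policy and an audit log might sit in front of a bot, the Python below screens each question before the bot sees it. Everything here is an assumption for illustration: the keyword lists, the escalation wording, and the names blocked_topic and answer_with_policy are hypothetical, and ask_bot again stands in for your provider's real call.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Optional

# Hypothetical keyword lists; a real policy would be agreed in writing first.
BLOCKED_KEYWORDS = {
    "tax": ["tax", "vat", "hmrc", "expense", "payroll"],
    "legal": ["contract", "tribunal", "dismissal", "legal"],
    "regulated_finance": ["invest", "pension", "mortgage", "loan"],
}

ESCALATION_MESSAGE = (
    "I can't advise on that. I'll pass your question to a member of our team."
)


def blocked_topic(question: str) -> Optional[str]:
    """Return the blocked topic a question touches, or None if it is in scope."""
    text = question.lower()
    for topic, keywords in BLOCKED_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return topic
    return None


def answer_with_policy(question: str,
                       ask_bot: Callable[[str], str],
                       audit_log: list[str]) -> str:
    """Answer in-scope questions, escalate the rest, and log every exchange."""
    topic = blocked_topic(question)
    if topic is None:
        reply, action = ask_bot(question), "answered"
    else:
        reply, action = ESCALATION_MESSAGE, "escalated"
    # One JSON line per exchange, so a complaint can be reconstructed later.
    audit_log.append(json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "topic": topic,
        "action": action,
        "reply": reply,
    }))
    return reply


# Stand-in bot for demonstration only.
def stub_bot(question: str) -> str:
    return "We are open 9am to 5pm, Monday to Friday."


log: list[str] = []
in_scope = answer_with_policy("What are your opening hours?", stub_bot, log)
out_of_scope = answer_with_policy("Can I claim this travel expense?", stub_bot, log)
print(in_scope)
print(out_of_scope)
```

Keyword matching will over-block, since "tax" also matches "taxi". For a firm carrying the liability, that is the safer direction to fail. In practice the log would go to an append-only file or database, not an in-memory list.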
If you want to talk through where AI sits sensibly in your business and where it does not, book a conversation.



