A specialty lender I sat with last month had been pitched an AI underwriting tool. The vendor’s demo opened with a tidy “explainability dashboard”: a row of feature-importance bars and a one-line plain-English sentence saying why the system declined a sample applicant. The owner watched it, nodded, and said the line I have heard in three regulated-services boardrooms this year. “So if a customer or the ICO asks why we declined someone, that is the answer.”
The honest reply is that it probably is not. The dashboard is a story about the decision, neatly presented. A regulator, an Ombudsman, or a rejected applicant who exercises their data rights wants something stricter than that. The gap between the two is where the legal and reputational risk lives.
What is explainable AI?
Explainable AI is an umbrella label for techniques that produce human-readable accounts of why a machine learning system made a decision. The ICO and the Alan Turing Institute, in their joint guidance Explaining Decisions Made With AI, treat it as three different things: transparency (what the system does), interpretability (how it works internally), and explainability proper (a post-hoc account of a specific output). The legal trigger depends on which is being asked for.
In practice, vendor “XAI” almost always means the third. Tools like SHAP and LIME, the two techniques that sit behind nearly every dashboard you will be shown in 2026, run after the model has produced its output and reverse-engineer a rationale from the input features. That rationale is useful. It is not the same as opening up the model and reading off the logic.
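To make that concrete, here is a minimal sketch of a post-hoc local explanation, using the shap library against a small scikit-learn model. Everything in it is illustrative: the feature names, the synthetic data, and the model are stand-ins rather than anyone’s production pipeline. The structure is the point, though. The explainer probes the trained model from the outside; it never reads the model’s internal computation.

```python
# A minimal post-hoc local explanation, for illustration only. Feature names
# and data are synthetic stand-ins; the structural point is that the explainer
# probes the trained model from the outside rather than reading its internals.
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
feature_names = ["debt_to_income", "employment_months", "postcode_band"]

X = rng.normal(size=(500, 3))                      # synthetic applicants
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)      # synthetic decline label

model = GradientBoostingClassifier().fit(X, y)

# Build a post-hoc explainer around the model's prediction function
explainer = shap.Explainer(model.predict_proba, X[:100])

applicant = X[:1]                 # one specific decision, not the whole book
explanation = explainer(applicant)

# Per-feature attribution for this applicant's decline probability (class 1)
for name, attr in zip(feature_names, explanation.values[0, :, 1]):
    print(f"{name}: {attr:+.3f}")
```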
Why does the difference between plausible and faithful matter?
Because a plausible explanation is not necessarily a faithful one, and a faithful one is what a regulator wants. Recent work from Anthropic’s Alignment Science team showed that when large language models are prompted to explain their reasoning, the steps they produce frequently do not match the path the model actually took. The team embedded hints in the input; the model used them; its own explanation rarely mentioned them.
For a lender, an insurer, or a recruitment firm, the implication is direct. If your audit trail says “the system declined the applicant because of debt-to-income ratio and employment recency”, and the actual computation leaned heavily on a postcode-correlated feature that the explanation does not surface, the audit trail is plausible but not faithful. It is the sort of trail that holds up until first contact with a careful complaint or an ICO investigation.
This is the wedge to keep in mind when a vendor demos a dashboard. Ask not whether the explanation sounds reasonable, but whether the vendor has validated that it faithfully describes the model’s actual computation, and how.
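There is no single standard test, but one rough probe, sometimes called a deletion or perturbation check, gives a first signal. The sketch below continues the toy example above and is illustrative only: mask the features the explanation ranks as most important and see whether the model’s score actually moves. A real validation would repeat this across many decisions and compare against randomly chosen features, but even the crude version separates “sounds reasonable” from “tracks the model”.

```python
# A rough faithfulness probe (a "deletion" check), continuing the toy example
# above. Illustrative only: production validation would run this over many
# decisions and compare the drops against randomly selected features.
def deletion_check(model, x_row, attributions, background_means, top_k=2):
    """Mask the top-k attributed features with background means and measure
    how far the model's decline probability moves. A faithful explanation
    should point at features whose removal materially changes the output."""
    baseline = model.predict_proba(x_row.reshape(1, -1))[0, 1]
    top_features = np.argsort(-np.abs(attributions))[:top_k]
    x_masked = x_row.copy()
    x_masked[top_features] = background_means[top_features]
    masked = model.predict_proba(x_masked.reshape(1, -1))[0, 1]
    return baseline - masked   # a large drop means the explanation tracks behaviour

attributions = explanation.values[0, :, 1]
drop = deletion_check(model, applicant[0], attributions, X.mean(axis=0))
print(f"Score change when top-ranked features are masked: {drop:+.3f}")
```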
Where is XAI legally required in the UK and EU?
In several places, with overlapping triggers. UK GDPR Article 22, supplemented by section 14 of the Data Protection Act 2018 and mirrored for law-enforcement processing in Part 3 of the same Act, creates a conditional prohibition on solely automated decisions with legal or similarly significant effects. The data subject has a right to “meaningful information about the logic involved”. Mortgage rejections, insurance pricing, and employment screening all sit inside that perimeter when the human in the loop is a rubber stamp.
The EU AI Act layered a second regime on top. Article 86 grants individuals an explicit right to explanation for decisions taken by high-risk systems. Annex III lists the high-risk uses: credit scoring, recruitment, insurance underwriting, employment management, benefits eligibility, and law enforcement. Those provisions became enforceable on 2 August 2026, and they apply to any UK firm whose AI affects EU residents.
A third trigger sits inside the FCA’s Consumer Duty: decisions that affect retail customers have to be explainable if the firm is to meet the Duty’s good-faith and fair-treatment expectations. The framework is general; specific deployments need ICO guidance and qualified counsel. And the Equality Act 2010 quietly catches indirect discrimination even where Article 22 does not bite.
What does this look like in practice?
It looks like enforcement that has already happened. Bridges v South Wales Police, decided by the Court of Appeal in 2020, found the force’s live facial-recognition deployment unlawful, partly because the legal framework left too much to individual officers’ discretion and the discrimination risk had never been properly assessed. The Hague District Court struck down the Dutch SyRI fraud-detection system the same year, partly on transparency grounds. Uber drivers won an Amsterdam Court of Appeal ruling in 2023 on Article 22 grounds, and the Italian Garante fined OpenAI EUR 15 million in late 2024 for transparency failings.
Inside firms, the pattern is quieter and more common. A complaint comes in. The compliance team asks the data team to reconstruct why the model declined the applicant. The data team produces a SHAP plot. The compliance team asks whether the SHAP plot is actually faithful to the model’s behaviour or only a post-hoc summary. Nobody knows, because nobody validated it. The firm settles.
The Equality Act intersection is where this moves from a data-protection problem to a tribunal problem. A recruitment AI that systematically downranks a protected group does not need to be solely automated to expose the firm. A human reviewer signing off on a biased shortlist is still a biased shortlist. XAI, properly used, is part of how a firm spots that pattern before a claimant’s lawyer does. The proportionate question for any owner is not whether to commission a full faithfulness study on every model, but whether the firm can produce a defensible local explanation for the next contested decision, in the next twenty working days.
What should you ask a vendor about XAI?
Three concrete questions, useful on Monday morning. First, can your system produce a local explanation for a specific decision (the actual factors in this applicant’s case), not a global feature-importance chart aggregated across all decisions? If the answer is “we show you which features the model relies on overall”, you have transparency, not explanation, as the short sketch below illustrates.
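Continuing the toy example from earlier, and with the same caveat that every name in it is an illustrative stand-in, the contrast looks roughly like this:

```python
# Global versus local, continuing the toy example above. Illustrative only.
# Global: one ranking of features aggregated across every decision the model makes.
global_importance = dict(zip(feature_names, model.feature_importances_))
print("Global (aggregated over all decisions):", global_importance)

# Local: the attribution behind this one applicant's outcome, which is what a
# complaint, an Ombudsman referral, or an Article 22 request actually concerns.
local_attribution = dict(zip(feature_names, explanation.values[0, :, 1]))
print("Local (this applicant only):", local_attribution)
```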
Second, have you validated that the explanation is faithful to the model’s internal computation, or only that it is plausible to a human reading it? Vendors who can answer this distinguish themselves quickly. Many cannot.
Third, where in your documentation do you separate what the model computes from why it produces that output? The two are not the same. A serious vendor will have done the work to keep them apart, and will be able to point you to it. Take all three answers into your wider vendor due diligence rather than trusting the dashboard alone. And if your deployment looks like it might sit inside Article 22, treat the post you are reading as a framework, not as legal advice. Read the deeper Article 22 piece, then talk to qualified counsel before the next contract is signed.



