When gen AI should stay out of your workflow

A founder in a legal consultancy told me recently they had been shown a demo of a tool that would draft first-pass client letters in seconds. It worked well in the demo. The question that came back to me was whether they should actually use it for client correspondence. That is the right question. A working demo tells you the technology is capable. The risk profile of the task tells you whether to use it.

What’s the decision you’re actually facing?

Many owner-managed businesses in services face the same underlying question. Which tasks benefit from gen AI, and which carry enough risk that the checking overhead wipes out the productivity gain? The technology can often do the task. The question is whether it does it well enough, reliably enough, and safely enough for your specific workflow.

OECD research on generative AI and SME workforces finds service-sector firms are among the earlier adopters of gen AI tools. That adoption reflects genuine potential. The mistake is treating that potential as uniform across every task, rather than evaluating it workflow by workflow against the risk profile of each one.

The workflow-by-workflow risk assessment matters more than a firm-wide stance on AI adoption. Some tasks will pass cleanly. Others will fail on criteria that matter a lot. That distinction is worth working through before you add any tool to a live process. Getting it wrong creates liability; getting it right compounds over time.

When does gen AI earn its place in your workflows?

Gen AI earns its place where tasks are low-stakes, reversible, and reviewed before any output leaves the building. Drafting, summarising, reformatting, and generating first-pass content for internal review are reasonable starting points. MIT Sloan advises firms to apply a cost equation that counts the cost of finding and fixing errors, not just the productivity saving. When errors are cheap to catch, the economics work.

Gen AI is worth deploying when the output is internal, a human always reviews before anything is sent, errors are embarrassing at worst rather than damaging, and the task repeats often enough for the speed benefit to compound. When all four conditions hold, the case for gen AI is solid. Where one or more fail, look harder at what you are actually asking the technology to do. Gen AI is a tool for generating options, not a replacement for judgement.
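For anyone who prefers to see the four conditions written down as a rule, here is a minimal Python sketch. The `Task` class, its field names, and the `failed_conditions` helper are illustrative assumptions, not part of any framework cited here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical description of one workflow task."""
    name: str
    output_is_internal: bool        # the output stays inside the firm
    human_always_reviews: bool      # someone checks before anything is sent
    errors_only_embarrassing: bool  # worst case is embarrassment, not damage
    repeats_often: bool             # frequent enough for speed gains to compound


def failed_conditions(task: Task) -> list[str]:
    """Return the conditions the task does not meet (empty list = all four hold)."""
    checks = {
        "output is internal": task.output_is_internal,
        "a human always reviews": task.human_always_reviews,
        "errors are embarrassing at worst": task.errors_only_embarrassing,
        "task repeats often": task.repeats_often,
    }
    return [label for label, holds in checks.items() if not holds]


if __name__ == "__main__":
    letter = Task("first-pass client letter", False, True, False, True)
    print(letter.name, "fails on:", failed_conditions(letter))
```

An empty list means the case for gen AI is solid on these criteria. Anything else is a prompt to look harder at the task, not an automatic veto.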

One qualification worth making. For narrow, repetitive, rule-based work, traditional automation often outperforms gen AI. Invoice routing, appointment reminders, and fixed-format data extracts are better handled by deterministic tools that produce the same output from the same input every time. Gen AI’s strength is handling variability and ambiguity. Where the task is structured and predictable, simpler automation is cheaper and more reliable.
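To make "deterministic" concrete, here is a short Python sketch of rule-based invoice routing. The threshold, supplier types, and queue names are invented for illustration. The point is only that the same input always produces the same output.

```python
def route_invoice(amount: float, supplier_type: str) -> str:
    """Route an invoice using fixed rules (all values are hypothetical)."""
    if supplier_type == "contractor":
        return "payroll-team"
    if amount >= 5000:
        return "finance-director"
    return "accounts-payable"


if __name__ == "__main__":
    # Running this twice, or a thousand times, gives the same answer.
    print(route_invoice(120.0, "supplier"))
    print(route_invoice(7500.0, "supplier"))
```

There is nothing to review for hallucination here, because the rules are the whole system. That is why this kind of task rarely needs gen AI.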

Where should gen AI stay out of the workflow?

Several categories of work carry enough risk that gen AI should not be the primary production tool, even with human review. In each case, if the output could cause legal, financial, regulatory, or safety harm when wrong, the checking burden may outweigh the benefit. NCSC guidance says organisations should apply human judgement and controls rather than treating AI as inherently reliable automation.

The first category is any workflow involving personal data at scale, special category data (health, HR, disciplinary, safeguarding), or client confidential information across multiple third parties. The ICO makes clear that UK GDPR principles apply regardless of the tool used. Data minimisation, purpose limitation, and transparency are required, and a data protection impact assessment is likely needed where processing poses high risk to individuals.

The second is regulated decision-making. For FCA-authorised firms, or those serving regulated clients, using gen AI to draft regulated advice, handle complaints, or make affordability judgements is much harder to justify than using it as a drafting assistant with strong human sign-off. The FCA holds firms accountable for operational resilience and outsourcing governance whether the tool is AI or not.

The third is employment and HR. The ICO’s employment practices guidance is clear that hiring, performance management, and disciplinary processes require fairness, transparency, and data protection by design. Using gen AI to draft a letter may be defensible; using it to score or rank people is a different matter.

The fourth is customer-facing content where every claim must be accurate, current, and auditable. MIT Sloan’s cost framework is directly relevant. If your team would have to verify every output before sending, the time saved in generation is largely clawed back in checking, and the hallucination risk is material.

The fifth is multi-client confidential work. Agencies, law firms, accountancy practices, and HR consultancies handling client materials face a risk beyond model error: accidental disclosure. The ICO is explicit that controllers remain responsible for lawful processing even when using external AI providers. If client material enters a shared prompt or model session, the risk of cross-contamination is real and the liability sits with the firm.

What does it cost to get this wrong?

The cost runs beyond a bad output. The ICO makes clear that organisations remain accountable for data protection compliance even when using external AI tools, and a data protection impact assessment is likely required for higher-risk processing. The FCA holds the same line for financial services. Using an AI tool does not relax the governance or outsourcing standard.

The CMA has flagged that businesses marketing “AI-powered” outputs may face consumer protection concerns if they cannot substantiate what the system actually does. The NCSC adds a supply-chain dimension. If you cannot reconstruct why an output was produced, or what data was sent to the model, you may struggle to defend it later. In a services firm, one incorrect letter, policy, or client email can absorb the savings from dozens of good outputs.

MIT Sloan’s cost equation framework makes the economic logic clear. Compare the cost of running an AI workflow against the cost of doing things as they are now, including error-detection and correction. For smaller firms without dedicated compliance or quality teams, that calculation often tips differently than it does at scale.
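The comparison can be sketched as simple arithmetic. The Python below is one possible reading of that logic, not MIT Sloan's own formula. The function, its parameters, and every figure in the example are assumptions made up for illustration.

```python
def weekly_cost(
    minutes_per_item: float,
    items_per_week: int,
    hourly_rate: float,
    error_rate: float = 0.0,            # share of items that need correcting
    minutes_to_fix_error: float = 0.0,  # time to detect and correct one error
    tool_cost_per_week: float = 0.0,    # licence or usage fees
) -> float:
    """Estimate the weekly cost of a workflow, including error correction."""
    handling_minutes = minutes_per_item * items_per_week
    fixing_minutes = error_rate * items_per_week * minutes_to_fix_error
    staff_cost = (handling_minutes + fixing_minutes) / 60 * hourly_rate
    return staff_cost + tool_cost_per_week


if __name__ == "__main__":
    # Hypothetical figures: 30 letters a week at 60 per hour of staff time.
    current = weekly_cost(minutes_per_item=20, items_per_week=30, hourly_rate=60)
    with_ai = weekly_cost(
        minutes_per_item=8,       # generate plus review
        items_per_week=30,
        hourly_rate=60,
        error_rate=0.1,
        minutes_to_fix_error=60,
        tool_cost_per_week=25,
    )
    print(f"current: {current:.2f}, with AI: {with_ai:.2f}")
```

With these invented numbers the AI route comes out cheaper. Raise `error_rate` to 0.4 and it becomes more expensive than the current process, which is the sensitivity a smaller firm should test before committing.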

What to ask before you commit

Before adding gen AI to any workflow, run a short assessment against the risk profile of the task. The checklist does not need to be long. The NCSC recommends organisations understand what data goes into the model, limit exposure, and manage supply-chain risk. That gives you a framework for the decision, even without a formal governance process.

The questions that matter are simple. Does the output create legal, financial, or reputational risk if wrong? Does a human review every output before use? Is personal, special-category, or client-confidential data involved? Can you explain and audit what was produced? Would a traditional automation or template-based tool be cheaper and safer for this particular task? And are you making claims about AI capability to clients that you can actually substantiate?
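If it helps to keep the assessment consistent between workflows, the questions above can be held as a small checklist. This Python sketch is one way to do it. The keys, the `flag_concerns` helper, and the choice of which answer counts as a concern are assumptions, and the last question is reworded slightly so that "no" is clearly the worrying answer.

```python
# Each entry: key -> (question, the answer that should raise a concern).
CHECKLIST = {
    "harm_if_wrong": (
        "Does the output create legal, financial, or reputational risk if wrong?", True),
    "human_review": (
        "Does a human review every output before use?", False),
    "sensitive_data": (
        "Is personal, special-category, or client-confidential data involved?", True),
    "auditable": (
        "Can you explain and audit what was produced?", False),
    "simpler_tool": (
        "Would traditional automation or a template be cheaper and safer?", True),
    "claims_substantiated": (
        "Can you substantiate the AI capability claims you make to clients?", False),
}


def flag_concerns(answers: dict[str, bool]) -> list[str]:
    """Return the questions whose answer should give you pause."""
    flagged = []
    for key, (question, concerning_answer) in CHECKLIST.items():
        if answers[key] == concerning_answer:
            flagged.append(question)
    return flagged


if __name__ == "__main__":
    answers = {
        "harm_if_wrong": True,
        "human_review": True,
        "sensitive_data": False,
        "auditable": True,
        "simpler_tool": False,
        "claims_substantiated": True,
    }
    for question in flag_concerns(answers):
        print("Check:", question)
```

A flagged question does not decide the matter by itself. It marks where the conversation needs to happen before the tool goes live.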

Run through them before any tool goes live, and revisit them whenever the workflow changes significantly. Five minutes of assessment is worth more than the hours of rework that follow a failed deployment.

