An operations manager at a small consultancy connects an AI agent to the firm’s CRM on a Friday afternoon. Admin access, because it’s the quickest way to get it running. By Monday, the agent has filed client notes into wrong folders, sent follow-ups to contacts who asked not to be chased, and merged two client briefings into one record. Nothing malicious happened. The agent had more permission than it needed, and nobody had mapped out what “test it” actually meant.
That scenario plays out in many small firms. The fix requires a few specific steps before any agent touches live business data.
What makes AI agents a different kind of security challenge?
Standard software sits in its lane: it reads what it’s told to read, writes where it’s told to write, and waits for a human to push the button. AI agents don’t work that way. They can call APIs, send emails, query databases, and update records, making chains of decisions without a human in the loop at each step. That changes the security question fundamentally.
A standard SaaS tool has a defined scope. An AI agent has a toolset, and that toolset can include sending email on behalf of a staff member, updating a deal stage, deleting a record, running a database query, or searching your entire shared drive. Agents can also be chained. One that summarises emails can trigger a second that drafts replies, which triggers a third that schedules sends. By the time the output appears, a sequence of autonomous decisions has already run.
Zenity, which publishes detailed compliance guidance on agentic AI, describes the security implication clearly: every agent should be treated as a dynamic digital actor with a first-class identity, a bounded system reach, and explicit rules about which actions it can take without human approval. Treat an agent as a chatbot with extra features and you will miss those design decisions entirely.
Why does this matter for your business right now?
According to BigID’s 2025 AI Risk and Readiness report, 69% of organisations name AI-powered data leaks as their top security concern. Almost half, at 47%, have no AI-specific security controls in place, and only 6% report having an advanced AI security strategy. Agents are arriving faster than the controls, and that gap is where small businesses are most exposed.
The UK regulatory picture adds weight. The NCSC’s guidelines for secure AI system development ask organisations to treat AI systems as high-value assets, apply least-privilege access control, and monitor AI system behaviour across the full lifecycle. The UK government’s AI Cyber Security Code of Practice, published in 2024, sets five principles: raise awareness of AI-specific threats, design AI for security, conduct threat modelling covering prompt injection and data poisoning, maintain clear accountability, and identify and protect AI assets.
These guidelines are voluntary, but they set the expected standard. If a breach occurs and you cannot demonstrate that you followed basic secure-by-design principles, those guidelines become the benchmark against which your exposure is measured. Your cyber insurer may also take a view. UK legal commentary increasingly flags negligent AI integration as likely to affect claims.
Gartner predicts that by 2026, 40% of enterprise applications will include task-specific AI agents, up from less than 5% today. That trajectory reaches small businesses through the SaaS platforms they already use. You don’t need to build an agent to be in scope.
Where will you actually encounter these risks in a 10 to 50 person firm?
For services businesses, the first AI agents are rarely purpose-built. They arrive through platforms you already use: an inbox assistant that can reply and archive, a CRM feature that can draft follow-ups and update deal stages, a meeting tool that can book slots and send summaries. Each of those is an agent. Each of them has access to data you probably haven’t audited for that purpose.
Three scenarios are particularly common in owner-managed services firms.
In professional services, client confidentiality obligations apply. An agent connected to a shared drive or CRM doesn’t distinguish between a file you meant to share with one client and a file that belongs to another. Depending on your sector’s professional duties, ad-hoc agent deployment without a prior scope review may breach those obligations.
In many firms, shadow AI is already present. Staff connect their own calendar and email to tools that helpfully offer to schedule, summarise, and reply. The NCSC guidelines stress central visibility and control over AI assets. When individuals sign up to AI tools without approval, invisible data flows emerge that can breach internal policies and UK GDPR obligations, and those flows are typically invisible until an incident surfaces them.
For FCA-regulated firms, including advisers, brokers, and wealth managers, there is an additional layer. The FCA has stated clearly that AI use must comply with existing rules on systems, controls, and consumer protection. An agent that can influence a recommendation, handle a complaint, or access a client account without a visible audit trail is a compliance exposure, not just a technology question.
When should you apply tighter controls, and when can you move faster?
The answer depends on what the agent can actually do, rather than what you intend for it to do. A useful frame is to classify every tool an agent can call by its potential blast radius. Search and read actions carry limited risk. Draft-and-propose actions sit in the middle. Write, send, delete, and charge actions require explicit human approval before execution.
This classification aligns with both the NCSC guidelines and the AI Cyber Security Code. Low-risk tools, such as searching a knowledge base or summarising a document, can run without human sign-off at each step. Medium-risk tools, drafting a CRM note or composing an email without sending it, can run automatically but should be reviewed before they escalate. High-risk tools, sending communications, deleting records, moving money, changing prices, need a human decision point each time.
If your agent platform doesn’t let you configure tool-level permissions at this level of granularity, treat that as a vendor-selection issue. Platforms that lump all tools under one permission level force you to choose between limiting the agent’s usefulness and accepting broad system access. The NCSC guidance is explicit: the permissions an AI system holds on other systems should be only what is required and risk-assessed.
What does “secure before you connect” actually look like in practice?
Five steps, in sequence, before any agent connects to production data. Run it first in a sandboxed environment with synthetic or anonymised records. Assign it a dedicated service account with minimum access. Classify every tool it can call by risk level. Ensure comprehensive logging is in place. And confirm the kill switch works before anyone would need it.
The sandbox phase is where you observe behaviour before it matters. Use synthetic records or fully anonymised data, and probe for prompt injection: embed malicious instructions inside emails or documents the agent will read, and observe whether it acts on them. The NCSC guidelines and the AI Code implementation guide both endorse this separation of test from production environments.
The service account step matters for auditability. When an agent authenticates as a real staff member, its actions appear in logs attributed to that person. If something goes wrong, you cannot cleanly reconstruct what the agent did versus what the person did. A dedicated account with its own API key or OAuth identity makes the agent’s actions separable and traceable.
Comprehensive logging is not optional. The EU AI Act imposes logging requirements for high-risk AI applications, with obligations phasing in from 2025 for firms serving EU customers. Logs also support DPIAs, data-subject access requests, and insurance claims. Make sure they are exportable and that someone knows where to find them.
For the kill switch, Zenity’s compliance guidance recommends being able to disable a single agent, a single tool, or all autonomous write actions without bringing down the wider platform. Document the procedure. More than one person should know how to run it. Test it before it is needed.
Before committing to a vendor platform, verify it can deliver all of this. If it cannot export logs, doesn’t support granular tool permissions, or requires your staff to authenticate on the agent’s behalf, treat those as material gaps.
A few hours of security design before connection costs far less than the incident you’d be managing without it.



