How to choose an AI red teaming provider without security theatre

Two people reviewing a printed security report at an office desk
TL;DR

AI red teaming is structured adversarial testing designed to find exploitable AI behaviours before attackers do. For UK SMEs the choice is between consultancy, open source tools, or an automated platform. The decision hinges on engineering depth and whether you need external evidence for regulators or clients. The five questions that separate genuine providers from security theatre cover architecture scope, sample reports, DPIA mapping, remediation integration, and test data handling.

Key takeaways

- AI red teaming covers prompt injection, data leakage, and agent abuse at minimum. Any provider who only discusses jailbreaks is testing a fraction of your actual attack surface. - Consultancy is the right choice for regulated firms, those needing DPIA evidence, or those without in-house AI security engineering. DIY suits small, technically staffed teams with a simple AI estate and strict data residency requirements. - OWASP's evaluation criteria identify concrete red flags: generic jailbreak-only testing, deliverables that are mostly slides, and no integration with remediation workflows. - Under UK GDPR the ICO can issue fines of up to £17.5 million or 4% of worldwide turnover for serious AI-related infringements. Inadequate testing that leaves a data exposure undetected is a material risk, not a theoretical one. - Before signing anything, ask five questions: how they test agents and RAG pipelines, whether they can show sample reports with reproducible findings, how findings map to your DPIA, how results connect to your development workflow, and where your test data is stored.

You’ve deployed an AI assistant that connects to client data and handles parts of your workflow. It’s working well, but someone on your team has asked how you know it’s actually secure. You look for answers and find vendors offering “AI red teaming.” They all promise thorough testing. Some charge £8,000. Some charge £80,000. Few explain what they will actually test, or what separates a real finding from a slide.

The question worth resolving before you commission anything is which of three routes fits your situation: consultancy, open source tooling, or an automated testing platform. And how to tell real security work from theatre dressed up in threat-model language.

What choice are you actually facing?

AI red teaming is structured adversarial testing designed to surface exploitable AI behaviours before an attacker does. NIST and the UK’s NCSC define it as systematic stress-testing covering prompt injection, data leakage, and agent abuse, beyond obvious jailbreaks. For a UK SME the practical choice is between a consultancy, an open source tool suite, or an automated testing platform.

The “security theatre” problem is real. OWASP’s vendor evaluation guidance flags superficial testing explicitly: a handful of canned jailbreak prompts, no coverage of agents or tool-calling, no connection to remediation workflows. The output is reports, not risk reduction.

The three routes differ significantly in what they produce and what they cost. Consultancy brings human adversaries with domain expertise and output that maps to regulatory requirements. Open source tools give you control, but only if someone can build and maintain them. Automated SaaS platforms sit between the two: broader coverage and continuous testing, but typically priced for larger estates. The real question is which option you will actually use, not which one looks most thorough in the pitch.

When does a consultancy make sense?

A consultancy makes sense when you need an external, evidenced view of AI risks and cannot build that capability in-house. Regulated firms are the obvious fit: an IFA deploying an AI advice workflow, a recruiter using automated screening, or a firm processing health data via a language model. Enterprise due diligence requests and insurer requirements point in the same direction.

The key advantage is human adversaries who understand your specific processes. HiddenLayer’s taxonomy of AI red teaming identifies three attack layers: model-level (adversarial examples, poisoning), system-level (prompt injection, data leakage), and application-level (business process abuse). A consultancy with cross-layer capability can test whether your AI assistant could be manipulated into approving a fraudulent transaction or bypassing an approval step. That requires domain understanding, not just a jailbreak library.

The ICO expects organisations using AI for high-risk processing to complete a Data Protection Impact Assessment and document their testing approach. A well-scoped consultancy engagement feeds directly into that DPIA and provides evidence for FCA or other sector regulators.

The watch-out is the slide deck problem. OWASP flags it explicitly as a red flag: a thick report with generic findings and no reproducible attack paths. The engagement should produce specific exploit steps mapped to your risk register, not a percentile score.

When does building your own test suite make sense?

Building your own test suite makes sense when you have a technically strong engineer who can maintain scripting and CI/CD integration, and when your AI estate is small enough to cover systematically. Open source frameworks such as Microsoft’s PyRIT, Promptfoo, and DeepTeam can orchestrate attack simulations in your own environment, giving you full control over scope and data residency.

The appeal is control and continuity. You own the test scripts, you can extend them for your domain, and you can run them on every significant change without booking another engagement.

The practical constraint is real. Georgetown’s CSET research on AI red-teaming design warns that many open source tools require command-line skills, custom code, and CI/CD work that many small internal teams cannot sustain alongside normal workloads. The common outcome is a small set of obvious tests that gives false comfort. OWASP calls this “checklist-only” testing and treats it as one of the clearest signs of theatre.

The honest question to ask is whether you’ll actually run these tests on every significant change, and whether someone will update the scenarios as the AI stack evolves. For a firm whose AI footprint is one internal tool with no sensitive processing, a lean DIY approach can be proportionate. For anything touching regulated data or client-facing decisions, the route carries more risk than it appears to.

What does it cost to get this wrong?

Getting this wrong in either direction creates real exposure. Overpaying for a consultancy that delivers generic jailbreak results leaves business-process risks untested. Under-investing in DIY with insufficient engineering depth means a test suite that atrophies while your AI systems evolve. In both cases, the gap between what was tested and what an attacker would actually try stays invisible until an incident makes it obvious.

The operational risk is concrete. Security researchers demonstrated in 2023 that prompt injection attacks embedded in documents or web content could cause AI systems to leak confidential information or override system instructions, with Bing Chat the most widely cited case. These attack patterns apply directly to retrieval-augmented systems that pull from client documents or internal knowledge bases. A test suite limited to manual jailbreak prompts will not catch them.

The regulatory exposure compounds this. Under UK GDPR, the ICO can issue fines of up to £17.5 million, or 4% of annual worldwide turnover, for serious infringements. Inadequate testing of AI systems that process personal data can contribute directly to those infringements, particularly where a DPIA was required and the testing evidence is thin. The FCA is equally clear that regulated firms remain accountable for AI outcomes under existing conduct and operational resilience rules.

UK firms selling into European markets face additional obligations under the EU AI Act. Systems classified as high-risk, including credit scoring, recruitment tools, and biometric categorisation, require documented risk management, testing, and post-market monitoring. A provider who understands that framework is materially more useful than one who does not.

What should you ask before you sign anything?

The question that distinguishes genuine AI red teaming from theatre is whether the vendor can explain exactly how they would test your specific architecture, your tool integrations, and your data flows. Any provider worth commissioning should answer the following questions concretely before you agree scope. Vague or deflected responses tell you as much about the engagement as their actual answers would.

Ask how they test agents, RAG pipelines, and tool-calling integrations. OWASP’s evaluation guidance is explicit: a vendor who can only discuss jailbreak libraries has not caught up with how production AI systems work. Your CRM connection, your document retrieval, your ticketing workflow: each is a potential attack surface. The answer you want covers each layer, not “yes, we cover LLM security.”

Ask to see a redacted sample report for a system similar to yours. Look for reproducible attack paths with actual prompts, tool calls, and outputs, not just a narrative summary of categories assessed. If the deliverable is mostly slides with generic advice, that is the OWASP red flag in practice.

Ask how findings map to your DPIA, your risk register, and any sector obligations. A provider who understands FCA operational resilience requirements or ICO high-risk processing expectations will answer this fluently. One who does not will change the subject.

Ask how findings connect to your development workflow. Results that never become actionable tickets do not get fixed. OWASP includes ticketing and CI/CD integration in its green-flag criteria precisely because remediation is where many testing programmes fail.

Ask where your test data is stored and processed. The ICO’s expectations under UK GDPR apply to test data as much as live data. A vendor who cannot answer this clearly is a compliance risk in itself.

If you’re unsure which route fits your situation, the safest default for a regulated UK SME is a scoped consultancy engagement focused on the one or two AI systems that touch client data or decision-making, mapped explicitly to your DPIA and sector obligations. That is proportionate. A subscription platform covering an AI estate you haven’t yet built is not. If you want to think through what the right scope looks like for your firm, book a conversation.

Sources

- NCSC (2023). Guidelines for secure AI system development. UK government guidance recommending adversarial testing including prompt injection, data poisoning, and model abuse as part of secure AI deployment. https://www.ncsc.gov.uk/collection/guidelines-secure-ai-system-development - NIST (2023). AI 100-1 Artificial Intelligence Risk Management Framework Generative AI Profile. Frames AI red teaming as systematic stress-testing to identify harmful behaviours before and after deployment. https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.600-1.pdf - ICO (2024). Guidance on AI and data protection. Sets out UK GDPR obligations for AI systems processing personal data, including proactive testing and explainability for automated decision-making. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/ - ICO (2024). Data protection impact assessments. Explains when DPIAs are mandatory, including for high-risk AI processing such as profiling and automated decisions with significant effects. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/data-protection-impact-assessments/ - OWASP GenAI Security Project (2024). OWASP Top 10 for Large Language Model Applications. Defines distinct risk categories for AI systems including prompt injection, data exfiltration, and supply-chain compromise. https://genai.owasp.org/ - OWASP / AIGL (2024). Vendor Evaluation Criteria for AI Red Teaming Providers and Tooling. Green flags and red flags for evaluating providers, with emphasis on modern architecture coverage and remediation integration. https://www.aigl.blog/vendor-evaluation-criteria-for-ai-red-teaming-providers-tooling/ - FCA (2024). AI innovation and regulation. Confirms that UK regulated firms remain accountable for AI outcomes under existing conduct and operational resilience rules. https://www.fca.org.uk/news/speeches/ai-innovation-and-regulation - CSET Georgetown (2024). AI Red-Teaming Design: Threat Models and Tools. Study on practical constraints of AI red-teaming tools for organisations lacking dedicated security engineers. https://cset.georgetown.edu/article/ai-red-teaming-design-threat-models-and-tools/ - EU AI Act (2024). Regulation on Artificial Intelligence. Imposes risk management, testing, logging, and post-market monitoring requirements for high-risk AI systems, with implications for UK firms selling into EU markets. https://artificialintelligenceact.eu/the-act/

Frequently asked questions

How do I know if an AI red teaming provider is actually testing my system rather than running generic prompts?

Ask them to explain how they would test your specific architecture, including agents, RAG pipelines, and any third-party tool integrations. OWASP's vendor evaluation criteria flag any provider who can only discuss jailbreak libraries as a red flag. A genuine provider describes specific attack scenarios for your data flows and delivers reproducible findings, not a summary score. Ask to see a redacted sample report before committing.

Is AI red teaming a legal requirement for UK businesses?

There is no explicit legal mandate called "AI red teaming," but overlapping obligations point in the same direction. The ICO requires a Data Protection Impact Assessment for AI systems involving high-risk processing, which includes testing for data leakage and discriminatory outcomes. The FCA expects regulated firms to demonstrate reasonable steps for AI oversight, and the EU AI Act imposes testing requirements on high-risk systems sold into European markets.

When does a UK SME actually need AI red teaming rather than standard security practices?

The threshold is when your AI system does something a standard vulnerability scan or penetration test cannot find. Prompt injection via embedded documents, agent manipulation to bypass approvals, RAG source corruption through poisoned data: these require AI-specific adversarial testing. If your AI tool accesses client data, makes or influences decisions, or interacts with other systems via tool calls, a standard review will not cover the attack surface.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation

Related reading

If any of this sounds familiar, let's talk.

The next step is a conversation. No pitch, no pressure. Just an honest discussion about where you are and whether I can help.

Book a conversation