How to structure an AI demo that reveals reality

TL;DR

A reality-revealing AI demo tests one workflow you already run, uses your own materials, and deliberately shows you where the tool falls short. Structuring it with a baseline, a grounded hand-off, and a human approval step tells you whether deployment is viable before you commit money or time to it.

Key takeaways

- Anchor the demo in one high-volume, low-ambiguity workflow your business already runs, and pull last month's baseline figures before the conversation starts.
- Use your own structured materials rather than generic examples, so the demo exposes real data-quality and integration issues rather than hiding them.
- Ask the vendor to show you a moment where the AI does not know the answer and routes to a human, not only the scenarios where it performs well.
- Every agentic action in the demo should produce a draft for human approval rather than an automatic output, and you should see how that approval flow works.
- Cover data protection in plain language: where the AI runs, whether your prompts are used for model training, and how outputs are logged and supervised.

A vendor books a 45-minute slot. They spend the first 30 minutes demonstrating features you have not asked about, using sample data that looks nothing like your client files. By the end you are not sure what you have just watched or whether any of it applies to your business. That is the standard AI demo, and it reveals almost nothing useful about whether a tool will help your firm.

Structuring a demo that actually tests a tool against your specific business requires a different approach. Here is what that looks like.

What is a reality-revealing AI demo?

A reality-revealing demo focuses on one workflow your business already runs, uses your own materials rather than generic examples, and includes a moment where the AI shows you what it cannot do. You leave with a clear sense of where the tool would perform, where it would struggle, and what you would need to address before any deployment could succeed.

The contrast with the vendor-led showcase is deliberate. Standard AI demos are designed to impress, which means the vendor picks the scenarios where the tool performs best, uses polished sample content, and avoids edge cases. You leave having watched something work smoothly, but you have no idea how it would behave with your messy intake emails or your half-structured client database.

UK AI consultancy OpenKit recommends anchoring any AI engagement around a single high-volume, repetitive, low-ambiguity workflow, such as document intake, quote generation, or booking triage. That same principle applies to a demo. Pick one workflow, run it with your own inputs, and see what actually happens.

Grounded AI agents make a good test subject. They are built to draw answers only from your own uploaded content and to route to a human when the answer is not there. That constraint matters: a system with no way to say “I don’t know” will fabricate rather than defer, and that failure mode is exactly what you need to see before you commit.

Why does the structure of your demo matter?

A badly structured demo wastes your time and sets expectations your deployment will never meet. iCentric’s UK AI adoption guidance warns against declaring victory without a baseline: if you do not establish your current handling time, error rate, and volume before the demo, you have no way to judge whether what you have just seen would materially help your business.

Owner-managed firms are particularly exposed here. You have limited time for vendor conversations, a tight budget for experiments, and no AI team to sense-check what you are being shown. A demo that skips your real workflow is a sales experience, not a capability assessment.

The regulatory context reinforces the point. The CMA’s April 2024 update on AI foundation models specifically flagged the risk of misleading capability claims, signalling that over-promising AI performance in sales contexts could attract scrutiny under consumer protection law. If the demo looks too smooth, that is worth questioning rather than accepting at face value.

Getting the structure right benefits both sides. You learn whether the tool is genuinely viable for your business before committing a budget. The vendor learns whether this is a real opportunity worth pursuing, rather than investing in a sales process that will eventually stall.

Where do you start?

The most useful place to begin is the workflow with the highest volume of repetitive, low-ambiguity steps in your business. Before any demo conversation, pull last month’s figures: average handling time, error count, number of touches per case. These become your baseline. The demo hypothesis is then simple: can this tool halve the handling time while keeping errors at or below today’s level?
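The hypothesis above comes down to a small piece of arithmetic, which is worth writing out before the demo so nobody argues about it afterwards. The sketch below is one way to express it in Python. The class name, field names, and figures are illustrative assumptions, not measurements from any real firm.

```python
from dataclasses import dataclass


@dataclass
class WorkflowStats:
    """Figures for one workflow over a fixed period, for example one month."""
    cases: int             # number of cases handled
    total_minutes: float   # total handling time across all cases
    errors: int            # cases that needed correction

    @property
    def avg_minutes(self) -> float:
        return self.total_minutes / self.cases

    @property
    def error_rate(self) -> float:
        return self.errors / self.cases


def meets_hypothesis(baseline: WorkflowStats, trial: WorkflowStats) -> bool:
    """True if average handling time is at least halved and the error rate is no worse."""
    time_halved = trial.avg_minutes <= baseline.avg_minutes / 2
    errors_held = trial.error_rate <= baseline.error_rate
    return time_halved and errors_held


# Illustrative numbers only (hypothetical, not real measurements).
baseline = WorkflowStats(cases=200, total_minutes=2400, errors=10)
trial = WorkflowStats(cases=20, total_minutes=110, errors=1)
print(meets_hypothesis(baseline, trial))
```

Comparing error rates rather than raw error counts matters here, because a demo or pilot will usually cover far fewer cases than a full month of normal work.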

Once you have the workflow, prepare the materials the AI will consume. Webreality’s guidance on helping AI understand your business highlights that tools like ChatGPT and Microsoft Copilot perform markedly better when your content is structured and machine-readable: clear FAQs, tagged service descriptions, standard intake templates. If your materials are messy, a good demo will show you that directly.

Do not try to hide the mess. Guidance for professional services firms consistently notes that AI is only as capable as the underlying document structure and information architecture. A demo that uses your real content and exposes the gaps is telling you something valuable: you need a content clean-up before deployment, not an AI tool right now.

Ask the vendor to load your actual FAQ or service sheet and run a few typical queries. Note where the AI answers confidently and where it struggles because the source content is vague or contradictory. That honest picture is what the demo is for.

When does a demo tell you the truth?

A demo reveals reality when it includes three things the vendor typically avoids: a live run where the AI encounters something it cannot answer and explicitly says so, an agentic action that stays in draft pending human approval rather than firing automatically, and a set of test cases you can run yourself to compare outputs against expected results.

The first test is the hand-off. Ask the vendor to configure the system so that, if it cannot find an answer in your uploaded materials, it says clearly that it does not know and routes to a human. Then deliberately run a query it cannot answer. If the vendor has not built this behaviour in, you are looking at a system that will fabricate rather than admit its limits, which is a significant operational risk in any client-facing context.
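To make the hand-off behaviour concrete, here is a deliberately simple sketch in Python. It is not how any particular vendor's product works: the word-overlap matching, the `min_overlap` threshold, and the wording of the hand-off message are all assumptions chosen to keep the example short. The point is the shape of the logic: answer only from uploaded content, and defer when nothing matches well enough.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Answer:
    text: str
    source: Optional[str]   # which uploaded entry the answer came from
    handed_off: bool        # True when the query is routed to a human


def _words(text: str) -> set:
    """Lower-cased words in the text, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def grounded_answer(query: str, faq: dict, min_overlap: float = 0.5) -> Answer:
    """Answer only from `faq`; if nothing matches well enough, say so and hand off."""
    query_words = _words(query)
    best_question, best_score = None, 0.0
    for question in faq:
        question_words = _words(question)
        # Share of the FAQ question's words that also appear in the query.
        score = len(query_words & question_words) / max(len(question_words), 1)
        if score > best_score:
            best_question, best_score = question, score
    if best_question is None or best_score < min_overlap:
        return Answer("I don't know. Passing this to a colleague.", None, True)
    return Answer(faq[best_question], best_question, False)


# Hypothetical uploaded content with a single entry.
faq = {"What are your opening hours?": "We are open 9am to 5pm, Monday to Friday."}
print(grounded_answer("what are your opening hours", faq))
print(grounded_answer("Can you advise on inheritance tax?", faq))
```

A real product will use far better retrieval than word overlap, but the question to ask in the demo is the same: what happens on the second kind of query?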

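The NCSC’s 2023 guidance on AI security highlights data exfiltration and prompt injection as real risks in deployed AI systems. A grounded, hand-off-first architecture can reduce exposure to both, because the agent works only from content you have uploaded and defers to a human when it has no answer.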

The second test is the approval loop. Ask the AI to draft an email, a booking confirmation, or an invoice. Watch it produce the draft, but make sure nothing is sent or posted without a human clicking approve. The JAX agent for Xero demonstrates this pattern with accounting workflows: natural-language commands build a draft invoice, the human reviews it, and only then does it reach the ledger.
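The approval pattern can be sketched in a few lines of Python. This is a generic illustration, not the implementation used by Xero or any other vendor, and the names `DraftAction`, `approve`, and `send` are hypothetical. What it shows is the property to look for in the demo: the release step refuses to run until a named human has approved the draft.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    DRAFT = "draft"
    APPROVED = "approved"
    SENT = "sent"


@dataclass
class DraftAction:
    kind: str                          # for example "email" or "invoice"
    body: str
    status: Status = Status.DRAFT
    approved_by: Optional[str] = None


def approve(draft: DraftAction, reviewer: str) -> None:
    """Record that a named human has reviewed and approved the draft."""
    draft.status = Status.APPROVED
    draft.approved_by = reviewer


def send(draft: DraftAction) -> None:
    """Release the action, but only if a human has approved it."""
    if draft.status is not Status.APPROVED:
        raise PermissionError("Draft has not been approved by a human.")
    # A real system would call the email or ledger API at this point.
    draft.status = Status.SENT


draft = DraftAction(kind="invoice", body="Invoice for March retainer")
try:
    send(draft)                        # attempted before approval, so it is blocked
except PermissionError as err:
    print(f"Blocked: {err}")

approve(draft, reviewer="office.manager")
send(draft)
print(draft.status, draft.approved_by)
```

Recording who approved each action also gives you the beginnings of an audit log, which is useful when you need to explain later why something was sent.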

ICO guidance on AI and data protection expects AI outputs affecting customers or employees to be explainable and subject to human oversight. Seeing an approval loop in the demo confirms that the vendor has a working mechanism for meeting that standard, not just an impressive product.
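The third element, a set of test cases you can run yourself, does not need special tooling. The Python sketch below shows one possible shape for it. The `fake_tool` function stands in for whatever interface the vendor exposes, and `HANDOFF_MARKER` is an assumed wording for the tool's hand-off message; both are placeholders you would swap for the real thing.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Case:
    query: str
    must_contain: Optional[str]   # None means the tool should hand off to a human


HANDOFF_MARKER = "i don't know"   # assumed wording of the tool's hand-off message


def run_cases(ask: Callable[[str], str], cases: List[Case]) -> List[Tuple[str, bool]]:
    """Run every case through `ask` and record whether it passed."""
    results = []
    for case in cases:
        reply = ask(case.query).lower()
        if case.must_contain is None:
            passed = HANDOFF_MARKER in reply
        else:
            passed = case.must_contain.lower() in reply
        results.append((case.query, passed))
    return results


def fake_tool(query: str) -> str:
    """Stand-in for the vendor's tool so the sketch runs on its own."""
    if "opening hours" in query.lower():
        return "We are open 9am to 5pm, Monday to Friday."
    return "I don't know. Passing this to a colleague."


cases = [
    Case("What are your opening hours?", "9am to 5pm"),
    Case("Can you advise on inheritance tax?", None),
]
for query, passed in run_cases(fake_tool, cases):
    print("PASS" if passed else "FAIL", query)
```

Including at least one case where the correct result is a hand-off keeps the test honest: a tool that always produces a confident answer will fail it.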

What does a credible demo include beyond the live run?

Beyond the workflow demonstration itself, a credible demo addresses three questions in plain English: what data does the AI touch, where does it run, and how is it supervised? iCentric’s adoption guidance notes that structured training sessions of around 45 minutes per role typically triple adoption rates compared with a launch email alone. The demo should preview that training plan, not treat it as an afterthought.

On data, ask the vendor directly: are prompts or outputs used to retrain the public model? Where is data stored and processed? The NCSC advises against pasting sensitive client or financial information into public AI tools. For a UK service firm, you want a clear answer that data stays within boundaries you control, and ideally that it remains within a UK or EU data centre.

The EU AI Act, adopted by the European Parliament in 2024, is also relevant. UK firms selling into Europe, or using AI services that serve EU clients, will increasingly encounter transparency requirements: logging, risk management, and labelled AI interactions. A vendor who can explain how their tool supports those obligations is worth continuing the conversation with. A vendor who waves the question away is telling you something about how seriously they take compliance.

Close the demo by asking: what does a four-to-six week pilot look like? What are the evaluation criteria? What does success look like in a format you can verify independently? A vendor who cannot answer that clearly has not run a serious pilot before.

Running a structured demo like this is about getting useful information from a conversation that usually produces very little. Test one workflow with your own data, make the AI show you where it falls short, and you will have what the decision actually requires.

Sources

- ICO (2023). Guidance on AI and data protection. Framework covering lawful basis, DPIAs, fairness, and human oversight requirements for UK organisations deploying AI systems that process personal data. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/
- ICO (2023). Explaining decisions made with AI. ICO guidance on explainability and accountability for AI-generated decisions affecting individuals, including transparency requirements relevant to approval loops and audit logging. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/explaining-decisions-made-with-artificial-intelligence/
- CMA (2024). AI Foundation Models: Update Paper. CMA analysis flagging risks of misleading capability claims and consumer harm, signalling that over-promising AI performance in sales contexts attracts scrutiny under consumer protection law. https://www.gov.uk/government/publications/ai-foundation-models-update-paper
- NCSC (2023). The security of AI systems. NCSC guidance on prompt injection, data exfiltration, and secure deployment patterns for AI, including the importance of access controls and data handling in UK business deployments. https://www.ncsc.gov.uk/whitepaper/the-security-of-ai-systems
- European Parliament (2024). Artificial Intelligence Act: MEPs adopt landmark law. Adoption of the EU AI Act introducing transparency obligations, risk management requirements, and logging standards affecting UK firms with EU clients or EU-facing AI services. https://www.europarl.europa.eu/news/en/press-room/20240308IPR19015/artificial-intelligence-act-meps-adopt-landmark-law
- OpenKit (2024). AI Agents for Business: UK Implementation Guide. UK consultancy guidance recommending single-workflow, time-boxed pilots with clear KPIs and baselines as the most effective route from demo to deployment for UK SMEs. https://openkit.co.uk/blog/posts/ai-agents-implementation-guide-uk
- iCentric (2024). AI for Business: The UK Guide to Adoption and ROI. UK AI adoption roadmap advising that structured training sessions of around 45 minutes per role typically triple adoption rates versus a launch email alone, with emphasis on baselines and evaluation harnesses. https://www.icentricagency.com/insights/ai-for-business
- Webreality (2024). How to help AI understand your business. Guidance on structuring business content, FAQs, service descriptions, and intake templates to improve AI accuracy and surface data-quality problems during demos and pilots. https://www.webreality.co.uk/insights/how-to-help-ai-understand-your-business/

Frequently asked questions

What should I bring to an AI vendor demo to make it useful?

Bring the baseline metrics for one workflow you want to test: how long it takes currently, how often errors occur, and how many steps are involved. Also bring a sample of your own materials, such as a FAQ, service description, or intake form, so the demo can run against real content rather than generic examples. That combination turns a product showcase into a genuine capability test.

How do I know if an AI demo is showing me something real rather than a vendor showcase?

Ask the vendor to run a query the system cannot answer from your uploaded materials, so you can see how it handles the gap. If it fabricates an answer rather than deferring to a human, that is important information. Also check whether the demo uses your actual content or generic samples. If it is generic, the demo tells you nothing about how the tool would behave in your specific business.

What data protection questions should I ask during an AI demo?

Ask three things: whether the vendor uses your prompts or outputs to retrain their public model, where your data is stored and processed, and how long it is retained. The ICO's guidance on AI and data protection sets clear expectations around lawful basis, transparency, and human oversight for AI systems handling personal data. A vendor who cannot answer these questions clearly has not built their tool with UK compliance in mind.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
