Open four vendor websites in the same product category. Maybe it is contract review, maybe it is sales outreach, maybe it is bookkeeping. Each one leads with the words ‘AI-powered’. Each one describes what it does in almost identical language: lots of verbs about understanding and automating, very few nouns about how. None of them tells you what is actually under the hood. You are an hour in, no closer to a decision, and the sales calls are still to come.
This is the buyer’s reality in 2026. The phrase ‘AI-powered’ has become an empty label, in the same way ‘cloud-based’ was empty by 2015 and ‘mobile-friendly’ was empty by 2012. The phrase tells you the vendor sees value in claiming the feature. It does not tell you the feature is real, or that it is the part of the product doing the work for your use case. Reading past it is a skill, and it saves you the cost of buying something that does less than the marketing implied.
What does ‘AI-powered’ actually mean on a vendor website in 2026?
Almost nothing on its own. The term now covers everything from a large language model fine-tuned on domain data, to a chatbot wrapper sat on top of a third-party API, to lightly rebadged keyword search. The FTC’s Operation AI Comply has charged multiple vendors over deceptive claims, the FCA flags AI-washing in financial services, and the ASA enforces against misleading AI ads.
The phrase has lost specific meaning in the usual way. It started as a description of generative models trained on large corpora, then marketing teams extended it to cover any product that touches AI anywhere in its stack. A spell-checker can now be ‘AI-powered’. A search box can be ‘AI-powered’. The Builder.ai collapse in 2025 showed that a company valued at roughly £2 billion can describe itself as AI-first while the product is around 700 engineers in India doing the work manually, with revenues reportedly inflated by roughly 300 per cent. Capital and brand are no substitute for disciplined buyer interrogation.
Which four claims actually tell you something about the product?
Four questions surface what the label hides. Which foundation model sits underneath: GPT, Claude, Gemini, Llama, Mistral, Qwen, or something the vendor built in-house. What fine-tuning or retrieval-augmented generation has been added on top. Whether answers are generated by the model from scratch or retrieved from a controlled knowledge base. And where a human reviews the output before it reaches a customer.
Each has a real consequence. The foundation model sets the baseline capability floor, and benchmarks from Stanford HELM and MLPerf let you sense-check the claim. Fine-tuning and retrieval-augmented generation tell you how much of the product is generic and how much is shaped to your domain. Generated answers can hallucinate, with industry tracking showing rates from 0.7 per cent on the best models to almost 30 per cent on weaker ones. Retrieval-grounded answers can still be wrong, but they are anchored to sources you can check, which makes outright invention far less likely. The position of the human reviewer separates a tool that drafts a credit decision for an analyst to approve from one that issues the decision directly, with all the regulatory exposure that creates.
You do not need to grade the answers on a technical scale. You need specific answers rather than marketing answers. A vendor who says “we use Claude with retrieval-augmented generation against your data, every output is reviewed by your operations lead before it sends” is showing you the product. A vendor who says “our proprietary AI engine optimises your workflow end-to-end” is hiding it.
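To make that quoted answer concrete, here is a minimal sketch of what ‘retrieval-augmented generation against your data, with human review’ can look like. Every name in it is a hypothetical stand-in, not a real vendor API; the point is the shape of the pipeline and where each of the four questions lands, not the implementation.

```python
# A hypothetical sketch of a retrieval-grounded, human-reviewed pipeline.
# All functions and names are illustrative assumptions, not a vendor's API.

KNOWLEDGE_BASE = [
    {"source": "refund-policy.md", "text": "Refunds are issued within 14 days."},
    {"source": "shipping.md", "text": "Orders ship within 2 working days."},
]

def search_knowledge_base(query: str, top_k: int = 2) -> list[dict]:
    """Q3: answers are retrieved from a controlled store, not invented.
    Real systems use vector search; naive word overlap stands in here."""
    words = set(query.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda p: len(words & set(p["text"].lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def call_model(model: str, prompt: str) -> str:
    """Q1/Q2: which foundation model, and what is layered on top.
    A placeholder instead of a real API call."""
    return f"[draft from {model}]\n{prompt[:160]}..."

def queue_for_review(draft: str, sources: list[str]) -> None:
    """Q4: a human signs off before anything reaches a customer."""
    print("FOR REVIEW (sources:", ", ".join(sources), ")\n", draft)

def answer_customer_query(query: str) -> None:
    passages = search_knowledge_base(query)
    prompt = (
        "Answer using ONLY the passages below, citing each source.\n\n"
        + "\n\n".join(f"[{p['source']}] {p['text']}" for p in passages)
        + f"\n\nQuestion: {query}"
    )
    draft = call_model(model="whichever-model-the-vendor-names", prompt=prompt)
    queue_for_review(draft, sources=[p["source"] for p in passages])

answer_customer_query("How fast do refunds arrive?")
```

A vendor who can walk you through their equivalent of each of those four steps is describing a product. A vendor who cannot is describing a label.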
Where will you actually meet these claims when you are evaluating a vendor?
Three surfaces on the vendor’s own website, written for different audiences. The marketing landing page is where the strongest claims live and the least information sits. The technical product page or developer docs are where engineers have to be specific. The help documentation is where the support team explains what users actually do day to day, including the bits marketing would rather not advertise.
Read those three pages in reverse order. Start with the help docs and look for the workflow. Does a user have to upload spreadsheets, tag fields, review every output, send the response themselves? That tells you the system is human-in-the-loop and probably narrower than the demo suggests. Then read the technical documentation for the model name, the integration shape, the data pipeline. Only then read the marketing landing page, and read it as the gloss layer on top of what you already know.
The integrations list is the other quiet signal. A vendor with twenty named integrations into the systems an owner-operated business actually runs (Xero, HubSpot, Slack, Google Workspace, Microsoft 365) has done the engineering work that makes a tool useful day to day. A vendor with one generic webhook and a ‘contact us for custom integration’ page has not. The depth of the integrations list is a better predictor of practical fit than the strength of the marketing claim.
When should an owner ask these questions, and when does the level of detail not matter?
Ask the four questions when the AI claim is the main reason you are buying, or when the product will touch customer data, regulated decisions, or anything that affects your reputation. A bookkeeping tool that auto-categorises transactions is worth interrogating because the output ends up in your statutory accounts. A meeting transcription tool that summarises calls is worth less scrutiny, because the worst case is a bad summary you delete.
Skip the detail when the AI part is incidental. If you are buying a project management tool and it happens to have an AI assistant that suggests task names, you do not need the foundation model lineage. The product’s value is in the project management, not the AI. The cost of getting the AI part wrong is low, and asking sales reps four architecture questions about a feature you do not really care about is a wasted hour.
The sharper rule is to ask the four questions in proportion to the risk if the AI gets it wrong. High-stakes workflows deserve the architecture conversation: customer-facing automation, regulated decisions, anything that produces output you would not want to defend in court. Low-stakes assistive features do not. Owner judgement, not a blanket rule.
What related ideas help you read AI claims with more confidence?
Four underlying concepts make the questions easier to apply. A foundation model is the general-purpose AI model the product runs on. Fine-tuning adapts that model to a specific domain with a smaller targeted dataset. Retrieval-augmented generation, or RAG, grounds answers in a controlled knowledge base, which reduces hallucination. Human-in-the-loop puts a person in the path before the output reaches the world, as the sketch below illustrates.
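The sketch routes model output through a review gate, which is the question that matters: where, exactly, does the person sit? The threshold, the confidence score, and the function name are all assumptions for illustration; real products expose or hide this logic in very different ways.

```python
# A hedged sketch of human-in-the-loop as a routing decision, not a
# rubber stamp. All names and the threshold are illustrative assumptions.

def route_output(draft: str, confidence: float, customer_facing: bool) -> str:
    REVIEW_THRESHOLD = 0.9  # assumption: the buyer negotiates this, not the vendor

    if customer_facing and confidence < REVIEW_THRESHOLD:
        # The reviewer sits before the output reaches the world.
        return f"held for human review: {draft!r}"
    if customer_facing:
        # Even confident output gets sampled rather than trusted blindly.
        return f"sent, logged for spot-check sampling: {draft!r}"
    # Internal drafts carry less risk; the user is the reviewer.
    return f"delivered to internal user: {draft!r}"

print(route_output("Your refund is approved.", confidence=0.72, customer_facing=True))
```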
Two regulatory threads are worth knowing about even if you are not in financial services or healthcare. Article 50 of the EU AI Act already requires AI systems that interact with people to disclose they are AI, and requires generated content to be marked as such. California’s Generative AI Training Data Transparency Act took effect on 1 January 2026, requiring developers of generative systems to disclose their training data sources. These will gradually shift vendor norms toward transparency. The OWASP Top 10 for Large Language Model Applications gives you a security frame (prompt injection, training data poisoning, insecure output handling) that a serious vendor should be able to talk about without flinching.
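One of those OWASP items is easy to make concrete. Insecure output handling means treating model output as trusted when it is not: if a product drops generated text straight into a web page or email, a poisoned document in the knowledge base can smuggle markup or script through the model. In the snippet below, html.escape is real standard-library Python; the model_output string is an invented example.

```python
# One OWASP LLM Top 10 item in miniature: insecure output handling.
# Model output is untrusted input and should be escaped like any other.

import html

model_output = (
    'See our <a href="https://evil.example">refund policy</a>'
    "<script>steal()</script>"
)

# Unsafe: raw model text dropped into HTML executes whatever it carries.
unsafe_fragment = f"<div class='answer'>{model_output}</div>"

# Safer: escape before rendering, so any markup arrives as inert text.
safe_fragment = f"<div class='answer'>{html.escape(model_output)}</div>"

print(safe_fragment)
```

A vendor who shrugs at this distinction has not thought about what happens when their model reads a hostile document.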
The next time you read ‘AI-powered’ on a vendor page, treat it as the start of a question, not the end of a description. The vendor is signalling that they think AI is the interesting part of the story. Your job is to find out which part actually is, and whether that part covers the work you need done. Many vendors are genuine in one layer of their stack and looser everywhere else. Locating the genuine layer is the buyer’s skill.
If you would like a second pair of eyes on a vendor shortlist before you sign anything, book a conversation.



