The managing director of a 40-staff professional services firm I spoke with last month put a single sheet of paper in front of me. On it she had listed seven tools the firm had bought or trialled in the last fifteen months. ChatGPT Team for the marketing team. A contract-review add-in for the legal group. A bid-writing assistant. An AI note-taker. Two recommendation features baked into existing software. A chatbot on the website.
Her question was sharp. “Across these seven, which are generative AI, which are something else, and which carry EU AI Act exposure if we sell into Berlin next year?” Her firm has used AI for over a year, yet it has never defined what generative AI actually is, where it fits, or which work it should stay well away from.
That is the conversation this post is built for. By 2026, “we use AI” has stopped working as a category description. The cost of getting the framing wrong shows up in monthly token bills, vendor lock-in, copyright exposure, and regulatory risk you did not realise you were carrying.
What is generative AI?
Generative AI is the family of AI systems that produce new content (text, images, video, audio, or code) rather than classifying or predicting from existing data. It learns the statistical distribution of its training data and generates outputs that resemble that distribution, predicting the most probable next token, frame, or waveform. That is what makes it fluent at scale, and what makes it unreliable on facts without grounding.
The contrast with what came before matters. Rule-based automation follows explicit if-then logic: the kind of routing your CRM has done for a decade. Discriminative machine learning classifies and predicts from labelled data: the maths behind fraud scoring, demand forecasting, and recommendation engines. Generative AI does neither. It produces fluent new content that is statistically plausible, not factually verified. Holding those three types apart is the most useful single distinction in the whole field.
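The three-way split can be made concrete with a deliberately toy sketch. Every rule, threshold, and probability below is invented for illustration, and the “generative” function is a stand-in for next-token sampling, not a real model:

```python
import random

# 1. Rule-based automation: explicit if-then logic, fully auditable.
def route_ticket(subject: str) -> str:
    if "invoice" in subject.lower():
        return "billing"
    return "general"

# 2. Discriminative ML (stand-in): scores an input against a learned
#    threshold and returns a label. It never produces new content.
def fraud_check(amount: float, threshold: float = 900.0) -> str:
    return "flag" if amount > threshold else "clear"

# 3. Generative (stand-in): samples the next word from a probability
#    distribution learned from training data. Plausible, not verified.
def next_word(history: str) -> str:
    learned = {"the": 0.5, "a": 0.3, "our": 0.2}  # invented probabilities
    return random.choices(list(learned), weights=list(learned.values()))[0]

print(route_ticket("Invoice overdue"))  # billing
print(fraud_check(1200.0))              # flag
print(next_word("Please review"))       # one of: the / a / our
```

The first two functions are explainable to an auditor line by line; the third can only be described statistically. That asymmetry is the whole governance story in miniature.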
Why it matters for your business
The framing decides where you spend money and where you carry risk. Adoption among UK SMEs is widespread, with 58% reporting generative AI use in 2025 and 91% of users reporting revenue impact. The competitive question has shifted from “should we use this?” to “are we using the right tool for each problem and have we audited cost and risk?” Saying yes to the first and ignoring the second is the common 2026 pattern.
The cost story has changed too. 2026 pricing across the major vendors is now usage-based and increasingly opaque. OpenAI doubled GPT-5.2 pricing earlier this year and now charges 2x input pricing on requests above 272,000 tokens. Anthropic has moved Claude enterprise edition from fixed pricing to usage-based dynamic pricing. The Register reported in April that 58% of organisations attempting to switch AI vendors found the migration either failed or required significantly more effort than expected. Vendor lock-in is now a quantifiable cost line, not a hypothetical one.
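The long-context surcharge pattern described above is easy to model. The 272,000-token threshold and 2x multiplier come from this post; the base rate below is a placeholder, not any vendor’s real price:

```python
THRESHOLD = 272_000    # input tokens before the surcharge applies (from the text)
BASE_RATE = 1.25       # placeholder: £ per million input tokens, illustrative only
MULTIPLIER = 2         # surcharge on tokens beyond the threshold

def input_cost_gbp(tokens: int) -> float:
    """Cost of one request's input tokens under a tiered rate card."""
    base = min(tokens, THRESHOLD) * BASE_RATE / 1_000_000
    extra = max(tokens - THRESHOLD, 0) * BASE_RATE * MULTIPLIER / 1_000_000
    return base + extra

print(round(input_cost_gbp(200_000), 2))  # 0.25
print(round(input_cost_gbp(400_000), 2))  # 0.66
```

Note that doubling the request size from 200,000 to 400,000 tokens more than doubles the cost, which is exactly the kind of non-linear pricing that makes flat per-seat budgeting obsolete.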
Where you will actually meet it
You will meet generative AI across five model families and three deployment shapes. The five families are large language models for text (GPT-5.4, Claude Opus 4.6, Gemini 3 Pro, Llama 3.3), image models (Midjourney, DALL-E, Flux, Imagen), video models (Sora 2, Veo 3.1, Kling, Runway), audio and voice (ElevenLabs, OpenAI Realtime), and multimodal models that accept text, image, and sometimes video input and produce text output.
Inside your stack you will meet it in three shapes. Direct tools your team logs into: ChatGPT Team, Claude for Work, Gemini Workspace. Embedded features inside software you already buy: Microsoft 365 Copilot, Salesforce Einstein, Notion AI, the AI tab inside your accounting platform. And bespoke integrations a developer or a vendor has built on top of an underlying API. The same handful of foundation models sits underneath almost all of them, which means a single provider’s price change or deprecation can ripple through several tools at once.
For a 10 million-token-per-month workload, a realistic volume for a small support or content operation, GPT-5.4 runs roughly £150 to £300 a month, Gemini 3 Pro £40 to £100, DeepSeek £20 to £40. The savings rarely materialise as cleanly as the pricing tables suggest, because switching requires rebuilding integrations and retraining staff. Treat the rate cards as a floor, not a forecast.
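A quick back-of-envelope comparison makes the spread visible. The figures below are the midpoints of the ranges quoted above, illustrative numbers from this post rather than vendor rate cards:

```python
MONTHLY_TOKENS = 10_000_000  # the workload assumed in the text

# (low, high) monthly £ ranges quoted above
ranges = {
    "GPT-5.4": (150, 300),
    "Gemini 3 Pro": (40, 100),
    "DeepSeek": (20, 40),
}

def midpoint_cost(low: float, high: float) -> float:
    return (low + high) / 2

for model, (low, high) in ranges.items():
    mid = midpoint_cost(low, high)
    per_million = mid / (MONTHLY_TOKENS / 1_000_000)
    print(f"{model}: ~£{mid:.0f}/month, ~£{per_million:.2f} per million tokens")
```

On the midpoints, the cheapest option is roughly a seventh of the most expensive per token, which is why the rate cards look so tempting, and why the switching costs described above deserve a line in the same spreadsheet.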
When to ask about it, when to ignore it, and when to refuse it
Ask about generative AI when a vendor’s product is doing work that touches money, regulated decisions, customer-facing communication, or anything an auditor would want explained. In those cases the underlying model is part of your supply chain, and you need to know which one, who owns it, where it is hosted, and what their continuity plan is. The 2026 lifecycle is roughly twelve to eighteen months between major versions.
Ignore it when the work is fluent, low-stakes, and easily checked: drafting an internal email, summarising a meeting transcript, generating ten product photo variations for a designer to pick from, brainstorming campaign angles. Speed and volume matter; perfection does not; human review is acceptable overhead. The ROI is proven and the risk surface is small.
Refuse it for three categories of work. Anything customer-facing or legally binding without grounding to verified sources, because hallucination risk is unacceptable. French courts have already rejected legal submissions that cited nonexistent case law, and the UK ICO’s January 2026 guidance requires meaningful human oversight for any automated decision affecting individuals under the Data (Use and Access) Act 2025. Classification and prediction problems where traditional machine learning solves the same thing at higher accuracy and a fraction of the cost. And real-time performance-critical systems where a 2% error rate is catastrophic.
Related concepts
A large language model is the text-generating subset of generative AI, the engine inside ChatGPT, Claude, and Gemini. Almost every business AI tool in 2026 has an LLM somewhere in its stack, often wrapped behind a friendlier interface and a vendor’s prompts. The other four families, image, video, audio, and multimodal, sit beside the LLM family, and frontier vendors increasingly ship them together as one multimodal service.
A foundation model is the broader category that covers LLMs and image, video, and audio models. Vendors use foundation model and LLM interchangeably more often than they should.
Multimodal AI is what happens when one model accepts, and increasingly produces, multiple modalities together: text plus image plus video. Frontier models in 2026 are increasingly multimodal by default.
Retrieval-augmented generation (RAG) and fine-tuning are the two main techniques used to make a generic foundation model useful for your specific business. RAG grounds the model in your documents at query time. Fine-tuning bakes new behaviour into the weights.
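The RAG pattern is simple enough to sketch end to end. The retriever below uses naive keyword overlap and invented documents purely for illustration; production systems use embedding-based search, but the shape is the same: retrieve your own sources, then put them in front of the model so it answers from them rather than from its training data.

```python
# Invented example documents standing in for a firm's knowledge base.
documents = [
    "Refunds are processed within 14 days of a returned item.",
    "Our office is closed on UK bank holidays.",
    "Invoices are payable within 30 days of issue.",
]

def retrieve(query: str, docs: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by shared words with the query (toy retriever)."""
    q_words = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def grounded_prompt(query: str) -> str:
    """Prepend retrieved context so the model answers from your sources."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(grounded_prompt("How long do refunds take?"))
```

Fine-tuning has no equivalent sketch this short: it changes the model’s weights through training, which is why RAG is usually the first technique a small firm reaches for and fine-tuning the second.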
The EU AI Act and data residency are the regulatory frame around all of this. If you place AI systems on the EU market or your AI’s output is used by EU citizens, the Act applies, and high-risk obligations go live on 2 August 2026.
The point of separating these terms is to give you enough vocabulary that the next vendor saying “powered by AI” cannot use the phrase to end the conversation. Treat it as the start of the conversation, and audit your stack the way the managing director with the seven-tool list did. That is the work.