A vendor sat across the table from a managing director I work with, gestured at a slide, and said the words “powered by a large language model” twice in the first minute. The managing director nodded, looked across at me, and asked the question quietly while the salesperson kept talking: “What does that actually mean?”
It is the right question. By 2026, almost every business AI tool is described as being powered by, built on, or wrapped around a large language model. The phrase has stopped meaning anything specific when a vendor says it. It has not stopped mattering to you.
What is an LLM?
A large language model, or LLM, is software trained on enormous amounts of text to predict the next word in a sequence. ChatGPT, Claude, Gemini, and Llama are all LLMs. The word “large” refers to the number of internal numerical parameters the model adjusts during training, often hundreds of billions of them. The point is that an LLM is an engine, not a product.
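If you want to see the mechanism without the marketing, the sketch below does next-word prediction with GPT-2, a small, freely available model, using the open-source transformers library. It is a toy illustration of the engine, not what any vendor’s product actually runs; commercial models are vastly larger but work the same way.

```python
# Toy illustration of next-word prediction with a small open model (GPT-2).
# Commercial LLMs are far larger, but the underlying mechanism is the same.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The invoice is due on the first of"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits            # a score for every possible next token
probs = torch.softmax(logits[0, -1], dim=-1)   # convert scores to probabilities
top = torch.topk(probs, 5)                     # the model's five most likely next words

for token_id, p in zip(top.indices, top.values):
    print(f"{tokenizer.decode(int(token_id))!r}: {p:.1%}")
```

That loop is the whole trick, repeated one word at a time. Everything a product does on top of it is packaging.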
That distinction matters because the engine is rarely something the vendor pitching you has built. A product called “AI for accountants” or “AI legal assistant” is almost always a wrapper around an LLM that someone else trained. OpenAI’s GPT-5 family. Anthropic’s Claude Opus and Sonnet. Google’s Gemini. Meta’s Llama. Mistral’s open-weight models. The vendor has selected one and built their interface, prompts, and integrations on top.
Two consequences follow. First, the product cannot do anything the underlying model cannot do, no matter what the marketing says. Second, the vendor’s reliability, speed, and cost structure are downstream of the underlying provider’s, not their own.
Why it matters for your business
The cost economics show up first. When a vendor charges you a flat per-seat fee, they are absorbing the per-token API cost their LLM provider charges them. OpenAI raised input token pricing on GPT-5.5 by 100% in May 2026, which translated into cost increases of 49% to 92% for longer-context workloads. Your vendor either ate that cost, passed it on, or quietly routed you to a cheaper model.
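The arithmetic is worth running once. The sketch below is illustrative only: the per-token prices, usage per seat, and per-seat fee are all made-up numbers, and the real figures sit in your vendor’s contract with their provider, not in any public list.

```python
# Illustrative only: how a provider's per-token price change flows through
# a vendor's flat per-seat fee. Every number below is a hypothetical example.
input_price_per_1m = 5.00          # USD per million input tokens (assumed)
output_price_per_1m = 15.00        # USD per million output tokens (assumed)
queries_per_seat_per_month = 400   # assumed usage per seat
input_tokens_per_query = 2_000     # assumed: prompt plus retrieved context
output_tokens_per_query = 500      # assumed: the model's answer
seat_fee = 30.00                   # what the vendor charges you per seat per month

def monthly_model_cost(input_price: float, output_price: float) -> float:
    """Vendor's underlying model cost per seat per month, in USD."""
    input_m = queries_per_seat_per_month * input_tokens_per_query / 1_000_000
    output_m = queries_per_seat_per_month * output_tokens_per_query / 1_000_000
    return input_m * input_price + output_m * output_price

before = monthly_model_cost(input_price_per_1m, output_price_per_1m)
after = monthly_model_cost(input_price_per_1m * 2, output_price_per_1m)  # input price doubles

print(f"Vendor's model cost per seat: ${before:.2f} -> ${after:.2f}")
print(f"Vendor's margin per seat:     ${seat_fee - before:.2f} -> ${seat_fee - after:.2f}")
```

On these assumed numbers a doubling of input pricing cuts the vendor’s margin per seat noticeably, which is exactly the pressure that leads to price rises or quiet model swaps downstream.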
The continuity question shows up next. OpenAI previewed GPT-4.5 in February 2025 and announced its retirement from the API barely two months later, largely because it was too expensive to serve. If your vendor’s product was anchored to that model, they had to migrate, and your prompts, workflows, and outputs all shifted underneath you. By 2026 this pattern is normal. Foundation models cycle out roughly every twelve to eighteen months.
The reliability question shows up last but bites hardest. LLMs hallucinate. They produce confident, plausible, false answers because they are statistical engines, not factual ones. The UK National Cyber Security Centre has been explicit that this is intrinsic to how the technology works, and that responsible deployment requires retrieval, validation, and human review around the model, not faith in the model. A vendor who claims to have eliminated hallucination is selling you a story.
Where you will meet it
You will meet “LLM” in nearly every vendor pitch you take in 2026. The phrasing is consistent. “Our solution is powered by a large language model trained on…” is the structural opener. Sometimes the model is named, often it is not. The label has taken the slot in the pitch deck where “machine learning” sat in 2018 and “the cloud” sat in 2008.
You will also meet it in the small print of products you already use. Microsoft 365 Copilot uses GPT-family models from OpenAI. Salesforce Einstein uses a mix. Notion AI, Slack AI, Zendesk Copilot, and dozens of others all sit on top of the same handful of underlying engines. That is not a problem in itself. It does mean that a single LLM provider’s outage, price change, or model deprecation can ripple through several of your tools at once.
The most useful place to meet the term is in the contract. If a vendor will not name the model and version they use, that is a signal. If they will not commit in writing to a continuity plan if the model is deprecated, that is a different signal. Both belong in your due diligence.
When to ask about it, when to ignore it
Ask about the underlying LLM when the product is doing work that touches money, customer-facing communication, regulated decisions, or anything you would have to explain to an auditor. In those cases the model behind the wrapper is part of your supply chain. You need to know what it is, who owns it, where it is hosted, and whose data protection regime governs it.
Ignore the term when the product is doing low-stakes, easily checked work. A team using a writing assistant to draft internal emails does not need to interrogate the underlying engine. The question worth asking is whether the output is good enough at the price, not which LLM is generating it.
There is one trap worth flagging. “Powered by an LLM” is sometimes used to imply sophistication that is not there. A glossy chatbot driven by nothing more than generic prompts, with no retrieval and no domain context, is technically powered by an LLM and practically useless for your business. The phrase tells you the engine type. It tells you nothing about whether the rest of the car is built.
Related concepts
A foundation model is the broader category that includes LLMs. Every LLM is a foundation model, but foundation models also include vision models, audio models, and multimodal systems. Vendors use the two terms interchangeably more often than they should.
Parameters are the internal numbers that make up the model. A vendor talking about a 70-billion-parameter model is signalling roughly how capable and roughly how expensive it is to run. You do not need to understand the maths to recognise that bigger models are usually more capable, slower, and more expensive per query.
Training is the one-off process of building the model from data. The provider does it once, at enormous cost. Inference is what happens every time you or a customer uses the model. Inference is where your costs accrue. When a vendor talks about scaling their AI, they almost always mean inference, not training.
Context window is how much text the model can hold in working memory in a single interaction. Longer context windows let the model work with longer documents in one pass, but they cost more per query.
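A rough way to make that concrete: context is measured in tokens, not pages, so the practical question is whether your document fits in one pass. The sketch below uses OpenAI’s open-source tiktoken tokenizer; the 128,000-token window and the contract.txt filename are stand-in examples, not a statement about any particular product.

```python
# Rough check of whether a document fits in a model's context window.
# The 128,000-token window is an assumed example; check your provider's figure.
import tiktoken

CONTEXT_WINDOW = 128_000
enc = tiktoken.get_encoding("cl100k_base")  # tokeniser used by several OpenAI models

with open("contract.txt", encoding="utf-8") as f:   # stand-in filename
    document = f.read()

tokens = len(enc.encode(document))
print(f"Document length: {tokens:,} tokens "
      f"({tokens / CONTEXT_WINDOW:.0%} of a {CONTEXT_WINDOW:,}-token window)")
```

If the document does not fit, the vendor is either splitting it, summarising it, or silently truncating it, and it is worth knowing which.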
Hallucination, retrieval-augmented generation (RAG), fine-tuning, and prompt engineering all come up in the next conversation. Hallucination is the failure mode; the other three are the techniques vendors use around the LLM to make it useful for your specific business.
The point of all of this is not to make you fluent in AI architecture. It is to give you enough vocabulary that the next vendor who says “powered by a large language model” cannot use the phrase to end the conversation. Treat it as the start of the conversation.



