Spotting an AI vendor who doesn't know what they're doing

[Image: A woman in her late forties sitting across a meeting-room table from a vendor, her open notebook in front of her, listening as he speaks]
TL;DR

The difference between real AI vendor expertise and visible inexperience is observable in the first thirty minutes of a sales conversation. Five tells filter the worst of the field: vague answers to specific operational questions, marketing language used instead of named models and architecture, case studies that do not survive a follow-up, pricing that omits obvious components, and timelines that compress every scope into a few weeks.

Key takeaways

- Standard advice to do thorough due diligence does not survive contact with a time-poor owner-operator running a thirty-person business; five recognisable tells, applied in conversation, work better than a 40-point checklist.
- Tell one is vague answers to specific operational questions. A credible vendor names retrieval-augmented generation, fine-tuning, or named foundation models when asked how the system handles edge cases; a weak one falls back on "advanced AI" and "proprietary algorithms".
- Tell two is marketing language replacing technical naming. A credible vendor will tell you which model (GPT-4o, Claude 3.5 Sonnet, Llama, Mistral, Gemini) and why they chose it for your problem; a weak one will not.
- Tell three is case studies that do not survive "can I speak to that client". Tell four is pricing that omits data preparation, integration, monitoring and change management. Tell five is "a few weeks" deployment regardless of scope.
- This is the first-conversation filter; structured due diligence with twelve named questions and reference checks comes later in the process. Both belong in a healthy buying decision.

She left the demo last week with a slightly uneasy feeling she could not name. The slides were polished, the speaker was articulate, and her operations director liked the workflow diagrams. She accepted a follow-up anyway, because nothing the vendor said was obviously wrong. A draft proposal is now in her inbox. The unease has not gone away.

The unease is usually data. An owner who has run enough conversations with enough professionals has a calibrated sense for whether the person across the table actually knows what they are doing. With AI, the calibration is harder, because the vocabulary is new. The good news is that the signals are observable and the list is shorter than people expect. Five tells, applied in the first thirty minutes, filter the inexperienced vendors before anyone signs anything.

This is not vendor blacklisting, and it is not “all AI vendors are cowboys” framing. The AI vendor market in 2026 contains real expertise and visible inexperience side by side, often at firms with similar websites. What follows is a recognition skill for the buyer.

What is the cheapest filter for an AI vendor’s competence?

The cheapest filter is the specificity of their answer to a specific operational question. Ask “how does your system handle customer data containing variations the model has not seen during training” and listen. A credible answer names retrieval-augmented generation, a fine-tuning approach with data governance, or acknowledges that data drift is an ongoing concern requiring monitoring. A weak answer says “our advanced AI adapts automatically” and changes the subject.

This is the single most reliable tell. Princeton researchers Arvind Narayanan and Sayash Kapoor describe AI snake oil as systems that do not, and likely cannot, work as advertised, and the language pattern they document is what a non-technical owner can hear in a sales meeting. Vendors with real production experience have lost weekends to data quality, integration breakage, and model behaviour they did not predict. That experience produces specificity. Vendors without it default to abstraction. The gap is audible inside ten minutes.
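For the technically curious reader, here is the retrieval idea in miniature. This is a toy sketch, not production code: word overlap stands in for the vector embeddings a real system would use, and every document and query in it is invented for illustration. The shape is what matters: relevant company data gets stapled to the question before any model sees it.

```python
# Toy sketch of the retrieval step in retrieval-augmented generation (RAG).
# A real system would use vector embeddings and a language model API; here,
# word overlap stands in for semantic similarity so the example runs with no
# dependencies. The documents and the query are invented for illustration.

def score(query: str, document: str) -> float:
    """Crude relevance: fraction of query words that appear in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words & doc_words) / len(query_words)

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents most relevant to the query."""
    return sorted(documents, key=lambda d: score(query, d), reverse=True)[:top_k]

# The knowledge base: company records the model never saw during training.
knowledge_base = [
    "Refund policy: refunds are issued within 14 days of a written request.",
    "Bereavement fares: the discount applies when booked before travel, not after.",
    "Opening hours: the support desk operates 9am to 5pm, Monday to Friday.",
]

question = "What is the policy on bereavement fares?"
context = retrieve(question, knowledge_base)

# In production, the retrieved context is prepended to the model's prompt so
# it answers from your data rather than from whatever it memorised in training.
prompt = "Answer using only this context:\n" + "\n".join(context) + "\nQ: " + question
print(prompt)
```

A credible vendor can narrate that shape in plain English without being asked twice. The tell is whether they can.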

Why does marketing language for technical concepts matter?

It matters because credible vendors tell you which model they use, why, and what they did to it. A weak vendor will not. The named foundation models in 2026 are GPT-4o from OpenAI, Claude 3.5 Sonnet from Anthropic, Llama from Meta, Mistral from Mistral AI, and Gemini from Google. Each has different cost and reasoning characteristics. A serious vendor has a reason for the one they picked and can explain it in plain English.

A vague answer here is diagnostic. “We use our proprietary AI” or “we use the best available models” usually means the vendor is wrapping a public API and adding little engineering on top, while obscuring that fact. There is nothing wrong with wrapping a public API; many useful products do exactly that. There is something wrong with hiding it. The wrapper itself is the work, and a credible vendor describes the wrapper, the model choice, the trade-offs. A vendor who answers a technical question with marketing words has either not done the engineering or is not comfortable describing it.

Why do case studies sometimes not survive a follow-up question?

They do not survive because the case study was constructed for the deck rather than drawn from a real customer the vendor is happy for you to call. The follow-up is the cheapest verification step. “Can I speak to that client, fifteen minutes on the phone.” A credible vendor says yes and provides the contact within days. A weak vendor cites confidentiality, offers a testimonial, or names a contact who has moved on.

Confidentiality is sometimes genuine, and many strong vendors hold enterprise clients under NDA. The diagnostic is not one refusal, it is the pattern across three. Ask for three named references at firms similar to yours in size and sector. A credible vendor produces them. A weak vendor produces one and becomes elusive. When you speak to a reference, ask two questions: what did the vendor get wrong, and what would you have done differently. References rarely volunteer this; when asked directly, they usually answer.

What pricing tells reveal vendor inexperience?

The clearest tell is a quote that contains only the visible cost. Software licensing, API consumption, and compute show up. Data preparation, integration, testing, change management, and monitoring do not. Glean’s research on AI total cost of ownership finds licensing is a fraction of true first-year spend. A vendor whose quote omits the larger lines either does not understand the reality or is choosing not to surface it.
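To see why the quoted line is only the tip, run the arithmetic yourself. Every figure below is a hypothetical assumption, chosen to show the shape of the iceberg rather than any real project's numbers.

```python
# Illustrative first-year cost iceberg. Every figure is a hypothetical
# assumption chosen for shape, not accuracy; a real quote needs real numbers.
first_year_costs = {
    "software licensing": 12_000,        # the line most quotes show
    "API and compute usage": 8_000,
    "data preparation": 15_000,          # the lines quotes tend to omit
    "integration and testing": 20_000,
    "change management and training": 6_000,
    "monitoring and maintenance": 9_000,
}

total = sum(first_year_costs.values())
licensing_share = first_year_costs["software licensing"] / total
print(f"Total first-year spend: £{total:,}")
print(f"Licensing share of the total: {licensing_share:.0%}")
```

On these invented numbers, licensing is about a sixth of the first-year total. The exact fraction varies from project to project; the shape does not.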

Two questions catch this. What is your assumption about the state of our data on day one, and what is included in the quote if that assumption is wrong. A credible vendor names the assumption explicitly, “we have assumed your customer data is in one system, accessible by API, and reasonably clean,” and quotes a range for the case where it fails. A weak vendor responds in generalities or claims the system handles data integration automatically. The second answer is rarely true in production.

Watch also for usage-based pricing without caps. Unbounded usage pricing is fine in principle, but a serious vendor offers a cap, a usage projection, or a monthly review mechanism so the bill does not surprise you. A vendor who shrugs at the question of cost predictability has either not had the conversation with existing customers or has had it and lost.
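If you want to pressure-test a usage-based quote yourself, the arithmetic is short enough to sketch. The numbers below are assumptions for illustration, not anyone's actual rates; the point is that a serious vendor will walk you through exactly this sum before you sign.

```python
# Back-of-envelope projection for usage-based pricing: the arithmetic a
# serious vendor should walk you through before you sign. Every number
# below is a hypothetical assumption, not a real price list.
calls_per_day = 400           # assumed query volume
tokens_per_call = 2_000       # assumed prompt plus response size
price_per_1k_tokens = 0.01    # assumed blended rate, in pounds
working_days_per_month = 22
monthly_cap = 250.00          # the cap a credible vendor offers unprompted

tokens_per_month = calls_per_day * working_days_per_month * tokens_per_call
projected = tokens_per_month / 1_000 * price_per_1k_tokens
billed = min(projected, monthly_cap)

print(f"Projected monthly usage cost: £{projected:,.2f}")
print(f"Billed under a £{monthly_cap:,.2f} cap: £{billed:,.2f}")
```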

When should a “few weeks” timeline make you walk?

When the scope clearly exceeds a few weeks of integration work. Unframe’s practitioner research on AI agent deployment finds that enterprise deployments commonly take seven to twelve months, dominated not by model complexity but by the integration work of connecting the agent to the systems where data actually lives. An owner-managed SME with smaller scope might deploy faster, but the floor is set by integration, not by the vendor’s sales eagerness.

A credible vendor distinguishes a proof-of-concept (rapid, narrow, lower cost) from a production deployment (slower, broader, real money). They might say “we can have a working proof-of-concept in four weeks if your data is accessible by API, and a production deployment across your full scope in four to six months including integration, monitoring, and user acceptance testing.” That is a serious answer. A vendor who promises three weeks regardless of scope has either not deployed into a real production environment or is not telling you what they will quietly cut to hit the date. The Air Canada chatbot case, where a deployed assistant gave a customer incorrect bereavement-fare information and the airline was held liable, is the kind of corner cut when timelines compress and monitoring is dropped.

The five tells are a first-conversation filter, not a substitute for the twelve-question due diligence framework that comes later, and not a substitute for reference checks, contract review, or a properly priced total cost of ownership. They exist so an owner does not spend three hours of diligence on a vendor who would have failed at the first thirty minutes. The unease the buyer could not name at the start of this post is usually one of the five tells in disguise. Now it has a name, and a name makes it cheap to act on. If she wants to talk it through before responding to the proposal, she can book a conversation.

Sources

- Princeton Center for Information Technology Policy (2024). AI Snake Oil: a conversation with Arvind Narayanan and Sayash Kapoor on distinguishing AI that does not work from AI that does. Used here to ground the credible-versus-snake-oil framing of vendor language. https://research.princeton.edu/news/ai-snake-oil-conversation-princeton-ai-experts-arvind-narayanan-and-sayash-kapoor
- Stanford HAI (policy brief). Validating Claims About AI: A Policymaker's Guide. Framework for the three sequential questions every vendor claim must answer: what is being claimed, what was tested, and do the two match. https://hai.stanford.edu/policy/validating-claims-about-ai-a-policymakers-guide
- NCSC UK (2024). Machine Learning Principles: regulatory guidance for ML system developers and operators on transparency and secure-by-design. Used to ground the model-card and limitations expectations of a credible vendor. https://www.ncsc.gov.uk/collection/machine-learning-principles
- NIST (2024). AI Risk Management Framework and Generative AI Profile. Standards reference for governance, drift monitoring, and acceptable risk levels in production AI. https://www.nist.gov/itl/ai-risk-management-framework
- AWS (2024). What is Retrieval-Augmented Generation? Reference architecture for the RAG approach a credible vendor should be able to explain in plain terms. https://aws.amazon.com/what-is/retrieval-augmented-generation/
- Fortune (August 2025). MIT NANDA report: 95 per cent of generative AI pilots fail to deliver measurable return. The asymmetric-risk backdrop for vendor selection in SME spend. https://fortune.com/2025/08/18/mit-report-95-percent-generative-ai-pilots-at-companies-failing-cfo/
- Cloud Security Alliance (2024). The Risks of Relying on AI: Lessons from Air Canada's Chatbot. Documented case where a deployed chatbot gave incorrect bereavement-fare information and the airline was held liable; used here for the failure-mode discussion. https://cloudsecurityalliance.org/blog/2024/06/05/the-risks-of-relying-on-ai-lessons-from-air-canada-s-chatbot-debacle
- Glean (2024). How to Budget for the Total Cost of Ownership of AI Solutions. Reference for the cost-iceberg pattern, with software licensing as a fraction of true first-year spend; used here to ground the pricing-tell discussion. https://www.glean.com/perspectives/how-to-budget-for-the-total-cost-of-ownership-of-ai-solutions
- Unframe (2024). AI Agent Deployment: 6 Months vs Days. Practitioner reference on why production AI agent deployments typically take seven to twelve months, dominated by data integration rather than model complexity. https://www.unframe.ai/blog/ai-agent-deployment-6-months-vs-days

Frequently asked questions

Is it fair to write off a vendor in the first thirty minutes?

It is fair to use the first thirty minutes to decide whether to invest a further three hours. If a vendor cannot name the model they use, cannot describe an integration challenge they have actually solved, and cannot point to a customer you can call, the cost of going further is high and the expected return is low. You are not blacklisting them, you are deciding who clears the first gate.

What if the vendor's salesperson is non-technical and the engineers are credible?

Ask to speak to one of the engineers before signing anything. Credible firms make this easy because their engineers are comfortable in front of customers and the answers are crisp. If the salesperson refuses, controls the conversation, or substitutes their own answers when you ask technical questions, that itself is a tell. It usually means the engineering bench is thinner than the deck suggests.

How does this relate to a longer due diligence process?

The five tells are a first-conversation filter, not a substitute for due diligence. Once a vendor clears the first thirty minutes, work through the twelve-question diligence framework, take named references, look at the contract, and price the total cost of ownership properly. The filter exists so you do not spend hours doing diligence on vendors who would have failed at the first conversation.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
