The managing partner of a 45-staff legal services firm has two quotes on the desk for the same problem. 18,000 historical case files, policy documents, and client matters live in a SharePoint site nobody can search. A junior solicitor spends the first 90 minutes of every Monday hunting for precedents similar to whatever matter has just landed.
Vendor A wants £42,000 to deploy a fully managed enterprise vector database with semantic search, plus £1,800 a month in hosting. Vendor B wants £18,000 to add the pgvector extension to the firm’s existing PostgreSQL CRM database and wire up the search interface, plus £200 a month in incremental compute. Both promise the same outcome. The partner is trying to work out whether 18,000 documents need the Vendor A architecture, what the hidden costs are on each side, and whether anything in the GDPR layer makes one safer than the other when the documents contain client personal data.
For most UK SMEs the question isn’t whether you need a vector database. It’s whether you need a dedicated one.
What is a vector database?
A vector database is specialised infrastructure that stores numerical representations of meaning, called embeddings, and searches them by similarity rather than by keyword. It treats text, images, and audio as fixed-length arrays of numbers that capture the conceptual essence of the content. A query for “maternity leave” returns documents about parental leave and family planning because they cluster nearby in mathematical space, even when the exact phrase never appears.
The operational difference matters. A SQL database searches for exact matches with WHERE clauses and joins. A vector database performs similarity searches using distance metrics like cosine similarity, finding items whose vectors sit closest to a query vector in high-dimensional space. That makes it fundamentally different from the SQL Server, Excel workbooks, and document-management systems UK SMEs typically run today, and it’s why semantic search returns conceptually relevant results that keyword search misses entirely.
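The distance-metric idea can be made concrete in a few lines of plain Python. This is an illustrative sketch, not any vendor’s API: the three-dimensional vectors are invented for the example, whereas real embedding models emit hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: the dot product divided by the product of the
    # vector magnitudes. Near 1.0 means the same direction (similar
    # meaning); near 0.0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (invented values for illustration).
maternity_leave = [0.90, 0.80, 0.10]
parental_leave  = [0.85, 0.75, 0.15]   # conceptually close phrase
vat_return      = [0.10, 0.05, 0.90]   # conceptually distant phrase

print(cosine_similarity(maternity_leave, parental_leave))  # high, near 1.0
print(cosine_similarity(maternity_leave, vat_return))      # low
```

A keyword search sees no overlap between “maternity leave” and “parental leave”; the similarity score above is what lets a vector database rank them as neighbours anyway.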
Why is your business hearing about vector databases now?
Three trends converged by 2026. Large language models like Claude, GPT-4, and Gemini became reliable enough for production. Generating embeddings dropped to around £0.02 per million tokens at OpenAI’s 2026 prices, roughly £1 to embed 100,000 typical documents. Specialist vendors optimised the storage layer for AI workloads. Together those shifts moved retrieval-augmented generation from research demo to operational architecture for owner-managed firms.
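The arithmetic behind that £1 figure is worth seeing written down. The sketch below assumes an average of 500 tokens per document, which is the assumption needed to reconcile the per-token price with the per-corpus figure quoted above; real documents vary widely.

```python
def embedding_cost_gbp(num_documents, avg_tokens_per_doc=500,
                       price_per_million_tokens=0.02):
    # Cost = total tokens / 1,000,000 * price per million tokens.
    # 500 tokens/doc is an assumption; £0.02/M tokens is the 2026
    # price quoted in the text.
    total_tokens = num_documents * avg_tokens_per_doc
    return total_tokens / 1_000_000 * price_per_million_tokens

print(embedding_cost_gbp(100_000))   # roughly £1 for 100,000 documents
print(embedding_cost_gbp(18_000))    # the law firm's case-file corpus
```

At these prices the embedding step is a rounding error next to the integration work, which is why the vendor quotes differ on architecture, not on token costs.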
The practical effect is straightforward. Any SME deploying an AI customer support agent, a semantic document search, an internal knowledge base, or a recommendation engine in 2026 is using a vector database whether the owner knows it or not. The infrastructure is invisible to the end user. It surfaces as faster precedent retrieval for a solicitor, faster client-history lookup for an accountant, or a help desk that finds the right FAQ answer in seconds rather than minutes.
Where will you actually meet a vector database?
You’ll meet one inside many of the AI applications a UK service business is buying this year. AI help desks vectorise the incoming question and search a database of historical cases and FAQ answers. Semantic document search lets a solicitor ask “show me cases involving similar negligence facts” and get conceptually matched precedents back, not just files containing the same nouns. Recommendation engines surface similar customers, cases, or opportunities for business development teams.
The 2026 vendor landscape splits into three clusters. Dedicated services include Pinecone (£50 to £150 a month at one million vectors, fully managed), Weaviate (open-source, around £100 to £200 a month managed at the same scale, hybrid keyword-plus-vector), Qdrant (Rust-based, sub-10ms latency, £25 to £300 depending on configuration), Milvus (distributed, billion-scale, used by 10,000+ enterprise teams), and Chroma (lightweight prototyping). Database extensions include pgvector inside PostgreSQL (free, £50 to £120 on managed Postgres), MongoDB Atlas Vector Search (from £25), Elasticsearch (£95+, hybrid search), and Redis (sub-millisecond, memory-bound, expensive at scale). Managed cloud services include AWS Bedrock Knowledge Bases, Azure AI Search, and Google Vertex AI Search.
When should you ask for a dedicated vector database?
Three numbers run the decision. Below 100,000 documents, pgvector or MongoDB Vector Search inside a database you already pay for is almost always sufficient. The overhead of introducing another vendor and monitoring surface outweighs marginal performance gains. Between 100,000 and a million documents the choice turns on latency. At 100-200ms response times, pgvector still wins. If you need sub-50ms or query rates over 1,000 per second, dedicated infrastructure justifies itself.
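Those three thresholds can be written down as a rough rule of thumb. The function below is a simplification of the guidance in this section with invented names, not a vendor-selection tool; the cut-offs are the ones stated above.

```python
def suggest_architecture(num_documents, latency_target_ms=200,
                         queries_per_second=10):
    """Rough rule of thumb encoding the thresholds discussed above."""
    if num_documents < 100_000:
        # Use the database you already pay for.
        return "extension"       # e.g. pgvector, MongoDB Vector Search
    if num_documents <= 1_000_000:
        # Mid-range: the choice turns on latency and query rate.
        if latency_target_ms < 50 or queries_per_second > 1_000:
            return "dedicated"
        return "extension"
    # Million-plus documents with consistent load.
    return "dedicated"

print(suggest_architecture(18_000))                        # the law firm
print(suggest_architecture(500_000, latency_target_ms=30)) # latency-bound
```

Run against the opening scenario, 18,000 documents lands firmly in “extension” territory, which is the architecture in Vendor B’s quote.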
Above a million documents with consistent query load, dedicated services usually become cost-effective. A frequently cited case is a Series A fintech that paid £18,000 a month for Pinecone to serve 45 million vectors, then migrated to self-hosted Qdrant on AWS and dropped to £6,200 with better latency. The pgvector performance picture has shifted in the same window. Recent benchmarks put pgvectorscale at 471 queries per second at 99% recall on 50 million vectors, around 11 times faster than Qdrant on the same dataset and competitive with Pinecone-class infrastructure. For a UK service firm with under 10 million vectors, operational simplicity beats specialist performance often enough that “use what you already pay for” is the working default.
The other layer the owner needs to ask about is regulatory. Embeddings derived from personal data are personal data under UK GDPR, regardless of their numerical form. The ICO position is unambiguous, and the practical effect is three architectural requirements: support targeted deletion so a person’s vectors can be removed on request, document a retention policy you can justify from business purpose, and pseudonymise identifying fields before embedding where business purpose allows. Sensitive sectors face heightened expectations on encryption, data residency, access logging, and role-based controls.
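What “targeted deletion” means at the storage layer can be sketched with an invented in-memory store keyed by a data-subject identifier. Real deployments would use the database’s own mechanism, an ordinary SQL DELETE with pgvector, or the delete-by-filter operations dedicated services typically expose, but the shape of the requirement is the same: every vector must be traceable back to the person it was derived from.

```python
class VectorStore:
    """Toy in-memory store illustrating erasure by data subject.

    Each record carries a subject_id so an erasure request under UK
    GDPR can remove every vector derived from that person's data.
    """
    def __init__(self):
        self.records = []   # list of (subject_id, vector, metadata)

    def add(self, subject_id, vector, metadata=None):
        self.records.append((subject_id, vector, metadata or {}))

    def erase_subject(self, subject_id):
        # Targeted deletion: drop every vector linked to one person.
        before = len(self.records)
        self.records = [r for r in self.records if r[0] != subject_id]
        return before - len(self.records)   # number of vectors removed

store = VectorStore()
store.add("client-042", [0.1, 0.2], {"doc": "engagement letter"})
store.add("client-042", [0.3, 0.1], {"doc": "matter notes"})
store.add("client-077", [0.9, 0.4], {"doc": "invoice"})
print(store.erase_subject("client-042"))   # 2 vectors removed
print(len(store.records))                  # 1 record left
```

The procurement question this raises is simple: ask each vendor to demonstrate deleting one named client’s vectors, end to end, before signing.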
Related concepts to know alongside vector databases
The upstream concept is the embedding itself. Vector databases store what an embedding model produces, and the two decisions are coupled. Vectors from OpenAI’s 1,536-dimension model aren’t compatible with Sentence-BERT’s 384-dimension model. Switching embedding models forces a re-embed of the entire corpus, around £100 in compute for 10 million documents at 2026 prices plus re-indexing time.
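The coupling shows up concretely as a dimension mismatch. The sketch below is illustrative: the dimension counts are the ones cited above, while the dictionary keys and the guard function are invented for this example, not any library’s API.

```python
# Illustrative labels for the two models cited above. The dimension
# counts come from the text; the key names are invented for this sketch.
MODEL_DIMENSIONS = {
    "openai-style-1536": 1536,
    "sentence-bert-style-384": 384,
}

def check_dimension(vector, expected_dim):
    # Similarity between vectors of different lengths is undefined, so
    # a 384-dimension vector can never be queried against a
    # 1,536-dimension index. Switching embedding models therefore
    # forces a re-embed of the entire corpus.
    if len(vector) != expected_dim:
        raise ValueError(
            f"vector has {len(vector)} dimensions but the index expects "
            f"{expected_dim}; re-embed the corpus before querying"
        )
    return True

check_dimension([0.0] * 384, MODEL_DIMENSIONS["sentence-bert-style-384"])
```

This is why the embedding-model choice belongs in the procurement conversation, not just the database choice: the database can be swapped, the model switch triggers the re-embed bill.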
Vector databases are the storage layer under retrieval-augmented generation, the architecture UK service firms commonly use to put their own documents in front of an AI model without retraining it. They’re the alternative to fine-tuning for many knowledge-base use cases, and they sit inside the broader vendor lock-in picture every owner needs to map at procurement. The database layer itself is moderate lock-in, since export and reload is feasible. The embedding-model layer is stronger, since switching forces a re-embed. Plan deliberately, start with a pilot, and scale the infrastructure once business value is demonstrated. The technology works best when it stays invisible to the people using it.
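The retrieve-then-augment flow that sits on top of the database can be sketched end to end. Everything here is a toy: invented two-dimensional vectors stand in for real embeddings, and the prompt template is illustrative rather than any framework’s format.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def retrieve(query_vec, corpus, k=2):
    # Retrieval step of RAG: rank stored chunks by similarity to the
    # query vector and keep the top k.
    ranked = sorted(corpus, key=lambda c: cosine(query_vec, c["vec"]),
                    reverse=True)
    return ranked[:k]

def build_prompt(question, chunks):
    # Augmentation step: prepend the retrieved text so the model
    # answers from the firm's own documents, not its training data.
    context = "\n".join(c["text"] for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Toy corpus with invented 2-d vectors standing in for embeddings.
corpus = [
    {"text": "Precedent: negligence claim, 2019", "vec": [0.9, 0.1]},
    {"text": "Office parking policy",             "vec": [0.1, 0.9]},
    {"text": "Precedent: negligence claim, 2021", "vec": [0.8, 0.2]},
]
top = retrieve([0.95, 0.05], corpus, k=2)
print([c["text"] for c in top])   # the two negligence precedents
```

Swapping the toy list for pgvector or Pinecone changes only the retrieve step; that is why the database layer is the moderate-lock-in part of the stack.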
If you’d like to talk through whether your firm needs a dedicated vector database, what to ask the vendors who’ll quote you, and where the GDPR retention duty bites in practice, book a conversation.



