What is a vector database? The infrastructure under modern AI search

TL;DR

A vector database is specialised infrastructure that stores numerical representations of meaning, called embeddings, and finds similar items in milliseconds using distance metrics rather than keyword matching. It's the layer under any AI system that reads documents, finds similar cases, or powers a conversational support agent. For UK service SMEs with fewer than a million documents, the pgvector extension inside an existing PostgreSQL database is commonly enough. Dedicated infrastructure earns its keep above that scale or where sub-50ms latency is genuinely required.

Key takeaways

- A vector database stores embeddings, the numerical representations of meaning produced by an AI model, and returns the closest matches in milliseconds. A search for "maternity leave" surfaces parental-leave and family-planning documents because they cluster nearby in mathematical space.
- For UK SMEs running fewer than a million documents on existing PostgreSQL, the pgvector extension is commonly enough. Recent benchmarks put pgvectorscale at 471 queries per second on 50 million vectors, competitive with Pinecone-class infrastructure.
- The 2026 vendor landscape splits into dedicated services (Pinecone, Weaviate, Qdrant, Milvus, Chroma), database extensions (pgvector, MongoDB Atlas, Elasticsearch, Redis), and managed cloud (AWS Bedrock Knowledge Bases, Azure AI Search, Vertex AI Search).
- Embeddings derived from personal data are personal data under UK GDPR. Architectures must support targeted deletion for the right to erasure, document a retention policy, and pseudonymise before embedding where the business purpose allows.
- Vendor lock-in is moderate at the database layer, where export and reload is feasible, and stronger at the embedding-model layer, where switching models forces a re-embed of the entire corpus.

The managing partner of a 45-staff legal services firm has two quotes on the desk for the same problem: 18,000 historical case files, policy documents, and client matters live in a SharePoint site nobody can search. A junior solicitor spends the first 90 minutes of every Monday hunting for similar precedents on whatever matter has just landed.

Vendor A wants £42,000 to deploy a fully managed enterprise vector database with semantic search, plus £1,800 a month in hosting. Vendor B wants £18,000 to add the pgvector extension to the firm’s existing PostgreSQL CRM database and wire up the search interface, plus £200 a month in incremental compute. Both promise the same outcome. The partner is trying to work out whether 18,000 documents need the Vendor A architecture, what the hidden costs are on each side, and whether anything in the GDPR layer makes one safer than the other when the documents contain client personal data.

For most UK SMEs the question isn’t whether you need a vector database. It’s whether you need a dedicated one.

What is a vector database?

A vector database is specialised infrastructure that stores numerical representations of meaning, called embeddings, and searches them by similarity rather than by keyword. It treats text, images, and audio as fixed-length arrays of numbers that capture the conceptual essence of the content. A query for “maternity leave” returns documents about parental leave and family planning because they cluster nearby in mathematical space, even when the exact phrase never appears.

The operational difference matters. A SQL database searches for exact matches with WHERE clauses and joins. A vector database performs similarity searches using distance metrics like cosine similarity, finding items whose vectors sit closest to a query vector in high-dimensional space. That makes it fundamentally different from the SQL Server, Excel workbooks, and document-management systems UK SMEs typically run today, and it’s why semantic search returns conceptually relevant results that keyword search misses entirely.
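The distance metric at the heart of that comparison is simple to show. Here is a minimal sketch of cosine similarity with toy 4-dimensional vectors; real embedding models emit hundreds or thousands of dimensions, and the vectors below are hand-picked illustrations, not real embeddings:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: near 1.0 means same direction (similar meaning)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings": related concepts point in similar directions.
maternity_leave = np.array([0.9, 0.8, 0.1, 0.0])
parental_leave  = np.array([0.85, 0.75, 0.15, 0.05])
vat_return      = np.array([0.05, 0.1, 0.9, 0.8])

print(cosine_similarity(maternity_leave, parental_leave))  # close to 1
print(cosine_similarity(maternity_leave, vat_return))      # much lower
```

A vector database does exactly this comparison, but against millions of stored vectors at once, using approximate indexes so it answers in milliseconds rather than scanning every row.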

Why is your business hearing about vector databases now?

Three trends converged in 2026. Large language models like Claude, GPT-4, and Gemini became reliable enough for production. Generating embeddings dropped to around £0.02 per million tokens at OpenAI's 2026 prices, roughly £1 to embed 100,000 typical documents. Specialist vendors optimised the storage layer for AI workloads. Together, those shifts moved retrieval-augmented generation from research demo to operational architecture for owner-managed firms.
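As a sanity check on that arithmetic, here is the back-of-envelope calculation. The 500-tokens-per-document average is an illustrative assumption (roughly half a page to a page of text), not a figure from any vendor's pricing page:

```python
# Back-of-envelope embedding cost at the per-token price quoted above.
PRICE_PER_MILLION_TOKENS_GBP = 0.02
AVG_TOKENS_PER_DOC = 500  # assumption: a typical short business document

def embedding_cost_gbp(num_docs: int) -> float:
    """One-time cost to embed a corpus of num_docs documents."""
    total_tokens = num_docs * AVG_TOKENS_PER_DOC
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS_GBP

print(f"£{embedding_cost_gbp(100_000):.2f} to embed 100,000 documents")
```

The point of the exercise: at these prices, embedding cost is a rounding error next to the vendor fees and engineer time in the two quotes on the partner's desk.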

The practical effect is straightforward. Any SME deploying an AI customer support agent, a semantic document search, an internal knowledge base, or a recommendation engine in 2026 is using a vector database whether the owner knows it or not. The infrastructure is invisible to the end user. It surfaces as faster precedent retrieval for a solicitor, faster client-history lookup for an accountant, or a help desk that finds the right FAQ answer in seconds rather than minutes.

Where will you actually meet a vector database?

You’ll meet one inside many of the AI applications a UK service business is buying this year. AI help desks vectorise the incoming question and search a database of historical cases and FAQ answers. Semantic document search lets a solicitor ask “show me cases involving similar negligence facts” and get conceptually matched precedents back, not just files containing the same nouns. Recommendation engines surface similar customers, cases, or opportunities for business development teams.

The 2026 vendor landscape splits into three clusters. Dedicated services include Pinecone (£50 to £150 a month at one million vectors, fully managed), Weaviate (open-source, around £100 to £200 a month managed at the same scale, hybrid keyword-plus-vector), Qdrant (Rust-based, sub-10ms latency, £25 to £300 depending on configuration), Milvus (distributed, billion-scale, used by 10,000+ enterprise teams), and Chroma (lightweight prototyping). Database extensions include pgvector inside PostgreSQL (free, £50 to £120 on managed Postgres), MongoDB Atlas Vector Search (from £25), Elasticsearch (£95+, hybrid search), and Redis (sub-millisecond, memory-bound, expensive at scale). Managed cloud services include AWS Bedrock Knowledge Bases, Azure AI Search, and Google Vertex AI Search.

When should you ask for a dedicated vector database?

Three numbers drive the decision. Below 100,000 documents, pgvector or MongoDB Vector Search inside a database you already pay for is almost always sufficient; the overhead of introducing another vendor and another monitoring surface outweighs the marginal performance gains. Between 100,000 and a million documents the choice turns on latency. If 100-200ms response times are acceptable, pgvector still wins. If you need sub-50ms responses or query rates over 1,000 per second, dedicated infrastructure justifies itself.
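Those thresholds can be written down as a rule of thumb in a few lines. The cut-offs come straight from this section; the function name and shape are illustrative, not a vendor-selection tool:

```python
def recommend_vector_store(num_docs: int,
                           needs_sub_50ms: bool = False,
                           queries_per_second: float = 0.0) -> str:
    """Rule-of-thumb mapping of the three decision numbers in this post."""
    if num_docs < 100_000:
        # Another vendor isn't worth the overhead at this scale.
        return "extension (pgvector / MongoDB Vector Search)"
    if num_docs < 1_000_000:
        # Mid-range: the choice turns on latency and query rate.
        if needs_sub_50ms or queries_per_second > 1_000:
            return "dedicated (Pinecone / Qdrant / Weaviate)"
        return "extension (pgvector at 100-200ms)"
    return "dedicated (usually cost-effective at this scale)"

print(recommend_vector_store(18_000))  # the legal firm in the opening example
```

For the partner with 18,000 documents, the rule of thumb lands squarely on the extension route, which is why the scale question belongs at the top of any vendor conversation.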

Above a million documents with consistent query load, dedicated services usually become cost-effective. The case the research keeps citing is a Series A fintech that paid £18,000 a month for Pinecone to serve 45 million vectors, then migrated to self-hosted Qdrant on AWS and dropped to £6,200 with better latency. The pgvector performance picture has shifted in the same window. Recent benchmarks put pgvectorscale at 471 queries per second at 99% recall on 50 million vectors, around 11 times faster than Qdrant on the same dataset and competitive with Pinecone-class infrastructure. For a UK service firm with under 10 million vectors, operational simplicity beats specialist performance often enough that “use what you already pay for” is the working default.

The other layer the owner needs to ask about is regulatory. Embeddings derived from personal data are personal data under UK GDPR, regardless of their numerical form. The ICO position is unambiguous, and the practical effect is three architectural requirements: support targeted deletion so a person’s vectors can be removed on request, document a retention policy you can justify from business purpose, and pseudonymise identifying fields before embedding where business purpose allows. Sensitive sectors face heightened expectations on encryption, data residency, access logging, and role-based controls.
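The first of those requirements, targeted deletion, comes down to one design choice: every stored vector must be indexed under the data subject it derives from. A minimal in-memory sketch of the idea (illustrative only; a production system would do this with a metadata column and a filtered DELETE in pgvector, or the equivalent filtered delete in a dedicated service):

```python
from collections import defaultdict

class ErasableVectorStore:
    """Sketch: key every vector by data-subject ID so a UK GDPR
    erasure request maps to one targeted delete, not a corpus rebuild."""

    def __init__(self):
        # subject_id -> list of (doc_id, vector)
        self._by_subject = defaultdict(list)

    def add(self, subject_id: str, doc_id: str, vector: list[float]) -> None:
        self._by_subject[subject_id].append((doc_id, vector))

    def erase_subject(self, subject_id: str) -> int:
        """Remove every vector derived from one person's data; return count."""
        return len(self._by_subject.pop(subject_id, []))

store = ErasableVectorStore()
store.add("client-042", "matter-001", [0.1, 0.2])
store.add("client-042", "matter-007", [0.3, 0.4])
store.add("client-099", "matter-002", [0.5, 0.6])
print(store.erase_subject("client-042"))  # both of that client's vectors go
```

The question to put to either vendor is exactly this: show me the operation that deletes one person's vectors, and show me the log that proves it ran.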

The upstream concept is the embedding itself. Vector databases store what an embedding model produces, and the two decisions are coupled. Vectors from OpenAI’s 1,536-dimension model aren’t compatible with Sentence-BERT’s 384-dimension model. Switching embedding models forces a re-embed of the entire corpus, around £100 in compute for 10 million documents at 2026 prices plus re-indexing time.
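The incompatibility is mechanical: distance metrics need vectors of the same length, so mixing models of different dimensionality fails immediately. A toy illustration, using random placeholder arrays at the dimensionalities named above rather than real embeddings:

```python
import numpy as np

openai_style = np.random.rand(1536)  # 1,536-dimension model output
sbert_style = np.random.rand(384)    # 384-dimension model output

try:
    # A similarity comparison between the two is not defined.
    np.dot(openai_style, sbert_style)
except ValueError as err:
    print("Incompatible:", err)
```

This is why the database choice and the embedding-model choice travel together: the index is built for one fixed dimensionality, and changing models means re-embedding and re-indexing everything.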

Vector databases are the storage layer under retrieval-augmented generation, the architecture UK service firms commonly use to put their own documents in front of an AI model without retraining it. They’re the alternative to fine-tuning for many knowledge-base use cases, and they sit inside the broader vendor lock-in picture every owner needs to map at procurement. The database layer itself is moderate lock-in, since export and reload is feasible. The embedding-model layer is stronger, since switching forces a re-embed. Plan deliberately, start with a pilot, and scale the infrastructure once business value is demonstrated. The technology works best when it stays invisible to the people using it.

If you’d like to talk through whether your firm needs a dedicated vector database, what to ask the vendors who’ll quote you, and where the GDPR retention duty bites in practice, book a conversation.

Sources

Yugabyte (2024). What is a vector database? The canonical plain-English definition and the source for the meaning-not-keywords framing in this post. https://www.yugabyte.com/blog/what-is-a-vector-database/
ZenML (2025). Vector databases for RAG, the SME-relevant architecture reference behind the help-desk and document-search use cases. https://www.zenml.io/blog/vector-databases-for-rag
LeanOpsTech (2026). Vector database cost comparison 2026, the source for the named pricing across Pinecone, Qdrant, Weaviate, and pgvector at common SME scales. https://leanopstech.com/blog/vector-database-cost-comparison-2026/
Encore (2025). pgvector vs Qdrant, the SME decision reference behind the "use what you already pay for" pgvector recommendation. https://encore.dev/articles/pgvector-vs-qdrant
Actian (2025). The hidden cost of vector database pricing models, the source for the Pinecone-to-Qdrant £18,000 to £6,200 monthly migration case. https://www.actian.com/blog/databases/the-hidden-cost-of-vector-database-pricing-models/
Tetrate (2024). Vector embeddings explained, the reference for the embedding-model coupling and dimensionality incompatibility in the lock-in section. https://tetrate.io/learn/ai/vector-embeddings-explained
ICO (2025). Storage limitation principle under the UK GDPR, the regulatory anchor for the retention-policy guidance. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/data-protection-principles/a-guide-to-the-data-protection-principles/storage-limitation/
ICO (2024). Pseudonymisation guidance under the UK GDPR, the source for stripping identifying fields before embedding to reduce GDPR scope. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/data-sharing/anonymisation/pseudonymisation/
Milvus (2025). How do I choose between Pinecone, Weaviate, Milvus and other vector databases, the multi-vendor decision reference for the 2026 landscape section. https://milvus.io/ai-quick-reference/how-do-i-choose-between-pinecone-weaviate-milvus-and-other-vector-databases
Firecrawl (2025). Best vector databases, the latency-benchmark reference behind the sub-10ms Qdrant claim and the pgvectorscale comparison numbers. https://www.firecrawl.dev/blog/best-vector-databases

Frequently asked questions

How do I know if my business actually needs a dedicated vector database?

Three numbers decide it. Below 100,000 documents, pgvector or MongoDB Vector Search inside a database you already run is almost always enough. Between 100,000 and a million, it depends on latency tolerance, with pgvector still winning at 100-200ms response times. Above a million documents with consistent query load, dedicated services like Pinecone, Qdrant, or Weaviate start to earn their operational overhead and licence cost.

Are embeddings of customer data subject to UK GDPR?

Yes, almost always. The ICO position is that embeddings derived from personal data remain personal data, regardless of their numerical form. That means data minimisation, storage limitation, and the right to erasure all apply. The practical implication is to architect for targeted deletion of a person's vectors on request, document a retention policy you can justify by business purpose, and pseudonymise identifying fields before embedding where it's feasible.

What does a vector database actually cost for a UK SME?

Realistic 2026 pricing for an SME pilot at 100,000 to 1 million vectors is £25 to £200 a month for the database itself, plus a few pounds in one-time embedding compute. Hidden costs typically include reranking at around £2 per 1,000 queries, egress charges, backups, and engineer time. A small production deployment usually lands at £500 to £2,000 a month all in, scaling to £2,000 to £5,000 for systems serving tens of millions of vectors with consistent query load.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30-minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
