Vector databases and embeddings: the foundation for semantic search and RAG
Embeddings translate meaning into vectors, and a vector database searches those vectors by similarity rather than by keyword. Together they form the substrate on which RAG and semantic search run.
Classic search finds what matches literally. Someone searching for "notice period" misses the document that talks about "end of contract", even though both concern the same matter. Semantic search solves exactly this problem by comparing meaning instead of character strings. This page describes how an embedding translates meaning into a vector, how a vector database finds the nearest neighbours of a query, which kinds of database are on offer and what to watch for when choosing. It is the technical layer beneath GenAI and RAG; the general database strategy is covered by Modern Databases.
What an embedding is
An embedding is a numeric vector, a list of typically a few hundred to a few thousand numbers, that encodes the meaning of a text, an image or some other content. A model trained for this, the embedding model, places similar content close together and unrelated content far apart. So "notice period" and "end of contract" sit close together in the vector space, while "notice period" and "garden fence" sit far apart, without any shared keyword.
For vectors from different sources to stay comparable, they must all be produced with the same embedding model; switching the model means recomputing the entire collection. Which model fits best is measurable: benchmarks such as the Massive Text Embedding Benchmark (MTEB) compare models across many tasks and languages. For sovereign operation, what matters is that strong embedding models are also available as open weights and run locally, instead of sending every text to a foreign API.
Semantic search as nearest-neighbour search
Once a collection of content is available as vectors, search becomes geometry. The query is translated into a vector with the same embedding model; the database then looks for the nearest neighbours, the stored vectors that lie closest to the query. Closeness is measured by a metric, in practice usually cosine similarity (where a higher value means closer), alongside Euclidean distance or the inner product.
flowchart TD
A["Text<br/>document or query"] --> B["Embedding model<br/>meaning into numbers"]
B --> C["Vector<br/>a few hundred to a few thousand dimensions"]
C --> D["Vector space<br/>similar items sit close together"]
D --> E["Nearest-neighbour search<br/>smallest distance to the query"]
E --> F["Hits<br/>by meaning, not by keyword"]
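The nearest-neighbour search described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the three-dimensional vectors are made up by hand and merely stand in for what an embedding model would produce, and the function names are chosen for this example.

```python
import math


def dot(a, b):
    """Inner product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """1.0 means same direction, 0.0 means unrelated (orthogonal)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean_distance(a, b):
    """Straight-line distance; smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbours(query, collection, k=2):
    """Exact search: score every stored vector, return the k most similar labels."""
    scored = [(cosine_similarity(query, vec), label) for label, vec in collection.items()]
    scored.sort(reverse=True)
    return [label for _, label in scored[:k]]


# Made-up toy vectors (assumption): real embeddings have hundreds of dimensions.
collection = {
    "end of contract": [0.9, 0.1, 0.2],
    "termination deadline": [0.8, 0.2, 0.1],
    "garden fence": [0.1, 0.9, 0.3],
}
query = [0.85, 0.15, 0.15]  # stands in for the embedding of "notice period"

hits = nearest_neighbours(query, collection, k=2)
print(hits)
```

With these toy values the two contract-related entries come back as hits and "garden fence" does not, although none of the labels shares a keyword with the query.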
With a few thousand vectors, every distance can be computed exactly. With millions of vectors that becomes too slow, and the database switches to approximate nearest-neighbour search. An index such as HNSW, a multi-layer neighbourhood graph, finds very good neighbours very quickly but does not guarantee the exact nearest ones. This trade-off between speed and hit quality is at the core of every vector database: more accuracy costs compute time and memory.
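The trade-off can be made visible with a small experiment. HNSW itself is too involved for a short example, so the sketch below swaps in a simpler partition-based scheme, in the spirit of inverted-file indexes such as IVFFlat: vectors are grouped around centroids, and a query scans only the closest groups. The random data, the crude centroid choice and all names are assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_search(query, vectors, k):
    """Brute force: compare the query with every vector."""
    order = sorted(range(len(vectors)), key=lambda i: squared_distance(query, vectors[i]))
    return order[:k]


def build_partitions(vectors, centroids):
    """Assign each vector index to the partition of its closest centroid."""
    partitions = [[] for _ in centroids]
    for i, vec in enumerate(vectors):
        closest = min(range(len(centroids)), key=lambda c: squared_distance(vec, centroids[c]))
        partitions[closest].append(i)
    return partitions


def approximate_search(query, vectors, centroids, partitions, k, probes):
    """Scan only the `probes` partitions whose centroids lie closest to the query."""
    by_distance = sorted(range(len(centroids)), key=lambda c: squared_distance(query, centroids[c]))
    candidates = [i for c in by_distance[:probes] for i in partitions[c]]
    candidates.sort(key=lambda i: squared_distance(query, vectors[i]))
    return candidates[:k]


rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(2000)]
centroids = rng.sample(vectors, 20)  # crude: random vectors as centroids, no k-means
partitions = build_partitions(vectors, centroids)
query = [rng.random() for _ in range(8)]

exact = exact_search(query, vectors, k=10)
for probes in (1, 5, 20):
    approx = approximate_search(query, vectors, centroids, partitions, k=10, probes=probes)
    recall = len(set(exact) & set(approx)) / len(exact)
    print(f"probes={probes:2d}  recall={recall:.2f}")
```

Scanning all 20 partitions reproduces the exact result at the full cost; scanning fewer is cheaper but may miss true neighbours that fell into a partition that was not probed. Real indexes expose the same dial under other names.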
The spectrum of databases
Vector search is no longer a niche product but a feature that comes in several shapes:
- An extension to an existing database. With pgvector, PostgreSQL itself becomes a vector database. Vectors sit next to the business data in the same database, with the same transactions, backups and access rights. That saves a separate component and keeps related data together.
- A specialised vector database. Systems built solely for vectors bring distributed indexes and very large collections. The gain in scale is paid for with an additional component to operate.
- An embedded or edge variant. Lean libraries that run directly in the application or on the device, with no server of their own. They fit where the collection is small and proximity to the data matters.
These shapes are not mutually exclusive. Many projects start with Postgres and pgvector, because the data already lives there, and only move to a specialised system once collection size or load forces the switch. The general question of which database type fits which job is framed by Modern Databases, and its place within an overall architecture by Data Architecture.
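What the pgvector route looks like in practice can be sketched as follows. The SQL uses the `vector` type, the `hnsw` index method and the `<=>` cosine-distance operator as documented by pgvector; the table name `documents`, its columns and the dimension count are hypothetical. The snippet only prepares statements and parameters; executing them requires a PostgreSQL server with the extension installed and a driver, neither of which is shown here. The `%(name)s` placeholders assume a psycopg-style driver.

```python
# Assumption: the dimension count depends on the embedding model in use.
EMBEDDING_DIMENSIONS = 384

# Hypothetical schema: vectors live next to the business data in one table.
CREATE_SCHEMA = f"""
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS documents (
    id bigserial PRIMARY KEY,
    body text NOT NULL,
    embedding vector({EMBEDDING_DIMENSIONS})
);
CREATE INDEX IF NOT EXISTS documents_embedding_idx
    ON documents USING hnsw (embedding vector_cosine_ops);
"""

# <=> is pgvector's cosine distance operator; smaller means more similar.
NEAREST_NEIGHBOURS = """
SELECT id, body
FROM documents
ORDER BY embedding <=> %(query)s::vector
LIMIT %(k)s;
"""


def to_vector_literal(values):
    """Format a list of numbers in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


# Parameters for one query; the vector would come from the embedding model.
params = {"query": to_vector_literal([0.1, 0.2, 0.3]), "k": 5}
print(params["query"])
```

Because the table is an ordinary Postgres table, the same transactions, backups and access rights apply to the vectors as to the rest of the data.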
Selection criteria
Four questions decide which shape fits:
- Collection size and load. Postgres extensions suit many collections with a six-figure number of vectors, depending on dimensions, index, RAM, filters and load; hundreds of millions of vectors with a high query rate argue for a specialised, distributed system.
- Hybrid search. Pure vector search misses exact hits such as an order number or a proper name. A hybrid search, combining vector similarity with classic keyword search and metadata filters, delivers more robust results in practice; not every database handles both equally well.
- Operating effort. Extending an existing database is less effort than running, securing and monitoring a new service. These ongoing costs belong in the calculation, not just the raw feature set.
- Data residency. Where the vectors and the underlying documents physically sit is a compliance question in Switzerland. pgvector on an in-house or Swiss Postgres keeps the entire substrate under the organisation's own control, with no outflow to a foreign cloud.
It is this last criterion in particular that ties the technical choice to the strategic one. A local vector infrastructure is the basis of the Sovereign RAG Switzerland competency; whether a concrete use case holds up is settled by a scoped Enterprise RAG Proof of Concept, before investing in scale.
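The hybrid search named among the selection criteria needs a way to merge two result lists. One common approach, used here as an assumption and not prescribed by any particular database, is reciprocal rank fusion: each document scores 1 / (k + rank) in every list it appears in. The document ids, the metadata and the helper names below are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of document ids; each list contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id to keep the order stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def filter_by_metadata(ranking, metadata, **required):
    """Keep only documents whose metadata matches every required key/value pair."""
    return [
        doc_id for doc_id in ranking
        if all(metadata[doc_id].get(key) == value for key, value in required.items())
    ]


# Hypothetical results of the two searches, best hit first.
vector_hits = ["doc-contract", "doc-termination", "doc-garden"]
keyword_hits = ["doc-order-4711", "doc-contract"]
metadata = {
    "doc-contract": {"department": "hr"},
    "doc-termination": {"department": "hr"},
    "doc-order-4711": {"department": "hr"},
    "doc-garden": {"department": "facilities"},
}

fused = reciprocal_rank_fusion([
    filter_by_metadata(vector_hits, metadata, department="hr"),
    filter_by_metadata(keyword_hits, metadata, department="hr"),
])
print(fused)
```

The document found by both searches ranks first, and the exact keyword hit on the order number, which vector similarity alone did not surface, still makes it into the result.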
Where it breaks
- The wrong expectation of accuracy. Embeddings capture meaning, not truth. Two sentences can sit close semantically yet contradict each other in substance. Vector search returns the most plausible candidates, not the verified answer; the verification is the job of the RAG layer above it, with source attribution.
- A stale collection after a model change. If the embedding model is swapped without recomputing the collection, the search compares vectors from two incompatible spaces. The hits look random, without any visible error in the code.
- Vectors only, no filters. Relying solely on vector similarity loses the exact hits and the access boundaries. Metadata and permissions belong in the same query, otherwise the search also returns what a user is not allowed to see.
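Two of these failure modes can be guarded against in the search path itself. The sketch below, with a hypothetical record layout and made-up two-dimensional vectors, stores the embedding model's identifier and the permitted groups alongside each vector; the search refuses to compare vectors from a different model and applies the permission filter before ranking.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredVector:
    doc_id: str
    vector: tuple
    model: str                 # identifier of the embedding model that produced it
    allowed_groups: frozenset  # groups permitted to see the document


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, query_model, user_groups, collection, k=3):
    """Rank only documents the user may see; fail loudly on a model mismatch."""
    hits = []
    for item in collection:
        if item.model != query_model:
            # Vectors from different models live in incompatible spaces.
            raise ValueError(f"{item.doc_id} was embedded with {item.model}, not {query_model}")
        if not (item.allowed_groups & user_groups):
            continue  # permission filter is part of the same query
        hits.append((cosine_similarity(query_vector, item.vector), item.doc_id))
    hits.sort(reverse=True)
    return [doc_id for _, doc_id in hits[:k]]


# Hypothetical collection with made-up vectors and model names.
collection = [
    StoredVector("handbook", (0.9, 0.1), "model-a", frozenset({"staff", "hr"})),
    StoredVector("salary-list", (0.8, 0.2), "model-a", frozenset({"hr"})),
]

staff_hits = search((1.0, 0.0), "model-a", {"staff"}, collection)
hr_hits = search((1.0, 0.0), "model-a", {"hr"}, collection)
print(staff_hits, hr_hits)
```

Raising an error on a model mismatch turns the otherwise silent failure into a visible one. In a database the same two checks would typically be a stored model column and a filter clause in the query.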
The substrate under in-house control
Embeddings and vector databases are the point where data ownership becomes technically concrete. Every text that goes to a foreign API to compute an embedding leaves the house, and that holds for the search query just as much as for the indexed collection. An open embedding model on in-house hardware and pgvector on an in-house Postgres keep both paths in house. An architectural decision thus becomes a data-protection decision: the substrate beneath the semantic search stays under the organisation's own control instead of running through a foreign cloud. How this layer turns into a verifiable answer is described by GenAI and RAG; who decides which model may run on which data is settled by AI governance. For Swiss organisations this is also the sovereignty question, since neither the query nor the collection then leaves the country.
References
- pgvector: Open-source vector search for PostgreSQL. PostgreSQL extension for vector similarity with HNSW and IVFFlat indexes and several distance metrics. (2026). github.com/pgvector/pgvector
- Hugging Face: Massive Text Embedding Benchmark (MTEB). Benchmark for embedding models across many tasks and languages. (19.10.2022). huggingface.co/blog/mteb
- Pinecone: Vector Database, an introduction. Learning resource on the structure and workings of vector databases and nearest-neighbour search. (2026). www.pinecone.io/learn/vector-database/
- Malkov, Yashunin: Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. The original paper on the HNSW index behind fast approximate vector search. (30.03.2016). arxiv.org/abs/1603.09320
Related topics
- GenAI and RAG, the architecture that builds on this substrate.
- Modern Databases, the general database strategy and polyglot persistence.
- PostgreSQL, the relational database that becomes a vector database with pgvector.
- Data Architecture, its place within an overall architecture.
- AI Governance, the control over which data flows into which model.
- Sovereign RAG Switzerland, the commercial service counterpart.