Stores and indexes the numeric representations (embeddings) of your document chunks. When a query arrives, the database finds the most semantically similar chunks using approximate nearest-neighbor (ANN) search — a fast algorithm that finds close matches without comparing against every record. Many also support hybrid search, which blends traditional keyword matching (BM25) with semantic similarity for better recall. The choice of vector database determines retrieval speed, scale, and how precisely you can filter results.
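The fusion step in hybrid search has to combine two ranked lists whose scores live on different scales (BM25 scores vs. cosine similarities). Reciprocal rank fusion (RRF), supported by several of the engines below, sidesteps score normalization by combining ranks instead. A minimal pure-Python sketch (document IDs are invented; k=60 is the constant from the original RRF paper):

```python
def rrf(rankings, k=60):
    """Reciprocal rank fusion: each list contributes 1/(k + rank) per doc.

    `rankings` is a list of ranked ID lists, e.g. one from BM25 and one
    from ANN search. Higher fused score = better.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["d2", "d1", "d3"]   # BM25 order
semantic_hits = ["d1", "d3", "d2"]  # ANN order
fused = rrf([keyword_hits, semantic_hits])
```

Weighted score blending is the main alternative; RRF's appeal is that it needs no per-engine score calibration.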
🔎
OpenSearch
Open Source · Apache 2.0 · AWS
AWS-backed Elasticsearch fork with a k-NN plugin supporting Faiss and Lucene backends (NMSLIB was deprecated in 2.19 and removed in 3.0). The de facto choice for teams needing Apache 2.0 licensing with mature hybrid BM25 + ANN search.
HNSW / IVF / Lucene
Hybrid Search
Apache 2.0
Pros
- True Apache 2.0 — no license surprises for managed deployments
- Mature BM25 + ANN hybrid out of the box
- Multiple ANN backends selectable per index
- Strong AWS integration (OpenSearch Serverless)
Cons
- JVM tuning required for production stability
- ANN throughput trails purpose-built vector DBs
- Operationally complex: sharding, JVM heap, GC pauses
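As a sketch of how the per-index backend selection works, here are hypothetical request bodies (the index and field names are invented) for creating a k-NN-enabled index and running an ANN query; in practice you would send these as JSON via the REST API or opensearch-py:

```python
# Sketch of OpenSearch k-NN request bodies; "docs"/"embedding" are
# placeholder names, and 384 is an assumed embedding dimension.
index_body = {
    "settings": {"index": {"knn": True}},  # enable the k-NN plugin per index
    "mappings": {
        "properties": {
            "text": {"type": "text"},  # BM25-searchable field
            "embedding": {
                "type": "knn_vector",
                "dimension": 384,
                "method": {  # the ANN backend is selected here
                    "name": "hnsw",
                    "engine": "faiss",
                    "space_type": "l2",
                },
            },
        }
    },
}

knn_query = {
    "size": 10,
    "query": {"knn": {"embedding": {"vector": [0.1] * 384, "k": 10}}},
}
```

Swapping `"engine": "faiss"` for `"lucene"` (or the method name for an IVF variant under Faiss) is how the backend choice is made per index.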
🪣
Elasticsearch
SSPL + AGPLv3 + ELv2 · Elastic
The original distributed search engine, now with dense-vector kNN and ELSER sparse semantic search. The largest ecosystem of any search technology and the default platform for combined log analytics and search.
kNN + ELSER
Petabyte Scale
Kibana
Pros
- Massive ecosystem: connectors, tooling, talent pool
- Best-in-class full-text BM25 + vector hybrid quality
- Mature ops: ILM, snapshots, Kibana dashboards
- Proven at petabyte scale across industries
Cons
- ELv2 and SSPL restrict competing managed service offerings
- Memory-heavy; overkill for vector-only workloads
- Dense vector ANN performance trails purpose-built DBs
🚀
Milvus
Open Source · Apache 2.0 · LF AI & Data
Purpose-built distributed vector database. Multiple index types (IVF_FLAT, HNSW, DiskANN, SCANN). Cloud-native K8s deployment. Managed option via Zilliz Cloud. Graduated project under LF AI & Data Foundation.
HNSW / IVF / DiskANN
K8s Native
Multi-vector / Sparse
Pros
- Top ANN performance at billion-vector scale
- Multiple index algorithms selectable per collection
- LF AI & Data graduated project — strong governance and cloud-native K8s model
- Native sparse + dense hybrid (BM25 built-in)
Cons
- Distributed stack complexity (etcd, MinIO, Pulsar/Kafka)
- Younger ecosystem than Elasticsearch
- Full-text BM25 support added late; less mature
🎨
ChromaDB
Open Source · Apache 2.0
Developer-friendly Python-native vector store. Runs in-process for notebooks or as a lightweight server. Default HNSW backend via hnswlib, with simple metadata filtering. Built for fast prototyping.
In-process / Server
HNSW (hnswlib)
Python-first
Pros
- Zero-config setup: pip install chromadb
- In-process mode — no server for notebooks and dev
- First-class LangChain and LlamaIndex integration
- Simple dict-based metadata filtering
Cons
- Not designed for billion-vector production scale
- Single-node server; no native horizontal scaling
- No built-in auth or multi-tenancy in OSS version
🌲
Pinecone
Proprietary SaaS · Serverless
Fully managed, serverless vector database with auto-scaling indexes. Supports sparse + dense hybrid search with namespace-based multi-tenancy. BYOC option (GA 2024) deploys into your own AWS or GCP account for data sovereignty.
Serverless
Sparse + Dense Hybrid
Namespaces
Pros
- Completely zero-ops — auto-scaling, no infra
- Native sparse + dense hybrid in a single query
- Namespace multi-tenancy built-in
- Fastest time-to-production of any vector DB
Cons
- BYOC trades zero-ops simplicity for data sovereignty: running in your own AWS/GCP account adds operational complexity back
- Cost unpredictable at high query volumes
- Vendor lock-in; no standard export format
🕸️
Weaviate
OSS + Commercial · BSD 3-Clause
GraphQL-first vector DB with a pluggable module system (text2vec-openai, text2vec-cohere, reranker-cohere). Native BM25 hybrid and multi-tenancy with per-tenant data isolation.
GraphQL API
Vectorizer Modules
Multi-tenancy
Pros
- Built-in vectorizer modules (OpenAI, Cohere, HuggingFace)
- Production multi-tenancy with strict per-tenant isolation
- Hybrid BM25 + vector search with re-ranking support
- Weaviate Cloud managed option available
Cons
- GraphQL API has a steeper learning curve than REST or SQL
- Schema definition required upfront (less flexible)
- Go-based; smaller community than ES/OS
🎯
Qdrant
Open Source · Apache 2.0 · Rust
High-performance Rust-based vector search with rich JSON payload filtering, quantization (int8, binary, product), and sparse vector support. Low memory footprint, low latency. Self-hosted or Qdrant Cloud.
Rust Native
Quantization
Payload Filtering
Pros
- Rust performance — very low latency and memory overhead
- Quantization reduces memory 4–32× (scalar, product, binary)
- Advanced filtering on arbitrary JSON payload fields
- Native sparse + dense hybrid with BM25 built-in
Cons
- Smaller community and fewer enterprise integrations
- Multi-tenancy story less mature than Weaviate's
- Tooling and managed-cloud offering less mature than Milvus's
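The quantization savings quoted above come from shrinking each vector component. A pure-Python sketch of the idea behind 8-bit scalar quantization (Qdrant's actual implementation differs in detail, e.g. it can keep the original vectors on disk for rescoring):

```python
def quantize_8bit(vector):
    """Map each float32 component onto a single byte (0..255)."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 if hi > lo else 1.0
    codes = [round((x - lo) / scale) for x in vector]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Recover an approximation of the original vector."""
    return [lo + c * scale for c in codes]

vec = [0.12, -0.5, 0.33, 0.9]
codes, lo, scale = quantize_8bit(vec)
approx = dequantize(codes, lo, scale)
# 4 bytes per float32 component vs. 1 byte per code: the 4x end of the
# 4-32x range; binary quantization (1 bit per component) is the 32x end.
```

The cost is a small reconstruction error per component, which is why rescoring the top candidates against full-precision vectors is common.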
🐘
pgvector
Open Source · PostgreSQL Extension
PostgreSQL extension adding vector similarity search (exact cosine/L2 scans, plus HNSW and IVFFlat ANN indexes). Store embeddings beside relational data in the same ACID-compliant Postgres instance — no separate service required.
HNSW / IVFFlat
ACID Transactions
SQL Native
Pros
- No new infrastructure — works inside existing Postgres
- ACID transactions: vector + relational data in one query
- Standard SQL JOINs and filters at no extra cost
- Supported by all major Postgres cloud providers
Cons
- ANN recall and throughput trail dedicated vector DBs at scale
- HNSW index build is slow for large datasets
- Not suitable for billion-vector or high-QPS workloads
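pgvector's cosine-distance operator `<=>` returns 1 minus cosine similarity. A pure-Python sketch of what an exact (indexless) `ORDER BY embedding <=> query` scan computes (the table contents and query vector are invented):

```python
from math import sqrt

def cosine_distance(a, b):
    """What pgvector's <=> operator returns: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return 1 - dot / norm

# Stand-in for: SELECT id, embedding FROM items;
rows = {"doc1": [1.0, 0.0], "doc2": [0.0, 1.0], "doc3": [0.7, 0.7]}
query = [1.0, 0.0]

# Equivalent of: SELECT id FROM items ORDER BY embedding <=> $1 LIMIT 2;
top2 = sorted(rows, key=lambda rid: cosine_distance(rows[rid], query))[:2]
```

Adding an index with `CREATE INDEX ... USING hnsw (embedding vector_cosine_ops)` turns this exact scan into approximate search, trading recall for the throughput mentioned above.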
🪣
Amazon S3 Vectors
Managed · AWS · Proprietary · GA Dec 2025
Native vector storage built directly into S3 — no separate cluster. Uses "Vector Buckets" containing up to 10,000 indexes, each holding up to 2B float32 vectors. Optimized for cost-efficient batch and archival RAG workloads, not real-time search.
2B vectors / index
Cosine / Euclidean
~100ms warm / <1s cold
Pros
- Native S3 — inherits 11-nines durability, no separate infra
- 4.2× cheaper than OpenSearch for bulk vector storage (per AWS)
- Pay-per-query, no provisioned capacity overhead
- Native Bedrock Knowledge Bases integration
Cons
- 100–800ms latency — unsuitable for real-time / interactive search
- float32 only — no quantization, no binary embeddings
- Only cosine and Euclidean metrics; no hybrid text+vector search
- AWS-only; no self-hosting or export to other vector stores