Vector Databases 101: How Embeddings Power Modern Search
Estimated reading time: 6 minutes
Key Takeaways
- Embeddings transform data like text or images into numerical representations, allowing computers to understand semantic relationships.
- Vector databases specialize in storing and querying these embeddings for semantic search, moving beyond traditional keyword matching.
- Approximate Nearest Neighbor (ANN) algorithms like HNSW and IVF enable fast similarity search over large datasets.
- FAISS is a powerful library for high-performance vector search in research or local development.
- Milvus is an open-source vector database built for production-scale vector search with features like storage, scaling, and metadata handling.
Table of contents
- Embeddings Primer: Turning Meaning into Numbers
- From Keyword to Semantic Search
- What Makes a Vector Database Different
- ANN Fundamentals: Making Vector Search Fast
- FAISS Deep Dive: The Vector Search Library
- Milvus Deep Dive: The Vector Database
- Decision Guide: FAISS vs Milvus
- Building a Semantic Search System: Step-by-Step
- Evaluation and Tuning Vector Search
- Scaling and Operations
- Common Pitfalls and Best Practices
- Frequently Asked Questions
Imagine searching a medical database for “physician” but getting no results because all entries are labeled “doctor.” This common failure of traditional search happens daily across applications. While humans understand these terms are related, computers traditionally don’t—unless they use vector databases.
Embeddings Primer: Turning Meaning into Numbers
Embeddings form the foundation of modern search technology. These are numerical representations where the position of, and distance between, vectors in high-dimensional space reflect the semantic relationships between the original items.
When you hear "embeddings," think meaning as math. AI models transform raw data into these numerical representations:
- Text embeddings: Models like BERT or OpenAI’s text-embedding models convert words and sentences into vectors where similar concepts cluster together
- Image embeddings: Vision models map visually similar images to nearby vector positions
- Audio embeddings: Sound patterns get converted into vectors that capture acoustic similarity
These vectors typically contain hundreds of values (384, 768, or 1024 dimensions are common) and are often normalized to unit vectors. The magic happens when measuring distances between them using metrics like:
- Cosine similarity: Measures the angle between vectors (closer to 1 means more similar)
- Euclidean distance: Calculates straight-line distance in vector space
- Dot product: For normalized vectors, equivalent to cosine similarity
This mathematical foundation lets computers understand that “physician” and “doctor” are closely related even without exact matching.
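A small sketch makes the relationship between these metrics concrete. The vectors below are made-up toy values, not output from a real embedding model; the point is that once vectors are normalized to unit length, the dot product and cosine similarity coincide:

```python
import numpy as np

# Toy embedding vectors (illustrative values, not from a real model)
doctor = np.array([0.8, 0.6, 0.1])
physician = np.array([0.7, 0.7, 0.2])

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors, in [-1, 1]
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Normalize to unit length; the dot product then equals cosine similarity
a = doctor / np.linalg.norm(doctor)
b = physician / np.linalg.norm(physician)

print(cosine_similarity(doctor, physician))  # close to 1: semantically similar
print(np.dot(a, b))                          # identical value for unit vectors
```

This equivalence is why many systems normalize embeddings up front: it lets them use the cheaper dot product while still measuring cosine similarity.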
From Keyword to Semantic Search
Traditional keyword search relies on exact term matching, while semantic and hybrid approaches go further:
| Search Type | How It Works | Strengths | Limitations | Good For |
|---|---|---|---|---|
| Keyword | Matches exact terms using algorithms like BM25 | Fast, simple, proven | Misses synonyms and context | Finding exact terms in logs |
| Semantic | Matches by meaning using vector similarity | Finds related concepts | More computational work | Understanding user intent |
| Hybrid | Combines both approaches | Better results than either alone | More complex to implement | E-commerce, customer support |
The similarity search process is straightforward: convert a query to a vector, find vectors close to it in the database, and return their associated content.
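That three-step process can be sketched with plain NumPy and a brute-force scan. The document texts and vector values here are invented for illustration; in a real system the vectors would come from an embedding model:

```python
import numpy as np

# Toy "database": documents and their (made-up) embedding vectors
docs = [
    "the doctor saw a patient",
    "stock prices fell",
    "the physician prescribed rest",
]
vectors = np.array([[0.9, 0.4, 0.1], [0.1, 0.2, 0.97], [0.85, 0.5, 0.15]])
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)  # unit-normalize

def search(query_vec, k=2):
    query_vec = query_vec / np.linalg.norm(query_vec)
    scores = vectors @ query_vec           # cosine similarity for unit vectors
    top = np.argsort(-scores)[:k]          # indices of the k closest documents
    return [(docs[i], float(scores[i])) for i in top]

# A query vector standing in for the embedding of "physician"
results = search(np.array([0.88, 0.45, 0.12]))
```

Both medical sentences outrank the finance one, even though neither contains the query word — exactly the behavior keyword search misses.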
What Makes a Vector Database Different
Vector databases differ from traditional databases by specializing in high-dimensional vector operations:
- Fast vector similarity calculations using specialized indexes
- Storage of both vectors and associated metadata
- Filtering capabilities based on metadata
- Collection and partition management for data organization
- Optimized search APIs for vector queries
Unlike regular databases that excel at exact matching or range queries, vector databases find things that are similar in ways that match human perception.
ANN Fundamentals: Making Vector Search Fast
Searching through millions of vectors using brute force comparison would be painfully slow. Approximate Nearest Neighbor (ANN) algorithms solve this by trading perfect accuracy for dramatic speed improvements.
Popular ANN approaches include:
- HNSW (Hierarchical Navigable Small World): Creates a multi-layer graph structure for efficient navigation
- IVF (Inverted File): Clusters vectors and searches only the most promising clusters
- Product Quantization (PQ): Compresses vectors to reduce memory footprint
These methods balance a three-way tradeoff between recall (accuracy), latency (speed), and cost (resources required).
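The IVF idea in particular is easy to demonstrate in miniature. The sketch below (random 2-D data, a handful of centroids) is a simplified illustration of the clustering-and-probing principle, not the FAISS implementation: vectors are bucketed by nearest centroid, and a query scans only the `nprobe` most promising buckets:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 random 2-D vectors and 8 centroids sampled from the data (toy IVF)
data = rng.normal(size=(1000, 2))
n_clusters = 8
centroids = data[rng.choice(len(data), n_clusters, replace=False)]
assignments = np.argmin(
    np.linalg.norm(data[:, None] - centroids[None, :], axis=2), axis=1
)

def ivf_search(query, nprobe=2):
    # Step 1: pick the nprobe centroids closest to the query
    order = np.argsort(np.linalg.norm(centroids - query, axis=1))[:nprobe]
    # Step 2: scan only the vectors assigned to those clusters
    candidates = np.where(np.isin(assignments, order))[0]
    best = candidates[np.argmin(np.linalg.norm(data[candidates] - query, axis=1))]
    return int(best), len(candidates)

query = rng.normal(size=2)
best, scanned = ivf_search(query)
# Far fewer than 1000 vectors are compared, at some risk of
# missing the true nearest neighbor in an unprobed cluster
```

Raising `nprobe` scans more clusters — better recall, higher latency — which is the recall/latency/cost tradeoff in one parameter.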
FAISS Deep Dive: The Vector Search Library
FAISS (Facebook AI Similarity Search) is a powerful C++/Python library focused on efficient similarity search of dense vectors.
When to use FAISS:
- Local development and prototyping
- Research applications
- When maximum control over indexes is needed
- When operating directly with GPUs for performance
FAISS offers multiple index types:
- Flat: Brute-force exact search (slow but perfect recall)
- IVFFlat: Clustering with exact within-cluster search
- IVFPQ: Clustering with compressed vectors
- HNSW: Graph-based approach for very fast search
The main advantage of FAISS is its raw performance and flexibility, especially with GPU acceleration. However, it’s a library, not a database—you’ll need to handle storage, scaling, and operational concerns yourself.
Milvus Deep Dive: The Vector Database
Milvus is an open-source vector database designed for production environments. It handles the same vector operations as FAISS but adds:
- Built-in persistent storage
- Scaling and clustering capabilities
- Metadata management and filtering
- High availability features
When to use Milvus:
- Production deployments
- When scaling to multiple servers
- When team operations matter
- When metadata filtering is important
Milvus organizes data into collections (similar to tables) that can be partitioned for performance. It supports the same index types as FAISS while adding operational features that production systems need.
Decision Guide: FAISS vs Milvus
Choose FAISS when:
- Working locally or prototyping
- Running research or batch analytics
- Maximizing raw performance is critical
- Operating in constrained environments like edge devices
Choose Milvus when:
- Building production systems
- Scaling across multiple servers
- Needing operational robustness
- Working with teams that need simple APIs
Many projects start with FAISS for prototyping, then migrate to Milvus for production. This works well because Milvus uses FAISS internally for some of its index types.
Building a Semantic Search System: Step-by-Step
1. Prepare your data
   - Clean and deduplicate documents
   - Split into appropriate chunks if needed
2. Select an embedding model
   - Choose based on your domain (general text, code, legal, etc.)
   - Options include OpenAI models, Sentence Transformers, or domain-specific models
3. Compute embeddings
   - Process all documents through your chosen model
   - Normalize vectors if required by your distance metric
4. Index your vectors
   - With FAISS: select and train an index, then add vectors
   - With Milvus: create a collection, define a schema, insert data
5. Process queries
   - Embed the query using the same model
   - Perform similarity search
   - Return results to users
6. Add filters
   - Store metadata alongside vectors
   - Filter results based on categories, dates, or other attributes
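The whole pipeline can be sketched end-to-end in a few dozen lines. The `toy_embed` function below is a deterministic stand-in for a real embedding model (it produces meaningless but repeatable vectors); in practice you would swap in, say, a Sentence Transformers model. Document contents and category names are likewise invented:

```python
import hashlib
import numpy as np

def toy_embed(text: str, dim: int = 16) -> np.ndarray:
    """Stand-in for a real embedding model: deterministic but meaningless.
    Replace with an actual model (e.g. Sentence Transformers) in practice."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
    vec = np.random.default_rng(seed).normal(size=dim)
    return vec / np.linalg.norm(vec)   # unit-normalize for cosine via dot product

# Steps 1-4: prepare documents, compute embeddings, build the index
documents = [
    {"text": "Aspirin reduces fever", "category": "medical"},
    {"text": "Quarterly earnings rose", "category": "finance"},
]
index = np.array([toy_embed(d["text"]) for d in documents])

# Steps 5-6: embed the query with the same model, search, then filter
def query(text, category=None, k=1):
    scores = index @ toy_embed(text)
    ranked = np.argsort(-scores)
    hits = [documents[i] for i in ranked
            if category in (None, documents[i]["category"])]
    return hits[:k]
```

Note that the query is embedded with the same model as the documents — mixing models puts query and document vectors in incompatible spaces and silently ruins results.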
Evaluation and Tuning Vector Search
Measure performance with these metrics:
- Recall@k: Percentage of relevant items in top k results
- Mean Reciprocal Rank (MRR): Rewards placing relevant results higher
- Latency: Speed of queries (p95 response time)
- Throughput: Queries handled per second
Tune ANN parameters for your use case:
- For HNSW: Adjust efSearch (search depth) and M (connections per node)
- For IVF: Tune nlist (number of clusters) and nprobe (clusters searched)
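Recall@k and MRR are simple enough to compute by hand; a small sketch with invented document IDs and ground-truth sets:

```python
import numpy as np

def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant items that appear in the top-k retrieved list."""
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)

def mrr(retrieved_lists, relevant_sets):
    """Mean reciprocal rank of the first relevant hit, averaged over queries."""
    ranks = []
    for retrieved, relevant in zip(retrieved_lists, relevant_sets):
        rr = 0.0
        for pos, doc in enumerate(retrieved, start=1):
            if doc in relevant:
                rr = 1.0 / pos
                break
        ranks.append(rr)
    return float(np.mean(ranks))

# Toy data: ranked result IDs per query vs. ground-truth relevant IDs
print(recall_at_k(["d3", "d1", "d7"], {"d1", "d9"}, k=3))  # 1 of 2 found: 0.5
print(mrr([["d3", "d1"], ["d2"]], [{"d1"}, {"d2"}]))       # (1/2 + 1) / 2 = 0.75
```

A useful tuning loop: fix a set of representative queries with labeled relevant documents, then sweep `efSearch` or `nprobe` and plot recall@k against p95 latency to pick an operating point.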
Scaling and Operations
As your system grows:
- Shard vectors across multiple servers
- Add replicas for high availability
- Monitor index health and query performance
- Implement incremental updates as data changes
- Set up observability for slow queries and resource usage
Common Pitfalls and Best Practices
Avoid these common mistakes:
- Using embedding models mismatched to your domain
- Forgetting to normalize vectors when required
- Setting index parameters without testing
- Ignoring metadata filtering capabilities
- Over-compressing vectors and losing accuracy
Best practices include:
- Start simple with Flat indexes then optimize
- Test with representative queries from your domain
- Combine vector and keyword search for best results
- Plan for data growth from the beginning
Frequently Asked Questions
What vector dimension should I use?
The dimension is determined by your embedding model, typically between 256 and 1536.
Do I need GPUs?
Not always. Modern CPUs with good ANN indexes can handle many use cases efficiently.
Can I combine keyword and vector search?
Yes. This hybrid search often provides the best results and is supported directly by Milvus.
How often should I update embeddings?
When your data changes significantly or when you upgrade your embedding model.
Vector databases and embeddings have transformed search from simple keyword matching to true semantic understanding. By representing meaning mathematically, computers can now find what users actually want, not just what they literally type. Whether you choose FAISS for raw performance or Milvus for production robustness, these tools enable a new generation of intelligent applications.