Quick Answer
Embed your content with a modern embedding model, store the vectors in pgvector, and query by cosine similarity. For best results, combine the vector score with a BM25-style keyword score in a hybrid ranking. Hybrid search beats traditional keyword search on roughly 80% of query types.
- Time to implement: 3-5 hours for a basic version
- Cost: ~$0.02 per 1M input tokens for embeddings
- Expected recall improvement: 20-60% over keyword-only
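To turn the per-token price above into a budget figure, a rough calculation like the one below can help. It is only a sketch: the 4-characters-per-token ratio is an assumed rule of thumb for English text rather than a real tokenizer count, and the function and constant names are made up for illustration.

```typescript
// Rough embedding cost estimate.
// PRICE_PER_MILLION_TOKENS_USD is the ~$0.02 per 1M input tokens quoted above.
// ASSUMED_CHARS_PER_TOKEN is an assumption, not an exact tokenizer count.
const PRICE_PER_MILLION_TOKENS_USD = 0.02;
const ASSUMED_CHARS_PER_TOKEN = 4;

// Approximate the token count of a piece of text from its length.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / ASSUMED_CHARS_PER_TOKEN);
}

// Convert a total token count into an estimated cost in USD.
function estimateEmbeddingCostUSD(
  totalTokens: number,
  pricePerMillionTokens: number = PRICE_PER_MILLION_TOKENS_USD
): number {
  return (totalTokens / 1000000) * pricePerMillionTokens;
}

// Example: 10,000 articles averaging about 1,000 tokens each
// is 10M tokens, which comes to roughly 0.20 USD for a full backfill.
const exampleArticleCount = 10000;
const exampleAvgTokensPerArticle = 1000;
const exampleBackfillCost = estimateEmbeddingCostUSD(
  exampleArticleCount * exampleAvgTokensPerArticle
);
```

Re-embedding on every edit adds to this figure, so it is worth re-running the estimate with your expected update volume.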
What You'll Need
- Supabase (self-hosted) with pgvector extension
- Embedding API (OpenAI-compatible)
- Node.js / Next.js app
- Content to search (articles, products, docs)
Steps
- Install pgvector. In Supabase SQL editor: create extension if not exists vector;.
- Add embedding column. alter table articles add column embedding vector(1536);. The dimension must match your embedding model's output (1536 for text-embedding-3-small).
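- Create index. For fewer than 100K rows: create index on articles using ivfflat (embedding vector_cosine_ops) with (lists = 100);. Build an IVFFlat index after the table has data, because its clusters are derived from existing rows; a common starting point is lists = rows / 1000. For larger tables, use HNSW: create index on articles using hnsw (embedding vector_cosine_ops);.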
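- Backfill embeddings. For each row, concatenate title + excerpt + body, call the embedding API, and store the returned vector. Send 100 rows per API call for speed, and truncate or chunk any body that exceeds the model's input token limit.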
- Re-embed on insert/update. Use a Supabase Edge Function or an app-level hook to re-embed a row whenever its content changes.
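- Query with vector search. Embed the user's query, then run: SELECT *, 1 - (embedding <=> $1) AS similarity FROM articles ORDER BY embedding <=> $1 LIMIT 20. The <=> operator returns cosine distance, so 1 minus that value is the cosine similarity.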
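- Add hybrid keyword scoring. Postgres has tsvector for full-text search. Note that its built-in ts_rank is a term-frequency ranking, not true BM25, so treat it as a BM25-style keyword score. Normalize both scores to the same range, then combine: final_score = 0.5 * vector_score + 0.5 * keyword_score. Ask AI: "Generate a Postgres function that returns top-K hybrid-ranked results."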
- Optional: re-rank top 20 with cross-encoder. Cohere Rerank or a self-hosted bge-reranker slashes irrelevant results from top 3.
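The hybrid scoring step can also be done in application code after running the vector query and the full-text query separately. The sketch below assumes the vector score is cosine similarity and the keyword score is a full-text rank such as ts_rank output; it min-max normalizes each list and blends them with the 0.5 / 0.5 weights from the steps. The names ScoredDoc, HybridResult, minMaxNormalize, and hybridRank are hypothetical, and min-max normalization is one reasonable choice among several, not the only way to fuse scores.

```typescript
// Hybrid ranking sketch: normalize each score list to [0, 1], then blend.
interface ScoredDoc {
  id: string;
  score: number;
}

interface HybridResult {
  id: string;
  vectorScore: number;  // normalized to [0, 1]
  keywordScore: number; // normalized to [0, 1]
  finalScore: number;
}

// Rescale scores to [0, 1] so the two signals are comparable.
function minMaxNormalize(docs: ScoredDoc[]): Record<string, number> {
  const normalized: Record<string, number> = {};
  if (docs.length === 0) return normalized;
  let min = Infinity;
  let max = -Infinity;
  for (const d of docs) {
    min = Math.min(min, d.score);
    max = Math.max(max, d.score);
  }
  for (const d of docs) {
    // When every score is identical, treat all of them as a full match.
    normalized[d.id] = max === min ? 1 : (d.score - min) / (max - min);
  }
  return normalized;
}

function hybridRank(
  vectorHits: ScoredDoc[],
  keywordHits: ScoredDoc[],
  vectorWeight: number = 0.5,
  topK: number = 20
): HybridResult[] {
  const v = minMaxNormalize(vectorHits);
  const k = minMaxNormalize(keywordHits);
  // Union of ids that appear in either result list.
  const ids = Object.keys({ ...v, ...k });
  const results = ids.map((id): HybridResult => {
    // A document missing from one list contributes 0 for that signal.
    const vectorScore = v[id] ?? 0;
    const keywordScore = k[id] ?? 0;
    const finalScore =
      vectorWeight * vectorScore + (1 - vectorWeight) * keywordScore;
    return { id, vectorScore, keywordScore, finalScore };
  });
  return results.sort((a, b) => b.finalScore - a.finalScore).slice(0, topK);
}

// Example: "b" ranks first because it scores well on both signals.
const exampleRanking = hybridRank(
  [{ id: "a", score: 0.9 }, { id: "b", score: 0.8 }, { id: "c", score: 0.5 }],
  [{ id: "b", score: 0.6 }, { id: "c", score: 0.4 }, { id: "d", score: 0.2 }]
);
```

Doing the blend in the app keeps the weights easy to tune against an evaluation set; moving it into a Postgres function saves a round trip once the weights settle.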
Common Mistakes
- Only indexing title: Embed full content for meaningful similarity.
- Wrong dimension: Mismatch between model and column dimension = error.
- No BM25 fallback: Pure vector misses exact-match queries like SKUs, names.
- No re-ranking: Top-20 vector results often have 3-5 off-topic hits. Re-rank fixes this.
- Not filtering by metadata: Always pre-filter by user/category/language, then vector search.
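Two of the mistakes above, the dimension mismatch and the missing metadata filter, can be guarded against with a few lines of application code. The sketch below is illustrative only: toVectorLiteral and FILTERED_SEARCH_SQL are hypothetical names, and the id, title, and language columns are assumptions about the articles table. It relies on pgvector accepting vectors as text in the form [0.1,0.2,...].

```typescript
// EXPECTED_DIM mirrors the vector(1536) column from the steps;
// change it if your embedding model outputs a different size.
const EXPECTED_DIM = 1536;

// Validate an embedding and format it as a pgvector text literal.
function toVectorLiteral(
  embedding: number[],
  expectedDim: number = EXPECTED_DIM
): string {
  if (embedding.length !== expectedDim) {
    throw new Error(
      `Embedding has ${embedding.length} dimensions, expected ${expectedDim}`
    );
  }
  if (!embedding.every((x) => isFinite(x))) {
    throw new Error("Embedding contains NaN or infinite values");
  }
  return `[${embedding.join(",")}]`;
}

// Filter by metadata in the same query as the vector search.
// The language column is an assumed example of a metadata filter.
const FILTERED_SEARCH_SQL = `
  SELECT id, title, 1 - (embedding <=> $1::vector) AS similarity
  FROM articles
  WHERE language = $2
  ORDER BY embedding <=> $1::vector
  LIMIT 20`;

// Example parameters for a 3-dimensional toy vector.
const exampleParams: string[] = [toVectorLiteral([0.1, 0.2, 0.3], 3), "en"];
```

Failing fast in the app gives a clearer error message than the database would, and it catches a swapped model before a backfill writes thousands of rows.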
Top Tools
| Tool | Best For | Price |
|---|---|---|
| pgvector | Postgres vector store | Free |
| text-embedding-3-small | Cheap & good | $0.02/M tokens |
| bge-m3 | Self-hosted embeddings | Free |
| Cohere Rerank-compat | Re-ranking | $1/1K |
| tsvector (pg) | Keyword (full-text) search | Free |
FAQs
Q: Vector DB (Pinecone, Qdrant) vs pgvector?
pgvector wins for most use cases: 1M+ vectors run fine with an HNSW index, and there is no extra infrastructure to operate.
Q: Which embedding model?
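text-embedding-3-small (1536d) for speed and cost. text-embedding-3-large (3072d) for quality. bge-m3 (1024d), self-hosted, for privacy. Keep in mind that pgvector's HNSW and IVFFlat indexes on the vector type support at most 2,000 dimensions, so 3072d vectors need a reduced output dimension or the halfvec type before they can be indexed.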
Q: How do I handle multilingual content?
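Use multilingual embeddings (bge-m3, multilingual-e5). They map text from different languages into one shared vector space, so a query in one language can match content written in another.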
Q: Real-time vs batch indexing?
Trigger-based re-embedding on update works up to ~100 writes/min. Above that, push changed row IDs onto a queue and embed them in batches.
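A queue for re-embedding can start out very small. The sketch below is an in-memory illustration, assuming rows are identified by string ids and that batches of 100 match the batch size used for the backfill; ReembedQueue is a hypothetical class, and a production setup would likely want a durable queue so pending work survives a restart.

```typescript
// In-memory re-embed queue sketch: collects changed row ids, merges
// duplicates, and hands them out in fixed-size batches.
class ReembedQueue {
  private pending: string[] = [];
  private queued: Record<string, boolean> = {};
  private batchSize: number;

  constructor(batchSize: number = 100) {
    if (batchSize < 1) throw new Error("batchSize must be at least 1");
    this.batchSize = batchSize;
  }

  // Record that a row changed; repeated edits to the same row are merged.
  enqueue(rowId: string): void {
    if (this.queued[rowId]) return;
    this.queued[rowId] = true;
    this.pending.push(rowId);
  }

  // Remove and return up to batchSize ids, oldest first.
  nextBatch(): string[] {
    const batch = this.pending.splice(0, this.batchSize);
    for (const id of batch) delete this.queued[id];
    return batch;
  }

  // Number of rows still waiting to be re-embedded.
  size(): number {
    return this.pending.length;
  }
}

// Example: 250 changed rows plus one duplicate edit.
const queue = new ReembedQueue();
for (let i = 0; i < 250; i++) queue.enqueue(`article-${i}`);
queue.enqueue("article-0"); // already queued, so it is ignored
const firstBatch = queue.nextBatch(); // 100 ids, leaving 150 pending
```

A worker would call nextBatch on a timer, send each batch to the embedding API in one request, and write the vectors back.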
Q: How do I evaluate quality?
Create 50 query → expected result pairs. Measure Hit@3, NDCG@10. Compare configs.
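Both metrics are short enough to compute yourself. The sketch below assumes binary relevance (a result is either expected or not) and that each result id appears at most once in a ranking; EvalCase, hitAtK, ndcgAtK, and mean are hypothetical names for illustration.

```typescript
interface EvalCase {
  query: string;
  relevantIds: string[]; // expected results for this query
  rankedIds: string[];   // what the search returned, best first, no duplicates
}

// Hit@k: 1 if any expected result appears in the top k, else 0.
function hitAtK(c: EvalCase, k: number): number {
  const topK = c.rankedIds.slice(0, k);
  return topK.some((id) => c.relevantIds.indexOf(id) !== -1) ? 1 : 0;
}

// Discount for a 0-based position i: 1 / log2(i + 2).
function discount(i: number): number {
  return 1 / (Math.log(i + 2) / Math.LN2);
}

// NDCG@k with binary relevance: DCG of the actual ranking divided by
// the DCG of an ideal ranking that puts every expected result first.
function ndcgAtK(c: EvalCase, k: number): number {
  let dcg = 0;
  c.rankedIds.slice(0, k).forEach((id, i) => {
    if (c.relevantIds.indexOf(id) !== -1) dcg += discount(i);
  });
  const idealHits = Math.min(k, c.relevantIds.length);
  let idcg = 0;
  for (let i = 0; i < idealHits; i++) idcg += discount(i);
  return idcg === 0 ? 0 : dcg / idcg;
}

// Average a metric across all evaluation cases.
function mean(values: number[]): number {
  if (values.length === 0) return 0;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Example: the expected article shows up at rank 2.
const exampleCase: EvalCase = {
  query: "reset password",
  relevantIds: ["a"],
  rankedIds: ["b", "a", "c"],
};
const exampleHit3 = hitAtK(exampleCase, 3);    // 1, since "a" is in the top 3
const exampleNdcg10 = ndcgAtK(exampleCase, 10); // about 0.63, penalized for rank 2
```

Running mean over hitAtK and ndcgAtK for every pair gives one number per configuration, which makes comparing weights, models, or re-rankers straightforward.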
Q: Can I search images?
Yes. Use CLIP embeddings, which place images and text in the same vector space, so a text query can retrieve matching images.
Conclusion
Semantic search is a one-afternoon upgrade that can noticeably improve the search experience in your product. Add pgvector, embed your content, layer on hybrid scoring, and watch bounce rates drop. If you already run Postgres, no new infrastructure is needed.