Quick Answer
Combine content embeddings (for cold start) with collaborative filtering (for warm users), then re-rank with an LLM for final personalization. Store vectors in pgvector and use Supabase Realtime for events. The same approach works for products, articles, videos, or users.
- Time to v1: 1-2 weeks
- Cost: $50-200/mo at small scale
- Expected lift: 10-40% on engagement metrics
What You'll Need
- Supabase with pgvector
- Item catalog (products, articles, videos)
- User event log (views, clicks, purchases)
- Embedding API
Steps
- Embed every item. For products: title + description + category. For articles: title + first paragraph. Store in `items.embedding`.
- Build content-based recs (cold start). For a given item, find nearest neighbors by cosine similarity: `SELECT * FROM items ORDER BY embedding <=> target_embedding LIMIT 10`.
- Log user events in `events (user_id, item_id, event_type, created_at)`. Event types: view, click, add_to_cart, purchase.
- Build user embeddings. Average the embeddings of items the user engaged with, weighted by event type (purchase=3x, click=1x). Update on each event.
- Collaborative layer. Find similar users: `SELECT user_id FROM user_embeddings ORDER BY embedding <=> target_user LIMIT 50`. Recommend items they engaged with that the target user hasn't seen.
- LLM re-rank (optional, for premium experience). Pass top 20 candidates + user context to the LLM: "Re-order these for a user who [preferences]. Return top 10 IDs." Cache per user per day.
- Diversity & exploration. Don't show 10 identical products. Apply MMR (maximal marginal relevance) or just boost items from different categories. Reserve 10% of slots for random items as exploration.
- Measure lift. A/B test recs vs chronological/popular. Track CTR, conversion, session time.
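
Two of the steps above, the weighted user embedding and the MMR diversity pass, can be sketched in plain Python. This is a minimal illustration, not a production implementation: the function names (`user_embedding`, `mmr`), the `lam` trade-off parameter, and the weights for `view` and `add_to_cart` are assumptions for the example. Only purchase=3x and click=1x come from the steps above. In practice the vectors would come from pgvector rather than in-memory dicts.

```python
import math

# Event weights. purchase=3x and click=1x follow the steps above;
# the values for view and add_to_cart are assumed for illustration.
EVENT_WEIGHTS = {"view": 0.5, "click": 1.0, "add_to_cart": 2.0, "purchase": 3.0}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def user_embedding(events, item_vectors):
    """Weighted average of the embeddings of items a user engaged with.

    events: list of (item_id, event_type) pairs.
    item_vectors: dict mapping item_id -> list of floats.
    Returns None when the user has no usable events (cold start).
    """
    total_weight = 0.0
    acc = None
    for item_id, event_type in events:
        vec = item_vectors.get(item_id)
        weight = EVENT_WEIGHTS.get(event_type, 0.0)
        if vec is None or weight == 0.0:
            continue  # skip unknown items and unknown event types
        if acc is None:
            acc = [0.0] * len(vec)
        for i, x in enumerate(vec):
            acc[i] += weight * x
        total_weight += weight
    if acc is None:
        return None
    return [x / total_weight for x in acc]


def mmr(query, candidates, k, lam=0.7):
    """Maximal marginal relevance: greedily pick k item IDs.

    candidates: dict mapping item_id -> vector.
    lam: 1.0 ranks by relevance only; lower values penalize items that
    resemble ones already selected.
    """
    remaining = dict(candidates)
    selected = []
    while remaining and len(selected) < k:
        def score(item_id):
            relevance = cosine(query, remaining[item_id])
            redundancy = max(
                (cosine(remaining[item_id], candidates[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected
```

A typical flow would call `user_embedding` after each logged event, fetch the top 20 or so candidates by similarity, and pass them through `mmr` before display. A `None` result from `user_embedding` signals a cold-start user who should get fallback recs instead.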
Common Mistakes
- Only showing similar items: Users want "you might also like" PLUS "something new." Balance exploit/explore.
- No time decay: 2-year-old clicks shouldn't weigh as much as yesterday's. Decay weights exponentially.
- Recommending out-of-stock: Filter by availability always.
- Bias toward popular: Popularity ≠ personal. Normalize for item popularity.
- No fallback: Cold-start user with zero events still needs recs. Show editorial + category best-sellers.
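
The time-decay point above can be sketched as a half-life weighting applied to each event before it is averaged into a user embedding. The 30-day half-life and the function name `decayed_weight` are assumptions for illustration; the right half-life depends on how quickly tastes shift in your catalog.

```python
from datetime import datetime, timedelta, timezone

HALF_LIFE_DAYS = 30.0  # assumed value; tune against your own data


def decayed_weight(base_weight, event_time, now, half_life_days=HALF_LIFE_DAYS):
    """Scale an event's weight by 0.5 ** (age_in_days / half_life_days).

    Events timestamped in the future (clock skew) are treated as age 0.
    """
    age_days = max((now - event_time).total_seconds() / 86400.0, 0.0)
    return base_weight * 0.5 ** (age_days / half_life_days)


# Usage: a purchase (base weight 3.0) from one half-life ago counts half as much.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
recent = decayed_weight(3.0, now, now)
older = decayed_weight(3.0, now - timedelta(days=30), now)
```

With this scheme a two-year-old click still contributes, but its weight is small enough that recent behavior dominates.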
Top Tools
| Tool | Best For | Price |
|---|---|---|
| Supabase pgvector | Vector store + SQL | Free tier |
| Recombee | Hosted recs | Paid |
| LightFM | Hybrid model | Free |
| Qdrant | Alternative vector DB | Free tier |
| PostHog | Event tracking | Free tier |
Conclusion
Recommendations compound revenue. Start with simple content-based recs, add a collaborative layer as you gather events, and layer on LLM re-ranking for a premium feel. Ship fast, measure lift, iterate monthly.
