Quick Answer
Ingest docs (Notion, Google Drive, PDFs, websites), chunk + embed, store in pgvector, then serve a chat UI that retrieves top chunks and streams LLM answers with source citations. Stack: Next.js + Supabase + assisters.dev-compatible API.
- Time to ship: 3-7 days
- Cost: $0.10-$1 per 1K queries
- Use cases: Customer support, internal wiki, product docs
What You'll Need
- Source docs (Markdown, PDF, HTML, Notion, Confluence)
- Supabase with pgvector
- Next.js 15 for chat UI
- Embedding & LLM APIs
Steps
- Inventory sources. List every doc source and format: PDFs, Notion, Drive, Google Docs, help center articles, Slack archives, GitHub wikis.
- Build ingestion pipeline. For each source, fetch → extract text → chunk (500 tokens, 50 overlap) → embed → upsert to pgvector with metadata (source URL, title, updated_at). A TypeScript sketch follows this list.
- Schema. create table kb_chunks (id uuid primary key default gen_random_uuid(), source text, url text, title text, chunk text, embedding vector(1536), updated_at timestamptz); plus an ivfflat or HNSW index, e.g. create index on kb_chunks using hnsw (embedding vector_cosine_ops);
- Schedule re-ingestion. Run a daily cron job for changed docs: compare each source's updated_at to the stored value and re-embed if newer. Delete orphaned chunks whose source doc no longer exists.
- Build retrieval. User query → embed → top-8 chunks via cosine similarity. Add a re-ranking step (cross-encoder) to narrow to a final top 3 if quality matters. The route handler sketched after this list wires retrieval, prompting, and streaming together.
- Chat UI. shadcn/ui chat pattern. Streaming LLM responses. Show source cards below each answer — clickable links with title + snippet.
- Prompt the LLM carefully. "Answer using ONLY the context. Cite every claim as [1]. If context doesn't cover the question, say 'I don't have info on that.'" Include retrieved chunks with numeric IDs.
- Add feedback loop. Thumbs up/down per answer. Log misses for review. Retrain retrieval weights or add missing content.
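Here's what steps 2-4 look like as a minimal TypeScript sketch. Everything in it is illustrative: chunkText approximates tokens with words (a real pipeline would use a tokenizer such as tiktoken), the embed call assumes OpenAI's text-embedding-3-small (1536 dimensions, matching the schema above), and ingestDoc expects you've already fetched and extracted the text for each source.

```ts
// ingest.ts -- minimal sketch: chunk -> embed -> upsert to pgvector.
// Assumes the kb_chunks table from step 3 and @supabase/supabase-js.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_KEY!
);

// Naive chunker: ~500 "tokens" approximated as words, 50-word overlap.
function chunkText(text: string, size = 500, overlap = 50): string[] {
  const words = text.split(/\s+/);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += size - overlap) {
    chunks.push(words.slice(i, i + size).join(" "));
  }
  return chunks;
}

async function embed(texts: string[]): Promise<number[][]> {
  const res = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "text-embedding-3-small", input: texts }),
  });
  const json = await res.json();
  return json.data.map((d: { embedding: number[] }) => d.embedding);
}

// doc: one extracted document from any source (Notion, Drive, PDF, ...).
export async function ingestDoc(doc: {
  source: string;
  url: string;
  title: string;
  text: string;
  updatedAt: string;
}) {
  const chunks = chunkText(doc.text);
  const embeddings = await embed(chunks);
  const rows = chunks.map((chunk, i) => ({
    source: doc.source,
    url: doc.url,
    title: doc.title,
    chunk,
    embedding: embeddings[i],
    updated_at: doc.updatedAt,
  }));
  // Re-ingestion (step 4): drop stale chunks for this URL, insert fresh ones.
  await supabase.from("kb_chunks").delete().eq("url", doc.url);
  const { error } = await supabase.from("kb_chunks").insert(rows);
  if (error) throw error;
}
```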
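And steps 5-7 as a single Next.js route handler. Again a sketch, not a drop-in: it assumes a Postgres function match_kb_chunks(query_embedding, match_count) that orders by cosine distance (mirroring the match_documents pattern in the Supabase pgvector docs) and Vercel's AI SDK for streaming; the optional re-ranking step is omitted.

```ts
// app/api/chat/route.ts -- retrieve -> prompt -> stream (steps 5-7).
import { createClient } from "@supabase/supabase-js";
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_ANON_KEY!
);

async function embedQuery(text: string): Promise<number[]> {
  const res = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "text-embedding-3-small", input: text }),
  });
  return (await res.json()).data[0].embedding;
}

export async function POST(req: Request) {
  const { messages } = await req.json();
  const query = messages[messages.length - 1].content;

  // Step 5: embed the query and pull the top-8 chunks by cosine distance.
  const { data: chunks } = await supabase.rpc("match_kb_chunks", {
    query_embedding: await embedQuery(query),
    match_count: 8,
  });

  // Step 7: number the chunks so the model can cite them as [1], [2], ...
  const context = (chunks ?? [])
    .map(
      (c: { title: string; url: string; chunk: string }, i: number) =>
        `[${i + 1}] ${c.title} (${c.url})\n${c.chunk}`
    )
    .join("\n\n");

  // Step 6: stream the answer; the UI renders source cards from the chunks.
  const result = streamText({
    model: openai("gpt-4o-mini"),
    system:
      "Answer using ONLY the context below. Cite every claim as [n]. " +
      'If the context doesn\'t cover the question, say "I don\'t have info on that."\n\n' +
      context,
    messages,
  });
  return result.toDataStreamResponse();
}
```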
Common Mistakes
- Too-small chunks: 100-token chunks lose context. Stick to 400-600 tokens.
- No metadata: Can't filter by product/version/language without it.
- Chat only, no search: Offer both — some users want traditional keyword search too.
- Stale data: Schedule daily re-ingestion. Badge answers "updated: 2d ago."
- No access control: Internal KBs need row-level security by team/role.
Top Tools
| Tool | Best For | Price |
|---|---|---|
| Supabase pgvector | Vector store | Free tier |
| LlamaIndex | Ingestion framework | Free |
| Unstructured.io | PDF/doc parsing | Free tier |
| Cohere Rerank-compatible | Re-ranking | $1/1K |
| shadcn/ui | Chat components | Free |
FAQs
Q: Can I use Notion/Confluence as a source?
Yes; both expose APIs. Poll them or subscribe to webhooks to sync. A sketch follows.
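For Notion, a polling loop against its real search endpoint looks roughly like this; `since` is an ISO timestamp you persist between sync runs, and pulling each page's blocks into the ingestion pipeline is left out.

```ts
// Sketch: find Notion pages edited since the last sync run.
async function findChangedNotionPages(since: string) {
  const res = await fetch("https://api.notion.com/v1/search", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.NOTION_TOKEN}`,
      "Notion-Version": "2022-06-28",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      filter: { property: "object", value: "page" },
      sort: { direction: "descending", timestamp: "last_edited_time" },
    }),
  });
  const { results } = await res.json();
  // Keep only pages edited after the last sync; re-ingest those.
  return results.filter(
    (p: { last_edited_time: string }) => p.last_edited_time > since
  );
}
```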
Q: How do I handle images in docs?
Use vision models (Claude, GPT-4V) to caption images, then embed the captions alongside your text chunks. A sketch follows.
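For example, against Anthropic's Messages API; the model alias and caption prompt here are illustrative choices, not the only option.

```ts
// Sketch: caption an image with Claude, then embed the caption like any
// other chunk.
async function captionImage(base64Png: string): Promise<string> {
  const res = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "x-api-key": process.env.ANTHROPIC_API_KEY!,
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model: "claude-3-5-sonnet-latest", // illustrative model alias
      max_tokens: 300,
      messages: [
        {
          role: "user",
          content: [
            {
              type: "image",
              source: {
                type: "base64",
                media_type: "image/png",
                data: base64Png,
              },
            },
            {
              type: "text",
              text: "Describe this image for a searchable knowledge base.",
            },
          ],
        },
      ],
    }),
  });
  const json = await res.json();
  return json.content[0].text;
}
```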
Q: What's a good retrieval quality metric?
Hit@3 (is the correct chunk in the top 3?). Aim for >85%. The sketch below shows how to score it.
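Keep a hand-labeled set of (query, expected chunk) pairs and score your retriever weekly. retrieveTopK here is a stand-in for whatever retrieval function you built in step 5.

```ts
// Sketch: measure Hit@k over a labeled eval set.
type EvalCase = { query: string; expectedChunkId: string };

async function hitAtK(
  cases: EvalCase[],
  retrieveTopK: (q: string, k: number) => Promise<{ id: string }[]>,
  k = 3
): Promise<number> {
  let hits = 0;
  for (const c of cases) {
    const top = await retrieveTopK(c.query, k);
    if (top.some((chunk) => chunk.id === c.expectedChunkId)) hits++;
  }
  return hits / cases.length; // aim for > 0.85
}
```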
Q: Can I keep data on-premise?
Yes — self-host Supabase, use local embedding model (bge-m3), local LLM (Llama).
Q: How many docs can pgvector handle?
Millions of chunks comfortably with an HNSW index on a 4-core VPS.
Q: Do I need LangChain?
No — 200 lines of plain TypeScript does this. LangChain adds complexity fast.
Conclusion
A good AI knowledge base can deflect up to 80% of support tickets and onboarding questions. Start with your help center docs, measure hit rate weekly, and expand sources from there. One KB can save your team 20+ hours per week.