Quick Answer
An embedding is a fixed-length vector of floating-point numbers that captures the semantic meaning of a piece of data.
- Typical size: 384, 768, 1536, or 3072 dimensions
- Produced by models like text-embedding-3-large or voyage-3
- Similar items have small cosine distance between their vectors
What Does Embedding Mean?
A computer has no built-in notion that "cat" and "kitten" are related, but their embedding vectors point in nearly the same direction in a high-dimensional space. The embedding model has learned this relationship from training on billions of sentences (Google AI blog on word2vec, 2013; OpenAI embedding guide, 2024).
Embeddings turn the fuzzy idea of "meaning" into math you can index, cluster, and search.
How It Works
- Text is split into tokens and fed to the embedding model (usually a transformer encoder)
- The model processes every token through attention layers, producing one vector per token
- The token vectors are pooled into a single output vector, typically by averaging them or by taking the vector of a special [CLS] token
- You store this vector in a vector database
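The pooling step above can be sketched in a few lines. This is a minimal illustration of mean pooling only; the function name `mean_pool` and the 3-dimensional token vectors are made up for the example and are not real model output.

```python
def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into one vector."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    dim = len(token_vectors[0])
    count = len(token_vectors)
    # Average each dimension across all tokens.
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]


# Toy token vectors (hypothetical values, one row per token).
tokens = [
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
    [2.0, 4.0, 1.0],
]

pooled = mean_pool(tokens)
print(pooled)  # [2.0, 2.0, 1.0]
```

Whatever the input length, the result has the same number of dimensions, which is why embeddings are fixed-length.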
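Similarity is usually measured with cosine similarity, the cosine of the angle between two vectors: 1.0 means they point in the same direction (near-identical meaning), 0 means they are orthogonal (unrelated), and -1 means they point in opposite directions.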
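As a rough sketch, cosine similarity is the dot product of two vectors divided by the product of their lengths. The helper name `cosine_similarity` and the toy vectors below are assumptions for illustration; production systems usually rely on a vector database or a numerical library instead.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


v = [1.0, 2.0, 3.0]
same_direction = cosine_similarity(v, [2.0, 4.0, 6.0])    # close to 1.0
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])    # 0.0
opposite = cosine_similarity(v, [-1.0, -2.0, -3.0])       # close to -1.0
print(same_direction, orthogonal, opposite)
```

Note that scaling a vector does not change the score: only direction matters, not length.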
Examples
- "king" - "man" + "woman" ~= "queen" (classic word2vec result)
- Search: "how to reset password" matches "I forgot my login" even without keyword overlap
- Recommendations: Netflix embeds movies; similar vectors = similar movies
- Clustering: 10,000 support tickets grouped into "billing", "login", "bug" buckets
- RAG: Embed every doc chunk, embed the query, retrieve nearest neighbors
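The search and RAG examples above share one retrieval step: embed the query, then return the stored items whose vectors are closest. Here is a minimal brute-force sketch, assuming the vectors have already been produced by an embedding model. The 3-dimensional vectors are hand-written stand-ins, and `normalize`, `dot`, and `top_k` are hypothetical helper names.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(query_vec, index, k=2):
    """Return the k texts whose unit vectors score highest against the query."""
    q = normalize(query_vec)
    ranked = sorted(index, key=lambda item: dot(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Stand-in vectors; a real system would get these from an embedding model.
documents = {
    "I forgot my login": [0.9, 0.1, 0.0],
    "Update billing address": [0.1, 0.9, 0.1],
    "App crashes on start": [0.0, 0.2, 0.9],
}
# Normalize once at indexing time, not on every query.
index = [(text, normalize(vec)) for text, vec in documents.items()]

query_vec = [0.8, 0.2, 0.1]  # stand-in for "how to reset password"
print(top_k(query_vec, index, k=1))  # ['I forgot my login']
```

A linear scan like this works for small collections; vector databases use approximate nearest-neighbor indexes to avoid comparing the query against every stored vector.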
Text Embeddings vs Image Embeddings
Both produce vectors but from different encoders. CLIP (OpenAI, 2021) embeds text and images into the same space, so "a photo of a dog" and an actual dog photo land near each other. That enables cross-modal search.
When to Use Embeddings
- Semantic search (find by meaning, not keywords)
- Clustering and topic discovery
- Recommendation systems
- Anomaly detection (far-from-normal vectors)
- RAG pipelines
- Deduplication (near-duplicate vectors)
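To make the anomaly-detection use case concrete, one simple approach is to flag vectors that sit far from the centroid of the collection. This is only a sketch of that idea under stated assumptions: the 2-dimensional vectors, the `threshold` value, and the helper names `centroid` and `find_outliers` are all made up for illustration, and real pipelines often use more robust methods.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / count for i in range(dim)]


def find_outliers(named_vectors, threshold):
    """Return names whose Euclidean distance from the centroid exceeds threshold."""
    center = centroid(list(named_vectors.values()))
    return [
        name
        for name, vec in named_vectors.items()
        if math.dist(vec, center) > threshold
    ]


# Hypothetical ticket embeddings: three similar items and one far away.
tickets = {
    "ticket_1": [1.0, 1.0],
    "ticket_2": [1.2, 0.8],
    "ticket_3": [0.8, 1.2],
    "ticket_4": [5.0, 5.0],
}

print(find_outliers(tickets, threshold=3.0))  # ['ticket_4']
```

The threshold has to be tuned per dataset and per embedding model, since distances are not comparable across models.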
FAQs
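Are embeddings reversible? Not directly: a vector does not store the original text and there is no simple decoder. Research on embedding inversion has nonetheless shown that much of the content can sometimes be recovered, so treat embeddings of sensitive data as sensitive.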
What is a vector database? A database optimized for nearest-neighbor search. Examples: pgvector, Pinecone, Weaviate.
Are embeddings model-specific? Yes. Vectors from different models live in different spaces and cannot be compared with each other. Re-embed everything if you switch.
How big is an embedding? 1536 floats at 4 bytes each (float32) = ~6 KB per document. A million documents = ~6 GB of raw vectors, before any index overhead.
Do embeddings cost money? Yes, but they are cheap: usually $0.02-0.13 per million tokens.
Can embeddings replace a database? No — they complement keyword search and structured queries.
Do images use the same embeddings as text? Only with multimodal models like CLIP.
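The storage estimate in the FAQ above is easy to reproduce. The sketch below assumes 4-byte float32 values and counts raw vector data only; the function name `embedding_storage_bytes` is hypothetical, and real deployments add index and metadata overhead on top.

```python
def embedding_storage_bytes(num_vectors, dimensions, bytes_per_float=4):
    """Raw size of the vectors alone, assuming float32 by default."""
    return num_vectors * dimensions * bytes_per_float


per_document = embedding_storage_bytes(1, 1536)
per_million = embedding_storage_bytes(1_000_000, 1536)

print(per_document)       # 6144 bytes, roughly 6 KB
print(per_million / 1e9)  # 6.144 (decimal GB)
```

Halving the precision (for example, `bytes_per_float=2`) or choosing a smaller dimension count cuts storage proportionally.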
Conclusion
Embeddings are the quiet engine behind search, RAG, and recommendations. Master them and most AI products become simple. More guides at Misar Blog.