Embeddings,
visually.

An interactive playground for understanding how machines turn words into numbers — and why those numbers capture meaning.

runs entirely in your browser · real model: all-MiniLM-L6-v2 · no server, no API keys
01 · The core idea

An embedding is a list of numbers that represents meaning.

Computers can't reason about the word "cat" directly — they only understand numbers. So we convert each word (or sentence, image, product, song…) into a fixed-length list of numbers called a vector. The trick is: we train the conversion so that things with similar meaning end up with similar numbers.

cat
[ 0.21, −0.48, 0.67, −0.02, 0.35, −0.81, 0.14, ... (384 numbers total) ]
Each number is a "dimension". You can think of each dimension as a hidden semantic axis the model learned — maybe one roughly captures "is it alive?", another "is it a fruit?", another "is it formal language?". Real dimensions are usually not that interpretable, but the geometry they create is.
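A minimal sketch of this idea, using hand-made 4-dimensional vectors (not real model output — a real model like all-MiniLM-L6-v2 produces 384 numbers per input):

```python
# Toy 4-dimensional "embeddings". The numbers are made up for illustration;
# the point is that similar meanings get similar numbers.
embeddings = {
    "cat":   [0.9, 0.1, 0.0, 0.3],
    "dog":   [0.8, 0.2, 0.1, 0.3],
    "apple": [0.0, 0.9, 0.8, 0.1],
}

def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# "cat" is much closer to "dog" than to "apple":
print(squared_distance(embeddings["cat"], embeddings["dog"]))    # small
print(squared_distance(embeddings["cat"], embeddings["apple"]))  # large
```

The training objective of a real embedding model is essentially to make this property hold at scale: inputs used in similar contexts end up near each other.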
02 · Why vectors?

So we can measure how close two things are in meaning.

Old-school search matches keywords: if your query doesn't contain the exact word, you miss relevant results. Embeddings match meaning.

Keyword search

Exact words only

query: "how to fix a flat tire"

✓ "how to fix a flat tire"

✗ "my bike has a puncture, help"

✗ "deflated wheel repair guide"

The words don't overlap with the query, so keyword search misses the second and third results, even though all three are obviously about the same topic.

Embedding search

Meaning-based

query: "how to fix a flat tire"

✓ "how to fix a flat tire"

✓ "my bike has a puncture, help"

✓ "deflated wheel repair guide"

All three vectors land in roughly the same region of vector space, so they all score as highly similar to the query.
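The keyword half of this comparison is easy to reproduce without any model at all. A quick sketch (the embedding half would need a real model; here we only show why word overlap fails):

```python
# Keyword matching on the flat-tire example: count what fraction of the
# query's words appear verbatim in each document.
query = "how to fix a flat tire"
docs = [
    "how to fix a flat tire",
    "my bike has a puncture, help",
    "deflated wheel repair guide",
]

def keyword_overlap(q, d):
    """Fraction of query words found verbatim in the document."""
    q_words, d_words = set(q.lower().split()), set(d.lower().split())
    return len(q_words & d_words) / len(q_words)

for d in docs:
    print(f"{keyword_overlap(query, d):.2f}  {d}")
```

The exact match scores 1.00, while the other two score near or at zero despite being about the same topic. An embedding model would place all three near the query.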

03 · The geometry

Similar meanings cluster together in vector space.

Real embeddings have hundreds of dimensions, which is impossible to visualize. But the same principle holds in 2D. Below is a hand-placed toy embedding of 30 everyday words. Click any word — its 3 nearest neighbors will light up.

Click a word to see its nearest neighbors.
Distance = difference in meaning.
animals · fruits · vehicles · emotions · tech
Notice what's happening. Words within a category (e.g. all animals) group together. Words between related categories live in the space between them. In a real 300+ dimension model, the clustering captures subtler axes too: tense, sentiment, formality, topic, and more.
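The "nearest neighbors" behavior of the widget can be sketched in a few lines. The 2D coordinates below are made up, stand-ins for the hand-placed toy embedding above:

```python
import math

# Hand-placed 2D toy embedding (coordinates invented for illustration).
points = {
    "cat": (1.0, 1.2), "dog": (1.2, 1.0), "fish": (0.9, 0.9), "horse": (1.5, 1.4),
    "apple": (4.0, 1.0), "banana": (4.2, 1.2), "cherry": (4.1, 0.8),
    "car": (1.0, 4.0), "bus": (1.3, 4.2), "train": (0.8, 4.3),
}

def nearest(word, k=3):
    """Return the k words closest to `word` by Euclidean distance."""
    others = [w for w in points if w != word]
    others.sort(key=lambda w: math.dist(points[word], points[w]))
    return others[:k]

print(nearest("cat"))  # → ['dog', 'fish', 'horse']
```

Because the animal words were placed near each other, a click on "cat" lights up the other animals, never the fruits or vehicles.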
04 · Measuring similarity

The metric of choice: cosine similarity.

How do we numerically compare two vectors? We look at the angle between them. Two vectors pointing in the same direction are similar; pointing opposite ways means opposite meaning; perpendicular means unrelated.

cos(θ) = (A · B) / (‖A‖ × ‖B‖)

The result ranges from −1 (opposite) to 0 (unrelated) to +1 (identical direction). Drag the red and blue arrow tips below to see how the angle changes the similarity.

Vector A = (1.0, 0.0)
Vector B = (0.0, 1.0)
A · B = 0.00
‖A‖ = 1.00
‖B‖ = 1.00
angle = 90°
cosine similarity = 0.00
−1 · 0 · +1
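The formula is a one-liner to implement. A direct translation of cos(θ) = (A · B) / (‖A‖ × ‖B‖), checked against the three reference cases:

```python
import math

def cosine_similarity(a, b):
    """cos(θ) = (A · B) / (‖A‖ × ‖B‖)"""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # 0.0  (perpendicular → unrelated)
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))   # 1.0  (same direction → identical)
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # -1.0 (opposite direction)
```

Note that scaling a vector (2.0 instead of 1.0 in the second case) doesn't change the score: cosine similarity only cares about direction, not length.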
05 · The real thing

Run an actual embedding model in your browser.

Everything above was toy data. Now let's use a real model: all-MiniLM-L6-v2 — it converts any sentence into a 384-dimensional vector. The model (~25 MB) loads once, then runs locally. No API calls.

Loading model… (~25 MB, one-time download, will be cached by your browser)

Compare any two sentences

Try: different phrasings of the same idea, same words in different orders, antonyms, different languages, or totally unrelated sentences. Watch the score change.

06 · Practical demo

Semantic search: the killer app.

This is the pattern behind RAG, modern search, and recommendation engines: pre-embed a corpus, embed the query, return the closest matches. Type any query below and watch the 10 sentences re-rank themselves live.

Notice: you don't need to use any words from the sentences themselves — the query matches on meaning. This is what makes embeddings so powerful.
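The retrieve step behind this demo (and behind RAG generally) is just "embed everything once, then rank by similarity". A sketch with made-up 3-dimensional vectors standing in for real embeddings:

```python
import math

# Pre-embedded corpus: in a real system these vectors come from the model.
# The numbers here are invented so the ranking is easy to follow.
corpus = {
    "my bike has a puncture, help":  [0.70, 0.69, 0.10],
    "deflated wheel repair guide":   [0.72, 0.65, 0.20],
    "best pasta recipes for dinner": [0.05, 0.10, 0.99],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def top_k(query_vec, k=2):
    """Rank the pre-embedded corpus by similarity to the query vector."""
    ranked = sorted(corpus, key=lambda s: cosine(query_vec, corpus[s]), reverse=True)
    return ranked[:k]

# Pretend this vector came from embedding "how to fix a flat tire".
query_vec = [0.71, 0.70, 0.12]
print(top_k(query_vec))  # the two tire-related sentences win
```

At real scale the linear scan over the corpus is replaced by an approximate-nearest-neighbor index, but the logic is the same.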

07 · An important gotcha

What embeddings don't capture.

Cosine similarity tells you two sentences are about the same topic — not that they agree. Many opposites, negations, and reversals score surprisingly high. Click any pair below and see for yourself.

Why? A sentence embedding pools every token's vector. In a pair like "I am a ___ person", the two sentences share the overwhelming majority of their tokens, and words like dog / cat or love / hate are themselves close in embedding space because they appear in similar contexts during training. The model sees "a statement on the same topic", not "two opposing opinions".

The practical rule: for sentiment, stance, or negation, don't use cosine similarity. Reach for a classifier, an NLI model, or an LLM.

08 · Where they shine

Embeddings show up everywhere.

Once you have a way to turn anything into a vector, you unlock a whole toolbox of vector-space operations: searching, clustering, classifying, ranking…

Semantic search

Find documents that mean the same as the query, not just share words.

RAG (retrieval-augmented generation)

Give an LLM just the relevant chunks from a big knowledge base by embedding both and retrieving top-k.

Classification

Compare an item's vector to label prototypes — zero training required.

Clustering / topic discovery

Run k-means on vectors to automatically group similar items.

Recommendations

Embed users and items in the same space; recommend the nearest items.

Deduplication

Spot near-duplicate content even when wording, order, or language differs.

Anomaly detection

Items far from any cluster are outliers — useful for spam, fraud, QA.

Cross-modal search

Models like CLIP embed images and text into the same space — search images with words.
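The "Classification" entry above deserves a sketch, since it's the least obvious: compare each item's vector to one prototype vector per label and pick the closest. The 3-dimensional prototype vectors below are made up; in practice you'd embed a short description of each label with the same model you use for the items:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# One made-up prototype vector per label (illustrative numbers only).
prototypes = {
    "animals":  [0.9, 0.1, 0.1],
    "vehicles": [0.1, 0.9, 0.1],
    "fruits":   [0.1, 0.1, 0.9],
}

def classify(item_vec):
    """Zero-shot classification: pick the label whose prototype is most similar."""
    return max(prototypes, key=lambda lbl: cosine(item_vec, prototypes[lbl]))

print(classify([0.8, 0.2, 0.1]))   # → "animals"
print(classify([0.0, 0.1, 0.95]))  # → "fruits"
```

No training step is involved: adding a new label is just adding one more prototype vector.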

09 · Takeaways

The mental model, in one breath.

An embedding model is a function that maps inputs into a geometric space where distance means difference in meaning. Once you have that, almost anything "fuzzy" becomes a math problem: compare, rank, group, recommend, retrieve.

A few things to remember

  • Embeddings from different models are not comparable — always use the same model for a given comparison.
  • More dimensions ≠ always better. 384 dimensions (this model) is a common size; larger embeddings such as OpenAI's 1536-dimensional ones can capture more nuance but cost more to store and search.
  • Always normalize your vectors (unit length) when using cosine similarity — many libraries do this automatically.
  • For speed at scale, store vectors in a vector database (Pinecone, Weaviate, pgvector, Qdrant) with approximate-nearest-neighbor indexes.
  • The same math applies to images, audio, code, user behavior, graph nodes… anything you can train an encoder for.
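The normalization point in the list above is worth seeing concretely: once vectors are scaled to unit length, the plain dot product *is* cosine similarity, which is why libraries normalize and then skip the division. A sketch with simple numbers:

```python
import math

def normalize(v):
    """Scale v to unit length so that dot product equals cosine similarity."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

a, b = [3.0, 4.0], [4.0, 3.0]       # ‖a‖ = ‖b‖ = 5
ua, ub = normalize(a), normalize(b)

dot_normalized = sum(x * y for x, y in zip(ua, ub))
cosine = sum(x * y for x, y in zip(a, b)) / (5.0 * 5.0)

print(dot_normalized, cosine)  # both 0.96
```

A plain dot product is cheaper than recomputing norms on every comparison, which matters when ranking a query against millions of stored vectors.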