Vector Tables 101: Understanding Vector and PGVector
You hear about vectors and pgvector. They sound complex. They are. You do not need to be an expert to use them.
A vector is a list of numbers. Example: [1, 2, 3, 4, 5].
Think of it as a point in space. Two numbers make a 2D point. Three numbers make a 3D point. Hundreds of numbers make a high-dimensional space.
Normal text search looks for exact words. Vector search looks for meaning.
Search for "Postgres database setup tutorial." The database finds "How to configure PostgreSQL." Words differ. Meaning is the same.
AI models turn text, images, or audio into vectors. These are embeddings. The AI puts similar ideas close together. Cat and Dog stay near each other. PostgreSQL and Kubernetes stay in another area.
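Here is a minimal sketch of "close together" in Python. The 3-number vectors and the name `toy_embeddings` are invented for illustration. Real embeddings come from a model and usually have hundreds of dimensions.

```python
import math

# Hypothetical, hand-made "embeddings". A real model would produce these.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "PostgreSQL": [0.1, 0.2, 0.9],
    "Kubernetes": [0.2, 0.1, 0.8],
}


def distance(a, b):
    """Straight-line (Euclidean) distance between two points."""
    return math.dist(a, b)


cat_to_dog = distance(toy_embeddings["cat"], toy_embeddings["dog"])
cat_to_postgres = distance(toy_embeddings["cat"], toy_embeddings["PostgreSQL"])

print(f"cat -> dog:        {cat_to_dog:.3f}")
print(f"cat -> PostgreSQL: {cat_to_postgres:.3f}")
```

With these made-up numbers, cat lands much closer to dog than to PostgreSQL.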
You find the closest vector with these methods:
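- Euclidean Distance: Measures straight-line distance. Smaller means closer.
- Inner Product: Measures alignment, scaled by vector length. Larger means closer.
- Cosine Similarity: Measures the angle between vectors and ignores their length. Larger means closer.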
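The three measures fit in a few lines of plain Python. This is a sketch for intuition, not how a database computes them. The function names and sample vectors are my own.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Sum of element-wise products. Grows with alignment and with length."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Inner product divided by both lengths, so only the angle matters."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]  # same direction as a, twice as long

print(euclidean_distance(a, b))
print(inner_product(a, b))
print(cosine_similarity(a, b))
```

The two vectors point the same way, so cosine similarity is 1 (up to floating-point rounding). Euclidean distance still sees a gap because the lengths differ.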
Dimensionality is the number of values in the list. More dimensions mean more detail, and more storage and compute per comparison.
Normalization scales every vector to length 1. Cosine similarity then reduces to the inner product, which is cheaper to compute.
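A small Python check of why normalization helps. The helper names are illustrative. The sketch assumes no vector is all zeros, since a zero vector has no direction and would divide by zero.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale v to length 1. Assumes v is not the zero vector."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


def cosine_similarity(a, b):
    """Cosine similarity computed the long way, with both lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]
b = [4.0, 3.0]
unit_a, unit_b = normalize(a), normalize(b)

# After normalization, a plain inner product gives the cosine similarity.
print(cosine_similarity(a, b))
print(dot(unit_a, unit_b))
```

Both lines print the same value, up to rounding. Normalize once when you store the vector, and every later comparison skips the two square roots.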
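PostgreSQL has an extension called pgvector. It lets you store vectors in your tables. You use SQL to find the nearest vectors. The <-> operator returns Euclidean distance. The <=> operator returns cosine distance. The <#> operator returns the negative inner product.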
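To show what a nearest-vector query does, here is an in-memory stand-in in Python. The `documents` table, its columns, and the sample rows are hypothetical. The SQL string is only a rough picture of a pgvector query and is never executed here.

```python
import math

# Hypothetical table and column names, shown for orientation only.
# With pgvector installed, a comparable query would look roughly like this:
EXAMPLE_SQL = """
SELECT id, title
FROM documents
ORDER BY embedding <-> '[0.1, 0.2, 0.9]'
LIMIT 2;
"""

# In-memory stand-in for the table: (id, title, embedding).
# The embeddings are made up, not produced by a model.
documents = [
    (1, "How to configure PostgreSQL", [0.1, 0.2, 0.9]),
    (2, "Caring for your cat", [0.9, 0.8, 0.1]),
    (3, "Kubernetes basics", [0.2, 0.1, 0.8]),
]


def nearest(query, rows, limit):
    """Sort rows by Euclidean distance to the query and keep the closest."""
    ranked = sorted(rows, key=lambda row: math.dist(row[2], query))
    return ranked[:limit]


results = nearest([0.1, 0.2, 0.9], documents, limit=2)
for doc_id, title, _ in results:
    print(doc_id, title)
```

This brute-force version compares the query against every row. That is fine for three rows and painful for millions.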
Comparing millions of vectors one by one is slow. Use approximate indexes like IVFFlat or HNSW to speed it up. They trade a little accuracy for a lot of speed.
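The idea behind an IVFFlat-style index can be sketched in Python. This is a toy under stated assumptions, not pgvector's implementation. The centroids are hand-picked here, while a real index learns them from the data. All function names are my own. HNSW works differently (it walks a graph) and is not shown.

```python
import math


def nearest_index(point, candidates):
    """Index of the candidate closest to point."""
    return min(range(len(candidates)), key=lambda i: math.dist(point, candidates[i]))


def build_lists(vectors, centroids):
    """Assign every vector to the list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for v in vectors:
        lists[nearest_index(v, centroids)].append(v)
    return lists


def search(query, centroids, lists, probes=1):
    """Scan only the `probes` lists whose centroids are closest to the query.

    Raises ValueError if every probed list is empty.
    """
    order = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    candidates = [v for i in order[:probes] for v in lists[i]]
    return min(candidates, key=lambda v: math.dist(query, v))


vectors = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
centroids = [[0.85, 0.85], [0.15, 0.15]]  # hand-picked for the example

lists = build_lists(vectors, centroids)
best = search([0.1, 0.25], centroids, lists, probes=1)
print(best)
```

With `probes=1` the search reads two vectors instead of four. The catch: if the true nearest vector sits in a list that was not probed, the search misses it. Raising `probes` scans more lists and costs more time.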
Use pgvector for:
- Semantic search
- RAG (retrieval-augmented generation)
- Recommendations
- Smart FAQs
Embeddings are points in space. AI is often geometry.
Source: https://dev.to/rmarsigli/vector-tables-101-understanding-vector-and-pgvector-once-and-for-all-3g68