Building Semantic Search with pgvector and OpenAI
Keyword search returned nothing for 31% of users. People searched for "funny cats", the matching titles used different words, and the results came back empty.
Semantic search fixes this. It looks at meaning.
The setup:
- SQLite for main data.
- Postgres with pgvector for search.
- OpenAI for embeddings.
- Cloudflare Workers for edge caching.
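As a rough sketch of the Postgres side of this setup, the snippet below shows one way the pgvector table and a cosine-similarity query could look. The table name `video_embeddings`, its columns, and the 1536-dimension vector size are assumptions for illustration, not details from the original post. The `%s` placeholders follow the psycopg parameter style.

```python
from typing import Sequence, Tuple

# Hypothetical schema: table name, column names, and the 1536 dimension
# are assumptions; match the dimension to your embedding model.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS video_embeddings (
    video_id     bigint PRIMARY KEY,
    content_hash text NOT NULL,
    embedding    vector(1536) NOT NULL
);
CREATE INDEX IF NOT EXISTS video_embeddings_hnsw
    ON video_embeddings USING hnsw (embedding vector_cosine_ops);
"""

# `<=>` is pgvector's cosine distance operator; 1 - distance = similarity.
SEARCH_SQL = """
SELECT video_id, 1 - (embedding <=> %s::vector) AS similarity
FROM video_embeddings
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format a list of floats as pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def search_params(query_embedding: Sequence[float], limit: int = 10) -> Tuple[str, str, int]:
    """Build the parameter tuple for SEARCH_SQL (the vector is used twice)."""
    literal = to_vector_literal(query_embedding)
    return (literal, literal, limit)
```

With a database driver such as psycopg, this would be run as `cursor.execute(SEARCH_SQL, search_params(query_embedding))`.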
Three ways to make it work:
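Use content hashes. Hash the text before embedding it and skip the API call when that hash has been seen before, so the same text is never embedded twice. You save 94% on costs.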
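A minimal sketch of the content-hash idea, assuming an in-memory dictionary as the store and an injected `embed_fn` that stands in for the real embeddings call. `EmbeddingCache` and `content_hash` are hypothetical names. In a real system the hash would more likely live in a database column next to the vector.

```python
import hashlib
from typing import Callable, Dict, List

Vector = List[float]


def content_hash(text: str) -> str:
    """SHA-256 of the text after collapsing runs of whitespace.

    Whitespace normalization is an assumption; it lets trivially
    reformatted text reuse the same embedding.
    """
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class EmbeddingCache:
    """Calls embed_fn only for text whose hash has not been seen before."""

    def __init__(self, embed_fn: Callable[[str], Vector]) -> None:
        self._embed_fn = embed_fn          # stand-in for the paid API call
        self._store: Dict[str, Vector] = {}
        self.api_calls = 0                 # counts actual embedding requests

    def get(self, text: str) -> Vector:
        key = content_hash(text)
        if key not in self._store:
            self.api_calls += 1
            self._store[key] = self._embed_fn(text)
        return self._store[key]
```

Re-running an indexing job over unchanged metadata then costs nothing, because every lookup hits the stored hash.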
Create a structured document. Combine the title, channel, and tags into a single text before embedding it. This gives the vector more context.
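One way the structured document could be assembled is sketched below. The `Video` type, the field labels, and the newline-separated layout are assumptions. The post only says that title, channel, and tags are combined.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Video:
    """Hypothetical shape of the metadata that gets embedded."""
    title: str
    channel: str
    tags: List[str] = field(default_factory=list)


def build_embedding_document(video: Video) -> str:
    """Join title, channel, and tags into one labeled text block."""
    parts = [
        f"Title: {video.title.strip()}",
        f"Channel: {video.channel.strip()}",
    ]
    # Drop empty tags so they do not add noise to the embedded text.
    tags = [tag.strip() for tag in video.tags if tag.strip()]
    if tags:
        parts.append("Tags: " + ", ".join(tags))
    return "\n".join(parts)
```

The returned string is what would be hashed and sent to the embeddings API in place of the bare title.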
Add an extra score for exact matches. Semantic search is fuzzy, so give a bonus to titles that contain the exact query words. This keeps precision high.
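The exact-match bonus can be sketched as a small re-ranking step over the candidates the vector search returns. The bonus value of 0.15 and the rule that every query word must appear in the title are assumptions chosen for illustration. The post does not state the actual weighting.

```python
import re
from typing import List, Set, Tuple

EXACT_MATCH_BONUS = 0.15  # assumed weight; tune against real queries


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def exact_match_bonus(query: str, title: str, bonus: float = EXACT_MATCH_BONUS) -> float:
    """Return the bonus when every query word appears in the title."""
    query_words = tokenize(query)
    if not query_words:
        return 0.0
    return bonus if query_words <= tokenize(title) else 0.0


def rerank(query: str, candidates: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Add the bonus to each (title, similarity) pair and sort best first."""
    scored = [
        (title, similarity + exact_match_bonus(query, title))
        for title, similarity in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

For the query "funny cats", a title such as "Funny cats compilation" with similarity 0.78 would move ahead of "Hilarious kitten fails" at 0.82, because only the first contains both query words.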
Privacy matters. Send only video metadata to the US API. Keep all personal data in the EU.
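A simple way to enforce this boundary is an allowlist applied just before the API call, so anything not explicitly permitted never leaves the EU side. The sketch below assumes the allowed fields are the title, channel, and tags. The record layout and field names are hypothetical.

```python
from typing import Any, Dict, Tuple

# Only these metadata fields may be sent to the external embeddings API.
# The field names are assumptions for illustration.
ALLOWED_FIELDS: Tuple[str, ...] = ("title", "channel", "tags")


def to_embedding_payload(record: Dict[str, Any]) -> Dict[str, Any]:
    """Copy only allowlisted fields; everything else stays local."""
    return {key: record[key] for key in ALLOWED_FIELDS if key in record}
```

An allowlist fails safe: a personal-data column added to the record later is excluded by default, where a blocklist would let it through until someone remembered to update it.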
The result:
- Zero-result searches fell from 31% to 4%.
- Edge cache hit rate is 70%.
- P50 latency is under 10ms.
Stop relying on keyword matching. Use vectors to find what your users mean.
Source: https://dev.to/ahmet_gedik778845/building-video-metadata-semantic-search-with-pgvector-and-openai-embeddings-34c