Semantic search is becoming easier to prototype thanks to lightweight embedding models such as nomic-embed-text, which tools like Ollama can serve locally. I wanted to see how well a purely local solution would perform, without relying on cloud services or heavyweight vector databases.
So I built a tiny Node.js script that:
- Embeds blog posts with Ollama using the nomic-embed-text model
- Stores the vectors in memory
- Ranks documents based on vector similarity
Here’s what I came up with:
🖥️ Platform Specs
- CPU: AMD Ryzen 9 7900X3D (12 cores, 24 threads)
- RAM: 64 GB
- Disk: NVMe SSD
- GPU: not used (Ollama running on CPU only)
- Environment: Node.js (no DB, no external vector store)
📚 Dataset
- Source: 650 blog posts from launix.de
- Average post size: ~4 KB
- Total text size: ~2.6 MB
⏱️ Embedding Performance
The script reads all posts and sends each one to Ollama for embedding. Total time:
- 167,148 ms for all 650 posts
- That’s 257 ms per post
- Or ~16 KB/s read+embed throughput
Each embedding was fetched via HTTP from a locally running Ollama instance, so this includes I/O, JSON parsing, and vector serialization.
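For reference, each of those HTTP round trips looks roughly like this. It's a minimal sketch assuming Ollama's default /api/embeddings endpoint on port 11434; the embed helper is my own naming, not part of the original script:

async function embed(text) {
  // POST the raw post text to the locally running Ollama instance
  const res = await fetch("http://localhost:11434/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "nomic-embed-text", prompt: text }),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const { embedding } = await res.json(); // array of floats
  return embedding;
}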
🔍 Search Performance (In-Memory)
To test semantic search:
- A query is embedded using Ollama.
- The resulting vector is compared against all post vectors in memory using cosine similarity.
- Top 5 posts are returned.
Performance:
21–51 ms per query — even with no indexing.
For small datasets, this is more than fast enough for real-time use, especially in our ERP/CRM/DMS products, where only a few users access the system and need to find matching documents quickly.
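For context, the brute-force ranking behind those numbers can be sketched like this (the shape of posts and the helper names are assumptions, not the exact script):

// cosine similarity of two equally long float arrays
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// compare the query vector against every post vector and keep the top 5
function topMatches(queryVector, posts, k = 5) {
  return posts
    .map(post => ({ post, score: cosineSimilarity(queryVector, post.vector) }))
    .sort((a, b) => b.score - a.score) // highest similarity first
    .slice(0, k);
}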
🧠 Replacing the Loop: SQL Vector Search with MemCP
While the Node.js prototype loops through each post in a for loop, that doesn’t scale well. Normally, we’d reach for a vector database like Pinecone, Qdrant, or the upcoming MySQL vector extension to make use of a vector index.
But MySQL’s vector support isn’t released yet — so I tried something new: MemCP, an in-memory SQL engine with vector support.
✅ The Goal
Replace this Node.js loop:
for (const post of posts) {
  const distance = cosineSimilarity(queryVector, post.vector);
  // store or rank the result
}
With a SQL query like this:
SELECT ID, post_title, url
FROM posts
ORDER BY VECTOR_DISTANCE(vector, STRING_TO_VECTOR(?)) ASC
LIMIT 5;
- ? is filled with the search embedding vector
- VECTOR_DISTANCE() is a new function that computes the distance between two vectors
- STRING_TO_VECTOR(?) parses the search vector on the fly from a JSON string into its internal format
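From Node.js, running that query could look like the sketch below. This assumes MemCP is reachable over its MySQL-compatible protocol so the standard mysql2 driver can connect; host, port, and credentials are placeholders, not a tested setup:

import mysql from "mysql2/promise";

async function searchPosts(queryVector) {
  // placeholder connection details: adjust to your MemCP instance
  const conn = await mysql.createConnection({
    host: "localhost",
    port: 3307,
    user: "root",
    database: "blog",
  });
  const [rows] = await conn.execute(
    `SELECT ID, post_title, url
       FROM posts
      ORDER BY VECTOR_DISTANCE(vector, STRING_TO_VECTOR(?)) ASC
      LIMIT 5`,
    [JSON.stringify(queryVector)] // the query embedding, passed as a JSON string
  );
  await conn.end();
  return rows;
}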
🧪 Why MemCP?
- It’s blazing fast (entirely in RAM)
- No external dependencies
- SQL syntax makes integration clean
- If I want, I can implement my whole REST microservice inside the database, with no need for a separate application server
🔧 Integration Plan
- Export embedded vectors from Node.js to CSV or JSON (see the sketch after this list)
- Load into MemCP using
memcp import posts.csv
- Replace the JS loop with a parameterized SQL query
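The export step might look like this. The column layout is an assumption and has to match the posts table defined in MemCP:

import { writeFileSync } from "node:fs";

// RFC-4180-style quoting: wrap in double quotes, double any inner quotes
const csvEscape = (value) => `"${String(value).replace(/"/g, '""')}"`;

function exportToCsv(posts, path = "posts.csv") {
  const header = "ID,post_title,url,vector";
  const lines = posts.map(p =>
    [p.id, csvEscape(p.title), csvEscape(p.url), csvEscape(JSON.stringify(p.vector))].join(",")
  );
  writeFileSync(path, [header, ...lines].join("\n"));
}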
🏁 Summary
| Task | Time / Speed |
|---|---|
| Embedding 650 posts | 167,148 ms total |
| Avg. embedding time/post | 257 ms |
| Read+Embed speed | ~16 KB/s |
| In-memory vector search | 21–51 ms per query |
| SQL vector search (MemCP) | In progress, but promising |
📌 Takeaways
- Ollama + Node.js gives you working semantic search in a day.
- With a CPU like the Ryzen 9 7900X3D, even brute-force search is fast.
- MemCP offers a SQL-native path for scaling vector search without changing your app logic.
If you’re building internal tools, dashboards, or blog search, this is a surprisingly effective stack—with zero cloud dependencies.