Semantic search is becoming easier to prototype thanks to lightweight embedding models such as nomic-embed-text, which tools like Ollama can serve locally. I wanted to see how well a purely local solution would perform, without relying on cloud services or heavyweight vector databases.
So I built a tiny Node.js script that:
- Embeds blog posts with Ollama using the nomic-embed-text model
- Stores the vectors in memory
- Ranks documents based on vector similarity
Here’s what I came up with:
🖥️ Platform Specs
- CPU: AMD Ryzen 9 7900X3D (12 cores, 24 threads)
- RAM: 64 GB
- Disk: NVMe SSD
- GPU: not used (Ollama running on CPU only)
- Environment: Node.js (no DB, no external vector store)
📚 Dataset
- Source: 650 blog posts from launix.de
- Average post size: ~4 KB
- Total text size: ~2.6 MB
⏱️ Embedding Performance
The script reads all posts and sends each one to Ollama for embedding. Total time:
- 167,148 ms for all 650 posts
- That’s 257 ms per post
- Or ~16 KB/s read+embed throughput
Each embedding was fetched via HTTP from a locally running Ollama instance, so this includes I/O, JSON parsing, and vector serialization.
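For reference, each of those HTTP round trips looks roughly like this. It's a minimal sketch assuming Ollama's default /api/embeddings endpoint on port 11434; the embed helper is my own naming, not part of the original script:

async function embed(text) {
  // POST the raw post text to the locally running Ollama instance
  const res = await fetch("http://localhost:11434/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "nomic-embed-text", prompt: text }),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const { embedding } = await res.json(); // array of floats
  return embedding;
}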
🔍 Search Performance (In-Memory)
To test semantic search:
- A query is embedded using Ollama.
- The resulting vector is compared against all post vectors in memory using cosine similarity.
- Top 5 posts are returned.
Performance:
21–51 ms per query — even with no indexing.
For small datasets, this is more than fast enough for real-time use, especially in our ERP/CRM/DMS products, where only a few users access the system and need to find matching documents quickly.
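For context, the brute-force ranking behind those numbers can be sketched like this (the shape of posts and the helper names are assumptions, not the exact script):

// cosine similarity of two equally long float arrays
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// compare the query vector against every post vector and keep the top 5
function topMatches(queryVector, posts, k = 5) {
  return posts
    .map(post => ({ post, score: cosineSimilarity(queryVector, post.vector) }))
    .sort((a, b) => b.score - a.score) // highest similarity first
    .slice(0, k);
}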
🧠 Replacing the Loop: SQL Vector Search with MemCP
While the Node.js prototype loops through each post in a for loop, that doesn’t scale well. Normally, we’d reach for a vector database like Pinecone, Qdrant, or the upcoming MySQL vector extension to make use of a vector index.
But MySQL’s vector support isn’t released yet — so I tried something new: MemCP, an in-memory SQL engine with vector support.
✅ The Goal
Replace this Node.js loop:
for (const post of posts) {
  const distance = cosineSimilarity(queryVector, post.vector);
  // store or rank the result
}
With a SQL query like this:
SELECT ID, post_title, url
FROM posts
ORDER BY VECTOR_DISTANCE(vector, STRING_TO_VECTOR(?)) ASC
LIMIT 5;
- ? is filled with the search embedding vector
- VECTOR_DISTANCE() is a new function that computes the distance between two vectors
- STRING_TO_VECTOR(?) parses the search vector on the fly from a JSON string into its internal format
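From Node.js, running that query could look like the sketch below. This assumes MemCP is reachable over its MySQL-compatible protocol so the standard mysql2 driver can connect; host, port, and credentials are placeholders, not a tested setup:

import mysql from "mysql2/promise";

async function searchPosts(queryVector) {
  // placeholder connection details: adjust to your MemCP instance
  const conn = await mysql.createConnection({
    host: "localhost",
    port: 3307,
    user: "root",
    database: "blog",
  });
  const [rows] = await conn.execute(
    `SELECT ID, post_title, url
       FROM posts
      ORDER BY VECTOR_DISTANCE(vector, STRING_TO_VECTOR(?)) ASC
      LIMIT 5`,
    [JSON.stringify(queryVector)] // the query embedding, passed as a JSON string
  );
  await conn.end();
  return rows;
}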
🧪 Why MemCP?
- It’s blazing fast (entirely in RAM)
- No external dependencies
- SQL syntax makes integration clean
- If I want, I can implement my whole REST microservice inside the database, with no need for a separate application server
🔧 Integration Plan
- Export embedded vectors from Node.js to CSV or JSON (see the sketch after this list)
- Load into MemCP using
memcp import posts.csv
- Replace the JS loop with a parameterized SQL query
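The export step might look like this. The column layout is an assumption and has to match the posts table defined in MemCP:

import { writeFileSync } from "node:fs";

// RFC-4180-style quoting: wrap in double quotes, double any inner quotes
const csvEscape = (value) => `"${String(value).replace(/"/g, '""')}"`;

function exportToCsv(posts, path = "posts.csv") {
  const header = "ID,post_title,url,vector";
  const lines = posts.map(p =>
    [p.id, csvEscape(p.title), csvEscape(p.url), csvEscape(JSON.stringify(p.vector))].join(",")
  );
  writeFileSync(path, [header, ...lines].join("\n"));
}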
🏁 Summary
| Task | Time / Speed |
|---|---|
| Embedding 650 posts | 167,148 ms total |
| Avg. embedding time/post | 257 ms |
| Read+Embed speed | ~16 KB/s |
| In-memory vector search | 21–51 ms per query |
| SQL vector search (MemCP) | In progress, but promising |
📌 Takeaways
- Ollama + Node.js gives you working semantic search in a day.
- With a CPU like the Ryzen 9 7900X3D, even brute-force search is fast.
- MemCP offers a SQL-native path for scaling vector search without changing your app logic.
If you’re building internal tools, dashboards, or blog search, this is a surprisingly effective stack—with zero cloud dependencies.