Building Your Own Vector Database¶

Many working versions of MySQL for WordPress do not support the vector field.

However, we can create our version, limited for large datasets but useful for up to 10,000 documents or rows.

The Magic Trick: Using LONGTEXT Column¶

Since MySQL doesn't have a vector field, we'll hack it by storing vectors as text in a LONGTEXT column!

┌─────┬───────────────┬──────────────────────────────────┐
│ id  │     text      │       vector (LONGTEXT)          │
├─────┼───────────────┼──────────────────────────────────┤
│  1  │  "pizza"      │  "[0.9,0.1,0.2]"                 │
│  2  │  "pasta"      │  "[0.8,0.2,0.3]"                 │
│  3  │  "car"        │  "[0.1,0.8,0.9]"                 │
└─────┴───────────────┴──────────────────────────────────┘

Step 1: Create the Database Table¶

CREATE TABLE vector_items (
    id INT AUTO_INCREMENT PRIMARY KEY,
    text VARCHAR(500) NOT NULL,
    embedding LONGTEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_text (text)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

Limitations & When to Upgrade¶

This Approach is Slow When:¶

You have 10,000+ vectors (loads all into PHP memory).
You need sub-second search times.
Vector dimensions are very large (1000+). This is true in our case as we use 1536 dimensions but we have a small dataset so it balances out.
As we move past 205, more applications of MySQL will have the version that supports the vector field along with indexing.

When to Use Real Vector DB:¶

Production apps with millions of vectors
Real-time search requirements
Use: Pinecone, Weaviate, Milvus, or pgvector (PostgreSQL extension)

his Approach Works Great For:¶

Small to medium datasets (< 10,000 items)
Learning and prototyping
Side projects and MVPs
When you already have MySQL and don't want extra infrastructure

Summary¶

You just learned how to:

✅ Store vectors as comma-separated text in LONGTEXT
✅ Calculate cosine similarity in pure PHP
✅ Build a working vector search engine
✅ Integrate with OpenAI embeddings
✅ Optimize with caching and filtering

The Magic:

MySQL stores the vectors as text, PHP does the math to find similar items.

Simple but effective!