pgvector + HNSW setup

You need a store for embeddings. Dedicated vector DBs (Pinecone, Qdrant, Weaviate) are great — but if you already run PostgreSQL, pgvector is the simplest.

1. Install

docker run -d --name pgvector \
  -e POSTGRES_PASSWORD=postgres -p 5432:5432 \
  pgvector/pgvector:pg16

Or on an existing PG:

CREATE EXTENSION IF NOT EXISTS vector;

2. Schema

CREATE TABLE IF NOT EXISTS document_chunks (
  id          BIGSERIAL PRIMARY KEY,
  document_id BIGINT NOT NULL,
  chunk_index INT NOT NULL,
  content     TEXT NOT NULL,
  embedding   vector(768),
  created_at  TIMESTAMPTZ DEFAULT now()
);

Change the dim and you must rebuild the index.

3. HNSW vs IVFFlat

Index	Build	Query	Accuracy	Good for
HNSW	slow	fast	very high	read-heavy RAG · search
IVFFlat	fast	ok	needs tuning	bulk load + periodic rebuild

HNSW by default for RAG.

4. Create index

CREATE INDEX ON document_chunks
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

Build after the full load is faster.

5. Query

SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity
FROM document_chunks
ORDER BY embedding <=> $1::vector
LIMIT 10;

<=> cosine distance, <-> L2, <#> negative dot

Must sort ASC by distance for HNSW to engage.

6. Runtime tuning

SET hnsw.ef_search = 100;  -- default 40

Higher = more accurate and slower. Adjust per query requirements.

7. Gotchas

Missing CREATE EXTENSION vector;
Sorting by similarity DESC — HNSW idle; use distance ASC
Dimension mismatch after model change
No index — EXPLAIN should show Index Scan

8. Backup notes

pg_dump stores vectors as text. On restore, rebuild HNSW indexes (parallel with pg_restore --jobs).

Closing

pgvector delivers 80% of a vector DB with 20% of the effort. Up to ~100M vectors in a single PostgreSQL is usually fine.

04-rag-pipeline