pgvector on Heroku Postgres

The pgvector extension for Heroku Postgres adds support for vector data types and similarity search operations. Use it to build semantic search, recommendation systems, and retrieval-augmented generation (RAG) applications.

Quick Start

1. Install pgvector

Connect to your database and enable the extension:

heroku pg:psql DATABASE_URL -a my-app

CREATE EXTENSION vector;

2. Create a Table with Vectors

CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  content TEXT,
  embedding VECTOR(1536)  -- Dimension matches your embedding model
);

3. Insert Embeddings

INSERT INTO documents (content, embedding) VALUES
  ('AI helps developers build applications', '[0.1, 0.2, 0.3, ...]'),
  ('Postgres is a powerful database', '[0.4, 0.5, 0.6, ...]');

4. Search by Similarity

-- Find 5 most similar documents using cosine distance
SELECT content, embedding <=> '[0.15, 0.25, 0.35, ...]' AS distance
FROM documents
ORDER BY distance
LIMIT 5;

5. Add an Index for Performance

-- HNSW index for fast similarity search
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

Build the index after loading your data for better recall. See the Performance section for details.

Use Cases

With pgvector, you can:

Perform retrieval augmented generation (RAG): You can populate the database with vectorized embeddings that represent the semantic nature of the documents indexed, such as the latest product documentation for a specific domain. Given a query, RAG can retrieve the most relevant embeddings and the corresponding documents, which are then used to augment the context of the prompt for generative AI. The AI can then generate responses that are both accurate and contextually relevant.
Recommend products: With a vector database containing various attributes, searching for alternatives based on the search criteria is simple. For example, you can make recommendations based on similar products like dresses or shirts, or match the style and color to offer pants or shoes. You can further extend this with collaborative filtering where the similar preferences of other shoppers enhance the recommendations.
Search Salesforce data: Use Heroku Connect to synchronize Salesforce data into Heroku.Create a table with the embeddings since Heroku Connect can’t synchronize vector data types. For example, you can search for similar support cases with embeddings from Service Cloud cases.
Search multimedia: Search across multimedia content, like images, audio, and video. You can embed the content directly or work with transcriptions and other attributes to perform your search. For example, generating a music playlist by finding similar tracks based on embedded features like tempo, mood, genre, and lyrics.
Categorize and segment data: From industries such as healthcare to manufacturing, data segmentation and categorization are key to successful data analysis. For example, by converting patient records, diagnostic data, or genomic sequences into vectors, you can identify similar cases, aiding in rare disease diagnosis and personalized treatment recommendations.
Detect anomalies: Detect anomalies in your data by comparing vectors that don’t fit the regular pattern. This comparison is useful in analyzing and detecting problematic or suspicious patterns in areas such as network traffic data, industrial sensor data, transactions data, or online behavior.
Perform similarity searches: Perform simple vector similarity searches (VSS) based on the input query vector. Generally, query embeddings are used to search against the embeddings loaded into the database, but any vector data can work.

Complete RAG Example

Here’s a full example combining the Heroku Inference API with pgvector for retrieval-augmented generation:

import os
import psycopg2
from openai import OpenAI

# Initialize OpenAI client pointing to Heroku Inference API
client = OpenAI(
    base_url=os.getenv("EMBEDDING_URL"),
    api_key=os.getenv("EMBEDDING_KEY")
)

# Connect to Postgres
conn = psycopg2.connect(os.getenv("DATABASE_URL"))
cur = conn.cursor()

# 1. Generate embedding for user query
query = "How do I scale my Heroku app?"
response = client.embeddings.create(
    model="cohere-embed-multilingual",
    input=[query]
)
query_embedding = response.data[0].embedding

# 2. Search for similar documents
cur.execute("""
    SELECT content
    FROM documents
    ORDER BY embedding <=> %s::vector
    LIMIT 3
""", (query_embedding,))

relevant_docs = [row[0] for row in cur.fetchall()]

# 3. Use retrieved context with chat completion
chat_client = OpenAI(
    base_url=os.getenv("INFERENCE_URL"),
    api_key=os.getenv("INFERENCE_KEY")
)

context = "\n\n".join(relevant_docs)
response = chat_client.chat.completions.create(
    model=os.getenv("INFERENCE_MODEL_ID"),
    messages=[
        {"role": "system", "content": f"Answer based on this context:\n{context}"},
        {"role": "user", "content": query}
    ]
)

print(response.choices[0].message.content)

Prerequisites

You have an Essential-0, Essential-1, or Essential-2 database running PostgreSQL 14 or higher, or a Standard-tier and higher database running PostgreSQL 15 or higher.
You don’t have Heroku Streaming Data Connectors set up on that database. Vector data types don’t sync.

Provisioning

You can install pgvector by running CREATE EXTENSION vector; in a psql session on your database.

$ heroku pg:psql DATABASE_URL -a example-app
--> Connecting to postgresql-octagonal-12345
psql (13.2, server 11.12 (Ubuntu 11.12-1.pgdg16.04+1))
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.

example-app::DATABASE=> CREATE EXTENSION vector;
CREATE EXTENSION
example-app::DATABASE=>

To check what version is installed, run the following query from psql:

=> SELECT extversion FROM pg_extension WHERE extname = 'vector';
 extversion
------------
 0.5.0
(1 row)

Work with pgvector

You can create a table with a vector column:

CREATE TABLE animals(id serial PRIMARY KEY, name VARCHAR(100), embedding VECTOR(100));

In the example, we created an animals table with an embedding vector column. After creating a table, you can insert vectors:

INSERT INTO animals(name, embedding) VALUES ('llama', '[-0.15647223591804504,
…
-0.7506130933761597, 0.1427040845155716]');

In the example, we inserted llama in the embedding vector column. You can also add a vector column to an existing table:

ALTER TABLE fruit ADD COLUMN embedding VECTOR(100);

In the example, we added a vector column to an existing fruit table.

Perform Vector Queries

You can perform different operations on your vector data with these common query operators.

: Euclidean Distance

The Euclidean distance, or L2 distance, operator measures the straight-line distance between two points in a vector. This operator is best for searching for an alternative item or most similar item to the query. In the example, we use Euclidean distance search for animals similar to a shark in the animals table.

=> SELECT name FROM animals WHERE name != 'shark' ORDER BY embedding  (SELECT embedding FROM animals WHERE name = 'shark') LIMIT 5;
 name
-----------
 crocodile
 dolphin
 whale
 turtle
 alligator
(5 rows)

: Negative Inner Product

The negative inner product operator measures the orthogonal projection, which is whether the vectors point in the same or opposite direction with magnitude. The greater the inner product means a greater similarity between two vectors. This operator is best for searching for items of similar topics and similar in magnitude. For example, image identification where an image of a shark yields the highest inner product against the embedding of a shark.

: Cosine Distance

The cosine distance operator measures the cosine of the angle between two vectors. The value ranges from -1 to 1, with values closer to 1 representing greater similarity between the vectors. This operator is best for similarity searches where it’s better to omit the magnitude. For example, anomaly detection where the frequency doesn’t matter, or text-based search for whole paragraph and document semantics.

Performance

As you add more vector data to your database, there can be performance issues or slowness in performing queries. You can index vector data like other columns in Postgres, and pgvector provides a few ways to do so. Keep in mind:

Adding an index causes pgvector to switch to using approximate nearest neighbor search instead of exact nearest neighbor search, possibly causing a difference in query results.
Indexing functions are based on distance calculations. Create functions based on the calculation you plan to rely on the most in your application.

You can set the following settings to help optimize performance. max_parallel_workers_per_gather speeds up queries without creating an index.

SET max_parallel_workers_per_gather = 4;

max_parallel_maintenance_workers speeds up creating indexes on large tables by increasing the number of parallel workers.

SET max_parallel_maintenance_workers = 7;

To see the current settings, use the SHOW command:

=> SHOW max_parallel_workers_per_gather;
 max_parallel_workers_per_gather
---------------------------------
 1
(1 row)

See the pgvector documentation for guidance on creating indexes and optimizing performance on vector data types.

Indexes

pgvector supports two index types: HNSW (Hierarchical Navigable Small Worlds) and IVFFlat (Inverted File with Flat Compression index). Depending on the type of index you choose, there are common trade-offs including speed, recall or the quality of the query result, build times, and resource consumption. The default index type is having no index as pgvector is configured for exact nearest neighbor search. The default allows for perfect recall at the expense of speed. Adding an HNSW or IVFFlat index lets you perform approximate nearest neighbor (ANN) searches.

HNSW

HNSW is the preferred index type for ANN searches as it performs better than IVFFlat in most use cases. However, HNSW indexes take longer to build and take more memory. The algorithm builds multi-layer graphs that allow for fast search results. The index is built as you insert data into the table, so you can create these indexes without any data.

IVFFlat

You can use IVFFlat indexes for improving performance on ANN searches, including searches against high-dimensional embeddings. Since the IVFFlat relies on an existing vector, it’s best to build the index after the table has data in it for better recall. If the data distribution changes significantly, rebuild the index. IVFFlat indexes have faster build times and take less memory than HNSW, but it has worse query performance.

Embeddings API

Generate embeddings for your content

Heroku Tools

Query vectors with pg_vector_query tool

AI Integrations

Use with LlamaIndex and other frameworks

Blog Post

Full tutorial on similarity search

pgvector

​Quick Start

​1. Install pgvector

​2. Create a Table with Vectors

​3. Insert Embeddings

​4. Search by Similarity

​5. Add an Index for Performance

​Use Cases

​Complete RAG Example

​Prerequisites

​Provisioning

​Work with pgvector

​Perform Vector Queries

​: Euclidean Distance

​: Negative Inner Product

​: Cosine Distance

​Performance

​Indexes

​HNSW

​IVFFlat

​Related Resources

Embeddings API

Heroku Tools

AI Integrations

Blog Post

Quick Start

1. Install pgvector

2. Create a Table with Vectors

3. Insert Embeddings

4. Search by Similarity

5. Add an Index for Performance

Use Cases

Complete RAG Example

Prerequisites

Provisioning

Work with pgvector

Perform Vector Queries

: Euclidean Distance

: Negative Inner Product

: Cosine Distance

Performance

Indexes

HNSW

IVFFlat

Related Resources