Make your first API request in minutes. Heroku AI provides managed inference with Claude, Nova, and other models through familiar OpenAI-compatible endpoints.
from openai import OpenAI

# Point the OpenAI SDK at Heroku's inference endpoint; replace the
# placeholder with your inference key.
client = OpenAI(
    base_url="https://us.inference.heroku.com/v1",
    api_key="your-inference-key"
)

response = client.chat.completions.create(
    model="claude-4-5-sonnet",
    messages=[
        {"role": "user", "content": "Hello, Heroku AI!"}
    ]
)

print(response.choices[0].message.content)

Models

Choose from chat, embedding, reranking, and image generation models.

Claude 4.5 Sonnet

Anthropic’s balanced model for complex reasoning and coding tasks.

Claude 4.5 Haiku

Fast, cost-effective model for high-volume applications.

Amazon Nova

AWS models optimized for enterprise workloads.

Start building

Quickstart

Provision a model and make your first API call in minutes.

API Reference

Explore all available endpoints: chat, embeddings, images, and agents.

AI Studio

Test prompts interactively before integrating into your app.

Tool calling

Connect external tools and databases to your AI agents.

MCP servers

Deploy Model Context Protocol servers on Heroku.

Migrate from OpenAI

Switch from OpenAI with minimal code changes.

Integrations

Vercel AI SDK

React hooks for streaming chat UIs.

LlamaIndex

RAG pipelines and document loaders.

Pydantic AI

Type-safe agents with structured outputs.

Observability

Monitor with Phoenix, Weave, or Logfire.
