Interact with an agentic system that can autonomously invoke tools
The /v1/agents/heroku endpoint lets you interact with an agentic system powered by large language models (LLMs) that can autonomously invoke tools and execute actions on your behalf. The agent runs a loop in which it reasons, plans, and executes tasks using Heroku Tools and MCP tools.
Authorization header with your Heroku Inference API key:
INFERENCE_KEY config variable.
INFERENCE_MODEL_ID config var.
Example: "claude-4-5-sonnet", "claude-4-5-haiku"
system, user, assistant, tool
The /v1/agents/heroku endpoint supports two types of tools:
Tool Object Structure
heroku_tool, mcp
name (string): Name of tool (e.g., "code_exec_ruby", "pg_psql")
description (string, optional): Hint text to inform the model when to use this tool
runtime_params (object): Configuration to control automatic execution
Runtime Parameters:
"standard-1x")
120, default: 120)
3)
1.0
Controls randomness of the response. Range: 0.0 to 1.0
0 make responses more focused and deterministic
1.0 encourage more creative and diverse responses
0.999
Nucleus sampling threshold. Range: 0 to 1.0. Specifies the cumulative probability of tokens to consider.
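To make the "cumulative probability" wording concrete, here is a minimal sketch of nucleus (top-p) filtering over a toy token distribution. It illustrates the general technique only; the function name `nucleus_filter` and the toy probabilities are hypothetical, and nothing here is taken from how the service implements sampling.

```python
def nucleus_filter(probs: dict, top_p: float) -> dict:
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize the kept probabilities to sum to 1."""
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be between 0 and 1.0")
    # Rank tokens from most to least likely.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break  # the nucleus is complete
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


if __name__ == "__main__":
    toy = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
    # With top_p=0.7 only "a" and "b" survive; "c" and "d" are dropped.
    print(nucleus_filter(toy, 0.7))
```

A lower threshold shrinks the candidate set toward the single most likely token, while a threshold of 1.0 keeps every token.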
/v1/agents/heroku can include multiple underlying inference requests.
4096 for Haiku models
8192 for Sonnet models
event: message includes a JSON payload representing a completion. The final event is event: done with data [DONE].
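The event framing described above can be consumed with a small line-based parser. The following is a sketch under stated assumptions: events follow standard Server-Sent Events framing (`event:` and `data:` fields, blank line between events), and the helper name `iter_completions` is hypothetical. It takes any iterable of text lines, so it can be fed from whichever HTTP client you use.

```python
import json
from typing import Iterable, Iterator


def iter_completions(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each `event: message`; stop at `event: done`."""
    event_name = None
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            # SSE allows one optional space after the colon.
            data_lines.append(line[len("data:"):].removeprefix(" "))
        elif line == "":
            # A blank line terminates the current event.
            if event_name == "done":
                return  # data is the literal [DONE], not JSON
            if event_name == "message" and data_lines:
                yield json.loads("\n".join(data_lines))
            event_name, data_lines = None, []
    # Flush a trailing message if the stream ended without a blank line.
    if event_name == "message" and data_lines:
        yield json.loads("\n".join(data_lines))


if __name__ == "__main__":
    sample = [
        "event: message",
        'data: {"object": "chat.completion"}',
        "",
        "event: done",
        "data: [DONE]",
        "",
    ]
    for completion in iter_completions(sample):
        print(completion["object"])
```

Treating `done` as a hard stop means the `[DONE]` sentinel is never passed to the JSON decoder.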
chat.completion or tool.completion object.
id (string): Unique ID for agent session
object (enum): Type of completion. Options: chat.completion, tool.completion
created (integer): Unix timestamp when chunk was created
model (string): Model ID used to generate the message
choices (array): Array of length 1 containing a single choice object
usage (object): Token usage statistics (empty for tool completions)
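The fields listed above map naturally onto a small typed record. This sketch validates a decoded payload against the documented constraints (the two `object` values and a `choices` array of length 1); the `Completion` class and `parse_completion` helper are hypothetical names, not part of any SDK.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

VALID_OBJECTS = {"chat.completion", "tool.completion"}


@dataclass
class Completion:
    id: str            # agent session ID
    object: str        # "chat.completion" or "tool.completion"
    created: datetime  # converted from the Unix timestamp
    model: str
    choice: dict       # the single entry of `choices`
    usage: dict        # empty for tool completions


def parse_completion(payload: dict) -> Completion:
    """Validate a decoded completion payload and convert it to a Completion."""
    if payload["object"] not in VALID_OBJECTS:
        raise ValueError(f"unexpected object type: {payload['object']!r}")
    choices = payload["choices"]
    if len(choices) != 1:
        raise ValueError("expected exactly one choice")
    return Completion(
        id=payload["id"],
        object=payload["object"],
        created=datetime.fromtimestamp(payload["created"], tz=timezone.utc),
        model=payload["model"],
        choice=choices[0],
        usage=payload.get("usage") or {},
    )
```

Branching on `object` afterwards lets a caller separate the model's own replies from the results of automatically executed tools.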
Choice Object
0
message (object): Message content with role assistant or tool
  role (string): Message role
  content (string): Text content
  tool_calls (array, optional): Tool calls requested by the model
  tool_call_id (string, for tool messages): ID of the tool call being responded to
stop, length, tool_calls, ""
Usage Object
User Message
"user"
content (string): Contents of user message
Example:
Assistant Message
"assistant"
content (string): Contents of assistant message
tool_calls (array, optional): Array of tool call request objects
Example:
System Message
"system"
content (string): System prompt to guide the model’s behavior
Example:
Tool Message
"tool"
content (string): Output of tool call
tool_call_id (string): ID of the tool call this message is responding to
Example:
Bearer token using your INFERENCE_KEY
Successful response (Server-Sent Events)
The response is of type string.
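The request side can be assembled from the pieces this reference describes: a Bearer token taken from INFERENCE_KEY, a model ID taken from INFERENCE_MODEL_ID, and message objects with the four documented roles. The sketch below only builds the headers and body and sends nothing. The top-level body keys `model` and `messages`, the `Content-Type` header, and all helper names are assumptions for illustration, since this section does not show them.

```python
import json
import os
from typing import Optional

AGENTS_PATH = "/v1/agents/heroku"
ROLES = {"system", "user", "assistant", "tool"}


def auth_headers(inference_key: str) -> dict:
    """Bearer-token Authorization header built from the INFERENCE_KEY value."""
    return {
        "Authorization": f"Bearer {inference_key}",
        "Content-Type": "application/json",  # assumed, not stated in the source
    }


def make_message(role: str, content: str, tool_call_id: Optional[str] = None) -> dict:
    """Build one message object; tool messages must carry a tool_call_id."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role!r}")
    msg = {"role": role, "content": content}
    if role == "tool":
        if tool_call_id is None:
            raise ValueError("tool messages require tool_call_id")
        msg["tool_call_id"] = tool_call_id
    return msg


def build_body(model_id: str, messages: list) -> dict:
    """Request body; the `model` and `messages` key names are assumptions."""
    return {"model": model_id, "messages": messages}


if __name__ == "__main__":
    key = os.environ.get("INFERENCE_KEY", "<your-key>")
    model = os.environ.get("INFERENCE_MODEL_ID", "claude-4-5-sonnet")
    body = build_body(model, [
        make_message("system", "You are a helpful assistant."),
        make_message("user", "How many rows are in the users table?"),
    ])
    print(AGENTS_PATH, auth_headers(key)["Authorization"][:10])
    print(json.dumps(body, indent=2))
```

Reading both values from config vars keeps the key and model ID out of source code.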