Agents
Chat API
Agents speak the OpenAI Chat Completions protocol. Any client that already talks to OpenAI can talk to a Kraterion agent, and it gets retrieval stats and citations back in an extra field.
Endpoint
POST https://api.kraterion.com/v1/agents/<agent_id>/chat/completions
Authorization: Bearer kr_live_...
Authenticate with a bearer token or a share token. The agent's model, tools, and knowledge come from its saved configuration.
Request
A standard Chat Completions body. The system prompt belongs to the agent, so a system message in the request is rejected, and the last message must be from the user. Two Kraterion-specific flags control the extension payload.
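The two message rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any Kraterion SDK: the helper name build_chat_body and its error messages are hypothetical, and only the rules themselves (no system message, last message from the user) and the two flag names come from this page.

```python
from typing import Any


def build_chat_body(
    messages: list[dict[str, Any]],
    include_retrieval_info: bool = True,
    include_citations: bool = True,
    **options: Any,
) -> dict[str, Any]:
    """Build a request body, enforcing the two message rules client-side.

    Hypothetical helper: the server remains the authority on what it accepts.
    """
    if not messages:
        raise ValueError("messages must not be empty")
    if any(m.get("role") == "system" for m in messages):
        # The agent owns the system prompt, so the API rejects these.
        raise ValueError("system messages are not allowed")
    if messages[-1].get("role") != "user":
        raise ValueError("the last message must come from the user")
    return {
        "messages": messages,
        "include_retrieval_info": include_retrieval_info,
        "include_citations": include_citations,
        **options,  # e.g. temperature, max_tokens, stream
    }
```

Failing early on the client only saves a round trip; it does not replace the server's own validation. A complete request body looks like this: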
{
  "messages": [
    { "role": "user", "content": "What's our refund window?" }
  ],
  "model": "gpt-4o-mini",
  "temperature": 0.2,
  "max_tokens": 512,
  "stream": false,
  "include_retrieval_info": true,
  "include_citations": true
}
Response
The familiar Chat Completions object, plus a kraterion block with retrieval stats, citations, and any tool calls.
{
  "id": "chatcmpl_kr_...",
  "object": "chat.completion",
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Refunds are accepted within 30 days..." },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 812, "completion_tokens": 48, "total_tokens": 860 },
  "kraterion": {
    "agent_id": "...",
    "retrieval": {
      "bucket_ids": ["..."],
      "hit_count": 3,
      "retrieval_latency_ms": 41,
      "llm_latency_ms": 520,
      "total_latency_ms": 564
    },
    "citations": [
      {
        "index": 0,
        "chunk_hash": "sha256-...",
        "s3_key": "handbook.md",
        "ordinal": 7,
        "bucket_id": "...",
        "source_walrus_blob_id": "...",
        "cited": true
      }
    ]
  }
}
Streaming
Set stream: true for Server-Sent Events. You get standard chat.completion.chunk frames for the text, interleaved kraterion.tool_call frames as tools run, a final kraterion.extension frame with the retrieval and citation summary, and then data: [DONE].
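A stream like the one described above can be folded back into a single result. The sketch below makes two assumptions that this page does not confirm: every frame arrives on a `data:` line carrying a JSON object, and the Kraterion frames identify themselves through an "object" field in the same way chat.completion.chunk frames do. If the frames are distinguished some other way (for example by an SSE `event:` name), the dispatch would need to change. The function name read_stream is hypothetical.

```python
import json
from typing import Any, Iterable, Optional


def read_stream(lines: Iterable[str]) -> dict[str, Any]:
    """Fold SSE lines into answer text, tool-call frames, and the extension frame."""
    text: list[str] = []
    tool_calls: list[dict[str, Any]] = []
    extension: Optional[dict[str, Any]] = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        frame = json.loads(payload)
        kind = frame.get("object")  # assumed discriminator field
        if kind == "chat.completion.chunk":
            for choice in frame.get("choices", []):
                text.append(choice.get("delta", {}).get("content") or "")
        elif kind == "kraterion.tool_call":
            tool_calls.append(frame)
        elif kind == "kraterion.extension":
            extension = frame
    return {"text": "".join(text), "tool_calls": tool_calls, "extension": extension}
```

The function accepts any iterable of decoded lines, so it can be fed from an HTTP client's line iterator or from a recorded stream in a test.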
Citations extension
Each citation points at the exact chunk the answer drew on: its chunk_hash, the source object (s3_key) and ordinal, and the Walrus blob it came from (source_walrus_blob_id). Because the hash is content-addressed, a reader can verify that a quote was really in your data and hasn't been altered. The Search & citations page covers what each field means.
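To illustrate the verification idea, here is a minimal sketch that checks a piece of chunk text against the chunk_hash field shown in the response example. The encoding is an assumption: it treats the value as "sha256-" followed by the hex SHA-256 digest of the chunk's UTF-8 bytes. This page does not specify the digest encoding or any text normalization, so check the Search & citations page before relying on it. The function name chunk_matches is hypothetical.

```python
import hashlib


def chunk_matches(chunk_text: str, chunk_hash: str) -> bool:
    """Return True if chunk_text hashes to the digest carried in chunk_hash.

    Assumes the form 'sha256-<hex digest of the UTF-8 bytes>'.
    """
    algo, _, expected = chunk_hash.partition("-")
    if algo != "sha256" or not expected:
        return False  # unknown algorithm or missing digest
    actual = hashlib.sha256(chunk_text.encode("utf-8")).hexdigest()
    return actual == expected.lower()
```

Any change to the chunk text, even a single character, produces a different digest, which is what lets a reader detect an altered quote.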
Using the OpenAI SDK
Point the base URL at the agent and pass your bearer token as the API key.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.kraterion.com/v1/agents/<agent_id>",
    api_key="kr_live_...",
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's our refund window?"}],
)

print(resp.choices[0].message.content)
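Once a response is available as a plain dictionary, the kraterion block can be read like any other field. The sketch below lists the sources of citations whose cited flag is true, using the field names from the response example above. The helper name cited_sources and the "key#ordinal" output format are hypothetical. With the OpenAI Python SDK, the two Kraterion flags can likely be sent through the SDK's extra_body argument, and resp.model_dump() should carry the extra block through as a dictionary, but treat both as assumptions to verify against your SDK version.

```python
from typing import Any


def cited_sources(response: dict[str, Any]) -> list[str]:
    """Return 's3_key#ordinal' for each citation whose 'cited' flag is true."""
    block = response.get("kraterion") or {}
    return [
        f"{c['s3_key']}#{c['ordinal']}"
        for c in block.get("citations", [])
        if c.get("cited")
    ]
```

The function returns an empty list when the block is missing, which is what a client would see if the extension flags were not set on the request.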