Kraterion
Replay & audit

Replay any run.
Prove what happened.

Agents are non-deterministic — the same input rarely gives the same output twice. Kraterion records every run as a tamper-evident trail you can replay from a receipt, then trace backward through every input that shaped the answer.

Reproduce · Audit · Trace
Agent · support · kr_share_test_3f4d…
scoped · support-docs
USER

What is our refund policy?

TOOL CALLS · 3 of 6 available
  • recall · query: "user prefs" · 2 notes · 29 ms
  • search · query: "refund policy" · 4 hits · 62 ms
  • read · key: "pricing-faq.md" · 12 KB · 38 ms
ASSISTANT

Refunds are processed within 7 business days from the original payment method.

pricing-faq.md · §3 · score 0.92 · verified
run · 3f4d…ae · replayable
Total latency · 213 ms
Tools called · 3 of 6
Record · recorded
1 receipt anchors every run · Replay from it anytime
100% of steps captured · Retrieval, tools, memory
1:1 replay against the same inputs · Same retrieval, same data
yours · records you keep · Not vendor-held
01 What a run captures

The whole chain, not just the answer.

A run record is everything it took to produce an output. When something goes wrong, you can see exactly which step caused it.

01

Model and prompt

The model used and a fingerprint of the system prompt, so you know exactly what ran.

02

Inputs

The user inputs that started the run, recorded verbatim.

03

Retrievals

Every chunk the agent retrieved, with the fingerprint of the exact source it came from.

04

Tool calls

Each tool the agent called, with its arguments and its result.

05

Memory

Every remember and recall, tied to the agent that made it.

06

Outputs

Intermediate steps and the final response — the whole chain, not just the answer.

The record itself is stored on Walrus and anchored on Sui — so it can't be altered after the fact, and anyone can verify it.
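To make the six captured categories concrete, here is a minimal sketch of what such a record could look like and how a single digest makes later edits detectable. The field names, the `fingerprint` helper, the use of SHA-256, and the placeholder model name and prompt text are all assumptions for illustration; they are not Kraterion's actual schema, and the Walrus and Sui steps are not shown.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict


def fingerprint(data: str) -> str:
    """SHA-256 hex digest of a UTF-8 string (an assumed fingerprint scheme)."""
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


@dataclass
class RunRecord:
    # Hypothetical field names mirroring the six categories listed above.
    model: str
    prompt_fingerprint: str
    inputs: list
    retrievals: list = field(default_factory=list)
    tool_calls: list = field(default_factory=list)
    memory: list = field(default_factory=list)
    outputs: list = field(default_factory=list)

    def digest(self) -> str:
        # Canonical JSON (sorted keys, fixed separators) so that the same
        # record always serializes, and therefore hashes, the same way.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return fingerprint(canonical)


record = RunRecord(
    model="example-model",  # placeholder name
    prompt_fingerprint=fingerprint("You are a support agent."),  # placeholder prompt
    inputs=["What is our refund policy?"],
    retrievals=[{
        "source": "pricing-faq.md",
        "section": "§3",
        "source_fingerprint": fingerprint("placeholder source text"),
    }],
    tool_calls=[{"tool": "search", "args": {"query": "refund policy"}, "result": "4 hits"}],
    memory=[{"op": "recall", "agent": "support", "query": "user prefs"}],
    outputs=["Refunds are processed within 7 business days from the original payment method."],
)

# The digest is the value one would anchor somewhere the writer cannot edit.
anchored = record.digest()

# Any later change to the record produces a different digest.
record.outputs[0] = "Refunds are processed within 30 business days."
tampered = record.digest() != anchored
```

Anchoring only the digest keeps the external footprint small while still letting anyone who holds the full record recompute the hash and check it.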

Bridge

Run it once.
Run it again, exactly.

02 How replay works

From a receipt, in seconds.

01
Copy the receipt

Every finished run prints a short receipt.

02
Replay it

kraterion replay <receipt> — or call the API.

03
Same inputs

The run reruns against the same inputs and the same retrieved data.

04
Compare

See the original and the replay side by side.

Original · receipt 3f4d…ae | Replay · just now
recall · 2 notes | recall · 2 notes
search · 4 hits | search · 4 hits
read · pricing-faq.md · §3 | read · pricing-faq.md · §3
output · Refunds in 7 business days. | output · Refunds in 7 business days.
4 of 4 steps match · verified
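The compare step can be pictured as a positional diff over two step lists. The sketch below is one way to do that, assuming each step reduces to a kind and a short summary; the `Step` type and `compare_runs` function are hypothetical names, not part of the Kraterion CLI or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    kind: str     # e.g. "recall", "search", "read", "output"
    summary: str  # short description of the step's result


def compare_runs(original, replay):
    """Pair steps by position and flag whether each pair matches."""
    rows = []
    for i in range(max(len(original), len(replay))):
        a = original[i] if i < len(original) else None
        b = replay[i] if i < len(replay) else None
        # A missing step on either side counts as a mismatch.
        rows.append((a, b, a is not None and a == b))
    return rows


original = [
    Step("recall", "2 notes"),
    Step("search", "4 hits"),
    Step("read", "pricing-faq.md · §3"),
    Step("output", "Refunds in 7 business days."),
]
replay = list(original)  # a replay that reproduced every step

rows = compare_runs(original, replay)
matched = sum(1 for _, _, ok in rows if ok)
print(f"{matched} of {len(rows)} steps match")
```

Comparing by position rather than by step name means a replay that skips or reorders a step shows up as a mismatch instead of being silently aligned.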
03 Audit trail & lineage

Click any output. See every input.

The same record that powers replay is a graph. Every artifact an agent reads or writes is a node; every operation is an edge. Start from an output and walk back through the chunks, tool calls, and memory that shaped it — following the OpenLineage mental model, with a verify button on every node.

report.md · Output

Built from

  • pricing-faq.md · §3 · chunk · score 0.92 · verify
  • billing-policy.md · §1.4 · chunk · score 0.88 · verify
  • search · query: "refund policy" → 4 hits
  • recall · user prefs · 2 notes · markdown output · verify

Every node traces back to a record you can verify.
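Walking back from an output is an ordinary graph traversal. The sketch below uses a plain dictionary in which each artifact maps to the (operation, input) pairs that produced it. The graph shape, the operation labels, and the upstream nodes `support-docs` and `agent memory` are assumptions for illustration; this borrows only the node-and-edge idea and is not the OpenLineage API or Kraterion's storage format.

```python
from collections import deque

# Hypothetical lineage graph: artifact -> list of (operation, input artifact).
lineage = {
    "report.md": [
        ("generate", "pricing-faq.md · §3"),
        ("generate", "billing-policy.md · §1.4"),
        ("generate", "user prefs notes"),
    ],
    "pricing-faq.md · §3": [("search", "support-docs")],
    "billing-policy.md · §1.4": [("search", "support-docs")],
    "user prefs notes": [("recall", "agent memory")],
}


def trace_back(graph, output):
    """Breadth-first walk from an output to every upstream node.

    Returns (operation, node) pairs in the order they are discovered;
    each upstream node is reported once even if several paths reach it.
    """
    seen = {output}
    order = []
    queue = deque([output])
    while queue:
        node = queue.popleft()
        for op, src in graph.get(node, []):
            if src not in seen:
                seen.add(src)
                order.append((op, src))
                queue.append(src)
    return order


for op, node in trace_back(lineage, "report.md"):
    print(f"{op} <- {node}")
```

Breadth-first order lists the direct inputs of the output first, which matches the "Built from" view above; deeper sources follow.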

04 The boundary

What we prove — and what we don't.

We record what the run did and make that record tamper-evident: the inputs, the retrievals, the tool calls, the memory, and the outputs. We don't claim to prove the model's internal reasoning was correct — only that this is the run that happened, and you can check it.

Replay & audit

Stop guessing.
Replay the run.

Every run recorded. Every record yours. Every step verifiable.

v 0.1 · testnet · All systems normal