Cryptographic Receipts: Verifiable Proof That an AI Agent Did Its Job

When an AI agent calls a phone number on your behalf, sends an email, or verifies an identity, something important is missing from the transaction: proof that it happened at all. The agent says it completed the task. The orchestrator believes it. Money moves. But if the outcome is disputed, there is no receipt, no audit trail, no way to establish ground truth without replaying the entire interaction from logs that may or may not exist.
This is the accountability gap in agent commerce today, and it gets expensive fast. Freebot negotiates resolutions with corporate customer service departments and charges only when it wins. Freway's agent Janine recovers abandoned checkouts and takes a commission on recovered sales. Both models depend on a shared, verifiable answer to the question: did the agent actually do what it claimed to do? Without cryptographic receipts, that question is resolved by whoever has better logs and more lawyers.
Cryptographic receipts solve this at the infrastructure layer. They are signed, content-addressed records of agent execution that can be verified by any party without trusting any party. This article covers how to structure them, where to store them, how they integrate with USDC payment settlement on Base, and what failure modes to watch for.
What a Receipt Actually Contains
A cryptographic receipt is a structured document that commits to four things: what inputs the agent received, what outputs it produced, the metadata of the execution itself (timestamp, agent identity, tool versions), and a payment authorization tied to the outcome. Hash the inputs. Hash the outputs. Sign the whole structure with the agent's private key. Now you have a tamper-evident record that any downstream system can verify without replaying the execution.
Here is a concrete receipt schema using EIP-712 typed structured data hashing, which gives you human-readable types and domain separation for free:
import { keccak256, toUtf8Bytes, parseUnits } from "ethers"; // assumes ethers v6
import stableStringify from "fast-json-stable-stringify";

// keccak256 expects bytes, so UTF-8 encode strings before hashing
const hashText = (text) => keccak256(toUtf8Bytes(text));

// EIP-712 domain separator
const domain = {
  name: "OneShot Agent Receipt",
  version: "1",
  chainId: 8453, // Base mainnet
  verifyingContract: "0xYourReceiptRegistryAddress" // placeholder for the deployed registry address
};

// Typed data structure
const types = {
  ExecutionReceipt: [
    { name: "agentId", type: "bytes32" },
    { name: "taskId", type: "bytes32" },
    { name: "inputsHash", type: "bytes32" },
    { name: "outputsHash", type: "bytes32" },
    { name: "toolchain", type: "string" },
    { name: "startedAt", type: "uint64" },
    { name: "completedAt", type: "uint64" },
    { name: "paymentAmount", type: "uint256" },
    { name: "paymentToken", type: "address" }
  ]
};

// Build the receipt
const receipt = {
  agentId: keccak256(agentPublicKey), // assumes a 0x-prefixed hex public key
  taskId: hashText(taskSpecification), // assumes the specification is a string
  inputsHash: hashText(stableStringify(inputs)), // canonical form, not JSON.stringify
  outputsHash: hashText(stableStringify(outputs)),
  toolchain: "oneshot@1.4.2/voice+email+sms",
  startedAt: BigInt(Math.floor(startTime / 1000)), // startTime is in milliseconds
  completedAt: BigInt(Math.floor(Date.now() / 1000)),
  paymentAmount: parseUnits("2.50", 6), // 2.50 USDC, 6 decimals
  paymentToken: "0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913" // USDC on Base
};

// Sign with agent key using EIP-712 (agentWallet is assumed to be an ethers v6 Wallet)
const signature = await agentWallet.signTypedData(domain, types, receipt);
The inputsHash commits to the exact task specification the agent received. The outputsHash commits to what it returned. If either is disputed later, the original preimage can be produced and re-hashed to confirm the match. The toolchain string records which OneShot tools were invoked, at what version, so you can reproduce the execution environment.
One thing to get right: hash the canonical serialization of your inputs and outputs, not whatever JSON.stringify happens to produce in your runtime. JSON.stringify emits object keys in insertion order, so two objects with identical contents that were built in a different key order serialize, and therefore hash, differently. Use a library like fast-json-stable-stringify or define a canonical form explicitly. A receipt whose hash cannot be reproduced from the preimage is useless.
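The key-ordering problem is easy to reproduce. The following is a minimal sketch, not a replacement for a tested library: canonicalize and hashHex are hypothetical helper names, the input is assumed to be plain JSON data (no undefined values, functions, or cycles), and SHA-256 from Node's standard library stands in for keccak256 so the example runs without dependencies.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: serialize with object keys sorted at every depth.
// Assumes plain JSON data (no undefined, functions, or cycles).
function canonicalize(value) {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalize).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value).sort();
    const members = keys.map((k) => JSON.stringify(k) + ":" + canonicalize(value[k]));
    return "{" + members.join(",") + "}";
  }
  return JSON.stringify(value);
}

// SHA-256 is a stand-in for keccak256 in this sketch.
function hashHex(text) {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Same contents, different key insertion order.
const a = { to: "+15550100", script: "refund request" };
const b = { script: "refund request", to: "+15550100" };

const naiveMatch = hashHex(JSON.stringify(a)) === hashHex(JSON.stringify(b));
const canonicalMatch = hashHex(canonicalize(a)) === hashHex(canonicalize(b));

console.log({ naiveMatch, canonicalMatch }); // naiveMatch is false, canonicalMatch is true
```

Whichever canonical form you choose, both the signer and every verifier have to use the same one, so it belongs in the receipt specification rather than in one codebase.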
On-Chain vs Off-Chain Storage: The Real Tradeoffs

You have two places to store a receipt: on-chain (the receipt hash and signature live in a smart contract on Base) or off-chain (the full receipt lives in content-addressed storage like IPFS or a database, with only an anchor on-chain). Neither is always correct. The tradeoff is cost versus verifiability granularity.
Storing a full receipt on Base costs roughly 0.00001-0.000017 ETH per transaction at current gas prices, about $0.03-0.05 per receipt at $3000 ETH. If your agent executes 10,000 tasks per day, that is $300-500 per day in gas, or $9,000-15,000 per month, for receipts alone. That is probably too expensive for low-value tasks.
The practical architecture for most agent deployments is a hybrid: store the full receipt payload in content-addressed storage (IPFS CID or a hash-keyed database), anchor only the receipt hash and CID on-chain once per batch. A batch of 100 receipts costs one transaction. The on-chain anchor proves the batch existed at a specific block height. The off-chain payload proves what was in the batch. Dispute resolution can request the full payload for any specific receipt within the batch.
import { OneShot } from "@oneshot-agent/sdk";
import { createPublicClient, createWalletClient, http, keccak256, stringToBytes } from "viem";
import { base } from "viem/chains";
import stableStringify from "fast-json-stable-stringify";

// Assumed to be set up elsewhere: ipfs (an IPFS client), registryContract
// (a viem contract instance for the receipt registry), and merkleRoot (a helper).
const oneshot = new OneShot({ apiKey: process.env.ONESHOT_API_KEY });
const receiptQueue = [];

async function executeAndReceipt(task) {
  const startedAt = Date.now();
  // Execute the task using OneShot tools
  const result = await oneshot.voice.call({
    to: task.phoneNumber,
    script: task.script,
    maxDuration: 600
  });
  const completedAt = Date.now();

  // Build canonical payload
  const payload = {
    agentId: process.env.AGENT_ID,
    taskId: task.id,
    inputs: stableStringify(task),
    outputs: stableStringify(result),
    toolchain: `oneshot@${oneshot.version}/voice`,
    startedAt,
    completedAt,
    outcome: result.resolved ? "success" : "failure",
    paymentAmount: result.resolved ? task.fee : "0"
  };

  // Serialize once so the stored bytes are exactly the bytes that were hashed
  const canonicalPayload = stableStringify(payload);
  // Hash for on-chain anchoring (keccak256 expects bytes, not a raw string)
  const payloadHash = keccak256(stringToBytes(canonicalPayload));
  // Store full payload off-chain
  const cid = await ipfs.add(canonicalPayload);

  // Queue for batch anchoring. This flushes at 100 receipts; a timer-based
  // flush (for example every 10 minutes) for quiet periods is not shown.
  receiptQueue.push({ payloadHash, cid: cid.path, taskId: task.id });
  if (receiptQueue.length >= 100) {
    await flushReceiptBatch(receiptQueue.splice(0, 100));
  }
  return { result, receiptCid: cid.path };
}

async function flushReceiptBatch(batch) {
  const batchRoot = merkleRoot(batch.map(r => r.payloadHash));
  // Single on-chain transaction anchors the entire batch
  await registryContract.write.anchorBatch([
    batchRoot,
    batch.map(r => r.cid),
    BigInt(Math.floor(Date.now() / 1000))
  ]);
}
This pattern costs roughly $0.03-0.05 per 100 receipts in gas, or $0.0003-0.0005 per receipt. At that price, even a $0.10 task can carry a receipt without the receipt cost dominating the transaction economics.
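The batching code calls a merkleRoot helper that the article does not define. One possible shape for it, together with the inclusion proof a dispute resolver would check, is sketched below. Everything here is an assumption made for illustration: the function names, the choice to hash each pair in sorted order, the rule that an unpaired node is promoted unchanged, and the use of SHA-256 from Node's standard library in place of keccak256. Leaves are assumed to be a non-empty array of hash buffers.

```javascript
const { createHash } = require("node:crypto");

// SHA-256 stands in for keccak256 so the sketch needs no dependencies.
const sha256 = (buf) => createHash("sha256").update(buf).digest();

// Hash a pair in sorted order so proofs do not need left/right flags.
function hashPair(a, b) {
  const [lo, hi] = Buffer.compare(a, b) <= 0 ? [a, b] : [b, a];
  return sha256(Buffer.concat([lo, hi]));
}

// Build every level of the tree; an unpaired node is promoted unchanged.
function buildLevels(leaves) {
  const levels = [leaves];
  while (levels[levels.length - 1].length > 1) {
    const prev = levels[levels.length - 1];
    const next = [];
    for (let i = 0; i < prev.length; i += 2) {
      next.push(i + 1 < prev.length ? hashPair(prev[i], prev[i + 1]) : prev[i]);
    }
    levels.push(next);
  }
  return levels;
}

function merkleRoot(leaves) {
  const levels = buildLevels(leaves);
  return levels[levels.length - 1][0];
}

// Collect the sibling hashes needed to recompute the root from one leaf.
function merkleProof(leaves, index) {
  const proof = [];
  let i = index;
  for (const level of buildLevels(leaves).slice(0, -1)) {
    const sibling = i ^ 1;
    if (sibling < level.length) proof.push(level[sibling]);
    i = Math.floor(i / 2);
  }
  return proof;
}

// Fold the proof into the leaf and compare against the anchored root.
function verifyProof(leaf, proof, root) {
  return proof.reduce((acc, sibling) => hashPair(acc, sibling), leaf).equals(root);
}

// Five receipt payload hashes, anchored under one root.
const leaves = ["r1", "r2", "r3", "r4", "r5"].map((s) => sha256(Buffer.from(s)));
const root = merkleRoot(leaves);
const proofForThird = merkleProof(leaves, 2);
console.log(verifyProof(leaves[2], proofForThird, root)); // true
```

With a proof like this, a disputed receipt can be shown to belong to an anchored batch by revealing only that receipt's payload and a handful of sibling hashes, not the other 99 payloads.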
Payment Settlement Tied to Receipt Verification
The real value of cryptographic receipts is that they can gate payment. Instead of paying an agent and hoping it did the work, you pay into escrow, the agent produces a receipt, and the escrow releases funds when the receipt verifies. This is the architecture that makes pay-per-resolution models like Freebot's actually trustworthy rather than just contractually obligated.
USDC on Base is the right settlement token for this: six decimal precision, fast confirmation (blocks roughly every 2 seconds on Base), and low transaction costs. A typical dispute-free settlement flow looks like this:
1. Client deposits USDC into a task escrow contract, specifying the agent address and task hash.
2. Agent executes the task using OneShot tools, produces a signed receipt.
3. Agent submits the receipt to the escrow contract.
4. Contract verifies the EIP-712 signature, checks that the taskId matches the escrowed task, and releases USDC to the agent.
5. Receipt hash is anchored on-chain as part of the settlement transaction.
Steps 3 through 5 happen in a single transaction. The settlement and the receipt anchoring are atomic. You cannot have one without the other, which means every payment record is also a receipt record.
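The settlement logic can be modeled off-chain to make the checks concrete. This is an in-memory sketch, not contract code: TaskEscrow and encodeReceipt are hypothetical names, Ed25519 signatures from Node's standard library stand in for EIP-712 signatures, SHA-256 stands in for keccak256, and amounts are plain integers in USDC base units.

```javascript
const { createHash, generateKeyPairSync, sign, verify } = require("node:crypto");

const digest = (data) => createHash("sha256").update(data).digest("hex");

// A fixed field order keeps the signed bytes reproducible on both sides.
const encodeReceipt = (r) =>
  Buffer.from(JSON.stringify([r.taskId, r.outputsHash, r.paymentAmount]));

class TaskEscrow {
  constructor() {
    this.tasks = new Map(); // taskId -> { agentKey, amount, settled, released }
    this.anchors = [];      // receipt hashes recorded at settlement
  }

  deposit(taskId, agentKey, amount) {
    this.tasks.set(taskId, { agentKey, amount, settled: false, released: 0 });
  }

  // Every check runs before any state changes, so the payment release and
  // the receipt anchor are recorded together or not at all.
  settle(receipt, signature) {
    const task = this.tasks.get(receipt.taskId);
    if (!task) throw new Error("no escrow for this taskId");
    if (task.settled) throw new Error("already settled");
    if (receipt.paymentAmount > task.amount) throw new Error("amount exceeds escrow");
    const message = encodeReceipt(receipt);
    if (!verify(null, message, task.agentKey, signature)) throw new Error("bad signature");
    task.settled = true;
    task.released = receipt.paymentAmount;
    this.anchors.push(digest(message));
  }
}

// Client funds the task; agent signs a receipt and submits it.
const agent = generateKeyPairSync("ed25519");
const escrow = new TaskEscrow();
escrow.deposit("task-42", agent.publicKey, 2_500_000); // 2.50 USDC in base units

const receipt = {
  taskId: "task-42",
  outputsHash: digest("call transcript"),
  paymentAmount: 2_500_000
};
const signature = sign(null, encodeReceipt(receipt), agent.privateKey);
escrow.settle(receipt, signature);

console.log(escrow.tasks.get("task-42").released, escrow.anchors.length);
```

On-chain, the same all-or-nothing property comes from the transaction itself: if any check reverts, neither the transfer nor the anchor is recorded.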
The escrow contract needs to handle one edge case carefully: what counts as a valid outcome? For a voice call, "success" might mean the call connected and lasted more than 30 seconds. For an email, it might mean the SMTP server returned a 250 OK. For identity verification, it might mean a specific confidence score threshold. The receipt schema must encode the outcome criterion, not just the outcome, so the contract can verify the right thing was measured.
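One way to encode the criterion alongside the measurement is sketched below. The field names, the operator vocabulary, and the 0.9 confidence threshold are illustrative assumptions, not part of any existing schema; the 30-second and 250 values come from the examples above.

```javascript
// Hypothetical operator vocabulary. A real deployment would fix this list
// in the receipt specification so every verifier evaluates it the same way.
const comparators = new Map([
  ["gt", (measured, threshold) => measured > threshold],
  ["gte", (measured, threshold) => measured >= threshold],
  ["eq", (measured, threshold) => measured === threshold]
]);

// The receipt carries both the rule and the measured value, so a verifier
// can recompute the outcome instead of trusting a bare "success" flag.
function evaluateOutcome(outcome) {
  const compare = comparators.get(outcome.criterion.op);
  if (!compare) throw new Error(`unknown operator: ${outcome.criterion.op}`);
  return compare(outcome.measured, outcome.criterion.threshold);
}

const voiceOutcome = {
  criterion: { metric: "callDurationSeconds", op: "gt", threshold: 30 },
  measured: 47
};
const emailOutcome = {
  criterion: { metric: "smtpReplyCode", op: "eq", threshold: 250 },
  measured: 250
};
const identityOutcome = {
  criterion: { metric: "matchConfidence", op: "gte", threshold: 0.9 }, // illustrative threshold
  measured: 0.82
};

console.log(
  evaluateOutcome(voiceOutcome),    // true
  evaluateOutcome(emailOutcome),    // true
  evaluateOutcome(identityOutcome)  // false
);
```

Because the criterion is part of the signed receipt, the client and the agent agree on it before execution, and neither can substitute a looser definition of success afterwards.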
Dispute Resolution Between Agents

Multi-agent systems create a new class of disputes: agent A claims it delivered a result to agent B, agent B claims it never received it, and neither is necessarily lying. Network partitions, deserialization errors, and timeout races can all cause genuine disagreement about what happened. Cryptographic receipts turn these disputes from "he said, she said" into a verifiable chain of custody.
The pattern is a receipt chain. When agent A delivers a result to agent B, agent B signs an acknowledgment receipt that commits to A's output hash. Now you have a linked sequence: A's execution receipt commits to its outputs, B's acknowledgment receipt commits to A's outputs as its inputs. If B later claims it received different inputs than A claims to have sent, the hashes diverge and the discrepancy is localized to the handoff between A and B, not somewhere in a 10-step pipeline.
In practice, you want acknowledgment receipts to be cheap. They do not need to be anchored on-chain individually. They can be signed off-chain and submitted to a dispute resolver only when a dispute is actually raised. The dispute resolver (which can be a smart contract, an arbitration service, or a human with access to the signed payloads) then verifies the chain of custody and determines where it broke.
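The chain-of-custody check itself is small. The sketch below assumes each receipt exposes inputsHash and outputsHash fields as in the schema earlier, omits signature verification for brevity, and uses SHA-256 from Node's standard library in place of keccak256; makeReceipt and findBrokenHandoff are hypothetical names.

```javascript
const { createHash } = require("node:crypto");

const digest = (text) => createHash("sha256").update(text, "utf8").digest("hex");

// Each receipt commits to what the agent says it received and produced.
function makeReceipt(agent, input, output) {
  return { agent, inputsHash: digest(input), outputsHash: digest(output) };
}

// Walk the chain and return the first handoff where the sender's output
// hash differs from the receiver's input hash, or null if the chain is intact.
function findBrokenHandoff(chain) {
  for (let i = 0; i + 1 < chain.length; i++) {
    if (chain[i].outputsHash !== chain[i + 1].inputsHash) {
      return { from: chain[i].agent, to: chain[i + 1].agent };
    }
  }
  return null;
}

// A produces a full result, but B acknowledges a truncated copy of it.
const chain = [
  makeReceipt("A", "task spec", "full result"),
  makeReceipt("B", "full res", "summary"),
  makeReceipt("C", "summary", "final answer")
];

console.log(findBrokenHandoff(chain)); // { from: 'A', to: 'B' }
```

The check only says where the hashes diverge. Deciding which side is at fault still requires the preimages, which is why the signed payloads need to stay retrievable.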
For the OneShot SDK, this means wrapping tool calls in receipt-generating middleware. Every call to oneshot.voice.call(), oneshot.email.send(), or oneshot.sms.send() should produce a tool-level receipt that feeds into the task-level receipt. The tool receipt proves the external action happened; the task receipt proves the agent's overall execution.
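A middleware of this kind can be a thin wrapper around any async tool function. The sketch below is an assumption about how one might write it, not OneShot SDK code: withReceipt is a hypothetical name, fakeVoiceCall is a stub standing in for a real tool call, and the hashing uses JSON.stringify with SHA-256 only to stay dependency-free (a real receipt should use the canonical serialization and hash discussed earlier).

```javascript
const { createHash } = require("node:crypto");

// Stand-in hashing for the sketch; swap in canonical serialization in practice.
const digest = (value) =>
  createHash("sha256").update(JSON.stringify(value), "utf8").digest("hex");

// Wrap a tool function so every successful call appends a tool-level receipt.
function withReceipt(toolName, toolFn, sink) {
  return async function receiptedTool(args) {
    const startedAt = Date.now();
    const result = await toolFn(args);
    sink.push({
      tool: toolName,
      inputsHash: digest(args),
      outputsHash: digest(result),
      startedAt,
      completedAt: Date.now()
    });
    return result;
  };
}

// Stub tool standing in for a real voice call.
async function fakeVoiceCall({ to }) {
  return { to, resolved: true, durationSeconds: 47 };
}

const toolReceipts = [];
const call = withReceipt("voice.call", fakeVoiceCall, toolReceipts);

call({ to: "+15550100", script: "refund request" }).then((result) => {
  console.log(result.resolved, toolReceipts.length);
});
```

This version records nothing when the tool throws. Whether failed calls should also produce receipts is a design choice that depends on whether failures are billable or disputable in your system.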
Failure Modes Worth Knowing
Receipt systems have predictable failure modes. The first is hash collision attacks, which are not a practical concern with keccak256 but matter if you use weaker hashes for performance. Do not use MD5 or SHA-1 for receipt hashing, even in internal systems.
The second is clock skew. Receipt timestamps come from the agent's local clock. If the agent's clock is wrong by more than a few minutes, receipts can appear out of order or outside valid windows. Use a time oracle or NTP-synced timestamps, and build a tolerance window into your verification logic. Rejecting a receipt because the agent's clock was 45 seconds off is a bad user experience.
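A tolerance window can be a few lines in the verifier. In this sketch the function name, the result shape, and the 120-second default are assumptions chosen for illustration; timestamps are Unix seconds, matching the receipt schema.

```javascript
// Accept receipts whose timestamps are internally consistent and not further
// in the future than the verifier's clock plus a tolerance.
function checkTimestamps(receipt, verifierNowSec, toleranceSec = 120) {
  if (receipt.completedAt < receipt.startedAt) {
    return { ok: false, reason: "completedAt precedes startedAt" };
  }
  if (receipt.completedAt > verifierNowSec + toleranceSec) {
    return { ok: false, reason: "completedAt is too far in the future" };
  }
  return { ok: true };
}

const now = 1_700_000_000;

// Agent clock runs 45 seconds fast: accepted.
const slightlyFast = checkTimestamps({ startedAt: now - 60, completedAt: now + 45 }, now);

// Agent clock runs 10 minutes fast: rejected.
const veryFast = checkTimestamps({ startedAt: now, completedAt: now + 600 }, now);

console.log(slightlyFast.ok, veryFast.ok); // true false
```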
The third is key management. The receipt is only as trustworthy as the agent's signing key. If the key is compromised, an attacker can forge receipts. Agent keys should be hardware-backed where possible (HSM or secure enclave), rotated on a schedule, and revocable on-chain. A revoked key should invalidate future receipts but not past ones, which means your verification logic needs to check that the signing key was valid at the time the receipt was produced, not just valid now.
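The "valid at the time" check might look like the following sketch. The key record shape (validFrom and revokedAt in Unix seconds, with revokedAt null while the key is active) and the function name are hypothetical.

```javascript
// Returns true if the key was active at the given time.
// A key revoked at time T invalidates receipts from T onward, not earlier ones.
function keyValidAt(keyRecord, timestampSec) {
  if (timestampSec < keyRecord.validFrom) return false;
  return keyRecord.revokedAt === null || timestampSec < keyRecord.revokedAt;
}

const rotatedKey = { validFrom: 1_700_000_000, revokedAt: 1_702_592_000 };
const activeKey = { validFrom: 1_702_592_000, revokedAt: null };

console.log(
  keyValidAt(rotatedKey, 1_701_000_000), // true: signed before revocation
  keyValidAt(rotatedKey, 1_703_000_000), // false: signed after revocation
  keyValidAt(activeKey, 1_703_000_000)   // true: key still active
);
```

One caveat: a receipt's own timestamp is written by whoever holds the key, so an attacker with a compromised key could backdate it to before the revocation. It is safer to pass in a time the signer does not control, such as the block timestamp of the batch that anchored the receipt.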
The fourth is the preimage availability problem. A receipt hash is useless for dispute resolution if the preimage (the original inputs and outputs) is not available. Off-chain storage can go offline. IPFS pins can be dropped. Build redundant storage for receipt payloads: IPFS plus a database backup plus the original agent's local storage. The cost of storing a 10KB receipt payload for a year in S3 is roughly $0.000003. There is no good reason to lose preimages.
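Redundant storage only helps if retrieval falls through to the next store and checks what it gets back. The sketch below uses synchronous in-memory stubs for the three stores to stay short; real IPFS, database, and disk clients would be asynchronous. The store names, the fetchPreimage function, and SHA-256 in place of keccak256 are all assumptions.

```javascript
const { createHash } = require("node:crypto");

const digest = (text) => createHash("sha256").update(text, "utf8").digest("hex");

// Try each store in order and accept the first payload whose hash matches.
// A missing, failing, or corrupted copy just moves on to the next store.
function fetchPreimage(expectedHash, stores) {
  for (const store of stores) {
    let payload;
    try {
      payload = store.get(expectedHash);
    } catch {
      continue;
    }
    if (payload != null && digest(payload) === expectedHash) {
      return { payload, source: store.name };
    }
  }
  return null;
}

const original = '{"taskId":"task-42","outcome":"success"}';
const receiptHash = digest(original);

const stores = [
  { name: "ipfs", get: () => null },                           // pin was dropped
  { name: "database", get: () => original.replace("42", "43") }, // copy was corrupted
  { name: "agent-local", get: () => original }                 // intact copy
];

console.log(fetchPreimage(receiptHash, stores).source); // agent-local
```

Verifying the hash on read means a backup store does not have to be trusted, only available.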
Benchmarks: What This Actually Costs
Here are estimated figures for a production receipt system on Base, handling 10,000 agent tasks per day:
- Receipt generation (CPU, signing): under 5ms per receipt on any modern server. Negligible.
- IPFS storage (Pinata or equivalent): roughly $0.15 per GB per month. At 10KB per receipt, 10,000 receipts per day is 100MB per day, or 3GB of new data per month, about $0.45 per month for each month of receipts retained.
- On-chain anchoring at 100 receipts per batch: 100 transactions per day at $0.04 each is $4 per day, $120 per month.
- USDC settlement transactions: one per task at $0.04 each is $400 per day, but this cost is typically passed through to the task fee.
- Total receipt infrastructure cost (excluding settlement): roughly $120-125 per month for 300,000 receipts. That is $0.0004 per receipt.
At a task fee of $1.00 or more, receipt overhead is under 0.05% of revenue. At $0.10 tasks, it is 0.4%. The economics work at any reasonable task price.
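The arithmetic behind these figures can be rechecked with a few lines. The inputs below are the article's own estimates; amounts are kept in integer cents to avoid floating-point drift, and the variable names are arbitrary.

```javascript
// Inputs: the estimates used in this section.
const tasksPerDay = 10_000;
const batchSize = 100;
const txCostCents = 4;           // $0.04 per on-chain transaction
const storageCentsPerMonth = 45; // $0.45 for one month of new payloads
const daysPerMonth = 30;

// Anchoring every receipt individually versus once per batch.
const unbatchedCentsPerMonth = tasksPerDay * txCostCents * daysPerMonth;
const anchorTxPerDay = Math.ceil(tasksPerDay / batchSize);
const batchedCentsPerMonth = anchorTxPerDay * txCostCents * daysPerMonth;

// Total receipt infrastructure cost, excluding settlement transactions.
const totalCentsPerMonth = batchedCentsPerMonth + storageCentsPerMonth;
const receiptsPerMonth = tasksPerDay * daysPerMonth;
const costPerReceiptUsd = totalCentsPerMonth / 100 / receiptsPerMonth;

// Receipt overhead as a percentage of the task fee.
const overheadPct = (taskFeeUsd) => (costPerReceiptUsd / taskFeeUsd) * 100;

console.log({
  unbatchedUsdPerMonth: unbatchedCentsPerMonth / 100, // 12000
  batchedUsdPerMonth: batchedCentsPerMonth / 100,     // 120
  totalUsdPerMonth: totalCentsPerMonth / 100,         // 120.45
  costPerReceiptUsd: costPerReceiptUsd.toFixed(6),    // "0.000402"
  overheadAtOneDollar: overheadPct(1).toFixed(2),     // "0.04"
  overheadAtTenCents: overheadPct(0.1).toFixed(2)     // "0.40"
});
```

Changing batchSize or txCostCents shows how sensitive the per-receipt figure is to gas prices: anchoring dominates the total, and storage is close to a rounding error.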
What to Build Next
The receipt infrastructure described here is the foundation, but it enables several things worth building on top of it. The first is a reputation system for agents. If every execution produces a signed receipt with an outcome field, you can compute per-agent success rates, latency distributions, and failure mode frequencies over time. An agent with 50,000 receipts and a 94% success rate is a fundamentally different counterparty than one with 200 receipts and no track record. This is what Soul.Markets is building toward: verifiable agent identity backed by execution history, not just self-reported capabilities.
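As a rough illustration of the reputation idea, per-agent statistics fall out of the receipt fields directly. The sketch assumes receipts carry outcome, startedAt, and completedAt as in the batching payload earlier, and that signatures have already been verified; summarizeReceipts is a hypothetical name.

```javascript
// Compute a success rate and median latency from already-verified receipts.
function summarizeReceipts(receipts) {
  const total = receipts.length;
  if (total === 0) return { total: 0, successRate: null, medianLatencyMs: null };

  const successes = receipts.filter((r) => r.outcome === "success").length;
  const latencies = receipts
    .map((r) => r.completedAt - r.startedAt)
    .sort((x, y) => x - y);
  const mid = Math.floor(total / 2);
  const medianLatencyMs =
    total % 2 === 1 ? latencies[mid] : (latencies[mid - 1] + latencies[mid]) / 2;

  return { total, successRate: successes / total, medianLatencyMs };
}

const history = [
  { outcome: "success", startedAt: 0, completedAt: 10_000 },
  { outcome: "success", startedAt: 0, completedAt: 20_000 },
  { outcome: "failure", startedAt: 0, completedAt: 30_000 },
  { outcome: "success", startedAt: 0, completedAt: 40_000 }
];

console.log(summarizeReceipts(history));
// { total: 4, successRate: 0.75, medianLatencyMs: 25000 }
```

A production reputation score would also need to weight by task value and recency, since an agent could otherwise pad its record with cheap, easy tasks.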
The second is receipt-gated access control. Instead of API keys, agents present recent receipts proving they have successfully completed related tasks. An agent that wants to access a high-value tool can be required to show receipts demonstrating it has handled lower-value tasks reliably.