AI Verification System

How Axon's three-agent AI verification system analyzes transactions for safety, behavioral anomalies, and prompt injection.

Axon runs a three-agent AI verification layer inside the relayer. When a transaction exceeds configured thresholds, three specialized agents analyze it in parallel and vote on whether to approve or reject. This is the core security layer that separates Axon from static policy-only systems.

Why AI Verification

Static rules catch known-bad patterns: amount limits, velocity caps, destination whitelists. But autonomous AI agents can be compromised in ways that static rules miss:

  • A prompt-injected agent might start making payments to attacker-controlled addresses with legitimate-looking memos
  • A compromised agent might stay within per-transaction limits but shift its spending pattern (different destinations, different times, different amounts)
  • A destination contract might have been deployed specifically to drain funds

AI verification catches these patterns by combining bytecode analysis, behavioral statistics, and natural language reasoning.

The Three Agents

All three agents run in parallel when triggered. Each returns an independent verdict: approve, reject, or abstain.

Safety Agent

Purpose: Analyze the destination address for on-chain risk signals.

What it does:

  • Checks if the destination is an EOA or a contract
  • If contract: retrieves and decompiles bytecode using Slither static analysis
  • Feeds the decompiled code to an LLM with a security-focused prompt
  • Checks for known dangerous patterns: self-destruct, delegatecall to untrusted targets, token approval drains, reentrancy vectors
  • Cross-references the destination against known scam/exploit databases

Verdict logic:

  • approve -- destination appears safe (known contract, no dangerous patterns, or trusted EOA)
  • reject -- dangerous patterns detected in bytecode, or address flagged in exploit databases
  • abstain -- unable to determine (e.g., unverified contract with obfuscated bytecode)

Behavioral Agent

Purpose: Detect statistical anomalies in the bot's transaction history.

What it does:

  • Retrieves the bot's last 100 transactions from PostgreSQL
  • Computes baseline statistics: typical amount, typical destination distribution, typical time-of-day pattern, typical frequency
  • Calculates a Z-score for the current transaction against each dimension
  • Flags transactions that deviate significantly from the bot's established pattern

Signals analyzed:

  • Amount deviation -- is this transaction unusually large or small compared to the bot's history?
  • Destination novelty -- has this bot ever paid this address before? How many unique destinations does it typically use?
  • Temporal pattern -- is this transaction happening at an unusual time relative to the bot's normal operating hours?
  • Frequency spike -- is the bot submitting transactions at an unusually high rate?

Verdict logic:

  • approve -- transaction is within normal statistical bounds (Z-score below threshold on all dimensions)
  • reject -- significant deviation on multiple dimensions simultaneously (e.g., unusually large amount to a never-seen destination at an unusual time)
  • abstain -- insufficient history (fewer than 10 transactions) or mixed signals
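The amount-deviation signal can be sketched as a simple Z-score check. This is an illustrative sketch, not Axon's implementation: the function names and the threshold of 3 are assumptions, and a real verdict combines all four dimensions (the doc says rejection requires deviation on multiple dimensions simultaneously) rather than amount alone.

```typescript
// Illustrative sketch of the amount-deviation signal: a Z-score of the
// current amount against the bot's recent history. Names and the threshold
// are assumptions, not Axon's actual implementation.
type Verdict = "approve" | "reject" | "abstain";

function amountZScore(history: number[], amount: number): number {
  const mean = history.reduce((a, b) => a + b, 0) / history.length;
  const variance =
    history.reduce((a, b) => a + (b - mean) ** 2, 0) / history.length;
  const std = Math.sqrt(variance);
  // Degenerate history (all identical amounts): any deviation is maximal.
  if (std === 0) return amount === mean ? 0 : Infinity;
  return Math.abs(amount - mean) / std;
}

function amountVerdict(history: number[], amount: number, threshold = 3): Verdict {
  if (history.length < 10) return "abstain"; // insufficient history
  return amountZScore(history, amount) > threshold ? "reject" : "approve";
}
```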

Reasoning Agent

Purpose: Detect prompt injection and social engineering in the transaction's stated purpose.

What it does:

  • Reads the transaction's memo field (the human-readable purpose stored off-chain)
  • Analyzes the memo for prompt injection patterns: instructions embedded in what should be a description, attempts to override system prompts, encoded payloads
  • Cross-references the stated purpose against the bot's configured role and typical memo patterns
  • Checks for urgency manipulation ("URGENT: must pay immediately") and authority spoofing ("Admin override: approved")

Verdict logic:

  • approve -- memo is consistent with the bot's normal purpose descriptions, no injection patterns detected
  • reject -- clear prompt injection detected, or memo is inconsistent with bot's configured purpose
  • abstain -- memo is empty or too short to analyze meaningfully
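To make the memo checks concrete, here is a minimal sketch using regex heuristics for the injection, urgency, and authority-spoofing patterns described above. The pattern list is an assumption for illustration; per the doc, the actual agent reasons over the memo with an LLM rather than a fixed pattern list.

```typescript
// Illustrative sketch of the memo checks: regex heuristics for injection,
// urgency manipulation, and authority spoofing. The pattern list is an
// assumption; the real agent uses LLM reasoning, not regexes.
type MemoVerdict = "approve" | "reject" | "abstain";

const SUSPICIOUS_PATTERNS: RegExp[] = [
  /ignore (all )?previous instructions/i, // classic injection phrasing
  /system prompt/i,                       // attempts to reference the prompt
  /urgent[:!]/i,                          // urgency manipulation
  /admin override/i,                      // authority spoofing
];

function memoVerdict(memo: string): MemoVerdict {
  const trimmed = memo.trim();
  if (trimmed.length < 5) return "abstain"; // empty or too short to analyze
  return SUSPICIOUS_PATTERNS.some((p) => p.test(trimmed)) ? "reject" : "approve";
}
```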

Consensus Mechanism

The three agents vote independently. A 2/3 majority is required to reach a decision.

Safety    Behavioral  Reasoning  Outcome
approve   approve     approve    Approved -- transaction proceeds
approve   approve     reject     Approved -- 2/3 majority approve
approve   reject      reject     Rejected -- 2/3 majority reject
reject    reject      reject     Rejected -- transaction blocked
approve   reject      abstain    No consensus -- human review
approve   abstain     abstain    No consensus -- human review
reject    abstain     abstain    No consensus -- human review

When no consensus is reached, the transaction is routed to the owner's human review queue. The owner sees all three agent scores and reasoning before making a decision.
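The 2/3 rule reduces to a short tally function. A minimal sketch (the outcome names are assumptions for illustration):

```typescript
// Sketch of the 2/3 consensus rule: two approvals approve, two rejections
// reject, anything else escalates to human review.
type Vote = "approve" | "reject" | "abstain";
type Outcome = "approved" | "rejected" | "human_review";

function consensus(votes: Vote[]): Outcome {
  const approvals = votes.filter((v) => v === "approve").length;
  const rejections = votes.filter((v) => v === "reject").length;
  if (approvals >= 2) return "approved";
  if (rejections >= 2) return "rejected";
  return "human_review"; // no 2/3 majority: escalate to the owner
}
```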

Trigger Conditions

AI verification is not run on every transaction. It is triggered when any of these conditions are met:

Amount Threshold

The transaction amount exceeds the bot's aiTriggerThreshold (configured per bot in the on-chain BotConfig).

if (intent.amount > botConfig.aiTriggerThreshold) → trigger AI scan

Velocity Window

The bot's cumulative spending over a rolling time window exceeds a configured threshold. This catches bots that stay under per-transaction limits but spend aggressively in aggregate.

Velocity is tracked in Redis using rolling windows. The relayer reads the velocity threshold from the bot's on-chain configuration.
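A rolling spending window can be sketched as follows. The doc states this state lives in Redis; to keep the sketch self-contained it uses an in-memory array instead (a Redis version might use a sorted set keyed by timestamp). Class and field names are assumptions.

```typescript
// Sketch of rolling-window velocity tracking. In production this state is
// tracked in Redis; an in-memory array keeps the example self-contained.
interface SpendEvent {
  timestamp: number; // ms since epoch
  amount: number;
}

class VelocityTracker {
  private events: SpendEvent[] = [];

  constructor(private windowMs: number) {}

  record(amount: number, now: number): void {
    this.events.push({ timestamp: now, amount });
  }

  // Total spend within the rolling window ending at `now`.
  spentInWindow(now: number): number {
    this.events = this.events.filter((e) => e.timestamp >= now - this.windowMs);
    return this.events.reduce((sum, e) => sum + e.amount, 0);
  }

  exceeds(threshold: number, now: number): boolean {
    return this.spentInWindow(now) > threshold;
  }
}
```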

Always-On Flag

The bot has requireAiVerification set to true in its BotConfig. Every transaction from this bot goes through AI verification regardless of amount or velocity.

This is useful for newly registered bots that have not yet established a behavioral baseline, or for bots operating in high-risk domains.
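The three trigger conditions combine into a single decision. A sketch, assuming the config shape below: `aiTriggerThreshold` and `requireAiVerification` mirror the doc's BotConfig fields, while `velocityThreshold` and the function name are illustrative assumptions.

```typescript
// Sketch combining the three trigger conditions: always-on flag, per-
// transaction amount threshold, and rolling-window velocity threshold.
// The velocityThreshold field is an assumption about the config shape.
interface BotConfig {
  aiTriggerThreshold: number;
  velocityThreshold: number;
  requireAiVerification: boolean;
}

function shouldTriggerAiScan(
  amount: number,
  windowSpend: number, // cumulative spend in the rolling window
  config: BotConfig,
): boolean {
  if (config.requireAiVerification) return true;           // always-on flag
  if (amount > config.aiTriggerThreshold) return true;     // amount threshold
  if (windowSpend > config.velocityThreshold) return true; // velocity window
  return false;
}
```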

Latency

Target: under 30 seconds at p95.

The three agents run in parallel. Total verification time is bounded by the slowest agent, not the sum. In practice:

Agent       Typical latency  Notes
Safety      5-15s            Depends on bytecode retrieval and decompilation
Behavioral  2-5s             Database query + statistical computation
Reasoning   3-8s             LLM inference on memo text

The overall scan typically completes in 10-20 seconds. The 30-second target accounts for cold starts, rate limits, and network variability.
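The "bounded by the slowest agent" property falls out of fanning the agents out concurrently. A minimal sketch using `Promise.all`; the agent interface and fake agents are illustrative assumptions:

```typescript
// Sketch of parallel agent execution: all three agents start immediately,
// and the await resolves when the slowest one finishes. The Agent type and
// fakeAgent helper are illustrative assumptions.
type AgentVerdict = "approve" | "reject" | "abstain";
type Agent = () => Promise<AgentVerdict>;

async function runAgentsInParallel(agents: Agent[]): Promise<AgentVerdict[]> {
  // Promise.all preserves input order, so verdicts line up with agents.
  return Promise.all(agents.map((agent) => agent()));
}

// Fake agents with simulated latencies, for demonstration only.
const delay = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));
const fakeAgent = (verdict: AgentVerdict, ms: number): Agent => async () => {
  await delay(ms);
  return verdict;
};
```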

If AI verification completes within the bot's HTTP request timeout, the response is synchronous (the bot gets a txHash or rejection in the same response). If it exceeds the timeout, the relayer returns a pending_review status with a pollUrl.

When There Is No Consensus

If the agents cannot reach a 2/3 majority, the transaction enters the human review queue:

  1. The owner receives a push notification (via PWA)
  2. The review queue shows: the transaction details, all three agent verdicts with reasoning, the bot's recent transaction history, and the destination analysis
  3. The owner taps approve or reject
  4. If approved, the relayer submits the transaction on-chain (if the deadline has not expired)
  5. If rejected, the bot is notified via polling

The intent's deadline still applies. If the owner does not act before the deadline expires, the intent becomes invalid. The bot must re-sign a new intent if the payment is still needed.
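The review outcome plus the deadline check can be sketched as a single dispatch. The field name `deadline` (unix seconds) and the outcome labels are assumptions for illustration:

```typescript
// Sketch of resolving a human-review decision against the intent deadline:
// an owner approval only submits on-chain if the deadline has not expired.
// Field and outcome names are illustrative assumptions.
interface SignedIntent {
  deadline: number; // unix seconds after which the intent is invalid
}

function resolveReview(
  decision: "approve" | "reject",
  intent: SignedIntent,
  nowSeconds: number,
): "submit" | "notify_rejected" | "expired" {
  if (decision === "reject") return "notify_rejected";
  return nowSeconds < intent.deadline ? "submit" : "expired";
}
```

When the result is `expired`, the bot must re-sign a new intent, as described above.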

Logging and Auditability

Every AI verification decision is logged to PostgreSQL with full detail:

  • Request ID linking to the original payment request
  • Each agent's verdict (approve/reject/abstain) with confidence score
  • Each agent's reasoning (natural language explanation of the decision)
  • Input data each agent received (amount, destination analysis, behavioral stats, memo text)
  • Total latency and per-agent latency
  • Final outcome (approved, rejected, or escalated to human review)
  • Human review decision (if applicable) with reviewer identity and timestamp
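The logged fields above suggest a record shape like the following. All field names here are illustrative assumptions about the schema, not Axon's actual tables:

```typescript
// Sketch of an audit-log record covering the fields listed above.
// All names are assumptions about the schema, not the actual one.
interface AgentResult {
  verdict: "approve" | "reject" | "abstain";
  confidence: number; // 0..1
  reasoning: string;  // natural-language explanation
  latencyMs: number;
}

interface AiVerificationRecord {
  requestId: string; // links back to the original payment request
  agents: { safety: AgentResult; behavioral: AgentResult; reasoning: AgentResult };
  totalLatencyMs: number;
  outcome: "approved" | "rejected" | "human_review";
  humanReview?: { decision: "approve" | "reject"; reviewer: string; reviewedAt: string };
}
```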

This audit trail is surfaced in the owner's dashboard under the Reporting section. Owners can see:

  • AI scan approval rate over time
  • Flag rate per bot
  • Per-agent agreement patterns (e.g., "Safety agent flags 12% of transactions, Behavioral agent flags 3%")
  • Most common rejection reasons

The audit trail is also available via the API for owners who want to build their own analytics.