Execution Models

AgentHiFive supports multiple execution models that define how an AI agent accesses provider APIs. Each model represents a different trust level and policy enforcement strategy.

Model A: Token Vending

The agent receives a time-limited provider token and calls the provider API directly.

Human           AgentHiFive              Agent              Provider API
  |                  |                     |                      |
  |-- approve ------>|                     |                      |
  |                  |-- vend token ------>|                      |
  |                  |   (scoped, short)   |                      |
  |                  |                     |-- direct API call -->|
  |                  |                     |<-- response ---------|
  |                  |                     |                      |

How it works:

The human approves a delegation request in the dashboard
AgentHiFive issues a scoped, time-limited provider token to the agent
The agent calls the provider API directly with that token
AgentHiFive has no visibility into individual API calls after the token is vended

Characteristics:

Lowest latency -- no proxy hop
Weakest policy enforcement -- policies are baked into the token scope at issuance time
No per-request body inspection or response filtering
Audit trail limited to token issuance and revocation events
Best for trusted agents performing bulk read operations

Model B: Brokered Proxy

The agent sends every request through AgentHiFive, which evaluates policy rules, decrypts the provider token, and forwards the request.

Agent            Nginx            AgentHiFive (Fastify)         Provider API
  |                |                      |                          |
  |-- request ---->|                      |                          |
  |  (Bearer JWT)  |-- sticky hash ------>|                          |
  |                |                      |-- policy eval (~0.1ms)   |
  |                |                      |-- decrypt token (~1ms)   |
  |                |                      |-- forward request ------>|
  |                |                      |   (50-500ms)             |
  |                |                      |<-- provider response ----|
  |                |                      |-- response filter        |
  |                |                      |-- async audit log        |
  |<-- response ---|<---------------------|                          |

How it works:

The agent sends an API request to AgentHiFive with a JWT
Nginx uses sticky sessions (header hash) to route to a consistent Fastify replica
The policy engine evaluates request rules (allow / deny / require_approval)
If allowed, AgentHiFive decrypts the provider token and forwards the request
Response rules filter the result (field stripping, PII redaction) before returning to the agent
An audit event is logged asynchronously

Characteristics:

Full per-request policy enforcement (method, URL, body content)
Response filtering and PII redaction (including real-time filtering on streams)
Streaming support for SSE, NDJSON, and other content types via stream: true
Complete audit trail of every API call
Provider token never leaves AgentHiFive -- the agent only holds an AgentHiFive JWT
Adds minimal overhead: policy check ~0.1ms, token decrypt ~1ms, provider call 50-500ms dominates
Transparent LLM proxy route (/v1/vault/llm/:provider/*) lets AI SDKs use native HTTP calls through the vault

Performance Strategy

Model B uses cookie-based sticky sessions and PostgreSQL LISTEN/NOTIFY instead of Redis. Each replica maintains an in-memory policy cache invalidated cross-replica via NOTIFY. JTI replay protection and rate limiting are database-backed. If a replica goes down, the load balancer re-routes to another which rebuilds its cache from PostgreSQL on first request (~5-10ms one-time cost).

Model C: Agent Runtime Sandbox (Future)

The agent runs inside a controlled sandbox environment managed by AgentHiFive.

Human           AgentHiFive              Sandbox              Provider API
  |                  |                     |                      |
  |-- configure ---->|                     |                      |
  |                  |-- spawn sandbox --->|                      |
  |                  |   (resource limits) |                      |
  |                  |                     |-- API call --------->|
  |                  |                     |<-- response ---------|
  |                  |<-- audit + result --|                      |
  |<-- notification -|                     |                      |

How it works:

The human configures a task and the sandbox environment
AgentHiFive spawns a sandboxed runtime with resource limits (CPU, memory, network, time)
The agent executes within the sandbox, with all API calls intercepted
Results and audit logs are returned to AgentHiFive when the task completes

Characteristics:

Strongest isolation -- agent cannot escape the sandbox
Resource limits prevent runaway costs
Full audit trail plus execution logs
Higher latency due to sandbox overhead
Not in MVP scope -- planned for future phases

Comparison

Aspect	Model A (Token Vending)	Model B (Brokered Proxy)	Model C (Sandbox)
Trust level	High -- agent is trusted	Medium -- verify then forward	Low -- fully contained
Latency	Lowest (direct call)	Low (+1-2ms proxy overhead)	Higher (sandbox overhead)
Policy enforcement	At token issuance only	Per-request (method + URL + body)	Per-request + resource limits
Response filtering	None	Field stripping + PII redaction (buffered and streaming)	Field stripping + PII redaction
Audit granularity	Token lifecycle events	Every API call	Every API call + execution logs
Token exposure	Agent holds provider token	Agent holds AgentHiFive JWT only	Token stays in sandbox
Use case	Trusted bulk reads	General-purpose agent access	Untrusted or high-risk tasks
MVP status	Yes	Yes	Future roadmap

Choosing a Model

Use Model A when the agent is fully trusted and you need maximum throughput for read-heavy workloads (e.g., indexing a mailbox).
Use Model B for the default case. It provides per-request policy enforcement, response filtering, and a complete audit trail with minimal overhead.
Use Model C (when available) for untrusted agents or high-risk tasks where you need resource limits and full execution isolation.

Most deployments will use Model B as the default. Model A is available as an optimization for specific high-trust scenarios.

Model A: Token Vending​

Model B: Brokered Proxy​

Model C: Agent Runtime Sandbox (Future)​

Comparison​

Choosing a Model​

Model A: Token Vending

Model B: Brokered Proxy

Model C: Agent Runtime Sandbox (Future)

Comparison

Choosing a Model