
Frequently asked questions

Is this just a fancy approval workflow?

No. Approval workflows route requests to humans. A broker can route to humans, to automated policies, or to both. The point is that the authority to act is detached from the agent and attached to a verifiable certificate. A human-in-the-loop approval is one source of evidence among several.

What does this cost to run?

The broker itself is small: a stateless service that verifies signatures and writes audit logs. Most teams already pay for the heavy parts: continuous integration, identity, and an audit log destination. The real cost is policy work, meaning deciding what evidence justifies what action, and that work pays for itself the first time the agent tries something it should not.

How do we keep the signer from becoming the new single point of failure?

Run the signer as a small, well-scoped service with its own change control, separate from the agent codebase. Rotate signing keys. Keep the policy in version control. If the signer is down, mutating actions stop; read-only agent work continues. That is usually an acceptable failure mode, because the alternative is unbounded agent writes.

The business problem hiding under "agent permissions"

Most teams wire agents into production the same way they wire interns: give them a service account, scope it down, and hope. That works until the agent decides, in the middle of a long reasoning loop, that the fastest way to resolve a ticket is to drop a table, restart a cluster, or push a config to all regions. Identity and access management (IAM, the cloud system that decides which identities can do which things) will happily allow it, because the identity is allowed. The action is the problem, not the identity.

Three operator costs follow from this:

Incident cost. An agent with broad write access can cause an outage in seconds that takes a team hours to undo.
Audit cost. Regulators and customers increasingly ask "how do you know an autonomous system did not do X?" "We trust the prompt" is not an answer.
Slowdown cost. The usual response is to put a human in the loop for every action. That erases most of the value of the agent.

A broker pattern attacks all three. The agent keeps its broad read access. Write access moves behind a gate that demands a certificate per action.

Figure: Operator view of the broker sitting between agent and production

What a Sovereign Execution Broker actually is

Strip away the name and the broker is three things:

A small service that the agent must call to perform any mutating action: deploys, database writes, refunds, infrastructure changes.
A policy engine that checks a signed certificate accompanying the request. The certificate says what change, on what resource, with what evidence, valid for what window.
A signing authority, separate from the agent, that issues certificates only when preconditions are met: tests passed, risk score under threshold, approvals collected, change window open.

The agent never holds long-lived production credentials. It holds the right to ask for a certificate. The broker holds the right to act. The signer holds the right to authorize. Three different trust boundaries, each doing one job.

Why "certificate" and not just "policy check"

Policy checks live inside the system being protected. Certificates travel with the request and can be verified independently, logged, replayed, and revoked. If your security team later asks "show me every production write in March that was authorized under the emergency-rollback policy," a certificate-bound system can answer that in one query. A pile of IAM logs cannot.

The technical term you may hear is "capability token": a signed object that grants the bearer a specific, narrow ability. The business meaning: a one-time, expiring permission slip that the broker can verify without phoning home.
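To make the "one query" claim concrete, here is a minimal sketch of that March lookup over an in-memory certificate log. The row layout, the `executed_at` field, and the `policy:emergency-rollback` identifier are assumptions made for illustration; a real deployment would run the equivalent filter against whatever store holds the audit records.

```python
from datetime import datetime, timezone

# Hypothetical audit rows: one per certificate the broker accepted.
# Field names mirror the example certificate; "executed_at" is assumed.
AUDIT_LOG = [
    {"cert_id": "c_001", "action": "deploy",
     "approver": "policy:emergency-rollback", "executed_at": "2026-03-04T02:15:00Z"},
    {"cert_id": "c_002", "action": "deploy",
     "approver": "policy:auto-low-risk-v2", "executed_at": "2026-03-09T11:00:00Z"},
    {"cert_id": "c_003", "action": "db_write",
     "approver": "policy:emergency-rollback", "executed_at": "2026-04-01T00:00:00Z"},
]


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing Z into an aware datetime."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def writes_authorized_by(log, approver, start, end):
    """Return rows authorized by `approver` with start <= executed_at < end."""
    return [
        row for row in log
        if row["approver"] == approver
        and start <= parse_utc(row["executed_at"]) < end
    ]


march = writes_authorized_by(
    AUDIT_LOG,
    "policy:emergency-rollback",
    datetime(2026, 3, 1, tzinfo=timezone.utc),
    datetime(2026, 4, 1, tzinfo=timezone.utc),
)
print([row["cert_id"] for row in march])
```

The filter works because the authorizing policy is a field on every record, rather than something to be inferred from which identity made the call.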

How it fits an existing stack

You do not need to throw out your identity provider, your continuous deployment pipeline, or your change management tool. The broker sits next to them.

flowchart LR
 A[Agent] -->|1. proposes change| S[Signer]
 S -->|2. checks evidence: tests, approvals, risk| E[Evidence stores]
 S -->|3. issues short-lived certificate| A
 A -->|4. calls broker with certificate| B[Sovereign Execution Broker]
 B -->|5. verifies signature, scope, window| B
 B -->|6. executes mutation| P[Production systems]
 B -->|7. writes audit record| L[Audit log]

The agent does the reasoning. The signer does the gatekeeping. The broker does the acting. The audit log gives you a per-change record that an auditor or an incident reviewer can read without understanding the agent.

A worked example: an agent that ships hotfixes

Picture an on-call agent that triages production alerts and, when confident, ships a hotfix. Without a broker, you give that agent deploy permissions and pray. With a broker, the flow looks like this.

The agent prepares a change and requests a certificate. The signer checks that the pull request has green tests, that the diff touches only files in an allowed set, that the risk model scores the change below a threshold, and that the current time is inside a permitted deploy window. If all of that holds, it returns a certificate good for 90 seconds, scoped to one service in one region.

Here is what the certificate looks like on the wire. The important bits are the scope, the evidence references, and the expiry.

{
  "cert_id": "c_8f3a1d",
  "subject": "agent:oncall-v3",
  "action": "deploy",
  "resource": "service:checkout/region:us-east-1",
  "scope": {
    "image_digest": "sha256:9c1e...",
    "max_replicas": 12
  },
  "evidence": {
    "pr": "github.com/acme/checkout/pull/4821",
    "tests": "ci-run:2741@passed",
    "risk_score": 0.12,
    "approver": "policy:auto-low-risk-v2"
  },
  "not_before": "2026-06-20T14:00:00Z",
  "not_after": "2026-06-20T14:01:30Z",
  "signature": "..."
}

That certificate is a single permission slip for a single change. It expires in 90 seconds. It names the exact image and region. The broker will reject anything that does not match.
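The signer-side checks described earlier (green tests, allowed files, risk threshold, deploy window) might be sketched as follows. Every threshold, path pattern, and field name here is an assumption chosen for illustration, and the signing step itself is left out: the function returns only the unsigned payload that a signer would then sign.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from fnmatch import fnmatch

# All thresholds, paths, and field names below are illustrative assumptions.
ALLOWED_PATHS = ["services/checkout/*"]
MAX_RISK = 0.2
DEPLOY_HOURS_UTC = range(9, 18)      # permitted deploy window, 09:00 to 17:59 UTC
CERT_TTL = timedelta(seconds=90)


@dataclass
class Proposal:
    tests_passed: bool
    changed_files: list
    risk_score: float
    resource: str
    image_digest: str


def issue_certificate(p: Proposal, now: datetime):
    """Return an unsigned certificate payload, or None if any precondition fails."""
    if not p.tests_passed:
        return None
    if not all(any(fnmatch(f, pat) for pat in ALLOWED_PATHS) for f in p.changed_files):
        return None
    if p.risk_score >= MAX_RISK:
        return None
    if now.hour not in DEPLOY_HOURS_UTC:
        return None
    return {
        "action": "deploy",
        "resource": p.resource,
        "scope": {"image_digest": p.image_digest},
        "not_before": now.isoformat(),
        "not_after": (now + CERT_TTL).isoformat(),
    }


now = datetime(2026, 6, 20, 14, 0, tzinfo=timezone.utc)
low_risk = Proposal(True, ["services/checkout/handler.py"], 0.12,
                    "service:checkout/region:us-east-1", "sha256:example")
high_risk = Proposal(True, ["services/checkout/handler.py"], 0.90,
                     "service:checkout/region:us-east-1", "sha256:example")

print(issue_certificate(low_risk, now) is not None)
print(issue_certificate(high_risk, now) is None)
```

Each precondition is a separate early return, so a policy reviewer can read the function top to bottom and see exactly which evidence is required.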

On the broker side, the verification is small and boring, which is the point. Boring code is auditable code.

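# Verifies a certificate before letting any mutation through.
# Rejects on bad signature, wrong scope, or expired window.

import json
from datetime import datetime, timezone

from nacl.exceptions import BadSignatureError
from nacl.signing import VerifyKey

def parse(ts: str) -> datetime:
    # Certificate timestamps are ISO 8601 UTC, e.g. "2026-06-20T14:00:00Z".
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def deny(reason: str) -> dict:
    return {"allowed": False, "reason": reason}

def authorize(request: dict, cert: bytes, signer_pubkey: VerifyKey) -> dict:
    try:
        # cert is the signed message: the signature followed by the JSON payload.
        payload = signer_pubkey.verify(cert).decode()
    except BadSignatureError:
        return deny("bad signature")

    c = json.loads(payload)
    now = datetime.now(timezone.utc)

    if not (parse(c["not_before"]) <= now <= parse(c["not_after"])):
        return deny("certificate outside validity window")

    # The listing is cut off at this check in the source. The scope comparison
    # below is an assumed completion; the request field names are illustrative.
    if c["action"] != request.get("action") or c["resource"] != request.get("resource"):
        return deny("certificate does not match requested action or resource")
    if c["scope"]["image_digest"] != request.get("image_digest"):
        return deny("image digest outside certificate scope")

    return {"allowed": True, "cert_id": c["cert_id"]}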

What this gives the operator: every deploy the agent makes has a paper trail that names the pull request, the test run, the risk score, and the policy that authorized it. If the agent goes haywire and tries to deploy something it should not, the broker refuses, and the refusal itself is logged.
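As a rough illustration of that paper trail, the sketch below flattens a certificate into one audit line and records refusals in the same shape as approvals. The record layout and the `decision` and `reason` fields are assumptions, not a prescribed format.

```python
import json
from datetime import datetime, timezone


def audit_record(cert: dict, decision: str, reason: str = "") -> str:
    """Build one JSON audit line for an allow or deny decision."""
    evidence = cert.get("evidence", {})
    record = {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "decision": decision,          # "allow" or "deny"
        "reason": reason,              # empty for allowed requests
        "cert_id": cert.get("cert_id"),
        "subject": cert.get("subject"),
        "action": cert.get("action"),
        "resource": cert.get("resource"),
        "pr": evidence.get("pr"),
        "tests": evidence.get("tests"),
        "risk_score": evidence.get("risk_score"),
        "approver": evidence.get("approver"),
    }
    return json.dumps(record, sort_keys=True)


example_cert = {
    "cert_id": "c_8f3a1d",
    "subject": "agent:oncall-v3",
    "action": "deploy",
    "resource": "service:checkout/region:us-east-1",
    "evidence": {
        "pr": "github.com/acme/checkout/pull/4821",
        "tests": "ci-run:2741@passed",
        "risk_score": 0.12,
        "approver": "policy:auto-low-risk-v2",
    },
}

print(audit_record(example_cert, "allow"))
print(audit_record(example_cert, "deny", "certificate outside validity window"))
```

Logging denials with the same fields as approvals means an incident reviewer can see what the agent attempted, not only what it was permitted to do.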

What this changes for the operator

The shift is not technical. It is who owns what.

Concern	Without a broker	With a broker	Who owns it
Production write access	Held by the agent's service account	Held only by the broker	Platform team
Per-action authorization	Implicit in the agent's prompt	Explicit certificate per action	Security and policy team
Audit trail	IAM logs plus prompt logs	Certificate log, one row per change	Compliance
Blast radius of a bad agent decision	Anything its identity can do	Only what a valid certificate permits	Policy team
Time to add a new agent capability	Edit IAM, hope	Add a policy that issues certificates	Policy team

The column that matters for a COO or operations lead is the last one. The broker pattern moves the decision about what an agent is allowed to do today out of code and into policy. Policy changes are reviewable, versioned, and revertible. Code changes that quietly widen an IAM role are none of those things.

Figure: Ownership shift from engineering to policy

Introducing the pattern without a rewrite

You do not need to adopt this everywhere at once. A useful order of operations:

Pick one high-stakes action the agent already performs or will perform soon. Deploys, refunds, customer data exports, and infrastructure changes are good candidates.
Stand up a broker in front of just that action. Everything else keeps working as it does today.
Define the certificate schema for that action: what scope fields, what evidence, what expiry.
Write the signer policy as code. Start strict. Loosen only with evidence that the agent behaves.
Run in shadow mode for two weeks: the agent calls the broker, the broker logs allow or deny, but the existing path still executes. Compare. Fix gaps.
Cut over. Revoke the agent's direct write credentials for that action.
Repeat for the next action.
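For the schema step, one possible shape is a typed record that refuses to exist with an invalid window. The sketch below uses refunds as the action; the field names, the `RefundScope` type, and the five-minute `MAX_TTL` cap are all hypothetical choices rather than part of any standard.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_TTL = timedelta(minutes=5)  # assumed upper bound on certificate lifetime


@dataclass(frozen=True)
class RefundScope:
    order_id: str
    max_amount_cents: int


@dataclass(frozen=True)
class RefundCertificate:
    cert_id: str
    subject: str
    scope: RefundScope
    evidence: dict
    not_before: datetime
    not_after: datetime

    def __post_init__(self):
        # Reject malformed certificates at construction time.
        if self.not_after <= self.not_before:
            raise ValueError("not_after must be later than not_before")
        if self.not_after - self.not_before > MAX_TTL:
            raise ValueError("certificate lifetime exceeds MAX_TTL")
        if self.scope.max_amount_cents <= 0:
            raise ValueError("max_amount_cents must be positive")


start = datetime(2026, 6, 20, 14, 0, tzinfo=timezone.utc)
cert = RefundCertificate(
    cert_id="c_example",
    subject="agent:support-v1",
    scope=RefundScope(order_id="order-123", max_amount_cents=2500),
    evidence={"approver": "policy:example"},
    not_before=start,
    not_after=start + timedelta(seconds=90),
)
print(cert.scope.max_amount_cents)
```

Putting the lifetime cap in the schema rather than in each policy keeps a loosened policy from quietly issuing long-lived certificates.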

A minimal shadow-mode rollout in a continuous deployment pipeline can look like this.

# Shadow-mode broker check inserted into a deploy pipeline.
# Reports whether the action would have been allowed,
# without changing the existing deploy behavior yet.
# Assumes the request-certificate step makes the certificate available as $CERT.

deploy_checkout:
  steps:
    - name: request-certificate
      run: agent-cli request-cert --action deploy --service checkout --region us-east-1
      continue-on-error: true

    - name: broker-shadow-check
      run: broker-cli verify --cert "$CERT" --shadow
      continue-on-error: true

    - name: existing-deploy
      run: ./deploy.sh checkout us-east-1

After a couple of weeks you will know two things: how often the broker would have blocked the agent (telling you about agent behavior), and how often the broker would have blocked a legitimate change (telling you about your policy). Both numbers should go down before you flip out of shadow mode.
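Those two numbers can be computed from the shadow log once someone has labeled each change as legitimate or not. The sketch below assumes a hypothetical log format with a `broker_decision` field and a human-assigned `legitimate` flag; neither name comes from a real tool.

```python
def shadow_summary(entries):
    """Summarize shadow-mode results as fractions of all logged changes."""
    total = len(entries)
    if total == 0:
        return {"deny_rate": 0.0, "false_block_rate": 0.0}
    denied = [e for e in entries if e["broker_decision"] == "deny"]
    # Denials of changes a reviewer judged legitimate point at policy gaps.
    false_blocks = [e for e in denied if e["legitimate"]]
    return {
        "deny_rate": len(denied) / total,
        "false_block_rate": len(false_blocks) / total,
    }


# Hypothetical shadow log: four changes, two of which the broker would deny.
shadow_log = [
    {"change": "deploy-1", "broker_decision": "allow", "legitimate": True},
    {"change": "deploy-2", "broker_decision": "deny", "legitimate": True},
    {"change": "deploy-3", "broker_decision": "deny", "legitimate": False},
    {"change": "deploy-4", "broker_decision": "allow", "legitimate": True},
]

print(shadow_summary(shadow_log))
```

The gap between the two rates is the share of denials attributable to the agent; the false-block rate is the part that policy work can reduce.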

Where this fits in the broader agentic operating model

Eval-driven operations gives you a way to measure whether an agent's decisions are good. The broker pattern gives you a way to bound the damage when they are not. The two are complements. Evals run before and after; the broker runs at the moment of action. Together they let an operations leader say something specific to the board: "Our agents handle X percent of incidents end to end. Every production change they make is certificate-bound. Mean time to detect a bad change is Y. Mean blast radius is Z."

That sentence is what AI governance looks like in practice. Not a policy PDF, but a system where the answer to "what is this agent allowed to do right now?" is a signed object you can show someone.
