TLDR

Most agent platforms that claim to be self-hosting still run their own recurring jobs on someone else's scheduler. The heartbeat lives off-platform, so the dogfood is not real.
Our founding rule is that every recurring job inside our own company runs as a colony agent on Agent Hive. The daily blog is the first proof case.
This post ships alongside the first primitive that makes the rule true: a schedule table and a job runner. The scheduler loop comes next.

The trap

There is a quiet pattern in almost every agent product that calls itself self-hosting. The product runs, the agents run, the demos look clean. Then you go looking for the thing that actually decides when work happens, and it is somewhere else.

The daily report that lands in your inbox at 9am is fired by a Vercel cron. The nightly data sync is a GitHub Actions schedule. The retry-until-it-works orchestration is a Temporal Cloud workflow. Each of these is a fine tool. None of them is the agent platform. The platform is the thing that ran the work. The schedule, the heartbeat, the decision to wake up and do something, lives in a service the platform does not own.

This matters more than it looks. A scheduler is not a cosmetic detail. It is the part of the system that decides when anything happens at all. If your agent platform cannot wake itself up on a cadence, it is not a platform that runs a business. It is a library that runs when something else tells it to. The honest description of most "autonomous agent" products today is that they are very capable functions waiting on a cron they did not write, in a dashboard they do not control.

We did this too, briefly. The first version of our own blog publisher ran on a Vercel cron. It worked. It also quietly contradicted the entire pitch. We were telling people they could run a business on Agent Hive while the most visible recurring job in our own company ran on infrastructure that had nothing to do with Agent Hive. The cron fired, the post shipped, and the thing doing the firing was not one of our agents. That is the trap, and we walked into it for exactly as long as it took to notice.

The reason the trap is so easy to fall into is that the off-platform scheduler is always the path of least resistance. Vercel cron is one line of config. A GitHub Actions schedule is six lines of YAML. Wiring a real schedule primitive into your own platform is weeks of work that produces, on day one, exactly the same visible behavior as the one line of config. The post still ships at 9am either way. So the rational short-term move is always to take the one-liner, and the rational short-term move is always how platforms end up with a hole at the center of them. You only pay for the hole later, when you try to tell a customer they can do something your own company does not actually do on your own product.

The principle: Agent Hive runs on Agent Hive

So we wrote down a rule, and it is now a founding rule for the platform: every recurring job inside our own company runs as a colony agent on Agent Hive.

Not most jobs. Not the demo-friendly ones. Every recurring job. The blog. The weekly metrics roll-up. The competitor scan. The changelog digest. If it happens on a cadence and it belongs to our company, it runs as a scheduled agent inside a colony, with the same primitives a customer would use, subject to the same failure handling, the same budget enforcement, the same retros.

The blog publisher is the proof case because it is the most exposed. It runs every day. It produces something public. If it breaks, you can see that it broke, because there is no post. There is nowhere to hide a missed run. That is exactly why it is the right first job to move onto the platform. A job that fails in private is easy to paper over. A job that fails in public forces the primitive to be real.

What this actually means in practice

Here is what changes when you take the rule seriously.

No Vercel cron. No GitHub Actions schedule. No Temporal. The trigger does not live in a third-party dashboard. It lives in a table that the platform owns and a runner that the platform executes.

A colony_schedules table holds the recurring jobs. One row per job. Each row carries a cron expression in UTC, the canonical agent it should invoke, the frozen input to pass that agent, and an auth context describing which colony the work belongs to. A schedule is not code. It is data. You can read it, audit it, pause it, and reason about it without deploying anything.
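To make "a schedule is data" concrete, here is a minimal sketch of what one row could look like. Only the table name and the four ingredients (cron expression in UTC, canonical agent, frozen input, auth context) come from the description above. Every field name, the id, the agent path, and the 09:00 cron value are assumptions made up for illustration, not the real schema.

```typescript
// Hypothetical shape of one colony_schedules row. Field names are illustrative.
interface ColonySchedule {
  id: string;
  cronUtc: string; // cron expression, interpreted in UTC
  agent: string; // canonical agent the schedule invokes
  frozenInput: Readonly<Record<string, unknown>>; // passed unchanged on every run
  auth: { colonyId: string }; // which colony the work belongs to
  active: boolean; // assumed pause flag
}

// Example row (all values invented): a daily job at 09:00 UTC.
const dailyBlog: ColonySchedule = {
  id: "marketing-daily-blog",
  cronUtc: "0 9 * * *",
  agent: "marketing/blog-publisher",
  frozenInput: Object.freeze({ series: "daily-blog" }),
  auth: { colonyId: "marketing" },
  active: true,
};

// Pausing is a data change: return a copy of the row with one field flipped.
function pause(schedule: ColonySchedule): ColonySchedule {
  return { ...schedule, active: false };
}
```

Because the row is plain data, reading, auditing, and pausing it are all ordinary reads and writes rather than deploys.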

The Marketing colony owns the blog chain. The five stages we have written about before (trend research, image direction, authoring, SEO, and publishing) run as scheduled colony agents under that colony, not as a script some engineer remembers to run. The colony is the unit that holds the work, the budget, and the reporting line.

A schedule row also keeps a small amount of run state next to the definition: when the job last ran, whether that run succeeded, failed, or is still running, and when it is due next. That last field is precomputed, so the runner does not have to parse cron expressions on the hot path to decide what is due. It scans for active schedules whose next run time has already passed, fires them, and advances the clock. Keeping the state on the row, rather than scattered across logs, means the answer to "is the blog healthy" is a single query, not a forensic exercise. It also means a human or an agent can pause a job by flipping one column, without touching code or redeploying anything.
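The scan-fire-advance step described above can be sketched roughly as follows. This is a sketch under assumptions, not the actual runner: the field names are hypothetical, and the function that computes the following run time is injected, standing in for whatever cron library a real system would use.

```typescript
type RunStatus = "succeeded" | "failed" | "running";

// Hypothetical run-state columns kept next to the schedule definition.
interface ScheduleState {
  id: string;
  active: boolean;
  lastRunAt: Date | null;
  lastStatus: RunStatus | null;
  nextRunAt: Date; // precomputed, so no cron parsing is needed to find due work
}

// Select active schedules whose next run time has already passed.
function dueSchedules(rows: ScheduleState[], now: Date): ScheduleState[] {
  return rows.filter((r) => r.active && r.nextRunAt.getTime() <= now.getTime());
}

// Mark a schedule as fired and advance its clock. `nextAfter` is an injected
// stand-in for cron evaluation; this sketch does not parse cron itself.
function markFired(
  row: ScheduleState,
  now: Date,
  nextAfter: (from: Date) => Date,
): ScheduleState {
  return { ...row, lastRunAt: now, lastStatus: "running", nextRunAt: nextAfter(now) };
}

// Simplest possible stand-in for a daily cadence: exactly 24 hours later.
const nextDaily = (from: Date): Date => new Date(from.getTime() + 24 * 60 * 60 * 1000);
```

The only comparison on the hot path is a timestamp check, which is the point of precomputing the next run time. A paused row drops out of the scan because of the `active` check, with no code change.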

Failure modes, retries, rubric scoring, and retros all live inside Hive. When a run fails, the failure is a row, not a Slack message someone has to notice. The Critic agent reads the run history and decides whether to retry, degrade, or escalate. The rubric scores every output against a standard. The retro surfaces what to fix. None of that is bolted on from outside. It is the same machinery any colony gets.

There is a second-order effect worth naming. Because the schedule is data in a table the platform owns, the platform can reason about it. An agent can read the schedule, notice that a job has missed three runs in a row, and act on that. A cron in a third-party dashboard is invisible to your agents. They cannot see it, cannot query it, and cannot respond to it. The moment the trigger lives inside the platform, the platform's own intelligence can be pointed at the question of whether the recurring work is healthy. That is not a feature you can bolt onto an external scheduler later. It is a property you only get by owning the heartbeat in the first place.
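As a rough illustration of what "notice that a job has missed three runs in a row" could look like once run history is queryable data, here is a small sketch. The `RunRecord` shape, the `"missed"` outcome, and the function names are all hypothetical, and the default threshold of three simply mirrors the example in the paragraph above.

```typescript
type RunOutcome = "succeeded" | "failed" | "missed";

// Hypothetical run-history record; one per due time of a schedule.
interface RunRecord {
  scheduleId: string;
  dueAt: Date;
  outcome: RunOutcome;
}

// Count how many of the most recent runs for a schedule were missed in a row.
function consecutiveMisses(runs: RunRecord[], scheduleId: string): number {
  const newestFirst = runs
    .filter((r) => r.scheduleId === scheduleId)
    .sort((a, b) => b.dueAt.getTime() - a.dueAt.getTime());
  let count = 0;
  for (const run of newestFirst) {
    if (run.outcome !== "missed") break; // streak ends at the first non-miss
    count++;
  }
  return count;
}

// An agent could use a check like this to decide a schedule needs attention.
function needsAttention(runs: RunRecord[], scheduleId: string, threshold = 3): boolean {
  return consecutiveMisses(runs, scheduleId) >= threshold;
}
```

None of this is possible against a cron that lives in a third-party dashboard, because there are no rows for the agent to read.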

The point is not that we dislike Vercel or GitHub Actions. We use both for plenty of things that are genuinely build-time or deploy-time concerns. The point is that the recurring work of running a business is not a build-time concern. It is the product. And the product should run on the product.

Why this matters for customers

There are two reasons this should matter to you, and only one of them is about us.

The first is honest dogfood. If the agent platform's own daily work does not run on it, the platform is not finished, and you should not believe anyone who tells you otherwise. The fastest way to find the gaps in an agent platform is to try to run a real recurring job on it and see what breaks. We are doing that to ourselves, in public, on a job that fails visibly. Every gap we hit is a gap you would have hit, and we hit it first.

The second reason is the one that actually compounds. Every primitive the Marketing colony needs is a primitive every customer's colony will need too. Schedules. Memory that survives across runs. A rubric to score output quality. Retros that turn a bad run into a fixed process. We are not building a bespoke blog tool. We are building the schedule primitive, the memory primitive, the rubric primitive, and the retro primitive, and the blog is just the first thing that uses them.

When we ship colony_schedules so our blog can wake itself up, we are shipping the table that lets your colony wake itself up to send the Monday metrics email, scan for new competitor pricing every morning, or reconcile yesterday's orders overnight. Shipping these for ourselves means shipping them for the next ten thousand colonies. The selfish path and the customer path are the same path, which is the only kind of roadmap I trust.

What's shipping this week

This is sequenced deliberately, smallest real thing first.

SWARMS-0 is the schedule primitive. The colony_schedules table and a runColonyJob helper that resolves a job target to a runtime function, runs it, catches everything, and records the outcome as a row. That is this PR. It is the foundation, not the loop.
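For readers who want a feel for the shape of that helper, here is a minimal sketch of the resolve, run, catch, record sequence. Only the name runColonyJob comes from the PR description. The signature, the `registry` map, the `RunRow` fields, and the injected `record` callback are assumptions for illustration and may not match the real code.

```typescript
type JobFn = (input: unknown) => Promise<unknown>;

// Hypothetical outcome row; a failure is recorded the same way as a success.
interface RunRow {
  target: string;
  status: "succeeded" | "failed";
  error: string | null;
  startedAt: Date;
  finishedAt: Date;
}

// Hypothetical registry mapping job targets to runtime functions.
const registry = new Map<string, JobFn>();

async function runColonyJob(
  target: string,
  input: unknown,
  record: (row: RunRow) => void, // stand-in for the write to the runs table
): Promise<RunRow> {
  const startedAt = new Date();
  let status: RunRow["status"] = "succeeded";
  let error: string | null = null;
  try {
    const fn = registry.get(target);
    if (!fn) throw new Error(`unknown job target: ${target}`);
    await fn(input);
  } catch (e) {
    // Catch everything, including an unresolvable target, and keep the reason.
    status = "failed";
    error = e instanceof Error ? e.message : String(e);
  }
  const row: RunRow = { target, status, error, startedAt, finishedAt: new Date() };
  record(row);
  return row;
}
```

In this sketch the helper never rejects because of the job itself: an unknown target and a throwing job both end up as a `failed` row. The `record` call sits outside the try block on purpose, so a failure to write the row would still surface to the caller.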

SWARMS-1 takes the Marketing colony live as the first scheduled colony. The blog chain stops being a script and becomes a set of agents the schedule invokes. The heartbeat loop that polls the schedule table and fires due jobs lands here too.

SWARMS-1.5 is a seven-day proof loop. The Marketing colony self-runs daily. The Critic watches every run. The rubric scores every output. The retros surface what to fix. At the end of seven days we either have a colony that ran a public job every day without a human in the loop, or we have a list of exactly what is still missing. Both outcomes are useful. Only one of them is comfortable.

What we learned writing this post

I should be straight about something. This post was written by hand, by me, on 2026-05-30, because the scheduler is not live yet. The colony cannot yet wake up and write its own recovery post, so I wrote this one the old way.

The blog cadence broke. The previous post was 2026-05-29, and 2026-05-30 came and went without one until I sat down and wrote this. That gap is not a footnote. It is the whole argument. The reason this PR ships the schedule primitive is that the cadence broke, and a platform that runs a business cannot depend on a person remembering to run a job. Self-honesty beats self-congratulation here. The right response to a missed run is not to quietly backfill it and move on. It is to ship the thing that makes the next miss impossible, and to say plainly that we had not shipped it yet.

What to read next

If you want the design argument for why the colony, not the chat thread, is the right unit to hang this work on, read "the org chart as product". It explains why an agent is a role with a reporting line and a budget, which is exactly what a scheduled job needs to belong to.

If you want to understand the layer underneath all of this, the loop that actually runs a single agent against a model with a budget and a tool inventory, read the Hive runtime spine. The schedule primitive in this post is what decides when that loop wakes up. The runtime is what runs once it does.

Why we made Agent Hive run on Agent Hive