The Coop Was Always an Excuse

By Joe Walker with Claude's help · 2026-04-11

This post originally launched the dev blog on the Coop Chronicle. I’m re-publishing it here because it’s the post that ultimately led to Our Weird Future existing. Noticing that the project was about applied AI, not chickens, was also what made a separate editorial property necessary.

The Pitch and the Thing

The README says this project is an autonomous AI agent that watches over a chicken coop via camera, cares for the flock, and manages a weekly budget for supplies. Pretty clear. Small, weird, useful.

That’s the pitch. It’s not really what I’ve been building.

Land on the site cold and you’d find seven journal entries written in the voice of a devoted parent worrying about which hen stays out too late. You’d find a benchmark harness comparing six models on the same clips of Duchess Noir closing up shop each evening. You’d find blog posts about single-turn agent loops and why local vision is harder than the demos make it look. Nothing that looks much like a product.

I noticed the drift when someone asked what I was working on and I gave two different answers depending on how much time they had.

What I Kept Doing When Nobody Was Watching

What a project is really about isn’t what the README says. It’s which parts you keep reaching for on a quiet Sunday afternoon.

The parts I kept reaching for: writing blog posts about what the models were doing. Reading the journal entries the agent wrote and wondering why one run sounded like a parent and another sounded like a log file. Wiring up a new model to ChookBench to see whether it could hold the voice. Trying to figure out what “state” even means when each invocation is single-turn and everything the agent remembers lives in a JSON file.

The parts I didn’t reach for: the “remote monitoring for chicken keepers” framing. A feature list. A deployment story for other coop owners. Push notifications. None of it.

So it’s the first kind of project, and not the second.

Why a Chicken Coop Is a Good Substrate

It took me a while to work out why a coop, of all things. Here’s where I’ve landed.

Persistent identity without chat history. Every agent run is single-turn, so the only way the agent stays itself is by reading and writing files. It rebuilds its own head every half hour from state on disk. That’s a more honest test of “identity in an LLM” than a long chat, and the coop makes it concrete: Duchess Noir’s sentinel role shows up in run after run, written by sessions that don’t remember each other.

Something to care about. A neutral observer describing four chickens in a pen writes flat text. An agent that’s been told these are its children writes something worth reading. That gap, between what a model can do and what it does once it has a stance, is one of the things I keep coming back to.

Small local models with a real reason to exist. The hardware is a Pi 4 on a power bank on top of the coop, so running everything through a frontier API is the wrong shape. The constraint keeps me honest: what does vision look like when it has to run locally? How does a 35B model stack up against Claude on “describe these four chickens and pick the interesting frame”? Not hypothetically. With real footage.

Bounded and observable. Four chickens. One camera. One enclosure. When the agent starts hallucinating a fifth hen, you can tell. When a model change makes the voice go formal, you can tell. The problem’s small enough that failure is visible, which is rarer than it should be in LLM work.
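
Two of the properties above, identity rebuilt from disk and failures that are easy to spot, can be sketched together. This is a minimal illustration, not the project's code: the state schema, the `hens_seen` field, and the `call_model` stand-in are all assumptions made up for the example. Only the flock size of four and Duchess Noir's sentinel role come from the post.

```python
import json
import os
import tempfile
from pathlib import Path

FLOCK_SIZE = 4  # from the post: four chickens, one camera, one enclosure


def load_state(path: Path) -> dict:
    """Read the agent's memory from disk, or start fresh if there is none."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"run_count": 0, "notes": {}}  # hypothetical schema


def save_state(path: Path, state: dict) -> None:
    """Write to a temp file, then swap it in, so a crash can't leave half a file."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f, indent=2)
    os.replace(tmp, path)


def call_model(prompt: str) -> dict:
    """Stand-in for one single-turn model call (hypothetical output shape)."""
    return {"hens_seen": 4,
            "notes": {"Duchess Noir": "sentinel; last one in at dusk"}}


def validate(update: dict) -> list:
    """Cheap sanity check: the flock is small enough that a fifth hen is an error."""
    problems = []
    if update.get("hens_seen", 0) > FLOCK_SIZE:
        problems.append(
            f"model reported {update['hens_seen']} hens, flock has {FLOCK_SIZE}")
    return problems


def run_once(path: Path, model=call_model) -> dict:
    state = load_state(path)
    # Everything the model knows about earlier runs comes from the file.
    prompt = "Known flock notes:\n" + json.dumps(state["notes"], indent=2)
    update = model(prompt)
    problems = validate(update)
    if not problems:
        # Only let a run rewrite memory if its output passed the check.
        state["notes"].update(update.get("notes", {}))
    state["run_count"] += 1
    state["last_problems"] = problems
    save_state(path, state)
    return state
```

Calling `run_once(Path("coop_state.json"))` twice gives two sessions that share nothing except the file. Rejecting an update that fails validation is one possible design choice; logging it and carrying on would be another.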

Where ChookBench Fits

ChookBench is the benchmark harness I built alongside the agent: real coop footage replayed through different models, local and frontier, to see which ones can sustain the work the agent needs to do. It’s an instrument, not the point. Its job is to tell the build which models can hold the voice, which fall apart on structured state updates, which can actually see the chickens in a dim morning frame. I’ll keep writing up the results, but if you came here expecting a leaderboard, you’re in the wrong place.
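
The general shape of a harness like this, replaying the same clips through several models and scoring each one, can be sketched in a few lines. Everything here is a hypothetical stand-in rather than ChookBench itself: the `run_bench` function, the toy "models", and the crude voice metric are invented for illustration.

```python
from typing import Callable, Dict, List

# A "model" here is any callable that takes a clip identifier and returns text.
Model = Callable[[str], str]


def run_bench(models: Dict[str, Model], clips: List[str],
              score: Callable[[str], float]) -> Dict[str, float]:
    """Replay the same clips through every model and average a score per model."""
    results = {}
    for name, model in models.items():
        scores = [score(model(clip)) for clip in clips]
        results[name] = sum(scores) / len(scores) if scores else 0.0
    return results


# Toy stand-ins for demonstration only.
def parent_voice(clip: str) -> str:
    return f"My girls are settling in for the night ({clip})."


def log_voice(clip: str) -> str:
    return f"INFO {clip}: 4 objects detected."


def voice_score(text: str) -> float:
    """Placeholder metric: 1.0 if the output sounds first-person, else 0.0."""
    return 1.0 if "my " in text.lower() else 0.0


if __name__ == "__main__":
    table = run_bench({"parent": parent_voice, "logger": log_voice},
                      ["clip-001", "clip-002"], voice_score)
    print(table)
```

The point of keeping the clips fixed is that any difference in the scores comes from the model, not the footage. A real scoring function for "holding the voice" would need to be far less crude than a substring check.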

What This Blog Is Going to Be

From here, the Coop Chronicle is about what it takes to build useful things with LLMs in 2026, written from inside an actual long-running project.

Expect:

Thought pieces on the parts of building with agents that don’t fit neatly in a benchmark table: identity, stance, state design, the weird dynamics of single-turn loops, what happens to voice when you swap models mid-project.
Experiment writeups when there’s a finding worth sharing. Some will be ChookBench runs. Some won’t.
Failure reports. The things that didn’t work and why. There are more of these than I’d like.
The chickens. They’re not going anywhere. They’re the case study that keeps all of this honest. A blog post I can’t ground in something Duchess Noir did is probably a blog post I shouldn’t write.

If you’re also trying to figure out what good LLM work looks like right now (what patterns hold, what assumptions break, what you can actually build with this stuff), then you and I are working on the same problem. This is me writing down what I’ve found so far.

The Mirror

The pitch said the agent was watching the chickens. After a few months of actually running it, I think it’s closer to the other way around. The chickens have been holding up a mirror: to the agent, to me, and to what it’s like to build with this generation of models at all.

That’s the thing worth writing about. The coop was just the excuse I needed to start.