Borrowed from everywhere: seeding an LLM with strange domains

By Joe Walker with Claude's help · 2026-05-13

When I ask a model to brainstorm, what does it take to push it past the obvious five answers?

Innovation, as far as anyone can tell, is recombination

Pretty much everyone who’s studied where new ideas come from says the same thing: they don’t come from nowhere. Someone takes a pattern that’s doing real work in one field and notices it has the same shape in another.

Einstein called it combinatory play: “the essential feature in productive thought.” Newton, the standing on shoulders line. W. Brian Arthur’s The Nature of Technology argues, with receipts, that technologies are built from earlier technologies. “Ex nihilo” inventions are vanishingly rare, and the engine of progress is recombination of existing components and concepts. The empirical innovation research lines up: studies of biopharmaceutical patents find that the highest-impact inventions come from combining knowledge across local, adjacent, and distant domains, not from staying in one well.

An LLM already contains the library

The strange part: a modern LLM has read, in some form, a huge slice of every field we’ve ever written about. Coral reef ecology, gothic cathedral construction, container shipping logistics, MTG deck archetypes, jazz improvisation, immune signalling, speedrun routing: it’s all in there. Whatever cross-domain analogy a clever human might reach for, the model has the raw material to reach for it too.

But ask one to brainstorm and it doesn’t. It clusters around the same five obvious answers. Ask for “ten ways to analyse this video” and you’ll get pixel-diff bisection, frame sampling, motion detection, scene change detection, a couple of variants, and then it starts repeating itself. It’s standing in front of the library reading the front shelf.

The hypothesis I wanted to test was small: what if I just tell it which shelf to walk to?

The experiment

The chook-manager pipeline does motion-detection plus vision-LLM-on-keyframes. I wanted alternatives, and I mean actually different mechanisms, not parameter tweaks. Here’s the setup:

1. Write the problem once.
2. Fan out N parallel calls to a local Qwen3.6-27B (a 27-billion-parameter open model running on my own machine). A small control group gets the neutral prompt. The rest get the same prompt prefixed with “You are reasoning by analogy from <domain>. First name a specific pattern from <domain> that has the same shape as this problem. Then generate 3 approaches inspired by that pattern.”
3. Use a different seed for each treatment call: coral reefs, blacksmithing, Pokémon IV breeding, Marvel post-credits scenes, Magic: The Gathering, Lord of the Rings, anime tournament arcs, Studio Ghibli backgrounds.
4. Diff the outputs against the inventory of mechanisms already seen, mark new families, then re-run with the saturated families explicitly excluded in the prompt.
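A minimal sketch of the fan-out step, under stated assumptions: `fan_out`, `call_model`, `SEED_TEMPLATE`, `n_controls` and `max_workers` are hypothetical names introduced here, and a thread pool is only one plausible way to get "parallel calls". `call_model` stands in for whatever client talks to the local model; the demo swaps in a dummy so the sketch runs without a GPU.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional

# Short seed prefix as quoted in the setup above.
SEED_TEMPLATE = (
    "You are reasoning by analogy from {domain}. First name a specific pattern "
    "from {domain} that has the same shape as this problem. Then generate 3 "
    "approaches inspired by that pattern.\n\n"
)


def fan_out(problem: str,
            seeds: list[str],
            n_controls: int,
            call_model: Callable[[str], str],
            max_workers: int = 8) -> list[dict]:
    """Run control and seeded prompts in parallel and collect the outputs.

    `call_model` is a hypothetical stand-in: prompt string in, completion out.
    """
    # Controls get the bare problem; each treatment gets one seed prefix.
    jobs: list[tuple[Optional[str], str]] = [(None, problem)] * n_controls
    jobs += [(seed, SEED_TEMPLATE.format(domain=seed) + problem) for seed in seeds]

    # pool.map preserves job order, so outputs line up with jobs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outputs = list(pool.map(lambda job: call_model(job[1]), jobs))

    return [{"seed": seed, "prompt": prompt, "output": out}
            for (seed, prompt), out in zip(jobs, outputs)]


if __name__ == "__main__":
    # Dummy model: echoes the start of the prompt instead of generating text.
    fake_model = lambda prompt: prompt.splitlines()[0][:60]
    runs = fan_out("Summarise coop video cheaply.",
                   ["coral reefs", "blacksmithing"],
                   n_controls=2, call_model=fake_model)
    for run in runs:
        print(run["seed"], "->", run["output"])
```

Keeping the seed name alongside each output is what makes the later steps possible: stripping the seed word before embedding, and tracing which seed produced which mechanism.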

Each dot below is one ideation output (three mechanism proposals each). I stripped the seed-name word from each treatment first, then ran every output through a sentence-transformer embedding model (all-MiniLM-L6-v2, 384 dimensions) and projected to 2D with UMAP, so the picture reflects what the meanings look like rather than just word overlap. All 209 outputs from the four-wave run are on one canvas, coloured by wave; the eight black points are the controls (pooled across the two waves that had them; waves 3 and 4 were treatment-only). Hover any dot for the seed name; click to see the full prompt the model was given and the output it produced.

The eight black dots are the controls: independent runs of the same neutral prompt at temperature 0.9. They bunched into a tight knot. The 201 seeded runs cover roughly the rest of the canvas.

Spread. Mean pairwise distance in embedding space (the average gap between any two points, measured on the 384-dim sentence-transformer vectors, not on the 2D projection) is 1.57x higher among the pooled seeded outputs than among the controls. The model wasn’t just dressing up the same answer in costume. It was actually walking to different shelves.
Separation. The control knot sits off to one side; the seeded clouds barely touch it. The seeded prompts aren’t perturbing the default answer; they’re producing a structurally different kind of output. That separation is the part that matters for ideation: controls keep re-deriving the obvious, treatments don’t.
Wave-on-wave. Each wave on its own runs 1.49x to 1.57x wider than controls. The mechanism-count table below tells the cleaner wave-on-wave story; the scatter mainly establishes that the seeded-vs-control gap survives a real semantic distance metric.

(A 1.57x ratio is more conservative than the 1.8x from my quick TF-IDF first pass: semantic embeddings see deeper similarity than word overlap does, so the gap shrinks but doesn’t collapse. The spread numbers are computed in the original embedding space, not the 2D projection.)
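The spread ratio can be sketched as below. This is an illustration under assumptions, not the script behind the numbers above: the post does not say which distance was used, so cosine distance is assumed here, and `mean_pairwise_distance` and `spread_ratio` are hypothetical names. The demo uses random vectors in place of real embeddings.

```python
import numpy as np


def mean_pairwise_distance(vectors: np.ndarray) -> float:
    """Average cosine distance over all unordered pairs of rows.

    Assumption: cosine distance; the metric is not specified in the post.
    """
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit.T
    upper = np.triu_indices(len(unit), k=1)  # each pair once, no self-pairs
    return float(np.mean(1.0 - sims[upper]))


def spread_ratio(seeded: np.ndarray, controls: np.ndarray) -> float:
    """How many times wider the seeded cloud is than the control cloud."""
    return mean_pairwise_distance(seeded) / mean_pairwise_distance(controls)


if __name__ == "__main__":
    # Synthetic stand-ins for 384-dim embeddings: a tight knot and a wide cloud.
    rng = np.random.default_rng(0)
    centre = rng.normal(size=384)
    controls = centre + 0.2 * rng.normal(size=(8, 384))
    seeded = centre + 0.6 * rng.normal(size=(201, 384))
    print(f"spread ratio: {spread_ratio(seeded, controls):.2f}x")
```

The ratio is computed on the full-dimensional vectors. A 2D projection distorts distances, so it is useful for the picture but not for the number.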

The four waves escalated in weirdness: the seed pool started reasonable (ecology, hydrology, ER triage) and ended up in Dragon Ball power-level rankings and Cockney rhyming slang.

Wave	Seed flavour	Distinct new mechanisms
1	31 mainstream seeds, no exclusions	40
2	27 seeds, first exclusion list	+25
3	66 oblique seeds (maths, biology, coding theory), 68-family exclusion	+75
4	77 deliberately-weird seeds (pop culture, Cockney slang, MTG, LOTR)	+85

249 distinct mechanisms in roughly fourteen minutes of wall time, all running on a consumer GPU (an RTX 4090). Cost: zero.

A caveat I missed when I first wrote this section. The table conflates two things. Each wave after the first added more seeds and a longer exclusion list, so the +25 / +75 / +85 growth could be the seeds reaching wider, the exclusions pushing the model off saturated shelves, or some mix. I didn’t separate them at the time. An ablation lower down in this post does, and the answer was not what I expected.

The dedup pass was strict: strict enough that “shadow voting” and “consensus dueling” got collapsed even though one came from a Marvel seed and the other from a Pokémon one. The seeds were doing structural work, not vocabulary work.

The problem statement never changed

You might wonder whether the wave-on-wave growth in new mechanisms was just me tweaking the problem until something interesting fell out. It wasn’t. The problem statement was byte-for-byte identical across all four waves (lightly trimmed for the quote here):

A fixed camera films a small chicken coop continuously, producing 5-15 minute video clips at 1080p, 5fps. The goal is to extract a small set of meaningful observations from each clip - what the chickens are doing, how many are present and where, environmental conditions (light, weather), and anything notable or anomalous - without sending hours of raw video to a vision LLM (which is slow and expensive). The downstream consumer is an autonomous “agent” that writes a daily journal about the flock; it needs the gist of what happened, not raw frames. Different approaches must differ in their core mechanism (what algorithm, signal, representation, or workflow they use), not just parameter tuning of a common pipeline. The current implementation already does pixel-diff bisection then sends keyframes to a vision LLM, so do not re-propose that.

What did change between waves was the seed pool (more domains, weirder domains) and the exclusion list: a “mechanism families already covered, don’t re-propose these” preamble that grew from zero entries in wave 1 to sixty-eight by wave 3. That’s a deliberate forcing function: once the model has produced pixel-diff variants in wave 1, telling it explicitly “pixel-diff variants are off the table” in wave 2 pushes it further down the shelves. The exclusion list is doing the same job the seeds are (steering the model off its default attractors) but from the opposite direction.
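One way the growing exclusion preamble could be maintained between waves, as a hedged sketch. Everything named here is an assumption: `update_inventory` and `build_exclusion_preamble` are hypothetical helpers, the preamble wording only paraphrases the description above, and tagging each output with a family name is assumed to happen elsewhere (by hand or with another model call). Matching on exact lower-cased names is a simplification of a real dedup pass.

```python
def build_exclusion_preamble(covered_families: list[str]) -> str:
    """Render the 'already covered' block; empty string when nothing is banned."""
    if not covered_families:
        return ""
    lines = ["Mechanism families already covered, do not re-propose these:"]
    lines += [f"- {family}" for family in covered_families]
    return "\n".join(lines) + "\n\n"


def update_inventory(inventory: list[str], tagged_families: list[str]) -> list[str]:
    """Append families not seen before, preserving first-seen order.

    Returns only the newly added families, so a wave's '+N' can be counted.
    Comparison is case-insensitive on the family name (a simplification).
    """
    seen = {family.lower() for family in inventory}
    new = []
    for family in tagged_families:
        key = family.lower()
        if key not in seen:
            seen.add(key)
            inventory.append(family)
            new.append(family)
    return new


if __name__ == "__main__":
    inventory: list[str] = []
    wave_1 = ["pixel-diff variants", "frame sampling", "Pixel-diff variants"]
    added = update_inventory(inventory, wave_1)
    print(f"+{len(added)} new families")
    print(build_exclusion_preamble(inventory) + "Problem: <problem statement>")
```

With an empty inventory the preamble is an empty string, which matches wave 1 running with no exclusions.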

What the prompts actually looked like

The control prompt is deliberately bland:

Generate 3 distinct approaches to this problem. For each, give a one-line name and
2-3 sentences explaining the approach and what makes it different from the others.
Be concrete, not generic.

Problem: <problem statement>

Return only the 3 approaches, no preamble.

The treatment prompt adds one seed plus a rule that throws out any idea where the seed is just decoration:

You are reasoning by analogy from <DOMAIN>. First, in one sentence, name a specific
pattern, structure, or dynamic from <DOMAIN> that has the same shape as this problem.
Then generate 3 distinct approaches to the problem that are directly inspired by
patterns from <DOMAIN>.

The <DOMAIN> connection must change the *mechanism* of the idea, not just its
vocabulary. Test before submitting: if I rename your idea to remove all <DOMAIN>
words, would a generic engineer still arrive at the same proposal? If yes, the
analogy is decorative - discard that idea and find one where the <DOMAIN> pattern
dictates a specific structural choice (cadence, ordering, partitioning, eviction
rule, role split, threshold, lifecycle stage, etc.).

Problem: <problem statement>

That “if a generic engineer would arrive at the same proposal, discard it” line is doing most of the work. Without it, the model just dresses the same mechanism up in seed-flavoured costume: “use a load-balancer, but call it a Saruman.” With it, the seed has to dictate a structural choice, so the model has to pick a primitive that the seed actually maps onto.

A paired example from wave 1: same problem, same model, same temperature, two adjacent runs.

Control output (no seed):

Event-Triggered Semantic Segmentation. Use a lightweight, real-time instance segmentation model (like YOLO-World or MobileSAM) to continuously track individual chickens and classify their actions (eating, sleeping, fighting) on the edge device. Instead of sending video, the system only logs timestamped JSON events containing bounding boxes, action labels, and confidence scores, allowing the agent to reconstruct behavior patterns from structured data rather than visual frames.

A perfectly reasonable answer. Also the answer every engineer in the room would give. It’s the front shelf of the library.

Treatment output (seed: Magic: The Gathering, “mana sink” combo decks):

Pattern: “Land-Drop Stalling” or “Mana Sink” combo structures, where a deck plays inert, low-value cards to fill the stack and establish a resource baseline, only triggering the explosive win condition when specific resource thresholds are mathematically guaranteed to exist, thereby ignoring all intermediate noise that does not contribute to the final state.

Cumulative State Hash Commitment (the “Mana Bank” Checkpoint). Instead of analyzing frames continuously, the system computes a lightweight, irreversible cryptographic hash of the scene’s statistical summary (mean luminance, edge density, dominant color histogram) every 50 frames, storing only these hashes in a log-structured append-only buffer. The Vision LLM is invoked only when the Hamming distance between the current hash and the last “committed” hash exceeds a strict threshold, treating the intermediate hashes as “land drops” that maintain continuity without cost.

That’s not “Event-Triggered Semantic Segmentation in Magic costume.” The mechanism is structurally different: the seed forced cumulative state commitment with a threshold trigger, which is a real distributed-systems primitive, the same shape as a Merkle anchor or a blockchain checkpoint. The control couldn’t find that shape because nothing in the prompt told it to walk that far down the shelves.
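To make the threshold-gated commitment idea concrete, here is a rough sketch. It deliberately swaps one piece of the quoted proposal: a cryptographic hash scatters its output bits, so the Hamming distance between two digests says nothing about how similar two scenes are. The sketch uses a thermometer-coded bit signature instead, where small changes in a statistic flip few bits. `frame_summary`, `signature` and `Checkpointer` are hypothetical names, the statistics are crude stand-ins, and frames are assumed to be greyscale arrays with values in [0, 255].

```python
import numpy as np


def frame_summary(frame: np.ndarray) -> np.ndarray:
    """Cheap per-frame statistics, each in [0, 1], for a greyscale frame."""
    grey = frame.astype(np.float64) / 255.0
    mean_luminance = grey.mean()
    # Mean absolute vertical and horizontal gradient: a crude edge-density proxy.
    edge_density = (np.abs(np.diff(grey, axis=0)).mean()
                    + np.abs(np.diff(grey, axis=1)).mean()) / 2
    hist, _ = np.histogram(grey, bins=4, range=(0.0, 1.0))
    return np.concatenate([[mean_luminance, edge_density], hist / grey.size])


def signature(summary: np.ndarray, levels: int = 8) -> np.ndarray:
    """Thermometer-code each statistic so similar summaries differ in few bits."""
    thresholds = np.arange(1, levels + 1) / (levels + 1)
    return (summary[:, None] > thresholds[None, :]).ravel()


class Checkpointer:
    """Commit a new checkpoint only when the scene signature drifts far enough."""

    def __init__(self, threshold: int):
        self.threshold = threshold  # Hamming distance that triggers a commit
        self.committed = None       # signature of the last committed checkpoint
        self.log = []               # append-only list of every signature seen

    def observe(self, frame: np.ndarray) -> bool:
        """Return True when the caller should invoke the expensive vision model."""
        sig = signature(frame_summary(frame))
        self.log.append(sig)
        if self.committed is None:
            self.committed = sig
            return True  # the first observation always commits
        distance = int(np.count_nonzero(sig != self.committed))
        if distance > self.threshold:
            self.committed = sig
            return True
        return False


if __name__ == "__main__":
    cp = Checkpointer(threshold=4)
    dark = np.zeros((10, 10), dtype=np.uint8)
    bright = np.full((10, 10), 255, dtype=np.uint8)
    for name, frame in [("dark", dark), ("dark again", dark), ("bright", bright)]:
        print(name, "->", "commit" if cp.observe(frame) else "skip")
```

In the quoted proposal the signature would be taken every 50 frames; here the caller decides which frames to pass to `observe`.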

This was the pattern across the seeded runs. The seed wasn’t decoration; it was a structural pointer. “Mana sink” pointed at threshold-gated state commitment. “Pokémon IV breeding” pointed at persistent-baseline inheritance and trait carry-over. “Marvel post-credits” pointed at tiered attention economies: bundle the low-stakes filler, reserve the expensive call for the franchise-altering twist.

What surprised me

A snippet from wave 4, the Lord of the Rings seed, after I told the model to think about the Palantír and how a strong will can dominate a scrying stone:

Resonance-Locked Temporal Phase Locking. The system runs a local oscillator synchronized to the dominant periodicity of chicken movements (pecking, walking cycles) and only transmits frame sequences that cause a phase-desynchronization error exceeding a threshold, effectively treating the flock’s activity as a signal that must “lock” with the observer’s internal clock.

That is, almost word for word, a phase-locked loop (PLL: a real signal-processing primitive that synchronises an internal clock to an external signal), applied to chicken behaviour. The model knew about PLLs. It wouldn’t have reached for one if I’d asked nicely for “ten ways to analyse chicken video.” It reached for one because I’d put a Palantír in its hand and told it the problem rhymed.
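For readers who have not met one, a phase-locked loop can be sketched in a few lines. This is a simplified, event-driven software version under assumptions, not what the model proposed in detail and not part of the coop pipeline: `EventPLL`, its gains and its threshold are hypothetical, and the input is assumed to be timestamps of a repeating event such as pecks.

```python
class EventPLL:
    """Event-driven software phase-locked loop (a simplified sketch).

    An internal oscillator predicts when the next event should land. Each real
    event nudges the oscillator's phase and period toward what was observed,
    and events that land too far from the prediction are flagged.
    """

    def __init__(self, period: float, phase_gain: float = 0.5,
                 period_gain: float = 0.1, threshold: float = 0.2):
        self.period = period            # current estimate of the cycle length (s)
        self.phase_gain = phase_gain    # how hard to correct the phase
        self.period_gain = period_gain  # how hard to correct the period
        self.threshold = threshold      # flag when |error| exceeds this fraction of a cycle
        self.last_tick = None           # oscillator's most recent tick time

    def update(self, t: float) -> tuple[float, bool]:
        """Feed one event timestamp; return (phase error in cycles, flagged)."""
        if self.last_tick is None:
            self.last_tick = t
            return 0.0, False
        # Find the oscillator tick nearest to the event, at least one cycle on.
        cycles = max(1, round((t - self.last_tick) / self.period))
        predicted = self.last_tick + cycles * self.period
        error_s = t - predicted
        error = error_s / self.period
        flagged = abs(error) > self.threshold
        # Loop filter: move period and phase part of the way toward the event.
        self.period += self.period_gain * error_s / cycles
        self.last_tick = predicted + self.phase_gain * error_s
        return error, flagged


if __name__ == "__main__":
    pll = EventPLL(period=1.0)
    # Regular pecks once a second, then one that arrives late.
    for t in [0.0, 1.0, 2.0, 3.0, 4.0, 5.4]:
        error, flagged = pll.update(t)
        print(f"t={t:.1f}s error={error:+.2f} cycles flagged={flagged}")
```

Only the flagged events would be worth forwarding downstream. Everything that stays in lock is, by construction, the expected rhythm.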

That was the pattern across the weirder waves. The seed was a structural hint (phase locking, eviction policies, tournament brackets, recombinant inheritance) and the model fetched the matching technical primitive from its actual library shelf and dressed it for the problem. The analogy stops working when the seed is too generic; coral reefs produced cleaner mechanisms than “general ecology” did, because the structural hint was sharper.

This isn’t novel research, by the way. I checked. PersonaFlow does almost exactly this with explicit “expert persona” simulation for research ideation. BILLY goes further and blends persona vectors directly in activation space. There’s recent work measuring lexical diversity gains from persona-prompted generation, and a small subfield on multi-agent persona ideation pipelines with SIGDIAL 2025 already publishing on it. I’m late, not first.

At the time I read this as “domain seeding works.” That’s roughly right, but it’s not quite what’s going on, and a follow-up ablation made me come back and edit this section.

An ablation, after publishing

The wave table claims seeding is what drives the +25 / +75 / +85 growth. But every wave after the first changed two things at once: more seeds, and a longer exclusion list. To test which one was doing the work, I ran a 2x2 on the same chicken-video problem, 20 runs per condition, 240 approaches total. Same model (Qwen3.6-27B), same temperature, same problem statement. Each approach got tagged against the wave-3 saturated taxonomy as either belonging to a banned family or genuinely outside it (“NOVEL”).

Condition	Seed?	Exclusion list?	NOVEL / 60	%NOVEL
A. Pure control	no	no	0	0%
B. Exclusion only	no	yes	47	78%
C. Seed only	yes	no	1	2%
D. Seed + exclusion (= the wave 3/4 setup)	yes	yes	29	48%

Three things fell out of that:

1. The exclusion list is doing most of the structural work. Keep just “do not propose anything that reduces to these 68 families” and you get more NOVEL approaches than the full treatment (78% vs 48%). Keep just a weird seed with no exclusions and the model collapses back to optical-flow / CLIP / YOLO, however exotic the analogy. The wave-on-wave growth in the original table is mostly the exclusion list, not the seeds.

2. The seed actively reduces raw NOVEL count when combined with exclusion. D yields 18 fewer NOVELs than B, because a seed gives the model a path to re-derive saturated families in costume: the Studio Ghibli matte-painting seed literally re-produces the banned three-layer static / slow / fast split under a new name.

3. The seed still earns its keep, but for breadth, not count. B’s 47 NOVEL items concentrate in about five archetypes. D’s 29 cover those plus territory exclusion-only never reached: phase-locked oscillators (tarot, I Ching), market auctions (MTG, backgammon), retrograde analysis (chess endgame tablebases), cross-clip narrative tension (Marvel post-credits). All four cross-clip mechanisms in the whole ablation came from the seeded arm.

So the corrected reading: the +25 / +75 / +85 in waves 2-4 is mostly the exclusion list doing its job. The seed’s contribution is which kinds of NOVEL the model finds, not how many.
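The arithmetic behind the ablation table is easy to check. The counts below are copied from the table; the variable names are only for illustration.

```python
RUNS_PER_CONDITION = 20
APPROACHES_PER_RUN = 3
PER_CONDITION = RUNS_PER_CONDITION * APPROACHES_PER_RUN  # the "/ 60" in the table

# NOVEL counts per condition, as reported in the ablation table.
novel = {
    "A. Pure control": 0,
    "B. Exclusion only": 47,
    "C. Seed only": 1,
    "D. Seed + exclusion": 29,
}

# Share of each condition's approaches that fell outside the banned families.
percent_novel = {name: round(100 * n / PER_CONDITION) for name, n in novel.items()}

total_approaches = PER_CONDITION * len(novel)
seed_cost = novel["B. Exclusion only"] - novel["D. Seed + exclusion"]

if __name__ == "__main__":
    for name, pct in percent_novel.items():
        print(f"{name}: {novel[name]}/{PER_CONDITION} = {pct}%")
    print("total approaches:", total_approaches)
    print("NOVELs lost by adding a seed to exclusion:", seed_cost)
```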

One caveat that cuts against the headline above. The wave-3 exclusion block isn’t a pure negative constraint: its footer suggests directions to try (“Useful unexplored directions include: error-correction, market / auction mechanisms, cryptographic commitments…”), and both B and D over-index on exactly those families. Part of what reads as “novelty under negative pressure” may be the model following that embedded positive hint. The cleaner three-arm ablation (banned-list only, hint-footer only, both) is still unrun.

What I’d try next

A few threads that fall out of this round:

Try a problem with thinner priors (agent harness design, a serving-stack pick). Chicken video is well-trodden ground; the gains may be partly “the stock answers were saturated”.
Score the seeds. The MTG seed produced three usable mechanism families; one cathedral seed produced only vocabulary swaps. A per-seed yield log would let an iterate-mode prompt pick productive shelves.
Split the exclusion block into banned-list-only vs hint-footer-only vs both, per the caveat above.
Use the trick during agent execution, not just ideation. Force a stuck coding agent to spend one turn reasoning by analogy from a random domain before its next edit.
The follow-up post is the implementation half. Thirty of these 249 mechanisms have since been built as runnable prototypes by a coding agent; eight earned a place in the coop pipeline. That’s next week.

Read these too:

W. Brian Arthur, The Nature of Technology: long-form case for recombination as the engine of innovation.
Wang et al., PersonaFlow: persona-simulated experts for interdisciplinary research ideation.
Magrelli, Making the Old New Again Through Recombinant Innovation: recent review of how recombination drives novelty.