Autonomous Agentic Research Swarm

Autonomous Agentic Research Swarm is a repo-native framework for executing research through explicit contracts, task state, review gates, and git-scoped work isolation.

This repository should be read framework-first. It currently includes an empirical research project that has been used as a reference implementation to exercise the framework end to end, but the repository itself is not defined by that single project.

The framework is designed to support three research modes:

empirical
modeling
hybrid

Its central operating idea is simple:

the repository is the shared memory

Agents do not rely on hidden conversational context to coordinate. They coordinate through task files, contracts, manifests, review logs, release artifacts, and git history. That makes the workflow inspectable, reproducible, and reviewable.

Why This Exists

Most agentic workflows break down at the same points:

work scope widens implicitly
state lives in chat instead of durable files
review and provenance are bolted on after the fact
parallel work collides because ownership is unclear
release outputs exist without a clean chain of evidence

This framework addresses those failures directly by treating the repository as both the execution substrate and the control plane.

Framework Overview

The framework is built from a small set of explicit architectural pieces.

Repo-native control plane

The local control plane lives under .orchestrator/. Task markdown files carry the authoritative State: field, while folder placement under backlog, active, blocked, and done is only a projection maintained by tooling.
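As a rough illustration of "the State: field is authoritative, the folder is a projection", the sketch below reads the field from a task file and reports whether the folder placement has drifted. It assumes that state values match the folder names one-to-one and that the field appears on its own line as `State: <value>`; neither detail is confirmed by this README, and `read_state` and `projection_is_stale` are hypothetical helper names, not part of scripts/swarm.py.

```python
import re
from pathlib import PurePosixPath

# Assumption: the field sits on its own line, e.g. "State: active".
STATE_LINE = re.compile(r"^State:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def read_state(task_markdown: str) -> str:
    """Return the value of the first State: line in a task file."""
    match = STATE_LINE.search(task_markdown)
    if match is None:
        raise ValueError("task file has no State: field")
    return match.group(1).lower()


def projection_is_stale(task_path: str, task_markdown: str) -> bool:
    """True when the folder holding the task disagrees with its State: field.

    Assumption: state names and folder names are identical strings.
    """
    folder = PurePosixPath(task_path).parent.name
    return folder != read_state(task_markdown)
```

When the two disagree, the file wins: tooling would move the task to the folder named by the field, never rewrite the field to match the folder.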

Relevant surfaces:

Contract-first execution

The framework separates framework policy from project policy.

contracts/framework.json defines framework capabilities, roles, states, task semantics, execution engines, and release policy.
contracts/project.yaml defines the currently instantiated research project.
docs/protocol.md locks empirical definitions when the current project is empirical.
contracts/model_spec.md is the modeling specification surface.
contracts/hybrid_interface_v1.yaml defines the only allowed empirical-to-modeling bridge.

That separation is deliberate. The framework is meant to be reusable across projects; the current project contract is only one instantiation.

Explicit role separation

The operating model is built around four formal roles:

Planner: scopes work, writes tasks, maintains workstreams, and manages lifecycle projection
Worker: executes exactly one task in one isolated branch/worktree and edits only the allowed scope
Judge: reruns declared gates, verifies outputs and provenance, and is the only role that can mark scientific work done
Operator: owns preflight, supervision, repair handling, release assembly, and shared operational surfaces

The role model is enforced through AGENTS.md, contracts/framework.json, and the prompt templates under docs/prompts/.
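The one hard permission stated above, that only the Judge can mark work done, can be sketched as a small guard. This is an illustration only: the `Role` enum, the `authorize_state_change` function, and the literal state name `"done"` are assumptions made for the example, and the actual enforcement lives in the files just listed.

```python
from enum import Enum


class Role(Enum):
    PLANNER = "planner"
    WORKER = "worker"
    JUDGE = "judge"
    OPERATOR = "operator"


def authorize_state_change(role: Role, new_state: str) -> None:
    """Raise PermissionError unless `role` may move a task to `new_state`.

    Only the rule stated in this README is encoded: the Judge alone marks
    work done. Every other transition is allowed here purely for brevity.
    """
    if new_state == "done" and role is not Role.JUDGE:
        raise PermissionError(
            f"{role.value} may not mark a task done; only judge can"
        )
```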

Worktree-scoped execution

The framework assumes strict task isolation:

one task
one branch
one worktree
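One way to picture the one-task, one-branch, one-worktree rule is as a single `git worktree add` invocation per task. The sketch below only builds the command; it does not run it. The `task/<id>` branch naming and the `../worktrees` root are hypothetical conventions chosen for the example, and the local swarm runtime may lay things out differently.

```python
from pathlib import PurePosixPath


def worktree_command(
    task_id: str,
    base_branch: str = "main",
    root: str = "../worktrees",  # hypothetical location, not from this repo
) -> list[str]:
    """Build the git command that gives one task its own branch and worktree."""
    branch = f"task/{task_id}"  # hypothetical naming scheme
    path = str(PurePosixPath(root) / task_id)
    # `git worktree add -b <new-branch> <path> <commit-ish>` creates the
    # branch from the base and checks it out in a separate directory.
    return ["git", "worktree", "add", "-b", branch, path, base_branch]
```

Passing the returned list to `subprocess.run(..., check=True)` would create the worktree; keeping construction separate from execution makes the mapping from task to branch easy to inspect.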

Tasks declare bounded ownership through fields such as:

allowed_paths
outputs
gates
stop_conditions

This keeps parallelism tractable and makes it clear when the correct action is to block rather than improvise.
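A minimal sketch of how `allowed_paths` could bound a Worker's edits, assuming the field holds shell-style glob patterns (this README does not specify the pattern syntax). The helper names `out_of_scope` and `next_action` are invented for the example.

```python
from fnmatch import fnmatchcase


def out_of_scope(changed_paths: list[str], allowed_paths: list[str]) -> list[str]:
    """Return the changed paths that match none of the allowed patterns.

    Note: in fnmatch-style patterns `*` also matches `/`, so "src/etl/*"
    covers nested files as well.
    """
    return [
        path
        for path in changed_paths
        if not any(fnmatchcase(path, pattern) for pattern in allowed_paths)
    ]


def next_action(changed_paths: list[str], allowed_paths: list[str]) -> str:
    """Block instead of improvising when any edit falls outside the scope."""
    return "block" if out_of_scope(changed_paths, allowed_paths) else "continue"
```

For example, with `allowed_paths = ["src/etl/*", "tests/*"]`, an edit to `contracts/project.yaml` is reported as out of scope and the task should block.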

Deterministic gates and review bundles

The merge firewall is designed to stay offline and deterministic by default.

Primary runtime and gate surfaces:

Durable runtime and review artifacts live under:

The framework treats these review bundles as first-class outputs, not optional metadata.
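To make "offline and deterministic" concrete, the sketch below derives a single digest from a set of task outputs in a way that does not depend on file order or network access. It is not the repository's bundle format, which this README does not describe; `bundle_digest` is a hypothetical helper showing one way a Judge could confirm that rerun outputs match recorded ones.

```python
import hashlib
import json


def bundle_digest(outputs: dict[str, bytes]) -> str:
    """Return a stable SHA-256 digest over named output contents."""
    # Hash each output separately so a mismatch can be traced to one file.
    entries = {
        name: hashlib.sha256(data).hexdigest() for name, data in outputs.items()
    }
    # Sorted keys and fixed separators give a canonical serialization.
    canonical = json.dumps(entries, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```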

Two execution paths

The framework defines two execution paths.

The default path is the local swarm runtime: scripts/swarm.py plus .orchestrator/.
The high-stakes path is the reviewed staged-workflow-runner, reserved for major replans, architecture rewrites, and release assessments under Operator control.

This separation prevents ordinary task execution from being overloaded with high-consequence synthesis work.

Supported Research Modes

The framework supports the modes declared in contracts/framework.json.

Empirical

Empirical mode is for workflows that move from source acquisition to processed datasets, validation artifacts, analytical outputs, manuscript source, and release surfaces.

Relevant framework surfaces include:

Modeling

Modeling mode is for solver, simulation, optimization, or proof-oriented workflows that require explicit instance and experiment definitions rather than informal inputs.

Relevant framework surfaces include:

Hybrid

Hybrid mode is for workflows where empirical outputs are transformed into declared modeling instances through a contract-bound interface.

Relevant framework surfaces include:

The key rule is that modeling work consumes declared instance manifests, not arbitrary empirical data paths.
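The rule above can be sketched as a lookup that refuses anything not declared. The manifest shape used here (an `instances` list of entries with `id` and `path`) is an assumption for illustration; the real schema is whatever contracts/hybrid_interface_v1.yaml and the instance manifests define.

```python
def resolve_model_inputs(requested_ids: list[str], manifest: dict) -> list[str]:
    """Map requested instance ids to declared paths, rejecting undeclared ones.

    Assumed manifest shape:
        {"instances": [{"id": "...", "path": "..."}, ...]}
    """
    declared = {entry["id"]: entry["path"] for entry in manifest["instances"]}
    missing = [instance_id for instance_id in requested_ids if instance_id not in declared]
    if missing:
        raise KeyError(f"undeclared modeling instances: {missing}")
    return [declared[instance_id] for instance_id in requested_ids]
```

Modeling code written this way never receives a raw empirical data path; it can only name an instance id and receive the path the manifest declares for it.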

Framework vs. Current Repository Instance

This repository currently contains both:

the framework itself
one active reference project instance

The current project instance, defined in contracts/project.yaml, is an empirical research project on L2-to-L1 rent. It should be understood as a real end-to-end validation of the framework's empirical path, not as the definition of the framework.

At the current state of this repository, the strongest operational evidence is:

the repo-native control-plane model is working
the local swarm runtime is working
the deterministic gate and Judge review path is working
the empirical mode has been exercised through release assembly on a real project

What is present architecturally but not yet exercised to the same depth:

full modeling runtime maturity on a populated model specification and live instance set
full hybrid runtime maturity beyond the current bridge contract

That distinction matters. The framework supports empirical, modeling, and hybrid work by design, but the current deepest evidence comes from the empirical reference implementation.

Repository Structure

AGENTS.md: role boundaries and operating rules
.orchestrator/: task lifecycle, templates, handoffs, and control-plane state
contracts/: framework policy, project contract, model spec, hybrid interface, schemas, instances, and experiments
docs/: runbooks, prompts, and protocol documents
scripts/: swarm runtime, quality gates, lifecycle sweep, and release assembly
src/: implementation surfaces for ETL, validation, analysis, and modeling
registry/: registry surfaces for empirical projects
data/: raw, processed, sample, and manifest-backed datasets
reports/: validation outputs, figures, tables, models, paper artifacts, release manifests, and review logs
tests/: fast offline verification

How To Work With This Repo

You can use this repository in two different ways.

1. Operate the current reference instance

Use the repo as-is to inspect or extend the current empirical project that has been used to exercise the framework.

2. Use the framework for a different research project

Reuse the framework structure and replace the project-specific contract surfaces with a different project instance:

update contracts/project.yaml
update the relevant empirical, modeling, or hybrid contracts
define the task queue under .orchestrator/
keep the same role, state, gate, and review semantics

The framework is intended to generalize. The current empirical project is only one concrete instantiation.

Getting Started

Prerequisites

Python 3.11
git
pip

Useful optional tools:

quarto
tmux
gh

Install

python -m pip install .

Validate the repository

make gate
make test

Inspect control-plane status

python scripts/swarm.py plan --remote origin --base-branch main

Read the framework in the right order

AGENTS.md
contracts/framework.json
.orchestrator/workstreams.md
docs/runbook_swarm.md
docs/runbook_swarm_automation.md
the current project contract in contracts/project.yaml

For modeling or hybrid work, continue with:

Design Principles

The framework is built around a small set of strong rules:

the repository is the shared memory
task-file state is authoritative
folder placement is only a projection
contracts outrank chat
one task executes in one isolated worktree
gates stay deterministic and offline by default
review and release artifacts are required outputs
agents should stop on ambiguity instead of widening scope informally

Current Status

This repository now demonstrates a complete empirical reference run through figures, tables, manuscript source, paper build, catalog, and release manifest surfaces. That is evidence that the framework can carry a real research project end to end.

It is not yet evidence that every supported mode has equal runtime maturity. The framework is broader than the current reference project, and the README is written to reflect that boundary explicitly.
