GitHub - robotijn/ctoc: The CTO Chief repo with the official Iron Loop for programming

CTO Chief
The CTO your AI never had.

CTO Chief is a Claude Code plugin that turns AI coding from "generate and pray" into disciplined engineering. Every feature follows a 16-step Iron Loop — plan before code, test before ship, secure before deploy — wrapped by a refinement loop that drives findings (warnings included) to zero before you ever see the result. 110 agents across 22 categories route through a 4-tier architecture (CTO Chief → sub-orchestrators → specialists → Haiku scouts), with 4 mandatory human gates. The 421-file skill library (99 Tier-2 specialist bodies + 322 reference files) has been brought to 2026 best-practices quality through a websearch → update → critique → update loop on every specialist — no invented statistics, sourced citations, 7-language coverage. The result: AI that writes production-quality code on the first try.

Install

/plugin marketplace add https://github.com/robotijn/ctoc
/plugin install ctoc

Tip

Enable auto-update: /plugin → Marketplaces tab → robotijn → Enable auto-update

Quick Start

1. Start Claude Code:

claude

2. Open the dashboard:

/ctoc

That's it. CTO Chief detects your stack and shows a dashboard.

3. Tell Claude what you want to build:

I want a SaaS product with AI to help creative writers when they get stuck

CTO Chief starts with ideation — agents explore your idea with you, ask clarifying questions, and shape it into actionable plans. Steps 1-7 are collaborative: agents ask, you decide. Steps 8-16 are automated: agents execute, you review the result. Use numbered menus ([1], [2], [3]) to navigate.

Already know exactly what you want? Just be specific: "Add a /health endpoint returning 200 OK" — CTO Chief skips ideation and goes straight to planning.

Note

CTO Chief is open source and actively developed. Issues, PRs, and skill improvement suggestions are welcome.

Tip

For autonomous agent workflows, use claude --dangerously-skip-permissions to avoid repeated tool-call prompts. This is safe on feature branches where git can revert changes. Add --continue to resume a previous session.

Auto-Availability After Install

When you install CTOC from the marketplace, Claude Code auto-discovers every artifact the plugin ships — slash commands, agents, hooks, and skills — per the Claude Code Plugins reference. No manual wiring is needed. The 99 Tier-2 specialist SKILL.md files then become available through three routing paths:

Slash-command pipeline — /ctoc (or any sub-command) dispatches CTO Chief, which dispatches a Tier-1 sub-orchestrator, which dispatches the relevant Tier-2 specialist by name. This is the path used during the Iron Loop and refinement loop.
when_to_load trigger phrases — each SKILL.md declares a list of natural-language triggers in YAML frontmatter (e.g. "SBOM", "prompt injection", "NIST 800-61"). When your conversation matches a trigger, Claude Code auto-loads the skill into context with no slash command needed.
Direct Skill tool invocation — Claude can invoke any skill explicitly via the built-in Skill tool (e.g. Skill(skill_name="ctoc:llm-security-tester")) based on conversational context.

The auto-discovery is documented behavior of Claude Code's plugin system. Installing CTOC therefore makes the entire 421-file library reachable without configuration — you only pay for what loads, but everything is wired and ready.

Skill Library Quality Bar

Every one of the 99 Tier-2 specialist SKILL.md bodies was brought to 2026 best-practices quality through an explicit improvement loop (v6.9.15–v6.9.27). The library is not a grab-bag of LLM-generated stubs — it is engineered.

The 4-step loop (existing skills, 86 of them):

websearch (May 2026 sources) → update v1 → critique (subagent) → update v2

The 6-step loop (5 new gap-fill skills in v6.9.24, per the new-skill memory rule):

websearch → v1 → critique → v2 → extra critique → v3

The extra critique round catches things like missing SLSA/in-toto provenance flows, omitted CVEs (EchoLeak, MCP tool poisoning), and stale tool-lifecycle dates. The loop is documented in commits ec94f62..e0ee079.

Every SKILL.md ships:

YAML frontmatter (tier: 2, dispatch_protocol: v1, max_subagents: 0, declared when_to_load triggers, effort_level, model_optimized_for)
A ## 2026 Best Practices section with sourced citations — no invented stats. Quantitative claims trace back to a primary source (OWASP, NIST, ENISA, EC, CNCF, SLSA.dev, Sigstore, ISO, MITRE ATT&CK/ATLAS, vendor docs, peer-reviewed papers).
7-language coverage (C#, Java, Python, C, C++, JS/TS, SQL) of BAD/SAFE pattern pairs in foundational categories; per-skill rationale where a language is skipped (e.g. SQL skipped in E2E test skills).
A ## Tool Integration (2026) matrix with current CLI commands.
A ## Severity block that reconciles internal triage tiers with the always-critical letter contract on the wire (no soft tiers escape the refinement loop).
A ## Letter schema (refinement-loop output contract) so findings are machine-readable.
A ## Refinement Loop — critic mode footer cross-linking agents/_shared/warnings-are-critical.md.

Warnings are bugs. Every critic emits findings at severity: critical on the wire. Compiler/linter/type-checker warnings, deprecation notices, and CVEs at any severity block phase advancement. Time is a vector: today's warning is tomorrow's customer-visible crash.

Gap-Fill Skills (v6.9.24)

Five new Tier-2 specialists were created from a v6.9.22 gap analysis — each fills a hole that 2026 regulation, the OWASP/MITRE landscape, or industry incidents made urgent. All went through the 6-step v3 critique loop above.

Skill	Why it was added
`compliance/sbom-cra-checker`	EU Cyber Resilience Act reporting goes live 11 Sep 2026 — SBOMs become a legal artifact with 10-year retention and penalties up to €15M / 2.5% turnover. Validates NTIA Minimum Elements, CycloneDX 1.6 / SPDX 2.3+/3.0, signed-SBOM verification, in-toto attestations, SLSA, GUAC, VEX, and ENISA Single Reporting Platform onboarding.
`security/threat-modeler`	Design-time threat decomposition before any code is written — STRIDE, PASTA, LINDDUN(-GO and the new GenAI extension, arXiv 2603.06051), attack trees, automotive TARA, and tagging against MITRE ATT&CK + ATLAS v5.4.0 (16 tactics / 84 techniques / 56 sub-techniques). Tool integration: Threagile, OWASP Threat Dragon, pytm, IriusRisk, Microsoft TMT.
`compliance/ai-governance-checker`	EU AI Act high-risk provisions become enforceable 2 Aug 2026. Classifies systems against EU AI Act risk tiers (Art. 5 prohibited, Annex III high-risk, GPAI Chap V Arts. 51–55 with the 10²⁵ FLOPs systemic-risk threshold), NIST AI 600-1 (12 GenAI risks), and ISO/IEC 42001 (38 Annex A controls). Includes Art. 73 incident-reporting windows (2/10/15-day) to the AI Office.
`ai-quality/llm-security-tester`	LLM red-team analyst covering OWASP LLM Top 10 v2 (2025) all 10 categories, mapped to MITRE ATLAS v5.4.0 tactics. Covers CVE-2025-53773 (GitHub Copilot RCE, CVSS 9.6), CVE-2025-32711 (EchoLeak), the Cursor IDE chain, persistent memory poisoning, MCP tool poisoning, multi-turn crescendo/TAP jailbreaks, and markdown exfiltration. Tools: Garak, PyRIT, PromptFoo.
`security/incident-responder`	NIST SP 800-61r3 (Apr 2025) rewritten around CSF 2.0 functions plus the regulatory clocks that now bind: ENISA SRP 24h/72h/14d/1m from 11 Sep 2026, SEC Item 1.05 8-K (4 business days), NIS2, CIRCIA (pending), GDPR 72h. Runbooks per incident class, blameless-postmortem template, on-call wiring for PagerDuty / Opsgenie (EOS Apr 2027) / incident.io / FireHydrant.

These five take the specialist count from 86 → 91, and the total skill-library file count from 408 → 413.

Cross-Industry Skills (v6.9.27)

A cross-industry critique — pulling best practice from safety-critical, real-time, legal, and regulated-finance engineering, not just SaaS — added 8 specialists across three new categories plus security:

Safety — fault-tree-builder (top-down Fault Tree analysis), fmeda-analyzer (failure modes + diagnostic coverage), redundancy-pattern-picker (lockstep / N-version / voting / standby selection)
Realtime — hil-harness (Model-/Software-/Processor-/Hardware-in-the-Loop test ladder), wcet-budget (worst-case execution time bounds)
Legal — clm-obligations (contract obligation tracking), dsar-handler (GDPR data-subject-access-request flow)
Security — cra-incident-clocks (EU Cyber Resilience Act 24h / 72h / 14d incident clocks)

The same pass added a regulatory-regime profile framework and an evaluation-driven-development harness — see REGULATORY_OPS.md and EVALUATION_HARNESS.md.

Project Init

Initialization is automatic. The first time you open the dashboard (/ctoc) in a project that has not been set up, CTO Chief initializes it before rendering — there is no init command to run. Setup:

Detects your stack — scans for languages (14), frameworks (20+), and tools (linters, test runners, bundlers)
Generates a tailored CLAUDE.md — project-specific instructions including detected tools, quality commands, and Iron Loop steps
Configures .ctoc/settings.yaml — quality gates, enforcement mode, and agent settings tuned to your stack
Creates the plans/ directory structure and initializes Iron Loop state in .ctoc/state/

The generated CLAUDE.md becomes the single source of truth for how Claude works in your project — agent personality, planning pipeline, test commands, and quality standards. Initialization is idempotent: it skips any file that already exists, so opening the dashboard never overwrites your work.

Why CTO Chief?

Without CTO Chief — AI writes code immediately, skips tests, ignores security. You spend hours debugging, refactoring, and adding missing error handling.

With CTO Chief — You start with an idea. A product-owner agent explores it with you, asks the right questions, and shapes it into a plan. Only then does AI write code — tests first, security scanned, with your approval at every checkpoint.

	Without	With CTO Chief
Ideation	None — AI guesses what you want	Product-owner agent explores your idea, asks questions, shapes the plan
Planning	None — straight to code	Functional + implementation plan, reviewed by you
Testing	"I'll add tests later"	TDD — tests written before code (Step 8)
Security	Hope for the best	Shift-left scanning (Step 9) + full audit (Step 13)
Your control	Watch and hope	4 approval gates — nothing ships without you
Quality	Manual review only	Automated: lint, typecheck, tests, 80%+ coverage

How CTO Chief Compares

	CTO Chief	Cursor Rules	Raw Claude Code	GitHub Copilot
Ideation with product owner	AI explores your idea before planning	None	None	None
Planning before coding	6-step plan with adversarial review	Manual rules file	None	None
Step-driven question routing	Questions scoped to your current Iron Loop step	None	None	None
6-month pre-mortem + 5-scenario cash flow	Built into canvas	None	None	None
TDD enforcement	Automatic (Step 8)	Manual	Manual	None
Security scanning	Built-in (Steps 9, 13)	Manual	Manual	None
Threat modeling (STRIDE / PASTA / LINDDUN / ATT&CK / ATLAS)	Built-in (`threat-modeler`)	None	None	None
LLM security testing (OWASP LLM Top 10 v2)	Built-in (`llm-security-tester`)	None	None	None
EU CRA + SBOM compliance (11 Sep 2026)	Built-in (`sbom-cra-checker`)	None	None	None
AI governance (EU AI Act / NIST AI RMF / ISO 42001)	Built-in (`ai-governance-checker`)	None	None	None
Incident response (NIST 800-61r3, SEC 8-K, NIS2)	Built-in (`incident-responder`)	None	None	None
Iterative refinement to zero findings	Refinement loop (incl. warnings)	None	None	None
Human approval gates	4 mandatory checkpoints	None	None	None
Quality verification	Automated gate (Step 14)	Manual	Manual	None
Specialist agents	110 across 22 categories	None	DIY	None
Specialist skill library (engineered, sourced)	99 SKILL.md bodies through critique loop	None	None	None
Production-readiness checklist	SaaS templates with 20+ block-severity checks	None	None	None
Post-launch product loop	KPI library + experiment designer	None	None	None

Example Session

You: I want a SaaS product with AI to help creative writers when they get stuck

╭─ IDEATION ─────────────────────────────────────────────────╮
│ Product-owner agent explores your idea:                    │
│                                                            │
│ "What kind of stuck? Writer's block, plot holes, or       │
│  character development? Who's the target — novelists,      │
│  screenwriters, bloggers? Free tier or paid only?"         │
│                                                            │
│ You discuss back and forth. The agent shapes your idea     │
│ into 3 plans:                                              │
│   Plan 1: AI prompt generator for writer's block           │
│   Plan 2: Character voice coach                            │
│   Plan 3: Plot continuity checker                          │
│                                                            │
│ [1] Start with Plan 1 (Recommended)                        │
│ [2] Start with Plan 2                                      │
│ [3] Start with Plan 3                                      │
╰────────────────────────────────────────────────────────────╯

You: 1

╭─ FUNCTIONAL PLANNING (Steps 2-4) ─────────────────────────╮
│ Product-owner agent writes BDD scenarios WITH you:         │
│                                                            │
│ "Should the AI suggest full paragraphs or just prompts?    │
│  What if the writer rejects the suggestion — retry or      │
│  offer alternatives?"                                      │
│                                                            │
│   Scenario: Writer requests help                           │
│     Given a writer is stuck on chapter 3                   │
│     When they describe their block                         │
│     Then AI generates 3 creative prompts                   │
│                                                            │
│ GATE 1: [1] Approve plan  [2] Discuss  [0] Cancel          │
╰────────────────────────────────────────────────────────────╯

You: 1

╭─ TECHNICAL PLANNING (Steps 5-7) ─────────────────────────╮
│ Implementation-planner agent designs the architecture:     │
│                                                            │
│ "Next.js frontend, FastAPI backend, Claude API for         │
│  generation. 4 files to create, 1 to modify."             │
│                                                            │
│ Integrator+Critic refine the plan (10 rounds)...           │
│                                                            │
│ GATE 2: [1] Approve approach  [2] Discuss  [0] Cancel      │
╰────────────────────────────────────────────────────────────╯

You: 1

╭─ IMPLEMENTATION (Steps 8-16, automated) ──────────────────╮
│ Agents execute without interruption:                       │
│                                                            │
│  Step 8:  ✓ Tests written (TDD red)                        │
│  Step 9:  ✓ Dependencies installed, shift-left scan clean  │
│  Step 10: ✓ Code implemented (TDD green)                   │
│  Step 11: ✓ Self-review passed                             │
│  Step 12: ✓ Optimized                                      │
│  Step 13: ✓ Security scan clean                            │
│  Step 14: ✓ All tests pass, 91% coverage                   │
│  Step 15: ✓ Docs updated                                   │
│  Step 16: Ready for your review                            │
│                                                            │
│ GATE 3: [1] Approve and commit  [2] Changes  [0] Cancel    │
╰────────────────────────────────────────────────────────────╯

You: 1
  ✓ Committed and pushed. Plan 1 done — 2 more plans queued.

Three approvals per plan. Steps 1-7: agents ask, you decide. Steps 8-16: agents execute, you review.

Tip

Ideation is optional. If you already know exactly what you want, say it directly (e.g., "Add a /health endpoint returning 200 OK") and CTO Chief skips to planning. Ideation is most valuable when you have a broad idea that needs shaping — like building a full SaaS product from a single sentence.

Key Features

Ideation-first workflow — Product-owner agent explores your idea, asks questions, and shapes it into plans before any code is written
Collaborative planning, automated execution — Steps 1-7: agents ask questions and you decide. Steps 8-16: agents execute and you review the result.
110 agents across 22 categories — testing, security, quality, infrastructure, SaaS, product, scouts, compliance, AI quality, and more
421 skill files — 99 Tier-2 specialist skill bodies (engineered through the websearch → update → critique → update loop) + 50 language refs + 211 framework refs (85 web, 44 AI/ML, 52 data, 15 DevOps, 15 mobile) + 61 per-language quality configs
Iron Loop methodology — 16 steps across 4 phases with 4 human gates
Refinement loop — Iterative critic → test-writer → implementer cycle with tiered K-budgets (critical K=3 · medium K=5 · low K=7 · final sweep K=∞) that drives findings to zero (warnings included) before Gate 3 — see REFINEMENT_LOOP.md
4-tier agent architecture — CTO Chief (Tier 0, sole dispatcher) → 16 sub-orchestrators (Tier 1) → specialists (Tier 2) → 5 Haiku scouts (Tier 3) for fast pre-screens — see AGENT_ARCHITECTURE.md
6-month pre-mortem + 5-scenario cash flow — Every canvas (lean or BMC) now carries a Gary-Klein 6-month pre-mortem (≥5 failure modes scored by likelihood × impact with this-week mitigations) and a Worst / Conservative / Base / Optimistic / Exceptional 18-month cash flow with runway-per-scenario and commit-now decision triggers
Warnings are bugs — Compiler/linter/type-checker warnings, deprecation notices, and CVEs at any severity are classified critical-tier by the refinement loop. Production-readiness gate requires zero warnings across all toolchains and zero open CVEs before Gate 3
Production-ready SaaS templates — Opinionated starters (B2C subscription, B2B sales-led) with 20+ Gate-3 production-readiness block-severity checks: domain, HTTPS, auth, billing, RLS, observability, legal, zero warnings, zero CVEs
2026-grade compliance & AI safety — Five gap-fill skills (sbom-cra-checker, threat-modeler, ai-governance-checker, llm-security-tester, incident-responder) cover EU CRA, EU AI Act, NIST 800-61r3, OWASP LLM Top 10 v2, MITRE ATLAS v5.4.0, and STRIDE/PASTA/LINDDUN
Product Loop — Post-launch DEFINE → INSTRUMENT → MEASURE → REVIEW → HYPOTHESIZE → EXPERIMENT → LEARN cycle keyed to 17 canonical KPIs across acquisition/activation/retention/revenue/churn — see PRODUCT_LOOP.md
Interactive dashboard — Numbered menus, plan pipeline, progress tracking
Deployment pipeline — Configurable dev → staging → production promotion triggered automatically after Gate 3 approval
Smart quality gates — Background checks that don't block commits, block pushes
Stack detection — Auto-detects 14 languages, dozens of frameworks, and tools
On-demand loading — Skills load only when needed; you only pay for what you use

The Iron Loop

16 steps, 4 phases, 4 human gates — full methodology →

COLLABORATIVE (Steps 1-7) — agents ask questions, you decide
──────────────────────────────────────────────────────────────
Step 1: IDEATION
  IDEATE — product-owner agent explores your idea with you
  Gate 0: You approve the idea to explore

Steps 2-4: FUNCTIONAL PLANNING
  ASSESS → ALIGN → CAPTURE — agents ask what to build, you approve
  Gate 1: You approve what to build

Steps 5-7: IMPLEMENTATION PLANNING
  PLAN → DESIGN → SPEC — agents ask how to build it, you approve
  Gate 2: You approve how to build it

AUTOMATED (Steps 8-16) — agents execute, you review
──────────────────────────────────────────────────────────────
Steps 8-16: IMPLEMENTATION
  TEST → PREPARE → IMPLEMENT → REVIEW → OPTIMIZE → SECURE → VERIFY → DOCUMENT → FINAL-REVIEW
  Gate 3: You approve the result

Steps 1-7 are collaborative. Agents don't just generate — they ask questions, present options with pros and cons, and wait for your decision. The product-owner agent shapes your idea; the implementation-planner designs the architecture. You are always in control.

Steps 8-16 are automated. Once you approve the plan, agents execute all 9 steps without interruption: write tests, implement code, review, optimize, scan for vulnerabilities, verify quality, update docs. You review the final result at Gate 3.

Why start with ideation? Without it, Claude will try to jump straight to code. The ideation phase forces the AI to understand your intent before planning begins. This is what prevents hooks and gates from being bypassed — the AI has a structured path to follow instead of guessing.

Enforcement — Hooks block premature code edits (before planning) and premature commits (before verification). Escape phrases: "skip planning", "skip iron loop", "quick fix", "trivial fix", "trivial change", "hotfix", "urgent".

The 4-Tier Agent Architecture

CTO Chief is the only top-level dispatcher. All other agents are dispatched by CTO Chief, directly or via a sub-orchestrator. See AGENT_ARCHITECTURE.md for the full spec.

Tier	Role	Count	Model	What they do
Tier 0	Top-level coordinator	1	Opus	CTO Chief — sole dispatcher, owns the audit trail, approves all gate crossings
Tier 1	Sub-orchestrators	16	Opus	Planning (7) · Iron Loop (3) · Pipeline (5) · Synthesizer (1) — recommend dispatches and orchestrate Tier 2/3 fan-out
Tier 2	Specialists	72+	Opus / Sonnet	Domain experts — single-purpose, structured findings output, cannot dispatch other agents
Tier 3	Scouts	5	Haiku 4.5	Fast pass/flag pre-screens in isolated 200K context: syntax · lint · test · dep · secret. Short-circuit Tier 2 when clean. ~10–50× cheaper than the specialists they replace on the happy path.

Cross-pillar conflicts (security vs. performance, etc.) are resolved by the synthesizer using a fixed priority: Security > Correctness > Maintainability > Performance > Readability > Consistency. Every dispatch is logged to .ctoc/audit/dispatches/YYYY-MM-DD/<id>.yaml per the Dispatch Protocol.

The Refinement Loop

Findings from the Iron Loop don't get reviewed-and-shipped on the first pass. They run through the refinement loop — an iterative critic → test-writer → implementer cycle that drives findings to zero before Gate 3. See REFINEMENT_LOOP.md.

critics → findings → test-writer (TDD red) → implementer (TDD green) → re-critic
                                                                            │
                                                                       still findings?
                                                                            │
                                                                ┌───────────┴───────────┐
                                                              YES                       NO
                                                                │                       │
                                                          loop again                 advance
                                                                                   phase / done

Phase semantics (tiered K-budgets):

Phase	K (rounds)	Stops on
Critical	3	0 critical findings
Medium	5	0 medium findings
Low	7	0 low findings
Final sweep	∞ (soft cap)	Convergence; escalates to user if it doesn't

Warnings are bugs. Compiler / linter / type-checker warnings, deprecation notices, and CVEs at any severity are classified critical by every critic — they block phase advancement until fixed. Time is a vector: today's warning is tomorrow's customer-visible crash.

Triggered on effort: high plans OR when a risk-surface glob matches (auth, billing, schema migrations, GDPR-relevant paths, etc.). The integrator agent drives the loop; the journal at .ctoc/loops/<slug>/journal.yaml records every round.

The Canvas — 6-Month Pre-Mortem + 5-Scenario Cash Flow

Both Lean Canvas (Maurya) and Business Model Canvas (Osterwalder) carry two extra planning sections by default — surfacing 6-month failure modes and runway scenarios up-front so the business plan is interrogated before any feature work begins.

6-Month Pre-Mortem (Gary Klein, HBR 2007) — Imagine 6 months from now and the initiative has already failed. List ≥5 distinct failure modes scored Likelihood × Impact; pair each with a mitigation that can be started this week. Prospective hindsight is ~30% more accurate at identifying failure causes than forward-looking risk analysis. Refresh every 3–4 months.

Cash Flow Planning — 5 Scenarios over 18 months — Worst / Conservative / Base / Optimistic / Exceptional. The three middle scenarios must each be plausible (defensible, not aspirational). Stress-test deltas per scenario:

Variable	Worst	Conservative	Base	Optimistic	Exceptional
Revenue growth	−50%	−20%	0	+25%	+60%
CAC	+75%	+25%	0	−15%	−30%
Monthly churn	2.0×	1.3×	1.0×	0.8×	0.6×
Time-to-first-pay	+60d	+30d	normal	−15d	−30d

Includes base-case assumption anchors, per-month MRR table at M3/M6/M9/M12/M15/M18, runway per scenario, and commit-now decision triggers (e.g., "if actuals track Worst for 2 consecutive months: switch operating plan to Worst"). Industry signal: startups with 3+ scenarios secure 1.8× the funding (Abacum 2025).

Both sections are owned by the founder or product manager. The CTO Chief technical chain does not produce them; it consumes them when planning instrumentation work.

The Product Loop

The Iron Loop ships features. The Product Loop validates that they earn their place. See PRODUCT_LOOP.md.

DEFINE → INSTRUMENT → MEASURE → REVIEW → HYPOTHESIZE → EXPERIMENT → LEARN
  ↑                                                                    │
  └───────────────── continuous post-launch ───────────────────────────┘

Step	Owner	Cadence
DEFINE	founder + pm	Canvas phase — via `kpi-planner`
INSTRUMENT	programmer	Implementation — via `skills/saas/posthog-analytics`
MEASURE	(automated)	Continuous — PostHog + Stripe
REVIEW	founder + pm	Weekly — via `skills/product/product-reviewer`
HYPOTHESIZE	founder + pm	From review findings
EXPERIMENT	pm + programmer	Via `skills/product/experiment-designer`
LEARN	founder + pm	Post-experiment

Canonical KPI library at .ctoc/templates/product-kpis.yaml — 17 KPIs across acquisition / activation / retention / revenue / churn / satisfaction / engagement. SaaS-b2c launch set: signup_completion, activation_rate, time_to_value, w1_retention, free_to_paid_conversion, monthly_churn, mrr.

KPI status and the weekly product review are reached through the /ctoc:menu dashboard — CTOC ships only three slash commands (menu, push, update).

SaaS Production-Readiness Templates

CTOC ships opinionated templates for common project types. agents/planning/stack-chooser.md (Tier 1) selects the matching template and presents defaults to the user.

Template	Status	Default stack
`saas/b2c-subscription`	ready	Next.js 15 · Supabase · Clerk · Stripe · Resend · PostHog · Sentry · Vercel
`saas/b2b-sales-led`	ready	adds WorkOS SSO · org-scoped data · audit log · MSA/DPA templates · SOC2 docs
`saas/usage-based-api`	planned	metered billing · API keys · rate limiting · usage dashboard
`app/expo-react-native`	planned	Expo SDK 52 · Clerk Expo · Supabase · RevenueCat · EAS
`cli/bun-single-binary`	planned	Bun + cross-platform binary
`oss-lib/typescript`	planned	tsup · changesets · GitHub Actions

Each ready template carries a production-readiness checklist enforced at Gate 3 (review → done). Block-severity items in the B2C template include:

Domain & HTTPS — custom domain, HTTPS enforced
Auth — signup with email verification, password reset
Billing — real-card-tested, webhook signature verified, failed-payment dunning, billing-portal link
Email deliverability — SPF + DKIM + DMARC, welcome + receipt emails
Multi-tenancy — Postgres RLS enforced, RLS policy per user-data table
Observability — Sentry receiving errors, PostHog receiving events
Legal — Privacy Policy, Terms of Service
Support — support@ email forwards
Backups — DB backups enabled
Code quality (v6.9.9+) — zero warnings across all toolchains, zero open CVEs in production dependencies

The B2B template adds enterprise-grade gates: TLS A-grade, WorkOS SSO end-to-end, SCIM provisioning/deprovisioning, organization RLS, RBAC at middleware and DB, audit log capturing every mutation + auth event, ACH/wire billing, DPA + MSA templates, public subprocessor list.

SaaS skills under skills/saas/ (12 skill bodies): stripe-subscriptions · clerk-auth · workos-sso · multi-tenancy-row-level · resend-email · posthog-analytics · sentry-errors · supabase-data · inngest-jobs · rate-limiting · vercel-deploy · legal-scaffold.

Agents

110 agents across 22 categories — browse all →

Full agent list

Category	#	Agents
SaaS	12	clerk-auth, stripe-subscriptions, workos-sso, multi-tenancy-row-level, resend-email, posthog-analytics, sentry-errors, supabase-data, inngest-jobs, rate-limiting, vercel-deploy, legal-scaffold
Testing	14	unit, integration, e2e, mutation, smoke, quality-gate-runner, playwright-qa, coverage-enforcer, coverage-mapper, smart-test-runner, unit-writer, e2e-writer, integration-writer, property-writer
Quality	11	architecture-checker, code-reviewer, complexity-analyzer, complexity-reducer, type-checker, code-smell-detector, dead-code-detector, duplicate-code-detector, consistency-checker, quality-gate, performance-validator
Specialized	11	performance-profiler, memory-safety-checker, accessibility-checker, database-reviewer, api-contract-validator, configuration-validator, error-handler-checker, health-check-validator, observability-checker, resilience-checker, translation-checker
Planning	7	vision-advisor, vision-decomposer, product-owner, implementation-planner, stack-chooser, kpi-planner, unit-economics-modeler
Security	7	security-scanner, secrets-detector, dependency-checker, dependency-auditor, input-validation-checker, concurrency-checker, sast-scanner
Infrastructure	6	terraform-validator, kubernetes-checker, docker-security-checker, ci-pipeline-checker, ci-runner-setup, deployment-setup
Pipeline	5	agent-writer, agent-critic, agent-tester, agent-qa, agent-publisher
Scouts (Tier 3, Haiku)	5	syntax-scout, lint-scout, test-scout, dep-scout, secret-scout
Compliance	3	gdpr-compliance-checker, audit-log-checker, license-scanner
Coordinator	3	cto-chief (Tier 0), ivv-chief, synthesizer
Data/ML	3	data-quality-checker, ml-model-validator, feature-store-validator
Frontend	3	bundle-analyzer, component-tester, visual-regression-checker
Iron Loop	3	integrator, critic, executor
Mobile	3	ios-checker, android-checker, react-native-bridge-checker
Versioning	3	backwards-compatibility-checker, feature-flag-auditor, technical-debt-tracker
AI Quality	2	hallucination-detector, ai-code-quality-reviewer
Architecture	2	pattern-detector, dependency-analyzer
DevEx	2	onboarding-validator, api-deprecation-checker
Documentation	2	documentation-updater, changelog-generator
Product	2	product-reviewer, experiment-designer
Cost	1	cloud-cost-analyzer

Agents spawn conditionally based on your project and current Iron Loop step. Scouts (Tier 3) pre-screen and short-circuit deep dispatches when clean.

Note: not every Tier-2 specialist SKILL.md has a paired top-level agent file. Several skills (e.g. sbom-cra-checker, threat-modeler, ai-governance-checker, llm-security-tester, incident-responder) are dispatched directly through the skill auto-load mechanism — see "Auto-Availability After Install" above.

Skills

421 skill files — browse all →. Loaded on demand based on your stack and the current Iron Loop step.

There are two kinds of skills:

Tier-2 specialist skill bodies (99) — the actual expert agents that run during Iron Loop and refinement-loop steps. Each lives at skills/<category>/<name>/SKILL.md with a structured findings contract.
Knowledge skills (322) — language refs, framework refs, and per-language quality configs. Read by agents (or loaded by code paths like src/lib/quality-config.js and src/lib/skill-loader.js) to inform their work.

v6.9.14: 38 unreachable reference files were deleted from skills/ after a usage audit confirmed they had zero code or agent references. v6.9.15–v6.9.23: all 86 existing SKILL.md bodies were rewritten through a websearch → update → critique → update loop (May 2026 sources, 7-language coverage, sourced citations only). v6.9.24: 5 new gap-fill specialists were added via a 6-step v3 critique loop (see "Gap-Fill Skills" above). Net library: 408 → 413 files; 86 → 91 specialists. v6.9.27: 8 cross-industry-critique specialists added — new legal, realtime, and safety categories plus security/cra-incident-clocks. Net library: 413 → 421 files; 91 → 99 specialists.

Specialist skill bodies (Tier 2) — 99 across 20 categories

Category	#	Skill bodies
SaaS	12	clerk-auth · stripe-subscriptions · workos-sso · multi-tenancy-row-level · resend-email · posthog-analytics · sentry-errors · supabase-data · inngest-jobs · rate-limiting · vercel-deploy · legal-scaffold
Quality	11	architecture-checker · code-reviewer · complexity-analyzer · complexity-reducer · code-smell-detector · consistency-checker · dead-code-detector · duplicate-code-detector · performance-validator · quality-gate · type-checker
Specialized	11	accessibility-checker · api-contract-validator · configuration-validator · database-reviewer · error-handler-checker · health-check-validator · memory-safety-checker · observability-checker · performance-profiler · resilience-checker · translation-checker
Security	10	security-scanner · sast-scanner · secrets-detector · input-validation-checker · concurrency-checker · dependency-checker · dependency-auditor · threat-modeler (new, v6.9.24) · incident-responder (new, v6.9.24) · cra-incident-clocks (new, v6.9.27)
Testing	14 (5+4+5)	playwright-qa · coverage-enforcer · coverage-mapper · smart-test-runner · quality-gate-runner · 4 writers · 5 runners
Infrastructure	5	terraform-validator · kubernetes-checker · docker-security-checker · ci-pipeline-checker · ci-runner-setup
Compliance	5	audit-log-checker · gdpr-compliance-checker · license-scanner · sbom-cra-checker (new, v6.9.24) · ai-governance-checker (new, v6.9.24)
AI Quality	3	ai-code-quality-reviewer · hallucination-detector · llm-security-tester (new, v6.9.24)
Data/ML	3	data-quality-checker · feature-store-validator · ml-model-validator
Frontend	3	bundle-analyzer · component-tester · visual-regression-checker
Mobile	3	android-checker · ios-checker · react-native-bridge-checker
Versioning	3	backwards-compatibility-checker · feature-flag-auditor · technical-debt-tracker
Architecture	2	pattern-detector · dependency-analyzer
DevEx	2	api-deprecation-checker · onboarding-validator
Documentation	2	changelog-generator · documentation-updater
Product	2	product-reviewer · experiment-designer
Safety	3	fault-tree-builder · fmeda-analyzer · redundancy-pattern-picker (new category, v6.9.27)
Legal	2	clm-obligations · dsar-handler (new category, v6.9.27)
Realtime	2	hil-harness · wcet-budget (new category, v6.9.27)
Cost	1	cloud-cost-analyzer

Knowledge skills — 322 reference files

Type	#	Examples
Languages	50	Python, TypeScript, Go, Rust, Java, C#, Swift, Kotlin, Ruby, PHP
Web frameworks	85	React, Next.js, Vue, Django, FastAPI, Rails, Spring Boot, Express
AI/ML frameworks	44	PyTorch, LangChain, Hugging Face, MLflow, TensorFlow
Data frameworks	52	MongoDB, Redis, Kafka, Spark, Elasticsearch, DuckDB
DevOps frameworks	15	Docker, Kubernetes, Helm, Ansible, Pulumi
Mobile frameworks	15	React Native, Flutter, SwiftUI, Jetpack Compose
Quality configs	61	Per-language lint, format, and test configs

Stack detected automatically from your project files. Skills load on-demand — you only pay for what you use.

Interactive Dashboard

The /ctoc command opens an interactive dashboard with 5 areas:

Area	Purpose
Pipeline	The plan pipeline — Business, Implementation, and Execution sections; drill into any stage
Inbox	Morning questions, decisions awaiting review, and plans waiting at a human gate
Agent	Background agent status — start, stop, and monitor the todo-queue runner
Library	Browse the agent and skill library
System	Doctor, update, settings, and logs

Plan pipeline (directories under plans/):

vision → functional → implementation → todo → [in-progress] → review → done

in-progress is a state tracked in plan YAML frontmatter, not a separate directory.

4 human gates — transitions that require your explicit approval: 0. Vision → Functional (approve the idea to explore)

Functional → Implementation (approve what to build)
Implementation → Todo (approve how to build it)
Review → Done (approve the result)

Navigate with numbers [1]–[5] to switch areas, [0] for back. Or just talk naturally.

Enforcement

CTO Chief blocks premature actions with hooks:

Action	Blocked Until	Escape Phrases
Edit/Write code	Planning complete (Step 8+)	"skip planning", "skip iron loop", "quick fix", "trivial fix", "trivial change", "hotfix", "urgent"
Git commit	Documentation complete (Step 15+)	"hotfix", "urgent"

Config and CTOC files are whitelisted and never blocked: .ctoc/**, .local/**, plans/*.md, .gitignore, .gitattributes, VERSION.

Smart Quality Gates

Background quality agent runs checks without blocking your workflow:

git commit → background agent runs: lint, typecheck, tests, security
                    │
              ┌─────┴─────┐
              ▼           ▼
           PASS         FAIL
              │           │
         auto-push    "Fix: ..."

Tier	When	Checks	Blocking?
1	Every commit	lint, typecheck, affected tests, secrets, critical CVEs	Yes (blocks push)
2	Every commit	coverage, complexity, duplication, medium CVEs	No (warnings)
3	Stage transitions	docs, circular deps, bundle size, benchmarks	At transition
4	CI only	full tests, e2e, mutation, memory, license	CI

Deployment Pipeline

After Gate 3 approval (review → done), CTO Chief can automatically promote your code through environments:

Gate 3 approved → development → staging → production
                      │            │           │
                  git-branch   git-branch   git-branch
                  git-tag      webhook      script
                  webhook      script       docker
                  script       docker       ssh
                  docker       ssh
                  ssh

Configurable per environment — choose a deployment strategy (git-branch, git-tag, webhook, script, docker, ssh), set approval mode (auto or manual), and enable auto-rollback on failure. Any environment can be skipped.

Setup — run the deployment-setup agent for an interactive walkthrough, or configure directly in .ctoc/settings.yaml:

deployment:
  enabled: true
  environments:
    - name: staging
      enabled: true
      strategy: git-branch
      branch: deploy/staging
    - name: production
      enabled: true
      strategy: git-branch
      branch: deploy/production
  approval:
    staging: auto
    production: manual    # pause and ask before production
  rollback:
    auto_rollback: true
    keep_history: 10

Status tracking — deployment history and latest status are stored in .ctoc/deployments/. Each entry records environment, status (success/failed/rolled-back), timestamp, commit, and plan name.

How It Works

You ──── /ctoc ────► Dashboard
                        │
                  ┌─────┴─────┐
                  ▼           ▼
            Plan Pipeline   Tools
                  │
  ┌───────────────┼────────────────┐──────────────┐
  ▼               ▼                ▼              ▼
Phase 1        Phase 2          Phase 3        Phase 4
(Ideation)     (What)           (How)          (Build)
Step 1 (opt)   Steps 2-4        Steps 5-7      Steps 8-16
  │               │                │              │
  │            [GATE 1]        [GATE 2]       [GATE 3]
  └──► skip    You approve     You approve    You approve

Priority: security > correctness > performance > cleverness.

Commands

Slash commands (typed in Claude Code):

CTOC ships exactly three slash commands. Everything else — vision, planning, quality, review, agent runs, initialization — goes through the menu.

Command	Description
`/ctoc` (alias for `/ctoc:menu`)	Interactive dashboard. Auto-initializes the project on first run (no init command needed).
`/ctoc:push`	Quality checks + push
`/ctoc:update`	Update to latest version (workaround for plugin-cache bug)

Conversational commands (said to Claude):

Command	Description
`ctoc doctor`	Health check for your CTOC setup
`ctoc process-issues`	Process community-submitted skill improvement issues
`ctoc validate`	Validate plan structure + Iron Loop state

Updating

/ctoc:update

Then restart Claude Code to load the new version.

Note

This is a workaround for a Claude Code bug (#21995) where /plugin update doesn't refresh the cache. /ctoc:update fetches latest, clears cache, and updates the registry.

Troubleshooting

Plugin not found:

/plugin marketplace add https://github.com/robotijn/ctoc
/plugin install ctoc

Plugin stale after update:

/ctoc:update

Then restart Claude Code.

"Edit blocked" or "planning incomplete" error: CTO Chief blocks code edits until planning is done (Step 8+). This is intentional. Options:

Complete the planning steps first (recommended)
Say "quick fix" or "trivial change" to bypass for small edits
Set enforcement to soft in .ctoc/settings.yaml for warnings instead of blocks

Dashboard shows no plans: Start by describing what you want to build. CTO Chief creates the plan for you.

Health check (say to Claude):

ctoc doctor

For developers

Requirements: Claude Code >= 1.0.0, Node.js >= 18.0.0

See CLAUDE.md for full contributor instructions and IRON_LOOP.md for methodology details.

Run tests:

node --test tests/*.test.js

Version management:

const { release, getVersion, syncAll, checkForUpdates } = require('./src/lib/version');

getVersion()       // → '6.9.37'
release()          // → bumps patch, syncs all files
release('minor')   // → bumps minor
release('major')   // → bumps major

Files synced by release(): VERSION (source of truth), .claude-plugin/marketplace.json, .claude-plugin/plugin.json, README.md

Project structure:

ctoc/
├── docs/            14 docs: IRON_LOOP.md, AGENT_ARCHITECTURE.md, REFINEMENT_LOOP.md,
│                    PRODUCT_LOOP.md, DISPATCH_PROTOCOL.md, EVALUATION_HARNESS.md,
│                    INDEPENDENCE.md, REGULATORY_OPS.md, REALTIME.md, PROCESS_FMEA.md,
│                    CRITICAL_CONTROL_POINTS.md, CONTINUOUS_IMPROVEMENT.md,
│                    CONTRIBUTING.md, CODE_OF_CONDUCT.md
├── src/
│   ├── commands/    3 slash commands — menu, push, update (.md spec + .js impl where needed)
│   ├── hooks/       13 Claude Code hooks (session, pre/post tool use, andon-halt)
│   ├── lib/         105 JS modules (planning, quality, refinement, dispatcher, regulatory-regime, audit-chain, retention, legal-hold, traceability, lineage, eval-harness, comparator)
│   ├── areas/       5 dashboard areas (pipeline, inbox, agent, library, system)
│   ├── tabs/        8 legacy tab modules (superseded by areas/, kept for drill-in flows)
│   ├── scripts/     13 build/release utilities
│   └── data/        Static data files
├── agents/          110 agent definitions across 22 categories
│                    (+ _shared/ — 4 cross-cutting rules: ancestry-read,
│                     async-choice-protocol, no-stub-rule, warnings-are-critical)
├── skills/          421 skill files: 99 Tier-2 specialist bodies (SKILL.md)
│                    + 322 reference files (50 langs, 211 frameworks,
│                    61 quality configs). 38 unreachable refs removed in v6.9.14;
│                    86 existing SKILL.md improved in v6.9.15–v6.9.23;
│                    5 gap-fill SKILL.md added in v6.9.24; 8 cross-industry
│                    SKILL.md added in v6.9.27.
├── tests/           68 test files (1470 passing tests)
├── .ctoc/           Config, templates, operations, audit, loop journals
│   ├── templates/   CLAUDE.md.template, canvas templates, SaaS templates,
│   │                questions.yaml, product-kpis.yaml
│   ├── architecture/  tier-definitions.yaml, dispatch-schema.yaml
│   ├── audit/       dispatches/YYYY-MM-DD/<id>.yaml (one per dispatch)
│   └── loops/       <plan-slug>/journal.yaml (refinement-loop history)
└── .claude-plugin/  Plugin metadata (plugin.json, marketplace.json, hooks.json)

License

PolyForm Shield 1.0.0 — See LICENSE

Use CTOC freely for any project. You may not offer CTOC itself or a derivative as a competing product or service without permission. For commercial licensing inquiries, contact the licensor.

Links

Repository · Issues · Discussions

6.9.37 · Built by @robotijn

"Excellence is not an act, but a habit."

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Install

Quick Start

Auto-Availability After Install

Skill Library Quality Bar

Gap-Fill Skills (v6.9.24)

Cross-Industry Skills (v6.9.27)

Project Init

Why CTO Chief?

How CTO Chief Compares

Example Session

Key Features

The Iron Loop

The 4-Tier Agent Architecture

The Refinement Loop

The Canvas — 6-Month Pre-Mortem + 5-Scenario Cash Flow

The Product Loop

SaaS Production-Readiness Templates

Agents

Skills

Interactive Dashboard

Enforcement

Smart Quality Gates

Deployment Pipeline

How It Works

Commands

Updating

License

Links

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 427 Commits
.claude-plugin		.claude-plugin
.ctoc		.ctoc
.github		.github
agents		agents
docs		docs
evals		evals
plans		plans
skills		skills
src		src
tests		tests
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md
VERSION		VERSION

Folders and files

Latest commit

History

Repository files navigation

Install

Quick Start

Auto-Availability After Install

Skill Library Quality Bar

Gap-Fill Skills (v6.9.24)

Cross-Industry Skills (v6.9.27)

Project Init

Why CTO Chief?

How CTO Chief Compares

Example Session

Key Features

The Iron Loop

The 4-Tier Agent Architecture

The Refinement Loop

The Canvas — 6-Month Pre-Mortem + 5-Scenario Cash Flow

The Product Loop

SaaS Production-Readiness Templates

Agents

Skills

Interactive Dashboard

Enforcement

Smart Quality Gates

Deployment Pipeline

How It Works

Commands

Updating

License

Links

About

Resources

License

Code of conduct

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages