Overview of the system architecture, networking, and key design decisions.
```
┌──────────────────────────────────────────────────────────────────────┐
│ Hetzner Dedicated Server (NixOS)                                     │
│                                                                      │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────────┐    │
│  │ Caddy        │  │ Preview      │  │ PostgreSQL               │    │
│  │ (reverse     │  │ Webhook      │  │ (host, for node previews)│    │
│  │ proxy, TLS)  │  │ (:3100)      │  │                          │    │
│  └──────┬───────┘  └──────┬───────┘  └──────────────────────────┘    │
│         │                 │                                          │
│         │ routes to containers by subdomain                          │
│         │                                                            │
│  ┌──────┴─────────────────────────────────────────────────────────┐  │
│  │ 10.100.0.0/24 veth network (NAT to external interface)         │  │
│  │                                                                │  │
│  │  ┌─────────────────┐      ┌─────────────────┐                  │  │
│  │  │ Agent Container │      │ Preview         │                  │  │
│  │  │ 10.100.0.3      │      │ Container       │                  │  │
│  │  │                 │      │ 10.100.0.5      │                  │  │
│  │  │ - Claude Code   │      │                 │                  │  │
│  │  │ - Dev tools     │      │ - App services  │                  │  │
│  │  │ - SSH           │      │   (per repo's   │                  │  │
│  │  │                 │      │   preview-      │                  │  │
│  │  │                 │      │   config.nix)   │                  │  │
│  │  └─────────────────┘      └─────────────────┘                  │  │
│  └────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────┘
```
All containers (agents and previews) share a single private subnet. Each container gets a pair of IPs allocated from a shared counter (`/var/lib/claude-agents/.next_slot`):
| Slot | Host-side IP | Container IP | Purpose |
|---|---|---|---|
| 1 | 10.100.0.2 | 10.100.0.3 | First container |
| 2 | 10.100.0.4 | 10.100.0.5 | Second container |
| N | 10.100.0.{N*2} | 10.100.0.{N*2+1} | Nth container |
The slot counter only increments (never reuses), so destroyed containers leave gaps. This is intentional — it avoids IP conflicts without needing a full allocation table.
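A sketch of the allocation arithmetic (the real logic lives in `agent.sh`/`preview.sh`; this compresses it to its core):

```bash
# Read-and-bump the shared slot counter, then derive the veth pair IPs.
slot_file=/var/lib/claude-agents/.next_slot
slot=$(cat "$slot_file" 2>/dev/null || echo 1)
echo $((slot + 1)) > "$slot_file"           # the counter only ever grows

host_ip="10.100.0.$((slot * 2))"            # host side of the veth pair
container_ip="10.100.0.$((slot * 2 + 1))"   # container side
```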
Containers reach the internet through NAT:

```nix
networking.nat = {
  enable = true;
  internalInterfaces = [ "ve-+" ];  # all container veth interfaces
  externalInterface = "enp3s0";     # server's physical interface
};
```

The host firewall allows:
- Ports 80 (HTTP, for redirects) and 443 (HTTPS) from the internet
- All traffic from container veth interfaces (`ve-+`) is trusted — this is required for containers to reach host PostgreSQL and other services

```nix
networking.firewall.allowedTCPPorts = [ 80 443 ];
networking.firewall.trustedInterfaces = [ "ve-+" ];
```

TLS is handled by a Cloudflare Origin CA wildcard certificate, with Cloudflare acting as a reverse proxy:
- Wildcard subdomain DNS (`*.preview.example.com`) points to the server IP via proxied (orange cloud) Cloudflare A records
- Cloudflare SSL/TLS mode must be set to Full (strict) — Origin CA certs are only trusted by Cloudflare's proxy, not by browsers directly
- The Origin CA cert/key are stored at `/var/secrets/cloudflare-origin.pem` and `/var/secrets/cloudflare-origin-key.pem`
- A Caddy snippet (`cloudflare_tls`) is defined in the global config; each site block imports it
- Each preview gets a Caddy config file in `/etc/caddy/previews/<slug>.caddy`
- The main Caddy config imports all files: `import /etc/caddy/previews/*.caddy`
Traffic flow: Browser → Cloudflare (TLS termination with Cloudflare's trusted cert) → Origin server (TLS with Origin CA cert, validated by Cloudflare)
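The snippet and the import line can be wired up roughly like this (a minimal sketch using the stock NixOS `services.caddy` options; the real `configuration.nix` may organize it differently):

```nix
services.caddy = {
  enable = true;
  # Appended verbatim to the generated Caddyfile.
  extraConfig = ''
    # Shared TLS settings; every site block does `import cloudflare_tls`.
    (cloudflare_tls) {
      tls /var/secrets/cloudflare-origin.pem /var/secrets/cloudflare-origin-key.pem
    }

    # One route file per preview, written by preview.sh.
    import /etc/caddy/previews/*.caddy
  '';
};
```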
A PR event flows through the system like this:

```
GitHub PR event
  → HTTPS → Caddy (webhook.preview.example.com)
  → reverse_proxy → localhost:3100 (preview-webhook service)
      → verifies HMAC-SHA256 signature
      → returns 202 immediately
      → background: runs preview create/update/destroy via shell
          → nixos-container create/start/stop/destroy
          → writes/removes Caddy route files
          → reloads Caddy
```
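The signature step is GitHub's standard HMAC scheme over the raw request body. A shell rendering of the same check, shown only to make it concrete since the real implementation is the TypeScript in `src/github.ts` (`$WEBHOOK_SECRET` and the variable names are assumptions):

```bash
# Recompute X-Hub-Signature-256 and compare. Production code should use a
# constant-time comparison (e.g. crypto.timingSafeEqual on the TS side).
expected="sha256=$(printf '%s' "$raw_body" \
  | openssl dgst -sha256 -hmac "$WEBHOOK_SECRET" -r | cut -d' ' -f1)"
[ "$expected" = "$signature_header" ] || exit 1
```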
Serving a preview then looks like:

```
Browser: https://42.preview.example.com
  → Caddy (TLS termination, reads /etc/caddy/previews/42.caddy)
  → reverse_proxy → 10.100.0.5:3000 (preview app)

Browser: https://42.preview.example.com/api/users
  → Caddy (path-based routing from preview-config.nix routes)
  → reverse_proxy → 10.100.0.5:4000 (backend API)
```
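Putting both flows together, a generated route file plausibly looks like the following; the exact contents are produced by `preview.sh`, so treat this as a sketch:

```bash
cat > /etc/caddy/previews/42.caddy <<'EOF'
42.preview.example.com {
    import cloudflare_tls
    # Path-based routes from the repo's preview-config.nix
    handle /api/* {
        reverse_proxy 10.100.0.5:4000
    }
    # Everything else goes to the app
    handle {
        reverse_proxy 10.100.0.5:3000
    }
}
EOF
systemctl reload caddy
```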
| File | Purpose |
|---|---|
| `server-config/configuration.nix` | Host NixOS config (networking, Caddy, PostgreSQL, systemd services) |
| `server-config/flake.nix` | Nix flake defining all system configurations |
| `server-config/agent.sh` | Agent container lifecycle manager (create/destroy/list/ssh) |
| `server-config/preview.sh` | Preview container lifecycle manager (create/destroy/update/list/logs) |
| File | Flake output | Purpose |
|---|---|---|
| `server-config/agent-config.nix` | `nixosConfigurations.agent` | Agent container (Claude Code + dev tools). Contains `YOUR_GIT_EMAIL` and SSH key placeholders |
| `server-config/preview-config.nix` | `nixosConfigurations.preview` | Default preview container (Node.js). Each repo can ship its own `preview-config.nix` at the repo root. |
| File | Purpose |
|---|---|
| `server-config/preview-webhook/src/index.ts` | Fastify server, webhook endpoint |
| `server-config/preview-webhook/src/github.ts` | HMAC signature verification, PR event parsing |
| `server-config/preview-webhook/src/preview.ts` | Shells out to the preview CLI, adds PR links |
| `server-config/preview-webhook/src/config.ts` | Environment variable loading |
| File | Purpose |
|---|---|
| `initial-install/flake.nix` | Flake for nixos-anywhere (references disko for disk setup) |
| `initial-install/disk-config.nix` | RAID 1 disk layout across two SSDs |
| `initial-install/configuration.nix` | Minimal NixOS config for first boot |
Container creation uses pre-built Nix closures instead of evaluating the flake each time:
```bash
# Build once (or done automatically on first create):
nix build /etc/nixos#nixosConfigurations.agent.config.system.build.toplevel

# Create instantly using the cached store path:
nixos-container create myagent --system-path /nix/store/...
```

The store path is cached in `/var/lib/claude-agents/.system-path` (agents) and `/var/lib/preview-deploys/.system-path` (previews). This makes agent creation take ~3 seconds instead of minutes.
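A sketch of how the cache file and the build step fit together (the actual logic in `agent.sh` may differ):

```bash
# Build the agent closure once, remember its store path, reuse it forever.
path_file=/var/lib/claude-agents/.system-path
if [ ! -s "$path_file" ]; then
  nix build /etc/nixos#nixosConfigurations.agent.config.system.build.toplevel \
    --print-out-paths > "$path_file"
fi
nixos-container create myagent --system-path "$(cat "$path_file")"
```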
The `agent` and `preview` commands are wrapped with `writeShellApplication`, which:

- Adds runtime dependencies to `PATH` automatically
- Enforces shellcheck at build time (unquoted variables fail the build)
- Creates a proper executable in the system path

```nix
(pkgs.writeShellApplication {
  name = "preview";
  runtimeInputs = [ coreutils gnused nixos-container openssh curl jq postgresql sudo ];
  text = builtins.readFile ./preview.sh;
})
```

The project uses NixOS imperative containers (`nixos-container create`/`destroy`) rather than declarative ones in `configuration.nix`. This allows:
- Creating/destroying containers without `nixos-rebuild switch`
- Dynamic container names and IPs
- Instant creation from pre-built closures
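In day-to-day use that turns the lifecycle into one-liners (subcommands as listed in the file tables above; `alice` is a made-up name):

```bash
agent create alice     # ~3 s, built from the cached closure
agent ssh alice        # shell into the container
agent destroy alice    # removed; its IP slot is simply retired
```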
Preview containers receive their configuration through `/etc/preview.env`, written into the container's filesystem before it starts. This avoids needing to pass environment variables through `nixos-container` or modify the container config per-instance.
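A sketch of that injection, assuming the standard imperative-container root under `/var/lib/nixos-containers` (the values are hypothetical):

```bash
# Write the env file into the container's filesystem, then boot it.
root=/var/lib/nixos-containers/preview-42
cat > "$root/etc/preview.env" <<'EOF'
PR_NUMBER=42
GIT_REF=refs/pull/42/head
EOF
nixos-container start preview-42
```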
Each repo's `preview-config.nix` declares a database mode:

- `"host"` — Tekton creates a PostgreSQL database on the host and injects `DATABASE_URL` into the container (sketched below). Used for apps that don't manage their own database.
- `"container"` — The app manages its own database inside the container (or has none). Tekton doesn't touch the database. This gives full isolation — destroying the container cleans up everything.
IPs are allocated from a monotonic counter rather than scanning for free slots. This means:
- Destroyed containers leave IP gaps (e.g., slot 3 is never reused if container 3 is destroyed)
- Simple and race-free — no need to lock or scan
- The `/24` subnet supports ~126 concurrent containers (each slot consumes two addresses), which is more than enough