From docker-compose to Voltainer in one script

Tracebind compiles authorized browser sessions into typed APIs. To do that, it drives Auto Browser — an open-source controller that fronts an Xvfb + Chromium runtime via Playwright and exposes a REST/MCP API. Auto Browser ships as a docker-compose stack. We don't run Docker.

This post is the honest log of converting Auto Browser to Voltainer, ArmoredGate's hardened systemd-nspawn platform, in a single bring-up script. Useful if you are evaluating Voltainer, considering systemd-nspawn for production workloads, or trying to wean a non-trivial container stack off Docker. We will not pretend the conversion was clean. It was not. The script is now scripts/voltainer-up.sh and runs end-to-end in about thirty minutes on a cold host. Here is what got us there.

Starting point

Auto Browser's docker-compose.yml declares two services:

browser-node — Debian base image with Xvfb, Chromium, and a small Python supervisor that boots a Playwright server on :9223 plus VNC on :5900 and noVNC on :6080.
controller — FastAPI service on :8000 that talks to browser-node over WebSocket, exposes the REST/MCP surface, and persists session metadata to a shared volume.

The compose file mounts a named volume at /data on both services, runs them on a user-defined bridge network, and forwards a handful of ports to the host. Standard pattern. Nothing exotic.

The translation target was Voltainer, which converts OCI images into systemd-nspawn machines with Landlock confinement, capability dropping, seccomp filtering, and CAS-backed content storage. Voltainer manifests are JSON; bring-up is two shell commands: voltainer create $manifest_id then voltainer start $manifest_id. The ArmoredForge build tool sits in the middle, taking an OCI tarball and producing the manifest plus the CAS blobs.

The plan looked clean on paper:

buildah bud the two Dockerfiles into OCI images.
buildah push them as docker-archive tarballs.
forge convert --input-tarball to emit a Voltainer manifest per service plus a content-addressed blob store.
jq an overrides layer into each manifest to wire env, ports, and bind mounts.
voltainer create and voltainer start both services on the voltbr0 bridge.

In practice each of those steps surfaced something. We will go through them in order.

Stage 1 — buildah

This was the easy part. buildah is a drop-in for docker build in 95% of cases. Auto Browser's Dockerfiles use multi-stage builds, COPY with chown, ARG/ENV mixing, and a HEALTHCHECK. All of it just worked under buildah with the same arguments. We use --layers for the build cache so re-runs are seconds, not minutes:

buildah bud --layers -t auto-browser-browser-node:1.1.0 \
  -f browser-node/Dockerfile browser-node/

The output goes to buildah's local store; buildah push then exports each image as a tarball ArmoredForge can read.
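The export step is not shown above. A minimal sketch follows, assuming the tarballs are written to a scratch directory held in WORK and that the docker-archive: transport is used as the plan describes. The helper names archive_dest and export_image are ours, not from the bring-up script.

```shell
# Scratch directory for exported tarballs (assumed; the default path is illustrative).
WORK="${WORK:-/tmp/voltainer-work}"

# Print the buildah destination for an image:
#   docker-archive:<tarball path>:<name:tag>
archive_dest() {
  local image="$1"            # e.g. auto-browser-browser-node:1.1.0
  local name="${image%%:*}"   # image name without the tag, used for the file name
  printf 'docker-archive:%s/%s.tar:%s\n' "$WORK" "$name" "$image"
}

# Export one image from buildah's local store to a tarball.
export_image() {
  local image="$1"
  mkdir -p "$WORK"
  rm -f "$WORK/${image%%:*}.tar"   # start from a clean file on every run
  buildah push "$image" "$(archive_dest "$image")"
}

# Usage: export_image auto-browser-browser-node:1.1.0
```

Removing any stale tarball first keeps the step re-runnable, which matters once the whole bring-up is destroy-then-recreate.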

Stage 2 — ArmoredForge convert

ArmoredForge does the actual work of turning an OCI tarball into a Voltainer manifest. It walks the image, content-addresses every file into the shared CAS (/var/armored/cas/blobs/), and emits a manifest JSON that points at the blob set.

The first run looked alarming. The browser-node image is 1.69 GiB extracted, nearly all of it Chromium and Playwright vendor binaries. ArmoredForge's scanner flagged 18,233 secrets, 481 malware hits, 33 forbidden items, and quarantined 423 files. Most of this is what scanners always flag inside Chromium: test certificates, example API keys checked into the Chromium source for crypto unit tests, OpenSSL test vectors, compressed test artifacts in .pyc form. The signal is in the small subset that is not inside chromium-*/, and after filtering to /opt/browser-node, /app, and /usr/local/lib/ the real-finding count is small. Don't ship the raw scan numbers to customers; do triage them by code path.
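Triage by code path can be sketched as a small filter. This assumes the findings have been exported one per line as "kind path" with no spaces in the path; that format is hypothetical, as ArmoredForge's real report layout is not shown here. The function name triage_findings is also ours.

```shell
# Keep only findings outside the vendored Chromium tree and inside our own code paths.
# Input on stdin: "<kind> <path>" per line (assumed format).
triage_findings() {
  awk '
    # Anything under a chromium-*/ directory is known scanner noise: skip it.
    $2 ~ "/chromium-[^/]*/" { next }
    # Keep findings under the paths we actually own.
    $2 ~ "^/(opt/browser-node|app|usr/local/lib)/" { print }
  '
}

# Usage: triage_findings < findings.txt | sort | uniq -c
```

The Chromium rule runs first, so a chromium-*/ directory nested under /opt/browser-node is still dropped.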

We ran the conversion in development mode (--mode development --sign=false --scan=false) so we could iterate; production builds get the full scan plus attestation signing.

Stage 3 — the manifest merge

ArmoredForge produces a generic manifest from the image alone. It does not know what env you want set, which ports you want forwarded, or which host paths you want bind-mounted into the machine. Those overrides live in a checked-in JSON file per service. The merge is one line of jq:

# Deep-merge the overrides over the generated manifest, then strip the helper keys.
jq -s '.[0] * .[1] | del(.["_comment", "_post_create"])' \
  "$forge_manifest" "$overrides" \
  | sudo tee "/var/lib/volt/manifests/${manifest_id}.json" > /dev/null
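One property of the merge is worth seeing on a small input: jq's * operator merges objects recursively, but arrays and scalars from the right-hand side replace the left-hand value outright. The sketch below wraps the same filter in a function. The name merge_manifest and the env and ports keys in the usage comment are hypothetical, since the real manifest schema is not reproduced here.

```shell
# Merge an overrides file over a generated manifest and print the result.
# Objects merge key by key; arrays and scalars in the overrides win wholesale.
merge_manifest() {
  local forge_manifest="$1" overrides="$2"
  jq -s '.[0] * .[1] | del(._comment, ._post_create)' \
    "$forge_manifest" "$overrides"
}

# Usage, with illustrative file contents:
#   forge.json:     {"deployment_specs":{"voltainer":{"env":{"A":"1"},"ports":[8000]}}}
#   overrides.json: {"_comment":"x","deployment_specs":{"voltainer":{"env":{"B":"2"},"ports":[9000]}}}
#   merge_manifest forge.json overrides.json
# The merged env holds both A and B, while ports is [9000], not [8000, 9000].
```

If an override needs to extend a list, it has to restate the full list.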

This is also where the first real gap appeared.

Extension #1 — bind_mounts parser

Auto Browser's two services coordinate through /data: browser-node writes the Playwright WebSocket endpoint file there on boot, controller polls for it. The compose stack handled this with a named volume mounted into both services. Voltainer's manifest schema had no equivalent.

Voltainer's lib/systemd.sh emits nspawn unit files from the manifest. It had handlers for env, capabilities, landlock, seccomp, and network — but no bind_mounts. systemd-nspawn itself supports --bind= and --bind-ro= fine; the manifest just didn't surface it.

We added the parser upstream. The relevant addition to lib/systemd.sh:

# bind_mounts: array of strings ("host:container") or objects with .ro=true
mounts=$(jq -r '.deployment_specs.voltainer.bind_mounts // [] | .[] |
  if type=="string" then "--bind=\(.)"
  else (if .ro then "--bind-ro=" else "--bind=" end) +
       (.source) + ":" + (.target)
  end' "$manifest")

Tiny piece of code; load-bearing for any multi-service stack. With it in place both services share /var/lib/volt/shared/ab-data mounted at /data, and the WebSocket handshake works the way the compose file always did.
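To make the two accepted shapes concrete, here is the same filter run against a sample manifest fragment. The shared /data mount comes from the setup described above. The read-only certificate mount is invented for illustration, and the function names sample_manifest and bind_mount_flags are hypothetical.

```shell
# A trimmed manifest showing both accepted bind_mounts shapes.
sample_manifest() {
  cat <<'JSON'
{
  "deployment_specs": {
    "voltainer": {
      "bind_mounts": [
        "/var/lib/volt/shared/ab-data:/data",
        { "source": "/etc/ssl/certs", "target": "/etc/ssl/certs", "ro": true }
      ]
    }
  }
}
JSON
}

# Translate bind_mounts into systemd-nspawn flags, one per line.
# Reads the manifest from the given file, or from stdin when no file is given.
bind_mount_flags() {
  jq -r '.deployment_specs.voltainer.bind_mounts // [] | .[] |
    if type=="string" then "--bind=\(.)"
    else (if .ro then "--bind-ro=" else "--bind=" end) + .source + ":" + .target
    end' "$@"
}

# Usage: sample_manifest | bind_mount_flags
#   --bind=/var/lib/volt/shared/ab-data:/data
#   --bind-ro=/etc/ssl/certs:/etc/ssl/certs
```

A manifest with no bind_mounts key produces no output, so existing manifests are unaffected.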

Extension #2 — landlock missing-path tolerance

The second gap was nastier and took longer to diagnose. The browser-node image expects a number of paths inside the container — /var/lib/dpkg, /usr/share/dbus-1, miscellaneous Chromium runtime directories — that get populated at install time or first launch. Voltainer's Landlock policy translation walks the container rootfs paths and emits ReadOnlyPaths=, ReadWritePaths=, and InaccessiblePaths= lines into the systemd unit.

systemd refuses to start a unit whose ReadOnlyPaths= directive names a path that does not exist, as in ReadOnlyPaths=/some/missing/path. The first nspawn launches died with status=226/NAMESPACE. The unit log just said "namespace setup failed" with no path identified. Tracking it down meant reducing the generated policy to a minimal unit by hand and bisecting its path directives until the missing one surfaced.

The fix is the systemd directive prefix -: prepending it makes systemd skip non-existent paths instead of failing the unit. We patched the translator to emit -/path for every landlock path that doesn't exist on the host at unit-generation time. The lost protection is protection of nothing — the path is missing — so the trade is acceptable. We also added a voltainer lint warning at manifest-load time so the operator sees the skipped paths.
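The translator change can be sketched as a single function. This is an illustration of the idea, not the code that landed upstream: the function name emit_path_directive and the optional root argument (a prefix to check under, so the check is easy to test) are our assumptions.

```shell
# Emit one systemd path directive. When the path is absent, prefix it with "-"
# so systemd skips it instead of failing the unit, and warn on stderr.
emit_path_directive() {
  local directive="$1"   # e.g. ReadOnlyPaths
  local path="$2"        # absolute path named by the policy
  local root="${3:-}"    # optional prefix to check under (assumed; empty means /)
  if [ -e "${root}${path}" ]; then
    printf '%s=%s\n' "$directive" "$path"
  else
    printf 'warning: %s does not exist, marking it optional\n' "$path" >&2
    printf '%s=-%s\n' "$directive" "$path"
  fi
}

# Usage:
#   emit_path_directive ReadOnlyPaths /var/lib/dpkg
#   emit_path_directive ReadWritePaths /data
```

Keeping the warning on stderr means the generated unit text stays clean while the operator still sees which paths were skipped.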

Both extensions are now in upstream Voltainer.

Stage 4 — bring up

voltainer create succeeded. voltainer start ab-browser-node-1.1.0 also succeeded. voltainer start ab-controller-1.1.0 failed.

The controller couldn't resolve ab-browser-node. Voltainer does not seed /etc/hosts for known machines; service-name DNS isn't a primitive of the platform. The browser-node container has a stable bridge IP (10.89.0.3) and the controller has 10.89.0.4, but the controller's Python code addresses browser-node by service name because the compose file did.

Two options: route through a DNS resolver inside the bridge, or seed /etc/hosts post-create. We picked the second, and that is where the worst bug of the whole conversion came from.

The 585-hard-link incident

We wrote a _post_create step that appended an /etc/hosts line to the controller tinyvol with:

echo "10.89.0.3  ab-browser-node" >> /var/lib/volt/tinyvols/ab-controller-1.1.0/etc/hosts

Plausible enough. It also briefly broke ~600 other files on the host.

Voltainer's tinyvols are built by hard-linking from the CAS blob store. The CAS deduplicates aggressively: every empty file in every image points at the same single zero-byte blob, and the controller's /etc/hosts, empty in the image, was evidently one of them. At the moment we ran the redirect, that blob had 585 hard links pointing at it from various tinyvols. Shell > and >> open the existing file and write through to its inode; with 585 hard links to a shared inode, every one of those files now contained 10.89.0.3 ab-browser-node: Debian package descriptors, empty lock files, NSS placeholders, the works.

We caught this in the next minute because the bring-up smoke test failed in a completely unrelated container that depended on one of those empty files being empty. Fixing the corruption required nuking and recreating about forty tinyvols, which is fast because they are CAS-backed, but it was an unhappy afternoon.
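The failure is easy to reproduce in a scratch directory. The sketch below uses three hard links as a stand-in for the 585. The file names and the function demo_hardlink_writes are illustrative, and the behavior of install described in the comments is that of GNU coreutils.

```shell
# Show that appending through one hard link changes every link to the inode,
# while install(1) replaces only the one directory entry.
demo_hardlink_writes() {
  local dir
  dir="$(mktemp -d)"

  : > "$dir/blob"                  # stand-in for the shared zero-byte CAS blob
  ln "$dir/blob" "$dir/hosts"      # two "tinyvol" files hard-linked to it
  ln "$dir/blob" "$dir/lockfile"

  # The unsafe write: appends to the shared inode.
  echo "10.89.0.3  ab-browser-node" >> "$dir/hosts"
  echo "after append: lockfile=$(wc -c < "$dir/lockfile" | tr -d ' ') bytes"

  : > "$dir/blob"                  # truncate the shared inode back to empty

  # The safe write: install unlinks the destination and creates a new inode.
  printf '10.89.0.3  ab-browser-node\n' > "$dir/new-hosts"
  install -m 644 "$dir/new-hosts" "$dir/hosts"
  echo "after install: lockfile=$(wc -c < "$dir/lockfile" | tr -d ' ') bytes"

  rm -rf "$dir"
}

# Usage: demo_hardlink_writes
```

On a GNU coreutils system the first line reports 27 bytes in the supposedly empty lock file and the second reports 0.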

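The correct way to write to a tinyvol file is to replace the directory entry rather than write into the existing inode. install(1) does this: it unlinks the destination and creates a fresh file, so every other hard link keeps pointing at the untouched blob. Plain cp is not safe here, because by default it opens an existing destination and overwrites it in place, exactly like shell redirection; it needs --remove-destination to break the link. This is now a documented gotcha; the bring-up script uses install for every tinyvol write: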

jq -r '._post_create.etc_hosts_entries[]' "$DEPLOY_DIR/ab-controller.overrides.json" \
  > "$WORK/etc-hosts"
sudo install -m 644 -o root -g root "$WORK/etc-hosts" \
  /var/lib/volt/tinyvols/ab-controller-1.1.0/etc/hosts

We have proposed a voltainer write-tinyvol subcommand upstream that wraps the safe pattern so it doesn't have to live in every operator's head.
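Until something like that exists, the safe pattern fits in a few lines of shell. This is a sketch of the idea only, not the proposed subcommand: the name write_tinyvol_file is hypothetical, and it swaps install for a temp-file-plus-rename, which has the same link-breaking effect.

```shell
# Write stdin to DEST without touching the inode DEST currently points at.
# The content goes to a fresh temp file in the same directory, which is then
# renamed over DEST; other hard links to the old inode are left untouched.
write_tinyvol_file() {
  local dest="$1" mode="${2:-644}" tmp
  tmp="$(mktemp "$(dirname "$dest")/.tmp.XXXXXX")" || return 1
  cat > "$tmp" && chmod "$mode" "$tmp" && mv -f "$tmp" "$dest" \
    || { rm -f "$tmp"; return 1; }
}

# Usage (run as a user allowed to write the tinyvol):
#   echo "10.89.0.3  ab-browser-node" \
#     | write_tinyvol_file /var/lib/volt/tinyvols/ab-controller-1.1.0/etc/hosts
```

Creating the temp file next to the destination keeps the rename on one filesystem, so the swap is a single rename rather than a copy.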

Stage 5 — egress, allowlists, and small bugs

Three more things bit us between "containers up" and "end-to-end smoke passing."

Egress masquerade. Containers on voltbr0 couldn't reach the internet. Voltainer's network_setup_bridge adds an nftables masquerade rule on first bridge creation, but our bridge predated that code path. The bring-up script now adds it defensively if missing:

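# Add the masquerade rule only when the chain does not already carry it.
sudo nft list chain ip nat postrouting | grep -q 'saddr 10.89.0.0/24.*masquerade' \
  || sudo nft add rule ip nat postrouting \
       ip saddr 10.89.0.0/24 oifname '!=' voltbr0 masquerade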

ALLOWED_HOSTS=*. Auto Browser ships with an ALLOWED_HOSTS setting defaulting to example.com,localhost,127.0.0.1,::1. Anything else returns a 403 with no body and no useful log line. We spent twenty minutes convinced the controller was broken before grepping for the magic string in the upstream source. The bring-up sets ALLOWED_HOSTS=* for development; restrict in production.

Single shared browser session. Auto Browser's shared_browser_node isolation mode allows exactly one active Playwright session at a time. Re-running the bring-up while a previous session was still attached produced 409s; the script now closes any existing session before creating a new one.

Stage 6 — the final pipeline

Stripped to its essence, the conversion pipeline is one shell function per stage, end-to-end:

buildah bud
  → buildah push (docker-archive tarball)
    → forge convert --input-tarball
      → jq merge overrides into manifest
        → rsync CAS blobs into /var/armored/cas
          → voltainer destroy / create / start
            → install /etc/hosts post-create
              → smoke: curl http://10.89.0.4:8000/healthz
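The smoke step at the end of that chain needs some patience, since the controller may not be listening the instant voltainer start returns. Whether the real script retries is not stated here; the sketch below assumes it does, and the helper name wait_until is ours.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
#   wait_until <attempts> <delay-seconds> <command...>
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  while [ "$attempts" -gt 0 ]; do
    "$@" && return 0                # success: stop retrying
    attempts=$((attempts - 1))
    if [ "$attempts" -gt 0 ]; then  # no point sleeping after the last attempt
      sleep "$delay"
    fi
  done
  return 1
}

# Usage: fail the bring-up if the controller is not healthy within about 30 seconds.
#   wait_until 30 1 curl -fsS -o /dev/null http://10.89.0.4:8000/healthz
```

curl's -f flag turns HTTP error statuses into a non-zero exit code, so a 403 from the host allowlist counts as a failed attempt rather than a pass.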

The end-to-end test that demonstrates the whole thing works: scripts/run-e2e-real.sh boots Tracebind against the Voltainer-deployed Auto Browser, drives a real Chromium session at httpbin.org, captures eleven network events, and normalizes them into six endpoint clusters with correct {id} collapsing. No Docker on the host. Nothing privileged outside the manifest. The browser process runs under a Landlock policy that lets it read its own runtime and write its own /data and nothing else.

Lessons for anyone doing this

A few things we would tell ourselves before starting:

Treat the CAS as a shared resource. Anything that touches a tinyvol file should create a new inode. Shell redirection is the wrong tool. If your platform's CAS hard-links, this will bite you exactly once and you will not forget.
The manifest is where the platform's surface area is. Every gap between docker-compose and Voltainer surfaces as "this directive doesn't exist in the manifest yet." Budget time for one or two upstream contributions; it is cheaper than working around the gap in the bring-up script.
Landlock missing-path tolerance saves an afternoon. Default systemd behavior is to fail the unit. The - prefix is in the systemd manual and was not in our heads.
systemd-nspawn is genuinely production-ready. Once the manifest reflects reality, the runtime is rock-solid, security-tightenable, and measurably lighter than Docker. We're not paying a runtime tax for the hardened-container story.
Keep the bring-up idempotent. Every stage of our script is destroy-then-recreate. Re-running it is the debug loop.
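The destroy-then-recreate loop from the last point can be sketched as one function per machine. The three subcommands are the ones named in the pipeline above. The VOLTAINER variable and the function name recreate_machine are our additions so the sequence can be dry-run, and we assume destroy takes the same manifest id as create and start.

```shell
# Command used to reach Voltainer; overridable for dry runs (assumption).
VOLTAINER="${VOLTAINER:-voltainer}"

# Tear down a machine if it exists, then create and start it from its manifest.
recreate_machine() {
  local id="$1"
  "$VOLTAINER" destroy "$id" || true   # a machine that is not there yet is not an error
  "$VOLTAINER" create "$id" && "$VOLTAINER" start "$id"
}

# Usage:
#   recreate_machine ab-browser-node-1.1.0
#   recreate_machine ab-controller-1.1.0
# Dry run that only prints the sequence:
#   VOLTAINER=echo recreate_machine ab-controller-1.1.0
```

Because the first step tolerates a missing machine, the same function serves both the cold-host bring-up and every re-run after it.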

What ships

The conversion produced four checked-in artifacts:

scripts/voltainer-up.sh — one-shot bring-up.
deploy/voltainer/ab-browser-node.overrides.json and ab-controller.overrides.json — the override layer that merges into the ArmoredForge output.
The two upstream Voltainer extensions: bind_mounts parser and landlock missing-path tolerance, both in Voltainer/lib/systemd.sh.

If you are evaluating Voltainer for a real workload, this is the path. The script is the docs.

— The Tracebind team