From docker-compose to Voltainer in one script
Tracebind compiles authorized browser sessions into typed APIs. To do that, it drives Auto Browser — an open-source controller that fronts an Xvfb + Chromium runtime via Playwright and exposes a REST/MCP API. Auto Browser ships as a docker-compose stack. We don't run Docker.
This post is the honest log of converting Auto Browser to
Voltainer, ArmoredGate's hardened systemd-nspawn
platform, in a single bring-up script. Useful if you are evaluating
Voltainer, considering systemd-nspawn for production workloads, or
trying to wean a non-trivial container stack off Docker. We will not
pretend the conversion was clean. It was not. The script is now scripts/voltainer-up.sh
and runs end-to-end in about thirty minutes on a cold host. Here is what
got us there.
Starting point
Auto Browser's docker-compose.yml declares two
services:
- browser-node: Debian base image with Xvfb, Chromium, and a small Python supervisor that boots a Playwright server on :9223 plus VNC on :5900 and noVNC on :6080.
- controller: FastAPI service on :8000 that talks to browser-node over WebSocket, exposes the REST/MCP surface, and persists session metadata to a shared volume.
The compose file mounts a named volume at /data on both
services, runs them on a user-defined bridge network, and forwards a
handful of ports to the host. Standard pattern. Nothing exotic.
The translation target was Voltainer, which converts OCI images into
systemd-nspawn machines with Landlock confinement, capability dropping,
seccomp filtering, and CAS-backed content storage. Voltainer manifests
are JSON; bring-up is two shell commands:
voltainer create $manifest_id then
voltainer start $manifest_id. The ArmoredForge build tool
sits in the middle, taking an OCI tarball and producing the manifest
plus the CAS blobs.
The plan looked clean on paper:
1. buildah bud the two Dockerfiles into OCI images.
2. buildah push them as docker-archive tarballs.
3. forge convert --input-tarball to emit a Voltainer manifest per service plus a content-addressed blob store.
4. jq an overrides layer into each manifest to wire env, ports, and bind mounts.
5. voltainer create and voltainer start both services on the voltbr0 bridge.
In practice each of those steps surfaced something. We will go through them in order.
Stage 1 — buildah
This was the easy part. buildah bud is a near drop-in for
docker build in the common cases. Auto Browser's Dockerfiles
use multi-stage builds, COPY with chown, ARG/ENV mixing, and a
HEALTHCHECK. All of it just worked under buildah with the same
arguments. We use --layers for the build cache so re-runs
are seconds, not minutes:
buildah bud --layers -t auto-browser-browser-node:1.1.0 \
-f browser-node/Dockerfile browser-node/
The output goes to buildah's local store; buildah push
then exports each image as a tarball ArmoredForge can read.
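A minimal sketch of the export step, assuming the images sit in buildah's local store. The docker-archive: transport is standard buildah; the export_image helper, the BUILDAH override variable, and the output directory layout are hypothetical names for illustration, not taken from the bring-up script.

```shell
# Sketch: export a locally built image as a docker-archive tarball.
# export_image and BUILDAH are hypothetical names used for illustration.
BUILDAH="${BUILDAH:-buildah}"

export_image() {
  local image="$1" out_dir="$2" name tarball
  # auto-browser-browser-node:1.1.0 -> auto-browser-browser-node-1.1.0
  name=$(printf '%s' "$image" | sed 's|[:/]|-|g')
  tarball="$out_dir/$name.tar"
  mkdir -p "$out_dir"
  # Remove any stale tarball so each run starts from a clean file.
  rm -f "$tarball"
  # Send buildah's progress output to stderr; stdout carries only the path.
  $BUILDAH push "$image" "docker-archive:$tarball:$image" >&2 || return 1
  printf '%s\n' "$tarball"
}
```

Calling export_image auto-browser-browser-node:1.1.0 ./build prints the tarball path on success, so the result can be passed straight to the next stage.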
Stage 2 — ArmoredForge convert
ArmoredForge does the actual work of turning an OCI tarball into a
Voltainer manifest. It walks the image, content-addresses every file
into the shared CAS (/var/armored/cas/blobs/), and emits a
manifest JSON that points at the blob set.
The first run looked alarming. The browser-node image is 1.69 GiB
extracted, nearly all of it Chromium and Playwright vendor binaries.
ArmoredForge's scanner flagged 18,233 secrets,
481 malware hits, 33 forbidden items,
and quarantined 423 files. Most of this is what
scanners always flag inside Chromium: test certificates, example API
keys checked into the Chromium source for crypto unit tests, OpenSSL
test vectors, compressed test artifacts in .pyc form. The
signal is in the small subset that is not inside
chromium-*/, and after filtering to
/opt/browser-node, /app, and
/usr/local/lib/ the real-finding count is small. Don't ship
the raw scan numbers to customers; do triage them by code path.
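The triage-by-path step can be sketched as a two-stage filter. This assumes the scanner findings have already been flattened to one file path per line, which is our assumption and not ArmoredForge's report format; triage_findings is a hypothetical helper.

```shell
# Sketch: keep only findings under code we own, dropping Chromium vendor paths.
# Input format (one file path per line on stdin) is an assumption.
triage_findings() {
  # First drop anything inside a chromium-*/ directory, then keep only the
  # three prefixes named in the text. "|| true" keeps an empty result from
  # looking like a failure.
  grep -v -E '(^|/)chromium-[^/]*/' \
    | grep -E '^(/opt/browser-node|/app|/usr/local/lib)/' \
    || true
}
```

Piping the full path list through triage_findings leaves the short list that deserves a human look.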
After that first pass we ran the conversion in development mode
(--mode development --sign=false --scan=false) so we could
iterate quickly; production builds get the full scan plus attestation
signing.
Stage 3 — the manifest merge
ArmoredForge produces a generic manifest from the image alone. It
does not know what env you want set, which ports you want forwarded, or
which host paths you want bind-mounted into the machine. Those overrides
live in a checked-in JSON file per service. The merge is one line of
jq:
jq -s '.[0] * .[1] | del(.["_comment", "_post_create"])' \
"$forge_manifest" "$overrides" \
| sudo tee /var/lib/volt/manifests/${manifest_id}.json
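One property of this merge is worth knowing before writing an overrides file: jq's * operator merges objects recursively, but for any other type, arrays included, the right-hand value replaces the left one outright. A small sketch on toy input follows; the env and ports keys are made up for illustration and are not the real manifest schema, and merge_manifest is a hypothetical wrapper.

```shell
# Sketch: the same merge, wrapped so it can be tried on toy files.
# Objects merge key by key; arrays and scalars from the overrides win whole.
merge_manifest() {
  jq -s '.[0] * .[1] | del(._comment, ._post_create)' "$1" "$2"
}
```

With a base of {"env":{"A":"1","B":"2"},"ports":[8000]} and overrides of {"env":{"B":"3"},"ports":[8000,9000]}, the result keeps A, takes B from the overrides, and uses the overrides' ports array as-is. An override that lists a single port therefore drops every port the generated manifest had.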
This is also where the first real gap appeared.
Extension #1 — bind_mounts parser
Auto Browser's two services coordinate through /data:
browser-node writes the Playwright WebSocket endpoint file there on
boot, controller polls for it. The compose stack handled this with a
named volume mounted into both services. Voltainer's manifest schema had
no equivalent.
Voltainer's lib/systemd.sh
emits nspawn unit files from the manifest. It had handlers for env,
capabilities, landlock, seccomp, and network — but no
bind_mounts. systemd-nspawn itself supports
--bind= and --bind-ro= fine; the manifest just
didn't surface it.
We added the parser upstream. The relevant addition to
lib/systemd.sh:
# bind_mounts: array of strings ("host:container") or objects with .ro=true
mounts=$(jq -r '.deployment_specs.voltainer.bind_mounts // [] | .[] |
if type=="string" then "--bind=\(.)"
else (if .ro then "--bind-ro=" else "--bind=" end) +
(.source) + ":" + (.target)
end' "$manifest")
Tiny piece of code; load-bearing for any multi-service stack. With it
in place both services share /var/lib/volt/shared/ab-data
mounted at /data, and the WebSocket handshake works the way
the compose file always did.
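To see what the parser produces, here is the same jq filter wrapped in a function and run against a sample manifest. The bind_mount_flags name and the sample paths are ours for illustration; only the filter logic comes from the snippet above.

```shell
# Sketch: print one systemd-nspawn flag per bind_mounts entry.
# String entries are "host:container"; object entries use .source/.target
# and switch to --bind-ro= when .ro is true.
bind_mount_flags() {
  jq -r '.deployment_specs.voltainer.bind_mounts // [] | .[] |
    if type == "string" then "--bind=\(.)"
    else (if .ro then "--bind-ro=" else "--bind=" end)
         + .source + ":" + .target
    end' "$1"
}
```

For a manifest whose bind_mounts array holds the string "/var/lib/volt/shared/ab-data:/data" and the object {"source":"/etc/ssl/certs","target":"/etc/ssl/certs","ro":true}, the function prints --bind=/var/lib/volt/shared/ab-data:/data followed by --bind-ro=/etc/ssl/certs:/etc/ssl/certs. A manifest with no bind_mounts key prints nothing, so existing manifests are unaffected.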
Extension #2 — landlock missing-path tolerance
The second gap was nastier and took longer to diagnose. The
browser-node image expects a number of paths inside the container —
/var/lib/dpkg, /usr/share/dbus-1,
miscellaneous Chromium runtime directories — that get populated at
install time or first launch. Voltainer's Landlock policy translation
walks the container rootfs paths and emits ReadOnlyPaths=,
ReadWritePaths=, and InaccessiblePaths= lines
into the systemd unit.
systemd refuses to start a unit with a
ReadOnlyPaths=/some/missing/path directive. The first
nspawn launches died with status=226/NAMESPACE. The unit
log just said "namespace setup failed" with no path identified. Tracking
it down meant copying the generated policy into a minimal unit by hand
and bisecting to find which path was missing.
The fix is the systemd directive prefix -: prepending it
makes systemd skip non-existent paths instead of failing the unit. We
patched the translator to emit -/path for every landlock
path that doesn't exist on the host at unit-generation time. The lost
protection is protection of nothing — the path is missing — so the trade
is acceptable. We also added a voltainer lint warning at
manifest-load time so the operator sees the skipped paths.
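The shape of that fix, as a rough sketch rather than the upstream patch: check each path at generation time and prefix the missing ones with -. The emit_path_directive helper and its optional root argument (used here so the check can be pointed at a scratch directory) are hypothetical.

```shell
# Sketch: emit a systemd path directive, prefixing "-" when the path is absent
# so systemd skips it instead of failing the unit with 226/NAMESPACE.
# emit_path_directive is a hypothetical helper, not the upstream translator.
emit_path_directive() {
  local directive="$1" path="$2" root="${3:-}"
  if [ -e "$root$path" ]; then
    printf '%s=%s\n' "$directive" "$path"
  else
    # Surface the skip so the operator knows the path is unprotected.
    printf 'warning: %s is missing, emitting tolerant entry\n' "$path" >&2
    printf '%s=-%s\n' "$directive" "$path"
  fi
}
```

For a path that exists, emit_path_directive ReadOnlyPaths /var/lib/dpkg prints ReadOnlyPaths=/var/lib/dpkg; for one that does not, it prints the same line with -/var/lib/dpkg and a warning on stderr.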
Both extensions are now in upstream Voltainer.
Stage 4 — bring up
voltainer create succeeded.
voltainer start ab-browser-node-1.1.0 also succeeded.
voltainer start ab-controller-1.1.0 failed.
The controller couldn't resolve ab-browser-node.
Voltainer does not seed /etc/hosts for known machines;
service-name DNS isn't a primitive of the platform. The browser-node
container has a stable bridge IP (10.89.0.3), the
controller has 10.89.0.4, but the controller's Python code
uses the service name because the compose file did.
Two options: route through a DNS resolver inside the bridge, or seed
/etc/hosts post-create. We picked the second, and that is
where the worst bug of the whole conversion came from.
The 585-hard-link incident
We wrote a _post_create step that appended an
/etc/hosts line to the controller tinyvol with:
echo "10.89.0.3 ab-browser-node" >> /var/lib/volt/tinyvols/ab-controller-1.1.0/etc/hosts
Plausible enough. It also briefly broke ~600 other files on the host.
Voltainer's tinyvols are built by hard-linking from the CAS blob
store. The CAS deduplicates aggressively — every empty file in every
image points at the same single zero-byte blob. As of the moment we ran
the redirect, that blob had 585 hard links pointing at
it from various tinyvols. Shell > and
>> open the existing file and write through; with 585
hard-links to a shared inode, every one of those files now contained
10.89.0.3 ab-browser-node — Debian package descriptors,
empty lock files, NSS placeholders, the works.
We caught this in the next minute because the bring-up smoke test failed in a completely unrelated container that depended on one of those empty files being empty. Fixing the corruption required nuking and recreating about forty tinyvols, which is fast because CAS-backed, but it was an unhappy afternoon.
The correct way to write to a tinyvol file is install(1), or cp
--remove-destination from a temporary source: both unlink the
destination and create a new inode, so the shared blob is left alone.
Plain cp onto an existing file opens and truncates it in place, exactly
like shell redirection, so it writes through too. This is now a
documented gotcha; the bring-up script uses install for every tinyvol write:
jq -r '._post_create.etc_hosts_entries[]' "$DEPLOY_DIR/ab-controller.overrides.json" \
> "$WORK/etc-hosts"
sudo install -m 644 -o root -g root "$WORK/etc-hosts" \
/var/lib/volt/tinyvols/ab-controller-1.1.0/etc/hosts
We have proposed a voltainer write-tinyvol subcommand
upstream that wraps the safe pattern so it doesn't have to live in every
operator's head.
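The failure is easy to reproduce in a scratch directory without touching a real CAS. The sketch below hard-links two stand-in tinyvol files to one empty blob, appends through one of them, then repeats the write with install. The file names and the hardlink_demo function are illustrative only.

```shell
# Sketch: reproduce the write-through on hard-linked files in a temp dir.
hardlink_demo() {
  local dir
  dir=$(mktemp -d)
  : > "$dir/blob"                  # stand-in for the shared zero-byte CAS blob
  ln "$dir/blob" "$dir/hosts"      # stand-in tinyvol file 1
  ln "$dir/blob" "$dir/lockfile"   # stand-in tinyvol file 2

  # Redirection writes into the shared inode, so lockfile changes too.
  echo "10.89.0.3 ab-browser-node" >> "$dir/hosts"
  echo "redirect: lockfile is $(wc -c < "$dir/lockfile" | tr -d ' ') bytes"

  : > "$dir/blob"                  # reset the shared inode to empty
  echo "10.89.0.3 ab-browser-node" > "$dir/new-hosts"
  # install unlinks the destination first, so hosts gets its own inode.
  install -m 644 "$dir/new-hosts" "$dir/hosts"
  echo "install: lockfile is $(wc -c < "$dir/lockfile" | tr -d ' ') bytes," \
       "hosts is $(wc -c < "$dir/hosts" | tr -d ' ') bytes"

  rm -rf "$dir"
}
```

After the redirect, lockfile holds the hosts line even though nothing wrote to it by name. After install, lockfile stays empty and only hosts carries the entry.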
Stage 5 — egress, allowlists, and small bugs
Three more things bit us between "containers up" and "end-to-end smoke passing."
Egress masquerade. Containers on
voltbr0 couldn't reach the internet. Voltainer's
network_setup_bridge adds an nftables masquerade rule on
first bridge creation, but our bridge predated that code path. The
bring-up script now adds it defensively if missing:
sudo nft add rule ip nat postrouting \
ip saddr 10.89.0.0/24 oifname '!=' voltbr0 masquerade
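Run as-is, that command appends a duplicate rule on every invocation, so the "if missing" part needs a check in front of it. One way to sketch it, assuming the ip nat table and its postrouting chain already exist and that nft lists the rule with the same ip saddr and masquerade tokens it was written with (worth verifying against your nft version). The ensure_masquerade helper and the NFT override variable are hypothetical.

```shell
# Sketch: add the masquerade rule only if no matching rule is listed.
# NFT is overridable for testing; the grep match is a heuristic on nft's
# listing output, not a structural comparison of rules.
NFT="${NFT:-sudo nft}"

ensure_masquerade() {
  local subnet="$1" bridge="$2"
  if $NFT list chain ip nat postrouting 2>/dev/null \
      | grep -F "ip saddr $subnet" | grep -q masquerade; then
    echo "masquerade rule for $subnet already present"
    return 0
  fi
  $NFT add rule ip nat postrouting \
    ip saddr "$subnet" oifname '!=' "$bridge" masquerade
}
```

Calling ensure_masquerade 10.89.0.0/24 voltbr0 twice in a row should leave exactly one rule in the chain.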
ALLOWED_HOSTS=*. Auto Browser ships
with an ALLOWED_HOSTS setting defaulting to
example.com,localhost,127.0.0.1,::1. Anything else returns
a 403 with no body and no useful log line. We spent twenty minutes
convinced the controller was broken before grepping for the magic string
in the upstream source. The bring-up sets ALLOWED_HOSTS=*
for development; restrict in production.
Single shared browser session. Auto Browser's
shared_browser_node isolation mode allows exactly one
active Playwright session at a time. Re-running the bring-up while a
previous session was still attached produced 409s; the script now closes
any existing session before creating a new one.
Stage 6 — the final pipeline
Stripped to its essence, the conversion pipeline is one shell function per stage, end-to-end:
buildah bud
→ buildah push (docker-archive tarball)
→ forge convert --input-tarball
→ jq merge overrides into manifest
→ rsync CAS blobs into /var/armored/cas
→ voltainer destroy / create / start
→ install /etc/hosts post-create
→ smoke: curl http://10.89.0.4:8000/healthz
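The smoke step at the end benefits from a retry loop, since the controller may take a few seconds to bind its port after voltainer start returns. A minimal sketch, with illustrative retry defaults; wait_for_healthz and the CURL override variable are hypothetical names, while the curl flags are standard.

```shell
# Sketch: poll the controller health endpoint until it answers or we give up.
# CURL is overridable for testing; attempts and delay are illustrative defaults.
CURL="${CURL:-curl}"

wait_for_healthz() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f turns HTTP errors into a non-zero exit; --max-time bounds each try.
    if $CURL -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "no healthy response from $url after $attempts attempts" >&2
  return 1
}
```

In the pipeline above this would be called as wait_for_healthz http://10.89.0.4:8000/healthz, with a non-zero return failing the bring-up.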
The end-to-end test that demonstrates the whole thing works:
scripts/run-e2e-real.sh boots Tracebind against the
Voltainer-deployed Auto Browser, drives a real Chromium session at
httpbin.org, captures eleven network events, and normalizes them into
six endpoint clusters with correct {id} collapsing. No
Docker on the host. Nothing privileged outside the manifest. The browser
process runs under a Landlock policy that lets it read its own runtime
and write its own /data and nothing else.
Lessons for anyone doing this
A few things we would tell ourselves before starting:
- Treat the CAS as a shared resource. Anything that touches a tinyvol file should create a new inode. Shell redirection is the wrong tool. If your platform's CAS hard-links, this will bite you exactly once and you will not forget.
- The manifest is where the platform's surface area is. Every gap between docker-compose and Voltainer surfaces as "this directive doesn't exist in the manifest yet." Budget time for one or two upstream contributions; it is cheaper than working around the gap in the bring-up script.
- Landlock missing-path tolerance saves an afternoon. Default systemd behavior is to fail the unit. The - prefix is in the systemd manual and was not in our heads.
- systemd-nspawn is genuinely production-ready. Once the manifest reflects reality, the runtime is rock-solid, security-tightenable, and measurably lighter than Docker. We're not paying a runtime tax for the hardened-container story.
- Keep the bring-up idempotent. Every stage of our script is destroy-then-recreate. Re-running it is the debug loop.
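The destroy-then-recreate pattern from the last point can be sketched in a few lines. It uses the three voltainer subcommands named in the pipeline; treating a failed destroy as harmless (for example when the machine does not exist yet) is our assumption about its exit code, and recreate_machine and the VOLTAINER override variable are hypothetical.

```shell
# Sketch: destroy-then-recreate so every re-run starts from a known state.
# VOLTAINER is overridable for testing. A failing destroy is ignored on the
# assumption that it only means the machine was not there to begin with.
VOLTAINER="${VOLTAINER:-voltainer}"

recreate_machine() {
  local manifest_id="$1"
  $VOLTAINER destroy "$manifest_id" 2>/dev/null || true
  $VOLTAINER create "$manifest_id" || return 1
  $VOLTAINER start "$manifest_id"
}
```

Because the first step tolerates a missing machine, the same call works on a cold host and on the tenth re-run of a debugging session.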
What ships
The conversion produced four checked-in artifacts:
- scripts/voltainer-up.sh: one-shot bring-up.
- deploy/voltainer/ab-browser-node.overrides.json and ab-controller.overrides.json: the override layer that merges into the ArmoredForge output.
- The two upstream Voltainer extensions: bind_mounts parser and landlock missing-path tolerance, both in Voltainer/lib/systemd.sh.
If you are evaluating Voltainer for a real workload, this is the path. The script is the docs.
— The Tracebind team