Self-Hosted Bare Metal Runners


A Self-Hosted Runner is a Bare Metal Runner that you run and manage yourself, instead of using a Bencher-managed On-Demand or Dedicated Runner. You point the runner binary at a Bencher API server, and it claims and executes Jobs on your own hardware.

Self-hosting a Runner is a good fit when you want to:

  • Run benchmarks on specific hardware that Bencher Cloud does not offer
  • Keep benchmark workloads inside your own network or an air-gapped environment
  • Run Bencher Self-Hosted end-to-end

This guide walks through registering a Runner, describing its hardware with a Spec, linking the two together, and starting the runner binary. For a conceptual overview of how Runners, Specs, Sandboxes, and Jobs fit together, see the Bare Metal Overview.

🐰 Firecracker sandboxing requires Linux with KVM enabled. A Runner that only serves non-sandboxed Specs can run on any supported host.
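Before registering a host for sandboxed Specs, it can help to confirm that KVM is usable there. The sketch below is not part of the Bencher tooling: it only checks that the standard Linux KVM device node, /dev/kvm, exists and is readable and writable by the current user. The function name kvm_status is hypothetical.

```shell
# Report whether a KVM device node looks usable by the current user.
# The optional path argument exists only to make the function easy to test;
# /dev/kvm is the standard Linux KVM device node.
kvm_status() {
  dev="${1:-/dev/kvm}"
  if [ ! -e "$dev" ]; then
    echo "missing"      # no device node: KVM is not enabled or not exposed
  elif [ -r "$dev" ] && [ -w "$dev" ]; then
    echo "available"    # the current user can open the device
  else
    echo "no-access"    # device exists, but permissions block this user
  fi
}

kvm_status
```

A result of no-access usually means the user that will run the runner binary needs permission on the device node, for example through group membership.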

Create a Runner

A Runner is the resource that authenticates the runner binary to your Bencher API server. Create one with the bencher CLI, the Runners REST API, or the Bencher Console.

Terminal window
bencher runner create \
--host https://api.bencher.example.com \
--name "My Runner"

Bencher returns the new Runner along with its key:

{
"uuid": "...",
"key": "bencher_runner_..."
}

The key is only shown once, so store it securely. You will pass it to the runner binary when you start the Runner. If a key is ever lost or leaked, rotate it with the POST /v0/runners/{runner}/key API, which immediately invalidates the previous key.
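As a rough sketch, the rotation endpoint could be called with curl as shown below. The endpoint path comes from this guide; everything else is an assumption: the helper names are hypothetical, and this guide does not say how the endpoint authenticates, so the Bearer token read from BENCHER_API_TOKEN is a guess to adapt to your setup.

```shell
# Build the key-rotation URL for a Runner (UUID or slug).
runner_key_url() {
  host="${1%/}"   # drop one trailing slash, if present
  runner="$2"
  echo "$host/v0/runners/$runner/key"
}

# Rotate the key by POSTing to the endpoint.
# ASSUMPTION: the endpoint accepts a Bearer API token, read here from
# the BENCHER_API_TOKEN environment variable.
rotate_runner_key() {
  curl --fail --silent --show-error \
    --request POST \
    --header "Authorization: Bearer $BENCHER_API_TOKEN" \
    "$(runner_key_url "$1" "$2")"
}

# Example invocation (not run here):
#   rotate_runner_key https://api.bencher.example.com my-runner
```

Because rotation invalidates the previous key immediately, a running runner binary would need to be restarted with the new key.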

Create a Spec

A Spec describes the hardware that a Runner provides: operating system, CPU architecture, Sandbox type, CPU count, memory, disk, and network access. A Job declares the Spec it needs, and a Runner only claims Jobs for Specs it supports.

Create a Spec with the bencher CLI, the Specs REST API, or the Bencher Console:

Terminal window
bencher spec create \
--host https://api.bencher.example.com \
--name "Intel v1" \
--os linux \
--architecture x86_64 \
--sandbox firecracker \
--cpu 4 \
--memory 51539607552 \
--disk 137438953472

Omit --sandbox to define a non-sandboxed Spec, and pass --network if Jobs on this Spec are allowed network access. Memory and disk are specified in bytes. For more on Specs and how a Testbed records the Spec used for each Report, see Testbeds & Specs.
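Because memory and disk are given in bytes, it is easy to drop or add a digit. A small helper like the one below (a hypothetical function, not part of the bencher CLI) converts GiB to bytes and reproduces the values used in the example above.

```shell
# Convert a whole number of GiB to bytes (1 GiB = 1024^3 bytes).
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 48    # 51539607552, the --memory value above
gib_to_bytes 128   # 137438953472, the --disk value above
```

The result can be substituted inline, for example --memory "$(gib_to_bytes 48)".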

Assign a Spec to a Runner

A Runner Spec links a Spec to a Runner. A Runner can support multiple Specs, and a Spec can be supported by multiple Runners. A Runner only claims Jobs whose Spec it has been assigned.

Add a Spec to a Runner with the bencher CLI or the Runner Specs REST API:

Terminal window
bencher runner spec add my-runner \
--host https://api.bencher.example.com \
--spec intel-v1

Remove the link later with the DELETE /v0/runners/{runner}/specs/{spec} API.
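Since a Runner can support several Specs, you may want to assign more than one in a single pass. The sketch below prints one bencher runner spec add command per Spec instead of running it, so the commands can be reviewed first. The function name is hypothetical, and arm-v1 is a made-up Spec slug used only for illustration.

```shell
# Print (rather than run) one `bencher runner spec add` command per Spec.
# Arguments: API host, Runner slug or UUID, then one or more Spec slugs.
spec_add_commands() {
  host="$1"
  runner="$2"
  shift 2
  for spec in "$@"; do
    echo "bencher runner spec add $runner --host $host --spec $spec"
  done
}

# arm-v1 is a hypothetical second Spec.
spec_add_commands https://api.bencher.example.com my-runner intel-v1 arm-v1
```

Once the printed commands look right, they can be piped to sh to execute them.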

Start the Runner

With a Runner created and at least one Spec assigned, start the runner binary on your hardware with runner up. Provide the API server host, the Runner UUID or slug, and the Runner key, either as flags or environment variables:

Terminal window
runner up \
--host https://api.bencher.example.com \
--runner my-runner \
--key bencher_runner_...

Terminal window
export BENCHER_HOST=https://api.bencher.example.com
export BENCHER_RUNNER=my-runner
export BENCHER_RUNNER_KEY=bencher_runner_...
runner up
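When the Runner is started from a script or service wrapper, a pre-flight check can catch a missing variable before the binary launches. This is a minimal sketch with hypothetical function names. It does not check BENCHER_HOST, because runner up falls back to a default host when none is given; for a self-hosted API server you will still want to set it.

```shell
# List the required Runner variables that are unset or empty.
# BENCHER_HOST is intentionally not checked: runner up has a default host.
missing_runner_env() {
  missing=""
  [ -n "${BENCHER_RUNNER:-}" ] || missing="$missing BENCHER_RUNNER"
  [ -n "${BENCHER_RUNNER_KEY:-}" ] || missing="$missing BENCHER_RUNNER_KEY"
  echo "${missing# }"   # strip the leading space
}

# Start the runner only when nothing is missing.
start_runner() {
  missing="$(missing_runner_env)"
  if [ -n "$missing" ]; then
    echo "missing environment variables: $missing" >&2
    return 1
  fi
  runner up
}
```

Calling start_runner in place of runner up keeps the failure message next to the cause instead of surfacing later as an authentication error.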

The runner binary opens a single WebSocket connection to the API server, polls for Jobs matching its Specs, executes each one, and reports the results back. The connection stays open across Jobs, and the binary updates itself between Jobs when a new version is available.

By default a Runner rejects any Job whose Spec has no Sandbox. To let a Runner execute non-sandboxed Jobs directly on the host, start it with the --danger-allow-no-sandbox flag.

    runner up

    Start the runner, polling for and executing benchmark Jobs. This is the long-running command used to operate a Self-Hosted Runner. The runner opens a single WebSocket connection to the API server, claims Jobs that match its Specs, executes them, and reports the results back.

    runner up [OPTIONS]

    Options

    --host <HOST>

    The Bencher API server to connect to. By default, https://api.bencher.dev is used. Can also be set with the BENCHER_HOST environment variable.

    --runner <RUNNER>

    The UUID or slug of the Runner to operate as. Can also be set with the BENCHER_RUNNER environment variable.

    --key <KEY>

    The Runner authentication key (bencher_runner_...) returned when the Runner was created. Can also be set with the BENCHER_RUNNER_KEY environment variable.

    --poll-timeout <POLL_TIMEOUT>

    The long-poll timeout in seconds while waiting for a Job, between 1 and 900. By default, 55 is used.

    --danger-allow-no-sandbox

    Allow executing Jobs without a Sandbox. Without this flag, a Job whose Spec has no Sandbox is rejected at runtime. Non-sandboxed Jobs run directly on the host, so only enable this for trusted workloads. Can also be set with the BENCHER_DANGER_ALLOW_NO_SANDBOX environment variable.

    --sandbox-log-level <SANDBOX_LOG_LEVEL>

    The log level for the sandbox process. By default, warning is used.

    --no-auto-update

    Disable automatic updates from the server. By default, the runner updates itself between Jobs when the server offers a new version. Can also be set with the BENCHER_NO_AUTO_UPDATE environment variable.

    --max-download-size <MAX_DOWNLOAD_SIZE>

    The maximum download size in bytes for self-update binaries. By default, 500 MiB is used. Conflicts with --no-auto-update.

    --max-output-size <MAX_OUTPUT_SIZE>

    The maximum size in bytes for collected stdout and stderr. By default, 25 MiB is used.

    --max-file-count <MAX_FILE_COUNT>

    The maximum number of output files to decode. By default, 255 is used.

    The maximum number of symlinks to follow during path resolution, matching the Linux kernel MAXSYMLINKS limit. By default, 40 is used. Only used in non-sandboxed mode.

    --grace-period <GRACE_PERIOD>

    The grace period in seconds after the benchmark exits before final output collection.

    Host Tuning

    Before each benchmark, the runner applies host tuning to reduce measurement noise. By default, it disables ASLR, the NMI watchdog, SMT / hyper-threading, and turbo boost, and it sets the CPU scaling governor to performance, swappiness to 10, and perf_event_paranoid to -1. The flags below turn off individual optimizations or override the values the runner applies.

    --no-tuning

    Disable all host tuning optimizations.

    --aslr

    Keep ASLR enabled (default: disabled for benchmarks).

    --nmi-watchdog

    Keep the NMI watchdog enabled (default: disabled for benchmarks).

    --smt

    Keep SMT / hyper-threading enabled (default: disabled for benchmarks).

    --turbo

    Keep turbo boost enabled (default: disabled for benchmarks).

    --swappiness <SWAPPINESS>

    Set the swappiness value. By default, 10 is used.

    --governor <GOVERNOR>

    Set the CPU scaling governor. By default, performance is used.

    --perf-event-paranoid <PERF_EVENT_PARANOID>

    Set the perf_event_paranoid value. By default, -1 is used.

    --help

    Print help.

See the runner CLI reference for every option, and the Runner Protocol reference for the messages exchanged with the server.
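To see what host tuning would change on a given machine, you can read the current values before starting the Runner. The sketch below uses the standard Linux procfs and sysfs locations for these settings. Whether the runner binary reads or writes exactly these files is an assumption, and some paths, such as the turbo and governor files, depend on the CPU driver and may not exist on every host. The function name show_setting is hypothetical.

```shell
# Print "label: value" for a kernel setting, or "label: unavailable"
# when the file does not exist or cannot be read on this host.
show_setting() {
  label="$1"
  path="$2"
  if [ -r "$path" ]; then
    echo "$label: $(cat "$path")"
  else
    echo "$label: unavailable"
  fi
}

show_setting aslr /proc/sys/kernel/randomize_va_space
show_setting nmi_watchdog /proc/sys/kernel/nmi_watchdog
show_setting smt /sys/devices/system/cpu/smt/control
show_setting no_turbo /sys/devices/system/cpu/intel_pstate/no_turbo   # Intel pstate driver only
show_setting governor /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
show_setting swappiness /proc/sys/vm/swappiness
show_setting perf_event_paranoid /proc/sys/kernel/perf_event_paranoid
```

Recording this output before the first run gives you a baseline to compare against if you later use --no-tuning or one of the per-setting flags.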

Self-Hosted Runner Workflow

Once the Runner is connected, you submit benchmarks the same way as with any Bare Metal Runner: build and push an Image, then run bencher run --image with a matching Spec. The diagram below shows the full path, from registering the Runner through executing a Job.

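Sequence diagram (participants: Admin, API Server, runner binary, OCI Registry). The steps, in order:

  • Admin → API Server: bencher runner create
  • API Server → Admin: Runner + key
  • Admin → API Server: bencher spec create
  • Admin → API Server: bencher runner spec add
  • Admin → runner binary: runner up --host --runner --key
  • runner binary → API Server: Connect (WebSocket) and poll for Jobs
  • API Server → runner binary: Assign matching Job
  • runner binary → OCI Registry: Pull Image
  • OCI Registry → runner binary: Image layers
  • runner binary: Run benchmark
  • runner binary → API Server: Submit results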




Published: Fri, June 19, 2026 at 8:00:00 AM UTC