# How to Track Benchmarks in CI with Bencher
Most benchmark results are ephemeral. They disappear as soon as your terminal reaches its scrollback limit. Some benchmark harnesses let you cache results, but most only do so locally. Bencher allows you to track your benchmarks from both local and CI runs and compare the results, while still using your favorite benchmark harness.
There are two popular ways to compare benchmark results when Continuous Benchmarking, that is, benchmarking in CI:
- Statistical Continuous Benchmarking
  - Track benchmark results over time to create a baseline
  - Use this baseline along with Statistical Thresholds to create a statistical boundary
  - Compare the new results against this statistical boundary to detect performance regressions
- Relative Continuous Benchmarking
  - Run the benchmarks for the current baseline code
  - Use Percentage Thresholds to create a boundary for the baseline code
  - Switch over to the new version of the code
  - Run the benchmarks for the new version of the code
  - Compare the new version's results against the baseline results to detect performance regressions
## Statistical Continuous Benchmarking
Picking up where we left off in the
Quick Start and Docker Self-Hosted tutorials,
let’s add Statistical Continuous Benchmarking to our Save Walter White
project.
> 🐰 Make sure you have created an API token and set it as the `BENCHER_API_TOKEN` environment variable before continuing on!
First, we need to create a new Testbed to represent our CI runners, aptly named `ci-runner`.
- Use the `bencher testbed create` CLI subcommand. See the `testbed create` docs for more details. (ex: `bencher testbed create`)
- Set the `--name` option to the desired Testbed name. (ex: `--name ci-runner`)
- Specify the project argument as the `Save Walter White` project slug. (ex: `save-walter-white-1234abcd`)
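Putting those options together, the full command might look something like this (using the example Testbed name and project slug from above):

```shell
# Create the ci-runner Testbed for the Save Walter White project
bencher testbed create \
  --name ci-runner \
  save-walter-white-1234abcd
```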
Next, we need to create a new Threshold for our `ci-runner` Testbed:
- Use the `bencher threshold create` CLI subcommand. See the `threshold create` docs for more details. (ex: `bencher threshold create`)
- Set the `--branch` option to the default `main` Branch. (ex: `--branch main`)
- Set the `--testbed` option to the new `ci-runner` Testbed. (ex: `--testbed ci-runner`)
- Set the `--measure` option to the built-in `Latency` Measure that is generated by `bencher mock`. See the definition of Measure for more details. (ex: `--measure Latency`)
- Set the `--test` option to a `t-test` Threshold. See Thresholds & Alerts for a full overview. (ex: `--test t-test`)
- Set the `--upper-boundary` option to an Upper Boundary of `0.95`. See Thresholds & Alerts for a full overview. (ex: `--upper-boundary 0.95`)
- Specify the project argument as the `Save Walter White` project slug. (ex: `save-walter-white-1234abcd`)
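Assembled from the options above, the full command might look something like:

```shell
# Create a t-test Threshold for the main Branch on the ci-runner Testbed
bencher threshold create \
  --branch main \
  --testbed ci-runner \
  --measure Latency \
  --test t-test \
  --upper-boundary 0.95 \
  save-walter-white-1234abcd
```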
Now we are ready to run our benchmarks in CI. Because every CI environment is a little bit different, the following example is meant to be more illustrative than practical. For more specific examples, see Continuous Benchmarking in GitHub Actions and Continuous Benchmarking in GitLab CI/CD.
We need to create and maintain a historical baseline for our `main` branch by benchmarking every change in CI:
- Use the `bencher run` CLI subcommand to run your `main` branch benchmarks. See the `bencher run` CLI subcommand for a full overview. (ex: `bencher run`)
- Set the `--project` option to the Project slug. See the `--project` docs for more details. (ex: `--project save-walter-white-1234abcd`)
- Set the `--branch` option to the default Branch name. See branch selection for a full overview. (ex: `--branch main`)
- Set the `--testbed` option to the Testbed name. See the `--testbed` docs for more details. (ex: `--testbed ci-runner`)
- Set the `--adapter` option to the desired benchmark harness adapter. See benchmark harness adapters for a full overview. (ex: `--adapter json`)
- Set the `--err` flag to fail the command if an Alert is generated. See Thresholds & Alerts for a full overview. (ex: `--err`)
- Specify the benchmark command arguments. See benchmark command for a full overview. (ex: `bencher mock`)
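Composed from the options above, the full command might look something like:

```shell
# Benchmark every change on main to build the historical baseline
bencher run \
  --project save-walter-white-1234abcd \
  --branch main \
  --testbed ci-runner \
  --adapter json \
  --err \
  bencher mock
```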
Finally, we are ready to catch performance regressions in CI.
This is how we would track the performance of a new feature branch, named `feature-branch`, in CI:
- Use the `bencher run` CLI subcommand to run your `feature-branch` branch benchmarks. See the `bencher run` CLI subcommand for a full overview. (ex: `bencher run`)
- Set the `--project` option to the Project slug. See the `--project` docs for more details. (ex: `--project save-walter-white-1234abcd`)
- Set the `--branch` option to the feature Branch name. See branch selection for a full overview. (ex: `--branch feature-branch`)
- Set the `--branch-start-point` option to the feature Branch start point. See branch selection for a full overview. (ex: `--branch-start-point main`)
- Set the `--branch-start-point-hash` option to the feature Branch start point `git` hash. See branch selection for a full overview. (ex: `--branch-start-point-hash 32ae...dd8b`)
- Set the `--testbed` option to the Testbed name. See the `--testbed` docs for more details. (ex: `--testbed ci-runner`)
- Set the `--adapter` option to the desired benchmark harness adapter. See benchmark harness adapters for a full overview. (ex: `--adapter json`)
- Set the `--err` flag to fail the command if an Alert is generated. See Thresholds & Alerts for a full overview. (ex: `--err`)
- Specify the benchmark command arguments. See benchmark command for a full overview. (ex: `bencher mock`)
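Composed from the options above, the full command might look something like this, using the example start point hash from this tutorial:

```shell
# Benchmark feature-branch against the main baseline
bencher run \
  --project save-walter-white-1234abcd \
  --branch feature-branch \
  --branch-start-point main \
  --branch-start-point-hash 32aea434d751648726097ed3ac760b57107edd8b \
  --testbed ci-runner \
  --adapter json \
  --err \
  bencher mock
```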
The first time this command is run in CI,
it will create the `feature-branch` Branch since it does not exist yet.
The new `feature-branch` will use the `main` Branch
at hash `32aea434d751648726097ed3ac760b57107edd8b`
as its start point.
This means that `feature-branch` will have a copy of all the data and Thresholds
from the `main` Branch to compare the results of `bencher mock` against,
for both the first and all subsequent runs.
## Relative Continuous Benchmarking
Picking up where we left off in the
Quick Start and Docker Self-Hosted tutorials,
let’s add Relative Continuous Benchmarking to our Save Walter White
project.
> 🐰 Make sure you have created an API token and set it as the `BENCHER_API_TOKEN` environment variable before continuing on!
First, we need to create a new Testbed to represent our CI runners, aptly named `ci-runner`.
- Use the `bencher testbed create` CLI subcommand. See the `testbed create` docs for more details. (ex: `bencher testbed create`)
- Set the `--name` option to the desired Testbed name. (ex: `--name ci-runner`)
- Specify the project argument as the `Save Walter White` project slug. (ex: `save-walter-white-1234abcd`)
Relative Continuous Benchmarking runs a side-by-side comparison of two versions of your code.
This can be useful when dealing with noisy CI/CD environments,
where the resources available can be highly variable between runs.
In this example we will be comparing the results from running on the `main` branch
to results from running on a feature branch named `feature-branch`.
Because every CI environment is a little bit different,
the following example is meant to be more illustrative than practical.
For more specific examples, see Continuous Benchmarking in GitHub Actions
and Continuous Benchmarking in GitLab CI/CD.
First, we need to checkout the `main` branch with `git` in CI:
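In a typical CI job, this can be as simple as:

```shell
# Switch the working tree to the baseline branch
git checkout main
```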
Then we need to run our benchmarks on the `main` branch in CI:
- Use the `bencher run` CLI subcommand to run your `main` branch benchmarks. See the `bencher run` CLI subcommand for a full overview. (ex: `bencher run`)
- Set the `--project` option to the Project slug. See the `--project` docs for more details. (ex: `--project save-walter-white-1234abcd`)
- Set the `--branch` option to the feature Branch name. See branch selection for a full overview. (ex: `--branch feature-branch`)
- Set the `--branch-reset` flag. See branch selection for a full overview. (ex: `--branch-reset`)
- Set the `--testbed` option to the Testbed name. See the `--testbed` docs for more details. (ex: `--testbed ci-runner`)
- Set the `--adapter` option to the desired benchmark harness adapter. See benchmark harness adapters for a full overview. (ex: `--adapter json`)
- Specify the benchmark command arguments. See benchmark command for a full overview. (ex: `bencher mock`)
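Composed from the options above, the full command might look something like:

```shell
# Benchmark the baseline (main) code, resetting feature-branch
bencher run \
  --project save-walter-white-1234abcd \
  --branch feature-branch \
  --branch-reset \
  --testbed ci-runner \
  --adapter json \
  bencher mock
```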
The first time this command is run in CI,
it will create the `feature-branch` Branch since it does not exist yet.
The new `feature-branch` will not have a start point, existing data, or Thresholds.
On subsequent runs, the old version of `feature-branch` will be renamed
and a new `feature-branch` will be created without a start point, existing data, or Thresholds.
Next, we need to create a new Threshold in CI for our new `feature-branch` Branch:
- Use the `bencher threshold create` CLI subcommand. See the `threshold create` docs for more details. (ex: `bencher threshold create`)
- Set the `--branch` option to the new `feature-branch` Branch. (ex: `--branch feature-branch`)
- Set the `--testbed` option to the `ci-runner` Testbed. (ex: `--testbed ci-runner`)
- Set the `--measure` option to the built-in `Latency` Measure that is generated by `bencher mock`. See the definition of Measure for more details. (ex: `--measure Latency`)
- Set the `--test` option to a `percentage` Threshold. See Thresholds & Alerts for a full overview. (ex: `--test percentage`)
- Set the `--upper-boundary` option to an Upper Boundary of `0.25` (i.e. `25%`). See Thresholds & Alerts for a full overview. (ex: `--upper-boundary 0.25`)
- Specify the project argument as the `Save Walter White` project slug. (ex: `save-walter-white-1234abcd`)
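Assembled from the options above, the full command might look something like:

```shell
# Create a percentage Threshold for feature-branch on the ci-runner Testbed
bencher threshold create \
  --branch feature-branch \
  --testbed ci-runner \
  --measure Latency \
  --test percentage \
  --upper-boundary 0.25 \
  save-walter-white-1234abcd
```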
Then, we need to checkout the `feature-branch` branch with `git` in CI:
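Again, in a typical CI job this can be as simple as:

```shell
# Switch the working tree to the new version of the code
git checkout feature-branch
```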
Finally, we are ready to run our `feature-branch` benchmarks in CI:
- Use the `bencher run` CLI subcommand to run your `feature-branch` benchmarks. See the `bencher run` CLI subcommand for a full overview. (ex: `bencher run`)
- Set the `--project` option to the Project slug. See the `--project` docs for more details. (ex: `--project save-walter-white-1234abcd`)
- Set the `--branch` option to the feature Branch name. See branch selection for a full overview. (ex: `--branch feature-branch`)
- Set the `--testbed` option to the Testbed name. See the `--testbed` docs for more details. (ex: `--testbed ci-runner`)
- Set the `--adapter` option to the desired benchmark harness adapter. See benchmark harness adapters for a full overview. (ex: `--adapter json`)
- Set the `--err` flag to fail the command if an Alert is generated. See Thresholds & Alerts for a full overview. (ex: `--err`)
- Specify the benchmark command arguments. See benchmark command for a full overview. (ex: `bencher mock`)
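Composed from the options above, the full command might look something like:

```shell
# Benchmark the new code and fail on any generated Alert
bencher run \
  --project save-walter-white-1234abcd \
  --branch feature-branch \
  --testbed ci-runner \
  --adapter json \
  --err \
  bencher mock
```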
Every time this command is run in CI,
it compares the results from `feature-branch`
against only the most recent results from `main`.
🐰 Congrats! You have learned how to track benchmarks in CI with Bencher! 🎉