How to benchmark Rust code with Gungraun

Everett Pompeii

What is Benchmarking?

Benchmarking is the practice of testing the performance of your code to see how fast it runs (latency) or how much work it can do (throughput). This often overlooked step in software development is crucial for creating and maintaining fast and performant code. Benchmarking provides the necessary metrics for developers to understand how well their code performs under various workloads and conditions. For the same reasons that you write unit and integration tests to prevent feature regressions, you should write benchmarks to prevent performance regressions. Performance bugs are bugs!

Write FizzBuzz in Rust

In order to write benchmarks, we need some source code to benchmark. To start off we are going to write a very simple program, FizzBuzz.

The rules for FizzBuzz are as follows:

Write a program that prints the integers from 1 to 100 (inclusive):

For multiples of three, print Fizz

For multiples of five, print Buzz

For multiples of both three and five, print FizzBuzz

For all others, print the number

There are many ways to write FizzBuzz. So we’ll go with my favorite:

fn main() {
    for i in 1..=100 {
        match (i % 3, i % 5) {
            (0, 0) => println!("FizzBuzz"),
            (0, _) => println!("Fizz"),
            (_, 0) => println!("Buzz"),
            (_, _) => println!("{i}"),
        }
    }
}

Create a main function
Iterate from 1 to 100 inclusively.
For each number, calculate the modulus (remainder after division) for both 3 and 5.
Pattern match on the two remainders. If the remainder is 0, then the number is a multiple of the given factor.
If the remainder is 0 for both 3 and 5 then print FizzBuzz.
If the remainder is 0 for only 3 then print Fizz.
If the remainder is 0 for only 5 then print Buzz.
Otherwise, just print the number.

Follow Step-by-Step

In order to follow along with this step-by-step tutorial, you will need to install Rust.

🐰 The source code for this post is available on GitHub.

With Rust installed, you can then open a terminal window and enter: cargo init game

Then navigate into the newly created game directory.

game
├── Cargo.toml
└── src
    └── main.rs

You should see a directory called src with a file named main.rs:

fn main() {
    println!("Hello, world!");
}

Replace its contents with the above FizzBuzz implementation. Then run cargo run. The output should look like:

$ cargo run
   Compiling playground v0.0.1 (/home/bencher)
    Finished dev [unoptimized + debuginfo] target(s) in 0.44s
     Running `target/debug/game`

1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
...
97
98
Fizz
Buzz

🐰 Boom! You’re cracking the coding interview!

A new Cargo.lock file should have been generated:

game
├── Cargo.lock
├── Cargo.toml
└── src
    └── main.rs

Before going any further, it is important to discuss the differences between micro-benchmarking and macro-benchmarking.

Micro-Benchmarking vs Macro-Benchmarking

There are two major categories of software benchmarks: micro-benchmarks and macro-benchmarks. Micro-benchmarks operate at a level similar to unit tests. For example, a benchmark for a function that determines Fizz, Buzz, or FizzBuzz for a single number would be a micro-benchmark. Macro-benchmarks operate at a level similar to integration tests. For example, a benchmark for a function that plays the entire game of FizzBuzz, from 1 to 100, would be a macro-benchmark.

Generally, it is best to test at the lowest level of abstraction possible. In the case of benchmarks, this makes them easier to maintain and helps to reduce the amount of noise in the measurements. However, just as having some end-to-end tests can be very useful for sanity checking that the entire system comes together as expected, having macro-benchmarks can be very useful for making sure that the critical paths through your software remain performant.

Benchmarking in Rust

The four popular options for benchmarking in Rust are: libtest bench, Criterion, Gungraun and Iai.

libtest is Rust’s built-in unit testing and benchmarking framework. Though part of the Rust standard library, libtest bench is still considered unstable, so it is only available on nightly compiler releases. To work on the stable Rust compiler, a separate benchmarking harness needs to be used. Neither is being actively developed, though.

The most popular benchmarking harness within the Rust ecosystem is Criterion. It works on both stable and nightly Rust compiler releases, and it has become the de facto standard within the Rust community. Criterion is also much more feature-rich compared to libtest bench.

An alternative to Criterion is Gungraun. Instead of wall clock time, it measures instruction counts and related metrics: CPU instructions, cache metrics like L1 Hits and RAM Hits, and many more. This allows for single-shot benchmarking since these metrics should stay nearly identical between runs.

All four are supported by Bencher. So why choose Gungraun (the renamed successor of Iai-Callgrind)? Gungraun uses instruction counts instead of wall clock time. This makes it ideal for continuous benchmarking, that is benchmarking in CI. I would suggest using Gungraun for continuous benchmarking, especially if you are using shared runners. Gungraun is actively maintained and has comprehensive online documentation, making it a reliable choice for long-term projects. It is important to understand that Gungraun only measures a proxy for what you really care about. Does going from 1,000 instructions to 2,000 instructions double the latency of your application? Maybe or maybe not. For this reason, it can be useful to also run wall clock time based benchmarks in parallel with instruction count based benchmarks.
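For contrast, a wall clock measurement can be sketched with nothing but the standard library. The helper name time_runs and the sum_to workload are hypothetical, chosen only for illustration; neither comes from Gungraun or Criterion. On a busy machine the recorded durations for identical work tend to differ from run to run, which is the noise that instruction counts sidestep:

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: run `work` `runs` times and record each wall clock duration.
pub fn time_runs<F: FnMut()>(runs: usize, mut work: F) -> Vec<Duration> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            work();
            start.elapsed()
        })
        .collect()
}

/// Illustrative workload (an assumption, not from the post): sum the integers 1..=n.
/// `black_box` keeps the compiler from folding the loop into a constant.
pub fn sum_to(n: u64) -> u64 {
    (1..=n).fold(0, |acc, i| acc + black_box(i))
}
```

Calling time_runs(5, || { sum_to(10_000); }) returns five durations, one per run; comparing them across several invocations gives a feel for how much a shared machine can vary.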

Install Valgrind

Gungraun uses a tool called Valgrind to collect instruction counts. Valgrind supports Linux, Solaris, FreeBSD, and macOS. However, the macOS support is limited to x86_64 processors as arm64 (M1, M2, etc) processors are not yet supported.

On Debian run: sudo apt-get install valgrind

On macOS (x86_64/Intel chip only): brew install valgrind

Install Gungraun Runner

Gungraun requires the gungraun-runner binary to be installed and available in your $PATH. The runner version must match the library version used in your Cargo.toml.

Install it with: cargo install --version 0.18.0 gungraun-runner

Or using cargo-binstall: cargo binstall gungraun-runner@0.18.0

Refactor FizzBuzz

In order to test our FizzBuzz application, we decouple our logic from our program’s main function. In contrast to other benchmark harnesses, Gungraun can benchmark the benchmark binary and the main function but this is purely macro-benchmarking. We want to do both, macro and micro. In order to do this, we need to make a few changes.

Under src, create a new file named lib.rs:

game
├── Cargo.lock
├── Cargo.toml
└── src
    ├── lib.rs
    └── main.rs

Add the following code to lib.rs:

pub fn play_game(n: u32, print: bool) {
    let result = fizz_buzz(n);
    if print {
        println!("{result}");
    }
}

pub fn fizz_buzz(n: u32) -> String {
    match (n % 3, n % 5) {
        (0, 0) => "FizzBuzz".to_string(),
        (0, _) => "Fizz".to_string(),
        (_, 0) => "Buzz".to_string(),
        (_, _) => n.to_string(),
    }
}

play_game: Takes in an unsigned integer n, calls fizz_buzz with that number, and if print is true, prints the result.
fizz_buzz: Takes in an unsigned integer n and performs the actual Fizz, Buzz, FizzBuzz, or number logic returning the result as a string.
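Because fizz_buzz now returns a String instead of printing, its output can be inspected directly. The following is a minimal sketch: it repeats fizz_buzz from lib.rs so that the block stands alone, and the helper play_to is a hypothetical name introduced here, not part of the game crate:

```rust
/// Same logic as `fizz_buzz` in lib.rs, repeated so this sketch is self-contained.
pub fn fizz_buzz(n: u32) -> String {
    match (n % 3, n % 5) {
        (0, 0) => "FizzBuzz".to_string(),
        (0, _) => "Fizz".to_string(),
        (_, 0) => "Buzz".to_string(),
        (_, _) => n.to_string(),
    }
}

/// Hypothetical helper: collect the results for 1..=limit instead of printing them.
pub fn play_to(limit: u32) -> Vec<String> {
    (1..=limit).map(fizz_buzz).collect()
}
```

For example, play_to(5) yields 1, 2, Fizz, 4, Buzz, which matches the first five lines of the earlier cargo run output.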

Then the updated main.rs looks like this:

use game::play_game;

fn main() {
    for i in 1..=100 {
        play_game(i, true);
    }
}

game::play_game: Import play_game from the game crate we just created with lib.rs.
main: The main entrypoint into our program that iterates through the numbers 1 to 100 inclusive and calls play_game for each number, with print set to true.

Benchmarking FizzBuzz

In order to benchmark our code, we need to create a benches directory and add a file to contain our benchmarks, play_game.rs. Note that, for the sake of simplicity, we deviate from the recommended way to structure benchmarks; for your own project, you should follow the recommendations. Our layout looks like this:

game
├── Cargo.lock
├── Cargo.toml
├── benches
│   └── play_game.rs
└── src
    ├── lib.rs
    └── main.rs

Inside of play_game.rs add the following code:

use gungraun::prelude::*;
use std::hint::black_box;
use game::play_game;

#[library_benchmark]
fn bench_play_game_100() {
    for i in 1..=100 {
        play_game(black_box(i), black_box(false));
    }
}

library_benchmark_group!(
    name = bench_play_game_group,
    benchmarks = [bench_play_game_100]
);

main!(library_benchmark_groups = bench_play_game_group);

Import the gungraun::prelude module which brings in the necessary macros.
Use std::hint::black_box to prevent the compiler from optimizing away our benchmark.
Import the play_game function from our game crate.
Create a library benchmark function named bench_play_game_100 using the #[library_benchmark] attribute.
Loop from 1 to 100 and call play_game with print set to false.
Create a library benchmark group named bench_play_game_group containing our bench_play_game_100 benchmark.
Use the main! macro to run the benchmark group.

Now, we need to configure the game crate to run our benchmarks.

Add the following to the bottom of your Cargo.toml file:

[dev-dependencies]
gungraun = "0.18.0"

[[bench]]
name = "play_game"
harness = false

[profile.bench]
debug = true

gungraun: Add gungraun as a development dependency, since we are only using it for performance testing.
bench: Register play_game as a benchmark and set harness to false, since we will be using Gungraun as our benchmarking harness.
debug = true: Enable debug information in benchmark builds, which is required for Gungraun to provide detailed output.

Now we’re ready to benchmark our code. Run cargo bench:

$ cargo bench
    Finished `bench` profile [optimized + debuginfo] target(s) in 0.73s
     Running benches/play_game.rs (target/release/deps/play_game-84c12f98b1991829)
play_game::bench_play_game_group::bench_play_game_100
  Instructions:                       17902|N/A                  (*********)
  L1 Hits:                            24984|N/A                  (*********)
  LL Hits:                                1|N/A                  (*********)
  RAM Hits:                              20|N/A                  (*********)
  Total read+write:                   25005|N/A                  (*********)
  Estimated Cycles:                   25689|N/A                  (*********)

Gungraun result: Ok. 1 without regressions; 0 regressed; 0 filtered; 1 benchmarks finished in 0.15258s

🐰 Lettuce turnip the beet! We’ve got our first benchmark metrics!

Finally, we can rest our weary developer heads… Just kidding, our users want a new feature!

Write FizzBuzzFibonacci in Rust

Our Key Performance Indicators (KPIs) are down, so our Product Manager (PM) wants us to add a new feature. After much brainstorming and many user interviews, it is decided that good ole FizzBuzz isn’t enough. Kids these days want a new game, FizzBuzzFibonacci.

The rules for FizzBuzzFibonacci are as follows:

Write a program that prints the integers from 1 to 100 (inclusive):

For multiples of three, print Fizz

For multiples of five, print Buzz

For multiples of both three and five, print FizzBuzz

For numbers that are part of the Fibonacci sequence, only print Fibonacci

For all others, print the number

The Fibonacci sequence is a sequence in which each number is the sum of the two preceding numbers. For example, starting at 0 and 1 the next number in the Fibonacci sequence would be 1. Followed by: 2, 3, 5, 8 and so on. Numbers that are part of the Fibonacci sequence are known as Fibonacci numbers. So we’re going to have to write a function that detects Fibonacci numbers.
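To make the definition concrete, here is a small sketch that generates the sequence up to a limit. The function name fibonacci_up_to is hypothetical and is not used elsewhere in this post; it works in u64 internally, an implementation choice made here so that the addition cannot overflow for any u32 limit:

```rust
/// Hypothetical helper: all Fibonacci numbers (starting at 0 and 1) that are <= limit.
pub fn fibonacci_up_to(limit: u32) -> Vec<u32> {
    let mut sequence = Vec::new();
    // u64 arithmetic so `previous + current` cannot overflow for any u32 limit.
    let (mut previous, mut current) = (0u64, 1u64);
    while previous <= u64::from(limit) {
        // `previous` is <= limit here, so it always fits back into a u32.
        sequence.push(previous as u32);
        let next = previous + current;
        previous = current;
        current = next;
    }
    sequence
}
```

For a limit of 100 this produces 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, so within our game's range of 1 to 100 there are ten distinct Fibonacci numbers.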

There are many ways to write the Fibonacci sequence and likewise many ways to detect a Fibonacci number. So we’ll go with my favorite:

fn is_fibonacci_number(n: u32) -> bool {
    for i in 0..=n {
        let (mut previous, mut current) = (0, 1);
        while current < i {
            let next = previous + current;
            previous = current;
            current = next;
        }
        if current == n {
            return true;
        }
    }
    false
}

Create a function named is_fibonacci_number that takes in an unsigned integer and returns a boolean.
Iterate over all numbers from 0 to our given number n inclusive.
Initialize our Fibonacci sequence starting with 0 and 1 as the previous and current numbers respectively.
Iterate while the current number is less than the current iteration i.
Add the previous and current number to get the next number.
Update the previous number to the current number.
Update the current number to the next number.
Once current is greater than or equal to the current iteration i, we will exit the inner loop.
Check to see if the current number is equal to the given number n and if so return true.
Otherwise, return false.

Now we will need to update our fizz_buzz function:

pub fn fizz_buzz_fibonacci(n: u32) -> String {
    if is_fibonacci_number(n) {
        "Fibonacci".to_string()
    } else {
        match (n % 3, n % 5) {
            (0, 0) => "FizzBuzz".to_string(),
            (0, _) => "Fizz".to_string(),
            (_, 0) => "Buzz".to_string(),
            (_, _) => n.to_string(),
        }
    }
}

Rename the fizz_buzz function to fizz_buzz_fibonacci to make it more descriptive.
Call our is_fibonacci_number helper function.
If the result from is_fibonacci_number is true then return Fibonacci.
If the result from is_fibonacci_number is false then perform the same Fizz, Buzz, FizzBuzz, or number logic returning the result.
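One consequence worth spelling out: the Fibonacci check runs first, so numbers such as 3 and 5 no longer produce Fizz or Buzz. The sketch below repeats the two functions from above, unchanged, so that it stands alone; the inputs named in the comments are just spot checks picked for illustration:

```rust
/// Copy of the detection function shown earlier in this post.
fn is_fibonacci_number(n: u32) -> bool {
    for i in 0..=n {
        let (mut previous, mut current) = (0, 1);
        while current < i {
            let next = previous + current;
            previous = current;
            current = next;
        }
        if current == n {
            return true;
        }
    }
    false
}

/// Copy of the updated game logic: the Fibonacci rule takes precedence.
pub fn fizz_buzz_fibonacci(n: u32) -> String {
    if is_fibonacci_number(n) {
        // 3, 5, 21 and 55 land here even though they are multiples of 3 or 5.
        "Fibonacci".to_string()
    } else {
        match (n % 3, n % 5) {
            (0, 0) => "FizzBuzz".to_string(), // e.g. 15
            (0, _) => "Fizz".to_string(),     // e.g. 9
            (_, 0) => "Buzz".to_string(),     // e.g. 10
            (_, _) => n.to_string(),          // e.g. 4
        }
    }
}
```

Since no multiple of 15 between 1 and 100 is a Fibonacci number, FizzBuzz itself is unaffected in that range.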

Because we renamed fizz_buzz to fizz_buzz_fibonacci we also need to update our play_game function:

pub fn play_game(n: u32, print: bool) {
    let result = fizz_buzz_fibonacci(n);
    if print {
        println!("{result}");
    }
}

Both our main and bench_play_game_100 functions can stay exactly the same.

Benchmarking FizzBuzzFibonacci

Now we can rerun our benchmark:

$ cargo bench
    Finished `bench` profile [optimized + debuginfo] target(s) in 0.73s
     Running benches/play_game.rs (target/release/deps/play_game-84c12f98b1991829)
play_game::bench_play_game_group::bench_play_game_100
  Instructions:                      331835|17902                (+1753.62%) [+18.5362x]
  L1 Hits:                           338828|24984                (+1256.18%) [+13.5618x]
  LL Hits:                                2|1                    (+100.000%) [+2.00000x]
  RAM Hits:                              22|20                   (+10.0000%) [+1.10000x]
  Total read+write:                  338852|25005                (+1255.14%) [+13.5514x]
  Estimated Cycles:                  339608|25689                (+1222.00%) [+13.2200x]

Gungraun result: Ok. 1 without regressions; 0 regressed; 0 filtered; 1 benchmarks finished in 0.15254s

Oh, neat! Gungraun tells us the difference between the estimated cycles of our FizzBuzz and FizzBuzzFibonacci games. Your numbers will be a little different than mine. However, the difference between the two games is likely in the 10-15x range. That seems good to me! Especially for adding a feature as fancy sounding as Fibonacci to our game. The kids will love it!
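The two comparison columns can be reproduced by hand. The sketch below is my reading of how the values in the output above relate to each other, not Gungraun's actual implementation; the function names are hypothetical, and the negative branch of factor is an assumption about how shrinking metrics are displayed:

```rust
/// Percentage change from `old` to `new`, as in the parenthesised column.
pub fn percent_change(new: u64, old: u64) -> f64 {
    (new as f64 - old as f64) / old as f64 * 100.0
}

/// Signed factor, as in the bracketed column (assumed convention):
/// new/old when the metric grows, -(old/new) when it shrinks.
pub fn factor(new: u64, old: u64) -> f64 {
    if new >= old {
        new as f64 / old as f64
    } else {
        -(old as f64 / new as f64)
    }
}
```

With the Estimated Cycles row above, percent_change(339608, 25689) comes out at roughly +1222% and factor(339608, 25689) at roughly 13.22, matching the two columns.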

Expand FizzBuzzFibonacci in Rust

Our game is a hit! The kids do indeed love playing FizzBuzzFibonacci. So much so that word has come down from the execs that they want a sequel. But this is the modern world, we need Annual Recurring Revenue (ARR) not one-time purchases! The new vision for our game is that it is open ended, no more living between the bounds of 1 and 100 (even if they are inclusive). No, we’re on to new frontiers!

The rules for Open World FizzBuzzFibonacci are as follows:

Write a program that takes in any positive integer and prints:

For multiples of three, print Fizz

For multiples of five, print Buzz

For multiples of both three and five, print FizzBuzz

For numbers that are part of the Fibonacci sequence, only print Fibonacci

For all others, print the number

In order to have our game work for any number, we will need to accept a command line argument. Update the main function to look like this:

fn main() {
    let args: Vec<String> = std::env::args().collect();
    let i = args
        .get(1)
        .map(|s| s.parse::<u32>())
        .unwrap_or(Ok(15))
        .unwrap_or(15);
    play_game(i, true);
}

Collect all of the arguments (args) passed to our game from the command line.
Get the first argument passed to our game and parse it as an unsigned integer i.
If parsing fails or no argument is passed in, default to playing our game with 15 as the input.
Finally, play our game with the newly parsed unsigned integer i.
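The fallback behavior is easier to check when it is pulled out of main. This is a minimal sketch with the same observable behavior as the chain above; parse_input is a hypothetical helper name, not part of the post's code:

```rust
/// Hypothetical helper: parse the first command line argument as a u32,
/// falling back to 15 when it is missing or not a valid unsigned integer.
pub fn parse_input(arg: Option<&str>) -> u32 {
    arg.and_then(|s| s.parse::<u32>().ok()).unwrap_or(15)
}
```

In main this could be called as parse_input(args.get(1).map(String::as_str)). Note that negative numbers and values too large for a u32 also fall back to 15, since they fail to parse.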

Now we can play our game with any number! Use cargo run followed by -- to pass arguments to our game:

$ cargo run -- 9
   Compiling playground v0.0.1 (/home/bencher)
    Finished dev [unoptimized + debuginfo] target(s) in 0.44s
     Running `target/debug/game 9`
Fizz

$ cargo run -- 10
    Finished dev [unoptimized + debuginfo] target(s) in 0.03s
     Running `target/debug/game 10`
Buzz

$ cargo run -- 13
    Finished dev [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/game 13`
Fibonacci

And if we omit or provide an invalid number:

$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.03s
     Running `target/debug/game`
FizzBuzz

$ cargo run -- bad
    Finished dev [unoptimized + debuginfo] target(s) in 0.05s
     Running `target/debug/game bad`
FizzBuzz

Wow, that was some thorough testing! CI passes. Our bosses are thrilled. Let’s ship it! 🚀

The End

🐰 … the end of your career maybe?

Just kidding! Everything is on fire! 🔥

Well, at first everything seemed to be going fine. And then at 02:07 AM on Saturday my pager went off:

📟 Your game is on fire! 🔥

After scrambling out of bed, I tried to figure out what was going on. I tried to search through the logs, but that was hard because everything kept crashing. Finally, I found the issue. The kids! They loved our game so much, they were playing it all the way up to a million! In a flash of brilliance, I added two new benchmarks.

Here’s where Gungraun’s parameterized benchmarks shine! Instead of writing separate benchmark functions for each input, we can use the #[benches::...] attribute:

#[library_benchmark]
#[benches::play(100, 1_000_000)]
fn bench_play_game(n: u32) {
    play_game(black_box(n), black_box(false));
}

library_benchmark_group!(
    name = bench_play_game_group,
    benchmarks = [bench_play_game_100, bench_play_game]
);

Add the #[benches::play(100, 1_000_000)] attribute to create a benchmark variant with input 100 and another one with input 1_000_000.
The benchmark function takes an n: u32 parameter that receives each value.
Add the bench_play_game function to the library_benchmark_group!

Nice! One benchmark function, multiple test cases!

When I ran it, I got this:

$ cargo bench
    Finished `bench` profile [optimized + debuginfo] target(s) in 0.73s
     Running benches/play_game.rs (target/release/deps/play_game-84c12f98b1991829)
play_game::bench_play_game_group::bench_play_game_100
  Instructions:                      331835|331835               (No change)
  L1 Hits:                           338831|338828               (+0.00089%) [+1.00001x]
  LL Hits:                                1|2                    (-50.0000%) [-2.00000x]
  RAM Hits:                              20|22                   (-9.09091%) [-1.10000x]
  Total read+write:                  338852|338852               (No change)
  Estimated Cycles:                  339536|339608               (-0.02120%) [-1.00021x]
play_game::bench_play_game_group::bench_play_game play_0:(100)
  Instructions:                        7072|N/A                  (*********)
  L1 Hits:                             7128|N/A                  (*********)
  LL Hits:                                1|N/A                  (*********)
  RAM Hits:                               9|N/A                  (*********)
  Total read+write:                    7138|N/A                  (*********)
  Estimated Cycles:                    7448|N/A                  (*********)
play_game::bench_play_game_group::bench_play_game play_1:(1_000_000)
  Instructions:                   183930316|N/A                  (*********)
  L1 Hits:                        183930372|N/A                  (*********)
  LL Hits:                                1|N/A                  (*********)
  RAM Hits:                               9|N/A                  (*********)
  Total read+write:               183930382|N/A                  (*********)
  Estimated Cycles:               183930692|N/A                  (*********)

Gungraun result: Ok. 3 without regressions; 0 regressed; 0 filtered; 3 benchmarks finished in 1.45441s

The benchmarks finished in 1.45 seconds. That was fast! Instead of running benchmarks multiple times like wall-clock benchmarks do, each Gungraun benchmark runs only once. But wait, why are there changes in the first benchmark, bench_play_game_100, although we haven’t changed anything in that benchmark? True, but we did change something else in the benchmark file, and since Gungraun and Valgrind are sensitive instruments, even very small changes are registered. However, such small changes, especially in the cache metrics, are negligible. Over time you’ll get a feeling for critical changes in the metrics. Let’s have a closer look at our output.

play_game::bench_play_game_group::bench_play_game play_1:(1_000_000)
  Instructions:                   183930316|N/A                  (*********)
  L1 Hits:                        183930372|N/A                  (*********)
  LL Hits:                                1|N/A                  (*********)
  RAM Hits:                               9|N/A                  (*********)
  Total read+write:               183930382|N/A                  (*********)
  Estimated Cycles:               183930692|N/A                  (*********)

What! Going from 100 to 1,000,000 is a 10,000x larger input, so 7,448 estimated cycles x 10,000 should be 74,480,000 estimated cycles, not 183,930,692 estimated cycles 🤯 Even though I got my Fibonacci sequence code functionally correct, I must have a performance bug in there somewhere.
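One way to probe a suspicion like this without Valgrind is to count loop iterations by hand. The sketch below is a hypothetical instrumented copy of is_fibonacci_number (the name count_inner_steps is an assumption, not from the benchmark code). Instead of answering the question, it reports how often the inner while body runs:

```rust
/// Hypothetical instrumented copy of `is_fibonacci_number`: returns how many
/// times the inner `while` body executes for input `n`.
/// It never returns early, which matches the original for inputs that are
/// not Fibonacci numbers (neither 100 nor 1_000_000 is one).
pub fn count_inner_steps(n: u32) -> u64 {
    let mut steps = 0u64;
    for i in 0..=n {
        let (mut previous, mut current) = (0u32, 1u32);
        while current < i {
            let next = previous + current;
            previous = current;
            current = next;
            steps += 1;
        }
    }
    steps
}
```

Comparing count_inner_steps(100) with count_inner_steps(1_000_000) shows the step count growing faster than the input itself, which points in the same direction as the benchmark numbers.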

Fix FizzBuzzFibonacci in Rust

Let’s take another look at that is_fibonacci_number function:

fn is_fibonacci_number(n: u32) -> bool {
    for i in 0..=n {
        let (mut previous, mut current) = (0, 1);
        while current < i {
            let next = previous + current;
            previous = current;
            current = next;
        }
        if current == n {
            return true;
        }
    }
    false
}

Now that I’m thinking about performance, I do realize that I have an unnecessary, extra loop. We can completely get rid of the for i in 0..=n {} loop and just compare the current value to the given number (n) 🤦

fn is_fibonacci_number(n: u32) -> bool {
    let (mut previous, mut current) = (0, 1);
    while current < n {
        let next = previous + current;
        previous = current;
        current = next;
    }
    current == n
}

Update our is_fibonacci_number function.
Initialize our Fibonacci sequence starting with 0 and 1 as the previous and current numbers respectively.
Iterate while the current number is less than the given number n.
Add the previous and current number to get the next number.
Update the previous number to the current number.
Update the current number to the next number.
Once current is greater than or equal to the given number n, we will exit the loop.
Check to see if the current number is equal to the given number n and return that result.
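Before trusting a performance fix, it is worth confirming that behavior did not change. A minimal sketch, assuming both versions are kept side by side under the hypothetical names is_fibonacci_number_slow and is_fibonacci_number_fast, with first_disagreement as an equally hypothetical helper:

```rust
/// The original version with the redundant outer loop.
pub fn is_fibonacci_number_slow(n: u32) -> bool {
    for i in 0..=n {
        let (mut previous, mut current) = (0, 1);
        while current < i {
            let next = previous + current;
            previous = current;
            current = next;
        }
        if current == n {
            return true;
        }
    }
    false
}

/// The fixed version with a single loop.
pub fn is_fibonacci_number_fast(n: u32) -> bool {
    let (mut previous, mut current) = (0, 1);
    while current < n {
        let next = previous + current;
        previous = current;
        current = next;
    }
    current == n
}

/// Hypothetical helper: the first input in 0..=limit where the two disagree, if any.
pub fn first_disagreement(limit: u32) -> Option<u32> {
    (0..=limit).find(|&n| is_fibonacci_number_slow(n) != is_fibonacci_number_fast(n))
}
```

A result of None from first_disagreement(1_000) means the two versions agree on every input in that range. The limit is kept small on purpose, because the slow version is the expensive one.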

Now let’s rerun those benchmarks and see how we did:

$ cargo bench
    Finished `bench` profile [optimized + debuginfo] target(s) in 0.73s
     Running benches/play_game.rs (target/release/deps/play_game-84c12f98b1991829)
play_game::bench_play_game_group::bench_play_game_100
  Instructions:                       23679|331835               (-92.8642%) [-14.0139x]
  L1 Hits:                            30675|338831               (-90.9468%) [-11.0458x]
  LL Hits:                                2|1                    (+100.000%) [+2.00000x]
  RAM Hits:                              19|20                   (-5.00000%) [-1.05263x]
  Total read+write:                   30696|338852               (-90.9412%) [-11.0390x]
  Estimated Cycles:                   31350|339536               (-90.7668%) [-10.8305x]
play_game::bench_play_game_group::bench_play_game play_0:(100)
  Instructions:                         218|7072                 (-96.9174%) [-32.4404x]
  L1 Hits:                              273|7128                 (-96.1700%) [-26.1099x]
  LL Hits:                                1|1                    (No change)
  RAM Hits:                              10|9                    (+11.1111%) [+1.11111x]
  Total read+write:                     284|7138                 (-96.0213%) [-25.1338x]
  Estimated Cycles:                     628|7448                 (-91.5682%) [-11.8599x]
play_game::bench_play_game_group::bench_play_game play_1:(1_000_000)
  Instructions:                         332|183930316            (-99.9998%) [ -554007x]
  L1 Hits:                              387|183930372            (-99.9998%) [ -475272x]
  LL Hits:                                1|1                    (No change)
  RAM Hits:                              10|9                    (+11.1111%) [+1.11111x]
  Total read+write:                     398|183930382            (-99.9998%) [ -462137x]
  Estimated Cycles:                     742|183930692            (-99.9996%) [ -247885x]

Gungraun result: Ok. 3 without regressions; 0 regressed; 0 filtered; 3 benchmarks finished in 0.45459s

Oh, wow! Our 100 benchmarks are down by about 11x and our 1_000_000 benchmark is down more than 200,000x! 183,930,692 estimated cycles to 742 estimated cycles! That’s a reduction of 99.9996%!

🐰 Hey, at least we caught this performance bug before it made it to production… oh, right. Never mind…

Catch Performance Regressions in CI

The execs weren’t happy about the deluge of negative reviews our game received due to my little performance bug. They told me not to let it happen again, and when I asked how, they just told me not to do it again. How am I supposed to manage that‽

Luckily, I’ve found this awesome open source tool called Bencher. There’s a super generous free tier, so I can just use Bencher Cloud for my personal projects. And at work where everything needs to be in our private cloud, I’ve started using Bencher Self-Hosted.

Bencher has built-in adapters, so it’s easy to integrate into CI. After following the Quickstart guide, I’m able to run my benchmarks and track them with Bencher.

$ bencher run --project game "cargo bench"
    Finished `bench` profile [optimized + debuginfo] target(s) in 0.73s
     Running benches/play_game.rs (target/release/deps/play_game-84c12f98b1991829)
play_game::bench_play_game_group::bench_play_game_100
  Instructions:                       23679|23679                (No change)
  L1 Hits:                            30675|30675                (No change)
  LL Hits:                                2|2                    (No change)
  RAM Hits:                              19|19                   (No change)
  Total read+write:                   30696|30696                (No change)
  Estimated Cycles:                   31350|31350                (No change)
play_game::bench_play_game_group::bench_play_game play_0:(100)
  Instructions:                         218|218                  (No change)
  L1 Hits:                              273|273                  (No change)
  LL Hits:                                1|1                    (No change)
  RAM Hits:                              10|10                   (No change)
  Total read+write:                     284|284                  (No change)
  Estimated Cycles:                     628|628                  (No change)
play_game::bench_play_game_group::bench_play_game play_1:(1_000_000)
  Instructions:                         332|332                  (No change)
  L1 Hits:                              387|387                  (No change)
  LL Hits:                                1|1                    (No change)
  RAM Hits:                              10|10                   (No change)
  Total read+write:                     398|398                  (No change)
  Estimated Cycles:                     742|742                  (No change)

Gungraun result: Ok. 3 without regressions; 0 regressed; 0 filtered; 3 benchmarks finished in 0.45370s

Bencher New Report:
...
View results:
- play_game::bench_play_game_group::bench_play_game play_0:(100) (Estimated Cycles): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=c10f90a6-268a-4b31-b625-66f95eb4861f&measures=f03d9a6c-2b63-45c3-b34a-37149d1a7961&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game play_0:(100) (Instructions): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=c10f90a6-268a-4b31-b625-66f95eb4861f&measures=17acf657-735b-4ece-ab32-ba857db5edce&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game play_0:(100) (L1 Hits): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=c10f90a6-268a-4b31-b625-66f95eb4861f&measures=009a129f-4476-4202-9e2b-cd7aed7110ac&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game play_0:(100) (LL Hits): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=c10f90a6-268a-4b31-b625-66f95eb4861f&measures=932a00d1-e064-4f18-81fb-aa94a5f6d5a0&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game play_0:(100) (RAM Hits): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=c10f90a6-268a-4b31-b625-66f95eb4861f&measures=c98672c7-8229-4e90-9773-482618b71dbf&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game play_0:(100) (Total read+write): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=c10f90a6-268a-4b31-b625-66f95eb4861f&measures=0bd6ec91-2b29-47ea-801e-dc09338f3119&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game play_1:(1_000_000) (Estimated Cycles): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=c0c16a00-5ad1-4787-92ac-a39eed8c5375&measures=f03d9a6c-2b63-45c3-b34a-37149d1a7961&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game play_1:(1_000_000) (Instructions): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=c0c16a00-5ad1-4787-92ac-a39eed8c5375&measures=17acf657-735b-4ece-ab32-ba857db5edce&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game play_1:(1_000_000) (L1 Hits): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=c0c16a00-5ad1-4787-92ac-a39eed8c5375&measures=009a129f-4476-4202-9e2b-cd7aed7110ac&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game play_1:(1_000_000) (LL Hits): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=c0c16a00-5ad1-4787-92ac-a39eed8c5375&measures=932a00d1-e064-4f18-81fb-aa94a5f6d5a0&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game play_1:(1_000_000) (RAM Hits): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=c0c16a00-5ad1-4787-92ac-a39eed8c5375&measures=c98672c7-8229-4e90-9773-482618b71dbf&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game play_1:(1_000_000) (Total read+write): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=c0c16a00-5ad1-4787-92ac-a39eed8c5375&measures=0bd6ec91-2b29-47ea-801e-dc09338f3119&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game_100 (Estimated Cycles): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=4da8a40c-2282-487c-bac8-21218deba041&measures=f03d9a6c-2b63-45c3-b34a-37149d1a7961&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game_100 (Instructions): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=4da8a40c-2282-487c-bac8-21218deba041&measures=17acf657-735b-4ece-ab32-ba857db5edce&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game_100 (L1 Hits): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=4da8a40c-2282-487c-bac8-21218deba041&measures=009a129f-4476-4202-9e2b-cd7aed7110ac&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game_100 (LL Hits): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=4da8a40c-2282-487c-bac8-21218deba041&measures=932a00d1-e064-4f18-81fb-aa94a5f6d5a0&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game_100 (RAM Hits): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=4da8a40c-2282-487c-bac8-21218deba041&measures=c98672c7-8229-4e90-9773-482618b71dbf&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409
- play_game::bench_play_game_group::bench_play_game_100 (Total read+write): https://bencher.dev/console/projects/game/perf/game?branches=b53281dd-375a-4986-8074-17e9d488815e&heads=77345ed9-2e45-4d43-8186-f0f99b8120d1&testbeds=ef809413-f1ae-4889-bb91-d5e2e5769830&specs=%2C&benchmarks=4da8a40c-2282-487c-bac8-21218deba041&measures=0bd6ec91-2b29-47ea-801e-dc09338f3119&start_time=1773186232000&end_time=1775778233000&report=c703a61c-46a4-43bf-bbce-cb69d679b409

Using this nifty time travel device that a nice rabbit gave me, I was able to go back in time and replay what would have happened if we had been using Bencher all along. You can see where we first pushed the buggy FizzBuzzFibonacci implementation. I immediately got failures in CI as a comment on my pull request. That same day, I fixed the performance bug by getting rid of that needless extra loop. No fires. Just happy users.

Bencher: Continuous Benchmarking

Bencher is a suite of continuous benchmarking tools. Have you ever had a performance regression impact your users? Bencher could have prevented that from happening. Bencher allows you to detect and prevent performance regressions before they merge.

Run: Run your benchmarks locally or in CI using the exact same bare metal runners and your favorite benchmarking tools. The bencher CLI orchestrates running your benchmarks on bare metal and stores the results.
Track: Track the results of your benchmarks over time. Monitor, query, and graph the results using the Bencher web console based on the source branch, testbed, benchmark, and measure.
Catch: Catch performance regressions locally or in CI using the exact same bare metal hardware. Bencher uses state-of-the-art, customizable analytics to detect performance regressions before they merge.
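To make the idea of "catching" a regression concrete, here is a minimal sketch of a percentage-based check. It is an illustration only, not Bencher's actual analytics: the `mean` and `is_regression` helpers, the 10% limit, and the sample numbers are all assumptions made up for this example. It compares the latest measurement (say, an instruction count) against the mean of earlier runs.

```rust
/// Returns the arithmetic mean of `samples`, or `None` if the slice is empty.
fn mean(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    Some(samples.iter().sum::<f64>() / samples.len() as f64)
}

/// Flags `latest` as a regression when it exceeds the historical mean
/// by more than `max_increase` (0.10 means 10%).
/// Hypothetical helper for illustration; not part of Bencher or Gungraun.
fn is_regression(history: &[f64], latest: f64, max_increase: f64) -> bool {
    match mean(history) {
        // Compare against an upper boundary derived from the baseline.
        Some(baseline) => latest > baseline * (1.0 + max_increase),
        // No history yet, so there is nothing to compare against.
        None => false,
    }
}
```

With a made-up history of `[1_000.0, 1_010.0, 990.0]` and a 10% limit, calling `is_regression(&history, 1_020.0, 0.10)` stays under the boundary, while `is_regression(&history, 1_500.0, 0.10)` gets flagged. A real setup would likely want something more robust than a plain mean, which is the kind of thing customizable analytics are for.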

For the same reasons that unit tests are run to prevent feature regressions, benchmarks should be run with Bencher to prevent performance regressions. Performance bugs are bugs!

Start catching performance regressions before they merge — try Bencher Cloud for free.

Published: Sun, May 17, 2026 at 12:00:00 AM UTC | Last Updated: Wed, June 3, 2026 at 12:00:00 AM UTC