735.gem5_r
SPEC CPU®2026 Benchmark Description

Benchmark Name

735.gem5_r

Benchmark Program General Category

Computer Architecture Simulation

Benchmark Authors

The gem5 project includes contributions from more than 175 developers (see the Acknowledgments section of The gem5 Simulator: Version 20.0+).

735.gem5_r was submitted to the SPEC CPU v8 Benchmark Search Program by Jason Lowe-Power <jason[at]lowepower [dot] com>.

Benchmark Description

The gem5 simulator is a modular platform for computer-system architecture research, encompassing system-level architecture as well as processor microarchitecture. It was originally conceived for computer architecture research in academia, but gem5 has grown to be used in computer system design by academia for research and teaching, and in industry for research and product design.

The open-source and community-supported gem5 simulator is one of the most popular tools for computer architecture research. This simulation infrastructure allows researchers to model modern computer hardware at the cycle level, and it has enough fidelity to boot unmodified Linux-based operating systems and run full applications for multiple architectures including x86, ARM, and RISC-V. The SPEC CPU® version only supports simulation of the RISC-V architecture.

More technically, gem5 is a modular, discrete-event-driven computer system simulator platform. That means that:

Components can be rearranged, parameterized, extended, or replaced easily to suit varying needs.
It simulates the passing of time as a series of discrete events.
It is more than just a simulator; it is a simulator platform that lets you use as many of its premade components as you want to build up your own simulation system model.
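The discrete-event idea above can be illustrated with a toy event queue. This is a minimal sketch in plain Python, not gem5's implementation: the class name EventQueue, the tick values, and the "miss"/"fill"/"hit" labels are all hypothetical.

```python
import heapq
import itertools


class EventQueue:
    """Toy discrete-event queue: time jumps from one scheduled event to the next."""

    def __init__(self):
        self.now = 0                    # current simulated tick
        self._heap = []                 # entries are (tick, sequence, callback)
        self._seq = itertools.count()   # tie-breaker: same-tick events run in FIFO order

    def schedule(self, delay, callback):
        """Schedule callback to run `delay` ticks after the current tick."""
        heapq.heappush(self._heap, (self.now + delay, next(self._seq), callback))

    def run(self, max_tick=None):
        """Process events in tick order; stop early if the next event is past max_tick."""
        while self._heap:
            tick, _, callback = self._heap[0]
            if max_tick is not None and tick > max_tick:
                break
            heapq.heappop(self._heap)
            self.now = tick             # no work is done for the ticks in between
            callback()
        return self.now


def demo():
    queue = EventQueue()
    log = []

    def cache_miss():
        log.append((queue.now, "miss"))
        # A miss schedules a (hypothetical) memory response 100 ticks later.
        queue.schedule(100, lambda: log.append((queue.now, "fill")))

    queue.schedule(5, cache_miss)
    queue.schedule(20, lambda: log.append((queue.now, "hit")))
    queue.run()
    return log
```

Calling demo() returns the events in the order they fire, with simulated time advancing only when an event is processed.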

Input Description

gem5 takes as input a single Python script that describes the simulation model. The gem5 binary acts as a Python interpreter with the gem5 and m5 Python libraries built in. These libraries are wrappers around the C++ models provided by gem5.

Each Python script that is used as the input to gem5 configures a system simulation by setting up a memory (e.g., DRAM), a cache hierarchy, and a processor model to simulate. The Python script also includes "resources" that are needed to run a workload on the simulated system. These resources can include binary applications, disk images, and kernels. Finally, the Python script defines the simulator control (usually just simulator.run(), but more complex control is also possible).
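A minimal sketch of such a script follows, assuming the gem5 standard library modules from upstream gem5 v22.1 are built into the binary. The component choices (SimpleBoard, a timing CPU, the cache sizes, the clock frequency) and the resource name "riscv-hello" are illustrative assumptions, not the contents of any benchmark input, and the resource may need to be supplied locally in the benchmark environment.

```python
def build_simulator():
    """Assemble a one-core RISC-V system in SE mode and return a Simulator for it."""
    # The imports live inside the function because these modules only exist
    # inside the gem5 binary's embedded Python interpreter.
    from gem5.components.boards.simple_board import SimpleBoard
    from gem5.components.cachehierarchies.classic.private_l1_private_l2_cache_hierarchy import (
        PrivateL1PrivateL2CacheHierarchy,
    )
    from gem5.components.memory.single_channel import SingleChannelDDR4_2400
    from gem5.components.processors.cpu_types import CPUTypes
    from gem5.components.processors.simple_processor import SimpleProcessor
    from gem5.isas import ISA
    from gem5.resources.resource import Resource
    from gem5.simulate.simulator import Simulator

    # Memory, cache hierarchy, and processor model (all sizes are illustrative).
    memory = SingleChannelDDR4_2400(size="1GiB")
    caches = PrivateL1PrivateL2CacheHierarchy(
        l1d_size="64KiB", l1i_size="64KiB", l2_size="512KiB"
    )
    processor = SimpleProcessor(cpu_type=CPUTypes.TIMING, isa=ISA.RISCV, num_cores=1)

    board = SimpleBoard(
        clk_freq="1GHz", processor=processor, memory=memory, cache_hierarchy=caches
    )

    # The "resource": an SE-mode binary whose ISA matches the board.
    # "riscv-hello" is an assumed resource name, not taken from the benchmark.
    board.set_se_binary_workload(Resource("riscv-hello"))

    return Simulator(board=board)


# gem5 executes configuration scripts with __name__ set to "__m5_main__",
# so this block does nothing under a regular Python interpreter.
if __name__ == "__m5_main__":
    build_simulator().run()
```

The script only does work when passed to the gem5 binary; under a regular Python interpreter it merely defines the function.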

The gem5 simulator can operate in two modes: "syscall emulation" (SE) mode and "full system" (FS) mode. The test, train, and refrate inputs use SE mode. The refspeed input uses FS mode.

In SE mode, the operating system is emulated in the gem5 simulator. The only required resource in SE mode is a binary to execute. The binary's ISA must match the ISA of the board that gem5 simulates (e.g., simple_hello_riscv.py in the alternate test input uses the RISC-V ISA and requires a RISC-V binary). To use a new binary with this board, you can pass your own binary to the board.set_se_binary_workload function. You can use local binaries by extending the resources.json file or use any binaries available on gem5 Resources.

In FS mode, gem5 does not emulate anything. The entire system, including devices, is simulated at the cycle level. Thus, the resources required for FS mode include a disk image which contains an OS image (e.g., Ubuntu) and a kernel binary. FS resources are available on gem5 Resources, and you can choose other FS workloads by changing the board.set_workload line.

Here are detailed descriptions of each of the benchmark workloads offered.

synthetic_traffic.py (test): Runs a traffic generator which injects traffic into a single channel of the HBM DRAM model. The script supports two arguments: the type of traffic (Linear or Random) and the percentage of reads vs. writes. You can change the duration of the generators to make the benchmark run longer or shorter.
synthetic_traffic.py (train, refrate): Runs a traffic generator which injects traffic into the cache models. The script supports the "classic" caches (simple model) and the "Ruby" caches (more complex model based on the domain-specific language SLICC). The script supports three arguments: the type of traffic (Linear or Random), the percentage of reads vs. writes, and an option to enable the Ruby model instead of classic caches. This input tests the memory subsystem of gem5. You can change the duration of the generators to make the benchmark run longer or shorter.
bwaves-train-simpt-3-match.py (refspeed): Runs a SimPoint-based checkpoint in gem5's SE mode. The checkpoint is a snippet of the SPEC CPU®2017 program 503.bwaves. This input tests loading the checkpoint into gem5 and running a complex application in SE mode. This script uses the "RISCVMatchedBoard" which is a model of an in-order RISC-V processor with a two-level cache hierarchy and DDR4 memory. It runs for 10000000 instructions to warm up the microarchitecture, resets the architectural statistics in the simulator, and runs for another 40000000 to model the program. Note that since this is using SE mode, gem5 emulates the OS system calls and uses the host's kernel. This version of gem5 should be stable across hosts.
bwaves-train-simpt-3-o3cpu.py (refspeed): Runs a SimPoint-based checkpoint in gem5's SE mode. The checkpoint is a snippet of the SPEC CPU®2017 program 503.bwaves. This input tests loading the checkpoint into gem5 and running a complex application in SE mode. This script uses the out-of-order CPU model of a RISC-V processor with a two-level cache hierarchy and DDR4 memory. It runs for 10000000 instructions to warm up the microarchitecture, resets the architectural statistics in the simulator, and runs for another 30000000 to model the program.
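The two traffic types and the read percentage accepted by synthetic_traffic.py can be pictured with a toy generator. This is a hedged sketch in plain Python, not gem5's traffic generator: the function name generate_traffic and the block_size, addr_range, and seed parameters are hypothetical.

```python
import random


def generate_traffic(mode, read_percent, num_requests,
                     block_size=64, addr_range=1 << 20, seed=0):
    """Yield (address, is_read) pairs for a toy traffic generator.

    mode         -- "Linear" walks block by block; "Random" picks blocks at random
    read_percent -- share of requests (0 to 100) that are reads; the rest are writes
    """
    if mode not in ("Linear", "Random"):
        raise ValueError(f"unknown traffic mode: {mode!r}")

    rng = random.Random(seed)            # fixed seed keeps the stream reproducible
    num_blocks = addr_range // block_size

    for i in range(num_requests):
        if mode == "Linear":
            block = i % num_blocks       # wrap around at the end of the range
        else:
            block = rng.randrange(num_blocks)
        # rng.random() is in [0, 1), so 100 means all reads and 0 means all writes.
        is_read = rng.random() * 100 < read_percent
        yield block * block_size, is_read
```

For example, list(generate_traffic("Linear", 100, 4)) produces four read requests at consecutive 64-byte block addresses.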

In addition to the benchmark workloads above, there are some extra workloads for anyone wishing to do further analysis. These are in the same data directories but were not included in the benchmark, either because of total runtime or because they did not port well to esoteric systems.

simple_hello_riscv.py (test): Runs an execution-based simulation of the guest application "hello-world". The guest application is compiled for the RISC-V ISA. The system that is modeled is the "RISCVMatchedBoard" which is a model of an in-order RISC-V processor with a two-level cache hierarchy and DDR4 memory.
run_stream.py (train, refrate): Runs a simple stream-like kernel using the RISC-V ISA in SE mode. The input to the stream kernel is the size of the array to use (10000 for train, 30000 for refrate). Runs a single core at 5 GHz with a 2-level cache hierarchy (64 KiB/512 KiB) and DDR4 DRAM models. This tests the execution-based CPU models and the memory subsystem of gem5. You can change the length of the execution by modifying the arguments to the stream-riscv.exe input in the set_se_binary_workload function.
povray-ref-simpt-16.py (refspeed): Runs a SimPoint-based checkpoint in gem5's SE mode. The checkpoint is a snippet of the SPEC CPU®2017 program 511.povray. This input tests loading the checkpoint into gem5 and running a complex application in SE mode. This script uses the "RISCVMatchedBoard" which is a model of an in-order RISC-V processor with a two-level cache hierarchy and DDR4 memory. It runs for 10000000 instructions to warm up the microarchitecture, resets the architectural statistics in the simulator, and runs for another 40000000 to model the program. Note that since this is using SE mode, gem5 emulates the OS system calls and uses the host's kernel. This version of gem5 should be stable across hosts.
povray-ref-simpt-16-o3cpu.py (refspeed): Runs a SimPoint-based checkpoint in gem5's SE mode. The checkpoint is a snippet of the SPEC CPU®2017 program 511.povray. This input tests loading the checkpoint into gem5 and running a complex application in SE mode. This script uses the out-of-order CPU model of a RISC-V processor with a two-level cache hierarchy and DDR4 memory. It runs for 10000000 instructions to warm up the microarchitecture, resets the architectural statistics in the simulator, and runs for another 30000000 to model the program.
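The warm-up, statistics reset, and measurement sequence used by the SimPoint checkpoint workloads can be sketched with a toy cache model. This is an illustration in plain Python under simplifying assumptions (an unbounded cache, 64-byte lines, hypothetical function names), not the logic of the benchmark scripts.

```python
def run_phase(cache, addresses, stats):
    """Feed addresses through a toy cache, updating hit/miss statistics."""
    for addr in addresses:
        line = addr // 64                 # assumed 64-byte cache lines
        stats["accesses"] += 1
        if line not in cache:
            stats["misses"] += 1
            cache.add(line)               # unbounded toy cache: nothing is evicted


def warmup_then_measure(addresses, warmup_count):
    """Run a warm-up window, reset the statistics, then measure the remainder."""
    cache = set()                         # microarchitectural state, kept across the reset
    stats = {"accesses": 0, "misses": 0}

    run_phase(cache, addresses[:warmup_count], stats)

    # Reset only the counters; the warmed-up cache contents are preserved.
    stats = {"accesses": 0, "misses": 0}

    run_phase(cache, addresses[warmup_count:], stats)
    return stats
```

With a repeating access pattern, the cold misses fall into the warm-up window and are excluded from the reported statistics, which is the purpose of warming up before measuring.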

A savvy user can craft further workloads by modifying the existing Python input scripts or by creating their own. As listed above, each input directory contains extra Python scripts that are not part of the measured benchmark; these are good alternative workloads and a starting point for exploration by changing the parameters of the simulated system. In this case, possible models are available in the gem5 standard library, assuming they are built into the benchmark binary.

By default, gem5's debug tracing support is disabled. To enable debug tracing, you can pass the TRACING_ON define (-D flag) when building gem5. With TRACING_ON, you can use --debug-flags=FLAG to enable tracing for a flag. See --debug-help for more details.

Useful flags when running with TRACING_ON in execution mode (either SE mode or FS mode) include Exec and ExecAll. You can run the programs with these flags by executing ./gem5sim --debug-flags=ExecAll. When running with the traffic generators, the TrafficGen flag can be useful for debugging. To see all of the reads and writes to main memory, you can use the MemoryAccess flag. This will show each address and data read or written to memory. All of these flags can quickly produce heaps of output (GiBs or more), so using the option --debug-start=[tick] can be useful to enable the debug output starting at a specific tick. The option --debug-file=[filename] redirects the trace/debug output to a file. You can set a maximum tick to run to by modifying the input Python files, for example .simulate([maxtick]) or .run(max_ticks=[maxtick]).
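The debug options described above can be combined on one command line. The helper below is a small, hypothetical sketch that only assembles the argument list; the function name build_gem5_command and the trace file name are assumptions, and any arguments for the input script itself are left out. The debug options are placed before the script name.

```python
def build_gem5_command(script, debug_flags=(), debug_start=None,
                       debug_file=None, binary="./gem5sim"):
    """Return an argument list for a gem5 run with optional debug tracing.

    Tracing options only take effect in a binary built with TRACING_ON.
    """
    cmd = [binary]
    if debug_flags:
        # Several flags can be enabled at once as a comma-separated list.
        cmd.append("--debug-flags=" + ",".join(debug_flags))
    if debug_start is not None:
        # Suppress debug output until this simulated tick.
        cmd.append(f"--debug-start={debug_start}")
    if debug_file is not None:
        # Send trace/debug output to a file instead of the console.
        cmd.append(f"--debug-file={debug_file}")
    cmd.append(script)
    return cmd


example = build_gem5_command(
    "synthetic_traffic.py",
    debug_flags=["TrafficGen", "MemoryAccess"],
    debug_start=1000000,
    debug_file="trace.out",     # hypothetical file name
)
```

The resulting list can be joined with spaces for display or handed to subprocess.run once the script's own arguments have been appended.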

A gentle introduction to gem5 is available at www.gem5.org/documentation/learning_gem5/introduction.

Output Description

gem5 outputs a file called stats.txt which lists a number of microarchitectural metrics that were observed during the run time. This can include things like number of clock ticks, number of cache misses, number of page walks, and so on.

During validation, the tool called gem5stats is run on that stats file to extract the metrics listed in control.stat and place them in a stats.out file. This is the file which is validated to ensure a successful run. Since the simulation is deterministic, this file should match exactly between different test platforms and different compiler optimization levels (as long as infinite-math handling is enabled by the compiler build).
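The extraction step can be pictured as a simple filter over stats.txt. The sketch below is not the gem5stats tool: it assumes the usual stats.txt layout of "name value # description" lines between Begin/End marker lines, and it assumes (hypothetically) that the list of wanted metrics is a plain sequence of stat names and that each output line is "name value".

```python
def parse_stats(text):
    """Map each statistic name in stats.txt-style text to its value string."""
    stats = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()    # drop the trailing description
        if not line or line.startswith("-"):    # skip blanks and Begin/End markers
            continue
        fields = line.split()
        if len(fields) >= 2:
            # Values stay as strings so the output can be compared exactly.
            # If the file holds several dumps, later values overwrite earlier ones.
            stats[fields[0]] = fields[1]
    return stats


def extract(stats_text, wanted_names):
    """Return 'name value' lines for the wanted statistics, in the order requested."""
    stats = parse_stats(stats_text)
    return [f"{name} {stats[name]}" for name in wanted_names if name in stats]
```

Keeping the values as text rather than converting them to numbers avoids introducing any rounding differences into a file that is expected to match exactly.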

Programming Language

C++, C, Python

Threading Model

Both the SPECrate and SPECspeed versions are single-threaded.

Although the benchmark runs only 1 thread at a time, there are references to std::thread that remain in the source.

Known Portability Issues

GNU/Linux systems implement C++ std::thread using POSIX Threads. Although some systems automatically include the needed support, this is not universal. Surprises have been seen when changing OS versions, or libraries, or compilers; or when FDO is added; or when combining C and C++ modules. Typically, it is safest to add -pthread to all compile and link lines for all SPEC CPU benchmarks that use std::thread. Please see the $SPEC/config directory for Example config files that demonstrate how to conveniently do so.

The statistics processing in gem5 relies on std::isnan() being functional. As a result, precise math is needed to collect stats and print out the results.

Sources and Licensing

The gem5 public repository is available at https://github.com/gem5/gem5.

The SPEC CPU version started with version v22.1.0.0, which was released in December 2022.

gem5 source is licensed under the BSD-3-Clause license. Other external sources used in gem5 are licensed under compatible terms:

drampower library is BSD-3: drampower.license.txt, copyrighted by Fraunhofer IESE.
dramsim2 is BSD-2: dramsim2.license.txt, copyrighted by University of Maryland.
dramsim3 is MIT License: dramsim3.license.txt, copyrighted by University of Maryland.
libelf is BSD-2: libelf.license.txt, copyrighted by Joseph Koshy.
libfdt is BSD-3: libfdt.license.txt, copyrighted by David Gibson, IBM Corporation.
pybind11 is BSD-3: pybind11.license.txt, copyrighted by Wenzel Jakob.
softfloat is BSD-3: softfloat.license.txt, copyrighted by The Regents of the University of California.

CPython 3.6 is distributed under the Python License: cpython.license.txt, and we acknowledge the long list of authors: cpython.authors.txt. CPython is copyrighted by the Python Software Foundation.

The cpython source files {pyfpe.h, fpectlmodule.c, fpetestmodule.c} were produced at the University of California, Lawrence Livermore National Laboratory, and are distributed under a notice from LLNL.

getopt is BSD-3 licensed: getopt.license.txt, copyrighted by David Gottner.
protobuf is BSD-3 licensed: protobuf.license.txt, copyrighted by Google, Inc.
zlib is Zlib licensed: zlib.license.txt, copyrighted by Jean-loup Gailly and Mark Adler.
unicode object is offered under specific terms by Secret Labs AB and Fredrik Lundh: unicodeobject.license.txt.
ide_atareg.h is Copyright (c) 1998, 2001 Manuel Bouyer. Although it contains a BSD "advertising" clause, according to the history of the upstream module, Mr. Bouyer disclaimed the clause (history screenshot, retrieved 2-Apr-2026).
There are two versions of remote_gdb.cc present in the source directories. These are BSD licensed and copyright ARM Limited, Barkhausen Institut, Google, Huawei International, LabWare, NetBSD Foundation, Regents of The University of Michigan, Regents of the University of California, University of Virginia. The files contain advertising clauses for the University of California and for NetBSD. The University of California disclaimed such clauses in 1999 and NetBSD did the same in 2008.

The data inputs to gem5 consist of Python scripts which are distributed under BSD-3, copyrighted by The Regents of the University of California. Additionally, there are statically compiled binaries of "hello world" and "streams" programs offered freely without restriction by the author, Jason Lowe-Power. Finally, the refspeed input directory contains gem5 memory SimPoints of the SPEC CPU®2017 applications 503.bwaves_r and 511.povray_r.

References

J. Lowe-Power et al., The gem5 Simulator: Version 20.0+, September 2020, arXiv:2007.03152. doi:10.48550/arXiv.2007.03152
Website: www.gem5.org
Source code repository: github.com/gem5/gem5
