803.sph_exa_s
SPEC CPU®2026 Benchmark Description

Benchmark Name

803.sph_exa_s (SPH-EXA mini-app)

Benchmark Program General Category

Astrophysics - Smoothed particle hydrodynamics

Benchmark Authors

Authors listed in alphabetic order:

Aurélien Cavelan (University of Basel)
Danilo Guerrera (University of Basel)
Michal Grabarczyk (University of Basel)

Benchmark Description

The SPH-EXA mini-app implements the smoothed particle hydrodynamics (SPH) technique, a meshless Lagrangian method commonly used for performing hydrodynamical and computational fluid dynamics simulations.

The SPH technique discretizes a fluid into a set of interpolation points (SPH particles) whose distribution follows the mass density of the fluid. Their evolution relies on a weighted interpolation over nearby neighboring particles. SPH simulations with detailed physics calculations are computationally demanding. The SPH-EXA mini-app is derived from three parent SPH codes used in astrophysics (SPHYNX and ChaNGa) and computational fluid dynamics (SPH-flow).

The mini-app includes the basic steps of any SPH calculation. First, a tree is built from the particles' positions and masses and walked to identify the neighbors that will be used for the remainder of the global time-step (or iteration). The subsequent steps include the evaluation of the particles' density, acceleration, and rate of change of internal energy, along with all physical modules relevant to the studied scenario. Next, a new physically relevant and numerically stable time-step is found, and the properties of the particles are updated accordingly.
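The density evaluation mentioned above can be illustrated with a minimal sketch. It assumes the common sinc-kernel shape S(q) = [sin(pi q / 2) / (pi q / 2)]^n with support 0 <= q < 2, omits the normalization constant, and replaces the tree walk with a brute-force loop. The `Particle` struct and both function names are hypothetical and are not taken from the benchmark sources.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical particle record; the benchmark's own data layout is not shown here.
struct Particle {
    double x, y, z; // position
    double m;       // mass
};

// Unnormalized sinc kernel shape S(q) = [sin(pi*q/2) / (pi*q/2)]^n on 0 <= q < 2.
// The normalization constant is deliberately omitted (assumption of this sketch).
inline double sincKernelShape(double q, double n)
{
    const double kPi = 3.14159265358979323846;
    if (q >= 2.0) return 0.0;  // compact support: no contribution beyond 2h
    if (q < 1e-12) return 1.0; // limit of sin(a)/a as a -> 0
    const double a = 0.5 * kPi * q;
    return std::pow(std::sin(a) / a, n);
}

// Brute-force weighted sum over all particles. A tree walk would restrict the
// loop to nearby neighbors: rho_i ~ sum_j m_j * S(r_ij / h) / h^3.
inline double densityEstimate(const std::vector<Particle>& p, std::size_t i,
                              double h, double n)
{
    assert(i < p.size() && h > 0.0);
    double sum = 0.0;
    for (const Particle& pj : p) {
        const double dx = p[i].x - pj.x;
        const double dy = p[i].y - pj.y;
        const double dz = p[i].z - pj.z;
        const double r  = std::sqrt(dx * dx + dy * dy + dz * dz);
        sum += pj.m * sincKernelShape(r / h, n);
    }
    return sum / (h * h * h);
}
```

Because particles farther than 2h contribute nothing, a neighbor list built once per iteration can replace the full loop without changing the result, which is the role of the tree in the mini-app.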

The SPH-EXA mini-app is a modern C++ header-only code (except for main.cpp) with no external software dependencies. The parallelism is currently expressed via OpenMP.

This mini-app can simulate a three-dimensional rotating square patch, a demanding scenario for SPH simulations due to the presence of negative pressures, which stimulate the emergence of unphysical tensile instabilities that destroy the particle system, unless corrective repulsive forces are included.

Input Description

-n <num> - number of particles along each edge of the cube (-n 100 means that the application will run with 100 * 100 * 100 particles)

-s <num> - number of time-steps (iterations)

-w <num> - specify how often the output file is written (-w 50 means that the output file is dumped every 50 iterations)

For testing, automatic generation of the input conditions for all particles has been added to the source code.

In the present setup the code simulates the evolution of a three-dimensional rotating square patch of fluid. To do this, additional parameters are set in the initialization phase (SqPatch::init() and constants in SqPatch):

The smoothing length H is set to 0.02
The initial density is 0.0
The speed of sound is set to 35 m/s
The mass of the particles is 1 g
The Courant number used to calculate the next time-step is K=0.2
The index of the sinc kernel is set to 6
15 stabilization time-steps are used before letting the system fully evolve
The target number of neighbors (per particle) is set to 650
The next time-step cannot be more than 10% larger than the previous one
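How the Courant number and the time-step limit interact can be sketched as follows. This is an illustration under two assumptions, not the benchmark's actual code: the Courant condition is taken in its simplest form, dt = K * h / c, and the 10% limit is read as a cap on growth (a factor of 1.1 relative to the previous step). Both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Simplest Courant-limited step (assumed form): dt = K * h / c.
inline double courantTimeStep(double h, double c, double K)
{
    assert(h > 0.0 && c > 0.0 && K > 0.0);
    return K * h / c;
}

// Cap the growth of the step: the next step may not exceed
// maxIncrease times the previous one (1.1 means at most 10% larger).
inline double nextTimeStep(double candidate, double previous,
                           double maxIncrease = 1.1)
{
    return std::min(candidate, maxIncrease * previous);
}
```

With the listed values (H = 0.02, speed of sound 35, K = 0.2), `courantTimeStep` gives 0.2 * 0.02 / 35, roughly 1.14e-4. The real code evaluates its condition per particle and takes the minimum, so the step it reports may differ.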

The number of particles in the cube and the number of time-steps vary by workload:

Workload	Particles	Time-steps	State dump interval
test	1,000,000 (`-n 100`)	0	1
train	125,000 (`-n 50`)	7	5
refspeed	9,261,000 (`-n 210`)	24	7
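The particle counts in the table follow directly from the `-n` value, since the cube holds n particles along each edge. A minimal sketch of that arithmetic (the function name `particleCount` is hypothetical):

```cpp
#include <cstdint>

// Total particles in a cube with n particles along each edge.
constexpr std::uint64_t particleCount(std::uint64_t n)
{
    return n * n * n;
}

// Compile-time checks against the workload table.
static_assert(particleCount(100) == 1000000ULL, "test workload");
static_assert(particleCount(50) == 125000ULL, "train workload");
static_assert(particleCount(210) == 9261000ULL, "refspeed workload");
```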

Output Description

The code performs the simulation, and at the end it saves the total energy of the system. It is sufficient to check that this value does not change across simulations (for a fixed number of time-steps) to make sure that the code has executed correctly, but for added assurance the intermediate states of the calculation are also checked every n time-steps, where n varies by workload as shown in the table above.

In addition to the intermediate state of the calculation, the state of every 128th particle in the cube is also dumped and checked. This is a separate CSV output file per time-step which contains positions (x, y, z), velocities (vx, vy, vz), smoothing length (h), density (ro), internal energy (u), pressure (p), speed of sound (c) and gradient of pressure (gradPx, gradPy, gradPz). Only every 128th particle is printed to keep output files to a reasonable size, and because a large variation in a single point would not have a large effect on the overall calculation.
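The sampling described above can be sketched as a strided index selection. The helper `sampledIndices` is hypothetical, and starting the stride at particle index 0 is an assumption; the benchmark may select a different offset.

```cpp
#include <cstddef>
#include <vector>

// Indices of the particles that would be written when only every
// `stride`-th particle is dumped, starting at index 0 (assumption).
inline std::vector<std::size_t> sampledIndices(std::size_t count,
                                               std::size_t stride = 128)
{
    std::vector<std::size_t> out;
    if (stride == 0) return out; // nothing sensible to select
    out.reserve((count + stride - 1) / stride);
    for (std::size_t i = 0; i < count; i += stride) {
        out.push_back(i);
    }
    return out;
}
```

Under these assumptions a dump holds ceil(N / 128) rows, so a cube of 125,000 particles yields 977 rows rather than 125,000.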

Each output file has a mix of very large and very small values, and all of them are checked. The tolerances used vary by file. Absolute tolerances are set to cover acceptable deltas in the very small values, while relative tolerances are used to cover acceptable deltas in the large values.

Tolerances are set in the Spec/object.pm file used by the runcpu program, and are shown here:

Workload	Output File	Relative tolerance	Absolute tolerance
test	`constants.csv`	0.506%	None
	`sph_exa.out`	0.506%	None
	`dump0.csv`	0.223%	0.0002
train	`constants.csv`	1.032%	None
	`sph_exa.out`	1.032%	None
	`dump0.csv`	0.00241%	0.00007
	`dump5.csv`	0.175%	0.0003
	`dump7.csv`	0.176%	0.0003
refspeed	`constants.csv`	4.3%	None
	`sph_exa.out`	4.3%	None
	`dump0.csv`	None	0.0004
	`dump7.csv`	0.7%	0.04
	`dump14.csv`	0.2%	0.04
	`dump21.csv`	0.6%	0.05
	`dump24.csv`	0.3%	0.05

Values were selected based on observed differences on various platforms relative to a reference run done on an x86 Linux system with GCC 14.2.0 and no optimization (-O0).
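The way absolute and relative tolerances complement each other can be sketched as below. This is an illustration only: it assumes a value is accepted when it satisfies either tolerance, and it uses a negative number to mean "None". The function `withinTolerance` is hypothetical and is not part of the SPEC tools, whose exact comparison rules may differ.

```cpp
#include <cmath>

// Accept `actual` if it is within the absolute tolerance OR within the
// relative tolerance of `expected` (assumed rule). A negative tolerance
// means that tolerance is not set ("None" in the table).
inline bool withinTolerance(double expected, double actual,
                            double relTol, double absTol)
{
    const double delta = std::fabs(actual - expected);
    if (absTol >= 0.0 && delta <= absTol) return true;
    if (relTol >= 0.0 && delta <= relTol * std::fabs(expected)) return true;
    return false;
}
```

With a relative tolerance of 0.175% (0.00175) and an absolute tolerance of 0.0003, a large value such as 1000 may differ by up to 1.75, while a value near zero is covered by the absolute tolerance even when its relative error is large.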

Differences from 532.sph_exa_t and 632.sph_exa_s in SPEChpc™ 2021

The source code that implements the algorithm is largely unchanged. The CPU 2026 version removes support for OpenACC and OpenMP target offload.

The workloads are most similar to 532.sph_exa_t. The test workload is exactly the same, while the train cube size and number of iterations are larger. The refspeed workload (corresponding to "ref" in 532.sph_exa_t) has the same cube size but runs for fewer iterations.

The primary difference is in output validation; the SPEChpc 2021 versions validate only the total energy output after the simulation completes. 803.sph_exa_s validates much more of the intermediate state as explained above.

Programming Language

C++

Threading Model

OpenMP enabled with -DSPEC_OPENMP

Known Portability Issues

None

Sources and Licensing

The benchmark is licensed under the MIT license. The sources SPEC started with are from around commit 7604c824 in the SPH-EXA GitHub repo, but have been modified. The benchmark contains some changes from upstream after this point, but not all of them.

References

GitHub: https://github.com/unibas-dmi-hpc/SPH-EXA_mini-app
D. Guerrera et al., "Towards a Mini-App for Smoothed Particle Hydrodynamics at Exascale," 2018 IEEE International Conference on Cluster Computing (CLUSTER), Belfast, UK, 2018, pp. 607-614, doi: 10.1109/CLUSTER.2018.00077.
