SPEC Benchmarking Joint US/Europe Colloquium
Program Details
System Balance and Application Balance in Cost/Performance
Optimization
John McCalpin, AMD
The concept of "balance" is commonly used in the discussion
of computer systems, typically with the implication that some
systems are "well-balanced" for some workload of interest,
while other systems are "unbalanced" or "poorly
balanced" for the workload of interest. Surprisingly, very
few quantitative discussions of "system balance" and/or "application
balance" appear in the literature. Attempts to create objective
quantifiable definitions quickly lead one to realize that there
are significant subtleties here that are quite important in understanding
the interaction of technologies with customer buying behavior.
In this talk, I will combine simple performance models of the
SPEC CFP2000 and CFP2006 benchmarks with simple cost models for
computer hardware to show that the "optimum" balance
for a system is strongly dependent on the specific metrics that
one is trying to optimize, and that the mathematical equation
for optimum balance provides decidedly non-intuitive results.
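As an illustration of the kind of analysis the talk describes, the following toy model minimizes cost/performance over memory bandwidth. All coefficients are invented and the linear cost and runtime forms are my assumptions, not McCalpin's actual models:

```python
# Hypothetical illustration: choose memory bandwidth B (GB/s) to
# minimize cost/performance for a fixed CPU speed.
# Runtime model: t(B) = t_cpu + traffic / B   (compute time + memory time)
# Cost model:    c(B) = c_base + c_mem * B    (linear memory-system cost)

def cost_per_perf(B, t_cpu=1.0, traffic=8.0, c_base=1000.0, c_mem=50.0):
    """Cost divided by performance (1/runtime), i.e. cost * runtime."""
    runtime = t_cpu + traffic / B
    cost = c_base + c_mem * B
    return cost * runtime

# Scan candidate bandwidths and pick the one minimizing cost/performance.
candidates = [b / 10 for b in range(1, 401)]   # 0.1 .. 40.0 GB/s
best = min(candidates, key=cost_per_perf)
print(f"optimal bandwidth ~ {best:.1f} GB/s")
```

Even in this toy version the optimum is non-intuitive: it sits at B* = sqrt(c_base * traffic / (c_mem * t_cpu)), so it depends on the fixed base cost of the system, not only on the memory subsystem itself, and changing the target metric changes the answer.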
[back to program details]
What's New With SPEC CPU2006: Changes to Benchmarks, Metrics,
Run Rules, and Technical Challenge
John Henning, Sun Microsystems
John will cover the topics listed in the title of
the talk, including at least two controversial
issues, one strangled metaphor, and a free
CPU2006 technical gift for the first 40 people
who ask for it. (Sit near the front of the room
to increase your chances of receiving the gift.)
[back to program details]
The CPU2006 Benchmark Tools
Cloyce Spradling, Sun Microsystems
The benchmarks that make up the SPEC CPU2006 and MPI2007 benchmark
suites are set up, run, timed, and scored by a tools harness.
These tools have evolved over time from a collection of edit-it-yourself
makefiles, shell scripts, and an Excel spreadsheet to the current
Perl-based suite. The basic purpose of the tools is to make life
easier for the benchmarker; they make it easier to tweak compilation
settings, easier to keep track of those settings, and most importantly
they make it easier to follow the run and reporting rules.
This presentation will give a basic overview of how the tools
normally operate to generate benchmark scores. The course of
a normal score-generating run is followed from setup to report
generation with a focus on exactly how the benchmark runs are
timed. A couple of new features designed to ease the burden of
benchmarking will also be discussed. It will also cover features
designed to aid those wishing to use the benchmarks for research
purposes, such as how modified benchmark sources may be tested,
ways to work around the tools when they get in your way, and
how the tools can help with profiling workloads.
[back to program details]
A Journalist's Experiences with CPU2006
Andreas Stiller, c't Magazine (Heise Verlag)
A short introduction to c't and iX, both with more than
12 years of experience with SPEC
Classification
Three main differences in the approach of the users of the benchmarks:
- Companies:
Let their "superhero" shine as best as possible
- Academic:
Look for special aspects, often combined with the use of
CPU benchmarks as the (one and only?) foundation of new
developments in the processor and computer area (as seen
in nearly all of the papers of the ISCA symposia).
SPEC CPU2000 is the most widely accepted measure of HPC
performance, e.g. kSI2K (kilo-SPECint2000) for huge grids
like the LHC Grid.
- Journalistic (the main topic here):
Try to make each review as fair as possible. Keeping up with
each and every new processor (c't mainly in the Mobile/Desktop/Workstation
space, iX for Workstations & Servers). Use of the
newest compiler capabilities to explore the potential
of new processors (Hyper-Threading on Itanium Montecito,
helper threads, auto-vectorization, auto-parallelization
...). Making published SPEC results more transparent.
Some important aspects of our approach
- Use of "real world" compilers in addition to specialized
compilers: e.g. Microsoft compilers for Windows and GNU for
Linux. 32-bit operating systems are still very important
for the c't readership.
- Use of current RTLs (e.g. from VS2005), although they are
a bit slower than their VS2003 predecessors, which are
therefore, logically, preferred by the companies mentioned above.
- Use of standard hardware (no super high speed "overclocked" memory
etc)
- No use of sophisticated libraries! But this rule can no
longer be kept with CPU2006 on Windows XP / Server 2003 because
of severe heap problems with some benchmarks. Some very strange
results will be presented. The combination of Vista / Server 2008
with the VS2005 RTL fares much better, but some stack
problems can occur with SPEC's default configuration files.
- Fairness sometimes even includes patching code or compilers
(e.g. modern Intel compilers exclude competitors' processors
from some optimizations). Results of CPU2000 and CPU2006
(not yet published) with patched code will be presented.
- No use of Peak, only Base! (It was a good idea to dispose
of the four-flag rule & FDO.)
Propositions & wishes for the future
- Addressing multithreading (with synchronization, locking,
and the cache-coherency protocol). That will be much more
important than the current SPECrate. OpenMP and MPI should help.
- Addressing Vectorization: SIMD with native vector data types
should be added
- Avoiding too much OS influence (like heap management in
Windows 2K3)
- And finally: much shorter runtimes (even Intel's far-future
processor generation Gesher/Sandy Bridge will probably
have not much more than a 4 GHz clock speed ...)
[back to program details]
The HPC Challenge (HPCC) Benchmark Suite: Characterizing
a System with Several Specialized Kernels
Rolf Rabenseifner, High Performance Computing Center, Stuttgart
In 2003, DARPA's High Productivity Computing Systems program released
the HPCC benchmark suite. It examines the performance of HPC
architectures using kernels with the memory access and communication
patterns of well-known computational codes. Consequently, HPCC
results bound the performance of real applications as a function
of memory access and communication characteristics and define
performance boundaries of HPC architectures. The suite was intended
to augment the TOP500 list and by now the results are publicly
available for 6 out of 10 of the world's fastest computers.
This talk will introduce the individual benchmarks used to characterize
different system resources. The publicly available results are
compared, and system balance is assessed via the ratios between
computational speed, memory bandwidth, and network bandwidth.
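The kind of balance ratio used in such comparisons can be sketched as follows; the system names and kernel numbers below are invented for illustration, not actual HPCC submissions:

```python
# Illustrative only: compute balance ratios from HPCC-style kernel
# results. The figures are made up, not real systems.
systems = {
    # name: (HPL Tflop/s, STREAM triad TB/s, random-ring bandwidth GB/s)
    "system_A": (100.0, 30.0, 50.0),
    "system_B": (250.0, 40.0, 60.0),
}

for name, (flops, mem_bw, net_bw) in systems.items():
    # Flop/s per byte/s of memory bandwidth: higher means the machine
    # is more compute-heavy relative to its memory system.
    flops_per_mem_byte = flops * 1e12 / (mem_bw * 1e12)
    flops_per_net_byte = flops * 1e12 / (net_bw * 1e9)
    print(f"{name}: {flops_per_mem_byte:.2f} flop per memory byte, "
          f"{flops_per_net_byte:.0f} flop per network byte")
```

A memory-bandwidth-bound application cannot run faster than such a ratio allows, which is why HPCC results bound real-application performance.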
[back to program details]
SPEC MPI2007 - An Application Benchmark for Clusters and
HPC Systems
Matthijs Van Waveren, Fujitsu
SPEC plans to release the SPEC MPI2007 benchmark suite at ISC2007.
SPEC HPG has developed this benchmark suite and its run rules
over the last few years. The purpose of the SPEC MPI2007 benchmark
and its run rules is to further the cause of fair and objective
benchmarking of high-performance computing systems. The rules
help ensure that published results are meaningful, comparable
to other results, and reproducible. MPI2007 includes 13 technical
computing applications from the fields of Computational Fluid
Dynamics, Molecular Dynamics, Electromagnetism, Geophysics, Ray
Tracing, and Hydrodynamics. We describe the benchmark suite,
and compare it to other benchmark suites.
[back to program details]
The SPECsfs2007 Benchmark – A Preview
Darren Sawyer, Network Appliance
The SPEC SFS subcommittee has been steadily working towards the
release of SPECsfs2007, the first major update to the SPEC SFS network
file-serving benchmark in nearly 10 years. Using data collected
by member companies from systems deployed at customers around
the world, the SPECsfs2007 benchmark has made a number of changes
to the original SPECsfs97 NFS workload to adapt for the realities
of network fileserving today. These include an adjusted NFSv3
operation mix, increased file and transfer sizes, a much larger
working set, elimination of NFSv2 and the UDP networking transport,
addition of IPv6 support, and improved simulation of client commit
patterns based on server responses to write requests, among other
minor changes. How the SFS committee used collected data, industry
trends, and experiences with the SPECsfs97 benchmark to identify
the need for these revisions will be detailed.
In addition to changes in the NFS workload, high customer demand
for the introduction of a Windows-based file serving benchmark
led the committee to pursue adding a new workload using the CIFS
protocol to the SFS benchmark. Adding CIFS proved to be no easy
task. Unlike NFS, CIFS is a stateful protocol in which certain
operations are only likely or even possible when following a
certain sequence of previous operations. The technique of using
a simple, stateless random distribution of operations utilized
by the NFS operation generation code would not suffice for CIFS.
Thus, a new operation generation technique, based on generating ‘clusters’ of
CIFS operations (aka CoCos) using a Hidden Markov Model (HMM)
created from the patterns observed in real customer trace data,
was developed not only to maintain proper sequences of operations
required by the protocol, but also to better mimic the sequences
of operations seen by real CIFS fileservers. The talk will describe
this technique to some level of detail and will share the data
used to generate the model ultimately used by the CIFS operation
generation code in SPECsfs2007.
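A minimal sketch of such state-driven operation generation follows. The states and transition probabilities are invented, not SPEC's actual model, and it is simplified to a visible Markov chain; a full HMM would add emission distributions over concrete CIFS operations per hidden state:

```python
import random

# Toy Markov-chain generator of CIFS-like operation sequences, so that
# stateful orderings (OPEN before READ/WRITE before CLOSE) arise from
# the transition structure rather than from independent random draws.
STATES = ["OPEN", "READ", "WRITE", "CLOSE"]
TRANSITIONS = {          # P(next state | current state); rows sum to 1
    "OPEN":  {"READ": 0.5, "WRITE": 0.4, "CLOSE": 0.1},
    "READ":  {"READ": 0.6, "WRITE": 0.1, "CLOSE": 0.3},
    "WRITE": {"WRITE": 0.5, "READ": 0.1, "CLOSE": 0.4},
    "CLOSE": {"OPEN": 1.0},   # a closed file must be reopened first
}

def generate(n_ops, rng=random.Random(42)):
    """Walk the chain, emitting one operation per state visited."""
    ops, state = [], "OPEN"
    for _ in range(n_ops):
        ops.append(state)
        names, probs = zip(*TRANSITIONS[state].items())
        state = rng.choices(names, weights=probs)[0]
    return ops

trace = generate(20)
print(" ".join(trace))
```

In SPECsfs2007 the transition and emission probabilities were fitted to real customer trace data, which is what lets the generated clusters mimic sequences seen by real CIFS file servers.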
Following the workload discussions, a brief overview will be given
of modifications to several non-workload facets of the benchmark
that improve portability, usability, and results reporting and
disclosure. The talk will conclude with
suggestions for future directions of the SFS benchmark.
[back to program details]
Energy Efficiency of Storage Subsystems
Klaus-Dieter Lange, HP
The increasing concern with the energy usage of datacenters
has the potential to drastically change how the IT industry
evaluates storage subsystems. We quantify the possible energy
savings of utilizing modern storage subsystems by identifying
inherent energy characteristics of next-generation disk I/O
subsystems. We further
demonstrate the power and performance impact of a variety of
workload patterns.
[back to program details]
CPU Performance/Power Measurements at the Grid Computing
Centre Karlsruhe
Manfred Alef, Forschungszentrum Karlsruhe
One of the largest projects of high energy physics is the construction
and operation of the Large Hadron Collider (LHC) at the European
particle physics laboratory CERN. From 2007, the four detectors
at a particle accelerator with a diameter of 9 km will produce
about 10-15 petabytes of data per year. The LHC Computing Grid
project (LCG) was launched in order to make the data available
to several thousand scientists worldwide.
In 2001 the Grid Computing Centre Karlsruhe (GridKa) was founded
as the German LCG "tier 1" computing centre. It also
provides grid services to other non-LHC experiments. At present,
there are compute clusters with about 2500 CPU cores and disk
storage of some 1.5 Petabytes installed.
As one of the first tasks, GridKa has started to estimate the
electric power consumption, and the heat dissipation, of the
computing centre. In order to overcome the limitations of the
air conditioning system, GridKa was the first computing centre
worldwide to install water-cooled computer cabinets. Furthermore,
detailed investigations of the performance per Watt ratio of
recent cluster nodes have been started.
In the presentation I will give a brief explanation of the Grid
Computing Centre Karlsruhe, and describe how the performance
and power measurements are used to improve procurement decisions
and infrastructure planning.
[back to program details]
SPECpower™ - Benchmarking the Energy Efficiency
of Servers
Klaus-Dieter Lange, SPEC Power Chair
SPEC is developing the first generation SPEC benchmark for evaluating
the energy efficiency of server class computers. The drive to
create the SPECpower™ benchmark comes from the recognition
that the IT industry, computer manufacturers, and government
agencies are increasingly concerned with the energy usage of
servers. Proven SPEC server benchmark concepts are utilized in
order to provide a means to fairly and consistently report system
energy use under various usage levels. Some critical design decisions
of the benchmark suite will be covered.
[back to program details]
Future of SPECjvm (JVM2007)
Stefan Sarne, BEA
SPECjvm98 remains a valuable benchmark in many ways almost 10
years after its release, but is not generating new submissions.
The Java subcommittee decided to try to address the obsolescence
of this benchmark, at least for submission purposes, by upgrading
it. We had several goals in place as we began this
effort. This paper begins by explaining those goals and the reasons
behind them, and then proceeds to the steps taken to try to address
them. A key concern was making sure we could produce a benchmark
that could exploit all the cores on a multi-core system; SPECjvm98
essentially ran single-threaded. We also explore some of the
ways in which the benchmark development process grew, and include
a description of some of the candidate sub-benchmarks and discuss
some of the strengths and limitations of the suite.
[back to program details]
SPEC Enterprise Java Benchmarks: State of the Art and
Future Directions
Sam Kounev, Technical University of Darmstadt/University of Cambridge
Enterprise Java benchmarks such as SPECjAppServer2002 and its
successor SPECjAppServer2004 have received increasing attention
over the past several years. This talk will discuss the latest
developments in the field and will look at two new benchmarks
that are currently being developed by the Java Subcommittee. The first
one is SPECjms2007 which will be the world's first industry standard
benchmark for enterprise messaging platforms. The second one
will become the successor of SPECjAppServer2004 for measuring
the performance and scalability of Java EE platforms. The talk
will present the current state of the two efforts and discuss
some future work that has been planned.
[back to program details]