Frequently Asked Questions about HPC2002
Last Updated: 12 Dec 2002
-
Q1. What is SPEC?
-
SPEC is an acronym for the Standard Performance Evaluation
Corporation. SPEC is a non-profit organization composed of computer
vendors, systems integrators, universities, research organizations,
publishers and consultants whose goal is to establish, maintain and
endorse a standardized set of relevant benchmarks for computer
systems. Although no one set of tests can fully characterize overall
system performance, SPEC believes that the user community will
benefit from an objective series of tests that can serve as a common
reference point.
-
Q2. What is a benchmark?
-
Webster's II Dictionary defines a benchmark as "A standard of
measurement or evaluation." A computer benchmark is typically a
computer program that performs a strictly defined set of operations
(a workload) and returns some form of result (a metric) describing
how the tested computer performed. Computer benchmark metrics usually
measure speed (how fast the workload was completed) or throughput
(how many workloads can be completed per unit of time). Running the
same computer benchmark on multiple computers allows a comparison to
be made.
-
Q3. Why use a benchmark?
-
Ideally, the best comparison test for systems would be your own
application with your own workload. Unfortunately, it is often very
difficult to get a wide base of reliable, repeatable and comparable
measurements for different systems on your own application with your
own workload. This might be due to time, money, confidentiality, or
other constraints.
-
Q4. What options are viable in this case?
-
At this point, you can consider using standardized benchmarks as a
reference point. Ideally, a standardized benchmark is portable and
may already have been run on the platforms that you are interested in.
However, before you consider the results you need to be sure that you
understand the correlation between your application/computing needs
and what the benchmark is measuring. Are the workloads similar and do
they have the same characteristics? Based on your answers to these
questions, you can begin to see how the benchmark might approximate
your reality.
Note: It is not intended that the SPEC benchmark suites be used as a
replacement for the benchmarking of actual customer applications to
determine vendor or product selection.
-
Q5: Does SPEC encourage using its benchmarks for
research?
-
Yes. Although much of the benchmarking documentation is written for a
benchmarker who intends to generate SPEC-publishable results, SPEC
has defined guidelines for research use of its benchmarks (see http://www.spec.org/hpg/academic_rules.html).
These guidelines are intended to be consistent with those for
high-quality scientific work.
-
Q6: I only want to use the SPEC benchmarks for my research. Do
I need to understand the "SPEC Philosophy"?
-
The SPEC benchmarking philosophy deals with fairness in obtaining and
reporting computer performance measurements. Many of these ideas
apply to scientific performance evaluation as well. Familiarizing
yourself with the ideas behind SPEC benchmarks could prove useful for
your research and the publication quality of your results.
-
Q7: Are any of the SPEC rules binding for me?
-
The only way you could have legally obtained access to the SPEC
benchmarks is by becoming a licensee of SPEC. (Your advisor or the
colleague who gave you the codes might be the licensee; he or she
has agreed to abide by SPEC's rules.) However, most rules apply
to the process of generating SPEC-publishable benchmark results. For
research use, SPEC has defined separate guidelines (see http://www.spec.org/hpg/academic_rules.html).
-
Q8: Are any of the benchmark run tools useful for my
research?
-
The goal of SPEC's benchmark run tools is to help the benchmarker,
enforce SPEC's run rules, and ensure the quality of the benchmark
reports. Some important aspects of the tools are:
-
A benchmark, or even an entire suite, can be built, run, and
validated, and a report generated, with a single command line.
-
The SPEC-provided makefiles are platform-independent (among the
systems whose manufacturers participate in SPEC). System-specific
make commands are separated out into a so-called config file.
-
When making and running a benchmark, the tools copy the source and
all relevant files into a completely separate "run"
directory, isolating the run from the original source and from
other runs.
All these facilities can be useful for research projects as well. It
might be worth learning about the tools.
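As an illustration of that single-command workflow, an invocation
might look like the following sketch (the tool name "runspec" and the
flag syntax here are assumptions borrowed from other SPEC suites; the
HPC2002 user documentation gives the authoritative syntax):

    $ runspec --config=mysystem.cfg specenv
      # builds the benchmark, runs it, validates the output
      # against the reference, and generates a report

One command thus drives the whole build-run-validate-report cycle
described above.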
-
Q9: I want to use the source code and input files only, I
don't want to learn about the run tools. How do I proceed?
-
SPEC HPC benchmarks are full applications; they cannot be built with
a simple "f77 *.f" command and executed with
"a.out < inputfile". Most benchmarks are a mix of Fortran
and C source files. The source files include several parallelization
options (such as OpenMP, MPI, or serial). Running a benchmark might
involve several runs of one or more executables. Each benchmark must
be validated by comparing its output against a reference file, with
certain tolerances allowed. This process is specified in a file,
Spec/object.pm, in each benchmark directory. Understanding object.pm
requires tool-specific knowledge. The easier option is to run the
benchmark once using the SPEC run tools and then examine the files
speccmds.cmd and compare.cmd in the run directory; they contain the
commands for running and validating the benchmark, respectively.
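If you take that route, a first session might look like this sketch
(the directory names are placeholders):

    $ cd <benchmark directory>/run/<run directory>
    $ cat speccmds.cmd     # the commands used to run the executables
    $ cat compare.cmd      # the commands used to validate the output

You can then reproduce those commands by hand, outside the tools.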
-
Q10: What is SPEC HPC2002?
-
SPEC HPC2002 is a software benchmark product produced by the Standard
Performance Evaluation Corp.'s High-Performance Group (SPEC/HPG).
SPEC is a non-profit organization that includes computer vendors,
systems integrators, universities, research organizations, publishers
and consultants from around the world. The benchmark is designed to
provide performance measurements that can be used to compare
compute-intensive parallel workloads on different parallel computing
systems.
-
Q11. What does SPEC HPC2002 measure?
-
SPEC HPC2002 focuses on high-performance computer platforms. The
benchmarks measure the overall performance of high-end computer
systems, including
-
the computer's processors (CPUs),
-
the interconnection system (shared or distributed memory),
-
the compilers,
-
the MPI and/or OpenMP parallel library implementation, and
-
the input/output system.
It is important to remember the contribution of all these components;
performance is more than just the processor.
SPEC HPC2002 is made up of several large applications that represent
real computing practices in their respective disciplines.
Note that SPEC HPC2002 does not overly stress other computer
components such as I/O (disk drives), external networking, operating
system or graphics. It is possible to underconfigure a system
in such a way that one or more of the components impact the
performance of HPC2002 (e.g., running a benchmark on a system
with 32 CPUs and a single disk for I/O). The HPC2002 benchmarks
are derived from real HPC applications and application practices.
Although it is not the intent of the benchmark suite to stress
certain system components, it is possible to do so on some system
configurations.
-
Q12. What is included in the SPEC HPC2002 package?
-
SPEC provides the following on the SPEC HPC2002 media:
-
Source code and datasets for the HPC2002 benchmarks
-
A tool set for compiling, running, validating and reporting on the
benchmarks
-
Pre-compiled tools for a variety of operating systems and hardware
platforms
-
Source code for the SPEC HPC2002 tools, for systems not covered by
the pre-compiled tools
-
Run and reporting rules defining how the benchmarks should be used
to produce SPEC HPC2002 results
-
Documentation
-
Q13: What applications are included with SPEC HPC2002?
-
The SPEC HPC2002 suite includes three application areas: seismic
processing (SPECseis), computational chemistry (SPECchem), and
climate modeling (SPECenv).
SPECseis contains Seismic, an application developed at Atlantic
Richfield Corp. (ARCO). Seismic performs time and depth migrations
used to locate gas and oil deposits.
SPECchem contains GAMESS (General Atomic and Molecular Electronic
Structure System), an improved version of programs that originated in
the Department of Energy's National Resource for Computations in
Chemistry. Many of the functions found in GAMESS are duplicated in
commercial packages used in the pharmaceutical and chemical
industries for drug design and bonding analysis.
SPECenv contains WRF, a weather forecasting application.
More detailed descriptions of the applications (with reference to
papers, web sites, etc.) can be found in the individual benchmark
directories in the SPEC benchmark distribution.
-
Q14: Are different-sized datasets available for the SPEC HPC2002
applications?
-
Yes. Vendors may report SPEC HPC2002 results based on either of two
predefined problem sizes: small and medium. Problem sizes depend on
the application and the type of analysis the application performs.
For SPECseis, problem sizes relate directly to the number and size of
seismic traces being processed. For SPECchem, problem sizes relate to
the complexity of the molecule under analysis. For SPECenv, the
problems represent .... The different problem sizes within SPEC
HPC2002 give application users more information about machine
performance as it relates to the type of computational work they do.
-
Q15. What does the user of the SPEC HPC2002 suite have to
provide?
-
Briefly, you need a Unix or Linux system (Windows is not yet
supported) with 2 GB of memory, up to 100 GB of disk, and a set of
compilers. The exact requirements depend on the data set that you
wish to use; please see the details in the file
system_requirements.txt in the benchmark distribution.
-
Q16. What are the basic steps in running the
benchmarks?
-
Installation and use are covered in detail in the SPEC HPC2002 User
Documentation. The basic steps are as follows:
-
Install SPEC HPC2002 from media by running the installation
scripts to set up the appropriate directory structure and
install (and build, if necessary) the SPEC HPC2002 benchmark
run tools.
-
Determine which benchmark you wish to run.
-
Read the Run and Reporting Rules to ensure that you understand the
rules for generating the corresponding metric.
-
Create a configuration file according to the rules for that metric.
In this file, you specify compiler flags and other system-dependent
information. Benchmark- and vendor-specific example configuration
files are provided as templates for creating an initial config file
(a minimal fragment is sketched after this list). After you become
comfortable with this, you can read the other documentation to see
how to use the more complex features of the SPEC HPC2002 tools.
-
Run the SPEC tools to build (compile), run and validate the
benchmarks.
-
If the above steps are successful, the tools will have generated a
report based on the run times and metric equations.
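To give a feel for step 4, here is a minimal, hypothetical
config-file fragment; the field names follow the conventions of SPEC
harness config files, but treat the specifics as assumptions and
start from the example files shipped with the suite:

    # Hypothetical fragment -- see the shipped examples for real settings
    FC        = f90     # Fortran compiler
    CC        = cc      # C compiler
    FOPTIMIZE = -O3     # Fortran optimization flags
    COPTIMIZE = -O2     # C optimization flags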
-
Q17: What if the tools cannot be run or built on a system? Can
they be run manually?
-
To generate SPEC-compliant results, the tools used must be approved
by SPEC. If several attempts at using the SPEC tools are not
successful for the operating system for which you purchased HPC2002,
you should contact SPEC for technical support. SPEC will work with
you to correct the problem and/or investigate SPEC-compliant
alternatives.
-
Q18. What metrics can be measured?
-
The HPC2002 suite can be used to measure and calculate the following
metrics:
-
SPECseis<size>2002
-
SPECchem<size>2002
-
SPECenv<size>2002
where <size> indicates the data size: S, M, L, X
All metrics are computed from the overall wall-clock execution time T
of the benchmark as 86400/T. This can be interpreted as the number
of times the benchmark could run consecutively in a day. Note,
however, that this is not a throughput measure.
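For example, a run that completes in T = 7200 seconds (two hours)
scores 86400/7200 = 12; halving the execution time would double the
score.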
A higher score means "better performance" on the given
workload. Scores for different data sets cannot be compared, since
different data sets may exercise different execution paths in the
benchmarks.
-
Q19. Which SPEC HPC2002 metric should be used to compare
performance?
-
It depends on your needs. SPEC provides the benchmarks and results as
tools for you to use. You need to determine how you use a computer or
what your performance requirements are and then choose the
appropriate SPEC benchmark.
-
Q20: How can I obtain SPEC HPC2002?
-
Information on ordering SPEC HPC2002 is available from the SPEC web
site (http://www.spec.org/order.html)
or by sending e-mail to info@spec.org.
-
Q21. Where are SPEC HPC2002 results available?
-
Results for all measurements submitted to SPEC are available at http://www.spec.org/hpg/
-
Q22: Can SPEC HPC2002 results be published outside of the SPEC
web site?
-
Yes. SPEC HPC2002 results can be published outside the SPEC web site,
provided that all the run and reporting rules have been followed and
the results have been reviewed by SPEC/HPG (for a nominal fee). The
SPEC HPC2002 license agreement binds every purchaser of the suite to
the run and reporting rules if results are quoted in public. A full
disclosure of the details of a performance measurement must be
provided to anyone who asks. See the SPEC HPC2002 run and reporting
rules for details.
SPEC strongly encourages that results be submitted to the web site,
since submission ensures a peer-review process and uniform
presentation of all results. The run and reporting rules contain an
exemption clause
for research and academic use of SPEC HPC2002. Results obtained in
this context need not comply with all the requirements for other
measurements. It is required, however, that research and academic
results be clearly distinguished from results submitted officially to
SPEC.
-
Q23. Why use SPEC HPC2002?
-
SPEC HPC2002 provides the most realistic and comprehensive benchmarks
for measuring a computer system as a whole. The suite consists of
large, realistic computational applications. Among all the SPEC
suites, HPC2002 has the most flexible run rules, allowing many code
optimizations. This reflects computing practice on high-performance
systems and allows the benchmarker to achieve the best application
performance.
Other advantages to using SPEC HPC2002:
-
Benchmark programs are developed from actual end-user applications
rather than synthetic benchmarks.
-
Multiple vendors use the suite and support it.
-
SPEC HPC2002 is portable to many platforms.
-
A wide range of results are available at http://www.spec.org/hpg/
-
The benchmarks are required to be run and reported according to a
set of rules to ensure comparability and repeatability.
-
HPC2002 allows comparison of OpenMP and MPI parallelization
paradigms.
-
Q24: What organizations were involved in developing SPEC
HPC2002?
-
SPEC HPC2002 was developed by the Standard Performance Evaluation
Corp.'s High-Performance Group (SPEC/HPG), formed in January
1994. Founding partners of SPEC/HPG include SPEC members, former
members of the Perfect Benchmarks effort, and other groups working in
the benchmarking arena. SPEC/HPG's mission has remained the same:
to maintain and endorse a suite of benchmarks that represent
real-world, high-performance computing applications. Current
sustaining members include Fujitsu Ltd, Hewlett Packard, IBM, Intel,
Silicon Graphics, and Sun Microsystems. Current associates include
Argonne National Laboratory, Centre for Scientific Computing, Duke
University, Leibniz-Rechenzentrum, NCSA - University of Illinois,
North Carolina State University, National Cheng Kung University,
National Renewable Energy Lab, National University of Singapore,
Purdue University, PC Cluster Consortium, University of Miami,
University of Minnesota, University of Pisa, University of Stuttgart,
and University of Tsukuba.
-
Q25: How were the benchmarks selected?
-
They were selected as the largest, most realistic computational
applications that SPEC is able to distribute.
-
Q26: What are SPEC/HPG's plans for adding applications to
SPEC HPC2002?
-
SPEC/HPG is examining additional applications used in other areas of
computational analysis running on high-performance computers.
Applications under consideration include computational fluid dynamics
(CFD), molecular dynamics, climate, ocean and weather codes. The SPEC
HPC suite is updated on a regular basis. Contributions are
encouraged.
-
Q27: Will SPEC/HPG replace applications in conjunction with
changes in industrial software code?
-
Yes. Applications in the SPEC HPC2002 suite will be reviewed on a
regular basis, and when newer versions are available, they will be
incorporated into the benchmark suite. If an application falls out of
use within its industrial area, a new, more relevant application will
be adopted to replace it.
-
Q28: Will SPEC HPC2002 include more applications written in C
or C++ in the future?
-
If a suitable application representing relevant computational work in
industry is written in C or C++, it will certainly be considered. In
fact, the applications in SPEC HPC2002 already contain components
written in C.
-
Q29: How do SPEC HPC2002 benchmarks address different parallel
architectures, such as clusters, vector systems, SMPs and
NUMA?
-
SPEC HPC2002 benchmarks can be executed in serial or parallel mode.
Following the agreed-upon software standards for parallel systems,
the parallel implementations are based on the MPI message-passing
programming model and on the directive-based OpenMP API (a generic
code sketch of the two models follows the list of rules below). Since
high-performance computing systems use different architectures, the
SPEC HPC2002 run rules allow some flexibility in adapting a benchmark
application to run in parallel mode. To ensure that results are
relevant to end users, SPEC/HPG requires that systems running SPEC
HPC2002 benchmarks adhere to the following rules:
-
they must provide a suitable environment for running typical C and
Fortran programs,
-
the system vendor must offer its implementation for general use, and
-
the implementation must be generally available, documented, and
supported by the vendor.
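To make the two programming models concrete, here is a generic sketch
in C of the same reduction written both ways; it is illustrative only
and is not taken from the suite:

    /* OpenMP: directives ask the compiler to parallelize a loop
       over shared memory within a single process. */
    #include <omp.h>
    double sum_openmp(const double *a, int n) {
        double s = 0.0;
        #pragma omp parallel for reduction(+:s)
        for (int i = 0; i < n; i++)
            s += a[i];
        return s;
    }

    /* MPI: each process (rank) reduces its own slice of the data,
       then the partial sums are combined with an explicit
       message-passing call. */
    #include <mpi.h>
    double sum_mpi(const double *local, int nlocal) {
        double s = 0.0, total = 0.0;
        for (int i = 0; i < nlocal; i++)
            s += local[i];
        MPI_Allreduce(&s, &total, 1, MPI_DOUBLE, MPI_SUM,
                      MPI_COMM_WORLD);
        return total;
    }

In the OpenMP version a single process spawns threads over shared
memory; in the MPI version the data is distributed across processes,
which may run on separate nodes of a cluster.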
-
Q30: Are SPEC HPC2002 results comparable for these different
parallel architectures?
-
Yes. Most consumers of high-performance systems are interested in
running a single important application, or perhaps a small set of
critical applications, on these high-priced machines. The amount of
time it takes to solve a particular computational analysis is often
critical to a high-performance systems user's business. For these
consumers, being able to compare different machines' abilities to
complete a relevant problem of a specific size for their application
is valuable information, regardless of the architectural features of
the system itself.
-
Q31: Are SPEC HPC2002 results comparable across workload size?
Can you compare serial results to parallel results?
-
Varying the problem size, but not the system or parallelization,
demonstrates how the application performs under a greater workload.
The definition of "workload" will be application-specific
and meaningful to users doing that sort of work. With SPECseis, for
example, larger trace files require more I/O, larger FFTs, and longer
running times. A seismic analyst will be able to use the benchmark
results to understand the ability of a machine to accomplish
mission-critical tasks. Different datasets might also exercise
different functionality of the codes, which must be considered when
interpreting scalability with respect to data size. Comparing serial
to parallel results yields significant information as well: It shows
the scalability of the test system for a specific benchmark code.
-
Q32: How will SPEC/HPG address the evolution of parallel
programming models?
-
As standards emerge for parallel programming models, they will be
reflected in the SPEC HPC2002 benchmarks. In response to the growing
acceptance of SMP architectures, for example, SPEC/HPG is developing
SAS (shared address space) parallel versions of its current
benchmarks.
-
Q33: Can SPEC HPC2002 benchmarks be run on a high-end
workstation?
-
Yes, they can be run on single-processor machines. The smaller
problem sizes are likely to be the most suitable for these systems.
-
Q34: Traditionally, SPEC has not allowed any code changes in
its benchmarks. Why are code changes allowed in SPEC HPC2002 and how
did SPEC/HPG decide what should be allowed?
-
SPEC/HPG recognizes that customers who will spend many thousands to
millions of dollars on a high-performance computer are willing to
invest additional money to optimize their production codes. In
addition to delivering more return on investment, code changes are
required because there are so many different high-performance
architectures; moving an application from one architecture to another
is far more involved than porting a single-CPU code from one
workstation to another.
SPEC/HPG realized that since all customers optimize their programs,
vendors should be allowed to perform the same level of optimization
as a typical customer. There are specific rules that vendors must
follow in optimizing codes. These rules were chosen to allow each
vendor to show what its systems are capable of without allowing large
application rewrites that would compromise performance comparisons.
Each vendor's code changes must be fully disclosed to the entire
SPEC/HPG membership and approved before results are published. These
changes must also be included in published reports, so customers know
what changes they would have to make to duplicate results.
-
Q35: Do SPEC HPC2002 benchmarks measure speed or
throughput?
-
Both. SPEC HPC2002 benchmarks measure the time it takes to run an
application on the system under test -- that's a test of speed.
The SPEC HPC2002 metric also normalizes the benchmark's elapsed
time to the number of seconds in a day, so it can also be read as a
rate: how many back-to-back runs could be completed in a 24-hour
period. (As noted in Q18, this is not a throughput measure in the
sense of multiple workloads running simultaneously.)
-
Q36: Does SPEC HPC2002 make SPEC CPU2000 obsolete? What does it
measure that SPEC CPU2000 does not?
-
SPEC HPC2002 results provide information that supplements SPEC
CPU2000 results. Consumers of high-performance computing systems
usually run a particular application or set of applications. It is
important for these consumers to know how applications in their area
of analysis will perform on the systems under consideration. This is
the kind of specific information that SPEC HPC2002 provides.
-
Q37: Why doesn't SPEC/HPG define a metric such as MFLOPS
or price/performance?
-
SPEC/HPG chose to focus on total application performance for large,
industrially relevant applications. In this benchmarking environment,
a simple metric such as MFLOPS is inadequate and misleading.
Customers need to understand the expected performance of systems
under consideration for purchase. Real-world performance includes all
of the set-up, computation, and post-processing work. Since the pre-
and post-processing phases of applications can be significant factors
in total system performance, SPEC/HPG chose to concentrate on total
system performance.
-
Q38. Why does HPC2002 only have a "peak" but no
baseline metric?
-
In contrast to other SPEC benchmark suites, SPEC HPC2002 includes
only one metric per code and data size. There are no "base"
results that would measure compiler-only performance. The SPEC
HPC2002 run rules allow certain hand optimizations for all metrics.
Since high-performance computer customers are willing to invest
programming time to tune the applications that run on their systems,
a baseline result has little meaning for them. Also, the
architectures employed in the high-performance computing market are
far more diverse than those found in single-CPU workstations, so the
baseline or "out-of-the-box" performance of any given
application bears little relation to the actual performance a
customer could expect to achieve on a particular architecture.
-
Q39: Why is there no reference machine for performance
comparisons?
-
Reference machines give benchmark users a framework for judging
metrics that would otherwise just be meaningless sets of numbers.
SPEC HPC2002 uses time as its reference, not the speed of a
particular machine. The metric itself tells how many successive
benchmark runs can be completed in a 24-hour period on the system
being tested.
-
Q40: Why doesn't SPEC HPC2002 provide a composite or
aggregate performance metric?
-
Providing a composite or aggregate performance metric would undermine
the purpose of SPEC HPC2002. SPEC HPC2002 is designed to inform users
about how industrial-strength applications in their fields of
analysis will perform. These users are particularly interested in how
well their applications will scale as parallelism increases. This is
why SPEC HPC2002 reporting pages provide metrics for systems with
different numbers of processors running the same application and
problem size.
-
Q41. Some of the benchmark names may sound familiar; are these
comparable to other programs?
-
Many of the SPEC benchmarks have been derived from publicly available
application programs and all have been developed to be portable to as
many current and future hardware platforms as practical. Hardware
dependencies have been minimized to avoid unfairly favoring one
hardware platform over another. For this reason, the application
programs in this distribution should not be used to assess the
probable performance of commercially available, tuned versions of the
same application. The individual benchmarks in this suite might be
similar to, but are NOT identical to, benchmarks or programs with the
same name that are available from sources other than SPEC. It is not
valid to compare SPEC HPC2002 benchmark results with anything other
than other SPEC HPC2002 benchmark results. (Note: This also means
that it is not valid to compare SPEC HPC2002 results to older SPEC
benchmarks; these benchmarks have been changed and should be
considered different and not comparable.)
-
Q42: What is the difference between SPEC HPC2002 and other
benchmarks for high-performance systems?
-
The most important distinction is that SPEC HPC2002 includes
applications used in industry and research to do real work. These
applications normally run on multiprocessing systems and require the
larger computing resources offered by high-end systems. Only minimal
modifications were made to the applications used in the SPEC HPC2002
suite. By leaving even "uninteresting" functions of the
application code intact, SPEC HPC2002 provides a realistic measure of
real-world application performance. SPEC/HPG's methodology
differs from previous benchmarking efforts, which concentrated only
on more numerically intensive algorithms.
A second distinction is that SPEC HPC2002 targets all
high-performance computer architectures. The applications in the
suite are currently in use on a wide variety of systems, including
workstations, clusters, SMPs, vector systems and MPPs. The
programming models used in the SPEC HPC2002 application codes --
message-passing, shared-memory parallel, and serial models -- can be
run on all of today's high-performance systems. Other benchmarks
tend to be biased towards either distributed-memory, shared-memory or
vector architectures.
Finally, SPEC HPC2002 provides more than just peak performance
numbers. To ensure that SPEC HPC2002 reflects performance for real
applications, only a limited number of optimizations are allowed.
This contrasts with benchmarks that allow a large number of
optimizations requiring unrealistic development efforts to reproduce
benchmark results. It also contrasts with benchmarks that restrict
optimizations altogether.
-
Q43: How does SPEC HPC2002 compare to the NAS parallel
benchmarks or to Parkbench?
-
The NPB (NAS Parallel Benchmarks) and Parkbench are kernels or
subsets of applications; they are used to compare architectural
implementations of machines. SPEC HPC2002 benchmarks are complete,
real-world applications used by numerous organizations to solve real
problems. These new benchmarks allow users to determine how well a
given system performs for the entire spectrum of factors needed to
solve real-world problems, including numerical computation, I/O,
memory access, software systems, and many others.
-
Q44. How do I contact SPEC for more information or technical
support?
-
SPEC can be contacted in several ways. For general information,
including other means of contacting SPEC, please see SPEC's web
site at: http://www.spec.org/
General questions can be emailed to: info@spec.org
HPC2002 technical support questions can be sent to: HPC2002support@spec.org
-
Q45. Now that I've read this document, what should I do
next?
-
If you have arrived here by starting at the benchmark
distribution's readme1st document, you should now verify that
your system meets the requirements described in
system_requirements.txt; then you can install the suite, following
the instructions in install_guide_unix.txt (which also applies to
Linux).
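For orientation only, a typical installation might look like the
following sketch; the script name and paths here are assumptions, and
install_guide_unix.txt is authoritative:

    $ mount /cdrom        # make the SPEC HPC2002 media available
    $ cd /cdrom
    $ ./install.sh        # hypothetical installer name; prompts for a
                          # target directory and unpacks the tools
                          # and benchmarks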