1.0 Introduction
1.1 Philosophy
1.2 Fair Use of SPECvirt_sc2010 Results
1.3 Research and Academic Usage
1.4 Caveat
1.5 Definitions
2.0 Running the SPECvirt_sc2010 Benchmark
2.1 Environment
2.1.1 Testbed Configuration
2.1.2 System Under Test (SUT)
2.1.3 Power
2.2 Workload VMs
2.3 Measurement
2.3.1 Quality of Service
2.3.2 Benchmark Parameters
2.3.3 Running SPECvirt_sc2010 Workloads
2.3.4 Power Measurement
2.3.5 Client Polling Requirements
3.0 Reporting Results
3.1 Metrics And Reference Format
3.2 Testbed Configuration
3.2.1 SUT Hardware
3.2.1.1 SUT Stable Storage
3.2.2 SUT Software
3.2.3 Network Configuration
3.2.4 Clients
3.2.5 General Availability Dates
3.2.6 Rules on the Use of Open Source Applications
3.2.7 Test Sponsor
3.2.8 Notes
4.0 Submission Requirements for SPECvirt_sc2010
4.1 SUT Configuration Collection
4.2 Guest Configuration Collection
4.3 Client Configuration Collection
4.4 Configuration Collection Archive Format
5.0 The SPECvirt_sc2010 Benchmark Kit
Appendix A. Run Rules References
SPECvirt_sc2010 is
the first generation SPEC benchmark for evaluating the performance of
datacenter servers used for virtualized server consolidation. The
benchmark also provides options for measuring and reporting power and
performance/power metrics. This document specifies how SPECvirt_sc2010 is to be
run for measuring and publicly reporting performance and power results. These
rules abide by the norms laid down by the SPEC Virtualization Subcommittee and
approved by the SPEC Open Systems Steering Committee. This ensures that results
generated with this suite are meaningful, comparable to other generated
results, and are repeatable with sufficient documentation covering factors
pertinent to duplicating the results.
Per the SPEC license
agreement, all results publicly disclosed must adhere to these Run and
Reporting Rules.
The general
philosophy behind the rules of SPECvirt_sc2010 is to ensure that an independent
party can reproduce the reported results.
The following
attributes are expected:
- Proper use of the SPEC benchmark
tools as provided.
- Availability of an appropriate
full disclosure report (FDR).
- Support for all of the appropriate
hardware and software.
Furthermore, SPEC
expects that any public use of results from this benchmark suite shall be for
System Under Test (SUT) and configurations that are appropriate for public
consumption and comparison. Thus, it is also expected that:
- Hardware and software used to run
this benchmark must provide a suitable environment for consolidating datacenter
servers in a virtual environment.
- Optimizations utilized must
improve performance for a larger class of workloads than those defined by
this benchmark suite.
- The SUT and configuration is
generally available, documented, supported, and encouraged by the
vendor(s) or provider(s).
- If power is measured, the power
analyzers and temperature sensors used while running the benchmark must be
from SPEC's list of Accepted Measurement Devices.
SPEC requires that
any public use of results from this benchmark follow the SPEC Fair Use Rule and those
specific to this benchmark (see the Fair Use section below).
In the case where it appears that these guidelines have not been adhered to,
SPEC may investigate and request that the published material be corrected.
Consistency and fairness are guiding principles for SPEC. To
help assure that these principles are met, any organization or individual who
makes public use of SPEC benchmark results must do so in accordance with the
SPEC Fair Use Rule, as posted at http://www.spec.org/fairuse.html. Fair-use
clauses specific to SPECvirt_sc2010 are covered in http://www.spec.org/fairuse.html#SPECvirt_sc2010.
Please consult the SPEC Fair Use
Rule on Research and Academic Usage at http://www.spec.org/fairuse.html#Academic.
SPEC reserves the
right to adapt the benchmark codes, workloads, and rules of SPECvirt_sc2010 as
deemed necessary to preserve the goal of fair benchmarking. SPEC will notify
members and licensees whenever it makes changes to this document and will
rename the metrics if the results are no longer comparable.
Relevant standards
are cited in these run rules as URL references, and are current as of the date
of publication. Changes or updates to these referenced documents or URLs may
necessitate repairs to the links and/or amendment of the run rules. The most
current run rules will be available at the SPEC Web site at http://www.spec.org/virt_sc2010.
SPEC will notify members and licensees whenever it makes changes to the suite.
In a virtualized
environment, the definitions of commonly-used terms can have multiple or
different meanings. To avoid ambiguity, this section attempts to define terms
that are used throughout this document:
- Server: The server represents a host system
that is capable of supporting a single operating system or
hypervisor. The server consists of one or more enclosures that
contain hardware components such as the processors, memory, network
adapters, storage adapters, and any other components within that
enclosure, as well as the mechanism that provides power for these
components. In the case of a blade result, the server includes the blade
enclosure.
- Clients: The clients are one or several
servers that are used to initiate benchmark transactions and record their
completion. In most cases, a Client simulates the work requests that would
normally come from end users, although in some cases the work requests
come from background tasks that are defined by the benchmark workloads to
be simulated instead of being included in the SUT.
- SUT: The SUT, or System Under Test, is
defined as the server and performance-critical components that execute the
defined workloads. Among these components are: storage hardware and all
other hardware necessary to connect the server and the storage subsystem.
Such connecting hardware can be: fibre switches or network switches (if
the submitter uses NAS). Client hardware used to initiate and monitor the
workflow as well as network switches not described above are not included.
- Power monitoring systems: These are the power analyzer,
temperature monitor, and system(s) running the applications that control
the collection and recording of power information for the benchmark. They
are not a part of the SUT.
- Active Idle: The state in which the SUT must
be capable of completing all workload transactions. In a virtualized
environment, all virtual machines necessary to support the peak load of
the result must be powered on and ready to respond to client requests.
- Virtual Machine (VM): A virtual machine is an abstracted
execution environment which presents an operating system (either discrete
or virtualized). The abstracted execution environment presents the
appearance of a dedicated computer with CPU, memory and I/O resources
available to the operating system. In SPECvirt, a VM consists of a
single OS and the application software stack that supports a single
SPECvirt_sc2010 component workload. There are several methods of
implementing a VM, including physical partitions, logical partitions and
software-managed virtual machines.
- Hypervisor (virtual machine
monitor): Software
(including firmware) that allows multiple VMs to be run simultaneously on
a server.
- Tile: A logical grouping of one of each
kind of VM used within SPECvirt. For SPECvirt_sc2010, a tile
consists of one Web Server, Mail Server, Application Server, Database
Server, Infrastructure Server, and Idle Server. A valid
SPECvirt_sc2010 benchmark result is achieved by correctly executing the
benchmark workloads on one or more tiles.
For further
definition or explanation of workload-specific terms, refer to the respective
documents of the original benchmarks.
These requirements
apply to all hardware and software components used in producing the benchmark
result, including the System under Test (SUT), network, and clients.
- The SUT must conform to the
appropriate networking standards, and must utilize variations of these
protocols to satisfy requests made during the benchmark.
- The value of TCP TIME_WAIT must be
at least 60 seconds (i.e. if a connection between the SUT and a
client enters TIME_WAIT it must stay in TIME_WAIT for at least 60
seconds).
- The SUT must be composed of
components that are generally available on or before the date of publication,
or that shall be generally available within 3 months of the first publication
of these results.
- No components that are included in
the base configuration of the SUT, as delivered to customers, may be
removed from the SUT.
- Any deviations from the standard
default configuration for testbed configuration components must be
documented so an independent party would be able to reproduce the
configuration and the result without further information. These deviations
must meet general availability and support requirements (see section 3.2.5). The independent party should be able to
achieve performance not lower than 95% of that originally reported.
- The connections between a
SPECvirt_sc2010 load generating machine (client) and the SUT must not use
a TCP Maximum Segment Size (MSS) greater than 1460 bytes. This must be
accomplished by platform-specific means outside the benchmark code itself.
The method used to set the TCP MSS must be disclosed. MSS is the largest
"chunk" of data that TCP will send to the other end. The
resulting IP datagram is normally 40 bytes larger: 20 bytes for the TCP
header and 20 bytes for the IP header, resulting in an MTU (Maximum
Transmission Unit) of 1500 bytes (see the sketch after this list).
- Inter-VM communications have no
restrictions for MSS.
- Ensure that client timekeeping is
accurate and does not drift since performance is measured by the
clients. For virtualized clients, this may require a guest OS that
uses tickless timekeeping. Please refer to virtualization provider's
documentation for best practices.
- Clients must reside outside of the
SUT.
- Open Source Applications that are
outside of a commercial distribution or support contract must adhere to
the Rules on the Use of Open Source Applications.
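As a non-normative illustration of the MSS arithmetic referenced in the list above, the sketch below shows how the 1460-byte limit follows from a 1500-byte Ethernet MTU less the 20-byte IPv4 and 20-byte TCP headers; it is not part of the benchmark kit.

```python
# Illustrative only: relationship between the Ethernet MTU and the TCP MSS cap
# required by the run rules (MSS <= 1460 bytes for client-SUT connections).

IP_HEADER_BYTES = 20   # IPv4 header without options
TCP_HEADER_BYTES = 20  # TCP header without options

def max_mss_for_mtu(mtu_bytes: int) -> int:
    """Largest TCP payload ("chunk") that fits in one IP datagram of size mtu_bytes."""
    return mtu_bytes - IP_HEADER_BYTES - TCP_HEADER_BYTES

if __name__ == "__main__":
    mtu = 1500  # standard Ethernet MTU assumed by the run rules
    mss = max_mss_for_mtu(mtu)
    print(f"MTU {mtu} -> MSS {mss}")  # MTU 1500 -> MSS 1460
    assert mss <= 1460, "client-SUT MSS exceeds the run-rule limit"
```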
For a run to be
valid, the following attributes must hold true:
- The SUT returns the complete and
appropriate byte streams for each request made.
- The SUT logs from the workloads are
required per the individual workload run rules:
- Web server access logs (common
format) from the webserver VMs and infraserver VMs
- Mail server logs recording IMAP
sessions (timestamp and user identifier)
- No dynamic content responses shall
be cached by the SUT. In other words, the SUT dynamic code must generate
the dynamic content for each request.
- The SUT must utilize stable
storage; additionally, stable and durable storage must be used for all VMs
as described in SUT Stable Storage.
- The submitter must keep the
required benchmark log files (see above) for the SUT from the run and make
them available upon request for the duration of the review cycle.
- If power measurements are made, the
requirements described in section 2.1.3
Power must be met.
This section outlines
some of the environmental and other electrical requirements related to power
measurement while running the benchmark. Note that power measurement is
optional, so this section only applies to results with power in the
Performance/Power categories.
To produce a compliant result for either Performance/Power of the Total
System Under Test (SPECvirt_sc2010_PPW) or Performance/Power of the Server
only (SPECvirt_sc2010_ServerPPW), the following requirements must be met in
addition to the environmental and electrical requirements described in this
section.
- The SUT must be set in an
environment with ambient temperature at 20 degrees C or higher. The
minimum temperature reading for the duration of each workload run must be
greater than or equal to 20°C.
- The usage of power analyzers and
temperature sensors must be in accordance with the SPECpower
Methodology. The temperature sensor must be placed within 50 mm of
the air inlet. If monitoring the temperature of a rack with a single
temperature sensor, the sensor must be placed near the inlet of
the lowest-placed device.
- All power analyzers and
temperature sensors used for testing must have been accepted by SPEC (http://www.spec.org/power/docs/SPECpower-Device_List.html)
prior to the testing date, and the listed Restrictions on Use must be
followed.
- For Performance/Power of the Total
System Under Test (SPECvirt_sc2010_PPW), the power input of all parts of
the SUT as described in 1.5
Definitions, SUT must be included.
- The percentage of error readings
from the power analyzer must be less than 1% for Power and 2% for Volt,
Ampere and Power Factor, measured only during the measurement interval.
- The percentage of “unknown”
uncertainty readings from the power analyzer must be less than 1%, measured
only during the measurement interval.
- The percentage of “invalid”
(uncertainty >1%) readings from the power analyzer must be less than 5%,
measured only during the measurement interval.
- The average uncertainty per
measurement period must be less than or equal to 1%.
- The minimum temperature reading
for the duration of each workload run must be greater than or equal to
20°C.
- The percentage of error readings
from the temperature sensor must be less than or equal to 2%.
Line Voltage Source
The preferred Line
Voltage source used for measurements is the main AC power as provided by local
utility companies. Power generated from other sources often has unwanted
harmonics which are incapable of being measured correctly by many power
analyzers, and thus would generate inaccurate results.
- The AC Line Voltage Source
needs to meet the following characteristics:
- Frequency: (50Hz or 60Hz) ± 1%
- Voltage: (100V, 110V, 120V, 208V,
220V, or 230V) ± 5%
The usage of an
uninterruptible power source (UPS) as the line voltage source is allowed, but
the voltage output must be a pure sine-wave. For placement of the UPS, see Power Analyzer Setup below. This usage must be
specified in the Notes section of the report.
Systems that are designed to be able to run normal operations without an
external source of power cannot be used to produce valid results. Some examples
of disallowed systems are notebook computers, hand-held computers/communication
devices and servers that are designed to frequently operate on integrated
batteries without external power.
Systems with batteries intended to preserve operations during a temporary
lapse of external power, or to maintain data integrity during an orderly
shutdown when power is lost, can be used to produce valid benchmark results.
For SUT components that have an integrated battery, the battery must be fully
charged at the end of the measurement interval, or
proof must be provided that it is charged at least to the level of charge at
the beginning of the interval.
Note that integrated
batteries that are intended to maintain such things as durable cache in a storage
controller can be assumed to remain fully charged. The above paragraph is
intended to address “system” batteries that can provide primary power for the
SUT.
DC line voltage
sources are currently not supported.
For situations in which the appropriate voltages are not provided by local
utility companies (e.g. measuring a server in the United States which is
configured for European markets, or measuring a server in a location where the
local utility line voltage does not meet the required characteristics), an AC
power source may be used, and the power source must be specified in the notes
section of the disclosure report. In such situation the following requirements
must be met, and the relevant measurements or power source specifications
disclosed in the general notes section of the disclosure report:
- Total Harmonic Distortion of
source voltage (loaded), based on IEC standards: < 5%
- The AC Power Source needs to meet
the frequency and voltage characteristics previously listed in this
section.
- The AC Power Source must not
manipulate its output in a way that would alter the power measurements
compared to a measurement made using a compliant line voltage source
without the power source.
The intent is that the AC power source does not interfere with measurements
such as power factor by trying to adjust its output power to improve the
power factor of the load.
Environmental Conditions
SPEC requires that
power measurements be taken in an environment representative of the majority of
usage environments. The intent is to discourage extreme environments that may
artificially impact power consumption or performance of the server.
The following
environmental conditions must be met:
- Ambient temperature range: 20°C or
above
- Elevation: within documented operating
specification of SUT
- Humidity: within documented
operating specification of SUT
Power Analyzer Setup
The power analyzer
must be located between the AC Line Voltage Source and the SUT. No other active
components are allowed between the AC Line Voltage Source and the SUT. If the
SUT consists of several discrete parts (server and storage), separate power
analyzers may be required.
Power analyzer
configuration settings that are set by the SPEC PTDaemon must not be manually
overridden.
Power Analyzer Specifications
To ensure
comparability and repeatability of power measurements, SPEC requires the
following attributes for the power measurement device used during the
benchmark. Please note that a power analyzer may meet these requirements when
used in some power ranges but not in others, due to the dynamic nature of power
analyzer Accuracy and Crest Factor.
- Measurements - the analyzer must report true
RMS power (watts), voltage, amperes and power factor.
- Uncertainty - Measurements must be reported
by the analyzer with an overall uncertainty of 1% or less for the ranges
measured during the benchmark run. Overall uncertainty means the sum of
all specified analyzer uncertainties for the measurements made during the
benchmark run.
- Calibration - the analyzer must be able to be
calibrated by a standard traceable to NIST
(U.S.A.) or a counterpart national metrology institute in other countries.
The analyzer must have been calibrated within the past year.
- Crest Factor – The analyzer must provide a
current crest factor of a minimum value of 3. For analyzers which do not
specify the crest factor, the analyzer must be capable of measuring an
amperage spike of at least 3 times the maximum amperage measured during
any 1-second sample of the benchmark test.
- Logging - The analyzer must have an
interface that allows its measurements to be read by the PTDaemon. The
reading rate supported by the analyzer must be at least 1 set of
measurements per second, where set is defined as watts and at least 2 of
the following readings: volts, amps and power factor. The data averaging
interval of the analyzer must be either 1 (preferred) or 2 times the
reading interval. "Data averaging interval" is defined as the
time period over which all samples captured by the high-speed sampling
electronics of the analyzer are averaged to provide the measurement set.
For example:
An analyzer with a vendor-specified uncertainty of +/- 0.5% of reading
+/- 4 digits, used in a test with a maximum wattage value of 200W, would have
an "overall" uncertainty of ((0.5% * 200W) + 0.4W) / 200W = 1.4W / 200W, or
0.7% at 200W.
An analyzer with a wattage range of 20-400W, with a vendor-specified
uncertainty of +/- 0.25% of range +/- 4 digits, used in a test with a
maximum wattage value of 200W, would have an "overall" uncertainty of
((0.25% * 400W) + 0.4W) / 200W = 1.4W / 200W, or 0.7% at 200W.
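The overall uncertainty arithmetic in the two examples above can be reproduced with the following non-normative sketch; the 0.1 W per digit resolution is an assumption carried over from the examples, and the authoritative computation is performed by the SPEC PTDaemon.

```python
# Overall uncertainty examples from the text, expressed as a calculation.
# "digits" uncertainty is converted assuming a 0.1 W display resolution,
# so +/- 4 digits corresponds to +/- 0.4 W, as in the examples above.

def overall_uncertainty(reading_w, pct_of, basis_w, digits, resolution_w=0.1):
    """Relative uncertainty at reading_w for a spec of
    +/- pct_of% of basis (reading or range) +/- digits counts."""
    absolute = (pct_of / 100.0) * basis_w + digits * resolution_w
    return absolute / reading_w

# +/- 0.5% of reading +/- 4 digits at a 200 W reading:
print(round(overall_uncertainty(200, 0.5, basis_w=200, digits=4), 4))   # 0.007 -> 0.7%

# +/- 0.25% of a 400 W range +/- 4 digits at a 200 W reading:
print(round(overall_uncertainty(200, 0.25, basis_w=400, digits=4), 4))  # 0.007 -> 0.7%
```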
Temperature Sensor Specifications
Temperature must be
measured no more than 50mm in front of (upwind of) the main airflow inlet of
the server.
To ensure comparability and repeatability of temperature measurements, SPEC
requires the following attributes for the temperature measurement device used
during the benchmark:
- Logging - The sensor must have an
interface that allows its measurements to be read by the benchmark
harness. The reading rate supported by the sensor must be at least 4
samples per minute.
- Accuracy - Measurements must be
reported by the sensor with an overall accuracy of +/- 0.5 degrees Celsius
or better for the ranges measured during the benchmark run.
Supported and Compliant Devices
See the Device List
for a list of currently supported (by the benchmark software) and compliant (in
specifications) power analyzers and temperature sensors.
A tile is a single unit of work that is
comprised of six distinct virtual machines and supports all the component
workloads. Additional tiles are used to scale the benchmark. The
last tile may be configured as a "fractional tile", which means a
"load scale factor" of less than 1.0 is applied to all of the VMs
within that tile. If used, the load scale factor must be between 0.1 and
0.9 in 0.1 increments (e.g. 0.25 would not be allowed). Each VM is required to
be a distinct entity; for example, you cannot run the application server and
the database on the same VM. The following block diagram shows the tile
architecture and the virtual machine/hypervisor/driver relationships:
Figure 1. Tile block diagram
Note that there are more
virtual machines than client drivers; this is because the Infrastructure Server
and Database Server VMs do not interact directly with the client.
Specifically, the Web Server VM must access parts of its fileset and the
backend simulator (BeSim) via inter-VM communication to the Infrastructure
Server. Similarly, the Application Server VM accesses the Database Server
VM via inter-VM communication.
The operating systems may vary between
virtual machines within a tile. All specific workload VMs (guest OS type
and application software) across all tiles must be identical, including
fractional tiles. Examples of parameters that must remain identical
include:
- Guest OS distribution, version, and patch levels
- Application software version and patch levels
- Guest OS and application software tunings
- VM resource parameters from the guest OS
perspective (i.e. # CPUs, memory, networking/storage configuration)
The intent is that workload-specific VMs
across tiles are "clones", with only the modifications necessary to
identify them as different entities (e.g. host name and network address).
Mail Server VM
As
Internet email is defined by its protocol definitions, the mail server requires
adherence to the relevant protocol standards:
RFC 2060 : Internet Message Access
Protocol - Version 4rev1 (IMAP4)
The IMAP4 protocol
implies the following:
RFC 791 : Internet
Protocol (IPv4)
RFC 792 :
Internet Control Message Protocol (ICMP)
RFC 793 :
Transmission Control Protocol (TCP)
RFC 950 :
Internet Standard Subnetting Procedure
RFC 1122 :
Requirements for Internet Hosts - Communication Layers
Internet standards
are evolving standards. Adherence to related RFCs (e.g. RFC 1191 Path MTU Discovery) is
also acceptable provided the implementation retains the characteristic of
interoperability with other implementations.
Application Server VM
The J2EE server must
provide a runtime environment that meets the requirements of the Java 2
Platform, Enterprise Edition (J2EE), Version 1.3 or later specifications during
the benchmark run.
A major new version
(i.e. 1.0, 2.0, etc.) of a J2EE server must have passed the J2EE Compatibility
Test Suite (CTS) by the product's general availability date.
A J2EE Server that
has passed the J2EE Compatibility Test Suite (CTS) satisfies the J2EE
compliance requirements for this benchmark regardless of the underlying
hardware and other software used to run the benchmark on a specific
configuration, provided the runtime configuration options result in behavior
consistent with the J2EE specification. For example, using an option that
violates J2EE argument passing semantics by enabling a pass-by-reference
optimization would not meet the J2EE compliance requirement.
Comment:
The intent of this requirement is to ensure that the J2EE server is a complete
implementation satisfying all requirements of the J2EE specification and to
prevent any advantage gained by a server that implements only an incomplete or
incompatible subset of the J2EE specification.
SPECvirt_sc2010 requires that each Application Server VM execute its own
locally installed emulator application (emulator.EAR). This differs from the
original SPECjAppServer2004 workload definition.
Database Server VM
All tables must have
the properly scaled number of rows as defined by the database population
requirements, as defined in the "Application and Database Server
Benchmark" section of the SPECvirt_sc2010 Design
Overview.
Additional database
objects or DDL modifications made to the reference schema scripts in the schema/sql
directory in the SPECjAppServer2004 Kit must be disclosed along with the specific
reason for the modifications. The base tables and indexes in the reference
scripts cannot be replaced or deleted. Views are not allowed. The data types of
fields can be modified provided they are semantically equivalent to the
standard types specified in the scripts.
Comment: Replacing CHAR with
VARCHAR would be considered semantically equivalent. Changing the size of a
field (for example: increasing the size of a char field from 8 to 10) would not
be considered semantically equivalent. Replacing CHAR with INTEGER (for
example: zip code) would not be considered semantically equivalent.
Modifications that a
customer may make for compatibility with a particular database server are
allowed. Changes may also be necessary to allow the benchmark to run without
the database becoming a bottleneck, subject to approval by SPEC. Examples of
such changes include:
- additional indexes on fields used
in query predicates,
- additional fields to support
optimistic concurrency control,
- specifying fields as 'NOT NULL',
and
- horizontally partitioning tables.
Comment: Schema scripts provided
by the vendors in the schema/<vendor> directories are for
convenience only. They do not constitute the reference or baseline scripts in
the schema/sql directory. Deviations from the scripts in the schema/sql
directory must still be disclosed in the submission file even though the
vendor-provided scripts were used directly.
In any committed
state the primary key values must be unique within each table. For example, in
the case of a horizontally partitioned table, primary key values of rows across
all partitions must be unique.
The databases must be populated using the supplied load programs, or
restored prior to the start of each benchmark run from a database copy that
was correctly populated using the supplied load programs.
Modifications to the
load programs are permitted for porting purposes. All such modifications made
must be disclosed in the Submission File.
Web Server VM
As the WWW is defined
by its interoperative protocol definitions, the Web server requires adherence
to the relevant protocol standards. It is expected that the Web server is HTTP
1.1 compliant. The benchmark environment shall be governed by the following
standards:
- RFC 2616 Hypertext Transfer
Protocol -- HTTP/1.1 (Draft Standard)
- RFC 791 Internet Protocol (IPv4)
(Standard)
- updated by RFC 1349 Type of Service
in the Internet Protocol Suite (Proposed Standard)
- RFC 792 Internet Control Message
Protocol (Standard)
- updated by RFC 950 Internet
Standard Subnetting Procedure (Standard)
- RFC 793 Transmission Control
Protocol (TCP) (Standard)
- updated by RFC 3168 The Addition
of Explicit Congestion Notification (ECN) to IP (Proposed Standard)
- RFC 950 Internet Standard
Subnetting Procedure (Standard)
- RFC 1122 Requirements for
Internet Hosts - Communication Layers (Standard)
- updated by RFC 2474 Definition of
the Differentiated Services Field (DS Field) in the IPv4 and IPv6
Headers. (Proposed Standard)
- RFC 2460 Internet Protocol,
Version 6 (IPv6) Specification (Draft Standard) Note: may be used in
place of or in conjunction with IPv4.
For further explanation of these protocols, the following might be helpful:
- RFC 1180 TCP/IP Tutorial (RFC
1180) (Informational)
- RFC 2151 A Primer on Internet and
TCP/IP Tools and Utilities (RFC 2151) (Informational)
- RFC 1321 MD5 Message Digest
Algorithm (Informational)
The current text of
all IETF RFCs may be obtained from: http://ietf.org/rfc.html
All marketed standards that a software product states as being adhered to must
have passed the relevant test suites used to ensure compliance with the
standards.
For a run to be
valid, the following attributes must hold true:
- The Web server returns the
complete and appropriate byte streams for each request made.
- The Web Server and BeSim log the
following information for each request made: address of the requester, a
date and time stamp accurate to at least 1 second, specification of the
file requested, size of the file transferred, and the final status of the
request. These requirements are satisfied by the Common Log Format.
- No dynamic content responses shall
be cached by the Web server. In other words, the Web server dynamic code
must generate the dynamic content for each request.
Infrastructure VM
The Infrastructure VM
has the same requirements as the Web Server VM in its role as a web back-end
(BeSim) for the web workload.
It also hosts the download files for the webserver using a file system protocol
for remote file sharing (for example NFS or CIFS).
Idle Server VM
For a run to be
valid, each idle server VM must have at least 512 MB of memory allocated.
The operating system of the idle server VM must be of the same type and version
as at least one other VM in the tile. The idle server VM does not need to
contain the other VM's workload-specific application software stack. The intent
of these requirements is to prohibit vendors from artificially limiting and
tuning the idle server VM in order to take advantage of its limited functionality.
The SPECvirt_sc2010
individual workload metrics represent the aggregate throughput that a server
can support while meeting quality of service (QoS) and validation
requirements. In the benchmark run, one or more tiles are run
simultaneously. The load generated is based on page requests, database
transactions, and IMAP operations as defined in the SPECvirt_sc2010
Design Overview.
The QoS requirements
are relative to the individual workloads. These include:
The load generated is based on page requests, transitions between pages, and
the static images accessed within each page.
The QoS requirements are defined in terms of two parameters, Time_Good and
Time_Tolerable. QoS requirements are page-based; Time_Good and Time_Tolerable
are defined as 3 seconds and 5 seconds, respectively. For each page, 95% of
the page requests (including all the embedded files within that page) are
expected to be returned within Time_Good and 99% of the requests within
Time_Tolerable. Very large static files (i.e. Support downloads) use
specific byte rates as their QoS requirements.
The validation requirement is such that less than 1% of requests for any
given page and less than 0.5% of all the page requests in a given test
iteration fail validation.
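The following non-normative sketch illustrates the per-page Time_Good/Time_Tolerable check described above; the response-time data are hypothetical, and the authoritative checks are performed by the benchmark harness.

```python
# Illustrative per-page QoS check against Time_Good / Time_Tolerable.
TIME_GOOD = 3.0       # seconds; 95% of page requests must complete within this
TIME_TOLERABLE = 5.0  # seconds; 99% of page requests must complete within this

def page_meets_qos(response_times):
    """response_times: list of page response times (seconds) for one page type."""
    n = len(response_times)
    within_good = sum(1 for t in response_times if t <= TIME_GOOD)
    within_tolerable = sum(1 for t in response_times if t <= TIME_TOLERABLE)
    return (within_good / n) >= 0.95 and (within_tolerable / n) >= 0.99

# Hypothetical data for one page type:
print(page_meets_qos([1.2, 0.8, 2.9, 4.1, 1.0] * 20))
```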
It is required in this benchmark that all user sessions be run at the
"high-speed Internet" speed of 100 kilobytes/sec.
In addition, the URL retrievals (or operations) performed must also meet
the following quality criteria:
- There must be at least 100 requests
for each type of page defined in the workload.
- The Weighted Percentage Difference
(WPD) between the Expected Number of Requests (ENR) and the actual
number of requests (ANR) for any given page should be within +/-
1%.
- The sum of the per page
Weighted Percentage Differences (SWPD) must not exceed +/-
1.5% .
For each IMAP operation type, 95% of all transactions
must complete within five seconds. Additionally for each IMAP operation type,
there may be no more than 1.5% failures (where a failure is defined as
transactions that return unexpected content, or time-out). The total failure
count across all operation types must be no more than 1% of the count of all
operations.
The client polls the Idle Server periodically to ensure
that the VM is running and responsive. To meet the Idle Server QoS requirement,
99.5% of all polling requests must be responded to within one second.
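The failure-rate accounting for the mail workload (no more than 1.5% failures per IMAP operation type and no more than 1% of all operations overall) can be illustrated with the following non-normative sketch; the operation names and tallies are hypothetical.

```python
# Illustrative failure-rate accounting for the mail workload QoS rules:
# per IMAP operation type, failures must be no more than 1.5% of that type;
# across all types, total failures must be no more than 1% of all operations.

def mail_failures_ok(counts):
    """counts: {op_type: (total_ops, failed_ops)} - hypothetical tallies."""
    total = sum(t for t, _ in counts.values())
    failed = sum(f for _, f in counts.values())
    per_type_ok = all(f <= 0.015 * t for t, f in counts.values())
    return per_type_ok and failed <= 0.01 * total

print(mail_failures_ok({"FETCH": (10000, 90), "STORE": (5000, 40)}))
```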
Driver Requirements for the Dealer Domain
Business Transactions are selected by the Driver based on the mix shown in the
following table. The actual mix achieved in the benchmark must be within 5% of
the targeted mix for each type of Business Transaction. For example, the browse
transactions can vary between 47.5% and 52.5% of the total mix. The Driver
checks and reports on whether the mix requirement was met.
Business Transaction Mix Requirements

| Business Transaction Type | Percent Mix |
| Purchase                  | 25%         |
| Manage                    | 25%         |
| Browse                    | 50%         |
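Because the 5% tolerance is relative to each targeted percentage (so the 50% Browse target may range from 47.5% to 52.5%), the mix check can be illustrated as follows; this is a non-normative sketch with hypothetical transaction counts.

```python
# Illustrative check that the achieved transaction mix is within 5% (relative)
# of the targeted mix for each Business Transaction type.
TARGET_MIX = {"Purchase": 0.25, "Manage": 0.25, "Browse": 0.50}

def mix_ok(counts, tolerance=0.05):
    total = sum(counts.values())
    for tx_type, target in TARGET_MIX.items():
        achieved = counts.get(tx_type, 0) / total
        if abs(achieved - target) > tolerance * target:
            return False
    return True

# Hypothetical transaction counts from a measurement interval:
print(mix_ok({"Purchase": 2510, "Manage": 2460, "Browse": 5030}))
```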
The Driver measures and records the Response Time of the different types of
Business Transactions. Only successfully completed Business Transactions in the
Measurement Interval are included. At least 90% of the Business Transactions of
each type must have a Response Time of less than the constraint specified in
the table below. The average Response Time of each Business Transaction type
must not be greater than 0.1 seconds more than the 90% Response Time. This
requirement ensures that all users will see reasonable response times. For
example, if the 90% Response Time of purchase transactions is 1 second, then
the average cannot be greater than 1.1 seconds. The Driver checks and reports
on whether the response time requirements were met.
Response Time Requirements

| Business Transaction Type | 90% RT (in seconds) |
| Purchase                  | 2                   |
| Manage                    | 2                   |
| Browse                    | 2                   |
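A non-normative sketch of the response-time check for one Business Transaction type follows; it uses a simple 90th-percentile estimate and hypothetical data, whereas the authoritative check is performed by the Driver.

```python
# Illustrative response-time check: at least 90% of each Business Transaction
# type must complete within the constraint (2 seconds here), and the average
# must not exceed the 90th-percentile response time by more than 0.1 seconds.

def response_times_ok(times, constraint=2.0):
    """times: list of response times (seconds) for one transaction type."""
    ordered = sorted(times)
    p90 = ordered[int(0.9 * len(ordered)) - 1]  # simple 90th-percentile estimate
    avg = sum(ordered) / len(ordered)
    return p90 < constraint and avg <= p90 + 0.1

print(response_times_ok([0.4, 0.6, 0.7, 0.9, 1.1, 1.2, 1.3, 1.4, 1.8, 1.9]))
```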
For each Business Transaction, the Driver selects cycle times from a
negative exponential distribution, computed from the following equation:
Tc = -ln(x) * 10
where:
Tc = Cycle Time
ln = natural log (base e)
x = random number with at least 31 bits of precision,
from a uniform distribution such that (0 < x <= 1)
The distribution is truncated at 5 times the mean. For each Business
Transaction, the Driver measures the Response Time Tr and computes the
Delay Time Td as Td = Tc - Tr. If Td > 0, the Driver
will sleep for this time before beginning the next Business Transaction. If the
chosen cycle time Tc is smaller than Tr, then the actual cycle
time (Ta) is larger than the chosen one.
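The cycle-time selection described above can be illustrated with the following non-normative sketch; truncation is modeled here by capping the drawn value at five times the mean, which is one reasonable reading of the rule.

```python
import math
import random

MEAN_CYCLE_TIME = 10.0            # seconds, per the equation Tc = -ln(x) * 10
TRUNCATION = 5 * MEAN_CYCLE_TIME  # distribution is truncated at 5x the mean

def choose_cycle_time(rng=random):
    """Draw a cycle time Tc from the truncated negative exponential distribution."""
    x = rng.random()
    while x <= 0.0:               # require 0 < x <= 1
        x = rng.random()
    return min(-math.log(x) * MEAN_CYCLE_TIME, TRUNCATION)

def delay_before_next(tc, tr):
    """Td = Tc - Tr; sleep only if positive (otherwise the actual cycle time Ta > Tc)."""
    return max(tc - tr, 0.0)

tc = choose_cycle_time()
print(tc, delay_before_next(tc, tr=1.3))  # tr is a hypothetical measured Response Time
```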
The average actual cycle time is allowed to deviate from the targeted one
by 5%. The Driver checks and reports on whether the cycle time requirements
were met.
The table below shows the range of values allowed for various quantities in
the application. The Driver will check and report on whether these requirements
were met.
Miscellaneous Dealer Requirements

| Quantity                                     | Targeted Value | Min. Allowed | Max. Allowed |
| Average Vehicles per Order                   | 26.6           | 25.27        | 27.93        |
| Vehicle Purchasing Rate (/sec)               | 6.65 * Ir      | 6.32 * Ir    | 6.98 * Ir    |
| Percent Purchases that are Large Orders      | 10             | 9.5          | 10.5         |
| Large Order Vehicle Purchasing Rate (/sec)   | 3.5 * Ir       | 3.33 * Ir    | 3.68 * Ir    |
| Average # of Vehicles per Large Order        | 140            | 133          | 147          |
| Regular Order Vehicle Purchasing Rate (/sec) | 3.15 * Ir      | 2.99 * Ir    | 3.31 * Ir    |
| Average # of Vehicles per Regular Order      | 14             | 13.3         | 14.7         |
The metric for the Dealer Domain is Dealer Transactions/sec,
computed as the total count of all Business Transactions successfully completed
during the measurement interval divided by the length of the measurement
interval in seconds.
The M_Driver measures and records the time taken for a work order to
complete. Only successfully completed work orders in the Measurement Interval
are included. At least 90% of the work orders must have a Response Time of less
than 5 seconds. The average Response Time must not be greater than 0.1 seconds
more than the 90% Response Time.
The table below shows the range of values allowed for various quantities in
the Manufacturing Application. The M_Driver will check and report on whether
the run meets these requirements.
Miscellaneous Manufacturing Requirements

| Quantity                       | Targeted Value | Min. Allowed | Max. Allowed |
| LargeOrderline Widget Rate/sec | 3.5 * Ir       | 3.15 * Ir    | 3.85 * Ir    |
| Planned Line Widget Rate/sec   | 3.15 * Ir      | 2.835 * Ir   | 3.465 * Ir   |
Workload-specific
configuration files are supplied with the harness. All configurable parameters are
listed in these files. For a run to be valid, all the parameters in the
configuration files must be left at default values, except for the ones that
are marked and listed clearly as "Configurable Workload Properties".
To configure the
initial benchmark environment from scratch, the benchmarker:
- Creates the VMs.
- Installs and configures the VM's
OS and application software including provisioning individual mail server
users.
- Creates, initializes, and backs up
(if appropriate) datasets for each workload.
To run the benchmark,
the benchmarker must:
- Restore the database server VM's
database.
- Ensure that all relevant
applications are running.
- Clear application logs.
- Run the client driver with
compliant values, which will run all workloads simultaneously at
predefined load levels.
NOTE:
This section is only applicable to results that have power measurement, which
is optional.
The measurement of
power should meet all the environmental aspects listed in Environmental Conditions. The
SPECvirt_sc2010 benchmark tools provide the ability to automatically gather
measurement data from supported power analyzers and temperature sensors and integrate
that data into the benchmark result. SPEC requires that the analyzers and
sensors used in a submission be supported by the measurement framework. The
provided tools (or a newer version provided by SPEC) must be used to run and
produce measured SPECvirt_sc2010 results.
The primary metrics,
SPECvirt_sc2010_PPW (performance with SUT power) and
SPECvirt_sc2010_ServerPPW (performance with Server only power), are
performance per watt metrics obtained by dividing the peak performance by the
average power of the SUT or Server, respectively, during the run measurement
phase. For example, if the SPECvirt_sc2010 result consisted of a maximum
of 6 tiles, the power would be calculated as the average power while serving
transactions within all 6 workload tiles.
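As a non-normative illustration of this arithmetic, the sketch below divides a performance score by the average of per-second power readings taken over the measurement interval; all values shown are hypothetical.

```python
# Illustrative arithmetic for the performance/power metrics: the performance
# score divided by the average power (watts) measured over the same
# measurement interval. Values below are hypothetical.

def perf_per_watt(performance_score, power_samples_watts):
    """power_samples_watts: per-second power readings from the measurement interval."""
    avg_power = sum(power_samples_watts) / len(power_samples_watts)
    return performance_score / avg_power

print(perf_per_watt(1820.0, [640.2, 655.8, 648.1, 652.4]))  # hypothetical SUT power
```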
2.3.5 Client Polling Requirements
During the measurement phase, the SPECvirt_sc2010 prime controller polls
each prime client process associated with each workload in each tile once every
10 seconds. The prime controller collects and records the workload polling data
which includes performance and QoS measurement data from the clients. It
is expected that in a compliant run all polling requests will be responded to
within 10 seconds (BEAT_INTERVAL). Failure to respond to polling requests may
indicate problems with the clients' ability to issue and respond to workload
requests in a timely manner or accurately record performance.
The prime controller process verifies that each polling request is
responded to by the prime client processes. The prime controller will
invalidate the test if more than one 10-second polling interval is missed
during the test's measurement phase; the test will abort, and the run
will be marked as non-compliant.
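A non-normative sketch of the invalidation rule follows; the per-interval response flags are hypothetical, and the authoritative logic resides in the prime controller.

```python
# Illustrative check of the polling rule: the run is invalid if more than one
# 10-second polling interval goes unanswered during the measurement phase.
BEAT_INTERVAL = 10  # seconds between prime-controller polls

def polling_ok(poll_results):
    """poll_results: list of booleans, one per BEAT_INTERVAL, True if the
    prime client responded within the interval (hypothetical data)."""
    missed = sum(1 for responded in poll_results if not responded)
    return missed <= 1

print(polling_ok([True] * 358 + [False]))      # one missed interval: still compliant
print(polling_ok([True] * 357 + [False] * 2))  # two missed: run is invalidated
```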
The reported performance
metric, SPECvirt_sc2010, appears in both Performance/Power and Performance
categories, and will be derived from a set of compliant results from the
workloads in the suite:
- Mail server
- Web server
- Application server
The SPECvirt_sc2010
metric is a "supermetric" that is the arithmetic mean of the
normalized submetrics for each workload. The metric will be output in the
format "SPECvirt_sc2010 <score> @ <# vms> VMs".
The optional
reported performance/watt metrics, SPECvirt_sc2010_PPW and SPECvirt_sc2010_ServerPPW,
represent the peak performance divided by the average power of the SUT and
server, respectively, during the peak run phase. These metrics will only
appear in results in the Performance/Power categories, and the result must not
be compared with results that do not have power measured. These metrics
will be output in the format "SPECvirt_sc2010_PPW <score> @ <#
vms> VMs" and "SPECvirt_sc2010_ServerPPW <score> @ <#
vms> VMs".
Please consult the SPEC Fair Use Rule on the treatment of
estimates at http://www.spec.org/fairuse.html#SPECvirt_sc2010.
The report of results for the
SPECvirt_sc2010 benchmark is generated in ASCII and HTML format by the provided
SPEC tools. These tools may not be changed without prior SPEC approval. The
tools perform error checking and will flag some error conditions as resulting
in an "invalid run". However, these automatic checks are only
there for debugging convenience, and do not relieve the benchmarker of the
responsibility to check the results and follow the run and reporting rules.
SPEC reviews and
accepts for publication on SPEC's website only a complete and compliant set of
results run and reported according to these rules. Full disclosure
reports of all test and configuration details as described in these run and
report rules must be made available. Licensees are encouraged to submit
results to SPEC for publication.
All system
configuration information required to duplicate published performance results
must be reported. Tunings not in the default configuration for software and
hardware settings must be reported. All tiles must be tuned identically.
The following SUT
hardware components must be reported:
- Vendor's name
- System model name
- System firmware version(s) (e.g.
BIOS)
- Processor model, clock rate,
number of processors (#cores, #chips, #cores/chip, on-chip threading
enabled/disabled), and size and organization of primary, secondary, and
other cache, per processor. If a level of cache is shared among processor
cores in a system, that must be stated in the "notes" section
- Main memory size and memory
configuration if this is an end-user option which may affect performance,
e.g. interleaving and access time
- Other hardware, e.g. write caches,
or other accelerators
- Number, type, model, and capacity
of disk controllers and drives
- Type of file system used
The SUT must utilize stable storage. Additionally, the SUT must use stable
and durable storage for all virtual machines (including all corresponding data
drives), such that a single drive failure does not incur data loss on the VMs.
For example: RAID-1, 5, 10, 50, 0+1 are acceptable RAID levels, but RAID-0
(striping without mirroring or parity) is not considered durable.
The hypervisor must
be able to recover the virtual machines, and the virtual machines must also be
able to recover their data sets, without loss from multiple power failures
(including cascading power failures), hypervisor and guest operating system
failures, and hardware failures of components (e.g. CPU) other than the storage
medium. At any point where the data can be cached, after any virtual server has
accepted the message and acknowledged a transaction, there must be a mechanism
to ensure any cached data survives the server failure.
- Examples of stable storage
include:
- Media commit of data; i.e. the
data has been successfully written to the disk media.
- An immediate reply disk drive
with battery-backed on-drive intermediate storage or an uninterruptible
power supply (UPS).
- Server commit of data with
battery-backed intermediate storage and recovery software.
- Cache commit with UPS.
- Examples which are not considered
stable storage:
- An immediate reply disk drive
without battery-backed on-drive intermediate storage or UPS.
- Cache commit without UPS.
- Server commit of data without
battery-backed intermediate storage and recovery software.
- Examples of durable storage
include:
- RAID 1 - Mirroring and Duplexing
- RAID 0+1 - Mirrored array whose
segments are RAID 0 arrays
- RAID 5 - Striped array with distributed
parity across all disks (requires at least 3 drives).
- RAID 10 (RAID 1+0) - Striped
array whose segments are RAID 1 arrays
- RAID 50 - Striping (RAID 0)
combined with distributed parity (RAID 5)
- Examples of non-durable storage
include:
- RAID 0 - striped disk array
without fault tolerance
- JBOD - just a bunch
of independent disks with/without spanning
If a UPS is required
by the SUT to meet the stable storage requirement, the benchmarker is not
required to perform the test with a UPS in place. The benchmarker must
state in the disclosure that a UPS is required. Supplying a model number for
an appropriate UPS is encouraged but not required.
If a battery-backed
component is used to meet the stable storage requirement, that battery
must have sufficient power to maintain the data for at least 48 hours to allow
any cached data to be committed to media and the system to be gracefully shut
down. The system or component must also be able to detect a low battery
condition and prevent the use of the caching feature of the component or
provide for a graceful system shutdown.
Hypervisors are required to safely store all completed transactions of their
virtualized workloads (including in the event of failure of the hypervisor's own storage):
- Mail server: Mail servers are required to
safely store any email they have accepted until the recipient has disposed
of it.
- Web server and Infrastructure
servers: The web
servers' log file records must be written to non-volatile storage, at
least once every 60 seconds. The web servers must log the following
information for each request made: address of the requester, a date and
time stamp accurate to at least 1 second, specification of the file
requested, size of the file transferred, and the final status of the
request. These requirements are satisfied by the Common Log Format.
- Application and Database servers: The Atomicity, Consistency, and
Isolation properties of transaction processing systems must be supported.
The following SUT
software components must be reported:
- Virtualization software
(hypervisor) and all hypervisor-level tunings
- Virtual machine details (number of
virtual processors, memory, network adapters, disks, etc.)
- Workload-specific details
(operating system, application name and version, tunings) for each workload
(NOTE: these must be identical across all tiles)
- The values of MSL (maximum segment
lifetime) and TIME-WAIT. If TIME-WAIT is not equal to 2*MSL, that must be
noted. (Reference section 4.2.2.13 of RFC 1122).
- Other clarifying information as
required to reproduce benchmark results (e.g. number of daemons, BIOS
parameters, disk configuration, non-default kernel parameters, etc.), and
logging mode, must be stated in the "notes" section.
- The method for creating the Web
server RSA public encryption key and certificate must be stated.
A brief description
of the network configuration used to achieve the benchmark results is required.
The minimum information to be supplied is:
- Number, type, and model of network
controllers
- Number and type of networks used
- Base speed of network
- Number, type, model, and
relationship of external network components to support SUT (e.g., any
external routers, hubs, switches, etc.)
- A network configuration notes
section may be used to list the following additional information:
- Relationship of clients, client
type, and networks (including routers, etc. if applicable) -- in short:
which clients are connected to which LAN segments. For example:
"client1 and client2 on one ATM-622, client3 and client4 on second
ATM-622, and clients 5, 6, and 7 each on their own 100TX segment."
The following load
generator hardware components must be reported:
- Number of physical client systems
used for all load drivers and the prime controller
- System model number(s), processor
type and clock rate, number of processors
- Main memory size
- Network Controller(s)
- Operating System and/or Hypervisor
and Version
- If clients have been virtualized
then report virtual resources (CPU, memory, and network) for each and the
mapping to the physical system.
- If physical clients are used to drive
workloads on multiple tiles, specify the mapping of clients to tiles or
workloads.
- JVM product used to run client
including vendor and version.
- Other performance-critical hardware
- Other performance-critical software
The dates of general
customer availability must be listed for the major components: hardware,
software (hypervisor, operating systems, and applications), month and year. All
the system, hardware and software features are required to be generally
available on or before the date of publication, or within 3 months of the date of
publication (except where precluded by these rules; see section 3.2.7). With multiple
components having different availability dates, the latest availability date
must be listed.
Products are
considered generally available if they are orderable by ordinary customers and
ship within a reasonable time frame. This time frame is a function of the
product size and classification, and common practice. The availability of
support and documentation for the products must coincide with the release of
the products.
Hardware products
that are still supported by their original or primary vendor may be used if
their original general availability date was within the last five years. The
five-year limit is waived for hardware used in clients.
For ease and cost of
benchmarking, storage and networking hardware external to the server such as
disks, storage enclosures, storage controllers and network switches, which were
generally available within the last five years but are no longer available from
the original vendor, may be used. If such end-of-life (and possibly
unsupported) hardware is used, then the test sponsor represents that the
performance measured is no better than 105% of the performance on hardware
available as of the date of publication. The product(s) and their end-of-life
date(s) must be noted in the disclosure. If it is later determined that the
performance using available hardware is lower than 95% of that reported, the
result shall be marked non-compliant (NC).
Software products
that are still supported by their original or primary vendor may be used if
their original general availability date was within the last three years.
In the disclosure,
the benchmarker must identify any component that is no longer orderable by
ordinary customers.
If pre-release
hardware or software is tested, then the test sponsor represents that the
performance measured is generally representative of the performance to be
expected on the same configuration of the release system. If it is later
determined that the performance using available hardware or software is
lower than 95% of that reported, the result shall be marked non-compliant (NC).
SPECvirt_sc2010 does
permit Open Source Applications outside of a commercial distribution or support
contract with some limitations. The following are the rules that govern the
admissibility of the Open Source Application in the context of a benchmark run
or implementation. Open Source Applications do not include shareware and
freeware, where the source is not part of the distribution.
- Open Source Application rules do
not apply to Open Source operating systems, which would still require a
commercial distribution and support.
- Only a "stable" release
can be used in the benchmark environment; non-"stable" releases
(alpha, beta, or release candidates) cannot be used.
Reason: An open source project is not contractually bound and volunteer
resources make predictable future release dates unlikely (i.e. may be more
likely to miss SPEC's 3 month General Availability window). A
"stable" release is one that is clearly denoted as a stable
release or a release that is available and recommended for general use. It
must be a release that is not on the development fork, not designated as
an alpha, beta, test, preliminary, pre-released, prototype,
release-candidate, or any other terms that indicate that it may not be
suitable for general use.
- The initial "stable"
release of the application must be a minimum of 12 months old.
Reason: This helps ensure that the software has real application to the
intended user base and is not a benchmark special that's put out with a
benchmark result and only available for the 1st three months to meet SPEC's
forward availability window.
- At least two additional stable
releases (major, minor, or bug fix) must have been completed, announced
and shipped beyond the initial stable release.
Reason: This helps establish a track record for the project and shows that
it is actively maintained.
- An established online support
forum must be in place and clearly active, "usable", and
"useful". It’s required that there be at least one posting
within the last 3 months. Postings from the benchmarkers or their
representatives, or members of the Virtualization Subcommittee will not be
included in the count.
Reason: Another aspect that establishes that support is available for the
software. However, benchmarkers must not cause the forum to appear
active when it otherwise would not be. A "useful" support forum
is defined as one that provides useful responses to users’ questions, such
that if a previously unreported problem is reported with sufficient
detail, it is responded to by a project developer or community member with
sufficient information that the user ends up with a solution, a
workaround, or has been notified that the issue will be addressed in a
future release, or that it is outside the scope of the project. The
archive of the problem-reporting tool must have examples of this level of
conversation. A "usable" support forum is defined as one where
the problem reporting tool was available without restriction, had a simple
user-interface, and users can access old reports.
- The project must have at least 2
identified developers contributing and maintaining the application.
Reason: To help ensure that this is a real application with real
developers and not a fly-by-night benchmark special.
- The application must use a
standard open source license such as one of those listed at http://www.opensource.org/licenses/.
- The "stable" release
used in the actual test run must have been a latest "stable"
release within the prior six months at the time the result is submitted for
review. The exact beginning of this time window has to be determined
starting from the date of the submission then going back 6 months and
keeping the day number the same. Note: Residual cases are treated as
described in http://www.spec.org/osg/policy.html#s2.3.4,
substituting the 6-month window for the 3-month availability window. Examples:
| Submission date | Beginning of time window |
| Aug 20, 2019    | Feb 20, 2019             |
| Jul 20, 2019    | Jan 20, 2019             |
| Jun 20, 2019    | Dec 20, 2018             |
Reason: Benchmarkers should keep up to date with the recent releases;
however they are not required to move to a release that would be fewer
than six months old at the time of their submission.
Please note, an Open Source Application project may support several
parallel development branches and so there may be multiple latest stable
releases that meet these rules. For example, a project may have releases
such as 10.0, 9.5.1, 8.3.12, and 8.2.29 that are all currently supported,
stable releases.
- The "stable" release
used in the actual test run must be no older than 18 months. If
there has not been a "stable" release within 18 months, then the
open source project may no longer be active and as such may no longer meet
these requirements. An exception may be made for “mature” projects
(see below).
- In rare cases, open source
projects may reach “maturity” where the software requires little or no
maintenance and there may no longer be active development. If it can
be demonstrated that the software is still in general use and recommended
either by commercial organizations or active open source projects or user
forums and the source code for the software is fewer than 20,000 lines,
then a request can be made to the subcommittee to grant this software
“mature” status. In general, it is expected that the final stable
release for the "mature" project continues to work "as
is" for the majority of users but that over time some users may need
to make portability changes. This status may be reviewed
semi-annually. The current list of projects granted
"mature" status by the subcommittee includes the FastCGI
library and Alternate PHP Cache.
Note: The Webserver workload requires the use of Smarty 2.6.26 which is
included in the release kit and is not subject to the above rules.
The reporting page
must list the date the test was performed, month and year, the organization
which performed the test and is reporting the results, and the SPEC license
number of that organization.
This section is used
to document:
- System state: single or multi-user
- System tuning parameters other
than default
- Process tuning parameters other
than default
- MTU size of the network used
- Background load, if any
- ANY approved portability changes
made to the individual benchmark source code, including the module name
and line number of each change.
- Additional information, such as
compilation options, may be listed.
- Critical customer-identifiable
firmware or option versions such as network and disk controllers
- Additional important information
required to reproduce the results that does not fit in the space allocated
above must be listed here.
- If the configuration is large and
complex, added information must be supplied either by a separate drawing
of the configuration or by a detailed written description which is
adequate to describe the system to a person who did not originally configure
it.
- Part numbers or sufficient
information that would allow the end user to order the SUT configuration
if desired.
Once you have a
compliant run and wish to submit it to SPEC for review, you will need to
provide the following:
- The reporter-generated submission
file containing ALL the information outlined in section 3 and indicating
into which categories the result should be submitted
- The output of configuration
gathering script(s) that obtain the configuration information from the
SUT, clients, and all workload virtual machines as described in sections
4.1, 4.2, and 4.3 below. The scripts may collect specific files and
the output of various commands to provide supporting documentation for the
review. The configuration gathering script(s) used must
be run automatically at the beginning or the end of the benchmark to help
ensure that there are no changes made to the testbed and the configuration
information collected represents the system as tested.
- Use of the SPECvirt_sc2010
harness features for running benchmark initialization or exit scripts
(SPECVIRT_INIT_SCRIPT, SPECVIRT_EXIT_SCRIPT) is strongly recommended for
any data collection scripts that can make use of these features.
- If an external test framework is
used to initiate the SPECvirt_sc2010 test and run any of the data
collection, the collection methodology must be described, and the start
and end timestamps for the data collection archive must show that it
was run immediately prior to or immediately following the test, based on the
first and last timestamps in the primecontroller log for the test.
For example, if the test start and end timestamps are
15:00:00 and 18:00:00 respectively and a data collection script takes 5
minutes to run, then a collection run prior to the test should start no earlier
than 14:50:00, and a collection run after the test should start no later
than 18:05:00 (see the sketch following this list).
- If some aspect of the data
collection cannot be automated, such as accessing a SAN manager to get
details on the storage configuration, this should be noted.
- Submission of the configuration
gathering script(s) is encouraged but not required.
- Log files from the run upon
request
- Additional files or the output of commands
run on the SUT, VMs, or clients to help document details relevant to
questions that may arise during the review.
Once you have the
submission ready, please email SPECvirt_sc2010 submissions to subvirt_sc2010@spec.org.
In order to publicly
disclose SPECvirt_sc2010 results, the submitter must adhere to these reporting
rules in addition to having followed the run rules described in this document.
The goal of the reporting rules is to ensure the system under test is
sufficiently documented such that someone could reproduce the test and its
results.
Compliant runs need
to be submitted to SPEC for review and must be accepted prior to public
disclosure. If public statements using SPECvirt_sc2010 are made
they must follow the SPEC Fair Use Rule (http://www.spec.org/fairuse.html).
Many other SPEC
benchmarks allow duplicate submissions for a single system sold under various
names. Each SPECvirt_sc2010 result from a power-enabled run submitted to SPEC
or made public must be for an actual run of the benchmark on the SUT named in
the result. Electrically equivalent submissions for power-enabled runs are not
allowed unless the systems are also mechanically equivalent (e.g., rebadged).
The submitter is
required to run a script that will collect available configuration details of the
SUT and all the virtual machines used for the benchmark, including:
- SUT configuration and tuning
- SUT storage configuration and
tuning
- SUT network configuration
- Hypervisor configuration and
tuning
- Virtual machine configuration (#
of virtual CPUs, memory, disk, network adapters)
- Virtual networking configuration
- Additional files or the output of
commands run on the SUT to help document details relevant to questions
that may arise during the review.
The primary reason
for this step is to ensure that there are no subtle differences that the
vendor may miss.
The submitter is required to run a script that provides the details
of each VM and its operating system and application tunings that are not captured by
the SUT configuration collection script (an illustrative collection sketch
appears at the end of this section), including:
- Guest OS, filesystem, and network
configuration and tuning (e.g. non-default registry or /etc/sysctl.conf)
- Application-specific configuration
and tuning files
- Output of commands to document
details related to the specific requirements for a workload such as the
infraserver files shared with the webserver and software versions used.
During a review of the result, the submitter may be required to provide,
upon request, additional details of the VM, operating system and
application tunings and log files that may not be captured in the above script.
These may include, but are not limited to:
- Application-specific log files
- Additional files or the output of
commands run on the VMs to help document details relevant to questions
that may arise during the review.
The primary reason for this step is to ensure that the vendor has disclosed
all non-default tunings.
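As an illustration only, and not the collection script the rules refer to, a guest collection step might copy non-default configuration files and capture command output into the per-VM directories of the Configuration Collection Archive described in the next section. The file path, command, and output locations below are assumptions chosen for a Linux guest.

    import java.io.IOException;
    import java.nio.file.*;

    public class GuestConfigCollector {

        // Copies a configuration file and captures the output of a command into
        // the given per-VM output directory. Paths and commands are illustrative.
        static void collect(Path outDir) throws IOException, InterruptedException {
            Files.createDirectories(outDir);

            // Example guest tuning file (assumption: Linux guest).
            Path sysctl = Paths.get("/etc/sysctl.conf");
            if (Files.exists(sysctl)) {
                Files.copy(sysctl, outDir.resolve("sysctl.conf"), StandardCopyOption.REPLACE_EXISTING);
            }

            // Example command output: kernel and OS identification.
            Process p = new ProcessBuilder("uname", "-a")
                    .redirectOutput(outDir.resolve("uname.txt").toFile())
                    .start();
            p.waitFor();
        }

        public static void main(String[] args) throws Exception {
            collect(Paths.get("Virtual_Configuration/Tile1/webserver/Software_Configuration"));
        }
    }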
4.3 Client Configuration Collection
The submitter is required to run a script that collects the details of
each type of uniquely configured physical or virtual client used, such that
the testbed's client configuration could be reproduced. Clients that are
clones of a specific and documented type may simply be identified; data collection
for them is encouraged but not required. The client collection script should
collect files and the output of commands to document the client configuration
and tuning details, including:
- Hardware configuration (physical
and if applicable virtual)
- Host operating system, filesystem,
and network configuration and tuning (e.g. non-default registry or
/etc/sysctl.conf)
- If virtualized client(s) are used:
hypervisor-specific configuration and tuning files
- Virtualized client Guest OS,
filesystem, and network configuration and tuning (e.g. non-default
registry or /etc/sysctl.conf)
During a review of the result, the submitter may be required to provide,
upon request, additional details of the client configuration that are not
captured by the above script, to help answer questions that arise during
the review.
The submitter must submit the Configuration Collection Archive containing
the data (files and command output) described in sections 4.1, 4.2, and 4.3 above,
using the high-level directory structure described below as the foundation
(an illustrative sketch that creates this skeleton follows the listing):
- Physical_Configuration
  - System_Under_Test
  - SUT_Storage
  - SUT_Network
  - Clients
    - Client_type<type 1-n>
      - Physical_Configuration
      - Software_Configuration
      - VM_Configuration (if applicable)
- Virtual_Configuration
  - Tile1
    - appserver
      - VM_Configuration
      - Software_Configuration
    - dbserver
      - VM_Configuration
      - Software_Configuration
    - idleserver
      - VM_Configuration
      - Software_Configuration
    - infraserver
      - VM_Configuration
      - Software_Configuration
    - mailserver
      - VM_Configuration
      - Software_Configuration
    - webserver
      - VM_Configuration
      - Software_Configuration
  - Tile<2-n>
    - Repeat the directory structure above for each tile in the test and
      populate it with data from the corresponding set of VMs.
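To illustrate the layout only (this is not part of the benchmark kit), the following sketch creates the directory skeleton above for a given number of tiles and a single client type; the output root and tile count are hypothetical parameters.

    import java.io.IOException;
    import java.nio.file.*;

    public class ArchiveSkeleton {

        static final String[] WORKLOAD_VMS =
                { "appserver", "dbserver", "idleserver", "infraserver", "mailserver", "webserver" };

        // Creates the Configuration Collection Archive directory skeleton under
        // "root" for "tiles" tiles and one client type.
        static void create(Path root, int tiles) throws IOException {
            Path physical = root.resolve("Physical_Configuration");
            for (String dir : new String[] { "System_Under_Test", "SUT_Storage", "SUT_Network" }) {
                Files.createDirectories(physical.resolve(dir));
            }
            Path clientType = physical.resolve("Clients").resolve("Client_type1");
            for (String dir : new String[] { "Physical_Configuration", "Software_Configuration", "VM_Configuration" }) {
                Files.createDirectories(clientType.resolve(dir));
            }

            Path virtual = root.resolve("Virtual_Configuration");
            for (int tile = 1; tile <= tiles; tile++) {
                for (String vm : WORKLOAD_VMS) {
                    Path vmDir = virtual.resolve("Tile" + tile).resolve(vm);
                    Files.createDirectories(vmDir.resolve("VM_Configuration"));
                    Files.createDirectories(vmDir.resolve("Software_Configuration"));
                }
            }
        }

        public static void main(String[] args) throws IOException {
            create(Paths.get("config_collection"), 2); // e.g., a two-tile result
        }
    }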
SPEC provides client
driver software, which includes tools for running the benchmark and reporting
its results. The client drivers are written in Java; precompiled class
files are included with the kit, so no build step is necessary. Recompilation
of the client driver software is not allowed, unless prior approval from SPEC
is given.
This software
implements various checks for conformance with these run and reporting rules;
therefore, the SPEC software must be used as provided. Source code
modifications are not allowed, unless prior approval from SPEC is given. Any
such substitution must be reviewed and deemed "performance-neutral"
by the OSSC.
The kit also includes
source code for the file set generators, script code for the web server, and
other necessary components.
SPECvirt_sc2010 uses
modified versions of SPECweb2005, SPECjAppServer2004, and SPECmail2008 for its
virtualized workloads. For reference, the run rules for those benchmarks are
listed below:
NOTE: Not all of these run rules
are applicable to SPECvirt, but when a compliance issue is raised, SPEC
reserves the right to refer back to these individual benchmarks' run rules as
needed for clarification.
Copyright
© 2011 Standard Performance Evaluation Corporation. All rights reserved.
Java® is a
registered trademark of Oracle Corporation.