1.0 Introduction
1.1 Philosophy
1.2 Fair Use of SPECvirt_sc2010 Results
1.3 Research and Academic Usage
1.4 Caveat
1.5 Definitions
2.0 Running the SPECvirt_sc2010 Benchmark
2.1 Environment
2.1.1 Testbed Configuration
2.1.2 System Under Test (SUT)
2.1.3 Power
2.2 Workload VMs
2.3 Measurement
2.3.1 Quality of Service
2.3.2 Benchmark Parameters
2.3.3 Running SPECvirt_sc2010 Workloads
2.3.4 Power Measurement
2.3.5 Client Polling Requirements
3.0 Reporting Results
3.1 Metrics And Reference Format
3.2 Testbed Configuration
3.2.1 SUT Hardware
3.2.1.1 SUT Stable Storage
3.2.2 SUT Software
3.2.3 Network Configuration
3.2.4 Clients
3.2.5 General Availability Dates
3.2.6 Rules on the Use of Open Source Applications
3.2.7 Test Sponsor
3.2.8 Notes
4.0 Submission Requirements for SPECvirt_sc2010
4.1 SUT Configuration Collection
4.2 Guest Configuration Collection
4.3 Client Configuration Collection
4.4 Configuration Collection Archive Format
5.0 The SPECvirt_sc2010 Benchmark Kit
Appendix A. Run Rules References
SPECvirt_sc2010 is
the first generation SPEC benchmark for evaluating the performance of
datacenter servers used for virtualized server consolidation. The
benchmark also provides options for measuring and reporting power and
performance/power metrics. This document specifies how SPECvirt_sc2010 is to be
run for measuring and publicly reporting performance and power results. These
rules abide by the norms laid down by the SPEC Virtualization Subcommittee and
approved by the SPEC Open Systems Steering Committee. This ensures that results
generated with this suite are meaningful, comparable to other generated
results, and are repeatable with sufficient documentation covering factors
pertinent to duplicating the results.
Per the SPEC license
agreement, all results publicly disclosed must adhere to these Run and
Reporting Rules.
The general
philosophy behind the rules of SPECvirt_sc2010 is to ensure that an independent
party can reproduce the reported results.
The following
attributes are expected:
- Proper use of the SPEC benchmark
tools as provided.
- Availability of an appropriate
full disclosure report (FDR).
- Support for all of the appropriate
hardware and software.
Furthermore, SPEC
expects that any public use of results from this benchmark suite shall be for
System Under Test (SUT) and configurations that are appropriate for public
consumption and comparison. Thus, it is also expected that:
- Hardware and software used to run
this benchmark must provide a suitable environment for consolidating datacenter
servers in a virtual environment.
- Optimizations utilized must
improve performance for a larger class of workloads than those defined by
this benchmark suite.
- The SUT and configuration is
generally available, documented, supported, and encouraged by the
vendor(s) or provider(s).
- If power is measured, the power
analyzers and temperature sensors used while running the benchmark must be
from SPEC's list of Accepted Measurement Devices.
SPEC requires that
any public use of results from this benchmark follow the SPEC Fair Use Rule and those
specific to this benchmark (see the Fair Use section below).
In the case where it appears that these guidelines have not been adhered to,
SPEC may investigate and request that the published material be corrected.
Consistency and fairness are guiding principles for SPEC. To
help assure that these principles are met, any organization or individual who
makes public use of SPEC benchmark results must do so in accordance with the
SPEC Fair Use Rule, as posted at http://www.spec.org/fairuse.html. Fair-use
clauses specific to SPECvirt_sc2010 are covered in http://www.spec.org/fairuse.html#SPECvirt_sc2010.
Please consult the SPEC Fair Use
Rule on Research and Academic Usage at http://www.spec.org/fairuse.html#Academic.
SPEC reserves the
right to adapt the benchmark codes, workloads, and rules of SPECvirt_sc2010 as
deemed necessary to preserve the goal of fair benchmarking. SPEC will notify
members and licensees whenever it makes changes to this document and will
rename the metrics if the results are no longer comparable.
Relevant standards
are cited in these run rules as URL references, and are current as of the date
of publication. Changes or updates to these referenced documents or URLs may
necessitate repairs to the links and/or amendment of the run rules. The most
current run rules will be available at the SPEC Web site at http://www.spec.org/virt_sc2010.
SPEC will notify members and licensees whenever it makes changes to the suite.
In a virtualized
environment, the definitions of commonly-used terms can have multiple or
different meanings. To avoid ambiguity, this section attempts to define terms
that are used throughout this document:
- Server: The server represents a host system
that is capable of supporting a single operating system or
hypervisor. The server consists of one or more enclosures that
contain hardware components such as the processors, memory, network
adapters, storage adapters, and any other components within that
enclosure, as well as the mechanism that provides power for these
components. In the case of a blade result, the server includes the blade
enclosure.
- Clients: The clients are one or several
servers that are used to initiate benchmark transactions and record their
completion. In most cases, a Client simulates the work requests that would
normally come from end users, although in some cases the work requests
come from background tasks that are defined by the benchmark workloads to
be simulated instead of being included in the SUT.
- SUT: The SUT, or System Under Test, is
defined as the server and performance-critical components that execute the
defined workloads. Among these components are: storage hardware and all
other hardware necessary to connect the server and the storage subsystem.
Such connecting hardware can be: fibre switches or network switches (if
the submitter uses NAS). Client hardware used to initiate and monitor the
workflow as well as network switches not described above are not included.
- Power monitoring systems: These are the power analyzer,
temperature monitor, and system(s) running the applications that control
the collection and recording of power information for the benchmark. They
are not a part of the SUT.
- Active Idle: The state in which the SUT must
be capable of completing all workload transactions. In a virtualized
environment, all virtual machines necessary to support the peak load of
the result must be powered on and ready to respond to client requests.
- Virtual Machine (VM): A virtual machine is an abstracted
execution environment which presents an operating system (either discrete
or virtualized). The abstracted execution environment presents the
appearance of a dedicated computer with CPU, memory and I/O resources
available to the operating system. In SPECvirt, a VM consists of a
single OS and the application software stack that supports a single
SPECvirt_sc2010 component workload. There are several methods of
implementing a VM, including physical partitions, logical partitions and
software-managed virtual machines.
- Hypervisor (virtual machine
monitor): Software
(including firmware) that allows multiple VMs to be run simultaneously on
a server.
- Tile: A logical grouping of one of each
kind of VM used within SPECvirt. For SPECvirt_sc2010, a tile
consists of one Web Server, Mail Server, Application Server, Database
Server, Infrastructure Server, and Idle Server. A valid
SPECvirt_sc2010 benchmark result is achieved by correctly executing the
benchmark workloads on one or more tiles.
For further
definition or explanation of workload-specific terms, refer to the respective
documents of the original benchmarks.
These requirements
apply to all hardware and software components used in producing the benchmark
result, including the System under Test (SUT), network, and clients.
- The SUT must conform to the
appropriate networking standards, and must utilize variations of these
protocols to satisfy requests made during the benchmark.
- The value of TCP TIME_WAIT must be
at least 60 seconds (i.e. if a connection between the SUT and a
client enters TIME_WAIT it must stay in TIME_WAIT for at least 60
seconds).
- The SUT must be composed of
components that are generally available on or before the date of publication,
or that shall be generally available within 3 months of the first publication
of these results.
- No components that are included in
the base configuration of the SUT, as delivered to customers, may be
removed from the SUT.
- Any deviations from the standard
default configuration for testbed configuration components must be
documented so an independent party would be able to reproduce the
configuration and the result without further information. These deviations
must meet general availability and support requirements (see section 3.2.5). The independent party should be able to
achieve performance not lower than 95% of that originally reported.
- The connections between a
SPECvirt_sc2010 load generating machine (client) and the SUT must not use
a TCP Maximum Segment Size (MSS) greater than 1460 bytes. This must be
accomplished by platform-specific means outside the benchmark code itself.
The method used to set the TCP MSS must be disclosed. MSS is the largest
"chunk" of data that TCP will send to the other end. The
resulting IP datagram is normally 40 bytes larger: 20 bytes for the TCP
header and 20 bytes for the IP header, resulting in an MTU (Maximum
Transmission Unit) of 1500 bytes (see the sketch after this list).
- Inter-VM communications have no
restrictions for MSS.
- Ensure that client timekeeping is
accurate and does not drift since performance is measured by the
clients. For virtualized clients, this may require a guest OS that
uses tickless timekeeping. Please refer to virtualization provider's
documentation for best practices.
- Clients must reside outside of the
SUT.
- Open Source Applications that are
outside of a commercial distribution or support contract must adhere to
the Rules on the Use of Open Source Applications.
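As a non-normative illustration of the MSS arithmetic referenced in the list above, the sketch below shows how the 1460-byte limit follows from a 1500-byte Ethernet MTU less the 20-byte IPv4 and 20-byte TCP headers; it is not part of the benchmark kit.

```python
# Illustrative only: relationship between the Ethernet MTU and the TCP MSS cap
# required by the run rules (MSS <= 1460 bytes for client-SUT connections).

IP_HEADER_BYTES = 20   # IPv4 header without options
TCP_HEADER_BYTES = 20  # TCP header without options

def max_mss_for_mtu(mtu_bytes: int) -> int:
    """Largest TCP payload ("chunk") that fits in one IP datagram of size mtu_bytes."""
    return mtu_bytes - IP_HEADER_BYTES - TCP_HEADER_BYTES

if __name__ == "__main__":
    mtu = 1500  # standard Ethernet MTU assumed by the run rules
    mss = max_mss_for_mtu(mtu)
    print(f"MTU {mtu} -> MSS {mss}")  # MTU 1500 -> MSS 1460
    assert mss <= 1460, "client-SUT MSS exceeds the run-rule limit"
```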
For a run to be
valid, the following attributes must hold true:
- The SUT returns the complete and
appropriate byte streams for each request made.
- The SUT logs from the workloads are
required per the individual workload run rules:
- Web server access logs (common
format) from the webserver VMs and infraserver VMs
- Mail server logs recording IMAP
sessions (timestamp and user identifier)
- No dynamic content responses shall
be cached by the SUT. In other words, the SUT dynamic code must generate
the dynamic content for each request.
- The SUT must utilize stable
storage; additionally, stable and durable storage must be used for all VMs
as described in SUT Stable Storage.
- The submitter must keep the
required benchmark log files (see above) for the SUT from the run and make
them available upon request for the duration of the review cycle.
- If power measurements are made, the
requirements described in section 2.1.3
Power must be met.
This section outlines
some of the environmental and other electrical requirements related to power
measurement while running the benchmark. Note that power measurement is
optional, so this section only applies to results with power in the
Performance/Power categories.
To produce a compliant result for either Performance/Power of the Total
System Under Test (SPECvirt_sc2010_PPW) or Performance/Power of the Server
only (SPECvirt_sc2010_ServerPPW), the following requirements must be met in
addition to the environmental and electrical requirements described in this
section.
- The SUT must be set in an
environment with ambient temperature at 20 degrees C or higher. The
minimum temperature reading for the duration of each workload run must be
greater than or equal to 20°C.
- The usage of power analyzers and
temperature sensors must be in accordance with the SPECpower
Methodology. The temperature sensor must be placed within 50 mm of
the air inlet. If monitoring the temperature of a rack with a single
temperature sensor, the sensor must be placed near the inlet of
the lowest-placed device.
- All power analyzers and
temperature sensors used for testing must have been accepted by SPEC (http://www.spec.org/power/docs/SPECpower-Device_List.html)
prior to the testing date, and the listed Restrictions on Use must be
followed.
- For Performance/Power of the Total
System Under Test (SPECvirt_sc2010_PPW), the power input of all parts of
the SUT as described in 1.5
Definitions, SUT must be included.
- The percentage of error readings
from the power analyzer must be less than 1% for Power and 2% for Volt,
Ampere and Power Factor, measured only during the measurement interval.
- The percentage of “unknown”
uncertainty readings from the power analyzer must be less than 1%, measured
only during the measurement interval.
- The percentage of “invalid”
(uncertainty >1%) readings from the power analyzer must be less than 5%,
measured only during the measurement interval.
- The average uncertainty per
measurement period must be less than or equal to 1%.
- The minimum temperature reading
for the duration of each workload run must be greater than or equal to
20°C.
- The percentage of error readings
from the temperature sensor must be less than or equal to 2%.
Line Voltage Source
The preferred Line
Voltage source used for measurements is the main AC power as provided by local
utility companies. Power generated from other sources often has unwanted
harmonics which are incapable of being measured correctly by many power
analyzers, and thus would generate inaccurate results.
- The AC Line Voltage Source
needs to meet the following characteristics:
- Frequency: (50Hz or 60Hz) ± 1%
- Voltage: (100V, 110V, 120V, 208V,
220V, or 230V) ± 5%
The usage of an
uninterruptible power source (UPS) as the line voltage source is allowed, but
the voltage output must be a pure sine-wave. For placement of the UPS, see Power Analyzer Setup below. This usage must be
specified in the Notes section of the report.
Systems that are designed to be able to run normal operations without an
external source of power cannot be used to produce valid results. Some examples
of disallowed systems are notebook computers, hand-held computers/communication
devices and servers that are designed to frequently operate on integrated
batteries without external power.
Systems with batteries intended to preserve operations during a temporary
lapse of external power, or to maintain data integrity during an orderly
shutdown when power is lost, can be used to produce valid benchmark results.
For SUT components that have an integrated battery, the battery must be fully
charged at the end of the measurement interval, or
proof must be provided that it is charged at least to the level of charge at
the beginning of the interval.
Note that integrated
batteries that are intended to maintain such things as durable cache in a storage
controller can be assumed to remain fully charged. The above paragraph is
intended to address “system” batteries that can provide primary power for the
SUT.
DC line voltage
sources are currently not supported.
For situations in which the appropriate voltages are not provided by local
utility companies (e.g. measuring a server in the United States which is
configured for European markets, or measuring a server in a location where the
local utility line voltage does not meet the required characteristics), an AC
power source may be used, and the power source must be specified in the notes
section of the disclosure report. In such situation the following requirements
must be met, and the relevant measurements or power source specifications
disclosed in the general notes section of the disclosure report:
- Total Harmonic Distortion of
source voltage (loaded), based on IEC standards: < 5%
- The AC Power Source needs to meet
the frequency and voltage characteristics previously listed in this
section.
- The AC Power Source must not
manipulate its output in a way that would alter the power measurements
compared to a measurement made using a compliant line voltage source
without the power source.
The intent is that the AC power source does not interfere with measurements
such as power factor by trying to adjust its output power to improve the
power factor of the load.
Environmental Conditions
SPEC requires that
power measurements be taken in an environment representative of the majority of
usage environments. The intent is to discourage extreme environments that may
artificially impact power consumption or performance of the server.
The following
environmental conditions must be met:
- Ambient temperature range: 20°C or
above
- Elevation: within documented operating
specification of SUT
- Humidity: within documented
operating specification of SUT
Power Analyzer Setup
The power analyzer
must be located between the AC Line Voltage Source and the SUT. No other active
components are allowed between the AC Line Voltage Source and the SUT. If the
SUT consists of several discrete parts (server and storage), separate power
analyzers may be required.
Power analyzer
configuration settings that are set by the SPEC PTDaemon must not be manually
overridden.
Power Analyzer Specifications
To ensure
comparability and repeatability of power measurements, SPEC requires the
following attributes for the power measurement device used during the
benchmark. Please note that a power analyzer may meet these requirements when
used in some power ranges but not in others, due to the dynamic nature of power
analyzer Accuracy and Crest Factor.
- Measurements - the analyzer must report true
RMS power (watts), voltage, amperes and power factor.
- Uncertainty - Measurements must be reported
by the analyzer with an overall uncertainty of 1% or less for the ranges
measured during the benchmark run. Overall uncertainty means the sum of
all specified analyzer uncertainties for the measurements made during the
benchmark run.
- Calibration - the analyzer must be able to be
calibrated by a standard traceable to NIST
(U.S.A.) or a counterpart national metrology institute in other countries.
The analyzer must have been calibrated within the past year.
- Crest Factor – The analyzer must provide a
current crest factor of a minimum value of 3. For analyzers which do not
specify the crest factor, the analyzer must be capable of measuring an
amperage spike of at least 3 times the maximum amperage measured during
any 1-second sample of the benchmark test.
- Logging - The analyzer must have an
interface that allows its measurements to be read by the PTDaemon. The
reading rate supported by the analyzer must be at least 1 set of
measurements per second, where set is defined as watts and at least 2 of
the following readings: volts, amps and power factor. The data averaging
interval of the analyzer must be either 1 (preferred) or 2 times the
reading interval. "Data averaging interval" is defined as the
time period over which all samples captured by the high-speed sampling
electronics of the analyzer are averaged to provide the measurement set.
For example:
An analyzer with a vendor-specified uncertainty of +/- 0.5% of reading
+/- 4 digits, used in a test with a maximum wattage value of 200W, would have
an "overall" uncertainty of ((0.5% * 200W) + 0.4W) / 200W = 1.4W / 200W, or
0.7% at 200W.
An analyzer with a wattage range of 20-400W, with a vendor-specified
uncertainty of +/- 0.25% of range +/- 4 digits, used in a test with a
maximum wattage value of 200W, would have an "overall" uncertainty of
((0.25% * 400W) + 0.4W) / 200W = 1.4W / 200W, or 0.7% at 200W.
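The overall uncertainty arithmetic in the two examples above can be reproduced with the following non-normative sketch; the 0.1 W per digit resolution is an assumption carried over from the examples, and the authoritative computation is performed by the SPEC PTDaemon.

```python
# Overall uncertainty examples from the text, expressed as a calculation.
# "digits" uncertainty is converted assuming a 0.1 W display resolution,
# so +/- 4 digits corresponds to +/- 0.4 W, as in the examples above.

def overall_uncertainty(reading_w, pct_of, basis_w, digits, resolution_w=0.1):
    """Relative uncertainty at reading_w for a spec of
    +/- pct_of% of basis (reading or range) +/- digits counts."""
    absolute = (pct_of / 100.0) * basis_w + digits * resolution_w
    return absolute / reading_w

# +/- 0.5% of reading +/- 4 digits at a 200 W reading:
print(round(overall_uncertainty(200, 0.5, basis_w=200, digits=4), 4))   # 0.007 -> 0.7%

# +/- 0.25% of a 400 W range +/- 4 digits at a 200 W reading:
print(round(overall_uncertainty(200, 0.25, basis_w=400, digits=4), 4))  # 0.007 -> 0.7%
```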
Temperature Sensor Specifications
Temperature must be
measured no more than 50mm in front of (upwind of) the main airflow inlet of
the server.
To ensure comparability and repeatability of temperature measurements, SPEC
requires the following attributes for the temperature measurement device used
during the benchmark:
- Logging - The sensor must have an
interface that allows its measurements to be read by the benchmark
harness. The reading rate supported by the sensor must be at least 4
samples per minute.
- Accuracy - Measurements must be
reported by the sensor with an overall accuracy of +/- 0.5 degrees Celsius
or better for the ranges measured during the benchmark run.
Supported and Compliant Devices
See the Device List
for a list of currently supported (by the benchmark software) and compliant (in
specifications) power analyzers and temperature sensors.
A tile is a single unit of work that is
comprised of six distinct virtual machines and supports all the component
workloads. Additional tiles are used to scale the benchmark. The
last tile may be configured as a "fractional tile", which means a
"load scale factor" of less than 1.0 is applied to all of the VMs
within that tile. If used, the load scale factor must be between 0.1 and
0.9 in 0.1 increments (e.g. 0.25 would not be allowed). Each VM is required to
be a distinct entity; for example, you cannot run the application server and
the database on the same VM. The following block diagram shows the tile
architecture and the virtual machine/hypervisor/driver relationships:
Figure 1. Tile block diagram
Note that there are more
virtual machines than client drivers; this is because the Infrastructure Server
and Database Server VMs do not interact directly with the client.
Specifically, the Web Server VM must access parts of its fileset and the
backend simulator (BeSim) via inter-VM communication to the Infrastructure
Server. Similarly, the Application Server VM accesses the Database Server
VM via inter-VM communication.
The operating systems may vary between
virtual machines within a tile. All specific workload VMs (guest OS type
and application software) across all tiles must be identical, including
fractional tiles. Examples of parameters that must remain identical
include:
- Guest OS distribution, version, and patch levels
- Application software version and patch levels
- Guest OS and application software tunings
- VM resource parameters from the guest OS
perspective (i.e. # CPUs, memory, networking/storage configuration)
The intent is that workload-specific VMs
across tiles are "clones", with only the modifications necessary to
identify them as different entities (e.g. host name and network address).
Mail Server VM
As
Internet email is defined by its protocol definitions, the mail server requires
adherence to the relevant protocol standards:
RFC 2060 : Internet Message Access
Protocol - Version 4rev1 (IMAP4)
The IMAP4 protocol
implies the following:
RFC 791 : Internet
Protocol (IPv4)
RFC 792 :
Internet Control Message Protocol (ICMP)
RFC 793 :
Transmission Control Protocol (TCP)
RFC 950 :
Internet Standard Subnetting Procedure
RFC 1122 :
Requirements for Internet Hosts - Communication Layers
Internet standards
are evolving standards. Adherence to related RFCs (e.g. RFC 1191 Path MTU Discovery) is
also acceptable provided the implementation retains the characteristic of
interoperability with other implementations.
Application Server VM
The J2EE server must
provide a runtime environment that meets the requirements of the Java 2
Platform, Enterprise Edition (J2EE), Version 1.3 or later specifications during
the benchmark run.
A major new version
(i.e. 1.0, 2.0, etc.) of a J2EE server must have passed the J2EE Compatibility
Test Suite (CTS) by the product's general availability date.
A J2EE Server that
has passed the J2EE Compatibility Test Suite (CTS) satisfies the J2EE
compliance requirements for this benchmark regardless of the underlying
hardware and other software used to run the benchmark on a specific
configuration, provided the runtime configuration options result in behavior
consistent with the J2EE specification. For example, using an option that
violates J2EE argument passing semantics by enabling a pass-by-reference
optimization would not meet the J2EE compliance requirement.
Comment:
The intent of this requirement is to ensure that the J2EE server is a complete
implementation satisfying all requirements of the J2EE specification and to
prevent any advantage gained by a server that implements only an incomplete or
incompatible subset of the J2EE specification.
SPECvirt_sc2010 requires that each Application Server VM execute its own
locally installed emulator application (emulator.EAR). This differs from the
original SPECjAppServer2004 workload definition.
Database Server VM
All tables must have
the properly scaled number of rows as defined by the database population
requirements, as defined in the "Application and Database Server
Benchmark" section of the SPECvirt_sc2010 Design
Overview.
Additional database
objects or DDL modifications made to the reference schema scripts in the schema/sql
directory in the SPECjAppServer2004 Kit must be disclosed along with the specific
reason for the modifications. The base tables and indexes in the reference
scripts cannot be replaced or deleted. Views are not allowed. The data types of
fields can be modified provided they are semantically equivalent to the
standard types specified in the scripts.
Comment: Replacing CHAR with
VARCHAR would be considered semantically equivalent. Changing the size of a
field (for example: increasing the size of a char field from 8 to 10) would not
be considered semantically equivalent. Replacing CHAR with INTEGER (for
example: zip code) would not be considered semantically equivalent.
Modifications that a
customer may make for compatibility with a particular database server are
allowed. Changes may also be necessary to allow the benchmark to run without
the database becoming a bottleneck, subject to approval by SPEC. Examples of
such changes include:
- additional indexes on fields used
in query predicates,
- additional fields to support
optimistic concurrency control,
- specifying fields as 'NOT NULL',
and
- horizontally partitioning tables.
Comment: Schema scripts provided
by the vendors in the schema/<vendor> directories are for
convenience only. They do not constitute the reference or baseline scripts in
the schema/sql directory. Deviations from the scripts in the schema/sql
directory must still be disclosed in the submission file even though the
vendor-provided scripts were used directly.
In any committed
state the primary key values must be unique within each table. For example, in
the case of a horizontally partitioned table, primary key values of rows across
all partitions must be unique.
The databases must be populated using the supplied load programs, or
restored prior to the start of each benchmark run from a database copy that
was correctly populated using the supplied load programs.
Modifications to the
load programs are permitted for porting purposes. All such modifications made
must be disclosed in the Submission File.
Web Server VM
As the WWW is defined
by its interoperative protocol definitions, the Web server requires adherence
to the relevant protocol standards. It is expected that the Web server is HTTP
1.1 compliant. The benchmark environment shall be governed by the following
standards:
- RFC 2616 Hypertext Transfer
Protocol -- HTTP/1.1 (Draft Standard)
- RFC 791 Internet Protocol (IPv4)
(Standard)
- updated by RFC 1349 Type of Service
in the Internet Protocol Suite (Proposed Standard)
- RFC 792 Internet Control Message
Protocol (Standard)
- updated by RFC 950 Internet
Standard Subnetting Procedure (Standard)
- RFC 793 Transmission Control
Protocol (TCP) (Standard)
- updated by RFC 3168 The Addition
of Explicit Congestion Notification (ECN) to IP (Proposed Standard)
- RFC 950 Internet Standard
Subnetting Procedure (Standard)
- RFC 1122 Requirements for
Internet Hosts - Communication Layers (Standard)
- updated by RFC 2474 Definition of
the Differentiated Services Field (DS Field) in the IPv4 and IPv6
Headers. (Proposed Standard)
- RFC 2460 Internet Protocol,
Version 6 (IPv6) Specification (Draft Standard) Note: may be used in
place of or in conjunction with IPv4.
For further explanation of these protocols, the following might be helpful:
- RFC 1180 TCP/IP Tutorial (RFC
1180) (Informational)
- RFC 2151 A Primer on Internet and
TCP/IP Tools and Utilities (RFC 2151) (Informational)
- RFC 1321 MD5 Message Digest
Algorithm (Informational)
The current text of
all IETF RFCs may be obtained from: http://ietf.org/rfc.html
All marketed standards that a software product states as being adhered to must
have passed the relevant test suites used to ensure compliance with the
standards.
For a run to be
valid, the following attributes must hold true:
- The Web server returns the
complete and appropriate byte streams for each request made.
- The Web Server and BeSim log the
following information for each request made: address of the requester, a
date and time stamp accurate to at least 1 second, specification of the
file requested, size of the file transferred, and the final status of the
request. These requirements are satisfied by the Common Log Format.
- No dynamic content responses shall
be cached by the Web server. In other words, the Web server dynamic code
must generate the dynamic content for each request.
Infrastructure VM
The Infrastructure VM
has the same requirements as the Web Server VM in its role as a web back-end
(BeSim) for the web workload.
It also hosts the download files for the webserver using a file system protocol
for remote file sharing (for example NFS or CIFS).
Idle Server VM
For a run to be
valid, each idle server VM must have at least 512 MB of memory allocated.
The operating system of the idle server VM must be of the same type and version
as at least one other VM in the tile. The idle server VM does not need to
contain the other VM's workload-specific application software stack. The intent
of these requirements is to prohibit vendors from artificially limiting and
tuning the idle server VM in order to take advantage of its limited functionality.
The SPECvirt_sc2010
individual workload metrics represent the aggregate throughput that a server
can support while meeting quality of service (QoS) and validation
requirements. In the benchmark run, one or more tiles are run
simultaneously. The load generated is based on page requests, database
transactions, and IMAP operations as defined in the SPECvirt_sc2010
Design Overview.
The QoS requirements
are relative to the individual workloads. These include:
The load generated is based on page requests, transitions between pages, and
the static images accessed within each page.
The QoS requirements are defined in terms of two parameters, Time_Good and
Time_Tolerable. QoS requirements are page-based; Time_Good and Time_Tolerable
are defined as 3 seconds and 5 seconds, respectively. For each page, 95% of
the page requests (including all the embedded files within that page) are
expected to be returned within Time_Good and 99% of the requests within
Time_Tolerable. Very large static files (i.e. Support downloads) use
specific byte rates as their QoS requirements.
The validation requirement is such that less than 1% of requests for any
given page and less than 0.5% of all the page requests in a given test
iteration fail validation.
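The following non-normative sketch illustrates the per-page Time_Good/Time_Tolerable check described above; the response-time data are hypothetical, and the authoritative checks are performed by the benchmark harness.

```python
# Illustrative per-page QoS check against Time_Good / Time_Tolerable.
TIME_GOOD = 3.0       # seconds; 95% of page requests must complete within this
TIME_TOLERABLE = 5.0  # seconds; 99% of page requests must complete within this

def page_meets_qos(response_times):
    """response_times: list of page response times (seconds) for one page type."""
    n = len(response_times)
    within_good = sum(1 for t in response_times if t <= TIME_GOOD)
    within_tolerable = sum(1 for t in response_times if t <= TIME_TOLERABLE)
    return (within_good / n) >= 0.95 and (within_tolerable / n) >= 0.99

# Hypothetical data for one page type:
print(page_meets_qos([1.2, 0.8, 2.9, 4.1, 1.0] * 20))
```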
It is required in this benchmark that all user sessions be run at the
"high-speed Internet" speed of 100 kilobytes/sec.
In addition, the URL retrievals (or operations) performed must also meet
the following quality criteria:
- There must be at least 100 requests
for each type of page defined in the workload.
- The Weighted Percentage Difference
(WPD) between the Expected Number of Requests (ENR) and the actual
number of requests (ANR) for any given page should be within +/-
1%.
- The sum of the per page
Weighted Percentage Differences (SWPD) must not exceed +/-
1.5% .
For each IMAP operation type, 95% of all transactions
must complete within five seconds. Additionally for each IMAP operation type,
there may be no more than 1.5% failures (where a failure is defined as
transactions that return unexpected content, or time-out). The total failure
count across all operation types must be no more than 1% of the count of all
operations.
The client polls the Idle Server periodically to ensure
that the VM is running and responsive. To meet the Idle Server QoS requirement,
99.5% of all polling requests must be responded to within one second.
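The failure-rate accounting for the mail workload (no more than 1.5% failures per IMAP operation type and no more than 1% of all operations overall) can be illustrated with the following non-normative sketch; the operation names and tallies are hypothetical.

```python
# Illustrative failure-rate accounting for the mail workload QoS rules:
# per IMAP operation type, failures must be no more than 1.5% of that type;
# across all types, total failures must be no more than 1% of all operations.

def mail_failures_ok(counts):
    """counts: {op_type: (total_ops, failed_ops)} - hypothetical tallies."""
    total = sum(t for t, _ in counts.values())
    failed = sum(f for _, f in counts.values())
    per_type_ok = all(f <= 0.015 * t for t, f in counts.values())
    return per_type_ok and failed <= 0.01 * total

print(mail_failures_ok({"FETCH": (10000, 90), "STORE": (5000, 40)}))
```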
Driver Requirements for the Dealer Domain
Business Transactions are selected by the Driver based on the mix shown in the
following table. The actual mix achieved in the benchmark must be within 5% of
the targeted mix for each type of Business Transaction. For example, the browse
transactions can vary between 47.5% and 52.5% of the total mix. The Driver
checks and reports on whether the mix requirement was met.
Business Transaction Mix Requirements

| Business Transaction Type | Percent Mix |
| Purchase                  | 25%         |
| Manage                    | 25%         |
| Browse                    | 50%         |
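Because the 5% tolerance is relative to each targeted percentage (so the 50% Browse target may range from 47.5% to 52.5%), the mix check can be illustrated as follows; this is a non-normative sketch with hypothetical transaction counts.

```python
# Illustrative check that the achieved transaction mix is within 5% (relative)
# of the targeted mix for each Business Transaction type.
TARGET_MIX = {"Purchase": 0.25, "Manage": 0.25, "Browse": 0.50}

def mix_ok(counts, tolerance=0.05):
    total = sum(counts.values())
    for tx_type, target in TARGET_MIX.items():
        achieved = counts.get(tx_type, 0) / total
        if abs(achieved - target) > tolerance * target:
            return False
    return True

# Hypothetical transaction counts from a measurement interval:
print(mix_ok({"Purchase": 2510, "Manage": 2460, "Browse": 5030}))
```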
The Driver measures and records the Response Time of the different types of
Business Transactions. Only successfully completed Business Transactions in the
Measurement Interval are included. At least 90% of the Business Transactions of
each type must have a Response Time of less than the constraint specified in
the table below. The average Response Time of each Business Transaction type
must not be greater than 0.1 seconds more than the 90% Response Time. This
requirement ensures that all users will see reasonable response times. For
example, if the 90% Response Time of purchase transactions is 1 second, then
the average cannot be greater than 1.1 seconds. The Driver checks and reports
on whether the response time requirements were met.
Response Time Requirements

| Business Transaction Type | 90% RT (in seconds) |
| Purchase                  | 2                   |
| Manage                    | 2                   |
| Browse                    | 2                   |
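A non-normative sketch of the response-time check for one Business Transaction type follows; it uses a simple 90th-percentile estimate and hypothetical data, whereas the authoritative check is performed by the Driver.

```python
# Illustrative response-time check: at least 90% of each Business Transaction
# type must complete within the constraint (2 seconds here), and the average
# must not exceed the 90th-percentile response time by more than 0.1 seconds.

def response_times_ok(times, constraint=2.0):
    """times: list of response times (seconds) for one transaction type."""
    ordered = sorted(times)
    p90 = ordered[int(0.9 * len(ordered)) - 1]  # simple 90th-percentile estimate
    avg = sum(ordered) / len(ordered)
    return p90 < constraint and avg <= p90 + 0.1

print(response_times_ok([0.4, 0.6, 0.7, 0.9, 1.1, 1.2, 1.3, 1.4, 1.8, 1.9]))
```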
For each Business Transaction, the Driver selects cycle times from a
negative exponential distribution, computed from the following equation:
Tc = -ln(x) * 10
where:
Tc = Cycle Time
ln = natural log (base e)
x = random number with at least 31 bits of precision,
from a uniform distribution such that (0 < x <= 1)
The distribution is truncated at 5 times the mean. For each Business
Transaction, the Driver measures the Response Time Tr and computes the
Delay Time Td as Td = Tc - Tr. If Td > 0, the Driver
will sleep for this time before beginning the next Business Transaction. If the
chosen cycle time Tc is smaller than Tr, then the actual cycle
time (Ta) is larger than the chosen one.
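The cycle-time selection described above can be illustrated with the following non-normative sketch; truncation is modeled here by capping the drawn value at five times the mean, which is one reasonable reading of the rule.

```python
import math
import random

MEAN_CYCLE_TIME = 10.0            # seconds, per the equation Tc = -ln(x) * 10
TRUNCATION = 5 * MEAN_CYCLE_TIME  # distribution is truncated at 5x the mean

def choose_cycle_time(rng=random):
    """Draw a cycle time Tc from the truncated negative exponential distribution."""
    x = rng.random()
    while x <= 0.0:               # require 0 < x <= 1
        x = rng.random()
    return min(-math.log(x) * MEAN_CYCLE_TIME, TRUNCATION)

def delay_before_next(tc, tr):
    """Td = Tc - Tr; sleep only if positive (otherwise the actual cycle time Ta > Tc)."""
    return max(tc - tr, 0.0)

tc = choose_cycle_time()
print(tc, delay_before_next(tc, tr=1.3))  # tr is a hypothetical measured Response Time
```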
The average actual cycle time is allowed to deviate from the targeted one
by 5%. The Driver checks and reports on whether the cycle time requirements
were met.
The table below shows the range of values allowed for various quantities in
the application. The Driver will check and report on whether these requirements
were met.
Miscellaneous Dealer Requirements

| Quantity                                     | Targeted Value | Min. Allowed | Max. Allowed |
| Average Vehicles per Order                   | 26.6           | 25.27        | 27.93        |
| Vehicle Purchasing Rate (/sec)               | 6.65 * Ir      | 6.32 * Ir    | 6.98 * Ir    |
| Percent Purchases that are Large Orders      | 10             | 9.5          | 10.5         |
| Large Order Vehicle Purchasing Rate (/sec)   | 3.5 * Ir       | 3.33 * Ir    | 3.68 * Ir    |
| Average # of Vehicles per Large Order        | 140            | 133          | 147          |
| Regular Order Vehicle Purchasing Rate (/sec) | 3.15 * Ir      | 2.99 * Ir    | 3.31 * Ir    |
| Average # of Vehicles per Regular Order      | 14             | 13.3         | 14.7         |
The metric for the Dealer Domain is Dealer Transactions/sec,
computed as the total count of all Business Transactions successfully completed
during the measurement interval divided by the length of the measurement
interval in seconds.
The M_Driver measures and records the time taken for a work order to
complete. Only successfully completed work orders in the Measurement Interval
are included. At least 90% of the work orders must have a Response Time of less
than 5 seconds. The average Response Time must not be greater than 0.1 seconds
more than the 90% Response Time.
The table below shows the range of values allowed for various quantities in
the Manufacturing Application. The M_Driver will check and report on whether
the run meets these requirements.
Miscellaneous Manufacturing Requirements

| Quantity                       | Targeted Value | Min. Allowed | Max. Allowed |
| LargeOrderline Widget Rate/sec | 3.5 * Ir       | 3.15 * Ir    | 3.85 * Ir    |
| Planned Line Widget Rate/sec   | 3.15 * Ir      | 2.835 * Ir   | 3.465 * Ir   |
Workload-specific
configuration files are supplied with the harness. All configurable parameters are
listed in these files. For a run to be valid, all the parameters in the
configuration files must be left at default values, except for the ones that
are marked and listed clearly as "Configurable Workload Properties".
To configure the
initial benchmark environment from scratch, the benchmarker:
- Creates the VMs.
- Installs and configures the VM's
OS and application software including provisioning individual mail server
users.
- Creates, initializes, and backs up
(if appropriate) datasets for each workload.
To run the benchmark,
the benchmarker must:
- Restore the database server VM's
database.
- Ensure that all relevant
applications are running.
- Clear application logs.
- Run the client driver with
compliant values, which will run all workloads simultaneously at
predefined load levels.
NOTE:
This section is only applicable to results that have power measurement, which
is optional.
The measurement of
power should meet all the environmental aspects listed in Environmental Conditions. The
SPECvirt_sc2010 benchmark tools provide the ability to automatically gather
measurement data from supported power analyzers and temperature sensors and integrate
that data into the benchmark result. SPEC requires that the analyzers and
sensors used in a submission be supported by the measurement framework. The
provided tools (or a newer version provided by SPEC) must be used to run and
produce measured SPECvirt_sc2010 results.
The primary metrics,
SPECvirt_sc2010_PPW (performance with SUT power) and
SPECvirt_sc2010_ServerPPW (performance with Server only power), are
performance per watt metrics obtained by dividing the peak performance by the
average power of the SUT or Server, respectively, during the run measurement
phase. For example, if the SPECvirt_sc2010 result consisted of a maximum
of 6 tiles, the power would be calculated as the average power while serving
transactions within all 6 workload tiles.
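As a non-normative illustration of this arithmetic, the sketch below divides a performance score by the average of per-second power readings taken over the measurement interval; all values shown are hypothetical.

```python
# Illustrative arithmetic for the performance/power metrics: the performance
# score divided by the average power (watts) measured over the same
# measurement interval. Values below are hypothetical.

def perf_per_watt(performance_score, power_samples_watts):
    """power_samples_watts: per-second power readings from the measurement interval."""
    avg_power = sum(power_samples_watts) / len(power_samples_watts)
    return performance_score / avg_power

print(perf_per_watt(1820.0, [640.2, 655.8, 648.1, 652.4]))  # hypothetical SUT power
```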
2.3.5 Client Polling Requirements
During the measurement phase, the SPECvirt_sc2010 prime controller polls
each prime client process associated with each workload in each tile once every
10 seconds. The prime controller collects and records the workload polling data
which includes performance and QoS measurement data from the clients. It
is expected that in a compliant run all polling requests will be responded to
within 10 seconds (BEAT_INTERVAL). Failure to respond to polling requests may
indicate problems with the clients' ability to issue and respond to workload
requests in a timely manner or accurately record performance.
The prime controller process verifies that each polling request is
responded to by the prime client processes. The prime controller will
invalidate the test if more than one 10-second polling interval is missed
during the test's measurement phase; the test will abort, and the run
will be marked as non-compliant.
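A non-normative sketch of the invalidation rule follows; the per-interval response flags are hypothetical, and the authoritative logic resides in the prime controller.

```python
# Illustrative check of the polling rule: the run is invalid if more than one
# 10-second polling interval goes unanswered during the measurement phase.
BEAT_INTERVAL = 10  # seconds between prime-controller polls

def polling_ok(poll_results):
    """poll_results: list of booleans, one per BEAT_INTERVAL, True if the
    prime client responded within the interval (hypothetical data)."""
    missed = sum(1 for responded in poll_results if not responded)
    return missed <= 1

print(polling_ok([True] * 358 + [False]))      # one missed interval: still compliant
print(polling_ok([True] * 357 + [False] * 2))  # two missed: run is invalidated
```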
The reported performance
metric, SPECvirt_sc2010, appears in both Performance/Power and Performance
categories, and will be derived from a set of compliant results from the
workloads in the suite:
- Mail server
- Web server
- Application server
The SPECvirt_sc2010
metric is a "supermetric" that is the arithmetic mean of the
normalized submetrics for each workload. The metric will be output in the
format "SPECvirt_sc2010 <score> @ <# vms> VMs".
The optional
reported performance/watt metrics, SPECvirt_sc2010_PPW and SPECvirt_sc2010_ServerPPW,
represent the peak performance divided by the average power of the SUT and
server, respectively, during the peak run phase. These metrics will only
appear in results in the Performance/Power categories, and the result must not
be compared with results that do not have power measured. These metrics
will be output in the format "SPECvirt_sc2010_PPW <score> @ <#
vms> VMs" and "SPECvirt_sc2010_ServerPPW <score> @ <#
vms> VMs".
Please consult the SPEC Fair Use Rule on the treatment of
estimates at http://www.spec.org/fairuse.html#SPECvirt_sc2010.
The report of results for the
SPECvirt_sc2010 benchmark is generated in ASCII and HTML format by the provided
SPEC tools. These tools may not be changed without prior SPEC approval. The
tools perform error checking and will flag some error conditions as resulting
in an "invalid run". However, these automatic checks are only
there for debugging convenience, and do not relieve the benchmarker of the
responsibility to check the results and follow the run and reporting rules.
SPEC reviews and
accepts for publication on SPEC's website only a complete and compliant set of
results run and reported according to these rules. Full disclosure
reports of all test and configuration details as described in these run and
report rules must be made available. Licensees are encouraged to submit
results to SPEC for publication.
All system
configuration information required to duplicate published performance results
must be reported. Tunings not in the default configuration for software and
hardware settings must be reported. All tiles must be tuned identically.
The following SUT
hardware components must be reported:
- Vendor's name
- System model name
- System firmware version(s) (e.g.
BIOS)
- Processor model, clock rate,
number of processors (#cores, #chips, #cores/chip, on-chip threading
enabled/disabled), and size and organization of primary, secondary, and
other cache, per processor. If a level of cache is shared among processor
cores in a system, that must be stated in the "notes" section
- Main memory size and memory
configuration if this is an end-user option which may affect performance,
e.g. interleaving and access time
- Other hardware, e.g. write caches,
or other accelerators
- Number, type, model, and capacity
of disk controllers and drives
- Type of file system used
The SUT must utilize stable storage. Additionally, the SUT must use stable
and durable storage for all virtual machines (including all corresponding data
drives), such that a single drive failure does not incur data loss on the VMs.
For example: RAID-1, 5, 10, 50, 0+1 are acceptable RAID levels, but RAID-0
(striping without mirroring or parity) is not considered durable.
The hypervisor must
be able to recover the virtual machines, and the virtual machines must also be
able to recover their data sets, without loss from multiple power failures
(including cascading power failures), hypervisor and guest operating system
failures, and hardware failures of components (e.g. CPU) other than the storage
medium. At any point where the data can be cached, after any virtual server has
accepted the message and acknowledged a transaction, there must be a mechanism
to ensure any cached data survives the server failure.
- Examples of stable storage
include:
- Media commit of data; i.e. the
data has been successfully written to the disk media.
- An immediate reply disk drive
with battery-backed on-drive intermediate storage or an uninterruptible
power supply (UPS).
- Server commit of data with
battery-backed intermediate storage and recovery software.
- Cache commit with UPS.
- Examples which are not considered
stable storage:
- An immediate reply disk drive
without battery-backed on-drive intermediate storage or UPS.
- Cache commit without UPS.
- Server commit of data without
battery-backed intermediate storage and recovery software.
- Examples of durable storage
include:
- RAID 1 - Mirroring and Duplexing
- RAID 0+1 - Mirrored array whose
segments are RAID 0 arrays
- RAID 5 - Striped array with distributed
parity across all disks (requires at least 3 drives).
- RAID 10 (RAID 1+0) - Striped
array whose segments are RAID 1 arrays
- RAID 50 - Striping (RAID 0)
combined with distributed parity (RAID 5)
- Examples of non-durable storage
include:
- RAID 0 - striped disk array
without fault tolerance
- JBOD - just a bunch
of independent disks with/without spanning
If a UPS is required
by the SUT to meet the stable storage requirement, the benchmarker is not
required to perform the test with a UPS in place. The benchmarker must
state in the disclosure that a UPS is required. Supplying a model number for
an appropriate UPS is encouraged but not required.
If a battery-backed
component is used to meet the stable storage requirement, that battery
must have sufficient power to maintain the data for at least 48 hours to allow
any cached data to be committed to media and the system to be gracefully shut
down. The system or component must also be able to detect a low battery
condition and prevent the use of the caching feature of the component or
provide for a graceful system shutdown.
Hypervisors are required to safely store all completed transactions of their
virtualized workloads (including in the event of failure of the hypervisor's own storage):
- Mail server: Mail servers are required to
safely store any email they have accepted until the recipient has disposed
of it.
- Web server and Infrastructure
servers: The web
servers' log file records must be written to non-volatile storage, at
least once every 60 seconds. The web servers must log the following
information for each request made: address of the requester, a date and
time stamp accurate to at least 1 second, specification of the file
requested, size of the file transferred, and the final status of the
request. These requirements are satisfied by the Common Log Format.
- Application and Database servers: The Atomicity, Consistency, and
Isolation properties of transaction processing systems must be supported.
The following SUT
software components must be reported:
- Virtualization software
(hypervisor) and all hypervisor-level tunings
- Virtual machine details (number of
virtual processors, memory, network adapters, disks, etc.)
- Workload-specific details
(operating system, application name and version, tunings) for each workload
(NOTE: these must be identical across all tiles)
- The values of MSL (maximum segment
lifetime) and TIME-WAIT. If TIME-WAIT is not equal to 2*MSL, that must be
noted. (Reference section 4.2.2.13 of RFC 1122).
- Other clarifying information as
required to reproduce benchmark results (e.g. number of daemons, BIOS
parameters, disk configuration, non-default kernel parameters, etc.), and
logging mode, must be stated in the "notes" section.
- The method for creating the Web
server RSA public encryption key and certificate must be stated.
A brief description
of the network configuration used to achieve the benchmark results is required.
The minimum information to be supplied is:
- Number, type, and model of network
controllers
- Number and type of networks used
- Base speed of network
- Number, type, model, and
relationship of external network components to support SUT (e.g., any
external routers, hubs, switches, etc.)
- A network configuration notes
section may be used to list the following additional information:
- Relationship of clients, client
type, and networks (including routers, etc. if applicable) -- in short:
which clients are connected to which LAN segments. For example:
"client1 and client2 on one ATM-622, client3 and client4 on second
ATM-622, and clients 5, 6, and 7 each on their own 100TX segment."
The following load
generator hardware components must be reported:
- Number of physical client systems
used for all load drivers and the prime controller
- System model number(s), processor
type and clock rate, number of processors
- Main memory size
- Network Controller(s)
- Operating System and/or Hypervisor
and Version
- If clients have been virtualized
then report virtual resources (CPU, memory, and network) for each and the
mapping to the physical system.
- If physical clients are used to drive
workloads on multiple tiles, specify the mapping of clients to tiles or
workloads.
- JVM product used to run client
including vendor and version.
- Other performance-critical hardware
- Other performance-critical software
The dates of general
customer availability must be listed for the major components: hardware,
software (hypervisor, operating systems, and applications), month and year. All
the system, hardware and software features are required to be generally
available on or before the date of publication, or within 3 months of the date of
publication (except where precluded by these rules; see section 3.2.7). With multiple
components having different availability dates, the latest availability date
must be listed.
Products are
considered generally available if they are orderable by ordinary customers and
ship within a reasonable time frame. This time frame is a function of the
product size and classification, and common practice. The availability of
support and documentation for the products must coincide with the release of
the products.
Hardware products
that are still supported by their original or primary vendor may be used if
their original general availability date was within the last five years. The
five-year limit is waived for hardware used in clients.
For ease and cost of
benchmarking, storage and networking hardware external to the server such as
disks, storage enclosures, storage controllers and network switches, which were
generally available within the last five years but are no longer available from
the original vendor, may be used. If such end-of-life (and possibly
unsupported) hardware is used, then the test sponsor represents that the
performance measured is no better than 105% of the performance on hardware
available as of the date of publication. The product(s) and their end-of-life
date(s) must be noted in the disclosure. If it is later determined that the
performance using available hardware is lower than 95% of that reported, the
result shall be marked non-compliant (NC).
Software products
that are still supported by their original or primary vendor may be used if
their original general availability date was within the last three years.
In the disclosure,
the benchmarker must identify any component that is no longer orderable by
ordinary customers.
If pre-release
hardware or software is tested, then the test sponsor represents that the
performance measured is generally representative of the performance to be
expected on the same configuration of the release system. If it is later
determined that the performance using available hardware or software is
lower than 95% of that reported, the result shall be marked non-compliant (NC).
SPECvirt_sc2010 does
permit Open Source Applications outside of a commercial distribution or support
contract with some limitations. The following are the rules that govern the
admissibility of the Open Source Application in the context of a benchmark run
or implementation. Open Source Applications do not include shareware and
freeware, where the source is not part of the distribution.
- Open Source Application rules do
not apply to Open Source operating systems, which would still require a
commercial distribution and support.
- Only a "stable" release
can be used in the benchmark environment; non-"stable" releases
(alpha, beta, or release candidates) cannot be used.
Reason: An open source project is not contractually bound and volunteer
resources make predictable future release dates unlikely (i.e. may be more
likely to miss SPEC's 3 month General Availability window). A
"stable" release is one that is clearly denoted as a stable
release or a release that is available and recommended for general use. It
must be a release that is not on the development fork, not designated as
an alpha, beta, test, preliminary, pre-released, prototype,
release-candidate, or any other terms that indicate that it may not be
suitable for general use.
- The initial "stable"
release of the application must be a minimum of 12 months old.
Reason: This helps ensure that the software has real application to the
intended user base and is not a benchmark special that's put out with a
benchmark result and only available for the 1st three months to meet SPEC's
forward availability window.
- At least two additional stable
releases (major, minor, or bug fix) must have been completed, announced
and shipped beyond the initial stable release.
Reason: This helps establish a track record for the project and shows that
it is actively maintained.
- An established online support
forum must be in place and clearly active, "usable", and
"useful". It’s required that there be at least one posting
within the last 3 months. Postings from the benchmarkers or their
representatives, or members of the Virtualization Subcommittee will not be
included in the count.
Reason: Another aspect that establishes that support is available for the
software. However, benchmarkers must not cause the forum to appear
active when it otherwise would not be. A "useful" support forum
is defined as one that provides useful responses to users’ questions, such
that if a previously unreported problem is reported with sufficient
detail, it is responded to by a project developer or community member with
sufficient information that the user ends up with a solution, a
workaround, or has been notified that the issue will be addressed in a
future release, or that it is outside the scope of the project. The
archive of the problem-reporting tool must have examples of this level of
conversation. A "usable" support forum is defined as one where
the problem reporting tool was available without restriction, had a simple
user-interface, and users can access old reports.
- The project must have at least 2
identified developers contributing and maintaining the application.
Reason: To help ensure that this is a real application with real
developers and not a fly-by-night benchmark special.
- The application must use a
standard open source license such as one of those listed at http://www.opensource.org/licenses/.
- The "stable" release
used in the actual test run must have been a latest "stable"
release within the prior six months at the time the result is submitted for
review. The exact beginning of this time window has to be determined
starting from the date of the submission then going back 6 months and
keeping the day number the same. Note: Residual cases are treated as
described in http://www.spec.org/osg/policy.html#s2.3.4,
substituting the 6-month window for the 3-month availability window. Examples:
| Submission date | Beginning of time window |
| Aug 20, 2019    | Feb 20, 2019             |
| Jul 20, 2019    | Jan 20, 2019             |
| Jun 20, 2019    | Dec 20, 2018             |
Reason: Benchmarkers should keep up to date with the recent releases;
however they are not required to move to a release that would be fewer
than six months old at the time of their submission.
Please note, an Open Source Application project may support several
parallel development branches and so there may be multiple latest stable
releases that meet these rules. For example, a project may have releases
such as 10.0, 9.5.1, 8.3.12, and 8.2.29 that are all currently supported,
stable releases.
- The "stable" release
used in the actual test run must be no older than 18 months. If
there has not been a "stable" release within 18 months, then the
open source project may no longer be active and as such may no longer meet
these requirements. An exception may be made for “mature” projects
(see below).
- In rare cases, open source
projects may reach “maturity” where the software requires little or no
maintenance and there may no longer be active development. If it can
be demonstrated that the software is still in general use and recommended
either by commercial organizations or active open source projects or user
forums and the source code for the software is fewer than 20,000 lines,
then a request can be made to the subcommittee to grant this software
“mature” status. In general, it is expected that the final stable
release for the "mature" project continues to work "as
is" for the majority of users but that over time some users may need
to make portability changes. This status may be reviewed
semi-annually. The current list of projects granted
"mature" status by the subcommittee includes the FastCGI
library and Alternate PHP Cache.
Note: The Webserver workload requires the use of Smarty 2.6.26 which is
included in the release kit and is not subject to the above rules.
The reporting page
must list the date the test was performed, month and year, the organization
which performed the test and is reporting the results, and the SPEC license
number of that organization.
This section is used
to document:
- System state: single or multi-user
- System tuning parameters other
than default
- Process tuning parameters other
than default
- MTU size of the network used
- Background load, if any
- ANY approved portability changes
made to the individual benchmark source code, including the module name
and line number of each change.
- Additional information, such as
compilation options, may be listed.
- Critical customer-identifiable
firmware or option versions such as network and disk controllers
- Additional important information
required to reproduce the results that does not fit in the space allocated
above must be listed here.
- If the configuration is large and
complex, added information must be supplied either by a separate drawing
of the configuration or by a detailed written description which is
adequate to describe the system to a person who did not originally configure
it.
- Part numbers or sufficient
information that would allow the end user to order the SUT configuration
if desired.
Once you have a
compliant run and wish to submit it to SPEC for review, you will need to
provide the following:
- The reporter-generated submission
file containing ALL the information outlined in section 3 and indicating
into which categories the result should be submitted
- The output of configuration
gathering script(s) that obtain the configuration information from the
SUT, clients, and all workload virtual machines as described in sections
4.1, 4.2, and 4.3 below. The scripts may collect specific files and
the output of various commands to provide supporting documentation for the
review. The configuration gathering script(s) used must
be run automatically at the beginning or the end of the benchmark to help
ensure that there are no changes made to the testbed and the configuration
information collected represents the system as tested.
- Use of the SPECvirt_sc2010
harness features for running benchmark initialization or exit scripts
(SPECVIRT_INIT_SCRIPT, SPECVIRT_EXIT_SCRIPT) is strongly recommended for
any data collection scripts that can make use of these features.
- If an external test framework is
used to initiate the SPECvirt_sc2010 test and run any of the data
collection, the collection methodology must be described, and the start
and end timestamps for the data collection archive must show that it
was run immediately prior to or immediately following the test, based on the
first and last timestamps in the primecontroller log for the test.
For example, if the test start and end timestamps are
15:00:00 and 18:00:00 respectively and a data collection script takes 5
minutes to run, then a collection run prior to the test should start no earlier
than 14:50:00, and a collection run after the test should start no later
than 18:05:00 (see the sketch following this list).
- If some aspect of the data
collection cannot be automated, such as accessing a SAN manager to get
details on the storage configuration, this should be noted.
- Submission of the configuration
gathering script(s) is encouraged but not required.
- Log files from the run upon
request
- Additional files or the output of commands
run on the SUT, VMs, or clients to help document details relevant to
questions that may arise during the review.
Once you have the
submission ready, please email SPECvirt_sc2010 submissions to subvirt_sc2010@spec.org.
In order to publicly
disclose SPECvirt_sc2010 results, the submitter must adhere to these reporting
rules in addition to having followed the run rules described in this document.
The goal of the reporting rules is to ensure the system under test is
sufficiently documented such that someone could reproduce the test and its
results.
Compliant runs need
to be submitted to SPEC for review and must be accepted prior to public
disclosure. If public statements using SPECvirt_sc2010 are made
they must follow the SPEC Fair Use Rule (http://www.spec.org/fairuse.html).
Many other SPEC
benchmarks allow duplicate submissions for a single system sold under various
names. Each SPECvirt_sc2010 result from a power-enabled run submitted to SPEC
or made public must be for an actual run of the benchmark on the SUT named in
the result. Electrically equivalent submissions for power-enabled runs are not
allowed unless the systems are also mechanically equivalent (e.g., rebadged).
The submitter is
required to run a script that will collect available configuration details of the
SUT and all the virtual machines used for the benchmark, including:
- SUT configuration and tuning
- SUT storage configuration and
tuning
- SUT network configuration
- Hypervisor configuration and
tuning
- Virtual machine configuration (#
of virtual CPUs, memory, disk, network adapters)
- Virtual networking configuration
- Additional files or the output of
commands run on the SUT to help document details relevant to questions
that may arise during the review.
The primary reason
for this step is to ensure that there are no subtle differences that the
vendor may miss.
The submitter is required to run a script that provides the details
of each VM and its operating system and application tunings that are not captured by
the SUT configuration collection script (an illustrative collection sketch
appears at the end of this section), including:
- Guest OS, filesystem, and network
configuration and tuning (e.g. non-default registry or /etc/sysctl.conf)
- Application-specific configuration
and tuning files
- Output of commands to document
details related to the specific requirements for a workload such as the
infraserver files shared with the webserver and software versions used.
During a review of the result, the submitter may be required to provide,
upon request, additional details of the VM, operating system and
application tunings and log files that may not be captured in the above script.
These may include, but are not limited to:
- Application-specific log files
- Additional files or the output of
commands run on the VMs to help document details relevant to questions
that may arise during the review.
The primary reason for this step is to ensure that the vendor has disclosed
all non-default tunings.
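As an illustration only, and not the collection script the rules refer to, a guest collection step might copy non-default configuration files and capture command output into the per-VM directories of the Configuration Collection Archive described in the next section. The file path, command, and output locations below are assumptions chosen for a Linux guest.

    import java.io.IOException;
    import java.nio.file.*;

    public class GuestConfigCollector {

        // Copies a configuration file and captures the output of a command into
        // the given per-VM output directory. Paths and commands are illustrative.
        static void collect(Path outDir) throws IOException, InterruptedException {
            Files.createDirectories(outDir);

            // Example guest tuning file (assumption: Linux guest).
            Path sysctl = Paths.get("/etc/sysctl.conf");
            if (Files.exists(sysctl)) {
                Files.copy(sysctl, outDir.resolve("sysctl.conf"), StandardCopyOption.REPLACE_EXISTING);
            }

            // Example command output: kernel and OS identification.
            Process p = new ProcessBuilder("uname", "-a")
                    .redirectOutput(outDir.resolve("uname.txt").toFile())
                    .start();
            p.waitFor();
        }

        public static void main(String[] args) throws Exception {
            collect(Paths.get("Virtual_Configuration/Tile1/webserver/Software_Configuration"));
        }
    }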
4.3 Client Configuration Collection
The submitter is required to run a script that collects the details of
each type of uniquely configured physical or virtual client used, such that
the testbed's client configuration could be reproduced. Clients that are
clones of a specific and documented type may simply be identified; data collection
for them is encouraged but not required. The client collection script should
collect files and the output of commands to document the client configuration
and tuning details, including:
- Hardware configuration (physical
and if applicable virtual)
- Host operating system, filesystem,
and network configuration and tuning (e.g. non-default registry or
/etc/sysctl.conf)
- If virtualized client(s) are used:
hypervisor-specific configuration and tuning files
- Virtualized client Guest OS,
filesystem, and network configuration and tuning (e.g. non-default
registry or /etc/sysctl.conf)
During a review of the result, the submitter may be required to provide,
upon request, additional details of the client configuration that are not
captured by the above script, to help answer questions that arise during
the review.
The submitter must submit the Configuration Collection Archive containing
the data (files and command output) described in sections 4.1, 4.2, and 4.3 above,
using the high-level directory structure described below as the foundation
(an illustrative sketch that creates this skeleton follows the listing):
- Physical_Configuration
  - System_Under_Test
  - SUT_Storage
  - SUT_Network
  - Clients
    - Client_type<type 1-n>
      - Physical_Configuration
      - Software_Configuration
      - VM_Configuration (if applicable)
- Virtual_Configuration
  - Tile1
    - appserver
      - VM_Configuration
      - Software_Configuration
    - dbserver
      - VM_Configuration
      - Software_Configuration
    - idleserver
      - VM_Configuration
      - Software_Configuration
    - infraserver
      - VM_Configuration
      - Software_Configuration
    - mailserver
      - VM_Configuration
      - Software_Configuration
    - webserver
      - VM_Configuration
      - Software_Configuration
  - Tile<2-n>
    - Repeat the directory structure above for each tile in the test and
      populate it with data from the corresponding set of VMs.
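To illustrate the layout only (this is not part of the benchmark kit), the following sketch creates the directory skeleton above for a given number of tiles and a single client type; the output root and tile count are hypothetical parameters.

    import java.io.IOException;
    import java.nio.file.*;

    public class ArchiveSkeleton {

        static final String[] WORKLOAD_VMS =
                { "appserver", "dbserver", "idleserver", "infraserver", "mailserver", "webserver" };

        // Creates the Configuration Collection Archive directory skeleton under
        // "root" for "tiles" tiles and one client type.
        static void create(Path root, int tiles) throws IOException {
            Path physical = root.resolve("Physical_Configuration");
            for (String dir : new String[] { "System_Under_Test", "SUT_Storage", "SUT_Network" }) {
                Files.createDirectories(physical.resolve(dir));
            }
            Path clientType = physical.resolve("Clients").resolve("Client_type1");
            for (String dir : new String[] { "Physical_Configuration", "Software_Configuration", "VM_Configuration" }) {
                Files.createDirectories(clientType.resolve(dir));
            }

            Path virtual = root.resolve("Virtual_Configuration");
            for (int tile = 1; tile <= tiles; tile++) {
                for (String vm : WORKLOAD_VMS) {
                    Path vmDir = virtual.resolve("Tile" + tile).resolve(vm);
                    Files.createDirectories(vmDir.resolve("VM_Configuration"));
                    Files.createDirectories(vmDir.resolve("Software_Configuration"));
                }
            }
        }

        public static void main(String[] args) throws IOException {
            create(Paths.get("config_collection"), 2); // e.g., a two-tile result
        }
    }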
SPEC provides client
driver software, which includes tools for running the benchmark and reporting
its results. The client drivers are written in Java; precompiled class
files are included with the kit, so no build step is necessary. Recompilation
of the client driver software is not allowed, unless prior approval from SPEC
is given.
This software
implements various checks for conformance with these run and reporting rules;
therefore, the SPEC software must be used as provided. Source code
modifications are not allowed, unless prior approval from SPEC is given. Any
such substitution must be reviewed and deemed "performance-neutral"
by the OSSC.
The kit also includes
source code for the file set generators, script code for the web server, and
other necessary components.
SPECvirt_sc2010 uses
modified versions of SPECweb2005, SPECjAppServer2004, and SPECmail2008 for its
virtualized workloads. For reference, the run rules for those benchmarks are
listed below:
NOTE: Not all of these run rules
are applicable to SPECvirt, but when a compliance issue is raised, SPEC
reserves the right to refer back to these individual benchmarks' run rules as
needed for clarification.
Copyright
© 2011 Standard Performance Evaluation Corporation. All rights reserved.
Java® is a
registered trademark of Oracle Corporation.