SPEC® MPIM2007 Result

Copyright 2006-2010 Standard Performance Evaluation Corporation

SGI

SGI ICE X (Intel Xeon E5-2690 v2, 3.0 GHz)

MPI2007 license: 4
Test date: Sep-2013
Test sponsor: SGI
Hardware Availability: Sep-2013
Tested by: SGI
Software Availability: Jul-2013

Results Table

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

               Base                                                   Peak
Benchmark      Ranks  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio  Ranks  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio
104.milc         800     12.0    130     12.2    128     12.2    129    800     12.0    130     12.2    128     12.2    129
107.leslie3d     800     39.0    134     39.0    134     38.5    135    800     39.0    134     39.0    134     38.5    135
113.GemsFDTD     800      325   19.4      341   18.5      321   19.7     96      186   33.9      186   33.9      186   34.0
115.fds4         800     12.1    161     12.4    157     12.3    159    800     12.1    161     12.4    157     12.3    159
121.pop2         800      115   35.8      115   35.9      115   35.8    800      115   35.8      115   35.9      115   35.8
122.tachyon      800     20.9    134     20.9    134     21.0    133    800     20.9    134     20.9    134     21.0    133
126.lammps       800      114   25.5      114   25.5      114   25.5    160      112   26.0      112   25.9      112   26.0
127.wrf2         800     40.2    194     40.0    195     40.2    194    800     40.2    194     40.0    195     40.2    194
128.GAPgeofem    800     15.0    138     13.9    149     13.8    149    800     15.0    138     13.9    149     13.8    149
129.tera_tf      800     27.8   99.7     27.7    100     27.8   99.6    800     27.8   99.7     27.7    100     27.8   99.6
130.socorro      800     37.0    103     37.2    103     37.4    102    800     37.0    103     37.2    103     37.4    102
132.zeusmp2      800     27.0    115     27.1    115     27.1    114    800     27.0    115     27.1    115     27.1    114
137.lu           800     28.6    129     28.6    129     28.6    129    800     28.6    129     28.6    129     28.6    129
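
As an illustration of how the per-run ratios in the results table can be summarized, the sketch below takes the median of the three base ratios for each benchmark and then the geometric mean across benchmarks. The ratios are copied from the table; the names BASE_RATIOS and summarize are illustrative only, and the value computed here is an unofficial recomputation from rounded ratios, not a SPEC-reported metric.

```python
from statistics import geometric_mean, median

# Base ratios for the three runs of each benchmark, copied from the results table.
BASE_RATIOS = {
    "104.milc": (130, 128, 129),
    "107.leslie3d": (134, 134, 135),
    "113.GemsFDTD": (19.4, 18.5, 19.7),
    "115.fds4": (161, 157, 159),
    "121.pop2": (35.8, 35.9, 35.8),
    "122.tachyon": (134, 134, 133),
    "126.lammps": (25.5, 25.5, 25.5),
    "127.wrf2": (194, 195, 194),
    "128.GAPgeofem": (138, 149, 149),
    "129.tera_tf": (99.7, 100, 99.6),
    "130.socorro": (103, 103, 102),
    "132.zeusmp2": (115, 115, 114),
    "137.lu": (129, 129, 129),
}


def summarize(ratios_by_benchmark):
    """Return (median ratio per benchmark, geometric mean of those medians)."""
    medians = {name: median(runs) for name, runs in ratios_by_benchmark.items()}
    overall = geometric_mean(list(medians.values()))
    return medians, overall


medians, overall = summarize(BASE_RATIOS)
for name, value in medians.items():
    print(f"{name:<14} median base ratio {value}")
print(f"Geometric mean of median base ratios (unofficial): {overall:.1f}")
```

The geometric mean is used rather than the arithmetic mean so that a benchmark with a large ratio, such as 127.wrf2, does not dominate the summary.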
Hardware Summary
Type of System: Homogeneous
Compute Node: SGI ICE X Compute Node
Interconnect: InfiniBand (MPI and I/O)
File Server Node: SGI Rackable C1103-TY12
Total Compute Nodes: 40
Total Chips: 80
Total Cores: 800
Total Threads: 1600
Total Memory: 2560 GB
Base Ranks Run: 800
Minimum Peak Ranks: 96
Maximum Peak Ranks: 800
Software Summary
C Compiler: Intel C++ Composer XE 2013 for Linux,
Version 14.0.0.051 Build 20130529
C++ Compiler: Intel C++ Composer XE 2013 for Linux,
Version 14.0.0.051 Build 20130529
Fortran Compiler: Intel Fortran Composer XE 2013 for Linux,
Version 14.0.0.051 Build 20130529
Base Pointers: 64-bit
Peak Pointers: 64-bit
MPI Library: SGI MPT 2.08 Patch 11012
Other MPI Info: OFED 1.5.2
Pre-processors: None
Other Software: None

Node Description: SGI ICE X Compute Node

Hardware
Number of nodes: 40
Uses of the node: compute
Vendor: SGI
Model: SGI ICE X (Intel Xeon E5-2690 v2, 3.0 GHz)
CPU Name: Intel Xeon E5-2690 v2
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 20
Cores per chip: 10
Threads per core: 2
CPU Characteristics: Ten Core, 3.0 GHz, 8.0 GT/s QPI
Hyper-Threading Technology enabled
CPU MHz: 3000
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 25 MB I+D on chip per chip
Other Cache: None
Memory: 64 GB (8 x 8 GB 2Rx4 PC3-14900R-13, ECC)
Disk Subsystem: None
Other Hardware: None
Adapter: Mellanox MT27500 with ConnectX-3 ASIC
(PCIe x8 Gen3 8 GT/s)
Number of Adapters: 2
Slot Type: PCIe x8 Gen3
Data Rate: InfiniBand 4x FDR
Ports Used: 1
Interconnect Type: InfiniBand
Software
Adapter: Mellanox MT27500 with ConnectX-3 ASIC
(PCIe x8 Gen3 8 GT/s)
Adapter Driver: OFED-1.5.2
Adapter Firmware: 2.7.8200
Operating System: SUSE Linux Enterprise Server 11 SP2,
Kernel 3.0.80-0.7-default
Local File System: NFSv3
Shared File System: NFSv3 IPoIB
System State: Multi-user, run level 3
Other Software: SGI Tempo Compute Node 2.7.3,
Build 708rp14.sles11sp2-130531120
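
The totals in the Hardware Summary follow from the per-node figures in this compute node description. The short sketch below redoes that arithmetic; the constant and function names are illustrative, and the rank placement of 20 base ranks per node assumes the 800 base ranks were spread evenly over all 40 nodes, which the report does not state explicitly.

```python
# Per-node figures from the compute node description, plus the node count.
NODES = 40
CHIPS_PER_NODE = 2
CORES_PER_CHIP = 10
THREADS_PER_CORE = 2
MEMORY_PER_NODE_GB = 64
BASE_RANKS = 800


def cluster_totals(nodes, chips_per_node, cores_per_chip, threads_per_core, mem_gb):
    """Derive system-wide totals from per-node hardware figures."""
    chips = nodes * chips_per_node
    cores = chips * cores_per_chip
    return {
        "chips": chips,
        "cores": cores,
        "threads": cores * threads_per_core,
        "memory_gb": nodes * mem_gb,
    }


totals = cluster_totals(
    NODES, CHIPS_PER_NODE, CORES_PER_CHIP, THREADS_PER_CORE, MEMORY_PER_NODE_GB
)
# Assumption: ranks are distributed evenly across all compute nodes.
ranks_per_node = BASE_RANKS // NODES

print(totals)
print(f"Base ranks per node (assuming even placement): {ranks_per_node}")
```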

Node Description: SGI Rackable C1103-TY12

Hardware
Number of nodes: 1
Uses of the node: fileserver
Vendor: SGI
Model: SGI Rackable C1103-TY12 (Intel Xeon X5670, 2.93 GHz)
CPU Name: Intel Xeon X5670
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 12
Cores per chip: 6
Threads per core: 2
CPU Characteristics: Intel Turbo Boost Technology up to 3.33 GHz
Hyper-Threading Technology enabled
CPU MHz: 2933
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 12 MB I+D on chip per chip
Other Cache: None
Memory: 96 GB (8 x 12 GB 2Rx4 PC3-10600R-9, ECC)
Disk Subsystem: 36 TB RAID 6
36 x 1 TB SAS (Seagate Constellation 7200RPM)
Other Hardware: None
Adapter: Mellanox MT27500 with ConnectX-3 ASIC
(PCIe x8 Gen3 8 GT/s)
Number of Adapters: 2
Slot Type: PCIe x8 Gen3
Data Rate: InfiniBand 4x FDR
Ports Used: 2
Interconnect Type: InfiniBand
Software
Adapter: Mellanox MT27500 with ConnectX-3 ASIC
(PCIe x8 Gen3 8 GT/s)
Adapter Driver: OFED-1.5.2
Adapter Firmware: 2.11.312
Operating System: SUSE Linux Enterprise Server 11 SP1,
Kernel 2.6.32.54-0.3-default
Local File System: xfs
Shared File System: --
System State: Multi-user, run level 3
Other Software: SGI Foundation Software 2.5,
Build 705r10.sles11-1110192111
SGI InfiniteStorage Software Platform, version 2.5,
Build 705r10.sles11-1110192111

Interconnect Description: InfiniBand (MPI and I/O)

Hardware
Vendor: Mellanox Technologies and SGI
Model: None
Switch Model: SGI FDR Integrated IB Switch Blade 2SW9x27 with
Mellanox SwitchX device 51000
Number of Switches: 10
Number of Ports: 36
Data Rate: InfiniBand 4x FDR
Firmware: 07130007_LLR and 08130007_LLR
Topology: Enhanced Hypercube
Primary Use: MPI and I/O traffic

Submit Notes

The config file option 'submit' was used.

General Notes

130.socorro (base): "nullify_ptrs" src.alt was used.

 Software environment:
   export MPI_REQUEST_MAX=65536
   export MPI_TYPE_MAX=32768
   export MPI_BUFS_THRESHOLD=1
   export MPI_IB_RAILS=2
   ulimit -s unlimited

 BIOS settings:
   AMI BIOS version 3.0
   Hyper-Threading Technology enabled (default)
   Intel Turbo Boost Technology disabled

 Transparent Hugepage: Disabled

 Job Placement:
   Each MPI job was assigned to a topologically compact set
   of nodes.

 Additional notes regarding interconnect:
   The InfiniBand network consists of two independent planes,
   with half the switches in the system allocated to each plane.
   I/O traffic is restricted to one plane, while MPI traffic can
   use both planes.

 Peak run:
   In the peak run, some benchmarks used a different number of ranks
   than in base. This is the only difference between base and peak.
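
 To make the effect of the changed rank counts concrete, the sketch below compares median base and peak run times for the two benchmarks whose peak rank count differs from base (113.GemsFDTD and 126.lammps). The times and rank counts are copied from the results table; the names RUNS and peak_speedup are illustrative only.

```python
from statistics import median

# (ranks, seconds for the three runs), copied from the results table.
RUNS = {
    "113.GemsFDTD": {"base": (800, (325, 341, 321)), "peak": (96, (186, 186, 186))},
    "126.lammps": {"base": (800, (114, 114, 114)), "peak": (160, (112, 112, 112))},
}


def peak_speedup(entry):
    """Ratio of median base time to median peak time (above 1 means peak is faster)."""
    _, base_seconds = entry["base"]
    _, peak_seconds = entry["peak"]
    return median(base_seconds) / median(peak_seconds)


for name, entry in RUNS.items():
    base_ranks = entry["base"][0]
    peak_ranks = entry["peak"][0]
    print(
        f"{name}: {base_ranks} -> {peak_ranks} ranks, "
        f"peak is {peak_speedup(entry):.2f}x the base speed"
    )
```

 Because the table reports times to three significant digits, the computed speedups are approximate.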

Compiler Invocation

C benchmarks:

 icc 

C++ benchmarks:

126.lammps:  icpc 

Fortran benchmarks:

 ifort 

Benchmarks using both Fortran and C:

 icc   ifort 

Portability Flags

121.pop2:  -DSPEC_MPI_CASE_FLAG 
127.wrf2:  -DSPEC_MPI_CASE_FLAG   -DSPEC_MPI_LINUX 
130.socorro:  -assume nostd_intent_in 

Base Optimization Flags

C benchmarks:

 -O3   -xAVX   -no-prec-div 

C++ benchmarks:

126.lammps:  -O3   -xAVX   -no-prec-div   -ansi-alias 

Fortran benchmarks:

 -O3   -xAVX   -no-prec-div 

Benchmarks using both Fortran and C:

 -O3   -xAVX   -no-prec-div 

Peak Optimization Flags

C benchmarks:

104.milc:  basepeak = yes 
122.tachyon:  basepeak = yes 

C++ benchmarks:

126.lammps:  -O3   -xAVX   -no-prec-div   -ansi-alias 

Fortran benchmarks:

107.leslie3d:  basepeak = yes 
113.GemsFDTD:  -O3   -xAVX   -no-prec-div 
129.tera_tf:  basepeak = yes 
137.lu:  basepeak = yes 

Benchmarks using both Fortran and C:

115.fds4:  basepeak = yes 
121.pop2:  basepeak = yes 
127.wrf2:  basepeak = yes 
128.GAPgeofem:  basepeak = yes 
130.socorro:  basepeak = yes 
132.zeusmp2:  basepeak = yes 

Other Flags

C benchmarks:

 -lmpi 

C++ benchmarks:

126.lammps:  -lmpi 

Fortran benchmarks:

 -lmpi 

Benchmarks using both Fortran and C:

 -lmpi 

The flags file that was used to format this result can be browsed at
http://www.spec.org/mpi2007/flags/SGI_x86_64_Intel14_flags.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/mpi2007/flags/SGI_x86_64_Intel14_flags.xml.