MPI2007 license: | 4 | Test date: | Nov-2011 |
---|---|---|---|
Test sponsor: | SGI | Hardware Availability: | Feb-2011 |
Tested by: | SGI | Software Availability: | Nov-2011 |

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

Benchmark | Base Ranks | Base Run 1 Seconds | Base Run 1 Ratio | Base Run 2 Seconds | Base Run 2 Ratio | Base Run 3 Seconds | Base Run 3 Ratio | Peak Ranks | Peak Run 1 Seconds | Peak Run 1 Ratio | Peak Run 2 Seconds | Peak Run 2 Ratio | Peak Run 3 Seconds | Peak Run 3 Ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
121.pop2 | 2048 | 69.0 | 56.4 | 58.6 | 66.4 | 58.6 | 66.4 | 2048 | 69.0 | 56.4 | 58.6 | 66.4 | 58.6 | 66.4 |
122.tachyon | 2048 | 158 | 12.3 | 67.6 | 28.8 | 67.3 | 28.9 | 4608 | 171 | 11.3 | 44.7 | 43.5 | 45.0 | 43.2 |
125.RAxML | 2048 | 92.7 | 31.5 | 93.1 | 31.4 | 92.9 | 31.4 | 4608 | 64.6 | 45.2 | 65.0 | 44.9 | 64.4 | 45.3 |
126.lammps | 2048 | 60.4 | 40.7 | 60.0 | 41.0 | 60.0 | 41.0 | 4608 | 28.5 | 86.3 | 26.5 | 92.7 | 26.6 | 92.6 |
128.GAPgeofem | 2048 | 79.9 | 74.3 | 80.0 | 74.1 | 80.1 | 74.1 | 3072 | 71.5 | 83.0 | 71.6 | 82.9 | 71.8 | 82.7 |
129.tera_tf | 2048 | 46.5 | 23.6 | 46.5 | 23.6 | 46.6 | 23.6 | 4608 | 33.4 | 32.9 | 33.3 | 33.0 | 33.3 | 33.0 |
132.zeusmp2 | 2048 | 35.0 | 60.6 | 36.5 | 58.0 | 36.6 | 57.9 | 2048 | 35.0 | 60.6 | 36.5 | 58.0 | 36.6 | 57.9 |
137.lu | 2048 | 32.6 | 129 | 32.7 | 128 | 32.7 | 129 | 2048 | 32.6 | 129 | 32.7 | 128 | 32.7 | 129 |
142.dmilc | 2048 | 26.8 | 137 | 26.9 | 137 | 27.0 | 137 | 4608 | 21.4 | 172 | 21.2 | 174 | 21.3 | 173 |
143.dleslie | 2048 | 29.7 | 104 | 29.4 | 105 | 29.6 | 105 | 2048 | 29.7 | 104 | 29.4 | 105 | 29.6 | 105 |
145.lGemsFDTD | 2048 | 77.6 | 56.9 | 77.6 | 56.9 | 77.5 | 56.9 | 2560 | 77.2 | 57.1 | 77.2 | 57.1 | 77.2 | 57.1 |
147.l2wrf2 | 2048 | 90.2 | 91.0 | 90.2 | 90.9 | 90.2 | 91.0 | 4608 | 72.4 | 113 | 72.2 | 114 | 71.9 | 114 |
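
As a rough cross-check of the results table, the sketch below takes the median of the three base ratios for each benchmark and combines the medians with a geometric mean, which is how SPEC forms an overall metric from per-benchmark ratios. The function names are illustrative only, and because the table shows rounded ratios, the value computed here may differ slightly from the officially reported figure.

```python
from statistics import geometric_mean, median

# Base ratios for the three runs of each benchmark, copied from the table.
BASE_RATIOS = {
    "121.pop2": [56.4, 66.4, 66.4],
    "122.tachyon": [12.3, 28.8, 28.9],
    "125.RAxML": [31.5, 31.4, 31.4],
    "126.lammps": [40.7, 41.0, 41.0],
    "128.GAPgeofem": [74.3, 74.1, 74.1],
    "129.tera_tf": [23.6, 23.6, 23.6],
    "132.zeusmp2": [60.6, 58.0, 57.9],
    "137.lu": [129, 128, 129],
    "142.dmilc": [137, 137, 137],
    "143.dleslie": [104, 105, 105],
    "145.lGemsFDTD": [56.9, 56.9, 56.9],
    "147.l2wrf2": [91.0, 90.9, 91.0],
}


def median_ratios(ratios_by_benchmark):
    """Pick the median of the runs for each benchmark."""
    return {name: median(runs) for name, runs in ratios_by_benchmark.items()}


def overall_metric(ratios_by_benchmark):
    """Geometric mean of the per-benchmark median ratios."""
    return geometric_mean(list(median_ratios(ratios_by_benchmark).values()))


if __name__ == "__main__":
    for name, ratio in median_ratios(BASE_RATIOS).items():
        print(f"{name:15s} {ratio}")
    print(f"geometric mean of medians: {overall_metric(BASE_RATIOS):.1f}")
```

The same two functions can be applied to the peak ratios by building a second dictionary from the peak columns.
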
Hardware Summary | |
---|---|
Type of System: | Homogeneous |
Compute Node: | SGI Altix ICE 8400EX Compute Node |
Interconnect: | InfiniBand (MPI and I/O) |
File Server Node: | SGI InfiniteStorage Nexis 2000 NAS |
Total Compute Nodes: | 384 |
Total Chips: | 768 |
Total Cores: | 4608 |
Total Threads: | 9216 |
Total Memory: | 9 TB |
Base Ranks Run: | 2048 |
Minimum Peak Ranks: | 2048 |
Maximum Peak Ranks: | 4608 |
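
The totals in the hardware summary follow from the per-node figures given in the compute node hardware table (2 chips, 6 cores per chip, 2 threads per core, 24 GB per node). The short sketch below reproduces that arithmetic; the function name and the use of 1 TB = 1024 GB are assumptions made for illustration.

```python
def system_totals(nodes, chips_per_node, cores_per_chip,
                  threads_per_core, memory_per_node_gb):
    """Derive system-wide totals from per-node hardware figures."""
    chips = nodes * chips_per_node
    cores = chips * cores_per_chip
    threads = cores * threads_per_core
    # Assumes binary units: 1 TB = 1024 GB.
    memory_tb = nodes * memory_per_node_gb / 1024
    return {
        "chips": chips,
        "cores": cores,
        "threads": threads,
        "memory_tb": memory_tb,
    }


if __name__ == "__main__":
    totals = system_totals(nodes=384, chips_per_node=2, cores_per_chip=6,
                           threads_per_core=2, memory_per_node_gb=24)
    print(totals)
```
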
Software Summary | |
---|---|
C Compiler: | Intel C++ Composer XE 2011 for Linux, Version 12.1.0.233 Build 20110811 |
C++ Compiler: | Intel C++ Composer XE 2011 for Linux, Version 12.1.0.233 Build 20110811 |
Fortran Compiler: | Intel Fortran Composer XE 2011 for Linux, Version 12.1.0.233 Build 20110811 |
Base Pointers: | 64-bit |
Peak Pointers: | 64-bit |
MPI Library: | SGI MPT 2.05 |
Other MPI Info: | OFED 1.5.2 |
Pre-processors: | None |
Other Software: | None |
Hardware | |
---|---|
Number of nodes: | 384 |
Uses of the node: | compute |
Vendor: | SGI |
Model: | SGI Altix ICE 8400EX IP-105 (Intel Xeon X5690, 3.46 GHz) |
CPU Name: | Intel Xeon X5690 |
CPU(s) orderable: | 1-2 chips |
Chips enabled: | 2 |
Cores enabled: | 12 |
Cores per chip: | 6 |
Threads per core: | 2 |
CPU Characteristics: | Six Core, 3.46 GHz, 6.4 GT/s QPI; Intel Turbo Boost Technology up to 3.73 GHz; Hyper-Threading Technology enabled |
CPU MHz: | 3467 |
Primary Cache: | 32 KB I + 32 KB D on chip per core |
Secondary Cache: | 256 KB I+D on chip per core |
L3 Cache: | 12 MB I+D on chip per chip |
Other Cache: | None |
Memory: | 24 GB (6 x 4 GB 2Rx4 PC3-10600R-9, ECC) |
Disk Subsystem: | None |
Other Hardware: | None |
Adapter: | Mellanox MT26428 ConnectX IB QDR (PCIe x8 Gen2 5 GT/s) |
Number of Adapters: | 2 |
Slot Type: | PCIe x8 Gen2 |
Data Rate: | InfiniBand 4x QDR |
Ports Used: | 1 |
Interconnect Type: | InfiniBand |
Software | |
---|---|
Adapter: | Mellanox MT26428 ConnectX IB QDR (PCIe x8 Gen2 5 GT/s) |
Adapter Driver: | OFED-1.5.2 |
Adapter Firmware: | 2.7.8200 |
Operating System: | SUSE Linux Enterprise Server 11 SP1, Kernel 2.6.32.43-0.4-default |
Local File System: | NFSv3 |
Shared File System: | NFSv3 IPoIB |
System State: | Multi-user, run level 3 |
Other Software: | SGI Performance Suite 1.2, Build 704r5.sles11-1103212004; SGI Tempo Compute Node 2.4, Build 704rp74.sles11-1106302006 |
Hardware | |
---|---|
Number of nodes: | 1 |
Uses of the node: | fileserver |
Vendor: | SGI |
Model: | SGI Altix XE 270 (Intel Xeon X5670, 2.93 GHz) |
CPU Name: | Intel Xeon X5670 |
CPU(s) orderable: | 1-2 chips |
Chips enabled: | 2 |
Cores enabled: | 12 |
Cores per chip: | 6 |
Threads per core: | 2 |
CPU Characteristics: | Intel Turbo Boost Technology up to 3.33 GHz; Hyper-Threading Technology enabled |
CPU MHz: | 2933 |
Primary Cache: | 32 KB I + 32 KB D on chip per core |
Secondary Cache: | 256 KB I+D on chip per core |
L3 Cache: | 12 MB I+D on chip per chip |
Other Cache: | None |
Memory: | 96 GB (12 x 8 GB DDR3-1333 CL9 DIMMs) |
Disk Subsystem: | 8.8 TB RAID 5 60 x 146 GB SAS (Seagate Cheetah 15K.5) |
Other Hardware: | None |
Adapter: | Mellanox MT26428 ConnectX IB QDR (PCIe x8 Gen2 5 GT/s) |
Number of Adapters: | 2 |
Slot Type: | PCIe x8 Gen2 |
Data Rate: | InfiniBand 4x QDR |
Ports Used: | 2 |
Interconnect Type: | InfiniBand |
Software | |
---|---|
Adapter: | Mellanox MT26428 ConnectX IB QDR (PCIe x8 Gen2 5 GT/s) |
Adapter Driver: | OFED-1.4.0 |
Adapter Firmware: | 2.7.0 |
Operating System: | SUSE Linux Enterprise Server 11 (x86_64) Kernel 2.6.27.19-5-default |
Local File System: | xfs |
Shared File System: | -- |
System State: | Multi-user, run level 3 |
Other Software: | SGI Foundation Software 2, Build 700r3.sles11-1004061553 |
Hardware | |
---|---|
Vendor: | Mellanox Technologies and SGI |
Model: | None |
Switch Model: | SGI QDR_1.5_HYPR_2454 with Mellanox Device 48438 (InfiniScale IV) |
Number of Switches: | 96 |
Number of Ports: | 36 |
Data Rate: | InfiniBand 4x QDR |
Firmware: | 5040005 |
Topology: | Enhanced Hypercube |
Primary Use: | MPI and I/O traffic |
The config file option 'submit' was used. For benchmarks that used 2048 or 2560 MPI ranks, four ranks were assigned to each CPU chip, leaving 2 cores per chip idle.

Software environment:
  export MPI_REQUEST_MAX=65536
  export MPI_TYPE_MAX=32768
  export MPI_BUFS_THRESHOLD=1
  export MPI_IB_RAILS=2
  ulimit -s unlimited

BIOS settings:
  AMI BIOS version 080016
  Hyper-Threading Technology enabled (default)
  Intel Turbo Boost Technology enabled (default)
  Intel Turbo Boost Technology activated in the OS via
    /etc/init.d/acpid start
    /etc/init.d/powersaved start
    powersave -f

Job Placement:
  In the runs with 3072 and 4608 ranks, each MPI job was assigned to a topologically compact set of nodes, using 64 switches for 3072 ranks and 96 switches for 4608 ranks. In the runs with 2048 and 2560 MPI ranks, four ranks were assigned to each CPU chip, leaving 2 cores per chip idle. These runs used 64 switches for 2048 ranks and 80 switches for 2560 ranks, with topologically compact configurations in both cases.

Additional notes regarding interconnect:
  The InfiniBand network consists of two independent planes, with half the switches in the system allocated to each plane. I/O traffic is restricted to one plane, while MPI traffic can use both planes. SGI manufactures its own switch blades using unmodified Mellanox switch ASICs. The test system has SGI QDR_1.5_HYPR_2454 switches with the Mellanox 36-port QDR InfiniBand switch Device 48438 (InfiniScale IV).
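
The job placement figures can be checked against the node hardware with a few lines of arithmetic. The sketch below derives the node count for each run from the rank count and the ranks placed on each chip. Four ranks per chip is stated in the notes for the 2048-rank and 2560-rank runs; six ranks per chip for the 3072-rank and 4608-rank runs is an assumption (fully populated chips), and the function names are illustrative.

```python
CHIPS_PER_NODE = 2   # from the compute node hardware table
CORES_PER_CHIP = 6   # from the compute node hardware table

# ranks -> (ranks per chip, switches reported in the job placement notes)
# Four ranks per chip is stated for 2048 and 2560 ranks; six per chip for
# 3072 and 4608 ranks is an assumption, not a figure from the report.
RUNS = {
    2048: (4, 64),
    2560: (4, 80),
    3072: (6, 64),
    4608: (6, 96),
}


def nodes_needed(ranks, ranks_per_chip):
    """Number of compute nodes occupied, rounded up to whole nodes."""
    ranks_per_node = ranks_per_chip * CHIPS_PER_NODE
    return -(-ranks // ranks_per_node)  # ceiling division


def placement_summary(runs=RUNS):
    """Nodes, idle cores per chip, and nodes per switch for each run."""
    summary = {}
    for ranks, (ranks_per_chip, switches) in runs.items():
        nodes = nodes_needed(ranks, ranks_per_chip)
        summary[ranks] = {
            "nodes": nodes,
            "idle_cores_per_chip": CORES_PER_CHIP - ranks_per_chip,
            "nodes_per_switch": nodes / switches,
        }
    return summary


if __name__ == "__main__":
    for ranks, info in placement_summary().items():
        print(ranks, info)
```

Under these assumptions, every run works out to the same number of nodes per switch, which is consistent with the topologically compact placement described in the notes.
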
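
Compiler Invocation | |
---|---|
C benchmarks: | icc |
C++ benchmarks, 126.lammps: | icpc |
Fortran benchmarks: | ifort |
Benchmarks using both Fortran and C: | icc ifort |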
Portability Flags | |
---|---|
121.pop2: | -DSPEC_MPI_CASE_FLAG |
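Base Optimization Flags | |
---|---|
C benchmarks: | -O3 -xSSE4.2 -no-prec-div |
C++ benchmarks, 126.lammps: | -O3 -xSSE4.2 -no-prec-div -ansi-alias |
Fortran benchmarks: | -O3 -xSSE4.2 -no-prec-div |
Benchmarks using both Fortran and C: | -O3 -xSSE4.2 -no-prec-div |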
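Peak Optimization Flags | |
---|---|
C benchmarks: | -O3 -xSSE4.2 -no-prec-div |
C++ benchmarks, 126.lammps: | -O3 -xSSE4.2 -no-prec-div -ansi-alias |
Fortran benchmarks, 129.tera_tf: | -O3 -xSSE4.2 -no-prec-div |
Fortran benchmarks, 137.lu: | basepeak = yes |
Fortran benchmarks, 143.dleslie: | basepeak = yes |
Fortran benchmarks, 145.lGemsFDTD: | Same as 129.tera_tf |
Benchmarks using both Fortran and C, 121.pop2: | basepeak = yes |
Benchmarks using both Fortran and C, 128.GAPgeofem: | -O3 -xSSE4.2 -no-prec-div |
Benchmarks using both Fortran and C, 132.zeusmp2: | basepeak = yes |
Benchmarks using both Fortran and C, 147.l2wrf2: | Same as 128.GAPgeofem |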