Compiler Invocation:
C: cc
F90: f90
F77: f77
Base Tuning:
C: -fast -xopenmp -xalias_level=std -xipo=2
-xprefetch_level=3 -xcode=abs44 -m64 -lmtmalloc
-g -xpagesize=4m -xprofile
F90: -fast -openmp -xcode=abs44 -m64 -xipo=2 -autopar
-fma=fused -g -xpagesize=4m -xprofile
ONESTEP=yes
Extra art allowed flags:
331.art_l: -DINTS_PER_CACHELINE=16 -DDBLS_PER_CACHELINE=8
Peak Notes:
ONESTEP=yes
311.wupwise_l: -fast -openmp -xunroll=4 -autopar -m64 -xcode=abs44
-xipo=2 -fma=fused -xpagesize=4m -xunroll=4
-xprofile
313.swim_l: -fast -openmp -m64 -xipo=2 -autopar -fma=fused
-xpagesize=512k -xprefetch=latx:3 -xprofile
315.mgrid_l: -fast -openmp -xipo=2 -xprefetch_level=3 -m64
-xcode=abs44 -xpagesize=512K -xprefetch=latx:4.8
-fma=fused -Qoption iropt -Apf:l2subblock=256
-xprofile
317.applu_l: -fast -xipo=2 -openmp -xautopar -m64 -fma=fused
-xpagesize=4m -xprefetch=latx:2.8
-Qoption iropt -Rloop_dist -xunroll=3 -xprofile
321.equake_l: -fast -xopenmp -xprefetch_level=3 -xpagesize=64K
-xprefetch=latx:2 -xipo=2 -lmtmalloc
-W2,-Apf:l2subblock=256 -m64 -xprofile
325.apsi_l: -fast -openmp -m64 -xipo=2 -autopar -fma=fused
-xpagesize=4m -xprefetch=latx:3.4
-Qoption iropt -Rloop_dist -xprofile
327.gafort_l: -fast -openmp -xprefetch_level=3 -m64 -fma=fused
-xprefetch=latx:0.5 -xprofile
329.fma3d_l: -fast -openmp -xcode=abs44 -m64 -xipo=2 -autopar
-fma=fused -g -xpagesize=4m -xprofile
331.art_l: -fast -xopenmp -xipo=2 -xprefetch_level=3 -m64
-xprefetch=latx:3 -xprofile
Alternate Source for Base and Peak:
315.mgrid_l: intel, correct an OpenMP coding standard problem.
Available as SPEC OMP alternative source:
ompl2001-mgrid-20071113.tar.gz
329.fma3d_l: sqrt.init, avoid a potential race condition.
Available as SPEC OMP alternative source:
ompl2001-fma3dsqrtinit-20070912.tar.gz
Alternate Source for Peak:
325.apsi_l: ompl.dd, change initial data distribution for WORK array.
Available as SPEC OMP alternative source:
ompl2001-dd-20040128.tar.gz
Feedback optimization (-xprofile) is done as follows,
unless otherwise noted:
fdo_pre0: rm -rf `pwd`/feedback.profile
PASS1: -xprofile=collect:./feedback
PASS2: -xprofile=use:./feedback
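The three steps above can be sketched as an ordered command sequence. This is only an illustration, not the SPEC tools' own implementation: the helper name fdo_steps, the source and binary names (example.c, example), and the explicit training-run step between the two passes are assumptions made for the sketch. The script only builds and prints the command strings; it does not invoke the compiler.

```python
import shlex


def fdo_steps(compiler, flags, sources, output, profile="./feedback"):
    """Return shell commands for a two-pass profile-feedback build (sketch)."""
    common = [compiler, *flags]
    tail = [*sources, "-o", output]
    return [
        # fdo_pre0: discard profile data left over from an earlier build
        "rm -rf " + shlex.quote(profile + ".profile"),
        # PASS1: instrumented build that collects profile data
        shlex.join(common + [f"-xprofile=collect:{profile}"] + tail),
        # Training run of the instrumented binary (assumed placeholder step)
        shlex.join(["./" + output]),
        # PASS2: rebuild using the collected profile
        shlex.join(common + [f"-xprofile=use:{profile}"] + tail),
    ]


# Hypothetical file names; the flags are a subset of the base C flags.
steps = fdo_steps("cc", ["-fast", "-xopenmp", "-m64"], ["example.c"], "example")
for step in steps:
    print(step)
```

The relative path ./feedback.profile used here matches `pwd`/feedback.profile only when the commands run in the build directory.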
Base and Peak User Environment Settings:
unlimit stacksize (in /bin/csh)
setenv SUNW_MP_PROCBIND "2 4 6 10 12 14 18 20 22 26 28 30 34 36 38
42 44 46 50 52 54 58 60 62 66 68 70 74 76 78 82 84 86 90 92 94 98
100 102 106 108 110 114 116 118 122 124 126 130 132 134 138 140
142 146 148 150 154 156 158 162 164 166 170 172 174 178 180 182
186 188 190 194 196 198 202 204 206 210 212 214 218 220 222 226
228 230 234 236 238 242 244 246 250 252 254 258 260 262 266 268
270 274 276 278 282 284 286 290 292 294 298 300 302 306 308 310
314 316 318 322 324 326 330 332 334 338 340 342 346 348 350 354
356 358 362 364 366 370 372 374 378 380 382 386 388 390 394 396
398 402 404 406 410 412 414 418 420 422 426 428 430 434 436 438
442 444 446 450 452 454 458 460 462 466 468 470 474 476 478 482
484 486 490 492 494 498 500 502 506 508 510"
setenv SUNW_MP_THR_IDLE SPIN
setenv OMP_DYNAMIC FALSE
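The SUNW_MP_PROCBIND value above follows a regular pattern: within each block of 8 consecutive IDs, the IDs at offsets 2, 4 and 6 are listed, up to 510. A minimal sketch that regenerates the list is shown here; the function name procbind_list and its parameters are hypothetical, and the pattern is inferred from the listed values rather than taken from the config file.

```python
def procbind_list(max_id=510, stride=8, offsets=(2, 4, 6)):
    """Return processor IDs in the order they appear in the setting (sketch)."""
    ids = []
    for base in range(0, max_id + 1, stride):
        for off in offsets:
            cpu = base + off
            if cpu <= max_id:
                ids.append(cpu)
    return ids


ids = procbind_list()
value = " ".join(str(i) for i in ids)

# Emit the setting in csh syntax, as used in these notes.
print(f'setenv SUNW_MP_PROCBIND "{value}"')
print(len(ids), "IDs")
```

With the default arguments the generated list has 192 entries, running from 2 to 510.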
Additional Peak User Environment Settings:
OMP_NUM_THREADS settings per benchmark:
311.wupwise_l  192
313.swim_l      64
315.mgrid_l    128
317.applu_l    256
321.equake_l   128
325.apsi_l     192
327.gafort_l   256
329.fma3d_l    256
331.art_l       96
SUNW_MP_PROCBIND was set per benchmark to distribute the work across
as many CPUs and cores as possible. See the config file for details.
For a description of the Sun Studio 12 compiler flags, portability
flags and system parameters used to generate this result, please refer
to the file SUN-20080714-Studio-Solaris-sparc.txt in the flags directory.
This result was measured on a Sun SPARC Enterprise M9000.
The Sun SPARC Enterprise M9000 and the Fujitsu SPARC Enterprise
M9000 are electrically equivalent.
"CMU" = CPU/Memory Unit; each holds 2 or 4 CPU chips.