Base Flags and Notes:
ONESTEP=yes for all benchmarks
C: -fast -xopenmp -xalias_level=std -xipo=2
-xprefetch_level=3
-xprofile -lmtmalloc
f90: -fast -autopar -openmp -xipo=2
-xprefetch_level=3 -xprofile
Portability and Extra flags:
318.galgel_m: -e -fixed
330.art_m: -DINTS_PER_CACHELINE=16 -DDBLS_PER_CACHELINE=8
Peak Flags and Notes:
ONESTEP=yes for all benchmarks
310.wupwise_m: -fast -openmp -autopar -xipo=2
-Qoption iropt -Athr,
-Ainline:inc=800:cp=1 -xprefetch_level=3
-xunroll=2 -xpagesize=512k -xprofile
312.swim_m: -fast -xautopar -xreduction
-Qoption iropt -Ainline:cs=700,
-Atile:skewp:b256 -xpad=common:3022
-xprefetch=latx:1.6 -lmtmalloc
314.mgrid_m: -fast -autopar -xreduction -xipo=2
-xprefetch_level=3 -xprefetch=latx:2
-Qoption iropt -Apf:largedim -xpagesize=64k
-xchip=ultra3 -xarch=v8plusa
316.applu_m: -fast -openmp -xipo=2
-Qoption iropt -Aujam:inner=g
-xunroll=1 -xprefetch=latx:1.8 -xprofile
srcalt = ompl.32
318.galgel_m: -fast -xipo=2 -openmp -autopar
-xlic_lib=sunperf -xarch=v8plusb
RM_SOURCES=lapak.f90
STACKSIZE = 16384
320.equake_m: -fast -xipo=2 -xopenmp -xrestrict
-xalias_level=strong -xprefetch_level=3
-W2,-Apf:pdl=1 -Wc,-Qlp-ip=1-ol=1
-xpagesize=64k
-xprofile -lmtmalloc -lmopt -lm
srcalt = ompl.32
324.apsi_m: -fast -openmp -autopar -xipo=2
-xpagesize=8k -xprofile
srcalt = ompl.32
326.gafort_m: -fast -openmp -xprofile
srcalt = ompl.32
328.fma3d_m: -fast -openmp -Qoption iropt -Athr,
-Apf:pdl=1 -Qoption cg -Qlp-ip=1
-xipo=2 -xprefetch_level=3 -xprofile
-lmtmalloc
srcalt = ompl.32
330.art_m: -fast -xopenmp -xalias_level=std -xipo=2
-xprefetch_level=3 -W2,-Apf:outer=0:pdl=1
-Wc,-Qlp-ip=1 -xprofile -lmtmalloc
srcalt = ompl.32
332.ammp_m: -fast -xopenmp -xalias_level=strong
-xprefetch_level=3 -lmopt -lm -lmtmalloc
Alternate Source:
330.art_m: ompm-purdue1-20040324.tar
Feedback optimization (-xprofile) is done as follows,
unless otherwise noted:
fdo_pre0: rm -rf `pwd`/../feedback.profile
PASS1: -xprofile=collect:./feedback
PASS2: -xprofile=use:./feedback
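The feedback sequence above can be sketched as a dry-run shell function that only prints the commands it would run. This is an illustration, not the harness's actual build logic: the compiler driver name `cc`, the source file `bench.c`, the output name `bench`, and the training run between the two passes are assumptions; only the `-xprofile` options and the `fdo_pre0` cleanup come from the notes.

```shell
#!/bin/sh
# Dry-run sketch of the two-pass feedback build described above.
# Hypothetical names (not from the report): cc, bench.c, bench.
CC="${CC:-cc}"
SRC="${SRC:-bench.c}"
OUT="${OUT:-bench}"
# Flags common to both passes; taken from the C base flags for illustration.
BASE_FLAGS="-fast -xopenmp"

fdo_plan() {
    # fdo_pre0: discard any stale profile data before collecting.
    echo "rm -rf $(pwd)/../feedback.profile"
    # PASS1: build an instrumented binary that records a profile.
    echo "$CC $BASE_FLAGS -xprofile=collect:./feedback -o $OUT $SRC"
    # Assumed training run that writes the profile data.
    echo "./$OUT"
    # PASS2: rebuild, letting the compiler use the collected profile.
    echo "$CC $BASE_FLAGS -xprofile=use:./feedback -o $OUT $SRC"
}

fdo_plan
```

Printing the commands rather than executing them keeps the sketch usable on a machine without the Sun compilers installed.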
Base and Peak User Environment:
setenv SUNW_MP_GUIDED_SCHED_WEIGHT 2
setenv STACKSIZE 2048
setenv OMP_NUM_THREADS 16
setenv OMP_DYNAMIC FALSE
setenv SUNW_MP_PROCBIND "0 8 1 9 2 10 3 11 4 12 5 13 6 14 7 15"
setenv MPSSSTACK 4M
setenv MPSSHEAP 4M
setenv LD_PRELOAD mpss.so.1
ulimit -s 524288
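To make the SUNW_MP_PROCBIND list easier to read, the sketch below prints one thread-to-processor pair per line. It assumes that threads are bound to the listed processor IDs in list order, which is an assumption about the runtime rather than something stated in these notes; the function name `procbind_map` is hypothetical.

```shell
#!/bin/sh
# Sketch: expand the processor list into thread -> processor pairs.
# Assumption: thread N is bound to the Nth entry of the list.
PROCBIND="0 8 1 9 2 10 3 11 4 12 5 13 6 14 7 15"
OMP_NUM_THREADS=16

procbind_map() {
    thread=0
    for cpu in $PROCBIND; do
        # Stop once every OpenMP thread has been assigned a processor.
        [ "$thread" -lt "$OMP_NUM_THREADS" ] || break
        echo "thread $thread -> processor $cpu"
        thread=$((thread + 1))
    done
}

procbind_map
```

Under that assumption, consecutive threads alternate between processor IDs 0 to 7 and 8 to 15, so thread 0 maps to processor 0 and thread 1 to processor 8.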
Kernel Parameters (/etc/system):
set autoup=900
set tune_t_fsflushr=1