Baseline optimization flags:
C programs: -openmp -O3 -ipo -ansi -ansi_alias (ONESTEP)
Fortran programs: -openmp -O3 -ipo (ONESTEP)
OpenMP runtime library libguide.a statically linked
Extra Flags:
331.art_l: -DINTS_PER_CACHELINE=32 -DDBLS_PER_CACHELINE=16
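The two macros above agree on a cache line size: with a 4-byte int and an 8-byte double, 32 ints and 16 doubles both come to 128 bytes. The sketch below checks that arithmetic and shows one common use of such macros, padding per-thread data to a full cache line to avoid false sharing. How 331.art_l actually uses the macros is not stated in these notes; the padded_counter type and the helper functions are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Defaults mirror the extra flags listed above; normally they would be
   supplied on the compiler command line with -D. */
#ifndef INTS_PER_CACHELINE
#define INTS_PER_CACHELINE 32
#endif
#ifndef DBLS_PER_CACHELINE
#define DBLS_PER_CACHELINE 16
#endif

static_assert(INTS_PER_CACHELINE > 1, "need room for at least one pad element");

/* Hypothetical per-thread counter, padded so that each instance fills one
   cache line (an assumed use of the macro, not taken from the benchmark). */
typedef struct {
    int value;
    int pad[INTS_PER_CACHELINE - 1];
} padded_counter;

/* Cache line size in bytes implied by each macro on the compiling platform. */
size_t cacheline_bytes_from_ints(void)
{
    return INTS_PER_CACHELINE * sizeof(int);
}

size_t cacheline_bytes_from_dbls(void)
{
    return DBLS_PER_CACHELINE * sizeof(double);
}
```

If the two helpers return different values on a given platform, the macro values are inconsistent with that platform's type sizes and would need adjusting.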
Baseline user environment:
OMP_NUM_THREADS=128
limit stacksize 256000
KMP_STACKSIZE=124M
KMP_LIBRARY=TURNAROUND
OMP_DYNAMIC=FALSE
KMP_SCHEDULE=static,balanced
Alternate sources:
A critical region was added around the update of a linked list in a parallel loop.
The approved src.alt is available as ompl-purdue1-20040324.tar.gz.
It was used for 331.art_l base.
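The kind of change described above can be sketched as follows. This is a minimal illustration of protecting a shared linked-list update with an OpenMP critical region, not the actual src.alt patch; the node type and the functions build_list, list_length and free_list are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int value;
    struct node *next;
} node;

/* Builds a list holding the values 0..n-1 from a parallel loop.
   Node order is not deterministic when several threads run. */
node *build_list(int n)
{
    node *head = NULL;
    int i;

    assert(n >= 0);

    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        node *item = malloc(sizeof *item);
        if (item == NULL)
            continue;               /* skip this element on allocation failure */
        item->value = i;

        /* Only one thread at a time may read and rewrite the shared head
           pointer; without this region, concurrent pushes can lose nodes. */
        #pragma omp critical
        {
            item->next = head;
            head = item;
        }
    }
    return head;
}

/* Counts the nodes in the list. */
int list_length(const node *head)
{
    int count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Releases every node in the list. */
void free_list(node *head)
{
    while (head != NULL) {
        node *next = head->next;
        free(head);
        head = next;
    }
}
```

Compiled without OpenMP support, the pragmas are ignored and the loop runs serially with the same result. The allocation and the assignment to item->value stay outside the critical region so that only the pointer update is serialized.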
For all benchmarks, threads were bound to CPUs using the following submit command:
dplace -x2 -cNTM1,0 $command,
where NTM1 is the number of threads minus 1.
This binds threads in order of creation, beginning with the master
thread on CPU NTM1, the first slave thread on CPU NTM1-1, and so on.
The -x2 flag instructs dplace to skip placement of the lightweight
OpenMP monitor thread, which is created prior to the slave threads.
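The placement rule described above reduces to simple arithmetic, sketched here. The function name cpu_for_placed_thread is hypothetical, and the sketch assumes that the monitor thread skipped by -x2 is not counted, so index 0 is the master thread and index 1 is the first slave thread.

```c
#include <assert.h>

/* Returns the CPU that the k-th placed thread is bound to, where k = 0 is
   the master thread and k = 1 is the first slave thread. Threads are laid
   out in descending CPU order starting at NTM1 = num_threads - 1. */
int cpu_for_placed_thread(int k, int num_threads)
{
    assert(num_threads > 0);
    assert(k >= 0 && k < num_threads);

    int ntm1 = num_threads - 1;     /* NTM1 in the submit command */
    return ntm1 - k;
}
```

With OMP_NUM_THREADS=128 as in the baseline environment, NTM1 is 127: the master thread lands on CPU 127, the first slave thread on CPU 126, and the last slave thread on CPU 0.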