SPEC® CFP2006 Result

Copyright 2006-2017 Standard Performance Evaluation Corporation

Sugon

Sugon A620-G30 (AMD EPYC 7451)

CPU2006 license: 9046
Test date: Dec-2017
Test sponsor: Sugon
Hardware Availability: Dec-2017
Tested by: Sugon
Software Availability: Oct-2017
Benchmark results graph
Hardware
CPU Name: AMD EPYC 7451
CPU Characteristics: AMD Turbo CORE technology up to 3.20 GHz
CPU MHz: 2300
FPU: Integrated
CPU(s) enabled: 48 cores, 2 chips, 24 cores/chip, 2 threads/core
CPU(s) orderable: 1,2 chips
Primary Cache: 64 KB I + 32 KB D on chip per core
Secondary Cache: 512 KB I+D on chip per core
L3 Cache: 64 MB I+D on chip per chip, 8 MB shared / 3 cores
Other Cache: None
Memory: 1 TB (16 x 64 GB 2S2Rx4 PC4-2667V-R, running at 2400)
Disk Subsystem: 1 x 2000 GB SATA, 7200 RPM
Other Hardware: None
Software
Operating System: Red Hat Enterprise Linux Server 7.4, Kernel 3.10.0-693.2.2
Compiler: C/C++/Fortran: Version 4.5.2.1 of x86 Open64 Compiler Suite (from AMD)
Auto Parallel: No
File System: ext4
System State: Run level 3 (Multi User)
Base Pointers: 64-bit
Peak Pointers: 32/64-bit
Other Software: None

Results Table

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

Benchmark      Base                                              Peak
               Copies Seconds Ratio Seconds Ratio Seconds Ratio  Copies Seconds Ratio Seconds Ratio Seconds Ratio
410.bwaves         96    1077  1210    1079  1210    1079  1210      48     501  1300     501  1300     500  1300
416.gamess         96    1064  1770    1062  1770    1063  1770      96     947  1990     943  1990     943  1990
433.milc           96     890   990     891   989     890   990      48     358  1230     358  1230     358  1230
434.zeusmp         96     407  2150     403  2170     404  2160      96     401  2180     394  2210     398  2200
435.gromacs        96     392  1750     393  1740     396  1730      96     291  2350     291  2360     292  2350
436.cactusADM      96     524  2190     527  2180     524  2190      48     250  2300     236  2430     241  2380
437.leslie3d       96    1016   889    1026   880    1024   881      48     377  1200     375  1200     376  1200
444.namd           96     489  1570     489  1570     489  1580      96     412  1870     411  1870     410  1880
447.dealII         96     343  3200     341  3220     344  3200      96     326  3370     326  3370     323  3400
450.soplex         96     834   960     834   960     835   959      48     377  1060     380  1050     377  1060
453.povray         96     231  2210     234  2180     232  2210      96     187  2730     184  2770     186  2740
454.calculix       96     326  2430     325  2430     327  2420      96     350  2260     361  2190     346  2290
459.GemsFDTD       96    1267   804    1274   799    1266   805      48     577   882     577   883     576   885
465.tonto          96     521  1810     524  1800     525  1800      48     252  1870     251  1880     253  1870
470.lbm            96     771  1710     766  1720     779  1690      48     361  1830     360  1830     358  1840
481.wrf            96     718  1490     717  1500     718  1490      48     348  1540     347  1540     347  1540
482.sphinx3        96    1473  1270    1479  1270    1480  1260      48     517  1810     516  1810     518  1810
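
SPEC CPU2006 reports the median of the three runs for each benchmark and forms the overall metric as the geometric mean of those medians. The sketch below illustrates that procedure on three base rows copied from the table above. The helper names `median_ratio` and `geometric_mean` are illustrative and are not part of the SPEC tools, and the mean it computes covers only this three-benchmark subset, so it is not the overall SPECfp_rate figure for this result.

```python
import math
from statistics import median

# (seconds, ratio) for the three base runs, copied from the results table.
base_runs = {
    "410.bwaves": [(1077, 1210), (1079, 1210), (1079, 1210)],
    "433.milc": [(890, 990), (891, 989), (890, 990)],
    "459.GemsFDTD": [(1267, 804), (1274, 799), (1266, 805)],
}


def median_ratio(runs):
    """Return the median of the reported ratios for one benchmark."""
    return median(ratio for _seconds, ratio in runs)


def geometric_mean(values):
    """Geometric mean, computed through logarithms to avoid overflow."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


medians = {name: median_ratio(runs) for name, runs in base_runs.items()}
subset_mean = geometric_mean(medians.values())

for name, value in medians.items():
    print(f"{name}: median base ratio {value}")
print(f"geometric mean over this subset only: {subset_mean:.0f}")
```

Extending `base_runs` to all 17 benchmarks would give an estimate of the base metric, limited by the rounding already applied to the ratios in the table.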

Submit Notes

The config file option 'submit' was used.
'numactl' was used to bind copies to the cores.
See the configuration file for details.
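
The copy counts in the results table line up with the hardware description: 2 chips x 24 cores x 2 threads gives 96 logical CPUs (the base copy count), and 48 matches the number of physical cores. The sketch below shows that arithmetic and one way a per-copy binding command could be built. The actual submit line lives in the configuration file, which is not reproduced here, so the `bind_command` helper, the copy-to-CPU mapping, and the choice of `--localalloc` are assumptions for illustration only.

```python
# Values taken from the Hardware section of this result.
CHIPS = 2
CORES_PER_CHIP = 24
THREADS_PER_CORE = 2

physical_cores = CHIPS * CORES_PER_CHIP            # 48
logical_cpus = physical_cores * THREADS_PER_CORE   # 96


def bind_command(copy_num, benchmark_cmd):
    """Build an argv list that pins one copy to one logical CPU.

    Hypothetical mapping: copy N runs on logical CPU N. The real
    mapping used for this result is defined in the config file.
    """
    return [
        "numactl",
        "--localalloc",                 # allocate memory on the local NUMA node
        f"--physcpubind={copy_num}",    # restrict the copy to a single CPU
    ] + list(benchmark_cmd)


print(physical_cores, logical_cpus)
print(" ".join(bind_command(0, ["./benchmark_binary"])))
```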

Operating System Notes

'ulimit -s unlimited' was used to set environment stack size
'ulimit -l 2097152' was used to set environment locked pages in memory limit

The runspec command was invoked through numactl, i.e.:
numactl --interleave=all runspec <etc>

Set dirty_ratio=8 to limit dirty cache to 8% of memory
Set swappiness=1 to swap only if necessary
Set zone_reclaim_mode=1 to free local node memory and avoid remote memory
sync then drop_caches=3 to reset caches before invoking runspec
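
To confirm that a system matches the three VM tunables listed above before a run, a small check along the following lines could be used. It is a sketch that assumes the usual Linux `/proc/sys/vm` layout. The function name `check_vm_settings` is hypothetical, and the script only reads values and changes nothing.

```python
from pathlib import Path

# Tunables and values as stated in the Operating System Notes.
EXPECTED = {
    "dirty_ratio": "8",
    "swappiness": "1",
    "zone_reclaim_mode": "1",
}


def check_vm_settings(root="/proc/sys/vm"):
    """Return {name: (expected, actual)} for tunables that differ.

    A tunable whose file does not exist is reported with actual=None.
    """
    mismatches = {}
    for name, expected in EXPECTED.items():
        path = Path(root) / name
        actual = path.read_text().strip() if path.exists() else None
        if actual != expected:
            mismatches[name] = (expected, actual)
    return mismatches


if __name__ == "__main__":
    for name, (expected, actual) in check_vm_settings().items():
        print(f"{name}: expected {expected}, found {actual}")
```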

Transparent huge pages were enabled for this run (OS default)

Set vm/nr_hugepages=86016 in /etc/sysctl.conf
mount -t hugetlbfs nodev /mnt/hugepages

General Notes

Environment variables set by runspec before the start of the run:
HUGETLB_LIMIT = "896"
LD_LIBRARY_PATH = "/home/cpu2006/amd1603-rate-libs-revB/32:/home/cpu2006/amd1603-rate-libs-revB/64"
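
The huge page settings can be cross-checked with simple arithmetic. The sketch below assumes a 2 MB huge page size (the x86-64 default, and consistent with the `2m` values in the `-HP` flags listed later) and assumes that HUGETLB_LIMIT is a per-process count of huge pages. Neither assumption is stated in this result.

```python
NR_HUGEPAGES = 86016     # vm/nr_hugepages from the Operating System Notes
HUGETLB_LIMIT = 896      # environment variable set by runspec
BASE_COPIES = 96         # base copy count from the results table
HUGE_PAGE_MIB = 2        # assumption: 2 MB huge pages

# Total size of the reserved huge page pool, in GiB.
pool_gib = NR_HUGEPAGES * HUGE_PAGE_MIB / 1024

# Huge pages available per copy if the pool is split evenly.
pages_per_copy = NR_HUGEPAGES // BASE_COPIES

print(f"huge page pool: {pool_gib:.0f} GiB")
print(f"pages per copy: {pages_per_copy}")
print(f"matches HUGETLB_LIMIT: {pages_per_copy == HUGETLB_LIMIT}")
```

Under these assumptions the pool is sized so that each of the 96 base copies can reach its HUGETLB_LIMIT without exhausting the reservation.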

The binaries were built with the AMD supported x86 Open64 Compiler Suite,
which is only available from AMD at
http://developer.amd.com/tools-and-sdks/cpu-development/x86-open64-compiler-suite/
Binaries were compiled on a system with 2 x AMD Opteron 6378 chips + 128 GB Memory using RHEL 6.3

Base Compiler Invocation

C benchmarks:

 opencc 

C++ benchmarks:

 openCC 

Fortran benchmarks:

 openf95 

Benchmarks using both Fortran and C:

 opencc   openf95 

Base Portability Flags

410.bwaves:  -DSPEC_CPU_LP64 
416.gamess:  -DSPEC_CPU_LP64 
433.milc:  -DSPEC_CPU_LP64 
434.zeusmp:  -DSPEC_CPU_LP64 
435.gromacs:  -DSPEC_CPU_LP64 
436.cactusADM:  -DSPEC_CPU_LP64   -fno-second-underscore 
437.leslie3d:  -DSPEC_CPU_LP64 
444.namd:  -DSPEC_CPU_LP64 
447.dealII:  -DSPEC_CPU_LP64 
450.soplex:  -DSPEC_CPU_LP64 
453.povray:  -DSPEC_CPU_LP64 
454.calculix:  -DSPEC_CPU_LP64 
459.GemsFDTD:  -DSPEC_CPU_LP64 
465.tonto:  -DSPEC_CPU_LP64 
470.lbm:  -DSPEC_CPU_LP64 
481.wrf:  -DSPEC_CPU_LINUX   -DSPEC_CPU_CASE_FLAG   -DSPEC_CPU_LP64   -fno-second-underscore 
482.sphinx3:  -DSPEC_CPU_LP64 

Base Optimization Flags

C benchmarks:

 -Ofast   -OPT:malloc_alg=1   -HP:bd=2m:heap=2m   -IPA:plimit=8000   -IPA:small_pu=100   -mso   -march=bdver1   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs 

C++ benchmarks:

 -Ofast   -static   -CG:load_exe=0   -OPT:malloc_alg=1   -INLINE:aggressive=on   -HP:bd=2m:heap=2m   -D__OPEN64_FAST_SET   -march=bdver2   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs 

Fortran benchmarks:

 -Ofast   -LNO:blocking=off   -LNO:simd_peel_align=on   -OPT:rsqrt=2   -OPT:unroll_size=256   -HP:bd=2m:heap=2m   -mso   -march=bdver1   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs 

Benchmarks using both Fortran and C:

 -Ofast   -OPT:malloc_alg=1   -HP:bd=2m:heap=2m   -IPA:plimit=8000   -IPA:small_pu=100   -mso   -march=bdver1   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs   -LNO:blocking=off   -LNO:simd_peel_align=on   -OPT:rsqrt=2   -OPT:unroll_size=256 

Peak Compiler Invocation

C benchmarks:

 opencc 

C++ benchmarks:

 openCC 

Fortran benchmarks:

 openf95 

Benchmarks using both Fortran and C:

 opencc   openf95 

Peak Portability Flags

410.bwaves:  -DSPEC_CPU_LP64 
416.gamess:  -DSPEC_CPU_LP64 
433.milc:  -DSPEC_CPU_LP64 
434.zeusmp:  -DSPEC_CPU_LP64 
435.gromacs:  -DSPEC_CPU_LP64 
436.cactusADM:  -DSPEC_CPU_LP64   -fno-second-underscore 
437.leslie3d:  -DSPEC_CPU_LP64 
444.namd:  -DSPEC_CPU_LP64 
453.povray:  -DSPEC_CPU_LP64 
454.calculix:  -DSPEC_CPU_LP64 
459.GemsFDTD:  -DSPEC_CPU_LP64 
465.tonto:  -DSPEC_CPU_LP64 
470.lbm:  -DSPEC_CPU_LP64 
481.wrf:  -DSPEC_CPU_LINUX   -DSPEC_CPU_CASE_FLAG   -DSPEC_CPU_LP64   -fno-second-underscore 

Peak Optimization Flags

C benchmarks:

433.milc:  -Ofast   -CG:movnti=1   -CG:locs_best=on   -HP:bdt=2m:heap=2m   -IPA:plimit=7000   -IPA:callee_limit=1200   -OPT:struct_array_copy=2   -OPT:alias=field_sensitive   -mso   -march=bdver1   -mno-fma4 
470.lbm:  -Ofast   -CG:cmp_peep=on   -OPT:keep_ext=on   -HP:bdt=2m:heap=2m   -IPA:plimit=8000   -IPA:small_pu=100   -march=bdver1   -mno-fma4   -mso 
482.sphinx3:  -Ofast   -m32   -IPA:plimit=1000   -OPT:malloc_alg=2   -CG:cmp_peep=on   -CG:p2align=0   -CG:load_exe=1   -CG:dsched=on   -INLINE:aggressive=on   -LNO:prefetch=2   -LNO:prefetch_ahead=4   -mso   -march=bdver2   -WB,   -mno-fma4   -mno-tbm   -mno-xop 

C++ benchmarks:

444.namd:  -Ofast   -IPA:plimit=3000   -LNO:ignore_feedback=off   -CG:local_sched_alg=0   -CG:load_exe=0   -OPT:unroll_size=256   -fno-exceptions   -HP:bdt=2m:heap=2m   -LNO:if_select_conv=1   -OPT:alias=disjoint   -LNO:psimd_iso_unroll=ON   -march=bdver2   -mno-fma4   -WB,   -mno-xop   -mno-tbm 
447.dealII:  -Ofast   -D__OPEN64_FAST_SET   -static   -INLINE:aggressive=on   -LNO:opt=1   -LNO:simd=2   -fno-emit-exceptions   -m32   -OPT:unroll_times_max=8   -OPT:unroll_size=256   -OPT:unroll_level=2   -HP:bdt=2m:heap=2m   -GRA:unspill=on   -CG:cmp_peep=on   -CG:movext_icmp=off   -TENV:frame_pointer=off   -march=bdver1   -mno-fma4 
450.soplex:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -O3   -LNO:ignore_feedback=off   -INLINE:aggressive=on   -OPT:RO=1   -OPT:IEEE_arith=3   -OPT:IEEE_NaN_Inf=off   -OPT:fold_unsigned_relops=on   -fno-exceptions   -CG:p2align=0   -m32   -mno-fma4   -HP:bdt=2m:heap=2m   -WOPT:sib=on   -march=bdver1 
453.povray:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -CG:pre_local_sched=off   -CG:p2align=0   -CG:p2align_split=on   -CG:dsched=on   -INLINE:aggressive=on   -HP:bd=2m:heap=2m   -OPT:transform=2   -OPT:alias=disjoint   -WOPT:aggcm=0   -march=bdver2   -mno-fma4   -WB,   -mno-xop   -mno-tbm   -Wl,   -z,muldefs 

Fortran benchmarks:

410.bwaves:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -OPT:Ofast   -OPT:treeheight=on   -LNO:blocking=off   -LNO:ignore_feedback=off   -LNO:fu=4   -LNO:loop_model_simd=on   -LNO:simd_rm_unity_remainder=on   -WOPT:aggstr=0   -HP:bdt=2m:heap=2m   -CG:cmp_peep=on   -march=bdver2   -mno-fma4 
416.gamess:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:fu=6   -LNO:blocking=0   -LNO:simd=2   -OPT:ro=3   -OPT:recip=on   -CG:local_sched_alg=1   -HP:bdt=2m:heap=2m   -WOPT:sib=on   -march=bdver1   -mno-fma4 
434.zeusmp:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:blocking=off   -LNO:interchange=off   -IPA:plimit=1500   -HP:bdt=2m:heap=2m   -march=bdver2   -mno-fma4 
437.leslie3d:  -Ofast   -CG:pre_minreg_level=2   -LNO:simd=0   -LNO:fusion=2   -HP:bdt=2m:heap=2m   -mso   -march=bdver1   -mno-fma4 
459.GemsFDTD:  -Ofast   -IPA:plimit=1500   -OPT:unroll_size=1024   -OPT:unroll_times_max=16   -LNO:fission=2   -CG:local_sched_alg=2   -HP   -march=bdver1   -mno-fma4 
465.tonto:  -Ofast   -OPT:alias=no_f90_pointer_alias   -LNO:blocking=off   -CG:load_exe=1   -CG:local_sched_alg=3   -IPA:plimit=525   -HP:bdt=2m:heap=2m   -march=bdver2   -WB,   -mno-fma4   -mno-tbm   -mno-xop 

Benchmarks using both Fortran and C:

435.gromacs:  -Ofast   -OPT:rsqrt=2   -HP:bdt=2m:heap=2m   -CG:local_sched_alg=2   -CG:load_exe=3   -GRA:unspill=on   -march=bdver2   -mno-fma4   -LNO:simd=3 
436.cactusADM:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:blocking=off   -LNO:prefetch=2   -LNO:pf2=0   -LNO:prefetch_ahead=4   -HP   -CG:locs_shallow_depth=1   -CG:load_exe=0   -CG:dsched=on   -WOPT:sib=on   -march=bdver2   -mno-fma4 
454.calculix:  -Ofast   -OPT:unroll_size=256   -OPT:alias=disjoint   -GRA:optimize_boundary=on   -CG:dsched=on   -HP:bdt=2m:heap=2m   -march=bdver1   -mno-fma4 
481.wrf:  -Ofast   -LNO:blocking=off   -LANG:copyinout=off   -IPA:callee_limit=5000   -GRA:prioritize_by_density=on   -HP   -WOPT:sib=on   -march=bdver1   -mno-fma4 

The flags files that were used to format this result can be browsed at:
http://www.spec.org/cpu2006/flags/x86-openflags-rate-revA-I.html
http://www.spec.org/cpu2006/flags/Sugon-Naples-Platform-Settings-revC-I.html

You can also download the XML flags sources by saving the following links:
http://www.spec.org/cpu2006/flags/x86-openflags-rate-revA-I.xml
http://www.spec.org/cpu2006/flags/Sugon-Naples-Platform-Settings-revC-I.xml