SPEC® CFP2006 Result

Copyright 2006-2017 Standard Performance Evaluation Corporation

Sugon

Sugon A620-G30 (AMD EPYC 7351)

CPU2006 license: 9046 Test date: Dec-2017
Test sponsor: Sugon Hardware Availability: Dec-2017
Tested by: Sugon Software Availability: Oct-2017
Benchmark results graph
Hardware
CPU Name: AMD EPYC 7351
CPU Characteristics: AMD Turbo CORE technology up to 2.90 GHz
CPU MHz: 2400
FPU: Integrated
CPU(s) enabled: 32 cores, 2 chips, 16 cores/chip, 2 threads/core
CPU(s) orderable: 1,2 chips
Primary Cache: 64 KB I + 32 KB D on chip per core
Secondary Cache: 512 KB I+D on chip per core
L3 Cache: 64 MB I+D on chip per chip, 8 MB shared / 2 cores
Other Cache: None
Memory: 512 GB (16 x 32 GB 2Rx4 PC4-2667V-R, running at
2400)
Disk Subsystem: 1 x 3000 GB SATA, 7200 RPM
Other Hardware: None
Software
Operating System: Red Hat Enterprise Linux Server 7.4
Kernel 3.10.0-693.2.2
Compiler: C/C++/Fortran: Version 4.5.2.1 of x86 Open64
Compiler Suite (from AMD)
Auto Parallel: No
File System: ext4
System State: Run level 3 (Multi User)
Base Pointers: 64-bit
Peak Pointers: 32/64-bit
Other Software: None

Results Table

Benchmark Base Peak
Copies Seconds Ratio Seconds Ratio Seconds Ratio Copies Seconds Ratio Seconds Ratio Seconds Ratio
Results appear in the order in which they were run. Bold underlined text indicates a median measurement.
410.bwaves 64 707 1230 707 1230 707 1230 32 334 1300 334 1300 334 1300
416.gamess 64 1060 1180 1063 1180 1063 1180 64 943 1330 946 1320 945 1330
433.milc 64 613 958 614 956 614 956 32 242 1210 242 1210 242 1210
434.zeusmp 64 334 1740 335 1740 341 1710 64 324 1790 322 1810 323 1800
435.gromacs 64 393 1160 392 1170 393 1160 64 290 1580 293 1560 290 1570
436.cactusADM 64 439 1740 435 1760 437 1750 32 200 1910 208 1840 210 1820
437.leslie3d 64 675 891 676 890 677 888 32 289 1040 283 1060 285 1060
444.namd 64 491 1050 490 1050 491 1050 64 413 1240 412 1250 412 1250
447.dealII 64 324 2260 325 2260 322 2270 64 305 2400 304 2400 310 2360
450.soplex 64 582 917 582 917 582 917 32 293 910 294 908 294 908
453.povray 64 233 1460 233 1460 234 1460 64 185 1840 184 1850 186 1830
454.calculix 64 321 1640 322 1640 322 1640 64 343 1540 341 1550 342 1540
459.GemsFDTD 64 836 812 839 809 837 811 32 418 812 424 801 420 808
465.tonto 64 467 1350 465 1350 464 1360 32 244 1290 244 1290 244 1290
470.lbm 64 611 1440 612 1440 611 1440 32 278 1580 277 1590 279 1580
481.wrf 64 523 1370 524 1360 524 1360 32 272 1310 276 1300 270 1320
482.sphinx3 64 1063 1170 1068 1170 1068 1170 32 444 1400 446 1400 446 1400

Submit Notes

The config file option 'submit' was used.
'numactl' was used to bind copies to the cores.
See the configuration file for details.

Operating System Notes

'ulimit -s unlimited' was used to set environment stack size
'ulimit -l 2097152' was used to set environment locked pages in memory limit

runspec command invoked through numactl i.e.:
numactl --interleave=all runspec <etc>

Set dirty_ratio=8 to limit dirty cache to 8% of memory
Set swappiness=1 to swap only if necessary
Set zone_reclaim_mode=1 to free local node memory and avoid remote memory
sync then drop_caches=3 to reset caches before invoking runcpu

Transparent huge pages were enabled for this run (OS default)

Set vm/nr_hugepages=57344 in /etc/sysctl.conf
mount -t hugetlbfs nodev /mnt/hugepages

Platform Notes

BIOS settings:
Determinism Slider = Power
cTDP Control = Manual
cTDP = 200
This system Sugon A620-G30 is electrically equal with Sugon A420-G30 populated with the same processors and memories.

General Notes

Environment variables set by runspec before the start of the run:
HUGETLB_LIMIT = "896"
LD_LIBRARY_PATH = "/home/cpu2006/amd1603-rate-libs-revB/32:/home/cpu2006/amd1603-rate-libs-revB/64"

The binaries were built with the AMD supported x86 Open64 Compiler Suite,
which is only available from AMD at
http://developer.amd.com/tools-and-sdks/cpu-development/x86-open64-compiler-suite/
Binaries were compiled on a system with 2 x AMD Opteron 6378 chips + 128 GB Memory using RHEL 6.3

Base Compiler Invocation

C benchmarks:

 opencc 

C++ benchmarks:

 openCC 

Fortran benchmarks:

 openf95 

Benchmarks using both Fortran and C:

 opencc   openf95 

Base Portability Flags

410.bwaves:  -DSPEC_CPU_LP64 
416.gamess:  -DSPEC_CPU_LP64 
433.milc:  -DSPEC_CPU_LP64 
434.zeusmp:  -DSPEC_CPU_LP64 
435.gromacs:  -DSPEC_CPU_LP64 
436.cactusADM:  -DSPEC_CPU_LP64   -fno-second-underscore 
437.leslie3d:  -DSPEC_CPU_LP64 
444.namd:  -DSPEC_CPU_LP64 
447.dealII:  -DSPEC_CPU_LP64 
450.soplex:  -DSPEC_CPU_LP64 
453.povray:  -DSPEC_CPU_LP64 
454.calculix:  -DSPEC_CPU_LP64 
459.GemsFDTD:  -DSPEC_CPU_LP64 
465.tonto:  -DSPEC_CPU_LP64 
470.lbm:  -DSPEC_CPU_LP64 
481.wrf:  -DSPEC_CPU_LINUX   -DSPEC_CPU_CASE_FLAG   -DSPEC_CPU_LP64   -fno-second-underscore 
482.sphinx3:  -DSPEC_CPU_LP64 

Base Optimization Flags

C benchmarks:

 -Ofast   -OPT:malloc_alg=1   -HP:bd=2m:heap=2m   -IPA:plimit=8000   -IPA:small_pu=100   -mso   -march=bdver1   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs 

C++ benchmarks:

 -Ofast   -static   -CG:load_exe=0   -OPT:malloc_alg=1   -INLINE:aggressive=on   -HP:bd=2m:heap=2m   -D__OPEN64_FAST_SET   -march=bdver2   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs 

Fortran benchmarks:

 -Ofast   -LNO:blocking=off   -LNO:simd_peel_align=on   -OPT:rsqrt=2   -OPT:unroll_size=256   -HP:bd=2m:heap=2m   -mso   -march=bdver1   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs 

Benchmarks using both Fortran and C:

 -Ofast   -OPT:malloc_alg=1   -HP:bd=2m:heap=2m   -IPA:plimit=8000   -IPA:small_pu=100   -mso   -march=bdver1   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs   -LNO:blocking=off   -LNO:simd_peel_align=on   -OPT:rsqrt=2   -OPT:unroll_size=256 

Peak Compiler Invocation

C benchmarks:

 opencc 

C++ benchmarks:

 openCC 

Fortran benchmarks:

 openf95 

Benchmarks using both Fortran and C:

 opencc   openf95 

Peak Portability Flags

410.bwaves:  -DSPEC_CPU_LP64 
416.gamess:  -DSPEC_CPU_LP64 
433.milc:  -DSPEC_CPU_LP64 
434.zeusmp:  -DSPEC_CPU_LP64 
435.gromacs:  -DSPEC_CPU_LP64 
436.cactusADM:  -DSPEC_CPU_LP64   -fno-second-underscore 
437.leslie3d:  -DSPEC_CPU_LP64 
444.namd:  -DSPEC_CPU_LP64 
453.povray:  -DSPEC_CPU_LP64 
454.calculix:  -DSPEC_CPU_LP64 
459.GemsFDTD:  -DSPEC_CPU_LP64 
465.tonto:  -DSPEC_CPU_LP64 
470.lbm:  -DSPEC_CPU_LP64 
481.wrf:  -DSPEC_CPU_LINUX   -DSPEC_CPU_CASE_FLAG   -DSPEC_CPU_LP64   -fno-second-underscore 

Peak Optimization Flags

C benchmarks:

433.milc:  -Ofast   -CG:movnti=1   -CG:locs_best=on   -HP:bdt=2m:heap=2m   -IPA:plimit=7000   -IPA:callee_limit=1200   -OPT:struct_array_copy=2   -OPT:alias=field_sensitive   -mso   -march=bdver1   -mno-fma4 
470.lbm:  -Ofast   -CG:cmp_peep=on   -OPT:keep_ext=on   -HP:bdt=2m:heap=2m   -IPA:plimit=8000   -IPA:small_pu=100   -march=bdver1   -mno-fma4   -mso 
482.sphinx3:  -Ofast   -m32   -IPA:plimit=1000   -OPT:malloc_alg=2   -CG:cmp_peep=on   -CG:p2align=0   -CG:load_exe=1   -CG:dsched=on   -INLINE:aggressive=on   -LNO:prefetch=2   -LNO:prefetch_ahead=4   -mso   -march=bdver2   -WB,   -mno-fma4   -mno-tbm   -mno-xop 

C++ benchmarks:

444.namd:  -Ofast   -IPA:plimit=3000   -LNO:ignore_feedback=off   -CG:local_sched_alg=0   -CG:load_exe=0   -OPT:unroll_size=256   -fno-exceptions   -HP:bdt=2m:heap=2m   -LNO:if_select_conv=1   -OPT:alias=disjoint   -LNO:psimd_iso_unroll=ON   -march=bdver2   -mno-fma4   -WB,   -mno-xop   -mno-tbm 
447.dealII:  -Ofast   -D__OPEN64_FAST_SET   -static   -INLINE:aggressive=on   -LNO:opt=1   -LNO:simd=2   -fno-emit-exceptions   -m32   -OPT:unroll_times_max=8   -OPT:unroll_size=256   -OPT:unroll_level=2   -HP:bdt=2m:heap=2m   -GRA:unspill=on   -CG:cmp_peep=on   -CG:movext_icmp=off   -TENV:frame_pointer=off   -march=bdver1   -mno-fma4 
450.soplex:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -O3   -LNO:ignore_feedback=off   -INLINE:aggressive=on   -OPT:RO=1   -OPT:IEEE_arith=3   -OPT:IEEE_NaN_Inf=off   -OPT:fold_unsigned_relops=on   -fno-exceptions   -CG:p2align=0   -m32   -mno-fma4   -HP:bdt=2m:heap=2m   -WOPT:sib=on   -march=bdver1 
453.povray:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -CG:pre_local_sched=off   -CG:p2align=0   -CG:p2align_split=on   -CG:dsched=on   -INLINE:aggressive=on   -HP:bd=2m:heap=2m   -OPT:transform=2   -OPT:alias=disjoint   -WOPT:aggcm=0   -march=bdver2   -mno-fma4   -WB,   -mno-xop   -mno-tbm   -Wl,   -z,muldefs 

Fortran benchmarks:

410.bwaves:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -OPT:Ofast   -OPT:treeheight=on   -LNO:blocking=off   -LNO:ignore_feedback=off   -LNO:fu=4   -LNO:loop_model_simd=on   -LNO:simd_rm_unity_remainder=on   -WOPT:aggstr=0   -HP:bdt=2m:heap=2m   -CG:cmp_peep=on   -march=bdver2   -mno-fma4 
416.gamess:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:fu=6   -LNO:blocking=0   -LNO:simd=2   -OPT:ro=3   -OPT:recip=on   -CG:local_sched_alg=1   -HP:bdt=2m:heap=2m   -WOPT:sib=on   -march=bdver1   -mno-fma4 
434.zeusmp:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:blocking=off   -LNO:interchange=off   -IPA:plimit=1500   -HP:bdt=2m:heap=2m   -march=bdver2   -mno-fma4 
437.leslie3d:  -Ofast   -CG:pre_minreg_level=2   -LNO:simd=0   -LNO:fusion=2   -HP:bdt=2m:heap=2m   -mso   -march=bdver1   -mno-fma4 
459.GemsFDTD:  -Ofast   -IPA:plimit=1500   -OPT:unroll_size=1024   -OPT:unroll_times_max=16   -LNO:fission=2   -CG:local_sched_alg=2   -HP   -march=bdver1   -mno-fma4 
465.tonto:  -Ofast   -OPT:alias=no_f90_pointer_alias   -LNO:blocking=off   -CG:load_exe=1   -CG:local_sched_alg=3   -IPA:plimit=525   -HP:bdt=2m:heap=2m   -march=bdver2   -WB,   -mno-fma4   -mno-tbm   -mno-xop 

Benchmarks using both Fortran and C:

435.gromacs:  -Ofast   -OPT:rsqrt=2   -HP:bdt=2m:heap=2m   -CG:local_sched_alg=2   -CG:load_exe=3   -GRA:unspill=on   -march=bdver2   -mno-fma4   -LNO:simd=3 
436.cactusADM:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:blocking=off   -LNO:prefetch=2   -LNO:pf2=0   -LNO:prefetch_ahead=4   -HP   -CG:locs_shallow_depth=1   -CG:load_exe=0   -CG:dsched=on   -WOPT:sib=on   -march=bdver2   -mno-fma4 
454.calculix:  -Ofast   -OPT:unroll_size=256   -OPT:alias=disjoint   -GRA:optimize_boundary=on   -CG:dsched=on   -HP:bdt=2m:heap=2m   -march=bdver1   -mno-fma4 
481.wrf:  -Ofast   -LNO:blocking=off   -LANG:copyinout=off   -IPA:callee_limit=5000   -GRA:prioritize_by_density=on   -HP   -WOPT:sib=on   -march=bdver1   -mno-fma4 

The flags files that were used to format this result can be browsed at
http://www.spec.org/cpu2006/flags/x86-openflags-rate-revA-I.html,
http://www.spec.org/cpu2006/flags/Sugon-Naples-Platform-Settings-revC-I.html.

You can also download the XML flags sources by saving the following links:
http://www.spec.org/cpu2006/flags/x86-openflags-rate-revA-I.xml,
http://www.spec.org/cpu2006/flags/Sugon-Naples-Platform-Settings-revC-I.xml.