The General Atomic and Molecular Electronic Structure System (GAMESS) is a publicly available application for ab initio quantum chemistry calculations on a variety of single- and multi-processor architectures. The package models molecules and reactions at the quantum level.
The code is portable: a single version serves scalar, vector, and parallel machines, on both 32- and 64-bit systems. The SPEC-HPG version is distributed only for benchmarking and is not supported for research by SPEC or Iowa State University. For a full research version of the GAMESS program, contact:
Dr. Mike Schmidt
Department of Chemistry, Iowa State University, Ames, Iowa 50011
tel: 515-294-9796; FAX: 515-294-0105
email: mike@si.fi.ameslab.gov
An on-line version of the user's guide is available.
Additional information is available at the GAMESS home page.
Source files: 109389 lines, 3957287 bytes (21% comments, 79% code)
Total subprograms: 865
  Subroutines: 813
  Functions: 51
  Program: 1
  Block Data: 0
Memory is allocated dynamically, from a fixed-length pool. Wave functions (integral blocks) are written to disk to avoid recomputation. These files can be quite large -- over 2 gigabytes for large datasets. The wave functions can also be recomputed directly for systems with poor I/O capability or insufficient disk capacity.
The program has been explicitly parallelized for the distributed-memory MIMD model. Parallelism is expressed using the TCGMSG message-passing library. For the benchmark, calls to TCGMSG are filtered to PVM3 through a library developed at IBM.
Each processor executes the same program, entering the loops over sets of electron shells for which integrals must be computed. Each processor skips most integral blocks, taking only those that it determines to have been assigned to it. The calculation is completed by adding together the partial matrices evaluated by each node from its partial integral list.
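The replicated-loop scheme described above can be illustrated with a small sketch. It is written in Python for brevity and is not GAMESS code: the "integral blocks" are toy values, the helper names (block_contribution, partial_matrix, global_sum) are hypothetical, and the rule that decides block ownership (block number modulo process count) is an assumption made for the example. Each simulated process walks the full block list, keeps only its own blocks, and the partial matrices are summed at the end.

```python
# Illustrative sketch only: toy "integral blocks" and a toy matrix, not GAMESS code.

def block_contribution(block_id, size):
    """Hypothetical stand-in for one integral block's contribution to the matrix."""
    matrix = [[0.0] * size for _ in range(size)]
    i, j = block_id % size, (block_id // size) % size
    matrix[i][j] += float(block_id + 1)
    return matrix

def add_into(target, source):
    """Accumulate source into target, element by element."""
    for row_t, row_s in zip(target, source):
        for k, value in enumerate(row_s):
            row_t[k] += value

def partial_matrix(me, nproc, nblocks, size):
    """What one process computes: it visits every block but keeps only its own."""
    partial = [[0.0] * size for _ in range(size)]
    for block_id in range(nblocks):
        if block_id % nproc != me:  # assumed ownership rule for this sketch
            continue
        add_into(partial, block_contribution(block_id, size))
    return partial

def global_sum(partials, size):
    """Stand-in for the final sum of partial matrices across nodes."""
    total = [[0.0] * size for _ in range(size)]
    for partial in partials:
        add_into(total, partial)
    return total

NPROC, NBLOCKS, SIZE = 4, 37, 3
partials = [partial_matrix(me, NPROC, NBLOCKS, SIZE) for me in range(NPROC)]
parallel_result = global_sum(partials, SIZE)

# A single "process" that owns every block gives the serial reference result.
serial_result = partial_matrix(0, 1, NBLOCKS, SIZE)
```

Because every block is owned by exactly one process, the summed partial matrices match the serial matrix, which is the property the final reduction step relies on.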
The explicitly parallel program is said to scale well up to 200 processors.
The input file specifies one of two load balancing schemes:
Loop-level balancing: each process takes regular turns, evaluating every nth block of integrals (for n processes), skipping over the others.
Dynamic load balancing: an integral block is assigned to each processor as it finishes its previous task.
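The practical difference between the two schemes shows up when block costs are uneven. The following Python sketch is a simplified model, not a measurement of GAMESS: the block costs are invented, task assignment is assumed to cost nothing, and the function names are hypothetical. It compares the per-process work under fixed turn-taking with the work under first-free-process assignment.

```python
import heapq

def round_robin_loads(costs, nproc):
    """Loop-level balancing: process p takes every nproc-th block."""
    loads = [0.0] * nproc
    for index, cost in enumerate(costs):
        loads[index % nproc] += cost
    return loads

def dynamic_loads(costs, nproc):
    """Dynamic balancing: the next block goes to whichever process is free first."""
    # Heap of (time at which the process becomes free, process id).
    free_at = [(0.0, p) for p in range(nproc)]
    heapq.heapify(free_at)
    loads = [0.0] * nproc
    for cost in costs:
        free_time, p = heapq.heappop(free_at)
        loads[p] += cost
        heapq.heappush(free_at, (free_time + cost, p))
    return loads

# Hypothetical costs: every fourth block is eight times more expensive.
costs = [8.0 if i % 4 == 0 else 1.0 for i in range(40)]
rr = round_robin_loads(costs, 4)
dyn = dynamic_loads(costs, 4)

print("round-robin busiest process:", max(rr))
print("dynamic busiest process:    ", max(dyn))
```

The cost pattern is deliberately adversarial for turn-taking: with four processes, every expensive block lands on the same process. Dynamic assignment spreads the same total work more evenly, at the price of a shared counter that every process must consult.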
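These strategies correspond to parallelizing at different levels of the quadruply nested loop that dominates the serial execution profile: dynamic load balancing is applied at the second (JJ) loop, loop-level balancing at the innermost (LL) loop.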
      SUBROUTINE TWOEI
C
C     ME    = this processor's ID number
C     NPROC = number of processors
C
C     initialize parallel work
C
      IPCOUNT = ME - 1
      NEXT = -1
      MINE = -1
C
C     begin the four loops over the electron shell sets
C
      DO 920 II =
         .
         .
         DO 900 JJ =
            IF (dynamic load balancing) THEN
               MINE = MINE + 1
               IF (MINE.GT.NEXT) NEXT = NXTVAL()
               IF (NEXT.NE.MINE) GO TO 900
            END IF
            .
            .
            DO 880 KK =
               .
               .
               DO 860 LL =
                  IF (loop-level balancing) THEN
                     IPCOUNT = IPCOUNT + 1
                     IF (MOD(IPCOUNT,NPROC).NE.0) GO TO 860
                  END IF
                  .
                  .
C
C                 Generate integral block
C                 Write to disk or place directly in matrix
C
                  .
                  .
  860          CONTINUE
  880       CONTINUE
  900    CONTINUE
  920 CONTINUE
      END
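The skip tests in the skeleton above can be checked with a small runnable model. The Python sketch below is an illustration under stated assumptions, not the GAMESS implementation: the four nested loops are flattened into a single task index, threads stand in for processes, process IDs are assumed to start at 0, and SharedCounter is a hypothetical stand-in for the NXTVAL() shared counter. The variable names mine, nxt, and ipcount mirror MINE, NEXT, and IPCOUNT.

```python
import threading

class SharedCounter:
    """Hypothetical stand-in for NXTVAL(): hands out 0, 1, 2, ... once each."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def nxtval(self):
        with self._lock:
            value = self._value
            self._value += 1
            return value

def dynamic_worker(counter, ntasks, claimed):
    """Mirrors the MINE/NEXT test: claim a task only when it matches the counter."""
    mine, nxt = -1, -1
    for task in range(ntasks):
        mine += 1
        if mine > nxt:
            nxt = counter.nxtval()
        if nxt != mine:
            continue  # corresponds to GO TO 900
        claimed.append(task)

def loop_level_worker(me, nproc, ntasks, claimed):
    """Mirrors the IPCOUNT test: claim every nproc-th task, offset by me."""
    ipcount = me - 1
    for task in range(ntasks):
        ipcount += 1
        if ipcount % nproc != 0:
            continue  # corresponds to GO TO 860
        claimed.append(task)

NPROC, NTASKS = 4, 50

# Dynamic scheme: one thread per simulated process, all sharing one counter.
counter = SharedCounter()
dynamic_claims = [[] for _ in range(NPROC)]
threads = [threading.Thread(target=dynamic_worker,
                            args=(counter, NTASKS, dynamic_claims[p]))
           for p in range(NPROC)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Loop-level scheme: deterministic, so no threads are needed to check it.
static_claims = [[] for _ in range(NPROC)]
for me in range(NPROC):
    loop_level_worker(me, NPROC, NTASKS, static_claims[me])

all_dynamic = sorted(task for claims in dynamic_claims for task in claims)
all_static = sorted(task for claims in static_claims for task in claims)
```

In both schemes every task ends up claimed exactly once. Which worker claims which task is fixed in the loop-level scheme but depends on thread timing in the dynamic one.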
The original paper(*) outlines these strategies in detail.
(*) Michael W. Schmidt, et al., "General Atomic and Molecular Electronic Structure System," Journal of Computational Chemistry, Vol. 14, No. 11, pp. 1347-1363 (1993)
The code was profiled extensively on an SGI Challenge (running IRIX 5.3).
Serial execution profiles
Direct SCF (Self-Consistent Field) calculations
Non-direct SCF calculations (integrals stored on disk)
Fri Jan 19 13:50:01 PST 1996