Evaluating SPEC CHEM2002 (GAMESS)


Program Description

The General Atomic and Molecular Electronic Structure System (GAMESS) is a publicly available application for ab initio quantum chemistry calculations on a variety of single- and multi-processor architectures. The package models molecules and reactions at the quantum level.

The code is portable: a single version serves scalar, vector, and parallel machines, on both 32- and 64-bit systems. The SPEC-HPG version is distributed only for benchmarking and is not supported for research by SPEC or Iowa State University. For a full research version of the GAMESS program, contact:

  Dr. Mike Schmidt 
  Department of Chemistry,
  Iowa State University,
  Ames, Iowa 50011
                  
  tel:   (515) 294-9796;  FAX: (515) 294-0105
  email: mike@si.fi.ameslab.gov

An on-line version of the user's guide is available.

Additional information is available at the GAMESS home page.

Program Statistics
 Source files:   109389 lines,   3957287 bytes    
                 ( 21% comments, 79% code )

 Total subprograms:    865
   Subroutines:        813
   Functions:           51
   Program:              1
   Block Data:           0
   
Memory Usage

Memory is allocated dynamically, from a fixed-length pool. Wave functions (integral blocks) are written to disk to avoid recomputation. These files can be quite large -- over 2 gigabytes for large datasets. The wave functions can also be recomputed directly for systems with poor I/O capability or insufficient disk capacity.
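The idea of handing out memory from a fixed-length pool can be sketched as follows. This is only an illustration: the `FixedPool` class, its method names, and the stack-style (last-in, first-out) release order are assumptions made for the example, not a description of the routines GAMESS actually uses.

```python
class FixedPool:
    """Hypothetical stack-style allocator over a fixed-length pool of words."""

    def __init__(self, size):
        self.size = size   # pool length, fixed when the program starts
        self.top = 0       # offset of the first free word
        self.starts = []   # start offsets of live allocations, oldest first

    def allocate(self, nwords):
        """Reserve nwords from the pool and return the starting offset."""
        if nwords < 0:
            raise ValueError("nwords must be non-negative")
        if self.top + nwords > self.size:
            raise MemoryError("pool exhausted: need %d words, %d free"
                              % (nwords, self.free_words()))
        start = self.top
        self.starts.append(start)
        self.top += nwords
        return start

    def release(self):
        """Give the most recent allocation back to the pool."""
        self.top = self.starts.pop()

    def free_words(self):
        """Number of words still available in the pool."""
        return self.size - self.top
```

Because the pool never grows, a request that does not fit fails immediately rather than triggering a larger allocation, which is why a run must be sized against the pool length up front.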

Explicit Parallelism

The program has been explicitly parallelized for the distributed-memory MIMD model. Parallelism is expressed using the TCGMSG message-passing library. For the benchmark, calls to TCGMSG are filtered to PVM3 through a library developed at IBM.

Each processor executes the same program, entering the loops over sets of electron shells for which integrals must be computed. Each processor skips most integral blocks, taking only those that it determines to have been assigned to it. The calculation is completed by adding together the partial matrices evaluated by each node from its partial integral list.
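The skip-and-sum pattern described above can be illustrated with a small serial simulation. Everything named here is hypothetical: `block_value` is a stand-in for an expensive integral block, the round-robin ownership test is just one possible assignment rule, and `global_sum` plays the role of the message-passing reduction that a real run would perform.

```python
def block_value(i, j):
    """Stand-in for an expensive integral block (hypothetical formula)."""
    return float(i + 1) * float(j + 1)


def partial_matrix(me, nproc, n):
    """Build processor `me`'s share of an n x n matrix.

    Every processor walks the same loops but evaluates only the
    blocks it owns; all other entries stay zero.
    """
    partial = [[0.0] * n for _ in range(n)]
    counter = 0
    for i in range(n):
        for j in range(n):
            if counter % nproc == me:      # assumed round-robin ownership
                partial[i][j] = block_value(i, j)
            counter += 1
    return partial


def global_sum(partials):
    """Add the partial matrices element by element."""
    n = len(partials[0])
    total = [[0.0] * n for _ in range(n)]
    for p in partials:
        for i in range(n):
            for j in range(n):
                total[i][j] += p[i][j]
    return total


def full_matrix(n):
    """Serial reference: every block evaluated by one processor."""
    return [[block_value(i, j) for j in range(n)] for i in range(n)]
```

Since each block is owned by exactly one processor, summing the partial matrices reproduces the serial result, for example `global_sum([partial_matrix(me, 3, 5) for me in range(3)]) == full_matrix(5)`.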

The explicitly parallel program is said to scale well up to 200 processors.

Load Balancing

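The input file specifies one of two load balancing schemes. As labeled in the code outline that follows, these are dynamic load balancing, in which a processor claims its next task at the second (JJ) loop level by calling NXTVAL(), and loop-level balancing, in which integral blocks are dealt out in fixed rotation at the innermost (LL) loop using a counter taken modulo NPROC.

These strategies correspond to parallelizing at different levels of the quadruply nested loop which dominates the serial execution profile.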

      SUBROUTINE TWOEI
C
C     ME = this processor's ID number
C     NPROC = number of processors
C
C     initialize parallel work
C
      IPCOUNT = ME - 1   
      NEXT = -1 
      MINE = -1  
C
C     begin the four loops over the electron shell sets
C
      DO 920 II =
   .
   .
         DO 900 JJ =
      IF (dynamic load balancing) THEN
         MINE = MINE + 1
         IF (MINE.GT.NEXT) NEXT = NXTVAL()
         IF (NEXT.NE.MINE) GO TO 900
      END IF
      .
      .
            DO 880 KK = 
         .
         .
               DO 860 LL =
      IF (loop-level balancing) THEN
         IPCOUNT = IPCOUNT + 1
         IF (MOD(IPCOUNT,NPROC).NE.0) GO TO 860
      END IF
      .
      .
C
C     Generate integral block
C     Write to disk or place directly in matrix
C
      .
      .
  860          CONTINUE
  880       CONTINUE
  900    CONTINUE
  920 CONTINUE
      END

The original paper (*) outlines these strategies in detail.

(*) Michael W. Schmidt, et al., "General Atomic and Molecular Electronic Structure System," Journal of Computational Chemistry, Vol. 14, No. 11, pp. 1347-1363 (1993)
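The difference between the two schemes in the TWOEI outline can be seen in a small simulation. This is a sketch under stated assumptions: processor IDs (ME) are taken to run from 0 to NPROC-1, the shared counter behind NXTVAL() is modeled as a plain integer, tasks are assumed to be claimed by whichever processor becomes free first, and the task costs are made-up numbers chosen only to expose the imbalance.

```python
import heapq


def loop_level_blocks(me, nproc, nblocks):
    """Static scheme: the IPCOUNT / MOD test from the innermost loop."""
    ipcount = me - 1
    mine = []
    for block in range(nblocks):
        ipcount += 1
        if ipcount % nproc != 0:
            continue            # GO TO 860: another processor owns this block
        mine.append(block)
    return mine


def dynamic_blocks(nproc, costs):
    """Dynamic scheme: whichever processor is free first draws the next task."""
    counter = 0                                   # stand-in for NXTVAL()
    assigned = [[] for _ in range(nproc)]
    ready = [(0.0, p) for p in range(nproc)]      # (time free, processor)
    heapq.heapify(ready)
    while counter < len(costs):
        time_free, p = heapq.heappop(ready)
        task = counter
        counter += 1
        assigned[p].append(task)
        heapq.heappush(ready, (time_free + costs[task], p))
    return assigned


def load(blocks, costs):
    """Total cost of the blocks handled by one processor."""
    return sum(costs[b] for b in blocks)


# Hypothetical costs: one expensive task followed by eight cheap ones.
costs = [8, 1, 1, 1, 1, 1, 1, 1, 1]
static = [loop_level_blocks(me, 2, len(costs)) for me in range(2)]
dynamic = dynamic_blocks(2, costs)
print([load(b, costs) for b in static])
print([load(b, costs) for b in dynamic])
```

With these made-up costs on two processors, the fixed rotation gives loads of 12 and 4, while the shared counter gives 8 and 8. The static scheme needs no communication to decide ownership; the dynamic scheme pays for a counter request per task in exchange for evening out uneven work.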

Execution Profiles

The code was profiled extensively on an SGI Challenge (running IRIX 5.3).

Serial execution profiles

Direct SCF (Self-Consistent Field) calculations

No direct SCF calculations (integrals stored to disk)


Gregg M. Skinner (skinner@csrd.uiuc.edu)
Bill Pottenger (potteng@csrd.uiuc.edu)

Fri Jan 19 13:50:01 PST 1996