Parallel Runs

Generally, two hardware scenarios determine which kind of parallelization can be used:

- On a **single node** with several CPUs and/or cores using the same memory (shared memory), the user can run all parallelized modules of `TURBOMOLE`. For some modules, both shared-memory and MPI versions are available, but it is recommended not to use the latter for performance reasons. How to run the parallel `TURBOMOLE` SMP version on multi-core and/or multi-CPU systems: please see chapter 3.2.1.
- On a **cluster**, a parallel calculation can be performed using several distinct nodes, each one with local memory and disks. This can be done with the MPI version. How to run the parallel `TURBOMOLE` MPI version on clusters: please see chapter 3.2.2.

The list of parallelized programs currently includes:

- `ridft` -- parallel ground state Hartree-Fock and DFT energies including RI-J and the multipole accelerated RI (MA-RI-J) (SMP and MPI)
- `rdgrad` -- parallel ground state gradients from `ridft` calculations (SMP and MPI)
- `dscf` -- Hartree-Fock and DFT ground state calculations for all available DFT functionals, without the RI-J approximation (SMP and MPI)
- `odft` -- EXX or LHF ground state calculations (SMP only)
- `grad` -- parallel ground state gradients from `dscf` calculations (SMP and MPI)
- `ricc2` -- parallel ground and excited state calculations of energies and gradients at MP2 and CC2 level using RI, as well as energy calculations of other wave function models, see chapter 9.6 (SMP and MPI)
- `mpgrad` -- parallel conventional (i.e. non-RI) MP2 energy and gradient calculations. Please note that RI-MP2 is one to two orders of magnitude faster than conventional MP2, so even serial RI-MP2 will be faster than parallel MP2 calculations. (MPI only)
- `aoforce` -- parallel Hartree-Fock and DFT analytic 2nd derivatives for vibrational frequencies, IR spectra, generation of the Hessian for transition state searches, and checks for minimum structures (SMP only)
- `escf` -- parallel TDDFT, RPA, CIS excited state calculations (UV-Vis and CD spectra, polarizabilities) (SMP only)
- `egrad` -- parallel TDDFT, RPA, CIS excited state analytic gradients, including polarizability derivatives for Raman spectra (SMP only)
- `NumForce` -- this script can be used for a trivial parallelization of the numerical displaced coordinates.
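The SMP/MPI annotations in the list above can be kept in a small lookup table, which is handy when a driver script has to decide whether a given module can run in the requested mode. The following is only a sketch: the names `PARALLEL_MODES`, `supports` and `programs_for` are hypothetical helpers, not part of `TURBOMOLE`, and `NumForce` is left out because the list gives no SMP/MPI annotation for it.

```python
# Parallel modes per program, transcribed from the list above.
# This table is an illustration, not something shipped with TURBOMOLE.
PARALLEL_MODES = {
    "ridft": {"SMP", "MPI"},
    "rdgrad": {"SMP", "MPI"},
    "dscf": {"SMP", "MPI"},
    "odft": {"SMP"},
    "grad": {"SMP", "MPI"},
    "ricc2": {"SMP", "MPI"},
    "mpgrad": {"MPI"},
    "aoforce": {"SMP"},
    "escf": {"SMP"},
    "egrad": {"SMP"},
}


def supports(program, mode):
    """Return True if `program` is listed with the given mode ("SMP" or "MPI")."""
    return mode in PARALLEL_MODES.get(program, set())


def programs_for(mode):
    """Return the alphabetically sorted programs listed with the given mode."""
    return sorted(name for name, modes in PARALLEL_MODES.items() if mode in modes)


if __name__ == "__main__":
    print("MPI:", ", ".join(programs_for("MPI")))
    print("SMP:", ", ".join(programs_for("SMP")))
```

A script could call `supports("aoforce", "MPI")` before submitting a cluster job and fall back to a single-node SMP run when the answer is `False`.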

Additional optional keywords for parallel runs with the MPI binaries are
described in Chapter 18.
However, users do not have to set those keywords. In the
parallel version of `TURBOMOLE`, scripts replace the binaries. These
scripts prepare the usual input, run the necessary steps, and automatically start the parallel programs.
Users only have to set environment variables; see Sec. 3.2.2 below.

To use the OpenMP parallelization, only an environment variable needs to be set. To use it efficiently, however, one should consider a few additional points, e.g. memory usage, which are described in Sec. 3.2.1.
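When jobs are launched from a Python driver rather than an interactive shell, the environment can be prepared before the program is started. The sketch below assumes the variable names `PARA_ARCH` (parallel mode) and `PARNODES` (number of processes or threads); this section does not name the variables, so treat them as assumptions and check Sec. 3.2.1 and 3.2.2 for the authoritative list, including any `PATH` changes needed to pick up the parallel binaries. The helper `parallel_env` is hypothetical.

```python
import os


def parallel_env(mode, nproc, base=None):
    """Return a copy of the environment with parallel settings added.

    PARA_ARCH and PARNODES are assumed variable names; verify them
    against Sec. 3.2.1 (SMP) and Sec. 3.2.2 (MPI) before relying on this.
    """
    if mode not in ("SMP", "MPI"):
        raise ValueError("mode must be 'SMP' or 'MPI'")
    if nproc < 1:
        raise ValueError("nproc must be a positive integer")
    # Copy so that the caller's environment is never modified in place.
    env = dict(os.environ if base is None else base)
    env["PARA_ARCH"] = mode
    env["PARNODES"] = str(nproc)
    return env


if __name__ == "__main__":
    env = parallel_env("SMP", 8)
    print(env["PARA_ARCH"], env["PARNODES"])
    # A job could then be started with, for example:
    #   subprocess.run(["ridft"], env=env, check=True)
```

Returning a fresh dictionary instead of modifying `os.environ` keeps serial and parallel runs in the same driver from interfering with each other.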

- Running Parallel Jobs -- SMP case
  - Setting up the parallel SMP environment
  - OpenMP parallelization of `dscf`, `odft` and `ricc2`
  - Multi-thread parallelization of `aoforce`, `escf` and `egrad`
  - Multi-thread parallelization of `dscf`, `grad`, `ridft` and `rdgrad`
  - Global Arrays parallelization of `ridft` and `rdgrad`
- Running Parallel Jobs -- MPI case