
Overview:

Accelerator options

LAMMPS has several accelerator options, implemented via five accelerator packages. Some of the packages support multiple hardware options and precision options (double, mixed, single). These are the package abbreviations used in the plots and tables below.

For acceleration on a CPU:

For acceleration on an Intel KNL:

For acceleration on an NVIDIA GPU:

Note that if you browse the alphabetized listing of pair, fix, compute, etc. commands in the Commands section (Section 3.5) of the manual, many commands are followed by letters in parentheses: "g" = GPU, "i" = Intel, "k" = Kokkos, "o" = OMP, and "t" = OPT. These letters indicate which accelerator packages support that command.
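As a small illustration of the letter codes described above, the following shell sketch expands a string of letters into package names. The function name decode_accel_letters is made up for this example and is not part of LAMMPS; only the letter-to-package mapping comes from the note above.

```shell
# Expand the letters found in parentheses after a command name
# into accelerator package names.
# Mapping from the note above: g=GPU, i=Intel, k=Kokkos, o=OMP, t=OPT.
# (decode_accel_letters is a hypothetical helper, not a LAMMPS tool.)
decode_accel_letters() {
    letters=$1
    out=""
    while [ -n "$letters" ]; do
        rest=${letters#?}          # everything after the first character
        c=${letters%"$rest"}       # the first character itself
        case $c in
            g) name=GPU ;;
            i) name=Intel ;;
            k) name=Kokkos ;;
            o) name=OMP ;;
            t) name=OPT ;;
            *) name="unknown($c)" ;;
        esac
        out="${out:+$out }$name"   # space-separated list
        letters=$rest
    done
    printf '%s\n' "$out"
}

decode_accel_letters gikot   # prints: GPU Intel Kokkos OMP OPT
```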


Machines and node hardware

Benchmarks were run on the following machines and node hardware.

mutrino = Intel Haswell CPUs or Intel KNLs

ride80 = IBM Power8 CPUs with NVIDIA K80 GPUs

ride100 = IBM Power8 CPUs with NVIDIA P100 GPUs


How to build LAMMPS and run the benchmarks

This table shows which accelerator packages were used on which machines:

Machine  Hardware     CPU  OPT  OMP  GPU  Intel/CPU  Intel/KNL  Kokkos/OMP  Kokkos/KNL  Kokkos/Cuda
mutrino  Haswell/KNL  yes  yes  yes  no   yes        yes        yes         yes         no
ride80   K80          no   no   no   yes  no         no         no          no          yes
ride100  P100         no   no   no   yes  no         no         no          no          yes

These are the software environments on each machine and the Makefiles used to build LAMMPS with different accelerator packages.

mutrino

ride80

ride100

Some of the Makefiles were used to build LAMMPS with multiple accelerator packages and options included, specifically the "cpu" and "knl" makefiles:

Makefile suffix    Accelerator options
cpu                CPU, OPT, OMP, Intel/CPU
kokkos_omp         Kokkos/OMP
kokkos_serial      Kokkos/serial
knl                CPU/KNL, OPT/KNL, OMP/KNL, Intel/KNL
kokkos_knl         Kokkos/KNL
kokkos_knl_serial  Kokkos/KNL/serial
gpu                GPU
kokkos_cuda        Kokkos/Cuda

If a specific benchmark requires a build with additional package(s) installed, it is noted with the benchmark info below.

With the software environment initialized (e.g. modules loaded) and the machine Makefiles copied into src/MAKE/MINE, building LAMMPS is straightforward:

cp Makefile.serrano_cpu lammps/src/MAKE/MINE   # for example
cd lammps/src
make yes-manybody                              # install any packages the benchmark requires
make yes-opt yes-user-intel ...                # install accelerator package(s) supported by the Makefile
make serrano_cpu                               # target = suffix of Makefile.machine 

This should produce an executable named lmp_machine, e.g. lmp_serrano_cpu. If desired, you can copy the executable to a directory where you run the benchmark.
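The naming rule described above (Makefile.machine gives make target "machine" and executable lmp_machine) can be sketched as two small shell helpers. The function names are hypothetical and exist only for this illustration; the rule itself is the one stated in the text.

```shell
# Derive the make target from a machine Makefile name.
# (makefile_to_target / makefile_to_exe are hypothetical helpers.)
makefile_to_target() {
    base=${1##*/}                      # drop any leading directory
    printf '%s\n' "${base#Makefile.}"  # drop the "Makefile." prefix
}

# Derive the name of the executable that the build should produce.
makefile_to_exe() {
    printf 'lmp_%s\n' "$(makefile_to_target "$1")"
}

makefile_to_target Makefile.serrano_cpu   # prints: serrano_cpu
makefile_to_exe    Makefile.serrano_cpu   # prints: lmp_serrano_cpu
```

A helper like this can be handy in a build script to confirm that the expected executable exists after make finishes.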

Note that if the GPU package is being included in the build, these steps should be done before the LAMMPS build:

cp Makefile.gpulib.ride100.double lammps/lib/gpu    # for example
cd lammps/lib/gpu
make -f Makefile.gpulib.ride100.double clean
make -f Makefile.gpulib.ride100.double 

This should produce the file lammps/lib/gpu/libgpu.a.
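Before moving on to the LAMMPS build, it can help to confirm that the library was actually produced. The following is a minimal sketch, assuming the default path named above; gpu_lib_ready is a hypothetical helper name, not something shipped with LAMMPS.

```shell
# Succeed only if the GPU library file exists and is non-empty.
# Default path is the one named in the text; pass a different
# path as $1 if your source tree is located elsewhere.
# (gpu_lib_ready is a hypothetical helper.)
gpu_lib_ready() {
    lib=${1:-lammps/lib/gpu/libgpu.a}
    [ -s "$lib" ]
}

if gpu_lib_ready; then
    echo "libgpu.a found; continue with the LAMMPS build (make yes-gpu ...)"
else
    echo "libgpu.a missing; build lammps/lib/gpu first" >&2
fi
```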

IMPORTANT NOTE: Achieving best performance for the benchmarks (or your own input script) on a particular machine with a particular accelerator option requires attention to the following issues.

All of the plots below include a link to a table with details on all of these issues. The table shows the mpirun (or equivalent) command used to produce each data point on each curve in the plot, the LAMMPS command-line arguments used to get best performance with a particular package on that hardware, and a link to the logfile produced by the benchmark run.
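For orientation, the sketch below prints the generic LAMMPS command-line switches (-sf, -pk, -k) that enable each accelerator package, as documented in the LAMMPS manual. These are not the tuned settings used for the benchmarks; those are given in the linked tables. The function name accel_args, the counts, the executable name, and the input file name in the final line are all placeholders chosen for illustration.

```shell
# Print generic LAMMPS command-line switches for one accelerator package.
#   $1 = package: opt | omp | intel | gpu | kokkos_omp | kokkos_cuda
#   $2 = threads per MPI task, or GPUs per node (illustrative default: 1)
# (accel_args is a hypothetical helper; the switches are generic forms,
#  not the benchmark settings.)
accel_args() {
    n=${2:-1}
    case $1 in
        opt)         echo "-sf opt" ;;
        omp)         echo "-sf omp -pk omp $n" ;;
        intel)       echo "-sf intel" ;;
        gpu)         echo "-sf gpu -pk gpu $n" ;;
        kokkos_omp)  echo "-k on t $n -sf kk" ;;
        kokkos_cuda) echo "-k on g $n -sf kk" ;;
        *)           echo "unknown package: $1" >&2; return 1 ;;
    esac
}

# Hypothetical run line: process count, executable, and input file are placeholders.
echo "mpirun -np 16 lmp_serrano_cpu $(accel_args omp 2) -in in.benchmark"
```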


How to interpret the plots

All the plots below show the number of atoms or nodes on the x-axis and performance on the y-axis. On all the plots, higher is better and lower is worse. For all the plots:

Per-core and per-node plots:

Strong-scaling and weak-scaling plots:



ReaxFF (HNS) benchmark

Additional packages needed for this benchmark: USER-REAXC
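Following the build steps given earlier (make yes-manybody, make yes-user-intel), a package name such as USER-REAXC maps to a make target by lowercasing it and prefixing "yes-". A minimal sketch of that mapping, using a hypothetical helper name:

```shell
# Convert a LAMMPS package name to its "make yes-..." install target.
# (pkg_to_make_target is a hypothetical helper for illustration.)
pkg_to_make_target() {
    printf 'yes-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

pkg_to_make_target USER-REAXC   # prints: yes-user-reaxc

# Intended use, from within lammps/src:
#   make "$(pkg_to_make_target USER-REAXC)"
```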

Comments: