What is CORAL?
CORAL is a first-of-its-kind U.S. Department of Energy (DOE) collaboration between the National Nuclear Security Administration's (NNSA's) ASC Program and the Office of Science's Advanced Scientific Computing Research (ASCR) program that will culminate in three ultra-high-performance supercomputers at Lawrence Livermore, Oak Ridge, and Argonne national laboratories. The systems, to be delivered in the 2017 timeframe, will run the most demanding scientific and national security simulation and modeling applications and will enable continued U.S. leadership in computing. The Livermore system resulting from CORAL will be named Sierra.
CORAL is the next major phase in the U.S. Department of Energy’s scientific computing roadmap and path to exascale computing. The procurements resulting from CORAL will influence the modernization of future generations of computing throughout the NNSA complex.
Consult the summary of the CORAL benchmarking process [PDF] presented at the May 31 CORAL vendor meeting (updated September 24).
Figures of Merit (FOMs) for baseline calculations, scaling data, and initial weights [XLSX] [PDF] are subject to change until issuance of the final RFP.
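To make the role of the weights concrete, the sketch below combines per-benchmark FOMs into a single weighted aggregate. This is a hypothetical illustration only: the speedup-times-weight aggregation, the equal weights, and all numbers are assumptions for demonstration, not the formula or values from the RFP; the authoritative definitions are in the spreadsheet above.

```c
/* Hypothetical sketch: aggregating per-benchmark Figures of Merit.
 * Assumes each FOM is a rate (work per second), so proposed/baseline
 * is a speedup; the weighted-sum aggregation is illustrative only,
 * not the formula from the CORAL RFP. */
#include <stdio.h>

struct fom_entry {
    const char *name;
    double baseline;  /* FOM measured on the baseline system */
    double proposed;  /* FOM projected for the proposed system */
    double weight;    /* initial weight (subject to change per the RFP) */
};

int main(void) {
    /* All numbers below are made up for illustration. */
    struct fom_entry entries[] = {
        { "LSMS",    1.0e3, 4.2e4, 0.25 },
        { "QBOX",    2.0e2, 7.1e3, 0.25 },
        { "HACC",    5.0e5, 1.9e7, 0.25 },
        { "Nekbone", 3.0e4, 9.8e5, 0.25 },
    };
    double aggregate = 0.0;
    for (size_t i = 0; i < sizeof entries / sizeof entries[0]; ++i) {
        double speedup = entries[i].proposed / entries[i].baseline;
        aggregate += entries[i].weight * speedup;
        printf("%-8s speedup %8.1f\n", entries[i].name, speedup);
    }
    printf("weighted aggregate speedup: %.1f\n", aggregate);
    return 0;
}
```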
CORAL Benchmarks
The CORAL benchmarks provided here represent their state as of February 2014. Aside from minor bug fixes and build issues, these versions will not be updated; use the links provided to reach the official benchmark home pages for updated and maintained versions.
Supplemental Information | Change Log
Scalable Science Benchmarks

| Benchmark | Priority Level | Lines of Code | Parallelism | Language(s) | Code Description/Notes |
| --- | --- | --- | --- | --- | --- |
| LSMS | TR-1 | 200,000 | MPI, OpenMP/Pthreads | Fortran, C++ | Floating-point performance, point-to-point communication scaling. |
| QBOX | TR-1 | 47,000 | MPI, OpenMP/Pthreads | C++ | Quantum molecular dynamics. Memory bandwidth, high floating-point intensity, collectives (alltoallv, allreduce, bcast). |
| HACC | TR-1 | 35,000 | MPI, OpenMP/Pthreads | C++ | Compute intensity, random memory access, all-to-all communication. |
| Nekbone | TR-1 | 48,000 | MPI | Fortran, C | Compute intensity, small messages, allreduce. |

Throughput Benchmarks

| Benchmark | Priority Level | Lines of Code | Parallelism | Language(s) | Code Description/Notes |
| --- | --- | --- | --- | --- | --- |
| CAM-SE | TR-1 | 150,000 | MPI, OpenMP/Pthreads | Fortran, C | Memory bandwidth, strong scaling, MPI latency. |
| UMT2013 | TR-1 | 51,000 | MPI, OpenMP/Pthreads | Fortran, Python, C, C++ | Single physics package code. Unstructured-mesh deterministic radiation transport. Memory bandwidth, compute intensity, large messages, Python. |
| AMG2013 | TR-1 | 75,000 | MPI, OpenMP/Pthreads | C | Algebraic multigrid linear system solver for unstructured-mesh physics packages. |
| MCB | TR-1 | 13,000 | MPI, OpenMP/Pthreads | C++ | Monte Carlo transport. Non-floating-point intensive, branching, load balancing. |
| QMCPACK | TR-2 | 200,000 | MPI, OpenMP/Pthreads | C, C++ | Memory bandwidth, thread efficiency, compilers. |
| NAMD | TR-2 | 180,000 | MPI | C, C++ | Classical molecular dynamics. Compute intensity, random memory access, small messages, all-to-all communications. |
| LULESH | TR-2 | 5,000 | MPI, OpenMP/Pthreads | C++ | Shock hydrodynamics for unstructured meshes. Fine-grained loop-level threading. |
| SNAP | TR-2 | 3,000 | MPI, OpenMP/Pthreads | Fortran | Deterministic radiation transport for structured meshes. |
| miniFE | TR-2 | 50,000 | MPI, OpenMP/Pthreads | C++ | Finite element code. |

Data-Centric Benchmarks

| Benchmark | Priority Level | Lines of Code | Parallelism | Language(s) | Code Description/Notes |
| --- | --- | --- | --- | --- | --- |
| Graph500 | TR-1 | — | MPI | C | Scalable breadth-first search of a large undirected graph. |
| Integer Sort | TR-1 | 2,000 | MPI, OpenMP/Pthreads | C | Parallel integer sort. |
| Hash | TR-1 | — | MPI, OpenMP/Pthreads | C | Parallel hash benchmark. |
| SPECint2006 "peak" | TR-2 | — | — | C, C++ | CPU integer processor benchmark; report peak results or estimates. |

Skeleton Benchmarks

| Benchmark | Priority Level | Lines of Code | Parallelism | Language(s) | Code Description/Notes |
| --- | --- | --- | --- | --- | --- |
| CLOMP | TR-1 | — | OpenMP/Pthreads | C | Measures OpenMP overheads and other performance impacts due to threading. |
| IOR | TR-1 | 4,000 | MPI | C | Interleaved or Random I/O benchmark. Used for testing the performance of parallel file systems and burst buffers using various interfaces and access patterns. |
| CORAL MPI benchmarks | TR-1 | 1,000 | MPI | C | Subsystem functionality and performance tests. Collection of independent MPI benchmarks that measure various aspects of MPI performance, including interconnect messaging rate, latency, aggregate bandwidth, and collective latencies. |
| Memory benchmarks (STREAM, STRIDE) | TR-1 | 1,500 | OpenMP/Pthreads | C | Memory subsystem functionality and performance tests. Collection of STREAM and STRIDE memory benchmarks that measure the memory subsystem under a variety of access patterns. A minimal triad sketch appears after this table. |
| LCALS | TR-1 | 5,000 | OpenMP/Pthreads | C++ | Single node. Application loops that test the performance of SIMD vectorization. |
| Pynamic | TR-2 | 12,000 | MPI | Python, C | Subsystem functionality and performance test. Dummy application that closely models the footprint of an important Python-based multi-physics ASC code. |
| HACC IO | TR-2 | 2,000 | MPI | C++ | Application-centric I/O benchmark tests. |
| FTQ | TR-2 | 1,000 | — | C | Fixed Time Quantum test. Measures operating system noise. A sketch appears after this table. |
| XSBench (mini OpenMC) | TR-2 | 1,000 | OpenMP/Pthreads | C | Monte Carlo neutron transport. Stresses the system through memory capacity (including potential NVRAM), random memory access, memory latency, threading, and memory contention. |
| MiniMADNESS | TR-2 | 10,000 | MPI, OpenMP/Pthreads | C++ | Vector FPU, threading, active messages. |

Microkernel Benchmarks

| Benchmark | Priority Level | Lines of Code | Parallelism | Language(s) | Code Description/Notes |
| --- | --- | --- | --- | --- | --- |
| NEKbonemk | TR-3 | 2,000 | — | Fortran | Single node. NEKbone microkernel and SIMD compiler challenge. |
| HACCmk | TR-3 | 250 | OpenMP/Pthreads | C++ | Single-core optimization and SIMD compiler challenge; compute intensity. |
| UMTmk | TR-3 | 550 | — | Fortran | Single-node UMT microkernel. |
| AMGmk | TR-3 | 1,800 | OpenMP/Pthreads | C | Three compute-intensive kernels from AMG. |
| MILCmk | TR-3 | 5,000 | OpenMP/Pthreads | C | Compute intensity and memory performance. |
| GFMCmk | TR-3 | 150 | OpenMP/Pthreads | Fortran | Random memory access; single node. |
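As a concrete illustration of the access pattern the memory benchmarks exercise, the fragment below is a minimal STREAM-style triad kernel. The array size, timing approach, and reporting are simplifications relative to the real STREAM benchmark; only the a[i] = b[i] + scalar * c[i] pattern itself is standard.

```c
/* Minimal STREAM-style triad sketch (compile with -fopenmp).
 * Sizes and timing are simplified relative to the real benchmark;
 * arrays must be large enough to defeat the caches. */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N (1 << 24)  /* ~16M doubles (128 MB) per array */

int main(void) {
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c);
    if (!a || !b || !c) return 1;
    for (long i = 0; i < N; ++i) { b[i] = 1.0; c[i] = 2.0; }

    const double scalar = 3.0;
    double t0 = omp_get_wtime();
    #pragma omp parallel for
    for (long i = 0; i < N; ++i)
        a[i] = b[i] + scalar * c[i];
    double t1 = omp_get_wtime();

    /* Triad touches 3 arrays (2 reads + 1 write) per iteration. */
    double gbytes = 3.0 * N * sizeof(double) / 1.0e9;
    printf("triad bandwidth: %.1f GB/s\n", gbytes / (t1 - t0));
    free(a); free(b); free(c);
    return 0;
}
```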
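Similarly, the fixed-time-quantum idea behind FTQ can be conveyed in a few lines: repeatedly count how much trivial work completes in each fixed-length quantum; on a quiet node the counts are nearly constant, and dips reveal operating system interference. The quantum length, sample count, and work unit below are arbitrary choices for illustration, not the parameters of the actual FTQ benchmark.

```c
/* Fixed-time-quantum sketch: count work units completed per quantum.
 * Variability in the per-quantum counts indicates OS noise. Parameters
 * here are arbitrary, unlike the real FTQ benchmark. */
#include <stdio.h>
#include <time.h>

static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

int main(void) {
    const double quantum = 0.001;  /* 1 ms per sample */
    enum { SAMPLES = 1000 };
    long counts[SAMPLES];

    for (int s = 0; s < SAMPLES; ++s) {
        long n = 0;
        double end = now_sec() + quantum;
        while (now_sec() < end)
            ++n;                   /* trivial unit of work */
        counts[s] = n;
    }

    long min = counts[0], max = counts[0];
    for (int s = 1; s < SAMPLES; ++s) {
        if (counts[s] < min) min = counts[s];
        if (counts[s] > max) max = counts[s];
    }
    /* A wide max/min spread suggests interference during some quanta. */
    printf("work per quantum: min %ld, max %ld\n", min, max);
    return 0;
}
```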
Scalable Science Benchmarks
- HACC
Throughput Benchmarks
- NAMD home page
- NAMD summary
- NAMD tar file
- Inputs: small, 1M, 3M
- miniFE
Data-Centric Benchmarks
- Vendors are encouraged to use their own SPEC CINT2006 license to run the benchmark. If you do not have a copy, please send e-mail to coral-apps@lists.llnl.gov.
Skeleton Benchmarks
- XSBench
Microkernel Benchmarks
Supplemental Information
The following content is provided as supplemental information for vendors. It is included to help offerors better understand our application requirements. It is not part of the formal RFP process or technical requirements, it should not take precedence over the formal RFP technical requirements, and it is not intended to be addressed by offerors as part of a formal response.
- Performance Characteristics of HYDRA, a multi-physics simulation code from LLNL
- Use Cases for Large Memory Appliance/Burst Buffer, an LLNL perspective
Change Log
02/05 - UMT2013 source was updated to address OpenMP thread-safety issues. The files Teton/transport/Teton/mods/[ZoneData,Quadrature,Boundary]_mod.F90 now include threadprivate() directives. Details are in the updated README file. Due to the late date of this fix, offerors need not recalculate FOMs if tests have already been completed.
01/31 - The LCALSSuite.cxx file was changed to set array allocation lengths to the maximum array index accessed over all suite kernels. This fixes an issue where the VOL3D_CALC kernel was accessing data beyond the end of some of the arrays it was using. (Ref. CORAL RFP Q&A 18)
01/23 - The SNAP summary file and example inputs in the source distribution were updated to clarify how to run problems.
01/22 - Updated the Nekbone summary file to reflect a relaxation of the allowable deviation in spectral element count from 5% to 15% (see CORAL RFP Q&A for more detail).
01/10 - Updated UMT build to handle preprocessing of .F90 files correctly. Updated UMT README to correct definition of FOM to be consistent with source code and summary file.
01/08 - Updated the KMI Hash benchmark to support > 2 GB per MPI rank with MPI-2.
01/06 - Updated SNAP summary file with command line parameters to scale to various problem sizes.
Questions or comments about the benchmarks should be directed to coral-apps@lists.llnl.gov.
Last modified on June 19, 2014
LLNL-WEB-637074