Continuing its tradition as a dominant Top500 high-performance computing (HPC) site, LLNL began installing components in May 2023 for NNSA’s first exascale supercomputer, El Capitan. An exascale supercomputer can perform at least one quintillion (1,000,000,000,000,000,000) double-precision (64-bit) floating-point operations per second (1 exaflop). When it is deployed in 2024, El Capitan is projected to be the world’s most powerful supercomputer, capable of more than 2 exaflops.
El Capitan’s purpose
Funded by NNSA’s ASC program, El Capitan is a collaboration among the three NNSA labs: Livermore, Los Alamos, and Sandia. El Capitan will ensure the safety, security, and reliability of the nation’s nuclear stockpile in the absence of underground testing. It will be essential for the design and stewardship of a modernized stockpile and for other critical national security missions. Research performed on El Capitan will also support unclassified mission areas of interest to national security, including material discovery, high energy-density physics, nuclear data, material equations of state, and conventional weapon design.
To ensure the system achieves its full computing potential, LLNL is investing in cognitive simulation capabilities such as artificial intelligence (AI) and machine learning (ML) techniques that will benefit both unclassified and classified missions.
El Capitan’s early access systems
Three of El Capitan’s smaller early access systems—Tenaya, Tioga, and RZVernal—currently rank among the Top500 supercomputers in the world.
Tuolumne and RZAdams
Research performed on El Capitan’s largest “sister” system, Tuolumne, will support other unclassified projects in energy security, climate change, cancer drug discovery, and other areas of public interest. Like Tuolumne, El Capitan’s other smaller, unclassified “sister” system, RZAdams, will support both weapons and non-weapons missions. Both systems were purchased under the El Capitan contract and are scheduled to arrive in 2024.
El Capitan’s novel system software strategy
El Capitan will be the first ASC Advanced Technology System (the class that includes ASC’s largest systems) to use the Trilab Operating System Software (TOSS), the same environment and operating system that ASC’s commodity technology machines use. This advancement will simplify system administration and improve the user experience.
Siting El Capitan
The HPE/AMD system required the efforts of hundreds of people as well as public and private partnerships. Many years of careful planning and preparation have paved the way for its successful arrival, including a massive construction upgrade of power and water to LLNL’s HPC facility.
While it will be one of the world’s most energy-efficient supercomputers, El Capitan will require about 30 megawatts (MW) of power to run at peak, enough to power a mid-size city.
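To put the efficiency claim in rough numbers, the short Python sketch below divides the projected peak performance by the projected power draw, using only figures quoted in this article (at least 2.0 exaflops, about 30 MW anticipated, less than 40 MW as the ceiling). The constant and function names are illustrative assumptions, and the result is a back-of-the-envelope estimate from projections, not a measured benchmark result.

```python
# Figures quoted in this article; all are projections, not measurements.
PEAK_FLOPS = 2.0e18           # projected peak: 2 exaflops (64-bit)
ANTICIPATED_POWER_W = 30e6    # anticipated peak power: about 30 MW
POWER_CEILING_W = 40e6        # stated upper bound on peak power: 40 MW


def gigaflops_per_watt(flops: float, watts: float) -> float:
    """Return efficiency in gigaflops per watt (hypothetical helper)."""
    return flops / watts / 1e9


if __name__ == "__main__":
    at_30 = gigaflops_per_watt(PEAK_FLOPS, ANTICIPATED_POWER_W)
    at_40 = gigaflops_per_watt(PEAK_FLOPS, POWER_CEILING_W)
    print(f"At ~30 MW: {at_30:.1f} gigaflops per watt")
    print(f"At  40 MW: {at_40:.1f} gigaflops per watt")
```

Under these assumptions the estimate falls between roughly 50 and 67 gigaflops per watt. Published efficiency rankings rely on measured benchmark performance and measured power, so actual figures may differ.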
El Capitan’s details and distinguishing features
- Funded by NNSA’s ASC program
- Siting complete in 2024
- Expected peak performance ≥ 2.0 exaflops
- Peak power < 40 MW (anticipating ~30 MW)
- AMD MI300 accelerated processing unit (APU) with a 3D chiplet design that tightly couples a central processing unit (CPU) and a graphics processing unit (GPU) in one processing unit
- Slingshot interconnect
- Innovative Rabbit near-node local storage
- Will be used by all three Tri-Labs (Livermore, Los Alamos, and Sandia)