The Advanced Technology (AT) systems currently sited and in use at LLNL were planned, researched, developed, procured, tested, integrated, and deployed to meet the programmatic computing needs of the nation’s Stockpile Stewardship Program. Projects and technologies include strategic planning, performance modeling, benchmarking, procurement and integration coordination, market research, and the investigation of advanced architectural concepts and hardware (including node interconnects and machine area networks) via prototype development, deployment, and test bed activities. Current systems include:
- Sequoia: A 20-petaFLOP/s IBM BlueGene/Q advanced technology platform that brings many innovations over the previous BlueGene generations, including 16 cores per node, multithreaded cores, a five-dimensional torus interconnect, water cooling, and optical fiber links. The system has a staggering 1.6 million processor cores with a total possible 102 million hardware threads all operating simultaneously. Parallelism at this scale dictates new directions in supercomputing and expands the range of physical systems that can be simulated numerically. Codes that are optimized for multi-core and multi-threaded execution run best on this machine. This platform is used as a Capability Computing Campaign (CCC) machine for tri-lab stockpile stewardship milestones. Every six months a new CCC process is run, and the next suite of codes is ushered onto the machine.
- Sierra: An uncertainty quantification (UQ) and weapons-science-focused system to be sited at LLNL in 2017 to fill a critical role in support of the Directed Stockpile Work mission during the FY18–FY22 timeframe. The Sierra AT system will replace Sequoia and take over its mission. Operation of the Sierra AT system will fall under a proven national user facility paradigm, and the system will be available to LLNL, LANL, and SNL. LLNL is partnering with two Department of Energy Office of Science labs (Argonne and Oak Ridge) in a collaboration named CORAL to acquire three leadership computing systems, one of which is the Sierra AT system.
- Sierra early access systems: In late 2016, LLNL acquired three small-scale “early access” (EA) versions of Sierra, consisting of IBM Minsky compute nodes with 20 Power 8 cores and 4 NVIDIA Pascal graphics processing units (GPUs) each. These small systems feature components only one generation behind those of Sierra. EA systems enable application porting and tuning in advance of the CORAL Sierra system delivery and acceptance (late 2017 to mid-2018). Sierra will utilize next-generation Power 9 processors and NVIDIA GPUs to provide 125 petaFLOP/s of performance in support of the ASC program. To enable this work, beta software co-designed by the CORAL laboratories and IBM is being installed on the EA systems.
Gordon Bell examines a Sequoia component while Mark Springer (left) and Adam Bertsch look on during a tour of the LCC B453. (Read more about Bell's visit.)
The governance model provides a methodology to effectively allocate and schedule AT computing resources among the three National Nuclear Security Administration (NNSA) laboratories for weapons deliverables that merit priority on this class of resource.
BlueGene/L earned the number 1 position on the TOP500 list of world's most powerful supercomputers from Nov. 2004 to Jun. 2008 with a sustained world-record speed of 478.2 teraFLOP/s. BG/L was a revolutionary, low-cost machine delivering extraordinary computing power for the nation's Stockpile Stewardship Program.