Commodity Technology Systems

The Commodity Technology (CT) systems currently sited and in use at LLNL were planned, researched, developed, procured, tested, integrated, and deployed to unify computing across the National Nuclear Security Administration (NNSA) defense complex. These systems leverage industry advances and open source software standards to build, field, and integrate Linux clusters of various sizes into production service. The programmatic objective is to dramatically reduce the total cost of ownership of these commodity systems relative to current best practices in Linux cluster deployments. The program also aims to turn these systems quickly into robust, useful production clusters that can carry the coming load of Advanced Simulation and Computing (ASC) scientific simulation capacity workloads.

Ruby, one of LLNL's newest commodity technology clusters, is being used in the fight against COVID-19.

NNSA's new capacity computing systems, called the Commodity Technology Systems-1 (CTS-1), are its third joint procurement under the Advanced Simulation and Computing (ASC) program. These computing clusters will provide the needed computing capacity for NNSA's day-to-day scientific work at the three labs managing the nation's nuclear deterrent.

Under the CTS-1 contract, Penguin Computing—a Silicon Valley–based developer of high-performance Linux cluster computing systems—will furnish the labs with multiple systems, ranging in size from a few hundred to several thousand nodes.

The previous procurement, the Tri-Lab Linux Capacity Cluster (TLCC2), was a multimillion-dollar, multiyear contract with multiple procurement options for more than 3 petaFLOP/s of CT systems. Under the terms of the contract, computing clusters built of scalable units (SUs) were delivered to Lawrence Livermore, Los Alamos, and Sandia National Laboratories between October 2011 and June 2012. Each SU provided 50 teraFLOP/s of peak computing power and was designed to be interconnected with other SUs to create more powerful systems. The SUs were divided among the three labs, and each lab configured its SUs into clusters according to mission needs.

In October 2011, LLNL received the first of 18 SUs, which were combined into a single classified cluster named Zin, with a peak speed of 970 teraFLOP/s. Additional SUs were combined to create a single unclassified cluster named Cab, which has a peak speed of 431 teraFLOP/s. Cab is in the "collaboration zone," where users in the new High Performance Computing Innovation Center (HPCIC) can access the machine. A third cluster, Merl, is a small resource shared by LLNL and the ASC Program for small to moderate parallel jobs. The names of the three supercomputers were inspired by the Livermore-area wine country.

This tri-lab procurement model reduces costs through economies of scale based on standardized hardware and software environments at the three labs. Scientists are now using the TLCC2 computers for programmatic simulations.

Retired Systems

Hyperion was a large-scale test bed, built with multiple vendor partners, for evaluating emerging breakthrough technologies, including innovative node architectures, networks, and alternative storage solutions. Hyperion was a unique and critical resource for testing the functionality, performance, stability, and scalability of system software.

Purple was a testament to the successful realization of the bold vision expressed a decade earlier: the development of complex three-dimensional integrated weapons performance applications and their demonstration on computers capable of successfully running these extraordinary codes.