Arm-based supercomputer prototype to be deployed at Sandia National Laboratories by Department of Energy
Microprocessors designed by Arm are ubiquitous in automobile electronics, cellphones, and other embedded applications, but until recently they have not provided the performance necessary to make them practical for high-performance computing.
Astra, one of the first supercomputers to use processors based on the Arm architecture in a large-scale high-performance computing platform, is expected to be deployed in late summer at Sandia National Laboratories.
The US Department of Energy’s National Nuclear Security Administration says that Astra, the first of a potential series of advanced architecture prototype platforms, will be deployed as part of the Vanguard program.
This program will evaluate the feasibility of emerging high-performance computing architectures as production platforms to support NNSA’s mission to maintain and enhance the safety, security and effectiveness of the US nuclear stockpile.
Astra will be based on the recently announced Cavium ThunderX2, a 64-bit Armv8 microprocessor.
The platform will consist of 2,592 compute nodes, each a dual-socket node with two 28-core processors, and will have a theoretical peak of more than 2.3 petaflops, or 2.3 quadrillion floating-point operations (calculations) per second.
While being the fastest is not one of the goals of Astra or the Vanguard program in general, a single Astra node is roughly 100 times faster than a modern Arm-based cellphone.
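The peak figure can be roughly reproduced from the node counts given above. The following is a minimal sketch, not an official calculation: the node, socket and core counts come from the article, while the 2.0 GHz clock rate and the eight double-precision floating-point operations per core per cycle are assumptions chosen for illustration and are not stated in the source.

```python
# Back-of-the-envelope estimate of a cluster's theoretical peak performance.

# Figures stated in the article.
NODES = 2_592
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 28

# Assumptions for illustration only (not stated in the article).
ASSUMED_CLOCK_HZ = 2.0e9          # assumed core clock rate
ASSUMED_FLOPS_PER_CYCLE = 8       # assumed double-precision FLOPs per core per cycle


def total_cores(nodes=NODES, sockets=SOCKETS_PER_NODE, cores=CORES_PER_SOCKET):
    """Total number of cores across all compute nodes."""
    return nodes * sockets * cores


def peak_flops(core_count, clock_hz=ASSUMED_CLOCK_HZ,
               flops_per_cycle=ASSUMED_FLOPS_PER_CYCLE):
    """Theoretical peak = cores x clock rate x FLOPs per cycle."""
    return core_count * clock_hz * flops_per_cycle


cores_in_system = total_cores()
system_peak = peak_flops(cores_in_system)
node_peak = peak_flops(SOCKETS_PER_NODE * CORES_PER_SOCKET)

print(f"Total cores:  {cores_in_system:,}")
print(f"System peak:  {system_peak / 1e15:.2f} petaflops")
print(f"Per-node peak: {node_peak / 1e12:.3f} teraflops")
```

Under these assumptions the estimate comes to 145,152 cores and about 2.32 petaflops, consistent with the "more than 2.3 petaflops" quoted for Astra. A different clock rate or vector width would change the result proportionally, and theoretical peak says nothing about the performance real applications achieve.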
“One of the important questions Astra will help us answer is how well does the peak performance of this architecture translate into real performance for our mission applications,” says Mark Anderson, program director for NNSA’s Advanced Simulation and Computing program, which funds Astra.
A first step for Vanguard
Scott Collis, director of Sandia’s Center for Computing Research, says: “Emerging architectures come with many challenges.
“Since the NNSA has not previously deployed high-performance computing platforms based on Arm processors, there are gaps in the software that must be addressed before considering this technology for future platforms much larger in scale than Astra.”
As part of a multi-lab partnership, researchers expect to keep improving Astra and future platforms.
Ken Alvin, senior manager of Sandia’s extreme-scale computing group, says: “Sandia researchers partnering with counterparts at Los Alamos and Lawrence Livermore national laboratories expect to develop an improved software-and-tools environment that will enable mission codes to make increasingly effective use of Astra as well as future leadership-class platforms.
“The Vanguard program is designed to allow the NNSA to take prudent risks in exploring emerging technologies and broadening our future computing options.”
Astra will be installed at Sandia in an expanded part of the building that originally housed the innovative Red Storm supercomputer.
The Astra platform will be deployed in partnership with Westwind Computer Products and Hewlett Packard Enterprise.
James Laros, Vanguard project lead, says: “Astra, like Red Storm, will require a very intimate collaboration between Sandia and commercial partners.
“In this case, all three NNSA defense labs will work closely with Westwind, HP Enterprise, Arm, Cavium and the wider high-performance computing community to achieve a successful outcome of this project.”
Astra takes its name from the Latin phrase “per aspera ad astra”, which translates as “through difficulties to the stars”.
Views from stakeholders
Steve Hull, president of Westwind Computer Products, says: “The development of a scalable Arm platform based on the HPE Apollo 70 will become a key resource to expand the Arm high-performance computing ecosystem.
“Westwind is honored to be entrusted by Sandia, in its continued commitment to developing small businesses here in New Mexico, to implement such an important project.”
Mike Vildibill, vice president of the Advanced Technology Group at HPE, says: “By delivering the world’s largest Arm-based supercomputer featuring the HPE Apollo 70 platform, a purpose-built architecture that includes advanced performance and storage capabilities, we are enabling the US Department of Energy and National Nuclear Security Administration to power innovative solutions for energy and national security uses.”
Drew Henry, senior vice president and general manager of Arm’s infrastructure business line, says: “Arm has been deeply engaged with Sandia National Laboratories working to comprehend and deliver on the needs of the high-performance computing community.
“We are eager to support the Vanguard program as a key milestone deployment for Arm and our partners, delivering on a shared vision to spur innovation in this critical domain.”
Gopal Hegde, vice president and general manager of the data center processor group at Cavium, says: “Cavium is pleased to partner with Sandia National Laboratories to enable the Arm-v8-based high-performance computing cluster as part of the Vanguard program.
“Vanguard is an additional proof point regarding readiness and maturity of ThunderX2 processors for large-scale deployments and will further accelerate the entire computing ecosystem on the Arm server architecture.”