Center for Understandable, Performant Exascale Communication Systems: https://cup-ecs.github.io/index.html
Let’s say you have a computer running an application that solves specific problems, and the application runs at 50% efficiency.
When you run the application on a bigger and better computer, though, it runs at 20% efficiency. The program still cranks through its computations quickly, because that is what the machine is built to do, but you are getting less of the performance the bigger computer should deliver.
Moving data is an essential piece of what’s known as “parallel computing,” a type of computation where many calculations are carried out simultaneously. If the computer doesn’t run efficiently, you don’t get the answer fast enough.
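To make the efficiency numbers above concrete, here is a minimal Python sketch of a toy cost model. It is an illustration only, not the center's method: the function name `parallel_efficiency`, the assumption that communication cost grows linearly with the number of processors, and all of the numbers are hypothetical, chosen so the output echoes the 50% and 20% figures in the example.

```python
def parallel_efficiency(serial_time, num_procs, comm_time_per_proc):
    """Return parallel efficiency (speedup divided by processor count).

    Toy model (an assumption, not a measured system): compute work splits
    evenly across processors, while time spent moving data grows linearly
    with the number of processors.
    """
    compute_time = serial_time / num_procs
    comm_time = comm_time_per_proc * num_procs
    parallel_time = compute_time + comm_time
    speedup = serial_time / parallel_time
    return speedup / num_procs


# Hypothetical workload: 100 time units on one processor,
# 1 time unit of communication per participating processor.
smaller = parallel_efficiency(100.0, 10, 1.0)
bigger = parallel_efficiency(100.0, 20, 1.0)
print(f"smaller machine: {smaller:.0%}")  # 50%
print(f"bigger machine:  {bigger:.0%}")   # 20%
```

In this toy model, doubling the processor count halves the compute time per processor but doubles the time spent moving data, so efficiency falls even though the machine is larger.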
Understanding why things work when they work and improving the speed and functionality of next-generation supercomputers is the goal of University of Tennessee at Chattanooga SimCenter researchers.
It’s also one reason why the U.S. Department of Energy’s National Nuclear Security Administration has selected the SimCenter, the University of New Mexico Center for Advanced Research Computing and the University of Alabama at Birmingham Collaborative Computing Lab to receive a $4-million grant, with more than $1.1 million going to the SimCenter. The award is funded through the Predictive Science Academic Alliance Program, which directs research and development to maintain the safety, security and effectiveness of the U.S. nuclear weapons stockpile.
The program’s grant has created the Center for Understandable, Performant Exascale Communication Systems, which researches how to make high-speed computer-to-computer communication more efficient. SimCenter personnel will design and develop new ways to enhance communication performance between computer systems. The innovations will then be used to help the National Nuclear Security Administration understand why larger computers aren’t running as efficiently as smaller ones.
“Exascale” is a term for computational capacity and refers to systems capable of performing 10 to the 18th power, or a quintillion, calculations per second. That’s a one with 18 zeroes behind it.
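For a sense of scale, the arithmetic can be sketched in a few lines of Python. The comparison machine, running at one billion calculations per second, is a hypothetical round number chosen for illustration, not a figure from the project.

```python
EXASCALE_OPS_PER_SECOND = 10**18     # one quintillion calculations per second
HYPOTHETICAL_OPS_PER_SECOND = 10**9  # assumed comparison machine (illustrative only)

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_match_one_exascale_second(ops_per_second):
    """Years a slower machine needs to do one second of exascale work."""
    seconds = EXASCALE_OPS_PER_SECOND / ops_per_second
    return seconds / SECONDS_PER_YEAR


years = years_to_match_one_exascale_second(HYPOTHETICAL_OPS_PER_SECOND)
print(f"{years:.1f} years")
```

Under that assumption, the slower machine would need roughly 31.7 years to finish what an exascale system does in one second.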
At UTC, the five-year project will be led by Tony Skjellum, Abi Arabshahi and Craig Tanis. Skjellum is director of the SimCenter; Arabshahi is a research professor in mechanical engineering; and Tanis is an assistant professor in computer science and engineering.
“What we are doing is working on the infrastructure software—the ‘parallel processing middleware,’ as it’s called—and understanding the parallel applications to get more net performance out of them,” said Skjellum, SimCenter director since 2017. “We are expected to bring into practical and wider use the best of these results to make the applications work better on the largest parallel machines for exascale.”
UTC, the University of New Mexico and the University of Alabama at Birmingham will operate as a single investigative team.
“We expect to make an impact in the best case on all those other (Predictive Science Academic Alliance Program) centers by making what we do become best practices across this high-end computing environment over five years,” Skjellum said.
Skjellum has a long history with the research leaders at the other two institutions: Patrick Bridges at the University of New Mexico and Purushotham Bangalore at the University of Alabama at Birmingham.
“It’s just coincidental because I work with a lot of people, but when it came time to get this center, it happened to be two of my former students,” he said. “I’m particularly proud of them as my colleagues. We have worked together for over 25 years, and it’s a very long collaboration to lead to this center. For us, it’s a very cool thing.”
Joanne Romagni, UTC vice chancellor for research and dean of the graduate school, said receiving the national research designation puts UTC in the same category as other major institutions working in modeling, simulation and high-performance computing.
“UTC hasn’t had a national research center designation of this magnitude before,” Romagni said. “This center builds on the 18-year history of SimCenter’s strengths in modeling and simulation as well as high-performance computing.
“What it means for research is that Drs. Tanis, Arabshahi and Skjellum are part of a group of three universities working on the most demanding problems in computing within a larger alliance of academic institutions, all supporting the U.S. Department of Energy’s three defense labs.
“Our undergraduate and grad students will get advantaged access to working on these elite class systems in collaboration with world-class scientists and engineers at the labs. Even more importantly long term, opportunities for internships and potential future employment with these national labs are opened to our student participants,” she said.
Through the Center for Understandable, Performant Exascale Communication Systems, the program is poised to create and maintain crucial collaborations with lab personnel at the Department of Energy’s National Nuclear Security Administration and other research institutions.
“It is a big honor to be trusted and to be sitting at the same peer table with other leading universities,” Skjellum said. “What is more gratifying to us as professors and scientists is enabling our students to work with their peers at those other institutions. This plays to our mission of helping our students. We have students ready to do the work, and we have the research and acumen to achieve these research goals.”