The SimCenter has been awarded the designation of “PSAAP III Focused Investigative Center” under the banner of the Center for Understandable Performant Exascale Communication Systems (CUP-ECS). The center, led by the University of New Mexico (UNM) in partnership with UTC and the University of Alabama at Birmingham (UAB), is funded by the third iteration of the Predictive Science Academic Alliance Program (PSAAP III) under the National Nuclear Security Administration (NNSA). PSAAP III is highly competitive: its participating universities include MIT, UIUC, UT Austin, and Stanford, along with other elite U.S. institutions of higher learning.
Three UTC faculty will co-lead the five-year project: Dr. Tony Skjellum (SimCenter), Dr. Craig Tanis (Computer Science & Engineering), and Dr. Abi Arabshahi (SimCenter). CUP-ECS is poised to create and maintain crucial collaborations with NNSA lab personnel via student exchanges. UTC students, ranging from undergraduates to Ph.D. candidates, will conduct R&D with UTC faculty members and laboratory personnel on some of the most demanding high-performance applications, running on some of the world’s fastest computers. The team will also integrate the center’s research into computer science courses at each institution.
Center personnel, housed across all three universities, will design, develop, and optimize new communication abstractions, model their performance, and perform roofline-style constraint analyses of their impact on NNSA application performance. These innovations and insights will then be used to help NNSA application and runtime designers understand, predict, and optimize key trade-offs between application communication strategies and application performance. Systematically assessing the impact of these insights on NNSA applications will in turn lead to newer runtime abstractions and optimizations that further improve application performance, starting the next iteration of models, assessment, and innovation.
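As a rough illustration of what a roofline-style bound looks like, the sketch below uses the classic roofline formula, in which attainable performance is capped by the lesser of peak compute rate and data-movement bandwidth times the work done per byte moved. The function name and the machine numbers are hypothetical, and the center's own constraint analyses may be formulated quite differently.

```python
def roofline_bound(peak_flops, bandwidth_bytes_per_s, flops_per_byte):
    """Classic roofline bound: attainable FLOP/s is limited either by
    peak compute or by how fast data can be moved."""
    return min(peak_flops, bandwidth_bytes_per_s * flops_per_byte)


# Hypothetical machine parameters, chosen only for illustration.
PEAK_FLOPS = 1.0e12   # peak compute rate, FLOP/s
BANDWIDTH = 1.0e11    # data-movement bandwidth, bytes/s

# Intensity at which the bound switches from bandwidth-limited
# to compute-limited (the "ridge point").
ridge_point = PEAK_FLOPS / BANDWIDTH

# A low-intensity kernel is limited by data movement...
data_bound = roofline_bound(PEAK_FLOPS, BANDWIDTH, 2.0)
# ...while a high-intensity kernel is limited by peak compute.
compute_bound = roofline_bound(PEAK_FLOPS, BANDWIDTH, 50.0)

if __name__ == "__main__":
    print(f"ridge point: {ridge_point} FLOP/byte")
    print(f"intensity 2.0  -> {data_bound:.3e} FLOP/s")
    print(f"intensity 50.0 -> {compute_bound:.3e} FLOP/s")
```

An analysis of this kind shows at a glance whether an application is limited by computation or by data movement, which is the sort of trade-off the paragraph above describes.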
The Message Passing Interface (MPI) is still the predominant communication system on upcoming exascale supercomputers, and CUP-ECS is specifically looking to develop new abstractions, models, designs, and implementations to revolutionize the development of both MPI and similar high-end communication systems (e.g., GASNet and Sandia Portals). This work will include developing high-fidelity mathematical models of communication systems to make them faster, easier to understand, and more predictable.
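To give a sense of what a mathematical model of a communication system can look like in its simplest form, the sketch below implements the textbook latency-bandwidth model, in which sending a message costs a fixed startup latency plus a per-byte transfer cost. This is a deliberately simple stand-in rather than one of the high-fidelity models the center plans to develop, and the function name and parameter values are hypothetical.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Latency-bandwidth model: time = startup latency + bytes / bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


# Hypothetical network parameters, chosen only for illustration.
LATENCY = 1.0e-6     # startup cost per message, seconds
BANDWIDTH = 1.0e10   # sustained bandwidth, bytes/s

# Message size at which latency and transfer cost are equal.
# Smaller messages are dominated by latency, larger ones by bandwidth.
half_performance_bytes = LATENCY * BANDWIDTH

small_message = transfer_time(8, LATENCY, BANDWIDTH)
large_message = transfer_time(1 << 30, LATENCY, BANDWIDTH)

if __name__ == "__main__":
    print(f"8-byte message: {small_message:.3e} s")
    print(f"1 GiB message:  {large_message:.3e} s")
    print(f"half-performance size: {half_performance_bytes:.0f} bytes")
```

Even a model this small makes one prediction usable by application designers: many tiny messages pay the startup latency repeatedly, so aggregating them into fewer large messages can reduce total communication time.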
CUP-ECS is also poised to create and maintain crucial collaborations with NNSA lab personnel. These collaborations will be supported by (1) 10-week student visits to the national laboratories (generally during the summer), (2) periodic week-long PI and staff visits to NNSA laboratories, (3) annual NNSA visits to the center, and (4) tutorials and symposia offered by CUP-ECS personnel at conferences and regular laboratory visits.
The team will integrate research, enabling technologies, and application evaluations into multiple courses, teaching core parallel computing techniques to a broad range of students at each institution. At the introductory undergraduate level, students will be introduced to modern parallel computing and communication system concepts. Advanced undergraduate and graduate electives will focus on teaching CUP-ECS-developed techniques in the design of modern HPC applications. Finally, specialized graduate-level electives (potentially also offered as online short courses for NNSA personnel) will focus on mastery of new communication abstractions and models. The courses at each level are as follows:
- Broad Undergraduate: UTC CPSC 2800 Introduction to Operating Systems and CPSC 4550 Computer Networks
- Undergraduate/Graduate Elective: UTC CPSC 5260 Introduction to Parallel Algorithms
- Specialized Graduate: UTC CPSC 7110 High-Performance Scientific Computing
The team will pursue a three-pronged strategy to recruit U.S. students into the Ph.D. and postdoctoral programs at each university, so that students trained on the project become strong candidates for the future NNSA workforce:
- Recruitment: Offer competitive support to undergraduates at strong U.S. institutions who show potential for exciting Ph.D. topics in HPC, with face-to-face recruitment efforts aimed particularly at women and ethnic minorities in each university's home state and nearby
- Retention: Provide significant mentoring, support, and guidance to ensure that students and postdoctoral fellows make timely progress, and give senior students and postdocs frequent interaction with nearby national labs, a career development opportunity not as readily available at other institutions
- Engagement: Connect students and postdocs with laboratory collaborators for internships and/or visits and provide opportunities to attend major conferences during their Ph.D. or postdoctoral program
The research team has a long history of using these techniques to recruit, mentor, graduate, and place U.S. citizens in NNSA/DOE laboratories. The center’s PIs have already demonstrated that a research assistantship–to–laboratory internship graduate student pathway improves recruitment and retention of U.S. citizen student researchers.