Gregory Ganger
Professor, Affiliated Faculty

Office: 2208 Mehrabian Collaborative Innovation Center
Email: ganger@ece.cmu.edu
Phone: (412) 268-1297
Department: CIT - Electrical and Computer Engineering
Computer Science Department: Affiliated
Administrative Support Person: Karen Lindenfelser

Research Interests
Systems
Data-Intensive and Cloud Computing
Distributed Systems

Advisees
Sanjith Athlur
Timothy Kim
Hojin Park
Ziyue Qiu
Theo Gregersen

CSD Courses Taught
15746 - Fall, 2025
15719 - Spring, 2025
15746 - Fall, 2024
15719 - Spring, 2024

I have broad research interests in computer systems, including cloud computing, storage/file systems, operating systems and distributed systems. I am involved in several ongoing projects in such areas as systems for large-scale ML, cloud/cluster resource scheduling, and exploitation of new storage/NVM technologies.

Big-learning systems for Big Data
Modern data analytics often relies on statistical machine learning (ML) to parameterize models that fit observation data, for use in making predictions, correlating causes with effects, etc. Growth in data and desired model precision dictate parallel execution of ML algorithms on clusters, with the corresponding work distribution, synchronization, and data consistency challenges. The big-learning group is exploring powerful new approaches for efficient, scalable, and robust big-learning on Big Data.

Cloud Computing
We are exploring software systems challenges in efficiently supporting and exploiting cloud computing, such as resource allocation/scheduling and exploiting elasticity for stateful services (e.g., storage) and long-running computations (e.g., large-scale ML).

Parallel Data Lab (PDL)
As Director of the Parallel Data Lab, I lead and collaborate on a number of storage-related projects in areas such as storage system architecture, file systems, and Big Data systems. For example, in addition to the activities discussed above, we are exploring how system software should change to accommodate new storage technologies like non-volatile RAM (e.g., PCM) and best exploit Flash.

Publications

Conference
GRAPHPIPE: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism
2025 • Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Vol. 1, ASPLOS 2025 • 557-571
Jeon B, Wu M, Cao S, Kim S, Park S, Aggarwal N, Unger C, Arfeen D, Liao P, Miao X, Alizadeh M, Ganger GR, Chen T, Jia Z

Preprint
Nonuniform-Tensor-Parallelism: Mitigating GPU failure impact for Scaled-up LLM Training
2025
Arfeen D, Mudigere D, More A, Gopireddy B, Inci A, Ganger GR

Conference
Okapi: Decoupling Data Striping and Redundancy Grouping in Cluster File Systems
2025 • Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2025 • 897-914
Athlur S, Kim T, Kadekodi S, Maturana F, Ramos X, Merchant A, Rashmi KV, Ganger GR

Conference
Morph: Efficient File-Lifetime Redundancy Management for Cluster File Systems
2024 • Proceedings of the 2024 ACM SIGOPS 30th Symposium on Operating Systems Principles, SOSP 2024 • 330-346
Kim T, Athlur S, Kadekodi S, Maturana F, Delvira D, Merchant A, Ganger GR, Rashmi KV

Preprint
A Call for Research on Storage Emissions
2024
McAllister S, Kazhamiaka F, Berger D, Fonseca R, Frost K, Ogus A, Shah M, Bianchini R, Amvrosiadis G, Beckmann N, Ganger G