Adding Machine Learning to the Management of Heterogeneous Resources

March 25, 2021

Thaleia-Dimitra Doudali


Time:   3:30pm
Location:   Zoom3 - https://zoom.us/j/3911012202 (pass: s3)

Computing platforms increasingly incorporate heterogeneous hardware technologies as a way to scale application performance and resource capacities and to achieve cost effectiveness. However, this heterogeneity, along with the greater irregularity in the behavior of emerging workloads, renders existing resource management approaches ineffective. This results in a significant gap between realized and achievable performance and efficiency. My research develops a practical approach for using machine learning (ML) to bridge this gap, specifically targeting systems with heterogeneous memory technologies. In this talk, I will address the key challenges in realizing this approach. These include which ML method to use, which part of the memory management stack to target, and how to configure its deployment. I will present new techniques for integrating ML into existing system-level management of hybrid memory hardware, which, on average, bridge 80% of the performance gap. I will also describe system-level configuration management methods that lead to additional 3x improvements. Finally, I will present cross-stack synergies that further facilitate the practical integration of machine learning in systems.