A Gossip-based Approach to Exascale System Services

Authors:
  • Soltero, Philip
  • Bridges, Patrick
  • Arnold, Dorian
  • Lang, Michael
Large-scale server deployments in the commercial internet space have long used group-based protocols, such as peer-to-peer and gossip, to coordinate services and data across globally distributed data centers. Here we look at applying these methods, which are themselves derived from early work in distributed systems, to the large-scale, tightly coupled systems used in high-performance computing. In this paper, we study gossip protocols and their ability to aggregate data across large-scale systems in support of system services. We report the accuracy and performance of the resulting estimates, and then focus on a simulated power-capping service to show the tradeoffs of this approach in practice.
International Workshop on Runtime and Operating Systems for Supercomputers (ROSS) 2013. Held in conjunction with ICS 2013, Eugene, Oregon, USA, June 10, 2013
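
The gossip-based aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the protocol evaluated in the paper; it is a generic pairwise-averaging gossip, assumed here only for illustration, in which each node repeatedly averages its local value with a randomly chosen peer until all nodes hold an estimate of the global mean. The function names (`gossip_average`, `max_error`) and the per-node power readings in the usage note are hypothetical.

```python
import random


def gossip_average(values, rounds, seed=0):
    """Pairwise-averaging gossip (illustrative, not the paper's protocol).

    Returns one estimate of the global mean per node.
    """
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    estimates = list(values)
    n = len(estimates)
    if n < 2:
        return estimates  # nothing to exchange with
    for _ in range(rounds):
        for i in range(n):
            # Node i picks a random peer other than itself.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            # Both nodes adopt the pair's average. The global sum, and
            # therefore the global mean, is unchanged by this step.
            avg = (estimates[i] + estimates[j]) / 2.0
            estimates[i] = avg
            estimates[j] = avg
    return estimates


def max_error(estimates, true_mean):
    """Largest absolute deviation of any node's estimate from the true mean."""
    return max(abs(e - true_mean) for e in estimates)
```

As a usage example, a simulated power-capping service could treat `values` as hypothetical per-node power readings in watts, call `gossip_average(readings, rounds=20)`, and let each node compare its own estimate against a per-node cap. Running fewer rounds lowers communication cost but leaves a larger `max_error`, which is the kind of accuracy versus performance tradeoff the abstract refers to.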