Understanding and isolating the noise in the Linux kernel

Authors:
  • Akkan, Hakan
  • Lang, Michael
  • Liebrock, Lorie
International Journal of High Performance Computing Applications
Scientific applications are interrupted by the operating system far too often. Historically, operating systems have been optimized to time-share a single resource, the CPU. We now have an abundance of cores, but we still swap out the application to run other tasks, which increases the application’s time to solution. In addition, in parallel applications the probability that at least one task reaches a synchronization point late because of such an interruption grows with system scale, which further increases application turnaround time. This paper reviews measures that can be taken to reduce application interruption using only compile-time and run-time configuration of a recent, unmodified Linux kernel. Although these measures have been available for some time, to the best of the authors’ knowledge they have never been applied in a high-performance computing context. We then introduce our invasive method, which removes the involuntary preemption induced by task scheduling. Our experiments show that parallel applications benefit from these modifications even at relatively small scales. At the modest scale of our testbed, we see a 1.91% improvement in a bulk synchronous parallel application, which should translate into larger benefits at extreme scales.
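
The scaling argument in the abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes, purely for illustration, that each task is delayed independently with the same probability before a synchronization point, so the chance that at least one of N tasks arrives late is 1 - (1 - p)^N. The function name `prob_any_task_late` and the sample probability are hypothetical.

```python
def prob_any_task_late(p_single: float, n_tasks: int) -> float:
    """Probability that at least one of n_tasks is delayed before a barrier.

    Assumption (not from the paper): delays are independent and every task
    has the same per-step delay probability p_single.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be in [0, 1]")
    if n_tasks < 0:
        raise ValueError("n_tasks must be non-negative")
    # Complement of "no task is delayed".
    return 1.0 - (1.0 - p_single) ** n_tasks


if __name__ == "__main__":
    p = 1e-4  # hypothetical per-task delay probability per step
    for n in (1, 16, 1024, 65536):
        print(f"{n:>6} tasks: P(at least one late) = {prob_any_task_late(p, n):.4f}")
```

Under these assumptions, even a rare per-task interruption becomes close to certain somewhere in the system once the task count is large, which is why a bulk synchronous parallel application waits on its slowest task more often as scale grows.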
