Cache Injection for Parallel Applications
Edgar Leon, IBM Austin Research Laboratory, Dec 15, 2010
For two decades, the memory wall has limited the ability of many applications to benefit from improvements in processor speed. Cache injection addresses this disparity for I/O by writing data into a processor's cache directly from the I/O bus. This technique reduces data latency and, unlike data prefetching, improves memory bandwidth utilization. These improvements are significant for data-intensive applications whose performance is dominated by compulsory cache misses. In this talk, Dr. Leon presents a detailed evaluation of three injection policies and their effect on the performance of two parallel applications and several collective micro-benchmarks. He will demonstrate that the effectiveness of cache injection is a function of the communication characteristics of applications, the injection policy, the target cache, and the severity of the memory wall. For example, he will show that injecting message payloads into the L3 cache can improve the performance of network-bandwidth-limited applications. In addition, Dr. Leon will show that cache injection improves the performance of several collective operations, though not of all-to-all operations (an implementation-dependent result). This study shows negligible pollution of the target caches.