PyDac: A Resilient Run-time Framework for Divide-and-Conquer Applications on a Heterogeneous Many-core Architecture

Authors:
  • Huang, Bin
  • Sass, Ron
  • DeBardeleben, Nathan
  • Blanchard, Sean
Proceedings of the 6th Workshop on UnConventional High Performance Computing 2013 (UCHPC 2013)
Heterogeneous many-core architectures that consist of big cores and small cores promise a good balance between single-thread performance and multi-thread throughput. Such systems impose challenges on the run-time system design. One such challenge is hardware reliability: the run-time system will likely need to contain faults and reduce the chance of a fault propagating. We propose a Python-based run-time framework called PyDac. PyDac supports a two-level programming model based on the divide-and-conquer strategy. The framework isolates all of the data that a small core is working on, so a faulty small core can be reset independently and its task restarted. To test this run-time, an unconventional heterogeneous architecture consisting of PowerPC and ARM cores was emulated on an FPGA. We demonstrate the feasibility of this run-time design with Strassen’s algorithm.
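The idea in the abstract (divide a problem into tasks, give each task a private copy of its data, and restart a task when its core faults) can be illustrated with a minimal Python sketch. This is not PyDac's API: `CoreFault`, `run_on_small_core`, the `faults` test hook, and `max_restarts` are hypothetical names introduced here, `copy.deepcopy` merely stands in for hardware-level data isolation, and the retry loop stands in for resetting a small core. Strassen's algorithm is used as the divide-and-conquer workload because the abstract names it.

```python
import copy


class CoreFault(Exception):
    """Hypothetical signal that a small core failed while running a task."""


def run_on_small_core(task, *args, faults=None, max_restarts=3):
    """Run `task` on a private copy of its inputs and restart it on a fault.

    `faults` is an illustrative test hook: each item in the list triggers
    one injected fault before the task is allowed to run.
    """
    for _ in range(max_restarts + 1):
        private_args = copy.deepcopy(args)  # task never touches caller data
        try:
            if faults:
                faults.pop()
                raise CoreFault("injected fault")
            return task(*private_args)
        except CoreFault:
            continue  # "reset" the core and rerun from the pristine inputs
    raise RuntimeError("task still failing after restarts")


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def split(m):
    """Split a square matrix into four equal quadrants."""
    h = len(m) // 2
    return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
            [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])


def leaf_multiply(a, b):
    """Plain matrix product, used as the task handed to a small core."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def strassen(a, b, leaf=1, faults=None):
    """Strassen multiply for square matrices whose size is a power of two."""
    if len(a) <= leaf:
        return run_on_small_core(leaf_multiply, a, b, faults=faults)
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)

    def rec(x, y):
        return strassen(x, y, leaf, faults)

    m1 = rec(add(a11, a22), add(b11, b22))
    m2 = rec(add(a21, a22), b11)
    m3 = rec(a11, sub(b12, b22))
    m4 = rec(a22, sub(b21, b11))
    m5 = rec(add(a11, a12), b22)
    m6 = rec(sub(a21, a11), add(b11, b12))
    m7 = rec(sub(a12, a22), add(b21, b22))
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom
```

Calling `strassen(A, B, faults=[True, True])` injects two faults into the first leaf task; because each attempt starts from a fresh copy of the inputs, the restarted task should produce the same product as a fault-free run. In a real system the isolation and reset would be enforced by the hardware and run-time rather than by copying Python objects.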

© 2018 New Mexico Consortium