PerFlow

A domain-specific programming framework for performance analysis of large-scale parallel programs

GitHub

About

User Guide

Documentation

Introduction

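PerFlow is an end-to-end toolchain that integrates performance data collection and performance data analysis. In the collection phase, PerFlow combines static analysis of the binary with sampling-based dynamic analysis. In the analysis phase, PerFlow organizes the performance data into a graph, the Program Abstraction Graph, in which vertices represent code snippets and edges represent dependences between code snippets, including data dependence, control dependence, inter-thread lock dependence, and inter-process communication dependence. PerFlow also provides a programming abstraction based on dataflow graphs, which lets users express the whole process of a performance analysis task as a dataflow graph. Developers can use PerFlow's Python API to write custom analyses of the performance data and can combine it with data analysis packages such as numpy and sklearn. The main benefit of PerFlow is that it reduces the complexity of hand-writing performance analysis tasks.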
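To make the graph structure described above concrete, here is a minimal, self-contained sketch of a graph whose vertices are code snippets and whose edges carry a dependence kind. It is an illustration only: `ToyPAG`, `Vertex`, `Dep`, the snippet names, and the timings are all assumptions made for this example and are not PerFlow's own classes or data.

```python
from dataclasses import dataclass, field
from enum import Enum


class Dep(Enum):
    """Dependence kinds named in the description above."""
    DATA = "data"
    CONTROL = "control"
    LOCK = "inter-thread lock"
    COMM = "inter-process communication"


@dataclass
class Vertex:
    name: str           # label of the code snippet (hypothetical)
    time: float = 0.0   # sampled execution time in seconds (made-up values)


@dataclass
class ToyPAG:
    """Toy stand-in for a Program Abstraction Graph; not PerFlow's own class."""
    vertices: dict = field(default_factory=dict)  # name -> Vertex
    edges: list = field(default_factory=list)     # (src, dst, Dep) triples

    def add_vertex(self, name, time=0.0):
        self.vertices[name] = Vertex(name, time)

    def add_edge(self, src, dst, dep):
        self.edges.append((src, dst, dep))

    def successors(self, name, dep=None):
        """Names that `name` points to, optionally filtered by dependence kind."""
        return [d for s, d, k in self.edges
                if s == name and (dep is None or k is dep)]


g = ToyPAG()
g.add_vertex("main", 0.1)
g.add_vertex("compute_loop", 4.2)
g.add_vertex("MPI_Send", 1.3)
g.add_vertex("MPI_Recv", 2.7)
g.add_edge("main", "compute_loop", Dep.CONTROL)
g.add_edge("compute_loop", "MPI_Send", Dep.DATA)
g.add_edge("MPI_Send", "MPI_Recv", Dep.COMM)

# Follow only inter-process communication edges out of MPI_Send.
comm_successors = g.successors("MPI_Send", Dep.COMM)
print(comm_successors)
```

Keeping the dependence kind on each edge is what allows an analysis to walk only one class of dependence, for example following communication edges from one process's code to another's.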

Performance analysis is widely used to identify performance issues in parallel applications. However, complex communication and data dependences, as well as interactions between different kinds of performance issues, make efficient performance analysis even harder. Although a large number of performance tools have been designed, accurately pinpointing the root causes of such complex performance issues still requires specific, in-depth analysis. Implementing each such analysis normally requires significant human effort and domain knowledge. To reduce the burden of implementing accurate performance analysis, we propose a domain-specific programming framework named PerFlow. PerFlow abstracts the step-by-step process of performance analysis as a dataflow graph. This dataflow graph consists of the main performance analysis sub-tasks, called passes, which can either be provided by PerFlow’s built-in analysis library or be implemented by developers to meet their own requirements. Moreover, to achieve effective analysis, we propose a Program Abstraction Graph to represent the performance of a program execution and then leverage various graph algorithms to automate the analysis. We demonstrate the efficacy of PerFlow through three case studies of real-world applications with up to 700K lines of code. Results show that PerFlow significantly eases the implementation of customized analysis tasks. In addition, PerFlow is able to perform analysis and locate performance bugs automatically and effectively.
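The idea of expressing an analysis as a dataflow graph of passes can be sketched in plain Python. The example below is not PerFlow's implementation or API: the pass functions, the `dataflow` dictionary layout, `run_dataflow`, and the sample data are all hypothetical, and the scheduling simply uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+) to run each pass after its upstream passes.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def hotspot_pass(samples):
    """Keep the two snippets with the largest sampled time."""
    return sorted(samples, key=lambda s: s["time"], reverse=True)[:2]


def comm_filter_pass(samples):
    """Keep only snippets whose name marks them as MPI calls."""
    return [s for s in samples if s["name"].startswith("MPI_")]


def report_pass(hot, comm):
    """Join two upstream results: hot snippets that are also MPI calls."""
    comm_names = {s["name"] for s in comm}
    return [s["name"] for s in hot if s["name"] in comm_names]


# Dataflow graph: node name -> (pass function, names of upstream nodes).
dataflow = {
    "hotspot": (hotspot_pass, ["input"]),
    "comm": (comm_filter_pass, ["input"]),
    "report": (report_pass, ["hotspot", "comm"]),
}


def run_dataflow(graph, input_data):
    """Run every pass once all of its upstream passes have produced results."""
    deps = {node: set(upstream) for node, (_, upstream) in graph.items()}
    results = {"input": input_data}
    for node in TopologicalSorter(deps).static_order():
        if node == "input":
            continue  # the input is data, not a pass
        func, upstream = graph[node]
        results[node] = func(*(results[u] for u in upstream))
    return results


# Made-up samples standing in for collected performance data.
samples = [
    {"name": "compute_loop", "time": 4.2},
    {"name": "MPI_Recv", "time": 2.7},
    {"name": "MPI_Send", "time": 1.3},
    {"name": "main", "time": 0.1},
]
results = run_dataflow(dataflow, samples)
print(results["report"])
```

Because each pass only consumes the outputs of its upstream nodes, a built-in pass and a user-written pass can be mixed freely in the same graph, which is the property the framework description emphasizes.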

How to Use

import perflow as pf

# Static binary analysis and dynamic profiling
pag = pf.run(bin='a.out', cmd='mpirun -np 4 ./a.out')
# Hotspot analysis
results = pf.hotspot_detection(pag.vs)
# Report the results
pf.report(results)
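Beyond a built-in pass such as the hotspot detection above, a custom analysis can be written as an ordinary Python function. The sketch below shows one possible user-defined pass, a simple load-imbalance check. It does not use PerFlow's API: the input layout (a dictionary from snippet name to per-process times), the function names, the threshold, and the timings are all assumptions chosen for illustration.

```python
from statistics import mean


def imbalance_factor(times):
    """Slowest process's time divided by the mean time (1.0 means balanced)."""
    avg = mean(times)
    return max(times) / avg if avg > 0 else 1.0


def imbalance_pass(per_process_times, threshold=1.2):
    """Return (snippet, factor) pairs above the threshold, worst first."""
    scored = [
        (name, imbalance_factor(times))
        for name, times in per_process_times.items()
    ]
    flagged = [(name, factor) for name, factor in scored if factor > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up per-process times (seconds) for a run on 4 processes.
per_process_times = {
    "compute_loop": [4.0, 4.0, 4.0, 8.0],
    "MPI_Allreduce": [1.0, 1.0, 1.0, 1.0],
}
flagged = imbalance_pass(per_process_times)
print(flagged)
```

In this toy data only `compute_loop` is flagged, because one process takes twice as long as the others while `MPI_Allreduce` is perfectly balanced.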

Publications

Yuyang Jin, Haojie Wang, Runxin Zhong, Chen Zhang, Jidong Zhai. PerFlow: a domain specific framework for automatic performance analysis of parallel applications. In Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2022, pp. 177-191. [PDF]

@inproceedings {jin2022perflow,
  title = {PerFlow: a domain specific framework for automatic performance analysis of parallel applications},
  author = {Jin, Yuyang and Wang, Haojie and Zhong, Runxin and Zhang, Chen and Zhai, Jidong},
  booktitle = {Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming},
  pages = {177--191},
  year = {2022}
}

License