Program comprehension is a necessary step during software understanding and maintenance. It is usually performed by analyzing data gathered from program execution. These execution traces reveal the relationship between high-level concepts (features) and low-level implementation details. However, identifying features from execution traces is difficult and time-consuming due to their large volume and complexity. Existing work assists the process by semi-automated tools, leveraging either human input or prior knowledge of the program implementation. It remains the key limitation towards a general method to build such mapping. In this paper, we proposed TRASE, an approach to identify features by segmenting the execution traces without the need for any human intervention. The segments are identified by mining recurring patterns on a sequence database, which is constructed by numerous execution traces gathered from normal use of a program. Each segment refers to a feature and the labels are inferred from the traces to assist program comprehension. The performance is evaluated on traces collected from android applications and a synthetic dataset. TRASE achieved up to 86% in F1 score and the result indicates that it is robust to highly variate traces while efficient for large data.
Bibliographical noteThis work was supported by Key-Area Research and Development Program of Guangdong Province (No. 2020B010164002).
- Dynamic analysis
- Execution trace
- Program comprehension
- Sequential pattern mining