FRLink: Improving the recovery of missing issue-commit links by revisiting file relevance
Abstract
Context: Although linking issues and commits plays an important role in software verification and maintenance, such link information is not always explicitly provided during software development or maintenance activities. Current practices for recovering these links depend heavily on tedious manual examination. To recover missing links automatically, several approaches have been proposed that compare issue reports with the log messages and source code files in commits. However, none of these approaches has examined the role of non-source complementary documents in commits, nor have they considered the distinct roles that individual source code files play within the same commit.
Objective: We propose to revisit the definition of relevant files contributing to missing link recovery. More specifically, our work extends existing approaches from two perspectives: (1) Inclusion extension: incorporating complementary documents (i.e., non-source documents) to learn from more relevant data; (2) Exclusion extension: analyzing and filtering out irrelevant source code files to reduce data noise.
Method: We propose a File Relevance-based approach (FRLink) to implement the above two considerations. FRLink utilizes the non-source documents in commits, since they typically clarify the details of code changes and contain textual information similar to that of the corresponding issues. Moreover, FRLink differentiates the roles of the source code files in a single commit and, based on similarity analysis, discards files that share no similar code terms with the corresponding issues.
Results: FRLink is evaluated on 6 projects and compared with RCLinker, the latest state-of-the-art approach in missing link recovery. The results show that FRLink outperforms RCLinker in F-Measure by 40.75% when achieving the highest recalls.
Conclusion: FRLink can significantly improve the performance of missing link recovery compared with existing approaches. This indicates that, in missing link recovery studies, sophisticated data selection and processing techniques deserve further discussion, given the increasing variety and volume of information associated with issues and commits.
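The inclusion and exclusion ideas described in the Method paragraph can be illustrated with a small sketch. This is not FRLink's actual implementation: the abstract does not specify the similarity analysis, so a plain shared-term count stands in for it here. The names `code_terms`, `relevant_files`, `min_shared`, and `STOPWORDS`, as well as the sample issue and files, are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical stopword list; a real pipeline would likely use a fuller one.
STOPWORDS = {"the", "when", "for", "and", "with", "this", "that"}


def code_terms(text):
    """Return lowercase terms from text, splitting snake_case and camelCase
    identifiers and dropping stopwords and terms shorter than 3 characters."""
    terms = set()
    for token in re.findall(r"[A-Za-z][A-Za-z0-9_]*", text):
        for part in token.split("_"):
            # Acronym runs (e.g. XML) or capitalized/lowercase words (e.g. Parser).
            for piece in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", part):
                term = piece.lower()
                if len(term) >= 3 and term not in STOPWORDS:
                    terms.add(term)
    return terms


def relevant_files(issue_text, commit_files, min_shared=1):
    """Keep the files of a commit that share at least `min_shared` terms
    with the issue text. Non-source documents are treated like any other
    file (inclusion); files with too few shared terms are dropped (exclusion)."""
    issue_terms = code_terms(issue_text)
    kept = []
    for path, content in commit_files.items():
        shared = issue_terms & code_terms(content)
        if len(shared) >= min_shared:
            kept.append(path)
    return kept


# Hypothetical issue report and commit contents.
ISSUE = "NullPointerException in parseConfig when the cache file is missing"
FILES = {
    "ConfigParser.java": "public Config parseConfig(File cacheFile) { return null; }",
    "Banner.java": "public String render() { return LOGO; }",
    "CHANGES.txt": "Fix crash in parseConfig for a missing cache file",
}

kept = relevant_files(ISSUE, FILES)
print(kept)
```

In this toy commit, the change note `CHANGES.txt` is kept alongside `ConfigParser.java` because both share terms such as `parse`, `config`, and `cache` with the issue, while `Banner.java` shares none and is dropped. A shared-term count is the simplest possible stand-in; a weighted measure such as TF-IDF cosine similarity would be a natural alternative under the same interface.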
