Paper ID: 2305.16323
Detecting Concept Drift for the reliability prediction of Software Defects using Instance Interpretation
Zeynab Chitsazian, Saeed Sedighian Kashi, Amin Nikanjam
In Just-In-Time Software Defect Prediction (JIT-SDP), concept drift (CD) can arise from changes in the software development process, in the complexity of the software, or in user behavior, and it may undermine the stability of a JIT-SDP model over time. In addition, class imbalance in JIT-SDP data may compromise the accuracy of CD detection methods when rebalancing is applied. To the best of our knowledge, this issue has not been explored. Methods that check the stability of JIT-SDP models over time using labeled evaluation data have been proposed; however, labels for future data may not be available promptly. We aim to develop a reliable JIT-SDP model by detecting CD points directly from changes in the interpretation of unlabeled simplified and resampled data. To evaluate our approach, we first obtained baseline methods that identify CD points on labeled data by monitoring model performance. We then compared the output of the proposed methods with these baselines, which monitor threshold-dependent and threshold-independent criteria, using well-known performance measures for CD detection methods: accuracy, MDR, MTD, MTFA, and MTR. We also used the Friedman statistical test to assess the effectiveness of our approach. The results show that our proposed methods are more compatible with baselines based on threshold-independent criteria when applied to rebalanced data, and with baselines based on threshold-dependent criteria when applied to simple data.
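The detection measures named in the abstract can be illustrated with a minimal sketch. It assumes the definitions commonly used in the drift-detection literature: MDR as the missed detection rate, MTD as the mean time to detection, MTFA as the mean time between false alarms, and MTR as (MTFA / MTD) * (1 - MDR). The abstract does not spell these out, so the paper's exact definitions may differ. The function name, the `window` tolerance, and the approximation of MTFA as stream length divided by the number of false alarms are all assumptions made for illustration.

```python
from math import inf


def drift_detection_scores(true_drifts, detections, stream_length, window):
    """Score detected drift points against reference drift points.

    Hypothetical helper: a detection counts as a hit when it falls in
    [drift, drift + window); every unmatched detection is a false alarm.
    """
    detections = sorted(detections)
    matched = set()
    delays = []
    for drift in sorted(true_drifts):
        # First not-yet-matched detection inside the tolerance window.
        hit = next(
            (d for d in detections
             if drift <= d < drift + window and d not in matched),
            None,
        )
        if hit is not None:
            matched.add(hit)
            delays.append(hit - drift)

    false_alarms = [d for d in detections if d not in matched]
    missed = len(true_drifts) - len(delays)

    mdr = missed / len(true_drifts) if true_drifts else 0.0
    mtd = sum(delays) / len(delays) if delays else inf
    # Assumption: MTFA approximated as stream length per false alarm.
    mtfa = stream_length / len(false_alarms) if false_alarms else inf

    if not delays:
        mtr = 0.0   # no drift was detected at all
    elif mtd == 0:
        mtr = inf   # every detected drift was caught with zero delay
    else:
        mtr = (mtfa / mtd) * (1.0 - mdr)

    return {"MDR": mdr, "MTD": mtd, "MTFA": mtfa, "MTR": mtr}


# Made-up positions: two reference drifts, three detections.
scores = drift_detection_scores(
    true_drifts=[100, 300],
    detections=[110, 250, 330],
    stream_length=500,
    window=50,
)
print(scores)
```

In this made-up example both reference drifts are detected (delays of 10 and 30 steps) and the detection at position 250 is a false alarm. A higher MTR indicates a better trade-off between fast detection and few false alarms.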
Submitted: May 6, 2023