Published in: IEEE Transactions on Reliability

Two-Stage Cost-Sensitive Learning for Software Defect Prediction

Abstract

Software defect prediction (SDP), which classifies software modules into defect-prone and not-defect-prone categories, provides an effective way to maintain high-quality software systems. Most existing SDP models attempt to attain lower classification error rates rather than lower misclassification costs. However, in many real-world applications, misclassifying defect-prone modules as not-defect-prone ones usually leads to higher costs than misclassifying not-defect-prone modules as defect-prone ones. In this paper, we first propose a new two-stage cost-sensitive learning (TSCS) method for SDP, which uses cost information not only in the classification stage but also in the feature selection stage. Then, specifically for the feature selection stage, we develop three novel cost-sensitive feature selection algorithms, namely Cost-Sensitive Variance Score (CSVS), Cost-Sensitive Laplacian Score (CSLS), and Cost-Sensitive Constraint Score (CSCS), by incorporating cost information into traditional feature selection algorithms. The proposed methods are evaluated on seven real data sets from NASA projects. Experimental results suggest that our TSCS method achieves better performance in software defect prediction than existing single-stage cost-sensitive classifiers. Our experiments also show that the proposed cost-sensitive feature selection methods outperform traditional cost-blind feature selection methods, validating the efficacy of using cost information in the feature selection stage.
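The distinction the abstract draws between error rate and misclassification cost can be illustrated with a small sketch. This is not the TSCS method or any of the proposed scores; the labels, the two sets of predictions, and the cost values (a missed defect-prone module costing 10, a false alarm costing 1) are all hypothetical, chosen only to show that the classifier with the lower error rate can carry the higher total cost.

```python
def error_rate(y_true, y_pred):
    """Fraction of modules whose predicted label differs from the true label."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def total_cost(y_true, y_pred, cost_fn, cost_fp):
    """Sum of misclassification costs.

    cost_fn: cost of labeling a defect-prone module (1) as not-defect-prone (0).
    cost_fp: cost of labeling a not-defect-prone module (0) as defect-prone (1).
    """
    cost = 0.0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 0:
            cost += cost_fn
        elif t == 0 and p == 1:
            cost += cost_fp
    return cost


# Hypothetical data: 1 = defect-prone, 0 = not-defect-prone.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
pred_a = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # misses both defect-prone modules
pred_b = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # finds both, raises three false alarms

# Assumed costs (illustrative only, not taken from the paper).
COST_FN, COST_FP = 10.0, 1.0

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, error_rate(y_true, pred), total_cost(y_true, pred, COST_FN, COST_FP))
```

Under these assumed costs, classifier A has the lower error rate (2 of 10 wrong versus 3 of 10) but the higher total cost, which is the situation a cost-sensitive method is meant to address.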