
A Novel Class-Imbalance Learning Approach for Both Within-Project and Cross-Project Defect Prediction


Abstract

Software defect prediction (SDP) is a feasible way to improve test efficiency and help guarantee software reliability. However, real software projects contain more clean instances than defective ones, which severely skews the class distribution and degrades classifier performance. Solving the class-imbalance problem in SDP has therefore attracted growing attention from industry and academia in software engineering. In this paper, we propose a novel class-imbalance learning approach for both the within-project and the cross-project class-imbalance problems. We use the idea of stratification embedded in nearest neighbor (STr-NN) to produce evolving training datasets with balanced data. For within-project prediction, we directly employ the STr-NN approach. For cross-project prediction, we first apply transfer component analysis to mitigate the distribution differences between the source and target datasets, and then employ the STr-NN approach on the transferred data. We conduct experiments on the PROMISE and NASA datasets using ensemble learning based on weighted voting. Experimental results indicate that our approach achieves higher area under the curve (AUC) and recall, and comparable probability of false alarm (pf) and F-measure, relative to some existing methods for the class-imbalance problem.
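
The abstract reports four evaluation measures: AUC, recall, pf, and F-measure. As a minimal sketch, assuming the standard confusion-matrix definitions (recall = TP / (TP + FN), pf = FP / (FP + TN), F-measure as the harmonic mean of precision and recall) and a pairwise-ranking estimate of AUC, they could be computed as follows. The function names `defect_metrics` and `auc_score` and the toy labels are hypothetical and are not taken from the paper.

```python
def defect_metrics(y_true, y_pred):
    """Recall, pf, precision, and F-measure for binary labels (1 = defective)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    # Guard every ratio against an empty denominator.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"recall": recall, "pf": pf, "precision": precision, "f_measure": f_measure}


def auc_score(y_true, scores):
    """AUC as the fraction of (defective, clean) pairs ranked correctly.

    Ties between a defective and a clean score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one instance of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy imbalanced sample: 2 defective and 8 clean instances (illustrative only).
    y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
    print(defect_metrics(y_true, y_pred))
    print(auc_score([1, 1, 0, 0, 0], [0.9, 0.4, 0.5, 0.2, 0.1]))
```

On imbalanced data, a classifier that labels everything clean gets high accuracy but zero recall, which is one reason measures like these are commonly reported together instead of accuracy alone.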

Bibliographic details

  • Source
    IEEE Transactions on Reliability | 2020, No. 1 | pp. 40-54 | 15 pages
  • Authors

  • Author affiliations

    China Univ Min & Technol Sch Comp Sci & Technol Xuzhou 221116 Jiangsu Peoples R China|Minist Educ Engn Res Ctr Mine Digitalizat Xuzhou 221116 Jiangsu Peoples R China|Zaozhuang Univ Dept Informat Sci & Engn Zaozhuang 277160 Peoples R China;

    China Univ Min & Technol Sch Comp Sci & Technol Xuzhou 221116 Jiangsu Peoples R China|Minist Educ Engn Res Ctr Mine Digitalizat Xuzhou 221116 Jiangsu Peoples R China;

    Guilin Univ Elect Technol Guangxi Key Lab Trusted Software Guilin 541004 Peoples R China;

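  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords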

    Training; Learning systems; Software; Measurement; NASA; Predictive models; Machine learning; Class-imbalance; cross-project; ensemble learning; software defect prediction (SDP); within-project;
