Conference: IEEE International Conference on Software Quality, Reliability and Security

An Empirical Study of Dynamic Incomplete-Case Nearest Neighbor Imputation in Software Quality Data


Abstract

Software quality prediction is an important yet difficult problem in software project development and management. Historical datasets can be used to build models for software quality prediction. However, missing data significantly degrade the predictive ability of models in knowledge discovery. Instead of ignoring missing observations, we investigate and improve incomplete-case k-nearest neighbor imputation. K-nearest neighbor imputation is widely applied, but it has rarely been refined so that each imputation uses the most appropriate parameter settings. This work conducts imputation on four well-known software quality datasets to assess the impact of the new imputation method we propose. We compare it with mean imputation and other commonly used versions of k-nearest neighbor imputation. The empirical results show that the proposed dynamic incomplete-case nearest neighbor imputation performs better when the missingness is completely at random or non-ignorable, regardless of the percentage of missing values.
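To make the two families of baselines named in the abstract concrete, the following is a minimal sketch of mean imputation and of a fixed-k incomplete-case nearest neighbor imputation. It is not the authors' implementation: the function names, the use of `None` for a missing value, the distance scaled by the number of shared features, and the default `k=2` are all assumptions made for illustration. The dynamic, per-imputation parameter selection that the paper proposes is not described in the abstract and is therefore not shown.

```python
import math


def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # A column with no observed values cannot be imputed; keep None.
        means.append(sum(observed) / len(observed) if observed else None)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def _distance(a, b, skip):
    """Root mean squared difference over features observed in both rows.

    The feature being imputed (`skip`) is excluded. Returns None when the
    two rows share no observed feature (assumed convention).
    """
    shared = [
        j for j in range(len(a))
        if j != skip and a[j] is not None and b[j] is not None
    ]
    if not shared:
        return None
    return math.sqrt(sum((a[j] - b[j]) ** 2 for j in shared) / len(shared))


def knn_impute_incomplete_case(rows, k=2):
    """Impute each None with the mean of its k nearest donors' values.

    A donor only needs the target feature observed; it may be missing
    other features, which is what "incomplete-case" refers to here.
    """
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                d = _distance(row, other, skip=j)
                if d is not None:
                    candidates.append((d, other[j]))
            if not candidates:
                continue  # no usable donor: leave the gap in place
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Hypothetical toy data: three features, None marks a missing value.
rows = [
    [1.0, 2.0, 10.0],
    [1.0, None, 12.0],  # incomplete, yet still a donor for the third feature
    [9.0, 9.0, 50.0],
    [1.0, 2.0, None],
]
print(knn_impute_incomplete_case(rows, k=2))
print(mean_impute(rows))
```

In this toy data the second row is itself incomplete, yet it still serves as a donor for the third feature of the last row. A complete-case variant would discard it and fall back on the distant third row.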
