Journal: Applied Intelligence

Identifying mislabeled training data with the aid of unlabeled data

Abstract

This paper presents a new approach for identifying and eliminating mislabeled training instances for supervised learning algorithms. The novelty of the approach lies in using unlabeled instances to aid the detection of mislabeled training instances, in contrast with existing methods, which rely only on the labeled training instances. The approach is straightforward and can be applied to many existing noise detection methods with only minor modifications. To assess its benefit, we choose two popular noise detection methods: majority filtering (MF) and consensus filtering (CF). The proposed variants, MFAUD and CFAUD, apply our approach to MF and CF respectively; the names denote majority/consensus filtering with the aid of unlabeled data. An empirical study validates the superiority of our approach and shows that MFAUD and CFAUD significantly improve on the performance of MF and CF under different noise ratios and labeled ratios. The improvement is more pronounced when the noise ratio is greater.
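
For orientation, the two baseline filters named in the abstract can be sketched as follows. This is a minimal illustration of majority and consensus filtering as they are commonly described in the noise-filtering literature, not the paper's implementation. The abstract does not say how the unlabeled instances are incorporated, so MFAUD and CFAUD are deliberately not sketched. The base learners (1-NN, 3-NN, nearest centroid), the round-robin fold assignment, the toy data, and all function names are assumptions made for this example.

```python
from collections import Counter
import math

def nearest_centroid(train_X, train_y):
    """Train a predictor that assigns the class of the closest class mean."""
    sums, counts = {}, Counter(train_y)
    for x, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {c: [v / counts[c] for v in acc] for c, acc in sums.items()}
    return lambda x: min(centroids, key=lambda c: math.dist(x, centroids[c]))

def knn(k):
    """Return a trainer that builds a k-nearest-neighbour predictor."""
    def train(train_X, train_y):
        def predict(x):
            order = sorted(range(len(train_X)),
                           key=lambda i: math.dist(x, train_X[i]))
            return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]
        return predict
    return train

def filter_noise(X, y, trainers, n_folds=3, scheme="majority"):
    """Return indices of instances flagged as mislabeled.

    Each instance is classified by every base learner trained on the
    folds that do not contain it. "majority" flags an instance when more
    than half of the learners disagree with its label; "consensus" flags
    it only when all of them disagree.
    """
    n = len(X)
    errors = [0] * n
    for fold in range(n_folds):
        # Round-robin fold assignment (an assumption, chosen for determinism).
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for trainer in trainers:
            predict = trainer(train_X, train_y)
            for i in test_idx:
                if predict(X[i]) != y[i]:
                    errors[i] += 1
    m = len(trainers)
    if scheme == "majority":
        return [i for i in range(n) if errors[i] > m / 2]
    return [i for i in range(n) if errors[i] == m]

# Toy data: two well-separated clusters; the label at index 4 is flipped.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.2, 0.8),
     (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5), (10.2, 10.8)]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1]

trainers = [knn(1), knn(3), nearest_centroid]
flagged_mf = filter_noise(X, y, trainers, scheme="majority")
flagged_cf = filter_noise(X, y, trainers, scheme="consensus")
print(flagged_mf, flagged_cf)
```

On this toy set both filters flag only the flipped instance. In general the consensus filter is the more conservative of the two: every instance it flags is also flagged by the majority filter, so it discards fewer clean instances at the cost of retaining more noisy ones.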