Journal: Decision Support Systems

The data complexity index to construct an efficient cross-validation method


Abstract

Cross-validation is a widely used model evaluation method in data mining applications. However, determining appropriate parameter values, such as the training data size and the number of experiment runs, usually takes considerable effort before a valid evaluation can be carried out. This study develops an efficient cross-validation method for binary classification problems, called Complexity-based Efficient (CBE) cross-validation. CBE cross-validation establishes a complexity index, the CBE index, by exploring the geometric structure and noise of the data. The CBE index is used to calculate the optimal training data size and number of experiment runs, reducing model evaluation time for computationally expensive classification data sets. One simulated and three real data sets are used to validate the performance of the proposed method, which is compared against repeated random sub-sampling validation and K-fold cross-validation. The results show that the three methods have similar validation performance, but CBE cross-validation requires less training time than the other two.
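The abstract does not give the formula for the CBE index, so the sketch below only illustrates the two baseline methods it names: K-fold cross-validation and repeated random sub-sampling validation. It uses nothing beyond the Python standard library. All function names, the toy nearest-centroid classifier, the synthetic data, and the hand-picked values for training size and number of runs are assumptions made for illustration. They stand in for the two parameters that CBE cross-validation is said to calculate, and they are not taken from the paper.

```python
import random
import time

def k_fold_splits(n, k, rng):
    """Yield (train_idx, test_idx) pairs for K-fold cross-validation."""
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all indices
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]

def random_subsampling_splits(n, train_size, runs, rng):
    """Yield (train_idx, test_idx) pairs for repeated random sub-sampling."""
    for _ in range(runs):
        idx = list(range(n))
        rng.shuffle(idx)
        yield idx[:train_size], idx[train_size:]

def fit_centroids(X, y, train):
    """Toy binary classifier (an assumption): one mean vector per class."""
    centroids = {}
    for label in (0, 1):
        rows = [X[i] for i in train if y[i] == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Return the label of the nearest centroid (squared Euclidean distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))

def evaluate(X, y, splits):
    """Return (mean test accuracy, elapsed seconds) over all splits."""
    accuracies = []
    start = time.perf_counter()
    for train, test in splits:
        centroids = fit_centroids(X, y, train)
        correct = sum(predict(centroids, X[i]) == y[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies), time.perf_counter() - start

def make_data(n, rng):
    """Synthetic two-class data: Gaussian blobs centred at (0, 0) and (2, 2)."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 2.0 * label
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y

rng = random.Random(0)
X, y = make_data(200, rng)

# K-fold: 10 runs, each training on 90% of the data.
kfold_acc, kfold_time = evaluate(X, y, k_fold_splits(len(X), 10, rng))

# Sub-sampling: training size (50) and runs (5) are hand-picked placeholders
# for the values a complexity index would supply.
sub_acc, sub_time = evaluate(X, y, random_subsampling_splits(len(X), 50, 5, rng))

print(f"K-fold:       accuracy={kfold_acc:.3f}  time={kfold_time:.4f}s")
print(f"Sub-sampling: accuracy={sub_acc:.3f}  time={sub_time:.4f}s")
```

The two evaluators differ only in how the splits are generated, which is why training data size and number of runs are the natural parameters to tune when evaluation time matters.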
