Journal: International Journal of Machine Learning and Cybernetics

An efficient automatic multiple objectives optimization feature selection strategy for internet text classification


Abstract

Research on feature selection in text classification is usually limited to proposing techniques that select the set of features with the highest scores under different metrics. The selected features are usually determined using a separate validation dataset and a fixed threshold. This may not generalize well to new data, because the best number of selected features varies across datasets. In this paper, we first conduct a detailed analysis and find that simply extracting features according to the score calculated by a metric may not always be the best strategy, as it may reduce many documents to zero length, which makes them unsuitable for training. We then model the feature selection process as a multiple-objective optimization problem to determine the best number of selected features rationally and automatically. In addition, because the optimization process is resource-intensive, we design a parallel algorithm that uses dynamic programming to improve the running time. Extensive experiments on several popular datasets indicate that the proposed approach is effective and feasible.
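The zero-length effect described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: document frequency is used here only as a stand-in scoring metric, and the function names `select_top_k` and `count_empty_docs` and the toy corpus are hypothetical. The sketch keeps the k highest-scoring terms and counts how many documents retain no terms at all.

```python
from collections import Counter


def select_top_k(docs, k):
    """Keep the k terms with the highest document frequency.

    Document frequency is an assumed stand-in for whatever scoring
    metric a real pipeline would use.
    """
    df = Counter(term for doc in docs for term in set(doc))
    # Sort by score (descending), then alphabetically so ties are deterministic.
    ranked = sorted(df, key=lambda t: (-df[t], t))
    return set(ranked[:k])


def count_empty_docs(docs, vocab):
    """Count documents that contain none of the selected terms."""
    return sum(1 for doc in docs if not any(t in vocab for t in doc))


# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["cheap", "pills", "offer"],
    ["cheap", "offer", "today"],
    ["meeting", "agenda"],
    ["football", "score"],
]

for k in (1, 2, 3, 4):
    vocab = select_top_k(docs, k)
    print(f"k={k}: {count_empty_docs(docs, vocab)} empty document(s)")
```

With a small k, the two documents that share no terms with the top-scoring features become empty, which is why a fixed threshold chosen on one dataset may be a poor fit for another.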

Bibliographic information

  • Author affiliations

    South China Normal Univ, Guangdong Engineering Res Ctr Smart Learning, Guangzhou, Guangdong, Peoples R China;

    South China Normal Univ, Guangdong Engineering Res Ctr Smart Learning, Guangzhou, Guangdong, Peoples R China|South China Normal Univ, Sch Comp Sci, Guangzhou, Guangdong, Peoples R China;

    Univ Hong Kong, Sch Comp Sci, Hong Kong, Peoples R China;

    Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China;

    Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong, Peoples R China;

    Sichuan Univ, Dept Comp Sci, Chengdu, Sichuan, Peoples R China;

  • Original format: PDF
  • Language: English
