Source: U.S. National Institutes of Health literature, PLoS Clinical Trials

uEFS: An efficient and comprehensive ensemble-based feature selection methodology to select informative features

Abstract

Feature selection is one of the most critical steps in choosing appropriate features from a larger set of candidates. The task involves two basic steps: ranking and filtering. The former ranks all features, while the latter filters out irrelevant features based on a threshold value. Several feature selection methods with well-documented capabilities and limitations have already been proposed. Similarly, feature ranking is nontrivial, as it requires designating an optimal cutoff value in order to properly select important features from a list of candidates. However, a comprehensive feature ranking and filtering approach that alleviates the existing limitations and provides an efficient mechanism for achieving optimal results is still lacking. With this in mind, we present an efficient and comprehensive univariate ensemble-based feature selection (uEFS) methodology to select informative features from an input dataset. For the uEFS methodology, we first propose a unified features scoring (UFS) algorithm that generates a final ranked list of features following a comprehensive evaluation of a feature set. To define cutoff points for removing irrelevant features, we then present a threshold value selection (TVS) algorithm that selects a subset of features deemed important for classifier construction. The uEFS methodology is evaluated using standard benchmark datasets. The extensive experimental results show that the proposed uEFS methodology provides competitive accuracy and achieves, on average, (1) around a 7% increase in f-measure and (2) around a 5% increase in predictive accuracy compared with state-of-the-art methods.
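
The rank-then-filter idea in the abstract can be pictured with a minimal Python sketch. This is not the authors' UFS or TVS algorithm, which the abstract does not spell out. The two univariate measures, the min-max averaging, the fixed threshold, the toy data, and every function name below are assumptions made purely for illustration.

```python
from statistics import mean, pstdev

def abs_correlation(column, labels):
    """Assumed measure 1: absolute Pearson correlation with binary labels."""
    sx, sy = pstdev(column), pstdev(labels)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(column), mean(labels)
    cov = mean([(x - mx) * (y - my) for x, y in zip(column, labels)])
    return abs(cov / (sx * sy))

def rank_separation(column, labels):
    """Assumed measure 2: |2*AUC - 1|, how well the feature orders the classes."""
    pos = [x for x, y in zip(column, labels) if y == 1]
    neg = [x for x, y in zip(column, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return abs(2 * wins / (len(pos) * len(neg)) - 1)

def min_max(scores):
    """Rescale one measure's scores to [0, 1] so measures are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def unified_scores(columns, labels, measures):
    """Ranking step: average the normalized scores of all measures per feature."""
    per_measure = [min_max([m(col, labels) for col in columns]) for m in measures]
    return [mean(vals) for vals in zip(*per_measure)]

def select_features(names, scores, threshold):
    """Filtering step: keep features whose unified score reaches the threshold."""
    ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return [name for name, score in ranked if score >= threshold]

# Toy data (hypothetical): six samples, three features, binary labels.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "signal": [0.1, 0.2, 0.15, 0.9, 1.0, 0.95],
    "weak":   [0.4, 0.6, 0.5, 0.5, 0.7, 0.6],
    "noise":  [0.5, 0.1, 0.9, 0.5, 0.1, 0.9],
}
names = list(features)
scores = unified_scores(list(features.values()), labels,
                        [abs_correlation, rank_separation])
selected = select_features(names, scores, threshold=0.6)
print(dict(zip(names, scores)), selected)
```

In this sketch the threshold is a fixed constant supplied by the caller. The abstract instead describes a dedicated TVS algorithm for choosing the cutoff, so the hard-coded value here only marks where such a procedure would plug in.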
