Feature subset selection Filter-Wrapper based on low quality data

Jose M. Cadenas; M. Carmen Garrido; Raquel Martinez

Abstract

Feature selection is an active research area in machine learning. Its main idea is to choose a subset of the available features by eliminating features with little or no predictive information, as well as redundant features that are strongly correlated. Many approaches to feature selection exist, but most of them work only with crisp data, and few can work directly with both crisp and low quality (imprecise and uncertain) data. We therefore propose a new feature selection method that can handle both crisp and low quality data. The proposed approach is based on a Fuzzy Random Forest and integrates filter and wrapper methods into a sequential search procedure that improves the classification accuracy obtained with the selected features. The approach consists of the following main steps: (1) a scaling and discretization process of the feature set, with feature pre-selection using the discretization process (filter); (2) a ranking process of the pre-selected features using the Fuzzy Decision Trees of a Fuzzy Random Forest ensemble; and (3) wrapper feature selection using a Fuzzy Random Forest ensemble based on cross-validation. The efficiency and effectiveness of this approach are demonstrated through several experiments using both high dimensional and low quality datasets. The approach performs well, in terms of both classification accuracy and the number of features selected, on high dimensional datasets (microarray datasets) as well as on low quality datasets.
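The three steps listed in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract gives no implementation details, so every function name and threshold here is hypothetical, and the Fuzzy Random Forest is swapped for crisp stand-ins (equal-width binning with information gain for the filter and the ranking, and a nearest-centroid classifier for the cross-validated wrapper). The sketch handles crisp values only, not low quality data.

```python
import math
import random
from collections import Counter

def min_max_scale(X):
    """Step 1a: scale every feature to [0, 1]; constant columns map to 0."""
    cols = list(zip(*X))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in X]

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(column, labels, bins=3):
    """Step 1b: equal-width discretization of a scaled column, scored by
    information gain (a crisp stand-in for the paper's discretization)."""
    groups = {}
    for v, y in zip(column, labels):
        groups.setdefault(min(int(v * bins), bins - 1), []).append(y)
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())

def centroid_predict(train_X, train_y, x, feats):
    """Nearest-centroid classifier on the given features
    (a crisp stand-in for the Fuzzy Random Forest ensemble)."""
    sums, counts = {}, Counter(train_y)
    for row, y in zip(train_X, train_y):
        acc = sums.setdefault(y, [0.0] * len(feats))
        for k, f in enumerate(feats):
            acc[k] += row[f]
    def dist(y):
        return sum((sums[y][k] / counts[y] - x[f]) ** 2 for k, f in enumerate(feats))
    return min(sorted(sums), key=dist)

def cv_accuracy(X, y, feats, folds=5):
    """Cross-validated accuracy using interleaved folds."""
    n, correct = len(X), 0
    for fold in range(folds):
        test_idx = set(range(fold, n, folds))
        tr_X = [X[i] for i in range(n) if i not in test_idx]
        tr_y = [y[i] for i in range(n) if i not in test_idx]
        for i in test_idx:
            correct += centroid_predict(tr_X, tr_y, X[i], feats) == y[i]
    return correct / n

def select_features(X, y, min_gain=0.05, folds=5):
    X = min_max_scale(X)
    # Step 1 (filter): pre-select features whose discretized form is informative.
    gains = {f: info_gain([row[f] for row in X], y) for f in range(len(X[0]))}
    preselected = [f for f in gains if gains[f] >= min_gain]
    # Step 2 (ranking): order the pre-selected features, here by the same gain
    # (the paper ranks with the Fuzzy Decision Trees of the ensemble instead).
    ranked = sorted(preselected, key=lambda f: -gains[f])
    # Step 3 (wrapper): sequential search, keeping a feature only if it
    # improves the cross-validated accuracy.
    chosen, best = [], 0.0
    for f in ranked:
        acc = cv_accuracy(X, y, chosen + [f], folds)
        if acc > best:
            chosen, best = chosen + [f], acc
    return chosen, best

# Synthetic demo data (hypothetical): feature 0 separates the two classes,
# feature 1 is pure noise, feature 2 is constant.
rng = random.Random(0)
y = [i % 2 for i in range(80)]
X = [[label + rng.uniform(-0.2, 0.2), rng.random(), 5.0] for label in y]
chosen, accuracy = select_features(X, y)
print(chosen, accuracy)
```

On this toy data the constant feature is dropped by the filter, and the wrapper keeps only the separating feature, because adding the noise feature cannot raise the cross-validated accuracy any further. A closer reproduction would replace the stand-ins with a Fuzzy Random Forest and a representation for imprecise values, neither of which is specified in the abstract.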


Bibliographic details

Source
Expert Systems with Applications, 2013, issue 16, pp. 6241-6252 (12 pages)
Authors
Jose M. Cadenas; M. Carmen Garrido; Raquel Martinez
Author affiliations

Department of Information Engineering and Communication, Campus of Espinardo, University of Murcia, 30100 Murcia, Spain (all three authors)

Original format: PDF
Language: English
Keywords
Feature selection; Low quality data; Fuzzy Random Forest; Fuzzy Decision Tree

Similar literature

1. A GRASP algorithm for fast hybrid (filter-wrapper) feature subset selection in high-dimensional datasets [J]. Pablo Bermejo, Jose A. Gamez, Jose M. Puerta. Pattern Recognition Letters. 2011, issue 5
2. Efficient Genetic-Wrapper Algorithm Based Data Mining for Feature Subset Selection in a Power Quality Pattern Recognition Application [J]. Brahmadesam Krishna, Baskaran Kaliaperumal. The International Arab Journal of Information Technology. 2011, issue 4
3. Filter-Wrapper Combination and Embedded Feature Selection for Gene Expression Data [J]. Shilan S. Hameed, Olutomilayo Olayemi Petinrin, Abdirahman Osman Hashi. International Journal of Advances in Soft Computing and Its Applications. 2018, issue 1
4. A Filter-Wrapper based Feature Selection for Optimized Website Quality Prediction [C]. Akshi Kumar, Anshika Arora. Amity International Conference on Artificial Intelligence. 2019
5. Stability analysis of feature selection approaches with low quality data [D]. Altidor, Wilker. 2011
6. An optimal set of features for predicting type IV secretion system effector proteins for a subset of species based on a multi-level feature selection approach [O]. Zhila Esna Ashari, Nairanjana Dasgupta, Kelly A. Brayton. 2012
7. IFSS - An Improved Filter-Wrapper Algorithm for Feature Subset Selection [O]. Saurabh Soni, M. E. Scholar, Pratik Patel. 2015
