A New Approach of Feature Selection for Text Categorization


Abstract

This paper proposes a new feature selection approach for text categorization (TC) based on a measure of independence between features. It first discusses a fundamental hypothesis widely used in probabilistic models for TC: that the occurrences of terms in documents are independent of one another. This hypothesis, however, is incomplete with respect to the independence of the feature set. From the viewpoint of feature selection, a new measure of independence between features is designed, and a feature selection algorithm built on it is given to obtain a feature subset. The selected subset is highly relevant to the category and strongly independent across features, so it satisfies the basic hypothesis to the greatest possible degree. Compared with traditional feature selection methods in TC, which take only relevance into account, the feature subset selected by this method performs better in experiments on the 20 Newsgroups benchmark dataset.
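
The abstract does not give the independence measure or the selection algorithm itself, so the following is only a minimal sketch of the general idea (prefer terms that are relevant to the category and weakly dependent on terms already chosen), not the authors' method. It assumes binary term-presence features and a binary category label, and it uses the 2x2 chi-square statistic as a stand-in for both the relevance score and the pairwise dependence score. The names `chi_square_2x2`, `select_features`, and `redundancy_weight` are hypothetical.

```python
from typing import List, Sequence


def chi_square_2x2(x: Sequence[int], y: Sequence[int]) -> float:
    """Chi-square statistic of the 2x2 contingency table of two binary vectors."""
    n = len(x)
    a = sum(1 for xi, yi in zip(x, y) if xi and yi)          # x=1, y=1
    b = sum(1 for xi, yi in zip(x, y) if xi and not yi)      # x=1, y=0
    c = sum(1 for xi, yi in zip(x, y) if not xi and yi)      # x=0, y=1
    d = n - a - b - c                                        # x=0, y=0
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A constant vector carries no information; treat it as independent.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_features(
    docs: Sequence[Sequence[int]],
    labels: Sequence[int],
    k: int,
    redundancy_weight: float = 1.0,
) -> List[int]:
    """Greedily pick up to k term indices.

    Assumed scoring rule (illustrative only): relevance to the label minus
    redundancy_weight times the strongest dependence on any selected term.
    """
    n_terms = len(docs[0])
    columns = [[row[j] for row in docs] for j in range(n_terms)]
    relevance = [chi_square_2x2(col, labels) for col in columns]

    selected: List[int] = []
    remaining = set(range(n_terms))
    while remaining and len(selected) < k:

        def score(j: int) -> float:
            if not selected:
                return relevance[j]
            dependence = max(
                chi_square_2x2(columns[j], columns[s]) for s in selected
            )
            return relevance[j] - redundancy_weight * dependence

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Setting `redundancy_weight=0.0` reduces the sketch to a relevance-only ranking, the kind of baseline the abstract contrasts against. With a positive weight, a term that duplicates an already selected term is penalized even if its relevance to the category is high.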


Bibliographic details

Source
Conference on Web Information System and Applications (WISA 2006); October 19-22, 2006; Nanjing (CN) | 2006 | pp. 1335-1339 | 5 pages
Conference location: Nanjing (CN)
Authors
CUI Zifeng; XU Baowen; ZHANG Weifeng; XU Junling

Author affiliations

School of Computer Science and Engineering, Southeast University, Nanjing 210096, Jiangsu, China

Department of Computer Science and Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, Jiangsu, China

Original format: PDF
Language: English
Chinese Library Classification: Computer networks
Keywords
feature selection; independency; CHI-square test; text categorization

