An Improved KNN Text Classification Algorithm Based on Clustering

Zhou Yong; Li Youwen; Xia Shixiong


Abstract

The traditional KNN text classification algorithm uses all training samples for classification, so it requires a large number of training samples and has high computational complexity; it also fails to reflect the differing importance of individual samples. To address these problems, this paper proposes an improved KNN text classification algorithm based on cluster centers. First, the given training sets are compressed and the samples near the class borders are deleted, which eliminates the multi-peak effect of the training sample sets. Second, the training samples of each category are clustered with the k-means algorithm, and all cluster centers are taken as the new training samples. Third, a weight is introduced for each new training sample (a cluster center), indicating its importance according to the number of samples in the cluster it represents. Finally, the modified samples are used to perform KNN text classification. Simulation results show that the proposed algorithm not only effectively reduces the actual number of training samples and lowers the computational complexity, but also improves the accuracy of KNN text classification.
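
The clustering, weighting, and classification steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the border-deletion criterion, so that step is omitted; the function names (`kmeans`, `build_weighted_centers`, `weighted_knn_predict`), the use of Euclidean distance, and the rule that each neighbor's vote equals its cluster size are all hypothetical choices made for the example.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means. Returns (centers, cluster_sizes)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign every sample to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    sizes = np.bincount(labels, minlength=k)
    return centers, sizes


def build_weighted_centers(X, y, k_per_class=3):
    """Cluster each class separately; keep the centers as the new training
    samples, each weighted by the number of samples in its cluster."""
    centers, labels, weights = [], [], []
    for c in np.unique(y):
        Xc = X[y == c]
        cen, sizes = kmeans(Xc, min(k_per_class, len(Xc)))
        keep = sizes > 0  # drop clusters that ended up empty
        centers.append(cen[keep])
        labels.extend([c] * int(keep.sum()))
        weights.append(sizes[keep])
    return np.vstack(centers), np.array(labels), np.concatenate(weights).astype(float)


def weighted_knn_predict(x, centers, labels, weights, k=3):
    """Assumed voting rule: each of the k nearest centers votes for its
    class with a strength equal to its cluster-size weight."""
    dists = np.linalg.norm(centers - x, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = {}
    for i in nearest:
        votes[labels[i]] = votes.get(labels[i], 0.0) + weights[i]
    return max(votes, key=votes.get)


# Toy usage: two well-separated 2-D classes stand in for document vectors.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.5, (40, 2)), rng.normal(5.0, 0.5, (40, 2))])
y = np.array([0] * 40 + [1] * 40)
centers, labels, weights = build_weighted_centers(X, y, k_per_class=3)
print(len(X), "training samples reduced to", len(centers), "weighted centers")
print(weighted_knn_predict(np.array([0.2, -0.1]), centers, labels, weights))
```

In this sketch, classification compares a query against a handful of centers instead of every training sample, which is where the reduction in computation comes from; the weights keep a center that summarizes many samples from being outvoted by one that summarizes few.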


Bibliographic details

Source: Journal of Computers, 2009, Issue 3, 8 pages
Authors: Zhou Yong; Li Youwen; Xia Shixiong
Original format: PDF

Similar literature

1. Supervised and semi-supervised learning in text classification using enhanced KNN algorithm: a comparative study of supervised and semi-supervised classification in text categorisation [J]. M. A. Wajeed, T. Adilakshmi. International Journal of Intelligent Systems Technologies and Applications, 2012, Issue 3a4.
2. Application Research of KNN Algorithm Based on Clustering in Big Data Talent Demand Information Classification [J]. Xiao Qingtao, Zhong Xin, Zhong Chenghua. International Journal of Pattern Recognition and Artificial Intelligence, 2020, Issue 6.
3. Automatic fast double KNN classification algorithm based on ACC and hierarchical clustering for big data [J]. Li Haiyun, Li Haifeng, Wei Kaibin. International Journal of Communication Systems, 2018, Issue 16.
4. A clustering-Based KNN improved algorithm CLKNN for text classification [C]. Lijuan Zhou, Linshuang Wang, Xuebin Ge. 2010 2nd International Asia Conference on Informatics in Control, Automation and Robotics (CAR 2010), 2010.
5. Information retrieval: A framework for recommending text-based classification algorithms [D]. Saleeb, Hany. 2002.
6. Large scale biomedical texts classification: a kNN and an ESA-based approaches [O]. Khadim Dramé, Fleur Mougin, Gayo Diallo. 2016.
7. An Improved KNN Text Classification Algorithm Based on Clustering [O]. Shixiong Xia, Youwen Li, Yong Zhou. 2009.
