A survey on text document categorization using enhanced sentence vector space model and bi-gram text representation model based on novel fusion techniques

Abstract

With today's technology, large numbers of digital documents are generated and made available each day. However, classifying them by hand into sensible categories, such as important and unimportant or spam and no-spam, would cost a vast amount of time and human effort. Text document classification falls under the automatic classification (also known as pattern recognition) problem in machine learning and text mining. Large text documents must be classified into specific classes to keep them organized and easy to search. Classified data are easy for users to browse. The core of text document categorization is assigning a previously unseen text to one of a set of predefined categories. Multiple classifiers are fused to increase the classification accuracy for a single text document. The paper explains text document classification approaches that depend on different representation models and fusion-based classifiers. To examine the different techniques, the Enhanced Sentence Vector Space Model (ES-VSM) and a bi-gram model are used to represent the document to be classified. The survey concludes by comparing the accuracy of different current classifiers. This should help and encourage new researchers to take on challenging problems in this area.
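
As a rough illustration of two ideas named in the abstract, a bi-gram text representation and the fusion of several classifiers, the sketch below counts word bi-grams and combines three toy classifiers by majority vote. The abstract defines neither ES-VSM nor the fusion rule, so the function names `bigrams` and `majority_vote`, the rule-based classifiers, and the choice of majority voting are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def bigrams(text):
    """Return the word bi-grams of a text as a Counter (a sparse count vector)."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def majority_vote(labels):
    """Fuse classifier outputs by picking the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical toy classifiers: each is a single rule over the bi-gram features.
def clf_free_prize(features):
    return "spam" if ("free", "prize") in features else "no-spam"


def clf_win_a(features):
    return "spam" if ("win", "a") in features else "no-spam"


def clf_click_here(features):
    return "spam" if ("click", "here") in features else "no-spam"


doc = "win a free prize now"
features = bigrams(doc)

# Each classifier votes independently; the fused label is the majority label.
votes = [clf(features) for clf in (clf_free_prize, clf_win_a, clf_click_here)]
fused = majority_vote(votes)
print(votes, fused)
```

In practice the toy rules would be replaced by trained classifiers, and majority voting is only one of several possible fusion rules.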

Bibliographic record

Source
2018 2nd International Conference on Inventive Systems and Control | 2018 | pp. 218-225 | 8 pages
Conference location: Coimbatore (IN)
Authors
Abdisa Demissie Amensisa; Seema Patil; Poorva Agrawal;
Author affiliations

Symbiosis International University, Symbiosis Institute of Technology, Pune, India;

Symbiosis International University, Symbiosis Institute of Technology, Pune, India;

Symbiosis International University, Symbiosis Institute of Technology, Pune, India;

Conference organizer
Original format: PDF
Text language: English
Chinese Library Classification
Keywords
Classification algorithms; Text categorization; Hidden Markov models; Data models; Training; Conferences; Control systems;

Similar documents

1. Text Document Categorization using Enhanced Sentence Vector Space Model and Bi-Gram Text Representation Model Based on Novel Fusion Techniques [J]. Abdisa Demissie Amensisa. New Media and Mass Communication. 2020, No. 4
2. Text Document Pre-Processing Using the Bayes Formula for Classification Based on the Vector Space Model [J]. Computer and Information Science. 2009, No. 4
3. The phrase-based vector space model for automatic retrieval of free-text medical documents [J]. Wenlei Mao, Wesley W. Chu. Data & Knowledge Engineering. 2007, No. 1
4. A survey on text document categorization using enhanced sentence vector space model and bi-gram text representation model based on novel fusion techniques [C]. Abdisa Demissie Amensisa, Seema Patil, Poorva Agrawal. International Conference on Inventive Systems and Control. 2018
5. Text association mining with cross-sentence inference, structure-based document model and multi-relational text mining. [D]. Thaicharoen, Supphachai. 2009
6. Free-text medical document retrieval via phrase-based vector space model. [O]. Wenlei Mao, Wesley W. Chu. 2002
7. Text Document Pre-Processing Using the Bayes Formula for Classification Based on the Vector Space Model [O]. R. Rajkumar, V. P. Kallimani, Lee Lam Hong. 2009
