Conference: 2018 2nd International Conference on Inventive Systems and Control

A survey on text document categorization using enhanced sentence vector space model and bi-gram text representation model based on novel fusion techniques



Abstract

With today's technology, a large number of digital documents are generated and made available each day. Classifying them manually into meaningful categories, such as important versus unimportant or spam versus non-spam, would cost a vast amount of time and human effort. Text document classification falls under the automatic classification (also known as pattern recognition) problem in machine learning and text mining. Large text collections need to be classified into specific classes so that they are clearly organized and simple to search; classified data are easy for users to browse. The core of text document categorization is assigning an unseen text to one of a set of predefined categories. Multiple classifiers are fused to increase the classification accuracy for a single text document. This paper surveys text document classification approaches that depend on different representation models and on fusion-based classifiers. To examine the different techniques, the Enhanced Sentence Vector Space Model (ES-VSM) and a bigram model are used to represent the document to be classified. The survey concludes by assessing existing classifiers on the basis of their accuracy. This should help orient new researchers and encourage them to take on the open challenges in this area.
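The abstract does not say how the bigram representation or the classifier fusion is implemented. As a rough illustration only, the sketch below assumes word-level bigrams with raw counts and a simple majority vote as the fusion rule. The function names are hypothetical and not taken from the paper, and ES-VSM itself is not shown.

```python
from collections import Counter

def word_bigrams(text):
    """Return adjacent word pairs from a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))

def bigram_vector(text):
    """Bag-of-bigrams representation: maps each bigram to its count.
    Assumption: raw counts; a real system might use TF-IDF weights instead."""
    return Counter(word_bigrams(text))

def majority_vote(labels):
    """Fuse the labels predicted by several classifiers.
    Assumption: plain majority vote; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]

# Represent one document, then fuse three hypothetical classifier outputs.
vec = bigram_vector("free money now free money")
fused = majority_vote(["spam", "not-spam", "spam"])
```

Majority voting is only one possible fusion rule; weighted voting or stacking would slot into the same place, taking the individual classifier outputs and returning a single label.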
