Using Statistical Properties to Enhance Text Categorization


Abstract

Statistical properties extracted from text are useful in many areas, for example in identifying the author of a text or its category. In this paper, language-independent properties of text are studied using two categorized corpora of news articles. The properties are observed not to depend on the corpus or on its size. Several properties are identified that make it possible to minimize the training set for an intelligent categorization system. Such statistics could also be used to compare the information content and the rate of new information between different corpora, and to enhance text categorization.
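The abstract does not say which statistical properties the authors measured. Purely as an illustration of what language-independent corpus statistics can look like, the sketch below computes token count, vocabulary size, type-token ratio, and a vocabulary growth curve (a rough proxy for the "rate of new information"). The function names `tokenize` and `corpus_statistics`, the choice of statistics, and the regex tokenizer are all assumptions made for this example, not the paper's method.

```python
import re


def tokenize(text):
    # Hypothetical tokenizer: lowercases and splits on non-word characters.
    # It uses no dictionary or stemmer, but it does assume a script in which
    # words are separated by spaces or punctuation.
    return re.findall(r"\w+", text.lower())


def corpus_statistics(documents):
    """Return simple, dictionary-free statistics for a list of documents.

    These statistics are illustrative assumptions, not the properties
    reported in the paper.
    """
    tokens_seen = 0
    vocabulary = set()
    growth = []  # (tokens seen so far, distinct words seen so far) per document

    for doc in documents:
        tokens = tokenize(doc)
        tokens_seen += len(tokens)
        vocabulary.update(tokens)
        growth.append((tokens_seen, len(vocabulary)))

    ratio = len(vocabulary) / tokens_seen if tokens_seen else 0.0
    return {
        "tokens": tokens_seen,
        "types": len(vocabulary),
        "type_token_ratio": ratio,
        "growth": growth,
    }


if __name__ == "__main__":
    sample = ["the cat sat", "the dog sat down"]
    print(corpus_statistics(sample))
```

Comparing the `growth` curves of two corpora, or of one corpus at different sizes, is one simple way to check whether such a property depends on the corpus or its size.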


Bibliographic details

Source: World multi-conference on systemics, cybernetics and informatics, 2015, 5 pages
Authors: Ziad OSMAN; Rached ZANTOUT
Original format: PDF
Chinese Library Classification: Computing technology, computer technology
Keywords: Statistical Properties; text categorization; text mining; data mining


Similar literature

1. Using Statistical Properties to Enhance Text Categorization [J]. Rached Zantout, Ziad Osman. Journal of Systemics, Cybernetics and Informatics. 2015, No. 3
2. Text Document Categorization using Enhanced Sentence Vector Space Model and Bi-Gram Text Representation Model Based on Novel Fusion Techniques [J]. Abdisa Demissie Amensisa. New Media and Mass Communication. 2020, No. 4
3. An enhanced text categorization method based on improved text frequency approach and mutual information algorithm [J]. Pei Zhili, Shi Xiaohu, Maurizio Marchese. 自然科学进展：英文版 (Progress in Natural Science, English edition). 2007, No. 012
4. Using Statistical Properties to Enhance Text Categorization [C]. Ziad OSMAN, Rached ZANTOUT. World multi-conference on systemics, cybernetics and informatics. 2015
5. Robust statistical techniques for the categorization of images using associated text [D]. Sable, Carl Lewis. 2003
6. Enhancing Text Categorization with Semantic-enriched Representation and Training Data Augmentation [O]. Xinghua Lu, Bin Zheng, Atulya Velivelli. 2006
7. Arabic text correction using dynamic categorized dictionaries: a statistical approach [O]. Yahya Adnan, Salhi Ali. 2013
