Natural Language Text Classification and Filtering with Trigrams and Evolutionary Nearest Neighbour Classifiers. Software Engineering (SEN).

Abstract

N-grams offer fast, language-independent, multi-class text categorization. Text is reduced in a single pass to n-gram vectors. These are assigned to one of several classes by (1) a nearest neighbour (KNN) classifier and (2) a genetic algorithm operating on the weights in a nearest neighbour classifier. An accuracy of 91% is found for binary classification of short, multi-author technical English documents. Accuracy falls as more categories are used, but 69% is still obtained with 8 classes. Zipf's law is found not to apply to trigrams.
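
The single-pass reduction to trigram vectors and the nearest neighbour assignment can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses character trigrams, cosine similarity, and a single nearest neighbour (k = 1), none of which the abstract specifies. The function names are hypothetical, and the genetic-algorithm weighting is not shown.

```python
from collections import Counter
import math


def trigram_vector(text):
    """Reduce text to a sparse vector of character-trigram counts in one pass.

    Assumption: trigrams are taken over lowercased characters; the report
    may define them differently.
    """
    text = text.lower()
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbour(text, training):
    """Return the label of the training vector most similar to the text.

    `training` is a list of (label, trigram_vector) pairs.
    """
    query = trigram_vector(text)
    label, _ = max(training, key=lambda item: cosine_similarity(query, item[1]))
    return label


# Toy training set, invented purely for illustration.
training = [
    ("genetics", trigram_vector("genetic algorithm evolves weights by mutation and crossover")),
    ("parsing", trigram_vector("the parser builds a syntax tree from tokens")),
]

predicted = nearest_neighbour("mutation and crossover in a genetic algorithm", training)
```

Because the vectors are built from raw character sequences, nothing in this sketch depends on a dictionary or tokenizer, which is the sense in which the approach is language independent.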