
DOCUMENT MULTI-CLASSIFICATION DEVICE AND DOCUMENT MULTI-CLASSIFICATION METHOD FOR CLASSIFYING ONE DOCUMENT INTO PLURALITY OF CATEGORIES BY USING LEXICO-SEMANTIC PATTERN OBTAINED BY RECONFIGURING SEMANTIC CATEGORY OF WORDS CONSTITUTING SENTENCE


Abstract

The present invention relates to a document multi-classification device and method for classifying one document into a plurality of categories by using a lexico-semantic pattern (LSP) obtained by reconfiguring the semantic categories of the words constituting a sentence. The present invention comprises: a pre-processing unit that defines LSPs, each of which includes a morpheme, a syllable, and a word phrase, and stores them in a database, and that defines concepts, each of which is a group of a plurality of hierarchically structured LSPs, and stores them in the database; an analysis unit that performs morpheme analysis on a sentence included in a document to be analyzed and matches the analyzed sentence against the LSPs to calculate a syntax analysis result; and a classification unit that matches the syntax analysis result against a document classification rule to extract at least one document classification for the document to be analyzed.
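The three units described above can be illustrated with a minimal sketch. It is not the patented implementation, and every name in it (`LSP`, `LSPS`, `CONCEPTS`, `RULES`, `analyze`, `classify`) is hypothetical. It makes several simplifying assumptions: in-memory tables stand in for the database, regular expressions over lowercased word tokens stand in for LSPs, simple tokenization stands in for morpheme analysis, concepts are flat groups rather than hierarchies, and a classification rule is a set of concepts that must all be present.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class LSP:
    """A toy lexico-semantic pattern: a name plus a regex (an assumption)."""
    name: str
    pattern: str


# Pre-processing unit (sketch): LSP definitions, held in memory
# instead of a database.
LSPS = [
    LSP("refund_request", r"\b(refund|reimbursement)\b"),
    LSP("delivery_delay",
        r"\b(late|delayed)\b.*\b(delivery|shipment)\b"
        r"|\b(delivery|shipment)\b.*\b(late|delayed)\b"),
    LSP("product_defect", r"\b(broken|defective|damaged)\b"),
]

# Concepts group LSPs. The abstract describes hierarchical groups;
# this sketch flattens them to one level.
CONCEPTS = {
    "billing": {"refund_request"},
    "logistics": {"delivery_delay"},
    "quality": {"product_defect"},
}

# Hypothetical classification rules: category -> concepts that must all match.
RULES = {
    "Billing complaint": {"billing"},
    "Shipping complaint": {"logistics"},
    "Quality complaint": {"quality"},
    "Damaged-in-transit": {"logistics", "quality"},
}


def analyze(sentence: str) -> set[str]:
    """Analysis unit (sketch): tokenize, then return names of matching LSPs."""
    # Word tokenization is a stand-in for real morpheme analysis.
    tokens = re.findall(r"\w+", sentence.lower())
    text = " ".join(tokens)
    return {lsp.name for lsp in LSPS if re.search(lsp.pattern, text)}


def classify(document: str) -> list[str]:
    """Classification unit (sketch): return every category whose rule is met."""
    matched: set[str] = set()
    for sentence in re.split(r"[.!?]+", document):
        matched |= analyze(sentence)
    concepts = {name for name, members in CONCEPTS.items() if members & matched}
    return sorted(cat for cat, required in RULES.items() if required <= concepts)


if __name__ == "__main__":
    doc = "The delivery was delayed. The item arrived broken and I want a refund."
    print(classify(doc))
```

Because each rule is evaluated independently, a single document can satisfy several rules at once, which is what yields multiple categories for one document in this sketch.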
