Device for generating aligned corpus based on unsupervised-learning alignment, method thereof, device for analyzing destructive expression morpheme using aligned corpus, and method for analyzing morpheme thereof


Abstract

Disclosed are a device and a method for generating an aligned corpus based on unsupervised-learning alignment, and a device and a method for analyzing morphemes of destructive expressions using the aligned corpus. The morpheme analyzing device includes a knowledge database and an analyzer. The knowledge database includes an aligned corpus that stores a plurality of knowledge information sets used for per-language morpheme analysis, and a morpheme dictionary that stores morpheme information corresponding to normal expressions and normal-expression information corresponding to destructive expressions (here, a destructive expression is an expression that is orthographically erroneous or is not normalized and standardized). The analyzer performs a morpheme analysis on an input separate word by use of the knowledge database and outputs the analysis result. When no morpheme for the input separate word is found in the morpheme dictionary, the analyzer uses the aligned corpus to find the normal expression corresponding to the destructive expression included in the input separate word, and then performs the morpheme analysis on that normal expression.
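The lookup-then-fallback flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `MORPHEME_DICT`, `ALIGNED_PAIRS`, `normalize`, and `analyze`, the toy English entries, and the longest-match substring replacement are all assumptions made for illustration. The abstract does not specify how the aligned corpus is structured or queried, and the unsupervised alignment step that would produce the pairs is not shown.

```python
from typing import Dict, List, Optional, Tuple

# A morpheme analysis is modeled as a list of (morpheme, tag) pairs.
Morphemes = List[Tuple[str, str]]

# Hypothetical morpheme dictionary: normal expression -> morpheme information.
MORPHEME_DICT: Dict[str, Morphemes] = {
    "going": [("go", "VERB"), ("ing", "SUFFIX")],
    "tonight": [("tonight", "NOUN")],
}

# Hypothetical aligned corpus, reduced to pairs:
# destructive (non-standard) expression -> normal expression.
ALIGNED_PAIRS: Dict[str, str] = {
    "goin": "going",
    "2nite": "tonight",
}


def normalize(word: str) -> Optional[str]:
    """Replace a destructive expression contained in `word` with its
    aligned normal expression. Longer expressions are tried first
    (an assumed tie-breaking rule). Returns None if nothing matches."""
    for destructive in sorted(ALIGNED_PAIRS, key=len, reverse=True):
        if destructive in word:
            return word.replace(destructive, ALIGNED_PAIRS[destructive])
    return None


def analyze(word: str) -> Optional[Morphemes]:
    """Analyze one input word: dictionary lookup first, then fall back
    to the aligned corpus when the dictionary has no entry."""
    # Step 1: direct lookup in the morpheme dictionary.
    if word in MORPHEME_DICT:
        return MORPHEME_DICT[word]
    # Step 2: find the normal expression via the aligned pairs.
    normal = normalize(word)
    # Step 3: analyze the normal expression, if the dictionary knows it.
    if normal is not None:
        return MORPHEME_DICT.get(normal)
    return None


if __name__ == "__main__":
    for token in ["going", "goin", "2nite", "xyz"]:
        print(token, "->", analyze(token))
```

In this sketch an unknown word with no aligned counterpart simply yields `None`; a real analyzer would presumably apply further strategies at that point, which the abstract does not describe.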

Bibliographic details

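  • Publication number: US10282413B2

    Patent type:

  • Publication date: 2019-05-07

    Original format: PDF

  • Applicant/assignee: SYSTRAN INTERNATIONAL CO. LTD.

    Application number: US201415026275

  • Inventor: CHANG JIN JI

    Filing date: 2014-08-27

  • Classification: G06F17/27; G06F17/30; G06N99

  • Country: US

  • Database entry time: 2022-08-21 12:11:55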
