Language model creation method, language model creation device, and language model creation program

Abstract

PROBLEM TO BE SOLVED: To create a language model and segment text into words without using teacher (supervised training) data.

SOLUTION: A language model creation device randomly selects a plurality of sentences stored in character string data 131 and, using the language model 132, creates a group of character string splitting patterns, each indicating candidate word boundaries in a selected sentence. The probability of the sentence under each splitting pattern in the group is recorded in storage, and one splitting pattern is selected from the group based on these probabilities. The language model 132 is then updated using the selected splitting pattern. This process is executed for all the sentences stored in the character string data 131, thereby optimizing the language model 132. The language model 132 optimized in this manner is then used to perform the maximum-likelihood word segmentation of a sentence.

COPYRIGHT: (C)2010, JPO&INPIT
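The loop described in the abstract (pick sentences at random, enumerate candidate splits, record each split's probability, sample one split by probability, update the model, then decode with the optimized model) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not say what form language model 132 takes, so a smoothed unigram word model stands in for it, and candidate splits are enumerated exhaustively, which is practical only for short strings. All names (`UnigramModel`, `candidate_splits`, `split_prob`, `train`, `best_split`) and constants (`MAX_WORD_LEN`, `ALPHA`, `N_CHARS`) are hypothetical.

```python
import random
from collections import Counter

MAX_WORD_LEN = 4   # assumed cap on candidate word length
ALPHA = 1.0        # assumed smoothing strength
N_CHARS = 50       # assumed upper bound on alphabet size


class UnigramModel:
    """Toy stand-in for language model 132: smoothed unigram word counts."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def prob(self, word):
        # Base probability for unseen words: uniform characters, geometric length.
        base = (0.5 / N_CHARS) ** len(word)
        return (self.counts[word] + ALPHA * base) / (self.total + ALPHA)

    def update(self, words, delta):
        # delta=+1 adds a split's words to the model, delta=-1 withdraws them.
        for w in words:
            self.counts[w] += delta
            self.total += delta


def candidate_splits(sentence):
    """Enumerate every split of `sentence` into words of at most MAX_WORD_LEN chars."""
    if not sentence:
        return [[]]
    splits = []
    for k in range(1, min(MAX_WORD_LEN, len(sentence)) + 1):
        for rest in candidate_splits(sentence[k:]):
            splits.append([sentence[:k]] + rest)
    return splits


def split_prob(model, words):
    """Probability of a sentence under one splitting pattern."""
    p = 1.0
    for w in words:
        p *= model.prob(w)
    return p


def train(sentences, n_iters=20, seed=0):
    rng = random.Random(seed)
    model = UnigramModel()
    current = {}  # sentence index -> split currently counted in the model
    for _ in range(n_iters):
        order = list(range(len(sentences)))
        rng.shuffle(order)  # visit the sentences in random order
        for i in order:
            if i in current:
                model.update(current[i], -1)  # withdraw the old split first
            patterns = candidate_splits(sentences[i])
            probs = [split_prob(model, p) for p in patterns]  # recorded probabilities
            current[i] = rng.choices(patterns, weights=probs, k=1)[0]
            model.update(current[i], +1)
    return model


def best_split(model, sentence):
    """Most likely split of `sentence` under the trained model."""
    return max(candidate_splits(sentence), key=lambda p: split_prob(model, p))


if __name__ == "__main__":
    data = ["thecat", "thedog", "acat", "adog"]  # made-up unsegmented strings
    lm = train(data)
    print(best_split(lm, "thecat"))
```

Sampling a split in proportion to its probability, rather than always taking the best one, is what lets the model revise early segmentation choices as the counts change; the final decoding step then takes the single highest-probability split.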

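Bibliographic data

  • Publication number: JP5199901B2
  • Publication date: 2013-05-15
  • Original format: PDF
  • Applicant/patentee: 日本電信電話株式会社 (Nippon Telegraph and Telephone Corporation)
  • Application number: JP20090010931
  • Inventors: 持橋 大地; 山田 武士
  • Filing date: 2009-01-21
  • Classification: G06F17/27
  • Country: JP
  • Indexed: 2022-08-21 16:54:44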
