METHOD, APPARATUS, AND SYSTEM FOR BUILDING A COMPACT LANGUAGE MODEL FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION (LVCSR) SYSTEM

Abstract

According to one aspect of the invention, a method is provided in which a set of probabilistic attributes in an N-gram language model is classified into a plurality of classes. Each resultant class is clustered into a plurality of segments to build a codebook for the respective class using a modified K-means clustering process which dynamically adjusts the size and centroid of each segment during each iteration in the modified K-means clustering process. A probabilistic attribute in each class is then represented by the centroid of the corresponding segment to which the respective probabilistic attribute belongs.
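The quantization idea in the abstract can be illustrated with a minimal sketch. It makes several assumptions that the abstract does not confirm: the attributes are scalar values (for example log-probabilities), the class labels ("unigram_prob", "backoff") are hypothetical, and the clustering is plain one-dimensional K-means rather than the modified K-means process the patent describes. The function names build_codebook, quantize, and compress are invented for illustration.

```python
from collections import defaultdict


def build_codebook(values, k, iterations=20):
    """Cluster scalar values with plain 1-D K-means; return the centroids.

    Assumption: this is standard K-means, not the patent's modified variant.
    """
    data = sorted(values)
    if not data:
        raise ValueError("cannot build a codebook from an empty class")
    distinct = sorted(set(data))
    k = min(k, len(distinct))
    # Start from evenly spaced quantiles of the distinct values.
    centroids = [distinct[(2 * i + 1) * len(distinct) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        segments = [[] for _ in centroids]
        for v in data:
            segments[quantize(v, centroids)].append(v)
        # Recompute each centroid as the mean of its segment;
        # an empty segment keeps its previous centroid.
        updated = [sum(s) / len(s) if s else c for s, c in zip(segments, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def quantize(value, centroids):
    """Return the index of the centroid nearest to value."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


def compress(attributes, k):
    """Build one codebook per class and replace each value by a codebook index.

    attributes: list of (class_label, value) pairs; the labels are hypothetical.
    """
    attributes = list(attributes)
    by_class = defaultdict(list)
    for label, value in attributes:
        by_class[label].append(value)
    codebooks = {label: build_codebook(vals, k) for label, vals in by_class.items()}
    codes = [(label, quantize(value, codebooks[label])) for label, value in attributes]
    return codebooks, codes
```

In this sketch each stored value shrinks to a small integer index into its class codebook, and the original value is approximated by codebooks[label][index]. With k = 256, for instance, each index would fit in one byte.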
