A study on speaker normalization using vocal tract normalization and speaker adaptive training

Abstract

Although speaker normalization is attempted in very different manners, vocal tract normalization (VTN) and speaker adaptive training (SAT) share many common properties. We show that both lead to more compact representations of the phonetically relevant variations of the training data and that both achieve improved error rate performance only if a complementary normalization or adaptation operation is conducted on the test data. Algorithms for fast test speaker enrolment are presented for both normalization methods: in the framework of SAT, a pre-transformation step is proposed, which alone, i.e. without subsequent unsupervised MLLR adaptation, reduces the error rate by almost 10% on the WSJ 5k test sets. For VTN, the use of a Gaussian mixture model removes the need for a first recognition pass to obtain a preliminary transcription of the test utterance, at hardly any loss in performance.
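
The abstract does not say how the Gaussian mixture model is used to pick a warping factor, so the following is only a minimal sketch of one plausible reading: warp the test frames with each candidate factor and keep the factor under which a text-independent GMM scores them highest, so that no transcription is needed. The linear frequency scaling, the diagonal-covariance GMM, the grid of factors, and the function names `warp_spectrum`, `gmm_loglik`, and `select_warp_factor` are all assumptions made for illustration, not the paper's method.

```python
import numpy as np

def warp_spectrum(frames, alpha):
    """Resample each spectral frame at alpha * bin (assumed linear warping, clipped at the band edge)."""
    n_bins = frames.shape[1]
    bins = np.arange(n_bins, dtype=float)
    src = np.clip(alpha * bins, 0.0, n_bins - 1.0)
    return np.stack([np.interp(src, bins, frame) for frame in frames])

def gmm_loglik(frames, weights, means, variances):
    """Total log-likelihood of the frames under a diagonal-covariance GMM."""
    diff = frames[:, None, :] - means[None, :, :]
    log_comp = (
        np.log(weights)[None, :]
        - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
        - 0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)
    )
    # log-sum-exp over mixture components for numerical stability
    peak = log_comp.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.sum(np.exp(log_comp - peak), axis=1))
    return float(np.sum(per_frame))

def select_warp_factor(frames, gmm, alphas):
    """Grid search: return the candidate factor whose warped frames the GMM scores highest."""
    scores = [gmm_loglik(warp_spectrum(frames, a), *gmm) for a in alphas]
    return float(alphas[int(np.argmax(scores))])
```

Here `gmm` is a tuple `(weights, means, variances)` with shapes `(K,)`, `(K, D)`, and `(K, D)`. In a real system the GMM would be trained on normalized training data and the warping would usually be applied inside the filter-bank computation; both are left out of this sketch.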
