International Conference on Spoken Language Processing; October 4-8, 2004; Jeju (KR)

Speaker Normalization through Constrained MLLR Based Transforms


Abstract

In this paper, a novel speaker normalization method is presented and compared to a well-known vocal tract length normalization method. With this method, acoustic observations of training and testing speakers are mapped into a normalized acoustic space through speaker-specific transformations, with the aim of reducing inter-speaker acoustic variability. For each speaker, an affine transformation is estimated with the goal of reducing the mismatch between the acoustic data of the speaker and a set of target hidden Markov models. This transformation is estimated through constrained maximum likelihood linear regression and then applied to map the acoustic observations of the speaker into the normalized acoustic space. Recognition experiments used two corpora, the first consisting of adults' speech and the second of children's speech. Performing training and recognition with normalized data resulted in a consistent reduction of the word error rate with respect to the baseline systems trained on un-normalized data. In addition, the novel method always performed better than the reference vocal tract length normalization method.
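The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of the two ingredients it names: applying a speaker-specific affine map to the observations, and scoring a candidate map against a set of target models. It assumes diagonal-covariance Gaussians and precomputed component occupancies, and it uses the standard constrained-MLLR form of the objective, in which a log-determinant term accounts for the change of feature space. The function names `apply_affine` and `cmllr_objective` and all array shapes are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def apply_affine(features, A, b):
    """Map each row x of `features` (T x D) to A @ x + b."""
    return features @ A.T + b


def cmllr_objective(features, A, b, means, variances, posteriors):
    """Occupancy-weighted log-likelihood of the transformed features.

    Assumed shapes (hypothetical, for illustration only):
      features   (T, D)  observations of one speaker
      means      (M, D)  Gaussian means of the target models
      variances  (M, D)  diagonal covariances of the target models
      posteriors (T, M)  component occupancies for each frame
    """
    y = apply_affine(features, A, b)              # (T, D)
    diff = y[:, None, :] - means[None, :, :]      # (T, M, D)
    # Log density of each transformed frame under each diagonal Gaussian.
    log_gauss = -0.5 * (
        np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
        + np.sum(diff ** 2 / variances[None, :, :], axis=2)
    )                                             # (T, M)
    # Jacobian term of the feature-space transform: log |det A|.
    _, log_abs_det = np.linalg.slogdet(A)
    return float(np.sum(posteriors * (log_gauss + log_abs_det)))
```

An estimator would search for the `A` and `b` that maximize this quantity for each speaker; that search is omitted here. Once found, `apply_affine` produces the normalized observations used for both training and recognition.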
