Speaker Normalization through Constrained MLLR Based Transforms

Abstract

In this paper, a novel speaker normalization method is presented and compared to a well-known vocal tract length normalization method. With this method, acoustic observations of training and testing speakers are mapped into a normalized acoustic space through speaker-specific transformations, with the aim of reducing inter-speaker acoustic variability. For each speaker, an affine transformation is estimated with the goal of reducing the mismatch between the acoustic data of the speaker and a set of target hidden Markov models. This transformation is estimated through constrained maximum likelihood linear regression and then applied to map the acoustic observations of the speaker into the normalized acoustic space. Recognition experiments made use of two corpora, the first consisting of adults' speech and the second of children's speech. Performing training and recognition with normalized data resulted in a consistent reduction of the word error rate with respect to the baseline systems trained on unnormalized data. In addition, the novel method always performed better than the reference vocal tract length normalization method.
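The per-speaker affine mapping described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. The paper estimates the transform against a set of target hidden Markov models, which generally requires an iterative procedure. The sketch below assumes instead that the target is a single full-covariance Gaussian, a simplification for which the maximum-likelihood affine transform has a closed form. All function and variable names are hypothetical.

```python
import numpy as np

def estimate_affine_single_gaussian(X, mu, sigma):
    """Estimate (A, b) so that y = A x + b best fits one target Gaussian N(mu, sigma).

    ASSUMPTION: a single Gaussian target replaces the paper's target HMMs.
    In this special case the objective (Gaussian log-likelihood of the mapped
    frames plus the log|det A| Jacobian term) is maximized when the mapped
    frames have mean mu and covariance sigma.
    """
    m = X.mean(axis=0)                       # speaker mean
    S = np.cov(X, rowvar=False, bias=True)   # speaker covariance (ML estimate)
    L_s = np.linalg.cholesky(S)              # S = L_s L_s^T
    L_t = np.linalg.cholesky(sigma)          # sigma = L_t L_t^T
    A = L_t @ np.linalg.inv(L_s)             # gives A S A^T = sigma
    b = mu - A @ m                           # gives A m + b = mu
    return A, b

def normalize(X, A, b):
    """Map each frame (row of X) into the normalized space: y = A x + b."""
    return X @ A.T + b

def avg_loglik(X, A, b, mu, sigma):
    """Average per-frame objective: log N(A x + b; mu, sigma) + log|det A|."""
    Y = normalize(X, A, b)
    d = X.shape[1]
    diff = Y - mu
    prec = np.linalg.inv(sigma)
    quad = np.einsum("ti,ij,tj->t", diff, prec, diff)
    _, logdet_sigma = np.linalg.slogdet(sigma)
    _, logdet_A = np.linalg.slogdet(A)
    gauss = -0.5 * (d * np.log(2.0 * np.pi) + logdet_sigma + quad)
    return float(np.mean(gauss) + logdet_A)

# Usage with synthetic "speaker" frames (random data, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) * np.array([2.0, 0.5, 1.5]) + np.array([1.0, -2.0, 0.5])
mu, sigma = np.zeros(3), np.eye(3)
A, b = estimate_affine_single_gaussian(X, mu, sigma)
Y = normalize(X, A, b)  # frames in the normalized acoustic space
```

The log-determinant term is what distinguishes a feature-space transform from simply rescaling the data: without it, the likelihood could be inflated by shrinking all frames toward the target mean.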

Bibliographic record

Source
International Conference on Spoken Language Processing; October 4-8, 2004; Jeju (KR) | 2004 | pp. 2893-2896 | 4 pages
Conference location: Jeju (KR)
Authors
D. Giuliani; M. Gerosa; F. Brugnara
Author affiliation
Centro per la Ricerca Scientifica e Tecnologica, 38050 Pante di Povo, Trento, Italy
Original format: PDF
Language: English
Chinese Library Classification: Applied linguistics
