Orthogonalized Distinctive Phonetic Feature Extraction for Noise-Robust Automatic Speech Recognition

Takashi FUKUDA; Tsuneo NITTA


Abstract

In this paper, we propose a noise-robust automatic speech recognition system that uses orthogonalized distinctive phonetic features (DPFs) as the input to an HMM with diagonal covariance. In the orthogonalized DPF extraction stage, a speech signal is first converted to acoustic features composed of local features (LFs) and ΔP. A multilayer neural network (MLN) then maps the LFs to DPFs; its 15×3 output units hold context-dependent DPFs consisting of a preceding-context DPF vector, a current DPF vector, and a following-context DPF vector. The Karhunen-Loève transform (KLT) is then applied to orthogonalize each DPF vector in the context-dependent DPFs, using orthogonal bases calculated from a DPF vector that represents 38 Japanese phonemes. Finally, the orthogonalized DPF vectors are decorrelated from one another using the Gram-Schmidt orthogonalization procedure. In the experiments, after evaluating the parameters of the MLN input and output units in the DPF extractor, the orthogonalized DPFs are compared with the original DPFs. The orthogonalized DPFs are then evaluated against a standard parameter set of MFCCs and dynamic features. Next, noise robustness is tested using four types of additive noise. The experimental results show that the proposed orthogonalized DPFs can significantly reduce the error rate in an isolated spoken-word recognition task, both with clean speech and with speech contaminated by additive noise. Furthermore, we achieved significant improvements when combining the orthogonalized DPFs with conventional static MFCCs and ΔP.
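The two orthogonalization steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the 38×15 phoneme table and the MLN outputs below are random placeholders, the function names (`klt_basis`, `klt_project`, `gram_schmidt`) are hypothetical, and applying Gram-Schmidt across the three context vectors is only one possible reading of the abstract.

```python
import numpy as np

N_PHONEMES, N_DPF = 38, 15  # sizes stated in the abstract


def klt_basis(phoneme_dpf):
    """Return (mean, basis) where basis columns are covariance eigenvectors,
    sorted by descending eigenvalue."""
    mean = phoneme_dpf.mean(axis=0)
    cov = np.cov(phoneme_dpf, rowvar=False)  # 15 x 15 covariance of the table
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    return mean, eigvecs[:, order]


def klt_project(x, mean, basis):
    """Project one DPF vector (or a stack of row vectors) onto the KLT basis."""
    return (x - mean) @ basis


def gram_schmidt(vectors, eps=1e-12):
    """Make the row vectors mutually orthogonal (no normalization)."""
    out = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for u in out:
            denom = u @ u
            if denom > eps:  # skip (near-)zero directions
                w = w - (w @ u) / denom * u
        out.append(w)
    return np.stack(out)


rng = np.random.default_rng(0)

# Placeholder: random binary values stand in for the real phoneme-to-DPF table.
phoneme_dpf = rng.integers(0, 2, size=(N_PHONEMES, N_DPF)).astype(float)
mean, basis = klt_basis(phoneme_dpf)

# Placeholder MLN output: preceding, current, and following DPF vectors.
mln_out = rng.random((3, N_DPF))

klt_out = klt_project(mln_out, mean, basis)  # each context vector transformed by KLT
ortho = gram_schmidt(klt_out)                # context vectors made mutually orthogonal
```

In this sketch the KLT diagonalizes the covariance of the placeholder phoneme table, which is the property that makes a diagonal-covariance HMM a better fit for the transformed features. The Gram-Schmidt step yields zero inner products between the three context vectors, which is orthogonality rather than statistical decorrelation over a data set.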


Bibliographic details

Source
IEICE Transactions on Information and Systems, 2004, No. 5, pp. 1110-1118 (9 pages)
Authors
Takashi FUKUDA; Tsuneo NITTA
Affiliation

Graduate School of Engineering, Toyohashi University of Technology, Toyohashi-shi, 441-8580 Japan

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Chinese Library Classification: Radio electronics; telecommunications technology
Keywords
automatic speech recognition (ASR); feature extraction; distinctive phonetic feature (DPF); orthogonalization; local feature (LF)


