A Low-Complexity Parabolic Lip Contour Model With Speaker Normalization for High-Level Feature Extraction in Noise-Robust Audiovisual Speech Recognition

Borgstrom B.J.; Alwan A.


Abstract

This paper proposes a novel low-complexity lip contour model for high-level optic feature extraction in noise-robust audiovisual (AV) automatic speech recognition systems. The model is based on weighted least-squares parabolic fitting of the upper and lower lip contours, does not require the assumption of symmetry across the horizontal axis of the mouth, and is therefore realistic. The proposed model does not depend on the accurate estimation of specific facial points, as do other high-level models. Also, we present a novel low-complexity algorithm for speaker normalization of the optic information stream, which is compatible with the proposed model and does not require parameter training. The use of the proposed model with speaker normalization results in noise robustness improvement in AV isolated-word recognition relative to using the baseline high-level model.
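The weighted least-squares parabolic fitting mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's weighting scheme, coordinate conventions, or final feature vector, so the function name `fit_parabola_wls`, the sample points, the uniform weights, and the six-parameter feature vector below are all assumptions made for illustration only. The sketch fits the upper and lower contours independently, so the two parabolas need not mirror each other.

```python
import numpy as np

def fit_parabola_wls(x, y, w):
    """Weighted least-squares fit of y ~ a*x**2 + b*x + c; returns (a, b, c)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    # Design matrix with columns x^2, x, 1.
    A = np.column_stack([x**2, x, np.ones_like(x)])
    # Scaling each row by sqrt(w) reduces the weighted problem
    # to an ordinary least-squares problem.
    sw = np.sqrt(w)
    coeffs, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return tuple(coeffs)

# Hypothetical contour samples (pixel coordinates, mouth centred near x = 0).
x = np.linspace(-20.0, 20.0, 9)
upper_y = -0.02 * x**2 + 0.1 * x + 8.0   # assumed upper-lip edge points
lower_y = 0.03 * x**2 - 0.05 * x - 10.0  # assumed lower-lip edge points
weights = np.ones_like(x)                # assumed per-point confidence

# The two contours are fitted separately: no symmetry is imposed.
upper = fit_parabola_wls(x, upper_y, weights)
lower = fit_parabola_wls(x, lower_y, weights)

# One possible (assumed) feature vector: the six parabola coefficients.
features = np.array(upper + lower)
```

Setting a point's weight to zero removes its influence on the fit, which is one way an unreliable edge pixel could be discounted. Whether the paper weights points this way is not stated in the abstract.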


Bibliographic details

Source: IEEE transactions on systems, man, and cybernetics. Part A, Systems and humans | 2008, Issue 6 | p. 1-1280 | 1280 pages in total
Authors: Borgstrom B.J.; Alwan A.
Original format: PDF
Language: English
Chinese Library Classification: Fundamental theory of automation
Keywords: feature extraction; speech recognition; high-level optic feature extraction; isolated word recognition; low-complexity algorithm; low-complexity parabolic lip contour model; lower lip contours; noise robustness; noise-robust audiovisual automatic speech recognit;


