Conference: 2017 24th National and 2nd International Iranian Conference on Biomedical Engineering

Lip-Reading via Deep Neural Network Using Appearance-Based Visual Features

Abstract

Lip-reading is the visual interpretation of lip movements in order to understand speech when normal sound is not accessible. Image processing techniques for lip-reading recognition have been widely applied in various kinds of applications. One such application is a computer-based video system developed to provide lip-reading instruction to hearing-impaired adults and teenagers. Automating the process faces challenges such as the coarticulation phenomenon, the homophone effect, insufficient training data per class, the choice of features, and speaker dependency, so a method that overcomes these challenges is desirable. This paper describes a lip-reading model, highlighting the feature extraction and recognition parts. For the feature extraction part, a particular arrangement of blocks is considered in order to obtain optimal appearance-based features, while a properly structured Deep Belief Network (DBN) is used for the recognition part. The challenging CUAVE dataset is used in this study, and visual phone recognition (VPR) accuracies are reported at the phone level. The proposed lip-reading recognizer is unique in that it is used for all speakers. The suggested method outperforms the conventional Hidden Markov Model (HMM)-based recognizer, and the best DBN achieves the best VPR accuracy of 45.63%.
