LIP-READING VIA DEEP NEURAL NETWORKS USING HYBRID VISUAL FEATURES

Abstract

Lip-reading is the visual interpretation of a speaker's lip movements during speech. Experiments over many years have shown that speech intelligibility increases when visual facial information is available, and that this effect becomes more apparent in noisy environments. Automating this process raises several challenges, such as the coarticulation phenomenon, the choice of visual unit type, the diversity of features, and their inter-speaker dependency. Although efforts have been made to overcome these challenges, a flawless lip-reading system is still under investigation. This paper searches for a lip-reading model in which the processing blocks are combined and arranged efficiently to extract highly discriminative visual features. In particular, the application of a properly structured Deep Belief Network (DBN)-based recognizer is highlighted. Multi-speaker (MS) and speaker-independent (SI) tasks are performed on the CUAVE database, achieving phone recognition rates (PRRs) of 77.65% and 73.40%, respectively. The best word recognition rates (WRRs) achieved in the MS and SI tasks are 80.25% and 76.91%, respectively. The resulting accuracies demonstrate that the proposed method outperforms the conventional Hidden Markov Model (HMM) and competes well with state-of-the-art visual speech recognition work.
