Conference: International Conference on Intelligent Virtual Agents

Lip-Reading: Furhat Audio Visual Intelligibility of a Back Projected Animated Face

Abstract

Back-projecting a computer-animated face onto a three-dimensional static physical model of a face is a promising technology that is gaining ground as a solution for building situated, flexible, and human-like robot heads. In this paper, we first briefly describe Furhat, a back-projected robot head built for multimodal multiparty human-machine interaction, and its benefits over virtual characters and robotic heads; we then motivate the need to investigate the contribution that Furhat's face makes to speech intelligibility. We present an audio-visual speech intelligibility experiment in which 10 subjects listened to short sentences with a degraded speech signal. The experiment compares the gain in intelligibility from lip-reading a face shown on a 2D screen with that from a 3D back-projected face, at different viewing angles. The results show that audio-visual speech intelligibility holds when the avatar is projected onto a static face model (as in Furhat) and even, rather surprisingly, exceeds that of the flat display. This means that, despite the movement limitations that back-projected animated face models bring about, their audio-visual speech intelligibility is equal to, or even higher than, that of the same models shown on flat displays. At the end of the paper, we discuss several hypotheses on how to interpret the results and motivate future investigations to better explore the characteristics of visual speech perception with 3D projected faces.
