Conference: Annual Neural Information Processing Systems Conference

Real Time Voice Processing with Audiovisual Feedback: Toward Autonomous Agents with Perfect Pitch

Abstract

We have implemented a real-time front end for detecting voiced speech and estimating its fundamental frequency. The front end performs the signal processing for voice-driven agents that attend to the pitch contours of human speech and provide continuous audiovisual feedback. The algorithm we use for pitch tracking has several distinguishing features: it makes no use of FFTs or autocorrelation at the pitch period; it updates the pitch incrementally on a sample-by-sample basis; it avoids peak picking and does not require interpolation in time or frequency to obtain high-resolution estimates; and it works reliably over a four-octave range, in real time, without the need for postprocessing to produce smooth contours. The algorithm is based on two simple ideas in neural computation: the introduction of a purposeful nonlinearity, and the error signal of a least squares fit. The pitch tracker is used in two real-time multimedia applications: a voice-to-MIDI player that synthesizes electronic music from vocalized melodies, and an audiovisual karaoke machine with multimodal feedback. Both applications run on a laptop and display the user's pitch scrolling across the screen as he or she sings into the computer.
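The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a least squares fit can estimate the frequency of a sinusoid without FFTs or autocorrelation, and how the fit error can indicate whether the signal is sinusoidal at all. It relies on the identity x[n-1] + x[n+1] = 2 cos(w) x[n], which holds for any pure sinusoid. The function names `estimate_frequency` and `frequency_to_midi` are hypothetical, the nonlinearity and filtering stages mentioned in the abstract are omitted, and the MIDI conversion assumes the common convention that A4 = 440 Hz is note 69.

```python
import math


def estimate_frequency(samples, sample_rate):
    """Fit c in x[n-1] + x[n+1] ~= c * x[n] by least squares.

    Returns (frequency_hz, normalized_error). For a pure sinusoid,
    c = 2*cos(w) and the error is close to zero; a large error suggests
    the signal is not well described by a single sinusoid.
    This is an illustrative sketch, not the authors' implementation.
    """
    num = 0.0  # running sum of x[n] * (x[n-1] + x[n+1])
    den = 0.0  # running sum of x[n]^2
    sq = 0.0   # running sum of (x[n-1] + x[n+1])^2
    # The sums are simple accumulators, so they could also be updated
    # one sample at a time as new audio arrives.
    for n in range(1, len(samples) - 1):
        s = samples[n - 1] + samples[n + 1]
        num += samples[n] * s
        den += samples[n] * samples[n]
        sq += s * s
    if den == 0.0 or sq == 0.0:
        return None, 1.0  # silence: no estimate
    # Least squares solution, clamped so that acos stays defined.
    c = max(-2.0, min(2.0, num / den))
    omega = math.acos(c / 2.0)  # radians per sample
    frequency = omega * sample_rate / (2.0 * math.pi)
    # Residual of the fit, normalized by the energy of x[n-1] + x[n+1].
    residual = sq - 2.0 * c * num + c * c * den
    error = max(0.0, residual / sq)
    return frequency, error


def frequency_to_midi(frequency_hz):
    """Convert a frequency to a (fractional) MIDI note number, A4 = 69."""
    return 69.0 + 12.0 * math.log2(frequency_hz / 440.0)


if __name__ == "__main__":
    rate = 8000
    tone = [math.sin(2.0 * math.pi * 440.0 * n / rate) for n in range(800)]
    freq, err = estimate_frequency(tone, rate)
    print(freq, err, round(frequency_to_midi(freq)))
```

Because the fit needs only three running sums, an estimate of this kind requires no peak picking and no interpolation in time or frequency; a practical tracker would still need band-limiting and a voicing decision based on the fit error.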