Real Time Voice Processing with Audiovisual Feedback: Toward Autonomous Agents with Perfect Pitch

Abstract

We have implemented a real time front end for detecting voiced speech and estimating its fundamental frequency. The front end performs the signal processing for voice-driven agents that attend to the pitch contours of human speech and provide continuous audiovisual feedback. The algorithm we use for pitch tracking has several distinguishing features: it makes no use of FFTs or autocorrelation at the pitch period; it updates the pitch incrementally on a sample-by-sample basis; it avoids peak picking and does not require interpolation in time or frequency to obtain high resolution estimates; and it works reliably over a four octave range, in real time, without the need for postprocessing to produce smooth contours. The algorithm is based on two simple ideas in neural computation: the introduction of a purposeful nonlinearity, and the error signal of a least squares fit. The pitch tracker is used in two real time multimedia applications: a voice-to-MIDI player that synthesizes electronic music from vocalized melodies, and an audiovisual Karaoke machine with multimodal feedback. Both applications run on a laptop and display the user's pitch scrolling across the screen as he or she sings into the computer.
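The abstract does not spell out the fit itself, so the following is only a minimal sketch of one standard way to estimate a frequency by least squares without FFTs, autocorrelation, or peak picking. It assumes the input is already close to a single sinusoid and fits the three-point recurrence x[n-1] + x[n+1] = 2cos(w)x[n]; the residual of that fit serves as an error signal that stays near zero for periodic input and grows for noise. The function names `fit_sinusoid` and `hz_to_midi`, the block-wise (rather than sample-by-sample) update, and the error normalization are assumptions for illustration, not the authors' implementation. The nonlinearity and any filtering stage mentioned in the abstract are omitted.

```python
import math


def fit_sinusoid(x):
    """Least-squares fit of x[n-1] + x[n+1] = 2*cos(w)*x[n].

    Returns (w, rel_error): w is the estimated frequency in radians per
    sample (None for an all-zero input), and rel_error is the residual
    energy of the fit divided by 4x the signal energy (hypothetical
    normalization; about 0 for a clean sinusoid).
    """
    num = 0.0
    den = 0.0
    for n in range(1, len(x) - 1):
        num += x[n] * (x[n - 1] + x[n + 1])
        den += x[n] * x[n]
    if den == 0.0:
        return None, 1.0

    # Closed-form minimizer of sum((x[n-1] + x[n+1] - 2*c*x[n])**2) over c.
    c = num / (2.0 * den)
    c = max(-1.0, min(1.0, c))  # keep acos() in its domain

    err = 0.0
    for n in range(1, len(x) - 1):
        r = x[n - 1] + x[n + 1] - 2.0 * c * x[n]
        err += r * r
    return math.acos(c), err / (4.0 * den)


def hz_to_midi(freq_hz):
    """Nearest MIDI note number, using the convention A4 = 440 Hz = note 69."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))


# Usage: a synthetic 220 Hz tone sampled at 8 kHz (illustrative values).
fs = 8000.0
tone = [math.sin(2.0 * math.pi * 220.0 * n / fs) for n in range(400)]
w, rel_error = fit_sinusoid(tone)
f_est = w * fs / (2.0 * math.pi)
print(f_est, rel_error, hz_to_midi(f_est))
```

Thresholding `rel_error` is one simple way a front end could separate voiced from unvoiced frames, and `hz_to_midi` shows the kind of mapping a voice-to-MIDI player needs; both choices here are illustrative only.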

Bibliographic record

Source: Annual neural information processing systems conference, 2003, 8 pages
Authors: Lawrence K. Saul; Daniel D. Lee; Charles L. Isbell; Yann LeCun
Original format: PDF
Chinese Library Classification: Information processing

Similar literature

1. Time-dependent Neural Processing of Auditory Feedback during Voice Pitch Error Detection [J]. Roozbeh Behroozmand, Hanjun Liu, and Charles R. Larson. Journal of Cognitive Neuroscience. 2011, No. 5
2. Attention Modulates Cortical Processing of Pitch Feedback Errors in Voice Control [J]. Huijing Hu, Ying Liu, Zhiqiang Guo. Scientific reports. 2015, No. 1
3. Differential effects of perturbation direction and magnitude on the neural processing of voice pitch feedback [J]. Liu H, Meshman M, Behroozmand R. Clinical neurophysiology. 2011, No. 5
4. Real Time Voice Processing with Audiovisual Feedback: Toward Autonomous Agents with Perfect Pitch [C]. Lawrence K. Saul, Daniel D. Lee, Charles L. Isbell. Annual neural information processing systems conference. 2003
5. Transfer of Suprasegmental Improvements to Novel Sentences and Segmental Accuracy Using Real Time Audiovisual Pitch Training [D]. Tan, April. 2020
6. Time-dependent Neural Processing of Auditory Feedback during Voice Pitch Error Detection [O]. Roozbeh Behroozmand, Hanjun Liu, Charles R. Larson
7. Sensory Processing: Advances in Understanding Structure and Function of Pitch-Shifted Auditory Feedback in Voice Control [O]. Charles R Larson, Donald A Robin. 2016
