International Conference on Spoken Language Processing; October 4-8, 2004; Jeju, Korea

Precise Phone Boundary Detection Using Wavelet Packet and Recurrent Neural Networks

Abstract

Automatic segmentation is an important research subject in speech processing. Many approaches have been developed in this field, and good results have been reported. In this paper we show that choosing wavelet packet coefficients as features overcomes the tradeoff between frequency and time resolution that appears in conventional spectral features such as MFCC and leads to low time resolution. In addition, phone boundaries are transitional in nature, a consequence of the limits on vocal tract movement: there is usually a transition zone between two unlike phones. We exploit this phenomenon in segmentation by feeding features from shortly before and after the boundary to the input of the segmentation model. This gave good results, and we also found that a more dynamic segmentation model that can follow these dynamics, such as a recurrent neural network, locates phone boundaries more precisely. We tested our approach using two sets of Persian utterances for training and testing. Experimental results showed an overall tolerance of 8.14 milliseconds for detected phone boundaries.
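The feature side of this idea can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the three decomposition levels, the log-energy summary per subband, the frame and hop sizes, and all function names are assumptions made for illustration, since the abstract specifies none of them. The sketch computes a full wavelet packet decomposition of short frames and then stacks the feature vectors from frames before and after a candidate boundary, which is the kind of input a segmentation model such as a recurrent neural network could consume. The network itself is omitted.

```python
import math


def haar_step(x):
    """One Haar analysis step: return (approximation, detail), each half as long as x."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet(frame, levels):
    """Full wavelet packet tree: split every subband at every level.

    Returns 2**levels subbands in natural (not frequency) order.
    Assumes len(frame) is divisible by 2**levels.
    """
    bands = [list(frame)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            approx, detail = haar_step(band)
            next_bands.extend([approx, detail])
        bands = next_bands
    return bands


def subband_log_energies(frame, levels=3, eps=1e-10):
    """Summarize a frame as the log energy of each wavelet packet subband (assumed feature)."""
    return [math.log(sum(c * c for c in band) + eps)
            for band in wavelet_packet(frame, levels)]


def frame_signal(signal, frame_len, hop):
    """Cut the signal into overlapping frames; a short hop keeps time resolution fine."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def stack_context(features, index, context):
    """Concatenate feature vectors from `context` frames before to `context` frames
    after frame `index`, repeating the edge frame where the window runs off the signal."""
    last = len(features) - 1
    stacked = []
    for j in range(index - context, index + context + 1):
        stacked.extend(features[min(max(j, 0), last)])
    return stacked


# Synthetic stand-in for two unlike phones: a slow sine followed by a fast one.
signal = [math.sin(0.05 * n) for n in range(256)] + [math.sin(1.5 * n) for n in range(256)]
features = [subband_log_energies(f) for f in frame_signal(signal, frame_len=64, hop=16)]
# Input vector for the frame nearest the change point, with two frames of context per side.
boundary_input = stack_context(features, index=len(features) // 2, context=2)
```

Because the Haar step is orthonormal, the subband energies sum to the frame energy, so the features partition the frame's energy across bands. A real system would likely use a smoother wavelet and order the subbands by frequency.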
