Journal: IEEE Journal on Selected Areas in Communications

Intelligent voice smoother for silence-suppressed voice over Internet


Abstract

When transporting voice data with silence suppression over the Internet, the problem of jitter introduced from the network often renders the speech unintelligible. It is thus indispensable to offer intramedia synchronization to remove jitter while retaining minimal playout delay (PD). We propose a neural network (NN)-based intravoice synchronization mechanism, called the intelligent voice smoother (IVoS). The IVoS is composed of three components: (1) the smoother buffer; (2) the NN traffic predictor; and (3) the constant bit rate (CBR) enforcer. Newly arriving frames, assumed to follow a generic Markov modulated Bernoulli process (MMBP), are queued in the smoother buffer. The NN traffic predictor employs an online-trained back propagation NN (BPNN) to predict three traffic characteristics of every newly encountered talkspurt period. Based on the predicted characteristics, the CBR enforcer derives an adaptive buffering delay (ABD) by means of a near-optimal simple closed-form formula. It then imposes the delay on the playout of the first frame in the talkspurt period. The CBR enforcer in turn regulates CBR-based departures for the remaining frames of the talkspurt, aiming at assuring minimal mean and variance of distortion of talkspurts (DOT) and mean PD. Simulation results reveal that, compared to three other playout approaches, the IVoS achieves superior playout, yielding negligible DOT and PD, irrespective of traffic variation.
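The pipeline described in the abstract can be illustrated with a minimal Python sketch, under several assumptions. `mmbp_arrivals` is a hypothetical two-state MMBP generator whose parameters (`p_arrival`, `p_switch`) are illustrative, not taken from the paper. `cbr_playout` is a hypothetical stand-in for the CBR enforcer: it delays the first frame of a talkspurt by a given buffering delay and then plays the remaining frames at a constant period. The paper's NN traffic predictor and its closed-form ABD formula are not given in the abstract, so the buffering delay is simply passed in as an argument. The late-frame count is only a rough proxy for distortion, not the paper's DOT metric.

```python
import random


def mmbp_arrivals(n_slots, p_arrival, p_switch, seed=0):
    """Two-state Markov modulated Bernoulli process (illustrative parameters).

    p_arrival[s] is the per-slot arrival probability while in state s;
    p_switch[s] is the per-slot probability of leaving state s.
    Returns a list of 0/1 arrival indicators, one per time slot.
    """
    rng = random.Random(seed)
    state = 0
    arrivals = []
    for _ in range(n_slots):
        # Bernoulli arrival whose probability depends on the current state.
        arrivals.append(1 if rng.random() < p_arrival[state] else 0)
        # Markov transition between the two modulating states.
        if rng.random() < p_switch[state]:
            state = 1 - state
    return arrivals


def cbr_playout(arrival_times, frame_period, buffering_delay):
    """Schedule one talkspurt: delay the first frame, then play at a constant rate.

    buffering_delay is supplied by the caller (assumption: in the paper it
    would come from the predicted traffic characteristics).
    Returns (playout_times, late_count); a frame is late if it arrives
    after its scheduled playout instant.
    """
    if not arrival_times:
        return [], 0
    start = arrival_times[0] + buffering_delay
    playout_times = [start + i * frame_period for i in range(len(arrival_times))]
    late_count = sum(1 for a, p in zip(arrival_times, playout_times) if a > p)
    return playout_times, late_count


if __name__ == "__main__":
    # Frame arrival times in ms for one talkspurt, nominally 20 ms apart but jittered.
    talkspurt = [0, 30, 45, 90, 85]
    for delay in (0, 10, 30):
        _, late = cbr_playout(talkspurt, frame_period=20, buffering_delay=delay)
        print(f"buffering delay {delay} ms -> {late} late frame(s)")
```

In this toy talkspurt, a larger buffering delay reduces the number of late frames at the cost of a longer playout delay, which is the trade-off the adaptive buffering delay is meant to balance.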
