INTERSPEECH 2012

Boosting Classification Based Speech Separation Using Temporal Dynamics


Abstract

Significant advances in speech separation have been made by formulating it as a classification problem in which the desired output is the ideal binary mask (IBM). Previous work uses standard binary classifiers and does not explicitly model the correlation between neighboring time-frequency units. Because temporal dynamics are among the most important characteristics of the speech signal, the IBM contains highly structured rather than random patterns. In this study, we incorporate temporal dynamics into classification by employing structured output learning. In particular, we use linear-chain structured perceptrons to account for the interactions of neighboring labels in time. However, the performance of structured perceptrons depends largely on the linear separability of the features. To address this problem, we employ pre-trained deep neural networks to automatically learn effective feature functions for the structured perceptrons. Experiments show that the proposed system significantly outperforms previous IBM estimation systems.
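
To make the linear-chain idea concrete, the following is a minimal sketch of a structured perceptron that labels the frames of a single frequency channel with binary mask values. It is an illustration under stated assumptions, not the authors' implementation: the function names `viterbi` and `train`, the per-label weight vectors `w`, the 2x2 transition table `trans`, and the plain-list feature format are all hypothetical, and the features here are arbitrary numbers rather than outputs of a pre-trained deep neural network.

```python
def viterbi(feats, w, trans):
    """Return the highest-scoring binary label sequence for one channel.

    Sequence score (an assumed, generic linear-chain form):
    sum_t w[y_t] . x_t  +  sum_t trans[y_{t-1}][y_t]
    """
    def emit(x, label):
        return sum(wi * xi for wi, xi in zip(w[label], x))

    score = [emit(feats[0], 0), emit(feats[0], 1)]
    back = []  # back[t-1][cur] = best previous label for label cur at frame t
    for x in feats[1:]:
        new_score, ptr = [], []
        for cur in (0, 1):
            cands = [score[prev] + trans[prev][cur] for prev in (0, 1)]
            best_prev = 0 if cands[0] >= cands[1] else 1
            ptr.append(best_prev)
            new_score.append(cands[best_prev] + emit(x, cur))
        score = new_score
        back.append(ptr)

    # Trace the best path backwards from the best final label.
    label = 0 if score[0] >= score[1] else 1
    path = [label]
    for ptr in reversed(back):
        label = ptr[label]
        path.append(label)
    return path[::-1]


def train(sequences, n_feats, epochs=10):
    """Perceptron training: decode, then move weights toward the gold labels."""
    w = [[0.0] * n_feats for _ in range(2)]   # one weight vector per label
    trans = [[0.0, 0.0], [0.0, 0.0]]          # trans[prev][cur]
    for _ in range(epochs):
        for feats, gold in sequences:
            pred = viterbi(feats, w, trans)
            if pred == gold:
                continue
            for t, x in enumerate(feats):
                if pred[t] != gold[t]:
                    for i, xi in enumerate(x):
                        w[gold[t]][i] += xi
                        w[pred[t]][i] -= xi
                if t > 0:
                    # Identical gold and predicted transitions cancel out.
                    trans[gold[t - 1]][gold[t]] += 1.0
                    trans[pred[t - 1]][pred[t]] -= 1.0
    return w, trans
```

Here each element of `sequences` is a pair `(feats, gold)`, where `feats` is a list of per-frame feature vectors and `gold` is the matching list of 0/1 mask labels. The transition table is what lets neighboring labels influence each other; a standard per-unit binary classifier corresponds to leaving `trans` at zero. Because the emission score is linear in the features, the sketch works only when those features are close to linearly separable, which is the limitation the abstract addresses with learned features.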
