IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Speech overlap detection and attribution using convolutive non-negative sparse coding

Abstract

Overlapping speech is known to degrade speaker diarization performance, affecting both speaker clustering and segmentation. While previous work has made important advances in detecting overlapping speech intervals and in attributing them to the relevant speakers, the problem remains largely unsolved. This paper reports the first application of convolutive non-negative sparse coding (CNSC) to the overlap problem. CNSC aims to decompose a composite signal into its underlying contributory parts and is thus naturally suited to overlap detection and attribution. Experimental results on NIST RT data show that the CNSC approach gives results comparable to those of a state-of-the-art hidden Markov model-based overlap detector. In a practical diarization system, CNSC-based speaker attribution is shown to reduce the speaker error in overlapping segments by over 40% relative.
