Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

System fusion and speaker linking for longitudinal diarization of TV shows

Abstract

Performing speaker diarization while uniquely identifying the speakers in a collection of audio recordings is a challenging task. Based on our previous work on speaker diarization and linking, we developed a system for diarizing longitudinal TV show data sets based on the fusion of speaker diarization system outputs and speaker linking. Agreement between multiple diarization outputs is found prior to speaker linking, largely reducing the diarization error rate at the expense of keeping some speech data unlabelled. To deal with noisy clusters, a linear prediction based technique was used to label speakers after linking. Considerable gains for both fusion and labelling are reported. Despite the challenges of the longitudinal diarization task, this system obtained similar performance for linked and non-linked tasks under moderate session variability, highlighting the viability of a linking approach to longitudinal diarization of speech in the presence of noise, music and special audio effects.
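
The abstract does not describe the fusion procedure itself, so the following is only a minimal sketch of the general idea of keeping the speech on which two diarization outputs agree and leaving the rest unlabelled. It assumes each system emits one speaker label per fixed-length frame, and it aligns the two label sets with a simple majority-overlap mapping. The function names `map_labels` and `fuse_by_agreement`, the frame-level representation, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

UNLABELLED = None  # marker for frames on which the two systems disagree


def map_labels(reference, other):
    """Map each label of `other` to the `reference` label it overlaps most.

    Assumption: a many-to-one majority mapping is good enough for a sketch;
    a real system might use an optimal one-to-one assignment instead.
    """
    overlap = defaultdict(Counter)
    for ref_label, other_label in zip(reference, other):
        overlap[other_label][ref_label] += 1
    return {label: counts.most_common(1)[0][0] for label, counts in overlap.items()}


def fuse_by_agreement(reference, other):
    """Keep a frame's label only where both systems agree after mapping."""
    if len(reference) != len(other):
        raise ValueError("both outputs must cover the same frames")
    mapping = map_labels(reference, other)
    return [
        ref_label if mapping[other_label] == ref_label else UNLABELLED
        for ref_label, other_label in zip(reference, other)
    ]


# Toy frame-level outputs of two hypothetical diarization systems.
system_a = ["A", "A", "A", "B", "B", "B", "B", "A"]
system_b = ["x", "x", "y", "y", "y", "y", "y", "x"]
fused = fuse_by_agreement(system_a, system_b)
```

In this toy case the two systems disagree only on the third frame, so that frame is left unlabelled while the remaining frames keep the reference labels. This mirrors the trade-off the abstract mentions: fewer labelling errors in exchange for some unlabelled speech.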
