Venue: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Speaker diarization of broadcast streams using two-stage clustering based on i-vectors and cosine distance scoring



Abstract

In this paper we present our system for speaker diarization of broadcast news, based on recent advances in the speaker recognition field. In the system, speaker segments determined by the speaker change-point detector are represented by i-vectors, and the similarity of the segments' speakers is evaluated using cosine distance scoring. Linear discriminant analysis is employed to cope with intra-speaker variability. The experiments were carried out using the COST278 multilingual broadcast news database. We demonstrate improved performance over the baseline system based on the Bayesian Information Criterion (BIC) and highlight the significant impact of cepstral mean normalization. Finally, two-stage clustering, which employs BIC-based clustering to pre-cluster segments in the first stage, is examined and shown to yield further performance improvement. The best-performing configuration of our system achieved a 52.4% relative improvement in speaker error rate over the baseline.
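To illustrate the clustering idea the abstract describes, the following is a minimal sketch of cosine scoring between per-segment vectors, followed by a simple agglomerative merge. It is not the authors' implementation. It assumes the i-vectors have already been extracted (and, if desired, projected with LDA), and it omits the BIC-based pre-clustering stage entirely. The function names, the stopping threshold, and the choice to represent a cluster by the mean of its member vectors are all hypothetical choices made for illustration.

```python
import numpy as np


def cosine_score(a, b):
    """Cosine similarity between two vectors (higher means more similar)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def cluster_segments(vectors, threshold):
    """Greedy agglomerative clustering of segment vectors.

    Repeatedly merges the pair of clusters with the highest cosine score
    and stops once the best score falls below `threshold` (a hypothetical
    tuning parameter). Each cluster is represented by the mean of its
    member vectors, which is an assumption of this sketch.
    """
    vectors = np.asarray(vectors, dtype=float)
    clusters = [[i] for i in range(len(vectors))]
    means = [vectors[i].copy() for i in range(len(vectors))]

    while len(clusters) > 1:
        best, pair = -np.inf, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = cosine_score(means[i], means[j])
                if score > best:
                    best, pair = score, (i, j)
        if best < threshold:
            break
        i, j = pair
        clusters[i] = clusters[i] + clusters[j]
        means[i] = vectors[clusters[i]].mean(axis=0)
        del clusters[j]
        del means[j]

    # Convert the cluster membership lists into one label per segment.
    labels = np.empty(len(vectors), dtype=int)
    for label, members in enumerate(clusters):
        labels[members] = label
    return labels


# Toy two-dimensional "i-vectors" for four segments from two speakers.
toy_vectors = np.array([[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]])
toy_labels = cluster_segments(toy_vectors, threshold=0.5)
```

In this toy example the first two segments end up in one cluster and the last two in another. Real i-vectors have hundreds of dimensions, and the threshold would need to be tuned on held-out data.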
