Conference: INTERSPEECH 2012

DiarTk: An Open Source Toolkit for Research in Multistream Speaker Diarization and its Application to Meetings Recordings


Abstract

The speaker diarization task consists of inferring "who spoke when" in an audio stream without any prior knowledge, and it has been the object of several NIST international evaluation campaigns in recent years. A common trend for improving performance has been the use of several different feature streams, as diverse as speaker location features, visual features, or noise-robust acoustic features. This paper describes an open source toolkit, released under the GPL license, that aims to facilitate research in multistream speaker diarization and the reproduction of state-of-the-art results. In contrast to other related diarization toolkits, it is explicitly designed to handle an arbitrary number of features with very different statistics while limiting the computational complexity. The release includes a set of scripts to replicate benchmark results on previous NIST evaluations and is intended to provide easy-to-use software for studying novel features and including them in diarization systems.
