INTERSPEECH 2012

Speaker Clustering for a Mixture of Singing and Reading

Abstract

In this study, we propose a speaker clustering algorithm based on reading and singing speech samples for each speaker. As a speaking style, singing changes the time-frequency structure of a speaker's voice. The purpose of this study is to advance speech systems such as speech indexing and retrieval by improving their robustness to intrinsic variations in speech production. Clustering is performed in a GMM mean supervector space. The proposed method has two stages. First, initial clusters are obtained using traditional clustering techniques such as k-means and hierarchical clustering. Next, each cluster is refined in a PLDA subspace, resulting in a more speaker-dependent representation that is less sensitive to speaking style. The proposed algorithm improves the average clustering accuracy of the k-means baseline by 9.3% absolute.
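The two-stage idea can be illustrated with a minimal sketch in Python. It is not the authors' implementation and rests on several assumptions: plain Fisher LDA stands in for PLDA, the projection is learned from a separate labeled development set, the refinement step is a second k-means pass started from the stage-one clusters, and synthetic vectors stand in for GMM mean supervectors. All function names (`make_data`, `lda_projection`, `lloyd`, `two_stage_cluster`) are hypothetical.

```python
import numpy as np

def make_data(n_speakers, per_style, dim, rng):
    """Synthetic stand-in for supervectors: each speaker has 'read' samples
    and 'sung' samples, the latter shifted along one nuisance axis."""
    style_shift = np.zeros(dim)
    style_shift[0] = 3.0  # assumed style offset, for illustration only
    X, y = [], []
    for s in range(n_speakers):
        mu = rng.normal(scale=2.0, size=dim)
        for shift in (0.0, style_shift):
            X.append(mu + shift + rng.normal(scale=0.3, size=(per_style, dim)))
            y += [s] * per_style
    return np.vstack(X), np.array(y)

def lda_projection(X, y, n_components):
    """Fisher LDA directions (a simple stand-in for PLDA, not PLDA itself):
    leading eigenvectors of pinv(Sw) @ Sb."""
    mean = X.mean(axis=0)
    dim = X.shape[1]
    Sw = np.zeros((dim, dim))  # within-speaker scatter
    Sb = np.zeros((dim, dim))  # between-speaker scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(evals.real)[::-1][:n_components]
    return evecs[:, order].real

def lloyd(X, centers, n_iter=50):
    """Lloyd's k-means iterations starting from the given centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(len(centers)):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def two_stage_cluster(X, k, W, seed=0):
    """Stage 1: k-means in the original space.
    Stage 2: re-cluster in the projected space, seeded by stage-1 clusters."""
    rng = np.random.default_rng(seed)
    init = X[rng.choice(len(X), size=k, replace=False)]
    labels0 = lloyd(X, init)
    Z = X @ W  # project into the discriminative subspace
    init_z = np.stack([
        Z[labels0 == j].mean(axis=0) if np.any(labels0 == j) else Z.mean(axis=0)
        for j in range(k)
    ])
    labels1 = lloyd(Z, init_z)
    return labels0, labels1

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dev_X, dev_y = make_data(4, 10, 6, rng)   # labeled development speakers
    X, _ = make_data(3, 10, 6, rng)           # recordings to cluster
    W = lda_projection(dev_X, dev_y, n_components=3)
    initial, refined = two_stage_cluster(X, k=3, W=W)
    print(initial, refined)
```

The point of the sketch is the ordering of the steps: the subspace is chosen to suppress within-speaker variation (here, the synthetic style shift), so the second pass compares recordings on dimensions that are meant to separate speakers rather than speaking styles.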
