Conference: International Conference on Text, Speech and Dialogue

Automatic Phonetic Segmentation Using the Kaldi Toolkit

Abstract

In this paper we explore the possibilities of hidden Markov model-based automatic phonetic segmentation with the Kaldi toolkit. We compare the Kaldi toolkit and the Hidden Markov Model Toolkit (HTK) in terms of segmentation accuracy. A well-tuned HTK-based phonetic segmentation framework was taken as the baseline and compared to a newly proposed segmentation framework built from the default examples and recipes available in the Kaldi repository. Since the segmentation accuracy of the HTK-based system was significantly higher than that of the Kaldi-based system, the default Kaldi setting was modified with respect to pause model topology, the way of generating phonetic questions for clustering, and the number of Gaussian mixtures used during modeling. The modified Kaldi-based system achieved results comparable to those obtained by HTK: slightly worse for small segmentation errors but better for gross segmentation errors. We also confirmed that, for both toolkits, the standard three-state left-to-right model topology was significantly outperformed by a modified five-state left-to-right topology, especially with respect to small segmentation errors.
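
The abstract distinguishes small from gross segmentation errors. As an illustration only, the following Python sketch shows one common way to score predicted phone boundaries against a manual reference: the mean absolute boundary deviation, the share of boundaries within a small tolerance, and the share exceeding a gross-error threshold. The function names and the 10 ms and 20 ms thresholds are assumptions chosen for this example; the abstract does not state the thresholds or the evaluation procedure the authors used.

```python
from typing import Dict, List, Sequence


def boundary_errors(reference_ms: Sequence[float],
                    predicted_ms: Sequence[float]) -> List[float]:
    """Absolute deviation (ms) between matched reference and predicted boundaries."""
    if len(reference_ms) != len(predicted_ms):
        raise ValueError("boundary sequences must have the same length")
    return [abs(p - r) for r, p in zip(reference_ms, predicted_ms)]


def segmentation_report(reference_ms: Sequence[float],
                        predicted_ms: Sequence[float],
                        small_tol_ms: float = 10.0,   # assumed tolerance
                        gross_tol_ms: float = 20.0    # assumed threshold
                        ) -> Dict[str, float]:
    """Summarize boundary accuracy for one aligned utterance or corpus."""
    errors = boundary_errors(reference_ms, predicted_ms)
    if not errors:
        raise ValueError("no boundaries to evaluate")
    n = len(errors)
    return {
        # Mean absolute boundary deviation in milliseconds.
        "mean_abs_error_ms": sum(errors) / n,
        # Fraction of boundaries placed within the small tolerance.
        "within_small_tol": sum(e <= small_tol_ms for e in errors) / n,
        # Fraction of boundaries that miss by more than the gross threshold.
        "gross_error_rate": sum(e > gross_tol_ms for e in errors) / n,
    }


# Hypothetical boundary times in milliseconds, for demonstration only.
reference = [100, 250, 400, 620]
predicted = [105, 245, 430, 620]
report = segmentation_report(reference, predicted)
print(report)
```

Under these assumed thresholds, a system can score well on one measure and poorly on the other, which is why the abstract reports small and gross errors separately.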
