Automatic Phonetic Segmentation Using the Kaldi Toolkit

Abstract

In this paper we explore the possibilities of hidden Markov model-based automatic phonetic segmentation with the Kaldi toolkit. We compare the Kaldi toolkit and the Hidden Markov Model Toolkit (HTK) in terms of segmentation accuracy. The well-tuned HTK-based phonetic segmentation framework was taken as the baseline and compared to a newly proposed segmentation framework built from the default examples and recipes available in the Kaldi repository. Since the segmentation accuracy of the HTK-based system was significantly higher than that of the Kaldi-based system, the default Kaldi setting was modified with respect to pause model topology, the way of generating phonetic questions for clustering, and the number of Gaussian mixtures used during modeling. The modified Kaldi-based system achieved results comparable to those obtained by HTK: slightly worse for small segmentation errors but better for gross segmentation errors. We also confirmed that, for both toolkits, the standard three-state left-to-right model topology was significantly outperformed by a modified five-state left-to-right topology, especially with respect to small segmentation errors.
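
The abstract distinguishes small from gross segmentation errors without stating how they are measured. As a rough illustration only, the sketch below scores automatic phone boundaries against reference boundaries by absolute deviation. The function name, the 10 ms and 20 ms thresholds, and the returned field names are assumptions made for this example; they are not taken from the paper.

```python
from typing import Dict, Sequence


def boundary_error_stats(
    reference_ms: Sequence[float],
    predicted_ms: Sequence[float],
    small_tol_ms: float = 10.0,  # assumed tolerance for "small" errors
    gross_tol_ms: float = 20.0,  # assumed threshold for "gross" errors
) -> Dict[str, float]:
    """Compare predicted phone boundaries with reference ones (milliseconds)."""
    if len(reference_ms) != len(predicted_ms):
        raise ValueError("boundary lists must have the same length")
    if not reference_ms:
        raise ValueError("no boundaries to score")

    # Absolute deviation of each automatic boundary from its reference.
    deviations = [abs(p - r) for r, p in zip(reference_ms, predicted_ms)]
    n = len(deviations)
    return {
        "mean_abs_error_ms": sum(deviations) / n,
        # Share of boundaries placed within the small tolerance.
        "within_small_tol": sum(d <= small_tol_ms for d in deviations) / n,
        # Share of boundaries that miss by more than the gross threshold.
        "gross_error_rate": sum(d > gross_tol_ms for d in deviations) / n,
    }


if __name__ == "__main__":
    # Hypothetical boundaries for a single utterance.
    print(boundary_error_stats([100, 250, 400, 620], [105, 250, 430, 605]))
```

Under these assumptions, one system can place more boundaries within the small tolerance while still producing more gross errors than another, which is the kind of trade-off the abstract reports between HTK and the modified Kaldi setup.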

Bibliographic details

Source: International Conference on Text, Speech and Dialogue, 2017, pp. 138-146 (9 pages)
Authors: Jindřich Matoušek; Michal Klíma
Original format: PDF
Keywords: Automatic phonetic segmentation; HTK; Kaldi; Hidden Markov models

Similar literature

1. Automatic speech recognition system with pitch dependent features for Punjabi language on KALDI toolkit [J]. Guglani Jyoti, Mishra A. N. Applied Acoustics. 2020, issue Octa
2. MINIMUM SEGMENTATION ERROR BASED DISCRIMINATIVE TRAINING OF HMM FOR AUTOMATIC PHONETIC SEGMENTATION [J]. Yi-Jian Wu, Hisashi Kawai, Jinfu Ni. 電子情報通信学会技術研究報告. 音声. Speech. 2003, no. 263
3. Automatic Phonetic Segmentation Using the Kaldi Toolkit [C]. Jindřich Matoušek, Michal Klíma. International Conference on Text, Speech and Dialogue. 2017
4. Experiments on automatic phonetic segmentation and transcription of speech [D]. Lennig, Matthew. 1984
5. Using forced alignment for automatic acoustic-phonetic segmentation of aphasic discourse [O]. Alice Lee, Anthony Pak Hin Kong, Sam-Po Law
6. Automatic Syllable Segmentation Using Broad Phonetic Class Information [O]. Ludusan Bogdan, Dupoux Emmanuel. 2016
