Journal: Engineering journal

Acoustic-Phonetic Approaches for Improving Segment-Based Speech Recognition for Large Vocabulary Continuous Speech



Abstract

Segment-based speech recognition has been shown to be a competitive alternative to state-of-the-art HMM-based techniques. Its accuracy relies heavily on the quality of the segment graph in which the recognizer searches for the most likely recognition hypotheses. To increase the inclusion rate of actual segments in the graph, it is important to recover segments that the segmentation algorithm may have missed. One aspect of this research focuses on determining the segments that are missing because segment boundaries went undetected. Acoustic discontinuities, together with manner-distinctive features, are used to recover the missing segments. Another improvement to our segment-based framework addresses the limited amount of training speech data, which prevents the use of more complex covariance matrices for the acoustic models. Feature dimensionality reduction in the form of Principal Component Analysis (PCA) is applied to enable the training of full covariance matrices, and it results in improved segment-based phoneme recognition. Furthermore, because the segment-based approach allows the integration of phonetic knowledge, we incorporate into the scoring of the segment graphs the probability that each segment is a sound unit with a specific common manner of articulation. Our experiments show that, with the proposed improvements, our segment-based framework increases phoneme recognition accuracy by approximately 25% relative to the accuracy of the baseline segment-based speech recognizer.
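The link between PCA and full covariance matrices can be illustrated with a minimal sketch. It is not the authors' implementation: the feature dimension (60), the reduced dimension (20), the number of training segments (200), the random data, and all function names are assumptions chosen for illustration. The point is only that a full covariance matrix in d dimensions has d(d+1)/2 free parameters, so projecting segment features onto fewer principal axes makes such a matrix easier to estimate from limited data.

```python
import numpy as np

def full_cov_param_count(dim):
    # A symmetric dim x dim covariance matrix has dim*(dim+1)/2 free entries.
    return dim * (dim + 1) // 2

def fit_pca(features, n_components):
    # features: (n_samples, dim). Returns the mean and the top principal axes.
    mean = features.mean(axis=0)
    cov = np.cov(features - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]

def project(features, mean, axes):
    # Center the features and express them in the principal-axis basis.
    return (features - mean) @ axes

def fit_full_gaussian(features):
    # Maximum-likelihood style estimate of a single full-covariance Gaussian.
    return features.mean(axis=0), np.cov(features, rowvar=False)

def log_likelihood(x, mean, cov):
    # Log density of x under a multivariate Gaussian with full covariance.
    dim = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (dim * np.log(2 * np.pi) + logdet + maha)

rng = np.random.default_rng(0)
# Hypothetical data: 200 segment feature vectors of dimension 60.
segments = rng.normal(size=(200, 60))
pca_mean, pca_axes = fit_pca(segments, n_components=20)
reduced = project(segments, pca_mean, pca_axes)
model_mean, model_cov = fit_full_gaussian(reduced)
score = log_likelihood(reduced[0], model_mean, model_cov)

# Parameters needed for one full covariance matrix before and after PCA.
print(full_cov_param_count(60), full_cov_param_count(20))
```

In this toy setting one full covariance matrix needs 1830 parameters in the original space and 210 after reduction, which is the kind of saving that makes full-covariance training feasible with a small corpus.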
