
Automatic intelligibility classification of sentence-level pathological speech


Abstract

Pathological speech usually refers to speech distortion resulting from atypicalities in voice and/or the articulatory mechanisms owing to disease, illness, or other physical or biological insult to the production system. Although automatic evaluation of speech intelligibility and quality could assist experts in diagnosis and treatment design in these scenarios, the many sources and types of variability often make it a very challenging computational processing problem. In this work we propose novel sentence-level features to capture abnormal variation in the prosodic, voice quality, and pronunciation aspects of pathological speech. In addition, we propose a post-classification posterior smoothing scheme that refines the posterior of a test sample based on the posteriors of other test samples. Finally, we perform feature-level fusion and subsystem decision fusion to arrive at a final intelligibility decision. Performance is tested on two pathological speech datasets, the NKI CCRT Speech Corpus (advanced head and neck cancer) and the TORGO database (cerebral palsy or amyotrophic lateral sclerosis), by evaluating classification accuracy with no subject's data shared between the training and test partitions. Results show that the feature sets of the voice quality, prosodic, and pronunciation subsystems each offer significant discriminating power for binary intelligibility classification. We observe that the proposed posterior smoothing in the acoustic space can further reduce classification errors. The smoothed posterior score fusion of subsystems shows the best classification performance (73.5% unweighted and 72.8% weighted average recall over the binary classes).
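The abstract does not spell out the smoothing rule, the fusion rule, or the feature definitions, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes (hypothetically) that smoothing blends each test sample's posterior with the mean posterior of its k nearest test-set neighbours in feature space, and that subsystem fusion is a plain average of posteriors. The function names and the parameters `k` and `alpha` are illustrative. The last function computes unweighted average recall (mean of per-class recalls) and weighted average recall (per-class recalls weighted by class frequency, which equals overall accuracy).

```python
import math


def smooth_posteriors(features, posteriors, k=2, alpha=0.5):
    """Blend each posterior with the mean posterior of its k nearest
    test-set neighbours (assumed scheme; the paper's rule may differ)."""
    smoothed = []
    for i, (x, p) in enumerate(zip(features, posteriors)):
        # Euclidean distance from sample i to every other test sample.
        dists = sorted(
            (math.dist(x, y), j) for j, y in enumerate(features) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        if not neighbours:
            # A lone test sample has nothing to be smoothed against.
            smoothed.append(p)
            continue
        mean_nb = sum(posteriors[j] for j in neighbours) / len(neighbours)
        smoothed.append(alpha * p + (1 - alpha) * mean_nb)
    return smoothed


def fuse_scores(subsystem_posteriors):
    """Average posteriors across subsystems, sample by sample
    (assumed fusion rule)."""
    return [sum(ps) / len(ps) for ps in zip(*subsystem_posteriors)]


def average_recalls(y_true, y_pred):
    """Return (unweighted, weighted) average recall over the classes
    that occur in y_true."""
    classes = sorted(set(y_true))
    recalls, weights = [], []
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
        weights.append(len(idx) / len(y_true))
    uar = sum(recalls) / len(recalls)
    war = sum(r * w for r, w in zip(recalls, weights))
    return uar, war


if __name__ == "__main__":
    # Toy one-dimensional features and posteriors for the positive class.
    feats = [(0.0,), (0.1,), (1.0,), (1.1,)]
    post = [0.2, 0.6, 0.9, 0.7]
    print(smooth_posteriors(feats, post, k=1))
    print(average_recalls([0, 0, 0, 1], [0, 0, 1, 1]))
```

The two recall figures diverge when classes are imbalanced, which is presumably why the abstract reports both: the unweighted figure treats each class equally regardless of how many sentences it contains.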
