Conference: International Conference on Signal Processing and Communications

Overlapped/Non-Overlapped Speech Transition Point Detection Using Bag-of-Audio-Words

Abstract

Overlapped speech is an audio signal in which two or more speakers talk simultaneously, and it is one of the main sources of error in speaker diarization systems. This work presents an initial study on identifying the transition points from overlapped to non-overlapped speech and vice versa. The characteristics of overlapped and non-overlapped speech are examined in terms of the vocal tract system, the excitation source, and the modulation spectrum. The Hilbert envelope (HE) of the Linear Prediction (LP) residual signal represents the excitation source characteristics of the speech signal. The Sum of Ten Largest Peaks (STLP) of the spectrum and the Mel-Frequency Cepstral Coefficients (MFCCs) represent the vocal tract shape information. The modulation spectrum energy (ModSE) captures the slowly varying temporal envelope of speech. A Bag-of-Audio-Words (BoAW) based approach is used to detect the transition points. News debates are one of the main sources of naturally occurring overlapped speech, so the present work is evaluated on Indian news debates. A high Identification Rate (IR) and a low Spurious Rate (SR) are observed when all the features are used together as a 16-dimensional feature vector (13 MFCCs, HE of the LP residual, STLP, and ModSE) for the detection task.
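The abstract does not give implementation details, so the following is only a minimal Python sketch of two of the ideas it names: an STLP-style feature and a BoAW histogram over frame-level feature vectors. The function names `stlp` and `boaw_histogram`, the Hann window, the strict local-maximum peak rule, and the randomly generated codebook are all assumptions made for illustration. In practice a codebook would typically be learned by clustering training features, and the abstract does not say how the paper builds its codebook or turns histograms into transition-point decisions.

```python
import numpy as np


def stlp(frame, n_peaks=10):
    """Sum of the n_peaks largest local maxima of the magnitude spectrum.

    Assumption: peaks are taken from the windowed FFT magnitude spectrum;
    the paper may define the spectrum or the peaks differently.
    """
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    # A bin counts as a peak if it is strictly larger than both neighbours.
    interior = spectrum[1:-1]
    is_peak = (interior > spectrum[:-2]) & (interior > spectrum[2:])
    peaks = np.sort(interior[is_peak])[::-1]
    return float(peaks[:n_peaks].sum())


def boaw_histogram(features, codebook):
    """Quantise frame-level features against a codebook and count audio words.

    features: (n_frames, dim) array; codebook: (n_words, dim) array.
    Returns a normalised histogram of length n_words.
    """
    # Squared Euclidean distance from every frame to every codeword.
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)
    counts = np.bincount(words, minlength=len(codebook)).astype(float)
    return counts / counts.sum()


# Hypothetical data: 8 audio words and 50 frames of 16-dimensional features
# (13 MFCCs + HE of LP residual + STLP + ModSE in the abstract's setup).
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 16))
segment = rng.normal(size=(50, 16))
hist = boaw_histogram(segment, codebook)

# STLP of one 25 ms frame of a 440 Hz tone sampled at 16 kHz.
t = np.arange(400) / 16000.0
tone_stlp = stlp(np.sin(2 * np.pi * 440.0 * t))
```

One plausible use, again an assumption rather than the paper's stated procedure, is to compute such histograms over sliding windows and pass them to a classifier that flags windows containing a change between overlapped and non-overlapped speech.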
