Structure-based voiced/usable speech detection using state space embedding

Abstract

Speech production in humans is a complex, nonlinear process that can be modeled precisely only in terms of nonlinear dynamics. A nonlinear speech classification approach is proposed that classifies speech using features extracted with Takens' method of delays, a technique that reconstructs a signal as a trajectory in a multidimensional state space. Two types of speech detection are presented: voiced speech detection and usable speech detection (for speaker identification purposes). By comparing the structures of embedded voiced speech frames with embedded unvoiced speech frames, and of embedded usable speech frames with unusable speech, the approach yields a 12% probability of error for voiced speech detection in noisy environments and 78% correct usable speech detection. Applications of this speech detection technique include the enhancement of speaker identification and speech recognition systems.
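
The embedding step named in the abstract, Takens' method of delays, can be illustrated with a minimal sketch. The abstract does not state the embedding dimension, the delay, the frame length, or which structural features are compared, so `dim`, `tau`, the 8 kHz sampling rate, the 240-sample frame, and the function name `delay_embed` below are all illustrative assumptions, not the authors' settings.

```python
import math


def delay_embed(x, dim=3, tau=5):
    """Reconstruct a 1-D signal as a trajectory in dim-dimensional state space.

    Point n of the trajectory is (x[n], x[n + tau], ..., x[n + (dim - 1) * tau]).
    dim and tau are hypothetical defaults; the source does not specify them.
    """
    n_points = len(x) - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("frame is too short for this dim and tau")
    return [tuple(x[n + i * tau] for i in range(dim)) for n in range(n_points)]


# Assumed test input: a 30 ms frame of a 200 Hz sinusoid sampled at 8 kHz,
# standing in for a periodic (voiced-like) speech frame.
fs = 8000
frame = [math.sin(2 * math.pi * 200 * n / fs) for n in range(240)]

trajectory = delay_embed(frame, dim=3, tau=5)
print(len(trajectory), "points of dimension", len(trajectory[0]))
```

A periodic frame such as this one traces a closed loop in the embedded space, whereas a noise-like frame fills the space with no regular structure. That difference in structure is presumably what a structure-based detector exploits, though the specific features used are not given in the abstract.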
