Acoustic Event Detection in Speech Overlapping Scenarios Based on High-Resolution Spectral Input and Deep Learning

Espi Miquel; Fujimoto Masakiyo; Nakatani Tomohiro

Abstract

We present a method for recognizing acoustic events in conversation scenarios, where speech usually overlaps with other acoustic events. While speech is usually considered the most informative acoustic event in a conversation scene, it does not always contain all the information. Non-speech events, such as a door knock, footsteps, or keyboard typing, can reveal aspects of the scene that speakers miss or avoid mentioning. Moreover, detecting these events robustly could further support speech enhancement and recognition systems by providing useful cues about the surrounding scene and noise. In acoustic event detection, state-of-the-art techniques are typically based on derived features (e.g., MFCCs or Mel filter banks), which parameterize the spectrogram of speech successfully but reduce resolution and detail when the targets are other kinds of events. In this paper, we propose a method that learns features in an unsupervised manner from high-resolution spectrogram patches (a patch being a certain number of consecutive frame features stacked together) and integrates them within a deep neural network framework to detect and classify acoustic events. Superiority over both previous work in the field and similar approaches based on derived features has been assessed with statistical measures and an evaluation on the CHIL2007 corpus, an annotated database of seminar recordings.
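
The abstract defines a patch as a number of consecutive frame features stacked together. A minimal sketch of that stacking step is shown below. It is an illustration only, not the authors' implementation: the function name `stack_patches`, the `context` parameter (frames taken on each side), the edge handling by repeating the first and last frames, and the toy input are all assumptions that the abstract does not specify.

```python
from typing import List, Sequence


def stack_patches(frames: Sequence[Sequence[float]], context: int) -> List[List[float]]:
    """Return one flattened patch per frame.

    Each patch is the frame itself plus `context` neighbouring frames on
    each side, concatenated in time order. Edge frames are repeated as
    padding (an assumption made for this sketch).
    """
    if context < 0:
        raise ValueError("context must be non-negative")
    if not frames:
        return []
    last = len(frames) - 1
    patches: List[List[float]] = []
    for t in range(len(frames)):
        patch: List[float] = []
        for offset in range(-context, context + 1):
            # Clamp the index so the first and last frames are repeated at the edges.
            idx = min(max(t + offset, 0), last)
            patch.extend(frames[idx])
        patches.append(patch)
    return patches


# Toy "spectrogram" (hypothetical values): 4 frames with 2 frequency bins each.
spectrogram = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
patches = stack_patches(spectrogram, context=1)
```

With `context=1`, every patch holds three frames, so each of the four patches here has six values. In a real system the input would be high-resolution spectral frames rather than this toy list, and the patch length would be a tuning choice.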

Bibliographic record

Source
IEICE Transactions on Information and Systems | 2015, Issue 10 | pp. 1799-1807 | 9 pages
Authors
Espi Miquel; Fujimoto Masakiyo; Nakatani Tomohiro
Author affiliation

NTT Corp, NTT Commun Sci Lab, Kyoto 6190237, Japan;

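Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Chinese Library Classification: Communications
Keywords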
acoustic event detection/recognition; high-resolution feature; spectrogram patch; communication scene understanding;
