Prof-Life-Log: Audio Environment Detection for Naturalistic Audio Streams

Abstract

In this study, we develop a new system for real-world audio environment matching. Environment detection within unknown audio streams requires a system that operates in an unsupervised manner, since it will encounter unknown environments without prior information. In addition, the overall solution should be computationally efficient for large audio collections. In the proposed approach, a Gaussian mixture model (GMM) is trained on large amounts of unlabeled audio data and used as a background acoustic model. Subsequently, an acoustic signature vector (ASV) is computed for each environment. Here, the ASV is designed to capture the unique acoustic characteristics of an environment. Using the ASVs, we demonstrate that it is possible to compute an effective similarity measure between two acoustic environments. We demonstrate the performance of the proposed system on real-world audio data and compare it to a traditional GMM-UBM (universal background model) system. Experiments show that our system achieves an equal error rate (EER) that is 35% better than that of a baseline GMM-UBM system.
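
The abstract does not say how the acoustic signature vector is computed or which similarity measure is used, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the signature is the average posterior probability of each background-GMM component over a recording's frames, and that similarity is the cosine between two signatures. The GMM parameters, the feature frames, and the names `BACKGROUND_GMM`, `signature`, and `cosine_similarity` are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical, already-trained background GMM with diagonal covariances.
# Each component is (weight, mean vector, variance vector); values are illustrative.
BACKGROUND_GMM = [
    (0.5, [0.0, 0.0], [1.0, 1.0]),
    (0.3, [4.0, 4.0], [1.0, 1.0]),
    (0.2, [-4.0, 4.0], [1.0, 1.0]),
]


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, gmm):
    """Posterior probability of each mixture component given one frame x."""
    logs = [math.log(w) + log_gaussian(x, m, v) for w, m, v in gmm]
    peak = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def signature(frames, gmm):
    """Assumed signature: component posteriors averaged over all frames."""
    acc = [0.0] * len(gmm)
    for x in frames:
        for k, p in enumerate(posteriors(x, gmm)):
            acc[k] += p
    return [a / len(frames) for a in acc]


def cosine_similarity(a, b):
    """Cosine of the angle between two signature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Toy 2-D feature frames for three recordings (made-up data).
office_a = [[0.1, -0.2], [0.3, 0.1], [3.9, 4.1]]
office_b = [[-0.1, 0.2], [0.2, -0.3], [4.2, 3.8]]
street = [[-4.1, 3.9], [-3.8, 4.2], [-4.0, 4.0]]

sig_a = signature(office_a, BACKGROUND_GMM)
sig_b = signature(office_b, BACKGROUND_GMM)
sig_s = signature(street, BACKGROUND_GMM)

same_env = cosine_similarity(sig_a, sig_b)  # two recordings of a similar environment
diff_env = cosine_similarity(sig_a, sig_s)  # recordings of different environments
```

In this toy setup the two office recordings occupy the same mixture components and score a higher similarity than the office and street pair. A real system would train the background model on unlabeled audio features rather than fixing its parameters by hand.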

Bibliographic record

Source: INTERSPEECH 2012, 2012, 4 pages
Authors: Ali Ziaei; Abhijeet Sangwan; John H.L. Hansen
Original format: PDF
Chinese Library Classification: 73.4136083
Keywords: Audio Environment Detection; Acoustic Signature; Real world audio data; Prof-Life-Log

Similar literature

1. Leveraging Frequency-Dependent Kernel and DIP-Based Clustering for Robust Speech Activity Detection in Naturalistic Audio Streams [J]. Harishchandra Dubey, Abhijeet Sangwan, John H. L. Hansen. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018, No. 11.
2. Audio-Facial Laughter Detection in Naturalistic Dyadic Conversations [J]. Bekir Berker Turker, Yucel Yemez, T. Metin Sezgin. IEEE Transactions on Affective Computing, 2017, No. 4.
3. Monitoring of audio visual quality by key indicators: Detection of selected audio and audiovisual artefacts [J]. Blanco Fernandez Ignacio, Leszczuk Mikolaj. Multimedia Tools and Applications, 2018, No. 2.
4. Prof-Life-Log: Audio Environment Detection for Naturalistic Audio Streams [C]. Ali Ziaei, Abhijeet Sangwan, John H.L. Hansen. Annual Conference of the International Speech Communication Association, 2012.
5. Prof-Life-Log: Speech and speaker advancements for massive naturalistic audio streams [D]. Ali Ziaei, 2015.
6. The phase of cortical oscillations determines the perceptual fate of visual cues in naturalistic audiovisual speech [O]. Raphaël Thézé, Anne-Lise Giraud, Pierre Mégevand, 2020.
7. Toeplitz Inverse Covariance Based Robust Speaker Clustering for Naturalistic Audio Streams [O]. Harishchandra Dubey, Abhijeet Sangwan, John H.L. Hansen, 2019.
