Towards End-to-End Acoustic Localization Using Deep Learning: From Audio Signals to Source Position Coordinates

Abstract

This paper presents a novel approach for indoor acoustic source localization using microphone arrays, based on a Convolutional Neural Network (CNN). In the proposed solution, the CNN is designed to directly estimate the three-dimensional position of a single acoustic source, using the raw audio signal as the input and avoiding hand-crafted audio features. Given the limited amount of available localization data, we propose a two-step training strategy. We first train our network using semi-synthetic data generated from close-talk speech recordings, simulating the time delays and distortion suffered by the signals as they propagate from the source to the array of microphones. We then fine-tune this network using a small amount of real data. Our experimental results, evaluated on a publicly available dataset recorded in a real room, show that this approach produces networks that significantly improve on existing localization methods based on SRP-PHAT strategies, and also on those presented in very recent proposals based on Convolutional Recurrent Neural Networks (CRNN). In addition, our experiments show that the performance of our CNN method has no relevant dependency on the speaker's gender or on the size of the signal window being used.
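
The semi-synthetic data step described in the abstract can be illustrated with a minimal sketch. This is not the authors' simulation, whose details are not given in this record: it assumes free-field propagation only (a pure delay plus 1/distance attenuation, with no reverberation or noise model), an assumed speed of sound of 343 m/s, and a hypothetical helper name `simulate_array_signals`.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def simulate_array_signals(clean, fs, source_pos, mic_positions):
    """Delay and attenuate a close-talk signal for each microphone.

    Hypothetical helper: free-field propagation only, so the "distortion"
    mentioned in the abstract is reduced here to amplitude decay.
    """
    clean = np.asarray(clean, dtype=float)
    source_pos = np.asarray(source_pos, dtype=float)
    mic_positions = np.asarray(mic_positions, dtype=float)

    # Source-to-microphone distances in metres, one per microphone.
    distances = np.linalg.norm(mic_positions - source_pos, axis=1)
    delays = distances / SPEED_OF_SOUND  # propagation delays in seconds

    # Zero-pad so the delayed signals do not wrap around circularly.
    max_shift = int(np.ceil(delays.max() * fs))
    n_fft = len(clean) + max_shift

    # A time delay is a linear phase shift in the frequency domain,
    # which also handles delays that are not a whole number of samples.
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    spectrum = np.fft.rfft(clean, n=n_fft)
    phase = np.exp(-2j * np.pi * freqs[None, :] * delays[:, None])
    delayed = np.fft.irfft(spectrum[None, :] * phase, n=n_fft, axis=1)

    # Spherical spreading: amplitude falls off as 1/distance.
    attenuated = delayed / np.maximum(distances, 1e-3)[:, None]
    return attenuated, delays
```

Each row of the returned array is the signal one microphone would receive, and the source position used to generate it serves as the training label. A more realistic generator would also add room reverberation and sensor noise.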

Bibliographic record

Journal: Sensors (Basel, Switzerland)
Authors: Juan Manuel Vera-Diaz; Daniel Pizarro; Javier Macias-Guarasa
Year (volume), issue: 2018 (18), 10
Page: 3418
Total pages: 22
Original format: PDF
Keywords: acoustic source localization; microphone arrays; deep learning; convolutional neural networks

Similar documents

1. The Influence of the Coordinate Errors of Setting Sensors of a Piezoelectric Antenna on the Accuracy of Localizing Sources of Acoustic Emission Signals [J]. A. E. Kareev, L. N. Stepanova, E. S. Tenitilov. Russian Journal of Nondestructive Testing, 2010, No. 11.

2. A generalizable deep learning framework for localizing and characterizing acoustic emission sources in riveted metallic panels [J]. Ebrahimkhanlou Arvin, Dubuc Brennan, Salamone Salvatore. Mechanical Systems and Signal Processing, 2019, No. Sepa1.

3. Deep Learning for Audio Signal Source Positioning Using Microphone Array [C]. Resul Adanur, Yıldıray Yeşilyurt, Cem Şişman. International Conference on Digital Information Processing and Communications, 2019.

4. On the Spatial Structure of the Acoustic Signal Field Near the Deep Ocean Bottom Due to a Near-Surface CW Source [D]. Grant, David Edward. 1986.

5. End-to-End Deep Learning Fusion of Fingerprint and Electrocardiogram Signals for Presentation Attack Detection [O]. Rami M. Jomaa, Hassan Mathkour, Yakoub Bazi. 2020.

6. Machine Learning and End-to-End Deep Learning for Monitoring Driver Distractions From Physiological and Visual Signals [O]. Martin Gjoreski, Matjaž Gams, Mitja Lustrek. 2020.
