International Joint Conference on Artificial Intelligence

Listen, Think and Listen Again: Capturing Top-down Auditory Attention for Speaker-independent Speech Separation



Abstract

Recent deep learning methods have made significant progress in multi-talker mixed speech separation. However, most existing models separate all the speech channels indiscriminately rather than selectively attending to the target one. As a result, those frameworks may fail to offer a satisfactory solution in complex auditory scenes, where the number of input sounds is usually uncertain and even dynamic. In this paper, we present a novel neural-network-based structure motivated by the top-down attention behavior of humans facing a complicated acoustical scene. Different from previous works, our method constructs an inference-attention structure to predict the candidates of interest and extract each of their speech channels. Our work removes the requirement that the number of channels be given in advance and avoids the high computational complexity of the label permutation problem. We evaluated our model on the WSJ0 mixed-speech tasks. In all the experiments, our model is highly competitive, matching and even outperforming the baselines.
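To make the top-down attention idea concrete, the sketch below shows one common way such selective extraction can be realized: a query embedding representing the attended speaker is compared against time-frequency embeddings of the mixture to produce a soft separation mask. This is a minimal, hypothetical illustration with random inputs, not the paper's actual inference-attention architecture; the shapes, the dot-product scoring, and the sigmoid masking are all assumptions for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions: T time frames, F frequency bins, D embedding size.
T, F, D = 100, 64, 20

# Embedding of the mixture spectrogram; in a real model this would
# come from a learned encoder, not random noise.
mixture_emb = rng.standard_normal((T, F, D))

# "Top-down" query vector for the attended (target) speaker; the
# paper's inference step would predict such a candidate.
speaker_query = rng.standard_normal(D)

def attention_mask(emb, query):
    """Soft time-frequency mask from dot-product attention scores."""
    scores = emb @ query                  # (T, F) similarity to the query
    return 1.0 / (1.0 + np.exp(-scores))  # sigmoid squashes scores to (0, 1)

mask = attention_mask(mixture_emb, speaker_query)
print(mask.shape)  # (100, 64): one mask value per time-frequency bin
```

Applying such a mask element-wise to the mixture spectrogram would then extract the attended channel; repeating the predict-and-extract loop for each inferred candidate is what lets this style of model avoid fixing the number of speakers up front.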
