International Joint Conference on Artificial Intelligence

Listen, Think and Listen Again: Capturing Top-down Auditory Attention for Speaker-independent Speech Separation



Abstract

Recent deep learning methods have made significant progress in multi-talker mixed speech separation. However, most existing models separate all the speech channels indiscriminately rather than selectively attending to the target one. As a result, those frameworks may fail to offer a satisfactory solution in complex auditory scenes, where the number of input sounds is usually uncertain and even dynamic. In this paper, we present a novel neural-network-based structure motivated by the top-down attention behavior of humans facing a complicated acoustical scene. Different from previous works, our method constructs an inference-attention structure to predict the candidates of interest and extract each of their speech channels. Our work removes the requirement that the number of channels be given in advance and avoids the high computational complexity of the label permutation problem. We evaluated our model on the WSJ0 mixed-speech tasks. In all the experiments, our model is highly competitive, matching and even outperforming the baselines.
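To make the top-down attention idea concrete, the sketch below shows one common way such selective extraction can be realized: a query embedding representing the attended speaker is compared against time-frequency embeddings of the mixture to produce a soft separation mask. This is a minimal, hypothetical illustration with random inputs, not the paper's actual inference-attention architecture; the shapes, the dot-product scoring, and the sigmoid masking are all assumptions for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions: T time frames, F frequency bins, D embedding size.
T, F, D = 100, 64, 20

# Embedding of the mixture spectrogram; in a real model this would
# come from a learned encoder, not random noise.
mixture_emb = rng.standard_normal((T, F, D))

# "Top-down" query vector for the attended (target) speaker; the
# paper's inference step would predict such a candidate.
speaker_query = rng.standard_normal(D)

def attention_mask(emb, query):
    """Soft time-frequency mask from dot-product attention scores."""
    scores = emb @ query                  # (T, F) similarity to the query
    return 1.0 / (1.0 + np.exp(-scores))  # sigmoid squashes scores to (0, 1)

mask = attention_mask(mixture_emb, speaker_query)
print(mask.shape)  # (100, 64): one mask value per time-frequency bin
```

Applying such a mask element-wise to the mixture spectrogram would then extract the attended channel; repeating the predict-and-extract loop for each inferred candidate is what lets this style of model avoid fixing the number of speakers up front.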
