Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

What’s all the Fuss about Free Universal Sound Separation Data?

Abstract

We introduce the Free Universal Sound Separation (FUSS) dataset, a new corpus for experiments in separating mixtures of an unknown number of sounds from an open domain of sound types. The dataset consists of 23 hours of single-source audio data drawn from 357 classes, which are used to create mixtures of one to four sources. To simulate reverberation, an acoustic room simulator is used to generate impulse responses of box-shaped rooms with frequency-dependent reflective walls. Additional open-source data augmentation tools are also provided to produce new mixtures with different combinations of sources and room simulations. Finally, we introduce an open-source baseline separation model, based on an improved time-domain convolutional network (TDCN++), that can separate a variable number of sources in a mixture. This model achieves 9.8 dB of scale-invariant signal-to-noise ratio improvement (SI-SNRi) on mixtures with two to four sources, while reconstructing single-source inputs with 35.8 dB absolute SI-SNR. We hope this dataset will lower the barrier to new research and allow for fast iteration and application of novel techniques from other machine learning domains to the sound separation challenge.
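The abstract reports results as scale-invariant signal-to-noise ratio (SI-SNR) and its improvement over the unprocessed mixture (SI-SNRi). The sketch below shows one common way to compute both with NumPy. It is an illustration, not the authors' evaluation code: the function names, the `eps` stabilizer, and the choice not to subtract the signal mean are assumptions made here.

```python
import numpy as np


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB of `estimate` with respect to `reference`.

    Both inputs are 1-D arrays of equal length. `eps` is an assumed
    stabilizer that avoids division by zero and log of zero.
    """
    estimate = np.asarray(estimate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    # Least-squares gain that best matches the reference to the estimate.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference   # rescaled reference
    noise = estimate - target    # everything the rescaled reference cannot explain
    return 10.0 * np.log10((np.dot(target, target) + eps) / (np.dot(noise, noise) + eps))


def si_snr_improvement(estimate, reference, mixture):
    """SI-SNRi: gain in SI-SNR from using `estimate` instead of the raw `mixture`."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    source = np.sin(2 * np.pi * 440.0 * t)       # synthetic reference source
    interference = rng.standard_normal(t.shape)  # synthetic interfering sound
    mixture = source + interference
    estimate = source + 0.1 * interference       # stand-in for a separator output
    print("SI-SNR  (dB):", si_snr(estimate, source))
    print("SI-SNRi (dB):", si_snr_improvement(estimate, source, mixture))
```

Because the reference is rescaled before the error is measured, multiplying the estimate by any nonzero constant leaves the score essentially unchanged, which is what makes the metric scale-invariant.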