IEEE Transactions on Audio, Speech, and Language Processing

Source/Filter Model for Unsupervised Main Melody Extraction From Polyphonic Audio Signals



Abstract

Extracting the main melody from a polyphonic music recording seems natural even to untrained human listeners. To a certain extent it is related to the concept of source separation, via the human ability to focus on a specific source in order to extract relevant information. In this paper, we propose a new approach for the estimation and extraction of the main melody (and in particular the leading vocal part) from polyphonic audio signals. To that end, we propose a new signal model in which the leading vocal part is explicitly represented by a specific source/filter model. The proposed representation is investigated in the framework of two statistical models: a Gaussian Scaled Mixture Model (GSMM) and an extended Instantaneous Mixture Model (IMM). For both models, the parameters are estimated within a maximum-likelihood framework adapted from single-channel source separation techniques. The desired sequence of fundamental frequencies is then inferred from the estimated parameters. The results obtained in a recent evaluation campaign (MIREX08) show that the proposed approaches are very promising and achieve state-of-the-art performance on all test sets.
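To make the source/filter idea concrete, below is a minimal NumPy sketch of one way such a decomposition can be fit to a power spectrogram: the lead voice is modeled as an elementwise product of a harmonic "source" part and a smooth "filter" part, plus an unconstrained accompaniment term. This is an illustration under stated assumptions, not the paper's method: it minimizes a generalized Kullback-Leibler cost with heuristic multiplicative updates rather than the paper's maximum-likelihood derivation, and every name and parameter (harmonic_dictionary, filter_dictionary, source_filter_nmf, the dictionary shapes) is hypothetical.

```python
# Hedged sketch of a source/filter spectrogram decomposition.
# Assumptions (not from the paper): KL cost, multiplicative updates,
# Gaussian harmonic combs as the source dictionary, Hann bumps as filter atoms.
import numpy as np

def harmonic_dictionary(freqs, f0_candidates, n_harmonics=20, width=2.0):
    """Fixed source dictionary W_F: one harmonic comb per candidate F0 (Hz)."""
    W = np.zeros((len(freqs), len(f0_candidates)))
    for p, f0 in enumerate(f0_candidates):
        for h in range(1, n_harmonics + 1):
            W[:, p] += np.exp(-0.5 * ((freqs - h * f0) / width) ** 2)
    return W / (W.sum(axis=0, keepdims=True) + 1e-12)

def filter_dictionary(n_freqs, n_atoms=10):
    """Fixed smooth filter atoms W_K: overlapping Hann bumps in frequency."""
    W = np.zeros((n_freqs, n_atoms))
    centers = np.linspace(0, n_freqs - 1, n_atoms)
    width = n_freqs / n_atoms
    grid = np.arange(n_freqs)
    for k, c in enumerate(centers):
        W[:, k] = np.maximum(0.0, np.cos(np.pi * (grid - c) / (2 * width))) ** 2
    return W

def source_filter_nmf(V, W_F, W_K, n_accomp=8, n_iter=100, eps=1e-12):
    """Fit V ~ (W_F H_F) * (W_K H_K) + W_M H_M with KL multiplicative updates.

    V is a nonnegative power spectrogram (n_freqs x n_frames); the first term
    is the source/filter lead voice, the second the accompaniment."""
    F, N = V.shape
    rng = np.random.default_rng(0)
    H_F = rng.random((W_F.shape[1], N)) + eps   # per-frame F0 activations
    H_K = rng.random((W_K.shape[1], N)) + eps   # per-frame filter activations
    W_M = rng.random((F, n_accomp)) + eps       # accompaniment spectra
    H_M = rng.random((n_accomp, N)) + eps       # accompaniment activations
    for _ in range(n_iter):
        S_F = W_F @ H_F                          # source (excitation) part
        S_K = W_K @ H_K                          # filter (envelope) part
        R = V / (S_F * S_K + W_M @ H_M + eps)    # KL ratio term V / V_hat
        H_F *= (W_F.T @ (R * S_K)) / (W_F.T @ S_K + eps)
        S_F = W_F @ H_F
        R = V / (S_F * S_K + W_M @ H_M + eps)
        H_K *= (W_K.T @ (R * S_F)) / (W_K.T @ S_F + eps)
        S_K = W_K @ H_K
        R = V / (S_F * S_K + W_M @ H_M + eps)
        H_M *= (W_M.T @ R) / (W_M.T @ np.ones_like(R) + eps)
        W_M *= (R @ H_M.T) / (np.ones_like(R) @ H_M.T + eps)
    return H_F, H_K, W_M, H_M
```

In this sketch, V would be the squared magnitude of an STFT, and a crude per-frame melody estimate is f0_candidates[np.argmax(H_F, axis=0)]; the paper instead infers a smooth fundamental-frequency trajectory from the estimated parameters rather than taking a frame-wise maximum.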
