Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Identifying Missing and Extra Notes in Piano Recordings Using Score-Informed Dictionary Learning


Abstract

The goal of automatic music transcription (AMT) is to obtain a high-level symbolic representation of the notes played in a given audio recording. Although AMT has been researched for several decades, current methods are still inadequate for many applications. To boost accuracy in a music tutoring scenario, we exploit the fact that the score to be played is known, so we only need to detect the differences between the score and the actual performance. In contrast to previous work that uses score information for postprocessing, we employ the score to construct a transcription method that is tailored to the given audio recording. By adapting a score-informed dictionary learning technique used for source separation, we learn, for each score pitch, a spectral pattern describing the energy distribution of the associated notes in the recording. In this paper, we identify several systematic weaknesses in our previous approach and introduce three extensions to improve its performance. First, we extend our dictionary of spectral templates to a dictionary of variable-length spectrotemporal patterns. Second, we integrate the score information using soft rather than hard constraints, to better account for the fact that deviations from the score do occur. Third, we introduce new regularizers to guide the learning process. Our experiments show that these extensions particularly improve the accuracy of identifying extra notes, while the accuracy for correct and missing notes remains at a similar level. The influence of each extension is demonstrated with further experiments.
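To make the idea of score-informed dictionary learning concrete, the following is a minimal sketch, not the authors' method. It assumes a plain non-negative matrix factorization (NMF) with Euclidean multiplicative updates, which the abstract does not name, and it illustrates only the hard-constraint variant: activations are initialized to zero wherever the score says a pitch is silent, and multiplicative updates keep them at zero. The function name `score_informed_nmf`, the mask layout, and the toy data are all hypothetical.

```python
import numpy as np

def score_informed_nmf(V, score_mask, n_iter=200, eps=1e-9, seed=0):
    """Factorize V (freq x time) as W @ H, with H restricted by a score mask.

    score_mask (pitches x time) is 1 where the score allows a pitch to sound
    and 0 elsewhere. This is an illustrative assumption, not the paper's model.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    n_pitch = score_mask.shape[0]
    # One spectral template (column of W) per score pitch.
    W = rng.random((n_freq, n_pitch)) + eps
    # Zero entries of H stay zero under multiplicative updates,
    # so the mask acts as a hard constraint.
    H = rng.random((n_pitch, n_time)) * score_mask
    for _ in range(n_iter):
        # Standard multiplicative updates for the Euclidean NMF objective.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy data: 4 frequency bins, 2 pitches, 6 time frames.
W_true = np.array([[1.0, 0.0], [0.5, 0.2], [0.0, 1.0], [0.1, 0.6]])
score_mask = np.array([[1, 1, 1, 0, 0, 0],
                       [0, 0, 1, 1, 1, 1]], dtype=float)
H_true = score_mask * np.array([[1.0, 0.8, 0.6, 0.0, 0.0, 0.0],
                                [0.0, 0.0, 0.9, 0.7, 0.5, 0.3]])
V = W_true @ H_true

W, H = score_informed_nmf(V, score_mask)
```

A hard mask like this cannot represent an extra note played outside the score, which is one motivation for the soft constraints the abstract describes; how those are implemented is not specified here.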
