9th International Conference on Language Resources and Evaluation

Towards Automatic Transformation between Different Transcription Conventions: Prediction of Intonation Markers from Linguistic and Acoustic Features

Abstract

Because recording and transcription require tremendous effort, large-scale spoken language corpora have hardly been developed for Japanese, with the notable exception of the Corpus of Spontaneous Japanese (CSJ). Various research groups have individually developed Japanese conversation corpora, but these corpora are transcribed under different conventions and share few annotations, and some lack fundamental annotations that are prerequisites for conversation research. To remedy this situation by sharing existing conversation corpora that cover diverse styles and settings, we attempted to automatically transform a transcription made under one convention into one made under another. Using a conversation corpus transcribed in both Conversation-Analysis style (CA-style) and CSJ-style, we analyzed the correspondence between CA's 'intonation markers' and CSJ's 'tone labels,' and constructed a statistical model that converts tone labels into intonation markers with reference to linguistic and acoustic features of the speech. The results showed considerable variance in intonation marking even between trained transcribers. The model predicted the presence of intonation markers with 85% accuracy and classified the marker types with 72% accuracy.
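The two-step prediction described in the abstract (first whether a marker is present, then which type) can be illustrated with a minimal sketch. This is not the authors' model: it is a simple frequency lookup, the training pairs are invented toy data, the single `final` feature stands in for the richer linguistic and acoustic features, and the label strings (`"L%"`, `"H%"`, `"."`, `","`, `"?"`) are illustrative placeholders rather than the actual CSJ or CA inventories.

```python
from collections import Counter, defaultdict

# Toy, invented training triples: (tone_label, is_utterance_final, marker or None).
# The label strings are placeholders, not the real CSJ or CA inventories.
TRAIN = [
    ("L%", True, "."),
    ("L%", True, "."),
    ("L%", False, None),
    ("L%", False, None),
    ("H%", True, "?"),
    ("H%", True, "?"),
    ("H%", False, ","),
    ("H%", False, ","),
    ("H%", False, None),
]


def fit(triples):
    """Count, per (tone, final) context, how often each marker (or None) occurs."""
    counts = defaultdict(Counter)
    for tone, final, marker in triples:
        counts[(tone, final)][marker] += 1
    return counts


def predict(counts, tone, final):
    """Two-stage prediction: marker presence first, then marker type."""
    seen = counts.get((tone, final))
    if not seen:
        return None  # unseen context: predict no marker
    total = sum(seen.values())
    present = total - seen[None]  # Counter returns 0 for a missing key
    # Stage 1: predict a marker only if marked cases are the strict majority.
    if present * 2 <= total:
        return None
    # Stage 2: choose the most frequent marker type among the marked cases.
    typed = Counter({m: n for m, n in seen.items() if m is not None})
    return typed.most_common(1)[0][0]


counts = fit(TRAIN)
print(predict(counts, "H%", False))
```

Separating presence from type mirrors the way the abstract reports two accuracies; a real system would replace the lookup with a trained classifier over many features.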