Published in: IEEE International Conference on Acoustics, Speech and Signal Processing

Re-Translation Strategies for Long Form, Simultaneous, Spoken Language Translation

Abstract

We investigate the problem of simultaneous machine translation of long-form speech content. We target a continuous speech-to-text scenario, generating translated captions for a live audio feed, such as a lecture or play-by-play commentary. As this scenario allows for revisions to our incremental translations, we adopt a re-translation approach to simultaneous translation, where the source is repeatedly translated from scratch as it grows. This approach naturally exhibits very low latency and high final quality, but at the cost of incremental instability as the output is continuously refined. We experiment with a pipeline of industry-grade speech recognition and translation tools, augmented with simple inference heuristics to improve stability. We use TED Talks as a source of multilingual test data, developing our techniques on English-to-German spoken language translation. Our minimalist approach to simultaneous translation allows us to scale our final evaluation to several other target languages, dramatically improving incremental stability for all of them.
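
The re-translation loop described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' pipeline: `toy_translate` and its lookup table are hypothetical stand-ins for a real translation system, and the erased-token count is just one simple way to quantify incremental instability, chosen here for clarity rather than taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple


def common_prefix_len(a: Sequence[str], b: Sequence[str]) -> int:
    """Return the number of leading tokens shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def retranslate(
    source_prefixes: Sequence[Sequence[str]],
    translate: Callable[[Sequence[str]], List[str]],
) -> Tuple[List[List[str]], int]:
    """Translate every growing source prefix from scratch.

    Returns the sequence of displayed outputs and the total number of
    previously displayed tokens that a later output overwrote.
    """
    outputs: List[List[str]] = []
    shown: List[str] = []  # what the viewer currently sees
    erased = 0
    for prefix in source_prefixes:
        new = translate(prefix)  # no state is carried over between calls
        # Tokens after the shared prefix of the old output must be erased.
        erased += len(shown) - common_prefix_len(shown, new)
        shown = new
        outputs.append(new)
    return outputs, erased


# Hypothetical toy translator: a lookup table keyed by the source prefix.
TOY_TABLE = {
    ("the",): ["der"],
    ("the", "river"): ["der", "Fluss"],
    ("the", "river", "bank"): ["das", "Flussufer"],
}


def toy_translate(prefix: Sequence[str]) -> List[str]:
    return list(TOY_TABLE[tuple(prefix)])


prefixes = [["the"], ["the", "river"], ["the", "river", "bank"]]
outputs, erased = retranslate(prefixes, toy_translate)
print(outputs[-1])  # ['das', 'Flussufer']
print(erased)       # 2
```

In this toy run the final output is available as soon as the last source word arrives, but the two tokens shown earlier ("der", "Fluss") are both overwritten. That is the trade-off the abstract names: low latency and high final quality at the cost of incremental instability.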
