Source: Heliyon (U.S. National Institutes of Health literature)

Statistical-based system combination approach to gain advantages over different machine translation systems


Abstract

Every machine translation system has its own strengths. We propose an improved statistical system combination approach that exploits the strengths of existing machine translation systems. The primary task is to score all the phrases in the outputs of the machine translation systems selected for combination. The proposed approach involves three steps: alignment, decoding, and scoring. In the first step, pairwise alignment prevents duplication, so that only a single phrase is chosen from among several phrases carrying the same information. Our approach thus implements both an alignment strategy and a scoring strategy. In the second step, hypotheses are built. In the third step, we calculate a score for every hypothesis, and the hypothesis with the highest score is chosen as the final translated output. Incorrect scoring can lead to the wrong choice of the best parts from the different systems, and a given phrase may appear in different forms across translations. To address these challenges, we incorporate WordNet in the alignment phase and word2vec in the scoring phase, alongside the existing factors. We find that the system combination model with WordNet and word2vec improves machine translation accuracy. In this work, we combine three systems: a hierarchical machine translation system, Bing Microsoft Translate, and Google Translate. Extensive translation tests on eight language pairs with benchmark datasets demonstrate that the proposed system achieves better quality than the individual systems and the state-of-the-art system combination models.
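
The three steps named in the abstract (alignment, decoding, scoring) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the actual alignment procedure or scoring factors. The sketch assumes, purely for illustration, that each system's output is already split into the same number of phrase slots, that a small hand-written `SYNONYMS` table stands in for WordNet, that hand-made 2-d `VECTORS` stand in for trained word2vec embeddings, and that a hypothesis is scored by summed cosine similarity to every system's phrase in each slot. All function names are hypothetical.

```python
from itertools import product
from math import sqrt

# Hypothetical stand-ins: a tiny synonym table in place of WordNet and
# hand-made 2-d vectors in place of trained word2vec embeddings.
SYNONYMS = {"quick": "fast", "rapid": "fast"}  # word -> canonical form
VECTORS = {"fast": (1.0, 0.0), "quick": (0.9, 0.1), "slow": (-1.0, 0.0),
           "car": (0.0, 1.0), "cat": (0.1, -1.0)}


def canonical(phrase):
    """Map each word to its canonical synonym so paraphrases compare equal."""
    return tuple(SYNONYMS.get(w, w) for w in phrase.split())


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def phrase_vector(phrase):
    """Average the word vectors of a phrase; unknown words count as zeros."""
    vecs = [VECTORS.get(w, (0.0, 0.0)) for w in phrase.split()]
    return tuple(sum(c) / len(vecs) for c in zip(*vecs))


def align(outputs):
    """Step 1: per slot, keep one phrase per group of equivalent phrases."""
    slots = []
    for candidates in zip(*outputs):
        unique = {}
        for phrase in candidates:
            unique.setdefault(canonical(phrase), phrase)
        slots.append(list(unique.values()))
    return slots


def decode(slots):
    """Step 2: every combination of one phrase per slot is a hypothesis."""
    return [list(h) for h in product(*slots)]


def score(hypothesis, outputs):
    """Step 3 (assumed formula): sum of similarities to all systems' phrases."""
    total = 0.0
    for i, phrase in enumerate(hypothesis):
        for out in outputs:
            total += cosine(phrase_vector(phrase), phrase_vector(out[i]))
    return total


def combine(outputs):
    """Run alignment, decoding, and scoring; return the best hypothesis."""
    hypotheses = decode(align(outputs))
    return max(hypotheses, key=lambda h: score(h, outputs))


if __name__ == "__main__":
    outputs = [["the car", "is fast"],
               ["the car", "is quick"],
               ["the cat", "is slow"]]
    print(" ".join(combine(outputs)))
```

In this toy run, "is fast" and "is quick" collapse into one candidate during alignment, so decoding enumerates four hypotheses instead of six. A real system would need genuine phrase alignment across outputs of different lengths, and its scoring would combine the embedding-based factor with the other factors the abstract refers to.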
