Conference on Empirical Methods in Natural Language Processing

Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser



Abstract

We introduce two first-order graph-based dependency parsers achieving a new state of the art. The first is a consensus parser built from an ensemble of independently trained greedy LSTM transition-based parsers with different random initializations. We cast this approach as minimum Bayes risk decoding (under the Hamming cost) and argue that weaker consensus within the ensemble is a useful signal of difficulty or ambiguity. The second parser is a "distillation" of the ensemble into a single model. We train the distillation parser using a structured hinge loss objective with a novel cost that incorporates ensemble uncertainty estimates for each possible attachment, thereby avoiding the intractable cross-entropy computations required by applying standard distillation objectives to problems with structured outputs. The first-order distillation parser matches or surpasses the state of the art on English, Chinese, and German.
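The abstract compresses two ideas that are easier to see concretely: under the Hamming cost (the number of wrongly attached tokens), minimum Bayes risk decoding over an ensemble amounts to summing the ensemble's votes for each candidate head attachment and then decoding the highest-scoring tree, and those same vote shares provide a per-attachment uncertainty signal that can serve as the cost in a structured hinge loss. The sketch below is a minimal illustration of that voting step, not the authors' implementation; the greedy base parsers, the MST decoder (e.g. Chu-Liu/Edmonds), and the paper's exact cost function are assumed or simplified.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

# Each base parser maps a sentence to a list of predicted head indices:
# heads[m] is the head of token m, with 0 denoting the artificial root.
Parser = Callable[[Sequence[str]], List[int]]


def consensus_arc_scores(sentence: Sequence[str],
                         ensemble: Sequence[Parser]) -> List[Dict[int, float]]:
    """Vote share of each candidate arc (head h -> dependent m) across the ensemble.

    Under the Hamming cost, minimum Bayes risk decoding reduces to choosing the
    tree whose arcs have the largest total vote share, so these shares can be
    used directly as first-order arc scores for an MST decoder.
    """
    n = len(sentence)
    votes = [Counter() for _ in range(n)]          # votes[m][h] = parsers choosing head h for m
    for parse in (p(sentence) for p in ensemble):
        for m, h in enumerate(parse):
            votes[m][h] += 1
    k = len(ensemble)
    return [{h: c / k for h, c in votes[m].items()} for m in range(n)]


def attachment_cost(scores: List[Dict[int, float]], m: int, h: int) -> float:
    """A plausible per-arc cost for the distillation hinge loss: attachments the
    ensemble rarely chose are penalized more (1 minus vote share). This is an
    illustrative stand-in for the paper's uncertainty-based cost, not its exact form."""
    return 1.0 - scores[m].get(h, 0.0)
```

The resulting per-arc scores can be handed to any first-order maximum-spanning-tree decoder to obtain the consensus parse, and `attachment_cost` shows how low-consensus attachments would be penalized during cost-augmented training.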


