Conference on Empirical Methods in Natural Language Processing

Learning to Explain: Datasets and Models for Identifying Valid Reasoning Chains in Multihop Question-Answering

Abstract

Despite rapid progress in multihop question-answering (QA), models still have trouble explaining why an answer is correct, and little explanation training data is available to learn from. To address this, we introduce three explanation datasets in which explanations formed from corpus facts are annotated. Our first dataset, eQASC, contains over 98K explanation annotations for the multihop question-answering dataset QASC, and is the first to annotate multiple candidate explanations for each answer. The second dataset, eQASC-perturbed, is constructed by crowd-sourcing perturbations (while preserving their validity) of a subset of explanations in QASC, to test the consistency and generalization of explanation prediction models. The third dataset, eOBQA, is constructed by adding explanation annotations to the OBQA dataset, to test the generalization of models trained on eQASC. We show that this data can be used to significantly improve explanation quality (+14% absolute F1 over a strong retrieval baseline) using a BERT-based classifier, although performance remains below the upper bound, offering a new challenge for future research. We also explore a delexicalized chain representation in which repeated noun phrases are replaced by variables, thus turning them into generalized reasoning chains (for example: "X is a Y" AND "Y has Z" IMPLIES "X has Z"). We find that generalized chains maintain performance while also being more robust to certain perturbations.
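The delexicalized chain representation can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `delexicalize` is hypothetical, and the candidate phrases are assumed to be supplied by hand rather than found by a noun-phrase chunker. The sketch only shows the idea of replacing phrases that recur across the sentences of a chain with variables.

```python
import re

def delexicalize(chain, phrases):
    """Replace phrases that occur in two or more sentences with variables.

    `chain` is a list of sentences; `phrases` is a hand-supplied list of
    candidate phrases (an assumption of this sketch).
    """
    patterns = {
        p: re.compile(r"\b" + re.escape(p) + r"\b", re.IGNORECASE)
        for p in phrases
    }
    # Keep only phrases that appear in at least two sentences of the chain.
    repeated = [
        p for p in phrases
        if sum(bool(patterns[p].search(s)) for s in chain) >= 2
    ]
    if not repeated:
        return list(chain), {}
    # Assign variables X, Y, Z, ... in order of first appearance.
    text = " ".join(chain)
    repeated.sort(key=lambda p: patterns[p].search(text).start())
    names = iter("XYZABCDEFGHIJKLMNOPQRSTUVW")
    mapping = {p: next(names) for p in repeated}
    # Substitute in a single pass, trying longer phrases first.
    alternatives = "|".join(
        re.escape(p) for p in sorted(mapping, key=len, reverse=True)
    )
    combined = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    lookup = {p.lower(): v for p, v in mapping.items()}
    generalized = [
        combined.sub(lambda m: lookup[m.group(1).lower()], s) for s in chain
    ]
    return generalized, mapping

chain = ["Squirrels are animals", "Animals have fur", "Squirrels have fur"]
generalized, mapping = delexicalize(chain, ["squirrels", "animals", "fur"])
print(generalized)
```

On this toy chain the two premises and the conclusion take the form "X are Y", "Y have Z", "X have Z", which mirrors the pattern quoted in the abstract.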
