International Conference on Machine Learning

REALM: Retrieval-Augmented Language Model Pre-Training



Abstract

Language model pre-training has been shown to capture a surprising amount of world knowledge, crucial for NLP tasks such as question answering. However, this knowledge is stored implicitly in the parameters of a neural network, requiring ever-larger networks to cover more facts. To capture knowledge in a more modular and interpretable way, we augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference. For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner, using masked language modeling as the learning signal and backpropagating through a retrieval step that considers millions of documents. We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA). We compare against state-of-the-art models for both explicit and implicit knowledge storage on three popular Open-QA benchmarks, and find that we outperform all previous methods by a significant margin (4-16% absolute accuracy), while also providing qualitative benefits such as interpretability and modularity.
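Below is a minimal numerical sketch (not the authors' code) of the retrieve-then-predict factorization the abstract describes: the model marginalizes over retrieved documents, p(y | x) = Σ_z p(z | x) · p(y | z, x), with the retrieval distribution p(z | x) given by a softmax over dense inner-product relevance scores. The embedding dimension, toy corpus, random stand-in encoders, and the `masked_token` index are illustrative assumptions; in REALM the encoders are BERT-style networks and the sum runs over top-k documents found by maximum inner product search over millions of candidates.

```python
# Sketch of REALM-style retrieve-then-predict marginalization (assumed toy setup).
import numpy as np

rng = np.random.default_rng(0)
d = 8            # embedding dimension (assumed)
num_docs = 5     # toy corpus size
vocab = 100      # toy vocabulary for the masked-token prediction

# Stand-ins for Embed_input(x) and Embed_doc(z); in REALM these come from
# BERT-style encoders, here they are fixed random vectors for illustration.
query_embedding = rng.normal(size=d)
doc_embeddings = rng.normal(size=(num_docs, d))

# Stand-in for the reader p(y | z, x): in REALM this is a masked-language-model
# head reading the query concatenated with the retrieved document.
reader_logits = rng.normal(size=(num_docs, vocab))

def softmax(logits, axis=-1):
    logits = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(logits)
    return e / e.sum(axis=axis, keepdims=True)

# Retrieval distribution p(z | x): softmax over inner-product relevance scores.
retrieval_scores = doc_embeddings @ query_embedding         # shape (num_docs,)
p_z_given_x = softmax(retrieval_scores)

# Reader distribution p(y | z, x) for each candidate document.
p_y_given_zx = softmax(reader_logits, axis=-1)               # (num_docs, vocab)

# Marginal p(y | x): summing over documents makes the masked-LM loss
# -log p(y | x) differentiable w.r.t. the retrieval scores, which is what lets
# pre-training backpropagate through the retrieval step.
p_y_given_x = p_z_given_x @ p_y_given_zx                     # shape (vocab,)

masked_token = 42  # hypothetical identity of the masked token y
loss = -np.log(p_y_given_x[masked_token])
print(f"p(z|x) = {np.round(p_z_given_x, 3)}")
print(f"masked-LM loss -log p(y|x) = {loss:.3f}")
```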
