
Seeing Things from a Different Angle: Discovering Diverse Perspectives about Claims



Abstract

One key consequence of the information revolution is a significant increase and a contamination of our information supply. The practice of fact-checking won't suffice to eliminate the biases in text data we observe, as the degree of factuality alone does not determine whether biases exist in the spectrum of opinions visible to us. To better understand controversial issues, one needs to view them from a diverse yet comprehensive set of perspectives. For example, there are many ways to respond to a claim such as "animals should have lawful rights", and these responses form a spectrum of perspectives, each with a stance relative to this claim and, ideally, with evidence supporting it. Inherently, this is a natural language understanding task, and we propose to address it as such. Specifically, we propose the task of substantiated perspective discovery where, given a claim, a system is expected to discover a diverse set of well-corroborated perspectives that take a stance with respect to the claim. Each perspective should be substantiated by evidence paragraphs which summarize pertinent results and facts. We construct Perspectrum, a dataset of claims, perspectives and evidence, making use of online debate websites to create the initial data collection, and augmenting it using search engines in order to expand and diversify our dataset. We use crowdsourcing to filter out noise and ensure high-quality data. Our dataset contains 1k claims, accompanied by pools of 10k and 8k perspective sentences and evidence paragraphs, respectively. We provide a thorough analysis of the dataset to highlight key underlying language understanding challenges, and show that human baselines across multiple subtasks far outperform machine baselines built upon state-of-the-art NLP techniques. This poses a challenge and an opportunity for the NLP community to address.
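To make the task's inputs and outputs concrete, here is a minimal Python sketch of how a claim, its perspectives, their stances, and the supporting evidence might be represented. The class names, field names, and the two-way stance labels are assumptions for illustration only, not the dataset's actual schema, and the perspective and evidence strings are invented placeholders rather than dataset entries.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Stance(Enum):
    # Assumed two-way labeling; the abstract only says each perspective takes a stance.
    SUPPORT = "support"
    OPPOSE = "oppose"


@dataclass
class Perspective:
    text: str                      # one perspective sentence
    stance: Stance                 # stance relative to the claim
    evidence: List[str] = field(default_factory=list)  # evidence paragraphs

    def is_substantiated(self) -> bool:
        # A perspective counts as substantiated if it has at least one evidence paragraph.
        return len(self.evidence) > 0


@dataclass
class ClaimEntry:
    claim: str
    perspectives: List[Perspective] = field(default_factory=list)

    def substantiated(self) -> List[Perspective]:
        # Keep only perspectives that are backed by evidence.
        return [p for p in self.perspectives if p.is_substantiated()]


# Hypothetical example built around the claim quoted in the abstract.
entry = ClaimEntry(
    claim="Animals should have lawful rights.",
    perspectives=[
        Perspective(
            text="(placeholder) a perspective supporting the claim",
            stance=Stance.SUPPORT,
            evidence=["(placeholder) an evidence paragraph"],
        ),
        Perspective(
            text="(placeholder) a perspective opposing the claim",
            stance=Stance.OPPOSE,
        ),
    ],
)

for p in entry.substantiated():
    print(p.stance.value, "-", p.text)
```

Under this representation, a system for the task would take `claim` as input and be expected to return a diverse list of `Perspective` objects whose `evidence` lists are non-empty.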
