International Workshop on Computational Processing of the Portuguese Language

Building a Question-Answering Corpus Using Social Media and News Articles

Abstract

Is it possible to develop a reliable QA-CORPUS using social media data? What challenges arise when attempting such a task? In this paper, we discuss these questions and present our findings from developing a QA-CORPUS on the topic of Brazilian finance. To populate our corpus, we relied on opinions from experts on Brazilian finance who are active on Twitter. From these experts, we extracted information from news websites, which serves as the answers in the corpus. Moreover, to rank answers to questions effectively, we employ novel word-vector-based similarity measures between short sentences (a category that covers both questions and tweets). We validated our methods on a recently released dataset of similarity between short Portuguese sentences. Finally, we discuss the effectiveness of our approach when used to rank answers to questions from real users.
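
The abstract does not spell out the similarity measures themselves, so the following is only a minimal sketch of the general idea of ranking candidate answers by word-vector similarity. It uses a common baseline (averaged word vectors compared by cosine similarity), not the authors' novel measures. The toy three-dimensional vectors, the example sentences, and the function names `sentence_vector`, `cosine`, and `rank_answers` are all assumptions made up for illustration; a real system would load pretrained Portuguese embeddings instead.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real system would load pretrained Portuguese embeddings here.
TOY_VECTORS = {
    "selic": [0.9, 0.1, 0.0],
    "juros": [0.8, 0.2, 0.1],
    "taxa": [0.7, 0.3, 0.0],
    "dólar": [0.1, 0.9, 0.1],
    "câmbio": [0.2, 0.8, 0.0],
    "sobe": [0.3, 0.3, 0.6],
    "cai": [0.3, 0.2, 0.7],
}


def sentence_vector(sentence, vectors):
    """Average the vectors of the known tokens; return None if none are known."""
    tokens = [t for t in sentence.lower().split() if t in vectors]
    if not tokens:
        return None
    dim = len(vectors[tokens[0]])
    return [sum(vectors[t][i] for t in tokens) / len(tokens) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_answers(question, candidates, vectors):
    """Return (score, text) pairs sorted from most to least similar to the question."""
    q = sentence_vector(question, vectors)
    scored = []
    for text in candidates:
        c = sentence_vector(text, vectors)
        # Sentences with no known tokens get a neutral score of 0.0.
        score = cosine(q, c) if q is not None and c is not None else 0.0
        scored.append((score, text))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    ranked = rank_answers(
        "taxa selic sobe",
        ["dólar cai", "juros sobe", "câmbio"],
        TOY_VECTORS,
    )
    for score, text in ranked:
        print(f"{score:.3f}  {text}")
```

Averaging discards word order and weights every token equally, which is one reason such a baseline is usually only a starting point for short, noisy texts such as tweets.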