Dataset Mention Extraction and Classification

Abstract

Datasets are integral artifacts of empirical scientific research. However, due to natural language variation, they can be difficult to recognize, and even when identified, they are often referred to inconsistently across and within publications. We report our approach to the Coleridge Initiative's Rich Context Competition, which tasks participants with identifying dataset surface forms (dataset mention extraction) and associating each extracted mention with the dataset it refers to (dataset classification). In this work, we propose various neural baselines and evaluate these models on one-plus and zero-shot classification scenarios. We further explore various joint learning approaches, examining the synergy between the two tasks, and report the issues with such techniques.