Incorporating Figure Captions and Descriptive Text in MeSH Term Indexing

Abstract

The goal of text classification is to automatically assign categories to documents. Deep learning automatically learns effective features from data instead of relying on human-designed features. In this paper, we focus specifically on biomedical document classification using a deep learning approach. We present a novel multichannel TextCNN model for MeSH term indexing. Beyond the usual practice of training on text from the abstract and title, we also consider figure and table captions, as well as paragraphs associated with the figures and tables. We demonstrate that these latter text sources are important feature sources for our method. A new dataset consisting of these text segments, curated from 257,590 full-text articles together with the articles' MEDLINE/PubMed MeSH terms, is publicly available.
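
The abstract does not spell out the architecture, so the following is only a minimal sketch of what a multichannel TextCNN forward pass for multi-label indexing could look like, not the authors' implementation. It assumes one channel per text source, a shared embedding table, ReLU convolutions with max-over-time pooling, and an independent sigmoid per MeSH term. The channel names, filter widths, dimensions, and the class name `MultiChannelTextCNN` are all hypothetical, and the weights are random rather than trained.

```python
import numpy as np


def conv_max_pool(emb, filters, bias):
    """1D convolution over tokens, then ReLU and max-over-time pooling.

    emb:     (seq_len, dim) token embeddings, seq_len >= filter width
    filters: (n_filters, width, dim)
    returns: (n_filters,) pooled features
    """
    width = filters.shape[1]
    n_windows = emb.shape[0] - width + 1
    # Stack every sliding window of `width` consecutive tokens.
    windows = np.stack([emb[i:i + width] for i in range(n_windows)])
    # Contract window and filter over (width, dim): (n_windows, n_filters).
    maps = np.tensordot(windows, filters, axes=([1, 2], [1, 2])) + bias
    return np.maximum(maps, 0.0).max(axis=0)


class MultiChannelTextCNN:
    """Hypothetical sketch: one set of convolutions per text source (channel)."""

    def __init__(self, channels, vocab_size, n_labels,
                 dim=16, n_filters=8, widths=(2, 3, 4), seed=0):
        rng = np.random.default_rng(seed)
        self.channels = list(channels)
        # Assumption: a single embedding table shared by all channels.
        self.embedding = rng.normal(0.0, 0.1, (vocab_size, dim))
        self.convs = {
            ch: [(rng.normal(0.0, 0.1, (n_filters, w, dim)), np.zeros(n_filters))
                 for w in widths]
            for ch in self.channels
        }
        feat_dim = len(self.channels) * len(widths) * n_filters
        self.W = rng.normal(0.0, 0.1, (feat_dim, n_labels))
        self.b = np.zeros(n_labels)

    def forward(self, doc):
        """doc maps each channel name to a sequence of token ids."""
        feats = []
        for ch in self.channels:
            emb = self.embedding[np.asarray(doc[ch])]
            for filters, bias in self.convs[ch]:
                feats.append(conv_max_pool(emb, filters, bias))
        logits = np.concatenate(feats) @ self.W + self.b
        # Multi-label output: one independent probability per MeSH term.
        return 1.0 / (1.0 + np.exp(-logits))
```

A document would be passed as a mapping such as `{"abstract_title": [...], "captions": [...], "paragraphs": [...]}`, each value a list of token ids at least as long as the widest filter. MeSH terms whose probability exceeds a chosen threshold would then be assigned to the article.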
