SKIMMR: facilitating knowledge discovery in life sciences by machine-aided skim reading

Abstract

Background. Unlike full reading, ‘skim-reading’ involves the process of looking quickly over information in an attempt to cover more material whilst still being able to retain a superficial view of the underlying content. Within this work, we specifically emulate this natural human activity by providing a dynamic graph-based view of entities automatically extracted from text. For the extraction, we use shallow parsing, co-occurrence analysis and semantic similarity computation techniques. Our main motivation is to assist biomedical researchers and clinicians in coping with increasingly large amounts of potentially relevant articles that are being published ongoingly in life sciences.

Methods. To construct the high-level network overview of articles, we extract weighted binary statements from the text. We consider two types of these statements, co-occurrence and similarity, both organised in the same distributional representation (i.e., in a vector-space model). For the co-occurrence weights, we use point-wise mutual information that indicates the degree of non-random association between two co-occurring entities. For computing the similarity statement weights, we use cosine distance based on the relevant co-occurrence vectors. These statements are used to build fuzzy indices of terms, statements and provenance article identifiers, which support fuzzy querying and subsequent result ranking. These indexing and querying processes are then used to construct a graph-based interface for searching and browsing entity networks extracted from articles, as well as articles relevant to the networks being browsed. Last but not least, we describe a methodology for automated experimental evaluation of the presented approach. The method uses formal comparison of the graphs generated by our tool to relevant gold standards based on manually curated PubMed, TREC challenge and MeSH data.

Results. We provide a web-based prototype (called ‘SKIMMR’) that generates a network of inter-related entities from a set of documents which a user may explore through our interface. When a particular area of the entity network looks interesting to a user, the tool displays the documents that are the most relevant to those entities of interest currently shown in the network. We present this as a methodology for browsing a collection of research articles. To illustrate the practical applicability of SKIMMR, we present examples of its use in the domains of Spinal Muscular Atrophy and Parkinson’s Disease. Finally, we report on the results of experimental evaluation using the two domains and one additional dataset based on the TREC challenge. The results show how the presented method for machine-aided skim reading outperforms tools like PubMed regarding focused browsing and informativeness of the browsing context.
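
The Methods paragraph names two statement weights: point-wise mutual information (PMI) for co-occurrence, and a cosine measure over the resulting co-occurrence vectors for similarity. The Python sketch below illustrates those two computations on toy data. It is not SKIMMR's implementation: the sentence-level co-occurrence window, the toy entity lists, and the names `pmi_weights`, `cosine_similarity` and `toy_sentences` are assumptions made for illustration. The abstract speaks of cosine distance; the sketch computes cosine similarity (one minus that distance), on the assumption that a higher weight should mean more similar.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_weights(sentences):
    """Return {(a, b): PMI} for entity pairs that share a sentence.

    Assumption: each sentence is a list of already-extracted entity
    names, and co-occurrence means appearing in the same sentence.
    Pair keys are alphabetically sorted tuples.
    """
    n = len(sentences)
    entity_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        entities = sorted(set(sentence))
        entity_counts.update(entities)
        pair_counts.update(combinations(entities, 2))

    weights = {}
    for (a, b), joint in pair_counts.items():
        p_ab = joint / n
        p_a = entity_counts[a] / n
        p_b = entity_counts[b] / n
        # PMI compares the observed joint probability with the one
        # expected if a and b occurred independently.
        weights[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return weights


def cosine_similarity(weights, x, y):
    """Cosine of the angle between the co-occurrence vectors of x and y."""

    def vector(entity):
        # Sparse vector: co-occurring entity -> PMI weight.
        vec = {}
        for (a, b), w in weights.items():
            if a == entity:
                vec[b] = w
            elif b == entity:
                vec[a] = w
        return vec

    vx, vy = vector(x), vector(y)
    dot = sum(vx[k] * vy[k] for k in vx.keys() & vy.keys())
    norm_x = math.sqrt(sum(v * v for v in vx.values()))
    norm_y = math.sqrt(sum(v * v for v in vy.values()))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


# Toy input (hypothetical): one list of extracted entities per sentence.
toy_sentences = [
    ["SMN1", "spinal muscular atrophy"],
    ["SMN2", "spinal muscular atrophy"],
    ["dopamine", "Parkinson's disease"],
    ["dopamine", "Parkinson's disease"],
]

toy_weights = pmi_weights(toy_sentences)
print(cosine_similarity(toy_weights, "SMN1", "SMN2"))
print(cosine_similarity(toy_weights, "SMN1", "dopamine"))
```

In this toy corpus, `SMN1` and `SMN2` never appear together but share the same context entity, so their co-occurrence vectors point in the same direction; `SMN1` and `dopamine` share no context, so their similarity is zero. This is the sense in which a similarity statement can link entities that no single sentence mentions together.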
