Conference: International conference on Asian-Pacific digital libraries

Large-Scale Experiments for Mathematical Document Classification

Abstract

The ever-increasing amount of digitally available information is a curse and a blessing at the same time. On the one hand, users have increasingly large amounts of information at their fingertips. On the other hand, assessing and refining web search results becomes more and more tiresome and difficult for non-experts in a domain. Therefore, established digital libraries offer specialized collections with a certain degree of quality. This quality can largely be attributed to the great effort invested in the semantic enrichment of the provided documents, e.g. by annotating them with respect to a domain-specific taxonomy. This process is still done manually in many domains, e.g. chemistry (CAS), medicine (MeSH), or mathematics (MSC). But due to the growing amount of data, this manual task is becoming more and more time-consuming and expensive. The only solution to this problem seems to be to employ automated classification algorithms, but from the evaluations done in previous research it is difficult to draw conclusions for a real-world scenario. We therefore conducted a large-scale feasibility study on a real-world data set from one of the biggest mathematical digital libraries, Zentralblatt MATH, with a special focus on practical applicability.