Journal: International Journal of Artificial Intelligence Tools: Architectures, Languages, Algorithms

Consolidating Heterogeneous Enterprise Data for Named Entity Linking and Web Intelligence


Abstract

Linking named entities to structured knowledge sources paves the way for state-of-the-art Web intelligence applications that assign sentiment to the correct entities, identify trends, and reveal relations between organizations, persons, and products. To this end, this paper introduces Recognyze, a named entity linking component that uses background knowledge obtained from linked data repositories. It also outlines the process of transforming heterogeneous data silos within an organization into a linked enterprise data repository that draws upon popular linked open data vocabularies to foster interoperability with public data sets. The presented examples use comprehensive real-world data sets from Orell Füssli Business Information, Switzerland's largest business information provider. The linked data repository created from these data sets comprises more than nine million triples on companies and their contact information, key people, products, and brands. We identify the major challenges of tapping into such sources for named entity linking and describe the data pre-processing techniques required to use and integrate such data sets, with a special focus on disambiguation and ranking algorithms. Finally, we conduct a comprehensive evaluation based on business news from the New Journal of Zurich and AWP Financial News to illustrate how these techniques improve the performance of the Recognyze named entity linking component.