CoreKG: a Knowledge Lake Service

Abstract

With Data Science continuing to emerge as a powerful differentiator across industries, organisations are now focused on transforming their data into actionable insights. This task is challenging because, in today's knowledge-, service-, and cloud-based economy, businesses accumulate massive amounts of raw data from a variety of sources. Data Lakes were introduced as storage repositories that organize this raw data in its native format (supporting everything from relational to NoSQL databases) until it is needed. The rationale behind a Data Lake is to store raw data and let the data analyst decide how to cook/curate it later. In this paper, we present the notion of a Knowledge Lake, i.e. a contextualized Data Lake. The Knowledge Lake provides the foundation for big data analytics by automatically curating the raw data in the Data Lake and preparing it for deriving insights. We present CoreKG, an open source Data and Knowledge Lake service, which offers researchers and developers a single REST API to organize, curate, index and query their data and metadata in the Lake over time. CoreKG manages multiple database technologies (from relational to NoSQL) and offers a built-in design for data curation, security and provenance.
