Conference: International Conference on Conceptual Modeling

Bridging the Gaps towards Advanced Data Discovery over Semi-structured Data

Abstract

In this work we argue that two main gaps currently hinder the development of new applications that require sophisticated data discovery capabilities over rich (semi-structured) entity-relationship data. The first gap exists at the conceptual level and the second at the logical level. To close these gaps, we propose a novel methodology for developing data discovery applications. We first describe a data discovery extension to the classic ER conceptual model, termed Entity Relationship Data Discovery (ERD2). We then present a novel logical model, termed the Document Category Sets (DCS) model, which represents entities and their relationships within an enhanced document model, and describe how data discovery requirements captured by the ERD2 conceptual model can be translated into the DCS logical model. Finally, we propose an efficient data discovery system implementation and share details of two data discovery applications that were developed at IBM using the proposed methodology.