International Conference on Big Data and Smart Computing

Scalable OWL-Horst ontology reasoning using SPARK


Abstract

In this paper, we present an approach to performing reasoning over scalable OWL ontologies in a Hadoop-based distributed computing cluster. Rule-based reasoning is typically used for scalable OWL-Horst reasoning: the system repeatedly applies semantic axioms to big ontology triples until no further inferred data exists. As a result, reasoning systems suffer performance limitations when ontology reasoning is performed via disk-based MapReduce approaches. To overcome this drawback, we propose an approach that loads triples into the memory of compute nodes connected by SPARK, a memory-based cluster computing platform, and executes ontology reasoning there. To implement an OWL-Horst ontology reasoning system, we first define a set of algorithms that divide large triple sets into Resilient Distributed Datasets (RDDs), taking into account the patterns and interdependencies of the reasoning rules. We then load each RDD into the memory of the computers composing the distributed computing cluster and perform distributed reasoning according to rule execution order. To evaluate the proposed method, we compare it to WebPIE using the LUBM benchmark, a formal dataset for evaluating ontology inference and search speed. On LUBM6000 (860 million triples, 109 gigabytes), the proposed approach improves throughput by 200% (98k triples/sec) compared to WebPIE (33k triples/sec).
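The fixpoint loop the abstract describes (apply reasoning rules repeatedly until no further inferred data exists) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: Python sets stand in for Spark RDDs, and only two representative RDFS/OWL-Horst rules (subClassOf transitivity and type inheritance) are shown.

```python
# Sketch of rule-based fixpoint reasoning over RDF triples.
# Python sets stand in for Spark RDDs; the rule set is a small
# illustrative subset of the OWL-Horst rules.

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

def infer(triples):
    """Apply the rules once; return only the newly derived triples."""
    sub = {(s, o) for (s, p, o) in triples if p == SUBCLASS}
    typ = {(s, o) for (s, p, o) in triples if p == TYPE}
    new = set()
    # rdfs11: (a subClassOf b), (b subClassOf c) => (a subClassOf c)
    for (a, b) in sub:
        for (b2, c) in sub:
            if b == b2:
                new.add((a, SUBCLASS, c))
    # rdfs9: (x type a), (a subClassOf b) => (x type b)
    for (x, a) in typ:
        for (a2, b) in sub:
            if a == a2:
                new.add((x, TYPE, b))
    return new - triples

def closure(triples):
    """Iterate to fixpoint: stop when no further inferred data exists."""
    triples = set(triples)
    while True:
        delta = infer(triples)
        if not delta:
            return triples
        triples |= delta

facts = {
    ("Student", SUBCLASS, "Person"),
    ("Person", SUBCLASS, "Agent"),
    ("alice", TYPE, "Student"),
}
result = closure(facts)
# ("alice", TYPE, "Agent") is derived across two iterations,
# via rdfs9 applied after rdfs11/rdfs9 extend the sets.
```

In the paper's setting, the per-rule joins above correspond to RDD join/filter operations executed in memory across the cluster, ordered by the rule interdependencies, rather than nested loops on one machine.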
