Conference: Systems and Information Engineering Design Symposium

Context Matrix Methods for Property and Structure Ontology Completion in Wikidata

Abstract

Wikidata is a crowd-sourced knowledge base built by the creators of Wikipedia that applies the principles of neutrality and verifiability to data. In its more than eight years of existence, it has grown enormously, though unevenly. Some areas are well curated and maintained, while many parts of the knowledge base are incomplete or use inconsistent classifications. Tools are therefore needed that can use the instantiated data to infer and report structural gaps and suggest ways to address them. We propose a context matrix to automatically suggest potential values for properties. This method can be extended to evaluate the ontology represented by the knowledge base. In particular, it could be used to propose types and classes, supporting the discovery of ontological relationships that lend conceptual identification to the content entities. To work with the large, unlabelled data set, we first employ a pipeline to shrink the data to a minimal representation without information loss. We then process the data to build a recommendation model using property frequencies. We explore the results of these models in the context of suggesting type classifications in Wikidata and discuss potential extended applications. As a result of this work, we demonstrate approaches to contextualizing recently added content in the knowledge base as well as proposing new connections for existing content. Finally, these methods could be applied to other knowledge graphs to develop similar completions for the entities contained therein.
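
The abstract does not spell out how the context matrix is built, so the following is only a minimal sketch of one plausible reading: count how often each property co-occurs with each type among already-typed entities, then score candidate types for an untyped entity from the properties it uses. The function names, the toy entities, and the per-property normalisation are all assumptions made for illustration, not the authors' pipeline. The identifiers are real Wikidata IDs (P569 date of birth, P106 occupation, P19 place of birth, P17 country, P625 coordinate location, P1082 population, Q5 human, Q515 city).

```python
from collections import Counter, defaultdict


def build_context_matrix(entities):
    """Count how often each property co-occurs with each type.

    `entities` is an iterable of (properties, types) pairs; the result maps
    property -> Counter of type frequencies (a sparse property-by-type matrix).
    """
    matrix = defaultdict(Counter)
    for props, types in entities:
        for p in props:
            for t in types:
                matrix[p][t] += 1
    return matrix


def suggest_types(matrix, props, top_n=3):
    """Rank candidate types for an entity that uses the given properties.

    Each property votes for the types it co-occurs with, weighted by that
    type's share of the property's row (an assumed normalisation).
    """
    scores = Counter()
    for p in props:
        row = matrix.get(p)
        if not row:
            continue  # property never seen on a typed entity
        total = sum(row.values())
        for t, count in row.items():
            scores[t] += count / total
    return scores.most_common(top_n)


# Hypothetical toy data: properties used by four typed entities.
TYPED = [
    ({"P569", "P106", "P19"}, {"Q5"}),
    ({"P569", "P106"}, {"Q5"}),
    ({"P17", "P625", "P1082"}, {"Q515"}),
    ({"P17", "P625"}, {"Q515"}),
]

matrix = build_context_matrix(TYPED)
suggestions = suggest_types(matrix, {"P569", "P19"})
print(suggestions)
```

In this toy setting an untyped entity with a date and place of birth is matched to Q5 (human), because both properties occur only on human-typed entities. On real data the rows would be far less clean, and the weighting scheme would need to be chosen and evaluated with care.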
