Conference: European conference on machine learning and knowledge discovery in databases
Pushing-Down Tensor Decompositions over Unions to Promote Reuse of Materialized Decompositions

Abstract

From data collection to decision making, the life cycle of data often involves many steps of integration, manipulation, and analysis. To provide end-to-end support for the full data life cycle, today's data management and decision-making systems increasingly combine operations for data manipulation, integration, and analysis. The tensor-relational model (TRM) is a framework proposed to support both relational algebraic operations (for data manipulation and integration) and tensor algebraic operations (for data analysis). In this paper, we consider joint processing of relational algebraic and tensor analysis operations. In particular, we focus on data processing workflows that involve data integration from multiple sources (through unions) and tensor decomposition tasks. In traditional relational algebra, the costliest operation is known to be the join; in a framework that provides both relational and tensor operations, however, tensor decomposition tends to be the computationally costliest operation. It is therefore critical to restructure the data processing workflow in a way that reduces the cost of the tensor decomposition step. To this end, we consider data processing workflows involving tensor decomposition and union operations, and we propose a novel scheme for pushing down the tensor decompositions over the union operations to reduce the overall data processing times and to promote reuse of materialized tensor decomposition results. Experimental results confirm the efficiency and effectiveness of the proposed scheme.
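The intuition behind decomposing below a union can be illustrated with a small sketch. It is not the scheme proposed in the paper, whose details are not given in this abstract. It assumes that the two sources share one index space, that their tuple sets are disjoint (so the union corresponds to the sum of the two tensors), and that a CP-style decomposition of each source has already been materialized. The names `reconstruct`, `combine`, and `materialized` are hypothetical.

```python
from itertools import product

def reconstruct(factors, shape):
    """Rebuild a dense 3-mode tensor, as a dict keyed by (i, j, k), from CP factors."""
    A, B, C = factors
    rank = len(A[0])
    return {
        (i, j, k): sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank))
        for i, j, k in product(range(shape[0]), range(shape[1]), range(shape[2]))
    }

def combine(f1, f2):
    """Concatenate the factor matrices of two decompositions column-wise, mode by mode."""
    return tuple(
        [row1 + row2 for row1, row2 in zip(m1, m2)]
        for m1, m2 in zip(f1, f2)
    )

shape = (2, 2, 2)

# Hypothetical materialized rank-1 decompositions of two sources (assumed data).
# Each factor matrix is a list of rows; source_1 only covers i = 0, source_2 only i = 1.
materialized = {
    "source_1": ([[1], [0]], [[1], [2]], [[1], [1]]),
    "source_2": ([[0], [3]], [[1], [0]], [[0], [1]]),
}

# Factors for the union are assembled from the stored factors, without decomposing again.
union_factors = combine(materialized["source_1"], materialized["source_2"])

x1 = reconstruct(materialized["source_1"], shape)
x2 = reconstruct(materialized["source_2"], shape)
x_union = reconstruct(union_factors, shape)

# The combined factors represent exactly the sum of the two source tensors.
assert all(x_union[idx] == x1[idx] + x2[idx] for idx in x_union)
```

Under these assumptions the combined decomposition has rank equal to the sum of the source ranks, which is generally not minimal; a practical scheme would likely need a further refinement or compression step.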
