Conference: International Convention on Information and Communication Technology, Electronics and Microelectronics

Delta view generation for incremental loading of large dimensions in a data warehouse

Abstract

Incremental loading is the preferred approach in efficient ETL processes. Fact tables benefit the most from this approach because they are large in terms of row count. For the sake of simplicity, dimension tables are often ignored and populated with a full reload. However, big dimensions (e.g. Client) can also have a significant impact on the ETL process and should be considered for incremental loading as well. Although they have much smaller cardinality than a typical fact table, it usually takes far more resources to calculate one dimension table row than one fact table row. Large dimension tables are based on multiple source tables, and determining which changed records should be included in the incremental load is not trivial, because changes in any of the underlying source tables must be taken into account. In this paper, we present an algorithm for generating a dimension's delta view. The delta view for a dimension encompasses all of its source tables and produces the set of keys (e.g. ClientIds) that should be processed incrementally. We have employed this approach in a real-world project and observed a significant reduction in the loading time of big dimensions.
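The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of the general idea: a view that unions the dimension keys touched in any source table since the last load. Everything in it is an assumption made for illustration: the table names (`client`, `address`, `account`), the `modified_at` change-tracking column, the `etl_watermark` table holding the last load time, and the helper `delta_view_sql`. None of these come from the paper.

```python
import sqlite3

# Hypothetical source tables feeding a Client dimension. Each entry maps a
# table name to the SQL expression that yields the dimension key for a row.
SOURCES = {
    "client": "client_id",
    "address": "client_id",
    "account": "owner_client_id",
}


def delta_view_sql(view_name, key_name, sources):
    """Build CREATE VIEW text: the union of keys touched in any source table."""
    branches = [
        f"SELECT {key_expr} AS {key_name} FROM {table} "
        f"WHERE modified_at > (SELECT last_load FROM etl_watermark)"
        for table, key_expr in sources.items()
    ]
    # UNION (not UNION ALL) deduplicates keys that changed in several tables.
    return f"CREATE VIEW {view_name} AS\n" + "\nUNION\n".join(branches)


def demo():
    """Create toy source tables, build the delta view, return the changed keys."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE etl_watermark (last_load TEXT);
        CREATE TABLE client  (client_id INTEGER, name TEXT, modified_at TEXT);
        CREATE TABLE address (address_id INTEGER, client_id INTEGER,
                              city TEXT, modified_at TEXT);
        CREATE TABLE account (account_id INTEGER, owner_client_id INTEGER,
                              modified_at TEXT);
        INSERT INTO etl_watermark VALUES ('2024-01-10');
        INSERT INTO client VALUES (1, 'Ann', '2024-01-01'),
                                  (2, 'Bob', '2024-01-12'),
                                  (3, 'Cy',  '2024-01-02');
        INSERT INTO address VALUES (10, 1, 'Zagreb', '2024-01-11'),
                                   (11, 3, 'Split',  '2024-01-03');
        INSERT INTO account VALUES (100, 2, '2024-01-15'),
                                   (101, 3, '2024-01-04');
    """)
    con.execute(delta_view_sql("client_delta", "client_id", SOURCES))
    rows = con.execute(
        "SELECT client_id FROM client_delta ORDER BY client_id"
    ).fetchall()
    con.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(demo())
```

In this toy data, client 1 is picked up only through a changed address row and client 2 through both its own row and an account row, while client 3 has no changes after the watermark and is skipped. A real dimension would likely need joins to reach the dimension key from source tables that do not carry it directly, which this sketch does not attempt.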
