IEEE International Conference on Data Science in Cyberspace

SAQP++: Bridging the Gap between Sampling-Based Approximate Query Processing and Aggregate Precomputation


Abstract

In the booming Big Data era, interactive data analytics is becoming a routine demand for data scientists. As data volumes keep growing, answering aggregation queries in a timely manner is increasingly challenging. To address this challenge, researchers have proposed two separate methods: sampling-based approximate query processing (SAQP) and aggregate precomputation (Materialization), such as data cubes. In this paper, we propose a novel framework, SAQP++, which combines sampling with precomputed aggregates to achieve both relatively accurate results and acceptable preparation time. Using the SUM aggregate function as an example, we propose an optimal materialization solution under a uniform distribution and a hill-climbing-based materialization algorithm under non-uniform distributions. Our experiments show that SAQP++ achieves a more flexible and better trade-off among preprocessing cost, query response time, and answer quality than the SAQP or Materialization alternatives.
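The general idea of mixing precomputed aggregates with a sample can be illustrated for a range SUM query. The sketch below is not the paper's algorithm: the block-aligned prefix sums, the Bernoulli sample, and every function name (`build_prefix_sums`, `draw_sample`, `sample_sum`, `hybrid_sum`) are assumptions made purely for illustration. It answers the block-aligned part of a query exactly from the precomputed sums and estimates only the leftover edges from the sample.

```python
import random


def build_prefix_sums(values, block):
    """Precompute exact prefix sums at every `block`-th position.

    Returns a list where prefix[k] == sum(values[:k * block]).
    """
    prefix = [0.0]
    running = 0.0
    for i, v in enumerate(values, start=1):
        running += v
        if i % block == 0:
            prefix.append(running)
    return prefix


def draw_sample(values, rate, seed=0):
    """Keep each (index, value) pair independently with probability `rate`."""
    rng = random.Random(seed)
    return [(i, v) for i, v in enumerate(values) if rng.random() < rate]


def sample_sum(sample, rate, lo, hi):
    """Estimate sum(values[lo:hi]) by scaling the sampled values by 1 / rate."""
    return sum(v for i, v in sample if lo <= i < hi) / rate


def hybrid_sum(prefix, block, sample, rate, lo, hi):
    """Estimate sum(values[lo:hi]) using exact blocks plus sampled edges."""
    a = -(-lo // block) * block  # lo rounded up to a block boundary
    b = (hi // block) * block    # hi rounded down to a block boundary
    if a >= b:
        # No whole block fits inside the query range: sampling only.
        return sample_sum(sample, rate, lo, hi)
    exact = prefix[b // block] - prefix[a // block]
    left = sample_sum(sample, rate, lo, a)
    right = sample_sum(sample, rate, b, hi)
    return exact + left + right


if __name__ == "__main__":
    data = [float(x) for x in range(1000)]
    prefix = build_prefix_sums(data, block=100)
    sample = draw_sample(data, rate=0.1)
    print("estimate:", hybrid_sum(prefix, 100, sample, 0.1, 150, 730))
    print("exact:   ", sum(data[150:730]))
```

In this sketch a smaller `block` means more precomputed values and less reliance on the sample, which is one simple way to picture the trade-off between preprocessing cost and answer quality that the abstract describes.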
