首页> 外文期刊>Software >SIMD compression and the intersection of sorted integers
【24h】

SIMD compression and the intersection of sorted integers

机译:SIMD压缩和排序整数的交集

获取原文
获取原文并翻译 | 示例
           

摘要

Sorted lists of integers are commonly used in inverted indexes and database systems. They are often compressed in memory. We can use the single-instruction, multiple data (SIMD) instructions available in common processors to boost the speed of integer compression schemes. Our S4-BP128-D4 scheme uses as little as 0.7CPU cycles per decoded 32-bit integer while still providing state-of-the-art compression. However, if the subsequent processing of the integers is slow, the effort spent on optimizing decompression speed can be wasted. To show that it does not have to be so, we (1) vectorize and optimize the intersection of posting lists; (2) introduce the SIMD GALLOPING algorithm. We exploit the fact that one SIMD instruction can compare four pairs of 32-bit integers at once. We experiment with two Text REtrieval Conference (TREC) text collections, GOV2 and ClueWeb09 (category B), using logs from the TREC million-query track. We show that using only the SIMD instructions ubiquitous in all modern CPUs, our techniques for conjunctive queries can double the speed of a state-of-the-art approach. Copyright (c) 2015 John Wiley & Sons, Ltd.
机译:整数排序列表通常在反向索引和数据库系统中使用。它们通常在内存中被压缩。我们可以使用普通处理器中可用的单指令多数据(SIMD)指令来提高整数压缩方案的速度。我们的S4-BP128-D4方案每个解码的32位整数仅使用0.7个CPU周期,同时仍提供最先进的压缩功能。但是,如果整数的后续处理很慢,则可能会浪费用于优化解压缩速度的精力。为了表明并非必须如此,我们(1)对发布列表的交集进行矢量化和优化; (2)介绍SIMD GALLOPING算法。我们利用一个SIMD指令可以一次比较四对32位整数的事实。我们使用来自TREC百万查询跟踪的日志,尝试了两个文本检索会议(TREC)文本集合GOV2和ClueWeb09(类别B)。我们证明,仅使用所有现代CPU中普遍存在的SIMD指令,我们的联合查询技术就可以使最新方法的速度提高一倍。版权所有(c)2015 John Wiley&Sons,Ltd.

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号