
Pattern-Aware Cache Management for Efficient Subset Retrieving of Astronomical Image Data

Abstract

FITS (Flexible Image Transport System) is the most widely used data format in astronomy. A single FITS file ranges in size from megabytes (MB) to gigabytes (GB), and even terabytes (TB), and astronomers were among the first researchers to encounter Big Data. In most cases, astronomers are interested only in a small sub-area of a time series of images. However, loading the whole raw FITS file from a hard disk drive (HDD) on every query and then cutting out the target sub-area is time consuming and wastes I/O, and no existing cache scheme is optimized for subset retrieval from FITS files. We propose a Pattern-Aware (PA) cache management strategy that retrieves sub-image data efficiently from large collections of FITS files by recognizing hot sub-areas from the latest query history and then loading and merging the related sub-images via a coordinate-mapping algorithm. We compared our method with the traditional LRU, LFU and LRFU strategies, applied to full FITS files and to sub-files respectively. The results show that the PA strategy maintains a high hit ratio of 64.32% and reduces the average response time by about 24% compared with the best of these traditional schemes. These results are achieved with a ratio of cache size to raw requested data size of 8.77%.
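
To make the idea of caching sub-images rather than whole files concrete, the following is a minimal sketch and not the authors' PA algorithm. It is a plain LRU cache of rectangular regions in which a query counts as a hit when a cached region of the same file fully contains the requested box, and the box is then translated into the cached region's local indices. The names `SubImageCache`, `load_region` and `synthetic_loader` are hypothetical, and the hot-area recognition and sub-image merging described in the abstract are not shown because the abstract does not specify them.

```python
from collections import OrderedDict


class SubImageCache:
    """LRU cache of rectangular sub-images (illustrative, not the PA strategy)."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of cached regions
        self.entries = OrderedDict()  # (file_id, (x0, y0, x1, y1)) -> rows of pixels
        self.hits = 0
        self.misses = 0

    def get(self, file_id, box, load_region):
        """Return the pixels of `box` (half-open: x0 <= x < x1, y0 <= y < y1)."""
        x0, y0, x1, y1 = box
        # Look for a cached region of the same file that contains the box.
        found = None
        for key in self.entries:
            fid, (cx0, cy0, cx1, cy1) = key
            if fid == file_id and cx0 <= x0 and cy0 <= y0 and x1 <= cx1 and y1 <= cy1:
                found = key
                break
        if found is not None:
            self.hits += 1
            self.entries.move_to_end(found)  # mark as most recently used
            cx0, cy0 = found[1][0], found[1][1]
            pixels = self.entries[found]
            # Coordinate mapping: image coordinates -> cached-region indices.
            return [row[x0 - cx0:x1 - cx0] for row in pixels[y0 - cy0:y1 - cy0]]
        # Miss: read only the requested region from storage and cache it.
        self.misses += 1
        pixels = load_region(file_id, box)
        self.entries[(file_id, box)] = pixels
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used region
        return pixels


def synthetic_loader(file_id, box):
    """Stand-in for a disk read: the pixel at (x, y) has value y * 100 + x."""
    x0, y0, x1, y1 = box
    return [[y * 100 + x for x in range(x0, x1)] for y in range(y0, y1)]
```

In this sketch, a request for the box `(1, 1, 3, 3)` that follows a request for `(0, 0, 4, 4)` on the same file is served from the cached region without calling the loader again. A real implementation would replace `synthetic_loader` with a routine that reads a section of a FITS file from disk.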
