
Resource Oblivious Sorting on Multicores

Abstract

We present a new deterministic sorting algorithm that interleaves the partitioning of a sample sort with merging. Sequentially, it sorts n elements in O(n log n) time cache-obliviously with an optimal number of cache misses. The parallel complexity (or critical path length) of the algorithm is O(log n log log n), which improves on previous bounds for deterministic sample sort. Given a multicore computing environment with a global shared memory and p cores, each having a cache of size M organized in blocks of size B, our algorithm can be scheduled effectively on these p cores in a cache-oblivious manner. We improve on the above cache-oblivious processor-aware parallel implementation by using the Priority Work Stealing Scheduler (PWS) that we presented recently in a companion paper. The PWS scheduler is both processor- and cache-oblivious (i.e., resource oblivious), and it tolerates asynchrony among the cores. Using PWS, we obtain a resource oblivious scheduling of our sorting algorithm that matches the performance of the processor-aware version. Our analysis includes the delay incurred by false sharing. We also establish good bounds for our algorithm with the randomized work stealing scheduler.
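
For readers unfamiliar with the sample sort that the abstract builds on, the following is a minimal sequential sketch of the general technique with deterministic sampling. It is an illustration only and not the algorithm of the paper: it does not interleave partitioning with merging, it is not parallel, and it makes no attempt at cache-oblivious layout. The function name `sample_sort`, the `threshold` parameter, and the choice of roughly sqrt(n) buckets are assumptions made for this example.

```python
import bisect
import math


def sample_sort(items, threshold=16):
    """Illustrative sequential sample sort (not the paper's algorithm).

    Assumptions for this sketch: about sqrt(n) buckets, a deterministic
    sample made of every k-th input element, and a fallback to the
    built-in sort for small or degenerate inputs.
    """
    n = len(items)
    if n <= threshold:
        return sorted(items)

    k = math.isqrt(n)  # target number of buckets
    # Deterministic sample: every k-th element, sorted recursively.
    sample = sample_sort(items[::k], threshold)
    # Pick evenly spaced pivots from the sorted sample.
    stride = max(1, len(sample) // k)
    pivots = sample[stride::stride]

    # Partition: bucket i receives x with pivots[i-1] <= x < pivots[i].
    buckets = [[] for _ in range(len(pivots) + 1)]
    for x in items:
        buckets[bisect.bisect_right(pivots, x)].append(x)

    # Sort each bucket and concatenate in pivot order.
    result = []
    for bucket in buckets:
        if len(bucket) == n:
            # No progress (e.g. all keys equal): avoid infinite recursion.
            result.extend(sorted(bucket))
        else:
            result.extend(sample_sort(bucket, threshold))
    return result
```

Calling `sample_sort(data)` on a list of comparable items returns a new sorted list and leaves `data` unchanged. In a parallel setting the buckets are independent and could be sorted concurrently, which is the source of the parallelism that a scheduler exploits.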
