Exploiting Thread-Level and Instruction-Level Parallelism to Cluster Mass Spectrometry Data using Multicore Architectures

Abstract

Modern mass spectrometers can produce large numbers of peptide spectra from complex biological samples in a short time. These data sets contain substantial redundancy, because the same peptide may be selected multiple times in a Liquid Chromatography Tandem Mass Spectrometry (LC-MS/MS) experiment. In addition, many spectra cannot be mapped to specific peptide sequences because of their low signal-to-noise (S/N) ratio. Clustering is one way to mitigate these problems. We recently presented CAMS, a graph-theoretic framework for clustering large-scale mass spectrometry data. CAMS uses a novel metric that exploits the spatial patterns in mass spectrometry peaks and yields highly accurate clustering results. However, comparing each spectrum with every other spectrum makes the clustering problem computationally expensive. In this paper we present a parallel algorithm, called P-CAMS, that uses thread-level and instruction-level parallelism on multicore architectures to substantially decrease running times. P-CAMS relies on intelligent matrix completion to reduce the number of comparisons, on threads running on each core, and on the Single Instruction Multiple Data (SIMD) paradigm inside each thread to exploit the parallelism available on multicore architectures. A carefully crafted load-balancing scheme, which uses the spatial locations of the mass spectrometry peaks mapped to the nearest-level cache and core, allows super-linear speedups. We study the scalability of the algorithm across a wide variety of mass spectrometry data and across variations in architecture-specific parameters. The results show that SIMD-style data parallelism combined with thread-level parallelism is a powerful combination on multicore architectures, allowing substantial reductions in runtime even for all-to-all comparison algorithms. Quality assessment on a real-world data set shows that the results are consistent with those of the serial version of the same algorithm.
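The thread-level part of the idea in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the P-CAMS implementation: cosine similarity stands in for the CAMS metric (which the abstract does not define), skipping the lower triangle by symmetry stands in for the paper's matrix completion, connected components stand in for the graph-theoretic clustering step, and all function names (`cosine`, `balanced_rows`, `compare_rows`, `all_pairs`, `cluster`) are hypothetical. The inner dot product in `cosine` is the kind of loop a compiled implementation would vectorize with SIMD instructions; pure Python does not do that, and its threads are limited by the interpreter lock, so the sketch shows the work decomposition rather than the speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt


def cosine(a, b):
    """Stand-in similarity between two binned intensity vectors (NOT the CAMS metric)."""
    dot = sum(x * y for x, y in zip(a, b))  # the loop a SIMD unit would vectorize
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def balanced_rows(n, workers):
    """Deal rows out in a zigzag so every worker gets a similar number of pairs.

    Row i of the upper triangle holds n - 1 - i comparisons, so early rows are
    expensive and late rows are cheap; the zigzag pairs them up.
    """
    groups = [[] for _ in range(workers)]
    for i in range(n):
        k = i % (2 * workers)
        w = k if k < workers else 2 * workers - 1 - k
        groups[w].append(i)
    return groups


def compare_rows(spectra, rows):
    """Compare each assigned row only with later spectra (similarity is symmetric)."""
    n = len(spectra)
    return [(i, j, cosine(spectra[i], spectra[j]))
            for i in rows for j in range(i + 1, n)]


def all_pairs(spectra, workers=4):
    """All-to-all comparison, one group of rows per worker thread."""
    groups = balanced_rows(len(spectra), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda rows: compare_rows(spectra, rows), groups))
    return {(i, j): s for part in parts for (i, j, s) in part}


def cluster(spectra, threshold=0.9, workers=4):
    """Assumed clustering step: connected components over pairs with similarity >= threshold."""
    parent = list(range(len(spectra)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (i, j), s in all_pairs(spectra, workers).items():
        if s >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(spectra)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

For example, `cluster([[1, 0, 0], [2, 0, 0], [0, 1, 0], [0, 3, 0.1]], workers=2)` groups the first two and the last two toy spectra together. The zigzag assignment is only one simple way to even out the triangular workload; the cache-aware scheme described in the abstract is not reproduced here.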

Bibliographic information

Authors: Fahad Saeed; Jason D. Hoffert; Trairak Pisitkun; Mark A. Knepper
Volume: 3
Page: 54
Total pages: 32
Original format: PDF

Similar literature

1. Exploiting thread-level and instruction-level parallelism to cluster mass spectrometry data using multicore architectures [J]. Fahad Saeed, Jason D. Hoffert, Trairak Pisitkun. Network Modeling Analysis in Health Informatics and Bioinformatics. 2014, No. 1
2. Architecture optimization for multimedia application exploiting data and thread-level parallelism [J]. Limousin C, Sebot J, Vartanian A. Journal of systems architecture. 2005, No. 1
3. A Multiprocessor Architecture SKY that Exploits Thread-Level Parallelism in Non-Numerical Applications [J]. Ryotaro Kobayashi, Yukihiro Ogawa, Mitsuaki Iwata. 情報処理学会論文誌. 2001, No. 2
4. A high performance algorithm for clustering of large-scale protein mass spectrometry data using multi-core architectures [C]. Saeed Fahad, Hoffert Jason D., Knepper Mark A. IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. 2013
5. An asynchronous superscalar architecture for exploiting instruction-level parallelism [D]. Werner, Tony Lee. 2000
6. Human Leukocyte Antigen (HLA) B27 Allotype-Specific Binding and Candidate Arthritogenic Peptides Revealed through Heuristic Clustering of Data-independent Acquisition Mass Spectrometry (DIA-MS) Data [O]. Ralf B. Schittenhelm, Saranjah Sivaneswaran, Terry C. C. Lim Kam Sian. 2016
7. Converting thread-level parallelism to instruction-level parallelism via simultaneous multithreading [O]. Jack L. Lo, Susan J. Eggers, Joel S. Emer. 1997