Cholesky and Gram-Schmidt Orthogonalization for Tall-and-Skinny QR Factorizations on Graphics Processors

Abstract

We present a method for the QR factorization of large tall-and-skinny matrices that combines block Gram-Schmidt and the Cholesky decomposition to factorize the input matrix column panels, overcoming the sequential nature of this operation. This method uses re-orthogonalization to obtain a satisfactory level of orthogonality both in the Gram-Schmidt process and the Cholesky QR. Our approach has the additional benefit of enabling the introduction of a static look-ahead technique for computing the Cholesky decomposition on the CPU while the remaining operations (all Level-3 BLAS) are performed on the GPU. In contrast with other specific factorizations for tall-skinny matrices, the novel method has the key advantage of not requiring any custom GPU kernels. This simplifies the implementation and favours portability to future GPU architectures. Our experiments show that, for tall-skinny matrices, the new approach outperforms the code in MAGMA by a large margin, while it is very competitive for square matrices when the memory transfers and CPU computations are the bottleneck of Householder QR.
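The combination described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it runs entirely on the CPU, omits the look-ahead and the CPU/GPU split, and the function names (`cholesky_qr`, `cholesky_qr2`, `block_gs_cholqr`) and the panel width `b` are hypothetical choices made for illustration. Each column panel is orthogonalized against the previous panels with two passes of block Gram-Schmidt, then factorized with Cholesky QR applied twice (the re-orthogonalization step).

```python
import numpy as np


def cholesky_qr(A):
    """One Cholesky QR pass: A = Q R, with R taken from the Cholesky factor of A^T A."""
    G = A.T @ A                      # Gram matrix (a Level-3 BLAS product)
    R = np.linalg.cholesky(G).T      # upper-triangular R with G = R^T R
    # Q = A R^{-1}. A tuned code would use a triangular solve (TRSM);
    # np.linalg.solve is a general solver used here only for brevity.
    Q = np.linalg.solve(R.T, A.T).T
    return Q, R


def cholesky_qr2(A):
    """Cholesky QR applied twice; the second pass re-orthogonalizes Q."""
    Q1, R1 = cholesky_qr(A)
    Q, R2 = cholesky_qr(Q1)
    return Q, R2 @ R1


def block_gs_cholqr(A, b):
    """QR of a tall-and-skinny A, processed in column panels of width b."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.empty((m, n))
    R = np.zeros((n, n))
    for j in range(0, n, b):
        k = min(j + b, n)
        W = A[:, j:k].copy()
        # Two passes of block Gram-Schmidt against the panels already computed.
        for _ in range(2):
            C = Q[:, :j].T @ W
            W -= Q[:, :j] @ C
            R[:j, j:k] += C          # accumulate both sets of coefficients
        # Factorize the orthogonalized panel itself.
        Q[:, j:k], R[j:k, j:k] = cholesky_qr2(W)
    return Q, R


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((2000, 64))
    Q, R = block_gs_cholqr(A, b=16)
    print("orthogonality error:", np.linalg.norm(Q.T @ Q - np.eye(64)))
    print("relative residual:  ", np.linalg.norm(Q @ R - A) / np.linalg.norm(A))
```

In this sketch every step except the small Cholesky factorization is a matrix-matrix product or a solve with a small triangular factor, which is consistent with the abstract's point that no custom GPU kernels are needed. Note that `np.linalg.cholesky` raises an error when the Gram matrix is not numerically positive definite, which can happen for very ill-conditioned panels.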

Bibliographic record

Source
International European Conference on Parallel and Distributed Computing | 2019 | pp. 469-480 | 12 pages
Conference location: Gottingen (DE)
Authors
Andres E. Tomas; Enrique S. Quintana-Orti
Author affiliations

Dept. d'Enginyeria i Ciencia dels Computadors, Universitat Jaume I, 12.071 Castello de la Plana, Spain; Dept. de Sistemes Informatics i Computacio, Universitat Politecnica de Valencia, 46.022 Valencia, Spain

Dept. d'Informatica de Sistemes i Computadors, Universitat Politecnica de Valencia, 46.022 Valencia, Spain

Original format: PDF
Keywords
QR factorization; Tall-and-skinny matrices; Graphics processing unit; Gram-Schmidt; Cholesky factorization; Look-ahead; High-performance;

Similar literature

1. Tall-and-skinny QR factorization with approximate Householder reflectors on graphics processors [J]. Tomas Andres E., Quintana-Orti Enrique S. Journal of supercomputing. 2020, Issue 11
2. ERROR ANALYSIS OF THE CHOLESKY QR-BASED BLOCK ORTHOGONALIZATION PROCESS FOR THE ONE-SIDED BLOCK JACOBI SVD ALGORITHM [J]. Kudo Shuhei, Yamamoto Yusaku, Imamura Toshiyuki. Computing and informatics. 2020, Issue 6
3. PARALLEL GRAM-SCHMIDT ORTHOGONALISATION AND QR FACTORIZATION ON AN ARRAY PROCESSOR [J]. Clint M., Weston JS., Kontoghiorghes EJ. Zeitschrift fur Angewandte Mathematik und Mechanik. 1996, Issue S1
4. Cholesky and Gram-Schmidt Orthogonalization for Tall-and-Skinny QR Factorizations on Graphics Processors [C] . Andres E. Tomas, Enrique S. Quintana-Orti International European Conference on Parallel and Distributed Computing . 2019
5. Matrix factorizations, triadic matrices, and modified Cholesky factorizations for optimization [D] . Fang, Haw-ren 2006
6. Overcoming Catastrophic Interference in Connectionist Networks Using Gram-Schmidt Orthogonalization [O]. Vipin Srivastava, Suchitra Sampath, David J. Parker
7. Error Analysis of the Cholesky QR-Based Block Orthogonalization Process for the One-Sided Block Jacobi SVD Algorithm [O] . Shuhei Kudo, Yusaku Yamamoto, Toshiyuki Imamura 2020
8. Cholesky Factorization, Schur Complements, Correlation Coefficients, Angles between Vectors, and the QR Factorization [R] . Delosme, J.-M., Ipsen, I. C., Paige, C. C. 1988
