High Performance Computing and Real Time Software for High Dimensional Data Classification

Abstract

We present high performance computing and real-time software for high dimensional data classification. We investigate the OpenMP parallelization and optimization of two graph-based data classification algorithms. These algorithms are based on graph and PDE solution techniques and provide significant accuracy and performance advantages over traditional data classification algorithms. We use OpenMP to parallelize the most time-consuming part, the Nyström extension eigensolver. The Nyström extension calculates eigenvalues and eigenvectors of the graph Laplacian; it is a self-contained module that can be used in conjunction with other graph-Laplacian-based methods such as spectral clustering. We then optimize the OpenMP implementations, detail their performance on traditional supercomputer nodes (in our case a Cray XC30), and test the optimization steps on emerging testbed systems based on Intel's Knights Corner and Knights Landing processors. We show both performance improvement and strong scaling behavior. A large number of optimization techniques and analyses are necessary before the algorithm reaches almost ideal scaling. We further develop parallelized real-time software. The software is published on Image Processing On Line (IPOL) and runs on IPOL's server, providing real-time computation for image classification.

We apply the parallelized classification algorithms to various problems. The first is hyperspectral image classification, a challenging modality because the dimension of each pixel can range from hundreds to over a thousand frequencies, depending on the sensor. Our methods use the full dimensionality of the data and consider a similarity graph based on pairwise comparisons of pixels. The graph is segmented using a pseudospectral algorithm for graph clustering that requires information about the eigenfunctions of the graph Laplacian but does not require the computation of the full graph. The parallelized Nyström extension method randomly samples the graph to construct a low-rank approximation of the graph Laplacian. With at most a few hundred eigenfunctions, we can implement the clustering method to solve variational problems for graph-cut-based semi-supervised and unsupervised classification. The second application is ego-motion classification of body-worn videos. Portable cameras record dynamic first-person video footage, and these videos contain information on the motion of the individual on whom the camera is mounted, defined as the ego. We address the task of discovering ego-motion from the video itself, without other external calibration information. We investigate the use of similarity transformations between successive video frames to extract signals reflecting ego-motions and their frequencies. We use the parallelized graph-based unsupervised and semi-supervised learning algorithms to segment the video frames into different ego-motion categories. We show very accurate results on both choreographed test videos and ego-motion videos provided by the Los Angeles Police Department.
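
The sampling idea behind the Nyström extension can be illustrated with a minimal pure-Python sketch. It is not the thesis's OpenMP implementation, and it rests on several assumptions that the abstract does not state: a Gaussian similarity with a hypothetical width parameter `sigma`, landmarks passed in as a list of indices, and a small Jacobi routine standing in for an optimized eigensolver. To stay short it approximates eigenpairs of the similarity matrix W itself and omits the degree normalization and orthogonalization that a graph Laplacian version would need. All function names are illustrative.

```python
import math
import random


def gaussian_similarity(x, y, sigma=1.0):
    """Gaussian similarity between two feature vectors (assumed kernel)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def jacobi_eigh(matrix, sweeps=50, tol=1e-22):
    """Eigen-decompose a small symmetric matrix with cyclic Jacobi rotations.

    Returns (vals, vecs); vecs[i][k] is component i of eigenvector k.
    """
    n = len(matrix)
    a = [list(row) for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Angle that zeroes a[p][q], folded into [-pi/4, pi/4].
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                if theta > math.pi / 4.0:
                    theta -= math.pi / 2.0
                elif theta < -math.pi / 4.0:
                    theta += math.pi / 2.0
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # a <- a J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # a <- J^T a (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors: v <- v J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


def sample_landmarks(n, m, seed=0):
    """Pick m distinct landmark indices uniformly at random (seeded)."""
    return sorted(random.Random(seed).sample(range(n), m))


def nystrom_eigenpairs(points, landmarks, sigma=1.0, eps=1e-8):
    """Extend eigenvectors of the landmark block W_AA to every point.

    Only similarities to the landmarks are evaluated, never the full matrix.
    Eigenvalues below eps * max are dropped to avoid unstable divisions.
    """
    w_aa = [[gaussian_similarity(points[i], points[j], sigma) for j in landmarks]
            for i in landmarks]
    vals, vecs = jacobi_eigh(w_aa)
    m = len(landmarks)
    keep = [k for k in range(m) if vals[k] > eps * max(vals)]
    extended = []
    for x in points:
        # One row of W[:, A]: similarities between x and the landmarks.
        w_xa = [gaussian_similarity(x, points[j], sigma) for j in landmarks]
        extended.append([
            sum(w_xa[j] * vecs[j][k] for j in range(m)) / vals[k] for k in keep
        ])
    return [vals[k] for k in keep], extended


def approx_similarity(vals, extended, i, j):
    """Entry (i, j) of the low-rank approximation sum_k vals[k] u_k[i] u_k[j]."""
    return sum(lam * extended[i][k] * extended[j][k] for k, lam in enumerate(vals))
```

With n points and m landmarks, the sketch evaluates only n × m similarities, which is why the full graph never has to be formed. The per-point loop in `nystrom_eigenpairs` has independent iterations, so it is the kind of loop that a parallel implementation could distribute across threads.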

Bibliographic record

  • Author: Meng, Zhaoyi
  • Author affiliation: University of California, Los Angeles
  • Degree-granting institution: University of California, Los Angeles
  • Subjects: Applied mathematics; Computer science; Information science
  • Degree: Ph.D.
  • Year: 2018
  • Pages: 108 p.
  • Total pages: 108
  • Original format: PDF
  • Language: English
