International Conference for High Performance Computing, Networking, Storage and Analysis
Year: 2014
Location: New Orleans, LA (US)
Publication date: -
Proceedings: -
Conference papers
1.
Parallel Deep Neural Network Training for Big Data on Blue Gene/Q
Authors: I-Hsin Chung; Sainath Tara N.; Ramabhadran Bhuvana; Pichen Michael; Gunnels John; Austel Vernon; Chauhari Upendra; Kingsbury Brian
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
Big Data;
learning (artificial intelligence);
neural nets;
parallel architectures;
pattern recognition;
DNN;
IBM BG/Q computer system;
IBM Blue Gene/Q computer system;
big data;
data-parallel Hessian-free 2nd order optimization algorithm;
interprocessor communication characteristics;
machine learning techniques;
parallel architectures;
parallel computing algorithms;
parallel deep neural network training;
pattern recognition tasks;
programming model;
training time costs;
Neural networks;
Optimization;
Prefetching;
Speech recognition;
Synchronization;
Training;
Big Data;
High Performance Computing;
Speech Recognition;
2.
Understanding Soft Error Resiliency of Blue Gene/Q Compute Chip through Hardware Proton Irradiation and Software Fault Injection
Authors: Chen-Yong Cher; Gupta Meeta S.; Bose Pradip; Muller K. Paul
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
SRAM chips;
cache storage;
floating point arithmetic;
mainframes;
microprocessor chips;
parallel machines;
radiation hardening (electronics);
software fault tolerance;
AFI experiments;
BlueGene/Q compute chip;
BlueGene/Q hardware resiliency features;
HPC systems;
Level-1 caches;
SRAM-based register files;
Sequoia system;
application-level fault injection experiments;
correctable errors;
floating point register files;
hardware proton irradiation;
petascale high performance computing systems;
soft error resiliency;
software fault injection;
software resiliency;
third generation IBM massively parallel energy efficient Blue Gene series supercomputers;
Circuit faults;
Hardware;
Particle beams;
Protons;
Radiation effects;
Registers;
Software;
chip irradiation;
co-design;
fault injection;
high-performance applications;
soft error rate;
3.
Dissecting On-Node Memory Access Performance: A Semantic Approach
Authors: Gimenez Alfredo; Gamblin Todd; Rountree Barry; Bhatele Abhinav; Jusufi Ilir; Bremer Peer-Timo; Hamann Bernd
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
distributed memory systems;
multi-threading;
storage management;
CPU manufacturers;
PMU;
attribute semantic information;
code regions;
data motion;
data objects;
design decisions;
distributed memory systems;
domain decomposition;
fine-grained memory access performance data;
memory access optimization;
memory behaviour;
multithreading;
on-node memory access performance;
performance ramifications;
power efficiency;
sampled memory accesses;
sampling-based performance measurement units;
semantic approach;
Context;
Hardware;
Kernel;
Libraries;
Program processors;
Semantics;
Topology;
4.
Application Centric Energy-Efficiency Study of Distributed Multi-Core and Hybrid CPU-GPU Systems
Authors: Cumming Ben; Fourestey Gilles; Fuhrer Oliver; Gysi Tobias; Fatica Massimiliano; Schulthess Thomas C.
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
distributed memory systems;
energy conservation;
graphics processing units;
power aware computing;
weather forecasting;
GF-Watt metric;
HPCG benchmark;
application centric energy-efficiency;
distributed memory system;
distributed multicore CPU-GPU systems;
hybrid CPU-GPU systems;
processor architectures;
production-level regional climate simulation code;
production-level regional weather simulation code;
Atmospheric modeling;
Benchmark testing;
Computational modeling;
Graphics processing units;
Meteorology;
Power demand;
Production;
5.
Scaling MapReduce Vertically and Horizontally
Authors: El-Helw Ismail; Hofman Rutger; Bal Henri E.
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
data handling;
graphics processing units;
multiprocessing systems;
parallel processing;
pipeline processing;
GPMR;
GPU cluster;
Glasswing pipeline;
Hadoop;
MapReduce applications;
MapReduce framework;
OpenCL;
accelerators;
coarse-grained parallelism;
disk access;
fine-grained parallelism;
horizontal MapReduce scaling;
horizontal scalability;
multicore CPU;
multicore CPU cluster;
vertical MapReduce scaling;
vertical scalability;
Graphics processing units;
Instruction sets;
Kernel;
Parallel processing;
Performance evaluation;
Pipelines;
Scalability;
Heterogeneous;
MapReduce;
OpenCL;
Scalability;
6.
Correctness Field Testing of Production and Decommissioned High Performance Computing Platforms at Los Alamos National Laboratory
Authors: Michalak Sarah E.; Rust William N.; Daly John T.; Dubois Rew J.; Dubois David H.
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
natural sciences computing;
parallel processing;
HPC platforms;
Los Alamos National Laboratory;
SDC;
correctness field testing;
decommissioned high performance computing platform;
intermittent error mechanism;
production high performance computing platform;
scientific calculations;
silent data corruption;
transient error mechanism;
Computer architecture;
Data transfer;
High performance computing;
Production;
SDRAM;
Testing;
Transient analysis;
Cluster computing;
HPC cluster;
Linpack;
field testing;
high performance computing;
interconnect testing;
intermittent error;
resilience;
silent data corruption;
soft error;
transient error;
7.
An Image-Based Approach to Extreme Scale in Situ Visualization and Analysis
Authors: Ahrens James; Jourdain Sebastien; OLeary Patrick; Patchett John; Rogers David H.; Petersen Mark
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
data analysis;
data visualisation;
image processing;
public domain software;
software tools;
extreme scale data analysis;
extreme scale in situ visualization;
extreme scale scientific simulations;
interactive image-based approach;
open source tools;
Analytical models;
Atmospheric modeling;
Cameras;
Computational modeling;
Data models;
Data visualization;
Databases;
8.
The DRIHM Project: A Flexible Approach to Integrate HPC, Grid and Cloud Resources for Hydro-Meteorological Research
Authors: Dagostino Daniele; Clematis Andrea; Galizia Antonella; Quarati Alfonso; Danovaro Emanuele; Roverelli Luca; Zereik Gabriele; Kranzlmuller Dieter; Schiffers Michael; Gentschen Felde Nils; Straube Christian; Caumontz Olivier; Richard Evelyne; Garrote Luis; Harpham Quillon; Jagers H.R.A.; Dimitrijevic Vladimir; Dekic Ljiljana; Fiorizz Elisabetta; Delogu Fabio; Parodi Antonio
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
cloud computing;
geophysics computing;
grid computing;
hydrology;
meteorology;
parallel processing;
DRIHM project;
HPC;
cloud resources;
distributed computing resources;
distributed research infrastructure for hydrometeorology project;
e-science infrastructure;
grid resources;
heterogeneous HMR models;
high performance computing resources;
hydrometeorological research;
iterative learning-by-doing approach;
Atmospheric modeling;
Biological system modeling;
Computational modeling;
Data models;
Forecasting;
Meteorology;
Predictive models;
9.
CYPRESS: Combining Static and Dynamic Analysis for Top-Down Communication Trace Compression
Authors: Jidong Zhai; Jianfei Hu; Xiongchao Tang; Xiaosong Ma; Wenguang Chen
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
parallel machines;
program diagnostics;
software performance evaluation;
trees (mathematics);
CYPRESS;
communication template;
compile time;
dynamic analysis;
dynamic compression methods;
dynamic trace compression;
execution scale;
hybrid static-dynamic method;
interprocedural analysis;
interprocess compression overhead;
intraprocess compression overhead;
iterative computing features;
loop structure;
next-generation HPC systems;
parallel application performance analysis;
parallel application performance optimization;
program communication structure;
static analysis;
supercomputers;
top-down communication trace compression;
Algorithm design and analysis;
Asynchronous communication;
Data structures;
Educational institutions;
Libraries;
Performance analysis;
Runtime;
High Performance Computing;
Message Passing;
Performance Analysis;
Trace Compression;
10.
pTatin3D: High-Performance Methods for Long-Term Lithospheric Dynamics
Authors: May Dave A.; Brown Jason; Le Pourhiet Laetitia
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
Earth crust;
Earth mantle;
bandwidth allocation;
brittleness;
differential equations;
finite element analysis;
geophysics computing;
parallel processing;
viscoplasticity;
Cray XC-30;
cache pressure;
continental rifting;
geodynamic phenomena;
geodynamics modeling package;
high-contrast brittle materials;
high-performance methods;
intranode scalability;
local element structure;
long-term lithospheric dynamics;
material composition;
material-point-method;
matrix-free geometric multigrid preconditioner;
memory bandwidth;
memory footprint;
memory-bus;
multigrid finite-element method;
pTatin3D multigrid preconditioner;
post-failure analysis;
viscoplastic Stokes problems;
Equations;
Finite element analysis;
Mathematical model;
Rocks;
Sparse matrices;
Viscosity;
Stokes;
geodynamics;
matrix-free;
multilevel preconditioners;
variable viscosity;
vectorization;
11.
RAHTM: Routing Algorithm Aware Hierarchical Task Mapping
Authors: Abdel-Gawad Ahmed H.; Thottethodi Mithuna; Bhatele Abhinav
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
divide and conquer methods;
graph theory;
parallel processing;
Blue Gene/Q platform;
HPC applications;
MPI process mapping;
RAHTM;
communication graph;
communication performance;
communication-heavy benchmarks;
divide-and-conquer strategy;
high performance computing applications;
iterative communication;
linear programming;
mapping optimization;
network topology;
offline analysis;
performance improvement;
routing algorithm aware hierarchical task mapping;
supercomputer;
Algorithm design and analysis;
Bandwidth;
Benchmark testing;
Measurement;
Network topology;
Routing;
Topology;
divide-and-conquer;
linear programming;
routing;
task mapping;
torus;
12.
Efficient Implementation of Many-Body Quantum Chemical Methods on the Intel® Xeon Phi Coprocessor
Authors: Apra Edoardo; Klemm Michael; Kowalski Karol
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
chemistry computing;
coprocessors;
electronic structure;
quantum chemistry;
CCSD(T) quantum chemistry;
Intel Xeon phi coprocessor;
NWChem computational chemistry package;
electronic structure calculation;
many integrated core architecture;
many-body quantum chemical method;
Computational modeling;
Computer architecture;
Coprocessors;
Graphics processing units;
Tensile stress;
Vectors;
Chemistry;
distributed architectures;
parallel algorithms;
13.
Mapping to Irregular Torus Topologies and Other Techniques for Petascale Biomolecular Simulation
Authors: Phillips James C.; Yanhua Sun; Jain Nikhil; Bohm Eric J.; Kale Laxmikant V.
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
biology computing;
digital simulation;
mainframes;
molecular biophysics;
molecular dynamics method;
parallel machines;
topology;
3D Cray Gemini toroidal networks;
5D IBM Blue Gene/Q toroidal networks;
ANL Mira;
NCSA Blue Waters;
NERSC Edison;
ORNL Titan;
TACC Stampede;
communication-intensive codes;
fixed-size spatial decomposition;
full machine simulations;
hardware failure;
irregular node allocation shapes;
irregular torus topology mapping;
leadership machines;
molecular dynamics program NAMD;
multiple-copy algorithms;
network contention;
partition node allocations;
periodic 3D grid;
petascale biomolecular simulation;
petascale supercomputers;
topology adaptation;
topology awareness;
topology-agnostic codes;
toroidal network topologies;
Biological system modeling;
Computational modeling;
Graphics processing units;
Network topology;
Partitioning algorithms;
Resource management;
Topology;
14.
High-Productivity Framework on GPU-Rich Supercomputers for Operational Weather Prediction Code ASUCA
Authors: Shimokawabe Takashi; Aoki Toyohiro; Onodera Naoyuki
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
application program interfaces;
geophysics computing;
graphics processing units;
mainframes;
message passing;
parallel machines;
peer-to-peer computing;
CPU codes;
GPU codes;
GPU-rich supercomputer TSUBAME 2.5;
Japan Meteorological Agency;
MPI;
NVIDIA K20X GPU;
Tokyo Institute of Technology;
high-productivity framework;
intranode GPU peer-to-peer direct access;
next-generation high resolution meso-scale atmospheric model;
operational weather prediction code ASUCA;
parallel efficiency;
skillful programming techniques;
user-written codes;
Atmospheric modeling;
Computational modeling;
Graphics processing units;
Mathematical model;
Numerical models;
Programming;
Weather forecasting;
15.
A Volume Integral Equation Stokes Solver for Problems with Variable Coefficients
Authors: Malhotra Dhairya; Gholami Amir; Biros George
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
approximation theory;
integral equations;
parallel machines;
14-order approximation;
Intel accelerator;
NVIDIA;
Stampede system;
Stokes equation;
adaptive fast multipole method;
nonuniform discretization;
penalty formulation;
peta FLOPS;
pore structure;
porous medium;
pressure;
variable coefficient;
velocity;
volume integral equation Stokes solver;
Chebyshev approximation;
Convergence;
Convolution;
Equations;
Geometry;
Integral equations;
Mathematical model;
16.
Faster Parallel Traversal of Scale Free Graphs at Extreme Scale with Vertex Delegates
Authors: Pearce Roger; Gokhale Maya; Amato Nancy M.
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
data structures;
distributed memory systems;
graph theory;
parallel machines;
tree searching;
BFS;
IBM BG-P;
Page-Rank;
SSSP;
asynchronous broadcast operations;
breadth-first search;
distributed memory supercomputers;
extreme scale graphs;
high-degree vertices;
hub data structures;
k-core decomposition;
parallel workload;
scale free graph parallel traversal;
single source shortest path;
social network graphs;
storage imbalances;
Algorithm design and analysis;
Benchmark testing;
Computational modeling;
Data structures;
Partitioning algorithms;
Scalability;
Supercomputers;
17.
Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices
Authors: Jongsoo Park; Smelyanskiy Mikhail; Vaidyanathan Karthikeyan; Heinecke Alexander; Kalamkar Dhiraj D.; Xing Liu; Patwary M. Mostofa Ali; Yutong Lu; Dubey Pradeep
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
conjugate gradient methods;
iterative methods;
matrix algebra;
message passing;
optimisation;
parallel processing;
shared memory systems;
3D grid;
CG convergence rate;
Gauss-Seidel smoother parallelization;
HPC applications;
HPCG;
MPI parallelization;
TFLOPS;
Tianhe-2 system;
Xeon Phi shared-memory implementation;
algorithmic optimizations;
architecture-aware optimizations;
block multicolor reordering;
communication overhead;
communication pattern;
data access locality;
high performance conjugate gradient benchmark;
next generation extreme-scale computing systems;
parallelism;
sparse linear solvers;
unstructured matrices;
Benchmark testing;
Convergence;
Equations;
Parallel processing;
Sparse matrices;
Synchronization;
Vectors;
18.
Recycled Error Bits: Energy-Efficient Architectural Support for Floating Point Accuracy
Authors: Nathan Ralph; Anthonio Bryan; Shih-Lien Lu; Naeimi Helia; Sorin Daniel J.; Sun Xinghua
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
floating point arithmetic;
hardware-software codesign;
performance evaluation;
program compilers;
recycling;
FPU;
all-software scheme;
architecturally recycled error bits;
compiler pass;
energy-efficient architectural support;
floating point accuracy;
floating point addition;
operation rounding error;
recycled error bits;
Accuracy;
Assembly;
Benchmark testing;
Hardware;
Instruments;
Registers;
Software;
innovative hardware/software co-design;
linear and nonlinear systems;
numerical methods;
19.
High-Performance Computation of Distributed-Memory Parallel 3D Voronoi and Delaunay Tessellation
Authors: Peterka Tom; Morozov Dmitriy; Phillips Chris
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
computational geometry;
data analysis;
data visualisation;
distributed memory systems;
mesh generation;
parallel algorithms;
computational geometry methods;
data analysis;
data visualization;
distributed-memory parallel 3D Delaunay tessellation;
distributed-memory parallel 3D Voronoi tessellation;
distributed-memory scalable parallel algorithm;
high-performance computation;
Computational geometry;
Data models;
Face;
Heuristic algorithms;
Libraries;
Parallel algorithms;
Three-dimensional displays;
Delaunay tessellation;
Voronoi;
computational geometry;
20.
FlexSlot: Moving Hadoop Into the Cloud with Flexible Slot Management
Authors: Yanfei Guo; Jia Rao; Changjun Jiang; Xiaobo Zhou
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
cloud computing;
data handling;
parallel processing;
resource allocation;
FlexSlot;
Hadoop task management;
data distribution;
data skew;
flexible slot management;
job completion time;
load imbalance;
map stragglers;
performance interference;
private cloud;
resource allocation;
resource pool-based virtual cluster management;
resource usage;
resource utilization;
semantic gap;
task execution;
user-transparent task slot management scheme;
virtual node;
Acceleration;
Cloud computing;
Dynamic scheduling;
Measurement;
Memory management;
Resource management;
Runtime;
21.
Efficient I/O and Storage of Adaptive-Resolution Data
Authors: Kumar Sudhakar; Edwards John; Bremer Peer-Timo; Knoll Aaron; Christensen Cameron; Vishwanath Venkatram; Carns Philip; Schmidt John A.; Pascucci V.
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
data structures;
grid computing;
input-output programs;
parallel machines;
AMR setting;
AMR simulation;
Edison supercomputer;
I/O performance;
Mira supercomputer;
PIDX framework;
S3D large-scale combustion code;
Uintah;
adaptive mesh refinement simulation;
adaptive-resolution I/O framework;
adaptive-resolution data;
data resolution;
disk usage;
domain data;
importance-driven storage;
independent grid;
lower resolution;
multiresolution representation;
regions of interest;
resolution level;
uniform grid;
Adaptation models;
Arrays;
Data models;
Encoding;
Indexes;
Layout;
Spatial resolution;
22.
Pipelining Computational Stages of the Tomographic Reconstructor for Multi-Object Adaptive Optics on a Multi-GPU System
Authors: Charara Ali; Ltaief Hatem; Gratadour Damien; Keyes David; Sevin Arnaud; Abdelfattah Ahmad; Gendron Eric; Morel Carine; Vidal Fabrice
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
adaptive optics;
astronomical image processing;
astronomical telescopes;
computerised tomography;
floating point arithmetic;
graphics processing units;
pipeline processing;
scheduling;
E-ELT;
European extremely large telescope project;
MOAO technique;
MORSE;
MOSAIC;
Matrices Over Runtime System at Exascale numerical library;
TR simulation;
astronomical instruments;
computational stage pipelining;
data coherency;
data dependencies;
dynamic scheduler;
floating-point operations;
ground-based astronomy;
heterogeneous systems;
largest-scale AO problem;
multiGPU system;
multiobject adaptive optics;
multiobject spectroscopy;
numerical algorithm;
tomographic reconstructor simulation;
Computational modeling;
Covariance matrices;
Libraries;
Runtime;
Telescopes;
Tomography;
Computational Astronomy;
Dense Linear Algebra;
Dynamic Scheduler;
GPU Computing;
Multi-Objects Adaptive Optics;
23.
Parallelization of Reordering Algorithms for Bandwidth and Wavefront Reduction
Authors: Karantasis Konstantinos I.; Lenharth Andrew; Nguyen Donald; Garzaran Mara J.; Pingali Keshav
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
cache storage;
iterative methods;
matrix multiplication;
parallel algorithms;
parallel machines;
sparse matrices;
HSL library;
HSL-RCM;
SpMV iteration;
Stampede supercomputer;
bandwidth reduction;
cache locality;
matrix reordering;
parallel RCM;
parallel implementation;
parallel iterative solver;
parallelization;
reordering algorithm;
reverse Cuthill-McKee;
sequential HSL-Sloan;
sparse matrix computation;
sparse matrix-vector multiplication;
wavefront reduction;
Arrays;
Bandwidth;
Heuristic algorithms;
Indexes;
Parallel processing;
Runtime;
Sparse matrices;
24.
DISC: A Domain-Interaction Based Programming Model with Support for Heterogeneous Execution
Authors: Kurt Mehmet Can; Agrawal Gagan
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
parallel programming;
DISC;
HPC systems;
distributed memory execution;
domain elements;
domain-interaction based programming model;
heterogeneity;
heterogeneous execution;
heterogeneous nodes;
heterogeneous processors;
high performance computing;
interprocess communication;
molecular dynamics applications;
runtime system;
stencil computations;
unstructured grid computations;
Computational modeling;
Data structures;
Load modeling;
Parallel programming;
Program processors;
Runtime;
heterogeneous support;
load balancing;
programming model;
25.
Real-Time Scalable Cortical Computing at 46 Giga-Synaptic OPS/Watt with ~100× Speedup in Time-to-Solution and ~100,000× Reduction in Energy-to-Solution
Authors: Cassidy Andrew S.; Alvarez-Icaza Rodrigo; Akopyan Filipp; Sawada Jun; Arthur John V.; Merolla Paul A.; Datta Piyali; Gonzalez Tallada Marc; Taba Brian; Andreopoulos Alexander; Amir Arnon; Esser Steven K.; Kusnitz Jeff; Appuswamy Rathinakumar; Haymes Chuck; Brezzo Bernard; Moussalli Roger; Bellofatto Ralph; Baks Christian; Mastro Michael; Schleupen Kai; Cox Charles E.; Inoue Ken; Millman Steve; Imam Nabil; Mcquinn Emmett; Nakamura Yutaka Y.; Vo Ivan; Guok Chen; Nguyen Donald; Lekuch Scott; Asaad Sameh; Friedman Daniel; Jackson Bryan L.; Flickner Myron D.; Risk William P.; Manohar Rajit; Modha Dharmendra S.
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
computer vision;
neural chips;
neural net architecture;
recurrent neural nets;
True North chips;
codesigned silicon expression;
complex recurrent neural network simulation;
computer vision application;
cortex-like scalability;
energy to-solution;
energy-to-solution;
giga-synaptic OPS/Watt;
magnitude reduction;
magnitude speedup;
neuroscience;
neurosynaptic computation;
parallel event-driven kernel;
real-time scalable cortical computing;
software expression;
synapse brain-inspired neurosynaptic processor;
time-to-solution;
von Neumann architecture;
Compass;
Computational modeling;
Computer architecture;
Kernel;
Message systems;
Nerve fibers;
26.
Slim Fly: A Cost Effective Low-Diameter Network Topology
Authors: Besta Maciej; Hoefler Torsten
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
graph theory;
multiprocessor interconnection networks;
parallel processing;
HPC network;
Slim Fly;
bandwidth;
cost effective low-diameter network topology;
deadlock-free routing scheme;
latency;
optimal network diameter;
physical layout;
power consumption;
Bandwidth;
Joining processes;
Measurement;
Network topology;
Routing;
System recovery;
Topology;
27.
Fast Iterative Graph Computation: A Path Centric Approach
Authors: Pingpeng Yuan; Wenya Zhang; Changfeng Xie; Hai Jin; Ling Liu; Kisung Lee
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
iterative methods;
mathematics computing;
parallel processing;
storage media;
trees (mathematics);
Path Graph;
compact storage design;
edge-centric system;
in-memory graph;
iterative graph computation;
large scale graph processing;
out-of-core graph;
parallel computation model;
partition tree level;
path centric approach;
path-centric computation;
path-centric computation model;
random access minimization;
scatter-gather programming model;
sequential access maximization;
sequential updates;
storage media;
tree-based partitions;
vertex centric system;
Computing model;
Graph;
Iterative computation;
Path;
Storage;
28.
Orion: Scaling Genomic Sequence Matching with Fine-Grained Parallelization
Authors: Mahadik Kanak; Chaterji Somali; Bowen Zhou; Kulkarni Milind; Bagchi Saurabh
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
biology computing;
database management systems;
genomics;
query processing;
string matching;
Orion;
database segmentation;
fine-grained parallelization;
genomic sequence matching;
query sequence;
Bioinformatics;
DNA;
Databases;
Genomics;
Organisms;
Parallel processing;
29.
Reciprocal Resource Fairness: Towards Cooperative Multiple-Resource Fair Sharing in IaaS Clouds
Authors: Haikun Liu; Bingsheng He
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
cloud computing;
resource allocation;
virtual machines;
virtualisation;
IaaS cloud;
RRF;
VM density;
cloud model;
cloud provider;
cloud tenant;
complementary mechanism;
cooperative multiple-resource fair sharing;
economic fairness;
fair sharing multiple type;
hierarchical mechanism;
inter-tenant resource trading;
intra-tenant weight adjustment;
multiple resource type;
pay-as-you-use cloud environment;
reciprocal resource fairness;
resource allocation mechanism;
resource sharing;
resource/energy efficiency;
virtual machine density;
virtualized environment;
Dynamic scheduling;
Economics;
Indexes;
Random access memory;
Resource management;
Support vector machines;
Vectors;
IaaS;
cloud computing;
fairness;
resource sharing;
30.
Fence Scoping
Authors: Changhui Lin; Nagarajan Vijay; Gupta Rajesh
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
program control structures;
program diagnostics;
software performance evaluation;
S-Fence;
customizable fence;
fence instructions;
fence scoping;
lock-free algorithms;
memory accesses;
program performance;
scope information;
Buffer storage;
Frequency selective surfaces;
Hardware;
Memory management;
Program processors;
Programming;
Semantics;
Fence instructions;
Memory models;
Scope;
31.
MC-Checker: Detecting Memory Consistency Errors in MPI One-Sided Applications
Authors: Zhezhe Chen; Dinan James; Zhen Tang; Balaji Pavan; Hua Zhong; Jun Wei; Tao Huang; Feng Qin
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis | 2014
Keywords:
application program interfaces;
message passing;
program debugging;
program diagnostics;
MC-Checker;
MPI one-sided applications;
data movement;
data synchronization;
distributed shared data;
dynamic events;
load/store operations;
memory consistency bug diagnosis;
memory consistency error detection;
one-sided communication;
online instrumentation;
Analytical models;
Computer bugs;
Data models;
Instruments;
Load modeling;
Runtime;
Synchronization;
Bug Detection;
MPI;
One-Sided Communication;
32.
Omnisc'IO: A Grammar-Based Approach to Spatial and Temporal I/O Patterns Prediction
机译:
Omnisc'IO:基于语法的时空I / O模式预测方法
作者:
Dorier Matthieu
;
Ibrahim Shadi
;
Antoniu Gabriel
;
Ross Robert
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
Unix;
cache storage;
grammars;
input-output programs;
message passing;
parallel processing;
scheduling;
storage management;
HPC applications;
I/O optimizations;
I/O subsystem;
MPI I/O stacks;
Omnisc'IO;
POSIX stacks;
caching techniques;
grammar-based approach;
post-petascale machines;
prefetching techniques;
scheduling techniques;
spatial I/O pattern prediction;
temporal I/O pattern prediction;
Context;
Grammar;
Hidden Markov models;
Libraries;
Prediction algorithms;
Predictive models;
Prefetching;
Exascale;
Grammar;
HPC;
I/O;
Omnisc'IO;
Prediction;
Storage;
33.
Quantitatively Modeling Application Resilience with the Data Vulnerability Factor
Authors:
Li Yu
;
Dong Li
;
Mittal Sparsh
;
Vetter Jeffrey S.
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
data protection;
pattern classification;
safety-critical software;
software fault tolerance;
DVF;
application resilience modelling;
data vulnerability factor;
protection mechanism;
representative computational kernel;
vulnerability classification;
Algorithm design and analysis;
Analytical models;
Computational modeling;
Data models;
Data structures;
Hardware;
Resilience;
34.
NUMARCK: Machine Learning Algorithm for Resiliency and Checkpointing
Authors:
Zhengzhang Chen
;
Seung Woo Son
;
Hendrix William
;
Agrawal Ankit
;
Wei-Keng Liao
;
Choudhary Alok
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
checkpointing;
data analysis;
iterative methods;
learning (artificial intelligence);
parallel processing;
software fault tolerance;
HPC system;
NUMARCK;
Northwestern University machine learning algorithm for resiliency and check pointing;
data analysis;
fault tolerance technique;
high performance computing;
simulation iteration;
Approximation algorithms;
Approximation methods;
Checkpointing;
Computational modeling;
Data models;
Error analysis;
Machine learning algorithms;
35.
Nonblocking Epochs in MPI One-Sided Communication
Authors:
Zounmevo Judicael A.
;
Xin Zhao
;
Balaji Pavan
;
Gropp William
;
Afsahi Ahmad
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
matrix decomposition;
message passing;
MPI one-sided communication;
MPI-2.0;
MPI-3.0;
application pattern;
atomic updates;
communication patterns;
contention avoidance;
epoch-closing routines;
latency issues;
latency propagation issues;
lower-upper matrix decomposition;
matching epochs;
nonRMA communication-related latencies;
nonblocking RMA synchronizations;
nonblocking epochs;
serialization;
synchronization model;
Delays;
Engines;
Hazards;
Proposals;
Semantics;
Standards;
Synchronization;
MPI;
RMA;
latency propagation;
nonblocking synchronizations;
one-sided;
36.
Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster
Authors:
Yamazaki Ichitaro
;
Rajamanickam Sivasankaran
;
Boman Erik G.
;
Hoemmen Mark
;
Heroux Michael A.
;
Tomov Stanimire
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
graphics processing units;
iterative methods;
mathematics computing;
CA techniques;
CAGMRES;
Intel Xeon CPU;
Krylov subspace projection methods;
Nvidia Fermi GPU;
communication-avoiding Krylov method;
domain decomposition preconditioners;
generalized minimum residual method;
hybrid CPU-GPU cluster;
iterative methods;
large-scale linear systems of equations;
Central Processing Unit;
Graphics processing units;
Jacobian matrices;
Kernel;
Linear systems;
Sparse matrices;
Vectors;
37.
Optimization of a Multilevel Checkpoint Model with Uncertain Execution Scales
Authors:
Sheng Di
;
Bautista-Gomez Leonardo
;
Cappello Franck
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
checkpointing;
multiprocessing systems;
numerical analysis;
optimisation;
checkpoint levels;
extreme-scale numerical simulation;
extreme-scale systems;
failure scales;
massive system outages;
multilevel checkpoint model;
optimal checkpoint intervals;
optimization;
processes/cores;
transient uncorrectable memory errors;
uncertain execution scales;
wall-clock length;
Analytical models;
Approximation algorithms;
Computational modeling;
Equations;
Heating;
Mathematical model;
Optimization;
38.
24.77 Pflops on a Gravitational Tree-Code to Simulate the Milky Way Galaxy with 18600 GPUs
Authors:
Bedorf Jeroen
;
Gaburov Evghenii
;
Fujii Michiko S.
;
Nitadori Keigo
;
Ishiyama Tomoaki
;
Zwart Simon Portegies
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
Galaxy;
N-body simulations (astronomical);
graphics processing units;
gravitation;
parallel machines;
tree codes;
GPU;
Milky Way Galaxy model;
Milky Way Galaxy simulation;
N-body gravitational tree-code Bonsai;
Swiss Piz Daint supercomputer;
US ORNL Titan;
bar structure;
graphics processing unit;
long term evolution;
numerical algorithms;
parallel efficiency;
scientific motivation;
spiral arms;
Computational modeling;
Graphics processing units;
Gravity;
Instruction sets;
Supercomputers;
Submitted in the categories: Scalability, Time-to-solution and Peak performance;
39.
A System Software Approach to Proactive Memory-Error Avoidance
Authors:
Costa Carlos H. A.
;
Yoonho Park
;
Rosenburg Bryan S.
;
Chen-Yong Cher
;
Kyung Dong Ryu
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
Linux;
checkpointing;
error correction codes;
firmware;
parallel processing;
storage management;
BG-P system;
CR;
HPC systems;
Linux;
OS-based approach;
checkpoint-restart;
correctable error patterns;
error-correcting codes;
firmware;
memory error pattern analysis;
page migration;
proactive memory management system;
proactive memory-error avoidance;
system software approach;
Algorithm design and analysis;
Correlation;
Error analysis;
Error correction codes;
Memory management;
Monitoring;
Prediction algorithms;
Memory Structures;
Operating Systems;
Reliability, and Fault-Tolerance;
40.
Managing DRAM Latency Divergence in Irregular GPGPU Applications
Authors:
Chatterjee Niladrish
;
O'Connor Mike
;
Loh Gabriel H.
;
Jayasena Nuwan
;
Balasubramonian Rajeev
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
DRAM chips;
graphics processing units;
storage management chips;
DRAM system;
SIMT architecture;
average memory stall latency;
bandwidth utilization;
high bandwidth usage;
independent memory channels;
interleaving requests;
interwarp interference;
irregular GPGPU applications;
latency divergence;
memory controllers;
memory requests;
memory scheduling mechanisms;
memory scheduling policy;
scheduling decisions;
throughput-optimized GPU memory controller;
Bandwidth;
Graphics processing units;
Instruction sets;
Memory management;
Parallel processing;
Random access memory;
41.
Compiler Techniques for Massively Scalable Implicit Task Parallelism
Authors:
Armstrong Timothy G.
;
Wozniak Justin M.
;
Wilde Mark
;
Foster Ian T.
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
concurrency control;
optimising compilers;
parallel programming;
simulated annealing;
tree searching;
Swift/T;
application benchmark;
application code;
asynchronous tasks;
code optimization;
communication overhead reduction;
compiler optimizations;
compiler transformations;
data-driven task parallel execution model;
data-driven task parallelism;
deterministic scripts;
heterogeneous resource;
high-level language;
homogeneous resource;
intermediate representations;
load balancing;
lower-level programming models;
parallel applications;
parallel codes;
scalable implicit task parallelism;
serial codes;
simulated annealing;
unbalanced tree search;
Data models;
Load modeling;
Optimization;
Parallel processing;
Runtime;
Servers;
Synchronization;
42.
Understanding the Effects of Communication and Coordination on Checkpointing at Scale
Authors:
Ferreira Kurt B.
;
Widener Patrick
;
Levy Scott
;
Arnold Dorian
;
Hoefler Torsten
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
checkpointing;
fault tolerant computing;
synchronisation;
anticipated scalability issues;
communication effects;
coordination effects;
critical analysis;
fault-tolerance;
hierarchical uncoordinated checkpointing protocols;
hybrid checkpointing systems;
large-scale systems;
local checkpoint activity;
local node compute time;
message log volume optimization;
process delays;
resilience mechanisms;
simulation-based approach;
synchronization overheads;
system administrators;
Checkpointing;
Computational modeling;
Delays;
Mathematical model;
Protocols;
Resilience;
Synchronization;
43.
Using an Adaptive HPC Runtime System to Reconfigure the Cache Hierarchy
Authors:
Totoni Ehsan
;
Torrellas Josep
;
Kale Laxmikant V.
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
cache storage;
formal languages;
parallel processing;
power aware computing;
HPC codes;
HPC environment;
SPMD model;
adaptive HPC runtime system;
application pattern;
cache energy saving;
cache hierarchy reconfiguration;
cycle-level simulation;
formal language theory;
parallel configuration;
performance penalty;
processor energy;
single program multiple data model;
software-controlled reconfigurable streaming buffer configuration;
software-controlled reconfiguration;
Adaptive systems;
Hardware;
Kernel;
Prefetching;
Runtime;
Sparse matrices;
Supercomputers;
44.
Exploring Automatic, Online Failure Recovery for Scientific Applications at Extreme Scales
Authors:
Gamell Marc
;
Katz Daniel S.
;
Kolla Hemanth
;
Chen Jiann-Jong
;
Klasky Scott
;
Parashar Manish
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
application program interfaces;
checkpointing;
parallel processing;
Fenix;
MPI-based parallel application;
S3D combustion simulation;
application resilience;
automatic data recovery;
check pointing;
exascale vision;
extreme scales;
node failures;
online failure recovery;
process-node-blade-cabinet failure;
scientific application;
Checkpointing;
Combustion;
Fault tolerance;
Fault tolerant systems;
Peer-to-peer computing;
Runtime;
Synchronization;
45.
Physics-Based Urban Earthquake Simulation Enhanced by 10.7 BlnDOF × 30 K Time-Step Unstructured FE Non-Linear Seismic Wave Simulation
Authors:
Ichimura T.
;
Fujita Kinya
;
Tanaka Shoji
;
Hori Muneo
;
Lalith Maddegedara
;
Shizawa Yoshihisa
;
Kobayashi Hideo
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
application program interfaces;
buildings (structures);
earthquake engineering;
finite element analysis;
geophysics computing;
message passing;
seismic waves;
seismology;
10.7 BlnDOF problem;
30 K time steps;
3D finite-element;
3D nonlinear ground motion;
FE nonlinear seismic wave simulation;
GAMERA;
MPI-OpenMP hybrid seismic wave amplification simulation code;
building structures;
physics-based urban earthquake simulation;
stochastic response;
urban earthquake response analysis;
Analytical models;
Computational modeling;
Computers;
Earthquakes;
Finite element analysis;
Mathematical model;
Seismic waves;
Time-to-solution;
scalability;
unstructured FEM;
urban earthquake simulation;
46.
The Lightweight Distributed Metric Service: A Scalable Infrastructure for Continuous Monitoring of Large Scale Computing Systems and Applications
Authors:
Agelastos Anthony
;
Allan Benjamin
;
Brandt Jim
;
Cassella Paul
;
Enos Jeremy
;
Fullop Joshi
;
Gentile Ann
;
Monk Steve
;
Naksinehaboon Nichamon
;
Ogden Jeff
;
Rajan Mahesh
;
Showerman Michael
;
Stevenson Joel
;
Taerat Narate
;
Tucker Tom
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
parallel processing;
resource allocation;
software metrics;
computing system monitoring;
high performance computing platform;
lightweight distributed metric service;
resource utilization;
Bandwidth;
Instruction sets;
Measurement;
Memory management;
Monitoring;
Resource management;
Sockets;
resource management;
resource monitoring;
47.
Best Practices and Lessons Learned from Deploying and Operating Large-Scale Data-Centric Parallel File Systems
Authors:
Oral Sarp
;
Simmons Jeff
;
Hill Jason
;
Leverman Dustin
;
Feiyi Wang
;
Ezell Matt
;
Miller Ross
;
Fuller Douglas
;
Gunasekaran Raghul
;
Youngjae Kim
;
Gupta Swastik
;
Tiwari Devesh
;
Vazhkudai Sudharshan S.
;
Rogers James H.
;
Dillow David
;
Shipman Galen M.
;
Bland Arthur S.
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
parallel processing;
software engineering;
storage management;
HPC;
PFS;
data-centric parallel file system;
file system software development;
storage system design;
technology evaluation;
Bandwidth;
Benchmark testing;
Computational modeling;
Data models;
Procurement;
Servers;
System performance;
48.
MSL: A Synthesis Enabled Language for Distributed Implementations
Authors:
Zhilei Xu
;
Kamil Shoaib
;
Solar-Lezama Armando
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
C language;
distributed memory systems;
operating system kernels;
parallel programming;
program compilers;
program debugging;
storage management;
C-like language;
Fortran;
MSL language;
NAS parallel benchmark;
array manipulation;
automated bug-finding technology;
bulk-synchronous distributed memory kernels;
bulk-synchronous parallelism;
code generation;
computational kernels;
distributed implementation;
generative programming;
high level notations;
low level C code;
nontrivial distributed kernels;
semantic analysis;
software synthesis;
synthesis enabled language;
synthesis features;
Arrays;
Benchmark testing;
Generators;
Kernel;
Programming;
Semantics;
Synthesizers;
49.
Lattice QCD with Domain Decomposition on Intel® Xeon Phi Co-Processors
Authors:
Heybrock Simon
;
Joo Balint
;
Kalamkar Dhiraj D.
;
Smelyanskiy Mikhail
;
Vaidyanathan Karthikeyan
;
Wettig Tilo
;
Dubey Pradeep
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
coprocessors;
data handling;
iterative methods;
lattice theory;
multiprocessing systems;
physics computing;
quantum chromodynamics;
Intel Xeon Phi coprocessors;
KNC cluster;
alternative solver algorithm;
close-to-linear on-chip scaling;
data movement;
domain decomposition;
extreme-scale architecture;
iterative solvers;
lattice QCD;
lattice quantum chromodynamics;
multinode domain-decomposition solver;
Gold;
Jacobian matrices;
Lattices;
Layout;
Linear systems;
Prefetching;
Vectors;
Domain decomposition;
Intel® Xeon Phi coprocessor;
Lattice QCD;
Categories and subject descriptors: D.3.4 Programming Languages: Processors Optimization;
G.1.3 Numerical Analysis: Numerical Linear Algebra Sparse, structured, and very la;
50.
Petascale High Order Dynamic Rupture Earthquake Simulations on Heterogeneous Supercomputers
Authors:
Heinecke Alexander
;
Breuer Alexander
;
Rettenberger Sebastian
;
Bader Michael
;
Gabriel Alice-Agnes
;
Pelties Christian
;
Bode Arndt
;
Barth William
;
Liao Xiang-Ke
;
Vaidyanathan Karthikeyan
;
Smelyanskiy Mikhail
;
Dubey Pradeep
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
Galerkin method;
computational geometry;
earthquake engineering;
earthquakes;
fracture;
geophysics computing;
mainframes;
mesh generation;
parallel machines;
parallel processing;
seismic waves;
seismology;
wave propagation;
ADER-DG software;
DP-PFLOPS;
Intel Xeon Phi coprocessor platforms;
Landers earthquake;
SeisSol software;
Tianhe-2 supercomputer;
arbitrary high-order derivative discontinuous Galerkin software;
architecture-aware optimizations;
civil engineering;
complicated geometries;
compute-communication overlapping scheme shadowing;
earthquake faulting;
earthquake model;
end-to-end optimization;
full-frictional sliding;
ground motion;
heterogeneous solver structure;
heterogeneous supercomputers;
multiphysics computations;
near-optimal weak scaling;
performance model;
petascale high-order dynamic rupture earthquake simulations;
realistic geological models;
rupture evolution;
seismic wave propagation;
unstructured meshes;
Computational modeling;
Earthquakes;
Jacobian matrices;
Kernel;
Optimization;
Seismic waves;
Stress;
ADER-DG;
SeisSol;
dynamic rupture;
earthquake simulation;
heterogeneous supercomputers;
hybrid parallelization;
petascale performance;
51.
Oil and Water Can Mix: An Integration of Polyhedral and AST-Based Transformations
Authors:
Shirako Jun
;
Pouchet Louis-Noel
;
Sarkar Vivek
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
optimising compilers;
parallelising compilers;
AST-based transformations;
IBM Power7 multicore processor;
Intel Nehalem multicore processor;
PoCC polyhedral compiler;
abstract optimization;
coarse-grain parallelism;
complex loop structures;
data locality;
data reuse maximization;
data reusing;
fine-grain parallelism;
high-performance code generation;
loop transformations;
multicore machines;
optimization flow;
optimizing compilers;
pipeline-parallelism;
polyhedral compilation model;
polyhedral framework;
polyhedral-based transformations;
program restructuring;
seamless composition handling;
short-vector SIMD performance;
syntactic-based transformation;
transformation stages;
unroll-and-jam;
Arrays;
Data models;
Nickel;
Optimization;
Parallel processing;
Schedules;
Silicon;
52.
Parallel Bayesian Network Structure Learning for Genome-Scale Gene Networks
Authors:
Misra Sudip
;
Vasimuddin Md
;
Pamnany Kiran
;
Chockalingam Sriram P.
;
Yong Dong
;
Min Xie
;
Aluru Maneesha R.
;
Aluru Srinivas
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
belief networks;
biology computing;
genetic algorithms;
genomics;
learning (artificial intelligence);
parallel algorithms;
NP-hard;
Tianhe-2 supercomputer;
algorithmic innovation;
gene expression value;
genome-scale gene networks;
heuristic algorithm;
learning Bayesian networks;
model plant Arabidopsis thaliana;
modeling capability;
parallel Bayesian network structure learning;
parallel algorithm;
scaling efficiency;
single thread performance;
vectorization technique;
Bayes methods;
Bioinformatics;
Coprocessors;
Genomics;
Hypercubes;
Instruction sets;
Vectors;
Bayesian networks;
gene networks;
parallel machine learning;
systems biology;
53.
Structure Slicing: Extending Logical Regions with Fields
Authors:
Bauer Matthias
;
Treichler Sean
;
Slaughter Elliott
;
Aiken Alex
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
FORTRAN;
combustion;
data models;
digital simulation;
mainframes;
parallel machines;
production engineering computing;
Legion;
OpenACC code;
S3D;
data layout;
data movement;
data placement;
data usage specification;
field noninterference;
logical region data model;
logical region extension;
mainstream programming systems;
production combustion simulation;
structure slicing;
supercomputers;
task parallelism;
vectorized CPU-only Fortran implementation;
Arrays;
Indexes;
Layout;
Optimization;
Parallel processing;
Program processors;
Runtime;
54.
Efficient Sparse Matrix-Vector Multiplication on GPUs Using the CSR Storage Format
Authors:
Greathouse Joseph L.
;
Daga Mayank
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
graphics processing units;
mathematics computing;
matrix multiplication;
parallel processing;
sparse matrices;
CSR storage format;
CSR-adaptive;
CSR-based SpMV;
DRAM;
clSpMV cocktail;
compressed sparse row;
graphics processing units;
local scratchpad memory;
parallel GPU compute unit;
sparse matrix-vector multiplication;
streaming data;
Bandwidth;
Graphics processing units;
Heuristic algorithms;
Instruction sets;
Random access memory;
Sparse matrices;
Vectors;
AMD;
Sparse matrix-vector multiplication (SpMV);
compressed sparse row (CSR);
general purpose computation on graphics processing units (GPGPU);
performance acceleration;
55.
Maximizing Throughput of Overprovisioned HPC Data Centers Under a Strict Power Budget
Authors:
Sarood Osman
;
Langer Akhil
;
Gupta Arpan
;
Kale Laxmikant
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
computer centres;
mainframes;
parallel machines;
power consumption;
resource allocation;
scheduling;
SLURM scheduling policy;
adaptive runtime system;
hardware facilitated capability;
network interconnects;
node allocation;
online resource manager;
overprovisioned HPC data centers;
performance modeling scheme;
power 4.75 MW;
power allocation;
power consumption;
power efficient computers;
power response characteristics;
resource allocation decisions;
software-based online resource management system;
strict power budget;
supercomputers;
throughput maximization;
Linear programming;
Mathematical model;
Parallel processing;
Power demand;
Resource management;
Throughput;
Time-frequency analysis;
56.
Fail-in-Place Network Design: Interaction Between Topology, Routing Algorithm and Failures
Authors:
Domke Jens
;
Hoefler Torsten
;
Matsuoka Shingo
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
computer network management;
computer network performance evaluation;
computer network reliability;
telecommunication network routing;
telecommunication network topology;
broken network element;
fail-in-place characteristic;
fail-in-place network design;
fail-in-place strategy;
failure cost;
failure rate;
faulty network component;
high performance computer;
network failure simulation tool chain;
network performance;
performance degradation;
real-world HPC system;
routing algorithm;
system designer;
system downtime;
system lifetime;
topology;
Algorithm design and analysis;
Degradation;
Hardware;
Network topology;
Routing;
Throughput;
Topology;
Network design;
availability;
fail-in-place;
fault tolerance;
network management;
network simulations;
routing protocols;
57.
Scaling the Power Wall: A Path to Exascale
Authors:
Villa Oreste
;
Johnson Daniel R.
;
O'Connor Mike
;
Bolotin Evgeny
;
Nellans David
;
Luitjens Justin
;
Sakharnykh Nikolai
;
Peng Wang
;
Micikevicius Paulius
;
Scudiero Anthony
;
Keckler Stephen W.
;
Dally William J.
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
multiprocessing systems;
parallel machines;
performance evaluation;
power aware computing;
ExaFlops;
HPC application;
energy efficiency improvement;
exascale system;
performance projection;
supercomputer development;
Bandwidth;
Computer architecture;
Graphics processing units;
Instruction sets;
Kernel;
Registers;
Supercomputers;
58.
FAST: Near Real-Time Searchable Data Analytics for the Cloud
Authors:
Yu Hua
;
Hong Jiang
;
Dan Feng
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
cloud computing;
file organisation;
FAST;
cloud storage system;
correlation-aware hashing;
real-time searchable data analytics;
semantic correlation;
Complexity theory;
Correlation;
Data analysis;
Feature extraction;
Real-time systems;
Semantics;
Vectors;
Cloud storage;
data analytics;
real-time performance;
semantic correlation;
59.
Fast Sparse Matrix-Vector Multiplication on GPUs for Graph Applications
Authors:
Ashari Arash
;
Sedaghati Naser
;
Eisenlohr John
;
Parthasarathy Srinivasan
;
Sadayappan P.
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
graph theory;
graphics processing units;
mathematics computing;
matrix multiplication;
parallel architectures;
ACSR;
CSR format;
CUDA implementation;
GPUs;
NVIDIA CUSP libraries;
SpMV approach;
adaptive SpMV algorithm;
compressed sparse row;
computational kernel;
cuSPARSE libraries;
dynamic graphs;
dynamic parallelism;
fast sparse matrix-vector multiplication;
graph applications;
iterative invocations;
power-law graphs;
thread divergence;
Heuristic algorithms;
Instruction sets;
Kernel;
Parallel processing;
Sparse matrices;
Standards;
Vectors;
ACSR;
CSR;
GPU;
HYB;
SpMV;
60.
Two-Choice Randomized Dynamic I/O Scheduler for Object Storage Systems
Authors:
Dong Dai
;
Yong Chen
;
Kimpe Dries
;
Ross Robert
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
dynamic scheduling;
input-output programs;
parallel processing;
storage management;
HPC cluster;
collaborative probe;
dynamic I/O scheduling;
exascale high-performance computing platform;
high burst-write throughput;
metadata maintainer;
metadata management;
object storage systems;
redirect table;
storage servers;
stragglers;
two-choice randomized dynamic I/O scheduler;
Collaboration;
Dynamic scheduling;
Probes;
Servers;
Synchronization;
Throughput;
Time factors;
61.
Parallel Programming with Migratable Objects: Charm++ in Practice
Authors:
Acun Bilge
;
Gupta Arpan
;
Jain Nikhil
;
Langer Akhil
;
Menon Harshitha
;
Mikida Eric
;
Xiang Ni
;
Robson Michael
;
Yanhua Sun
;
Totoni Ehsan
;
Wesolowski Lukasz
;
Kale Laxmikant
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
mainframes;
parallel machines;
parallel programming;
Blue Gene/Q;
CHARM++ parallel programming framework;
Cray XE6;
Stampede;
migratable objects;
petascale computing;
rough landscape;
runtime system introspection;
scalable parallel application programming;
science and engineering applications;
supercomputing technology;
Checkpointing;
Computational modeling;
Control systems;
Load management;
Program processors;
Programming;
Runtime;
62.
Enabling Efficient Multithreaded MPI Communication through a Library-Based Implementation of MPI Endpoints
Authors:
Sridharan Sridha
;
Dinan James
;
Kalamkar Dhiraj D.
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
application program interfaces;
message passing;
multi-threading;
multiprocessing systems;
multiprocessor interconnection networks;
software libraries;
MPI endpoints extension;
MPI endpoints interface;
MPI job;
MPI library;
MPI+OpenMP baseline;
high-speed interconnection network;
independent network communication;
lattice QCD Dslash kernel;
library-based design;
library-based implementation;
multiple processor core;
multithreaded MPI application;
multithreaded MPI communication;
performance evaluation;
process-like communication performance;
production MPI implementation;
thread count tradeoff;
threading overhead;
Arrays;
Context;
Kernel;
Libraries;
Message systems;
Parallel programming;
Semantics;
Endpoints;
Hybrid Parallel Programming;
MPI;
63.
Scheduling Multi-tenant Cloud Workloads on Accelerator-Based Systems
Authors:
Sengupta Dipak
;
Goswami Anshuman
;
Schwan Karsten
;
Pallavi Krishna
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
cloud computing;
graphics processing units;
parallel processing;
resource allocation;
scheduling;
GPU scheduling problem;
accelerator-based systems;
data movement engines;
device-level scheduling;
dynamic model;
high end cloud services;
high performance applications;
load balancing;
multitenant cloud workload scheduling;
per-device scheduling;
strings scheduler;
Context;
Graphics processing units;
Processor scheduling;
Runtime;
Servers;
Switches;
Synchronization;
CUDA;
GPU;
Multi-tenancy;
hierarchical scheduling;
runtime systems;
virtualization;
64.
Maximizing Throughput on a Dragonfly Network
Authors:
Jain Nikhil
;
Bhatele Abhinav
;
Xiang Ni
;
Wright N.J.
;
Kale Laxmikant V.
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
multiprocessor interconnection networks;
parallel machines;
telecommunication network routing;
telecommunication network topology;
Petaflop prototype machine;
application communication pattern;
dragonfly topology;
interconnection network;
job placement policy;
routing strategy;
supercomputer;
throughput maximization;
Adaptation models;
Bandwidth;
Network topology;
Predictive models;
Routing;
Throughput;
Topology;
dragonfly networks;
job placement;
modeling;
prediction;
routing;
65.
A Computation- and Communication-Optimal Parallel Direct 3-Body Algorithm
Authors:
Koanantakool Penporn
;
Yelick Katherine
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
parallel algorithms;
resource allocation;
3-body computations;
3-body interactions;
bounded load imbalance;
communication optimality;
communication-optimal parallel direct 3-body algorithm;
computation minimization;
computation-optimal parallel direct 3-body algorithm;
k-body case;
load balancing;
nested loop formulation;
optional replication factor;
particle simulation methods;
Approximation algorithms;
Bandwidth;
Clustering algorithms;
Force;
Heuristic algorithms;
Program processors;
Three-dimensional displays;
communication-avoiding algorithms;
n-body;
parallel algorithms;
particle methods;
66.
Fast Parallel Computation of Longest Common Prefixes
Authors:
Shun Julian
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
computational complexity;
data structures;
parallel algorithms;
40-core shared-memory machine;
Kärkkäinen-Sanders skew algorithm;
LCP array;
artificial strings;
longest common prefix array;
parallel LCP algorithms;
parallel algorithms;
parallel time complexity;
polylogarithmic depth;
sequential LCP algorithms;
suffix arrays;
Algorithm design and analysis;
Arrays;
Bioinformatics;
Heuristic algorithms;
Parallel algorithms;
Phased arrays;
Radiation detectors;
67.
Parallel De Bruijn Graph Construction and Traversal for De Novo Genome Assembly
Authors:
Georganas Evangelos
;
Buluc Aydin
;
Chapman Jarrod
;
Oliker Leonid
;
Rokhsar Daniel
;
Yelick Katherine
Conference:
International Conference for High Performance Computing, Networking, Storage and Analysis
|
2014
Keywords:
Cray computers;
biology computing;
genomics;
graph theory;
input-output programs;
parallel algorithms;
program assemblers;
Cray XC30;
I/O requirement;
Meraculous;
UPC;
data hazard avoidance;
de novo assembler;
de novo genome assembly;
distributed hash table;
fine-grained parallelism;
genomic sequence;
human genome;
k-mer analysis;
memory requirements;
one-sided communication capability;
optimized parallelization;
parallel algorithm;
parallel de Bruijn graph construction and traversal;
production assembler;
scalability property;
unified parallel C;
wheat genome;
Algorithm design and analysis;
Assembly;
Bioinformatics;
Genomics;
Memory management;
Program processors;
Sequential analysis;
68.
Optimizing Data Locality for Fork/Join Programs Using Constrained Work Stealing
Authors: Lifflander Jonathan; Krishnamoorthy Sriram; Kale Laxmikant V.
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
concurrency control;
parallel programming;
processor scheduling;
resource allocation;
tree data structures;
Cilk;
automated approach;
coarse task specification;
coarser tasks;
computation phases;
concurrency;
constrained work stealing;
constrained work-stealing algorithms;
constraint scheduler actions;
data locality improvement;
data locality maximization;
data locality optimization;
dynamic coarsening;
fine-grained task adaptability;
fork/join program scheduling;
locality-optimized load balancing;
performance improvements;
sequential overhead improvement;
spatial locality improvement;
steal operation scheduling;
steal tree construction;
user-specified approach;
work-stealing scheduling construction;
Heuristic algorithms;
Kernel;
Optimization;
Parallel processing;
Runtime;
Schedules;
Synchronization;
cilk;
data locality;
fork/join;
task granularity;
69.
Microbank: Architecting Through-Silicon Interposer-Based Main Memory Systems
Authors: Young Hoon Son; Seongil O.; Hyunggyun Yang; Daejin Jung; Jung Ho Ahn; Kim Jung-Ho; Jangwoo Kim; Lee Jae W.
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
DRAM chips;
storage management;
μbank;
DDR3-based memory system;
DRAM page-management policy;
I/O energy efficiency;
IPC;
TSI-based memory system;
bank-level parallelism;
energy consumption;
main memory system architecture;
memory-intensive SPEC 2006 benchmark;
microbank;
prediction-based DRAM;
through-silicon interposer;
Bandwidth;
Data transfer;
Decoding;
Random access memory;
Silicon;
Substrates;
Wires;
70.
Scalable and High Performance Betweenness Centrality on the GPU
Authors: McLaughlin Adam; Bader David A.
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
complex networks;
graph theory;
graphics processing units;
parallel processing;
GPU algorithm;
GTEPS;
computational cost;
graph traversals;
high-diameter graphs;
hybrid GPU implementations;
local data structures;
scalable high performance betweenness centrality;
scale-free graphs;
Algorithm design and analysis;
Arrays;
Graphics processing units;
Instruction sets;
Parallel processing;
Scalability;
GPUs;
Graph Algorithms;
Parallel Algorithms;
71.
Scalable Computation of Stream Surfaces on Large Scale Vector Fields
Authors: Kewei Lu; Han-Wei Shen; Peterka Tom
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
data analysis;
data visualisation;
parallel processing;
resource allocation;
HPC system;
flow field visualization;
large scale vector field;
parallel stream surface algorithm;
seeding curve segment;
workload balancing;
Distributed databases;
Heuristic algorithms;
Load management;
Partitioning algorithms;
Runtime;
Surface treatment;
Vectors;
Algorithms;
Dynamic load balancing;
Flow Visualization;
Parallel stream surface;
72.
In-Situ Feature Extraction of Large Scale Combustion Simulations Using Segmented Merge Trees
Authors: Landge Aaditya G.; Pascucci V.; Gyulassy Attila; Bennett Janine C.; Kolla Hemanth; Chen Jiann-Jong; Bremer Peer-Timo
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
approximation theory;
data structures;
feature extraction;
image segmentation;
trees (mathematics);
combustion simulations;
feature-based analysis;
in-situ analysis techniques;
in-situ feature extraction;
leadership class supercomputers;
local approximation;
merge tree segmentation;
reduced data representations;
scientific simulations;
simulation runtime;
system I/O constraints;
threshold based features;
Algorithm design and analysis;
Analytical models;
Bismuth;
Combustion;
Computational modeling;
Feature extraction;
Program processors;
feature extraction;
in situ analysis;
merge tree computation;
segmented merge tree;
topological data analysis;
73.
IndexFS: Scaling File System Metadata Performance with Stateless Caching and Bulk Insertion
Authors: Kai Ren; Qing Zheng; Patil Swapnil; Gibson Garth
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
cache storage;
checkpointing;
meta data;
middleware;
HDFS;
IndexFS;
Lustre;
N-N checkpointing;
PVFS;
bulk insertion;
bulk namespace insertion;
client-based storm free caching techniques;
creation intensive workloads;
disk locality;
distributed file systems;
file system metadata performance scaling;
high-performance operations;
hot spot mitigation;
log-structured layout optimization;
metadata scalability;
middleware design;
namespace partitioning;
out-of-core metadata throughput;
per-directory basis;
preserving server;
stateless caching;
stateless consistent metadata caching;
storage systems;
table-based architecture;
Compaction;
Indexes;
Middleware;
Receivers;
Scalability;
Servers;
Throughput;
Distributed file systems;
bulk insertion;
file system metadata;
log-structured merge tree;
stateless caching;
74.
Pardicle: Parallel Approximate Density-Based Clustering
Authors: Patwary M. Mostofa Ali; Satish Nadathur; Sundaram Narayanan; Manne Fredrik; Habib Salman; Dubey Pradeep
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
approximation theory;
computational complexity;
distributed shared memory systems;
pattern clustering;
resource allocation;
sampling methods;
Intel Xeon Phi coprocessor;
approximate algorithm;
arbitrarily-shaped cluster;
astrophysics;
density based sampling;
density-based clustering algorithm;
distributed memory;
dynamic partitioning;
exact algorithm;
heuristic algorithm;
load balancing;
load locality;
multinode computer;
near-linear speedup;
noise data filter;
parallel DBSCAN algorithm;
parallel approximate density-based clustering;
pardicle;
particle data;
performance improvement;
shared memory;
single node Intel Xeon processor;
synthetic massive datasets;
Approximation algorithms;
Approximation methods;
Clustering algorithms;
Data structures;
Heuristic algorithms;
Instruction sets;
Partitioning algorithms;
Density based clustering;
Disjoint-set data structure;
Union-Find algorithm;
approximate clustering algorithm;
75.
Scalable Kernel Fusion for Memory-Bound GPU Applications
Authors: Wahib Mohamed; Maruyama Naoya
Conference: (not listed), 2014
Keywords:
cache storage;
finite difference methods;
graphics processing units;
parallel processing;
performance evaluation;
HPC applications;
codeless performance upper-bound projection model;
data arrays;
data dependencies;
data traffic;
finite difference methods;
kernel precedences;
memory-bound GPU applications;
memory-bound kernels;
off-chip memory;
on-chip cache;
optimal kernel fusions;
scalable kernel fusion;
Arrays;
Graphics processing units;
Instruction sets;
Kernel;
Meteorology;
Optimization;
System-on-chip;
76.
Practical Symbolic Race Checking of GPU Programs
Authors: Peng Li; Guodong Li; Gopalakrishnan Ganesh
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
C++ language;
graphics processing units;
parallel architectures;
program diagnostics;
C++ CUDA program;
CUDA kernel;
GPU program;
Lonestar;
Parboil;
SESA;
static analysis;
symbolic execution;
symbolic race checking;
thread-ID based decision;
Concrete;
Graphics processing units;
History;
Indexes;
Instruction sets;
Kernel;
Schedules;
CUDA;
Data Flow Analysis;
Formal Verification;
GPU;
Parallelism;
Symbolic Execution;
Taint Analysis;
Virtual Machine;
77.
Fault-Tolerant Dynamic Task Graph Scheduling
Authors: Kurt Mehmet Can; Krishnamoorthy Sriram; Agrawal Kunal; Agrawal Gagan
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
fault tolerant computing;
graph theory;
meta data;
parallel processing;
scheduling;
task analysis;
asymptotically optimal fault tolerant execution;
fault tolerant design;
fault-tolerant dynamic task graph scheduling;
localized task recovery;
metadata;
space overheads;
successor-predecessor relationships;
task graph structure;
time overheads;
work stealing-based task scheduling algorithm;
Arrays;
Dynamic scheduling;
Fault tolerance;
Fault tolerant systems;
Instruction sets;
Radiation detectors;
Scheduling algorithms;
cilk;
dag;
fault tolerance;
task graphs;
work stealing;
78.
A Unified Programming Model for Intra- and Inter-Node Offloading on Xeon Phi Clusters
Authors: Noack Marko; Wende Florian; Steinke Thomas; Cordes Frank
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
application program interfaces;
coprocessors;
message passing;
parallel machines;
HAM-offload;
Intel LEO;
MPI+X;
OpenMP 4.0;
Xeon Phi clusters;
compute node;
coprocessors;
heterogeneous active messages;
hybrid programming approaches;
inter-node offloading;
intra-node offloading;
molecular dynamics;
scaling applications;
scaling behavior;
standard offload programming models;
supercomputer;
unified offload API;
unified programming model;
Computational modeling;
Coprocessors;
Data transfer;
Libraries;
Low earth orbit satellites;
Performance evaluation;
Programming;
79.
A Communication-Optimal Framework for Contracting Distributed Tensors
Authors: Rajbhandari Sujan; Nikam Akshay; Pai-Wei Lai; Stock Kevin; Krishnamoorthy Sriram; Sadayappan P.
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
mathematics computing;
matrix multiplication;
parallel machines;
tensors;
Blue Gene/Q supercomputer;
communication-optimal framework;
distributed tensor contraction algorithm;
matrix multiplication operation;
torus network;
Chemistry;
Distributed databases;
Indexes;
Memory management;
Scalability;
Tensile stress;
Three-dimensional displays;
80.
A Study on Balancing Parallelism, Data Locality, and Recomputation in Existing PDE Solvers
Authors: Olschanowsky Catherine; Strout Michelle Mills; Guzik Stephen; Loffeld John; Hittinger Jeffrey
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
grid computing;
multiprocessing systems;
optimisation;
parallel processing;
partial differential equations;
CFD codes;
Chombo framework;
NUMA multicore nodes;
PDE solvers;
communication-avoiding variants;
computational fluid dynamic codes;
data locality;
inter-loop optimization strategies;
multicore systems;
node parallelization schemes;
parallel scaling;
parallelism balancing;
partial differential equation;
program idioms;
structured-grid PDE solver frameworks;
Computational fluid dynamics;
Equations;
Kernel;
Multicore processing;
Optimization;
Parallel processing;
Schedules;
81.
ECC Parity: A Technique for Efficient Memory Error Resilience for Multi-Channel Memory Systems
Authors: Xun Jian; Kumar Ravindra
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
DRAM chips;
error correction codes;
parallel processing;
parity check codes;
storage management;
DIMM-kill correct ECC;
ECC capacity overhead reduction;
ECC correction bits;
ECC parity;
HPC system;
availability requirement;
bitwise parity;
commercial chip kill correct ECC;
fault-free memory region;
faulty memory region;
memory channel;
memory energy efficiency;
memory energy per instruction;
memory error correction code;
memory error resilience;
multichannel memory system;
reliability requirement;
Circuit faults;
Error correction;
Error correction codes;
Layout;
Memory management;
Optimization;
Resilience;
82.
Finding Constant from Change: Revisiting Network Performance Aware Optimizations on IaaS Clouds
Authors: Yifan Gong; Bingsheng He; Dan Li
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
application program interfaces;
cloud computing;
conjugate gradient methods;
message passing;
principal component analysis;
virtualisation;
Amazon EC2;
CG;
IaaS clouds;
MPI;
N-body;
RPCA;
collective communications;
conjugate gradient;
constant component;
distributed application optimization;
dynamic network performance;
generic topology mapping;
mathematical method;
network performance aware optimizations;
network topology;
pair-wise network performance measurements;
robust principal component analysis;
virtualization;
Bandwidth;
Knowledge engineering;
Network topology;
Optimization;
Sparse matrices;
Topology;
Virtual machining;
Cloud Computing;
Network Performance Aware Optimization;
RPCA;
83.
Anton 2: Raising the Bar for Performance and Programmability in a Special-Purpose Molecular Dynamics Supercomputer
Authors: Shaw David E.; Grossman J.P.; Bank Joseph A.; Batson Brannon; Butts J. Adam; Chao Jack C.; Deneroff Martin M.; Dror Ron O.; Even Amos; Fenton Christopher H.; Forte Anthony; Gagliardo Joseph; Gill Gennette; Greskamp Brian; Ho C. Richard; Ierardi Douglas J.; Iserovich Lev; Kuskin Jeffrey S.; Larson Richard H.; Layman Timothy; Li-Siang Lee; Lerer Adam K.; Li Cong; Killebrew Daniel; Mackenzie Kenneth M.; Mok Shark Yeuk-Hai; Moraes Mark A.; Mueller Richard; Nociolo Lawrence J.; Peticolas Jon L.; Quan Terry; Ramot Daniel; Salmon John K.; Scarpazza Daniele P.; Schafer U. Ben; Siddique Naseer; Snyder Christopher W.; Spengler Jochen; Tang Ping Tak Peter; Theobald Michael; Toma Horia; Towles Brian; Vitale Benjamin; Wang Stanley C.; Young Cliff
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
digital simulation;
parallel machines;
physics computing;
Anton 2 architecture;
all-atom biomolecular simulations;
fine-grained event-driven operation;
molecular dynamics simulations;
performance improvement;
software-based optimizations;
special-purpose molecular dynamics supercomputer;
Arrays;
Biological system modeling;
Computational modeling;
Flexible printed circuits;
Hardware;
Pipelines;
Random access memory;
84.
Metascalable Quantum Molecular Dynamics Simulations of Hydrogen-on-Demand
Authors: Nomura Ken-Ichi; Kalia Rajiv K.; Nakano Atsuki; Vashishta Priya; Shimamura Kohei; Shimojo Fuyuki; Kunaseth Manaschai; Messina Paul C.; Romero Nichols A.
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
computational complexity;
density functional theory;
divide and conquer methods;
error analysis;
hydrogen production;
molecular dynamics method;
parallel processing;
production engineering computing;
IBM Blue Gene/Q cores;
algorithmic innovations;
computational cost;
divide-conquer-recombine paradigm;
error analyses;
floating-point performance;
global real-space multigrid;
hierarchical band-space-domain decomposition;
hydrogen-on-demand;
lean divide-and-conquer density functional theory algorithm;
local plane-wave bases;
metascalable quantum molecular dynamics simulations;
nanostructural design;
on-demand hydrogen production;
renewable energy technologies;
self-consistent-field iterations;
weak-scaling parallel efficiency;
Computational efficiency;
Computational modeling;
Computers;
Discrete Fourier transforms;
Production;
Quantum mechanics;
Wave functions;
Density functional theory;
Divide-and-conquer;
On-demand hydrogen production;
85.
A User-Friendly Approach for Tuning Parallel File Operations
Authors: McLay Robert; James Doug; Si Liu; Cazes John; Barth William
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
application program interfaces;
file organisation;
input-output programs;
message passing;
parallel processing;
HPC;
I/O bandwidth;
Lustre file system;
parallel MPI write operation tuning;
user-friendly approach;
Arrays;
Bandwidth;
Benchmark testing;
Communities;
Libraries;
Tuning;
Writing;
86.
Optimized Scheduling Strategies for Hybrid Density Functional Theory Electronic Structure Calculations
Authors: Dawson William; Gygi Francois
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, 2014
Keywords:
chemistry computing;
density functional theory;
molecular electronic states;
parallel machines;
resource allocation;
scheduling;
search problems;
Argonne National Laboratory;
Mira Blue Gene/Q computer;
Qbox density functional theory code;
calculation schedule;
chemistry;
data availability;
data distribution;
electronic interaction;
exchange integral;
hybrid DFT simulation;
hybrid density functional theory electronic structure calculation;
liquid water;
load balancing;
load scalability;
materials science application;
metal-water interface;
optimized scheduling strategy;
parallel computation;
partial data-replication;
random search algorithm;
representative simulation;
Computational modeling;
Discrete Fourier transforms;
Load management;
Optimal scheduling;
Processor scheduling;
Schedules;