Parallel Processing of cluster by Map Reduce

Madhavi Vaidya

Abstract

MapReduce is a parallel programming model and an associated implementation introduced by Google. In the programming model, a user specifies the computation with two functions, Map and Reduce. The underlying MapReduce library automatically parallelizes the computation and handles complicated issues such as data distribution, load balancing and fault tolerance. Massive inputs spread across many machines need to be processed in parallel; the library moves the data and provides scheduling and fault tolerance. The original MapReduce implementation by Google, as well as its open-source counterpart, Hadoop, is aimed at parallelizing computation in large clusters of commodity machines. MapReduce has gained great popularity because it achieves fault tolerance gracefully and automatically. It also automatically handles the gathering of results across multiple nodes and returns a single result or result set. This paper gives an overview of the MapReduce programming model and its applications. The author describes the workflow of the MapReduce process and studies some important issues, such as fault tolerance, in more detail. An illustration of how MapReduce works is also given. The data locality issue in heterogeneous environments can noticeably reduce MapReduce performance. In this paper, the author illustrates how data is placed across nodes so that each node has a balanced data-processing load, with the data stored in a parallel manner. Given a data-intensive application running on a Hadoop MapReduce cluster, the author exemplifies how data placement is done in the Hadoop architecture and the role MapReduce plays in that architecture. The amount of data stored in each node to achieve improved data-processing performance is also explained.
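
As a rough illustration of the two-function model described in the abstract, the sketch below counts words in a single process. It is not the paper's code and not the Hadoop API: the names map_words, reduce_counts and run_mapreduce, as well as the sample documents, are assumptions made for this example. A real MapReduce library would run the map and reduce calls on many machines and would handle data movement, scheduling and fault tolerance itself.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(doc_id: str, text: str) -> Iterator[Tuple[str, int]]:
    """Map: emit an intermediate (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce: combine all intermediate values that share the same key."""
    return word, sum(counts)


def run_mapreduce(
    inputs: Dict[str, str],
    mapper: Callable[[str, str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Hypothetical single-process driver standing in for the MapReduce library."""
    # Map phase: apply the mapper to every input record.
    # Shuffle phase: group the intermediate values by key.
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in inputs.items():
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    # Reduce phase: one reducer call per distinct key, gathered into one result set.
    return dict(reducer(k, vs) for k, vs in sorted(groups.items()))


# Sample input invented for this sketch.
documents = {"doc1": "map reduce map", "doc2": "reduce cluster"}
word_counts = run_mapreduce(documents, map_words, reduce_counts)
print(word_counts)  # {'cluster': 1, 'map': 2, 'reduce': 2}
```

The user writes only map_words and reduce_counts; grouping by key and collecting the final result are left to the driver, which mirrors the division of labour the abstract attributes to the MapReduce library.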

Bibliographic record

Source: International Journal of Distributed and Parallel Systems, 2012, Issue 1
Author: Madhavi Vaidya
Original format: PDF
CLC classification: TP311.13

