Metadata Replication with Synchronous OpCodes Writing for Namenode Multiplexing in Hadoop


Abstract

A single Active Namenode (ANN) of the Hadoop Distributed File System (HDFS) becomes a bottleneck when high-throughput read operations are required, as in large-scale data analysis. Recently, various namenode schemes have been proposed to address the ANN bottleneck, including asynchronous checkpointing schemes. Although asynchronous schemes offer high-throughput read operations, they suffer from the stale read problem, in which a read is not guaranteed to return the latest data. In this paper, we propose a novel metadata replication scheme with synchronous OpCodes writing to achieve namenode multiplexing, which avoids the stale read problem. To reduce synchronization overhead, the proposed scheme performs reduced replication only for metadata updates, such as write requests, using quasi byte-level metadata operation codes. We conducted an empirical experiment to verify the effectiveness of the proposed scheme. The results show that our method reduces the average required number of NNs by 50.95% when the number of NNs for read-only operations is 100.
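The idea of synchronous opcode replication can be illustrated with a toy model. The sketch below is not the authors' implementation: the `NameNode` and `MultiplexedNameNodes` classes, the one-byte `W`/`D` opcode tags, and the `path=size` payload are all hypothetical names and encodings chosen for illustration. It only shows why applying each update opcode to every replica before the write returns keeps subsequent reads from being stale.

```python
from dataclasses import dataclass, field


@dataclass
class NameNode:
    """Toy namenode holding only a path -> size metadata map (hypothetical)."""
    namespace: dict = field(default_factory=dict)

    def apply(self, opcode: bytes) -> None:
        # Hypothetical compact encoding: 1-byte operation tag, then a payload.
        op, payload = opcode[:1], opcode[1:].decode()
        if op == b"W":
            # Write/update: payload is "path=size".
            path, size = payload.rsplit("=", 1)
            self.namespace[path] = int(size)
        elif op == b"D":
            # Delete: payload is the path.
            self.namespace.pop(payload, None)
        else:
            raise ValueError(f"unknown opcode {op!r}")


class MultiplexedNameNodes:
    """One active namenode plus read-only replicas, updated synchronously."""

    def __init__(self, n_replicas: int) -> None:
        self.active = NameNode()
        self.replicas = [NameNode() for _ in range(n_replicas)]
        self._next = 0

    def write(self, opcode: bytes) -> None:
        # Synchronous step: the write returns only after every replica has
        # applied the opcode, so no replica can serve a stale read afterwards.
        self.active.apply(opcode)
        for replica in self.replicas:
            replica.apply(opcode)

    def read(self, path: str):
        # Reads are spread round-robin over the replicas (multiplexing).
        replica = self.replicas[self._next]
        self._next = (self._next + 1) % len(self.replicas)
        return replica.namespace.get(path)


cluster = MultiplexedNameNodes(n_replicas=3)
cluster.write(b"W/data/a.txt=128")
# Each of the three reads is served by a different replica.
reads = [cluster.read("/data/a.txt") for _ in range(3)]
```

Only update operations pass through `write`, so read traffic never triggers replication; an asynchronous variant would let `write` return before the loop over replicas finishes, which is where stale reads could appear.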


Bibliographic details

Source
IEEE International IOT, Electronics and Mechatronics Conference | 2021 | pp. 1-7 | 7 pages
Authors
Taeha Kim; Sangyoon Oh
Original format: PDF
Keywords
Multiplexing; Mechatronics; Data analysis; File systems; Conferences; Metadata; Writing


