Journal: 《软件》 (Software)

Design and Implementation of a Distributed ETL Tool

Abstract

Traditional ETL tools execute centrally and place heavy demands on server performance. To address these drawbacks, this paper proposes a distributed ETL system based on the Hadoop computing framework. Built on a distributed file system, the system uses data filters together with Hadoop's parallel processing capability to execute ETL processes across a server cluster. The distributed ETL system offers high scalability and throughput, balances load automatically, and executes efficiently.
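
The abstract does not show the system's code, so the following is only a minimal sketch of the general idea it describes: input split into partitions, a data filter applied to each partition in parallel, and the results merged. It does not use Hadoop itself; a thread pool stands in for the cluster, and every name here (`extract`, `data_filter`, `map_partition`, `reduce_by_key`, `run_etl`) as well as the comma-separated `key,value` input format is an assumption made for illustration, not the paper's design.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def extract(partition):
    """Parse raw 'key,value' lines into (key, value) string pairs."""
    records = []
    for line in partition:
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2:
            continue  # drop malformed lines
        records.append((parts[0], parts[1]))
    return records


def data_filter(record):
    """Hypothetical filter: keep records whose value is a non-negative integer."""
    _, value = record
    return value.isascii() and value.isdigit()


def map_partition(partition):
    """Map phase for one partition: extract, filter, convert values to int."""
    return [(key, int(value))
            for key, value in extract(partition)
            if data_filter((key, value))]


def reduce_by_key(mapped_partitions):
    """Reduce phase: sum the values for each key across all partitions."""
    totals = defaultdict(int)
    for part in mapped_partitions:
        for key, value in part:
            totals[key] += value
    return dict(totals)


def run_etl(partitions, workers=4):
    """Run the map phase on all partitions in parallel, then reduce."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_partition, partitions))
    return reduce_by_key(mapped)
```

Calling `run_etl` with a list of partitions, each a list of text lines, returns a dictionary of per-key totals. In a real Hadoop deployment the partitions would correspond to input splits stored in the distributed file system, and the framework, rather than a local thread pool, would schedule the map and reduce tasks across nodes.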
