International Journal of Distributed Sensor Networks

DSMC: A Novel Distributed Store-Retrieve Approach of Internet Data Using MapReduce Model and Community Detection in Big Data

Abstract

Big data processing is an active topic in scientific research. Internet data is both very large and very valuable to researchers, so capturing and storing it is a high priority. Traditional single-host web spiders and data stores suffer from low efficiency and large memory requirements. This paper therefore proposes DSMC (distributed store-retrieve approach using MapReduce model and community detection), a big data store-retrieve approach based on distributed processing. First, a distributed capture method is presented that uses MapReduce to deduplicate big data. Second, a storage optimization method is put forward; it uses lightweight hash functions and community detection to address the storage structure and solve the data retrieval problem. DSMC achieves high performance when comparing and storing large volumes of web data while also providing efficient data retrieval. Experimental results on the CloudSim platform show that DSMC achieves better efficiency and performance than a traditional web spider.
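The abstract does not give the details of the MapReduce deduplication step. The following is only a minimal single-process sketch of the general idea, assuming pages are deduplicated by a hash of their content. The function names (`map_phase`, `shuffle`, `reduce_phase`, `deduplicate`), the choice of SHA-1, and the rule of keeping the smallest URL are all illustrative assumptions, not the paper's method.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Page = Tuple[str, str]  # (url, content); a hypothetical record layout


def map_phase(pages: Iterable[Page]) -> Iterator[Tuple[str, Page]]:
    """Emit (content_hash, page) so pages with identical content share a key."""
    for url, content in pages:
        key = hashlib.sha1(content.encode("utf-8")).hexdigest()
        yield key, (url, content)


def shuffle(pairs: Iterable[Tuple[str, Page]]) -> Dict[str, List[Page]]:
    """Group mapped pairs by key, standing in for the framework's shuffle step."""
    groups: Dict[str, List[Page]] = defaultdict(list)
    for key, page in pairs:
        groups[key].append(page)
    return groups


def reduce_phase(groups: Dict[str, List[Page]]) -> List[Page]:
    """Keep one page per key; the smallest URL is chosen only for determinism."""
    return [min(pages) for _, pages in sorted(groups.items())]


def deduplicate(pages: Iterable[Page]) -> List[Page]:
    """Run map, shuffle, and reduce in sequence on an in-memory collection."""
    return reduce_phase(shuffle(map_phase(pages)))
```

In a real MapReduce deployment the map and reduce functions would run on separate workers and the framework would perform the grouping; here `shuffle` only imitates that step in memory.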
