Journal: Cluster Computing

Parallel data intensive applications using MapReduce: a data mining case study in biomedical sciences

Abstract

Performance is an open issue in data-intensive applications (e.g., data mining tasks). Parallel and distributed computing systems (e.g., multicore computing, grid computing, cloud computing), along with hybrid programming models (e.g., MapReduce, MPI), are seen as a sought-after solution for accelerating data-intensive applications. One of the main challenges is how to exploit these advanced technologies effectively to facilitate fundamental scientific discoveries, such as those in the biomedical sciences. This paper explores how MapReduce and cloud computing can accelerate data-intensive applications through a real data mining use case in the biomedical sciences. We first adapted the data mining task to the MapReduce model and then deployed it to the cloud. We then built an analytic model based on the MapReduce computations to evaluate the efficiency and performance of the prototype. The results, from both the experiments and the evaluation model, show that performance and scalability can be enhanced through these technologies.