Journal: PeerJ Computer Science

A hybrid multi-objective whale optimization algorithm for analyzing microarray data based on Apache Spark

Abstract

A microarray is a revolutionary tool that generates vast volumes of data describing the expression profiles of the genes under investigation; data at this scale qualifies as Big Data. Hadoop and Spark are efficient frameworks developed to store and analyze Big Data. Analyzing microarray data helps researchers identify correlated genes. Clustering has been applied successfully to microarray data by grouping genes with similar expression profiles into clusters. Because microarray data is complex, clustering methods must employ multiple evaluation functions to obtain high-quality solutions, which turns the clustering problem into a Multi-Objective Problem (MOP). A new and efficient hybrid Multi-Objective Whale Optimization Algorithm with Tabu Search (MOWOATS) has been proposed to solve MOPs. In this article, MOWOATS is applied to the analysis of massive microarray datasets. Three evaluation functions were developed to assess solutions effectively. MOWOATS was adapted to run in parallel using Spark over Hadoop computing clusters. The quality of the generated solutions was evaluated with several indices, including the Silhouette and Davies–Bouldin indices. The obtained clusters were very similar to the original classes. Regarding scalability, the running time was inversely proportional to the number of computing nodes.
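As an illustration of one of the quality indices named in the abstract, the following is a minimal sketch of the standard Davies–Bouldin index in plain Python. It is not the authors' implementation and does not use Spark; the function name `davies_bouldin` and the toy data are assumptions made for this example only. Lower values indicate more compact, better separated clusters.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def davies_bouldin(points, labels):
    """Standard Davies-Bouldin index (hypothetical helper, lower is better).

    points: sequence of equal-length numeric tuples (e.g. expression profiles)
    labels: cluster label for each point
    Assumes no two clusters share the same centroid.
    """
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    if len(clusters) < 2:
        raise ValueError("the index needs at least two clusters")

    centroids, scatter = {}, {}
    for label, members in clusters.items():
        dim = len(members[0])
        centroid = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
        centroids[label] = centroid
        # Scatter: mean distance from cluster members to their centroid.
        scatter[label] = sum(dist(m, centroid) for m in members) / len(members)

    total = 0.0
    for i in clusters:
        # Worst-case similarity between cluster i and any other cluster j.
        total += max(
            (scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Toy data (assumed for illustration): two well separated groups.
toy_points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = davies_bouldin(toy_points, [0, 0, 1, 1])
bad = davies_bouldin(toy_points, [0, 1, 0, 1])
print(good, bad)
```

On this toy data, the labeling that matches the two visible groups scores lower than the labeling that splits them, which is the behavior the index is meant to capture.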