Towards Efficient Ensemble Hierarchical Clustering with MapReduce-based Clusters Clustering Technique and the Innovative Similarity Criterion

Tian Ping; Shen Huitao; Abolfathi Ahad

Abstract

Today, data plays a fundamental role in our daily lives, and the rapid growth of data production has led to the big data revolution. Managing and analyzing this data, which is often unlabeled, is a major real-world challenge. Clustering is one of the most important branches of data mining; its purpose is to divide data into meaningful subsets called clusters. Hierarchical clustering is an unsupervised learning approach that groups data points with similar properties, and its core concept is the construction and analysis of dendrograms. Over the decades, many clustering algorithms have been developed with different approaches. In this paper, an efficient ensemble hierarchical clustering algorithm is introduced, based on a MapReduce-based clusters clustering technique and an innovative similarity criterion. The main idea of ensemble clustering is to combine the results of different single clustering methods. Ensemble techniques usually produce better results than single methods because they combine multiple learners. Accordingly, aggregating hierarchical clustering methods can be expected to yield higher-quality clusterings. In addition, MapReduce is a programming model for implementing big data applications, and we use it to implement the hierarchical clustering methods. The similarity between samples is calculated through the innovative similarity criterion. The proposed approach has three steps. In the first step, the data are clustered by several single hierarchical clustering methods. In the second step, hyper-clusters are generated by applying the clusters clustering technique. In the third step, the final clusters are formed by allocating samples to the hyper-clusters. The simulation is performed on multiple real-world datasets, and the results show better performance of the proposed approach compared to algorithms such as CHC and RCESCC.
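
The three steps described in the abstract can be illustrated with a minimal, single-machine sketch. It is not the authors' implementation. The abstract does not define the innovative similarity criterion, so Jaccard distance between base clusters is used here as a stand-in. The majority-vote allocation in step 3 is likewise an assumption, the MapReduce distribution is omitted, and the names `agglomerate`, `jaccard_distance`, and `ensemble_cluster` are hypothetical.

```python
from itertools import combinations
from math import dist

# Linkage rules used by the base hierarchical methods.
LINKAGES = {
    "single": min,
    "complete": max,
    "average": lambda ds: sum(ds) / len(ds),
}


def agglomerate(n, distance, linkage, k):
    """Merge n items bottom-up until k groups remain; return a list of index sets."""
    groups = [{i} for i in range(n)]
    link = LINKAGES[linkage]
    while len(groups) > k:
        # Pick the pair of groups with the smallest linkage distance.
        a, b = min(
            combinations(range(len(groups)), 2),
            key=lambda p: link(
                [distance(i, j) for i in groups[p[0]] for j in groups[p[1]]]
            ),
        )
        groups[a] |= groups[b]  # a < b, so deleting b keeps index a valid
        del groups[b]
    return groups


def jaccard_distance(c1, c2):
    """Stand-in criterion (assumption): 1 minus the Jaccard similarity of two clusters."""
    return 1.0 - len(c1 & c2) / len(c1 | c2)


def ensemble_cluster(points, k, linkages=("single", "complete", "average")):
    n = len(points)
    # Step 1: cluster the data with several single hierarchical methods.
    base = []
    for name in linkages:
        base.extend(
            agglomerate(n, lambda i, j: dist(points[i], points[j]), name, k)
        )
    # Step 2: cluster the base clusters themselves into k hyper-clusters.
    hyper = agglomerate(
        len(base), lambda i, j: jaccard_distance(base[i], base[j]), "average", k
    )
    # Step 3: allocate each sample to the hyper-cluster whose member
    # clusters contain it most often (majority vote, an assumption).
    labels = []
    for s in range(n):
        votes = [sum(s in base[c] for c in h) for h in hyper]
        labels.append(votes.index(max(votes)))
    return labels


# Two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(ensemble_cluster(points, k=2))
```

Reusing one agglomerative routine for both the samples and the base clusters keeps the sketch short. In a MapReduce setting, step 1 would presumably run each base method in parallel, but the abstract gives no details on how the work is partitioned.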

Bibliographic details

Source
Journal of Grid Computing | 2022, Issue 4 | 19 pages
Authors
Tian Ping; Shen Huitao; Abolfathi Ahad
Author affiliations

Xuchang Univ;

Islamic Azad Univ;

Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
Hierarchical clustering; Ensemble clustering; MapReduce model; Clusters clustering; Hyper-clusters;
