Benchmarking the Clustering Performances of Evolutionary Algorithms: A Case Study on Varying Data Size

dc.authoridKayaalp, Fatih/0000-0002-8752-3335
dc.contributor.authorKayaalp, F.
dc.contributor.authorErdogmus, P.
dc.date.accessioned2021-12-01T18:47:22Z
dc.date.available2021-12-01T18:47:22Z
dc.date.issued2020
dc.department[To be determined]en_US
dc.description.abstractBackground and objective: Clustering has been a widely used method for data analysis for years, and many clustering algorithms exist. Today it is used in prediction, collaborative filtering, and automatic segmentation systems across different domains. To be broadly used in practice, clustering algorithms need to offer better performance and robustness than those currently in use. In recent years, evolutionary algorithms have been used in many domains because they are robust and easy to implement, and many clustering problems can be solved with such algorithms if the problem is modeled as an optimization problem. In this paper, we present an optimization approach to clustering using four well-known evolutionary algorithms: Biogeography-Based Optimization (BBO), Grey Wolf Optimization (GWO), Genetic Algorithm (GA), and Particle Swarm Optimization (PSO). Method: The objective function is specified to minimize the total distance from cluster centers to the data points, with Euclidean distance used for the distance calculation. We applied this objective function to the four algorithms both to find the most efficient clustering algorithm and to compare the algorithms' clustering performance across different data sizes. To benchmark clustering performance in the experiments, we used datasets of different sizes: small-scale, medium-scale, and big data. Clustering performance was compared against K-means, which has been widely used in the literature for years. Rand Index, Adjusted Rand Index, Mirkin's Index, and Hubert's Index were used as the criteria for evaluating clustering performance.
Result: In clustering experiments over datasets of varying sizes, evaluated according to the specified performance criteria, the GA and GWO algorithms showed better clustering performance than the others. Conclusions: Although the algorithms produced satisfactory clustering results on small- and medium-scale datasets, their clustering performance on big data needs to be improved. (C) 2020 AGBM. Published by Elsevier Masson SAS. All rights reserved.en_US
dc.identifier.doi10.1016/j.irbm.2020.06.002
dc.identifier.endpage275en_US
dc.identifier.issn1959-0318
dc.identifier.issn1876-0988
dc.identifier.issue5en_US
dc.identifier.scopus2-s2.0-85087211302en_US
dc.identifier.scopusqualityQ2en_US
dc.identifier.startpage267en_US
dc.identifier.urihttps://doi.org/10.1016/j.irbm.2020.06.002
dc.identifier.urihttps://hdl.handle.net/20.500.12684/10248
dc.identifier.volume41en_US
dc.identifier.wosWOS:000576259700004en_US
dc.identifier.wosqualityQ4en_US
dc.indekslendigikaynakWeb of Scienceen_US
dc.indekslendigikaynakScopusen_US
dc.language.isoenen_US
dc.publisherElsevier Science Incen_US
dc.relation.ispartofIrbmen_US
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Faculty Memberen_US
dc.rightsinfo:eu-repo/semantics/closedAccessen_US
dc.subjectClusteringen_US
dc.subjectOptimizationen_US
dc.subjectEvolutionary algorithmsen_US
dc.subjectPSOen_US
dc.subjectGWOen_US
dc.subjectBBOen_US
dc.subjectK-meansen_US
dc.subjectGenetic Algorithmen_US
dc.subjectBig Dataen_US
dc.titleBenchmarking the Clustering Performances of Evolutionary Algorithms: A Case Study on Varying Data Sizeen_US
dc.typeArticleen_US
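
The abstract describes an objective function that minimizes the total Euclidean distance from cluster centers to the data points, and it names the Rand Index as one of the evaluation criteria. The following is a minimal sketch of both ideas, not the authors' implementation. It assumes each point is charged to its nearest center (the abstract does not state the assignment rule), and the function names `total_distance` and `rand_index` are hypothetical.

```python
import math
from itertools import combinations


def total_distance(points, centers):
    """Sum of Euclidean distances from each point to its nearest center.

    Assumption: nearest-center assignment; the abstract only says the
    total distance from cluster centers to data points is minimized.
    """
    return sum(min(math.dist(p, c) for c in centers) for p in points)


def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which two labelings agree.

    A pair agrees when both labelings put the two points in the same
    cluster, or both put them in different clusters.
    """
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two points to form a pair")
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    centers = [(0.0, 0.5), (10.0, 0.5)]
    print(total_distance(points, centers))
    print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

An evolutionary optimizer such as GA or PSO would treat the flattened list of center coordinates as a candidate solution and use a function like `total_distance` as its fitness. The Rand Index is computed afterwards against known labels and is unaffected by how the cluster labels are numbered.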

Files

Original bundle
Name:
10248.pdf
Size:
1.76 MB
Format:
Adobe Portable Document Format
Description:
Full Text