Browse by author "Kayaalp, F."
Showing 1 - 8 of 8
Item: Benchmarking the Clustering Performances of Evolutionary Algorithms: A Case Study on Varying Data Size
(Elsevier Science Inc, 2020) Kayaalp, F.; Erdogmus, P.
Background and objective: Clustering has been a widely used method for data analysis for years, and many clustering algorithms exist. Today it is used in many prediction, collaborative filtering, and automatic segmentation systems across different domains. To be broadly used in practice, such clustering algorithms need to offer better performance and robustness than the ones currently in use. In recent years, evolutionary algorithms have been used in many domains since they are robust and easy to implement, and many clustering problems can be solved easily with such algorithms if the problem is modeled as an optimization problem. In this paper, we present an optimization approach for clustering using four well-known evolutionary algorithms: Biogeography-Based Optimization (BBO), Grey Wolf Optimization (GWO), Genetic Algorithm (GA), and Particle Swarm Optimization (PSO).
Method: The objective function has been specified to minimize the total distance from cluster centers to the data points. Euclidean distance is used for distance calculation. We have applied this objective function to the given algorithms both to find the most efficient clustering algorithm and to compare the clustering performances of the algorithms across different data sizes. To benchmark the clustering performances of the algorithms in the experiments, we have used a number of datasets of different sizes: small-scale, medium-scale, and big data. The clustering performances have been compared to K-means, as it has been a widely used clustering algorithm in the literature for years. Rand Index, Adjusted Rand Index, Mirkin's Index, and Hubert's Index have been used as criteria for evaluating the clustering performances.
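As an illustration of the Method paragraph above (not the authors' code), the objective and one of the evaluation indices might be sketched as follows. The function names `total_distance` and `rand_index` are hypothetical, and the sketch assumes that each data point is counted against its nearest cluster center, which the abstract does not state explicitly.

```python
import math


def total_distance(points, centers):
    """Sum of Euclidean distances from each point to its nearest center.

    Assumption: nearest-center assignment; an evolutionary algorithm
    would search over `centers` to minimize this value.
    """
    return sum(min(math.dist(p, c) for c in centers) for p in points)


def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which two labelings agree.

    A pair agrees when both labelings put the two points in the same
    cluster, or both put them in different clusters.
    """
    n = len(labels_true)
    agree = 0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            same_true = labels_true[i] == labels_true[j]
            same_pred = labels_pred[i] == labels_pred[j]
            if same_true == same_pred:
                agree += 1
            pairs += 1
    return agree / pairs


# Hypothetical toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers = [(0.0, 0.0), (10.0, 10.0)]
print(total_distance(points, centers))
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

The Rand Index ignores the cluster label values themselves, so a labeling that swaps the two cluster names still scores as a perfect match.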
Result: In the clustering experiments over datasets of varying sizes, evaluated according to the specified performance criteria, the GA and GWO algorithms showed better clustering performance than the others.
Conclusions: The results of the study showed that although the algorithms achieved satisfactory clustering results on small- and medium-scale datasets, their clustering performance on big data needs to be improved. (C) 2020 AGBM. Published by Elsevier Masson SAS. All rights reserved.

Item: Classification Performance Evaluation on Diagnosis of Breast Cancer
(Springer Science and Business Media Deutschland GmbH, 2021) Basarslan, M. S.; Kayaalp, F.
Cancer, which has many different types such as breast, pleural, and leukemia, is one of the common health problems of today. Most types cause pain, and the treatment processes are challenging. Medical authorities report that diagnosing cancer at an early stage has a positive effect on the success of medical treatment. With the goal of designing a computer-aided breast cancer diagnosis system to support doctors' treatment decisions, the classification performances of six classifiers are investigated in this study. For this purpose, classifier models have been created with machine learning algorithms, namely Support Vector Machine (SVM), Naïve Bayes (NB), and Random Forest (RF), and with deep learning algorithms, namely Recurrent Neural Network (RNN), Gated Recurrent Unit (GRU), and Long Short-Term Memory Network (LSTM). Two open-source breast cancer datasets, Wisconsin and Coimbra, were used in the experiments. Accuracy (Acc), Sensitivity (Sens), Precision (Pre), and F-measure (F) were used as performance criteria. In the tests, Acc values between 85% and 98% were obtained on the Coimbra breast cancer dataset, while Acc values between 92% and 97% were obtained on Wisconsin.
According to the results obtained, the deep learning algorithms (RNN, GRU, and LSTM) are more successful than the machine learning algorithms (SVM, NB, and RF). Among the deep learning algorithms, LSTM is the most successful. © 2021, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Item: Performance evaluation of classification algorithms on diagnosis of breast cancer and skin disease
(Springer, 2021) Sinan Basarslan, M.; Kayaalp, F.
Health is very important for human beings. Thanks to technological developments in both medicine and information technology, the success rates of medical diagnosis and treatment systems are increasing day by day. Cancer is one of the most common causes of death in today's world and is generally diagnosed at a late stage. Cancer has many types, such as breast cancer, skin cancer, and leukemia. Diagnosing cancer at an early stage is very important for the success of medical treatment. The aim of this study was to evaluate the classification performances of some popular algorithms, with the goal of designing an efficient computer-aided breast and/or skin cancer diagnosis system to support doctors and patients. For this purpose, the same machine learning and deep learning algorithms were applied to the immunotherapy dataset and the breast cancer Coimbra dataset from the UCI machine learning data repository. Feature selection by information gain and ReliefF was applied to the datasets before classification in order to increase the efficiency of the classification process. Support Vector Machines (SVM), Random Forest (RF), Recurrent Neural Network (RNN), and Convolutional Neural Network (CNN) algorithms were used in the classification experiments. Accuracy is used as the performance metric. According to these results, RNN showed the best performance among the algorithms, with 92% on both datasets.
This shows that deep learning algorithms, especially RNN, have great potential to diagnose cancer from datasets with high success rates. © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd 2021.

Item: Retraction note: Leakage detection and localization on water transportation pipelines: a multi-label classification approach (Neural Computing and Applications, (2017), 28, 10, (2905-2914), 10.1007/s00521-017-2872-4)
(Springer Science and Business Media Deutschland GmbH, 2024) Kayaalp, F.; Zengin, A.; Kara, R.; Zavrak, S.
The Editor-in-Chief and the publisher have retracted this article. The article was submitted to be part of a guest-edited issue. An investigation by the publisher found a number of articles, including this one, with a number of concerns, including but not limited to compromised editorial handling and peer review process, inappropriate or irrelevant references or not being in scope of the journal or guest-edited issue. Based on the investigation's findings the Editor-in-Chief therefore no longer has confidence in the results and conclusions of this article. Fatih Kayaalp and Sultan Zavrak disagree with this retraction. Ahmet Zengin has not clearly stated whether or not they agree with this retraction. Resul Kara did not respond to correspondence from the publisher about this retraction. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.

Item: Sentiment analysis of coronavirus data with ensemble and machine learning methods
(Murat Yakar, 2024) Başarslan, M.S.; Kayaalp, F.
The coronavirus pandemic has distanced people from social life and increased the use of social media. People's emotions can be determined from text data collected from social media applications. This is used in many fields, especially in commerce.
This study aims to predict people's sentiments about the pandemic by applying sentiment analysis to tweets about the pandemic using single machine learning classifiers (Decision Tree-DT, K-Nearest Neighbor-KNN, Logistic Regression-LR, Naïve Bayes-NB, Random Forest-RF) and ensemble learning methods (Majority Voting (MV), Probabilistic Voting (PV), and Stacking (STCK)). The tweets were vectorized using two predictive methods, Word2Vec (W2V) and Doc2Vec, and two traditional word representation methods, Term Frequency-Inverse Document Frequency (TF-IDF) and Bag of Words (BOW). Classification models built with single machine learning classifiers were then compared to models built with the ensemble learning methods (MV, PV, and STCK), which heterogeneously combine the single classifier algorithms. Accuracy (ACC), F-measure (F), precision (P), and recall (R) were used as performance measures, with training/test splits of 70%-30% and 80%-20%. The ACC of the ensemble learning models ranged from 73% to 89%, while the ACC of the single classifier models ranged from 60% to 80%. Among the ensemble learning methods, STCK with the Doc2Vec text representation/embedding method gave the best ACC result of 89%. According to the experimental results, ensemble models built with heterogeneous machine learning classifier algorithms gave better results than single machine learning classifier algorithms. © Author(s) 2024.

Item: Sentiment analysis with ensemble and machine learning methods in multi-domain datasets
(Murat Yakar, 2023) Başarslan, M.S.; Kayaalp, F.
Comments on websites have become the first place people turn to for opinions about the activities of everyday life. Natural language processing, a sub-branch of artificial intelligence, is the field that deals with interpreting such comments.
Sentiment analysis, a natural language processing task, is carried out to give people an idea from such comments and even to guide them. In this study, sentiment analysis was performed on public user feedback from websites in two different domains. The TripAdvisor dataset includes positive or negative user comments about hotels, and the Rotten Tomatoes dataset includes positive (fresh) or negative (rotten) user comments about films. Sentiment analysis on the datasets was carried out using the Word2Vec word embedding model, which learns vector representations of each word containing the positive or negative meaning of the sentences, and the Term Frequency-Inverse Document Frequency text representation model, with four machine learning methods (Naïve Bayes-NB, Support Vector Machines-SVM, Logistic Regression-LR, K-Nearest Neighbour-kNN) and two ensemble learning methods (Stacking, Majority Voting-MV). Accuracy and F-measure were used as performance metrics in the experiments. According to the results, ensemble learning methods showed better results than single machine learning algorithms. Among the overall approaches, MV outperformed Stacking. © Author(s) 2023.

Item: Sentiment Analysis with Various Deep Learning Models on Movie Reviews
(Institute of Electrical and Electronics Engineers Inc., 2022) Basarslan, M.S.; Kayaalp, F.
Social media have led to the development of artificial intelligence tasks such as sentiment analysis to see whether people's posts have a positive or negative effect on other people. Ideas that affect society directly or indirectly in various domains, such as a movie or a meal, are very important for many business operations. This paper presents a sentiment analysis study carried out with 7 models based on various deep learning methods on the IMDB dataset. The best result, an accuracy of 88.21%, was obtained with the model consisting of 2 Bi-LSTM and 2 dropout layers with an 80%-20% train-test split.
© 2022 IEEE.

Item: TSCBAS: A Novel Correlation Based Attribute Selection Method and Application on Telecommunications Churn Analysis
(Institute of Electrical and Electronics Engineers Inc., 2019) Kayaalp, F.; Başarslan, M. S.; Polat, K.
Attribute selection has a significant effect on the performance of machine learning studies by selecting the attributes that significantly affect the result, reducing the number of attributes, and reducing the calculation cost. In this study, a new attribute selection method called Two-Stage Correlation-Based Attribute Selection (TSCBAS), which combines R-correlation coefficient-based attribute selection (RCBAS) and ?-correlation coefficient-based attribute selection (?CBAS), is proposed to select significant attributes. The proposed attribute selection method has been applied to customer churn prediction on a telecommunications dataset for performance evaluation. The dataset used in the study includes real customer call detail records for the years 2013 and 2014, obtained from a major telecommunications company in Turkey. Apart from the proposed attribute selection method, four different methods, namely R-correlation coefficient-based attribute selection, ?-correlation coefficient-based attribute selection, ReliefF, and Gain Ratio, have been used, creating five datasets in total. After that, four classifier algorithms, Random Forest, C4.5 Decision Tree, Naive Bayes, and AdaBoost.M1, have been applied. The obtained results have been compared according to performance metrics comprising Accuracy (ACC), Sensitivity (TPR), Specificity (SPC), F-measure (F), AUC (area under the ROC curve), and run-time. The results of the comparisons show that the proposed attribute selection algorithm outperforms the state-of-the-art methods on customer churn prediction. © 2018 IEEE.
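
Several of the abstracts above report Accuracy, Sensitivity, Specificity, Precision, and F-measure. For readers unfamiliar with these metrics, the following is a minimal sketch of their standard definitions from binary confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical and do not come from any of the listed studies; the sketch also assumes every denominator is nonzero.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives. Assumes nonzero denominators.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true positive rate (TPR), also recall
    specificity = tn / (tn + fp)   # true negative rate (SPC)
    precision = tp / (tp + fp)
    # F-measure: harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "ACC": accuracy,
        "TPR": sensitivity,
        "SPC": specificity,
        "PRE": precision,
        "F": f_measure,
    }


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters on imbalanced data such as churn records, where a classifier can reach high accuracy by predicting only the majority class.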