Browse by Author "Basarslan, Muhammet Sinan"
Item: Prediction of Potential Bank Customers: Application on Data Mining (Springer International Publishing Ag, 2020)
Basarslan, Muhammet Sinan; Argun, Irem Duzdar
Banking is an important industry in which financial transactions are carried out to meet everyday needs. Today, banks are used for all kinds of financial transactions. As competition increases, banks aim to acquire new customers through customer satisfaction, and studies on acquiring new customers by analyzing customer data have recently gained importance. As a result, data analysis units have been established in banks, as well as in other customer-focused industries such as insurance and telecommunications. In this study, models are built with classification algorithms to predict potential bank customers on the bank telemarketing dataset from the UCI Machine Learning Repository, and the results are compared. The comparison is intended to support a more detailed and effective data analysis. The classification algorithms used in this study are the C4.5 Decision Tree, Naive Bayes (NB), K-Nearest Neighbors (k-NN), Logistic Regression (LogReg), Random Forest (RanFor), and Adaptive Boosting (AdaBoostM1, Ada). To check that the performance of the classification models is consistent, the data are split into training and test sets by two different methods: K-fold Cross Validation and Holdout. In K-fold cross validation, the training and test data sets are separated with 5-fold and 10-fold cross validation.
In the holdout method, the dataset was divided into training and test sets with training-test ratios of 60-40%, 75-25%, and 80-20%. These splits are evaluated on Accuracy (ACC), Precision (PPV), Sensitivity (TPR), and F-measure (F). The performance results are similar under both splitting methods. According to the Accuracy and F-measure criteria, the Random Forest model gave higher results than the other models, whereas the Naive Bayes algorithm gave the highest results on the precision criterion and the AdaBoostM1 algorithm gave the best results on the sensitivity criterion.

Item: Sentiment analysis using a deep ensemble learning model (Springer, 2023)
Basarslan, Muhammet Sinan; Kayaalp, Fatih
The coronavirus pandemic has kept people away from social life, and this has led to an increase in the use of social media over the past two years. Thanks to social media, people can now instantly share their thoughts on topics such as their favourite movies, restaurants, and hotels. This has created a huge amount of data, and many researchers from different disciplines have focused on analysing it. Natural Language Processing (NLP) is one such area of computer science that uses artificial intelligence technologies. Sentiment analysis is an NLP task based on extracting emotions from large volumes of posts. In this study, sentiment analysis was performed on two datasets: tweets about coronavirus and TripAdvisor hotel reviews. A frequency-based word representation method (Term Frequency-Inverse Document Frequency, TF-IDF) and a prediction-based word embedding method (Word2Vec) were used to vectorise the datasets.
Sentiment analysis models were then built using single machine learning methods (Decision Trees (DT), K-Nearest Neighbour (KNN), Naive Bayes (NB), and Support Vector Machine (SVM)), single deep learning methods (Long Short Term Memory (LSTM) and Recurrent Neural Network (RNN)), and heterogeneous ensemble learning methods (Stacking and Majority Voting) based on these single machine learning and deep learning methods. Accuracy was used as the performance measure. The heterogeneous stacking model (LSTM-RNN) outperformed the other models, with accuracy values of 0.864 on the coronavirus dataset and 0.898 on the TripAdvisor dataset; these results are considered promising when compared to the literature. Combining single methods into an ensemble was observed to give better results, which is consistent with the literature and is a step forward in detecting sentiment in posts. Investigating the performance of heterogeneous ensemble learning models based on different algorithms in sentiment analysis tasks is planned as future work.

Item: Sentiment Analysis with Machine Learning Methods on Social Media (Ediciones Univ Salamanca, 2020)
Basarslan, Muhammet Sinan; Kayaalp, Fatih
Social media has become an important part of everyday life due to the widespread use of the Internet. Among social media services, Twitter is one of the most used around the world. People share their opinions by writing tweets about numerous subjects, such as politics, sports, and the economy. Millions of tweets per day create a huge dataset, which has drawn the attention of data scientists to these data for sentiment analysis. Sentiment analysis aims to identify users' social media posts about a specific topic and categorize them as positive, negative, or neutral. This study investigates the effect of the type of text representation on the performance of sentiment analysis. Two datasets were used in the experiments.
The first consists of user reviews of movies from IMDB, labeled by Kotzias, and the second consists of English-language tweets about health topics from 2019, collected using the Twitter API. The Python programming language was used both to implement the classification models with the Naive Bayes (NB), Support Vector Machines (SVM), and Artificial Neural Networks (ANN) algorithms and to categorize the sentiments as positive, negative, or neutral. Feature extraction from the datasets was performed using the Term Frequency-Inverse Document Frequency (TF-IDF) and Word2Vec (W2V) modeling techniques. The success rates of the classification algorithms were then compared. According to the experimental results, the Artificial Neural Network had the best accuracy on both datasets.
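
For readers unfamiliar with the TF-IDF representation mentioned in the abstracts above, the following is a minimal sketch in Python using only the standard library. The abstracts do not say which TF-IDF variant or library the authors used, so the weighting here (relative term frequency multiplied by the natural log of the inverse document frequency) is an assumption, and the function name `tf_idf` and the sample sentences are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed textbook weighting, not necessarily the one used in the
    listed studies:
        tf  = term count / number of tokens in the document
        idf = ln(number of documents / documents containing the term)
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append(
            {term: (count / total) * math.log(n_docs / df[term])
             for term, count in counts.items()}
        )
    return weights


# Hypothetical sample texts, for illustration only.
docs = ["great movie great cast", "boring movie", "great hotel"]
vectors = tf_idf(docs)
```

In this sketch a term that occurs in every document receives weight 0, while a term confined to one document is weighted most heavily. The resulting dictionaries would still need to be converted to fixed-length vectors before being passed to a classifier.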