Sentiment analysis using a deep ensemble learning model

dc.authorid: BAŞARSLAN, MUHAMMET SİNAN/0000-0002-7996-9169 (en_US)
dc.authorid: Kayaalp, Fatih/0000-0002-8752-3335 (en_US)
dc.authorscopusid: 57203003458 (en_US)
dc.authorscopusid: 56495320500 (en_US)
dc.authorwosid: BAŞARSLAN, MUHAMMET SİNAN/AAH-2116-2020 (en_US)
dc.contributor.author: Basarslan, Muhammet Sinan
dc.contributor.author: Kayaalp, Fatih
dc.date.accessioned: 2024-08-23T16:07:06Z
dc.date.available: 2024-08-23T16:07:06Z
dc.date.issued: 2023 (en_US)
dc.department: Düzce Üniversitesi (en_US)
dc.description.abstract: The coronavirus pandemic kept people away from social life, which has led to an increase in the use of social media over the past two years. Thanks to social media, people can now instantly share their thoughts on topics such as their favourite movies, restaurants and hotels. This has created a huge amount of data, and many researchers from different disciplines have focused on analysing it. Natural Language Processing (NLP) is one area of computer science that uses artificial intelligence technologies, and sentiment analysis is an NLP task that extracts emotions from large volumes of posts. In this study, sentiment analysis was performed on two datasets: tweets about coronavirus and TripAdvisor hotel reviews. A frequency-based word representation method, Term Frequency-Inverse Document Frequency (TF-IDF), and a prediction-based word embedding method, Word2Vec, were used to vectorise the datasets. Sentiment analysis models were then built using single machine learning methods (Decision Trees (DT), K-Nearest Neighbour (KNN), Naive Bayes (NB) and Support Vector Machine (SVM)), single deep learning methods (Long Short-Term Memory (LSTM) and Recurrent Neural Network (RNN)) and heterogeneous ensemble learning methods (Stacking and Majority Voting) based on these single machine learning and deep learning methods. Accuracy was used as the performance measure. The heterogeneous stacking model (LSTM-RNN) outperformed the other models, with accuracy values of 0.864 on the coronavirus dataset and 0.898 on the TripAdvisor dataset; these results are promising when compared with the literature. Combining single methods into an ensemble was observed to give better results, which is consistent with the literature and is a step forward in detecting sentiment from posts. Investigating the performance of heterogeneous ensemble learning models based on different algorithms in sentiment analysis tasks is planned as future work. (en_US)
dc.identifier.doi: 10.1007/s11042-023-17278-6
dc.identifier.issn: 1380-7501
dc.identifier.issn: 1573-7721
dc.identifier.scopus: 2-s2.0-85174249458 (en_US)
dc.identifier.scopusquality: Q1 (en_US)
dc.identifier.uri: https://doi.org/10.1007/s11042-023-17278-6
dc.identifier.uri: https://hdl.handle.net/20.500.12684/14499
dc.identifier.wos: WOS:001142539400018 (en_US)
dc.identifier.wosquality: Q2 (en_US)
dc.indekslendigikaynak: Web of Science (en_US)
dc.indekslendigikaynak: Scopus (en_US)
dc.language.iso: en (en_US)
dc.publisher: Springer (en_US)
dc.relation.ispartof: Multimedia Tools and Applications (en_US)
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member (en_US)
dc.rights: info:eu-repo/semantics/closedAccess (en_US)
dc.subject: Sentiment Analysis (en_US)
dc.subject: Text Representation (en_US)
dc.subject: Word Embedding (en_US)
dc.subject: Ensemble Learning (en_US)
dc.subject: Deep Learning (en_US)
dc.subject: Machine Learning (en_US)
dc.subject: Deep Ensemble Learning (en_US)
dc.subject: Covid-19 (en_US)
dc.title: Sentiment analysis using a deep ensemble learning model (en_US)
dc.type: Article (en_US)
