Sentiment analysis of coronavirus data with ensemble and machine learning methods

dc.authorscopusid: 57203003458 [en_US]
dc.authorscopusid: 56495320500 [en_US]
dc.contributor.author: Başarslan, M.S.
dc.contributor.author: Kayaalp, F.
dc.date.accessioned: 2024-08-23T16:07:28Z
dc.date.available: 2024-08-23T16:07:28Z
dc.date.issued: 2024 [en_US]
dc.department: Düzce Üniversitesi [en_US]
dc.description.abstract: The coronavirus pandemic has distanced people from social life and increased the use of social media. People's emotions can be determined from text data collected from social media applications, and this kind of sentiment detection is used in many fields, especially commerce. This study aims to predict people's sentiments about the pandemic by applying sentiment analysis to pandemic-related tweets from Twitter using single machine learning classifiers (Decision Tree-DT, K-Nearest Neighbor-KNN, Logistic Regression-LR, Naïve Bayes-NB, Random Forest-RF) and ensemble learning methods (Majority Voting (MV), Probabilistic Voting (PV), and Stacking (STCK)). The tweets were vectorized using two predictive methods, Word2Vec (W2V) and Doc2Vec, and two traditional word representation methods, Term Frequency-Inverse Document Frequency (TF-IDF) and Bag of Words (BOW). Classification models built with single machine learning classifiers were then compared to models built with ensemble learning methods (MV, PV, and STCK) that heterogeneously combine those single classifiers. Accuracy (ACC), F-measure (F), precision (P), and recall (R) were used as performance measures, with training/test splits of 70%-30% and 80%-20%. The ACC of the ensemble learning models ranged from 73% to 89%, while the ACC of the single classifier models ranged from 60% to 80%. Among the ensemble learning methods, STCK with the Doc2Vec text representation/embedding method gave the best ACC result of 89%. According to the experimental results, ensemble models built with heterogeneous machine learning classifiers gave better results than single machine learning classifiers. © Author(s) 2024. [en_US]
dc.identifier.doi: 10.31127/tuje.1352481
dc.identifier.endpage: 185 [en_US]
dc.identifier.issn: 2587-1366
dc.identifier.issue: 2 [en_US]
dc.identifier.scopus: 2-s2.0-85192945148 [en_US]
dc.identifier.scopusquality: N/A [en_US]
dc.identifier.startpage: 175 [en_US]
dc.identifier.trdizinid: 1232985 [en_US]
dc.identifier.uri: https://doi.org/10.31127/tuje.1352481
dc.identifier.uri: https://search.trdizin.gov.tr/tr/yayin/detay/1232985
dc.identifier.uri: https://hdl.handle.net/20.500.12684/14662
dc.identifier.volume: 8 [en_US]
dc.indekslendigikaynak: Scopus [en_US]
dc.indekslendigikaynak: TR-Dizin [en_US]
dc.language.iso: en [en_US]
dc.publisher: Murat Yakar [en_US]
dc.relation.ispartof: Turkish Journal of Engineering [en_US]
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Ensemble learning [en_US]
dc.subject: Machine learning [en_US]
dc.subject: Sentiment analysis [en_US]
dc.subject: Text representation [en_US]
dc.subject: Word embedding [en_US]
dc.title: Sentiment analysis of coronavirus data with ensemble and machine learning methods [en_US]
dc.type: Article [en_US]
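
The abstract contrasts Majority Voting (MV) with Probabilistic Voting (PV) as ways of combining heterogeneous classifiers. The sketch below is not the authors' implementation. It is a minimal pure-Python illustration, assuming that MV means taking the most frequent predicted label and that PV corresponds to what is commonly called soft voting (averaging per-class probabilities). The function names, the label names, and the probability values are all hypothetical and chosen only to show that the two rules can disagree on the same tweet. Stacking is not shown.

```python
from collections import Counter


def majority_vote(labels):
    """Hard voting: return the label predicted by the most classifiers.

    Ties go to the label that appears first in `labels`.
    """
    return Counter(labels).most_common(1)[0][0]


def probabilistic_vote(probas):
    """Soft voting (assumed meaning of PV): average each class's
    probability over all classifiers and return the class with the
    highest mean probability.
    """
    classes = list(probas[0].keys())
    mean = {c: sum(p[c] for p in probas) / len(probas) for c in classes}
    return max(mean, key=mean.get)


# Hypothetical class probabilities from three base classifiers for one tweet.
probas = [
    {"negative": 0.90, "positive": 0.10},  # confident "negative"
    {"negative": 0.40, "positive": 0.60},  # weak "positive"
    {"negative": 0.45, "positive": 0.55},  # weak "positive"
]

# Each classifier's hard label is its most probable class.
hard_labels = [max(p, key=p.get) for p in probas]

mv_result = majority_vote(hard_labels)
pv_result = probabilistic_vote(probas)
print("MV:", mv_result)
print("PV:", pv_result)
```

In this made-up case two of the three classifiers lean weakly towards "positive", so hard voting picks "positive", while the single confident "negative" prediction dominates the averaged probabilities under soft voting.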

Files

Original bundle
Name: 14662.pdf
Size: 3.97 MB
Format: Adobe Portable Document Format