The Battle of Chatbot Giants: An Experimental Comparison of ChatGPT and Bard

dc.contributor.author: Kabakuş, Abdullah Talha
dc.contributor.author: Dogru, İbrahim
dc.date.accessioned: 2025-03-24T19:47:28Z
dc.date.available: 2025-03-24T19:47:28Z
dc.date.issued: 2024
dc.department: Düzce Üniversitesi
dc.description.abstract: Nowadays, it is hard to find a part of human life in which Artificial Intelligence (AI) is not involved. With the recent advances in AI, the change for chatbots has been an ‘evolution’ instead of a ‘revolution’. AI-powered chatbots have become an integral part of customer services as they are as functional as humans (if not more), and they can provide 24/7 service (unlike humans). There are several publicly available, widely used AI-powered chatbots. So, “Which one is better?” is a question that instinctively comes to mind and needs to be answered. Motivated by this question, an experimental comparison of two widely used AI-powered chatbots, namely ChatGPT and Bard, was carried out in this study. For a quantitative comparison, (i) a gold standard QA dataset, which comprised 2,390 questions from 109 topics, was used and (ii) a novel answer-scoring algorithm based on cosine similarity was proposed. The two chatbots were evaluated using the proposed algorithm on the dataset to reveal their (i) generated answer length and (ii) generated answer accuracy. According to the experimental results, (i) Bard generated longer answers than ChatGPT and (ii) Bard provided answers more similar to the ground truth than ChatGPT.
dc.identifier.doi: 10.29137/umagd.1390083
dc.identifier.endpage: 691
dc.identifier.issn: 1308-5514
dc.identifier.issue: 2
dc.identifier.startpage: 679
dc.identifier.uri: https://doi.org/10.29137/umagd.1390083
dc.identifier.uri: https://hdl.handle.net/20.500.12684/18756
dc.identifier.volume: 16
dc.language.iso: en
dc.publisher: Kirikkale University
dc.relation.ispartof: International Journal of Engineering Research and Development
dc.relation.publicationcategory: Article - National Peer-Reviewed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/openAccess
dc.snmz: KA_DergiPark_20250324
dc.subject: chatbot|question answering|artificial intelligence|ChatGPT|Bard|large language model
dc.title: The Battle of Chatbot Giants: An Experimental Comparison of ChatGPT and Bard
dc.title.alternative: The Battle of Chatbot Giants: An Experimental Comparison of ChatGPT and Bard
dc.type: Article
