A novel bidirectional long short-term memory model with multi-head attention for accurate language detection

Date

2025

Journal Title

Journal ISSN

Volume Title

Publisher

Gazi University, Faculty of Engineering and Architecture

Access Rights

info:eu-repo/semantics/openAccess

Abstract

Language detection, one of the most important components of natural language processing, is used extensively in applications such as machine translation, sentiment analysis, and information retrieval. It makes communication between people in many different countries possible, and human-animal interaction can also be supported in this area. In this paper, a novel Bidirectional Long Short-Term Memory model with a Multi-Head Attention mechanism is proposed to accurately classify text into 17 languages: Arabic, Danish, Dutch, English, French, German, Greek, Hindi, Italian, Kannada, Malayalam, Portuguese, Russian, Spanish, Swedish, Tamil, and Turkish. A publicly available dataset of 10,337 texts written in these languages is used to train and evaluate the proposed model. The proposed model achieved a remarkable accuracy, precision, recall, and F1-score of 99.9%, outperforming state-of-the-art baseline models. In particular, it demonstrated perfect precision (100%) for 15 languages: Arabic, Dutch, English, French, German, Greek, Hindi, Italian, Kannada, Malayalam, Portuguese, Russian, Swedish, Tamil, and Turkish. This research highlights the effectiveness of deep learning techniques for language detection and points to promising avenues for further advances in multilingual text processing.
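The abstract describes a BiLSTM encoder combined with multi-head attention for 17-way language classification. The sketch below illustrates one way such an architecture can be assembled in Keras; the vocabulary size, embedding dimension, sequence length, LSTM units, and number of attention heads are illustrative assumptions, not values reported by the paper, and padding tokens are not masked in this simplified version.

```python
# Minimal sketch of a BiLSTM + multi-head attention text classifier for 17 languages
# (TensorFlow/Keras). All hyperparameters below are assumptions for illustration.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 20000   # assumed token vocabulary size
MAX_LEN = 200        # assumed maximum sequence length
NUM_CLASSES = 17     # Arabic, Danish, ..., Turkish

def build_model():
    inputs = layers.Input(shape=(MAX_LEN,), dtype="int32")
    # Token embedding
    x = layers.Embedding(VOCAB_SIZE, 128)(inputs)
    # Bidirectional LSTM returns the full sequence so attention can attend over it
    x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
    # Multi-head self-attention over the BiLSTM outputs
    attn = layers.MultiHeadAttention(num_heads=4, key_dim=32)(x, x)
    x = layers.Add()([x, attn])            # residual connection
    x = layers.LayerNormalization()(x)
    # Pool the attended sequence into a fixed-size vector
    x = layers.GlobalAveragePooling1D()(x)
    x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_model()
model.summary()
```

Training would then call `model.fit` on integer-encoded texts paired with integer language labels; the 99.9% figures quoted in the abstract come from the authors' own setup, which this sketch does not attempt to reproduce.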

Description

Keywords

Language detection, language classification, translation, deep learning, long short-term memory

Source

Journal of the Faculty of Engineering and Architecture of Gazi University

WoS Q Value

Q3

Scopus Q Value

Q2

Volume

40

Issue

3

Citation