A novel bidirectional long short-term memory model with multi-head attention for accurate language detection
Date
2025
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Gazi University, Faculty of Engineering and Architecture
Access Rights
info:eu-repo/semantics/openAccess
Abstract
Language detection, one of the most important tasks in natural language processing, is used extensively in applications such as machine translation, sentiment analysis, and information retrieval. It enables communication between people in many different countries, and human-animal interaction can also draw on this area. In this paper, a novel Bidirectional Long Short-Term Memory (BiLSTM) model with a Multi-Head Attention mechanism is proposed to accurately classify text into 17 languages: Arabic, Danish, Dutch, English, French, German, Greek, Hindi, Italian, Kannada, Malayalam, Portuguese, Russian, Spanish, Swedish, Tamil, and Turkish. A publicly available dataset of 10,337 texts written in these languages is used to train and evaluate the proposed model. The model achieves accuracy, precision, recall, and F1-score of 99.9%, outperforming state-of-the-art baseline models, and reaches perfect precision (100%) for 15 of the languages: Arabic, Dutch, English, French, German, Greek, Hindi, Italian, Kannada, Malayalam, Portuguese, Russian, Swedish, Tamil, and Turkish. This research highlights the effectiveness of deep learning techniques for language detection and points to promising avenues for further advances in multilingual text processing.
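The abstract does not report the model's hyperparameters, so the following Python (TensorFlow/Keras) sketch only illustrates the general BiLSTM-with-multi-head-attention architecture described above; the vocabulary size, sequence length, embedding dimension, LSTM units, and number of attention heads are assumptions for illustration, not values taken from the paper.

import tensorflow as tf
from tensorflow.keras import layers, models

NUM_LANGUAGES = 17   # the 17 languages listed in the abstract
VOCAB_SIZE = 5000    # assumed tokenizer vocabulary size
MAX_LEN = 200        # assumed maximum sequence length
EMBED_DIM = 128      # assumed embedding dimension

# Token IDs in, language probabilities out.
inputs = layers.Input(shape=(MAX_LEN,), dtype="int32")
x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inputs)

# Bidirectional LSTM returns the full sequence so attention can weigh every time step.
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)

# Multi-head self-attention over the BiLSTM outputs (query = key = value).
attn = layers.MultiHeadAttention(num_heads=4, key_dim=32)(x, x)
x = layers.LayerNormalization()(layers.Add()([x, attn]))  # residual connection + normalization

# Pool over time and classify into one of the 17 languages.
x = layers.GlobalAveragePooling1D()(x)
x = layers.Dropout(0.3)(x)
outputs = layers.Dense(NUM_LANGUAGES, activation="softmax")(x)

model = models.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

In practice the input texts would first be tokenized (for example into character or word indices) and padded or truncated to MAX_LEN integer IDs before being fed to such a model.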
Description
Keywords
Language detection, language classification, translation, deep learning, long short-term memory
Source
Journal of the Faculty of Engineering and Architecture of Gazi University
WoS Q Value
Q3
Scopus Q Value
Q2
Volume
40
Issue
3