A Quest for Formant-Based Compact Nonuniform Trapezoidal Filter Banks for Speech Processing with VGG16

Parlak, Cevahir; Altun, Yusuf

dc.authorid	ALTUN, Yusuf/0000-0002-2099-0959	en_US
dc.authorscopusid	55807221400	en_US
dc.authorscopusid	25031391400	en_US
dc.authorwosid	ALTUN, Yusuf/AAA-9929-2020	en_US
dc.contributor.author	Parlak, Cevahir
dc.contributor.author	Altun, Yusuf
dc.date.accessioned	2024-08-23T16:07:17Z
dc.date.available	2024-08-23T16:07:17Z
dc.date.issued	2024	en_US
dc.department	Düzce Üniversitesi	en_US
dc.description.abstract	In this text, we discuss the filter banks used for speech analysis and propose a novel filter bank for speech processing applications. Filter banks are building blocks of speech processing applications. Multiple filter strategies have been proposed, including Mel, PLP, Seneff, Lyon, and Gammatone filters. MFCC is a transformed version of Mel filters and is still a state-of-the-art method for speech recognition applications. However, 40 years after their debut, time is running out to launch new structures as novel speech features. The proposed acoustic filter banks (AFB) are innovative alternatives to dethrone Mel filters, PLP filters, and MFCC features. Foundations of AFB filters are based on the formant regions of vowels and consonants. In this study, we pioneer an acoustic filter bank comprising 11 frequency regions and conduct experiments using the VGG16 model on the TIMIT and Speech Command V2 datasets. The outcomes of the study concretely indicate that MFCC, Mel, and PLP filters can effectively be replaced with novel AFB filter bank features.	en_US
dc.description.sponsorship	DAS: Data are available at the following site: https://github.com/cevparlak/AFB-Filters.	en_US
dc.identifier.doi	10.1007/s00034-024-02794-z
dc.identifier.issn	0278-081X
dc.identifier.issn	1531-5878
dc.identifier.scopus	2-s2.0-85200054858	en_US
dc.identifier.scopusquality	Q2	en_US
dc.identifier.uri	https://doi.org/10.1007/s00034-024-02794-z
dc.identifier.uri	https://hdl.handle.net/20.500.12684/14562
dc.identifier.wos	WOS:001281629000005	en_US
dc.identifier.wosquality	N/A	en_US
dc.indekslendigikaynak	Web of Science	en_US
dc.indekslendigikaynak	Scopus	en_US
dc.language.iso	en	en_US
dc.publisher	Springer Birkhauser	en_US
dc.relation.ispartof	Circuits Systems and Signal Processing	en_US
dc.relation.publicationcategory	Article - International Refereed Journal - Institutional Faculty Member	en_US
dc.rights	info:eu-repo/semantics/closedAccess	en_US
dc.subject	Speech processing	en_US
dc.subject	MFCC	en_US
dc.subject	Mel filters	en_US
dc.subject	PLP	en_US
dc.subject	Filter banks	en_US
dc.subject	Convolutional neural networks	en_US
dc.subject	Discrimination	en_US
dc.subject	Recognition	en_US
dc.subject	Frequency	en_US
dc.subject	Loudness	en_US
dc.subject	Perception	en_US
dc.subject	Model	en_US
dc.title	A Quest for Formant-Based Compact Nonuniform Trapezoidal Filter Banks for Speech Processing with VGG16	en_US
dc.type	Article	en_US

Collection

WoS Indexed Publications Collection
Scopus Indexed Publications Collection
