Cyclical hybrid imputation technique for missing values in data sets

Kotan, Kurban; Kirisoglu, Serdar

dc.authorid	kirisoglu, serdar/0000-0002-4416-6657
dc.authorid	KOTAN, Kurban/0000-0002-6660-0565
dc.contributor.author	Kotan, Kurban
dc.contributor.author	Kirisoglu, Serdar
dc.date.accessioned	2025-10-11T20:48:24Z
dc.date.available	2025-10-11T20:48:24Z
dc.date.issued	2025
dc.department	Düzce Üniversitesi	en_US
dc.description.abstract	Handling missing data is the most important first step of the preprocessing phase, because incorrect imputation of missing data increases the error in the modeling phase and reduces the prediction performance of the model. In health applications, models with a higher success rate are necessarily preferred. When data are missing, the performance of machine learning models may differ depending on the amount of data contained in the data set. A high rate of missing data reduces the amount of complete data in the data set and therefore affects the accuracy and reliability of analysis and modeling studies. Estimating missing values precisely, close to their real values, provides a significant and visible performance increase in the subsequent modeling phase. A model trained on data imputed with an artificial intelligence model rather than a random method achieves higher accuracy than a model trained on data filled with classical methods such as the mean and mode. In this study, we propose a new algorithm, tested on many data sets, to address the problems of missing data imputation. The algorithm aims to impute missing values more effectively by using row-based and column-based imputation techniques together and cyclically. It takes into account individual missing values through column-based imputation features and the overall data structure through row-based imputation features. With some combinations of row-based and column-based imputation techniques, the proposed algorithm achieved 100% accuracy on the 3 data sets used in the study, and it achieved higher accuracy than other imputation techniques.	en_US
dc.identifier.doi	10.1038/s41598-025-90964-7
dc.identifier.issn	2045-2322
dc.identifier.issue	1	en_US
dc.identifier.pmid	39994302	en_US
dc.identifier.scopus	2-s2.0-85218681647	en_US
dc.identifier.scopusquality	Q1	en_US
dc.identifier.uri	https://doi.org/10.1038/s41598-025-90964-7
dc.identifier.uri	https://hdl.handle.net/20.500.12684/21907
dc.identifier.volume	15	en_US
dc.identifier.wos	WOS:001433275500039	en_US
dc.identifier.wosquality	Q1	en_US
dc.indekslendigikaynak	Web of Science	en_US
dc.indekslendigikaynak	Scopus	en_US
dc.indekslendigikaynak	PubMed	en_US
dc.language.iso	en	en_US
dc.publisher	Nature Portfolio	en_US
dc.relation.ispartof	Scientific Reports	en_US
dc.relation.publicationcategory	Article - International Peer-Reviewed Journal - Institutional Faculty Member	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.snmz	KA_WOS_20250911
dc.subject	Artificial intelligence	en_US
dc.subject	Machine learning	en_US
dc.subject	Deep learning	en_US
dc.subject	Imputation	en_US
dc.subject	Missing values	en_US
dc.title	Cyclical hybrid imputation technique for missing values in data sets	en_US
dc.type	Article	en_US
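
The abstract describes alternating column-based and row-based imputation in cycles but this record gives no further detail. The following is only a minimal sketch of that general idea, not the authors' algorithm: it assumes a column-mean pass for the column-based step and a k-nearest-rows average for the row-based step, and the names `column_step`, `row_step`, `cyclical_impute`, `k`, `max_cycles`, and `tol` are all hypothetical.

```python
from math import sqrt


def column_step(matrix, missing):
    """Column-based pass (assumed: column mean). Each missing cell is set to
    the mean of the other known or currently estimated cells in its column."""
    out = [row[:] for row in matrix]
    for i, j in missing:
        others = [row[j] for r, row in enumerate(matrix)
                  if r != i and row[j] is not None]
        out[i][j] = sum(others) / len(others) if others else 0.0
    return out


def row_step(matrix, missing, k):
    """Row-based pass (assumed: k nearest rows). Each missing cell is
    re-estimated as the average of that column in the k most similar rows,
    where similarity is Euclidean distance over the remaining columns."""
    out = [row[:] for row in matrix]
    n_cols = len(matrix[0])
    for i, j in missing:
        dists = sorted(
            (sqrt(sum((matrix[i][c] - row[c]) ** 2
                      for c in range(n_cols) if c != j)), r)
            for r, row in enumerate(matrix) if r != i
        )
        nearest = [r for _, r in dists[:k]]
        if nearest:  # with a single row there is nothing to compare against
            out[i][j] = sum(matrix[r][j] for r in nearest) / len(nearest)
    return out


def cyclical_impute(data, k=2, max_cycles=20, tol=1e-6):
    """Repeat column pass then row pass until the estimates stop changing.
    `data` is a list of rows; missing cells are marked with None."""
    missing = [(i, j) for i, row in enumerate(data)
               for j, value in enumerate(row) if value is None]
    current = [row[:] for row in data]
    previous = None
    for _ in range(max_cycles):
        current = row_step(column_step(current, missing), missing, k)
        if previous is not None and all(
            abs(current[i][j] - previous[i][j]) < tol for i, j in missing
        ):
            break
        previous = current
    return current


if __name__ == "__main__":
    toy = [[1.0, 2.0], [1.0, None], [5.0, 6.0], [5.0, 6.0]]
    print(cyclical_impute(toy, k=1))
```

In this sketch the column pass supplies a starting value for every gap so that the row pass can compute distances on complete rows, and observed cells are never modified. Any of the two passes could be swapped for a learned model; the choice of mean and nearest rows here is purely illustrative.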

Collection

WoS Indexed Publications Collection
PubMed Indexed Publications Collection
Scopus Indexed Publications Collection