Cyclical hybrid imputation technique for missing values in data sets

dc.authorid: kirisoglu, serdar/0000-0002-4416-6657
dc.authorid: KOTAN, Kurban/0000-0002-6660-0565
dc.contributor.author: Kotan, Kurban
dc.contributor.author: Kirisoglu, Serdar
dc.date.accessioned: 2025-10-11T20:48:24Z
dc.date.available: 2025-10-11T20:48:24Z
dc.date.issued: 2025
dc.department: Düzce Üniversitesi [en_US]
dc.description.abstract: Handling missing data is one of the most important first steps in the preprocessing phase, because incorrect imputation of missing values increases the error in the modeling phase and reduces the model's prediction performance. In health applications in particular, models with higher success rates are needed. When data are missing, the performance of machine learning models can vary with the amount of data in the data set. A high rate of missing data reduces the amount of complete data available and therefore affects the accuracy and reliability of analysis and modeling studies. Estimating and filling in missing values as close as possible to their real values yields a significant, visible performance increase in the subsequent modeling phase. A model trained on data imputed with an artificial intelligence model, rather than with a random method, is expected to be more accurate than a model trained on data filled with classical methods such as the mean and mode. In this study, we propose a new algorithm, tested on many data sets, to address the problems that arise when imputing missing data. The algorithm aims to impute missing values more effectively by applying row-based and column-based imputation techniques together and cyclically. It accounts for individual missing values through column-based imputation and for the overall data structure through row-based imputation. With some combinations of row-based and column-based imputation techniques, the proposed algorithm achieved 100% accuracy on the 3 data sets used in the study, and it achieved higher accuracy than other imputation techniques. [en_US]
dc.identifier.doi: 10.1038/s41598-025-90964-7
dc.identifier.issn: 2045-2322
dc.identifier.issue: 1 [en_US]
dc.identifier.pmid: 39994302 [en_US]
dc.identifier.scopus: 2-s2.0-85218681647 [en_US]
dc.identifier.scopusquality: Q1 [en_US]
dc.identifier.uri: https://doi.org/10.1038/s41598-025-90964-7
dc.identifier.uri: https://hdl.handle.net/20.500.12684/21907
dc.identifier.volume: 15 [en_US]
dc.identifier.wos: WOS:001433275500039 [en_US]
dc.identifier.wosquality: Q1 [en_US]
dc.indekslendigikaynak: Web of Science [en_US]
dc.indekslendigikaynak: Scopus [en_US]
dc.indekslendigikaynak: PubMed [en_US]
dc.language.iso: en [en_US]
dc.publisher: Nature Portfolio [en_US]
dc.relation.ispartof: Scientific Reports [en_US]
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.snmz: KA_WOS_20250911
dc.subject: Artificial intelligence [en_US]
dc.subject: Machine learning [en_US]
dc.subject: Deep learning [en_US]
dc.subject: Imputation [en_US]
dc.subject: Missing values [en_US]
dc.title: Cyclical hybrid imputation technique for missing values in data sets [en_US]
dc.type: Article [en_US]
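
The abstract describes using row-based and column-based imputation together and cyclically, but this record does not say which techniques the authors combined or how the cycle ends. The following is therefore only a minimal sketch of the general idea, not the published algorithm. It assumes a k-nearest-rows average as the row-based step, a least-squares fit on the other columns as the column-based step, and a fixed cycle budget with a tolerance check. The function names (`knn_row_pass`, `regression_column_pass`, `cyclic_hybrid_impute`) and the parameters `k`, `n_cycles`, and `tol` are hypothetical.

```python
import numpy as np


def knn_row_pass(filled, missing, k=3):
    """Row-based step (assumed): re-estimate each missing cell as the mean
    of that column over the k rows closest to the current row."""
    out = filled.copy()
    for i in np.where(missing.any(axis=1))[0]:
        dist = np.linalg.norm(filled - filled[i], axis=1)
        dist[i] = np.inf  # a row is not its own neighbor
        nearest = np.argsort(dist)[:k]
        cols = np.where(missing[i])[0]
        out[i, cols] = filled[nearest][:, cols].mean(axis=0)
    return out


def regression_column_pass(filled, missing):
    """Column-based step (assumed): for each incomplete column, fit a linear
    model on the other columns using rows where the column was observed,
    then predict the cells that were missing."""
    out = filled.copy()
    n_rows = filled.shape[0]
    for j in np.where(missing.any(axis=0))[0]:
        observed = ~missing[:, j]
        if observed.sum() < 2:
            continue  # too few observed values to fit a model
        others = np.delete(out, j, axis=1)
        design = np.column_stack([np.ones(n_rows), others])
        coef, *_ = np.linalg.lstsq(design[observed], out[observed, j], rcond=None)
        out[~observed, j] = design[~observed] @ coef
    return out


def cyclic_hybrid_impute(X, n_cycles=10, k=3, tol=1e-6):
    """Alternate the row-based and column-based steps until the imputed
    values stop changing or the cycle budget runs out."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from column means so that row distances are defined.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_cycles):
        previous = filled.copy()
        filled = knn_row_pass(filled, missing, k)
        filled = regression_column_pass(filled, missing)
        if np.max(np.abs(filled - previous)) < tol:
            break
    return filled
```

Calling `cyclic_hybrid_impute(X)` on a NumPy array with `np.nan` marking the missing cells returns a copy in which only those cells have been filled; observed values are left unchanged. The sketch assumes numeric data and that every column has at least one observed value.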
