Determining the parameters of data mining techniques and their effects on iron deficiency based prediction

MOHAMMED, Sahar J.; ABBAS, Ahmed Kh.; AHMAD, Arshed A.; MOHAMMED, Mohammed S.; SARI, Murat; USLU TUNA, Hande

doi:10.14744/sigma.2025.00038

Sahar J. MOHAMMED ¹, Ahmed Kh. ABBAS ¹, Arshed A. AHMAD ¹, Mohammed S. MOHAMMED ¹, Murat SARI ², Hande USLU TUNA ³

¹University of Diyala, College of Education for Pure Sciences, Diyala, 32001, Iraq
²Department of Mathematical Engineering, Istanbul Technical University, Istanbul, 34469, Türkiye
³Department of Mathematics, Yildiz Technical University, Istanbul, 34220, Türkiye

Sigma J Eng Nat Sci 2025; 43(2): 655-664 DOI: 10.14744/sigma.2025.00038

Abstract

Training datasets are not the only elements affecting the overall prediction system; the parameters of the data mining methods also affect the implementation process and need to be taken into account. The purpose of this research is to investigate the influence of the main parameters of the most widely used data mining approaches on anemia prediction. For the K-Nearest Neighbour (K-NN) approach, it is critical to define the k-value, which specifies the number of neighbouring points used when measuring the distance between the various classes. The Local Weighted Learning (LWL) approach has a kernel value that specifies the width of the search process used to generate the LWL weight function. The Sequential Minimal Optimization (SMO) approach has an n-tuple alpha value that is determined by the training data in order to meet the Karush-Kuhn-Tucker (KKT) conditions and speed up the prediction process. When the best parameter choice is made for each method, these data mining methods are shown to produce high-performance predictions. The number of features and the dataset size are also observed to have an impact on the performance of these methods. In this study, feature selection methods and mining methods are compared in terms of appropriate parameter selection and dependency on dataset information. The methods proposed here predict anemia more accurately than prior versions of each method. For the applied dataset, the number of features is reduced from 11 to 8. With this feature reduction and appropriate parameter selection, a well-performing method, namely K-NN, shows an increase of about 3.8% in prediction performance under the proposed model.

Keywords: Anemia; Attribute Selection; Prediction; Sequential Minimal Optimization; Local Weighted Learning
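
The role of the k-value in K-NN mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or dataset: the toy points, the labels, the helper names (knn_predict, loo_accuracy, best_k), the use of Euclidean distance, and the use of leave-one-out accuracy as the selection criterion are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy 2-D points in two well-separated groups (invented values, not the anemia data).
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to `query` (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(data_X, data_y, k):
    """Leave-one-out accuracy: each point is classified using all the other points."""
    hits = 0
    for i in range(len(data_X)):
        rest_X = data_X[:i] + data_X[i + 1:]
        rest_y = data_y[:i] + data_y[i + 1:]
        hits += knn_predict(rest_X, rest_y, data_X[i], k) == data_y[i]
    return hits / len(data_X)


def best_k(data_X, data_y, candidates):
    """Return the candidate k with the highest leave-one-out accuracy.

    Ties are broken in favour of the smaller k (an arbitrary choice for this sketch).
    """
    scores = {k: loo_accuracy(data_X, data_y, k) for k in candidates}
    chosen = max(scores, key=lambda k: (scores[k], -k))
    return chosen, scores


chosen_k, scores = best_k(X, y, [1, 3, 5, 7])
print("accuracy per k:", scores)
print("selected k:", chosen_k)
```

On this toy set, small k values classify every held-out point correctly, while k = 7 forces each vote to include all four points of the opposite group and the accuracy collapses. Real data would typically show a less extreme but similar dependence on k, which is why the parameter has to be chosen rather than left at a default.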