Background: Machine Learning (ML) is currently used to help predict, diagnose, and treat diseases. Acute Lymphoblastic Leukemia (ALL) is the most common cancer among pediatric patients. Anemia is a frequent complication in children with ALL and often necessitates recurrent blood transfusions, which in turn can increase Liver Iron Concentration (LIC). LIC is currently assessed with T2* MRI; however, given the limitations of MRI, ML techniques for predicting LIC are needed.
Materials and Methods: In this retrospective cohort study, data were collected from 66 children (mean age = 10.7 years) diagnosed with ALL, and three ML models were evaluated: Random Forest Classifier (RFC), Support Vector Classifier (SVC), and Logistic Regression (LR). Given the small sample size, preprocessing (feature standardization and SMOTE oversampling) was applied only to the training folds during cross-validation to prevent data leakage.
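The fold-wise preprocessing described above can be illustrated with a minimal sketch. It is not the study's code: the helper names (`standardize_fit`, `smote_like`, `prepare_fold`) are hypothetical, the oversampler is a simplified SMOTE-style interpolation written with the standard library only, and a binary outcome is assumed. In practice a library pipeline (for example, imbalanced-learn's `Pipeline` combined with `SMOTE`) would typically serve the same purpose.

```python
import random
import statistics


def standardize_fit(rows):
    """Return per-feature (mean, std) computed from the given rows only."""
    cols = list(zip(*rows))
    # Fall back to 1.0 when a column is constant to avoid division by zero.
    return [(statistics.fmean(c), statistics.pstdev(c) or 1.0) for c in cols]


def standardize_apply(rows, stats):
    """Scale rows with previously fitted (mean, std) pairs."""
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def smote_like(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


def prepare_fold(X, y, train_idx, val_idx, rng=None):
    """Fit preprocessing on the training fold only, then apply it to both."""
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    stats = standardize_fit(X_train)  # statistics from training rows only
    X_train = standardize_apply(X_train, stats)
    X_val = standardize_apply([X[i] for i in val_idx], stats)
    y_val = [y[i] for i in val_idx]

    # Oversample the minority class in the training fold only;
    # the validation fold is never resampled.
    counts = {c: y_train.count(c) for c in set(y_train)}
    minority_label = min(counts, key=counts.get)
    deficit = max(counts.values()) - counts[minority_label]
    minority = [x for x, c in zip(X_train, y_train) if c == minority_label]
    if deficit > 0 and len(minority) >= 2:
        X_train = X_train + smote_like(minority, deficit, rng=rng)
        y_train = y_train + [minority_label] * deficit
    return X_train, y_train, X_val, y_val
```

Because the scaling statistics and the synthetic samples are derived from the training indices alone, the validation rows never influence the fitted preprocessing, which is the leakage the study design guards against.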
Results: Among the evaluated models, LR achieved the highest precision–recall (PR) Area Under the Curve (AUC) and receiver operating characteristic (ROC) AUC values (test PR AUC = 0.94, p-value = 0.002; CV PR AUC = 0.98, p-value < 0.001; test ROC AUC = 0.98, p-value = 0.002; CV ROC AUC = 0.98, p-value < 0.001). Permutation feature importance identified serum ferritin (SF) and transfusion volume per kilogram (TV/Kg) as the dominant predictors of LIC.
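For readers unfamiliar with permutation feature importance, the idea can be sketched as follows: shuffle one feature column at a time and record how much the model's score drops. This is an illustrative standard-library sketch, not the study's implementation; the function names are hypothetical, and accuracy is used as the score purely for simplicity (the scoring metric used in the study is not stated here). scikit-learn provides a comparable routine, `sklearn.inspection.permutation_importance`.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows for which predict(row) matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving all other features intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature the model relies on yields a large average drop when shuffled, whereas a feature the model ignores yields a drop of zero; `predict` here stands for any fitted classifier exposed as a function of a single feature row.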
Conclusion: This study indicates that ML models are promising for predicting LIC. However, because of the limited sample size, future studies with larger cohorts are warranted to validate these findings.
Type of Study: Research | Subject: Hematology
Received: 2025/10/28 | Accepted: 2026/03/28 | Published: 2026/03/28