Integrating Statistical Clustering Methods and Machine Learning to Uncover Latent Population Profiles in Childhood Disability

Authors

  • Ali Satty, Northern Border University
  • Zakariya M. S. Mohammed, Department of Mathematics, College of Science, Northern Border University, Arar, Saudi Arabia

DOI:

https://doi.org/10.19139/soic-2310-5070-3308

Keywords:

Childhood Disability, Clustering Methods, Latent Class Analysis, Partitioning Around Medoids

Abstract

Childhood disability in low-resource settings is shaped by intersecting socioeconomic and geographic disadvantages, yet traditional regression approaches cannot capture how these factors combine to form distinct population subgroups. Using nationally representative MICS6 data from the Central African Republic, this study integrates two complementary unsupervised learning methods, Latent Class Analysis (LCA) and Partitioning Around Medoids (PAM) with Gower distance, to identify latent vulnerability profiles among 6,167 children aged 5–17 years. All analyses were conducted within a survey-weighted framework to account for the complex sampling design. Model selection using BIC/AIC (LCA) and weighted silhouette and Calinski–Harabasz indices (PAM) consistently supported a six-cluster structure. These clusters reflected meaningful combinations of household wealth, maternal education, residential setting, and region, with disability prevalence ranging from 28% to 43% across clusters. Although individual-level agreement between LCA and PAM assignments was minimal, both methods revealed convergent population-level patterns, including one small, highly vulnerable subgroup with markedly elevated disability burden. The findings demonstrate the methodological value of triangulating model-based and distance-based clustering to uncover hidden structures in complex survey data and provide new insights into the multidimensional nature of childhood disability in fragile settings.
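
To make the distance-based half of the approach concrete, the following is a minimal Python sketch of PAM on a Gower distance matrix, followed by a mean silhouette score. It is an illustration under stated assumptions, not the authors' implementation: the toy records, the variable names (a wealth score, a residence category, a maternal education category), and the helper functions gower_distance, pam, and mean_silhouette are all hypothetical. The sketch also ignores survey weights, whereas the study reports a survey-weighted framework, and it uses a simple first-improvement swap search rather than an optimized PAM variant.

```python
import numpy as np


def gower_distance(num, cat):
    """Gower distance for mixed data: range-scaled absolute differences for
    numeric columns, simple mismatch (0/1) for categorical columns."""
    num = np.asarray(num, dtype=float)
    cat = np.asarray(cat)
    col_range = num.max(axis=0) - num.min(axis=0)
    col_range[col_range == 0] = 1.0  # avoid division by zero for constant columns
    d_num = np.abs(num[:, None, :] - num[None, :, :]) / col_range
    d_cat = (cat[:, None, :] != cat[None, :, :]).astype(float)
    n_vars = num.shape[1] + cat.shape[1]
    return (d_num.sum(axis=2) + d_cat.sum(axis=2)) / n_vars


def pam(dist, k, max_iter=100):
    """Partitioning Around Medoids on a precomputed distance matrix.
    Greedy initialisation, then medoid/non-medoid swaps while the cost drops."""
    n = dist.shape[0]

    def cost(medoids):
        # Total distance from each point to its nearest medoid.
        return float(dist[:, medoids].min(axis=1).sum())

    # Start from the most central point, then add medoids greedily.
    medoids = [int(np.argmin(dist.sum(axis=0)))]
    while len(medoids) < k:
        candidates = [j for j in range(n) if j not in medoids]
        medoids.append(min(candidates, key=lambda j: cost(medoids + [j])))

    best = cost(medoids)
    for _ in range(max_iter):
        improved = False
        for i in range(k):
            for j in range(n):
                if j in medoids:
                    continue
                trial = medoids.copy()
                trial[i] = j
                c = cost(trial)
                if c < best - 1e-12:  # accept only strict improvements
                    medoids, best, improved = trial, c, True
        if not improved:
            break

    labels = np.argmin(dist[:, medoids], axis=1)
    return np.array(medoids), labels


def mean_silhouette(dist, labels):
    """Unweighted mean silhouette width; assumes at least two clusters."""
    scores = []
    for i in range(len(labels)):
        same = labels == labels[i]
        same[i] = False  # exclude the point itself
        if not same.any():
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = dist[i, same].mean()
        b = min(dist[i, labels == c].mean()
                for c in np.unique(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))


# Hypothetical toy records: one numeric wealth score, two categorical variables.
wealth = [[0.10], [0.20], [0.15], [0.90], [0.95], [0.85]]
profile = [["rural", "none"], ["rural", "none"], ["rural", "primary"],
           ["urban", "secondary"], ["urban", "secondary"], ["urban", "primary"]]

D = gower_distance(wealth, profile)
medoids, labels = pam(D, k=2)
score = mean_silhouette(D, labels)
print("medoids:", medoids, "labels:", labels, "silhouette:", round(score, 3))
```

In practice one would repeat the last three lines over a range of k values and compare the resulting scores, which mirrors the kind of index-based model selection described in the abstract. A survey-weighted version would need the sampling weights to enter both the PAM cost and the silhouette average.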

Published

2026-04-19

Section

Research Articles

How to Cite

Satty, A., & Mohammed, Z. M. S. (2026). Integrating Statistical Clustering Methods and Machine Learning to Uncover Latent Population Profiles in Childhood Disability. Statistics, Optimization & Information Computing. https://doi.org/10.19139/soic-2310-5070-3308