Model Selection Criteria CV, UBR, and GCV for a Mixed Truncated Spline–Gaussian Kernel Estimator in Health Modeling

Authors

  • I Nyoman Budiantara, Institut Teknologi Sepuluh Nopember
  • Nur Chamidah, Airlangga University
  • Andrea Tri Rian Dani, Mulawarman University
  • Muhammad Anshari, Business Information Systems, Universiti Brunei Darussalam
  • Muhammad Fikry Al-Farizi, Sepuluh Nopember Institute of Technology

DOI:

https://doi.org/10.19139/soic-2310-5070-3472

Keywords:

Health Modeling, Cross-Validation, Unbiased Risk, Generalized Cross-Validation, Mixed Estimators

Abstract

Nonparametric regression modeling has commonly applied a single estimator to all predictor variables. Although this approach is straightforward, it can be overly restrictive because predictors often exhibit heterogeneous relationships with the response variable, including linear trends, smooth nonlinear patterns, abrupt changes, or localized variations. Using a uniform estimator may therefore limit model flexibility and reduce predictive accuracy. To overcome this limitation, this study investigates a Mixed Truncated Spline–Gaussian Kernel Estimator, which allows each predictor to be modeled using the estimation technique most appropriate to its underlying data structure. The main objective of this research is to compare the performance of three smoothing parameter selection criteria, namely Cross-Validation (CV), Unbiased Risk (UBR), and Generalized Cross-Validation (GCV). These criteria are employed to determine optimal smoothing parameters, including the number and locations of knots in the truncated spline component and the bandwidth in the Gaussian kernel component. The empirical analysis is conducted using health-related data on heart disease risk factors, a domain characterized by complex and potentially nonlinear relationships. The results indicate that models incorporating three knot points consistently outperform alternative specifications. This superior performance is reflected in lower values of Mean Squared Error (MSE) and Mean Absolute Percentage Error (MAPE), as well as higher coefficients of determination (R²). Among the selection criteria examined, GCV yields the most accurate and stable model, outperforming both CV and UBR. From a methodological perspective, this study contributes to nonparametric regression by providing a systematic evaluation of smoothing parameter selection within a mixed estimator framework. 
From an applied standpoint, the proposed approach enhances the modeling of heart disease risk factors by offering greater flexibility and precision. Furthermore, the findings support Sustainable Development Goal (SDG) 3: Good Health and Well-Being by promoting robust, data-driven methods for evidence-based health policy formulation.
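
As a rough illustration of the three criteria named in the abstract, the following minimal Python sketch evaluates CV, GCV, and UBR for any linear smoother of the form ŷ = Hy. It is not the authors' implementation: the function names, the synthetic data, the textbook forms of the criteria, and the assumed known error variance `sigma2` are all assumptions made for illustration. The truncated spline and the Gaussian kernel (Nadaraya-Watson) smoothers are shown separately, and the hat matrix of the mixed estimator itself is not reproduced here.

```python
import numpy as np

def truncated_spline_basis(x, knots, degree=1):
    """Design matrix with columns 1, x, ..., x^degree, (x - k)_+^degree per knot k."""
    poly = [x**p for p in range(degree + 1)]
    trunc = [np.maximum(x - k, 0.0)**degree for k in knots]
    return np.column_stack(poly + trunc)

def hat_matrix(X):
    """Hat matrix H of the least-squares fit on X, so that y_hat = H @ y."""
    return X @ np.linalg.pinv(X)

def gaussian_kernel_smoother(t, bandwidth):
    """Nadaraya-Watson smoother matrix with Gaussian weights (rows sum to 1)."""
    u = (t[:, None] - t[None, :]) / bandwidth
    W = np.exp(-0.5 * u**2)
    return W / W.sum(axis=1, keepdims=True)

def selection_criteria(y, H, sigma2):
    """Textbook CV, GCV, and UBR for a linear smoother y_hat = H @ y.

    sigma2 is the error variance, assumed known here (hypothetical input).
    """
    n = len(y)
    resid = y - H @ y
    mse = np.mean(resid**2)
    tr = np.trace(H)
    cv = np.mean((resid / (1.0 - np.diag(H)))**2)   # leave-one-out shortcut
    gcv = mse / (1.0 - tr / n)**2                   # leverage replaced by its mean
    ubr = mse - sigma2 + 2.0 * sigma2 * tr / n      # unbiased risk estimate
    return {"CV": cv, "GCV": gcv, "UBR": ubr}

# Synthetic data for illustration only; not the heart disease data of the study.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 80)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

# Compare one, two, and three knots for the truncated spline smoother.
for knots in ([0.5], [0.33, 0.67], [0.25, 0.5, 0.75]):
    H = hat_matrix(truncated_spline_basis(x, knots))
    print(len(knots), "knot(s):", selection_criteria(y, H, sigma2=0.04))

# Compare a few bandwidths for the Gaussian kernel smoother.
for h in (0.02, 0.05, 0.1):
    V = gaussian_kernel_smoother(x, h)
    print("bandwidth", h, ":", selection_criteria(y, V, sigma2=0.04))
```

In each loop the candidate with the smallest value of a given criterion would be selected. Because all three criteria depend only on the residuals and the hat matrix, the same function could in principle be applied to a mixed estimator once its hat matrix is available.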

Published

2026-04-13

Section

Research Articles

How to Cite

Model Selection Criteria CV, UBR, and GCV for a Mixed Truncated Spline–Gaussian Kernel Estimator in Health Modeling. (2026). Statistics, Optimization & Information Computing. https://doi.org/10.19139/soic-2310-5070-3472