Zero-Inflated Count Regression Models in Solving Challenges Posed by Outlier-Prone Data; an Application to Length of Hospital Stay
Archives of Academic Emergency Medicine, Vol. 12 No. 1 (2024), 1 January 2024, Page e13
https://doi.org/10.22037/aaem.v12i1.2074
Abstract
Introduction: Ignoring outliers in data may lead to misleading results. Length of stay (LOS) is often considered a count variable with a high frequency of outliers. This study exemplifies the potential of robust methodologies in enhancing the accuracy and reliability of analyses conducted on skewed and outlier-prone count data of LOS.
Methods: The application of Zero-Inflated Poisson (ZIP) and robust Zero-Inflated Poisson (RZIP) models to outlier-prone LOS data was evaluated. The ZIP model has two components: a zero-inflation component that accounts for excess zeros and a Poisson component that models the counts. The RZIP model uses the Robust Expectation-Solution (RES) algorithm to estimate the parameters, which reduces the influence of outliers on the model's performance.
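The two-component structure described above can be sketched as a probability mass function. This is a minimal illustration of the standard ZIP mixture, not the authors' code; the function names `zip_pmf` and `zip_mean` and the parameter values in the usage line are hypothetical, and the RES estimation step is not shown.

```python
import math


def zip_pmf(y, lam, pi):
    """P(Y = y) for a zero-inflated Poisson variable.

    lam: Poisson rate of the count component (assumed name).
    pi:  probability of a structural (excess) zero (assumed name).
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        # A zero can come from the zero-inflation component or from the Poisson component.
        return pi + (1.0 - pi) * poisson
    # Positive counts can only come from the Poisson component.
    return (1.0 - pi) * poisson


def zip_mean(lam, pi):
    """Expected value of the ZIP distribution: (1 - pi) * lam."""
    return (1.0 - pi) * lam


# Hypothetical parameter values, chosen only for illustration.
probabilities = [zip_pmf(y, lam=4.0, pi=0.35) for y in range(6)]
```

In a regression setting, `lam` and `pi` would each depend on covariates, typically through a log link and a logit link respectively.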
Results: Data from 254 intensive care unit patients were analyzed (62.2% male). Patients aged 65 or older accounted for 58.3% of the sample. Notably, 38.6% of patients had an LOS of zero. The overall mean LOS was 5.89 (± 9.81) days, and 9.45% of cases were outliers. Our analysis using the RZIP model revealed significant predictors of LOS, including age, underlying comorbidities (p<0.001), and insurance status (p=0.013). Model comparison demonstrated the RZIP model's superiority over ZIP, as evidenced by lower Akaike information criterion (AIC) and Bayesian information criterion (BIC) values.
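The model comparison above relies on AIC and BIC, where lower values indicate a better trade-off between fit and complexity. The sketch below shows the standard formulas only. The log-likelihoods and parameter count are hypothetical placeholders, not values from the study, and the abstract does not state how the criteria were computed for the RZIP fit; only the sample size of 254 comes from the text.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k * ln(n) - 2 * log-likelihood."""
    return k * math.log(n) - 2 * log_lik


n_patients = 254  # sample size reported in the abstract

# Hypothetical fits: (log-likelihood, number of estimated parameters).
fits = {"ZIP": (-520.0, 8), "RZIP": (-505.0, 8)}

for name, (log_lik, k) in fits.items():
    print(name, round(aic(log_lik, k), 1), round(bic(log_lik, k, n_patients), 1))
```

With equal parameter counts, both criteria reduce to a comparison of log-likelihoods, so the model with the higher log-likelihood is preferred.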
Conclusions: The application of the RZIP model allowed us to uncover meaningful insights into the factors influencing LOS, paving the way for more informed decision-making in hospital management.
- Length of stay
- intensive care units
- outliers
- robust
- excess zeros