Zero-Inflated Count Regression Models in Solving Challenges Posed by Outlier-Prone Data; an Application to Length of Hospital Stay
Archives of Academic Emergency Medicine,
Vol. 12 No. 1 (2024),
1 Dey 2024,
Page e13
https://doi.org/10.22037/aaem.v12i1.2074
Abstract
Introduction: Ignoring outliers in data may lead to misleading results. Length of stay (LOS) is often considered a count variable with a high frequency of outliers. This study exemplifies the potential of robust methodologies in enhancing the accuracy and reliability of analyses conducted on skewed and outlier-prone count data of LOS.
Methods: The application of Zero-Inflated Poisson (ZIP) and robust Zero-Inflated Poisson (RZIP) models to the challenges posed by outlier-prone LOS data was evaluated. The ZIP model has two components: a zero-inflation component that accounts for excess zeros and a Poisson component that models positive counts. The RZIP model uses the Robust Expectation-Solution (RES) algorithm to improve parameter estimation and reduce the impact of outliers on the model's performance.
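As an illustration of the two-component structure described in the Methods, the following is a minimal sketch of the ZIP probability mass function and log-likelihood. It assumes the standard mixture form, in which an observation is a structural zero with probability `pi` and otherwise follows a Poisson distribution with mean `lam` (so the Poisson part can also produce zeros). The function names and the intercept-only setup are illustrative assumptions, not the authors' implementation, and the sketch does not include the RES robust estimation step.

```python
import math


def zip_pmf(y, lam, pi):
    """P(Y = y) for a zero-inflated Poisson.

    lam: Poisson mean of the count component (assumed name).
    pi:  probability of a structural (excess) zero (assumed name).
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        # A zero arises either from the zero-inflation component
        # or from the Poisson component itself.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_mean(lam, pi):
    """Expected count under the ZIP mixture."""
    return (1.0 - pi) * lam


def zip_loglik(counts, lam, pi):
    """Log-likelihood of observed counts for fixed lam and pi."""
    return sum(math.log(zip_pmf(y, lam, pi)) for y in counts)
```

In a regression setting, `lam` is typically linked to covariates through a log link and `pi` through a logit link, so each patient has their own pair of values.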
Results: Data from 254 intensive care unit patients were analyzed (62.2% male). Patients aged 65 or older accounted for 58.3% of the sample. Notably, 38.6% of patients had an LOS of zero days. The overall mean LOS was 5.89 (± 9.81) days, and 9.45% of cases were outliers. Our analysis using the RZIP model revealed significant predictors of LOS, including age, underlying comorbidities (p<0.001), and insurance status (p=0.013). Model comparison demonstrated the RZIP model's superiority over ZIP, as evidenced by lower Akaike information criterion (AIC) and Bayesian information criterion (BIC) values.
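For readers unfamiliar with the two criteria used in the model comparison, the sketch below computes them from their usual definitions, AIC = 2k - 2 log L and BIC = k log(n) - 2 log L, where lower values indicate the preferred model. Only the sample size n = 254 comes from the Results; the log-likelihoods and the parameter count `k` are hypothetical placeholders, not values reported by the study.

```python
import math


def aic(loglik, k):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion: k log(n) - 2 log L."""
    return k * math.log(n) - 2 * loglik


n = 254  # sample size reported in the Results

# Hypothetical log-likelihoods and parameter count, for illustration only.
hypothetical_fits = {"ZIP": -700.0, "RZIP": -690.0}
k = 8

for name, loglik in hypothetical_fits.items():
    print(name, round(aic(loglik, k), 2), round(bic(loglik, k, n), 2))
```

With the same number of parameters, the model with the higher log-likelihood has both the lower AIC and the lower BIC.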
Conclusions: The application of the RZIP model allowed us to uncover meaningful insights into the factors influencing LOS, paving the way for more informed decision-making in hospital management.
- Length of stay
- intensive care units
- outliers
- robust
- excess zeros