A marginalized random effects hurdle negative binomial model for analyzing refined-scale crash frequency data
Abstract
Crash frequency prediction models have been an important subject of safety research because they reveal the relationship between crash occurrences and their influencing factors. Recently, hourly-based refined-scale crash frequency analysis has become attractive, since it allows time-varying explanatory information (e.g. traffic volume and operating speed) to be introduced. However, crash frequency data with short time intervals raise the analytical issues of excessive zeros and unobserved heterogeneity. In this study, a marginalized random effects hurdle negative binomial (MREHNB) model was developed for the first time, in which the hurdle modelling structure handles the excessive zeros and site-specific random effect terms capture the factors associated with unobserved heterogeneity. Moreover, a marginalized inference approach was introduced to obtain marginal mean inference for the overall population rather than subject-specific estimates. Empirical analyses were conducted on data from the Shanghai urban expressway system, and the MREHNB model was compared with the hurdle negative binomial (HNB) model and the random effects hurdle negative binomial (REHNB) model. In terms of goodness of fit, the REHNB and MREHNB models showed substantial improvement over the HNB model, while there was no distinct difference between the REHNB and MREHNB models. As for the estimated parameters, however, the MREHNB model provided more precise inference. Furthermore, the MREHNB model yielded interesting findings on the crash contributing factors: for example, a higher ratio of local vehicles within the traffic volume increases the probability of crash occurrence, and a non-linear relationship was found between traffic volume and crash frequency, with moderate volume levels having the highest crash occurrence probability. Finally, the modeling results and the modeling technique are discussed in depth.
The results will assist in designing more efficient control strategies for near real-time traffic management.
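To make the hurdle structure mentioned above concrete, the following is a minimal sketch of a generic hurdle negative binomial distribution, not the authors' MREHNB specification. It assumes the common NB2 parameterization (mean `mu`, dispersion `alpha`, variance `mu + alpha * mu**2`) and omits the site-specific random effects and the marginalized reparameterization. The function and parameter names (`nb_pmf`, `hurdle_nb_pmf`, `hurdle_nb_mean`, `p_pos`) are hypothetical and chosen for illustration only.

```python
import math


def nb_pmf(y, mu, alpha):
    """Negative binomial pmf in the NB2 form (assumed parameterization)."""
    r = 1.0 / alpha              # inverse dispersion
    q = r / (r + mu)             # "success" probability
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(q) + y * math.log(1.0 - q))
    return math.exp(log_p)


def hurdle_nb_pmf(y, p_pos, mu, alpha):
    """Hurdle NB pmf: a binary part decides zero vs. positive,
    and a zero-truncated NB describes the positive counts."""
    if y == 0:
        return 1.0 - p_pos
    truncation = 1.0 - nb_pmf(0, mu, alpha)   # P(NB count > 0)
    return p_pos * nb_pmf(y, mu, alpha) / truncation


def hurdle_nb_mean(p_pos, mu, alpha):
    """Overall (marginal over the two parts) mean of the hurdle NB count."""
    return p_pos * mu / (1.0 - nb_pmf(0, mu, alpha))


if __name__ == "__main__":
    # Illustrative values only: crashes are rare in an hourly interval.
    p_pos, mu, alpha = 0.1, 0.5, 0.8
    for y in range(4):
        print(y, hurdle_nb_pmf(y, p_pos, mu, alpha))
    print("mean:", hurdle_nb_mean(p_pos, mu, alpha))
```

In this sketch the probability of a zero count is set entirely by `p_pos`, which is how a hurdle structure can absorb excess zeros independently of the count distribution. The overall mean depends on both parts at once, which illustrates why a model that targets the marginal mean directly can be easier to interpret at the population level.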