Accident prediction models are mostly developed using single-level count data models, such as the traditional negative binomial models with a fixed or varying dispersion parameter, which assume that the data are independent. Many accident data sets in road safety analysis, especially highly disaggregated ones (hourly data), have a hierarchical structure that manifests itself as some form of correlation. Crash prediction models developed with aggregate data could produce biased results because of the assumption of data independence, and the use of aggregate data could inflate the apparent adequacy of the model's explanation. This paper investigates the potential effects of data aggregation and correlation on accident prediction models. The analysis uses an accident database that includes hour-level and storm-level accident counts over individual winter snow storms at four highway sections in Ontario. Models were developed at two levels of aggregation: aggregated event-based models and disaggregated hourly-based models. It was found that data aggregation has a significant effect on model results, while the difference between conventional regression and multilevel regression is inconsequential.