Roundabout Accident Prediction Model: Random-Parameter Negative Binomial Approach

Roundabouts have been used widely on all road classes in the United Kingdom because they generally are considered safer than other types of intersections. The objective of this study was to examine geometric and traffic characteristics and their influence on the number of accidents. Data from each of 70 roundabouts (with 284 approaches) included all recorded vehicle accidents as well as geometric and traffic characteristics for the entire roundabout, within circulatory lanes, and at roundabout approaches. Random-parameter and fixed-parameter negative binomial count data models were estimated, and their results were compared. The random-parameter results provided better goodness of fit than the fixed-parameter results, and more variables were found to be significant. Significant variables that influenced the number of accidents were total approach traffic, truck percentage, entry width, inscribed circle diameter, number of lanes, and presence of traffic signals.

The number of roundabouts is increasing in countries and regions where roundabouts already are common, and roundabouts are gaining popularity where few existed in the past. With respect to traffic operations and safety, roundabouts often are favored over other intersection types. Roundabouts can improve safety by reducing or changing conflict types, decreasing accident severity, and encouraging drivers to reduce speeds (1-4). Many studies have analyzed roundabout safety, and results indicated a significant reduction in the number of accidents after intersections were converted to roundabouts (5-12).
Geometric layout, operational analysis, and safety evaluation are significant recurring requirements for roundabout design. Small geometric modifications can lead to considerable changes in roundabout safety, operational performance, or both. The Highway Safety Manual uses traffic volume as a major input to the function for base-condition safety performance (13). As a substitute for an intersection, a roundabout is likely to exhibit a similar traffic volume influence on its anticipated safety performance. Many studies have developed accident prediction models [Poisson or negative binomial (NB) models] from geometric and traffic variables in count data (14-20). However, such studies assumed that the effects of the variables (geometric and traffic) were constant across the observations (roundabouts).
In some cases, constraining the parameters as constants when they actually vary across observations can lead to inconsistent and biased parameter estimates (21). However, allowing some or all parameters to vary may account for heterogeneity across observations. For this reason, later research on general accident models (but not at roundabouts) has used random-parameter models, which can be viewed as an extension of random-effects models. Rather than influencing only the intercept of the model, random-parameter models allow some or all estimated parameters, including those for traffic and geometric variables, to vary across observations. Several studies have found that the effects of some variables vary across roadway segments (22-26).
The current study examines whether such heterogeneity across observations exists at roundabouts. The objective was to relate the total number of accidents to a range of explanatory variables, then determine a relationship that could be used to predict site-specific accident risks from such variables by using the random-parameter NB approach. Therefore, this study extends beyond previous roundabout accident models that used fixed-parameter models.

Model Estimation
Many statistical methods are available to predict the number of accidents on roadway segments and at intersections. Because the number of accidents is a nonnegative integer, the recommended models usually are based on the discrete Poisson or NB distribution. Also, past research has indicated that crash data are characterized by overdispersion (i.e., the variance is greater than the mean), which makes NB regression appropriate for modeling crash data (27,28). However, previous studies assumed fixed parameters across observations (roadway segments or intersections); if parameters that actually vary are constrained to be fixed, the resulting estimates could be biased and wrong conclusions might be drawn about the independent variables. Random-parameter count data models were introduced for this reason (29).
Anastasopoulos and Mannering described the methodological approach behind random-parameter models applied to count data (22); this approach is summarized in the following paragraphs. For the basic Poisson model, the probability $P(n_i)$ of roundabout $i$ having $n_i$ accidents is

$$P(n_i) = \frac{\exp(-\lambda_i)\,\lambda_i^{n_i}}{n_i!} \tag{1}$$

where $\lambda_i$ is the Poisson parameter for roundabout $i$, which is the expected number of accidents for roundabout $i$, $E[n_i]$ (21). Poisson regression typically specifies the Poisson parameter as a function of explanatory variables (in this study, geometric and traffic characteristics) by using a log-linear function:

$$\lambda_i = \exp(\beta X_i) \tag{2}$$

where $\beta$ is a vector of estimated parameters and $X_i$ is a vector of explanatory variables for roundabout $i$ (21). The Poisson model restricts the mean and the variance to be equal, $E[n_i] = \mathrm{VAR}[n_i]$; when this equality does not hold, the data are underdispersed or overdispersed. The standard errors of the estimated parameter vector will be incorrect as a result, and incorrect inferences could be drawn (22). To account for this possibility, the NB model is derived by rewriting Equation 2 as

$$\lambda_i = \exp(\beta X_i + \varepsilon_i) \tag{3}$$

where $\exp(\varepsilon_i)$ is a gamma-distributed error term with mean 1 and variance $\alpha$. The resulting NB probability is

$$P(n_i) = \left(\frac{1/\alpha}{(1/\alpha) + \lambda_i}\right)^{1/\alpha} \frac{\Gamma[(1/\alpha) + n_i]}{\Gamma(1/\alpha)\, n_i!} \left(\frac{\lambda_i}{(1/\alpha) + \lambda_i}\right)^{n_i} \tag{4}$$

where $\Gamma(\cdot)$ is a gamma function (22).

To account for heterogeneity (unobserved factors that may vary across observations) with random parameters, estimation procedures have been established (with the use of simulated maximum likelihood estimation) for incorporating random parameters in Poisson and NB count-data models (21,30). An alternative to a random-parameter approach in the NB case would be to allow λ to vary as a function of the mean (22). The random-parameter modeling technique is used in this study to allow for unobserved heterogeneity in predicting accidents. To allow for such random parameters in count data models, the estimated parameters can be written as

$$\beta_i = \beta + \varphi_i \tag{5}$$

where $\varphi_i$ is a randomly distributed term (e.g., a normally distributed term with mean 0 and variance $\sigma^2$, which is the form used in this paper). With this equation, the Poisson parameter becomes $\lambda_i \mid \varphi_i = \exp(\beta X_i)$ in the Poisson model and $\lambda_i \mid \varphi_i = \exp(\beta X_i + \varepsilon_i)$ in the NB model, with the corresponding probability for Poisson or NB of $P(n_i \mid \varphi_i)$ (from Equation 1). With this random-parameter version, the log likelihood (logL) can be written as

$$\log L = \sum_{\forall i} \ln \int_{\varphi_i} g(\varphi_i)\, P(n_i \mid \varphi_i)\, d\varphi_i \tag{6}$$

where $g(\cdot)$ is the probability density function of $\varphi_i$ (22). Because the maximum likelihood estimation of random-parameter Poisson and NB models is computationally cumbersome, a simulated maximum likelihood method is used. The simulation approach uses 200 Halton draws, which are sequences used to generate deterministically constructed, nearly uniformly distributed points that appear to be random in the interval [0, 1] (31); for numerical integration, they have been shown to provide a more efficient draw distribution than random draws (22).
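As an illustration only, the simulated log likelihood described above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the estimation code used in the study: the function names (`halton`, `nb_log_pmf`, `simulated_loglik`), the single explanatory variable, the single normally distributed random slope, the example data, and the reuse of the same 200 base-2 Halton draws for every observation are all simplifications introduced here.

```python
import math
from statistics import NormalDist


def halton(index, base=2):
    """Return element `index` (1-based) of the Halton sequence for a prime base."""
    value, fraction = 0.0, 1.0
    while index > 0:
        fraction /= base
        value += fraction * (index % base)
        index //= base
    return value


def nb_log_pmf(n, lam, alpha):
    """Log NB probability of n accidents, with mean lam and dispersion alpha."""
    r = 1.0 / alpha
    return (math.lgamma(r + n) - math.lgamma(r) - math.lgamma(n + 1)
            + r * math.log(r / (r + lam)) + n * math.log(lam / (r + lam)))


def simulated_loglik(counts, xs, const, beta, sigma, alpha, draws=200):
    """Simulated log likelihood with one random slope beta_i = beta + sigma * z."""
    # Map Halton points in (0, 1) to standard normal draws (assumed distribution).
    z = [NormalDist().inv_cdf(halton(k)) for k in range(1, draws + 1)]
    total = 0.0
    for n_i, x_i in zip(counts, xs):
        # Average the NB probability over the draws, then take the log.
        probs = [
            math.exp(nb_log_pmf(n_i, math.exp(const + (beta + sigma * z_k) * x_i), alpha))
            for z_k in z
        ]
        total += math.log(sum(probs) / draws)
    return total


# Hypothetical data: accident counts and one explanatory variable per site.
counts = [3, 0, 7, 2]
xs = [1.2, 0.4, 2.0, 0.9]
print(simulated_loglik(counts, xs, const=0.1, beta=0.8, sigma=0.3, alpha=0.5))
```

Setting `sigma` to 0 collapses the sketch to the fixed-parameter NB log likelihood, which is a convenient sanity check. A full estimator would maximize this function over the parameters and would normally use separate draws for each observation.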
From the Poisson and NB model parameter estimations and their variations, marginal effects can be estimated to describe the relative magnitude between dependent and independent variables. "In the case of accident frequencies, marginal effects give the change in the number of accidents given a unit change in any independent variable, x, and are simply calculated as the partial derivative, $\partial\lambda_i/\partial x$," where $\lambda_i$ is defined according to the model being considered, as in Equations 2 and 3 (for fixed-parameter Poisson and NB models, respectively) or as $\lambda_i \mid \varphi_i$ equal to $\exp(\beta X_i)$ and $\lambda_i \mid \varphi_i$ equal to $\exp(\beta X_i + \varepsilon_i)$ (for random-parameter Poisson and NB models, respectively) (22, p. 154). Even though marginal effects are generated for each roundabout $i$, only averages over the roundabout population are listed in the results presented later.
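For a log-linear mean, the partial derivative with respect to a continuous variable is the variable's coefficient multiplied by the expected count, and the reported figure is the average of that product over observations. The sketch below shows the calculation; the function name, coefficients, and data are hypothetical and are not taken from the study's tables.

```python
import math


def average_marginal_effect(beta_k, betas, rows):
    """Average of d(lambda_i)/d(x_k) = beta_k * lambda_i over all observations."""
    effects = []
    for row in rows:
        lam = math.exp(sum(b * x for b, x in zip(betas, row)))
        effects.append(beta_k * lam)
    return sum(effects) / len(effects)


# Hypothetical coefficients (constant, x1) and observations [1, x1].
betas = [0.5, 0.02]
rows = [[1.0, 40.0], [1.0, 55.0], [1.0, 70.0]]
print(average_marginal_effect(betas[1], betas, rows))
```

The derivative form applies to continuous variables; for a 0/1 indicator, the difference in the expected count with the indicator switched on and off is the more natural measure.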

Model Evaluation
As part of the process of selecting the most appropriate and best-fitting models, assessments were made according to statistical approaches. First, the model was evaluated according to the significance of the variables included; the estimated regression coefficient for each independent variable should be statistically significant. Three critical values of the t-statistic were used to test the significance of the variables: 1.65, 1.96, and 2.58 for the 90%, 95%, and 99% confidence levels, respectively.
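The three thresholds correspond to two-sided critical values of the standard normal distribution. The sketch below recovers them and applies the test to a coefficient; the function names and the normal approximation to the t-distribution are assumptions made for illustration.

```python
from statistics import NormalDist


def two_sided_critical_value(confidence):
    """Critical |t| for a two-sided test under a standard normal approximation."""
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)


def is_significant(coefficient, std_error, confidence=0.95):
    """True when |coefficient / std_error| exceeds the critical value."""
    return abs(coefficient / std_error) > two_sided_critical_value(confidence)


for level in (0.90, 0.95, 0.99):
    print(level, round(two_sided_critical_value(level), 3))
```

The computed values agree with the thresholds quoted above to within rounding.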
To measure the overall model fit, the $\rho_c^2$ statistic (similar to $R^2$ in regression models) was used (21). It is defined as

$$\rho_c^2 = 1 - \frac{\log L(\beta)}{\log L(C)} \tag{7}$$

where $\log L(\beta)$ and $\log L(C)$ are the log likelihood at convergence and with constant only, respectively. Therefore, a perfect model has a likelihood equal to 1 (a log likelihood of 0), which gives $\rho_c^2 = 1$. The closer the $\rho_c^2$ statistic is to 1, the more variance the estimated model is explaining.
The likelihood ratio test was used to compare fixed- and random-parameter models by using the log likelihoods at convergence. The test statistic is

$$\chi^2 = -2\,[\log L(\beta_F) - \log L(\beta_{RP})] \tag{8}$$

where $\log L(\beta_F)$ and $\log L(\beta_{RP})$ are the log likelihood at convergence for the fixed-parameter and the random-parameter NB models, respectively (22). The statistic is $\chi^2$ distributed, with the number of degrees of freedom equal to the difference in the number of parameters between the fixed- and random-parameter models. The two models also were compared according to the relationship between actual mean values and predicted values of the response variables.

Data Characteristics and Collection
Seventy UK roundabouts that comprised a total of 284 approaches were selected for analysis. Nine roundabouts were on the M1 motorway, 10 were on the M6 motorway, six were on the M5 motorway, and nine were on the M4 motorway; the remaining roundabouts were located on other motorways and A-class roads.
For all roundabouts, accident data were collected from the STATS19 database for 11 years from 2002 to 2012, inclusive. These data included all injury accidents reported by police for all vehicles and pertained only to roundabouts located on motorways and A-class roads; roundabouts in urban areas were excluded. Data for average annual daily traffic (AADT) and percentage of truck traffic were acquired for local authority roads from the UK TRADS database (traffic count data collected from permanently located counting sites on the motorway and trunk road network in England) and from the UK National Highways Agency for 2011 and 2012. Roundabout entry width, circulatory roadway width, and inscribed circle diameter were estimated for the studied roundabouts from aerial photographs viewed on an online mapping website. Roundabout geometric information is illustrated in Figure 1. Summary statistics of all variables are presented in Table 1.
General Accident Trends
STATS19 data revealed 5,520 recorded collisions at the roundabouts between 2002 and 2012 (11 years) and a total of 11,510 vehicles and 7,808 casualties associated with those accidents. The number of fatal, serious, and slight accidents decreased over the 11-year period; the number of casualties fluctuated from 2002 to 2007, when the highest number of slight casualties was recorded, and then the number decreased (Figure 2).
The most frequently recorded contributing factor was driver or rider error or reaction (which included following too closely, failing to judge another person's path or speed, poor turn or maneuver, sudden braking, and junction overshoot). Most of the approach accidents occurred within 100 m of the roundabout (2,318 accidents); 284 accidents occurred more than 100 m away from the entry line, and 1,234 accidents were recorded in the circulatory lanes.

Results
The objective of the analysis was to relate the total number of accidents to a range of explanatory variables and determine a relationship that could be used to predict site-specific accident risks. The random-parameter NB model was used and compared with the fixed-parameter NB model. Table 2 presents the estimation results for random- and fixed-parameter NB models for whole roundabouts, within circulatory lanes, and at approaches. Table 3 illustrates that the model-estimated average marginal effects can be quite different. For whole roundabouts, the random-parameter NB model results in an improvement in the log likelihood at convergence from −319.6350 in the fixed-parameter model to −317.0940 in the random-parameter case (Table 2). With respect to overall fit, $\rho_c^2$ improves from .083 in the fixed-parameter case to .091 in the random-parameter case. For whole roundabouts, the resulting $\chi^2$ (Equation 8) was 5.082, with 1 degree of freedom, which indicates 98% confidence that the random-parameter model is statistically better than the fixed-parameter model. The change resulting from the switch from fixed to random parameters justifies the added complexity of the model. Results of the likelihood ratio test suggest that the model improvement is significant at the .05 level.
For the data within circulatory lanes, the random-parameter NB model results in a significantly better log likelihood at convergence and better overall fit, with $\rho_c^2$ improving from .088 in the fixed-parameter model to .113 in the random-parameter model (Table 2). The resulting $\chi^2$ was 13.8382, with 1 degree of freedom, which indicates 99.99% confidence that the random-parameter model is statistically better. The likelihood ratio test suggests that the model improvement is significant at a p-value of .0001. The t-statistics indicate that the variables are more strongly significant in the random-parameter model than in the fixed-parameter model.
At approaches, the random-parameter NB model results in a small improvement in log likelihood at convergence (from −880.9199 in the fixed-parameter model to −879.1532 in the random-parameter model), and $\rho_c^2$ improves from .055 to .058. The likelihood ratio test using $\chi^2$ of 3.5334 with 2 degrees of freedom gives 83% confidence that the random-parameter model provides a better fit; however, this percentage alone is not enough to justify adoption of the random-parameter model. Because the random-parameter model has a higher (less negative) log likelihood than the fixed-parameter model, it still can be regarded as the better-fitting model. An improvement also can be seen in a comparison of the relationship between actual and predicted values in the random- and fixed-parameter models.
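The likelihood ratio statistics and confidence levels quoted for the three models can be checked from the reported log likelihoods. The sketch below computes minus twice the difference in log likelihood between the fixed- and random-parameter models and the corresponding chi-square tail probability. The function names are introduced here for illustration, and the tail probability uses closed forms that hold only for 1 and 2 degrees of freedom, which are the two cases reported.

```python
import math


def lr_statistic(loglik_fixed, loglik_random):
    """Likelihood ratio statistic: -2 * [logL(fixed) - logL(random)]."""
    return -2.0 * (loglik_fixed - loglik_random)


def chi2_p_value(stat, dof):
    """Upper-tail chi-square probability; closed forms for 1 and 2 df only."""
    if dof == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if dof == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only covers 1 or 2 degrees of freedom")


whole = lr_statistic(-319.6350, -317.0940)     # whole roundabouts, 1 df
approach = lr_statistic(-880.9199, -879.1532)  # approaches, 2 df
cases = [("whole", whole, 1), ("circulatory", 13.8382, 1), ("approaches", approach, 2)]
for label, stat, dof in cases:
    print(label, round(stat, 4), round(1.0 - chi2_p_value(stat, dof), 4))
```

The circulatory-lane statistic is entered directly because only the test statistic, not the pair of log likelihoods, is quoted in the text.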
For each roundabout category, all the variables presented in Table 1 were tested for significance. The percentage of truck traffic at whole roundabouts, the traffic signal indicator within circulatory lanes (1 if unsignalized; 0 otherwise), and the lane number indicator (1 if lane number is 2; 0 otherwise) and grade type indicator (1 if grade separated; 0 otherwise) at roundabout approaches were found to produce statistically significant random parameters. A parameter is considered random when the standard deviation of the parameter distribution is statistically greater than 0. (If the estimated standard deviation of the parameter is not significantly greater than 0, then the parameter is fixed across the observations.) The unsignalized indicator for whole roundabouts, the inscribed circle diameter of the roundabout, and the signalized indicator for approaches had significant effects on the number of accidents, as indicated by the t-statistics in Table 2, but their effects were fixed across the observations. For whole roundabouts, the percentage of truck traffic results in a random parameter that is normally distributed, with a mean of 0.06 and a standard deviation of 0.055. Given these parameters, 13.8% of the distribution is less than 0 (i.e., at 13.8% of the roundabouts, a higher truck percentage is associated with fewer accidents) and 86.2% of the distribution is greater than 0 (i.e., at most roundabouts, a higher truck percentage is associated with more accidents). This result indicates that at most roundabouts, the number of accidents increases as the percentage of truck traffic increases. The t-statistic indicates that the significance of the percentage of truck traffic in the fixed-parameter model is lower than in the random-parameter NB model, which provides support for the random-parameter NB model. According to Table 3, the random-parameter marginal effects indicate that a 1% increase in truck traffic will increase the number of accidents by 2.77% (whereas the fixed-parameter model indicates that the number of accidents will increase 3.16%, on average).
Inscribed circle diameter and AADT had statistically highly significant effects on the number of accidents. As these variables increase, the number of accidents increases. Meanwhile, the unsignalized indicator had a significant effect on decreasing the number of accidents. Regarding the average marginal effect in Table 3, a 1-m increase in inscribed circle diameter was associated with an increase in the number of accidents of 0.22, on average, over the 11-year period (which is close to the fixed-parameter model average, 0.21). Arndt and Troutbeck reported that roundabout safety increased with smaller roundabout diameter, which helps to maintain lower speeds and therefore provide safety for roundabouts (32); results of the present study support their findings. The present results also support those of Retting, who stated that roundabouts with a larger inscribed circle diameter are less safe than those with a smaller inscribed circle diameter (33). Unsignalized roundabouts, in contrast, were associated with a reduction of 26.41 accidents over the 11-year period (versus an average of 27.55 in the fixed-parameter model). Because AADT is entered in logarithm form, a 1% increase in AADT leads to a 0.40% increase in the expected number of accidents, which is in line with previous findings (11, 14-20, 32, 34-37).
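The reading of the log-form AADT result can be illustrated as follows: when ln(AADT) enters a log-linear model with coefficient b, multiplying AADT by a factor multiplies the expected number of accidents by that factor raised to the power b. The coefficient of 0.40 used below is an assumption inferred from the reported 0.40% effect; it is not a value quoted from Table 2, and the function name is hypothetical.

```python
def pct_change_in_accidents(coef_log_aadt, pct_change_in_aadt):
    """Percent change in expected accidents when ln(AADT) has the given coefficient."""
    ratio = (1.0 + pct_change_in_aadt / 100.0) ** coef_log_aadt
    return 100.0 * (ratio - 1.0)


# Assumed coefficient of 0.40 on ln(AADT); see the caveat in the lead-in.
print(round(pct_change_in_accidents(0.40, 1.0), 3))
print(round(pct_change_in_accidents(0.40, 10.0), 3))
```

For small changes the percentage effect is close to the coefficient itself, which is why a log-form coefficient can be read as an elasticity; for larger changes in AADT the two diverge.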
Within circulatory lanes, the unsignalized indicator results in a random parameter that is normally distributed, with a mean of −1.267 and a standard deviation of 0.827; 93.7% of the distribution is less than 0, and 6.3% is greater than 0. Therefore, the number of accidents decreases in most of the unsignalized circulatory lanes. The average marginal effect for the traffic signal indicator shows that the number of accidents decreases by 13.57 in the random-parameter model and by 12.4 in the fixed-parameter model. Unsignalized circulatory lanes probably are safer because they are not located on motorways; most are fully at-grade junctions and have low traffic volumes.
Within circulatory lanes, a 1-m increase in inscribed circle diameter led to an accident increase of 0.083 in the random-parameter model over the 11-year period (0.084 in the fixed-parameter model) (Table 3). This result supports Retting's findings (33). Rodegerdts et al. found that a larger inscribed circle diameter leads vehicles to increase their circulating speed and therefore decreases roundabout safety (3).
AADT was insignificant within circulatory lanes, but the percentage of truck traffic had a highly significant effect on the number of accidents: a 1% increase in truck traffic increased the expected number of accidents by an average of 0.90% in the random-parameter model and by an average of 0.748% in the fixed-parameter model.
At approaches, entry width had an insignificant effect on the number of accidents. Maycock and Hall found that entry width had a significant effect on reducing accident frequency (14); however, Retting found that the roundabouts with wider entries were less safe (33).
The approach two-lane indicator produced a random parameter with a standard deviation significantly different from 0. The lane number indicator is normally distributed with a mean of 0.164 and a standard deviation of 0.409. This distribution indicates that 34% is less than 0 and 66% is greater than 0, which means that the number of accidents increased on more than one-half of the approaches with two lanes, by an average of 1.25 over the 11-year period (Table 3). (The average marginal effect was 1.76 in the fixed-parameter model.) However, the distribution of the indicator shows that the number of accidents is lower on some two-lane approaches (34%). In the before-and-after studies by Daniels et al. (34) and Persaud et al. (11), roundabout approaches with two lanes tended to perform worse; Brüde and Larsson found that the number of lanes was a significant variable (18).
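The shares below and above zero quoted for the normally distributed random parameters follow from the normal cumulative distribution function evaluated at zero. The sketch below reproduces them from the reported means and standard deviations; the function name and dictionary labels are introduced here for illustration.

```python
from statistics import NormalDist


def share_below_zero(mean, std_dev):
    """Share of a normally distributed parameter that falls below zero."""
    return NormalDist(mean, std_dev).cdf(0.0)


# Means and standard deviations as reported in the text.
random_parameters = {
    "truck percentage (whole roundabouts)": (0.06, 0.055),
    "unsignalized indicator (circulatory lanes)": (-1.267, 0.827),
    "two-lane indicator (approaches)": (0.164, 0.409),
}
for name, (mean, std_dev) in random_parameters.items():
    below = share_below_zero(mean, std_dev)
    print(f"{name}: {100 * below:.1f}% below 0, {100 * (1 - below):.1f}% above 0")
```

The same calculation applies to any normally distributed random parameter once its estimated mean and standard deviation are known.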
All signalized approaches had a significant effect on increasing the number of accidents. Table 3 shows that accidents increase by 1.81 (versus 1.47 in the fixed-parameter model) with signalized approaches. However, this result is in contrast with that of Martin, who reported that at-grade roundabouts and grade-separated roundabouts reduced collisions by 28% and 6%, respectively, after signalization (38). In addition, according to the UK Department of Transport, accidents decreased when roundabouts were signalized because signals regulate traffic speed (39). Presumably, those junctions that were modified had exhibited accident rates at the high end of the range before being signalized, and even though signalization reduced the accident rates at those roundabouts, the reduction was not sufficient to reach the values exhibited by roundabouts less in need of signalization.
At 99.99% of the approaches to grade-separated roundabouts, which permit higher traffic speeds, the number of accidents was on average 5.40 higher (6.52 in the fixed-parameter model) than at approaches to at-grade roundabouts. A probable reason for this difference is that grade-separated roundabouts are located at high-traffic-volume motorway junctions and have large inscribed circle diameters.
AADT has a fixed effect on accident occurrence that is significant at the 99% confidence level, which means that across roundabout approaches, the number of accidents increases as AADT increases. A 1% increase in AADT led to a 0.66% increase in the expected number of accidents.
Figures 3 through 5 compare predicted with actual values for random- and fixed-parameter models for the three roundabout categories (whole, circulatory lanes, and approaches). The random-parameter models provide better overall fits.

Summary and Conclusions
Accident prediction models were estimated for 70 whole roundabouts, within circulatory lanes, and at roundabout approaches on motorways and A-class roads in the United Kingdom. Random-parameter NB count data models were used and their results compared with those of fixed-parameter models. The random-parameter models for whole roundabouts, within circulatory lanes, and at roundabout approaches provide better overall models than the fixed-parameter models. The improvement of the random-parameter model over the fixed-parameter model is significant at more than the 95% and 99% confidence levels for whole roundabouts and circulatory lanes, respectively. Moreover, the relationship between actual and predicted values implies that the random-parameter model fits the data better than fixed-parameter models for whole roundabouts, within circulatory lanes, and at roundabout approaches.
The effect of some parameters on accidents varies significantly across observations and results in random parameters in the random-parameter models: percentage of truck traffic for whole roundabouts, traffic signalization within circulatory lanes, and grade type and number of lanes indicators at approaches. Table 4 summarizes the significant variables found in the three random-parameter models, lists the random and fixed variables in those models, and gives the average marginal effect for each of those variables. For the random parameters, the percentage of observations in which the actual marginal effect is greater than 0 (or, in the case of signalized circulatory lanes, less than 0) also is given.
The inscribed circle diameter was associated with an increased risk of circulating accidents. The marginal effect, although statistically highly significant, was small: an increase of only 0.22 and 0.083 accidents over the 11-year period for models of whole roundabouts and within circulatory lanes, respectively. All unsignalized roundabouts had fewer accidents overall. All signalized approaches had more accidents, and most unsignalized circulatory lanes had fewer accidents. Approaches located at grade-separated roundabouts had more accidents than those at at-grade roundabouts. The majority of the two-lane approaches (66%) had more accidents.
The primary aim of this study was not to determine accident cause and effect but rather to provide better tools for understanding the likelihood of roundabout accidents. The random-parameter models are better than the fixed-parameter models because they identify more significant variables, better fit the data (as indicated by the relationship between actual and predicted values), and provide information about the number of observations that have a marginal effect greater than 0 for the random parameters identified.
Many of the observations may appear counterintuitive at first. For example, unsignalized roundabouts or circulatory lanes may experience fewer accidents not because signals cause accidents but because unsignalized roundabouts and circulatory lanes generally carry less traffic and therefore present fewer opportunities for traffic conflicts. However, the relationship between geometric parameters of the roundabout and the possibility of accident occurrence suggests that certain characteristics influence accident occurrence or prediction. Therefore, additional investigation of the interactions among traffic flow, signalization, and element widths and diameter is warranted. More work also is needed so that sight lines, pavement condition, and driver behavior can be included as independent variables in accident prediction. Furthermore, an interesting follow-on study would apply this approach to roundabouts in other (e.g., urban) areas.