Similar Literature
20 similar documents retrieved.
1.
There has been considerable research conducted over the last 20 years focused on predicting motor vehicle crashes on transportation facilities. The range of statistical models commonly applied includes binomial, Poisson, Poisson-gamma (or negative binomial), zero-inflated Poisson and negative binomial models (ZIP and ZINB), and multinomial probability models. Given the range of possible modeling approaches and the host of assumptions with each modeling approach, making an intelligent choice for modeling motor vehicle crash data is difficult. There is little discussion in the literature comparing different statistical modeling approaches, identifying which statistical models are most appropriate for modeling crash data, and providing a strong justification from basic crash principles. In the recent literature, it has been suggested that the motor vehicle crash process can successfully be modeled by assuming a dual-state data-generating process, which implies that entities (e.g., intersections, road segments, pedestrian crossings, etc.) exist in one of two states: perfectly safe and unsafe. As a result, the ZIP and ZINB are two models that have been applied to account for the preponderance of "excess" zeros frequently observed in crash count data. The objective of this study is to provide defensible guidance on how to appropriately model crash data. We first examine the motor vehicle crash process using theoretical principles and a basic understanding of the crash process. It is shown that the fundamental crash process follows a Bernoulli trial with unequal probability of independent events, also known as Poisson trials. We examine the evolution of statistical models as they apply to the motor vehicle crash process, and indicate how well they statistically approximate the crash process. We also present the theory behind dual-state process count models, and note why they have become popular for modeling crash data. A simulation experiment is then conducted to demonstrate how crash data give rise to the "excess" zeros frequently observed in crash data. It is shown that the Poisson and other mixed probabilistic structures are approximations assumed for modeling the motor vehicle crash process. Furthermore, it is demonstrated that under certain (fairly common) circumstances excess zeros are observed, and that these circumstances arise from low exposure and/or inappropriate selection of time/space scales, not from an underlying dual-state process. In conclusion, carefully selecting the time/space scales for analysis, including an improved set of explanatory variables and/or unobserved heterogeneity effects in count regression models, or applying small-area statistical methods (for observations with low exposure) represent the most defensible modeling approaches for datasets with a preponderance of zeros.
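The simulation logic is easy to reproduce. Below is a minimal sketch (not the authors' original experiment; all parameter values are assumed for illustration) showing how a pure Poisson process with low exposure, and no dual-state mechanism, already yields a preponderance of zeros that shrinking the time scale only amplifies:

```python
# Minimal sketch: excess zeros from low exposure under a pure Poisson
# process, with no dual-state (safe/unsafe) mechanism involved.
import numpy as np

rng = np.random.default_rng(42)

n_sites = 1000
# Heterogeneous site means: low exposure -> small expected crash counts.
mu = rng.gamma(shape=0.8, scale=0.5, size=n_sites)   # mean ~0.4 crashes/period
counts = rng.poisson(mu)

print(f"share of zero-crash sites: {(counts == 0).mean():.2%}")
# Shrinking the time scale scales mu down and inflates the zero share further.
counts_short = rng.poisson(mu / 12)                  # monthly instead of yearly
print(f"zero share at 1/12 the time scale: {(counts_short == 0).mean():.2%}")
```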

2.
Efficient geometric design and signal timing not only improve operational performance at signalized intersections by expanding capacity and reducing traffic delays, but also appreciably reduce traffic conflicts, and thus improve road safety. Information on the incidence of crashes, traffic flow, geometric design, road environment, and traffic control at 262 signalized intersections in Hong Kong during 2002 and 2003 is incorporated into a crash prediction model. Poisson regression and negative binomial regression are used to quantify the influence of possible contributory factors on the incidence of killed and severe injury (KSI) crashes and slight injury crashes, respectively, while possible interventions by traffic flow are controlled. The results for the incidence of slight injury crashes reveal that the road environment, degree of curvature, and presence of tram stops are significant factors, and that traffic volume has a diminishing effect on the crash risk. The presence of tram stops, number of pedestrian streams, road environment, proportion of commercial vehicles, average lane width, and degree of curvature increase the risk of KSI crashes, but the effect of traffic volume is negligible.
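As a hedged sketch of this kind of crash prediction model, the following fits a negative binomial GLM with statsmodels on synthetic data; the Hong Kong intersection data are not reproduced here, and the variable names and coefficients are illustrative assumptions:

```python
# Sketch of an intersection crash-frequency NB regression on synthetic data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 262
aadt = rng.uniform(5e3, 6e4, n)          # entering traffic volume (assumed)
curvature = rng.uniform(0, 10, n)        # degree of curvature (assumed)
X = sm.add_constant(np.column_stack([np.log(aadt), curvature]))

# Synthetic crash counts with an NB error structure around a log-linear mean.
mu = np.exp(-6.0 + 0.6 * np.log(aadt) + 0.05 * curvature)
y = rng.negative_binomial(n=2.0, p=2.0 / (2.0 + mu))   # NB with mean mu

# Note: the GLM NB family takes a fixed dispersion alpha; estimating alpha
# would require statsmodels' discrete NegativeBinomial model instead.
nb = sm.GLM(y, X, family=sm.families.NegativeBinomial(alpha=0.5)).fit()
print(nb.summary())
```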

3.
Motor vehicle crashes are a leading cause of death for young people in the United States. Assessing which drivers are at a high risk of experiencing a crash is important for the implementation of traffic regulations. Illegal street racing has been associated with a high rate of motor vehicle crashes. In this study, we link Utah statewide driver license citations and motor vehicle crash data to evaluate the rate of crashes for drivers with street racing citations relative to other drivers. Using a zero-inflated negative binomial model we found that drivers with no citations are approximately three times more likely to be at zero risk of a crash compared to drivers with street racing citations. Moreover, among drivers at non-negligible risk of crash, cited street racers are more likely to be involved in a crash compared to drivers without citations or those cited for violations other than street racing. However, drivers with increased numbers of non-street-racing citations experience crash risks approaching those of the cited street racers.
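A minimal sketch of fitting a zero-inflated negative binomial of the kind used in the study, with statsmodels on synthetic data; `racing_cited` and all generating parameters are illustrative stand-ins, not the Utah data:

```python
# Sketch: ZINB with a zero-inflation (zero-risk state) equation and a count
# equation; the at-risk counts are generated as Poisson here for simplicity.
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

rng = np.random.default_rng(1)
n = 5000
racing_cited = rng.binomial(1, 0.05, n)        # hypothetical citation flag
other_citations = rng.poisson(0.5, n)

X = sm.add_constant(np.column_stack([racing_cited, other_citations]))

# Synthetic crashes: cited racers are less likely to sit in the zero state.
p_zero = 1 / (1 + np.exp(-(1.0 - 1.1 * racing_cited)))
mu = np.exp(-1.0 + 0.8 * racing_cited + 0.3 * other_citations)
at_risk = rng.random(n) > p_zero
y = np.where(at_risk, rng.poisson(mu), 0)

zinb = ZeroInflatedNegativeBinomialP(y, X, exog_infl=X).fit(maxiter=200, disp=0)
print(zinb.summary())
```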

4.
Predicting motor vehicle crashes using Support Vector Machine models
Crash prediction models have been very popular in highway safety analyses. However, in highway safety research, the prediction of outcomes is seldom, if ever, the only research objective when estimating crash prediction models. Only very few existing methods can be used to efficiently predict motor vehicle crashes. Thus, there is a need to examine new methods for better predicting motor vehicle crashes. The objective of this study is to evaluate the application of Support Vector Machine (SVM) models for predicting motor vehicle crashes. SVM models, which are based on statistical learning theory, are a new class of models that can be used for predicting values. To accomplish the objective of this study, Negative Binomial (NB) regression and SVM models were developed and compared using data collected on rural frontage roads in Texas. Several models were estimated using different sample sizes. The study shows that SVM models predict crash data more effectively and accurately than traditional NB models. In addition, SVM models do not over-fit the data and offer similar, if not better, performance than Back-Propagation Neural Network (BPNN) models documented in previous research. Given this characteristic and the fact that SVM models are faster to implement than BPNN models, these models are recommended when the sole purpose of the study is predicting motor vehicle crashes.
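For illustration, the following hedged sketch sets up an SVM regression of the kind compared against NB models in the study, using scikit-learn's epsilon-SVR on synthetic segment data (the Texas frontage-road data and the authors' exact SVM formulation are not reproduced):

```python
# Sketch: SVR as a crash-count predictor on synthetic segment data.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(2)
n = 400
X = np.column_stack([rng.uniform(100, 5000, n),    # ADT (assumed)
                     rng.uniform(0.1, 2.0, n)])    # segment length, mi
mu = np.exp(-6.0 + 0.9 * np.log(X[:, 0]) + np.log(X[:, 1]))
y = rng.poisson(mu)                                # synthetic crash counts

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
svm = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
svm.fit(X_tr, y_tr)
print("SVR MAE:", mean_absolute_error(y_te, svm.predict(X_te)))
```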

5.
This paper documents the application of the Conway–Maxwell–Poisson (COM-Poisson) generalized linear model (GLM) for modeling motor vehicle crashes. The COM-Poisson distribution, originally developed in 1962, has recently been re-introduced by statisticians for analyzing count data subject to over- and under-dispersion. This innovative distribution is an extension of the Poisson distribution. The objectives of this study were to evaluate the application of the COM-Poisson GLM for analyzing motor vehicle crashes and compare the results with the traditional negative binomial (NB) model. The comparison analysis was carried out using the most common functional forms employed by transportation safety analysts, which link crashes to the entering flows at intersections or on segments. To accomplish the objectives of the study, several NB and COM-Poisson GLMs were developed and compared using two datasets. The first dataset contained crash data collected at signalized four-legged intersections in Toronto, Ont. The second dataset included data collected for rural four-lane divided and undivided highways in Texas. Several methods were used to assess the statistical fit and predictive performance of the models. The results of this study show that COM-Poisson GLMs perform as well as NB models in terms of GOF statistics and predictive performance. Given that the COM-Poisson distribution can also handle under-dispersed data (while the NB distribution cannot, or has difficulty converging), which have sometimes been observed in crash databases, the COM-Poisson GLM offers a better alternative over the NB model for modeling motor vehicle crashes, especially given the important limitations recently documented in the safety literature about the latter type of model.
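The COM-Poisson pmf underlying this GLM is P(Y = y) = λ^y / ((y!)^ν Z(λ, ν)), with ν < 1 giving over-dispersion, ν = 1 recovering the Poisson, and ν > 1 under-dispersion. A small self-contained sketch (truncating the normalizing constant is an assumption for illustration):

```python
# Sketch of the COM-Poisson probability mass function.
import math

def com_poisson_pmf(y: int, lam: float, nu: float, terms: int = 200) -> float:
    """COM-Poisson pmf, with the normalizing constant Z truncated at `terms`."""
    log_w = lambda s: s * math.log(lam) - nu * math.lgamma(s + 1)
    log_z = math.log(sum(math.exp(log_w(s)) for s in range(terms)))
    return math.exp(log_w(y) - log_z)

# nu = 1 should match the Poisson pmf exactly.
print(com_poisson_pmf(3, lam=2.0, nu=1.0))          # ~0.1804
print(math.exp(-2.0) * 2.0**3 / math.factorial(3))  # Poisson check
```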

6.
This paper presents an empirical inquiry into the applicability of zero-altered counting processes to roadway section accident frequencies. The intent of such a counting process is to distinguish sections of roadway that are truly safe (near zero-accident likelihood) from those that are unsafe but happen to have zero accidents observed during the period of observation (e.g., one year). Traditional applications of Poisson and negative binomial accident frequency models do not account for this distinction and thus can produce biased coefficient estimates because of the preponderance of zero-accident observations. Zero-altered probability processes such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) distributions are examined and proposed for accident frequencies by roadway functional class and geographic location. The findings show that the ZIP structure models are promising and have great flexibility in uncovering processes affecting accident frequencies on roadway sections observed with zero accidents and those with observed accident occurrences. This flexibility allows highway engineers to better isolate design factors that contribute to accident occurrence and also provides additional insight into variables that determine the relative accident likelihoods of safe versus unsafe roadways. The generic nature of the models and the relatively good power of the Vuong specification test used in the non-nested hypotheses of model specifications offer roadway designers the potential to develop a global family of models for accident frequency prediction that can be embedded in a larger safety management system.
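The Vuong specification test mentioned above reduces to a few lines of code. A hedged sketch, following Vuong (1989), that compares two non-nested models from their per-observation log-likelihoods (the statsmodels usage in the comment is illustrative, not the paper's implementation):

```python
# Sketch of the Vuong statistic for comparing non-nested count models,
# e.g. a ZIP against a standard Poisson fit to the same data.
import numpy as np
from scipy import stats

def vuong(loglik_1: np.ndarray, loglik_2: np.ndarray) -> tuple[float, float]:
    """Vuong statistic from per-observation log-likelihoods of two
    non-nested models; positive values favour model 1."""
    m = loglik_1 - loglik_2
    v = np.sqrt(len(m)) * m.mean() / m.std(ddof=1)
    return v, 2 * stats.norm.sf(abs(v))           # statistic, two-sided p

# Illustrative usage with statsmodels results fit to identical endog:
# v, p = vuong(zip_res.model.loglikeobs(zip_res.params),
#              poisson_res.model.loglikeobs(poisson_res.params))
```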

7.
The intent of this note is to succinctly articulate additional points that were not provided in the original paper (Lord et al., 2005) and to help clarify a collective reluctance to adopt zero-inflated (ZI) models for modeling highway safety data. A dialogue on this important issue, just one of many important safety modeling issues, is healthy discourse on the path towards improved safety modeling. This note first provides a summary of prior findings and conclusions of the original paper. It then presents two critical and relevant issues: the maximizing statistical fit fallacy and logic problems with the ZI model in highway safety modeling. Finally, we provide brief conclusions.

8.
In this study, the safety of cyclists at unsignalized priority intersections within built-up areas is investigated. The study focuses on the link between the characteristics of priority intersection design and bicycle–motor vehicle (BMV) crashes. At the 540 intersections included in the study, the police recorded 339 failure-to-yield crashes involving cyclists over four years. These BMV crashes are classified into two types based on the movements of the involved motorists and cyclists:
- type I: through-bicycle-related collisions where the cyclist has right of way (i.e., the bicycle is on the priority road);
- type II: through-motor-vehicle-related collisions where the motorist has right of way (i.e., the motorist is on the priority road).
The probability of each crash type was related to its relative flows and to independent variables using negative binomial regression. The results show that more type I crashes occur at intersections with two-way bicycle tracks and with well-marked, reddish-coloured bicycle crossings. Type I crashes are negatively related to the presence of raised bicycle crossings (e.g., on a speed hump) and other speed-reducing measures. The crash probability is also lower at intersections where the cycle track approaches are deflected between 2 and 5 m away from the main carriageway. No significant relationships are found between type II crashes and road factors such as the presence of a raised median.

9.
An extensive programme of periodic motor vehicle inspection was introduced in Norway after 1995, when the treaty between Norway and the European Union (EU) granting Norway (not a member of the EU) access to the EU inner market took effect (the EEA treaty). This paper evaluates the effects of periodic car inspections on accidents. Trucks and buses were not included in the study. Negative binomial regression models were fitted to data on accidents and inspections created by merging data files provided by a major insurance company and by the Public Roads Administration. Technical defects prior to inspection were associated with an increased accident rate. Inspections were found to strongly reduce the number of technical defects in cars. Despite this, no effect of inspections on the accident rate was found. This finding is inconsistent with the fact that technical defects appear to increase the accident rate; one would expect the repair of such defects to reduce the accident rate. Potential explanations of the findings in terms of behavioural adaptation among car owners are discussed. It is suggested that car owners adapt their driving behaviour to the technical condition of the car, and that the effect attributed to technical defects before inspection may in part be the result of a tendency for owners who are less concerned about safety to neglect the technical condition of their cars. These car owners might have had a higher accident rate than other car owners irrespective of the technical condition of the car.

10.
In recent years there have been numerous studies that have sought to understand the factors that determine the frequency of accidents on roadway segments over some period of time, using count data models and their variants (negative binomial and zero-inflated models). This study seeks to explore the use of random-parameters count models as another methodological alternative in analyzing accident frequencies. The empirical results show that random-parameters count models have the potential to provide a fuller understanding of the factors determining accident frequencies.
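A minimal simulation sketch of the random-parameters idea (all values assumed): letting a coefficient vary randomly across segments in a Poisson count model induces the extra marginal variance that random-parameters models are built to capture:

```python
# Sketch: a normally distributed segment-specific slope in a Poisson model.
import numpy as np

rng = np.random.default_rng(3)
n = 2000
x = rng.uniform(0, 1, n)                          # e.g. standardized curvature
beta_i = rng.normal(loc=0.5, scale=0.4, size=n)   # segment-specific slope
y = rng.poisson(np.exp(-0.5 + beta_i * x))

# The induced marginal counts are over-dispersed relative to a fixed-beta
# Poisson (variance exceeds the mean), which is what a random-parameters
# count model is designed to represent.
print("mean:", y.mean(), "variance:", y.var())
```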

11.
INTRODUCTION: Thoracic trauma secondary to motor vehicle crashes (MVC) continues to be a major cause of morbidity and mortality. Specific vehicle features may increase the risk of severe thoracic injury when striking the occupant. We sought to determine which vehicle contact points were associated with an increased risk of severe thoracic injury in MVC, to focus the subsequent design modifications necessary to reduce thoracic injury. METHODS: The National Automotive Sampling System (NASS) databases from 1993 to 2001 and the Crash Injury Research and Engineering Network (CIREN) databases from 1996 to 2004 were analyzed separately using univariate and multivariate logistic regression stratified by restraint use and crash direction. The risk of driver thoracic injury, defined as an abbreviated injury scale (AIS) score of 3 or greater, was determined as it related to specific points of contact between the vehicle and the driver. RESULTS: The incidence of severe chest injury in NASS and CIREN was 5.5% and 33%, respectively. The steering wheel, door panel, armrest, and seat were identified as contact points associated with an increased risk of severe chest injury. The door panel and armrest were consistently a frequent cause of severe injury in both the NASS and CIREN data. CONCLUSIONS: Several vehicle contact points, including the steering wheel, door panel, armrest, and seat, are associated with an increased risk of severe thoracic injury when striking the occupant. These elements need to be further investigated to determine which characteristics should be modified to reduce thoracic trauma during a crash.
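A minimal sketch of this kind of logistic regression on synthetic data; the contact-point indicators and coefficients are illustrative assumptions, not estimates from NASS or CIREN:

```python
# Sketch: severe chest injury (AIS >= 3) regressed on contact-point flags.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 3000
steering_wheel = rng.binomial(1, 0.30, n)   # hypothetical contact indicators
door_panel = rng.binomial(1, 0.25, n)
X = sm.add_constant(np.column_stack([steering_wheel, door_panel]))

logit_p = -2.5 + 0.9 * steering_wheel + 0.7 * door_panel   # assumed effects
severe = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

res = sm.Logit(severe, X).fit(disp=0)
print(np.exp(res.params))   # odds ratios for each contact point
```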

12.
Since the factors contributing to crash frequency and severity usually differ, an integrated model under the multinomial generalized Poisson (MGP) architecture is proposed to analyze crash frequency and severity simultaneously, making the estimation results more efficient and useful. Considering the substitution pattern among severity levels and the shared error structure, four models are proposed and compared: the MGP model with and without error components (the EMGP and MGP models, respectively) and two nested generalized Poisson (NGP) models. A case study based on accident data for Taiwan's No. 1 Freeway is conducted. The results show that the EMGP model has the best goodness-of-fit and prediction accuracy indices. Additionally, the estimation results show that the factors contributing to crash frequency and severity differ markedly. Safety improvement strategies are proposed accordingly.

13.
The modeling of crash count data is a very important topic in highway safety. As documented in the literature, given the characteristics associated with crash data, transportation safety analysts have proposed a significant number of analysis tools, statistical methods, and models for analyzing such data. One prominent data issue arises with crash datasets that have a large number of zeros and a long or heavy tail. It has been found that analyzing this kind of dataset with the wrong statistical tools or methods can lead to erroneous results or conclusions. Thus, the purpose of this paper is to introduce a new distribution, known as the negative binomial-Lindley (NB-L), which has very recently been introduced for analyzing data characterized by a large number of zeros. The NB-L offers the advantage of being able to handle this kind of dataset while still maintaining characteristics similar to those of the traditional negative binomial (NB). In other words, the NB-L is a two-parameter distribution and its long-term mean is never equal to zero. To examine this distribution, simulated and observed data were used. The results show that the NB-L can provide a better statistical fit than the traditional NB for datasets that contain a large number of zeros.
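As a rough illustration only, and an assumption about the construction rather than the paper's exact formulation, the following simulates an NB count whose mean is scaled by a Lindley-distributed frailty, which fattens the tail and inflates zeros relative to the plain NB:

```python
# Sketch: Lindley frailty on an NB mean (an assumed stand-in for the NB-L).
import numpy as np

rng = np.random.default_rng(4)

def sample_lindley(theta: float, size: int) -> np.ndarray:
    """Lindley(theta) draws via its mixture form: Exp(theta) with weight
    theta/(1+theta), else Gamma(2, scale=1/theta)."""
    pick_exp = rng.random(size) < theta / (1 + theta)
    return np.where(pick_exp,
                    rng.exponential(1 / theta, size),
                    rng.gamma(2.0, 1 / theta, size))

n, mu, phi, theta = 50_000, 1.0, 2.0, 1.5
eps = sample_lindley(theta, n)
m = mu * eps                                   # frailty-scaled NB mean
y = rng.negative_binomial(phi, phi / (phi + m))

nb_only = rng.negative_binomial(phi, phi / (phi + mu), n)
print("zero share, NB-L-style:", (y == 0).mean())
print("zero share, plain NB:  ", (nb_only == 0).mean())
```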

14.
Crash data can often be characterized by over-dispersion, a heavy (long) tail, and many observations with the value zero. Over the last few years, a small number of researchers have started developing and applying novel and innovative multi-parameter models to analyze such data. These multi-parameter models have been proposed for overcoming the limitations of the traditional negative binomial (NB) model, which cannot handle this kind of data efficiently. The research documented in this paper continues the work related to multi-parameter models. The objective of this paper is to document the development and application of a flexible NB generalized linear model with randomly distributed mixed effects characterized by the Dirichlet process (NB-DP) to model crash data. This objective was accomplished using two datasets. The new model was compared to the NB model and to the recently introduced model based on the mixture of the NB and Lindley distributions (NB-L). Overall, the research study shows that the NB-DP model offers better performance than the NB model when data are over-dispersed and have a heavy tail. The NB-DP performed better than the NB-L when the dataset has a heavy tail but a smaller percentage of zeros. However, both models performed similarly when the dataset contained a large number of zeros. In addition to its greater flexibility, the NB-DP provides a clustering by-product that allows the safety analyst to better understand the characteristics of the data, such as the identification of outliers and sources of dispersion.

15.
Falls and their injury outcomes have count distributions that are highly skewed toward the right with clumping at zero, posing analytical challenges. Different modelling approaches have been used in the published literature to describe falls count distributions, often without consideration of the underlying statistical and modelling assumptions. This paper compares the use of modified Poisson and negative binomial (NB) models as alternatives to Poisson (P) regression for the analysis of fall outcome counts. Four different count-based regression models (P, NB, zero-inflated Poisson (ZIP), and zero-inflated negative binomial (ZINB)) were each individually fitted to four separate fall count datasets from Australia, New Zealand, and the United States. Finite mixtures of P and NB regression models were also compared to the standard NB model. Both analytical (F, Vuong, and bootstrap tests) and graphical approaches were used to select and compare models. Simulation studies assessed the size and power of each model fit. This study confirms that falls count distributions are over-dispersed, but not dispersed due to excess zero counts or a heterogeneous population. Accordingly, the P model generally provided the poorest fit to all datasets. The fit improved significantly with the NB and both zero-inflated models. The NB model also fit better than the finite mixtures of P and NB regression models. Although there was little difference in fit between the NB and ZINB models, in the interests of parsimony it is recommended that future studies modelling falls count data routinely use the NB model in preference to the P, ZINB, or finite mixture models. The fact that these conclusions apply across four separate datasets, from four different samples of older people participating in studies of different methodology, adds strength to this general guiding principle.
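A hedged sketch of the four-model comparison (P, NB, ZIP, ZINB) ranked by AIC, fit with statsmodels on synthetic over-dispersed counts; the study's four fall datasets are not reproduced here and the data-generating values are assumptions:

```python
# Sketch: compare Poisson, NB, ZIP, and ZINB fits by AIC on synthetic data.
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.discrete_model import Poisson, NegativeBinomial
from statsmodels.discrete.count_model import (ZeroInflatedPoisson,
                                              ZeroInflatedNegativeBinomialP)

rng = np.random.default_rng(5)
n = 2000
age = rng.uniform(65, 95, n)
X = sm.add_constant((age - 80) / 10)

# Gamma frailty -> over-dispersed (NB-like) counts, no genuine zero inflation.
mu = np.exp(-0.3 + 0.4 * X[:, 1]) * rng.gamma(1.0, 1.0, n)
y = rng.poisson(mu)

fits = {
    "Poisson": Poisson(y, X).fit(disp=0),
    "NB": NegativeBinomial(y, X).fit(disp=0),
    "ZIP": ZeroInflatedPoisson(y, X, exog_infl=X).fit(disp=0, maxiter=200),
    "ZINB": ZeroInflatedNegativeBinomialP(y, X, exog_infl=X).fit(disp=0,
                                                                 maxiter=200),
}
for name, res in fits.items():
    print(f"{name:8s} AIC = {res.aic:8.1f}")   # NB should rank best here
```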

16.
As one of the major analysis methods, statistical models play an important role in traffic safety analysis. They can be used for a wide variety of purposes, including establishing relationships between variables and understanding the characteristics of a system. The purpose of this paper is to document a new type of model that can help with the latter. This model is based on the Generalized Waring (GW) distribution. The GW model yields more information about the sources of the variance observed in datasets than other traditional models, such as the negative binomial (NB) model. In this regard, the GW model can separate the observed variability into three parts: (1) the randomness, which explains the model's uncertainty; (2) the proneness, which refers to the internal differences between entities or observations; and (3) the liability, which is defined as the variance caused by other external factors that are difficult to identify and have not been included as explanatory variables in the model. The study analyses were accomplished using two observed datasets to explore potential sources of variation. The results show that the GW model can provide meaningful information about sources of variance in crash data and also performs better than the NB model.

17.
Hot spot identification (HSID) aims to identify potential sites—roadway segments, intersections, crosswalks, interchanges, ramps, etc.—with disproportionately high crash risk relative to similar sites. An inefficient HSID methodology might result in either identifying a safe site as high risk (false positive) or a high-risk site as safe (false negative), and consequently lead to the misuse of available public funds, poor investment decisions, and inefficient risk management practice. Current HSID methods suffer from issues like underreporting of minor injury and property damage only (PDO) crashes, challenges in incorporating crash severity into the methodology, and selection of a proper safety performance function to model crash data that are often heavily skewed by a preponderance of zeros. Addressing these challenges, this paper proposes a combination of a PDO equivalency calculation and a quantile regression technique to identify hot spots in a transportation network. In particular, issues related to underreporting and crash severity are tackled by incorporating equivalent PDO crashes, whilst the concerns related to the non-count nature of equivalent PDO crashes and the skewness of crash data are addressed by the non-parametric quantile regression technique. The proposed method identifies covariate effects on various quantiles of a population, rather than the population mean like most methods in practice, which more closely corresponds with how black spots are identified in practice. The proposed methodology is illustrated using rural road segment data from Korea and compared against the traditional EB method with negative binomial regression. Application of a quantile regression model on equivalent PDO crashes enables identification of a set of high-risk sites that reflect the true safety costs to society, simultaneously reduces the influence of under-reported PDO and minor injury crashes, and overcomes the limitations of the traditional NB model in dealing with a preponderance of zeros or a right-skewed dataset.
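A minimal sketch of quantile regression on a hypothetical equivalent-PDO outcome with statsmodels; the severity weight and data-generating values are assumptions, not the Korean data or the paper's weights:

```python
# Sketch: modelling a high quantile of a severity-weighted crash score,
# rather than its mean, as in quantile-regression-based HSID.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n = 1000
df = pd.DataFrame({"aadt": rng.uniform(1e3, 3e4, n),
                   "length_km": rng.uniform(0.2, 3.0, n)})
# Equivalent-PDO score: severity-weighted crash count (5:1 weight assumed).
df["epdo"] = (rng.poisson(df["aadt"] * df["length_km"] * 2e-5)
              + 5 * rng.poisson(0.05, n))

model = smf.quantreg("epdo ~ np.log(aadt) + length_km", df)
res_q90 = model.fit(q=0.90)            # covariate effects at the 90th quantile
print(res_q90.summary())
```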

18.
There has been a considerable amount of work devoted by transportation safety analysts to the development and application of new and innovative models for analyzing crash data. One important characteristic about crash data that has been documented in the literature is related to datasets that contain a large number of zeros and a long or heavy tail (which creates highly dispersed data). For such datasets, the number of sites where no crash is observed is so large that traditional distributions and regression models, such as the Poisson and Poisson-gamma or negative binomial (NB) models, cannot be used efficiently. To overcome this problem, the NB-Lindley (NB-L) distribution has recently been introduced for analyzing count data that are characterized by excess zeros. The objective of this paper is to document the application of a NB generalized linear model with Lindley mixed effects (NB-L GLM) for analyzing traffic crash data. The study objective was accomplished using simulated and observed datasets. The simulated dataset was used to show the general performance of the model. The model was then applied to two datasets based on observed data, one of which was characterized by a large number of zeros. The NB-L GLM was compared with the NB and zero-inflated models. Overall, the research study shows that the NB-L GLM not only offers superior performance over the NB and zero-inflated models when datasets are characterized by a large number of zeros and a long tail, but also when the crash dataset is highly dispersed.

19.
In this study, a two-state Markov switching count-data model is proposed as an alternative to zero-inflated models to account for the preponderance of zeros sometimes observed in transportation count data, such as the number of accidents occurring on a roadway segment over some period of time. For this accident-frequency case, zero-inflated models assume the existence of two states: one of the states is a zero-accident count state, which has accident probabilities that are so low that they cannot be statistically distinguished from zero, and the other state is a normal-count state, in which counts can be non-negative integers that are generated by some counting process, for example, a Poisson or negative binomial. While zero-inflated models have come under some criticism with regard to accident-frequency applications, one fact is undeniable: in many applications they provide a statistically superior fit to the data. The Markov switching approach we propose seeks to overcome some of the criticism associated with the zero-accident state of the zero-inflated model by allowing individual roadway segments to switch between zero and normal-count states over time. An important advantage of this Markov switching approach is that it allows for the direct statistical estimation of the specific roadway-segment state (i.e., zero-accident or normal-count state), whereas traditional zero-inflated models do not. To demonstrate the applicability of this approach, a two-state Markov switching negative binomial model (estimated with Bayesian inference) and standard zero-inflated negative binomial models are estimated using five-year accident frequencies on Indiana interstate highway segments. It is shown that the Markov switching model is a viable alternative and results in a superior statistical fit relative to the zero-inflated models.
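A hedged simulation sketch of the two-state Markov switching data-generating process described above (transition probabilities and NB parameters assumed), in which segments move between a zero-accident state and a normal-count state over time, unlike a static zero-inflated model:

```python
# Sketch: simulate segment counts under a two-state Markov switching process.
import numpy as np

rng = np.random.default_rng(7)
n_segments, n_years = 300, 5
P = np.array([[0.80, 0.20],      # zero state -> (zero, normal)
              [0.30, 0.70]])     # normal state -> (zero, normal)
phi, mu = 2.0, 1.5               # NB parameters for the normal-count state

counts = np.zeros((n_segments, n_years), dtype=int)
state = rng.integers(0, 2, n_segments)
for t in range(n_years):
    normal = state == 1
    counts[normal, t] = rng.negative_binomial(phi, phi / (phi + mu),
                                              normal.sum())
    # Segments move between states year to year; a static ZI model instead
    # pins each observation's zero-state membership once.
    state = np.array([rng.choice(2, p=P[s]) for s in state])

print("observed zero share:", (counts == 0).mean())
```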

20.
Developing sound and reliable statistical models for analyzing motor vehicle crashes is very important in highway safety studies. However, a significant difficulty associated with model development is that crash data often exhibit over-dispersion. Sources of dispersion can be varied and are usually unknown to the transportation analysts. These sources could potentially affect the development of negative binomial (NB) regression models, which are often the model of choice in highway safety. To help in this endeavor, this paper documents an alternative formulation that can be used for capturing heterogeneity in crash count models through the use of finite mixture regression models. Finite mixtures of Poisson or NB regression models are especially useful where count data are drawn from heterogeneous populations. Among other things, these models can help determine sub-populations or groups in the data. To evaluate these models, Poisson and NB mixture models were estimated using data collected in Toronto, Ontario, and compared to a standard NB regression model estimated using the same data. The results of this study show that the dataset appears to have been generated from two distinct sub-populations, each having different regression coefficients and degrees of over-dispersion. Although over-dispersion in crash data can be dealt with in a variety of ways, the mixture model can help reveal the nature of the over-dispersion in the data. It is therefore recommended that transportation safety analysts consider this type of model before the traditional NB model, especially when the data are suspected to belong to different groups.
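The mechanics of a finite mixture can be illustrated with a compact EM sketch for a two-component Poisson mixture without covariates (the paper estimates mixtures of Poisson/NB regressions; this simplified version is an assumption for illustration):

```python
# Sketch: EM for a two-component Poisson mixture, recovering sub-populations.
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(8)
y = np.concatenate([rng.poisson(1.0, 600), rng.poisson(6.0, 400)])

w, lam = np.array([0.5, 0.5]), np.array([0.5, 5.0])   # initial guesses
for _ in range(200):
    # E-step: posterior probability of each component for every site.
    dens = w * poisson.pmf(y[:, None], lam)           # shape (n, 2)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: update mixing weights and component means.
    w = resp.mean(axis=0)
    lam = (resp * y[:, None]).sum(axis=0) / resp.sum(axis=0)

print("weights:", w.round(3), "means:", lam.round(3))  # ~(0.6, 0.4), (1, 6)
```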

