Similar Articles (20 results)
1.
Much of the data collected on motor vehicle crashes is count data. The standard Poisson regression approach used to model this type of data does not take into account the fact that there are few crash events and hence many observed zeros. In this paper, we applied the zero-inflated Poisson (ZIP) model (which adjusts for the many observed zeros) and the negative binomial (NB) model to analyze young driver motor vehicle crashes. The results of the ZIP regression model are comparable to those from fitting an NB regression model for general over-dispersion. The findings highlight that driver confidence/adventurousness and the frequency of driving prior to licensing are significant predictors of crash outcome in the first 12 months of driving. We encourage researchers, when analyzing motor vehicle crash data, to consider the empirical frequency distribution first and to apply the ZIP and NB models in the presence of extra zeros due, for example, to under-reporting.
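
The excess-zeros mechanism described above can be illustrated with a minimal sketch of the ZIP probability mass function (the parameter names `lam` and `pi` are illustrative choices, not taken from the paper):

```python
import math

def zip_pmf(k, lam, pi):
    """P(K = k) under a zero-inflated Poisson: with probability pi the
    observation is a structural zero, otherwise K ~ Poisson(lam)."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi * (k == 0) + (1 - pi) * poisson

# The ZIP puts strictly more mass at zero than a plain Poisson(lam)
lam, pi = 2.0, 0.3
print(zip_pmf(0, lam, pi))   # inflated zero probability
print(math.exp(-lam))        # plain Poisson P(K = 0)
```

When pi = 0 the model reduces to the ordinary Poisson, while the NB model instead absorbs over-dispersion through a gamma-mixed rate; the two mechanisms can fit excess-zero data comparably well, as the paper reports.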

2.
The integrated circuits (ICs) on wafers are highly vulnerable to defects generated during the semiconductor manufacturing process. The spatial patterns of locally clustered defects are likely to contain information related to the defect generating mechanism. For the purpose of yield management, we propose a multi-step adaptive resonance theory (ART1) algorithm in order to accurately recognise the defect patterns scattered over a wafer. The proposed algorithm consists of a new similarity measure based on the p-norm ratio and run-length encoding technique, and a pre-processing procedure: the variable-resolution array and zooming strategy. The performance of the algorithm is evaluated based on statistical models for four types of simulated defect patterns, each of which typically occurs during fabrication of ICs: random patterns by a spatial homogeneous Poisson process, ellipsoid patterns by a multivariate normal, curvilinear patterns by a principal curve, and ring patterns by a spherical shell. Computational testing results show that the proposed algorithm provides high accuracy and robustness in detecting IC defects, regardless of the types of defect patterns residing on the wafer.

3.
Advances in technology have brought well-organized manufacturing systems that produce high-quality items. Therefore, monitoring and control of products have become a challenging task for quality inspectors. In these highly efficient processes, produced items are mostly zero-defect and are modeled with zero-inflated distributions. The zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) distributions are the most common distributions used to model high-yield and rare health-related processes. Therefore, data-based control charts under ZIP and ZINB distributions (i.e., Y-ZIP and Y-ZINB) have been proposed for the monitoring of high-quality processes. Usually, along with the defect counts, a few covariates are also measured in the process, and generalized linear models based on the ZIP and ZINB distributions are used to estimate their parameters. In this study, we design monitoring structures (i.e., PR-ZIP and PR-ZINB) based on the ZIP and ZINB regression models, which provide monitoring of defect counts while accounting for a single covariate. Further, the proposed model-based charts are compared with the existing data-based charts. A simulation study is designed to assess the performance of the monitoring methods in terms of run-length properties, and a case study on the number of flight delays between Atlanta and Orlando during 2012–2014 is provided to highlight the importance of the stated research.

4.
Semiconductor yield data is binary by nature since an integrated circuit (IC), or die, can only pass or fail a particular test. Still, many people rely on simple linear regression to analyse this type of data, which can produce incorrect results, e.g. negative yield predictions. We discuss the use of logistic regression for the analysis of yield data, and show that even this approach has to be modified to take into account the phenomenon of overdispersion. We present several approaches to accommodate overdispersion. We also discuss a statistic for measuring the spatial dependence of ICs within a wafer, the spatial log-odds ratio, which provides additional information to complement a yield analysis. These ideas are demonstrated using an example from our manufacturing area. © 1997 by John Wiley & Sons, Ltd.
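
The paper's point that linear regression can return impossible (e.g. negative) yield predictions, while a logistic model cannot, is easy to see in a sketch (the coefficients below are made up for illustration, not estimates from the paper):

```python
import math

def linear_yield(defects, b0=0.9, b1=-0.2):
    """Linear fit of pass probability: can leave [0, 1] for large inputs."""
    return b0 + b1 * defects

def logistic_yield(defects, b0=2.0, b1=-0.8):
    """Logistic fit: the log-odds are linear in the predictor, so the
    predicted pass probability always stays inside (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * defects)))

print(linear_yield(10))    # an impossible negative "probability"
print(logistic_yield(10))  # still a valid probability
```

Overdispersion (binomial variance larger than the model predicts, e.g. from spatial clustering of failing dies) would still need the corrections the paper discusses; this sketch only shows the range-of-prediction issue.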

5.
The zero-inflated Poisson (ZIP) distribution is an extension of the ordinary Poisson distribution and is used to model count data with an excessive number of zeros. In ZIP models, it is assumed that random shocks occur with probability p, and upon the occurrence of a random shock, the number of nonconformities in a product follows the Poisson distribution with parameter λ. In this article, we study in more detail the exponentially weighted moving average control chart based on the ZIP distribution (regarded as ZIP-EWMA) and we also propose a double EWMA chart with an upper time-varying control limit to monitor ZIP processes (regarded as ZIP-DEWMA chart). The two charts are studied to detect upward shifts not only in each parameter individually but also in both parameters simultaneously. The steady-state performance and the performance with estimated parameters are also investigated. The performance of the two charts has been evaluated in terms of the average and standard deviation of the run length and compared with Shewhart-type and CUSUM schemes for the ZIP distribution; the comparison shows that the proposed chart is very effective especially in detecting shifts in p when λ remains in control (IC) and in both parameters simultaneously. Finally, one real example is given to demonstrate the application of the ZIP charts for practitioners.
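
The EWMA recursion underlying the ZIP-EWMA chart is simple to state; this sketch shows the statistic the chart tracks (the smoothing weight and starting value are illustrative design choices, and the time-varying control limit is omitted):

```python
def ewma_path(counts, weight=0.1, z0=0.0):
    """Z_t = weight * X_t + (1 - weight) * Z_{t-1}; a ZIP-EWMA chart
    signals when Z_t crosses an upper control limit."""
    z, path = z0, []
    for x in counts:
        z = weight * x + (1 - weight) * z
        path.append(z)
    return path

# An upward shift in the count level pulls the EWMA statistic upward
print(ewma_path([0, 0, 0, 4, 4, 4], weight=0.2))
```

The DEWMA variant in the paper smooths this statistic a second time, which further dampens noise from isolated large counts.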

6.
There has been considerable research conducted over the last 20 years focused on predicting motor vehicle crashes on transportation facilities. The range of statistical models commonly applied includes binomial, Poisson, Poisson-gamma (or negative binomial), zero-inflated Poisson and negative binomial models (ZIP and ZINB), and multinomial probability models. Given the range of possible modeling approaches and the host of assumptions with each modeling approach, making an intelligent choice for modeling motor vehicle crash data is difficult. There is little discussion in the literature comparing different statistical modeling approaches, identifying which statistical models are most appropriate for modeling crash data, and providing a strong justification from basic crash principles. In the recent literature, it has been suggested that the motor vehicle crash process can successfully be modeled by assuming a dual-state data-generating process, which implies that entities (e.g., intersections, road segments, pedestrian crossings, etc.) exist in one of two states: perfectly safe and unsafe. As a result, the ZIP and ZINB are two models that have been applied to account for the preponderance of "excess" zeros frequently observed in crash count data. The objective of this study is to provide defensible guidance on how to appropriately model crash data. We first examine the motor vehicle crash process using theoretical principles and a basic understanding of the crash process. It is shown that the fundamental crash process follows a Bernoulli trial with unequal probability of independent events, also known as Poisson trials. We examine the evolution of statistical models as they apply to the motor vehicle crash process, and indicate how well they statistically approximate the crash process. We also present the theory behind dual-state process count models, and note why they have become popular for modeling crash data.
A simulation experiment is then conducted to demonstrate how crash data give rise to "excess" zeros frequently observed in crash data. It is shown that the Poisson and other mixed probabilistic structures are approximations assumed for modeling the motor vehicle crash process. Furthermore, it is demonstrated that under certain (fairly common) circumstances excess zeros are observed, and that these circumstances arise from low exposure and/or inappropriate selection of time/space scales and not an underlying dual-state process. In conclusion, carefully selecting the time/space scales for analysis, including an improved set of explanatory variables and/or unobserved heterogeneity effects in count regression models, or applying small-area statistical methods (observations with low exposure) represent the most defensible modeling approaches for datasets with a preponderance of zeros.
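
The simulation argument, that low exposure alone produces many zeros without any dual-state process, can be reproduced in a few lines (the rate 0.2 and the sample size are arbitrary choices for illustration):

```python
import math, random

def poisson_draw(lam, rng):
    """Knuth's inversion method for sampling Poisson(lam); fine for small lam."""
    L, k, p = math.exp(-lam), 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

rng = random.Random(42)
draws = [poisson_draw(0.2, rng) for _ in range(2000)]
zero_share = draws.count(0) / len(draws)
print(zero_share)        # close to exp(-0.2), about 0.82: "excess" zeros
print(math.exp(-0.2))    # from a single-state process, no zero inflation
```

A pure Poisson with a small mean already places over 80% of its mass at zero, which is the paper's core point about misreading low exposure as zero inflation.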

7.
The widely adopted techniques for regional crash modeling include the negative binomial model (NB) and Bayesian negative binomial model with conditional autoregressive prior (CAR). The outputs from both models consist of a set of fixed global parameter estimates. However, the impacts of predicting variables on crash counts might not be stationary over space. This study intended to quantitatively investigate this spatial heterogeneity in regional safety modeling using two advanced approaches, i.e., random parameter negative binomial model (RPNB) and semi-parametric geographically weighted Poisson regression model (S-GWPR).

8.
Yield analysis is one of the key concerns in the fabrication of semiconductor wafers. An effective yield analysis model will contribute to production planning and control, cost reductions and the enhanced competitiveness of enterprises. In this article, we propose a novel discrete spatial model based on defect data on wafer maps for analyzing and predicting wafer yields at different chip locations. More specifically, based on a Bayesian framework, we propose a hierarchical generalized linear mixed model, which incorporates both global trends and spatially correlated effects to characterize wafer yields with clustered defects. Both real and simulated data are used to validate the performance of the proposed model. The experimental results show that the newly proposed model offers an improved fit to spatially correlated wafer map data.

9.
This paper presents an empirical inquiry into the applicability of zero-altered counting processes to roadway section accident frequencies. The intent of such a counting process is to distinguish sections of roadway that are truly safe (near zero-accident likelihood) from those that are unsafe but happen to have zero accidents observed during the period of observation (e.g. one year). Traditional applications of Poisson and negative binomial accident frequency models do not account for this distinction and thus can produce biased coefficient estimates because of the preponderance of zero-accident observations. Zero-altered probability processes such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) distributions are examined and proposed for accident frequencies by roadway functional class and geographic location. The findings show that the ZIP structure models are promising and have great flexibility in uncovering processes affecting accident frequencies on roadway sections observed with zero accidents and those with observed accident occurrences. This flexibility allows highway engineers to better isolate design factors that contribute to accident occurrence and also provides additional insight into variables that determine the relative accident likelihoods of safe versus unsafe roadways. The generic nature of the models and the relatively good power of the Vuong specification test used in the non-nested hypotheses of model specifications offer roadway designers the potential to develop a global family of models for accident frequency prediction that can be embedded in a larger safety management system.

10.
Recently, machine learning-based technologies have been developed to automate the classification of wafer map defect patterns during semiconductor manufacturing. The existing approaches used in wafer map pattern classification include directly learning the image through a convolutional neural network and applying an ensemble method after extracting image features. This study aims to classify wafer map defects more effectively and derive robust algorithms even for datasets with insufficient defect patterns. First, the number of defect samples from the actual process may be limited. Therefore, additional data are generated using a convolutional auto-encoder (CAE), and the expanded data are verified using the structural similarity index measure (SSIM). After extracting handcrafted features, a boosted stacking ensemble model that integrates four base-level classifiers with an extreme gradient boosting classifier as the meta-level classifier is designed and trained on the expanded data for final prediction. Since the proposed algorithm shows better performance than existing ensemble classifiers even for insufficient defect patterns, the results of this study will contribute to improving the product quality and yield of the actual semiconductor manufacturing process.
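
SSIM, used above to screen the CAE-generated wafer maps, compares two images through their means, variances, and covariance. A global (single-window) sketch, with illustrative stabilizing constants:

```python
def ssim_global(x, y, c1=0.01, c2=0.03):
    """Single-window SSIM over two equal-length flattened images.
    (The full SSIM uses local sliding windows and constants scaled by the
    data range; this global version is a simplification for illustration.)"""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx * mx + my * my + c1) * (vx + vy + c2)
    return num / den

a = [0, 1, 1, 0, 1, 0, 0, 1]
print(ssim_global(a, a))                   # identical maps score 1
print(ssim_global(a, [1 - v for v in a]))  # an inverted map scores much lower
```

A threshold on such a score is one way to keep only generated maps that resemble the defect class they are meant to augment.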

11.
Unreliable chips tend to form spatial clusters on semiconductor wafers. The spatial patterns of these defects are largely reflected in functional testing results. However, the spatial cluster information of unreliable chips has not been fully used to predict the performance in field use in the literature. This paper proposes a novel wafer yield prediction model that incorporates the spatial clustering information in functional testing. Fused LASSO is first adopted to derive variables based on the spatial distribution of defect clusters. Then, a logistic regression model is used to predict the final yield (ratio of chips that remain functional until expected lifetime) with derived spatial covariates and functional testing values. The proposed model is evaluated both on real production wafers and in an extensive simulation study. The results show that by explicitly considering the characteristics of defect clusters, our proposed model provides improved performance compared to existing methods. Moreover, the cross‐validation experiments prove that our approach is capable of using historical data to predict yield on newly produced wafers.

12.
A common technique used for the calibration of collision prediction models is the Generalized Linear Modeling (GLM) procedure with the assumption of Negative Binomial or Poisson error distribution. In this technique, fixed coefficients that represent the average relationship between the dependent variable and each explanatory variable are estimated. However, the stationary relationship assumed may hide some important spatial factors of the number of collisions at a particular traffic analysis zone. Consequently, the accuracy of such models for explaining the relationship between the dependent variable and the explanatory variables may be suspected since collision frequency is likely influenced by many spatially defined factors such as land use, demographic characteristics, and traffic volume patterns. The primary objective of this study is to investigate the spatial variations in the relationship between the number of zonal collisions and potential transportation planning predictors, using the Geographically Weighted Poisson Regression modeling technique. The secondary objective is to build on knowledge comparing the accuracy of Geographically Weighted Poisson Regression models to that of Generalized Linear Models. The results show that the Geographically Weighted Poisson Regression models are useful for capturing spatially dependent relationships and generally perform better than the conventional Generalized Linear Models.

13.
Historically, the application of logistic and Poisson regression has focused on the social science and medical fields, where the response variable typically has only a few possible outcomes. These techniques are not commonly applied to characterize military operations even though response variables that measure success or failure are commonly encountered in this field. This paper explores the application of ordinal logistic and Poisson regression as alternatives to ordinary least squares estimation for modeling operational performance in a military testing environment. The operational test planners chose a nested face‐centered experimental design, which was executed to collect test data. Three modeling techniques were employed in the analysis: multiple linear regression, ordinal logistic regression, and Poisson regression. The purpose of the study was to determine which regression technique best fits the test data. Cross validation and model goodness comparison were accomplished by assessing the model fit for each model type in combination with a comparison of significant main effects and interactions. Finally, contrasts are provided relative to the ease of implementing each technique. Copyright © 2016 John Wiley & Sons, Ltd.

14.
The zero-inflated Poisson (ZIP) model is very useful in high-yield processes where an excessive number of zero observations exist. This model can be viewed as an extension of the standard Poisson distribution. In this paper, a one-sided generally weighted moving average (GWMA) control chart is proposed for monitoring upward shifts in the two parameters of a ZIP process (regarded as the ZIP-GWMA chart). The design parameters of the proposed chart are provided, and through a simulation study, it is shown that the ZIP-GWMA performs better than the existing control charts under shifts in both parameters. Moreover, an illustrative example is presented to demonstrate the application of the proposed chart for practitioners.
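
The GWMA statistic generalizes the EWMA by drawing its weights from a discrete distribution; the following sketch computes the chart statistic (the values of `q` and `alpha` are illustrative design parameters, and the control limit is omitted):

```python
def gwma_path(counts, q=0.9, alpha=0.7):
    """Y_t = sum_{j=1..t} (q**((j-1)**alpha) - q**(j**alpha)) * X_{t-j+1}.
    With alpha = 1 the weights reduce to q**(j-1) * (1 - q), i.e. an EWMA
    with smoothing constant 1 - q (and no head-start term)."""
    path = []
    for t in range(1, len(counts) + 1):
        y = sum((q ** ((j - 1) ** alpha) - q ** (j ** alpha)) * counts[t - j]
                for j in range(1, t + 1))
        path.append(y)
    return path

# The statistic climbs once the ZIP counts shift upward
print(gwma_path([0, 0, 5, 5, 5], q=0.9, alpha=0.7))
```

Choosing alpha below 1 puts relatively more weight on older observations than an EWMA does, which is what gives the GWMA its extra sensitivity to small sustained shifts.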

15.
Safe Communities (SC) is a global movement that brings together community stakeholders to collaboratively address injury concerns. SC accreditation is a formal process through which communities are recognized for strengthening local injury prevention capacity. Six million Americans live in 25 SC sites, but no research has been done to understand the model’s potential impact on this population. This study explored the temporal relationship between SC accreditation and injury trends in three SC sites from the state of Illinois—Arlington Heights, Itasca, and New Lenox. Hospitalization data, including patient demographics, exposure information, injury outcomes, and economic variables, were obtained from a statewide hospital discharge database for a 12-year period (1999–2011). Joinpoint regression models were fitted to identify any periods of significant change, examine the direction of the injury trend, and to estimate monthly percent changes in injury counts and rates. Poisson random-intercept regression measured the average total change since the official SC accreditation for the three communities combined and compared them to three matched control sites. In joinpoint regression, one of the SC sites showed a 10-year increase in hospitalization cases and rates followed by a two-year decline, and the trend reversal occurred while the community was pursuing the SC accreditation. Injury hospitalizations decreased after accreditation compared to the pre-accreditation period when SC sites were compared to their control counterparts using Poisson modeling. Our findings suggest that the SC model may be a promising approach to reduce injuries. Further research is warranted to replicate these findings in other communities.

16.
Falls and their injury outcomes have count distributions that are highly skewed toward the right with clumping at zero, posing analytical challenges. Different modelling approaches have been used in the published literature to describe falls count distributions, often without consideration of the underlying statistical and modelling assumptions. This paper compares the use of modified Poisson and negative binomial (NB) models as alternatives to Poisson (P) regression, for the analysis of fall outcome counts. Four different count-based regression models (P, NB, zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB)) were each individually fitted to four separate fall count datasets from Australia, New Zealand and the United States. The finite mixtures of P and NB regression models were also compared to the standard NB model. Both analytical (F, Vuong and bootstrap tests) and graphical approaches were used to select and compare models. Simulation studies assessed the size and power of each model fit. This study confirms that falls count distributions are over-dispersed, but not dispersed due to excess zero counts or a heterogeneous population. Accordingly, the P model generally provided the poorest fit to all datasets. The fit improved significantly with NB and both zero-inflated models. The fit was also improved with the NB model, compared to finite mixtures of both P and NB regression models. Although there was little difference in fit between NB and ZINB models, in the interests of parsimony it is recommended that future studies involving modelling of falls count data routinely use the NB models in preference to the P, ZINB, or finite mixture distributions. The fact that these conclusions apply across four separate datasets from four different samples of older people participating in studies of different methodology adds strength to this general guiding principle.

17.
Developing sound or reliable statistical models for analyzing motor vehicle crashes is very important in highway safety studies. However, a significant difficulty associated with the model development is related to the fact that crash data often exhibit over-dispersion. Sources of dispersion can be varied and are usually unknown to the transportation analysts. These sources could potentially affect the development of negative binomial (NB) regression models, which are often the model of choice in highway safety. To help in this endeavor, this paper documents an alternative formulation that could be used for capturing heterogeneity in crash count models through the use of finite mixture regression models. The finite mixtures of Poisson or NB regression models are especially useful where count data were drawn from heterogeneous populations. These models can help determine sub-populations or groups in the data among others. To evaluate these models, Poisson and NB mixture models were estimated using data collected in Toronto, Ontario. These models were compared to standard NB regression model estimated using the same data. The results of this study show that the dataset seemed to be generated from two distinct sub-populations, each having different regression coefficients and degrees of over-dispersion. Although over-dispersion in crash data can be dealt with in a variety of ways, the mixture model can help provide the nature of the over-dispersion in the data. It is therefore recommended that transportation safety analysts use this type of model before the traditional NB model, especially when the data are suspected to belong to different groups.

18.
A zero‐inflated Poisson (ZIP) process is different from a standard Poisson process in that it results in a greater number of zeros. It can be used to model defect counts in manufacturing processes with occasional occurrences of non‐conforming products. ZIP models have been developed assuming that random shocks occur independently with probability p, and the number of non‐conformities in a product subject to a random shock follows a Poisson distribution with parameter λ. In our paper, a control charting procedure using a combination of two cumulative sum (CUSUM) charts is proposed for monitoring increases in the two parameters of the ZIP process. Furthermore, we consider a single CUSUM chart for detecting simultaneous increases in the two parameters. Simulation results show that a ZIP‐Shewhart chart is insensitive to shifts in p and smaller shifts in λ in terms of the average number of observations to signal. Comparisons between the combined CUSUM method and the single CUSUM chart show that the latter's performance is worse when there are only increases in p, but better when there are only increases in λ or when both parameters increase. The combined CUSUM method, however, is much better than the single CUSUM chart when one parameter increases while the other decreases. Finally, we present a case study from the light‐emitting diode packaging industry. Copyright © 2011 John Wiley & Sons, Ltd.
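
The upper CUSUM recursion used in such charts accumulates evidence of an upward shift; a minimal sketch (the reference value `k` and decision limit `h` are illustrative, and in the combined scheme one CUSUM would track the zero/shock indicator for p while another tracks the Poisson counts for λ):

```python
def cusum_signal(counts, k=1.0, h=5.0):
    """Upper CUSUM: C_t = max(0, C_{t-1} + x_t - k).
    Returns the first index where C_t exceeds h, or None if it never does."""
    c = 0.0
    for t, x in enumerate(counts):
        c = max(0.0, c + x - k)
        if c > h:
            return t
    return None

print(cusum_signal([0, 0, 5, 5, 5]))   # signals once shifted counts pile up
print(cusum_signal([0, 0, 1, 0, 1]))   # no signal for in-control counts
```

The max-with-zero reset is what makes the CUSUM sensitive to small sustained shifts while ignoring isolated in-control fluctuations.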

19.
Although past research has linked alcohol outlet density to higher rates of drinking and many related social problems, there is conflicting evidence of density's association with traffic crashes. An abundance of local alcohol outlets simultaneously encourages drinking and reduces driving distances required to obtain alcohol, leading to an indeterminate expected impact on alcohol-involved crash risk. This study separately investigates the effects of outlet density on (1) the risk of injury crashes relative to population and (2) the likelihood that any given crash is alcohol-involved, as indicated by police reports and single-vehicle nighttime status of crashes. Alcohol outlet density effects are estimated using Bayesian misalignment Poisson analyses of all California ZIP codes over the years 1999–2008. These misalignment models allow panel analysis of ZIP-code data despite frequent redefinition of postal-code boundaries, while also controlling for overdispersion and the effects of spatial autocorrelation. Because models control for overall retail density, estimated alcohol-outlet associations represent the extra effect of retail establishments selling alcohol. The results indicate a number of statistically well-supported associations between retail density and crash behavior, but the implied effects on crash risks are relatively small. Alcohol-serving restaurants have a greater impact on overall crash risks than on the likelihood that those crashes involve alcohol, whereas bars primarily affect the odds that crashes are alcohol-involved. Off-premise outlet density is negatively associated with risks of both crashes and alcohol involvement, while the presence of a tribal casino in a ZIP code is linked to higher odds of police-reported drinking involvement. Alcohol outlets in a given area are found to influence crash risks both locally and in adjacent ZIP codes, and significant spatial autocorrelation also suggests important relationships across geographical units. 
These results suggest that each type of alcohol outlet can have differing impacts on risks of crashing as well as the alcohol involvement of those crashes.

20.
In semiconductor manufacturing, wafer testing is performed to ensure the performance of each product after wafer fabrication. The wafer map is used to visualize the color-coded wafer test results based on the locations. The defects on the wafer map may be randomly distributed or form clustered patterns. The various clustered defect patterns are usually caused by assignable faults. The identification of the patterns is thus important to provide valuable hints for the root causes diagnosis. Solving the problems helps improve the manufacturing processes and reduce costs. In this study, we present a novel convolutional neural network (CNN)–based method to automatically recognize the defect pattern on wafer maps. Our method uses polar mapping before the training of CNN to transform the circular wafer map into a matrix, which can be processed within CNN architecture. This procedure also reduces the input size and solves variations in wafer sizes and die sizes. To eliminate the effects of rotation, we apply data augmentation in the training of CNN. Experiments using the real-world dataset prove the effectiveness and superiority of our method.
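
The polar-mapping step can be sketched as binning each die by its radius and angle from the wafer center, producing a fixed-size matrix regardless of wafer or die size (the bin counts and the 4x4 toy map below are illustrative, not the paper's settings):

```python
import math

def polar_map(wafer, radii=2, angles=4):
    """Average square wafer-map values into a fixed (radii x angles) polar
    grid; dies outside the inscribed circle are ignored."""
    n = len(wafer)
    c = (n - 1) / 2.0          # wafer center (same for rows and columns)
    rmax = n / 2.0
    sums = [[0.0] * angles for _ in range(radii)]
    cnts = [[0] * angles for _ in range(radii)]
    for i, row in enumerate(wafer):
        for j, v in enumerate(row):
            r = math.hypot(i - c, j - c)
            if r >= rmax:
                continue                          # off the circular wafer
            theta = math.atan2(j - c, i - c)      # angle in (-pi, pi]
            ri = min(int(r / rmax * radii), radii - 1)
            ai = int((theta + math.pi) / (2 * math.pi) * angles) % angles
            sums[ri][ai] += v
            cnts[ri][ai] += 1
    return [[sums[r][a] / cnts[r][a] if cnts[r][a] else 0.0
             for a in range(angles)] for r in range(radii)]

# A uniform map stays uniform after the transform
grid = polar_map([[1, 1, 1, 1]] * 4)
print(grid)
```

Because every wafer lands in the same (radii x angles) grid, the CNN input shape is fixed, and rotating the wafer only shifts columns of the grid, which is why rotation augmentation is cheap to apply.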

