Similar literature
 20 similar documents found (search time: 93 ms)
1.
We suggest a two-stage multinomial logit model (TMLM) for incorporating and interpreting both the interaction and main effects in a model for multi-categorized responses. TMLM combines the robustness of the multinomial logit model (MLM) with the good properties of the decision tree (DT), which makes it possible to cluster homogeneous subjects and thus to incorporate the interaction effects of explanatory variables in MLM. In the first step of TMLM, DT is applied to determine the most influential interaction effects and to create a cluster variable that represents the categories given by the best splits of the optimal tree. In the second step, the cluster variable enters MLM as an explanatory variable. With TMLM, it is possible to interpret not only the interactions among explanatory variables but also the main effects. It is also possible to cluster and characterize homogeneous subjects, which is not possible with MLM alone. The model also improves the accuracy rate in classifying multi-categorized responses. We apply TMLM to the national pension data of disability pensioners in Korea and compare the results with two types of MLM. TMLM is suggested as a statistical model for characterizing both the interaction and main effects of explanatory variables and for improving accuracy rates compared with MLM.

2.
This research empirically tests the postulations of Gentner concerning the properties of explanatory analogy. It does so in the context of teaching programming. The factor analogy was operationalized by varying the clarity and systematicity/abstractness of the analogies used. The dependent variables were the scores obtained on program comprehension and program composition tasks and the time taken to perform the tasks. Research subjects were 15- to 17-year-olds without prior exposure to computer programming. Differences in age were controlled. The results provide empirical support for Gentner's postulations on the relative goodness of competing analogies. In particular, good explanatory analogies are characterized by clarity and high systematicity/abstractness.

3.
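Ma Jianghong (马江洪), Zhang Wenxiu (张文修), Liang Yi (梁怡). 《计算机学报》 (Chinese Journal of Computers), 2003, 26(12): 1652-1659
Complex massive data often appear as a mixture of several structural features, and a mixture model of regression classes is one description of such a mixture. Based on the theory of finite mixture distributions in statistics and related results on identifiability, this paper discusses the identifiability of mixture models for mining general regression classes under three settings for the regressors: (1) fixed explanatory variables, (2) random explanatory variables, and (3) fixed explanatory variables with specified class parameters. It gives corresponding sufficient conditions for the identifiability of mixture models of regression classes from the same family. A common feature of these conditions is that they all involve a special class of sets of explanatory variables. Such a set is uniquely determined by the regression function and regression parameters of the family, and its elements make the regression function take the same value for different regression parameters. In particular, when the regression function is linear, these sets are hyperplanes in the space of explanatory variables.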

4.
Realistic mathematical models of physical processes contain uncertainties. These models are often described by stochastic differential equations (SDEs) or stochastic partial differential equations (SPDEs) with multiplicative noise. The uncertainties in the right-hand side or the coefficients are represented as random fields. To solve a given SPDE numerically one has to discretise the deterministic operator as well as the stochastic fields. The total dimension of the SPDE is the product of the dimensions of the deterministic part and the stochastic part. To approximate random fields with as few random variables as possible, but still retaining the essential information, the Karhunen–Loève expansion (KLE) becomes important. The KLE of a random field requires the solution of a large eigenvalue problem. Usually it is solved by a Krylov subspace method with a sparse matrix approximation. We demonstrate the use of sparse hierarchical matrix techniques for this. A log-linear computational cost of the matrix-vector product and a log-linear storage requirement yield an efficient and fast discretisation of the random fields presented.

5.
This study focuses on one of the most critical issues of modeling under severe conditions of uncertainty: determining the relative importance (weight) of the explanatory variables. The ability to determine the relative importance of explanatory variables, and the reliability of that outcome, are of utmost importance to decision makers who use such models as components of decision support or decision making. We compare the reliability of the traditional method, multiple linear regression, with that of fuzzy logic-based soft regression. We provide a case study (a cross-national model of background factors facilitating economic growth) to illustrate the performance of both methods. We conclude that soft regression is a more reliable and consistent tool for determining the relative importance of explanatory variables.

6.
Sequence segmentation is a well-studied problem: given a sequence of elements, an integer K, and some measure of homogeneity, the task is to split the sequence into K contiguous segments that are maximally homogeneous. A classic approach to finding the optimal solution is to use a dynamic program. Unfortunately, the execution time of this program is quadratic in the length of the input sequence, which makes the algorithm slow for a sequence of non-trivial length. In this paper we study segmentations whose measure of goodness is based on log-linear models, a rich family that contains many of the standard distributions. We present a theoretical result that allows us to prune many suboptimal segmentations. Using this result, we modify the standard dynamic program for 1D log-linear models and thereby reduce the computation time. We demonstrate empirically that this approach can significantly reduce the computational burden of finding the optimal segmentation.
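
For orientation, the classic quadratic dynamic program mentioned in this abstract can be sketched as follows. This is only a minimal illustration of the unpruned baseline, not the paper's method: the homogeneity measure is assumed here to be the sum of squared deviations from the segment mean (the paper uses log-linear models), and the function name `segment` is hypothetical.

```python
from itertools import accumulate


def segment(seq, k):
    """Split seq into k contiguous segments with minimal total squared error.

    Returns (cost, bounds), where bounds is a list of half-open (start, end)
    index pairs. Runs in O(k * n^2) time, i.e. quadratic in len(seq).
    """
    n = len(seq)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(seq)")

    # Prefix sums of x and x^2 give the cost of any segment in O(1).
    s1 = [0.0] + list(accumulate(seq))
    s2 = [0.0] + list(accumulate(x * x for x in seq))

    def cost(i, j):
        # Assumed measure: sum of squared deviations from the mean of seq[i:j].
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting seq[:j] into m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):  # i is the start of the last segment
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j] = c
                    back[m][j] = i

    # Walk the back-pointers to recover the segment boundaries.
    bounds = []
    j = n
    for m in range(k, 0, -1):
        i = back[m][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return best[k][n], bounds
```

Calling `segment([1, 1, 1, 5, 5, 5, 9, 9], 3)` splits the sequence at the two level changes. The innermost loop over `i` is the part that a pruning result of the kind described in the abstract would shorten.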

7.
8.
Susceptibility or hazard models are often established by means of logistic regression techniques in order to describe the effect of a group of explanatory variables on the probability of a dichotomous or binary response. Since the available variables do not always meet the logit-linearity assumption of logistic regression, a modified approach is proposed. First, a favorability function associated with each explanatory variable, based on conditional probability measures, is introduced. Next, a simple transformation based on the empirical probability function is suggested for non-continuous variables, while nonparametric kernel estimation is considered for continuous ones. The favorability-based transformations lead to new explanatory variables for the logistic regression model. The performance of the method is evaluated using simulated data. In addition, a real case study is presented in which a GIS-based landslide susceptibility model is developed.

9.
Many studies have generated cost estimating relationships (CERs) for transportation projects via data analysis. Some studies collected data from databases, while others sourced data from conventional paper-based formats. When cost data were not in a consistent format, many studies failed to discuss how to streamline pattern recognition, from generating a problem statement, through data warehousing and prediction modeling, to information management. This study adopts a standard procedure for identifying CERs for transportation projects. For the proposed dimensional data warehouse, a pavement maintenance and rehabilitation project was selected as a case study for extracting data and concealed prediction rules. Linear and log-linear statistical approaches were adopted to create the most advantageous models, defined by their explanatory power and mean absolute prediction error. The resulting estimation models created from the proposed cost data warehouse were integrated into an expert system to facilitate information management and generate preliminary budgets for transportation agencies.

10.
Identifiability for a very flexible family of latent class models introduced recently is examined. These models allow for association between selected pairs of response variables conditional on the latent variable, and are based on logistic regression models, in terms of subject-specific covariates, both for the latent weights and for the conditional distributions of the response variables. Generalized logits (global or continuation, which are relevant with ordered categorical responses and involve comparisons of cumulated probabilities) may be used as an alternative to the usual local logits, which are log-linear. A compact matrix formulation for the Jacobian of the parametrization and a simple algorithm for checking local identifiability numerically are described. A few examples involving causal inference are examined.

11.
A fuzzy regression model is developed to construct the relationship between the response and explanatory variables in fuzzy environments. To enhance explanatory power and take into account the uncertainty of the formulated model and parameters, a new operator, called the fuzzy product core (FPC), is proposed for the formulation processes to establish fuzzy regression models with fuzzy parameters using fuzzy observations that include fuzzy response and explanatory variables. In addition, the sign of the parameters can be determined in the model-building processes. Compared to existing approaches, the proposed approach reduces the amount of unnecessary or unimportant information arising from fuzzy observations and determines the sign of the parameters in the models to increase model performance. This addresses a weakness of existing approaches, in which the parameters in the models are fuzzy and must be predetermined in the formulation processes. The proposed approach outperforms existing models in terms of distance, mean similarity, and credibility measures, even when crisp explanatory variables are used.

12.
A general purpose engineering economy problem solving package has been created. This software allows the user to type in equations involving compound interest factors and unknown variables and will calculate the results or solve for the unknowns. The program is also able to compare alternative sets of cash flows using PEX, AEX, and ROR methodology. The program also includes interactive graphics features. This paper describes those features.

For graphical output the program includes the capability to graph up to six equations with respect to one independent variable. This allows for the display of PEX vs. I graphs and breakeven graphs. The program will also draw cash flow diagrams and network diagrams.

Cash flow data can be input graphically, using a mouse, by selecting a cash flow pattern (single amount, uniform series, or gradient series) and pointing to the position on the cash flow time line where the pattern should be placed. The program can calculate the net equivalent worth of the cash flows at any point in time given an interest rate, or find an interest rate at which the equivalent net cash flow is zero.

This software is written in FORTRAN and can be linked to available graphics libraries on various computers. The author currently has versions for the VAX minicomputer and IBM PC compatibles.
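
The two calculations named in this abstract, the net equivalent worth of a set of cash flows at a given interest rate and the interest rate at which that worth is zero, can be illustrated with a short sketch. It is not the FORTRAN package's code: the function names are hypothetical, cash flows are assumed to occur at the end of periods 0, 1, 2, ..., and bisection is an assumed root-finding choice.

```python
def present_worth(cash_flows, rate):
    """Net present worth of cash_flows[t] received at the end of period t."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def rate_of_return(cash_flows, lo=0.0, hi=1.0, tol=1e-9):
    """Interest rate in [lo, hi] at which the present worth is zero.

    Uses bisection, so the present worth must change sign on [lo, hi].
    """
    f_lo = present_worth(cash_flows, lo)
    f_hi = present_worth(cash_flows, hi)
    if f_lo * f_hi > 0:
        raise ValueError("present worth does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = present_worth(cash_flows, mid)
        if f_lo * f_mid <= 0:
            hi = mid  # the root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # the root lies in the upper half
    return (lo + hi) / 2.0
```

For example, an outlay of 1000 now followed by a receipt of 1100 one period later, `rate_of_return([-1000, 1100])`, gives a rate of about 10% per period. Cash flow patterns with several sign changes can have more than one such rate, which this simple bracket-based sketch does not handle.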


13.
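Analyzing real programs often requires analyzing the function calls they contain, and whole-program analysis is generally implemented through interprocedural analysis. Function inlining is among the most precise and easiest-to-implement interprocedural analysis methods. With inlining, existing intraprocedural analysis methods and tools can support programs that contain function calls. However, inlining sharply increases code size and generates many intermediate variables, which raises the variable dimension of the analysis and greatly increases its time and space overhead. This paper considers some shortcomings of inlining-based interprocedural analysis under the abstract interpretation framework and proposes corresponding optimizations. Program analysis based on abstract interpretation aims to infer invariant constraints among program variables automatically, so the size of the program environment formed by those variables (that is, the set of relevant variables to be considered at each program point) strongly affects the time and space overhead of the analysis. To reduce the overhead of analysis after inlining, this paper proposes a dimension-reduction optimization of the program environment that is oriented to inlined function blocks. For code with inlined functions, the method determines the program environment (the set of relevant variables) to be maintained at each program point, instead of having all program points share one global program environment, and thereby reduces the dimension of program states. The architecture, modules, and algorithmic details of DRIP (Dimension Reduction for analyzing function Inlined Program), a tool that implements the method, are described in detail. Experiments on the WCET Benchmarks suite show that DRIP is effective at eliminating variables, removing more than half of them on some benchmarks, and that it reduces the time and space overhead of the analysis to some extent.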

14.
The analysis of the exercise electrocardiogram (ECG) is based on the alteration of the measured variables in the detection of coronary artery disease (CAD). In its existing form the analysis of the exercise ECG is laborious and requires much time. The temporal analysis of an ECG variable and the comparison between different phases of the exercise test are difficult and time consuming, especially when the variables are examined simultaneously over several leads. In this article we present a computer program, ECG Variable Cine, for visualizing the temporal changes in the values of exercise ECG variables over the selected ECG lead system. The program includes a stationary 3-D presentation of the variables' alteration simultaneously in all selected leads over the duration of the exercise test. In addition, the program determines two parameters: the average value of the variable over the selected leads at every sample moment, and the chronotropic index, a parameter that indicates the heart rate response to exercise. According to the results, the average value of ST-segment deviation over the leads at the end of exercise and the chronotropic index are clinically more competent than the maximum value of ST-segment depression in the detection of CAD.

15.
We present latent log-linear models, an extension of log-linear models incorporating latent variables, and we propose two applications thereof: log-linear mixture models and image deformation-aware log-linear models. The resulting models are fully discriminative, can be trained efficiently, and the model complexity can be controlled. Log-linear mixture models offer additional flexibility within the log-linear modeling framework. Unlike previous approaches, the image deformation-aware model directly considers image deformations and allows for a discriminative training of the deformation parameters. Both are trained using alternating optimization. For certain variants, convergence to a stationary point is guaranteed and, in practice, even variants without this guarantee converge and find models that perform well. We tune the methods on the USPS data set and evaluate on the MNIST data set, demonstrating the generalization capabilities of our proposed models. Our models, although using significantly fewer parameters, are able to obtain competitive results with models proposed in the literature.

16.
Land use changes have a pronounced impact on hydrology. Conversely, hydrologic changes affect land use patterns. The objective of this study is to test whether hydrologic variables can explain land use change. We employ a set of spatially distributed hydrologic variables and compare it against a set of commonly used explanatory variables for land use change. The explanatory power of these variables is assessed by using a logistic regression approach to model the spatial distribution of land use changes in a meso-scale Indian catchment. When hydrologic variables are additionally included, the accuracies of the logistic regression models improve, which is indicated by a change in the relative operating characteristic statistic (ROC) by up to 11%. This is mostly due to the complementarity of the two datasets, which is reflected in the use of 44% commonly used variables and 56% hydrologic variables in the best models for land use change.

17.
A synchronizer is a compiler that transforms a program designed to run in a synchronous network into a program that runs in an asynchronous network. The behavior of a simple synchronizer, which also represents a basic mechanism for distributed computing and for the analysis of marked graphs, was studied by S. Even and S. Rajsbaum (1990) under the assumption that message transmission delays and processing times are constant. We study the behavior of the simple synchronizer when processing times and transmission delays are random. The main performance measure is the rate of a network, i.e., the average number of computational steps executed by a processor in the network per unit time. We analyze the effect of the topology and the probability distributions of the random variables on the behavior of the network. For random variables with exponential distribution, we provide tight (i.e., attainable) bounds and study the effect of a bottleneck processor on the rate.

18.
As biometric authentication systems become more prevalent, it is becoming increasingly important to evaluate their performance. This paper introduces a novel statistical method of performance evaluation for these systems. Given a database of authentication results from an existing system, the method uses a hierarchical random effects model, along with Bayesian inference techniques yielding posterior predictive distributions, to predict performance in terms of error rates using various explanatory variables. By incorporating explanatory variables as well as random effects, the method allows for prediction of error rates when the authentication system is applied to potentially larger and/or different groups of subjects than those originally documented in the database. We also extend the model to allow for prediction of the probability of a false alarm on a "watch-list" as a function of the list size. We consider application of our methodology to three different face authentication systems: a filter-based system, a Gaussian mixture model (GMM)-based system, and a system based on frequency domain representation of facial asymmetry.

19.
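Multiple linear regression models are commonly used to study how one dependent variable depends on several explanatory variables, but they rest on the premise that the explanatory variables are not correlated with one another. In practical applications, especially in econometrics, the explanatory variables are generally highly or approximately correlated, which makes the model estimates inaccurate. To address this, a transformation matrix is computed from the covariance, which provides a method for eliminating the correlation among random variables by matrix transformation. An empirical analysis is carried out with spss25, and it is found that the significance values of the t-tests decrease markedly for the data after the matrix transformation.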
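
The idea of removing correlation between explanatory variables with a covariance-derived transformation can be illustrated for two variables. The abstract does not give the paper's actual transformation matrix, so the lower-triangular matrix T = [[1, 0], [-b, 1]] with b = cov(x1, x2) / var(x1) used below is an assumption chosen for illustration, and the function names are hypothetical.

```python
def covariance(xs, ys):
    """Sample covariance of two equally long sequences (divisor n - 1)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def decorrelate(x1, x2):
    """Return (z1, z2) = T (x1, x2) with zero sample covariance.

    Assumed transformation: T = [[1, 0], [-b, 1]], b = cov(x1, x2) / var(x1),
    so z1 = x1 and z2 = x2 - b * x1. By construction
    cov(z1, z2) = cov(x1, x2) - b * var(x1) = 0.
    """
    b = covariance(x1, x2) / covariance(x1, x1)
    z2 = [v2 - b * v1 for v1, v2 in zip(x1, x2)]
    return list(x1), z2
```

After the transformation, `z2` carries only the part of `x2` that is not linearly explained by `x1`, so a regression on `z1` and `z2` no longer has correlated regressors, although the coefficient on `z2` must then be interpreted with respect to the transformed variable.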

20.
Correlated survival outcomes occur quite frequently in biomedical research. Available software is limited, particularly if we wish to obtain a smoothed estimate of the baseline hazard function in the context of a random effects model for correlated data. The main objective of this paper is to describe an R package called frailtypack that can be used for estimating the parameters in a shared gamma frailty model with possibly right-censored, left-truncated, stratified survival data using penalized likelihood estimation. A time-dependent structure for the explanatory variables and/or an extension of the Cox regression model to recurrent events are also allowed. The program can also be used simply to obtain a smooth estimate of the baseline hazard function directly. To illustrate the program we used two data sets, one with clustered survival times and the other with recurrent events, i.e., the rehospitalizations of patients diagnosed with colorectal cancer. We show how to fit the model with recurrent events and time-dependent covariates using the Andersen-Gill approach.
