Similar literature
Found 20 similar documents; search took 22 ms
1.
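This work applies machine learning theory and methods to weather forecasting. Building on Bayesian inference, it uses a naïve Bayes classifier (NBC) to treat rainfall prediction as a classification problem and proposes a naïve Bayes algorithm for rainfall prediction, learn-and-classify--rainfall. The predictors and the prediction target are graded according to meteorological grading standards. With historical meteorological data as the training set, the algorithm learns the prior probability of each target grade and the conditional probabilities of each predictor, and then uses the NBC to compute the maximum a posteriori hypothesis as the predicted target value. The algorithm is robust and easy to implement, and proves practical and effective. Experiments show that its prediction accuracy is clearly higher than that of other methods currently used in short-term climate prediction, such as regression analysis and cluster analysis. It also offers guidance on how to choose predictors, a problem that has long troubled meteorologists.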
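
The learn-then-classify procedure described in this abstract can be illustrated with a minimal sketch. It is not the authors' learn-and-classify--rainfall algorithm: the function names `learn` and `classify`, the Laplace smoothing, and the toy grades in the usage note are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def learn(rows, labels):
    """Count class frequencies and per-feature grade frequencies within each class."""
    class_counts = Counter(labels)
    cond_counts = defaultdict(Counter)  # (feature index, class) -> grade counts
    grades = defaultdict(set)           # feature index -> distinct grades seen
    for row, label in zip(rows, labels):
        for i, grade in enumerate(row):
            cond_counts[(i, label)][grade] += 1
            grades[i].add(grade)
    return class_counts, cond_counts, grades


def classify(model, row):
    """Return the class with the largest posterior score (the MAP hypothesis)."""
    class_counts, cond_counts, grades = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, n in class_counts.items():
        score = n / total  # prior P(class)
        for i, grade in enumerate(row):
            # Laplace-smoothed P(grade | class); the smoothing is an assumption
            score *= (cond_counts[(i, label)][grade] + 1) / (n + len(grades[i]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

With hypothetical graded data such as `rows = [("high", "low"), ("high", "high"), ("low", "low"), ("low", "high")]` and `labels = ["heavy", "heavy", "light", "light"]`, calling `classify(learn(rows, labels), ("high", "low"))` picks the class whose prior times conditional probabilities is largest.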

2.
To enhance the security of three-dimensional images, an inter-view local texture analysis (ILTA) based stereo image reversible data hiding method is presented. Because existing predictors have low accuracy, two novel predictors are proposed to improve the prediction precision. In the first predictor, a texture analysis model is built by using ILTA, in which the texture similarity between a pair of matched pixels in the stereo image is used to classify pixels into horizontal texture, vertical texture, smooth and complex types. Thus, an accurate prediction is adaptively computed by considering the pixel type. Moreover, an intra-view based predictor as the second predictor is also described to predict pixels by optimal weights finding (OWF). Since ILTA and OWF predictors are combined to predict pixels in the stereo image, sharp prediction error histograms of the two views are both constructed, and then multi-level histogram shifting is used to embed secret data reversibly for obtaining low image distortion and high embedding capacity. Experimental results demonstrate that ILTA and OWF predictors can obtain precise predicted values, and the proposed data hiding method outperforms some state-of-the-art data hiding methods in terms of embedding capacity and quality of the stego stereo image.

3.
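To predict coal and gas outbursts at the working face quickly, accurately, and dynamically, a prediction method based on principal component analysis (PCA) and weighted Bayes is proposed. A weighted Bayesian model for working-face coal and gas outburst prediction is built to predict the outburst risk level. PCA is used to determine the weights of the classification variables in the model to improve prediction accuracy. On this basis, a similarity-based scheme for updating the training samples is designed so that the outburst prediction model can be rebuilt effectively. Experimental results show that, compared with the naïve Bayes model and the weighted Bayes model, the proposed method obtains highly accurate outburst predictions quickly, providing a reference for on-site guidance of safe production at mine working faces.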

4.
Maximum likelihood (ML) in the linear model overfits when the number of predictors (M) exceeds the number of objects (N). One possible solution is the relevance vector machine (RVM), which is a form of automatic relevance detection and has gained popularity in the pattern recognition and machine learning community through the well-known textbook of Bishop (2006). RVM assigns individual precisions to the weights of predictors, which are then estimated by maximizing the marginal likelihood (type II ML or empirical Bayes). We investigated the selection properties of RVM both analytically and by experiments in a regression setting. We show analytically that RVM selects predictors when the absolute z-ratio (|least squares estimate|/standard error) exceeds 1 in the case of orthogonal predictors and, for M = 2, that this still holds true for correlated predictors when the other z-ratio is large. RVM selects the stronger of two highly correlated predictors. In experiments with real and simulated data, RVM is outcompeted by other popular regularization methods (LASSO and/or PLS) in terms of prediction performance. We conclude that type II ML is not the general answer in high dimensional prediction problems. In extensions of RVM to obtain stronger selection, improper priors (based on the inverse gamma family) have been assigned to the inverse precisions (variances) with parameters estimated by penalized marginal likelihood. We critically assess this approach and suggest a proper variance prior related to the Beta distribution which gives similar selection and shrinkage properties and allows a fully Bayesian treatment.

5.
The predictability of data values is studied at a fundamental level. Two basic predictor models are defined. Computational predictors perform an operation on previous values to yield predicted next values; the examples we study are stride value prediction and last value prediction. Context-based predictors match recent value history (context) with previous value history and predict values based entirely on previously observed patterns. To understand the potential of value prediction we perform simulations with unbounded prediction tables that are immediately updated using correct data values. Simulations of integer SPEC95 benchmarks show that data values can be highly predictable. Best performance is obtained with context-based predictors; overall prediction accuracies are between 56% and 92%. The context-based predictor typically has an accuracy about 20% better than the computational predictors (last value and stride). Results with bounded tables suggest the feasibility of context-based predictors that approximate the performance with unbounded tables.
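
The three predictor families named in this abstract can be compared on a value trace with a small sketch. The function names, the context length `order`, and the rule of predicting the value that most recently followed a context are assumptions for illustration, not the paper's simulator. The table is unbounded and updated immediately with the correct value, as the abstract describes.

```python
def last_value_accuracy(values):
    """Predict that the next value equals the previous one."""
    hits = sum(1 for prev, cur in zip(values, values[1:]) if prev == cur)
    return hits / (len(values) - 1)


def stride_accuracy(values):
    """Predict next = last + (last - second to last)."""
    hits = 0
    for i in range(2, len(values)):
        predicted = values[i - 1] + (values[i - 1] - values[i - 2])
        hits += predicted == values[i]
    return hits / (len(values) - 2)


def context_accuracy(values, order=2):
    """Predict the value that most recently followed the same `order`-long context."""
    table = {}  # unbounded table: context tuple -> value that followed it
    hits = trials = 0
    for i in range(order, len(values)):
        context = tuple(values[i - order:i])
        if context in table:
            trials += 1
            hits += table[context] == values[i]
        table[context] = values[i]  # immediate update with the correct value
    return hits / trials if trials else 0.0
```

On an arithmetic trace such as `[1, 2, 3, 4, 5, 6]` the stride predictor is always right, while on a repeating but non-arithmetic trace such as `[1, 5, 2, 1, 5, 2, 1, 5, 2]` only the context predictor learns the pattern.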

6.
We present a Bayesian variable selection method for the setting in which the number of independent variables or predictors in a particular dataset is much larger than the available sample size. While most of the existing methods allow some degree of correlation among predictors but do not consider these correlations for variable selection, our method accounts for correlations among the predictors in variable selection. Our correlation-based stochastic search (CBS) method, the hybrid-CBS algorithm, extends a popular search algorithm for high-dimensional data, the stochastic search variable selection (SSVS) method. Similar to SSVS, we search the space of all possible models using variable addition, deletion or swap moves. However, our moves through the model space are designed to accommodate correlations among the variables. We describe our approach for continuous, binary, ordinal, and count outcome data. The impact of choices of prior distributions and hyperparameters is assessed in simulation studies. We also examined the performance of variable selection and prediction as the correlation structure of the predictors varies. We found that the hybrid-CBS resulted in lower prediction errors and better identified the true outcome-associated predictors than SSVS when predictors were moderately to highly correlated. We illustrate the method on data from a proteomic profiling study of melanoma, a type of skin cancer.

7.
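Gao Hui, Zhang Li, Li Lin. Journal of Software (《软件学报》), 2010, 21(9): 2118-2134
Combining empirical data with expert knowledge, a causal model based on Bayesian networks is built that links architecture-level structural features and causes of change to software adaptability, i.e., an architecture-level prediction model of software adaptability. The Bayesian network learning algorithm is extended to solve the problem of discovering weak causal relationships in the prediction model. Finally, a method and an example are given for applying the prediction model to evaluate software adaptability at the software architecture level.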

8.
In this paper, the problem of designing weighted fusion robust time-varying Kalman predictors is considered for multisensor time-varying systems with uncertainties of noise variances. Using the minimax robust estimation principle and the unbiased linear minimum variance (ULMV) rule, based on the worst-case conservative system with the conservative upper bounds of noise variances, the local and five weighted fused robust time-varying Kalman predictors are designed, which include a robust weighted measurement fuser, three robust weighted state fusers, and a robust covariance intersection (CI) fuser. Their actual prediction error variances are guaranteed to have the corresponding minimal upper bounds for all admissible uncertainties of noise variances. Their robustness is proved based on the proposed Lyapunov equation approach. The concept of the robust accuracy is presented, and the robust accuracy relations are proved. The corresponding steady-state robust local and fused Kalman predictors are also presented, and the convergence in a realization between the time-varying and steady-state robust Kalman predictors is proved by the dynamic error system analysis (DESA) method and the dynamic variance error system analysis (DVESA) method. Simulation results show the effectiveness and correctness of the proposed results.

9.
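Yuan Ze, Chen Bin. Journal of Graphics (《图学学报》), 2022, 43(1): 125-132
Traditional spatial interpolation methods for simulating and predicting regional pollutants perform only moderately when the source data are unevenly distributed. To address this, a method that uses the INLA-SPDE model to simulate and predict regional pollutants is proposed. The spatial component of the model is expressed by a stochastic partial differential equation, the temporal component uses a first-order autoregressive time-series model, and the model also includes 10 covariates such as meteorological parameters. Taking the 2019 daily mean PM2.5 concentration in the Beijing-Tianjin-Hebei region as an example, the study builds, month by month, spatio-temporal simulation...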

10.
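Du Chao, Wang Zhihai, Jiang Jingjing, Sun Yange. Journal of Software (《软件学报》), 2017, 28(11): 2891-2904
Pattern-based Bayesian classification models are an effective approach to classification problems in data mining. However, most pattern-based Bayesian classifiers consider only the support of a pattern in the target-class data set and ignore its support in the opposing-class data sets. Moreover, pattern-based Bayesian classifiers built for static data sets are not applicable to unbounded data streams that change rapidly and dynamically. To solve these problems, EPDS (Bayesian classifier algorithm based on emerging pattern for data stream), a Bayesian classification model for data streams based on emerging patterns, is proposed. The model uses a simple hybrid forest structure to maintain the itemsets of in-memory transactions and adopts a fast pattern extraction mechanism to speed up the algorithm. EPDS uses a semi-lazy learning strategy to update emerging patterns continuously and builds a local classification model under each class for every transaction to be classified. Extensive experiments show that the algorithm achieves higher accuracy than other data stream classification models.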

11.
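Multi-step prediction of frequency-hopping sequences based on Bayesian networks   Cited by: 1 (self-citations: 1; citations by others: 0)
Because frequency-hopping sequences exhibit chaotic behavior, a Bayesian network model for predicting frequency-hopping sequences is proposed on the basis of phase-space reconstruction theory. The model treats the entire reconstructed phase space as prior data, learns a Bayesian network from it, and applies a Bayesian network inference algorithm to achieve multi-step prediction of the hopping frequencies. Simulation results show that the method has good multi-step prediction ability and effectively overcomes overfitting.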

12.
Feedforward neural networks, particularly multilayer perceptrons, are widely used in regression and classification tasks. A reliable and practical measure of prediction confidence is essential. In this work three alternative approaches to prediction confidence estimation are presented and compared. The three methods are the maximum likelihood, approximate Bayesian, and bootstrap techniques. We consider prediction uncertainty owing to both data noise and model parameter misspecification. The methods are tested on a number of controlled artificial problems and a real, industrial regression application, the prediction of paper "curl". Confidence estimation performance is assessed by calculating the mean and standard deviation of the prediction interval coverage probability. We show that treating data noise variance as a function of the inputs is appropriate for the curl prediction task. Moreover, we show that the mean coverage probability can only gauge confidence estimation performance as an average over the input space, i.e., global performance, and that the standard deviation of the coverage is unreliable as a measure of local performance. The approximate Bayesian approach is found to perform better in terms of global performance.
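
The assessment measure named in this abstract, the mean and standard deviation of the prediction interval coverage probability, can be sketched as follows. The function names and the choice of the population standard deviation are assumptions for illustration. The abstract does not say how the runs were formed.

```python
from statistics import mean, pstdev


def coverage(intervals, targets):
    """Fraction of targets that fall inside their (lower, upper) prediction interval."""
    inside = [lower <= y <= upper for (lower, upper), y in zip(intervals, targets)]
    return sum(inside) / len(inside)


def coverage_summary(runs):
    """Mean and standard deviation of coverage over several (intervals, targets) runs."""
    values = [coverage(intervals, targets) for intervals, targets in runs]
    return mean(values), pstdev(values)
```

A mean coverage close to the nominal level (for example 0.95 for 95% intervals) indicates good performance averaged over the inputs, which is the global sense of performance the abstract refers to.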

13.
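Chen Wei, Chen Jiming. Journal of Computer Applications (《计算机应用》), 2016, 36(4): 914-917
To address the problems of allocating a cloud service that will satisfy QoS requirements over a future period and of detecting QoS violations that may be about to occur, a cloud service QoS prediction method based on time-series forecasting is proposed. The method uses an improved Bayesian constant mean (IBCM) model and can accurately predict the QoS state of a cloud service over a future period. In the experiments, a Hadoop cluster was set up to simulate a cloud platform, and data on two QoS attributes, response time and throughput, were collected as prediction targets. The results show that, compared with time-series methods such as the autoregressive integrated moving average (ARIMA) model and the Bayesian constant mean discount model, the IBCM-based method yields a sum of squared errors (SSE), mean absolute error (MAE), mean squared error (MSE), and mean absolute percentage error (MAPE) that are each an order of magnitude smaller, and therefore has higher prediction accuracy. Comparison plots of the predictions also show that the proposed method fits the data better.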
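
The four error measures used in this comparison have standard definitions, sketched below. The function names and the sample numbers in the usage note are illustrative assumptions, not data from the paper.

```python
def sse(actual, predicted):
    """Sum of squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def mse(actual, predicted):
    """Mean squared error."""
    return sse(actual, predicted) / len(actual)


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    ratios = (abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100 * sum(ratios) / len(actual)
```

For hypothetical response times `actual = [100, 200, 400]` and forecasts `predicted = [110, 190, 380]`, the errors are -10, 10, and 20, so `sse` returns 600 and `mse` returns 200.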

14.
Sparse on-line Gaussian processes   Cited by: 7 (self-citations: 0; citations by others: 7)
We develop an approach for sparse representations of Gaussian process (GP) models (which are Bayesian types of kernel machines) in order to overcome their limitations for large data sets. The method is based on a combination of a Bayesian on-line algorithm, together with a sequential construction of a relevant subsample of the data that fully specifies the prediction of the GP model. By using an appealing parameterization and projection techniques in a reproducing kernel Hilbert space, recursions for the effective parameters and a sparse Gaussian approximation of the posterior process are obtained. This allows for both a propagation of predictions and Bayesian error measures. The significance and robustness of our approach are demonstrated on a variety of experiments.

15.
The embedding dimension and the number of nearest neighbors are very important parameters in the prediction of chaotic time series. To reduce the prediction errors and the uncertainties in the determination of the above parameters, a new chaos Bayesian optimal prediction method (CBOPM) is proposed by choosing optimal parameters in the local linear prediction method (LLPM) and improving the prediction accuracy with Bayesian theory. In the new method, the embedding dimension and the number of nearest neighbors are combined as a parameter set. The optimal parameters are selected by mean relative error (MRE) and correlation coefficient (CC) indices according to optimization criteria. Real hydrological time series are taken to examine the new method. The prediction results indicate that CBOPM can choose the optimal parameters adaptively in the prediction process. Compared with several LLPM models, the CBOPM has higher prediction accuracy in predicting hydrological time series.

16.
We describe a software system, called just enough delivery (JED), for optimising single-copy newspaper sales, based on a combination of neural and Bayesian technology. The prediction model is a huge feedforward neural network, in which each output corresponds to the sales prediction for a single outlet. Input-to-hidden weights are shared between outlets. The hidden-to-output weights are specific to each outlet, but linked through the introduction of priors. All weights and hyperparameters can be inferred using (empirical) Bayesian inference. The system has been tested on data for several different newspapers and magazines. Consistent performance improvements of 1 to 3% more sales with the same total amount of deliveries have been obtained.

17.
《Journal of Process Control》2014,24(7):1046-1056
Soft sensors are used to predict response variables, which are difficult to measure, using the data of predictors that can be obtained relatively easily. Arranging time-lagged data of predictors and applying partial least squares (PLS) to the dataset is a popular approach for extracting the correlation between data of the responses and predictors of the process dynamics. However, the model input dimension dramatically soars once multiple time delays are incorporated. In addition, the selection of variables in the dynamic PLS (DPLS) model is a critical step for the robustness and the accuracy of the inferential model, since irrelevant inputs deteriorate the prediction performance of the soft sensor. The sparse PLS (SPLS) is a variable selection method that simultaneously selects the important predictors and finds the correlation between the predictors and responses. The sparsity of the model is dependent on a cut-off value in the SPLS algorithm that is determined using a cross-validation procedure. Therefore, the threshold is a compromise for all latent variable directions. It is necessary to further shrink the inputs from the result of SPLS to obtain a more compact model. In the presented work, named SPLS-VIP, the variable importance in projection (VIP) method was used to filter out the insignificant inputs from the SPLS result. An industrial soft sensor for predicting oxygen concentrations in the air separation process was developed based on the proposed approach. The prediction performance and the model interpretability could be further improved from the SPLS method using the proposed approach.

18.
A novel Bayesian paradigm for the identification of output error models has recently been proposed in which, in place of postulating finite-dimensional models of the system transfer function, the system impulse response is searched for within an infinite-dimensional space. In this paper, such a nonparametric approach is applied to the design of optimal predictors and discrete-time models based on prediction error minimization by interpreting the predictor impulse responses as realizations of Gaussian processes. The proposed scheme describes the predictor impulse responses as the convolution of an infinite-dimensional response with a low-dimensional parametric response that captures possible high-frequency dynamics. Overparameterization is avoided because the model involves only a few hyperparameters that are tuned via marginal likelihood maximization. Numerical experiments, with data generated by ARMAX and infinite-dimensional models, show the definite advantages of the new approach over standard parametric prediction error techniques and subspace methods both in terms of predictive capability on new data and accuracy in reconstruction of system impulse responses.

19.
Regression conformal prediction produces prediction intervals that are valid, i.e., the probability of excluding the correct target value is bounded by a predefined confidence level. The most important criterion when comparing conformal regressors is efficiency; the prediction intervals should be as tight (informative) as possible. In this study, the use of random forests as the underlying model for regression conformal prediction is investigated and compared to existing state-of-the-art techniques, which are based on neural networks and k-nearest neighbors. In addition to their robust predictive performance, random forests allow for determining the size of the prediction intervals by using out-of-bag estimates instead of requiring a separate calibration set. An extensive empirical investigation, using 33 publicly available data sets, was undertaken to compare the use of random forests to existing state-of-the-art conformal predictors. The results show that the suggested approach, on almost all confidence levels and using both standard and normalized nonconformity functions, produced significantly more efficient conformal predictors than the existing alternatives.
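
How a conformal regressor turns residuals into an interval can be illustrated with a minimal sketch. It uses a separate calibration set and the absolute residual as the nonconformity score, which is the generic inductive variant and not the out-of-bag random forest procedure studied in the paper. The function names and the `significance` parameter are assumptions.

```python
import math


def conformal_halfwidth(cal_targets, cal_predictions, significance=0.1):
    """Interval half-width from absolute residuals on a calibration set."""
    scores = sorted(abs(y - p) for y, p in zip(cal_targets, cal_predictions))
    n = len(scores)
    # Rank of the calibration score that bounds the error rate by `significance`
    rank = math.ceil((n + 1) * (1 - significance))
    if rank > n:
        return math.inf  # too few calibration points for this significance level
    return scores[rank - 1]


def predict_interval(point_prediction, halfwidth):
    """Symmetric interval around the underlying model's point prediction."""
    return point_prediction - halfwidth, point_prediction + halfwidth
```

A smaller half-width at the same significance level corresponds to the efficiency criterion in the abstract: the intervals stay valid but become more informative.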

20.
The penalized quasi-likelihood (PQL) procedure for statistical inference in generalized linear mixed models (GLMMs) and in Bayesian disease mapping and ecological modeling is revisited. In the GLMM framework, the empirical Bayes PQL (EBPQL) procedure is discussed in the context of approximating posterior point and interval prediction of random effects. An in-depth Monte Carlo assessment of EBPQL point and interval estimation of random effects, fixed effects, and prior parameters in univariate and bivariate (shared component) disease mapping and ecological models is presented, with illustrative examples including spatial and ecological modeling of infant mortality rates (relatively uncommon events), suicide hospitalization rates (rare events) and suicide mortality rates (extremely rare events), and associated ecological risk factors in local health areas in British Columbia, Canada. In particular, EBPQL interval prediction of random effects is explored by prediction uncertainty attributions with respect to uncertainties associated with estimation of random effects, fixed effects, and prior parameters. Estimation of percent attributions of EBPQL random effects prediction errors to prior uncertainty is developed in the context of GLMMs and explored in Bayesian disease mapping and ecological models, suggesting evidence that uncertainty about prior parameter(s) may have minor and negligible influence on EBPQL interval prediction of random effects in Bayesian hierarchical disease mapping and ecological modeling of moderate Poisson observations. The EBPQL inference procedure may be judiciously and profitably utilized in Bayesian disease mapping and ecological model development.
