Similar articles (20 results)
1.
As Machine Learning (ML) is widely applied in security-critical fields, the requirements for the interpretability of ML also increase. Interpretability aims to help people understand the internal operation and decision principles of models, so as to improve the models' credibility. However, research on the interpretability of ML models such as Random Forest (RF) is still in its infancy. Considering the rigor of formal methods and their wide application in the field of ML in recent years, this study leverages formal methods and logical reasoning to develop an ML interpretability method for interpreting the predictions of RF. Specifically, the decision-making process of RF is encoded into a first-order logical formula, and a Minimal Unsatisfiable Core (MUC) is taken as the core. Local interpretation of feature importance and counterfactual sample generation methods are provided. Experimental results on several public datasets illustrate the high quality of the proposed feature importance measurement, and the counterfactual sample generation method outperforms the existing state-of-the-art methods. Moreover, from the perspective of user-friendliness, a user report can be generated according to the analysis results of counterfactual samples, which can provide suggestions for users to improve their situation in real-life applications.

2.
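As machine learning is applied ever more widely in safety-critical domains, the demands on its interpretability grow accordingly. Interpretability aims to help people understand how a model works internally and on what basis it makes decisions, thereby increasing the model's credibility. However, research on the interpretability of machine learning models such as random forests is still at an early stage. Given the rigor of formal methods and their wide application in machine learning in recent years, this paper proposes an interpretability method based on formal methods and logical reasoning to explain the predictions of random forests. Specifically, the decision process of the random forest model is encoded as a first-order logic formula and, with minimal unsatisfiable cores at its center, the method provides local explanations of feature importance together with a counterfactual sample generation method. Experiments on several public datasets show that the proposed feature importance measure is of high quality and that the proposed counterfactual sample generation algorithm outperforms existing state-of-the-art algorithms. In addition, for user-friendliness, a user report can be generated from the analysis of counterfactual samples, offering users suggestions for improving their own situation in practical applications.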

3.
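Using SPOT 5 multispectral imagery as the data source, this study investigates land-cover classification of multispectral remote sensing data based on the SAM-SID hybrid method, comparing it with SAM, SID, and the conventional maximum likelihood (ML) and minimum distance (MD) methods. The results show that, compared with SAM and SID, the two SAM-SID hybrid measures SID(TAN) and SID(SIN) discriminate ground objects in multispectral imagery better, with SID(SIN) the strongest. Multispectral classification based on SID(SIN) reached a validation accuracy of 78.94%, clearly higher than SAM and SID and also higher than the conventional MD and ML supervised classification methods. This indicates that the SAM-SID hybrid classification method is suitable not only for hyperspectral remote sensing classification but also for multispectral remote sensing classification.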
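
For readers unfamiliar with the measures named in this abstract, the following is a minimal sketch of how they are commonly defined, not the authors' implementation. It assumes SAM is the spectral angle between two spectra, SID is the symmetric Kullback-Leibler divergence between the spectra normalized to sum to one, and the hybrid measures are SID multiplied by the tangent or sine of the spectral angle. The function names, the `eps` smoothing term, the nearest-reference classification rule, and the example spectra are all illustrative assumptions.

```python
import numpy as np


def sam(x, y):
    """Spectral angle (in radians) between two spectra."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cos_angle = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))


def sid(x, y, eps=1e-12):
    """Spectral information divergence: symmetric KL divergence
    between the two spectra treated as probability vectors.
    Assumes non-negative band values; eps (an assumption) avoids log(0)."""
    p = np.asarray(x, dtype=float) + eps
    q = np.asarray(y, dtype=float) + eps
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))


def sid_tan(x, y):
    """Hybrid measure SID(TAN) = SID * tan(SAM)."""
    return sid(x, y) * np.tan(sam(x, y))


def sid_sin(x, y):
    """Hybrid measure SID(SIN) = SID * sin(SAM)."""
    return sid(x, y) * np.sin(sam(x, y))


def classify(pixel, references, measure=sid_sin):
    """Assign the pixel to the reference class with the smallest measure.
    `references` maps a class label to a reference spectrum (hypothetical setup)."""
    return min(references, key=lambda label: measure(pixel, references[label]))


if __name__ == "__main__":
    # Made-up four-band reference spectra, for illustration only.
    refs = {
        "water": [0.05, 0.04, 0.03, 0.01],
        "vegetation": [0.04, 0.08, 0.05, 0.45],
    }
    print(classify([0.05, 0.09, 0.06, 0.40], refs))
```

Both hybrid measures are zero for identical spectra and grow when either the angle or the divergence grows, which is why they can separate classes that SAM or SID alone confuses.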

4.
Multispectral extensions to the traditional gray level simultaneous autoregressive (SAR) and Markov random field (MRF) models are considered. Furthermore, a new image model is proposed, the pseudo-Markov model, which retains the characteristics of the multispectral Markov model, yet admits a simplified parameter estimation method. These models are well-suited to analysis and modeling of color images. For each model considered, procedures are developed for parameter estimation and image synthesis. Experimental results, based on known image models and natural texture samples, substantiate the validity of these results.

5.
Remote sensing image fusion based on Bayesian linear estimation   Cited by: 1 (self-citations: 0, by others: 1)
A new remote sensing image fusion method based on statistical parameter estimation is proposed in this paper. More specifically, Bayesian linear estimation (BLE) is applied to observation models between remote sensing images with different spatial and spectral resolutions. The proposed method only estimates the mean vector and covariance matrix of the high-resolution multispectral (MS) images, instead of assuming the joint distribution between the panchromatic (PAN) image and low-resolution multispectral image. Furthermore, the proposed method can enhance the spatial resolution of several principal components of MS images, while the traditional Principal Component Analysis (PCA) method is limited to enhancing only the first principal component. Experimental results with real MS images and PAN image of Landsat ETM demonstrate that the proposed method performs better than traditional methods based on statistical parameter estimation, PCA-based method and wavelet-based method.
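
As background for the abstract above, the textbook form of a Bayesian linear (linear minimum mean-square error) estimator is sketched here. This is the generic estimator, not necessarily the exact formulation used in the paper. All notation is assumed for illustration: $x$ stands for the unknown high-resolution MS data with mean $\mu_x$ and covariance $C_x$, $y$ for the observed images, $H$ for a linear observation operator, and $n$ for zero-mean noise with covariance $C_n$ that is uncorrelated with $x$.

```latex
% Assumed linear observation model (notation is illustrative)
y = Hx + n, \qquad \mathbb{E}[n] = 0, \quad \operatorname{Cov}(n) = C_n

% Bayesian linear estimate of x given y
\hat{x} = \mu_x + C_x H^{\top} \left( H C_x H^{\top} + C_n \right)^{-1} \left( y - H \mu_x \right)
```

Only $\mu_x$ and $C_x$ appear in the estimate, which is consistent with the abstract's remark that the mean vector and covariance matrix suffice and no joint distribution needs to be assumed.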

6.
With the cloud scenario products from CloudSat, we developed a high spatiotemporal resolution cloud-type classification procedure for Himawari-8 multispectral datasets using maximum-likelihood estimation (MLE) and random forests (RF) classification. The training and classification procedures were processed independently, and both algorithms provided cloud-type results with a good performance. Validation indicated that the use of the visible (VIS) channel significantly improved the cloud-type identification capabilities, while the use of three or more channels simultaneously resulted in considerable improvements over the use of bispectral combinations. The comparison among different classifiers also revealed that RF was more sensitive than MLE to the quality and distribution of the training data. After retraining the RF using MLE-based clustered samples, we produced two more-reasonable and efficient classifiers that can be used during the day and night.

7.
An adaptive segmentation algorithm is developed which simultaneously estimates the parameters of the underlying Gibbs random field (GRF) and segments the noisy image corrupted by additive independent Gaussian noise. The algorithm, which aims at obtaining the maximum a posteriori (MAP) segmentation, is a simulated annealing algorithm that is interrupted at regular intervals for estimating the GRF parameters. Maximum-likelihood (ML) estimates of the parameters based on the current segmentation are used to obtain the next segmentation. It is proven that the parameter estimates and the segmentations converge in distribution to the ML estimate of the parameters and the MAP segmentation with those parameter estimates, respectively. Due to computational difficulties, however, only an approximate version of the algorithm is implemented. The approximate algorithm is applied on several two- and four-region images with different noise levels and with first-order and second-order neighborhoods.

8.
Wild-land fires have become more intense and more frequent all over the world. Improving the accuracy of mapping fuel models is essential for fuel management decisions and explicit fire behavior prediction for real-time support of suppression tactics and logistics decisions. The overall aim of this paper is to develop the use of lidar (LIght Detection and Ranging) remote sensing to accurately and effectively assess fuel models in East Texas. More specific goals include: (1) developing lidar-derived products and the methodology to use them for assessing fuel models; (2) investigating the use of several techniques for data fusion of lidar and multispectral imagery for assessing fuel models; (3) investigating the gain in fuels mapping accuracy when using lidar as opposed to QuickBird imagery alone; and (4) producing spatially explicit digital fuel maps. Estimates of fuel models were compared with in-situ data collected over 62 plots. We employ a unique approach to classify fuel models using a combination of lidar height bins and multispectral image data. Different image processing approaches for fusing lidar and multispectral data, such as the Minimum Noise Fraction (MNF) and Principal Component Analysis (PCA), were used to improve the overall accuracy of image classification. Supervised image classification methods provided better accuracy (90.10%) with the fusion of airborne lidar data with QuickBird data than with QuickBird imagery alone (76.52%). According to our results, lidar-derived data provide accurate estimates of surface fuel parameters efficiently over extensive areas of forest. This study demonstrates the importance of using accurate maps of fuel models derived using new lidar remote sensing techniques.

9.
Accurate crop-type classification is a challenging task due, primarily, to the high within-class spectral variations of individual crops during the growing season (phenological development) and, second, to the high between-class spectral similarity of crop types. Utilizing within-season multi-temporal optical and multi-polarization synthetic aperture radar (SAR) data, this study introduces a combined object- and pixel-based image classification methodology for accurate crop-type classification. Particularly, the study investigates the improvement of crop-type classification by using the least number of multi-temporal RapidEye (RE) images and multi-polarization Radarsat-2 (RS-2) data utilized in an object- and pixel-based image analysis framework. The method was tested on a study area in Manitoba, Canada, using three different classifiers including the standard Maximum Likelihood (ML), Decision Tree (DT), and Random Forest (RF) classifiers. Using only two RE images of July and August, the proposed method results in overall accuracies (OAs) of about 95%, 78%, and 93% for the ML, DT, and RF classifiers, respectively. Moreover, the use of only two quad-pol images of RS-2 of June and September resulted in OAs of 92%, 75%, and 90% for the ML, DT, and RF classifiers, respectively. The best classification results were achieved by the synergistic use of two RE and two RS-2 images. In this case, the overall classification accuracies were 97% for both ML and RF classifiers. In addition, the average producer’s accuracies of 95% and 96% were achieved by the ML and RF classifiers, respectively, whereas the average user accuracy was 94% for both classifiers. The results indicated promising potentials for rapid and cost-effective local-scale crop-type classification using a limited number of high-resolution optical and multi-polarization SAR images. Very accurate classification results can be considered as a replacement for sampling the agricultural fields at the local scale. 
The result of this very accurate classification at discrete locations (approximately 25 × 25 km frames) can be applied in a separate procedure to increase the accuracy of crop area estimation at the regional to provincial scale by linking these local very accurate spatially discrete results to national wall-to-wall continuous crop classification maps.

10.
This article proposes a new multispectral image texture segmentation algorithm using a multi-resolution fuzzy Markov random field model for a variable scale in the wavelet domain. The algorithm considers multi-scalar information in both vertical and lateral directions. The feature field of the scalable wavelet coefficients is modelled, combining with the fuzzy label field describing the spatially constrained correlations between neighbourhood features to achieve a more accurate parameter estimation. The extended scalable label field models the label data from different scales to obtain more homogeneous areas; image segmentation results are finally obtained according to the Bayesian rule from a coarser to a finer scale. Multispectral texture images and remote-sensing images are used to test the effectiveness of the proposed method. Segmentation results show that the new method simultaneously presents a better performance in achieving the homogeneity of the region and accuracy of detected boundaries compared with existing image segmentation algorithms.

11.
Accurate mapping of wetland distribution is required for wetland conservation, management, and restoration, but remains a challenge due to the complexity of wetland landscapes. This research employed four seasons of multispectral images from Gaofen-1 satellite to map wetland land-cover distribution in Hangzhou bay coastal wetland (245 km²) in China. Maximum likelihood classifier (MLC), random forest (RF), and the expert-based approach were examined based on spectral, spatial, and phenological features. The results showed that land-cover classification accuracies of 83.9% using RF and 90.3% using the expert-based approach were obtained, and they had higher accuracy than MLC, which had an overall accuracy of only 63.3%. The high classification accuracy for nine land-cover classes using the expert-based approach indicated the important role of expert knowledge from the phenological features in improving wetland classification accuracy. As high spatial resolution satellite images become more easily obtainable, effective use of temporal information of different sensor data will be valuable for detailed land-cover classification with higher accuracy. The approach to establish expert rules from multitemporal images provides a new way to improve land-cover classification in different terrestrial ecosystems.

12.
A new relevance feedback (RF) approach for content-based image retrieval is presented. This approach uses Gaussian mixture (GM) models of the image features and a query that is updated in a probabilistic manner. This update reflects the preferences of the user and is based on the models of both the positive and negative feedback images. The retrieval is based on a recently proposed distance measure between probability density functions, which can be computed in closed form for GM models. The proposed approach takes advantage of the form of this distance measure and updates it very efficiently based on the models of the user-specified relevant and irrelevant images. It is also shown that this RF framework is fairly general and can be applied in case other image models or distance measures are used instead of those proposed in this work. Finally, comparative numerical experiments are provided, which demonstrate the merits of the proposed RF methodology and the use of the distance measure, and also the advantages of using GMs for image modelling.

13.
Generalized linear mixed models (GLMM) form a very general class of random effects models for discrete and continuous responses in the exponential family. They are useful in a variety of applications. The traditional likelihood approach for GLMM usually involves high-dimensional integrations which are computationally intensive. In this work, we investigate the case of binary outcomes analyzed under a two-stage probit normal model with random effects. First, it is shown how ML estimates of the fixed effects and variance components can be computed using a stochastic approximation of the EM algorithm (SAEM). The SAEM algorithm can be applied directly, or in conjunction with a parameter expansion version of EM to speed up the convergence. A procedure is also proposed to obtain REML estimates of variance components and REML-based estimates of fixed effects. Finally, an application to a real data set involving a clinical trial is presented, in which these techniques are compared to other procedures (penalized quasi-likelihood, maximum likelihood, Bayesian inference) already available in classical software (SAS Glimmix, SAS Nlmixed, WinBUGS), as well as to a Monte Carlo EM (MCEM) algorithm.

14.
An important technique in cultural heritage preservation is multispectral acquisition, where one recovers a detailed spectral record of a painting using carefully calibrated lighting. This is difficult to do with frescoes, because it is hard to recover the spatial variation in light intensity that results from factors like the imaging setup and the curvature of the fresco. We introduce a new formulation of the lightness problem applied to images of pictorial artworks. The problem is different from the conventional lightness problem, because artists often paint the effects of light, so the albedo field contains a component that mimics an illumination field. Our method distinguishes between physical illumination and painted shading through spatial frequency effects and dynamic range considerations. We evaluate our method using multispectral images of paintings, where the physical illumination field is known. Our method produces estimates of the illumination intensity field that compare very well with the known ground truth, and outperforms other state-of-the-art lightness recovery algorithms. For frescoes, ground truth is not available, but we show that our method produces consistent results, in the sense that the illumination functions estimated on the image and on (some of) its subimages are very similar on the overlap. We show our method produces qualitatively good color corrections for images of frescoes found on the web.

15.
This paper deals with a comparison of recent statistical models based on fuzzy Markov random fields and chains for multispectral image segmentation. The fuzzy scheme takes into account discrete and continuous classes which model the imprecision of the hidden data. In this framework, we assume the dependence between bands and we express the general model for the covariance matrix. A fuzzy Markov chain model is developed in an unsupervised way. This method is compared with the fuzzy Markovian field model previously proposed by one of the authors. The segmentation task is processed with Bayesian tools, such as the well-known MPM (mode of posterior marginals) criterion. Our goal is to compare the robustness and speed of both methods (fuzzy Markov fields versus fuzzy Markov chains). Indeed, such fuzzy-based procedures seem to be a good answer, e.g., for astronomical observations when the patterns present diffuse structures. Moreover, these approaches allow us to process missing data in one or several spectral bands which correspond to specific situations in astronomy. To validate both models, we perform and compare the segmentation on synthetic images and raw multispectral astronomical data.

16.
Modeling textured images using generalized long correlation models   Cited by: 2 (self-citations: 0, by others: 2)
The long correlation (LC) models are a general class of random field (RF) models which are able to model correlations extending over large image regions with few model parameters. The LC models have seen limited use, due to lack of an effective method for estimating the model parameters. In this work, we develop an estimation scheme for a very general form of this model and demonstrate its applicability to texture modeling applications. The relationship of the generalized LC models to other classes of RF models, namely the simultaneous autoregressive (SAR) and Markov random field (MRF) models, is shown. While it is known that the SAR model is a special case of the LC model, we show that the MRF model is also encompassed by this model. Consequently, the LC model may be considered as a generalization of the SAR and MRF models.

17.
Ribonucleic acid (RNA) hybridization is widely used in popular RNA simulation software in bioinformatics. However, limited by the exponential computational complexity of combinatorial problems, it is challenging to decide, within an acceptable time, whether a specific RNA hybridization is effective. We hereby introduce a machine learning based technique to address this problem. Sample machine learning (ML) models tested in the training phase include algorithms based on the boosted tree (BT), random forest (RF), decision tree (DT) and logistic regression (LR), and the corresponding models are obtained. Given the RNA molecular coding training and testing sets, the trained machine learning models are applied to predict the classification of RNA hybridization results. The experimental results show that the optimal predictive accuracies are 96.2%, 96.6%, 96.0% and 69.8% for the RF, BT, DT and LR-based approaches, respectively, under the strong constraint condition, compared with traditional representative methods. Furthermore, the average computation efficiencies of the RF, BT, DT and LR-based approaches are 208 679, 269 756, 184 333 and 187 458 times higher than that of the existing approach, respectively. Given an RNA design, the BT-based approach demonstrates high computational efficiency and better predictive accuracy in determining the biological effectiveness of molecular hybridization.

18.
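To make full use of the correlation between the bands of multispectral imagery, a Gaussian copula based segmentation method for multispectral remote sensing images is proposed. First, a label field model is built on a Markov random field and characterized by the Potts model. Then a feature field representing the spectral measurements of pixels is built, and the Gaussian copula is used to construct a multivariate statistical model of the pixel spectral measurements to characterize this feature field. Combining the label field and feature field models with the prior probabilities of the model parameters, a posterior probability model for multispectral image segmentation is established by Bayes' theorem. Finally, a Metropolis-Hastings (M-H) algorithm suited to simulating from the posterior model is designed, and the optimal segmentation is obtained under the maximum a posteriori strategy. Segmentation results on simulated and real multispectral images show that the proposed method describes inter-band correlation well and achieves high accuracy.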

19.
Automatic land cover classification from satellite images is an important topic in many remote sensing applications. In this paper, we consider three different statistical approaches to tackle this problem: two of them, namely the well-known maximum likelihood classification (ML) and the support vector machine (SVM), are noncontextual methods. The third one, iterated conditional modes (ICM), exploits spatial context by using a Markov random field. We apply these methods to Landsat 5 Thematic Mapper (TM) data from Tenerife, the largest of the Canary Islands. Due to the size and the strong relief of the island, ground truth data could be collected only sparsely by examination of test areas for previously defined land cover classes. We show that after application of an unsupervised clustering method to identify subclasses, all classification algorithms give satisfactory results (with statistical overall accuracy of about 90%) if the model parameters are selected appropriately. Although superior to ML in theory, both SVM and ICM have to be used carefully: ICM is able to improve ML, but when applied for too many iterations, spatially small sample areas are smoothed away, leading to statistically slightly worse classification results. SVM yields better statistical results than ML, but when investigated visually, the classification result is not completely satisfying. This is due to the fact that no a priori information on the frequency of occurrence of a class was used in this context, which helps ML to limit the unlikely classes.

20.
This paper presents a semi-parametric method of parameter estimation for the class of logarithmic ACD (Log-ACD) models using the theory of estimating functions (EF). A number of theoretical results related to the corresponding EF estimators are derived. A simulation study is conducted to compare the performance of the proposed EF estimates with corresponding ML (maximum likelihood) and QML (quasi maximum likelihood) estimates. It is argued that the EF estimates are relatively easier to evaluate and have sampling properties comparable with those of ML and QML methods. Furthermore, the suggested EF estimates can be obtained without any knowledge of the distribution of errors. We apply the suggested methodology to a real financial duration dataset. Our results show that Log-ACD (1, 1) fits the data well, giving relatively smaller variation in forecast errors than Linear ACD (1, 1) regardless of the method of estimation. In addition, the Diebold–Mariano (DM) and superior predictive ability (SPA) tests have been applied to confirm the performance of the suggested methodology. It is shown that the new method is slightly better than traditional methods in practice in terms of computation; however, there is no significant difference in forecasting ability for all models and methods.
