Similar documents
1.
Sparse CCA using a Lasso with positivity constraints   Cited by: 1 (self-citations: 0, other citations: 1)
Canonical correlation analysis (CCA) describes the relationship between two sets of variables by finding linear combinations of the variables with maximal correlation. A sparse version of CCA is proposed that reduces the chance of including unimportant variables in the canonical variates and thus improves their interpretation. A version of the Lasso algorithm incorporating positivity constraints is implemented in tandem with alternating least squares (ALS), to obtain sparse canonical variates. The proposed method is demonstrated on simulation studies and a data set from market basket analysis.
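
As a rough illustration of the first sentence of this abstract, the sketch below computes ordinary (non-sparse) CCA with NumPy by whitening both variable sets and taking an SVD of the whitened cross-covariance. It is not the Lasso/ALS method with positivity constraints that the paper proposes. The function name `cca`, the small ridge term `reg`, and the simulated data are assumptions made only for this example.

```python
import numpy as np

def cca(X, Y, reg=1e-8):
    """Plain CCA sketch: canonical correlations and weights for X (n x p), Y (n x q)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Sample covariance blocks; `reg` is a tiny ridge for numerical stability (assumption).
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # Whitened cross-covariance K = Lx^{-1} Sxy Ly^{-T}.
    K = np.linalg.solve(Lx, Sxy)
    K = np.linalg.solve(Ly, K.T).T
    U, corrs, Vt = np.linalg.svd(K, full_matrices=False)
    # Map singular vectors back to weights on the original variables.
    A = np.linalg.solve(Lx.T, U)
    B = np.linalg.solve(Ly.T, Vt.T)
    return corrs, A, B

# Simulated data (hypothetical): both sets share one latent signal z.
rng = np.random.default_rng(0)
n = 500
z = rng.standard_normal(n)
X = np.outer(z, [1.0, 0.5, 0.0]) + 0.5 * rng.standard_normal((n, 3))
Y = np.outer(z, [1.0, -1.0]) + 0.5 * rng.standard_normal((n, 2))

corrs, A, B = cca(X, Y)
print("canonical correlations:", corrs)
```

The columns of `A` and `B` are the weight vectors of the canonical variates; a sparse variant would drive many of those weights to exactly zero.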

2.
The objective of DALASS is to simplify the interpretation of Fisher's discriminant function coefficients. The DALASS problem, that is, discriminant analysis (DA) modified so that the canonical variates satisfy the LASSO constraint, is formulated as a dynamical system on the unit sphere. Both standard and orthogonal canonical variates are considered. The globally convergent continuous-time algorithms are illustrated numerically and applied to some well-known data sets.

3.
Canonical correlation analysis was used to examine the relations between the six reflective Thematic Mapper bands and six forest structural variables for 70 lodgepole pine forest stands in Yellowstone National Park, U.S.A. Two significant canonical variate pairs were extracted, accounting for 96·4 per cent of the total information in the overall canonical correlation analysis. Results of the canonical redundancy analysis indicate that 78 per cent of the overall unstandardized variance in spectral data is explained by the first two spectral canonical variates, while the first and second biotic canonical variates explain 59 per cent and 5·9 per cent of the raw variance in the spectral data. The first two biotic canonical variates collectively explain 59 per cent of the raw variance in the biotic data, and the first and second spectral canonical variates explain 41 per cent and 6 per cent of the raw variance in the biotic data, respectively. Height, live basal area, leaf area index (LAI), and size diversity are highly intercorrelated and act in combination to affect the overall reflectance, or brightness, of a forest stand. Overstory live density and understory total living cover relate strongly to stand greenness, particularly TM band 4.

4.
Discriminant analysis and canonical variates analysis on principal components of a number of extracts from multi-spectral images showed that low-order components with large eigenvalues are not necessarily the most important for distinguishing classes of land cover, and that discarding components with small eigenvalues may reduce the accuracy of discrimination. It is therefore inadvisable to use principal components analysis for reducing the number of wavebands used for discriminant analysis.

5.
Multivariate experimental data collected over multiple occasions frequently form a multivariate multi-table experiment. Neither the separate analysis of each occasion, using MANOVA or canonical variate analysis, nor the joint analysis using the STATIS-ACT methodology for multiple tables is adequate to capture the real structure of the data matrices: the former accounts for group structure but not time evolution, and the latter confounds the “between” and “within” group variabilities. A method named Canonical STATIS, based on the STATIS-ACT methodology, is proposed. The method accounts for both the group structure of the data and its time evolution across occasions by obtaining common, or stable, canonical variates across multiple occasions or data sets. A simultaneous representation of groups, variables and occasions (biplot) is obtained, thereby widening the capabilities of classical methods.

6.
This paper is mainly devoted to the analysis of the discrimination capability of a radar polarimeter using a purely statistical approach. The statistical analysis is intended to find the set of variates that best summarizes the differences between classes. We have analysed the advantages of a fully polarimetric sensor with respect to a conventional radar that does not retain the phase relation between different polarizations. In this work, we have used the radar images acquired by the DC-8/AIRSAR over the Flevoland test site “Holland” during the MASTRO I campaign.

Canonical discriminant analysis gives the best results for summarizing the information content of the data and for reducing the dimension of the variables to be considered in the classification. The phase information only shows significant discrimination power when several independent samples are averaged. A good speckle reduction technique improves the classification results, even when the phase information is not taken into account.

7.
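From the perspective of pattern classification, a supervised locality preserving canonical correlation analysis (SLPCCA) is proposed. By maximizing the weighted correlation between within-class sample pairs and their neighbors, it effectively exploits class label information while preserving the local manifold structure of the data, and it incorporates the discriminative information of discriminative canonical correlation analysis (DCCA) without being limited by the total number of classes. In addition, to extract nonlinear features of the data, a kernelized SLPCCA (KSLPCCA) is proposed on the basis of the kernel method. Experimental results on the ORL, Yale, AR and FERET face databases show that the algorithm achieves better recognition performance than other traditional canonical correlation analysis methods.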

8.
Spatial interpolation methods are normally used to create aerial rainfall maps from remote measuring data collected by a rain gauge network. However, most spatial interpolation methods are not in the form of interpretable data models, which can make further analysis of the spatial data difficult. This paper proposes a methodology to analyze and establish an interpretable fuzzy model for monthly rainfall spatial interpolation. The proposed methodology integrates the benefits of various soft computing techniques. The final outcome is an interpretable fuzzy model that allows human analysts to gain insight into the spatial data being modeled. The accuracy of the model is evaluated on eight monthly rainfall data sets from the northeast region of Thailand. The interpretability of the model is assessed by the interpretable fuzzy modeling criteria. The experimental results showed that the proposed methodology could be an alternative technique to create rainfall maps and to understand the characteristics of the spatial data.

9.
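Sparsity preserving canonical correlation analysis with application to feature fusion   Cited by: 3 (self-citations: 0, other citations: 3)
Sparsity preserving projections (SPP) preserve the sparse reconstruction relationships among data, so the resulting projection vectors are invariant to rotation, scale and translation and can extract the natural discriminative information of samples without labels; SPP has been applied with considerable success in face recognition. Building on canonical correlation analysis (CCA), this paper introduces a sparsity preserving term and proposes sparsity preserving canonical correlation analysis (SPCCA). The method not only fuses the discriminative information of two feature sets effectively but also constrains the sparse reconstruction relationships among the extracted features, which strengthens their representational and discriminative power. Experimental results on a multiple-feature handwritten character data set and on face data sets show that SPCCA achieves better recognition performance than CCA.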

10.
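Zhang Fangjuan (张芳娟), Yang Yan (杨燕), Du Shengdong (杜圣东). Journal of Computer Applications (《计算机应用》), 2018, 38(11): 3150-3155
To address the low efficiency and heavy workload of university student financial aid administration, an enhanced discriminative canonical correlation analysis (EN-DCCA) method that strengthens feature discriminability is proposed and combined with a classifier ensemble to predict university student grants. Multi-dimensional data on students' campus life are divided into two views. Existing multi-view discriminative canonical correlation analysis algorithms do not jointly consider the correlation between view classes and the discriminability of the combined view features. The optimization objective of EN-DCCA maximizes within-class correlation while minimizing between-class correlation and also takes the discriminability of the combined view features into account, which further strengthens the discriminative power of the attributes and benefits classification and prediction. Grant prediction proceeds as follows: first, the data are preprocessed into two views according to students' daily-life behavior and academic performance; then EN-DCCA is used to learn features from the two views; finally, a classifier ensemble produces the prediction. In experiments on a real data set, the proposed method reached a prediction accuracy of 90.01%, 2 percentage points higher than the ensemble method based on canonical correlation analysis with enhanced discriminability of combined view features (CECCA). The results show that the proposed method can predict university grants effectively.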

11.
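An image retrieval method based on sparse canonical correlation analysis   Cited by: 1 (self-citations: 0, other citations: 1)
Zhuang Ling (庄凌), Zhuang Yueting (庄越挺), Wu Jiangqin (吴江琴), Ye Zhenchao (叶振超), Wu Fei (吴飞). Journal of Software (《软件学报》), 2012, 23(5): 1295-1304
A key problem in semantic image retrieval is finding the association between low-level image features and semantics. Because text is an effective means of expressing semantics, this paper proposes studying the relationship between the text and image modalities in order to build an effective model of the latent semantic association between them. With this model, the retrieval intent can be expressed in natural language (text sentences) and the relevant images are then retrieved. The model is based on sparse canonical correlation analysis (sparse CCA) and is trained as follows: first, a text semantic space is constructed by latent semantic analysis; next, the images corresponding to the text are represented by a bag of visual words; finally, the sparse CCA algorithm finds a semantically correlated space that maps between text semantics and visual words. Using a sparse correlation analysis improves the interpretability of the model and keeps the retrieval results stable. Experimental results verify the effectiveness of the sparse CCA method and confirm the feasibility of the proposed semantic image retrieval method.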

12.
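Chemical production processes usually involve a large number of process variables and mostly run under closed-loop control, so the measured data are often both cross-correlated and autocorrelated. Canonical variate analysis (CVA) reduces the dimension of high-dimensional data by maximizing the correlation between two sets of variables and yields a set of canonical variates that explain as much of the information in the variable sets as possible, which handles these problems well. This paper introduces a CVA-based process monitoring method and applies it to the monitoring of a real chemical process unit. Using control charts, process faults were detected promptly and accurately, which demonstrates the effectiveness of CVA-based monitoring.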

13.
The usefulness of questionnaire and voice data to screen for laryngeal disorders is explored. Answers to 14 questions form a questionnaire data vector. Twenty-three variables computed by the commercial “Dr.Speech” software from a digital voice recording of a sustained phonation of the vowel sound /a/ constitute a voice data vector. The task pursued in this work is the categorization of the data into a healthy class and two classes of disorders, namely diffuse and nodular mass lesions of the vocal folds. Visualization of data and automated decisions is also an important aspect of this work. To make the categorization, a support vector machine (SVM) is designed based on genetic search. Linear as well as nonlinear canonical correlation analysis (CCA) is employed to study relations between the questionnaire and voice data sets. Curvilinear component analysis, which performs a nonlinear mapping into a two-dimensional space, is used for visualizing data and decisions. Data from 240 patients were used in the experimental studies. It was found that the questionnaire data provide more information for the categorization than the voice data. There are 3-4 common directions along which the statistically significant variations of the questionnaire and voice data occur. However, the linear relations between the variations occurring in the two data sets are not strong. On the other hand, very strong linear relations were observed between the nonlinear variates obtained from the questionnaire data and linear ones computed from the voice data. Questionnaire data carry great potential for preventive health care in laryngology.

14.
This paper presents a new method for identifying canonical coherent scatterers in quad-polarimetric SAR data. The proposed method is based on the analysis of polarimetric signatures. The observed signatures are compared with the polarimetric signatures of four canonical objects (trihedral, dihedral, and right and left helix), which represent the basic scattering mechanisms: single bounce, double bounce and helix scattering. The polarimetric matrices are treated as vectors in a unitary space with a scalar product that generates the norm. A recognized object is assigned to one of the four coherent classes by a Kohonen network. The network is not trained in an iterative process; instead, its weights are set according to the given patterns. The network classification is supported by rules. The obtained maps of pixels that represent canonical objects are compared with a map of coherent scatterers obtained using the polarimetric entropy approach. The developed method allows us not only to identify the canonical coherent scatterers precisely but also to determine the type of scattering mechanism characteristic of each of them, which is an improvement over existing methods. Because the proposed method works on single-look (non-averaged) SAR data, no averaging is conducted and no spatial or spectral information is lost. The obtained results should be more precise because the full polarimetric information about the scatterers is used in the identification procedure.

15.
S-distributions are univariate statistical distributions with four parameters. They have a simple mathematical structure yet provide excellent approximations for many traditional distributions and also contain a multitude of distributional shapes without a traditional analog. S-distributions furthermore have a number of beneficial features, for instance, in terms of data classification and scaling properties. They provide an appealing compromise between generality in data representation and logistic simplicity and have been applied in a variety of fields from applied biostatistics to survival analysis and risk assessment. Given their advantages in the single-variable case, it is desirable to extend S-distributions to several variates. This article proposes such an extension. It focuses on bivariate distributions whose marginals are S-distributions, but it is clear how more than two variates are to be addressed. The construction of bivariate S-distributions utilizes copulas, which have been developed quite rapidly in recent years. It is demonstrated here how one may generate such copulas and employ them to construct and analyze bivariate (and, by extension, multivariate) S-distributions. Particular emphasis is placed on Archimedean copulas, because they are easy to implement, yet quite flexible in fitting a variety of distributional shapes. It is illustrated that the bivariate S-distributions thus constructed have considerable flexibility. They cover a variety of marginals and a wide range of dependences between the variates and facilitate the formulation of relationships between measures of dependence and model parameters. Several examples of marginals and copulas illustrate the flexibility of bivariate S-distributions.
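
To make the copula construction mentioned in this abstract more concrete, the sketch below samples from a Clayton copula (one member of the Archimedean family) and then applies inverse CDFs to obtain a bivariate sample with chosen marginals. The exponential and logistic marginals are stand-ins chosen only because their quantile functions are closed-form; they are not S-distributions, and the names `sample_clayton` and `kendall_tau` are assumptions for this example. For the Clayton copula, Kendall's tau equals θ/(θ+2), which is one instance of a relationship between a dependence measure and a model parameter.

```python
import numpy as np

def sample_clayton(n, theta, rng):
    """Draw n pairs (u, v) from a Clayton copula (theta > 0) by conditional inversion."""
    u = rng.uniform(1e-12, 1.0, size=n)
    w = rng.uniform(1e-12, 1.0, size=n)
    # Solve dC(u, v)/du = w for v, with C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta).
    v = (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u, v

def kendall_tau(x, y):
    """Kendall's tau by direct pair counting (O(n^2); fine for a few thousand points)."""
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    n = len(x)
    return (dx * dy).sum() / (n * (n - 1))

rng = np.random.default_rng(1)
theta = 2.0
u, v = sample_clayton(2000, theta, rng)

# Stand-in marginals (assumption): exponential for x, logistic for y.
x = -np.log1p(-u)            # exponential(1) quantile function
y = np.log(v / (1.0 - v))    # standard logistic quantile function

tau_hat = kendall_tau(x, y)
tau_theory = theta / (theta + 2.0)
print("empirical tau:", tau_hat, "theoretical tau:", tau_theory)
```

Because the marginal transforms are strictly increasing, the rank-based dependence of `(x, y)` is the same as that of `(u, v)`; swapping in other quantile functions changes the marginals without changing the copula.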

16.
17.
Developing Takagi–Sugeno fuzzy models by evolutionary algorithms mainly requires three factors: an encoding scheme, an evaluation method, and appropriate evolutionary operations. At the same time, these three factors should be designed so that they can consider three important aspects of fuzzy modeling: modeling accuracy, compactness, and interpretability. This paper proposes a new evolutionary algorithm that fulfills such requirements and solves fuzzy modeling problems. Two major ideas proposed in this paper lie in a new encoding scheme and a new fitness function, respectively. The proposed encoding scheme consists of three chromosomes, one of which uses unique chained possibilistic representation of rule structure. The proposed encoding scheme can achieve simultaneous optimization of parameters of antecedent membership functions and rule structures with the new fitness function developed in this paper. The proposed fitness function consists of five functions that consider three evaluation criteria in fuzzy modeling problems. The proposed fitness function guides evolutionary search direction so that the proposed algorithm can find more accurate compact fuzzy models with interpretable antecedent membership functions. Several evolutionary operators that are appropriate for the proposed encoding scheme are carefully designed. Simulation results on three modeling problems show that the proposed encoding scheme and the proposed fitness functions are effective in finding accurate, compact, and interpretable Takagi–Sugeno fuzzy models. From the simulation results, it is shown that the proposed algorithm can successfully find fuzzy models that approximate the given unknown function accurately with a compact number of fuzzy rules and membership functions. At the same time, the fuzzy models use interpretable antecedent membership functions, which are helpful in understanding the underlying behavior of the obtained fuzzy models.

18.
Data recorded from multiple sources sometimes exhibit non-instantaneous couplings. For simple data sets, cross-correlograms may reveal the coupling dynamics. But when dealing with high-dimensional multivariate data there is no such measure as the cross-correlogram. We propose a simple algorithm based on Kernel Canonical Correlation Analysis (kCCA) that computes a multivariate temporal filter which links one data modality to another one. The filters can be used to compute a multivariate extension of the cross-correlogram, the canonical correlogram, between data sources that have different dimensionalities and temporal resolutions. The canonical correlogram reflects the coupling dynamics between the two sources. The temporal filter reveals which features in the data give rise to these couplings and when they do so. We present results from simulations and neuroscientific experiments showing that tkCCA yields easily interpretable temporal filters and correlograms. In the experiments, we simultaneously performed electrode recordings and functional magnetic resonance imaging (fMRI) in primary visual cortex of the non-human primate. While electrode recordings reflect brain activity directly, fMRI provides only an indirect view of neural activity via the Blood Oxygen Level Dependent (BOLD) response. Thus it is crucial for our understanding and the interpretation of fMRI signals in general to relate them to direct measures of neural activity acquired with electrodes. The results computed by tkCCA confirm recent models of the hemodynamic response to neural activity and allow for a more detailed analysis of neurovascular coupling dynamics.
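
For the simple univariate case that this abstract contrasts with its multivariate method, a cross-correlogram can be sketched as the correlation between two standardized signals at a range of time lags. The code below is only that baseline, not the kernel CCA filter the paper proposes; the function name `cross_correlogram`, the delay of 5 samples and the noise level are assumptions for the illustration.

```python
import numpy as np

def cross_correlogram(x, y, max_lag):
    """Correlation between x[t] and y[t + lag] for lag in [-max_lag, max_lag]."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = len(x)
    lags = np.arange(-max_lag, max_lag + 1)
    cc = np.empty(len(lags))
    for k, lag in enumerate(lags):
        if lag >= 0:
            cc[k] = np.mean(x[:n - lag] * y[lag:])
        else:
            cc[k] = np.mean(x[-lag:] * y[:n + lag])
    return lags, cc

# Simulated coupling (hypothetical): y follows x with a delay of 5 samples plus noise.
rng = np.random.default_rng(2)
x = rng.standard_normal(1000)
y = np.roll(x, 5) + 0.5 * rng.standard_normal(1000)

lags, cc = cross_correlogram(x, y, max_lag=20)
peak_lag = lags[np.argmax(cc)]
print("lag with the largest correlation:", peak_lag)
```

A peak at a positive lag indicates that `y` trails `x`; the multivariate extension described in the abstract replaces the two scalar signals with learned projections of high-dimensional data.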

19.
Advances in information technology have set the pace for tremendous growth in the development of new computer-mediated channels of communication services and technologies. That these recent developments are fueled by technology might misleadingly suggest that the selection of a communication channel is largely based on technological criteria. Communication technologies require multiple users and cannot be used successfully by one person acting alone. Therefore, problems may arise when users fail to consider their self-efficacy and/or fail to consider social factors related to communication channel use. The main purpose of this study was to establish a better measure and model for predicting and explaining the usage and choice of computer-mediated communication technologies (CMCT), with electronic-mail systems as the example. The results indicated that all of the eight hypotheses showed significant correlation between criterion and predictor variates, supported by different canonical functions. The objective of the study was achieved by showing that the proposed research model can explain and predict the individual and combined effects of user self-efficacy, technological characteristics, and social-influence perspectives on CMCT usage and choice.

20.
This paper proposes a classification method that is based on easily interpretable fuzzy rules and capitalizes on two key techniques: pruning outliers from the training data with SVMs (support vector machines), which eliminates the influence of outliers on the learning process, and finding a fuzzy set with a sound linguistic interpretation to describe each class, based on AFS (axiomatic fuzzy set) theory. Compared with other fuzzy rule-based methods, the proposed models are usually more compact and more easily understandable for users, since each class is described by far fewer rules. The proposed method has two further advantages: each rule obtained from the proposed algorithm is simply a conjunction of linguistic terms, and no parameters need to be tuned. The proposed classification method is compared with previously published fuzzy rule-based classifiers by testing them on 16 UCI data sets. The results show that the fuzzy rule-based classifier presented in this paper offers a compact, understandable and accurate classification scheme that balances interpretability and accuracy.
