Similar Documents
20 similar documents found.
1.
Similarity search in high dimensional space is a nontrivial problem due to the so-called curse of dimensionality. Recent techniques such as Piecewise Aggregate Approximation (PAA), Segmented Means (SMEAN) and Mean-Standard deviation (MS) prove to be very effective in reducing data dimensionality by partitioning dimensions into subsets and extracting aggregate values from each dimension subset. These partition-based techniques have many advantages, including very efficient multi-phased approximation, while being simple to implement. They are, however, not adaptive to the different characteristics of data in diverse applications. We propose SubSpace Projection (SSP) as a unified framework for these partition-based techniques. SSP projects data onto subspaces and computes a fixed number of salient features with respect to a reference vector. A study of the relationships between query selectivity and the corresponding space partitioning schemes uncovers indicators that can be used to predict the performance of the partitioning configuration. Accordingly, we design a greedy algorithm to efficiently determine a good partitioning of the data dimensions. The results of our extensive experiments indicate that the proposed method consistently outperforms state-of-the-art techniques.
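As an illustration of the partition-based idea, here is a minimal sketch of PAA: partition the dimensions into contiguous subsets and keep one aggregate value (the mean) per subset. The function name `paa` and the use of `np.array_split` are ours for illustration, not from the paper.

```python
import numpy as np

def paa(x, n_segments):
    """Piecewise Aggregate Approximation: partition the dimensions of x
    into n_segments contiguous subsets and keep the mean of each."""
    x = np.asarray(x, dtype=float)
    # np.array_split handles lengths not evenly divisible by n_segments
    return np.array([seg.mean() for seg in np.array_split(x, n_segments)])

vec = [1.0, 3.0, 2.0, 4.0, 10.0, 12.0]
print(paa(vec, 3))  # means of [1,3], [2,4], [10,12] -> 2, 3, 11
```

SMEAN and MS differ mainly in which aggregates are kept per subset (means only, or mean plus standard deviation).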

2.
In practice, many applications require a dimensionality reduction method to deal with the partially labeled problem. In this paper, we propose a semi-supervised dimensionality reduction framework, which can efficiently handle the unlabeled data. Under the framework, several classical methods, such as principal component analysis (PCA), linear discriminant analysis (LDA), maximum margin criterion (MMC), locality preserving projections (LPP) and their corresponding kernel versions can be seen as special cases. For high-dimensional data, we can give a low-dimensional embedding result for both discriminating multi-class sub-manifolds and preserving local manifold structure. Experiments show that our algorithms can significantly improve the accuracy rates of the corresponding supervised and unsupervised approaches.
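Of the special cases listed, PCA is the simplest: project onto the top eigenvectors of the data covariance. A minimal unsupervised sketch (plain PCA only; the paper's framework additionally exploits partial labels):

```python
import numpy as np

def pca(X, k):
    """Project rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                    # center the data
    cov = np.cov(Xc, rowvar=False)             # sample covariance
    w, V = np.linalg.eigh(cov)                 # ascending eigenvalues
    V = V[:, np.argsort(w)[::-1][:k]]          # keep top-k eigenvectors
    return Xc @ V

X = np.array([[2., 0.], [0., 2.], [3., 1.], [1., 3.]])
Y = pca(X, 1)
print(Y.shape)  # (4, 1)
```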

3.
Variable selection and dimension reduction are two commonly adopted approaches for high-dimensional data analysis, but have traditionally been treated separately. Here we propose an integrated approach, called sparse gradient learning (SGL), for variable selection and dimension reduction via learning the gradients of the prediction function directly from samples. By imposing a sparsity constraint on the gradients, variable selection is achieved by selecting variables corresponding to non-zero partial derivatives, and effective dimensions are extracted based on the eigenvectors of the derived sparse empirical gradient covariance matrix. An error analysis is given for the convergence of the estimated gradients to the true ones in both the Euclidean and the manifold setting. We also develop an efficient forward-backward splitting algorithm to solve the SGL problem, making the framework practically scalable for medium or large datasets. The utility of SGL for variable selection and feature extraction is explicitly given and illustrated on artificial data as well as real-world examples. The main advantages of our method include variable selection for both linear and nonlinear predictions, effective dimension reduction with sparse loadings, and an efficient algorithm for large p, small n problems.
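Forward-backward splitting alternates a gradient step on the smooth loss with the proximal operator of the sparsity penalty. The sketch below applies the scheme to an ordinary lasso objective, not the SGL objective itself, purely to illustrate the forward (gradient) / backward (soft-threshold) structure; all names are ours.

```python
import numpy as np

def soft_threshold(v, t):
    # proximal operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def forward_backward(A, b, lam, step, n_iter=500):
    """Forward-backward splitting for min 0.5*||Ax-b||^2 + lam*||x||_1."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)                          # forward step
        x = soft_threshold(x - step * grad, step * lam)   # backward step
    return x

A = np.eye(3)
b = np.array([3.0, 0.1, -2.0])
x = forward_backward(A, b, lam=1.0, step=0.5)
print(np.round(x, 3))  # soft-thresholding of b: 2, 0, -1
```

With `A` the identity, the fixed point is exactly the soft-thresholded `b`, which makes the sparsifying effect of the backward step easy to see.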

4.
In target tracking, changes in the target's surrounding environment have a considerable impact on tracking performance. To address this, a sparse representation model based on an elastic-net structure is proposed, and an interference-resistant dynamic elastic-net tracking algorithm applying this model is designed within a particle filter framework. A method is also designed to dynamically update the parameters of the sparse representation model according to the degree of environmental change, in order to overcome the impact of disturbances such as illumination variation on tracking quality. In addition, the proposed algorithm uses an anisotropic kernel function to compute...

5.
Data Mining and Knowledge Discovery - Spectral-based subspace clustering methods have proved successful in many challenging applications such as gene sequencing, image recognition, and motion...

6.
Speaker verification has been studied widely from different points of view, including accuracy, robustness and real-time operation. Recent studies have turned toward better feature stability and robustness. In this paper we study the effect of nonlinear manifold-based dimensionality reduction on feature robustness. Manifold learning is a popular recent approach for nonlinear dimensionality reduction. Algorithms for this task are based on the idea that each data point may be described as a function of only a few parameters. Manifold learning algorithms attempt to uncover these parameters in order to find a low-dimensional representation of the data. Among manifold-based dimension reduction approaches, we applied the widely used Isometric mapping (Isomap) algorithm. Since in speaker verification the input utterance is compared with the model of the claiming client, a speaker-dependent feature transformation would be beneficial for deciding on the identity of the speaker. Our first contribution is therefore to use the Isomap dimension reduction approach in the speaker-dependent context and compare its performance with two other widely used approaches, namely principal component analysis and factor analysis. The other contribution of our work is to perform the nonlinear transformation in a speaker-dependent framework. We evaluated this approach in a GMM-based speaker verification framework using the Tfarsdat Telephone speech dataset under different noises and SNRs, and the evaluations show reliability and robustness even at low SNRs. The results also show better performance for the proposed Isomap approach compared to the other approaches.
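Isomap's three classical steps are a k-nearest-neighbor graph, shortest-path (geodesic) distances, and classical MDS. A self-contained sketch of those steps follows (ours, for illustration; production use would rely on an optimized library implementation, and Floyd-Warshall is only practical for small n):

```python
import numpy as np

def isomap(X, n_neighbors, n_components):
    """Minimal Isomap: kNN graph -> geodesics -> classical MDS."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # keep only edges to each point's n_neighbors nearest points
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    for i in range(n):
        for j in np.argsort(D[i])[1:n_neighbors + 1]:
            G[i, j] = G[j, i] = D[i, j]
    # Floyd-Warshall shortest paths estimate geodesic distances
    for k in range(n):
        G = np.minimum(G, G[:, [k]] + G[[k], :])
    # classical MDS on the squared geodesic distance matrix
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:n_components]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# six points along a gentle arc: one intrinsic dimension
t = np.linspace(0, 1, 6)
X = np.column_stack([np.cos(t), np.sin(t)])
Y = isomap(X, n_neighbors=2, n_components=1)
print(Y.shape)  # (6, 1)
```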

7.
Canonical correlation analysis (CCA) is a popular and powerful dimensionality reduction method for analyzing paired multi-view data. However, when facing the semi-paired and semi-supervised multi-view data that widely exist in real-world problems, CCA usually performs poorly, due to its requirement that data be paired between views and its unsupervised nature. Several extensions of CCA have recently been proposed; however, they either handle only the semi-paired scenario, by utilizing structure information in each view, or only the semi-supervised scenario, by incorporating discriminant information. In this paper, we present a general dimensionality reduction framework for semi-paired and semi-supervised multi-view data which naturally generalizes existing related works by using different kinds of prior information. Based on the framework, we develop a novel dimensionality reduction method, termed semi-paired and semi-supervised generalized correlation analysis (S2GCA). S2GCA exploits a small amount of paired data to perform CCA and, at the same time, utilizes both the global structural information captured from the unlabeled data and the local discriminative information captured from the limited labeled data to compensate for the limited pairedness. Consequently, S2GCA can find directions which achieve not only maximal correlation between the paired data but also maximal separability of the labeled data. Experimental results on artificial and four real-world datasets show its effectiveness compared to existing related dimensionality reduction methods.
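For reference, the fully paired, unsupervised baseline that S2GCA generalizes can be sketched in a few lines: whiten each view, then take the SVD of the whitened cross-covariance, whose singular values are the canonical correlations. This is a standard CCA recipe, not the paper's method; the small ridge term `reg` is our addition for numerical stability.

```python
import numpy as np

def cca(X, Y, k, reg=1e-8):
    """Two-view CCA via whitening + SVD of the cross-covariance."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    n = X.shape[0]
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx)).T   # whitening maps
    Wy = np.linalg.inv(np.linalg.cholesky(Cyy)).T
    U, s, Vt = np.linalg.svd(Wx.T @ Cxy @ Wy)
    return Xc @ Wx @ U[:, :k], Yc @ Wy @ Vt.T[:, :k], s[:k]

rng = np.random.default_rng(0)
z = rng.normal(size=(100, 1))                       # shared latent signal
X = np.hstack([z, rng.normal(size=(100, 1))])
Y = np.hstack([z + 0.01 * rng.normal(size=(100, 1)),
               rng.normal(size=(100, 1))])
Zx, Zy, corr = cca(X, Y, k=1)
print(corr[0] > 0.9)  # the shared latent direction is highly correlated
```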

8.
To address the problems that sparse coding models ignore group effects when selecting dictionary bases and that Euclidean distance cannot effectively measure the distance between a feature and a dictionary basis, a non-negative local sparse coding method based on the elastic net and histogram intersection (EH-NLSC) is proposed. First, an elastic-net model is introduced into the optimization function, removing the limit on the number of selected dictionary bases, so that multiple groups of correlated features can be selected while redundant features are excluded, which improves the discriminability and effectiveness of the coding. Then, histogram intersection is introduced into the locality constraint to redefine the distance between features and dictionary bases, ensuring that similar features can share their local bases. Finally, a multi-class linear support vector machine is used for classification. Experimental results on four public datasets show that, compared with locality-constrained linear coding (LLC) and sparse coding based on the non-negative elastic net (NENSC), EH-NLSC improves classification accuracy by an average of 10 and 9 percentage points respectively, demonstrating its effectiveness in image representation and classification.
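The histogram-intersection measure that replaces Euclidean distance here is simply the sum of bin-wise minima (larger means more similar). A minimal sketch, with names of our own choosing:

```python
def histogram_intersection(h1, h2):
    """Histogram-intersection similarity: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))

h1 = [0.2, 0.5, 0.3]
h2 = [0.4, 0.4, 0.2]
print(round(histogram_intersection(h1, h2), 3))  # 0.8
```

For normalized histograms the value lies in [0, 1], reaching 1 only when the histograms coincide.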

9.
Wang, Jim Jing-Yan; Cui, Xuefeng; Yu, Ge; Guo, Lili; Gao, Xin. Neural Computing & Applications, 2019, 31(3): 701-710.
Neural Computing and Applications - Sparse coding, which represents a data point as a sparse reconstruction code with regard to a dictionary, has been a popular data representation method....

10.
Pansharpening with sparse representation (SR) and injection of details (ID) can both produce visually and quantitatively pleasing images. The former constructs the pansharpened image by combining a dictionary with estimated sparse coefficients, while the latter sharpens the multispectral bands by adding the proper spatial details from the panchromatic (Pan) image. The combination of these two methods has been put forward as the pansharpening method based on sparse representation of injected details (SR-D). Although SR-D has achieved better results, both visually and quantitatively, than many state-of-the-art methods, it ignores the intrinsic geometric structure connecting the multispectral image (MS) and the corresponding high-resolution MS image. In this paper, we propose a new pansharpening method, called manifold regularized sparse representation of injected details (MR-SR-D), by introducing a manifold regularization (MR) term into the former SR-D model. The manifold regularization uses a graph Laplacian to incorporate the locally geometrical structure of the multispectral data. Experimental results on the IKONOS, QuickBird and WorldView-2 data sets show that the proposed method achieves remarkable spectral and spatial quality at both reduced scale and full scale.
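The graph Laplacian at the core of such a manifold regularizer is L = D - W, where W is an affinity matrix over the data points and D its degree matrix. A small sketch with a Gaussian affinity (our choice of kernel and bandwidth, for illustration only):

```python
import numpy as np

def graph_laplacian(X, sigma=1.0):
    """Unnormalized graph Laplacian L = D - W with Gaussian affinities
    W_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)                  # no self-loops
    return np.diag(W.sum(axis=1)) - W

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
L = graph_laplacian(X)
print(np.allclose(L.sum(axis=1), 0.0))  # rows of a Laplacian sum to zero
```

A regularization term of the form trace(AᵀLA) then penalizes solutions that vary sharply between nearby data points.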

11.
The paper derives a framework suitable to discuss the classical Koopmans-Levin (KL) and maximum likelihood (ML) algorithms to estimate parameters of errors-in-variables linear models in a unified way. Using the capability of the unified approach a new parameter estimation algorithm is presented offering flexibility to ensure acceptable variance in the estimated parameters. The developed algorithm is based on the application of Hankel matrices of variable size and can equally be considered as a generalized version of the KL method (GKL) or as a reduced version of the ML estimation. The methodology applied to derive the GKL algorithm is used to present a straightforward derivation of the subspace identification algorithm.
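The Hankel matrices the GKL algorithm varies in size are matrices with constant anti-diagonals built from a data sequence. A minimal construction sketch (function name ours):

```python
import numpy as np

def hankel(seq, n_rows):
    """Hankel matrix of a sequence: row i holds seq[i : i + n_cols],
    so every anti-diagonal is constant."""
    n_cols = len(seq) - n_rows + 1
    return np.array([seq[i:i + n_cols] for i in range(n_rows)])

H = hankel([1, 2, 3, 4, 5], n_rows=3)
print(H)
# [[1 2 3]
#  [2 3 4]
#  [3 4 5]]
```

Choosing `n_rows` trades off the number of rows against the number of columns of data per row, which is the flexibility the GKL method exploits.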

12.
In the era of Big Data, a practical yet challenging task is to make learning techniques more universally applicable to complex learning problems such as multi-source multi-label learning. While earlier work has developed many effective solutions for multi-label classification and multi-source fusion separately, in this paper we treat the two problems together and propose a novel method for the joint learning of multiple class labels and data sources, in which an optimization framework is constructed to formulate the learning problem, and the multi-label classification result is induced by a weighted combination of the decisions from multiple sources. The proposed method is effective in exploiting label correlations and fusing multi-source data, especially in the fusion of long-tail data. Experiments on various multi-source multi-label data sets reveal the advantages of the proposed method.

13.
A unified framework for the construction of various synchronous and asynchronous parallel matrix multisplitting iterative methods, suitable for SIMD and MIMD multiprocessor systems respectively, is presented, and its convergence theory is established under rather weak conditions. These afford general method models and systematic convergence criteria for studying parallel iterations in the sense of matrix multisplitting. In addition, it is shown in detail how the known parallel matrix multisplitting iterative methods can be classified into this new framework, and what novel ones can be generated by it.
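The single-splitting special case underlying these methods writes A = M - N and iterates x ← M⁻¹(Nx + b); multisplitting methods run several such splittings in parallel and combine the iterates with weights. A sketch of the single-splitting case with the Jacobi choice M = diag(A) (our illustrative instance, not one of the paper's parallel schemes):

```python
import numpy as np

def splitting_iteration(A, b, n_iter=100):
    """Iterate x <- M^{-1}(N x + b) for the splitting A = M - N,
    with M = diag(A) (the Jacobi splitting)."""
    M = np.diag(np.diag(A))
    N = M - A
    Minv = np.linalg.inv(M)
    x = np.zeros_like(b, dtype=float)
    for _ in range(n_iter):
        x = Minv @ (N @ x + b)
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])   # diagonally dominant -> converges
b = np.array([1.0, 2.0])
x = splitting_iteration(A, b)
print(np.allclose(A @ x, b))  # True
```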

14.
The Constraint Satisfaction Problem (CSP) involves finding values for variables that satisfy a set of constraints. Consistency checking is the key technique in solving this class of problems. Past research has developed many algorithms for this purpose, e.g., node consistency, arc consistency, generalized node and arc consistency, specific methods for checking specific constraints, etc. In this article, an attempt is made to unify these algorithms into a common framework. This framework consists of two parts. The first part is a generic consistency check algorithm, which allows and encourages each individual constraint to be checked by its own specific consistency methods. Such an approach provides a direct way of practically implementing the CSP model for real problem-solving. The second part is a general schema for describing the handling of each type of constraint. The schema characterizes various issues of constraint handling in constraint satisfaction, and provides a common language for expressing, discussing, and exchanging constraint handling techniques. © 1995 John Wiley & Sons, Inc.  相似文献
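Arc consistency, one of the algorithms being unified, prunes from each variable's domain every value with no support in a neighbor's domain. A compact AC-3-style sketch in which each binary constraint supplies its own checking predicate, in the spirit of the article's plug-in framework (all names and the example constraint are ours):

```python
from collections import deque

def ac3(domains, constraints):
    """AC-3 arc consistency.  constraints[(x, y)] is a predicate on a
    pair of values; prune values of x with no supporting value in y."""
    queue = deque(constraints.keys())
    while queue:
        x, y = queue.popleft()
        pred = constraints[(x, y)]
        pruned = {v for v in domains[x]
                  if not any(pred(v, w) for w in domains[y])}
        if pruned:
            domains[x] -= pruned
            # revisit every arc pointing at the shrunken variable
            queue.extend(arc for arc in constraints if arc[1] == x)
    return domains

# example: x < y over small integer domains
domains = {"x": {1, 2, 3}, "y": {1, 2, 3}}
constraints = {("x", "y"): lambda a, b: a < b,
               ("y", "x"): lambda a, b: b < a}
ac3(domains, constraints)
print(domains)  # x loses 3 (no larger y), y loses 1 (no smaller x)
```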

15.
Knowledge patterns, such as association rules, clusters or decision trees, can be defined as concise and relevant information that can be extracted, stored, analyzed, and manipulated by knowledge workers in order to drive and specialize business decision processes. In this paper we deal with data mining patterns. The ability to manipulate different types of patterns under a unified environment is becoming a fundamental issue for any ‘intelligent’ and data-intensive application. However, approaches proposed so far for pattern management usually deal with specific and predefined types of patterns, and mainly concern pattern extraction and exchange issues. Issues concerning the integrated, advanced management of heterogeneous patterns are in general not taken into account, or only marginally.

16.
In this paper, a unified framework for multimodal content retrieval is presented. The proposed framework supports retrieval of rich media objects as unified sets of different modalities (image, audio, 3D, video and text) by efficiently combining all monomodal heterogeneous similarities into a global one according to an automatic weighting scheme. A multimodal space is then constructed to capture the semantic correlations among the multiple modalities. In contrast to existing techniques, the proposed method is also able to handle external multimodal queries, by embedding them into the already constructed multimodal space following a space mapping procedure based on submanifold analysis. In our experiments with five real multimodal datasets, we show the superiority of the proposed approach over competitive methods.
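The combination step can be pictured as a weighted average of the per-modality similarity scores. In the paper the weights come from an automatic weighting scheme; in this sketch they are plain inputs (uniform if omitted), and all names are ours.

```python
def fuse_similarities(similarities, weights=None):
    """Combine per-modality similarity scores into one global score
    by a weighted average; weights must sum to 1."""
    if weights is None:
        weights = [1.0 / len(similarities)] * len(similarities)
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * s for w, s in zip(weights, similarities))

# image, audio and text similarities for one candidate object
score = fuse_similarities([0.8, 0.5, 0.9], weights=[0.5, 0.2, 0.3])
print(round(score, 3))  # 0.77
```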

17.
We propose a general framework for structure identification, as defined by Dechter and Pearl. It is based on the notion of prime implicate, and handles Horn, bijunctive and affine, as well as Horn-renamable formulas, for which, to our knowledge, no polynomial algorithm has been proposed before. This framework, although quite general, gives good complexity results, and in particular we get for Horn formulas the same running time and better output size than the algorithms previously known.

18.
19.
Sparse representations, motivated by strong evidence of sparsity in the primate visual cortex, are gaining popularity in the computer vision and pattern recognition fields, yet sparse methods have not gained widespread acceptance in the facial understanding communities. A main criticism brought forward by recent publications is that sparse reconstruction models work well with controlled datasets, but exhibit coefficient contamination in natural datasets. To better handle facial understanding problems, specifically the broad category of facial classification problems, an improved sparse paradigm is introduced in this paper. Our paradigm combines manifold learning for dimensionality reduction, based on a newly introduced variant of semi-supervised Locality Preserving Projections, with an ℓ1 reconstruction error, and a regional based statistical inference model. We demonstrate state-of-the-art classification accuracy for the facial understanding problems of expression, gender, race, glasses, and facial hair classification. Our method minimizes coefficient contamination and offers a unique advantage over other facial classification methods when dealing with occlusions. Experimental results are presented on multi-class as well as binary facial classification problems using the Labeled Faces in the Wild, Cohn–Kanade, Extended Cohn–Kanade, and GEMEP-FERA datasets demonstrating how and under what conditions sparse representations can further the field of facial understanding.

20.
Graph-based induction as a unified learning framework
We describe a graph-based induction algorithm that extracts typical patterns from colored digraphs. The method is shown to be capable of solving a variety of learning problems by mapping the different learning problems into colored digraphs. The generality and scope of this method can be attributed to the expressiveness of the colored digraph representation, which allows a number of different learning problems to be solved by a single algorithm. We demonstrate the application of our method to two seemingly different learning tasks: inductive learning of classification rules, and learning macro rules for speeding up inference. We also show that the uniform treatment of these two learning tasks enables our method to solve complex learning problems such as the construction of hierarchical knowledge bases.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号