Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
The performance of clustering in document space can be degraded by the high dimensionality of the vectors, because high-dimensional vectors contain a great deal of redundant information, which may make the similarity between vectors inaccurate. Hence, it is highly desirable to derive a low-dimensional subspace that contains less redundant information, so that document vectors can be grouped more reasonably. In general, learning a subspace and clustering vectors are treated as two independent steps; in this case, we cannot estimate whether the subspace is appropriate for the clustering method or vice versa. To overcome this drawback, this paper combines subspace learning and clustering into an iterative procedure named adaptive subspace learning (ASL). First, in the subspace learning step, the intracluster similarity and the intercluster separability of vectors are increased via the initial cluster indicators; then affinity propagation is adopted to partition the vectors into a specific number of clusters, so as to update the cluster indicators and repeat subspace learning. In ASL, the iterative optimization makes the obtained subspace increasingly suitable for clustering. The proposed method is evaluated on the NG20, Classic3 and K1b datasets, and the results are shown to be superior to those of conventional document clustering methods.

2.
We have developed an informative sample subspace (ISS) method that is suitable for projecting high-dimensional data onto a low-dimensional subspace for classification purposes. In this paper, we present an ISS algorithm that uses a maximal mutual information criterion to search a labelled training data set directly for the subspace's projection base vectors. We evaluate the usefulness of the ISS method using synthetic data as well as real-world problems. Experimental results demonstrate that the ISS algorithm is effective and can be used as a general method for representing high-dimensional data in a low-dimensional subspace for classification.

3.
The purpose of this article is to show the effectiveness of a positive linear decomposition in the derivation of robust features of high-dimensional dynamic measurements, in order to achieve effective pattern recognition and classification. The method begins with the singular value decomposition, projecting a matrix of dynamic process measurements (taken at uniform intervals over some time window) onto a low-dimensional subspace. A convex cone, defined by the non-negativity of measurements, is then created. For normalization purposes, a polygon whose corners specify the feature vectors of the data is formed by intersecting the cone with a plane. This polygon is reduced to a triangle with only the three most representative corners. The net effect of these steps is that the original orthogonal basis of the subspace (consisting of the first three principal components) is replaced by a new, non-orthogonal basis, which offers the advantage of containing only positive measurements and requiring only positive superposition of basis vectors to span the physically meaningful portion of the subspace. One of the vectors in this basis is selected as the feature vector for pattern recognition; a spanning tree created from the feature vectors classifies the patterns. The feature vectors from the new basis are much more robust with respect to changes in the width of the time window, and classification was possible even with feature vectors of differing time windows.

4.
In this article, a modified complex-valued FastICA algorithm is used to extract the specific feature of the Gaussian noise component from mixtures, so that the estimated component is as independent as possible of the other non-Gaussian signal components. Once the noise basis vector is obtained, we can estimate the direction of arrival by searching the array manifold for direction vectors that are as orthogonal as possible to the estimated noise basis vector, especially for highly correlated signals with closely spaced directions. Simulation results show that the proposed method achieves superior resolution compared with the conventional multiple signal classification (MUSIC) method, the spatial smoothing MUSIC method, and the signal subspace scaled MUSIC method.

5.
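Zhang Yang (张阳), Wang Xiaoning (王小宁). Journal of Computer Applications (《计算机应用》), 2021, 41(11): 3151-3155
Text features are a key component of natural language processing. To address the high dimensionality and sparsity of current text features, a text feature selection method based on Word2Vec word embedding and a genetic algorithm for high-dimensional biological gene selection (GARBO) is proposed, to facilitate subsequent text classification tasks. First, the data input form is optimized: the Word2Vec word embedding method converts text into word vectors that resemble gene representations. Then, the high-dimensional word vectors are iteratively evolved in a way that mimics gene expression. Finally, a random forest classifier classifies the text after feature selection. Experiments on a Chinese review dataset demonstrate the effectiveness of the optimized GARBO method for text feature selection: the method reduces 300-dimensional features to 50 more valuable dimensions and reaches a classification accuracy of 88%. Compared with other filter-based text feature selection methods, it effectively reduces the text feature dimension and improves text classification.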

6.
As an emerging biometric for human identification, iris recognition has received increasing attention in recent years. This paper attempts to capture shape information of the iris by analyzing local intensity variations of an iris image. In our framework, a set of one-dimensional (1D) intensity signals is constructed to contain the most important local variations of the original 2D iris image. Gaussian-Hermite moments of such intensity signals reflect to a large extent their various spatial modes and are used as distinguishing features. The resulting high-dimensional feature vector is mapped into a low-dimensional subspace using the Fisher linear discriminant, and then a nearest center classifier based on the cosine similarity measure is adopted for classification. Extensive experimental results show that the proposed method is effective and encouraging.
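
The final classification step of the abstract above (nearest center classifier with cosine similarity) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy two-dimensional vectors, and the use of the per-class mean as the "center" are assumptions, and the Gaussian-Hermite feature extraction and Fisher projection are omitted.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def class_centers(samples):
    """samples maps label -> list of vectors; the center is the per-class mean (assumed)."""
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in samples.items()
    }

def classify(x, centers):
    """Return the label whose center has the highest cosine similarity to x."""
    return max(centers, key=lambda label: cosine(x, centers[label]))

# Hypothetical low-dimensional features for two classes.
train = {"A": [[1.0, 0.1], [0.9, 0.2]], "B": [[0.1, 1.0], [0.2, 0.8]]}
centers = class_centers(train)
print(classify([2.0, 0.5], centers))
print(classify([0.3, 3.0], centers))
```

Because cosine similarity ignores vector length, the query `[2.0, 0.5]` is matched by direction only, which is why scaling a feature vector does not change its predicted class.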

7.
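In entity retrieval, dense vector retrieval models can be used to efficiently select query-relevant candidate entities from a large-scale entity base. However, in existing dense vector retrieval models, the high dimensionality of the entity vectors leads to low real-time computational efficiency and large storage requirements. Experiments in this paper show that these entity vectors contain a large amount of redundant information: on the one hand, the vast majority of entity vectors lie in mutually distinct orthants; on the other hand, semantically similar entities lie in orthants that are closer to each other. A binarized entity retrieval method is therefore proposed to compress entity vectors and accelerate similarity computation. Specifically, the method uses the sign function to binarize and compress high-dimensional dense floating-point vectors, and speeds up retrieval with the Hamming distance. The reasons why the method preserves retrieval performance are analyzed theoretically. Qualitative and quantitative experiments verify the theory, and a method for improving binary retrieval performance based on random dimension-raising rotation is presented.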
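
The two operations named in the abstract above, sign binarization and Hamming-distance search, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and toy vectors are hypothetical, zero is mapped to bit 1 by convention, and the random dimension-raising rotation is not shown.

```python
def binarize(vec):
    """Pack the sign pattern of a float vector into one integer bitmask.

    Assumption: a component >= 0 maps to bit 1, a negative component to bit 0.
    """
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x >= 0 else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two bitmasks."""
    return bin(a ^ b).count("1")

def search(query, codes, k=1):
    """Return the indices of the k codes closest to the query in Hamming distance."""
    q = binarize(query)
    ranked = sorted(range(len(codes)), key=lambda i: hamming(q, codes[i]))
    return ranked[:k]

# Hypothetical 4-dimensional entity vectors.
entities = [
    [0.9, -0.2, 0.4, -0.7],
    [0.8, -0.1, 0.3, 0.6],
    [-0.5, 0.7, -0.3, 0.2],
]
codes = [binarize(e) for e in entities]
print(search([1.0, -0.3, 0.2, -0.1], codes, k=2))
```

Each vector shrinks from one float per dimension to one bit per dimension, and the distance computation reduces to an XOR followed by a bit count.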

8.
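Nonlinear Local-Optimization Time-Warping Correction and a Study of Signature Feature-Space Stability   Total citations: 7, self-citations: 1, citations by others: 7
Signature verification based on the dynamic information of a signature can improve the security of a verification system. It is a classification problem in the feature space spanned by the feature values of the signature's dynamic information. However, time warping in the time series of this dynamic information scatters the feature values, making it hard to identify, in the feature space, a subspace in which the feature values of genuine signatures are stable, especially when the number of signature samples is small. A nonlinear local-optimization time-warping correction method is therefore proposed, which offers good correction performance and low computational complexity. Applying it to the dynamic-information time series of signature samples tightens the clustering of signature feature vectors in the feature space and increases the distance between the feature vectors of genuine and forged signatures. By jointly applying the nonlinear local-optimization and linear time-warping correction methods to a limited number of reference signature samples, feature-stable subspaces with different confidence levels can be delineated in the feature space, meeting the needs of verification at different security levels.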

9.
This paper exploits the fact that any row vector of the observability matrix, when used to transform the state, converts it into a new state component that is some derivative of an output component. Using the same, appropriately chosen vectors to transform a system whose observation is not fully corrupted by white noise, we can accurately determine some state components. These vectors form a basis for the l-dimensional subspace of transformation vectors to the new, accurately determinable state components. Using this basis, a state transformation is constructed that converts, in one step, the singular linear filtering problem into a nonsingular one with the state dimension reduced by l.
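
The opening fact of the abstract above can be illustrated for the simplest case. The sketch assumes a noise-free, input-free linear system x' = Ax with scalar output y = cx, for which the k-th row of the observability matrix, cA^k, maps the state to the k-th derivative of the output. The matrices, the state, and the function names are hypothetical, and the filtering construction itself is not shown.

```python
def vec_mat(row, mat):
    """Row vector times matrix."""
    return [
        sum(row[i] * mat[i][j] for i in range(len(row)))
        for j in range(len(mat[0]))
    ]

def observability_rows(A, c):
    """Rows c, cA, cA^2, ..., cA^(n-1) of the observability matrix."""
    rows, row = [], list(c)
    for _ in range(len(A)):
        rows.append(row)
        row = vec_mat(row, A)
    return rows

# Hypothetical second-order system: x1' = x2, x2' = -2*x1 - 3*x2, y = x1.
A = [[0.0, 1.0], [-2.0, -3.0]]
c = [1.0, 0.0]
x = [2.0, -1.0]

rows = observability_rows(A, c)
# Applying row k to the state gives the k-th derivative of y at that state.
derivs = [sum(r_i * x_i for r_i, x_i in zip(r, x)) for r in rows]
print(rows)
print(derivs)
```

Here the second row picks out x2, which is exactly y' for this system, so a state component is recovered as an output derivative.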

10.
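Dong Minggang (董明刚), Zeng Huibin (曾慧斌), Jing Chao (敬超). Control and Decision (《控制与决策》), 2021, 36(8): 1804-1814
Existing decomposition methods are improved, and a weak-association-based adaptive many-objective evolutionary algorithm (WAEA) is proposed. First, an association strategy based on angle subspaces is proposed, so that one solution can be associated with multiple reference vectors. Second, the concept of weak association is introduced and a bimodal scalarizing function is designed on this basis, enabling the algorithm to better handle problems with complex Pareto fronts (PFs). In addition, the algorithm adaptively adjusts the penalty parameter by detecting the number of solutions within each reference vector's subspace, so that it can effectively handle various multi-objective problems. Finally, WAEA is compared with eight representative many-objective algorithms; the experimental results show that WAEA better balances the convergence and diversity of Pareto-optimal solutions on many-objective problems with complex Pareto fronts.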

11.
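It is pointed out that in two-dimensional principal component analysis (2DPCA), any two components of a feature vector are correlated, and a mathematical expression for this correlation is given. A 2DPCA that minimizes this correlation is then proposed. The method modifies the objective function of 2DPCA so as to maximize the total scatter among feature vectors while minimizing the correlation among the components of each feature vector. Experimental results on the Yale standard face database show that the proposed method has strong feature extraction capability and outperforms 2DPCA and diagonal 2DPCA in recognition performance.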

12.
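To address the strict linear-model assumption that common spatial patterns (CSP) place on the relationship between source signals and recorded EEG signals, and to fully exploit the ability of tensors to process multiple dimensions simultaneously, a kernel tensor subspace decomposition (KTSD) method for EEG feature extraction is studied. First, a tensor of the EEG data is generated, and the tensor decomposition problem is solved as a least-squares problem with a quadratic equality constraint. The tensor is then extended to a subspace to reduce the computational burden. Finally, the method is generalized to kernel space, projecting the data into a high-dimensional feature space to enhance discriminative power. The experiments use dataset III_3a of BCI Competition III (2005). The results show that the KTSD method can extract the corresponding features from EEG data of multi-class motor imagery tasks, with good classification results and running efficiency.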

13.
A method of document clustering based on locality preserving indexing (LPI) and support vector machines (SVM) is presented. The document space is generally of high dimensionality, and clustering in such a high-dimensional space is often infeasible due to the curse of dimensionality. In this paper, by using LPI, the documents are projected into a lower-dimensional semantic space in which documents related to the same semantics are close to each other. Then, by using SVM, the vectors in the semantic space are mapped by means of a Gaussian kernel to a high-dimensional feature space in which the minimal enclosing sphere is searched. The sphere, when mapped back to the semantic space, can separate into several independent components defined by the support vectors, each enclosing a separate cluster of documents. By combining LPI and SVM, not only higher clustering accuracy, achieved in an effective and more unsupervised way, but also better generalization properties can be obtained. Extensive demonstrations are performed on the Reuters-21578 and TDT2 data sets. This work was supported by National Science Foundation of China under Grant 60471055, Specialized Research Fund for the Doctoral Program of Higher Education under Grant 20040614017.

14.
Today’s ever-increasing use of high-dimensional data sets makes it necessary to find ways to fully comprehend them. One such way is visualization. However, visualizing data sets with more than three dimensions in a comprehensible way has always been a serious challenge for researchers in this field. Several visualization methods are already available, such as parallel coordinates, scatter plot matrices, RadViz, bubble charts, heatmaps, Sammon mapping and self-organizing maps. In this paper, an axis-based method (called the Nasseh method) is introduced in which familiar elements of the visualization of 1-, 2- and 3-dimensional data sets are used to visualize higher-dimensional data sets, so that it is easier to explore the data sets in the corresponding dimensions. The Nasseh method can be used in many applications, from illustrating points in high-dimensional geometry to visualizing estimated Pareto fronts for many-objective optimization problems.

15.
Efficient and compact representation of images is a fundamental problem in computer vision. In this paper, we propose methods that use Haar-like binary box functions to represent a single image or a set of images. A desirable property of these box functions is that their inner product with an image can be computed very efficiently. We propose two closely related novel subspace methods to model images: the non-orthogonal binary subspace (NBS) method and the binary principal component analysis (B-PCA) algorithm. NBS is spanned directly by binary box functions and can be used for image representation, fast template matching and many other vision applications. B-PCA is a structured subspace that inherits the merits of both NBS (fast computation) and PCA (modeling data structure information). B-PCA base vectors are obtained by a novel PCA-guided NBS method. We also show that B-PCA base vectors are nearly orthogonal to each other. As a result, in the non-orthogonal vector decomposition process, the computationally intensive pseudo-inverse projection operator can be approximated by the direct dot product without causing significant distance distortion. Experiments on real image datasets show promising performance in image matching, reconstruction and recognition tasks with significant speed improvement.
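
The abstract above states that the inner product of a box function with an image is cheap to compute but does not say how. A common technique for Haar-like features is the integral image (summed-area table), sketched below as an assumption rather than as the authors' method: after one pass over the image, the sum over any axis-aligned box, which equals the inner product with a binary box function, takes four lookups. The function names and the toy image are hypothetical.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum of img[r][c] for top <= r < bottom and left <= c < right."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

img = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(img)
print(box_sum(ii, 1, 1, 3, 3))  # lower-right 2x2 box: 5 + 6 + 8 + 9
```

The cost of `box_sum` does not depend on the box size, which is what makes representations built from many box functions fast to evaluate.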

16.
Subspace clustering finds sets of objects that are homogeneous in subspaces of high-dimensional datasets, and has been successfully applied in many domains. In recent years, a new breed of subspace clustering algorithms, which we denote as enhanced subspace clustering algorithms, has been proposed to (1) handle the increasing abundance and complexity of data and (2) improve the clustering results. In this survey, we present these enhanced approaches to subspace clustering by discussing the problems they solve, their cluster definitions and their algorithms. Besides enhanced subspace clustering, we also present basic subspace clustering and related work in high-dimensional clustering.

17.
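Bai Lin (白琳), Chen Hao (陈豪). Computer Science (《计算机科学》), 2010, 37(11): 103-106
For the underdetermined blind separation of independent sources, an underdetermined blind source separation method based on pseudo-extraction vectors is proposed, supported by theoretical analysis. The method determines which source signal is dominant at each sampling point and then selects the corresponding pseudo-extraction vector at that sampling point of the observed signals to recover the value of the dominant source signal there, thereby achieving underdetermined blind source separation. The algorithm is compared in simulation with the classical linear-programming-based underdetermined blind source separation method. The results show that, because the method requires no optimization at each sampling point, it greatly increases separation speed: it is tens of times faster than the linear-programming-based method.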

18.
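A Fisher Linear Discriminant Analysis Method Based on Vector Groups   Total citations: 1, self-citations: 0, citations by others: 1
A Fisher linear discriminant analysis method based on vector groups is proposed. The method first partitions the original high-dimensional vector into a group of low-dimensional sub-vectors and then applies Fisher linear discriminant analysis to the vector group. This approach not only solves the small-sample-size problem in arbitrarily high dimensions but also, with an appropriately chosen sub-vector dimension, extracts the most effective features from the vectors. Moreover, vector-group-based Fisher linear discriminant analysis is a further generalization of Fisher linear discriminant analysis and two-dimensional Fisher linear discriminant analysis.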

19.
Kernel matched subspace detectors for hyperspectral target detection   Total citations: 1, self-citations: 0, citations by others: 1
In this paper, we present a kernel realization of a matched subspace detector (MSD) that is based on a subspace mixture model defined in a high-dimensional feature space associated with a kernel function. The linear subspace mixture model for the MSD is first reformulated in a high-dimensional feature space, and then the corresponding expression for the generalized likelihood ratio test (GLRT) is obtained for this model. The subspace mixture model in the feature space and its corresponding GLRT expression are equivalent to a nonlinear subspace mixture model with a corresponding nonlinear GLRT expression in the original input space. To address the intractability of the GLRT in the feature space, we kernelize the GLRT expression using kernel eigenvector representations as well as the kernel trick, in which dot products in the feature space are implicitly computed by kernels. The proposed kernel-based nonlinear detector, the so-called kernel matched subspace detector (KMSD), is applied to several hyperspectral images to detect targets of interest. KMSD showed superior detection performance over the conventional MSD when tested on several synthetic data sets and real hyperspectral imagery.
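
The kernel trick mentioned in the abstract above, computing feature-space dot products without forming the feature vectors, can be checked on a small case. The sketch uses the homogeneous degree-2 polynomial kernel on 2D inputs purely as an illustration; the abstract does not state which kernel the detector uses, and all names here are hypothetical.

```python
import math

def dot(u, v):
    """Ordinary dot product."""
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit feature map whose dot product equals (x . y)^2 for 2D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(x, y):
    """Degree-2 polynomial kernel: evaluates the feature-space dot product implicitly."""
    return dot(x, y) ** 2

x, y = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(y))   # map to feature space, then take the dot product
implicit = poly_kernel(x, y)     # same quantity computed in the input space
print(explicit, implicit)
```

For kernels such as the Gaussian kernel the feature space is infinite-dimensional, so only the implicit route is available, which is why an expression must be rewritten entirely in terms of dot products before it can be kernelized.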

20.
In the mid-nineteenth century, Donders proposed that for every human head rotating away from the primary pointing direction, the rotational vectors in the direction of the corresponding axes of rotation are restricted to lie on a surface. Donders' intuition was that under such a restriction, the head orientation would be a function of its pointing direction. In this paper, we revisit Donders' Law and show that the proposed intuition is indeed true for a restricted class of head orientations satisfying a class of quadratic Donders' surfaces, if the head points to a suitable neighborhood of the frontal pointing direction. Moreover, on a suitably chosen subspace of the 3D rotation group ${\rm SO}(3)$, we describe a head movement dynamical system whose input control signals are the three external torques on the head provided by muscles. Three output signals are also suitably chosen as follows. Two of the output signals are coordinates of the frontal pointing direction. The third signal measures the deviation of the state vector from the Donders' surface. We claim that the square system is locally feedback linearizable on the chosen subspace, and that the linear dynamics decomposes into parts transverse and tangential to the Donders' surface. We demonstrate our approach by synthesizing a tracking and path-following controller. Additionally, for different choices of the Donders' surface parameters, head gaits are visualized by simulating different movement patterns of the head-top vector as the head-pointing vector rotates around a circle.
