Similar documents
Found 20 similar documents; search took 15 ms.
1.
Graph-based pattern representation offers a versatile alternative to vectorial data structures, and a growing interest in graphs can therefore be observed in various fields. However, a serious limitation in the use of graphs is the lack of elementary mathematical operations in the graph domain, which many pattern recognition algorithms require. To overcome this limitation, the present paper proposes an embedding of a given graph population in a vector space R^n. The key idea of this embedding approach is to interpret the distances of a graph g to a number of prototype graphs as numerical features of g. In previous works, the prototypes were selected beforehand with heuristic selection algorithms. In the present paper we take a more fundamental approach and regard the problem of prototype selection as a feature selection or dimensionality reduction problem, for which many methods are available. With several experiments we show the feasibility of graph embedding based on prototypes obtained from such feature selection algorithms and demonstrate their potential to outperform previous approaches.
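
The embedding idea in the abstract above (the distances of a graph g to a set of prototype graphs become the feature vector of g) can be sketched as follows. This is a minimal illustration, not the paper's method: the graph representation, the helper names, and in particular `toy_distance` (an edge-set symmetric difference standing in for a real graph dissimilarity such as graph edit distance) are assumptions made for the example.

```python
from typing import FrozenSet, List, Sequence, Tuple

Edge = Tuple[str, str]
Graph = FrozenSet[Edge]  # assumption: uniquely labelled nodes, undirected edges


def make_graph(edges: Sequence[Edge]) -> Graph:
    """Normalize undirected edges so that (a, b) and (b, a) are the same edge."""
    return frozenset(tuple(sorted(e)) for e in edges)


def toy_distance(g: Graph, h: Graph) -> float:
    """Hypothetical stand-in dissimilarity: edges present in exactly one graph."""
    return float(len(g ^ h))


def embed(g: Graph, prototypes: Sequence[Graph]) -> List[float]:
    """Map g to the vector of its distances to each prototype graph."""
    return [toy_distance(g, p) for p in prototypes]


triangle = make_graph([("a", "b"), ("b", "c"), ("a", "c")])
path = make_graph([("a", "b"), ("b", "c")])
star = make_graph([("a", "b"), ("a", "c"), ("a", "d")])

# With n prototypes, every graph becomes a point in R^n.
prototypes = [triangle, star]
vector = embed(path, prototypes)
print(vector)
```

Once graphs are embedded this way, choosing which prototypes to keep amounts to choosing which coordinates of the vector to keep, which is why the abstract can treat prototype selection as a feature selection problem.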

2.
3.
4.
Fusion of information from graph features and content can provide superior inference for an anomaly detection task, compared to the corresponding content-only or graph feature-only statistics. In this paper, we design and execute an experiment on a time series of attributed graphs extracted from the Enron email corpus which demonstrates the benefit of fusion. The experiment is based on injecting a controlled anomaly into the real data and measuring its detectability.

5.
6.
In this paper we present a study of structural features of handwriting extracted from three characters, “d”, “y”, and “f”, and the grapheme “th”. The features are based on the standard features used by forensic document examiners. The process of feature extraction is presented along with the results. The usefulness of the features was analyzed by searching for optimal feature sets with the wrapper method: a neural network was used as the classifier and a genetic algorithm was used to search for optimal feature sets. It is shown that most of the structural micro features studied do possess discriminative power, which justifies their use in forensic analysis of handwriting. The results also show that the grapheme possessed significantly higher discriminating power than any of the three single characters studied, which supports the opinion that a character form is affected by its adjacent characters.

7.
This paper proposes a new discriminant analysis with orthonormal coordinate axes of the feature space. In general, the number of coordinate axes of the feature space in traditional discriminant analysis depends on the number of pattern classes, so the discriminatory capability of the feature space is considerably limited. The new discriminant analysis solves this problem completely. In addition, it is more powerful than the traditional one insofar as the discriminatory power and the mean error probability for coordinate axes are concerned. This is also shown by a numerical example.

8.
Identifying relevant genes from microarray data is needed in many applications. For such identification, different ranking techniques with different evaluation criteria are used, and they usually assign different ranks to the same gene. As a result, different techniques identify different gene subsets, which may not be the set of significant genes. To overcome such problems, this study suggests pipelining the ranking techniques. In each stage of the pipeline, a few of the lowest-ranked features are eliminated, and at the end a relatively good subset of features is preserved. However, the order in which the ranking techniques are used in the pipeline is important to ensure that the significant genes are preserved in the final subset. For this experimental study, twenty-four unique pipeline models are generated from four gene ranking strategies. These pipelines are tested on seven different microarray databases to find the pipeline best suited to the task. The gene subsets obtained are then tested with four classifiers, and four performance metrics are evaluated. No single pipeline dominates the others in performance; therefore a grading system is applied to the results of these pipelines to find a consistent model. The grading system's finding that one pipeline model is significant is also confirmed by the Nemenyi post-hoc hypothesis test. The performance of this pipeline model is compared with the four ranking techniques: although it is not always superior, it yields better results most of the time and can be suggested as a consistent model. However, it requires more computational time than a single ranking technique.
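
The pipelining described in the abstract above can be sketched as a chain of stages, each of which ranks the surviving genes with one technique and drops the lowest-ranked few. The gene names, the two score tables, and the `drop_per_stage` parameter are hypothetical; the abstract does not name the four ranking strategies or say how many features each stage removes.

```python
from itertools import permutations
from typing import Callable, Dict, List, Sequence

Scorer = Callable[[str], float]


def run_pipeline(genes: Sequence[str], scorers: Sequence[Scorer],
                 drop_per_stage: int) -> List[str]:
    """Apply each ranking technique in order, dropping the lowest-ranked genes."""
    surviving = list(genes)
    for score in scorers:
        ranked = sorted(surviving, key=score, reverse=True)  # best first
        keep = max(1, len(ranked) - drop_per_stage)          # never drop everything
        surviving = ranked[:keep]
    return surviving


# Hypothetical relevance scores from two ranking techniques, A and B.
score_a: Dict[str, float] = {"g1": 0.9, "g2": 0.1, "g3": 0.5, "g4": 0.7, "g5": 0.3}
score_b: Dict[str, float] = {"g1": 0.2, "g2": 0.8, "g3": 0.9, "g4": 0.6, "g5": 0.4}


def rank_a(gene: str) -> float:
    return score_a[gene]


def rank_b(gene: str) -> float:
    return score_b[gene]


genes = list(score_a)
a_then_b = run_pipeline(genes, [rank_a, rank_b], drop_per_stage=2)
b_then_a = run_pipeline(genes, [rank_b, rank_a], drop_per_stage=2)
print(a_then_b, b_then_a)

# Four ranking techniques can be ordered in 4! ways.
n_pipelines = len(list(permutations(range(4))))
print(n_pipelines)
```

In this toy data the two orderings end with different genes, which illustrates why the order of the stages matters. The count of orderings of four techniques also matches the twenty-four pipeline models the abstract reports.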

9.
A method is proposed for constructing salient features from a set of features that are given as input to a feedforward neural network used for supervised learning. Combinations of the original features are formed that maximize the sensitivity of the network's outputs with respect to variations of its inputs. The method exhibits some similarity to Principal Component Analysis, but also takes into account the supervised character of the learning task. It is applied to classification problems, where it improves generalization ability by alleviating the curse of dimensionality.

10.
Many studies on Graph Data Augmentation (GDA) approaches have emerged. The techniques have rapidly improved performance for various graph neural network (GNN) models, increasing the current state-of-the-art accuracy by absolute values of 4.20%, 5.50%, and 4.40% on Cora, Citeseer, and PubMed, respectively. The success is attributed to two integral properties of relational approaches: topology-level and feature-level augmentation. This work provides an overview of some GDA algorithms, categorized by these two properties. Next, we use the three most widely used GNN backbones (GCN, GAT, and GraphSAGE) as plug-and-play methods for conducting experiments. We conclude by evaluating the algorithms' effectiveness to demonstrate significant differences among various GDA techniques in accuracy and time complexity, using additional datasets different from those used in the original works. While discussing practical and theoretical motivations, considerations, and strategies for GDA, this work investigates the challenges and future directions by pinpointing several open issues that may require further study, based on a broad reading of the literature and on empirical outcomes.
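
The two augmentation levels named in the abstract above can be illustrated with one simple operation each. Random edge dropping (topology level) and random feature-column masking (feature level) are chosen here only as common, easy-to-read examples; the abstract does not say which algorithms the survey covers, and the function names and data layout are assumptions.

```python
import random
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[int, int]


def drop_edges(edges: Sequence[Edge], drop_prob: float,
               rng: random.Random) -> List[Edge]:
    """Topology-level augmentation: remove each edge independently with drop_prob."""
    return [e for e in edges if rng.random() >= drop_prob]


def mask_features(features: Dict[int, List[float]], mask_prob: float,
                  rng: random.Random) -> Dict[int, List[float]]:
    """Feature-level augmentation: zero out randomly chosen feature columns."""
    dim = len(next(iter(features.values())))
    masked = {j for j in range(dim) if rng.random() < mask_prob}
    return {node: [0.0 if j in masked else v for j, v in enumerate(vec)]
            for node, vec in features.items()}


rng = random.Random(0)  # fixed seed so the augmented view is reproducible
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
features = {0: [1.0, 2.0], 1: [3.0, 4.0], 2: [5.0, 6.0], 3: [7.0, 8.0]}

augmented_edges = drop_edges(edges, drop_prob=0.5, rng=rng)
augmented_features = mask_features(features, mask_prob=0.5, rng=rng)
```

Both functions return new objects and leave the original graph untouched, so several augmented views of the same graph can be generated for training.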

11.
Selecting relevant features for support vector machine (SVM) classifiers is important for a variety of reasons such as generalization performance, computational efficiency, and feature interpretability. Traditional SVM approaches to feature selection typically extract features and learn SVM parameters independently. Independently performing these two steps might result in a loss of information related to the classification process. This paper proposes a convex energy-based framework to jointly perform feature selection and SVM parameter learning for linear and non-linear kernels. Experiments on various databases show significant reduction of features used while maintaining classification performance.

12.
Graphs are a powerful and popular representation formalism in pattern recognition. Particularly in the field of document analysis they have found widespread application. From the formal point of view, however, graphs are quite limited in the sense that the majority of mathematical operations needed to build common algorithms, such as classifiers or clustering schemes, are not defined. Consequently, we observe a severe lack of algorithmic procedures that can directly be applied to graphs. There exists recent work, however, aimed at overcoming these limitations. The present paper first provides a review of the use of graph representations in document analysis. Then we discuss a number of novel approaches suitable for making tools from statistical pattern recognition available to graphs. These novel approaches include graph kernels and graph embedding. With several experiments, using different data sets from the field of document analysis, we show that the new methods have great potential to outperform traditional procedures applied to graph representations.

13.
In graph-embedding-based methods, we usually need to choose the nearest neighbors manually and then compute the edge weights from those neighbors via the L2 norm (e.g. LLE). Manually choosing the nearest neighbors in a high-dimensional space is difficult and unstable, so constructing the graph automatically is very important. In this paper, we first introduce an L2-graph, analogous to the L1-graph. The L2-graph calculates the edge weights using all samples, which avoids choosing the nearest neighbors manually. Second, we present an L2-graph-based feature extraction method called collaborative representation based projections (CRP). Like SPP, CRP aims to preserve the collaborative representation based reconstruction relationship of the data. CRP uses an L2-norm graph to characterize the local compactness information, and it maximizes the ratio between the total separability information and the local compactness information to seek the optimal projection matrix. CRP is much faster than SPP, since CRP calculates the objective function with the L2 norm while SPP calculates it with the L1 norm. Experimental results on the FERET, AR, and Yale face databases and the PolyU finger-knuckle-print database demonstrate that CRP works well in feature extraction and leads to good recognition performance.

14.
Over the last decade, increasing computing power and advances in algorithms have allowed the analysis of various aspects of human faces, such as facial expression estimation [20], head pose estimation [17], person identification [2], and face model fitting [31]. Today, computer scientists can draw on a variety of techniques to approach this challenge [4, 29, 3, 17, 9, 21]. However, each of them still has to contend with imperfect accuracy or high execution times.

15.
16.
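Modular two-dimensional principal component analysis: a new method for face recognition   Cited by: 7 (self-citations: 0, citations by others: 7)
A modular two-dimensional principal component analysis (M2DPCA) linear discriminant analysis method is proposed. M2DPCA first partitions the image matrix into blocks and then performs discriminant analysis directly on the resulting sub-image matrices. Its characteristics are as follows: it effectively reduces the dimensionality of the original pattern features; it completely avoids singular value decomposition of matrices, which makes feature extraction convenient; and 2DPCA is a special case of M2DPCA. Experimental results on the ORL face database show that M2DPCA outperforms PCA in recognition performance and is more robust than 2DPCA.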
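
The block-partitioning step of M2DPCA can be sketched as below, together with the scatter matrix of the resulting sub-images. The abstract gives no formulas, so the scatter matrix here follows the usual 2DPCA formulation (an average of D^T D over mean-centred sub-images D), which is an assumption. The function names and the tiny 2 x 4 "image" are illustrative only.

```python
from typing import List

Matrix = List[List[float]]


def split_into_blocks(image: Matrix, block_rows: int, block_cols: int) -> List[Matrix]:
    """Partition an image matrix into non-overlapping sub-image matrices."""
    rows, cols = len(image), len(image[0])
    if rows % block_rows or cols % block_cols:
        raise ValueError("image size must be a multiple of the block size")
    return [[row[c:c + block_cols] for row in image[r:r + block_rows]]
            for r in range(0, rows, block_rows)
            for c in range(0, cols, block_cols)]


def scatter_matrix(blocks: List[Matrix]) -> Matrix:
    """Average of D^T D over all sub-images, where D is a mean-centred sub-image."""
    n, h, w = len(blocks), len(blocks[0]), len(blocks[0][0])
    mean = [[sum(b[i][j] for b in blocks) / n for j in range(w)] for i in range(h)]
    g = [[0.0] * w for _ in range(w)]
    for b in blocks:
        dev = [[b[i][j] - mean[i][j] for j in range(w)] for i in range(h)]
        for a in range(w):
            for c in range(w):
                g[a][c] += sum(dev[i][a] * dev[i][c] for i in range(h)) / n
    return g


image = [[1.0, 2.0, 3.0, 4.0],
         [5.0, 6.0, 7.0, 8.0]]
blocks = split_into_blocks(image, block_rows=2, block_cols=2)
g = scatter_matrix(blocks)
print(blocks)
print(g)
```

In the 2DPCA formulation, the projection axes would be the leading eigenvectors of `g`, and each sub-image would be multiplied by them to obtain its features. Taking the whole image as a single block (`split_into_blocks(image, 2, 4)`) reduces the procedure to plain 2DPCA, consistent with the abstract's remark that 2DPCA is a special case.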

17.
18.
This paper is a historical overview of graph-based methodologies in Pattern Recognition over the last 40 years. The history is interpreted with the aim of recognizing the rationale behind the papers published in these years, so as to classify them roughly. Despite the extent of scientific production in this field, it is possible to identify three historical periods, each with a character common to most of the corresponding papers; they are called here the pure, the impure, and the extreme periods.

19.
20.
This paper presents a document classifier based on text content features and its application to email classification. We test the validity of a classifier which uses Principal Component Analysis Document Reconstruction (PCADR). The idea is that principal component analysis (PCA) can optimally compress only the kind of documents (in our experiments, email classes) that are used to compute the principal components (PCs), and that for other kinds of documents the compression will not perform well using only a few components. Thus, the classifier computes the PCA separately for each document class. When a new instance arrives to be classified, it is projected onto each set of computed PCs corresponding to each class and then reconstructed using the same PCs. The reconstruction error is computed, and the classifier assigns the instance to the class with the smallest error or divergence from the class representation. We test this approach in email filtering by distinguishing between two message classes (e.g. spam from ham, or phishing from ham). The experiments show that PCADR obtains very good results on the different validation datasets employed, reaching better performance than the popular Support Vector Machine classifier.
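
The classification rule described above (fit a PCA per class, reconstruct the new instance with each class's components, pick the class with the smallest reconstruction error) can be sketched in a few lines. This is a simplified illustration under stated assumptions: each class keeps only its first principal component, found by power iteration, and the three-dimensional feature vectors and the "ham"/"spam" training rows are made up for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Model = Tuple[Vector, Vector]  # (class mean, unit-length first principal direction)


def fit_class_model(rows: Sequence[Vector], iterations: int = 200) -> Model:
    """Estimate one class's mean and first principal direction by power iteration."""
    dim = len(rows[0])
    mean = [sum(r[i] for r in rows) / len(rows) for i in range(dim)]
    centered = [[x - m for x, m in zip(r, mean)] for r in rows]
    scatter = [[sum(r[i] * r[j] for r in centered) for j in range(dim)]
               for i in range(dim)]
    # Asymmetric start vector; assumed not orthogonal to the leading eigenvector.
    v = [1.0 / (i + 1) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(scatter[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance to capture; keep the current direction
            break
        v = [x / norm for x in w]
    return mean, v


def reconstruction_error(x: Vector, model: Model) -> float:
    """Squared distance between x and its reconstruction from one component."""
    mean, v = model
    centered = [a - m for a, m in zip(x, mean)]
    coeff = sum(c * d for c, d in zip(centered, v))  # projection onto the PC
    return sum((c - coeff * d) ** 2 for c, d in zip(centered, v))


def classify(x: Vector, models: Dict[str, Model]) -> str:
    """Assign x to the class whose components reconstruct it best."""
    return min(models, key=lambda label: reconstruction_error(x, models[label]))


# Hypothetical training data: each row is a document's feature vector.
training = {
    "ham": [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [3.0, 3.0, 0.0]],
    "spam": [[0.0, 1.0, 1.0], [0.0, 2.0, 2.0], [0.0, 3.0, 3.0]],
}
models = {label: fit_class_model(rows) for label, rows in training.items()}
print(classify([4.0, 4.0, 0.0], models))
```

A real implementation would keep several components per class and would typically use a linear algebra library for the decomposition; the single-component version is kept here only so the reconstruction step stays easy to follow.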
