Similar Articles
20 similar articles found
1.
Transparency of Neural Network Models and Input Variable Reduction   (Cited by: 1; self-citations: 0, others: 1)
Because neural networks readily realize nonlinear mappings from an input space to an output space, practitioners often skip examining the correlation between input and output variables and use a neural network directly for black-box modeling between them. As a result, such models frequently contain redundant variables and suffer from poor reliability and robustness. This paper proposes a method for making the black-box behavior of a neural network transparent and uses it to remove redundant variables from the model. The method first visualizes the network with a neural interpretation diagram; it then applies the connection-weight method to compute the relative contribution rate of each input variable, judging its importance to the output; finally, it applies an improved randomization test to check the significance of the connection weights and the input contribution rates, prunes the model, and removes as redundant those input variables whose overall contribution and relative contribution rate are both insignificant, thereby making the NN model transparent and selecting variables. Experimental results show that the method increases model transparency, selects the best input variables, eliminates redundant inputs, and improves model reliability and robustness, offering a new approach to the transparency and variable reduction of neural network models.
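As a concrete illustration of the connection-weight step, here is a minimal sketch in the style of Olden's connection-weight method for a single-hidden-layer network; the shapes, random weights, and function names are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch of the connection-weight method for ranking inputs of a
# single-hidden-layer network (illustrative, not the paper's exact recipe).
import numpy as np

def connection_weight_contributions(W_ih, w_ho):
    """W_ih: (n_inputs, n_hidden) input-to-hidden weights;
    w_ho: (n_hidden,) hidden-to-output weights.
    Returns the signed contribution of each input and its relative share."""
    contrib = W_ih @ w_ho                            # sum over hidden units of w_ij * v_j
    rel = np.abs(contrib) / np.abs(contrib).sum()    # relative contribution rate
    return contrib, rel

rng = np.random.default_rng(0)
W_ih = rng.normal(size=(5, 8))   # 5 candidate input variables, 8 hidden units
w_ho = rng.normal(size=8)
contrib, rel = connection_weight_contributions(W_ih, w_ho)
print("relative contribution per input:", np.round(rel, 3))
# A randomization test (retraining on permuted targets and comparing the
# resulting contribution rates) would then flag inputs whose shares are
# not significantly different from chance.
```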

2.
Discriminative Locality Alignment is a dimensionality-reduction algorithm built on the patch-alignment framework of spectral analysis, but it can only reduce the dimensionality of single-manifold data. To address this limitation, this work studies multi-manifold learning and semi-supervised learning and, drawing on the label propagation algorithm (LP) and linear reconstruction analysis, proposes a semi-supervised dimensionality-reduction algorithm that preserves manifold structure. Patches are constructed from the labels of all samples obtained after label propagation, and the low-dimensional embedding is found by solving for the optimum of the objective function. Experiments on the standard YALE and FERET face databases verify the effectiveness of the algorithm and show its good classification performance.
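A minimal sketch of the propagate-then-reduce idea, assuming scikit-learn: labels are first spread to unlabeled samples, then the completed labels feed a supervised reduction step (LDA here is a simple stand-in for the paper's patch construction, not its actual method).

```python
# Propagate labels to unlabeled samples, then reduce dimensionality
# using the completed label set (LDA as an illustrative stand-in).
import numpy as np
from sklearn.datasets import load_digits
from sklearn.semi_supervised import LabelPropagation
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_digits(return_X_y=True)
y_semi = y.copy()
rng = np.random.default_rng(0)
mask = rng.random(len(y)) < 0.9      # hide 90% of the labels
y_semi[mask] = -1                    # -1 marks unlabeled samples

lp = LabelPropagation(kernel="knn", n_neighbors=7).fit(X, y_semi)
y_full = lp.transduction_            # labels for all samples after propagation

Z = LinearDiscriminantAnalysis(n_components=9).fit_transform(X, y_full)
print(Z.shape)                       # low-dimensional embedding
```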

3.
Generative and discriminative methods are two different frameworks for solving classification problems, each with its own strengths. To exploit both, this paper proposes a linear hybrid generative-discriminative classification model and designs a genetic-algorithm-based learning procedure for it. The procedure treats learning the mixing parameters of the linear hybrid classifier as an optimization problem: taking the posterior probabilities that the two base classifiers assign to each training sample as its data, a genetic algorithm searches for the optimal values of the mixing parameters. Experimental results show that on most data sets the classification accuracy of the linear hybrid classifier is better than, or close to, that of the better of its two base classifiers.
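A minimal sketch of the idea: mix the posteriors of a generative and a discriminative base classifier as a*p_gen + (1-a)*p_dis and search the mixing weight with a toy genetic algorithm. The GA details (population size, mutation scale) and the choice of base classifiers are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB               # generative base
from sklearn.linear_model import LogisticRegression      # discriminative base

X, y = make_classification(n_samples=600, random_state=0)
Xtr, Xva, ytr, yva = train_test_split(X, y, random_state=0)
P_gen = GaussianNB().fit(Xtr, ytr).predict_proba(Xva)
P_dis = LogisticRegression(max_iter=1000).fit(Xtr, ytr).predict_proba(Xva)

def fitness(a):
    mixed = a * P_gen + (1 - a) * P_dis                  # linear hybrid posterior
    return (mixed.argmax(axis=1) == yva).mean()

rng = np.random.default_rng(0)
pop = rng.random(20)                                     # initial population of weights
for _ in range(30):                                      # evolve for 30 generations
    scores = np.array([fitness(a) for a in pop])
    parents = pop[np.argsort(scores)[-10:]]              # keep the fittest half
    children = np.clip(parents + rng.normal(0, 0.05, 10), 0, 1)  # mutate
    pop = np.concatenate([parents, children])
best = max(pop, key=fitness)
print(f"best mixing weight a={best:.2f}, accuracy={fitness(best):.3f}")
```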

4.
Wang Guoqiang, Gong Lei, Pang Yajun, Shi Nianfeng. Neural Processing Letters, 2020, 51(1): 611-638
Neural Processing Letters - In this paper, we propose an effective dimensionality reduction algorithm named Discriminant Collaborative Locality Preserving Projections (DCLPP), which takes advantage...

5.
On Discriminative Bayesian Network Classifiers and Logistic Regression   (Cited by: 5; self-citations: 1, others: 4)
Discriminative learning of the parameters in the naive Bayes model is known to be equivalent to a logistic regression problem. Here we show that the same fact holds for much more general Bayesian network models, as long as the corresponding network structure satisfies a certain graph-theoretic property. The property holds for naive Bayes but also for more complex structures such as tree-augmented naive Bayes (TAN) as well as for mixed diagnostic-discriminative structures. Our results imply that for networks satisfying our property, the conditional likelihood cannot have local maxima so that the global maximum can be found by simple local optimization methods. We also show that if this property does not hold, then in general the conditional likelihood can have local, non-global maxima. We illustrate our theoretical results by empirical experiments with local optimization in a conditional naive Bayes model. Furthermore, we provide a heuristic strategy for pruning the number of parameters and relevant features in such models. For many data sets, we obtain good results with heavily pruned submodels containing many fewer parameters than the original naive Bayes model.
Editors: Pedro Larrañaga, Jose A. Lozano, Jose M. Peña and Iñaki Inza
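For readers unfamiliar with the equivalence the abstract builds on, here is a standard textbook derivation (not taken from this paper) for binary naive Bayes with discrete features:

```latex
% For binary classes c in {0,1} and discrete features x_1,...,x_n, the naive
% Bayes conditional log-odds are linear in indicator features of the x_i:
\log \frac{P(c=1 \mid \mathbf{x})}{P(c=0 \mid \mathbf{x})}
  = \log \frac{P(c=1)}{P(c=0)}
  + \sum_{i=1}^{n} \log \frac{P(x_i \mid c=1)}{P(x_i \mid c=0)}
  = w_0 + \sum_{i=1}^{n} w_{i,x_i}.
% Each conditional log-ratio is a free weight indexed by the value of x_i, so
% maximizing conditional likelihood over the naive Bayes parameters reaches
% exactly the decision-function family of logistic regression.
```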

6.
Understanding the world we live in requires access to a large amount of background knowledge: the commonsense knowledge that most people have and most computer systems don't. Many of the limitations of artificial intelligence today relate to the problem of acquiring and understanding common sense. The Open Mind Common Sense project began to collect common sense from volunteers on the Internet starting in 2000. The collected information is converted to a semantic network called ConceptNet. Reducing the dimensionality of ConceptNet's graph structure gives a matrix representation called AnalogySpace, which reveals large-scale patterns in the data, smoothes over noise, and predicts new knowledge. Extending this work, we have created a method that uses singular value decomposition to aid in the integration of systems or representations. This technique, called blending, can be harnessed to find and exploit correlations between different resources, enabling commonsense reasoning over a broader domain.
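A minimal sketch of the dimensionality-reduction step behind AnalogySpace, assuming scipy; the random sparse matrix is a stand-in for ConceptNet's real concept-feature matrix.

```python
# Truncated SVD of a sparse concept-by-feature matrix: the low-rank
# reconstruction smooths noise and predicts unseen entries.
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

A = sparse_random(1000, 400, density=0.01, random_state=0)  # concepts x features
U, s, Vt = svds(A, k=50)                                    # rank-50 factorization
A_hat = (U * s) @ Vt          # smoothed reconstruction suggests new knowledge
print(U.shape, s.shape, Vt.shape)
# "Blending" runs a joint SVD over several such matrices (scaled to
# comparable magnitude) so correlations across resources can be exploited.
```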

7.
Conditional random fields (CRFs) are a statistical framework that has recently gained in popularity in both the automatic speech recognition (ASR) and natural language processing communities because of the different nature of assumptions that are made in predicting sequences of labels compared to the more traditional hidden Markov model (HMM). In the ASR community, CRFs have been employed in a method similar to that of HMMs, using the sufficient statistics of input data to compute the probability of label sequences given acoustic input. In this paper, we explore the application of CRFs to combine local posterior estimates provided by multilayer perceptrons (MLPs) corresponding to the frame-level prediction of phone classes and phonological attribute classes. We compare phonetic recognition using CRFs to an HMM system trained on the same input features and show that the monophone label CRF is able to achieve superior performance to a monophone-based HMM and performance comparable to a 16 Gaussian mixture triphone-based HMM; in both of these cases, the CRF obtains these results with far fewer free parameters. The CRF is also able to better combine these posterior estimators, achieving a substantial increase in performance over an HMM-based triphone system by mixing the two highly correlated sets of phone class and phonetic attribute class posteriors.

8.
Dimensionality Reduction and Low-Loss Dimensionality Reduction in Statistical Pattern Recognition   (Cited by: 31; self-citations: 0, others: 31)
This paper gives a fairly comprehensive review of the mainstream feature-selection and feature-extraction methods commonly used for dimensionality reduction in statistical pattern recognition, describing their characteristics and applicable scopes. On this basis it proposes a new feature-selection method, low-loss dimensionality reduction, built on the optimal classifier (the Bayes classifier) and applicable to automatic text categorization and other large-sample classification problems. Simulation results on the standard Reuters-21578 data set show that, compared with the three mainstream text feature-selection methods of mutual information, the χ² statistic, and document frequency, low-loss dimensionality reduction performs about as well as mutual information and the χ² statistic, and better than document frequency.
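For reference, a minimal sketch of the baseline selectors the paper compares against (χ² and mutual information), using scikit-learn on a small text corpus; the corpus choice and k are illustrative assumptions, and the paper's own low-loss method is not reproduced here.

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2, mutual_info_classif

data = fetch_20newsgroups(subset="train", categories=["sci.space", "rec.autos"])
X = CountVectorizer(max_features=5000).fit_transform(data.data)
y = data.target

X_chi2 = SelectKBest(chi2, k=500).fit_transform(X, y)                # chi-squared
X_mi = SelectKBest(mutual_info_classif, k=500).fit_transform(X, y)   # mutual info
print(X_chi2.shape, X_mi.shape)
# Document frequency would instead simply keep the k most frequent terms.
```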

9.
Understanding facial expressions in image sequences is an easy task for humans. Some of us are capable of lipreading by interpreting the motion of the mouth. Automatic lipreading by a computer is a challenging task, with so far limited success. The inverse problem of synthesizing real looking lip movements is also highly non-trivial. Today, the technology to automatically generate an image series that imitates natural postures is far from perfect. We introduce a new framework for facial image representation, analysis and synthesis, in which we focus just on the lower half of the face, specifically the mouth. It includes interpretation and classification of facial expressions and visual speech recognition, as well as a synthesis procedure of facial expressions that yields natural looking mouth movements. Our image analysis and synthesis processes are based on a parametrization of the mouth configuration set of images. These images are represented as points on a two-dimensional flat manifold that enables us to efficiently define the pronunciation of each word and thereby analyze or synthesize the motion of the lips. We present some examples of automatic lips motion synthesis and lipreading, and propose a generalization of our solution to the problem of lipreading different subjects.

10.
Using Correspondence Analysis to Combine Classifiers   (Cited by: 7; self-citations: 0, others: 7)
Several effective methods have been developed recently for improving predictive performance by generating and combining multiple learned models. The general approach is to create a set of learned models either by applying an algorithm repeatedly to different versions of the training data, or by applying different learning algorithms to the same data. The predictions of the models are then combined according to a voting scheme. This paper focuses on the task of combining the predictions of a set of learned models. The method described uses the strategies of stacking and Correspondence Analysis to model the relationship between the learning examples and their classification by a collection of learned models. A nearest neighbor method is then applied within the resulting representation to classify previously unseen examples. The new algorithm does not perform worse than, and frequently performs significantly better than, other combining techniques on a suite of data sets.
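A minimal sketch of the stacking stage, assuming scikit-learn: base-model predictions become a new representation and a nearest-neighbor rule classifies within it. The correspondence-analysis projection of the prediction matrix is omitted, so this is a simplified stand-in, not the paper's full method.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
bases = [DecisionTreeClassifier(random_state=0), GaussianNB(),
         LogisticRegression(max_iter=5000)]

# Out-of-fold predictions give an unbiased stacked representation for training.
Ztr = np.column_stack([cross_val_predict(b, Xtr, ytr, cv=5) for b in bases])
Zte = np.column_stack([b.fit(Xtr, ytr).predict(Xte) for b in bases])

knn = KNeighborsClassifier(n_neighbors=5).fit(Ztr, ytr)
print("stacked 5-NN accuracy:", knn.score(Zte, yte))
```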

11.
Bayesian belief nets (BNs) are often used for classification tasks, typically to return the most likely class label for each specified instance. Many BN-learners, however, attempt to find the BN that maximizes a different objective function (viz., likelihood rather than classification accuracy), typically by first learning an appropriate graphical structure, then finding the parameters for that structure that maximize the likelihood of the data. As these parameters may not maximize the classification accuracy, "discriminative parameter learners" follow the alternative approach of seeking the parameters that maximize conditional likelihood (CL), over the distribution of instances the BN will have to classify. This paper first formally specifies this task, shows how it extends standard logistic regression, and analyzes its inherent sample and computational complexity. We then present a general algorithm for this task, ELR, that applies to arbitrary BN structures and that works effectively even when given incomplete training data. Unfortunately, ELR is not guaranteed to find the parameters that optimize conditional likelihood; moreover, even the optimal-CL parameters need not have minimal classification error. This paper therefore presents empirical evidence that ELR produces effective classifiers, often superior to the ones produced by the standard "generative" algorithms, especially in common situations where the given BN-structure is incorrect.

12.
Reducing the dimensionality of a classification problem produces a more computationally-efficient system. Since the dimensionality of a classification problem is equivalent to the number of neurons in the first hidden layer of a network, this work shows how to eliminate neurons on that layer and simplify the problem. In the cases where the dimensionality cannot be reduced without some degradation in classification performance, we formulate and solve a constrained optimization problem that allows a trade-off between dimensionality and performance. We introduce a novel penalty function and combine it with bilevel optimization to solve the constrained problem. The performance of our method on synthetic and applied problems is superior to other known penalty functions such as weight decay, weight elimination, and Hoyer's function. An example of dimensionality reduction for hyperspectral image classification demonstrates the practicality of the new method. Finally, we show how the method can be extended to multilayer and multiclass neural network problems.
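For context, the baseline penalty functions named above, written out for a weight vector w; these are their standard forms, not the paper's novel penalty, which the abstract does not spell out.

```python
import numpy as np

def weight_decay(w):
    return np.sum(w ** 2)                        # L2 penalty

def weight_elimination(w, w0=1.0):
    return np.sum(w**2 / (w0**2 + w**2))         # saturating penalty

def hoyer_sparseness(w, eps=1e-12):
    # Hoyer's measure: 1 = maximally sparse, 0 = fully dense.
    n = w.size
    return (np.sqrt(n) - np.abs(w).sum() / (np.linalg.norm(w) + eps)) / (np.sqrt(n) - 1)

w = np.array([0.9, -0.02, 0.0, 0.5])
print(weight_decay(w), weight_elimination(w), hoyer_sparseness(w))
# Penalizing the group of weights feeding each first-layer neuron drives whole
# neurons (i.e., input dimensions) toward zero, which is the pruning effect.
```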

13.
Some articulated motion representations rely on frame-wise abstractions of the statistical distribution of low-level features such as orientation, color, or relational distributions. As configuration among parts changes with articulated motion, the distribution changes, tracing a trajectory in the latent space of distributions, which we call the configuration space. These trajectories can then be used for recognition using standard techniques such as dynamic time warping. The core theory in this paper concerns embedding the frame-wise distributions, which can be looked upon as probability functions, into a low-dimensional space so that we can estimate various meaningful probabilistic distances such as the Chernoff, Bhattacharya, Matusita, Kullback-Leibler (KL) or symmetric-KL distances based on dot products between points in this space. Apart from computational advantages, this representation also affords speed-normalized matching of motion signatures. Speed-normalized representations can be formed by interpolating the configuration trajectories along their arc lengths, without using any knowledge of the temporal scale variations between the sequences. We experiment with five different probabilistic distance measures and show the usefulness of the representation in three different contexts: sign recognition (with a large number of possible classes), gesture recognition (with person variations), and classification of human-human interaction sequences (with segmentation problems). We find the importance of using the right distance measure for each situation. The low-dimensional embedding makes matching two to three times faster, while achieving recognition accuracies that are close to those obtained without using a low-dimensional embedding. We also empirically establish the robustness of the representation with respect to low-level parameters, embedding parameters, and temporal-scale parameters.
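A minimal sketch of the core trick: embed distributions by their square roots, so the Bhattacharyya coefficient becomes a plain dot product and several of the listed distances follow from it. The distributions below are toy data.

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

sp, sq = np.sqrt(p), np.sqrt(q)        # points on the unit sphere
bc = sp @ sq                           # Bhattacharyya coefficient as a dot product
bhattacharyya = -np.log(bc)            # Bhattacharyya distance
matusita = np.sqrt(2 - 2 * bc)         # Matusita distance = ||sqrt(p) - sqrt(q)||
print(bc, bhattacharyya, matusita)
```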

14.
Patch Alignment for Dimensionality Reduction   (Cited by: 7; self-citations: 0, others: 7)
Spectral analysis-based dimensionality reduction algorithms are important and have been popularly applied in data mining and computer vision applications. To date many algorithms have been developed, e.g., principal component analysis, locally linear embedding, Laplacian eigenmaps, and local tangent space alignment. All of these algorithms have been designed intuitively and pragmatically, i.e., on the basis of the experience and knowledge of experts for their own purposes. Therefore, it will be more informative to provide a systematic framework for understanding the common properties and intrinsic difference in different algorithms. In this paper, we propose such a framework, named "patch alignment," which consists of two stages: part optimization and whole alignment. The framework reveals that 1) algorithms are intrinsically different in the patch optimization stage and 2) all algorithms share an almost identical whole alignment stage. As an application of this framework, we develop a new dimensionality reduction algorithm, termed Discriminative Locality Alignment (DLA), by imposing discriminative information in the part optimization stage. DLA can 1) attack the distribution nonlinearity of measurements; 2) preserve the discriminative ability; and 3) avoid the small-sample-size problem. Thorough empirical studies demonstrate the effectiveness of DLA compared with representative dimensionality reduction algorithms.
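A minimal sketch of the shared "whole alignment" stage: per-patch matrices L_i are accumulated into a global alignment matrix through selection indices, and the embedding comes from its eigenvectors. The patch matrices here are random placeholders, not any specific algorithm's part-optimization output.

```python
import numpy as np

n, k = 8, 3                                   # n samples, patches of size k
L_global = np.zeros((n, n))
rng = np.random.default_rng(0)
for i in range(n):
    # Each patch = sample i plus k-1 neighbors (chosen randomly here).
    idx = np.r_[i, rng.choice(np.delete(np.arange(n), i), k - 1, replace=False)]
    L_i = rng.normal(size=(k, k)); L_i = L_i @ L_i.T   # stand-in patch matrix
    L_global[np.ix_(idx, idx)] += L_i                  # whole alignment: accumulate
# The low-dimensional coordinates come from the bottom eigenvectors of
# L_global (subject to the algorithm-specific constraints fixed in the
# part-optimization stage).
eigvals, eigvecs = np.linalg.eigh(L_global)
print(eigvecs[:, :2].shape)                   # a 2-D embedding of the samples
```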

15.
Nonlinear dimensionality reduction and semi-supervised learning have both been active topics in machine learning in recent years. Applying semi-supervised methods to nonlinear dimensionality reduction, this paper proposes a graph-based semi-supervised dimensionality-reduction algorithm. Through an equality-fusion derivation it obtains an alternative formulation of the label propagation algorithm, uses the label propagation result as the initial data mapping, and then searches, in the linear space spanned by the graph spectrum, for the data mapping closest to that initial one as the final result of semi-supervised dimensionality reduction. Experiments show that the proposed algorithm yields a smooth data mapping that is closer to the ideal reduction. Comparisons with the label propagation algorithm, graph-spectrum approximation, and unsupervised dimensionality-reduction algorithms also demonstrate its advantages.

16.
Using Architectural Models to Manage and Visualize Runtime Adaptation   (Cited by: 2; self-citations: 0, others: 2)
The architectural runtime configuration management approach provides an accurate model of adaptive software system behavior over time. ARCM improves the visibility and understandability of runtime adaptive processes while allowing human input into the adaptation-control loop.

17.
Locally-weighted regression is a computationally-efficient technique for non-linear regression. However, for high-dimensional data, this technique becomes numerically brittle and computationally too expensive if many local models need to be maintained simultaneously. Thus, local linear dimensionality reduction combined with locally-weighted regression seems to be a promising solution. In this context, we review linear dimensionality-reduction methods, compare their performance on non-parametric locally-linear regression, and discuss their ability to extend to incremental learning. The considered methods belong to the following three groups: (1) reducing dimensionality only on the input data, (2) modeling the joint input-output data distribution, and (3) optimizing the correlation between projection directions and output data. Group 1 contains principal component regression (PCR); group 2 contains principal component analysis (PCA) in joint input and output space, factor analysis, and probabilistic PCA; and group 3 contains reduced rank regression (RRR) and partial least squares (PLS) regression. Among the tested methods, only group 3 managed to achieve robust performance even for a non-optimal number of components (factors or projection directions). In contrast, groups 1 and 2 failed for fewer components since these methods rely on the correct estimate of the true intrinsic dimensionality. In group 3, PLS is the only method for which a computationally-efficient incremental implementation exists. Thus, PLS appears to be ideally suited as a building block for a locally-weighted regressor in which projection directions are incrementally added on the fly.
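A minimal sketch contrasting group 1 (PCR) with group 3 (PLS) on a toy regression task, assuming scikit-learn; the data and component counts are illustrative assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.cross_decomposition import PLSRegression
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=300, n_features=50, n_informative=5,
                       noise=5.0, random_state=0)
pcr = make_pipeline(PCA(n_components=3), LinearRegression())  # ignores y
pls = PLSRegression(n_components=3)                           # correlates with y
print("PCR R^2:", cross_val_score(pcr, X, y, cv=5).mean())
print("PLS R^2:", cross_val_score(pls, X, y, cv=5).mean())
# With too few components, PCR can pick high-variance but output-irrelevant
# directions; PLS selects directions by their correlation with the output.
```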

18.
This paper proposes a clustering-based technique for reducing the dimensionality of the kernel matrix. Its main idea is first to map the original input space into a high-dimensional feature space through a nonlinear mapping φ, then to shrink the number of training samples with the k-means clustering algorithm, obtaining a reduced representative set from which a set of orthonormal basis vectors is computed to form a low-dimensional projection subspace. Experimental results on the CENPARMI handwritten Arabic-numeral database confirm the effectiveness of the proposed algorithm.
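A minimal sketch of the idea, assuming an RBF kernel and scikit-learn: k-means centers act as the reduced representative set, and an eigendecomposition of their kernel matrix yields an orthonormal basis for a low-dimensional projection (a Nyström-style construction, which may differ in detail from the paper's).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.metrics.pairwise import rbf_kernel

X, _ = load_digits(return_X_y=True)
centers = KMeans(n_clusters=50, n_init=10, random_state=0).fit(X).cluster_centers_

K_cc = rbf_kernel(centers, centers, gamma=1e-3)   # kernel on representatives only
vals, vecs = np.linalg.eigh(K_cc)
keep = vals > 1e-8
B = vecs[:, keep] / np.sqrt(vals[keep])           # orthonormal basis coefficients

K_xc = rbf_kernel(X, centers, gamma=1e-3)         # project all samples
Z = K_xc @ B
print(Z.shape)                                    # reduced feature representation
```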

19.
To extract low-dimensional features with good discriminative power, this paper proposes a new supervised linear dimensionality-reduction algorithm, margin discriminant projection: it minimizes the maximum distance between samples of the same class and maximizes the minimum distance between samples of different classes, while preserving the geometry of the data manifold. Compared with classical margin-based algorithms, margin discriminant projection better preserves global properties such as the geometric and discriminative structure of the data manifold, avoids the small-sample-size problem, and has low computational complexity, so it can be applied to very high-dimensional big-data reduction. Experimental results on face data sets show that margin discriminant projection is an effective dimensionality-reduction algorithm applicable to feature extraction on big data.
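A minimal sketch of the two quantities the stated objective trades off: the largest within-class distance (to be minimized) and the smallest between-class distance (to be maximized) under a linear projection W; the random orthonormal W stands in for the learned projection.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 100))                    # 60 samples, 100-D
y = np.repeat([0, 1, 2], 20)
W = np.linalg.qr(rng.normal(size=(100, 10)))[0]   # random orthonormal projection
Z = X @ W

D = cdist(Z, Z)
same = (y[:, None] == y[None, :]) & ~np.eye(len(y), dtype=bool)
diff = y[:, None] != y[None, :]
print("max within-class distance :", D[same].max())   # objective: minimize
print("min between-class distance:", D[diff].min())   # objective: maximize
```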

20.
Automatically Discovering Domain-Specific Deep Web Entry Points Using Classifiers   (Cited by: 4; self-citations: 0, others: 4)
Wang Hui, Liu Yanwei, Zuo Wanli. Journal of Software (软件学报), 2008, 19(2): 246-256
In deep web research, general-purpose search engines such as Google and Yahoo have notable shortcomings: the share of the deep web's total data that each can cover is less than one third, and, unlike in the surface web, combining several search engines barely increases the covered amount. Many deep web sites provide large quantities of high-quality information, and the deep web is gradually becoming one of the most important information resources. This paper proposes a three-classifier framework for automatically identifying domain-specific deep web entry points. Once the query interfaces have been obtained, they can be integrated, and a single unified interface can be presented to users for convenient information search. Eight groups of large-scale experiments verify that the proposed method discovers domain-specific deep web entry points accurately and efficiently.
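A hedged sketch of one stage such a framework needs: a text classifier deciding whether an HTML form is a searchable query interface. The tokenized form strings, labels, and feature choice are invented placeholders, not the paper's data or its exact classifiers.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

forms = [
    "input text name=query select category submit search",        # query interface
    "input password name=pwd input text name=user submit login",  # login form
    "input text name=keywords select year submit find",           # query interface
    "input email name=email submit subscribe",                    # mailing list
]
labels = [1, 0, 1, 0]    # 1 = searchable query interface, 0 = other form

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(forms, labels)
print(clf.predict(["input text name=title select author submit search"]))
# The full framework would chain site-level, page-level, and form-level
# classifiers to narrow down candidate deep web entry points.
```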
