Similar Documents
A total of 20 similar documents were found; search time: 15 ms.
1.
Clustering aims to partition a data set into homogeneous groups which gather similar objects. Object similarity, or more often object dissimilarity, is usually expressed in terms of some distance function. This approach, however, is not viable when dissimilarity is conceptual rather than metric. In this paper, we propose to extract the dissimilarity relation directly from the available data. To this aim, we train a feedforward neural network with some pairs of points with known dissimilarity. Then, we use the dissimilarity measure generated by the network to guide a new unsupervised fuzzy relational clustering algorithm. An artificial data set and a real data set are used to show how the clustering algorithm based on the neural dissimilarity outperforms some widely used (possibly partially supervised) clustering algorithms based on spatial dissimilarity.
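As a rough illustration of the two-stage idea only (not the authors' actual network or their fuzzy relational algorithm), the sketch below learns a dissimilarity from labelled pairs with a small feedforward regressor and then fills the relational matrix that a relational clustering step would consume; all names, parameters and data are hypothetical.

```python
# Hypothetical sketch: learn a dissimilarity from labelled pairs, then build
# the relational (dissimilarity) matrix a relational clustering step would use.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))                       # unlabelled objects to cluster
pairs_idx = rng.integers(0, 60, size=(200, 2))     # pairs with known dissimilarity
pair_feats = np.hstack([X[pairs_idx[:, 0]], X[pairs_idx[:, 1]]])
# Stand-in "known" dissimilarity labels in [0, 1] (here: a scaled distance).
d_labels = np.linalg.norm(X[pairs_idx[:, 0]] - X[pairs_idx[:, 1]], axis=1)
d_labels = d_labels / d_labels.max()

net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
net.fit(pair_feats, d_labels)                      # supervised step on the pairs

# Relational matrix R[i, j] = learned dissimilarity; the paper's fuzzy
# relational clustering algorithm itself is not reproduced here.
all_pairs = np.hstack([np.repeat(X, len(X), axis=0), np.tile(X, (len(X), 1))])
R = net.predict(all_pairs).reshape(len(X), len(X))
R = (R + R.T) / 2                                  # enforce symmetry
np.fill_diagonal(R, 0.0)
```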

2.
Neurocontroller design via supervised and unsupervised learning   (Total citations: 1; self-citations: 0; citations by others: 1)
In this paper we study the role of supervised and unsupervised neural learning schemes in the adaptive control of nonlinear dynamic systems. We suggest and demonstrate that the teacher's knowledge in the supervised learning mode includes a priori plant structural knowledge, which may be employed in the design of exploratory schedules during learning, resulting in an unsupervised learning scheme. We further demonstrate that neurocontrollers may realize both linear and nonlinear control laws that are given explicitly in an automated teacher or implicitly through a human operator, and that their robustness may be superior to that of a model-based controller. Examples of both learning schemes are provided in the adaptive control of robot manipulators and a cart-pole system.

3.

One relevant problem in data quality is missing data. Despite the frequent occurrence and the relevance of the missing data problem, many machine learning algorithms handle missing data in a rather naive way. However, missing data should be treated carefully, otherwise bias might be introduced into the induced knowledge. In this work, we analyze the use of the k-nearest neighbor algorithm as an imputation method. Imputation is a term that denotes a procedure that replaces the missing values in a data set with some plausible values. One advantage of this approach is that the missing data treatment is independent of the learning algorithm used. This allows the user to select the most suitable imputation method for each situation. Our analysis indicates that missing data imputation based on the k-nearest neighbor algorithm can outperform the internal methods used by C4.5 and CN2 to treat missing data, and can also outperform the mean or mode imputation method, which is broadly used to treat missing values.
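A minimal illustration of k-nearest neighbor imputation, using scikit-learn's generic KNNImputer rather than the exact procedure evaluated in the paper; the data are made up.

```python
# Illustrative k-NN imputation: each missing entry is replaced by a value
# derived from the k most similar rows, then the learning algorithm runs on
# the completed data set.
import numpy as np
from sklearn.impute import KNNImputer

X = np.array([[1.0, 2.0,    np.nan],
              [3.0, np.nan, 6.0],
              [2.0, 2.5,    5.0],
              [8.0, 8.0,    9.0]])

imputer = KNNImputer(n_neighbors=2)   # k = 2 nearest neighbours
X_filled = imputer.fit_transform(X)   # imputation is independent of the learner
print(X_filled)
```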

4.
This paper proposes a new methodology which combines supervised and unsupervised learning for evaluating power system dynamic security. Based on the concept of stability margin, pre-fault power system conditions are assigned to the output neurons on the two-dimensional grid of the growing hierarchical self-organizing map technique (GHSOM) via supervised artificial neural networks (ANNs), which perform an estimation of the post-fault power system state. The technique estimates the dynamic stability index that corresponds to the most critical value of the synchronizing and damping torques of multimachine power systems. ANN-based pattern recognition is carried out with the growing hierarchical self-organizing feature map in order to provide an adaptive neural network architecture during its unsupervised training process. Numerical tests, carried out on an IEEE 9-bus power system, are presented and discussed. The analysis using this method provides accurate results and improves the effectiveness of system security evaluation.

5.
For Tikhonov regularization in supervised learning from data, the effect on the regularized solution of a joint perturbation of the regression function and the data is investigated. Spectral windows in the finite-sample and population cases are compared via probabilistic estimates of the differences between regularized solutions.

6.
This paper describes in full detail a model of a hierarchical classifier (HC). The original classification problem is broken down into several subproblems, and a weak classifier is built for each of them. Subproblems consist of examples from a subset of the whole set of output classes. It is essential for this classification framework that the generated subproblems overlap, i.e. some individual classes may belong to more than one subproblem. This approach makes it possible to reduce the overall risk. The individual classifiers built for the subproblems are weak, i.e. their accuracy is only a little better than that of a random classifier. The notion of weakness for a multiclass model is extended in this paper, in a way that is more intuitive than the approaches proposed so far. In the HC model described, after a single node is trained, its problem is split into several subproblems using a clustering algorithm, which is responsible for grouping classes that are classified similarly. The main focus of this paper is finding the most appropriate clustering method. Several algorithms are defined and compared. Finally, we compare the whole HC with other machine learning approaches.
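A hedged sketch of the subproblem-generation step only: class centroids are clustered, each class joins its two nearest clusters so that subproblems overlap, and a simple classifier is trained per subproblem. The cluster count, the overlap rule and the base classifier are illustrative assumptions, not the paper's choices.

```python
# Toy illustration of overlapping subproblems built by clustering classes.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

X, y = load_digits(return_X_y=True)
classes = np.unique(y)
centroids = np.array([X[y == c].mean(axis=0) for c in classes])

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(centroids)
dist = km.transform(centroids)                    # class-to-cluster distances
membership = np.argsort(dist, axis=1)[:, :2]      # each class joins its 2 nearest clusters

subproblem_models = []
for k in range(km.n_clusters):
    sub_classes = classes[(membership == k).any(axis=1)]   # overlapping class subset
    mask = np.isin(y, sub_classes)
    clf = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])
    subproblem_models.append((sub_classes, clf))
```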

7.
A general-purpose tool for the automatic generation of structural test data based on control flow and data flow is designed. The tool selects test paths according to the coverage criteria used in control-flow and data-flow testing, and generates test data for the selected paths with an improved iterative relaxation method at its core. The tool also uses the Fibonacci method to optimize path selection, handles infeasible paths, and collects statistics such as branch coverage and DCP coverage of the generated test data. Experimental results show that the tool is feasible.

8.
A new scheme, incorporating dimensionality reduction and clustering, suitable for classification of a large volume of remotely sensed data using a small amount of memory is proposed. The scheme involves transforming the data from multidimensional n-space to a 3-dimensional primary color space of blue, green and red coordinates. The dimensionality reduction is followed by data reduction, which involves assigning 3-dimensional samples to a 2-dimensional array. Finally, a multi-stage ISODATA technique incorporating a novel seedpoint picking method is used to obtain the desired number of clusters.

The storage requirements are reduced to a low value by making five passes through the data and storing necessary information during each pass. The first three passes are used to find the minimum and maximum values of some of the variables. The data reduction is done and a classification table is formed during the fourth pass. The classification map is obtained during the fifth pass. The computer memory required is about 2K machine words.

The efficacy of the algorithm is justified by simulation studies using multispectral LANDSAT data.
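The following sketch mirrors only the overall pipeline shape (reduce to three colour-like coordinates, quantize for data reduction, then cluster); PCA and k-means stand in for the paper's blue/green/red transform and multi-stage ISODATA, and all parameters and data are assumptions.

```python
# Rough pipeline-shape sketch: n-dimensional samples -> 3 "colour" coordinates
# -> coarse quantisation for data reduction -> clustering of the reduced table.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
pixels = rng.random((5000, 6))                    # stand-in multispectral samples, n = 6 bands

rgb = PCA(n_components=3).fit_transform(pixels)   # stand-in for the blue/green/red mapping
rgb -= rgb.min(axis=0)
rgb /= rgb.max(axis=0)                            # normalise to [0, 1]

# Data reduction: bin the 3-D samples so only occupied cells (with counts) are kept.
bins = (rgb * 31).astype(int)                     # 32 levels per colour axis
cells, counts = np.unique(bins, axis=0, return_counts=True)

# k-means stands in for the multi-stage ISODATA step on the reduced table.
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(
    cells, sample_weight=counts)
```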


9.
A neural network classifier, called supervised extended ART (SEART), that incorporates a supervised mechanism into the extended unsupervised ART is presented here. It uses a learning theory called Nested Generalized Exemplar (NGE) theory. At any time, the training instances may or may not have desired outputs; that is, this model can handle supervised and unsupervised learning simultaneously. The unsupervised component finds the cluster relations of instances, and the supervised component learns the desired associations between clusters and classes. In addition, this model has the ability of incremental learning. It works equally well when instances in a cluster belong to different classes. Multi-category and nonconvex classification can also be handled. The experimental results are very encouraging.

10.
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) has become an important tool in breast cancer diagnosis, but evaluation of multitemporal 3D image data holds new challenges for human observers. To aid the image analysis process, we apply supervised and unsupervised pattern recognition techniques for computing enhanced visualizations of suspicious lesions in breast MRI data. These techniques represent an important component of future sophisticated computer-aided diagnosis (CAD) systems and support the visual exploration of spatial and temporal features of DCE-MRI data stemming from patients with confirmed lesion diagnosis. By taking into account the heterogeneity of cancerous tissue, these techniques reveal signals with malignant, benign and normal kinetics. They also provide a regional subclassification of pathological breast tissue, which is the basis for pseudo-color presentations of the image data. Intelligent medical systems are expected to have substantial implications in healthcare politics by contributing to the diagnosis of indeterminate breast lesions by non-invasive imaging.

11.
Digital meridian instruments, traditional Chinese medicine (TCM) health scales and four-diagnosis instruments are auxiliary diagnostic tools commonly used in TCM clinical practice, and they provide a large amount of TCM clinical data. Imbalanced class distributions and cases carrying multiple diagnostic labels are common phenomena in such clinical data. Using sub-health data as an example, machine learning classification methods for imbalanced data are explored; using kidney disease as an example, a hybrid classification model that combines the three auxiliary diagnostic tools is studied; using cardiovascular disease, dyslipidemia and elevated uric acid disorders as examples, multi-label classification methods are explored. All experiments achieve good classification results, and the selected features are consistent with medical theory, which gives them clinical guidance value.
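A purely illustrative sketch of the two classification settings mentioned above (class weighting for the imbalanced task and a one-vs-rest wrapper for the multi-label task), on synthetic stand-in data; the abstract does not state which classifiers were actually used.

```python
# Illustration only: class-weighted model for an imbalanced single-label task
# and a one-vs-rest wrapper for a multi-label task.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))

# Imbalanced two-class labels (e.g. sub-health vs. healthy): weight classes inversely.
y_imbal = (rng.random(200) < 0.1).astype(int)
clf = LogisticRegression(class_weight="balanced", max_iter=500).fit(X, y_imbal)

# Multi-label targets (e.g. several coexisting diagnoses per case).
Y_multi = (rng.random((200, 3)) < 0.3).astype(int)
multi = OneVsRestClassifier(LogisticRegression(max_iter=500)).fit(X, Y_multi)
print(clf.predict(X[:3]), multi.predict(X[:3]))
```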

12.
Pattern Analysis and Applications - The paper proposes a linear unsupervised transfer learning (LUTL) method, for which a cost function is introduced. In the cost function of the proposed LUTL,...

13.
14.
In this paper a classification framework for incomplete data, based on an electrostatic field model, is proposed. An original approach to exploiting incomplete training data with missing features, involving extensive use of the electrostatic charge analogy, has been used. The framework supports a hybrid supervised and unsupervised training scenario, enabling learning simultaneously from both labelled and unlabelled data using the same set of rules and adaptation mechanisms. Classification of incomplete patterns has been facilitated by introducing a local dimensionality reduction technique, which aims at exploiting all available information using the data ‘as is’, rather than trying to estimate the missing values. The performance of all proposed methods has been extensively tested in a wide range of missing data scenarios, using a number of standard benchmark datasets in order to make the results comparable with those available in current and future literature. Several modifications to the original Electrostatic Field Classifier, aiming at improving speed and robustness in higher-dimensional spaces, have also been introduced and discussed.
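Only the "use the data as is" idea is sketched below: similarity to training points is computed over the features an incomplete pattern actually has, instead of estimating the missing values. The electrostatic charge model itself is not reproduced; the nearest-neighbour vote and all data are hypothetical.

```python
# Toy illustration of classifying an incomplete pattern without imputation.
import numpy as np

def masked_vote(x, X_train, y_train, k=3):
    """Classify x (which may contain NaNs) using distances over observed features only."""
    obs = ~np.isnan(x)                            # locally reduce the dimensionality
    d = np.linalg.norm(X_train[:, obs] - x[obs], axis=1)
    nearest = np.argsort(d)[:k]
    vals, counts = np.unique(y_train[nearest], return_counts=True)
    return vals[np.argmax(counts)]

X_train = np.array([[0.0, 0.1, 0.2], [0.9, 1.0, 0.8], [0.1, 0.0, 0.3]])
y_train = np.array([0, 1, 0])
print(masked_vote(np.array([0.05, np.nan, 0.25]), X_train, y_train, k=2))
```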

15.
Algorithms on streaming data have attracted increasing attention in the past decade. Among them, dimensionality reduction algorithms are of particular interest because of the demands of real tasks. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are two of the most widely used dimensionality reduction approaches. However, PCA is not optimal for general classification problems because it is unsupervised and ignores valuable label information for classification. On the other hand, the performance of LDA is degraded when the available low-dimensional space is limited and when the singularity problem is encountered. Recently, the Maximum Margin Criterion (MMC) was proposed to overcome the shortcomings of PCA and LDA. Nevertheless, the original MMC algorithm does not fit the streaming data model for handling large-scale, high-dimensional data sets, so an effective, efficient and scalable approach is needed. In this paper, we propose a supervised incremental dimensionality reduction algorithm and its extension to infer adaptive low-dimensional spaces by optimizing the maximum margin criterion. Experimental results on a synthetic dataset and real datasets demonstrate the superior performance of our proposed algorithm on streaming data.
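For reference, a batch (non-incremental) sketch of the Maximum Margin Criterion itself: data are projected onto the leading eigenvectors of S_b - S_w. The streaming/incremental update proposed in the paper is not reproduced here, and the dataset is only a stand-in.

```python
# Batch MMC: maximize tr(W^T (S_b - S_w) W) by taking top eigenvectors of S_b - S_w.
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
mean = X.mean(axis=0)
Sb = np.zeros((X.shape[1], X.shape[1]))
Sw = np.zeros_like(Sb)
for c in np.unique(y):
    Xc = X[y == c]
    mc = Xc.mean(axis=0)
    Sb += len(Xc) * np.outer(mc - mean, mc - mean)   # between-class scatter
    Sw += (Xc - mc).T @ (Xc - mc)                    # within-class scatter

eigvals, eigvecs = np.linalg.eigh(Sb - Sw)           # symmetric matrix, eigh is safe
W = eigvecs[:, np.argsort(eigvals)[::-1][:2]]        # top-2 margin directions
X_low = X @ W                                        # low-dimensional embedding
```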

16.
张孝, 王珊, 廉娜. 《计算机应用》, 2008, 28(11): 2737-2740
Provenance is particularly important to researchers, especially to scientists judging the correctness and timeliness of data and experiments. With the wide application of database view materialization and data annotation/revision techniques, provenance research is gradually becoming a new research hotspot. Suitable provenance data sets are one of the foundations for testing the functional correctness and performance of new provenance management techniques and algorithms, and being able to generate simulated provenance data that is as realistic as possible before real data are available plays an equally key role in validating and improving such algorithms. A new provenance database generator, ProGen, is presented; it generates a provenance database of the required scale according to the relational schema used by the data provenance and the annotation constraints on the provenance. Experiments show that the given implementation is efficient and scalable.

17.
18.
19.
With the rapid evolution of new engineered surfaces, there is a strong need for developing tools to measure and characterize these surfaces at different scales. In order to obtain all meaningful details of the surface at the various required scales, data fusion can be performed on data obtained from a combination of instruments or technologies. To evaluate fusion methods, well-recognized images like ‘Lena’ are typically used. But surface metrology datasets are distinctly different from those images, since all of their data points are in focus, whereas typical images have a subject in focus and a background at various levels of defocus. So, a performance study was conducted on a wide range of surface samples, and it was shown that Regional Edge Intensity (REI) is the preferred fusion method for surface metrology datasets, with Regional Energy (RE) the second preferred method, when single-scale performance metrics are considered.
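Since the abstract does not define REI, the sketch below illustrates only generic regional-energy fusion: for each block of two co-registered surface maps, the source with the larger local energy is kept. Block size and data are assumptions, not the paper's settings.

```python
# Generic regional-energy fusion of two co-registered surface maps, block by block.
import numpy as np

def regional_energy_fuse(a, b, block=8):
    """Keep, for each block, the source map with the larger local energy."""
    out = a.copy()
    for i in range(0, a.shape[0], block):
        for j in range(0, a.shape[1], block):
            sa = a[i:i + block, j:j + block]
            sb = b[i:i + block, j:j + block]
            if (sb ** 2).sum() > (sa ** 2).sum():  # larger energy wins the block
                out[i:i + block, j:j + block] = sb
    return out

rng = np.random.default_rng(0)
fused = regional_energy_fuse(rng.random((64, 64)), rng.random((64, 64)))
```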

20.
Aydogdu Ozge, Ekinci Murat. Multimedia Tools and Applications, 2020, 79(37-38): 27205-27227
Multimedia Tools and Applications - The characteristics of the data stream have brought enormous challenges to classification algorithms. Concept drift is the most concerning characteristic, and...

