Similar Documents
A total of 20 similar documents were found.
1.
2.
In this paper we present a new approach for building metadata schemas by integrating existing ontologies and structured vocabularies (thesauri). This integration is based on the specification of inclusion relationships between thesaurus terms and ontology concepts and results in application-specific metadata schemas incorporating the structural views of ontologies and the deep classification schemes provided by thesauri. We will also show how the result of this integration can be used for RDF schema creation and metadata querying. In our context, (metadata) queries exploit the inclusion semantics of term relationships, which introduces some recursion. We will present a fairly simple database-oriented solution for querying such metadata which avoids a (recursive) tree traversal and is based on a linear encoding of thesaurus hierarchies. Published online: 22 September 2000
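The "linear encoding of thesaurus hierarchies" mentioned above is reminiscent of interval (pre-order) numbering schemes, in which every term receives an interval that contains the intervals of all of its narrower terms, so that transitive narrower-than queries become simple range comparisons instead of recursive traversals. The following Python sketch illustrates that general idea only; the toy thesaurus, names, and encoding details are illustrative assumptions, not the paper's actual scheme.

    # Sketch: interval (pre-order) encoding of a term hierarchy (illustrative only).
    # Each term gets (left, right) such that descendants lie strictly inside its interval.

    def encode(hierarchy, root):
        """hierarchy: dict term -> list of narrower terms. Returns term -> (left, right)."""
        intervals, counter = {}, [0]

        def visit(term):
            left = counter[0]; counter[0] += 1
            for child in hierarchy.get(term, []):
                visit(child)
            right = counter[0]; counter[0] += 1
            intervals[term] = (left, right)

        visit(root)
        return intervals

    def narrower_or_equal(intervals, term, ancestor):
        """True if `term` is `ancestor` or one of its (transitive) narrower terms."""
        l, r = intervals[term]
        al, ar = intervals[ancestor]
        return al <= l and r <= ar

    thesaurus = {"art": ["painting", "sculpture"], "painting": ["fresco", "oil painting"]}
    enc = encode(thesaurus, "art")
    print(narrower_or_equal(enc, "fresco", "art"))          # True
    print(narrower_or_equal(enc, "sculpture", "painting"))  # False

With such an encoding, the recursive part of a metadata query reduces to a range condition that a relational database can evaluate directly.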

3.
Graphs are a flexible and general formalism providing rich models in various important domains, such as distributed computing, intelligent tutoring systems or social network analysis. In many cases, such models need to take changes in the graph structure into account, that is, changes in the number of nodes or in the graph connectivity. Predicting such changes within graphs can be expected to yield important insight with respect to the underlying dynamics, e.g. with respect to user behaviour. However, predictive techniques in the past have almost exclusively focused on single edges or nodes. In this contribution, we attempt to predict the future state of a graph as a whole. We propose to phrase time series prediction as a regression problem and apply dissimilarity- or kernel-based regression techniques, such as 1-nearest neighbor, kernel regression and Gaussian process regression, which can be applied to graphs via graph kernels. The output of the regression is a point embedded in a pseudo-Euclidean space, which can be analyzed using subsequent dissimilarity- or kernel-based processing methods. We discuss strategies to speed up Gaussian process regression from cubic to linear time and evaluate our approach on two well-established theoretical models of graph evolution as well as two real data sets from the domain of intelligent tutoring systems. We find that simple regression methods, such as kernel regression, are sufficient to capture the dynamics in the theoretical models, but that Gaussian process regression significantly improves the prediction error for real-world data.
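As a rough illustration of the kind of dissimilarity-/kernel-based regression referred to above, the sketch below performs Nadaraya-Watson kernel regression for one-step-ahead prediction from a precomputed kernel vector (such as one obtained from a graph kernel). The kernel values and targets are toy placeholders, not the authors' setup.

    import numpy as np

    def kernel_regression_predict(K_train_query, Y_train):
        """Nadaraya-Watson prediction from precomputed kernel values.
        K_train_query: (n_train,) kernel similarities between training inputs and the query.
        Y_train: (n_train, d) targets (e.g. embeddings of the *next* graph in the series)."""
        w = np.asarray(K_train_query, dtype=float)
        if w.sum() <= 0:                      # degenerate case: fall back to the plain mean
            w = np.ones_like(w)
        w = w / w.sum()
        return w @ np.asarray(Y_train, dtype=float)

    # Toy example: inputs are graphs G_0..G_3, targets are (an embedding of) G_1..G_4.
    K_query = np.array([0.1, 0.3, 0.9, 0.7])   # similarity of each training graph to the current one
    Y_next = np.array([[0.0, 1.0], [0.5, 1.2], [1.0, 1.5], [1.2, 1.7]])
    print(kernel_regression_predict(K_query, Y_next))  # weighted average, dominated by similar graphs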

4.
Skeletal trees are commonly used to express geometric properties of a shape. Accordingly, tree-edit distance is used to compute a dissimilarity between two given shapes. We present a new tree-edit based shape matching method which uses a recent coarse skeleton representation. The coarse skeleton representation allows us to represent both shapes and shape categories in the form of depth-1 trees. Consequently, we can easily integrate the influence of the categories into shape dissimilarity measurements. The new dissimilarity measure gives a better within-group versus between-group separation, and it mimics the asymmetric nature of human similarity judgements.
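For depth-1 trees, a tree-edit style dissimilarity essentially reduces to optimally matching the children of the two roots and charging for unmatched ones. The sketch below does exactly that with the Hungarian algorithm; the per-node costs are illustrative placeholders rather than the paper's actual edit costs.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def depth1_tree_distance(a, b, node_cost=lambda x, y: abs(x - y), del_cost=lambda x: 1.0):
        """Edit-style dissimilarity between two depth-1 trees given as lists of leaf attributes."""
        n, m = len(a), len(b)
        size = max(n, m)
        C = np.zeros((size, size))
        for i in range(size):
            for j in range(size):
                if i < n and j < m:
                    C[i, j] = node_cost(a[i], b[j])   # relabel leaf i of `a` to leaf j of `b`
                elif i < n:
                    C[i, j] = del_cost(a[i])          # delete an unmatched leaf of `a`
                elif j < m:
                    C[i, j] = del_cost(b[j])          # insert an unmatched leaf of `b`
        rows, cols = linear_sum_assignment(C)
        return C[rows, cols].sum()

    print(depth1_tree_distance([0.2, 0.9, 0.5], [0.25, 0.85]))  # small relabel costs + one deletion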

5.
Distance and dissimilarity measures are basic concepts in cluster analysis and lie at the core of many clustering algorithms. In classical cluster analysis, the dissimilarity index is a simple function of distance. For data sets with mixed attributes, this paper proposes two distance definitions and generalizes the dissimilarity measure to a multivariate function of distance, cluster size and other factors, so that clustering algorithms originally applicable only to numerical or categorical data can be applied to mixed-attribute data. Experimental results show that the new distance definitions and dissimilarity measure improve clustering quality.
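A common way to define a distance over mixed numerical and categorical attributes is a Gower-style combination: range-normalized absolute differences for numeric attributes and simple mismatch for categorical ones. The sketch below shows only that baseline idea; it is not the specific distance definitions proposed in the paper.

    import numpy as np

    def mixed_distance(x, y, numeric_idx, categorical_idx, ranges):
        """Gower-style distance between two records with mixed attribute types.
        ranges[j] is the value range of numeric attribute j (used for normalization)."""
        d = 0.0
        for j in numeric_idx:
            d += abs(x[j] - y[j]) / ranges[j] if ranges[j] > 0 else 0.0
        for j in categorical_idx:
            d += 0.0 if x[j] == y[j] else 1.0
        return d / (len(numeric_idx) + len(categorical_idx))

    a = [23, 50000, "engineer", "yes"]
    b = [31, 42000, "teacher", "yes"]
    print(mixed_distance(a, b, numeric_idx=[0, 1], categorical_idx=[2, 3],
                         ranges={0: 60, 1: 100000}))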

6.
7.
This paper proposes a novel method for fast pattern matching based on dissimilarity functions derived from the Lp norm, such as the Sum of Squared Differences (SSD) and the Sum of Absolute Differences (SAD). The proposed method is full-search equivalent, i.e. it yields the same results as the Full Search (FS) algorithm. In order to pursue computational savings, the method deploys a succession of increasingly tighter lower bounds of the adopted Lp norm-based dissimilarity function. Such bounding functions allow for establishing a hierarchy of pruning conditions aimed at skipping rapidly those candidates that cannot satisfy the matching criterion. The paper includes an experimental comparison between the proposed method and other full-search equivalent approaches known in the literature, which proves the remarkable computational efficiency of our proposal.
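The pruning idea can be illustrated with the simplest lower bound of all: since the SAD only grows as rows are accumulated, a candidate can be abandoned as soon as its partial sum exceeds the best score found so far. The sketch below shows this full-search-equivalent early termination; it is only the baseline trick, not the succession of tighter bounds developed in the paper.

    import numpy as np

    def sad_match(image, template):
        """Full-search-equivalent template matching with row-wise early termination (SAD)."""
        H, W = image.shape
        h, w = template.shape
        best_pos, best_sad = None, np.inf
        for y in range(H - h + 1):
            for x in range(W - w + 1):
                sad = 0.0
                for r in range(h):                       # accumulate the SAD row by row
                    sad += np.abs(image[y + r, x:x + w] - template[r]).sum()
                    if sad >= best_sad:                  # partial SAD is a lower bound: prune
                        break
                else:
                    best_sad, best_pos = sad, (y, x)
        return best_pos, best_sad

    img = np.random.rand(64, 64)
    tpl = img[20:28, 30:38].copy()
    print(sad_match(img, tpl))   # expected: ((20, 30), 0.0)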

8.
Segmentation of volumetric data is an important part of many analysis pipelines, but frequently requires manual inspection and correction. While plenty of volume editing techniques exist, it remains cumbersome and error-prone for the user to find and select appropriate regions for editing. We propose an approach to improve volume editing by detecting potential segmentation defects while considering the underlying structure of the object of interest. Our method is based on a novel histogram dissimilarity measure between individual regions, derived from structural information extracted from the initial segmentation. Based on this information, our interactive system guides the user towards potential defects, provides integrated tools for their inspection, and automatically generates suggestions for their resolution. We demonstrate that our approach can reduce interaction effort and supports the user in a comprehensive investigation for high-quality segmentations.
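As a point of reference for "histogram dissimilarity between regions", the sketch below computes a standard chi-squared distance between two normalized intensity histograms; the measure proposed in the paper is structure-aware and more involved, so this is only the generic baseline.

    import numpy as np

    def chi2_histogram_distance(values_a, values_b, bins=32, value_range=(0.0, 1.0)):
        """Chi-squared distance between normalized intensity histograms of two voxel regions."""
        ha, _ = np.histogram(values_a, bins=bins, range=value_range)
        hb, _ = np.histogram(values_b, bins=bins, range=value_range)
        ha = ha / max(ha.sum(), 1)
        hb = hb / max(hb.sum(), 1)
        denom = ha + hb
        denom[denom == 0] = 1.0                  # avoid division by zero for empty bins
        return 0.5 * np.sum((ha - hb) ** 2 / denom)

    region1 = np.random.beta(2, 5, size=5000)    # toy "regions" with different intensity profiles
    region2 = np.random.beta(5, 2, size=5000)
    print(chi2_histogram_distance(region1, region2))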

9.
In this paper, we argue for learning dissimilarity for interactive search in content based image retrieval. In the literature, dissimilarity is often learned via the feature space by feature selection, feature weighting or by adjusting the parameters of a function of the features. Unlike existing techniques, we use feedback to adjust the dissimilarity space independently of the feature space. This has the great advantage that it manipulates dissimilarity directly. To create a dissimilarity space, we use the method proposed by Pekalska and Duin, selecting a set of images called prototypes and computing distances to those prototypes for all images in the collection. After the user gives feedback, we apply active learning with a one-class support vector machine to decide the movement of images such that relevant images stay close together while irrelevant ones are pushed away (the work of Guo). The dissimilarity space is then adjusted accordingly. Results on a Corel dataset of 10000 images and a TrecVid collection of 43907 keyframes show that our proposed approach is not only intuitive but also significantly improves the retrieval performance.
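The dissimilarity-space construction of Pekalska and Duin is simple to state: pick a small set of prototype objects and represent every object by its vector of distances to those prototypes. The sketch below shows that embedding step with Euclidean distance on toy feature vectors; prototype selection and the feedback-driven adjustment described above are omitted.

    import numpy as np

    def dissimilarity_space(X, prototype_idx, metric=lambda a, b: np.linalg.norm(a - b)):
        """Embed every row of X as its vector of distances to the chosen prototypes."""
        P = X[prototype_idx]
        return np.array([[metric(x, p) for p in P] for x in X])   # (n_objects, n_prototypes)

    X = np.random.rand(100, 16)                   # toy image features
    prototypes = [3, 17, 42, 58, 90]              # e.g. chosen by random or systematic selection
    D = dissimilarity_space(X, prototypes)
    print(D.shape)                                # (100, 5): each image lives in prototype-distance space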

10.
As a result of the distribution of interrelated information over several different information systems, the interconnection of information systems has increased in recent years. However, a purely technical interconnection is insufficient for users who need to find their way to the information they are looking for. Thesauri are a proven means to identify documents, e.g., books of interest in a library. For different domains, different thesauri are available, which can be used in information systems as well, e.g., for the indexing and retrieval of data objects. Thus, the interconnection of information systems raises the need to integrate related thesauri. Furthermore, recent advances in open interoperability technologies (World Wide Web, CORBA, and Java) offer the potential for completely new technical solutions for employing thesauri. This paper presents an approach for integrating multiple thesaurus databases. It concentrates on the integration of distributed and heterogeneous thesaurus databases and the integration of multilingual and monolingual thesauri. The software architecture takes advantage of the most advanced Internet and CORBA technology currently available in the public domain and in commercial implementations.

11.
To address the reduced fault detection rate caused by autocorrelation in the residuals of dynamic principal component analysis, a fault detection and diagnosis method based on the dissimilarity of DPCA residuals is proposed. First, dynamic principal component analysis (DPCA) is applied to compute the residual scores of the dynamic process data; next, a moving-window technique combined with a dissimilarity index is used to monitor the state of the residual scores; finally, a variable contribution plot based method is used for fault diagnosis. The method captures the dynamic characteristics of the process through DPCA, while the dissimilarity index, unlike the traditional squared prediction error (SPE), can effectively monitor the process state from autocorrelated residual scores. Simulation experiments on a numerical example and the Tennessee Eastman (TE) process, compared with traditional methods, further confirm the effectiveness of the proposed method.
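The moving-window dissimilarity monitoring can be sketched as follows. DPCA here means PCA on a lag-augmented data matrix, and the dissimilarity index shown is the classical DISSIM index of Kano et al., which compares the covariance structure of a reference window and the current window; the paper's exact index, window sizes and thresholds may differ, so treat this purely as an illustration.

    import numpy as np

    def lagged_matrix(X, lags=1):
        """Augment each sample with `lags` past samples (the 'dynamic' part of DPCA)."""
        n = X.shape[0] - lags
        return np.hstack([X[lags - k : lags - k + n] for k in range(lags + 1)])

    def residual_scores(X_aug, n_pc):
        """Project lag-augmented data onto the residual subspace of a PCA model."""
        Xc = X_aug - X_aug.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        P_res = Vt[n_pc:].T                          # loadings spanning the residual subspace
        return Xc @ P_res

    def dissimilarity_index(W1, W2):
        """Kano-style DISSIM index between two data windows (0 = same covariance structure)."""
        n1, n2 = len(W1), len(W2)
        R1 = np.cov(W1, rowvar=False)
        R2 = np.cov(W2, rowvar=False)
        R = (n1 * R1 + n2 * R2) / (n1 + n2)
        evals, P = np.linalg.eigh(R)
        T = P / np.sqrt(np.maximum(evals, 1e-12))    # whitening transform of the mixed covariance
        lam = np.linalg.eigvalsh(T.T @ R1 @ T)
        p = len(lam)
        return 4.0 / p * np.sum((lam - 0.5) ** 2)

    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 4))
    X[:, 1] += 0.8 * np.roll(X[:, 0], 1)             # toy autocorrelated (dynamic) relation
    E = residual_scores(lagged_matrix(X, lags=1), n_pc=3)
    ref, win = E[:100], E[-100:]
    print(dissimilarity_index(ref, win))             # small value: no fault in this toy data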

12.
13.

14.
Research on a density-based clustering algorithm using dissimilarity selection
Building on the optimizable K-dissimilarity selection algorithm (OptiSim), an extended OptiSim algorithm (EOptiSim) is proposed. Since EOptiSim compensates for the shortcomings of the basic OptiSim method when handling combined and distributed databases, applying OptiSim or EOptiSim as a diversity-oriented representative-subset selection step before the DBSCAN algorithm not only clusters a single large-scale spatial database effectively, but also clusters large-scale combined or distributed databases, while significantly reducing I/O cost and memory requirements. Experimental results show that the proposed algorithms are feasible and effective. A simplified sketch of this kind of subset selection is given below.
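OptiSim-style selection grows a representative subset by repeatedly examining a small random subsample of the remaining candidates and keeping the candidate most dissimilar to what has already been selected. The sketch below implements that generic pattern with Euclidean distance and arbitrary parameter names; it is a simplification, not the EOptiSim algorithm itself.

    import numpy as np

    def optisim_like_selection(X, n_select, subsample_size=10, min_dist=0.0, seed=0):
        """Select a diverse, representative subset of rows of X (OptiSim-flavored sketch)."""
        rng = np.random.default_rng(seed)
        remaining = list(range(len(X)))
        selected = [remaining.pop(rng.integers(len(remaining)))]   # start from a random point
        while remaining and len(selected) < n_select:
            k = min(subsample_size, len(remaining))
            sample = rng.choice(remaining, size=k, replace=False)
            # distance of each candidate to its nearest already-selected point
            d = np.array([min(np.linalg.norm(X[c] - X[s]) for s in selected) for c in sample])
            mask = d >= min_dist
            if not mask.any():
                mask = np.ones_like(mask, dtype=bool)
            keep, d_keep = sample[mask], d[mask]
            best = int(keep[np.argmax(d_keep)])                    # most dissimilar candidate wins
            selected.append(best)
            remaining.remove(best)
        return selected

    X = np.random.rand(1000, 2)
    reps = optisim_like_selection(X, n_select=50)
    # `X[reps]` could then be clustered with DBSCAN instead of the full database.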

15.
A new framework for computing the Euclidean distance and weighted distance from the boundary of a given digitized shape is presented. The distance is calculated with sub-pixel accuracy. The algorithm is based on an equal-distance contour evolution process. The moving contour is embedded as a level set in a time varying function of higher dimension. This representation of the evolving contour makes possible the use of an accurate and stable numerical scheme, due to Osher and Sethian [22]. The relation between the classical shape from shading problem and the weighted distance transform is presented, as well as an algorithm that calculates the geodesic distance transform on surfaces.
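For orientation, the standard level-set formulation used in this family of methods can be summarized as follows; this is the textbook form, stated here as an assumption about the setting rather than a quotation from the paper.

    % The boundary contour is embedded as the zero level set of a higher-dimensional
    % function \phi(x, t), evolved with the standard level-set equation
    \frac{\partial \phi}{\partial t} + F(x)\,\lVert \nabla \phi \rVert = 0,
    \qquad C(t) = \{\, x : \phi(x, t) = 0 \,\},
    % for which the Osher--Sethian schemes provide accurate, stable discretizations.
    % The arrival time T(x) of the front (the weighted distance transform) satisfies
    % the Eikonal equation
    F(x)\,\lVert \nabla T(x) \rVert = 1, \qquad T\rvert_{\partial\Omega} = 0,
    % where \partial\Omega is the shape boundary; the choice F \equiv 1 recovers the
    % Euclidean distance transform.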

16.
In this paper, a new matching pursuits dissimilarity measure (MPDM) is presented that compares two signals using the information provided by their matching pursuits (MP) approximations, without requiring any prior domain knowledge. MPDM is a flexible and differentiable measure that can be used to perform shape-based comparisons and fuzzy clustering of very high-dimensional, possibly compressed, data. A novel prototype-based classification algorithm, which is termed the computer aided minimization procedure (CAMP), is also proposed. The CAMP algorithm uses the MPDM with the competitive agglomeration (CA) fuzzy clustering algorithm to build reliable shape-based prototypes for classification. MP is a well known sparse signal approximation technique, which is commonly used for video and image coding. The dictionary and coefficient information produced by MP has previously been used to define features to build discrimination and prototype-based classifiers. However, existing MP-based classification applications are quite problem-domain specific, thus making their generalization to other problems quite difficult. The proposed CAMP algorithm is the first MP-based classification system that requires no assumptions about the problem domain and builds a bridge between the MP and fuzzy clustering algorithms. Experimental results also show that the CAMP algorithm is more resilient to outliers in test data than the multilayer perceptron (MLP) and support-vector-machine (SVM) classifiers, as well as prototype-based classifiers using the Euclidean distance as their dissimilarity measure.
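Matching pursuit itself is easy to sketch: greedily pick the dictionary atom with the largest correlation to the current residual, record its coefficient, subtract its contribution, and repeat. The code below shows this generic MP approximation with a random normalized dictionary; the MPDM and CAMP constructions described above build on top of such approximations and are not reproduced here.

    import numpy as np

    def matching_pursuit(x, D, n_atoms):
        """Greedy MP approximation of signal x using dictionary D (columns = unit-norm atoms).
        Returns (atom indices, coefficients, residual)."""
        residual = x.astype(float).copy()
        idx, coef = [], []
        for _ in range(n_atoms):
            correlations = D.T @ residual
            j = int(np.argmax(np.abs(correlations)))    # best-matching atom
            c = correlations[j]
            idx.append(j); coef.append(c)
            residual = residual - c * D[:, j]           # remove its contribution
        return idx, coef, residual

    rng = np.random.default_rng(0)
    D = rng.standard_normal((64, 256))
    D /= np.linalg.norm(D, axis=0)                      # normalize atoms
    x = 2.0 * D[:, 5] - 1.5 * D[:, 100] + 0.01 * rng.standard_normal(64)
    idx, coef, r = matching_pursuit(x, D, n_atoms=4)
    print(idx, np.round(coef, 2), np.linalg.norm(r))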

17.
A confidence learning machine based on algorithmic randomness theory and strangeness descriptions
Based on Kolmogorov's theory of algorithmic randomness, a confidence mechanism is established for learning machines and the algorithm of a confidence learning machine is described. It is argued that the computable sequence-randomness description function defined through a sample strangeness function carries the same meaning as the non-computable sequence-randomness description function defined in Kolmogorov's theory of algorithmic randomness. Sample strangeness functions are designed from three different perspectives: distance in the sample space, the degree of support a sample lends to the classification boundary, and the magnitude of the sample's strain; these functions are used to implement the confidence learning machine algorithm. The confidence learning machine achieved satisfactory results in experiments on the Cleveland heart disease data set and on signature verification.
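In the transductive confidence machine literature that builds on the same algorithmic-randomness ideas, a typical computable strangeness function and the randomness level (p-value) derived from it look roughly like the sketch below; the concrete strangeness functions used in the paper differ, so this is only an illustration of the general mechanism.

    import numpy as np

    def knn_strangeness(X, y, i, label):
        """Classic 1-NN strangeness: distance to the nearest same-label sample divided by
        distance to the nearest other-label sample (larger = stranger)."""
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf
        labels = np.asarray(y)
        return d[labels == label].min() / d[labels != label].min()

    def p_value(X, y, new_x, label):
        """Randomness level of assigning `label` to `new_x`, via strangeness ranks."""
        Xa = np.vstack([X, new_x])
        ya = list(y) + [label]
        alphas = [knn_strangeness(Xa, ya, i, ya[i]) for i in range(len(ya))]
        return sum(a >= alphas[-1] for a in alphas) / len(alphas)

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(4, 1, (30, 2))])
    y = [0] * 30 + [1] * 30
    print(p_value(X, y, np.array([3.8, 4.1]), 1), p_value(X, y, np.array([3.8, 4.1]), 0))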

18.
A method for the automatic extraction of words with similar meanings is presented which is based on the analysis of word distribution in large monolingual text corpora. It involves compiling matrices of word co-occurrences and reducing the dimensionality of the semantic space by conducting a singular value decomposition. In this way, problems of data sparseness are reduced and a generalization effect is achieved which considerably improves the results. The method is largely language independent and has been applied to corpora of English, French, German, and Russian, with the resulting thesauri being freely available. For the English thesaurus, an evaluation has been conducted by comparing it to experimental results as obtained from test persons who were asked to give judgements of word similarities. According to this evaluation, the machine-generated results come close to native speakers' performance.
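The pipeline described above (co-occurrence counting, dimensionality reduction by SVD, similarity in the reduced space) can be sketched in a few lines; the corpus, window size and rank below are toy assumptions, not the paper's settings.

    import numpy as np

    def cooccurrence_matrix(sentences, vocab, window=2):
        """Count word co-occurrences within a symmetric context window."""
        index = {w: i for i, w in enumerate(vocab)}
        C = np.zeros((len(vocab), len(vocab)))
        for sent in sentences:
            words = [w for w in sent.split() if w in index]
            for i, w in enumerate(words):
                for j in range(max(0, i - window), min(len(words), i + window + 1)):
                    if j != i:
                        C[index[w], index[words[j]]] += 1
        return C

    def reduced_similarity(C, rank=2):
        """Reduce dimensionality with a truncated SVD and return cosine similarities."""
        U, s, _ = np.linalg.svd(C, full_matrices=False)
        E = U[:, :rank] * s[:rank]                       # low-rank word embeddings
        E = E / np.maximum(np.linalg.norm(E, axis=1, keepdims=True), 1e-12)
        return E @ E.T

    vocab = ["cat", "dog", "pet", "car", "road"]
    sents = ["the cat is a pet", "the dog is a pet", "the car is on the road",
             "a dog and a cat", "the road has a car"]
    S = reduced_similarity(cooccurrence_matrix(sents, vocab))
    print(np.round(S, 2))        # cat/dog should come out more similar than cat/car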

19.
M. Edahiro, Algorithmica, 1996, 16(3): 316-338
The equispreading tree on the plane with Manhattan distance, which is a Steiner tree such that all paths from the root to all leaves have the same length, is analyzed. This problem is not only fundamental in computational geometry but also critical for equidistant routings in VLSI clock design. Several characteristics for the trees are discussed together with an algorithm constructing equispreading trees in the bottom-up fashion. This algorithm achieves linear time and space complexity with respect to the number of leaves, and minimizes the path length from the root to leaves. Furthermore, this paper shows that the shortest-path-length equispreading trees are related to the smallest enclosing circles in Manhattan distance.

20.
There is no known algorithm that solves the general case of the approximate string matching problem with the extended edit distance, where the edit operations are insertion, deletion, mismatch and swap, in time o(nm), where n is the length of the text and m is the length of the pattern. In an effort to study this problem, the edit operations were analysed independently. It turns out that the approximate matching problem with only the mismatch operation can be solved in time O(n√(m log m)). If the only edit operation allowed is swap, then the problem can be solved in time O(n log m log σ), where σ = min(m, |Σ|). In this paper we show that the approximate string matching problem with swap and mismatch as the edit operations can be computed in time O(n√(m log m)). Amihood Amir was partially supported by NSF Grant CCR-01-04494 and ISF Grant 35/05. This work is part of Estrella Eisenberg's M.Sc. thesis. Ely Porat was partially supported by GIF Young Scientists Program Grant 2055-1168.6/2002.
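A standard building block in this line of work is counting, for every alignment of the pattern against the text, how many positions match, which can be done per alphabet symbol with FFT-based convolution. The sketch below shows only that classical primitive (mismatches = m − matches); the swap handling and the subtler techniques behind the stated bounds are not reproduced here.

    import numpy as np

    def mismatch_counts(text, pattern):
        """For each alignment i, count mismatches of `pattern` against text[i:i+m]
        using one FFT-based correlation per alphabet symbol."""
        n, m = len(text), len(pattern)
        size = 1 << (n + m).bit_length()              # FFT length (power of two >= n + m)
        matches = np.zeros(n - m + 1)
        for c in set(pattern):
            t = np.array([1.0 if ch == c else 0.0 for ch in text])
            p = np.array([1.0 if ch == c else 0.0 for ch in pattern])[::-1]   # reverse for correlation
            conv = np.fft.irfft(np.fft.rfft(t, size) * np.fft.rfft(p, size), size)
            matches += np.round(conv[m - 1 : n])      # conv[i+m-1] = matches of pattern at text pos i
        return m - matches                             # mismatch count per alignment

    text, pattern = "abracadabra", "abda"
    print(mismatch_counts(text, pattern))              # e.g. 1 mismatch at alignment 0 ("abra" vs "abda")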

