Similar literature
 Found 20 similar documents; search took 31 ms
1.
An approach to concept learning from examples and concept learning by observation is presented that is based on an intuitive notion of conceptual distance between examples (concepts) and combines symbolic and numerical methods. The approach is based on the observation that very different examples generalize to an expression that is very far from each of them, while identical examples generalize to themselves. Following this idea, the authors propose some domain-independent and intuitively justified estimates for the conceptual distance. A hierarchical conceptual clustering algorithm that groups objects so as to maximize the cohesiveness (a reciprocal of the conceptual distance) of the clusters is presented. It is shown that conceptual clustering can improve learning from complex examples describing objects and the relations between them.

2.
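To improve text clustering and address the shortcomings of traditional clustering algorithms in parameter setting and stability, a new text clustering algorithm, TCBIBK (a Text Clustering algorithm Based on Improved BIRCH and K-nearest neighbor), is proposed. The algorithm takes the BIRCH clustering algorithm as its prototype. During clustering, in addition to checking the distance between a text object and a cluster, it also checks the distance between clusters, actively merges or splits clusters, and sets a dynamic threshold. It also incorporates the KNN classification algorithm to improve clustering stability while maintaining good clustering efficiency. Applied to text clustering, TCBIBK improves clustering quality. Comparative experiments show that the clustering validity and stability of the algorithm are both considerably improved.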

3.
This paper describes the development of an operational prototype for a comprehensive microsimulation model of urban systems. It examines several important design advances that emerged during the transition from a conceptual framework to operational code. ILUTE (Integrated Land Use, Transportation, Environment) simulates the evolution of an integrated urban system over an extended period of time. This model is intended to replace conventional models for the analysis of a broad range of transportation, housing and other urban policies. An overview of the ILUTE framework was presented at the 9th IATBR conference (Miller and Salvini, 2001). Since then, considerable progress has been made on the overall model and its component sub-models. At present, an operational prototype is being tested using data from the Greater Toronto Area. Disaggregate information for the model is synthesized from census data, travel survey data, activity data, and randomly generated proxy data. ILUTE is based on the “ideal model” described in the final report of the Transit Cooperative Research Program’s (TCRP) Project H-12, “Integrated Urban Models for Simulation of Transit and Land-Use Policies” (Miller et al., 1998).

4.
5.
6.
The hierarchy of Symbolic Transition Systems, introduced by Henzinger, Majumdar and Raskin, is an elegant classification tool for some families of infinite-state operational models that support some variants of a symbolic “backward closure” verification algorithm. It was first used and illustrated with families of hybrid systems. In this paper we investigate whether the STS hierarchy can account for classical families of infinite-state systems outside of timed or hybrid systems.

7.
The k-means algorithm is well known for its efficiency in clustering large data sets. However, working only on numeric values prohibits it from being used to cluster real world data containing categorical values. In this paper we present two algorithms which extend the k-means algorithm to categorical domains and domains with mixed numeric and categorical values. The k-modes algorithm uses a simple matching dissimilarity measure to deal with categorical objects, replaces the means of clusters with modes, and uses a frequency-based method to update modes in the clustering process to minimise the clustering cost function. With these extensions the k-modes algorithm enables the clustering of categorical data in a fashion similar to k-means. The k-prototypes algorithm, through the definition of a combined dissimilarity measure, further integrates the k-means and k-modes algorithms to allow for clustering objects described by mixed numeric and categorical attributes. We use the well known soybean disease and credit approval data sets to demonstrate the clustering performance of the two algorithms. Our experiments on two real world data sets with half a million objects each show that the two algorithms are efficient when clustering large data sets, which is critical to data mining applications.
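
The three ingredients named in this abstract (simple matching dissimilarity, frequency-based mode updates, and a combined measure for mixed data) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the weight `gamma` on the categorical part, and the use of squared Euclidean distance for the numeric part are assumptions made for illustration.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Simple matching: count the attributes on which two categorical objects differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(objects):
    """Frequency-based mode: the most frequent category of each attribute."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*objects))


def mixed_dissimilarity(x_num, x_cat, c_num, c_cat, gamma=1.0):
    """Assumed combined measure: squared Euclidean part plus gamma times mismatches."""
    numeric = sum((p - q) ** 2 for p, q in zip(x_num, c_num))
    return numeric + gamma * matching_dissimilarity(x_cat, c_cat)


def k_modes(data, initial_modes, max_iter=100):
    """k-means-style loop with modes in place of means (illustrative only)."""
    modes = [tuple(m) for m in initial_modes]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each object goes to its nearest mode.
        new_labels = [
            min(range(len(modes)), key=lambda j: matching_dissimilarity(x, modes[j]))
            for x in data
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: recompute the mode of every non-empty cluster.
        for j in range(len(modes)):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                modes[j] = cluster_mode(members)
    return labels, modes


data = [
    ("a", "x", "p"), ("a", "x", "q"), ("a", "y", "p"),
    ("b", "z", "r"), ("b", "z", "s"), ("b", "w", "r"),
]
labels, modes = k_modes(data, [data[0], data[3]])
```

With the toy data above, the first three objects end up around one mode and the last three around the other. How ties between equally frequent categories are broken is left to `Counter.most_common`; a real implementation would need to choose a rule deliberately.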

8.
Multi-stream interactive systems can be seen as “hidden adversary” systems (HAS), where the observable behaviour on any interaction channel is affected by interactions happening on other channels. One way of modelling HAS is in the form of a multi-process I/O automaton, where each interacting process appears as a token in a shared state space. Constraints in the state space specify how the dynamics of one process affects other processes. We define the “liveness criterion” of each process as the end objective to be achieved by the process. The problem now for each process is to achieve this objective in the face of unforeseen interferences from other processes. In an earlier paper, it was proposed that this uncertainty can be mitigated by collaboration among the disparate processes. Two types of collaboration philosophies were also suggested: altruistic collaboration and pragmatic collaboration. This paper addresses the HAS validation problem where processes collaborate altruistically.

9.
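A sound choice of cluster prototypes is a prerequisite for correct clustering. To address problems in existing clustering algorithms, such as poorly chosen prototypes and biased estimates of the number of clusters, a clustering algorithm based on a filtering model (CA-FM) is proposed. The algorithm uses the proposed filtering model to remove boundary and noise objects that interfere with clustering, builds an adjacency matrix from the nearest-neighbor relations among core objects, and computes the number of clusters by traversing the matrix. It then sorts the data objects by density factor and selects the cluster prototypes from them. Finally, the remaining objects are assigned to clusters according to their minimum distance to a higher-density object, forming the final clustering. Experiments on synthetic data sets, UCI data sets, and face recognition data sets verify the effectiveness of the algorithm; compared with similar algorithms, CA-FM achieves higher clustering accuracy.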

10.
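Data with mixed numeric and categorical attributes is common in the real world, and k-prototypes is one of the main algorithms for clustering such data. To address the shortcomings of existing mixed-attribute clustering algorithms, an improved k-prototypes algorithm based on distributed centroids and a new dissimilarity measure is proposed. The new algorithm first introduces distributed centroids to represent the cluster centers of categorical attributes, then combines means and distributed centroids to represent the centers of mixed-attribute clusters, and proposes a new dissimilarity measure for computing the distance between a data object and a cluster center. The new measure takes into account the importance of different attributes in the clustering process. Simulation experiments on three real data sets show that the proposed algorithm achieves higher clustering accuracy than traditional clustering algorithms, which verifies its effectiveness.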

11.
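A clustering algorithm based on the visual system   Cited: 15 (self-citations: 0, by others: 15)
For cluster analysis, the way humans perceive structure is as important as the principles of the physical system that generates the data. In designing and analyzing clustering algorithms, mimicking the principal human sensory organ, the visual system, can therefore help solve some basic problems in this field. From this viewpoint, the paper proposes a class of clustering algorithms based on the scale-space theory of the primary visual system and, by introducing a saliency hypothesis, links Weber's law from biophysics to the validity of cluster structure. The resulting clustering algorithms are simple and effective, and can partially answer cluster validity questions related to how humans perceive the structure of data. Our numerical experiments show that the method has broad application prospects.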

12.
13.
W.K. Chiu, K.M. Yu. Computer-Aided Design, 2008, 40(12): 1080-1093
Among the different types of direct digital manufacturing (DDM) technologies, some of them can be used for making functionally graded material (FGM) objects. Apart from specific characteristics of the DDM process being employed, one problem in FGM object fabrication is the generation of the corresponding information complete format so that the functionally graded material information can be realized. In this paper, this issue is addressed and the three-dimensional printing (3DP) process is considered as the DDM technology employed for making an FGM prototype. The property of printing a 3D prototype in color is adopted and a methodology is proposed for representing the mechanical properties of an FGM object by color information. In this methodology, an object is considered as “functionally graded” if its mechanical strength is gradually changed within the object and the mechanical strength arising from gluing powdered material by binders in 3DP is assumed to be proportional to the concentrations of the binders applied in it. If the concentration of each primary color binder is different and a pixel of color is printed in the appropriate proportion, the resultant pixel would have a corresponding binder concentration value. To determine the binder concentration requirements, a computer-aided engineering (CAE) analysis is first carried out and the concentration requirements in different parts of the object are inferred from the CAE analysis result. These requirements are then converted to color information by applying the proposed methodology. As the binder concentrations of different colors are different, a colorful prototype would also be an FGM one.

14.
Noetica is a tool for structuring knowledge about concepts and the relationships between them. It differs from typical information systems in that the knowledge it represents is abstract, highly connected, and includes meta-knowledge (knowledge about knowledge). Noetica represents knowledge using a strongly typed graph data model. By providing a rich type system it is possible to represent conceptual information using formalised structures. A class hierarchy provides a basic classification for all objects. This allows for a consistency of representation that is not often found in “free” semantic networks, and gives the ability to easily extend a knowledge model while retaining its semantics.

Visualisation and query tools are provided for this data model. Visualisation can be used to explore complete sets of link-classes, show paths while navigating through the database, or visualise the results of queries. Noetica supports goal-directed queries (a series of user-supplied goals that the system attempts to satisfy in sequence) and pathfinding queries (where the system finds relationships between objects in the database by following links).


15.
16.
We present a new algorithm, and its distributed implementation, for reducing labeled transition systems modulo strong bisimulation. The base of this algorithm is the Kanellakis–Smolka “naive method”, which has a high theoretical complexity but is successful in practice and well suited to parallelization. This basic approach is combined with optimizations inspired by the Kanellakis–Smolka algorithm for the case of bounded fanout, which has the best known time complexity. The distributed implementation is improved with respect to previous attempts by a better overlap between communication and computation, which results in an efficient usage of both memory and processing power. We also discuss the time complexity of this algorithm and show experimental results with sequential and distributed prototype tools.
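
To make the idea of reducing a labeled transition system modulo strong bisimulation concrete, here is a small sequential Python sketch of naive partition refinement. It is a signature-based variant written for illustration, not the algorithm or the distributed implementation described in the abstract. The function name and the encoding of transitions as `(source, label, target)` triples are assumptions.

```python
def bisimulation_classes(states, transitions):
    """Map each state to a block id; states share an id iff they are strongly bisimilar.

    transitions is a list of (source, label, target) triples.
    """
    block = {s: 0 for s in states}  # start with a single block holding every state
    while True:
        # Signature of a state: the set of (label, block of target) pairs it can reach.
        signature = {s: set() for s in states}
        for src, label, dst in transitions:
            signature[src].add((label, block[dst]))
        # Including the old block in the key ensures blocks are only ever split.
        ids = {}
        new_block = {}
        for s in states:
            key = (block[s], frozenset(signature[s]))
            new_block[s] = ids.setdefault(key, len(ids))
        if len(ids) == len(set(block.values())):
            return new_block  # no block was split: the partition is stable
        block = new_block


states = ["s0", "s1", "s2", "t0", "t1", "u0", "u1"]
transitions = [
    ("s0", "a", "s1"), ("s0", "a", "s2"), ("s1", "b", "s0"), ("s2", "b", "s0"),
    ("t0", "a", "t1"), ("t1", "b", "t0"),
    ("u0", "a", "u1"),  # u1 is a deadlock state, so u0 is not bisimilar to s0
]
classes = bisimulation_classes(states, transitions)
```

In this toy system `s0` and `t0` fall into the same block, as do `s1`, `s2` and `t1`, while `u0` is separated because its `a`-successor cannot perform `b`. Each round recomputes every signature, which is the kind of simple, regular work that is easy to split across machines.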

17.
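V-desModel: a conceptual design description model based on virtual prototypes   Cited: 2 (self-citations: 0, by others: 2)
Yang Qiang, Guo Yang, Peng Yuxing, Li Sikun. Journal of Software (软件学报), 2002, 13(4): 748-753
Traditional conceptual design methods lack realistic means of interaction and therefore struggle to convey the designer's intent intuitively. Conceptual design based on virtual prototypes not only gives designers a lifelike virtual design environment, but also fully reflects the low cost, short cycle, and high flexibility of modern design. Based on the characteristics of conceptual design and a feature classification of virtual prototypes, the conceptual design model V-desModel is proposed. Its core is to describe design objects with a product view model, to incorporate the notion of virtual features into the view model, and to describe the constraint relations among design objects with an extensible "3D entity-constraint graph". V-desModel effectively supports the virtual-prototype-based conceptual design process and goes a good way toward solving, in conceptual design, the product virtual proto…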

18.
The theory of concept (or Galois) lattices provides a simple and formal approach to conceptual clustering. In this paper we present GALOIS, a system that automates and applies this theory. The algorithm utilized by GALOIS to build a concept lattice is incremental and efficient, each update being done in time at most quadratic in the number of objects in the lattice. Also, the algorithm may incorporate background information into the lattice, and through clustering, extend the scope of the theory. The application we present is concerned with information retrieval via browsing, for which we argue that concept lattices may represent major support structures. We describe a prototype user interface for browsing through the concept lattice of a document-term relation, possibly enriched with a thesaurus of terms. An experimental evaluation of the system performed on a medium-sized bibliographic database shows good retrieval performance and a significant improvement after the introduction of background knowledge.
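
As a rough illustration of what the nodes of a concept lattice over a document-term relation look like, the Python sketch below enumerates the formal concepts (extent, intent) of a tiny context. It adds objects one at a time and closes the set of intents under intersection. It is not the GALOIS update procedure and makes no claim about its complexity. The function name and the toy documents and terms are assumptions.

```python
def formal_concepts(context):
    """Return all (extent, intent) pairs of a context given as {object: set of attributes}."""
    all_attributes = frozenset().union(*context.values())
    intents = {all_attributes}  # the intent of the empty extent
    for attributes in context.values():
        # Adding an object can only create intents that are intersections with its row.
        intents |= {frozenset(attributes) & intent for intent in intents}
    return {
        (frozenset(obj for obj, attrs in context.items() if intent <= attrs), intent)
        for intent in intents
    }


# Hypothetical document-term relation.
documents = {
    "d1": {"clustering", "lattice"},
    "d2": {"lattice", "retrieval"},
    "d3": {"lattice"},
}
concepts = formal_concepts(documents)
```

Each concept pairs a set of documents with exactly the terms they all share, so browsing from one concept to a neighbour corresponds to adding or dropping a query term.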

19.
CID: an efficient complexity-invariant distance for time series   Cited: 1 (self-citations: 1, by others: 0)
The ubiquity of time series data across almost all human endeavors has produced a great interest in time series data mining in the last decade. While dozens of classification algorithms have been applied to time series, recent empirical evidence strongly suggests that simple nearest neighbor classification is exceptionally difficult to beat. The choice of distance measure used by the nearest neighbor algorithm is important, and depends on the invariances required by the domain. For example, motion capture data typically requires invariance to warping, and cardiology data requires invariance to the baseline (the mean value). Similarly, recent work suggests that for time series clustering, the choice of clustering algorithm is much less important than the choice of distance measure used. In this work we make a somewhat surprising claim. There is an invariance that the community seems to have missed, complexity invariance. Intuitively, the problem is that in many domains the different classes may have different complexities, and pairs of complex objects, even those which subjectively may seem very similar to the human eye, tend to be further apart under current distance measures than pairs of simple objects. This fact introduces errors in nearest neighbor classification, where some complex objects may be incorrectly assigned to a simpler class. Similarly, for clustering this effect can introduce errors by “suggesting” to the clustering algorithm that subjectively similar, but complex objects belong in a sparser and larger diameter cluster than is truly warranted. We introduce the first complexity-invariant distance measure for time series, and show that it generally produces significant improvements in classification and clustering accuracy. We further show that this improvement does not compromise efficiency, since we can lower bound the measure and use a modification of triangular inequality, thus making use of most existing indexing and data mining algorithms.
We evaluate our ideas with the largest and most comprehensive set of time series mining experiments ever attempted in a single work, and show that complexity-invariant distance measures can produce improvements in classification and clustering in the vast majority of cases.
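
The abstract does not state the formula of the measure. One way to picture a complexity-invariant distance is to scale the Euclidean distance by a correction factor that grows when the two series differ in complexity. The Python sketch below does this, estimating complexity as the length of the series when "stretched out" (the root of the summed squared successive differences). Treat the formula, the function names, and the handling of constant series as assumptions made for illustration, not as the paper's definition.

```python
import math


def euclidean(q, c):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))


def complexity_estimate(t):
    """Assumed complexity estimate: root of summed squared successive differences."""
    return math.sqrt(sum((t[i + 1] - t[i]) ** 2 for i in range(len(t) - 1)))


def complexity_invariant_distance(q, c):
    """Euclidean distance scaled by the ratio of the larger to the smaller complexity."""
    ce_q, ce_c = complexity_estimate(q), complexity_estimate(c)
    low, high = min(ce_q, ce_c), max(ce_q, ce_c)
    if high == 0.0:
        return euclidean(q, c)  # both series are constant: no correction (assumption)
    if low == 0.0:
        return math.inf  # one constant, one varying series: ratio undefined (assumption)
    return euclidean(q, c) * (high / low)


smooth = [0.0, 1.0, 2.0, 3.0]
jagged = [0.0, 2.0, 0.0, 2.0]
distance = complexity_invariant_distance(smooth, jagged)
```

Because the correction factor is at least 1, the scaled distance never falls below the plain Euclidean distance, which is consistent with the abstract's remark that the measure can be lower bounded.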

20.
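Research on conceptual clustering based on bipartite graphs   Cited: 1 (self-citations: 0, by others: 1)
In traditional conceptual clustering algorithms, updating and storing clusters depends not only on the number of objects and the number of attributes but also on the number of attribute values, a limitation that makes them unsuitable for large data sets. A new bipartite-graph-based conceptual clustering algorithm (BGBCC) is proposed. By obtaining the set of approximate maximal ε-pairs of the bipartite graph, the algorithm efficiently clusters data together with their associated attributes. Experiments show that the algorithm produces good clustering results and can perform conceptual clustering of large data sets in a relatively short time.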
