Similar documents
20 similar documents found (search time: 812 ms)
1.
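Large volumes of unstructured data and composite objects are key characteristics of engineering databases. This paper describes how the engineering database management system EDBMS/2 manages these two kinds of complex objects and how they are queried in SQL. EDBMS/2 introduces FILE-type and VAR-type attributes to store unstructured data in diverse forms and, through user-defined methods, supports multimedia data management. EDBMS/2 allows a group of records from different tables to be linked through relationships into a composite object. The "path expressions" presented in the paper provide a powerful tool for complex queries over composite objects.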

2.
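A feasibility study of knowledge discovery using an attribute-oriented induction algorithm   Total citations: 1, self-citations: 0, citations by others: 1
The attribute-oriented induction method carries out the knowledge discovery process through relational, or attribute-oriented, operations. Attribute-oriented induction also means that the core and foundation of the knowledge acquisition tool is a set of action primitives, including problem formulation, knowledge generalization, and human-computer interaction. The paper focuses on knowledge generation rules and a confidence-based induction strategy, and designs an algorithm based on a wild soybean database.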
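
The generalization step of attribute-oriented induction can be illustrated with a minimal sketch. The concept hierarchy `HIERARCHY`, the function `generalize`, and the toy tuples are all hypothetical; the abstract does not give the paper's primitives or data, so this only shows the general idea of climbing a concept hierarchy and merging identical tuples.

```python
from collections import Counter

# Hypothetical one-level concept hierarchy: specific value -> general concept.
HIERARCHY = {
    "city": {"Beijing": "North", "Harbin": "North", "Guangzhou": "South"},
}

def generalize(rows, attribute, hierarchy):
    """Replace each value of `attribute` by its parent concept, then merge
    identical generalized tuples and count how many originals each covers."""
    mapping = hierarchy[attribute]
    generalized = []
    for row in rows:
        new_row = dict(row)
        # Values without a parent concept are left unchanged.
        new_row[attribute] = mapping.get(row[attribute], row[attribute])
        generalized.append(tuple(sorted(new_row.items())))
    return Counter(generalized)

# Toy relation (illustrative values only).
rows = [
    {"city": "Beijing", "yield": "high"},
    {"city": "Harbin", "yield": "high"},
    {"city": "Guangzhou", "yield": "low"},
]
counts = generalize(rows, "city", HIERARCHY)
for tuple_, n in counts.items():
    print(dict(tuple_), "covers", n, "original tuple(s)")
```

The count attached to each generalized tuple is what a confidence-based strategy could later use to decide which rules to keep.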

3.
耿翩飞 《电脑》2000,(9):37-38
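Apart from browsing, e-mail is the most important and most frequent use of the Internet. Yet for newcomers with no one to help them, setting up e-mail can be a daunting obstacle. This article walks readers through everything they need to know about e-mail so that the obstacle becomes an open road.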

4.
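A data preprocessing algorithm based on adaptive clustering (I)   Total citations: 1, self-citations: 0, citations by others: 1
A logical model of KDD is proposed. Taking data in a database or data warehouse as an example, SQL commands are used to filter out irrelevant attributes according to prior knowledge or the likely mining goals, producing an induced or summary database based on some concept hierarchy. For the attributes in the database, an unsupervised learning algorithm is used to obtain the corresponding clusters, forming a task-oriented target data subset that safeguards the quality and validity of the data mining results.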
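
The SQL filtering step can be sketched as follows, using Python's built-in `sqlite3` module. The table `samples`, its columns, and the age bands are assumptions made up for illustration; the abstract does not specify a schema or a concept hierarchy, and the clustering step is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (id INTEGER, name TEXT, age INTEGER, income REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?, ?)",
    [(1, "a", 25, 3000.0), (2, "b", 47, 8000.0), (3, "c", 31, 4500.0)],
)

# Drop attributes assumed irrelevant to the mining goal (id, name) and
# generalize age into bands, i.e. one level of an assumed concept hierarchy.
conn.execute(
    """
    CREATE TABLE target AS
    SELECT CASE WHEN age < 30 THEN 'young'
                WHEN age < 45 THEN 'middle'
                ELSE 'senior' END AS age_band,
           income
    FROM samples
    """
)
target_rows = conn.execute(
    "SELECT age_band, income FROM target ORDER BY income"
).fetchall()
print(target_rows)
```

The resulting `target` table plays the role of the induced database on which an unsupervised learner would then run.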

5.
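A data preprocessing algorithm based on adaptive clustering (I)   Total citations: 4, self-citations: 1, citations by others: 4
A logical model of KDD is proposed. Taking data in a database or data warehouse as an example, SQL commands are used to filter out irrelevant attributes according to prior knowledge or the likely mining goals, producing an induced or summary database based on some concept hierarchy. For the attributes in the database, an unsupervised learning algorithm is used to obtain the corresponding clusters, forming a task-oriented target data subset that safeguards the quality and validity of the data mining results.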

6.
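Designing and writing software with object-oriented programming and its development tools has become a defining feature of software development in the 1990s. From the perspective of object-oriented programming, this article examines the features and basic structure of POWERBUILDER, a general-purpose database development tool for CLIENT/SERVER environments, and its approach to developing database applications. This is an introductory article on POWERBUILDER; details will be covered in follow-up articles.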

7.
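Research on a heuristic knowledge acquisition method   Total citations: 3, self-citations: 0, citations by others: 3
Inductive learning is an effective approach to automatic knowledge acquisition. To address problems with the ID3 algorithm, rough-set-based inductive learning, and some other inductive learning methods, a new inductive learning algorithm, ITIL, is proposed. The algorithm uses information gain as its heuristic, selects as few important attributes or attribute combinations as possible, and extracts rules on the basis of discernibility. Many examples show that the resulting rules are simple and have little redundancy. As part of a knowledge acquisition module, ITIL has been integrated into the dynamic knowledge base subsystem of a "knowledge-discovery-based medical diagnosis support system".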
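
The information gain heuristic named above can be sketched in a few lines. This is the standard ID3-style definition, not the ITIL algorithm itself, whose attribute selection and rule extraction details are not given in the abstract; the toy records are invented for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting `rows` on `attribute`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Toy records (illustrative only).
rows = [
    {"fever": "yes", "cough": "yes", "flu": "yes"},
    {"fever": "yes", "cough": "no", "flu": "yes"},
    {"fever": "no", "cough": "yes", "flu": "no"},
    {"fever": "no", "cough": "no", "flu": "no"},
]
# Pick the attribute with the highest gain as the most important one.
best = max(["fever", "cough"], key=lambda a: information_gain(rows, a, "flu"))
print(best)
```

In this toy table `fever` separates the classes perfectly while `cough` carries no information, so a gain-driven selection keeps only `fever`.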

8.
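Attribute-oriented induction and conceptual clustering   Total citations: 2, self-citations: 0, citations by others: 2
Attribute-oriented induction is a recently proposed method widely used for knowledge discovery in databases. This paper points out the close connection between this method and a machine learning method, conceptual clustering, and describes how to use a conceptual clustering algorithm to perform attribute-oriented induction.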

9.
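An inductive learning method for knowledge representation systems based on rough set theory   Total citations: 45, self-citations: 2, citations by others: 43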
吴福保  李奇 《控制与决策》1999,14(3):206-211
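Based on rough set (RS) theory, a new inductive learning method is proposed for knowledge representation systems. The simplification of condition attributes, the computation of the core value table, and the reduction of decision rules are discussed in detail, and corresponding algorithms are given. The method offers a new approach to machine learning and to machine discovery from databases.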
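
Condition attribute simplification in rough set theory rests on the dependency degree of the decision on the condition attributes. The sketch below uses the textbook definitions (indiscernibility classes, positive region, core); the function names and the toy decision table are hypothetical, and the paper's own algorithms are not reproduced here.

```python
def partition(rows, attrs):
    """Group row indices into indiscernibility classes over `attrs`."""
    blocks = {}
    for i, r in enumerate(rows):
        blocks.setdefault(tuple(r[a] for a in attrs), []).append(i)
    return list(blocks.values())

def dependency(rows, cond_attrs, decision):
    """Fraction of rows in the positive region, i.e. rows whose
    indiscernibility class agrees on the decision value."""
    positive = 0
    for block in partition(rows, cond_attrs):
        if len({rows[i][decision] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)

def core(rows, cond_attrs, decision):
    """Attributes whose removal lowers the dependency degree."""
    full = dependency(rows, cond_attrs, decision)
    return [
        a for a in cond_attrs
        if dependency(rows, [b for b in cond_attrs if b != a], decision) < full
    ]

# Toy decision table (illustrative only): the decision d follows attribute b.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": "n"},
    {"a": 0, "b": 1, "c": 0, "d": "y"},
    {"a": 1, "b": 0, "c": 1, "d": "n"},
    {"a": 1, "b": 1, "c": 1, "d": "y"},
]
print(core(rows, ["a", "b", "c"], "d"))
```

Attributes outside the core are candidates for removal, which is the starting point for reducing both the attribute set and the decision rules.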

10.
Attribute-oriented induction and conceptual clustering   Total citations: 3, self-citations: 1, citations by others: 3
伍小荣  谢立宏 《计算机工程》2003,29(5):92-93,123
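Attribute-oriented induction is a recently proposed method widely used for knowledge discovery in databases. The article points out the close connection between this method and a machine learning method, conceptual clustering, and describes how to use a conceptual clustering algorithm to perform attribute-oriented induction.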

11.
李宏丽 《数字社区&智能家居》2009,5(6):4252-4253,4256
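Crop growth monitoring and yield estimation have long been important applications of remote sensing, and a good crop classification algorithm is critical to monitoring crop yield and growth. For some specialty crops, however, research in this area is still lacking. This study therefore designs function modules for growth monitoring and yield estimation suited to specialty crops, applies data mining and knowledge discovery to an expert classification algorithm, and develops an inductive learning algorithm suited to crop data discovery and mining. It makes full use of the large volume of spectral data, related attributes, and spatial data in a spectral library to build a spectral-library-based intelligent expert classification system for specialty crops.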

12.
Extensive research has been performed on developing knowledge-based intelligent monitoring systems for improving the reliability of manufacturing processes. Because obtaining knowledge from human experts is expensive, new techniques are needed to obtain the knowledge automatically from collected data using data mining. Inductive learning has become one of the most widely used data mining methods for generating decision rules from data. To deal with the noise and uncertainties in data collected from industrial processes and systems, this paper presents a new method that uses fuzzy logic techniques to improve the performance of the classical inductive learning approach. In contrast to the classical inductive learning method, which uses hard cut points to discretize continuous-valued attributes, the proposed approach uses soft discretization to make the system less sensitive to uncertainties and noise. The effectiveness of the proposed approach is illustrated in an application that monitors machining conditions in an uncertain environment. Experimental results show that this new fuzzy inductive learning method gives improved accuracy compared with classical inductive learning techniques.
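
The contrast between a hard cut point and soft discretization can be sketched as follows. The linear ramp of half-width `width` around the cut is an assumed membership shape chosen for simplicity; the abstract does not say which membership functions the paper uses.

```python
def hard_membership(x, cut):
    """Crisp discretization: returns (low, high) as 0/1 memberships."""
    return (1.0, 0.0) if x <= cut else (0.0, 1.0)

def soft_membership(x, cut, width):
    """Soft discretization: membership in 'high' rises linearly from 0 to 1
    over the interval [cut - width, cut + width] (assumed shape)."""
    if x <= cut - width:
        high = 0.0
    elif x >= cut + width:
        high = 1.0
    else:
        high = (x - (cut - width)) / (2 * width)
    return (1.0 - high, high)

# A reading just above the cut flips class under the hard cut but
# changes only slightly under the soft one.
for x in (49.9, 50.1):
    print(x, hard_membership(x, 50), soft_membership(x, 50, 10))
```

With the soft version, a small amount of measurement noise near the cut shifts the memberships slightly instead of sending the sample down a different branch of the rule set.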

13.
On Detecting Spatial Outliers   Total citations: 1, self-citations: 1, citations by others: 0
The ever-increasing volume of spatial data has greatly challenged our ability to extract useful but implicit knowledge from them. As an important branch of spatial data mining, spatial outlier detection aims to discover the objects whose non-spatial attribute values are significantly different from the values of their spatial neighbors. These objects, called spatial outliers, may reveal important phenomena in a number of applications including traffic control, satellite image analysis, weather forecasting, and medical diagnosis. Most existing spatial outlier detection algorithms focus on identifying single-attribute outliers and could misclassify normal objects as outliers when their neighborhoods contain real spatial outliers with very large or small attribute values. In addition, many spatial applications contain multiple non-spatial attributes which should be processed together to identify outliers. To address these two issues, we formulate the spatial outlier detection problem in a general way, design two robust detection algorithms, one for a single attribute and the other for multiple attributes, and analyze their computational complexities. Experiments were conducted on a real-world data set, West Nile virus data, to validate the effectiveness of the proposed algorithms.
Feng Chen (corresponding author)
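
The single-attribute case can be illustrated with a generic robust sketch: compare each object with the median of its spatial neighbors, then score the differences against their median absolute deviation. This is an assumed, simplified scheme, not the paper's algorithms (which the abstract does not specify); the function names, the threshold of 3, and the toy chain of readings are all hypothetical.

```python
from statistics import median

def neighbor_differences(values, neighbors):
    """values: {id: attribute}; neighbors: {id: [neighbor ids]}.
    Returns each object's value minus the median of its neighbors' values."""
    return {i: values[i] - median(values[j] for j in neighbors[i]) for i in values}

def spatial_outliers(values, neighbors, threshold=3.0):
    """Flag objects whose neighbor difference is far from the typical one,
    measured in units of the median absolute deviation (MAD)."""
    diffs = neighbor_differences(values, neighbors)
    center = median(diffs.values())
    mad = median(abs(d - center) for d in diffs.values())
    if mad == 0:
        return [i for i, d in diffs.items() if d != center]
    return [i for i, d in diffs.items() if abs(d - center) / mad > threshold]

# Toy data: nine sites on a line; each site's neighbors are the sites
# within two positions of it. Site 4 carries an unusually large reading.
readings = [10, 11, 10, 12, 50, 11, 10, 12, 11]
values = dict(enumerate(readings))
neighbors = {
    i: [j for j in (i - 2, i - 1, i + 1, i + 2) if 0 <= j < len(readings)]
    for i in values
}
print(spatial_outliers(values, neighbors))
```

Using the neighbor median rather than the mean keeps the extreme reading at site 4 from dragging its neighbors' comparison values upward, which is the misclassification risk the abstract describes.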

14.
The problem of knowledge acquisition has been recognized as the major bottleneck in the development of knowledge-based systems. An encouraging approach to alleviating this problem is inductive learning. Inductive learning systems accept, as input, a set of data that represent instances of the problem domain and produce, as output, the rules of the knowledge base. Each data item is described by a set of attribute values and is assigned to a unique decision class. A common characteristic of existing inductive learning systems is that they are empirical in nature and do not take into account the implications of the inductive rule generation process for the performance of the resulting set of rules. That performance is assessed when the rules are used to classify new unlabelled data. This paper demonstrates that the performance of a rule set is a function of the rule generation and rule interpretation processes. These two processes are interrelated and should not be considered separately. The interrelation of rule generation and rule interpretation is analysed, and suggestions for improving the performance of existing inductive learning systems are put forward.

15.
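An inductive learning method for decision tables with continuous condition attribute values   Total citations: 1, self-citations: 0, citations by others: 1
An inductive learning method is proposed for decision tables composed of continuous condition attribute values and discrete decision attribute values. The continuous condition attribute values in the decision table are treated as a matrix, and singular value decomposition of that matrix is used to determine the number of condition attributes of the decision table. Fuzzy C-means clustering is applied to the continuous condition attribute values with different numbers of clusters, yielding a discrete decision table for each number of clusters; the condition attributes of these tables are then simplified, giving different numbers of condition attributes. The final number of clusters is determined by comparing the number of condition attributes obtained from the singular value decomposition with the numbers obtained after simplifying the discrete decision tables, while also taking the number of decision attributes into account. On this basis, an inductive learning method for decision tables with continuous condition attribute values and discrete decision attribute values is given, and its effectiveness is verified.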

16.
Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point process methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are 2nd-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods.
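
An entropy over co-occurrences can be sketched for the simplest case, order 2: count the pairs of marks that fall within a given distance of each other and take the Shannon entropy of the resulting distribution. The distance rule, the function name `cooccurrence_entropy`, and the toy points are assumptions for illustration; the paper's exact construction at higher orders is not given in the abstract.

```python
import math
from collections import Counter
from itertools import combinations

def cooccurrence_entropy(points, radius):
    """points: list of (x, y, mark). Counts unordered pairs of marks whose
    points lie within `radius` of each other, then returns the Shannon
    entropy (in bits) of the distribution of pair types."""
    counts = Counter()
    for (x1, y1, m1), (x2, y2, m2) in combinations(points, 2):
        if math.hypot(x1 - x2, y1 - y2) <= radius:
            counts[tuple(sorted((m1, m2)))] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Toy marked point pattern (illustrative only).
points = [(0, 0, "a"), (1, 0, "b"), (0, 1, "a"), (5, 5, "b")]
h = cooccurrence_entropy(points, radius=1.5)
print(round(h, 4))
```

Low entropy means a few pair types dominate the local co-occurrences, which is the kind of structure an exploratory analysis would then inspect.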

17.
Inductive learning is a method for automated knowledge acquisition. It converts a set of training data into a knowledge structure. In the process of knowledge induction, statistical techniques can play a major role in improving performance. In this paper, we investigate the competition and integration between traditional statistical methods and inductive learning methods. First, the competition between these two approaches is examined. Then, a general framework for integrating these two approaches is presented. This framework suggests three possible integrations: (1) statistical methods as preprocessors for inductive learning, (2) inductive learning methods as preprocessors for statistical classification, and (3) the combination of the two methods to develop new algorithms. Finally, empirical evidence concerning these three possible integrations is discussed. The general conclusion is that algorithms integrating statistical and inductive learning concepts are likely to make the most improvement in performance.

18.
Mining fuzzy association rules for classification problems   Total citations: 3, self-citations: 0, citations by others: 3
The effective development of data mining techniques for the discovery of knowledge from training samples for classification problems in industrial engineering is necessary in applications such as group technology. This paper proposes a learning algorithm, which can be viewed as a knowledge acquisition tool, to effectively discover fuzzy association rules for classification problems. The consequent part of each rule is one class label. The proposed learning algorithm consists of two phases: one to generate large fuzzy grids from training samples by fuzzy partitioning in each attribute, and the other to generate fuzzy association rules for classification problems from the large fuzzy grids. The proposed learning algorithm is implemented by scanning training samples stored in a database only once and applying a sequence of Boolean operations to generate fuzzy grids and fuzzy rules; therefore, it can be easily extended to discover other types of fuzzy association rules. The simulation results from the iris data demonstrate that the proposed learning algorithm can effectively derive fuzzy association rules for classification problems.

19.
Two exploratory data analysis techniques, the comap and the quad plot, are shown to have both strengths and shortcomings when analysing spatial multivariate datasets. A hybrid of these two techniques is proposed: the quad map, which is shown to overcome the outlined shortcomings when applied to a dataset containing weather information for disaggregate incidents of urban fires. In common with the quad plot, the quad map uses Polya models to articulate the underlying assumptions behind histograms. The Polya model formalises the situation in which past fire incident counts are computed and displayed in (multidimensional) histograms as appropriate assessments of conditional probability, providing valuable diagnostics such as posterior variance, i.e. sensitivity to new information. Finally, we discuss how new technology, in particular Online Analytical Processing (OLAP) and Geographical Information Systems (GISs), offers potential for automating exploratory spatial data analysis techniques such as the quad map.

20.

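Attribute reduction is a research focus in machine learning and knowledge discovery, and measuring attribute significance is the key step in constructing attribute reduction algorithms. For incomplete mixed information systems, a new integrated measure of attribute significance, the neighborhood combination measure, is defined under the neighborhood relation, and an attribute reduction algorithm based on it (NCMAR) is proposed. Experiments on several UCI data sets show that NCMAR can directly handle mixed information systems in which symbolic and numerical attributes coexist, is also applicable to incomplete information systems, and achieves high classification accuracy while producing small reducts.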
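
A neighborhood relation for incomplete mixed data can be sketched as below. The rule used here (missing values match anything, symbolic values must be equal, numerical values must differ by at most `delta`) is one common convention and is an assumption; the abstract does not state the paper's exact definition, and the neighborhood combination measure itself is not reproduced.

```python
MISSING = None  # marker for an unknown attribute value (assumed convention)

def compatible(u, v, delta):
    """True if two attribute values cannot be told apart."""
    if u is MISSING or v is MISSING:
        return True  # a missing value is treated as matching anything
    if isinstance(u, (int, float)) and isinstance(v, (int, float)):
        return abs(u - v) <= delta  # numerical attribute: within delta
    return u == v  # symbolic attribute: exact match

def neighborhood(objects, index, attrs, delta):
    """Indices of all objects in the neighborhood of objects[index] on `attrs`."""
    x = objects[index]
    return [
        j for j, y in enumerate(objects)
        if all(compatible(x[a], y[a], delta) for a in attrs)
    ]

# Toy mixed, incomplete table (illustrative only).
objects = [
    {"color": "red", "size": 0.50},
    {"color": "red", "size": 0.55},
    {"color": MISSING, "size": 0.90},
    {"color": "blue", "size": 0.52},
]
print(neighborhood(objects, 0, ["color", "size"], delta=0.1))
print(neighborhood(objects, 2, ["color", "size"], delta=0.1))
```

Smaller neighborhoods mean the chosen attributes discern objects better, which is the kind of information an attribute significance measure builds on.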