Similar documents
 20 similar documents found (search time: 203 ms)
1.
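Binary-based algorithm for mining long frequent itemsets   Total citations: 1, self-citations: 1, citations by others: 0
Building on the top-down search strategy for mining long frequent itemsets, this paper proposes a binary-based algorithm for mining long frequent itemsets. The algorithm generates candidates by searching in order of decreasing numeric value. In addition to using frequent itemsets to prune their subsets and so reduce the number of candidates, it uses transaction characteristics to reduce the number of transactions searched, and it computes support counts with the binary logical AND operation, which improves efficiency. Analysis and experiments indicate that the algorithm is effective and fast.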
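
The AND-based support count mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes each transaction and each candidate itemset is encoded as an integer bitmask over a fixed item ordering, and the names `to_mask`, `support`, `index`, and the sample transactions are hypothetical.

```python
def to_mask(items, index):
    """Encode a set of items as an integer bitmask (bit i set = item i present)."""
    mask = 0
    for item in items:
        mask |= 1 << index[item]
    return mask


def support(candidate_mask, transaction_masks):
    """Count the transactions that contain every item of the candidate.

    A transaction t contains the candidate c exactly when (t AND c) == c.
    """
    return sum(1 for t in transaction_masks if (t & candidate_mask) == candidate_mask)


# Hypothetical data: four items and four transactions.
index = {"a": 0, "b": 1, "c": 2, "d": 3}
transactions = [{"a", "b", "c"}, {"a", "c"}, {"b", "d"}, {"a", "b", "c", "d"}]
masks = [to_mask(t, index) for t in transactions]

print(support(to_mask({"a", "c"}, index), masks))  # 3
print(support(to_mask({"b", "d"}, index), masks))  # 2
```

One integer comparison per transaction replaces a subset test on sets, which is the kind of saving the abstract attributes to the binary representation.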

2.
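Attribute reduction is an important topic in rough set theory, and its central task is to obtain the core of the attribute set. This paper proposes an algorithm for computing the attribute core based on binary operations; the algorithm is simple, intuitive, and easy to implement. Its effectiveness was verified with a program written in C.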

3.
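Rough set model based on granular computing   Total citations: 2, self-citations: 1, citations by others: 1
Upper and lower approximations are the basic definitions of rough sets; they allow imprecise concepts to be discussed in terms of precise sets, and rough sets realize G. Frege's idea of a boundary through a computable boundary region. However, neither the algebraic definition of rough sets nor its various extended models provides a simple, intuitive algorithm for counting the elements of the boundary. Building on binary granular computing, this paper defines the granular matrix and granular matrix operations and establishes a knowledge representation method and a rough set model based on granular computing. These yield granular-matrix representations of the basic rough set concepts and fast granular-matrix methods for computing them, providing a theoretical foundation for knowledge discovery algorithms based on granular computing. An example shows that the membership-function definitions of rough inclusion and rough equality are not necessary and sufficient conditions, and necessary and sufficient conditions for rough inclusion and rough equality based on granular computing are given.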
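
To make the lower approximation, upper approximation, and boundary count concrete, here is a minimal sketch using the standard rough set definitions. It does not reproduce the paper's granular matrix; it assumes each equivalence class (granule) and the target set are integer bitmasks over the objects, and the function names and sample data are hypothetical.

```python
def approximations(classes, target):
    """Return (lower, upper) approximations of `target` as bitmasks.

    lower: union of the equivalence classes fully contained in the target.
    upper: union of the equivalence classes that intersect the target.
    """
    lower = 0
    upper = 0
    for granule in classes:
        if (granule & target) == granule:
            lower |= granule
        if granule & target:
            upper |= granule
    return lower, upper


def boundary_size(classes, target):
    """Count the objects in the boundary region (upper minus lower)."""
    lower, upper = approximations(classes, target)
    return bin(upper & ~lower).count("1")


# Hypothetical universe of six objects (bits 0..5) in three equivalence classes.
classes = [0b000011, 0b001100, 0b110000]
target = 0b000111  # objects 0, 1, 2

lower, upper = approximations(classes, target)
print(bin(lower), bin(upper))          # 0b11 0b1111
print(boundary_size(classes, target))  # 2
```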

4.
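Granular computing is a theoretical approach to handling uncertain data that covers rough sets, fuzzy sets, quotient spaces, computing with words, and more. At present, data granulation and computation on granules mainly involve set operations and measures, and the inefficiency of set operations limits where granular computing algorithms can be applied. To address this, a binary granular computing model is proposed. The model gives a three-layer structure consisting of granules, granule swarms, and granule libraries; defines binary granules and operations on them, converting traditional set operations into computations on binary numbers; gives a distance measure between binary granules, converting the set representation of equivalence classes into a representation by granule distance; and gives related properties of granule distance. The model also defines the distance between binary granule swarms and a method for computing it, proposes an attribute reduction method based on this distance, proves that the method is equivalent to classical rough set reduction, and gives two reduction algorithms that use the binary granule swarm distance as heuristic information.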

5.
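Research on reduction algorithms based on the binary discernibility matrix   Total citations: 1, self-citations: 1, citations by others: 1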
杨帆  朱新坚  曹广益 《计算机仿真》2007,24(2):79-83,140
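A reduction method based on the binary discernibility matrix is presented. First, building on rough set theory, the binary discernibility matrix and its operation rules are defined, together with a criterion for identifying a minimal reduct and a method for computing attribute significance based on the matrix. On this basis, a core-finding algorithm, a relative attribute reduction algorithm, and a value reduction algorithm based on the binary discernibility matrix are given. The method relies mainly on bit operations and, unlike traditional reduction methods, involves no complex logical simplification or set operations, which simplifies computation to some extent and improves reduction efficiency. The algorithm is applied to switching-circuit synthesis in digital circuit design, where it yields the simplest logical expression of the digital circuit, which illustrates its effectiveness.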
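
A core-finding step on a binary discernibility matrix can be sketched as below. This is an assumption-laden illustration rather than the paper's algorithm: it assumes the common convention that each row corresponds to a pair of objects with different decisions, that bit i of the row is 1 when condition attribute i distinguishes the pair, and that an attribute belongs to the core when some row has it as its only 1. The function name and sample rows are hypothetical.

```python
def core_attributes(rows, n_attrs):
    """Return the indices of core attributes from a binary discernibility matrix.

    Each row is an integer bitmask; a row with exactly one bit set means that
    attribute alone distinguishes the pair, so it cannot be removed.
    """
    core = 0
    for row in rows:
        # row & (row - 1) clears the lowest set bit; zero means a single bit was set.
        if row != 0 and (row & (row - 1)) == 0:
            core |= row
    return [i for i in range(n_attrs) if (core >> i) & 1]


# Hypothetical matrix over three condition attributes (bits 0, 1, 2).
rows = [0b001, 0b110, 0b011]
print(core_attributes(rows, 3))  # [0]
```

The single-bit test uses only bit operations, in keeping with the abstract's emphasis on avoiding logical simplification and set operations.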

6.
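This paper studies a technique for rapidly generating data modules based on set representation. Exploiting the properties of the Cartesian product and adopting a set-based approach, it represents both information sets and equipment structural units as sets and proposes a model construction method that embeds information sets into equipment structural units. The model is used to take the set-based Cartesian product of information sets and equipment structural units, so that data modules are built quickly and effectively. This technique departs from the traditional approach to generating data modules in the top-level planning of IETM manuals, strongly supports the completeness and accuracy of data modules, improves the efficiency of later IETM manual development, and reduces rework.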

7.
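Association rules are a major data mining technique. This paper introduces the basic concepts of association rules, describes the basic idea of the top-down algorithm and its shortcomings, extends the related definitions and properties, and proposes an improved algorithm built on the top-down algorithm. The algorithm's main features are that it uses set operations and recursion, and that it saves the results of comparison operations from earlier scans when searching for maximal frequent itemsets. Finally, a simulation experiment on an example and a comparative analysis show some improvement in efficiency.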

8.
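Logical reasoning over complex objects has long been a research focus in deductive object databases. This paper describes the application and implementation of sets in deductive object databases, focusing on the two roles that partial sets play in rules: enumerating all elements of a set and aggregating all elements of a set. It also discusses implementation ideas for operations on complete sets such as intersection, union, partition, and difference.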

9.
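Based on the idea of the discernibility matrix method for attribute reduction, this paper introduces the concepts of the rough set indiscernibility relation and the indiscernible set, and gives binary-based methods for condition attribute reduction and attribute value reduction. The method is simpler in form, and because it operates on binary values it runs fast. An example illustrates that the method outperforms the traditional methods of rough set theory.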

10.
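When the parameters of a nonlinear model are mismatched, applying a filtering algorithm directly rarely gives a good state estimate. Based on the extended set-membership estimation method, this paper introduces parameter uncertainty information into state estimation and proposes a state estimation method for bounded parameter mismatch. The method applies the rules of interval or set operations to compute the range of deviation caused by parameter mismatch and encloses it in an outer-bounding ellipsoid. In the prediction step, the predicted ellipsoidal region is obtained by the union of this deviation ellipsoid with the prior ellipsoidal region; in the update step, the observation ellipsoid is used to update the predicted ellipsoidal region, giving the posterior ellipsoidal set and the state estimate. Finally, numerical simulation and a simulated application to a fermentation model verify the effectiveness of the algorithm.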

11.
方刚  熊江 《计算机工程》2011,37(13):58-60
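When mining constrained frequent neighboring class sets in spatial databases, traditional constrained association rule mining algorithms suffer from redundant candidates and repeated computation. To address this, an algorithm for mining frequent neighboring class sets with constraints is proposed. It generates candidate frequent neighboring class sets by varying the identifier value of neighboring class sets in both directions and computes support counts with an AND operation on identifier values, which improves mining efficiency. Experimental results show that the algorithm is simpler and faster than existing algorithms.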

12.
Choosing appropriate classification algorithms for a given data set is important and useful in practice, but it is also challenging. In this paper, a method of recommending classification algorithms is proposed. First, the feature vectors of data sets are extracted using a novel method, and the performance of classification algorithms on the data sets is evaluated. Then the feature vector of a new data set is extracted, and its k nearest data sets are identified. Afterwards, the classification algorithms of the nearest data sets are recommended for the new data set. The proposed data set feature extraction method uses structural and statistical information to characterize data sets, which is quite different from existing methods. To evaluate the proposed recommendation method and the data set feature extraction method, extensive experiments with 17 different types of classification algorithms, three different types of data set characterization methods, and all possible numbers of nearest data sets are conducted on 84 publicly available UCI data sets. The results indicate that the proposed method is effective and can be used in practice.

13.
《Real》1996,2(5):297-313
This paper compares three image stabilization algorithms when used as preprocessors for a target tracking application. These algorithms vary in computational complexity, accuracy, and ability. Algorithm 1 is capable of only pixel-level realignment of imagery, while Algorithms 2 and 3 are capable of full subpixel stabilization with respect to translation, rotation, and scale. The algorithms are evaluated on their performance in the stabilization of one synthetic forward looking infrared (FLIR) data set and two real FLIR imagery data sets. The evaluation tools include the mean absolute error of the output data set and the overall performance of an automatic target acquisition system (developed at the Army Research Laboratory) that uses the algorithms as a front-end preprocessor. We found that for this tracking application, extremely accurate subpixel stabilization was a requirement for proper operation. We also found that in this application, Algorithm 3 performed significantly better than the other two algorithms.

14.
陈珂  洪银杰  陈刚 《软件学报》2012,23(6):1588-1601
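Similarity queries over uncertain sets under possible-world semantics differ from the techniques used for traditional deterministic sets, both semantically and computationally. Because the items in a set are uncertain, that is, an item appears in the set only with some probability, traditional set-processing techniques no longer apply. This paper proposes a formula for the expected similarity of sets under possible worlds. Under this measure, a pair of sets (X, Y) is called a similar pair if its expected similarity exceeds a given threshold τ ∈ (0,1). A straightforward algorithm for computing the expected similarity of uncertain sets under possible worlds has exponential complexity. An algorithm that uses dynamic programming to compute the expected similarity is proposed; its complexity is polynomial, which greatly reduces computation time. Experimental results demonstrate the usability and high performance of queries based on this algorithm.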

15.
The fuzzy rough set is a generalization of the crisp rough set for dealing with data sets that have real-valued attributes. A primary use of fuzzy rough set theory is to perform attribute reduction for decision systems with numerical conditional attribute values and crisp (symbolic) decision attributes. In this paper we define inconsistent fuzzy decision systems and their reductions, and develop discernibility matrix-based algorithms to find reducts. Finally, two heuristic algorithms are developed and a comparison study is provided with the existing algorithms of attribute reduction with fuzzy rough sets. The method proposed in this paper can deal with decision systems with numerical conditional attribute values and fuzzy decision attributes rather than crisp ones. Experimental results imply that our algorithm of attribute reduction with general fuzzy rough sets is feasible and valid.

16.
In model-based diagnosis and other research fields, the hitting sets of a set cluster are often used. In this paper we introduce several algorithms, including the new BHS-tree and Boolean algebraic algorithms. In the BHS-tree algorithm, a binary tree is used for the computation of hitting sets, and in the Boolean algebraic algorithm, components are represented by Boolean variables. It needs to run only once to obtain the minimal hitting sets. We implemented the algorithms and present empirical results in order to show their superiority over other algorithms for computing hitting sets.
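
To illustrate what a minimal hitting set of a set cluster is, the sketch below enumerates them by brute force. It is deliberately not the BHS-tree or Boolean algebraic algorithm from the abstract; it is an exponential-time baseline for small inputs, and the function name and sample clusters are hypothetical.

```python
from itertools import combinations


def minimal_hitting_sets(cluster):
    """Enumerate all minimal hitting sets of a list of sets by brute force.

    A hitting set shares at least one element with every set in the cluster.
    Candidates are tried in order of increasing size, so any candidate that
    contains an already found hitting set is not minimal and is skipped.
    """
    universe = sorted(set().union(*cluster))
    found = []
    for size in range(1, len(universe) + 1):
        for combo in combinations(universe, size):
            candidate = set(combo)
            if any(h <= candidate for h in found):
                continue
            if all(candidate & s for s in cluster):
                found.append(candidate)
    return found


print(minimal_hitting_sets([{1, 2}, {2, 3}]))          # [{2}, {1, 3}]
print(minimal_hitting_sets([{1, 2}, {2, 3}, {1, 3}]))  # [{1, 2}, {1, 3}, {2, 3}]
```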

17.
Fuzzy rule induction in a set covering framework   Total citations: 1, self-citations: 0, citations by others: 1

18.
Efficient aggregation algorithms for compressed data warehouses   Total citations: 9, self-citations: 0, citations by others: 9
Aggregation and cube are important operations for online analytical processing (OLAP). Many efficient algorithms to compute aggregation and cube for relational OLAP have been developed. Some work has been done on efficiently computing cube for multidimensional data warehouses that store data sets in multidimensional arrays rather than in tables. However, to our knowledge, there is nothing to date in the literature describing aggregation algorithms on compressed data warehouses for multidimensional OLAP. This paper presents a set of aggregation algorithms on compressed data warehouses for multidimensional OLAP. These algorithms operate directly on compressed data sets, which are compressed by the mapping-complete compression methods, without the need to first decompress them. The algorithms have different performance behaviors as a function of the data set parameters, sizes of outputs and main memory availability. The algorithms are described and the I/O and CPU cost functions are presented in this paper. A decision procedure to select the most efficient algorithm for a given aggregation request is also proposed. The analysis and experimental results show that the algorithms have better performance on sparse data than the previous aggregation algorithms.

19.
In cluster analysis, one of the most challenging problems is determining the number of clusters in a data set, which is a basic input parameter for most clustering algorithms. To solve this problem, many algorithms have been proposed for either numerical or categorical data sets. However, these algorithms are not very effective for a mixed data set containing both numerical and categorical attributes. To overcome this deficiency, a generalized mechanism is presented in this paper by integrating Rényi entropy and complement entropy. The mechanism is able to uniformly characterize within-cluster entropy and between-cluster entropy and to identify the worst cluster in a mixed data set. In order to evaluate the clustering results for mixed data, an effective cluster validity index is also defined in this paper. Furthermore, by introducing a new dissimilarity measure into the k-prototypes algorithm, we develop an algorithm to determine the number of clusters in a mixed data set. The performance of the algorithm has been studied on several synthetic and real-world data sets. Comparisons with other clustering algorithms show that the proposed algorithm is more effective in detecting the optimal number of clusters and generates better clustering results.

20.
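With the rapid development of cloud computing, the volume of data produced by applications keeps growing, yet the security of large-scale data sets is not yet adequately assured. Cryptography is an effective means of securing data, but applying traditional encryption methods directly to big data poses security risks. This paper proposes a permutation algorithm for encrypting large-scale data sets, with two contributions: a block-cipher permutation algorithm suited to large-scale data sets is proposed, which achieves global diffusion after m+2 rounds of encryption for data sets of size N with 2^m ≤ N < 2^(m+1); and the algorithm is implemented on the Hadoop platform using the MapReduce programming model. Theoretical analysis and experimental results show that the permutation algorithm has excellent global diffusion.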
