Similar literature
1.
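Guo Qiming (郭启铭), Fan Wei (樊玮). Computer Engineering (《计算机工程》), 2008, 34(4): 111-112, 115
Building on class-attribute interdependence discretization methods, this paper proposes CVM, a continuous-attribute discretization algorithm based on Cramer's V. The method uses the Cramer's V statistic to quantify class-attribute interdependence, so that the interdependence after discretization is maximized. An experimental comparison with the CADD and CAIM algorithms, together with C4.5 classification tests on the discretized data, shows that CVM performs well and that the data it discretizes clearly improve the classifier's prediction accuracy.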
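
The abstract does not give CVM's exact criterion, so the following is only a minimal sketch of the standard Cramer's V statistic computed from class labels and interval labels. The function name `cramers_v` and the list-based inputs are assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from math import sqrt


def cramers_v(classes, intervals):
    """Cramer's V between class labels and discretization-interval labels.

    Assumption: both inputs are equal-length sequences of hashable labels.
    Returns a value in [0, 1]; 0 means no association, 1 means perfect association.
    """
    n = len(classes)
    if n == 0 or n != len(intervals):
        raise ValueError("inputs must be non-empty and of equal length")

    joint = Counter(zip(classes, intervals))   # observed contingency table
    row = Counter(classes)                     # class totals
    col = Counter(intervals)                   # interval totals

    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0  # only one class or one interval: association is undefined, report 0

    # Pearson chi-square statistic over all cells of the table
    chi2 = 0.0
    for c, rc in row.items():
        for i, ci in col.items():
            expected = rc * ci / n
            observed = joint.get((c, i), 0)
            chi2 += (observed - expected) ** 2 / expected

    return sqrt(chi2 / (n * k))
```

A discretization scheme that separates the classes perfectly scores 1, while one whose intervals are independent of the class scores 0.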

2.
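Comparative study of discretization algorithms for continuous attributes   Total citations: 2 (self-citations: 0, citations by others: 2)
This paper examines four classes of continuous-attribute discretization algorithms: greedy algorithms and their improvements, attribute-importance-based algorithms, information-entropy-based algorithms, and clustering-based algorithms. It verifies their discretization performance experimentally. The results show that the quality of discretization depends not only on the algorithm used but also closely on the distribution of the data set's continuous attributes and on the classification of the decision values.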

3.
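Li Xiaofei (李晓飞). Computer Applications and Software (《计算机应用与软件》), 2009, 26(10): 262-264, 272
Discretization of continuous attributes is an important topic in machine learning and one of the problems of data preprocessing. The discretization algorithm presented here, based on dynamic hierarchical clustering, is an improvement on the hierarchical clustering algorithm. The algorithm is analysed qualitatively: randomly collected data are clustered by similarity to obtain a partition of the universe of discourse. Experiments show that the discretization algorithm based on dynamic hierarchical clustering partitions continuous attributes more reasonably and more effectively.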

4.
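Combining the concepts of attribute significance and dependency from rough set theory with a hierarchical clustering discretization algorithm, this paper proposes a dynamic discretization algorithm for taxpayers' continuous attributes. Each continuous attribute of the tax data objects is first divided into two classes. Rough set theory is then used to compute the significance of each condition attribute with respect to the decision attribute, and classes are added in descending order of significance. Finally, the classification result whose dependency is consistent with that of the original data set is output. The algorithm partitions data objects into classes dynamically and thereby discretizes taxpayers' continuous attributes. Experimental results obtained with expert analysis and association analysis verify that the algorithm achieves high accuracy and performance in discretizing taxpayers' continuous attributes.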

5.
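This paper introduces hierarchical (dendrogram) clustering, a data discretization method based on statistical analysis, and studies discretization by hierarchical clustering using plywood defect detection data as the application. A comparison with other discretization methods shows that, when data discretized by hierarchical clustering are subsequently reduced with rough sets, more redundant attributes and records are removed, which reduces model complexity, speeds up knowledge acquisition, and improves classification accuracy. Engineering practice shows that hierarchical clustering is an effective discretization method for data preprocessing and that, combined with rough set algorithms, it yields satisfactory data mining results.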

6.
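Attribute discretization is an important part of preprocessing combat simulation data, and a key and difficult point in research on such data. This paper discusses why attribute discretization is necessary and proposes an attribute discretization algorithm for combat simulation data based on improved attribute significance and information entropy (Discretization by Improved Attribute Significance and Information Entropy, DIAFIE). The algorithm defines attribute significance and uses it as the clustering criterion to divide the value range into several discrete intervals. Adjacent intervals are then merged according to information entropy to preserve the accuracy of the discretization. Experiments show that the algorithm handles attribute discretization of combat simulation data effectively, producing few cut points and high classification accuracy.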

7.
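Hu Yunlu (胡运禄), Yu Jin (于津). Fujian Computer (《福建电脑》), 2013, 29(3): 118-121
Discretization of continuous attributes is an important part of data mining research, and the performance of the discretization method directly affects data mining results. This paper introduces FCM, an objective-function-based fuzzy clustering algorithm, into continuous-attribute discretization. By optimizing the number of fuzzy clusters a and the positions of the initial cluster centres in FCM, it proposes an improved algorithm, NFCM. NFCM discretizes according to the distribution of the data to be discretized, reduces the number of iterations of the fuzzy clustering algorithm, and improves the efficiency of continuous-attribute discretization.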

8.
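Traditional attribute reduction algorithms based on the discernibility matrix can handle only discrete data, whereas most data contain both discrete and continuous attributes. To address this, the paper uses a method that treats discrete and continuous data uniformly. The method replaces the original indiscernibility relation with a flexible-logic equivalence relation, which simplifies the discretization step of the traditional algorithm and improves efficiency. Experiments show that, compared with the traditional algorithm, the improved algorithm omits the discretization step and can process discrete and continuous data uniformly.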

9.
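An improved CAIM algorithm   Total citations: 1 (self-citations: 0, citations by others: 1)
In the CAIM algorithm, the discretization criterion considers only the dependency between the attribute and the majority class in each interval, which leads to over-discretization and imprecise results. An improved CAIM algorithm is therefore proposed. It discretizes attributes in ascending order of attribute importance, and it introduces, based on rough set theory, the concept of the discernibility rate of condition attributes, which is used together with approximation accuracy to control the final degree of discretization of the information table. This effectively resolves the over-discretization problem. In the experiments, C4.5 and a support vector machine are each used for recognition and classification prediction on the discretized data, and the results demonstrate the effectiveness of the algorithm.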

10.
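A fault diagnosis method for steam turbine shafting vibration is proposed, based on the optimal number of classes and rough set theory. The method uses the fuzzy C-means (FCM) clustering algorithm to discretize the continuous attributes of the data, producing a membership matrix and a number of classes for each attribute. The partition coefficient and partition entropy are then evaluated from these as validity measures, and the optimal number of classes for each continuous attribute is found. The continuous attributes are then actually discretized using the optimal number of classes, and the resulting discrete data are mined with rough set theory to obtain diagnostic rules, effectively improving the diagnosis of steam turbine shafting vibration faults.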
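
The two validity measures named in this abstract can be sketched as follows, assuming the usual Bezdek definitions of partition coefficient and partition entropy. The abstract does not state its exact formulas, and the function names and the row-per-sample layout of the membership matrix are assumptions made for illustration.

```python
from math import log


def partition_coefficient(u):
    """Mean of squared memberships; u[i][k] is the membership of sample i in cluster k.

    Assumption: each row of u sums to 1. The value is 1 for a crisp partition
    and 1/c for a completely fuzzy partition into c clusters.
    """
    return sum(x * x for row in u for x in row) / len(u)


def partition_entropy(u):
    """Mean entropy of the memberships (natural logarithm).

    The value is 0 for a crisp partition and log(c) for a completely fuzzy one.
    Zero memberships contribute nothing, following the convention 0 * log 0 = 0.
    """
    return -sum(x * log(x) for row in u for x in row if x > 0) / len(u)
```

Under these definitions, a candidate number of classes is usually preferred when it gives a high partition coefficient and a low partition entropy.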

11.
A non-delegatable strong designated verifier signature (NSDVS) enforces verification of a signature by a designated verifier only. The concept is useful in various commercial cryptographic applications such as copyright protection, e-voting, and e-libraries. This paper reports the shortest NSDVS so far, consisting of only two elements. The scheme is inspired by an identification scheme and Cramer et al.’s OR-proof technique, in which a prover can prove that he knows at least one out of two secrets. It is solidified by a symmetric key based group to group encryption algorithm. Two implementations of the algorithm are reported. The scheme is provably secure with respect to its properties of unforgeability, non-transferability, privacy of signer’s identity, and non-delegatability.

12.
After Cramer and Shoup proposed the first practical and secure public-key encryption scheme without random oracles in 1998, Cramer–Shoup’s scheme and its variants remained the only practical and secure public-key encryption schemes without random oracles until 2004. In 2004, Canetti et al. proposed a generic transformation from a selective identity-based encryption scheme to a public-key encryption scheme by adding a strong one-time signature scheme. Since then, several transformation techniques from a selective identity-based encryption scheme to a public-key encryption scheme have been proposed to improve computational efficiency, for example Boneh–Katz’s construction and Boyen–Mei–Waters’ scheme. These transformations trade off either public verifiability or the tightness of the security reduction. In 2007, Zhang proposed another generic transformation by adding Chameleon hash functions. In this paper, we introduce another technique, from Boneh–Boyen’s selective identity-based encryption scheme to a public-key encryption scheme, which is publicly verifiable and slightly more efficient than Zhang’s transformation. The proposed public-key encryption scheme is based on the decisional bilinear Diffie–Hellman assumption and target collision resistant hash functions.

13.
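A fast ranking algorithm for network users based on power-law distribution   Total citations: 1 (self-citations: 0, citations by others: 1)
The growth of online forums, blogs, and microblogs has raised the problem of ranking users in social networks. Users of an online forum are mapped to nodes, and the reply relations formed when users comment are mapped to a directed graph whose node degrees follow a power-law distribution. Because users' topic-posting behaviour and reply relations in the forum match the mutual-reinforcement and random-walk properties of the Pagerank algorithm, Pagerank is chosen to rank user influence. The research question addressed is how to improve the storage and running efficiency of data in user ranking applications. More than 80% of users in the Tianya online forum have in-degree 0. Accordingly, the nodes are divided into two sets by whether their in-degree is 0, a link table is built by out-degree for the in-degree-0 set, and an efficient ranking algorithm based on this set partition, SD-Rank, is designed. The time and space complexity of SD-Rank is O(V′), where V′ is the set of nodes with non-zero in-degree. Experiments on real user data from the Tianya forum show that SD-Rank outperforms Pagerank in time and space complexity.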

14.
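To improve the efficiency of anomaly intrusion detection, an intrusion detection model is proposed that combines partial least squares feature extraction with the core vector machine algorithm. The model uses partial least squares to extract principal components from the intrusion data set and builds a feature set from them. It then introduces the core vector machine algorithm, which is suited to training on large-scale samples, to build the intrusion detection model on the feature set, and uses that model to detect and classify anomalous intrusion behaviour. Intrusion detection experiments on the KDD99 data set verify the feasibility and effectiveness of the hybrid model.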

15.
The conformality of NURBS surfaces greatly affects the results of rendering and tessellation applications. To improve the conformality of NURBS surfaces, an optimization algorithm using general bilinear transformations is presented in this paper. The conformality energy is first formulated and its numerical approximation is then constructed using the composite Simpson’s rule. The initial general bilinear transformation is obtained by approximating the conformal mapping of its 3D discretized mesh using a least squares method, and is further optimized by the Levenberg–Marquardt method. Examples are given to show the performance of our algorithm for rendering and tessellation applications.
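
As a reminder of the quadrature rule named in this abstract, here is a minimal one-dimensional sketch of the composite Simpson's rule. The paper applies the rule to a conformality energy over a surface parameter domain, which is not reproduced here; the function name and signature below are assumptions made for illustration.

```python
def composite_simpson(f, a, b, n):
    """Approximate the integral of f over [a, b] using n equal subintervals.

    n must be a positive even integer. Interior nodes get weight 4 at odd
    indices and weight 2 at even indices; the endpoints get weight 1.
    """
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")

    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        weight = 4 if i % 2 == 1 else 2
        total += weight * f(a + i * h)
    return total * h / 3
```

The rule is exact for polynomials up to degree three, which gives a convenient sanity check: integrating x**3 over [0, 1] should return 0.25 up to rounding.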

16.
Generalized Core Vector Machines   Total citations: 4 (self-citations: 0, citations by others: 4)
Kernel methods, such as the support vector machine (SVM), are often formulated as quadratic programming (QP) problems. However, given $m$ training patterns, a naive implementation of the QP solver takes $O(m^3)$ training time and at least $O(m^2)$ space. Hence, scaling up these QPs is a major stumbling block in applying kernel methods on very large data sets, and a replacement of the naive method for finding the QP solutions is highly desirable. Recently, by using approximation algorithms for the minimum enclosing ball (MEB) problem, we proposed the core vector machine (CVM) algorithm that is much faster and can handle much larger data sets than existing SVM implementations. However, the CVM can only be used with certain kernel functions and kernel methods. For example, the very popular support vector regression (SVR) cannot be used with the CVM. In this paper, we introduce the center-constrained MEB problem and subsequently extend the CVM algorithm. The generalized CVM algorithm can now be used with any linear/nonlinear kernel and can also be applied to kernel methods such as SVR and the ranking SVM. Moreover, like the original CVM, its asymptotic time complexity is again linear in $m$ and its space complexity is independent of $m$. Experiments show that the generalized CVM has comparable performance with state-of-the-art SVM and SVR implementations, but is faster and produces fewer support vectors on very large data sets.
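
To make the minimum enclosing ball (MEB) idea concrete, the sketch below uses the simple Badoiu-Clarkson iteration on points in ordinary Euclidean space. This is a swapped-in illustration, not the core-set procedure or the kernel feature space used by the CVM, and the function name and parameters are assumptions.

```python
from math import sqrt


def approx_meb(points, iterations=1000):
    """Approximate the minimum enclosing ball of a list of equal-length tuples.

    Badoiu-Clarkson iteration: start at any point and repeatedly move the
    centre towards the current farthest point with step size 1 / (i + 1).
    Returns (centre, radius), where radius is the largest distance from the
    centre to any input point, so the returned ball always encloses the data.
    """
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    centre = list(points[0])
    for i in range(1, iterations + 1):
        farthest = max(points, key=lambda p: dist2(p, centre))
        centre = [c + (f - c) / (i + 1) for c, f in zip(centre, farthest)]

    radius = max(sqrt(dist2(p, centre)) for p in points)
    return centre, radius
```

Each iteration scans all points once, so the cost per iteration is linear in the number of points, and more iterations bring the radius closer to the optimum.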

17.
A system of singularly perturbed convection-diffusion equations with weak coupling is considered. The system is first discretized by an upwind finite difference scheme for which an a posteriori error estimate in the maximum norm is constructed. Then the a posteriori error bound is used to design an adaptive grid algorithm. Finally, a first-order rate of convergence, independent of the perturbation parameters, is established by using the theory of the discrete Green’s function. Numerical results are presented to support our theoretical results.

18.
A non-overlapping domain decomposition is applied to a multibody unilateral contact problem with given friction (Tresca’s model). Approximations are proposed on the basis of the primary variational formulation (in terms of displacements) and linear finite elements. For the discretized problem we employ the concept of local Schur complements, grouping every two subdomains which share a contact area. The proposed algorithm of successive approximations can be recommended for “short” contacts only, since the contact areas are not divided by interfaces. The numerical examples show the practical efficiency of the algorithm.

19.
This paper proposes new algorithms for fixed-length approximate string matching and approximate circular string matching under the Hamming distance. Fixed-length approximate string matching and approximate circular string matching are special cases of approximate string matching and have numerous direct applications in bioinformatics and text searching. Firstly, a counter-vector-mismatches (CVM) algorithm is proposed to solve fixed-length approximate string matching with k-mismatches. The CVM algorithm is based on the parallel summation of counters located in the same machine word. Secondly, a parallel counter-vector-mismatches (PCVM) algorithm is proposed to accelerate the CVM algorithm. The PCVM algorithm integrates two levels of parallelism, exploiting not only word-level parallelism but also data parallelism via parallel environments such as multi-core processors and graphics processing units (GPUs). In the particular case of GPUs, a shared-mem parallel counter-vector-mismatches (PCVMsmem) scheme can be implemented from the PCVM algorithm. The PCVMsmem scheme exploits the memory model of GPUs to optimize the performance of the PCVM algorithm. Finally, this paper shows several methods for adapting the CVM and PCVM algorithms to the case in which the input pattern has a circular structure. In experiments with real DNA packages, the proposed algorithms and scheme run substantially faster than the previous bit-vector-mismatches and parallel bit-vector-mismatches algorithms.
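
The problem this abstract addresses, fixed-length matching with at most k mismatches under the Hamming distance, can be stated with a naive baseline. The sketch below is not the CVM algorithm: it uses no packed counters or word-level parallelism, and the function name is an assumption made for illustration.

```python
def k_mismatch_positions(text, pattern, k):
    """Return the start indices where pattern matches text with at most k mismatches.

    Naive O(n * m) baseline: every window of len(pattern) characters is compared
    position by position, stopping early once more than k mismatches are seen.
    """
    m = len(pattern)
    hits = []
    for start in range(len(text) - m + 1):
        mismatches = 0
        for a, b in zip(text[start:start + m], pattern):
            if a != b:
                mismatches += 1
                if mismatches > k:
                    break
        if mismatches <= k:
            hits.append(start)
    return hits
```

For example, searching "ACGTACGA" for "ACGT" with k = 1 reports positions 0 and 4, because the window "ACGA" at position 4 differs from the pattern in one character.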

20.
In this paper we propose a stable numerical method for an ill-posed backward parabolic equation with time-dependent coefficients in a parallelepiped. The problem is reformulated as an ill-posed least squares problem which is solved by the conjugate gradient method with an a posteriori stopping rule. The least squares problem is discretized by a splitting method which reduces the large dimensions of the discretized problem. We calculate the gradient of the objective functional of the discretized least squares problem with the aid of an adjoint discretized problem, which enhances its accuracy. The algorithm is tested on several examples, which demonstrate its efficiency.
