Found 20 similar documents (search took 328 ms)
1.
2.
3.
4.
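Li Tao (李涛) 《数字社区&智能家居》2007,(18)
Association rule mining has long been an important area of data mining, and new mining algorithms keep appearing. Based on an in-depth analysis of the properties of the FP-tree, this paper improves the FP-tree construction process so that the FP-tree is built with a single scan of the transaction database. This shortens association rule mining time and improves efficiency; experiments verify the method's effectiveness.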
5.
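Application of Knowledge Discovery in Databases to Flight Vehicle Fault Diagnosis  Cited by: 3 (self-citations: 0, citations by others: 3)
Wan Xia (宛霞) 《计算机工程与应用》2003,39(18):221-223
The paper proposes a KDD fault diagnosis tree and a knowledge evolution model for flight vehicle fault diagnosis in large databases. The idea is to build, within the flight vehicle engineering domain, a fault diagnosis tree and a system based on knowledge evolution. The knowledge evolution model provides two processing modes that run synchronously and interactively: data mining processing and optimization processing. In this model, data mining processing performs symbolic reasoning at the belief space level, while optimization processing performs queries at the overall space level. Both the model and the fault diagnosis tree serve to locate flight vehicle faults quickly, and they can effectively raise the success rate of flight vehicle tests.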
6.
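Processing high-dimensional, complex data is a key problem in data mining. To address the imbalanced prediction accuracy and low overall classification efficiency of existing feature selection classification algorithms, a feature selection classification algorithm combining probabilistic correlation with an extremely randomized forest (P-ERF) is proposed. The algorithm selects features by combining the correlation between features with P-values, which avoids the redundancy introduced during tree node splitting. It uses random trees as base classifiers and the extremely randomized forest as the overall framework, giving P-ERF higher accuracy and better generalization error. Experimental results show that, compared with the random forest and extremely randomized forest algorithms, P-ERF performs well in both classification accuracy and overall performance on the test datasets.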
7.
8.
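A Data Mining Algorithm for Predictive Models  Cited by: 4 (self-citations: 0, citations by others: 4)
Predictive modeling is an important aspect of data mining. This paper proposes a data mining rule based on N-order transition probabilities, gives the basic model and algorithm, and evaluates the algorithm's complexity, strengths, and weaknesses. Building on this algorithm, an improved algorithm is proposed that incorporates a sequence tree.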
9.
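In recent years, techniques such as pattern recognition, neural networks, association testing, and decision trees have developed rapidly, and the term "data mining" has gradually begun to appear in software engineering publications. This article surveys in detail one of the data mining techniques used in software engineering, the "metrics-based classification tree".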
10.
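Data Mining and Talent Quality Evaluation  Cited by: 7 (self-citations: 0, citations by others: 7)
Based on the concepts of data mining technology, the way data is organized, and system design methods, the paper uses the k-medoids (k-中心) algorithm to classify talent quality and decision tree induction to analyze and evaluate it. It proposes a hybrid data mining algorithm that makes talent quality evaluation more efficient and more scientific.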
11.
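Chao Yang (晁阳) 《数字社区&智能家居》2006,(5)
Today, as people pay ever more attention to efficiency, data mining is more popular than it has ever been. This paper reports on the current state of data mining analysis and, through the architecture of a real aerospace data mining analysis project, raises some problems that are widespread in the data mining field yet have gone unnoticed. Drawing on practical experience from that project, it also raises the question of the future of data mining, which researchers in the field have long been concerned with. The aim is to draw the attention of data mining researchers to the application problems of the discipline.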
12.
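Chao Yang (晁阳) 《数字社区&智能家居》2006,(2):1-2
Today, as people pay ever more attention to efficiency, data mining is more popular than it has ever been. This paper reports on the current state of data mining analysis. Through the architecture of a real aerospace data mining analysis project, it raises some problems that are widespread in the data mining field yet have gone unnoticed. Drawing on practical experience from that project, it also raises the question of the future of data mining, which researchers in the field have long been concerned with. The aim is to draw the attention of data mining researchers to the application problems of the discipline.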
13.
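As network resources grow ever richer, the commercial demand to discover latent, valuable information in them has kept driving data mining technology forward. Because Web data is semi-structured and poorly organized, Web data mining is very difficult, and the emergence of XML has brought new opportunities and substantial progress to Web data mining technology. This paper introduces XML technology and Web data mining and describes how XML is applied in Web data mining. Because XML-based Web data mining is an emerging technology, how to make fuller use of Web resources for data mining remains to be studied further.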
14.
Time-series analysis is a powerful technique to discover patterns and trends in temporal data. However, the lack of a conceptual model for this data-mining technique forces analysts to deal with unstructured data. These data are represented at a low level of abstraction and their management is expensive. Most analysts face two main problems: (i) the cleansing of the huge amount of potentially analysable data and (ii) the correct definition of the data-mining algorithms to be employed. Because analysts' interests are also hidden in this scenario, it is difficult not only to prepare data but also to discover which data are the most promising. Since their appearance, data warehouses have, therefore, proved to be a powerful repository of historical data for data-mining purposes. Moreover, their foundational modelling paradigm, multidimensional modelling, is very similar to the problem domain. In this article, we propose a unified modelling language (UML) extension through UML profiles for data-mining. Specifically, the UML profile presented allows us to specify time-series analysis on top of the multidimensional models of data warehouses. Our extension provides analysts with an intuitive notation for time-series analysis which is independent of any specific data-mining tool or algorithm. In order to show its feasibility and ease of use, we apply it to the analysis of fish-captures in Alicante. We believe that a coherent conceptual modelling framework for data-mining assures a better and easier knowledge-discovery process on top of data warehouses.
15.
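Traditional data mining methods suffer from low efficiency and a lack of intelligence, and they struggle to meet the need to mine massive data in a networked environment. Starting from the characteristics of data mining technology and Agent technology, this paper discusses the advantages of combining the two, applies Agent technology to data mining, proposes an Agent-based data mining model, and describes the model's organizational structure. The model reduces problem complexity, reduces manual involvement, and considerably improves the intelligence and efficiency of data mining.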
16.
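Traditional data mining methods suffer from low efficiency and a lack of intelligence, and they struggle to meet the need to mine massive data in a networked environment. Starting from the characteristics of data mining technology and Agent technology, this paper discusses the advantages of combining the two, applies Agent technology to data mining, proposes an Agent-based data mining model, and describes the model's organizational structure. The model reduces problem complexity, reduces manual involvement, and considerably improves the intelligence and efficiency of data mining.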
17.
Fabien De Marchi, Stéphane Lopes, Jean-Marc Petit 《Journal of Intelligent Information Systems》2009,32(1):53-73
Foreign keys form one of the most fundamental constraints for relational databases. Since they are not always defined in existing databases, the discovery of foreign keys turns out to be an important and challenging task. The underlying problem is known to be the inclusion dependency (IND) inference problem. In this paper, data-mining algorithms are devised for IND inference in a given database. We propose a two-step approach. In the first step, unary INDs are discovered thanks to a new preprocessing stage which leads to a new algorithm and to an efficient implementation. In the second step, n-ary IND inference is achieved. This step fits in the framework of levelwise algorithms used in many data-mining algorithms. Since real-world databases can suffer from some data inconsistencies, approximate INDs, i.e. INDs which almost hold, are considered. We show how they can be safely integrated into our unary and n-ary discovery algorithms. These algorithms have been implemented and tested against both synthetic and real-life databases. To our knowledge, no other algorithm exists to solve this data-mining problem.
18.
19.
When using data-mining tools to analyze big data, users often need tools to support the understanding of individual data attributes and to control the progress of the analysis. This requires the integration of data-mining algorithms with interactive tools to manipulate data and the analytical process. This is where visual analytics can help. More than simple visualization of a dataset or some computation results, visual analytics provides users with an environment to iteratively explore different inputs or parameters and see the corresponding results. In this research, we explore a design of progressive visual analytics to support the analysis of categorical data with a data-mining algorithm, Apriori. Our study focuses on executing data-mining techniques step by step and showing intermediate results at every stage to facilitate sense-making. Our design, called Pattern Discovery Tool, targets a medical dataset. Starting with visualization of data properties and immediate feedback on users' inputs or adjustments, Pattern Discovery Tool could help users detect interesting patterns and factors effectively and efficiently. Afterward, further analyses such as statistical methods could be conducted to test those possible theories.
20.
Visual data mining in large geospatial point sets  Cited by: 2 (self-citations: 0, citations by others: 2)
Visual data-mining techniques have proven valuable in exploratory data analysis, and they have strong potential in the exploration of large databases. Detecting interesting local patterns in large data sets is a key research challenge. Particularly challenging today is finding and deploying efficient and scalable visualization strategies for exploring large geospatial data sets. One way is to share ideas from the statistics and machine-learning disciplines with ideas and methods from the information and geo-visualization disciplines. PixelMaps in the Waldo system demonstrates how data mining can be successfully integrated with interactive visualization. The increasing scale and complexity of data analysis problems require tighter integration of interactive geospatial data visualization with statistical data-mining algorithms.