Similar documents
 Found 19 similar documents; search took 109 ms
1.
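Analysis of Building a Third-Generation Data Mining System Based on SQL Server 2005   Total citations: 1, self-citations: 0, citations by others: 1
Reviews the development history of data mining software and tools, describes the functional advantages of the SQL Server 2005 platform, and builds a data mining system based on SQL Server 2005.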

2.
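Research on Data Mining Based on SQL Server 2005   Total citations: 1, self-citations: 0, citations by others: 1
Improving the efficiency of data mining is currently one of the major research topics in information technology. The paper introduces the concept, process model, and architecture of data mining, and discusses data mining solutions based on Microsoft SQL Server 2005 and the techniques for implementing data mining with SQL Server Analysis Services. Data mining with SQL Server Analysis Services tightly couples data mining, the data warehouse, and applications, which greatly improves data mining efficiency.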

3.
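This paper briefly introduces data mining technology and the SQL Server 2000 database, and focuses on the process of building a data mining model with SQL Server 2000 to find useful information in massive amounts of data.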

4.
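欧阳桂秀. 《福建电脑》 (Fujian Computer), 2013, (9): 166-168, 188
The paper first describes how to configure a Microsoft SQL Server 2005 database, then how to configure the Build Path in Eclipse, and finally how to write a Java program in Eclipse that connects Java to the Microsoft SQL Server 2005 database, so that records in the database can be queried from the Java program.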

5.
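The Analysis Services component is the tool component in SQL Server 2000 dedicated to data mining; it is based on the visual Windows development platform. The paper discusses how to perform data mining with the Analysis Services component and presents an example of building a data mining application with an SQL-like DDL.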

6.
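This paper explores the application of data mining at the CNPC Xinjiang Training Center (中油集团新疆培训中心). The database of the existing training management information system has accumulated a large amount of historical data. On this basis, the paper applies data mining in the integrated data mining environment of Microsoft SQL Server 2005: taking the Microsoft Time Series algorithm as an example, it builds a mining model, runs the mining, and predicts the training capacity of each hosting department, giving managers useful information for decisions on allocating training resources. It closes with a summary of the problems encountered during development and their solutions.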

7.
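This paper introduces the basic concepts of data mining and explains that clustering is one of its most important functions. It then explains what cluster analysis is, surveys common clustering algorithms, and describes in detail how to implement the "k-medoids" (K-中心点) clustering algorithm, a partitioning method, in Visual Basic 6.0 combined with SQL Server 2000.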

8.
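This article introduces several new views in SQL Server 2005 that replace the sysindexes system table of SQL Server 2000; you can use these views to access storage-related metadata for database objects. [Editor's note]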

9.
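Studies several commonly used data warehouse development tools, including Oracle, SQL Server, and Delphi, and discusses the application of this software in data mining.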

10.
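Application of Data Mining in SQL Server 2005   Total citations: 2, self-citations: 0, citations by others: 2
This paper first introduces the concept and process of data mining, then describes the data mining features of SQL Server 2005, and finally walks through the complete workflow of implementing a data mining project in SQL Server 2005.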

11.
A Survey of Uncertain Data Algorithms and Applications   Total citations: 8, self-citations: 0, citations by others: 8
In recent years, a number of indirect data collection methodologies have led to the proliferation of uncertain data. Such data points are often represented in the form of a probabilistic function, since the corresponding deterministic value is not known. This increases the challenge of mining and managing uncertain data, since the precise behavior of the underlying data is no longer known. In this paper, we provide a survey of uncertain data mining and management applications. In the field of uncertain data management, we will examine traditional methods such as join processing, query processing, selectivity estimation, OLAP queries, and indexing. In the field of uncertain data mining, we will examine traditional mining problems such as classification and clustering. We will also examine a general transform-based technique for mining uncertain data. We discuss the models for uncertain data, and how they can be leveraged in a variety of applications. We discuss different methodologies to process and mine uncertain data in a variety of forms.

12.
Time series analysis has always been an important and interesting research field due to its frequent appearance in different applications. In the past, many approaches based on regression, neural networks and other mathematical models were proposed to analyze time series. In this paper, we attempt to use data mining techniques to analyze time series. Many previous studies on data mining have focused on handling binary-valued data. Time series data, however, are usually quantitative values. We thus extend our previous fuzzy mining approach for handling time-series data to find linguistic association rules. The proposed approach first uses a sliding window to generate continuous subsequences from a given time series and then analyzes the fuzzy itemsets from these subsequences. Appropriate post-processing is then performed to remove redundant patterns. Experiments are also conducted to show the performance of the proposed mining algorithm. Since the final results are represented by linguistic rules, they will be friendlier to humans than quantitative representations.
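
The sliding-window step mentioned in this abstract can be illustrated with a minimal Python sketch. The function name `sliding_windows` and the sample series are hypothetical, and the fuzzy itemset analysis that the paper applies afterwards is not shown.

```python
from typing import List, Sequence


def sliding_windows(series: Sequence[float], window: int) -> List[List[float]]:
    """Return every contiguous subsequence of length `window`, in order."""
    if window <= 0:
        raise ValueError("window must be positive")
    # A series shorter than the window yields no subsequences.
    return [list(series[i:i + window]) for i in range(len(series) - window + 1)]


if __name__ == "__main__":
    # Hypothetical series of five quantitative readings, window size 3.
    for subsequence in sliding_windows([3, 5, 2, 6, 4], 3):
        print(subsequence)
```

Each subsequence overlaps its neighbor by `window - 1` points, which is one reason a post-processing step to remove redundant patterns can be needed.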

13.
We investigate the use of biased sampling according to the density of the data set to speed up the operation of general data mining tasks, such as clustering and outlier detection in large multidimensional data sets. In density-biased sampling, the probability that a given point will be included in the sample depends on the local density of the data set. We propose a general technique for density-biased sampling that can factor in user requirements to sample for properties of interest and can be tuned for specific data mining tasks. This allows great flexibility and improved accuracy of the results over simple random sampling. We describe our approach in detail, evaluate it analytically, and show how it can be optimized for approximate clustering and outlier detection. Finally, we present a thorough experimental evaluation of the proposed method, applying density-biased sampling on real and synthetic data sets, and employing clustering and outlier detection algorithms, thus highlighting the utility of our approach.
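
The idea that a point's inclusion probability depends on local density can be sketched as follows. This is an illustration under stated assumptions, not the paper's estimator: density is approximated by counting points per grid cell, the probability is taken proportional to density raised to a tunable `exponent`, and all names (`inclusion_probabilities`, `density_biased_sample`, `cell_size`, `target_size`) are hypothetical.

```python
import random
from collections import Counter
from typing import List, Tuple

Point = Tuple[float, float]


def inclusion_probabilities(points: List[Point], cell_size: float,
                            exponent: float, target_size: float) -> List[float]:
    """Assumed scheme: probability proportional to (cell count) ** exponent."""
    cells = [(int(x // cell_size), int(y // cell_size)) for x, y in points]
    density = Counter(cells)  # points per grid cell, a crude density estimate
    weights = [density[c] ** exponent for c in cells]
    # Scale so the expected sample size is about target_size (before capping).
    scale = target_size / sum(weights)
    return [min(1.0, w * scale) for w in weights]


def density_biased_sample(points: List[Point], probs: List[float],
                          seed: int = 0) -> List[Point]:
    """Keep each point independently with its own inclusion probability."""
    rng = random.Random(seed)
    return [p for p, pr in zip(points, probs) if rng.random() < pr]


if __name__ == "__main__":
    pts = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (0.3, 0.4), (5.5, 5.5)]
    probs = inclusion_probabilities(pts, cell_size=1.0, exponent=-1.0, target_size=2)
    print(probs)
    print(density_biased_sample(pts, probs))
```

In this sketch a negative exponent favors sparse regions (useful when outliers matter), a positive exponent favors dense regions, and an exponent of 0 reduces to uniform random sampling.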

14.
A binary decision diagram based approach for mining frequent subsequences   Total citations: 2, self-citations: 1, citations by others: 1
Sequential pattern mining is an important problem in data mining. State-of-the-art techniques for mining sequential patterns, such as frequent subsequences, are often based on the pattern-growth approach, which recursively projects conditional databases. Explicitly creating database projections is thought to be a major computational bottleneck, but we will show in this paper that it can be beneficial when the appropriate data structure is used. Our technique uses a canonical directed acyclic graph as the sequence database representation, which can be represented as a binary decision diagram (BDD). In this paper, we introduce a new type of BDD, namely a sequence BDD (SeqBDD), and show how it can be used for efficiently mining frequent subsequences. A novel feature of the SeqBDD is its ability to share results between similar intermediate computations and avoid redundant computation. We perform an experimental study to compare the SeqBDD technique with existing pattern-growth techniques that are based on other data structures such as prefix trees. Our results show that a SeqBDD can be half as large as a prefix tree, especially when many similar sequences exist. In terms of mining time, it can be substantially more efficient when the support is low, the number of patterns is large, or the input sequences are long and highly similar.

15.
Temporal data mining remains an important research topic, since many application areas need knowledge from temporal data, such as sequential patterns, similar time sequences, and cyclic and temporal association rules. Although there are many studies on temporal data mining, they do not deal with discovering knowledge from temporal interval data such as patient histories, purchaser histories, and web logs. We propose a new temporal data mining technique that can extract temporal interval relation rules from temporal interval data by using Allen's theory: a preprocessing algorithm designed for the generalization of temporal interval data and a temporal relation algorithm for mining temporal relation rules from the generalized temporal interval data. This technique can provide more useful knowledge in comparison with conventional data mining techniques.
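
Allen's theory, which this abstract builds on, distinguishes thirteen possible relations between two time intervals. The sketch below classifies a pair of intervals into one of them. The function name `allen_relation` and the `(start, end)` tuple representation are assumptions made for illustration; the paper's preprocessing and rule-mining algorithms are not shown.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end) with start < end


def allen_relation(a: Interval, b: Interval) -> str:
    """Name the Allen relation that holds from interval a to interval b."""
    a_start, a_end = a
    b_start, b_end = b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals must satisfy start < end")
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Only partial overlap remains; the earlier start decides the direction.
    return "overlaps" if a_start < b_start else "overlapped-by"


if __name__ == "__main__":
    # Hypothetical example: a prescription period inside a hospital stay.
    print(allen_relation((2, 3), (1, 5)))
```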

16.
Knowledge Discovery from Series of Interval Events   Total citations: 4, self-citations: 0, citations by others: 4
Knowledge discovery from data sets can be extensively automated by using data mining software tools. Techniques for mining series of interval events, however, have not been considered. Such time series are common in many applications. In this paper, we propose mining techniques to discover temporal containment relationships in such series. Specifically, an item A is said to contain an item B if an event of type B occurs during the time span of an event of type A, and this is a frequent relationship in the data set. Mining such relationships provides insight about temporal relationships among various items. We implement the technique and analyze trace data collected from a real database application. Experimental results indicate that the proposed mining technique can discover interesting results. We also introduce a quantization technique as a preprocessing step to generalize the method to all time series.
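
The containment definition in this abstract can be made concrete with a small brute-force sketch. Several details are assumptions, since the abstract does not state them: events are `(type, start, end)` tuples, containment uses inclusive endpoints, and "frequent" is taken to mean that at least `min_support` events of type B fall inside some event of type A. The name `frequent_containments` is hypothetical, and this is not the paper's algorithm.

```python
from collections import Counter
from typing import Dict, List, Tuple

Event = Tuple[str, float, float]  # (event type, start time, end time)


def frequent_containments(events: List[Event],
                          min_support: int) -> Dict[Tuple[str, str], int]:
    """Count, for each pair (A, B), the B events lying inside some A event."""
    counts: Counter = Counter()
    for b_type, b_start, b_end in events:
        # Event types having at least one event whose span covers this event.
        containers = {
            a_type
            for a_type, a_start, a_end in events
            if a_type != b_type and a_start <= b_start and b_end <= a_end
        }
        for a_type in containers:
            counts[(a_type, b_type)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    # Hypothetical trace: type "A" spans frequently enclose type "B" events.
    trace = [("A", 0, 10), ("B", 2, 3), ("B", 5, 6),
             ("C", 8, 12), ("A", 20, 30), ("B", 22, 25)]
    print(frequent_containments(trace, min_support=2))
```

The quadratic scan keeps the sketch short; a real trace would call for sorting events by start time or using an interval index.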

17.
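Mining Multi-Relational Association Rules   Total citations: 4, self-citations: 0, citations by others: 4
何军, 刘红岩, 杜小勇. 《软件学报》 (Journal of Software), 2007, 18(11): 2752-2765
Association rule mining is an important and fundamental technique in data mining; it has been studied in depth from many angles and is widely applied. Traditional data mining algorithms are designed to process single-table data and run into many problems when applied to multi-relational data mining. This paper redefines and summarizes the problem of mining multi-relational association rules, proposes a framework for multi-relational association rule mining, and classifies existing algorithms. It then describes, analyzes, and compares representative algorithms of each class, and analyzes and summarizes the remaining open problems. Finally, it offers suggestions for future research in the field.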

18.
Our main research objective is to define a data mining query language, supported by a system that can optimize constraint-based data mining queries. We have invented ExAnte, a simple yet effective preprocessing technique for frequent-pattern mining. ExAnte exploits constraints to dramatically reduce the analyzed data to those containing patterns of interest. This data reduction, in turn, induces a strong reduction of the candidate patterns' search space, thus supporting substantial performance improvements in subsequent mining.

19.
Spatial data mining algorithms heavily depend on the efficient processing of neighborhood relations, since the neighbors of many objects have to be investigated in a single run of a typical algorithm. Therefore, providing general concepts for neighborhood relations as well as an efficient implementation of these concepts will allow a tight integration of spatial data mining algorithms with a spatial database management system. This will speed up both the development and the execution of spatial data mining algorithms. In this paper, we define neighborhood graphs and paths and a small set of database primitives for their manipulation. We show that typical spatial data mining algorithms are well supported by the proposed basic operations. For finding significant spatial patterns, only certain classes of paths "leading away" from a starting object are relevant. We discuss filters that allow only such neighborhood paths, which will significantly reduce the search space for spatial data mining algorithms. Furthermore, we introduce neighborhood indices to speed up the processing of our database primitives. We implemented the database primitives on top of a commercial spatial database management system. The effectiveness and efficiency of the proposed approach were evaluated by using an analytical cost model and an extensive experimental study on a geographic database.

