Similar Literature
20 similar documents found.
1.
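Application of Data Mining Techniques in Network-Based Anomaly Intrusion Detection Systems   Total citations: 10, self-citations: 0, citations by others: 10
The key problem in network-based anomaly detection is to establish a normal profile, compare current system or user behavior against it, and measure the degree of deviation. This paper briefly introduces data mining algorithms and the classification of data-mining-based intrusion detection systems, and describes how data mining methods are applied in intrusion detection systems from the perspective of each classification. It focuses on comparing the various methods of pattern comparison, and uses network-based anomaly detection to check whether the collected normal data are sufficient.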
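The profile-and-deviation idea in this abstract can be illustrated with a minimal sketch. It assumes the normal profile is just the mean and standard deviation of one numeric feature and that deviation is measured in standard deviations; the abstract does not say which profile or comparison method the paper uses, and the function names, sample data, and threshold below are hypothetical.

```python
from statistics import mean, stdev


def build_normal_profile(samples):
    """Summarize normal behavior as (mean, sample standard deviation)."""
    return mean(samples), stdev(samples)


def deviation(value, profile):
    """Return how many standard deviations `value` lies from the profile mean."""
    mu, sigma = profile
    if sigma == 0:
        return 0.0 if value == mu else float("inf")
    return abs(value - mu) / sigma


def is_anomalous(value, profile, threshold=3.0):
    """Flag a value whose deviation exceeds the (assumed) threshold."""
    return deviation(value, profile) > threshold


# Hypothetical training data: connections per minute observed during normal operation.
normal_connections_per_minute = [48, 52, 50, 47, 53, 49, 51, 50]
profile = build_normal_profile(normal_connections_per_minute)

print(is_anomalous(51, profile))   # close to the profile mean
print(is_anomalous(120, profile))  # far from the profile mean
```

A real system would profile many features and would need enough normal data for the estimates to be stable, which is the sufficiency question the abstract raises.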

2.
Revisiting the Foundations of Artificial Immune Systems for Data Mining   Total citations: 3, self-citations: 0, citations by others: 3
This paper advocates a problem-oriented approach to the design of artificial immune systems (AIS) for data mining. By a problem-oriented approach we mean that, in real-world data mining applications, the design of an AIS should take into account the characteristics of the data to be mined together with the application domain: the components of the AIS, such as its representation, affinity function, and immune process, should be tailored to the data and the application. This is in contrast with the majority of the literature, where a very generic AIS algorithm for data mining is developed and there is little or no concern for tailoring the components of the AIS to the data to be mined or the application domain. To support this problem-oriented approach, we provide an extensive critical review of the current literature on AIS for data mining, focusing on the data mining tasks of classification and anomaly detection. We discuss several important lessons to be taken from the natural immune system to design new AIS that are considerably more adaptive than current AIS. Finally, we conclude this paper with a summary of seven limitations of current AIS for data mining and ten suggested research directions.

3.
This paper introduces a method for mining co-occurring events from longitudinal data, and applies this method to detecting adverse drug reactions (ADRs) from patient data. Electronic health records are richer than older data sources (such as spontaneous report records) and thus are ideal for ADR mining. However, current data mining methods, such as disproportionality ratios and temporal itemset mining, ignore certain important aspects of the longitudinal data in patient records. In this paper, we highlight two specific problems with current methods, which we name temporal and contextual sensitivity, and discuss why these two properties are vital to mining patterns from longitudinal data. We also propose two sensitive longitudinal rate comparison measures, which utilize condition occurrence rates and length of drug eras, for mining ADRs from this type of data. These novel methods are then used to rank potential ADRs, along with existing state-of-the-art methods, under many simulated yet realistic datasets. In 48 out of 60 experiments, the proposed longitudinal rate comparison methods significantly outperform other methods in mining known ADRs from other drug / condition pairs.

4.
Educational data mining is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from the educational context. This work is a survey of the specific application of data mining in learning management systems and a case study tutorial with the Moodle system. Our objective is to introduce it both theoretically and practically to all users interested in this new research area, and in particular to online instructors and e-learning administrators. We describe the full process for mining e-learning data step by step as well as how to apply the main data mining techniques used, such as statistics, visualization, classification, clustering and association rule mining of Moodle data. We have used free data mining tools so that any user can immediately begin to apply data mining without having to purchase a commercial tool or program a specific personalized tool.

5.
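A Survey of Particle Swarm Optimization in Association Rule Mining   Total citations: 1, self-citations: 0, citations by others: 1
Association rule mining is an important area of data mining. Because today's data are large-scale, high-dimensional, multimodal, and of complex types, traditional association rule mining algorithms can no longer meet the demands of big data. Particle swarm optimization (PSO), an efficient intelligent optimization algorithm, offers a new solution and has been widely applied to this area in recent years. The paper first introduces in detail the basic principles of PSO and the basic concepts of association rules, reviews research progress on PSO, and analyzes research on PSO in association rule mining, including commonly used data transformation methods, encoding schemes, and evaluation metrics. It also compares PSO with other algorithms widely used in association rule mining and summarizes their respective strengths, weaknesses, and applicable scenarios. It then classifies existing improvements fairly systematically into parameter-based improvements, improvements based on mutation mechanisms, and hybrids with other algorithms. Next, it surveys the application domains of PSO in association rule mining, describing the algorithm's advantages in market basket analysis, finance, healthcare, industrial production, and risk assessment. Finally, after presenting the latest research progress in the field, it analyzes open problems and discusses directions for further research.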
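As background for the evaluation metrics mentioned in this abstract, the sketch below computes support and confidence, the two standard association rule measures. This is an assumption about what such metrics might include: the abstract does not name the metrics the survey covers, and the sketch does not implement PSO itself. The basket data and function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(baskets, {"bread", "milk"}))       # support of the itemset
print(confidence(baskets, {"bread"}, {"milk"}))  # confidence of bread -> milk
```

An optimizer such as PSO could, in principle, score each candidate rule with a combination of measures like these, but how the surveyed papers define their fitness functions is not stated here.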

6.
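A Study of Two Rough-Set-Based Discretization Algorithms   Total citations: 9, self-citations: 0, citations by others: 9
With the rapid development of knowledge discovery and data mining, many methods have emerged, and many of them depend on discrete data. However, most data used in real-world applications have continuous-valued attributes. For data mining techniques to be applicable to such data, the data must be discretized. This paper discusses rough-set-based discretization methods and experimentally compares local and global discretization algorithms; the results show that both algorithms are sensitive to the data set.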
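To show what discretizing a continuous attribute means in practice, here is a minimal sketch that uses equal-width binning. This is a deliberately simple stand-in and not one of the rough-set-based algorithms the paper studies, whose details the abstract does not give; the function names and sample values are hypothetical.

```python
from bisect import bisect_right


def equal_width_cuts(values, bins):
    """Return the bins-1 cut points that split [min, max] into equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + width * i for i in range(1, bins)]


def discretize(value, cuts):
    """Map a continuous value to the index of the interval it falls in."""
    return bisect_right(cuts, value)


# Hypothetical continuous attribute values.
readings = [0.0, 2.5, 5.0, 7.5, 10.0]
cuts = equal_width_cuts(readings, bins=4)

print(cuts)
print([discretize(v, cuts) for v in readings])
```

Equal-width binning computes its cuts from one attribute at a time and ignores class labels, which is one reason more elaborate schemes such as the rough-set-based ones are studied.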

7.
The insurance industry of Hong Kong has been experiencing steady growth in the last decade. One of the current problems in the industry is that, in general, insurance agent turnover is high. The selection of new agents is treated as a regular recruitment exercise. This study focuses on the characteristics of data warehousing and the appropriate data mining techniques that can be used to support agent selection in the insurance industry. We examine the application of three popular data mining methods – discriminant analysis, decision trees and artificial neural networks – incorporated with a data warehouse to the prediction of the length of service, sales premiums and persistence indices of insurance agents. An intelligent decision support system, namely Intelligent Agent Selection Assistant for Insurance, is presented, which will help insurance managers to select quality agents by using data mining in a data warehouse environment.

8.
Advances in multimedia data acquisition and storage technology have led to the growth of very large multimedia databases. Analyzing this huge amount of multimedia data to discover useful knowledge is a challenging problem. This challenge has opened the opportunity for research in Multimedia Data Mining (MDM). Multimedia data mining can be defined as the process of finding interesting patterns from media data such as audio, video, image and text that are not ordinarily accessible by basic queries and associated results. The motivation for doing MDM is to use the discovered patterns to improve decision making. MDM has therefore attracted significant research efforts in developing methods and tools to organize, manage, search and perform domain specific tasks for data from domains such as surveillance, meetings, broadcast news, sports, archives, movies, medical data, as well as personal and online media collections. This paper presents a survey on the problems and solutions in Multimedia Data Mining, approached from the following angles: feature extraction, transformation and representation techniques, data mining techniques, and current multimedia data mining systems in various application domains. We discuss main aspects of feature extraction, transformation and representation techniques. These aspects are: level of feature extraction, feature fusion, features synchronization, feature correlation discovery and accurate representation of multimedia data. Comparison of MDM techniques with state of the art video processing, audio processing and image processing techniques is also provided. Similarly, we compare MDM techniques with the state of the art data mining techniques involving clustering, classification, sequence pattern mining, association rule mining and visualization. We review current multimedia data mining systems in detail, grouping them according to problem formulations and approaches. 
The review includes supervised and unsupervised discovery of events and actions from one or more continuous sequences. We also present a detailed analysis of what has been achieved and which gaps remain where future research efforts could be focused. We then conclude this survey with a look at open research directions.

9.
10.
Research on Neural Networks and Data Mining of Nonlinear Patterns   Total citations: 1, self-citations: 2, citations by others: 1
邓乾罡  孟波 《计算机工程与设计》2004,25(10):1667-1668,1694
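This paper discusses some theoretical advances in applying artificial intelligence techniques to data mining. Extracting rules from nonlinear patterns is a major task of data mining, yet few effective methods currently exist. The paper focuses on a model dedicated to mining data with nonlinear patterns, and gives a brief algorithm and an example.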

11.
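Application of Data Mining Techniques in an IT Infrastructure Monitoring System   Total citations: 1, self-citations: 0, citations by others: 1
This paper describes the application of data mining techniques in an IT infrastructure monitoring system. It focuses on the process of discovering interesting patterns in historical monitoring data using time-series data mining and simple (one-variable) linear regression prediction, following CRISP-DM, the widely adopted industry standard for the data mining process. To suit the particular characteristics of this application, it improves the methods for setting the parameters of these techniques and introduces a new index, the load index, to increase the accuracy of the model.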
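The one-variable linear regression prediction mentioned in this abstract can be sketched with ordinary least squares. This is only an illustration under stated assumptions: the monitoring data are made up, the function names are hypothetical, and the paper's load index is not modeled because the abstract does not define it.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes the xs are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Hypothetical monitoring history: CPU load (%) sampled once per hour.
hours = [0, 1, 2, 3, 4]
cpu_load = [20.0, 22.0, 24.0, 26.0, 28.0]

slope, intercept = fit_line(hours, cpu_load)
print(predict(5, slope, intercept))  # extrapolated load for the next hour
```

Extrapolating a fitted line is only reasonable while the trend in the history holds, which is presumably why the paper tunes parameters and adds a further index.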

12.
13.
The ability to combine domain-specific knowledge with specialized knowledge of mathematical-statistical methods for analyzing large databases is not yet widespread in science and business. An increase in data mining applications can be expected in the near future, so instruments are needed to support non-specialists in applying specific knowledge about data mining. This paper introduces a data mining architecture. Its main advantage is that it offers a systematic scheme for data mining methods, which are structured with reference to applications. The data mining application architecture supports users, scientists, and students in making decisions about and structuring data mining problems.

14.
Data Mining Techniques for DNA Sequences   Total citations: 4, self-citations: 1, citations by others: 4
朱扬勇  熊赟 《软件学报》2007,18(11):2766-2781
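DNA sequence data are an important class of biological data. Studying DNA sequence data and interpreting their meaning is a main research task of the post-genomic era. Data mining is currently one of the most effective means of data analysis; it is used to discover the regularities hidden in large volumes of data and is also a main data analysis technique in bioinformatics. Applying data mining to DNA sequence analysis has attracted wide attention, developed rapidly, and produced many research results. This paper surveys the state and progress of research on DNA sequence data mining and identifies three research stages: the application of statistics-based mining methods, the application of general-purpose mining methods, and the design of mining methods specific to DNA sequence data. It explains that sequence similarity is the foundation of DNA sequence data mining, reviews the key techniques used in the field, including DNA sequence pattern mining, association, clustering, classification, and anomaly mining, and discusses their biological application background and significance. Finally, it presents open research topics, including the design of new storage and indexing mechanisms for DNA sequence data and the design of new data mining models and algorithms informed by biological domain knowledge.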

15.
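Web data mining aims to discover useful patterns and hidden information in large volumes of Web data, thereby supporting decision makers, optimizing market strategies, and effectively addressing today's Internet information explosion. One important application of Web data mining is e-commerce. E-commerce is a modern business model built on network platforms; it is currently growing strongly, and Web data mining is bound to have broad application prospects in it. This paper combines Web data mining with e-commerce and describes methods for performing Web data mining on e-commerce platforms, offering a reference for e-commerce practitioners so that they can better analyze hidden relationships and patterns in data, understand user preferences, support the market decisions of e-commerce platforms, and reduce risk.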

16.
张永梅  郭莎  季艳  马礼  张睿 《计算机科学》2018,45(3):223-230
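Most databases cannot handle the temporal dimension of data effectively. Spatio-temporal co-occurrence pattern mining helps extract valuable information hidden in spatio-temporal data sets and has become a research hotspot. To address the low mining efficiency of existing co-occurrence pattern discovery methods, the paper uses a two-layer network to build an initial model of the spatio-temporal data. Because traditional methods do not consider that object types have validity periods when computing spatio-temporal interestingness, the paper improves the existing interestingness measure, introduces a weight feature value, and proposes a network-based spatio-temporal co-occurrence pattern mining algorithm. Experiments show that, when mining co-occurrence pattern sets from test sets of different sizes, the new algorithm runs more efficiently than methods that do not model the data set and than methods that model only the instance layer.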

17.
Application of Data Mining in E-Commerce   Total citations: 8, self-citations: 6, citations by others: 2
严潭 《微计算机信息》2006,(12):201-202
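As one of the most effective means of addressing the "information scarcity" that has emerged in the age of "data explosion", data mining has attracted great attention from industry. This article describes the framework, data resources, professional personnel, and basic methods of data mining in e-commerce, and analyzes specific applications of data mining in e-commerce.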

18.
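Data mining (DM) is a highly application-oriented technology. This article explains the concepts, methods, and process of data mining, and describes its current applications in medicine.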

19.
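MIMIC, an open-source database of intensive care information, contains a large amount of medical data and has been popular with researchers since its release. However, inefficient mining methods struggle to uncover the information hidden in it, so the MIMIC database is underused and its resources are wasted. Exploring emerging mining methods for knowledge discovery is therefore especially important. This paper surveys the mining methods that have been applied to the MIMIC database, with emphasis on recent machine learning and deep learning methods. It also compares traditional statistical models with newer artificial intelligence techniques, including machine learning and deep learning. The comparison finds that machine learning and deep learning generally perform better than traditional statistical models in tasks such as predicting early patient mortality and identifying factors that influence disease, which helps improve the quality of care, assists physicians in diagnosis, and to some extent reduces patients' medical costs.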

20.
《Knowledge》2006,19(6):438-444
One major goal of data mining is to understand data. Rule-based methods are better than other methods at making mining results comprehensible. However, current rule-based classifiers use a small number of rules and a default prediction to build a concise predictive model, which reduces the explanatory ability of the classifier. In this paper, we propose to use multiple and negative target rules to improve the explanatory ability of rule-based classifiers. We show experimentally that this understandability does not come at the cost of accuracy.
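The "small number of rules and a default prediction" design that this abstract criticizes can be sketched as an ordered rule list. This is a minimal illustration of that baseline only, not the multiple and negative target rules the paper proposes; the rules, attribute names, and labels are hypothetical.

```python
def classify(record, rules, default):
    """Return the label of the first rule whose conditions all match, else the default."""
    for conditions, label in rules:
        if all(record.get(attr) == value for attr, value in conditions.items()):
            return label
    return default


# Hypothetical ordered rule list: (conditions, predicted label).
rules = [
    ({"outlook": "sunny", "humidity": "high"}, "no"),
    ({"outlook": "overcast"}, "yes"),
]

print(classify({"outlook": "sunny", "humidity": "high"}, rules, default="yes"))
print(classify({"outlook": "rain", "humidity": "normal"}, rules, default="yes"))
```

The second call matches no rule and falls through to the default, so the model offers no explanation for that prediction. This is the loss of explanatory ability the abstract describes.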
