Similar documents
 Found 20 similar documents (search time: 78 ms)
1.
路晶  胡顺仿 《计算机仿真》2021,38(5):246-249,422
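Aiming at efficient and accurate parallel computation over high-dimensional data streams of various forms, this paper proposes a parallel computing method for high-dimensional data streams based on granularity theory. A data stream mining model based on dynamic granularity is used to mine high-dimensional data streams efficiently. Noise in the streams is suppressed using locality preserving projection and principal component analysis, which reduces the risks that noise poses. According to the characteristics of the different denoised streams, a parallel correlation analysis method is applied to obtain the Pearson product-moment correlation coefficient of the high-dimensional data and thereby associate the data streams. Based on the turnstile model of data streams, a sliding-window mode suited to high-dimensional data stream analysis is defined, realizing parallel computation over high-dimensional data streams. Experimental results show that the method consumes little memory when mining high-dimensional data streams, denoises them effectively, and achieves both high accuracy and high efficiency in parallel computation.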
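The abstract above combines a sliding window with the Pearson product-moment correlation coefficient. The following is only a minimal single-threaded sketch of that one idea, not the authors' parallel method: the class name `SlidingPearson`, its `size` parameter, and the running-sum bookkeeping are all assumptions made for illustration.

```python
from collections import deque
from math import sqrt


class SlidingPearson:
    """Pearson correlation of two streams over the last `size` paired samples.

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self, size):
        self.size = size
        self.window = deque()
        # Running sums so that each update costs O(1).
        self.sx = self.sy = self.sxx = self.syy = self.sxy = 0.0

    def _accumulate(self, x, y, sign):
        # sign = +1 adds a sample to the sums, sign = -1 removes it.
        self.sx += sign * x
        self.sy += sign * y
        self.sxx += sign * x * x
        self.syy += sign * y * y
        self.sxy += sign * x * y

    def update(self, x, y):
        """Add one paired sample and evict the oldest if the window is full."""
        self.window.append((x, y))
        self._accumulate(x, y, 1.0)
        if len(self.window) > self.size:
            old_x, old_y = self.window.popleft()
            self._accumulate(old_x, old_y, -1.0)

    def correlation(self):
        """Return r for the current window, or None if it is undefined."""
        n = len(self.window)
        if n < 2:
            return None
        cov = self.sxy - self.sx * self.sy / n
        var_x = self.sxx - self.sx * self.sx / n
        var_y = self.syy - self.sy * self.sy / n
        if var_x <= 0 or var_y <= 0:
            return None  # one stream is constant inside the window
        return cov / sqrt(var_x * var_y)


tracker = SlidingPearson(size=3)
for value in range(1, 6):
    tracker.update(value, 2 * value + 1)  # perfectly linear relationship
print(tracker.correlation())
```

Running sums keep the per-sample cost constant, at the price of some floating-point drift on very long streams; recomputing the sums from the window periodically is one way to limit that drift.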

2.
史金成  胡学钢 《微机发展》2007,17(11):11-14
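At the end of the last century, data stream technology emerged to meet the requirements of applications such as network monitoring, intrusion detection, intelligence analysis, and commercial transaction management and analysis. The distinctive characteristics of data streams pose great challenges to traditional data processing methods. This paper introduces the relevant concepts of data streams and the characteristics of data stream mining, and discusses the current state of research on data stream mining. Finally, it illustrates applications of data stream mining with examples and looks ahead to future research directions.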

3.
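Because data streams differ from traditional static data, analyzing and mining them effectively is very challenging. This paper surveys recent progress in data stream mining: it introduces the basic concepts of data streams, data stream models, and synopsis descriptions of data streams; summarizes the algorithms commonly used in data stream mining; and finally analyzes the significance of data stream mining in light of its applications in different fields.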

4.
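SECAI, a visual editing tool for data flow diagrams, is a software tool suited to traditional data-flow-oriented requirements analysis. SECAI is designed to meet the need to draw data flow diagrams and write data dictionaries during requirements analysis. Its main functions therefore include drawing hierarchical data flow diagrams; ensuring data flow consistency; recording information about external entities, processes, data flows, and data stores (files); and automatically generating the data dictionary from this information. The paper describes the design of the visual data flow diagram editing tool using UML, designs a data flow consistency preservation algorithm, and presents the results of running it.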

5.
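Research on Data Stream Analysis and Techniques   Total citations: 1 (self-citations: 0, citations by others: 1)
As a new form of data, data streams differ from traditional static data: they are continuous and fast, transient, and unpredictable, which makes analyzing and mining them effectively very challenging. This paper introduces the basic concepts of data streams, data stream models, data stream processing models, and several current data stream management systems, and summarizes and classifies data stream techniques and their mining algorithms.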

6.
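This paper introduces the definition and characteristics of data streams and the basic concepts of frequent patterns in data streams. In view of the characteristics of data streams, it discusses and analyzes current frequent pattern mining algorithms for data streams in China and abroad, their properties, and their applications, and finally looks ahead to further research on frequent pattern mining over data streams.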

7.
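Service processes must handle the exchange of large amounts of heterogeneous data between services, and the way data flows are handled directly affects the execution efficiency of a service process. This paper describes the data flow representation models, data mapping mechanisms, and data flow validation mechanisms in service process models; discusses data management issues during service process execution, such as data flow scheduling, data storage, and data transmission; and analyzes how data flow processing is applied in service processes. Finally, drawing on current progress in data flow research, it offers an outlook on future work.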

8.
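An Analysis of Data Stream Management and Mining Techniques   Total citations: 2 (self-citations: 1, citations by others: 1)
Data stream management and mining is one of the new research directions in the database field. This paper outlines development trends in database technology and the concept, characteristics, architecture, and application areas of data streams; analyzes the construction of synopsis data structures for data streams and continuous approximate query techniques; and finally introduces data stream mining techniques. It aims to describe the state of development of data stream management and mining techniques and to provide a useful reference for further research.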

9.
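Based on data flow framework theory, this paper shows how to apply data flow analysis to Java bytecode. By establishing the relationships between data flow and semilattices and between data flow and the function call graph, it analyzes type information. Experiments show that this data flow analysis method can analyze the type information in files fairly accurately.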

10.
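A Parallel Method for Computing Data Stream Quantiles Using GPU Technology   Total citations: 1 (self-citations: 0, citations by others: 1)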
周勇  王皓  程春田 《计算机应用》2010,30(2):543-546
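Because data streams arrive continuously, rapidly, and in real time, they must be processed in real time. Quantile information is often used to describe the statistics of low-dimensional data streams. To compute it, this paper exploits the computing power and high memory bandwidth of graphics processing units (GPUs) and proposes a data stream processing model based on the Compute Unified Device Architecture (CUDA), together with a parallel quantile computation method built on that model. Experiments show that, while providing accuracy no lower than that of a CPU-only quantile algorithm, the method significantly increases the real-time computation bandwidth for data stream quantiles.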

11.
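A wireless sensor network is essentially a data-centric network; the data it processes are continuous data streams collected by sensors. Existing data management techniques therefore treat a wireless sensor network as a distributed database made up of continuous data streams from the physical world. Sensor nodes have limited computing power, storage capacity, communication capability, and battery energy; together with the properties of flash memory and of data streams themselves, this creates new data management challenges that traditional distributed database systems do not face. This paper reviews the state of research on data management for wireless sensor networks in terms of database system architecture, data storage and indexing, data models, and query processing and optimization.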

12.
In recent times, data are generated in the form of continuous data streams in many applications. Since handling data streams is necessary and discovering knowledge behind data streams can often yield substantial benefits, mining over data streams has become one of the most important issues. Many approaches for mining frequent itemsets over data streams have been proposed. These approaches often consist of two procedures: continuously maintaining synopses for data streams and finding frequent itemsets from the synopses. However, most of the approaches assume that the synopses of data streams can be saved in memory and ignore the fact that the information about non-frequent itemsets kept in the synopses may significantly degrade memory utilization. In this paper, we consider compressing the information of all the itemsets into a structure with a fixed size using a hash-based technique. This hash-based approach summarizes the information of the whole data stream by using a hash table, provides a novel technique to estimate the support counts of the non-frequent itemsets, and keeps only the frequent itemsets for speeding up the mining process. Therefore, the goal of optimizing memory space utilization can be achieved. The correctness guarantee, error analysis, and parameter setting of this approach are presented, and a series of experiments is performed to show the effectiveness and the efficiency of this approach.
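The abstract above describes compressing itemset information into a fixed-size hash structure and estimating support counts from it, but gives no algorithmic detail. As a rough stand-in, the sketch below uses a count-min-style table, which is a different and well-known technique, to show how a fixed amount of memory can yield support estimates. The class `HashedItemsetCounter`, its parameters `width`, `depth`, and `max_len`, and the use of string items are all assumptions for illustration.

```python
import zlib
from itertools import combinations


class HashedItemsetCounter:
    """Fixed-size, count-min-style summary of itemset support counts.

    Illustrative stand-in only; not the structure proposed in the paper.
    Items are assumed to be strings.
    """

    def __init__(self, width=1024, depth=3, max_len=2):
        self.width = width      # counters per row
        self.depth = depth      # number of independent hash rows
        self.max_len = max_len  # largest itemset size that is tracked
        self.table = [[0] * width for _ in range(depth)]

    def _buckets(self, itemset):
        # Canonical key: sorted items, so ("b", "a") and ("a", "b") collide on purpose.
        key = ",".join(sorted(itemset))
        for row in range(self.depth):
            # Salting with the row index gives a different hash per row.
            yield row, zlib.crc32(f"{row}:{key}".encode()) % self.width

    def add_transaction(self, items):
        """Count every sub-itemset of the transaction up to max_len items."""
        unique = sorted(set(items))
        for k in range(1, self.max_len + 1):
            for itemset in combinations(unique, k):
                for row, col in self._buckets(itemset):
                    self.table[row][col] += 1

    def estimate(self, itemset):
        """Return an upper bound on the support count of `itemset`."""
        return min(self.table[row][col] for row, col in self._buckets(itemset))


counter = HashedItemsetCounter()
for transaction in (["a", "b", "c"], ["a", "b"], ["b", "c"]):
    counter.add_transaction(transaction)
print(counter.estimate(["a", "b"]))  # never below the true count of 2
```

Memory use stays at `width * depth` counters regardless of stream length. Hash collisions can only inflate an estimate, so the returned value is an upper bound on the true support count.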

13.
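A Data Stream Frequent Itemset Mining Algorithm Based on a Variable-Scale Sliding Window   Total citations: 2 (self-citations: 0, citations by others: 2)
Frequent itemset mining algorithms for data streams based on the traditional sliding window mechanism focus mostly on speed and accuracy and pay less attention to the time-varying nature of data streams. This paper improves the traditional sliding window mechanism. Taking into account both the massive volume and the time-varying nature of data streams, it proposes V-Stream, a frequent itemset mining algorithm based on a variable-scale sliding window mechanism. The algorithm uses a group of transaction linked lists as its synopsis data structure and adaptively adjusts the window size as the data distribution of the stream changes. Simulation experiments on Eclipse show that V-Stream mines frequent itemsets from data streams with better time and space efficiency than the Manku algorithm.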

14.
This work aims to connect two rarely combined research directions, i.e., non-stationary data stream classification and data analysis with skewed class distributions. We propose a novel framework employing stratified bagging for training base classifiers to integrate data preprocessing and dynamic ensemble selection methods for imbalanced data stream classification. The proposed approach has been evaluated based on computer experiments carried out on 135 artificially generated data streams with various imbalance ratios, label noise levels, and types of concept drift as well as on two selected real streams. Four preprocessing techniques and two dynamic selection methods, used on both bagging classifiers and base estimators levels, were considered. Experimentation results showed that, for highly imbalanced data streams, dynamic ensemble selection coupled with data preprocessing could outperform online and chunk-based state-of-the-art methods.

15.
These days, endless streams of data are generated by various sources such as sensors, applications, users, etc. Due to possible issues in sources, such as malfunctions in sensors, platforms, or communication, the generated data might be of low quality, and this can lead to wrong outcomes for the tasks that rely on these data streams. Therefore, controlling the quality of data streams has become increasingly significant. Many approaches have been proposed for controlling the quality of data streams, and hence, various research areas have emerged in this field. To the best of our knowledge, there is no systematic literature review of research papers within this field that comprehensively reviews approaches, classifies them, and highlights the challenges. In this paper, we present the state of the art in the area of quality control of data streams, and characterize it along four dimensions. The first dimension represents the goal of the quality analysis, which can be either quality assessment or quality improvement. The second dimension focuses on the quality control method, which can be online, offline, or hybrid. The third dimension focuses on the quality control technique, and finally, the fourth dimension represents whether the quality control approach uses any contextual information (inherent, system, organizational, or spatiotemporal context) or not. We compare and critically review the related approaches proposed in the last two decades along these dimensions. We also discuss the open challenges and future research directions.

16.
A data stream is a massive, open-ended sequence of data elements continuously generated at a rapid rate. Mining data streams is more difficult than mining static databases because of the huge, high-speed, and continuous characteristics of streaming data. In this paper, we propose a new one-pass algorithm called DSM-MFI (which stands for Data Stream Mining for Maximal Frequent Itemsets), which mines the set of all maximal frequent itemsets in landmark windows over data streams. A new summary data structure called the summary frequent itemset forest (abbreviated as SFI-forest) is developed for incrementally maintaining the essential information about maximal frequent itemsets embedded in the stream so far. Theoretical analysis and experimental studies show that the proposed algorithm is efficient and scalable for mining the set of all maximal frequent itemsets over the entire history of the data streams.
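To make the term "maximal frequent itemset" in the abstract above concrete, here is a brute-force sketch over a landmark window, that is, over every transaction seen since a fixed starting point. It is not DSM-MFI and does not use an SFI-forest: it enumerates every subset of every transaction, so it is only practical for tiny examples. The function name and the `min_support` parameter (a fraction of the transactions seen) are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_support):
    """Return the maximal frequent itemsets among all transactions seen so far.

    Brute-force illustration only: the cost is exponential in transaction length.
    """
    counts = Counter()
    seen = 0
    for transaction in transactions:  # single pass over the stream
        seen += 1
        items = sorted(set(transaction))
        for k in range(1, len(items) + 1):
            for subset in combinations(items, k):
                counts[frozenset(subset)] += 1

    threshold = min_support * seen
    frequent = [itemset for itemset, count in counts.items() if count >= threshold]
    # An itemset is maximal if no frequent itemset is a proper superset of it.
    return {s for s in frequent if not any(s < other for other in frequent)}


stream = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
result = maximal_frequent_itemsets(stream, min_support=0.5)
print(sorted(sorted(itemset) for itemset in result))  # [['a', 'b'], ['a', 'c']]
```

In this example {a}, {b}, {c}, {a, b}, and {a, c} each appear in at least two of the four transactions, but only {a, b} and {a, c} have no frequent proper superset, so those two are the maximal ones.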

17.
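A Survey of Mobile Data Mining Based on Data Streams   Total citations: 1 (self-citations: 1, citations by others: 0)
The use of wireless networks and mobile devices brings great convenience, allowing information to be obtained anytime and anywhere, and it also creates demand for efficient data stream analysis tools. Mobile data mining is data stream mining in a pervasive environment: it discovers knowledge from continuous data streams. This paper discusses data streams, data stream management systems, and mobile data mining along with their characteristics; introduces some research results in the field; highlights the challenges and some corresponding strategies and compares those strategies; and finally looks ahead to research prospects in this area.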

18.
Querying live media streams is a challenging problem that is becoming an essential requirement in a growing number of applications. Research in multimedia information systems has addressed and made good progress in dealing with archived data. Meanwhile, research in stream databases has received significant attention for querying alphanumeric symbolic streams. The lack of a data model capable of representing different multimedia data in a declarative way, hiding the media heterogeneity and providing reasonable abstractions for querying live multimedia streams, poses the challenge of how to make the best use of data in video, audio, and other media sources for various applications. In this paper we propose a system that enables directly capturing media streams from sensors and automatically generating more meaningful feature streams that can be queried by a data stream processor. The system provides an effective combination between extendible digital processing techniques and general data stream management research. Together with other query techniques developed in related data stream management streams, our system can be used in those application areas where multifarious live media sensors are deployed for surveillance, disaster response, live conferencing, telepresence, etc.
Bin Liu

19.
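Data Stream Management Techniques   Total citations: 1 (self-citations: 1, citations by others: 1)
Recently, it has become widely recognized that in some new application areas it is more appropriate to treat data as transient data streams than as persistent relations. This paper first analyzes the limitations of traditional database management systems in processing data streams, then examines the basic implementation techniques of three typical data stream management systems, discusses the current state of research and future directions in data stream management, and finally presents the architecture of a prototype data stream management system.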

20.
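Research and Progress in Data Stream Management Systems   Total citations: 6 (self-citations: 2, citations by others: 4)
This paper surveys the state of research on data stream management systems and related techniques, including an explanation of the basic concepts, a review of existing experimental systems, and the problems in stream queries and their solutions, and offers some new views on how to pursue future research on data stream management systems.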
