Similar literature
 20 similar documents found
1.
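Adaptive publishing algorithm for fragmentation-based XML data streams   Total citations: 1, self-citations: 0, citations by others: 1
The fragmentation strategy is the primary problem facing a fragmentation-based XML data stream publishing system. Addressing the cost of parsing and joining XML fragments in such streams, this paper proposes a basic cost model for XML data streams based on the Hole-Filler model and, on top of it, the adaptive stream publishing algorithm AXF, which aims to adjust the XML fragmentation strategy automatically as data and queries change, so as to obtain the best system performance, adaptivity, and scalability. Experimental results show that AXF improves the effective rate of XML fragments and performs well on the client, on the server, and in network transmission.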

2.
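Joins are a crucial part of database research. In the unbounded, continuous data stream model, limited storage and real-time requirements mean that join algorithms mainly rely on sliding windows for approximate processing. This work studies a special join over data streams, named the point join. In a point join, for every r∈R (the primary stream) there is a unique corresponding s∈S (the secondary stream) such that s.a=r.a and s.time is closest to r.time (time is called the temporal feature). Data on streams R and S are therefore in an n∶1 relationship. In real distributed environments, however, network conditions and other factors often make the arrival time and order of stream data inconsistent, which lowers the join success rate. A new join query processing algorithm is proposed that obtains more join output in complex network environments. Experiments simulate 2 network environments and validate the algorithm under both in-order and out-of-order arrival, showing that it outperforms existing algorithms.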
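
The point-join definition in this abstract (same value of a, closest time) can be illustrated with a small batch sketch. It is not the algorithm proposed in the paper: it ignores sliding windows and out-of-order arrival, and the function name `point_join` and the `(key, time)` tuple layout are assumptions made for illustration.

```python
from bisect import bisect_left
from collections import defaultdict


def point_join(primary, secondary):
    """Match each primary tuple (key, time) with the secondary tuple that has
    the same key and the closest time. Unmatched primary tuples get None."""
    # Group secondary timestamps by join key and sort them for binary search.
    times_by_key = defaultdict(list)
    for key, t in secondary:
        times_by_key[key].append(t)
    for times in times_by_key.values():
        times.sort()

    result = []
    for key, t in primary:
        times = times_by_key.get(key)
        if not times:
            result.append(((key, t), None))
            continue
        # The closest timestamp is either just before or just after position i.
        i = bisect_left(times, t)
        candidates = times[max(i - 1, 0):i + 1]
        best = min(candidates, key=lambda s_time: abs(s_time - t))
        result.append(((key, t), (key, best)))
    return result
```

Several primary tuples may map to the same secondary tuple, which matches the n∶1 relationship described above. On a tie, this sketch picks the earlier secondary timestamp; the abstract does not say how ties are resolved.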

3.
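Stream joins often serve as a supporting algorithm for stream query operations. Previous algorithms mostly consider periodically evolving data streams and rarely address joins over non-periodic streams. A stream join algorithm under a changing Gaussian distribution is proposed. The current Gaussian center is determined by sampling statistics, and data blocks are partitioned around that center. A method for determining the data blocks to join under a changing Gaussian distribution is also proposed. Experiments show that, compared with similar algorithms, this algorithm achieves a higher join rate and a lower I/O cost with limited memory.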

4.
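Sequence data is an important data type that is ubiquitous in applications such as text, Web access logs, and biological databases, and similarity queries over it are an important means of obtaining useful information. One key factor for efficient similarity queries in large sequence databases is the filtering power of the query algorithm; that is, it is important to design filters that quickly discard the sequences unrelated to the query sequence. This work proposes SSQ_MF, a multi-filter algorithm that combines the metric property of sequence distance with features of the sequences themselves. SSQ_MF uses a length filter, a prefix filter, and a reference-set-based filter, which makes its filtering power stronger than that of single-filter algorithms. In addition, data structures are designed to precompute and store statistics about the queried database, so that the size of each filter's filter set can be estimated effectively, and an optimal filter-ordering model determined by those sizes is built to minimize the filtering cost. Experimental results show that SSQ_MF outperforms single-filter algorithms and multi-filter algorithms with a random filter order.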
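
To make the idea of stacking cheap filters in front of an expensive distance computation concrete, here is a minimal sketch. It assumes the sequence distance is edit distance and uses a single reference sequence; neither choice is stated in the abstract. The prefix filter and the filter-ordering model of SSQ_MF are not reproduced, and the names `edit_distance` and `similarity_query` are hypothetical.

```python
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def similarity_query(query, database, ref, k):
    """Return all sequences within edit distance k of the query."""
    # Distances to the reference would be precomputed offline in practice.
    ref_dist = {s: edit_distance(s, ref) for s in database}
    q_ref = edit_distance(query, ref)
    hits = []
    for s in database:
        # Length filter: a length gap above k implies distance above k.
        if abs(len(s) - len(query)) > k:
            continue
        # Reference filter: by the triangle inequality,
        # |d(q, ref) - d(s, ref)| <= d(q, s).
        if abs(ref_dist[s] - q_ref) > k:
            continue
        # Only the survivors pay for the full distance computation.
        if edit_distance(query, s) <= k:
            hits.append(s)
    return hits
```

Both filters only discard sequences that cannot qualify, so the result is the same as a brute-force scan; the filters merely reduce how often the full distance is computed.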

5.
Continuous Skyline node query algorithm in wireless sensor networks   Total citations: 2, self-citations: 0, citations by others: 2
信俊昌  王国仁 《计算机学报》2012,35(11):2415-2430
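As one of the important means of multi-objective decision making, Skyline node queries play a very important role in sensor network applications. This paper analyzes the properties of Skyline node queries in depth and proposes a filter-based continuous Skyline node query algorithm (FIlter based Skyline moniToring algorithm, FIST). FIST comprises 3 filtering modes, bottom-up, top-down, and hybrid, all of which place local or global filters on sensor nodes to avoid unnecessary data transmission and thereby save node energy. Bottom-up filtering caches previous Skyline results as a local filter to avoid retransmitting data, whereas top-down filtering sets a hypercube as a global filter to avoid repeated data updates. Because each has strengths and weaknesses, a hybrid mode is proposed that selects a suitable filter for each node to combine their strengths. Results of extensive simulation experiments show that FIST effectively reduces the communication cost of sensor nodes during continuous Skyline node queries and thus lowers the energy consumption of the sensor network.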
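
For readers unfamiliar with the Skyline operator that this entry builds on, the following sketch computes a skyline by pairwise dominance tests. It assumes smaller values are better in every dimension, and it illustrates only the operator itself, not FIST or its filters; the function names are illustrative.

```python
def dominates(p, q):
    """p dominates q if p is no worse in every dimension and strictly
    better in at least one (smaller is better here, by assumption)."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def skyline(points):
    """Return the points not dominated by any other point (quadratic scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

A point that is dominated by some current reading cannot be in the skyline, which is the property that makes it possible to suppress transmissions with a filter.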

6.
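To address spatial join query processing in cloud environments, a Spark-based multi-way spatial join query processing algorithm, BSMWSJ, is proposed. The algorithm uses grid partitioning to divide the whole data space into grid cells of equal size and assigns the spatial objects of each data set to the corresponding cells according to their spatial position; spatial objects in different cells are then joined in parallel. During multi-way spatial join processing, boundary filtering is used to discard useless data: the MBR of the candidate results of preceding join operations is computed and used to filter the data sets joined later, which removes useless join objects and reduces redundant projection and replication of join objects. A duplicate-avoidance strategy reduces the output of duplicate results, further lowering the cost of subsequent join computation. Extensive experiments on synthetic and real data sets show that the proposed algorithm clearly outperforms existing multi-way join query processing algorithms.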

7.
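A filter based algorithm (FBA) is proposed to continuously maintain sliding-window skyline queries in sensor networks. First, two methods that use tuple filters and grid filters to reduce the amount of data transmitted in the network are studied. Because each has strengths and weaknesses, an adaptive filtering method is proposed that selects a suitable filter according to the data distribution. In addition, a series of optimizations is proposed to further improve the energy efficiency of the algorithm. Experiments on simulated and real data show that FBA and its optimizations effectively reduce the communication cost of continuously maintaining sliding-window skylines in sensor networks and thus save network energy.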

8.
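The characteristics and applications of Ad hoc networks are studied, the problems faced by traditional multi-join query processing strategies for data streams in Ad hoc networks are analyzed, and on this basis a synopsis-based multi-stream join optimization algorithm (SMJ) is proposed. Experimental results show that SMJ can greatly reduce the communication cost of stream join query processing in Ad hoc networks.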

9.
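Structural join algorithm for XML queries   Total citations: 1, self-citations: 0, citations by others: 1
Most current XML structural join methods incur excessive cost for temporarily sorting the input or building an index when the input element sets are unindexed or unordered. To address this, the classic Stack-Tree-Desc algorithm and optimized algorithms based on B+ tree indexes are analyzed, and an XML query optimization strategy that does not depend on external index structures is proposed together with an implementation. Experimental results show that the algorithm achieves higher query efficiency than Stack-Tree-Desc.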

10.
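Minimum-spanning-tree-based window join optimization algorithm for data streams   Total citations: 1, self-citations: 1, citations by others: 0
Unlike traditional relational databases, data stream management systems mainly process concurrent continuous queries. Because queries may be added or removed at any time, such systems focus on optimizing concurrent continuous queries in a way that accommodates query insertion and deletion, rather than on optimizing single queries. A window join optimization algorithm for data streams is proposed for environments where queries are added and removed frequently. For a newly registered query, the probe sequence over the data streams is derived with an algorithm similar to minimum spanning tree construction, and it is then merged with existing sequences as far as possible without changing the probe order of other queries, reducing repeated computation. Registering or deleting a query does not affect other query plans, so no cumbersome query plan migration is needed. Theoretical analysis and experiments show that the algorithm is simple, that its optimization performance is within an acceptable range, and that it is especially suitable for systems with a high query update frequency.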

11.
XML stream filtering has gained widespread attention from the research community in recent years. There have been many efforts to improve the performance of XML filtering systems by utilizing XML schema information. In this paper, we design and implement an XML stream filtering system, SFilter, which uses DTD or XML schema information to improve performance. We propose simplification and two kinds of optimization, static and dynamic. Simplification and static optimization transform the XPath queries to build automata that serve as an index structure for filtering. Dynamic optimization is performed at run time, during filtering. We developed five kinds of static optimization and two kinds of dynamic optimization. We present a novel filtering algorithm for the resulting transformed XPath queries and for run-time optimization. The experimental results show that our system filters XML streams efficiently.

12.
李军  廖豪  陈洁  谭建龙 《计算机科学》2010,37(12):22-25
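Multimedia data streams contain multiple data forms (text, images, audio and video) and multiple channels of information (address information, link information, time and session information, etc.). The channels of a multimedia data stream are correlated in content to some degree. Previous work on multimedia filtering is limited to a single data modality and supports neither fused filtering across modalities nor correlated filtering across data channels. A new multimedia data stream filtering model that supports multi-modal fused filtering and multi-channel joint filtering (the MCFMS model for short) is proposed. Experiments on real multimedia data streams show that, in complex stream environments, the MCFMS model can effectively perform multi-modal fused filtering and multi-channel joint filtering.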

13.
Based on the assumption that selections are zero-expense operations, “selection pushdown” rules, which apply selections in random order before as many joins as possible in order to reduce subsequent join costs, have been widely applied in traditional query optimization methods. However, in multimedia information systems, selections generally contain expensive multimedia operations, so “pushdown” rules can no longer produce an optimal query execution plan. In this paper, we therefore develop a theory for optimizing queries with expensive multimedia operations, which establishes the optimal placement of each multimedia operation in a query plan by jointly considering the selectivity and unit execution cost of each operation. We then present an algorithm for the theory and implement it in a prototype system. Experimental results show that, compared with traditional optimization algorithms, our algorithm not only has a modest time complexity that is polynomial in the number of multimedia operations in a query plan, but can also reduce the execution cost of a query plan by orders of magnitude.
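
The trade-off between selectivity and unit cost can be sketched for the simplest case of a chain of independent selections with no joins in between. The rule used here, sorting by cost / (1 - selectivity), is the classic rank ordering for expensive predicates and is an assumption for illustration; the abstract does not give the paper's own placement theory. Predicates are modelled as hypothetical `(cost, selectivity)` pairs.

```python
def expected_cost(preds):
    """Expected per-tuple cost of applying (cost, selectivity) predicates
    in the given order, assuming independent predicates."""
    total, pass_prob = 0.0, 1.0
    for cost, selectivity in preds:
        total += pass_prob * cost  # only surviving tuples pay this cost
        pass_prob *= selectivity   # fraction that survives this predicate
    return total


def order_by_rank(preds):
    """Cheap, highly selective predicates first: ascending cost / (1 - sel)."""
    def rank(pred):
        cost, selectivity = pred
        # A predicate that filters nothing goes last.
        return cost / (1.0 - selectivity) if selectivity < 1.0 else float("inf")
    return sorted(preds, key=rank)
```

For example, with predicates (100.0, 0.1), (1.0, 0.5), and (10.0, 0.9), the expensive but very selective predicate is placed last, because the two cheaper predicates already discard more than half of the tuples before it runs.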

14.
We propose a novel partition path-based (PPB) grouping strategy to store compressed XML data in a stream of blocks. In addition, we employ a minimal indexing scheme called block statistic signature (BSS) on the compressed data, which is a simple but effective technique to support evaluation of selection and aggregate XPath queries of the compressed data. We present a formal analysis and empirical study of these techniques. The BSS indexing is first extended into effective cluster statistic signature (CSS) and multiple-cluster statistic signature (MSS) indexing by establishing more layers of indexes. We analyze how the response time is affected by various parameters involved in our compression strategy such as the data stream block size, the number of cluster layers, and the query selectivity. We also gain further insight about the compression and querying performance by studying the optimal block size in a stream, which leads to the minimum processing cost for queries. The cost model analysis provides a solid foundation for predicting the querying performance. Finally, we demonstrate that our PPB grouping and indexing strategies are not only efficient enough to support path-based selection and aggregate queries of the compressed XML data, but they also require relatively low computation time and storage space when compared with other state-of-the-art compression strategies.

15.
Optimizing top-k selection queries over multimedia repositories   Total citations: 2, self-citations: 0, citations by others: 2
Repositories of multimedia objects having multiple types of attributes (e.g., image, text) are becoming increasingly common. A query on these attributes will typically request not just a set of objects, as in the traditional relational query model (filtering), but also a grade of match associated with each object, which indicates how well the object matches the selection condition (ranking). Furthermore, unlike in the relational model, users may just want the k top-ranked objects for their selection queries for a relatively small k. In addition to the differences in the query model, another peculiarity of multimedia repositories is that they may allow access to the attributes of each object only through indexes. We investigate how to optimize the processing of top-k selection queries over multimedia repositories. The access characteristics of the repositories and the above query model lead to novel issues in query optimization. In particular, the choice of the indexes used to search the repository strongly influences the cost of processing the filtering condition. We define an execution space that is search-minimal, i.e., the set of indexes searched is minimal. Although the general problem of picking an optimal plan in the search-minimal execution space is NP-hard, we present an efficient algorithm that solves the problem optimally with respect to our cost model and execution space when the predicates in the query are independent. We also show that the problem of optimizing top-k selection queries can be viewed, in many cases, as that of evaluating more traditional selection conditions. Thus, both problems can be viewed together as an extended filtering problem to which techniques of query processing and optimization may be adapted.

16.
The filtering of incoming tuples of a data stream should be completed quickly and continuously, which imposes strict time and space constraints. In order to meet these constraints, the selection predicates of continuous queries are grouped or indexed in most data stream management systems (DSMS). This paper proposes a new scheme called attribute selection construct (ASC). Given a set of continuous queries, an ASC divides the domain of an attribute of a data stream into a set of disjoint regions based on the selection predicates that are imposed on the attribute. Each region maintains the pre-computed matching results of the selection predicates. Consequently, an ASC can collectively evaluate all of its selection predicates at the same time. Furthermore, it can also dynamically monitor the overall evaluation statistics, such as its selectivity and tuple dropping ratio. For those attributes that are employed to express the selection predicates of the queries, the processing order of their ASCs can significantly influence the overall performance of a multiple query evaluation. The evaluation sequence can be optimized by periodically capturing the run-time tuple dropping ratio of the current evaluation sequence. The performance of the proposed method is analyzed by a series of experiments to identify its various characteristics.
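
The region idea described in this abstract can be sketched as follows: the endpoints of all predicates on one attribute split its domain into disjoint regions, and each region stores the set of queries it satisfies, so one binary search evaluates every predicate at once. This is a minimal sketch assuming half-open range predicates `lo <= x < hi`; the class name `AttributeRegions` and its layout are hypothetical and not the paper's ASC data structure.

```python
from bisect import bisect_right


class AttributeRegions:
    """Precomputed matching results for range predicates on one attribute."""

    def __init__(self, predicates):
        # predicates: dict mapping query id -> (lo, hi), meaning lo <= x < hi.
        self.bounds = sorted({b for lo, hi in predicates.values()
                              for b in (lo, hi)})
        # Region i covers [bounds[i], bounds[i + 1]). No predicate endpoint
        # lies strictly inside a region, so testing the left edge is enough.
        self.matches = [
            frozenset(q for q, (lo, hi) in predicates.items() if lo <= left < hi)
            for left in self.bounds[:-1]
        ]

    def lookup(self, value):
        """Return the ids of all queries whose predicate accepts value."""
        i = bisect_right(self.bounds, value) - 1
        if i < 0 or i >= len(self.matches):
            return frozenset()  # outside every predicate range
        return self.matches[i]
```

An empty result from `lookup` means the tuple can be dropped for this attribute, which is the kind of event a tuple dropping ratio would count.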

17.
There is a growing interest in applications that utilize continuous sensing of individual activity or context, via sensors embedded or associated with personal mobile devices (e.g., smartphones). Reducing the energy overheads of sensor data acquisition and processing is essential to ensure the successful continuous operation of such applications, especially on battery-limited mobile devices. To achieve this goal, this paper presents a framework, called ACQUA, for ‘acquisition-cost’ aware continuous query processing. ACQUA replaces the current paradigm, where the data is typically streamed (pushed) from the sensors to the one or more smartphones, with a pull-based asynchronous model, where a smartphone retrieves appropriate blocks of relevant sensor data from individual sensors, as an integral part of the query evaluation process. We describe algorithms that dynamically optimize the sequence (for complex stream queries with conjunctive and disjunctive predicates) in which such sensor data streams are retrieved by the query evaluation component, based on a combination of (a) the communication cost & selectivity properties of individual sensor streams, and (b) the occurrence of the stream predicates in multiple concurrently executing queries. We also show how a transformation of a group of stream queries into a disjunctive normal form provides us with significantly greater degrees of freedom in choosing this sequence, in which individual sensor streams are retrieved and evaluated. While the algorithms can apply to a broad category of sensor-based applications, we specifically demonstrate their application to a scenario where multiple stream processing queries execute on a single smartphone, with the sensors transferring their data over an appropriate PAN technology, such as Bluetooth or IEEE 802.11. 
Extensive simulation experiments indicate that ACQUA’s intelligent batch-oriented data acquisition process can reduce the energy overhead of continuous query processing by as much as 80%, without any loss in the fidelity of the processing logic.

18.
This paper describes a unified data model that represents multimedia, timeline, and simulation data utilizing a single set of related data modeling constructs. A uniform model for multimedia types structures image, sound, video, and long text data in a consistent way, giving multimedia schemas and queries a degree of data independence even for these complex data types. Information that possesses an intrinsic temporal element can all be represented using a construct called a stream. Streams can be aggregated into parallel multistreams, thus providing a structure for viewing multiple sets of time-based information. The unified stream construct permits real-time measurements, numerical simulation data, and visualizations of that data to be aggregated and manipulated using the same set of operators. Prototypes based on the model have been implemented for two medical application domains: thoracic oncology and thermal ablation therapy of brain tumors. Sample schemas, queries, and screenshots from these domains are provided. Finally, a set of examples is included for an accompanying visual query language discussed in detail in another document.

19.
Query matching on XML streams is challenging in terms of efficiency when the amount of queried stream data is huge and the data arrive continuously. In this paper, the method Syntactic Twig-Query Matching (STQM) is proposed to process queries on an XML stream and return the query results continuously and immediately. STQM matches twig queries on the XML stream in a syntactic manner by using a lexical analyzer and a parser, both of which are built from our lexical-rules and grammar-rules generators according to the user's queries and document schema, respectively. For query matching, the lexical analyzer scans the incoming XML stream and the parser recognizes XML structures for retrieving every twig-query result from the XML stream. Moreover, STQM obtains query results without a post-phase for excluding false positives, which are common in many streaming query methods. Through the experimental results, we found that STQM matches the twig query efficiently and also scales well in both the queried data size and the branch degree of the twig query. The proposed method takes less execution time than a sequence-based approach, which is widely accepted as a proper solution to the XML stream query.

20.
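Sliding-window aggregate queries are widely used in data stream management systems, and during peaks in stream arrival the load shedding problem in such queries must be considered. The characteristics of the subset model and the shortcomings of existing load shedding strategies are analyzed, the constraints of the load shedding problem for sliding-window aggregate queries over data streams are given, and a load shedding algorithm based on a window-dropping update strategy that guarantees the production of subset results is proposed. Theoretical analysis and experimental results show that the algorithm is effective and practical for this load shedding problem.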
