1.
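Continuous predictive aggregate queries over data streams  Total citations: 3, self-citations: 0, citations by others: 3
This paper proposes a kind of continuous query over the future values of a data stream, called the continuous predictive query. Using mathematical statistics, it gives an algorithm for continuous predictive aggregate queries with the COUNT aggregate function. Experiments were run on TPC-H benchmark data and on randomly generated synthetic data. Theoretical and experimental results show that the algorithm achieves high performance and accuracy.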
2.
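Continuous queries over spatial-textual data streams (CQST) are widely used in location-based services; over continuously updated data streams, they persistently monitor the results that satisfy spatial and textual constraints. To match objects in the stream to CQSTs as quickly as possible, building efficient filtering techniques over the CQSTs is key. CQST query evaluation methods: selecting an appropriate spatial-textual index for the queries, constructing efficient filtering strategies to improve the index's spatial-textual filtering performance, for objects arriving in the data stream...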
3.
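Predictive aggregate query processing algorithms over data streams  Total citations: 16, self-citations: 3, citations by others: 16
Predicting the future trend of real-time data streams is of great practical importance. For example, in an environment-monitoring sensor network, an observer can issue predictive aggregate queries over the sensed data stream to predict the average temperature and humidity of the covered region over a future period, and so determine whether an abnormal event will occur. Most current research focuses on queries over the current data of a stream; little work addresses predictive queries over data streams. Using multiple linear regression, the paper gives a prediction model for aggregate values over a data stream and proposes a method for processing predictive aggregate queries. When the number of prediction failures exceeds a preset threshold, an automatic model adjustment strategy is applied to reduce the prediction error. The paper also presents a mathematical model of how the sliding window's update period and the stream's rate affect prediction accuracy. Theoretical analysis and experimental results show that the proposed algorithm performs well and returns predictive query results that meet the user's accuracy requirement. In the experiments, the data streams are constructed from TPC-H international standard benchmark data and from sea-surface air temperature data measured by TAO (Tropical Atmosphere Ocean).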
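The refit-on-repeated-failure idea in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it swaps in a simple least-squares line over time in place of the multiple linear regression model, and the names `fit_line`, `PredictiveAggregate`, `max_error`, and `max_failures` are hypothetical.

```python
from collections import deque


def fit_line(points):
    """Ordinary least-squares fit of y = a + b*t over (t, y) pairs."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    if var_t == 0:
        return mean_y, 0.0  # degenerate window: predict the mean
    cov = sum((t - mean_t) * (y - mean_y) for t, y in points)
    b = cov / var_t
    return mean_y - b * mean_t, b


class PredictiveAggregate:
    """Predicts a windowed aggregate and refits after too many misses.

    Assumption: a prediction "fails" when its absolute error exceeds
    max_error; the model is refitted once failures exceed max_failures.
    """

    def __init__(self, window_size, max_error, max_failures):
        self.history = deque(maxlen=window_size)  # recent (t, aggregate) pairs
        self.max_error = max_error
        self.max_failures = max_failures
        self.failures = 0
        self.refits = 0
        self.model = None  # (intercept, slope) once fitted

    def predict(self, t):
        a, b = self.model
        return a + b * t

    def observe(self, t, aggregate):
        # Score the current model against the newly observed aggregate.
        if self.model is not None:
            if abs(self.predict(t) - aggregate) > self.max_error:
                self.failures += 1
        self.history.append((t, aggregate))
        first_fit = self.model is None and len(self.history) >= 2
        too_many_failures = self.failures > self.max_failures
        if first_fit or too_many_failures:
            self.model = fit_line(list(self.history))
            self.failures = 0
            self.refits += 1


if __name__ == "__main__":
    q = PredictiveAggregate(window_size=3, max_error=0.5, max_failures=2)
    for t in range(5):
        q.observe(t, 2 * t + 1)   # rising trend
    for t in range(5, 8):
        q.observe(t, 100 - t)     # trend changes; the model is refitted
    print(q.predict(8), q.refits)
```

Refitting only on the most recent window keeps the adjustment cheap and lets stale data age out, which is one plausible reading of why window update period matters for accuracy.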
4.
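Sliding-window-based data stream compression techniques and continuous query processing methods  Total citations: 8, self-citations: 0, citations by others: 8
Continuous query processing based on sliding windows is a hot topic in data stream research. Existing work all assumes that the data in the sliding window fits entirely in main memory; if the volume of data in the window exceeds the available main memory, existing query processing methods cannot work properly. The paper proposes two sliding-window compression techniques for data streams that effectively reduce the window's storage requirement. It also gives continuous query processing algorithms based on the compressed sliding window. Theoretical analysis and experimental results show that these algorithms perform well and meet the real-time requirements of continuous query processing over data streams.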
5.
6.
7.
Letchner Julie, Ré Christopher, Balazinska Magdalena, Philipose Matthai 《Internet Computing, IEEE》2008,12(6):30-36
Building applications on top of sensor data streams is challenging because sensor data is noisy. A model-based view can reduce noise by transforming raw sensor streams into streams of probabilistic state estimates, which smooth out errors and gaps. The authors propose a novel model-based view, the Markovian stream, to represent correlated probabilistic sequences. Applications interested in evaluating event queries (extracting sophisticated state sequences) can improve robustness by querying a Markovian stream view instead of querying raw data directly. The primary challenge is to properly handle the Markovian stream's correlations.
8.
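Automaton-based multi-query processing over XML streams  Total citations: 1, self-citations: 0, citations by others: 1
XML stream processing has attracted wide attention in the research community. This paper proposes an algorithm for processing multiple queries over an XML stream: the queries are merged into a single query tree with shared prefixes, and an automaton combined with a runtime stack processes all of them in a single scan of the XML stream. The algorithm uses a layered stack structure to hold the candidate set of query pattern matches, uses the interval encoding of XML nodes to determine the relationships between nodes, and returns the entire matching path.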
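The shared-prefix idea in the abstract above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's algorithm: queries are restricted to child-axis paths such as `/a/b`, the layered stack and interval encoding are omitted, and `build_query_trie` and `run_queries` are hypothetical names. The XML stream is modeled as a list of `("start", tag)` and `("end", tag)` events.

```python
def build_query_trie(queries):
    """Merge path queries such as '/a/b' into one prefix-sharing trie."""
    root = {"children": {}, "matches": []}
    for qid, path in queries.items():
        node = root
        for step in path.strip("/").split("/"):
            node = node["children"].setdefault(
                step, {"children": {}, "matches": []}
            )
        node["matches"].append(qid)
    return root


def run_queries(events, queries):
    """Single pass over start/end events; returns (query id, matched path)."""
    root = build_query_trie(queries)
    stack = [root]  # trie state per open element; None means no query can match
    path = []       # tags of the currently open elements
    results = []
    for kind, tag in events:
        if kind == "start":
            parent = stack[-1]
            node = parent["children"].get(tag) if parent is not None else None
            stack.append(node)
            path.append(tag)
            if node is not None:
                for qid in node["matches"]:
                    results.append((qid, "/" + "/".join(path)))
        else:  # "end": leave the element and restore the parent's state
            stack.pop()
            path.pop()
    return results


if __name__ == "__main__":
    # Events for <a><b><c/></b><d/></a>
    events = [
        ("start", "a"), ("start", "b"), ("start", "c"), ("end", "c"),
        ("end", "b"), ("start", "d"), ("end", "d"), ("end", "a"),
    ]
    queries = {"q1": "/a/b", "q2": "/a/b/c", "q3": "/a/d", "q4": "/x"}
    print(run_queries(events, queries))
```

Because `q1` and `q2` share the prefix `/a/b`, the element `b` is matched once for both queries, which is the saving that merging queries into one tree is meant to provide.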
9.
10.
Cheng Reynold, Kao Ben, Kwan Alan, Prabhakar Sunil, Tu Yicheng 《Knowledge and Data Engineering, IEEE Transactions on》2010,22(2):234-248
The idea of allowing query users to relax their correctness requirements in order to improve performance of a data stream management system (e.g., location-based services and sensor networks) has been recently studied. By exploiting the maximum error (or tolerance) allowed in query answers, algorithms for reducing the use of system resources have been developed. In most of these works, however, query tolerance is expressed as a numerical value, which may be difficult to specify. We observe that in many situations, users may not be concerned with the actual value of an answer, but rather which object satisfies a query (e.g., "who is my nearest neighbor?"). In particular, an entity-based query returns only the names of objects that satisfy the query. For these queries, it is possible to specify a tolerance that is "nonvalue-based." In this paper, we study fraction-based tolerance, a type of nonvalue-based tolerance, where a user specifies the maximum fractions of a query answer that can be false positives and false negatives. We develop fraction-based tolerance for two major classes of entity-based queries: 1) nonrank-based query (e.g., range queries) and 2) rank-based query (e.g., k-nearest-neighbor queries). These definitions provide users with an alternative to specify the maximum tolerance allowed in their answers. We further investigate how these definitions can be exploited in a distributed stream environment. We design adaptive filter algorithms that allow updates to be dropped conditionally at the data stream sources without affecting the overall query correctness. Extensive experimental results show that our protocols reduce the use of network and energy resources significantly.
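To make fraction-based tolerance concrete, the check below compares a returned answer set with the true answer set. It is a sketch under assumptions the abstract does not confirm: the false-positive fraction is taken relative to the returned answer, the false-negative fraction relative to the true answer, and the names `error_fractions` and `within_tolerance` are hypothetical.

```python
def error_fractions(returned, truth):
    """Return (false-positive fraction, false-negative fraction).

    Assumed definitions: false positives are measured against the size of
    the returned answer, false negatives against the size of the true answer.
    """
    returned, truth = set(returned), set(truth)
    false_pos = len(returned - truth)
    false_neg = len(truth - returned)
    fp_fraction = false_pos / len(returned) if returned else 0.0
    fn_fraction = false_neg / len(truth) if truth else 0.0
    return fp_fraction, fn_fraction


def within_tolerance(returned, truth, max_fp, max_fn):
    """True if the answer respects the user's two fraction bounds."""
    fp_fraction, fn_fraction = error_fractions(returned, truth)
    return fp_fraction <= max_fp and fn_fraction <= max_fn


if __name__ == "__main__":
    # A range query whose true answer is {a, b, c, e}; the system,
    # working from stale filtered updates, reports {a, b, c, d}.
    returned = {"a", "b", "c", "d"}
    truth = {"a", "b", "c", "e"}
    print(error_fractions(returned, truth))
    print(within_tolerance(returned, truth, max_fp=0.3, max_fn=0.3))
    print(within_tolerance(returned, truth, max_fp=0.1, max_fn=0.3))
```

A user states only the two bounds, for example "at most 30% wrong entries and at most 30% missed entries", which is easier to specify than a numeric error on the underlying values.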
11.
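Continuous k-nearest neighbor queries over spatial-textual data streams (CkQST) retrieve, and update in real time, the k spatially nearest objects that contain the specified keywords over a data stream of spatial-textual objects, and are the continuous query over spatial-textual data streams (Continuous Qu...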
12.
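Parallel skyline queries over probabilistic data streams, an important aspect of current big data analysis, play an important role in many practical applications. To address inaccurate results and interrupted queries caused by failures during parallel probabilistic-stream skyline query processing, the paper proposes REPS, a replication-based fault-tolerant parallel skyline query method. REPS selects compute nodes that take part in the parallel processing as replica nodes, adopts a hierarchical-cyclic replica placement strategy, and recovers data from the highest-priority replica to keep recovery efficient. Failure detection, lost-data recovery, and query-process recovery are integrated into the whole query update process, which reduces the extra communication and computation overhead of fault tolerance and enables fast fault-tolerant parallel queries. Experimental results show that REPS achieves high query processing efficiency both without failures and when a single node fails, and that under multi-node failures it still sustains a high query processing rate and meets query requirements.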
13.
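Aggregate query techniques for spatio-temporal data streams have become a research hotspot in the database field. So far, no effective full-temporal aggregate index is suitable for aggregate queries over road-network data streams in non-Euclidean space. Full-temporal aggregate queries over road-network data streams must solve: (1) the non-Euclidean nature of the road network; and (2) duplicate counting, non-uniform distribution, and predictive aggregation of moving objects on the road network. Sketch RR-tree solves the non-Euclidean and duplicate-counting problems. To solve the non-uniform distribution problem, the paper borrows the idea of sketch partitioning and proposes DynSketch, a dynamic sketch index structure: it uses AMH to partition the Sketch RR-tree intelligently so that vehicles are uniformly distributed within each partition, which improves aggregate query quality. Building on DynSketch and an ES prediction model, it also proposes a predictive aggregate query algorithm for road-network data streams.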
14.
Chen Songting, Li Hua-Gang, Tatemura Jun'ichi, Hsiung Wang-Pin, Agrawal Divyakant, Candan K. Selçuk 《Knowledge and Data Engineering, IEEE Transactions on》2008,20(12):1627-1640
An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex Generalized-Tree-Pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly represents the path matches for the entire document. First, we show that the TOP encodings can be efficiently produced via a shared bottom-up path matching. Second, with the aid of this TOP encoding, we can 1) achieve polynomial time and space complexity for post processing, 2) avoid redundant predicate evaluations, 3) allow an efficient duplicate-free and merge join-based algorithm for merging multiple encoded path matches and 4) simplify the processing of GTP queries. Overall our approach maximizes the sharing opportunity across queries by exploiting the suffix as well as prefix sharing. At the same time, our TOP encodings allow efficient post processing for GTP queries. Extensive performance studies show that our GFilter solution not only achieves significantly better filtering performance than state-of-the-art algorithms, but also is capable of efficiently filtering the more complex GTP queries.
15.
16.
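Complex event processing over online radio frequency identification (RFID) data streams is a new research topic. Existing work targets only single complex event queries and does not consider strategies for processing multiple complex event queries simultaneously. Building on the complex event language SASE (stream-based and shared event processing), the paper designs an automaton dedicated to multiple queries, along with related optimization techniques, solving the problem of multiple complex event queries over RFID data streams. Experimental results show that when the number of queries is large, the algorithm outperforms traditional algorithms in both time and space.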
17.
18.
Kun-Lung Wu, Shyh-Kwei Chen, Yu P.S. 《Knowledge and Data Engineering, IEEE Transactions on》2006,18(11):1560-1575
Efficient processing of continual range queries over moving objects is critically important in providing location-aware services and applications. A set of continual range queries, each defining the geographical region of interest, can be periodically (re)evaluated to locate moving objects that are currently within individual query boundaries. We study a new query indexing method, called CES-based indexing, for incremental processing of continual range queries over moving objects. A set of containment-encoded squares (CES) are predefined, each with a unique ID. CESs are virtual constructs (VC) used to decompose query regions and to store indirectly precomputed search results. Compared with a prior VC-based approach, the number of VCs visited in a search operation is reduced from (4L^2-1)/3 to log(L)+1, where L is the maximal side length of a VC. Search time is hence significantly lowered. Moreover, containment encoding among the CESs makes it easy to identify all those VCs that need not be visited during an incremental query (re)evaluation. We study the performance of CES-based indexing and compare it with a prior VC-based approach.
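The gap between the two visit counts quoted in the abstract above is easy to tabulate. The sketch below assumes that L is a power of two and that the logarithm is base 2; neither is stated in the abstract, and the function names are hypothetical. Under that assumption, with L = 2^k, the first expression equals the geometric sum 1 + 4 + 16 + ... + 4^k.

```python
def visits_prior_vc(L):
    """(4L^2 - 1) / 3, the count quoted for the prior VC-based approach."""
    return (4 * L * L - 1) // 3


def visits_ces(L):
    """log2(L) + 1, the count quoted for CES-based indexing (base 2 assumed)."""
    return (L.bit_length() - 1) + 1


if __name__ == "__main__":
    for k in range(5):
        L = 2 ** k
        # Sanity check of the closed form for L = 2^k:
        # 1 + 4 + ... + 4^k == (4L^2 - 1) / 3
        assert sum(4 ** i for i in range(k + 1)) == visits_prior_vc(L)
        print(f"L={L:2d}  prior VC: {visits_prior_vc(L):4d}  CES: {visits_ces(L)}")
```

For L = 8 this gives 85 visits against 4, and the ratio keeps widening because one count grows quadratically in L while the other grows logarithmically.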
19.
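In recent years, subgraph query has drawn wide attention from researchers at home and abroad as an important topic in graph database management. In real applications most graph data is updated frequently, yet existing methods incur high maintenance costs under frequent updates. Subgraph query is itself an NP-complete problem, and it becomes even harder on dynamic graph data. To address these problems, the paper proposes a subgraph query method that supports dynamic graph data. The method first constructs the topological level sequence of each graph as an index and adds labels to the sequence so that the index can be maintained after data updates; it then filters out a candidate set according to the matching relationship between sequences; finally it verifies the graphs in the candidate set with a graph isomorphism algorithm to obtain the result set. The index is simple to build and small, and it does not need to be rebuilt after the graph database is updated. The method not only supports subgraph queries on dynamic graph data but also performs well on static graph data.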