Similar literature
Found 20 similar documents; search took 31 ms
1.
We describe a framework for supporting arbitrarily complex SQL queries with “uncertain” predicates. The query semantics is based on a probabilistic model, and the results are ranked, much as in information retrieval. Our main focus is query evaluation. We describe an optimization algorithm that can efficiently compute most queries. We show, however, that the data complexity of some queries is #P-complete, which implies that these queries do not admit any efficient evaluation methods. For these queries we describe both an approximation algorithm and a Monte-Carlo simulation algorithm.
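To make the Monte-Carlo idea concrete, here is a minimal sketch, not the algorithm from the paper. It assumes (hypothetically) that tuples are independent, that each has a known probability, and that the lineage of a query answer is given as a DNF formula over tuple identifiers. The names `probs`, `dnf`, `exact_probability` and `monte_carlo_probability` are illustrative only.

```python
import random
from itertools import product

def lineage_holds(world, dnf):
    # The answer exists in a possible world if every tuple of some clause is present.
    return any(all(world[t] for t in clause) for clause in dnf)

def exact_probability(probs, dnf):
    # Brute-force enumeration of all possible worlds; exponential, for tiny inputs only.
    ids = sorted(probs)
    total = 0.0
    for bits in product([False, True], repeat=len(ids)):
        world = dict(zip(ids, bits))
        if lineage_holds(world, dnf):
            weight = 1.0
            for t in ids:
                weight *= probs[t] if world[t] else 1.0 - probs[t]
            total += weight
    return total

def monte_carlo_probability(probs, dnf, samples=20000, seed=0):
    # Naive sampling: draw possible worlds and count how often the lineage holds.
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        world = {t: rng.random() < p for t, p in probs.items()}
        hits += lineage_holds(world, dnf)
    return hits / samples

# Assumed toy data: the answer needs s1 together with r1 or r2.
probs = {"r1": 0.5, "r2": 0.4, "s1": 0.9}
dnf = [{"r1", "s1"}, {"r2", "s1"}]

print(exact_probability(probs, dnf))        # 0.9 * (1 - 0.5 * 0.6), about 0.63
print(monte_carlo_probability(probs, dnf))  # an estimate close to the exact value
```

The enumeration grows exponentially with the number of tuples, which is why sampling becomes attractive for the hard queries the abstract mentions; the estimate tightens as `samples` grows.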

2.
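Dictionary-based English-Chinese bidirectional cross-language information retrieval   (total citations: 1; self-citations: 0; citations by others: 1)
杨辉, 张玥杰, 张涛. 《计算机工程》 (Computer Engineering), 2009, 35(16): 273-274
Based on the Text Retrieval Conference (TREC) task evaluation for English-Chinese cross-language information retrieval, this work takes English-Chinese bidirectional query translation as the main strategy and English and Chinese queries as the objects of translation. It uses English-Chinese electronic dictionaries as the knowledge source for acquiring translation knowledge and combines them with purpose-built English and Chinese monolingual retrieval systems to implement a complete English-Chinese bidirectional cross-language information retrieval process. Experimental results verify the effectiveness of the system.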

3.
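Query expansion for Chinese information retrieval using manually and automatically generated resources   (total citations: 4; self-citations: 0; citations by others: 4)
In Chinese information retrieval research and practice, mismatches between the words in a query and the words in the document collection prevent some relevant documents from being retrieved, which is a key factor limiting retrieval effectiveness. This paper proposes and implements query expansion for Chinese information retrieval using manually and automatically generated resources. Experiments on the NTCIR-2 Chinese information retrieval test collection show that, compared with retrieval without query expansion, the method yields a statistically significant improvement in retrieval effectiveness.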

4.
5.
The recent evolution in sensor node location technology has spurred the development of a special type of in-network processing for wireless sensor networks (WSNs), called spatial query processing. These queries require data from nodes within a user-defined region, called the region of interest. State-of-the-art spatial query processing generally assumes that nodes are always on. However, nodes can go into sleep mode (turning off the radio in duty cycles) in order to save energy. This work proposes an energy-efficient in-network spatial query processing mechanism that assumes nodes have no knowledge about their neighbors. The proposed mechanism can process spatial queries without periodic beacon transmissions for neighbor table updates or for synchronization. Hence, it can work properly over different types of duty cycle algorithms.

6.
A wireless sensor network (WSN) is composed of tens or hundreds of spatially distributed autonomous nodes, called sensors. Sensors are devices used to collect data from the environment related to the detection or measurement of physical phenomena. A WSN consists of groups of sensors, where each group is responsible for providing information about one or more physical phenomena (e.g., a group for collecting temperature data). Sensors are limited in power, computational capacity, and memory. Therefore, a query engine and query operators for processing queries in WSNs should be able to handle resource limitations such as memory and battery life. Adaptability has been explored as an alternative approach for dealing with these conditions. Adaptive query operators (algorithms) can adjust their behavior in response to specific events that take place during data processing. In this paper, we propose an adaptive in-network aggregation operator for query processing in sensor nodes of a WSN, called ADAGA (ADaptive AGgregation Algorithm for sensor networks). ADAGA adapts its behavior according to memory and energy usage by dynamically adjusting data-collection and data-sending time intervals. ADAGA can correctly aggregate data in WSNs with packet replication. Moreover, ADAGA can predict values for detections that were not performed by analyzing collected values. Thus, ADAGA is able to produce results as close as possible to real results (those obtained when no resource constraint is faced). The results obtained through experiments prove the efficiency of ADAGA.

7.
8.
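A context-based query expansion method for Chinese information retrieval   (total citations: 13; self-citations: 5; citations by others: 13)
In Chinese information retrieval research and practice, the words used in a query may not match the words used in the document collection, so some relevant documents fail to be retrieved; this is a key factor limiting retrieval effectiveness. Query expansion can alleviate this word mismatch to some extent; however, experiments show that simple query expansion does not reliably improve the effectiveness of Chinese information retrieval. This paper proposes and implements a context-based query expansion method that selects expansion terms according to the context of the query, making it a comparatively “intelligent” approach. Experiments on the TREC-9 Chinese information retrieval test collection show that, compared with simple query expansion, the context-based method achieves a statistically significant improvement in retrieval effectiveness.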

9.
The inverted index is widely used in information retrieval. To support containment queries over structured documents such as XML, it needs to be extended. Previous work suggested an extension for storing the inverted index for XML documents and processing containment queries, and compared two implementation options: using an RDBMS and using an Information Retrieval (IR) engine. However, the previous work has two drawbacks in extending the inverted index. One is that the RDBMS implementation generally performs much worse than the IR engine implementation. The other is that when a containment query is processed in an RDBMS, the number of join operations increases in proportion to the number of containment relationships in the query, and a join operation always occurs between large relations. To solve these problems, we propose in this paper a novel approach to extending the inverted index for containment query processing, and show its effectiveness through experimental results. In particular, our performance study shows that (1) our RDBMS approach almost always outperforms the previous RDBMS and IR approaches, (2) our RDBMS approach is not far behind our IR approach with respect to performance, and (3) our approach scales with the number of containment relationships in queries. Therefore, our results suggest that, without any modifications to the RDBMS engine, a native implementation using an RDBMS can support containment queries as efficiently as an IR implementation.
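The abstract does not spell out either index layout, so the following is only a generic sketch of how a containment query can be answered as a join between posting lists. It assumes (hypothetically) that element postings store `(doc, start, end)` word positions and word postings store `(doc, position)`; this encoding and the names `elements`, `words` and `elements_containing` are illustrative and are not taken from the paper.

```python
# Assumed toy postings for one document (doc id 1).
elements = {
    "section": [(1, 1, 10), (1, 11, 20)],   # (doc, start, end)
    "title":   [(1, 2, 4), (1, 12, 14)],
}
words = {
    "index": [(1, 3), (1, 15)],             # (doc, position)
}

def elements_containing(element_postings, word_postings):
    # Nested-loop join: an element contains a word occurrence when both are in
    # the same document and the word position lies strictly inside the element.
    return [
        (doc, start, end)
        for (doc, start, end) in element_postings
        if any(wdoc == doc and start < pos < end for (wdoc, pos) in word_postings)
    ]

# "title" elements that contain the word "index"
print(elements_containing(elements["title"], words["index"]))    # [(1, 2, 4)]
# "section" elements that contain the word "index"
print(elements_containing(elements["section"], words["index"]))  # [(1, 1, 10), (1, 11, 20)]
```

Each containment relationship in a query adds one such join, which illustrates why the number of joins grows with the number of containment relationships, as the abstract notes.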

10.
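In wireless sensor networks, users often submit spatial range queries to obtain statistics about a local region of the network, such as the maximum temperature or the average humidity. Existing itinerary-based spatial range query processing algorithms assume an ideal disk model for node communication; real networks do not satisfy this assumption, so these algorithms consume a lot of energy and return poor-quality query results. This paper proposes LSA, a link-aware spatial range query processing algorithm. LSA dynamically partitions the query region into grid cells according to the network topology and link quality, and collects the sensed data of the nodes in each cell in turn to produce the final query result. By traversing every cell in the query region, LSA ensures the quality of the query result. A heuristic grid partitioning method is proposed to reduce the packet loss rate of inter-node communication, and a link-aware data collection algorithm is given to reduce energy consumption and improve result quality. Simulation experiments systematically analyze and compare the energy consumption and result quality of LSA and the existing IWQE algorithm; the results show that LSA outperforms IWQE in the vast majority of cases.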

11.
Mobile nodes in some challenging network scenarios, e.g. battlefield and disaster recovery scenarios, suffer from intermittent connectivity and frequent partitions. Disruption Tolerant Network (DTN) technologies are designed to enable communications in such environments. Several DTN routing schemes have been proposed. However, not much work has been done on designing schemes that provide efficient information access in such challenging network scenarios. In this paper, we explore how a content-based information retrieval system can be designed for DTNs. There are three important design issues, namely (a) how data should be replicated and stored at multiple nodes, (b) how a query is disseminated in sparsely connected networks, and (c) how a query response is routed back to the issuing node. We first describe how to select nodes for storing the replicated copies of data items. We consider the random and the intelligent caching schemes. In the random caching scheme, nodes that are encountered first by a data-generating node are selected to cache the extra copies, while in the intelligent caching scheme, nodes that can potentially meet more nodes, e.g. faster nodes, are selected to cache the extra data copies. The number of replicated data copies K can be the same for all data items or varied depending on the access frequencies of the data items. In this work, we consider fixed, proportional and square-root replication schemes. Then, we describe two query dissemination schemes: (a) the W-copy Selective Query Spraying (WSS) scheme and (b) the L-hop Neighborhood Spraying (LNS) scheme. In the WSS scheme, nodes that can move faster are selected to cache the queries, while in the LNS scheme, nodes that are within L hops of a querying node will cache the queries. For message routing, we use an enhanced Prophet scheme where a next-hop node is selected only if its predicted delivery probability to the destination is higher than a certain threshold.
We conduct extensive simulation studies to evaluate different combinations of the replication and query dissemination algorithms. Our results reveal that the scheme that performs the best is the one that uses the WSS scheme combined with binary spread of replicated data copies. The WSS scheme can achieve a higher query success ratio when compared to a scheme that does not use any data and query replication. Furthermore, the square-root and proportional replication schemes provide a higher query success ratio than the fixed copy approach with varying node density. In addition, the intelligent caching approach can further improve the query success ratio by 5.3–15.8% with varying node density. Our results using different mobility models reveal that the query success ratio degrades by at most 7.3% when the Community-Based model is used compared to the Random Waypoint (RWP) model [J. Broch et al., A Performance Comparison of Multihop wireless Ad hoc Network Routing Protocols, ACM Mobicom, 1998, pp. 85–97]. Compared to the RWP and the Community-Based mobility models, the UmassBusNet model from the DieselNet project [X. Zhang et al., Modeling of a Bus-based Disruption Tolerant Network Trace, Proceedings of ACM Mobihoc, 2007.] achieves a much lower query success ratio because of the longer inter-node encounter time.
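The three replication schemes named above can be illustrated with a small sketch. It assumes (hypothetically) a fixed overall budget of copies that is split across data items by a per-item weight: equal weights for the fixed scheme, access frequency for the proportional scheme, and the square root of access frequency for the square-root scheme. The function name `allocate_copies`, the budget parameter and the rounding rule are illustrative choices, not details from the paper.

```python
import math

def allocate_copies(access_freq, total_copies, scheme):
    # Choose a weight per data item according to the replication scheme.
    if scheme == "fixed":
        weights = {item: 1.0 for item in access_freq}
    elif scheme == "proportional":
        weights = {item: float(f) for item, f in access_freq.items()}
    elif scheme == "square-root":
        weights = {item: math.sqrt(f) for item, f in access_freq.items()}
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    norm = sum(weights.values())
    # Share the budget by weight; every item keeps at least one copy.
    # Rounding means the shares may not add up to the budget exactly in general.
    return {item: max(1, round(total_copies * w / norm)) for item, w in weights.items()}

# Assumed toy access frequencies for three data items.
freq = {"a": 16, "b": 4, "c": 1}
for scheme in ("fixed", "proportional", "square-root"):
    print(scheme, allocate_copies(freq, 21, scheme))
# fixed        {'a': 7, 'b': 7, 'c': 7}
# proportional {'a': 16, 'b': 4, 'c': 1}
# square-root  {'a': 12, 'b': 6, 'c': 3}
```

The square-root scheme sits between the other two: popular items still get more copies, but rarely requested items are not starved as badly as under proportional allocation.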

12.
High-resolution sampling of physical phenomena is a prime application of large-scale wireless sensor networks (WSNs). With hundreds of nodes deployed over vast tracts of land, monitoring data can now be generated at unprecedented spatio-temporal scales. However, the limited battery life of individual nodes in the network mandates smart ways of collecting this data by maximizing localized processing of information at the node level. In this paper, we propose a WSN query processing method that enhances localized information processing by harnessing two inherent aspects of WSN communication, i.e., multihop and multipath data transmission. In an active WSN where data collection queries are regularly processed, multihop and multipath routing leads to a situation where a significant proportion of nodes relay and overhear data generated by other nodes in the network. We propose that nodes opportunistically sample this data as they communicate. We model the data communication process in a WSN and show that opportunistic sampling during data communication leads to surprisingly accurate global knowledge at each node. We present an opportunistic query processing system that uses the accumulated global knowledge to limit the data collection requirements for future queries while ensuring temporal freshness of the results.

13.
Privacy is a major concern when users query public online data services. The privacy of millions of people has been jeopardized in numerous user data leakage incidents in many popular online applications. To address the critical problem of personal data leakage through queries, we enable private querying on public data services so that the contents of user queries and any user data are hidden and therefore not revealed to the online service providers. We propose two protocols for private processing of database queries, namely BHE and HHE. The two protocols provide strong query privacy by using Paillier’s homomorphic encryption, and support common database queries such as range and join queries by relying on the bucketization of public data. In contrast to traditional Private Information Retrieval proposals, BHE and HHE incur only one round of client-server communication for processing a single query. BHE is a basic private query processing protocol that provides complete query privacy but still incurs expensive computation and communication costs. Built upon BHE, HHE is a hybrid protocol that applies ciphertext computation and communication on a subset of the data, such that this subset not only covers the actual requested data but also resembles some frequent query patterns of common users, thus achieving practical query performance while ensuring adequate privacy levels. By using frequent query patterns and data-specific privacy protection, HHE is not vulnerable to the traditional attacks on k-Anonymity that exploit data similarity and skewness. Moreover, HHE consistently protects user query privacy for a sequence of queries in a single query session.
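The additive homomorphism of Paillier encryption that both protocols rely on can be shown with a toy sketch. This is textbook Paillier with the common choice g = n + 1 and deliberately tiny primes; it is insecure, it is not the BHE or HHE protocol, and the function names and the primes 17 and 19 are assumptions made for illustration.

```python
import math
import random

def keygen(p, q):
    # Toy key generation; real deployments use large random primes.
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+); valid because g = n + 1
    return (n, g), (lam, mu)

def encrypt(pub, m, rng):
    n, g = pub
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return (((x - 1) // n) * mu) % n

pub, priv = keygen(17, 19)      # n = 323, so plaintexts must stay below 323
rng = random.Random(1)
c1 = encrypt(pub, 20, rng)
c2 = encrypt(pub, 22, rng)
# Multiplying ciphertexts adds the underlying plaintexts.
c_sum = (c1 * c2) % (pub[0] ** 2)
print(decrypt(pub, priv, c_sum))  # 42
```

Because the server can combine ciphertexts like this without decrypting them, it can compute over an encrypted query without learning what was asked, which is the property the abstract builds on.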

14.
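Structured P2P networks offer a scalable data lookup mechanism but support only exact-match search on keys. To provide richer query capabilities, this paper proposes a structured P2P search algorithm based on topic overlay networks, the Topic Overlay Network Search algorithm (TONS). The basic idea is to organize nodes by topic into a hierarchical overlay network on top of the structured P2P network, so that nodes with similar topics are linked to one another. Using the global navigation capability of topic relay nodes, TONS can confine a content-based query to a local part of the P2P network; by randomly adding some long-distance links to the overlay network, it gives the overlay Small-World properties and improves search performance. Experimental results show that TONS greatly improves search recall and reduces the average path length and the average number of messages in P2P information search.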

15.
Information Retrieval (IR) systems assist users in finding information from the myriad of information resources available on the Web. A traditional characteristic of IR systems is that if different users submit the same query, the system yields the same list of results, regardless of the user. Personalised Information Retrieval (PIR) systems go a step further to better satisfy the user’s specific information needs by providing search results that are not only relevant to the query but also of particular relevance to the user who submitted it. PIR has thereby attracted increasing research and commercial attention as information portals aim to achieve user loyalty by improving their performance in terms of effectiveness and user satisfaction. In order to provide a personalised service, a PIR system maintains information about the users and the history of their interactions with the system. This information is then used to adapt the users’ queries or the results so that information that is more relevant to the users is retrieved and presented. This survey paper features a critical review of PIR systems, with a focus on personalised search. The survey provides an insight into the stages involved in building and evaluating PIR systems, namely: information gathering, information representation, personalisation execution, and system evaluation. Moreover, the survey provides an analysis of PIR systems with respect to the scope of personalisation addressed. The survey proposes a classification of PIR systems into three scopes: individualised systems, community-based systems, and aggregate-level systems. Based on the conducted survey, the paper concludes by highlighting challenges and future research directions in the field of PIR.

16.
17.
Modern applications requiring spatial network processing pose several interesting query optimization challenges. Spatial networks are usually represented as graphs, and therefore, queries involving a spatial network can be executed by using the corresponding graph representation. This means that the cost for executing a query is determined by graph properties such as the graph order and size (i.e., number of nodes and edges) and other graph parameters. In this paper, we present novel methods to estimate the number of nodes and edges in regions of interest in spatial networks, towards predicting the space and time requirements for range queries. The methods are evaluated by using real-life and synthetic data sets. Experimental results show that the number of nodes and edges can be estimated efficiently and accurately, with relatively small space requirements, thus providing useful information to the query optimizer.

18.
As database technology is applied to more and more application domains, user queries are becoming increasingly complex (e.g. involving a large number of joins and a complex query structure). Query optimizers in existing database management systems (DBMSs) were not developed for efficiently processing such queries and often suffer from problems such as intolerably long optimization time and poor optimization results. To tackle this challenge, we present a new similarity-based approach to optimizing complex queries in this paper. The key idea is to identify similar subqueries that often appear in a complex query and share the optimization result among them. Different levels of similarity for subqueries are introduced. Efficient algorithms to identify similar subqueries in a given query and optimize the query based on similarity are presented. Related issues, such as choosing good starting nodes in a query graph, evaluating identified similar subqueries and analyzing algorithm complexities, are discussed. Our experimental results demonstrate that the proposed similarity-based approach is quite promising in optimizing complex queries with similar subqueries in a DBMS.

19.
Typical delay tolerant networks (DTNs) often suffer from long and variable delays, frequent connectivity disruptions, and high bit error rates. In DTNs, the design of an efficient routing algorithm is one of the key issues. The existing methods improve the accessibility probability of the data transmission by transmitting many copies of the packet to the network, but they may cause a high network overhead. To address the tradeoff between a successful delivery ratio and the network overhead, we propose a DTN routing algorithm based on the Markov location prediction model, called the spray and forward routing algorithm (SFR). Based on historical information of the nodes, the algorithm uses the second-order Markov forecasting mechanism to predict the location of the destination node, and then forwards the data by greedy routing, which reduces the copies of packets by spraying the packets in a particular direction. In contrast to a fixed mode where a successful-delivery ratio and routing overhead are contradictory, a hybrid strategy with multi-copy forwarding is able to reduce the copies of the packets efficiently and at the same time maintain an acceptable successful-delivery ratio. The simulation results show that the proposed SFR is efficient enough to provide better network performance than the spray and wait routing algorithm, in scenarios with sparse node density and fast mobility of the nodes.
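A second-order Markov location predictor of the kind the abstract refers to can be sketched as follows. This is a generic illustration, not the SFR implementation: it assumes (hypothetically) that a node's movement history is a sequence of discrete location labels and that the prediction is the most frequent successor of the last two locations. The names `train` and `predict` and the toy history are illustrative.

```python
from collections import Counter, defaultdict

def train(history):
    # Count which location follows each pair of consecutive locations.
    model = defaultdict(Counter)
    for a, b, c in zip(history, history[1:], history[2:]):
        model[(a, b)][c] += 1
    return model

def predict(model, prev, cur):
    # Most frequent next location after the pair (prev, cur); None if the pair is unseen.
    counts = model.get((prev, cur))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Assumed toy movement history of a destination node.
history = ["A", "B", "C", "A", "B", "C", "A", "B", "D"]
model = train(history)
print(predict(model, "A", "B"))  # C  (seen twice, versus D once)
print(predict(model, "B", "C"))  # A
print(predict(model, "X", "Y"))  # None
```

A routing scheme could then forward packets greedily toward the predicted location; how SFR handles unseen location pairs is not stated in the abstract, so returning `None` here is simply a placeholder choice.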

20.
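Cache-based query processing under disconnection   (total citations: 5; self-citations: 0; citations by others: 5)
吴婷婷, 章文嵩, 周兴铭. 《计算机学报》 (Chinese Journal of Computers), 2003, 26(10): 1393-1399
In mobile environments, wireless networks are unreliable and expensive, and mobile hosts are constrained in power and other resources, so a mobile host is often disconnected, voluntarily or involuntarily, meaning it has no network connection. To improve mobile clients' access to data while disconnected and to make effective use of the mobile cache, this paper proposes QPID, a query processing algorithm based on semantic caching under disconnection. The main idea is to first find the cache items relevant to the current query, and then process the data of those items further to obtain the results in the cache that satisfy the query. Experiments show that query processing based on QPID better satisfies client query requests under disconnection.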
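The two steps described above (find the relevant cache items, then process their data) can be sketched for the simplest case. This is not the QPID algorithm; it assumes (hypothetically) that each semantic cache item stores the closed value range of an earlier range query on a single numeric attribute together with the matching values. The name `answer_from_cache` and the cache layout are illustrative.

```python
def answer_from_cache(cache, lo, hi):
    # cache: list of (seg_lo, seg_hi, values), one entry per cached range query.
    rows = set()
    covered = []
    for seg_lo, seg_hi, values in cache:
        o_lo, o_hi = max(lo, seg_lo), min(hi, seg_hi)
        if o_lo <= o_hi:  # this cache item overlaps the query range
            covered.append((o_lo, o_hi))
            rows.update(v for v in values if o_lo <= v <= o_hi)
    # Check whether the overlapping pieces cover the whole query range.
    covered.sort()
    reach = lo
    for c_lo, c_hi in covered:
        if c_lo > reach:
            break  # gap: part of the range would need the server
        reach = max(reach, c_hi)
    complete = bool(covered) and reach >= hi
    return sorted(rows), complete

# Assumed cache contents left by two earlier range queries.
cache = [(10, 20, [12, 15, 19]), (30, 40, [31, 38])]
print(answer_from_cache(cache, 15, 35))  # ([15, 19, 31], False)
print(answer_from_cache(cache, 12, 18))  # ([12, 15], True)
```

When the host is disconnected, a result flagged as incomplete is the best available partial answer; the uncovered part of the range could only be fetched after reconnection.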
