Similar literature
1.
Keyword search is the most popular technique for querying large tree-structured datasets, often of unknown structure, on the web. Recent keyword search approaches return lowest common ancestors (LCAs) of the keyword matches, ranked with respect to their relevance to the keyword query. A major challenge for a ranking approach is the efficiency of its algorithms as the number of keywords and the size and complexity of the data increase. To face this challenge, most of the known approaches restrict their ranking to a subset of the LCAs (e.g., SLCAs, ELCAs), missing relevant results. In this work, we design novel top-k-size stack-based algorithms on tree-structured data. Our algorithms implement ranking semantics for keyword queries based on the concept of LCA size. Similar to metric selection in information retrieval, LCA size reflects the proximity of keyword matches in the data tree. This semantics does not rank a predefined subset of LCAs and, through a layered presentation of results, it demonstrates improved effectiveness compared to previous relevant approaches. To address performance challenges, our algorithms exploit a lattice of the partitions of the keyword set, which enables linear-time performance. This result is obtained without the support of auxiliary precomputed data structures. An extensive experimental study on a variety of large datasets confirms the theoretical analysis. The results show that, in contrast to other approaches, our algorithms scale smoothly when the size of the dataset and the number of keywords increase.
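As a rough illustration of ranking LCAs by the proximity of keyword matches, the sketch below uses Dewey labels (tuples of child positions), so that the LCA of a set of nodes is their longest common label prefix. It is a naive brute-force enumeration, not the stack-based linear-time algorithm of the abstract, and `connecting_size` is only an assumed stand-in for "LCA size", whose exact definition is not given here. All function names are hypothetical.

```python
from itertools import product


def lca(labels):
    """Lowest common ancestor of Dewey labels = their longest common prefix."""
    prefix = []
    for parts in zip(*labels):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return tuple(prefix)


def connecting_size(labels):
    """Edges in the minimal subtree joining the LCA to every label.

    Assumption: used here as a proximity measure; the abstract does not
    define LCA size precisely.
    """
    root = lca(labels)
    below_root = set()
    for label in labels:
        for depth in range(len(root) + 1, len(label) + 1):
            below_root.add(label[:depth])
    # Each node strictly below the LCA contributes one edge to its parent.
    return len(below_root)


def ranked_lcas(matches):
    """Return (lca, smallest size) pairs, closest keyword matches first.

    `matches` maps each keyword to the Dewey labels of the nodes matching it.
    Enumerates every combination, so it is exponential in the keyword count.
    """
    best = {}
    for combo in product(*matches.values()):
        ancestor = lca(combo)
        size = connecting_size(combo)
        if ancestor not in best or size < best[ancestor]:
            best[ancestor] = size
    return sorted(best.items(), key=lambda item: (item[1], item[0]))
```

Unlike an SLCA-only approach, this keeps the root `(0,)` as a lower-ranked result instead of discarding it, which is the behaviour the abstract argues for.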

2.
Efficiently enumerating results of keyword search over data graphs (total citations: 2; self-citations: 0; citations by others: 2)
Various approaches for keyword search in different settings (e.g., relational databases and XML) actually deal with the problem of enumerating K-fragments. For a given set of keywords K, a K-fragment is a subtree T of the given data graph, such that T contains all the keywords of K and no proper subtree of T has this property. There are three types of K-fragments: directed, undirected and strong. This paper describes efficient algorithms for enumerating K-fragments. Specifically, for all three types of K-fragments, algorithms are given for enumerating all K-fragments with polynomial delay and polynomial space. It is shown how these algorithms can be enhanced to enumerate K-fragments in a heuristic order. For directed K-fragments and acyclic data graphs, an algorithm is given for enumerating with polynomial delay in the order of increasing weight (i.e., the ranked order), assuming that K is of a fixed size.
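The minimality condition in the K-fragment definition can be checked directly for the undirected case. The sketch below assumes a simplified model in which each node carries a set of keywords (the `labels` mapping) and the caller passes a subgraph that is already a tree; the function names are hypothetical and this is a membership test, not the enumeration algorithm of the paper. It relies on the fact that any proper subtree is contained in the tree minus one of its leaves, so testing single-leaf removals is enough.

```python
def covered(nodes, labels):
    """Union of the keyword sets attached to the given nodes."""
    found = set()
    for node in nodes:
        found |= labels.get(node, set())
    return found


def is_undirected_k_fragment(nodes, edges, labels, keywords):
    """True if the tree (nodes, edges) covers `keywords` and is minimal.

    Assumes (nodes, edges) already form an undirected tree and that
    `keywords` is a set.
    """
    nodes = set(nodes)
    if not keywords <= covered(nodes, labels):
        return False
    if len(nodes) == 1:
        return True  # a single node has no proper non-empty subtree
    degree = {node: 0 for node in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    leaves = [node for node in nodes if degree[node] == 1]
    # Minimal iff dropping any one leaf loses at least one keyword.
    return all(
        not keywords <= covered(nodes - {leaf}, labels) for leaf in leaves
    )
```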

3.
4.
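For search over encrypted cloud data, traditional fuzzy keyword search schemes can retrieve relevant documents, but the results are unsatisfactory. When the user's input is correct, they cannot perform approximate search; when the user makes a spelling error, the returned results include many documents with irrelevant keywords, seriously wasting bandwidth. To address these shortcomings of current fuzzy keyword search over encrypted cloud data, a new fuzzy keyword search scheme is proposed. It computes relevance scores for keywords, ranks documents by relevance score, and returns the top-k documents (i.e., the k documents with the highest relevance) to the searching user. This reduces unnecessary bandwidth waste and the time users spend looking for useful documents, and provides more effective search results. In addition, by introducing a set of fake trapdoors, the scheme makes it harder for the cloud server to analyze document keywords, strengthening the system's privacy protection.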
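The ranking step (score, sort, return the top k) can be sketched on plaintext data as follows. The abstract does not state the scoring formula, so the TF-IDF variant here is an assumption, and the encryption, trapdoor, and fake-trapdoor machinery is omitted entirely. The names `relevance_scores` and `top_k` are hypothetical.

```python
import heapq
import math
from collections import Counter


def relevance_scores(docs, keyword):
    """Score each document containing `keyword` (assumed TF-IDF variant).

    `docs` maps a document id to its list of words.
    """
    doc_freq = sum(1 for words in docs.values() if keyword in words)
    if doc_freq == 0:
        return {}
    idf = math.log(1 + len(docs) / doc_freq)
    scores = {}
    for doc_id, words in docs.items():
        tf = Counter(words)[keyword]
        if tf:
            scores[doc_id] = (1 + math.log(tf)) * idf
    return scores


def top_k(docs, keyword, k):
    """Ids of the k highest-scoring documents, best first."""
    scores = relevance_scores(docs, keyword)
    return heapq.nlargest(k, scores, key=scores.get)
```

Returning only the k best ids, rather than every matching document, is what saves bandwidth in the scheme described above.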

5.
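Traditional searchable encryption schemes support only exact-match search and, in both efficiency and performance, are not suited to cloud computing environments. An index is built with an R+ tree that supports multiple string similarity operations, enabling fuzzy keyword search over encrypted data in cloud computing. Edit distance is used to quantify keyword similarity, and a retrieval method is proposed that returns the files closest to the keyword. String clustering improves the efficiency of fuzzy keyword search.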
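Edit distance, the similarity measure named above, can be computed with the standard Levenshtein dynamic program. The sketch below works on plaintext keywords and uses a linear scan; the R+ tree index, clustering, and encryption of the described scheme are not modelled, and `fuzzy_matches` is a hypothetical helper.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,         # delete ch_a
                curr[j - 1] + 1,     # insert ch_b
                prev[j - 1] + cost,  # substitute or keep
            ))
        prev = curr
    return prev[-1]


def fuzzy_matches(query, keywords, max_distance):
    """Keywords within `max_distance` edits of `query`, closest first."""
    scored = sorted((edit_distance(query, kw), kw) for kw in keywords)
    return [kw for distance, kw in scored if distance <= max_distance]
```

For example, `fuzzy_matches("cloud", ["cloud", "clod", "crowd", "data"], 1)` keeps `"cloud"` and `"clod"` and drops the other two.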

6.
More and more data owners are encouraged to outsource their data to cloud servers to reduce infrastructure and maintenance costs and to gain ubiquitous access to their stored data. However, security is one issue that discourages data owners from adopting cloud servers for data storage. Searchable Encryption (SE) is one of the few ways of assuring the privacy and confidentiality of such data by storing it in encrypted form on the cloud servers. SE enables data owners and users to search over encrypted data through trapdoors. Most user information requirements are fulfilled through either Boolean or Ranked search approaches. This paper aims at understanding how the confidentiality and privacy of information can be guaranteed while processing single- and multi-keyword queries over encrypted data using Boolean and Ranked search approaches. It presents all possible leakages that happen in SE and specifies which privacy-preserving approach should be adopted in SE schemes to prevent those leakages, to help practitioners and researchers design and implement secure searchable encryption systems. It also highlights various application scenarios where SE could be utilized, and explores the research challenges and open problems that need attention in future work.

7.
We present pest, a novel approach to the approximate querying of graph-structured data such as RDF that exploits the data's structure to propagate term weights between related data items. We focus on data where meaningful answers are given through the application semantics, e.g., pages in wikis, persons in social networks, or papers in a research network such as Mendeley. The pest matrix generalizes the Google Matrix used in PageRank with a term-weight dependent leap and accommodates different levels of (semantic) closeness for different relations in the data, e.g., friend vs. co-worker in a social network. Its eigenvectors represent the distribution of a term after propagation. The eigenvectors for all terms together form a (vector space) index that takes the structure of the data into account and can be used with standard document retrieval techniques. In extensive experiments including a user study on a real-life wiki, we show how pest improves the quality of the ranking over a range of existing ranking approaches, yet achieves a query performance comparable to a plain vector space index.
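For orientation, the sketch below shows the plain Google-matrix computation that the pest matrix is said to generalize: power iteration toward the dominant eigenvector, with a `leap` distribution that could be made term-dependent. It is not the pest matrix itself; the relation-specific closeness weights are not modelled, and the function name and parameters are assumptions.

```python
def pagerank(links, leap, damping=0.85, iterations=100):
    """Approximate the dominant eigenvector of a Google-style matrix.

    `links` maps every node to the list of nodes it points to.
    `leap` maps every node to its random-jump probability and must sum to 1;
    a term-dependent choice of `leap` is the hook that pest builds on.
    """
    nodes = list(links)
    rank = dict(leap)
    for _ in range(iterations):
        nxt = {node: (1 - damping) * leap[node] for node in nodes}
        for node in nodes:
            targets = links[node]
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    nxt[target] += share
            else:
                # Dangling node: spread its weight according to `leap`.
                for other in nodes:
                    nxt[other] += damping * rank[node] * leap[other]
        rank = nxt
    return rank
```

Running one such propagation per term and stacking the resulting vectors would give a structure-aware vector space index of the kind the abstract describes.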

8.
Li Ying. CAAI Transactions on Intelligent Systems (《智能系统学报》), 2008, 3(3): 259-264
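Based on the characteristics of multimedia audio data, a local search data structure suited to fast audio data retrieval is proposed, namely the local search tree (LS-tree). In the local search tree, the zero-crossing rate and the mean amplitude of the wavelet transform coefficients of the audio data serve as the primary and secondary keys, respectively, and the other coefficients used as the index are organized within a local range. Next, based on the local search tree, audio data retrieval is implemented with a wavelet-packet best-basis wavelet-pyramid algorithm. Finally, search using the wavelet-packet best-basis wavelet-pyramid algorithm with the local search tree is compared with retrieval methods based on wavelet coefficients at different levels; the results show that the method is fast and effective for audio data retrieval.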
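The two keys named above are simple to compute. The sketch below derives a (zero-crossing rate, mean amplitude) pair from a list of coefficients so that items can be ordered by primary and then secondary key. The wavelet transform and the LS-tree itself are not implemented, the function names are hypothetical, and treating zero as non-negative is an assumed convention.

```python
def zero_crossing_rate(values):
    """Fraction of adjacent pairs whose signs differ (zero counts as >= 0)."""
    if len(values) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(values, values[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(values) - 1)


def mean_amplitude(values):
    """Average absolute value of the coefficients."""
    if not values:
        return 0.0
    return sum(abs(v) for v in values) / len(values)


def index_key(values):
    """(primary, secondary) sort key: zero-crossing rate, then mean amplitude."""
    return (zero_crossing_rate(values), mean_amplitude(values))
```

Sorting clips with `sorted(clips, key=index_key)` places clips with similar keys next to each other, which is the property a local search over the index depends on.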
