Similar literature
 20 similar documents found (search time: 31 ms)
1.
RDF is a knowledge representation language dedicated to the annotation of resources within the framework of the semantic web. Among the query languages for RDF, SPARQL allows querying RDF through graph patterns, i.e., RDF graphs involving variables. Other languages, inspired by the work in databases, use regular expressions for searching paths in RDF graphs. Each approach can express queries that are out of reach of the other one. Hence, we aim at combining these two approaches. For that purpose, we define a language, called PRDF (for “Path RDF”) which extends RDF such that the arcs of a graph can be labeled by regular expression patterns. We provide PRDF with a semantics extending that of RDF, and propose a correct and complete algorithm which, by computing a particular graph homomorphism, decides the consequence between an RDF graph and a PRDF graph. We then define the PSPARQL query language, extending SPARQL with PRDF graph patterns and complying with RDF model theoretic semantics. PRDF thus offers both graph patterns and path expressions. We show that this extension does not increase the computational complexity of SPARQL and, based on the proposed algorithm, we have implemented a correct and complete PSPARQL query engine.
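
To make the idea of path expressions over RDF graphs concrete, the sketch below evaluates one regular path expression on a toy graph. It is not the PRDF homomorphism algorithm from the abstract; it uses the textbook product construction (graph node paired with automaton state) explored by breadth-first search. The triples, the hand-built automaton for `train+ . ferry`, and the function name `path_targets` are all assumptions made for illustration.

```python
from collections import deque

# Toy RDF graph as (subject, predicate, object) triples; all names are made up.
TRIPLES = [
    ("paris", "train", "lyon"),
    ("lyon", "train", "marseille"),
    ("marseille", "ferry", "ajaccio"),
    ("paris", "plane", "ajaccio"),
]

# Hand-built NFA for the path expression  train+ . ferry
# state -> list of (predicate, next_state); state 2 is the only accepting state.
NFA = {0: [("train", 1)], 1: [("train", 1), ("ferry", 2)], 2: []}
START, ACCEPTING = 0, {2}


def path_targets(triples, source, nfa=NFA, start=START, accepting=ACCEPTING):
    """Return the nodes reachable from `source` along a path whose sequence
    of predicates is accepted by the NFA."""
    # Index outgoing edges by subject for fast lookup.
    out_edges = {}
    for s, p, o in triples:
        out_edges.setdefault(s, []).append((p, o))

    # Breadth-first search over (graph node, automaton state) pairs.
    seen = {(source, start)}
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in accepting:
            answers.add(node)
        for pred, next_node in out_edges.get(node, []):
            for label, next_state in nfa[state]:
                pair = (next_node, next_state)
                if label == pred and pair not in seen:
                    seen.add(pair)
                    queue.append(pair)
    return answers


print(path_targets(TRIPLES, "paris"))
```

Because each (node, state) pair is visited at most once, the search terminates even when the graph or the expression contains cycles.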

2.
This paper discusses the issues involved in designing a query language for the Semantic Web and presents the OWL query language (OWL-QL) as a candidate standard language and protocol for query-answering dialogues among Semantic Web computational agents using knowledge represented in the W3C's Web Ontology Language (OWL). OWL-QL is a formal language and precisely specifies the semantic relationships among a query, a query answer, and the knowledge base(s) used to produce the answer. Unlike standard database and Web query languages, OWL-QL supports query-answering dialogues in which the answering agent may use automated reasoning methods to derive answers to queries, as well as dialogues in which the knowledge to be used in answering a query may be in multiple knowledge bases on the Semantic Web, and/or where those knowledge bases are not specified by the querying agent. In this setting, the set of answers to a query may be of unpredictable size and may require an unpredictable amount of time to compute.

3.
Cooperative query answering supports query relaxation and provides approximate answers as well as exact answers. To facilitate query relaxation, a knowledge representation framework has been widely adopted, which accommodates semantic relationships or distance metrics to represent similarities among data values. In this paper, we propose a metricized knowledge abstraction hierarchy (MKAH) that supports a multi-level data abstraction hierarchy and a distance metric among data values. We show that the abstraction hierarchy is useful in representing semantic relationships, and that it can provide data values with different scopes according to their abstraction levels. The distance metric expresses the semantic similarity among data values as a quantitative measure, and thus enables query results to be ranked. To verify the practicality and effectiveness of the MKAH, we have implemented a prototype system in the area of career job search. Through various experiments, we show that the MKAH provides a rich semantic representation and a high-quality distance measure. Furthermore, the experiments confirm that a domain adopting the MKAH can be compatible with other numeric domains, which is advantageous in building large-scale systems.

4.
Graphs are widely used for modeling complicated data such as social networks, bibliographical networks and knowledge bases. The growing sizes of graph databases motivate the crucial need for developing powerful and scalable graph-based query engines. We propose a SPARQL-like language, G-SPARQL, for querying attributed graphs. The language enables the expression of different types of graph queries that are of great interest in databases modeled as large graphs, such as pattern matching, reachability and shortest path queries. Each query can combine both structural predicates and value-based predicates (on the attributes of the graph nodes/edges). We describe an algebraic compilation mechanism for our proposed query language which is extended from the relational algebra and based on the basic construct of building SPARQL queries, the Triple Pattern. We describe an efficient hybrid Memory/Disk representation of large attributed graphs where only the topology of the graph is maintained in memory while the data of the graph are stored in a relational database. The execution engine of our proposed query language splits parts of the query plan to be pushed inside the relational database (using SQL) while the execution of other parts of the query plan is processed using memory-based algorithms, as necessary. Experimental results on real and synthetic datasets demonstrate the efficiency and the scalability of our approach and show that our approach outperforms native graph databases by several factors.

5.
Certain answers are a widely accepted semantics of query answering over incomplete databases. As their computation is a coNP-hard problem, recent research has focused on developing (polynomial time) evaluation algorithms with correctness guarantees, that is, techniques computing a sound but possibly incomplete set of certain answers. The aim is to make the computation of certain answers feasible in practice, settling for under-approximations. In this paper, we present novel evaluation algorithms with correctness guarantees, which provide better approximations than current techniques, while retaining polynomial time data complexity. The central tools of our approach are conditional tables and the conditional evaluation of queries. We propose different strategies to evaluate conditions, leading to different approximation algorithms: more accurate evaluation strategies have higher running times, but they pay off with more certain answers being returned. Thus, our approach offers a suite of approximation algorithms enabling users to choose the technique that best meets their needs in terms of balance between efficiency and quality of the results.

6.
As a result of the extensive research in view-based query processing, three notions have been identified as fundamental, namely rewriting, answering, and losslessness. Answering amounts to computing the tuples satisfying the query in all databases consistent with the views. Rewriting consists in first reformulating the query in terms of the views and then evaluating the rewriting over the view extensions. Losslessness holds if we can answer the query by solely relying on the content of the views. While the mutual relationship between these three notions is easy to identify in the case of conjunctive queries, the terrain of notions gets considerably more complicated going beyond such a query class. In this paper, we revisit the notions of answering, rewriting, and losslessness and clarify their relationship in the setting of semistructured databases, and in particular for the basic query class in this setting, i.e., two-way regular path queries. Our first result is a clean explanation of the relationship between answering and rewriting, in which we characterize rewriting as a “linear approximation” of query answering. We show that applying this linear approximation to the constraint-satisfaction framework yields an elegant automata-theoretic approach to query rewriting. As for losslessness, we show that there are indeed two distinct interpretations for this notion, namely with respect to answering, and with respect to rewriting. We also show that the constraint-theoretic approach and the automata-theoretic approach can be combined to give an algorithmic characterization of the various facets of losslessness. Finally, we deal with the problem of coping with loss, by considering mechanisms aimed at explaining lossiness to the user.

7.
The large volume and nature of data available to casual users and programs motivate the increasing interest of the database community in studying flexible and efficient techniques for extracting and querying semistructured data. On the other hand, efficient methods have been discovered for solving the so-called model-checking problem for some modal logics. The aim of this paper is to show how some of these methods can be used for querying semistructured data. To do so, we show that semistructured data can be naturally seen as Kripke Transition Systems. To keep the presentation independent of a specific language, we introduce a graphical query language that includes some of the features of the query languages based on graphs and patterns. We show how to associate CTL formulas to queries of this language. This allows us to see the problem of solving a query as an instance of the model-checking problem for CTL, which can be solved in polynomial time. We have tested the method by using a model-checker, and have studied the applicability of the method to some existing languages for semistructured databases.
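
As a rough illustration of treating a query as a CTL model-checking problem, the sketch below computes the states satisfying `EX p` and `EF p` on a small state-labeled graph, using the standard least-fixed-point characterization of `EF`. It is a simplification made under stated assumptions: the abstract speaks of Kripke Transition Systems and a graphical query language, neither of which is modeled here, and the data, labels, and function names are invented for the example.

```python
# Toy semistructured database viewed as a Kripke structure (hypothetical data):
# each state maps to its set of successor states.
TRANSITIONS = {
    "db": {"book1", "book2"},
    "book1": {"author1"},
    "book2": set(),
    "author1": set(),
    "orphan": set(),
}

# Atomic propositions holding in each state.
LABELS = {
    "db": {"root"},
    "book1": {"book"},
    "book2": {"book"},
    "author1": {"author"},
    "orphan": {"author"},
}


def states_with(label):
    """States in which the atomic proposition `label` holds."""
    return {s for s, props in LABELS.items() if label in props}


def sat_ex(transitions, target):
    """States with at least one successor in `target` (CTL: EX target)."""
    return {s for s, succs in transitions.items() if succs & target}


def sat_ef(transitions, target):
    """States from which some path reaches `target` (CTL: EF target),
    computed as the least fixed point of  Z = target | EX Z."""
    result = set(target)
    while True:
        new = result | sat_ex(transitions, result)
        if new == result:
            return result
        result = new


# Query "books from which an author is reachable", i.e.  book AND EF author.
books_with_author = states_with("book") & sat_ef(TRANSITIONS, states_with("author"))
print(books_with_author)
```

Each iteration either adds a state or stops, so the loop runs at most once per state, which is in line with the polynomial-time bound mentioned in the abstract.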

8.
Question answering on the Web is moving beyond the stage where users simply type a query and retrieve a ranked ordering of appropriate Web pages. Users and analysts want targeted answers to their questions without extraneous information. These answers might contain information from current and authoritative sources, terms with the same meaning as those used in the query, relevant links such as justifications, follow-up questions fitting the context, and provenance information. Next-generation question-answering systems might also provide better querying support. This could include identifying whether questions are incoherent and therefore can't be answered, too general and would retrieve too many answers, or overconstrained and would retrieve few if any answers. We present a spectrum of techniques for improving question answering and discuss their potential uses and impact.

9.
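Li Qinghong (李庆红). Computer Engineering (《计算机工程》), 2011, 37(13): 68-70
To address the excessive load of traditional exact queries over massive data, a simulation-based automatic resampling method for confidence intervals (Bootstrap) is introduced to support the database. By querying partial or sampled data, the query is reduced to a base set of data; within the time needed to query the whole dataset once, queries are repeated many times over multiple samples, yielding a confidence interval for the database query. A basic SQL query is then run to obtain an approximate result that meets the user's requirements. Experimental results show that introducing the Bootstrap method into data querying is effective.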
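
The resampling step described above can be sketched as a percentile bootstrap. This is a generic illustration, not the paper's implementation: the list `SAMPLE` stands in for rows returned by a sampling query, and the function name, the choice of the mean as the aggregate, and the default parameters are assumptions.

```python
import random
import statistics

# Hypothetical sample of a numeric column, standing in for the rows that a
# sampling query would return from a much larger table.
SAMPLE = [12.0, 15.5, 9.0, 22.0, 18.5, 11.0, 14.0, 30.0, 8.5, 16.0]


def bootstrap_ci(sample, stat=statistics.fmean, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat` over `sample`.

    The sample is resampled with replacement `n_resamples` times, the
    aggregate is recomputed on each resample, and the interval is read off
    the sorted estimates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


low, high = bootstrap_ci(SAMPLE)
print(f"approximate mean: {statistics.fmean(SAMPLE):.2f}, 95% CI: [{low:.2f}, {high:.2f}]")
```

The repeated aggregate runs only over the small sample, so many resamples can be evaluated at a cost that stays independent of the size of the full table.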

10.
Recommendation systems aim to recommend items or packages of items that are likely to be of interest to users. Previous work on recommendation systems has mostly focused on recommending points of interest (POI), to identify and suggest top-k items or packages that meet selection criteria and satisfy compatibility constraints on items in a package, where the (packages of) items are ranked by their usefulness to the users. As opposed to prior work, this paper investigates two issues beyond POI recommendation that are also important to recommendation systems. When there are not sufficiently many POI that can be recommended, we propose (1) query relaxation recommendation to help users revise their selection criteria, or (2) adjustment recommendation to guide recommendation systems to modify their item collections, such that the users' requirements can be satisfied. We study two related problems, to decide (1) whether the query expressing the selection criteria can be relaxed to a limited extent, and (2) whether we can update a bounded number of items, such that the users can get desired recommendations. We establish the upper and lower bounds of these problems, all matching, for both combined and data complexity, when selection criteria and compatibility constraints are expressed in a variety of query languages, for both item recommendation and package recommendation. To understand where the complexity comes from, we also study the impact of variable sizes of packages, compatibility constraints and selection criteria on the analyses of these problems. Our results indicate that in most cases the complexity bounds of query relaxation and adjustment recommendation are comparable to their counterparts of the basic recommendation problem for testing whether a given set of (resp. packages of) items makes top-k items (resp. packages). In other words, extending recommendation systems with the query relaxation and adjustment recommendation functionalities typically does not incur extra overhead.

11.
We give a general framework for approximate query processing in semistructured databases. We focus on regular path queries, which are the integral part of most of the query languages for semistructured databases. To enable approximations, we allow the regular path queries to be distorted. The distortions are expressed in the system by using weighted regular expressions, which correspond to weighted regular transducers. After defining the notion of weighted approximate answers we show how to compute them in order of their proximity to the query. In the new approximate setting, query containment has to be redefined in order to take into account the quantitative proximity information in the query answers. For this, we define the approximate containment, and its variants k-containment and reliable containment. Then, we give an optimal algorithm for deciding the k-containment. Regarding the reliable approximate containment, we show that it is polynomial time equivalent to the notorious limitedness problem in distance automata.
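
One way to picture "answers in order of their proximity to the query" is a shortest-path search over pairs of graph node and automaton state. The sketch below is an assumption-laden simplification, not the framework's weighted transducers: the only distortion allowed is substituting one edge label for another at a fixed cost, the query automaton is written by hand, and every name and data value is hypothetical.

```python
import heapq

# Toy edge-labeled database (hypothetical data).
EDGES = [
    ("alice", "supervises", "bob"),
    ("bob", "supervises", "carol"),
    ("alice", "mentors", "dave"),
]

# Hand-built NFA for the one-step query  supervises ; state 1 is accepting.
QUERY_NFA = {0: [("supervises", 1)], 1: []}


def ranked_answers(edges, source, nfa, start, accepting, substitution_cost=1):
    """Return (cost, node) pairs in nondecreasing cost order.

    A graph edge may be matched against an automaton transition with a
    different label by paying `substitution_cost`; matching labels cost 0.
    Dijkstra's algorithm over (node, state) pairs yields minimal costs.
    """
    out_edges = {}
    for s, label, o in edges:
        out_edges.setdefault(s, []).append((label, o))

    settled, reported, results = set(), set(), []
    heap = [(0, source, start)]
    while heap:
        cost, node, state = heapq.heappop(heap)
        if (node, state) in settled:
            continue
        settled.add((node, state))
        if state in accepting and node not in reported:
            reported.add(node)
            results.append((cost, node))
        for edge_label, next_node in out_edges.get(node, []):
            for query_label, next_state in nfa[state]:
                step = 0 if edge_label == query_label else substitution_cost
                if (next_node, next_state) not in settled:
                    heapq.heappush(heap, (cost + step, next_node, next_state))
    return results


print(ranked_answers(EDGES, "alice", QUERY_NFA, 0, {1}))
```

Since all step costs are nonnegative, the first time a node is popped in an accepting state its cost is the smallest distortion needed to reach it, so exact answers (cost 0) are always listed before approximate ones.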

12.
It has been observed that queries over XML data sources are often unsatisfiable. Unsatisfiability may stem from several different sources, e.g., the user may be insufficiently familiar with the labels appearing in the documents, or may not be intimately aware of the hierarchical structure of the documents. To deal with query and document mismatches, previous research has considered returning answers that maximally satisfy (in some sense) the query, instead of only returning strictly satisfying answers. However, this breaks the golden database rule that only strictly satisfying answers are returned when querying. Indeed, the relationship between the query and answers is no longer clear when unsatisfying answers are returned. To reinstate the golden database rule, this article proposes a framework for automatically correcting queries over XML. This framework generates similar satisfiable queries when the user query is unsatisfiable. The user can then choose a satisfiable query of interest, and receive exactly satisfying answers to this query.

13.
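In the development of the Web of Data built on the RDF structure, efficient data retrieval has become one of the key problems. Formal query languages such as SPARQL are held back by the complexity of their syntax and by their dependence on the ontology being queried, so new methods or tools are urgently needed to support retrieval based on natural language (for example, keyword search). Formal query languages are an effective way to retrieve this kind of structured data, but users are accustomed to natural-language-based retrieval. How to automatically convert keyword-based retrieval into formal-query-based retrieval is therefore an important step in realizing the Web of Data. Natural language query methods for Linked Data automatically convert natural language queries into SPARQL queries, improving the effectiveness and efficiency of the system. Building on an abstract conversion metric model, the paper constructs an ontology-based query semantic graph, performs semantic disambiguation, and builds SPARQL queries. Experimental results show that the method achieves higher recall and precision and lower time cost.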

14.
Nowadays, huge volumes of data are organized or exported in tree-structured form. Querying capabilities are provided through tree-pattern queries. The need for querying tree-structured data sources when their structure is not fully known, and the need to integrate multiple data sources with different tree structures have driven, recently, the suggestion of query languages that relax the complete specification of a tree pattern. In this paper, we consider a query language that allows the partial specification of a tree pattern. Queries in this language range from structureless keyword-based queries to completely specified tree patterns. To support the evaluation of partially specified queries, we use semantically rich constructs, called dimension graphs, which abstract structural information of the tree-structured data. We address the problem of query containment in the presence of dimension graphs and we provide necessary and sufficient conditions for query containment. As checking query containment can be expensive, we suggest two heuristic approaches for query containment in the presence of dimension graphs. Our approaches are based on extracting structural information from the dimension graph that can be added to the queries while preserving equivalence with respect to the dimension graph. We considered both cases: extracting and storing different types of structural information in advance, and extracting information on-the-fly (at query time). Both approaches are implemented, validated, and compared through experimental evaluation.

15.
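Secure query processing for XML databases   (Total citations: 1; self-citations: 0; citations by others: 1)
Today's XML database query systems must handle rapidly growing data volumes and large numbers of users. If fine-grained access control is used to protect sensitive XML data, query efficiency suffers, because it is difficult to enforce access control on every node of an XML document when user views have to be computed. A secure XML query scheme is proposed that uses a cache to store query results and security information. User queries are rewritten into secure system queries and, depending on whether the cache is hit, the query is executed either on the cache or on the source XML document. A new cache replacement policy, LSL, is also proposed, which updates the cache based on security levels. Experimental results show that the scheme significantly improves the performance of the query system.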

16.
We present in this paper BP-QL, a novel query language for querying business processes. The BP-QL language is based on an intuitive model of business processes, an abstraction of the emerging BPEL (business process execution language) standard. It allows users to query business processes visually, in a manner very analogous to how such processes are typically specified, and can be employed in a distributed setting, where process components may be provided by distinct providers.

17.
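Computer networks are now used in every aspect of life, yet finding what people need in massive amounts of information still poses many problems, which gave rise to the concept of ontology. Ontology querying and reasoning are important components of ontology-based applications; the aim of this study is to let knowledge be fully expressed and to make information queries more precise and complete. The paper first introduces the concept of ontology and builds an ontology model, then queries the model with the ontology query language SPARQL and extends it with semantic rules written in SWRL. Finally, it introduces Jena and performs reasoning over the ontology model, thereby obtaining additional knowledge. The conclusion is that, when querying and reasoning with SPARQL and Jena, reasoning improves querying capability, and rules are the key to improving reasoning capability.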

18.
Web users often post queries through form-based interfaces on the Web to retrieve data from the Web; however, answers to these queries are mostly computed according to keywords entered into different fields specified in a query interface, and their precision and recall could be low. The precision and recall ratios in answering this type of query can be improved by considering closely related previous queries submitted through the same interface, along with their answers. In this paper, we present an approach for enhancing the retrieval of relevant answers to a form-based Web query by adopting the data-mining approach using previous, relevant queries and their answers. Experimental results on a randomly selected set of 3,800 documents retrieved from various Web sites show that our data-mining, query-rewriting approach achieves average precision and true positive ratios on rewritten queries in the upper 80% range, whereas the average false positive ratio is less than 2.0%. Work partially done during a visit to BYU and partially supported by National Natural Science Foundation of China No. 60503036 and Fok YingTong Education Foundation No. 104027.

19.
Linked Open Data initiatives have encouraged the publication of large RDF datasets into the Linking Open Data (LOD) cloud, including DBpedia, YAGO, and GeoNames. Despite the size of LOD datasets and the development of (semi-)automatic methods to create and link LOD data, these datasets may still be incomplete, thus negatively affecting the accuracy of Linked Data processing techniques. We acquire query answer completeness by capturing knowledge collected from the crowd, and propose a novel hybrid query processing engine that brings together machine and human computation to execute SPARQL queries. Our system, HARE, implements these hybrid query processing techniques. HARE encompasses several features: (1) a completeness model for RDF that exploits the characteristics of RDF in order to estimate the completeness of an RDF dataset; (2) a crowd knowledge base that captures crowd answers about missing values in the RDF dataset; (3) a query engine that combines on-the-fly crowd knowledge and estimates provided by the RDF completeness model, to decide upon the sub-queries of a SPARQL query that should be executed against the dataset or via crowd computing to enhance query answer completeness; and (4) a microtask manager that exploits the semantics encoded in the dataset RDF properties, to crowdsource SPARQL sub-queries as microtasks and update the crowd knowledge base with the results from the crowd. Effectiveness and efficiency of HARE are empirically studied on a collection of 50 SPARQL queries against the DBpedia dataset. Experimental results clearly show that our solution accurately enhances answer completeness.

20.