首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
2.
Zheng  Shihui  Zhou  Aoying  Zhang  Long  Lu  Hongjun 《World Wide Web》2003,6(2):233-253
XML has been recognized as a promising language for data exchange over the Internet. A number of query languages have been proposed for querying XML data. Most of those languages are path-expression based. One difficulty in forming path-expression based queries is that users have to know the structure of XML data against which the queries were issued. In this paper, we describe a DTD-driven visual query interface for XML database systems. With such an interface, a user can easily form path-expression based queries by clicking elements in the DTD tree displayed on the screen and supplying conditions if necessary. The interface and the query generation process are described in detail.  相似文献   

3.
We propose a principled optimization-based interactive query relaxation framework for queries that return no answers. Given an initial query that returns an empty-answer set, our framework dynamically computes and suggests alternative queries with fewer conditions than those the user has initially requested, in order to help the user arrive at a query with a non-empty-answer, or at a query for which no matter how many additional conditions are ignored, the answer will still be empty. Our proposed approach for suggesting query relaxations is driven by a novel probabilistic framework based on optimizing a wide variety of application-dependent objective functions. We describe optimal and approximate solutions of different optimization problems using the framework. Moreover, we discuss two important extensions to the base framework: the specification of a minimum size on the number of results returned by a relaxed query and the possibility of proposing multiple conditions at the same time. We analyze the proposed solutions, experimentally verify their efficiency and effectiveness, and illustrate their advantages over the existing approaches.  相似文献   

4.
《Knowledge》2007,20(1):1-16
In this paper, we discuss a formal system for representing and analyzing real world events. The event representation discussed in this paper accounts for three important event attributes, namely, time, space, and label. We introduce the notion of sequence templates that appears natural for capturing event related semantics; and in semantically analyzing user queries. To harness this potential, we present a formal structure to represent the queries related to real world events as well as an approach to semantically analyze a user query, and collate event related information to be dispatched to the user. Finally, we discuss the design and implementation of the Query-Event Analysis System (QEAS), which is an integrated system to (a) identify a best-matching sequence template(s) given a user query; (b) derive the meta-events based on the selected sequence templates; and (c) and use the meta-event information to answer the user query.  相似文献   

5.
XML is rapidly emerging as a standard for data representation and exchange over the World Wide Web and an increasing amount of sensitive business data is processed in XML format. Therefore, it is critical to have control mechanisms to restrict a user to access only the parts of XML documents that she is authorized to access. In this paper, we propose the first DTD-based access control model that employs graph matching to analyze if an input query is fully acceptable, fully rejectable, or partially acceptable. In this way, there will be no further security overhead for the processing of fully acceptable and rejectable queries. For partially acceptable queries, we propose a graph-matching based authorization model for an optimized rewriting procedure in which a recursive query (query with descendant axis ‘//’) will be rewritten into an equivalent recursive one if possible and into a non-recursive one only if necessary, resulting queries that can fully take advantage of structural join based query optimization techniques. Moreover, we propose an index structure for XML element types to speed up the query rewriting procedure, a facility that is potentially useful for applications with large DTDs. Our performance study results showed that our algorithms armed with rewriting indexes are promising.  相似文献   

6.
A knowledge-based approach for retrieving images by content   总被引:10,自引:0,他引:10  
A knowledge based approach is introduced for retrieving images by content. It supports the answering of conceptual image queries involving similar-to predicates, spatial semantic operators, and references to conceptual terms. Interested objects in the images are represented by contours segmented from images. Image content such as shapes and spatial relationships are derived from object contours according to domain specific image knowledge. A three layered model is proposed for integrating image representations, extracted image features, and image semantics. With such a model, images can be retrieved based on the features and content specified in the queries. The knowledge based query processing is based on a query relaxation technique. The image features are classified by an automatic clustering algorithm and represented by Type Abstraction Hierarchies (TAHs) for knowledge based query processing. Since the features selected for TAH generation are based on context and user profile, and the TAHs can be generated automatically by a clustering algorithm from the feature database, our proposed image retrieval approach is scalable and context sensitive. The performance of the proposed knowledge based query processing is also discussed  相似文献   

7.
Finding the occurrences of structural patterns in XML data is a key operation in XML query processing. Existing algorithms for this operation focus almost exclusively on path patterns or tree patterns. Current applications of XML require querying of data whose structure is complex or is not fully known to the user, or integrating XML data sources with different structures. These applications have motivated recently the introduction of query languages that allow a partial specification of path patterns in a query. In this paper, we consider partial path queries, a generalization of path pattern queries, and we focus on their efficient evaluation under the indexed streaming evaluation model. Our approach explicitly deals with repeated labels (that is, multiple occurrences of the same label in a query). We show that partial path queries can be represented as rooted dags for which a topological ordering of the nodes exists. We present three algorithms for the efficient evaluation of these queries. The first one exploits a structural summary of data to generate a set of path patterns that together are equivalent to a partial path query. To evaluate these path patterns, we extend a previous algorithm for path-pattern queries so that it can work on path patterns with repeated labels. The second one extracts a spanning tree from the query dag, uses a stack-based algorithm to find the matches of the root-to-leaf paths in the tree, and merge-joins the matches to compute the answer. Finally, the third one exploits multiple pointers of stack entries and a topological ordering of the query dag to apply a stack-based holistic technique. We analyze our algorithms and perform extensive experimental evaluations. Our experimental results show that the holistic algorithm outperforms the other ones. Our approaches are the first ones to efficiently evaluate this class of queries in the indexed streaming model.  相似文献   

8.
Searching XML data with a structured XML query can improve the precision of results compared with a keyword search. However, the structural heterogeneity of the large number of XML data sources makes it difficult to answer the structured query exactly. As such, query relaxation is necessary. Previous work on XML query relaxation poses the problem of unnecessary computation of a big number of unqualified relaxed queries. To address this issue, we propose an adaptive relaxation approach which relaxes a query against different data sources differently based on their conformed schemas. In this paper, we present a set of techniques that supports this approach, which includes schema-aware relaxation rules for relaxing a query adaptively, a weighted model for ranking relaxed queries, and algorithms for adaptive relaxation of a query and top-k query processing. We discuss results from a comprehensive set of experiments that show the effectiveness and the efficiency of our approach.  相似文献   

9.
Modern search engines employ advanced techniques that go beyond the structures that strictly satisfy the query conditions in an effort to better capture the user intentions. In this work, we introduce a novel query paradigm that considers a user query as an example of the data in which the user is interested. We call these queries exemplar queries. We provide a formal specification of their semantics and show that they are fundamentally different from notions like queries by example, approximate queries and related queries. We provide an implementation of these semantics for knowledge graphs and present an exact solution with a number of optimizations that improve performance without compromising the result quality. We study two different congruence relations, isomorphism and strong simulation, for identifying the answers to an exemplar query. We also provide an approximate solution that prunes the search space and achieves considerably better time performance with minimal or no impact on effectiveness. The effectiveness and efficiency of these solutions with synthetic and real datasets are experimentally evaluated, and the importance of exemplar queries in practice is illustrated.  相似文献   

10.
When a query is posed on a centralized database, if it refers to attributes that are not defined in the database, the user is warranted to get either an error or an empty set. In contrast, when a query is posed on a peer in a P2P system and refers to attributes not found in the local database, the query should not be simply rejected if the relevant information is available at other peers. This paper proposes a query model for unstructured P2P systems to answer such queries. (a) We introduce a class of polymorphic queries, a revision of conjunctive queries by incorporating type variables to accommodate attributes not defined in the local database. (b) We define the semantics of polymorphic queries in terms of horizontal and vertical object expansions, to find attributes and tuples, respectively, missing from the local database. We show that both expansions can be conducted in a uniform framework. (c) We develop a top-K algorithm to approximately answer polymorphic queries. (d) We also provide a method to merge tuples collected from various peers, based on matching keys specified in polymorphic queries. Our experimental study verifies that polymorphic queries are able to find more sensible information than traditional queries supported by P2P systems, and that these queries can be evaluated efficiently.  相似文献   

11.
Semistructured data occur in situations where information lacks a homogeneous structure and is incomplete. Yet, up to now the incompleteness of information has not been reflected by special features of query languages. Our goal is to investigate the principles of queries that allow for incomplete answers. We do not present, however, a concrete query language. Queries over classical structured data models contain a number of variables and constraints on these variables. An answer is a binding of the variables by elements of the database such that the constraints are satisfied. In the present paper, we loosen this concept in so far as we allow also answers that are partial; that is, not all variables in the query are bound by such an answer. Partial answers make it necessary to refine the model of query evaluation. The first modification relates to the satisfaction of constraints: in some circumstances we consider constraints involving unbound variables as satisfied. Second, in order to prevent a proliferation of answers, we only accept answers that are maximal in the sense that there are no assignments that bind more variables and satisfy the constraints of the query. Our model of query evaluation consists of two phases, a search phase and a filter phase. Semistructured databases are essentially labeled directed graphs. In the search phase, we use a query graph containing variables to match a maximal portion of the database graph. We investigate three different semantics for query graphs, which give rise to three variants of matching. For each variant, we provide algorithms and complexity results. In the filter phase, the maximal matchings resulting from the search phase are subjected to constraints, which may be weak or strong. Strong constraints require all their variables to be bound, while weak constraints do not. We describe a polynomial algorithm for evaluating a special type of queries with filter constraints, and assess the complexity of evaluating other queries for several kinds of constraints. In the final part, we investigate the containment problem for queries consisting only of search constraints under the different semantics.  相似文献   

12.
Fast Nearest-Neighbor Query Processing in Moving-Object Databases   总被引:4,自引:1,他引:4  
A desirable feature in spatio-temporal databases is the ability to answer future queries, based on the current data characteristics (reference position and velocity vector). Given a moving query and a set of moving objects, a future query asks for the set of objects that satisfy the query in a given time interval. The difficulty in such a case is that both the query and the data objects change positions continuously, and therefore we can not rely on a given fixed reference position to determine the answer. Existing techniques are either based on sampling, or on repetitive application of time-parameterized queries in order to provide the answer. In this paper we develop an efficient method in order to process nearest-neighbor queries in moving-object databases. The basic advantage of the proposed approach is that only one query is issued per time interval. The time-parameterized R-tree structure is used to index the moving objects. An extensive performance evaluation, based on CPU and I/O time, shows that significant improvements are achieved compared to existing techniques.  相似文献   

13.
This paper provides a formal specification for concept-based image retrieval using triples. To effectively manage a vast amount of images, we may need an image retrieval system capable of indexing and searching images based on the characteristics of their content. However, such a content-based image retrieval technique alone may not satisfy user queries if retrieved images turn out to be relevant only when they are conceptually related with the queries. In this paper, we develop an image retrieval mechanism to extract semantics of images based on triples. The semantics can be captured by deriving concepts from its constituent objects and spatial relationships between them. The concepts are basically composite objects formed from the aggregation of the constituents. In our mechanism, all the spatial relationships between objects including the concepts are uniformly represented by triples, which are used for indexing images as well as capturing their semantics. We also develop a query evaluation for supporting the concept-based image retrieval. ©1999 John Wiley & Sons, Inc.  相似文献   

14.
The problem of word mismatch in information retrieval (IR) occurs because users often use different words to describe concepts in their queries than authors use to describe the same concepts in their documents. Query expansion is used to deal with the mismatch between author and user vocabularies. To support query expansion, indices on words related by lexical semantics and syntactical co-occurrence need to be maintained. Two issues become paramount in supporting query expansion: the size of index tables and the query processing overhead. In this paper, we propose to use the notion of multi-granularity for more efficient indexing and query processing while the same degrees of precision and recall are maintained. We also describes extensions of this technique to handle: (1) query relaxation to handle words with multiple senses and with other semantic relationships; (2) progressive processing of queries with top N results and (3) progressive processing of queries with specification of the importance of each keyword.  相似文献   

15.
16.
Semantics preserving SPARQL-to-SQL translation   总被引:2,自引:0,他引:2  
Most existing RDF stores, which serve as metadata repositories on the Semantic Web, use an RDBMS as a backend to manage RDF data. This motivates us to study the problem of translating SPARQL queries into equivalent SQL queries, which further can be optimized and evaluated by the relational query engine and their results can be returned as SPARQL query solutions. The main contributions of our research are: (i) We formalize a relational algebra based semantics of SPARQL, which bridges the gap between SPARQL and SQL query languages, and prove that our semantics is equivalent to the mapping-based semantics of SPARQL; (ii) Based on this semantics, we propose the first provably semantics preserving SPARQL-to-SQL translation for SPARQL triple patterns, basic graph patterns, optional graph patterns, alternative graph patterns, and value constraints; (iii) Our translation algorithm is generic and can be directly applied to existing RDBMS-based RDF stores; and (iv) We outline a number of simplifications for the SPARQL-to-SQL translation to generate simpler and more efficient SQL queries and extend our defined semantics and translation to support the bag semantics of a SPARQL query solution. The experimental study showed that our proposed generic translation can serve as a good alternative to existing schema dependent translations in terms of efficient query evaluation and/or ensured query result correctness.  相似文献   

17.
Xyleme is a huge warehouse integrating XML data of the Web. Xyleme considers a simple data model with data trees and tree types for describing the data sources, and a simple query language based on tree queries with boolean conditions. The main components of the data model are a mediated schema modeled by an abstract tree type, as a view of a set of tree types associated with actual data trees, called concrete tree types, and a mapping expressing the connection between the mediated schema and the concrete tree types. The first contribution of this paper is formal: we provide a declarative model-theoretic semantics for Xyleme tree queries, a way of checking tree query containment, and a characterization of tree queries as a composition of branch queries. The other contributions are algorithmic and handle the potentially huge size of the mapping relation which is a crucial issue for semantic integration and query evaluation in Xyleme. First, we propose a method for pre-evaluating queries at compile time by storing some specific meta-information about the mapping into map translation tables. These map translation tables summarize the set of all the branch queries that can be generated from the mediated schema and the set of all the mappings. Then, we propose different operators and strategies for relaxing queries which, having an empty map translation table, will have no answer if they are evaluated against the data. Finally, we present a method for semi-automatically generating the mapping relation.  相似文献   

18.
Currently, a good portion of datasets on Internet are accessed through data services, where user’s queries are answered as a composition of multiple data services. Defining the semantics of data services is the first step towards automating their composition. An interesting approach to define the semantics of data services is by describing them as semantic views over a domain ontology. However, defining such semantic views cannot always be done with certainty, especially when the service’s returned data are too complex. In such case, a data service is associated with several possible semantic views. In addition, complex correlations may be present among these possible semantic views, mainly when data services encapsulate the same data sources. In this paper, we propose a probabilistic approach to model the semantic uncertainty of data services. Services along with their possible semantic views are represented in probabilistic service registry. The correlations among service semantics are modeled through a directed probabilistic graphical model (Bayesian network). Based on our modeling, we study the problem of compositing correlated data services to answer a user query, and propose an efficient method to compute the different possible compositions and their probabilities.  相似文献   

19.
Genetic algorithms for approximate similarity queries   总被引:1,自引:0,他引:1  
Algorithms to query large sets of simple data (composed of numbers and small character strings) are constructed to retrieve the exact answer, retrieving every relevant element, so the answer said to be exact. Similarity searching over complex data is much more expensive than searching over simple data. Moreover, comparison operations over complex data usually consider features extracted from each element, instead of the elements themselves. Thus, even if an algorithm retrieves an exact answer, it is ‘exact’ regarding the extracted features, not regarding the original elements themselves. Therefore, trading exact answering with query time response can be worthwhile. In this work we developed two search strategies based on genetic algorithms to allow retrieving approximate data indexed by Metric Access Methods (MAM) within a limited, user-defined, amount of time. These strategies allow implementing algorithms to answer both range and k-nearest neighbor queries, and allow also to estimate the precision obtained for the approximate answer. Experimental evaluation shows that very good results (corresponding to what the user would expect) can be obtained in a fraction of the time required to obtain the exact answer.  相似文献   

20.
In contrast to a traditional setting where users express queries against the database schema, we assert that the semantics of data can often be understood by viewing the data in the context of the user interface (UI) of the software tool used to enter the data. That is, we believe that users will understand the data in a database by seeing the labels, drop-down menus, tool tips, or other help text that are built into the user interface. Our goal is to allow domain experts with little technical skill to understand and query data. In this paper, we present our GUi As View (Guava) framework and describe how we use forms-based UIs to generate a conceptual model that represents the information in the user interface. We then describe how we generate a query interface from the conceptual model. We characterize the resulting query language using a subset of the relational algebra. Since most application developers want to craft a physical database to meet desired performance needs, we present here a transformation channel that can be configured by instantiating one or more of our transformation operators. The channel, once configured, automatically transforms queries from our query interface into queries that address the underlying physical database and delivers query results that conform to our query interface. In this paper, we define and formalize our database transformation operators. The contributions of this paper are that first, we demonstrate the feasibility of creating a query interface based directly on the user interface and second, we introduce a general purpose database transformation channel that will likely shorten the application development process and increase the quality of the software by automatically generating software artifacts that are often made manually and are prone to errors.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号