首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
Traditional database search uses pattern match in the comparison process. For a query with some search words, tuples are selected only if the words of the tuples exactly match the query words. In this paper, we propose a new method for evaluating relational ranking queries (or top-N queries) with text attributes. This method defines semantic distance functions and utilizes semantic match between words in database search. The attempt is that tuples, not only exactly matching, but also close to the query according to semantic distances, can both be fetched. The basic idea of the method is to create an index based on WordNet to expand the tuple words semantically. The candidate results for a query are retrieved by the index and a simple SQL selection statement, and then top-N answers are obtained. Extensive experiments are carried out to measure the performance of this new strategy for the evaluation of ranking queries over relational databases.  相似文献   

2.
Reliability of answers to queries in relational databases   总被引:1,自引:0,他引:1  
The author studies the problem of determining the reliability of answers to queries in a relational database system, where the information in the database comes from various sources with varying degrees of reliability. An extended relational model is proposed in which each tuple in a relation is associated with an information source vector which identifies the information source(s) that contributed to that tuple. The author shows how relational algebra operations can be extended, and implemented using information source vectors, to calculate the vector corresponding to each tuple in the answer to a query, and hence, to identify information source(s) contributing to each tuple in the answer. This also enables the database system to calculate the reliability of each tuple in the answer to a query as a function of the reliability of information sources  相似文献   

3.
This work deals with databases containing ill-known attribute values represented by possibility distributions. Any such database has a canonical interpretation as a set of regular databases, but it is well known that their manipulation raises a number of problems, in particular with respect to the soundness of querying operations and the tractability of the evaluation process. In This work, we propose a query language including three operators which are soundly defined on extended possibilistic (compact) relations, which is the key for tractability. The originality of the approach is twofold: 1) a nesting mechanism is introduced in the data model in order to support the expression of the result of some of the operations allowed, and 2) a joint operation enables to compose possibilistic relations under some reasonable hypotheses.  相似文献   

4.
We present an approach for mining frequent conjunctive in arbitrary relational databases. Our pattern class is the simple, but appealing subclass of simple conjunctive queries. Our algorithm, called Conqueror $^+$ , is capable of detecting previously unknown functional and inclusion dependencies that hold on the database relations as well as on joins of relations. These newly detected dependencies are then used to prune redundant queries. We propose an efficient database-oriented implementation of our algorithm using SQL and provide several promising experimental results.  相似文献   

5.
6.
Keyword search can provide users an easy method to query large and complex databases without any knowledge of structured query languages or underlying database schema. Most of the existing studies have focused on generating candidate structured queries relevant to keywords. Due to the large size of generated queries, the execution costs may be prohibitive. However, existing studies lack the idea of a generalized method to optimize the plan of the large set of generated queries. In this paper, we introduce a graph-theoretic optimization approach. We propose a general graph model, Weighted Operator Graph, to address the costs of keyword query evaluation plans. The proposed model is flexible to integrate all of the cost-based plans in a uniform way. We define a Keyword Query Optimization Problem based on a theoretical cost model as a graph-theoretic problem and show it to be a NP-hard problem. We propose a greedy heuristic Maximum Propagation that reduces the size of the intermediate result as early as possible. The proposed algorithm allows us to achieve efficiency in terms of query evaluation costs. The experimental studies on both synthetic and real data set results show that our work outperforms the existing work.  相似文献   

7.
This paper presents some applications of partial evaluation method to a query optimization in deductive database. A Horn clause transformation is used for the partial evaluation of a query in an intensional database, and its application to multiple query processing is discussed. Three strategies are presented for the compatible case, ordered case and crossed case. In each case, partial evaluation is used to preprocess the intensional database in order to obtain subqueries which direct access to an extensional database.  相似文献   

8.
Let F1,…,FsR[X1,…,Xn] be polynomials of degree at most d, and suppose that F1,…,Fs are represented by a division free arithmetic circuit of non-scalar complexity size L. Let A be the arrangement of Rn defined by F1,…,Fs.For any point xRn, we consider the task of determining the signs of the values F1(x),…,Fs(x) (sign condition query) and the task of determining the connected component of A to which x belongs (point location query). By an extremely simple reduction to the well-known case where the polynomials F1,…,Fs are affine linear (i.e., polynomials of degree one), we show first that there exists a database of (possibly enormous) size sO(L+n) which allows the evaluation of the sign condition query using only (Ln)O(1)log(s) arithmetic operations. The key point of this paper is the proof that this upper bound is almost optimal.By the way, we show that the point location query can be evaluated using dO(n)log(s) arithmetic operations. Based on a different argument, analogous complexity upper-bounds are exhibited with respect to the bit-model in case that F1,…,Fs belong to Z[X1,…,Xn] and satisfy a certain natural genericity condition. Mutatis mutandis our upper-bound results may be applied to the sparse and dense representations of F1,…,Fs.  相似文献   

9.
With recent advances in mobile computing technologies, mobile devices can render 3D objects realistically. Users of these devices such as tourists, mixed-reality gamers, and rescue officers, often need real-time retrieval of 3D objects over wireless networks. Due to bandwidth and latency restrictions in mobile settings, efficient continuous retrieval of 3D objects is a major challenge. In this paper, we present a motion-aware approach to this problem in a client-server model. Specifically, we propose: (i) representing 3D objects in multiple resolutions through wavelets to facilitate motion-aware incremental retrieval, (ii) motion-aware buffer management schemes for both client and server, (iii) an efficient index structure for 3D objects represented by wavelets, and (iv) techniques for processing group queries exploiting group motion behavior of clients. The results of our extensive experimental study demonstrate the effectiveness of our solution.  相似文献   

10.
Several salient-object-based data models have been proposed to model video data. However, none of them addresses the development of an index structure to efficiently handle salient-object-based queries. There are several indexing schemes that have been proposed for spatiotemporal relationships among objects, and they are used to optimize timestamp and interval queries, which are rarely used in video databases. Moreover, these index structures are designed without consideration of the granularity levels of constraints on salient objects and the characteristics of video data. In this paper, we propose a multilevel index structure (MINDEX) to efficiently handle the salient-object-based queries with different levels of constraints. We present experimental results showing the performance of different methods of MINDEX construction.  相似文献   

11.
Generally, a database system containing null value attributes will not operate properly. This study proposes an efficient and systematic approach for estimating null values in a relational database which utilizes clustering algorithms to cluster data, and a regression coefficient to determine the degree of influence between different attributes. Two databases are used to verify the proposed method: (1) Human resource database; and (2) Waugh's database. Furthermore, the mean of absolute error rate (MAER) and average error are used as evaluation criteria to compare the proposed method with other methods. It demonstrates that the proposed method is superior to existing methods for estimating null values in relational database systems. Jia-Wen Wang was born on September 5, 1978, in Taipei, Taiwan, Republic of China. She received the M.S. degree in information management from the National Yunlin University of Science and Technology, Yunlin, Taiwan, in 2003. Since 2003, she has been a PhD degree student in Information Management Department at the National Yunlin University of Science and Technology. Her current research interests include fuzzy systems, database systems, and artificial intelligence. Ching-Hsue Cheng received the B.S. degree in mathematics from Chinese Military Academy, Taiwan, in 1982, the M.S. degree in applied mathematics from the Chung Yuan Christian University, Taiwan, in 1988, and the Ph.D. degree in system engineering and management from National Defence University, Taiwan, in 1994. Currently, he is a professor of the Department of Information Management, National YunLin University of Technology & Science. His research interests are in decision science, soft computing, software reliability, performance evaluation, and fuzzy time series. He has published more than 120 refereed papers in these areas. He has been a principal investigator and project leader in a number of projects with government, and other research-sponsoring agencies.  相似文献   

12.
Finding typical instances is an effective approach to understand and analyze large data sets. In this paper, we apply the idea of typicality analysis from psychology and cognitive science to database query answering, and study the novel problem of answering top-k typicality queries. We model typicality in large data sets systematically. Three types of top-k typicality queries are formulated. To answer questions like “Who are the top-k most typical NBA players?”, the measure of simple typicality is developed. To answer questions like “Who are the top-k most typical guards distinguishing guards from other players?”, the notion of discriminative typicality is proposed. Moreover, to answer questions like “Who are the best k typical guards in whole representing different types of guards?”, the notion of representative typicality is used. Computing the exact answer to a top-k typicality query requires quadratic time which is often too costly for online query answering on large databases. We develop a series of approximation methods for various situations: (1) the randomized tournament algorithm has linear complexity though it does not provide a theoretical guarantee on the quality of the answers; (2) the direct local typicality approximation using VP-trees provides an approximation quality guarantee; (3) a local typicality tree data structure can be exploited to index a large set of objects. Then, typicality queries can be answered efficiently with quality guarantees by a tournament method based on a Local Typicality Tree. An extensive performance study using two real data sets and a series of synthetic data sets clearly shows that top-k typicality queries are meaningful and our methods are practical.  相似文献   

13.
Traditional database query languages such as datalog and SQL allow the user to specify only mandatory requirements on the data to be retrieved from a database. In many applications, it may be natural to express not only mandatory requirements but also preferences on the data to be retrieved. Lacroix and Lavency10) extended SQL with a notion of preference and showed how the resulting query language could still be translated into the domain relational calculus. We explore the use of preference in databases in the setting of datalog. We introduce the formalism of preference datalog programs (PDPs) as preference logic programs without uninterpreted function symbols for this purpose. PDPs extend datalog not only with constructs to specify which predicate is to be optimized and the criterion for optimization but also with constructs to specify which predicate to be relaxed and the criterion to be used for relaxation. We can show that all of the soft requirements in Reference10) can be directly encoded in PDP. We first develop anaively-pruned bottom-up evaluation procedure that is sound and complete for computing answers to normal and relaxation queries when the PDPs are stratified, we then show how the evaluation scheme can be extended to the case when the programs are not necessarily stratified, and finally we develop an extension of themagic templates method for datalog14) that constructs an equivalent but more efficient program for bottom-up evaluation. Kannan Govindarajan, Ph.D.: He obtained his bachelors degree in Computer Science and Engineering from the Indian Institute of Technology, Madras, and he completed his Ph.D. degree in Computer Science from the State University of New York at Buffalo. His dissertation research was on optimization and relaxation techniques for logic languages. His interests lie in the areas of programming languages, databases, and distributed systems. He currently leads the trading community effort in the E-speak Operation in Hewlett Packard Company. Prior to that, he was a member of the Java Products Group in Oracle Corporation. Bharat Jayaraman, Ph.D.: He is a Professor in the Department of Computer Science at the State University of New York at Buffalo. He obtained his bachelors degree in Electronics from the Indian Institute of Technology, Madras (1975), and his Ph.D. from the University of Utah (1981). His research interests are in programming languages and declarative modeling of complex systems. Dr. Jayaraman has published over 50 papers in refereed conferences and journals. He has served on the program committees of several conferences in the area of programming languages, and he is presently on the Editorial Board of the Journal of Functional and Logic Programming. Surya Mantha, Ph.D.: He is a manager in the Communications and Software Services Group of Pittiglio Rabin Todd & McGrath (PRTM), a management consulting firm serving high technology industries. He obtained a bachelors degree in Computer Science and Engineering from the Indian Institute of Technology, Kanpur, an MBA in Finance and Competitive Strategy from the University of Rochester, and a Ph.D. in Computer Science from the University of Utah (1991). His research interests are in the modeling of complex business processes, inter-enterprise application integration, and business strategy. Dr. Mantha has two US patents, and has published over 10 research papers. Prior to joining PRTM, he was a researcher and manager in the Architecture and Document Services Technology Center at Xerox Corporation in Rochester, New York.  相似文献   

14.
A reduced cover set of the set of full reducer semijoin programs for an acyclic query graph for a distributed database system is given. An algorithm is presented that determines the minimum cost full reducer program. The computational complexity of finding the optimal full reducer for a single relation is of the same order as that of finding the optimal full reducer for all relations. The optimization algorithm is able to handle query graphs where more than one attribute is common between the relations. A method for determining the optimum profitable semijoin program is presented. A low-cost algorithm which determines a near-optimal profitable semijoin program is outlined. This is done by converting a semijoin program into a partial order graph. This graph also allows one to maximize the concurrent processing of the semijoins. It is shown that the minimum response time is given by the largest cost path of the partial order graph. This reducibility is used as a post optimizer for the SSD-1 query optimization algorithm. It is shown that the least upper bound on the length of any profitable semijoin program is N(N-1) for a query graph of N nodes  相似文献   

15.
In this paper, we identify a novel and interesting type of queries, contextual ranking queries, which return the ranks of query tuples among some context tuples given in the queries. Contextual ranking queries are useful for olap and decision support applications in non-traditional data exploration. They provide a mechanism to quickly identify where tuples stand within the context. In this paper, we extend the sql language to express contextual ranking queries and propose a general partition-based framework for processing them. In this framework, we use a novel method that utilizes bitmap indices built on ranking functions. This method can efficiently identify a small number of candidate tuples, thus achieves lower cost than alternative methods. We analytically investigate the advantages and drawbacks of these methods, according to a preliminary cost model. Experimental results suggest that the algorithm using bitmap indices on ranking functions can be substantially more efficient than other methods.  相似文献   

16.
董东  马丽  苏国斌 《计算机工程与设计》2005,26(8):2092-2096,2099
XML已经成为数据表示和交换的数据格式标准。随着大量XML文档的出现,应用数据库技术实现对XML数据的管理引起了越来越多研究者的兴趣。作为研究XML数据库技术的一个开始点,通过与关系数据库比较,可以深刻理解XML数据库与关系数据库的异同,进而为解决XML数据库所面临的问题,如为数据冗余控制、并发访问控制等提供必要的基础。两种数据库的比较是从数据模型、查询路径、完整性约束和规范化5个方面进行的,由于数据模型是数据库的基石,二者的数据模型从构造机制、名字的惟一性、空值、实体标识、实体问关系、文档顺序、数据结构的规则性、递归、数据自描述性等9个方面进行了详细讨论。  相似文献   

17.
One attractive approach to object databases is to see them as potentially an evolutionary development from relational databases. This paper concentrates on substantiating the technical basis for this claim, and illustrates it in some detail with an upwards-compatible extension of ANSI SQL2 for conventional objects. This could serve as a foundation for the development of higher-level facilities for more complex objects.  相似文献   

18.
Controlled query evaluation for logic-oriented information systems provides a model for the dynamic enforcement of confidentiality policies in scenarios where users are able to reason about a priori knowledge and the answers to previous queries. Previous foundational work assumes that the control mechanism can solve the arising implication problems and deals only with closed queries. In this paper, we overcome these limitations by refining the abstract model for appropriately represented relational databases. We identify a relational submodel where all instances share a fixed infinite Herbrand domain but have finite base relations, and we require finite and domain-independent query results. Then, via suitable syntactic restrictions on the policy and query languages, each occurring implication problem can be equivalently expressed as a universal validity problem within the Bernays-Schönfinkel class, whose (known) decidability in the classical setting is extended to our framework. For refusal and lying, we design and verify evaluation methods for open queries, exploiting controlled query evaluation of appropriate sequences of closed queries, which include answer completeness tests. Additionally, we present alternative evaluation methods that work for lying and the combined approach but at the price of potentially reduced cooperativeness.  相似文献   

19.
Problems associated with defining normal forms of relational tables relevant to statistical processing are discussed. The concepts of derived identifier, class identifier, derived class-counts, count domains, compact domains, and uniform domains for statistical relational tables are introduced. The structures of the first and the second statistical-normal forms and the relational decompositions needed to achieve them are also discussed. It is shown that the statistical-normal form can be an important method to determine whether the usual statistical analysis techniques are valid. Some suggestions are presented for extending the structured query language (SQL) statements to achieve these operations on statistical relational tables. Some results linking Codd's normal forms with statistical normal forms are discussed. Relational statistical abnormalities, called outlyers, are also discussed  相似文献   

20.
The execution of logic queries in a distributed database environment is studied. Conventional optimization strategies, such as the early evaluation of selection conditions and the clustering of processing to manipulate and exchange large sets of tuples, are redefined in view of the additional difficulties due to logic queries, in particular to recursive rules. In order to allow efficient processing of these logic queries, several program transformation techniques that attempt to minimize distribution costs based on the idea of semijoins and generalized semijoins in conventional databases are presented. Although local computation of semijoins is not possible for the general case, classes of programs are indicated for which these transformations succeed in producing set-oriented computation. Processes evaluating the recursive program in a distributed network are described, and an efficient method for testing the termination of the computation is developed. The approach is compared with sequential as well as dataflow-oriented evaluation  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号