1.

On learning multivalued dependencies with queries

Víctor Lavín Puente 《Theoretical computer science》2011,412(22):2331-2339

Data dependencies play an important role in the design of relational databases. There is a strong connection between dependencies and some fragments of propositional logic. In particular, functional dependencies are closely related to Horn formulas. Also, multivalued dependencies are characterized in terms of multivalued formulas. It is known that both Horn formulas and sets of functional dependencies are learnable in the exact model of learning with queries. Here we present an algorithm that learns a non-trivial subclass of multivalued formulas using membership and equivalence queries. Furthermore, a slight modification of the algorithm allows us to learn the corresponding subclass of multivalued dependencies.
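The link between functional dependencies and Horn formulas mentioned in this abstract can be illustrated with a small sketch. It is not the learning algorithm of the paper; it only shows that deciding whether a set of functional dependencies implies another one amounts to forward chaining over Horn clauses. The names `closure`, `implies`, and `EXAMPLE_FDS` are hypothetical and chosen for this illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the FDs in `fds`.

    Each FD is a (lhs, rhs) pair of attribute collections. Reading the FD
    A,B -> C as the Horn clause (a and b) implies c, the loop below is
    plain forward chaining over those clauses.
    """
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the clause when its whole body is already derived.
            if set(lhs) <= closed and not set(rhs) <= closed:
                closed |= set(rhs)
                changed = True
    return closed


def implies(fds, lhs, rhs):
    """True if the FD lhs -> rhs follows from the FDs in `fds`."""
    return set(rhs) <= closure(lhs, fds)


# Illustrative dependencies (assumed for this sketch): A -> B and B,C -> D.
EXAMPLE_FDS = [(("A",), ("B",)), (("B", "C"), ("D",))]
```

With these example dependencies, `implies(EXAMPLE_FDS, ("A", "C"), ("D",))` holds because A gives B, and B together with C gives D, whereas A alone does not determine D.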

2.

Learning a subclass of k-quasi-Horn formulas with membership queries

Víctor Lavín Puente 《Information Processing Letters》2011,111(11):550-555

Boolean formulas have been widely studied in the field of learning theory. We focus on the model of learning with queries, and study a restriction of the class of k-quasi-Horn formulas, that is, conjunctive normal form formulas where the number of unnegated literals per clause is at most k. This class is known to be as hard to learn as the general class CNF of conjunctive normal form formulas. By imposing some constraints, we define a fragment of this logic that can be learned using only membership queries. We also prove that none of these constraints by itself makes the class learnable.

3.

Mixed transitivity for functional and multivalued dependencies in database relations

Carlo Zaniolo 《Information Processing Letters》1980,10(1):32-34

4.

Axiomatizing functional dependencies in the Higher-Order Entity-Relationship Model

Sven Hartmann Klaus-Dieter Schewe 《Information Processing Letters》2003,87(3):133-137

A finite axiomatization of functional dependencies on conceptual database schemata is presented which naturally generalizes the well-known Armstrong axioms. The underlying conceptual data model is the Higher-Order Entity-Relationship Model.

5.

Approximate dependencies in database systems

Aditya N. Saharia Terence M. Barron 《Decision Support Systems》1995,13(3-4)

Functional dependencies are the most commonly used approach for capturing real-world integrity constraints which are to be reflected in a database. There are, however, many useful kinds of constraints, especially approximate ones, that cannot be represented correctly by functional dependencies and therefore are enforced via programs which update the database, if they are enforced at all. This tends to make such constraints invisible since they are not an explicit part of the database, increasing maintenance problems and the likelihood of inconsistencies. We propose a new approach, cluster dependencies, as a way to enforce approximate dependencies. By treating equality as a fuzzy concept and defining appropriate similarity measures, it is possible to represent a broad range of approximate constraints directly in the database by storing and accessing cluster definitions. We discuss different interpretations of cluster dependencies and describe the additional data structures needed to enforce them. We also contrast them with an existing approach, fuzzy functional dependencies, which are much more limited in the kind of approximate constraints they can represent.

6.

Reverse engineering database queries from examples: State-of-the-art, challenges, and research opportunities

《Information Systems》2019

With the popularization of data access and usage, an increasing number of users without expert knowledge of databases is required to perform data interactions. Often, these users face the challenges of writing and reformulating database queries, which consume a considerable amount of time and frequently yield unsatisfactory results. To facilitate this human–database interaction, researchers have investigated the Query By Example (QBE) paradigm in which database queries are (semi) automatically discovered from data examples given by users. This paradigm allows non-database experts to formulate queries without relying on complex query languages. In this context, this work aims to present a systematic review of the recent developments, open challenges, and research opportunities of the QBE reported in the literature. This work also describes strategies employed to leverage efficient example acquisition and query reverse engineering. The obtained results show that recent research developments have focused on enhancing the expressiveness of produced queries, minimizing user interaction, and enabling efficient query learning in the context of data retrieval, exploration, integration, and analytics. Our findings indicate that future research should concentrate efforts to provide innovative solutions to the challenges of improving controllability and transparency, considering diverse user preferences in the processes of learning personalized queries, ensuring data quality, and improving the support of additional SQL features and operators.

7.

Research on database consistency detection based on conditional functional dependencies

耿寅融 刘波 《计算机工程与应用》2012,48(3):122-125

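Conditional functional dependencies are a semantic extension of functional dependencies. They can be applied to data cleaning and are widely used in repairing database consistency. This paper discusses the semantic rules of conditional functional dependencies, focuses on detecting tuples that violate database consistency on the basis of conditional functional dependencies, and introduces a confidence evaluation mechanism to improve the relevant detection rules. The improved detection method shows its advantages in detection based on multiple functional dependencies, making the detection work more streamlined and the detection criteria clearer.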
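To make the idea of detecting violating tuples concrete, here is a minimal sketch under simplifying assumptions. It treats a conditional functional dependency as an ordinary dependency lhs -> rhs that only has to hold among tuples matching a constant condition. It does not reproduce the paper's detection rules or its confidence evaluation mechanism. The function name `cfd_violations` and the sample relation are hypothetical.

```python
from collections import defaultdict


def cfd_violations(rows, condition, lhs, rhs):
    """Return sorted indices of rows that take part in a violation.

    `condition` maps attribute names to required constants; only rows
    matching it are checked. Among those rows, any two that agree on
    `lhs` must also agree on `rhs`.
    """
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        if all(row[a] == v for a, v in condition.items()):
            groups[tuple(row[a] for a in lhs)].append(i)
    bad = []
    for idxs in groups.values():
        rhs_values = {tuple(rows[i][a] for a in rhs) for i in idxs}
        if len(rhs_values) > 1:  # same lhs, conflicting rhs
            bad.extend(idxs)
    return sorted(bad)


# Illustrative data (assumed): in the UK, zip should determine street.
ADDRESSES = [
    {"country": "UK", "zip": "EH4", "street": "Mayfield"},
    {"country": "UK", "zip": "EH4", "street": "Crichton"},
    {"country": "US", "zip": "07974", "street": "Mountain Ave"},
    {"country": "US", "zip": "07974", "street": "Main St"},
]
```

Calling `cfd_violations(ADDRESSES, {"country": "UK"}, ["zip"], ["street"])` flags only the two UK rows, since the US rows fall outside the condition. With an empty condition the check degenerates to an ordinary functional dependency and flags all four rows.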

8.

A characterization of multivalued dependencies equivalent to a join dependency

Nathan Goodman Y.C. Tay 《Information Processing Letters》1984,18(5):261-266

9.

Efficient preprocessing of XML queries using structured signatures

Yon Dohn Chung 《Information Processing Letters》2003,87(5):257-264

The paper proposes a preprocessing scheme for efficient processing of XML queries in XML-based information retrieval systems. For the preprocessing, we use a signature-based approach. In the conventional (flat document-based) information retrieval systems, user queries consist of keywords and boolean operators, and thus signatures are structured in a flat manner. However, in XML-based information retrieval systems, the user queries have the form of path queries. Therefore, the flat signature cannot be effective for XML documents. In the paper, we propose two structured signature methods for XML documents. Through experiments, we evaluate the performance of the proposed methods.

10.

Theory revision with queries: Horn, read-once, and parity formulas

Judy Goldsmith Robert H. Sloan Balázs Szörényi 《Artificial Intelligence》2004,156(2):139-176

A theory, in this context, is a Boolean formula; it is used to classify instances, or truth assignments. Theories can model real-world phenomena, and can do so more or less correctly. The theory revision, or concept revision, problem is to correct a given, roughly correct concept. This problem is considered here in the model of learning with equivalence and membership queries. A revision algorithm is considered efficient if the number of queries it makes is polynomial in the revision distance between the initial theory and the target theory, and polylogarithmic in the number of variables and the size of the initial theory. The revision distance is the minimal number of syntactic revision operations, such as the deletion or addition of literals, needed to obtain the target theory from the initial theory. Efficient revision algorithms are given for Horn formulas and read-once formulas, where revision operators are restricted to deletions of variables or clauses, and for parity formulas, where revision operators include both deletions and additions of variables. We also show that the query complexity of the read-once revision algorithm is near-optimal.

11.

A note on approximation measures for multi-valued dependencies in relational databases

Chris Giannella 《Information Processing Letters》2003,85(3):153-158

We consider the problem of defining a normalized approximation measure for multi-valued dependencies in relational database theory. An approximation measure is a function mapping relation instances to real numbers. The number to which an instance is mapped, intuitively, describes the strength of the dependency in that instance. A normalized approximation measure for functional dependencies has been proposed previously: the minimum number of tuples that need be removed for the functional dependency to hold divided by the total number of tuples. This leads naturally to a normalized measure for multi-valued dependencies: the minimum number of tuples that need be removed for the multi-valued dependency to hold divided by the total number of tuples. The measure for functional dependencies can be computed efficiently, O(|r|log(|r|)) where |r| is the size of the relation instance. However, we show that an efficient algorithm for computing the analogous measure for multi-valued dependencies is not likely to exist. A polynomial time algorithm for computing the measure would lead to a polynomial time algorithm for an NP-complete problem (proven by a reduction from the maximum edge biclique problem in graph theory). Hence, we argue that it is not a good measure. We propose an alternate measure based on the lossless join characterization of multi-valued dependencies. This measure is efficiently computable, O(|r|²).
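The normalized measure for functional dependencies described in this abstract can be sketched in a few lines. For a single dependency, the fewest tuples to remove is obtained by keeping, within each group of tuples that agree on the left-hand side, only the most frequent right-hand-side value. The sketch below groups with hash tables rather than the sorting suggested by the O(|r|log(|r|)) bound, and the name `fd_error` and the sample relation are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def fd_error(rows, lhs, rhs):
    """Fraction of tuples that must be removed for lhs -> rhs to hold."""
    if not rows:
        return 0.0
    groups = defaultdict(Counter)
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        groups[x][y] += 1
    # Per lhs value, keep the tuples carrying the most common rhs value.
    kept = sum(max(counts.values()) for counts in groups.values())
    return (len(rows) - kept) / len(rows)


# Illustrative relation (assumed): ann appears under two departments.
STAFF = [
    {"emp": "ann", "dept": "db", "proj": "p1"},
    {"emp": "ann", "dept": "db", "proj": "p2"},
    {"emp": "ann", "dept": "ai", "proj": "p3"},
    {"emp": "bob", "dept": "ai", "proj": "p1"},
]
```

For this relation, `fd_error(STAFF, ["emp"], ["dept"])` is 0.25: removing the single tuple that places ann in the ai department makes the dependency hold.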

12.

An inference rule set for XML strong multivalued dependencies based on XML Schema

殷丽凤 郝忠孝 《计算机工程与应用》2010,46(28):152-156

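The problem of finding a set of inference rules for XML strong multivalued dependencies is the basis for solving the implication problem for XML data dependencies under incomplete information, and is one of the key problems in the theory of XML schema design under incomplete information. The paper introduces the concepts of XML Schema and of an incomplete XML document tree conforming to an XML Schema; defines XML strong multivalued dependencies (XSMVDs) and their properties on the basis of subtree information equivalence and subtree information consistency; and gives a corresponding set of inference rules together with a proof of its soundness and completeness. The results lay a foundation for the design of XML Schemas in which XSMVDs exist under incomplete information.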

13.

The complexity of learning concept classes with polynomial general dimension

Johannes Köbler Wolfgang Lindner 《Theoretical computer science》2006

14.

A general comparison of language learning from examples and from queries

Sanjay Jain Steffen Lange Sandra Zilles 《Theoretical computer science》2007

In language learning, strong relationships between Gold-style models and query models have recently been observed: in some quite general setting Gold-style learners can be replaced by query learners and vice versa, without loss of learning capabilities. These ‘equalities’ hold in the context of learning indexable classes of recursive languages.

15.

Parallel algorithms for solving the satisfaction problem of functional and multivalued data dependencies

Chao-Chih Yang Weicong Shen 《Data & Knowledge Engineering》1989,3(4):323-338

Parallel algorithms for solving the satisfaction problem of non-trivial functional and multivalued data dependencies (FDs and MVDs) in a relation of N tuples by M processors are developed in this paper. Algorithms performing, in a parallel manner, batch or interactive checking of these data dependencies are also discussed. The M processors are organized as a linear systolic array. The time complexities of the first two algorithms for solving the FD satisfaction problem under M N are both O(N), and that of Algorithm (3) or (4) for solving the FD or MVD satisfaction problem under N M is O(N²/M). The latter complexity reduces to O(N) if N = M and is at least not worse than O(N log N) if N = M (N/log N).
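For readers unfamiliar with the satisfaction problem itself, the following sequential sketch checks whether a relation satisfies a given FD or MVD. It is not one of the paper's systolic-array algorithms and makes no attempt at parallelism. It relies on the standard characterization that X →→ Y holds exactly when, within each group of tuples agreeing on X, the Y-values and the values of the remaining attributes occur in every combination. The function names and the sample relation are hypothetical.

```python
from collections import defaultdict


def satisfies_fd(rows, lhs, rhs):
    """True if tuples that agree on `lhs` always agree on `rhs`."""
    seen = defaultdict(set)
    for row in rows:
        seen[tuple(row[a] for a in lhs)].add(tuple(row[a] for a in rhs))
    return all(len(values) == 1 for values in seen.values())


def satisfies_mvd(rows, attrs, lhs, mid):
    """True if the MVD lhs ->> mid holds in a relation over `attrs`."""
    rest = [a for a in attrs if a not in lhs and a not in mid]
    groups = defaultdict(lambda: (set(), set(), set()))
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in mid)
        z = tuple(row[a] for a in rest)
        ys, zs, pairs = groups[x]
        ys.add(y)
        zs.add(z)
        pairs.add((y, z))
    # The observed pairs are a subset of ys x zs, so equal sizes mean
    # that every combination is present.
    return all(len(p) == len(ys) * len(zs) for ys, zs, p in groups.values())


# Illustrative relation (assumed): teachers and books vary independently.
ATTRS = ["course", "teacher", "book"]
COURSES = [
    {"course": "db", "teacher": "ann", "book": "b1"},
    {"course": "db", "teacher": "ann", "book": "b2"},
    {"course": "db", "teacher": "bob", "book": "b1"},
    {"course": "db", "teacher": "bob", "book": "b2"},
]
```

Here `satisfies_mvd(COURSES, ATTRS, ["course"], ["teacher"])` is true, while the FD course -> teacher fails. Dropping the last tuple breaks the MVD as well, because the combination (bob, b2) is then missing.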

16.

A survey of research on XML functional dependencies

刘嘉 廖湖声 《计算机工程与科学》2014,36(2):331-339

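As the foundation of database normalization, functional dependencies play an important role in relational theory. In recent years, XML has been widely adopted and has become the standard for data transmission and exchange on the Internet. Because XML is semi-structured, how to define XML functional dependencies with greater expressive power, and how to solve the corresponding logical implication problem, have become challenges for the research community. To address these questions, this paper systematically describes the current state of research on XML functional dependencies, focusing in particular on how to define functional dependencies, how to decide implication between them, and how to discover functional dependencies from XML documents. Finally, it discusses some related research directions, such as typed functional dependencies.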

17.

On learning multicategory classification with sample queries

Joel Ratsaby 《Information and Computation》2003,185(2):298-327

Consider the pattern recognition problem of learning multicategory classification from a labeled sample, for instance, the problem of learning character recognition where a category corresponds to an alphanumeric letter. The classical theory of pattern recognition assumes labeled examples appear according to the unknown underlying pattern-class conditional probability distributions where the pattern classes are picked randomly according to their a priori probabilities. In this paper we pose the following question: Can the learning accuracy be improved if labeled examples are independently randomly drawn according to the underlying class conditional probability distributions but the pattern classes are chosen not necessarily according to their a priori probabilities? We answer this in the affirmative by showing that there exists a tuning of the sub-sample proportions which minimizes a loss criterion. The tuning is relative to the intrinsic complexity of the Bayes-classifier. As this complexity depends on the underlying probability distributions which are assumed to be unknown, we provide an algorithm which learns the proportions in an on-line manner utilizing sample querying which asymptotically minimizes the criterion. In practice, this algorithm may be used to boost the performance of existing learning classification algorithms by apportioning better sub-sample proportions.

18.

A survey of graph dependency research and applications

余旭 曹建军 翁年凤 袁震 曾志贤 《计算机应用研究》2023,40(5):1312-1317

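Graph dependencies are data quality rules for addressing data consistency problems in graph data. The process of improving data consistency with graph dependencies usually consists of three steps: defining and formalizing graph dependencies, automatically mining graph dependencies, and improving data consistency on the basis of graph dependencies. This paper introduces the theory of graph dependencies for data consistency and, according to the type of extension, classifies graph dependencies into those extended by structural constraints, by semantic constraints, and by external constraints; surveys and compares algorithms for automatically mining graph dependencies and their extensions from graph data; analyzes the state of research on applying graph dependencies to improve data consistency; and summarizes the problems that remain in current research and, in light of them, discusses the prospects for applying graph dependencies in the field of data quality.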

19.

Answering constraint-based mining queries on itemsets using previous materialized results

Roberto Esposito Rosa Meo Marco Botta 《Journal of Intelligent Information Systems》2006,26(1):95-111

In recent years, researchers have begun to study inductive databases, a new generation of databases for leveraging decision support applications. In this context, the user interacts with the DBMS using advanced, constraint-based languages for data mining where constraints have been specifically introduced to increase the relevance of the results and, at the same time, to reduce its volume. In this paper we study the problem of mining frequent itemsets using an inductive database. We propose a technique for query answering which consists in rewriting the query in terms of union and intersection of the result sets of other queries, previously executed and materialized. Unfortunately, the exploitation of past queries is not always applicable. We then present sufficient conditions for the optimization to apply and show that these conditions are strictly connected with the presence of functional dependencies between the attributes involved in the queries. We show some experiments on an initial prototype of an optimizer which demonstrates that this approach to query answering is viable and in many practical cases it drastically reduces the query execution time.

20.

On the uselessness of quantum queries

David A. Meyer 《Theoretical computer science》2011,412(51):7068-7074

Given a prior probability distribution over a set of possible oracle functions, we define a number of queries to be useless for determining some property of the function if the probability that the function has the property is unchanged after the oracle responds to the queries. A familiar example is the parity of a uniformly random Boolean-valued function over {1,2,…,N}, for which N−1 classical queries are useless. We prove that if 2k classical queries are useless for some oracle problem, then k quantum queries are also useless. For such problems, which include classical threshold secret sharing schemes, our result also gives a new way to obtain a lower bound on the quantum query complexity, even in cases where neither the function nor the property to be determined is Boolean.
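The classical half of the parity example can be checked by brute force for small N. The sketch below enumerates every Boolean function on N points under a uniform prior, conditions on the answers to a set of queries, and reports the posterior probability that the parity is odd. It illustrates only the definition of uselessness for classical queries and says nothing about the quantum result. The function name `parity_posterior` is hypothetical, and points are indexed from 0 rather than 1.

```python
from fractions import Fraction
from itertools import product


def parity_posterior(n, queried, answers):
    """P(parity is odd | f(i) = a for each queried i), uniform prior on f.

    A function f: {0, ..., n-1} -> {0, 1} is represented as a tuple of bits.
    """
    consistent = [
        f
        for f in product((0, 1), repeat=n)
        if all(f[i] == a for i, a in zip(queried, answers))
    ]
    odd = sum(1 for f in consistent if sum(f) % 2 == 1)
    return Fraction(odd, len(consistent))


# With N = 4, any answers to 3 queries leave the parity at probability 1/2.
POSTERIORS = {
    answers: parity_posterior(4, (0, 1, 2), answers)
    for answers in product((0, 1), repeat=3)
}
```

Every entry of `POSTERIORS` equals 1/2, the same as the prior, so the three queries are useless in the sense defined above. Querying all four points determines the parity completely.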