Similar literature
1.
2.
Data publishing has generated much concern about individual privacy. Recent work has shown that different background knowledge can bring various threats to the privacy of published data. In this paper, we study the privacy threat from full functional dependencies (FFDs) used as part of adversary knowledge. We show that the cross-attribute correlations expressed by FFDs (e.g., Phone → Zipcode) can introduce potential vulnerabilities. Unfortunately, none of the existing anonymization principles (e.g., k-anonymity, ℓ-diversity) can effectively protect against an FFD-based privacy attack. We formalize the FFD-based privacy attack and define a privacy model, (d, ℓ)-inference, to combat it. We distinguish the safe FFDs, which do not jeopardize privacy, from the unsafe ones. We design robust algorithms that can efficiently anonymize the microdata with low information loss when unsafe FFDs are present. The efficiency and effectiveness of our approach are demonstrated by an empirical study.

3.
We study the problem of updating intensional relations in the framework of deductive databases on which integrity constraints (specifically functional dependencies) are defined. First, a formalization of a model-theoretic semantics of updates is provided: the notions of representability, consistency and determinism are introduced to characterize the various cases. Then, a proof-theoretic approach, based on a variant of resolution integrated with the chase procedure, is defined, showing that the method exactly captures the above notions. It turns out that using functional dependencies it is possible to resolve potential ambiguities in several practical cases. Also, precomputations can be performed at definition time to execute update requests more efficiently. Work partially supported by Consiglio Nazionale delle Ricerche, within Progetto Finalizzato Sistemi Informatici e Calcolo Parallelo, LRC Logidata+, and by System & Management S.p.A. A preliminary version of this paper appeared in [33].

4.
Conditional functional dependencies (CFDs) are important techniques for data consistency. However, CFDs are limited in their ability to 1) provide reasonable values for consistency repair and 2) detect potential errors. This paper presents context-aware conditional functional dependencies (CCFDs), which help provide reasonable values and detect potential errors. In particular, we focus on automatically discovering minimal CCFDs. We present context relativity to measure the relationship between CFDs. The overlap of related CFDs can provide reasonable values, which results in more accurate consistency repair, and some related CFDs are combined into CCFDs. Moreover, we prove that discovering minimal CCFDs is NP-complete, and we design a precise method and a heuristic method. We also present the dominating value to facilitate the process in both methods. Additionally, the context relativity of the CFDs affects the cleaning results, so we suggest an approximate threshold of context relativity according to the data distribution. Our empirical evaluation confirms that the repair results are more accurate.

5.
In this paper, we propose an efficient rule discovery algorithm, called FD_Mine, for mining functional dependencies from data. By exploiting Armstrong’s Axioms for functional dependencies, we identify equivalences among attributes, which can be used to reduce both the size of the dataset and the number of functional dependencies to be checked. We first describe four effective pruning rules that reduce the size of the search space. In particular, the number of functional dependencies to be checked is reduced by skipping the search for FDs that are logically implied by already discovered FDs. Then, we present the FD_Mine algorithm, which incorporates the four pruning rules into the mining process. We prove the correctness of FD_Mine, that is, we show that the pruning does not lead to the loss of useful information. We report the results of a series of experiments. These experiments show that the proposed algorithm is effective on 15 UCI datasets and synthetic data.
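The pruning idea in the abstract above, skipping FDs that are logically implied by FDs already found, rests on a standard implication test: X → Y follows from a set of FDs exactly when Y lies in the attribute closure of X. The following is a minimal sketch of that textbook test, not of FD_Mine itself; the names `closure`, `is_implied` and `known` are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs of frozensets of attribute names.
    This is the textbook fixpoint computation, not FD_Mine's own procedure.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its left side is covered and it adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def is_implied(lhs, rhs, fds):
    """True if lhs -> rhs follows from `fds` (so a miner could skip checking it)."""
    return frozenset(rhs) <= closure(lhs, fds)


# Hypothetical already-discovered FDs: A -> B and B -> C.
known = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]
print(is_implied("A", "C", known))  # A -> C follows by transitivity
print(is_implied("C", "A", known))  # C -> A does not follow
```

Under these assumptions a miner would test the data only for candidates where `is_implied` returns False.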

6.
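As the basis of database normalization, functional dependencies play an important role in relational theory. In recent years XML has been widely adopted and has become the standard for data transmission and exchange on the Internet. Because XML is semi-structured, how to define XML functional dependencies with stronger expressive power, and how to solve the corresponding logical implication problem, are challenges facing the research community. To address these questions, this paper systematically surveys the current state of research on XML functional dependencies, focusing on how to define functional dependencies, how to decide their implication, and how to discover functional dependencies from XML documents. Finally, related research directions such as typed functional dependencies are discussed.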

7.
This paper proposes a probably approximately correct (PAC) learning framework for inferring a set of functional dependencies (FDs) from example tuples. A simple algorithm is considered that outputs the set of all FDs that hold in a set of example tuples. Let $r$ be a relation (a set of tuples). We define the error of a set of FDs $FS$ as the minimum of $\sum_{t \in \nu} P(t)$, where $\nu$ ($\nu \subset r$) is a set such that $FS$ holds in $r - \nu$, and $P(t)$ denotes the probability that tuple $t$ is picked from $r$. We focus on the sample complexity and bound the number of example tuples required to infer a set of FDs whose error does not exceed ω with probability at least 1 − δ under an arbitrary probability distribution. Tatsuya Akutsu, Ph.D., is an associate professor in the Department of Computer Science, Gunma University. He received the B.E. degree in 1984, the M.E. degree in 1986 and the Dr. Eng. degree in 1989 from The University of Tokyo. From 1989 to 1994, he was with the Mechanical Engineering Laboratory, MITI, Japan. His research interests are the design and analysis of algorithms, computational learning theory and bioinformatics.

8.
Bayesian networks have become a popular technique for representing and reasoning with probabilistic information. The fuzzy functional dependency is an important kind of data dependency in relational databases with fuzzy values. The purpose of this paper is to set up a connection between these data dependencies and Bayesian networks. The connection is made through a set of methods that enable people to obtain the most information about independence conditions from fuzzy functional dependencies.

9.
Information Systems, 1999, 24(7): 535-554
We extend the relational data model to incorporate linear orderings into data domains, which we call the ordered relational model. The conventional Functional Dependencies (FDs) are examined in the context of ordered relational databases by using the notion of System Ordering Independence (SOI), which refers to the desirable scenario that the ordering of tuples in a relation is independent of the implementation of the underlying DBMS. We also extend Armstrong's axiom system for FDs to object relations, which are a subclass of ordered relations that allow us to view tuples as objects. We formally define Ordered Functional Dependencies (OFDs) for the extended model by means of two possible extensions of domains, pointwise-orderings and lexicographical orderings. We first present a sound and complete axiom system for OFDs in the case of pointwise-orderings and then establish a sound and complete set of chase rules for OFDs in the case of lexicographical orderings. Our main result shows that the implication problems for both cases of OFDs are decidable, and that the problem is solvable in linear time in the case of pointwise-orderings.

10.
Information Systems, 2004, 29(6): 483-507
We examine the issue of how to measure the degree to which a functional dependency (FD) is approximate. The primary motivation lies in the fact that approximate FDs represent potentially interesting patterns existent in a table. Their discovery is a valuable data mining problem. However, before algorithms can be developed, a measure must be defined quantifying their approximation degree. First we develop an approximation measure by axiomatizing the following intuition: the degree to which $X \to Y$ is approximate in a table $T$ is the degree to which $T$ determines a function from $\Pi_X(T)$ to $\Pi_Y(T)$. We prove that a unique unnormalized measure satisfies these axioms up to a multiplicative constant. Next we compare the measure developed with two other measures from the literature. In all but one case, we show that the measures can be made to differ as much as possible within normalization. We examine these measures on several real datasets and observe that many of the theoretically possible extreme differences do not bear themselves out. We offer some conclusions as to particular situations where certain measures are more appropriate than others.
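To make the notion of an approximation degree concrete, here is a minimal sketch of one well-known measure from the literature, usually called g3: the smallest fraction of rows that must be deleted for X → Y to hold exactly. It is not the axiomatized measure developed in the paper above, and the abstract does not say which measures it compares against; the function name `g3_error` and the sample table are made up for illustration.

```python
from collections import Counter, defaultdict


def g3_error(rows, lhs, rhs):
    """Fraction of rows to remove so that lhs -> rhs holds exactly.

    `rows` is a list of dicts; `lhs` and `rhs` are lists of attribute names.
    Illustrative only: this is the g3 measure, not the paper's measure.
    """
    if not rows:
        return 0.0
    groups = defaultdict(Counter)
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        groups[x][y] += 1
    # Within each X-group, keep the rows carrying the most frequent Y value.
    kept = sum(max(counts.values()) for counts in groups.values())
    return 1 - kept / len(rows)


# Made-up sample data.
table = [
    {"Phone": "555-0100", "Zipcode": "07030"},
    {"Phone": "555-0100", "Zipcode": "07030"},
    {"Phone": "555-0100", "Zipcode": "07031"},
    {"Phone": "555-0199", "Zipcode": "10001"},
]
print(g3_error(table, ["Phone"], ["Zipcode"]))  # 0.25: one of four rows breaks the FD
```

A value of 0 means the FD holds exactly; larger values mean more rows would have to be dropped.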

11.
Computing functional dependencies from a relation is an important database topic, with many applications in database management, reverse engineering and query optimization. Whereas it has been deeply investigated in those fields, strong links exist with the mathematical framework of Formal Concept Analysis. Considering the discovery of functional dependencies, it is indeed known that a relation can be expressed as the binary relation of a formal context, whose implications are equivalent to those dependencies. However, this leads to a new data representation that is quadratic in the number of objects w.r.t. the original data. Here, we present an alternative avoiding such a data representation and show how to characterize functional dependencies using the formalism of pattern structures, an extension of classical FCA to handle complex data. We also show how another class of dependencies can be characterized with that framework, namely, degenerated multivalued dependencies. Finally, we discuss and compare the performances of our new approach in a series of experiments on classical benchmark datasets.

12.
A database is C-Armstrong for a given set of constraints in a class C if it satisfies every constraint of the set and violates every constraint in C not implied by the set. Therefore, Armstrong databases are test data that perfectly illustrate the current perceptions about the semantics of a schema. We extend the existing theory of Armstrong relations to a toolbox of Armstrong tables. That is, we investigate structural and computational properties of Armstrong tables for the class of functional dependencies (FDs) over SQL tables. Relations are special instances of SQL tables with no duplicate rows and no null value occurrences. While FDs do not enjoy Armstrong tables, the combined class of standard FDs and NOT NULL constraints does enjoy Armstrong tables. The problem of finding an Armstrong table is shown to be precisely exponential for this combined class. However, we establish an algorithm that computes Armstrong tables with a size at most quadratic in that of a minimum-sized Armstrong table. Our resulting toolbox of Armstrong tables can be applied by data engineers to concisely visualize constraints on SQL data. Such support can lead to designs that guarantee efficient data management in practice.

13.
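Conditional functional dependencies are a semantic extension of functional dependencies. They can be applied to data cleaning and are widely used in repairing database consistency. This paper discusses the semantic rules of conditional functional dependencies, focuses on detecting tuples that violate database consistency on the basis of conditional functional dependencies, and introduces a confidence evaluation mechanism to improve the detection rules. The improved detection method shows its advantages in detection based on multiple functional dependencies, making detection more concise and the detection criteria clearer.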
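As an illustration of what detecting violating tuples against a conditional functional dependency can look like, here is a minimal sketch using the usual textbook reading of a CFD: an FD plus a pattern tuple whose entries are constants or a wildcard. It does not implement the confidence evaluation mechanism or the improved rules of the paper above. The names `WILDCARD`, `matches`, `cfd_violations` and the sample rows are all hypothetical.

```python
from collections import defaultdict

WILDCARD = "_"  # pattern entry that matches any value (illustrative convention)


def matches(row, attrs, pattern):
    """True if `row` agrees with every constant of `pattern` on `attrs`."""
    return all(pattern[a] == WILDCARD or row[a] == pattern[a] for a in attrs)


def cfd_violations(rows, lhs, rhs, pattern):
    """Sorted indexes of rows involved in a violation of (lhs -> rhs, pattern)."""
    bad = set()
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        if not matches(row, lhs, pattern):
            continue  # the CFD only constrains rows matching the left-hand pattern
        if not matches(row, rhs, pattern):
            bad.add(i)  # single-row violation of a right-hand constant
        groups[tuple(row[a] for a in lhs)].append(i)
    for members in groups.values():
        # Rows that agree on lhs must also agree on rhs.
        if len({tuple(rows[i][a] for a in rhs) for i in members}) > 1:
            bad.update(members)
    return sorted(bad)


# Made-up data: within the UK, zip should determine street.
rows = [
    {"country": "UK", "zip": "EH4", "street": "Mayfield"},
    {"country": "UK", "zip": "EH4", "street": "Crichton"},
    {"country": "US", "zip": "07974", "street": "Mtn Ave"},
    {"country": "US", "zip": "07974", "street": "Main St"},
]
pattern = {"country": "UK", "zip": WILDCARD, "street": WILDCARD}
print(cfd_violations(rows, ["country", "zip"], ["street"], pattern))  # [0, 1]
```

The two US rows also disagree on street, but they are not reported because the pattern restricts the dependency to rows with country "UK".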

14.
Template dependencies were introduced by Sadri and Ullman [17] to generalize existing forms of data dependencies. It was hoped that by studying a large and natural class of dependencies, we could solve the inference problem for these dependencies, while that problem was elusive for restricted subsets of the template dependencies, such as embedded multivalued dependencies. At about the same time, other generalizations of known dependency forms were developed, such as the implicational dependencies of Fagin [11] and the algebraic dependencies of Yannakakis and Papadimitriou [20]. Unlike the template dependencies, the latter forms include the functional dependencies as special cases. In this paper we show that no nontrivial functional dependency follows from template dependencies, and we characterize those template dependencies that follow from functional dependencies. We then give a complete set of axioms for reasoning about combinations of functional and template dependencies. As a result, template dependencies augmented by functional dependencies can serve as a substitute for the more general implicational or algebraic dependencies, providing the same ability to represent those dependencies that appear ‘in nature’, while providing a somewhat simpler notation and set of axioms than the more general classes.

15.
Approximation measures for rough functional dependencies
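To discover potential and interesting patterns in rough relational databases, approximation and precision measures for rough functional dependencies are proposed and studied. First, the approximation measure for relational databases and the properties it satisfies are studied. On this basis, approximation and precision measures for the rough relational database (RRDB) are proposed and formally defined, the properties they satisfy are studied further, and corresponding examples are given. These measures and their properties benefit research on knowledge discovery and data querying in rough relational databases and further broaden the research scope of rough relational databases.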

16.
Artificial Intelligence, 2007, 171(16-17): 985-1010
In this paper we tackle the issue of the automatic recognition of functional dependencies among guessed predicates in constraint problem specifications. Functional dependencies arise frequently in pure declarative specifications, because of the intermediate results that need to be computed in order to express some of the constraints, or due to precise modeling choices, e.g., to provide multiple viewpoints of the search space in order to increase constraint propagation. Either way, the recognition of dependencies greatly helps solvers, allowing them to avoid spending search on unfruitful branches, while maintaining the highest degree of declarativeness. By modeling constraint problem specifications as second-order formulae, we provide a characterization of functional dependencies in terms of semantic properties of first-order ones, and prove undecidability of the problem of their recognition. Despite this negative result, we advocate the (in many cases effective) possibility of using automated tools to mechanize this task. Additionally, we show how suitable search procedures can be automatically synthesized in order to exploit recognized dependencies. We present OPL examples of various problems, taken from bio-informatics, planning and resource allocation, and show how in many cases OPL greatly benefits from the addition of such search procedures. Moreover, we also give evidence that writing sophisticated ad-hoc search procedures that handle dependencies exploiting the peculiarities of the particular problem is a very difficult and error-prone task which in many cases does not seem to pay off.

17.
An Armstrong database is a database that obeys precisely a given set of sentences (and their logical consequences) and no other sentences of a given type. It is shown that if the sentences of interest are inclusion dependencies and standard functional dependencies (functional dependencies for which the left-hand side is nonempty), then there is always an Armstrong database for each set of sentences. (An example of an inclusion dependency is the sentence that says that every MANAGER is an EMPLOYEE.) If, however, the sentences of interest are inclusion dependencies and unrestricted functional dependencies, then there need not exist an Armstrong database. This result holds even if we allow only ‘full’ inclusion dependencies. Thus, a fairly sharp line is drawn, in a case of interest, as to when an Armstrong database must exist. These results hold whether we restrict our attention to finite databases (databases with a finite number of tuples), or whether we allow unrestricted databases.

18.
Translations of relational schemas are extended to the set of functional and join dependencies. A basic theorem on the representation of the closure of an attribute subset is proved. Translated from Kibernetika, No. 5, pp. 18–26, September–October, 1990.

19.
We consider the problem of discovering the functional and inclusion dependencies that a given database instance satisfies. This technique is used in a database design tool that uses example databases to give feedback to the designer. If the examples show deficiencies in the design, the designer can directly modify the examples. The tool then infers new dependencies and the database schema can be modified, if necessary. The discovery of the functional and inclusion dependencies can also be used in analyzing an existing database. The problem of inferring functional dependencies has several connections to other topics in knowledge discovery and machine learning. In this article we discuss the use of examples in the design of databases, and give an overview of the complexity results and algorithms that have been developed for this problem. © 1992 John Wiley & Sons, Inc.
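The discovery problem described in the abstract above can be sketched in its most naive form: enumerate candidate FDs and test each against the instance. The sketch below is a brute-force baseline under stated assumptions, not one of the algorithms surveyed in the article. It covers only functional (not inclusion) dependencies with a nonempty left-hand side of bounded size, and the names `fd_holds`, `discover_fds` and the sample table are hypothetical.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """True if every pair of rows agreeing on `lhs` also agrees on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        # Remember the first rhs value seen for this key and compare later rows to it.
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


def discover_fds(rows, attributes, max_lhs=2):
    """Brute-force search for minimal FDs lhs -> rhs with 1 <= |lhs| <= max_lhs."""
    found = []
    for rhs in attributes:
        others = [a for a in attributes if a != rhs]
        minimal = []
        for size in range(1, max_lhs + 1):
            for lhs in combinations(others, size):
                # Skip supersets of a left-hand side that already works.
                if any(set(m) <= set(lhs) for m in minimal):
                    continue
                if fd_holds(rows, lhs, rhs):
                    minimal.append(lhs)
                    found.append((lhs, rhs))
    return found


# Made-up example instance.
employees = [
    {"emp": 1, "dept": "A", "mgr": "Ann"},
    {"emp": 2, "dept": "A", "mgr": "Ann"},
    {"emp": 3, "dept": "B", "mgr": "Bob"},
]
for lhs, rhs in discover_fds(employees, ["emp", "dept", "mgr"]):
    print(", ".join(lhs), "->", rhs)
```

The number of candidate left-hand sides grows exponentially with the number of attributes, which is why the `max_lhs` cap is used here. Note that an FD found this way only holds in the given instance; as the abstract describes, a designer can edit the examples to rule out dependencies that hold by accident.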

20.
A finite axiomatization of functional dependencies on conceptual database schemata is presented which naturally generalizes the well-known Armstrong axioms. The underlying conceptual data model is the Higher-Order Entity-Relationship Model.
