Found 20 similar documents; search took 15 ms.
1.
Schema theory is the best-known model of evolutionary algorithms. Following genetic algorithms (GA), nearly all schemata defined for genetic programming (GP) refer to a set of points in the search space that share some syntactic characteristics. In GP, however, syntactically similar individuals do not necessarily have similar semantics. Because the instances of a syntactic schema do not behave similarly, the corresponding schema theory becomes unreliable, and such theories have rarely been used to improve the performance of GP. The main objective of this study is to propose a schema theory that could be a more realistic model for GP and could potentially be employed to improve GP in practice. To achieve this aim, the concept of a semantic schema is introduced. This schema partitions the search space according to the semantics of trees, regardless of their syntactic variety. We interpret the semantics of a tree in terms of the mutual information between its output and the target. The semantic schema is characterized by a set of semantic building blocks and their joint probability distribution. After introducing the semantic building blocks, an algorithm for finding them in a given population is presented. An extraction method that looks for the most significant schema of the population is provided. Moreover, an exact microscopic schema theorem is suggested that predicts the expected number of schema samples in the next generation. Experimental results demonstrate the capability of the proposed schema definition in representing the semantics of the schema instances. The results also show that the estimates of the semantic schema theorem are more realistic than those of previously defined schemata.
2.
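Because the data schemas of data sources are autonomous and heterogeneous, uncertainty is an inherent property of the schema matching process. This paper proposes an uncertain matching method based on evidence theory. First, the schema space is divided into several schema subspaces according to attribute type. Then, the results of different matchers are treated as different sources of evidence: multiple basic probability assignment functions are generated from the matcher results, and an improved Dempster combination rule combines the matcher results automatically, which reduces manual intervention and resolves conflicts among the evidence when the results of different matchers are combined. Finally, the Kuhn-Munkres algorithm is used to obtain the schema mapping. Experimental results demonstrate the feasibility and effectiveness of the method.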
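As a rough illustration of the evidence-combination step in this abstract, the sketch below implements the classic Dempster rule for two basic probability assignments. It is not the improved rule the abstract refers to, whose details are not given here; the function name `combine_dempster`, the attribute names, and the masses are all hypothetical.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two basic probability assignments with the classic Dempster rule.

    Each assignment maps a frozenset of hypotheses to its mass; masses sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence: the product mass goes to the common hypotheses.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Contradictory evidence: accumulate the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the two sources cannot be combined")
    # Renormalize so that the remaining masses sum to 1.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


# Hypothetical question: does a source attribute match target "name" or "title"?
NAME, TITLE = "name", "title"
EITHER = frozenset({NAME, TITLE})  # mass left on "undecided"
name_matcher = {frozenset({NAME}): 0.6, EITHER: 0.4}
type_matcher = {frozenset({NAME}): 0.5, frozenset({TITLE}): 0.3, EITHER: 0.2}

fused = combine_dempster(name_matcher, type_matcher)
for hypotheses, mass in fused.items():
    print(sorted(hypotheses), round(mass, 4))
```

In this made-up example the two matchers conflict with mass 0.18, which the rule removes by renormalizing. Improved combination rules typically change how that conflict mass is redistributed.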
3.
Integrating heterogeneous database schemata is a major task in federated database design, where preexisting and heterogeneous database systems need to be integrated virtually by providing a homogenizing database interface. Most proposed schema integration methods suffer from very complex result schemata and insufficient handling of extensional relations, i.e. of the way redundant data of the input systems are dealt with. Redundancy among the input systems may thus remain undetected and, hence, uncontrolled. Our GIM (Generic Integration Model) method is based on the elegant and mathematically founded theory of formal concept analysis (FCA). The main idea is to integrate schemata into one formal context, which is a binary relation between a set of attributes and a set of base extensions (sets of potential objects). To that context we apply an FCA algorithm to semi-automatically derive a concept lattice, which we interpret as an inheritance hierarchy of classes for a homogenized schema. Thus, the integration task following our method can be supported by tools.
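To make the formal-context idea concrete, here is a minimal brute-force sketch that enumerates the formal concepts of a tiny context. It is not the GIM algorithm itself; the function `formal_concepts`, the base extensions, and the attribute names are hypothetical, and closing every attribute subset only suits very small examples.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small formal context.

    `context` maps each object (here: a base extension) to its set of attributes.
    """
    objects = set(context)
    attributes = set().union(*context.values())

    def extent(attrs):
        # All objects that carry every attribute in `attrs`.
        return frozenset(o for o in objects if attrs <= context[o])

    def intent(objs):
        # All attributes shared by every object in `objs`.
        if not objs:
            return frozenset(attributes)
        return frozenset(set.intersection(*(set(context[o]) for o in objs)))

    concepts = set()
    for size in range(len(attributes) + 1):
        for attrs in combinations(sorted(attributes), size):
            ext = extent(set(attrs))
            concepts.add((ext, intent(ext)))  # closure of the attribute subset
    return concepts


# Hypothetical integrated context: base extensions versus schema attributes.
context = {
    "persons_only": {"name"},
    "employees": {"name", "salary"},
    "students": {"name", "student_id"},
}
concepts = formal_concepts(context)
for ext, att in sorted(concepts, key=lambda c: len(c[1])):
    print(sorted(ext), sorted(att))
```

Ordered by extent inclusion, the concepts of this toy context form a small lattice: a top class carrying only `name`, two subclasses that add `salary` or `student_id`, and an empty bottom concept. This is the sense in which a concept lattice can be read as an inheritance hierarchy.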
4.
Francisco J. González-Castaño Rafael Asorey-Cacheda Héctor Cerezo-Costas Juan C. Burguillo-Rial Felipe J. Gil-Castiñeira 《Multimedia Tools and Applications》2010,48(2):291-312
In this paper we present a novel multicast near-Video on Demand (nVoD) coding schema, which relies on the intrinsic redundancy of the underlying nVoD protocol to provide implicit error correction, by employing content segments as blocks for coding operations. As a result, this implicit error correction has zero overhead, unlike the direct application of error-correcting codes, which increase content bitrate in the same proportion as target error probability. The findings in this paper indicate that our proposal outperforms previous approaches with explicit error correction (error protection within content segments) in terms of transmission bandwidth for the same packet loss probability. We present an analytical approach that can be used to tune implicit error correction (coding matrix selection), which we validate with simulations. We also simulate the impact of the coding schema on two different nVoD protocols, fast broadcasting (FB) and recursive frequency splitting (RFS). Finally, we show the benefits of applying this schema to a real scenario with WiMax transport.
5.
Extract, Transform and Load (ETL) processes organized as workflows play an important role in data warehousing. As ETL workflows are usually complex, various ETL facilities have been developed to address their control-flow process modeling and execution control. To evaluate the quality of ETL facilities, synthetic ETL workflow test cases, consisting of control-flow and data-flow aspects, are needed to check ETL facility functionalities at construction time and to validate the correctness and performance of ETL facilities at run time. Although some approaches to generating synthetic workflow and data set test cases exist in the literature, little work considers both aspects at the same time, specifically for ETL workflow generators. To address this issue, this paper proposes a schema-aware ETL workflow generator with which users can characterize their ETL workflows by various parameters and get ETL workflow test cases with a control flow of ETL activities, compliant schemas, and associated recordsets. Our generator works in three steps. First, given the type and ratio of individual activities and a specification of their connection characteristic parameters, the generator produces ETL activities and forms an ETL skeleton that determines how the generated activities cooperate with each other. Second, given a specification of schema transformation characteristic parameters, e.g. ranges of numbers of attributes, the generator resolves attribute dependencies and refines input/output schemas with compliant attributes and their data types. In the last step, recordsets are generated following cardinality specifications. ETL workflows in specific patterns are produced in the experiment to show the ability of our generator. Experiments that generate thousands of ETL workflow test cases in seconds have also been carried out to verify the usability of the generator.
6.
General schema theory for genetic programming with subtree-swapping crossover: part I (total citations: 2, self-citations: 0, citations by others: 2)
This is the first part of a two-part paper which introduces a general schema theory for genetic programming (GP) with subtree-swapping crossover. The theory is based on a Cartesian node reference system which makes it possible to describe programs as functions over the space N^2 and allows one to model the process of selection of the crossover points of subtree-swapping crossovers as a probability distribution over N^4. In Part I, we present these notions and models and show how they can be used to calculate useful quantities. In Part II we will show how this machinery, when integrated with other definitions, such as that of variable-arity hyperschema, can be used to construct a general and exact schema theory for the most commonly used types of GP.
7.
Prabhaker Mateti 《Software》1983,13(2):163-179
A two-level specification of the functional behaviour of a class of indenting programs for Pascal is presented. The transformation that these programs perform on the input text is a composition of splitting input lines, altering the blank space between lexical tokens and computing the margin required in front of each of the split lines. The high level specification is given as a stylized Pascal grammar in Extended BNF. In contrast, the low level specifications, which are operationally closer to a program, and which define how syntactically invalid text is dealt with, require several mathematical functions that capture the essence of these basic transformations. The specifications of an indenting program for Pascal are then obtained as a further elaboration of these functions. Most indentation styles appearing in the literature can be specified with precision using methods developed in this paper. Our experience in this case study indicates that although specifications for real-life programs can be given using simple mathematics, the effort required is still considerable.
8.
General schema theory for genetic programming with subtree-swapping crossover: Part II (total citations: 2, self-citations: 0, citations by others: 2)
This paper is the second part of a two-part paper which introduces a general schema theory for genetic programming (GP) with subtree-swapping crossover (Part I (Poli and McPhee, 2003)). Like other recent GP schema theory results, the theory gives an exact formulation (rather than a lower bound) for the expected number of instances of a schema at the next generation. The theory is based on a Cartesian node reference system, introduced in Part I, and on the notion of a variable-arity hyperschema, introduced here, which generalises previous definitions of a schema. The theory includes two main theorems describing the propagation of GP schemata: a microscopic and a macroscopic schema theorem. The microscopic version is applicable to crossover operators which replace a subtree in one parent with a subtree from the other parent to produce the offspring. Therefore, this theorem is applicable to Koza's GP crossover with and without uniform selection of the crossover points, as well as one-point crossover, size-fair crossover, strongly-typed GP crossover, context-preserving crossover and many others. The macroscopic version is applicable to crossover operators in which the probability of selecting any two crossover points in the parents depends only on the parents' size and shape. In the paper we provide examples, we show how the theory can be specialised to specific crossover operators and we illustrate how it can be used to derive other general results. These include an exact definition of effective fitness and a size-evolution equation for GP with subtree-swapping crossover.
9.
Form invariance of schema and exact schema theorem (total citations: 2, self-citations: 0, citations by others: 2)
One of the most important research questions in GAs is how to explain the evolutionary process of GAs as a mathematical object. In this paper, we first use matrix linear transformations to do so. This new method makes the study of the mechanism of GAs simpler. We obtain the conditions under which the crossover and mutation operators of GAs commute. We also give an exact schema equation based on the concept of schema space. The result is similar to Stephens and Waelbroeck's work, but it has novel meanings and a larger degree of coarse graining.
10.
《Information and Software Technology》1999,41(5):275-281
When transforming a relational database (RDB) schema into an object-oriented database (OODB) schema, much effort has been put into examining key and inclusion dependency (ID) constraints to identify classes and establish inheritance and association between classes. However, in order to further remove the original data redundancy and update anomalies, multi-valued dependencies (MVDs) should also be examined. In this paper, we discuss class structures and define well-structured classes. Based on MVDs, a theorem is given for transforming a relation schema into a well-structured class. To transform an RDB schema into an OODB schema, a composition process that simplifies the input RDB schema and an algorithm that transforms the simplified RDB schema into well-structured OODB classes are developed.
11.
Giuseppe Santucci 《Data & Knowledge Engineering》1998,25(3):301-326
Within the database field, schema refinements have proved useful for documentation and maintenance purposes; moreover, schemata describing the reality of interest at different levels of abstraction are extensively used in Computer Aided Software Engineering tools and visual query languages. Much effort has therefore been spent on analyzing schema transformations and schema refinements. Until now, however, while the syntax of schema transformations has been deeply investigated, the semantics has very often been neglected. In this paper we present a full formal framework supporting both the syntax and the semantics of schema refinements. This formal framework is used to support a methodology able to merge a set of schemata and the top-down chains of refinement planes produced during their design. The result of this kind of integration, which we call multilevel integration, is an integrated schema plus an associated top-down chain of schemata. The integrated schema and the chain are related to the input schemata by interesting properties, giving rise to a two-dimensional structure useful for exploring the data content of complex information systems.
12.
The research issues of intelligent information integration (III) have become ubiquitous and critically important in e-business (EB) with the increasing dependence on the Internet/Intranet and information technology (IT). Accessing intelligent information sources separately, without integration, may lead to chaos in the requested information. It is also not cost-effective in EB settings. A common way to deal with heterogeneity problems in traditional III is to create a common data model. The eXtensible Markup Language (XML) has been the standard document format for exchanging information on the Web. XML deals only with structural heterogeneity; it can barely handle semantic heterogeneity. Ontologies are regarded as an important and natural means to represent the implicit semantics and relationships in the real world, and in this research they are used to help reach semantic interoperability in III. In this paper, we provide a generic, construct-oriented, non-ad hoc method to generate the global schema, enabling a web-based alternative to traditional III. We provide a query method over multiple intelligent information sources that applies the global-as-view (GAV) and local-as-view (LAV) approaches together with ontologies to enhance both the structural and the semantic interoperability of the underlying intelligent information sources. We construct a prototype implementing the method to demonstrate its validity and feasibility.
13.
Unauthorized changes to databases can result in significant losses for organizations as well as individuals. Watermarking can be used to protect the integrity of databases against unauthorized alterations. Prior work focused on watermarking database tables or relations, but malicious alteration cannot be detected in all cases. In this paper we argue that watermarking database indexes in addition to the database tables would improve the detection of unauthorized alterations. Usually, each database table in commercial applications has more than one index attached to it. Thus, watermarking the database table and all its indexes improves the likelihood of detecting malicious attacks. In general, watermarking different indexes, such as R-trees, B-trees, and hashes, requires different watermarking techniques and exploits different redundancies in the underlying data structure. This diversity in watermarking techniques contributes to the overall integrity of the databases. Traditional relational watermarks introduce some error to the watermarked values and thus cannot be applied to all attributes. This paper proposes a novel watermarking scheme for R-tree data structures that does not change the values of the attributes. Moreover, the watermark does not change the size of the R-tree. The proposed technique takes advantage of the fact that R-trees do not put conditions on the order of entries inside the node. In the proposed scheme, entries inside R-tree nodes are rearranged, relative to a "secret" initial order (a secret key), in a way that corresponds to the value of the watermark. To achieve that, we propose a one-to-one mapping between all possible permutations of entries in the R-tree node and all possible values of the watermark. Without loss of generality, watermarks are assumed to be numeric values. The proposed mapping employs a numbering system that uses a variable base with factorial values. The detection rate of malicious attacks depends on the nature of the attack, the distribution of the data, and the size of the R-tree node. Our extensive analysis and experimental results showed that the proposed technique detects data alteration with high probability (up to 99%) on real datasets using reasonable node sizes and attack models. Watermark insertion and extraction are mainly main-memory operations and thus have minimal effect on the cost of R-tree operations.
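The one-to-one mapping between watermark values and entry orderings described in this abstract can be sketched with the standard factorial number system. This is a generic illustration, not necessarily the authors' exact scheme; the function names and the sample entries are hypothetical, and the `entries` argument plays the role of the secret initial order.

```python
from math import factorial


def watermark_to_permutation(value, entries):
    """Return the ordering of `entries` that encodes the integer `value`.

    `entries` is given in the reference (secret) order; `value` must lie in
    [0, n!) where n = len(entries).
    """
    pool = list(entries)
    if not 0 <= value < factorial(len(pool)):
        raise ValueError("value does not fit in this many entries")
    arranged = []
    for remaining in range(len(pool), 0, -1):
        # Each factorial-base digit selects one of the remaining entries.
        digit, value = divmod(value, factorial(remaining - 1))
        arranged.append(pool.pop(digit))
    return arranged


def permutation_to_watermark(arranged, entries):
    """Recover the integer encoded by `arranged`, given the reference order."""
    pool = list(entries)
    value = 0
    for item in arranged:
        digit = pool.index(item)
        value += digit * factorial(len(pool) - 1)
        pool.pop(digit)
    return value


# Hypothetical node with four entries in the secret reference order.
secret_order = ["A", "B", "C", "D"]
node = watermark_to_permutation(23, secret_order)
print(node)                                          # ['D', 'C', 'B', 'A']
print(permutation_to_watermark(node, secret_order))  # 23
```

A node with n entries can carry any of n! distinct watermark values this way, and neither the attribute values nor the node size change, which matches the properties the abstract claims for the scheme.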
14.
A survey of approaches to automatic schema matching (total citations: 75, self-citations: 1, citations by others: 75)
Erhard Rahm Philip A. Bernstein 《The VLDB Journal The International Journal on Very Large Data Bases》2001,10(4):334-350
Schema matching is a basic problem in many database application domains, such as data integration, E-business, data warehousing,
and semantic query processing. In current implementations, schema matching is typically performed manually, which has significant
limitations. On the other hand, previous research papers have proposed many techniques to achieve a partial automation of
the match operation for specific application domains. We present a taxonomy that covers many of these existing approaches,
and we describe the approaches in some detail. In particular, we distinguish between schema-level and instance-level, element-level
and structure-level, and language-based and constraint-based matchers. Based on our classification we review some previous
match implementations thereby indicating which part of the solution space they cover. We intend our taxonomy and review of
past work to be useful when comparing different approaches to schema matching, when developing a new match algorithm, and
when implementing a schema matching component.
Received: 5 February 2001 / Accepted: 6 September 2001 / Published online: 21 November 2001
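In the taxonomy this abstract describes, one of the simplest cells is a schema-level, element-level, language-based matcher that compares element names. The sketch below illustrates that idea with Python's standard `difflib`. It is not taken from the survey; the function names, the threshold, and the element names are hypothetical.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1] between two element names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_elements(source_names, target_names, threshold=0.6):
    """Map each source element to its most similar target element.

    A pair is kept only if its similarity reaches `threshold` (an assumed value).
    """
    matches = {}
    for source in source_names:
        best = max(target_names, key=lambda target: name_similarity(source, target))
        score = name_similarity(source, best)
        if score >= threshold:
            matches[source] = (best, round(score, 2))
    return matches


# Hypothetical element names from two schemas.
source_schema = ["CustomerName", "ZipCode"]
target_schema = ["customer_name", "zip_code", "phone"]
matches = match_elements(source_schema, target_schema)
print(matches)
```

A matcher like this uses only names. Constraint-based, structure-level, and instance-level matchers from the same taxonomy would add data types, schema structure, or sample data as further evidence.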
15.
16.
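Based on a vector model of relational schemas and a tree model of XML schemas, this paper proposes BTT, a model mapping method from relational schemas to modular XML Schema. Its mapping rules fully preserve the structure, attributes, and constraint information of the relational schema. The bottom-up mapping order achieves modular encapsulation and reuse without sacrificing conversion efficiency, so updates to the converted XML Schema document can be carried out inside the modules, which greatly improves maintenance efficiency. Experimental results show that, compared with traditional nested hierarchical XML Schema documents, the XML Schema produced by the BTT method has a clear advantage in maintenance efficiency.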
17.
18.
Katarina Grolinger Miriam A.M. Capretz 《Information and Software Technology》2011,53(2):159-170
Context
The constant changes in today’s business requirements demand continuous database revisions. Hence, database structures, not unlike software applications, deteriorate during their lifespan and thus require refactoring in order to achieve a longer life span. Although unit tests support changes to application programs and refactoring, there is currently a lack of testing strategies for database schema evolution.
Objective
This work examines the challenges for database schema evolution and explores the possibility of using various testing strategies to assist with schema evolution. Specifically, the work proposes a novel unit test approach for the application code that accesses databases with the objective of proactively evaluating the code against the altered database.
Method
The approach was validated through the implementation of a testing framework in conjunction with a sample application and a relatively simple database schema. Although the database schema in this study was simple, it was nevertheless able to demonstrate the advantages of the proposed approach.
Results
After changes in the database schema, the proposed approach found all SELECT statements as well as the majority of other statements requiring modifications in the application code. Due to its efficiency with SELECT statements, the proposed approach is expected to be more successful with database warehouse applications where SELECT statements are dominant.
Conclusion
The unit test approach that accesses databases has proven to be successful in evaluating the application code against the evolved database. In particular, the approach is simple and straightforward to implement, which makes it easily adoptable in practice.
19.
Régis Riveret Pietro Baroni Yang Gao Guido Governatori Antonino Rotolo Giovanni Sartor 《Annals of Mathematics and Artificial Intelligence》2018,83(1):21-71
The combination of argumentation and probability paves the way to new accounts of qualitative and quantitative uncertainty, thereby offering new theoretical and applicative opportunities. Due to a variety of interests, probabilistic argumentation is approached in the literature with different frameworks, pertaining to structured and abstract argumentation, and with respect to diverse types of uncertainty, in particular the uncertainty on the credibility of the premises, the uncertainty about which arguments to consider, and the uncertainty on the acceptance status of arguments or statements. Towards a general framework for probabilistic argumentation, we investigate a labelling-oriented framework encompassing a basic setting for rule-based argumentation and its (semi-) abstract account, along with diverse types of uncertainty. Our framework provides a systematic treatment of various kinds of uncertainty and of their relationships and allows us to back or question assertions from the literature.
20.
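To map recursive elements, multiple-namespace information, repeated structures, and future-oriented extensible elements and attributes in complex XML documents to relational schemas, and to address the shortcomings of P_Schema and B_Schema in mapping XML schemas to relational schemas, this paper proposes a C_Schema-based method for mapping XML schemas to relational schemas. C_Schema inherits from and extends P_Schema and B_Schema: it extracts the complex information listed above into new types and keeps references to the new types in their corresponding parent elements. C_Schema is equivalent to XML Schema; following a set of mapping rules, C_Schema is mapped directly to a relational schema so that XML documents can be stored in a database. With this schema, a wide variety of complex schema-based XML documents can be stored in and restored from relational databases, and the approach is broadly applicable to XML-based industry standards.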