1.
A variety of integrity constraints have been studied for data cleaning. While these constraints can detect the presence of errors, they fall short of guiding us to correct the errors. Indeed, data repairing based on these constraints may not find certain fixes that are guaranteed correct, and worse still, may even introduce new errors when attempting to repair the data. We propose a method for finding certain fixes, based on master data, a notion of certain regions, and a class of editing rules. A certain region is a set of attributes that are assured correct by the users. Given a certain region and master data, editing rules tell us what attributes to fix and how to update them. We show how the method can be used in data monitoring and enrichment. We also develop techniques for reasoning about editing rules, to decide whether they lead to a unique fix and whether they are able to fix all the attributes in a tuple, relative to master data and a certain region. Furthermore, we present a framework and an algorithm to find certain fixes, by interacting with the users to ensure that one of the certain regions is correct. We experimentally verify the effectiveness and scalability of the algorithm.
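The idea of an editing rule acting on a certain region can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the abstract: the attribute names (`zip`, `city`, `area_code`), the contents of `MASTER_DATA`, and the function `apply_editing_rule`. The sketch shows a single rule that copies values from master data only when the matching attribute is already assured correct.

```python
# Hypothetical master data: trusted reference records keyed by zip code.
MASTER_DATA = {
    "EH8 9AB": {"city": "Edinburgh", "area_code": "131"},
    "SW1A 1AA": {"city": "London", "area_code": "20"},
}


def apply_editing_rule(record, certain_region, master=MASTER_DATA):
    """Illustrative rule: if `zip` is assured correct and matches a master
    record, copy `city` and `area_code` from that master record.

    Returns the (possibly fixed) record and the enlarged certain region.
    """
    region = set(certain_region)
    # The rule fires only when its matching attribute is in the certain region.
    match = master.get(record.get("zip")) if "zip" in region else None
    if match is None:
        return dict(record), region
    fixed = {**record, **match}
    # Attributes taken from master data are now treated as correct as well.
    return fixed, region | set(match)
```

When `zip` is not in the certain region, the rule does nothing, which reflects the point that a fix is only "certain" if it rests on attributes already known to be correct.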
3.
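Patches stage their own "Assembly" (《集结号》). At the turn of the year, the hottest topic on the streets was probably the recently released hit Feng Xiaogang film "Assembly". This brutal, tragic, and yet heroic story has indeed moved many viewers. In parallel, an equally perilous "assembly" has been playing out on our computers. Microsoft's browser, office software, and mail system were again riddled with vulnerabilities this month, and many users, hoping to get through the winter safely, have switched to Firefox, WPS, and Foxmail. Sometimes, however, the problem is far more complicated than it appears on the surface: for example, PDF file vulnerabilities from third-party companies created a threat to the IE browser and even forced Microsoft to patch its own products to fix problems that were not of its own making.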
7.
Local community detection is a widely used method for identifying groups of nodes starting from seeding nodes. The seed(s) are usually selected either randomly or based only on structural properties of the network. However, in many cases the choice of seed(s) incorporates external knowledge that attaches to these nodes an additional importance for their community. This knowledge may be derived from an expert on the domain, or may arise from the network's side information, and it constitutes our motivation for the present work; this additional information about the importance of seed(s) can be exploited for detection of better and more relevant communities. We call such biased seed(s) hint(s). Our approach is to reflect the importance of hints by appropriately changing the network in their vicinity. To the best of our knowledge, no such viewpoint of the seeding nodes in local community detection has been considered before. The aim of this study is to identify a single community which contains the hint(s). Our key contribution is the proposed Hint Enhancement Framework (HEF) that applies a two-step procedure to discover the community of the hint(s): 1) it changes the network by amplifying the hint(s) using re-weighting or re-wiring strategies so as to materialize the bias towards them and 2) it applies local community detection algorithms on the altered network of step 1. We experimentally evaluate HEF on synthetic and real datasets, and demonstrate the positive aspects of the framework in identifying better communities, in comparison with plain local community detection algorithms as well as a global one.
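The two-step procedure described above can be sketched as follows. This is a minimal illustration and not the HEF implementation: the uniform scaling `factor`, the greedy conductance-based expansion, and the function names `amplify_hint`, `conductance`, and `local_community` are all assumptions, since the abstract does not specify the re-weighting strategy or the local detection algorithm.

```python
def amplify_hint(graph, hint, factor=2.0):
    """Step 1 (re-weighting, assumed uniform): scale every edge at the hint."""
    new = {u: dict(nbrs) for u, nbrs in graph.items()}  # leave the input untouched
    for v in graph[hint]:
        new[hint][v] *= factor
        new[v][hint] *= factor
    return new


def conductance(graph, community):
    """Weighted cut divided by the smaller of the two side volumes."""
    cut = sum(w for u in community
              for v, w in graph[u].items() if v not in community)
    vol = sum(w for u in community for w in graph[u].values())
    total = sum(w for nbrs in graph.values() for w in nbrs.values())
    denom = min(vol, total - vol)
    return cut / denom if denom > 0 else 1.0


def local_community(graph, seed):
    """Step 2 (assumed greedy expansion): add the neighbour that lowers
    conductance the most, and stop when no neighbour improves it."""
    community = {seed}
    best = conductance(graph, community)
    while True:
        frontier = {v for u in community for v in graph[u]} - community
        scored = [(conductance(graph, community | {v}), v)
                  for v in sorted(frontier)]
        if not scored:
            return community
        score, node = min(scored)
        if score >= best:
            return community
        community.add(node)
        best = score


# Hypothetical toy network: two triangles joined by the bridge c-d.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
GRAPH = {}
for u, v in EDGES:
    GRAPH.setdefault(u, {})[v] = 1.0
    GRAPH.setdefault(v, {})[u] = 1.0

hint_community = local_community(amplify_hint(GRAPH, "a"), "a")
```

On this toy network the hint `a` ends up in the triangle `{a, b, c}`. A re-wiring strategy would replace `amplify_hint` with a function that adds or removes edges near the hint instead of scaling weights.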
10.
This paper describes an environment that supports novice information systems designers in initial concept formation by organising design 'cases' in evolving networks, automatically clustered by a neural agent to suggest generalisations of design patterns. Flexible reframing of the cases also supports creativity by uncovering less obvious but relevant design concepts. If the responsibility of linking related design cases is distributed to a population of learners who contribute their cases and tentative generalisations, genetic techniques can be applied to sustain the evolution of the base of cases towards ontological forms that better guide and constrain the case reuse process, thus supporting knowledge transfer.
11.
Three systems of differing complexity have been built which support the development of courseware libraries for reuse at three different organizations. One system was developed for a handful of authors at a small company, one for a university team of about twenty-five people, and one for a company with about one hundred authors. The small company has not found the cost-benefit balance attractive enough to continue investment. The university team has published several products with its collaborative hypermedia system. The system for the large company includes extremely sophisticated library structures and coordination mechanisms but is a challenge for the new user to fully understand. As is often the case with reuse, the institutional commitment to courseware reuse and the ease of use of the tools are critical factors in success. Based on the experiences with the first three courseware reuse systems and the increased popularity of the World Wide Web, a new courseware reuse methodology has been implemented on the World Wide Web.
12.
An efficient wavelength reuse scheme is presented for overcoming the capacity limitation that a limited number of wavelengths imposes on WDM star single-hop networks. With this scheme, the number of nodes supported by the network can be at least doubled for a given limited number of wavelengths. For the same number of nodes, the network delay can be greatly lowered, the network throughput can be increased 1-3 times, and the properties of the network can be efficiently improved.
13.
I spend a lot of time working on tools for embedded system design: hardware-software codesign, system scheduling, software optimization. I've worked on them because I think that good tools are important. They help make design tasks possible that would otherwise be impossible; they also help provide the discipline that produces working systems and minimizes working nights. But tools don't necessarily have to be complex or specially designed for embedded systems. Just as you can use a screwdriver to open a can, you can adapt existing, everyday programs to new uses for embedded computing.
14.
We introduce a technique for forcing the calibration of a financial model to produce valid parameters. The technique is based on learning from hints. It converts simple curve fitting into genuine calibration, where broad conclusions can be inferred from parameter values. The technique augments the error function of curve fitting with consistency hint error functions based on the Kullback-Leibler distance. We introduce an efficient EM-type optimization algorithm tailored to this technique. We also introduce other consistency hints, and balance their weights using canonical errors. We calibrate the correlated multifactor Vasicek model of interest rates, and apply it successfully to the Japanese Yen swaps market and the US dollar yield market.
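The augmented error function mentioned above can be sketched in a few lines. This is only an illustration of adding a Kullback-Leibler penalty to a curve-fitting error. The mean-squared fit error, the single fixed `weight`, the discrete distributions, and the function names are assumptions; the abstract does not give the exact form of the hint error functions or the canonical-error weighting.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler distance between two discrete distributions.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def augmented_error(fit_residuals, hint_p, hint_q, weight=1.0):
    """Curve-fitting error plus a weighted consistency-hint penalty.

    `hint_p` is the distribution the hint says should hold and `hint_q` is
    the distribution implied by the current model parameters (both are
    hypothetical inputs for this sketch).
    """
    fit_error = sum(r * r for r in fit_residuals) / len(fit_residuals)
    return fit_error + weight * kl_divergence(hint_p, hint_q)
```

When the model is consistent with the hint, the penalty vanishes and the expression reduces to the plain curve-fitting error; otherwise the penalty pushes the optimizer toward parameter values that satisfy the hint.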
15.
At the heart of today's information-explosion problems are issues involving semantics, mutual understanding, concept matching, and interoperability. Ontologies and the Semantic Web are offered as a potential solution, but creating ontologies for real-world knowledge is nontrivial. If we could automate the process, we could significantly improve our chances of making the Semantic Web a reality. While understanding natural language is difficult, tables and other structured information make it easier to interpret new items and relations. In this paper we introduce an approach to generating ontologies based on table analysis. We thus call our approach TANGO (Table ANalysis for Generating Ontologies). Based on conceptual modeling extraction techniques, TANGO attempts to (i) understand a table's structure and conceptual content; (ii) discover the constraints that hold between concepts extracted from the table; (iii) match the recognized concepts with ones from a more general specification of related concepts; and (iv) merge the resulting structure with other similar knowledge representations. TANGO is thus a formalized method of processing the format and content of tables that can serve to incrementally build a relevant reusable conceptual ontology.
16.
The main result of this paper is a separation result: there is a positive integer k such that for all well-behaving functions t(n), there is a language accepted by a nondeterministic (multi-tape) Turing machine in time t(n) which cannot be accepted by any deterministic (multi-tape) Turing machine in time O(t(n)) and simultaneously space o((t(n))^(1/k)). This implies, for example that for any positive integer, l,l k, there is a language accepted by an n^l time bounded NDTM which cannot be accepted by a DTM in time and space O(n^l) and O((log n)^l) respectively for any l. Such a result is not provable by direct diagonalization because we do not have time to "simulate and do the opposite". We devise a different method for accomplishing the result: We first use an alternating Turing machine to speed up the simulation of a time and space bounded DTM and then argue that if our separation result did not hold, every NDTM can itself be simulated faster by another NDTM, producing a contradiction to the standard hierarchy results. Some other applications of this method are also presented. Supported by NSF Grant No. MCS-8105557.
17.
Undesirable absorption, distribution, metabolism, excretion (ADME) properties are the cause of many drug development failures and this has led to the need to identify such problems earlier in the development process. This review highlights computational (in silico) approaches that have been used to identify the characteristics of ligands influencing molecular recognition and/or metabolism by the drug-metabolising enzyme UDP-glucuronosyltransferase (UGT). Current studies applying pharmacophore elucidation, 2D-quantitative structure metabolism relationships (2D-QSMR), 3D-quantitative structure metabolism relationships (3D-QSMR), and non-linear pattern recognition techniques such as artificial neural networks and support vector machines for modelling metabolism by UGT are reported. An assessment of the utility of in silico approaches for the qualitative and quantitative prediction of drug glucuronidation parameters highlights the benefit of using multiple pharmacophores and also non-linear techniques for classification. Some of the challenges facing the development of generalisable models for predicting metabolism by UGT, including the need for screening of more diverse structures, are also outlined.
19.
Logic programs can often be inefficient. The usual solution to this problem has been to return some control to the user in the form of impure language features like cut. The authors argue that it is not necessary to resort to such impure features for efficiency. This point is illustrated by considering how most of the common uses of cut can be eliminated from Prolog source programs, relying on static analysis to generate them at compile time. Three common situations where the cut is used are considered. Static analysis techniques are given to detect such situations, and applicable program transformations are described. Two language constructs, firstof and oneof, for situations involving don't-care nondeterminism, are suggested. These constructs have better declarative readings than the cut and extend better to parallel evaluation strategies. Together, these proposals result in a system where users need rely much less on cuts for efficiency, thereby promoting a purer programming style without sacrificing efficiency.
20.
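Foundation class libraries based on object-oriented reuse (OOR) technology strongly support the RAD software development model and greatly improve the reuse and sharing of resources in software development. How can a highly reusable and highly maintainable foundation class library be developed quickly and efficiently? Building on object-oriented and software reuse techniques, this paper introduces OOR-based foundation class library technology, analyzes and gives examples of the design architectures and functional modules of current mainstream foundation class libraries, describes the development approach and concrete design steps for a foundation class library, implements and validates them in a concrete application, and finally raises current hot issues in FCL technology and an outlook for future research.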