20 similar documents found; search took 15 milliseconds.
1.
We propose a general, formal definition of the concept of malware (malicious software) as a single sentence in the language of a certain modal logic. Our definition owes its generality to its abstract formulation, which is independent of, but nonetheless generally applicable to, the manifold concrete manifestations of malware. From our formulation of malware, we derive equally general and formal definitions of benware (benign software), anti-malware (“antibodies” against malware), and medware (medical software, or “medicine” for affected software). We provide theoretical tools and practical techniques for the detection, comparison, and classification of malware and its derivatives. Our general defining principle is causation of (in)correctness.
2.
Data integration with uncertainty (total citations: 1; self-citations: 0; citations by others: 1)
Xin Luna Dong Alon Halevy Cong Yu 《The VLDB Journal The International Journal on Very Large Data Bases》2009,18(2):469-500
This paper reports our first set of results on managing uncertainty in data integration. We posit that data-integration systems
need to handle uncertainty at three levels and do so in a principled fashion. First, the semantic mappings between the data
sources and the mediated schema may be approximate because there may be too many of them to be created and maintained or because
in some domains (e.g., bioinformatics) it is not clear what the mappings should be. Second, the data from the sources may
be extracted using information extraction techniques and so may yield erroneous data. Third, queries to the system may be
posed with keywords rather than in a structured form. As a first step to building such a system, we introduce the concept
of probabilistic schema mappings and analyze their formal foundations. We show that there are two possible semantics for such
mappings: by-table semantics assumes that there exists a correct mapping but we do not know what it is; by-tuple semantics assumes that the correct mapping may depend on the particular tuple in the source data. We present the query complexity
and algorithms for answering queries in the presence of probabilistic schema mappings, and we describe an algorithm for efficiently
computing the top-k answers to queries in such a setting. Finally, we consider using probabilistic mappings in the scenario of data exchange.
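The two mapping semantics can be contrasted with a small Python sketch (the source tuples, attribute names, and mapping probabilities below are invented for illustration). A source "phone" column maps to the target attribute office-phone with probability 0.6 and to home-phone with probability 0.4; the query asks whether every source phone is an office phone:

```python
from itertools import product

# Source tuples and two candidate mappings of the "phone" column:
# m1 (prob 0.6) maps it to office-phone, m2 (prob 0.4) to home-phone.
source = [("Alice", "x1111"), ("Bob", "x2222")]
mappings = [("office-phone", 0.6), ("home-phone", 0.4)]

def p_all_by_table(attr):
    """By-table: one correct mapping, unknown; a single choice applies to every tuple."""
    return sum(p for a, p in mappings if a == attr)

def p_all_by_tuple(attr):
    """By-tuple: each tuple independently picks a mapping; marginalize over choices."""
    total = 0.0
    for choice in product(mappings, repeat=len(source)):
        if all(a == attr for a, _ in choice):
            weight = 1.0
            for _, p in choice:
                weight *= p
            total += weight
    return total

# Query: "are all phones office phones?"
print(p_all_by_table("office-phone"))
print(p_all_by_tuple("office-phone"))
```

On a query over a single tuple the two semantics coincide; on this all-tuples query, by-table gives 0.6 (one mapping is correct for the whole table) while by-tuple gives 0.6 × 0.6 = 0.36 (each tuple chooses independently), up to floating-point rounding.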
3.
We develop new algorithms for learning monadic node selection queries in unranked trees from annotated examples, and apply
them to visually interactive Web information extraction.
We propose to represent monadic queries by bottom-up deterministic Node Selecting Tree Transducers (NSTTs), a particular class
of tree automata that we introduce. We prove that deterministic NSTTs capture the class of queries definable in monadic second
order logic (MSO) in trees, which Gottlob and Koch (2002) argue to have the right expressiveness for Web information extraction,
and prove that monadic queries defined by NSTTs can be answered efficiently. We present a new polynomial time algorithm in
RPNI-style that learns monadic queries defined by deterministic NSTTs from completely annotated examples, where all selected nodes
are distinguished.
In practice, users prefer to provide partial annotations. We propose to account for partial annotations by intelligent tree
pruning heuristics. We introduce pruning NSTTs—a formalism that shares many advantages of NSTTs. This leads us to an interactive learning algorithm for monadic queries
defined by pruning NSTTs, which satisfies a new formal active learning model in the style of Angluin (1987).
We have implemented our interactive learning algorithm and integrated it into a visually interactive Web information extraction system, called SQUIRREL, by plugging it into the Mozilla Web browser. Experiments on realistic Web documents confirm excellent quality with very few
user interactions during wrapper induction.
Editors: Georgios Paliouras and Yasubumi Sakakibara
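A much-simplified sketch of the bottom-up evaluation underlying NSTTs: states are assigned leaves-to-root by a deterministic transition table, and a node is selected when it receives a designated selecting state. (Real NSTTs annotate selection in the transducer output; the automaton, labels, and tree below are invented.)

```python
# Bottom-up deterministic run over a tree given as (label, children).
# Nodes are identified by their path from the root (a tuple of child indices).
def run(tree, delta, select, path=()):
    label, children = tree
    child_states = []
    selected = []
    for i, child in enumerate(children):
        s, sel = run(child, delta, select, path + (i,))
        child_states.append(s)
        selected.extend(sel)
    state = delta[(label, tuple(child_states))]  # deterministic transition
    if state in select:
        selected.append(path)
    return state, selected

# Invented automaton: select every "b"-labeled leaf in trees over {a, b}.
delta = {
    ("a", ()): "qa",
    ("b", ()): "qb",
    ("a", ("qa", "qb")): "q",
    ("a", ("qb", "qa")): "q",
    ("a", ("q", "qb")): "q",
}
tree = ("a", [("a", [("a", []), ("b", [])]), ("b", [])])
state, selected = run(tree, delta, select={"qb"})
print(selected)
```

Bottom-up determinism is what makes runs, and hence query answering, efficient: each node's state depends only on its label and its children's states.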
4.
Danilo M. Eler Marcel Y. Nakazaki Fernando V. Paulovich Davi P. Santos Gabriel F. Andery Maria Cristina F. Oliveira João Batista Neto Rosane Minghim 《The Visual computer》2009,25(10):923-937
Multidimensional Visualization techniques are invaluable tools for analysis of structured and unstructured data with variable
dimensionality. This paper introduces PEx-Image—Projection Explorer for Images—a tool aimed at supporting analysis of image collections. The tool supports a methodology that employs interactive visualizations
to aid user-driven feature detection and classification tasks, thus offering improved analysis and exploration capabilities.
The visual mappings employ similarity-based multidimensional projections and point placement to layout the data on a plane
for visual exploration. In addition to its application to image databases, we also illustrate how the proposed approach can
be successfully employed in simultaneous analysis of different data types, such as text and images, offering a common visual
representation for data expressed in different modalities.
5.
Linux malware can pose a significant threat, as Linux penetration is increasing exponentially while little is known or understood about Linux OS vulnerabilities. We believe that now is the right time to devise non-signature-based zero-day (previously unknown) malware detection strategies, before Linux intruders take us by surprise. Therefore, in this paper, we first perform a
forensic analysis of Linux executable and linkable format (ELF) files. Our forensic analysis provides insight into different
features that have the potential to discriminate malicious executables from benign ones. As a result, we select a set of 383 features extracted from ELF headers. We quantify the classification potential of the features using information
gain and then remove redundant features by employing preprocessing filters. Finally, we do an extensive evaluation among classical
rule-based machine learning classifiers (RIPPER, PART, C4.5 Rules, and the decision tree J48) and bio-inspired classifiers (cAnt-Miner, UCS, XCS, and GAssist) to select the best classifier for our system. We have evaluated our approach on an available collection of 709 Linux malware samples from VX Heavens and Offensive Computing. Our experiments show that ELF-Miner provides more than 99% detection accuracy with a false-alarm rate below 0.1%.
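The information-gain step of the feature-selection pipeline can be sketched as follows; the two binary "ELF-header features" and the tiny labeled sample are invented for illustration:

```python
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    result = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        result -= p * log2(p)
    return result

def information_gain(feature_values, labels):
    """Entropy of the labels minus expected entropy after splitting on the feature."""
    gain = entropy(labels)
    n = len(labels)
    for v in set(feature_values):
        subset = [l for f, l in zip(feature_values, labels) if f == v]
        gain -= len(subset) / n * entropy(subset)
    return gain

# Toy sample: rows are executables, label 1 = malicious.
labels            = [1, 1, 1, 0, 0, 0]
odd_section_count = [1, 1, 1, 0, 0, 0]   # perfectly discriminating
has_symbol_table  = [1, 0, 1, 0, 1, 0]   # nearly uninformative

ranked = sorted(
    [("odd_section_count", odd_section_count),
     ("has_symbol_table", has_symbol_table)],
    key=lambda kv: information_gain(kv[1], labels),
    reverse=True,
)
print([name for name, _ in ranked])
```

Redundancy-removing preprocessing filters and the classifier comparison described in the abstract would follow this ranking step.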
6.
Propositional satisfiability (SAT) is a success story in Computer Science and Artificial Intelligence: SAT solvers are currently
used to solve problems in many different application domains, including planning and formal verification. The main reason
for this success is that modern SAT solvers can successfully deal with problems having millions of variables. All these solvers
are based on the Davis–Logemann–Loveland procedure (dll). In its original version, dll is a decision procedure, but it can be very easily modified in order to return one or all assignments satisfying the input
set of clauses, assuming at least one exists. However, in many cases it is not enough to compute assignments satisfying all
the input clauses: indeed, the returned assignments must also be “optimal” in some sense, e.g., they have to satisfy as
many other constraints—expressed as preferences—as possible. In this paper we start with qualitative preferences on literals,
defined as a partially ordered set (poset) of literals. Such a poset induces a poset on total assignments and leads to the
definition of an optimal model for a formula ψ as a minimal element of the poset on the models of ψ. We show (i) how dll can be extended in order to return one or all optimal models of ψ (once converted into clauses, and assuming ψ is satisfiable), and (ii) how the same procedures can be used to compute optimal models w.r.t. a qualitative preference on formulas and/or w.r.t. a quantitative
preference on literals or formulas. We implemented our ideas and we tested the resulting system on a variety of very challenging
structured benchmarks. The results indicate that our implementation performs comparably to other state-of-the-art systems tailored to the specific problems we consider.
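A simplified sketch of optimal-model computation: here the qualitative preference is a totally ordered list of literals (a special case of the paper's poset), models are compared lexicographically by which preferred literals they satisfy, and the formula is assumed satisfiable. Enumeration stands in for the dll-based procedures:

```python
from itertools import product

def satisfies(assignment, clauses):
    """CNF check: literals are signed integers over variables 1..n."""
    return all(any(assignment[abs(l)] == (l > 0) for l in clause)
               for clause in clauses)

def models(clauses, variables):
    for bits in product([True, False], repeat=len(variables)):
        a = dict(zip(variables, bits))
        if satisfies(a, clauses):
            yield a

def optimal_models(clauses, variables, preferred):
    """Keep the models whose satisfied-preference vector is lexicographically best."""
    def key(a):
        # 0 = preferred literal satisfied (better); compared left to right.
        return tuple(0 if a[abs(l)] == (l > 0) else 1 for l in preferred)
    ms = list(models(clauses, variables))
    best = min(key(a) for a in ms)
    return [a for a in ms if key(a) == best]

# psi = (x1 or x2) and (not x1 or not x2); prefer x1 over x2.
clauses = [[1, 2], [-1, -2]]
opt = optimal_models(clauses, [1, 2], preferred=[1, 2])
print(opt)
```

The paper's poset on literals is more general than this total order, and its procedures work inside dll rather than by enumeration, but the induced "minimal element among the models" notion is the same.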
7.
Thomas Ågotnes Wiebe van der Hoek Michael Wooldridge 《Autonomous Agents and Multi-Agent Systems》2011,22(1):4-30
Agents that must reach agreements with other agents need to reason about how their preferences, judgments, and beliefs might
be aggregated with those of others by the social choice mechanisms that govern their interactions. The emerging field of judgment aggregation studies aggregation from a logical perspective, and considers how multiple sets of logical formulae can be aggregated to a single consistent set. As a special
case, judgment aggregation can be seen to subsume classical preference aggregation. We present a modal logic that is intended
to support reasoning about judgment aggregation scenarios (and hence, as a special case, about preference aggregation): the
logical language is interpreted directly in judgment aggregation rules. We present a sound and complete axiomatisation. We
show that the logic can express aggregation rules such as majority voting; rule properties such as independence; and results
such as the discursive paradox, Arrow’s theorem, and Condorcet’s paradox, all of which are derivable as formal theorems of the logic.
The logic is parameterised in such a way that it can be used as a general framework for comparing the logical properties of
different types of aggregation, including classical preference aggregation. As a case study, we present a logical analysis, including a formal proof, of the neutrality lemma, the main ingredient in a well-known proof of Arrow’s theorem.
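The discursive paradox mentioned above can be reproduced in a few lines: proposition-wise majority voting over the agenda {p, q, p ∧ q} yields a collectively inconsistent judgment set even though every individual judge is consistent. The judge profiles below are the textbook example:

```python
# Three judges, each individually consistent on {p, q, p and q}.
judges = [
    {"p": True,  "q": True,  "p_and_q": True},
    {"p": True,  "q": False, "p_and_q": False},
    {"p": False, "q": True,  "p_and_q": False},
]

def majority(judgments, issue):
    """Strict proposition-wise majority."""
    yes = sum(j[issue] for j in judgments)
    return yes > len(judgments) / 2

collective = {issue: majority(judges, issue) for issue in ("p", "q", "p_and_q")}
consistent = collective["p_and_q"] == (collective["p"] and collective["q"])
print(collective, "consistent:", consistent)
```

The majority accepts p and accepts q, yet rejects p ∧ q, which is exactly the inconsistency the logic can express and derive formally.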
8.
Marsha Chechik Shiva Nejati Mehrdad Sabetzadeh 《Innovations in Systems and Software Engineering》2012,8(1):3-18
A key problem in model-based development is integrating a collection of models into a single, larger, specification as a way
to construct a functional system, to develop a unified understanding, or to enable automated reasoning about properties of
the resulting system. In this article, we suggest that the choice of a particular model integration operator depends on the
inter-model relationships that hold between individual models. Based on this observation, we distinguish three key integration
operators studied in the literature—merge, composition and weaving—and describe each operator along with the notion of relationship that underlies it. We then focus on the merge activity and
provide a detailed look at the factors that one must consider in defining a merge operator, particularly the way in which
the relationships should be captured during merge. We illustrate these factors using two merge operators that we have developed
in our earlier work for combining models that originate from distributed teams.
9.
Recovering Surface Layout from an Image (total citations: 2; self-citations: 1; citations by others: 2)
Derek Hoiem Alexei A. Efros Martial Hebert 《International Journal of Computer Vision》2007,75(1):151-172
Humans have an amazing ability to instantly grasp the overall 3D structure of a scene—ground orientation, relative positions
of major landmarks, etc.—even from a single image. This ability is completely missing in most popular recognition algorithms,
which pretend that the world is flat and/or view it through a patch-sized peephole. Yet it seems very likely that having a
grasp of this “surface layout” of a scene should be of great assistance for many tasks, including recognition, navigation,
and novel view synthesis.
In this paper, we take the first step towards constructing the surface layout, a labeling of the image into geometric classes. Our main insight is to learn appearance-based models of these geometric classes, which coarsely describe the 3D
scene orientation of each image region. Our multiple segmentation framework provides robust spatial support, allowing a wide
variety of cues (e.g., color, texture, and perspective) to contribute to the confidence in each geometric label. In experiments
on a large set of outdoor images, we evaluate the impact of the individual cues and design choices in our algorithm. We further
demonstrate the applicability of our method to indoor images, describe potential applications, and discuss extensions to a
more complete notion of surface layout.
10.
This article deals with kernel security protection. We propose a characterization of malicious kernel-targeted actions based on the way they act to corrupt the kernel. Then, we discuss security measures able to counter such attacks. Finally, we present our hardware-virtualization-based approach, partially implemented in our demonstrator Hytux, which is inspired by bluepill (Rutkowska, “Subverting Vista Kernel for Fun and Profit,” Black Hat, Las Vegas, 2006), a malware that installs itself as a lightweight hypervisor on a hardware-virtualization-compliant CPU and puts a running Microsoft Windows operating system into a virtual machine. In contrast with bluepill, however, Hytux is a lightweight hypervisor that implements protection mechanisms in a mode more privileged than the Linux kernel.
11.
While offering many practical benefits for distributed applications, mobile agent systems pose some fundamental security challenges.
In this paper, we present a new approach to mobile agent security which helps to address some of these challenges. We present
a new technique, which we refer to as trust enhanced security, and apply it to mobile agent-based systems; this new technique
advocates a shift in security solutions from security-centric to trust-centric. This extends the traditional security mechanisms
by enabling trust decisions through explicit specification and management of security-related trust relationships. The integration
of trust decisions into the security decision-making process leads to our trust enhanced security approach. A formal trust model is proposed and incorporated into the development of a novel trust management architecture, MobileTrust, for mobile agent-based applications. We have conducted detailed practical investigations to evaluate and validate the emergent properties of the trust enhanced security technique. We present and discuss the key results in this paper.
12.
We investigate the computational power of threshold-AND circuits versus threshold-XOR circuits. In contrast to the observation that small-weight threshold-AND circuits can be simulated by small-weight threshold-XOR circuits, we present a function with small-size unbounded-weight threshold-AND circuits for which all threshold-XOR circuits have exponentially many nodes. This answers the basic question of separating subsets of the hypercube by hypersurfaces induced by sparse real polynomials. We prove our main result by a new lower bound argument for threshold circuits. Finally, we show that unbounded-weight threshold gates cannot simulate alternation: there are -functions which need exponential-size threshold-AND circuits.
Received: August 8, 1996.
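The circuit classes being compared can be sketched concretely: a threshold gate fires when a weighted sum of its Boolean inputs reaches a threshold, and a threshold-AND circuit takes the AND of several such gates. The weights and thresholds below are invented for illustration:

```python
def threshold_gate(weights, threshold, x):
    """Fires (returns 1) iff the weighted sum of the inputs reaches the threshold."""
    return int(sum(w * xi for w, xi in zip(weights, x)) >= threshold)

def threshold_and(gates, x):
    """Threshold-AND circuit: 1 iff every threshold gate fires."""
    return int(all(threshold_gate(w, t, x) for w, t in gates))

# Two invented gates over three Boolean inputs.
gates = [([2, -1, 1], 1), ([1, 1, 1], 2)]
print(threshold_and(gates, [1, 1, 0]))  # gate 1: 2-1+0 >= 1; gate 2: 2 >= 2
print(threshold_and(gates, [0, 1, 0]))  # gate 1: -1 < 1, so the AND fails
```

A threshold-XOR circuit is obtained by replacing the top AND with an XOR of the gate outputs; the abstract's separation says the two top gates are not interchangeable at small size.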
13.
14.
Alexander Helleboogh Giuseppe Vizzari Adelinde Uhrmacher Fabien Michel 《Autonomous Agents and Multi-Agent Systems》2007,14(1):87-116
Real environments in which agents operate are inherently dynamic—the environment changes beyond the agents’ control. We advocate that, for multi-agent simulation, this dynamism must be modeled explicitly as part of the simulated environment, preferably using concepts and constructs that relate to the real world. In this paper, we describe such concepts and constructs, and we provide a formal framework to unambiguously specify their relations and meaning. We apply the formal framework to model a dynamic RoboCup Soccer environment and elaborate on how the framework poses new challenges for exploring the modeling of environments in multi-agent simulation.
15.
J. H. Reif 《Algorithmica》2000,28(3):271-287
In this paper we show that if the input points to the geometric closest pair problem are given with limited precision (each coordinate is specified with O(log n) bits), then we can compute the closest pair in O(n log log n) time. We also apply our spatial decomposition technique to the k-nearest neighbor and n-body problems, achieving similar improvements.
To make use of the limited precision of the input points, we use a reasonable machine model that allows “bit shifting” in constant time. We believe that this model is realistic and provides an interesting way of beating the Ω(n log n) lower bound that holds for this problem in the more typical algebraic RAM model.
Received October 12, 1998; revised October 26, 1998.
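The role of constant-time bit shifting can be sketched with a grid decomposition: with limited-precision integer coordinates, a point's cell is obtained by shifting away low-order bits, and candidate pairs are only compared within neighboring cells. This is only a sketch: the paper's algorithm chooses the decomposition so that the closest pair is guaranteed to land in adjacent cells, whereas the resolution here is hand-picked for the toy data:

```python
from collections import defaultdict
from math import hypot, inf

def closest_pair_grid(points, shift):
    """Bucket points by cell (coordinate >> shift), then scan neighboring cells."""
    cells = defaultdict(list)
    for p in points:
        cells[(p[0] >> shift, p[1] >> shift)].append(p)
    best, pair = inf, None
    for (cx, cy), bucket in cells.items():
        # Compare within the cell and its 8 neighbors.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for q in cells.get((cx + dx, cy + dy), []):
                    for p in bucket:
                        if p is not q:
                            d = hypot(p[0] - q[0], p[1] - q[1])
                            if d < best:
                                best, pair = d, (p, q)
    return best, pair

points = [(3, 4), (100, 100), (5, 4), (200, 7)]
best, pair = closest_pair_grid(points, shift=3)
print(best, pair)
```

Computing a cell id is a single shift per coordinate, which is exactly the constant-time operation the machine model grants and the algebraic RAM model forbids.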
16.
This paper presents new approaches to the validation of loop optimizations that compilers use to obtain the highest performance
from modern architectures. Rather than verify the compiler, the approach of translation validation performs a validation check after every run of the compiler, producing a formal proof that the produced target code is a correct
implementation of the source code.
As part of an active and ongoing research project on translation validation, we have previously described approaches for validating
optimizations that preserve the loop structure of the code and have presented a simulation-based general technique for validating
such optimizations. In this paper, for more aggressive optimizations that alter the loop structure of the code—such as distribution,
fusion, tiling, and interchange—we present a set of permutation rules which establish that the transformed code satisfies all the implied data dependencies necessary for the validity of the considered
transformation. We describe the extensions to our tool voc-64 which are required to validate these structure-modifying optimizations.
This paper also discusses preliminary work on run-time validation of speculative loop optimizations. This involves using run-time tests to ensure the correctness of loop optimizations whose correctness cannot be guaranteed
at compile time. Unlike compiler validation, run-time validation must not only determine when an optimization has generated
incorrect code, but also recover from the optimization without aborting the program or producing an incorrect result. This
technique has been applied to several loop optimizations, including loop interchange and loop tiling, and appears to be quite
promising.
This research was supported in part by NSF grant CCR-0098299, ONR grant N00014-99-1-0131, and the John von Neumann Minerva
Center for Verification of Reactive Systems.
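The intent of the permutation rules can be illustrated with the standard dependence-legality test (this is an illustration of the underlying criterion, not necessarily the rule format used by voc-64): a loop permutation is valid when every dependence distance vector remains lexicographically positive after its components are permuted. The distance vectors below are invented:

```python
def lexicographically_positive(vec):
    """True if the first nonzero component is positive (or the vector is zero)."""
    for v in vec:
        if v > 0:
            return True
        if v < 0:
            return False
    return True  # the zero vector carries no cross-iteration dependence

def permutation_valid(distance_vectors, perm):
    """perm maps new loop position -> old loop index."""
    return all(
        lexicographically_positive([vec[i] for i in perm])
        for vec in distance_vectors
    )

# Dependence distances for a doubly nested loop,
# e.g. iteration (i, j) reads a value written at (i-1, j+1).
deps = [(1, -1)]
print(permutation_valid(deps, (0, 1)))  # original order: legal
print(permutation_valid(deps, (1, 0)))  # interchange: illegal
```

After interchange the distance (1, -1) becomes (-1, 1), i.e., the transformed loop would read a value before it is written, so the test rejects the permutation.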
17.
With the exponential growth of moving objects data to the Gigabyte range, it has become critical to develop effective techniques
for indexing, updating, and querying these massive data sets. To meet the high update rate as well as low query response time
requirements of moving object applications, this paper takes a novel approach in moving object indexing. In our approach,
we do not require a sophisticated index structure that needs to be adjusted for each incoming update. Rather, we construct conceptually
simple short-lived index images that we only keep for a very short period of time (sub-seconds) in main memory. As a consequence, the resulting technique
MOVIES simultaneously supports high query rates and high update rates, trading this property for query-result staleness. Moreover, MOVIES is the first main-memory method supporting time-parameterized predictive queries. To support this feature, we present two
algorithms: non-predictive MOVIES and predictive MOVIES. We obtain the surprising result that a predictive indexing approach—considered state-of-the-art in an external-memory scenario—does
not scale well in a main memory environment. In fact, our results show that MOVIES outperforms state-of-the-art moving object indexes such as a main-memory adapted Bx-tree by orders of magnitude w.r.t. update rates and query rates. In our experimental evaluation, we index the complete road
network of Germany consisting of 40,000,000 road segments and 38,000,000 nodes. We scale our workload up to 100,000,000 moving
objects, 58,000,000 updates per second and 10,000 queries per second, a scenario at a scale unmatched by any previous work.
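The short-lived index image idea can be sketched as a grid index that is thrown away and bulk-rebuilt at intervals, with O(1) updates in between. The class and method names are invented, and the real MOVIES technique is far more elaborate:

```python
from collections import defaultdict

class ShortLivedGridIndex:
    def __init__(self, cell=10):
        self.cell = cell
        self.latest = {}                    # object id -> last reported (x, y)
        self.snapshot = defaultdict(list)   # grid cell -> [(id, x, y)]

    def update(self, oid, x, y):
        # O(1): just record the position; no index maintenance at all.
        self.latest[oid] = (x, y)

    def rebuild(self):
        # Discard the old image and bulk-build a fresh, throwaway one.
        self.snapshot = defaultdict(list)
        for oid, (x, y) in self.latest.items():
            self.snapshot[(x // self.cell, y // self.cell)].append((oid, x, y))

    def range_query(self, x0, y0, x1, y1):
        # Answered entirely from the (possibly stale) snapshot.
        hits = []
        for cx in range(x0 // self.cell, x1 // self.cell + 1):
            for cy in range(y0 // self.cell, y1 // self.cell + 1):
                for oid, x, y in self.snapshot[(cx, cy)]:
                    if x0 <= x <= x1 and y0 <= y <= y1:
                        hits.append(oid)
        return hits

idx = ShortLivedGridIndex()
idx.update("car1", 5, 5)
idx.update("car2", 55, 55)
idx.rebuild()
print(idx.range_query(0, 0, 20, 20))
```

Updates never touch the snapshot, so they stay cheap; queries may miss updates that arrived since the last rebuild, which is exactly the query-result staleness the abstract trades away for throughput.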
18.
Bogdan Alexe Mauricio Hernández Lucian Popa Wang-Chiew Tan 《The VLDB Journal The International Journal on Very Large Data Bases》2012,21(2):191-211
One of the main steps toward integration or exchange of data is to design the mappings that describe the (often complex) relationships
between the source schemas or formats and the desired target schema. In this paper, we introduce a new operator, called MapMerge,
that can be used to correlate multiple, independently designed schema mappings of smaller scope into larger schema mappings.
This allows a more modular construction of complex mappings from various types of smaller mappings such as schema correspondences
produced by a schema matcher or pre-existing mappings that were designed by either a human user or via mapping tools. In particular,
the new operator also enables a new “divide-and-merge” paradigm for mapping creation, where the design is divided (on purpose)
into smaller components that are easier to create and understand and where MapMerge is used to automatically generate a meaningful
overall mapping. We describe our MapMerge algorithm and demonstrate the feasibility of our implementation on several real
and synthetic mapping scenarios. In our experiments, we make use of a novel similarity measure between two database instances
with different schemas that quantifies the preservation of data associations. We show experimentally that MapMerge improves
the quality of the schema mappings, by significantly increasing the similarity between the input source instance and the generated
target instance. Finally, we provide a new algorithm that combines MapMerge with schema mapping composition to correlate flows
of schema mappings.
19.
Md Anisur Rahman Mehedi Masud Iluju Kiringa Abdulmotaleb El Saddik 《Journal of Intelligent Information Systems》2012,38(2):533-554
The task of combining data residing at different sources to provide the user with a unified view is known as data integration. Schema mappings act as glue between the global schema and the source schemas of a data integration system. Global-and-local-as-view (GLAV) is one of the approaches for specifying the schema mappings. Tableaux are used for expressing queries and functional dependencies on a single database. We investigate a general technique for
expressing a GLAV mapping by a tabular structure called mapping assertion tableaux (MAT). In a similar way, we also express the tuple generating dependency (tgd) and equality generating dependency (egd) constraints by tabular forms, called tabular tgd (TTGD) and tabular egd (TEGD), respectively. A set consisting of the MATs, TTGDs and TEGDs are called schema mapping tableaux (SMT). We present algorithms that use SMT as operator on an instance of the source schema to produce an instance of the target
schema. We show that the target instances computed by the SMT are ‘minimal’ and ‘most general’ in nature. We also define the
notion of equivalence between the schema mappings of two data integration systems and present algorithms that optimize schema
mappings through the manipulation of the SMT.
20.
We apply genetic programming to the evolution of strategies for playing the game of backgammon. We explore two different strategies
of learning: using a fixed external opponent as teacher, and letting the individuals play against each other. We conclude
that the second approach is better and leads to excellent results: Pitted in a 1000-game tournament against a standard benchmark
player—Pubeval—our best evolved program wins 62.4% of the games, the highest result to date. Moreover, several other evolved programs attain
win percentages not far behind the champion, evidencing the repeatability of our approach.