Found 20 similar documents (search time: 663 ms)
1.
Maximum entropy models and subjective interestingness: an application to tiles in binary databases   Total citations: 3 (self: 3, others: 0)
Tijl De Bie 《Data mining and knowledge discovery》2011,23(3):407-446
Recent research has highlighted the practical benefits of subjective interestingness measures, which quantify the novelty
or unexpectedness of a pattern when contrasted with any prior information of the data miner (Silberschatz and Tuzhilin, Proceedings
of the 1st ACM SIGKDD international conference on Knowledge discovery and data mining (KDD95), 1995; Geng and Hamilton, ACM Comput Surv 38(3):9, 2006). A key challenge here is the formalization of this prior information in a way that lends itself to the definition of a subjective
interestingness measure that is both meaningful and practical. In this paper, we outline a general strategy of how this could be achieved, before working out the details for a use case that is important in its own right. Our general strategy is based on considering prior information as constraints on a probabilistic
model representing the uncertainty about the data. More specifically, we represent the prior information by the maximum entropy
(MaxEnt) distribution subject to these constraints. We briefly outline various measures that could subsequently be used to
contrast patterns with this MaxEnt model, thus quantifying their subjective interestingness. We demonstrate this strategy
for rectangular databases with knowledge of the row and column sums. This situation has been considered before using computation-intensive approaches based on swap randomizations, allowing for the computation of empirical p-values as interestingness measures (Gionis et al., ACM Trans Knowl Discov Data 1(3):14, 2007). We show how the MaxEnt model can be computed remarkably efficiently in this situation, and how it can be used for the same
purpose as swap randomizations but computationally more efficiently. More importantly, being an explicitly represented distribution,
the MaxEnt model can additionally be used to define analytically computable interestingness measures, as we demonstrate for
tiles (Geerts et al., Proceedings of the 7th international conference on Discovery science (DS04), 2004) in binary databases.
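The MaxEnt model under row- and column-sum constraints can be sketched concretely: it factorizes into independent Bernoulli cells with P(D_ij = 1) = sigmoid(lam_i + mu_j), and the multipliers can be fitted by gradient ascent on the concave dual so that the expected margins match the given ones. This is a generic illustration with hypothetical names, not De Bie's actual algorithm:

```python
import numpy as np

def maxent_tile_model(row_sums, col_sums, n_iter=2000, lr=0.2):
    """Fit the MaxEnt distribution over m x n binary matrices whose
    expected row and column sums match the given margins.

    Under MaxEnt the cells are independent Bernoulli with
    P(D_ij = 1) = sigmoid(lam_i + mu_j); lam and mu are fitted by
    gradient ascent on the (concave) dual. Names are hypothetical.
    """
    row_sums = np.asarray(row_sums, float)
    col_sums = np.asarray(col_sums, float)
    lam = np.zeros(len(row_sums))
    mu = np.zeros(len(col_sums))
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(lam[:, None] + mu[None, :])))
        lam += lr * (row_sums - p.sum(axis=1))  # match expected row sums
        mu += lr * (col_sums - p.sum(axis=0))   # match expected column sums
    return p
```

With consistent margins the fitted cell probabilities reproduce the prescribed row and column sums, after which tile probabilities (and hence interestingness scores) are analytically computable from the explicit distribution.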
2.
This paper takes up a remark in the well-known paper of Alon, Matias, and Szegedy (J. Comput. Syst. Sci. 58(1):137–147, 1999) about the computation of the frequency moments of data streams and shows in detail how any F_k with k ≥ 1 can be approximately computed using space O(k m^(1−1/k) (k + log m + log log n)) based on approximate counting. An important building block for this, which may be interesting in its own right, is a new approximate variant of reservoir sampling using space O(log log n) for constant error parameters.
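The approximate-counting primitive on which this result builds can be sketched with a Morris-style counter, which keeps only the exponent of the count and thus needs roughly log log n bits; this is a generic illustration, not the paper's exact construction:

```python
import random

def morris_count(stream_length):
    """Approximate counter storing only an exponent c (~log log n bits).

    On each item, increment c with probability 2**-c; the quantity
    2**c - 1 is then an unbiased estimate of the true count.
    """
    c = 0
    for _ in range(stream_length):
        if random.random() < 2.0 ** -c:
            c += 1
    return 2 ** c - 1
```

A single estimate has high variance, so in practice several independent counters are averaged, as in the assertion below.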
3.
Earlier approximate response time analysis (RTA) methods for tasks with offsets (transactional task model) exhibit two major
deficiencies: (i) They overestimate the calculated response times resulting in an overly pessimistic result. (ii) They suffer
from time complexity problems resulting in an RTA method that may not be applicable in practice. This paper shows how these
two problems can be alleviated in one single fast-and-tight RTA method that combines the best of both worlds: high-precision response times and fast approximate analysis.
Simulation studies, on randomly generated task sets, show that the response time improvement is significant, typically about 15%
tighter response times in 50% of the cases, resulting in about 12% higher admission probability for low priority tasks subjected
to admission control. Simulation studies also show that speedups of more than two orders of magnitude, for realistically sized
task sets, compared to earlier RTA techniques, can be obtained.
Other improvements such as Palencia Gutiérrez, González Harbour (Proceedings of the 20th IEEE real-time systems symposium
(RTSS), pp. 328–339, 1999), Redell (Technical Report TRITA-MMK 2003:4, Dept. of Machine Design, KTH, 2003) are orthogonal and complementary, which means that our method can easily be incorporated into those methods as well. Hence, we
conclude that the fast-and-tight RTA method presented is the preferred analysis technique when tight response-time estimates
are needed, and that we do not need to sacrifice precision for analysis speed; both are obtained with one single method.
4.
《Computer Methods in Applied Mechanics and Engineering》2005,194(2-5):205-228
We describe a deterministic finite element (FE) solution algorithm for a stochastic elliptic boundary value problem (sbvp), whose coefficients are assumed to be random fields with finite second moments and known, piecewise smooth two-point spatial correlation function. Separation of random and deterministic variables (parametrization of the uncertainty) is achieved via a Karhunen–Loève (KL) expansion. An O(N log N) algorithm for the computation of the KL eigenvalues is presented, based on a kernel independent fast multipole method (FMM). Truncation of the KL expansion gives an (M, 1) Wiener polynomial chaos (PC) expansion of the stochastic coefficient and is shown to lead to a high dimensional, deterministic boundary value problem (dbvp). Analyticity of its solution in the stochastic variables with sharp bounds for the domain of analyticity are used to prescribe variable stochastic polynomial degree r = (r1, …, rM) in an (M, r) Wiener PC expansion for the approximate solution. Pointwise error bounds for the FEM approximations of KL eigenpairs, the truncation of the KL expansion and the FE solution to the dbvp are given. Numerical examples show that M depends on the spatial correlation length of the random diffusion coefficient. The variable polynomial degree r in PC-stochastic Galerkin FEM makes it possible to handle KL expansions with M up to 30 and r1 up to 10 in moderate time.
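A minimal sketch of the KL parametrization step, using a dense eigendecomposition rather than the paper's O(N log N) FMM-based algorithm; the exponential kernel, the unit-interval grid, and all names are assumptions for illustration:

```python
import numpy as np

def kl_eigenpairs(n=200, corr_len=0.2, n_terms=5):
    """Leading KL eigenpairs of a random field on [0, 1].

    Discretizes the two-point correlation kernel
    C(x, y) = exp(-|x - y| / corr_len) on a uniform grid and computes
    eigenpairs of the covariance operator with a dense solver
    (O(n^3); the paper achieves O(N log N) via an FMM).
    """
    x = np.linspace(0.0, 1.0, n)
    C = np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)
    h = 1.0 / (n - 1)  # quadrature weight of the uniform grid
    vals, vecs = np.linalg.eigh(C * h)
    order = np.argsort(vals)[::-1]  # largest eigenvalues first
    return vals[order][:n_terms], vecs[:, order[:n_terms]]
```

Truncating after the first few modes gives the finite-dimensional stochastic coordinates on which the deterministic bvp and the PC expansion are built; the eigenvalue decay rate is governed by the correlation length.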
5.
Harry Buhrman, Benjamin Hescott, Steven Homer, Leen Torenvliet 《Theory of Computing Systems》2010,47(2):317-341
We study properties of non-uniform reductions and related completeness notions. We strengthen several results of Hitchcock
and Pavan (ICALP (1), Lecture Notes in Computer Science, vol. 4051, pp. 465–476, Springer, 2006) and give a trade-off between the amount of advice needed for a reduction and its honesty on NEXP. We construct an oracle
relative to which this trade-off is optimal. We show, in a more systematic study of non-uniform reductions, among other things
that non-uniformity can be removed at the cost of more queries. In line with Post’s program for complexity theory (Buhrman
and Torenvliet in Bulletin of the EATCS 85, pp. 41–51, 2005) we connect such ‘uniformization’ properties to the separation of complexity classes.
6.
Susanne Albers 《Algorithmica》2010,58(2):461-477
We study web caching with request reordering. The goal is to maintain a cache of web documents so that a sequence of requests
can be served at low cost. To improve cache hit rates, a limited reordering of requests is allowed. Feder et al. (Proceedings
of the 13th ACM–SIAM Symposium on Discrete Algorithms, pp. 104–105, 2002), who recently introduced this problem, considered caches of size 1, i.e. a cache can store one document. They presented
an offline algorithm based on dynamic programming as well as online algorithms that achieve constant factor competitive ratios.
For arbitrary cache sizes, Feder et al. (Theor. Comput. Sci. 324:201–218, 2004) gave online strategies that have nearly optimal competitive ratios in several cost models.
7.
In a recent paper Boykov et al. (LNCS, Vol. 3953, pp. 409–422, 2006) propose an approach for computing curve and surface evolution using a variational approach and the geo-cuts method of Boykov and Kolmogorov (International conference on computer vision, pp. 26–33, 2003). We recall in this paper how this is related to well-known approaches for mean curvature motion, introduced by Almgren et al.
(SIAM Journal on Control and Optimization 31(2):387–438, 1993) and Luckhaus and Sturzenhecker (Calculus of Variations and Partial Differential Equations 3(2):253–271, 1995), and show how the corresponding problems can be solved with sub-pixel accuracy using Parametric Maximum Flow techniques. This provides interesting algorithms for computing crystalline curvature motion,
possibly with a forcing term.
A. Chambolle’s research supported by ANR project “MICA”, grant ANR-08-BLAN-0082.
J. Darbon’s research supported by ONR grant N000140710810.
8.
Michell-like 2D layouts generated by genetic ESO   Total citations: 1 (self: 1, others: 0)
The theory of least-weight trusses for one load condition and a stress constraint was established by Michell (Philos Mag 8:589–597,
1904). His work was largely ignored for about 50 years, but from then on a great deal of research effort has been devoted to constructing optimal topologies satisfying Michell’s optimality criteria. However, the exact analytical Michell layout is still not known for
most boundary conditions. It is therefore useful to develop numerical methods for generating approximate Michell-like topologies.
We show in this paper that genetic ESO (GESO) methods are suitable for constructing Michell-like layouts. To illustrate the
validity of GESO, seven different load cases for a plane structure with two point supports are considered. The Michell-like
layouts are generated for vertical or horizontal loads applied at different heights and different angles. The results are
compared and classified and a new Michell truss is proposed on the basis of the GESO results. Plane structures under two point loads are also considered, and the GESO results are compared with the solutions of Melchers (Struct Multidisc Optim 29:85–92, 2005).
9.
A popular approach in combinatorial optimization is to model problems as integer linear programs. Ideally, the relaxed linear
program would have only integer solutions, which happens for instance when the constraint matrix is totally unimodular. Still,
sometimes it is possible to build an integer solution with the same cost from the fractional solution. Examples are two scheduling
problems (Baptiste and Schieber, J. Sched. 6(4):395–404, 2003; Brucker and Kravchenko, J. Sched. 11(4):229–237, 2008) and the single disk prefetching/caching problem (Albers et al., J. ACM 47:969–986, 2000). We show that problems such as the three previously mentioned can be separated into two subproblems: (1) finding an optimal
feasible set of slots, and (2) assigning the jobs or pages to the slots. It is straightforward to show that the latter can
be solved greedily. We are able to solve the former with a totally unimodular linear program, from which we obtain simple
combinatorial algorithms with improved worst case running time.
10.
Nano fabrication with biomolecular/DNA self assembly is a promising area of research. Building nano structures with self assembly
is both efficient and inexpensive. Soloveichik and Winfree (SIAM J Comput 36(6):1544–1569, 2007) formalized a two dimensional (2D) tile assembly model based on Wang’s tiling technique. Algorithms with an optimal tile
complexity of Θ(log(N)/log(log(N))) were proposed earlier to uniquely self assemble an N × N square (with a temperature of α = 2) on this model. However, efficient constructions to assemble arbitrary shapes are not known and have remained open. In this paper we present self assembling algorithms to assemble a triangle of base 2N − 1 (units) and height N with a tile complexity of Θ(log(N)). We also describe how this framework can be used to construct other shapes such as rhombi and hexagons.
11.
Luca Dedè 《Journal of scientific computing》2012,50(2):287-305
We propose a Reduced Basis method for the solution of parametrized optimal control problems with control constraints for which
we extend the method proposed in Dedè, L. (SIAM J. Sci. Comput. 32:997, 2010) for the unconstrained problem. The case of a linear-quadratic optimal control problem is considered with the primal equation
represented by a linear parabolic partial differential equation. The standard offline–online decomposition of the Reduced
Basis method is employed with the Finite Element approximation as the “truth” one for the offline step. An error estimate
is derived and a heuristic indicator is proposed to evaluate the Reduced Basis error on the optimal control problem at the
online step; also, the indicator is used at the offline step in a Greedy algorithm to build the Reduced Basis space. We solve
numerical tests in the two-dimensional case with applications to heat conduction and environmental optimal control problems.
12.
Unique Fixpoint Induction (UFI) is the chief inference rule to prove the equivalence of recursive processes in the Calculus
of Communicating Systems (CCS) (Milner 1989). It plays a major role in the equational approach to verification. Equational verification is of special interest as it
offers theoretical advantages in the analysis of systems that communicate values, have infinite state space or show parameterised
behaviour. We call these kinds of systems VIPSs, an acronym for Value-passing, Infinite-State and Parameterised Systems. Automating the application of UFI in the
context of VIPSs has been neglected. This is both because many VIPSs are given in terms of recursive function symbols, making
it necessary to carefully apply induction rules other than UFI, and because proving that one VIPS process constitutes a fixpoint
of another involves computing a process substitution, mapping states of one process to states of the other, that often is
not obvious. Hence, VIPS verification is usually turned into equation solving (Lin 1995a). Existing tools for this proof task, such as VPAM (Lin 1993), are highly interactive. We introduce a method that automates the use of UFI. The method uses middle-out reasoning (Bundy
et al. 1990a) and so is able to apply the rule even without elaborating the details of the application. The method introduces meta-variables to represent those parts of the processes’ state space that are not known at application time, thereby changing from equation verification to equation solving. Adding this method to the equation plan developed by Monroy et al. (Autom Softw Eng 7(3):263–304,
2000a), we have implemented an automatic verification planner. This planner increases the number of verification problems that
can be dealt with fully automatically, thus improving upon the current degree of automation in the field.
Partially supported by grants CONACyT-47557-Y and ITESM CCEM-0302-05.
Partially supported by EPSRC GR/L/11724.
13.
Jake Porway, Qiongchen Wang, Song Chun Zhu 《International Journal of Computer Vision》2010,88(2):254-283
In this paper we present a hierarchical and contextual model for aerial image understanding. Our model organizes objects (cars,
roofs, roads, trees, parking lots) in aerial scenes into hierarchical groups whose appearances and configurations are determined
by statistical constraints (e.g. relative position, relative scale, etc.). Our hierarchy is a non-recursive grammar for objects
in aerial images comprised of layers of nodes that can each decompose into a number of different configurations. This allows
us to generate and recognize a vast number of scenes with relatively few rules. We present a minimax entropy framework for
learning the statistical constraints between objects and show that this learned context allows us to rule out unlikely scene
configurations and hallucinate undetected objects during inference. A similar algorithm was proposed for texture synthesis
(Zhu et al. in Int. J. Comput. Vis. 2:107–126, 1998) but didn’t incorporate hierarchical information. We use a range of different bottom-up detectors (AdaBoost, TextonBoost,
Compositional Boosting (Freund and Schapire in J. Comput. Syst. Sci. 55, 1997; Shotton et al. in Proceedings of the European Conference on Computer Vision, pp. 1–15, 2006; Wu et al. in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8, 2007)) to propose locations of objects in new aerial images and employ a cluster sampling algorithm (C4 (Porway and Zhu, 2009)) to choose the subset of detections that best explains the image according to our learned prior model. The C4 algorithm
can quickly and efficiently switch between alternate competing sub-solutions, for example whether an image patch is better
explained by a parking lot with cars or by a building with vents. We also show that our model can predict the locations of
objects our detectors missed. We conclude by presenting parsed aerial images and experimental results showing that our cluster
sampling and top-down prediction algorithms use the learned contextual cues from our model to improve detection results over
traditional bottom-up detectors alone.
14.
We study the on-line minimum weighted bipartite matching problem in arbitrary metric spaces. Here, n not necessarily disjoint points of a metric space M are given, and are to be matched on-line with n points of M revealed one by one. The cost of a matching is the sum of the distances of the matched points, and the goal is to find or
approximate its minimum. The competitive ratio of the deterministic problem is known to be Θ(n), see (Kalyanasundaram, B., Pruhs, K. in J. Algorithms 14(3):478–488, 1993) and (Khuller, S., et al. in Theor. Comput. Sci. 127(2):255–267, 1994). It was conjectured in (Kalyanasundaram, B., Pruhs, K. in Lecture Notes in Computer Science, vol. 1442, pp. 268–280, 1998) that a randomized algorithm may perform better against an oblivious adversary, namely with an expected competitive ratio
Θ(log n). We prove a slightly weaker result by showing an o(log^3 n) upper bound on the expected competitive ratio. As an application the same upper bound holds for the notoriously hard fire
station problem, where M is the real line, see (Fuchs, B., et al. in Electronic Notes in Discrete Mathematics, vol. 13, 2003) and (Koutsoupias, E., Nanavati, A. in Lecture Notes in Computer Science, vol. 2909, pp. 179–191, 2004).
The authors were partially supported by OTKA grants T034475 and T049398.
15.
The weighted essentially non-oscillatory (WENO) methods are a popular high-order spatial discretization for hyperbolic partial
differential equations. Recently Henrick et al. (J. Comput. Phys. 207:542–567, 2005) noted that the fifth-order WENO method by Jiang and Shu (J. Comput. Phys. 126:202–228, 1996) is only third-order accurate near critical points of the smooth regions in general. Using a simple mapping function to the
original weights in Jiang and Shu (J. Comput. Phys. 126:202–228, 1996), Henrick et al. developed a mapped WENO method to achieve the optimal order of accuracy near critical points. In this paper
we study the mapped WENO scheme and find that, when it is used for solving the problems with discontinuities, the mapping
function in Henrick et al. (J. Comput. Phys. 207:542–567, 2005) may amplify the effect from the non-smooth stencils and thus cause a potential loss of accuracy near discontinuities. This
effect may be difficult to observe for the fifth-order WENO method unless a long-time simulation is desired. However,
if the mapping function is applied to seventh-order WENO methods (Balsara and Shu in J. Comput. Phys. 160:405–452, 2000), the error can increase much faster so that it can be observed with a moderate output time. In this paper a new mapping
function is proposed to overcome this potential loss of accuracy.
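For illustration, the mapping of Henrick et al. fixes the ideal weight d (g(d) = d with vanishing first and second derivatives there), which is what restores the design order of accuracy near critical points; a direct transcription of that mapping, with a function name of our choosing:

```python
def henrick_map(w, d):
    """Mapping g_k from Henrick et al. (2005) for mapped WENO:
    g(0) = 0, g(1) = 1, g(d) = d, and g'(d) = g''(d) = 0, so weights
    already near the ideal value d are pushed toward it."""
    return w * (d + d * d - 3.0 * d * w + w * w) / (d * d + w * (1.0 - 2.0 * d))
```

Applying this map to the classical Jiang–Shu weights (then renormalizing) gives the mapped WENO scheme whose behaviour near discontinuities the abstract analyzes.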
16.
The computation of strongly connected components (SCCs) in discrete-state models is a critical step in formal verification
of LTL and fair CTL properties, but the potentially huge number of reachable states and SCCs constitutes a formidable challenge.
We consider the problem of computing the set of states in SCCs or terminal SCCs in an asynchronous system. We employ the idea
of saturation, which has shown clear advantages in symbolic state-space exploration (Ciardo et al. in Softw Tools Technol Transf 8(1):4–25,
2006; Zhao and Ciardo in Proceedings of 7th international symposium on automated technology for verification and analysis, pp
368–381, 2009), to improve two previously proposed approaches. We use saturation to speed up state exploration when computing each SCC
in the Xie-Beerel algorithm, and we compute the transitive closure of the transition relation using a novel algorithm based on saturation. Furthermore, we show that the techniques we developed
are also applicable to the computation of fair cycles. Experimental results indicate that the improved algorithms using saturation achieve a substantial speedup over previous
BFS algorithms. In particular, with the new transitive closure computation algorithm, up to 10^150 SCCs can be explored within a few seconds.
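The Xie-Beerel step mentioned above computes the SCC of a seed state as the intersection of its forward- and backward-reachable sets; a plain explicit-state sketch of that idea (without the symbolic saturation speedups that are the abstract's contribution):

```python
def scc_of(v, succ, pred):
    """SCC containing state v, Xie-Beerel style: intersect the states
    reachable from v (via succ) with the states that can reach v
    (via pred). succ/pred map each state to a list of neighbors."""
    def reach(start, adj):
        seen = {start}
        frontier = [start]
        while frontier:
            nxt = []
            for s in frontier:
                for u in adj[s]:
                    if u not in seen:
                        seen.add(u)
                        nxt.append(u)
            frontier = nxt
        return seen
    return reach(v, succ) & reach(v, pred)
```

In the symbolic setting both reachability sweeps operate on decision diagrams, and saturation replaces the breadth-first frontier expansion shown here.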
17.
In recent macro models with staggered price and wage settings, the presence of variables such as relative price and wage dispersion
is prevalent, which is a source of bifurcations. In this paper, we illustrate how to detect the existence of a bifurcation
in stylized macroeconomic models with Calvo (J Monet Econ 12(3):383–398, 1983) pricing. Following the general approach of Judd (Numerical methods in economics, 1998), we employ l’Hospital’s rule to characterize the first-order dynamics of relative price distortion in terms of its higher-order
derivatives. We also show that, as in the usual practice in the literature, the bifurcation can be eliminated through renormalization
of model variables. Furthermore, we demonstrate that the second-order approximate solutions under this renormalization and
under bifurcations can differ significantly.
18.
We study the complexity of the propositional minimal inference problem. Although the complexity of this problem has already been extensively studied because of its fundamental importance in nonmonotonic logics and commonsense reasoning, no complete
classification of its complexity was found. We classify the complexity of four different and well-studied formalizations of
the problem in the version with unbounded queries, proving that the complexity of the minimal inference problem for each of
them has a trichotomy (between P, coNP-complete, and Π2^p-complete). One of these results finally settles, with a positive answer, the trichotomy conjecture of Kirousis and Kolaitis
(Theory Comput. Syst. 37(6):659–715, 2004). In the process we also strengthen and give a much simplified proof of the main result from Durand and Hermann (Proceedings
20th Symposium on Theoretical Aspects of Computer Science (STACS 2003), pp. 451–462, 2003).
19.
Preventing micro-channels from clogging is a major issue in most micro and nanofluidic systems (Gravesen et al., J Micromech
Microeng 3(4):168–182, 1993; Jensen et al., In: Proc. of MicroTAS 2002, Nara, Japan, pp 733–735, 2002; Wong et al., J Fluid Mech 292:71–94, 1995). The T-shaped channel first reported by Kohnle et al. (In: IEEE MEMS, the 15th international IEEE micro electro mechanical
conference (ed), Las Vegas, pp 77–80, 2002) prevents micro-channels from clogging by the aid of the equilibrium bubble position in such a geometry. This work is concerned
with the static and dynamic behaviour of bubbles in such T-shaped micro-channels. The aspect ratio of a rectangle enclosing
the T-shaped channel and the contact angle of the walls are the main parameters influencing the static and dynamic bubble
behaviour. This article investigates how these parameters relate to the equilibrium bubble shape and how optimum bubble velocities can be achieved inside the channel. An analytical model depending on the contact angle and the channel geometry is presented that allows one to determine the bubble configuration inside the channel by minimizing the bubble’s surface energy.
A second model is derived to predict the velocity of gas bubbles driven by buoyancy in vertical T-shaped channels. The model
is applied to design T-shaped channels with a maximum mobility of gas bubbles. Experiments with MEMS fabricated devices and
CFD simulations are used to verify the models. Furthermore, design rules are given for an optimum non-clogging channel geometry which provides the highest gas bubble mobility.
20.
When an image is viewed at varying resolutions, it is known to create discrete perceptual jumps or transitions amid the continuous
intensity changes. In this paper, we study a perceptual scale-space theory which differs from the traditional image scale-space theory in two aspects. (i) In representation, the perceptual scale-space adopts a full generative model. From a Gaussian
pyramid it computes a sketch pyramid where each layer is a primal sketch representation (Guo et al. in Comput. Vis. Image Underst. 106(1):5–19, 2007)—an attribute graph whose elements are image primitives for the image structures. Each primal sketch graph generates the
image in the Gaussian pyramid, and the changes between the primal sketch graphs in adjacent layers are represented by a set
of basic and composite graph operators to account for the perceptual transitions. (ii) In computation, the sketch pyramid and graph operators are inferred, as hidden
variables, from the images through Bayesian inference by stochastic algorithm, in contrast to the deterministic transforms
or feature extraction, such as computing zero-crossings, extremal points, and inflection points in the image scale-space.
Studying the perceptual transitions under the Bayesian framework makes it convenient to use the statistical modeling and learning
tools for (a) modeling the Gestalt properties of the sketch graph, such as continuity and parallelism; (b) learning the
most frequent graph operators, i.e. perceptual transitions, in image scaling; and (c) learning the prior probabilities of
the graph operators conditioning on their local neighboring sketch graph structures. In experiments, we learn the parameters
and decision thresholds through human experiments, and we show that the sketch pyramid is a more parsimonious representation
than a multi-resolution Gaussian/Wavelet pyramid. We also demonstrate an application on adaptive image display—showing a large
image in a small screen (say PDA) through a selective tour of its image pyramid. In this application, the sketch pyramid provides
a means for calculating information gain in zooming-in different areas of an image by counting a number of operators expanding
the primal sketches, such that the maximum information is displayed in a given number of frames.
A short version was published in ICCV05 (Wang et al. 2005).
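The Gaussian pyramid underlying the sketch pyramid in entry 20 can be sketched generically: repeatedly blur with a small binomial kernel and downsample by two. This is a standard construction for illustration only; the paper's primal-sketch computation on top of it is far more involved:

```python
import numpy as np

def gaussian_pyramid(img, levels=4):
    """Build a Gaussian pyramid: at each level, blur with a separable
    [1, 4, 6, 4, 1]/16 binomial kernel (a Gaussian approximation)
    and keep every second row and column."""
    k = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    pyr = [np.asarray(img, float)]
    for _ in range(levels - 1):
        a = pyr[-1]
        # separable blur: filter rows, then columns
        b = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, a)
        b = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, b)
        pyr.append(b[::2, ::2])  # downsample by 2 in each dimension
    return pyr
```

Each layer of the sketch pyramid is then a primal-sketch graph inferred from the corresponding level of this intensity pyramid.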