共查询到20条相似文献,搜索用时 15 毫秒
1.
Seung Woo Son Konrad Malkowski Guilin Chen Mahmut Kandemir Padma Raghavan 《The Journal of supercomputing》2007,41(3):179-213
Reducing power consumption is quickly becoming a first-class optimization metric for many high-performance parallel computing
platforms. One of the techniques employed by many prior proposals along this direction is voltage scaling and past research
used it on different components such as networks, CPUs, and memories. In contrast to most of the existent efforts on voltage
scaling that target a single component (CPU, network or memory components), this paper proposes and experimentally evaluates
a voltage/frequency scaling algorithm that considers CPU and communication links in a mesh network at the same time. More
specifically, it scales voltages/frequencies of CPUs in the nodes and the communication links among them in a coordinated
fashion (instead of one after another) such that energy savings are maximized without impacting execution time. Our experiments
with several tree-based sparse matrix computations reveal that the proposed integrated voltage scaling approach is very effective
in practice and brings 13% and 17% energy savings over the pure CPU and pure communication link voltage scaling schemes, respectively.
The results also show that our savings are consistent with the different network sizes and different sets of voltage/frequency
levels.
相似文献
Padma RaghavanEmail: |
2.
Existing estimation approaches for spatial databases often rely on the assumption that data distribution in a small region
is uniform, which seldom holds in practice. Moreover, their applicability is limited to specific estimation tasks under certain
distance metric. This paper develops the Power-method, a comprehensive technique applicable to a wide range of query optimization problems under both L∞ and L2 metrics. The Power-method eliminates the local uniformity assumption and is, therefore, accurate even for datasets where
existing approaches fail. Furthermore, it performs estimation by evaluating only one simple formula with minimal computational
overhead. Extensive experiments confirm that the Power-method outperforms previous techniques in terms of accuracy and applicability
to various optimization scenarios.
相似文献
Yufei TaoEmail: |
3.
We study two topological properties of the 3-ary n-cube Q
n
3. Given two arbitrary distinct nodes x and y in Q
n
3, we prove that there exists an x–y path of every length ranging from d(x,y) to 3
n
−1, where d(x,y) is the length of a shortest path between x and y. Based on this result, we prove that Q
n
3 is edge-pancyclic by showing that every edge in Q
n
3 lies on a cycle of every length ranging from 3 to 3
n
.
相似文献
Hui-Ling HuangEmail: |
4.
This paper presents the design and implementation of an efficient reconfigurable parallel prefix computation hardware on field-programmable
gate arrays (FPGAs). The design is based on a pipelined dataflow algorithm, and control logic is added to reconfigure the
system for arbitrary parallelism degree. The system receives multiple input streams of elements in parallel and produces output
streams in parallel. It has an advantage of controlling the degree of parallelism explicitly at run time. The time complexity
of the design is O(d+(N−d)/d), where d and N are parallelism degree and stream size, respectively. When the stream size is sufficiently larger than the initial trigger
time of the pipeline (d), the time complexity becomes O(N/d). Unlike the prefix computation circuits found in the literature, the design is scalable for different problem sizes including
unknown sized data. The design is modular based on a finite state machine, and implemented and tested for target FPGA devices
Xilinx Spartan2S XC2S300EFT256-6Q and XC2S600EFG676-6.
相似文献
H. K. DaiEmail: |
5.
Low power has played an increasingly important role for embedded systems. To save power, lowering voltage and frequency is
very straightforward and effective; therefore, dynamic voltage scaling (DVS) has become a prevalent low-power technique. However,
DVS makes no effect on power saving when the voltage reaches a lower bound. Fortunately, a technique called dynamic pipeline
scaling (DPS) can overcome this limitation by switching pipeline modes at low-voltage level. Approaches proposed in previous
work on DPS were based on hardware support. From viewpoint of compiler, little has been addressed on this issue. This paper
presents a DPS optimization technique at compiler time to reduce power dissipation. The useful information of an application
is exploited to devise an analytical model to assess the cost of enabling DPS mechanism. As a consequence, we can determine
the switching timing between pipeline modes at compiler time without causing significant run-time overhead. The experimental
result shows that our approach is effective in reducing energy consumption.
相似文献
Rong-Guey Chang (Corresponding author)Email: |
6.
Robert W. Numrich 《The Journal of supercomputing》2008,43(3):281-298
We define a normed metric space for computer programs and derive from it the Principle of Computational Least Action. In our
model, programs follow trajectories determined by Newton’s equation of motion in an abstract computational phase space and
generate computational action as they evolve. A program’s action norm is the L
1-norm of its action function, and its distance from other programs is the distance derived from the action norm. The Principle
of Computational Least Action states the goal of performance optimization as finding the program with the smallest action
norm. We illustrate this principle by analyzing a simple program.
相似文献
Robert W. NumrichEmail: |
7.
Recognising Algebraic Surfaces from Two Outlines 总被引:1,自引:1,他引:0
Simon Collings Ryszard Kozera Lyle Noakes 《Journal of Mathematical Imaging and Vision》2008,30(2):181-193
Photographic outlines of 3 dimensional solids are robust and rich in information useful for surface reconstruction. This paper
studies algebraic surfaces viewed from 2 cameras with known intrinsic and extrinsic parameters. It has been known for some
time that for a degree d=2 (quadric) algebraic surface there is a 1-parameter family of surfaces that reproduce the outlines. When the algebraic surface
has degree d>2, we prove a new result: that with known camera geometry it is possible to completely reconstruct an algebraic surface from
2 outlines i.e. the coefficients of its defining polynomial can be determined in a known coordinate frame. The proof exploits
the existence of frontier points, which are calculable from the outlines. Examples and experiments are presented to demonstrate
the theory and possible applications.
相似文献
Simon CollingsEmail: |
8.
The computation of the Discrete Fourier Transform for a general lattice in ℝ
d
can be reduced to the computation of the standard 1-dimensional Discrete Fourier Transform. We provide a mathematically rigorous
but simple treatment of this procedure and apply it to the DFT on the hexagonal lattice.
相似文献
Xiqiang ZhengEmail: |
9.
Subspace sums for extracting non-random data from massive noise 总被引:1,自引:1,他引:0
Anne M. Denton 《Knowledge and Information Systems》2009,20(1):35-62
An algorithm is introduced that distinguishes relevant data points from randomly distributed noise. The algorithm is related
to subspace clustering based on axis-parallel projections, but considers membership in any projected cluster of a given side
length, as opposed to a particular cluster. An aggregate measure is introduced that is based on the total number of points
that are close to the given point in all possible 2
d
projections of a d-dimensional hypercube. No explicit summation over subspaces is required for evaluating this measure. Attribute values are
normalized based on rank order to avoid making assumptions on the distribution of random data. Effectiveness of the algorithm
is demonstrated through comparison with conventional outlier detection on a real microarray data set as well as on time series
subsequence data.
相似文献
Anne M. DentonEmail: |
10.
Nicolai Vorobjov 《Journal of Symbolic Computation》1999,27(6):2346
The paper describes several algorithms related to a problem of computing the local dimension of a semialgebraic set. Let a semialgebraic set V be defined by a system of k inequalities of the formf ≥ 0 with f R [ X1, ,Xn ], deg(f) < d , andx V . An algorithm is constructed for computing the dimension of the Zariski tangent space to V at x in time (kd)O(n). Let x belong to a stratum of codimension lxin V with respect to a smooth stratification ofV . Another algorithm computes the local dimension dimx(V) with the complexity (k(lx + 1)d)O(lx2n). Ifl = maxx Vlx, and for every connected component the local dimension is the same at each point, then the algorithm computes the dimension of every connected component with complexity (k(l + 1)d)O(l2n). If V is a real algebraic variety defined by a system of equations, then the complexity of the algorithm is less thankdO(l2n) , and the algorithm also finds the dimension of the tangent space to V at x in time kdO(n). Whenl is fixed, like in the case of a smooth V , the complexity bounds for computing the local dimension are (kd)O(n)andkdO(n) respectively. A third algorithm finds the singular locus ofV in time (kd)O(n2). 相似文献
11.
Region merging methods consist of improving an initial segmentation by merging some pairs of neighboring regions. In a graph,
merging two regions, separated by a set of vertices, is not straightforward. The perfect fusion graphs defined in J. Cousty
et al. (J. Math. Imaging Vis. 30:(1):87–104, 2008) verify all the basic properties required by region merging algorithms as used in image segmentation. Unfortunately, the
graphs which are the most frequently used in image analysis (namely, those induced by the direct and the indirect adjacency
relations) are not perfect fusion graphs. The perfect fusion grid, introduced in the above mentioned reference, is an adjacency
relation on ℤ
d
which can be used in image analysis, which indeed induces perfect fusion graphs and which is “between” the graphs induced
by the direct and the indirect adjacencies. One of the main results of this paper is that the perfect fusion grid is the only
such graph whatever the dimension d.
相似文献
Gilles BertrandEmail: |
12.
13.
With the rising adoption of web services, effective management of web services becomes a critical issue in making the paradigm
of service-oriented computing more practical. In this paper, a novel structure, called Vector-based service Lattice ( VsLattice), is devised to index web services in a semantic way. Each web service is modeled as a group of Service Operation Vectors (SOVs) in the vector space, and each SOV represents an operation provided by the service. The web services, SOVs and the relationship between web services and SOVs form the Conceptual Indexing Context (CIC) of a given service collection. In the CIC, web services that provide similar operations (functions) are conceptually indexed by the same Operation Vector Concepts (OVCs). The underlying relationships among the OVCs are captured with the VsLattice, which is constructed by adopting the traditional concept lattice in a CIC. By taking advantage of the information obtained from the VsLattice, a new representation of SOV is devised. Based on this representation, a novel service retrieval model and the implemental system are developed to retrieve
web services efficiently. The performance and retrieving quality of the proposed approach has been evaluated through a series
of experiments.
相似文献
Aoying Zhou (Corresponding author)Email: |
14.
An electrokinetic mixer driven by oscillatory cross flow has been studied numerically as a means for generating chaotic mixing
in microfluidic devices for both confined and throughput mixing configurations. The flow is analyzed using numerical simulation
of the unsteady Navier–Stokes equations combined with the tracking of single and multi-species passive tracer particles. First,
the case of confined flow mixing is studied in which flow in the perpendicular channels of the oscillatory mixing element
is driven sinusoidally, and 90° out of phase. The flow is shown to be chaotic by means of positive effective (finite time)
Lyapunov exponents, and the stretching and folding of material lines leading to Lagrangian tracer particle dispersion. The
transition to chaotic flow in this case depends strongly on the Strouhal number (St), and weakly on the ratio of the cross flow channel length to width (L/W). For L/W = 2, the flow becomes appreciably chaotic as evidenced by visual particle dispersion at approximately St = 0.32, and the transitional value of St increases slightly with increasing aspect ratio. A peak degree of mixing on the order of 85% is obtained for the range of
parameter values explored here. In the second phase of the analysis, the effect of combining a fixed throughput flow with
the oscillatory cross channel motion for use in a continuous mixing operation is examined in a star cell geometry. Chaotic
mixing is again observed, and the characteristics of the downstream dispersion patterns depend mainly on the Strouhal number
and the (dimensionless) throughput rate. In the star cell, the flow becomes appreciably chaotic as evidenced by visual particle
dispersion at approximately St = 1, slightly higher than for the case of cross cell. The star cell mixing behavior is marked by the convergence of the degree
of mixing to a plateau level as the Strouhal number is increased at fixed flow rate. Degree of mixing values from 70 to 80%
are obtained indicating that the continuous flow is bounded by the maximum degree of mixing obtained from the confined flow
configuration.
相似文献
Jai A. PathakEmail: |
15.
Jian-Feng Cai Raymond H. Chan Carmine Di Fiore 《Journal of Mathematical Imaging and Vision》2007,29(1):79-91
Recently, a powerful two-phase method for restoring images corrupted with high level impulse noise has been developed. The
main drawback of the method is the computational efficiency of the second phase which requires the minimization of a non-smooth
objective functional. However, it was pointed out in (Chan et al. in Proc. ICIP 2005, pp. 125–128) that the non-smooth data-fitting
term in the functional can be deleted since the restoration in the second phase is applied to noisy pixels only. In this paper,
we study the analytic properties of the resulting new functional ℱ. We show that ℱ, which is defined in terms of edge-preserving
potential functions φ
α
, inherits many nice properties from φ
α
, including the first and second order Lipschitz continuity, strong convexity, and positive definiteness of its Hessian. Moreover,
we use these results to establish the convergence of optimization methods applied to ℱ. In particular, we prove the global
convergence of some conjugate gradient-type methods and of a recently proposed low complexity quasi-Newton algorithm. Numerical
experiments are given to illustrate the convergence and efficiency of the two methods.
相似文献
Carmine Di FioreEmail: |
16.
Lorina Dascal Adi Ditkowski Nir A. Sochen 《Journal of Mathematical Imaging and Vision》2007,29(1):63-77
We analyze the discrete maximum principle for the Beltrami color flow. The Beltrami flow can display linear as well as nonlinear
behavior according to the values of a parameter β, which represents the ratio between spatial and color distances. In general, the standard schemes fail to satisfy the discrete
maximum principle. In this work we show that a nonnegative second order difference scheme can be built for this flow only
for small β, i.e. linear-like diffusion. Since this limitation is too severe, we construct a novel finite difference scheme, which is
not nonnegative and satisfies the discrete maximum principle for all values of β. Numerical results support the analysis.
相似文献
Nir A. Sochen (Corresponding author)Email: |
17.
On the partial terminal Steiner tree problem 总被引:1,自引:0,他引:1
We investigate a practical variant of the well-known graph Steiner tree problem. For a complete graph G
=
(
V,E
) with length function l:E
→
R
+
and two vertex subsets R
⊂
V and R
′
⊆
R, a partial terminal Steiner tree is a Steiner tree which contains all vertices in R such that all vertices in R
∖
R
′ belong to the leaves of this Steiner tree. The partial terminal Steiner tree problem is to find a partial terminal Steiner tree T whose total lengths ∑
(u,v)
∈T
l
(
u,v
) is minimum. In this paper, we show that the problem is both NP-complete and MAX SNP-hard when the lengths of edges are restricted
to either 1 or 2. We also provide an approximation algorithm for the problem.
相似文献
Sun-Yuan HsiehEmail: |
18.
J. D. Parisse J. Giordano P. Perrier Y. Burtschell I. A. Graur 《Microfluidics and nanofluidics》2009,6(5):699-709
The shock wave propagation in the micro channel of the different sizes is studied numerically in order to estimate the possibility
of the experimental apparatus development. The full compressible Navier–Stokes equations are used for the numerical simulation.
The shock wave velocity attenuation is found for the channel height smaller than H = 200 μm. The influence of the channel size and of the diaphragm pressure ratio on the shock wave velocity is considered.
The considerable influence of the viscous effects on the shock propagation is shown.
相似文献
J. D. ParisseEmail: |
19.
Eleanor Toye Richard Sharp Anil Madhavapeddy David Scott Eben Upton Alan Blackwell 《Personal and Ubiquitous Computing》2007,11(2):97-106
We present a study of using camera-phones and visual-tags to access mobile services. Firstly, a user-experience study is described in which participants were both observed learning to interact with a prototype mobile service and interviewed
about their experiences. Secondly, a pointing-device task is presented in which quantitative data was gathered regarding the speed and accuracy with which participants aimed and clicked
on visual-tags using camera-phones. We found that participants’ attitudes to visual-tag-based applications were broadly positive,
although they had several important reservations about camera-phone technology more generally. Data from our pointing-device
task demonstrated that novice users were able to aim and click on visual-tags quickly (well under 3 s per pointing-device
trial on average) and accurately (almost all meeting our defined speed/accuracy tradeoff of 6% error-rate). Based on our findings,
design lessons for camera-phone and visual-tag applications are presented.
相似文献
Eleanor Toye (Corresponding author)Email: |
Richard SharpEmail: |
Anil MadhavapeddyEmail: |
David ScottEmail: |
Eben UptonEmail: |
Alan BlackwellEmail: |
20.
A polynomial-time algorithm is presented for partitioning a collection of sporadic tasks among the processors of an identical
multiprocessor platform. Since the partitioning problem is NP-hard in the strong sense, this algorithm is unlikely to be optimal.
A quantitative characterization of its worst-case performance is provided in terms of resource augmentation.
相似文献
Nathan Wayne Fisher (Corresponding author)Email: |