Found 20 similar documents; search took 0 ms
1.
To make timely and effective decisions, businesses need the latest information from data warehouse repositories. To keep these repositories up-to-date with respect to end user updates, near-real-time data integration is required. An important phase in near-real-time data integration is data transformation, where the stream of updates is joined with disk-based master data. The stream-based algorithm MESHJOIN (Mesh Join) has been proposed to amortize disk access over fast streams. MESHJOIN makes no assumptions about the data distribution. In real-world applications, however, skewed distributions can be found, such as a stream of products sold, where certain products are sold more frequently than the remainder of the products. The question that arises is how much performance MESHJOIN loses by not adapting to data skew. In this paper we perform a rigorous experimental study analyzing the possible performance improvements while considering typical data distributions. For this purpose we design an algorithm, Extended Hybrid Join (X-HYBRIDJOIN), that is complementary to MESHJOIN in that it can adapt to data skew and stores parts of the master data in memory permanently, reducing the disk access overhead significantly. We compare the performance of X-HYBRIDJOIN against the performance of MESHJOIN. We take several precautions to make sure the comparison is adequate and focuses on the utilization of data skew. The experiments show that considering data skew offers substantial room for performance gains that cannot be found in non-adaptive approaches such as MESHJOIN. We also present a cost model for X-HYBRIDJOIN, and the algorithm is tuned based on that cost model.
2.
3.
In this paper, a new particle filter is proposed to solve the nonlinear and non-Gaussian filtering problem when measurements are randomly delayed by one sampling time and the latency probability of the delay is unknown. In the proposed method, particles and their weights are updated in a Bayesian filtering framework by considering the randomly delayed measurement model, and the latency probability is identified by the maximum likelihood criterion. The superior performance of the proposed particle filter as compared with existing methods and the effectiveness of the proposed identification method for the latency probability are both illustrated in two numerical examples concerning a univariate non-stationary growth model and bearing-only tracking.
4.
Mónica Millán-Giraldo J. Salvador Sánchez V. Javier Traver 《Neural computing & applications》2011,20(7):935-944
In many real applications, data are not all available at the same time, or it is not affordable to process them all in a batch process, but rather, instances arrive sequentially in a stream. The scenario of streaming data introduces new challenges to the machine learning community, since difficult decisions have to be made. The problem addressed in this paper is that of classifying incoming instances for which one attribute arrives only after a given delay. In this formulation, many open issues arise, such as how to classify the incomplete instance, whether to wait for the delayed attribute before performing any classification, or when and how to update a reference set. Three different strategies are proposed which address these issues differently. Orthogonally to these strategies, three classifiers of different characteristics are used. Keeping on-line learning strategies independent of the classifiers facilitates system design and contrasts with the common alternative of carefully crafting an ad hoc classifier. To assess how good learning is under these different strategies and classifiers, they are compared using learning curves and final classification errors for fifteen data sets. Results indicate that learning in this stringent context of streaming data and delayed attributes can successfully take place even with simple on-line strategies. Furthermore, active strategies behave generally better than more conservative passive ones. Regarding the classifiers, it was found that simple instance-based classifiers such as the well-known nearest neighbor may outperform more elaborate classifiers such as the support vector machines, especially if some measure of classification confidence is considered in the process.
5.
Khaleel Mershad Qutaibah M. Malluhi Mourad Ouzzani Mingjie Tang Michael Gribskov Walid G. Aref 《Distributed and Parallel Databases》2018,36(1):81-119
Collaborative databases, such as genome databases, often involve extensive curation activities where collaborators need to interact to be able to converge and agree on the content of data. In a typical scenario, a member of the collaboration makes some updates and these become visible to all collaborators for possible comments and modifications. At the same time, these updates are usually pending approval or rejection by the data custodian based on the related discussion and the content of the data. Unfortunately, the approval and authorization of updates in current databases is based solely on the identity of the user, e.g., via the SQL GRANT and REVOKE commands. In this paper, we present a scalable cloud-based collaborative database system to support collaboration and data curation scenarios. Our system is based on an Update Pending Approval model. In a nutshell, when a collaborator updates a given data item, it is marked as pending approval until the data custodian approves or rejects the update. Until then, any other collaborator can view and comment on the data, pending its approval. We fully realized our system inside HBase, a cloud-based platform. We also conducted extensive experiments showing that the system scales well under different workloads.
6.
The service capacities of a source peer at different times in a peer-to-peer (P2P) network exhibit temporal correlation. Unfortunately, there is no analytical result which clearly characterizes the expected download time from a source peer with stochastic and time-varying service capacity. The main contribution of this paper is to analyze the expected file download time in P2P networks with stochastic and time-varying service capacities. The service capacity of a source peer is treated as a stochastic process. Analytical results which characterize the expected download time from a source peer with stochastic and time-varying service capacity are derived for the autoregressive model of order 1. Simulation results are presented to validate our analytical results. Numerical data are given to show the impact of the degree of correlation and the strength of noise on the expected file download time. For any chunk allocation method, an analytical result of the expected parallel download time from a source peer with stochastic and time-varying service capacity is derived. It is shown that the algorithm which chooses chunk sizes proportional to the expected service capacities has better performance than the algorithm which chooses equal chunk sizes. It is also shown that multiple source peers do reduce the parallel download time significantly.
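The comparison between equal and capacity-proportional chunk sizes described in this abstract can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the paper's model or analysis: the function names, the discrete time slots, the flooring of capacity at a small positive value, and the AR(1) parameterisation (`mean`, `phi`, `sigma`) are all hypothetical choices made for illustration.

```python
import random


def ar1_capacity(mean, phi, sigma, steps, rng):
    """Simulate an AR(1) service-capacity series around `mean` (assumed model).

    c[t] = mean + phi * (c[t-1] - mean) + noise, floored at a small positive value.
    """
    c = mean
    series = []
    for _ in range(steps):
        c = mean + phi * (c - mean) + rng.gauss(0.0, sigma)
        series.append(max(c, 1e-6))
    return series


def download_time(chunk, series):
    """Number of time slots until the cumulative capacity covers `chunk`."""
    done = 0.0
    for t, c in enumerate(series, start=1):
        done += c
        if done >= chunk:
            return t
    raise ValueError("capacity series too short for this chunk")


def parallel_download_time(chunks, all_series):
    """A parallel download finishes when the slowest peer finishes its chunk."""
    return max(download_time(ch, s) for ch, s in zip(chunks, all_series))


def equal_chunks(file_size, n):
    """Split the file into n equal chunks."""
    return [file_size / n] * n


def proportional_chunks(file_size, means):
    """Split the file in proportion to each peer's expected capacity."""
    total = sum(means)
    return [file_size * m / total for m in means]


if __name__ == "__main__":
    rng = random.Random(42)
    means = [1.0, 4.0]  # hypothetical expected capacities of two source peers
    series = [ar1_capacity(m, phi=0.8, sigma=0.2, steps=500, rng=rng) for m in means]
    print("equal:", parallel_download_time(equal_chunks(100.0, 2), series))
    print("proportional:", parallel_download_time(proportional_chunks(100.0, means), series))
```

With equal chunks the slowest peer dominates the parallel download time, while proportional chunks aim to make all peers finish at roughly the same time.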
7.
Thomas J. Walsh Ali Nouri Lihong Li Michael L. Littman 《Autonomous Agents and Multi-Agent Systems》2009,18(1):83-105
This work considers the problems of learning and planning in Markovian environments with constant observation and reward delays. We provide a hardness result for the general planning problem and positive results for several special cases with deterministic or otherwise constrained dynamics. We present an algorithm, Model Based Simulation, for planning in such environments and use model-based reinforcement learning to extend this approach to the learning setting in both finite and continuous environments. Empirical comparisons show this algorithm holds significant advantages over others for decision making in delayed-observation environments.
8.
Neural Processing Letters - A class of global exponential synchronization problem for delayed quaternion-valued neural networks with stochastic impulses has been investigated in this paper, where...
9.
Le Song Purva Jagdale Liandong Yu Zhijian Liu Cheng Zhang Rongke Gao Xiangchun Xuan 《Microfluidics and nanofluidics》2018,22(11):134
Electrokinetic instabilities have been extensively studied in microchannel fluid flows with conductivity or conductivity and permittivity gradients for various microfluidic applications. This work presents an experimental and numerical investigation of the electrokinetic co-flow of ferrofluid and buffer solutions with matched electric conductivities. We find that the ferrofluid and buffer interface becomes unstable with periodic waves if the applied direct-current electric field reaches a threshold value. We develop a two-dimensional numerical model to seek a preliminary understanding of such an electrically originated flow instability. Our model indicates that the observed phenomenon is not a consequence of the electric body force acting on the permittivity gradients between the ferrofluid and buffer solutions. It is instead attributed to the diffusion-induced conductivity gradients that are formed at the ferrofluid and buffer interface due to the mismatching diffusivities of ferrofluid nanoparticles and buffer ions.
10.
Due to the emergence of Grid computing over the Internet, there is presently a need for dynamic load balancing algorithms which take into account the characteristics of Grid computing environments. In this paper, we consider a Grid architecture where computers belong to dispersed administrative domains or groups which are connected with heterogeneous communication bandwidths. We address the problem of determining which group an arriving job should be allocated to and how its load can be distributed among computers in the group to optimize the performance. We propose algorithms which guarantee finding a load distribution over computers in a group that leads to the minimum response time or computational cost. We then study the effect of pricing on load distribution by considering a simple pricing function. We develop three fully distributed algorithms to decide which group the load should be allocated to, taking into account the communication cost among groups. These algorithms use different information exchange methods and a resource estimation technique to improve the accuracy of load balancing. We conducted extensive simulations to evaluate the performance of the proposed algorithms and strategies.
11.
In this paper, we investigate the trade-offs between delay and capacity in mobile wireless networks with infrastructure support. We consider three different mobility models: the independent and identically distributed (i.i.d.) mobility model, the random walk mobility model with constant speed, and the Lévy flight mobility model. For the i.i.d. mobility model and the random walk mobility model with speed Θ(1/n^(1/2)), we obtain theoretical results for the average packet delay when the capacity is Θ(1) and Θ(1/n^(1/2)), respectively, where n is the number of nodes. We find that the optimal average packet delay is achieved when capacity λ(n) <(1/(2.n.log2(1/((1-e)-(k/n))+1)), where K is the number of gateways. It is proved that the average packet delay D(n) divided by the capacity λ(n) is bounded below by (n/(k·w)). When ω(n^(1/2))≤KO(n((1-η)·(α+1))/2)ln n) when K=o(n^η)(0≤η<1). We also prove that when ω(1/2)≤K
12.
《Computers & Operations Research》2001,28(9):885-898
This paper investigates the performance of a rate adaptation buffer in the case that the arriving cell stream is generated by an on/off-source, where both the on-periods and the off-periods are geometrically distributed. The ratio between the input rate and the output rate takes an arbitrary integer value greater than one. Under the assumption of an infinite storage capacity, exact explicit expressions are obtained for the mean values and the tail distributions of the buffer contents and the cell delay. Furthermore, an approximation is derived for the cell loss ratio in a finite-capacity buffer. Some numerical results are presented and discussed. Scope and purpose: In communication networks, a rate adaptation buffer is used at the interface between two consecutive links, when the speed of the incoming link exceeds the speed of the outgoing link, in order to avoid excessive loss of information. So far, only a few papers in the literature have investigated the buffer dimensioning of a rate adapter. All of these papers assume an uncorrelated arrival process on the incoming link and/or small differences between the input and the output rates, assumptions which in practice may not always be realistic. The present paper therefore presents and analyzes a discrete-time queueing model for a rate adaptation buffer which both accounts for the presence of correlation in the arrival stream and allows (possibly) large input/output ratios.
13.
The paper addresses consensus under nonlinear couplings and bounded delays for multi-agent systems, where the agents have single-integrator dynamics. The network topology is undirected and may alter as time progresses. The couplings are uncertain and satisfy a conventional sector condition with known sector slopes. The delays are uncertain, time-varying and obey known upper bounds. The network satisfies a symmetry condition that resembles Newton's Third Law. Explicit analytical conditions for the robust consensus are offered that employ only the known upper bounds for the delays and the sector slopes.
14.
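Ultrasonic distance measurement is low-cost and accurate, and is therefore widely used in non-contact distance measurement. However, an ultrasonic transducer needs a relatively long start-up (oscillation build-up) period when receiving low-frequency signals. In short-range (10 cm) measurements, this start-up delay can account for more than 50% of the measured ultrasonic time of flight, which seriously degrades the accuracy of the true time-of-flight measurement. This paper proposes a distance measurement formula that accounts for the transducer delay error and, combined with the least squares method, accurately calibrates parameters such as the ultrasonic flight distance, the transducer delay time, the temperature, and the device spacing. In the experiments, 24.5 kHz ultrasonic pulses were used, and the proposed method was verified with an approach based on time difference of arrival (TDOA). The experiments show that, without loss of accuracy, the method avoids repeated sampling and calibration at different ambient temperatures, resolves the parameter calibration difficulties that the least squares method may encounter in low-frequency short-range ultrasonic measurement, and noticeably improves accuracy for a range of short-range measurement applications. The proposed measurement and calibration method is algorithmically simple and easy to implement, and can largely meet the needs of short-range measurement.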
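The idea of calibrating a transducer delay with least squares can be sketched as follows. This is a minimal illustration only, not the formula from the paper: it assumes a simple linear model, measured time of flight = distance / speed of sound + fixed transducer delay, and fits the two parameters from reference measurements at known distances. All function names and the synthetic numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(distances_m, tof_s):
    """Estimate (speed of sound, transducer delay) from reference measurements.

    Assumed model: tof = distance / speed + delay, so the fitted slope is
    1 / speed and the fitted intercept is the delay.
    """
    slope, intercept = fit_line(distances_m, tof_s)
    return 1.0 / slope, intercept


def distance_from_tof(tof_s, speed, delay):
    """Convert a measured time of flight to distance, removing the delay."""
    return (tof_s - delay) * speed


if __name__ == "__main__":
    # Synthetic reference data: assumed speed 340 m/s and delay 0.3 ms.
    distances = [0.05, 0.10, 0.15, 0.20]
    tofs = [d / 340.0 + 3e-4 for d in distances]
    speed, delay = calibrate(distances, tofs)
    print(speed, delay, distance_from_tof(tofs[1], speed, delay))
```

Under this assumed model, ignoring the delay at a 10 cm range would roughly double the estimated distance, which is consistent with the abstract's remark that the start-up delay can exceed 50% of the measured time of flight.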
15.
Equations are derived here for the state estimate and the error covariance of a linear system with time delay under the assumption that the noise sequences entering the state and observation equations are correlated. The case of a single-process time delay is considered for convenience and the estimation criterion is taken to be the minimization of the error variance.
16.
A. V. Kolomiyets 《Cybernetics and Systems Analysis》2001,37(4):618-622
Oscillatory processes are considered that arise in systems with signals delayed under the action of external random perturbations. A simulation algorithm whose results are represented in numerical and graphical forms is proposed and realized as a component of the "GeoPoisk" software package used for interpretation of geological prospecting data.
17.
Although stochastic dynamical systems have received a great deal of attention in terms of stabilization studies, so far there are few works on controlled stochastic dynamical systems with state delay. In this paper, a controlled stochastic dynamical system represented by a stochastic differential equation with state delay is considered. Conditions under which the system is exponentially stable in mean square and in probability are examined.
18.
19.
We deal with nonlinear dynamical systems, consisting of a linear nominal part perturbed by model uncertainties, nonlinearities and both additive and multiplicative random noise, modeled as a Wiener process. In particular, we study the problem of finding suitable measurement feedback control laws such that the resulting closed-loop system is stable in some probabilistic sense. To this aim, we introduce a new notion of stabilization in probability, which is the natural counterpart of the classical concept of regional stabilization for deterministic nonlinear dynamical systems and stands as an intermediate notion between local and global stabilization in probability. This notion requires that, given a target set, a trajectory, starting from some compact region of the state space containing the target, remains forever inside some larger compact set, eventually enters any given neighborhood of the target in finite time and remains thereinafter, all these events being guaranteed with some probability. We give a Lyapunov-based sufficient condition for achieving stability in probability and a separation result which splits the control design into a state feedback problem and a filtering problem. Finally, we point out constructive procedures for solving the state feedback and filtering problem with arbitrarily large region of attraction and arbitrarily small target for a wide class of nonlinear systems, which at least include feedback linearizable systems. The generality of the result is promising for applications to other classes of stochastic nonlinear systems. In the deterministic case, our results recover classical stabilization results for nonlinear systems.
20.
S J Chung 《Computer methods and programs in biomedicine》1989,29(4):273-282
A microcomputer program in BASIC for predicting the survival probability by the time after diagnosis in patients with chronic leukemias is designed. Formulas used in this program are derived from the data published by Feinleib and MacMahon (Blood 15 (1960) 332-349). A general equation previously published by the author is applied in this study to calculate the survival probability. Analysis of the computer-assisted predicted and Feinleib and MacMahon's observed data has shown that the program is fairly accurate and reliable with a close agreement in expressing survival probability as a function of the time after diagnosis. The computer-assisted predictive formulas can determine the relationship between the time and the survival probability, and may be of value for prognostic evaluation of patients with chronic leukemias.