Similar Articles
A total of 20 similar articles were retrieved.
1.
A number of software cost estimation methods have been presented in the literature over the past decades. Analogy based estimation (ABE), which is essentially a case based reasoning (CBR) approach, is one of the most popular techniques. In order to improve the performance of ABE, many previous studies proposed effective approaches to optimize the weights of the project features (feature weighting) in its similarity function. However, ABE is still criticized for low prediction accuracy, large memory requirements, and expensive computation. To alleviate these drawbacks, in this paper we propose a project selection technique for ABE (PSABE), which reduces the whole project base to a small subset consisting only of representative projects. Moreover, PSABE is combined with feature weighting to form FWPSABE for a further improvement of ABE. The proposed methods are validated on four datasets (two real-world sets and two artificial sets) and compared with conventional ABE, feature weighted ABE (FWABE), and machine learning methods. The promising results indicate that the project selection technique can significantly improve analogy based models for software cost estimation.
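As a hedged illustration of the conventional ABE baseline these methods build on, the sketch below retrieves the k most similar historical projects by Euclidean distance over normalized features and averages their efforts. The feature names, toy data, and k are illustrative assumptions, not the paper's setup.

```python
# Minimal sketch of conventional analogy-based estimation (ABE), the baseline
# that PSABE builds on. Feature names and k are illustrative assumptions.
import numpy as np

def abe_estimate(new_project, history_features, history_efforts, k=3):
    """Predict effort as the mean effort of the k most similar past projects."""
    # Normalize features to [0, 1] so no single attribute dominates the distance.
    lo, hi = history_features.min(axis=0), history_features.max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)
    h = (history_features - lo) / scale
    q = (new_project - lo) / scale
    # Euclidean distance is the usual similarity function in ABE.
    d = np.linalg.norm(h - q, axis=1)
    nearest = np.argsort(d)[:k]
    return history_efforts[nearest].mean()

# Toy usage: 5 past projects described by (size_kloc, team_size, complexity).
X = np.array([[10, 3, 2], [25, 5, 3], [8, 2, 1], [40, 8, 4], [15, 4, 2]], float)
y = np.array([120.0, 320.0, 80.0, 610.0, 180.0])  # person-hours
print(abe_estimate(np.array([20, 4, 3], float), X, y))
```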

2.
Analogy-based effort estimation (ABE) is one of the prominent methods for software effort estimation. The fundamental concept of ABE is close to the mentality of expert estimation, but with an automated procedure in which the final estimate is generated by reusing similar historical projects. The key issue when using ABE is how to adapt the effort of the retrieved nearest neighbors. The adaptation process is an essential part of ABE, generating more accurate estimates by tuning the retrieved raw solutions with some adaptation strategy. In this study, we show that there are three interrelated decision variables that have a great impact on the success of the adaptation method: (1) the number of nearest analogies (k), (2) the optimum feature set needed for adaptation, and (3) the adaptation weights. To make the right decision regarding these variables, one needs to study all possible combinations and evaluate them individually to select the one that improves all prediction evaluation measures. The existing evaluation measures usually behave differently, sometimes presenting opposite trends in evaluating prediction methods. This means that changing one decision variable could improve one evaluation measure while degrading the others. Therefore, the main theme of this research is how to arrive at the best decision variables that improve the adaptation strategy, and thus the overall evaluation measures, without degrading the others. The joint impact of these decisions has not been investigated before; therefore, we propose to view the building of the adaptation procedure as a multi-objective optimization problem. The particle swarm optimization (PSO) algorithm is utilized to find the optimum solutions for these decision variables based on optimizing multiple evaluation measures. We evaluated the proposed approaches over 15 datasets and using four evaluation measures. After extensive experimentation, we found that: (1) the predictive performance of ABE has noticeably been improved, (2) optimizing all decision variables together is more efficient than ignoring any one of them, and (3) optimizing decision variables for each project individually yields better accuracy than optimizing them for the whole dataset.
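As a hedged sketch of the adaptation step this entry centers on, the snippet below applies one common strategy: linear size adjustment of the retrieved analogies followed by a weighted mean. The joint PSO optimization of k, the feature subset, and the weights described in the abstract is not reproduced, and all values are illustrative.

```python
# Hedged sketch of one common ABE adaptation strategy (linear size adjustment
# of the k retrieved analogies). The paper's multi-objective PSO tuning of the
# decision variables is not reproduced here.
import numpy as np

def adapted_estimate(new_size, analog_sizes, analog_efforts, weights=None):
    """Scale each analogy's effort by the size ratio, then take a weighted mean."""
    analog_sizes = np.asarray(analog_sizes, float)
    analog_efforts = np.asarray(analog_efforts, float)
    adjusted = analog_efforts * (new_size / analog_sizes)  # productivity-based tuning
    if weights is None:                                    # uniform adaptation weights
        weights = np.ones_like(adjusted)
    return np.average(adjusted, weights=weights)

# k = 3 retrieved analogies with sizes (KLOC) and efforts (person-hours).
print(adapted_estimate(20, [10, 25, 15], [120, 320, 180]))
```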

3.
The estimation of software development effort has focused mostly on the accuracy of estimates through dealing with heterogeneous datasets, regardless of the fact that software projects are inherently complex and uncertain. In particular, Analogy Based Estimation (ABE), a widely accepted estimation method, suffers a great deal from the problem of inconsistent and non-normal datasets, because it is a comparison-based method and the quality of comparisons strongly depends on the consistency of projects. To overcome this problem, prior studies have suggested the use of weighting methods, outlier elimination techniques, and various types of soft computing methods. Although the proposed methods have reduced the complexity and uncertainty of projects, the results are still not convincing, and the methods are limited to special domains of software projects, which makes generalization impossible. Localization of the comparison and weighting processes through clustering of projects is the main idea behind this paper. A hybrid model is proposed in which the software projects are divided into several clusters based on key attributes (development type, organization type and development platform). A combination of ABE and the Particle Swarm Optimization (PSO) algorithm is used to design a weighting system in which the project attributes of different clusters are given different weights. Instead of comparing a new project with all the historical projects, it is only compared with the projects located in the related clusters, based on the common attributes. The proposed method was evaluated on three real datasets that include a total of 505 software projects. The performance of the proposed model was compared with other well-known estimation methods, and the promising results showed that the proposed localization can considerably improve the accuracy of estimates. Besides the increase in accuracy, the results also certified that the proposed method is flexible enough to be used in a wide range of software projects.
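A hedged sketch of the localization idea, assuming illustrative attribute keys and a deliberately simplified one-dimensional similarity: a new project is compared only with historical projects sharing its (development type, organization type, platform) cluster. The per-cluster PSO weighting from the paper is omitted.

```python
# Hedged sketch of localized comparison: cluster projects by categorical keys,
# then run the analogy search inside the matching cluster. Keys, data, and the
# 1-D similarity are illustrative simplifications.
from collections import defaultdict
import numpy as np

def build_clusters(projects):
    """Group projects by their (dev_type, org_type, platform) key."""
    clusters = defaultdict(list)
    for p in projects:
        clusters[(p["dev_type"], p["org_type"], p["platform"])].append(p)
    return clusters

def localized_estimate(new_p, clusters, k=2):
    pool = clusters.get((new_p["dev_type"], new_p["org_type"], new_p["platform"]), [])
    if not pool:  # fall back to the whole project base if the cluster is empty
        pool = [p for ps in clusters.values() for p in ps]
    pool = sorted(pool, key=lambda p: abs(p["size"] - new_p["size"]))
    return np.mean([p["effort"] for p in pool[:k]])

history = [
    {"dev_type": "new", "org_type": "bank", "platform": "mf", "size": 10, "effort": 120},
    {"dev_type": "new", "org_type": "bank", "platform": "mf", "size": 14, "effort": 170},
    {"dev_type": "enh", "org_type": "telco", "platform": "pc", "size": 8, "effort": 60},
]
new_project = {"dev_type": "new", "org_type": "bank", "platform": "mf", "size": 12}
print(localized_estimate(new_project, build_clusters(history)))
```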

4.
陈小龙, 马磊, 张文旭. Journal of Computer Applications (《计算机应用》), 2015, 35(7): 1824-1828
To address the state estimation problem in sensor networks without a fusion center, a distributed Kalman filtering algorithm based on quantized information (QDKF) is proposed. First, in distributed Kalman filtering (DKF), the weighting matrix is selected dynamically with node estimation accuracy as the weighting criterion, minimizing the covariance of the global estimation error. Then, to account for limited network bandwidth, a uniform quantizer is added to the DKF algorithm so that nodes exchange quantized information, reducing the bandwidth required for communication. Simulations of the QDKF algorithm with an 8-bit uniform quantizer show that, compared with Metropolis weighting and maximum-degree weighting, the dynamic weighting method reduces the root mean square error of the state estimates by 25% and 27.33%, respectively. The experimental results show that the QDKF algorithm with dynamic weighting improves state estimation accuracy and reduces bandwidth requirements, making it suitable for applications with constrained network communication.
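A minimal sketch of the kind of n-bit uniform quantizer the abstract describes, applied to a value a node would broadcast to its neighbors. The full distributed filter and the dynamic weighting are not reproduced; the value range and bit width are illustrative assumptions.

```python
# Hedged sketch of an 8-bit uniform quantizer for inter-node communication,
# as used in the QDKF setting above. Range and bit width are illustrative.
import numpy as np

def uniform_quantize(x, lo=-10.0, hi=10.0, bits=8):
    """Map x onto 2**bits evenly spaced levels in [lo, hi]."""
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    idx = np.clip(np.round((x - lo) / step), 0, levels - 1)
    return lo + idx * step

# A node quantizes its local estimate before broadcasting it to neighbors.
local_estimate = 3.14159
sent = uniform_quantize(local_estimate)
print(sent, abs(sent - local_estimate))  # quantization error is bounded by step/2
```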

5.
Recently, we proposed an improvement to conventional eigenvoice (EV) speaker adaptation using kernel methods. In our novel kernel eigenvoice (KEV) speaker adaptation, speaker supervectors are mapped to a kernel-induced high dimensional feature space, where eigenvoices are computed using kernel principal component analysis. A new speaker model is then constructed as a linear combination of the leading eigenvoices in the kernel-induced feature space. KEV adaptation was shown to outperform EV, MAP, and MLLR adaptation in a TIDIGITS task with less than 10 s of adaptation speech. Nonetheless, due to the many kernel evaluations, both adaptation and subsequent recognition in KEV adaptation are considerably slower than in conventional EV adaptation. In this paper, we solve the efficiency problem and eliminate all kernel evaluations involving adaptation or testing observations by finding an approximate pre-image of the implicit adapted model found by KEV adaptation in the feature space; we call our new method embedded kernel eigenvoice (eKEV) adaptation. eKEV adaptation is faster than KEV adaptation, and subsequent recognition runs as fast as normal HMM decoding. eKEV adaptation makes use of a multidimensional scaling technique so that the resulting adapted model lies in the span of a subset of carefully chosen training speakers. It is related to the reference speaker weighting (RSW) adaptation method, which is based on speaker clustering. Our experimental results on the Wall Street Journal corpus show that eKEV adaptation continues to outperform EV, MAP, MLLR, and the original RSW method. However, by adopting eKEV's way of choosing the subset of reference speakers, RSW adaptation can also be improved so that it performs as well as our eKEV adaptation.

6.
Into the Blue: Better Caustics through Photon Relaxation
The photon mapping method is one of the most popular algorithms employed in computer graphics today. However, obtaining good results is dependent on several variables including kernel shape and bandwidth, as well as the properties of the initial photon distribution. While the photon density estimation problem has been the target of extensive research, most algorithms focus on new methods of optimising the kernel to minimise noise and bias. In this paper we break from convention and propose a new approach that directly redistributes the underlying photons. We show that by relaxing the initial distribution into one with a blue noise spectral signature we can dramatically reduce background noise, particularly in areas of uniform illumination. In addition, we propose an efficient heuristic to detect and preserve features and discontinuities. We then go on to demonstrate how reconfiguration also permits the use of very low bandwidth kernels, greatly improving render times whilst reducing bias.

7.
In this paper we propose a Gaussian-kernel-based online kernel density estimation which can be used for online probability density estimation and online learning. Our approach generates a Gaussian mixture model of the observed data and allows online adaptation from positive examples as well as from negative examples. The adaptation from negative examples is realized by a novel concept of unlearning in mixture models. Low complexity of the mixtures is maintained through a novel compression algorithm. In contrast to existing approaches, ours does not require fine-tuning parameters for a specific application, does not assume specific forms of the target distributions, and places no temporal constraints on the observed data. The strength of the proposed approach is demonstrated with examples of online estimation of complex distributions, an example of unlearning, and interactive learning of basic visual concepts.
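A hedged sketch of the basic mechanism behind online Gaussian-kernel density estimation: each observation adds one Gaussian component, and the mixture is evaluated on demand. The paper's compression and unlearning steps are not reproduced, and the fixed bandwidth is an illustrative assumption.

```python
# Hedged sketch of online Gaussian-kernel density estimation: one mixture
# component per observation, evaluated lazily. Compression/unlearning omitted.
import numpy as np

class OnlineGaussianKDE:
    def __init__(self, bandwidth=0.5):
        self.h = bandwidth
        self.samples = []

    def update(self, x):
        """Add one observation (one mixture component) to the model."""
        self.samples.append(float(x))

    def pdf(self, x):
        """Evaluate the current mixture: mean of Gaussians centred on samples."""
        s = np.asarray(self.samples)
        z = (x - s) / self.h
        return np.mean(np.exp(-0.5 * z**2) / (self.h * np.sqrt(2 * np.pi)))

kde = OnlineGaussianKDE()
for obs in np.random.default_rng(0).normal(0.0, 1.0, 200):
    kde.update(obs)
print(kde.pdf(0.0))  # should be near the standard normal density 0.3989
```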

8.
Domain adaptation aims to correct the mismatch in statistical properties between the source domain on which a classifier is trained and the target domain to which the classifier is to be applied. In this paper, we address the challenging scenario of unsupervised domain adaptation, where the target domain does not provide any annotated data to assist in adapting the classifier. Our strategy is to learn robust features which are resilient to the mismatch across domains and then use them to construct classifiers that will perform well on the target domain. To this end, we propose novel kernel learning approaches to infer such features for adaptation. Concretely, we explore two closely related directions. In the first direction, we propose unsupervised learning of a geodesic flow kernel (GFK). The GFK summarizes the inner products in an infinite sequence of feature subspaces that smoothly interpolates between the source and target domains. In the second direction, we propose supervised learning of a kernel that discriminatively combines multiple base GFKs. Those base kernels model the source and the target domains at fine-grained granularities. In particular, each base kernel pivots on a different set of landmarks—the most useful data instances that reveal the similarity between the source and the target domains, thus bridging them to achieve adaptation. Our approaches are computationally convenient, automatically infer important hyper-parameters, and are capable of learning features and classifiers discriminatively without demanding labeled data from the target domain. In extensive empirical studies on standard benchmark recognition datasets, our approaches yield state-of-the-art results compared to a variety of competing methods.

9.
The ratio of two probability densities can be used for solving various machine learning tasks such as covariate shift adaptation (importance sampling), outlier detection (likelihood-ratio test), feature selection (mutual information), and conditional probability estimation. Several methods of directly estimating the density ratio have recently been developed, e.g., moment matching estimation, maximum-likelihood density-ratio estimation, and least-squares density-ratio fitting. In this paper, we propose a kernelized variant of the least-squares method for density-ratio estimation, which is called kernel unconstrained least-squares importance fitting (KuLSIF). We investigate its fundamental statistical properties including a non-parametric convergence rate, an analytic-form solution, and a leave-one-out cross-validation score. We further study its relation to other kernel-based density-ratio estimators. In experiments, we numerically compare various kernel-based density-ratio estimation methods, and show that KuLSIF compares favorably with other approaches.
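A hedged sketch of least-squares density-ratio fitting in the uLSIF style, closely related to the KuLSIF estimator discussed above: Gaussian basis functions are centred on the numerator samples and the weights are obtained from a regularized linear solve. The kernel width and regularization strength are illustrative assumptions, not the paper's settings.

```python
# Hedged sketch of least-squares importance fitting (uLSIF-style), a close
# relative of KuLSIF. sigma and lam are illustrative hyperparameters.
import numpy as np

def ulsif_ratio(x_nu, x_de, sigma=0.5, lam=0.1):
    """Return a function estimating r(x) = p_nu(x) / p_de(x)."""
    centers = x_nu  # one Gaussian basis per numerator sample
    K = lambda a: np.exp(-(a[:, None] - centers[None, :]) ** 2 / (2 * sigma**2))
    Phi_de, Phi_nu = K(x_de), K(x_nu)
    H = Phi_de.T @ Phi_de / len(x_de)      # empirical E_de[phi phi^T]
    h = Phi_nu.mean(axis=0)                # empirical E_nu[phi]
    alpha = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    return lambda x: np.maximum(K(np.atleast_1d(x)) @ alpha, 0.0)

rng = np.random.default_rng(1)
r = ulsif_ratio(rng.normal(0, 1, 200), rng.normal(0.5, 1, 200))
print(r(np.array([0.0])))  # true ratio at 0: N(0|0,1)/N(0|0.5,1) ≈ 1.13
```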

10.
Multiple kernel learning (MKL) is widely used in visual semantic concept detection, but traditional MKL mostly adopts linear, stationary kernel combinations and cannot accurately characterize complex data distributions. This paper applies the exact Euclidean locality sensitive hashing (E2LSH) algorithm to clustering and, combining it with the advantages of nonlinear multiple kernel combination, proposes a nonlinear, non-stationary multiple kernel combination method called E2LSH-MKL. The method uses the Hadamard product to realize nonlinear weighting of different kernel functions, making full use of the information obtained from interactions between kernels. Meanwhile, a clustering algorithm based on the E2LSH hashing principle first hash-clusters the original image dataset into several image subsets, and then assigns each kernel function a different weight according to its relative contribution to each subset, thereby achieving non-stationary weighting of multiple kernels and improving learner performance. Finally, E2LSH-MKL is applied to visual semantic concept detection. Experimental results on the Caltech-256 and TRECVID 2005 datasets show that the new method outperforms several existing multiple kernel learning methods.

11.
Development effort is one of the most important metrics that must be estimated in order to design the plan of a project. The uncertainty and complexity of software projects make the process of effort estimation difficult and ambiguous. Analogy-based estimation (ABE) is the most common method in this area because it is quite straightforward and practical, relying on comparisons between new projects and completed projects to estimate the development effort. Despite many advantages, ABE is unable to produce accurate estimates when the importance levels of project features are not the same or when the relationships among features are difficult to determine. In such situations, efficient feature weighting can be a solution to improve the performance of ABE. This paper proposes a hybrid estimation model based on a combination of a particle swarm optimization (PSO) algorithm and ABE to increase the accuracy of software development effort estimation. This combination leads to accurate identification of similar projects, based on optimizing the performance of the similarity function in ABE. A framework is presented in which appropriate weights are allocated to project features so that the most accurate estimates are achieved. The suggested model is flexible enough to be used with different datasets, including categorical and non-categorical project features. Three real datasets are employed to evaluate the proposed model, and the results are compared with other estimation models. The promising results show that a combination of PSO and ABE could significantly improve the performance of existing estimation models.
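A hedged sketch of the PSO + ABE combination this entry describes: particles are feature-weight vectors, and fitness is the leave-one-out error of weighted-distance ABE on the historical projects. Swarm size, coefficients, and data are illustrative assumptions.

```python
# Hedged sketch of PSO searching feature weights for ABE's similarity function.
# All swarm hyperparameters and the synthetic data are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def loo_mae(w, X, y, k=2):
    """Leave-one-out mean absolute error of ABE under feature weights w."""
    err = 0.0
    for i in range(len(X)):
        d = np.sqrt(((X - X[i]) ** 2 * w).sum(axis=1))
        d[i] = np.inf                       # exclude the held-out project
        err += abs(y[np.argsort(d)[:k]].mean() - y[i])
    return err / len(X)

def pso_weights(X, y, n_particles=20, iters=50):
    dim = X.shape[1]
    pos = rng.random((n_particles, dim)); vel = np.zeros_like(pos)
    pbest, pbest_f = pos.copy(), np.array([loo_mae(p, X, y) for p in pos])
    gbest = pbest[pbest_f.argmin()]
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0.0, 1.0)  # weights constrained to [0, 1]
        f = np.array([loo_mae(p, X, y) for p in pos])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = pos[improved], f[improved]
        gbest = pbest[pbest_f.argmin()]
    return gbest

X = rng.random((30, 4)); y = 100 * X[:, 0] + 10 * X[:, 1] + rng.normal(0, 1, 30)
print(pso_weights(X, y))  # feature 0 should receive a large weight
```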

12.
Kernel Bandwidth Estimation for Nonparametric Modeling
Kernel density estimation is a nonparametric procedure for probability density modeling, which has found several applications in various fields. The smoothness and modeling ability of the functional approximation are controlled by the kernel bandwidth. In this paper, we describe a Bayesian estimation method for finding the bandwidth from a given data set. The proposed bandwidth estimation method is applied in three different computational-intelligence methods that rely on kernel density estimation: 1) scale space; 2) mean shift; and 3) quantum clustering. The third method is a novel approach that relies on the principles of quantum mechanics. This method is based on the analogy between data samples and quantum particles and uses the Schrödinger potential as a cost function. The proposed methodology is used for blind-source separation of modulated signals and for terrain segmentation based on topography information.

13.
The Nadaraya–Watson estimator, also known as kernel regression, is a density-based regression technique. It weights output values with the relative densities in input space. The density is measured with kernel functions that depend on bandwidth parameters. In this work we present an evolutionary bandwidth optimizer for kernel regression. The approach is based on a robust loss function, leave-one-out cross-validation, and the CMSA-ES as the optimization engine. A variant with locally parameterized Nadaraya–Watson models enhances the approach and allows the adaptation of the model to local data space characteristics. The unsupervised counterpart of kernel regression is an approach to learning principal manifolds. The learning problem of unsupervised kernel regression (UKR) is based on optimizing the latent variables, which is a multimodal problem with many local optima. We propose an evolutionary framework for the optimization of UKR based on scaling of initial local linear embedding solutions and minimization of the cross-validation error. Both methods are analyzed experimentally.
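A hedged sketch of Nadaraya–Watson regression with leave-one-out cross-validation over the bandwidth. The paper searches the bandwidth space with the CMSA-ES; a simple grid stands in for it here, and the data are illustrative.

```python
# Hedged sketch of Nadaraya-Watson kernel regression with LOO-CV bandwidth
# selection. A grid search stands in for the paper's CMSA-ES optimizer.
import numpy as np

def nw_predict(xq, X, y, h):
    """Kernel-weighted mean of y at query points xq (Gaussian kernel)."""
    W = np.exp(-0.5 * ((xq[:, None] - X[None, :]) / h) ** 2)
    return (W @ y) / W.sum(axis=1)

def loo_error(X, y, h):
    """Leave-one-out squared error: zero out each point's own kernel weight."""
    W = np.exp(-0.5 * ((X[:, None] - X[None, :]) / h) ** 2)
    np.fill_diagonal(W, 0.0)
    pred = (W @ y) / W.sum(axis=1)
    return np.mean((pred - y) ** 2)

rng = np.random.default_rng(0)
X = rng.uniform(0, 2 * np.pi, 100)
y = np.sin(X) + rng.normal(0, 0.2, 100)
grid = [0.05, 0.1, 0.2, 0.4, 0.8]
best_h = min(grid, key=lambda h: loo_error(X, y, h))
print(best_h, nw_predict(np.array([np.pi / 2]), X, y, best_h))  # ≈ sin(pi/2) = 1
```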

14.
We propose a novel approach to online estimation of probability density functions, based on kernel density estimation (KDE). The method maintains and updates a non-parametric model of the observed data, from which the KDE can be calculated. We propose an online bandwidth estimation approach and a compression/revitalization scheme that keeps the KDE's complexity low. We compare the proposed online KDE to state-of-the-art approaches on examples of estimating stationary and non-stationary distributions, and on examples of classification. The results show that the online KDE outperforms or matches the state-of-the-art approaches and produces models with a significantly lower complexity while allowing online adaptation.

15.
Nowadays, the cloud computing environment is becoming a natural choice to deploy and provide Web services that meet user needs. However, many services provide the same functionality and high quality of service (QoS) but different self-adaptive behaviors. In this case, providers' adaptation policies are useful for selecting services with high QoS and high quality of adaptation (QoA). Existing approaches do not take providers' adaptation policies into account when selecting services with high reputation and high reaction to changes, which is important for the composition of self-adaptive Web services. In order to actively participate in compositions, candidate services must negotiate their self-* capabilities. Moreover, they must evaluate the participation constraints against their capabilities, specified in terms of QoS and adaptation policies. This paper exploits a variant of particle swarm optimization and kernel density estimation in the selection of service compositions and the concurrent negotiation of their QoS and QoA capabilities. Selection and negotiation processes are held between intelligent agents, which adopt swarm intelligence techniques to achieve optimal selection and optimal agreement on providers' offers. To resolve the unknown autonomic behavior of candidate services, we deal with the lack of such information by predicting the real QoA capabilities of a service through the kernel density estimation technique. Experiments show that our solution is efficient in comparison with several state-of-the-art selection approaches.

16.
In nonparametric kernel estimation of unknown probability densities and their derivatives, several methods exist for estimating the kernel bandwidth, of which the CV and SCV cross-validation methods are the simplest and most suitable. The former was developed both for the density itself and for its derivatives; the latter, for the density only, yet it generates estimates with a higher rate of convergence and substantially smaller scatter. For the problem of nonparametric restoration of the density derivative from an independent sample, a data-based estimate of the kernel bandwidth is constructed.
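A hedged sketch of classical least-squares cross-validation (the CV method mentioned above) for choosing a KDE bandwidth, using the closed-form self-convolution of the Gaussian kernel; the candidate grid and data are illustrative.

```python
# Hedged sketch of least-squares cross-validation (LSCV) for KDE bandwidth
# selection with a Gaussian kernel. The candidate grid is illustrative.
import numpy as np

def lscv_score(X, h):
    """LSCV(h) = integral of fhat^2 minus 2 * mean of leave-one-out fhat."""
    n = len(X)
    d = (X[:, None] - X[None, :]) / h
    # For Gaussian kernels, the integral of fhat^2 has a closed form: the
    # kernel convolved with itself is a Gaussian with bandwidth h * sqrt(2).
    int_f2 = np.exp(-0.25 * d**2).sum() / (n**2 * h * 2 * np.sqrt(np.pi))
    K = np.exp(-0.5 * d**2) / (h * np.sqrt(2 * np.pi))
    np.fill_diagonal(K, 0.0)               # leave each point out of its own fit
    loo = K.sum() / (n * (n - 1))
    return int_f2 - 2 * loo

rng = np.random.default_rng(0)
X = rng.normal(0, 1, 300)
grid = [0.1, 0.2, 0.3, 0.5, 0.8]
print(min(grid, key=lambda h: lscv_score(X, h)))  # data-driven bandwidth choice
```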

17.
Similarity measurement is an important foundation of cluster analysis, and how to effectively measure the similarity between categorical symbols is one of its difficulties. This paper measures the similarity between symbols according to the kernel probability density of discrete symbols. Unlike traditional simple symbol matching and symbol frequency estimation methods, this similarity measure, through the effect of the kernel bandwidth, no longer relies on the assumption of independence between symbols of the same attribute. A Bayesian clustering model for categorical data is then established, a likelihood-based similarity measure between categorical objects and clusters is defined, and a model-based clustering algorithm is given. Using leave-one-out estimation and maximum likelihood estimation, three solution methods are proposed to dynamically determine the optimal kernel bandwidth during clustering. Experiments show that, compared with clustering algorithms using feature weighting or simple matching distance, the proposed algorithm achieves higher clustering accuracy, and the estimated kernel bandwidth has practical significance in applications such as identifying important features.

18.
Kernel-based methods have been widely investigated in the soft-computing community. However, they focus mainly on numeric data. In this paper, we propose a novel method for kernel learning on categorical data, and show how the method can be used to derive effective classifiers for linear classification. Based on kernel density estimation for categorical attributes, three popular classification methods, i.e., Naive Bayes, nearest neighbor and prototype-based classification, are effectively extended to classify categorical data. We also propose two data-driven approaches to the bandwidth selection problem, one aimed at minimizing the mean squared error of the kernel estimate and the other devoted to optimizing attribute weights. Theoretical analysis indicates that, as in the numeric case, kernel learning of categorical attributes is capable of making the classes more separable, resulting in outstanding performance of the new classifiers on various real-world datasets.
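A hedged sketch of kernel density estimation for a single categorical attribute using the Aitchison–Aitken kernel, the classic smoothing kernel for discrete data; its use inside the extended classifiers described above is not reproduced, and the smoothing parameter is an illustrative assumption.

```python
# Hedged sketch of the Aitchison-Aitken kernel estimate for a categorical
# attribute: smoothed category probabilities instead of raw frequencies.
import numpy as np

def aitchison_aitken(values, categories, lam=0.2):
    """Smoothed probability of each category given observed values.

    The kernel puts mass 1 - lam on a matching category and spreads lam
    evenly over the remaining c - 1 categories (0 <= lam <= (c - 1) / c).
    """
    c = len(categories)
    probs = []
    for cat in categories:
        match = np.mean(np.asarray(values) == cat)
        probs.append(match * (1 - lam) + (1 - match) * lam / (c - 1))
    return np.array(probs)

obs = ["red", "red", "blue", "green", "red", "blue"]
cats = ["red", "blue", "green"]
print(aitchison_aitken(obs, cats))        # smoothed category probabilities
print(aitchison_aitken(obs, cats, 0.0))   # lam = 0 recovers raw frequencies
```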

19.
Exponentially weighted moving average (EWMA) controllers are the most commonly used run-to-run controllers in the semiconductor manufacturing industry. An EWMA controller can be implemented in two different ways. One way is to keep the process gain at its off-line estimate and update the intercept term at each run, termed EWMA with intercept adaptation; the other is to keep the intercept term at its off-line estimate and update the process gain at each run, termed EWMA with gain adaptation. Despite the fact that gain variation and adaptation are typical in the semiconductor industry, most EWMA formulations are for intercept adaptation, and few results exist on the stability and sensitivity of EWMA with gain adaptation. In this paper, we propose a general formulation to analyze the stability of both EWMA controllers. The proposed state-space representation not only reveals the similarities and differences between the two types of EWMA controllers, but also explains why the stability conditions for both types are independent of process disturbances. In addition, we propose a general framework that unifies the analysis of the optimal control performance for both types of EWMA controllers. The proposed framework differs from existing approaches in that it decouples the state estimation from the control law and derives the optimal weighting based on the state estimation performance. This significantly simplifies the analysis procedure, especially for EWMA with gain adaptation. Using this framework, we derive the optimal EWMA weighting by solving the discrete-time algebraic Riccati equation (DARE) for various process disturbances encountered in the semiconductor manufacturing industry. Simulation examples are given to illustrate the optimality of the EWMA weighting derived using the framework. Some practical aspects of controller tuning are also discussed based on the simulation results.
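A hedged sketch of a run-to-run EWMA controller with intercept adaptation, the first of the two implementations described above: the process is modeled as y = alpha + beta * u + noise, the gain is fixed at its off-line estimate b, and the intercept estimate is EWMA-updated each run. Process values and the weight are illustrative.

```python
# Hedged sketch of run-to-run EWMA control with intercept adaptation.
# True process parameters, the off-line gain estimate, and the weight
# are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
alpha_true, beta_true = 2.0, 1.5        # unknown true process
b = 1.4                                  # off-line gain estimate (kept fixed)
target, w = 10.0, 0.3                    # setpoint and EWMA weight
a = 0.0                                  # running intercept estimate

for run in range(10):
    u = (target - a) / b                 # control law: invert the assumed model
    y = alpha_true + beta_true * u + rng.normal(0, 0.1)
    a = w * (y - b * u) + (1 - w) * a    # EWMA update of the intercept
    print(f"run {run}: u={u:.3f}, y={y:.3f}")  # y converges toward the target
```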

20.
We present new methods for fast Gaussian process (GP) inference in large-scale scenarios, including exact multi-class classification with label regression, hyperparameter optimization, and uncertainty prediction. In contrast to previous approaches, we use a full Gaussian process model without sparse approximation techniques. Our methods are based on exploiting generalized histogram intersection kernels and their fast kernel multiplications. We empirically validate the suitability of our techniques in a wide range of scenarios with tens of thousands of examples. Whereas plain GP models are intractable in these settings due to both memory consumption and computation time, our results show that exact inference can indeed be done efficiently. In consequence, we enable every important piece of the Gaussian process framework—learning, inference, hyperparameter optimization, variance estimation, and online learning—to be used in realistic scenarios with more than a handful of data points.
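A hedged sketch of GP regression with the histogram intersection kernel (HIK), K(x, z) = sum_d min(x_d, z_d). The paper's fast kernel multiplications are not reproduced; a plain dense solve stands in for them, and the noise level and toy data are illustrative.

```python
# Hedged sketch of GP regression with the histogram intersection kernel.
# A dense solve replaces the paper's fast HIK multiplications.
import numpy as np

def hik(A, B):
    """Histogram intersection kernel matrix between rows of A and rows of B."""
    return np.minimum(A[:, None, :], B[None, :, :]).sum(axis=2)

rng = np.random.default_rng(0)
X = rng.dirichlet(np.ones(5), size=40)           # 40 normalized histograms
y = X[:, 0] - X[:, 1] + rng.normal(0, 0.01, 40)  # toy regression targets

K = hik(X, X) + 1e-3 * np.eye(len(X))            # noise term for stability
alpha = np.linalg.solve(K, y)                    # GP posterior weights

Xq = rng.dirichlet(np.ones(5), size=3)
mean = hik(Xq, X) @ alpha                        # predictive mean at queries
print(mean, Xq[:, 0] - Xq[:, 1])                 # compare with the true signal
```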

