Similar Documents
20 similar documents found.
1.
In this paper, a tabu search based clustering approach called TS-Clustering is proposed to deal with the minimum sum-of-squares clustering problem. In the TS-Clustering algorithm, five improvement operations and three neighborhood modes are given. The improvement operations are used to enhance the clustering solution obtained during the iterations, and the neighborhood modes are used to create the neighborhood of the tabu search. The superiority of the proposed method over some known clustering techniques is demonstrated on artificial and real-life data sets.
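The abstract does not spell out the five improvement operations or the three neighborhood modes, so the following is only a minimal sketch of a tabu-search loop on the sum-of-squares clustering objective: a single reassignment-move neighborhood, a tabu list with an aspiration criterion, and illustrative parameter values.

```python
import random
import numpy as np

def sum_of_squares(X, labels, k):
    """Within-cluster sum of squared distances to the cluster centroids."""
    total = 0.0
    for c in range(k):
        pts = X[labels == c]
        if len(pts):
            total += ((pts - pts.mean(axis=0)) ** 2).sum()
    return total

def tabu_clustering(X, k, iters=200, tabu_tenure=10, n_candidates=20, seed=0):
    """Toy tabu search: moves reassign one point; recently reversed moves are tabu."""
    rng = random.Random(seed)
    n = len(X)
    labels = np.array([rng.randrange(k) for _ in range(n)])
    best_labels, best_cost = labels.copy(), sum_of_squares(X, labels, k)
    tabu = {}  # (point, cluster) -> iteration until which moving the point there is tabu
    for it in range(iters):
        candidates = []
        for _ in range(n_candidates):
            i, c = rng.randrange(n), rng.randrange(k)
            if c == labels[i]:
                continue
            trial = labels.copy()
            trial[i] = c
            cost = sum_of_squares(X, trial, k)
            # Aspiration criterion: a tabu move is allowed if it beats the best cost so far.
            if tabu.get((i, c), -1) < it or cost < best_cost:
                candidates.append((cost, i, c, trial))
        if not candidates:
            continue
        cost, i, c, trial = min(candidates, key=lambda t: t[0])
        old = int(labels[i])
        labels = trial
        tabu[(i, old)] = it + tabu_tenure  # forbid moving point i back to its old cluster
        if cost < best_cost:
            best_cost, best_labels = cost, labels.copy()
    return best_labels, best_cost
```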

2.
New efficient algorithms for the LCS and constrained LCS problems
In this paper, we study the classic and well-studied longest common subsequence (LCS) problem and a recent variant of it, namely the constrained LCS (CLCS) problem. In the CLCS problem, the computed LCS must also be a supersequence of a third given string. We first present an efficient algorithm for the traditional LCS problem that runs in O(R log log n + n) time, where R is the total number of ordered pairs of positions at which the two strings match and n is the length of the two given strings. Then, using this algorithm, we devise an algorithm for the CLCS problem with worst-case time complexity O(pR log log n + n), where p is the length of the third string.
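The O(R log log n + n) algorithm above depends on match pairs and fast predecessor structures that the abstract does not describe; for orientation only, here is the textbook O(nm) dynamic program that such faster algorithms improve upon.

```python
def lcs_length(a: str, b: str) -> int:
    """Classic O(len(a) * len(b)) dynamic program for the longest common subsequence."""
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m]

print(lcs_length("AGGTAB", "GXTXAYB"))  # -> 4 (an LCS is "GTAB")
```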

3.
Polynomial-time approximation algorithms with nontrivial performance guarantees are presented for the problems of (a) partitioning the vertices of a weighted graph into k blocks so as to maximize the weight of crossing edges, and (b) partitioning the vertices of a weighted graph into two blocks of equal cardinality, again so as to maximize the weight of crossing edges. The approach, pioneered by Goemans and Williamson, is via a semidefinite programming relaxation. The first author was supported in part by NSF Grant CCR-9225008. The work described here was undertaken while the second author was visiting Carnegie Mellon University; at that time he was a Nuffield Science Research Fellow, and was supported in part by Grant GR/F 90363 of the UK Science and Engineering Research Council, and Esprit Working Group 7097 “RAND”.
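For the two-block case (MAX CUT), the Goemans-Williamson recipe is: solve the semidefinite relaxation, recover unit vectors from the resulting Gram matrix, and round with a random hyperplane. The sketch below assumes cvxpy with an SDP-capable solver is available; it illustrates the relaxation-and-rounding idea, not the paper's k-block algorithm.

```python
import numpy as np
import cvxpy as cp

def gw_max_cut(W, seed=0):
    """Approximate MAX CUT of a symmetric weight matrix W via SDP + hyperplane rounding."""
    n = W.shape[0]
    X = cp.Variable((n, n), PSD=True)                  # Gram matrix of unit vectors
    prob = cp.Problem(cp.Maximize(cp.sum(cp.multiply(W, 1 - X)) / 4),
                      [cp.diag(X) == 1])
    prob.solve()
    # Factor X = V V^T and cut with a random hyperplane through the origin.
    vals, vecs = np.linalg.eigh(X.value)
    V = vecs * np.sqrt(np.clip(vals, 0, None))
    r = np.random.default_rng(seed).standard_normal(n)
    side = np.sign(V @ r)
    cut = sum(W[i, j] for i in range(n) for j in range(i + 1, n) if side[i] != side[j])
    return side, cut

W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
print(gw_max_cut(W))
```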

4.
The capacitated clustering problem (CCP) has been studied in a wide range of applications. In this study, we investigate a challenging CCP in computational biology, namely the sibling reconstruction problem (SRP). The goal of SRP is to establish the sibling relationships (i.e., groups of siblings) of a population from genetic data. The SRP has attracted growing interest from computational biologists over the past decade, as it is an important and necessary keystone for studies in genetic and population biology. We propose a large-scale mixed-integer formulation of the CCP for SRP that is based on both combinatorial and statistical genetic concepts. The objective is not only to find the minimum number of sibling groups, but also to maximize the degree of similarity of individuals within the same sibling groups, while each sibling group is subject to genetic constraints derived from Mendel's laws. We develop a new randomized greedy optimization algorithm to solve this SRP effectively and efficiently. The algorithm consists of two key phases: construction and enhancement. In the construction phase, a greedy approach with randomized perturbation is applied to construct multiple sibling groups iteratively. In the enhancement phase, a two-stage local search with a memory function is used to improve the solution quality with respect to the similarity measure. We demonstrate the effectiveness of the proposed algorithm on real biological data sets and compare it with state-of-the-art approaches in the literature. We also test it on larger simulated data sets. The experimental results show that the proposed algorithm provides the best reconstruction solutions.

5.
Based on the identification technique of active constraints, we propose a Newton-like algorithm and a quasi-Newton algorithm for solving the box-constrained optimization problem. The two algorithms require only the solution of a lower-dimensional system of linear equations at each iteration. In the proposed quasi-Newton algorithm, we make use of an approximate directional derivative of the multiplier functions so that only first-order derivatives of the objective function need to be evaluated. Under mild assumptions, global convergence of the two algorithms is established. In particular, locally quadratic convergence for the Newton-like algorithm and locally superlinear convergence for the quasi-Newton algorithm are obtained without assuming that the strict complementarity condition holds at the solution.
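As a rough illustration of the active-set idea described above (not the paper's identification technique or multiplier functions), the sketch below estimates which bounds are active, solves a Newton system only in the free variables, and projects back onto the box.

```python
import numpy as np

def box_newton_step(x, grad, hess, lower, upper, tol=1e-8):
    """One Newton-like step for min f(x) subject to lower <= x <= upper.

    Variables at a bound with the gradient pushing outward are treated as active
    and fixed; a lower-dimensional Newton system is solved for the remaining ones."""
    at_lower = (x - lower <= tol) & (grad > 0)
    at_upper = (upper - x <= tol) & (grad < 0)
    free = ~(at_lower | at_upper)
    step = np.zeros_like(x)
    if free.any():
        step[free] = np.linalg.solve(hess[np.ix_(free, free)], -grad[free])
    return np.clip(x + step, lower, upper)   # stay feasible inside the box

# Toy problem: minimize (x0 - 2)^2 + (x1 + 1)^2 subject to 0 <= x <= 1.
x = np.array([0.5, 0.5])
lower, upper = np.zeros(2), np.ones(2)
for _ in range(5):
    grad = 2 * (x - np.array([2.0, -1.0]))
    hess = 2 * np.eye(2)
    x = box_newton_step(x, grad, hess, lower, upper)
print(x)  # converges to the constrained minimizer [1, 0]
```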

6.
Time series analysis utilising more than a single forecasting approach is a procedure that originated many years ago as an attempt to improve on the performance of individual model forecasts. The literature contains a wide range of such approaches, but their success depends on the forecasting performance of the individual schemes. A clustering algorithm is often employed to distinguish smaller sets of data that share common properties. The application of clustering algorithms in combinatorial forecasting is discussed, with an emphasis placed on formulating the problem so that better forecasts are generated. Additionally, a hybrid clustering algorithm that assigns data according to their distance from the hyper-plane that provides their optimal modelling is applied. The developed cluster-based combinatorial forecasting schemes were examined in single-step-ahead prediction of the pound-dollar daily exchange rate and demonstrated an improvement over conventional linear and neural-network-based combinatorial schemes.

7.
Clustering ensemble algorithms usually require high diversity among the ensemble members, which raises the computational complexity of the member-generation stage. To address this problem, a genetic-algorithm-based clustering ensemble method, CEGA, is proposed. Instead of relying on the diversity of the clustering members, CEGA uses an objective function to turn the clustering problem into an optimization problem over the ensemble members, and exploits the inherent parallelism and global search ability of the genetic algorithm to combine the members optimally; the best chromosome obtained is taken as the final ensemble result. The complexity and scope of applicability of CEGA are analyzed, and experiments on several data sets from the UCI repository demonstrate the effectiveness of the method.
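CEGA's actual objective function and ensemble encoding are not given in the abstract; the toy sketch below only illustrates the underlying idea of treating a clustering as a chromosome and letting a genetic algorithm search for a good one. The within-cluster sum-of-squares fitness is an assumption for illustration, not CEGA's objective.

```python
import numpy as np

def ga_clustering(X, k, pop_size=30, generations=100, mutation_rate=0.05, seed=0):
    """Toy GA: each chromosome assigns every point to a cluster; fitness = -SSE."""
    rng = np.random.default_rng(seed)
    n = len(X)

    def sse(labels):
        total = 0.0
        for c in range(k):
            pts = X[labels == c]
            if len(pts):
                total += ((pts - pts.mean(axis=0)) ** 2).sum()
        return total

    pop = rng.integers(0, k, size=(pop_size, n))
    for _ in range(generations):
        fitness = np.array([-sse(ind) for ind in pop])
        # Binary tournament selection.
        parents = []
        for _ in range(pop_size):
            i, j = rng.integers(0, pop_size, size=2)
            parents.append(pop[i] if fitness[i] > fitness[j] else pop[j])
        parents = np.array(parents)
        # One-point crossover on consecutive pairs, then random-reset mutation.
        children = parents.copy()
        for a in range(0, pop_size - 1, 2):
            cut = rng.integers(1, n)
            children[a, cut:] = parents[a + 1, cut:]
            children[a + 1, cut:] = parents[a, cut:]
        mutate = rng.random(children.shape) < mutation_rate
        children[mutate] = rng.integers(0, k, size=int(mutate.sum()))
        pop = children
    return pop[np.argmax([-sse(ind) for ind in pop])]
```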

8.
A particle swarm optimization based simultaneous learning framework for clustering and classification (PSOSLCC) is proposed in this paper. Firstly, an improved particle swarm optimization (PSO) is used to partition the training samples; since the number of clusters would otherwise have to be given in advance, an automatic clustering algorithm, rather than trial and error, is adopted to find a proper number of clusters, and a set of clustering centers is obtained to form the classification mechanism. Secondly, in order to exploit more useful local information and obtain a better optimization result, a global factor is introduced into the update strategy of the particles in PSO. PSOSLCC has been extensively compared with a fuzzy relational classifier (FRC), vector quantization with learning vector quantization (VQ+LVQ3), a radial basis function neural network (RBFNN), and a simultaneous learning framework for clustering and classification (SCC) over several real-life datasets; the experimental results indicate that the proposed algorithm not only greatly reduces the time complexity, but also obtains better classification accuracy for most datasets used in this paper. Moreover, PSOSLCC is applied to a real-world application, namely texture image segmentation, with good performance obtained, which shows that the proposed algorithm has potential for large-scale classification problems.
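PSOSLCC's global factor, automatic cluster-number selection, and classification mechanism are not detailed in the abstract; the following is only a bare-bones PSO over cluster-center positions with a within-cluster sum-of-squares fitness, and all parameter values are illustrative.

```python
import numpy as np

def pso_cluster_centers(X, k, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """PSO where each particle encodes k cluster centers; fitness = within-cluster SSE."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1]
    lo, hi = X.min(axis=0), X.max(axis=0)

    def fitness(centers):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        return (d.min(axis=1) ** 2).sum()

    pos = rng.uniform(lo, hi, size=(n_particles, k, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()

    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([fitness(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        if pbest_val.min() < fitness(gbest):
            gbest = pbest[pbest_val.argmin()].copy()
    return gbest
```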

9.
Search results for spatio-temporal data are often displayed on a map, but when the number of matching search results is large, it can be time-consuming to individually examine all results, even when using methods such as filtered search to narrow the content focus. This suggests the need to aggregate results via a clustering method. However, standard unsupervised clustering algorithms like K-means (i) ignore relevance scores that can help with the extraction of highly relevant clusters, and (ii) do not necessarily optimize search results for purposes of visual presentation. In this article, we address both deficiencies by framing the clustering problem for search-driven user interfaces in a novel optimization framework that (i) aims to maximize the relevance of aggregated content according to cluster-based extensions of standard information retrieval metrics and (ii) defines clusters via constraints that naturally reflect interface-driven desiderata of spatial, temporal, and keyword coherence, without requiring complex ad-hoc distance metric specifications as in K-means. After comparatively benchmarking algorithmic variants of our proposed approach, RadiCAL, in offline experiments, we undertake a user study with 24 subjects to evaluate whether RadiCAL improves human performance on visual search tasks in comparison to K-means clustering and a filtered search baseline. Our results show that (a) our binary partitioning search (BPS) variant of RadiCAL is fast, near-optimal, and extracts higher-relevance clusters than K-means, and (b) clusters optimized via RadiCAL result in faster search task completion with higher accuracy while requiring minimal workload, leading to high effectiveness, efficiency, and user satisfaction compared with the alternatives.

10.
In this paper, we extend job scheduling models to include aspects of history-dependent scheduling, where the setup times for a job are affected by the aggregate activities of all predecessors of that job. Traditional approaches to machine scheduling typically address objectives and constraints that govern the relative sequence of jobs being executed on the available resources. This paper optimises the operations of multiple unrelated resources to address sequential and history-dependent job scheduling constraints along with time window restrictions. We denote this consolidated problem the general precedence scheduling problem (GPSP). We present several applications of the GPSP and show that many problems in the literature can be represented as special cases of history-dependent scheduling. We design new ways to model this class of problems and then formulate it as an integer program. We develop specialized algorithms to solve such problems. An extensive computational analysis over a diverse family of problem instances demonstrates the efficacy of the novel approaches and algorithms introduced in this paper.

11.
12.
Clustering is the process of grouping objects that are similar, where the similarity between objects is usually measured by a distance metric. The groups formed by a clustering method are referred to as clusters. Clustering is a widely used activity with applications ranging from biology to economics. Each clustering technique has advantages and disadvantages, and some clustering algorithms require input parameters that strongly affect the result. In most cases, it is not possible to choose the best distance metric, the best clustering method, and the best input argument values for a given data set. Therefore, multiple clusterings can be obtained with several distance metrics, several clustering methods, and several input argument values, and these multiple clusterings can then be combined into a new, better-quality final clustering. We propose a family of algorithms for combining multiple clusterings that are memory efficient, scalable, robust, and intuitive. Our new algorithms offer a tremendous speed gain and low memory requirements by working at the cluster level, while producing very good quality final clusters. Extensive experimental evaluations on some very challenging artificially generated and real data sets from a diverse set of domains establish the usefulness of our methods.
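The authors' algorithms work at the cluster level; as a point-level stand-in that only makes the notion of combining multiple clusterings concrete, the sketch below uses a standard co-association (evidence accumulation) matrix followed by hierarchical clustering, with scipy assumed available.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def consensus_clustering(labelings, k):
    """Combine several clusterings of the same n points via a co-association matrix."""
    labelings = np.asarray(labelings)            # shape: (n_clusterings, n_points)
    n = labelings.shape[1]
    co = np.zeros((n, n))
    for labels in labelings:
        co += (labels[:, None] == labels[None, :]).astype(float)
    co /= len(labelings)                          # fraction of clusterings agreeing on each pair
    # Treat (1 - co-association) as a distance and cut an average-linkage dendrogram.
    dist = 1.0 - co
    condensed = dist[np.triu_indices(n, k=1)]
    Z = linkage(condensed, method="average")
    return fcluster(Z, t=k, criterion="maxclust")

labelings = [[0, 0, 1, 1, 2, 2],
             [0, 0, 0, 1, 1, 1],
             [1, 1, 2, 2, 0, 0]]
print(consensus_clustering(labelings, k=3))
```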

13.
Wireless sensor networks (WSNs) are, by their nature, prone to intrusion, and traditional passive security mechanisms cannot fully cope with this problem. Drawing on research into artificial immune systems (AIS), a new intrusion detection system (IDS) model is designed. The model adopts danger theory and an improved dendritic cell algorithm (DCA) suited to WSNs, allowing nodes to cooperate with one another in identifying intrusions and strengthening the robustness of the network. Simulation results show that, compared with the earlier self/non-self (SNS) model, the proposed model performs well in terms of both detection capability and energy consumption.

14.
Medical image segmentation based on PSO-KFCM
Building on the kernel fuzzy c-means (KFCM) algorithm, a new PSO-KFCM clustering algorithm is proposed. The new algorithm uses a Gaussian kernel function to map samples from the input space into a high-dimensional feature space, and replaces the successive iterations of KFCM with the global search and fast convergence of particle swarm optimization, performing the clustering in the feature space. This overcomes KFCM's sensitivity to initial values and noisy data and its tendency to fall into local optima. Simulation experiments on medical image segmentation show that the new algorithm improves considerably on the KFCM clustering algorithm, yields better clustering results, and converges quickly.
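A sketch of the kernel fuzzy c-means ingredients mentioned above: the Gaussian-kernel-induced distance and the resulting fuzzy membership update. The PSO search that replaces KFCM's iterations in the paper is not shown, and the kernel width and fuzzifier values are illustrative.

```python
import numpy as np

def gaussian_kernel(x, v, sigma=1.0):
    return np.exp(-np.sum((x - v) ** 2) / (2 * sigma ** 2))

def kfcm_memberships(X, centers, m=2.0, sigma=1.0, eps=1e-12):
    """Fuzzy memberships with the Gaussian-kernel-induced distance
    d^2(x, v) = K(x,x) + K(v,v) - 2 K(x,v) = 2 (1 - K(x,v))."""
    n, k = len(X), len(centers)
    d2 = np.empty((n, k))
    for i, x in enumerate(X):
        for j, v in enumerate(centers):
            d2[i, j] = 2.0 * (1.0 - gaussian_kernel(x, v, sigma))
    inv = (1.0 / np.maximum(d2, eps)) ** (1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)   # each row sums to 1

X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
centers = np.array([[0.05, 0.0], [5.0, 5.0]])
print(kfcm_memberships(X, centers).round(3))
```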

15.
Networks-on-Chip (NoC) are an interesting option for the design of communication infrastructures for embedded systems, providing a scalable structure and balanced communication between the cores. Parallel applications that take advantage of NoC architectures are usually communication-intensive, so a large number of data packets are transmitted simultaneously through the network. In order to avoid congestion delays that deteriorate the execution time of the implemented applications, an efficient routing strategy must be designed carefully. In this paper, the ant colony optimization paradigm is explored to find and optimize routes in a mesh-based NoC. The proposed routing algorithms are simple yet efficient. The routing optimization is driven by the minimization of the total latency of packet transmission between the tasks that compose the application. The presented performance evaluation is threefold: first, the impact of well-known synthetic traffic patterns is assessed; second, randomly generated applications are mapped onto the NoC infrastructure and synthetic communication traffic following known patterns is used to simulate real situations; third, sixteen real-world applications from the E3S suite and one specific application for digital image processing are mapped and their execution times evaluated. In all cases, the obtained results are compared to those obtained with known general-purpose algorithms for deadlock-free routing. The comparison confirms the effectiveness and superiority of the ant-colony-inspired routing.
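The latency model, task mapping, and deadlock-freedom guarantees of the paper are omitted from the toy sketch below; it only shows the ant-colony core on a mesh: pheromone-biased next-hop selection, evaporation, and stronger deposits on shorter paths. All class names and parameters are illustrative.

```python
import random

class AntMeshRouter:
    """Toy ant-colony route search on a width x height mesh."""

    def __init__(self, width, height, evaporation=0.1, deposit=1.0, seed=0):
        self.w, self.h = width, height
        self.evaporation, self.deposit = evaporation, deposit
        self.pheromone = {}          # (node, next_node) -> pheromone level
        self.rng = random.Random(seed)

    def neighbors(self, node):
        x, y = node
        cand = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        return [(a, b) for a, b in cand if 0 <= a < self.w and 0 <= b < self.h]

    def next_hop(self, node, visited):
        options = [n for n in self.neighbors(node) if n not in visited]
        if not options:
            return None
        weights = [self.pheromone.get((node, n), 1.0) for n in options]
        return self.rng.choices(options, weights=weights)[0]

    def run_ant(self, src, dst, max_hops=100):
        path, node = [src], src
        while node != dst and len(path) < max_hops:
            nxt = self.next_hop(node, set(path))
            if nxt is None:
                return None
            path.append(nxt)
            node = nxt
        if node != dst:
            return None
        # Evaporate everywhere, then deposit more pheromone on shorter paths.
        for edge in list(self.pheromone):
            self.pheromone[edge] *= (1.0 - self.evaporation)
        for a, b in zip(path, path[1:]):
            self.pheromone[(a, b)] = self.pheromone.get((a, b), 1.0) + self.deposit / len(path)
        return path

router = AntMeshRouter(4, 4)
for _ in range(200):
    router.run_ant((0, 0), (3, 3))
print(router.run_ant((0, 0), (3, 3)))
```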

16.
Finding the longest common subsequence of a given set of input strings is a relevant problem arising in various practical settings. One of these problems is the so-called longest arc-preserving common subsequence problem. This NP-hard combinatorial optimization problem was introduced for the comparison of arc-annotated ribonucleic acid (RNA) sequences. In this work we present an integer linear programming (ILP) formulation of the problem. As even for rather small problem instances the application of a general-purpose ILP solver is not viable due to the size of the model, we study alternative ways, based on model reduction, to benefit from this ILP model. First, we present a heuristic way of reducing the model, with the subsequent application of an ILP solver. Second, we propose an iterative hybrid algorithm that uses an ILP solver to generate high-quality solutions at each iteration. Experimental results on artificial and real problem instances show that the proposed techniques outperform an available technique from the literature.

17.
In this paper the problem of automatically clustering a data set is posed as a multiobjective optimization (MOO) problem in which a set of cluster validity indices is optimized simultaneously. The proposed multiobjective clustering technique uses a recently developed simulated annealing based multiobjective optimization method as the underlying optimization strategy. Here a variable number of cluster centers is encoded in the string, and the number of clusters present in different strings varies over a range. Points are assigned to clusters based on a newly developed point-symmetry-based distance rather than the usual Euclidean distance. Two cluster validity indices, the Euclidean-distance-based XB-index and a recently developed point-symmetry-distance-based cluster validity index, the Sym-index, are optimized simultaneously in order to determine the appropriate number of clusters present in a data set. The proposed clustering technique is thus able to detect both the proper number of clusters and the appropriate partitioning for data sets with either hyperspherical or point-symmetric clusters. A new semi-supervised method is also proposed for selecting a single solution from the final Pareto-optimal front of the proposed multiobjective clustering technique. The efficacy of the proposed algorithm is shown for seven artificial data sets and six real-life data sets of varying complexity. Results are also compared with those obtained by another multiobjective clustering technique, MOCK, and two single-objective genetic-algorithm-based automatic clustering techniques, VGAPS clustering and GCUK clustering.
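The point-symmetry-based distance referred to above is commonly defined by reflecting a point about the candidate center, averaging the distances from the reflected point to its few nearest neighbours in the data set, and scaling by the Euclidean distance to the center. A small sketch of that definition follows; the number of neighbours is an illustrative choice.

```python
import numpy as np

def point_symmetry_distance(x, center, X, knear=2):
    """Point-symmetry distance d_ps(x, c) = d_sym(x, c) * ||x - c||,
    where d_sym averages the distances from the reflected point 2c - x
    to its knear nearest neighbours in the data set X."""
    reflected = 2.0 * center - x
    dists = np.sort(np.linalg.norm(X - reflected, axis=1))
    d_sym = dists[:knear].mean()
    return d_sym * np.linalg.norm(x - center)

X = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.0], [1.0, -1.0]])
c = np.array([1.0, 0.0])
print(point_symmetry_distance(X[0], c, X))  # small: (2, 0) mirrors X[0] about c
```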

18.
The paper presents a comparison of ant algorithms and simulated annealing as well as their applications in multicriteria discrete dynamic programming. The considered dynamic process consists of finite states and decision variables. In order to describe the effectiveness of multicriteria algorithms, four measures of the quality of the nondominated set approximations are used.
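For reference, here is a generic simulated-annealing skeleton of the kind being compared, shown on a toy one-dimensional objective rather than the paper's multicriteria dynamic-programming setting.

```python
import math
import random

def simulated_annealing(cost, neighbor, x0, t0=1.0, cooling=0.995, iters=5000, seed=0):
    """Generic SA: accept worse neighbours with probability exp(-delta / T)."""
    rng = random.Random(seed)
    x, best = x0, x0
    t = t0
    for _ in range(iters):
        y = neighbor(x, rng)
        delta = cost(y) - cost(x)
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x = y
            if cost(x) < cost(best):
                best = x
        t *= cooling   # geometric cooling schedule
    return best

# Toy example: minimize (x - 3)^2 over the reals.
best = simulated_annealing(lambda x: (x - 3) ** 2,
                           lambda x, rng: x + rng.uniform(-0.5, 0.5),
                           x0=0.0)
print(round(best, 2))
```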

19.
We study the Weighted t-Uniform Sparsest Cut (Weighted t-USC) problem and other related problems. In an instance of the Weighted t-USC problem, a parameter t and an undirected graph G = (V, E) with edge weights w: E → R≥0 and vertex weights η: V → R+ are given. The goal is to find a vertex set S ⊆ V with |S| ≤ t while minimizing w(S, V\S)/η(S), where w(S, V\S) is the total weight of the edges with exactly one endpoint in S and η(S) = Σ_{v∈S} η(v). For this problem, we present an (O(log t), 1+ε) factor bicriteria approximation algorithm. Our algorithm outperforms the current best algorithm when t = n^{o(1)}. We also present better approximation algorithms for the Weighted ρ-Unbalanced Cut and Min-Max k-Partitioning problems.
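To make the objective concrete, the snippet below simply evaluates w(S, V\S)/η(S) for a candidate set S on a toy graph; it does not implement the bicriteria approximation algorithm itself.

```python
def sparsity(S, edges, eta):
    """Evaluate w(S, V \\ S) / eta(S) for a candidate vertex set S.

    edges: iterable of (u, v, weight) tuples; eta: dict of vertex weights."""
    S = set(S)
    cut_weight = sum(w for u, v, w in edges if (u in S) != (v in S))
    return cut_weight / sum(eta[v] for v in S)

edges = [("a", "b", 2.0), ("b", "c", 1.0), ("c", "a", 4.0)]
eta = {"a": 1.0, "b": 1.0, "c": 2.0}
print(sparsity({"a"}, edges, eta))  # (2 + 4) / 1 = 6.0
```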

20.
To improve text clustering results and remedy the shortcomings of traditional clustering algorithms in parameter setting and stability, a new text clustering algorithm, TCBIBK (a Text Clustering algorithm Based on Improved BIRCH and K-nearest neighbor), is proposed. The algorithm takes the BIRCH clustering algorithm as its prototype; during clustering it judges not only the distance between a text object and a cluster but also the distances between clusters, actively merging or splitting clusters and using dynamic thresholds. It is also combined with the KNN classification algorithm to improve clustering stability while retaining good clustering efficiency. Applying TCBIBK to text clustering improves the clustering results. Comparative experiments show that the algorithm's clustering validity and stability are both considerably improved.
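This is not TCBIBK itself (its dynamic thresholds and active cluster merging/splitting are not reproduced); the sketch below only pairs BIRCH clustering with a KNN classifier on TF-IDF features using scikit-learn, with the corpus and parameter values as illustrative assumptions.

```python
from sklearn.cluster import Birch
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

# Illustrative corpus; in practice this would be the text collection to cluster.
docs = [
    "clustering of text documents",
    "text mining and document clustering",
    "neural networks for image recognition",
    "deep learning recognizes images",
]

X = TfidfVectorizer().fit_transform(docs).toarray()

# Stage 1: BIRCH builds the initial clusters (threshold here is a data-dependent guess).
birch = Birch(n_clusters=2, threshold=0.5)
labels = birch.fit_predict(X)

# Stage 2: a KNN classifier trained on the clustered documents assigns new texts.
knn = KNeighborsClassifier(n_neighbors=1).fit(X, labels)
print(labels, knn.predict(X))
```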

