Similar documents
 20 similar documents found (search time: 171 ms)
1.
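For the application of K-means clustering to color quantization, a kernel-based clustering algorithm for color quantization is proposed. The samples to be clustered are mapped from the original space into a high-dimensional kernel space by a nonlinear mapping, which turns a nonlinear problem into a linear one, and the algorithm is implemented in Delphi. Experimental results show that the algorithm is computationally simple, robust, and of practical value.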
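The kernel-space clustering idea in this abstract can be illustrated with a minimal kernel K-means sketch. It is not the paper's Delphi implementation: the RBF kernel, the `gamma` value, the round-robin initial assignment, and all function names are assumptions made for illustration. The squared distance from a sample to a cluster mean in kernel space is computed from kernel values only, so the nonlinear mapping never has to be evaluated explicitly.

```python
import math

def rbf_kernel(a, b, gamma=1e-4):
    # Assumed kernel: the abstract does not say which kernel is used.
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def kernel_kmeans(points, k, iters=20, kernel=rbf_kernel):
    n = len(points)
    K = [[kernel(p, q) for q in points] for p in points]
    labels = [i % k for i in range(n)]  # assumed round-robin initialization
    for _ in range(iters):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Mean pairwise kernel value inside each cluster (third term below).
        within = [sum(K[i][j] for i in m for j in m) / len(m) ** 2 if m else 0.0
                  for m in members]
        new_labels = []
        for i in range(n):
            best, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue
                # ||phi(x_i) - mean_c||^2 expressed with kernel values only.
                d = K[i][i] - 2.0 * sum(K[i][j] for j in m) / len(m) + within[c]
                if d < best_d:
                    best, best_d = c, d
            new_labels.append(best)
        if new_labels == labels:
            break
        labels = new_labels
    return labels

# Six RGB colors: three reds and three blues, deliberately interleaved.
colors = [(250, 10, 10), (240, 20, 15), (10, 10, 250),
          (255, 0, 0), (20, 15, 240), (0, 0, 255)]
print(kernel_kmeans(colors, 2))
```

For color quantization, each cluster would then be replaced by one representative palette color; that step is omitted here.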

2.
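The traditional fuzzy C-means (FCM) algorithm clusters the raw data directly. Because noise, outliers, or other factors may corrupt the intrinsic structure of the data, clustering performance can suffer. To improve the robustness of FCM, a fuzzy C-means clustering algorithm based on adaptive neighbor information is proposed. Neighbor information is a measure based on the similarity between data points: every data point can be regarded as a neighbor of every other point, but the similarity differs from one pair of points to another. The neighbor information of the sample points, GX, and of the cluster centers, GV, is incorporated into the basic FCM model to supply more information about the data structure and to guide the partitioning into clusters, which improves the stability of the algorithm. Three iterative algorithms are proposed to solve the resulting clustering model. Compared with other state-of-the-art clustering algorithms, clustering performance improves by more than 10% on some benchmark datasets. The algorithm is also evaluated in terms of parameter sensitivity, convergence, and ablation experiments. The experimental results demonstrate the feasibility and effectiveness of the proposed clustering algorithm.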
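For reference, the basic FCM model that this abstract extends can be sketched as follows. This is only the standard baseline with alternating membership and center updates; the neighbor-information terms GX and GV of the proposed method are not reproduced, and the function names, the fuzzifier `m = 2`, and the fixed initial centers are assumptions.

```python
import math

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def update_memberships(points, centers, m=2.0, eps=1e-12):
    # u_ic = 1 / sum_j (d_ic / d_ij)^(2 / (m - 1)); eps avoids division by zero.
    U = []
    for p in points:
        d = [max(dist(p, v), eps) for v in centers]
        U.append([1.0 / sum((di / dj) ** (2.0 / (m - 1.0)) for dj in d) for di in d])
    return U

def update_centers(points, U, m=2.0):
    # Each center is the mean of all points weighted by u_ic^m.
    dims = len(points[0])
    centers = []
    for c in range(len(U[0])):
        w = [row[c] ** m for row in U]
        total = sum(w)
        centers.append(tuple(sum(wi * p[k] for wi, p in zip(w, points)) / total
                             for k in range(dims)))
    return centers

def fcm(points, centers, m=2.0, iters=50):
    for _ in range(iters):
        U = update_memberships(points, centers, m)
        centers = update_centers(points, U, m)
    return update_memberships(points, centers, m), centers

data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
U, centers = fcm(data, [(0.0, 0.0), (10.0, 10.0)])
print(centers)
```

Each row of `U` sums to 1, so a point between two clusters is shared between them rather than forced into one.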

3.
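Operating-condition partitioning of thermal power units based on historical data clustering   Cited by: 2 (self-citations: 0, citations by others: 2)
To address the increase in non-steady-state operation of thermal power units under peak-shaving duty, and the deviation of common operating conditions from design conditions, an operating-condition partitioning model based on clustering of historical operating data is proposed. First, because steady-state and non-steady-state conditions coexist in the operating data, a steady-state detection algorithm based on interval estimation of the expected power difference is proposed, with power as the feature variable, to screen out the non-steady-state conditions in the historical data. Second, because the external boundary-condition variables are distributed differently under steady-state conditions, an improved multi-step K-means clustering algorithm is proposed to partition the steady-state conditions, and the silhouette criterion is used to determine the optimal number of clusters at each step. Finally, the model is validated with historical operating data from a heavy-duty gas turbine in actual power-generation service. Compared with the traditional K-means clustering algorithm, the proposed model effectively resolves the problems of too few condition classes and unevenly distributed samples.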
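Choosing the number of clusters with the silhouette criterion, as mentioned in this abstract, can be sketched as below. This uses plain K-means rather than the paper's improved multi-step variant; the farthest-point seeding, the toy data, and all function names are assumptions chosen to keep the example deterministic.

```python
import math

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def farthest_point_init(points, k):
    # Deterministic seeding (an assumption): start from the first point, then
    # repeatedly add the point farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=100):
    centers = farthest_point_init(points, k)
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            new_centers.append(tuple(sum(v) / len(members) for v in zip(*members))
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def mean_silhouette(points, labels):
    total = 0.0
    for i, p in enumerate(points):
        same = [q for j, q in enumerate(points) if labels[j] == labels[i] and j != i]
        if not same:
            continue  # singleton cluster: silhouette taken as 0
        a = sum(dist(p, q) for q in same) / len(same)  # mean intra-cluster distance
        b = float("inf")                               # mean distance to nearest other cluster
        for c in set(labels) - {labels[i]}:
            other = [q for j, q in enumerate(points) if labels[j] == c]
            b = min(b, sum(dist(p, q) for q in other) / len(other))
        total += (b - a) / max(a, b)
    return total / len(points)

def best_k(points, candidates):
    return max(candidates, key=lambda k: mean_silhouette(points, kmeans(points, k)))

blobs = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 0), (11, 0), (10, 1), (11, 1),
         (0, 10), (1, 10), (0, 11), (1, 11)]
print(best_k(blobs, [2, 3, 4]))
```

On three well-separated groups the mean silhouette peaks at three clusters; with real operating data the peak is usually less sharp and several candidate values are worth inspecting.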

4.
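A novel clustering ant colony algorithm for the multi-depot vehicle routing problem   Cited by: 3 (self-citations: 0, citations by others: 3)
Based on a detailed description of the multi-depot vehicle routing problem with time windows, a mathematical model of the problem is established with minimum total transportation cost as the objective function. A solution approach is proposed in which a clustering ant colony algorithm first decomposes the multi-depot vehicle routing problem with time windows into several single-depot vehicle routing problems, and an improved ant colony algorithm then optimizes each single-depot problem. Finally, an example is used to compare the optimization capability of this new clustering ant colony algorithm with that of a nearest-assignment tabu search algorithm and the K-means algorithm. The test results show that the algorithm gives quite satisfactory solutions to the multi-depot vehicle routing problem with time windows.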

5.
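As the Internet and computer technology are applied ever more widely, data volumes are growing rapidly, and quickly mining valuable information from massive data has become a focus of data-mining research. This paper analyzes the Hadoop cloud computing platform, designs a Hadoop-based data analysis system, and proposes a MapReduce-based K-means spatial clustering algorithm.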
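One K-means iteration fits the map/reduce pattern naturally, which the following sketch simulates in a single Python process. It does not use any Hadoop API, and the function names are hypothetical: the map phase emits each point keyed by its nearest center, and the reduce phase averages the points that share a key.

```python
from collections import defaultdict

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def map_phase(points, centers):
    # Emit (index of nearest center, (point, count)) for every input point.
    for p in points:
        idx = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
        yield idx, (p, 1)

def reduce_phase(pairs, centers):
    # Sum coordinates and counts per key, then divide to get the new centers.
    sums, counts = {}, defaultdict(int)
    for idx, (p, n) in pairs:
        acc = sums.get(idx, (0.0,) * len(p))
        sums[idx] = tuple(a + x for a, x in zip(acc, p))
        counts[idx] += n
    # A center that received no points keeps its old position.
    return [tuple(s / counts[i] for s in sums[i]) if i in sums else centers[i]
            for i in range(len(centers))]

def kmeans_step(points, centers):
    return reduce_phase(map_phase(points, centers), centers)

print(kmeans_step([(0, 0), (0, 2), (10, 10), (10, 12)], [(0, 0), (10, 10)]))
```

Because sums and counts are associative, partial results could be combined on each node before the reduce step, which is what makes the scheme attractive for large datasets.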

6.
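Based on a detailed description of the multi-depot vehicle routing problem with time windows, a mathematical model of the problem is established with minimum total transportation cost as the objective function. A solution approach is proposed in which a clustering ant colony algorithm first decomposes the multi-depot vehicle routing problem with time windows into several single-depot vehicle routing problems, and an improved ant colony algorithm then optimizes each single-depot problem. Finally, an example is used to compare the optimization capability of this new clustering ant colony algorithm with that of a nearest-assignment tabu search algorithm and the K-means algorithm. The test results show that the algorithm gives quite satisfactory solutions to the multi-depot vehicle routing problem with time windows.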

7.
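Because a single model cannot fully describe the complex nonlinear characteristics of wood moisture content, this paper proposes a multi-model method for measuring wood moisture content. The method first uses the fuzzy C-means clustering algorithm to divide data such as the moisture-content equivalent resistance and the air-inlet and air-outlet temperatures into subsets with different cluster centers. Depending on its number of samples, each subset is trained with either a radial basis function network or a support vector machine to obtain a sub-model. The sub-model outputs are then weighted by the membership degrees produced by the fuzzy clustering and summed to give the wood moisture content measurement model. A case study shows that, compared with RBFNN modeling, the method gives better generalization and measurement accuracy for wood moisture content.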
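The final combination step described here, weighting sub-model outputs by fuzzy memberships, can be sketched as follows. The one-dimensional input, the two cluster centers, and the two linear stand-in sub-models are purely hypothetical; in the abstract the sub-models are trained RBF networks or support vector machines.

```python
def memberships(x, centers, m=2.0, eps=1e-12):
    # Standard FCM membership of a scalar input x with respect to each center.
    d = [max(abs(x - c), eps) for c in centers]
    return [1.0 / sum((di / dj) ** (2.0 / (m - 1.0)) for dj in d) for di in d]

def predict(x, centers, submodels):
    # Output = sum over clusters of membership * sub-model prediction.
    u = memberships(x, centers)
    return sum(ui * f(x) for ui, f in zip(u, submodels))

# Hypothetical cluster centers and stand-in sub-models (not from the paper).
centers = [0.0, 10.0]
submodels = [lambda x: 2.0 * x + 1.0, lambda x: 0.5 * x + 20.0]

print(predict(5.0, centers, submodels))  # halfway: both sub-models weighted 0.5
print(predict(0.0, centers, submodels))  # at a center: that sub-model dominates
```

Near a cluster center the output follows that cluster's sub-model almost exactly, while inputs between centers receive a smooth blend, which avoids jumps at cluster boundaries.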

8.
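A new K-means clustering algorithm for Chernoff faces   Cited by: 2 (self-citations: 0, citations by others: 2)
Wang Jinjia, Hong Wenxue, Li Xin. 《仪器仪表学报》 (Chinese Journal of Scientific Instrument), 2007, 28(10): 1916-1920
Chernoff faces are simple and cartoon-like, and they represent multivariate data graphically. However, clustering with Chernoff faces is subjective, requires a huge amount of comparison work, and makes the assignment of facial features difficult. This paper therefore proposes a new Chernoff-face clustering algorithm that incorporates K-means clustering or fuzzy C-means clustering. Experimental results on the IRIS and vegetable oil datasets show that the new algorithm outperforms traditional clustering algorithms.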

9.
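Using expression data for five classes of genes with similar expression profiles from the yeast diauxic shift experiment, the effects of different similarity measures, data preprocessing methods, and centroid initialization schemes on K-means clustering were studied. The results show that, when K-means clustering is applied to gene expression data, it is best to initialize the centroids with vectors that reflect the structural features of the data. If the centroids are initialized randomly, the best clustering results are obtained by preprocessing the data to relative expression levels and using the Euclidean distance as the similarity measure. Under the Euclidean distance criterion, standardization may destroy the amplitude characteristics of the original data and so degrade the clustering results. If the Pearson correlation coefficient is used as the similarity criterion, the choice of preprocessing method has no significant effect on the results.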
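The contrast between the two similarity measures under standardization can be checked on a toy example. The two expression profiles below are invented for illustration: they have the same shape but a tenfold difference in amplitude.

```python
import math
from statistics import mean, pstdev

def standardize(v):
    # Zero mean, unit (population) standard deviation per profile.
    mu, sigma = mean(v), pstdev(v)
    return [(x - mu) / sigma for x in v]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pearson(u, v):
    mu_u, mu_v = mean(u), mean(v)
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    return cov / math.sqrt(sum((a - mu_u) ** 2 for a in u) *
                           sum((b - mu_v) ** 2 for b in v))

low = [1.0, 2.0, 3.0, 4.0]       # hypothetical low-amplitude profile
high = [10.0, 20.0, 30.0, 40.0]  # same shape, ten times the amplitude

print(euclidean(low, high), euclidean(standardize(low), standardize(high)))
print(pearson(low, high), pearson(standardize(low), standardize(high)))
```

Standardization collapses the Euclidean distance between the two profiles to essentially zero, erasing the amplitude difference, whereas the Pearson correlation is the same before and after. This is consistent with the abstract's observation that preprocessing matters under the Euclidean criterion but not under the Pearson criterion.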

10.
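Because the point cloud data collected by 3D scanning devices are huge, this paper proposes a feature-preserving point cloud simplification method that reduces redundant data while better preserving the geometric features of the original surface. First, K-means clustering is used to cluster the point cloud globally in the spatial domain; a K-d tree is built over the point cloud, and some of its nodes serve as the initial cluster centers. Then, principal component analysis is used to estimate the point cloud normals and the candidate feature points. Each cluster is traversed, and a cluster that contains feature points is subdivided into several sub-clusters; for the subdivision, the cluster is mapped onto the Gaussian sphere. Finally, the data on the Gaussian sphere are classified with an adaptive mean shift method. The clustering result on the Gaussian sphere corresponds to the subdivision of the spatial cluster, and the set of all cluster centers is the simplified result. The effectiveness of the algorithm is verified on several physical models. The results show that the simplified point cloud retains few points in flat regions and more points in high-curvature regions. Compared with the non-uniform grid, hierarchical clustering, and K-means point cloud simplification methods, the method gives the smallest simplification error on surfaces with sharp features and better preserves the geometric features of the original surface.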

11.
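Because commonly used clustering algorithms struggle to cluster data with complex distributions, a novel iterative adjustable network clustering algorithm is proposed that combines network analysis with clustering based on cost-function optimization. The algorithm builds a model of the sample space from a network perspective, converting the data clustering problem into a network analysis problem based on node growth and connection. An adjustable measure of similarity between nodes and a corresponding clustering criterion are designed to construct the neighborhood search between nodes and the node growth operation, and the connections between network nodes are adjusted as a whole by changing an adjustment coefficient. The new algorithm obtains the number of clusters and the distribution of the sample data automatically, without the number of clusters being set in advance. It was compared with the K-means algorithm (KM) on four artificial datasets with different sample distributions and on a fault diagnosis test for valve leakage in a reciprocating compressor. The results show that the iterative adjustable network clustering algorithm can cluster manifold data with complex distributions and clearly outperforms the commonly used KM algorithm in accuracy and degree of automation.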

12.
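To obtain a TSK classifier with adaptive fuzzy-rule reduction and good generalization, this paper proposes a method that combines the fuzzy (C+P)-means clustering (FCPM) algorithm with the SP-V-support vector machine (SVM) classification algorithm to build a TSK (Takagi-Sugeno-Kang) classifier. The method first clusters the training data with the FCPM algorithm. It then uses the clustering result to determine the center and width parameters of the Gaussian membership functions in the antecedents of the TSK classifier's fuzzy rules. Finally, the group-sparsity-constrained SP-V-SVM algorithm learns the consequent parameters of the fuzzy rules; this algorithm not only improves the generalization of the system but also gives it adaptive fuzzy-rule reduction, which makes the system more compact. Comparisons of the number of fuzzy rules and the classification performance against related algorithms, in classification experiments on the UCI and IDA benchmark datasets, show that the TSK classifier built with the proposed algorithm classifies well and uses few fuzzy rules, which helps in building a more compact fuzzy classification system.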

13.
A typical process route serves as a template for process route planning and is a form of process planning knowledge. To discover typical process routes in the process planning database of a Computer Aided Process Planning (CAPP) system, Knowledge Discovery in Databases (KDD) is applied. Process data selection, cleaning, and transformation are used to obtain optimized process data. Cluster analysis is adopted as the algorithm for mining typical process routes. A mathematical model describing the process route is built from the data matrix. Three similarities are used in process route clustering: the similarity between operations is measured by the Manhattan distance based on the operation code; the similarity between process routes is calculated by the Euclidean distance and expressed as a dissimilarity matrix; and the similarity between process route clusters is evaluated by the average distance based on the dissimilarity matrix. The process route clusters are then merged by the agglomerative hierarchical clustering method, and the clustering result is determined by the clustering granularity of the process routes. The method has been applied successfully to discovering the typical process route of a kind of axle sleeve. This project is supported by the National High-Tech. R&D Program for CIMS, China (Grant No. 2003AA411041).
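The three similarity levels named in this abstract can be sketched as below. The abstract does not give the exact formulas, so several details are assumptions: each operation is encoded as a tuple of integers, routes have equal length, the route distance is the Euclidean norm of the position-wise operation distances, and `granularity` is a merge threshold on the average inter-cluster distance.

```python
import math

def op_distance(a, b):
    # Manhattan distance between two operation codes (tuples of integers).
    return sum(abs(x - y) for x, y in zip(a, b))

def route_distance(r1, r2):
    # Assumed form: Euclidean norm of the position-wise operation distances.
    return math.sqrt(sum(op_distance(a, b) ** 2 for a, b in zip(r1, r2)))

def cluster_routes(routes, granularity):
    n = len(routes)
    # Dissimilarity matrix between all pairs of routes.
    D = [[route_distance(routes[i], routes[j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average-linkage distance between clusters a and b.
                avg = (sum(D[i][j] for i in clusters[a] for j in clusters[b])
                       / (len(clusters[a]) * len(clusters[b])))
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        if best[0] > granularity:
            break  # closest pair is too dissimilar: stop merging
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical routes: three operations each, coded as (operation type, machine).
routes = [
    [(1, 1), (2, 1), (3, 2)],
    [(1, 1), (2, 2), (3, 2)],
    [(5, 4), (6, 4), (7, 5)],
    [(5, 4), (6, 5), (7, 5)],
]
print(cluster_routes(routes, granularity=3.0))
```

A smaller `granularity` yields more, tighter clusters; a representative route of each resulting cluster could then be taken as a candidate typical process route.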

14.
When a dataset contains many records, each with several attributes, clustering the data is an important step in mining and classification. Recent advances in clustering algorithms allow data to be clustered precisely and quickly. Clustering algorithms use optimization algorithms that simultaneously provide the number of clusters by default. These algorithms cluster the data so that records in the same cluster have maximum similarity and records in different clusters have minimum similarity. The k-means algorithm is a traditional algorithm for clustering problems. One of the main difficulties with clustering algorithms is determining the number of clusters before the algorithm starts. In other words, the number of clusters must be estimated from knowledge of the data distribution and then supplied to the problem as an input. In this paper, data collected during quality control of mechanized tunneling are analyzed. They consist of measurements of 16 characteristics for the first 200 rings of segments installed on the tunnel walls, as inspected by the quality control team. A dynamic validity index is combined with the k-means algorithm to cluster the data so that the optimal number of clusters is determined at the same time. Applying the algorithm shows that the installed rings can be grouped into four clusters. These four quality classes describe the installed rings better than any other number of classes (or clusters). Furthermore, the approach helps the quality team identify the most effective, best-performing installation team, namely the one whose installed rings fall in the best class with the least variation.

15.
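R-trees meet the needs of dynamic data maintenance and spatial queries well in fields such as reverse engineering, CAD/CAM, and machine vision, and the CR-tree is one of their best variants. Because the overflow-node splitting algorithm of the CR-tree gives unsatisfactory clustering results at high computational cost, an incremental k-means algorithm guided by principal component analysis is proposed; it searches for a new initial cluster center along the first principal component direction near the existing cluster centers. Combined with the Silhouette index, the algorithm is applied to the point-set clustering problem into which the overflow-node splitting problem is converted, and it adaptively obtains a near-globally-optimal clustering of the point set at low computational cost. Test results show that the R-tree overflow-node splitting algorithm based on incremental clustering outperforms the CR-tree and the RR*-tree overall in R-tree construction efficiency, storage utilization, and spatial query performance.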

16.
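For measuring process energy consumption in yarn dyeing, a soft-sensor method based on adaptive fuzzy clustering and multiple neural networks is proposed. The method applies an adaptive fuzzy C-means clustering algorithm to samples collected in real time to divide the training set into subsets with different cluster centers, which are corrected adaptively. Each subset is used to train a radial basis function network, giving a sub-model, and the sub-model outputs are then weighted by the membership degrees from the clustering and summed to give the final result. Soft-sensor modeling of dye-vat energy consumption, together with simulation and a typical case study, shows that the method has good prediction accuracy and robustness and, combined with a manufacturing execution system, good online measurement capability.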

17.
This paper deals with the problem of clustering machines into cells and components into part-families using ratio-level and ordinal-level data. The ratio-level data consist of workload information obtained from the per-unit process times and production quantities of components, and from machine capacity. For the ordinal-level data, we consider the sequence of operations for every component. These data sets are used in place of conventional binary data to arrive at clusters of cells and part-families. We propose a new approach to cell formation that views machines, and subsequently components, as 'points' in multi-dimensional space, with their coordinates defined by the corresponding elements in a Machine-Component Incidence Matrix (MCIM). An iterative algorithm that improves upon the seed solution is developed. The seed solution is obtained by formulating the given clustering problem as a Traveling Salesman Problem (TSP). The solutions yielded by the proposed clustering algorithm are found to be good and comparable to those reported in the literature.

18.
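Soft-sensor modeling based on an improved kernel fuzzy clustering algorithm   Cited by: 8 (self-citations: 3, citations by others: 8)
Because single-model soft-sensor modeling of fermentation processes is computationally expensive and not very accurate, a multi-model neural network soft-sensor modeling method based on an improved kernel fuzzy clustering algorithm is proposed. The method first processes the sample data with principal component analysis and uses the resulting principal components as the model inputs. It then partitions the dataset into clusters with a kernel fuzzy C-means clustering algorithm based on particle swarm optimization (PSKFCM). Finally, a local neural network model is built for each cluster, and the fused estimates of the local models form the output of the soft-sensor model. The method was applied to soft-sensor modeling of biomass concentration in erythromycin fermentation; the results show that the resulting model has high accuracy and good generalization.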

19.
COOPERATIVE CLUSTERING BASED ON GRID AND DENSITY   Cited by: 2 (self-citations: 0, citations by others: 2)
Based on an analysis of the features of the grid-based clustering method CLIQUE (clustering in quest) and the density-based clustering method DBSCAN (density-based spatial clustering of applications with noise), a new clustering algorithm named cooperative clustering based on grid and density (CLGRID) is presented. The new algorithm adopts an equivalent rule for region queries and density unit identification. The central region of a class is computed by the grid-based method and the margin region by a density-based method. By clustering in two phases and using only a small number of seed objects in representative units to expand the cluster, the number of region queries is decreased, and the time cost is reduced accordingly. The new algorithm retains the positive features of both grid-based and density-based methods and avoids the difficulty of parameter searching. It can discover clusters of arbitrary shape with high efficiency and is not sensitive to noise. Applying CLGRID to test data sets demonstrates its validity and its higher efficiency compared with traditional DBSCAN with an R*-tree.

20.
A honeybee-mating approach for cluster analysis   Cited by: 1 (self-citations: 0, citations by others: 1)
Cluster analysis, an active research subject in fields such as statistics, pattern recognition, machine learning, and data mining, partitions a given set of data or objects into clusters. K-means is a popular clustering method because of its simplicity and its speed on large datasets. However, K-means has two shortcomings. First, it depends on the initial state and converges to local optima. Second, global solutions of large problems cannot be found with a reasonable amount of computational effort. Many clustering studies have been carried out to overcome the local optima problem. Over the last decade, modeling the behavior of social insects, such as ants and bees, for search and problem solving has been the context of the emerging area of swarm intelligence. Honeybees are among the most closely studied social insects. Honeybee mating may also be considered a typical swarm-based approach to optimization, in which the search algorithm is inspired by the mating process of real honeybees. Neural network algorithms are useful for cluster analysis in data mining. This study proposes a two-stage method that first uses a self-organizing feature map (SOM) neural network to determine the number of clusters and then uses a honeybee mating optimization algorithm based on K-means to find the final solution. We compared the proposed algorithm with other heuristic clustering algorithms, such as GA, SA, TS, and ACO, by implementing them on several well-known datasets. Our findings show that the proposed algorithm works better than the others. To further demonstrate the capability of the proposed approach, it is applied to a real-world problem: market segmentation for an Internet bookstore based on customer loyalty.
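The first shortcoming named in this abstract, dependence on the initial state, is easy to reproduce with plain Lloyd-style K-means on four points. The data and the two starting configurations below are chosen for illustration only; the sketch does not implement the honeybee mating optimization itself.

```python
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def lloyd(points, centers, iters=100):
    # Standard K-means: assign each point to its nearest center, then
    # move every center to the mean of its assigned points.
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            groups[idx].append(p)
        new = [tuple(sum(v) / len(g) for v in zip(*g)) if g else c
               for g, c in zip(groups, centers)]
        if new == centers:
            break
        centers = new
    # Sum of squared errors of the final partition.
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse

# Four corners of a 4 x 1 rectangle.
corners = [(0, 0), (0, 1), (4, 0), (4, 1)]

_, sse_good = lloyd(corners, [(0.0, 0.5), (4.0, 0.5)])  # splits left / right
_, sse_bad = lloyd(corners, [(2.0, 0.0), (2.0, 1.0)])   # splits bottom / top
print(sse_good, sse_bad)
```

Both runs converge immediately, but the second one is stuck in a local optimum whose error is sixteen times larger. Global search heuristics such as the one proposed in the abstract aim to escape exactly this kind of configuration.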
