Similar literature
 18 similar articles found
1.
Multiple kernel clustering is an unsupervised data analysis method used in scenarios where data are easy to collect but hard to label. Multiple kernel clustering on incomplete data, however, remains a critical and challenging task. Although existing absent multiple kernel clustering methods perform well on this task, they may fail when the data have a high missing-value rate, and they easily fall into a local optimum. To address these problems, we propose an absent multiple kernel clustering (AMKC) method for incomplete data. The AMKC method first clusters the initialized incomplete data. It then constructs a new multiple-kernel-based data space, referred to as K-space, from multiple sources in order to learn the kernel combination coefficients. Finally, it integrates an incomplete-kernel-imputation objective, a multiple-kernel-learning objective, and a kernel-clustering objective to achieve absent multiple kernel clustering. The three stages are carried out simultaneously until the convergence condition is met. Experiments on six datasets with various characteristics show that the kernel imputation and clustering performance of the proposed method is significantly better than that of state-of-the-art competitors. The proposed method also converges quickly.

2.
涂袁志  孙树栋  李庆贺  王萌 《工业工程》2012,15(4):119-123,135
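The production cycles of parts in aviation manufacturing enterprises fluctuate considerably. This paper applies k-means clustering based on rough set theory to the study of part production cycles: the upper and lower approximations of each cycle class describe the class membership of the enterprise's actual production cycle samples, and the different cycle classes are expressed as a covering relation. A case simulation shows that the algorithm can support enterprise decisions on setting period-and-quantity standards and on balancing production.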
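
The upper and lower approximations mentioned in this abstract can be illustrated with a minimal sketch. It assumes a common rough k-means assignment rule (a sample whose distance to a second center is within a fixed ratio of its nearest distance is treated as a boundary sample); the function name `rough_assign`, the `ratio` threshold, and the cycle values are hypothetical and are not taken from the paper.

```python
def rough_assign(points, centers, ratio=1.3):
    """Assign 1-D samples (e.g. cycle times in days) to rough clusters.

    A sample that is clearly closest to one center goes into that cluster's
    lower approximation (and therefore also its upper approximation).
    A sample that is almost as close to another center goes only into the
    upper approximations of all those clusters, so clusters may overlap.
    The `ratio` threshold is an illustrative assumption.
    """
    lower = [[] for _ in centers]
    upper = [[] for _ in centers]
    for p in points:
        dists = [abs(p - c) for c in centers]
        nearest = min(range(len(centers)), key=lambda j: dists[j])
        close = [j for j in range(len(centers))
                 if j != nearest and dists[j] <= ratio * dists[nearest]]
        if close:
            # Boundary sample: membership is uncertain.
            for j in [nearest] + close:
                upper[j].append(p)
        else:
            # Certain member of exactly one cluster.
            lower[nearest].append(p)
            upper[nearest].append(p)
    return lower, upper


cycles = [1.0, 2.0, 5.2, 9.0, 10.0]   # hypothetical production cycles
lower, upper = rough_assign(cycles, centers=[1.5, 9.5])
```

In this toy data the sample 5.2 lies between the two centers, so it appears in both upper approximations and in neither lower approximation, which is the covering (rather than partitioning) behaviour the abstract describes.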

3.
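K-means is a widely used clustering algorithm, but initializing the cluster centers is one of its difficulties. This paper proposes an initialization method based on a hierarchical idea. A general clustering problem can be treated as weighted clustering: the data volume is reduced by sampling layer by layer, and clustering is then performed top-down on every layer, from the final sampling layer to the original data layer. The initial cluster centers of each layer are obtained by converting the cluster centers of the layer above it, and the process is repeated until the original data layer is reached, which yields the initial cluster centers for the original data. Experiments on simulated and real data show that K-means with hierarchical-sampling initialization converges quickly, produces high-quality clusters, and is insensitive to noise, and that it clearly outperforms existing related algorithms.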
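
The idea of treating clustering as weighted clustering and passing centers down from a sampled layer can be sketched as follows. This is a minimal 1-D illustration under stated assumptions: the function `weighted_kmeans_1d`, the way weights are attached to sampled points, and the two-layer example are hypothetical, and the paper's actual sampling and center-conversion rules are not reproduced here.

```python
def weighted_kmeans_1d(values, weights, centers, iters=20):
    """Lloyd-style k-means on 1-D data where each point carries a weight."""
    centers = list(centers)
    k = len(centers)
    for _ in range(iters):
        sums = [0.0] * k
        wsum = [0.0] * k
        for v, w in zip(values, weights):
            # Assign the point to its nearest center.
            j = min(range(k), key=lambda i: abs(v - centers[i]))
            sums[j] += w * v
            wsum[j] += w
        # Weighted mean per cluster; an empty cluster keeps its old center.
        centers = [sums[j] / wsum[j] if wsum[j] > 0 else centers[j]
                   for j in range(k)]
    return centers


# Sampled (coarse) layer: each sample stands for `weight` original points.
sample_values = [0.0, 1.0, 10.0]
sample_weights = [1.0, 3.0, 2.0]
coarse_centers = weighted_kmeans_1d(sample_values, sample_weights, [0.0, 10.0])

# Original layer: unit weights, initialized from the layer above.
all_values = [0.0, 1.0, 1.0, 1.0, 10.0, 10.0]
fine_centers = weighted_kmeans_1d(all_values, [1.0] * len(all_values),
                                  coarse_centers)
```

Because the weighted sample summarizes the original points, the centers found on the small layer are already a good starting point for the full data.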

4.
廖勇  陈晔  徐海燕 《工业工程》2015,18(1):119-127
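Based on the development characteristics of small and medium-sized enterprises (SMEs), this paper proposes a two-dimensional model that evaluates SME performance comprehensively along two dimensions, current operating status and development potential, and validates the model with data from 171 SMEs listed on the ChiNext board. Because the number of enterprises to evaluate is large, the paper uses cluster analysis to devise a strategy for selecting typical sample enterprises, and then applies dominance-based rough set theory to learn expert knowledge from the typical sample set. The resulting decision rules for SME performance evaluation are used to classify the performance of all enterprises and to build the two-dimensional evaluation model. The analysis shows that few Chinese SMEs perform well in both current status and development potential, and that development potential is generally insufficient. In addition, using the real data, the paper examines and computes the relationship between the number of training samples and the classification quality of rough set learning, and finds that rough set learning can overfit: increasing the number of training samples does not necessarily improve classification quality.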

5.
Security-sensitive functions are the basis for building a taint-style vulnerability model. Current approaches for extracting security-sensitive functions either do not analyze data flow accurately or do not analyze the patterns of conditions, which leads to a higher false positive or false negative rate and increases the workload of manual confirmation. In this paper, we propose a security-sensitive function mining approach based on precondition pattern analysis. First, we propose an enhanced system dependency graph analysis algorithm that precisely extracts the conditional statements checking function parameters, and we statistically analyze these conditional statements to select candidate security-sensitive functions in the target program. We then adopt a precondition pattern mining method based on normalizing and clustering the conditional statements. Functions with fixed precondition patterns are regarded as security-sensitive functions. Experimental results on four popular open source codebases of different scales show that the proposed approach effectively reduces the false positive and false negative rates when detecting security-sensitive functions.

6.
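A construction method for fuzzy neural networks based on rough set theory   Total citations: 4, self-citations: 0, citations by others: 4
This paper proposes a method that uses rough set theory to design the network structure of a fuzzy neural network. Rough set theory offers strong numerical analysis capability, while fuzzy neural networks offer accurate approximation, good convergence, and high precision, so combining the two yields a neural network model that is easy to interpret, simple to compute, and fast to converge. The construction proceeds as follows: first, rough set theory is used to extract rules from the given data set; next, these rules determine the number of neurons in each layer of the fuzzy neural network and the initial values of the related parameters; finally, the BP algorithm iteratively computes the network parameters, which completes the network design. An example of fitting a two-dimensional nonlinear function further verifies the correctness of the method.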

7.
With the rapid development of mobile communication around the world, the similarity of mobile phone communication data has received wide attention because of its value for building smart cities. Mobile phone communication data can be regarded as a type of time series, and dynamic time warping (DTW) and derivative dynamic time warping (DDTW) are commonly used to analyze the similarity of such data. However, many traditional methods only calculate the distance between time series and neglect their shape characteristics. In this paper, we propose a novel hybrid method that combines dynamic time warping with derivative dynamic time warping. The new method considers both the distance between time series and their shape characteristics. Extensive experiments show that our method outperforms DTW and DDTW with respect to cophenetic correlation.
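
A minimal sketch of how a DTW distance and a DDTW distance might be combined is given below. The DTW recurrence and the derivative estimate are the standard textbook forms; the convex weighting with `alpha` is an assumption made only for illustration, since the abstract does not state how the two distances are combined, and the function names are hypothetical.

```python
def dtw(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def derivative(x):
    """Derivative estimate commonly used for DDTW (needs len(x) >= 3).

    Each interior point averages the slope to its left neighbour and the
    slope through both neighbours; the end points copy their neighbours.
    """
    inner = [((x[i] - x[i - 1]) + (x[i + 1] - x[i - 1]) / 2) / 2
             for i in range(1, len(x) - 1)]
    return [inner[0]] + inner + [inner[-1]]


def hybrid_distance(a, b, alpha=0.5):
    """Assumed combination: alpha * DTW + (1 - alpha) * DDTW."""
    return alpha * dtw(a, b) + (1 - alpha) * dtw(derivative(a), derivative(b))
```

Two series with the same shape but different levels, such as `[0, 1, 2]` and `[5, 6, 7]`, have a positive DTW distance but a zero DDTW distance, so the weight `alpha` controls how much the level difference matters relative to the shape.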

8.
In recent years, mobile Internet technology and location-based services have been widely applied, and application providers and users have accumulated huge amounts of trajectory data. While publishing and analyzing user trajectory data brings great convenience, the risk of disclosing user privacy through trajectory data publishing is becoming more and more prominent. Traditional k-anonymous trajectory data publishing technologies cannot effectively protect user privacy against attackers with strong background knowledge. For privacy-preserving trajectory data publishing, we propose a differential-privacy-based (k-Ψ)-anonymity method to defend against re-identification and probabilistic inference attacks. The proposed method has two phases. In the first phase, a dummy-based (k-Ψ)-anonymous trajectory data publishing algorithm improves (k-δ)-anonymity by considering how the threshold δ changes on different road segments and by constructing an adaptive threshold set Ψ that takes road network information into account. In the second phase, Laplace noise based on the distance between anonymous locations is applied under differential privacy to perturb the anonymous trajectory dataset output by the first phase. Experiments on a real road network dataset show that the proposed method improves trajectory indistinguishability and achieves good data utility while preserving user privacy.
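
The Laplace perturbation used in the second phase can be illustrated with the generic Laplace mechanism. The sketch below is not the paper's algorithm: the `sensitivity` and `epsilon` parameters, the per-coordinate noise, and the function names are assumptions chosen to show only how noise with scale sensitivity/epsilon is added to a location.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_location(point, sensitivity, epsilon, rng):
    """Add independent Laplace noise to every coordinate of a location.

    `sensitivity` and `epsilon` are hypothetical parameters; a smaller
    epsilon means a larger noise scale and stronger privacy.
    """
    scale = sensitivity / epsilon
    return tuple(c + laplace_noise(scale, rng) for c in point)


rng = random.Random(7)                      # fixed seed for repeatability
noisy = perturb_location((120.0, 30.0), sensitivity=1.0, epsilon=0.5, rng=rng)
```

The trade-off described in the abstract appears directly in `scale`: lowering `epsilon` hides the true location better but reduces the utility of the published trajectory.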

9.
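This paper proposes a dynamic dictionary learning algorithm for target recognition in SAR images. During dictionary learning, the algorithm automatically deletes and adds dictionary atoms to adjust both the representation performance and the size of the dictionary. Deletion, constrained by a deletion cost, targets atoms that are highly correlated or rarely used; addition, constrained by an addition cost, is based on the principal components of the residual error of the signal representation. Deletion and addition are performed alternately to keep optimizing the dictionary until its representation ability is best. Experiments on the MSTAR dataset verify the performance of the algorithm, and suggestions for parameter tuning are given. The experimental results and analysis show that the algorithm achieves a high recognition rate and is stable.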

10.
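Customer segmentation is the basis of differentiated marketing in the insurance industry. Because of redundant knowledge, segmenting customers with traditional clustering methods yields low segmentation quality. To segment customers effectively, this paper proposes a clustering model based on attribute reduction and SOM. Processing the data with attribute reduction rules identifies redundant knowledge and finds the key attributes; the key attributes are then used as the input of the SOM neural model to improve segmentation quality. The model is applied to customer segmentation at insurance company H, and a comparison of the clustering results shows that the method is effective.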

11.
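To improve the accuracy of slope estimation during the landing of a Mars probe, this paper studies a slope estimation method based on clustering of three-dimensional point cloud data and a random search for the best-fitting plane. The three-dimensional point cloud data measured by lidar are given a sparse representation, and the sparse coefficients are used to cluster and segment the data points into subspaces. A plane is fitted to the data points in each subspace, and the best-fitting plane is found by random search. The angle between plane normal vectors is then computed from the best-fitting plane; this angle is numerically equal to the slope angle, which completes the slope estimation. Experiments show that the method estimates slope fairly accurately and has a smaller relative error than commonly used slope estimation methods.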
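
The plane-fitting and normal-angle steps can be sketched as below. This is an illustrative RANSAC-style random search under stated assumptions: the inlier tolerance `tol`, the number of `trials`, the vertical reference direction, and all function names are hypothetical, and the sparse-representation clustering step of the paper is not included.

```python
import math
import random


def plane_normal(p, q, r):
    """Unit normal of the plane through three points, or None if collinear."""
    u = (q[0] - p[0], q[1] - p[1], q[2] - p[2])
    v = (r[0] - p[0], r[1] - p[1], r[2] - p[2])
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    norm = math.sqrt(sum(c * c for c in n))
    if norm < 1e-12:
        return None
    return tuple(c / norm for c in n)


def random_search_plane(points, trials=200, tol=0.05, seed=0):
    """Try random point triples and keep the plane with the most inliers."""
    rng = random.Random(seed)
    best_normal, best_inliers = None, -1
    for _ in range(trials):
        p, q, r = rng.sample(points, 3)
        n = plane_normal(p, q, r)
        if n is None:
            continue
        d = -sum(n[i] * p[i] for i in range(3))
        inliers = sum(
            1 for s in points
            if abs(sum(n[i] * s[i] for i in range(3)) + d) <= tol)
        if inliers > best_inliers:
            best_normal, best_inliers = n, inliers
    return best_normal


def slope_deg(normal, reference=(0.0, 0.0, 1.0)):
    """Angle in degrees between the plane normal and a unit reference normal."""
    cos_angle = abs(sum(a * b for a, b in zip(normal, reference)))
    return math.degrees(math.acos(min(1.0, cos_angle)))
```

For points sampled from the plane z = x, the recovered normal makes a 45 degree angle with the vertical, which matches the slope of that surface.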

12.
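Two-dimensional linear discriminant analysis (2DLDA) is a feature extraction method that works directly on matrices. It skips the step, required by traditional linear discriminant analysis based on the Fisher criterion, of first converting a two-dimensional matrix into a one-dimensional vector; this speeds up feature extraction, avoids the small-sample-size problem, and gives a higher recognition rate than the traditional Fisherface method. By incorporating fuzzy set theory, this paper proposes a new 2DLDA algorithm, fuzzy 2DLDA (F2DLDA). The FKNN algorithm is first used to obtain the sample distribution information, which is then incorporated into the feature extraction process according to its contribution to the resulting feature vectors, yielding an effective set of sample feature vectors. Experiments show that the F2DLDA algorithm outperforms the traditional 2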

13.
Electromagnetic emissions are radiated from every part of a personal computer motherboard and produce electromagnetic interference (EMI). EMI adversely affects the surrounding environment because it can cause malfunctions or fatal problems in other digital devices. EMI engineers diagnose motherboard EMI problems using electromagnetic noise data measured by a spectrum analyzer. Finding the sources of electromagnetic noise (e.g., PS2, USB, VGA) is a time-consuming process. In this study, attribute selection and fault diagnosis were developed on the basis of rough set theory (RST). RST is a data mining approach for dealing with vagueness and uncertainty, and it can be used to find hidden patterns in data sets. The basic concepts of rough set theory are introduced. The rough set approach makes it possible to discover the minimal subsets of condition attributes associated with the motherboard EMI fault diagnosis problem. The operating sequence consists of data collection, data preprocessing, discretization, attribute reduction, reduction filtering, rule generation, and evaluation of classification accuracy. Historical EMI noise data, collected from a well-known motherboard company in Taiwan, were used to generate diagnostic rules. The results (average diagnostic accuracy above 80%) show that the RST model is a promising approach for EMI diagnostic support systems.
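
The search for minimal subsets of condition attributes can be sketched with the standard rough set dependency degree. The decision table below is invented for illustration (the attribute names `band`, `level`, `harmonic` and the values are hypothetical, not the paper's data), and the exhaustive search over attribute subsets is only practical for small tables.

```python
from collections import defaultdict
from itertools import combinations


def dependency(rows, attrs, decision):
    """Fraction of rows whose decision is determined by `attrs`.

    Rows with equal values on `attrs` are indiscernible; a block of such
    rows belongs to the positive region when all its decisions agree.
    """
    blocks = defaultdict(set)
    for r in rows:
        blocks[tuple(r[a] for a in attrs)].add(r[decision])
    consistent = sum(
        1 for r in rows if len(blocks[tuple(r[a] for a in attrs)]) == 1)
    return consistent / len(rows)


def minimal_reducts(rows, attrs, decision):
    """Smallest attribute subsets that keep the full dependency degree."""
    full = dependency(rows, attrs, decision)
    for size in range(1, len(attrs) + 1):
        found = [c for c in combinations(attrs, size)
                 if dependency(rows, c, decision) == full]
        if found:
            return found
    return []


# Hypothetical discretized noise measurements and their fault source.
table = [
    {"band": "low",  "level": "high", "harmonic": "yes", "source": "USB"},
    {"band": "low",  "level": "low",  "harmonic": "no",  "source": "USB"},
    {"band": "high", "level": "high", "harmonic": "yes", "source": "VGA"},
    {"band": "high", "level": "low",  "harmonic": "yes", "source": "VGA"},
]
reducts = minimal_reducts(table, ["band", "level", "harmonic"], "source")
```

In this toy table the `band` attribute alone separates the two sources, so it forms the only minimal reduct and the other two attributes could be dropped before rule generation.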

14.
To acquire news related to non-ferrous metals from websites in different countries, we propose a cross-lingual recognition method for non-ferrous metals news based on a CNN and a limited bilingual dictionary. First, because language resources related to non-ferrous metals are scarce, we use a limited bilingual dictionary and CCA to learn cross-lingual word vectors and to represent news in different languages uniformly. Then, to improve recognition, we use a variant of the CNN to learn recognition features and build the recognition model. The experimental results show that the proposed method achieves better results.

15.
张旭梅  石瀚凌 《工业工程》2011,14(6):126-132
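Customer churn analysis has to deal with large volumes of real customer samples and an imbalanced distribution of churned and retained customers. To address this, the paper proposes an ensemble method based on Boosting and cost-sensitive decision trees and applies it to customer churn analysis for the personal wealth management business of commercial banks. Tests on a real commercial bank customer dataset, together with comparisons against support vector machines, artificial neural networks, and logistic regression, show that the method handles the customer churn problem effectively.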

16.
Traditional distributed denial of service (DDoS) detection methods need a lot of computing resources, and many of them rely on a single element and therefore have high missing and false alarm rates. To solve these problems, this paper proposes a CNN-based DDoS attack information fusion method for multi-element data. First, according to the distribution, concentration, and abrupt high traffic of DDoS attacks, the paper defines six features obtained from six elements: source IP address, destination IP address, source port, destination port, packet size, and number of IP packets. We then propose a feature weight calculation algorithm based on principal component analysis to measure the importance of different features in different network environments. The weighted multi-element feature fusion algorithm proposed in this paper fuses the different features into a multi-element fusion feature (MEFF) value. Finally, DDoS attack information fusion classification models are built on the MEFF time series using a convolutional neural network and a support vector machine, respectively. Experimental results show that the proposed information fusion method fuses multi-element data effectively, reduces the missing rate, total error rate, memory consumption, and running time, and improves the detection rate.

17.
With the popularity of sensor-rich mobile devices, mobile crowdsensing (MCS) has emerged as an effective method for data collection and processing. However, MCS platforms usually need workers' precise locations for optimal task execution and collect sensing data from workers, which raises severe concerns about privacy leakage. To protect workers' locations and sensing data from an untrusted MCS platform, this paper proposes a differentially private data aggregation method based on worker partition and location obfuscation (the DP-DAWL method). The DP-DAWL method first uses an improved K-means algorithm to divide workers into groups and assigns a different privacy budget to each group according to its size (the number of workers). Each worker's location is then obfuscated, and his or her sensing data are perturbed with Laplace noise before being uploaded to the platform. In the data aggregation stage, the DP-DAWL method adopts an improved Kalman filter algorithm to filter out the noise (both the noise added to the sensing data and the system noise in the sensing process). By using an optimal estimate of the noisy aggregated sensing data, the platform obtains better utility from the aggregated data while preserving workers' privacy. Extensive experiments on synthetic datasets demonstrate the effectiveness of the proposed method.
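
The noise-filtering step can be illustrated with a textbook scalar Kalman filter. This sketch is the standard filter for a slowly varying quantity, not the improved variant the paper describes; the parameter names, the default `process_var`, and the initial state are assumptions. For Laplace noise with scale b, the measurement variance would be 2b².

```python
def kalman_1d(measurements, meas_var, process_var=1e-4, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a nearly constant sensed value.

    meas_var    : variance of the measurement noise
    process_var : variance of the assumed drift between two readings
    x0, p0      : initial estimate and its variance
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + process_var       # predict: uncertainty grows slightly
        k = p / (p + meas_var)    # Kalman gain: trust in the new reading
        x = x + k * (z - x)       # update the estimate toward the reading
        p = (1.0 - k) * p         # uncertainty shrinks after the update
        estimates.append(x)
    return estimates


# Repeated readings of a quantity whose true value is 5.0.
estimates = kalman_1d([5.0] * 50, meas_var=2.0)
```

A larger `meas_var` lowers the gain, so each noisy reading moves the estimate less; this is how the filter averages out the noise added for privacy while still tracking the underlying value.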

18.
This study presents a size-dependent computational approach for bending, free vibration, and buckling analyses of isotropic and sandwich functionally graded (FG) microplates. We consider both shear deformation and small-scale effects through the generalized higher-order shear deformation theory and the modified couple stress theory (MCST). The present model retains only a single material length scale parameter to capture size effects properly. A rule of mixture is used to model material properties that vary through the thickness of the plates. The principle of virtual work is used to derive the discrete system equations, which are approximated by the moving Kriging interpolation (MKI) meshfree method. Numerical examples consider the influence of geometrical parameters, volume fraction, boundary conditions, and the material length scale parameter. Numerical results confirm the reliability and effectiveness of the present method.
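
The rule of mixture for through-thickness material properties can be sketched as follows. The sketch assumes the power-law ceramic volume fraction that is widely used for FG plates, V_c(z) = (1/2 + z/h)^n; the abstract does not state which distribution the paper uses, and the function names and the modulus values in the example are hypothetical.

```python
def ceramic_fraction(z, h, n):
    """Assumed power-law ceramic volume fraction at height z.

    z runs from -h/2 (bottom surface) to +h/2 (top surface) and
    n >= 0 is the power-law index.
    """
    return (0.5 + z / h) ** n


def effective_property(z, h, n, p_ceramic, p_metal):
    """Rule of mixture: volume-weighted average of the two constituents."""
    vc = ceramic_fraction(z, h, n)
    return p_ceramic * vc + p_metal * (1.0 - vc)


# Hypothetical Young's moduli in GPa for a ceramic/metal pair.
e_top = effective_property(0.5, 1.0, 2.0, p_ceramic=380.0, p_metal=70.0)
e_mid = effective_property(0.0, 1.0, 1.0, p_ceramic=380.0, p_metal=70.0)
e_bottom = effective_property(-0.5, 1.0, 2.0, p_ceramic=380.0, p_metal=70.0)
```

With this distribution the top surface is pure ceramic, the bottom surface is pure metal, and the index n controls how quickly the property changes through the thickness.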
