Found 18 similar documents; search took 15 ms.
1.
Lingyun Xiang, Guohan Zhao, Qian Li, Gwang-Jun Kim, Osama Alfarraj, Amr Tolba. Computers, Materials & Continua, 2021, 67(1): 267-282
Multiple kernel clustering is an unsupervised data analysis method that has been used in various scenarios where data are easy to collect but hard to label. However, multiple kernel clustering for incomplete data is a critical yet challenging task. Although existing absent multiple kernel clustering methods have achieved remarkable performance on this task, they may fail when the data have a high missing-value rate, and they can easily fall into a local optimum. To address these problems, we propose an absent multiple kernel clustering (AMKC) method for incomplete data. The AMKC method first clusters the initialized incomplete data. Then, it constructs a new multiple-kernel-based data space, referred to as K-space, from multiple sources to learn kernel combination coefficients. Finally, it seamlessly integrates an incomplete-kernel-imputation objective, a multiple-kernel-learning objective, and a kernel-clustering objective to achieve absent multiple kernel clustering. The three stages in this process are carried out simultaneously until the convergence condition is met. Experiments on six datasets with various characteristics demonstrate that the kernel imputation and clustering performance of the proposed method is significantly better than that of state-of-the-art competitors. The proposed method also converges quickly.
2.
3.
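K-means is a widely used clustering algorithm, but initializing the cluster centers is one of its difficulties. The author proposes an initialization method based on a hierarchical idea. A general clustering problem can be viewed as weighted clustering: the data volume is reduced by sampling layer by layer, and then, in a top-down manner from the final sampling layer to the original data layer, clustering is performed at every layer, with each layer's initial cluster centers obtained by converting the cluster centers of the layer above. Repeating this process down to the original data layer yields the initial cluster centers for the original data. Experiments on simulated and real data both show that K-means with hierarchical-sampling initialization converges quickly, produces high-quality clusters, and is insensitive to noise, and that it clearly outperforms existing related algorithms.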
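The layer-by-layer idea in this abstract can be illustrated with a minimal sketch. It is not the author's algorithm: the abstract does not specify the sampling scheme, the weighting, or how upper-layer centers are converted, so this sketch assumes random halving of the data at each layer, ignores weights, and carries centers over unchanged. The names `lloyd`, `hierarchical_init`, `min_size`, and `seed` are hypothetical.

```python
import random

def lloyd(points, centers, max_iter=100):
    """Plain Lloyd (K-means) iterations starting from the given centers."""
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])),
            )
            groups[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def hierarchical_init(points, k, min_size=20, seed=0):
    """Return initial centers for `points` by clustering sampled layers top-down."""
    rng = random.Random(seed)
    levels = [list(points)]
    # Assumption: each layer is a random half of the layer below it.
    while len(levels[-1]) // 2 >= max(min_size, k):
        levels.append(rng.sample(levels[-1], len(levels[-1]) // 2))
    # Start from k random points of the smallest layer.
    centers = rng.sample(levels[-1], k)
    # Cluster every sampled layer, smallest first; each result seeds the next layer.
    for level in reversed(levels[1:]):
        centers = lloyd(level, centers)
    return centers

# Usage: two synthetic 2-D blobs; the returned centers seed K-means on the full data.
rng = random.Random(1)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
data += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(100)]
init = hierarchical_init(data, k=2)
final = lloyd(data, init)
```

Because most of the iterations run on the small sampled layers, the full data set is only clustered once, from centers that should already be close to their final positions.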
4.
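Based on the development characteristics of small and medium-sized enterprises (SMEs), a two-dimensional model is proposed that evaluates SME performance along two dimensions, current operating status and development potential, and the model is validated using data from 171 SMEs listed on the ChiNext (Growth Enterprise Market) board. Because the volume of enterprise data to be evaluated is large, the paper uses cluster analysis to propose a strategy for selecting typical sample enterprises, then applies dominance-based rough set theory to learn expert knowledge from the typical sample set, forming decision rules for SME performance evaluation. All enterprises are then classified by performance to construct the two-dimensional evaluation model. The analysis shows that few Chinese SMEs perform well in both current status and development potential, and that the enterprises' development potential is insufficient. In addition, based on actual data, the paper discusses and computes the relationship between the number of training samples and the classification quality of rough set learning, and finds that overfitting occurs in rough set classification learning: increasing the number of training samples does not necessarily improve classification quality.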
5.
Security-sensitive functions are the basis for building a taint-style vulnerability model. Current approaches for extracting security-sensitive functions either do not analyze data flow accurately or do not analyze condition patterns, resulting in higher false positive or false negative rates, which increases the manual confirmation workload. In this paper, we propose a security-sensitive function mining approach based on precondition pattern analysis. First, we propose an enhanced system dependency graph analysis algorithm that precisely extracts the conditional statements that check function parameters and performs statistical analysis of those statements to select candidate security-sensitive functions in the target program. Then we adopt a precondition pattern mining method based on normalizing and clustering the conditional statements. Functions with fixed precondition patterns are regarded as security-sensitive functions. Experimental results on four popular open source codebases of different scales show that the proposed approach is effective in reducing the false positive and false negative rates when detecting security-sensitive functions.
6.
A Method for Constructing Fuzzy Neural Networks Based on Rough Set Theory. Cited by: 4 (self-citations: 0, citations by others: 4)
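A method is proposed for designing the structure of a fuzzy neural network using rough set theory. Because rough set theory has strong numerical analysis capability, while fuzzy neural networks offer accurate approximation and convergence with high precision, combining the two yields a neural network model that is easy to interpret, computationally simple, and fast to converge. The main steps of the construction method are as follows: first, use rough set theory to extract rules from the given data set; then, based on these rules, determine the number of neurons in each layer of the fuzzy neural network and the initial values of the relevant parameters; finally, use the BP algorithm to iteratively solve for the network parameters, completing the network design. An example of fitting a two-dimensional nonlinear function is given, which further verifies the correctness of the method.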
7.
With the rapid development of mobile communication all over the world, the similarity of mobile phone communication data has received wide attention because of its value for the construction of smart cities. Mobile phone communication data can be regarded as a type of time series, and dynamic time warping (DTW) and derivative dynamic time warping (DDTW) are commonly used to analyze the similarity of these data. However, many traditional methods only calculate the distance between time series while neglecting their shape characteristics. In this paper, a novel hybrid method based on the combination of dynamic time warping and derivative dynamic time warping is proposed. The new method considers not only the distance between time series but also their shape characteristics. Extensive experiments demonstrate that our method outperforms DTW and DDTW with respect to cophenetic correlation.
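To make the two ingredients concrete, here is a minimal sketch of DTW, DDTW, and one possible way to blend them. The abstract does not say how the two distances are combined, so the weighted sum and its `alpha` parameter are assumptions for illustration only, as are the function names. The derivative estimate is the standard one from the original DDTW formulation.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def derivative(series):
    """DDTW derivative estimate for interior points (needs at least 3 samples)."""
    return [
        ((series[i] - series[i - 1]) + (series[i + 1] - series[i - 1]) / 2) / 2
        for i in range(1, len(series) - 1)
    ]

def hybrid_distance(a, b, alpha=0.5):
    """Assumed blend: alpha weights the value distance, (1 - alpha) the shape distance."""
    value_part = dtw_distance(a, b)
    shape_part = dtw_distance(derivative(a), derivative(b))
    return alpha * value_part + (1 - alpha) * shape_part

# Usage: two series with the same shape but different levels.
x = [0, 1, 2, 3, 4]
y = [5, 6, 7, 8, 9]
d = hybrid_distance(x, y, alpha=0.5)
```

In this usage example the shape term is zero because both series rise at the same rate, so only the value term contributes to the hybrid distance.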
8.
In recent years, mobile Internet technology and location-based services have been widely applied, and application providers and users have accumulated huge amounts of trajectory data. While publishing and analyzing user trajectory data has brought great convenience, the privacy disclosure risks caused by trajectory data publishing are becoming more and more prominent. Traditional k-anonymous trajectory data publishing technologies cannot effectively protect user privacy against attackers with strong background knowledge. For privacy-preserving trajectory data publishing, we propose a differential privacy based (k-Ψ)-anonymity method to defend against re-identification and probabilistic inference attacks. The proposed method is divided into two phases. In the first phase, a dummy-based (k-Ψ)-anonymous trajectory data publishing algorithm is given, which improves (k-δ)-anonymity by considering changes of the threshold δ on different road segments and constructing an adaptive threshold set Ψ that takes road network information into account. In the second phase, Laplace noise calibrated to the distance between anonymous locations under differential privacy is used to perturb the anonymous trajectory dataset output by the first phase. Experiments on a real road network dataset show that the proposed method improves trajectory indistinguishability and achieves good data utility while preserving user privacy.
9.
10.
11.
12.
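Two-dimensional linear discriminant analysis (2DLDA) is a feature extraction method that operates directly on matrices. It skips the step, required by traditional linear discriminant analysis based on the Fisher discriminant criterion, of first converting the two-dimensional matrix into a one-dimensional vector, which effectively speeds up feature extraction and avoids the small-sample-size problem; its recognition rate is higher than that of the traditional Fisherface method. Combining fuzzy set theory, a new 2DLDA algorithm is proposed: the fuzzy 2DLDA (F2DLDA) algorithm. The FKNN algorithm is first used to obtain the corresponding sample distribution information, which is incorporated into the feature extraction process according to its contribution to the final feature vectors, yielding an effective set of sample feature vectors. Experiments show that the F2DLDA algorithm outperforms the traditional 2…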
13.
Attribute Selection Based on Rough Set Theory for Electromagnetic Interference (EMI) Fault Diagnosis
Electromagnetic emissions are radiated from every part of a personal computer motherboard, thus producing electromagnetic interference (EMI). EMI has an adverse effect on the surrounding environment because it can cause malfunctions or fatal problems in other digital devices. EMI engineers diagnose motherboard EMI problems using electromagnetic noise data measured by a spectrum analyzer. Finding the sources (e.g., PS2, USB, VGA) of electromagnetic noise is a time-consuming process. The attribute selection and fault diagnosis approach was developed based on the advantages of rough set theory (RST). RST is a novel data mining approach for dealing with vagueness and uncertainty, and it can be used to find hidden patterns in data sets. In this study, the basic rough set theory concepts are introduced. The rough set approach enables one to discover the minimal subsets of condition attributes associated with the motherboard EMI fault diagnosis problem. The operating sequence includes data collection, data preprocessing, discretization, attribute reduction, reduction filtering, rule generation, and classification accuracy. Historical EMI noise data, collected from a well-known motherboard company in Taiwan, were used to generate diagnostic rules. Our research result (average diagnostic accuracy above 80%) shows that the RST model is a promising approach for EMI diagnostic support systems.
14.
To acquire news related to non-ferrous metals from internet sources in different countries, we propose a cross-lingual method for recognizing such news, based on a CNN with a limited bilingual dictionary. First, considering the lack of language resources related to non-ferrous metals, we use a limited bilingual dictionary and CCA to learn cross-lingual word vectors and to represent news in different languages uniformly. Then, to improve recognition, we use a variant of the CNN to learn recognition features and construct the recognition model. The experimental results show that our proposed method achieves better results.
15.
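Customer churn analysis is characterized by large volumes of real customer sample data and an imbalanced distribution of churned and non-churned customer samples. To address these characteristics, an ensemble method based on Boosting and cost-sensitive decision trees is proposed and applied to customer churn analysis for the personal wealth management business of commercial banks. Tests on a real commercial bank customer data set, with comparisons against support vector machines, artificial neural networks, and logistic regression, show that the method can effectively address the customer churn problem.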
16.
Jieren Cheng, Canting Cai, Xiangyan Tang, Victor S. Sheng, Wei Guo, Mengyang Li. Computers, Materials & Continua, 2020, 63(1): 131-150
Traditional distributed denial of service (DDoS) detection methods require substantial computing resources, and many of those based on a single element have high missing and false alarm rates. To solve these problems, this paper proposes a DDoS attack information fusion method based on CNN for multi-element data. First, according to the distribution, concentration, and high traffic abruptness of DDoS attacks, this paper defines six features, obtained respectively from the source IP address, destination IP address, source port, destination port, packet size, and number of IP packets. Then, we propose a feature weight calculation algorithm based on principal component analysis to measure the importance of different features in different network environments. The weighted multi-element feature fusion algorithm proposed in this paper is used to fuse the features and obtain the multi-element fusion feature (MEFF) value. Finally, DDoS attack information fusion classification models are built with a convolutional neural network and a support vector machine, respectively, based on the MEFF time series. Experimental results show that the proposed information fusion method can effectively fuse multi-element data; reduce the missing rate, total error rate, memory consumption, and running time; and improve the detection rate.
17.
With the popularity of sensor-rich mobile devices, mobile crowdsensing (MCS) has emerged as an effective method for data collection and processing. However, MCS platforms usually need workers’ precise locations for optimal task execution and collect sensing data from workers, which raises severe concerns about privacy leakage. To protect workers’ locations and sensing data from the untrusted MCS platform, this paper proposes a differentially private data aggregation method based on worker partition and location obfuscation (the DP-DAWL method). The DP-DAWL method first uses an improved K-means algorithm to divide workers into groups and assigns a different privacy budget to each group according to group size (the number of workers). Then each worker’s location is obfuscated, and his or her sensing data are perturbed by adding Laplace noise before being uploaded to the platform. In the data aggregation stage, the DP-DAWL method adopts an improved Kalman filter algorithm to filter out the noise (both the noise added to the sensing data and the system noise in the sensing process). By using an optimal estimate of the noisy aggregated sensing data, the platform can obtain better utility from the aggregated data while preserving workers’ privacy. Extensive experiments on synthetic datasets demonstrate the effectiveness of the proposed method.
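The Laplace perturbation step mentioned in this abstract can be sketched as follows. This shows only the generic Laplace mechanism, not the DP-DAWL method itself: the abstract does not give the rule that maps group size to privacy budget, so the group names, budget values, and sensitivity below are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb_reading(value, sensitivity, epsilon, rng):
    """Laplace mechanism: the noise scale is sensitivity / epsilon."""
    return value + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(42)
# Hypothetical per-group privacy budgets; a smaller epsilon means more noise.
budgets = {"large_group": 1.0, "small_group": 0.2}
noisy = {
    group: perturb_reading(20.0, sensitivity=1.0, epsilon=eps, rng=rng)
    for group, eps in budgets.items()
}
```

A smaller budget gives stronger privacy at the cost of noisier uploads, which is why the abstract pairs the perturbation with a filtering step at aggregation time.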
18.
A size-dependent computational approach for bending, free vibration, and buckling analyses of isotropic and sandwich functionally graded (FG) microplates is presented in this study. We consider both shear deformation and small-scale effects through the generalized higher-order shear deformation theory and the modified couple stress theory (MCST). The present model retains only a single material length scale parameter to properly capture size effects. A rule of mixture is used to model material properties that vary through the thickness of the plates. The principle of virtual work is used to derive the discrete system equations, which are approximated by the moving Kriging interpolation (MKI) meshfree method. Numerical examples examine the influence of geometrical parameters, volume fraction, boundary conditions, and the material length scale parameter. The reliability and effectiveness of the present method are confirmed through numerical results.