Similar Documents
1.
To address the reduced generalization ability and low accuracy of traditional convolutional neural networks on multi-sensor fingerprint recognition, an improved Stacking ensemble learning algorithm is proposed. First, AlexNet is improved: depthwise separable convolutions are introduced to reduce the number of parameters and speed up training; spatial pyramid pooling is introduced to improve the network's ability to capture global information; batch normalization is introduced to accelerate convergence and improve accuracy on the test set; and the fully connected layers are replaced with global average pooling to prevent overfitting. Then, two convolutional neural networks, DenseNet and the improved AlexNet, are used as the base learners of the Stacking ensemble to classify fingerprints and produce predictions. Finally, for the models trained from the same base learner, each model's predictions are weighted according to its prediction accuracy, and the weighted predictions are then classified by a meta-classifier. Experiments with the improved Stacking algorithm on a multi-sensor fingerprint database achieve a final recognition accuracy of 98.43%, an improvement of 20.05% over AlexNet and 4.25% over DenseNet.
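A minimal sketch of the accuracy-weighted stacking step described above, assuming simple scikit-learn classifiers stand in for the DenseNet and improved-AlexNet base learners; the synthetic data, model choices and single held-out fold are illustrative, not the paper's setup.

```python
# Accuracy-weighted stacking sketch: two base learners stand in for the
# DenseNet / improved-AlexNet CNNs; their class probabilities are weighted by
# validation accuracy before a meta-classifier combines them.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=30, n_informative=10,
                           n_classes=3, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

base_learners = [RandomForestClassifier(random_state=0),
                 LogisticRegression(max_iter=1000)]
weights, val_probs = [], []
for clf in base_learners:
    clf.fit(X_tr, y_tr)
    proba = clf.predict_proba(X_val)
    weights.append(accuracy_score(y_val, proba.argmax(axis=1)))  # weight = validation accuracy
    val_probs.append(proba)

# Weight each base learner's predictions, then train the meta-classifier on them
# (a proper setup would use out-of-fold predictions rather than a single split).
meta_features = np.hstack([w * p for w, p in zip(weights, val_probs)])
meta = LogisticRegression(max_iter=1000).fit(meta_features, y_val)
print("base-learner weights:", [round(w, 3) for w in weights])
```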

2.
Software defect prediction is a hot research topic in software quality assurance, and the quality of a defect prediction model is closely related to its training data. Datasets used for defect prediction mainly suffer from two problems: the selection of data features and class imbalance. For feature selection, commonly used software development process features and newly proposed extended process features are adopted, and a clustering-based feature selection algorithm is then applied. For class imbalance, an improved Borderline-SMOTE oversampling method is proposed so that the numbers of positive and negative samples in the training set become relatively balanced and the features of synthetic samples better match those of real samples. Experiments on open-source datasets from projects such as bugzilla and jUnit show that the adopted feature selection algorithm reduces model training time by 57.94% while preserving the model's F-measure; the defect prediction model built from samples processed with the improved Borderline-SMOTE method improves Precision, Recall, F-measure and AUC by an average of 2.36, 1.8, 2.13 and 2.36 percentage points, respectively, over the original method; the model built with the extended process features improves F-measure by an average of 3.79% over the model without them; and compared with models built by methods in the literature, the model built with the proposed method improves F-measure by an average of 15.79%. The results demonstrate that the proposed method can effectively improve the quality of defect prediction models.
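A minimal sketch of the oversampling step, using the standard Borderline-SMOTE from imbalanced-learn rather than the paper's improved variant; the synthetic imbalanced data are placeholders for the defect-prediction features.

```python
# Oversampling sketch with the standard Borderline-SMOTE: minority samples are
# synthesized near the class boundary so the training set becomes balanced.
from collections import Counter
from imblearn.over_sampling import BorderlineSMOTE
from sklearn.datasets import make_classification

# Imbalanced synthetic data standing in for the defect-prediction features.
X, y = make_classification(n_samples=1000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
print("before oversampling:", Counter(y))

smote = BorderlineSMOTE(kind="borderline-1", k_neighbors=5, random_state=0)
X_res, y_res = smote.fit_resample(X, y)
print("after oversampling: ", Counter(y_res))
```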

4.
陈文兵, 管正雄, 陈允杰. 《计算机应用》, 2018, 38(11): 3305-3311
Deep convolutional neural networks (CNN) trained on large-scale labeled datasets can achieve high recognition rates or good classification results, whereas training a CNN model on a smaller dataset usually leads to overfitting. To address this problem, a data augmentation method integrating a Gaussian mixture model (GMM) and a conditional generative adversarial network (CGAN), denoted GMM-CGAN, was proposed. First, the number of samples was increased by randomly sliding sampling windows around the core region. Second, the random noise vector was assumed to follow the distribution described by the GMM and used as the initial input of the CGAN generator, with the image label as the CGAN condition, and the parameters of both the CGAN and the GMM were trained. Finally, the trained CGAN was used to generate a new dataset conforming to the true sample distribution. Applying GMM-CGAN to a benchmark set of weather situation maps containing 386 samples of 12 fog types expanded the dataset to 38,600 samples. Compared with CNN models trained on datasets augmented only by affine transformations or by the plain CGAN method, the CNN model trained on the GMM-CGAN dataset improved average classification accuracy by 18.2% and 14.1% respectively, reaching 89.1%.
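A minimal sketch of the GMM-CGAN idea, assuming PyTorch and scikit-learn: noise for a conditional generator is drawn from a Gaussian mixture instead of a single Gaussian. The generator architecture, dimensions and the data the GMM is fitted to are illustrative, and no adversarial training loop is shown.

```python
# GMM-CGAN sketch: the conditional generator's noise is drawn from a Gaussian
# mixture rather than a single Gaussian; the label is fed in as the condition.
import numpy as np
import torch
import torch.nn as nn
from sklearn.mixture import GaussianMixture

latent_dim, n_classes, img_dim = 64, 12, 28 * 28

# Fit a GMM standing in for the learned noise distribution (random data here).
gmm = GaussianMixture(n_components=4, random_state=0)
gmm.fit(np.random.randn(500, latent_dim))

class CondGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(n_classes, latent_dim)      # label conditioning
        self.net = nn.Sequential(nn.Linear(latent_dim * 2, 256), nn.ReLU(),
                                 nn.Linear(256, img_dim), nn.Tanh())

    def forward(self, z, labels):
        return self.net(torch.cat([z, self.embed(labels)], dim=1))

G = CondGenerator()
z = torch.tensor(gmm.sample(8)[0], dtype=torch.float32)       # GMM-distributed noise
labels = torch.randint(0, n_classes, (8,))
fake = G(z, labels)                                           # 8 synthetic images
print(fake.shape)                                             # torch.Size([8, 784])
```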

5.
Reservoir lithology classification is fundamental to geological research. Although data-driven machine learning models can identify reservoir lithology fairly well, well-log data are a special kind of sequence data, and such models have difficulty effectively extracting their spatial correlations, so reservoir identification remains imperfect. To address this problem, this paper combines a bidirectional long short-term memory network (BiLSTM) and extreme gradient boosting decision trees (XGBoost) into a bidirectional-memory extreme gradient boosting (BiLSTM-XGBoost, BiXGB) model for predicting reservoir lithology. The model incorporates BiLSTM into traditional XGBoost, greatly enhancing its ability to extract features from well-log data: BiLSTM extracts features from the logs, and the extracted features are passed to the XGBoost classifier for training and prediction. Applied to a reservoir lithology dataset, the model achieves an overall prediction accuracy of 91%. To further verify its accuracy and stability, the model was applied to the public UCI Occupancy sequence dataset, where its overall accuracy reached 93%. Compared with other machine learning models, BiXGB classifies sequence data accurately, improves reservoir lithology identification accuracy, meets the practical needs of oil and gas exploration, and provides a new method for reservoir lithology identification.
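A minimal sketch of the BiXGB pipeline, assuming PyTorch and the xgboost package: a BiLSTM encodes windows of well-log curves and XGBoost classifies the encoded features. The synthetic data, window length and hyperparameters are illustrative, and the encoder is left untrained here.

```python
# BiXGB sketch: a BiLSTM encodes windows of well-log curves, and XGBoost
# classifies the encoded feature vectors.
import numpy as np
import torch
import torch.nn as nn
from xgboost import XGBClassifier

n_samples, seq_len, n_curves, n_classes = 500, 16, 8, 4
X = torch.randn(n_samples, seq_len, n_curves)      # synthetic log windows
y = np.random.randint(0, n_classes, n_samples)     # synthetic lithology labels

class BiLSTMEncoder(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_curves, hidden, batch_first=True, bidirectional=True)

    def forward(self, x):
        out, _ = self.lstm(x)
        return out[:, -1, :]                       # last-step bidirectional features

encoder = BiLSTMEncoder()
with torch.no_grad():                              # encoder left untrained in this sketch
    features = encoder(X).numpy()

clf = XGBClassifier(n_estimators=100, max_depth=4)
clf.fit(features, y)
print("training accuracy:", clf.score(features, y))
```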

6.
End-to-end dual-channel feature recalibration DenseNet for image classification
Objective: To address the shortcoming that the densely connected convolutional network (DenseNet) does not fully consider channel feature correlation or inter-layer feature correlation, this paper combines a soft attention mechanism and proposes an end-to-end dual-channel feature recalibration DenseNet. Method: The proposed network performs both channel feature recalibration and inter-layer feature recalibration of DenseNet. Methods for channel and inter-layer feature recalibration are given, and an end-to-end dual-channel feature recalibration densely connected network is constructed in which the output feature maps of each convolutional layer pass through two channels that perform channel feature recalibration and inter-layer feature recalibration respectively, after which the two recalibrated feature maps are fused. Results: To verify the effectiveness and adaptability of the method on different image classification datasets, experiments were conducted on the CIFAR-10/100 image classification datasets and the MORPH and Adience face-age datasets; classification accuracy was improved, and the parameter counts and training and testing times were analyzed to verify practicality. Compared with DenseNet, the 40-layer and 64-layer dual feature reweight DenseNet (DFR-DenseNet) increases the parameter count by only 1.87% and 1.23% while reducing the error rate by 12% and 9.11% on CIFAR-10, and by 5.56% and 5.41% on CIFAR-100; compared with the 121-layer DFR-DenseNet, the mean absolute error (MAE) on the MORPH dataset is reduced by 7.33%, and the age-group estimation accuracy on the Adience dataset is improved by 2%; compared with the multiple feature reweight DenseNet (MFR-DenseNet), DFR-DenseNet halves the parameter count and shortens test time to about 61% of MFR-DenseNet's. Conclusion: The experimental results show that the proposed end-to-end dual-channel feature recalibration DenseNet strengthens the network's learning ability, improves image classification accuracy, and has adaptability and practicality across different image classification datasets.

7.
Xie Jiucheng, Pun Chi-Man, Pan Zhaoqing, Gao Hao, Wang Baoyun. Neural Processing Letters, 2019, 50(1): 263-282
This paper proposes an adaptive-learning solution based on a CNN model. The proposed method automatically updates the recognition model using an online training dataset accumulated directly from the system and then retrains the recognition model. The data-updating task focuses on samples that are less similar to previously trained ones. The purpose of this solution is to upgrade the model to a more adaptive one, with the expectation of reaching higher accuracy. In the adaptive learning approach, the recognition system is capable of self-learning and supplementing its data without experts being needed for data labeling or training. The proposed solution includes five main phases: (1) detect and recognize low-confidence objects; (2) track those objects over the next n frames to determine whether they are objects of interest; (3) if an object is eventually recognized with high confidence, assign its class label to the corresponding low-confidence samples tracked in the previous step, and if an object is determined not to be of interest, label all samples tracked over the previous n frames as negative; (4) initialize a training dataset from a selective combination of previously trained data and the new data; (5) retrain and update the model if this results in higher accuracy. We conducted experiments comparing the proposed model, PDnet, with state-of-the-art methods such as AlexNet and VGG. The experimental results demonstrate that the proposed method achieves higher accuracy as the model self-learns over time. Moreover, our adaptive learning is applicable to traditional recognition models such as AlexNet and VGG to improve their accuracy.

8.
Mammography is currently an important means of early detection and diagnosis of breast cancer. However, mass edges in mammograms are blurred and relatively difficult to classify, so improving the diagnostic accuracy of breast masses for early prevention and treatment remains a major challenge in medicine. Targeting the characteristics of breast masses, a new network (DSAMNet) combining a densely connected convolutional network (DenseNet) and squeeze-and-excitation (SE) modules is proposed; it merges the advantages of both, strengthening feature reuse while recalibrating features during feature extraction. According to the position where the SE module is embedded in DenseNet, the models SE-DenseNet-A, SE-DenseNet-B and SE-DenseNet-C are proposed; by improving the pooling function of SE-DenseNet, the models DSAMNet-A, DSAMNet-B and DSAMNet-C are proposed. Network models of different structures and depths were trained and tested on the public CBIS-DDSM dataset. The experimental results show that DSAMNet-B performs best, with an accuracy 10.8% higher than the DenseNet model and an AUC of 0.929.
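A minimal PyTorch sketch of a squeeze-and-excitation (SE) channel-reweighting block of the kind the abstract embeds into DenseNet; the reduction ratio is an illustrative default and the improved pooling function of DSAMNet is not reproduced here.

```python
# Squeeze-and-excitation (SE) channel-reweighting block sketch, of the kind the
# abstract embeds into DenseNet at different positions.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)                 # global average pool
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.excite(self.squeeze(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                           # reweight each channel

x = torch.randn(2, 64, 32, 32)
print(SEBlock(64)(x).shape)                                    # torch.Size([2, 64, 32, 32])
```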

9.
甘岚, 郭子涵, 王瑶. 《计算机应用》, 2019, 39(10): 2923-2929
When AlexNet is used to classify gastric tumor cell images, the dataset is too small and the model converges slowly with a low recognition rate. To address these problems, a method combining radial-transform (RT) based data augmentation (DA) with an improved AlexNet was proposed. The original dataset was divided into a test set and a training set: the test set was enlarged by cropping, while the training set was first augmented by cropping, rotation, flipping and brightness transformations to obtain an enhanced image set, and a portion of it was then processed with RT for further augmentation. In addition, the activation function and the normalization layers of AlexNet were replaced to speed up convergence and improve generalization. Experimental results show that the proposed method recognizes gastric tumor cell images with faster convergence and higher accuracy: the highest accuracy on the test set is 99.50% and the average accuracy is 96.69%, with F1 scores of 0.980, 0.954 and 0.958 for the cancerous, normal and hyperplastic classes respectively, indicating that the method recognizes gastric tumor cell images well.

10.
At present, automatic classification of breast cancer histopathology images is difficult to apply in clinical diagnosis, fundamentally because there are no large public datasets, or the existing datasets are imbalanced. To address this, a multi-class classification model for breast cancer histopathology images, DAFLNet, is proposed that combines a dense convolutional network (DenseNet), an attention mechanism and the focal loss function. DAFLNet was trained, validated and tested on the breast cancer histopathology image dataset BreaKHis. The final results show that the model reaches 99.1% accuracy for benign/malignant binary classification and 95.5% accuracy for eight-class breast subtype classification, demonstrating that under imbalanced data conditions DAFLNet can accurately perform eight-class classification of breast histopathology images.
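A minimal PyTorch sketch of a multi-class focal loss of the kind DAFLNet combines with DenseNet and attention; the gamma and alpha values are illustrative defaults rather than the paper's tuned settings.

```python
# Multi-class focal loss sketch: easy, well-classified samples are down-weighted
# so the imbalanced minority subtypes contribute more to the gradient.
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=1.0):
    ce = F.cross_entropy(logits, targets, reduction="none")   # -log p_t
    pt = torch.exp(-ce)                                        # p_t
    return (alpha * (1.0 - pt) ** gamma * ce).mean()

logits = torch.randn(16, 8, requires_grad=True)   # 16 samples, 8 subtype classes
targets = torch.randint(0, 8, (16,))
loss = focal_loss(logits, targets)
loss.backward()
print("focal loss:", float(loss))
```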

11.
In order to realize fertility detection and classification of hatching eggs, a deep learning based method is proposed in this paper. Five-day hatching eggs are divided into fertile eggs, dead eggs and infertile eggs. Firstly, we combine a transfer learning strategy with a convolutional neural network (CNN). Then, we use a network with two branches. In the first branch, the dataset is processed by a model pre-trained with the AlexNet network on the large-scale ImageNet dataset. In the second branch, the dataset is trained directly on a multi-layer network containing six convolutional layers and four pooling layers. The features of the two branches are combined as input to the following fully connected layer. Finally, a new model is trained on a small-scale dataset with this network, and the final accuracy of our method is 99.5%. The experimental results show that the proposed method successfully solves the multi-classification problem on a small-scale dataset of hatching eggs and obtains high accuracy. Our model also has better generalization ability and can be adapted to diverse eggs.
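A minimal PyTorch sketch of the two-branch fusion described above: one branch reuses AlexNet convolutional features (ImageNet-pretrained weights can be loaded for the transfer branch), the other is a small custom CNN, and the concatenated features feed a fully connected classifier. The custom branch is simplified here, not the paper's six-convolution/four-pooling design, and all layer sizes are illustrative.

```python
# Two-branch fusion sketch: one branch reuses AlexNet convolutional features
# (load ImageNet weights with weights="DEFAULT" for the transfer branch), the
# other is a small custom CNN; their features are concatenated before the head.
import torch
import torch.nn as nn
from torchvision import models

class TwoBranchNet(nn.Module):
    def __init__(self, n_classes=3):                     # fertile / dead / infertile
        super().__init__()
        self.transfer_branch = models.alexnet(weights=None).features
        self.custom_branch = nn.Sequential(              # simplified custom branch
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(6))
        self.pool = nn.AdaptiveAvgPool2d(6)
        self.head = nn.Linear((256 + 64) * 6 * 6, n_classes)

    def forward(self, x):
        f1 = self.pool(self.transfer_branch(x)).flatten(1)
        f2 = self.custom_branch(x).flatten(1)
        return self.head(torch.cat([f1, f2], dim=1))     # fuse the two branches

out = TwoBranchNet()(torch.randn(2, 3, 224, 224))
print(out.shape)                                         # torch.Size([2, 3])
```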

12.
Web page titles are concise and information-rich, and they contain abundant, dynamic and complex person-person relations. This paper studies person relation extraction from web page title text and proposes a dual-model voting machine learning method: first, features are extracted and selected separately for each of 19 relation types; second, two statistical models, maximum entropy and support vector machines, are trained separately; third, for each relation type a model-voting step selects the model that performs better on the training set as the model for that type; finally, the trained models are applied to the test set. The results show that the proposed method achieves an overall F1 of 67.64% on the person relation extraction task.
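A minimal scikit-learn sketch of the per-relation dual-model voting idea: for each relation type, both a maximum-entropy model (logistic regression) and an SVM are trained, and whichever scores better on a held-out split is kept. The relation names, synthetic features and accuracy-based selection are illustrative placeholders for the paper's 19 types, title-text features and performance criterion.

```python
# Per-relation dual-model voting sketch: train a maximum-entropy model and an
# SVM for each relation type and keep the better-scoring one.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

relation_types = ["couple", "colleague", "rival"]        # stand-ins for the 19 types
chosen = {}
for i, relation in enumerate(relation_types):
    # Synthetic features standing in for the per-relation title-text features.
    X, y = make_classification(n_samples=400, n_features=50, random_state=i)
    X_tr, X_dev, y_tr, y_dev = train_test_split(X, y, test_size=0.25, random_state=0)
    maxent = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    svm = LinearSVC().fit(X_tr, y_tr)
    chosen[relation] = max([maxent, svm], key=lambda m: m.score(X_dev, y_dev))

print({r: type(m).__name__ for r, m in chosen.items()})
```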

13.
Context: Parametric cost estimation models need to be continuously calibrated and improved to assure more accurate software estimates and reflect changing software development contexts. Local calibration by tuning a subset of model parameters is a frequent practice when software organizations adopt parametric estimation models, to increase model usability and accuracy. However, there is a lack of understanding about the cumulative effects of such local calibration practices on the evolution of general parametric models over time. Objective: This study aims at quantitatively analyzing and effectively handling the local bias associated with historical cross-company data, thus improving the usability of cross-company datasets for calibrating and maintaining parametric estimation models. Method: We design and conduct three empirical studies to measure, analyze and address local bias in cross-company datasets, including: (1) defining a method for measuring the local bias associated with each individual organization's data subset in the overall dataset; (2) analyzing the impacts of local bias on the performance of an estimation model; (3) proposing a weighted sampling approach to handle local bias. The studies are conducted on the latest COCOMO II calibration dataset. Results: Our results show that local bias largely exists in cross-company datasets and negatively impacts the performance of parametric models. The local-bias-based weighted sampling technique helps reduce the negative impacts of local bias on model performance. Conclusion: Local bias in cross-company data harms model calibration and adds noise to model maintenance. The proposed local bias measure offers a means to quantify the degree of local bias associated with a cross-company dataset and assess its influence on parametric model performance. The local-bias-based weighted sampling technique can be applied to trade off and mitigate the potential risk of significant local bias, which limits the usability of cross-company data for general parametric model calibration and maintenance.

14.

During disasters, multimedia content on social media sites offers vital information. Reports of injured or deceased people, infrastructure destruction, and missing or found people are among the types of information exchanged. While several studies have demonstrated the importance of both text and image content for disaster response, previous research has primarily concentrated on the text modality, with less success in multi-modal approaches. The latest research on multi-modal classification of disaster-related tweets uses comparatively simple models such as KIMCNN and VGG16. In this work we take this further and utilize state-of-the-art models in both text and image classification to improve multi-modal classification of disaster-related tweets. The research was conducted on two classification tasks: first, detecting whether a tweet is informative; second, identifying the type of response needed. The multimodal analysis incorporates different methods of feature extraction from the textual corpus and preprocessing of the corresponding image corpus; several classification models are then trained to predict the output, and their performance is compared while tuning parameters to improve the results. Models such as XLNet, BERT and RoBERTa for text classification and ResNet, ResNeXt and DenseNet for image classification were trained and analyzed. Results show that the proposed multimodal architecture outperforms models trained on a single modality (text or image alone), and that the newer state-of-the-art models outperform the baseline models by a reasonable margin on both classification tasks.


15.

Botnets pose significant threats to cybersecurity. Infected Internet of Things (IoT) devices are used to launch malicious activities against target entities to disrupt their operations and services. To address this danger, we propose a machine learning-based method for detecting botnets by analyzing network traffic flows that include various types of botnet attacks. Our method uses a hybrid model in which a variational autoencoder (VAE) is trained in an unsupervised manner to learn latent representations that describe the benign traffic data, and a one-class classifier (OCC) detects anomalies (also called novelty detection). The main aim of this research is to learn discriminating representations of the normal data in the low-dimensional latent space generated by the VAE and thus improve the predictive power of the OCC for detecting malicious traffic. We evaluated the performance of our model and compared it against baseline models on a real network-based dataset containing popular IoT devices and presenting a wide variety of attacks from two recent botnet families, Mirai and Bashlite. Tests showed that our model can detect botnets with satisfactory performance.
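A minimal sketch of the hybrid VAE + one-class-classifier detector, assuming PyTorch and scikit-learn: a VAE encoder maps traffic feature vectors to a low-dimensional latent space, and a one-class SVM fitted on benign latents flags anomalies. The encoder is untrained here, and the feature dimension and the shifted "attack" data are placeholders.

```python
# Hybrid VAE + one-class classifier sketch: an encoder maps traffic feature
# vectors into a low-dimensional latent space, and a one-class SVM fitted on
# benign latents flags anomalous (botnet) traffic.
import torch
import torch.nn as nn
from sklearn.svm import OneClassSVM

n_features, latent_dim = 115, 8                     # illustrative feature/latent sizes

class VAEEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU())
        self.mu = nn.Linear(64, latent_dim)
        self.logvar = nn.Linear(64, latent_dim)

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.logvar(h)

encoder = VAEEncoder()                              # untrained in this sketch
benign = torch.randn(300, n_features)               # placeholder benign traffic
attack = torch.randn(50, n_features) + 3.0          # placeholder shifted traffic
with torch.no_grad():
    z_benign, _ = encoder(benign)
    z_attack, _ = encoder(attack)

occ = OneClassSVM(nu=0.05).fit(z_benign.numpy())    # novelty detector on benign latents
flagged = (occ.predict(z_attack.numpy()) == -1).mean()
print("fraction of attack traffic flagged:", flagged)
```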


16.
Objective: To address the single, uniform generator network structure of deep learning models in existing image translation methods, the structure of the conditional generative adversarial network (CGAN) is improved, and a parallel generator network model fusing two different structures, a residual network (ResNet) and a dense network (DenseNet), is proposed. Method: Residual and dense generator branch networks are built; an infrared image is fed to both branches, each of which produces its own visible-light translation, and a linear interpolation algorithm based on image segmentation is proposed to fuse the branch outputs into the final visible-light translation. To prevent overfitting during training with small samples, a dropout layer is inserted into the discriminator network, and an optimal-threshold segmentation objective function is designed to obtain the optimal fusion parameters during parallel-generator training. Results: On public infrared-visible datasets, the proposed method achieves significant improvements in mean square error (MSE) and structural similarity (SSIM) over existing deep image translation models such as Pix2Pix and CycleGAN. Conclusion: The parallel generator network model effectively fuses the advantages of the branch structures and produces more accurate and realistic image translation results.

17.
18.
A survey of deep convolutional neural network models for image classification
Image classification is an important task in computer vision, and traditional image classification methods have certain limitations. With the development of artificial intelligence and the maturing of deep learning, classifying images with deep convolutional neural networks has become a research hotspot; the network structures used for image classification have become increasingly diverse, and their performance far exceeds that of traditional methods. Focusing on the model structures of deep convolutional neural networks for image classification, and following the course of model development and optimization, this paper divides them into four categories: classical deep convolutional neural network models, attention-mechanism models, lightweight models, and neural architecture search models. It comprehensively reviews the construction methods and characteristics of each category and compares and analyzes the performance of the classification models. Although the structural design of deep convolutional neural networks has become increasingly refined and optimization methods increasingly powerful, with classification accuracy continually being improved while parameter counts decrease and training and inference become faster, deep convolutional neural networks still have limitations. The paper discusses the open problems and possible future research directions: deep convolutional neural networks mainly perform image classification through supervised learning and are limited by dataset quality and scale, so unsupervised and semi-supervised models will be a key future direction; their speed and resource consumption remain unsatisfactory, making deployment on mobile devices challenging; model optimization methods and metrics for evaluating model quality need further study; and since manually designing network structures is time-consuming and labor-intensive, neural architecture search will be the future direction for model design.

19.
To address the poor generalization of convolutional deep learning models in few-shot learning scenarios, a performance optimization method for pretrained convolutional models based on gradient-free few-shot learning is proposed, taking AlexNet and ResNet as examples. First, the sample data are modulated through causal intervention to generate sequence data from non-temporal data, and the pretrained model is pruned in a targeted way from the perspective of distribution stationarity based on a cointegration test; then, based on the capital asset pricing model (CAP...

20.
To address the poor diagnostic and generalization ability of models trained on severely imbalanced bearing fault data, a method is proposed that balances the dataset with a generative adversarial network based on the Wasserstein distance. The method first trains the network adversarially on the small set of fault samples; once the network reaches a Nash equilibrium, the generated fault samples are added to the original small fault set to balance the dataset. A diagnostic model based on a convolutional neural network with global average pooling is then proposed: the balanced dataset is fed to the diagnostic model for training, and the model adaptively extracts features layer by layer to achieve accurate fault classification and diagnosis. Experimental results show that the proposed diagnostic method outperforms other algorithms and models and has strong generalization ability and robustness.
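A minimal PyTorch sketch of the Wasserstein-GAN critic objective used to synthesize minority fault samples: the critic maximizes the mean score gap between real and generated samples, with weight clipping for the Lipschitz constraint. Network sizes, the optimizer and the placeholder data are illustrative, and the generator update is omitted.

```python
# WGAN critic step sketch: the critic maximizes the mean score gap between real
# and generated fault samples (an estimate of the Wasserstein distance), with
# weight clipping for the Lipschitz constraint; the generator update is omitted.
import torch
import torch.nn as nn

feat_dim, z_dim = 1024, 64                           # e.g. one vibration-signal window

G = nn.Sequential(nn.Linear(z_dim, 128), nn.ReLU(), nn.Linear(128, feat_dim))
D = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 1))  # critic
opt_d = torch.optim.RMSprop(D.parameters(), lr=5e-5)

real = torch.randn(32, feat_dim)                     # placeholder minority fault samples
fake = G(torch.randn(32, z_dim)).detach()

# Critic step: maximize E[D(real)] - E[D(fake)], i.e. minimize its negative.
loss_d = -(D(real).mean() - D(fake).mean())
opt_d.zero_grad()
loss_d.backward()
opt_d.step()
for p in D.parameters():                             # Lipschitz constraint via clipping
    p.data.clamp_(-0.01, 0.01)
print("critic loss:", float(loss_d))
```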
