Similar literature
 20 similar documents found (search time: 744 ms)
1.
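To make full use of the information in the large number of unlabeled samples in industrial processes, and to reduce the effect of process uncertainty on the quality of unlabeled samples, a semi-supervised twin support vector regression soft-sensor modeling method under a help-training framework is proposed. A twin support vector regression machine serves as the main learner and assigns pseudo-labels to high-confidence unlabeled samples. An auxiliary learner based on the K-nearest-neighbor algorithm is also built to maximize the learner's mean squared error on the neighboring sample set; the candidate sample set screened by this criterion contains more data information. The main and auxiliary learners complement each other, which improves the generalization of the model to some extent. The help-training framework is then used to raise sample utilization and obtain the prediction model, so that the information in the unlabeled samples is fully exploited. Modeling and simulation on real data from a debutanizer column industrial process show that the model has good predictive performance.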

2.
Thin-walled parts are widely used in the aerospace, shipbuilding, and automotive industries, but their unique structure and high accuracy requirements lead to more scrapped parts, high production costs, and a longer trial machining period. To adapt to fast production cycles and increase the efficiency of thin-walled part machining, this paper presents a Digital Twin-driven thin-walled part manufacturing framework that allows the machine operator to manage product changes and make the start-up phases faster and more accurate. The framework has three parts, preparation, machining, and measurement, each driven by Digital Twin technologies. By establishing and updating the workpiece Digital Twin under different statuses, various manufacturing information and data can be integrated and made available to machine operators and other Digital Twins. The framework can serve as a guideline for establishing the machine tool and workpiece Digital Twins and integrating them into the machining process. It gives the machine operator opportunities to interact with both the physical manufacturing process and its digital data in real time. The digital representation of the physical process can help operators manage the trial machining from different aspects. In addition, a demonstrative case study is presented to explain the implementation of this framework in a real manufacturing environment.

3.
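Classifying image collections through a hierarchy, according to the object labels recognized in the images, is one of the important research problems in automated image classification. Existing methods build the hierarchy when the object labels are known; only a few consider the effect of partially unknown object labels. This paper extends and optimizes a classical method so that the hierarchy can be built and updated when some object labels are unknown. Images are encoded with a convolutional neural network (CNN), and a semi-supervised learning method is proposed: the hierarchy of the image set with known class labels is built with a traditional algorithm, and images with unknown labels are clustered within the hierarchy through periodic similarity comparison, yielding a semi-supervised layer-wise model (SLM). Experiments on real public datasets show that the method builds and updates the hierarchy effectively and achieves good classification predictions on relatively small datasets.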

4.
Although administrators of online communities (OCs) may focus on improving their OCs by upgrading technology and enhancing usability to attract additional users, the level of OC participation may be associated with social motives. The purpose of this study is to understand how social motivations (that is, network externalities and social norms) affect members' commitment to OCs. This study tests the hypotheses on data collected from 396 undergraduate students. Data analyses show that network externalities and social norms directly influence social interaction ties, which subsequently result in commitment toward a community. Social norms also directly influence relationship commitment to a community. The results provide insights into how social motivations lead to commitment to an OC, reminding OC administrators to encourage member commitment to the OC from the perspective of social motivations.

5.
Although deep learning models can largely relieve the burden of manual feature engineering in intelligent fault diagnosis of rotor-bearing systems, their performance mostly depends on having enough labeled samples constructed from vibration signals. Acquiring many labeled samples is often laborious, and vibration sensors tightly fixed on the equipment may affect its structure after long-term operation. To address these two problems, a new framework is proposed, based on a small number of labeled infrared thermal images and an enhanced convolutional neural network (ECNN) transferred from a convolutional auto-encoder (CAE). First, infrared thermal images are measured to characterize the various health states of the rotor-bearing system. Second, the exponential linear unit (ELU) and stochastic pooling (SP) are used to construct the ECNN. Then, the model parameters of a CAE pre-trained with unlabeled thermal images are transferred to initialize the ECNN. Finally, a small number of labeled thermal images are used to train the ECNN and further adjust the model parameters. The collected thermal images are used to test the diagnosis performance of the proposed method. The analysis and comparison results show that the proposed method outperforms the current mainstream methods.

6.
Dataset sizes continue to increase, and data are being collected from numerous applications. Because collecting labeled data is expensive and time consuming, the amount of unlabeled data is increasing. Semi-supervised learning (SSL) has been proposed to improve conventional supervised learning methods by training on both unlabeled and labeled data. In contrast to classification problems, estimating labels for unlabeled data introduces added uncertainty in regression problems. In this paper, a semi-supervised support vector regression (SS-SVR) method based on self-training is proposed. The proposed method addresses the uncertainty of the estimated labels for unlabeled data. To measure labeling uncertainty, the label distribution of the unlabeled data is estimated with two probabilistic local reconstruction (PLR) models. Then, the training data are generated by oversampling from the unlabeled data and their estimated label distribution. The sampling rate varies with the uncertainty. Finally, expected margin-based pattern selection (EMPS) is employed to reduce training complexity. We verify the proposed method with 30 regression datasets and a real-world problem: virtual metrology (VM) in semiconductor manufacturing. The experimental results show that the proposed method improves the accuracy by 8% compared with conventional supervised SVR, and that its training time is 20% shorter than that of the benchmark methods.

7.
Cost-Sensitive Active Visual Category Learning   Total citations: 1, self-citations: 0, citations by others: 1
We present an active learning framework that predicts the tradeoff between the effort and information gain associated with a candidate image annotation, thereby ranking unlabeled and partially labeled images according to their expected "net worth" to an object recognition system. We develop a multi-label multiple-instance approach that accommodates realistic images containing multiple objects and allows the category-learner to strategically choose what annotations it receives from a mixture of strong and weak labels. Since the annotation cost can vary depending on an image's complexity, we show how to improve the active selection by directly predicting the time required to segment an unlabeled image. Our approach accounts for the fact that the optimal use of manual effort may call for a combination of labels at multiple levels of granularity, as well as accurate prediction of manual effort. As a result, it is possible to learn more accurate category models with a lower total expenditure of annotation effort. Given a small initial pool of labeled data, the proposed method actively improves the category models with minimal manual intervention.

8.
Recognizing activities for older adults is challenging because we observe a variety of activity patterns caused by aging (e.g., limited dexterity, limb control, slower response time) and/or underlying health conditions (e.g., dementia). Existing deep learning work has recognized activities successfully when the dataset contains high-quality annotations and is captured in a controlled environment. In contrast, in a real-world environment, especially with older adults exhibiting memory-related symptoms, varying psychological and mental health status, and reliance on caregivers to perform daily activities, and with domain-specific annotators unavailable, obtaining quality annotated data is challenging, leaving us with limited labeled data and abundant unlabeled data. In this paper, we hypothesize that projecting the labeled data representations, which comprise a specific set of activities, onto a new representation space characterized by the unlabeled data, which comprise activities beyond the limited activities in the labeled dataset, would help us rely less on the annotated data to improve activity detection performance. Motivated by this, we propose STAR-Lite, a self-taught learning framework that involves a pre-training framework to prepare the new representation space considering activities beyond the initial labels in the labeled dataset. STAR-Lite projects the labeled data representations on the new representation space characterized by unlabeled data labels and learns higher-level representations of the labeled dataset while optimizing inter- and intra-class distances without explicitly using a computation-hungry similarity-based approach. We demonstrate that our proposed approach, STAR-Lite, (a) improves activity recognition performance in a supervised setting and (b) is feasible for real-world deployment. To enhance the feasibility of deploying STAR-Lite on devices with limited memory resources, we explore model compression techniques such as pruning and quantization and propose a novel layer-wise pruning-rate optimization technique that effectively compresses the network while preserving the model performance. The evaluation was performed using the Alzheimer's Activity Recognition dataset (AAR) captured from 25 individuals living in a retirement community center with IRB approval (#Y18NR12035) using an in-house SenseBox infrastructure while concurrently assessing the clinical evaluation of the participants for dementia and independent living. Our extensive evaluation reveals that STAR-Lite can detect activities with an F1-score of 85.12% despite a 62% reduction in model size and a 5% improvement in execution time on a resource-constrained device.

9.
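To address the mislabeling of newly added samples when the tri_training co-training algorithm is used for semi-supervised classification of hyperspectral remote sensing images with small samples, a semi-supervised co-training classification algorithm based on spatial neighborhood information, tri_training_SNI (tri_training based on Spatial Neighborhood Information), is proposed. First, a classifier diversity measure (the disagreement measure) and a newly proposed disagreement-accuracy measure are used to select, from four classifiers, MLR (Multinomial Logistic Regression), KNN (k-Nearest Neighbor), ELM (Extreme Learning Machine), and RF (Random Forest), the three whose classification performance differs most. Then, during sample selection with the three selected classifiers, whenever two classifiers give the same classification result, the 8-neighborhood information of the initial training samples is used for a second screening of the unlabeled samples and for determining their labels, which improves the sample-selection accuracy of semi-supervised learning. Classification experiments on two hyperspectral remote sensing images, AVIRIS and ROSIS, show that the algorithm clearly improves classification accuracy compared with the traditional tri_training co-training algorithm.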
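The second-screening step described in this abstract can be illustrated with a minimal sketch. It assumes one particular reading of the abstract: a candidate pixel is accepted only if two classifiers agree on its label and it lies in the 8-neighborhood of an initially labeled pixel carrying that same label. The function names, the dictionary-based pixel representation, and the "same label" condition are all hypothetical; the paper's actual rule may differ.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]  # (row, column) position in the image


def eight_neighbors(pixel: Pixel) -> List[Pixel]:
    """Return the 8 positions surrounding a pixel (no bounds check)."""
    r, c = pixel
    return [(r + dr, c + dc)
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]


def screen_candidates(pred_a: Dict[Pixel, int],
                      pred_b: Dict[Pixel, int],
                      labeled: Dict[Pixel, int]) -> Dict[Pixel, int]:
    """Keep unlabeled pixels on which both classifiers agree AND that
    touch an initially labeled pixel of the same class (assumed rule)."""
    accepted: Dict[Pixel, int] = {}
    for pixel, label in pred_a.items():
        if pixel in labeled:
            continue  # already part of the training set
        if pred_b.get(pixel) != label:
            continue  # first screening: the two classifiers disagree
        # second screening: spatial neighborhood of the initial samples
        if any(labeled.get(n) == label for n in eight_neighbors(pixel)):
            accepted[pixel] = label
    return accepted


# Toy data: one initially labeled pixel of class 0 at (1, 1).
initial = {(1, 1): 0}
classifier_a = {(0, 0): 0, (1, 2): 1, (3, 3): 0, (2, 1): 0}
classifier_b = {(0, 0): 0, (1, 2): 1, (3, 3): 0, (2, 1): 1}
new_samples = screen_candidates(classifier_a, classifier_b, initial)
print(new_samples)
```

In this toy run only `(0, 0)` survives: `(1, 2)` is adjacent to a labeled pixel of a different class, `(3, 3)` is not adjacent to any labeled pixel, and the classifiers disagree on `(2, 1)`.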

10.
Automatic image annotation has emerged as an important research topic due to its potential applications in both image understanding and web image search. Because of the inherent ambiguity of image-label mapping and the scarcity of training examples, systematically developing robust annotation models with better performance has become a challenge. From the perspective of machine learning, the annotation task fits both the multi-instance and the multi-label learning frameworks, because an image is usually described by multiple semantic labels (keywords) and these labels are often closely related to particular regions rather than to the entire image. In this paper, we propose an improved Transductive Multi-Instance Multi-Label (TMIML) learning framework, which aims to take full advantage of both labeled and unlabeled data to address the annotation problem. Experiments on the well-known Corel 5000 data set demonstrate that the proposed method is beneficial in the image annotation task and outperforms most existing image annotation algorithms.

11.
Vision-based defect classification is an important technology for controlling product quality in manufacturing systems. Because it is very hard to obtain enough labeled samples for model training in real-world production, semi-supervised learning, which learns from both labeled and unlabeled samples, is more suitable for this task. However, the intra-class variations and inter-class similarities of surface defects, referred to as poor class separation, may cause semi-supervised methods to perform poorly with few labeled samples. Graph-based methods, such as the graph convolutional network (GCN), can handle this problem well. Therefore, this paper proposes a new graph-based semi-supervised method, named the multiple micrographs graph convolutional network (MMGCN), for surface defect classification. First, MMGCN performs graph convolution by constructing multiple micrographs instead of one large graph, and labels unlabeled samples by propagating label information from labeled samples to unlabeled samples within the micrographs, which yields multiple labels per sample. Weighting these labels gives the final label, which addresses the computational-complexity and practicality limitations of the original GCN. Second, MMGCN divides the unlabeled dataset into multiple batches and sets an accuracy threshold. When the model accuracy reaches the threshold, the unlabeled data are labeled in batches. A well-known case is used to evaluate the performance of the proposed method. The experimental results demonstrate that the proposed MMGCN achieves lower computational complexity and better practicality than GCN. In terms of accuracy, MMGCN also obtains the best performance and the best class separation in comparison with other semi-supervised surface defect classification methods.

12.
A machine learning framework that uses unlabeled data from a related task domain in supervised classification tasks is described. The unlabeled data come from related domains, which share the same class labels or generative distribution as the labeled data. Patterns in the unlabeled data are learned via a neural network and transferred to the target domain from which the labeled data are generated, so as to improve the performance of the supervised learning task. We call this approach self-taught transfer learning from unlabeled data. We introduce a general-purpose feature learning algorithm that produces features retaining information from the unlabeled data. Information preservation ensures that the features obtained will be useful for improving the classification performance of the supervised tasks.

13.
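Because of factors such as the cost of analyzing primary variables, industrial process data may contain few labeled samples and many unlabeled ones. To improve the accuracy and thoroughness with which unlabeled samples are used, a triple-selection semi-supervised regression algorithm under a self-training framework is proposed. Unlabeled and labeled samples are first selected so that the two kinds of data are similar, which improves the accuracy of predictions for the unlabeled samples. Gaussian process regression is then used to model the selected labeled sample set and predict the selected unlabeled sample set, giving a pseudo-labeled sample set. The confidence of the pseudo-labeled samples is evaluated, and high-confidence samples are selected to update the initial sample set. To make fuller use of the unlabeled samples, this screening is repeated over multiple cycles within the self-training framework. Modeling and simulation on real data from a debutanizer column process verify that the proposed method predicts well when few labeled samples are available.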
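The select, predict, filter, and repeat cycle summarized in this abstract can be sketched as follows. This is only an illustration of the loop structure under strong assumptions: a k-nearest-neighbor mean stands in for Gaussian process regression, the spread of the neighbors' labels stands in for the model's predictive uncertainty, inputs are one-dimensional, and all names and thresholds (`self_train`, `max_dist`, `max_spread`) are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Tuple

Sample = Tuple[float, float]  # (input x, label y)


def nearest(labeled: List[Sample], x: float, k: int) -> List[Sample]:
    """Return the k labeled samples whose inputs are closest to x."""
    return sorted(labeled, key=lambda s: abs(s[0] - x))[:k]


def self_train(labeled: List[Sample], unlabeled: List[float],
               k: int = 2, max_dist: float = 0.5,
               max_spread: float = 0.25,
               rounds: int = 5) -> Tuple[List[Sample], List[float]]:
    """Iteratively move confident pseudo-labeled samples into the labeled set.

    Assumes `labeled` is non-empty. Returns the enlarged labeled set and
    the unlabeled inputs that were never accepted.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        added: List[Sample] = []
        for x in pool:
            neighbors = nearest(labeled, x, k)
            # Selection 1 (similarity): skip inputs far from all labeled data.
            if abs(neighbors[0][0] - x) > max_dist:
                continue
            ys = [y for _, y in neighbors]
            pseudo_label = mean(ys)  # stand-in for the GPR prediction
            # Selection 2 (confidence): skip if neighbor labels disagree.
            if pstdev(ys) > max_spread:
                continue
            added.append((x, pseudo_label))
        if not added:
            break  # nothing confident left; stop early
        labeled.extend(added)
        accepted = {x for x, _ in added}
        pool = [x for x in pool if x not in accepted]
    return labeled, pool


initial = [(0.0, 0.0), (0.4, 0.4), (1.0, 1.0)]
new_labeled, remaining = self_train(initial, [0.2, 1.3, 10.0])
print(new_labeled[-1], remaining)
```

In this toy run the input 0.2 is accepted with a pseudo-label of about 0.2, while 1.3 fails the confidence check and 10.0 fails the similarity check, so both stay in the unlabeled pool.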

14.
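In the process design stage of a part, the generation of the machining process plan depends heavily on the process knowledge that the designer selects and applies. Because the actual production environment often deviates from the process knowledge chosen by the designer, the mismatch between the machining plan and the actual process has become a difficult problem in part manufacturing. To solve this problem, this paper proposes a data- and knowledge-driven method for part feature process decision-making. The method uses an attention-based MLP deep learning algorithm to mine process knowledge from structured process data and to associate part features with feature process labels. After data processing, these are used to train a neural network model. Validation shows that the method takes the process data of part features as input and outputs the probability distribution over the corresponding feature process labels, providing decision support for the selection of part process plans.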

15.
Several authors have shown that, when labeled data are scarce, improved classifiers can be built by augmenting the training set with a large set of unlabeled examples and then performing suitable learning. These works assume each unlabeled sample originates from one of the (known) classes. Here, we assume each unlabeled sample comes from either a known or a heretofore undiscovered class. We propose a novel mixture model which treats as observed data not only the feature vector and the class label, but also the fact of label presence/absence for each sample. Two types of mixture components are posited. "Predefined" components generate data from known classes and assume class labels are missing at random. "Nonpredefined" components only generate unlabeled data, i.e., they capture exclusively unlabeled subsets, consistent with an outlier distribution or new classes. The predefined/nonpredefined natures are data-driven, learned along with the other parameters via an extension of the EM algorithm. Our modeling framework addresses problems involving both the known and unknown classes: (1) robust classifier design, (2) classification with rejections, and (3) identification of the unlabeled samples (and their components) from unknown classes. Case 3 is a step toward new class discovery. Experiments are reported for each application, including topic discovery for the Reuters domain. Experiments also demonstrate the value of label presence/absence data in learning accurate mixtures.

16.
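An evolutionary semi-supervised fuzzy clustering algorithm for intrusion detection   Total citations: 3, self-citations: 0, citations by others: 3
In intrusion detection systems, unlabeled data are easy to obtain, whereas labeled data are hard to obtain. To address this, an intrusion detection algorithm based on evolutionary semi-supervised fuzzy clustering is proposed. The algorithm uses the labeled data as chromosomes to guide the evolution of each fuzzy partition of the unlabeled data. It can generate an intrusion detection classifier from a small amount of labeled data and a large amount of unlabeled data, can handle fuzzy class labels, is not easily trapped in local optima, and is suited to parallel implementation. Experimental results show that the algorithm achieves a high detection rate.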

17.
Minimum Squared Error Classification (MSEC) is a learning method for predicting the class labels of samples in real time. However, as a regression algorithm, MSEC tries its best to map the training samples into their class labels using a linear projection without considering the manifold structure of the data. In this paper, we introduce a supervised label learning framework using an effective manifold learning strategy. This method, referred to as Manifold Supervised Label Prediction (MSLP), generalizes the MSEC objective function to incorporate intra-class relationships of the data. Thus, in addition to relying on the relationship between a training sample and its label, we propose to also learn the relationship between the training samples while transforming them. As a testbed for MSLP, we apply it to an image identification task in which image samples with a very low spatial resolution (16 × 16) are used. These images have been heavily down-sampled in order to reduce their size and hence the computation time. We also show that the blurring process used to reduce the artifacts introduced by down-sampling serendipitously results in better identification accuracies. Finally, unlike MSEC, which classifies a query sample based on the deviation between the predicted and the true class labels, we compare both the training and the query samples in the label prediction space. A set of comprehensive experiments on benchmark palmprint databases including Multispectral PolyU, PolyU 2D/3D, and PolyU Contact-free I shows meaningful improvements over existing state-of-the-art algorithms.
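For orientation, the linear projection that this abstract attributes to MSEC is commonly written as a regularized least-squares problem. The form below is a sketch of that common formulation under assumed notation, none of which appears in the abstract: $X$ holds one training sample per row, $Y$ holds the corresponding one-hot label vectors, $W$ is the projection, $\lambda$ is a ridge parameter, and $e_c$ is the one-hot vector of class $c$. It is not the MSLP objective, which adds a manifold term the abstract does not spell out.

```latex
\begin{aligned}
W^{*} &= \arg\min_{W}\; \lVert XW - Y \rVert_F^{2} + \lambda \lVert W \rVert_F^{2}
       = \left( X^{\top} X + \lambda I \right)^{-1} X^{\top} Y, \\
\hat{y} &= (W^{*})^{\top} x, \qquad
\hat{c} = \arg\min_{c}\; \lVert \hat{y} - e_{c} \rVert_2^{2}.
\end{aligned}
```

The second line corresponds to the abstract's remark that MSEC classifies a query sample by the deviation between its predicted label vector and the true class labels.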

18.
This paper describes the design of a neural network based labeled object identification system, to be used for product classification at the final inspection stage of an IBM personal computer manufacturing line. The objective was to design an identification system using existing equipment that would provide robust and accurate classification, as well as a simple means for adding new product models to the system. In the first stage of the identification system, an image of the product is obtained, and the region containing the label is segmented from the rest of the image. Preprocessing operations are performed to extract the region of interest from the segmented image. Normalized and preprocessed images of the labels are compressed using a fully-connected back-propagation autoencoder network. Features extracted in this manner are used as inputs to a Learning Vector Quantization (LVQ) network, trained to classify the labels. The system so designed is shown to satisfy the primary requirements of a typical industrial classification system.
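The final classification stage mentioned in this abstract can be illustrated with the basic LVQ1 update rule: the prototype nearest to a training vector is pulled toward it when their classes match and pushed away otherwise. The sketch below assumes that variant; the abstract does not say which LVQ variant, learning rate, or feature dimension the system used, and the two-dimensional vectors here are hypothetical stand-ins for the autoencoder's compressed features.

```python
from typing import List, Sequence, Tuple

Prototype = Tuple[List[float], str]  # (codebook vector, class label)


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_prototype(prototypes: List[Prototype], x: Sequence[float]) -> int:
    """Index of the prototype closest to x."""
    return min(range(len(prototypes)),
               key=lambda i: squared_distance(prototypes[i][0], x))


def train_lvq1(prototypes: List[Prototype],
               data: List[Tuple[List[float], str]],
               learning_rate: float = 0.1,
               epochs: int = 10) -> List[Prototype]:
    """LVQ1: attract the winning prototype on a class match, repel otherwise."""
    protos = [(list(vec), label) for vec, label in prototypes]  # copy
    for _ in range(epochs):
        for x, label in data:
            vec, proto_label = protos[nearest_prototype(protos, x)]
            sign = 1.0 if proto_label == label else -1.0
            for d in range(len(vec)):
                vec[d] += sign * learning_rate * (x[d] - vec[d])
    return protos


def classify(prototypes: List[Prototype], x: Sequence[float]) -> str:
    return prototypes[nearest_prototype(prototypes, x)][1]


# Hypothetical 2-D "compressed features" for two label classes.
initial = [([0.0, 0.0], "A"), ([1.0, 1.0], "B")]
training = [([0.1, 0.2], "A"), ([0.9, 0.8], "B"),
            ([0.2, 0.1], "A"), ([0.8, 0.9], "B")]
model = train_lvq1(initial, training)
print(classify(model, [0.15, 0.1]), classify(model, [0.7, 0.95]))
```

Adding a new product model in such a scheme would amount to adding prototypes for the new class and retraining, which is consistent with the abstract's stated goal of a simple means for extending the system.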

19.
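Semi-supervised sentiment classification based on ensemble learning   Total citations: 1, self-citations: 0, citations by others: 1
Sentiment classification is the task of classifying the sentiment polarity expressed in a text. This paper studies sentiment classification based on semi-supervised learning, that is, using unlabeled samples to improve classification performance when only a very small set of labeled samples is available. To strengthen semi-supervised learning, an ensemble method based on consistent labels is proposed to fuse two mainstream semi-supervised sentiment classification methods: co-training based on random feature subspaces, and label propagation. First, the classifiers trained by the two semi-supervised methods label the unlabeled samples; second, the unlabeled samples on which the two labelings agree are selected; finally, the selected samples are used to update the training model. Experimental results show that the method effectively reduces the mislabeling rate on unlabeled samples and thus achieves better classification than either semi-supervised method alone.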
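The three steps in this abstract (label with both methods, keep the samples on which the labels agree, retrain on them) reduce to a simple agreement filter. The sketch below shows only that filter; the two toy classifiers are hypothetical stand-ins for the co-training and label-propagation models, and `select_consistent` is an illustrative name, not the paper's.

```python
from typing import Callable, List, Sequence, Tuple


def select_consistent(
    unlabeled: Sequence[str],
    classifier_a: Callable[[str], str],
    classifier_b: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Return (sample, label) pairs on which both classifiers agree."""
    selected: List[Tuple[str, str]] = []
    for sample in unlabeled:
        label_a = classifier_a(sample)
        label_b = classifier_b(sample)
        if label_a == label_b:  # consistent label: trust it as a pseudo-label
            selected.append((sample, label_a))
    return selected


# Toy stand-ins for the two semi-supervised classifiers (assumptions).
def keyword_classifier(text: str) -> str:
    return "pos" if "good" in text else "neg"


def length_classifier(text: str) -> str:
    return "pos" if len(text) > 10 else "neg"


reviews = ["good", "really good film", "bad", "a long but dull plot"]
pseudo_labeled = select_consistent(reviews, keyword_classifier, length_classifier)
print(pseudo_labeled)
```

Here the two classifiers disagree on "good" and on "a long but dull plot", so only the other two reviews are passed on as pseudo-labeled training data, which is how the agreement check lowers the mislabeling rate at the cost of using fewer samples.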

20.
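Weng Xuanqiao (翁楦乔), Wen Chenglin (文成林). 《控制工程》 (Control Engineering of China), 2022, 29(1): 175-181
To address the difficulty that traditional methods have in using large amounts of time-series data and unlabeled data for power grid fault diagnosis, an intelligent power grid fault diagnosis method based on deep feature clustering and a recurrent neural network (RNN) is proposed. The method first builds a feature extractor with a convolutional neural network to extract high-level features from the time-series data, and then performs semi-supervised clustering on the extracted features to obtain labels for the unlabeled samples, so that the fault class of each unlabeled sample can be determined and the sample put to use; then...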
