Similar literature
 Found 20 similar documents; search took 31 ms
1.
An important problem to be addressed by diagnostic systems in industrial applications is the estimation of faults with incomplete observations. This work discusses different approaches for handling missing data and the performance of data-driven fault diagnosis schemes. An exploiting classifier and combined methods were assessed on the Tennessee–Eastman process, for which diverse incomplete observations were produced. The use of several indicators revealed the trade-off between the performances of the different schemes. Support vector machines (SVM) and C4.5, combined with k-nearest neighbourhood (kNN), produce the highest robustness and accuracy, respectively. Bayesian networks (BN) and centroid appear as inappropriate options in terms of accuracy, while Gaussian naïve Bayes (GNB) is sensitive to imputation values. In addition, feature selection was explored for further performance enhancement, and the proposed contribution index showed promising results. Finally, an industrial case was studied to assess the informative level of incomplete data in terms of the redundancy ratio and to generalize the discussion.

2.
This study describes a method for mining and modeling binding data obtained from a large panel of targets (in vitro safety pharmacology) to distinguish differences between promiscuous and selective compounds. Two naïve Bayes models for promiscuity and selectivity were generated and validated on a test set as well as publicly available drug databases. The model shows a higher score (lower promiscuity) for marketed drugs than for compounds in early development or compounds that failed during clinical development. Such models can be used in triaging high‐throughput screening data or for lead optimization.

3.
Plant volatiles influence host selection of herbivorous insects. Since volatiles often vary in space and time, herbivores (especially polyphagous ones) may be able to use these compounds as cues to track variation in host plant quality based on their innate abilities and previous experience. We investigated the behavioral response of naïve (fed on artificial diet) and experienced (fed on poplar) gypsy moth (Lymantria dispar) caterpillars, a polyphagous species, towards constitutive and herbivore-induced black poplar (Populus nigra) volatiles at different stages of herbivore attack. In Y-tube olfactometer assays, both naïve and experienced caterpillars were attracted to constitutive volatiles and volatiles released after short-term herbivory (up to 6 hr). Naïve caterpillars also were attracted to volatiles released after longer-term herbivory (24–30 hr), but experienced caterpillars preferred the odor of undamaged foliage. A multivariate statistical analysis comparing the volatile emission of undamaged plants vs. plants after short and longer-term herbivory, suggested various compounds as being responsible for distinguishing between the odors of these plants. Ten compounds were selected for individual testing of caterpillar behavioral responses in a four-arm olfactometer. Naïve caterpillars spent more time in arms containing (Z)-3-hexenol and (Z)-3-hexenyl acetate than in solvent permeated arms, while avoiding benzyl cyanide and salicyl aldehyde. Experienced caterpillars avoided benzyl cyanide and preferred (Z)-3-hexenyl acetate and the homoterpene (E)-4,8-dimethyl-1,3,7-nonatriene (DMNT) over solvent. Only responses to DMNT were significantly different when comparing experienced and naïve caterpillars. The results show that gypsy moth caterpillars display an innate behavioral response towards constitutive and herbivore-induced plant volatiles, but also that larval behavior is plastic and can be modulated by previous feeding experience.

4.
The verification of the geographical origin of olive oils by analytical techniques is still a challenge. The goal of this work is to explore the application and accuracy of different chemometric tools combined with near infrared spectroscopy (NIR) based analytical methods in the field of geographical authenticity of olive oils. As olive oils associated with different geographical origins are mainly characterized by different fatty acid (FA) and triacylglycerol (TAG) compositions, NIR methods for the fast and reliable determination of these parameters are developed. Next, these NIR methods are used to characterize a comprehensive set of olive oils (n > 5000) derived from 19 different countries. This set of data is used to build a statistical workflow, which allows the determination of the geographical origin of unknown olive oil samples. First of all, the untreated data set is pretreated by k‐means clustering and the selection of the relevant analytical variables by principal component analysis (PCA) and linear discriminant analysis (LDA) and min/max normalization of all parameters. Subsequently, classification is performed with a reduced sample set of the 200 most similar samples identified by k‐nearest neighbor tool (kNN). For classification purpose kNN, LDA, naïve Bayes classifier, and logit regression are applied. Practical Applications: The established statistical workflow can be used to verify the geographical origin of olive oils. The application and usage of up to four different statistical models for classification purpose results in a superior probability of the predicted origin in comparison to the application of only one single statistical classification test. As standardized methods are used as reference methods for building the NIR methods, the FA and TAG composition and the iodine value can be either determined by the standard methods or by the described NIR method. 
The presented statistical approach will help to build up a system for the verification of the geographical origin of olive oils.
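
The pretreatment and neighbour-selection steps described in this abstract (min/max normalization, then kNN to pick the most similar reference samples) might look roughly like the sketch below. It is only an illustration under stated assumptions: the function names, the two-column toy data and the origin labels "A" and "B" are hypothetical, Euclidean distance is assumed, and the clustering, PCA/LDA variable selection and the other classifiers of the published workflow are not reproduced.

```python
import math

def fit_min_max(rows):
    """Return per-column minima and ranges of the reference samples."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return lows, spans

def apply_min_max(row, lows, spans):
    """Scale one sample with the reference minima/ranges (constant columns map to 0)."""
    return [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]

def k_most_similar(reference, query, k):
    """Indices of the k reference rows closest to the query (Euclidean distance assumed)."""
    order = sorted(range(len(reference)), key=lambda i: math.dist(reference[i], query))
    return order[:k]

def knn_vote(reference, labels, query, k):
    """Majority label among the k most similar reference rows."""
    votes = {}
    for i in k_most_similar(reference, query, k):
        votes[labels[i]] = votes.get(labels[i], 0) + 1
    return max(votes, key=votes.get)

# Hypothetical toy data: two composition variables per sample, two origins.
samples = [[70.0, 10.0], [72.0, 9.0], [60.0, 18.0], [62.0, 17.0]]
origins = ["A", "A", "B", "B"]
lows, spans = fit_min_max(samples)
scaled = [apply_min_max(s, lows, spans) for s in samples]
unknown = apply_min_max([71.5, 9.2], lows, spans)
predicted = knn_vote(scaled, origins, unknown, k=3)
```

Scaling the unknown sample with the minima and ranges of the reference set, rather than its own, keeps the distances comparable across samples.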

5.
Current fault prognosis techniques sometimes perform unsatisfactorily, especially for incipient faults in nonlinear processes, due to large time delays and complex internal connections. To overcome this deficiency, multivariate time delay analysis is incorporated into highly sensitive local kernel principal component analysis. In this approach, mutual information estimation and the Bayesian information criterion (BIC) are used separately to acquire the correlation degree and time delay of the process variables. Moreover, to achieve prediction, time series prediction by a back propagation (BP) network is applied, whose input is the multivariate correlated time series rather than the original time series. The multivariate time-delayed series and the future values obtained by time series prediction are then combined to construct the input of the local kernel principal component analysis (LKPCA) model for incipient fault prognosis. The new method has been exemplified on a simple nonlinear process and the complicated Tennessee Eastman (TE) benchmark process. The results indicate that the new method is superior in fault prognosis sensitivity to other traditional fault prognosis methods. © 2016 The Chemical Industry and Engineering Society of China, and Chemical Industry Press. All rights reserved.

6.
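徐圆, 刘莹, 朱群雄. 《化工学报》, 2013, 64(12): 4290-4295
Fault prediction for complex processes is key to ensuring safe and reliable operation. The working state of a complex system is often determined by multivariate time-delay series, which contain the time-delay information and correlations among variables and therefore offer a degree of informational completeness. This paper proposes a fault prediction method for complex processes driven by multivariate time-delay series. The method first constructs a time-delay signed directed graph (TD-SDG) of the complex system to obtain the multivariate time-delay series. Then, to address the large number of variables and the complex relationships in such systems, it integrates independent component analysis (ICA) with an ELM neural network, which quickly extracts the independent components of the multivariate time-delay series to build monitoring statistics and ultimately predict faults. Simulation experiments on the Tennessee Eastman (TE) process show that the proposed method can predict faults at least 15 min in advance, so that operators can take timely and effective measures.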

7.
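Multimode process monitoring based on a PCA mixture model   Total citations: 7 (self-citations: 5, citations by others: 2)
许仙珍, 谢磊, 王树青. 《化工学报》, 2011, 62(3): 743-752
Traditional multivariate statistical fault detection methods mostly assume that the measured data follow a single Gaussian distribution. To address this shortcoming, a multimode process monitoring method based on a PCA (principal component analysis) mixture model is proposed. First, PCA dimensionality reduction is applied directly to the covariance of each Gaussian component of the mixture model, diagonalizing the covariance matrix; this both reduces the computational load and avoids the singularity problem caused by correlated variables. At the same time, a BYY incremental EM algorithm is used to automatically obtain the optimal number of mixture components, avoiding the shortcomings of the conventional EM algorithm. Besides parameters such as the mean, covariance and prior probability, the resulting mixture model also includes the PCA loading matrices; that is, a PCA model is built for each mixture component. Monitoring statistics are then defined to detect faults in multimode processes. A numerical example and an application to the TE process show that the proposed method requires no prior process knowledge, automatically obtains the number of operating modes, accurately estimates the statistical characteristics of each mode, and detects various faults in multimode processes more accurately and promptly.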

8.
In this article we extend Shiryaev's quickest change detection formulation by also accounting for the cost of observations used before the change point. The observation cost is captured through the average number of observations used in the detection process before the change occurs. The objective is to select an on–off observation control policy that decides whether or not to take a given observation, along with the stopping time at which the change is declared, to minimize the average detection delay, subject to constraints on both the probability of false alarm and the observation cost. By considering a Lagrangian relaxation of the constraint problem and using dynamic programming arguments, we obtain an a posteriori probability-based two-threshold algorithm that is a generalized version of the classical Shiryaev algorithm. We provide an asymptotic analysis of the two-threshold algorithm and show that the algorithm is asymptotically optimal (that is, the performance of the two-threshold algorithm approaches that of the Shiryaev algorithm) for a fixed observation cost, as the probability of false alarm goes to zero. We also show, using simulations, that the two-threshold algorithm has good observation cost-delay trade-off curves and provides significant reduction in observation cost compared to the naïve approach of fractional sampling, where samples are skipped randomly. Our analysis reveals that, for practical choices of constraints, the two thresholds can be set independent of each other: one based on the constraint of false alarm and another based on the observation cost constraint alone.
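
A two-threshold rule of the kind described in this abstract can be sketched as a posterior-probability recursion. The sketch below is an assumption-laden illustration rather than the authors' algorithm: it assumes a geometric prior on the change point with parameter `rho`, takes precomputed likelihood ratios as input, and all names (`two_threshold_shiryaev`, `stop_threshold`, `observe_threshold`) are hypothetical. It also counts every sample taken, whereas the abstract's cost counts only observations used before the change.

```python
def two_threshold_shiryaev(likelihood_ratios, rho, stop_threshold,
                           observe_threshold, p0=0.0):
    """Return (stopping_time, observations_used); stopping_time is None if never stopped.

    likelihood_ratios[k-1] is f1(x)/f0(x) for the sample offered at step k;
    it is only consulted when that sample is actually taken.
    """
    p = p0        # posterior probability that the change has already occurred
    used = 0
    for step, ratio in enumerate(likelihood_ratios, start=1):
        take_sample = p >= observe_threshold      # on-off observation control
        prior = p + (1.0 - p) * rho               # change may occur at this step
        if take_sample:
            used += 1
            p = prior * ratio / (prior * ratio + (1.0 - prior))  # Bayes update
        else:
            p = prior                             # sample skipped: prior only
        if p >= stop_threshold:
            return step, used
    return None, used

# Hypothetical run: every offered sample favours the post-change model 3:1.
result = two_threshold_shiryaev([3.0] * 10, rho=0.1,
                                stop_threshold=0.9, observe_threshold=0.2)
```

Setting `observe_threshold` to 0 makes the rule take every sample, which reduces the recursion to the classical single-threshold form.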

9.
Most multivariate statistical monitoring methods based on principal component analysis (PCA) assume implicitly that the observations at one time are statistically independent of observations at past times and that the latent variables follow a Gaussian distribution. However, in real chemical and biological processes, these assumptions are invalid because of their dynamic and nonlinear characteristics. Therefore, monitoring charts based on conventional PCA tend to show many false alarms and poor detectability. In this paper, a new statistical process monitoring method using dynamic independent component analysis (DICA) is proposed to overcome these disadvantages. ICA is a recently developed technique for revealing hidden factors that underlie sets of measurements that follow a non-Gaussian distribution. Its goal is to decompose a set of multivariate data into a base of statistically independent components without a loss of information. The proposed DICA monitoring method applies ICA to a matrix augmented with time-lagged variables. DICA can show more powerful monitoring performance in the case of a dynamic process since it can extract source signals which are independent of the auto- and cross-correlation of variables. It is applied to fault detection in both a simple multivariate dynamic process and the Tennessee Eastman process. The simulation results clearly show that the method effectively detects faults in a multivariate dynamic process.
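
The augmentation with time-lagged variables mentioned in this abstract can be illustrated with a short sketch. The function name `augment_with_lags` and the row layout (newest observation first) are assumptions made for illustration; the ICA step itself is not shown.

```python
def augment_with_lags(data, lags):
    """Stack each observation with its `lags` predecessors.

    data: list of rows, where row t holds the variables measured at time t.
    Returns rows [x(t), x(t-1), ..., x(t-lags)] for t = lags .. len(data)-1.
    """
    if lags < 0 or lags >= len(data):
        raise ValueError("lags must satisfy 0 <= lags < number of observations")
    augmented = []
    for t in range(lags, len(data)):
        row = []
        for lag in range(lags + 1):
            row.extend(data[t - lag])   # append x(t - lag) to the stacked row
        augmented.append(row)
    return augmented

# Hypothetical two-variable series with four time steps.
series = [[1, 10], [2, 20], [3, 30], [4, 40]]
stacked = augment_with_lags(series, lags=2)
```

Each stacked row has (lags + 1) times as many columns as the original data, and the first `lags` time steps are dropped because they have no complete history.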

10.
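董顺, 李益国, 孙栓柱, 刘西陲, 沈炯. 《化工学报》, 2018, 69(8): 3528-3536
As a classical method, principal component analysis (PCA) has been widely used in multivariate statistical process monitoring. However, PCA and its various improved variants extract only a single layer of features from the raw data and do not extract deeper features. Developments in deep learning in computer science indicate that deep network structures are beneficial for extracting data features. Therefore, the principal component analysis network (PCANet), a deep learning network structure, is introduced into fault diagnosis and combined with multivariate statistical process monitoring methods to improve fault detection. Within the PCANet framework, a state-space model is added to the network structure as a dynamic layer to handle the dynamic characteristics of industrial process data. In addition, the output layer is redesigned with fault detection as the objective. Finally, simulation tests on the TE process verify the feasibility and effectiveness of the method for fault detection.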

11.
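Quantitative safety early-warning method for fault chain effects in refining and petrochemical units   Total citations: 2 (self-citations: 2, citations by others: 0)
胡瑾秋, 张来斌, 王安琪. 《化工学报》, 2016, 67(7): 3091-3100
Faults in refining and petrochemical units, and the chain effects of those faults, pose serious hazards to oil and gas production and to people's lives. Studying accident risk from the perspective of fault chains, a quantitative safety early-warning method for fault chain effects in refining and petrochemical units is proposed. First, the fault chain relationship structure of the units is analyzed, and a fault chain effect relationship model is built on the basis of the goal tree-success tree and dynamic master logic diagram (GTST-DMLD), revealing the behavioural patterns and underlying correlations of fault chain effects so that the safety state of the units under abnormal conditions can be evaluated. Then, with the Markov process as the theoretical basis, a fault chain effect prediction model is built to predict the consequences and direction of fault propagation and to compute the probability of each consequence, providing a basis for field operators to carry out proactive maintenance or emergency response. In the case study, the method is applied to and validated on the atmospheric tower unit and the vacuum furnace unit of a chemical plant. The results show that the method can accurately evaluate and predict the system state after a fault occurs, and that it is effective and feasible. It helps operators guard against other abnormal conditions while handling an existing fault, reducing the overall risk in oil and gas production and processing.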

12.
High-throughput analysis of biomass is necessary to ensure consistent and uniform feedstocks for agricultural and bioenergy applications and is needed to inform genomics and systems biology models. Pyrolysis followed by mass spectrometry such as molecular beam mass spectrometry (py-MBMS) analyses are becoming increasingly popular for the rapid analysis of biomass cell wall composition and typically require the use of different data analysis tools depending on the need and application. Here, the authors report the py-MBMS analysis of several types of lignocellulosic biomass to gain an understanding of spectral patterns and variation with associated biomass composition and use machine learning approaches to classify, differentiate, and predict biomass types on the basis of py-MBMS spectra. Py-MBMS spectra were also corrected for instrumental variance using generalized linear modeling (GLM) based on the use of select ions relative abundances as spike-in controls. Machine learning classification algorithms e.g., random forest, k-nearest neighbor, decision tree, Gaussian Naïve Bayes, gradient boosting, and multilayer perceptron classifiers were used. The k-nearest neighbors (k-NN) classifier generally performed the best for classifications using raw spectral data, and the decision tree classifier performed the worst. After normalization of spectra to account for instrumental variance, all the classifiers had comparable and generally acceptable performance for predicting the biomass types, although the k-NN and decision tree classifiers were not as accurate for prediction of specific sample types. Gaussian Naïve Bayes (GNB) and extreme gradient boosting (XGB) classifiers performed better than the k-NN and the decision tree classifiers for the prediction of biomass mixtures. 
The data analysis workflow reported here could be applied and extended for comparison of biomass samples of varying types, species, phenotypes, and/or genotypes or subjected to different treatments, environments, etc. to further elucidate the sources of spectral variance, patterns, and to infer compositional information based on spectral analysis, particularly for analysis of data without a priori knowledge of the feedstock composition or identity.

13.
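翟坤, 杜文霞, 吕锋, 辛涛, 句希源. 《化工学报》, 2019, 70(2): 716-722
To address the low accuracy and heavy computational load of dynamic nonlinear fault detection in complex industrial systems, an improved dynamic kernel principal component analysis fault detection method is proposed. The method first uses the degree of indiscernibility to eliminate weakly correlated or uncorrelated variables, reducing the amount of data. It then builds an augmented matrix from the screened data by extending the observations, and applies kernel principal component analysis to the matrix to extract the nonlinear spatial correlation features of the variable data. Finally, by monitoring the two statistics T² and SPE, it detects system faults and identifies the faulty variables. Simulation experiments show that the method can effectively monitor and diagnose wind turbine faults and that, compared with the KPCA method, the improved dynamic kernel principal component analysis method is more sensitive to small faults.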

14.
Building energy consumption accounts for nearly 40% of global energy consumption, and HVAC (Heating, Ventilating, and Air Conditioning) systems are the major building energy consumers. As one type of HVAC system, the heat pump air conditioning system is more energy-efficient than the traditional air conditioning system and is being more widely used to save energy. However, in northern China, extreme climatic conditions increase the cooling and heating load of the heat pump air conditioning system and accelerate the aging of the equipment, and the sensors may detect drifted parameters owing to climate change. These non-linear drifted parameters increase the false alarm rate of fault detection and the need for unnecessary troubleshooting. To overcome the impact of device aging and drifted parameters, a Kalman filter and SPC (statistical process control) fault detection method is introduced in this paper. In this method, the model parameter and its standard variance are estimated by a Kalman filter based on the gray model and the real-time data of the air conditioning system. Further, by using SPC to construct dynamic control limits, the false alarm rate is reduced. The paper focuses mainly on cold machine failure among the component failures and on its soft fault detection. This approach has been tested on a simulation model of the “Sino-German Energy Conservation Demonstration Center” building heat pump air-conditioning system in Shenyang, China. The results show that the Kalman filter and SPC fault detection method is simple and highly efficient with a low false alarm rate, and that it can deal with the difficulties caused by the extreme environment and the non-linear influence of the parameters. It also provides a good foundation for dynamic fault diagnosis and fault prediction analysis.
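
The idea of dynamic control limits that follow a drifting parameter can be sketched with a scalar Kalman filter. This is a simplified illustration, not the method of the paper: a random-walk state model is swapped in for the gray model, the noise variances `q` and `r`, the three-sigma width and the function name `monitor` are all assumptions, and samples that raise an alarm are simply excluded from the update.

```python
import math

def monitor(measurements, q, r, x0, p0, width=3.0):
    """Flag measurements outside Kalman-based dynamic control limits.

    q, r: assumed process and measurement noise variances.
    x0, p0: initial estimate and its variance.
    Returns (alarm_indices, final_estimate).
    """
    x, p = x0, p0
    alarms = []
    for k, z in enumerate(measurements):
        p_pred = p + q                          # predict: random-walk state model
        sigma = math.sqrt(p_pred + r)           # std of the predicted measurement
        lower, upper = x - width * sigma, x + width * sigma
        if z < lower or z > upper:
            alarms.append(k)                    # outside the dynamic limits
            p = p_pred                          # skip the update for this sample
            continue
        gain = p_pred / (p_pred + r)            # Kalman gain
        x = x + gain * (z - x)                  # correct the estimate
        p = (1.0 - gain) * p_pred
    return alarms, x

# Hypothetical readings: slow drift plus one abrupt outlier at index 5.
readings = [10.0, 10.1, 10.2, 10.3, 10.4, 13.0, 10.6]
alarms, estimate = monitor(readings, q=0.01, r=0.01, x0=10.0, p0=0.01)
```

Because the limits are centred on the running estimate, the slow drift in the toy readings is tracked without alarms, while the abrupt jump is flagged.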

15.
Nonlinear dynamic process monitoring based on dynamic kernel principal component analysis (DKPCA) is proposed. The kernel functions used in kernel PCA (KPCA) are well suited to capturing the nonlinear properties of processes, and the time-lagged data extension is suitable for describing their dynamic characteristics. DKPCA enables us to monitor an arbitrary process with severe nonlinearity and/or dynamics. In this respect, it is a generalized concept of multivariate statistical monitoring approaches. A unified monitoring index combining T² with SPE is also suggested. The proposed monitoring method based on DKPCA is applied to a simulated nonlinear process and a wastewater treatment process. A comparison study of PCA, dynamic PCA, KPCA, and DKPCA is carried out in terms of type I error rate, type II error rate, and detection delay. The monitoring results confirm that the proposed methodology results in the best monitoring performance, i.e., low missing alarms and small detection delay, for all the faults.

16.
In a typical Euclidean three‐dimensional colour space such as CIELAB, the ‘third‐dimension’, such as CIELAB chroma, has long been criticized as being confusing and difficult to understand for naïve observers and it had relatively poor consistency in visual assessments. As an attempt to find a promising replacement to existing ‘third‐dimension’, two psychophysical experiments were conducted in this study using naïve observers. In the first experiment, 24 Korean observers assessed 48 NCS colour chips in terms of bright, light‐heavy, active‐passive, fresh‐stale, clean‐dirty, clear, boring, natural‐not natural, warm‐cool, intense‐weak, saturated, vivid‐dull, distinct‐indistinct, full‐thin and striking. According to experimental results, ‘saturated’ and ‘vivid‐dull’ were found to highly correlate with CIELAB chroma and were thus regarded as good candidates to become alternatives to existing ‘third‐dimension’. In the second experiment, 40 Korean and 68 British observers assessed more than 100 samples in terms of saturation, vividness, blackness and whiteness. Thus, observers assessed 120 samples for saturation, vividness and whiteness. For blackness, 110 samples were assessed. In both experiments, the colour samples were presented in a viewing cabinet and assessed individually. Principal component analysis identified two components that were associated with CIELAB lightness and chroma. In general, there was a similarity between the visual results of the British and Korean observers. High correlation coefficients were found for the following comparisons: predicted values of Berns' depth model versus the present ‘saturation’ response; Berns' clarity versus ‘vividness’ response; Berns' vividness versus ‘blackness’ response; and CIELAB lightness versus ‘whiteness’ response. © 2016 Wiley Periodicals, Inc. Col Res Appl, 42, 203–215, 2017

17.
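Application of relevance vector machines to sensor fault diagnosis in microbial fermentation   Total citations: 2 (self-citations: 0, citations by others: 2)
Microbial fermentation processes are strongly nonlinear and time-varying, and many process parameters need to be monitored, so sensor fault diagnosis for the fermenter is particularly important. A new fault diagnosis method for microbial fermentation is therefore proposed, in which two relevance vector machines serve as an observer and a classifier, respectively. The observer estimates the parameter value measured by a given sensor (the carbon dioxide evolution rate is used as the example) to obtain a residual sequence, and the classifier classifies the residual sequence. Simulation results show that the method can effectively diagnose sensor faults.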

18.
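Based on the idea that multiple models can improve estimation accuracy and generalization, a multi-model soft sensor modelling method based on improved weighted rough sets is proposed. Weighted rough sets can handle the classification of imbalanced data effectively, but traditional methods for choosing sample weights lack an overall perspective and tend to reduce the overall accuracy of the classifier. By introducing class weights into weighted rough sets, a weighted rough set decision algorithm based on minimum-risk Bayesian decision theory is obtained, and the AdaBoostM2 algorithm is used to optimize the sample weights and class weights. The minimum-risk weighted rough set classifier built in this way effectively improves classification accuracy, thereby ensuring the reliability of each sub-model.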

19.
Protein–protein interactions mediate essentially all biological processes. A detailed understanding of these interactions is thus a major goal of modern biological chemistry. In recent years, genome sequencing efforts have revealed tens of thousands of novel genes, but the benefits of genome sequences will only be realized if these data can be translated to the level of protein function. While genome databases offer tremendous opportunities to expand our knowledge of protein–protein interactions, they also present formidable challenges to traditional protein chemistry methods. Indeed, it has become apparent that efficient analysis of proteins on a proteome‐wide scale will require the use of rapid combinatorial approaches. In this regard, phage display is an established combinatorial technology that is likely to play an even greater role in the future of biology. This article reviews recent applications of phage display to the analysis of protein–protein interactions. With combinatorial mutagenesis strategies, it is now possible to rapidly map the binding energetics at protein–protein interfaces through statistical analysis of phage‐displayed protein libraries. In addition, naïve phage‐displayed peptide libraries can be used to obtain small peptide ligands to essentially any protein of interest, and in many cases, these binding peptides act as antagonists or even agonists of natural protein functions. These methods are accelerating the pace of research by enabling the study of complex protein–protein interactions with simple molecular biology methods. With further optimization and automation, it may soon be possible to study hundreds of different proteins in parallel with efforts comparable to those currently expended on the analysis of individual proteins.

20.
In this paper, on-line batch process monitoring is developed on the basis of the three-way data structure and the time-lagged window of process dynamic behavior. Two methods, DPARAFAC (dynamic parallel factor analysis) and DTri-PLS (dynamic trilinear partial least squares), are used here depending on the process variables only or on the process variables and quality indices, respectively. Although multivariate analysis using such PARAFAC (parallel factor analysis) and Tri-PLS (trilinear partial least squares) models has been reported elsewhere, they are not suited for practicing on-line batch monitoring owing to the constraints of their data structures. A simple modification of the data structure provides a framework wherein the moving window based model can be incorporated in the existing three-way data structure to enhance the detectability of the on-line batch monitoring. By a sequence of time window of each batch, the proposed methodology is geared toward giving meaningful results that can be easily connected to the current measurements without the extra computation for the estimation of unmeasured process variables. The proposed method is supported by using two sets of benchmark fault detection problems. Comparisons with the existing two-way and three-way multiway statistical process control methods are also included.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号