Similar literature
Found 20 similar articles; search took 140 ms.
1.
Data-driven soft sensors have been widely used in both academic research and industrial applications for predicting hard-to-measure variables or replacing physical sensors to reduce cost. It has been shown that the performance of these data-driven soft sensors could be greatly improved by selecting only the vital variables that strongly affect the primary variables, rather than using all the available process variables. In this work, a comprehensive evaluation of different variable selection methods for PLS-based soft sensor development is presented, and a new metric is proposed to assess the performance of different variable selection methods. The following seven variable selection methods are compared: stepwise regression (SR), partial least squares with regression coefficients (PLS-BETA), PLS with variable importance in projection (PLS-VIP), uninformative variable elimination with PLS (UVE-PLS), genetic algorithm with PLS (GA-PLS), least absolute shrinkage and selection operator (Lasso), and competitive adaptive reweighted sampling with PLS (CARS-PLS). Their strengths and limitations for soft sensor development are demonstrated by a simulated case study and an industrial case study.

2.
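4H-methylimidazobenzodiazepinone (TIBO) derivatives are a new class of anti-AIDS drugs, and the molecular connectivity index is a widely applied and fairly successful topological index. This paper defines and calculates the characteristic value $\delta_i$ of the atoms in TIBO derivatives and, using quantum chemical calculations, constructs a new topological integrated index $G$ and the molecular connectivity index ${}^{m}X$. Multiple regression is then used to establish a quantitative structure-activity relationship that accurately estimates and predicts the oil-water partition coefficient of TIBO derivative drugs. The resulting multiple regression equation is $\log P = 0.782\,G - 0.143\,{}^{0}X + 0.231\,{}^{2}X - 3.829$, with a mean relative estimation error of 2.53%. To test the stability and predictive ability of the model, leave-one-out cross-validation was performed, giving a mean relative prediction error of 3.40%. The model has a high correlation coefficient, good stability, and strong predictive ability.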
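
As a minimal sketch, the regression equation and the mean relative error quoted above could be evaluated as follows. It assumes the coefficients 0.782, -0.143, and +0.231 apply to G, the zeroth-order connectivity index, and the second-order connectivity index respectively, with intercept -3.829; the function and argument names are hypothetical, and the sample values in the usage note are illustrative rather than data from the paper.

```python
def predict_log_p(g, x0, x2):
    """Evaluate the regression equation for logP (assumed term assignment).

    g  : topological integrated index G
    x0 : zeroth-order molecular connectivity index
    x2 : second-order molecular connectivity index
    """
    return 0.782 * g - 0.143 * x0 + 0.231 * x2 - 3.829


def mean_relative_error(observed, predicted):
    """Mean of |predicted - observed| / |observed|, returned as a percentage."""
    ratios = [abs(p - o) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)
```

For example, `mean_relative_error([2.0, 4.0], [2.1, 3.8])` averages two relative errors of 5% each. The same function applies unchanged to leave-one-out predictions, where each predicted value comes from a model fitted without that compound.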

3.
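A regression-coefficient-based variable selection method for near-infrared spectral analysis   Total citations: 1, self-citations: 0, citations by others: 1
A stepwise variable selection method based on regression coefficients is proposed. After the regression coefficient of each spectral variable is computed, the variables are ranked by the absolute value of their coefficients in descending order, and the best variable subset is chosen step by step by forward selection with PLS cross-validation. The method was applied to near-infrared spectra of corn and diesel: 14, 12, and 30 optimal variables were selected for modeling corn protein, diesel cetane number, and diesel viscosity, respectively, and in every case the predictions were better than those of the full-spectrum model. The method is therefore an effective and practical variable selection method for near-infrared spectra.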
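
The ranking and forward-selection steps described in this abstract can be sketched roughly as below. This is an illustration under assumptions, not the authors' implementation: `cv_error` is a hypothetical caller-supplied function that returns a cross-validated prediction error (for instance the RMSECV of a PLS model) for a given list of variable indices.

```python
def rank_by_abs_coefficient(coefficients):
    """Return variable indices sorted by |coefficient|, largest first."""
    return sorted(range(len(coefficients)),
                  key=lambda j: abs(coefficients[j]), reverse=True)


def select_best_prefix(coefficients, cv_error):
    """Add ranked variables one at a time and keep the subset with the
    lowest cross-validation error.

    cv_error is a hypothetical callback: it takes a list of variable
    indices and returns the cross-validated error of a model built on them.
    """
    ranked = rank_by_abs_coefficient(coefficients)
    best_subset, best_error = [], float("inf")
    for k in range(1, len(ranked) + 1):
        subset = ranked[:k]          # the k highest-ranked variables
        error = cv_error(subset)
        if error < best_error:       # strict: ties keep the smaller subset
            best_subset, best_error = subset, error
    return best_subset, best_error
```

Because the candidate subsets are nested prefixes of one ranking, only as many models as there are variables need to be cross-validated, which keeps the search cheap even for full spectra.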

4.
A new approach for the estimation and the validation of a structural equation model with a formative-reflective scheme is presented. The basis of the paper is a proposal for overcoming a potential deficiency of PLS path modeling. In the PLS approach the reflective scheme assumed for the endogenous latent variables (LVs) is inverted; moreover, the model errors are not explicitly taken into account for the estimation of the endogenous LVs. The proposed approach utilizes all the relevant information in the formative manifest variables (MVs), providing solutions which respect the causal structure of the model. The estimation procedure is based on the optimization of the redundancy criterion. The new approach, entitled redundancy analysis approach to path modeling (RA-PM), is compared with both traditional PLS Path Modeling and LISREL methodology, on the basis of real and simulated data.

5.
A new approach for the estimation and the validation of a structural equation model with a formative-reflective scheme is presented. The basis of the paper is a proposal for overcoming a potential deficiency of PLS path modeling. In the PLS approach the reflective scheme assumed for the endogenous latent variables (LVs) is inverted; moreover, the model errors are not explicitly taken into account for the estimation of the endogenous LVs. The proposed approach utilizes all the relevant information in the formative manifest variables (MVs), providing solutions which respect the causal structure of the model. The estimation procedure is based on the optimization of the redundancy criterion. The new approach, entitled redundancy analysis approach to path modeling (RA-PM), is compared with both traditional PLS Path Modeling and LISREL methodology, on the basis of real and simulated data.

6.
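A variable selection method using partial least squares   Total citations: 1, self-citations: 0, citations by others: 1
Using information from partial least squares (PLS) modeling, such as the regression coefficients, the original independent variables are screened: redundant or weakly influential variables are removed without loss of predictive ability, which makes the model simpler. This study identifies a new criterion for deleting variables that is simple to compute and works well. The results show that screening variables with this PLS-derived deletion criterion is a practical and effective variable selection method. It is well suited to modeling problems with massive data or a very large number of variables, and it greatly reduces the number of variables in the final model, simplifying the model and making practical problems easier to analyze and solve. When applied to traditional Chinese medicine fingerprint data, the resulting model was much simpler than the one obtained with the conventional algorithm.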

7.
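Two data sets were formed by combining, as independent variables, the quantum chemical parameters calculated with HyperChem 7.5 Student Evaluation with either the structural parameters defined in the literature or the structural parameters defined in this paper. Stepwise regression, genetic algorithm-partial least squares (GA-PLS), and genetic algorithm-support vector machine (GA-SVM) were then used for a QSAR study of the inhibition of PTKs by flavonoids. For every algorithm, the data set built from the structural parameters defined in this paper gave better predictions, indicating that encoding parameters which combine substituent type and substitution position carry richer information and describe the properties of the compounds more reasonably. Among the algorithms, the GA-SVM model gave the best predictions in all cases: in leave-one-out prediction on the two data sets, the correlation coefficient R and the mean absolute error (MAE) between experimental and predicted PTK inhibition were 0.7595 and 0.2871, and 0.7864 and 0.2883, respectively. The study also shows that the combined GA-PLS and GA-SVM algorithms predict far better than PLS and SVM used alone. For the MLR models built by stepwise regression, the fitted correlation coefficients R reached 0.8136 and 0.8250 on the two data sets, but dropped to 0.7113 and 0.7354 under leave-one-out cross-validation, clearly lower than those of the combined GA-PLS and GA-SVM algorithms.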
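
The gap between fitted and leave-one-out correlation coefficients reported here comes from how leave-one-out predictions are produced: each sample is predicted by a model that never saw it. A minimal sketch of that procedure and of the two metrics (R and MAE) follows; `fit` and `predict` are hypothetical callbacks standing in for any of the models in the study, and `pearson_r` does not guard against zero variance.

```python
import math


def leave_one_out_predictions(xs, ys, fit, predict):
    """Predict each sample with a model trained on all the other samples.

    fit(train_x, train_y) -> model and predict(model, x) -> value are
    hypothetical callbacks supplied by the caller.
    """
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, xs[i]))
    return predictions


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)
```

Computing `pearson_r` on fitted values and again on leave-one-out predictions gives the two kinds of R that the abstract compares.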

8.
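Partial least squares (PLS) and a genetic algorithm (GA) are used to predict the explosive performance of energetic materials. The GA selects a small number of variables from the "molecular structure-explosive performance (QSDR)" data, so that few variables carry most of the variable information; PLS is then used to build the structure-performance model and predict performance. Applying the method to furazan and aromatic energetic materials verifies its effectiveness.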

9.
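In process industries, multiple operating conditions change the data distribution and degrade the prediction performance of traditional soft sensor models. To address this, the paper proposes a domain-adaptation regression framework for multi-condition soft sensing based on hypergraph regularization. First, with the nonlinear iterative partial least squares regression algorithm as the base model, data from the current operating condition are reconstructed in the latent variable space from historical operating-condition data, which strengthens the correlation between conditions and reduces the distribution discrepancy; at the same time, low-rank and sparse constraints on the reconstruction coefficients preserve the local and global subspace structure of the data. Second, a hypergraph Laplacian regularization term constrains the solution of the domain-adaptive latent variables, so that the data structure is not destroyed while the latent variables are sought. Finally, the model parameters are optimized with the alternating direction method of multipliers. Experiments on several data sets show that the method improves the prediction accuracy and generalization of soft sensor models under multiple operating conditions.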

10.
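Quantitative structure-activity relationship study of tetrahydroimidazobenzodiazepinone anti-AIDS drugs   Total citations: 5, self-citations: 4, citations by others: 1
The three-dimensional holographic vector of atomic interaction field (3D-HoVAIF) was used to study the quantitative structure-activity relationship (QSAR) of 89 tetrahydroimidazobenzodiazepinone (TIBO) anti-AIDS drugs. Models were built with partial least squares regression and with an artificial neural network, and both internal and external validation were used to analyze and test model stability in depth. For the PLS and ANN models, the cumulative multiple correlation coefficient ($R^2_{\mathrm{cum}}$), the leave-one-out (LOO) cross-validation (CV) multiple correlation coefficient ($Q^2_{\mathrm{CV}}$), and the external validation multiple correlation coefficient ($Q^2_{\mathrm{ext}}$) were 0.802, 0.710, 0.552 and 0.871, 0.864, 0.760, respectively. The results show that 3D-HoVAIF represents the structural information of TIBO anti-AIDS drug molecules well, that the QSAR models have good stability and predictive ability, and that ANN modeling outperforms both PLS and the multiple linear regression (MLR) reported previously.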

11.
Multicollinearity and difficulty of interpreting the coefficients of dam regression models pose two problems: (1) selection of informative variables for analysing dam deformation behaviour, and (2) mitigation of the multicollinearity among the variables. Resolving these two problems necessitates the application of genetic algorithm-based partial least squares (GA-PLS) and statistically inspired modification of PLS algorithm (SIMPLS). A SIMPLS regression with the predictor variables selected by GA-PLS (hybrid GA/SIMPLS regression) is put forward to interpret the results obtained from periodic monitoring surveys of hydraulic structures. The hybrid model is employed for analysing the crack behaviour of an earth-rock dam in China. The results show the proposed model is superior to an ordinary SIMPLS and stepwise regression, especially when multicollinearity and influential outliers exist among the variables.

12.
This paper is concerned with data science and analytics as applied to data from dynamic systems for the purpose of monitoring, prediction, and inference. Collinearity is inevitable in industrial operation data. Therefore, we focus on latent variable methods that achieve dimension reduction and collinearity removal. We present a new dimension reduction expression of state space framework to unify dynamic latent variable analytics for process data, dynamic factor models for econometrics, subspace identification of multivariate dynamic systems, and machine learning algorithms for dynamic feature analysis. We unify or differentiate them in terms of model structure, objectives with constraints, and parsimony of parameterization. The Kalman filter theory in the latent space is used to give a system theory foundation to some empirical treatments in data analytics. We provide a unifying review of the connections among the dynamic latent variable methods, dynamic factor models, subspace identification methods, dynamic feature extractions, and their uses for prediction and process monitoring. Both unsupervised dynamic latent variable analytics and the supervised counterparts are reviewed. Illustrative examples are presented to show the similarities and differences among the analytics in extracting features for prediction and monitoring.

13.
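To handle the nonlinear nature of drug structure-activity relationships, a modeling method that combines radial basis functions (RBF) with adaptive partial least squares regression (APLSR) is proposed. The combined method uses RBF to transform the independent variables nonlinearly and uses APLSR to eliminate the multicollinearity among the output variables of that transformation. With the predictive ability of the model as the objective, it adaptively determines the optimal number of latent variables of the PLSR model, which yields a model with good predictive performance. The RBF-APLSR method was applied to quantitative structure-activity relationship modeling of sulfur-containing benzene derivatives with satisfactory results; its prediction accuracy is higher than that of PLSR.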
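
The nonlinear transformation step can be illustrated with a small sketch. The abstract does not specify the basis function, so a Gaussian RBF is assumed here, and the `centres` and `width` arguments are hypothetical choices left to the caller (for example, the training samples themselves as centres).

```python
import math


def rbf_features(samples, centres, width):
    """Map each sample to its Gaussian RBF responses, one per centre.

    Assumed form: phi_j(x) = exp(-||x - c_j||^2 / (2 * width^2)).
    samples and centres are lists of equal-length numeric lists.
    """
    features = []
    for x in samples:
        row = []
        for c in centres:
            sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
            row.append(math.exp(-sq_dist / (2.0 * width ** 2)))
        features.append(row)
    return features
```

Neighbouring centres produce strongly correlated feature columns, which is why a collinearity-tolerant regression such as PLSR is a natural second stage after this transformation.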

14.
Journal of Process Control, 2014, 24(7): 1046-1056
Soft sensors are used to predict response variables, which are difficult to measure, using the data of predictors that can be obtained more easily. Arranging time-lagged data of predictors and applying partial least squares (PLS) to the dataset is a popular approach for extracting the correlation between data of the responses and predictors of the process dynamics. However, the model input dimension dramatically soars once multiple time delays are incorporated. In addition, the selection of variables in the dynamic PLS (DPLS) model is a critical step for the robustness and the accuracy of the inferential model, since irrelevant inputs deteriorate the prediction performance of the soft sensor. The sparse PLS (SPLS) is a variable selection method that simultaneously selects the important predictors and finds the correlation between the predictors and responses. The sparsity of the model is dependent on a cut-off value in the SPLS algorithm that is determined using a cross-validation procedure. Therefore, the threshold is a compromise for all latent variable directions. It is necessary to further shrink the inputs from the result of SPLS to obtain a more compact model. In the presented work, named SPLS-VIP, the variable importance in projection (VIP) method was used to filter out the insignificant inputs from the SPLS result. An industrial soft sensor for predicting oxygen concentrations in the air separation process was developed based on the proposed approach. The prediction performance and the model interpretability could be further improved from the SPLS method using the proposed approach.
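
To make the dimensionality point concrete, here is a minimal sketch of arranging time-lagged predictor data into a regression matrix, as is commonly done before fitting a dynamic PLS model. The function name and the row layout (current values first, then progressively older ones) are assumptions for illustration, not the layout used in the paper.

```python
def build_lagged_matrix(series, max_lag):
    """Stack each time step with its max_lag predecessors.

    series  : list of rows, one per time step, each a list of predictor values
    max_lag : number of past time steps to append to every row
    Returns rows of the form x(t), x(t-1), ..., x(t-max_lag),
    for t = max_lag .. len(series) - 1.
    """
    rows = []
    for t in range(max_lag, len(series)):
        row = []
        for lag in range(max_lag + 1):
            row.extend(series[t - lag])
        rows.append(row)
    return rows
```

With p predictors and `max_lag` delays, every row has p * (max_lag + 1) columns, so 20 predictors with 10 lags already give 220 inputs. That growth is what motivates the sparse selection and VIP filtering described in the abstract.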

15.
The control of batch end-product quality is an important issue in many high value-added manufacturing industries, particularly specialty chemicals and pharmaceuticals. An attractive approach for controlling such systems is to apply partial least squares (PLS) models, which can utilize the low-dimensional latent variable (or score) space with the corresponding control optimization performed on these few latent variables. The manipulated variable trajectories (MVTs) can then be reconstructed from the optimized scores. The existing PLS-based batch end-product quality control methodology does not incorporate the disturbance model in its formulation. As a result, it is demonstrated in this paper that these control formulations can be incapable of adequate disturbance rejection. To reject disturbances adequately, it is necessary to utilize feedback information and re-compute optimal control sequences at multiple instants during a batch run. However, this can lead to erratic control action, resulting in the deterioration of batch end-product quality. To resolve this issue, a revised PLS-based batch end-product quality controller is proposed in this paper that explicitly accounts for disturbance-induced plant-model mismatch by including a simple disturbance model. Furthermore, the proposed controller formulation adds hard constraints to incremental changes of the re-computed control sequences to avoid erratic behavior in the case of multiple control decision points. The ability of the proposed control scheme to reject disturbances and obtain desirable MVTs is demonstrated using a benchmark simulation of a fed-batch fermentation process.

16.
Feature extraction based on decision boundaries   Total citations: 8, self-citations: 0, citations by others: 8
A novel approach to feature extraction for classification based directly on the decision boundaries is proposed. It is shown how discriminantly redundant features and discriminantly informative features are related to decision boundaries. A procedure to extract discriminantly informative features based on a decision boundary is proposed. The proposed feature extraction algorithm has several desirable properties: (1) it predicts the minimum number of features necessary to achieve the same classification accuracy as in the original space for a given pattern recognition problem; and (2) it finds the necessary feature vectors. The proposed algorithm does not deteriorate under the circumstances of equal class means or equal class covariances as some previous algorithms do. Experiments show that the performance of the proposed algorithm compares favorably with those of previous algorithms.

17.
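Fault diagnosis based on the T-PLS contribution plot method   Total citations: 5, self-citations: 0, citations by others: 5
Multivariate statistical process monitoring is an effective fault detection and diagnosis technique for complex industrial processes. The partial least squares (also called projection to latent structures, PLS) model is a projection model commonly used in multivariate statistical process monitoring, and it can model process data and quality data simultaneously. A new fault diagnosis technique based on the total projection to latent structures (T-PLS) model is discussed. The T-PLS model has four detection statistics. A new method for computing the T² contribution plot is proposed, and corresponding contribution plot algorithms are obtained for all the detection statistics. To determine whether a variable is faulty, control limits are computed for the contribution plots of all variables. The technique can divide the identified faulty variables into two classes: those related to Y and those unrelated to Y. A case study on the Tennessee Eastman process shows the effectiveness of the technique.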

18.
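Most subspace clustering algorithms fail to capture the geometric structure among the data when mapping high-dimensional data to a low-dimensional subspace. To address this problem, a deep subspace clustering algorithm with a low-rank constraint prior is proposed, which accounts for both the global and the local structure of the data. The algorithm combines low-rank representation with a deep autoencoder: the low-rank constraint captures the global structure of the data, and the latent feature representation of the neural network is constrained to be low-rank. The autoencoder performs a nonlinear mapping to a low-dimensional subspace by minimizing the reconstruction error, which preserves the local characteristics of the data. A multinomial logistic regression function serves as the discriminative model to predict the subspace segmentation. The whole algorithm is optimized in an unsupervised joint learning framework. Experiments on five data sets verify the effectiveness of the method.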

19.
A latent variable iterative learning model predictive control (LV-ILMPC) method is presented for trajectory tracking in batch processes. Different from the iterative learning model predictive control (ILMPC) model built from the original variable space, LV-ILMPC develops a latent variable model based on dynamic partial least squares (DyPLS) to capture the dominant features of each batch. In each latent variable space, we use a state-space model to describe the dynamic characteristics of the internal model, and an LV-ILMPC controller is designed. Each LV-ILMPC controller tracks the set points of the current batch projection in the corresponding latent variable space, and the optimal control law is determined and the persistent process disturbances are rejected along both time and batch horizons. The proposed LV-ILMPC formulation is based on general LV-MPC and incorporates an iterative learning function into LV-MPC. In addition, the real physical input that drives the process can be reconstructed from the latent variable space. Therefore, this algorithm is particularly suitable for multiple-input, multiple-output (MIMO) systems with strong coupling and serious collinearity. Three studies are used to illustrate the effectiveness of the proposed LV-ILMPC.

20.
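Han Min, Zhang Ruiquan, Xu Meiling. Control and Decision (控制与决策), 2017, 32(9): 1647-1652
To address the problems of the grey absolute relational degree model and the grey similarity relational degree model, an improved grey relational degree model based on relative change area is proposed. Starting from the similarity of the geometric shapes of the sequences, a relative change area that reflects the similarity of the polylines is constructed and used as the basis for computing the relational coefficients; the mean of the local relational degrees then measures the overall similarity, which defines the grey relational degree model. In addition, based on the computed relational degrees, a variable selection algorithm built on set operations is proposed that effectively removes irrelevant and redundant variables. Simulation results verify the effectiveness and soundness of the proposed algorithm.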
