Similar articles
 20 similar articles found
1.
2.
3.
4.
5.
6.
7.
In this paper, a new algorithm, nonlinear PLS improved by the numeric genetic algorithm (NPLSNGA), is applied to handle nonlinear functions for the inner relationship in QSAR. The NGA is used twice in NPLSNGA: once for nonlinear regression and once for solving nonlinear equations. Using a quadratic polynomial function as the inner relationship, the fungicidal activity of a series of O-ethyl-N-isopropylphosphoro(thioureido)thioates was studied. The results are superior to those reported in the reference. In the QSAR of carboquinone derivatives and an anticarcinogenic drug for clinical media, a sigmoid function was used as the inner relation. The results are equivalent to those obtained with ANN.

8.
Prediction of sample properties using spectroscopic data with multivariate calibration is often enhanced by wavelength selection. This paper reports on a built-in wavelength selection method in which the estimated regression vector contains zero to near-zero coefficients for undesirable wavelengths. The method is based on Tikhonov regularization with the model 1-norm (TR1) and is applied to simulated and near-infrared (NIR) spectral data. Models are also formed from wavelength subsets determined by the standard method of stepwise regression (SWR). Harmonious (bias/variance tradeoff) and parsimonious considerations are compared with and without wavelength selection for principal component regression (PCR), ridge regression (RR), partial least squares (PLS), and multiple linear regression (MLR). Results show that TR1 models generally contain large baseline regions of near-zero coefficients, thereby essentially achieving built-in wavelength selection. For example, wavelengths with spectral interferences and/or poor signal-to-noise ratios obtain near-zero regression coefficients. Results often improve with TR1 models compared to full-wavelength PCR, RR, and PLS models. The SWR subset results are similar to those for the TR1 models using the NIR data and worse with the simulated spectral situations. In general, wavelength selection improves prediction accuracy at the cost of a potential increase in variance, and the parsimony remains nearly equivalent to that of full-wavelength models. New insights gained from the reported studies provide useful guidelines on when to use full wavelengths or wavelength selection methods. Specifically, when a small number of large wavelength effects (good sensitivity and selectivity) exist, subset selection by SWR (with caution) and TR1 do well. With a small to moderate number of large to moderate sized wavelength effects, TR1 is better. Lastly, when a large number of small effects are present, full wavelengths with the methods of PCR, RR, or PLS are best.
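
The built-in selection effect of a 1-norm penalty can be illustrated with a minimal sketch. It assumes the objective 0.5*||y - Xb||^2 + lam*||b||_1 and solves it by cyclic coordinate descent; the solver, the function names, and the toy data are illustrative assumptions and are not taken from the paper.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def l1_regression(X, y, lam, n_iter=200):
    """Minimise 0.5*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of rows (samples x wavelengths); y is a list of responses.
    This is an illustrative solver, not the algorithm used in the paper.
    """
    n_samples, n_features = len(X), len(X[0])
    b = [0.0] * n_features
    residual = list(y)  # residual = y - Xb, and b starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n_samples)) for j in range(n_features)]
    for _ in range(n_iter):
        for j in range(n_features):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the residual that excludes column j.
            rho = sum(X[i][j] * residual[i] for i in range(n_samples)) + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n_samples):
                    residual[i] -= X[i][j] * delta
                b[j] = new_bj
    return b


# Toy data (hypothetical): only the first "wavelength" carries signal,
# the other two contribute small noise-like responses.
X_toy = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
y_toy = [2.0, 0.1, -0.1, 2.0]
coefficients = l1_regression(X_toy, y_toy, lam=0.5)
print(coefficients)
```

In this toy case the two uninformative columns receive coefficients of exactly zero, while the informative one is only shrunk. That is the behaviour the abstract describes as built-in wavelength selection.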

9.
A series of herbicidal materials, N-phenylacetamides (NPAs), has been studied for their Quantitative Structure–Activity Relationships (QSAR). The molecular structures and the activity data were taken from the literature [O. Kirino, C. Takayama, A. Mine, Quantitative structure relationships of herbicidal N-(1-methyl-1-phenylethyl) phenylacetamides, Journal of Pesticide Science 11 (1986) 611–617]. The independent variables used to describe the structure of the compounds consisted of seven physicochemical properties, including the mode of molecular connection, steric factor, hydrophobic parameter, etc. Fifty different compounds constitute a sample set, which is divided into two groups: 47 of them form a training set and the remaining three a checking set. Through a systematic study using classic multivariate analysis methods such as Multiple Linear Regression (MLR), Principal Component Analysis (PCA), and Partial Least Squares (PLS) Regression, several QSAR models were established. To better depict the nonlinear nature of the problem, multi-layered feed-forward (MLF) neural networks (NNs) were employed. The results indicated that the conventional multivariate analysis gave larger prediction errors, while the NN method showed better accuracy in both self-checking and prediction-checking. The error variance of the predictions made by NNs was the smallest among all the methods tested, only around half that of the others.

10.
Sample-to-sample variability has proven to be a major challenge in achieving calibration transfer in quantitative biological Raman spectroscopy. Multiple morphological and optical parameters, such as tissue absorption and scattering, physiological glucose dynamics and skin heterogeneity, vary significantly in a human population, introducing nonanalyte-specific features into the calibration model. In this paper, we show that fluctuations of such parameters in human subjects introduce curved (nonlinear) effects in the relationship between the concentrations of the analyte of interest and the mixture Raman spectra. To account for these curved effects, we propose the use of support vector machines (SVM) as a nonlinear regression method over conventional linear regression techniques such as partial least-squares (PLS). Using transcutaneous blood glucose detection as an example, we demonstrate that application of SVM enables a significant improvement (at least 30%) in cross-validation accuracy over PLS when measurements from multiple human volunteers are employed in the calibration set. Furthermore, using physical tissue models with randomized analyte concentrations and varying turbidities, we show that the fluctuations in turbidity alone cause curved effects which can only be adequately modeled using nonlinear regression techniques. The enhanced levels of accuracy obtained with the SVM-based calibration models open up avenues for prospective prediction in humans and thus for clinical translation of the technology.

11.
12.
13.
14.
The paper describes linear and nonlinear modeling for simultaneous prediction of the dissolved oxygen (DO) and biochemical oxygen demand (BOD) levels in river water using a set of independent measured variables. Partial least squares (PLS2) regression and feed-forward back-propagation artificial neural network (FFBP ANN) modeling methods were applied to predict the DO and BOD levels using eleven input variables measured monthly in the river water at eight different sites over a period of ten years. The performance of the models was assessed through the root mean squared error (RMSE), the bias, the standard error of prediction (SEP), the coefficient of determination (R²), the Nash-Sutcliffe coefficient of efficiency (Ef), and the accuracy factor (Af), computed from the measured and model-predicted values of the dependent variables (DO, BOD). Goodness of the model fit to the data was also evaluated through the relationship between the residuals and the model-predicted values of DO and BOD, respectively. Although the values of DO and BOD predicted by both the linear (PLS2) and nonlinear (ANN) models were in good agreement with their respective measured values in the river water, the nonlinear model (ANN) performed relatively better than the linear one. The relative importance and contribution of the input variables to the identified ANN model were evaluated through the partitioning approach. The developed models can be used as a tool for water quality prediction.
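
Several of the performance criteria named above can be sketched in a few lines. The formulas below are common textbook definitions and are an assumption; the abstract does not state the exact forms used by the authors, and the function name and sample values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return RMSE, bias, SEP and Nash-Sutcliffe efficiency (Ef).

    Assumed definitions (not confirmed by the source):
      bias = mean(predicted - observed)
      RMSE = sqrt(mean((predicted - observed)^2))
      SEP  = sqrt(sum((error - bias)^2) / (n - 1))
      Ef   = 1 - sum(error^2) / sum((observed - mean(observed))^2)
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    bias = sum(errors) / n
    sse = sum(e * e for e in errors)
    rmse = math.sqrt(sse / n)
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ef = 1.0 - sse / ss_tot
    return {"rmse": rmse, "bias": bias, "sep": sep, "ef": ef}


# Hypothetical measured and predicted values, e.g. DO in mg/L.
measured = [1.0, 2.0, 3.0, 4.0]
modelled = [1.5, 2.5, 3.5, 4.5]
print(regression_metrics(measured, modelled))
```

With these definitions a constant offset shows up in the bias and the RMSE but leaves the SEP at zero, which is why the criteria are usually reported together.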

15.
Common methods of building linear calibration models are principal component regression (PCR), partial least squares (PLS), and least squares (LS). Recently, the method of cyclic subspace regression (CSR) has been presented and shown to provide PCR, PLS, LS and other related intermediate regressions with one algorithm. When forming a linear model with spectral data for quantitative analysis, prediction results can be adversely affected by responses that do not conform well to the linear model proposed. Wavelength selection can be used to eliminate wavelengths where such problem responses occur. It has recently been reported that CSR regression vectors can be formed by summing weighted eigenvectors where weights are determined from the hat matrix, singular values, and eigenvectors characterizing the sample space. Investigation of these weights shows that wavelength selection based on loading vectors can be misleading. Specifically, by using CSR it is shown that a small weight for an eigenvector can annihilate a large peak in a loading vector. In this study, correlograms are used with CSR regression vectors and eigenvector weights as wavelength-selection criteria. It is demonstrated that even though a model generated by LS for a wavelength subset produces substantially reduced prediction errors relative to PCR and PLS, CSR weight plots show that the LS model overfits and should not be used. Simulated situations containing spectral regions with excess noise or nonlinear responses are examined to study the effectiveness of wavelength selection based on the previously listed criteria. Near infrared spectra of gasoline samples with several known properties are also studied.

16.
17.
18.
Concentrations of the stable isotope oxygen-18 in precipitation samples have been modeled by geographical and meteorological features of the sampling stations. Precipitation samples are from 30 locations in Austria; oxygen-18 concentrations have been determined by isotope ratio mass spectrometry; each location has been characterized by three geographical features (longitude, latitude and elevation) and five meteorological features (relative humidity, fresh snow, wind speed, precipitation, and air temperature). All data are monthly means for the summer period April to September, computed as long-term averages for a time period of typically 20 years. The basic feature set has been augmented by 49 nonlinear transformations of the original features, giving a data set with 57 features. The number of objects was 180, corresponding to the 30 sampling stations times six months. Different methods of feature selection (correlation coefficient, backward elimination, all subsets regression, and genetic algorithm) have been applied, and for the 20 best feature subsets the prediction errors of PLS models have been estimated from test sets. The final best model for a prediction of oxygen-18 concentrations contains ten features selected by a genetic algorithm. This model has been applied to the computation of a geographical oxygen-18 distribution map of Austria based on computed oxygen-18 values for 200 locations and experimental values of 30 locations.
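
The idea of augmenting a feature set with nonlinear transformations and then ranking features by correlation coefficient can be sketched as follows. The choice of squares and pairwise products is a hypothetical stand-in: the abstract does not list the 49 transformations used, and this choice would not reproduce that count. All names and data are illustrative.

```python
import math
from itertools import combinations


def augment(rows):
    """Append squares and pairwise products to each row (hypothetical transformations)."""
    augmented = []
    for row in rows:
        squares = [v * v for v in row]
        products = [a * b for a, b in combinations(row, 2)]
        augmented.append(list(row) + squares + products)
    return augmented


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def rank_features(rows, y):
    """Return feature indices sorted by decreasing |correlation| with y."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], y)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Toy data (hypothetical): the response is the product of the two raw features,
# so the augmented product column (index 4) should rank first.
raw = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
response = [a * b for a, b in raw]
print(rank_features(augment(raw), response))
```

Ranking by single-feature correlation is only the simplest of the selection methods the abstract lists; backward elimination, all subsets regression, and a genetic algorithm evaluate feature subsets rather than individual features.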

19.
Recently, microwave resonance technology (MRT) sensor systems operating at four resonances instead of a single resonance frequency were established as a process analytical technology (PAT) tool for moisture monitoring. The additional resonance frequencies remarkably extend the technology's possible application range in pharmaceutical production processes towards higher moisture contents. In the present study, a novel multi-resonance MRT sensor was installed in a bottom-tangential-spray fluidized bed granulator in order to provide a proof-of-concept of the recently introduced technology in industrial pilot-scale equipment. The mounting position within the granulator was optimized to allow faster measurements and thereby even tighter process control. Because the additional resonance frequencies and the accelerated measurement rate have increased the amount of data provided by novel MRT sensor systems manifold, it became possible to investigate the benefit of more sophisticated evaluation methods instead of the simple linear regression used in established single-resonance systems. Therefore, models for moisture prediction based on multiple linear regression (MLR), principal component regression (PCR), and partial least squares regression (PLS) were built and assessed. Correlation was strong (all R² > 0.988) and predictive abilities were acceptable (all RMSE ≤ 0.5%) for all models over the whole granulation process up to 16% residual moisture. While PCR provided the best predictive abilities, MLR proved to be a simple and valuable alternative without the need for chemometric data evaluation.
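
For readers unfamiliar with PLS regression, the following is a minimal sketch of single-response PLS (PLS1) fitted with the NIPALS procedure on mean-centred data. It is a generic textbook formulation, not the software or settings used in the study; the function names and toy data are assumptions.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; returns the means and per-component (w, p, q) triples."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # deflated X
    f = [v - y_mean for v in y]                                 # deflated y
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X-space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # no covariance left to model
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # X loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt                          # y loading
        # Deflate X and y by the part explained by this component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict one sample by repeating the score/deflation steps on it."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * p[j] for j in range(len(e))]
    return y_hat


# Toy data (hypothetical): two perfectly collinear predictors, y = 2*x0 + 1.
X_toy = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
y_toy = [3.0, 5.0, 7.0]
model = pls1_fit(X_toy, y_toy, n_components=1)
print(pls1_predict(model, [4.0, 8.0]))
```

The toy predictors are exactly collinear, a case in which ordinary MLR has no unique solution. A single PLS component still recovers the underlying linear relationship, which illustrates one reason latent-variable methods are considered alongside MLR.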

20.
A kernel-based algorithm is potentially very efficient for predicting key quality variables of nonlinear chemical and biological processes by mapping an original input space into a high-dimensional feature space. A nonlinear data structure in the original space is most likely to be linear in the high-dimensional feature space. In this work, kernel partial least squares (PLS) was applied to inferentially predict key process variables in an industrial cokes wastewater treatment plant. The primary motive was to give operators and process engineers a reliable and accurate estimation of key process variables such as chemical oxygen demand, total nitrogen, and cyanide concentrations in real time. This would allow them to arrive at the optimum operational strategy at an early stage and minimize damage to the operating units, as shock loadings of toxic compounds in the influent often cause process instability. The proposed kernel-based algorithm could effectively capture the nonlinear relationship in the process variables and showed far better prediction performance for the quality variables than the conventional linear PLS and other nonlinear PLS methods.
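
Kernel methods of this kind never compute the high-dimensional feature space explicitly; they work with a kernel (Gram) matrix that is centred in feature space. The sketch below shows only that building block, assuming a Gaussian (RBF) kernel. The abstract does not say which kernel was used, and the function names, the gamma value, and the data are hypothetical.

```python
import math


def rbf_kernel_matrix(X, gamma):
    """Gram matrix K[i][j] = exp(-gamma * ||x_i - x_j||^2) for a list of samples."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = math.exp(-gamma * sq_dist)
    return K


def center_kernel(K):
    """Centre a symmetric Gram matrix in feature space.

    Equivalent to (I - 11^T/n) K (I - 11^T/n); for symmetric K the column
    means equal the row means, so only the row means are computed.
    """
    n = len(K)
    row_means = [sum(row) / n for row in K]
    total_mean = sum(row_means) / n
    return [
        [K[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


# Hypothetical process measurements: three samples with two variables each.
samples = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
K_centered = center_kernel(rbf_kernel_matrix(samples, gamma=0.5))
print(K_centered)
```

A kernel PLS procedure would then extract score vectors from the centred matrix in place of the original data matrix; that step is omitted here.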
