Similar articles
 20 similar articles found
1.
Prediction of sample properties using spectroscopic data with multivariate calibration is often enhanced by wavelength selection. This paper reports on a built-in wavelength selection method in which the estimated regression vector contains zero to near-zero coefficients for undesirable wavelengths. The method is based on Tikhonov regularization with the model 1-norm (TR1) and is applied to simulated and near-infrared (NIR) spectral data. Models are also formed from wavelength subsets determined by the standard method of stepwise regression (SWR). Harmonious (bias/variance tradeoff) and parsimonious considerations are compared with and without wavelength selection for principal component regression (PCR), ridge regression (RR), partial least squares (PLS), and multiple linear regression (MLR). Results show that TR1 models generally contain large baseline regions of near-zero coefficients, thereby essentially achieving built-in wavelength selection. For example, wavelengths with spectral interferences and/or poor signal-to-noise ratios obtain near zero regression coefficients. Results often improve with TR1 models, compared to full wavelength PCR, RR, and PLS models. The SWR subset results are similar to those for the TR1 models using the NIR data and worse with the simulated spectral situations. In general, wavelength selection improves prediction accuracy at a sacrifice to a potential increase in variance and the parsimony remains nearly equivalent compared to full wavelength models. New insights gained from the reported studies provide useful guidelines on when to use full wavelengths or use wavelength selection methods. Specifically, when a small number of large wavelength effects (good sensitivity and selectivity) exist, subset selection by SWR (with caution) and TR1 do well. With a small to moderate number of large to moderate sized wavelength effects, TR1 is better. 
Lastly, when a large number of small effects are present, full wavelengths with the methods of PCR, RR, or PLS are best.
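
The built-in selection effect described in this abstract comes from the 1-norm penalty, which drives weak coefficients to exactly zero. The following is a minimal sketch of that idea, assuming a lasso-style objective solved by cyclic coordinate descent; the solver, the function names, and the toy data are illustrative assumptions, not the TR1 procedure used in the paper.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; return exactly 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def l1_regression(X, y, lam, n_sweeps=100):
    """Minimise 0.5*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    Assumes every column of X has at least one non-zero entry.
    """
    n_samples, n_wavelengths = len(X), len(X[0])
    b = [0.0] * n_wavelengths
    for _ in range(n_sweeps):
        for j in range(n_wavelengths):
            rho = 0.0      # correlation of channel j with the residual that ignores b[j]
            norm_sq = 0.0  # squared norm of channel j
            for i in range(n_samples):
                partial_fit = sum(X[i][k] * b[k] for k in range(n_wavelengths) if k != j)
                rho += X[i][j] * (y[i] - partial_fit)
                norm_sq += X[i][j] ** 2
            b[j] = soft_threshold(rho, lam) / norm_sq
    return b


# Hypothetical toy "spectra": only the first channel carries the analyte signal.
X = [[1, 1, 0.5], [2, -1, 0.1], [3, 1, -0.3], [4, -1, 0.2], [5, 1, -0.4], [6, -1, 0.1]]
y = [2.0 * row[0] for row in X]
coefficients = l1_regression(X, y, lam=1.0)
```

In this toy case the two uninformative channels receive coefficients of exactly zero, while the informative channel is shrunk slightly below its true value of 2. That shrinkage is the bias paid for the sparsity.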

2.
Recent work has shown that ridge regression (RR) is Pareto to partial least squares (PLS) and principal component regression (PCR) when the variance indicator, the Euclidean norm of the regression coefficients, ‖p‖, is plotted against the bias indicator, the root mean square error of calibration (RMSEC). Simplex optimization demonstrates that RR is Pareto for several other spectral data sets when ‖p‖ is used with RMSEC and the root mean square error of evaluation (RMSEE) as optimization criteria. From this investigation, it was observed that while RR is Pareto optimal, PLS and PCR harmonious models are nearly equivalent to harmonious RR models. Additionally, it was found that RR is Pareto robust, i.e., it remained Pareto when models formed at one temperature were used to predict samples at another temperature. Wavelength selection is commonly performed to improve analysis results such that the bias indicators RMSEC, RMSEE, root mean square error of validation, or root mean square error of cross-validation decrease using a subset of wavelengths. Just as critical to an analysis of selected wavelengths is an assessment of variance. Using wavelengths deemed optimal in a previous study, this paper reports on the variance/bias tradeoff. An approach that forms the Pareto model with a Pareto wavelength subset is suggested.

3.
The combination of Raman and infrared spectroscopy on the one hand and wavelength selection on the other hand is used to improve the partial least-squares (PLS) prediction of seven selected yarn properties. These properties are important for on-line quality control during production. From 71 yarn samples, the Raman and infrared spectra are measured and reference methods are used to determine the selected properties. Making separate PLS models for all yarn properties using the Raman and infrared spectra, prior to wavelength selection, reveals that Raman spectroscopy outperforms infrared spectroscopy. If wavelength selection is applied, the PLS prediction error decreases and the correlation coefficient increases for all properties. However, a substantial wavelength selection effect is present for the infrared spectra compared to the Raman spectra. For the infrared spectra, wavelength selection results in PLS prediction errors comparable with the prediction performance of the Raman spectra prior to wavelength selection. Concatenating the Raman and infrared spectra does not enhance the PLS prediction performance, not even after wavelength selection. It is concluded that an infrared spectrometer, combined with a wavelength selection procedure, can be used if no (suitable) Raman instrument is available.  相似文献   

4.
Common methods of building linear calibration models are principal component regression (PCR), partial least squares (PLS), and least squares (LS). Recently, the method of cyclic subspace regression (CSR) has been presented and shown to provide PCR, PLS, LS and other related intermediate regressions with one algorithm. When forming a linear model with spectral data for quantitative analysis, prediction results can be adversely affected by responses that do not conform well to the linear model proposed. Wavelength selection can be used to eliminate wavelengths where such problem responses occur. It has recently been reported that CSR regression vectors can be formed by summing weighted eigenvectors where weights are determined from the hat matrix, singular values, and eigenvectors characterizing the sample space. Investigation of these weights shows that wavelength selection based on loading vectors can be misleading. Specifically, by using CSR it is shown that a small weight for an eigenvector can annihilate a large peak in a loading vector. In this study, correlograms are used with CSR regression vectors and eigenvector weights as wavelength-selection criteria. It is demonstrated that even though a model generated by LS for a wavelength subset produces substantially reduced prediction errors relative to PCR and PLS, CSR weight plots show that the LS model overfits and should not be used. Simulated situations containing spectral regions with excess noise or nonlinear responses are examined to study the effectiveness of wavelength selection based on the previously listed criteria. Near infrared spectra of gasoline samples with several known properties are also studied.  相似文献   

5.
Variable (or wavelength) selection plays an important role in the quantitative analysis of near-infrared (NIR) spectra. A modified method of uninformative variable elimination (UVE), based on the principles of Monte Carlo (MC) sampling and UVE, is proposed for variable selection in NIR spectral modeling. The method first builds a large number of models with randomly selected calibration samples, and then evaluates each variable by the stability of its corresponding coefficients across these models. Variables with poor stability are regarded as uninformative and are eliminated. The performance of the proposed method is compared with UVE-PLS and conventional PLS for modeling the NIR data sets of tobacco samples. Results show that the proposed method is able to select important wavelengths from the NIR spectra and makes the prediction more robust and accurate in quantitative analysis. Furthermore, if wavelet compression is combined with the method, a more parsimonious and efficient model can be obtained.
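
The stability criterion in this abstract can be sketched as the ratio of the mean to the standard deviation of a variable's coefficient over many randomly drawn calibration subsets. The sketch below is only illustrative: it assumes that ratio as the stability measure and uses single-variable least-squares slopes as a stand-in for the PLS coefficients of the actual method. The function names, subset size, and toy data are hypothetical.

```python
import random
import statistics


def slope(x, y):
    """Least-squares slope of y on a single variable x (stand-in for a PLS coefficient)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def mc_stability(X, y, n_models=200, subset_size=7, seed=0):
    """Stability of each variable: |mean| / std of its coefficient over random subsets."""
    rng = random.Random(seed)
    n_samples, n_vars = len(X), len(X[0])
    coefs = [[] for _ in range(n_vars)]
    for _ in range(n_models):
        rows = rng.sample(range(n_samples), subset_size)  # random calibration subset
        y_sub = [y[i] for i in rows]
        for j in range(n_vars):
            coefs[j].append(slope([X[i][j] for i in rows], y_sub))
    return [abs(statistics.fmean(c)) / statistics.pstdev(c) for c in coefs]


# Hypothetical data: variable 0 drives the response, variable 1 is an alternating nuisance signal.
X = [[float(i), 1.0 if i % 2 else -1.0] for i in range(1, 11)]
y = [2.0 * row[0] + 0.01 * row[1] for row in X]
stability = mc_stability(X, y)
```

Variables whose stability falls below a chosen cutoff would then be dropped. How that cutoff is set is not stated in the abstract, so it is left out here.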

6.
In multivariate calibration methods like partial least squares (PLS), especially when the spectra data consists of measurements at hundreds and even thousands of analytical channels, it is widely accepted that before a multivariate regression model is built, a well-performed variable selection can be helpful to improve the predictive ability of the model. In the present paper, the idea of variable selection is extended. Unlike in traditional variable selection methods, where the deleted variables and the variables included in the regression model are essentially weighted with discrete values 0 and 1, respectively, the strategy adopted in this paper is to weight the variables with continuous non-negative values. A recently proposed global optimization method, particle swarm optimization (PSO) algorithm is used to search for the weights of variables optimizing the training of a calibration set and the prediction of an independent validation set. Since variable selection is just a special case of variable weighting, the latter is expected to be more rational and flexible. Variable weighting would reduce the negative influence of wavelengths with undesirable qualities while retaining the useful information carried by them. Variable weighting would also prevent the possible spoiling of the multi-channel advantage of the model by variable selection, which would happen when the number of selected wavelengths is small. Two real data sets are investigated and the results of variable-weighted PLS and those of PLS are compared to demonstrate the advantages of the proposed method.  相似文献   

7.
Preprocessing of near-infrared spectra to remove unwanted, i.e., non-related spectral variation and selection of informative wavelengths is considered to be a crucial step prior to the construction of a quantitative calibration model. The standard methodology when comparing various preprocessing techniques and selecting different wavelengths is to compare prediction statistics computed with an independent set of data not used to make the actual calibration model. When the errors of reference value are large, no such values are available at all, or only a limited number of samples are available, other methods exist to evaluate the preprocessing method and wavelength selection. In this work we present a new indicator (SE) that only requires blank sample spectra, i.e., spectra of samples that are mixtures of the interfering constituents (everything except the analyte), a pure analyte spectrum, or alternatively, a sample spectrum where the analyte is present. The indicator is based on computing the net analyte signal of the analyte and the total error, i.e., instrumental noise and bias. By comparing the indicator values when different preprocessing techniques and wavelength selections are applied to the spectra, the optimal preprocessing technique and the optimal wavelength selection can be determined without knowledge of reference values, i.e., it minimizes the non-related spectral variation. The SE indicator is compared to two other indicators that also use net analyte signal computations. To demonstrate the feasibility of the SE indicator, two near-infrared spectral data sets from the pharmaceutical industry were used, i.e., diffuse reflectance spectra of powder samples and transmission spectra of tablets. Especially in pharmaceutical spectroscopic applications, it is expected beforehand that the non-related spectral variation is rather large and it is important to remove it. 
The indicator gave excellent results with respect to wavelength selection and optimal preprocessing. The SE indicator performs better than the two other indicators, and it is also applicable to other situations where the Beer-Lambert law is valid.

8.
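Wavelength selection algorithm for the light extinction method based on principal component analysis   Cited by: 1 (self-citations: 0, citations by others: 1)
In particle size measurement by the light extinction method, the extinction spectrum of the measured particle system carries information such as particle size and refractive index. In the visible and visible-infrared bands, principal component transforms were applied to the extinction spectra, the first-derivative extinction spectra, and the second-derivative extinction spectra of unimodal R-R distributions. Based on a comparison of the results, a characteristic wavelength selection algorithm based on principal component analysis is proposed. The algorithm first applies a principal component transform to the first-derivative extinction spectrum. It then uses the contribution of the first-derivative extinction spectrum at each wavelength to the principal components as the main criterion for selecting characteristic wavelengths, and also includes the boundary wavelengths of the spectral range as characteristic wavelengths. This ensures that the selected extinction values carry a high information content. Simulation experiments using an independent-mode inversion algorithm were carried out on particle systems with unimodal and bimodal R-R distributions, and they confirmed the effectiveness and practicality of the proposed method.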

9.
Falaggis K, Towers DP, Towers CE. Applied Optics, 2011, 50(28): 5484-5498
The method of excess fractions (EF) is well established to resolve the fringe order ambiguity generated in interferometric detection. Despite this background, multiwavelength interferometric absolute long distance measurements have only been reported with varying degrees of success. In this paper we present a theoretical model that can predict the unambiguous measurement range in EF based on the selected measurement wavelengths and phase noise. It is shown that beat wavelength solutions are a subset of this theoretical model. The performance of EF, for a given phase noise, is shown to be equivalent to beat techniques but offers many alternative sets of measurement wavelengths and therefore EF offer significantly greater flexibility in experimental design.  相似文献   
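
The fringe-order search behind the method of excess fractions can be sketched as follows. This is a simplified, noise-free illustration that assumes one fringe per half wavelength of distance; the wavelengths, the search range, and the function name are hypothetical and are not taken from the paper, whose model also accounts for phase noise.

```python
def excess_fractions_distance(wavelengths, fractions, max_distance):
    """Return the distance whose predicted fringe fractions best match the measured ones."""
    half = [w / 2.0 for w in wavelengths]  # assumed: one fringe per half wavelength
    best_distance, best_error = None, float("inf")
    order = 0
    # Try every integer fringe order of the first wavelength inside the search range.
    while (order + fractions[0]) * half[0] <= max_distance:
        candidate = (order + fractions[0]) * half[0]
        error = 0.0
        for h, measured in zip(half[1:], fractions[1:]):
            predicted = (candidate / h) % 1.0
            diff = abs(predicted - measured)
            error += min(diff, 1.0 - diff)  # fractions wrap around at 1
        if error < best_error:
            best_distance, best_error = candidate, error
        order += 1
    return best_distance


# Hypothetical example: three wavelengths in nm and a noise-free set of measured fractions.
wavelengths_nm = [600.0, 650.0, 700.0]
true_distance_nm = 12345.6
measured = [(true_distance_nm / (w / 2.0)) % 1.0 for w in wavelengths_nm]
estimate = excess_fractions_distance(wavelengths_nm, measured, max_distance=27000.0)
```

With these half wavelengths (300, 325, and 350 nm) the fraction pattern repeats only every 27300 nm, so the search range of 27000 nm contains a single consistent candidate. With phase noise the usable range shrinks, which is the trade-off the paper's model is meant to predict.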

10.
In this paper, the performances of four improved analytical methods (backward stepwise selection of peak intensities, sum of characteristic peaks of a component, moving window partial least squares, and genetic algorithms) using wavelength selection for the analysis of xylene mixtures by Raman spectroscopy are tested for further use on the new "digital micromirror device associated with a photomultiplier tube" Raman spectrometer. It is shown that the errors of prediction using only a few selected points (from 4 to 49 depending on the method) are almost the same as when using the whole spectral range (1050 points). Compared to the last two methods, the "backward stepwise selection of peak intensities" and "sum of characteristic peaks of a component" methods are robust under industrial conditions and appear to be well suited for chemical quantitative analysis with the new Raman spectrometer, which allows the measurements of the total intensity to be made simultaneously for a number of pre-selected frequencies. Results show that the errors of prediction can be near to or even lower than 2%.  相似文献   

11.
All-optical wavelength conversion with multicasting is investigated in this paper, which is based on cross-phase modulation in a highly nonlinear fiber. With a pump-modulated light and only a single continuous-wave probe, wavelength multicasting is realized by appropriately controlling the powers of two beams. Our simulation work reveals that 10 multicast channels can be obtained with their Q factors being larger than six, if both pump and probe powers are properly selected. These wavelength channels of multicasting are positioned around the central wavelength of the probe on the blue-shifted and red-shifted sides. The central wavelength and the channel spacing can be affected by the wavelengths of the probe and the pump. The wavelength multicasting technique studied in this paper is simpler and can offer more multicast channels than that based on four-wave mixing.  相似文献   

12.
尚静, 孟庆龙, 张艳. 《包装工程》 (Packaging Engineering), 2020, 41(3): 51-56
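Objective: To explore the feasibility of predicting plum firmness using ultraviolet/visible spectroscopy combined with chemometrics. Methods: "Red" plums and "green" plums were taken as the research objects, and a spectral acquisition system was used to obtain the mean spectra of the plum samples. The raw spectral data were preprocessed with the standard normal variate transformation. The successive projections algorithm (SPA) and competitive adaptive reweighted sampling (CARS) were then used to extract, from the 1024 wavelengths of the full spectrum, 2 characteristic wavelengths (513.04 nm and 636.72 nm) and 10 characteristic wavelengths (230.01, 244.67, 274.71, 287.66, 290.90, 300.59, 311.78, 423.08, 515.39, and 631.31 nm), respectively. Error back-propagation (BP) network models for predicting plum firmness were built on the full spectrum and on the extracted characteristic wavelengths. Results: Using the characteristic variables extracted by SPA and CARS as the BP network input markedly improved the running efficiency of the BP network model, and the SPA-BP network model showed relatively good prediction of plum firmness (rp = 0.695; root mean square error of the prediction set 1.610 kg/cm²). Conclusion: Ultraviolet/visible spectroscopy combined with characteristic wavelength extraction can achieve rapid, non-destructive detection of plum firmness.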

13.
The level of chemical oxygen demand (COD) is an important index for evaluating whether sewage meets the discharge requirements, so corresponding tests should be carried out before discharge. Fourier transform infrared spectroscopy (FTIR) with attenuated total reflectance (ATR) can detect COD in sewage effectively and has advantages over conventional chemical analysis methods. The selection of characteristic bands is one of the key steps in the application of FTIR/ATR spectroscopy. In this work, based on moving window partial least-squares (MWPLS) regression for selecting characteristic wavelengths, a method of equivalent wavelength selection was proposed by combining it with the equivalence concept of the paired t-test. The results showed that the prediction effect of the selected wavelengths was very close to that of the MWPLS method, while the number of wavelength points was much smaller. SEP_Ave, R_P,Ave, SEP_Std, and R_P,Std, which characterize the modeling effect, were 26.3 mg L⁻¹, 0.969, 3.49 mg L⁻¹, and 0.006, respectively. The validation effects V-SEP and V-R_P were 28.64 mg L⁻¹ and 0.960, respectively. The selected waveband was between 1809 cm⁻¹ and 1568 cm⁻¹. The method provides a useful reference for the design of FTIR/ATR spectral instruments for COD detection.

14.
At present, the prevalence of diabetes is increasing because the body cannot properly metabolize glucose. Accurate prediction of diabetes in patients is an important research area. Many researchers have proposed techniques to predict this disease through data mining and machine learning methods. In prediction, feature selection is a key preprocessing step: only the features that are relevant to the disease are used for prediction, which improves the prediction accuracy. Selecting the right features from the whole feature set is a complicated process, and many researchers are concentrating on it to produce a predictive model with high accuracy. In this work, a wrapper-based feature selection method called recursive feature elimination is combined with ridge regression (L2) to form a hybrid L2-regularized feature selection algorithm that addresses overfitting to the data set. Overfitting is a major problem in feature selection, in which the model fits new data poorly because the training set is small. Ridge regression is mainly used to overcome the overfitting problem. The features are selected by using the proposed feature selection method, and a random forest classifier is used to classify the data on the basis of the selected features. This work uses the Pima Indians Diabetes data set, and the evaluated results are compared with the existing algorithms to prove the accuracy of the proposed algorithm. The accuracy of the proposed algorithm in predicting diabetes is 100%, and its area under the curve is 97%. The proposed algorithm outperforms existing algorithms.

15.
Transscleral cyclophotocoagulation (TSCPC) is currently performed clinically as an effective treatment for end-stage glaucoma. We develop a theoretical model for the analysis of optical attenuation phenomena during TSCPC as a basis for selection of an optimal wavelength. A multilayered Monte Carlo model was developed to calculate the fluence and the rate of heat generation in each tissue layer for the wavelengths of Nd:YAG, diode, ruby, krypton yellow, and argon lasers. Of the five wavelengths under study, our theoretical results suggest that the diode laser wavelength offers the best penetration through the conjunctiva, sclera, and ciliary muscle and highest absorption within the ciliary pigment epithelium.  相似文献   

16.
Fu GH, Xu QS, Li HD, Cao DS, Liang YZ. Applied Spectroscopy, 2011, 65(4): 402-408
In this paper a novel wavelength region selection algorithm, called elastic net grouping variable selection combined with partial least squares regression (EN-PLSR), is proposed for multi-component spectral data analysis. The EN-PLSR algorithm can automatically select successive strongly correlated prediction variable groups related to the response variable using two steps. First, a portion of the correlated predictors are selected and divided into subgroups by means of the grouping effect of elastic net estimation. Then, a recursive leave-one-group-out strategy is employed to further shrink the variable groups in terms of the root mean square error of cross-validation (RMSECV) criterion. The performance of the algorithm with real near-infrared (NIR) spectroscopic data sets shows that the EN-PLSR algorithm is competitive with full-spectrum PLS and moving window partial least squares (MWPLS) regression methods and it is suitable for use with strongly correlated spectroscopic data.  相似文献   

17.
In this paper we provide a detailed account of an ultra-wideband wavelength converter that shifts from 1310 to 1550 nm using a 1310 nm semiconductor optical amplifier (SOA) as the nonlinear medium. The experimental approach uses an arrayed waveguide grating (AWG) as a method to slice the broadband output ASE of the 1310 nm SOA into multiple outputs at this O-band. A four-wave mixing technique is used to generate the wavelength conversion, whereby two wavelengths at 1310 nm are used and interact with the 1550 nm continuous wave output from a bismuth-based erbium-doped optical amplifier. In this demonstration, the interacting wavelengths are 1316.75, 1317.47 and 1542.21 nm. The downward conversion wavelengths are 1542.93 and 1541.49 nm, with a converted wavelength spacing of 224 nm.

18.
To date, surface plasmon resonance (SPR) spectroscopy identifies molecules via specific bindings with their ligands immobilized on a surface. We demonstrate here that a high-resolution multiwavelength SPR technique can measure the electronic states of the molecules and thus allow direct identification of the molecules. Using this new capability, we have studied the electronic and conformational differences between the oxidized and reduced states of cytochrome c immobilized on a modified gold electrode. When the wavelength of the incident light is far away from the optical absorption bands of the protein, a decrease of approximately 0.008° in the resonance angle, due to a conformational change, occurs as the protein is switched from the oxidized to the reduced state. When the wavelength is tuned to the absorption bands, the resonance angle oscillates at the wavelengths of the absorption peaks, which provides electronic signatures of the protein.

19.
A new wavelength interval selection procedure, moving window partial least-squares regression (MWPLSR), is proposed for multicomponent spectral analysis. This procedure builds a series of PLS models in a window that moves over the whole spectral region and then locates useful spectral intervals in terms of the least complexity of PLS models reaching a desired error level. Based on a proposed theory demonstrating the necessity of wavelength selection, it is shown that MWPLSR provides a viable approach to eliminate the extra variability generated by non-composition-related factors such as the perturbations in experimental conditions and physical properties of samples. A salient advantage of MWPLSR is that the calibration model is very stable against the interference from non-composition-related factors. Moreover, the selection of spectral intervals in terms of the least model complexity enables the reduction of the size of a calibration sample set in calibration modeling. Two strategies are suggested for coupling the MWPLSR procedure with PLS for multicomponent spectral analysis: One is the inclusion of all selected intervals to develop a PLS calibration model, and the other is the combination of the PLS models built separately in each interval. The combination of multiple PLS models offers a novel potential tool for improving the performance of individual models. The proposed procedures are evaluated using two open-path Fourier transform infrared data sets and one near-infrared data set, each having different noise characteristics. The results reveal that the proposed procedures are very promising for vibrational spectroscopy-based multicomponent analyses and give much better prediction than the full-spectrum PLS modeling.  相似文献   

20.
The evaluation of the predictive ability of a model is an essential step in all chemometric techniques, so it must be performed very carefully. However, in the case of selection of relevant variables (an essential step for data sets with many, frequently thousands of, variables) the selection is generally performed using all the available objects. In some recent classification and class modeling techniques, the Mahalanobis distances or the leverages from the centroids of the categories in the problem are computed from the original or from the selected variables, and then added to the original variables. Here, too, the Mahalanobis distances are computed with all the objects. The consequence is an overestimate of the prediction ability, which is very large when the ratio between the number of objects and the number of variables is rather low, so that the variance-covariance matrix is unstable. In this paper the correct validation procedures are described for the cases of selection of variables and of the addition of Mahalanobis distances computed on the original variables or the selected variables. The estimates of the prediction ability are compared with those obtained with insufficient validation strategies.
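
The validation point made in this abstract can be sketched as a leave-one-out loop in which variable selection is repeated inside every fold, so the held-out object never influences which variables are kept. This is a minimal sketch under stated assumptions: the selection score (spread of class means), the nearest-centroid classifier, and the toy data are hypothetical stand-ins, not the procedures evaluated in the paper.

```python
def class_means(values, labels):
    """Mean of `values` for each class label."""
    sums, counts = {}, {}
    for v, c in zip(values, labels):
        sums[c] = sums.get(c, 0.0) + v
        counts[c] = counts.get(c, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}


def select_variables(X, labels, n_keep):
    """Rank variables by the spread of their class means; keep the top n_keep."""
    scores = []
    for j in range(len(X[0])):
        means = class_means([row[j] for row in X], labels).values()
        scores.append((max(means) - min(means), j))
    return [j for _, j in sorted(scores, reverse=True)[:n_keep]]


def loo_accuracy(X, labels, n_keep):
    """Leave-one-out accuracy with variable selection repeated inside every fold."""
    hits = 0
    for held_out in range(len(X)):
        train_X = [row for i, row in enumerate(X) if i != held_out]
        train_y = [c for i, c in enumerate(labels) if i != held_out]
        kept = select_variables(train_X, train_y, n_keep)  # training objects only
        centroids = {
            c: [class_means([row[j] for row in train_X], train_y)[c] for j in kept]
            for c in set(train_y)
        }
        test_point = [X[held_out][j] for j in kept]
        predicted = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(test_point, centroids[c])),
        )
        hits += predicted == labels[held_out]
    return hits / len(X)


# Hypothetical data: variable 0 separates the two classes, variables 1 and 2 do not.
X = [[0, 3, 5], [1, 4, 6], [0, 5, 5], [1, 3, 6], [10, 4, 5], [9, 5, 6], [10, 3, 6], [9, 4, 5]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
accuracy = loo_accuracy(X, labels, n_keep=1)
```

The insufficient strategy criticized in the abstract would call `select_variables` once on all objects before the loop, which lets the held-out object shape the model it is later used to test.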
