Similar literature
1.
Archetypal analysis represents each individual in a data set as a mixture of individuals of pure type or archetypes. The archetypes themselves are restricted to being mixtures of the individuals in the data set. Archetypes are selected by minimizing the squared error in representing each individual as a mixture of archetypes. The usefulness of archetypal analysis is illustrated on several data sets. Computing the archetypes is a nonlinear least squares problem, which is solved using an alternating minimizing algorithm.

2.
For the analysis of the spectra of complex biofluids, preprocessing methods play a crucial role in rendering the subsequent data analyses more robust and accurate. Normalization is a preprocessing method that accounts for different dilutions of samples by scaling the spectra to the same virtual overall concentration. In the field of 1H NMR metabonomics, integral normalization, which scales spectra to the same total integral, is the de facto standard. In this work, it is shown that integral normalization is a suboptimal method for normalizing spectra from metabonomic studies. Especially strong metabonomic changes, evident as massive amounts of single metabolites in samples, significantly hamper the integral normalization, resulting in incorrectly scaled spectra. The probabilistic quotient normalization is introduced in this work. This method is based on the calculation of a most probable dilution factor by looking at the distribution of the quotients of the amplitudes of a test spectrum by those of a reference spectrum. Simulated spectra, spectra of urine samples from a metabonomic study with cyclosporin-A as the active compound, and spectra of more than 4000 samples of control animals demonstrate that the probabilistic quotient normalization is far more robust and more accurate than the widespread integral normalization and vector length normalization.
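
The quotient idea in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the "most probable dilution factor" is taken as the median of the point-wise quotients, and the function names and the toy spectra are hypothetical.

```python
from statistics import median


def integral_normalize(spectrum, target):
    """Scale a spectrum so that its total integral equals `target`."""
    total = sum(spectrum)
    return [target * x / total for x in spectrum]


def pqn_normalize(spectrum, reference):
    """Probabilistic quotient normalization (sketch).

    Assumption: the most probable dilution factor is estimated as the
    median of the quotients test/reference over all usable points.
    """
    quotients = [s / r for s, r in zip(spectrum, reference) if r > 0]
    dilution = median(quotients)
    return [s / dilution for s in spectrum], dilution


# Toy data (hypothetical): the test spectrum is the reference diluted
# by a factor of 0.5, with one metabolite present in massive excess.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
test = [0.5, 1.0, 1.5, 2.0, 50.0]

pqn_spectrum, dilution_factor = pqn_normalize(test, reference)
integral_spectrum = integral_normalize(test, target=sum(reference))
```

In this toy case the single large peak dominates the total integral, so integral normalization scales the unchanged peaks far below their reference values, while the median quotient is unaffected by the one outlying point.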

3.
Two novel methods are described for direct quantitative analysis of NMR free induction decay (FID) signals. The methods use adaptations of the generalized rank annihilation method (GRAM) and the direct exponential curve resolution algorithm (DECRA). With FID-GRAM, the Hankel matrix of the sample signal is compared with that of a reference mixture to obtain quantitative data about the components. With FID-DECRA, a single-sample FID matrix is split into two matrices, allowing quantitative recovery of decay constants and the individual signals in the FID. Inaccurate results were obtained with FID-GRAM when there were differences between the frequency or transverse relaxation time of signals for the reference and test samples. This problem does not arise with FID-DECRA, because comparison with a reference signal is unnecessary. Application of FID-DECRA to 19F NMR data, which contained overlapping signals from three components, gave concentrations comparable to those derived from partial least squares (PLS) analysis of the Fourier transformed spectra. However, the main advantage of FID-DECRA was that accurate (<5% error) and precise (2.3% RSD) results were obtained using only one calibration sample, whereas with PLS, a training set of 10 standard mixtures was used to give comparable accuracy and precision.
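
The matrix-splitting step mentioned for FID-DECRA can be illustrated for the simplest possible case. The sketch below assumes a real-valued, single-exponential decay, for which the two shifted Hankel sub-matrices differ by one constant factor. Multi-component, complex-valued FIDs need a generalized eigenvalue solution that is not shown here, and all names and numbers are hypothetical.

```python
import math


def hankel(signal, rows):
    """Build a Hankel matrix: element (i, j) is signal[i + j]."""
    cols = len(signal) - rows + 1
    return [[signal[i + j] for j in range(cols)] for i in range(rows)]


def split_shifted(h):
    """Split a Hankel matrix into two sub-matrices shifted by one column."""
    first = [row[:-1] for row in h]
    second = [row[1:] for row in h]
    return first, second


def single_decay_factor(first, second):
    """Least-squares factor lam such that second ~= lam * first.

    Only valid for a one-component exponential (assumption of this sketch).
    """
    num = sum(x * y for ra, rb in zip(first, second) for x, y in zip(ra, rb))
    den = sum(x * x for ra in first for x in ra)
    return num / den


# Hypothetical decay: amplitude 2.0, T2 = 0.05 s, sampled every 0.01 s.
dt, t2_true = 0.01, 0.05
fid = [2.0 * math.exp(-n * dt / t2_true) for n in range(20)]

first, second = split_shifted(hankel(fid, rows=5))
lam = single_decay_factor(first, second)
t2_estimate = -dt / math.log(lam)
```

No reference sample enters the calculation, which matches the abstract's point that the split of a single-sample matrix is enough to recover the decay constant.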

4.
An algorithm for statistical processing of the set of multicomponent excitation–emission matrices for laser-induced fluorescence spectroscopy is proposed that is based on principal component analysis. It is shown for the first time that the fluorescence emission and excitation spectra of unknown fluorophores in optically thin samples can be calculated. Using the proposed algorithm, it is possible to pass from principal components with alternating signs to positive quantities corresponding to the spectra of real substances. The method is applied to a mixture of three fluorescent dyes, and it is demonstrated that the obtained spectra of principal components closely reproduce the spectra of the initial dyes.

5.
6.
The application of the traditional methods of multivariate statistics, such as the calculation of principal components, to the analysis of NMR spectra taken on sets of biofluid samples is one of the central approaches in the field of metabonomics. While this approach has proven to be a powerful and widely applicable technique, it has an inherent weakness, in that it tends to be dominated by those chemical species present at relatively higher concentrations. Using a set of commercial honey samples, a comparison of this classical metabonomics approach to one based on the use of the selective TOCSY experiment is presented. While the NMR spectrum of honey and its classical metabonomic analysis are completely dominated by a very few chemical species, specifically alpha-glucose and fructose, the statistical signal carried by minor honey components, such as amino acids, may be accessed using a selective TOCSY-based approach. This approach has the intrinsic virtue that it focuses the statistical analysis on a set of predefined chemical species, which might be chosen for their metabolic significance, and could be composed of either major or minor mixture constituents. Furthermore, the selective TOCSY method allows for more certain chemical identification, acquisition times of approximately 1 min, and accurate quantification of the species contributing to the statistical discriminatory signal.

7.
The chemometric techniques of multivariate curve resolution (MCR) are aimed at extracting the spectra and concentrations of individual components present in mixtures using a minimum set of initial assumptions. We present results from the application of alternating least squares (ALS) based MCR to the analysis of hyperspectral images of in situ biological material. The spectra of individual pure components were mathematically extracted and then identified by searching the spectra against a commercial library. No prior information about the chemical composition of the material was used in the data analysis. The spectra recovered by ALS-MCR analysis of an FT-IR microspectroscopic image of an 8-micron corn kernel section closely matched the spectra of the corn storage protein, zein, and starch. Through the application of MCR, we were able to show the presence of a second spectrally different protein, which could not be easily seen using univariate analysis. These results demonstrate the value of multivariate curve resolution techniques for the analysis of biological tissue. The value of principal components analysis (PCA) for hyperspectral image analysis is also discussed.

8.
The application of trilinear decomposition (TLD) to the analysis of fluorescence excitation-emission matrices of mixtures of polycyclic aromatic hydrocarbons (PAHs) is described. The variables constituting the third-order tensor are excitation wavelength, emission wavelength, and concentration of a fluorescence quencher (nitromethane). The addition of a quencher to PAH mixtures selectively reduces the fluorescence intensity of mixture components according to the Stern-Volmer equation. TLD allows the three-way matrix to be decomposed to give unique solutions for the excitation spectrum, emission spectrum, and quenching profiles for each component. The availability of spectra and calculated Stern-Volmer constants can aid in the identification of unknown components. Preprocessing of the data to correct for Rayleigh/Raman scatter and primary absorption by the quencher is necessary. Both three-component (anthracene, pyrene, 1-methylpyrene) and four-component (fluoranthene, anthracene, pyrene, 2,3-benzofluorene) synthetic mixtures are successfully resolved by TLD using quencher concentrations up to 100 mM. Results are compared using both alternating least-squares and direct trilinear decomposition algorithms. The reproducibility of extracted Stern-Volmer constants is determined from replicate experiments. To illustrate the application of TLD to a real sample, a chromatographic cut from the analysis of a light gas oil sample was used. Analysis of the TLD extracted spectra and quenching constants suggests the presence of three classes of polycyclic aromatic hydrocarbons consistent with data from a second dimension of chromatography and mass spectrometry.
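
For a single component, the Stern-Volmer relation referred to above is F0/F = 1 + K_SV[Q], so a quenching profile can be turned into a constant by a straight-line fit through the origin. The sketch below illustrates only that last step, not the trilinear decomposition itself. The constant of 30 L/mol and the intensities are made-up values, and the function name is hypothetical.

```python
def stern_volmer_constant(f0, intensities, quencher_conc):
    """Fit K_SV in F0/F - 1 = K_SV * [Q] by least squares through the origin."""
    ys = [f0 / f - 1.0 for f in intensities]
    num = sum(q * y for q, y in zip(quencher_conc, ys))
    den = sum(q * q for q in quencher_conc)
    return num / den


# Hypothetical quenching profile for one component, with quencher
# concentrations in mol/L (the abstract uses up to 100 mM).
quencher_conc = [0.02, 0.05, 0.10]
k_true = 30.0   # assumed Stern-Volmer constant, L/mol
f0 = 1000.0     # assumed unquenched intensity, arbitrary units
intensities = [f0 / (1.0 + k_true * q) for q in quencher_conc]

k_estimate = stern_volmer_constant(f0, intensities, quencher_conc)
```

Because each fluorophore has its own constant, a fitted value of this kind is the sort of quantity the abstract proposes as an aid to identifying unknown components.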

9.
We describe a new fluorescence method that allows the resolution of both the decay times and emission spectra of mixtures of fluorophores. This method is completely general and does not require any assumptions or knowledge of the decay times or emission spectra of the individual fluorophores. We use the phase angle spectra and modulation spectra of the mixture, measured over a range of suitable light modulation frequencies and emission wavelengths. These data are analyzed by nonlinear least-squares analysis to recover the emission spectra and the associated decay times. The principle of the method and the nature of the data are illustrated by using two-component mixtures with increasing spectral overlap. We then demonstrate the recovery of minor components, of structured emission spectra, and of a three-component mixture with completely overlapping emission spectra. Finally, we describe the resolution of a two-component mixture with decay times of 0.8 and 1.4 ns using modulation frequencies up to 774 MHz.

10.
The structure of the mobile phase in liquid chromatography plays an important role in the determination of retention behavior on reversed-phase stationary materials. One of the most commonly employed mobile phases is a mixture of methanol and water. In this work, infrared and Raman spectroscopic methods were used to investigate the structure of species formed in methanol/water mixtures. Chemometric methods using multivariate curve resolution by alternating least-squares analysis were used to resolve the overlapped spectra and to determine concentration profiles as a function of composition. The results showed that the structure of these mixtures could be described by a mixture model consisting of four species, namely, methanol, water, and two complexes, methanol/water (1:1) and methanol/water (1:4). The spectral frequencies and concentration profiles found from the Raman and infrared measurements were consistent with one another and with theoretical calculations.

11.
12.
This paper describes mathematical techniques to correct for analyte-irrelevant optical variability in tissue spectra by combining multiple preprocessing techniques to address variability in spectral properties of tissue overlying and within the muscle. A mathematical preprocessing method called principal component analysis (PCA) loading correction is discussed for removal of inter-subject, analyte-irrelevant variations in muscle scattering from continuous-wave diffuse reflectance near-infrared (NIR) spectra. The correction is completed by orthogonalizing spectra to a set of loading vectors of the principal components obtained from principal component analysis of spectra with the same analyte value, across different subjects in the calibration set. Once the loading vectors are obtained, no knowledge of analyte values is required for future spectral correction. The method was tested on tissue-like, three-layer phantoms using partial least squares (PLS) regression to predict the absorber concentration in the phantom muscle layer from the NIR spectra. Two other mathematical methods, short-distance correction to remove spectral interference from skin and fat layers and standard normal variate scaling, were also applied and/or combined with the proposed method prior to the PLS analysis. Each of the preprocessing methods improved model prediction and/or reduced model complexity. The combination of the three preprocessing methods provided the most accurate prediction results. We also performed a preliminary validation on in vivo human tissue spectra.
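
Two of the preprocessing steps named in this abstract are simple enough to sketch: orthogonalizing a spectrum to a set of loading vectors, and standard normal variate (SNV) scaling. The code below assumes the loading vectors are already available and orthonormal, as PCA loadings normally are. It does not compute the PCA itself, and the example vectors are hypothetical.

```python
from statistics import mean, stdev


def orthogonalize(spectrum, loadings):
    """Remove from a spectrum its projection onto each loading vector.

    Assumption: `loadings` are orthonormal (unit length, mutually orthogonal).
    """
    corrected = list(spectrum)
    for p in loadings:
        score = sum(x * pi for x, pi in zip(corrected, p))
        corrected = [x - score * pi for x, pi in zip(corrected, p)]
    return corrected


def snv(spectrum):
    """Standard normal variate: center and scale each spectrum by itself."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(x - m) / s for x in spectrum]


# Hypothetical four-point spectrum and one unit-length loading vector
# that stands in for an analyte-irrelevant baseline variation.
spectrum = [1.0, 2.0, 3.0, 6.0]
loadings = [[0.5, 0.5, 0.5, 0.5]]

corrected = orthogonalize(spectrum, loadings)
scaled = snv(spectrum)
```

Once the loading vectors are stored, the correction needs only the new spectrum, which is consistent with the abstract's statement that no analyte values are required for future corrections.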

13.
Elucidation of the composition of chemical-biological samples is a main focus of systems biology and metabolomics. Due to the inherent complexity of these mixtures, reliable, efficient, and potentially automatable methods are needed to identify the underlying metabolites and natural products. Because of its rich chemical information content, nuclear magnetic resonance (NMR) spectroscopy has a unique potential for this task. Here we present a generalization and application of a recently introduced NMR data collection, processing, and analysis strategy that circumvents the need for extensive purification and hyphenation prior to analysis. It uses covariance TOCSY NMR spectra measured on a 1-mm high-temperature cryogenic probe that are analyzed by a spectral trace clustering algorithm yielding 1D NMR spectra of the individual components for their unambiguous identification. The method is demonstrated on a metabolic model mixture and is then applied to the unpurified venom mixture of an individual walking stick insect that contains several slowly interconverting and closely related metabolites.

14.
Ion mobility spectrometry is a rapid scanning measurement method for which compression methods that facilitate the handling of large collections of data are beneficial. Peak distortion in reconstructed ion mobility spectra from linear wavelet compression is problematic in that artifact peaks may cause false positive alarms. Peak shifting also may cause false alarms if target peaks shift out of or interfering peaks shift into detection windows. Nonlinear wavelet compression (NLWC) preserves peak shape and can lessen the degree of distortion, shifting, and artifact peaks in the reconstructed spectra. NLWC was applied to achieve high compression and fidelity in the reconstructed spectra. Another benefit is that NLWC improves signal-to-noise ratios and thus the models built from compressed data are improved. By compressing both the drift time order and the spectrum acquisition order, greater compressions may be achieved. A two-way nonlinear wavelet compression method that incorporates alternating least squares (2W-NLWC-ALS) algorithm was devised by applying ALS to partially reconstructed wavelet coefficients generated from two-way NLWC. The number of components in a data set can be determined automatically using ASIMPLISMA. The smaller ALS models are saved as the final compressed data and can be used to reconstruct the entire data set efficiently without maintaining the compressed wavelet coefficient matrix of the original data set. The 2W-NLWC-ALS algorithm provides greater compression ratios compared to regular wavelet compression and interpretable models. Using this method, large volumes of data can be acquired and easily evaluated through a simple compressed model. A compression ratio of 510 ppm, root-mean-square error (E(RMS)) of 6.3 mV (full-scale signal is usually 1 V or larger), and relative root-mean-square error (RE(RMS)) of 1.62% were achieved for data sets collected by CAM. A compression ratio of 46 ppm, E(RMS) of 9.2 mV, and RE(RMS) of 0.42% were achieved for data sets collected with an ITEMISER instrument. The 2W-NLWC-ALS algorithm is an efficient compression method that provides the benefits of a simple model.

15.
Metabolite identification in the complex NMR spectra of biological samples is a challenging task due to significant spectral overlap and limited signal-to-noise. In this study we present a new approach, RANSY (ratio analysis NMR spectroscopy), which identifies all the peaks of a specific metabolite on the basis of the ratios of peak heights or integrals. We show that the spectrum for an individual metabolite can be generated by exploiting the fact that the peak ratios for any metabolite in the NMR spectrum are fixed and proportional to the relative numbers of magnetically distinct protons. When the peak ratios are divided by their coefficients of variation derived from a set of NMR spectra, the generation of an individual metabolite spectrum is enabled. We first tested the performance of this approach using one-dimensional (1D) and two-dimensional (2D) NMR data of mixtures of synthetic analogues of common body fluid metabolites. Subsequently, the method was applied to 1H NMR spectra of blood serum samples to demonstrate the selective identification of a number of metabolites. The RANSY approach, which does not need any additional NMR experiments for spectral simplification, is easy to perform and has the potential to aid in the identification of unknown metabolites using 1D or 2D NMR spectra in virtually any complex biological mixture.
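
The ratio statistic described in this abstract can be sketched as follows. This is an illustrative reading of the abstract only: for every spectral point, the ratio to a chosen "driving" peak is computed across the set of spectra, and the mean ratio is divided by its coefficient of variation. The exact statistic in the published method may differ, and the function name and toy data are hypothetical.

```python
from statistics import mean, stdev


def ransy(spectra, driver):
    """Ratio-analysis sketch.

    For each point j, compute the ratio to the driving peak in every
    spectrum, then divide the mean ratio by its coefficient of variation.
    Points from the same metabolite as the driver have near-constant
    ratios, so their values become very large.
    """
    result = []
    for j in range(len(spectra[0])):
        ratios = [s[j] / s[driver] for s in spectra]
        m = mean(ratios)
        cv = stdev(ratios) / m
        result.append(float("inf") if cv == 0 else m / cv)
    return result


# Toy data (hypothetical): points 0 and 2 belong to metabolite A with a
# fixed 1:2 peak ratio, points 1 and 3 belong to metabolite B.
conc_a = [1.0, 2.0, 3.0]
conc_b = [3.0, 1.0, 2.0]
spectra = [[a, b, 2.0 * a, 0.5 * b] for a, b in zip(conc_a, conc_b)]

ransy_spectrum = ransy(spectra, driver=0)
```

With noise-free toy data the peaks of metabolite A have exactly zero variation and come out as infinite, so a real application would see large but finite values there and small values elsewhere.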

16.
A second-order multivariate calibration approach, based on a combination of unfolded-partial least-squares with residual bilinearization (U-PLS/RBL), has been applied to fluorescence excitation-emission matrix data for multicomponent mixtures showing inner filter effects. The employed chemometric algorithm is the most successful one regarding the prediction of analyte concentrations when significant inner filter effects occur, even in the presence of unexpected sample components, which require strict adherence to the second-order advantage. Results for simulated fluorescence excitation-emission data are described, in comparison with the classical approach based on parallel factor analysis and other second-order algorithms, including generalized rank annihilation, bilinear least squares combined with residual bilinearization, and multivariate curve resolution-alternating least squares. A set of experimental data was also studied, in which calibration was performed with fluorescence excitation-emission matrices for samples containing mixtures of chrysene (the analyte of interest) and benzopyrene (which produced a strong inner filter effect across the useful wavelength range). Prediction was made on validation samples with a qualitative composition similar to the calibration set, and also on test samples containing an unexpected component (pyrene). In this latter case, U-PLS/RBL showed a unique success for the analysis of the calibrated component chrysene, achieving the useful second-order advantage.

17.
Wang C, Kong H, Guan Y, Yang J, Gu J, Yang S, Xu G. Analytical Chemistry, 2005, 77(13): 4108-4116
Liquid chromatography/mass spectrometry (LC/MS) followed by multivariate statistical analysis has been successfully applied to the metabolic profiling of plasma phospholipids in type 2 diabetes mellitus (DM2). Principal components analysis and partial least-squares discriminant analysis (PLS-DA) models were tested and compared in class separation between the DM2 and control. The application of an orthogonal signal correction filtered model highly improved the class distinction and predictive power of PLS-DA models. Additionally, unit variance scaling was also tested. With this methodology, it was possible not only to differentiate the DM2 from the control but also to discover and identify the potential biomarkers with LC/MS/MS. The proposed method shows that LC/MS combined with multivariate statistical analysis is a complement or an alternative to NMR for metabonomics applications.

18.
19.
Identification and quantification of analytes in complex solution-state mixtures are critical procedures in many areas of chemistry, biology, and molecular medicine. Nuclear magnetic resonance (NMR) is a unique tool for this purpose providing a wealth of atomic-detail information without requiring extensive fractionation of the samples. We present three new multidimensional-NMR based approaches that are geared toward the analysis of mixtures with high complexity at natural 13C abundance, including approaches that are encountered in metabolomics. Common to all three approaches is the concept of the extraction of one-dimensional (1D) consensus spectral traces or 2D consensus planes followed by clustering, which significantly improves the capability to identify mixture components that are affected by strong spectral overlap. The methods are demonstrated for covariance 1H-1H TOCSY and 13C-1H HSQC-TOCSY spectra and triple-rank correlation spectra constructed from pairs of 13C-1H HSQC and 13C-1H HSQC-TOCSY spectra. All methods are first demonstrated for an eight-compound metabolite model mixture before being applied to an extract from E. coli cell lysate.

20.
We applied two methods of “blind” spectral decomposition (MILCA and SNICA) to quantitative and qualitative analyses of UV absorption spectra of several non-trivial mixture types. Both methods use the concept of statistical independence and aim at the reconstruction of minimally dependent components from a linear mixture. We examined mixtures of major ecotoxicants (aromatic and polyaromatic hydrocarbons), amino acids and complex mixtures of vitamins in a veterinary drug. Both MILCA and SNICA were able to recover concentrations and individual spectra with minimal errors comparable with instrumental noise. In most cases their performance was similar to or better than that of other chemometric methods such as MCR-ALS, SIMPLISMA, RADICAL, JADE and FastICA. These results suggest that the ICA methods used in this study are suitable for real life applications.
