Similar literature
 20 similar articles found (search time: 46 ms)
1.
Lutz U, Lutz RW, Lutz WK. Analytical Chemistry, 2006, 78(13): 4564-4571
Mass spectrometry (MS) is increasingly being used for metabolic profiling, but detection modes such as constant neutral loss or multiple reaction monitoring have not often been reported. These modes allow focusing on structurally related compounds, which could be advantageous for situations in which the trait under investigation is associated with a particular class of metabolites. In this study, we analyzed endogenous glucuronides excreted in human urine by monitoring characteristic transitions of putative steroid glucuronides by LC-MS/MS for discrimination of females from males. Two methods for data extraction were used: (i) a manual procedure based on visual inspection of the chromatograms and selection of 23 peaks and (ii) a software-supported method (MarkerView) set to extract 100 peaks. Data from 10 female and 10 male students were analyzed by principal component analysis (PCA) and partial least-squares discriminant analysis (PLS-DA) using the SIMCA software. With PCA, only the manual peak selection resulted in clustering of males and females. With PLS-DA, the manual method provided full separation on the basis of a single discriminant; the software-supported approach required a two-component model for complete separation. Loading plots were analyzed for their ability to reveal peaks with high discriminating power, that is, potential biomarkers. The PLS-DA models were validated with urine samples collected from five new females and five new males. Gender was correctly assigned for all. Our results indicate that inclusion of biological criteria for variable selection coupled to class-specific MS analysis and data extraction by appropriate software may constitute a valuable addition to the methods available for metabolomics.
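The single-discriminant PLS-DA separation described in this abstract can be illustrated with a minimal sketch. It is not the SIMCA implementation used by the authors: it assumes classes coded 0/1, mean-centering only, one latent component, and a 0.5 decision threshold. The function names and the toy peak intensities are hypothetical.

```python
import math

def pls_da_one_component(X, y):
    """Fit a one-component PLS-DA model (PLS1 regression on a 0/1 class label)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximum covariance between peaks and class label.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each centered sample onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regression coefficient of the class label on the scores.
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return {"x_mean": x_mean, "y_mean": y_mean, "w": w, "q": q}

def predict(model, sample):
    """Assign class 1 if the predicted label is at least 0.5 (assumed threshold)."""
    t = sum((sample[j] - model["x_mean"][j]) * model["w"][j]
            for j in range(len(sample)))
    y_hat = model["y_mean"] + model["q"] * t
    return 1 if y_hat >= 0.5 else 0

# Hypothetical peak intensities: three samples per class, three peaks each.
X = [[1.0, 5.0, 2.0], [1.2, 5.1, 2.1], [0.9, 4.8, 1.9],
     [3.0, 2.0, 2.0], [3.1, 2.2, 2.1], [2.9, 1.9, 1.8]]
y = [0, 0, 0, 1, 1, 1]
model = pls_da_one_component(X, y)
print(predict(model, [1.1, 4.9, 2.0]), predict(model, [3.0, 2.1, 1.9]))  # prints: 0 1
```

In this sketch the entries of `w` play the role of the weights examined in a loading plot: peaks with large absolute weights contribute most to the class separation.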

2.
While common analytical and reporting standards already exist for ¹H NMR techniques, this does not apply to LC-MS metabolic profiling approaches. Such standards are all the more advisable when applying metabonomics to human biofluids, particularly urine samples, because of the high degree of biological variation compared to animals. A control study was performed, and urine samples of 30 healthy male and female human subjects were collected at intervals of 8 h twice a day for three consecutive days. Using selective multiple reaction monitoring in combination with a column-switching tool for the analysis of the mercapturate pattern, samples were screened for time and gender differences, the most common confounders. Data preprocessing parameters, alignment, scaling to internal standards, and normalization techniques were optimized by PCA, PLS-DA, and OPLS models. Great care was taken in the validation process of both analytical and chemometric protocols. Additionally, a known problem of LC-MS, the combination of "different-batch" data into "one-batch" data, could be solved by a batchwise scaling procedure. Based on these results, the use of metabolic profiling via mercapturates will be feasible for the detection of disease or toxicity markers in the future, since mercapturates are important biomarkers of reactive metabolites known to be involved in many toxic processes.

3.
The theory together with an algorithm for uncorrelated linear discriminant analysis (ULDA) is introduced and applied to explore metabolomics data. ULDA is a supervised method for feature extraction (FE), discriminant analysis (DA) and biomarker screening based on the Fisher criterion function. While principal component analysis (PCA) searches for directions of maximum variance in the data, ULDA seeks linearly combined variables called uncorrelated discriminant vectors (UDVs). The UDVs maximize the separation among different classes in terms of the Fisher criterion. The performance of ULDA is evaluated and compared with PCA, partial least squares discriminant analysis (PLS-DA) and target projection discriminant analysis (TP-DA) for two datasets, one simulated and one real from a metabolomic study. ULDA showed better discriminatory ability than PCA, PLS-DA and TP-DA. The shortcomings of PCA, PLS-DA and TP-DA are attributed to interference from linear correlations in data. PLS-DA and TP-DA performed successfully for the simulated data, but PLS-DA was slightly inferior to ULDA for the real data. ULDA successfully extracted optimal features for discriminant analysis and revealed potential biomarkers. Furthermore, by means of cross-validation, the classification model obtained by ULDA showed better predictive ability than PCA, PLS-DA and TP-DA. In conclusion, ULDA is a powerful tool for revealing discriminatory information in metabolomics data.
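As a rough illustration of the Fisher criterion mentioned in this abstract, the sketch below evaluates it for a single variable and two classes: the squared difference of the class means divided by the summed within-class scatter. This is not the ULDA algorithm, which optimizes the criterion over linear combinations of variables under an uncorrelatedness constraint; the function name and the numbers are hypothetical.

```python
def fisher_ratio(class_a, class_b):
    """Two-class Fisher criterion for one variable.

    Returns (difference of class means)^2 / (sum of within-class scatters).
    Assumes at least one class has non-zero scatter.
    """
    mean_a = sum(class_a) / len(class_a)
    mean_b = sum(class_b) / len(class_b)
    scatter_a = sum((v - mean_a) ** 2 for v in class_a)
    scatter_b = sum((v - mean_b) ** 2 for v in class_b)
    return (mean_a - mean_b) ** 2 / (scatter_a + scatter_b)

# A well-separated variable scores higher than an overlapping one.
print(fisher_ratio([1, 2, 3], [7, 8, 9]))  # prints: 9.0
print(fisher_ratio([1, 2, 3], [2, 3, 4]))  # prints: 0.25
```

Ranking variables by such a ratio is one simple way to screen candidate biomarkers, although it ignores the correlations between variables that ULDA is designed to handle.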

4.
Wang C, Kong H, Guan Y, Yang J, Gu J, Yang S, Xu G. Analytical Chemistry, 2005, 77(13): 4108-4116
Liquid chromatography/mass spectrometry (LC/MS) followed by multivariate statistical analysis has been successfully applied to the metabolic profiling of plasma phospholipids in type 2 diabetes mellitus (DM2). Principal component analysis and partial least-squares discriminant analysis (PLS-DA) models were tested and compared for class separation between the DM2 and control groups. The application of an orthogonal signal correction filtered model greatly improved the class distinction and predictive power of PLS-DA models. Additionally, unit variance scaling was tested. With this methodology, it was possible not only to differentiate the DM2 group from the control group but also to discover and identify potential biomarkers with LC/MS/MS. The proposed method shows that LC/MS combined with multivariate statistical analysis is a complement or an alternative to NMR for metabonomics applications.

5.
Metabolomics is an emerging field providing insight into physiological processes. It is an effective tool for disease diagnosis and toxicological studies, based on observing changes in metabolite concentrations in various biofluids. Multivariate statistical analysis is generally employed with nuclear magnetic resonance (NMR) or mass spectrometry (MS) data to determine differences between groups (for instance diseased vs healthy). Characteristic predictive models may be built based on a set of training data, and these models are subsequently used to predict whether new test data fall under a specific class. In this study, metabolomic data are obtained by ¹H NMR spectroscopy on urine samples from healthy subjects (male and female) and patients suffering from Streptococcus pneumoniae pneumonia. We compare the performance of traditional PLS-DA multivariate analysis to support vector machines (SVMs), a technique widely used in genome studies, on two case studies: (1) a case where nearly complete distinction may be seen (healthy versus pneumonia) and (2) a case where distinction is more ambiguous (male versus female). We show that SVMs are superior to PLS-DA in both cases in terms of predictive accuracy with the smallest number of features. With fewer features, SVMs give a better predictive model than PLS-DA.

6.
In metabolomics, the purpose is to identify and quantify all the metabolites in a biological system. Combined gas chromatography and mass spectrometry (GC/MS) is one of the most commonly used techniques in metabolomics together with ¹H NMR, and it has been shown that more than 300 compounds can be distinguished with GC/MS after deconvolution of overlapping peaks. To avoid having to deconvolute all analyzed samples prior to multivariate analysis of the data, we have developed a strategy for rapid comparison of nonprocessed MS data files. The method includes baseline correction, alignment, time window determinations, alternating regression, PLS-DA, and identification of retention time windows in the chromatograms that explain the differences between the samples. Use of alternating regression also gives interpretable loadings, which retain the information provided by m/z values that vary between the samples in each retention time window. The method has been applied to plant extracts derived from leaves of different developmental stages and plants subjected to small changes in day length. The data show that the new method can detect differences between the samples and that it gives results comparable to those obtained when deconvolution is applied prior to the multivariate analysis. We suggest that this method can be used for rapid comparison of large sets of GC/MS data, thereby applying time-consuming deconvolution only to parts of the chromatograms that contribute to explain the differences between the samples.

7.
Application of metabonomics to nutritional sciences, also termed nutrimetabonomics, offers the possibility to measure metabolic responses associated with the consumption of specific nutrients and foods. As dietary differences generally lead only to subtle metabolic changes, measuring diet-associated metabolic phenotypes is a challenge, and also an opportunity to develop and test new chemometric strategies that can highlight metabolic information in relation to different dietary habits. While multivariate statistical techniques have long been used to analyse dietary data from diet records and questionnaires, to date no attempt has been made to link dietary patterns with metabolic profiles. Using a three-step strategy, it was possible to merge ¹H NMR plasma metabolic profile data with specific dietary patterns as assessed by Principal Component Analysis (PCA) and Partial Least Squares-Discriminant Analysis (PLS-DA). Five dietary patterns (energy intake, plant versus animal based diet, “traditional diet” versus sugar-rich diet, “traditional” versus “modern” diets, and consumption of skim versus whole dairy products) were found by applying PCA to the food frequency questionnaire data, which explained 50% of the variation. Metabolic phenotypes associated with these dietary patterns were obtained by PLS-DA and were mainly based on differences in lipid and amino acid profiles in plasma. This new approach to assess relationships between dietary intake and metabolic profiling data will allow greater steps to be made in merging nutritional epidemiology with metabonomics.

8.
Metabolomics encompasses the study of small molecules in a biological sample. Liquid chromatography coupled with mass spectrometry (LC-MS) profiling is an important approach for the identification and quantification of metabolites from complex biological samples. The amount and complexity of data produced in an LC-MS profiling experiment demand automatic tools for the preprocessing, analysis, and extraction of useful biological information. Data preprocessing (a topic that covers noise filtering, peak detection, deisotoping, alignment, identification, and normalization) is thus an active area of metabolomics research. Recent years have witnessed the development of many software tools for data preprocessing, yet there is still a need for further improvement of the data preprocessing pipeline. This review presents an overview of selected software tools for preprocessing LC-MS based metabolomics data and tries to provide future directions.

9.
We present a method for the qualitative and quantitative study of transient metabolic flux of phage infection at the molecular level. The method is based on statistical total correlation spectroscopy (STOCSY) and partial least squares discriminant analysis (PLS-DA) applied to nuclear magnetic resonance (NMR) metabonomic data sets. An algorithm for this type of study is developed and demonstrated. The method has been implemented on ¹H NMR data sets of growth media in planktonic cultures of Pseudomonas aeruginosa infected with bacteriophage pf1. Transient metabolic fluxes of various important metabolites, identified by STOCSY and PLS-DA analysis applied to the NMR data set, are estimated at various stages of growth. The opportunistic and nosocomial pathogen P. aeruginosa is one of the best-studied model organisms for bacterial biofilms. Complete information regarding the metabolic connectivity of this system cannot be obtained by conventional spectroscopic approaches. Our study presents temporal comparative ¹H NMR metabonomic analyses of filamentous phage pf1 infection in planktonic cultures of P. aeruginosa K strain (PAK). We exemplify here the potential of STOCSY and PLS-DA tools to gain mechanistic insight into subtle changes and to determine the transient flux associated with metabolites following metabolic perturbations resulting from phage infection. Our study opens new avenues for correlating existing postgenomic data with current metabonomic results in P. aeruginosa biofilm research.

10.
Ultra-performance liquid chromatography coupled to mass spectrometry (UPLC/MS) has been used increasingly for measuring changes of low molecular weight metabolites in biofluids/tissues in response to biological challenges such as drug toxicity and disease processes. Typically samples show high variability in concentration, and the derived metabolic profiles have a heteroscedastic noise structure characterized by increasing variance as a function of increased signal intensity. These sources of experimental and instrumental noise substantially complicate information recovery when statistical tools are used. We apply and compare several preprocessing procedures and introduce a statistical error model to account for these bioanalytical complexities. In particular, the use of total intensity, median fold change, locally weighted scatter plot smoothing, and quantile normalizations to reduce extraneous variance induced by sample dilution were compared. We demonstrate that the UPLC/MS peak intensities of urine samples should respond linearly to variable sample dilution across the intensity range. While all four studied normalization methods performed reasonably well in reducing dilution-induced variation of urine samples in the absence of biological variation, the median fold change normalization is least compromised by the biologically relevant changes in mixture components and is thus preferable. Additionally, the application of a subsequent log-based transformation was successful in stabilizing the variance with respect to peak intensity, confirming the predominant influence of multiplicative noise in peak intensities from UPLC/MS-derived metabolic profile data sets. We demonstrate that variance-stabilizing transformation and normalization are critical preprocessing steps that can greatly benefit metabolic information recovery from such data sets when widely applied chemometric methods are used.
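The median fold change normalization and the subsequent log-based transformation named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the reference profile is taken as the feature-wise median across samples, zero intensities are skipped when computing fold changes, and the transform is a natural log with an offset of 1. Function names and data are hypothetical.

```python
import math
from statistics import median

def median_fold_change_normalize(samples):
    """Divide each sample by its median fold change relative to a reference profile."""
    n_features = len(samples[0])
    # Reference profile: feature-wise median across all samples (an assumption).
    reference = [median(s[j] for s in samples) for j in range(n_features)]
    normalized = []
    for s in samples:
        # Fold change of every usable peak relative to the reference.
        folds = [s[j] / reference[j] for j in range(n_features)
                 if reference[j] > 0 and s[j] > 0]
        factor = median(folds)  # estimated dilution factor of this sample
        normalized.append([v / factor for v in s])
    return normalized

def log_transform(samples, offset=1.0):
    """Variance-stabilizing log transform; the offset of 1 is an assumed choice."""
    return [[math.log(v + offset) for v in s] for s in samples]

# Hypothetical urine profile measured at three dilutions (1x, 2x, 0.5x).
base = [100.0, 200.0, 50.0, 400.0]
samples = [base, [2 * v for v in base], [0.5 * v for v in base]]
normalized = median_fold_change_normalize(samples)
print(normalized[1])  # prints: [100.0, 200.0, 50.0, 400.0]
```

Because the scaling factor is a median over many peaks, a few peaks that change for biological reasons shift it less than they would shift a total-intensity sum, which is consistent with the preference the abstract states.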

11.
ChemCam, a laser-induced breakdown spectroscopy (LIBS) instrument on the Mars Science Laboratory rover, will analyze the chemistry of the martian surface beginning in 2012. Prior to integration on the rover, the ChemCam instrument collected data on a variety of rock types to provide a training set for analysis of data from Mars. Models based on calibration data can be used to classify rocks via multivariate statistical techniques such as partial least squares-discriminant analysis (PLS-DA). In this study, we employ a version of PLS-DA in which modeling is applied in a defined classification flow to a variety of geological materials and compare the results with the traditional PLS-DA technique. Results show that the modified algorithm is more effective at classifying samples.

12.
In metabolomics, the objective is to identify differences in metabolite profiles between samples. A widely used tool in metabolomics investigations is gas chromatography-mass spectrometry (GC/MS). More than 400 compounds can be detected in a single analysis, if overlapping GC/MS peaks are deconvoluted. However, the deconvolution process is time-consuming and difficult to automate, and additional processing is needed in order to compare samples. Therefore, there is a need to improve and automate the data processing strategy for data generated in GC/MS-based metabolomics; if not, the processing step will be a major bottleneck for high-throughput analyses. Here we describe a new semiautomated strategy using a hierarchical multivariate curve resolution approach that processes all samples simultaneously. The presented strategy generates (after appropriate treatment, e.g., multivariate analysis) tables of all the detected metabolites that differ in relative concentrations between samples. The processing of 70 samples took similar time to that of the GC/TOFMS analyses of the samples. The strategy has been validated using two different sets of samples: a complex mixture of standard compounds and Arabidopsis samples.

13.
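To rapidly determine whether lily (Baihe) has been adulterated, excitation-emission matrix (EEM) fluorescence was used to analyze the fluorescence spectra of pure and adulterated lily samples, and characteristic fluorescence fingerprints of lily and its adulterated counterparts were constructed. Two chemical pattern recognition methods, principal component analysis-linear discriminant analysis (PCA-LDA) and partial least squares-discriminant analysis (PLS-DA), were then used to rapidly identify and classify the types of adulterant powder in lily. The results show that both classification models accurately identified adulterated lily samples from the EEM fluorescence spectral data, with correct classification rates both reaching 95%. A new method for rapid detection of lily adulteration was successfully established using PCA-LDA and PLS-DA, and the characteristic fluorescence fingerprint of lily was refined, which is expected to lay a foundation for a more comprehensive and accurate quality standard system for evaluating lily as a medicinal material.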

14.
Currently, no standard metrics are used to quantify cluster separation in PCA or PLS-DA scores plots for metabonomics studies or to determine if cluster separation is statistically significant. Lack of such measures makes it virtually impossible to compare independent or inter-laboratory studies and can lead to confusion in the metabonomics literature when authors putatively identify metabolites distinguishing classes of samples based on visual and qualitative inspection of scores plots that exhibit marginal separation. While previous papers have addressed quantification of cluster separation in PCA scores plots, none have advocated routine use of a quantitative measure of separation that is supported by a standard and rigorous assessment of whether or not the cluster separation is statistically significant. Here quantification and statistical significance of separation of group centroids in PCA and PLS-DA scores plots are considered. The Mahalanobis distance is used to quantify the distance between group centroids, and the two-sample Hotelling's T² test is computed for the data, related to an F-statistic, and then an F-test is applied to determine if the cluster separation is statistically significant. We demonstrate the value of this approach using four datasets containing various degrees of separation, ranging from groups that had no apparent visual cluster separation to groups that had no visual cluster overlap. Widespread adoption of such concrete metrics to quantify and evaluate the statistical significance of PCA and PLS-DA cluster separation would help standardize reporting of metabonomics data.
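The chain described in this abstract (Mahalanobis distance between centroids, two-sample Hotelling's T², conversion to an F-statistic) can be sketched for a two-dimensional scores plot. The sketch assumes exactly two score dimensions and a pooled covariance matrix, and uses the standard relation F = (n1 + n2 - p - 1) / (p (n1 + n2 - 2)) × T² with (p, n1 + n2 - p - 1) degrees of freedom. It stops short of the p-value, which would need an F-distribution table or a statistics library. Function names and score values are hypothetical.

```python
from math import sqrt

def mean_vector(points):
    """Centroid of a list of 2-D score points."""
    n = len(points)
    return [sum(pt[k] for pt in points) / n for k in range(2)]

def scatter_matrix(points, mean):
    """Sum of outer products of deviations from the centroid (2 x 2)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for pt in points:
        d = [pt[0] - mean[0], pt[1] - mean[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s

def hotelling_two_sample(group1, group2):
    """Mahalanobis distance, Hotelling's T^2 and F-statistic for two 2-D clusters."""
    n1, n2, p = len(group1), len(group2), 2
    m1, m2 = mean_vector(group1), mean_vector(group2)
    s1, s2 = scatter_matrix(group1, m1), scatter_matrix(group2, m2)
    # Pooled covariance matrix of the two groups.
    pooled = [[(s1[a][b] + s2[a][b]) / (n1 + n2 - 2) for b in range(2)]
              for a in range(2)]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    inv = [[pooled[1][1] / det, -pooled[0][1] / det],
           [-pooled[1][0] / det, pooled[0][0] / det]]
    d = [m1[0] - m2[0], m1[1] - m2[1]]
    # Squared Mahalanobis distance between the two centroids.
    d2 = sum(d[a] * inv[a][b] * d[b] for a in range(2) for b in range(2))
    t2 = (n1 * n2) / (n1 + n2) * d2
    f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    return {"mahalanobis": sqrt(d2), "t2": t2, "f": f_stat,
            "df": (p, n1 + n2 - p - 1)}

# Hypothetical scores for two well-separated groups of four samples each.
group_a = [(0, 0), (1, 0), (0, 1), (1, 1)]
group_b = [(4, 4), (5, 4), (4, 5), (5, 5)]
result = hotelling_two_sample(group_a, group_b)
print(round(result["t2"], 6), round(result["f"], 6), result["df"])  # prints: 192.0 80.0 (2, 5)
```

Comparing `f` against the critical value of the F-distribution with the returned degrees of freedom then gives the significance decision the abstract refers to.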

15.
Biofluids, like urine, form very complex matrices containing a large number of potential biomarkers, that is, endogenous metabolites that change in response to xenobiotic exposure. This paper describes a fast and sensitive method of screening biomarkers in rat urine. Biomarkers for phospholipidosis, induced by an antidepressant drug, were studied. Urine samples from rats exposed to citalopram were analyzed using solid-phase extraction (SPE) and liquid chromatography mass spectrometry (LC/MS) analysis detecting negative ions. A fast iterative method, called Gentle, was used for the automatic curve resolution, and metabolic fingerprints were obtained. After peak alignment, principal component analysis (PCA) was performed for pattern recognition, and PCA loadings were studied as a means of discovering potential biomarkers. In this study a number of potential biomarkers of phospholipidosis in rats are discussed. They are reported by their retention time and base peak, as their identification is not within the scope of the study. In addition to the fact that it was possible to differentiate control samples from dosed samples, the data were very easy to interpret, and signals from xenobiotic-related substances were easily removed without affecting the endogenous compounds. The proposed method is a complement or an alternative to NMR for metabolomic applications.

16.
Elucidation of the composition of chemical-biological samples is a main focus of systems biology and metabolomics. Due to the inherent complexity of these mixtures, reliable, efficient, and potentially automatable methods are needed to identify the underlying metabolites and natural products. Because of its rich chemical information content, nuclear magnetic resonance (NMR) spectroscopy has a unique potential for this task. Here we present a generalization and application of a recently introduced NMR data collection, processing, and analysis strategy that circumvents the need for extensive purification and hyphenation prior to analysis. It uses covariance TOCSY NMR spectra measured on a 1-mm high-temperature cryogenic probe that are analyzed by a spectral trace clustering algorithm yielding 1D NMR spectra of the individual components for their unambiguous identification. The method is demonstrated on a metabolic model mixture and is then applied to the unpurified venom mixture of an individual walking stick insect that contains several slowly interconverting and closely related metabolites.

17.
Liquid chromatography-mass spectrometry (LC-MS) is a common method for profiling biological samples in metabolomics. However, LC-MS data of metabolomic studies are often affected by high noise levels, retention time shifts, and high variability in signal intensities. With a new chip-based nanoelectrospray source it becomes possible to directly infuse complex biological samples such as plasma without any chromatographic separation beforehand. In combination with highly diluted samples and long data acquisition times, the parallel analysis of hundreds of compounds is now possible. In a proof-of-concept study, 10 human plasma samples from females and males were analyzed with the intention to separate the two groups by their different metabolomes. The reproducibility was so high that statistical analysis of the data could be performed without prior normalization. Two groups of female and male samples were separated by a supervised machine learning algorithm, principal component analysis, and hierarchical clustering. Peaks contributing to the group separation were characterized by accurate mass measurement and MS-MS fragmentation and by spiking experiments. The feasibility of direct sample infusion using the new chip-based nanoelectrospray device opens a new dimension for the rapid parallel analysis of complex biological mixtures.

18.
Data processing and analysis have become true rate and success limiting factors for molecular research where a large number of samples of high complexity are included in the data set. In general, rather complicated methodologies are needed for the combination and comparison of information obtained from selected analytical platforms. Although commercial as well as freely accessible software for high-throughput data processing is available for most platforms, tailored in-house solutions for data management and analysis can provide the versatility and transparency required for, e.g., method development and pilot studies. This paper describes a procedure for exploring metabolic fingerprints in urine samples from prostate and bladder cancer patients with a set of in-house developed Matlab tools. In spite of the immense amount of data produced by the LC-MS platform, in this study more than 10^10 data points, it is shown that the data processing tasks can be handled with reasonable computer resources. The preprocessing steps include baseline subtraction and noise reduction, followed by an initial time alignment. In the data analysis the fingerprints are treated as 2-D images, i.e. pixel by pixel, in contrast to the more common list-based approach after peak or feature detection. Although the latter approach greatly reduces the data complexity, it also involves a critical step that may obscure essential information due to undetected or misaligned peaks. The effects of remaining time shifts after the initial alignment are reduced by a binning and ‘blurring’ procedure prior to the comparative multivariate and univariate data analyses. Factors other than cancer assignment were taken into account by ANOVA applied to the PCA scores as well as to the individual variables (pixels).
It was found that the analytical day-to-day variations in our study had a large confounding effect on the cancer-related differences, which emphasizes the role of proper normalization and/or experimental design. While PCA could not establish significant cancer-related patterns, the pixel-wise univariate analysis could provide a list of about a hundred ‘hotspots’ indicating possible biomarkers. This was also the limited goal for this study, with focus on the exploration of a really huge and complex data set. True biomarker identification, however, needs thorough validation and verification in separate patient sets.

19.
LC-MS-based proteomics requires methods with high peak capacity and a high degree of automation, integrated with data-handling tools able to cope with the massive data produced and able to quantitatively compare them. This paper describes an off-line two-dimensional (2D) LC-MS method and its integration with software tools for data preprocessing and multivariate statistical analysis. The 2D LC-MS method was optimized in order to minimize peptide loss prior to sample injection and during the collection step after the first LC dimension, thus minimizing errors from off-column sample handling. The second dimension was run in fully automated mode, injecting onto a nanoscale LC-MS system a series of more than 100 samples, representing fractions collected in the first dimension (8 fractions/sample). As a model study, the method was applied to finding biomarkers for the anti-inflammatory properties of zilpaterol, which are coupled to the beta2-adrenergic receptor. Secreted proteomes from U937 macrophages exposed to lipopolysaccharide in the presence or absence of propranolol or zilpaterol were analysed. Multivariate statistical analysis of 2D LC-MS data, based on principal component analysis, and subsequent targeted LC-MS/MS identification of peptides of interest demonstrated the applicability of the approach.

20.
Metabolite profiling in biomarker discovery, enzyme substrate assignment, drug activity/specificity determination, and basic metabolic research requires new data preprocessing approaches to correlate specific metabolites to their biological origin. Here we introduce an LC/MS-based data analysis approach, XCMS, which incorporates novel nonlinear retention time alignment, matched filtration, peak detection, and peak matching. Without using internal standards, the method dynamically identifies hundreds of endogenous metabolites for use as standards, calculating a nonlinear retention time correction profile for each sample. Following retention time correction, the relative metabolite ion intensities are directly compared to identify changes in specific endogenous metabolites, such as potential biomarkers. The software is demonstrated using data sets from a previously reported enzyme knockout study and a large-scale study of plasma samples. XCMS is freely available under an open-source license at http://metlin.scripps.edu/download/.
