Similar literature
20 similar documents found (search time: 234 ms)
1.
Wang C, Kong H, Guan Y, Yang J, Gu J, Yang S, Xu G. Analytical Chemistry, 2005, 77(13): 4108-4116
Liquid chromatography/mass spectrometry (LC/MS) followed by multivariate statistical analysis has been successfully applied to the metabolic profiling of plasma phospholipids in type 2 diabetes mellitus (DM2). Principal components analysis and partial least-squares discriminant analysis (PLS-DA) models were tested and compared for class separation between the DM2 and control groups. The application of an orthogonal signal correction filtered model greatly improved the class distinction and predictive power of the PLS-DA models. Unit variance scaling was also tested. With this methodology, it was possible not only to differentiate the DM2 group from the control but also to discover and identify the potential biomarkers with LC/MS/MS. The proposed method shows that LC/MS combined with multivariate statistical analysis is a complement or an alternative to NMR for metabonomics applications.
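
The unit variance scaling mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common definition (mean-centre each variable, then divide by its sample standard deviation); the function name and the small intensity table are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def unit_variance_scale(rows):
    """Mean-centre each column and divide by its sample standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    # A constant column (sd == 0) carries no information and is mapped to 0.0.
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, sds)]
        for row in rows
    ]


# Hypothetical peak-intensity table: 3 samples x 2 variables on very different scales.
scaled = unit_variance_scale([[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]])
```

After scaling, both variables have the same spread, so a high-intensity peak no longer dominates a PCA or PLS-DA model simply because of its magnitude.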

2.
The theory together with an algorithm for uncorrelated linear discriminant analysis (ULDA) is introduced and applied to explore metabolomics data. ULDA is a supervised method for feature extraction (FE), discriminant analysis (DA) and biomarker screening based on the Fisher criterion function. While principal component analysis (PCA) searches for directions of maximum variance in the data, ULDA seeks linearly combined variables called uncorrelated discriminant vectors (UDVs). The UDVs maximize the separation among different classes in terms of the Fisher criterion. The performance of ULDA is evaluated and compared with PCA, partial least squares discriminant analysis (PLS-DA) and target projection discriminant analysis (TP-DA) for two datasets, one simulated and one real from a metabolomic study. ULDA showed better discriminatory ability than PCA, PLS-DA and TP-DA. The shortcomings of PCA, PLS-DA and TP-DA are attributed to interference from linear correlations in data. PLS-DA and TP-DA performed successfully for the simulated data, but PLS-DA was slightly inferior to ULDA for the real data. ULDA successfully extracted optimal features for discriminant analysis and revealed potential biomarkers. Furthermore, by means of cross-validation, the classification model obtained by ULDA showed better predictive ability than PCA, PLS-DA and TP-DA. In conclusion, ULDA is a powerful tool for revealing discriminatory information in metabolomics data.
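
For orientation, the Fisher criterion named in this abstract is usually written as below. The abstract gives no formulas, so the notation is an assumption following the textbook form: $S_b$ and $S_w$ denote the between-class and within-class scatter matrices, $S_t = S_b + S_w$ the total scatter, and $\mathbf{w}_i$ the discriminant vectors.

```latex
% Fisher criterion maximized by each discriminant vector w
J(\mathbf{w}) = \frac{\mathbf{w}^{\mathsf{T}} S_b \, \mathbf{w}}{\mathbf{w}^{\mathsf{T}} S_w \, \mathbf{w}}

% Uncorrelatedness constraint commonly imposed in ULDA
\mathbf{w}_i^{\mathsf{T}} S_t \, \mathbf{w}_j = 0 \quad (i \neq j), \qquad S_t = S_b + S_w
```

Under this reading, the constraint makes the extracted features mutually uncorrelated, in contrast to PCA, which maximizes only the variance $\mathbf{w}^{\mathsf{T}} S_t \, \mathbf{w}$ and ignores class labels.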

3.
Application of metabonomics to nutritional sciences, also termed nutrimetabonomics, offers the possibility to measure metabolic responses associated with the consumption of specific nutrients and foods. As dietary differences generally lead only to subtle metabolic changes, measuring diet-associated metabolic phenotypes is a challenge, and also an opportunity to develop and test new chemometric strategies that can highlight metabolic information in relation to different dietary habits. While multivariate statistical techniques have long been used to analyse dietary data from diet records and questionnaires, to date no attempt has been made to link dietary patterns with metabolic profiles. Using a three-step strategy, it was possible to merge 1H NMR plasma metabolic profile data with specific dietary patterns as assessed by Principal Component Analysis (PCA) and Partial Least Squares-Discriminant Analysis (PLS-DA). Five dietary patterns (energy intake, plant versus animal based diet, “traditional diet” versus sugar-rich diet, “traditional” versus “modern” diets, and consumption of skim versus whole dairy products) were found by applying PCA to the food frequency questionnaire data, and together these explained 50% of the variation. Metabolic phenotypes associated with these dietary patterns were obtained by PLS-DA and were mainly based on differences in lipid and amino acid profiles in plasma. This new approach to assessing relationships between dietary intake and metabolic profiling data will allow greater steps to be made in merging nutritional epidemiology with metabonomics.

4.
In the field of metabonomics, 1H NMR and full scan mass spectrometry methods have usually been combined with principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) to detect patterns in biofluids that correspond to specific effects, usually a toxic site effect of a compound. Confounders together with great interindividual variation complicate such analysis in humans, and therefore, metabonomic data are almost entirely restricted to animals. In our study, a constant neutral loss (CNL) scan on a linear ion trap demonstrated increased sensitivity and specificity compared to a full scan approach and was performed to detect mercapturic acids (MA), a class of effect markers. The method was applied to human volunteers administered 50 and 500 mg of acetaminophen (AAP), a model compound known to form MAs. Using a new algorithm to prepare the CNL data for chemometrics, discrimination of control and postdose samples could be performed using PCA and PLS-DA. The loadings plots clearly revealed AAP-MA as a marker, even at low-dose levels. Orthogonal signal correction (OSC) was carried out to investigate background information that is not due to exposure. Surprisingly, the OSC data provided a classification of male and female subjects, showing the performance of the new approach.

5.
While common analytical and reporting standards already exist for 1H NMR techniques, this does not apply to LC-MS metabolic profiling approaches. Such standards are all the more advisable when applying metabonomics to human biofluids, particularly urine samples, due to the high degree of biological variation compared to animals. A control study was performed, and urine samples of 30 healthy male and female human subjects were collected at intervals of 8 h twice a day for three consecutive days. Using selective multiple reaction monitoring in combination with a column-switching tool for the analysis of the mercapturate pattern, samples were screened for time and gender differences, the most common confounders. Data preprocessing parameters, alignment, scaling to internal standards, and normalization techniques were optimized by PCA, PLS-DA, and OPLS models. Great care was taken in the validation process of both analytical and chemometric protocols. Additionally, a problem of LC-MS, the combination of "different-batch" data to "one-batch" data, could be solved by a batchwise scaling procedure. Based on these results, the use of metabolic profiling via mercapturates will be feasible for the detection of disease or toxicity markers in the future, since mercapturates are important biomarkers of reactive metabolites known to be involved in many toxic processes.

6.
We present a method for the qualitative and quantitative study of transient metabolic flux of phage infection at the molecular level. The method is based on statistical total correlation spectroscopy (STOCSY) and partial least squares discriminant analysis (PLS-DA) applied to nuclear magnetic resonance (NMR) metabonomic data sets. An algorithm for this type of study is developed and demonstrated. The method has been implemented on 1H NMR data sets of growth media in planktonic cultures of Pseudomonas aeruginosa infected with bacteriophage pf1. Transient metabolic fluxes of various important metabolites, identified by STOCSY and PLS-DA analysis applied to the NMR data set, are estimated at various stages of growth. The opportunistic and nosocomial pathogen P. aeruginosa is one of the best-studied model organisms for bacterial biofilms. Complete information regarding the metabolic connectivity of this system cannot be obtained by conventional spectroscopic approaches. Our study presents temporal comparative 1H NMR metabonomic analyses of filamentous phage pf1 infection in planktonic cultures of P. aeruginosa K strain (PAK). We exemplify here the potential of STOCSY and PLS-DA tools to gain mechanistic insight into subtle changes and to determine the transient flux associated with metabolites following metabolic perturbations resulting from phage infection. Our study opens new avenues for correlating existing postgenomic data with current metabonomic results in P. aeruginosa biofilm research.
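
The core computation behind STOCSY can be sketched as a correlation between one "driver" spectral variable and every other variable across samples. This is a minimal illustration assuming plain Pearson correlation; the function names and the toy spectra are hypothetical, and the authors' actual algorithm is not described in the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def stocsy_row(spectra, driver):
    """Correlate the driver column with every column of a samples x variables table."""
    columns = list(zip(*spectra))
    return [pearson(columns[driver], col) for col in columns]


# Hypothetical intensities: 3 samples x 3 spectral variables.
spectra = [[1.0, 2.0, 5.0], [2.0, 4.0, 3.0], [3.0, 6.0, 1.0]]
correlations = stocsy_row(spectra, driver=0)
```

Variables whose intensities rise and fall together with the driver across samples give correlations near 1, which is how peaks plausibly belonging to the same molecule or pathway are grouped.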

7.
Lutz U, Lutz RW, Lutz WK. Analytical Chemistry, 2006, 78(13): 4564-4571
Mass spectrometry (MS) is increasingly being used for metabolic profiling, but detection modes such as constant neutral loss or multiple reaction monitoring have not often been reported. These modes allow focusing on structurally related compounds, which could be advantageous for situations in which the trait under investigation is associated with a particular class of metabolites. In this study, we analyzed endogenous glucuronides excreted in human urine by monitoring characteristic transitions of putative steroid glucuronides by LC-MS/MS for discrimination of females from males. Two methods for data extraction were used: (i) a manual procedure based on visual inspection of the chromatograms and selection of 23 peaks and (ii) a software-supported method (MarkerView) set to extract 100 peaks. Data from 10 female and 10 male students were analyzed by principal component analysis (PCA) and partial least-squares discriminant analysis (PLS-DA) using the SIMCA software. With PCA, only the manual peak selection resulted in clustering of males and females. With PLS-DA, the manual method provided full separation on the basis of one single discriminant; the software-supported approach required a two-component model for complete separation. Loading plots were analyzed for their ability to reveal peaks with high discriminating power, that is, potential biomarkers. The PLS-DA models were validated with urine samples collected from five new females and five new males. Gender was correctly assigned for all. Our results indicate that inclusion of biological criteria for variable selection coupled to class-specific MS analysis and data extraction by appropriate software may constitute a valuable addition to the methods available for metabolomics.

8.
A large metabolomics study was performed on 600 plasma samples taken at four time points before and after a single intake of a high fat test meal by obese and lean subjects. All samples were analyzed by a liquid chromatography-mass spectrometry (LC-MS) lipidomic method for metabolic profiling. A pragmatic approach combining several well-established statistical methods was developed for processing this large data set in order to detect small differences in metabolic profiles in the presence of large biological variation. Such metabolomics studies require a careful analytical and statistical protocol. The strategy included data preprocessing, data analysis, and validation of statistical models. After several data preprocessing steps, partial least-squares discriminant analysis (PLS-DA) was used for finding biomarkers. To validate the biomarkers found statistically, the PLS-DA models were validated by means of a permutation test, biomarker models, and noninformative models. Univariate plots of potential biomarkers were used to gain insight into up- or downregulation. The proposed strategy proved to be applicable for dealing with large-scale human metabolomics studies.
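
The idea of a permutation test for validating a class-discrimination result can be sketched as follows. This is a simplified illustration, not the procedure of the paper: as an assumption, the absolute difference between class means of a single variable stands in for the PLS-DA performance statistic, and all names and data are hypothetical.

```python
import random
from statistics import mean


def mean_difference(values, labels):
    """Absolute difference between the class means (labels are 0 or 1)."""
    a = [v for v, lab in zip(values, labels) if lab == 1]
    b = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(mean(a) - mean(b))


def permutation_p_value(values, labels, n_permutations=999, seed=0):
    """Fraction of label shufflings that separate the classes at least as well."""
    rng = random.Random(seed)
    observed = mean_difference(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between values and classes
        if mean_difference(values, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling itself as one permutation.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical marker intensities for 4 control (0) and 4 case (1) samples.
values = [1.0, 2.0, 3.0, 4.0, 11.0, 12.0, 13.0, 14.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
p_value = permutation_p_value(values, labels)
```

A small p-value indicates that the observed separation is unlikely to arise from random class assignment; in a full workflow the statistic would be recomputed by refitting the model on each permuted label set.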

9.
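To address the poor generalization of classifiers caused by small-sample gait data, a gait classification method based on support vector machines (SVMs) is proposed. Gait data were collected from 24 young and 24 elderly subjects, and 24 gait features were extracted to train the SVM; cross-validation was used to assess the generalization performance of the classifier. The results show that the proposed method classifies small-sample gait data effectively and generalizes well. The choice of kernel function had little effect on classification performance. In a comparison with a neural network classifier trained by the conventional back-propagation algorithm, the SVM clearly outperformed the neural network. SVMs have broad application prospects in gait classification.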

10.
Modelling the dispersion of flashing jets using CFD (total citations: 1; self-citations: 0; citations by others: 1)
Risk assessments related to industrial environments where gas is kept in liquid form under high pressure rely on the results from predictive tools. Computational Fluid Dynamics (CFD) is one such predictive tool and it is currently used for a range of applications. One of the most challenging application areas is the simulation of multiphase flows resulting from a breach or leakage in a pressurised pipeline or a vessel containing liquefied gas. The present paper deals with the modelling of the post-flashing scenario of a jet emanating from a circular orifice. In addition to being based on the equations governing fluid flow, the models used are those related to turbulence, droplet transport, evaporation, break-up and coalescence. Some of these models are semi-empirical and based on the data from applications other than flashing. However, these are the only models that are currently available in commercial codes and that would be used by consulting engineers for the type of modelling discussed above, namely the dispersion of a flashing release. A method for calculating inlet boundary conditions after flashing is also presented and issues related to such calculations are discussed. The results from a number of CFD based studies are compared with available experimental results. The results show that whilst a number of features of the experimental results can be reproduced by the CFD model, there are also a number of important shortcomings. The shortcomings are highlighted and discussed. Finally, an optimum approach to modelling of this type is suggested and methods to overcome modelling difficulties are proposed.

11.
12.
Successful identification of the important metabolite features in high-resolution nuclear magnetic resonance (NMR) spectra is a crucial task for the discovery of biomarkers that have the potential for early diagnosis of disease and subsequent monitoring of its progression. Although a number of traditional feature extraction/selection methods are available, most of them have been conducted in the original frequency domain and disregard the fact that an NMR spectrum comprises a number of local bumps and peaks with different scales. In the present study, a complex wavelet transform that can handle multiscale information efficiently and has an energy shift-insensitive property is proposed as a method to improve feature extraction and classification in NMR spectra. Furthermore, a multiple testing procedure based on a false discovery rate (FDR) was used to identify important metabolite features in the complex wavelet domain. Experimental results with real NMR spectra showed that classification models constructed with the complex wavelet coefficients selected by the FDR-based procedure yield lower rates of misclassification than models constructed with original features and conventional wavelet coefficients.
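
An FDR-based multiple testing procedure of the kind mentioned in this abstract can be sketched with the Benjamini-Hochberg step-up rule. The abstract does not say which FDR procedure was used, so Benjamini-Hochberg is an assumption here, and the p-values below are made up for illustration (one per hypothetical wavelet coefficient).

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the indices of features selected at false discovery rate q."""
    m = len(p_values)
    # Feature indices ordered from the smallest to the largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value is under its threshold.
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical p-values from per-feature tests between two sample classes.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
selected = benjamini_hochberg(p_values, q=0.05)
```

Only the selected features would then be passed to the classifier, which limits the expected share of false positives among them to roughly q.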

13.
Support vector machines (SVMs) provide an interesting computational paradigm for the classification of data from high-energy physics and particle astrophysics experiments. In this study, the classification power of SVMs is compared with that of standard supervised algorithms, i.e. likelihood ratio and artificial neural networks (ANN), using test beam data from the transition radiation detector prototype of the PAMELA satellite-borne magnetic spectrometer. Concerning signal/background discrimination, SVM and ANN show the best performance. Moreover, our analysis shows that the use of SVM allows an accurate estimate of the discrimination efficiency of unseen data points: indeed, since almost the same efficiency is obtained with or without the cross-validation technique, the performance of SVM appears to be stable. On the other hand, the ANN shows a tendency to overfit the data, while this tendency is not observed using SVM. For these reasons, SVM could be used in particle astrophysics experiments where, due to the harsh experimental conditions, efficient and robust classification algorithms are needed.

14.
The purpose of this paper is to develop a data-mining-based dynamic dispatching rule selection mechanism for a shop floor control system to make real-time scheduling decisions. In data mining processes, data transformations (including data normalisation and feature selection) and data mining algorithms greatly influence the predictive accuracy of data mining tasks. Here, a z-score data normalisation mechanism and a genetic-algorithm-based feature selection mechanism are used for the data transformation tasks, and a support vector machine (SVM) is then applied as the dynamic dispatching rule selection classifier. The simulation experiments demonstrate that the proposed data-mining-based approach generalises better than approaches without data mining, in terms of accurately assigning the best dispatching strategy for the next scheduling period. Moreover, the proposed SVM classifier using the data-mining-based approach yields better system performance than that obtained with a classical SVM-based dynamic dispatching rule selection mechanism and heuristic individual dispatching rules under various performance criteria over a long period.

15.
System reliability depends on inherent mechanical and structural aging factors as well as on operational and environmental conditions, which could enhance (or smooth) such factors. In practice, the dependences involved may complicate the modeling of reliability behavior over time, so that traditional stochastic modeling approaches are likely to fail. Empirical prediction methods, such as support vector machines (SVMs), become a valid alternative whenever reliable time series data are available. However, the prediction performance of SVMs depends on the setting of a number of parameters that influence the effectiveness of the training stage during which the SVMs are constructed based on the available data set. The problem of choosing the most suitable values for the SVM parameters can be framed in terms of an optimization problem aimed at minimizing a prediction error. In this work, this problem is solved by particle swarm optimization (PSO), a probabilistic approach based on an analogy with the collective motion of biological organisms. SVM in liaison with PSO is then applied to tackle reliability prediction problems based on time series data of engineered components. Comparisons of the obtained results with those given by other time series techniques indicate that the PSO + SVM model is able to provide reliability predictions with comparable or greater accuracy. Copyright © 2011 John Wiley & Sons, Ltd.
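
The parameter search described in this abstract can be sketched with a basic particle swarm optimizer. This is a generic textbook variant, not the authors' implementation: the inertia and acceleration coefficients are assumed values, and a simple quadratic surface stands in (hypothetically) for the SVM prediction error as a function of two parameters.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iterations=100, seed=0,
                 inertia=0.7, cognitive=1.5, social=1.5):
    """Minimize objective(position) inside box bounds with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_positions = [p[:] for p in positions]          # each particle's own best
    best_scores = [objective(p) for p in positions]
    g = min(range(n_particles), key=lambda i: best_scores[i])
    global_best, global_score = best_positions[g][:], best_scores[g]
    for _ in range(n_iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                velocities[i][d] = (inertia * velocities[i][d]
                                    + cognitive * r1 * (best_positions[i][d] - positions[i][d])
                                    + social * r2 * (global_best[d] - positions[i][d]))
                # Move, then clamp the coordinate back into the search box.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            score = objective(positions[i])
            if score < best_scores[i]:
                best_positions[i], best_scores[i] = positions[i][:], score
                if score < global_score:
                    global_best, global_score = positions[i][:], score
    return global_best, global_score


def toy_error(params):
    """Hypothetical stand-in for a validation error; its minimum is at (1, -2)."""
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2


best_params, best_error = pso_minimize(toy_error, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

In a real setting, `toy_error` would be replaced by a function that trains an SVM with the candidate parameters and returns its prediction error on held-out time series data.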

16.
Model validation is critical in predicting the performance of manufacturing processes. In predictive regression, proper selection of variables helps minimize the model mismatch error, proper selection of models helps reduce the model estimation error, and proper validation of models helps minimize the model prediction error. In this paper, the literature is briefly reviewed and a rigorous procedure is proposed for evaluating the validation and data splitting methods in predictive regression modeling. Experimental data from a honing surface roughness study are used to illustrate the methodology. In particular, the individual versus average data splitting methods as well as the fivefold versus threefold cross-validation methods are compared. This paper shows that statistical tests and the evaluation of prediction errors are important in subset selection and cross-validation of predictive regression models. No statistical differences were found between the fivefold and the threefold cross-validation methods, or between the individual and average data splitting methods in predictive regression modeling.
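
The fivefold and threefold cross-validation schemes compared in this abstract differ only in how the samples are partitioned. A minimal sketch of the splitting step is given below; the function name, the seed handling and the sample count of 30 are assumptions for illustration and are not taken from the paper.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# Each sample is held out exactly once, whichever k is chosen.
threefold = list(k_fold_indices(30, k=3))
fivefold = list(k_fold_indices(30, k=5))
```

With threefold splitting each model is fitted on two thirds of the data, with fivefold on four fifths, which is the trade-off between training-set size and number of fits that such comparisons examine.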

17.
The problem of model uncertainty versus model inaccuracy is examined in the light of the concept of the ‘probability of correctness of a model under a given context’ introduced by Apostolakis. To avoid possible difficulties linked with this concept, a distinction is introduced between ‘predictive’ models and ‘constitutive’ models, the former being generic in the sense that they can host the latter as submodels. A metric or distance between linear models as well as an objective of the model are introduced, from which we can give an operational definition of ‘model uncertainty’ (with respect to distribution of parameters of the associated constitutive models) and of ‘model accuracy’ with respect to a reference model. Finally the choice of a predictive model is linked to a loss function and a cost of using or defining a model.

18.
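Image fusion algorithm based on independent component analysis (total citations: 2; self-citations: 0; citations by others: 2)
Independent component analysis (ICA) yields a sparse coding of images and captures their important edge information well. This paper proposes an ICA-based image fusion algorithm: a support vector machine is used to distinguish the sharp and blurred regions of multi-focus images, and image segmentation is carried out in the ICA domain to extract the main edge features, thereby achieving feature-level fusion of multi-focus images. Experimental results show that the proposed fusion algorithm is effective.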

19.
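To determine rapidly whether lily (百合) has been adulterated, excitation-emission matrix (EEM) fluorescence spectroscopy was used to analyse pure and adulterated lily samples, and characteristic fluorescence fingerprints of lily and adulterated lily were constructed. Two chemical pattern recognition methods, principal component analysis-linear discriminant analysis (PCA-LDA) and partial least squares-discriminant analysis (PLS-DA), were then used to rapidly identify and classify the types of adulterant powder in lily. The results show that both classification models accurately identify adulterated lily samples from the EEM fluorescence data, with correct classification rates as high as 95% for both. PCA-LDA and PLS-DA were thus used to establish a new method for the rapid detection of lily adulteration, and the fluorescence fingerprint of lily was refined at the same time, which is expected to lay a foundation for a more comprehensive and accurate quality standard system for lily as a medicinal material.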

20.
This article presents an experimental study of the classification ability of several classifiers for multi-class classification of cannabis seedlings. As the cultivation of drug type cannabis is forbidden in Switzerland, law enforcement authorities regularly ask forensic laboratories to determine the chemotype of a seized cannabis plant and then to conclude whether the plantation is legal or not. This classification is mainly performed when the plant is mature, as required by the EU official protocol, which makes the classification of cannabis seedlings a time-consuming and costly procedure. A previous study by the authors investigated this problem [1] and showed that it is possible to differentiate between drug type (illegal) and fibre type (legal) cannabis at an early stage of growth using gas chromatography interfaced with mass spectrometry (GC-MS) based on the relative proportions of eight major leaf compounds. The aims of the present work are, on the one hand, to continue the former work and optimize the methodology for the discrimination of drug- and fibre type cannabis developed in the previous study and, on the other hand, to investigate the possibility of predicting illegal cannabis varieties. Seven classifiers for differentiating between cannabis seedlings are evaluated in this paper, namely Linear Discriminant Analysis (LDA), Partial Least Squares Discriminant Analysis (PLS-DA), Nearest Neighbour Classification (NNC), Learning Vector Quantization (LVQ), Radial Basis Function Support Vector Machines (RBF SVMs), Random Forest (RF) and Artificial Neural Networks (ANN). The performance of each method was assessed using the same analytical dataset, which consists of 861 samples split into drug- and fibre type cannabis, with drug type cannabis being made up of 12 varieties (i.e. 12 classes). The results show that linear classifiers are not able to manage the distribution of classes in which some overlap areas exist for both classification problems. Unlike linear classifiers, NNC and RBF SVMs best differentiate cannabis samples for both the 2-class and 12-class classifications, with average classification results up to 99% and 98%, respectively. Furthermore, RBF SVMs correctly classified as drug type cannabis the independent validation set, which consists of cannabis plants coming from police seizures. For forensic casework, this study shows that discrimination between cannabis samples at an early stage of growth is possible with fairly high classification performance, whether for discriminating between cannabis chemotypes or between drug type cannabis varieties.
