Similar literature
1.
A multivariate data matrix containing a number of missing values was obtained from a study on the changes in colour and phenolic composition during the ageing of port. Two approaches were taken in the analysis of the data. The first involved the use of multiple imputation (MI) followed by principal components analysis (PCA). The second examined the use of maximum likelihood principal component analysis (MLPCA). The use of multiple imputation allows for missing value uncertainty to be incorporated into the analysis of the data. Initial estimates of missing values were first calculated using the Expectation Maximization (EM) algorithm, followed by Data Augmentation (DA) in order to generate five imputed data matrices. Each complete data matrix was subsequently analysed by PCA, and the principal component (PC) scores and loadings were then averaged to give an estimate of the errors. The first three PCs accounted for 93.3% of the explained variance. Changes to colour and monomeric anthocyanin composition were explained on PC1 (79.63% explained variance), phenolic composition and hue mainly on PC2 (8.61% explained variance) and phenolic composition and the formation of polymeric pigment on PC3 (5.04% explained variance). In MLPCA, estimates of measurement uncertainty are incorporated in the decomposition step, with missing values being assigned large measurement uncertainties. PC scores on the first two PCs after multiple imputation and PCA (MI+PCA) were comparable to maximum likelihood scores on the first two PCs extracted by MLPCA.

2.
3.
Procedures to compensate for correlated measurement errors in multivariate data analysis are described. These procedures are based on the method of maximum likelihood principal component analysis (MLPCA), previously described in the literature. MLPCA is a decomposition method similar to conventional PCA, but it takes into account measurement uncertainty in the decomposition process, placing less emphasis on measurements with large variance. Although the original MLPCA algorithm can accommodate correlated measurement errors, two drawbacks have limited its practical utility in these cases: (1) an inability to handle rank deficient error covariance matrices, and (2) demanding memory and computational requirements. This paper describes two simplifications to the original algorithm that apply when errors are correlated only within the rows of a data matrix and when all of these row covariance matrices are equal. Simulated and experimental data for three-component mixtures are used to test the new methods. It was found that inclusion of error covariance information via MLPCA always gave results which were at least as good as, and normally better than, those of PCA when the true error covariance matrix was available. However, when the error covariance matrix is estimated from replicates, the relative performance depends on the quality of the estimate and the degree of correlation. For experimental data consisting of mixtures of cobalt, chromium and nickel ions, maximum likelihood principal components regression showed an improvement of up to 50% in the cross-validation error when error covariance information was included.

4.
The problem of multicollinearity associated with the estimation of a functional logit model can be solved by using as predictor variables a set of functional principal components. The functional parameter estimated by functional principal component logit regression is often nonsmooth and therefore difficult to interpret. To solve this problem, different penalized spline estimations of the functional logit model are proposed in this paper. All of them are based on smoothed functional PCA and/or a discrete penalty in the log-likelihood criterion in terms of B-spline expansions of the sample curves and the functional parameter. The ability of these smoothing approaches to provide an accurate estimation of the functional parameter and their classification performance with respect to unpenalized functional PCA and LDA-PLS are evaluated via simulation and application to real data. Leave-one-out cross-validation and generalized cross-validation are adapted to select the smoothing parameter and the number of principal components or basis functions associated with the considered approaches.

5.
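Principal component analysis (PCA) is applied to signal processing and compared with the singular value decomposition (SVD) method. The signal-processing principles of PCA and SVD are analysed and summarized, and an eigenvalue difference spectrum theory based on PCA is proposed for signal denoising. The results show that PCA and SVD give similar processing results; the similarity arises because the right singular vectors of the original matrix are the eigenvectors of the covariance matrix. SVD has a smaller reconstruction error than PCA, because SVD does not need to compute the covariance matrix and so avoids the rounding errors that this computation introduces.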
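The relationship stated in this abstract, that the right singular vectors of the data matrix coincide with the eigenvectors of its covariance matrix, can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a column-centred data matrix, uses synthetic random data, and the names `X`, `Xc` and `align_signs` are illustrative only.

```python
import numpy as np

# Synthetic data with correlated columns (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
Xc = X - X.mean(axis=0)          # centre each column
n = Xc.shape[0]

# PCA route: eigendecomposition of the sample covariance matrix.
cov = Xc.T @ Xc / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]            # sort by decreasing eigenvalue
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# SVD route: no covariance matrix is formed.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

def align_signs(A, B):
    """Flip the columns of A so that each points the same way as the column of B."""
    signs = np.sign(np.sum(A * B, axis=0))
    return A * signs

V = align_signs(Vt.T, eigvecs)

# Eigenvectors agree up to sign; eigenvalues equal s**2 / (n - 1).
max_vec_diff = np.max(np.abs(V - eigvecs))
max_val_diff = np.max(np.abs(s**2 / (n - 1) - eigvals))
print(max_vec_diff, max_val_diff)
```

Both differences should be at the level of floating-point rounding. The sign alignment is needed because singular vectors and eigenvectors are each defined only up to a factor of -1.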

6.
熊天  张天骐  闻斌  吴超 《声学技术》2023,42(6):794-803
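To address the incomplete singing-voice separation achieved by any single traditional method, this paper proposes a two-step singing-voice and accompaniment separation model based on robust principal component analysis (RPCA) and a Mel frequency cepstrum coefficient (MFCC) repeating structure. The model effectively mitigates both the incomplete separation produced by RPCA and the poor low-frequency separation of the MFCC repeating-structure method. First, RPCA decomposes the mixed music signal into a low-rank matrix and a sparse matrix. MFCC feature parameters are then extracted from each matrix and compared by a similarity operation to build a similarity matrix and an MFCC repeating-structure model, from which the masking matrices associated with the low-rank and sparse matrices are obtained. Finally, the background music and the singing voice are recovered from the masking matrices and the inverse Fourier transform. Experiments on a public dataset show that, compared with the reference algorithms, the proposed algorithm improves the average signal-to-interference ratio by up to nearly 7 dB.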

7.
Multivariate statistical methods for the analysis, monitoring and diagnosis of process operating performance are becoming more important because of the availability of on-line process computers which routinely collect measurements on large numbers of process variables. Traditional univariate control charts have been extended to multivariate quality control situations using the Hotelling T2 statistic. Recent approaches to multivariate statistical process control which utilize not only product quality data (Y), but also all of the available process variable data (X) are based on multivariate statistical projection methods (principal component analysis, (PCA), partial least squares, (PLS), multi-block PLS and multi-way PCA). An overview of these methods and their use in the statistical process control of multivariate continuous and batch processes is presented. Applications are provided on the analysis of historical data from the catalytic cracking section of a large petroleum refinery, on the monitoring and diagnosis of a continuous polymerization process and on the monitoring of an industrial batch process.  相似文献   
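For readers unfamiliar with the Hotelling T2 statistic mentioned in this abstract, the sketch below computes it for new observations against a reference (in-control) data set. It is a minimal illustration under stated assumptions: the data are synthetic, the function name `hotelling_t2` is hypothetical, and no control limit is derived.

```python
import numpy as np

def hotelling_t2(reference, x):
    """T2 = (x - mean)^T S^{-1} (x - mean), with mean and S estimated from `reference`."""
    mean = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)        # sample covariance S
    diff = np.atleast_2d(x) - mean
    solved = np.linalg.solve(cov, diff.T).T      # S^{-1} (x - mean) without an explicit inverse
    return np.einsum("ij,ij->i", diff, solved)   # one T2 value per row of x

# Synthetic in-control data with three process variables (assumption).
rng = np.random.default_rng(1)
reference = rng.normal(size=(500, 3))

in_control = reference.mean(axis=0)                  # a point at the process mean
shifted = in_control + np.array([4.0, 4.0, 4.0])     # a point far from the mean
t2 = hotelling_t2(reference, np.vstack([in_control, shifted]))
print(t2)
```

The point at the reference mean gives a T2 of zero, while the shifted point gives a large value; in a monitoring chart that value would be compared with a control limit chosen by the practitioner.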

8.
Constrained principal component analysis (CPCA) incorporates external information into principal component analysis (PCA) of a data matrix. CPCA first decomposes the data matrix according to the external information (external analysis), and then applies PCA to decomposed matrices (internal analysis). The external analysis amounts to projections of the data matrix onto the spaces spanned by matrices of external information, while the internal analysis involves the generalized singular value decomposition (GSVD). Since its original proposal, CPCA has evolved both conceptually and methodologically; it is now founded on firmer mathematical ground, allows a greater variety of decompositions, and includes a wider range of interesting special cases. In this paper we present a comprehensive theory and various extensions of CPCA, which were not fully envisioned in the original paper. The new developments we discuss include least squares (LS) estimation under possibly singular metric matrices, two useful theorems concerning GSVD, decompositions of data matrices into finer components, and fitting higher-order structures. We also discuss four special cases of CPCA; 1) CCA (canonical correspondence analysis) and CALC (canonical analysis with linear constraints), 2) GMANOVA (generalized MANOVA), 3) Lagrange's theorem, and 4) CANO (canonical correlation analysis) and related methods. We conclude with brief remarks on advantages and disadvantages of CPCA relative to other competitors. Received: June 23, 2000; revised version: July 9, 2001  相似文献   

9.
Vogt, N.B., Bye, E., Thrane, K.E., Jacobsen, T. and Benestad, C., 1989. Composition activity relationships — CARE. Part II. Indirect and direct mutagens multivariate dose-response regression. Chemometrics and Intelligent Laboratory Systems, 6: 127–142. Part I of this work described the cluster and principal component analysis of the two sample sets and the three types of data available. To investigate the relationship between concentration of chemical compounds and mutagenicity, a univariate correlation analysis using Kendall rank correlation has been performed. This shows that there is a systematic difference between the element and polycyclic aromatic hydrocarbon (PAH) variables when bacterial strain Salmonella TA98 is used with and without metabolic activation. Extending the mutagenicity test data to include bacterial strain TA100 reveals that with TA100 the difference between elements and PAH is not observed. Correlation analysis of object scores from principal component analysis (PCA) of elements and PAH versus average mutagenicity and object scores from PCA of mutagenicity dose-response data is used to interpret the connection between source contrasts and mutagenicity. Partial least squares regression is used to obtain predictive models and to interpret the connection between chemistry and dose-response mutagenicity. The analysis shows that there are substantial differences between the two sets of samples with respect to how well the biological activity may be predicted.

10.
Static time-of-flight secondary ion mass spectrometry (TOF-SIMS) was performed on monolayers on scribed silicon (Si(scr)) derived from 1-alkenes, 1-alkynes, 1-haloalkanes, aldehydes, and acid chlorides. To rapidly determine the variation in the data without introducing user bias, a multivariate analysis was performed. First, principal components analysis (PCA) was done on data obtained from silicon scribed with homologous series of aldehydes and acid chlorides. For this study, the positive ion spectra, the negative ion spectra, and the concatenated (linked) positive and negative ion spectra were preprocessed by normalization, mean centering, and autoscaling. The mean centered data consistently showed the best correlations between the scores on PC1 and the number of carbon atoms in the adsorbate. These correlations were not as strong for the normalized and autoscaled data. After reviewing these methods, it was concluded that mean centering is the best preprocessing method for TOF-SIMS spectra of monolayers on Si(scr). PCA of all of the positive ion spectra revealed a good correlation between the number of carbon atoms in all of the adsorbates and the scores on PC1. PCA of all of the negative ion spectra and the concatenated positive and negative ion spectra showed a correlation based on the number of carbon atoms in the adsorbate and the class of the adsorbate. These results imply that the positive ion spectra are most sensitive to monolayer thickness, while the negative ion spectra are sensitive to the nature of the substrate-monolayer interface and the monolayer thickness. Loadings show an inverse relationship between (inorganic) fragments that are expected from the substrate and (organic) fragments expected from the monolayer. Multivariate peak intensity ratios were derived. It is also suggested that PCA can be used to detect outlier surfaces. Partial least squares showed a strong correlation between the number of carbon atoms in the adsorbate and the number it predicted.

11.
Time-of-flight secondary ion mass spectrometry (TOF-SIMS) enables chemically imaging the distributions of various lipid species in model membranes. However, discriminating the TOF-SIMS data of structurally similar lipids is very difficult because the high intensity, low mass fragment ions needed to achieve submicrometer lateral resolution are common to multiple lipid species. Here, we demonstrate that principal component analysis (PCA) can discriminate the TOF-SIMS spectra of four unlabeled saturated phosphatidylcholine species, 1,2-dilauroyl-sn-glycero-3-phosphocholine (DLPC), 1,2-dimyristoyl-sn-glycero-3-phosphocholine (DMPC), 1,2-dipalmitoyl-sn-glycero-3-phosphocholine (DPPC), and 1,2-distearoyl-sn-glycero-3-phosphocholine (DSPC) according to variations in the intensities of their low mass fragment ions (m/z ≤ 200). PCA of TOF-SIMS images of phase-separated DSPC/DLPC and DPPC/DLPC membranes enabled visualizing the distributions of each phosphatidylcholine species with higher contrast and specificity than that of individual TOF-SIMS ion images. Comparison of the principal component (PC) scores images to atomic force microscopy (AFM) images acquired at the same membrane location before TOF-SIMS analysis confirmed that the PC scores images reveal the phase-separated membrane domains. The lipid composition within these domains was identified by projection of their TOF-SIMS spectra onto PC models developed using pure lipid standards. This approach may enable the identification and chemical imaging of structurally similar lipid species within more complex membranes.  相似文献   

12.
It is well known that no single experimental condition can be found under which the extraction of all the volatile compounds in a gas chromatographic analysis of roasted coffee beans by headspace-solid phase microextraction (HS-SPME) is maximized. This is due to the large number of peaks recorded. In this work, the scores vector of the first principal component obtained from PCA on chromatographic peak areas was used as the response to find the optimal conditions for simultaneous optimization of coffee volatiles extraction via response surface methodology (RSM). This strategy consists in compressing several highly correlated peak areas into a single response variable for a central composite design (CCD). RSM was used to identify an optimal factor combination that reflects a compromise between the partially conflicting behavior of the volatiles groups. This simultaneous optimization approach was compared with the desirability function method. The versatility of the PCA-RSM methodology allows it to be used in other chromatographic applications, resulting in an interpretable procedure to solve new analytical problems.  相似文献   

13.
Individuals are thought to have their own distinctive scent, analogous to a signature or fingerprint. To test this idea, we collected axillary sweat, urine and saliva from 197 adults from a village in the Austrian Alps, taking five sweat samples per subject over 10 weeks using a novel skin sampling device. We analysed samples using stir bar sorptive extraction in connection with thermal desorption gas chromatography-mass spectrometry (GC-MS), and then we statistically analysed the chromatographic profiles using pattern recognition techniques. We found more volatile compounds in axillary sweat than in urine or saliva, and among these we found 373 peaks that were consistent over time (detected in four out of five samples per individual). Among these candidate compounds, we found individually distinct and reproducible GC-MS fingerprints, a reproducible difference between the sexes, and we identified the chemical structures of 44 individual and 12 gender-specific volatile compounds. These individual compounds provide candidates for major histocompatibility complex and other genetically determined odours. This is the first study on human axillary odour to sample a large number of subjects, and our findings are relevant to understanding the chemical nature of human odour, and efforts to design electronic sensors (e-nose) for biometric fingerprinting and disease diagnoses.

14.
A new 'particle-velocity-field smoothing' (PVFS) algorithm is proposed to decorrelate up to three coherent sources for azimuth-elevation direction-finding using vector-hydrophones in the underwater acoustic medium. The coherency among the incident sources would reduce the data correlation matrix's rank to below the number of incident sources, but this proposed algorithm restores the rank by summing the individual particle-velocity-field component's data correlation matrices. This scheme uses identically oriented underwater-acoustic vector-hydrophones, whose locations may be arbitrary. Each underwater-acoustic vector-hydrophone consists of two or three collocated but orthogonally oriented velocity-hydrophones, and a pressure-hydrophone. In contrast to the customary 'spatial smoothing' technique, this proposed PVFS algorithm does not reduce the array's spatial aperture and does not require any 'virtual array interpolation' even for an irregularly shaped array grid. Monte Carlo simulations verify this proposed scheme's efficacy.

15.
Principal component regression (PCR) is unique in that the principal component analysis (PCA) step is explicitly involved in the central part of the method. In the present paper, the PCA part is examined in order to study the influence of noise in spectra on PCR by spectral simulation. It has been suggested, as a result, that PCR calibration would have a large inaccuracy when the estimated number of basis factors analyzed by the eigenvalue method is less than that by cross-validation, which was studied by use of synthesized spectra. This instability is because the minute noise is largely enhanced by the PCA calculation via the normalization of loadings. At the same time, the noise enhancement by PCA has also been characterized to influence the estimation of basis factors.  相似文献   

16.
Principal component analysis (PCA) is widely used to reconstruct the spectral reflectance of surface colors. However, the estimated spectral accuracy is low when using only one set of three principal components for three-channel color-acquisition devices. In this study, the spectral space was first divided into 11 subgroups, and the principal components were calculated for individual subgroups. Then the principal components were further extended from three to nine through the residual spectral error of the reflectance in each subgroup. For each target sample, the extended principal components of the corresponding subgroup samples were used in the common PCA method to reconstruct the spectral reflectance. The results show that this proposed method is quite accurate and outperforms other related methods.  相似文献   

17.
Principal component analysis (PCA) is a statistical method used to find combinations of variables or factors that describe the most important trends in the data. PCA has been combined with time-of-flight secondary ion mass spectrometry (TOF-SIMS) data to extract new information and find relations between species contained in complex systems. Monolayers of dipalmitoylphosphatidylcholine alone and mixed with palmitoyloleoylphosphatidylglycerol prepared using the Langmuir-Blodgett technique are discussed. PCA software provides image scores and corresponding loadings for each significant principal component. Image plots of the scores show the spatial distribution and intensity of the species defined by the loading plots (mass spectral features). The intensity and resolution of the image scores can result in substantial improvement over that of the regular TOF-SIMS images especially when static conditions are used for small analysis areas. Also, some of the effects of topography and matrix in the images can be removed, allowing for a better presentation of chemical variations.  相似文献   

18.
Keratan sulfate (KS) is a glycosaminoglycan consisting of repeating disaccharide units composed of alternating residues of d-galactose and N-acetyl-d-glucosamine linked beta-(1-4) and beta-(1-3), respectively. In this study, electrospray ionization tandem mass spectrometry (ESI-MS/MS) was employed to identify keratan sulfate oligosaccharides. Two nonsulfated disaccharide isomers and two monosulfated disaccharide isomers were distinguished through MS/MS. In MS(1) spectra of multiply sulfated KS oligosaccharides, the charge state of the most abundant molecular ion equals the number of sulfates. Subsequent MS(2) and MS(3) spectra of mono-, di-, tri-, and tetrasulfated KS oligosaccharides and sialylated tetrasaccharides reveal diagnostic ions that can be used as fingerprint maps to identify unknown KS oligosaccharides. Based on the pattern of fragment ions, the compositions of an oligosaccharide mixture from shark cartilage KS and of two enzyme digests of bovine corneal KS were determined directly, without prior isolation of individual oligosaccharides by HPLC or other methods.  相似文献   

19.
The application of eigenstructure tracking analysis (ETA) and SIMPLISMA for the investigation of the protonation equilibria of a monomer and several polynucleotides is proposed. Both approaches have been applied in the pH and in the wavelength direction to the spectroscopic data matrices obtained in the study of each equilibrium. ETA provides information about the number of components in the system, their evolution along the titration, and the local rank. SIMPLISMA is also used to obtain the number of compounds in the system, the concentration profiles, and the unit spectrum of each compound. The results obtained with SIMPLISMA and those obtained previously with the alternating least-squares approach are compared.

20.
范雪莉  冯海泓  原猛 《声学技术》2013,32(3):222-227
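Principal component analysis is a commonly used feature-selection method in acoustic scene classification. To address the limitations of principal component analysis, a mutual-information-based principal component analysis method is proposed. The method introduces class information: the sum of the mutual-information matrices between features under the different acoustic scene conditions replaces the covariance matrix of conventional principal component analysis, and its eigenvectors and eigenvalues are computed. The eigenvectors give the transformation coefficients from the original feature space to the new principal-component space, and the eigenvalues are used to compute the cumulative contribution rate of the principal components and to determine the number of principal components. Acoustic scene classification experiments show that the method reduces dimensionality more effectively than conventional principal component analysis and, combined with a neural network classifier, yields higher classification accuracy.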
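One possible reading of the procedure described in this abstract is sketched below. It is not the authors' implementation. The histogram-based mutual-information estimator, the bin count, the 90% contribution threshold, the clipping of negative eigenvalues, and all function names are assumptions made for illustration, since the abstract specifies none of them.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of the mutual information between two features (assumed estimator)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)      # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)      # marginal of y, shape (1, bins)
    nz = pxy > 0                             # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def mi_matrix(features, bins=8):
    """Symmetric matrix of pairwise mutual information between feature columns."""
    d = features.shape[1]
    m = np.zeros((d, d))
    for i in range(d):
        for j in range(i, d):
            m[i, j] = m[j, i] = mutual_information(features[:, i], features[:, j], bins)
    return m

def mi_pca(features, labels, threshold=0.9, bins=8):
    """Eigendecompose the sum of per-class mutual-information matrices."""
    summed = sum(mi_matrix(features[labels == c], bins) for c in np.unique(labels))
    eigvals, eigvecs = np.linalg.eigh(summed)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Assumption: the summed matrix need not be positive semidefinite,
    # so negative eigenvalues are clipped before computing contributions.
    eigvals = np.clip(eigvals, 0.0, None)
    contribution = np.cumsum(eigvals) / np.sum(eigvals)   # cumulative contribution rate
    k = int(np.searchsorted(contribution, threshold)) + 1
    return eigvecs[:, :k], contribution

# Synthetic two-class example with four features (assumption).
rng = np.random.default_rng(2)
labels = np.repeat([0, 1], 300)
features = rng.normal(size=(600, 4))
features[:, 1] += features[:, 0]             # make two features dependent
projection, contribution = mi_pca(features, labels)
reduced = features @ projection              # features in the reduced space
print(projection.shape, contribution)
```

The columns of `projection` play the role of the transformation coefficients described in the abstract, and `contribution` is the cumulative contribution rate used to choose the number of retained components.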
