1.
Time series analysis has long been an important research field because time series arise in many applications. In the past, many approaches based on regression, neural networks and other mathematical models were proposed to analyze time series. In this paper, we attempt to use data mining techniques to analyze time series. Many previous studies on data mining have focused on handling binary-valued data. Time series data, however, are usually quantitative. We thus extend our previous fuzzy mining approach to handle time-series data and find linguistic association rules. The proposed approach first uses a sliding window to generate continuous subsequences from a given time series and then analyzes the fuzzy itemsets from these subsequences. Appropriate post-processing is then performed to remove redundant patterns. Experiments are also conducted to show the performance of the proposed mining algorithm. Since the final results are represented by linguistic rules, they are friendlier to humans than quantitative representations.
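The first two steps described in this abstract (sliding-window subsequences, then fuzzification of quantitative values) can be sketched as follows. This is a minimal illustration only: the function names and the triangular membership shape are assumptions, not the authors' implementation, and the abstract does not state which membership functions the paper uses.

```python
def sliding_windows(series, window_size):
    """Return every contiguous subsequence of length window_size."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    # A series shorter than the window yields an empty list.
    return [series[i:i + window_size]
            for i in range(len(series) - window_size + 1)]


def triangular_membership(x, left, peak, right):
    """Degree (0..1) to which x belongs to a triangular fuzzy set.

    Assumes left < peak < right; the shape is a hypothetical choice.
    """
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)
```

Each value in a window would then be mapped to linguistic terms such as "low" or "high" by evaluating one membership function per term, and itemset mining would run on those fuzzy values.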
2.
Recent trends in collecting huge and diverse datasets have created a great challenge for data analysis. One characteristic of these gigantic datasets is that they often contain significant amounts of redundancy. Using very large multi-dimensional data results in more noise, redundant data, and the possibility of unconnected data entities. To efficiently manipulate data represented in a high-dimensional space and to address the impact of redundant dimensions on the final results, we propose a new technique for dimensionality reduction using Copulas and the LU-decomposition (Forward Substitution) method. The proposed method compares favorably with existing approaches on real-world datasets (Diabetes, Waveform, two versions of Human Activity Recognition based on Smartphone, and Thyroid) taken from a machine learning repository, in terms of dimensionality reduction and efficiency, as assessed with statistical and classification measures.
3.
This paper explores dimensionality reduction (DR) approaches for visualizing high dimensional data in chemical processes. Visualization provides powerful insight and process understanding in the industrial context, and accelerates process troubleshooting. A diverse array of existing, easy-to-use DR methods is evaluated in three case studies on large-scale industrial manufacturing plants. Supervised and unsupervised cases are presented with the objective of solving typical industrial problems related to unplanned events, plant performance improvement, and quality underperformance troubleshooting. For the unsupervised case, the evaluation aims to identify approaches that provide insight beyond that of PCA (Principal Component Analysis), and also examines quality metrics of the reduced (latent) space which characterize the degree of trust in the DR. UMAP (Uniform Manifold Approximation and Projection) outperforms the other techniques and brings new insights compared with them. For the supervised case, UMAP is combined with traditional variable selection methods, such as VIP (Variable Influence on Projection) weights from PLS-DA (Partial Least Squares Discriminant Analysis), in order to improve latent space visualization by increasing separation between classes.
4.
Dimensionality reduction of clustered data sets (total citations: 1; self-citations: 0; citations by others: 1)
We present a novel probabilistic latent variable model to perform linear dimensionality reduction on data sets which contain clusters. We prove that the maximum likelihood solution of the model is an unsupervised generalisation of linear discriminant analysis. This provides a completely new approach to one of the most established and widely used classification algorithms. The performance of the model is then demonstrated on a number of real and artificial data sets.
5.
Mario Ernesto Jijón Palma, Alvaro Muriel Lima Machado, Jorge Antonio Silva Centeno 《International journal of remote sensing》2019,40(9):3401-3420
Binary encoding is an approach that aims at summarizing the information contained in various spectral bands into a single image that stores the meaningful information of the bands. This paper introduces a feature extraction approach that reduces the dimensionality of hyperspectral data with binary encoding for classification purposes. Different options to reduce the radiometric information of the pixels are introduced, such as using a single threshold or multiple thresholds. After the dimensionality reduction, the separation of the spectral classes was analysed and the thematic classification of the reduced data was performed. In order to evaluate the performance of the proposed approach, experiments on an AVIRIS (Airborne Visible/Infrared Imaging Spectrometer) image, a ROSIS (Reflection Optics System Imaging Spectrometer) hyperspectral image and a HYDICE (Hyperspectral Digital Imagery Collection Experiment) hyperspectral image are presented. In the experiments, neighbouring spectral bands are grouped and coded and the results of the classification are compared. The results show that binary encoding based on three thresholds per spectral region is more efficient than encoding with one threshold. The thematic mapping of the hyperspectral data with reduced dimension confirms the competitiveness of the binary encoding method compared with other dimension reduction methods, such as Principal Component Analysis (PCA), Principal Component Analysis – Fisher’s Linear Discriminant Analysis (PCA-LDA), Discriminant Analysis Feature Extraction (DAFE) and Non-parametric Weighted Feature Extraction (NWFE). In this context, the present methodology appears promising, because it reduces the computational complexity and improves performance.
6.
P. Delicado 《Computational statistics & data analysis》2011,55(1):401-420
Functional Data Analysis (FDA) deals with samples where a whole function is observed for each individual. A relevant case of FDA is when the observed functions are density functions. Among the particular characteristics of density functions, this work makes the most of the fact that they are an example of infinite-dimensional compositional data (parts of a whole that carry only relative information). Several dimensionality reduction methods for this particular type of data are compared: functional principal components analysis with or without a previous data transformation, and multidimensional scaling for different inter-density distances, one of them taking into account the compositional nature of density functions. The emphasis is on the steps before and after the application of a particular dimensionality reduction method: care must be taken in choosing the right density function transformation and/or the appropriate distance between densities before performing dimensionality reduction; subsequently, the graphical representation of dimensionality reduction results must take into account that the observed objects are density functions. The different methods are applied to artificial and real data (population pyramids for 223 countries in year 2000). As a global conclusion, the use of multidimensional scaling based on compositional distance is recommended.
7.
8.
9.
《International journal of remote sensing》2012,33(7):2457-2476
Cotton is the most important fibre culture in the world. In Brazil, cotton cultivation is concentrated in the Cerrado biome, the Brazilian savanna, and is one of the most important commodities in the country. As cotton is an annual crop, frequent updating of the spatial distribution data of cotton fields is extremely important for crop monitoring systems. In order to provide fast and accurate information for crop monitoring, time series of remote-sensing data have been used in the development of several applications in agriculture, since the high temporal resolution of some orbital sensors allows monitoring targets with high spectral-temporal variations in the land surface. However, there are still some challenges in systematizing the processing of the large amount of data made available by long time series of remote-sensing imagery. Thus, this study contributes to the construction of models to identify and separate specific crop types with similar spectral behaviour to other crops practised in the same period. The objective of this study was to develop a systematic methodology based on data mining of time series of vegetation indices (VI) to map cotton fields at the regional scale. Field reference data and time series of NDVI and EVI images, obtained from MODIS sensor products during four cropping seasons (from 2012–2013 to 2015–2016), were used to construct mapping models based on decision tree algorithms. Phenological metrics were calculated from the VI time series and used to build classification rules for mapping cotton fields. Our results demonstrate that the proposed method to map cotton fields achieves high accuracy when field data and visual interpretation of NDVI temporal profiles were used for validation (accuracy higher than 95% and 93%, respectively). Comparisons with the official statistics indicated an optimal fit, with linear correlation (r) and coefficient of determination (R²) above 0.93. Therefore, the proposed method was effective in distinguishing cotton fields from other crop types with similar spectral behaviour. In addition, this method can also be applied to other cotton-producing regions and other production seasons, by reusing the models generated through machine learning approaches.
10.
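To reduce the dimensionality of sparse data effectively, a local dimensionality reduction method for sparse data based on tangent space discrimination is proposed. The idea is to expand the local neighborhoods so as to increase the overlapping information among sample points, so that sufficient information is available during sparse dimensionality reduction to reach an accurate low-dimensional embedding. Tangent space discrimination is used to select which sample points in the expanded local region to retain, discarding points whose tangent direction changes greatly, which yields a better dimensionality reduction. Experimental results show that the new method obtains good embedding results on artificially generated data sets and achieves the expected visual classification results in face recognition and image retrieval.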
11.
12.
13.
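Sensor networks deployed in a water supply network acquire time series of multiple water quality parameters in real time; when the network is contaminated, detecting water quality anomalies efficiently and accurately is an important problem. A multivariate water quality parameter temporal anomaly event detection algorithm (M-TAEDA) is proposed. It uses a BP (back-propagation) model to analyze the time series of multivariate water quality parameters and identify possible outliers; applies Bayesian sequential analysis to update the event probability of each parameter independently and predict the anomaly probability detected by a single sensor node; and fuses the univariate event probabilities into a unified multivariate event probability to decide on anomalous events. Experimental results show that using the BP model to simulate multivariate water quality parameters for prediction reaches 90% precision; compared with the univariate parameter temporal anomaly event detection algorithm (S-TAEDA), M-TAEDA improves the anomaly detection rate by about 40% and reduces the false alarm rate by about 45%.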
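The Bayesian sequential update of a single parameter's event probability can be sketched as below. This is an illustration of Bayes' rule applied step by step, not the paper's implementation: the function names, the prior, and the true-positive and false-positive rates are hypothetical values chosen for the example, and the outlier flags are assumed to come from a separate prediction model.

```python
def update_event_probability(prior, is_outlier, tp_rate=0.9, fp_rate=0.1):
    """One Bayes-rule update of P(event) after a single observation.

    tp_rate: assumed P(outlier flagged | event); fp_rate: assumed
    P(outlier flagged | no event). Both are hypothetical settings.
    """
    if not (0.0 < tp_rate < 1.0 and 0.0 < fp_rate < 1.0):
        raise ValueError("rates must lie strictly between 0 and 1")
    if is_outlier:
        like_event, like_normal = tp_rate, fp_rate
    else:
        like_event, like_normal = 1.0 - tp_rate, 1.0 - fp_rate
    numerator = like_event * prior
    return numerator / (numerator + like_normal * (1.0 - prior))


def run_sequence(outlier_flags, prior=0.01, tp_rate=0.9, fp_rate=0.1):
    """Apply the update to each flag in turn; return the probability history."""
    probability = prior
    history = []
    for flag in outlier_flags:
        probability = update_event_probability(probability, flag, tp_rate, fp_rate)
        history.append(probability)
    return history
```

Consecutive outlier flags drive the probability up quickly, while a normal reading pulls it back down, which is why a sequential update tends to be more robust to isolated false alarms than thresholding each reading on its own.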
14.
Flora S. Tsai 《Expert systems with applications》2012,39(5):4965-4971
This paper describes the usage of dimensionality reduction techniques for computer facial animation. Techniques such as Principal Components Analysis (PCA), Expectation-Maximization (EM) algorithm for PCA, Multidimensional Scaling (MDS), and Locally Linear Embedding (LLE) are compared for the purpose of facial animation of different emotions. The experimental results on our facial animation data demonstrate the usefulness of dimensionality reduction techniques for both space and time reduction. In particular, the EMPCA algorithm performed especially well on our dataset, with negligible error of only 1-2%.
15.
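Research on pattern mining in multivariate time series (total citations: 4; self-citations: 0; citations by others: 4)
Multivariate time series data sets exist in many fields; because the observed variables are correlated with one another, they often need to be analyzed jointly. A multivariate time series pattern mining method based on time series similarity is used to find similar multivariate time series in historical data. The multivariate data set is averaged piecewise into consecutive matrices, and a method based on principal component analysis and singular value decomposition is used to compare the matrices for similarity; finally, adjacent segments are merged to form higher-level temporal segments, widening the range of pattern matching. The method was implemented on earthquake precursor data.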
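The piecewise-averaging step can be illustrated with a short sketch. The function name and the choice to drop a trailing incomplete segment are assumptions made for this example; the abstract does not specify how segment boundaries are handled, and the PCA/SVD similarity comparison is not shown.

```python
def piecewise_average(rows, segment_length):
    """Average a multivariate series segment by segment.

    rows: list of observations, each a list with one value per variable.
    Returns one mean vector per full segment; a trailing partial
    segment is dropped (an assumption of this sketch).
    """
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    segments = []
    for start in range(0, len(rows) - segment_length + 1, segment_length):
        block = rows[start:start + segment_length]
        # zip(*block) regroups the block by variable (column).
        segments.append([sum(column) / segment_length for column in zip(*block)])
    return segments
```

Stacking several consecutive mean vectors gives the smaller matrices on which a similarity comparison would then operate.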
16.
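Existing manifold-learning-based dimensionality reduction methods are sensitive to the choice of local neighborhood size, and the data they produce in low dimensions are not well separable. To address this, a separability-oriented dimensionality reduction method with adaptive neighborhood selection is proposed. The method adaptively selects the neighborhood size of each sample point by estimating the intrinsic dimension and local tangent direction of the data; at the same time, it uses clustering information when mapping the data to bring similar sample points together, ensuring that the reduced data are well separable and yielding a better dimensionality reduction. Experimental results show that the new method obtains good embedding results on artificially generated data sets and achieves the expected results in visual classification of faces and image retrieval.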
17.
Kun-Huang Huarng, Tiffany Hui-Kuang Yu, Yu Wei Hsu 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2007,37(4):836-846
Fuzzy time-series models have been widely applied due to their ability to handle nonlinear data directly and because no rigid assumptions for the data are needed. In addition, many such models have been shown to provide better forecasting results than their conventional counterparts. However, since most of these models require complicated matrix computations, this paper proposes the adoption of a multivariate heuristic function that can be integrated with univariate fuzzy time-series models into multivariate models. Such a multivariate heuristic function can easily be extended and integrated with various univariate models. Furthermore, the integrated model can handle multiple variables to improve forecasting results and, at the same time, avoid complicated computations due to the inclusion of multiple variables.
18.
Automatic acoustic-based vehicle detection is a common task in security and surveillance systems. Usually, a recording device is placed in a designated area and a hardware/software system processes the sounds intercepted by this recording device to identify vehicles as they pass by. An algorithm suitable for online automatic detection of vehicles based on their acoustic recordings is proposed. The scheme uses dimensionality reduction methodologies such as random projections instead of traditional signal processing methods to extract features. It uncovers characteristic features of the recorded sounds without any assumptions about the structure of the signal. The set of features is classified by the application of PCA. The microphone is open all the time and the algorithm filters out many background noises such as wind, steps, speech, airplanes, etc. The introduced algorithm is generic and can be applied to various signal types for solving different detection and classification problems.
19.
We define the problem of bounded similarity querying in time-series databases, which generalizes earlier notions of similarity querying. Given a (sub)sequence S, a query sequence Q, lower and upper bounds on shifting and scaling parameters, and a tolerance, S is considered boundedly similar to Q if S can be shifted and scaled within the specified bounds to produce a modified sequence S′ whose distance from Q is within the tolerance. We use similarity transformation to formalize the notion of bounded similarity. We then describe a framework that supports the resulting set of queries; it is based on a fingerprint method that normalizes the data and saves the normalization parameters. For off-line data, we provide an indexing method with a single index structure and search technique for handling all the special cases of bounded similarity querying. Experimental investigations find the performance of our method to be competitive with earlier, less general approaches.
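The definition of bounded similarity can be made concrete with a small sketch. It is a simplified illustration under stated assumptions, not the paper's framework: the function names are hypothetical, Euclidean distance is assumed, and only one candidate transformation is tested (the scale and shift that match the mean and standard deviation of S to those of Q), so the check is sufficient but not exhaustive. No indexing is shown.

```python
import math


def fingerprint(seq):
    """Normalize seq to zero mean and unit standard deviation.

    Returns (normalized sequence, mean, std) so the normalization
    parameters are kept alongside the normalized data.
    """
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        raise ValueError("a constant sequence has no scale")
    return [(x - mean) / std for x in seq], mean, std


def boundedly_similar(s, q, scale_bounds, shift_bounds, tolerance):
    """Test one candidate shift/scale of s against q within the bounds."""
    if len(s) != len(q):
        raise ValueError("sequences must have equal length")
    _, mean_s, std_s = fingerprint(s)
    _, mean_q, std_q = fingerprint(q)
    scale = std_q / std_s               # candidate scaling factor
    shift = mean_q - scale * mean_s     # candidate shift
    if not scale_bounds[0] <= scale <= scale_bounds[1]:
        return False
    if not shift_bounds[0] <= shift <= shift_bounds[1]:
        return False
    transformed = [scale * x + shift for x in s]
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(transformed, q)))
    return distance <= tolerance
```

For example, `[1, 2, 3]` maps onto `[12, 14, 16]` with a scale of about 2 and a shift of about 10, so the pair passes when the bounds admit those values and fails when the scale is capped below 2.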