Similar articles
20 similar articles found.
1.
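A new outlier mining algorithm for time-series data   Total citations: 11, self-citations: 0, citations by others: 11
Outlier mining is an important part of data mining. This paper studies outlier mining methods for time-series data. The series is first transformed from the time domain to the frequency domain by the discrete Fourier transform, which maps the time-series data to points in a multidimensional space. On this basis, a new distance-based outlier mining algorithm is proposed. Simulation experiments on electric load time-series data from a steel enterprise demonstrate the effectiveness of the algorithm.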
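The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: the series is cut into fixed-length windows, each window is mapped to the magnitudes of its first few DFT coefficients, and a window is flagged when too few other windows lie within a given distance of it. The function names, the windowing scheme, and the `radius` and `min_fraction` parameters are all hypothetical and are not taken from the paper.

```python
import cmath
import math


def dft_features(window, n_coeffs):
    """Magnitudes (scaled by window length) of the first n_coeffs DFT coefficients."""
    n = len(window)
    feats = []
    for k in range(n_coeffs):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(window))
        feats.append(abs(coeff) / n)
    return feats


def distance_outliers(points, radius, min_fraction):
    """Flag points with fewer than min_fraction of the other points within radius.

    This is a generic distance-based outlier rule, assumed here for illustration.
    """
    flags = []
    for i, p in enumerate(points):
        near = sum(1 for j, q in enumerate(points)
                   if j != i and math.dist(p, q) <= radius)
        flags.append(near < min_fraction * (len(points) - 1))
    return flags


def mine_outlier_windows(series, window_len, n_coeffs, radius, min_fraction):
    """Split the series into non-overlapping windows and flag outlying windows."""
    windows = [series[i:i + window_len]
               for i in range(0, len(series) - window_len + 1, window_len)]
    points = [dft_features(w, n_coeffs) for w in windows]
    return distance_outliers(points, radius, min_fraction)


# Synthetic load-like data: six periodic windows, the fourth with a spike.
base = [10 + 2 * math.sin(2 * math.pi * t / 8) for t in range(8)]
spiked = list(base)
spiked[3] += 16
series = base * 3 + spiked + base * 2
print(mine_outlier_windows(series, 8, 3, 1.0, 0.5))
```

Using only a few low-frequency coefficients keeps the feature space small; how many to keep and how to choose the distance threshold would depend on the data.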

2.
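Outlier mining of time-series data based on an outlier index   Total citations: 12, self-citations: 0, citations by others: 12
Outlier mining is an important part of data mining. This paper studies outlier mining methods for time-series data, introduces the concept of an outlier index, and on this basis designs an outlier mining algorithm for time-series data. The algorithm is applied to electric load time-series data from a steel enterprise, and the results demonstrate its effectiveness.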

3.
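Research on outlier mining based on a random projection algorithm   Total citations: 1, self-citations: 0, citations by others: 1
Outlier mining for d-dimensional point sets is currently one of the research hotspots in data mining. Outlier mining based on distance or nearest-neighbor concepts performs poorly on high-dimensional data. In view of this, the angle-based outlier factor is applied to high-dimensional outlier mining, and a new outlier mining scheme based on a random projection algorithm is proposed; it needs only a near-linear-time method to estimate the angle-based outlier factor of all data points. The method can be accelerated in a parallel environment. The approximation quality is analyzed theoretically to guarantee the reliability of the algorithm. Experiments on synthetic and real data sets show that the method is efficient and scalable on very high-dimensional data sets.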

4.
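To address the poor performance of existing autoregressive (AR) models on nonstationary data, a time-series prediction method based on the time-varying autoregressive (TVAR) model is proposed. For the low-pressure rotor speed signal of a certain type of domestically produced aircraft engine, the TVAR model is used for both point prediction and interval prediction, and its point predictions are compared with those of the AR model. The results show that the TVAR model captures the trend of nonstationary data well. At a given confidence level, the TVAR prediction interval contains the true data, so the TVAR model gives better results in time-series prediction.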

5.
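Engine condition monitoring analysis with grey theory and time-series models   Total citations: 1, self-citations: 0, citations by others: 1
To address the problems of current methods for predicting wear-particle counts in engine wear condition monitoring, a prediction method combining grey theory with a time-series model is proposed, and a combined grey time-series model is built. The grey GM(1,1) model captures the macroscopic trend of the data, and a time-series AR(p) model is fitted to the residual series to capture the microscopic variation. Tests and comparisons on measured data show that the combined model gives better forecasts in engine condition monitoring.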
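The combination described above can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: it uses the textbook GM(1,1) least-squares fit for the trend and, as a simplifying assumption, an AR(1) model without intercept for the residuals instead of a general AR(p). All function names are hypothetical, and the fitted development coefficient `a` is assumed to be nonzero.

```python
import math


def fit_gm11(x0):
    """Fit GM(1,1), x0(k) + a*z1(k) = b, by least squares; returns (a, b)."""
    x1, total = [], 0.0
    for v in x0:                      # accumulated generating series
        total += v
        x1.append(total)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]  # background values
    y = x0[1:]
    n = len(z)
    mz, my = sum(z) / n, sum(y) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    szy = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
    slope = szy / szz                 # y = slope*z + b, with slope = -a
    return -slope, my - slope * mz


def gm11_value(x0_first, a, b, k):
    """GM(1,1) fitted or forecast value of the original series at 0-based index k."""
    if k == 0:
        return x0_first
    c = x0_first - b / a              # assumes a != 0
    return c * (math.exp(-a * k) - math.exp(-a * (k - 1)))


def fit_ar1(residuals):
    """Least-squares AR(1) coefficient without intercept; 0.0 if degenerate."""
    den = sum(e * e for e in residuals[:-1])
    if den == 0.0:
        return 0.0
    return sum(e0 * e1 for e0, e1 in zip(residuals, residuals[1:])) / den


def combined_forecast(x0, steps):
    """GM(1,1) trend forecast plus an AR(1) forecast of the GM(1,1) residuals."""
    a, b = fit_gm11(x0)
    fitted = [gm11_value(x0[0], a, b, k) for k in range(len(x0))]
    residuals = [x - f for x, f in zip(x0, fitted)]
    phi = fit_ar1(residuals)
    forecasts, e = [], residuals[-1]
    for h in range(1, steps + 1):
        e = phi * e                   # propagate the residual one step ahead
        forecasts.append(gm11_value(x0[0], a, b, len(x0) - 1 + h) + e)
    return forecasts


# Synthetic, roughly exponential data (not from the paper).
history = [2 * 1.1 ** k for k in range(8)]
print(combined_forecast(history, 2))
```

The trend model carries the long-term growth, while the residual model only corrects the short-term deviation, which is why the two are fitted in sequence rather than jointly.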

6.
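The basic theory and classical algorithms of outlier mining are described, and a rule-based outlier mining algorithm with additional constraints is proposed. Based on the characteristics of data from the past several decades, a model that uses outlier mining to predict plant diseases and insect pests is proposed. Experiments on real pest and disease meteorological data show that the predictions are reasonable and that prediction efficiency is improved.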

7.
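The basic theory and classical algorithms of outlier mining are described, and a rule-based outlier mining algorithm with additional constraints is proposed. Based on the characteristics of data from the past several decades, a model that uses outlier mining to predict plant diseases and insect pests is proposed. Experiments on real pest and disease meteorological data show that the predictions are reasonable and that prediction efficiency is improved.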

8.
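Outlier mining aims to find the relatively sparse and isolated anomalous data patterns hidden in massive data, but traditional outlier mining methods are strongly affected by human factors. A new outlier mining algorithm is presented that introduces an outlier measure factor based on information entropy. The algorithm first uses information entropy to compute the outlier measure factor of each data object, then uses this factor to measure each object's degree of outlyingness and thereby detect outliers. This effectively removes the influence of subjective human factors on outlier detection and explains the meaning of the outliers well. Finally, experiments on UCI and stellar spectrum data verify the feasibility and effectiveness of the algorithm.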

9.
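Following load forecasting theory, electric load data are forecast on the basis of historical data. Because the data collected during actual operation contain errors, the resulting load forecast curve is strongly jagged. A new outlier mining method is proposed: sharp points are located by computing the angle between two straight lines, and the outliers are the active-power load values at those sharp points; these outliers are then treated with curve smoothing. Experiments show that processing the load forecast curve with the proposed method clearly improves the forecast.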

10.
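Outlier detection is an important part of data mining and provides new methods for analyzing massive, complex, and noisy data. Several major classes of outlier mining methods are analyzed and evaluated, and on this basis an outlier detection algorithm based on genetic clustering is proposed. The algorithm combines the global search ability of genetic algorithms with the fast local convergence of the K-means method and achieves good results. Experiments verify that the algorithm detects the outliers in a data set well while also clustering the data set, so it is of good practical value.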

11.
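Accurate prediction of sea surface temperature (SST) is essential for marine fishery production and for forecasting marine dynamic environment information. Traditional numerical forecasting methods are computationally expensive and slow, while most existing data-driven SST prediction methods target a single observation point and are not suited to predicting the SST of a region made up of many observation points; the accuracy of existing regional SST prediction methods also still needs improvement. This paper therefore proposes a regional SST prediction method that combines XGBoost with PredRNN++ (XGBoost-PredRNN++). The method first converts the SST data into grayscale images and uses an XGBoost model to extract the temporal features of each point. A CNN then fuses the temporal features into the original SST data while extracting the spatial dependencies among the SST data. Finally, the PredRNN++ time-series prediction model extracts the spatiotemporal correlations across the whole SST sequence, achieving high-accuracy prediction of regional SST. A series of experiments shows that the proposed method has high prediction accuracy and efficiency and clearly outperforms existing prediction methods.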

12.
In this article, annealing robust radial basis function networks (ARRBFNs), which consist of a radial basis function network and a support vector regression (SVR), and an annealing robust learning algorithm (ARLA) are proposed for the prediction of chaotic time series with outliers. To overcome the initial structural problems of the proposed neural networks, the SVR is used to determine the number of hidden nodes, the initial parameters of the kernel, and the initial weights for the proposed ARRBFNs. The ARLA, which can cope with outliers, is then applied to tune the parameters of the kernel and the weights in the proposed ARRBFNs under the initial structure given by the SVR. Simulation results on the Mackey-Glass time series show that the proposed approach with different SVRs can cope with outliers and learns quickly. The simulation results also demonstrate the validity of the proposed method for chaotic time series with outliers.

13.
A single distribution is typically used to model the innovations of an autoregressive (AR) model. However, sparse impulses may exist within the innovations, which may cause outliers in the observations. These impulses cannot be modeled by a single distribution and may degrade the estimation. In this study, the innovation of an AR model is modeled with both a Gaussian noise component and a sparse impulse noise model, in order to obtain a robust estimate of the coefficients and an estimate of the impulses simultaneously. The Gaussian distribution models the normal noise, and the sparse impulse noise model captures the sparse abnormal innovation impulses. A hierarchical Bayesian model is built for the proposed model. Automatic relevance determination (ARD) priors are set for both the coefficients and the sparse impulses. A variational Bayesian (VB) learning algorithm is given to estimate the parameters of the model. Experimental results show that the proposed model with the learning algorithm is valid for AR models with outliers caused by sparse innovation impulses, that its coefficient estimation accuracy is better than that of other methods, and that the sparse impulses can be estimated simultaneously.

14.
Image variability that is impossible or difficult to restore by intra-image processing, such as the variability caused by occlusions, significantly reduces the performance of image-recognition methods. To address this issue, we propose that the pixels associated with large distances obtained by inter-image pixel-by-pixel comparisons should be considered inter-image outliers and should be removed from the similarity calculation used for image classification. When this method is combined with the template-matching method for image recognition, it leads to state-of-the-art recognition performance: 91% on the AR database, which includes occluded face images, 90% on the PUT database, which includes pose variations of face images, and 100% on the EYale B database, which includes images with large illumination variation.

15.
In existing Linear Discriminant Analysis (LDA) models, the class population mean is always estimated by the class sample average. In small sample size problems, such as face and palm recognition, however, the class sample average does not suffice to provide an accurate estimate of the class population mean based on a few of the given samples, particularly when there are outliers in the training set. To overcome this weakness, the class median vector is used to estimate the class population mean in LDA modeling. The class median vector has two advantages over the class sample average: (1) the class median (image) vector preserves useful details in the sample images, and (2) the class median vector is robust to outliers that exist in the training sample set. In addition, a weighting mechanism is adopted to refine the characterization of the within-class scatter so as to further improve the robustness of the proposed model. The proposed Median Fisher Discriminator (MFD) method was evaluated using the Yale and the AR face image databases and the PolyU (Polytechnic University) palmprint database. The experimental results demonstrated the robustness and effectiveness of the proposed method.
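The robustness argument for the class median vector can be illustrated with a small sketch. It assumes the median vector is taken component-wise over the samples of one class, which is one common reading; the abstract does not spell this out, and the function names and toy data below are hypothetical. The weighting of the within-class scatter is not covered.

```python
from statistics import mean, median


def class_mean_vector(samples):
    """Component-wise average of the sample vectors of one class."""
    return [mean(column) for column in zip(*samples)]


def class_median_vector(samples):
    """Component-wise median of the sample vectors of one class (assumed definition)."""
    return [median(column) for column in zip(*samples)]


# Toy class with four similar samples and one gross outlier.
samples = [
    [10.0, 20.0],
    [11.0, 19.0],
    [9.0, 21.0],
    [10.0, 20.0],
    [100.0, 200.0],  # outlier
]
print(class_mean_vector(samples))    # pulled toward the outlier
print(class_median_vector(samples))  # stays near the bulk of the class
```

With only a handful of samples per class, a single corrupted image can shift the average substantially, while the median moves only when more than half of the values in a component are affected.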

16.
Accurate prediction of sea surface temperature (SST) is extremely important for forecasting oceanic environmental events and for ocean studies. However, the existing SST prediction methods do not consider the seasonal periodicity and abnormal fluctuation characteristics of SST or the importance of historical SST data from different times; thus, these methods suffer from low prediction accuracy. To solve this problem, we comprehensively consider the effects of seasonal periodicity and abnormal fluctuation characteristics of SST data, as well as the influence of historical data in different periods, on prediction accuracy. We propose a novel ensemble learning approach that combines the Predictive Recurrent Neural Network (PredRNN) and an attention mechanism for effective SST field prediction. In this approach, the XGBoost model is used to learn the long-period fluctuation law of SST and to extract seasonal periodic features from SST data. The exponential smoothing method is used to mitigate the impact of severely abnormal SST fluctuations and extract the a priori features of SST data. The outputs of these two models and the original SST data are stacked and used as inputs for the next model, the PredRNN network. PredRNN is a recently developed spatiotemporal deep learning network that models both spatial and temporal representations and can transfer memory across layers and time steps. We therefore use it to extract the spatiotemporal correlations of SST data and predict future SSTs. Finally, an attention mechanism is added to capture the importance of different historical SST data, weight the output of each step of the PredRNN network, and improve the prediction accuracy. Experimental results on two ocean datasets confirm that the proposed approach achieves higher training efficiency and prediction accuracy than existing SST field prediction approaches.

17.
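Robust estimation of the fundamental matrix is a basic problem in computer vision. To improve the estimation accuracy of the fundamental matrix, this paper first points out the theoretical defects and practical problems of the existing robust algorithms RANSAC and MLESAC. It then analyzes in detail the complex causes of outliers and replaces the uniform distribution with a Gaussian mixture distribution to model outliers of different origins separately, and proposes a more robust algorithm, GMSAC. Experimental results show that, compared with MLESAC, GMSAC provides higher model likelihood and computational accuracy.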

18.
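Online robust least squares support vector regression modeling   Total citations: 5, self-citations: 0, citations by others: 5
Industrial processes are time-varying, and data collected on site are usually nonlinear and contain outliers, so models built with least squares support vector regression (LSSVR) are easily affected by outliers. To address this problem, this paper combines LSSVR with a robust learning algorithm (RLA) and proposes an online robust LSSVR modeling method. The method first uses the LSSVR model to predict the process output and compares it with the true output to obtain the prediction error; it then trains the weights of the LSSVR model with the RLA to build a robust LSSVR model; finally, it applies incremental learning to update the robust LSSVR model online, yielding an online robust LSSVR model. Simulation studies verify the effectiveness of the proposed method.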

19.
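As a core component of structure from motion (SfM), global camera position estimation has long been a research hotspot in computer vision. Most existing global camera position estimation methods are sensitive to outliers, which is especially evident when processing large-scale, unordered image sets. The iterative optimization step in incremental SfM can remove most mismatches and thus reduce the influence of outliers on the estimate, whereas global SfM has no effective strategy for removing mismatches, so its estimates are strongly affected by outliers. To address this, the paper proposes an improved global camera position estimation method. First, a new relative translation direction estimation algorithm that is robust to mismatches is proposed using the epipolar constraint, reducing the outliers in the estimated relative translation directions. Second, parallel rigidity theory is introduced to propose a new preprocessing method that turns global camera position estimation into a well-posed problem. Finally, on this basis a convex linear estimation model that is robust to outliers is constructed, and solving it yields the globally optimal camera position estimate. The method integrates well into current global SfM pipelines. Comparative experiments with typical existing methods show that, on large-scale, unordered images, the method markedly improves the robustness of global camera position estimation while keeping the estimation efficient and the results generally accurate.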

20.
Although the Global Precipitation Measurement Microwave Imager (GMI) has been operating as a new microwave sensor for about two years, no capability evaluation of GMI SST has been made. To provide useful information for later use of the products, this paper calculates the monthly and annual GMI SST measurement coverage from GMI products between April 2014 and March 2016 and analyzes the spatial and temporal variation of the coverage. In addition, a strategy is devised for generating a matchup set with buoy and Voluntary Observing Ship Climate (VOSClim) measurements as in situ data, and the retrieval uncertainty of GMI SST is evaluated. The work shows that (1) the annual GMI SST coverage is about 0.51, smaller than the global average, which is 0.59, but significantly larger than that of infrared SSTs; (2) using ±3 h and 0.1° as the temporal and spatial match windows and 6.5 m/s as a wind-speed threshold to exclude outliers effectively eliminates some of the errors; (3) the GMI SST bias is -0.02 ± 0.89 °C, close to the bias of some other SST products.
