Similar Documents
20 similar documents found (search time: 328 ms)
1.
We present a data mining technique for the analysis of multichannel oscillatory timeseries data and show an application using poloidal arrays of magnetic sensors installed in the H-1 heliac. The procedure is highly automated, and scales well to large datasets. The timeseries data is split into short time segments to provide time resolution, and each segment is represented by a singular value decomposition (SVD). By comparing power spectra of the temporal singular vectors, related singular values are grouped into subsets which define fluctuation structures. Thresholds for the normalised energy of the fluctuation structure and the normalised entropy of the SVD can be used to filter the dataset. We assume that distinct classes of fluctuations are localised in the space of phase differences Δψ(n,n+1) between each pair of nearest neighbour channels. An expectation maximisation clustering algorithm is used to locate the distinct classes of fluctuations and assign mode numbers where possible, and a cluster tree mapping is used to visualise the results.
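For illustration, a minimal Python sketch of the segment-level SVD filtering described above, using NumPy only; the energy/entropy thresholds and the toy 16-channel segment are placeholders, and the spectral grouping of singular vectors and the EM clustering stage are not reproduced here.

```python
import numpy as np

def svd_fluctuation_structure(segment, energy_thresh=0.2, entropy_thresh=0.7):
    """Represent one short time segment (n_samples x n_channels) by its SVD and
    keep it only if a few singular values dominate (low normalised entropy).
    Thresholds are illustrative, not the paper's settings."""
    U, s, Vt = np.linalg.svd(segment - segment.mean(axis=0), full_matrices=False)
    p = s**2 / np.sum(s**2)                                      # normalised energy per singular value
    entropy = -np.sum(p * np.log(p + 1e-12)) / np.log(len(p))    # normalised SVD entropy
    keep = np.where(p >= energy_thresh)[0]                       # singular triples carrying enough energy
    if entropy > entropy_thresh or keep.size == 0:
        return None                                              # segment looks like broadband noise
    return {"singular_values": s[keep], "topos": Vt[keep], "chronos": U[:, keep]}

# toy usage: 16 poloidal channels, 256 samples, one coherent mode plus noise
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 256)[:, None]
theta = np.linspace(0, 2 * np.pi, 16)[None, :]
seg = np.sin(2 * np.pi * 20 * t + 3 * theta) + 0.1 * rng.standard_normal((256, 16))
print(svd_fluctuation_structure(seg) is not None)
```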

2.
An effective method for recognizing array targets in remote sensing images is proposed. It exploits not only spatial attribute features but also non-spatial attribute features of the sub-targets, extracted with a designed Gabor filter bank. When clustering the sub-targets, an iterative method based on graph partitioning theory is given for extracting groups of similar samples, so that sub-target groups can be identified. Experimental results show that adding the non-spatial attribute features and iteratively extracting groups of similar samples effectively eliminates a large number of false sub-targets and accurately locates the array targets.
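As an illustration of non-spatial attribute features of this kind, a hedged sketch of a small Gabor filter bank using OpenCV; the kernel size, orientations and wavelengths are assumed values, not the paper's settings, and the graph-partition clustering step is not shown.

```python
import cv2
import numpy as np

def gabor_features(patch, thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4), lambdas=(4, 8)):
    """Mean/std responses of a small Gabor filter bank over one sub-target patch."""
    feats = []
    for theta in thetas:
        for lam in lambdas:
            kern = cv2.getGaborKernel(ksize=(15, 15), sigma=3.0, theta=theta,
                                      lambd=lam, gamma=0.5, psi=0)
            resp = cv2.filter2D(patch.astype(np.float32), cv2.CV_32F, kern)
            feats.extend([resp.mean(), resp.std()])
    return np.array(feats)

patch = (np.random.rand(32, 32) * 255).astype(np.uint8)  # stand-in for a sub-target image chip
print(gabor_features(patch).shape)                        # 4 orientations x 2 wavelengths x 2 stats = (16,)
```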

3.
4.
陈略  熊宸  蔡铭 《计算机工程》2021,47(3):83-93
Mobile-phone signaling data form spatiotemporal sequences characterized by large volume, uneven sampling frequency, low positioning accuracy and base-station oscillation, so traditional clustering methods for such data suffer from uneven density distribution, high space-time cost and poor clustering quality. A spatiotemporal-density trajectory-point recognition algorithm for mobile-phone signaling is proposed. The signaling data are first rasterized onto a grid to unify the evaluation scale; grid clusters are joined in space and time according to the characteristics of the oscillation noise to reduce spatial uncertainty and computation; the spatiotemporal mobility of trajectory points within a grid cluster is redefined by combining the tortuosity of the trajectory with moving and staying times; the spatiotemporal density of each grid cluster is computed to decide whether it is a stay region of the user; and trajectory data labelled with moving/staying states are collected to verify the effectiveness and recognition efficiency of the algorithm. Experimental results show that the algorithm achieves higher recognition accuracy than an improved DBSCAN algorithm, is well suited to identifying stay regions in mobile-phone signaling data, and performs better on the stay regions of complex trajectories.
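A much-simplified sketch of the gridding-and-dwell idea in plain Python; the cell size, dwell threshold and toy trace are assumptions, and the paper's oscillation handling, spatiotemporal joining of grid clusters and density computation are omitted.

```python
from collections import defaultdict

def stay_cells(records, cell_size=0.005, min_dwell_s=600):
    """Grid the signaling records and flag cells whose accumulated dwell time
    exceeds a threshold. records is a list of (lon, lat, timestamp_seconds)."""
    records = sorted(records, key=lambda r: r[2])
    dwell = defaultdict(float)
    for (lon0, lat0, t0), (_, _, t1) in zip(records, records[1:]):
        cell = (int(lon0 / cell_size), int(lat0 / cell_size))
        dwell[cell] += t1 - t0            # credit the time gap to the cell we were in
    return {c for c, d in dwell.items() if d >= min_dwell_s}

# toy trace: ~30 minutes near one location, then a move
trace = [(113.300, 23.100, i * 120) for i in range(15)] + [(113.320, 23.110, 1900)]
print(stay_cells(trace))
```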

5.
In this paper a new framework based on multiobjective optimization (MOO), namely FeaClusMOO, is proposed which is capable of identifying the correct partitioning as well as the most relevant set of features from a data set. A newly developed multiobjective simulated-annealing-based optimization technique, archived multiobjective simulated annealing (AMOSA), is used as the underlying optimization strategy. Features and cluster centers are encoded together in a string. Three objective functions are used: two internal cluster validity indices measuring the goodness of the obtained partitioning using the Euclidean distance and a point-symmetry-based distance, respectively, and a count of the number of selected features. These three objectives are optimized simultaneously using AMOSA in order to detect the appropriate subset of features, the appropriate number of clusters and the appropriate partitioning. Points are allocated to clusters using the point-symmetry-based distance, and mutation changes both the feature combination and the set of cluster centers. Since AMOSA, like any other MOO technique, provides a set of solutions on the final Pareto front, a technique based on semi-supervised classification is developed to select a single solution from this set. The effectiveness of FeaClusMOO is demonstrated on seven higher-dimensional real-life data sets in comparison with other clustering techniques: its Euclidean-distance-based variant (where the Euclidean distance is used for cluster assignment), a genetic-algorithm-based automatic clustering technique using the point-symmetry-based distance with all features (VGAPS-clustering), and K-means clustering with all features.

6.
Biomedical time series clustering that automatically groups a collection of time series according to their internal similarity is of importance for medical record management and inspection such as bio-signals archiving and retrieval. In this paper, a novel framework that automatically groups a set of unlabelled multichannel biomedical time series according to their internal structural similarity is proposed. Specifically, we treat a multichannel biomedical time series as a document and extract local segments from the time series as words. We extend a topic model, i.e., the Hierarchical probabilistic Latent Semantic Analysis (H-pLSA), which was originally developed for visual motion analysis, to cluster a set of unlabelled multichannel time series. The H-pLSA models each channel of the multichannel time series using a local pLSA in the first layer. The topics learned in the local pLSA are then fed to a global pLSA in the second layer to discover the categories of multichannel time series. Experiments on a dataset extracted from multichannel Electrocardiography (ECG) signals demonstrate that the proposed method performs better than previous state-of-the-art approaches and is relatively robust to the variations of parameters including length of local segments and dictionary size. Although the experimental evaluation used the multichannel ECG signals in a biometric scenario, the proposed algorithm is a universal framework for multichannel biomedical time series clustering according to their structural similarity, which has many applications in biomedical time series management.

7.
The text clustering technique is an appropriate method for partitioning a huge number of text documents into groups. Document size degrades text clustering performance, and text documents contain sparse and uninformative features that reduce the performance of the underlying clustering algorithm and increase the computational time. Feature selection is a fundamental unsupervised learning technique used to select a new subset of informative text features, improving clustering performance and reducing computational time. This paper proposes a hybrid of the particle swarm optimization algorithm with genetic operators for the feature selection problem. K-means clustering is used to evaluate the effectiveness of the obtained feature subsets. Experiments were conducted on eight common text datasets with varying characteristics. The results show that the proposed hybrid algorithm (H-FSPSOTC) improved the performance of the clustering algorithm by generating a new subset of more informative features, and it is compared with other algorithms published in the literature. Overall, the feature selection technique helps the clustering algorithm obtain accurate clusters.
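A hedged sketch of PSO-style feature selection evaluated with k-means, using scikit-learn; the binary-PSO update, the digits dataset stand-in and the silhouette fitness are illustrative assumptions, and the genetic operators of H-FSPSOTC are not reproduced.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.metrics import silhouette_score

def fitness(mask, X, k=10):
    """Cluster on the selected feature subset and score it; higher is better."""
    if mask.sum() == 0:
        return -1.0
    Xs = X[:, mask.astype(bool)]
    labels = KMeans(n_clusters=k, n_init=3, random_state=0).fit_predict(Xs)
    return silhouette_score(Xs, labels)

def binary_pso_feature_selection(X, n_particles=6, n_iter=8, k=10, seed=0):
    """Minimal binary PSO: velocities pass through a sigmoid to flip feature bits."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    pos = rng.integers(0, 2, size=(n_particles, d))
    vel = rng.normal(0.0, 0.1, size=(n_particles, d))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, X, k) for p in pos])
    gbest = pbest[np.argmax(pbest_fit)].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, d))
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = (rng.random((n_particles, d)) < 1.0 / (1.0 + np.exp(-vel))).astype(int)
        fit = np.array([fitness(p, X, k) for p in pos])
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        gbest = pbest[np.argmax(pbest_fit)].copy()
    return gbest

X = load_digits().data          # numeric stand-in for the TF-IDF text matrices used in the paper
mask = binary_pso_feature_selection(X)
print(int(mask.sum()), "of", X.shape[1], "features kept")
```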

8.
In this paper, a novel feature extraction approach is proposed for identifying ocean wave characteristics in real time. The algorithm was developed by integrating the fuzzy C-means clustering algorithm, statistical formulation, short-time Fourier transforms, high-frequency radar data processing and window function analysis. This method provides new insight into the detection of ocean wave characteristics and offers a more direct and convenient way to detect changes in them than the conventional method. To demonstrate the proposed algorithm, two Wellen radar systems were installed in Samcheok City, Gangwon-do on the East Coast of South Korea. One data set was selected for training the proposed algorithm, while three other data sets, not used in the training process, were used to validate the proposed model. The testing results demonstrate that the proposed algorithm is effective in extracting characteristic features from a variety of ocean waves. The proposed system is expected to predict natural hazards accurately and provide adequate warning time for people to evacuate from threatened coastal areas, directly contributing to the reduction of injuries and deaths in natural disasters by supplying near real-time data on the environment around coastal areas.
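A minimal sketch of the short-time Fourier transform step on a toy series using SciPy; the sampling rate, window and toy signal are assumptions, and the fuzzy C-means stage that would consume these spectral features is not shown.

```python
import numpy as np
from scipy.signal import stft

fs = 2.0                                     # assumed sampling rate (samples per second)
t = np.arange(0, 1800) / fs
sig = np.sin(2 * np.pi * 0.1 * t) + 0.3 * np.random.randn(t.size)   # toy radar-like series

# short-time spectra: one spectral feature vector per time window
f, seg_t, Z = stft(sig, fs=fs, window="hann", nperseg=256, noverlap=128)
spectra = np.abs(Z).T
print(spectra.shape)                         # these rows would feed a fuzzy C-means clustering step
```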

9.
This paper presents the QA-Pagelet as a fundamental data preparation technique for large-scale data analysis of the deep Web. To support QA-Pagelet extraction, we present the Thor framework for sampling, locating, and partitioning the QA-Pagelets from the deep Web. Two unique features of the Thor framework are 1) a novel page clustering for grouping pages from a deep Web source into distinct clusters of control-flow dependent pages and 2) a novel subtree filtering algorithm that exploits structural and content similarity at the subtree level to identify the QA-Pagelets within highly ranked page clusters. We evaluate the effectiveness of the Thor framework through experiments using both simulation and real data sets. We show that Thor performs well over millions of deep Web pages and over a wide range of sources, including e-commerce sites, general and specialized search engines, corporate Web sites, medical and legal resources, and several others. Our experiments also show that the proposed page clustering algorithm achieves low-entropy clusters, and the subtree filtering algorithm identifies QA-Pagelets with excellent precision and recall.

10.
Fault diagnosis is crucial for improving the reliability and performance of machinery, and effective feature extraction and clustering analysis can mine useful information from large amounts of raw data to facilitate it. This paper presents a novel intelligent fault diagnosis method based on ant colony clustering analysis. Vibration signals acquired from the equipment are decomposed by the wavelet packet transform, after which the sub-band signals are clustered with an ant colony algorithm; each cluster is analyzed from the perspective of its frequency-band pattern to select intrinsic features reflecting the operating condition of the equipment, and a fault diagnosis model is then established by combining the extracted major features with fault prototypes from historical data. Classification for fault diagnosis is carried out using the Euclidean nearness degree based on the established model. Furthermore, an improved ant colony clustering algorithm is proposed that adjusts the comparison probability dynamically and detects outliers; compared with other clustering algorithms, it converges faster, meeting the requirements of real-time analysis, and further improves accuracy. Finally, the effectiveness and feasibility of the proposed method are verified with vibration signals acquired from a rotor test bed.
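A small sketch of the wavelet-packet sub-band energy features that would feed the clustering stage, using PyWavelets; the wavelet, decomposition depth and toy vibration signal are assumptions, and the ant colony clustering itself is not implemented here.

```python
import numpy as np
import pywt

def wavelet_packet_energy(signal, wavelet="db4", level=3):
    """Relative energy of each level-`level` wavelet-packet sub-band of a signal."""
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, mode="symmetric", maxlevel=level)
    nodes = wp.get_level(level, order="freq")           # sub-bands ordered by frequency
    energies = np.array([np.sum(np.square(n.data)) for n in nodes])
    return energies / energies.sum()

t = np.linspace(0, 1, 2048)
vib = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 300 * t)   # toy vibration signal
print(wavelet_packet_energy(vib).round(3))
```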

11.
Dynamic waveform matching (DWM) is a generalization of cross correlation that is useful in waveform classification and feature extraction problems. While cross correlation matches two signals by shifting the time axis of one signal relative to the other, DWM matches signals by shifting and warping the time axis until the corresponding features of the two waveforms are properly aligned. The optimum shift and warping of the time axis is determined by dynamic programming. The dynamic programming algorithm behind DWM can be used in many matching applications. Several geophysical applications of DWM are described. These include identifying local earthquake arrival times, extracting features useful for discriminating between earthquakes and underground nuclear explosions, waveform clustering, identifying quarry blasts, interpreting full wave acoustic well logs, well to well correlation, and matching P-wave reflection profiles to S-wave reflection profiles. The similarity between DWM and error correcting regular grammars can be used to add grammarlike capabilities to the matching algorithm.
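For illustration, a plain dynamic-time-warping sketch of the dynamic-programming alignment idea behind DWM, in NumPy; it is not the paper's exact shift-plus-warp formulation.

```python
import numpy as np

def dtw_distance(x, y):
    """Classic dynamic-programming time warping between two 1-D waveforms."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

t = np.linspace(0, 1, 200)
a = np.sin(2 * np.pi * 3 * t)
b = np.sin(2 * np.pi * 3 * t**1.2)          # same waveform with a warped time axis
print(dtw_distance(a, b), "<", dtw_distance(a, -a))
```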

12.
Vehicle driving-control decision making is a core technology of autonomous driving, but existing deep-reinforcement-learning control algorithms for autonomous driving process data inefficiently and cannot effectively extract the temporal features between states. This paper therefore proposes a double temporal Q-network algorithm based on multi-step accumulated rewards. First, a multi-step accumulated reward method is designed, which averages the accumulated sum of the immediate rewards over several future steps and lets it act on the agent's control policy together with the current immediate reward, and in the reward fun…

13.
Effective fuzzy c-means clustering algorithms for data clustering problems (total citations: 3; self-citations: 0; citations by others: 3)
Clustering is a well-known technique for identifying intrinsic structures and extracting useful information from large amounts of data, and the fuzzy c-means algorithm is one of the most extensively used clustering techniques. However, the standard fuzzy c-means objective function becomes computationally demanding for large amounts of data and for measurement uncertainty in the data objects, and fuzzy c-means also struggles to set optimal parameters for the clustering method. The goal of this paper is therefore to produce an alternative generalization of FCM clustering, called quadratic entropy based fuzzy c-means, that can deal with more complicated data. Effective quadratic entropy fuzzy c-means algorithms are constructed by combining a regularization function, quadratic terms, mean distance functions and kernel distance functions, giving a complete framework for building quadratic-entropy-based fuzzy clustering algorithms. The paper establishes an effective way of estimating memberships and updating centers by minimizing the proposed objective functions, and proposes a new algorithm to initialize the cluster centers in order to reduce the number of iterations. The silhouette method is used to assess cluster validity and choose the number of clusters. For the first time, the synthetic control chart time series is segmented directly with the proposed methods to examine their performance, and the results show that the proposed clustering techniques have advantages over the standard FCM and the very recent ClusterM-k-NN in segmenting synthetic control chart time series.
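As a baseline reference, a minimal sketch of standard fuzzy c-means (the algorithm the paper generalizes), in NumPy; the quadratic-entropy regularization, kernel distances and the proposed center initialization are not reproduced, and the toy data and parameters are assumptions.

```python
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, n_iter=50, seed=0):
    """Plain fuzzy c-means: alternate membership and center updates."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)                    # fuzzy memberships, rows sum to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]   # membership-weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2 / (m - 1)), axis=2)
    return centers, U

X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + [5, 5]])
centers, U = fuzzy_c_means(X, c=2)
print(centers.round(2))
```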

14.
Kernel clustering algorithm (total citations: 112; self-citations: 0; citations by others: 112)
This paper proposes a kernel clustering method for cluster analysis. Using a Mercer kernel, samples in the input space are mapped into a high-dimensional feature space and clustered there; the kernel mapping brings out features that were not apparent in the input space, so better clusters can be formed. The kernel clustering method improves considerably on classical clustering algorithms, with faster convergence and more accurate clustering. Simulation results confirm the feasibility and effectiveness of the kernel clustering method.
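A generic kernel k-means sketch in NumPy illustrating clustering in the kernel-induced feature space; the RBF kernel, toy blobs and random initialization are assumptions, and this is not the paper's specific algorithm.

```python
import numpy as np

def kernel_kmeans(K, k=2, n_iter=30, seed=0):
    """Kernel k-means on a precomputed Mercer kernel matrix K: distances to cluster
    means are computed entirely in feature space via the kernel trick."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=K.shape[0])
    for _ in range(n_iter):
        dist = np.zeros((K.shape[0], k))
        for j in range(k):
            idx = np.where(labels == j)[0]
            if idx.size == 0:
                dist[:, j] = np.inf
                continue
            # ||phi(x) - mu_j||^2 = K(x,x) - 2*mean_i K(x,x_i) + mean_{i,l} K(x_i,x_l)
            dist[:, j] = np.diag(K) - 2 * K[:, idx].mean(axis=1) + K[np.ix_(idx, idx)].mean()
        labels = dist.argmin(axis=1)
    return labels

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (60, 2)), rng.normal(6, 1, (60, 2))])
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / (2 * 2.0**2))                          # RBF (Mercer) kernel matrix
print(np.bincount(kernel_kmeans(K)))                    # cluster sizes; roughly 60/60 expected
```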

15.
The amount of video content (VC) generated has grown enormously over the last few years, and video summarization (VS) was introduced to manage it. Existing VS techniques attempt to produce summaries but suffer from long execution time (ET) and condense the video's content in a domain-specific manner. To overcome these drawbacks, this paper proposes an efficient VS method for surveillance systems using normalized k-means together with quick sort. The proposed technique comprises eight stages: splitting the video into frames, pre-sampling, assigning ID numbers, feature extraction, feature selection (FS), clustering, frame extraction, and video summary generation. Initially, the video frames are pre-sampled using the proposed Three Step Cross Searching (TSCS) technique, and each frame is given an ID number. Features are then extracted from the frames, and the necessary features are selected with Entropy-based Spider Monkey Algorithms (ESMA). Next, the features are grouped with the Normalized K-Means (N-Kmeans) algorithm to identify the best candidate frames, and the cluster set with the minimum distance value is used for key frame (KF) selection. Finally, the video is summarized in order using quick sort. In the experimental evaluation, the proposed work is compared with prevailing methods and gives better results than the existing approaches.

16.
万福成 《计算机应用研究》2019,36(10):2952-2954,2970
Fuzzy information mining and extraction in big-data environments is disturbed by small perturbations and inter-class interference among the data, which degrades the clustering quality of the extracted features. A fuzzy information extraction method based on an improved chaotic partitioning algorithm is therefore proposed. The high-dimensional data stream is reorganized into a distributed structure, and the Lorenz chaotic attractor is used as the training/test set for adaptive learning of big-data fuzzy information extraction. Phase-space reconstruction is applied to the load features of the chaotic attractor for autocorrelation feature matching, the average mutual-information feature of the fuzzy information is extracted, and association-rule fuzzy pairing is combined with chaotic partitioning of the big data to achieve optimized clustering of the fuzzy information. Fuzzy information is then extracted accurately from the clustering results, and the extracted high-dimensional fuzzy information is compressed to reduce computational cost. Simulation results show that the method yields good clustering of the extracted fuzzy information from big-data sample sequences, strong resistance to inter-class perturbation and high extraction accuracy, and has good application value in data mining and feature extraction.
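A minimal sketch of the phase-space (time-delay) reconstruction step in NumPy; the embedding dimension, delay and toy series are assumptions, and the chaotic partitioning and association-rule pairing stages are not shown.

```python
import numpy as np

def delay_embed(x, dim=3, tau=5):
    """Time-delay (phase-space) reconstruction of a scalar series:
    row i is [x[i], x[i+tau], ..., x[i+(dim-1)*tau]]."""
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau: i * tau + n] for i in range(dim)])

x = np.sin(np.linspace(0, 20 * np.pi, 2000)) + 0.05 * np.random.randn(2000)
print(delay_embed(x).shape)     # (n_points, embedding_dimension)
```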

17.
Automatic optical inspection (AOI) of surface defects based on machine vision is a new technique for in-line surface-defect inspection in modern industrial production and an important means of guaranteeing the surface quality of parts in precision manufacturing and assembly. Taking automatic optical inspection of TFT-array surface defects on LCD panels as an example, this paper introduces the basic composition and principles of AOI for surface defects, and describes defect-detection methods for surfaces with periodic texture backgrounds together with the basic procedures and practical algorithms of defect-information processing. Addressing the image-processing challenges of surface-defect detection, it discusses in detail Fourier-transform frequency-domain filtering of the periodic texture background in scanned defect images and a dual-threshold statistical-control method for defect segmentation, illustrated with experimental results.
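A crude NumPy sketch of Fourier-domain suppression of a periodic texture background; the peak threshold, low-frequency radius and toy stripe-plus-defect image are assumptions, and the paper's dual-threshold statistical segmentation is not reproduced.

```python
import numpy as np

def remove_periodic_background(img, keep_radius=6):
    """Zero the strong off-centre peaks of the Fourier spectrum (the periodic
    texture), keeping low frequencies near DC and all weak coefficients."""
    F = np.fft.fftshift(np.fft.fft2(img))
    mag = np.abs(F)
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    centre_dist = np.hypot(yy - h / 2, xx - w / 2)
    peaks = (mag > 10 * np.median(mag)) & (centre_dist > keep_radius)
    F[peaks] = 0
    return np.real(np.fft.ifft2(np.fft.ifftshift(F)))

# toy image: periodic stripes plus one bright defect blob
y, x = np.mgrid[:128, :128]
img = 0.5 * np.sin(2 * np.pi * x / 8)
img[60:66, 60:66] += 2.0
residual = remove_periodic_background(img)
print(residual[62, 62] > residual[10, 10])   # the defect should stand out after filtering
```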

18.
This paper presents a method for the extraction of contour lines and other geographic information from scanned color images of topographical maps. Although topographic maps are available from many suppliers, this work focuses on United States Geological Survey (USGS) maps. The extraction of contour lines, which are shown with brown color on USGS maps, is a difficult process due to aliasing and false colors induced by the scanning process and due to closely spaced and intersecting/overlapping features inherent to the map. These difficulties render simple approaches such as clustering ineffective. The proposed method overcomes these difficulties using a multistep process. First, a color key set, designed to comprehend color aliasing and false colors, is generated using an eigenvector line-fitting technique in RGB space. Next, area features, representing vegetation and bodies of water, are extracted using RGB color histogram analysis in order to simplify the next stage. Then, linear features corresponding to roads and rivers, including contours, are extracted using a valley seeking algorithm operating on a transformed version of the original map. Finally, an A* search algorithm is used to link valleys together to form linear features and to close the gaps caused by intersecting features. The performance of the algorithm is tested on a number of USGS topographic map samples.

19.
Most knowledge discovery in databases (KDD) research is concentrated on supervised inductive learning. Conceptual clustering is an unsupervised inductive learning technique that organizes observations into an abstraction hierarchy without using predefined class values. However, a typical conceptual clustering algorithm is not suitable for a KDD task because of space and time constraints. Furthermore, typical incremental and non-incremental clustering algorithms are not designed for a partitioned data set. In this paper, we present a conceptual clustering algorithm that works on partitioned data. The proposed algorithm improves the clustering process by using less computation time and less space while maintaining the clustering quality.

20.
Keyframe extraction is an important technique in content-based video summarization. Affinity propagation clustering is introduced here for the first time to extract video keyframes. The method uses the colour-histogram intersection of consecutive frames and clusters the data points automatically through message passing. It is compared with keyframe-extraction methods based on k-means and SVC (support vector clustering). Experimental results show that AP (Affinity Propagation) clustering extracts keyframes quickly and accurately, and the resulting video summaries achieve good compression ratios and content coverage.
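A small sketch of affinity-propagation keyframe selection on histogram-intersection similarities, using scikit-learn; the grayscale toy frames and bin count are assumptions (the paper works with colour histograms of real video frames).

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

def keyframes_by_ap(frames, bins=16):
    """Cluster frames by colour-histogram intersection with affinity propagation
    and return one exemplar frame index per cluster."""
    hists = []
    for f in frames:
        h, _ = np.histogram(f, bins=bins, range=(0, 256))
        hists.append(h / h.sum())
    H = np.array(hists)
    # histogram intersection: sum of element-wise minima, in [0, 1]
    S = np.minimum(H[:, None, :], H[None, :, :]).sum(axis=2)
    ap = AffinityPropagation(affinity="precomputed", random_state=0).fit(S)
    return ap.cluster_centers_indices_

# toy "video": 20 dark frames followed by 20 bright frames
frames = [np.full((32, 32), 40) + np.random.randint(0, 20, (32, 32)) for _ in range(20)]
frames += [np.full((32, 32), 180) + np.random.randint(0, 20, (32, 32)) for _ in range(20)]
print(keyframes_by_ap(frames))    # one exemplar index per detected shot
```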
