Similar Literature
20 similar documents retrieved.
1.
RGB-D sensors provide 3D points (depth) together with color information associated with each point. These sensors suffer from several sources of noise. With some kinds of RGB-D sensors it is possible to pre-process the color image before assigning the color information to the 3D data; with others this is not possible, and the RGB-D data must be processed directly. In this paper, we compare different approaches for noise and artifact reduction: the Gaussian, mean and bilateral filters. These methods are time consuming when applied to 3D data, which can be a problem for many real-time applications. We propose new methods that accelerate the whole process and improve the quality of the color information by using entropy. Entropy provides a framework for speeding up the filters, allowing data to be skipped when its entropy value is above or below a given threshold. The experimental results show how to balance quality against acceleration, and they indicate that our methods improve both image quality and processing time compared to the original methods.
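The gating idea can be illustrated with a small sketch: a local-entropy map decides which pixels of the colour channel are worth filtering. The thresholds, window sizes and the use of scikit-image/OpenCV below are illustrative choices, not the authors' implementation.

```python
# Hedged sketch: entropy-gated bilateral filtering of an 8-bit intensity
# channel. Pixels whose local entropy lies outside [t_low, t_high] are copied
# through unfiltered, which is the kind of selective processing the abstract
# describes (a real implementation would restrict the filter to the masked
# region to actually save time).
import numpy as np
import cv2
from skimage.filters.rank import entropy
from skimage.morphology import disk

def entropy_gated_bilateral(gray_u8, t_low=1.0, t_high=4.5,
                            d=9, sigma_color=75, sigma_space=75):
    ent = entropy(gray_u8, disk(4))            # local Shannon entropy per pixel
    mask = (ent >= t_low) & (ent <= t_high)    # only these pixels get filtered
    filtered = cv2.bilateralFilter(gray_u8, d, sigma_color, sigma_space)
    out = gray_u8.copy()
    out[mask] = filtered[mask]
    return out
```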

2.
In clustering algorithms, it is usually assumed that the number of clusters is known or given. In the absence of such a priori information, a procedure is needed to find an appropriate number of clusters. This paper presents a clustering algorithm that incorporates a mechanism for finding the appropriate number of clusters as well as the locations of cluster prototypes. This algorithm, called multi-scale clustering, is based on scale-space theory by considering that any prominent data structure ought to survive over many scales. The number of clusters as well as the locations of cluster prototypes are found in an objective manner by defining and using lifetime and drift speed clustering criteria. The outcome of this algorithm does not depend on the initial prototype locations that affect the outcome of many clustering algorithms. As an application of this algorithm, it is used to enhance the Hough transform technique.
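A rough 1-D illustration of the scale-space principle, not the paper's algorithm: smooth the data's density with kernels of increasing bandwidth and keep the mode count that persists over the longest range of scales. This is a crude lifetime criterion; the drift-speed criterion and prototype tracking are omitted.

```python
# Hedged sketch: count density modes across scales and return the count with
# the longest "lifetime" (longest consecutive run of scales).
import numpy as np
from scipy.stats import gaussian_kde
from scipy.signal import argrelmax

def num_modes(x, grid, bw):
    return len(argrelmax(gaussian_kde(x, bw_method=bw)(grid))[0])

def lifetime_num_clusters(x, scales=None):
    scales = np.geomspace(0.05, 1.0, 40) if scales is None else scales
    grid = np.linspace(x.min(), x.max(), 512)
    counts = [num_modes(x, grid, s) for s in scales]
    runs, i = {}, 0
    while i < len(counts):                     # measure each count's longest run
        j = i
        while j < len(counts) and counts[j] == counts[i]:
            j += 1
        runs[counts[i]] = max(runs.get(counts[i], 0), j - i)
        i = j
    return max(runs, key=runs.get)
```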

3.
An information entropy-based cleaning method for uncertain data
覃远翔, 段亮, 岳昆 《计算机应用》2013,33(9):2490-2492
Uncertain data often contain abnormal values that cause the corresponding query results to be wrong. To reduce such abnormal data and improve the quality of uncertain data, a cleaning method based on information entropy is proposed. Information entropy is first used to measure the uncertainty of the data; a confidence interval for the uncertain data is then computed with statistical methods; finally, data falling outside the confidence interval are removed. Experimental results verify the efficiency and effectiveness of the method.
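A minimal sketch of that pipeline on a toy uncertain attribute: score each tuple by the Shannon entropy of its probability distribution, build a statistical confidence interval over the scores, and drop tuples outside it. The normality assumption and the 95% level are illustrative, not the paper's exact settings.

```python
# Hedged sketch: entropy-based scoring plus a confidence-interval filter.
import numpy as np
from scipy import stats

def shannon_entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def clean_uncertain(tuples_probs, confidence=0.95):
    ent = np.array([shannon_entropy(p) for p in tuples_probs])
    mean, sd = ent.mean(), ent.std(ddof=1)
    z = stats.norm.ppf(0.5 + confidence / 2)   # two-sided normal interval
    low, high = mean - z * sd, mean + z * sd
    keep = (ent >= low) & (ent <= high)
    return [t for t, k in zip(tuples_probs, keep) if k]
```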

4.
Clustering is the process of partitioning a set of patterns into disjoint and homogeneous meaningful groups (clusters). A fundamental and unresolved issue in cluster analysis is to determine how many clusters are present in a given set of patterns. In this paper, we present the z-windows clustering algorithm, which aims to address this problem using a windowing technique. Extensive empirical tests that illustrate the efficiency and the accuracy of the proposed method are presented. The text was submitted by the authors in English. Basilis Boutsinas. Received his diploma in Computer Engineering and Informatics in 1991 from the University of Patras, Greece. He also conducted studies in Electronics Engineering at the Technical Education Institute of Piraeus, Greece, and Pedagogics at the Pedagogical Academy of Lamia, Greece. He received his PhD on Knowledge Representation from the University of Patras in 1997. He has been an assistant professor in the Department of Business Administration at the University of Patras since 2001. His primary research interests include data mining, business intelligence, knowledge representation techniques, nonmonotonic reasoning, and parallel AI. Dimitris K. Tasoulis received his diploma in Mathematics from the University of Patras, Greece, in 2000. He attained his MSc degree in 2004 from the postgraduate course "Mathematics of Computers and Decision Making", from which he was awarded a postgraduate fellowship. Currently, he is a PhD candidate in the same course. His research activities focus on data mining, clustering, neural networks, parallel algorithms, and evolutionary computation. He is coauthor of more than ten publications. Michael N. Vrahatis is with the Department of Mathematics at the University of Patras, Greece. He received the diploma and PhD degree in Mathematics from the University of Patras in 1978 and 1982, respectively. He was a visiting research fellow at the Department of Mathematics, Cornell University (1987–1988) and a visiting professor to the INFN (Istituto Nazionale di Fisica Nucleare), Bologna, Italy (1992, 1994, and 1998); the Department of Computer Science, Katholieke Universiteit Leuven, Belgium (1999); the Department of Ocean Engineering, Design Laboratory, MIT, Cambridge, MA, USA (2000); and the Collaborative Research Center "Computational Intelligence" (SFB 531) at the Department of Computer Science, University of Dortmund, Germany (2001). He was a visiting researcher at CERN (European Organization for Nuclear Research), Geneva, Switzerland (1992) and at INRIA (Institut National de Recherche en Informatique et en Automatique), France (1998, 2003, and 2004). He is the author of more than 250 publications (more than 110 of which are published in international journals) in his research areas, including computational mathematics, optimization, neural networks, evolutionary algorithms, and artificial intelligence. His research publications have received more than 600 citations. He has been a principal investigator of several research grants from the European Union, the Hellenic Ministry of Education and Religious Affairs, and the Hellenic Ministry of Industry, Energy, and Technology. He is among the founders of the "University of Patras Artificial Intelligence Research Center" (UPAIRC), established in 1997, where he currently serves as director. He is the founder of the Computational Intelligence Laboratory (CI Lab), established in 2004 at the Department of Mathematics of the University of Patras, where he currently serves as director.

5.
How individuals in a fish school reach a consistent collective motion state through information transfer has not yet been captured by a unified mathematical model. This paper describes a method that uses video data and transfer entropy to construct the information-transfer network among individuals in a fish school. Video data of zebrafish schools are first collected experimentally, and computer-vision tracking is used to obtain the position and velocity of every individual; transfer entropy is then computed between every pair of individuals to quantify their information-transfer relationship, and the information-transfer network of the school is built on this basis. Network analysis reveals the relationship between the number of information interactions among individuals and the speed of information propagation, and further uncovers frequent substructures of the network. The paper thus provides a way to build fish-school information-transfer networks by detecting causal relationships between time series, offering a new perspective for the study of information transfer in fish schools.
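The pairwise quantity used to build such a network can be sketched with a plug-in estimator of transfer entropy between two discretised trajectories (e.g. heading angles of two fish). The binning and history length of 1 are illustrative simplifications, not the paper's settings.

```python
# Hedged sketch: transfer entropy TE(y -> x) with history length 1, in bits,
# TE = sum p(x_{t+1}, x_t, y_t) * log[ p(x_{t+1}|x_t, y_t) / p(x_{t+1}|x_t) ].
import numpy as np
from collections import Counter

def transfer_entropy(y, x, bins=8):
    xd = np.digitize(x, np.histogram_bin_edges(x, bins))
    yd = np.digitize(y, np.histogram_bin_edges(y, bins))
    triples = list(zip(xd[1:], xd[:-1], yd[:-1]))   # (x_{t+1}, x_t, y_t)
    n = len(triples)
    p_xyz = Counter(triples)
    p_xz = Counter((a, b) for a, b, _ in triples)   # (x_{t+1}, x_t)
    p_zy = Counter((b, c) for _, b, c in triples)   # (x_t, y_t)
    p_z = Counter(b for _, b, _ in triples)         # (x_t,)
    te = 0.0
    for (a, b, c), n_abc in p_xyz.items():
        te += (n_abc / n) * np.log2(n_abc * p_z[b] / (p_zy[(b, c)] * p_xz[(a, b)]))
    return te
```

Thresholding the matrix of pairwise TE values then yields a directed information-transfer network in the spirit of the abstract.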

6.
Enhancing density-based data reduction using entropy
Data reduction algorithms determine a small data subset from a given large data set. In this article, new types of data reduction criteria, based on the concept of entropy, are first presented. These criteria can evaluate data reduction performance in a sophisticated and comprehensive way, and new data reduction procedures are developed from them. Using the newly introduced criteria, the proposed data reduction scheme is shown to be efficient and effective. In addition, an outlier-filtering strategy with negligible computational cost is developed; in some instances this strategy substantially improves the performance of supervised data analysis. The proposed procedures are compared with related techniques in two types of application, density estimation and classification, and extensive comparative results corroborate the contributions of the proposed algorithms.
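One plausible entropy-style reduction criterion, not necessarily the article's, can be sketched as follows: score a candidate subset by the average negative log-density (cross-entropy) of the full data under a density model fitted on the subset; lower scores mean the subset preserves the density better.

```python
# Hedged sketch of a cross-entropy criterion for judging a reduced subset.
import numpy as np
from scipy.stats import gaussian_kde

def subset_cross_entropy(full, subset):
    kde = gaussian_kde(subset.T)                 # KDE built on the reduced subset
    dens = np.clip(kde(full.T), 1e-300, None)
    return -np.mean(np.log(dens))                # in nats; lower is better

rng = np.random.default_rng(0)
data = rng.normal(size=(2000, 2))
idx = rng.choice(len(data), size=200, replace=False)
print(subset_cross_entropy(data, data[idx]))
```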

7.
This paper proposes a new approach, named SGMIEC, in the field of estimation of distribution algorithms (EDAs). Because current EDAs spend considerable time on the statistical learning process when the relationships among the variables are complicated, this approach deploys the selfish gene theory (SG) and uses a mutual-information-and-entropy-based clustering (MIEC) model, with an incremental learning and resampling scheme, to optimize the probability distribution of the virtual population. Experimental results on several benchmark problems demonstrate that, compared with BMDA, COMIT and MIMIC, SGMIEC often performs better in convergence reliability, convergence speed and convergence behavior.
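For readers unfamiliar with EDAs, a minimal univariate-marginal EDA (UMDA) on OneMax shows the estimate-then-sample loop that SGMIEC extends; the selfish gene and MIEC clustering components of the paper are not reproduced here.

```python
# Hedged sketch: a baseline EDA (UMDA) with an incremental probability update.
import numpy as np

def umda_onemax(n_bits=40, pop=100, elite=0.5, iters=60, seed=0):
    rng = np.random.default_rng(seed)
    p = np.full(n_bits, 0.5)                        # marginal probabilities
    for _ in range(iters):
        X = (rng.random((pop, n_bits)) < p).astype(int)
        fit = X.sum(axis=1)                         # OneMax fitness
        top = X[np.argsort(fit)[-int(elite * pop):]]
        p = 0.9 * p + 0.1 * top.mean(axis=0)        # incremental learning step
        p = np.clip(p, 0.02, 0.98)
    return p

print(umda_onemax().round(2))
```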

8.
Several algorithms for clustering data streams based on k-Means have been proposed in the literature. However, most of them assume that the number of clusters, k, is known a priori by the user and can be kept fixed throughout the data analysis process. Besides the difficulty of choosing k, data stream clustering imposes several further challenges, such as dealing with non-stationary, unbounded data that arrive in an online fashion. In this paper, we propose a Fast Evolutionary Algorithm for Clustering data streams (FEAC-Stream) that estimates k automatically from the data in an online fashion. FEAC-Stream uses the Page–Hinkley Test to detect eventual degradation in the quality of the induced clusters, thereby triggering an evolutionary algorithm that re-estimates k accordingly. FEAC-Stream relies on the assumption that clusters of (partially unknown) data can provide useful information about the dynamics of the data stream. We illustrate the potential of FEAC-Stream in a set of experiments using both synthetic and real-world data streams, comparing it to four related algorithms: CluStream-OMRk, CluStream-BkM, StreamKM++-OMRk and StreamKM++-BkM. The results show that FEAC-Stream provides good data partitions and that it can detect, and react to, changes in the data.
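The change-detection component the abstract names is the standard Page–Hinkley test; a compact version is sketched below, here monitoring a stream of per-point "quality" values (e.g. distance to the nearest centroid). The delta and threshold values are illustrative.

```python
# Hedged sketch: Page-Hinkley test for detecting an increase in a monitored
# statistic, the trigger FEAC-Stream uses before re-estimating k.
class PageHinkley:
    def __init__(self, delta=0.005, threshold=50.0):
        self.delta, self.threshold = delta, threshold
        self.mean, self.n, self.cum, self.min_cum = 0.0, 0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n       # running mean
        self.cum += x - self.mean - self.delta      # cumulative deviation
        self.min_cum = min(self.min_cum, self.cum)
        return (self.cum - self.min_cum) > self.threshold   # True => drift
```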

9.
This paper proposes a new method for estimating the true number of clusters and initial cluster centers in a dataset with many clusters. The observation points are assigned to the data space to observe the clusters through the distributions of the distances between the observation points and the objects in the dataset. A Gamma Mixture Model (GMM) is built from a distance distribution to partition the dataset into subsets, and a GMM tree is obtained by recursively partitioning the dataset. From the leaves of the GMM tree, a set of initial cluster centers are identified and the true number of clusters is estimated. This method is implemented in the new GMM-Tree algorithm. Two GMM forest algorithms are further proposed to ensemble multiple GMM trees to handle high dimensional data with many clusters. The GMM-P-Forest algorithm builds GMM trees in parallel, whereas the GMM-S-Forest algorithm uses a sequential process to build a GMM forest. Experiments were conducted on 32 synthetic datasets and 15 real datasets to evaluate the performance of the new algorithms. The results have shown that the proposed algorithms outperformed the existing popular methods: Silhouette, Elbow and Gap Statistic, and the recent method I-nice in estimating the true number of clusters from high dimensional complex data.
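The observation-point idea can be illustrated crudely: distances from a random observation point to all objects form a 1-D distribution whose components hint at cluster structure. In the sketch below a Gaussian mixture selected by BIC stands in for the paper's Gamma Mixture Model, purely for illustration.

```python
# Hedged sketch: count mixture components in an observation-point distance
# distribution (Gaussian mixture + BIC as a stand-in for the gamma mixture).
import numpy as np
from sklearn.mixture import GaussianMixture

def components_from_observation_point(X, max_k=10, seed=0):
    rng = np.random.default_rng(seed)
    obs = X.min(0) + rng.random(X.shape[1]) * (X.max(0) - X.min(0))
    d = np.linalg.norm(X - obs, axis=1).reshape(-1, 1)   # distance distribution
    fits = [GaussianMixture(k, random_state=seed).fit(d) for k in range(1, max_k + 1)]
    return int(np.argmin([m.bic(d) for m in fits])) + 1  # chosen component count
```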

10.
A method of predicting the number of clusters using Rand's statistic
Distributional and asymptotic results on the moments of Rand's Ck statistic were derived by DuBien and Warde [1981. Some distributional results concerning a comparative statistic used in cluster analysis. ASA Proceedings of the Social Statistics Section, 309–313]. Based on those results, a method to predict the number of clusters is suggested by applying various agglomerative clustering algorithms. In the procedure, methods using different indexes are examined and compared based on the concept of agreement (or disagreement) between clusterings generated by different clustering algorithms on the same data set. The proposed method is practically general, works better than the other methods, and assigns statistical meaning to the Ck values used in determining the number of clusters from the comparison.
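The agreement measure the procedure builds on is the Rand statistic, i.e. the fraction of object pairs on which two clusterings agree; comparing the labelings produced by different agglomerative algorithms for each candidate number of clusters is the ingredient of the suggested method (the distributional moments themselves are not reproduced here).

```python
# Hedged sketch: plain Rand index between two labelings of the same objects.
from itertools import combinations

def rand_index(labels_a, labels_b):
    agree, pairs = 0, 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += (same_a == same_b)
        pairs += 1
    return agree / pairs

print(rand_index([0, 0, 1, 1, 2], [0, 0, 1, 2, 2]))
```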

11.
12.
Determining object translation information using stereoscopic motion
Stereoscopic motion is an approach for comparing image change due to motion in a stereo pair of image sequences. Qualitatively, the relative image change shows whether an object point is approaching, receding, or remaining at constant depth. Quantitatively, the relative change predicts where the object point will pass with respect to the camera system.
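A tiny numeric illustration of the qualitative claim, using the standard rectified-stereo relation rather than the paper's formulation: depth follows Z = f*B/d, so the sign of the disparity change between frames tells whether a point approaches (disparity grows), recedes (disparity shrinks) or stays at constant depth. The focal length and baseline below are placeholder values.

```python
# Hedged sketch: classify approach/recession from disparity change.
def depth(f_px, baseline_m, disparity_px):
    return f_px * baseline_m / disparity_px

f, B = 700.0, 0.12                     # assumed camera parameters
z0 = depth(f, B, disparity_px=35.0)    # frame t
z1 = depth(f, B, disparity_px=38.0)    # frame t+1: disparity grew
print("approaching" if z1 < z0 else "receding or constant depth")
```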

13.
Imbalanced data classification, an important type of classification task, is challenging for standard learning algorithms. Among the different strategies for handling the problem, data-level imbalanced learning methods have attracted considerable attention from researchers in recent years. However, most data-level approaches generate new instances linearly from local neighbor information rather than from the overall data distribution. Differing from these algorithms, in this study we develop a new data-level method, named generative learning (GL), to deal with imbalanced problems. In GL, we fit the distribution of the original data with a Gaussian mixture model and generate new data from that distribution. The generated data, covering both the minority and majority classes, are used to train learning models. The proposed method is validated through experiments on real-world data sets. The results show that our approach is competitive with methods such as SMOTE, SMOTE-ENN, SMOTE-TomekLinks, Borderline-SMOTE, and safe-level-SMOTE, and the Wilcoxon signed-rank test confirms the significant superiority of our proposal.
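A minimal sketch of the generative idea, under the assumption (not stated in the abstract) that one mixture is fitted per class: sample an equal number of synthetic points for every class from its fitted mixture and train on the generated set. Component counts and sample sizes are illustrative.

```python
# Hedged sketch: per-class Gaussian mixture fitting and balanced generation.
import numpy as np
from sklearn.mixture import GaussianMixture

def generative_training_set(X, y, per_class=500, n_components=3, seed=0):
    X_gen, y_gen = [], []
    for c in np.unique(y):
        gmm = GaussianMixture(n_components, random_state=seed).fit(X[y == c])
        Xs, _ = gmm.sample(per_class)              # synthetic points for class c
        X_gen.append(Xs)
        y_gen.append(np.full(per_class, c))
    return np.vstack(X_gen), np.concatenate(y_gen)
```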

14.
In this paper, we present a new method, called the Spectral Global Silhouette method (GS), to calculate the optimal number of clusters in a dataset using a spectral clustering algorithm. It combines a Silhouette validity index with the concept of local scaling. The GS algorithm was first tested on synthetic data and then applied to real data for an image segmentation task. In addition, three new methods for image segmentation and two new ways to calculate the optimal number of groups in an image are proposed. Our experiments show promising performance for the proposed algorithms.
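The core ingredient can be sketched as choosing the number of clusters for spectral clustering by the global (mean) silhouette score; the local-scaling affinity of the paper is replaced here by scikit-learn's default RBF affinity for brevity.

```python
# Hedged sketch: silhouette-driven selection of k for spectral clustering.
from sklearn.cluster import SpectralClustering
from sklearn.metrics import silhouette_score

def best_k_by_silhouette(X, k_range=range(2, 10), seed=0):
    scores = {}
    for k in k_range:
        labels = SpectralClustering(n_clusters=k, random_state=seed,
                                    assign_labels="kmeans").fit_predict(X)
        scores[k] = silhouette_score(X, labels)    # global silhouette for this k
    return max(scores, key=scores.get)
```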

15.
Estimating the number of segments in an image attracts considerable interest in the research community. The number of segments is estimated by determining the number of clusters present in the image's pixels. The present work offers an unsupervised method, named the Electrostatic Force Index (EF-Index), to estimate the number of clusters inherent in an image, a problem that is rarely reported in the literature. The proposed approach is inspired by Coulomb's law of electrostatics: the EF-Index models the mutual influence of one pixel on another by treating them as point charges, in close analogy to the way the electrostatic force acts between a pair of static point charges in a closed system. The EF-Index can thus determine the number of clusters present in an image. To assess the effectiveness of the proposed approach, we compare the EF-Index of a given image with the DB-Index, I-Index, CVNN-Index, DOE-AND-SCA and Sym-Index of the same image. Experimental results show that the EF-Index agrees with these state-of-the-art indices while requiring no clustering algorithm. To establish its applicability, the EF-Index is applied to image segmentation on the Berkeley Segmentation Dataset and the Stanford Background Dataset; the results conform to the ground truth and to those obtained with existing well-established segmentation techniques on the same datasets. The efficacy of the proposed approach is further substantiated by its reduced computational overhead compared with state-of-the-art segmentation algorithms.
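The abstract does not define how the index aggregates pairwise terms, so only the elementary Coulomb-style building block it evokes is sketched here: two pixels treated as point charges whose "charge" is their intensity, with mutual influence decaying with the square of their spatial distance. The constant k = 1 and the intensity-as-charge reading are assumptions.

```python
# Hedged sketch: Coulomb-style pairwise influence between two pixels.
import numpy as np

def coulomb_influence(p1, p2, k=1.0):
    (x1, y1, q1), (x2, y2, q2) = p1, p2            # pixel position + intensity
    r2 = (x1 - x2) ** 2 + (y1 - y2) ** 2
    return k * q1 * q2 / r2 if r2 > 0 else np.inf

print(coulomb_influence((0, 0, 120), (3, 4, 200)))  # distance 5 -> q1*q2/25
```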

16.
Clustering is actively studied in fields such as statistics, pattern recognition and machine learning. A new randomized algorithm for finding the number of clusters in a data set is proposed and justified; its efficiency is demonstrated by simulation experiments on synthetic data with thousands of clusters.

17.
Determining the number of principal components for best reconstruction
A well-defined variance of reconstruction error (VRE) is proposed to determine the number of principal components in a PCA model for best reconstruction. Unlike most other methods in the literature, the proposed VRE method has a guaranteed minimum over the number of PCs, corresponding to the best reconstruction; it therefore avoids the arbitrariness of methods with monotonic indices. The VRE can also be used to remove variables that are little correlated with the others and cannot be reliably reconstructed from the correlation-based PCA model. The effectiveness of the method is demonstrated on a simulated process.
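A VRE-style criterion can be sketched as follows: for each candidate number of PCs, reconstruct every standardized variable from the remaining ones through the PCA subspace and sum the variances of the reconstruction errors, then pick the number of PCs that minimizes the sum. The exact weighting used in the paper may differ from this plain sum.

```python
# Hedged sketch: variance-of-reconstruction-error selection of the PC count.
import numpy as np

def vre_num_pcs(X):
    Z = (X - X.mean(0)) / X.std(0)                  # correlation-based PCA scaling
    S = np.cov(Z, rowvar=False)
    _, vecs = np.linalg.eigh(S)
    vecs = vecs[:, ::-1]                            # descending eigenvalue order
    m = Z.shape[1]
    vre = []
    for l in range(1, m):                           # l = number of retained PCs
        C = vecs[:, :l] @ vecs[:, :l].T
        Ct = np.eye(m) - C                          # residual projector
        u = [Ct[i] @ S @ Ct[i] / Ct[i, i] ** 2 for i in range(m)]
        vre.append(sum(u))                          # total reconstruction variance
    return int(np.argmin(vre)) + 1
```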

18.
An image retrieval method based on the information entropy of the distance distribution is proposed. The method first partitions the target region of the image into sub-regions, then extracts the information entropy of each region as a feature describing the image's shape, and finally uses the Euclidean distance to measure the similarity between entropy vectors. Experimental results show that the distance-distribution entropy effectively characterizes the shape of binary images, is invariant to translation, rotation and scale, and yields retrieval results that agree with human visual perception.
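One plausible reading of such a descriptor, not necessarily the paper's exact scheme: split the centroid-to-foreground distances of a binary shape into rings, collect each ring's histogram entropy into a vector, and compare two shapes by the Euclidean distance between their vectors. Ring counts and binning below are illustrative.

```python
# Hedged sketch: entropy descriptor of a binary shape's distance distribution.
import numpy as np

def distance_entropy_descriptor(mask, n_rings=8, bins=16):
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()
    d = np.hypot(ys - cy, xs - cx)
    d = d / (d.max() + 1e-12)                      # scale invariance
    edges = np.linspace(0, 1, n_rings + 1)
    feats = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        ring = d[(d >= lo) & (d < hi)]
        if ring.size == 0:
            feats.append(0.0)                      # empty ring -> zero entropy
            continue
        hist, _ = np.histogram(ring, bins=bins, range=(lo, hi))
        p = hist[hist > 0] / ring.size
        feats.append(-np.sum(p * np.log2(p)))      # Shannon entropy of the ring
    return np.array(feats)

def shape_distance(mask_a, mask_b):
    return np.linalg.norm(distance_entropy_descriptor(mask_a)
                          - distance_entropy_descriptor(mask_b))
```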

19.
We introduce from first principles an analysis of the information content of multivariate distributions as information sources. Specifically, we generalize a balance equation and a visualization device, the Entropy Triangle, for multivariate distributions and find notable differences with similar analyses done on joint distributions as models of information channels. As an example application, we extend a framework for the analysis of classifiers to also encompass the analysis of data sets. With these tools we analyze a handful of UCI machine learning tasks to start addressing the question of how well datasets convey the information they are supposed to capture about the phenomena they stand for.
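For the bivariate (channel-style) case the balance equation the triangle visualizes can be computed directly: the uniform entropy budget splits into a divergence-from-uniformity term, twice the mutual information, and the variation of information, H_Ux + H_Uy = DeltaH + 2*I(X;Y) + VI(X,Y). The multivariate generalization of the paper is not reproduced here, and the toy joint distribution is illustrative.

```python
# Hedged sketch: the three Entropy Triangle coordinates for a joint P(X, Y).
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def entropy_triangle_terms(pxy):
    px, py = pxy.sum(1), pxy.sum(0)
    hx, hy, hxy = entropy(px), entropy(py), entropy(pxy.ravel())
    mi = hx + hy - hxy                              # mutual information
    vi = hxy - mi                                   # H(X|Y) + H(Y|X)
    delta = np.log2(pxy.shape[0]) + np.log2(pxy.shape[1]) - hx - hy
    return delta, 2 * mi, vi                        # sums to log|X| + log|Y|

pxy = np.array([[0.3, 0.1], [0.1, 0.5]])
print(entropy_triangle_terms(pxy), sum(entropy_triangle_terms(pxy)))
```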

20.
Determining forest canopy characteristics using airborne laser data
A pulsed laser system was flown over a forested area in Pennsylvania which exhibited a wide range of canopy closure conditions. The lasing system acts as the ultraviolet light equivalent of radar, sensing not only the distance to the top of the forest canopy, but also the range to the forest floor. The data were analyzed to determine which components of the laser data could explain the variability in crown closure along the flight transect. Results indicated that canopy closure was most strongly related to the penetration capability of the laser pulse. Pulses were attenuated more quickly in a dense canopy. Hence the inability to find a strong ground return in the laser data after initially sensing the top of the canopy connoted dense canopy cover. Photogrammetrically acquired tree heights were compared to laser estimates; average heights differed by less than 1 m. The results indicated that the laser system may be used to remotely sense the vertical forest canopy profile. Elements of this profile are linearly related to crown closure and may be used to assess tree height.
