首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
A flexible approach for visual data mining   总被引:3,自引:0,他引:3  
The exploration of heterogenous information spaces requires suitable mining methods as well as effective visual interfaces. Most of the existing systems concentrate either on mining algorithms or on visualization techniques. This paper describes a flexible framework for visual data mining which combines analytical and visual methods to achieve a better understanding of the information space. We provide several pre-processing methods for unstructured information spaces, such as a flexible hierarchy generation with user-controlled refinement. Moreover, we develop new visualization techniques, including an intuitive focus+context technique to visualize complex hierarchical graphs. A special feature of our system is a new paradigm for visualizing information structures within their frame of reference  相似文献   

2.
From visual data exploration to visual data mining: a survey   总被引:8,自引:0,他引:8  
We survey work on the different uses of graphical mapping and interaction techniques for visual data mining of large data sets represented as table data. Basic terminology related to data mining, data sets, and visualization is introduced. Previous work on information visualization is reviewed in light of different categorizations of techniques and systems. The role of interaction techniques is discussed, in addition to work addressing the question of selecting and evaluating visualization techniques. We review some representative work on the use of information visualization techniques in the context of mining data. This includes both visual data exploration and visually expressing the outcome of specific mining algorithms. We also review recent innovative approaches that attempt to integrate visualization into the DM/KDD process, using it to enhance user interaction and comprehension.  相似文献   

3.
迄今为止,数据挖掘与知识发现软件的功能不再停留在"挖掘"这个单一功能的实现,而已延伸到数据挖掘与知识发现的过程,即包括数据的预处理、数据挖掘、模型评估与可视化,在单纯的模型可视化基础上扩充了数据可视化与数据挖掘过程可视化.主要讨论了数据挖掘的方法与可视化技术,指出了未来的研究方向.  相似文献   

4.
This paper discusses 3D visualization and interactive exploration of large relational data sets through the integration of several well-chosen multidimensional data visualization techniques and for the purpose of visual data mining and exploratory data analysis. The basic idea is to combine the techniques of grand tour, direct volume rendering, and data aggregation in databases to deal with both the high dimensionality of data and a large number of relational records. Each technique has been enhanced or modified for this application. Specifically, positions of data clusters are used to decide the path of a grand tour. This cluster-guided tour makes intercluster-distance-preserving projections in which data clusters are displayed as separate as possible. A tetrahedral mapping method applied to cluster centroids helps in choosing interesting cluster-guided projections. Multidimensional footprint splatting is used to directly render large relational data sets. This approach abandons the rendering techniques that enhance 3D realism and focuses on how to efficiently produce real-time explanatory images that give comprehensive insights into global features such as data clusters and holes. Examples are given where the techniques are applied to large (more than a million records) relational data sets.  相似文献   

5.
Data mining techniques such as classification algorithms are applied to data which are usually high dimensional and very large. In order to assist the user to perform a classification task, visual techniques can be employed to represent high dimensional data in a more comprehensible 2D or 3D space. However, such representation of high dimensional data in the 2D or 3D space may unavoidably cause overlapping data and information loss. This issue can be addressed by interactive visualization. With expert domain knowledge, the user can build classifiers that are as competitive as automated ones using a 2D or 3D visual interface interactively. Several visual techniques have been proposed for classifying high dimensional data. However, the user׳s interaction with those techniques is highly dependent on the experience of the user in the visual identification of classifying data, and as a result, the classification results of those techniques may vary and may not be repeatable. To address this deficiency, this article presents an interactive visual approach to the classification of high dimensional data. Our approach employs the enhanced separation feature of a visual technique called HOV3 by which the user plots the training dataset by applying statistical measurements on a 2D space in order to separate data points into groups with the same class labels. A data group with its corresponding statistical measurement which separated it from the others is taken as a visual classifier. Then the user mixes the data points in a classifier with the unlabeled dataset and plots them in HOV3 by the measurement of the classifier. The data points which overlap the labeled ones in the 2D space are assigned the corresponding label. Our approach avoids the randomness in the existing interactive visual classification techniques, as the visual classifier in this approach only depends on the training dataset and its statistical measurement. As a result, this work provides an intuitive and effective approach to classify high dimensional data by interactive visualization.  相似文献   

6.
Detecting flaws and intruders with visual data analysis   总被引:1,自引:0,他引:1  
The task of sifting through large amounts of data to find useful information spawned the field of data mining. Most data mining approaches are based on machine-learning techniques, numerical analysis, or statistical modeling. They use human interaction and visualization only minimally. Such automatic methods can miss some important features of the data. Incorporating human perception into the data mining process through interactive visualization can help us better understand the complex behaviors of computer network systems. This article describes visual-analytics-based solutions and outlines a visual exploration process for log analysis. Three log-file analysis applications demonstrate our approach's effectiveness in discovering flaws and intruders in network systems.  相似文献   

7.
Visualization techniques for mining large databases: a comparison   总被引:9,自引:0,他引:9  
Visual data mining techniques have proven to be of high value in exploratory data analysis, and they also have a high potential for mining large databases. In this article, we describe and evaluate a new visualization-based approach to mining large databases. The basic idea of our visual data mining techniques is to represent as many data items as possible on the screen at the same time by mapping each data value to a pixel of the screen and arranging the pixels adequately. The major goal of this article is to evaluate our visual data mining techniques and to compare them to other well-known visualization techniques for multidimensional data: the parallel coordinate and stick-figure visualization techniques. For the evaluation of visual data mining techniques, the perception of data properties counts most, while the CPU time and the number of secondary storage accesses are only of secondary importance. In addition to testing the visualization techniques using real data, we developed a testing environment for database visualizations similar to the benchmark approach used for comparing the performance of database systems. The testing environment allows the generation of test data sets with predefined data characteristics which are important for comparing the perceptual abilities of visual data mining techniques  相似文献   

8.
Visualization techniques are of increasing importance in exploring and analyzing large amounts of multidimensional information. One important class of visualization techniques which is particularly interesting for visualizing very large multidimensional data sets is the class of pixel-oriented techniques. The basic idea of pixel-oriented visualization techniques is to represent as many data objects as possible on the screen at the same time by mapping each data value to a pixel of the screen and arranging the pixels adequately. A number of different pixel-oriented visualization techniques have been proposed in recent years and it has been shown that the techniques are useful for visual data exploration in a number of different application contexts. In this paper, we discuss a number of issues which are important in developing pixel-oriented visualization techniques. The major goal of this article is to provide a formal basis of pixel-oriented visualization techniques and show that the design decisions in developing them can be seen as solutions of well-defined optimization problems. This is true for the mapping of the data values to colors, the arrangement of pixels inside the subwindows, the shape of the subwindows, and the ordering of the dimension subwindows. The paper also discusses the design issues of special variants of pixel-oriented techniques for visualizing large spatial data sets  相似文献   

9.
The visual senses for humans have a unique status, offering a very broadband channel for information flow. Visual approaches to analysis and mining attempt to take advantage of our abilities to perceive pattern and structure in visual form and to make sense of, or interpret, what we see. Visual Data Mining techniques have proven to be of high value in exploratory data analysis and they also have a high potential for mining large databases. In this work, we try to investigate and expand the area of visual data mining by proposing new visual data mining techniques for the visualization of mining outcomes.  相似文献   

10.
The analysis of large dynamic networks poses a challenge in many fields, ranging from large bot-nets to social networks. As dynamic networks exhibit different characteristics, e.g., being of sparse or dense structure, or having a continuous or discrete time line, a variety of visualization techniques have been specifically designed to handle these different aspects of network structure and time. This wide range of existing techniques is well justified, as rarely a single visualization is suitable to cover the entire visual analysis. Instead, visual representations are often switched in the course of the exploration of dynamic graphs as the focus of analysis shifts between the temporal and the structural aspects of the data. To support such a switching in a seamless and intuitive manner, we introduce the concept of in situ visualization--a novel strategy that tightly integrates existing visualization techniques for dynamic networks. It does so by allowing the user to interactively select in a base visualization a region for which a different visualization technique is then applied and embedded in the selection made. This permits to change the way a locally selected group of data items, such as nodes or time points, are shown--right in the place where they are positioned, thus supporting the user's overall mental map. Using this approach, a user can switch seamlessly between different visual representations to adapt a region of a base visualization to the specifics of the data within it or to the current analysis focus. This paper presents and discusses the in situ visualization strategy and its implications for dynamic graph visualization. Furthermore, it illustrates its usefulness by employing it for the visual exploration of dynamic networks from two different fields: model versioning and wireless mesh networks.  相似文献   

11.
Visual data mining in large geospatial point sets   总被引:2,自引:0,他引:2  
Visual data-mining techniques have proven valuable in exploratory data analysis, and they have strong potential in the exploration of large databases. Detecting interesting local patterns in large data sets is a key research challenge. Particularly challenging today is finding and deploying efficient and scalable visualization strategies for exploring large geospatial data sets. One way is to share ideas from the statistics and machine-learning disciplines with ideas and methods from the information and geo-visualization disciplines. PixelMaps in the Waldo system demonstrates how data mining can be successfully integrated with interactive visualization. The increasing scale and complexity of data analysis problems require tighter integration of interactive geospatial data visualization with statistical data-mining algorithms.  相似文献   

12.
In many information visualization techniques, labels are an essential part to communicate the visualized data. To preserve the expressiveness of the visual representation, a placed label should neither occlude other labels nor visual representatives (e.g., icons, lines) that communicate crucial information. Optimal, non-overlapping labeling is an NP-hard problem. Thus, only a few approaches achieve a fast non-overlapping labeling in highly interactive scenarios like information visualization. These approaches generally target the point-feature label placement (PFLP) problem, solving only label-label conflicts. This paper presents a new, fast, solid and flexible 2D labeling approach for the PFLP problem that additionally respects other visual elements and the visual extent of labeled features. The results (number of placed labels, processing time) of our particle-based method compare favorably to those of existing techniques. Although the esthetic quality of non-real-time approaches may not be achieved with our method, it complies with practical demands and thus supports the interactive exploration of information spaces. In contrast to the known adjacent techniques, the flexibility of our technique enables labeling of dense point clouds by the use of nonoccluding distant labels. Our approach is independent of the underlying visualization technique, which enables us to demonstrate the application of our labeling method within different information visualization scenarios.  相似文献   

13.
The identification of significant sequences in large and complex event-based temporal data is a challenging problem with applications in many areas of today's information intensive society. Pure visual representations can be used for the analysis, but are constrained to small data sets. Algorithmic search mechanisms used for larger data sets become expensive as the data size increases and typically focus on frequency of occurrence to reduce the computational complexity, often overlooking important infrequent sequences and outliers. In this paper we introduce an interactive visual data mining approach based on an adaptation of techniques developed for Web searching, combined with an intuitive visual interface, to facilitate user-centred exploration of the data and identification of sequences significant to that user. The search algorithm used in the exploration executes in negligible time, even for large data, and so no pre-processing of the selected data is required, making this a completely interactive experience for the user. Our particular application area is social science diary data but the technique is applicable across many other disciplines.  相似文献   

14.
Visual data mining may overcome some of the flexibility problem often suffered by computer-centered data mining approaches. This can happen because human beings are introduced to the information discovery loop to take advantage of their natural strength in creative thinking and rapid visual pattern recognition to discover information not defined a priori and to perform approximated reasoning that computer algorithms are hard to do. This paper presents a novel visual exploration approach for mining abstract, multi-dimensional data stored in tables in a relational database. The visual image is constructed by converting each table into a visualization unit called a table graph and then assembling these table graphs together to form a small multiples design. Different types of non-uniform color mappings to render this small multiples design could be automatically generated by minimizing the weight differences of colors in the visual image. These non-uniform color mappings are designed in such a way that the adjacent glyphs in a table graph that have near underlying values will be assigned with the same color. As such, visual patterns not able to see under the traditional uniform color mapping could be revealed. This enables the users to examine the input tables from different perspectives. The proposed flexible visualization method has been applied to generate visual images from which the users could quickly and easily compare the machine idle cost performances of alternative master production plans.  相似文献   

15.
The derivation, manipulation and verification of analytical models from raw data is a process which requires a transformation of information across different levels of abstraction. We introduce a concept for the coupling of data classification and interactive visualization in order to make this transformation visible and steerable for the human user. Data classification techniques generate mappings that formally group data items into categories. Interactive visualization includes the user into an iterative refinement process. The user identifies and selects interesting patterns to define these categories. The following step is the transformation of a visible pattern into the formal definition of a classifier. In the last step the classifier is transformed back into a pattern that is blended with the original data in the same visual display. Our approach allows in intuitive assessment of a formal classifier and its model, the detection of outliers and the handling of noisy data using visual pattern‐matching. We instantiated the concept using decision trees for classification and KVMaps as the visualization technique. The generation of a classifier from visual patterns and its verification is transformed from a cognitive to a mostly pre‐cognitive task.  相似文献   

16.
钱宇 《软件学报》2008,19(8):1965-1979
可视化技术的发展极大地提高了传统数据挖掘技术的效率.通过结合人类识别模式的能力,计算机程序能够更有效的发现隐藏在数据中的规律和信息.作为聚类分析的重要步骤,噪音消除一直都是困绕数据挖掘研究者的问题,尤其对于不同领域的应用,由于噪音的模型和定义不同,单一的数据处理方法无法有效而准确地去除域相关的噪音.本文针对这一问题,提出了一个新型的可视化噪音处理方法CLEAN.CLEAN的独特之处在于它设计的噪音处理技术和提出的可视化方法有机地结合在一起.噪音处理算法为可视化模型生成所需数据,同时针对噪音处理算法选择可视化方法,从而达到提高整个数据处理系统性能的目的.这样不仅降低了噪音去除过程中主观因素的影响,还可以帮助数据挖掘程序去除领域相关的噪音.同时源数据的质量,算法参数的选择和不同噪音去除算法的精确性都可以在所使用的可视化模型中反映出来.实验表明CLEAN能够有效地帮助空间数据聚类算法在噪音环境下发现数据的自然聚类.  相似文献   

17.
The loosely coupled relationships between visualization and analytical data mining (DM) techniques represent the majority of the current state of art in visual data mining; DM modeling is typically an automatic process with very limited forms of guidance from users. A conceptual model of the visualization support to DM modeling process and a novel interactive visual decision tree (IVDT) classification process have been proposed in this paper, with the aim of exploring humans’ pattern recognition ability and domain knowledge to facilitate the knowledge discovery process. An IVDT for categorical input attributes has been developed and experimented on 20 subjects to test three hypotheses regarding its potential advantages. The experimental results suggested that, compared to the automatic modeling process as typically applied in current decision tree modeling tools, IVDT process can improve the effectiveness of modeling in terms of producing trees with relatively high classification accuracies and small sizes, enhance users’ understanding of the algorithm, and give them greater satisfaction with the task.  相似文献   

18.
数据挖掘技术   总被引:13,自引:0,他引:13       下载免费PDF全文
数据挖掘技术是当前数据库和人工智能领域研究的热点课题,为了使人们对该领域现状有个概略了解,在消化大量文献资料的基础上,首先对数据挖掘技术的国内外总体研究情况进行了概略介绍,包括数据挖掘技术的产生背景、应用领域、分类及主要挖掘技术;结合作者的研究工作,对关联规则的挖掘、分类规则的挖掘、离群数据的挖掘及聚类分析作了 较详细的论述;介绍了关联规则挖掘的主要研究成果,同时指出了关联规则衡量标准的不足及其改进方法,提出了分类模式的准确度评估方法;最后,描述了数据挖掘技术在科学研究、金属投资、市场营销、保险业、制造业及通信网络管理等行业的应用情况,并对数据挖掘技术的应用前景作了展望。  相似文献   

19.
Direct visualization of volume data   总被引:5,自引:0,他引:5  
A combination of segmentation tools and fast volume renderers that provides an interactive exploration environment for volume visualization is discussed. The tools and renderers include mechanisms that distribute volume data across multiple processors, as well as image compositing techniques and solutions to representation problems in the selection and display of subregions within bounding volumes. A volume visualization technique using the interactive control of images rendered directly from volume data coupled with a user-controlled semantic classification tool is described. The variations of parallel volume rendering being explored on the Pixel-Planes 5 system and the region-of-interest selection methods and the interactive tools used by the system are presented. The flexibility and power of combining volume rendering with region-of-interest selection techniques are demonstrated using examples of medical imaging applications  相似文献   

20.
Cartographic maps have been shown to provide cognitive benefits when interpreting data in relation to a geographic location. In visualization, the term map-like describes techniques that incorporate characteristics of cartographic maps in their representation of abstract data. However, the field of map-like visualization is vast and currently lacks a clear classification of the existing techniques. Moreover, choosing the right technique to support a particular visualization task is further complicated, as techniques are scattered across different domains, with each considering different characteristics as map-like. In this paper, we give an overview of the literature on map-like visualization and provide a hierarchical classification of existing techniques along two general perspectives: imitation and schematization of cartographic maps. Each perspective is further divided into four principal categories that group common map-like techniques along the visual primitives they affect. We further discuss this classification from a task-centered view and highlight open research questions.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号