Similar literature
 Found 20 similar articles; search took 93 ms
1.
Most Web content categorization methods are based on the vector space model of information retrieval. One of the most important advantages of this representation model is that it can be used by both instance‐based and model‐based classifiers. However, this popular method of document representation does not capture important structural information, such as the order and proximity of word occurrence or the location of a word within the document. It also makes no use of the markup information that can easily be extracted from the Web document HTML tags. A recently developed graph‐based Web document representation model can preserve Web document structural information. It was shown to outperform the traditional vector representation using the k‐Nearest Neighbor (k‐NN) classification algorithm. The problem, however, is that eager (model‐based) classifiers cannot work with this representation directly. In this article, three new hybrid approaches to Web document classification are presented, built upon both graph and vector space representations, thus preserving the benefits and overcoming the limitations of each. The hybrid methods presented here are compared to vector‐based models using the C4.5 decision tree and the probabilistic Naïve Bayes classifiers on several benchmark Web document collections. The results demonstrate that the hybrid methods presented in this article outperform, in most cases, existing approaches in terms of classification accuracy, and in addition, achieve a significant reduction in the classification time. © 2008 Wiley Periodicals, Inc.

2.
Vector graphics give us a new way to represent raster images. Among the many types of vectorized representations, the most popular is the mesh representation, which inherits the benefits of vector graphics. Inspired by meshes, we propose a novel patch-based representation for raster images, in which pixels are partitioned into regions, and the pixels belonging to the same region are converted into a 3D point cloud and approximated, in a variational way, by a 3D planar patch with proper boundaries. The resulting patches are then encoded via a half-edge structure for storage. The key point is that the boundary vertices are not located exactly at the sample points (i.e., the converted pixels) but depend on the optimal position of the patch, which theoretically reduces the fitting errors. Experiments show that our algorithm produces better results.

3.
4.
We propose a framework for database querying that provides the user with several interaction paradigms based on different (i.e., form-based, diagrammatic, iconic, and hybrid) visual representations of the database. A unified model, namely the Graph Model, is used as the common underlying model, into which databases expressed in the most common data models can easily be converted. Graph Model databases can be queried by means of the multiparadigmatic interface. The semantics of the query operations is formally defined in terms of the Graphical Primitives. Such a formal approach enables the query manager to maintain the same query consistently in any representation. In the proposed multiparadigmatic environment, the user can switch from one interaction paradigm to another during query formulation, so that the most suitable query representation can be found.

5.
Topology has been an important tool for analyzing scalar data and flow fields in visualization. In this work, we analyze the topology of multivariate image and volume data sets with discontinuities in order to create an efficient, raster-based representation we call IStar. Specifically, the topology information is used to create a dual structure that contains nodes and connectivity information for every segmentable region in the original data set. This graph structure, along with a sampled representation of the segmented data set, is embedded into a standard raster image which can then be substantially downsampled and compressed. During rendering, the raster image is upsampled and the dual graph is used to reconstruct the original function. Unlike traditional raster approaches, our representation can preserve sharp discontinuities at any level of magnification, much like scalable vector graphics. However, because our representation is raster-based, it is well suited to the real-time rendering pipeline. We demonstrate this by reconstructing our data sets on graphics hardware at real-time rates.

6.
An extended model and calculus, called RasterCalc, is presented for operations on discrete graphics rasters, including their colour functions. The operations are separated into two main categories: operations on domains, and operations on colour functions. The operations are further classified as local and remote, depending on the correspondence between destination and source pixels. The new raster element or pixel can be a function of a single element from one or more rasters, a function of a small area from other rasters, or a function of entire rasters. Local operations have their main applications in computer graphics, while remote operations are more oriented towards image processing. A mathematically oriented notation is used to define and represent the operations included. RasterCalc has been implemented as a procedure package in Pascal, to be used on a powerful, yet expensive display. Recently a C version has been completed for a personal colour computer with a special chip for raster operations. The work reported in this paper is partially supported by the Dutch Technical Sciences Foundation, under project number LWI 14.0130: “Facilities for raster graphics in programming languages”.

7.
Raster graphics systems frequently store both a vector and a raster representation of the image. The vector representation is convenient for data input and editing, while the raster format is required for refreshing the display. Methods of converting polygons to raster format and raster to polygon format are discussed. A simple scheme which uses overlapping polygons is described.
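As an illustration of the polygon-to-raster direction mentioned in this abstract, the following is a minimal sketch of even-odd scanline filling. It is not taken from the paper: the function name `rasterize_polygon`, the pixel-center sampling convention, and the 0/1 grid output are all assumptions made for the example.

```python
def rasterize_polygon(vertices, width, height):
    """Fill a polygon into a height x width grid of 0/1 (even-odd scanline rule).

    Hypothetical helper for illustration; vertices are (x, y) pairs in pixel units.
    """
    grid = [[0] * width for _ in range(height)]
    n = len(vertices)
    for row in range(height):
        y = row + 0.5  # sample each scanline at the pixel centers
        crossings = []
        for i in range(n):
            x0, y0 = vertices[i]
            x1, y1 = vertices[(i + 1) % n]
            # Half-open test: horizontal edges are skipped and a vertex that
            # lies exactly on the scanline is counted only once.
            if (y0 <= y < y1) or (y1 <= y < y0):
                t = (y - y0) / (y1 - y0)
                crossings.append(x0 + t * (x1 - x0))
        crossings.sort()
        # Pixels whose centers fall between successive pairs of crossings are inside.
        for left, right in zip(crossings[0::2], crossings[1::2]):
            for col in range(width):
                if left <= col + 0.5 < right:
                    grid[row][col] = 1
    return grid


square = [(1, 1), (4, 1), (4, 3), (1, 3)]
grid = rasterize_polygon(square, 6, 4)
for line in grid:
    print("".join("#" if v else "." for v in line))
```

The reverse direction (raster to polygon) would require boundary tracing and is not sketched here.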

8.
《Information & Management》2005,42(2):349-359
Representations used in digital documentation applications today usually assume a world that only exists in the present. Information contained within a database may be added to or modified over time, but change through time is seldom maintained. This limitation of current IT has recently received attention, given the increasingly urgent need to understand geographical processes and cause-and-effect inter-relationships between human activities and the environment. Models proposed for the representation of spatio-temporal data are extensions of traditional raster and vector representations that can be seen as location- or feature-based, respectively, and are therefore best organized for performing either location- or feature-based queries. In this paper, a new spatio-temporal data model suitable for digital documentation of historical living systems (artefacts, monuments and sites) is defined: it is based on 3D geometry and intended to facilitate analysis of temporal relationships and patterns of 3D modeling changes through time. This is particularly useful to IT and IS managers, researchers and practitioners. It is shown that time-based queries related to 3D models of objects can be processed in an efficient and straightforward manner using the model. Finally, analytical time efficiency estimations are given, showing that the model is also an efficient and compact representation of spatio-temporal information.

9.
We reformulate minimalist grammars as partial functions on term algebras for strings and trees. Using filler/role bindings and tensor product representations, we construct homomorphisms for these data structures into geometric vector spaces. We prove that the structure-building functions as well as simple processors for minimalist languages can be realized by piecewise linear operators in representation space. We also propose harmony, i.e. the distance of an intermediate processing step from the final well-formed state in representation space, as a measure of processing complexity. Finally, we illustrate our findings by means of two particular arithmetic and fractal representations.

10.
This paper presents a new and enhanced voxel representation format for modeling the machined workpiece geometry when simulating machining operations that involve repeated updates of the workpiece model volume. The modeling format is named the Frame-Sliced Voxel representation (FSV-rep) because it uses a novel concept of frame-sliced voxels to represent the boundary of the workpiece volume. The FSV-rep uses a multi-level surface voxel representation for sparse and memory-efficient implementations. The use of frame-sliced voxels makes the approximation of the workpiece surface only loosely dependent on the grid resolution while achieving sub-voxel resolution updates of the model volume. It can thus provide a boundary representation of the workpiece model at an accuracy that is much higher than that of a basic voxel model of the same grid resolution and a similar model size. Quantitative comparisons of the FSV-rep with traditional voxel representations at the same finest grid resolution show improvements of up to two orders of magnitude in accuracy with only marginal increases in the model size. This confirms the effectiveness of the FSV-rep in simulating machined workpiece geometry in complex machining processes such as multi-axis milling.

11.
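Using a polygonal representation of image regions, this paper proposes a method for converting a raster image into a vector graphic expressed in SVG. Borrowing from seeded region growing, the method partitions the raster image into equally sized square atomic blocks, selects a main block among them, and, starting from the main block, uses the neighborhood relations between blocks and a seed-growing criterion to merge blocks whose color features are close to those of the main block into the main block's set; a polygon covering all elements of that set is then extracted. Boundary vertices are then extracted from the polygonal regions produced by this segmentation and are optimized. Finally, the raster image is described with SVG code according to the shape and color of each polygonal region. Because it describes the raster image with vector graphics, the method offers relatively small storage size, magnification without jagged edges, and resistance to distortion.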

12.
Reducing the energy consumption of water distribution networks has never had more significance. The greatest energy savings can be obtained by carefully scheduling the operations of pumps. Schedules can be defined either implicitly, in terms of other elements of the network such as tank levels; or explicitly, by specifying the time during which each pump is on/off. The traditional representation of explicit schedules is a string of binary values with each bit representing pump on/off status during a particular time interval. In this paper, we formally define and analyze two new explicit representations based on time-controlled triggers, where the maximum number of pump switches is established beforehand and the schedule may contain fewer than the maximum number of switches. In these representations, a pump schedule is divided into a series of integers with each integer representing the number of hours for which a pump is active/inactive. This reduces the number of potential schedules compared to the binary representation, and allows the algorithm to operate on the feasible region of the search space. We propose evolutionary operators for these two new representations. The new representations and their corresponding operations are compared with the two most-used representations in pump scheduling, namely, binary representation and level-controlled triggers. A detailed statistical analysis of the results indicates which parameters have the greatest effect on the performance of evolutionary algorithms. The empirical results show that an evolutionary algorithm using the proposed representations is an improvement over the results obtained by a recent state of the art hybrid genetic algorithm for pump scheduling using level-controlled triggers.
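The relationship between the integer (duration-based) encoding and the traditional binary string can be sketched as follows. This is only an illustration under stated assumptions, not the paper's exact encoding: the function names, the alternating off/on convention, the 24-hour horizon with one-hour intervals, and the definition of a "switch" as an off-to-on transition are all hypothetical choices made here.

```python
def decode_schedule(durations, start_on=False, horizon=24):
    """Expand alternating inactive/active durations (in hours) into hourly bits.

    Assumption: the pump state alternates after each duration, and any hours
    left over at the end of the horizon take the next state in the alternation.
    """
    if any(d < 0 for d in durations) or sum(durations) > horizon:
        raise ValueError("durations must be non-negative and fit in the horizon")
    bits = []
    state = start_on
    for hours in durations:
        bits.extend([int(state)] * hours)
        state = not state
    bits.extend([int(state)] * (horizon - len(bits)))
    return bits


def count_switches(bits):
    """Count off-to-on transitions (assumed here to be what a 'pump switch' means)."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a == 0 and b == 1)


# 6 h off, 4 h on, 8 h off, 3 h on, then off for the remaining 3 h.
bits = decode_schedule([6, 4, 8, 3])
print("".join(map(str, bits)))
print("switches:", count_switches(bits))
```

With four integers the encoding can never produce more than two off-to-on transitions, which illustrates how fixing the number of integers bounds the number of switches, whereas an arbitrary 24-bit string can contain up to twelve.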

13.
王昌  滕艳辉 《计算机工程》2010,36(20):88-89
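To address the problem of choosing a good data structure when integrating "3S" technologies (remote sensing, GIS, and GPS), the paper analyzes the advantages and disadvantages of vector and raster data structures and uses a two-level subdivision strategy to build an integrated vector-raster data structure that combines the advantages of both, so that spatial data can be rasterized while still meeting vector accuracy requirements; its logical representation is also given. On this basis, strategies for spatial data acquisition and overlay analysis based on this data structure are discussed.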

14.
Modeling appealing virtual scenes is an elaborate and time-consuming task, requiring not only training and experience, but also powerful modeling tools providing the desired functionality to the user. In this paper, we describe a modeling approach using signed distance functions as an underlying representation for objects, handling both conventional and complex surface manipulations. Scenes defined by signed distance functions can be stored compactly and rendered directly in real-time using sphere tracing. Hence, we are capable of providing an interactive application with immediate visual feedback for the artist, which is a crucial factor for the modeling process. Moreover, dealing with underlying mathematical operations is not necessary on the user level. We show that fundamental aspects of traditional modeling can be directly transferred to this novel kind of environment, resulting in an intuitive application behavior, and describe modeling operations which naturally benefit from implicit representations. We show modeling examples where signed distance functions are superior to explicit representations, but discuss the limitations of this approach as well.
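Sphere tracing, the rendering technique named in this abstract, can be sketched in a few lines: a ray is advanced by the signed distance at its current position, which is always a safe step because no surface can be closer than that. The scene (a single unit sphere), the function names, and the step and tolerance parameters below are assumptions for illustration and do not come from the paper.

```python
import math


def sphere_sdf(p, center=(0.0, 0.0, 5.0), radius=1.0):
    """Signed distance from point p to a sphere: negative inside, positive outside."""
    return math.dist(p, center) - radius


def sphere_trace(origin, direction, sdf, max_steps=64, max_dist=100.0, eps=1e-4):
    """March along a ray; direction is assumed to be a unit vector.

    Returns the distance travelled to the first hit, or None if nothing is hit.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:  # close enough to the surface to count as a hit
            return t
        t += dist  # safe step: the surface is at least `dist` away
        if t > max_dist:
            break
    return None


hit = sphere_trace((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sphere_sdf)
miss = sphere_trace((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), sphere_sdf)
print(hit, miss)
```

A renderer would call `sphere_trace` once per pixel with a ray through that pixel; an interactive tool of the kind the abstract describes would typically do this on the GPU rather than in Python.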

15.
A system is frequently represented by transfer functions in an input–output characterization. However, such a system (under mild assumptions) can also be represented by transfer functions in a port characterization, frequently referred to as a chain-scattering representation. Due to its cascade properties, the chain-scattering representation is used throughout many fields of engineering. This paper studies the relationship between poles and zeros of input–output and chain-scattering representations of the same system.

16.
Image representations and feature selection for multimedia database search   Total citations: 3, self-citations: 0, citations by others: 3
The success of a multimedia information system depends heavily on the way the data is represented. Although there are "natural" ways to represent numerical data, it is not clear what is a good way to represent multimedia data, such as images, video, or sound. We investigate various image representations where the quality of the representation is judged based on how well a system for searching through an image database can perform, although the same techniques and representations can be used for other types of object detection tasks or multimedia data analysis problems. The system is based on a machine learning method used to develop object detection models from example images; these models can subsequently be used, for example, to detect (search for) images of a particular object in an image database. As a base classifier for the detection task, we use support vector machines (SVM), a kernel-based learning method. Within the framework of kernel classifiers, we investigate new image representations/kernels derived from probabilistic models of the class of images considered and present a new feature selection method which can be used to reduce the dimensionality of the image representation without significant losses in terms of the performance of the detection (search) system.

17.
In this paper we define a new 3D vector field distance transform to implicitly represent a mesh surface. We show that this new representation is more accurate than the classic scalar field distance transform by comparing both representations with an error metric evaluation. The widely used marching cubes triangulation algorithm is adapted to the new vector field distance transform to correctly reconstruct the resulting explicit surface. In the reconstruction process of 3D scanned data, the useful mesh denoising operation is extended to the new vector field representation, which enables adaptive and selective filtering features. Results show that mesh processing with this new vector field representation is more accurate than with the scalar field distance transform and that it outperforms previous mesh filtering algorithms. Future work is discussed to extend this new vector field representation to other useful mesh operations and applications.

18.
A technique is developed to construct a representation of planar objects undergoing a general affine transformation. The representation can be used to describe planar or nearly planar objects in a three-dimensional space, observed by a camera under arbitrary orientations. The technique is based upon object contours, parameterized by an affine invariant parameter and the dyadic wavelet transform. The role of the wavelet transform is the extraction of multiresolution affine invariant features from the affine invariant contour representation. A dissimilarity function is also developed and used to distinguish among different object representations. This function makes use of the extrema on the representations, thus making its computation very efficient. A study of the effect of using different wavelet functions and their order or vanishing moments is also carried out. Experimental results show that the performance of the proposed representation is better than that of other existing methods, particularly when objects are heavily corrupted with noise.

19.
20.
In a negative representation, a set of elements (the positive representation) is depicted by its complement set. That is, the elements in the positive representation are not explicitly stored, and those in the negative representation are. The concept, feasibility, and properties of negative representations are explored in the paper; in particular, its potential to address privacy concerns. It is shown that a positive representation consisting of n l-bit strings can be represented negatively using only O(ln) strings, through the use of an additional symbol. It is also shown that membership queries for the positive representation can be processed against the negative representation in time no worse than linear in its size, while reconstructing the original positive set from its negative representation is an NP-hard problem. The paper introduces algorithms for constructing negative representations as well as operations for updating and maintaining them.
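A toy example may help make the membership query concrete. The sketch below assumes that the "additional symbol" acts as a don't-care position, written `*` here, so that one stored string can stand for many complement strings; the function names and the tiny 2-bit database are hypothetical and are not the paper's construction algorithm.

```python
from itertools import product


def matches(pattern, s):
    """True if bit string s matches pattern, where '*' matches either bit."""
    return len(pattern) == len(s) and all(
        c == "*" or c == b for c, b in zip(pattern, s)
    )


def in_positive_set(s, negative_db):
    """s belongs to the positive set iff no negative string matches it.

    One pass over the negative database, so the cost is linear in its size.
    """
    return not any(matches(p, s) for p in negative_db)


def reconstruct_positive(negative_db, length):
    """Brute-force recovery of the positive set by trying all 2**length strings."""
    candidates = ("".join(bits) for bits in product("01", repeat=length))
    return [s for s in candidates if in_positive_set(s, negative_db)]


# Positive set {"01"} over 2-bit strings; its complement {00, 10, 11}
# is covered by just two negative strings.
negative_db = ["*0", "11"]
print(in_positive_set("01", negative_db))
print(in_positive_set("10", negative_db))
print(reconstruct_positive(negative_db, 2))
```

The brute-force reconstruction enumerates every possible string and is therefore exponential in the string length, which is in keeping with the hardness result stated in the abstract, although it proves nothing about it.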
