Similar Documents (20 results)
1.
A connected-component-based page segmentation and classification method   Cited by: 2 (self: 0, others: 2)
Page segmentation and classification are key steps in document processing, but most existing methods place restrictions on the blocks and skew of a page. This paper proposes a new page segmentation and classification method based on connected components: a fast algorithm first extracts the connected components in the page; an improved PLSA algorithm then segments the page; and blocks are classified according to the distribution of connected components and the features of each block. The method couples segmentation tightly with classification and takes full account of local block features, ensuring correct block classification and greatly improving efficiency.
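As a concrete illustration of the first step, here is a minimal Python sketch assuming a binarized page array; `scipy.ndimage` stands in for the paper's fast component-extraction algorithm, and the row-smearing `gap` threshold is an assumption, not a value from the paper.

```python
# Hypothetical sketch of connected-component extraction for page
# segmentation, in the spirit of the method above (not the authors' code).
import numpy as np
from scipy import ndimage

def extract_components(binary_page):
    """Label connected components in a binarized page (1 = ink)."""
    labels, count = ndimage.label(binary_page)
    # Bounding box of each component as (row_slice, col_slice) pairs.
    boxes = ndimage.find_objects(labels)
    return labels, boxes

def smear_rows(binary_page, gap=20):
    """Run-length smoothing along rows: fill white runs shorter than
    `gap` so that characters on a line merge into a single block."""
    smeared = binary_page.copy()
    for row in smeared:
        run_start = None
        for j, v in enumerate(row):
            if v == 0 and run_start is None:
                run_start = j
            elif v == 1 and run_start is not None:
                if j - run_start < gap:
                    row[run_start:j] = 1
                run_start = None
    return smeared
```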

2.
In this paper, based on a study of the specificity of historical printed books, we first explain the main error sources in classical methods used for page layout analysis. We show that each method (bottom-up and top-down) provides different types of useful information that should not be ignored if we want to obtain both a generic method and good segmentation results. Next, we propose a hybrid segmentation algorithm that builds two maps: a shape map that focuses on connected components and a background map, which provides information about white areas corresponding to block separations in the page. Using this first segmentation, the extracted blocks can be classified according to scenarios produced by the user. These scenarios are defined very simply during an interactive stage, allowing the user to build processing sequences adapted to the different kinds of images they are likely to meet and to their needs. The proposed "user-driven approach" efficiently segments and labels the high-level concepts required by the user and has achieved above 93% accuracy over the different data sets tested. User feedback and experimental results demonstrate the effectiveness and usability of our framework, mainly because the extraction rules can be defined without difficulty and the parameters are not sensitive to page layout variation.

3.
This paper presents a robust methodology that automatically counts moving vehicles along an expressway. The domain of interest is the use of both a neuro-fuzzy network and simple image processing techniques to implement traffic flow monitoring and analysis. As the system is dedicated to outdoor applications, efficient and robust processing methods are introduced to handle both day and night analysis. In our study, a neuro-fuzzy network based on the Hebbian–Mamdani rule reduction architecture is used to classify and count the vehicles that pass through a three- or four-lane expressway. Since the captured video is corrupted by the noisy outdoor environment, a series of preprocessing steps is required before the features are fed into the network. A vector of nine feature values is extracted to represent whether a vehicle is passing through a lane, and these vectors serve as the input patterns used to train the neuro-fuzzy network. Vehicle counting and classification are then performed by the trained network. The approach is benchmarked against MLP and RBF networks; the results of the proposed neuro-fuzzy network are very encouraging, with a high degree of accuracy.

4.
While numerous page segmentation algorithms have been proposed in the literature, there is a lack of comparative evaluation of these algorithms. In existing performance evaluation methods, two crucial components are usually missing: 1) automatic training of algorithms with free parameters and 2) statistical and error analysis of experimental results. We use the following five-step methodology to quantitatively compare the performance of page segmentation algorithms: 1) first, we create mutually exclusive training and test data sets with groundtruth, 2) we then select a meaningful and computable performance metric, 3) an optimization procedure is then used to search automatically for the optimal parameter values of the segmentation algorithms on the training data set, 4) the segmentation algorithms are then evaluated on the test data set, and, finally, 5) a statistical and error analysis is performed to give the statistical significance of the experimental results. In particular, instead of the ad hoc and manual approach typically used in the literature for training algorithms, we pose the automatic training of algorithms as an optimization problem and use the simplex algorithm to search for the optimal parameter values. A paired-model statistical analysis and an error analysis are then conducted to provide confidence intervals for the experimental results of the algorithms. This methodology is applied to the evaluation of five page segmentation algorithms, of which three are representative research algorithms and the other two are well-known commercial products, on 978 images from the University of Washington III data set. It is found that the performance indices of the Voronoi, Docstrum, and Caere segmentation algorithms are not significantly different from each other, but they are significantly better than that of ScanSoft's segmentation algorithm, which, in turn, is significantly better than that of the X-Y cut algorithm.
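Step 3 can be reproduced with any off-the-shelf simplex implementation; the sketch below uses SciPy's Nelder-Mead method, with `segment_and_score` as a hypothetical placeholder for running a segmentation algorithm and computing the chosen performance metric.

```python
# Minimal sketch of step 3 (automatic parameter training) via the
# Nelder-Mead simplex method; `segment_and_score` is a placeholder.
from scipy.optimize import minimize

def train_parameters(segment_and_score, x0, training_set):
    """Search for the parameter vector minimizing segmentation error
    averaged over the training images, starting from guess x0."""
    def objective(params):
        errors = [segment_and_score(img, gt, params)
                  for img, gt in training_set]
        return sum(errors) / len(errors)
    result = minimize(objective, x0, method="Nelder-Mead")
    return result.x  # optimal parameter values
```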

5.
6.
Chinese web page classification based on statistical word segmentation   Cited by: 9 (self: 3, others: 9)
This paper applies a statistics-based bigram word segmentation method to Chinese web page classification: a two-character word list is constructed purely by statistics, without a pre-existing lexicon, the text of a web page is then segmented with this list, and the page is classified accordingly. Texts of different types and sources on the Internet differ considerably in wording style and vocabulary, new words appear constantly, and large amounts of same-type text are easy to obtain as training corpora; all of this makes statistical word segmentation feasible. Experiments test the effect of the statistically constructed two-character word list on Chinese web page classification and show that, when the statistical threshold is chosen appropriately, segmenting with the constructed word list effectively improves classification precision. The paper also analyzes the different effects of single characters versus segmented words on text classification, and the reasons for them.
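A minimal sketch of the idea follows, assuming a plain-text corpus; the frequency threshold value and the greedy left-to-right segmentation strategy are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative sketch (not the paper's code): build a two-character
# word list from raw text by bigram frequency, then greedily segment.
from collections import Counter

def build_bigram_lexicon(corpus, threshold=5):
    """Count adjacent character pairs; keep pairs above a frequency
    threshold as candidate two-character words."""
    counts = Counter(corpus[i:i+2] for i in range(len(corpus) - 1))
    return {w for w, c in counts.items() if c >= threshold}

def segment(text, lexicon):
    """Greedy left-to-right segmentation with the bigram lexicon;
    characters that do not form a known bigram are emitted singly."""
    tokens, i = [], 0
    while i < len(text):
        if text[i:i+2] in lexicon:
            tokens.append(text[i:i+2])
            i += 2
        else:
            tokens.append(text[i])
            i += 1
    return tokens
```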

7.
8.
We propose an approach to image segmentation that views it as pixel classification using simple features defined over the local neighborhood. We use a support vector machine for pixel classification, making the approach automatically adaptable to a large number of image segmentation applications. Since our approach utilizes only local information for classification, both training and application of the image segmentor can be done on a distributed computing platform. This makes our approach scalable to larger images than the ones tested. This article describes the methodology in detail and tests its efficacy against 5 other comparable segmentation methods on 2 well-known image segmentation databases. We present the results together with the analysis that supports the following conclusions: (i) the approach is as effective as, and often better than, its studied competitors; (ii) the approach suffers from very little overfitting and hence generalizes well to unseen images; (iii) the trained image segmentation program can be run on a distributed computing environment, resulting in linear scalability characteristics. The overall message of this paper is that using a strong classifier with simple pixel-centered features gives segmentation results as good as or better than some sophisticated competitors, and does so in a computationally scalable fashion.
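A hedged sketch of this pipeline with scikit-learn is shown below; the window radius and the raw-neighborhood features are assumptions, since the abstract does not list the paper's exact features.

```python
# Sketch of per-pixel classification with local features and an SVM,
# following the idea above; window size and features are assumptions.
import numpy as np
from sklearn.svm import SVC

def pixel_features(image, r=2):
    """For each pixel, use the flattened (2r+1)x(2r+1) neighborhood as
    its feature vector (image assumed 2-D, reflect-padded at borders)."""
    padded = np.pad(image, r, mode="reflect")
    h, w = image.shape
    feats = np.empty((h * w, (2 * r + 1) ** 2))
    for i in range(h):
        for j in range(w):
            feats[i * w + j] = padded[i:i + 2*r + 1, j:j + 2*r + 1].ravel()
    return feats

# Usage sketch: labels is an (h, w) array of per-pixel class ids.
# clf = SVC(kernel="rbf").fit(pixel_features(train_img), labels.ravel())
# pred = clf.predict(pixel_features(test_img)).reshape(test_img.shape)
```

Because each pixel's features depend only on its local window, the feature extraction and prediction loops partition trivially across machines, which is the scalability property the paper emphasizes.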

9.
The design of manual assembly workstations, as with most forms of design, is highly iterative and interactive. The designer has to consider countless constraints and solutions for contradictory goals. To assist the designer in the design process, a new intelligent methodology and system is required. This paper develops a neuro-fuzzy hybrid approach to the intelligent design and planning of manual assembly workstations. Problems related to workstation layout design, planning, and evaluation are discussed in detail. A fuzzy neural network is used to predict the ranges of anatomical joint motions and to design or adjust workstations and tasks. The neuro-fuzzy computing scheme is integrated with analysis and evaluation of the operator's posture. For training and test purposes, an experiment was carried out to simulate assembly tasks on a multi-adjustable assembly workstation equipped with a flexible PEAK motion measurement and analysis system. The trained neural network is capable of memorizing and predicting the joint angles associated with a range of workstation configurations, so it can also be used for the design/layout and on-line adjustment of manual assembly workstations. The developed system thus provides a unified, computationally intelligent framework for the design, planning, and simulation of manual assembly workstations.

10.
Learning texture discrimination masks   Cited by: 6 (self: 0, others: 6)
A neural network texture classification method is proposed in this paper. The approach is introduced as a generalization of the multichannel filtering method. Instead of using a general filter bank, a neural network is trained to find a minimal set of specific filters, so that both the feature extraction and classification tasks are performed by the same unified network. The authors compute the error rates for different network parameters and show the convergence speed of the training and node pruning algorithms. The proposed method is demonstrated in several texture classification experiments. It is successfully applied to the tasks of locating barcodes in images and segmenting a printed page into text, graphics, and background. Compared with the traditional multichannel filtering method, the neural network approach allows one to perform the same texture classification or segmentation task more efficiently. Extensions of the method, as well as its limitations, are discussed in the paper.
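Conceptually, the unified network can be sketched as a single learned filter bank followed by a classifier over filter-response energies. The PyTorch sketch below is an assumption-laden illustration: the layer sizes, the nonlinearity, and the omission of the node-pruning step are not the authors' choices.

```python
# Conceptual sketch (assumptions, not the authors' network): one
# convolutional layer plays the role of the learned filter bank,
# and a linear layer classifies the filter-response energies.
import torch
import torch.nn as nn

class TextureNet(nn.Module):
    def __init__(self, n_filters=8, kernel=15, n_classes=3):
        super().__init__()
        # Learned filter bank replacing a fixed multichannel one.
        self.filters = nn.Conv2d(1, n_filters, kernel,
                                 padding=kernel // 2)
        self.classify = nn.Linear(n_filters, n_classes)

    def forward(self, patch):
        # Mean energy of each filter response is the texture feature.
        responses = torch.tanh(self.filters(patch))
        energy = responses.pow(2).mean(dim=(2, 3))
        return self.classify(energy)
```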

11.
One of the difficulties in the understanding of document images is document layout analysis, which is the first step in document image modeling. In this paper, a robust system using a multilevel-homogeneity structure in accordance with a hybrid methodology is proposed to deal with this problem. Our system consists of three main stages: classification, segmentation, and refinement and labeling. Unlike other page segmentation methods, the proposed system includes an efficient algorithm to detect table regions in document images. In addition, to create an effective application, the proposed system is designed to work with a variety of document languages. The proposed method was tested on the ICDAR2015 competition dataset (RDCL-2015) and three other published datasets in different languages. The results of these tests show that the accuracy of the proposed system is superior to that of previous methods.

12.
13.
In this paper, we propose a scheme for segmentation of multitexture images. The methodology involves extraction of texture features using an overcomplete wavelet decomposition scheme called the discrete M-band wavelet packet frame (DMbWPF), followed by the selection of important features using a neuro-fuzzy algorithm under unsupervised learning. A computationally efficient search procedure is developed for finding the optimal basis, based on a maximization criterion over textural measures derived from the statistical parameters of each subband. The superior discriminating capability of the extracted features for segmentation of various texture images, compared with features obtained by several existing methods, is established.

14.
To segment and classify pages of text images with complex layouts (e.g., pages containing irregularly shaped picture regions embedded in the text), a new page segmentation and classification algorithm based on pattern-chain analysis is proposed. The algorithm first encloses all black pixels in the image with bounding rectangles and stores them in a rectangle-chain list, then merges all adjacent rectangles to form patterns, and finally classifies each pattern by its statistical features, outputting two kinds of regions: text and pictures. In addition, the few uncertain patterns around large picture patterns are reclassified with a context-based classification algorithm. Experimental results show that the algorithm is not only fast but also correctly segments and classifies page images with complex layouts.
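The rectangle-merging step might look like the following sketch; the adjacency `margin` is a hypothetical parameter, not a value from the paper.

```python
# Illustrative sketch of merging adjacent bounding rectangles into
# patterns (details such as the adjacency margin are assumptions).
def adjacent(a, b, margin=3):
    """True if rectangles a, b = (x0, y0, x1, y1) overlap or lie
    within `margin` pixels of each other."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return not (ax1 + margin < bx0 or bx1 + margin < ax0 or
                ay1 + margin < by0 or by1 + margin < ay0)

def merge_rectangles(rects, margin=3):
    """Greedily merge adjacent rectangles into patterns until stable."""
    rects = list(rects)
    merged = True
    while merged:
        merged = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                if adjacent(rects[i], rects[j], margin):
                    a, b = rects[i], rects[j]
                    rects[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del rects[j]
                    merged = True
                    break
            if merged:
                break
    return rects
```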

15.
Document representation and its application to page decomposition   Cited by: 6 (self: 0, others: 6)
Transforming a paper document to its electronic version in a form suitable for efficient storage, retrieval, and interpretation continues to be a challenging problem. An efficient representation scheme for document images is necessary to solve this problem. Document representation involves techniques of thresholding, skew detection, geometric layout analysis, and logical layout analysis. The derived representation can then be used in document storage and retrieval. Page segmentation is an important stage in representing document images obtained by scanning journal pages. The performance of a document understanding system greatly depends on the correctness of page segmentation and the labeling of different regions such as text, tables, images, drawings, and rulers. We use the traditional bottom-up approach based on connected component extraction to efficiently implement page segmentation and region identification. A new document model which preserves top-down generation information is proposed, based on which a document is logically represented for interactive editing, storage, retrieval, transfer, and logical analysis. Our algorithm has high accuracy and takes approximately 1.4 seconds on an SGI Indy workstation for model creation, including orientation estimation, segmentation, and labeling (text, table, image, drawing, and ruler), for a 2550×3300 image of a typical journal page scanned at 300 dpi. This method is applicable to documents from various technical journals and can accommodate moderate amounts of skew and noise.

16.
Digital preservation of newspaper archives aims both at the rescue of endangered material (paper) and at the creation of digital library services that allow full utilization of the archives by all interested parties. In this paper, we address a series of issues pertaining to the retro-conversion of newspapers, i.e., the conversion of newspaper pages into digital resources. An integrated approach is presented that provides solutions to problems related to newspaper page image enhancement, segmentation of pages into various items (titles, text, images, etc.), article identification and reconstruction, and, finally, recognition of the textual components. Emphasis is placed on the most difficult intermediate stages of page segmentation and article identification and reconstruction. Detailed experimental results, obtained from a large testbed of old newspaper issues, are presented which clearly demonstrate the applicability of our methodology to the successful retro-conversion of newspaper material.

17.
This work presents a classification technique for hyperspectral image analysis that works whether or not concurrent ground truth is available. The method adopts a principal component analysis (PCA)-based projection pursuit (PP) procedure with an entropy index for dimensionality reduction, followed by segmentation based on a Markov random field (MRF) model. An ordinal optimization approach to PP determines a set of 'good enough projections' with high probability, the best among which is chosen with the help of the MRF model-based segmentation. When ground truth is absent, the segmented output is labelled with the desired number of classes so that it closely resembles the natural scene. When the land-cover classes are at a detailed level, special reflectance characteristics based on the classes of the study area are determined and incorporated in the segmentation stage. Segments are evaluated with training samples so as to yield a classified image consistent with the type of ground-truth data. Two illustrations are presented: (i) an AVIRIS-92AV3C image with concurrent ground truth, for both supervised and unsupervised cases, and (ii) an EO-1 Hyperion sensor image with concurrent ground truth for detailed-level classes. Comparisons of classification accuracy and computation time against other approaches accompany the illustrations. Experimental results demonstrate that the proposed method provides high classification accuracy and is not computationally intensive.
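The projection pursuit step with an entropy index can be sketched as ranking PCA projections by the entropy of their histograms; the bin count and the "keep the top few" rule below are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch of a PCA-based projection pursuit step with an entropy index
# (an assumption-laden illustration, not the paper's implementation).
import numpy as np

def projection_entropy(x, bins=32):
    """Shannon entropy of a 1-D projection's histogram."""
    p, _ = np.histogram(x, bins=bins)
    p = p[p > 0] / p.sum()
    return -(p * np.log(p)).sum()

def best_pca_projections(pixels, n_keep=10):
    """Rank the principal components of hyperspectral pixels
    (an N x bands array) by projected-data entropy; keep the best."""
    centered = pixels - pixels.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = [projection_entropy(centered @ v) for v in vt]
    order = np.argsort(scores)[::-1]  # highest entropy first
    return vt[order[:n_keep]]
```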

18.
An autoadaptive neuro-fuzzy segmentation and edge detection architecture is presented. The system consists of a multilayer perceptron (MLP)-like network that performs image segmentation by adaptive thresholding of the input image, using labels automatically pre-selected by a fuzzy clustering technique. The proposed architecture is feedforward, but unlike the conventional MLP the learning is unsupervised. The output status of the network is described as a fuzzy set. Fuzzy entropy is used as a measure of the error of the segmentation system as well as a criterion for determining potential edge pixels. The proposed system is capable of performing automatic multilevel segmentation of images based solely on information contained in the image itself: no a priori assumptions whatsoever are made about the image (type, features, contents, stochastic model, etc.). Such a "universal" algorithm is most useful for applications that must work with different (and possibly initially unknown) types of images. The proposed system can be readily employed as is, or as a basic building block of a more sophisticated and/or application-specific image segmentation algorithm. By monitoring the fuzzy entropy relaxation process, the system is able to detect edge pixels.
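The fuzzy entropy error measure can be written in the common De Luca-Termini form, sketched below; whether this exact normalization matches the paper's is an assumption.

```python
# Minimal sketch of fuzzy entropy as a segmentation error measure
# (De Luca-Termini form; the network architecture itself is omitted).
import numpy as np

def fuzzy_entropy(mu, eps=1e-12):
    """Entropy of a fuzzy membership map mu in [0, 1]; maximal at
    mu = 0.5 (most ambiguous pixels), zero at crisp 0/1 values."""
    mu = np.clip(mu, eps, 1 - eps)
    h = -(mu * np.log(mu) + (1 - mu) * np.log(1 - mu))
    return h.mean() / np.log(2)  # normalized to [0, 1]
```

Pixels whose membership values keep the per-pixel entropy high after relaxation are exactly the ambiguous, boundary-like pixels, which is why the same quantity can double as an edge criterion.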

19.
李文昊, 彭红超, 童名文, 石俊杰. 《计算机科学》 (Computer Science), 2015, 42(11): 284-287, 309
Web page segmentation is the key to adaptive web page rendering. To address the over-fragmentation and semi-automatic operation of the classic vision-based page segmentation algorithm VIPS (Vision-based Page Segmentation), this paper proposes a novel vision-based web optimal segmentation algorithm, VWOS (Vision-based Web Optimal Segmentation), built on the idea of optimal graph partitioning. Taking both visual features and page structure into account, a web page is modeled as a weighted undirected connected graph, turning page segmentation into an optimal graph partition; the VWOS algorithm is then designed around Kruskal's algorithm combined with the segmentation process. Experiments show that, compared with VIPS, pages segmented by VWOS have better semantic integrity, and no manual intervention is needed.
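The Kruskal-based partitioning can be sketched as growing a minimum spanning forest over the weighted page graph and stopping when the desired number of segments remains; the stopping rule by segment count is an illustrative assumption about VWOS, not a detail given in the abstract.

```python
# Hedged sketch of Kruskal-style graph partitioning into k segments.
def kruskal_partition(n_nodes, edges, k):
    """edges: (weight, u, v) triples over a weighted undirected graph;
    returns a root label per node defining k connected segments."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n_nodes
    for w, u, v in sorted(edges):  # cheapest (most similar) edges first
        if components == k:        # desired segment count reached
            break
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return [find(i) for i in range(n_nodes)]
```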

20.
A neuro-fuzzy approach for segmentation of human objects in image sequences   Cited by: 1 (self: 0, others: 1)
We propose a novel approach for segmentation of human objects, including face and body, in image sequences. Object segmentation is important for achieving a high compression ratio in modern video coding techniques, e.g., MPEG-4 and MPEG-7, and human objects are usually the main parts of the video streams in multimedia applications. Existing segmentation methods apply simple criteria to detect human objects, leading to restricted usage or high segmentation error. We combine temporal and spatial information and employ a neuro-fuzzy mechanism to overcome these difficulties. A fuzzy self-clustering technique divides the base frame of a video stream into a set of segments, which are then categorized as foreground or background based on a combination of multiple criteria. Human objects in the base frame and the remaining frames of the video stream are then precisely located by a fuzzy neural network that is constructed with the fuzzy rules previously obtained and trained by a singular value decomposition (SVD)-based hybrid learning algorithm. The proposed approach has been tested on several different video streams, and the results show that it produces much better segmentation than other methods.
