Similar literature
 Found 20 similar documents; search took 328 ms
1.
Geometric groundtruth at the character, word, and line levels is crucial for designing and evaluating optical character recognition (OCR) algorithms. Kanungo and Haralick proposed a closed-loop methodology for generating geometric groundtruth for rescanned document images. The procedure assumed that the original image and the corresponding groundtruth were available. It automatically registered the original image to the rescanned one using four corner points and then transformed the original groundtruth using the estimated registration transformation. In this paper, we present an attributed branch-and-bound algorithm for establishing the point correspondence that uses all the data points. We group the original feature points into blobs and use corners of blobs for matching. The Euclidean distance between character centroids is used as the error metric. We conducted experiments on synthetic point sets with varying layout complexity to characterize the performance of two matching algorithms. We also report results on experiments conducted using the University of Washington dataset. Finally, we show examples of application of this methodology for generating groundtruth for microfilmed and FAXed versions of the University of Washington dataset documents. Received: July 24, 2001 / Accepted: May 20, 2002
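The registration-then-transform idea in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not name the transformation model, so an affine map fitted by least squares is assumed here, and the names `fit_affine`, `apply_affine`, and `mean_centroid_error` are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_affine(src, dst):
    """Least-squares affine map (x, y) -> (a*x + b*y + c, d*x + e*y + f)."""
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations (A^T A) p = A^T t, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):
        atb = [sum(r[i] * d[k] for r, d in zip(rows, dst)) for i in range(3)]
        params.append(solve_linear(ata, atb))
    return params


def apply_affine(params, point):
    """Map one groundtruth point into the rescanned image's coordinates."""
    x, y = point
    return tuple(a * x + b * y + c for a, b, c in params)


def mean_centroid_error(params, original, rescanned):
    """Mean Euclidean distance between mapped and observed centroids."""
    dists = [math.dist(apply_affine(params, p), q)
             for p, q in zip(original, rescanned)]
    return sum(dists) / len(dists)
```

Given matched points from the original and rescanned images, `fit_affine` estimates the map, `apply_affine` carries each groundtruth coordinate across, and `mean_centroid_error` mirrors the centroid-distance error metric mentioned in the abstract.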

2.
Straight lines have to be straight   Total citations: 18, self-citations: 0, citations by others: 18
Most algorithms in 3D computer vision rely on the pinhole camera model because of its simplicity, whereas video optics, especially low-cost wide-angle or fish-eye lenses, generate a lot of non-linear distortion which can be critical. To find the distortion parameters of a camera, we use the following fundamental property: a camera follows the pinhole model if and only if the projection of every line in space onto the camera is a line. Consequently, if we find the transformation on the video image so that every line in space is viewed in the transformed image as a line, then we know how to remove the distortion from the image. The algorithm consists of first doing edge extraction on a possibly distorted video sequence, then doing polygonal approximation with a large tolerance on these edges to extract possible lines from the sequence, and then finding the parameters of our distortion model that best transform these edges to segments. Results are presented on real video images, compared with distortion calibration obtained by a full camera calibration method which uses a calibration grid. Received: 27 December 1999 / Accepted: 8 November 2000
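The "lines must stay lines" criterion can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: a one-parameter radial model `u = c + (d - c) * (1 + k * r^2)`, a straightness measure based on the smallest eigenvalue of each edge's scatter matrix, and a plain search over candidate values of `k`. The abstract does not specify the paper's distortion model or optimizer.

```python
import math


def undistort(point, k, center=(0.0, 0.0)):
    """Map a distorted point to its corrected position (assumed radial model)."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    scale = 1.0 + k * (dx * dx + dy * dy)
    return (center[0] + dx * scale, center[1] + dy * scale)


def straightness_error(points):
    """Smallest eigenvalue of the 2x2 scatter matrix; zero for collinear points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    half_trace = (sxx + syy) / 2.0
    return half_trace - math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)


def total_error(edges, k):
    """Sum of straightness errors over all edges after undistorting with k."""
    return sum(straightness_error([undistort(p, k) for p in edge])
               for edge in edges)


def estimate_k(edges, candidates):
    """Pick the candidate distortion parameter that makes edges straightest."""
    return min(candidates, key=lambda k: total_error(edges, k))
```

Each entry of `edges` is a list of points extracted from one image edge that is believed to be the projection of a straight line; a real system would replace the candidate search with a continuous optimizer.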

3.
This paper presents a local approach for matching contour segments in an image sequence. This study has been primarily motivated by work concerned with the recovery of 3D structure using active vision. Recovering the 3D structure of the scene requires tracking contour segments in an image sequence in real time. Here, we propose an original and robust approach that is ideally suited for this problem. It is also of more general interest and can be used in any context requiring matching of line boundaries over time. This method only involves local modeling and computation of moving edges dealing “virtually” with a contour segment primitive representation. Such an approach is robust to contour segmentation instability and to occlusion, and is easy to implement. Parallelism has also been investigated using an SIMD-based real-time image-processing system. This method has been validated with experiments on several real-image sequences. Our results show quite satisfactory performance and the algorithm runs in a few milliseconds. Received: 11 December 1996 / Accepted: 8 August 1997

4.
The paper presents an analysis of the stability of pose estimation. Stability is defined as sensitivity of the pose parameters to noise in the image features used for estimating pose. The specific emphasis of the analysis is on determining how the stability varies with viewpoint relative to an object and on understanding the relationships between object geometry, viewpoint, and pose stability. Two pose estimation techniques are investigated. One uses a numerical scheme for finding pose parameters; the other is based on closed form solutions. Both are “pose from trihedral vertices” techniques, which provide the rotation part of object pose based on orientations of three edge segments. The analysis is based on generalized sensitivity analysis propagating the uncertainty in edge segment orientations to the resulting effect on the pose parameters. It is shown that there is a precomputable, generic relationship between viewpoint and pose stability, and that there is a drastic difference in stability over the range of viewpoints. This viewpoint variation is shared by the two investigated techniques. Additionally, the paper offers an explicit way to determine the most robust viewpoints directly for any given vertex model. Experiments on real images show that the results of the work can be used to compute the variance in pose parameters for any given pose. For the predicted unstable viewpoints the variance in pose parameters is on the order of 20 (degrees squared), whereas the variance for robust viewpoints is on the order of 0.05 (degrees squared), i.e., two orders of magnitude difference.

5.
Two methods for stroke segmentation from a global point of view are presented and compared. One is based on thinning methods and the other is based on contour curve fitting. For both cases an input image is binarized. For the former, Hilditch's method is used, then crossing points are sought, around which a domain is constructed. Outside the domain, a set of line segments is identified. These lines are connected and approximated by cubic B-spline curves. Smoothly connected lines are selected as segmented curves. This method works well for a limited class of crossing lines, as shown experimentally. In the latter, a contour line is approximated by a cubic B-spline curve, along which curvature is measured. According to the extreme points of the curvature graph, the contour line is segmented, based on which the line segment is obtained. Experimental results are shown for some difficult cases. Received October 31, 1998 / Revised January 12, 1999

6.
This paper describes a complete stereovision system, which was originally developed for planetary applications, but can be used for other applications such as object modeling. A new effective on-site calibration technique has been developed, which can make use of the information from the surrounding environment as well as the information from the calibration apparatus. A correlation-based stereo algorithm is used, which can produce sufficiently dense range maps with an algorithmic structure for fast implementations. A technique based on iterative closest-point matching has been developed for registration of successive depth maps and computation of the displacements between successive positions. A statistical method based on the distance distribution is integrated into this registration technique, which allows us to deal with such important problems as outliers, occlusion, appearance, and disappearance. Finally, the registered maps are expressed in the same coordinate system and are fused, erroneous data are eliminated through consistency checking, and a global digital elevation map is built incrementally.
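A much-reduced sketch of iterative closest-point registration with distance-based outlier rejection is given below. It is a 2D toy under stated assumptions, not the system described in the abstract: the rejection rule (drop pairs farther apart than the mean plus two standard deviations of the pair distances) and the brute-force nearest-neighbour search are choices made here for illustration only.

```python
import math


def nearest(point, cloud):
    """Brute-force nearest neighbour of `point` in `cloud`."""
    return min(cloud, key=lambda q: math.dist(point, q))


def fit_rigid(pairs):
    """Closed-form 2D rotation and translation minimizing squared pair distances."""
    n = len(pairs)
    pcx = sum(p[0] for p, _ in pairs) / n
    pcy = sum(p[1] for p, _ in pairs) / n
    qcx = sum(q[0] for _, q in pairs) / n
    qcy = sum(q[1] for _, q in pairs) / n
    s_cross = s_dot = 0.0
    for (px, py), (qx, qy) in pairs:
        ax, ay, bx, by = px - pcx, py - pcy, qx - qcx, qy - qcy
        s_cross += ax * by - ay * bx
        s_dot += ax * bx + ay * by
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    tx = qcx - (c * pcx - s * pcy)
    ty = qcy - (s * pcx + c * pcy)
    return theta, tx, ty


def transform(points, theta, tx, ty):
    """Rotate by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp(source, target, iterations=20, tol=1e-12):
    """Align `source` to `target`; returns the moved source points."""
    current = list(source)
    for _ in range(iterations):
        pairs = [(p, nearest(p, target)) for p in current]
        dists = [math.dist(p, q) for p, q in pairs]
        mean = sum(dists) / len(dists)
        if mean < tol:
            break
        std = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
        # Assumed rejection rule: drop pairs farther than mean + 2 sigma.
        limit = mean + 2.0 * std + 1e-12
        kept = [pair for pair, d in zip(pairs, dists) if d <= limit]
        current = transform(current, *fit_rigid(kept))
    return current
```

Rejecting pairs from the tail of the distance distribution is what lets points with no true counterpart (occluded or newly appearing surface) be ignored instead of dragging the estimate.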

7.
When developing packaged software, which is sold ‘off-the-shelf’ on a worldwide marketplace, it is essential to collect needs and opportunities from different market segments and use this information in the prioritisation of requirements for the next software release. This paper presents an industrial case study where a distributed prioritisation process is proposed, observed and evaluated. The stakeholders in the requirements prioritisation process include marketing offices distributed around the world. A major objective of the distributed prioritisation is to gather and highlight the differences and similarities in the requirement priorities of the different market segments. The evaluation through questionnaires shows that the stakeholders found the process useful. The paper also presents novel approaches to visualise the priority distribution among stakeholders, together with measures on disagreement and satisfaction. Product management found the proposed charts valuable as decision support when selecting requirements for the next release, as they revealed unforeseen differences among stakeholder priorities. Conclusions on stakeholder tactics are provided and issues of further research are identified, including ways of addressing identified challenges.

8.
In this study, a fuzzy-inference-rule-based flexible model (FIR-FM) for automatic elastic image registration is proposed. First, according to the characteristics of elastic image registration, an FIR-FM is proposed to model the complex geometric transformation and feature variation in elastic image registration. Then, by introducing the concept of motion estimation and the corresponding sum-of-squared-difference (SSD) objective function, the parameter learning rules of the proposed model are derived for general image registration. Based on the likelihood objective function, particular attention is also paid to the derivation of parameter learning rules for the case of partial image registration. Thus, an FIR-FM-based automatic elastic image registration algorithm is presented here. It is distinguished by its 1) strong ability in approximating complex nonlinear transformation inherited from fuzzy inference; 2) efficiency and adaptability in obtaining precise model parameters through effective parameter learning rules; and 3) completely automatic registration process that avoids the requirement of manual control, as in many traditional landmark-based algorithms. Our experiments show that the proposed method has an obvious advantage in speed and is comparable in registration accuracy to a state-of-the-art algorithm.

9.
Handling a tertiary storage device, such as an optical disk library, in the framework of a disk-based stream service model requires a sophisticated streaming model for the server, which should consider the device-specific performance characteristics of tertiary storage. This paper discusses the design and implementation of a video server which uses tertiary storage as a source of media archiving. We have carefully designed the streaming mechanism for a server whose key functionalities include stream scheduling, disk caching and admission control. The stream scheduling model incorporates the tertiary media staging into a disk-based scheduling process, and also enhances the utilization of tertiary device bandwidth. The disk caching mechanism manages the limited capacity of the hard disk efficiently to guarantee the availability of media segments on the hard disk. The admission controller provides an adequate mechanism which decides upon the admission of a new request based on the current resource availability of the server. The proposed system has been implemented on a general-purpose operating system and it is fully operational. The design principles of the server are validated with real experiments, and the performance characteristics are analyzed. The results guide us on how servers with tertiary storage should be deployed effectively in a real environment. E-mail: hjcha@cs.yonsei.ac.kr

10.
Machine vision system for curved surface inspection   Total citations: 2, self-citations: 0, citations by others: 2
This application-oriented paper discusses a non-contact 3D range data measurement system to improve the performance of the existing 2D herring roe grading system. The existing system uses a single CCD camera with unstructured halogen lighting to acquire and analyze the 2D shape of the herring roe for size and deformity grading. Our system will act as an additional system module, which can be integrated into the existing 2D grading system, providing the additional third dimension to detect deformities in the herring roe, which were not detected in the 2D analysis. Furthermore, the additional surface depth data will increase the accuracy of the weight information used in the existing grading system. In the proposed system, multiple laser light stripes are projected onto the herring roe and the single B/W CCD camera records the image of the scene. The distortion in the projected line pattern is due to the surface curvature and orientation. Utilizing the linear relation between the projected line distortion and surface depth, the range data was recovered from a single camera image. The measurement technique is described and the depth information is obtained through four steps: (1) image capture, (2) stripe extraction, (3) stripe coding, and (4) triangulation and system calibration. Then, this depth information can be converted into the curvature and orientation of the shape for deformity inspection, and also used for the weight estimation. Preliminary results are included to show the feasibility and performance of our measurement technique. The accuracy and reliability of the computerized herring roe grading system can be greatly improved by integrating this system into the existing system in the future.

11.
Conventional tracking methods encounter difficulties as the number of objects, clutter, and sensors increase, because of the requirement for data association. Statistical tracking, based on the concept of network tomography, is an alternative that avoids data association. It estimates the number of trips made from one region to another in a scene based on interregion boundary traffic counts accumulated over time. It is not necessary to track an object through a scene to determine when an object crosses a boundary. This paper describes statistical tracking and presents an evaluation based on the estimation of pedestrian and vehicular traffic intensities at an intersection over a period of 1 month. We compare the results with those from a multiple-hypothesis tracker and manually counted ground-truth estimates. Received: 30 August 2001 / Accepted: 28 May 2002 Correspondence to: J.E. Boyd

12.
Spatial indexing of high-dimensional data based on relative approximation   Total citations: 2, self-citations: 0, citations by others: 2
We propose a novel index structure, the A-tree (approximation tree), for similarity searches in high-dimensional data. The basic idea of the A-tree is the introduction of virtual bounding rectangles (VBRs) which contain and approximate MBRs or data objects. VBRs can be represented quite compactly and thus affect the tree configuration both quantitatively and qualitatively. First, since tree nodes can contain a large number of VBR entries, fanout becomes large, which increases search speed. More importantly, we have a free hand in arranging MBRs and VBRs in the tree nodes. Each A-tree node contains an MBR and its children VBRs. Therefore, by fetching an A-tree node, we can obtain information on the exact position of a parent MBR and the approximate position of its children. We have performed experiments using both synthetic and real data sets. For the real data sets, the A-tree outperforms the SR-tree and the VA-file in all dimensionalities up to 64 dimensions, which is the highest dimension in our experiments. Additionally, we propose a cost model for the A-tree. We verify the validity of the cost model for synthetic and real data sets. Edited by T. Sellis. Received: December 8, 2000 / Accepted: March 20, 2002 Published online: September 25, 2002
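The idea of a VBR that "contains and approximates" a child MBR relative to its parent can be sketched as a quantization step. The encoding below is an assumption chosen for illustration (uniform grid over the parent extent, outward rounding, hypothetical names `encode_vbr` and `decode_vbr`); the abstract does not give the paper's exact code layout.

```python
import math


def encode_vbr(parent, child, bits):
    """Quantize a child MBR relative to its parent MBR.

    Rectangles are lists of (low, high) intervals, one per dimension.
    Returns one (low_code, high_code) pair per dimension, each in 0..2**bits.
    """
    levels = 2 ** bits
    codes = []
    for (p_lo, p_hi), (c_lo, c_hi) in zip(parent, child):
        extent = p_hi - p_lo  # assumed to be non-zero
        lo = math.floor((c_lo - p_lo) / extent * levels)  # round outward (down)
        hi = math.ceil((c_hi - p_lo) / extent * levels)   # round outward (up)
        codes.append((max(lo, 0), min(hi, levels)))
    return codes


def decode_vbr(parent, codes, bits):
    """Reconstruct the approximate rectangle from the parent MBR and the codes."""
    levels = 2 ** bits
    return [
        (p_lo + lo / levels * (p_hi - p_lo), p_lo + hi / levels * (p_hi - p_lo))
        for (p_lo, p_hi), (lo, hi) in zip(parent, codes)
    ]
```

Because the low edge is rounded down and the high edge is rounded up, the decoded rectangle always encloses the child, so a search that prunes on the VBR never discards a true match; the small integer codes are what allow many entries per node.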

13.
Random perturbation models for boundary extraction sequence   Total citations: 2, self-citations: 0, citations by others: 2
Computer vision algorithms are composed of different sub-algorithms often applied in sequence. Determination of the performance of a total computer vision algorithm is possible if the performance of each of the sub-algorithm constituents is given. The performance characterization of an algorithm has to do with establishing the correspondence between the random variations and imperfections in the output data and the random variations and imperfections in the input data. In this paper we illustrate how random perturbation models can be set up for a vision algorithm sequence involving edge finding, edge linking, and gap filling. By starting with an appropriate noise model for the input data we derive random perturbation models for the output data at each stage of our example sequence. By utilizing the perturbation model for edge detector output derived, we illustrate how pixel noise can be successively propagated to derive an error model for the boundary extraction output. It is shown that the fragmentation of an ideal boundary can be described by an alternating renewal process and that the parameters of the renewal process are related to the probability of correct detection and grouping at the edge linking step. It is also shown that the characteristics of random segments generated due to gray-level noise are functions of the probability of false alarm of the edge detector. Theoretical results are validated through systematic experiments.

14.
We present an autonomous mobile robot navigation system using stereo fish-eye lenses for navigation in an indoor structured environment and for generating a model of the imaged scene. The system estimates the three-dimensional (3D) position of significant features in the scene, and by estimating its relative position to the features, navigates through narrow passages and makes turns at corridor ends. Fish-eye lenses are used to provide a large field of view, which images objects close to the robot and helps in making smooth transitions in the direction of motion. Calibration is performed for the lens-camera setup and the distortion is corrected to obtain accurate quantitative measurements. A vision-based algorithm that uses the vanishing points of extracted segments from a scene in a few 3D orientations provides an accurate estimate of the robot orientation. This is used, in addition to 3D recovery via stereo correspondence, to maintain the robot motion in a purely translational path, as well as to remove the effects of any drifts from this path from each acquired image. Horizontal segments are used as a qualitative estimate of change in the motion direction and correspondence of vertical segments provides precise 3D information about objects close to the robot. Assuming detected linear edges in the scene as boundaries of planar surfaces, the 3D model of the scene is generated. The robot system is implemented and tested in a structured environment at our research center. Results from the robot navigation in real environments are presented and discussed. Received: 25 September 1996 / Accepted: 20 October 1996

15.
In this paper, an integrated offline recognition system for unconstrained handwriting is presented. The proposed system consists of seven main modules: skew angle estimation and correction, printed-handwritten text discrimination, line segmentation, slant removal, word segmentation, and character segmentation and recognition, stemming from the implementation of already existing algorithms as well as novel algorithms. This system has been tested on the NIST, IAM-DB, and GRUHD databases and has achieved accuracy that varies from 65.6% to 100% depending on the database and the experiment.

16.
In this paper, a novel method is presented for generating a textured CAD model of an outdoor urban environment using a vehicle-borne sensor system. In data measurement, three single-row laser range scanners and six line cameras are mounted on a measurement vehicle, which has been equipped with a GPS/INS/Odometer-based navigation system. Laser range and line images are measured as the vehicle moves forward. They are synchronized with the navigation system so they can be geo-referenced to a world coordinate system. Generation of the CAD model is conducted in two steps. A geometric model is first generated using the geo-referenced laser range data, where urban features, such as buildings, ground surfaces, and trees are extracted in a hierarchical way. Different urban features are represented using different geometric primitives, such as a planar face, a triangulated irregular network (TIN), and a triangle. The texture of the urban features is generated by projecting and resampling line images onto the geometric model. An outdoor experiment is conducted, and a textured CAD model of a real urban environment is reconstructed in a fully automatic mode.

17.
An Internet-based negotiation server for e-commerce   Total citations: 6, self-citations: 0, citations by others: 6
This paper describes the design and implementation of a replicable, Internet-based negotiation server for conducting bargaining-type negotiations between enterprises involved in e-commerce and e-business. Enterprises can be buyers and sellers of products/services or participants of a complex supply chain engaged in purchasing, planning, and scheduling. Multiple copies of our server can be installed to complement the services of Web servers. Each enterprise can install or select a trusted negotiation server to represent its interests. Web-based GUI tools are used during the build-time registration process to specify the requirements, constraints, and rules that represent negotiation policies and strategies, preference scoring of different data conditions, and aggregation methods for deriving a global cost-benefit score for the item(s) under negotiation. The registration information is used by the negotiation servers to automatically conduct bargaining-type negotiations on behalf of their clients. In this paper, we present the architecture of our implementation as well as a framework for automated negotiations, and describe a number of communication primitives which are used in the underlying negotiation protocol. A constraint satisfaction processor (CSP) is used to evaluate a negotiation proposal or counterproposal against the registered requirements and constraints of a client company. In case of a constraint violation, an event is posted to trigger the execution of negotiation strategic rules, which either automatically relax the violated constraint, ask for human intervention, invoke an application, or perform other remedial operations. An Event-Trigger-Rule (ETR) server is used to manage events, triggers, and rules. Negotiation strategic rules can be added or modified at run-time. A cost-benefit analysis component is used to perform quantitative analysis of alternatives. The use of negotiation servers to conduct automated negotiation has been demonstrated in the context of an integrated supply chain scenario. Received: 30 October 2000 / Accepted: 12 January 2001 Published online: 2 August 2001

18.
This paper reports on the mechanical verification of the IEEE 1394 root contention protocol. This is an industrial leader election protocol, in which timing parameters play an essential role. A manual verification of this protocol using I/O automata has been published in [24]. We improve the communication model presented in that paper. Using the Uppaal2k tool, we investigate the timing constraints on the parameters which are necessary and sufficient for correct protocol operation: by analyzing large numbers of protocol instances with different parameter values, we derive the required timing constraints. We explore the use of model checking in combination with stepwise abstraction. That is, we show that the implementation automaton correctly implements the specification via several intermediate automata, using Uppaal to prove the trace inclusion in each step. Published online: 18 July 2001

19.
An architecture for handwritten text recognition systems   Total citations: 1, self-citations: 1, citations by others: 0
This paper presents an end-to-end system for reading handwritten page images. Five functional modules included in the system are introduced in this paper: (i) pre-processing, which concerns introducing an image representation for easy manipulation of large page images and image handling procedures using the image representation; (ii) line separation, concerning text line detection and extracting images of lines of text from a page image; (iii) word segmentation, which concerns locating word gaps and isolating words from a line of text image obtained efficiently and in an intelligent manner; (iv) word recognition, concerning handwritten word recognition algorithms; and (v) linguistic post-processing, which concerns the use of linguistic constraints to intelligently parse and recognize text. Key ideas employed in each functional module, which have been developed for dealing with the diversity of handwriting in its various aspects with a goal of system reliability and robustness, are described in this paper. Preliminary experiments show promising results in terms of speed and accuracy. Received October 30, 1998 / Revised January 15, 1999

20.
Large multimedia document archives may hold a major fraction of their data in tertiary storage libraries for cost reasons. This paper develops an integrated approach to the vertical data migration between the tertiary, secondary, and primary storage in that it reconciles speculative prefetching, to mask the high latency of the tertiary storage, with the replacement policy of the document caches at the secondary and primary storage level, and also considers the interaction of these policies with the tertiary and secondary storage request scheduling. The integrated migration policy is based on a continuous-time Markov chain model for predicting the expected number of accesses to a document within a specified time horizon. Prefetching is initiated only if that expectation is higher than those of the documents that need to be dropped from secondary storage to free up the necessary space. In addition, the possible resource contention at the tertiary and secondary storage is taken into account by dynamically assessing the response-time benefit of prefetching a document versus the penalty that it would incur on the response time of the pending document requests. The parameters of the continuous-time Markov chain model, the probabilities of co-accessing certain documents and the interaction times between successive accesses, are dynamically estimated and adjusted to evolving workload patterns by keeping online statistics. The integrated policy for vertical data migration has been implemented in a prototype system. The system makes profitable use of the Markov chain model also for the scheduling of volume exchanges in the tertiary storage library. Detailed simulation experiments with Web-server-like synthetic workloads indicate significant gains in terms of client response time. The experiments also show that the overhead of the statistical bookkeeping and the computations for the access predictions is affordable. Received January 1, 1998 / Accepted May 27, 1998
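The prefetch rule stated in this abstract (prefetch only if the candidate's expected number of accesses exceeds that of every document that would have to be dropped) can be sketched as below. The sketch takes the expected access counts as given inputs; it does not model the Markov chain that produces them, and it omits the contention and response-time assessment the abstract also mentions. The function name and the dictionary keys are hypothetical.

```python
def should_prefetch(candidate, cached, free_space):
    """Decide whether to prefetch `candidate` into secondary storage.

    `candidate` and each entry of `cached` are dicts with the keys
    "name", "size", and "expected_accesses" (assumed to come from an
    access-prediction model). Returns (decision, names_of_victims).
    """
    needed = candidate["size"] - free_space
    victims = []
    # Consider the least valuable cached documents first.
    for doc in sorted(cached, key=lambda d: d["expected_accesses"]):
        if needed <= 0:
            break
        victims.append(doc)
        needed -= doc["size"]
    if needed > 0:
        return False, []  # not enough space even after evicting everything
    # Prefetch only if the candidate beats every document it would displace.
    if all(candidate["expected_accesses"] > v["expected_accesses"]
           for v in victims):
        return True, [v["name"] for v in victims]
    return False, []
```

Evicting in ascending order of expected accesses means the comparison is always made against the cheapest possible set of victims, so a rejected candidate would not have been accepted under any other eviction order.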
