20 similar documents found
1.
Shuhua Wang Yang Cao Shijie Cai 《International Journal on Document Analysis and Recognition》2001,4(1):27-34
The most noticeable characteristic of a construction tender document is that its hierarchical architecture is not expressed explicitly but is implied in its citing information. Currently available methods cannot deal with such documents. In this paper, the intra-page and inter-page relationships are analyzed in detail. The creation of citing relationships is essential to extracting the logical structure of tender documents. The hierarchy of tender documents naturally leads to extracting and displaying the logical structure as a tree structure. This method has been successfully implemented in VHTender and is the key to the efficiency and flexibility of the whole system.
Received February 28, 2000 / Revised October 20, 2000
2.
Exploring Context-aware Information Push
Despite much interest over recent years in the area of context-aware computing, there are still a number of significant gaps
in our understanding of the HCI issues associated with such systems. One particular issue that remains relatively unexplored
is how to design around the apparently conflicting goals of adapting to changes in context while at the same time adhering
to the principle of predictability. In this paper, we describe our exploration of this issue through two alternative designs of an interactive context-aware tourist guide. Our first design was based on information pull, i.e. the onus is on the user to decide when context-aware information is presented. Our second design incorporates the notion of information push, whereby the presentation of context-aware information is triggered by contextual events, e.g. changes in the user’s location or changes to the opening times of attractions. Through the evaluation of these alternative designs we hope to gain a better understanding of the usability implications of push vs. pull, both in this specific domain and in interactive context-aware systems in general.
3.
Hwan-Chul Park Se-Young Ok Young-Jung Yu Hwan-Gue Cho 《International Journal on Document Analysis and Recognition》2001,4(2):115-130
Automatic character recognition and image understanding of a given paper document are the main objectives of the computer
vision field. For these problems, a basic step is to isolate characters and then group the isolated characters into words. In
this paper, we propose a new method for extracting characters from a mixed text/graphic machine-printed document and an algorithm
for distinguishing words from the isolated characters. For extracting characters, we exploit several features (size, elongation,
and density) of characters and propose a characteristic value for classification using the run-length frequency of the image
component. In the context of word grouping, previous works have largely been concerned with words which are placed on a horizontal
or vertical line. Our word grouping algorithm can group words which are on inclined lines, intersecting lines, and even curved
lines. To do this, we introduce the 3D neighborhood graph model which is very useful and efficient for character classification
and word grouping. In the 3D neighborhood graph model, each connected component of a text image segment is mapped onto 3D
space according to the area of the bounding box and positional information from the document. We conducted tests with more
than 20 English documents and more than ten oriental documents scanned from books, brochures, and magazines. Experimental
results show that more than 95% of words are successfully extracted from general documents, even in very complicated oriental
documents.
Received August 3, 2001 / Accepted August 8, 2001
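The 3D neighborhood graph above is only summarized in the abstract; the following is a minimal sketch of the general idea under stated assumptions: each connected component is mapped to a 3D point from its page position and log bounding-box area, and nearby points are linked, so components of similar size on the same line cluster into word candidates. All function names and the distance threshold are chosen for illustration and are not taken from the paper.

```python
import math

def to_3d_point(box):
    """Map a connected component's bounding box (x0, y0, x1, y1)
    to a 3D point: page position plus log-area as the third axis."""
    x0, y0, x1, y1 = box
    area = max((x1 - x0) * (y1 - y0), 1)
    return ((x0 + x1) / 2, (y0 + y1) / 2, math.log(area))

def neighborhood_graph(boxes, max_dist):
    """Connect components whose 3D points lie within max_dist.
    Similar-sized components that are close on the page end up
    linked; a large graphic blob stays isolated."""
    pts = [to_3d_point(b) for b in boxes]
    edges = []
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            if math.dist(pts[i], pts[j]) <= max_dist:
                edges.append((i, j))
    return edges

# Three characters on a line, close together, plus a distant graphic blob.
boxes = [(0, 0, 10, 10), (12, 0, 22, 10), (24, 0, 34, 10), (200, 200, 400, 400)]
print(neighborhood_graph(boxes, max_dist=30))   # → [(0, 1), (0, 2), (1, 2)]
```

Because size enters the metric as a third coordinate, a character and an overlapping large graphic are kept apart even when their page positions nearly coincide.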
4.
Amit Kumar Das Sanjoy Kumar Saha Bhabatosh Chanda 《International Journal on Document Analysis and Recognition》2002,4(3):183-190
Document image segmentation is the first step in document image analysis and understanding. One major problem centres on
the performance analysis of evolving segmentation algorithms. The use of standard document databases maintained by universities and research laboratories helps to solve the problem of obtaining authentic data sources, but methodologies are still needed for analyzing segmentation performance. We describe a new document model in terms
of a bounding box representation of its constituent parts and suggest an empirical measure of performance of a segmentation
algorithm based on this new graph-like model of the document. Besides the global error measures, the proposed method also
produces segment-wise details of common segmentation problems such as horizontal and vertical split and merge as well as invalid
and mismatched regions.
Received July 14, 2000 / Revised June 12, 2001
5.
Stefan Klink Thomas Kieninger 《International Journal on Document Analysis and Recognition》2001,4(1):18-26
Document image processing is a crucial stage in office automation; it begins with the OCR phase and continues with the more difficult phases of document analysis and understanding. This paper presents a hybrid and comprehensive approach to document structure analysis: hybrid in the sense that it makes use of layout (geometrical) as well as textual features of a given document. These features are
the base for potential conditions which in turn are used to express fuzzy matched rules of an underlying rule base. Rules
can be formulated based on features which might be observed within one specific layout object. However, rules can also express
dependencies between different layout objects. In addition to its rule driven analysis, which allows an easy adaptation to
specific domains with their specific logical objects, the system contains domain-independent markup algorithms for common
objects (e.g., lists).
Received June 19, 2000 / Revised November 8, 2000
6.
Identifying facsimile duplicates using radial pixel densities
P. Chatelain 《International Journal on Document Analysis and Recognition》2002,4(4):219-225
A method for detecting full layout facsimile duplicates based on radial pixel densities is proposed. It caters for facsimiles,
including text and/or graphics. Pages may be positioned upright or inverted on the scanner bed. The method is not dependent
on the computation of text skew or text orientation. Using a database of original documents, 92% of non-duplicates and upright
duplicates as well as 89% of inverted duplicates could be correctly identified. The method is vulnerable to double scanning.
This occurs when documents are copied using a photocopier and the copies are subsequently transmitted using a facsimile machine.
Received September 29, 2000 / Revised August 23, 2001
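The paper's exact density computation is not given in the abstract; the sketch below illustrates the general idea under stated assumptions: a histogram of black-pixel fractions over concentric rings about the page centre is unchanged by a 180-degree rotation, which is why inverted duplicates can still match. The bin count and matching threshold are illustrative, not the published values.

```python
import numpy as np

def radial_density(img, n_bins=32):
    """Fraction of black pixels in concentric rings around the page
    centre. Each pixel is assigned to a ring by its distance from
    the centre; a 180-degree rotation leaves these distances intact."""
    h, w = img.shape
    ys, xs = np.indices((h, w))
    r = np.hypot(ys - (h - 1) / 2, xs - (w - 1) / 2)
    bins = np.minimum((r / r.max() * n_bins).astype(int), n_bins - 1)
    black = img > 0
    counts = np.bincount(bins[black], minlength=n_bins)
    totals = np.bincount(bins.ravel(), minlength=n_bins)
    return counts / np.maximum(totals, 1)

def is_duplicate(a, b, threshold=0.05):
    """Declare a duplicate when the mean absolute difference of the
    two radial signatures falls below the threshold (illustrative)."""
    return np.mean(np.abs(radial_density(a) - radial_density(b))) < threshold

page = np.zeros((64, 64), dtype=np.uint8)
page[10:20, 5:60] = 1              # a "text line"
flipped = page[::-1, ::-1]         # the same page scanned upside-down
print(is_duplicate(page, flipped)) # → True
```

Note that, as the abstract says, such a signature is also insensitive to exact skew, since small rotations barely change radial distances; the trade-off is that double-scanned copies drift too far in density to match.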
7.
Marcel Worring Arnold W.M. Smeulders 《International Journal on Document Analysis and Recognition》1999,1(4):209-220
When archives of paper documents are to be accessed via the Internet, the implicit hypertext structure of the original documents should be employed. In this paper we study the different hypertext structures one encounters in a document. Methods for analyzing paper documents to find these structures are presented. The structures also form the basis for the presentation of the content of the document to the user. Results are presented. Received October 13, 1998 / Revised February 15, 1999
8.
Claudia Wenzel Heiko Maus 《International Journal on Document Analysis and Recognition》2001,3(4):248-260
Knowledge-based systems for document analysis and understanding (DAU) are quite useful whenever analysis has to deal with changing free-form document types that require different analysis components. In this case, declarative modeling is a good way to achieve flexibility. An important application domain for such systems is the business letter domain, where high accuracy and correct assignment to the right people and the right processes are crucial success factors. Our solution is a comprehensive knowledge-centered approach: we model not only comparatively static knowledge concerning
document properties and analysis results within the same declarative formalism, but we also include the analysis task and
the current context of the system environment within the same formalism. This allows an easy definition of new analysis tasks
and also an efficient and accurate analysis by using expectations about incoming documents as context information. The approach
described has been implemented within the VOPR (VOPR is an acronym for the Virtual Office PRototype.) system. This DAU system
gains the required context information from a commercial workflow management system (WfMS) by constant exchanges of expectations
and analysis tasks. Further interaction between these two systems covers the delivery of results from DAU to the WfMS and
the delivery of corrected results vice versa.
Received June 19, 1999 / Revised November 8, 2000
9.
Context-aware Retrieval: Exploring a New Environment for Information Retrieval and Information Filtering
The opportunities for context-aware computing are fast expanding. Computing systems can be made aware of their environment
by monitoring attributes such as their current location, the current time, the weather, or nearby equipment and users. Context-aware
computing often involves retrieval of information: it introduces a new aspect to technologies for information delivery; currently
these technologies are based mainly on contemporary approaches to information retrieval and information filtering. In this
paper, we consider how the closely related, but distinct, topics of information retrieval and information filtering relate
to context-aware retrieval. Our thesis is that context-aware retrieval is as yet a sparsely researched and sparsely understood
area, and we aim in this paper to make a start towards remedying this.
10.
Henry S. Baird Allison L. Coates Richard J. Fateman 《International Journal on Document Analysis and Recognition》2003,5(2-3):158-163
We exploit the gap in ability between human and machine vision systems to craft a family of automatic challenges that tell
human and machine users apart via graphical interfaces including Internet browsers. Turing proposed [Tur50] a method whereby
human judges might validate “artificial intelligence” by failing to distinguish between human and machine interlocutors. Stimulated
by the “chat room problem” posed by Udi Manber of Yahoo!, and influenced by the CAPTCHA project [BAL00] of Manuel Blum et
al. of Carnegie-Mellon Univ., we propose a variant of the Turing test using pessimal print: that is, low-quality images of machine-printed text synthesized pseudo-randomly over certain ranges of words, typefaces,
and image degradations. We show experimentally that judicious choice of these ranges can ensure that the images are legible
to human readers but illegible to several of the best present-day optical character recognition (OCR) machines. Our approach
is motivated by a decade of research on performance evaluation of OCR machines [RJN96,RNN99] and on quantitative stochastic
models of document image quality [Bai92,Kan96]. The slow pace of evolution of OCR and other species of machine vision over
many decades [NS96,Pav00] suggests that pessimal print will defy automated attack for many years. Applications include ‘bot’
barriers and database rationing.
Received: February 14, 2002 / Accepted: March 28, 2002
An expanded version of: A.L. Coates, H.S. Baird, R.J. Fateman (2001) Pessimal Print: a reverse Turing Test. In: Proc. 6th Int. Conf. on Document Analysis and Recognition, Seattle, Wash., USA, September 10–13, pp. 1154–1158
Correspondence to: H. S. Baird
11.
Jean-Marc Jot 《Multimedia Systems》1999,7(1):55-69
This paper gives an overview of the principles and methods for synthesizing complex 3D sound scenes by processing multiple
individual source signals. Signal-processing techniques for directional sound encoding and rendering over loudspeakers or
headphones are reviewed, as well as algorithms and interface models for synthesizing and dynamically controlling room reverberation and distance effects. A real-time modular spatial-sound-processing software system, called Spat, is presented. It allows reproducing and controlling the localization of sound sources in three dimensions and the reverberation
of sounds in an existing or virtual space. A particular aim of the Spatialisateur project is to provide direct and computationally
efficient control over perceptually relevant parameters describing the interaction of each sound source with the virtual space,
irrespective of the chosen reproduction format over loudspeakers or headphones. The advantages of this approach are illustrated
in practical contexts, including professional audio, computer music, multimodal immersive simulation systems, and architectural
acoustics.
12.
As part of the Spatial Location Protocol activity in the Internet Engineering Task Force (IETF), we have been working on
how to express location information in an interoperable way in the Internet. The objective of this paper is to share our ideas
on concepts for enabling interoperability and reuse of location information. These concepts can also be used in the area of
ubiquitous computing.
Correspondence to: Ms M. Korkea-aho, Department of Computer Science & Engineering, Helsinki University of Technology, Apollokatu 10 A 49, FIN-00100
Helsinki, Finland. Email: mari.korkea-aho@iki.fi
13.
Michael Cannon Judith Hochberg Patrick Kelly 《International Journal on Document Analysis and Recognition》1999,2(2-3):80-89
We present a useful method for assessing the quality of a typewritten document image and automatically selecting an optimal restoration method based on that assessment. We use five quality measures that assess the severity of background speckle, touching characters, and broken characters. A linear classifier uses these measures to select a restoration method. On a 139-document corpus, our methodology reduced the corpus OCR character error rate from 20.27% to 12.60%. Received November 10, 1998 / Revised October 27, 1999
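The five quality measures and the trained classifier weights are not reproduced in the abstract; the fragment below only sketches the selection mechanism under stated assumptions: a linear classifier scores each candidate restoration method from a feature vector of quality measures and picks the highest-scoring one. The method names, feature values, and weights are invented for illustration.

```python
import numpy as np

# Hypothetical feature vector: severities of background speckle,
# touching characters, broken characters, and two further measures
# (the paper uses five such quality measures; values here are made up).
features = np.array([0.8, 0.1, 0.1, 0.3, 0.2])

# One weight row per candidate restoration method; in the paper
# these would be fitted to training data, not hand-set.
methods = ["none", "despeckle", "thin", "thicken"]
W = np.array([
    [-1.0, -1.0, -1.0, 0.0, 0.0],  # "none": favoured when all severities are low
    [ 2.0, -0.5, -0.5, 0.0, 0.0],  # "despeckle": driven by the speckle score
    [-0.5,  2.0, -0.5, 0.0, 0.0],  # "thin": driven by touching characters
    [-0.5, -0.5,  2.0, 0.0, 0.0],  # "thicken": driven by broken characters
])
b = np.zeros(len(methods))

scores = W @ features + b          # one linear score per method
print(methods[int(np.argmax(scores))])  # → despeckle
```

The appeal of this design is that the selection step stays trivially cheap: once the five measures are computed, choosing a restoration method is a single matrix-vector product.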
14.
J. Hu R.S. Kashi D. Lopresti G.T. Wilfong 《International Journal on Document Analysis and Recognition》2002,4(3):140-153
While techniques for evaluating the performance of lower-level document analysis tasks such as optical character recognition
have gained acceptance in the literature, attempts to formalize the problem for higher-level algorithms, while receiving a
fair amount of attention in terms of theory, have generally been less successful in practice, perhaps owing to their complexity.
In this paper, we introduce intuitive, easy-to-implement evaluation schemes for the related problems of table detection and
table structure recognition. We also present the results of several small experiments, demonstrating how well the methodologies
work and the useful sorts of feedback they provide. We first consider the table detection problem. Here algorithms can yield
various classes of errors, including non-table regions improperly labeled as tables (insertion errors), tables missed completely
(deletion errors), larger tables broken into a number of smaller ones (splitting errors), and groups of smaller tables combined
to form larger ones (merging errors). This leads naturally to the use of an edit distance approach for assessing the results
of table detection. Next we address the problem of evaluating table structure recognition. Our model is based on a directed
acyclic attribute graph, or table DAG. We describe a new paradigm, “graph probing,” for comparing the results returned by
the recognition system and the representation created during ground-truthing. Probing is in fact a general concept that could
be applied to other document recognition tasks as well.
Received July 18, 2000 / Accepted October 4, 2001
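The precise edit-distance formulation is not given in the abstract; the sketch below illustrates how the four error classes can be counted from region overlaps under stated assumptions: a ground-truth table overlapping several detections is a split, several ground-truth tables covered by one detection is a merge, and unmatched regions on either side are deletions or insertions. The overlap test and region format are illustrative, not the paper's.

```python
def overlaps(a, b):
    """Axis-aligned boxes (x0, y0, x1, y1) share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def classify_detections(truth, detected):
    """Count outcomes by how many regions on each side overlap:
    1-1 = correct, 1-n = split, n-1 = merge, and no overlap on
    either side = deletion (missed table) or insertion (false table)."""
    t_hits = [[j for j, d in enumerate(detected) if overlaps(t, d)] for t in truth]
    d_hits = [[i for i, t in enumerate(truth) if overlaps(t, d)] for d in detected]
    counts = {"correct": 0, "split": 0, "merge": 0, "deletion": 0, "insertion": 0}
    for hits in t_hits:
        if not hits:
            counts["deletion"] += 1
        elif len(hits) > 1:
            counts["split"] += 1
        elif len(d_hits[hits[0]]) == 1:
            counts["correct"] += 1
    for hits in d_hits:
        if not hits:
            counts["insertion"] += 1
        elif len(hits) > 1:
            counts["merge"] += 1
    return counts

truth = [(0, 0, 10, 10), (0, 20, 10, 30), (0, 40, 10, 50)]
detected = [(0, 0, 10, 10),                        # matches table 0 exactly
            (0, 20, 10, 24), (0, 26, 10, 30)]      # table 1 split in two; table 2 missed
print(classify_detections(truth, detected))
```

The error counts can then feed an edit-distance-style score, e.g. by charging one unit per deletion or insertion and a smaller cost per split or merge.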
15.
Constantine Stephanidis Anthony Savidis 《Universal Access in the Information Society》2001,1(1):40-55
Accessibility and high quality of interaction with products, applications, and services by anyone, anywhere, and at any time are fundamental requirements
for universal access in the emerging Information Society. This paper discusses these requirements, and their relation to the concept of automated
adaptation of user interfaces. An example application is presented, showing how adaptation can be used to accommodate the
requirements of different user categories and contexts of use. This application is then used as a vehicle for discussing a
new engineering paradigm appropriate for the development of adaptation-based user interfaces. Finally, the paper investigates
issues concerning the interaction technologies required for universal access.
Published online: 23 May 2001
16.
Pierre M. Devaux Daniel B. Lysak Rangachar Kasturi 《International Journal on Document Analysis and Recognition》1999,2(2-3):120-131
Converting paper-based engineering drawings into CAD model files is a tedious process. Therefore, automating the conversion
of such drawings represents tremendous time and labor savings. We present a complete system which interprets such 2D paper-based
engineering drawings, and outputs 3D models that can be displayed as wireframes. The system performs the detection of dimension
sets, the extraction of object lines, and the assembly of 3D objects from the extracted object lines. A knowledge-based method
is used to remove dimension sets and text from ANSI engineering drawings, a graphics recognition procedure is used to extract
complete object lines, and an evidential rule-based method is utilized to identify view relationships. While these methods
are the subject of several of our previous papers, this paper focuses on the 3D interpretation of the object. This is accomplished
using a technique based on evidential reasoning and a wide range of rules and heuristics. The system is limited to the interpretation
of objects composed of planar, spherical, and cylindrical surfaces. Experimental results are presented.
Received December 2, 1998 / Revised June 18, 1999
17.
People wish to maintain a level of awareness of timely information, including presence of others in the workplace and other
social settings. We believe this provides better exchange, coordination and contact within a community, especially as people
work in asynchronous times and distributed locations. The challenge is to develop lightweight techniques for awareness, interaction
and communication using shared information appliances. In this paper, we describe the design of an exploratory responsive display projected within a shared workspace at the MIT
Media Lab. The system uses visual sensing to provide relevant information and constructs traces of people’s activity over
time. Such aware portals may be deployed in casual workplace domains, distributed workgroups, and everyday public spaces.
18.
In this paper we describe a prototype spatial audio user interface for a Global Positioning System (GPS). The interface is
designed to allow mobile users to carry out location tasks while their eyes, hands or attention are otherwise engaged. Audio
user interfaces for GPS have typically been designed to meet the needs of visually impaired users, and generally, though not
exclusively, employ speech-audio. In contrast, our prototype system uses a simple form of non-speech, spatial audio. This
paper analyses various candidate audio mappings for location and distance information. A variety of tasks, design considerations,
design trade-offs and opportunities are considered. The findings from pilot empirical testing are reported. Finally, opportunities
for improvements to the system and for future evaluation are explored.
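The prototype's actual audio mapping is not specified in the abstract; the following is a minimal sketch of one candidate non-speech mapping under stated assumptions: stereo pan derived from the bearing to the target relative to the user's heading, and loudness from distance. All parameter names and constants are illustrative.

```python
import math

def pan_and_gain(user_heading_deg, bearing_to_target_deg, distance_m,
                 min_gain=0.1, full_gain_at_m=10.0):
    """One candidate audio mapping: pan a beacon tone left/right by
    the angle between the user's heading and the target bearing, and
    raise its gain as the target gets closer (parameters illustrative)."""
    # Relative bearing folded into [-180, 180) degrees, then to radians.
    rel = math.radians((bearing_to_target_deg - user_heading_deg + 540) % 360 - 180)
    pan = math.sin(rel)          # -1 = hard left, +1 = hard right
    gain = max(min_gain, min(1.0, full_gain_at_m / max(distance_m, full_gain_at_m)))
    return pan, gain

# Target 90 degrees to the user's right, 10 m away: hard-right, full volume.
print(pan_and_gain(user_heading_deg=0, bearing_to_target_deg=90, distance_m=10))  # → (1.0, 1.0)
```

A sine pan has the front/back ambiguity typical of simple stereo mappings (a target directly behind also pans to centre), which is exactly the kind of trade-off the paper's empirical testing would probe.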
19.
Off-the-shelf information systems requirements (OISR) differ from requirements for developing new information systems in that they do not necessarily provide complete specifications, thus allowing flexibility in matching an existing IS to the stated needs. We present a framework for OISR conceptual models that consists of four essential elements: business
processes, business rules, information objects and required system services. We formalise the definitions of these concepts
based on an ontological model. The ontology-based OISR model provides a framework to evaluate modelling languages on how appropriate
they are for OISR requirements specifications. The evaluation framework is applied to the Object-Process Methodology, and
its results are compared with a similar evaluation of ARIS. This comparison demonstrates the effectiveness of the ontological
framework for evaluating modelling tools on how well they can guide selection, implementation and integration of purchased
software packages.
20.
Francesco Garibaldo 《AI & Society》2002,16(4):305-331
This article, first of all, supports the idea that the undeniable process of ICT-based technological convergence implies
the social, cultural and business unification of the world of media and culture. The poor performance of megamergers is a clear indicator of the unstable ground of the convergence hypothesis. Secondly, it argues in favour of cooperation between
different expertise, skills and cultures to make multimedia products or to supply multimedia services, instead of creating
from scratch a brand new class of hybrid skills and professions. Thirdly, a variety of new possible and realistically achievable
professional profiles in cultural industries and institutions are illustrated. Eventually a set of public policies, in the
light of a new role for cities and regions, is developed.
Correspondence and offprint requests to: Francesco Garibaldo, Fondazione ‘Istituto Per il Lavoro (IPL)’, via Marconi 8, 40122 Bologna, Italy. Email: f.garibaldo@ipielle.emr.it