20 similar documents found
1.
Shuhua Wang Yang Cao Shijie Cai 《International Journal on Document Analysis and Recognition》2001,4(1):27-34
The most noticeable characteristic of a construction tender document is that its hierarchical architecture is not expressed explicitly but is implied in its citing information. Currently available methods cannot deal with such documents. In this paper, the intra-page and inter-page relationships are analyzed in detail. The creation of citing relationships is essential to extracting the logical structure of tender documents. The hierarchy of tender documents naturally leads to extracting and displaying the logical structure as a tree structure. This method has been successfully implemented in VHTender and is the key to the efficiency and flexibility of the whole system.
Received February 28, 2000 / Revised October 20, 2000
2.
Exploring Context-aware Information Push
Despite much interest over recent years in the area of context-aware computing, there are still a number of significant gaps
in our understanding of the HCI issues associated with such systems. One particular issue that remains relatively unexplored
is how to design around the apparently conflicting goals of adapting to changes in context while at the same time adhering
to the principle of predictability. In this paper, we describe our exploration of this issue through two alternative designs of an interactive context-aware tourist guide. Our first design was based on information pull, i.e. the onus is on the user to decide when context-aware information is presented. Our second design incorporates the notion of information push, whereby the presentation of context-aware information is triggered by contextual events, e.g. changes in the user’s location or changes to the opening times of attractions. Through the evaluation of these alternative designs we hope to gain a better understanding of the usability implications of push vs. pull, both in this specific domain and in interactive context-aware systems in general.
3.
Hwan-Chul Park Se-Young Ok Young-Jung Yu Hwan-Gue Cho 《International Journal on Document Analysis and Recognition》2001,4(2):115-130
Automatic character recognition and image understanding of a given paper document are the main objectives of the computer
vision field. For these problems, a basic step is to isolate characters and then group the isolated characters into words. In
this paper, we propose a new method for extracting characters from a mixed text/graphic machine-printed document and an algorithm
for distinguishing words from the isolated characters. For extracting characters, we exploit several features (size, elongation,
and density) of characters and propose a characteristic value for classification using the run-length frequency of the image
component. In the context of word grouping, previous works have largely been concerned with words which are placed on a horizontal
or vertical line. Our word grouping algorithm can group words which are on inclined lines, intersecting lines, and even curved
lines. To do this, we introduce the 3D neighborhood graph model which is very useful and efficient for character classification
and word grouping. In the 3D neighborhood graph model, each connected component of a text image segment is mapped onto 3D
space according to the area of the bounding box and positional information from the document. We conducted tests with more
than 20 English documents and more than ten oriental documents scanned from books, brochures, and magazines. Experimental
results show that more than 95% of words are successfully extracted from general documents, even in very complicated oriental
documents.
Received August 3, 2001 / Accepted August 8, 2001
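The 3D neighborhood graph above is only summarized in the abstract; the following is a minimal sketch of the general idea under stated assumptions: each connected component is mapped to a 3D point from its page position and log bounding-box area, and nearby points are linked, so components of similar size on the same line cluster into word candidates. All function names and the distance threshold are chosen for illustration and are not taken from the paper.

```python
import math

def to_3d_point(box):
    """Map a connected component's bounding box (x0, y0, x1, y1)
    to a 3D point: page position plus log-area as the third axis."""
    x0, y0, x1, y1 = box
    area = max((x1 - x0) * (y1 - y0), 1)
    return ((x0 + x1) / 2, (y0 + y1) / 2, math.log(area))

def neighborhood_graph(boxes, max_dist):
    """Connect components whose 3D points lie within max_dist.
    Similar-sized components that are close on the page end up
    linked; a large graphic blob stays isolated."""
    pts = [to_3d_point(b) for b in boxes]
    edges = []
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            if math.dist(pts[i], pts[j]) <= max_dist:
                edges.append((i, j))
    return edges

# Three characters on a line, close together, plus a distant graphic blob.
boxes = [(0, 0, 10, 10), (12, 0, 22, 10), (24, 0, 34, 10), (200, 200, 400, 400)]
print(neighborhood_graph(boxes, max_dist=30))   # → [(0, 1), (0, 2), (1, 2)]
```

Because size enters the metric as a third coordinate, a character and an overlapping large graphic are kept apart even when their page positions nearly coincide.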
4.
Amit Kumar Das Sanjoy Kumar Saha Bhabatosh Chanda 《International Journal on Document Analysis and Recognition》2002,4(3):183-190
Document image segmentation is the first step in document image analysis and understanding. One major problem centres on
the performance analysis of evolving segmentation algorithms. The use of standard document databases maintained by universities and research laboratories helps to solve the problem of obtaining authentic data sources, but methodologies are still needed for analyzing segmentation performance. We describe a new document model in terms
of a bounding box representation of its constituent parts and suggest an empirical measure of performance of a segmentation
algorithm based on this new graph-like model of the document. Besides the global error measures, the proposed method also
produces segment-wise details of common segmentation problems such as horizontal and vertical split and merge as well as invalid
and mismatched regions.
Received July 14, 2000 / Revised June 12, 2001
5.
Stefan Klink Thomas Kieninger 《International Journal on Document Analysis and Recognition》2001,4(1):18-26
Document image processing is a crucial stage in office automation; it begins with the OCR phase and continues with the more difficult phases of document analysis and understanding. This paper presents a hybrid and comprehensive approach to document structure analysis: hybrid in the sense that it makes use of layout (geometrical) as well as textual features of a given document. These features are
the base for potential conditions which in turn are used to express fuzzy matched rules of an underlying rule base. Rules
can be formulated based on features which might be observed within one specific layout object. However, rules can also express
dependencies between different layout objects. In addition to its rule driven analysis, which allows an easy adaptation to
specific domains with their specific logical objects, the system contains domain-independent markup algorithms for common
objects (e.g., lists).
Received June 19, 2000 / Revised November 8, 2000
6.
Identifying facsimile duplicates using radial pixel densities
P. Chatelain 《International Journal on Document Analysis and Recognition》2002,4(4):219-225
A method for detecting full layout facsimile duplicates based on radial pixel densities is proposed. It caters for facsimiles,
including text and/or graphics. Pages may be positioned upright or inverted on the scanner bed. The method is not dependent
on the computation of text skew or text orientation. Using a database of original documents, 92% of non-duplicates and upright
duplicates as well as 89% of inverted duplicates could be correctly identified. The method is vulnerable to double scanning.
This occurs when documents are copied using a photocopier and the copies are subsequently transmitted using a facsimile machine.
Received September 29, 2000 / Revised August 23, 2001
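The paper's exact density computation is not given in the abstract; the sketch below illustrates the general idea under stated assumptions: a histogram of black-pixel fractions over concentric rings about the page centre is unchanged by a 180-degree rotation, which is why inverted duplicates can still match. The bin count and matching threshold are illustrative, not the published values.

```python
import numpy as np

def radial_density(img, n_bins=32):
    """Fraction of black pixels in concentric rings around the page
    centre. Each pixel is assigned to a ring by its distance from
    the centre; a 180-degree rotation leaves these distances intact."""
    h, w = img.shape
    ys, xs = np.indices((h, w))
    r = np.hypot(ys - (h - 1) / 2, xs - (w - 1) / 2)
    bins = np.minimum((r / r.max() * n_bins).astype(int), n_bins - 1)
    black = img > 0
    counts = np.bincount(bins[black], minlength=n_bins)
    totals = np.bincount(bins.ravel(), minlength=n_bins)
    return counts / np.maximum(totals, 1)

def is_duplicate(a, b, threshold=0.05):
    """Declare a duplicate when the mean absolute difference of the
    two radial signatures falls below the threshold (illustrative)."""
    return np.mean(np.abs(radial_density(a) - radial_density(b))) < threshold

page = np.zeros((64, 64), dtype=np.uint8)
page[10:20, 5:60] = 1              # a "text line"
flipped = page[::-1, ::-1]         # the same page scanned upside-down
print(is_duplicate(page, flipped)) # → True
```

Note that, as the abstract says, such a signature is also insensitive to exact skew, since small rotations barely change radial distances; the trade-off is that double-scanned copies drift too far in density to match.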
7.
Marcel Worring Arnold W.M. Smeulders 《International Journal on Document Analysis and Recognition》1999,1(4):209-220
When archives of paper documents are to be accessed via the Internet, the implicit hypertext structure of the original documents should be employed. In this paper we study the different hypertext structures one encounters in a document. Methods for analyzing paper documents to find these structures are presented. The structures also form the basis for the presentation of the content of the document to the user. Results are presented. Received October 13, 1998 / Revised February 15, 1999
8.
Claudia Wenzel Heiko Maus 《International Journal on Document Analysis and Recognition》2001,3(4):248-260
Knowledge-based systems for document analysis and understanding (DAU) are quite useful whenever analysis has to deal with changing free-form document types that require different analysis components. In this case, declarative modeling is a good way to achieve flexibility. An important application domain for such systems is the business letter domain, where high accuracy and correct assignment to the right people and the right processes are crucial success factors. Our solution is a comprehensive knowledge-centered approach: we model not only comparatively static knowledge concerning
document properties and analysis results within the same declarative formalism, but we also include the analysis task and
the current context of the system environment within the same formalism. This allows an easy definition of new analysis tasks
and also an efficient and accurate analysis by using expectations about incoming documents as context information. The approach
described has been implemented within the VOPR (VOPR is an acronym for the Virtual Office PRototype.) system. This DAU system
gains the required context information from a commercial workflow management system (WfMS) by constant exchanges of expectations
and analysis tasks. Further interaction between these two systems covers the delivery of results from DAU to the WfMS and
the delivery of corrected results vice versa.
Received June 19, 1999 / Revised November 8, 2000
9.
Context-aware Retrieval: Exploring a New Environment for Information Retrieval and Information Filtering
The opportunities for context-aware computing are fast expanding. Computing systems can be made aware of their environment
by monitoring attributes such as their current location, the current time, the weather, or nearby equipment and users. Context-aware
computing often involves retrieval of information: it introduces a new aspect to technologies for information delivery; currently
these technologies are based mainly on contemporary approaches to information retrieval and information filtering. In this
paper, we consider how the closely related, but distinct, topics of information retrieval and information filtering relate
to context-aware retrieval. Our thesis is that context-aware retrieval is as yet a sparsely researched and sparsely understood
area, and we aim in this paper to make a start towards remedying this.
10.
Henry S. Baird Allison L. Coates Richard J. Fateman 《International Journal on Document Analysis and Recognition》2003,5(2-3):158-163
We exploit the gap in ability between human and machine vision systems to craft a family of automatic challenges that tell
human and machine users apart via graphical interfaces including Internet browsers. Turing proposed [Tur50] a method whereby
human judges might validate “artificial intelligence” by failing to distinguish between human and machine interlocutors. Stimulated
by the “chat room problem” posed by Udi Manber of Yahoo!, and influenced by the CAPTCHA project [BAL00] of Manuel Blum et
al. of Carnegie-Mellon Univ., we propose a variant of the Turing test using pessimal print: that is, low-quality images of machine-printed text synthesized pseudo-randomly over certain ranges of words, typefaces,
and image degradations. We show experimentally that judicious choice of these ranges can ensure that the images are legible
to human readers but illegible to several of the best present-day optical character recognition (OCR) machines. Our approach
is motivated by a decade of research on performance evaluation of OCR machines [RJN96,RNN99] and on quantitative stochastic
models of document image quality [Bai92,Kan96]. The slow pace of evolution of OCR and other species of machine vision over
many decades [NS96,Pav00] suggests that pessimal print will defy automated attack for many years. Applications include ‘bot’
barriers and database rationing.
Received: February 14, 2002 / Accepted: March 28, 2002
An expanded version of: A.L. Coates, H.S. Baird, R.J. Fateman (2001) Pessimal Print: a reverse Turing Test. In: Proc. 6th Int. Conf. on Document Analysis and Recognition, Seattle, Wash., USA, September 10–13, pp. 1154–1158
Correspondence to: H. S. Baird
11.
Jean-Marc Jot 《Multimedia Systems》1999,7(1):55-69
This paper gives an overview of the principles and methods for synthesizing complex 3D sound scenes by processing multiple
individual source signals. Signal-processing techniques for directional sound encoding and rendering over loudspeakers or
headphones are reviewed, as well as algorithms and interface models for synthesizing and dynamically controlling room reverberation and distance effects. A real-time modular spatial-sound-processing software system, called Spat, is presented. It allows reproducing and controlling the localization of sound sources in three dimensions and the reverberation
of sounds in an existing or virtual space. A particular aim of the Spatialisateur project is to provide direct and computationally
efficient control over perceptually relevant parameters describing the interaction of each sound source with the virtual space,
irrespective of the chosen reproduction format over loudspeakers or headphones. The advantages of this approach are illustrated
in practical contexts, including professional audio, computer music, multimodal immersive simulation systems, and architectural
acoustics.
12.
As part of the Spatial Location Protocol activity in the Internet Engineering Task Force (IETF), we have been working on
how to express location information in an interoperable way in the Internet. The objective of this paper is to share our ideas
on concepts for enabling interoperability and reuse of location information. These concepts can also be used in the area of
ubiquitous computing.
Correspondence to: Ms M. Korkea-aho, Department of Computer Science & Engineering, Helsinki University of Technology, Apollokatu 10 A 49, FIN-00100
Helsinki, Finland. Email: mari.korkea-aho@iki.fi
13.
Michael Cannon Judith Hochberg Patrick Kelly 《International Journal on Document Analysis and Recognition》1999,2(2-3):80-89
We present a useful method for assessing the quality of a typewritten document image and automatically selecting an optimal restoration method based on that assessment. We use five quality measures that assess the severity of background speckle, touching characters, and broken characters. A linear classifier uses these measures to select a restoration method. On a 139-document corpus, our methodology reduced the corpus OCR character error rate from 20.27% to 12.60%. Received November 10, 1998 / Revised October 27, 1999
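The five quality measures and the trained classifier weights are not reproduced in the abstract; the fragment below only sketches the selection mechanism under stated assumptions: a linear classifier scores each candidate restoration method from a feature vector of quality measures and picks the highest-scoring one. The method names, feature values, and weights are invented for illustration.

```python
import numpy as np

# Hypothetical feature vector: severities of background speckle,
# touching characters, broken characters, and two further measures
# (the paper uses five such quality measures; values here are made up).
features = np.array([0.8, 0.1, 0.1, 0.3, 0.2])

# One weight row per candidate restoration method; in the paper
# these would be fitted to training data, not hand-set.
methods = ["none", "despeckle", "thin", "thicken"]
W = np.array([
    [-1.0, -1.0, -1.0, 0.0, 0.0],  # "none": favoured when all severities are low
    [ 2.0, -0.5, -0.5, 0.0, 0.0],  # "despeckle": driven by the speckle score
    [-0.5,  2.0, -0.5, 0.0, 0.0],  # "thin": driven by touching characters
    [-0.5, -0.5,  2.0, 0.0, 0.0],  # "thicken": driven by broken characters
])
b = np.zeros(len(methods))

scores = W @ features + b          # one linear score per method
print(methods[int(np.argmax(scores))])  # → despeckle
```

The appeal of this design is that the selection step stays trivially cheap: once the five measures are computed, choosing a restoration method is a single matrix-vector product.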
14.
J. Hu R.S. Kashi D. Lopresti G.T. Wilfong 《International Journal on Document Analysis and Recognition》2002,4(3):140-153
While techniques for evaluating the performance of lower-level document analysis tasks such as optical character recognition
have gained acceptance in the literature, attempts to formalize the problem for higher-level algorithms, while receiving a
fair amount of attention in terms of theory, have generally been less successful in practice, perhaps owing to their complexity.
In this paper, we introduce intuitive, easy-to-implement evaluation schemes for the related problems of table detection and
table structure recognition. We also present the results of several small experiments, demonstrating how well the methodologies
work and the useful sorts of feedback they provide. We first consider the table detection problem. Here algorithms can yield
various classes of errors, including non-table regions improperly labeled as tables (insertion errors), tables missed completely
(deletion errors), larger tables broken into a number of smaller ones (splitting errors), and groups of smaller tables combined
to form larger ones (merging errors). This leads naturally to the use of an edit distance approach for assessing the results
of table detection. Next we address the problem of evaluating table structure recognition. Our model is based on a directed
acyclic attribute graph, or table DAG. We describe a new paradigm, “graph probing,” for comparing the results returned by
the recognition system and the representation created during ground-truthing. Probing is in fact a general concept that could
be applied to other document recognition tasks as well.
Received July 18, 2000 / Accepted October 4, 2001
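The precise edit-distance formulation is not given in the abstract; the sketch below illustrates how the four error classes can be counted from region overlaps under stated assumptions: a ground-truth table overlapping several detections is a split, several ground-truth tables covered by one detection is a merge, and unmatched regions on either side are deletions or insertions. The overlap test and region format are illustrative, not the paper's.

```python
def overlaps(a, b):
    """Axis-aligned boxes (x0, y0, x1, y1) share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def classify_detections(truth, detected):
    """Count outcomes by how many regions on each side overlap:
    1-1 = correct, 1-n = split, n-1 = merge, and no overlap on
    either side = deletion (missed table) or insertion (false table)."""
    t_hits = [[j for j, d in enumerate(detected) if overlaps(t, d)] for t in truth]
    d_hits = [[i for i, t in enumerate(truth) if overlaps(t, d)] for d in detected]
    counts = {"correct": 0, "split": 0, "merge": 0, "deletion": 0, "insertion": 0}
    for hits in t_hits:
        if not hits:
            counts["deletion"] += 1
        elif len(hits) > 1:
            counts["split"] += 1
        elif len(d_hits[hits[0]]) == 1:
            counts["correct"] += 1
    for hits in d_hits:
        if not hits:
            counts["insertion"] += 1
        elif len(hits) > 1:
            counts["merge"] += 1
    return counts

truth = [(0, 0, 10, 10), (0, 20, 10, 30), (0, 40, 10, 50)]
detected = [(0, 0, 10, 10),                        # matches table 0 exactly
            (0, 20, 10, 24), (0, 26, 10, 30)]      # table 1 split in two; table 2 missed
print(classify_detections(truth, detected))
```

The error counts can then feed an edit-distance-style score, e.g. by charging one unit per deletion or insertion and a smaller cost per split or merge.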
15.
Constantine Stephanidis Anthony Savidis 《Universal Access in the Information Society》2001,1(1):40-55
Accessibility and high quality of interaction with products, applications, and services by anyone, anywhere, and at any time are fundamental requirements
for universal access in the emerging Information Society. This paper discusses these requirements, and their relation to the concept of automated
adaptation of user interfaces. An example application is presented, showing how adaptation can be used to accommodate the
requirements of different user categories and contexts of use. This application is then used as a vehicle for discussing a
new engineering paradigm appropriate for the development of adaptation-based user interfaces. Finally, the paper investigates
issues concerning the interaction technologies required for universal access.
Published online: 23 May 2001
16.
Pierre M. Devaux Daniel B. Lysak Rangachar Kasturi 《International Journal on Document Analysis and Recognition》1999,2(2-3):120-131
Converting paper-based engineering drawings into CAD model files is a tedious process. Therefore, automating the conversion
of such drawings represents tremendous time and labor savings. We present a complete system which interprets such 2D paper-based
engineering drawings, and outputs 3D models that can be displayed as wireframes. The system performs the detection of dimension
sets, the extraction of object lines, and the assembly of 3D objects from the extracted object lines. A knowledge-based method
is used to remove dimension sets and text from ANSI engineering drawings, a graphics recognition procedure is used to extract
complete object lines, and an evidential rule-based method is utilized to identify view relationships. While these methods
are the subject of several of our previous papers, this paper focuses on the 3D interpretation of the object. This is accomplished
using a technique based on evidential reasoning and a wide range of rules and heuristics. The system is limited to the interpretation
of objects composed of planar, spherical, and cylindrical surfaces. Experimental results are presented.
Received December 2, 1998 / Revised June 18, 1999
17.
People wish to maintain a level of awareness of timely information, including presence of others in the workplace and other
social settings. We believe this provides better exchange, coordination and contact within a community, especially as people
work in asynchronous times and distributed locations. The challenge is to develop lightweight techniques for awareness, interaction
and communication using shared information appliances. In this paper, we describe the design of an exploratory responsive display projected within a shared workspace at the MIT
Media Lab. The system uses visual sensing to provide relevant information and constructs traces of people’s activity over
time. Such aware portals may be deployed in casual workplace domains, distributed workgroups, and everyday public spaces.
18.
In this paper we describe a prototype spatial audio user interface for a Global Positioning System (GPS). The interface is
designed to allow mobile users to carry out location tasks while their eyes, hands or attention are otherwise engaged. Audio
user interfaces for GPS have typically been designed to meet the needs of visually impaired users, and generally, though not
exclusively, employ speech-audio. In contrast, our prototype system uses a simple form of non-speech, spatial audio. This
paper analyses various candidate audio mappings for location and distance information. A variety of tasks, design considerations,
design trade-offs and opportunities are considered. The findings from pilot empirical testing are reported. Finally, opportunities
for improvements to the system and for future evaluation are explored.
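The prototype's actual audio mapping is not specified in the abstract; the following is a minimal sketch of one candidate non-speech mapping under stated assumptions: stereo pan derived from the bearing to the target relative to the user's heading, and loudness from distance. All parameter names and constants are illustrative.

```python
import math

def pan_and_gain(user_heading_deg, bearing_to_target_deg, distance_m,
                 min_gain=0.1, full_gain_at_m=10.0):
    """One candidate audio mapping: pan a beacon tone left/right by
    the angle between the user's heading and the target bearing, and
    raise its gain as the target gets closer (parameters illustrative)."""
    # Relative bearing folded into [-180, 180) degrees, then to radians.
    rel = math.radians((bearing_to_target_deg - user_heading_deg + 540) % 360 - 180)
    pan = math.sin(rel)          # -1 = hard left, +1 = hard right
    gain = max(min_gain, min(1.0, full_gain_at_m / max(distance_m, full_gain_at_m)))
    return pan, gain

# Target 90 degrees to the user's right, 10 m away: hard-right, full volume.
print(pan_and_gain(user_heading_deg=0, bearing_to_target_deg=90, distance_m=10))  # → (1.0, 1.0)
```

A sine pan has the front/back ambiguity typical of simple stereo mappings (a target directly behind also pans to centre), which is exactly the kind of trade-off the paper's empirical testing would probe.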
19.
Off-the-shelf information systems requirements (OISR) differ from requirements for developing new information systems in that they do not necessarily provide complete specifications, thus allowing flexibility in matching an existing IS to the stated needs. We present a framework for OISR conceptual models that consists of four essential elements: business
processes, business rules, information objects and required system services. We formalise the definitions of these concepts
based on an ontological model. The ontology-based OISR model provides a framework to evaluate modelling languages on how appropriate
they are for OISR requirements specifications. The evaluation framework is applied to the Object-Process Methodology, and
its results are compared with a similar evaluation of ARIS. This comparison demonstrates the effectiveness of the ontological
framework for evaluating modelling tools on how well they can guide selection, implementation and integration of purchased
software packages.
20.
Francesco Garibaldo 《AI & Society》2002,16(4):305-331
This article, first of all, supports the idea that the undeniable process of ICT-based technological convergence implies
the social, cultural and business unification of the world of media and culture. The poor performance of megamergers is a clear indicator of the unstable ground of the convergence hypothesis. Secondly, it argues in favour of cooperation between
different expertise, skills and cultures to make multimedia products or to supply multimedia services, instead of creating
from scratch a brand new class of hybrid skills and professions. Thirdly, a variety of new possible and realistically achievable
professional profiles in cultural industries and institutions are illustrated. Eventually a set of public policies, in the
light of a new role for cities and regions, is developed.
Correspondence and offprint requests to: Francesco Garibaldo, Fondazione ‘Istituto Per il Lavoro (IPL)’, via Marconi 8, 40122 Bologna, Italy. Email: f.garibaldo@ipielle.emr.it