1.

Leveraging corporate context within knowledge-based document analysis and understanding

Claudia Wenzel Heiko Maus 《International Journal on Document Analysis and Recognition》2001,3(4):248-260

Knowledge-based systems for document analysis and understanding (DAU) are quite useful whenever analysis has to deal with changing free-form document types which require different analysis components. In this case, declarative modeling is a good way to achieve flexibility. An important application domain for such systems is the business letter domain. Here, high accuracy and the correct assignment to the right people and the right processes are crucial success factors. Our solution is a comprehensive knowledge-centered approach: we model not only comparatively static knowledge concerning document properties and analysis results within the same declarative formalism, but we also include the analysis task and the current context of the system environment within the same formalism. This allows an easy definition of new analysis tasks and also an efficient and accurate analysis by using expectations about incoming documents as context information. The approach described has been implemented within the VOPR (Virtual Office PRototype) system. This DAU system gains the required context information from a commercial workflow management system (WfMS) by constant exchanges of expectations and analysis tasks. Further interaction between these two systems covers the delivery of results from DAU to the WfMS and the delivery of corrected results vice versa. Received June 19, 1999 / Revised November 8, 2000

2.

The GMAP: a versatile tool for physical data independence (cited by: 1; self-citations: 0; citations by others: 1)

Odysseas G. Tsatalos Marvin H. Solomon Yannis E. Ioannidis 《The VLDB Journal The International Journal on Very Large Data Bases》1996,5(2):101-118

Physical data independence is touted as a central feature of modern database systems. It allows users to frame queries in terms of the logical structure of the data, letting a query processor automatically translate them into optimal plans that access physical storage structures. Both relational and object-oriented systems, however, force users to frame their queries in terms of a logical schema that is directly tied to physical structures. We present an approach that eliminates this dependence. All storage structures are defined in a declarative language based on relational algebra as functions of a logical schema. We present an algorithm, integrated with a conventional query optimizer, that translates queries over this logical schema into plans that access the storage structures. We also show how to compile update requests into plans that update all relevant storage structures consistently and optimally. Finally, we report on experiments with a prototype implementation of our approach that demonstrate how it allows storage structures to be tuned to the expected or observed workload to achieve significantly better performance than is possible with conventional techniques. Edited by Matthias Jarke, Jorge Bocca, Carlo Zaniolo. Received September 15, 1994 / Accepted September 1, 1995

3.

Efficient and reliable digital media archive for content-based retrieval

Shih-Ping Liou Rune Hjelsvold Remi Depommier Arding Hsu 《Multimedia Systems》1999,7(4):256-268

4.

A fast algorithm for skew detection of document images using morphology (cited by: 1; self-citations: 0; citations by others: 1)

A.K. Das B. Chanda 《International Journal on Document Analysis and Recognition》2001,4(2):109-114

5.

WIRE3: Driving Around the Information Super-Highway

Stuart Goose Safia Djennane 《Personal and Ubiquitous Computing》2002,6(3):164-175

Interactive voice browsers offer an alternative paradigm that affords ubiquitous mobile access to the WWW using a wide range of consumer devices. This technology can facilitate a safe, “hands-free” browsing environment that is of importance both to car drivers and various mobile and technical professionals. This paper describes the challenges of architecting an interactive voice browser that combines digital audio with the features of a speech synthesizer to make structural elements of the document explicit to the listener. The aesthetics of the audio rendition can simultaneously help reduce the monotony factor and enhance comprehension. The evolution of the voice browser gave rise to a new conceptual model of the HTML document structure and its mapping to a 3D audio space. A number of novel features are discussed for improving both the user’s comprehension of the HTML document structure and their orientation within it. These factors, in turn, can improve the effectiveness of the browsing experience.

6.

Integrated document caching and prefetching in storage hierarchies based on Markov-chain predictions

Achim Kraiss Gerhard Weikum 《The VLDB Journal The International Journal on Very Large Data Bases》1998,7(3):141-162

Large multimedia document archives may hold a major fraction of their data in tertiary storage libraries for cost reasons. This paper develops an integrated approach to the vertical data migration between the tertiary, secondary, and primary storage in that it reconciles speculative prefetching, to mask the high latency of the tertiary storage, with the replacement policy of the document caches at the secondary and primary storage level, and also considers the interaction of these policies with the tertiary and secondary storage request scheduling. The integrated migration policy is based on a continuous-time Markov chain model for predicting the expected number of accesses to a document within a specified time horizon. Prefetching is initiated only if that expectation is higher than those of the documents that need to be dropped from secondary storage to free up the necessary space. In addition, the possible resource contention at the tertiary and secondary storage is taken into account by dynamically assessing the response-time benefit of prefetching a document versus the penalty that it would incur on the response time of the pending document requests. The parameters of the continuous-time Markov chain model, the probabilities of co-accessing certain documents and the interaction times between successive accesses, are dynamically estimated and adjusted to evolving workload patterns by keeping online statistics. The integrated policy for vertical data migration has been implemented in a prototype system. The system makes profitable use of the Markov chain model also for the scheduling of volume exchanges in the tertiary storage library. Detailed simulation experiments with Web-server-like synthetic workloads indicate significant gains in terms of client response time. The experiments also show that the overhead of the statistical bookkeeping and the computations for the access predictions is affordable. Received January 1, 1998 / Accepted May 27, 1998
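
The prefetch rule stated in this abstract (prefetch only if the candidate's expected number of accesses exceeds that of every document it would displace) can be illustrated with a small sketch. This is not the authors' implementation: the function name `plan_prefetch`, the size and capacity units, and the idea of passing the expectations in as a precomputed dictionary are all assumptions made for illustration. The Markov chain model that would produce those expectations is deliberately left out.

```python
from typing import Dict, List, Optional


def plan_prefetch(
    candidate: str,
    size: Dict[str, int],
    expected_accesses: Dict[str, float],
    cached: List[str],
    capacity: int,
) -> Optional[List[str]]:
    """Return the cached documents to evict so that `candidate` fits,
    or None if prefetching is not worthwhile (hypothetical helper).

    `expected_accesses` is assumed to come from some access-prediction
    model; this sketch does not compute it.
    """
    free = capacity - sum(size[d] for d in cached)
    victims: List[str] = []
    # Consider the least valuable cached documents first.
    for doc in sorted(cached, key=lambda d: expected_accesses[d]):
        if free >= size[candidate]:
            break
        # Prefetch only if the candidate beats every document it displaces.
        if expected_accesses[doc] >= expected_accesses[candidate]:
            return None
        victims.append(doc)
        free += size[doc]
    return victims if free >= size[candidate] else None


sizes = {"a": 4, "b": 4, "c": 4}
hot = {"a": 0.2, "b": 1.5, "c": 0.9}
cold = {"a": 0.2, "b": 1.5, "c": 0.1}
plan_hot = plan_prefetch("c", sizes, hot, ["a", "b"], capacity=8)
plan_cold = plan_prefetch("c", sizes, cold, ["a", "b"], capacity=8)
```

With these made-up numbers, the first call evicts only the least valuable document, while the second declines to prefetch because the candidate is expected to be accessed less often than anything it would replace. The resource-contention check the abstract also mentions is not modeled here.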

7.

Application-controlled memory management in a single address space environment

Alberto Bartoli Gianluca Dini Lanfranco Lopriore 《International Journal on Software Tools for Technology Transfer (STTT)》2001,3(2):235-245

With reference to a memory management system supporting the single address space abstraction and a uniform, persistent view of storage, we present a set of mechanisms that allow applications to exert explicit control over memory management activities. These mechanisms make it possible to move the contents of a virtual page to primary memory for fast processor access, or to push these contents back to secondary memory to free primary memory space. Our memory management scheme allows programs to exploit the memory reference pattern of the underlying algorithms, thereby improving utilisation of the system storage resources. This result is illustrated by using significant examples of memory management activities implemented at the application program level. Published online: 8 February 2001

8.

Rule-based document structure understanding with a fuzzy combination of layout and textual features (cited by: 1; self-citations: 1; citations by others: 0)

Stefan Klink Thomas Kieninger 《International Journal on Document Analysis and Recognition》2001,4(1):18-26

Document image processing is a crucial process in office automation and begins at the ‘OCR’ phase with difficulties in document ‘analysis’ and ‘understanding’. This paper presents a hybrid and comprehensive approach to document structure analysis. It is hybrid in the sense that it makes use of layout (geometrical) as well as textual features of a given document. These features are the basis for potential conditions which in turn are used to express fuzzy matched rules of an underlying rule base. Rules can be formulated based on features which might be observed within one specific layout object. However, rules can also express dependencies between different layout objects. In addition to its rule-driven analysis, which allows an easy adaptation to specific domains with their specific logical objects, the system contains domain-independent markup algorithms for common objects (e.g., lists). Received June 19, 2000 / Revised November 8, 2000
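
To make the idea of fuzzy matched rules over layout and textual features concrete, here is a minimal sketch. It is an illustration only: the feature names (`font_size`, `y`, `text`), the `ramp` membership function, the use of `min` as the fuzzy AND, the threshold, and the two example rules are all assumptions, since the abstract does not specify the rule language or the combination operator.

```python
from typing import Any, Callable, Dict, List, Optional

Block = Dict[str, Any]                # hypothetical layout object
Condition = Callable[[Block], float]  # degree of match in [0, 1]


def ramp(value: float, low: float, high: float) -> float:
    """Linear membership: 0 at or below `low`, 1 at or above `high`."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def rule_score(block: Block, conditions: List[Condition]) -> float:
    """Combine condition degrees with min, one common choice of fuzzy AND."""
    return min(cond(block) for cond in conditions)


def label_block(
    block: Block, rules: Dict[str, List[Condition]], threshold: float = 0.5
) -> Optional[str]:
    """Return the best-matching label, or None if no rule matches well enough."""
    best = max(rules, key=lambda name: rule_score(block, rules[name]))
    return best if rule_score(block, rules[best]) >= threshold else None


# Made-up rules: one mixes two layout features, one uses a textual feature.
RULES: Dict[str, List[Condition]] = {
    "title": [
        lambda b: ramp(b["font_size"], 10, 18),   # large font
        lambda b: 1.0 - ramp(b["y"], 0.1, 0.4),   # near the top of the page
    ],
    "list_item": [
        lambda b: 1.0 if b["text"].lstrip().startswith(("-", "*")) else 0.0,
    ],
}

title_block = {"font_size": 20, "y": 0.05, "text": "Annual Report"}
item_block = {"font_size": 10, "y": 0.5, "text": "- first point"}
body_block = {"font_size": 11, "y": 0.5, "text": "Plain body text."}
```

Rules that express dependencies between different layout objects, which the abstract also mentions, would need conditions that take more than one block and are not covered by this sketch.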

9.

Extending the Unified Modeling Language for ontology development (cited by: 3; self-citations: 0; citations by others: 3)

Kenneth Baclawski Mieczyslaw K. Kokar Paul A. Kogut Lewis Hart Jeffrey Smith Jerzy Letkowski Pat Emery 《Software and Systems Modeling》2002,1(2):142-156

There is rapidly growing momentum for web enabled agents that reason about and dynamically integrate the appropriate knowledge and services at run-time. The dynamic integration of knowledge and services depends on the existence of explicit declarative semantic models (ontologies). We have been building tools for ontology development based on the Unified Modeling Language (UML). This allows the many mature UML tools, models and expertise to be applied to knowledge representation systems, not only for visualizing complex ontologies but also for managing the ontology development process. UML has many features, such as profiles, global modularity and extension mechanisms that are not generally available in most ontology languages. However, ontology languages have some features that UML does not support. Our paper identifies the similarities and differences (with examples) between UML and the ontology languages RDF and DAML+OIL. To reconcile these differences, we propose a modification to the UML metamodel to address some of the most problematic differences. One of these is the ontological concept variously called a property, relation or predicate. This notion corresponds to the UML concepts of association and attribute. In ontology languages properties are first-class modeling elements, but UML associations and attributes are not first-class. Our proposal is backward-compatible with existing UML models while enhancing its viability for ontology modeling. While we have focused on RDF and DAML+OIL in our research and development activities, the same issues apply to many of the knowledge representation languages. This is especially the case for semantic network and concept graph approaches to knowledge representations. Initial submission: 16 February 2002 / Revised submission: 15 October 2002 Published online: 2 December 2002

10.

Algebraic query optimisation for database programming languages (cited by: 1; self-citations: 0; citations by others: 1)

Alexandra Poulovassilis Carol Small 《The VLDB Journal The International Journal on Very Large Data Bases》1996,5(2):119-132

A major challenge still facing the designers and implementors of database programming languages (DBPLs) is that of query optimisation. We investigate algebraic query optimisation techniques for DBPLs in the context of a purely declarative functional language that supports sets as first-class objects. Since the language is computationally complete, issues such as non-termination of expressions and construction of infinite data structures can be investigated, whilst its declarative nature allows the issue of side effects to be avoided and a richer set of equivalences to be developed. The language has a well-defined semantics which permits us to reason formally about the properties of expressions, such as their equivalence with other expressions and their termination. The support of a set bulk data type enables much prior work on the optimisation of relational languages to be utilised. In the paper we first give the syntax of our archetypal DBPL and briefly discuss its semantics. We then define a small but powerful algebra of operators over the set data type, provide some key equivalences for expressions in these operators, and list transformation principles for optimising expressions. Along the way, we identify some caveats to well-known equivalences for non-deductive database languages. We next extend our language with two higher level constructs commonly found in functional DBPLs: set comprehensions and functions with known inverses. Some key equivalences for these constructs are provided, as are transformation principles for expressions in them. Finally, we investigate extending our equivalences for the set operators to the analogous operators over bags. Although developed and formally proved in the context of a functional language, our findings are directly applicable to other DBPLs of similar expressiveness. Edited by Matthias Jarke, Jorge Bocca, Carlo Zaniolo. Received September 15, 1994 / Accepted September 1, 1995
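
As a rough illustration of what an algebraic equivalence over a set data type looks like, the sketch below checks two textbook rewrites on small finite sets: pushing a selection through a union, and fusing two successive mappings into one. The operator names `select` and `image` are placeholders, not the operators of the paper's algebra, and the check assumes total, side-effect-free functions, which sidesteps the non-termination caveats the abstract refers to.

```python
from typing import Callable, FrozenSet, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def select(pred: Callable[[T], bool], s: FrozenSet[T]) -> FrozenSet[T]:
    """Keep the elements of `s` that satisfy `pred`."""
    return frozenset(x for x in s if pred(x))


def image(f: Callable[[T], U], s: FrozenSet[T]) -> FrozenSet[U]:
    """Apply `f` to every element of `s`, collecting results as a set."""
    return frozenset(f(x) for x in s)


def is_even(n: int) -> bool:
    return n % 2 == 0


A = frozenset({1, 2, 3, 4})
B = frozenset({3, 4, 5, 6})

# Selection distributes over union: filter once, or filter each operand.
pushed_lhs = select(is_even, A | B)
pushed_rhs = select(is_even, A) | select(is_even, B)

# Two successive mappings can be fused into a single pass.
fused_lhs = image(lambda x: x + 1, image(lambda x: x * 2, A))
fused_rhs = image(lambda x: x * 2 + 1, A)
```

An optimiser would apply such equivalences as rewrite rules in whichever direction is cheaper. Testing on a few sample sets, as here, only illustrates a rule and does not prove it.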

11.

Authorization Translation for XML Document Transformation

Chatvichienchai Somchai Iwaihara Mizuho Kambayashi Yahiko 《World Wide Web》2004,7(1):111-138

XML access control models proposed in the literature enforce access restrictions directly on the structure and content of an XML document. Therefore access authorization rules (authorizations, for short), which specify access rights of users on information within an XML document, must be revised if they do not match the changed structure of the XML document. In this paper, we present two authorization translation problems. The first is a problem of translating instance-level authorizations for an XML document. The second is a problem of translating schema-level authorizations for a collection of XML documents conforming to a DTD. For the first problem, we propose an algorithm that translates instance-level authorizations of a source XML document into those for a transformed XML document by using instance-tree mapping from the transformed document instance to the source document instance. For the second problem, we propose an algorithm that translates value-independent schema-level authorizations of a non-recursive source DTD into those for a non-recursive target DTD by using schema-tree mapping from the target DTD to the source DTD. The goal of authorization translation is to preserve authorization equivalence at instance node level of the source document. The XML access control models use path expressions of XPath to locate data in XML documents. We define a property of path expressions (called node-reducible path expressions) under which we can transform schema-level authorizations of value-independent type by schema-tree mapping. To compute authorizations on instances of schema elements of the target DTD, we need to identify the schema elements whose instances are located by a node-reducible path expression of a value-independent schema-level authorization. We give an algorithm that carries out a path fragment containment test to identify the schema elements whose instances are located by a node-reducible path expression.

12.

Views in a large-scale XML repository

Vincent Aguilera Sophie Cluet Tova Milo Pierangelo Veltri Dan Vodislav 《The VLDB Journal The International Journal on Very Large Data Bases》2002,11(3):238-255

We are interested in defining and querying views in a huge and highly heterogeneous XML repository (Web scale). In this context, view definitions are very large, involving lots of sources, and there is no apparent limitation to their size. This raises interesting problems that we address in the paper: (i) how to distribute views over several machines without having a negative impact on the query translation process; (ii) how to quickly select the relevant part of a view given a query; (iii) how to minimize the cost of communicating potentially large queries to the machines where they will be evaluated. The solution that we propose is based on a simple view definition language that allows for automatic generation of views. The language maps paths in the view abstract DTD to paths in the concrete source DTDs. It enables a distributed implementation of the view system that is scalable both in terms of data and load. In particular, the query translation algorithm is shown to have a good (linear) complexity. Received: November 1, 2001 / Accepted: March 2, 2002 Published online: September 25, 2002

13.

Floor control for multimedia conferencing and collaboration (cited by: 12; self-citations: 0; citations by others: 12)

Hans-Peter Dommel J.J. Garcia-Luna-Aceves 《Multimedia Systems》1997,5(1):23-38

Floor control allows users of networked multimedia applications to utilize and share resources such as remote devices, distributed data sets, telepointers, or continuous media such as video and audio without access conflicts. Floors are temporary permissions granted dynamically to collaborating users in order to mitigate race conditions and guarantee mutually exclusive resource usage. A general framework for floor control is presented. Collaborative environments are characterized and the requirements for realization of floor control are identified. The differences from session control, as well as concurrency control and access control, are elicited. Based upon a brief taxonomy of collaboration-relevant parameters, system design issues for floor control are discussed. Floor control mechanisms are distinguished from service policies, and principal architectures of collaborative systems are compared. The structure of control packets and an application programmer's interface are proposed and further implementation aspects are elaborated. User-related aspects such as floor presentation, assignment, and the timely stages of floor-controlled interaction in relation to user-interface design are also presented.
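
The notion of a floor as a temporary, mutually exclusive permission can be sketched in a few lines. The class below is a hypothetical, single-process illustration: the name `FloorController`, its `request`/`release` methods, and the first-come-first-served queue are assumptions, and they stand in for only one possible service policy. The control packets and programming interface proposed in the paper are not reproduced here.

```python
from collections import deque
from typing import Deque, Dict, Optional


class FloorController:
    """Grants each resource's floor to at most one user at a time."""

    def __init__(self) -> None:
        self.holder: Dict[str, Optional[str]] = {}
        self.waiting: Dict[str, Deque[str]] = {}

    def request(self, resource: str, user: str) -> bool:
        """Return True if `user` holds the floor after this call."""
        if self.holder.get(resource) is None:
            self.holder[resource] = user
            return True
        queue = self.waiting.setdefault(resource, deque())
        if self.holder[resource] != user and user not in queue:
            queue.append(user)  # assumed policy: first come, first served
        return self.holder[resource] == user

    def release(self, resource: str, user: str) -> Optional[str]:
        """Give up the floor and return the next holder, if any."""
        if self.holder.get(resource) != user:
            raise PermissionError(f"{user} does not hold the floor for {resource}")
        queue = self.waiting.get(resource)
        next_user = queue.popleft() if queue else None
        self.holder[resource] = next_user
        return next_user


fc = FloorController()
alice_granted = fc.request("telepointer", "alice")
bob_granted = fc.request("telepointer", "bob")
next_holder = fc.release("telepointer", "alice")
```

Keeping the queueing rule inside `request` mixes mechanism and policy for brevity; a design closer to the separation the abstract describes would pass the policy in as a replaceable component.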

14.

Distributed accessibility control points help deliver a directly accessible Web

Peter G. Fairweather John T. Richards Vicki L. Hanson 《Universal Access in the Information Society》2002,2(1):70-75

This paper describes a set of interfaces and mechanisms to enhance access to the World Wide Web for persons with sensory, cognitive, or motor limitations. Paradoxically, although complex Web architectures are often accused of impeding accessibility, their layers expand the range of points where interventions can be staged to improve it. This paper identifies some of these access control points and evaluates the particular strengths and weaknesses of each. In particular, it describes an approach to enhance access that is distributed across multiple control points and implemented as an aggregation of services. Published online: 6 November 2002

15.

A framework for designing and implementing the user interface of a geographic digital library

Juliano Lopes de Oliveira Marcos André Gonçalves Claudia Bauzer Medeiros 《International Journal on Digital Libraries》1999,2(2-3):190-206

Geographic data are useful for a large set of applications, such as urban planning and environmental control. These data are, however, very expensive to acquire and maintain. Moreover, their use is often restricted due to a lack of dissemination mechanisms. Digital libraries are a good approach for increasing data availability and therefore reducing costs, since they provide efficient storage and access to large volumes of data. One major drawback to this approach is that it creates the necessity of providing facilities for a large and heterogeneous community of users to search and interact with these geographic libraries. We present a solution to this problem, based on a framework that allows the design and construction of customizable user interfaces for applications based on Geographic Digital Libraries (GDL). This framework relies on two main concepts: a geographic user interface architecture and a geographic digital library model. Received: 15 December 1997 / Revised: June 1999

16.

Metaphors and Context-Aware Information Access

Peter Coschurba Joachim Baumann Uwe Kubach Alexander Leonhardi 《Personal and Ubiquitous Computing》2001,5(1):16-19

Metaphors are often used to provide the user with a mental model to ease the use of computers. An example of such a metaphor is the commonly used “Desktop Metaphor”. Metaphors also can be used to ease context-aware information access for the users of mobile information systems. In this paper we present a taxonomy that allows the categorisation of such metaphors. Furthermore, we give an overview of existing metaphors and their implementations. After introducing some new metaphors we conclude our considerations with a classification of new and existing metaphors using our taxonomy.

17.

On Z39.50 wrapping and description logics

Yannis Velegrakis Vassilis Christophides Panos Constantopoulos 《International Journal on Digital Libraries》2000,3(3):208-220

18.

Classification of document pages using structure-based features

Christian Shin David Doermann Azriel Rosenfeld 《International Journal on Document Analysis and Recognition》2001,3(4):232-247

Searching for documents by their type or genre is a natural way to enhance the effectiveness of document retrieval. The layout of a document contains a significant amount of information that can be used to classify it by type in the absence of domain-specific models. Our approach to classification is based on “visual similarity” of layout structure and is implemented by building a supervised classifier, given examples of each class. We use image features such as percentages of text and non-text (graphics, images, tables, and rulings) content regions, column structures, relative point sizes of fonts, density of content area, and statistics of features of connected components which can be derived without class knowledge. In order to obtain class labels for training samples, we conducted a study where subjects ranked document pages with respect to their resemblance to representative page images. Class labels can also be assigned based on known document types, or can be defined by the user. We implemented our classification scheme using decision tree classifiers and self-organizing maps. Received June 15, 2000 / Revised November 15, 2000

19.

Automatic document classification and indexing in high-volume applications

E. Appiani F. Cesarini A.M. Colla M. Diligenti M. Gori S. Marinai G. Soda 《International Journal on Document Analysis and Recognition》2001,4(2):69-83

In this paper a system for analysis and automatic indexing of imaged documents for high-volume applications is described. This system, named STRETCH (STorage and RETrieval by Content of imaged documents), is based on an Archiving and Retrieval Engine, which overcomes the bottleneck of document profiling by bypassing some limitations of existing pre-defined indexing schemes. The engine exploits a structured document representation and can activate appropriate methods to characterise and automatically index heterogeneous documents with variable layout. The originality of STRETCH lies principally in the possibility for unskilled users to define the indexes relevant to the document domains of their interest by simply presenting visual examples and applying reliable automatic information extraction methods (document classification, flexible reading strategies) to index the documents automatically, thus creating archives as desired. STRETCH offers ease of use and application programming and the ability to dynamically adapt to new types of documents. The system has been tested in two applications in particular, one concerning passive invoices and the other bank documents. In these applications, several classes of documents are involved. The indexing strategy first automatically classifies the document, thus avoiding pre-sorting, then locates and reads the information pertaining to the specific document class. Experimental results are encouraging overall; in particular, document classification results fulfill the requirements of high-volume application. Integration into production lines is under way. Received March 30, 2000 / Revised June 26, 2001

20.

Synchronization and recovery in a client-server storage system

E. Panagos A. Biliris 《The VLDB Journal The International Journal on Very Large Data Bases》1997,6(3):209-223

Client-server object-oriented database management systems differ significantly from traditional centralized systems in terms of their architecture and the applications they target. In this paper, we present the client-server architecture of the EOS storage manager and we describe the concurrency control and recovery mechanisms it employs. EOS offers a semi-optimistic locking scheme based on the multi-granularity two-version two-phase locking protocol. Under this scheme, multiple concurrent readers are allowed to access a data item while it is being updated by a single writer. Recovery is based on write-ahead redo-only logging. Log records are generated at the clients and they are shipped to the server during normal execution and at transaction commit. Transaction rollback is fast because there are no updates that have to be undone, and recovery from system crashes requires only one scan of the log for installing the changes made by transactions that committed before the crash. We also present a preliminary performance evaluation of the implementation of the above mechanisms. Edited by R. King. Received July 1993 / Accepted May 1996