Found 20 similar documents (search time: 62 ms)
1.
Claudia Wenzel Heiko Maus 《International Journal on Document Analysis and Recognition》2001,3(4):248-260
Knowledge-based systems for document analysis and understanding (DAU) are quite useful whenever analysis has to deal with
the changing of free-form document types which require different analysis components. In this case, declarative modeling is
a good way to achieve flexibility. An important application domain for such systems is the business letter domain. Here, high
accuracy and the correct assignment to the right people and the right processes is a crucial success factor. Our solution
to this proposes a comprehensive knowledge-centered approach: we model not only comparatively static knowledge concerning
document properties and analysis results within the same declarative formalism, but we also include the analysis task and
the current context of the system environment within the same formalism. This allows an easy definition of new analysis tasks
and also an efficient and accurate analysis by using expectations about incoming documents as context information. The approach
described has been implemented within the VOPR (Virtual Office PRototype) system. This DAU system
gains the required context information from a commercial workflow management system (WfMS) by constant exchanges of expectations
and analysis tasks. Further interaction between these two systems covers the delivery of results from DAU to the WfMS and
the delivery of corrected results vice versa.
Received June 19, 1999 / Revised November 8, 2000
2.
The GMAP: a versatile tool for physical data independence (total citations: 1, self-citations: 0, citations by others: 1)
Odysseas G. Tsatalos Marvin H. Solomon Yannis E. Ioannidis 《The VLDB Journal The International Journal on Very Large Data Bases》1996,5(2):101-118
Physical data independence is touted as a central feature of modern
database systems. It allows users to frame queries in terms of the logical
structure of the data, letting a query processor automatically translate
them into optimal plans that access physical storage structures. Both
relational and object-oriented systems, however, force users to frame their
queries in terms of a logical schema that is directly tied to physical
structures. We present an approach that eliminates this dependence. All
storage structures are defined in a declarative language based on
relational algebra as functions of a logical schema. We present an
algorithm, integrated with a conventional query optimizer, that translates
queries over this logical schema into plans that access the storage
structures. We also show how to compile update requests into plans that
update all relevant storage structures consistently and optimally.
Finally, we report on experiments with a prototype implementation of our
approach that demonstrate how it allows storage structures to be tuned to
the expected or observed workload to achieve significantly better
performance than is possible with conventional techniques.
Edited by Matthias Jarke, Jorge Bocca, Carlo Zaniolo. Received September 15, 1994 / Accepted September 1, 1995
3.
4.
5.
Interactive voice browsers offer an alternative paradigm that affords ubiquitous mobile access to the WWW using a wide range
of consumer devices. This technology can facilitate a safe, “hands-free” browsing environment that is of importance both to
car drivers and various mobile and technical professionals. This paper describes the challenges of architecting an interactive
voice browser that combines digital audio with the features of a speech synthesizer to make structural elements of the document
explicit to the listener. The aesthetics of the audio rendition can simultaneously help reduce the monotony factor and enhance
comprehension. The evolution of the voice browser gave rise to a new conceptual model of the HTML document structure and its
mapping to a 3D audio space. A number of novel features are discussed for improving both the user’s comprehension of the HTML
document structure and their orientation within it. These factors, in turn, can improve the effectiveness of the browsing
experience.
6.
Integrated document caching and prefetching in storage hierarchies based on Markov-chain predictions
Achim Kraiss Gerhard Weikum 《The VLDB Journal The International Journal on Very Large Data Bases》1998,7(3):141-162
Large multimedia document archives may hold a major fraction of their data in tertiary storage libraries for cost reasons.
This paper develops an integrated approach to the vertical data migration between the tertiary, secondary, and primary storage
in that it reconciles speculative prefetching, to mask the high latency of the tertiary storage, with the replacement policy
of the document caches at the secondary and primary storage level, and also considers the interaction of these policies with
the tertiary and secondary storage request scheduling.
The integrated migration policy is based on a continuous-time Markov chain model for predicting the expected number of accesses
to a document within a specified time horizon. Prefetching is initiated only if that expectation is higher than those of the
documents that need to be dropped from secondary storage to free up the necessary space. In addition, the possible resource
contention at the tertiary and secondary storage is taken into account by dynamically assessing the response-time benefit
of prefetching a document versus the penalty that it would incur on the response time of the pending document requests.
The parameters of the continuous-time Markov chain model, the probabilities of co-accessing certain documents and the interaction
times between successive accesses, are dynamically estimated and adjusted to evolving workload patterns by keeping online
statistics. The integrated policy for vertical data migration has been implemented in a prototype system. The system makes
profitable use of the Markov chain model also for the scheduling of volume exchanges in the tertiary storage library. Detailed
simulation experiments with Web-server-like synthetic workloads indicate significant gains in terms of client response time.
The experiments also show that the overhead of the statistical bookkeeping and the computations for the access predictions
is affordable.
Received January 1, 1998 / Accepted May 27, 1998
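
The prefetching rule stated in this abstract (prefetch only if the candidate's expected number of accesses exceeds that of every document that would have to be dropped) can be illustrated with a minimal sketch. It assumes the expected access counts are already available as plain numbers; the continuous-time Markov chain estimation and the response-time penalty assessment described in the abstract are not modeled. The names `Doc` and `should_prefetch` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    name: str
    size: int                 # space the document occupies in secondary storage
    expected_accesses: float  # assumed input: predicted accesses within the time horizon


def should_prefetch(candidate: Doc, cached: list[Doc], free_space: int) -> tuple[bool, list[Doc]]:
    """Return (decision, victims) for bringing `candidate` into secondary storage."""
    if candidate.size <= free_space:
        return True, []  # enough room already, nothing has to be dropped
    victims: list[Doc] = []
    reclaimed = free_space
    # Consider cached documents in order of increasing predicted usefulness.
    for doc in sorted(cached, key=lambda d: d.expected_accesses):
        if doc.expected_accesses >= candidate.expected_accesses:
            break  # every remaining document is at least as valuable as the candidate
        victims.append(doc)
        reclaimed += doc.size
        if reclaimed >= candidate.size:
            return True, victims
    return False, []  # not enough lower-valued documents to make room
```

Sorting by expected accesses means the least valuable documents are proposed as victims first, so the candidate is only ever compared against documents it would actually displace.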
7.
Alberto Bartoli Gianluca Dini Lanfranco Lopriore 《International Journal on Software Tools for Technology Transfer (STTT)》2001,3(2):235-245
With reference to a memory management system supporting the single address space abstraction and a uniform, persistent view
of storage, we present a set of mechanisms that allow applications to exert explicit control over memory management activities.
These mechanisms make it possible to move the contents of a virtual page to primary memory for fast processor access, or to
push these contents back to secondary memory to free primary memory space. Our memory management scheme allows programs to
exploit the memory reference pattern of the underlying algorithms, thereby improving utilisation of the system storage resources.
This result is illustrated by using significant examples of memory management activities implemented at the application program
level.
Published online: 8 February 2001
8.
Rule-based document structure understanding with a fuzzy combination of layout and textual features (total citations: 1, self-citations: 1, citations by others: 0)
Stefan Klink Thomas Kieninger 《International Journal on Document Analysis and Recognition》2001,4(1):18-26
Document image processing is a crucial process in office automation and begins at the ‘OCR’ phase with difficulties in document
‘analysis’ and ‘understanding’. This paper presents a hybrid and comprehensive approach to document structure analysis. Hybrid
in the sense that it makes use of layout (geometrical) as well as textual features of a given document. These features are
the base for potential conditions which in turn are used to express fuzzy matched rules of an underlying rule base. Rules
can be formulated based on features which might be observed within one specific layout object. However, rules can also express
dependencies between different layout objects. In addition to its rule driven analysis, which allows an easy adaptation to
specific domains with their specific logical objects, the system contains domain-independent markup algorithms for common
objects (e.g., lists).
Received June 19, 2000 / Revised November 8, 2000
9.
Extending the Unified Modeling Language for ontology development (total citations: 3, self-citations: 0, citations by others: 3)
Kenneth Baclawski Mieczyslaw K. Kokar Paul A. Kogut Lewis Hart Jeffrey Smith Jerzy Letkowski Pat Emery 《Software and Systems Modeling》2002,1(2):142-156
There is rapidly growing momentum for web enabled agents that reason about and dynamically integrate the appropriate knowledge
and services at run-time. The dynamic integration of knowledge and services depends on the existence of explicit declarative
semantic models (ontologies). We have been building tools for ontology development based on the Unified Modeling Language
(UML). This allows the many mature UML tools, models and expertise to be applied to knowledge representation systems, not
only for visualizing complex ontologies but also for managing the ontology development process. UML has many features, such
as profiles, global modularity and extension mechanisms that are not generally available in most ontology languages. However,
ontology languages have some features that UML does not support. Our paper identifies the similarities and differences (with
examples) between UML and the ontology languages RDF and DAML+OIL. To reconcile these differences, we propose a modification
to the UML metamodel to address some of the most problematic differences. One of these is the ontological concept variously
called a property, relation or predicate. This notion corresponds to the UML concepts of association and attribute. In ontology
languages properties are first-class modeling elements, but UML associations and attributes are not first-class. Our proposal
is backward-compatible with existing UML models while enhancing its viability for ontology modeling. While we have focused
on RDF and DAML+OIL in our research and development activities, the same issues apply to many of the knowledge representation
languages. This is especially the case for semantic network and concept graph approaches to knowledge representations.
Initial submission: 16 February 2002 / Revised submission: 15 October 2002 / Published online: 2 December 2002
10.
Algebraic query optimisation for database programming languages (total citations: 1, self-citations: 0, citations by others: 1)
Alexandra Poulovassilis Carol Small 《The VLDB Journal The International Journal on Very Large Data Bases》1996,5(2):119-132
A major challenge still facing the designers and implementors of database
programming languages (DBPLs) is that of query optimisation. We investigate
algebraic query optimisation techniques for DBPLs in the context of a purely
declarative functional language that supports sets as first-class objects.
Since the language is computationally complete, issues such as
non-termination of expressions and construction of infinite data structures
can be investigated, whilst its declarative nature allows the issue of side
effects to be avoided and a richer set of equivalences to be developed.
The language has a well-defined semantics which permits us to reason
formally about the properties of expressions, such as their equivalence with
other expressions and their termination. The support of a set bulk data
type enables much prior work on the optimisation of relational languages to
be utilised.
In the paper we first give the syntax of our archetypal DBPL and briefly
discuss its semantics. We then define a small but powerful algebra of
operators over the set data type, provide some key equivalences for
expressions in these operators, and list transformation principles for
optimising expressions. Along the way, we identify some caveats to
well-known equivalences for non-deductive database languages. We next
extend our language with two higher level constructs commonly found in
functional DBPLs: set comprehensions and functions with known inverses. Some
key equivalences for these constructs are provided, as are transformation
principles for expressions in them. Finally, we investigate extending our
equivalences for the set operators to the analogous operators over bags.
Although developed and formally proved in the context of a functional
language, our findings are directly applicable to other DBPLs of similar
expressiveness.
Edited by Matthias Jarke, Jorge Bocca, Carlo Zaniolo. Received September 15, 1994 / Accepted September 1, 1995
11.
XML access control models proposed in the literature enforce access restrictions directly on the structure and content of an XML document. Therefore, access authorization rules (authorizations, for short), which specify the access rights of users on information within an XML document, must be revised if they no longer match the changed structure of the XML document. In this paper, we present two authorization translation problems. The first is the problem of translating instance-level authorizations for an XML document. The second is the problem of translating schema-level authorizations for a collection of XML documents conforming to a DTD. For the first problem, we propose an algorithm that translates instance-level authorizations of a source XML document into those for a transformed XML document by using an instance-tree mapping from the transformed document instance to the source document instance. For the second problem, we propose an algorithm that translates value-independent schema-level authorizations of a non-recursive source DTD into those for a non-recursive target DTD by using a schema-tree mapping from the target DTD to the source DTD. The goal of authorization translation is to preserve authorization equivalence at the instance node level of the source document. The XML access control models use XPath path expressions to locate data in XML documents. We define a property of path expressions (node-reducible path expressions) under which value-independent schema-level authorizations can be transformed by schema-tree mapping. To compute authorizations on instances of schema elements of the target DTD, we need to identify the schema elements whose instances are located by a node-reducible path expression of a value-independent schema-level authorization. We give an algorithm that carries out a path fragment containment test to identify these schema elements.
12.
Vincent Aguilera Sophie Cluet Tova Milo Pierangelo Veltri Dan Vodislav 《The VLDB Journal The International Journal on Very Large Data Bases》2002,11(3):238-255
We are interested in defining and querying views in a huge and highly heterogeneous XML repository (Web scale). In this context,
view definitions are very large, involving lots of sources, and there is no apparent limitation to their size. This raises
interesting problems that we address in the paper: (i) how to distribute views over several machines without having a negative
impact on the query translation process; (ii) how to quickly select the relevant part of a view given a query; (iii) how to
minimize the cost of communicating potentially large queries to the machines where they will be evaluated. The solution that
we propose is based on a simple view definition language that allows for automatic generation of views. The language maps
paths in the view abstract DTD to paths in the concrete source DTDs. It enables a distributed implementation of the view system
that is scalable both in terms of data and load. In particular, the query translation algorithm is shown to have a good (linear)
complexity.
Received: November 1, 2001 / Accepted: March 2, 2002 / Published online: September 25, 2002
13.
Floor control for multimedia conferencing and collaboration (total citations: 12, self-citations: 0, citations by others: 12)
Floor control allows users of networked multimedia applications to utilize and share resources such as remote devices, distributed
data sets, telepointers, or continuous media such as video and audio without access conflicts. Floors are temporary permissions
granted dynamically to collaborating users in order to mitigate race conditions and guarantee mutually exclusive resource
usage. A general framework for floor control is presented. Collaborative environments are characterized and the requirements
for the realization of floor control are identified. The differences from session control, as well as concurrency control and
access control are elicited. Based upon a brief taxonomy of collaboration-relevant parameters, system design issues for floor
control are discussed. Floor control mechanisms are discerned from service policies and principal architectures of collaborative
systems are compared. The structure of control packets and an application programmer's interface are proposed and further
implementation aspects are elaborated. User-related aspects such as floor presentation, assignment, and the timely stages
of floor-controlled interaction in relation to user-interface design are also presented.
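
The notion of a floor as a temporary permission that guarantees mutually exclusive use of a shared resource can be sketched as follows. This is only an illustration: the first-come-first-served queue is an assumed service policy, and the class name `Floor` and its methods are hypothetical, not the programmer's interface proposed in the paper.

```python
from collections import deque


class Floor:
    """A temporary, exclusive permission to use one shared resource."""

    def __init__(self, resource: str):
        self.resource = resource
        self.holder = None       # user currently holding the floor, if any
        self._waiting = deque()  # assumed policy: first come, first served

    def request(self, user: str) -> bool:
        """Grant the floor if it is free; otherwise queue the user. Returns True if `user` holds it."""
        if self.holder is None:
            self.holder = user
            return True
        if user != self.holder and user not in self._waiting:
            self._waiting.append(user)
        return self.holder == user

    def release(self, user: str):
        """Give up the floor and pass it to the next waiting user, if any. Returns the new holder."""
        if user != self.holder:
            raise ValueError(f"{user} does not hold the floor for {self.resource}")
        self.holder = self._waiting.popleft() if self._waiting else None
        return self.holder
```

Keeping the queueing policy inside one method makes it easy to swap in a different policy (for example, chair-moderated assignment) without changing how the floor is held or released.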
14.
Peter G. Fairweather John T. Richards Vicki L. Hanson 《Universal Access in the Information Society》2002,2(1):70-75
This paper describes a set of interfaces and mechanisms to enhance access to the World Wide Web for persons with sensory,
cognitive, or motor limitations. Paradoxically, although complex Web architectures are often accused of impeding accessibility,
their layers expand the range of points where interventions can be staged to improve it. This paper identifies some of these
access control points and evaluates the particular strengths and weaknesses of each. In particular, it describes an approach
to enhance access that is distributed across multiple control points and implemented as an aggregation of services.
Published online: 6 November 2002
15.
Juliano Lopes de Oliveira Marcos André Gonçalves Claudia Bauzer Medeiros 《International Journal on Digital Libraries》1999,2(2-3):190-206
Geographic data are useful for a large set of applications, such as urban planning and environmental control. These data are,
however, very expensive to acquire and maintain. Moreover, their use is often restricted due to a lack of dissemination mechanisms.
Digital libraries are a good approach for increasing data availability and therefore reducing costs, since they provide efficient
storage and access to large volumes of data. One major drawback to this approach is that it creates the necessity of providing
facilities for a large and heterogeneous community of users to search and interact with these geographic libraries. We present
a solution to this problem, based on a framework that allows the design and construction of customizable user interfaces for
applications based on Geographic Digital Libraries (GDL). This framework relies on two main concepts: a geographic user interface
architecture and a geographic digital library model.
Received: 15 December 1997 / Revised: June 1999
16.
Peter Coschurba Joachim Baumann Uwe Kubach Alexander Leonhardi 《Personal and Ubiquitous Computing》2001,5(1):16-19
Metaphors are often used to provide the user with a mental model to ease the use of computers. An example of such a metaphor
is the commonly used “Desktop Metaphor”. Metaphors also can be used to ease context-aware information access for the users
of mobile information systems. In this paper we present a taxonomy that allows the categorisation of such metaphors. Furthermore,
we give an overview of existing metaphors and their implementations. After introducing some new metaphors we conclude our
considerations with a classification of new and existing metaphors using our taxonomy.
17.
18.
Christian Shin David Doermann Azriel Rosenfeld 《International Journal on Document Analysis and Recognition》2001,3(4):232-247
Searching for documents by their type or genre is a natural way to enhance the effectiveness of document retrieval. The layout
of a document contains a significant amount of information that can be used to classify it by type in the absence of domain-specific
models. Our approach to classification is based on “visual similarity” of layout structure and is implemented by building
a supervised classifier, given examples of each class. We use image features such as percentages of text and non-text (graphics,
images, tables, and rulings) content regions, column structures, relative point sizes of fonts, density of content area, and
statistics of features of connected components which can be derived without class knowledge. In order to obtain class labels
for training samples, we conducted a study where subjects ranked document pages with respect to their resemblance to representative
page images. Class labels can also be assigned based on known document types, or can be defined by the user. We implemented
our classification scheme using decision tree classifiers and self-organizing maps.
Received June 15, 2000 / Revised November 15, 2000
19.
E. Appiani F. Cesarini A.M. Colla M. Diligenti M. Gori S. Marinai G. Soda 《International Journal on Document Analysis and Recognition》2001,4(2):69-83
In this paper a system for analysis and automatic indexing of imaged documents for high-volume applications is described.
This system, named STRETCH (STorage and RETrieval by Content of imaged documents), is based on an Archiving and Retrieval Engine, which overcomes the bottleneck of document profiling bypassing some limitations of existing pre-defined indexing schemes.
The engine exploits a structured document representation and can activate appropriate methods to characterise and automatically
index heterogeneous documents with variable layout. The originality of STRETCH lies principally in the possibility for unskilled
users to define the indexes relevant to the document domains of their interest by simply presenting visual examples and applying
reliable automatic information extraction methods (document classification, flexible reading strategies) to index the documents
automatically, thus creating archives as desired. STRETCH offers ease of use and application programming and the ability to
dynamically adapt to new types of documents. The system has been tested in two applications in particular, one concerning
passive invoices and the other bank documents. In these applications, several classes of documents are involved. The indexing
strategy first automatically classifies the document, thus avoiding pre-sorting, then locates and reads the information pertaining
to the specific document class. Experimental results are encouraging overall; in particular, document classification results
fulfill the requirements of high-volume applications. Integration into production lines is under way.
Received March 30, 2000 / Revised June 26, 2001
20.
E. Panagos A. Biliris 《The VLDB Journal The International Journal on Very Large Data Bases》1997,6(3):209-223
Client-server object-oriented database management systems differ significantly from traditional centralized systems in terms
of their architecture and the applications they target. In this paper, we present the client-server architecture of the EOS
storage manager and we describe the concurrency control and recovery mechanisms it employs. EOS offers a semi-optimistic locking
scheme based on the multi-granularity two-version two-phase locking protocol. Under this scheme, multiple concurrent readers
are allowed to access a data item while it is being updated by a single writer. Recovery is based on write-ahead redo-only
logging. Log records are generated at the clients and they are shipped to the server during normal execution and at transaction
commit. Transaction rollback is fast because there are no updates that have to be undone, and recovery from system crashes
requires only one scan of the log for installing the changes made by transactions that committed before the crash. We also
present a preliminary performance evaluation of the implementation of the above mechanisms.
Edited by R. King. Received July 1993 / Accepted May 1996
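
The two-version idea summarized in this abstract (readers keep accessing a data item while a single writer updates it, and rollback requires no undo) can be sketched for a single in-memory item. This is a simplified illustration under stated assumptions, not the EOS implementation: multi-granularity locking, logging, and the step in which a committing writer waits for active readers are all omitted, and the class name `TwoVersionItem` is hypothetical.

```python
import threading


class TwoVersionItem:
    """One data item with a committed version and at most one pending (uncommitted) version."""

    def __init__(self, value):
        self._committed = value
        self._pending = None
        self._has_pending = False
        self._writer_lock = threading.Lock()  # admits a single writer at a time

    def read(self):
        # Readers always see the last committed version, even during an update.
        return self._committed

    def begin_write(self):
        if not self._writer_lock.acquire(blocking=False):
            raise RuntimeError("another writer is already updating this item")

    def write(self, value):
        # The update goes to the pending version; the committed one is untouched.
        self._pending = value
        self._has_pending = True

    def commit(self):
        # Install the pending version as the new committed version.
        if self._has_pending:
            self._committed = self._pending
        self._finish()

    def abort(self):
        # Rollback just discards the pending version; nothing has to be undone.
        self._finish()

    def _finish(self):
        self._pending = None
        self._has_pending = False
        self._writer_lock.release()
```

Because the committed version is never modified in place, aborting a writer is a constant-time discard, which mirrors the abstract's point that rollback is fast under redo-only logging.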