Found 20 similar documents; search took 31 ms
1.
Abstract. Providing a customized result set based upon a user preference is the ultimate objective of many content-based image retrieval
systems. There are two main challenges in meeting this objective: First, there is a gap between the physical characteristics
of digital images and the semantic meaning of the images. Second, different people may have different perceptions of the
same set of images. To address both these challenges, we propose a model, named Yoda, that conceptualizes content-based querying
as the task of soft classifying images into classes. These classes can overlap, and their members are different for different
users. The “soft” classification is hence performed for each and every image feature, including both physical and semantic
features. Subsequently, each image will be ranked based on the weighted aggregation of its classification memberships. The
weights are user-dependent, and hence different users would obtain different result sets for the same query. Yoda employs
a fuzzy-logic based aggregation function for ranking images. We show that, in addition to some performance benefits, fuzzy
aggregation is less sensitive to noise and can support disjunctive queries as compared to weighted-average aggregation used
by other content-based image retrieval systems. Finally, since Yoda heavily relies on user-dependent weights (i.e., user profiles)
for the aggregation task, we utilize the users' relevance feedback to improve the profiles using genetic algorithms (GA).
Our learning mechanism requires fewer user interactions, and results in a faster convergence to the user's preferences as
compared to other learning techniques.
Correspondence to: Y.-S. Chen (E-mail: yishinc@usc.edu)
This research has been funded in part by NSF grants EEC-9529152 (IMSC ERC) and IIS-0082826, NIH-NLM R01-LM07061, DARPA and
USAF under agreement nr. F30602-99-1-0524, and unrestricted cash gifts from NCR, Microsoft, and Okawa Foundation.
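The contrast between fuzzy and weighted-average aggregation described in this abstract can be illustrated with a minimal sketch. The abstract does not give Yoda's actual aggregation function, so the weighted fuzzy OR/AND below (max-min and min-max forms) are assumed stand-ins, and all function names and sample data are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Aggregator = Callable[[Sequence[float], Sequence[float]], float]


def weighted_average(memberships: Sequence[float], weights: Sequence[float]) -> float:
    """Baseline: weighted mean of per-feature class memberships."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(m * w for m, w in zip(memberships, weights)) / total


def fuzzy_or(memberships: Sequence[float], weights: Sequence[float]) -> float:
    """Assumed weighted fuzzy disjunction: one strong, important feature suffices."""
    return max(min(w, m) for m, w in zip(memberships, weights))


def fuzzy_and(memberships: Sequence[float], weights: Sequence[float]) -> float:
    """Assumed weighted fuzzy conjunction: unimportant features are ignored."""
    return min(max(1.0 - w, m) for m, w in zip(memberships, weights))


def rank(images: Dict[str, List[float]], weights: Sequence[float],
         aggregate: Aggregator) -> List[str]:
    """Order image names by aggregated score, best first."""
    return sorted(images, key=lambda name: aggregate(images[name], weights),
                  reverse=True)


# Hypothetical data: memberships of two images in two feature classes.
images = {"a": [0.9, 0.1], "b": [0.6, 0.6]}
user_profile = [1.0, 1.0]  # user-dependent weights, one per feature

print(rank(images, user_profile, fuzzy_or))
print(rank(images, user_profile, weighted_average))
```

With these sample values the disjunctive fuzzy ranking favours image "a", which matches one feature strongly, while the weighted average favours image "b"; this is only meant to show why the two schemes can disagree on a disjunctive query.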
2.
3.
Constantine Stephanidis Anthony Savidis 《Universal Access in the Information Society》2001,1(1):40-55
Accessibility and high quality of interaction with products, applications, and services by anyone, anywhere, and at any time are fundamental requirements
for universal access in the emerging Information Society. This paper discusses these requirements, and their relation to the concept of automated
adaptation of user interfaces. An example application is presented, showing how adaptation can be used to accommodate the
requirements of different user categories and contexts of use. This application is then used as a vehicle for discussing a
new engineering paradigm appropriate for the development of adaptation-based user interfaces. Finally, the paper investigates
issues concerning the interaction technologies required for universal access.
Published online: 23 May 2001
4.
We propose a system that simultaneously utilizes the stereo disparity and optical flow information of real-time stereo grayscale
multiresolution images for the recognition of objects and gestures in human interactions. For real-time calculation of the
disparity and optical flow information of a stereo image, the system first creates pyramid images using a Gaussian filter.
The system then determines the disparity and optical flow of a low-density image and extracts attention regions in a high-density
image. The three foremost regions are recognized using higher-order local autocorrelation features and linear discriminant
analysis. As the recognition method is view based, the system can process the face and hand recognitions simultaneously in
real time. The recognition features are independent of parallel translations, so the system can use unstable extractions from
stereo depth information. We demonstrate that the system can discriminate the users, monitor the basic movements of the user,
smoothly learn an object presented by users, and can communicate with users by hand signs learned in advance.
Received: 31 January 2000 / Accepted: 1 May 2001
Correspondence to: I. Yoda (e-mail: yoda@ieee.org, Tel.: +81-298-615941, Fax: +81-298-613313)
5.
Ashish Mehta James Geller Yehoshua Perl Erich Neuhold 《The VLDB Journal The International Journal on Very Large Data Bases》1998,7(1):25-47
A path-method is used as a mechanism in object-oriented databases (OODBs) to retrieve or to update information relevant to one class that
is not stored with that class but with some other class. A path-method is a method which traverses from one class through
a chain of connections between classes and accesses information at another class. However, it is a difficult task for a casual
user or even an application programmer to write path-methods to facilitate queries. This is because it might require comprehensive
knowledge of many classes of the conceptual schema that are not directly involved in the query, and therefore may not even
be included in a user's (incomplete) view about the contents of the database. We have developed a system, called path-method generator (PMG), which generates path-methods automatically according to a user's database-manipulating requests. The PMG offers the
user one of the possible path-methods and the user verifies from his knowledge of the intended purpose of the request whether
that path-method is the desired one. If the path method is rejected, then the user can utilize his now increased knowledge
about the database to request (with additional parameters given) another offer from the PMG. The PMG is based on access weights attached to the connections between classes and precomputed access relevance between every pair of classes of the OODB. Specific rules for access weight assignment and algorithms for computing access
relevance appeared in our previous papers [MGPF92, MGPF93, MGPF96]. In this paper, we present a variety of traversal algorithms
based on access weights and precomputed access relevance. Experiments identify some of these algorithms as very successful
in generating most desired path-methods. The PMG system utilizes these successful algorithms and is thus an efficient tool
for aiding the user with the difficult task of querying and updating a large OODB.
Received July 19, 1993 / Accepted May 16, 1997
6.
Thad Starner Bastian Leibe David Minnen Tracy Westyn Amy Hurst Justin Weeks 《Machine Vision and Applications》2003,14(1):59-71
Abstract. The Perceptive Workbench endeavors to create a spontaneous and unimpeded interface between the physical and virtual worlds.
Its vision-based methods for interaction constitute an alternative to wired input devices and tethered tracking. Objects are
recognized and tracked when placed on the display surface. By using multiple infrared light sources, the object's 3-D shape
can be captured and inserted into the virtual interface. This ability permits spontaneity, since either preloaded objects
or those objects selected at run-time by the user can become physical icons. Integrated into the same vision-based interface
is the ability to identify 3-D hand position, pointing direction, and sweeping arm gestures. Such gestures can enhance selection,
manipulation, and navigation tasks. The Perceptive Workbench has been used for a variety of applications, including augmented
reality gaming and terrain navigation. This paper focuses on the techniques used in implementing the Perceptive Workbench
and the system's performance.
7.
Using vanishing points for camera calibration and coarse 3D reconstruction from a single image (Total citations: 5; self-citations: 0; citations by others: 5)
In this paper, we show how to calibrate a camera and to recover the geometry and the photometry (textures) of objects from
a single image. The aim of this work is to make it possible to walk through and augment reality in a 3D model reconstructed from
a single image. The calibration step does not need any calibration target and makes only four assumptions: (1) the single
image contains at least two vanishing points, (2) the length (in 3D space) of one line segment (for determining the translation
vector) in the image is known, (3) the principal point is the center of the image, and (4) the aspect ratio is fixed by the
user. Each vanishing point is determined from a set of parallel lines. These vanishing points help determine a 3D world coordinate
system R_o. After having computed the focal length, the rotation matrix and the translation vector are evaluated in turn for describing
the rigid motion between R_o and the camera coordinate system R_c. Next, the reconstruction step consists in placing, rotating, scaling, and translating a rectangular 3D box that must fit
as closely as possible the potential objects within the scene as seen through the single image. Each face of the rectangular box
is assigned a texture that may contain holes due to invisible parts of certain objects. We show how the textures are extracted
and how these holes are located and filled. Our method has been applied to various real images (pictures scanned from books,
photographs) and synthetic images.
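The focal-length step mentioned in this abstract can be illustrated with a standard result from single-view geometry. This is a sketch, not necessarily the paper's exact procedure: it assumes square pixels, a known principal point, and that the two vanishing points come from mutually orthogonal 3D directions (the abstract does not state this last condition). The function name is hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def focal_length_from_vanishing_points(v1: Point, v2: Point,
                                       principal_point: Point) -> float:
    """Focal length in pixels from two vanishing points of orthogonal directions.

    With square pixels and principal point c, orthogonality of the two 3D
    directions gives (v1 - c) . (v2 - c) + f^2 = 0.
    """
    cx, cy = principal_point
    dot = (v1[0] - cx) * (v2[0] - cx) + (v1[1] - cy) * (v2[1] - cy)
    if dot >= 0:
        raise ValueError("vanishing points are inconsistent with orthogonal directions")
    return math.sqrt(-dot)


# Synthetic check: a 640x480 image whose principal point is the image center.
center = (320.0, 240.0)
f = focal_length_from_vanishing_points((1120.0, 240.0), (-480.0, 240.0), center)
print(f)
```

The example vanishing points were generated from the orthogonal directions (1, 0, 1) and (-1, 0, 1) with a focal length of 800 pixels, so the function should recover that value.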
8.
Andrew P. Black Jie Huang Rainer Koster Jonathan Walpole Calton Pu 《Multimedia Systems》2002,8(5):406-419
To simplify the task of building distributed streaming applications, we propose a new abstraction for information flow –
Infopipes. Infopipes make information flow primary, not an auxiliary mechanism that is hidden away. Systems are built by connecting
predefined component Infopipes such as sources, sinks, buffers, filters, broadcasting pipes, and multiplexing pipes. The goal
of Infopipes is not to hide communication, like an RPC system, but to reify it: to represent communication explicitly as objects that the program can interrogate and manipulate. Moreover, these objects
represent communication in application-level terms, not in terms of network or process implementation.
9.
Yonit Kesten Amir Pnueli 《International Journal on Software Tools for Technology Transfer (STTT)》2000,2(4):328-342
In spite of the impressive progress in the development of the two main methods for formal verification of reactive systems
– Symbolic Model Checking and Deductive Verification, they are still limited in their ability to handle large systems. It
is generally recognized that the only way these methods can ever scale up is by the extensive use of abstraction and modularization,
which break the task of verifying a large system into several smaller tasks of verifying simpler systems.
In this paper, we review the two main tools of compositionality and abstraction in the framework of linear temporal logic.
We illustrate the application of these two methods for the reduction of an infinite-state system into a finite-state system
that can then be verified using model checking.
The technical contributions contained in this paper are a full formulation of abstraction when applied to a system with both
weak and strong fairness requirements and to a general temporal formula, and a presentation of a compositional framework for
shared variables and its application for forming network invariants.
10.
Personalized, interactive news on the Web (Total citations: 2; self-citations: 0; citations by others: 2)
We present Krakatoa Chronicle, an interactive, personalized newspaper on the World Wide Web implemented as a Java applet. The newspaper is similar in appearance
to newspapers in the real world, with a multi-column layout and justified text. At the same time, it provides various interaction
techniques for browsing the content of articles, giving relevance feedback, and dynamically changing layout. As users interact
with the system, individual ‘user profiles’ are built up at the webserver site. These are used to tailor the newspaper's content
and layout to each user's declared and inferred preferences. The system allows for a balancing of personal and community interests,
allowing the user to navigate through a space of newspapers corresponding to a range of viewpoints.
11.
Intelligent vehicle systems have introduced the need for designers to consider user preferences so as to make several kinds
of driving features as driver friendly as possible. This requirement raises the problem of how to analyse human performance suitably
so that the findings can be implemented in automatic driving tasks. The framework of the present work is an adaptive cruise control with
stop and go features for use in an urban setting. In such a context, one of the main requirements is to be able to tune the
control strategy to the driver’s style. In order to do this, a number of different drivers were studied through the statistical
analysis of their behaviour while driving. The aim of this analysis is to decide whether it is possible to determine a driver’s
behaviour, what signals are suitable for this task and which parameters can be used to describe a driver’s style. An assignment
procedure is then introduced in order to classify a driver’s behaviour within the stop and go task being considered. Finally,
the findings were assessed subjectively and compared with a statistically objective assessment.
12.
Rekha Kengeri Cheryl D. Seals Hope D. Harley Himabindu P. Reddy Edward A. Fox 《International Journal on Digital Libraries》1999,2(2-3):157-169
If digital libraries are to be used effectively, their user interfaces should be tested and enhanced. We observed 48 participants
as they worked with the following digital libraries: ACM, IEEE-CS, NCSTRL, and NDLTD. We discuss how the features of these
digital libraries influence the subjects’ efforts to perform search and retrieval tasks. Data analysis indicates that the
IEEE-CS digital library was rated the best overall and NDLTD had the best search time. We present user recommendations and
propose a taxonomy of features that we believe are essential for the design of future digital libraries. Noteworthy is the
observation that users’ judgements on the importance of different features varied widely between the beginning and end of
their test sessions.
Received: 15 December 1997 / Revised: June 1999
13.
E. Francesconi M. Gori S. Marinai G. Soda 《International Journal on Document Analysis and Recognition》2001,3(3):160-168
In this paper we describe the connectionist-based classification engine of an OCR system. The classification engine is based
on a new modular connectionist architecture, where a multilayer perceptron (MLP) acting as a classifier is properly combined
with a set of autoassociators – one for each class – trained to copy the input to the output layer. The MLP-based classifier
selects a small group of classes with high score, that are afterwards verified by the corresponding autoassociators. The learning
samples used to train the classifiers are constructed by means of a synthetic noise generator starting from few grey level
characters labeled by the user. We report experimental results for comparing three neural architectures: an MLP-based classifier,
an autoassociator-based classifier, and the proposed combined architecture. The experiments show that the proposed architecture
exhibits the best performance, without increasing significantly the computational burden.
Received March 6, 2000 / Revised July 12, 2000
14.
The traditional style of working with computers generally revolves around the computer being used as a tool, with individual
users directly initiating operations and waiting for the results of them. A more recent paradigm of human-computer interaction,
based on the indirect management of computing resources, is agent-based interaction. The idea of delegation plays a key part
in this approach to computer-based work, which allows individuals to relinquish the routine, mechanistic parts of their everyday
tasks, having them performed automatically instead. Adaptive interfaces combine elements of both these approaches, where the
goal is to have the interface adapt to its users rather than the reverse. This paper addresses some of the issues arising
from a practical software development process which aimed to support individuals using this style of interaction. This paper
documents the development of a set of classes which implement an architecture for adaptive interfaces. These classes are intended
to be used as part of larger user interface systems which are to exhibit adaptive behaviour. One approach to the implementation
of an adaptive interface is to use a set of software “agents”– simple processes which effectively run “in the background”–
to decompose the task of implementing the interface. These agents form part of a larger adaptive interface architecture, which
in turn forms a component of the adaptive system.
15.
In order to get useful information from various kinds of information sources, we first apply a searching process with query
statements to retrieve candidate data objects (called a hunting process in this paper) and then apply a browsing process to
check the properties of each object in detail by visualizing candidates. In traditional information retrieval systems, the
hunting process determines the quality of the result, since there are only a few candidates left for the browsing process.
In order to retrieve data from widely distributed digital libraries, the browsing process becomes very important, since the
properties of data sources are not known in advance. After getting data from various information sources, a user checks the
properties of data in detail using the browsing process. The result can be used to improve the hunting process or for selecting
more appropriate visualization parameters. Visualizing relationships among data is very important, but becomes too
time-consuming if the amount of data in the candidate set is large, for example, over one hundred objects. One of the important
problems in handling information retrieval from a digital library is to create efficient and powerful visualization mechanisms
for the browsing process. One promising way to solve the visualization problem is to map each candidate data object into a
location in three-dimensional (3D) space using a proper distance definition. In this paper, we will introduce the functions
and organization of a system having a browsing navigator to achieve an efficient browsing process in 3D information search
space. This browsing navigator has the following major functions: (1) selection of features which determine the distance for
visualization, in order to generate a uniform distribution of candidate data objects in the resulting space; (2) calculation
of the location of the data objects in 2D space using the selected features; (3) construction of 3D browsing space by combining
2D spaces, in order to find the required data objects easily; (4) generation of the oblique views of 3D browsing space and
data objects by reducing the overlap of data objects in order to make navigation easy for the user in 3D space. Examples
of this browsing navigator applied to book data are shown.
Received: 15 December 1997 / Revised: June 1999
16.
Developed forms of task analysis allow designers to focus on both utility and usability issues in the development of interactive
work systems. The models they generate represent aspects of the human, computer and domain elements of an interactive work
system. Many interactive work systems are embedded in an organisational context. Pressures for change are present in this
context and provide impetus to stakeholders to change work tasks and the supporting tools. Interactive work systems also provide
evolutionary pressures of their own, changing the very task they were designed to support. One approach to coping with change
has been to evolve interactive work systems. Currently none of these techniques place focus on the performance of tasks as
central, and consideration of usability is minimal. However, an evolutionary design approach forces an evolutionary experience
upon users, and we cannot be sure whether this approach enhances the user’s experience or degrades their performance. Given
the strength of task analysis it is likely that it will be applied within evolutionary contexts. Yet, little work has been
undertaken to examine whether its role will, or could be different. We ask how we can move task analysis towards being used
in a principled manner in the evolution of interactive work systems. This paper examines a number of features of the approach
called task knowledge structures that may be useful in evolving interactive work systems. We look at tasks and their representativeness,
roles, goals, objects (their attributes, relationships, typicality and centrality) and actions. We present a developing framework
for examining other task analysis approaches for their utility in supporting interactive work systems evolution. Finally,
we discuss future work within the area of applying task analysis in the evolution of interactive work systems.
17.
This paper describes a method for recognizing partially occluded objects under different levels of illumination brightness
by using the eigenspace analysis. In our previous work, we developed the “eigenwindow” method to recognize the partially occluded
objects in an assembly task, and demonstrated, with sufficiently high performance for industrial use, that the method works
successfully for multiple objects with specularity under constant illumination. In this paper, we modify the eigenwindow method
for recognizing objects under different illumination conditions, as is sometimes the case in manufacturing environments, by
using additional color information. In the proposed method, a measured color in the RGB color space is transformed into one
in the HSV color space. Then, the hue of the measured color, which is invariant to change in illumination brightness and direction,
is used for recognizing multiple objects under different illumination conditions. The proposed method was applied to real
images of multiple objects under various illumination conditions, and the objects were recognized and localized successfully.
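The claim that hue is unaffected by illumination brightness can be checked with a small sketch using Python's standard `colorsys` module. It models a brightness change as a uniform scaling of the RGB values, which is a simplifying assumption; the helper name and sample color are hypothetical, and this is not the paper's recognition pipeline.

```python
import colorsys
from typing import Tuple

RGB = Tuple[float, float, float]


def hue_and_value(rgb: RGB) -> Tuple[float, float]:
    """Return (hue, value) of an RGB color with components in [0, 1]."""
    h, _s, v = colorsys.rgb_to_hsv(*rgb)
    return h, v


# Hypothetical surface color seen under full and half illumination brightness.
bright = (0.8, 0.4, 0.2)
dim = tuple(0.5 * c for c in bright)

h_bright, v_bright = hue_and_value(bright)
h_dim, v_dim = hue_and_value(dim)

# Hue stays (numerically) the same; only the value channel changes.
print(abs(h_bright - h_dim) < 1e-9, v_bright, v_dim)
```

Because hue depends only on ratios of differences between the channels, scaling all three channels by the same factor leaves it unchanged, while the value channel scales with the brightness.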
18.
A User-Centered Location Model (Total citations: 1; self-citations: 0; citations by others: 1)
This paper discusses the user-centered location model used in comMotion. In this context, the location model refers to a set of learned places (destinations), each corresponding to a latitude and a
longitude, that the user has categorized. It also includes knowledge of the routes between the destinations and the time it
takes to travel them. The model is based on user experience, i.e. his patterns of mobility, so no two models are the same.
We also discuss the pattern recognition models implemented for route learning, route prediction and estimation of time to
arrival.
Correspondence to: Ms N. Marmasse, MIT Media Laboratory, 20 Ames Street, Cambridge, MA 02139, USA. Email: nmarmas@media.mit.edu
19.
We describe a system which supports dynamic user interaction with multimedia information using content-based hypermedia navigation
techniques, specialising in a technique for navigation of musical content. The model combines the principles of open hypermedia, whereby hypermedia link information is maintained by a link service, with content-based retrieval techniques in which a database is queried based on a feature of the multimedia content; our approach could be described as
‘content-based retrieval of hypermedia links’. The experimental system focuses on temporal media and consists of a set of
component-based navigational hypermedia tools. We propose the use of melodic pitch contours in this context and we present
techniques for storing and querying contours, together with experimental results. Techniques for integrating the contour database
with open hypermedia systems are also discussed.
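To make the idea of storing and querying melodic pitch contours concrete, here is a minimal sketch. The abstract does not specify the contour representation, so the up/down/repeat string encoding and substring matching used here are assumptions, and the function names and sample database are hypothetical.

```python
from typing import Dict, List, Sequence


def pitch_contour(pitches: Sequence[int]) -> str:
    """Encode a melody (e.g. MIDI note numbers) as U(p), D(own), R(epeat) steps."""
    steps = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            steps.append("U")
        elif cur < prev:
            steps.append("D")
        else:
            steps.append("R")
    return "".join(steps)


def find_matches(database: Dict[str, str], query_pitches: Sequence[int]) -> List[str]:
    """Return names of stored contours that contain the query's contour."""
    query = pitch_contour(query_pitches)
    return [name for name, contour in database.items() if query in contour]


# Hypothetical contour database built from two short melodies.
database = {
    "tune_a": pitch_contour([60, 60, 67, 67, 69, 69, 67]),
    "tune_b": pitch_contour([64, 62, 60, 62, 64, 64, 64]),
}

# The query is a transposed fragment of tune_a; contours ignore absolute pitch.
print(find_matches(database, [65, 65, 72, 72]))
```

A contour keeps only the direction of each melodic step, so a query matches regardless of key, at the cost of also matching unrelated melodies that share the same shape.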
20.
In this paper we describe a prototype spatial audio user interface for a Global Positioning System (GPS). The interface is
designed to allow mobile users to carry out location tasks while their eyes, hands or attention are otherwise engaged. Audio
user interfaces for GPS have typically been designed to meet the needs of visually impaired users, and generally, though not
exclusively, employ speech-audio. In contrast, our prototype system uses a simple form of non-speech, spatial audio. This
paper analyses various candidate audio mappings for location and distance information. A variety of tasks, design considerations,
design trade-offs and opportunities are considered. The findings from pilot empirical testing are reported. Finally, opportunities
for improvements to the system and for future evaluation are explored.