1.
2.
3.
4.
In this paper we present a multi-touch tabletop system for browsing image databases, conceived for museums and art gallery exhibitions. The system exploits an innovative image-browsing paradigm and image retrieval functionalities to support natural and intuitive user interaction: users can explore the image database by handling digital pictures with touch gestures or with a predetermined set of physical objects; once one of these objects is placed on the table, it is automatically recognized and the associated function is triggered. The set of objects and the function associations can be configured dynamically. An innovative feature of our application is that users can interactively create and manipulate image clusters in which images are grouped according to their pictorial similarity. This is achieved by placing one or more specific tangible objects on the table surface. The system has been evaluated on a collection of photos organized in groups according to the UNESCO picture categories. Usability tests performed with different user categories show that users find the application attractive and interesting.
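As an illustration of grouping images by pictorial similarity (the clustering behaviour triggered by the tangible objects), the following sketch clusters images on coarse RGB colour histograms with k-means. The feature choice and cluster count are assumptions for illustration only; the paper does not specify its similarity features in this abstract.

```python
import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

def colour_histogram(path, bins=8):
    """A coarse RGB histogram as a cheap pictorial-similarity feature."""
    rgb = np.asarray(Image.open(path).convert("RGB")).reshape(-1, 3)
    hist, _ = np.histogramdd(rgb, bins=(bins, bins, bins), range=[(0, 256)] * 3)
    return (hist / hist.sum()).ravel()

def cluster_images(paths, n_clusters=5):
    """Group images by pictorial similarity; each cluster could back one
    tangible-object 'pile' on the tabletop."""
    X = np.stack([colour_histogram(p) for p in paths])
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)
    return {k: [p for p, l in zip(paths, labels) if l == k] for k in range(n_clusters)}
```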
5.
6.
Most existing methods for visual sentiment analysis build the sentiment representation from the image as a whole, yet the local regions that contain objects often carry the strongest emotional cues. To address the neglect of local-region sentiment representations in visual sentiment analysis, this paper proposes a method that embeds both holistic image features and local object features. The method combines the whole image with its local regions to mine the sentiment expressed in the image: an object detection model first localizes the regions that contain objects, a deep neural network then extracts sentiment features from these local regions, and finally the deep features extracted from the whole image and the local-region features are used jointly to train an image sentiment classifier and predict the sentiment polarity of the image. Experimental results show that the proposed method achieves sentiment classification accuracies of 75.81% and 78.90% on the real-world datasets Twitter I and Twitter II, respectively, outperforming methods that analyze sentiment from holistic features alone or from local-region features alone.
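A minimal sketch of such a global-plus-local pipeline, assuming PyTorch and torchvision (0.13 or later for the weights enums) as stand-ins for the paper's unspecified detector and feature extractor: a ResNet-50 supplies holistic features, a Faster R-CNN proposes object regions, and the concatenated representation would feed any downstream sentiment classifier.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

# Global branch: ResNet-50 with its classifier head removed (2048-D features).
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = nn.Identity()
backbone.eval()

# Local branch: an off-the-shelf detector proposes object regions.
detector = models.detection.fasterrcnn_resnet50_fpn(
    weights=models.detection.FasterRCNN_ResNet50_FPN_Weights.DEFAULT)
detector.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def image_sentiment_feature(path, top_k=3, score_thresh=0.7):
    img = Image.open(path).convert("RGB")
    global_feat = backbone(preprocess(img).unsqueeze(0)).squeeze(0)

    # Detect object regions and average the features of the top-scoring crops.
    det = detector([transforms.ToTensor()(img)])[0]
    boxes = det["boxes"][det["scores"] >= score_thresh][:top_k]
    local_feats = []
    for x1, y1, x2, y2 in boxes.round().int().tolist():
        crop = img.crop((x1, y1, x2, y2))
        local_feats.append(backbone(preprocess(crop).unsqueeze(0)).squeeze(0))
    local_feat = (torch.stack(local_feats).mean(0) if local_feats
                  else torch.zeros_like(global_feat))

    # Joint representation: holistic and local features concatenated,
    # to be fed to any sentiment classifier (e.g. a linear layer).
    return torch.cat([global_feat, local_feat])
```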
7.
E. V. Myasnikov 《Pattern Recognition and Image Analysis》2011,21(2):312-315
The presented method is based on hierarchical clustering of data in a multidimensional space and the Sammon mapping. The key feature of the dimensionality-reduction stage is the use of reference-node lists created from the results of the hierarchical clustering stage. The experimental results reported in this work include estimates of the method's performance for several feature systems extracted from digital image collections of different sizes.
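A compact sketch of the two stages, assuming NumPy/SciPy: Ward hierarchical clustering yields representative (reference) points, and a plain gradient-descent Sammon mapping projects them to 2-D. The paper's reference-node lists and its exact update rule are not reproduced here; this only illustrates the basic idea.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage, fcluster

def sammon(X, n_iter=500, lr=0.1, seed=0):
    """Minimal Sammon mapping to 2-D via gradient descent on Sammon stress."""
    rng = np.random.default_rng(seed)
    D = squareform(pdist(X)) + 1e-12            # input-space distances
    Y = rng.normal(scale=1e-2, size=(len(X), 2))
    c = D.sum()
    for _ in range(n_iter):
        d = squareform(pdist(Y)) + 1e-12        # output-space distances
        w = (D - d) / (D * d)                   # pairwise gradient weights
        np.fill_diagonal(w, 0.0)
        grad = -2.0 / c * (w[:, :, None] * (Y[:, None, :] - Y[None, :, :])).sum(axis=1)
        Y -= lr * grad
    return Y

def map_reference_nodes(X, n_clusters=20):
    """Cluster hierarchically, then Sammon-map the cluster centroids."""
    labels = fcluster(linkage(X, method="ward"), t=n_clusters, criterion="maxclust")
    reps = np.array([X[labels == k].mean(axis=0) for k in np.unique(labels)])
    return sammon(reps), labels
```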
8.
Semantic scene classification is an open problem in computer vision, especially when information from only a single image is employed. In applications involving image collections, however, images are clustered sequentially, allowing surrounding images to be used as temporal context. We present a general probabilistic temporal context model in which the first-order Markov property is used to integrate content-based and temporal context cues. The model uses elapsed-time-dependent transition probabilities between images to enforce the fact that images captured within a shorter period of time are more likely to be related. This model is generalized in that it allows arbitrary elapsed time between images, making it suitable for classifying image collections. In addition, we derived a variant of this model to use in ordered image collections for which no timestamp information is available, such as film scans. We applied the proposed context models to two problems, achieving significant gains in accuracy in both cases. The two algorithms used to implement inference within the context model, Viterbi and belief propagation, yielded similar results with a slight edge to belief propagation.
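A hedged sketch of Viterbi inference with elapsed-time-dependent transitions, assuming an exponential decay toward a uniform transition matrix as the gap between capture times grows; the paper's actual transition parameterization and learned probabilities are not reproduced.

```python
import numpy as np

def viterbi_temporal(content_probs, timestamps, base_trans, tau=300.0):
    """
    content_probs: (N, K) per-image class probabilities from a content-based classifier.
    timestamps:    length-N capture times in seconds (monotonically increasing).
    base_trans:    (K, K) transition matrix for images taken very close in time.
    tau:           assumed decay constant; as elapsed time grows, transitions
                   flatten toward a uniform (uninformative) prior.
    """
    N, K = content_probs.shape
    uniform = np.full((K, K), 1.0 / K)
    log_delta = np.log(content_probs[0] + 1e-12)
    back = np.zeros((N, K), dtype=int)
    for t in range(1, N):
        w = np.exp(-(timestamps[t] - timestamps[t - 1]) / tau)  # temporal closeness
        A = w * base_trans + (1.0 - w) * uniform                # time-dependent transitions
        scores = log_delta[:, None] + np.log(A + 1e-12)
        back[t] = scores.argmax(axis=0)
        log_delta = scores.max(axis=0) + np.log(content_probs[t] + 1e-12)
    path = [int(log_delta.argmax())]
    for t in range(N - 1, 0, -1):                               # backtrace the best path
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```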
Matthew Boutell received the BS degree in Mathematical Science from Worcester Polytechnic Institute, Massachusetts, in 1993, the MEd degree from University of Massachusetts at Amherst in 1994, and the PhD degree in Computer Science from the University of Rochester, Rochester, NY, in 2005. He served for several years as a mathematics and computer science instructor at Norton High School and Stonehill College and as a research intern/consultant at Eastman Kodak Company. Currently, he is Assistant Professor of Computer Science and Software Engineering at Rose-Hulman Institute of Technology in Terre Haute, Indiana. His research interests include image understanding, machine learning, and probabilistic modeling.
Jiebo Luo received his PhD degree in Electrical Engineering from the University of Rochester, Rochester, NY in 1995. He is a Senior Principal Scientist with the Kodak Research Laboratories. He was a member of the Organizing Committee of the 2002 IEEE International Conference on Image Processing and the 2006 IEEE International Conference on Multimedia and Expo, a guest editor for the Journal of Wireless Communications and Mobile Computing Special Issue on Multimedia Over Mobile IP and the Pattern Recognition journal Special Issue on Image Understanding for Digital Photos, and a Member of the Kodak Research Scientific Council. He is on the editorial boards of the IEEE Transactions on Multimedia, Pattern Recognition, and the Journal of Electronic Imaging. His research interests include image processing, pattern recognition, computer vision, medical imaging, and multimedia communication. He has authored over 100 technical papers and holds over 30 granted US patents. He is a Kodak Distinguished Inventor and a Senior Member of the IEEE.
Chris Brown (BA Oberlin 1967, PhD University of Chicago 1972) is Professor of Computer Science at the University of Rochester. He has published in many areas of computer vision and robotics. He wrote COMPUTER VISION with his colleague Dana Ballard, and influential work on the "active vision" paradigm was reported in two special issues of the International Journal of Computer Vision. He edited the first two volumes of ADVANCES IN COMPUTER VISION for Erlbaum and (with D. Terzopoulos) REAL-TIME COMPUTER VISION, from Cambridge University Press. He is the co-editor of VIDERE, the first entirely on-line refereed computer vision journal (MIT Press). His most recent PhD students have done research in infrared tracking and face recognition, features and strategies for image understanding, augmented reality, and three-dimensional reconstruction algorithms. He supervised the undergraduate team that twice won the AAAI Host Robot competition (and came third in the Robot Rescue competition in 2003).
9.
Xueping Su Jinye Peng Xiaoyi Feng Jun Wu Jianping Fan Li Cui 《Multimedia Tools and Applications》2014,73(3):1643-1661
To automatically mine the underlying relationships between famous people in daily news, for example to build a news-person network with faces as icons that facilitates face-based person finding, we need a tool that automatically labels the faces in news images with their real names. This paper studies the problem of linking names with faces in large-scale news images with captions. In our previous work, we proposed a method called Person-based Subset Clustering, which is mainly based on clustering all face images associated with the same name. The position at which a name appears in a caption, as well as the visual structural information within a news image, provides informative cues about who actually appears in the associated image. By combining this domain knowledge from the captions with the corresponding images, we propose a novel cross-modality approach that further improves the performance of linking names with faces. Experiments are performed on data sets containing approximately half a million news images from Yahoo! News, and the results show that the proposed method achieves a significant improvement over clustering-only methods.
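A simplified sketch of the subset-clustering idea, assuming precomputed face embeddings and caption-extracted names (scikit-learn's DBSCAN stands in for the paper's clustering step): faces from all images whose captions mention a name are pooled, and the dominant cluster is taken as that person's prototype.

```python
from collections import defaultdict
import numpy as np
from sklearn.cluster import DBSCAN

def link_names_to_faces(items, eps=0.5, min_samples=2):
    """
    items: list of (caption_names, face_descriptors) per news photo, where
           caption_names is a list of names found in the caption and
           face_descriptors is an (m, d) array of face embeddings in that photo.
    Returns a dict: name -> centroid of the dominant face cluster for that name.
    """
    per_name = defaultdict(list)
    for names, faces in items:
        for name in names:            # every face is a candidate for every captioned name
            per_name[name].extend(faces)
    linked = {}
    for name, faces in per_name.items():
        X = np.asarray(faces)
        if len(X) < min_samples:
            continue
        labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
        counts = {lbl: int((labels == lbl).sum()) for lbl in set(labels) if lbl != -1}
        if not counts:
            continue
        dominant = max(counts, key=counts.get)
        linked[name] = X[labels == dominant].mean(axis=0)  # prototype face for this name
    return linked
```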
10.
In this work, we are interested in technologies that allow users to actively browse and navigate large image databases and to retrieve images through interactive, fast browsing and navigation. The development of a browsing/navigation-based image retrieval system has at least two challenges. The first is that the system's graphical user interface (GUI) should intuitively reflect the distribution of the images in the database in order to provide users with a mental picture of the database content and a sense of orientation during the course of browsing/navigation. The second is that it has to be fast and responsive, able to react to users' actions at an interactive speed in order to keep them engaged. We have developed a method that addresses these challenges of browsing/navigation-based image retrieval systems. The unique feature of the method is that we take an integrated approach to the design of the browsing/navigation GUI and the indexing and organization of the images in the database. The GUI is tightly coupled with the algorithms that run in the background. The visual cues of the GUI are logically linked with various parts of the repository (image clusters of particular visual themes), thus providing intuitive correspondences between the GUI and the database contents. In the backend, the images are organized into a binary tree data structure using a sequential maximal information coding algorithm, and each image is indexed by an n-bit binary index, making responses to users' actions very fast. We present experimental results to demonstrate the usefulness of our method both as a pre-filtering tool and for developing browsing/navigation systems for fast image retrieval from large image databases.
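The n-bit indexing idea can be sketched as a recursive binary split over the image features; here the highest-variance feature dimension is used as a stand-in for the paper's sequential maximal information coding criterion, which is not reproduced.

```python
import numpy as np

def build_index(features, depth):
    """
    Assign each image an n-bit binary code by recursively splitting the set
    at the median of the most informative (here: highest-variance) dimension.
    features: (N, d) array of image descriptors; depth: number of bits n.
    Returns an (N,) array of integer codes in [0, 2**depth).
    """
    codes = np.zeros(len(features), dtype=np.int64)

    def split(idx, level):
        if level == depth or len(idx) <= 1:
            return
        X = features[idx]
        dim = X.var(axis=0).argmax()                 # proxy for "maximal information"
        right = X[:, dim] > np.median(X[:, dim])
        codes[idx[right]] |= 1 << (depth - 1 - level)  # set this level's bit
        split(idx[right], level + 1)
        split(idx[~right], level + 1)

    split(np.arange(len(features)), 0)
    return codes
```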
11.
Nowadays, due to the rapid growth of digital technologies, huge volumes of image data are created and shared on social media sites. User-provided tags attached to each social image are widely recognized as a bridge to fill the semantic gap between low-level image features and high-level concepts. Hence, a combination of images with their corresponding tags is useful for intelligent retrieval systems, which are designed to gain high-level understanding from images and facilitate semantic search. However, user-provided tags in practice are usually incomplete and noisy, which may degrade retrieval performance. To tackle this problem, we present a novel retrieval framework that automatically associates visual content with textual tags and enables effective image search. To this end, we first propose a probabilistic topic model learned on social images to discover latent topics from the co-occurrence of tags and image features. Moreover, our topic model is built by exploiting expert knowledge about the correlation between tags and visual content and the relationship among image features, formulated in terms of spatial location and color distribution. The discovered topics then help to predict missing tags of an unseen image as well as of images partially labeled in the database. These predicted tags can greatly facilitate the reliable measurement of semantic similarity between the query and database images. Therefore, we further present a scoring scheme that estimates similarity by fusing textual tags and the visual representation. Extensive experiments conducted on three benchmark datasets show that our topic model provides accurate annotation despite the noise and incompleteness of tags. Using our generalized scoring scheme, which is particularly advantageous for many types of queries, the proposed approach also outperforms state-of-the-art approaches in terms of retrieval accuracy.
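A minimal sketch of a fusion score in the spirit of the scheme described, assuming a simple weighted combination of tag overlap (Jaccard) and visual cosine similarity; the paper's actual scoring works with topic-model-predicted tags and a more elaborate weighting.

```python
import numpy as np

def fused_score(query_tags, query_feat, db_tags, db_feat, alpha=0.6):
    """
    Combine tag overlap (Jaccard) with visual cosine similarity.
    alpha weights the textual cue; (1 - alpha) weights the visual cue.
    """
    q, d = set(query_tags), set(db_tags)
    tag_sim = len(q & d) / len(q | d) if (q | d) else 0.0
    visual_sim = float(np.dot(query_feat, db_feat) /
                       (np.linalg.norm(query_feat) * np.linalg.norm(db_feat) + 1e-12))
    return alpha * tag_sim + (1.0 - alpha) * visual_sim
```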
12.
13.
14.
Holger Meuss Klaus U. Schulz Felix Weigel Simone Leonardi François Bry 《International Journal on Digital Libraries》2005,5(1):3-17
This article reports on the XML retrieval system x2 that has been developed at the University of Munich over the last five years. In a typical session with x2, the user first browses a structural summary of the XML database in order to select interesting elements and keywords occurring in documents. From this intermediate result, queries combining structure and textual references are composed semi-automatically. After query evaluation, the full set of answers is presented in a visual and structured way. x2 largely exploits the structure found in documents, queries and answers to enable new interactive visualization and exploration techniques that support mixed IR- and database-oriented querying, thus bridging the gap between these views on the data to be retrieved. Another salient characteristic of x2 that distinguishes it from other visual query systems for XML is that it supports various degrees of detail in the presentation of answers, as well as techniques for dynamically reordering, grouping and ranking retrieved elements once the complete answer set has been computed.
15.
This paper describes how to separate the three primary colour components of a colour image using Visual Basic, so that existing subroutines for black-and-white (grayscale) image processing can be applied to colour image processing.
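The same channel-separation step can be sketched in Python with Pillow (rather than the Visual Basic of the paper) as a minimal illustration:

```python
from PIL import Image

def split_channels(path):
    """Split a colour image into its red, green and blue planes so that
    existing single-channel (grayscale) routines can process each plane."""
    r, g, b = Image.open(path).convert("RGB").split()
    return r, g, b

# Example: process each plane with any grayscale routine, then recombine.
# r, g, b = split_channels("photo.jpg")
# Image.merge("RGB", (r, g, b)).save("recombined.jpg")
```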
16.
Visual image retrieval by elastic matching of user sketches
Del Bimbo A. Pala P. 《IEEE transactions on pattern analysis and machine intelligence》1997,19(2):121-132
Effective content-based image retrieval from a database requires that visual image properties be used, instead of textual labels, to index and recover pictorial data. Retrieval by shape similarity, given a user-sketched template, is particularly challenging, owing to the difficulty of deriving a similarity measure that closely conforms to the common human perception of similarity. In this paper, we present a technique that evaluates similarity ranks by elastic matching of sketched templates over the shapes in the images. The degree of matching achieved and the elastic deformation energy spent by the sketch to achieve such a match are used to derive a measure of similarity between the sketch and the images in the database and to rank the images to be displayed. The elastic matching is integrated with arrangements that provide scale invariance and take into account spatial relationships between objects in multi-object queries. Examples from a prototype system are expounded, with considerations about the effectiveness of the approach and a comparative performance analysis.
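A crude sketch of the scoring idea, assuming a precomputed binary edge map and a sketch template already positioned over the image: each template point is measured against its nearest edge, the mean squared displacement stands in for the deformation energy, and the in-radius fraction stands in for the degree of matching. The paper's continuous elastic deformation model is not reproduced.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def sketch_similarity(template_pts, edge_map, radius=15.0, lam=0.01):
    """
    template_pts: (P, 2) array of (row, col) points on the user's sketch.
    edge_map:     boolean array, True where the image has an edge pixel.
    Each sketch point is pulled toward its nearest edge pixel; the mean squared
    displacement approximates deformation energy, and the fraction of points
    with an edge within `radius` approximates the degree of matching.
    """
    dist = distance_transform_edt(~edge_map)   # distance of every pixel to the nearest edge
    rows = np.clip(np.round(template_pts[:, 0]).astype(int), 0, edge_map.shape[0] - 1)
    cols = np.clip(np.round(template_pts[:, 1]).astype(int), 0, edge_map.shape[1] - 1)
    d = dist[rows, cols]
    matched = d <= radius
    match_degree = matched.mean()
    energy = (d[matched] ** 2).mean() if matched.any() else radius ** 2
    return match_degree - lam * energy         # higher score means more similar
```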
17.
18.
《Pattern recognition》2014,47(2):705-720
We present word spatial arrangement (WSA), an approach to representing the spatial arrangement of visual words under the bag-of-visual-words model. It rests on a simple idea: the relative position of visual words is encoded by splitting the image space into quadrants, using each detected point as origin. WSA generates compact feature vectors, is flexible enough to be used for image retrieval and classification, works with hard or soft assignment, and requires no pre- or post-processing for spatial verification. Experiments in the retrieval scenario show the superiority of WSA over Spatial Pyramids. Experiments in the classification scenario show a reasonable compromise between the two methods, with Spatial Pyramids generating larger feature vectors, while WSA provides adequate performance with much more compact features. As WSA encodes only the spatial information of visual words and not their frequency of occurrence, the results indicate the importance of such information for visual categorization.
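A simplified sketch of the quadrant encoding, assuming keypoint coordinates and hard visual-word assignments; the exact counting, normalization and soft-assignment handling in the paper may differ.

```python
import numpy as np

def wsa_descriptor(points, words, vocab_size):
    """
    points: (P, 2) keypoint coordinates (x, y); words: (P,) visual-word ids.
    With every keypoint taken as origin, the image plane is split into four
    quadrants and we count, per visual word, how many other keypoints fall
    in each quadrant. Counts are accumulated over all origins and
    L1-normalised, giving a vocab_size x 4 spatial-arrangement descriptor.
    """
    points = np.asarray(points, dtype=float)
    words = np.asarray(words)
    desc = np.zeros((vocab_size, 4))
    for ox, oy in points:
        dx = points[:, 0] - ox
        dy = points[:, 1] - oy
        quad = (dx > 0).astype(int) + 2 * (dy > 0).astype(int)  # quadrant label 0..3
        mask = (dx != 0) | (dy != 0)                            # skip the origin itself
        np.add.at(desc, (words[mask], quad[mask]), 1)           # accumulate duplicate pairs
    total = desc.sum()
    return (desc / total).ravel() if total else desc.ravel()
```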
19.
20.
We have developed a flexible software environment called ADVIZOR for visual information discovery. ADVIZOR complements existing assumption-based analyses by providing a discovery-based approach. ADVIZOR consists of five parts: a rich set of flexible visual components, strategies for arranging the components for particular analyses, an in-memory data pool, data manipulation components, and container applications. Working together, these parts make ADVIZOR's architecture a powerful production platform for creating innovative visual query and analysis applications.