20 similar documents found; search took 15 ms
1.
2.
3.
4.
In this paper we present a multi-touch tabletop system for browsing image databases, conceived for museums and art gallery exhibitions. The system exploits an innovative image browsing paradigm and image retrieval functionalities to enable natural and intuitive user interaction: users can explore the image database by handling digital pictures with touch gestures or with a predetermined set of physical objects; once one of these objects is placed on the table, it is automatically recognized and the associated function is triggered. The set of objects and the function associations can be dynamically configured. An innovative feature of our application is that users can interactively create and manipulate image clusters in which images are grouped according to their pictorial similarity. This is achieved by placing one or more specific tangible objects on the table surface. The system has been evaluated on a collection of photos organized into groups according to the UNESCO picture categories. The usability tests, performed with different user categories, show that users consider the application attractive and interesting.
5.
6.
Most existing visual sentiment analysis methods build the sentiment representation from the image as a whole, yet the local regions that contain objects often convey the emotion more strongly. To address the neglect of local-region sentiment representations in visual sentiment analysis, this paper proposes a method that embeds both holistic image features and local object features. The method combines the whole image and its local regions to mine the sentiment representation: it first uses an object detection model to locate the local regions containing objects, then extracts sentiment features from those regions with a deep neural network, and finally trains an image sentiment classifier on the deep features extracted from the whole image together with the local-region features to predict the image's sentiment polarity. Experimental results show that the proposed method achieves sentiment classification accuracies of 75.81% and 78.90% on the real-world datasets Twitter I and Twitter II, respectively, outperforming methods that analyze sentiment from holistic features alone or from local-region features alone.
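As a rough illustration of the embedding step described above, the whole-image descriptor can be concatenated with an aggregate of the object-region descriptors before classifier training. The max-pooling used here is an assumption; the abstract does not specify how region features are combined.

```python
import numpy as np

def build_emotion_feature(global_feat, region_feats):
    # Aggregate the per-region sentiment features (max-pooling is an assumed
    # choice), then append the result to the holistic image feature.
    if region_feats:
        pooled = np.max(np.stack(region_feats), axis=0)
    else:
        # no objects detected: pad with zeros so the classifier input
        # dimension stays fixed
        pooled = np.zeros_like(global_feat)
    return np.concatenate([global_feat, pooled])
```

The concatenated vector would then feed whatever classifier predicts sentiment polarity.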
7.
E. V. Myasnikov 《Pattern Recognition and Image Analysis》2011,21(2):312-315
The method is based on hierarchical clustering of the data in a multidimensional space combined with the Sammon mapping. The key feature of the dimensionality-reduction stage is the use of reference-node lists created from the results of the hierarchical clustering stage. The experimental results given in this work include estimates of the method's performance and quality for several systems of features extracted from digital image collections of different sizes.
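The Sammon mapping at the core of this method can be sketched as plain gradient descent on Sammon's stress; the hierarchical clustering stage and the reference-node lists are omitted here, and the step size and initialization scale are assumptions.

```python
import numpy as np

def sammon_map(X, n_iter=300, lr=0.3, seed=0):
    # Project X (n x d) to 2-D by gradient descent on Sammon's stress
    #   E = (1/c) * sum_{i<j} (D*_ij - d_ij)^2 / D*_ij,  c = sum_{i<j} D*_ij,
    # where D* are input-space distances and d are distances in the 2-D layout.
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Dstar = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    mask = ~np.eye(n, dtype=bool)
    c = Dstar[mask].sum() / 2.0
    # start at a random layout whose spread is comparable to the input distances
    Y = rng.normal(scale=Dstar[mask].mean() / 2.0, size=(n, 2))

    def stress(Y):
        d = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
        return (((Dstar - d) ** 2)[mask] / Dstar[mask]).sum() / (2.0 * c)

    for _ in range(n_iter):
        d = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
        np.fill_diagonal(d, 1.0)            # avoid division by zero on the diagonal
        Ds = Dstar.copy()
        np.fill_diagonal(Ds, 1.0)
        coef = (Ds - d) / (Ds * d)          # (D*_ij - d_ij) / (D*_ij * d_ij)
        np.fill_diagonal(coef, 0.0)
        # gradient of E with respect to each 2-D point
        grad = (-2.0 / c) * (coef[:, :, None] * (Y[:, None] - Y[None, :])).sum(axis=1)
        Y = Y - lr * grad
    return Y, stress(Y)
```

Calling with `n_iter=0` returns the stress of the random starting layout, which the optimized layout should improve on.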
8.
Semantic scene classification is an open problem in computer vision, especially when information from only a single image is employed. In applications involving image collections, however, images are clustered sequentially, allowing surrounding images to be used as temporal context. We present a general probabilistic temporal context model in which the first-order Markov property is used to integrate content-based and temporal context cues. The model uses elapsed-time-dependent transition probabilities between images to enforce the fact that images captured within a shorter period of time are more likely to be related. This model is generalized in that it allows arbitrary elapsed time between images, making it suitable for classifying image collections. In addition, we derived a variant of this model to use in ordered image collections for which no timestamp information is available, such as film scans. We applied the proposed context models to two problems, achieving significant gains in accuracy in both cases. The two algorithms used to implement inference within the context model, Viterbi and belief propagation, yielded similar results with a slight edge to belief propagation.
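The elapsed-time-dependent transitions and Viterbi inference described above can be sketched as follows: the transition matrix interpolates between a "sticky" matrix for images taken close together in time and a uniform matrix for images far apart. The exponential decay form, its time constant `tau`, and the `stickiness` value are assumptions; the abstract does not give the actual parameterization.

```python
import numpy as np

def time_transition(dt, n_states, tau=300.0, stickiness=0.9):
    # Blend a sticky transition matrix with a uniform one; the blend weight
    # decays with elapsed time dt (exponential decay is an assumed form).
    w = np.exp(-dt / tau)
    sticky = np.full((n_states, n_states), (1.0 - stickiness) / (n_states - 1))
    np.fill_diagonal(sticky, stickiness)
    uniform = np.full((n_states, n_states), 1.0 / n_states)
    return w * sticky + (1.0 - w) * uniform

def viterbi(likelihoods, dts, prior, tau=300.0):
    # likelihoods: T x S per-image class likelihoods (the content-based cue)
    # dts: T-1 elapsed times between consecutive images (the temporal cue)
    T, S = likelihoods.shape
    logd = np.log(likelihoods[0] * prior)
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        logA = np.log(time_transition(dts[t - 1], S, tau))
        scores = logd[:, None] + logA      # scores[i, j]: end in i, move to j
        back[t] = scores.argmax(axis=0)
        logd = scores.max(axis=0) + np.log(likelihoods[t])
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

With small elapsed times the sticky transitions smooth away weak contradictory evidence in a single image; with large elapsed times each image effectively stands on its own, matching the behavior the model is meant to capture.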
Matthew Boutell received the BS degree in Mathematical Science from Worcester Polytechnic Institute, Massachusetts, in 1993, the MEd degree from the University of Massachusetts at Amherst in 1994, and the PhD degree in Computer Science from the University of Rochester, Rochester, NY, in 2005. He served for several years as a mathematics and computer science instructor at Norton High School and Stonehill College and as a research intern/consultant at Eastman Kodak Company. Currently, he is Assistant Professor of Computer Science and Software Engineering at Rose-Hulman Institute of Technology in Terre Haute, Indiana. His research interests include image understanding, machine learning, and probabilistic modeling.
Jiebo Luo received his PhD degree in Electrical Engineering from the University of Rochester, Rochester, NY, in 1995. He is a Senior Principal Scientist with the Kodak Research Laboratories. He was a member of the Organizing Committees of the 2002 IEEE International Conference on Image Processing and the 2006 IEEE International Conference on Multimedia and Expo, a guest editor for the Journal of Wireless Communications and Mobile Computing Special Issue on Multimedia Over Mobile IP and the Pattern Recognition journal Special Issue on Image Understanding for Digital Photos, and a member of the Kodak Research Scientific Council. He is on the editorial boards of the IEEE Transactions on Multimedia, Pattern Recognition, and the Journal of Electronic Imaging. His research interests include image processing, pattern recognition, computer vision, medical imaging, and multimedia communication. He has authored over 100 technical papers and holds over 30 granted US patents. He is a Kodak Distinguished Inventor and a Senior Member of the IEEE.
Chris Brown (BA, Oberlin, 1967; PhD, University of Chicago, 1972) is Professor of Computer Science at the University of Rochester. He has published in many areas of computer vision and robotics. He wrote COMPUTER VISION with his colleague Dana Ballard, and his influential work on the "active vision" paradigm was reported in two special issues of the International Journal of Computer Vision. He edited the first two volumes of ADVANCES IN COMPUTER VISION for Erlbaum and (with D. Terzopoulos) REAL-TIME COMPUTER VISION for Cambridge University Press. He is the co-editor of VIDERE, the first entirely online refereed computer vision journal (MIT Press). His most recent PhD students have done research in infrared tracking and face recognition, features and strategies for image understanding, augmented reality, and three-dimensional reconstruction algorithms. He supervised the undergraduate team that twice won the AAAI Host Robot competition (and came third in the Robot Rescue competition in 2003).
9.
Xueping Su Jinye Peng Xiaoyi Feng Jun Wu Jianping Fan Li Cui 《Multimedia Tools and Applications》2014,73(3):1643-1661
To automatically mine the underlying relationships between famous persons in daily news (for example, to build a news-person network with faces as icons that facilitates face-based person finding), we need a tool that automatically labels the faces in news images with their real names. This paper studies the problem of linking names with faces in large-scale news images with captions. In our previous work, we proposed a method called Person-based Subset Clustering, which is mainly based on face clustering over all face images associated with the same name. The location where a name appears in a caption, as well as the visual structural information within a news image, provides informative cues about who really appears in the associated image. By combining this domain knowledge from the captions and the corresponding images, we propose a novel cross-modality approach that further improves the performance of linking names with faces. The experiments are performed on a data set of approximately half a million news images from Yahoo! News, and the results show that the proposed method achieves significant improvement over the clustering-only methods.
10.
Nowadays, due to the rapid growth of digital technologies, huge volumes of image data are created and shared on social media sites. User-provided tags attached to each social image are widely recognized as a bridge to fill the semantic gap between low-level image features and high-level concepts. Hence, a combination of images along with their corresponding tags is useful for intelligent retrieval systems, which are designed to gain high-level understanding from images and facilitate semantic search. However, user-provided tags in practice are usually incomplete and noisy, which may degrade the retrieval performance. To tackle this problem, we present a novel retrieval framework that automatically associates the visual content with textual tags and enables effective image search. To this end, we first propose a probabilistic topic model learned on social images to discover latent topics from the co-occurrence of tags and image features. Moreover, our topic model is built by exploiting expert knowledge about the correlation between tags and visual content and about the relationships among image features, formulated in terms of spatial location and color distribution. The discovered topics then help to predict missing tags of an unseen image, as well as of images partially labeled in the database. These predicted tags can greatly facilitate a reliable measure of semantic similarity between the query and database images. Therefore, we further present a scoring scheme that estimates this similarity by fusing textual tags and the visual representation. Extensive experiments conducted on three benchmark datasets show that our topic model provides accurate annotation despite the noise and incompleteness of tags. Using our generalized scoring scheme, which is particularly advantageous for many types of queries, the proposed approach also outperforms state-of-the-art approaches in terms of retrieval accuracy.
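The final scoring step (fusing textual tags with the visual representation) can be sketched as a weighted late fusion of two similarities. Both the weight `beta` and the use of cosine similarity are assumptions; the abstract does not give the actual scoring formula.

```python
import numpy as np

def cosine(a, b):
    # cosine similarity with a small epsilon to guard against zero vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def fused_score(query_tags, db_tags, query_feat, db_feat, beta=0.5):
    # beta weights the tag channel against the visual channel (assumed value);
    # the tag vectors would come from the (predicted) tag distributions, the
    # feature vectors from the low-level visual representation
    return beta * cosine(query_tags, db_tags) + (1.0 - beta) * cosine(query_feat, db_feat)
```

Database images would then be ranked by this fused score against the query.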
11.
In this work, we are interested in technologies that will allow users to actively browse and navigate large image databases and to retrieve images through interactive fast browsing and navigation. The development of a browsing/navigation-based image retrieval system has at least two challenges. The first is that the system's graphical user interface (GUI) should intuitively reflect the distribution of the images in the database in order to provide the users with a mental picture of the database content and a sense of orientation during browsing/navigation. The second is that it has to be fast and responsive, and be able to respond to users' actions at interactive speed in order to engage the users. We have developed a method that attempts to address these challenges. The unique feature of the method is that we take an integrated approach to the design of the browsing/navigation GUI and the indexing and organization of the images in the database. The GUI is tightly coupled with the algorithms that run in the background. The visual cues of the GUI are logically linked with various parts of the repository (image clusters of particular visual themes), thus providing intuitive correspondences between the GUI and the database contents. In the backend, the images are organized into a binary tree data structure using a sequential maximal information coding algorithm, and each image is indexed by an n-bit binary index, thus making the response to users' actions very fast. We present experimental results to demonstrate the usefulness of our method both as a pre-filtering tool and for developing browsing/navigation systems for fast image retrieval from large image databases.
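The n-bit binary indexing idea can be illustrated with a simple stand-in for the paper's sequential maximal information coding: each bit is obtained by thresholding one feature dimension at its collection-wide median, which keeps every bit maximally balanced and hence informative. The per-dimension median split is an assumption, not the paper's actual algorithm.

```python
import numpy as np

def binary_index(features, n_bits):
    # features: n_images x d array with d >= n_bits.
    # Bit k of an image's code is 1 iff feature k exceeds the collection
    # median for that dimension; a median split keeps each bit close to
    # one full bit of entropy across the collection.
    med = np.median(features[:, :n_bits], axis=0)
    bits = (features[:, :n_bits] > med).astype(int)
    return [int("".join(map(str, row)), 2) for row in bits]
```

Hamming distance between such codes can then drive fast candidate pruning at interactive speed, before any finer comparison is attempted.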
12.
13.
14.
Holger Meuss Klaus U. Schulz Felix Weigel Simone Leonardi François Bry 《International Journal on Digital Libraries》2005,5(1):3-17
This article reports on the XML retrieval system x2, which has been developed at the University of Munich over the last five years. In a typical session with x2, the user first browses a structural summary of the XML database in order to select interesting elements and keywords occurring in documents. Using this intermediate result, queries combining structure and textual references are composed semi-automatically. After query evaluation, the full set of answers is presented in a visual and structured way. x2 largely exploits the structure found in documents, queries, and answers to enable new interactive visualization and exploration techniques that support mixed IR- and database-oriented querying, thus bridging the gap between these different views on the data to be retrieved. Another salient characteristic of x2 that distinguishes it from other visual query systems for XML is that it supports various degrees of detail in the presentation of answers, as well as techniques for dynamically reordering, grouping, and ranking retrieved elements once the complete answer set has been computed.
15.
Automatic medical image analysis shows that image segmentation is a crucial task for any practical AI system in this field. On the basis of an evaluation of the existing segmentation methods, a new image segmentation method is presented. To seek a sound solution to knowledge representation in low-level machine vision, a new knowledge representation approach, the "Notebook" approach, is proposed, and the processing of visual knowledge is discussed at all levels. To integrate computer vision theory with Gestalt psychology and knowledge engineering, a new integrated method for intelligent segmentation of sonographs, "generalized-pattern guided segmentation", is proposed. With the methods and techniques mentioned above, a medical diagnosis expert system for sonographs can be built. The work on the preliminary experiments is also introduced.
16.
Reliable estimation of visual perception reflects an understanding of the meaningful structural information in an image. Based on this observation, this paper describes a complete image and video abstraction framework that uses an edge-preserving filter to stylize the salient and non-salient parts of an image or video differently, while maintaining a harmonious transition between the two. First, an automatic salient-object segmentation algorithm is introduced to identify salient regions, based on automatic saliency computation over local spatial neighborhoods. For practical needs, an interactive technique is also provided so that users can purposefully designate the salient structures in an image. Then, according to the generated saliency mask, the salient part is processed with single-scale anisotropic filtering, while the non-salient part is processed with multi-scale filtering of the same kind to achieve a stronger degree of abstraction. The abstracted images and videos produced by this method not only present visually pleasing non-photorealistic effects but, after soft quantization, can also be used to implement other styles of non-photorealistic rendering (NPR). Experimental results on a number of images and videos show that the method achieves the expected results.
17.
This paper describes how to separate the three primary colors of a color image in Visual Basic, so that existing subroutines for black-and-white image processing can be applied to color image processing.
18.
Content-based sensitive-image detection is an effective means of filtering sensitive information on the Internet. However, detection methods based on global features suffer from high false-positive rates, and existing BoW (bag-of-visual-words) based methods are slow. To detect sensitive images quickly and accurately, this paper proposes a detection method based on VAMAI (visual attention model for adult images), consisting of three parts: constructing the VAMAI model for sensitive images, a visual-vocabulary algorithm based on regions of interest and SURF (speeded-up robust features), and global feature selection and its fusion with BoW. First, VAMAI is built by combining a saliency-map model, a skin-color classification model, and a face detection model, and is used to extract regions of interest fairly accurately. Then, a visual vocabulary is constructed from the regions of interest and SURF features, improving both the speed and the accuracy of BoW-based detection. Finally, the performance of several global features is compared; color moments are selected, and their support vector machine classification results are fused with those of BoW in a late-fusion step. Experimental results show that VAMAI detects regions of interest fairly accurately and significantly improves sensitive-image detection in both speed and accuracy.
19.
Visual image retrieval by elastic matching of user sketches (total citations: 17; self-citations: 0; citations by others: 17)
Del Bimbo A. Pala P. 《IEEE transactions on pattern analysis and machine intelligence》1997,19(2):121-132
Effective image retrieval by content from a database requires that visual image properties are used instead of textual labels to properly index and recover pictorial data. Retrieval by shape similarity, given a user-sketched template, is particularly challenging, owing to the difficulty of deriving a similarity measure that closely conforms to the common human perception of similarity. In this paper, we present a technique based on elastic matching of sketched templates over the shapes in images to evaluate similarity ranks. The degree of matching achieved and the elastic deformation energy spent by the sketch to achieve such a match are used to derive a measure of similarity between the sketch and the images in the database and to rank the images to be displayed. The elastic matching is integrated with arrangements that provide scale invariance and take into account spatial relationships between objects in multi-object queries. Examples from a prototype system are expounded, with considerations about the effectiveness of the approach and a comparative performance analysis.
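The idea of scoring a match by the deformation energy spent can be shown with a toy sketch under strong assumptions: contours sampled with index-wise point correspondence, and a penalty only on the non-uniform part of the displacement field, so that a pure translation costs nothing (a crude nod to the invariance arrangements the abstract mentions). This is an illustration of the principle, not the paper's elastic matching.

```python
import numpy as np

def elastic_similarity(template, target, lam=1.0):
    # displacement field from sketch points to image-shape points
    # (index-wise correspondence is assumed)
    disp = target - template
    # deformation energy: only non-uniform displacement is penalised, so a
    # translated but undeformed shape still scores a perfect 1.0
    bending = np.diff(disp, axis=0)
    energy = lam * (bending ** 2).sum(axis=1).mean()
    # map energy to a similarity in (0, 1]
    return float(np.exp(-energy))
```

A real system would also optimize the deformation rather than read it off, and would combine this energy with a goodness-of-fit term, as the abstract describes.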