相似文献 (Similar Documents)
 20 similar documents found (search time: 31 ms)
1.
Managing a large number of digital photos is a challenging task for casual users. Personal photos often lack rich metadata or other associated information, yet available metadata can play a crucial role in managing photos. Labeling the semantic content of photos (i.e., annotating them) increases the amount of metadata and facilitates efficient management. However, manual annotation is tedious and labor intensive, while automatic metadata extraction techniques often generate inaccurate and irrelevant results. This paper describes a semi-automatic annotation strategy that takes advantage of the respective strengths of humans and computers. The semi-automatic approach enables users to efficiently update automatically obtained metadata interactively and incrementally. Even though automatically identified metadata are compromised by recognition errors, correcting inaccurate information can be faster and easier than manually adding new metadata from scratch. We introduce two photo clustering algorithms for generating meaningful photo groups: (1) hierarchical event clustering; and (2) clothing-based person recognition, which assumes that people who wear similar clothing and appear in photos taken on the same day are very likely to be the same person. To explore our semi-automatic strategies, we designed and implemented a prototype called SAPHARI (Semi-Automatic PHoto Annotation and Recognition Interface). The prototype provides an annotation framework focused on making bulk annotations on automatically identified photo groups. It automatically creates photo clusters based on events, people, and file metadata so that users can easily bulk-annotate photos. We performed a series of user studies to investigate the effectiveness and usability of the semi-automatic annotation techniques when applied to personal photo collections.
The results show that users were able to make annotations significantly faster with event clustering using SAPHARI, and that users clearly preferred the semi-automatic approaches.
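The clothing-based person recognition heuristic in the abstract above can be sketched as a simple pairwise test: two detected people are merged only if their photos were taken on the same day and their clothing color histograms are close. The data layout, distance measure, and threshold below are illustrative assumptions, not the paper's actual implementation:

```python
import math

def hist_distance(h1, h2):
    """Euclidean distance between two normalized color histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

def likely_same_person(p1, p2, threshold=0.2):
    """Heuristic from the abstract: two people appearing in photos taken
    on the same day, wearing similarly colored clothing, are likely the
    same person. Each person is a (capture_day, clothing_histogram) pair;
    the threshold value is a hypothetical choice for illustration."""
    day1, hist1 = p1
    day2, hist2 = p2
    if day1 != day2:
        # Clothing similarity is only assumed meaningful within one day.
        return False
    return hist_distance(hist1, hist2) < threshold
```

In practice such a heuristic would only propose merge candidates, which the user then confirms or rejects through the semi-automatic interface.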

2.
With the proliferation of digital cameras and mobile devices, people are taking many more photos than ever before. However, these photos can be redundant in content and varied in quality, so there is a growing need for tools to manage photo collections. One efficient approach is photo collection summarization, which segments the collection into events and then selects a set of representative, high-quality photos (key photos) from those events. Existing photo collection summarization methods, however, mainly consider low-level features for photo representation, such as color and texture, while ignoring other useful features such as high-level semantic features and location. Moreover, they often return fixed summarization results that provide little flexibility. In this paper, we propose a multi-modal and multi-scale photo collection summarization method that leverages multi-modal features, including time, location, and high-level semantic features. We first use a Gaussian mixture model to segment the photo collection into events. With images represented by these multi-modal features, our event segmentation algorithm performs better because the multi-modal features better capture the inhomogeneous structure of events. Next, we propose a novel key photo ranking and selection algorithm that selects representative, high-quality photos from the events; it takes the importance of both events and photos into consideration. Furthermore, our method allows users to control the scale of event segmentation and the number of key photos selected. We evaluate our method through extensive experiments on four photo collections. Experimental results demonstrate that it outperforms previous photo collection summarization methods.
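The Gaussian-mixture event segmentation mentioned above can be illustrated with a minimal sketch: fit a small mixture over photo timestamps with EM, then hard-assign each photo to its most likely component (event). This 1-D, fixed-k toy is an assumption-laden simplification of the paper's multi-modal model (real systems would also use location and semantic features, and choose k):

```python
import math

def fit_gmm_1d(xs, k=2, iters=50):
    """Tiny EM fit of a k-component 1-D Gaussian mixture over photo
    timestamps (e.g., hours since the first photo). Each component is
    interpreted as one event. Initialization and iteration count are
    illustrative choices."""
    xs = list(xs)
    lo, hi = min(xs), max(xs)
    # Crude init: spread means evenly across the data range.
    means = [lo + (i + 0.5) * (hi - lo) / k for i in range(k)]
    vars_ = [max((hi - lo) / (2 * k), 1e-3) ** 2] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            ps = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                  for w, m, v in zip(weights, means, vars_)]
            s = sum(ps) or 1e-12
            resp.append([p / s for p in ps])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-12
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            vars_[j] = max(sum(r[j] * (x - means[j]) ** 2
                               for r, x in zip(resp, xs)) / nj, 1e-6)
            weights[j] = nj / len(xs)
    return means, vars_, weights

def assign_events(xs, means, vars_, weights):
    """Hard-assign each timestamp to its most likely component."""
    labels = []
    for x in xs:
        ps = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
              for w, m, v in zip(weights, means, vars_)]
        labels.append(ps.index(max(ps)))
    return labels
```

For example, six timestamps clustered around hour 0 and hour 24 should split cleanly into two events.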

3.
Due to the large number of photos currently being generated, techniques to organize, search for, and retrieve such images are very important. Photo annotation plays a key role in these mechanisms because it links raw data (photos) to specific information that is essential for humans handling large amounts of content. However, generating photo annotations remains a difficult problem, part of the well-known challenge called the semantic gap. In this paper, a literature review was conducted to investigate the most popular methods for producing photo annotations. Based on the papers surveyed, we identified that People (“Who?”), Location (“Where?”), and Event (“Where? When?”) are the most important features of photo annotation. We also compare similar photo annotation methods, highlighting key aspects of the most commonly used approaches. Moreover, we provide an overview of a general photo annotation process and present the main aspects of photo annotation representation, comprising formats, context of usage, advantages, and disadvantages. Finally, we discuss ways to improve photo annotation methods and present some future research guidelines.

4.
The recent popularity of digital cameras has posed a new problem: how to efficiently store and retrieve the very large number of digital photos captured and chaotically stored in multiple locations without any annotation. This paper proposes an infrastructure, called PhotoGeo, which aims to help users with people annotation, event annotation, and the storage and retrieval of personal digital photos. To achieve this, PhotoGeo uses new algorithms that make it possible to annotate photos with the key metadata that facilitate retrieval: the people shown in the photo (who); where it was captured (where); the date and time of capture (when); and the event that was captured. The paper concludes with a detailed evaluation of these algorithms.

5.
Most photograph management systems use a scrollable view based on a sequential grid layout that arranges photo thumbnails in some default order on the screen. Although users are very accustomed to this kind of layout, multiple drag-and-drop mouse interactions are required to search photos and obtain an overview of them. This paper describes a photo visualization system that lays out hundreds of photos on a 2D grid in order to help users manage them. Our system provides three main functions. First, it places photos that are similar in color histogram and shoot time close together on the grid, so users can find photos using temporal and color-based coherence related to human sensory cues, such as colors that evoke similar feelings and shoot time. Second, it provides a hierarchical clustering method based on the 2D grid, which reduces the drag-and-drop interactions needed to classify photos into small groups compared to the sequential grid layout. Finally, it displays a representative photo from each cluster to provide a summarized view of multiple photos. To evaluate the system, we conducted seven experiments, four computational and three subjective, comparing against a sequential grid layout. The computational evaluations consider four features: space efficiency, temporal stability and color-based consistency between neighboring photos on the grid, and cluster similarity; they establish that our system sacrifices some space efficiency while improving the other features. The three subjective evaluations address the system's ease of use from a human perspective. Most participants in our experiments found the photo visualization system well suited to finding and summarizing their photo content.

6.
Song, Yuqing; Wang, Wei; Zhang, Aidong. World Wide Web, 2003, 6(2): 209–231.
Although a variety of techniques have been developed for content-based image retrieval (CBIR), automatic image retrieval by semantics remains a challenging problem. We propose a novel approach for semantics-based image annotation and retrieval based on the monotonic tree model. The branches of the monotonic tree of an image, termed structural elements, are classified and clustered based on their low-level features such as color, spatial location, coarseness, and shape. Each cluster corresponds to some semantic feature. The category keywords indicating the semantic features are automatically annotated to the images. Based on the semantic features extracted from images, high-level (semantics-based) querying and browsing can be achieved. We apply our scheme to analyze scenery features. Experiments show that semantic features such as sky, building, trees, water wave, placid water, and ground can be effectively retrieved and located in images.

7.
The love of beauty is universal; among East Asians, fair and delicate skin is often considered beautiful. Addressing this common desire, this project applies self-designed color-transfer and image-fusion algorithms to digital photos, transforming them into works of various styles with a skin-whitening and skin-smoothing effect, to satisfy the wishes of young people, especially young women. The other part of the project simulates the effect of charcoal-dust (tanjing) portraiture. Taking tanjing portraiture as the ultimate in skin whitening and smoothing, it aims to transform a portrait photo into a distinctive artwork by simulating tanjing painting techniques.

8.
9.
Unconstrained consumer photos pose a great challenge for content-based image retrieval. Unlike professional or domain-specific images, consumer photos vary significantly: more often than not, the objects in them are ill-posed, occluded, and cluttered, with poor lighting, focus, and exposure. In this paper, we propose a cascading framework for combining intra-image and inter-class similarities in image retrieval, motivated by probabilistic Bayesian principles. Support vector machines are employed to learn local view-based semantics based on just-in-time fusion of color and texture features. A new detection-driven block-based segmentation algorithm is designed to extract semantic features from images. The detection-based indexes also serve as input for support vector learning of image classifiers to generate class-relative indexes. During retrieval, both intra-image and inter-class similarities are combined to rank images. Experiments using query-by-example on 2,400 genuine heterogeneous consumer photos with 16 semantic queries show that the combined matching approach is better than matching with a single index. It also outperformed the method of combining color and texture features by 55% in average precision.

10.
Analyzing personal photo albums to understand the related events is an emerging trend. A reliable event recognition tool could suggest appropriate annotations for pictures, provide context for single-image classification and tagging, achieve automatic selection and summarization, and ease the organization and sharing of media among users. In this paper, a novel method for fast and reliable event-type classification of personal photo albums is presented. Differently from previous approaches, the proposed method does not process photos individually but as a whole, exploiting three main features, namely Saliency, Gist, and Time, to extract an event signature that is characteristic of a specific event type. A highly challenging database containing more than 40,000 photos belonging to 19 diverse event types was crawled from photo-sharing websites for modeling and performance evaluation. Experimental results show that the proposed approach achieves superior classification accuracy with limited computational complexity.

11.
With the growing use of social networking services, various applications have been developed to exploit their vast capabilities. Photomosaic techniques, which combine many images to create a new rendering of an input image, can benefit from the capabilities of social networks. In this study, we propose a method that generates a photomosaic image by considering social network context. Our algorithm creates a photomosaic that incorporates photos posted by other users in the user's network. We enable the matching function to preferentially select photos from the albums of users who are connected to the owner of the input image by computing the closeness of those connections. Moreover, our technique allows photos from the albums of friends who are annotated in the source image to be matched more effectively.
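The socially weighted matching idea above can be sketched as a scoring function that discounts a candidate tile's visual distance by its owner's social closeness to the user. The field names, the linear discount, and the `alpha` parameter are illustrative assumptions, not the paper's actual formula:

```python
import math

def hist_distance(h1, h2):
    """Euclidean distance between two normalized color histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

def pick_tile(target_hist, candidates, alpha=0.5):
    """Pick the candidate photo for one mosaic block. Each candidate is a
    dict with a color histogram and the owner's social closeness in
    [0, 1]; higher closeness discounts the visual distance, so photos
    from closer friends are preferred when visually comparable."""
    best, best_score = None, float("inf")
    for cand in candidates:
        visual = hist_distance(target_hist, cand["hist"])
        score = visual * (1.0 - alpha * cand["closeness"])
        if score < best_score:
            best, best_score = cand, score
    return best
```

With `alpha=0` the function degenerates to purely visual matching, which makes the effect of the social term easy to compare.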

12.
In this paper, we propose a content-based method for the semi-automatic organization of photo albums based on an analysis of how different users organize their own pictures. The goal is to help the user divide his or her pictures into groups characterized by similar semantic content. The method is semi-automatic: the user starts assigning labels to pictures, unlabeled pictures are tagged with proposed labels, and the user can accept each recommendation or make a correction. To formulate the suggestions, the method exploits the knowledge encoded in how other users have partitioned their images. The method is conceptually articulated in two parts: first, we use a suitable feature representation of the images to model the different classes that users have collected; second, we look for correspondences between the criteria used by different users. Boosting is used to integrate the information provided by the analysis of multiple users. A quantitative evaluation of the proposed approach is obtained by simulating the amount of user interaction needed to annotate the albums of a set of members of the flickr® photo-sharing community.

13.
Design and Implementation of a Four-in-One Face Recognition and Query System
This paper describes the design and implementation of the TH four-in-one face recognition and query system. Built around face recognition technology in a client/server architecture, the system builds and maintains a comprehensive database of facial features and document information, so that when an unknown face image is submitted, it can quickly retrieve the identity information of the registered individuals whose faces most closely match the query image. The system applies multiple technologies, including face recognition, cluster computing, and databases, and has broad application prospects in criminal investigation, customs, airports, railway stations, and other fields.

14.
Semantic analysis and retrieval in personal and social photo collections
Semantic understanding of images has long been an important topic in the research community, as it is a prerequisite for building meaningful retrieval systems accessible to both users and automatic reasoning algorithms. Recently, especially with the growing trend of sharing photos online, the social aspect of image retrieval has become more and more prevalent, and image retrieval increasingly focuses on photos and their special characteristics, especially information outside the image itself. Researchers are starting to explore how and why photos are shot, shared, and used, and to incorporate this additional knowledge to aid image analysis and retrieval. Several survey papers have reviewed work in the general field of image analysis and retrieval, but the social aspect of image retrieval and the focus on digital photos have not been sufficiently addressed in them. In this article, we give an overview of the current research field of semantic photo understanding, annotation, and retrieval. We review over 160 contributions in the field and identify trending topics and implications for future directions of research.

15.
The vast amount of images available on the Web calls for an effective and efficient search service to help users find relevant images. The prevalent way is to provide a keyword interface for users to submit queries. However, the number of images without any tags or annotations is beyond the reach of manual effort. To overcome this, automatic image annotation techniques have emerged, which are generally a process of selecting a suitable set of tags for a given image without user intervention. There are three main challenges for Web-scale image annotation: scalability, noise resistance, and diversity. Scalability has a twofold meaning: first, an automatic image annotation system should scale to the billions of images on the Web; second, it should be able to automatically identify several relevant tags for a given image from a huge tag set within seconds or faster. Noise resistance means that the system should be robust against typos and ambiguous terms used in tags. Diversity reflects that image content may include both scenes and objects, which are further described by multiple different image features constituting different facets of annotation. In this paper, we propose a unified framework that tackles these three challenges for automatic Web image annotation. It mainly involves two components: tag candidate retrieval and multi-facet annotation. In the former, content-based indexing and a concept-based codebook are leveraged to address scalability and noise resistance. In the latter, a joint feature map is designed to describe the different facets of tags in annotations and the relations between these facets. A tag graph is adopted to represent the tags in the entire annotation, and structured learning is employed to construct a learning model on top of the tag graph based on the generated joint feature map. Millions of images from Flickr are used in our evaluation. Experimental results show 33% performance improvements over single-facet approaches in terms of three metrics: precision, recall, and F1 score.

16.
Automatic image annotation is an attractive service for users and administrators of online photo-sharing websites. In this paper, we propose an image annotation approach that exploits cross-modal saliency correlation, including visual and textual saliency. For textual saliency, a concept graph is first established based on the associations between labels; then semantic communities and latent textual saliency are detected. For visual saliency, we adopt a dual-layer bag-of-words (DL-BoW) model integrated with the local features and salient regions of the image. Experiments on the MIRFlickr and IAPR TC-12 datasets demonstrate that the proposed method outperforms other state-of-the-art approaches.

17.
Objective: Restoring old photos has significant practical value, but old photos contain multiple unknown and complex degradations. Traditional restoration methods combine different digital image processing techniques and usually produce incoherent or unnatural results. Deep-learning-based restoration methods have been proposed, but most focus on a single degradation or a limited set of degradations. To address these problems, this paper proposes a generative adversarial network that fuses a reference prior with a generative prior to restore old photos. Method: Shallow features extracted from the old photo and a reference image are encoded to obtain deep semantic features and latent codes; the latent codes are further fused into a deep semantic code. The deep semantic code is passed through a generative-prior network to obtain generative-prior features, and it also guides a conditional spatial multi-feature-transform conditional attention block that spatially fuses and transforms the reference semantic features, the generative-prior features, and the features of the photo to be restored. Finally, a decoder network reconstructs the restored image. Results: The method was evaluated quantitatively and qualitatively against six image restoration methods on four metrics, outperforming all of them on every metric: a PSNR (peak signal-to-noise ratio) of 23.69 dB, an SSIM (structural similarity index) of 0.8283, an FID (Fréchet inception distance) of 71.53, and an LPIPS (learned ...

18.
Social relation analysis via images is a new research area that has attracted much interest recently. As social media usage increases, a wide variety of information can be extracted from the growing number of consumer photos shared online, such as the category of events captured or the relationships between individuals in a given picture. Family is one of the most important units in our society, so categorizing family photos is an essential step toward image-based social analysis and content-based retrieval of consumer photos. We propose an approach that combines multiple unique and complementary cues for recognizing family photos. The first cue analyzes the geometric arrangement of people in the photograph, which characterizes scene-level information with efficient yet discriminative capability. The second cue models facial appearance similarities to capture and quantify relevant pairwise relations between individuals in a given photo. The last cue investigates the semantics of the context in which the photo was taken. Experiments on a dataset containing thousands of family and non-family pictures collected from social media indicate that each individual model produces good recognition results. Furthermore, a combined approach incorporating appearance, geometric, and semantic features significantly outperforms the state of the art in this domain, achieving 96.7% classification accuracy.

19.
Users of mobile devices can now easily create large quantities of mobile multimedia documents tracing significant events attended, places visited, or simply moments of their everyday life. However, they face the challenge of organizing these documents to facilitate later search and sharing with other users. We propose using context awareness and semantic technologies to improve and facilitate the organization, annotation, retrieval, and sharing of personal mobile multimedia documents. Our approach combines metadata extracted and enriched automatically from the users' context with annotations provided manually by users and with annotations inferred by applying user-defined rules to context features. These new contextual metadata are integrated into the processes of annotation, sharing, and keyword-based retrieval.

20.
Photo clustering is an effective way to organize albums and is useful in many applications, such as photo browsing and tagging. However, automatic photo clustering is not an easy task due to the large variation in photo content. In this paper, we propose an interactive photo clustering paradigm that jointly exploits human and computer strengths. In this paradigm, the photo clustering task is accomplished semi-automatically: users are allowed to manually adjust clustering results with different operations, such as splitting clusters, merging clusters, and moving photos from one cluster to another. Behind the users' operations, a learning engine keeps updating the distance measurements between photos in an online fashion, so that better clustering can be performed based on the learned distance measure. Experimental results on multiple photo albums demonstrate that our approach improves automatic photo clustering results, and that, by exploiting distance metric learning, it is much more effective than purely manual adjustment.
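The online distance metric update described above can be illustrated with a minimal sketch: a per-feature weighted distance whose weights are nudged after each user operation, so that features on which "same-cluster" photos differ are down-weighted and features separating "different-cluster" photos are up-weighted. The multiplicative exponential rule and learning rate are illustrative assumptions, not the paper's actual learning engine:

```python
import math

def weighted_distance(x, y, w):
    """Per-feature weighted Euclidean distance between two photo
    feature vectors."""
    return math.sqrt(sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y)))

def update_weights(w, x, y, same_cluster, lr=0.1):
    """One online update after a user operation on photos x and y.
    If the user put them in the same cluster, shrink weights on features
    where they differ; if the user separated them, grow those weights.
    Weights are renormalized to sum to 1."""
    new_w = []
    for wi, xi, yi in zip(w, x, y):
        diff = (xi - yi) ** 2
        factor = math.exp(-lr * diff) if same_cluster else math.exp(lr * diff)
        new_w.append(wi * factor)
    total = sum(new_w)
    return [wi / total for wi in new_w]
```

Each merge, split, or move operation thus feeds one pairwise constraint into the metric, and subsequent clustering uses the updated weights.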


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号