Similar Documents
20 similar documents found (search time: 31 ms)
1.
Structured scenario photos, referring to images that capture important events which usually follow specific routines or structures (such as wedding ceremonies, graduation ceremonies, etc.), account for a significant proportion of personal photo collections. Conventional image analysis techniques that do not consider these event routines/structures are not sufficient to handle such photos. In this paper, we explore an appropriate framework to learn and utilize the specific routines for understanding these structured scenario photos. Specifically, we propose a novel framework that systematically integrates a Hidden Markov Model and a Gaussian Mixture Model to recognize sub-events from structured scenario photos. We then present a comprehensive criterion to select representative images that summarize the whole photo collection. Experimental results on real-world datasets demonstrate the superiority of our framework in both the sub-event recognition and photo summarization tasks.
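A minimal sketch of the HMM-plus-GMM idea, using hmmlearn's GMMHMM as a stand-in; the feature extraction, number of sub-events, and mixture size below are illustrative assumptions, not the paper's actual configuration:

```python
# Sketch: recognize sub-events in time-ordered photo sequences with a
# Gaussian-mixture HMM. Each hidden state stands for one sub-event; each
# observation is a per-photo visual feature vector (assumed precomputed).
import numpy as np
from hmmlearn.hmm import GMMHMM  # pip install hmmlearn

def recognize_sub_events(photo_features, n_sub_events=4, n_mix=3, seed=0):
    """photo_features: list of (n_photos_i, dim) arrays, one per album,
    ordered by capture time. Returns one state sequence per album."""
    X = np.vstack(photo_features)
    lengths = [len(f) for f in photo_features]
    model = GMMHMM(n_components=n_sub_events, n_mix=n_mix,
                   covariance_type="diag", n_iter=50, random_state=seed)
    model.fit(X, lengths)                      # unsupervised EM training
    # Viterbi decoding assigns each photo to a sub-event state.
    return [model.predict(f) for f in photo_features]

# Example with random stand-in features for two albums of 30 photos each.
rng = np.random.default_rng(0)
albums = [rng.normal(size=(30, 16)) for _ in range(2)]
print(recognize_sub_events(albums)[0])
```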

2.
3.
Online photo collections have become truly gigantic. Photo sharing sites such as Flickr (http://www.flickr.com/) host billions of photographs, a large portion of which are contributed by tourists. In this paper, we leverage online photo collections to automatically rank canonical views for tourist attractions. Ideal canonical views for a tourist attraction should both be representative of the site and exhibit a diverse set of views (Kennedy and Naaman, International Conference on World Wide Web 297–306, 2008). In order to meet both goals, we rank canonical views in two stages. During the first stage, we use visual features to encode the content of photographs and infer the popularity of each photograph. During the second stage, we rank photographs using a suppression scheme to keep popular views top-ranked while demoting duplicate views. After a ranking is generated, canonical views at various granularities can be retrieved in real-time, which advances over previous work and is a promising feature for real applications. In order to scale canonical view ranking to gigantic online photo collections, we propose to leverage geo-tags (latitudes/longitudes of the location of the scene in the photographs) to speed up the basic algorithm. We preprocess the photo collection to extract subsets of photographs that are geographically clustered (or geo-clusters), and constrain the expensive visual processing within each geo-cluster. We test the algorithm on two large Flickr data sets of Rome and the Yosemite national park, and show promising results on canonical view ranking. For quantitative analysis, we adopt two medium-sized data sets and conduct a subjective comparison with previous work. The comparison shows that while both algorithms are able to produce canonical views of high quality, our algorithm has the advantage of responding in real-time to canonical view retrieval at various granularities.
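A small sketch of the two-stage rank-then-suppress idea under simplifying assumptions: the per-photo popularity scores and pairwise visual similarities are taken as given, and near-duplicates are demoted by a fixed multiplicative penalty (the threshold and penalty values are made up):

```python
import numpy as np

def rank_canonical_views(popularity, similarity, dup_threshold=0.8, penalty=0.3):
    """popularity: (n,) popularity score per photo (stage one).
    similarity: (n, n) pairwise visual similarity in [0, 1].
    Returns photo indices ordered so popular views stay on top
    while near-duplicates of already-ranked views are demoted."""
    scores = np.asarray(popularity, dtype=float).copy()
    order, remaining = [], set(range(len(scores)))
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        order.append(best)
        remaining.remove(best)
        for j in remaining:                     # suppress duplicates of 'best'
            if similarity[best, j] > dup_threshold:
                scores[j] *= penalty
    return order

# Toy example: photo 1 is a near-duplicate of photo 0, so it drops in rank.
pop = [0.9, 0.85, 0.4, 0.3]
sim = np.array([[1.0, 0.9, 0.1, 0.1],
                [0.9, 1.0, 0.1, 0.1],
                [0.1, 0.1, 1.0, 0.2],
                [0.1, 0.1, 0.2, 1.0]])
print(rank_canonical_views(pop, sim))   # -> [0, 2, 3, 1]
```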

4.
A Multi-Document Summarization Method Based on Semantic Graph Clustering of Event Terms
Event-based extractive summarization methods generally first extract the sentences that describe important events and then reorganize them to generate a summary. In this paper, an event is defined as an event term together with its associated named entities, with a focus on the semantic relations between event terms obtained from external semantic resources. We first build an event-term semantic relation graph from these relations and cluster the event terms with a modified DBSCAN algorithm; we then select a representative event term for each cluster, or a cluster of event terms, to represent a topic of the document set; finally, we extract from the documents the most important sentences that contain the representative terms to generate the summary. Our experimental results show that considering event-term semantic relations in multi-document summarization is both necessary and feasible.
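A minimal sketch of the clustering step, assuming the pairwise semantic similarities between event terms have already been obtained from an external resource; the standard DBSCAN from scikit-learn stands in for the paper's modified variant, and the toy terms and similarity values are made up:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_event_terms(terms, semantic_sim, eps=0.4, min_samples=2):
    """terms: list of event-term strings.
    semantic_sim: (n, n) similarity matrix in [0, 1] from an external
    semantic resource. Returns {cluster_label: [terms]}; label -1 is noise."""
    distance = 1.0 - np.asarray(semantic_sim, dtype=float)
    labels = DBSCAN(eps=eps, min_samples=min_samples,
                    metric="precomputed").fit_predict(distance)
    clusters = {}
    for term, label in zip(terms, labels):
        clusters.setdefault(int(label), []).append(term)
    return clusters

terms = ["attack", "bombing", "rescue", "evacuate", "negotiate"]
sim = np.array([[1.0, 0.8, 0.2, 0.2, 0.1],
                [0.8, 1.0, 0.2, 0.2, 0.1],
                [0.2, 0.2, 1.0, 0.7, 0.3],
                [0.2, 0.2, 0.7, 1.0, 0.3],
                [0.1, 0.1, 0.3, 0.3, 1.0]])
print(cluster_event_terms(terms, sim))
```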

5.
A collage can provide a summary of the collection of photos in an album. In this paper, we introduce a novel approach to constructing photo collages in a hierarchical narrative manner. As opposed to previous methods that focus on spatial coherence in the collage layout, our narrative collage arranges the photos according to the basic narrative elements of literary writing, i.e., character, setting, and plot. Face, time, and place attributes are exploited to embody those narrative elements in the collage. Photos are then organized into a hierarchical structure to provide multi-level detail in the events recorded by the album. Such a hierarchical narrative collage presents a visual overview, in chronological order, of what happened in the album. Experimental results show that our approach offers a better summary for browsing photo album content than previous methods.

6.
Grouping photos of the same event together is extremely useful for the management of personal photo collections. However, most methods cannot be applied to the problem of online event detection in embedded devices because they do not consider hardware constraints or a user's photo-taking behavior. In this paper, we propose an efficient and effective event detection algorithm for managing personal photo collections in camera phones or digital cameras. The proposed algorithm fuses time and location information, which is deemed the most important information for personal photo management, and works in real time in embedded devices. We model event occurrences in a user's photo-taking behavior as a Poisson process by imposing certain constraints on calculating the elapsed time. Location information is incorporated into event detection when confidence in a decision based on the Poisson process is not high enough. The algorithm is user-centric because it provides the unique capability of accepting and adjusting to user feedback. Our experimental results show that the proposed event detection method has the potential to support emerging multimedia applications in embedded devices.
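A compact sketch of the time-then-location decision rule under illustrative assumptions: an exponential inter-photo gap model stands in for the Poisson process, the confidence thresholds are made up, and a haversine distance check serves as the location fallback:

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(p, q):
    """Great-circle distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_new_event(gap_seconds, rate_per_hour, prev_gps=None, cur_gps=None,
                 low=0.05, high=0.5, max_km=1.0):
    """Decide whether the next photo starts a new event.
    Under a Poisson photo-taking process with the given rate, the chance of
    seeing a gap at least this long is exp(-rate * gap). Very small -> new
    event; large -> same event; otherwise fall back to GPS distance."""
    p_gap = math.exp(-rate_per_hour * gap_seconds / 3600.0)
    if p_gap < low:
        return True
    if p_gap > high:
        return False
    if prev_gps is not None and cur_gps is not None:
        return haversine_km(prev_gps, cur_gps) > max_km
    return False  # not confident and no location: keep photos together

# A 1-hour gap at 2 photos/hour is ambiguous; the ~9 km GPS jump decides it.
print(is_new_event(3600, 2.0, prev_gps=(48.8584, 2.2945), cur_gps=(48.9, 2.4)))
```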

7.
Xiong Yu, Zhou Xiangmin, Zhang Yifei, Feng Shi, Wang Daling. Multimedia Tools and Applications, 2019, 78(6): 6409–6440

Summarizing social media effectively and efficiently is crucial and non-trivial for social media analysis. On social streams, events, which are the core concept underlying semantically similar social messages, often bring us a firsthand story of daily news. However, to identify the valuable news, it is almost impossible to plough through millions of multi-modal messages one by one with traditional methods. Thus, it is urgent to summarize events with a few representative data samples on the streams. In this paper, we provide a vivid textual-visual media summarization approach for microblog streams, which exploits incremental latent semantic analysis (LSA) of detected events. Firstly, with a novel weighting scheme for keyword relationships, we can effectively detect and track daily sub-events on a keyword relation graph (WordGraph) of microblog streams. Then, to summarize the stream with representative texts and images, we use cross-modal fusion to analyze the semantics of microblog texts and images incrementally and separately, with a novel incremental cross-modal LSA algorithm. The experimental results on a real microblog dataset show that our method is at least 1.31% better and 23.67% faster than existing state-of-the-art methods, and that cross-modal fusion can improve the summarization performance by 4.16% on average.
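A toy sketch of the LSA building block on the textual side only; a batch TruncatedSVD from scikit-learn stands in for the paper's incremental cross-modal variant, and picking the posts with the strongest loading per latent topic is an assumed selection rule:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

def representative_posts(posts, n_topics=2, per_topic=1):
    """Project microblog posts into an LSA latent space and, for each latent
    topic, return the posts with the strongest loading as summary candidates."""
    tfidf = TfidfVectorizer().fit_transform(posts)        # term-document matrix
    latent = TruncatedSVD(n_components=n_topics, random_state=0).fit_transform(tfidf)
    summary = []
    for k in range(n_topics):
        top = np.argsort(-np.abs(latent[:, k]))[:per_topic]
        summary.extend(posts[i] for i in top)
    return summary

posts = ["earthquake hits the city center",
         "rescue teams reach the earthquake zone",
         "new phone model launched today",
         "long queues for the new phone launch"]
print(representative_posts(posts))
```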


8.
With the proliferation of digital devices, collections of digital photos keep growing. To better satisfy users' need for semantic photo retrieval, researchers in many countries have carried out a great deal of related work, studying the nature of photographs and users' photo-usage habits and applying this knowledge to photo retrieval and management. Analyzing and summarizing these studies, this paper mainly reviews semantic extraction methods in semantics-based digital photo retrieval research, the current state of semantic Web technologies, and their applications in photo management.

9.
Objective: Restoring old photographs has significant practical value, but old photos contain multiple unknown and complex degradations. Traditional restoration methods combine different digital image processing techniques and usually produce incoherent or unnatural results. Deep-learning-based restoration methods have been proposed, but most of them focus on a single or a limited set of degradations. To address these problems, this paper proposes a generative adversarial network that fuses a reference prior and a generative prior to restore old photographs. Method: Shallow features extracted from the old photo and a reference image are encoded to obtain deep semantic features and latent codes; the latent codes are further fused into a deep semantic code, which is passed through a generative prior network to obtain generative prior features. The deep semantic code also guides a conditional spatial multi-feature transformation attention block that spatially fuses and transforms the reference semantic features, the generative prior features, and the features of the photo to be restored; finally, a decoding network reconstructs the restored image. Results: The method was evaluated quantitatively and qualitatively against six image restoration methods on four metrics, outperforming the other algorithms on all of them, with a PSNR (peak signal-to-noise ratio) of 23.69 dB, an SSIM (structural similarity index) of 0.8283, an FID (Fréchet inception distance) of 71.53, and LPIPS (learned ...

10.
Analyzing personal photo albums to understand the related events is an emerging trend. A reliable event recognition tool could suggest appropriate annotation of pictures, provide context for single-image classification and tagging, enable automatic selection and summarization, and ease the organization and sharing of media among users. In this paper, a novel method for fast and reliable event-type classification of personal photo albums is presented. Differently from previous approaches, the proposed method does not process photos individually but as a whole, exploiting three main features, namely Saliency, Gist, and Time, to extract an event signature that is characteristic of a specific event type. A highly challenging database containing more than 40,000 photos belonging to 19 diverse event types was crawled from photo-sharing websites for the purpose of modeling and performance evaluation. Experimental results show that the proposed approach achieves superior classification accuracy with limited computational complexity.
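A rough sketch of the album-as-a-whole idea, with strong simplifications: the per-photo saliency and gist descriptors are assumed to be precomputed, the album signature is just their means plus simple time statistics, and an SVM stands in for the paper's classifier; the paper's actual signature is richer:

```python
import numpy as np
from sklearn.svm import SVC

def album_signature(saliency, gist, timestamps):
    """Pool per-photo descriptors of one album into a single event signature.
    saliency: (n, ds), gist: (n, dg), timestamps: (n,) seconds since epoch."""
    ts = np.sort(np.asarray(timestamps, dtype=float))
    gaps = np.diff(ts) if len(ts) > 1 else np.array([0.0])
    time_stats = [ts[-1] - ts[0], gaps.mean(), gaps.std(), float(len(ts))]
    return np.concatenate([saliency.mean(axis=0), gist.mean(axis=0), time_stats])

# Train an event-type classifier on album signatures (toy random data).
rng = np.random.default_rng(0)
albums = [(rng.normal(size=(20, 8)), rng.normal(size=(20, 16)),
           np.cumsum(rng.exponential(600, size=20))) for _ in range(40)]
X = np.stack([album_signature(*a) for a in albums])
y = rng.integers(0, 3, size=40)                 # 3 stand-in event types
clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict(X[:5]))
```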

11.
Event summarization is the task of generating a single, concise textual representation of an event. This formulation does not consider the multiple development phases within an event. However, news articles related to long and complicated events often involve multiple phases. Thus, traditional approaches to event summarization generally have difficulty in capturing event phases effectively in the summary. In this paper, we define the task of Event Phase Oriented News Summarization (EPONS). In this approach, we assume that a summary contains multiple timelines, each corresponding to an event phase. We model the semantic relations of news articles via a graph model called the Temporal Content Coherence Graph. A structural clustering algorithm, EPCluster, is designed to separate news articles into several groups corresponding to event phases. We apply a vertex-reinforced random walk to rank news articles. The ranking results are further used to create timelines. Extensive experiments conducted on multiple datasets show the effectiveness of our approach.

12.
Managing a large number of digital photos is a challenging task for casual users. Personal photos often do not have rich metadata or additional information associated with them. However, available metadata can play a crucial role in managing photos. Labeling the semantic content of photos (i.e., annotating them) can increase the amount of metadata and facilitate efficient management. However, manual annotation is tedious and labor intensive, while automatic metadata extraction techniques often generate inaccurate and irrelevant results. This paper describes a semi-automatic annotation strategy that takes advantage of both human and computer strengths. The semi-automatic approach enables users to efficiently update automatically obtained metadata interactively and incrementally. Even though automatically identified metadata are compromised by recognition errors, correcting inaccurate information can be faster and easier than manually adding new metadata from scratch. In this paper, we introduce two photo clustering algorithms for generating meaningful photo groups: (1) hierarchical event clustering; and (2) clothing-based person recognition, which assumes that people who wear similar clothing and appear in photos taken on the same day are very likely to be the same person. To explore our semi-automatic strategies, we designed and implemented a prototype called SAPHARI (Semi-Automatic PHoto Annotation and Recognition Interface). The prototype provides an annotation framework that focuses on making bulk annotations on automatically identified photo groups. It automatically creates photo clusters based on events, people, and file metadata so that users can easily bulk-annotate photos. We performed a series of user studies to investigate the effectiveness and usability of the semi-automatic annotation techniques when applied to personal photo collections. The results show that users were able to make annotations significantly faster with event clustering using SAPHARI. We also found that users clearly preferred the semi-automatic approaches.
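A minimal sketch of one of the two clustering ideas, hierarchical event clustering by capture time, under the simple assumption that an event boundary is a time gap larger than a per-level threshold; the thresholds and file names are illustrative, not SAPHARI's:

```python
from datetime import datetime, timedelta

def split_by_gap(photos, max_gap):
    """photos: list of (filename, datetime) sorted by time.
    Start a new group whenever the gap to the previous photo exceeds max_gap."""
    groups = [[photos[0]]]
    for prev, cur in zip(photos, photos[1:]):
        if cur[1] - prev[1] > max_gap:
            groups.append([])
        groups[-1].append(cur)
    return groups

def hierarchical_events(photos, gaps=(timedelta(days=2), timedelta(hours=3))):
    """Two-level hierarchy: coarse events split by a large gap, each refined
    into sub-events with a smaller gap."""
    photos = sorted(photos, key=lambda p: p[1])
    return [split_by_gap(ev, gaps[1]) for ev in split_by_gap(photos, gaps[0])]

photos = [("a.jpg", datetime(2024, 6, 1, 10, 0)),
          ("b.jpg", datetime(2024, 6, 1, 10, 20)),
          ("c.jpg", datetime(2024, 6, 1, 18, 0)),   # same trip, later sub-event
          ("d.jpg", datetime(2024, 6, 9, 9, 0))]    # a different event
for event in hierarchical_events(photos):
    print([[name for name, _ in sub] for sub in event])
```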

13.

Text summarization presents several challenges, such as considering semantic relationships among words and dealing with redundancy and information diversity issues. Seeking to overcome these problems, we propose in this paper a new graph-based Arabic summarization system that combines statistical and semantic analysis. The proposed approach utilizes an ontology's hierarchical structure and relations to provide a more accurate similarity measurement between terms in order to improve the quality of the summary. The proposed method is based on a two-dimensional graph model that makes use of statistical and semantic similarities. The statistical similarity is based on the content overlap between two sentences, while the semantic similarity is computed using semantic information extracted from a lexical database, whose use enables our system to apply reasoning by measuring the semantic distance between real human concepts. The weighted ranking algorithm PageRank is run on the graph to produce a significance score for every document sentence. The score of each sentence is then adjusted by adding other statistical features. In addition, we address redundancy and information diversity issues by using an adapted version of the Maximal Marginal Relevance method. Experimental results on EASC and our own datasets showed the effectiveness of our proposed approach over existing summarization systems.
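A condensed sketch of the rank-then-diversify pipeline, with cosine TF-IDF similarity standing in for the paper's combined statistical/ontology-based similarity; the function names, thresholds, and toy sentences are illustrative:

```python
import numpy as np
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def summarize(sentences, n_select=2, lam=0.7):
    """Rank sentences with PageRank on a similarity graph, then pick a
    diverse subset with Maximal Marginal Relevance (MMR)."""
    sim = cosine_similarity(TfidfVectorizer().fit_transform(sentences))
    np.fill_diagonal(sim, 0.0)
    rank = nx.pagerank(nx.from_numpy_array(sim))      # {sentence index: score}
    chosen, candidates = [], list(range(len(sentences)))
    while candidates and len(chosen) < n_select:
        def mmr(i):
            redundancy = max((sim[i][j] for j in chosen), default=0.0)
            return lam * rank[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        chosen.append(best)
        candidates.remove(best)
    return [sentences[i] for i in sorted(chosen)]

sentences = ["The flood damaged hundreds of homes in the valley.",
             "Hundreds of homes in the valley were damaged by the flood.",
             "Relief agencies started distributing food and water."]
print(summarize(sentences))   # MMR avoids picking both near-duplicate sentences
```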


14.
There is growing evidence that visual saliency can be better modeled using top-down mechanisms that incorporate object semantics. This suggests a new direction for image and video analysis, where semantics extraction can be effectively utilized to improve video summarization, indexing, and retrieval. This paper presents a framework that models semantic contexts for key-frame extraction. The semantic context of video frames is extracted and its sequential changes are monitored so that significant novelties are located using a one-class classifier. Working with wildlife video frames, the framework performs image segmentation, feature extraction, and matching of image blocks, and then constructs a co-occurrence matrix of semantic labels to represent the semantic context within the scene. Experiments show that our approach using high-level semantic modeling achieves better key-frame extraction than its counterparts using low-level features.
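A schematic sketch of the novelty-detection step, assuming each frame has already been reduced to a flattened co-occurrence vector of semantic labels; a scikit-learn OneClassSVM stands in for the paper's one-class classifier, and the window size and nu value are arbitrary:

```python
import numpy as np
from sklearn.svm import OneClassSVM

def select_key_frames(contexts, window=20, nu=0.1):
    """contexts: (n_frames, d) flattened label co-occurrence vectors, in
    temporal order. A frame is a key-frame candidate when its semantic
    context is novel with respect to the preceding window of frames."""
    key_frames = [0]                              # always keep the first frame
    for t in range(window, len(contexts)):
        model = OneClassSVM(nu=nu, gamma="scale").fit(contexts[t - window:t])
        if model.predict(contexts[t:t + 1])[0] == -1:   # -1 means outlier
            key_frames.append(t)
    return key_frames

# Toy sequence: the semantic context shifts abruptly at frame 30, so frames
# around the transition are flagged as novel.
rng = np.random.default_rng(0)
contexts = np.vstack([rng.normal(0, 1, size=(30, 12)),
                      rng.normal(5, 1, size=(20, 12))])
print(select_key_frames(contexts))
```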

15.
Building on research into automatic summarization, this paper targets narrative texts and, taking the event as the basic semantic unit, proposes an event-based automatic summarization method for multi-topic documents. Events and the relations between them are used to build an event-network representation of the text, and a community detection algorithm is used to partition the sub-event topics. Experimental results show that the summaries extracted by this method achieve high precision, recall, and F-measure, and better capture the content of the text.
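A small sketch of the community-detection step over an event network; networkx's modularity-based communities stand in for the paper's partitioning algorithm, and the toy events and edge weights are made up:

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def sub_event_topics(event_relations):
    """event_relations: iterable of (event_a, event_b, weight) edges, where
    events are the basic semantic units and the weight is their relatedness.
    Returns a list of sub-event topics (sets of events) found as communities."""
    graph = nx.Graph()
    graph.add_weighted_edges_from(event_relations)
    return [set(c) for c in greedy_modularity_communities(graph, weight="weight")]

# Toy narrative: a storm/rescue storyline and a separate election storyline.
relations = [("storm hits coast", "homes flooded", 0.9),
             ("homes flooded", "rescue launched", 0.8),
             ("storm hits coast", "rescue launched", 0.6),
             ("candidate speaks", "votes counted", 0.9),
             ("votes counted", "winner declared", 0.8),
             ("rescue launched", "candidate speaks", 0.1)]
print(sub_event_topics(relations))
```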

16.
Event detection is a fundamental information extraction task, which has been explored largely in the context of question answering, topic detection and tracking, knowledge base population, news recommendation, and automatic summarization. In this article, we explore an event detection framework to improve a key phrase-guided centrality-based summarization model. Event detection is based on the fuzzy fingerprint method, which is able to detect all types of events in the ACE 2005 Multilingual Corpus. Our base summarization approach is a two-stage method that starts by extracting a collection of key phrases that will be used to help the centrality-as-relevance retrieval model. We explored three different ways to integrate event information, achieving state-of-the-art results in text and speech corpora: (1) filtering of nonevents, (2) event fingerprints as features, and (3) combination of filtering of nonevents and event fingerprints as features.

17.
18.
We propose a framework for abstractive summarization of multiple documents, which aims to select summary content not from the source document sentences but from a semantic representation of the source documents. In this framework, the contents of the source documents are represented by predicate-argument structures obtained through semantic role labeling. Content selection for the summary is made by ranking the predicate-argument structures based on optimized features, and language generation is used to generate sentences from the predicate-argument structures. Our proposed framework differs from other abstractive summarization approaches in a few aspects. First, it employs semantic role labeling for the semantic representation of text. Secondly, it analyzes the source text semantically by utilizing a semantic similarity measure in order to cluster semantically similar predicate-argument structures across the text; and finally, it ranks the predicate-argument structures based on features weighted by a genetic algorithm (GA). The experiments in this study are carried out using DUC-2002, a standard corpus for text summarization. Results indicate that the proposed approach performs better than other summarization systems.

19.
Microblog data are real-time and dynamic, so analyzing them makes it possible to detect real-world events. At the same time, the massive volume, short texts, and rich social relations of microblog data pose new challenges for event detection. Taking into account the textual features of microblog data (reposts, comments, embedded links, hashtags, named entities, etc.) together with semantic features, temporal characteristics, and social relations, this paper proposes an effective event detection algorithm for microblogs (event detection in microblogs, EDM). It also proposes a method for constructing event summaries by extracting key event elements, namely keywords, named entities, posting time, and users' sentiment polarity. Experimental comparison with an event detection algorithm based on the LDA (latent Dirichlet allocation) model shows that EDM achieves better event detection performance and provides more intuitive and readable event summaries.

20.
Multimedia event detection (MED) is a challenging problem because of the heterogeneous content and variable quality found in large collections of Internet videos. To study the value of multimedia features and fusion for representing and learning events from a set of example video clips, we created SESAME, a system for video SEarch with Speed and Accuracy for Multimedia Events. SESAME includes multiple bag-of-words event classifiers based on single data types: low-level visual, motion, and audio features; high-level semantic visual concepts; and automatic speech recognition. Event detection performance was evaluated for each event classifier. The performance of low-level visual and motion features was improved by the use of difference coding. The accuracy of the visual concepts was nearly as strong as that of the low-level visual features. Experiments with a number of fusion methods for combining the event detection scores from these classifiers revealed that simple fusion methods, such as arithmetic mean, perform as well as or better than other, more complex fusion methods. SESAME’s performance in the 2012 TRECVID MED evaluation was one of the best reported.
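A tiny sketch of the late-fusion finding, averaging normalized detection scores from several single-modality classifiers; the modality names and scores below are made up, and min-max normalization is an assumed preprocessing step:

```python
import numpy as np

def fuse_scores(score_table):
    """score_table: dict of modality -> (n_videos,) raw detection scores.
    Min-max normalize each modality, then fuse by the arithmetic mean,
    the simple rule reported to match more complex fusion methods."""
    normalized = []
    for scores in score_table.values():
        s = np.asarray(scores, dtype=float)
        span = s.max() - s.min()
        normalized.append((s - s.min()) / span if span > 0 else np.zeros_like(s))
    return np.mean(normalized, axis=0)

scores = {
    "visual_bow": [0.2, 2.5, 1.1],    # per-video scores from each classifier
    "motion_bow": [0.1, 1.9, 0.4],
    "audio_bow":  [0.3, 0.9, 0.8],
    "asr_text":   [0.0, 0.7, 0.2],
}
fused = fuse_scores(scores)
print(fused, "-> top-ranked video:", int(np.argmax(fused)))
```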
