Found 20 similar documents; search took 15 ms.
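The rich variety of media content and formats, heterogeneous networks, and diverse terminal devices are major obstacles to Universal Multimedia Access, which makes media adaptation necessary. This paper analyzes the abstract relationships among the entities of MPEG-21 Digital Item Adaptation (DIA), discusses a general framework for digital media adaptation, and builds a constrained optimization model based on mixed variables. The model unifies existing applied research on media adaptation and can be solved with a single consistent algorithm. From the perspective of image understanding and video analysis, the paper also establishes a hierarchical structure for media adaptation, classifies media adaptation applications, and illustrates them with current major applications such as image adaptation, video transcoding, bit-rate adaptation, and video object adaptation. Finally, it explores multimodal adaptation of mixed media and points out future research hotspots and difficulties, such as media semantic extraction and adaptation, subjective user measures, and maximizing the media access experience.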

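Research on News Video Mining Techniques   Total citations: 4, self-citations: 0, citations by others: 4
News video mining is an emerging research area and a typical example of multimedia data mining. This paper discusses news video mining techniques comprehensively and in depth. It first defines news video mining conceptually, proposes a hierarchical framework and a technical framework for it, and points out that news video mining comprises two levels: low-level video mining and high-level video mining. Low-level video mining is the process of analyzing video content with data mining methods, while high-level video mining builds on the low-level results to further discover knowledge in the video. The technical framework analyzes the specific techniques involved in mining. Finally, the paper describes in detail tasks such as structure mining, semantic content mining, video summarization, trend mining, and association mining, and gives concrete examples for each.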

Summaries are an essential component of video retrieval and browsing systems. Most research in video summarization has focused on content analysis to obtain compact yet comprehensive representations of video items. However, important aspects such as how summaries can be effectively integrated into mobile interfaces and how to predict their quality and usability have not been investigated. Conventional summaries are limited to a single instance of a certain length (i.e. a single scale). In contrast, scalable summaries target representations with multiple scales, that is, a set of summaries of increasing length in which longer summaries include more information about the video. Scalability thus provides a flexibility that can be exploited in devices such as smartphones or tablets to provide versions of the summary adapted to the limited visualization area. In this paper, we explore the application of scalable storyboards to summary adaptation and zoomable video navigation in handheld devices. By introducing a new adaptation dimension related to the summarization scale, we can formulate navigation and adaptation in a two-dimensional adaptation space, where different navigation actions modify the trajectory in that space. We also describe the challenges of evaluating scalable summaries and some usability issues that arise from having multiple scales, and we propose objective metrics that can provide useful insight into their potential quality and usability without requiring very costly user studies. Experimental results show reasonable agreement with the trends observed in subjective evaluations. Experiments also show that content-based scalable storyboards are less redundant and more useful than the content-blind baselines.

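Uplink streaming media is showing increasingly important emerging strategic value in civil-military integration. Compressive sensing video streaming offers distinctive advantages for uplink streaming applications, including low front-end power consumption, good error resilience, and applicability to a wide range of signals, and it has become one of the frontiers and hot topics of current visual communication research. Starting from the application characteristics of uplink streaming media, this paper analyzes the basic theory and key techniques of compressive sensing video streaming in terms of performance metrics, parallel block-based computational imaging, low-complexity video coding, video reconstruction, and semantic quality assessment, and it surveys and compares related research progress in China and abroad. Compressive sensing video streaming for uplink streaming media faces technical challenges such as hard-to-control measurement efficiency, difficult bitstream adaptation, and low reconstruction quality. Looking ahead, a division of labor between the front end and an intelligent cloud is expected to enable breakthroughs in key techniques such as high-efficiency video measurement and semantic-quality-guided video reconstruction, further developing the quantitative advantages and evolution paths of compressive sensing video streaming in uplink streaming applications.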

Because video carries highly redundant information, automatically extracting essential video content is one of the key techniques for accessing and managing large video libraries. In this paper, we present a generic framework for a user attention model, which estimates the attention viewers may pay to video content. As human attention is an effective and efficient mechanism for prioritizing and filtering information, the user attention model provides an effective approach to video indexing based on importance ranking. In particular, we define viewer attention through multiple sensory perceptions, i.e. visual and aural stimuli as well as partial semantic understanding. We also propose a set of modeling methods for visual and aural attention. As one important application of the user attention model, a feasible video summarization solution that requires neither full semantic understanding of the video content nor complex heuristic rules is implemented to demonstrate the effectiveness, robustness, and generality of the model. The promising results of a user study on video summarization indicate that the user attention model is an alternative approach to video understanding.

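Based on statistical theory, this paper proposes a general method for multi-granularity semantic analysis of video that unifies multi-level semantic analysis and multimodal information fusion. To represent temporal content, it first proposes a key-frame selection strategy constrained by temporal semantic context, together with an attention selection model. After basic visual semantic recognition, a multi-layer visual semantic analysis framework is used to extract visual semantics. Hidden Markov models (HMMs) and Bayesian decision are then applied for audio semantic understanding. Finally, a bio-inspired multimodal fusion scheme with a two-layer structure fuses the semantic information. Experimental results show that the method can effectively fuse multimodal features and extract video semantics at different granularities.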

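Surveillance video is an important part of security systems. Today, in every industry, anything involving safety relies on surveillance video. However, analysis of surveillance video content still depends mainly on extensive manual work, with huge labor and time costs. As the volume of surveillance video data keeps growing, improving the efficiency of video content analysis and reducing users' cognitive load are important for increasing video utilization. To address the large amount of redundant information in surveillance video and the low efficiency of manually obtaining key video content, this work uses spiral video summaries and corresponding interaction techniques to develop a visual analysis system for surveillance video content. Combining moving object detection results, it exploits the display advantages of the spiral summary to visualize object statistics from multiple perspectives, supplemented by navigation and positioning operations on the spiral summary and by sketch-based interaction, enabling fast and effective access to surveillance video content.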

Despite much work on Universal Multimedia Experience (UME), existing video adaptation approaches cannot yet be considered truly user-centric, mostly due to their poor handling of semantic user preferences. Indeed, these works mainly concentrate on lower-level user preferences and neither consider fine-grained object-level adaptation nor evaluate different adaptation options based on predicted user expectations. Moreover, these works do not provide owners with property rights that enable them to place restrictions on the types of modifications to be made to the video content. To address these shortcomings, we propose the Personalized vIdeo Adaptation Framework (PIAF) for high-level semantic video adaptation. PIAF is a fully integrated framework providing all the requirements for semantic video adaptation. It defines a video annotation model and a user profile model comprising semantic constraints that are delineated in a consistent way, based on the standards MPEG-7 and MPEG-21. At the heart of the framework, the Adaptation Decision Taking Engine (ADTE) computes utility values for different adaptation options, considering each shot separately. The corresponding utility function evaluates the possible choices using multiple parameters that capture different dimensions of a multimedia experience: amount of modified content, modifications to key objects and shots with respect to the semantic integrity of the original content, expected processing cost of the adaptation, and the anticipated visual and temporal quality of the adapted content. Furthermore, the ADTE can deal with intellectual property issues by selecting an adaptation plan of good quality that also satisfies constraints specified by the content owner. This paper places significant emphasis on the theoretical details of the utility function and the computation of the adaptation plan. It also presents the results and evaluation of the adaptation process in both simulation and a user study.
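The per-shot decision step described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Option` fields, the linear form of `utility`, the weights, and the option names are hypothetical and are not taken from PIAF. The sketch only shows the general idea of scoring each adaptation option for a shot and discarding options that the content owner forbids.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Option:
    """One candidate adaptation for a shot (all fields are hypothetical, in 0..1)."""
    name: str
    modified_fraction: float  # share of the content that gets changed
    semantic_loss: float      # damage to key objects/shots
    cost: float               # expected processing cost
    quality: float            # anticipated visual/temporal quality

# Assumed weights for a simple linear utility; the paper's function may differ.
W_QUALITY, W_MODIFIED, W_SEMANTIC, W_COST = 0.4, 0.2, 0.3, 0.1

def utility(o: Option) -> float:
    """Reward quality, penalize modification, semantic loss, and cost."""
    return (W_QUALITY * o.quality
            - W_MODIFIED * o.modified_fraction
            - W_SEMANTIC * o.semantic_loss
            - W_COST * o.cost)

def plan(shots: dict, forbidden: set) -> dict:
    """Pick, for each shot separately, the best option the owner allows."""
    result = {}
    for shot_id, options in shots.items():
        allowed = [o for o in options if o.name not in forbidden]
        result[shot_id] = max(allowed, key=utility).name if allowed else None
    return result

# Hypothetical options for a single shot.
SHOT_OPTIONS = {
    "shot_1": [
        Option("drop_shot", 1.0, 0.8, 0.1, 1.0),
        Option("blur_object", 0.2, 0.3, 0.5, 0.8),
        Option("crop", 0.3, 0.1, 0.2, 0.9),
    ]
}

unconstrained = plan(SHOT_OPTIONS, forbidden=set())
owner_constrained = plan(SHOT_OPTIONS, forbidden={"crop"})
```

With these made-up numbers, forbidding `crop` moves the choice for `shot_1` to the next-best allowed option, which is how an owner constraint would change the resulting plan in this sketch.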

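Video summarization is one of the current research hotspots in multimedia. Methods for generating video summaries fall into two categories: key-frame-based and object-based. This paper briefly introduces key-frame-based methods and focuses on summarizing the object-based video summarization methods proposed over the years. Finally, it summarizes the development of video summarization technology and discusses future prospects.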

An Integrated Framework for Semantic Annotation and Adaptation   Total citations: 1, self-citations: 1, citations by others: 0
Tools for the interpretation of significant events from video and for video clip adaptation can effectively support automatic extraction and distribution of relevant content from video streams. In fact, adaptation can adjust meaningful content, previously detected and extracted, to the user/client capabilities and requirements. The integration of these two functions is increasingly important, due to the growing demand for multimedia data from remote clients with limited resources (PDAs, HCCs, smartphones). In this paper we propose a unified framework for event-based and object-based semantic extraction from video and semantic on-line adaptation. Two application cases, highlight detection and recognition from soccer videos and people behavior detection in domotic* applications, are analyzed and discussed.
*Domotics is a neologism coming from the Latin word domus (home) and informatics.
Marco Bertini has a research grant and carries out his research activity at the Department of Systems and Informatics at the University of Florence, Italy. He received an M.S. in electronic engineering from the University of Florence in 1999, and a Ph.D. in 2004. His main research interest is content-based indexing and retrieval of videos. He is the author of more than 25 papers in international conference proceedings and journals, and is a reviewer for international journals on multimedia and pattern recognition.
Rita Cucchiara (Laurea Ingegneria Elettronica, 1989; Ph.D. in Computer Engineering, University of Bologna, Italy, 1993) is currently Full Professor in Computer Engineering at the University of Modena and Reggio Emilia (Italy). She was formerly Assistant Professor (‘93–‘98) at the University of Ferrara, Italy, and Associate Professor (‘98–‘04) at the University of Modena and Reggio Emilia, Italy. She is currently on the faculty of Computer Engineering, where she is in charge of the courses on Computer Architectures and Computer Vision. Her current interests include pattern recognition, video analysis and computer vision for video surveillance, domotics, medical imaging, and computer architecture for managing image and multimedia data. Rita Cucchiara is author and co-author of more than 100 papers in international journals and conference proceedings. She currently serves as a reviewer for many international journals in computer vision and computer architecture (e.g. IEEE Trans. on PAMI, IEEE Trans. on Circuit and Systems, Trans. on SMC, Trans. on Vehicular Technology, Trans. on Medical Imaging, Image and Vision Computing, Journal of System architecture, IEEE Concurrency). She has served on the scientific committees of leading international conferences in computer vision and multimedia (CVPR, ICME, ICPR, ...) and of symposia, and has organized special tracks on computer architecture for vision and on image processing for traffic control. She is on the editorial board of the Multimedia Tools and Applications journal. She is a member of GIRPR (Italian chapter of Int. Assoc. of Pattern Recognition), AixIA (Ital. Assoc. Of Artificial Intelligence), ACM and IEEE Computer Society.
Alberto Del Bimbo is Full Professor of Computer Engineering at the Università di Firenze, Italy. Since 1998 he has been the Director of the Master in Multimedia of the Università di Firenze. At the present time, he is Deputy Rector of the Università di Firenze, in charge of Research and Innovation Transfer. His scientific interests are Pattern Recognition, Image Databases, Multimedia and Human Computer Interaction. Prof. Del Bimbo is the author of over 170 publications in the most distinguished international journals and conference proceedings. He is the author of the monograph “Visual Information Retrieval”, on content-based retrieval from image and video databases, published by Morgan Kaufmann. He is a Member of IEEE (Institute of Electrical and Electronic Engineers) and a Fellow of IAPR (International Association for Pattern Recognition). He is presently Associate Editor of Pattern Recognition, Journal of Visual Languages and Computing, Multimedia Tools and Applications Journal, Pattern Analysis and Applications, IEEE Transactions on Multimedia, and IEEE Transactions on Pattern Analysis and Machine Intelligence. He was the Guest Editor of several special issues on image databases in highly respected journals.
Andrea Prati (Laurea in Computer Engineering, 1998; PhD in Computer Engineering, University of Modena and Reggio Emilia, 2002) is currently an assistant professor at the University of Modena and Reggio Emilia (Italy), Faculty of Engineering, Dipartimento di Scienze e Metodi dell’Ingegneria, Reggio Emilia. During the last year of his PhD studies, he spent six months as a visiting scholar at the Computer Vision and Robotics Research (CVRR) lab at the University of California, San Diego (UCSD), USA, working on a research project on traffic monitoring and management through computer vision. His research interests are mainly motion detection and analysis, shadow removal techniques, video transcoding and analysis, computer architecture for multimedia and high performance video servers, video surveillance and domotics. He is the author of more than 60 papers in international and national conference proceedings and leading journals, and he serves as a reviewer for many international journals in computer vision and computer architecture. He is a member of IEEE, ACM and IAPR.

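HMM-Based Semantic Analysis of Soccer Video   Total citations: 1, self-citations: 1, citations by others: 0
To address high-level semantic analysis of video, this paper combines domain knowledge of soccer matches with the general conventions of soccer broadcasting and video editing. Given the random nature of semantic events in soccer matches, it selects specific physical video features and applies hidden Markov models (HMMs) to analyze the semantic structure of the video, determines the correspondence between the video and the elements of the HMM, and builds an HMM-based video semantic analysis framework. By training the HMM parameters on soccer video, it obtains an HMM for each semantic event, thereby achieving automatic semantic analysis of video.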
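As a rough illustration of how an HMM can map observed low-level features to hidden semantic states, the sketch below runs standard Viterbi decoding on a toy model. The two states (`play`, `break`), the shot-type observations, and all probabilities are hypothetical values chosen for the example; the paper's actual features, states, and trained parameters are not given in the abstract.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for obs (log-space Viterbi)."""
    # best[t][s]: best log-probability of any state path ending in s at time t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + math.log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace back from the best final state
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

# Hypothetical toy model: two semantic states, three shot-type observations.
STATES = ("play", "break")
START = {"play": 0.6, "break": 0.4}
TRANS = {
    "play": {"play": 0.8, "break": 0.2},
    "break": {"play": 0.3, "break": 0.7},
}
EMIT = {
    "play": {"long": 0.7, "medium": 0.2, "closeup": 0.1},
    "break": {"long": 0.1, "medium": 0.3, "closeup": 0.6},
}

shots = ["long", "long", "closeup", "closeup", "long"]
labels = viterbi(shots, STATES, START, TRANS, EMIT)
```

In a real system the transition and emission tables would come from training (for example with Baum-Welch) rather than being written by hand as they are here.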

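Concepts and Applications of Video Structure Mining   Total citations: 3, self-citations: 0, citations by others: 3
This paper proposes a conceptual framework and a system framework for video structure mining. The conceptual framework gives formal definitions of the concepts related to video structure mining. The video structure mining framework mainly covers basic video structure mining, video syntactic structure mining, and video semantic structure mining. Finally, the paper discusses concrete applications of the structural patterns and knowledge discovered by video structure mining, including guiding video organization and management, enabling content-based personalized video recommendation, and improving video summarization systems.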

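Research on an Integration Framework for an Intelligent Video Retrieval System Based on the J2EE Architecture   Total citations: 2, self-citations: 0, citations by others: 2
This paper proposes an integration framework for an intelligent video retrieval system based on the J2EE platform and implements iVideo, a video retrieval system offering video analysis, content management, and web-based retrieval and browsing. The system describes video data with reference to the MPEG-7 standard, which facilitates management of video content. It improves retrieval accuracy by fusing high-level semantic features with low-level visual features and by using relevance feedback, and it adaptively displays query results according to the terminal device.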

Using Webcast Text for Semantic Event Detection in Broadcast Sports Video   Total citations: 1, self-citations: 0, citations by others: 1
Sports video semantic event detection is essential for sports video summarization and retrieval. Extensive research efforts have been devoted to this area in recent years. However, existing sports video event detection approaches rely heavily either on the video content itself, which makes it difficult to extract high-level semantic information using computer vision and image processing techniques, or on manually generated video ontologies, which are domain specific and difficult to align automatically with the video content. In this paper, we present a novel approach to sports video semantic event detection based on analysis and alignment of webcast text and broadcast video. Webcast text is a text broadcast channel for sports games that is co-produced with the broadcast video and is easily obtained from the web. We first analyze webcast text to cluster and detect text events in an unsupervised way using probabilistic latent semantic analysis (pLSA). Based on the detected text events and video structure analysis, we employ a conditional random field model (CRFM) to align text events and video events by detecting the event moment and event boundary in the video. Incorporating webcast text into sports video analysis significantly facilitates sports video semantic event detection. We conducted experiments on 33 hours of soccer and basketball games for webcast analysis, broadcast video analysis, and text/video semantic alignment. The results are encouraging when compared with the manually labeled ground truth.

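Video summarization is a technique for condensing video content and is essential for quickly grasping what a video contains. How to evaluate the quality of a video summary is a question worth studying. This paper builds a video summary evaluation model based on the analytic hierarchy process (AHP): summary quality is the final evaluation goal, content rationality and structural rationality are the criteria, and content completeness, special importance, overall fluency, and similar measures form the measurement layer, yielding an evaluation index system for video summaries. Finally, the proposed evaluation method is experimentally validated on three summarization algorithms (random generation, generation based on close-up faces, and generation fusing audio and visual features), showing that it can effectively reflect the quality of video summaries.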
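To make the AHP step concrete, the sketch below derives priority weights from a pairwise comparison matrix using the row geometric-mean approximation and checks Saaty's consistency ratio. It is a simplified single-level illustration: the comparison values and the per-measure scores are hypothetical, and the paper's full hierarchy (goal, two criteria, measurement layer) is collapsed here into the three measures named in the abstract.

```python
import math

# Hypothetical pairwise comparisons over three measures from the abstract.
# PAIRWISE[i][j] states how much more important measure i is than measure j.
MEASURES = ["content_completeness", "special_importance", "overall_fluency"]
PAIRWISE = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
# Saaty's random consistency index for small matrix sizes.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}

def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(matrix)
    geo = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo)
    return [g / total for g in geo]

def consistency_ratio(matrix, weights):
    """Estimate lambda_max from A*w and return CI / RI (below 0.1 is the usual threshold)."""
    n = len(matrix)
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]

def summary_score(weights, scores):
    """Weighted sum of per-measure scores (each assumed to lie in 0..1)."""
    return sum(w * s for w, s in zip(weights, scores))

weights = ahp_weights(PAIRWISE)
cr = consistency_ratio(PAIRWISE, weights)
score = summary_score(weights, [0.9, 0.6, 0.7])  # hypothetical measure scores
```

The example matrix is deliberately perfectly consistent (its weights are in the ratio 4:2:1), so the consistency ratio comes out at essentially zero; judgments collected from real evaluators would normally give a small positive value.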

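The key to video information processing is structuring the video information. Besides its basic hierarchical structure, video also has a structural syntax and structural semantics hidden within it. This paper proposes a conceptual framework and a system framework for video structure mining. The conceptual framework gives clear definitions and delimitations of the related concepts. The video structure mining framework mainly comprises basic hierarchical structure mining, structural syntax mining, and structural semantics mining. The paper also discusses concrete applications of video structure patterns and knowledge, including guiding video organization and management, enabling content-based personalized video recommendation, and improving video summarization systems.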

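A General Sports Video Content Analysis Framework Based on Basic Semantic Units   Total citations: 1, self-citations: 0, citations by others: 1
Current research on sports video content analysis concentrates on semantic annotation, with little work on syntactic segmentation of sports video or on frameworks. Based on an analysis of the basic characteristics of sports video, this paper introduces the concept of the Basic Semantic Unit (BSU) in sports video, then proposes a general BSU-based framework for sports video content analysis, and instantiates this general framework using soccer video as an example. Preliminary experimental results show that the BSU-based sports video content analysis framework is effective and feasible.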


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)   ICP filing no. 京ICP备09084417号