Similar literature
1.
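A survey of semantic video retrieval   Total citations: 4, self-citations: 1, citations by others: 4
Video content retrieval is an active research direction in multimedia applications, and most existing content retrieval techniques are based on low-level features. These non-semantic low-level features are hard to interpret and far removed from the high-level semantic concepts of human thinking, which severely limits the usability of video content retrieval systems. The semantic gap between low-level features and high-level semantic concepts is difficult to bridge. How to cross this gap and retrieve video content by semantic concepts is currently the most challenging research direction in content-based video retrieval. This paper introduces the background against which semantic video retrieval emerged, analyzes the causes of the semantic gap, and surveys the main existing methods that attempt to bridge it. It evaluates the strengths and weaknesses of the related techniques and discusses possible future research directions for each method, as well as possible near-term and long-term technical breakthroughs in semantic video retrieval.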

2.
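王昊冉, 白亮, 老松杨. 《计算机科学》 (Computer Science), 2011, 38(6): 266-269, 297
The "semantic gap" between low-level video features and high-level semantics is a technical bottleneck in video analysis and retrieval research. Based on an in-depth analysis, the paper identifies the spatio-temporal correlation properties of the semantic content of soccer video and introduces a graph model to capture these semantic correlations. It proposes a semantic modeling method, the video semantic graph (VSU), and a DFS-based matching algorithm for video semantic graphs, and analyzes the algorithm's complexity. Experimental results show that the method can effectively solve the problems of modeling, analyzing, and matching video semantic content.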

3.
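Research on video retrieval based on high-level semantics   Total citations: 1, self-citations: 0, citations by others: 1
Semantic video retrieval is currently a research hotspot. Most existing video retrieval systems operate on low-level features at a non-semantic level, far removed from the high-level semantic concepts that humans can understand, which seriously degrades the practical effectiveness of video retrieval. How to bridge the gap between low-level features and high-level semantics and retrieve video by high-level semantic concepts is the focus of current research. Through a brief overview of semantic understanding, semantic analysis, and semantic extraction of video content, the paper attempts to construct a semantic video retrieval model.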

4.
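In content-based image retrieval, the "semantic gap" between low-level visual features and high-level semantics has long been a major obstacle to progress. Relevance feedback narrows this gap to some extent. The paper proposes a relevance feedback algorithm based on a fuzzy semantic relevance matrix (FSRM). The algorithm adjusts the weights in the fuzzy semantic relevance matrix according to the user's feedback on retrieval results, thereby capturing the user's retrieval intent. By learning from the data in the matrix it continually revises the semantic matrix, achieving a transition from low-level visual features to high-level semantic features and ultimately improving query accuracy. Experimental results demonstrate the effectiveness of the algorithm.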

5.
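Semantic-based video retrieval must deal with two key issues: the semantic gap between low-level video features and high-level semantic concepts, and an effective semantic extraction model. Through multi-level semantic analysis of video, the paper uses an effective semantic object segmentation method to extract semantic objects from video, treats semantic objects as an intermediate layer, and fuses multimodal video features from images, audio, and text, thereby narrowing the semantic gap. Furthermore, because video semantic concepts are multi-granular and ontologies are well suited to representing concepts and the relations among them, the paper proposes an ontology-based semantic extraction model that infers higher-level composite concepts from the atomic concepts extracted from images, audio, and text. The video semantics extracted with this model have richer semantic levels and granularity, and are therefore closer to the high-level semantic concepts of human thinking.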

6.
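Event detection in video is important for video retrieval and semantic understanding. Trajectories in video not only record how objects move but also reflect the motivation behind the movement, and they are closely tied to the occurrence of events. The paper mainly discusses how to extract events from trajectories. In content-based video event analysis, however, there is a gap between the low-level features extracted from video and high-level semantic features. The paper therefore uses regions of interest labeled with domain knowledge to propose a new semantic trajectory representation, which converts the raw trajectories obtained from video into semantic trajectories. Semantic events in video are described by regular expressions over the relations between objects and regions of interest. An event-rule learning algorithm based on inductive learning shows that regular expressions are easier to learn than traditional well-formed formulas of first-order predicate logic. The learned event rules work well for detecting semantic events in video. Finally, experiments demonstrate the effectiveness of the event detection.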
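The idea of turning a raw trajectory into a semantic trajectory and matching it against a regular expression can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the rectangular regions in `REGIONS`, the one-letter labels, the collapsing of repeated labels, and the rule `ENTRANCE_TO_RESTRICTED`.

```python
import re

# Hypothetical regions of interest: label -> (x_min, y_min, x_max, y_max).
REGIONS = {
    "E": (0, 0, 10, 10),   # entrance
    "C": (10, 0, 30, 10),  # corridor
    "R": (30, 0, 40, 10),  # restricted room
}


def region_of(point):
    """Return the label of the region containing the point, or 'O' if outside all regions."""
    x, y = point
    for label, (x0, y0, x1, y1) in REGIONS.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return label
    return "O"


def semantic_trajectory(points):
    """Map raw (x, y) points to region labels and collapse consecutive repeats."""
    labels = []
    for point in points:
        label = region_of(point)
        if not labels or labels[-1] != label:
            labels.append(label)
    return "".join(labels)


# Hypothetical event rule: the object starts at the entrance and later
# reaches the restricted room, whatever it visits in between.
ENTRANCE_TO_RESTRICTED = re.compile(r"E[^R]*R")


def detects_event(points, rule=ENTRANCE_TO_RESTRICTED):
    """True if the semantic trajectory matches the rule from its first label."""
    return rule.match(semantic_trajectory(points)) is not None


raw = [(2, 5), (6, 5), (12, 5), (25, 5), (33, 5)]
print(semantic_trajectory(raw), detects_event(raw))
```

Encoding the trajectory as a string lets an off-the-shelf regex engine do the event matching. The paper's rule-learning step is not sketched here.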

7.
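An ideal way to organize a video database is to store the feature vectors of semantically related videos with similar features next to each other. Targeting the characteristics of large-scale video databases, the method partitions the database by hierarchical clustering on low-level visual features under semantic supervision. When a cluster contains videos of only one semantic category, an index entry is created for it, and the raw feature data of each cluster is stored contiguously on disk. The probabilistic relationship between low-level features and high-level features is estimated statistically to construct a Bayes classifier. At query time, for the user's query example, the most probable candidate clusters are determined first, and similar video clips are then searched within those clusters. Experimental results show that the method improves both retrieval speed and the semantic sensitivity of retrieval.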

8.
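Design and implementation of a video retrieval system based on semantic concepts   Total citations: 2, self-citations: 0, citations by others: 2
The paper designs and implements a video retrieval system based on semantic concepts. The system consists of three parts: video shot segmentation and key-frame extraction, semantic concept detection, and user retrieval. It uses shot segmentation and key-frame extraction to segment video hierarchically, extracts effective low-level image features from the key frames, detects concepts with support vector machines (SVM), and finally retrieves video by concept content. For concept detection, it proposes a linear weighting method based on validation average precision for late fusion of the SVM classification results. Experimental results show that the method achieves high retrieval accuracy.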
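Late fusion weighted by validation average precision can be sketched as follows. The abstract does not give the exact weighting formula, so normalizing each classifier's validation AP to sum to 1 is an assumption, and the function names `average_precision`, `fusion_weights`, and `fuse` are hypothetical. The per-classifier scores stand in for SVM decision values.

```python
def average_precision(scores, labels):
    """Non-interpolated average precision of a ranking (labels are 0 or 1)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def fusion_weights(val_scores_per_model, val_labels):
    """Assumed scheme: weight each classifier by its validation AP, normalized to sum to 1."""
    aps = [average_precision(scores, val_labels) for scores in val_scores_per_model]
    total = sum(aps)
    if total == 0:
        # Fall back to uniform weights when no classifier ranks any positive.
        return [1.0 / len(aps)] * len(aps)
    return [ap / total for ap in aps]


def fuse(test_scores_per_model, weights):
    """Linear weighted sum of the per-classifier scores for each test sample."""
    n_samples = len(test_scores_per_model[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, test_scores_per_model))
        for i in range(n_samples)
    ]


# Hypothetical validation data: classifier A ranks perfectly, classifier B poorly.
val_labels = [1, 0, 1, 0]
val_a = [0.9, 0.1, 0.8, 0.2]
val_b = [0.1, 0.9, 0.2, 0.8]
weights = fusion_weights([val_a, val_b], val_labels)
fused = fuse([[0.7, 0.3], [0.4, 0.6]], weights)
```

With this scheme a classifier that ranks the validation set better contributes more to the fused score, which is the intuition behind AP-based weighting.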

9.
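In content-based image retrieval, to address the gap between low-level visual features and high-level semantic features of images, the paper proposes a semantic association method based on support vector machines (SVM). By analyzing low-level image features, it extracts color and shape feature vectors (221 dimensions) and uses them as input vectors to the SVM, which learns the image classes and establishes an association between low-level image features and high-level semantics. The method is applied to retrieval over several typical semantic categories such as birds, flowers, oceans, and buildings. Experimental results show that the method adapts to image retrieval by different users and improves retrieval performance.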

10.
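Associating low-level image features with high-level semantics based on SVM   Total citations: 4, self-citations: 0, citations by others: 4
成洁, 石跃祥. 《计算机应用研究》 (Application Research of Computers), 2006, 23(9): 250-252, 255
In content-based image retrieval, to address the gap between low-level visual features and high-level semantic features of images, the paper proposes a semantic association method based on support vector machines (SVM). By analyzing low-level image features, it extracts color and shape feature vectors (221 dimensions) and uses them as input vectors to the SVM, which learns the image classes and establishes an association between low-level image features and high-level semantics. The method is applied to retrieval over several typical semantic categories such as birds, flowers, oceans, and buildings. Experimental results show that the method adapts to image retrieval by different users and improves retrieval performance.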

11.
Current approaches to modeling the structure and semantics of video recordings restrict their reuse. This is because these approaches are either too rigidly structured or too generally structured, and so do not represent the structural and semantic regularities of classes of video recordings. This paper proposes a framework that tackles the problem of reuse by supporting the definition of a wide range of models of video recordings and supporting reuse between them. Examples of the framework's use are presented and examined with respect to different kinds of reuse of video, current research, and the development of a toolset to support the framework.

12.
This paper considers the automated generation of humorous video sequences from arbitrary video material. We present a simplified model of the editing process. We then outline our approach to narrativity and visual humour, discuss the problems of context and shot-order in video and consider influences on the editing process. We describe the role of themes and semantic fields in the generation of content oriented video scenes. We then present the architecture of AUTEUR, an experimental system that embodies mechanisms to interpret, manipulate and generate video. An example of a humorous video sequence generated by AUTEUR is described.

13.
In this paper, we develop a content-based video classification approach to support semantic categorization, high-dimensional indexing and multi-level access. We make four contributions: (a) We first present a hierarchical video database model that captures the structures and semantics of video contents in databases. One advantage of this hierarchical video database model is that it can provide a framework for automatic mapping from high-level concepts to low-level representative features. (b) Second, we propose a set of useful techniques for exploiting the basic units (e.g., shots or objects) to access the videos in the database. (c) Third, we suggest a learning-based semantic classification technique to exploit the structures and semantics of video contents in the database. (d) Finally, we develop a cluster-based indexing structure both to speed up query-by-example and to organize databases for more effective browsing. The applications of the proposed multi-level video database representation and indexing structures for MPEG-7 are also discussed.

14.
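A holistic video matching method   Total citations: 1, self-citations: 0, citations by others: 1
柴登峰, 彭群生. 《软件学报》 (Journal of Software), 2006, 17(9): 1899-1907
The paper presents a holistic method for spatio-temporal video registration. It proposes a spatial registration strategy that combines intra-video matching with inter-video matching, and improves dynamic time warping for alignment along the time dimension. Intra-video matching tracks feature points across the frames of one video and records their trajectories, while inter-video matching registers frames from different videos. Trajectory correspondences supply the initial feature-point correspondences needed for image registration, and the feature-point correspondences obtained from image registration are used to establish and update the trajectory correspondences. This strategy exploits the coherence of video to improve the stability and efficiency of matching, and also improves the coherence of the registered video. The improved dynamic time warping method establishes frame correspondences between videos by minimizing the overall distance between the two videos. It preserves the temporal order of frames within each video and can handle nonlinear offsets.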
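For orientation, the temporal alignment step can be illustrated with textbook dynamic time warping. This is the classic algorithm, not the paper's improved variant, whose details the abstract does not give. Representing each frame by a single number and using absolute difference as the frame distance are simplifying assumptions, and the function name `dtw_align` is hypothetical.

```python
def dtw_align(a, b, dist=lambda x, y: abs(x - y)):
    """Classic DTW on two non-empty sequences.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    that is monotonic in both sequences, so temporal order is preserved.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end, always stepping to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # advance both sequences
            (cost[i - 1][j], i - 1, j),          # advance only a
            (cost[i][j - 1], i, j - 1),          # advance only b
        ]
        _, i, j = min(steps, key=lambda step: step[0])
    path.reverse()
    return cost[n][m], path


# Hypothetical per-frame descriptors; the second video lingers on one frame.
total, path = dtw_align([1, 2, 3], [1, 2, 2, 3])
```

Because each step advances at least one index and never moves backwards, the resulting frame correspondence keeps the frames of both videos in order while allowing one video to run locally faster or slower than the other.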

15.
This paper reports our progress in developing an advanced video-on-demand (VoD) testbed, which will be used to accommodate various multimedia research and applications such as Electronic News on Demand, Columbia's Video Course Network, and Digital Libraries. The testbed supports delivery of MPEG-2 audio/video stored as transport streams over various types of networks, e.g., ATM, Ethernet, and wireless. Both software and hardware video encoders/decoders are used in the testbed. A real-time video pump and a distributed application control protocol (MPEG-2's DSM-CC) have been incorporated. Hardware decoders and set-tops are being used to test wide-area video interoperability. Our VoD testbed also provides an advanced platform for implementing proof-of-concept prototypes of related research. Our current research focus covers video transmission with heterogeneous quality-of-service (QoS) provision, variable bitrate (VBR) traffic modeling, VBR server scheduling, video over Internet, and video transmission over IP-ATM hybrid networks. An important aim is to enhance interoperability. Accommodation of practical multimedia applications and interoperability testing with external VoD systems have also been undertaken.

16.
The recent expansion of broadband Internet access has led to an exponential increase in potential consumers of video on the Web. The huge success of video upload websites shows that the online world, with its virtually unlimited possibilities of active user participation, is an ideal complement to traditional consumption-only media like TV and DVD. It is evident that users are willing to interact with content-providing systems in order to get the content they desire. In parallel to these developments, innovative tools for producing interactive, non-linear audio-visual content are being created. They support the authoring process alongside management of media and metadata, enabling on-demand assembly of videos based on the consumer’s wishes. The quality of such a dynamic video remixing system mainly depends on the expressiveness of associated metadata. Eliminating the need for manual input as far as possible, we aim at designing a system which is able to automatically enrich its own media and metadata repositories continuously. Currently, video content remixing is available on the Web mostly in very basic forms. Most platforms offer upload and simple modification of content. Although several implementations exist, to the best of our knowledge no solution uses metadata to its full extent to dynamically render a video stream based on consumers’ wishes. With the research presented in this paper, we propose a novel approach to interactive video assembly on the Web. In this approach, consumers may describe the desired content using a set of domain-specific parameters. Based on the metadata with which the video clips are annotated, the system chooses clips fitting the user criteria. They are aligned in an aesthetically pleasing manner, while the user can also interactively influence content selection at any time during playback. We use a practical example to clarify the concept and further outline what it takes to implement such a system.

Rene Kaiser graduated in Software Engineering at the FH Hagenberg in 2005. Since 2006, he has been working at JOANNEUM RESEARCH, focussing on various research aspects of multimedia semantics. Rene is especially interested in metadata representation, Semantic Web technologies, and non-linear interactive video production.

Dr. Michael Hausenblas is a senior researcher at JOANNEUM RESEARCH working in the area of multimedia semantics. He has been utilising Web of Data technologies in a couple of national and international projects. Additionally, he has been active in several W3C activities, including the Semantic Web Deployment Working Group and the Video in the Web activity. Michael holds a PhD in Computer Science (Telematics) from Graz University of Technology.

Martin Umgeher is a PhD student at the Technical University of Graz. He is researching in the area of mobile multimedia applications, applying agile development methodologies and focussing on usability aspects. Martin has been active in both national and international multimedia-based projects.

17.
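周渝斌. 《计算机应用》 (Journal of Computer Applications), 2012, 32(11): 3185-3197
To enable fast browsing and retrieval of massive surveillance video, the paper introduces a video synopsis and retrieval method based on object indexing. Building on optical flow analysis, the method updates the background in static regions of the frame and segments moving-object images in moving regions by frame differencing. After optimized fast feature matching and construction of a motion tracking model, objects are clustered by spatio-temporal distance according to their motion trajectories. Object image data and motion parameters are stored in structured XML as an index. At retrieval time, all object images that satisfy the query are pasted frame by frame, in their original temporal order, onto a single background image to form a dynamic synopsis video. Because the method removes a large amount of spatio-temporal redundancy in the background, all useful objects can be browsed within a short playback time, which significantly improves the efficiency of reviewing massive surveillance video.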

18.
Video provides strong cues for automatic road extraction that are not available in static aerial images. In video from a static camera, or stabilized (or geo-referenced) aerial video data, motion patterns within a scene enable function attribution of scene regions. A “road”, for example, may be defined as a path of consistent motion, a definition which is valid in a large and diverse set of environments. The spatio-temporal structure tensor field is an ideal representation of the image derivative distribution at each pixel because it can be updated in real time as video is acquired. An eigen-decomposition of the structure tensor encodes both the local scene motion and the variability in the motion. Additionally, the structure tensor field can be factored into motion components, allowing explicit determination of traffic patterns in intersections. Example results of a real-time system are shown for an urban scene with both well-traveled and infrequently traveled roads, indicating that both can be discovered simultaneously. The method is ideal in urban traffic scenes, which are the most difficult to analyze using static imagery.

19.
VisualGREP: A Systematic Method to Compare and Retrieve Video Sequences   Total citations: 1, self-citations: 0, citations by others: 1
Multimedia Tools and Applications - In this paper, we consider the problem of similarity between video sequences. Three basic questions are raised and (partially) answered. Firstly, at what...

20.
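Multimedia technology is used ever more widely in daily life, and multimedia data such as images, video, and audio are becoming the main media forms in information processing. Video capture is an important step in information processing, and studying it has significant practical value. The paper proposes a remote video capture method based on VFW. The method captures video data with VFW, compresses it with the H.263 coding standard, transmits the real-time video stream over connection-oriented stream sockets, and uses multithreading to play back video files. A remote video capture system is then designed and implemented on the Windows operating system. Experimental results show that the method has low CPU usage, a small memory footprint, and high reliability, and has good practical value.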

