Similar Literature
20 similar documents retrieved.
3.
Commonly used video recommendation methods with a linear structure suffer from non-personalized results and low accuracy, so developing high-accuracy personalized video recommendation methods is urgent. This paper proposes a video recommendation method based on autoencoders and multimodal data fusion, which recommends videos using two data modalities: text and vision. Specifically, the method first describes the text data with bag-of-words and TF-IDF, then fuses the resulting features with deep convolutional descriptors extracted from the visual data, so that each video document obtains a multimodal descriptor, and finally uses an autoencoder to construct a low-dimensional sparse representation. Experiments on three real-world datasets show that the proposed method clearly outperforms single-modality recommendation methods and performs better than the baseline methods.
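The abstract includes no code; purely as a hedged sketch of the fusion pipeline it describes (bag-of-words/TF-IDF text features concatenated with deep convolutional visual descriptors, then compressed by an autoencoder), the toy example below uses scikit-learn and PyTorch. The `visual_features` array, all layer sizes, and the training loop are assumptions, and the sparsity constraint mentioned in the abstract is omitted.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy inputs: one text document and one precomputed visual descriptor per video.
docs = ["a cat plays piano", "highlights of a football match", "cooking pasta tutorial"]
visual_features = np.random.rand(len(docs), 2048).astype(np.float32)  # assumed CNN descriptors

# 1) Text modality: bag-of-words weighted by TF-IDF.
tfidf = TfidfVectorizer()
text_features = tfidf.fit_transform(docs).toarray().astype(np.float32)

# 2) Fuse modalities into one multimodal descriptor per video document.
fused = np.hstack([text_features, visual_features])

# 3) Autoencoder mapping the fused descriptor to a low-dimensional code.
class AutoEncoder(nn.Module):
    def __init__(self, in_dim, code_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, code_dim))
        self.decoder = nn.Sequential(nn.Linear(code_dim, 256), nn.ReLU(), nn.Linear(256, in_dim))
    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

x = torch.from_numpy(fused)
model = AutoEncoder(in_dim=x.shape[1])
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):                      # reconstruction training loop
    recon, _ = model(x)
    loss = nn.functional.mse_loss(recon, x)
    opt.zero_grad(); loss.backward(); opt.step()

_, codes = model(x)                        # low-dimensional video representations
# Recommendation can then rank videos by cosine similarity between these codes.
```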

7.
Dominant sets based movie scene detection   (total citations: 1; self-citations: 0; citations by others: 1)
Multimedia indexing and retrieval has become a challenging topic in organizing huge amounts of multimedia data. The problem is not trivial for large visual databases; hence, segmentation into low- and high-level temporal video segments can improve this task. In this paper, we introduce a weighted undirected graph-based movie scene detection approach that detects semantically meaningful temporal video segments. The method is based on finding the dominant scene of the video according to the selected low-level feature: it obtains the most reliable solution first and exploits each solution recursively in the subsequent steps. The dominant movie scene boundary, which has the highest probability of being correct, is determined, and this boundary information is then exploited in the following steps. We consider two partitioning strategies to determine the boundaries of the remaining scenes, one tree-based and one order-based. The proposed dominant-sets-based movie scene detection method is compared with graph-based video scene detection methods from the literature.
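As a hedged illustration of the dominant-set machinery this entry refers to (not the authors' implementation), the sketch below builds a weighted undirected affinity graph over placeholder shot descriptors and extracts the dominant set with the standard replicator-dynamics iteration.

```python
import numpy as np

def dominant_set(affinity, iters=200, tol=1e-8):
    """Extract the dominant set of a weighted undirected graph via replicator dynamics."""
    n = affinity.shape[0]
    x = np.full(n, 1.0 / n)                 # start from the barycenter of the simplex
    for _ in range(iters):
        ax = affinity @ x
        x_new = x * ax / (x @ ax)           # replicator dynamics update
        if np.linalg.norm(x_new - x, 1) < tol:
            x = x_new
            break
        x = x_new
    return np.where(x > 1.0 / n)[0]         # members with above-uniform support

# Placeholder shot descriptors (e.g. colour histograms); 30 shots, 64-dim.
rng = np.random.default_rng(0)
shots = rng.random((30, 64))

# Weighted undirected graph: Gaussian similarity between shot features, no self-loops.
d2 = ((shots[:, None, :] - shots[None, :, :]) ** 2).sum(-1)
A = np.exp(-d2 / d2.mean())
np.fill_diagonal(A, 0.0)

dominant_shots = dominant_set(A)
print("shots in the dominant scene:", dominant_shots)
```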

8.
For a variety of applications such as video surveillance and event annotation, the spatial–temporal boundaries between video objects are required for annotating visual content with high-level semantics. In this paper, we define spatial–temporal sampling as a unified process of extracting video objects and computing their spatial–temporal boundaries using a learnt video object model. We first provide a computational approach for learning an optimal key-object codebook sequence from a set of training video clips to characterize the semantics of the detected video objects. Dynamic programming with the learnt codebook sequence is then used to locate the video objects and their spatial–temporal boundaries in a test video clip. To verify the method, a human action detection and recognition system is constructed. Experimental results show that the proposed method performs well on several publicly available datasets in terms of detection accuracy and recognition rate.
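The abstract does not spell out the dynamic program, so the following is only a speculative sketch of one plausible formulation: a monotonic, DTW-style alignment of a learnt key-object codebook sequence against per-frame descriptors, whose backtrace yields temporal boundaries. The frame descriptors and codebook here are random placeholders.

```python
import numpy as np

def align_codebook(frames, codebook):
    """DTW-style monotonic alignment of a codebook sequence to frame descriptors.
    Returns, for each codeword, the (start, end) frame indices of its segment."""
    T, K = len(frames), len(codebook)
    dist = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    cost = np.full((T, K), np.inf)
    back = np.zeros((T, K), dtype=int)      # 0 = stay on same codeword, 1 = advance
    cost[0, 0] = dist[0, 0]
    for t in range(1, T):
        for k in range(K):
            stay = cost[t - 1, k]
            advance = cost[t - 1, k - 1] if k > 0 else np.inf
            back[t, k] = int(advance < stay)
            cost[t, k] = dist[t, k] + min(stay, advance)
    # Backtrace from the last frame / last codeword to recover segment boundaries.
    k, boundaries = K - 1, []
    for t in range(T - 1, 0, -1):
        if back[t, k] == 1:
            boundaries.append(t)            # frame t is where codeword k starts
            k -= 1
    starts = [0] + sorted(boundaries)
    return [(s, e - 1) for s, e in zip(starts, starts[1:] + [T])]

rng = np.random.default_rng(1)
frames = rng.random((50, 16))               # placeholder per-frame descriptors
codebook = rng.random((4, 16))              # placeholder learnt key-object codebook sequence
print(align_codebook(frames, codebook))
```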

9.
Most semantic video search methods use text-keyword queries or example video clips and images, but such methods have limitations. To address the problems of example-based video search and avoid the use of specialized models, we conduct semantic video search with a reranking method that automatically reorders the initial text search results based on visual cues and associated context. We developed two general reranking methods that explore recurrent visual patterns in many contexts, such as the images or video shots returned by the initial text query and video stories from multiple channels.
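A much-simplified, hedged sketch of the reranking idea (not the authors' exact method): treat the top results of the initial text query as pseudo-positive visual examples and reorder the list by fusing the original text score with visual similarity to that set. The feature arrays, `top_k`, and the fusion weight `alpha` are assumptions.

```python
import numpy as np

def visual_rerank(text_scores, visual_feats, top_k=10, alpha=0.5):
    """Rerank initial text-search results using recurrent visual patterns.
    text_scores : (N,) initial relevance scores from the text query
    visual_feats: (N, D) visual descriptors of the returned shots/images
    """
    order = np.argsort(-text_scores)
    pseudo_pos = visual_feats[order[:top_k]]            # assume top-k are mostly relevant
    # Cosine similarity of every result to the centroid of the pseudo-positive set.
    centroid = pseudo_pos.mean(axis=0)
    sim = visual_feats @ centroid / (
        np.linalg.norm(visual_feats, axis=1) * np.linalg.norm(centroid) + 1e-9)
    # Late fusion of the normalized text score and the visual similarity.
    t = (text_scores - text_scores.min()) / (np.ptp(text_scores) + 1e-9)
    v = (sim - sim.min()) / (np.ptp(sim) + 1e-9)
    fused = alpha * t + (1 - alpha) * v
    return np.argsort(-fused)                            # new ranking (indices)

rng = np.random.default_rng(2)
new_order = visual_rerank(rng.random(100), rng.random((100, 128)))
```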

10.
罗凤玲, 刘雨, 张洪德. 《电视技术》, 2003(4): 15-17, 21
In the context of content-based video retrieval, this paper introduces the visual description part of the MPEG-7 standard, focusing on the various visual descriptors and some of their applications.

11.
We present a geometry-based indexing approach for the retrieval of video databases. It consists of two modules: 3D object shape inference from video data and geometric modeling from the reconstructed shape structure. A motion-based segmentation algorithm employing feature-block tracking and principal component split is used to classify and segment the motion of multiple moving objects. After segmentation, the feature blocks of each object are used to reconstruct its motion and structure through a factorization method. The estimated shape structure and motion parameters are then used to generate an implicit polynomial model of the object. Video data is retrieved using the geometric structure of objects and their spatial relationships, and we generalize the 2D string to 3D to compactly encode those spatial relationships.
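The "factorization method" mentioned above is presumably in the spirit of the Tomasi–Kanade approach; as a hedged sketch of that standard rank-3 factorization only (omitting the paper's motion-based segmentation, metric upgrade, and implicit-polynomial fitting):

```python
import numpy as np

def factorize(W):
    """Tomasi-Kanade-style factorization of tracked feature points.
    W : (2F, P) measurement matrix stacking x- then y-coordinates of P points over F frames.
    Returns motion M (2F, 3) and shape S (3, P) with W_centered ~= M @ S (up to an affine ambiguity).
    """
    W_centered = W - W.mean(axis=1, keepdims=True)   # move the point centroid to the origin
    U, s, Vt = np.linalg.svd(W_centered, full_matrices=False)
    U3, s3, Vt3 = U[:, :3], s[:3], Vt[:3, :]         # rank-3 approximation
    M = U3 * np.sqrt(s3)                              # camera/motion factor
    S = np.sqrt(s3)[:, None] * Vt3                    # 3D shape factor
    return M, S

# Synthetic example: 12 points of a random 3D structure observed in 5 frames.
rng = np.random.default_rng(3)
S_true = rng.random((3, 12))
M_true = rng.random((10, 3))                          # 2 rows per frame
W = M_true @ S_true + rng.normal(scale=1e-3, size=(10, 12))
M, S = factorize(W)
print("reconstruction error:", np.linalg.norm((W - W.mean(1, keepdims=True)) - M @ S))
```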

15.
Salient object detection is essential for applications such as image classification, object recognition, and image retrieval. In this paper, we design a new approach that detects salient objects in an image by describing what salient objects and backgrounds look like using statistics of the image. First, we introduce a saliency-driven clustering method that reveals distinct visual patterns by generating image clusters; a Gaussian Mixture Model (GMM) represents the statistics of each cluster and is used to compute the color spatial distribution. Second, three regional saliency measures, i.e., regional color contrast, regional boundary prior, and regional color spatial distribution, are computed and combined. A region selection strategy integrating the color contrast prior, the boundary prior, and the visual patterns of the image is then presented, and the pixels of the image are adaptively divided into a potential salient region and a background region based on the combined regional saliency measures. Finally, a Bayesian framework computes the saliency value of each pixel, taking the regional saliency values as priors. Our approach has been extensively evaluated on two popular image databases. Experimental results show considerable improvement in terms of the performance measures commonly adopted in salient object detection.
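A full reimplementation is out of scope for an abstract, but the color spatial distribution cue can be approximated roughly as below (a hedged sketch, not the authors' code): fit a GMM to pixel colors and score each component by how spatially compact its pixels are, on the intuition that compact colors are more likely to belong to the salient object.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def color_spatial_distribution_saliency(image, n_components=5):
    """Per-pixel saliency from colour spatial distribution:
    colours whose pixels are spatially compact score higher than widely spread ones."""
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3).astype(np.float64)
    gmm = GaussianMixture(n_components=n_components, random_state=0).fit(pixels)
    resp = gmm.predict_proba(pixels)                    # soft assignment to colour components

    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([xs.ravel() / w, ys.ravel() / h], axis=1)
    saliency_per_comp = np.empty(n_components)
    for k in range(n_components):
        wgt = resp[:, k] / (resp[:, k].sum() + 1e-9)
        mean = (wgt[:, None] * coords).sum(axis=0)
        var = (wgt * ((coords - mean) ** 2).sum(axis=1)).sum()
        saliency_per_comp[k] = 1.0 / (var + 1e-6)       # compact colour -> high saliency
    saliency_per_comp /= saliency_per_comp.max()

    sal = resp @ saliency_per_comp                      # blend component scores per pixel
    return sal.reshape(h, w)

toy = np.random.default_rng(4).random((64, 64, 3)) * 255   # placeholder image
saliency_map = color_spatial_distribution_saliency(toy)
```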

16.
张天, 靳聪, 帖云, 李小兵. 《信号处理》, 2020, 36(6): 966-976
Cross-modal retrieval aims to take data of one modality as the query and return related results in other modalities, and it has become an interesting research problem in multimedia and information retrieval. However, most existing work focuses on tasks such as text-to-image, text-to-video, and lyrics-to-audio retrieval, while research on retrieving suitable music for a given video via cross-modal retrieval is limited. In addition, most existing work on audio–video cross-modal retrieval relies on metadata (e.g., keywords, tags, or descriptions). This paper presents a content-based cross-modal retrieval method for audio and video. It uses a novel two-stream processing network as its framework and learns feature representations of the two modalities in a common subspace with neural networks, so that the similarity between audio and video data can be computed. The main contributions are threefold: 1) an attention mechanism is introduced on top of the original feature extraction models to obtain feature selection models for video and audio and to filter the corresponding feature representations; 2) a sample mining mechanism removes ineffective samples and makes training more efficient; 3) loss functions are designed that both measure inter-modal similarity and preserve intra-modal structure. The proposed model achieves high accuracy on both the VEGAS dataset and a self-built dataset.
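A minimal, hedged sketch of the two-stream common-subspace idea (two projection branches plus a triplet-style inter-modal loss); the attention-based feature selection, sample mining, and intra-modal structure-preserving term described in the abstract are omitted, and all feature dimensions are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Branch(nn.Module):
    """Projects one modality's features into the shared subspace."""
    def __init__(self, in_dim, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(), nn.Linear(512, embed_dim))
    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)   # unit-length embeddings -> cosine similarity

video_branch = Branch(in_dim=2048)    # assumed pooled visual features per video
audio_branch = Branch(in_dim=128)     # assumed pooled audio features per clip
params = list(video_branch.parameters()) + list(audio_branch.parameters())
opt = torch.optim.Adam(params, lr=1e-4)
triplet = nn.TripletMarginLoss(margin=0.2)

# One toy training step on a random batch of matched video/audio pairs.
v_feats, a_feats = torch.randn(32, 2048), torch.randn(32, 128)
v_emb, a_emb = video_branch(v_feats), audio_branch(a_feats)
neg_a = a_emb.roll(1, dims=0)         # naive negatives: shift pairings within the batch
neg_v = v_emb.roll(1, dims=0)
loss = triplet(v_emb, a_emb, neg_a) + triplet(a_emb, v_emb, neg_v)
opt.zero_grad(); loss.backward(); opt.step()

# Retrieval: rank audio clips for a query video by similarity in the shared space.
scores = v_emb[0] @ a_emb.T
```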

17.
To address topic-oriented search, a new topic-oriented retrieval technique is proposed. The problems faced by topic-oriented search are first analyzed, and a solution based on data mining is then proposed: data mining techniques extract multi-level semantic features from text to form a multi-granularity representation, where the extracted features include both single-word and multi-word features. A prototype retrieval system was built, and experiments show that multi-level text features enable effective topic-oriented text retrieval.
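As a hedged stand-in for the data-mining-based feature extraction (which the abstract does not detail), the sketch below builds a multi-level representation from single-word and multi-word features using TF-IDF over unigrams and bigrams and ranks documents by cosine similarity:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "topic oriented retrieval of news articles",
    "image retrieval with deep features",
    "mining multi word phrases for topic detection",
]

# Multi-level representation: unigrams (single-word features) and bigrams (multi-word features).
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
doc_vectors = vectorizer.fit_transform(corpus)

query_vector = vectorizer.transform(["topic oriented retrieval"])
scores = cosine_similarity(query_vector, doc_vectors).ravel()
ranking = scores.argsort()[::-1]
print([(corpus[i], round(float(scores[i]), 3)) for i in ranking])
```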

19.
We previously proposed a new spatio-temporal knowledge structure called 3D C-string to represent symbolic videos, together with string generation and video reconstruction algorithms. In this paper, we extend the idea behind the similarity retrieval of images in 2D C+-strings to 3D C-strings. The extended approach consists of two phases. First, we infer the spatial relation sequence and the temporal relations for each pair of objects in a video. Second, we use the inferred relations to define various types of similarity measures and propose a similarity retrieval algorithm. By providing various types of similarity between videos, the proposed similarity retrieval algorithm can discriminate with respect to different criteria. Finally, experiments are performed to show the efficiency of the proposed approach.
