Similar literature
 Found 20 similar documents (search time: 296 ms)
1.
Bai Chen, Fan Tao, Wang Wenjing, Wang Guozhong. Application Research of Computers (《计算机应用研究》), 2023, 40(11): 3276-3281+3288
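To address the problems that traditional video summarization algorithms do not fully exploit the multimodal information in videos and struggle to keep summary segments temporally consistent, this paper proposes MTNet, a video summarization algorithm that fuses multimodal features with temporal-zone detection. First, pretrained GoogLeNet and VGGish models extract feature representations of the video frames and the audio, and a dimension-smoothing operation aligns the two modalities, giving the model a comprehensive representation capability. Second, because the generated summary should be globally representative, single-layer and double-layer self-attention mechanisms combined with residual structures extract long-range temporal features from the visual and audio features respectively, yielding a single vector representation over the temporal range. Finally, separated temporal-zone detection with weight sharing predicts the summary boundaries and importance of each temporal segment of the video, and non-maximum suppression selects the key segments that form the summary. Experiments on two standard datasets, SumMe and TvSum, show that MTNet has stronger representation capability and robustness; its F1 score improves on both datasets over DSNet-AF, an anchor-free video summarization algorithm, and VASNet, a video summarization algorithm based on shot importance prediction.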
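
The final selection step in the abstract above relies on non-maximum suppression over temporal segments. The following is a minimal sketch of greedy one-dimensional NMS, not MTNet's actual implementation: the `Segment` tuple layout, the function names, and the 0.5 IoU threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical layout: (start time, end time, predicted importance score)
Segment = Tuple[float, float, float]


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection-over-union of two 1-D time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def temporal_nms(segments: List[Segment], iou_threshold: float = 0.5) -> List[Segment]:
    """Greedy NMS: visit segments by descending score and keep a segment
    only if it does not overlap an already kept one too strongly."""
    kept: List[Segment] = []
    for seg in sorted(segments, key=lambda s: s[2], reverse=True):
        if all(temporal_iou(seg, k) <= iou_threshold for k in kept):
            kept.append(seg)
    return kept
```

For example, of two candidate segments covering nearly the same seconds of video, only the higher-scoring one survives, while a distant segment is kept regardless of its score.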

2.
A compact summary of video that conveys visual content at various levels of detail enhances user interaction significantly. In this paper, we propose a two-stage framework to generate MPEG-7-compliant hierarchical key frame summaries of video sequences. At the first stage, which is carried out off-line at the time of content production, fuzzy clustering and data pruning methods are applied to given video segments to obtain a nonredundant set of key frames that comprise the finest level of the hierarchical summary. The number of key frames allocated to each shot or segment is determined dynamically and without user supervision through the use of cluster validation techniques. A coarser summary is generated on-demand in the second stage by reducing the number of key frames to match the low-level browsing preferences of a user. The proposed method has been validated by experimental results on a collection of video programs.

3.
In this paper, we introduce the concept of a priority curve associated with a video. We then provide an algorithm that can use the priority curve to create a summary (of a desired length) of any video. The summary thus created exhibits nice continuity properties and also avoids repetition. We have implemented the priority curve algorithm (PriCA) and compared it with other summarization algorithms in the literature with respect to both performance and output quality. The quality of summaries was evaluated by a group of 200 students in Naples, Italy, who watched soccer videos. We show that PriCA is faster than existing algorithms and also produces better quality summaries. We also briefly describe a soccer video summarization system we have built using the PriCA architecture and various (classical) image processing algorithms.

4.
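Video summarization aims to shorten a video while capturing its main content, which can greatly reduce the time people spend browsing videos. A key step in video summarization is evaluating the quality of the generated summary, and most existing methods evaluate it against the entire video. However, evaluating over the entire video sequence is computationally expensive, especially for long videos. Moreover, evaluating the summary over the whole video tends to ignore the temporal relationships inherent in video data, so the generated summary lacks a logical storyline. This paper therefore proposes a video summarization network that focuses on local information, called the self-attention and local reward video summarization network (ALRSN). Specifically, the model uses a self-attention mechanism to predict importance scores for video frames and then generates the summary from those scores. To evaluate the generated summary, a local reward function is further designed that accounts for both the local diversity and the local representativeness of the summary. The function maps the generated summary back to the original video and evaluates it within a local range, so that the summary retains the temporal structure of the original video. By obtaining higher reward scores within local ranges, the model generates more diverse and more representative summaries. Comprehensive experiments show that ALRSN outperforms existing methods on two benchmark datasets, SumMe and TvSum.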

5.
Automatic text summarization is an essential tool in this era of information overload. In this paper we present an automatic extractive Arabic text summarization system where the user can cap the size of the final summary. It is a direct system where no machine learning is involved. We use a two-pass algorithm where in pass one, we produce a primary summary using Rhetorical Structure Theory (RST); this is followed by the second pass where we assign a score to each of the sentences in the primary summary. These scores will help us in generating the final summary. For the final output, sentences are selected with an objective of maximizing the overall score of the summary whose size should not exceed the user-selected limit. We used Rouge to evaluate our system-generated summaries of various lengths against those done by a (human) news editorial professional. Experiments on sample texts show our system to outperform some of the existing Arabic summarization systems including those that require machine learning.
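
Selecting sentences to maximize the total score without exceeding a size limit, as described above, has the shape of a 0/1 knapsack problem. The sketch below solves it with dynamic programming; this is an assumed formulation, since the abstract does not say how the selection is carried out. The `(score, length)` input layout, measuring size in words, and the function name are all hypothetical.

```python
from typing import List, Tuple


def select_sentences(sentences: List[Tuple[float, int]], max_words: int) -> List[int]:
    """Return indices of the sentences that maximize total score while the
    summed length stays within max_words (0/1 knapsack, dynamic programming).

    Each sentence is a hypothetical (score, length_in_words) pair.
    """
    n = len(sentences)
    # best[i][w]: best total score using only the first i sentences within w words
    best = [[0.0] * (max_words + 1) for _ in range(n + 1)]
    for i, (score, length) in enumerate(sentences, start=1):
        for w in range(max_words + 1):
            best[i][w] = best[i - 1][w]  # option 1: skip sentence i-1
            if length <= w and best[i - 1][w - length] + score > best[i][w]:
                best[i][w] = best[i - 1][w - length] + score  # option 2: take it
    # Walk the table backwards to recover which sentences were taken.
    chosen: List[int] = []
    w = max_words
    for i in range(n, 0, -1):
        if best[i][w] != best[i - 1][w]:
            chosen.append(i - 1)
            w -= sentences[i - 1][1]
    return sorted(chosen)
```

Returning the indices in ascending order keeps the selected sentences in their original document order, which matters for the readability of an extractive summary.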

6.
Summaries are an essential component of video retrieval and browsing systems. Most research in video summarization has focused on content analysis to obtain compact yet comprehensive representations of video items. However, important aspects such as how they can be effectively integrated in mobile interfaces and how to predict the quality and usability of the summaries have not been investigated. Conventional summaries are limited to a single instance of a certain length (i.e., a single scale). In contrast, scalable summaries target representations with multiple scales, that is, a set of summaries with increasing length in which longer summaries include more information about the video. Thus, scalability provides high flexibility that can be exploited in devices such as smartphones or tablets to provide versions of the summary adapted to the limited visualization area. In this paper, we explore the application of scalable storyboards to summary adaptation and zoomable video navigation in handheld devices. By introducing a new adaptation dimension related to the summarization scale, we can formulate navigation and adaptation in a two-dimensional adaptation space, where different navigation actions modify the trajectory in that space. We also describe the challenges of evaluating scalable summaries and some usability issues that arise from having multiple scales, proposing some objective metrics that can provide useful insight about their potential quality and usability without requiring very costly user studies. Experimental results show a reasonable agreement with the trends shown in subjective evaluations. Experiments also show that content-based scalable storyboards are less redundant and more useful than the content-blind baselines.

7.
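Mining semantically related video content based on the vector space model   Cited by: 1 (self-citations: 0; citations by others: 1)
Mining and analyzing the semantically related content contained in massive video databases is a difficult problem for video summary generation methods. This paper proposes a vector-space-model-based method for mining semantically related video content: news videos are preprocessed and converted into a dataset in vector form; a topic key-frame extraction algorithm mines the clustered video content, retaining key frames that carry scene-specific information and removing redundant content; and the topic key frames are arranged in their original temporal order to generate the video summary. Experimental results show that news video summaries generated by this algorithm achieve a good compression ratio and content coverage.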

8.
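To address the insufficient ability of current review-text-based recommendation algorithms to extract text features and latent information, this paper proposes a deep learning recommendation algorithm based on an attention mechanism. Review-text representations are built separately for users and items; a bidirectional gated recurrent unit extracts the contextual dependencies of the text to obtain text feature representations; and an attention mechanism is introduced to capture user interest preferences and item attribute features more accurately. The two resulting sets of latent features, from user and item review data, are each fed through a fully connected layer and then merged into a single vector space for rating prediction, which produces the recommendation. Experiments on two public datasets, Yelp and Amazon, show that the proposed algorithm achieves better recommendation performance than the other algorithms compared.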

9.
10.
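Microblog timeline summarization algorithm based on a sliding window   Cited by: 1 (self-citations: 0; citations by others: 1)
Timeline summarization is a technique for condensing text content and generating an overview along the time dimension. Traditional timeline summarization mainly studies long texts such as news, whereas this paper studies timeline summarization of short microblog texts. Because short microblog texts offer limited content features, a summary cannot be generated from text content alone. This paper evaluates timeline summaries with three indicators (content coverage, temporal distribution, and propagation influence) and proposes a microblog timeline summarization algorithm based on a sliding window (MTSW). The algorithm first uses term strength and entropy to determine representative terms; it then builds a composite indicator for evaluating timeline summaries from the three indicators above; finally, it slides a window over the sequence of microblog messages on the time axis to generate the timeline summary. Experiments on a real microblog dataset show that the timeline summaries generated by MTSW effectively reflect how hot events develop and evolve.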

11.
Tang Zekun. Application Research of Computers (《计算机应用研究》), 2020, 37(9): 2615-2619, 2639
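A recommender system builds a binary relation between users and information products, mines the objects each user is interested in from the data generated by user behavior, and recommends them. User-based collaborative filtering has been the mainstream approach in recent years, but it has a limitation: recommendation must consider all users, whereas a single user is usually similar to only a small subset of them. To solve this problem, this paper proposes a collaborative filtering recommendation algorithm based on improved Canopy clustering, which combines the density and distance of user model data with user activity to compute weights for the user data and then clusters the user model data. Because it adopts the idea of Canopy clustering, the same user can belong to different clusters, which matches the fact that a user may be interested in several domains. Finally, recommendations are made for the users in each Canopy, predicting the objects a user may be interested in from the clustering result and user ratings. Comparison with baseline algorithms on the MovieLens and Million Songs datasets using three metrics (MAE, RMSE, and NDCG) verifies that the algorithm significantly improves the accuracy of prediction and recommendation.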
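
The property highlighted above, that one user can fall into several clusters, comes from the classic Canopy procedure with a loose threshold T1 and a tight threshold T2. The sketch below shows only that classic procedure; it does not include the paper's improvements (density, distance, and activity weights). The Euclidean distance, the function names, and the choice of the first remaining point as each center are assumptions.

```python
from typing import Callable, List, Sequence

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Plain Euclidean distance (an assumed, swappable distance measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def canopy_clusters(
    points: List[Point],
    t1: float,
    t2: float,
    distance: Callable[[Point, Point], float] = euclidean,
) -> List[List[int]]:
    """Classic Canopy clustering; returns one list of point indices per canopy.

    Points closer than t1 to a center join its canopy; points closer than t2
    can no longer become centers. Canopies may therefore overlap.
    """
    if not 0 < t2 < t1:
        raise ValueError("thresholds must satisfy 0 < t2 < t1")
    candidates = list(range(len(points)))  # indices still eligible as centers
    canopies: List[List[int]] = []
    while candidates:
        center = points[candidates[0]]
        canopies.append([i for i, p in enumerate(points) if distance(center, p) < t1])
        # The center itself is at distance 0 < t2, so the loop always shrinks.
        candidates = [i for i in candidates if distance(center, points[i]) >= t2]
    return canopies
```

Because membership uses the loose threshold while removal uses the tight one, a point between the two thresholds stays available to later canopies, which is what produces the overlapping clusters.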

12.
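This paper proposes a distributed cluster architecture for video on demand and, according to the characteristics of video-on-demand systems, designs two program replacement algorithms for the video servers, each suited to a different operating phase. During system initialization, an improved LFRU algorithm performs program replacement; once the system reaches a steady state, a minimum weighted periodic frequency replacement algorithm takes over. Comparative experiments show that both replacement algorithms suit the distributed cluster video-on-demand system and achieve high replacement efficiency.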

13.
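To address the high computational complexity of mining opinion leaders in social networks, this paper proposes CR, an opinion leader identification algorithm based on K-core decomposition. First, K-core decomposition is used to obtain a candidate set of opinion leaders in the social network, reducing the amount of data needed to identify opinion leaders. Then, the concept of user similarity, comprising position similarity and neighbor similarity, is proposed; user similarity is computed from the K-core value, the in-degree, the average K-core change rate, and the number of followers, and, according to the user ...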
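
The first step described above, K-core decomposition, can be sketched with the standard peeling procedure: repeatedly remove a vertex of minimum remaining degree. This is a generic illustration, not the CR algorithm itself. It assumes an undirected graph stored as a symmetric adjacency dictionary (real follower networks are directed), and the function name is hypothetical.

```python
from typing import Dict, Hashable, Set


def core_numbers(adj: Dict[Hashable, Set[Hashable]]) -> Dict[Hashable, int]:
    """Compute the K-core value of every vertex by peeling.

    adj is assumed symmetric: if u is in adj[v], then v is in adj[u].
    This simple version is quadratic in the number of vertices; bucket-based
    variants run in linear time.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core: Dict[Hashable, int] = {}
    k = 0
    while remaining:
        # Peel the vertex with the smallest remaining degree.
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])  # core values never decrease during peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core
```

Keeping only vertices whose core value reaches some cutoff would give a smaller candidate set of the kind the abstract describes; the cutoff itself is not stated in the source.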

14.
Li Xuejun, Zhang Kaihua, Song Huihui. Journal of Computer Applications (《计算机应用》), 2017, 37(11): 3134-3138
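Video segmentation is difficult because of the irregular motion of the target, rapidly changing backgrounds, and arbitrary changes and deformation of the target's appearance. To address these difficulties, this paper proposes an unsupervised video segmentation algorithm based on spatiotemporal multi-feature representation, which fuses pixel-level, superpixel-level, and saliency features to design a robust fine-to-coarse feature representation. First, superpixel segmentation is applied to the video sequence to improve computational efficiency, and a graph-cut algorithm is designed for fast solving. Second, optical flow matches information between adjacent frames, and a K-D tree performs nearest-neighbor search to introduce non-local spatiotemporal color features for each superpixel, which strengthens the robustness of the segmentation. Then, a Gaussian mixture model is designed to refine the segmentation computed from superpixels. Finally, image saliency features are introduced and combined with the superpixel and Gaussian mixture model segmentation results through a voting scheme to obtain a more accurate video segmentation. Experimental results show that the proposed algorithm is robust and effective, outperforming most current unsupervised video segmentation algorithms and some semi-supervised ones.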

15.
The important new revenue opportunities that multimedia services offer to network and service providers come with important management challenges. For providers, it is important to control the video quality that is offered and perceived by the user, typically known as the quality of experience (QoE). Both admission control and scalable video coding techniques can control the QoE by blocking connections or adapting the video rate but influence each other's performance. In this article, we propose an in-network video rate adaptation mechanism that enables a provider to define a policy on how the video rate adaptation should be performed to maximize the provider's objective (e.g., a maximization of revenue or QoE). We discuss the need for a close interaction of the video rate adaptation algorithm with a measurement-based admission control system, which makes it possible to orchestrate both algorithms effectively and to switch in a timely manner from video rate adaptation to the blocking of connections. We propose two different rate adaptation decision algorithms that calculate which videos need to be adapted: an optimal one in terms of the provider's policy and a heuristic based on the utility of each connection. Through an extensive performance evaluation, we show the impact of both algorithms on the rate adaptation, network utilisation and the stability of the video rate adaptation. We show that both algorithms outperform other configurations by at least 10%. Moreover, we show that the proposed heuristic is about 500 times faster than the optimal algorithm and experiences a performance drop of only approximately 2%, given the investigated video delivery scenario.

16.
This paper proposes unified video coding algorithms based on a 1D representation of isometric feature mapping (Isomap). First, distance-preserving 1D Isomap representations are generated, which can achieve a very high compression ratio. Next, embedding and reconstruction algorithms for the 1D Isomap representation are presented that can transform samples from a high-dimensional space to a low-dimensional space and vice versa. Then, dictionary learning algorithms for training samples are proposed to compress the input samples. Finally, a unified coding framework for diverse videos based on a 1D Isomap representation is built. The proposed methods make full use of correlations between internal and external videos, which are not considered by classical methods. Simulation experiments have shown that the proposed methods can obtain higher peak signal-to-noise ratios than standard highly efficient video coding at similar bits-per-pixel levels in the low bit rate situation.

17.
Keyframe-based video summarization using Delaunay clustering   Cited by: 1 (self-citations: 0; citations by others: 1)
Recent advances in technology have made tremendous amounts of multimedia information available to the general population. An efficient way of dealing with this new development is to develop browsing tools that distill multimedia data as information-oriented summaries. Such an approach will not only suit resource-poor environments such as wireless and mobile, but also enhance browsing on the wired side for applications like digital libraries and repositories. Automatic summarization and indexing techniques will give users an opportunity to browse and select multimedia documents of their choice for complete viewing later. In this paper, we present a technique by which we can automatically gather the frames of interest in a video for purposes of summarization. Our proposed technique is based on using Delaunay Triangulation for clustering the frames in videos. We represent the frame contents as multi-dimensional point data and use Delaunay Triangulation for clustering them. We propose a novel video summarization technique by using Delaunay clusters that generates good quality summaries with fewer frames and less redundancy when compared to other schemes. In contrast to many of the other clustering techniques, the Delaunay clustering algorithm is fully automatic with no user-specified parameters and is well suited for batch processing. We demonstrate these and other desirable properties of the proposed algorithm by testing it on a collection of videos from the Open Video Project. We provide a meaningful comparison between results of the proposed summarization technique with Open Video storyboard and K-means clustering. We evaluate the results in terms of metrics that measure the content representational value of the proposed technique.

18.
19.
VISON: VIdeo Summarization for ONline applications   Cited by: 1 (self-citations: 0; citations by others: 1)
Recent advances in technology have increased the availability of video data, creating a strong requirement for efficient systems to manage those materials. Making efficient use of video information requires that data be accessed in a user-friendly way. This has been the goal of a quickly evolving research area known as video summarization. Most existing techniques to address the problem of summarizing a video sequence have focused on the uncompressed domain. However, decoding and analyzing a video sequence are two extremely time-consuming tasks. Thus, video summaries are usually produced off-line, penalizing any user interaction. The lack of customization is very critical, as users often have different demands and resources. Since video data are usually available in compressed form, it is desirable to directly process video material without decoding. In this paper, we present VISON, a novel approach for video summarization that works in the compressed domain and allows user interaction. The proposed method is based on both exploiting visual features extracted from the video stream and on using a simple and fast algorithm to summarize the video content. Results from a rigorous empirical comparison with a subjective evaluation show that our technique produces video summaries with high quality relative to the state-of-the-art solutions and in a computational time that makes it suitable for online usage.

20.
