Similar Articles
20 similar articles found (search time: 15 ms)
1.
Video scene clustering by graph partitioning   (cited: 1; self-citations: 0; by others: 1)
A new video scene clustering method using graph partitioning is proposed. In the method, the shots of a video are grouped into clusters of similar scenes based on shot colour attributes; the number of scene clusters is not required to be known a priori. Experimental results are presented to show the effectiveness of the proposed method.
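The abstract leaves the partitioning step unspecified. As an illustration only, the sketch below clusters shots by recursive spectral bisection of a colour-histogram similarity graph, stopping when a cluster is internally coherent so the number of clusters need not be fixed in advance; the Gaussian similarity, the Fiedler-vector split, and the stopping threshold are assumptions, not the authors' formulation.

```python
# Sketch: recursive spectral bisection of a shot-similarity graph.
# Not the paper's exact algorithm; shot_hists (one colour histogram per shot)
# and stop_thresh are illustrative assumptions.
import numpy as np

def similarity_matrix(shot_hists, sigma=0.2):
    """Gaussian similarity between L1-normalised shot colour histograms."""
    H = np.asarray(shot_hists, dtype=float)
    H /= H.sum(axis=1, keepdims=True) + 1e-12
    d = np.abs(H[:, None, :] - H[None, :, :]).sum(axis=2)   # pairwise L1 distance
    return np.exp(-d / sigma)

def spectral_bisect(W):
    """Split nodes by the sign of the Fiedler vector of the graph Laplacian."""
    D = np.diag(W.sum(axis=1))
    L = D - W
    vals, vecs = np.linalg.eigh(L)
    fiedler = vecs[:, 1]                   # eigenvector of the 2nd smallest eigenvalue
    return fiedler >= 0

def cluster_scenes(shot_hists, stop_thresh=0.9):
    """Recursively bisect until average within-cluster similarity is high enough."""
    W = similarity_matrix(shot_hists)
    clusters, stack = [], [np.arange(len(shot_hists))]
    while stack:
        idx = stack.pop()
        sub = W[np.ix_(idx, idx)]
        if len(idx) <= 2 or sub.mean() >= stop_thresh:
            clusters.append(idx)           # cluster is coherent enough: stop splitting
            continue
        mask = spectral_bisect(sub)
        if mask.all() or (~mask).all():    # degenerate split: keep as one cluster
            clusters.append(idx)
        else:
            stack.append(idx[mask])
            stack.append(idx[~mask])
    return clusters
```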

2.
Dominant sets based movie scene detection   (cited: 1; self-citations: 0; by others: 1)
Multimedia indexing and retrieval have become a challenging topic in organizing the huge amount of available multimedia data. This is not a trivial task for large visual databases; hence, segmentation into low- and high-level temporal video segments may improve the realization of this task. In this paper, we introduce a weighted undirected graph-based movie scene detection approach to detect semantically meaningful temporal video segments. The method is based on the idea of finding the dominant scene of the video according to the selected low-level feature. The proposed method obtains the most reliable solution first and exploits each solution recursively in the subsequent steps. The dominant movie scene boundary, which has the highest probability of being correct, is determined first, and this boundary information is then exploited in the subsequent steps. We use two partitioning strategies to determine the boundaries of the remaining scenes: one is tree-based and the other is order-based. The proposed dominant-sets-based movie scene detection method is compared with the graph-based video scene detection methods presented in the literature.
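Dominant sets are conventionally extracted with replicator dynamics on an affinity matrix, so a generic sketch of that core step may help; the shot affinity matrix A is assumed to come from whichever low-level feature is selected, and the paper's tree-based and order-based boundary strategies are not reproduced.

```python
# Sketch of the generic dominant-set extraction step via replicator dynamics.
# The affinity matrix A (shot-vs-shot similarity from a low-level feature) is assumed;
# the paper's tree-/order-based boundary strategies are not reproduced here.
import numpy as np

def dominant_set(A, n_iter=2000, tol=1e-8):
    """Return the characteristic vector x of one dominant set of affinity matrix A."""
    n = A.shape[0]
    x = np.full(n, 1.0 / n)                 # start from the barycentre of the simplex
    for _ in range(n_iter):
        x_new = x * (A @ x)                 # replicator dynamics: x_i <- x_i (Ax)_i / x'Ax
        s = x_new.sum()
        if s <= 0:
            break
        x_new /= s
        if np.abs(x_new - x).sum() < tol:
            x = x_new
            break
        x = x_new
    return x

def peel_dominant_sets(A, support_eps=1e-4):
    """Repeatedly extract a dominant set and remove its members from the graph."""
    remaining = np.arange(A.shape[0])
    groups = []
    while len(remaining) > 0:
        x = dominant_set(A[np.ix_(remaining, remaining)])
        members = remaining[x > support_eps]
        if len(members) == 0:               # numerical safeguard
            members = remaining
        groups.append(members)
        remaining = np.setdiff1d(remaining, members)
    return groups
```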

3.
To facilitate retrieval and browsing of video at different structural levels, a video sequence can be divided into logical units at several levels; from top to bottom, these are sequence, scene, shot, and frame, where a scene is a set of similar shots with a certain temporal ordering. This paper proposes a scene construction algorithm based on within-class and between-class loss. First, shot distances are computed using a temporal constraint and colour histograms; then similar shots are clustered into shot classes based on the within-class and between-class loss; finally, scenes are constructed by analysing the shot classes. Experiments show that the constructed scenes reflect the video content well.
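As a rough illustration of the first two steps (temporally constrained shot distance, then grouping), the sketch below combines an L1 colour-histogram distance with a temporal penalty and greedily grows shot classes; the weighting form and the greedy grouping stand in for the paper's within-class/between-class loss criterion and are assumptions.

```python
# Sketch of a shot distance combining a colour histogram distance with a temporal
# constraint, followed by a simple greedy grouping. The weighting form and the
# grouping rule are illustrative assumptions, not the paper's exact loss criterion.
import numpy as np

def shot_distance(hist_i, hist_j, t_i, t_j, max_gap=120.0, alpha=1.0):
    """Histogram distance inflated by temporal separation; shots far apart in time
    are discouraged from joining the same scene."""
    h_i = hist_i / (hist_i.sum() + 1e-12)
    h_j = hist_j / (hist_j.sum() + 1e-12)
    color_d = 0.5 * np.abs(h_i - h_j).sum()          # L1 histogram distance in [0, 1]
    temporal_w = 1.0 + alpha * min(abs(t_i - t_j) / max_gap, 1.0)
    return color_d * temporal_w

def group_shots(hists, times, merge_thresh=0.35):
    """Greedy single-link grouping of temporally ordered shots into shot classes."""
    classes = [[0]]
    for j in range(1, len(hists)):
        d = min(shot_distance(hists[i], hists[j], times[i], times[j])
                for i in classes[-1])
        if d < merge_thresh:
            classes[-1].append(j)
        else:
            classes.append([j])
    return classes
```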

4.
Detecting and locating desired information in a large volume of video data by manual inspection is very cumbersome. This necessitates segmenting a long video into shots and finding the boundaries between them. Shot boundary detection, however, struggles to achieve satisfactory performance on video sequences containing flashlights and complex object/camera motion. The proposed method automatically recognises abrupt boundaries between shots in the presence of motion and illumination change. Typically, a scene change detection algorithm incorporates temporal separation into a shot-resemblance metric. In this work, the absolute sum of gradient-orientation feature differences is compared against an automatically generated threshold to sense a cut. An experimental study on the TRECVid 2001 data set and other publicly available data sets confirms the potential of the proposed scheme, which identifies shot boundaries efficiently in complex environments while preserving a good trade-off between recall and precision.
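A minimal sketch of that pipeline (gradient-orientation histograms per frame, absolute-sum feature difference, automatically generated local threshold) is given below; the bin count, window size, and the mean-plus-k-sigma rule are illustrative assumptions rather than the paper's exact settings.

```python
# Sketch: per-frame gradient-orientation histograms, absolute-sum difference between
# consecutive frames, and a locally generated threshold. Bin count, window size and
# k are assumptions, not the paper's settings.
import cv2
import numpy as np

def orientation_hist(gray, bins=16):
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    mag, ang = cv2.cartToPolar(gx, gy, angleInDegrees=True)
    hist, _ = np.histogram(ang, bins=bins, range=(0, 360), weights=mag)
    return hist / (hist.sum() + 1e-12)

def detect_cuts(video_path, bins=16, k=3.0, win=50):
    cap = cv2.VideoCapture(video_path)
    diffs, prev = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        h = orientation_hist(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), bins)
        if prev is not None:
            diffs.append(np.abs(h - prev).sum())   # absolute sum of feature difference
        prev = h
    cap.release()
    diffs = np.array(diffs)
    cuts = []
    for i, d in enumerate(diffs):
        lo, hi = max(0, i - win), min(len(diffs), i + win)
        local = diffs[lo:hi]
        if d > local.mean() + k * local.std():     # automatically generated threshold
            cuts.append(i + 1)                     # cut between frame i and i+1
    return cuts
```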

5.
Video shot segmentation using unsupervised clustering   (cited: 2; self-citations: 1; by others: 2)
金红 《红外与激光工程》2000,29(5):42-46,51
Shot boundary detection is the first problem to be solved in content-based video retrieval. Researchers usually divide shot transitions into abrupt cuts and gradual transitions and apply different detection algorithms according to the characteristics of each transition type. Our study finds that the level of abstraction of video shots is related to the precision with which their boundaries are drawn. We therefore propose an unsupervised clustering algorithm that self-organises and dynamically analyses the video data according to a given similarity scale, performing hierarchical shot segmentation. The algorithm focuses on revealing the hierarchical structure of the video and can meet video abstraction requirements at different levels of precision.

6.
An action scene detection method for martial arts films   (cited: 4; self-citations: 0; by others: 4)
程文刚  柳长安  须德 《电子学报》2006,34(5):915-920
This paper proposes a simple and effective method for detecting action scenes in martial arts films. First, exploiting the tempo characteristics of action scenes at the film level, a shot pace function is defined from shot length and the MPEG-7 motion-activity descriptor to measure tempo; fast-tempo regions are located from this function, giving the approximate positions of action scenes. Then, based on how action scenes develop in content, the shots inside and around each fast-tempo region are analysed at the shot level, and the scene boundary points are determined from visual features. The full use of information at both levels (film and shot) keeps the method simple and easy to apply, and processing in the compressed domain improves speed. Experimental results demonstrate the effectiveness of the method.
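The pace function itself is not spelled out in the abstract; a plausible sketch is motion activity divided by shot length, smoothed over neighbouring shots, with fast-tempo regions taken where the curve exceeds a multiple of its mean. Everything below (the functional form, the smoothing window, the factor 1.5, and the assumption that per-shot MPEG-7 motion activity is already available) is an assumption.

```python
# Sketch of a shot pace function and fast-tempo region localisation.
# The functional form, smoothing and threshold are assumptions; per-shot MPEG-7
# motion activity is assumed available (e.g. from compressed-domain motion vectors).
import numpy as np

def pace_curve(shot_lengths, motion_activity, smooth=5):
    """Higher pace = shorter shots and stronger motion."""
    lengths = np.asarray(shot_lengths, dtype=float)
    activity = np.asarray(motion_activity, dtype=float)
    pace = activity / lengths                       # short, active shots -> high pace
    kernel = np.ones(smooth) / smooth
    return np.convolve(pace, kernel, mode="same")   # smooth over neighbouring shots

def fast_tempo_regions(pace, factor=1.5):
    """Return (start_shot, end_shot) index pairs where pace exceeds factor * mean."""
    mask = pace > factor * pace.mean()
    regions, start = [], None
    for i, m in enumerate(mask):
        if m and start is None:
            start = i
        elif not m and start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        regions.append((start, len(mask) - 1))
    return regions
```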

7.
Research and implementation of an efficient news video unit segmentation method   (cited: 1; self-citations: 1; by others: 0)
An efficient news video unit segmentation method based on anchorperson (talking-head) shot detection is proposed. The method first detects the shot boundaries of the news video; it then extracts key frames from each shot and computes their colour histograms and SIFT features; finally, all anchorperson shots are identified by clustering the key frames, and the news video is split into semantic units accordingly. A software system for news video unit segmentation was developed on this basis. The system segments news units automatically, accurately, and efficiently, markedly reducing the manual effort of video segmentation and meeting the demand for rapid programme production in the new-media era.
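The clustering step can be illustrated as follows: anchorperson shots recur with near-identical framing, so the largest cluster of mutually similar key frames is taken as the anchorperson set. The sketch uses only the colour-histogram cue (the paper also uses SIFT), and the similarity threshold and minimum recurrence count are assumptions.

```python
# Sketch of key-frame clustering to find anchorperson shots. Only the colour
# histogram cue is used here (the paper also uses SIFT); thresholds are assumptions.
import cv2
import numpy as np

def keyframe_hist(frame, bins=(8, 8, 8)):
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    h = cv2.calcHist([hsv], [0, 1, 2], None, list(bins), [0, 180, 0, 256, 0, 256])
    return cv2.normalize(h, h).flatten()

def anchorperson_shots(keyframes, sim_thresh=0.85):
    """Greedy clustering of shot key frames by histogram correlation; the largest
    cluster (if it recurs often enough) is assumed to be the anchorperson."""
    hists = [keyframe_hist(f) for f in keyframes]
    clusters = []                      # each cluster: list of shot indices
    for i, h in enumerate(hists):
        for c in clusters:
            ref = hists[c[0]]
            if cv2.compareHist(ref, h, cv2.HISTCMP_CORREL) > sim_thresh:
                c.append(i)
                break
        else:
            clusters.append([i])
    largest = max(clusters, key=len)
    return largest if len(largest) >= 3 else []    # anchor shots should recur

# News units then start at each anchorperson shot and run to the next one.
```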

8.
Compressed video processing for cut detection   (cited: 4; self-citations: 0; by others: 4)
One of the challenging problems in video databases is the organisation of video information. Segmenting a video into a number of clips and characterising each clip has been suggested as one mechanism for organising video information. This approach requires a suitable method to automatically locate cut points (boundaries between consecutive camera shots in a video). Several existing techniques solve this problem using uncompressed video. Since video is increasingly being captured, moved, and stored in compressed form, there is a need to detect shot boundaries directly in compressed video. The authors address this issue and show certain feature extraction steps for MPEG compressed video that allow most of the existing cut detection methods developed for uncompressed video to be implemented on MPEG video streams. They also examine the performance of three tests for cut detection by viewing cut detection as a statistical hypothesis testing problem. As the experimental results indicate, the statistical hypothesis testing approach permits fast and accurate detection of video cuts.
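To make the hypothesis-testing view concrete, the sketch below treats within-shot dissimilarities as a null distribution and declares a cut when a new dissimilarity is improbably large; the L1 histogram difference of DC images (assumed already extracted from the MPEG stream) and the mean-plus-k-sigma test stand in for the paper's three specific tests.

```python
# Sketch of the hypothesis-testing view of cut detection. DC images are assumed to
# be already extracted from the MPEG stream; the crude one-sided significance test
# below stands in for the paper's specific statistical tests.
import numpy as np

def dc_hist_diffs(dc_images, bins=64):
    """L1 differences between luminance histograms of consecutive DC images."""
    diffs, prev = [], None
    for dc in dc_images:                       # dc: 2-D array of DC coefficients
        h, _ = np.histogram(dc, bins=bins, range=(0, 255), density=True)
        if prev is not None:
            diffs.append(np.abs(h - prev).sum())
        prev = h
    return np.array(diffs)

def declare_cuts(diffs, alpha_sigma=4.0, warmup=10):
    """Reject H0 'same shot' when a difference is alpha_sigma standard deviations
    above the mean of the differences seen so far within the current shot."""
    cuts, within = [], []
    for i, d in enumerate(diffs):
        if len(within) > warmup:
            mu, sd = np.mean(within), np.std(within) + 1e-6
            if d > mu + alpha_sigma * sd:      # improbable under the null: boundary
                cuts.append(i + 1)
                within = []                    # restart the null model in the new shot
                continue
        within.append(d)
    return cuts
```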

9.
A fast and effective video shot boundary detection method   (cited: 2; self-citations: 0; by others: 2)
耿玉亮  须德  冯松鹤 《电子学报》2006,34(12):2272-2277
Building on an analysis of existing shot boundary detection methods, this paper proposes a hierarchical shot boundary detection method. First, exploiting the wavelet transform's ability to detect singular points in a signal and its robustness to noise, candidate shot boundaries are pre-detected. Then, false-alarm analysis of the candidate boundaries effectively reduces the influence of flashlights and fast motion on boundary detection, yielding abrupt-cut detection. For gradual-transition classification, three important boundary types (fade, wipe, and dissolve) are detected separately. Experimental results show that the method detects shot boundary types quickly and effectively and is robust to motion and lighting changes.
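A sketch of the wavelet pre-detection step: large detail coefficients of the frame-difference signal mark singular points taken as candidate boundaries. The wavelet, level, and threshold are assumptions, and the false-alarm analysis for flashes/fast motion and the gradual-transition classification are not reproduced.

```python
# Sketch of wavelet-based pre-detection: singular points of the frame-difference
# signal are flagged by large level-1 detail coefficients. Wavelet choice and
# threshold are illustrative assumptions.
import numpy as np
import pywt

def candidate_boundaries(frame_diffs, wavelet="db2", k=3.0):
    """Return frame indices whose detail coefficient is unusually large."""
    x = np.asarray(frame_diffs, dtype=float)
    _, detail = pywt.dwt(x, wavelet)           # single-level DWT: (approx, detail)
    detail = np.abs(detail)
    thresh = detail.mean() + k * detail.std()
    candidates = []
    for j, c in enumerate(detail):
        if c > thresh:
            candidates.append(min(2 * j + 1, len(x) - 1))  # detail is ~half length
    return candidates
```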

10.
A fuzzy rough set based shot boundary detection method for news video   (cited: 4; self-citations: 1; by others: 3)
韩冰  高新波  姬红兵 《电子学报》2006,34(6):1085-1089
Shot boundary detection is an important step toward content-based video retrieval. To segment video into shots, most existing methods first extract a large number of features and then construct a dissimilarity measure. Too many features, however, reduce efficiency, so feature reduction of the shot boundary detection rules is necessary. This paper combines attribute significance from rough set theory with classification accuracy from fuzzy rough set theory to define a fuzzy-rough operator, constructs a dissimilarity detection function, and finally gives general rules for shot boundary detection. Because the detection scheme is adaptive, it suits various types of news video. Shot boundary detection experiments on more than three hours of news video from CCTV achieved a recall of 95.4% and a precision of 96.1%.

11.
A fuzzy logic based scene change detection method for MPEG compressed video   (cited: 6; self-citations: 1; by others: 5)
金红  周源华 《通信学报》2000,21(7):57-62
Automatic shot boundary detection is an indispensable first step toward content-based video retrieval. Most current scene change detection methods operate on uncompressed video, yet more and more video data exist in compressed form. This paper proposes a new scene change detection algorithm for MPEG compressed video: it uses the DC sequence and motion vectors to compute the pixel difference, histogram difference, statistical difference, and the proportion of macroblocks with "real" motion vectors, and then combines these quantities with fuzzy logic, with the membership functions determined adaptively. Experiments show that this shot detection algorithm achieves a high detection rate and …
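A sketch of the fuzzy fusion step is given below: each of the four compressed-domain measures is mapped to a [0, 1] membership of "shot change" and the memberships are combined. The piecewise-linear membership functions, their breakpoints, and the fixed aggregation weights are assumptions; the paper determines the memberships adaptively.

```python
# Sketch of fuzzy-logic fusion of four compressed-domain measures into a cut
# confidence. Membership breakpoints and aggregation weights are assumptions;
# the paper determines memberships adaptively.
import numpy as np

def membership(x, low, high):
    """Piecewise-linear membership: 0 below `low`, 1 above `high`."""
    return float(np.clip((x - low) / (high - low + 1e-12), 0.0, 1.0))

def cut_confidence(pixel_diff, hist_diff, stat_diff, real_mv_ratio):
    m = [
        membership(pixel_diff, 10.0, 40.0),        # DC-image pixel difference
        membership(hist_diff, 0.2, 0.6),           # DC-image histogram difference
        membership(stat_diff, 0.15, 0.5),          # block mean/variance difference
        1.0 - membership(real_mv_ratio, 0.2, 0.6)  # few "real" motion vectors => cut
    ]
    weights = np.array([0.3, 0.3, 0.2, 0.2])
    return float(weights @ np.array(m))            # defuzzified cut confidence in [0, 1]

# A frame is flagged as a scene change when cut_confidence(...) exceeds, say, 0.5.
```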

12.
毋立芳  赵宽  简萌  王向东 《信号处理》2019,35(11):1871-1879
Key frame detection is a crucial step in effective video content analysis. Commonly used handcrafted-feature methods run efficiently but have difficulty characterising key frames effectively, so their performance is poor; deep-feature methods, with their complex network structures, are not efficient. In sports broadcast video, the key frame is often the last frame before a shot change in the game coverage, but broadcast video also contains many other kinds of shots, such as half-time breaks and gradual transitions, so detecting the last frame alone yields much game-irrelevant content. To address this, this paper proposes a video key frame detection method combining handcrafted and deep features. First, shot boundary detection based on colour histogram features yields the last frame of each shot. Then a clustering-like method based on histogram similarity produces candidate key frames. Finally, a deep neural network classifies the candidates to obtain the true key frames. Comparative experiments on curling and basketball videos show that, compared with traditional methods such as background subtraction and optical flow, the proposed approach extracts key frames quickly and reliably.
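The three stages can be sketched as follows: histogram-based shot boundary detection keeps the last frame of each shot, histogram similarity de-duplicates those frames into candidate key frames, and a deep classifier keeps the game-relevant ones. The thresholds are assumptions and classify_frame is a hypothetical stand-in for the paper's trained network, not a real API.

```python
# Sketch of the three-stage pipeline. Thresholds are assumptions; classify_frame is
# a hypothetical callback standing in for the paper's trained deep network.
import cv2

def frame_hist(frame):
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    h = cv2.calcHist([hsv], [0, 1], None, [16, 16], [0, 180, 0, 256])
    return cv2.normalize(h, h).flatten()

def last_frames_of_shots(video_path, cut_thresh=0.6):
    cap, prev_hist, prev_frame, out = cv2.VideoCapture(video_path), None, None, []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        h = frame_hist(frame)
        if prev_hist is not None and \
           cv2.compareHist(prev_hist, h, cv2.HISTCMP_BHATTACHARYYA) > cut_thresh:
            out.append(prev_frame)              # last frame before the shot change
        prev_hist, prev_frame = h, frame
    cap.release()
    if prev_frame is not None:
        out.append(prev_frame)                  # last frame of the final shot
    return out

def keyframes(video_path, classify_frame, dup_thresh=0.9):
    candidates, kept_hists, result = last_frames_of_shots(video_path), [], []
    for f in candidates:
        h = frame_hist(f)
        if any(cv2.compareHist(k, h, cv2.HISTCMP_CORREL) > dup_thresh for k in kept_hists):
            continue                            # clustering-like de-duplication
        kept_hists.append(h)
        if classify_frame(f):                   # deep network: is this a game key frame?
            result.append(f)
    return result
```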

13.
A video summarisation method based on mutual information is proposed. The method first performs shot detection using mutual information, and extracts candidate key frames by clustering the frames within each detected shot. Shot key frames are then selected from the candidates by comparing the mutual information between adjacent frames, and finally the shot key frames are arranged in temporal order to form the video summary. Experiments show that this key frame extraction algorithm is effective and that the resulting summary reflects the content of the original video well.
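The measure everything hinges on is mutual information between adjacent frames, which can be computed from their joint grey-level histogram as below; the bin count and the use of grey levels only are assumptions.

```python
# Sketch of the mutual-information measure: MI between the grey-level distributions
# of two adjacent frames, from their joint histogram. Low MI across adjacent frames
# suggests a shot change. Bin count and grey-level-only features are assumptions.
import numpy as np

def mutual_information(frame_a, frame_b, bins=64):
    """MI (in nats) between grey-level values of two equally sized frames."""
    a = np.asarray(frame_a, dtype=np.uint8).ravel()
    b = np.asarray(frame_b, dtype=np.uint8).ravel()
    joint, _, _ = np.histogram2d(a, b, bins=bins, range=[[0, 256], [0, 256]])
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())
```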

14.
In this paper, we present an advanced news video parsing system that explores the visual characteristics of anchorperson scenes, aiming to provide personalized news services over the Internet or mobile platforms. As anchorperson shots serve as the root shots for structuring news video, the system first performs anchorperson detection, which divides the news into several segments. Thanks to the use of multiple features and post-processing, our anchorperson detection can be applied efficiently even to news video whose anchorperson scenes are highly challenging and complicated. Usually, the segments produced by anchorperson detection are regarded as news stories; however, an observation on our database shows this is not always true because of the existence of interview scenes, in which the interviewer (anchorperson) and interviewee appear alternately. Thus, a technique called interview clustering, based on face similarity, is carried out to merge these interview segments. Another novel aspect of our system is entity summarization of interview scenes, which we apply as the final stage. The effectiveness and robustness of the proposed system are demonstrated by an evaluation on 19 hours of news programs from 6 different TV channels.

15.
This paper presents a fast and effective technique for detecting and measuring the visual similarity of videos using compact fixed-length signatures. The proposed technique (dominant colour graph profile, DCGP) extracts and encodes the spatio-temporal information of a given video shot into a graph-based structure (tree) that fully captures this vital information. The graph's structural properties are utilized to construct a fixed-length video signature of 112 decimal values per video shot. The encoded spatio-temporal information is extracted by dividing each video frame into a block-based structure, where the positions of the respective blocks are tracked across video frames and encoded into multiple DCGP trees. The proposed technique provides a high matching speed (>2000 fps) and robust retrieval performance. Experiments on various standard and challenging datasets show the framework's robust performance in terms of both retrieval and computation.

16.
Shot boundary detection, or scene change detection, is a technique used in the initial phase of video indexing. One of the problems in the detection is discriminating abrupt scene changes from flashlight scenes. The usual discrimination method tests the similarity of the frames before and after a suspected flashlight effect. However, the performance of such a technique in discriminating a flashlight scene from an abrupt scene change can be affected by the scene content. To overcome this, we present a novel method that utilises the edge direction, thereby reducing erroneous matching as the dilation radius increases. This improves the accuracy of similarity testing and reduces the number of erroneously matched edges by a factor of four. Our experiment in discriminating flashlight effects from abrupt scene change frame pairs shows that our technique produces perfect detection, which cannot be achieved by normal edge-based detection. Such a contribution is important as it improves the indexing of real-life video.
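A sketch of direction-aware edge matching: an edge pixel in the pre-flash frame only counts as matched if an edge with a similar gradient direction exists within the dilation radius in the post-flash frame, and a high match ratio indicates a flash rather than a cut. The eight direction bins, dilation radius, and decision threshold are assumptions, not the authors' exact formulation.

```python
# Sketch of direction-aware edge matching between the frames before and after a
# suspected flashlight. Direction quantisation, dilation radius and threshold are
# illustrative assumptions.
import cv2
import numpy as np

def directional_edge_maps(gray, n_dirs=8, canny_lo=80, canny_hi=160):
    edges = cv2.Canny(gray, canny_lo, canny_hi)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    ang = (np.arctan2(gy, gx) + np.pi) / (2 * np.pi)          # direction in 0..1
    bins = np.minimum((ang * n_dirs).astype(int), n_dirs - 1)
    return [((edges > 0) & (bins == d)).astype(np.uint8) for d in range(n_dirs)]

def directed_match_ratio(gray_before, gray_after, radius=5):
    """Fraction of 'before' edge pixels with a same-direction edge nearby in 'after'."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2 * radius + 1,) * 2)
    maps_b = directional_edge_maps(gray_before)
    maps_a = [cv2.dilate(m, kernel) for m in directional_edge_maps(gray_after)]
    matched = sum(int(np.count_nonzero(b & a)) for b, a in zip(maps_b, maps_a))
    total = sum(int(np.count_nonzero(b)) for b in maps_b)
    return matched / max(total, 1)

def is_flashlight(gray_before, gray_after, thresh=0.6):
    # High match ratio: content persists across the event, so it is a flash, not a cut.
    return directed_match_ratio(gray_before, gray_after) >= thresh
```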

17.
Detection of gradual transitions and the elimination of disturbances caused by illumination change or fast object and camera motion are the major challenges for current shot boundary detection techniques. These disturbances are often mistaken for shot boundaries. It is therefore a challenging task to develop a method that is not only insensitive to various disturbances but also sensitive enough to capture a shot change. To address these challenges, we propose an algorithm for shot boundary detection in the presence of illumination change, fast object motion, and fast camera motion. This is important for accurate and robust detection of shot boundaries and in turn critical for high-level content-based analysis of video. First, the proposed algorithm extracts structure features from each video frame using the dual-tree complex wavelet transform. Then, spatial-domain structure similarity is computed between adjacent frames. Shot boundaries are declared based on carefully chosen thresholds. An experimental study is performed on a number of videos that include significant illumination change and fast motion of camera and objects. The performance comparison of the proposed algorithm with other existing techniques validates its effectiveness in terms of better recall, precision, and F1 score.

18.
Automatic temporal segmentation and visual summary generation methods that require minimal user interaction are key requirements in video information management systems. Clustering presents an ideal method for achieving these goals, as it allows direct integration of multiple information sources. This paper proposes a clustering-based framework to achieve these tasks automatically and with a minimum of user-defined parameters. The use of multiple frame-difference features and short-time techniques is presented for efficient detection of cut-type shot boundaries. Generic temporal filtering methods are used to process the signals used in shot boundary detection, resulting in better suppression of false alarms. Clustering is also extended to the key frame extraction problem: colour-based shot representations are provided by average and intersection histograms, which are then used in a clustering scheme to identify reference key frames within each shot. The technique achieves good compaction with a minimum number of visually non-redundant key frames.

19.
A shot boundary coefficient model based on windowed frame difference and its application   (cited: 1; self-citations: 0; by others: 1)
方勇  戚飞虎  冉鑫 《电子学报》2006,34(5):810-816
A new shot boundary coefficient model is proposed for video structure analysis. First, a windowed frame difference is computed for the current frame in such a way that it behaves differently inside a shot than at a shot boundary. Then a shot boundary detection operator computes a shot boundary similarity coefficient; the larger this coefficient, the more likely the frame is a shot boundary. Finally, to improve the discriminative power of the similarity coefficient, a shot boundary coefficient is defined from it. The shot boundary coefficient has good properties: it can detect shot boundaries on its own or be combined with traditional shot boundary detection methods, providing a robust framework for shot boundary detection. Experimental results show that detection based on the shot boundary coefficient model clearly improves shot boundary detection results.
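The abstract does not give the formulas, so the sketch below uses plausible stand-ins that follow the described behaviour (values stay low and flat inside a shot and peak at a boundary): a windowed frame difference against the preceding frames and a boundary coefficient defined as its ratio to a local average. These definitions are assumptions, not the paper's.

```python
# Sketch only: windowed frame difference and boundary coefficient as plausible
# stand-ins for the paper's (unstated) formulas.
import numpy as np

def windowed_frame_diff(frame_feats, win=4):
    """Mean distance between frame t and the `win` frames preceding it."""
    F = np.asarray(frame_feats, dtype=float)          # one feature vector per frame
    wfd = np.zeros(len(F))
    for t in range(1, len(F)):
        lo = max(0, t - win)
        wfd[t] = np.mean(np.abs(F[lo:t] - F[t]).sum(axis=1))
    return wfd

def boundary_coefficient(wfd, neigh=8):
    """Ratio of each windowed difference to its local average: roughly 1 inside a
    shot, clearly above 1 at a boundary."""
    coeff = np.ones_like(wfd)
    for t in range(len(wfd)):
        lo, hi = max(0, t - neigh), min(len(wfd), t + neigh + 1)
        local = np.concatenate([wfd[lo:t], wfd[t + 1:hi]])
        coeff[t] = wfd[t] / (local.mean() + 1e-6)
    return coeff

# Frames where boundary_coefficient(...) exceeds a threshold (e.g. 3) are declared
# shot boundaries, or the coefficient can be used to gate a conventional detector.
```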

20.
A video shot boundary detection algorithm with automatic threshold selection   (cited: 17; self-citations: 2; by others: 17)
成勇  须德 《电子学报》2004,32(3):508-511
Shot boundary detection is an important step toward content-based video retrieval. This paper reviews the basic existing shot boundary detection methods and, to address their shortcomings, proposes a shot boundary detection algorithm that selects its threshold automatically and considers both colour and spatial features. The method handles abrupt cuts well even under object motion and lighting changes, and can also detect gradual transitions. Experimental results show that the algorithm detects video shot boundaries effectively.
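A sketch of the two ingredients named above, a dissimilarity combining a global colour-histogram difference with a block-wise (spatial) difference, and a threshold selected automatically from local statistics; the 4x4 block grid, the equal weighting, and the mean-plus-k-sigma rule are illustrative assumptions.

```python
# Sketch: colour + spatial frame dissimilarity and an automatically selected local
# threshold. Grid size, weighting and the threshold rule are assumptions.
import cv2
import numpy as np

def colour_spatial_diff(frame_a, frame_b, grid=4):
    hsv_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2HSV)
    hsv_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2HSV)
    ha = cv2.calcHist([hsv_a], [0, 1], None, [16, 16], [0, 180, 0, 256])
    hb = cv2.calcHist([hsv_b], [0, 1], None, [16, 16], [0, 180, 0, 256])
    colour_d = cv2.compareHist(cv2.normalize(ha, ha), cv2.normalize(hb, hb),
                               cv2.HISTCMP_BHATTACHARYYA)
    # Spatial term: mean absolute difference of per-block average intensities.
    ga = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY).astype(float)
    gb = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY).astype(float)
    h, w = ga.shape
    bh, bw = h // grid, w // grid
    block_d = np.mean([abs(ga[i*bh:(i+1)*bh, j*bw:(j+1)*bw].mean()
                           - gb[i*bh:(i+1)*bh, j*bw:(j+1)*bw].mean())
                       for i in range(grid) for j in range(grid)]) / 255.0
    return 0.5 * colour_d + 0.5 * block_d

def auto_threshold_cuts(diffs, win=60, k=3.0):
    """Automatically selected threshold: local mean plus k local standard deviations."""
    diffs = np.asarray(diffs, dtype=float)
    cuts = []
    for i, d in enumerate(diffs):
        lo, hi = max(0, i - win), min(len(diffs), i + win)
        local = diffs[lo:hi]
        if d > local.mean() + k * local.std():
            cuts.append(i + 1)
    return cuts
```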
