Similar documents
20 similar documents found
1.
Cai  Bo  Ye  Wei  Zhao  Jianhui 《Multimedia Tools and Applications》2019,78(5):5381-5401

To segment regions of interest (ROIs) from ultrasound images, a novel dynamic-texture-based algorithm is presented that combines the surfacelet transform, a hidden Markov tree (HMT) model, and parallel computing. In the surfacelet transform, the image sequence is decomposed with a pyramid model, and the high-frequency 3D signals are further decomposed by directional filter banks. In HMT modeling, the distribution of coefficients is described with a Gaussian mixture model (GMM), and the relationship between scales is described with a scale-continuity model. From the HMT parameters estimated via expectation maximization, the joint probability density is calculated and taken as the feature value of the image sequence. ROIs and non-ROIs from collected sample videos are then used to train a support vector machine (SVM) classifier, which identifies the divided 3D blocks of the input video. To improve computational efficiency, parallel computing is implemented on a multi-processor CPU. The algorithm has been compared with existing texture-based approaches for ultrasound images, including the gray-level co-occurrence matrix (GLCM), local binary patterns (LBP), and wavelets; the experimental results demonstrate its advantages in processing noisy ultrasound images and segmenting ROIs more accurately.
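One of the baselines named above is the local binary pattern. As a minimal illustration (not the paper's method), the basic 3×3 LBP operator compares each of the eight neighbours of a pixel with the centre value and packs the comparisons into one byte:

```python
def lbp_code(patch):
    """Basic 3x3 local binary pattern: compare the 8 neighbours of the
    centre pixel (clockwise from the top-left) with the centre value and
    pack the results into an 8-bit code."""
    c = patch[1][1]
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for i, n in enumerate(neighbours):
        if n >= c:          # neighbour at least as bright as the centre
            code |= 1 << i
    return code

# A bright top row against a darker centre sets only the first three bits.
print(lbp_code([[9, 9, 9], [0, 5, 0], [0, 0, 0]]))  # -> 7
```

A texture descriptor is then typically the histogram of these codes over a region, which is what the GLCM/LBP comparison in the abstract refers to.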

  相似文献   

2.
张圆圆  黄宜军  王跃飞 《计算机应用》2018,38(12):3409-3413
Detection, tracking, and information editing of key objects in indoor-scene videos are currently handled mostly by manual processing, which is inefficient and imprecise. To address this, a texture-based learning method for semantic annotation of indoor scenes is proposed. First, optical flow is used to obtain inter-frame motion information, and the annotations of non-key frames are initialized from the key-frame annotations and the inter-frame motion. Then, an energy equation is constructed from the texture constraints of the non-key frames and their initial annotations. Finally, the energy equation is minimized with a graph-cut method, and its solution is the semantic annotation of the non-key frames. Experiments on annotation accuracy and visual quality show that, compared with motion estimation and model-based learning, the proposed texture-based method performs better. It can serve as a reference for low-latency decision systems such as service robots, smart homes, and emergency response.
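The energy the graph cut minimizes is, in the usual formulation, a data term (how well a label fits a pixel's initialization) plus a smoothness term penalizing label changes between neighbours. The following is only a hedged sketch that evaluates such a Potts-style energy for a candidate labeling; the unary costs and weight are hypothetical placeholders, not the paper's actual terms:

```python
def labeling_energy(labels, unary, lam=1.0):
    """Potts-style energy of a 2-D labeling: sum of per-pixel data costs
    plus a penalty lam for every pair of 4-connected neighbours that
    carry different labels."""
    h, w = len(labels), len(labels[0])
    # data term: cost of assigning label labels[y][x], read from unary[y][x][label]
    e = sum(unary[y][x][labels[y][x]] for y in range(h) for x in range(w))
    # smoothness term over right and down neighbours (each pair counted once)
    for y in range(h):
        for x in range(w):
            if x + 1 < w and labels[y][x] != labels[y][x + 1]:
                e += lam
            if y + 1 < h and labels[y][x] != labels[y + 1][x]:
                e += lam
    return e
```

A graph-cut solver then searches for the labeling minimizing this quantity; the sketch only shows what is being scored.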

3.
Query by video clip
Typical digital video search is based on queries involving a single shot. We generalize this problem by allowing queries that involve a video clip (say, a 10-second video segment). We propose two schemes. (i) Retrieval based on key frames follows the traditional approach of identifying shots, computing key frames from the video, and extracting image features around the key frames. For each key frame in the query, a similarity value (using color, texture, and motion) is obtained with respect to the key frames in the database video. Consecutive key frames in the database video that are highly similar to the query key frames are then used to generate the set of retrieved video clips. (ii) In retrieval using sub-sampled frames, we uniformly sub-sample the query clip as well as the database video; retrieval is based on matching color and texture features of the sub-sampled frames. Initial experiments on two video databases (a basketball video with approximately 16,000 frames and a CNN news video with approximately 20,000 frames) show promising results. Additional experiments using segments from one basketball video as queries and a different basketball video as the database show the effectiveness of the feature representation and matching schemes.
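Scheme (ii) amounts to sliding the sub-sampled query over the database video and scoring each alignment. A minimal sketch under simplifying assumptions (each frame reduced to a single toy feature value, distance = absolute difference, rather than the paper's color/texture features):

```python
def best_clip_match(query, database, step=2):
    """Match a query clip against a longer database video using uniformly
    sub-sampled frame features.  Returns the database start index (in
    original frames) whose window has the lowest total feature distance."""
    q = query[::step]                           # sub-sample the query clip
    best_start, best_cost = None, float("inf")
    for start in range(len(database) - len(query) + 1):
        window = database[start:start + len(query):step]  # same sub-sampling
        cost = sum(abs(a - b) for a, b in zip(q, window))
        if cost < best_cost:
            best_start, best_cost = start, cost
    return best_start
```

With real features each element would be a vector per frame and the distance a color/texture similarity, but the sliding alignment is the same.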

4.
Video frame rate largely determines video quality: the higher the frame rate, the smoother the motion in the picture, the clearer the information expressed, and the better the viewing experience. Video interpolation aims to increase the frame rate by generating a new frame from the information shared between two consecutive frames, an essential task in computer vision. Traditional motion-compensated interpolation produces holes and overlaps in the reconstructed frame and is sensitive to the quality of the optical flow. This paper therefore proposes a video frame interpolation method based on optical flow estimation with image inpainting. First, the optical flow between the input frames is estimated with the combined local and global total variation (CLG-TV) optical flow estimation model. Then, the intermediate frames are synthesized under the guidance of the optical flow. Finally, the nonlocal self-similarity between video frames is used to solve an optimization problem that repairs the pixel-loss areas in the interpolated frame. Quantitative and qualitative experimental results show that this method effectively improves the quality of optical flow estimation, generates realistic and smooth video frames, and effectively increases the video frame rate.
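For orientation, the simplest flow-free baseline for intermediate-frame synthesis is a per-pixel linear blend of the two input frames; this is only a hedged sketch of that baseline, not the CLG-TV-guided method of the paper, which warps both frames along the estimated flow before blending:

```python
def blend_intermediate(frame_a, frame_b, t=0.5):
    """Flow-free baseline for intermediate-frame synthesis: a per-pixel
    linear blend of two consecutive frames at time t in (0, 1).  A real
    interpolation method would first warp both frames along the optical
    flow, then blend, to avoid ghosting on moving objects."""
    return [[(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]

print(blend_intermediate([[0, 10]], [[10, 30]]))  # -> [[5.0, 20.0]]
```

The ghosting this baseline produces on moving content is exactly the artifact that flow guidance and the inpainting step are meant to remove.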

5.
Video de-shaking is an important application of video enhancement: it smooths video motion by correcting the positions of video frames. A consequent problem is how to repair the blank regions left in the frames so as to preserve the continuity of the video. Building on a study of image inpainting techniques, this paper proposes a method that repairs de-shaken video using an improved texture-synthesis technique. The de-shaking results given in the experiments demonstrate the effectiveness of the method.

6.
During video stabilization, camera motion distorts the image. To address this, a global motion estimation method based on camera pose is proposed, and, to overcome the loss of pixels in some regions after image stitching, an improved harmonic model is used to repair the missing pixels. The algorithm first extracts invariant features and uses them to estimate the camera's motion vectors; multiplying the inter-frame motion vectors yields each frame's motion relative to the first frame, from which an undistorted image can be computed. Stitching the computed image with the video frame then resolves the distortion. However, stitching may leave some regions without pixels, so the algorithm fills them using the improved harmonic model. Experimental results show that camera-pose-based global motion estimation resolves image distortion well, and the improved harmonic model repairs the image efficiently.

7.

Human activity recognition is a challenging computer-vision problem with a range of emerging applications. Recognizing human activities from video sequences is especially challenging because of their highly variable nature and the requirement of real-time processing. This paper proposes a combination of features in a multiresolution framework for human activity recognition. We exploit multiresolution analysis through the Daubechies complex wavelet transform (DCxWT), combining local binary patterns (LBP) with Zernike moments (ZM) at multiple resolutions of the Daubechies complex wavelet decomposition. First, LBP coefficients of the DCxWT coefficients of the image frames are computed to extract texture features; then ZM of these LBP coefficients are computed to extract shape features from the texture features, forming the final feature vector. A multi-class support vector machine classifier is used to classify the recognized human activities. The proposed method has been tested on several standard, publicly available datasets. The experimental results demonstrate that it works well for multi-view human activities and performs better than several state-of-the-art methods in terms of different quantitative performance measures.


8.
Automatic video logo detection and removal
Most commercial television channels use video logos, which can be considered a form of visible watermark, as a declaration of intellectual property ownership; they are also used as a symbol of authorization to rebroadcast when original logos appear alongside newer ones. An unfortunate side effect of such logos is a decrease in viewing pleasure. In this paper, we use the temporal correlation of video frames to detect and remove video logos. In the video-logo-detection part, the logo boundary box is first located using a distance threshold on video frames and is further refined by comparing edge lengths. Second, our proposed Bayesian classifier framework locates fragments of logos called logo-lets; in this framework, we systematically integrate prior knowledge about the location of video logos and their intrinsic local features to achieve robust detection. In the logo-removal part, after the logo region is marked, a matching technique finds the best replacement patch for the marked region within that video shot; this technique is found to be useful for small logos. Furthermore, we extend the image inpainting technique to video: unlike the 2D gradients used in image inpainting, we inpaint the logo region of video frames using 3D gradients that exploit the temporal correlations in video. The advantage of this algorithm is that the inpainted regions are consistent with the surrounding texture, so the result is perceptually pleasing. We present the results of our implementation and demonstrate the utility of our method for logo removal.
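The removal step relies on temporal correlation: pixels hidden by the logo in one frame are often visible, or nearly static, elsewhere in the shot. As a crude, hedged stand-in for the paper's 3D-gradient inpainting (useful only when the background under the mask is mostly static), one can fill masked pixels with their temporal median over the shot:

```python
import statistics

def temporal_fill(frames, mask):
    """Fill masked (logo) pixels of every frame with the temporal median of
    that pixel across the shot.  This is a simplification for illustration,
    not the 3-D gradient inpainting described in the abstract."""
    h, w = len(mask), len(mask[0])
    out = [[row[:] for row in frame] for frame in frames]  # deep copy
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                med = statistics.median(frame[y][x] for frame in frames)
                for frame in out:
                    frame[y][x] = med
    return out
```

The real method instead propagates gradients in space and time, which also handles moving backgrounds and persistent (opaque) logos.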

9.
Key-frame extraction is an important technique in content-based video summarization. This paper introduces affinity propagation (AP) clustering to key-frame extraction for the first time. The method combines the color-histogram intersection of two consecutive frames and clusters the data points automatically through message passing. It is compared with key-frame extraction based on k-means and SVC (support vector clustering). Experimental results show that AP clustering extracts key frames quickly and accurately, and the generated video summaries have good compression and content-coverage rates.
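The similarity the method feeds to the clustering step is the color-histogram intersection of consecutive frames. A minimal sketch of that measure (bin layout and normalization are illustrative assumptions):

```python
def hist_intersection(h1, h2):
    """Color-histogram intersection of two frames: the sum of bin-wise
    minima, normalized by the mass of the first histogram, so 1.0 means
    the second histogram covers the first completely."""
    overlap = sum(min(a, b) for a, b in zip(h1, h2))
    return overlap / sum(h1)

print(hist_intersection([2, 2, 4], [1, 3, 4]))  # -> 0.875
```

Frames with high mutual intersection end up in the same cluster, and AP's message passing selects an exemplar frame per cluster as the key frame.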

10.
To address the video shake produced when shooting with mobile phones, a video stabilization algorithm based on optical flow and Kalman filtering is proposed. First, the shaky video is pre-stabilized with optical flow; Shi-Tomasi corners are detected on the pre-stabilized frames and tracked with the LK (Lucas-Kanade) algorithm, and the affine transformation between adjacent frames is estimated with RANSAC, from which the original camera path is computed. The camera path is then smoothed with a Kalman filter to obtain a smooth camera path. Finally, from the relationship between the original and smoothed paths, a compensation matrix between adjacent frames is computed and applied as a geometric transform to each frame in turn, yielding a stable output video. Experiments show good results on six categories of shaky video: the PSNR of the stabilized video improves by about 6.631 dB over the original, the inter-frame structural similarity (SSIM) improves by about 40%, and the average curvature improves by about 8.3%.
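The path-smoothing step can be illustrated with a one-dimensional constant-position Kalman filter applied to one component of the camera path (e.g. cumulative x-translation per frame). This is a hedged sketch with hypothetical noise parameters, not the paper's tuned filter:

```python
def kalman_smooth_path(path, q=1e-3, r=0.5):
    """Smooth a jittery 1-D camera-path signal with a constant-position
    Kalman filter.  q is the process-noise variance, r the measurement-noise
    variance; a larger r trusts each new measurement less, giving a
    smoother (but laggier) path."""
    x, p = path[0], 1.0          # state estimate and its variance
    smoothed = [x]
    for z in path[1:]:
        p += q                   # predict: the camera may have drifted
        k = p / (p + r)          # Kalman gain
        x += k * (z - x)         # update toward the new measurement
        p *= (1 - k)
        smoothed.append(x)
    return smoothed
```

The per-frame compensation is then the difference between the original and smoothed paths, applied as a geometric warp.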

11.
In video processing, a common first step is to segment the video into physical units, generally called shots; a shot is a video segment that consists of one continuous action. In general, these physical units need to be clustered into more semantically significant units, such as scenes, sequences, and programs: the so-called story-based video structuring. Automatic video structuring is of great importance for video browsing and retrieval. Shots or scenes are usually described by one or several representative frames, called key frames. Viewed from a higher level, the key frames of some shots may be redundant in terms of semantics. In this paper, we propose automatic solutions to the problems of (i) video partitioning, (ii) key frame computing, and (iii) key frame pruning. For the first problem, an algorithm called "net comparison" is devised; it is accurate and fast because it uses both statistical and spatial information in an image and does not have to process the entire image. For the last two problems, we develop an original image similarity criterion that considers both the spatial layout and the detail content of an image. For this purpose, coefficients of the wavelet decomposition are used to derive parameter vectors accounting for these two aspects. The parameters exhibit (quasi-)invariant properties, making the algorithm robust to many types of object/camera motion and scale variation. The novel "seek and spread" strategy used in key frame computing allows us to obtain a large representative range for the key frames. Inter-shot redundancy of the key frames is suppressed using the same image similarity measure. Experimental results demonstrate the effectiveness and efficiency of our techniques.

12.
In this article, we present an algorithm for detecting moving objects in a given video sequence. Spatial and temporal segmentation are combined to detect moving objects. In spatial segmentation, a multi-layer compound Markov random field (MRF) is used that models the spatial, temporal, and edge attributes of the image frames of a given video. Segmentation is viewed as a pixel-labeling problem and is solved using the maximum a posteriori (MAP) probability estimation principle; i.e., segmentation is done by searching for a labeled configuration that maximizes this probability. We propose a differential evolution (DE) algorithm with neighborhood-based mutation, termed the distributed differential evolution (DDE) algorithm, for estimating the MAP of the MRF model. A window over the image lattice is considered for the mutation of each target vector of the DDE, thereby enhancing the speed of convergence. For temporal segmentation, the change detection mask (CDM) is obtained by thresholding the absolute differences of two consecutive spatially segmented image frames. The intensity/color values of the original pixels of the current frame are superimposed on the changed regions of the modified CDM to extract the video object planes (VOPs). To test the effectiveness of the proposed algorithm, five reference video sequences and one real-life video sequence are considered. Results of the proposed method are compared with four state-of-the-art techniques and provide better spatial segmentation and better identification of the location of moving objects.
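The temporal-segmentation step is straightforward to sketch: threshold the per-pixel absolute difference of two consecutive (already spatially segmented) frames to get the change detection mask. The threshold value below is an illustrative assumption:

```python
def change_mask(seg_prev, seg_curr, thresh=2):
    """Change Detection Mask: 1 where the absolute difference between two
    consecutive segmented frames reaches the threshold, 0 elsewhere."""
    return [[1 if abs(a - b) >= thresh else 0 for a, b in zip(row_p, row_c)]
            for row_p, row_c in zip(seg_prev, seg_curr)]

print(change_mask([[0, 5], [5, 5]], [[0, 5], [9, 5]]))  # -> [[0, 0], [1, 0]]
```

The VOPs are then cut out by copying the current frame's original pixel values wherever the (modified) mask is 1.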

13.
Accurately tracking a video object through a video sequence is a crucial stage of video object processing, which has wide applications in different fields. In this paper, a novel video object tracking algorithm is proposed, based on an improved gradient vector flow (GVF) snake model and an intra-frame centroid tracking algorithm. Unlike the traditional GVF snake, the improved GVF snake adopts anisotropic diffusion and a four-direction edge operator to solve the blurry-boundary and edge-shifting problems; it is employed to extract the object contour in each frame of the video sequence. To set the initial contour of the GVF snake automatically, we design an intra-frame centroid tracking algorithm: the original video sequence is split into segments, and for each segment the initial contours of the first two frames are set by change detection based on a t-distribution significance test; then, exploiting the redundancy between consecutive frames, the initial contours of the subsequent frames are obtained from intra-frame motion vectors. Experimental results on several test video sequences indicate the validity and accuracy of the video object tracking.

14.
A dual-watermarking algorithm for video based on compressed sensing
To address the difficult problems of content protection and intra-frame/inter-frame tamper detection for digital video, a dual-watermark algorithm for video protection and tamper detection is proposed, using compressed sensing to extract the video's content features as watermarks. First, compressed sensing is used to extract the content features of I-frame macroblocks, generating a semi-fragile content-authentication watermark. Then, a binary operation on the frame serial numbers generates an integrity watermark. Finally, the orthogonal matching pursuit (OMP) compressed-sensing reconstruction algorithm embeds the two watermarks into the compressed measurements of the high-frequency DCT coefficients of the corresponding macroblocks of the I- and P-frames, improving the watermark's robustness against attack and enabling video tamper detection. Simulations show that the algorithm can locate intra-frame tampering precisely down to the sub-block, and has strong detection capability for inter-frame tampering such as frame insertion, frame deletion, and frame swapping.
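The integrity watermark's role can be illustrated with a toy sketch: encode each frame's serial number as a bit string, embed it, and at verification time check that the recovered numbers form a consecutive sequence. The bit width and the consecutiveness check below are illustrative assumptions, not the paper's exact binary operation:

```python
def integrity_bits(frame_index, n_bits=8):
    """Toy integrity watermark: the frame's serial number as n_bits binary
    digits, most significant bit first."""
    return [(frame_index >> (n_bits - 1 - i)) & 1 for i in range(n_bits)]

def check_sequence(recovered):
    """Return the positions where the recovered frame numbers stop being
    consecutive; gaps reveal deletion, repeats or jumps reveal insertion
    or swapping."""
    return [i for i in range(1, len(recovered))
            if recovered[i] != recovered[i - 1] + 1]

print(check_sequence([3, 4, 6, 7]))  # -> [2]  (frame 5 is missing)
```

Making the embedded payload depend on the frame index is what lets the detector distinguish inter-frame tampering (insertion, deletion, swapping) from intra-frame content changes.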

15.
Objective: Video summarization plays an important role in multimedia data processing and computer vision. Clustering-based summarization methods usually combine global or local image features to cluster video frames and then take representative key frames from each cluster; however, these methods mostly require the number of clusters to be fixed in advance, and the adaptive methods cannot obtain cluster centers efficiently. This paper therefore proposes a key-frame selection method based on mapping and density-value analysis of images. Method: Using the differences between images, a metric is proposed that maps each image to a corresponding point in a 2-D space; clusters are then formed from the relative positions and neighborhood density values of the point pairs, and a method is proposed for extracting representative key frames from the video according to the clustering result. Results: The proposed metric was tested on the images of the Olivetti face database and the key-frame extraction method on the Open Video database; the method achieves an average precision of 66% and recall of 74%, and its F-measure of 69% is about 11% higher than that of the other methods. Conclusion: The proposed map-then-cluster approach identifies image categories effectively and extracts key frames from video reliably, from which the video's summary can be composed.

16.
To study the evolution of surface deformation and failure in rock, a visualization application was designed. The application takes videos of conventional rock mechanics tests as its subject and includes a static-image processing interface and a video processing interface. The static-image interface consists of four modules: image type conversion, edge detection, morphological processing, and image filtering. The video interface provides basic information such as the number of frames, the duration, and the size and dimensionality of the frame images of the test video. By setting the required parameters in the visual interface, the texture parameters of a single frame and the surface displacement field of the rock specimen can be computed. Two examples illustrate how the application is used to analyze the deformation and failure of rock material. The results provide a useful reference for analyzing the deformation characteristics and failure mechanisms of rock materials.

17.
This paper presents a context-aware, smartphone-based visual obstacle detection approach to aid visually impaired people in navigating indoor environments. The approach is based on processing two consecutive frames (images), computing optical flow, and tracking certain points to detect obstacles. The frame rate of the video stream is determined using a context-aware data-fusion technique for the sensors on the smartphone. Through an efficient and novel algorithm, a point dataset on each pair of consecutive frames is designed and evaluated to check whether the points belong to an obstacle. In addition to determining the points based on the texture in each frame, our algorithm also considers the heading of user movement to find critical areas on the image plane. We validated the algorithm through experiments against two comparable algorithms, conducted in different indoor settings, and compared and analyzed the results in terms of precision, recall, accuracy, and F-measure. The results show that our algorithm is more precise than the two other widely used algorithms for this task. We also used the time-to-contact parameter for clustering the points and present the resulting improvement in clustering performance.
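Time-to-contact can be estimated without knowing depth, from the apparent expansion of a tracked feature between frames: TTC ≈ s / (ds/dt), where s is the apparent size. A hedged sketch of that relation (a simplification of how the paper uses the parameter):

```python
def time_to_contact(size_prev, size_curr, dt=1.0):
    """Approximate time-to-contact from apparent-size expansion between two
    frames: TTC = s / (ds/dt).  Returns None when the tracked feature is not
    expanding, i.e. the user is not approaching it."""
    ds = (size_curr - size_prev) / dt
    if ds <= 0:
        return None
    return size_curr / ds

print(time_to_contact(10.0, 12.0))  # -> 6.0 frames until contact
```

Points with similar TTC values likely belong to the same obstacle surface, which is why TTC is a useful extra dimension for clustering the tracked points.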

18.
In this paper, a unified and adaptive web video thumbnail recommendation framework is proposed that recommends thumbnails for both video owners and browsers on the basis of image quality assessment, image accessibility analysis, video content representativeness analysis, and query-sensitive matching. First, video shot detection is performed and the frame with the highest image quality is extracted as the key frame of each shot, based on our proposed image quality assessment method; these key frames serve as the thumbnail candidates for the subsequent processes. In the image quality assessment, the normalized-variance autofocusing function is employed to evaluate image blur, ensuring that the selected thumbnail candidates are clear and of high image quality. For accessibility analysis, color moments, visual salience, and texture are used with a support vector regression model to predict each candidate's accessibility score, ensuring that the recommended thumbnail's ROIs are large enough and easily accessible to users. For content representativeness analysis, the mutual reinforcement algorithm is applied over the entire video to obtain each candidate's representativeness score, ensuring that the final thumbnail is representative enough for users to grasp the main video content at a glance. Considering browsers' query intent, a relevance model is designed to recommend more personalized thumbnails for particular browsers. Finally, by flexibly fusing the above analysis results, the final adaptive recommendation is accomplished. Experimental results and subjective evaluations demonstrate the effectiveness of the proposed approach: compared with existing web video thumbnail generation methods, the thumbnails for video owners not only reflect the contents of the video better but also make users feel more comfortable, and the thumbnails for video browsers directly reflect their preferences, which greatly enhances their user experience.

19.
To address the problem of clustering shots into higher-level scenes, a semantics-based scene segmentation algorithm is proposed. The algorithm first segments the video into shots and extracts key frames for each shot. It then computes the color histograms and MPEG-7 edge histograms of the key frames to form their features. Next, the color and texture features of the shot key frames are used to train support vector machines (SVMs), building seven SVM classifiers for different semantic concepts; these classify the key frames of the video to be segmented, yielding each key frame's semantics. A semantic-concept vector is formed for each key frame from the concepts it contains, and scenes are finally obtained by clustering the shot key frames according to these vectors. In addition, to extract scene key frames, a shot-selection function is constructed, and the scene's key frames are chosen according to its value. Experimental results show that, compared with Hanjalic's method, the proposed scene segmentation algorithm improves precision and recall by 34.7% and 9.1%, respectively.

20.
This paper proposes a fast moving-object extraction algorithm that works in the MPEG compressed domain. Taking as input the motion vectors and the DC DCT coefficients of the luminance component obtained by partial decoding, the algorithm extracts moving objects from P-frames. First, robust regression analysis is used to estimate the global motion, and macroblocks inconsistent with it are marked, yielding the distribution of moving blocks. Then the interpolated motion-vector field serves as the temporal feature, and the reconstructed DC image converted to the LUV color space serves as the spatial feature; fast mean-shift clustering finds regions whose temporal and spatial features are similar, refining the region boundaries. Finally, combining the moving-block distribution with the clustering result, background separation is performed by statistical labeling based on a Markov random field, producing the mask of the moving objects. Experimental results show that the algorithm effectively suppresses the influence of motion-vector noise and runs very fast: for a CIF-format video stream, it processes about 50 frames per second.
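The first step, marking macroblocks inconsistent with the global motion, can be sketched with a simple robust estimator: take the per-component median of the block motion vectors as the global translation and flag blocks that deviate from it. This is a hedged simplification (a translational model with a fixed tolerance) of the robust regression the paper uses:

```python
import statistics

def global_motion_and_outliers(vectors, tol=2.0):
    """Robust global-motion estimate from macroblock motion vectors: the
    per-component median is taken as the global translation, and blocks
    whose vector deviates from it by more than tol in either component are
    flagged as candidate moving-object blocks."""
    gx = statistics.median(v[0] for v in vectors)
    gy = statistics.median(v[1] for v in vectors)
    outliers = [i for i, (vx, vy) in enumerate(vectors)
                if abs(vx - gx) > tol or abs(vy - gy) > tol]
    return (gx, gy), outliers
```

The median tolerates a sizable fraction of moving-object blocks, which is why robust estimators are preferred over a plain mean for global motion.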


Copyright © 北京勤云科技发展有限公司 (京ICP备09084417号)