Similar literature
1.
A neuro-fuzzy approach for segmentation of human objects in image sequences   Cited by: 1 (self-citations: 0, citations by others: 1)
We propose a novel approach for segmentation of human objects, including face and body, in image sequences. Object segmentation is important for achieving a high compression ratio in modern video coding techniques, e.g., MPEG-4 and MPEG-7, and human objects are usually the main parts in the video streams of multimedia applications. Existing segmentation methods apply simple criteria to detect human objects, which either restricts their usage or leads to a high segmentation error. We combine temporal and spatial information and employ a neuro-fuzzy mechanism to overcome these difficulties. A fuzzy self-clustering technique is used to divide the base frame of a video stream into a set of segments, which are then categorized as foreground or background based on a combination of multiple criteria. Then, human objects in the base frame and the remaining frames of the video stream are precisely located by a fuzzy neural network that is constructed from the previously obtained fuzzy rules and trained by a singular value decomposition (SVD)-based hybrid learning algorithm. The proposed approach has been tested on several different video streams, and the results have shown that the approach can produce a much better segmentation than other methods.

2.
A real-time face detection method for video images   Cited by: 4 (self-citations: 0, citations by others: 4)
Song Hong, Shi Feng, Wang Yizhuo. Computer Engineering (《计算机工程》), 2004, 30(19): 23-24, 158
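This paper presents a real-time face detection method for video images. The method combines the motion information and color information of color video to locate face regions quickly. The algorithm applies symmetric differencing to every three consecutive frames of the video sequence to extract the moving regions; it then performs skin-color detection on the moving regions using a skin-color clustering model and, after verifying the candidate faces, locates the faces in the image. Experiments show that the proposed method is fast, simple to implement, and efficient, and that it meets the requirements of real-time systems.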
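The three-frame symmetric differencing step described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name, the list-of-rows grayscale frame representation, and the threshold value of 15 are all assumptions made for the example.

```python
def symmetric_difference_mask(prev, curr, nxt, threshold=15):
    """Return a binary motion mask for the middle of three grayscale frames.

    Each frame is a list of rows of integer intensities (0-255).
    A pixel counts as moving only if the middle frame differs from BOTH
    the previous and the next frame by more than `threshold` (assumed value).
    """
    mask = []
    for row_p, row_c, row_n in zip(prev, curr, nxt):
        mask_row = []
        for p, c, n in zip(row_p, row_c, row_n):
            changed_since_prev = abs(c - p) > threshold
            changed_before_next = abs(n - c) > threshold
            mask_row.append(1 if changed_since_prev and changed_before_next else 0)
        mask.append(mask_row)
    return mask
```

Requiring a change in both directions is what suppresses the "ghost" that a plain two-frame difference leaves at the object's old position; skin-color detection would then run only on the pixels where the mask is 1.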

3.
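Small-face detection in real-time scenes suffers from a low detection rate and poor regression accuracy. The method fuses lower-level features to perform multi-scale cascaded prediction. Prediction boxes of different sizes and aspect ratios are generated according to the characteristics of faces in real-time scenes so as to fit face shapes better. For the prediction stage, a soft-and-hard NMS algorithm based on IoU discrimination is proposed to suppress redundant prediction boxes: two thresholds divide the boxes generated by the network into low, middle, and high bands, and the boxes in each band are handled differently to achieve accurate filtering. The best architecture reaches 45 frames/s in real-time video detection and camera detection on two NVIDIA GTX 1080 GPUs and achieves an average precision of 82.6% on the overall Wider Face validation set.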
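The abstract does not say how each of the three IoU bands is treated, so the following is only one plausible reading, sketched under stated assumptions: boxes in the low band are left untouched, boxes in the middle band have their score decayed linearly (as in Soft-NMS), and boxes in the high band are discarded (as in classic NMS). The threshold values `low`, `high`, and `min_score` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_hard_nms(boxes, scores, low=0.3, high=0.7, min_score=0.05):
    """Return the indices of the kept boxes, in order of selection."""
    remaining = [(score, index) for index, score in enumerate(scores)]
    keep = []
    while remaining:
        remaining.sort(reverse=True)           # highest current score first
        _, best = remaining.pop(0)
        keep.append(best)
        updated = []
        for score, index in remaining:
            overlap = iou(boxes[best], boxes[index])
            if overlap >= high:
                continue                        # high band: hard suppression
            if overlap >= low:
                score *= 1.0 - overlap          # middle band: soft score decay
            if score >= min_score:              # low band passes through unchanged
                updated.append((score, index))
        remaining = updated
    return keep
```

For example, with boxes `[(0, 0, 10, 10), (0, 0, 10, 9), (20, 20, 30, 30), (0, 0, 10, 5)]` and scores `[0.9, 0.8, 0.7, 0.06]`, the second box is removed outright (IoU 0.9), the fourth is decayed below `min_score` (IoU 0.5), and the function returns `[0, 2]`.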

4.
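In the MPEG-4 video coding standard, each frame of a video sequence is represented by video object planes in order to support content-based interactivity. Generating video object planes requires effective segmentation of the moving objects in the sequence and tracking of their changes over time. Among video segmentation methods, interactive segmentation of video objects can meet the requirements on both efficiency and quality. This paper therefore proposes segmenting semantic video objects by combining interactive segmentation with automatic tracking. For the initial segmentation, user interaction is combined with the morphological watershed segmentation algorithm to extract the video object contour, and an improved contour tracking method raises the accuracy of that contour. For tracking in subsequent frames, a six-parameter affine transformation tracks the change of the moving object's contour; the motion vector obtained from translational estimation serves as the initial value for computing the six affine parameters. Experimental results show that the method can effectively segment and track moving video objects.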
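As an illustration of the tracking model mentioned above, a six-parameter affine transformation maps each contour point (x, y) to (a1·x + a2·y + a3, a4·x + a5·y + a6). The sketch below shows how a translation-only motion vector can seed the six parameters before they are refined; the parameter ordering and the function names are assumptions for this example, and the refinement step itself is not described in the abstract and is omitted.

```python
def affine_from_translation(dx, dy):
    """Initial six-parameter model (a1..a6) that encodes a pure translation."""
    return (1.0, 0.0, dx, 0.0, 1.0, dy)


def apply_affine(params, contour):
    """Map contour points with x' = a1*x + a2*y + a3 and y' = a4*x + a5*y + a6."""
    a1, a2, a3, a4, a5, a6 = params
    return [(a1 * x + a2 * y + a3, a4 * x + a5 * y + a6) for x, y in contour]
```

Starting from the identity matrix plus the estimated translation means the initial prediction already places the contour near the object, so any subsequent parameter estimation only has to account for the remaining rotation, scaling, and shear.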

5.
Major Cast Detection in Video Using Both Speaker and Face Information   Cited by: 1 (self-citations: 0, citations by others: 1)
Major casts, for example, the anchor persons or reporters in news broadcast programs and the principal characters in movies, play an important role in video, and their occurrences provide meaningful indices for organizing and presenting video content. This paper describes a new approach for automatically generating a list of major casts in a video sequence based on multiple modalities, specifically, speaker information in the audio track and face information in the video track. The core algorithm is composed of three steps. First, speaker boundaries are detected and speaker segments are clustered in the audio stream. Second, face appearances are tracked and face tracks are clustered in the video stream. Finally, correspondences between speakers and faces are determined based on their temporal co-occurrence. A list of major casts is constructed and ranked in an order that reflects each cast's importance, which is determined by the accumulated temporal and spatial presence of the cast. The proposed algorithm has been integrated in a major-cast-based video browsing system, which presents the face icon and marks the speech locations in the time stream for each detected major cast. The system provides a semantically meaningful summary of the video content, which helps the user to effectively digest the theme of the video.
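The third step, matching speakers to faces by temporal co-occurrence, can be sketched as below. This is a simplified illustration under assumptions: each cluster is reduced to a list of (start, end) intervals in seconds, and each speaker is paired with the face cluster that shares the most on-screen time with it. The abstract does not specify the actual correspondence rule, and the function names are hypothetical.

```python
def overlap(a, b):
    """Length of the intersection of two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def match_speakers_to_faces(speaker_segments, face_tracks):
    """Map each speaker cluster id to the face cluster id it co-occurs with longest.

    Both arguments map a cluster id to a list of (start, end) intervals.
    Speakers with no overlapping face are left out of the result.
    """
    matches = {}
    for speaker, speaker_intervals in speaker_segments.items():
        totals = {
            face: sum(overlap(s, f) for s in speaker_intervals for f in face_intervals)
            for face, face_intervals in face_tracks.items()
        }
        best = max(totals, key=totals.get, default=None)
        if best is not None and totals[best] > 0:
            matches[speaker] = best
    return matches
```

The same accumulated overlap totals could also feed the ranking step, since the abstract states that importance is derived from each cast's accumulated presence.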

6.
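To address the poor performance of human keypoint detection on video streams and the motion blur that may appear after a video stream is sliced into frames, an improved RetinaNet-CPN network is proposed for human keypoint detection; it handles the interference caused by motion-blurred frames and improves keypoint detection accuracy. After the video stream is sliced, the improved RetinaNet network first detects every person in each image and performs blur detection on each target box; target boxes that exceed the threshold are deblurred; finally, a CPN network with an attention mechanism extracts the keypoints. After the IoU function that RetinaNet uses to measure the difference between predicted and ground-truth boxes was replaced by DIoU, object detection AP rose by nearly 3% in simulation experiments. For blurred images, the blur kernel estimated from the spectral characteristics of uniform linear motion differs little from the actual blur kernel, and deblurring with it largely restores the original sharp image. The attention mechanism assigns suitable weights to the channels and feature layers, raising CPN detection AP by nearly 1% and AR by 0.5%.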
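For reference, the DIoU measure mentioned in this abstract is commonly defined as IoU minus the squared distance between the two box centres divided by the squared diagonal of the smallest box enclosing both. The sketch below computes that standard quantity for axis-aligned boxes; it is not taken from the paper, and how the authors wire it into RetinaNet is not stated in the abstract.

```python
def diou(box_a, box_b):
    """Distance-IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centres.
    center_dist2 = (((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2
                    + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2)
    # Squared diagonal of the smallest box enclosing both boxes.
    diag2 = ((max(ax2, bx2) - min(ax1, bx1)) ** 2
             + (max(ay2, by2) - min(ay1, by1)) ** 2)
    if diag2 == 0:
        return iou
    return iou - center_dist2 / diag2
```

Unlike plain IoU, which is 0 for every pair of non-overlapping boxes, DIoU still varies with how far apart the boxes are, which gives a regression loss of the form 1 − DIoU a useful signal in that case.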

7.
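A multiple-instance learning algorithm for video face recognition based on an improved Fisher criterion   Cited by: 1 (self-citations: 0, citations by others: 1)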
Wang Yu, Shen Xuanjing, Chen Haipeng. Acta Automatica Sinica (《自动化学报》), 2018, 44(12): 2179-2187
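Pose changes of the target in video make it difficult to locate face key frames accurately, so video face recognition methods based on key-frame identification have low recognition rates. To solve this problem, this paper proposes a multiple-instance learning algorithm for video face recognition based on a Fisher weighting criterion. The algorithm treats video face recognition as a multiple-instance problem: the normalized face frames of a video are the instances in a video bag, block-wise TPLBP cascaded histograms serve as the instance texture features, and the weights of the instance features are obtained with the improved Fisher criterion. A classifier is generated by a multiple-instance learning algorithm in the instance feature space of the training set and is then used to classify and predict test videos. In experiments on the Honda/UCSD video database and the YouTube Face database, the algorithm achieved high recognition accuracy, which verifies its effectiveness. The method is also robust to uniform illumination changes and pose changes.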

8.
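Based on the characteristics of security monitoring for online examinations, a fast face localization algorithm for video stream images is proposed that relies on a skin-color model and feature localization, and a system prototype is built that is suitable for identity authentication in real online examinations. The method uses an ordinary PC camera as the image acquisition device and the captured video stream as the data source. Single frames are extracted from the stream, and the face is located through color space conversion, face skin-color modeling, and post-processing; the facial feature points are then located with techniques such as image mosaic, edge extraction, and salient feature point localization. On this basis, eye localization is used to study a face tracking algorithm for the video stream. Test results show that the implemented face detection is suitable for faces at close range, is fast, has a low false detection rate, and can be used in real-time systems.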

9.
Shot Partitioning Based Recognition of TV Commercials   Cited by: 1 (self-citations: 0, citations by others: 1)
Digital video applications exploit the intrinsic structure of video sequences. In order to obtain and represent this structure for video annotation and indexing tasks, the main initial step is automatic shot partitioning. This paper analyzes the problem of automatic TV commercials recognition, and a new algorithm for scene break detection is then introduced. The structure of each commercial is represented by the set of its key-frames, which are automatically extracted from the video stream. The particular characteristics of commercials make commonly used shot boundary detection techniques obtain worse results than with other video content domains. These techniques are based on individual image features or visual cues, which show significant performance deficiencies when they are applied to complex video content domains like commercials. We present a new scene break detection algorithm based on the combined analysis of edge and color features. Local motion estimation is applied to each edge in a frame, and the continuity of the color around them is then checked in the following frame. By separately considering both sides of each edge, we rely on the continuous presence of the objects and/or the background of the scene during each shot. Experimental results show that this approach outperforms single feature algorithms in terms of precision and recall.

10.
A fast face localization method for video surveillance   Cited by: 3 (self-citations: 0, citations by others: 3)
Song Hong, Shi Feng, Li Jian. Computer Engineering (《计算机工程》), 2005, 31(2): 30-32
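Based on the characteristics of video surveillance applications, and combining the temporal continuity of video images with the skin-color characteristics of faces, a fast face localization method for video surveillance is proposed. The method first extracts the moving regions with a symmetric differencing algorithm; it then performs skin-color detection on the moving regions with a skin-color segmentation algorithm based on a BP (back-error-propagation) neural network; finally, after further verification of the candidate face regions, it locates the faces in the image. Experimental results show that the proposed method is simple to implement, fast, and has a low false detection rate, which makes it suitable for real-time video surveillance systems.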

11.
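A fast key frame extraction method in the compressed domain   Cited by: 1 (self-citations: 0, citations by others: 1)
Key frame extraction is the foundation of content-based retrieval and video analysis. Using key frames reduces the amount of data needed for video indexing and also provides an organizational framework for video summarization and retrieval. The paper first reviews current key frame extraction techniques and then proposes a method that extracts key frames from MPEG video streams based on motion features, using a fuzzy inference algorithm. Because processing operates directly on the compressed MPEG video without decompressing it, the computational complexity is low and extraction is faster. Experiments show that the method is efficient and that the extracted key frames represent the video content fairly well.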

12.
13.
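A robust error-resilient algorithm for I-frames in wireless video transmission   Cited by: 1 (self-citations: 0, citations by others: 1)
In wireless transmission environments without quality-of-service guarantees, bitstreams compressed according to traditional video coding standards are easily corrupted, and this is particularly pronounced in I-frames. Traditional video compression standards therefore need to be improved to increase their inherent error resilience in wireless environments. To address this problem, an algorithm is proposed for improving the error resilience of I-frames in a video stream. The algorithm first analyzes how sensitive the different parts of the video stream are to the error propagation caused by bit errors occurring within them. That analysis, together with the differing importance of bitstream information in I-frame reconstruction, then serves as the principle for deciding which information in the I-frame bitstream is important. Finally, the important information in the I-frames is protected, extracted, and concentrated to reduce the extent of intra-frame error propagation. Simulation results for an H.263+ baseline test bitstream over an additive white Gaussian noise wireless channel show that, at the cost of a small amount of redundant information and computational complexity, the algorithm considerably improves both subjective and objective quality over the reference algorithm.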

14.
AD-HOC (Appearance Driven Human tracking with Occlusion Classification) is a complete framework for multiple people tracking in video surveillance applications in the presence of large occlusions. The appearance-based approach allows the estimation of the pixel-wise shape of each tracked person even during the occlusion. This peculiarity can be very useful for higher level processes, such as action recognition or event detection. A first step predicts the position of all the objects in the new frame while a MAP framework provides a solution for best placement. A second step associates each candidate foreground pixel to an object according to mutual object position and color similarity. A novel definition of non-visible regions accounts for the parts of the objects that are not detected in the current frame, classifying them as dynamic, scene or apparent occlusions. Results on surveillance videos are reported, using in-house produced videos and the PETS2006 test set.

15.
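To replace frame-by-frame manual processing when transplanting the head image of a specific person into a video image sequence, this paper proposes and implements an image processing method based on relative fixed-point displacement. First, the difference image method is used to detect and locate the face in each frame of the video and in the specific image to be transplanted. Then, based on facial features combined with face skin color, the displacement of the face in the specific image relative to the fixed point is found, and the head image is transplanted into the video image sequence according to this displacement. Experiments show that the method gives fairly good transplantation results and is simple to implement and efficient.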

16.
17.
A compact summary of video that conveys visual content at various levels of detail enhances user interaction significantly. In this paper, we propose a two-stage framework to generate MPEG-7-compliant hierarchical key frame summaries of video sequences. At the first stage, which is carried out off-line at the time of content production, fuzzy clustering and data pruning methods are applied to given video segments to obtain a nonredundant set of key frames that comprise the finest level of the hierarchical summary. The number of key frames allocated to each shot or segment is determined dynamically and without user supervision through the use of cluster validation techniques. A coarser summary is generated on-demand in the second stage by reducing the number of key frames to match the low-level browsing preferences of a user. The proposed method has been validated by experimental results on a collection of video programs.

18.
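In the VS .NET 2003 environment, this article uses the image and video stream processing capabilities of C to implement a method for detecting moving objects. Starting from a "background frame", the method applies filtering techniques to detect moving objects automatically. The method is simple and effective, and it supports further recognition of the moving objects.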
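A background-frame detector of the kind described here can be sketched in a few lines. The abstract does not name the filter it uses, so the 3x3 majority filter below is an assumed stand-in, as are the function name, the threshold of 25, and the list-of-rows grayscale frame representation. The sketch is written in Python for brevity rather than in the article's own environment.

```python
def detect_motion(background, frame, threshold=25):
    """Return a binary foreground mask for `frame` relative to `background`.

    Both images are lists of rows of integer intensities (0-255).
    """
    height, width = len(frame), len(frame[0])
    # Step 1: mark pixels that differ noticeably from the background frame.
    raw = [[1 if abs(frame[y][x] - background[y][x]) > threshold else 0
            for x in range(width)] for y in range(height)]
    # Step 2 (assumed filter): keep a pixel only if at least 5 pixels in its
    # 3x3 neighbourhood were marked, which removes isolated noisy pixels.
    cleaned = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            votes = sum(raw[j][i]
                        for j in range(max(0, y - 1), min(height, y + 2))
                        for i in range(max(0, x - 1), min(width, x + 2)))
            cleaned[y][x] = 1 if votes >= 5 else 0
    return cleaned
```

Because the neighbourhood is clipped at the image border, border pixels can collect fewer than nine votes and are therefore suppressed more readily than interior pixels.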

19.
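Face recognition is a key technology in the field of biometric recognition and has long received wide attention from researchers. The video face recognition task specifically refers to extracting the key information about a face from a video in order to identify the person. Compared with image-based face recognition, the face variation patterns in video data are more diverse and video frames differ considerably from one another, so how to extract the key facial features from long and complex videos has become the current research focus. Taking video face recognition technology as...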

20.
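In wireless transmission environments without quality-of-service guarantees, bitstreams compressed according to traditional video coding standards are easily corrupted, and this is particularly pronounced in I-frames. Traditional video compression standards therefore need to be improved to increase their inherent error resilience in wireless environments. To address this problem, an algorithm is proposed for improving the error resilience of I-frames in a video stream. The algorithm first analyzes how sensitive the different parts of the video stream are to the error propagation caused by bit errors occurring within them. That analysis, together with the differing importance of bitstream information in I-frame reconstruction, then serves as the principle for deciding which information in the I-frame bitstream is important. Finally, the important information in the I-frames is protected, extracted, and concentrated to reduce the extent of intra-frame error propagation. Simulation results for an H.263+ baseline test bitstream over an additive white Gaussian noise wireless channel show that, at the cost of a small amount of redundant information and computational complexity, the algorithm considerably improves both subjective and objective quality over the reference algorithm.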
