Similar Documents
20 similar documents found (search time: 31 ms)
1.
We present a novel and practical way to integrate computer-vision techniques into low bit-rate coding systems for video teleconferencing. Our focus is to locate and track the faces and selected facial features of people in typical head-and-shoulders video sequences, and to exploit the location information in a 'classical' video coding/decoding system. The motivation is to enable the system to encode different image areas selectively and to produce perceptually pleasing coded images in which faces are sharper. We refer to this approach, a mix of classical waveform coding and model-based coding, as model-assisted coding. We propose two fully automatic algorithms that, respectively, detect a head outline and identify an 'eyes-nose-mouth' region, both from downsampled, binary-thresholded edge images. The algorithms operate accurately and robustly, even under significant head rotation or partial occlusion by moving objects. We show how information about face and facial-feature location can be exploited by low bit-rate waveform-based video coders. In particular, we describe a method of object-selective quantizer control in a standard coding system based on the motion-compensated discrete cosine transform (CCITT Recommendation H.261). The approach rests on two novel algorithms, buffer rate modulation and buffer size modulation. By forcing the rate-control algorithm to transfer a fraction of the total available bit-rate from the coding of the non-facial area to that of the facial area, the coder produces images with better-rendered facial features: coding artefacts in the facial area are less pronounced and eye contact is preserved. The improvement was found to be perceptually significant on video sequences coded at the ISDN rate of 64 kbps, with 48 kbps for the input (color) video signal in QCIF format.
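The buffer-modulation idea, shifting part of the bit budget from the background to the face area, can be sketched in a few lines. This is only an illustrative toy with a linear bits-per-area model; the function name, the `transfer` fraction and the model itself are assumptions, not the H.261 rate-control equations from the paper.

```python
def modulate_rate(total_bits, face_area_fraction, transfer=0.3):
    """Split a frame's bit budget between facial and non-facial regions.

    A fraction `transfer` of the bits nominally owned by the background
    is moved to the face area, which in a real coder would translate
    into a finer quantizer step for facial macroblocks. All names and
    the linear model are illustrative assumptions.
    """
    face_bits = total_bits * face_area_fraction
    bg_bits = total_bits - face_bits
    moved = transfer * bg_bits
    return face_bits + moved, bg_bits - moved
```

For a 48 kbps QCIF frame budget with a face covering 20% of the frame, the face region ends up with more than its proportional share while the total budget is preserved.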

2.
3.
Scalable low bit-rate video coding is vital for the transmission of video signals over wireless channels, and a scalable model-based video coding scheme is proposed in this paper to achieve it. The paper mainly addresses automatic scalable face-model design. First, a robust and adaptive face segmentation method is proposed, based on piecewise skin-colour distributions; 43 million skin pixels from 900 images are used to train the skin-colour model, which identifies skin-colour pixels reliably under different lighting conditions. Next, reliable algorithms are proposed for detecting the eyes, mouth and chin, which are used to verify face candidates. Then, based on the detected facial features and the muscular distribution of the human face, a heuristic scalable face model is designed to represent the rigid and non-rigid motion of the head and facial features. A novel motion-estimation algorithm estimates the object-model motion hierarchically. Experimental results illustrate the performance of the proposed facial-feature detection algorithms and the accuracy of the designed scalable face model in representing face motion.
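As a rough stand-in for the trained piecewise skin-colour model, a fixed-threshold Cb/Cr classifier shows the shape of the per-pixel decision. The threshold box (Cb in [77, 127], Cr in [133, 173]) is a common literature choice, not the distribution learned from the 43 million training pixels.

```python
import numpy as np

def skin_mask(rgb):
    """Classify pixels as skin by simple Cb/Cr box thresholds.

    rgb: (..., 3) array of R, G, B values in [0, 255]. Returns a
    boolean mask. A crude fixed-threshold stand-in for the paper's
    learned piecewise skin-colour distributions.
    """
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Standard RGB -> YCbCr chrominance conversion (ITU-R BT.601)
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)
```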

4.
We propose a key-frame extraction algorithm for action videos that combines human pose estimation with the tracking of specific body parts. First, single-frame pose estimation based on an articulated, flexible-parts human body model is made more accurate by enforcing temporal continuity for uncertain body parts; dimensionality reduction then yields discriminative motion feature vectors with strong local topological expressiveness, and a set of candidate key frames is selected by extremum detection. The ISODATA dynamic clustering algorithm, refined with initial-cluster-centre optimisation and semantics-based key-frame-set enhancement, then determines the final key frames. Experiments show that the algorithm achieves high key-frame extraction precision and recall and supports semantics-based extraction; the extracted key frames can be used for motion-video compression and for annotation and review.

5.
A simultaneous face motion tracking and expression recognition algorithm (cited by: 1)
於俊, 汪增福, 李睿. 《电子学报》(Acta Electronica Sinica), 2015, 43(2): 371-376
To address facial expression recognition from a single video with a dynamically changing background, we propose an algorithm that tracks face motion and recognises expressions simultaneously, and build a real-time system on top of it. The system works as follows: first, 3-D face motion is tracked within a particle-filter framework that combines an online appearance model with a cylindrical geometric model; static expression information is then extracted using physiological knowledge, and dynamic expression information is extracted via manifold learning; finally, during tracking, the static and dynamic information are combined to recognise the expression. Experiments show that the system performs well overall under large pose variations and rich expressions.

6.
The face detection, tracking and landmark localisation system described here was built on the VC++ 6.0 platform with OpenCV as the development tool, which substantially shortened development time. First, faces are detected with the AdaBoost algorithm; a sensible choice of feature templates enables real-time detection. Second, the tracking module uses the CAMShift algorithm: face coordinates produced by the detection module are passed to the tracker for automatic real-time tracking, multiple CAMShift trackers are instantiated to follow several faces at once, and face occlusion is handled effectively. Finally, real-time facial landmark localisation is performed with the ASM (active shape model) algorithm. Experimental results show that the system's real-time detection, tracking and landmark localisation are effective, and that it can serve as a foundation for expression analysis, affective computing and video face recognition.
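The core of the CAMShift tracker used here is a mean-shift iteration that moves the search window onto the mode of a face probability map. A minimal fixed-window-size sketch follows; full CAMShift also adapts the window size and orientation from image moments, which this sketch omits.

```python
import numpy as np

def mean_shift(prob, window, n_iter=10):
    """Mean-shift a tracking window onto the mode of a probability map.

    prob: (H, W) array of per-pixel target probabilities.
    window: (x, y, w, h) with integer top-left corner and size.
    Size stays fixed; only the window position is updated.
    """
    x, y, w, h = window
    H, W = prob.shape
    for _ in range(n_iter):
        x0, x1 = max(0, x), min(W, x + w)
        y0, y1 = max(0, y), min(H, y + h)
        patch = prob[y0:y1, x0:x1]
        m00 = patch.sum()
        if m00 == 0:
            break                     # no mass inside the window
        ys, xs = np.mgrid[y0:y1, x0:x1]
        cx = (patch * xs).sum() / m00  # centroid of the mass
        cy = (patch * ys).sum() / m00
        nx, ny = int(round(cx - w / 2)), int(round(cy - h / 2))
        if (nx, ny) == (x, y):
            break                     # converged
        x, y = nx, ny
    return x, y, w, h
```

Starting a window slightly off a bright blob pulls it onto the blob within a few iterations, which is exactly how the tracker re-centres on the face each frame.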

7.
With video compression standards such as MPEG-4, a transmission error occurs on a video-packet basis rather than on a macroblock basis. In this context, we propose a semantic error-prioritization method that determines the size of a video packet based on the importance of its contents. Video packets are kept short for important areas such as the facial area, in order to reduce the possibility of error accumulation. To facilitate the semantic error prioritization, an efficient hardware algorithm for face tracking is proposed. The increase in hardware complexity is minimal because the motion-estimation engine is efficiently re-used for face tracking. Experimental results demonstrate that the facial area is well protected by the proposed scheme.

8.
A new algorithm for moving-object extraction and tracking in object-based video coding (cited by: 4)
Automatic, fast video object extraction and tracking is a key technology in object-based video coding. This paper proposes a new algorithm for extracting and tracking moving objects. First, a binary motion-mask image is obtained from multi-frame motion information using a higher-order-statistics detection method; an improved watershed algorithm is then proposed to segment the moving region and its surroundings. Projecting the two results onto each other yields the final moving object. Finally, a new tracking algorithm is proposed that follows the object effectively. Experimental results demonstrate the effectiveness of the proposed algorithms.

9.
A real-time algorithm for affine-structure-based video compression of facial images is presented. The face undergoing motion is segmented and triangulated to yield a set of control points. The control points generated by triangulation are tracked across a few frames using an intensity-based correlation technique. For accurate motion and structure estimation, a Kalman-filter-based algorithm is used to track features on the facial image. The structure information of the control points is transmitted only during the bootstrapping stage; after that, only motion information is transmitted to the decoder, which reduces the number of motion parameters associated with the control points in each frame. The local motion of the eyes and lips is captured using local 2-D affine transformations. For real-time implementation, a quad-tree-based search technique is adopted to solve the local correlation. Any remaining reconstruction error is handled by predictive encoding. Results on real image sequences demonstrate the applicability of the method.
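The local 2-D affine warps that capture eye and lip motion can be estimated from control-point correspondences by ordinary least squares. This is the generic normal-equation fit, not the authors' exact estimator.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2-D affine transform mapping src points to dst.

    src, dst: (N, 2) arrays of corresponding points, N >= 3 and not
    collinear. Returns a 2x3 affine matrix M with dst ~ src @ A.T + t.
    """
    n = src.shape[0]
    A = np.hstack([src, np.ones((n, 1))])        # rows [x, y, 1]
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return params.T                              # 2x3: [A | t]

def apply_affine(M, pts):
    """Apply a 2x3 affine matrix to (N, 2) points."""
    return pts @ M[:, :2].T + M[:, 2]
```

With four or more non-degenerate control points the fit recovers the underlying warp exactly; with noisy tracked points it returns the least-squares best warp.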

10.
This paper presents a video moving-object detection and tracking system centred on a Nios II processor. Video is captured by a CMOS image sensor, moving objects are detected with frame differencing, a centroid tracking algorithm follows the target, and the moving object is finally displayed on a VGA monitor. Experimental results show that the system achieves the desired detection and tracking of moving targets.
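In software, the system's two core steps, frame differencing and centroid tracking, reduce to a few array operations. The threshold value here is an arbitrary illustrative choice, not the hardware design's parameter.

```python
import numpy as np

def detect_centroid(prev, curr, thresh=25):
    """Frame differencing followed by centroid computation.

    Thresholds |curr - prev| into a binary motion mask, then returns
    the (x, y) centroid of the mask, or None if nothing moved.
    """
    diff = np.abs(curr.astype(np.int32) - prev.astype(np.int32))
    mask = diff > thresh
    if not mask.any():
        return None
    ys, xs = np.nonzero(mask)
    return float(xs.mean()), float(ys.mean())
```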

11.
The authors describe a deformable model of the human iris, which forms part of a system for accurate offline measurement of binocular eye movements, particularly cyclotorsion (torsion), from video image sequences. At least two existing systems measure torsion from infrared video images by pupil tracking followed by cross-correlation of bandpass-filtered iris sectors. Unfortunately, pupil expansion and contraction reduce the accuracy of this method; in addition, infrared iris images typically contain very little texture, so correlation can be unreliable. A five-parameter deformable model was therefore developed for iris images taken in visible light. The model can translate (horizontal and vertical eye motion), rotate (torsion) and scale both uniformly and radially (pupil changes). A series of software simulations and hardware tests suggests that torsion measurements obtained with the model are repeatable and accurate to within 0.1°. This performance is illustrated by analysing binocular torsion during fixation on a static target; the results match previously published data from other equipment.

12.
Existing methods for measuring heart rate from face video struggle to estimate it accurately under the motion encountered in real scenes. This paper therefore proposes a new non-contact heart-rate estimation method that suppresses motion interference. First, discriminative response map fitting and the KLT tracking algorithm remove the interference caused by rigid head motion. A two-step heart-rate estimate is then computed from motion-robust chrominance features, with a spatial-gradient factor introduced to weight the spatial and frequency domains and suppress non-rigid motion. Finally, the average heart rate and the signal waveform are obtained by fusing different facial regions, yielding an accurate estimate. Experiments show that the proposed method clearly outperforms other face-video heart-rate estimation methods: it improves the agreement between the recovered waveform and the true pulse waveform, and further raises the accuracy and robustness of heart-rate estimation.
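The motion-robust chrominance feature described above is, by its description, in the spirit of the well-known CHROM combination of normalised colour channels. A minimal sketch under that assumption; the mixing weights below are the generic CHROM recipe, not values taken from the paper.

```python
import numpy as np

def chrom_signal(rgb_trace):
    """Motion-robust chrominance combination for remote pulse estimation.

    rgb_trace: (T, 3) mean R, G, B of a facial ROI per frame. Computes
    CHROM-style signals X = 3R - 2G and Y = 1.5R + G - 1.5B on
    temporally normalised channels and mixes them to cancel motion.
    """
    c = rgb_trace / rgb_trace.mean(axis=0)    # normalise each channel
    x = 3 * c[:, 0] - 2 * c[:, 1]
    y = 1.5 * c[:, 0] + c[:, 1] - 1.5 * c[:, 2]
    sy = y.std()
    alpha = x.std() / sy if sy > 0 else 0.0   # balance the two signals
    return x - alpha * y
```

The resulting 1-D signal is what a frequency-domain step would then analyse for the dominant pulse frequency.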

13.
This paper proposes a video stabilisation algorithm based on selected feature trajectories. First, feature points are extracted with an improved Harris corner detector, and foreground feature points are removed by K-means clustering. Valid feature trajectories are then obtained by using the spatial motion consistency of inter-frame feature points to reduce false matches, and their temporal motion similarity to achieve long-term tracking. Finally, an objective function that accounts for both trajectory smoothness and video-quality degradation is used to compute a set of geometric transforms for the video sequence, smoothing the feature trajectories and stabilising the video. Blank regions produced by image warping are eroded under the guidance of optical flow between the defined region of the current frame and a reference frame, and any pixels still in the blank region are filled by image stitching. Simulations show that the blank area left by the proposed method is only about 33% of that of Matsushita's method; the method remains effective for dynamic, complex scenes and multiple large moving foregrounds, and produces videos with complete content, both improving visual quality and reducing the costly boundary-repair work.
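The trajectory-smoothing objective, a fidelity term (limiting quality-degrading warps) plus a smoothness penalty, has a simple closed form in the 1-D case. The second-difference penalty and the weight `lam` below are illustrative, not the paper's exact objective.

```python
import numpy as np

def smooth_trajectory(traj, lam=10.0):
    """Smooth a 1-D trajectory by minimising
    ||s - t||^2 + lam * ||D2 s||^2, where D2 is the second-difference
    operator. The fidelity term keeps the smoothed path close to the
    original (small warps); the penalty removes jitter.
    """
    n = len(traj)
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]      # second difference row
    A = np.eye(n) + lam * D.T @ D             # normal equations
    return np.linalg.solve(A, traj)
```

Because the optimum cannot have a larger objective value than the unsmoothed path itself, the smoothed trajectory is guaranteed to have no more second-difference energy (jitter) than the input.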

14.
To detect moving objects when the imaging platform itself is moving, this paper proposes a detection algorithm that proceeds from sparse feature-point motion-field estimation to motion classification. First, a sparse image motion field is recovered by fast feature-point detection and tracking. Feature points are then grouped into motion patterns according to the motion consistency between them; the scene-image reconstruction error is computed for each resulting group, and the group with the smallest reconstruction error is discarded, leaving the foreground targets detected. Simulations verify the algorithm's effectiveness for detecting moving objects in complex scenes.

15.
A robust stabilisation algorithm for jittery video (cited by: 1)
This paper designs a robust, feature-matching-based stabilisation algorithm for unstable video captured by mobile cameras. The algorithm first incorporates a brightness-adaptive model into the traditional KLT method to achieve robust feature matching. The feature set is then validated using feature-error analysis and a motion-consistency principle to improve feature reliability, and a jitter-detection method based on the reciprocating character of shake motion is proposed to avoid stabilisation errors when the video is not actually jittery. Finally, images are stabilised with an affine motion model derived from the correspondences between matched feature regions. Experimental results show that the algorithm is robust to foreground object motion and to abrupt changes in ambient lighting.

16.
For video-based expression recognition, static features cannot effectively describe how facial regions change along the time axis. This paper proposes an expression recognition method that fuses dynamic texture information with motion information. Drawing on the LBP-TOP principle, a spatio-temporal Weber local descriptor (STWLD) with spatio-temporal descriptive power is proposed to extract dynamic texture, while block-based histograms of optical flow (BHOF) describe motion; an SVM then classifies expressions from the fused texture and motion information. Cross-validation experiments on the CK+ and MMI expression databases show that the method outperforms approaches based on a single feature, and comparisons with other related methods confirm its superiority.
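The LBP codes that LBP-TOP, and hence the STWLD descriptor, extends into the spatio-temporal planes are computed per pixel from a 3x3 neighbourhood. A basic single-plane sketch:

```python
import numpy as np

def lbp_3x3(img):
    """Basic 3x3 local binary pattern codes.

    Each interior pixel gets an 8-bit code: one bit per neighbour,
    set when the neighbour is >= the centre. Border pixels are skipped.
    """
    c = img[1:-1, 1:-1]
    # 8 neighbours, ordered clockwise from the top-left
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy:img.shape[0] - 1 + dy,
                 1 + dx:img.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code
```

LBP-TOP simply repeats this computation on the XT and YT planes of the video volume and concatenates the three histograms; STWLD, per the abstract, replaces the binary comparison with a Weber-style differential excitation.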

17.
Tracking people in video is complex, and traditional algorithms have particular difficulty identifying and tracking the upper- and lower-limb regions against cluttered backgrounds. Building on the traditional Kalman-filter tracker, this paper proposes a discrete Kalman-filter human-tracking algorithm with a variable measurement covariance: after the measurement covariance is initialised, a new estimate of it is computed recursively from newly acquired observations, and tracking is performed with the discrete Kalman filter. On real video the algorithm tracks well, and it performs well at distinguishing and tracking the upper limbs, lower limbs and whole body. Compared with the traditional Kalman filter, the algorithm does not lose the tracked target, and its tracking speed keeps pace with human walking speed, roughly 1.5 m/s, making it well suited to tracking and analysing human behaviour in video.
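The variable-measurement-covariance idea can be sketched with a 1-D discrete Kalman filter whose R is re-estimated recursively from the innovations. The constant-position model, the EWMA update rule and all constants below are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def kalman_track(zs, q=1e-3, r0=1.0, alpha=0.05):
    """Discrete Kalman filter with recursively re-estimated R.

    zs: 1-D array of measurements. Uses a constant-position model;
    since E[innovation^2] = p + r, each innovation gives a sample
    estimate of r, blended in with an exponential moving average.
    """
    x, p, r = float(zs[0]), 1.0, r0
    out = [x]
    for z in zs[1:]:
        p = p + q                                  # predict
        innov = z - x
        r = (1 - alpha) * r + alpha * (innov ** 2 - p)  # update R
        r = max(r, 1e-6)                           # keep R positive
        k = p / (p + r)                            # Kalman gain
        x = x + k * innov                          # correct
        p = (1 - k) * p
        out.append(x)
    return np.array(out)
```

Because the gain shrinks as the estimated measurement noise grows, the filter automatically smooths more heavily when observations get unreliable.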

18.
夏爱明, 伍雪冬. 《红外技术》(Infrared Technology), 2021, 43(5): 429-436
Traditional kernelized-correlation-filter visual trackers suffer from low accuracy under fast motion, background clutter and motion blur, and cannot handle scale changes. To address this, a real-time target-tracking algorithm based on context awareness and scale adaptation is proposed. Building on the kernelized-correlation-filter framework, it introduces context-aware and scale-adaptive methods, adding background information and handling changes in target scale. First, fHOG (fusion his...

19.
An air-to-ground target tracking algorithm based on particle filtering (cited by: 4)
宋策, 张葆, 尹传历, 王超. 《光电子·激光》(Journal of Optoelectronics·Laser), 2013, (10): 2017-2023
Large, abrupt changes in target speed cause tracking failures in air-to-ground target tracking. Based on the two-step (TS) dynamic-model framework proposed by Kristan et al., this paper analyses and models the motion characteristics of air-to-ground targets and improves the conservative component of the TS model to accommodate acceleration, proposing a two-step-with-acceleration (TSA) dynamic model suited to describing large speed variations as the dynamic model of a particle-filter (PF) tracker. The model predicts particle states accurately, so the target can be tracked robustly with relatively few particles. On air-to-ground tracking test videos, the algorithm tracks targets undergoing large speed changes stably, with a correct-tracking rate of 92%, and processes about 29 frame/s for targets of roughly 25 pixel × 30 pixel. The algorithm thus offers good robustness and real-time performance.
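A constant-acceleration prediction step shows why an acceleration-aware dynamic model lets a particle filter follow targets that change speed sharply. The random-acceleration noise model and its parameters below are illustrative, not the paper's TSA model.

```python
import numpy as np

def predict_particles(particles, dt=1.0, accel_std=2.0, rng=None):
    """Particle-filter prediction under a random-acceleration model.

    particles: (N, 4) array of states [x, y, vx, vy]. Each particle
    receives an independent random acceleration, updating both its
    position (by 0.5*a*dt^2) and its velocity (by a*dt), so the cloud
    can spread along plausible speed changes.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    a = rng.normal(0.0, accel_std, size=(particles.shape[0], 2))
    out = particles.copy()
    out[:, 0:2] += particles[:, 2:4] * dt + 0.5 * a * dt ** 2
    out[:, 2:4] += a * dt
    return out
```

Under a constant-velocity model the velocity components never move, so an accelerating target escapes the particle cloud; here the cloud's velocity spread grows each step instead.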

20.
Intelligently tracking objects with varied shapes, colors, lighting conditions and backgrounds is extremely useful in many HCI applications, such as human body motion capture, hand-gesture recognition, and virtual-reality (VR) games. However, accurately tracking different objects in uncontrolled environments is a tough challenge owing to possibly dynamic object parts, varied lighting conditions and sophisticated backgrounds. In this work, we propose a novel semantically-aware object-tracking framework whose key is a weakly-supervised learning paradigm that optimally transfers video-level semantic tags into various regions. More specifically, given a set of training video clips, each associated with multiple video-level semantic tags, we first propose a weakly-supervised learning algorithm to transfer the semantic tags into the video regions. Its core is a MIL-based (Zhong et al., 2020) [1] manifold-embedding algorithm that maps the video regions into a semantic space in which the video-level semantic tags are well encoded. Afterwards, each video region is represented by its semantic feature combined with its appearance feature, and a multi-view learning algorithm is designed to fuse these two types of features optimally. Based on the fused feature, we learn a probabilistic Gaussian mixture model to predict the target probability of each candidate window, and the window with the maximal probability is output as the tracking result. Comprehensive comparisons on a challenging pedestrian-tracking task, as well as on human hand-gesture recognition, demonstrate the effectiveness of our method. Moreover, visualized tracking results show that non-rigid objects with moderate occlusions are well localized by our method.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号