1.
We present a novel and practical way to integrate techniques from computer vision into low bit-rate coding systems for video teleconferencing applications. Our focus is to locate and track the faces and selected facial features of persons in typical head-and-shoulders video sequences, and to exploit the location information in a ‘classical’ video coding/decoding system. The motivation is to enable the system to encode various image areas selectively and to produce perceptually pleasing coded images in which faces are sharper. We refer to this approach, a mix of classical waveform coding and model-based coding, as model-assisted coding. We propose two fully automatic algorithms which, respectively, detect a head outline and identify an ‘eyes-nose-mouth’ region, both from downsampled binary thresholded edge images. The algorithms operate accurately and robustly, even in cases of significant head rotation or partial occlusion by moving objects. We show how the information about face and facial feature location can be advantageously exploited by low bit-rate waveform-based video coders. In particular, we describe a method of object-selective quantizer control in a standard coding system based on the motion-compensated discrete cosine transform, CCITT's Recommendation H.261. The approach is based on two novel algorithms, namely buffer rate modulation and buffer size modulation. By forcing the rate control algorithm to transfer a fraction of the total available bit-rate from the coding of the non-facial area to that of the facial area, the coder produces images with better-rendered facial features, i.e. coding artefacts in the facial area are less pronounced and eye contact is preserved. The improvement was found to be perceptually significant on video sequences coded at the ISDN rate of 64 kbps, with 48 kbps for the input (color) video signal in QCIF format.
2.
3.
《Signal Processing: Image Communication》2004,19(5):421-436
Scalable low bit-rate video coding is vital for the transmission of video signals over wireless channels. This paper proposes a scalable model-based video coding scheme for this purpose and mainly addresses automatic scalable face model design. First, a robust and adaptive face segmentation method is proposed, based on piecewise skin-colour distributions. 43 million skin pixels from 900 images are used to train the skin-colour model, which can identify skin-colour pixels reliably under different lighting conditions. Next, reliable algorithms are proposed for detecting the eyes, mouth and chin, which are used to verify the face candidates. Then, based on the detected facial features and the distribution of human facial muscles, a heuristic scalable face model is designed to represent the rigid and non-rigid motion of the head and facial features. A novel motion estimation algorithm is proposed to estimate the object model motion hierarchically. Experimental results are provided to illustrate the performance of the proposed algorithms for facial feature detection and the accuracy of the designed scalable face model for representing face motion.
4.
5.
6.
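The face detection, tracking, and feature point localization system described here was developed on the VC++ 6.0 platform using OpenCV as the development tool, which effectively shortened development time. First, the system uses the AdaBoost algorithm for face detection and achieves real-time detection through a suitable choice of feature templates. Second, the face tracking module uses the CamShift algorithm: the face coordinates produced by the detection module are passed to the tracking module to achieve automatic real-time face tracking, and multiple CamShift trackers are created to track multiple faces, which effectively handles face occlusion. Finally, real-time facial feature point localization is achieved with the ASM (active shape model) algorithm. Experimental results show that the system's real-time face detection, tracking, and feature point localization perform well and can serve as a basis for developing expression analysis, affective computing, and video face recognition applications.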
7.
With video compression standards such as MPEG‐4, a transmission error occurs on a video‐packet basis rather than on a macroblock basis. In this context, we propose a semantic error prioritization method that determines the size of a video packet based on the importance of its contents. A video packet is made short for an important area such as a facial area in order to reduce the possibility of error accumulation. To facilitate the semantic error prioritization, an efficient hardware algorithm for face tracking is proposed. The increase in hardware complexity is minimal because a motion estimation engine is efficiently re‐used for face tracking. Experimental results demonstrate that the facial area is well protected with the proposed scheme.
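The packet-sizing idea in this abstract can be illustrated with a small sketch. It is not the authors' method: the function name `packetize`, the per-packet bit limits, and the rule that a packet containing any facial macroblock takes the tighter limit are all assumptions made for illustration.

```python
def packetize(macroblocks, face_limit=200, background_limit=800):
    """Group consecutive macroblocks into packets.

    macroblocks: list of (bits, is_face) tuples (hypothetical input format).
    Packets that contain a facial macroblock are capped at face_limit bits,
    other packets at background_limit bits (both limits are assumed values).
    Returns a list of packets, each a list of macroblock indices.
    """
    packets = []
    current, current_bits, current_limit = [], 0, background_limit
    for i, (bits, is_face) in enumerate(macroblocks):
        limit = face_limit if is_face else background_limit
        # An open packet keeps the tightest limit of the macroblocks it holds.
        new_limit = min(current_limit, limit) if current else limit
        if current and current_bits + bits > new_limit:
            # Close the open packet and start a new one with this macroblock.
            packets.append(current)
            current, current_bits, new_limit = [], 0, limit
        current.append(i)
        current_bits += bits
        current_limit = new_limit
    if current:
        packets.append(current)
    return packets
```

With these assumed limits, facial macroblocks end up in shorter packets, so a single lost packet affects fewer of them.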
8.
9.
Chatterjee S. Banerjee S. Biswas K.K. 《Vision, Image and Signal Processing, IEE Proceedings -》1999,146(4):211-221
A real-time algorithm for affine-structure-based video compression of facial images is presented. The face undergoing motion is segmented and triangulated to yield a set of control points. The control points generated by triangulation are tracked across a few frames using an intensity-based correlation technique. For accurate motion and structure estimation, a Kalman-filter-based algorithm is used to track features on the facial image. The structure information of the control points is transmitted only during the bootstrapping stage. After that, only the motion information is transmitted to the decoder. This reduces the number of motion parameters associated with control points in each frame. The local motion of the eyes and lips is captured using local 2-D affine transformations. For real-time implementation, a quad-tree-based search technique is adopted to solve local correlation. Any remaining reconstruction error is accounted for using predictive encoding. Results on real image sequences demonstrate the applicability of the method.
10.
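This paper presents a video moving-object detection and tracking system built around the Nios II processor. A CMOS image sensor captures the video, the inter-frame difference method detects moving objects, a centroid tracking algorithm tracks the target, and the moving objects in the video are finally shown on a VGA display. Experimental results show that the system achieves the desired detection and tracking results.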
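Inter-frame differencing followed by centroid tracking can be sketched in a few lines. The sketch below is a minimal software illustration and not the Nios II implementation: it assumes grayscale frames stored as lists of rows, and the function names and the threshold value of 30 are hypothetical.

```python
def frame_difference(prev, curr, threshold=30):
    """Binary motion mask: 1 where |curr - prev| exceeds the threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def centroid(mask):
    """Return the (row, col) centroid of nonzero pixels, or None if there are none."""
    count = row_sum = col_sum = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                count += 1
                row_sum += r
                col_sum += c
    if count == 0:
        return None
    return (row_sum / count, col_sum / count)
```

Calling `centroid(frame_difference(prev, curr))` on each successive frame pair gives the position of the moving region; when nothing moves, the result is `None`.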
11.
Ivins J.P. Porrill J. Frisby J.P. 《Vision, Image and Signal Processing, IEE Proceedings -》1998,145(3):213-220
The authors describe a deformable model of the human iris, which forms part of a system for accurate offline measurement of binocular eye movements, particularly cyclotorsion (torsion), from video image sequences. At least two existing systems measure torsion from infrared video images by pupil tracking followed by cross-correlation of bandpass-filtered iris sectors. Unfortunately, pupil expansion and contraction reduce the accuracy of this method. In addition, infrared iris images typically contain very little texture, so correlation can be unreliable. A five-parameter deformable model was therefore developed for images taken in visible light. This model can translate (horizontal and vertical eye motion), rotate (torsion) and scale both uniformly and radially (pupil changes). A series of software simulations and hardware tests suggest that torsion measurements obtained with the model are repeatable and accurate to within 0.1°. This performance is illustrated by analysing binocular torsion during fixation on a static target; the results match previously published data from other equipment.
12.
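Existing face-video heart rate detection methods struggle to estimate heart rate accurately in real-world scenes because of motion interference. To address this, this paper proposes a new non-contact heart rate estimation method that suppresses motion interference. First, discriminative response map fitting and the KLT tracking algorithm are used to eliminate interference from rigid face motion. Then, motion-robust chrominance features are used for two-step heart rate estimation, and a spatial gradient factor is introduced to adjust the weights of the spatial and frequency domains, suppressing interference from non-rigid motion. Finally, the average heart rate value and the signal waveform fused from different face regions are obtained, achieving accurate heart rate estimation. Experimental results show that the proposed method has clear advantages over other face-video-based heart rate estimation methods: it improves the consistency between the signal waveform and the true pulse waveform and further improves the accuracy and robustness of heart rate estimation.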
13.
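This paper proposes a video stabilization algorithm based on optimally selected feature trajectories. First, an improved Harris corner detector extracts feature points, and the K-Means clustering algorithm removes foreground feature points. Then, the spatial motion consistency of inter-frame feature points is used to reduce mismatches, and temporal motion similarity is used to achieve long-term tracking, yielding valid feature trajectories. Finally, an objective function that includes both feature trajectory smoothness and the degree of video quality degradation is built to compute the set of geometric transformations for the video sequence, which smooths the feature trajectories and yields a stable video. Blank areas produced by image warping are eroded under the guidance of the optical flow between the defined region of the current frame and the reference frame, and pixels still in the blank area are filled by image mosaicking. Simulations show that the blank area of videos stabilized by the proposed method is only about 33% of that of Matsushita's method. The method is effective for dynamic complex scenes and for multiple large moving foreground objects and can generate videos with complete content, which both improves the visual quality of the video and reduces the time-consuming boundary repair task.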
14.
15.
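A robust stabilization algorithm for shaky video (Total citations: 1; self-citations: 0; citations by others: 1)
A robust feature-matching-based stabilization algorithm is designed for unstable video sequences captured by mobile cameras. The algorithm first incorporates an adaptive brightness model into the traditional KLT method to achieve robust feature matching. It then validates the feature set using feature error analysis and the motion consistency principle to improve feature reliability, and proposes a jitter detection method based on the back-and-forth nature of the motion to avoid stabilization errors when the video has no jitter. Finally, it stabilizes the images with an affine motion model according to the correspondences between matched feature regions. Experimental results show that the algorithm is robust to foreground object motion and sudden changes in ambient light.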
16.
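For video expression recognition, static features cannot effectively describe how the face region changes dynamically along the time axis. To address this limitation, this paper proposes an expression recognition method that fuses dynamic texture information and motion information. Drawing on the LBP-TOP principle, a spatio-temporal Weber local descriptor (STWLD) with spatio-temporal descriptive power is proposed to extract dynamic texture information, while a block-based histogram of optical flow (BHOF) describes motion information. Finally, an SVM classifies expressions from the fused texture and motion information. Cross experiments on the CK+ and MMI expression databases show that the proposed method outperforms recognition methods based on a single feature, and comparison experiments with other related methods also confirm its advantages.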
17.
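Human tracking in video is complex, and traditional algorithms run into problems, especially when recognizing and tracking the upper and lower limb regions against complex backgrounds. Building on the traditional Kalman filter tracking algorithm, this paper proposes a discrete Kalman filter human recognition algorithm based on a variable measurement covariance. The measurement covariance is initialized, a new estimate of the measurement covariance is computed recursively from newly acquired observations, and tracking is performed by a discrete Kalman filter. On real video, the algorithm tracks well and performs well at distinguishing and tracking the upper limbs, the lower limbs, and the whole body. Compared with the traditional Kalman filter algorithm, it does not lose the tracked target, and its tracking speed is moderate and consistent with the walking speed of the person, roughly 1.5 m/s, which makes it particularly suitable for tracking and analyzing human behavior in video.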
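The abstract does not give the recursion used to update the measurement covariance, so the following is only a sketch of the general idea under stated assumptions. It uses a scalar random-walk state model and an innovation-based update of the measurement variance `r` with exponential smoothing; the function name and the parameters `q`, `r0`, and `alpha` are hypothetical and are not taken from the paper.

```python
def adaptive_kalman_1d(measurements, q=1e-3, r0=1.0, alpha=0.1):
    """Scalar Kalman filter whose measurement variance r is re-estimated each step.

    q: process noise variance (assumed), r0: initial measurement variance,
    alpha: smoothing factor for the recursive update of r (assumed rule).
    Returns (list of state estimates, final value of r).
    """
    x = measurements[0]  # initial state estimate
    p = 1.0              # initial estimate variance
    r = r0
    estimates = []
    for z in measurements:
        p_pred = p + q   # predict step for a random-walk model
        y = z - x        # innovation
        # Recursive update of r from the new observation: the innovation
        # variance is roughly p_pred + r, so y*y - p_pred serves as a sample
        # of r. It is floored to keep r strictly positive.
        r = (1 - alpha) * r + alpha * max(y * y - p_pred, 1e-6)
        k = p_pred / (p_pred + r)  # Kalman gain, always between 0 and 1
        x = x + k * y
        p = (1 - k) * p_pred
        estimates.append(x)
    return estimates, r
```

When the observations agree with the prediction, `r` shrinks and the filter follows new measurements more closely; large innovations raise `r` and make the filter more conservative.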
18.
19.
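Air-to-ground target tracking algorithm based on particle filtering (Total citations: 4; self-citations: 4; citations by others: 0)
To address tracking failures caused by large changes in target speed in air-to-ground target tracking, this paper builds on the two-stage (TS) dynamic model framework proposed by Kristan et al. It analyzes and models the characteristics of target motion in air-to-ground tracking, improves the conservative model in the TS model to accommodate accelerated motion, and proposes an accelerated two-stage (TSA) dynamic model, suited to describing large speed changes, as the dynamic model of the particle filter (PF) tracking algorithm. This enables accurate prediction of particle states, so that robust tracking is achieved with fewer particles. Tests on air-to-ground tracking videos show that the algorithm stably tracks targets with large speed changes, with a correct tracking rate of 92% and a processing frame rate of 29 frames/s for targets of about 25 × 30 pixels. The algorithm has good robustness and real-time performance.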
20.
Intelligently tracking objects with varied shapes, colors, lighting conditions, and backgrounds is extremely useful in many HCI applications, such as human body motion capture, hand gesture recognition, and virtual reality (VR) games. However, accurately tracking different objects in uncontrolled environments is a tough challenge because of possibly dynamic object parts, varied lighting conditions, and sophisticated backgrounds. In this work, we propose a novel semantically-aware object tracking framework, whose key component is a weakly-supervised learning paradigm that optimally transfers video-level semantic tags into various regions. More specifically, given a set of training video clips, each of which is associated with multiple video-level semantic tags, we first propose a weakly-supervised learning algorithm to transfer the semantic tags into various video regions. The key is a MIL (Zhong et al., 2020) [1]-based manifold embedding algorithm that maps all video regions into a semantic space in which the video-level semantic tags are well encoded. Afterward, for each video region, we use the semantic feature combined with the appearance feature as its representation. We design a multi-view learning algorithm to optimally fuse these two types of features. Based on the fused feature, we learn a probabilistic Gaussian mixture model to predict the target probability of each candidate window, and the window with the maximal probability is output as the tracking result. Comprehensive comparative results on a challenging pedestrian tracking task as well as on human hand gesture recognition demonstrate the effectiveness of our method. Moreover, visualized tracking results show that non-rigid objects with moderate occlusions can be well localized by our method.