Similar Articles
 20 similar articles found (search time: 31 ms)
1.
The Journal of Supercomputing - The lateral interaction in accumulative computation (LIAC) algorithm is a biologically inspired method that allows us to detect moving objects from image sequences...

2.
To understand the motion of non-rigid objects, image-processing and computer-vision techniques for motion analysis are essential. Lateral interaction in accumulative computation (LIAC) for extracting non-rigid shapes from an image sequence has recently been presented, together with its application to segmentation from motion. In this paper, we introduce a modified version of the original multi-layer architecture. This version uses the basic parameters of the LIAC model to build up, spatio-temporally and to the desired extent, the shapes of all moving objects present in a sequence of images. The influence of the LIAC model parameters is explained, and we show examples of the usefulness of the proposed model.
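The accumulative-computation idea behind LIAC can be illustrated with a per-pixel "charge" that grows while motion is detected at a pixel and decays otherwise. The sketch below is a minimal stand-in, not the authors' exact model; the parameter names (`q_max`, `delta_up`, `delta_down`) and values are illustrative assumptions.

```python
import numpy as np

# Hedged sketch of accumulative computation: each pixel holds a charge that is
# boosted where motion is detected and discharged elsewhere, so persistent
# motion accumulates evidence over frames.

def update_charge(charge, motion_mask, q_max=255.0, q_min=0.0,
                  delta_up=64.0, delta_down=16.0):
    """One temporal step of per-pixel charge accumulation/decay."""
    up = np.minimum(charge + delta_up, q_max)      # accumulate where motion
    down = np.maximum(charge - delta_down, q_min)  # discharge elsewhere
    return np.where(motion_mask, up, down)

charge = np.zeros((4, 4))
mask = np.zeros((4, 4), dtype=bool)
mask[1, 1] = True                  # motion only at pixel (1, 1)
for _ in range(3):                 # three frames of persistent motion
    charge = update_charge(charge, mask)
print(charge[1, 1], charge[0, 0])  # 192.0 0.0
```

The saturation at `q_max` and the slower discharge rate are what give the model its temporal persistence: brief noise decays away while sustained motion reaches high charge.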

3.
Recently, the Algorithmic Lateral Inhibition (ALI) method and the Accumulative Computation (AC) method have proven to be efficient for knowledge-level modelling of general motion-detection tasks in video sequences. More precisely, the task of persistent motion detection has been widely expressed by means of the AC method, whereas the ALI method has been used for moving-object detection, labelling and tracking. This paper exploits our research team's current knowledge of these problem-solving methods to model the Stereovision-Correspondence-Analysis (SCA) task. For this purpose, the ALI and AC methods are combined into the Lateral Inhibition in Accumulative Computation (LIAC) method. The basic subtasks of our SCA proposal, namely "LIAC 2D Charge-Memory Calculation", "LIAC 2D Charge-Disparity Analysis" and "LIAC 3D Charge-Memory Calculation", are described in detail by inferential CommonKADS schemes. It is shown that the LIAC method can be used to solve a complex task based on the motion information inherent to binocular video sequences.
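At its simplest, stereo correspondence analysis searches, for each left-image pixel, for the horizontal shift (disparity) that best matches the right image. The sketch below is generic single-row block matching with a window of one pixel, shown as a simplified stand-in for the charge-disparity analysis subtask named above, not the authors' method.

```python
import numpy as np

# Simplified stereo correspondence on one scanline: for each left pixel, pick
# the disparity d minimizing the absolute intensity difference against the
# right image shifted by d.

def row_disparity(left_row, right_row, max_disp=3):
    n = len(left_row)
    disp = np.zeros(n, dtype=int)
    for x in range(n):
        best, best_d = np.inf, 0
        for d in range(min(max_disp, x) + 1):   # candidate disparities
            cost = abs(left_row[x] - right_row[x - d])
            if cost < best:
                best, best_d = cost, d
        disp[x] = best_d
    return disp

left = np.array([0, 10, 20, 30, 40])
right = np.array([10, 20, 30, 40, 50])   # scene shifted by one pixel
print(row_disparity(left, right))        # [0 1 1 1 1]
```

Real systems aggregate the cost over a window and over time; in the LIAC proposal the matching evidence is accumulated in charge memories rather than matched per frame.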

4.

5.
6.
Han  Yuzhuo  Liu  Risheng  Zhong  Guangyu  Fan  Xin  Li  Haojie  Luo  Zhongxuan 《Neural computing & applications》2018,29(5):1267-1279

Anisotropic propagations have been widely used in image processing for decades. However, most previous anisotropic propagations are defined only on regular image pixels and cannot be applied to complex vision tasks such as object tracking. Tracking, a fundamental task in computer vision, has potential value for virtual reality (VR) and augmented reality (AR). In this paper, we propose a novel discriminative anisotropic propagation model, called sequential heat diffusions (SHD), on video sequences to address this issue. Our core idea is to propagate the discriminative appearance of the target object in both the temporal and spatial domains. In particular, we first train a discriminative appearance model for the target. Then, for an incoming frame, we design two coupled diffusions: the spatial one estimates a probability that reflects the intrinsic object structure, and the temporal one provides another probability (guided by information from the training frames) that captures the background distribution. Finally, the tracking result is obtained by maximizing a combined confidence map. Experiments on many challenging videos show the superiority of our method over other state-of-the-art trackers.
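The building block of diffusion-based propagation is a discrete heat-diffusion step applied to a confidence map. The SHD paper couples spatial and temporal diffusions; the sketch below shows only a single isotropic spatial step with replicate (Neumann) borders, as an illustrative assumption rather than the paper's anisotropic scheme.

```python
import numpy as np

# One explicit-Euler step of discrete heat diffusion on a 2D confidence map:
# u <- u + alpha * Laplacian(u), with replicate borders so total confidence
# ("mass") is conserved.

def diffuse(u, alpha=0.2, steps=1):
    for _ in range(steps):
        p = np.pad(u, 1, mode="edge")
        lap = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
               - 4.0 * u)                      # 5-point Laplacian
        u = u + alpha * lap
    return u

u = np.zeros((5, 5))
u[2, 2] = 1.0                                  # point source of confidence
v = diffuse(u, steps=10)
print(round(float(v.sum()), 6))                # 1.0: mass is conserved
```

After diffusion, the single confident pixel has spread into a smooth neighborhood, which is what lets confidence propagate from the tracked target into nearby, structurally similar pixels.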


7.
With the continuous development of computer technology, video tracking has become a research hotspot in computing. Research on video tracking spans many areas, including video image processing, pattern recognition and artificial intelligence, and has strong research value. Gesture detection and recognition, a new computer-vision-based mode of human-computer interaction, is one of its most prominent research and application topics. This paper uses a simple and efficient color histogram to locate the dominant color of the target (a red finger), extracts the target region in the image sequence to obtain its motion trajectory, and performs handwritten-digit recognition on the trajectory. Finally, eight test videos verify that the method is simple, efficient, and capable of real-time tracking and recognition.
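The tracking idea above can be sketched as: threshold each frame for the dominant red color, take the centroid of the resulting region, and string the per-frame centroids into a trajectory. The RGB thresholds below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Locate the dominant red region by color thresholding and return its centroid
# (x, y); the sequence of centroids over frames forms the motion trajectory.

def red_centroid(frame_rgb, r_min=150, g_max=100, b_max=100):
    r, g, b = frame_rgb[..., 0], frame_rgb[..., 1], frame_rgb[..., 2]
    mask = (r >= r_min) & (g <= g_max) & (b <= b_max)
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None                     # no red target in this frame
    return (float(xs.mean()), float(ys.mean()))

# Synthetic 3-frame sequence: a red dot moving right along row 2.
trajectory = []
for x in (1, 2, 3):
    frame = np.zeros((5, 5, 3), dtype=np.uint8)
    frame[2, x] = (255, 0, 0)
    trajectory.append(red_centroid(frame))
print(trajectory)   # [(1.0, 2.0), (2.0, 2.0), (3.0, 2.0)]
```

In practice a histogram back-projection or HSV thresholds are more robust to lighting than raw RGB bounds, but the pipeline shape is the same.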

8.
Human action recognition has been an active field of research in the computer vision community for the last decade. The spatiotemporal MACH (maximum average correlation height) filter approach has proved to be a very efficient method for the problem: it captures the intra-class variability and produces a very high response at the spatiotemporal location $(x,y,t)$ where the action is present in a video, and its computational cost is significantly lower than that of other action recognition approaches. However, a faster algorithm is always needed to perform a computer vision task in real time. Therefore, we propose a very efficient algorithm for normalized spatiotemporal MACH filtering for action recognition. It performs computations both in the frequency domain and in the spatiotemporal domain, exploiting integral video. We compare its speed with that of the relevant traditional algorithms and show that our approach drastically outperforms all of them.
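The speed of MACH-style filtering comes from evaluating the correlation in the frequency domain: correlating over all shifts costs one forward FFT, a pointwise multiply, and one inverse FFT. A minimal 2D illustration follows (the actual filter is 3D spatiotemporal, and this omits the normalization the paper addresses):

```python
import numpy as np

# Cross-correlation of an image with a template via the FFT. The response peak
# marks where the template best matches the image.

def cross_correlate_fft(image, template):
    F = np.fft.fft2(image)
    H = np.fft.fft2(template, s=image.shape)   # zero-pad template to image size
    return np.real(np.fft.ifft2(F * np.conj(H)))

img = np.zeros((8, 8))
img[3:5, 4:6] = 1.0                            # 2x2 bright patch at (3, 4)
tmpl = np.ones((2, 2))
resp = cross_correlate_fft(img, tmpl)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(resp), resp.shape))
print(peak)                                    # (3, 4): top-left of the patch
```

For a T-frame spatiotemporal filter the same identity applies with 3D FFTs, which is why one frequency-domain pass replaces a sliding correlation over every $(x,y,t)$.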

9.

We describe an artificial high-level vision system for the symbolic interpretation of data coming from a video camera that acquires image sequences of moving scenes. The system is based on ARSOM neural networks that learn to generate perception-grounded predicates from the image sequences. The ARSOM networks also provide a three-dimensional estimation of the movements of the relevant objects in the scene. The vision system has been employed in two scenarios: the monitoring of a robotic arm suitable for space operations, and the surveillance of an electronic data processing (EDP) center.
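ARSOM networks belong to the self-organizing-map family. The elementary operation shared by SOM-style models is: find the best matching unit (BMU) for an input vector and pull it toward that input. The sketch below is that elementary step only (no neighborhood function, no recurrence), with made-up sizes and learning rate, as a hint at the mechanism rather than the ARSOM model itself.

```python
import numpy as np

# One SOM-style adaptation step: locate the best matching unit by Euclidean
# distance and move it a fraction of the way toward the input.

def som_step(weights, x, lr=0.5):
    bmu = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
    weights[bmu] += lr * (x - weights[bmu])    # in-place weight update
    return bmu

weights = np.array([[0.0, 0.0], [10.0, 10.0]])
bmu = som_step(weights, np.array([9.0, 9.0]))
print(bmu, weights[1])    # 1 [9.5 9.5]
```

Repeating this over a stream of inputs makes the unit weights settle on the input distribution, which is what lets such networks ground symbolic predicates in perceptual data.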

10.
Huang  Wei  Zeng  Jing  Zhang  Peng  Chen  Guang  Ding  Huijun 《Multimedia Tools and Applications》2018,77(21):28539-28565

Foreground target localization in video sequences has received much attention in computer vision over the past few years, and its study is closely tied to machine learning techniques. Driven by the recent popularity of deep learning, many contemporary localization studies employ deep learning methods, and their performance has benefited greatly from the strong generalization capability of those methods. In this study, inspired by deep metric learning, a new trend in deep learning, a novel single-target localization method is proposed. The method consists of two steps. First, an offline deep-ranked metric learning step is performed; its gradient for the end-to-end learning of the whole deep model is derived so that the conventional stochastic gradient algorithm can be applied, and an alternative proximal gradient algorithm is introduced to boost efficiency. Second, an online model-updating step combines consecutive and incremental updating, making the offline-learned model more adaptive as the video sequence progresses, where challenging circumstances such as sudden illumination changes, obstacles, shape transformations and complex backgrounds are likely to occur. The new method has been compared with several shallow-learning-based and deep-learning-based localization methods on a large video database. Both qualitative and quantitative analyses have been conducted to demonstrate, from a statistical point of view, the superiority of the new single-target localization method.
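One stochastic step of a ranking (triplet) metric-learning objective, the flavor of "deep-ranked metric learning" the abstract describes, can be sketched on a plain linear embedding `W` instead of a deep network. The margin, learning rate, and data below are illustrative assumptions, not the paper's values.

```python
import numpy as np

# Triplet loss: push the anchor-positive distance below the anchor-negative
# distance by at least a margin; take a subgradient step when violated.

def triplet_step(W, anchor, pos, neg, margin=1.0, lr=0.1):
    fa, fp, fn = W @ anchor, W @ pos, W @ neg
    loss = max(0.0, np.sum((fa - fp) ** 2) - np.sum((fa - fn) ** 2) + margin)
    if loss > 0:   # subgradient of the squared-distance hinge
        gW = 2 * np.outer(fa - fp, anchor - pos) \
           - 2 * np.outer(fa - fn, anchor - neg)
        W = W - lr * gW
    return W, float(loss)

W = np.eye(2)
a, p, n = np.array([0.0, 0.0]), np.array([0.0, 2.0]), np.array([0.0, 1.0])
W, loss = triplet_step(W, a, p, n)
print(loss, round(float(W[1, 1]), 6))   # 4.0 0.4
```

After the step, the embedding shrinks the violated direction, reducing the loss on the same triplet; the paper's proximal-gradient variant replaces this plain subgradient step with a proximal update for efficiency.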


11.
Measuring pedestrian traffic in public areas is important for diverse business, security, and building-management applications. Although various computer vision methods have been proposed for this purpose, they are not suitable for measuring high traffic levels in large public areas. Because previous methods measured pedestrian traffic by detecting and tracking individuals, their computational complexity was high and they could not be used in crowded areas; they were also sometimes unable to integrate with existing surveillance cameras because they required specific camera angles. We propose an efficient method for measuring pedestrian traffic that employs feature-based regression in the spatiotemporal domain. The method first extracts foreground pixels and motion vectors as image features, which are then accumulated over sequential frames. By identifying relationships between the accumulated image features and the number of people passing by, pedestrian traffic can be measured efficiently. Because the method involves no detection or tracking of individuals, its computational complexity is low and it is less constrained by camera angle. In addition, due to its statistical nature, it can be used in extremely high-traffic areas. To evaluate the method, a dataset consisting of 24 hours of video sequences was prepared, acquired from 12 different locations in the most crowded underground shopping mall in Korea. Our studies revealed that the proposed method measured pedestrian traffic with an error rate of 4.46% at an average processing speed of 70 fps.
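The regression idea can be shown in miniature: accumulate a cheap image feature over frames (here a synthetic foreground-pixel count standing in for the paper's foreground and motion-vector features) and fit a linear map from feature to people count. The data and the linear regressor are illustrative assumptions; the paper's actual features and regressor may differ.

```python
import numpy as np

# Synthetic demonstration of feature-based regression for counting: the
# foreground-pixel count is roughly proportional to the number of people,
# so a least-squares fit recovers the count without detecting anyone.

rng = np.random.default_rng(42)
people = rng.integers(0, 50, size=200)                     # ground-truth counts
fg_pixels = 120.0 * people + rng.normal(0, 30, size=200)   # noisy feature
X = np.column_stack([fg_pixels, np.ones(200)])             # feature + bias
w, *_ = np.linalg.lstsq(X, people.astype(float), rcond=None)
pred = X @ w
mae = float(np.abs(pred - people).mean())
print(mae < 1.0)    # True: the linear fit recovers counts closely
```

The appeal is exactly what the abstract claims: no per-person detection or tracking, so the cost per frame is a few feature extractions and one dot product.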

12.
With video surveillance data growing at the terabyte scale, efficiently and accurately separating moving objects from massive surveillance video is a key focus and challenge in computer vision. We propose a parallel computing framework for video data processing on a cloud platform, together with an improved adaptive foreground-extraction algorithm based on the Gaussian mixture model (GMM). Optimal parameter combinations are obtained through adaptive learning of the mixed Gaussian distributions and an online EM (expectation-maximization) algorithm, and the improved algorithm is integrated into the parallel video-processing framework. Experimental results show that the method not only greatly improves the efficiency of video processing but is also robust in accurately extracting foreground targets in complex environments.
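The adaptive GMM foreground extraction the abstract builds on works per pixel: match the incoming intensity against the pixel's existing Gaussians, update the matched one online, and call unmatched values foreground. The single-channel sketch below follows the classic Stauffer-Grimson style with a fixed learning rate; the parameter values are illustrative, not the paper's.

```python
import numpy as np

# Per-pixel adaptive GMM step: a value within match_k standard deviations of
# some Gaussian is background (and updates that Gaussian online); otherwise
# it is flagged as foreground.

def gmm_update(means, variances, weights, x, alpha=0.05, match_k=2.5):
    d = np.abs(x - means)
    matched = d < match_k * np.sqrt(variances)
    if not matched.any():
        return means, variances, weights, True     # foreground pixel
    k = int(np.argmax(matched))                    # first matching Gaussian
    means[k] += alpha * (x - means[k])             # online mean update
    variances[k] += alpha * ((x - means[k]) ** 2 - variances[k])
    weights = (1 - alpha) * weights
    weights[k] += alpha                            # reinforce matched mode
    return means, variances, weights, False

means = np.array([100.0, 200.0])
variances = np.array([25.0, 25.0])
weights = np.array([0.7, 0.3])
_, _, _, fg = gmm_update(means, variances, weights, 102.0)
print(fg)          # False: 102 matches the background Gaussian at 100
_, _, _, fg2 = gmm_update(means, variances, weights, 160.0)
print(fg2)         # True: no Gaussian explains 160
```

The paper replaces the fixed `alpha` with parameters learned by online EM, and distributes this per-pixel loop across the cloud framework; the per-pixel logic stays the same.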

13.
Automatic head pose estimation from real-world video sequences is of great interest to the computer vision community, since pose provides prior knowledge for tasks such as face detection and classification. However, developing pose estimation algorithms requires large, labeled, real-world video databases on which computer vision systems can be trained and tested. Manual labeling of each frame is tedious, time consuming, and often difficult because of the high uncertainty in the head pose angle estimate, particularly in unconstrained environments with arbitrary facial expressions, occlusion, illumination, etc. To overcome these difficulties, a semi-automatic framework is proposed for labeling temporal head pose in real-world video sequences. The proposed multi-stage labeling framework first detects a subset of frames with distinct head poses over a video sequence, which is then manually labeled by an expert to obtain the ground truth for those frames. The framework provides a continuous head pose label and a corresponding confidence value over the pose angles. Next, an interpolation scheme over the video sequence estimates (i) labels for the frames without manual labels and (ii) corresponding confidence values for the interpolated labels. This confidence value permits an automatic head pose estimation framework to determine the subset of frames to be used for further processing, depending on the labeling accuracy required. Experiments performed on an in-house, labeled, large, real-world face video database (which will be made publicly available) show that the proposed framework achieves 96.98% labeling accuracy when manual labeling is performed on only 30% of the video frames.
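The interpolation stage can be sketched as: given expert labels on a sparse subset of frames, linearly interpolate the pose angle for the frames in between and attach a confidence that falls off with distance to the nearest labeled frame. The confidence model below (`1 / (1 + distance)`) is an assumption for illustration, not the paper's exact scheme.

```python
import numpy as np

# Interpolate sparse expert pose labels over all frames and assign each frame
# a confidence: 1.0 at labeled frames, decaying with distance to the nearest
# labeled frame.

def interpolate_labels(n_frames, labeled):        # labeled: {frame: angle}
    keys = sorted(labeled)
    frames = np.arange(n_frames)
    angles = np.interp(frames, keys, [labeled[k] for k in keys])
    dist = np.min(np.abs(frames[:, None] - np.array(keys)[None, :]), axis=1)
    conf = 1.0 / (1.0 + dist)                     # confidence model (assumed)
    return angles, conf

angles, conf = interpolate_labels(5, {0: 0.0, 4: 40.0})
print([float(a) for a in angles])   # [0.0, 10.0, 20.0, 30.0, 40.0]
```

Downstream, a consumer can keep only frames whose confidence exceeds a threshold, which is exactly the accuracy/coverage trade-off the abstract describes.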

14.
15.
Scattered-point data processing is widely used in scientific visualization, reverse engineering, computer vision and other fields. Based on the principles of the wavelet transform and multidimensional wavelet transform algorithms, this paper designs a scattered-point data processing method based on the wavelet transform. By processing the scattered points in layers, the three-dimensional wavelet transform used for images and video is applied to scattered points. While satisfying later visualization and display requirements, the high-frequency subbands representing detail can be reduced as needed, which can be applied to the preprocessing of three-dimensional visualization data.
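The reduction step rests on a basic wavelet property: the transform splits data into a low-frequency approximation subband and high-frequency detail subbands, and small detail coefficients can be discarded with bounded error. One level of the 1D Haar transform, the simplest building block of such a multidimensional decomposition, looks like this (illustrative; the paper's choice of wavelet is not specified here):

```python
import numpy as np

# One level of the 1D Haar wavelet transform: pairwise averages form the
# approximation (low-frequency) subband, pairwise differences the detail
# (high-frequency) subband.

def haar_1d(signal):
    s = np.asarray(signal, dtype=float).reshape(-1, 2)
    low = (s[:, 0] + s[:, 1]) / np.sqrt(2.0)     # approximation subband
    high = (s[:, 0] - s[:, 1]) / np.sqrt(2.0)    # detail subband
    return low, high

x = np.array([4.0, 4.0, 8.0, 8.0])
low, high = haar_1d(x)
print([float(h) for h in high])   # [0.0, 0.0]: flat pairs carry no detail
```

Zeroing detail coefficients below a threshold shrinks the data while keeping the reconstruction close, which is the subband-reduction idea the abstract applies to layered scattered points in 3D.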

16.
The aim of this work is to build self-growing architectures to support visual surveillance and human-computer interaction systems. The objectives include identifying and tracking persons or objects in the scene, and interpreting user gestures for interaction with services, devices and systems in the digital home. The system must address vision tasks at several levels, such as segmentation, representation or characterization, and motion analysis and monitoring, in order to build a robust representation of the environment and interpret the elements of the scene. It is also necessary to integrate the vision module into a global system that operates in a complex environment, receiving images from acquisition devices at video frequency, offering results to higher-level systems, and monitoring and taking decisions in real time, while meeting requirements such as time constraints, high availability, robustness, high processing speed and re-configurability. Based on our previous work on neural models for representing objects, in particular the Growing Neural Gas (GNG) model and the study of topology preservation as a function of parameter choice, we propose to extend this self-growing model to track objects and represent their motion in image sequences under temporal restrictions. These neural models have several interesting features: the ability to readjust to new input patterns without restarting the learning process, adaptability to represent deformable objects (even objects that split into parts), and an intrinsic solution to the feature-matching problem in sequence analysis and motion monitoring. We propose an architecture based on the GNG, called GNG-Seq, to represent and analyze motion in image sequences. Several experiments demonstrate the validity of the architecture for target tracking, motion analysis and human-computer interaction.
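The core GNG adaptation step behind a GNG-Seq-style tracker moves the winning node strongly, and its graph neighbors weakly, toward each input sample, so the network deforms to follow the object between frames. The sketch below shows only that step; edge aging, error accumulation and node insertion are omitted, and the rates are illustrative.

```python
import numpy as np

# One GNG adaptation step: find the best matching node, move it toward the
# input with rate eps_b, and move its topological neighbors with rate eps_n.

def gng_adapt(nodes, edges, x, eps_b=0.5, eps_n=0.05):
    s1 = int(np.argmin(np.linalg.norm(nodes - x, axis=1)))
    nodes[s1] += eps_b * (x - nodes[s1])         # winner update
    for j in edges.get(s1, ()):                  # neighbor updates
        nodes[j] += eps_n * (x - nodes[j])
    return s1

nodes = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]])
edges = {0: [1], 1: [0, 2], 2: [1]}              # a chain of three nodes
s1 = gng_adapt(nodes, edges, np.array([0.2, 0.0]))
print(s1, float(nodes[0][0]))   # 0 0.1
```

Because the learned graph persists across frames, re-running adaptation on the next frame starts from the previous representation instead of from scratch, which is the temporal-restriction advantage the abstract emphasizes.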

17.
Image and video processing techniques are frequently used in medical applications. Computer-vision-based systems have successfully replaced various manual medical processes, such as the analysis of physical and biomechanical parameters and parts of the physical examination of patients. These systems are gaining popularity because of their robustness and the objectivity they bring to various medical procedures. The Hammersmith Infant Neurological Examination (HINE) is a set of physical tests carried out on infants aged 3-24 months with neurological disorders. These tests are graded through visual observation, which can be highly subjective, so a computer-vision-aided approach can assist the experts in the grading process. In this paper, we present a method for automatic exercise classification through visual analysis of HINE videos recorded at hospitals. We use scale-invariant feature transform (SIFT) features to generate a bag of words from the image frames of the video sequences; the frequencies of these visual words are then used to classify the video sequences with an HMM. We also present a method of event segmentation in long videos containing more than two exercises; event segmentation coupled with a classifier can help to automatically index long, continuous HINE video sequences. The proposed framework is a step toward the automation of HINE tests through computer-vision-based methods. We conducted tests on a dataset of 70 HINE video sequences and found that the proposed method can classify exercises with accuracy as high as 84%. The work has direct applications in the automatic or semiautomatic analysis of the "vertical suspension" and "ventral suspension" tests of HINE. Although some of the critical tests, such as "pulled-to-sit," "lateral tilting," and "adductor's angle measurement," have already been addressed using image- and video-guided techniques, there remains scope for further improvement.
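The bag-of-visual-words step can be shown in miniature: assign each local descriptor (SIFT in the paper) to its nearest codebook word and histogram the assignments; the normalized histogram is the per-frame or per-sequence feature fed to the classifier (an HMM in the paper). The toy 2D "descriptors" and the two-word codebook below are illustrative assumptions.

```python
import numpy as np

# Bag-of-visual-words: nearest-word assignment followed by a normalized
# histogram of word frequencies.

def bow_histogram(descriptors, codebook):
    d = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = np.argmin(d, axis=1)                 # nearest word per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
descs = np.array([[1.0, 0.0], [0.0, 1.0], [9.0, 9.0], [11.0, 10.0]])
print(bow_histogram(descs, codebook).tolist())   # [0.5, 0.5]
```

In the full pipeline the codebook is learned by clustering (typically k-means) over training descriptors, and the sequence of histograms over time becomes the HMM's observation sequence.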

18.
19.
Extracting moving targets in dynamic scenes is a key problem in video analysis and a hot topic in computer vision and image processing. This paper proposes a new moving-target extraction algorithm for dynamic scenes. The algorithm first computes the global motion parameters from a camera global-motion model, then obtains a segmented foreground using three-frame differencing. Pixels segmented as background are mapped to neighboring frames to estimate the mean and variance of each pixel's Gaussian model when it is background. Finally, a particle filter predicts the foreground region of the next frame, the probability that each pixel is foreground is computed, and the video segmentation of the moving target is obtained. Experiments show that the algorithm effectively overcomes the accumulated error caused by deviations in estimating the global-motion-model parameters, and segments targets in diving-sport videos with higher precision.
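The three-frame-differencing core the abstract combines with global-motion compensation can be sketched directly: a pixel counts as foreground only if it differs from both the previous and the next frame (which, in the full pipeline, have already been registered by the global-motion model). The threshold below is an illustrative assumption.

```python
import numpy as np

# Three-frame differencing: intersect the change masks of (prev, cur) and
# (cur, nxt), which suppresses ghosting left by simple two-frame differencing.

def three_frame_diff(prev, cur, nxt, thresh=20):
    d1 = np.abs(cur.astype(int) - prev.astype(int)) > thresh
    d2 = np.abs(nxt.astype(int) - cur.astype(int)) > thresh
    return d1 & d2                               # moving in both comparisons

f0 = np.zeros((4, 4), dtype=np.uint8)
f1 = f0.copy(); f1[1, 1] = 200                   # object appears at (1, 1)...
f2 = f0.copy(); f2[1, 2] = 200                   # ...and moves to (1, 2)
mask = three_frame_diff(f0, f1, f2)
print(int(mask.sum()), bool(mask[1, 1]))   # 1 True
```

The intersection keeps only the object's position in the middle frame, which is why the abstract's per-pixel background Gaussians and particle-filter prediction are layered on top to refine this coarse mask.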

20.

Human activity recognition is a challenging problem in computer vision with diverse emerging applications. Recognizing human activities from video sequences is particularly challenging because of their highly variable nature and the requirement of real-time data processing. This paper proposes a combination of features in a multiresolution framework for human activity recognition. We exploit multiresolution analysis through the Daubechies complex wavelet transform (DCxWT), combining local binary patterns (LBP) with Zernike moments (ZM) at multiple resolutions of the Daubechies complex wavelet decomposition. First, LBP codes of the DCxWT coefficients of the image frames are computed to extract texture features; then ZM of these LBP codes are computed to extract shape features from the texture features for constructing the final feature vector. A multi-class support vector machine is used to classify the recognized human activities. The proposed method has been tested on several standard, publicly available datasets. The experimental results demonstrate that it works well for multiview human activities and outperforms several other state-of-the-art methods on different quantitative performance measures.
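The LBP texture descriptor used above reduces, per pixel, to an 8-bit code: each of the 8 neighbours contributes one bit, set when the neighbour is greater than or equal to the centre value. The bit ordering below is one common convention, chosen for illustration; the paper computes these codes on DCxWT coefficients rather than raw intensities.

```python
import numpy as np

# Basic 8-neighbour LBP code for the centre pixel of a 3x3 patch: bit i is set
# when the i-th neighbour (clockwise from top-left) is >= the centre.

def lbp_code(patch3x3):
    c = patch3x3[1, 1]
    neigh = [patch3x3[0, 0], patch3x3[0, 1], patch3x3[0, 2],
             patch3x3[1, 2], patch3x3[2, 2], patch3x3[2, 1],
             patch3x3[2, 0], patch3x3[1, 0]]
    return sum((1 << i) for i, v in enumerate(neigh) if v >= c)

patch = np.array([[9, 9, 9],
                  [1, 5, 1],
                  [9, 9, 9]])
print(lbp_code(patch))   # 119: top and bottom rows are >= centre
```

Histograms of these codes over a region form the texture feature; the paper then takes Zernike moments of the code maps to add shape information before the SVM.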



Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号