Similar literature
Found 20 similar documents; search took 31 ms.
1.
Especially in urban environments, video cameras have become omnipresent. Supporters of video surveillance argue that it is an excellent tool for many applications, including crime prevention and law enforcement. While this is certainly true, it must be asked whether sufficient effort is made to protect the privacy of monitored people. Privacy concerns are often set aside in favor of public safety and security. One reaction to this situation is emerging: community-based efforts in which citizens register and map surveillance cameras in their environment. Our study is inspired by this idea and proposes a user-specific and location-aware privacy awareness system. Using conventional smartphones, users can not only contribute to the camera maps but also use community-collected data to be alerted to potential privacy violations. In our model, we define different levels of privacy awareness. For the highest level, we present a mechanism that allows users to interact directly with specially designed, trustworthy cameras. These cameras provide direct feedback about the tasks they execute and how privacy-sensitive data is handled. A hardware security chip integrated into the camera ensures the authenticity, integrity and freshness of the provided camera status information.

2.
Visual surveillance using multiple cameras has attracted increasing interest in recent years. Establishing correspondence between multiple cameras is one of the most important and basic problems that multi-camera visual surveillance raises. In this paper, we propose a simple and robust method, based on the principal axes of people, to match people across multiple cameras. The correspondence likelihood, which reflects the similarity of pairs of principal axes of people, is constructed from the relationship between the "ground-points" of people detected in each camera view and the intersections of principal axes detected in different camera views and transformed to the same view. Our method has the following desirable properties: 1) camera calibration is not needed; 2) accurate motion detection and segmentation are less critical, owing to the robustness of the principal-axis-based feature to noise; 3) based on the fused data derived from the correspondence results, the positions of people in each camera view can be located accurately even when the people are partially occluded in all views. Experimental results on several real video sequences from outdoor environments demonstrate the effectiveness, efficiency, and robustness of our method.

3.
Today digital video is used extensively in many applications. Sometimes a video is treated as top secret by an organization (for example, military secrets, surveillance footage and corporate product designs) and may need to be shared among a group of people in a secure manner. Traditional data security methods such as encryption are prone to single-point attacks, i.e. the secret can be revealed by obtaining the decryption key from any single person. Alternatively, a secret sharing scheme provides collective control over the secrecy of information and is considered information-theoretically secure. In this paper, we propose to adopt a secret-sharing-based approach to provide collective control over a given sensitive video. We present three methods that utilize the spatial and temporal redundancy in videos in different ways. We analyze the security of these methods and compare their efficiency in terms of computation time and space through extensive experimentation.
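The abstract above does not say which secret sharing scheme the authors use, so the following is only a minimal sketch of the general idea of collective control, assuming the simplest (n, n) XOR scheme: every one of the n shares is required to recover the data, and any smaller subset is indistinguishable from random bytes. The function names `split_secret` and `recover_secret` are hypothetical, and this is not one of the paper's three video-specific methods.

```python
import secrets


def split_secret(data: bytes, n: int) -> list[bytes]:
    """Split data into n shares; all n shares are needed to recover it."""
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are uniformly random pads of the same length as the data.
    shares = [secrets.token_bytes(len(data)) for _ in range(n - 1)]
    # The last share is the data XORed with every random pad.
    last = bytearray(data)
    for pad in shares:
        for i, b in enumerate(pad):
            last[i] ^= b
    shares.append(bytes(last))
    return shares


def recover_secret(shares: list[bytes]) -> bytes:
    """XOR all shares together to recover the original bytes."""
    out = bytearray(len(shares[0]))
    for share in shares:
        for i, b in enumerate(share):
            out[i] ^= b
    return bytes(out)
```

Calling `split_secret(frame_bytes, 3)` yields three shares to hand to three people, and `recover_secret` returns the original bytes only when all three are combined. A threshold (k-of-n) scheme such as Shamir's would be needed if a subset of the group should be able to reconstruct the video.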

4.
Time-Delayed Correlation Analysis for Multi-Camera Activity Understanding   Cited by: 1 (self-citations: 0, citations by others: 1)
We propose a novel approach to understanding activities from their partial observations monitored through multiple non-overlapping cameras separated by unknown time gaps. In our approach, each camera view is first decomposed automatically into regions based on the correlation of object dynamics across different spatial locations in all camera views. A new Cross Canonical Correlation Analysis (xCCA) is then formulated to discover and quantify the time-delayed correlations of regional activities observed within and across multiple camera views in a single common reference space. We show that learning the time-delayed activity correlations offers important contextual information for (i) spatial and temporal topology inference of a camera network; (ii) robust person re-identification; and (iii) global activity interpretation and video temporal segmentation. Crucially, in contrast to conventional methods, our approach does not rely on either intra-camera or inter-camera object tracking; it can thus be applied to low-quality surveillance videos featuring severe inter-object occlusions. The effectiveness and robustness of our approach are demonstrated through experiments on 330 hours of video captured by 17 cameras installed at two busy underground stations with complex and diverse scenes.

5.
Hierarchical database for a multi-camera surveillance system   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper presents a framework for event detection and video content analysis for visual surveillance applications. The system is able to coordinate the tracking of objects between multiple camera views, which may be overlapping or non-overlapping. The key novelty of our approach is that we can automatically learn a semantic scene model for a surveillance region, and have defined data models to support the storage of tracking data with different layers of abstraction into a surveillance database. The surveillance database provides a mechanism to generate video content summaries of objects detected by the system across the entire surveillance region in terms of the semantic scene model. In addition, the surveillance database supports spatio-temporal queries, which can be applied for event detection and notification applications.

6.
Video surveillance activity has dramatically increased over the past few years. Earlier work dealt mostly with single stationary cameras, but the recent trend is toward active multicamera systems. Such systems offer several advantages over single camera systems - multiple overlapping views for obtaining 3D information and handling occlusions, multiple nonoverlapping cameras for covering wide areas, and active pan-tilt-zoom (PTZ) cameras for observing object details. To address these issues, we have developed a multicamera video surveillance approach, called distributed interactive video array. The DIVA framework provides multiple levels of semantically meaningful information ("situational" awareness) to match the needs of multiple remote observers. We have designed DIVA-based systems that can track and identify vehicles and people, monitor perimeters and bridges, and analyze activities. A new video surveillance approach employing a large-scale cluster of video sensors demonstrates the promise of multicamera arrays for homeland security.

7.
Electronic surveillance systems are increasingly widely used today, ranging from a simple video camera to complex biometric surveillance systems for facial patterns and intelligent computer-vision-based surveillance systems. They are applied in many fields, such as home monitoring, security surveillance of important places, and mission-critical tasks like air traffic control surveillance. Such systems normally involve a computer system and a human surveillance operator, who watches the dynamic display to perform surveillance tasks. Exploiting the information shared between these physically heterogeneous data capture systems and human-operated functions is an emerging aspect of electronic surveillance that has yet to be addressed in depth. Hence, an innovative interaction interface for such knowledge extraction and representation is required. Such an interface should establish a data activity register frame that captures information depicting various surveillance activities at a specified spatial and time reference. This paper presents a real-time eye tracking system that integrates two sets of activity data synchronously, in real time, with respect to both spatial and time frames, even as they change dynamically, through the “Dynamic Data Alignment and Timestamp Synchronisation Model”. This model matches the timestamps of the two data streams and aligns them to the same spatial reference frame before fusing them into a data activity register frame. The Air Traffic Control (ATC) domain is used to illustrate this model: experiments are conducted under simulated radar traffic situations with participants and their radar input data. Test results show that the model is able to synchronise the timestamps of the eye and dynamic display data and align both spatially, while taking into account dynamic changes in space and time on a simulated radar display. The system can also distinguish and show variations in the monitoring behaviour of participants. As such, new knowledge can be extracted and represented through this innovative interface, which can then be applied to other applications in electronic surveillance to uncover the monitoring behaviour of the human surveillance operator.
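The timestamp-matching step described above can be illustrated with a small sketch. The abstract does not give the model's actual procedure, so this assumes a plain nearest-timestamp pairing between an eye-tracking stream and a display stream; the function name `align_streams`, the `max_gap` tolerance and the sample layout are all hypothetical, and the spatial alignment step is not covered.

```python
from bisect import bisect_left


def align_streams(gaze, display, max_gap):
    """Pair each gaze sample with the display sample nearest in time.

    gaze, display: lists of (timestamp_seconds, payload), sorted by timestamp.
    max_gap: pairs further apart than this many seconds are dropped.
    """
    display_times = [t for t, _ in display]
    fused = []
    for t, gaze_payload in gaze:
        i = bisect_left(display_times, t)
        # The nearest display sample is one of the two neighbours of position i.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(display)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(display_times[k] - t))
        if abs(display_times[j] - t) <= max_gap:
            fused.append((t, gaze_payload, display[j][1]))
    return fused
```

Each fused tuple plays the role of one entry in a combined activity record: a gaze sample together with the display state that was on screen at (approximately) the same moment.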

8.
We present a hybrid camera system for capturing video at high spatial and spectral resolutions. Composed of a red, green, and blue (RGB) video camera, a grayscale video camera and a few optical elements, the hybrid camera system simultaneously records two video streams: an RGB video with high spatial resolution, and a multispectral (MS) video with low spatial resolution. After registration of the two video streams, our system propagates the MS information into the RGB video to produce a video with both high spectral and spatial resolution. This propagation between videos is guided by color similarity of pixels in the spectral domain, proximity in the spatial domain, and the consistent color of each scene point in the temporal domain. The propagation algorithm, based on trilateral filtering, is designed to rapidly generate output video from the captured data at frame rates fast enough for real-time video analysis tasks such as tracking and surveillance. We evaluate the proposed system both in simulations with ground truth data and on real-world scenes. The accuracy of spectral capture is examined through comparisons with ground truth and with a commercial spectrometer. The utility of this high resolution MS video data is demonstrated on the applications of dynamic white balance adjustment, object tracking, and separating the appearance contributions of different illumination sources. The various high resolution MS video datasets that we captured will be made publicly available to facilitate research on dynamic spectral data analysis.

9.
Reliable, real-time crowd counting is one of the most important tasks in intelligent visual surveillance systems. Most previous works count passing people based on color information alone. Owing to the inherent limitations of color information, they are inevitably affected by unpredictable, complex environments (e.g. illumination, occlusion, and shadow). To overcome this bottleneck, we propose a new crowd counting algorithm based on multimodal joint information processing. Our method uses color and depth information together, captured with an ordinary depth camera (e.g. Microsoft Kinect). Specifically, we first detect the head of each passing or stationary person in the surveillance region using depth information, with the ability to adapt to varying scenes. Then, we track and count each detected head using color information. The characteristic advantage of our algorithm is that it is scene adaptive: it can be applied directly to all kinds of different scenes without additional conditions. Based on the proposed approach, we have built a practical system for robust and fast crowd counting in complicated scenes. Extensive experimental results show the effectiveness of the proposed method.

10.
Nowadays, a tremendous amount of video is captured continuously by a growing number of video cameras distributed around the world. Because raw video contains abundant needless information, browsing and retrieving it is inefficient and time consuming. Video synopsis is an effective way to browse and index such video: it produces a short video representation while keeping the essential activities of the original video. However, video synopsis for a single camera is limited in its field of view, while understanding and monitoring overall activity in large scenarios is valuable and demanding. To address these issues, we propose a novel video synopsis algorithm for partially overlapping camera networks. Our main contributions lie in three aspects. First, our algorithm can generate video synopses for large scenarios, which facilitates understanding of overall activities. Second, to generate the overall activity, we adopt a novel unsupervised graph matching algorithm to associate trajectories across cameras. Third, a novel multiple-kernel similarity is adopted to select key observations and eliminate content redundancy in the video synopsis. We have demonstrated the effectiveness of our approach on real surveillance videos captured by our camera network.

11.
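Building a high-resolution panoramic video surveillance system using video mosaicking   Cited by: 1 (self-citations: 0, citations by others: 1)
Unlike ordinary video surveillance systems, which can monitor only a single direction, a panoramic video surveillance system provides 360° omnidirectional monitoring. We design and implement KD-PVS, an embedded high-resolution panoramic video surveillance system. The paper focuses on the spatial layout of the multiple cameras in KD-PVS and on the video image transformation and stitching algorithms. KD-PVS generates panoramic video by transforming and stitching, in real time, the video captured by multiple cameras. The system can readily be applied in settings such as financial institutions, warehouses, prisons, and mobile surveillance, and is especially suited to indoor monitoring.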

12.
We present a surveillance system, comprising wide field-of-view (FOV) passive cameras and pan/tilt/zoom (PTZ) active cameras, which automatically captures high-resolution videos of pedestrians as they move through a designated area. A wide-FOV static camera can track multiple pedestrians, while any PTZ active camera can capture high-quality videos of one pedestrian at a time. We formulate the multi-camera control strategy as an online scheduling problem and propose a solution that combines the information gathered by the wide-FOV cameras with weighted round-robin scheduling to guide the available PTZ cameras, such that each pedestrian is observed by at least one PTZ camera while in the designated area. A centerpiece of our work is the development and testing of experimental surveillance systems within a visually and behaviorally realistic virtual environment simulator. The simulator is valuable as our research would be more or less infeasible in the real world given the impediments to deploying and experimenting with appropriately complex camera sensor networks in large public spaces. In particular, we demonstrate our surveillance system in a virtual train station environment populated by autonomous, lifelike virtual pedestrians, wherein easily reconfigurable virtual cameras generate synthetic video feeds. The video streams emulate those generated by real surveillance cameras monitoring richly populated public spaces. A preliminary version of this paper appeared as [1].
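Weighted round-robin scheduling, which the abstract above names as part of its PTZ control strategy, can be sketched for the simplest case of a single PTZ camera. How the authors derive weights from the wide-FOV tracking data is not stated, so the integer weights here are assumed inputs, and the function name `weighted_round_robin` is hypothetical.

```python
def weighted_round_robin(weights, slots):
    """Plan which pedestrian one PTZ camera observes in each time slot.

    weights: dict mapping pedestrian id -> positive integer weight
             (assumed here to come from the wide-FOV tracker).
    slots:   number of time slots to plan.
    Each pedestrian receives `weight` consecutive slots per round.
    """
    if not weights:
        return []
    if any(not isinstance(w, int) or w <= 0 for w in weights.values()):
        raise ValueError("weights must be positive integers")
    # One full round: every pedestrian repeated according to its weight.
    round_order = [pid for pid, w in weights.items() for _ in range(w)]
    # Repeat the round cyclically until all slots are filled.
    return [round_order[i % len(round_order)] for i in range(slots)]
```

With several PTZ cameras, one plausible extension is to let each free camera take the next pedestrian in the same cyclic order who is not already being observed, which keeps every pedestrian covered at least once per round.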

13.
We address the structure-from-motion problem in the context of head modeling from video sequences for which calibration data is not available. This task is made challenging by the fact that correspondences are difficult to establish due to lack of texture and that a quasi-euclidean representation is required for realism. We have developed an approach based on regularized bundle-adjustment. It takes advantage of our rough knowledge of the head's shape, in the form of a generic face model. It allows us to recover relative head-motion and epipolar geometry accurately and consistently enough to exploit a previously-developed stereo-based approach to head modeling. In this way, complete and realistic head models can be acquired with a cheap and entirely passive sensor, such as an ordinary video camera, with minimal manual intervention. We chose to demonstrate and evaluate our technique mainly in the context of head-modeling. We do so because it is the application for which all the tools required to perform the complete reconstruction are available to us. We will, however, argue that the approach is generic and could be applied to other tasks, such as body modeling, for which generic facetized models exist.

14.
We present a novel representation and rendering method for free‐viewpoint video of human characters based on multiple input video streams. The basic idea is to approximate the articulated 3D shape of the human body using a subdivision into textured billboards along the skeleton structure. Billboards are clustered to fans such that each skeleton bone contains one billboard per source camera. We call this representation articulated billboards. In the paper we describe a semi‐automatic, data‐driven algorithm to construct and render this representation, which robustly handles even challenging acquisition scenarios characterized by sparse camera positioning, inaccurate camera calibration, low video resolution, or occlusions in the scene. First, for each input view, a 2D pose estimation based on image silhouettes, motion capture data, and temporal video coherence is used to create a segmentation mask for each body part. Then, from the 2D poses and the segmentation, the actual articulated billboard model is constructed by a 3D joint optimization and compensation for camera calibration errors. The rendering method includes a novel way of blending the textural contributions of each billboard and features an adaptive seam correction to eliminate visible discontinuities between adjacent billboard textures. Our articulated billboards not only minimize ghosting artifacts known from conventional billboard rendering, but also alleviate restrictions to the setup and sensitivities to errors of more complex 3D representations and multiview reconstruction techniques. Our results demonstrate the flexibility and the robustness of our approach with high quality free‐viewpoint video generated from broadcast footage of challenging, uncontrolled environments.

15.
Object detection is an essential component in automated vision-based surveillance systems. In general, object detectors are constructed using training examples obtained from large annotated data sets. The inevitable limitations of typical training data sets make such supervised methods unsuitable for building generic surveillance systems applicable to a wide variety of scenes and camera setups. In our previous work we proposed an unsupervised method for learning and detecting the dominant object class in a general dynamic scene observed by a static camera. In this paper, we investigate the possibilities to expand the applicability of this method to the problem of multiple dominant object classes. We propose an idea on how to approach this expansion, and perform an evaluation of this idea using two representative surveillance video sequences.

16.
Video recommendation is an important tool to help people access interesting videos. In this paper, we propose a universal scheme to integrate rich information for personalized video recommendation. Our approach regards video recommendation as a ranking task. First, it generates multiple ranking lists by exploring different information sources. In particular, one novel source, the strength of relationships between users, is inferred from the online social network and applied to recommend videos. Second, a multi-task rank aggregation approach is proposed to integrate these ranking lists into a final result for video recommendation. We show that our scheme is flexible and can easily incorporate other methods by adding their generated ranking lists to our multi-task rank aggregation approach. We conduct experiments on a large dataset with 76 users and more than 11,000 videos. The experimental results demonstrate the feasibility and effectiveness of our approach.
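To make the idea of fusing several ranking lists concrete, here is a minimal sketch that swaps in the classic Borda count in place of the paper's multi-task rank aggregation, whose details the abstract does not give. The function name `borda_aggregate` and the video ids in the usage note are illustrative assumptions.

```python
from collections import defaultdict


def borda_aggregate(ranking_lists):
    """Fuse several ranked lists of item ids into one list via Borda count.

    An item at position p (0-based) in a list earns n - p points, where n is
    the number of distinct items overall; items missing from a list earn 0.
    """
    items = {item for ranking in ranking_lists for item in ranking}
    n = len(items)
    scores = defaultdict(int)
    for ranking in ranking_lists:
        for position, item in enumerate(ranking):
            scores[item] += n - position
    # Highest total first; ties are broken by item id for determinism.
    return sorted(items, key=lambda item: (-scores[item], item))
```

For example, fusing `["v1", "v2", "v3"]`, `["v2", "v1", "v3"]` and `["v2", "v3"]` gives totals of 8 for `v2`, 5 for `v1` and 4 for `v3`. A ranking list produced by any additional method, such as one based on social relationship strength, can be appended to the input without changing the function.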

17.
We propose an efficient real-time automatic license plate recognition (ALPR) framework, particularly designed to work on CCTV video footage obtained from cameras that are not dedicated to use in ALPR. At present, license plate detection, tracking and recognition are reasonably well-tackled problems, with many successful commercial solutions available. However, existing ALPR algorithms assume that the input video is obtained via a dedicated, high-resolution, high-speed camera and/or is supported by a controlled capture environment with appropriate camera height, focus, exposure/shutter speed and lighting settings. Typical video forensic applications, by contrast, may require searching for a vehicle with a particular number plate in noisy CCTV video footage obtained via non-dedicated, medium-to-low resolution cameras working under poor illumination conditions. ALPR in such video content faces severe challenges in the license plate localization, tracking and recognition stages. This paper proposes a novel approach for efficient localization of license plates in video sequences and the use of a revised version of an existing technique for tracking and recognition. A special feature of the proposed approach is that it automatically adjusts for varying camera distances and diverse lighting conditions, a requirement for a video forensic tool that may operate on videos obtained by a diverse set of unspecified, distributed CCTV cameras.

18.
This paper examines factors contributing to the effectiveness of camera operators in urban camera surveillance. The use of camera surveillance has grown enormously in the past decades. Despite this increase, its effectiveness is strongly debated. One reason for the disputed effectiveness may be that an understanding of how to use camera surveillance, including elements contributing to the effectiveness of camera operators, has not kept pace with technological developments. This paper focuses on the role of expertise and familiarity with the environment in the effectiveness of camera operators at detecting offenders in video footage from Rotterdam City Surveillance in the Netherlands. Results show no effect of expertise, but do show that familiarity with the location contributes to operator effectiveness and that camera operators seem to use different criteria for detecting and selecting suspects depending on their familiarity with the location. These results contribute to our understanding of operator effectiveness and offer guidelines for the training of camera operators. Implications are discussed.

19.
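This paper proposes a technique based on comparing image pixels. In a camera-based video surveillance setting, applying it to a warehouse management system makes it possible to quickly review abnormal content in recorded surveillance footage, giving it considerable practical value.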
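The abstract above gives no algorithmic detail, so the following is only a rough sketch of one common form of pixel comparison: differencing consecutive grayscale frames and flagging frames in which a large share of pixels changed. The function names and both thresholds are assumptions for illustration, and frames are represented as plain nested lists of 0 to 255 intensities.

```python
def changed_fraction(frame_a, frame_b, pixel_threshold=25):
    """Fraction of pixels whose grayscale values differ by more than pixel_threshold."""
    same_shape = len(frame_a) == len(frame_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(frame_a, frame_b)
    )
    if not same_shape:
        raise ValueError("frames must have the same dimensions")
    total = sum(len(row) for row in frame_a)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
        if abs(a - b) > pixel_threshold
    )
    return changed / total


def abnormal_frames(frames, pixel_threshold=25, frame_threshold=0.1):
    """Indices of frames that differ noticeably from the frame before them."""
    return [
        i
        for i in range(1, len(frames))
        if changed_fraction(frames[i - 1], frames[i], pixel_threshold) > frame_threshold
    ]
```

An operator reviewing a long warehouse recording could then jump straight to the returned frame indices instead of watching the full footage.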

20.
In this paper, we present an approach for consistently labeling people and for detecting human–object interactions using mono-camera surveillance video. The approach is based on a robust appearance-based correlogram model combined with histogram information to model color distributions of people and objects in the scene. The models are dynamically built from non-stationary objects, which are the outputs of background subtraction, and are used to identify objects on a frame-by-frame basis. We are able to detect when people merge into groups and to segment them even during partial occlusion. We can also detect when a person deposits or removes an object. The models persist when a person or object leaves the scene and are used to identify them when they reappear. Experiments show that the models are able to accommodate perspective foreshortening that occurs with overhead camera angles, as well as partial occlusion. The results show that this is an effective approach that is able to provide important information to algorithms performing higher-level analysis, such as activity recognition, where human–object interactions play an important role.
