Similar literature
20 similar documents found (search time: 15 ms)
1.
2.
3.
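With the development of Safe City programs and surveillance cameras, automatic detection of illegal parking events has become an important part of automated video analysis. Current object detection algorithms, such as feature-plus-classifier methods, video background subtraction, and deep learning methods, mostly focus on locating and recognizing license plates and traffic signs, and neglect the recognition of events such as vehicle violations. This paper therefore develops an illegal-parking detection method based on an object interaction model: license plates are first detected using contour extraction and an SVM; traffic signs are then detected using shape and color; finally, the object interaction model decides whether a vehicle is illegally parked. Experiments on real video scenes verify the effectiveness and reliability of the method.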

4.
This paper focuses on human-object interaction (HOI) recognition, which aims to classify the interaction between humans and objects. The task is challenging partly because the data are extremely imbalanced across classes. To address this problem, we propose a language-guided graph parsing attention network (LG-GPAN) that uses the word distribution in language to guide classification in vision. We first associate each HOI class name with a word embedding vector; together these vectors form a language space specific to HOI recognition. In parallel, visual features are extracted from the inputs by the proposed graph parsing attention network (GPAN) for a better visual representation. The visual feature is then transformed into a linguistic feature in the language space. Finally, the output score is obtained by measuring the distance between the linguistic feature and the word embedding of each class in the language space. Experimental results on the popular CAD-120 and V-COCO datasets validate our design choices and demonstrate superior performance compared with the state of the art.
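
The final scoring step described above can be sketched as follows. This is a minimal illustration, not the LG-GPAN implementation: the linear projection `project`, the toy two-dimensional embeddings, and the use of a softmax over negative Euclidean distances are all assumptions made here, since the abstract only says that the score comes from a distance in the language space.

```python
import math

def project(visual_feature, weight):
    """Map a visual feature into the language space with a linear layer.

    `weight` is a hypothetical (language_dim x visual_dim) matrix.
    """
    return [sum(w * x for w, x in zip(row, visual_feature)) for row in weight]

def class_scores(linguistic_feature, class_embeddings):
    """Turn distances to each class embedding into scores that sum to 1.

    Assumption: scores are a softmax over negative Euclidean distances,
    so the closest class embedding receives the highest score.
    """
    names = list(class_embeddings)
    dists = [math.dist(linguistic_feature, class_embeddings[n]) for n in names]
    exps = [math.exp(-d) for d in dists]
    total = sum(exps)
    return {n: e / total for n, e in zip(names, exps)}

# Toy example: two HOI classes embedded in a 2-D "language space".
embeddings = {"hold cup": [1.0, 0.0], "ride bike": [0.0, 1.0]}
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
scores = class_scores(project([0.9, 0.1, 0.0], weight), embeddings)
```

Because rare classes still have a word embedding close to related frequent classes, scoring in this space is one plausible way the language prior can help with class imbalance.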

5.
Objects that occupy a small portion of an image or frame contain fewer pixels and therefore less information, which makes small object detection a challenging task in computer vision. In this paper, an improved Single Shot MultiBox Detector based on feature fusion and dilated convolution (FD-SSD) is proposed to address the difficulty of detecting small objects. The proposed network uses VGG-16 as the backbone and mainly consists of a multi-layer feature fusion module and a multi-branch residual dilated convolution module. In the multi-layer feature fusion module, the last two layers of the feature map are up-sampled and then concatenated at the channel level with the shallow feature map to enhance its semantic information. In the multi-branch residual dilated convolution module, three dilated convolutions with different dilation rates are combined in a residual structure to obtain multi-scale context information without reducing the resolution of the feature map. In addition, deformable convolution is added to each detection layer to better adapt to the shape of small objects. The proposed FD-SSD achieves 79.1% mAP on the PASCAL VOC2007 dataset and 29.7% mAP on the MS COCO dataset. Experimental results show that FD-SSD makes more effective use of the multi-scale information of small objects and thus significantly improves small object detection.
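
To illustrate why dilated convolution widens the context without reducing resolution, here is a minimal one-dimensional sketch. It is not the FD-SSD module: the 1-D setting, the shared kernel, and the dilation rates `(1, 2, 3)` are assumptions chosen for illustration, since the abstract does not give the actual rates.

```python
def dilated_conv1d(signal, kernel, rate):
    """Zero-padded 'same' dilated convolution (cross-correlation form).

    The kernel length is assumed to be odd; the output has the same
    length as the input, so resolution is preserved.
    """
    k = len(kernel)
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - k // 2) * rate  # taps are spaced `rate` samples apart
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out

def receptive_field(kernel_size, rate):
    """Number of input samples spanned by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (rate - 1)

def multi_branch_residual(signal, kernel, rates=(1, 2, 3)):
    """Sum several dilated branches and add the input back (residual link)."""
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    return [x + sum(b[i] for b in branches) for i, x in enumerate(signal)]
```

With a 3-tap kernel, rates 1, 2, and 3 span 3, 5, and 7 input samples respectively, which is the sense in which the branches gather multi-scale context.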

6.
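Du Jie, Wu Jin, Zhu Lei. 液晶与显示 (Chinese Journal of Liquid Crystals and Displays), 2016, 31(1): 117-123
To detect salient objects in a wide range of natural scenes, this paper proposes an object detection method that introduces image depth information into the computation of region saliency. The image is first segmented at multiple scales into a number of regions; a regression random forest is then built by learning from several types of region features, and supervised learning assigns a saliency value to each region's features; finally, the saliency values from the different scales are fused by least squares to produce the final saliency map. Experimental results show that the algorithm locates the salient object in each image of an RGBD image dataset fairly accurately.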
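
The least-squares fusion step can be sketched as below. This is a hedged illustration only: the abstract does not say how the fusion weights are fitted, so the sketch assumes one scalar weight per scale, fitted against a ground-truth saliency map by solving the normal equations. The names `fit_fusion_weights` and `fuse` are hypothetical.

```python
def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination
    with partial pivoting (a is assumed to be non-singular)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_fusion_weights(scale_maps, ground_truth):
    """Least-squares weights w minimizing ||sum_s w_s * map_s - ground_truth||^2.

    scale_maps: list of flattened saliency maps, one per scale.
    """
    s = len(scale_maps)
    ata = [[sum(p * q for p, q in zip(scale_maps[i], scale_maps[j]))
            for j in range(s)] for i in range(s)]
    atb = [sum(p * g for p, g in zip(scale_maps[i], ground_truth)) for i in range(s)]
    return solve(ata, atb)

def fuse(scale_maps, weights):
    """Weighted sum of the per-scale saliency maps."""
    return [sum(w * m[k] for w, m in zip(weights, scale_maps))
            for k in range(len(scale_maps[0]))]

# Toy example: two scales, four pixels each.
maps = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 1.0]]
target = [0.3, 0.7, 0.0, 1.0]
weights = fit_fusion_weights(maps, target)
```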

7.
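Image saliency detection based on fully convolutional networks has attracted wide attention and achieved impressive detection performance. However, this type of neural network still has many problems, such as high network complexity that makes training difficult and inaccurate results at the edges of salient objects. To address these problems, this paper proposes a convolutional neural network with Gabor initialization. Its main features are: 1) the convolutional neural network is initialized with Gabor features to improve training efficiency; 2) a multi-scale bridging module is built to connect the encoding and decoding stages effectively and thereby improve saliency detection results; 3) a weighted cross-entropy loss function is proposed to improve training. Experimental results show that the proposed network achieves excellent salient object detection performance on three different datasets.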
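
Two of the ingredients listed above can be sketched in a few lines. Both functions are assumptions made for illustration rather than the paper's definitions: `gabor_kernel` uses the standard real-valued Gabor formula (a Gaussian envelope times a cosine carrier), and `weighted_bce` uses a single positive-class weight, which is only one common way to weight a binary cross-entropy loss.

```python
import math

def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor filter on a size x size grid (size odd).

    Such kernels could be copied into first-layer convolution weights
    as an initialization, one orientation `theta` per output channel.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates by the filter orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def weighted_bce(preds, labels, pos_weight, eps=1e-7):
    """Mean binary cross-entropy with salient (label 1) pixels up-weighted."""
    total = 0.0
    for p, y in zip(preds, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(preds)
```

Up-weighting the positive class is a common choice when salient pixels are much rarer than background pixels; whether the paper weights by class, by edge proximity, or by something else is not stated in the abstract.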

8.
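In computer vision, shape is an important feature for object recognition and detection, and object edges are the most direct expression of shape, so describing shape features from edge information is the most direct and effective approach. Because most current shape descriptors are global and sensitive to changes such as rotation and scaling, a shape descriptor based on the approximating polygon of the object is adopted. This description is local and compact, and it is combined with motion parameter prediction and recursive estimation to achieve two-dimensional object...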

9.
Human-computer interaction is the way in which humans and machines exchange information. With the rapid development of deep learning, human-computer interaction technology has made corresponding breakthroughs. In the past, human-computer interaction relied mostly on hardware devices: people and machines exchanged information through the coordinated work of multiple sensors. As the underlying theory matures, algorithms for human-computer interaction are also being enriched. The popularity of convolutional neural networks has made image processing problems easier to solve, so real-time, intelligent human-computer interaction can be realized through image processing. The main idea of this paper is to capture face images and video in real time and process the facial image information. We locate feature points in the face image and perform expression recognition based on the located feature points. At the same time, we perform gaze tracking on the identified eye region. The facial feature points, the corresponding expressions, and the gaze movements represent the user's intent, so we can analyze that intent by locating the facial feature regions. We define corresponding action information for specific facial features, extract the user's information according to those features, and perform human-computer interaction accordingly.

10.
In recent years, depth maps have proven to be a useful complementary cue for salient object detection. However, most RGB-D salient object detection approaches ignore the fact that depth maps are often of low quality, which inevitably leads to unsatisfactory results. In this paper, we propose a depth cue enhancement and guidance network (DEGNet) for RGB-D salient object detection, which enhances depth quality and uses depth cue guidance to generate predictions with highlighted objects and suppressed backgrounds. Specifically, a depth cue enhancement module is designed to generate high-quality depth maps by enhancing the contrast between the foreground and the background. Then, considering the different characteristics of unimodal RGB and depth features, we use different feature enhancement strategies to strengthen the representation capability of the side-output unimodal features. Moreover, we propose a depth-guided feature fusion module that exploits the depth cues provided by the depth stream to guide the fusion of multi-modal features, making full use of the properties of each modality and thus generating discriminative cross-modal features. In addition, we aggregate cross-modal features at different levels with a pyramid feature shrinking structure to obtain the final prediction. Experimental results on six benchmark datasets demonstrate that DEGNet outperforms 17 state-of-the-art methods.

11.
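Nonparametric density estimation, and kernel density estimation in particular, has received considerable attention for sample analysis and modeling. However, kernel density estimation is computationally expensive, which makes real-time performance hard to achieve in moving object detection. This paper proposes a kernel density estimation method based on feature-frame construction. Because kernel density estimation does not need to assume a density distribution function for the background model, and all sample values are independent and identically distributed, the background can be modeled by constructing feature frames, and the same method can be used to update the background. Experimental results show that the method adapts to environmental changes, runs fast, and has good real-time performance, so it can be applied to surveillance systems with complex backgrounds.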
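
For context, the baseline that this abstract seeks to speed up, per-pixel kernel density estimation for background subtraction, can be sketched as follows. This is plain Gaussian KDE, not the feature-frame construction proposed in the paper; the bandwidth and threshold values are assumptions chosen for the toy example.

```python
import math

def kde_probability(value, samples, bandwidth):
    """Gaussian kernel density estimate of `value` given past pixel samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((value - s) / bandwidth) ** 2) for s in samples
    )

def is_foreground(value, samples, bandwidth=5.0, threshold=1e-3):
    """A pixel is foreground when its intensity is unlikely under the
    density estimated from that pixel's recent background samples."""
    return kde_probability(value, samples, bandwidth) < threshold

# Recent grey levels observed at one pixel (hypothetical values).
history = [100, 102, 98, 101, 99]
```

The cost is one kernel evaluation per stored sample per pixel per frame, which is why reducing the number of stored frames matters for real-time use.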

12.
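Formal description and conflict detection of intelligent services based on INAP   Total citations: 1, self-citations: 0, citations by others: 1
Formal service descriptions derived from natural-language service definitions are ambiguous and cannot detect service conflicts precisely. To address this problem, this paper proposes a new method for formal service description and conflict detection based on the Intelligent Network Application Protocol (INAP).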

13.
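Detection and recognition of specific targets is one of the key technologies of automatic target recognition. As a typical ground target, the airport runway has long been a research topic of interest in this field. Exploiting the fact that runways appear in images as linear structures with high grey values, this paper proposes an airport runway recognition algorithm based on multi-scale enhancement of linear targets. Experimental results show that, against complex backgrounds, the method enhances the runway target while effectively suppressing other non-linear targets, providing genuine enhancement. Against complex backgrounds, the method therefore achieves better recognition performance than edge-based airport runway recognition algorithms.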

14.
LiDAR-based 3D object detection is important for scene perception in autonomous driving, but the point clouds produced by LiDAR are irregular and unstructured, and cannot be processed directly by conventional Convolutional Neural Networks (CNNs). Recently, Graph Convolutional Networks (GCNs) have proven to be an effective way to handle non-Euclidean data, including point clouds. However, a GCN requires massive computation to search for adjacent nodes, and this heavy computational cost limits its application to large-scale LiDAR point clouds in autonomous driving. In this work, we adopt a frustum-based point cloud-image fusion scheme to reduce the number of LiDAR points, thus making GCN-based feature learning on large-scale LiDAR point clouds feasible. On this basis, we propose an efficient graph attention network for 3D object detection in autonomous driving, which learns features directly from the raw LiDAR point cloud without any conversion. We evaluate the model on the public KITTI benchmark: the 3D detection mAP is 63.72% over the KITTI Car, Pedestrian, and Cyclist classes, and the inference speed reaches 7.9 fps on a single GPU, which is faster than other methods of the same type.
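
The idea of frustum-based point reduction can be sketched as below: keep only the 3D points whose image projection falls inside a 2D detection box. This is an illustrative sketch rather than the paper's pipeline. It assumes the points are already expressed in the camera frame (z pointing forward) and a simple pinhole model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`; a real setup would first apply the LiDAR-to-camera calibration.

```python
def points_in_frustum(points, box, fx, fy, cx, cy):
    """Return the points that project inside a 2-D image box.

    points: iterable of (x, y, z) in camera coordinates, z forward.
    box: (u_min, v_min, u_max, v_max) in pixels, e.g. from a 2-D detector.
    """
    u_min, v_min, u_max, v_max = box
    kept = []
    for x, y, z in points:
        if z <= 0:
            continue  # behind the camera, cannot be inside the frustum
        u = fx * x / z + cx  # pinhole projection to pixel coordinates
        v = fy * y / z + cy
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept

cloud = [(0.0, 0.0, 10.0), (5.0, 0.0, 10.0), (0.0, 0.0, -1.0), (0.5, -0.5, 10.0)]
kept = points_in_frustum(cloud, (40, 40, 60, 60), fx=100.0, fy=100.0, cx=50.0, cy=50.0)
```

Only the points inside each frustum are passed on to the graph network, which shrinks the neighbor search that dominates GCN cost.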

15.
Fractional Brownian motion, continuous everywhere and differentiable nowhere, offers a convenient model for irregular nonstationary stochastic processes with long-term dependencies and power-law spectral behavior over wide ranges of frequencies. It shows high correlation at coarse scales and varies slightly at fine scales, which makes it suitable for, and successful in, describing and modeling natural scenes. Man-made objects, on the other hand, can be described well by a set of regular simple shape primitives such as lines and cylinders, and do not exhibit fractal behavior. Based on this difference, we provide a method to discriminate man-made objects from natural scenes. Experiments demonstrate the effectiveness of the developed technique.
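
One way to quantify how closely a signal follows the fractional Brownian motion model is to estimate its Hurst exponent H from the scaling law E[(B(t + d) - B(t))^2] proportional to d^(2H). The sketch below does this with a log-log line fit on a 1-D profile. It is an assumption about how such a discriminator could be built, not the method of the paper; the lag set `(1, 2, 4, 8)` is arbitrary.

```python
import math

def mean_squared_increment(signal, lag):
    """Average of (signal[i + lag] - signal[i])^2 over all valid i."""
    diffs = [(signal[i + lag] - signal[i]) ** 2 for i in range(len(signal) - lag)]
    return sum(diffs) / len(diffs)

def estimate_hurst(signal, lags=(1, 2, 4, 8)):
    """Estimate H as half the slope of log(mean squared increment) vs log(lag).

    Note: a constant signal has zero increments and would raise a
    ValueError in math.log; callers should handle that case separately.
    """
    xs = [math.log(lag) for lag in lags]
    ys = [math.log(mean_squared_increment(signal, lag)) for lag in lags]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope / 2.0

# A perfectly regular ramp, standing in for a smooth man-made edge profile.
ramp = [float(i) for i in range(64)]
```

A smooth ramp gives H close to 1, whereas rough natural textures typically give smaller values, and a poor fit of the line itself would indicate that the fractal model does not apply.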

16.
LiDAR-based 3D object detection has been a popular topic in recent years and is widely used in autonomous driving and robot control. However, due to the scanning pattern of LiDAR, the point clouds of distant objects are sparse and more difficult to detect. To solve this problem, we propose a two-stage network based on spatial context information, named SC-RCNN (Spatial Context RCNN), for object detection in 3D point cloud scenes. SC-RCNN first uses a backbone with sparse convolutions and submanifold sparse convolutions to extract voxel features from the point cloud scene and generate a series of candidate boxes. To handle the sparsity of distant point clouds, we design local grid point pooling (LGP Pooling) to extract features and spatial context information around the candidate regions for subsequent box refinement. In addition, we propose pyramid candidate box augmentation (PCB Augmentation) to expand the candidate boxes in a multi-scale manner, enriching the feature encoding. The experimental results show that SC-RCNN significantly outperforms previous methods on the KITTI and Waymo datasets, and is particularly robust to point cloud sparsity.

17.
Fractional Brownian motion, continuous everywhere and differentiable nowhere, offers a convenient model for irregular nonstationary stochastic processes with long-term dependencies and power-law spectral behavior over wide ranges of frequencies. It shows high correlation at coarse scales and varies slightly at fine scales, which makes it suitable for, and successful in, describing and modeling natural scenes. Man-made objects, on the other hand, can be described well by a set of regular simple shape primitives such as lines and cylinders, and do not exhibit fractal behavior. Based on this difference, a method to discriminate man-made objects from natural scenes is provided. Experiments demonstrate the effectiveness of the developed technique.

18.
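Chen Guoping, Cheng Qiuju, Huang Chaoyi, Zhou Wei, Wang Lu. 电讯技术 (Telecommunication Engineering), 2019, 59(10): 1121-1126
A large number of millimeter-wave images are collected to build a corresponding human-body dataset, and a deep learning method based on Faster R-CNN is proposed to detect dangerous objects concealed on the human body. The method combines a region proposal network with a VGG19 convolutional neural network model to construct a deep convolutional neural network for object detection in millimeter-wave images. To increase the processing capacity for millimeter-wave images, the Caffe deep learning framework is used for training and testing on a graphics processing unit. Experimental results show that the Faster R-CNN based detection method can effectively detect dangerous objects in millimeter-wave images, with an average detection accuracy of about 94% and a detection speed of about 6 frame/s, which makes it a valuable reference for the development of intelligent millimeter-wave security screening systems.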

19.
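Liu Bo, An Jiancheng. 电视技术 (Video Engineering), 2014, 38(5): 38-41
Human action recognition is an active problem in computer video and image processing. To address low recognition rates, slow recognition speed, the lack of real-time recognition, and the recognition errors that arise when different people perform the same action, a method is proposed that effectively addresses these problems. The method analyzes, computes, and matches video frame sequences, and then classifies the matched frames to recognize the action.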

20.
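In everyday production practice, not all of the data captured by a camera is of interest; one may care only about the trajectory and motion characteristics of a particular spherical object. This paper therefore proposes online detection and recognition of spherical objects based on the TMS320DM6446, which reduces the amount of video data to be stored and simplifies subsequent video processing.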
