Found 20 similar documents; search took 16 ms.
1.
This paper proposes a new two-phase approach to robust text detection that integrates visual appearance with geometric reasoning rules. In the first phase, geometric rules are used to achieve a higher recall rate. Specifically, a robust stroke width transform (RSWT) feature is proposed to better recover the stroke width by additionally considering the crossing of two strokes and the continuity of the letter border. In the second phase, a classification scheme based on visual appearance features is used to reject false alarms while maintaining the recall rate. To learn a better classifier from multiple visual appearance features, a novel classification method called double soft multiple kernel learning (DS-MKL) is proposed. DS-MKL is motivated by a novel kernel margin perspective on multiple kernel learning and can effectively suppress the influence of noisy base kernels. Comprehensive experiments on the benchmark ICDAR2005 competition dataset demonstrate the effectiveness of the proposed two-phase text detection approach over state-of-the-art approaches, with a performance gain of up to 4.4% in F-measure.
2.
This paper deals with the use of invariant visual features for visual servoing. New features are proposed to control the 6 degrees of freedom of a robotic system with better linearizing properties and robustness to noise than the state of the art in image-based visual servoing. We show in this paper that by using these features the behavior of image-based visual servoing in task space can be significantly improved. Several experimental results are provided and validate our proposal.
3.
Multimedia Tools and Applications - Discriminative correlation filter-based algorithms have recently demonstrated prominent advantages in the community of computer visual tracking, due to their...
4.
This paper presents a novel online object tracking algorithm with sparse representation for learning effective appearance models under a particle filtering framework. Compared with the state-of-the-art ℓ1 sparse tracker, which simply assumes that the image pixels are corrupted by independent Gaussian noise, our proposed method is based on information-theoretic learning and is much less sensitive to corruptions; it achieves this by assigning small weights to occluded pixels and outliers. The most appealing aspect of this approach is that it can yield robust estimations without using the trivial templates adopted by the previous sparse tracker. By using weighted linear least squares with non-negativity constraints at each iteration, a sparse representation of the target candidate is learned; to further improve the tracking performance, target templates are dynamically updated to capture appearance changes. In our template update mechanism, the similarity between the templates and the target candidates is measured by the earth mover's distance (EMD). Using the largest open benchmark for visual tracking, we empirically compare two ensemble methods constructed from six state-of-the-art trackers against the individual trackers. The proposed tracking algorithm runs in real time and, on challenging sequences, performs favorably in terms of efficiency, accuracy and robustness against state-of-the-art algorithms.
5.
Highlight detection is a fundamental step in semantics-based video retrieval and personalized sports video browsing. In this paper, an effective hidden Markov model (HMM) based soccer video event detection method built on a hierarchical video analysis framework is proposed. Soccer video shots are classified into four coarse mid-level semantics: global, median, close-up and audience. Global and local motion information is utilized to refine the coarse mid-level semantics. Sequential soccer video is segmented into event clips. Both the temporal transitions of the mid-level semantics and the overall features of an event clip are fused using HMMs to determine the type of event. The highlight detection performance of dynamic Bayesian networks (DBN), conditional random fields (CRF) and the proposed HMM-based approach is compared. The average F-score of our highlight detection approach (covering goal, shoot, foul and placed kick) is 82.92%, which outperforms that of DBN and CRF by 9.85% and 11.12% respectively. The effects of the number of hidden states, the overall features, and the refinement of mid-level semantics on the event detection performance are also discussed.
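The HMM-based decision step in this abstract can be illustrated with a minimal sketch, assuming one discrete HMM per event type and scoring a clip's sequence of mid-level shot labels with the scaled forward algorithm. The two event types, the two-state models and every probability below are made-up illustrative values, not parameters from the paper, and the fusion with overall clip features is omitted.

```python
import math

# Mid-level shot semantics named in the abstract, used as discrete observations.
SYMBOLS = ["global", "median", "close-up", "audience"]


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | HMM) using the scaled forward algorithm."""
    n_states = len(start)
    # Initialise with the first observation.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob


# Hypothetical two-state models; every number here is an illustrative assumption.
MODELS = {
    "goal": {
        "start": [0.8, 0.2],
        "trans": [[0.6, 0.4], [0.1, 0.9]],
        "emit": [[0.7, 0.2, 0.05, 0.05], [0.05, 0.15, 0.4, 0.4]],  # emit[state][symbol]
    },
    "foul": {
        "start": [0.5, 0.5],
        "trans": [[0.7, 0.3], [0.3, 0.7]],
        "emit": [[0.5, 0.4, 0.08, 0.02], [0.2, 0.5, 0.28, 0.02]],
    },
}


def classify_clip(labels):
    """Pick the event type whose HMM gives the clip the highest log-likelihood."""
    obs = [SYMBOLS.index(label) for label in labels]
    scores = {
        name: forward_log_likelihood(obs, m["start"], m["trans"], m["emit"])
        for name, m in MODELS.items()
    }
    return max(scores, key=scores.get), scores
```

With these toy models, a clip such as `["global", "close-up", "audience", "audience"]` is assigned to the hypothetical "goal" model, because the "foul" model gives audience shots a very low emission probability.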
6.
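Objective: L1 tracking is robust to partial occlusion, but it is prone to model drift and is computationally slow. To address these two problems, this paper proposes a visual tracking method based on discriminative sparse representation. Method: To account for interference from background and occlusion, a discriminative sparse representation model is proposed; based on the block coordinate optimization principle, a fast solver for the model is designed using the learned iterative shrinkage-thresholding algorithm and the soft-thresholding operation. Results: On 8 image sequences, the proposed method is compared with 4 existing classic tracking methods in terms of robustness and sparse-representation computation time. In the qualitative and quantitative robustness experiments, the method adapts well to a variety of disturbances during tracking, and its precision curve is better than those of the other methods as the location error threshold varies from 0 to 50 pixels. For sparse-representation computation time, when tracking with templates of size 16×16 and 32×32, the proposed algorithm takes 0.152 s and 0.257 s respectively, which is clearly faster than the other methods in the experiments. Conclusion: Compared with classic tracking methods, the proposed method achieves fast object tracking while overcoming occlusion, background interference, appearance change and other adverse factors. Because the method both computes the sparse representation quickly and overcomes several factors that degrade tracking robustness, it can be applied in practical settings such as video surveillance and sports.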
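The soft-thresholding operation mentioned in this abstract can be sketched as follows. This is plain ISTA on a generic L1-regularised least-squares problem, swapped in for illustration; it is not the learned variant or the discriminative model of the paper. The function names, the step-size rule and the iteration count are assumptions.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal operator of the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(dictionary, target, lam, n_iter=200):
    """Minimise 0.5 * ||target - D x||^2 + lam * ||x||_1 with plain ISTA.

    dictionary is a list of m rows, each with n entries.
    """
    m, n = len(dictionary), len(dictionary[0])
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the
    # gradient, so 1 / that value is a safe (if conservative) step size.
    lipschitz = sum(v * v for row in dictionary for v in row)
    step = 1.0 / lipschitz
    x = [0.0] * n
    for _ in range(n_iter):
        residual = [
            sum(dictionary[i][j] * x[j] for j in range(n)) - target[i]
            for i in range(m)
        ]
        grad = [sum(dictionary[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Gradient step on the quadratic term, then shrinkage for the L1 term.
        x = [soft_threshold(x[j] - step * grad[j], step * lam) for j in range(n)]
    return x
```

For an identity dictionary the minimiser is simply the soft-thresholded target, which gives a quick sanity check of the iteration.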
7.
The Journal of Supercomputing - In Internet of Things (IoT) environments, visual sensors with good performance have been used to create and apply various kinds of image data. Particularly, in the...
8.
This work presents an automated solution for tool changing in industrial robots using visual servoing and sliding mode control. The robustness of the proposed method is due to the control law of the visual servoing, which uses the information acquired by a vision system to close a feedback control loop. Furthermore, sliding mode control is simultaneously used at a prioritised level to satisfy the constraints typically present in a robot system: joint range limits, maximum joint speeds and allowed workspace. Thus, the global control accurately places the tool in the warehouse while satisfying the robot constraints. The feasibility and effectiveness of the proposed approach are substantiated by simulation results for a complex 3D case study. Moreover, real experimentation with a 6R industrial manipulator is also presented to demonstrate the applicability of the method for tool changing.
9.
Recently, a robust version of the linear decorrelating detector (LDD) based on Huber's M-estimation technique has been proposed. In this paper, we first demonstrate the use of a three-layer recurrent neural network (RNN) to implement the LDD without requiring matrix inversion. The key idea is based on minimizing an appropriate computational energy function iteratively. Second, it will be shown that the M-decorrelating detector (MDD) can be implemented by simply incorporating sigmoidal neurons in the first layer of the RNN. A proof of the redundancy of the matrix inversion process is provided and the computational saving in a realistic network is highlighted. Third, we illustrate how further performance gain could be achieved for the subspace-based blind MDD by using robust estimates of the signal subspace components in the initial stage. The impulsive noise is modeled using non-Gaussian alpha-stable distributions, which do not include a Gaussian component but facilitate the use of the recently proposed geometric signal-to-noise ratio (G-SNR). The characteristics and performance of the proposed neural-network detectors are investigated by computer simulation.
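The idea of obtaining the decorrelating detector's output by iterative energy minimisation, rather than by inverting the correlation matrix, can be sketched as below. The abstract does not give the energy function, so this sketch assumes the standard quadratic E(x) = 0.5 xᵀRx − yᵀx, whose minimiser is R⁻¹y, and uses plain gradient descent instead of the paper's recurrent network. The function names, step size and iteration count are assumptions; the step must stay below 2 divided by the largest eigenvalue of R for the iteration to converge.

```python
def decorrelate_by_descent(corr, matched, step=0.1, n_iter=500):
    """Approximate corr^{-1} * matched by gradient descent on
    E(x) = 0.5 * x^T corr x - matched^T x, without forming a matrix inverse."""
    k = len(matched)
    x = [0.0] * k
    for _ in range(n_iter):
        # Gradient of the assumed energy: corr @ x - matched.
        grad = [
            sum(corr[i][j] * x[j] for j in range(k)) - matched[i]
            for i in range(k)
        ]
        x = [x[i] - step * grad[i] for i in range(k)]
    return x


def detect_bits(corr, matched):
    """Linear decorrelating decision: sign of the decorrelated statistics."""
    return [1 if v >= 0 else -1 for v in decorrelate_by_descent(corr, matched)]
```

As a noise-free usage example, take two users with cross-correlation 0.6, amplitudes 1 and 3, and transmitted bits +1 and −1. The matched-filter outputs are then `[-0.8, -2.4]`, whose signs would misdetect the weaker user, while `detect_bits([[1.0, 0.6], [0.6, 1.0]], [-0.8, -2.4])` recovers both bits.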
10.
A framework for robust foreground detection that works under difficult conditions such as dynamic background and moderately moving camera is presented in this paper. The proposed method includes two main components: coarse scene representation as the union of pixel layers, and foreground detection in video by propagating these layers using a maximum-likelihood assignment. We first cluster into "layers" those pixels that share similar statistics. The entire scene is then modeled as the union of such non-parametric layer-models. An incoming pixel is detected as foreground if it does not adhere to these adaptive models of the background. A principled way of computing thresholds is used to achieve robust detection performance with a pre-specified number of false alarms. Correlation between pixels in the spatial vicinity is exploited to deal with camera motion without precise registration or optical flow. The proposed technique adapts to changes in the scene, and allows persistent foreground objects to be automatically converted to background and re-converted to foreground when they become interesting. This simple framework addresses the important problem of robust foreground and unusual region detection, at about 10 frames per second on a standard laptop computer. The presentation of the proposed approach is complemented by results on challenging real data and comparisons with other standard techniques.
11.
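To identify application-layer DoS attacks quickly and effectively, an HMM-based detection method for application-layer DoS attacks is proposed. The method takes application-layer protocol keywords and the time intervals between keywords as input, and uses a hidden Markov model to detect application-layer DoS attacks quickly. Experimental results show that the method achieves a high detection rate and a low false alarm rate for a variety of application-layer DoS attacks.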
12.
Multimedia Tools and Applications - Smart transportation plays an important role in building smart cities. We can obtain mass data from multi-source and use it to manage transportation in an...
13.
Many existing methods for pedestrian detection have limited detection performance under deformations such as large appearance variations. To overcome this limitation, we propose a novel pedestrian detection method that uses two low-level boosted features to detect pedestrians despite the presence of deformations. One is a boosted max feature (BMF) that uses a max operation to aggregate a selected pair of features to make them invariant to deformation. The other is a boosted difference feature (BDF) that uses a difference operation between a selected pair of features to improve the localization accuracy of pedestrian detection. We incorporate a spatial pyramid pooling method that uses multiple sized blocks to increase the richness of boosted features in a local region, and use a RealBoost method to train a tree-structured classifier for the proposed pedestrian detection method. We also apply a region-of-interest method to the detected results to remove false positives effectively. Our proposed detector achieved log-average miss rates of 19.95%, 10.39%, 36.12%, and 39.57% on the Caltech-USA, INRIA, ETH, and TUD-Brussels datasets, respectively, which are the lowest among those of all state-of-the-art pedestrian detectors.
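The two pairwise operations named in this abstract can be illustrated with a toy sketch. It only shows the max and difference operations on an already chosen pair of feature responses; how pairs are selected by boosting, the pooling scheme and the classifier are not reproduced. The function names and the example values are hypothetical.

```python
def max_feature(features, i, j):
    """Max over a pair of responses: unchanged if the strong response
    moves from position i to position j (or back)."""
    return max(features[i], features[j])


def difference_feature(features, i, j):
    """Signed difference over a pair of responses: sensitive to which of
    the two positions carries the strong response."""
    return features[i] - features[j]


# Hypothetical responses of one filter at four spatial cells, before and
# after a body part shifts from cell 1 to cell 2.
before = [0.1, 0.9, 0.2, 0.1]
after = [0.1, 0.2, 0.9, 0.1]
```

On this toy pair, `max_feature` returns the same value for cells 1 and 2 before and after the shift, while `difference_feature` changes sign, which matches the abstract's stated roles of the two features (invariance to deformation versus localization).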
14.
The current sensor is one of the key elements in the control system of an induction motor. Whether the variables reflecting the motor's operating status can be measured accurately directly affects the control performance of the motor system, so timely and accurate detection of sensor faults is necessary. This paper puts forward an observer-based method of residual generation and fault detection built on the mathematical model of the induction motor. Because the observer design is not limited by whether the nonlinear part satisfies the Lipschitz condition, the applicability of such an observer is expanded. Meanwhile, the conflict between robustness to errors and fault sensitivity is also resolved. The correctness and effectiveness of the method are verified by experimental tests on simulated faults, which also offer guidance for engineering practice.
15.
As one of the most significant local image features, the corner is widely utilized in many computer vision applications. A number of contour-based corner detection algorithms have been proposed over the last decades, among which the chord-to-point distance accumulation (CPDA) corner detector is reported to produce robust performance in corner detection, especially compared with curvature scale-space (CSS) based corner detectors, which are sensitive to local variation and noise on the contour. In this paper, we investigate the CPDA algorithm in terms of its limitations, and then propose the altitude-to-chord ratio accumulation (ACRA) corner detector based on the CPDA approach. The altitude-to-chord ratio is insensitive to the selection of chord length compared with the chord-to-point distance, which allows us to utilize a single chord instead of the three chords used in the CPDA algorithm. Besides, we replace the maximum normalization used in the CPDA algorithm with linear normalization to avoid uneven data projection. Numerical experiments demonstrate that the proposed ACRA corner detection algorithm outperforms the CPDA approach and seven other state-of-the-art methods in terms of the repeatability and localization error evaluation metrics.
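A simplified reading of the altitude-to-chord ratio can be sketched as follows. The abstract does not define the quantity precisely, so this sketch assumes that the altitude is the perpendicular distance from a contour point to a chord, that the ratio divides it by the chord's Euclidean length, and that ratios are accumulated over every chord of a fixed sample span that straddles the point. The function names and the accumulation scheme are assumptions, and the normalization step is omitted.

```python
import math


def altitude_to_chord_ratio(point, chord_start, chord_end):
    """Perpendicular distance from point to the chord, divided by chord length."""
    (px, py), (ax, ay), (bx, by) = point, chord_start, chord_end
    chord_len = math.hypot(bx - ax, by - ay)
    if chord_len == 0.0:
        return 0.0
    # Twice the triangle area divided by the base gives the altitude.
    altitude = abs((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / chord_len
    return altitude / chord_len


def accumulate_ratios(contour, span):
    """For each point k of an open contour, sum the ratio over all chords
    (contour[j], contour[j + span]) with j < k < j + span."""
    n = len(contour)
    scores = [0.0] * n
    for k in range(n):
        for j in range(max(0, k - span + 1), min(k, n - span)):
            scores[k] += altitude_to_chord_ratio(contour[k], contour[j], contour[j + span])
    return scores
```

Because the ratio divides one length by another, it does not change when the contour is uniformly scaled. On an L-shaped contour the accumulated score peaks at the corner sample.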
16.
Determining the pupil center is fundamental for calculating eye orientation in video-based systems. Existing techniques are error-prone and not robust because eyelids, eyelashes, corneal reflections or shadows in many instances occlude the pupil. We have developed a new algorithm which utilizes curvature characteristics of the pupil boundary to eliminate these artifacts. The pupil center is computed based solely on points related to the pupil boundary. For each boundary point, a curvature value is computed. Occlusion of the boundary induces characteristic peaks in the curvature function. Curvature values for normal pupil sizes were determined and a threshold was found which, together with heuristics, discriminated normal from abnormal curvature. The remaining boundary points were fit with an ellipse using a least squares error criterion. The center of the ellipse is an estimate of the pupil center. This technique is robust and accurately estimates the pupil center with less than 40% of the pupil boundary points visible.
17.
We present a novel algorithm for detection of certain types of unusual events. The algorithm is based on multiple local monitors which collect low-level statistics. Each local monitor produces an alert if its current measurement is unusual, and these alerts are integrated into a final decision regarding the existence of an unusual event. Our algorithm satisfies a set of requirements that are critical for successful deployment of any large-scale surveillance system. In particular, it requires a minimal setup (taking only a few minutes) and is fully automatic afterwards. Since it is not based on objects' tracks, it is robust and works well in crowded scenes where tracking-based algorithms are likely to fail. The algorithm is effective as soon as sufficient low-level observations representing the routine activity have been collected, which usually happens after a few minutes. Our algorithm runs in real time. It was tested on a variety of real-life crowded scenes. A ground truth was extracted for these scenes, with respect to which detection and false-alarm rates are reported.
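The monitor-and-integrate structure described in this abstract can be sketched as below. The abstract does not specify the low-level statistics, the rarity test or the integration rule, so all of these are assumptions: each hypothetical `LocalMonitor` keeps a histogram of quantised measurements and alerts when the current value has rarely been seen, and the final decision is a simple count of simultaneous alerts.

```python
from collections import Counter


class LocalMonitor:
    """Keeps a histogram of past measurements at one location (assumed design)."""

    def __init__(self, rarity=0.05, min_samples=20):
        self.counts = Counter()
        self.total = 0
        self.rarity = rarity            # relative frequency below which a value is unusual
        self.min_samples = min_samples  # history needed before alerts are allowed

    def observe(self, value):
        """Return True if value is unusual given the history, then record it."""
        alert = (
            self.total >= self.min_samples
            and self.counts[value] / self.total < self.rarity
        )
        self.counts[value] += 1
        self.total += 1
        return alert


def unusual_event(monitors, measurements, min_alerts=2):
    """Integrate local alerts: report an event if enough monitors alert at once."""
    alerts = sum(m.observe(v) for m, v in zip(monitors, measurements))
    return alerts >= min_alerts
```

In this sketch a monitor stays silent until it has seen `min_samples` measurements, which loosely mirrors the abstract's remark that the algorithm becomes effective once enough routine activity has been observed.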
18.
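Drawing on the characteristics of visually salient region detection, this paper proposes a motion segmentation method oriented toward visual attention region detection. The method clusters the motion trajectories of feature points using a hierarchical clustering approach. First, the median shift algorithm is used to widen the gap between the feature vectors of different motion types while narrowing the differences within the same motion type. Then, an unsupervised clustering algorithm segments the different types of motion and automatically obtains the number of motion classes. Finally, based on the motion segmentation result, a motion-salient region generation method that combines spatial and color sampling is proposed. Compared with previous methods, the method segments different types of motion automatically, generates more accurate visual attention regions, and is substantially more stable. Experimental results demonstrate the effectiveness and stability of the method.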
19.
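A saliency detection method for maritime scenes based on singular value decomposition is proposed. The color and luminance channel features of the maritime scene image are extracted, and each is decomposed by singular value decomposition; the typical components of each feature are selected according to a preset threshold. The coarse saliency map of each feature is defined as the difference between the feature and its typical components. To further remove interference such as sea clutter, the spatial-domain global saliency is computed on the coarse saliency map to form the saliency map. The resulting color-channel and luminance-channel saliency maps are linearly combined into the overall saliency map. Experiments on maritime scene images demonstrate the effectiveness of the proposed method.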
20.
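The classic HMM used in speech recognition ignores the correlation information between speech signals. To address this problem, the spatial correlation of speech signals is used to compensate the classic HMM, yielding an improved model. Through a spatial correlation transform, the method describes the spatial correlation between the current speech feature and historical data, and thereby models the joint state output distribution. The decoding algorithm of the improved model is derived from the decoding algorithm of the classic HMM by using the parameter update algorithm of the spatial correlation transform. Experimental results show that the method yields a clear performance improvement on a speaker-independent continuous speech recognition system.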