20 similar documents found (search time: 15 ms)
1.
2.
Location information, i.e., the position of content in the image plane, is considered an important supplement in saliency detection. The effect of location information is usually evaluated by integrating it with selected saliency detection methods and measuring the improvement, which is highly influenced by the choice of saliency methods. In this paper, we provide a direct and quantitative analysis of the importance of location information for saliency detection in natural images. We first analyze the relationship between content location and saliency distribution on four public image datasets, and validate the distribution by simply treating a location-based Gaussian distribution as the saliency map. To further validate the effectiveness of location information, we propose a location-based saliency detection approach, which initializes saliency maps entirely from location information and propagates saliency among patches based on color similarity, and we discuss the robustness of location information's effect. The experimental results show that location information plays a positive role in saliency detection, and the proposed method outperforms most state-of-the-art saliency detection methods and handles natural images with different object positions and multiple salient objects.
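As a rough sketch of the location-only baseline evaluated above, the snippet below builds a saliency map from nothing but a 2-D Gaussian anchored at the image center; the bandwidth ratio is an illustrative assumption, not a value from the paper.

```python
import numpy as np

def gaussian_location_prior(height, width, sigma_ratio=0.3):
    """Saliency from location alone: a 2-D Gaussian centered on the
    image plane (the common center-prior baseline)."""
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    sy, sx = sigma_ratio * height, sigma_ratio * width
    prior = np.exp(-((ys - cy) ** 2 / (2 * sy ** 2)
                     + (xs - cx) ** 2 / (2 * sx ** 2)))
    return prior / prior.max()  # normalize to [0, 1]

saliency_map = gaussian_location_prior(480, 640)
```

Scoring such a map against ground-truth annotations gives a lower bound on how much location alone explains, before any color-similarity propagation is added.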
3.
Liu Yizhi, Gu Xiaoyan, Huang Lei, Ouyang Junlin, Liao Miao, Wu Liangran 《Multimedia Tools and Applications》2020, 79(7-8): 4729-4745
Multimedia Tools and Applications - Content-based adult video detection plays an important role in preventing pornography. However, existing methods usually rely on a single modality and seldom focus...
4.
Keechul Jung, Kwang In Kim 《Pattern recognition》2004, 37(5): 977-997
Text data present in images and video contain useful information for automatic annotation, indexing, and structuring of images. Extraction of this information involves detection, localization, tracking, extraction, enhancement, and recognition of the text from a given image. However, variations of text due to differences in size, style, orientation, and alignment, as well as low image contrast and complex backgrounds, make the problem of automatic text extraction extremely challenging. While comprehensive surveys of related problems such as face detection, document analysis, and image and video indexing can be found, the problem of text information extraction is not well surveyed. A large number of techniques have been proposed to address this problem, and the purpose of this paper is to classify and review these algorithms, discuss benchmark data and performance evaluation, and point out promising directions for future research.
5.
《International Journal of Remote Sensing》2012, 33(8): 3095-3118
The requirements of spectral and spatial quality differ from region to region in remote sensing images. Employing saliency in pan-sharpening methods is an effective approach to fulfil such demands. Common saliency feature analysis, which considers the mutual information between multiple images, can ensure consistency and accuracy when assigning saliency to regions in different images. Thus, we propose a pan-sharpening method based on common saliency feature analysis and multiscale spatial information extraction for multiple remote sensing images. First, we extract spatial information by the guided filter and accurate intensity component estimation. Then, a common saliency feature analysis method based on global contrast calculation and intensity feature extraction is designed to obtain a preliminary pixel-wise saliency estimation, which is subsequently integrated with texture-feature-based compensation to generate adaptive injection gains. The introduction of common saliency feature analysis guarantees that the same pan-sharpening strategy is applied to regions with similar features in multiple images. Finally, the injection gains are used to implement the detail injection. Our proposal satisfies the diverse needs of spatial and spectral information for different regions in a single image and guarantees that regions with similar features in different images are treated consistently during pan-sharpening. Both visual and quantitative results demonstrate that our method performs better in guaranteeing consistency across multiple images, improving spatial quality, and preserving spectral fidelity.
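The final detail-injection step the abstract mentions has a common generic form: add gain-scaled high-frequency PAN detail to each upsampled MS band. The sketch below is that generic form only; it uses a crude box-filter low-pass where the paper uses a guided filter, and a caller-supplied gain map standing in for the adaptive, saliency-derived injection gains.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def inject_details(ms_band, pan, gain):
    """Detail injection: MS + gain * (PAN - lowpass(PAN)).
    ms_band: one multispectral band, already upsampled to PAN size.
    gain: scalar or per-pixel array of injection gains."""
    pan_low = uniform_filter(pan, size=5)  # stand-in for the guided filter
    return ms_band + gain * (pan - pan_low)
```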
6.
Ullah Javid, Khan Ahmad, Jaffar Muhammad Arfan 《Multimedia Tools and Applications》2018, 77(6): 7429-7446
Multimedia Tools and Applications - The segmentation of moving objects becomes challenging when the object motion is small, the shape of the object changes, and there is global background motion in...
7.
Zhang Xufan, Wang Yong, Yan Jun, Chen Zhenxing, Wang Dianhong 《Multimedia Tools and Applications》2020, 79(25-26): 17331-17348
Multimedia Tools and Applications - Conventional saliency detection algorithms usually achieve good detection performance at the cost of high computational complexity, and most of them focus on...
8.
Niu Yuzhen, Lin Lening, Chen Yuzhong, Ke Lingling 《Multimedia Tools and Applications》2017, 76(24): 26329-26353
Multimedia Tools and Applications - Visual saliency detection is useful in carrying out image compression, image segmentation, image retrieval, and other image processing applications. The majority of...
9.
To meet the low-cost requirements of video post-processing chips, a new edge-preserving image upscaling algorithm that needs only two line buffers is proposed. The method finds representative points, in place of the interpolation point itself, to determine the correlation direction. Once the correlation direction is found, four neighboring points and their corresponding positions along that direction are located and used for interpolation. Experimental results show that the algorithm achieves image magnification while eliminating edge blurring and jagged artifacts, making it suitable for low-cost digital video post-processing chips.
10.
This paper presents a new attention model for detecting visual saliency in news video. In the proposed model, bottom-up (low-level) features and top-down (high-level) factors are used to compute bottom-up and top-down saliency respectively. The two saliency maps are then fused after a normalization operation. In the bottom-up attention model, we use the quaternion discrete cosine transform at multiple scales and in multiple color spaces to detect static saliency. Meanwhile, multi-scale local motion and global motion conspicuity maps are computed and integrated into a motion saliency map. To effectively suppress background motion noise, a simple histogram of average optical flow is adopted to calculate motion contrast. The bottom-up saliency map is then obtained by combining the static and motion saliency maps. In the top-down attention model, we utilize high-level stimuli in news video, such as faces, persons, cars, speakers, and flashes, to generate the top-down saliency map. The proposed method has been extensively tested using three popular evaluation metrics over two widely used eye-tracking datasets. Experimental results demonstrate the effectiveness of our method in saliency detection for news videos compared to several state-of-the-art methods.
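A single-channel simplification of the DCT-based static saliency idea is the "image signature": keep only the sign of the DCT spectrum, invert, square, and smooth. The paper's quaternion DCT handles multiple color channels jointly; the sketch below is only the scalar variant of that idea.

```python
import numpy as np
from scipy.fft import dctn, idctn
from scipy.ndimage import gaussian_filter

def dct_signature_saliency(gray):
    """Static saliency of a float 2-D image from its DCT sign spectrum."""
    signature = np.sign(dctn(gray, norm='ortho'))  # discard magnitudes
    recon = idctn(signature, norm='ortho')         # back to image domain
    return gaussian_filter(recon ** 2, sigma=5)    # square and smooth
```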
11.
Multimedia Tools and Applications - The human visual system is endowed with an innate capability of distinguishing the salient regions of an image. It does so even in the presence of noise and other...
12.
This article addresses the use of stereoscopic images in teleoperated tasks. Depth perception is a key point in the ability to skillfully manipulate in remote environments. Displaying three‐dimensional images is a complex process but it is possible to design a teleoperation interface that displays stereoscopic images to assist in manipulation tasks. The appropriate interface for image viewing must be chosen and the stereoscopic video cameras must be calibrated so that the image disparity is natural for the observer. Attention is given to the calculation of stereoscopic image disparity, and suggestions are made as to the limits within which adequate stereoscopic image perception takes place. The authors have designed equipment for image visualization in teleoperated systems. These devices are described and their performance evaluated. Finally, an architecture for the transmission of stereoscopic video images via network is proposed, which in the future will substitute for current image processing devices. © 2005 Wiley Periodicals, Inc.
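The disparity calculation discussed above can be grounded in the textbook relation for a parallel-axis stereo rig, d = f·b/Z; the sketch below applies that relation with illustrative rig parameters (the paper's exact formulation and limits may differ).

```python
def pixel_disparity(focal_px, baseline_m, depth_m):
    """Disparity in pixels of a point at depth Z for a parallel rig
    with baseline b and focal length f: d = f * b / Z."""
    return focal_px * baseline_m / depth_m

# Hypothetical rig: 6.5 cm baseline, 1000 px focal length. A point at
# 1.3 m yields 50 px of disparity -- the kind of number one checks
# against the display's comfortable-viewing budget.
print(pixel_disparity(1000.0, 0.065, 1.3))  # -> 50.0
```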
13.
To eliminate the interference of complex moving foregrounds with video stabilization accuracy, and to exploit the distinctive advantage of spatiotemporal saliency in moving-object detection, we propose a high-precision video stabilization algorithm that incorporates spatiotemporal saliency. On one hand, the algorithm identifies moving objects via spatiotemporal saliency detection and removes them; on the other hand, it performs motion compensation with multi-grid motion paths. The pipeline comprises SURF feature point extraction and matching, spatiotemporal salient object detection, grid partitioning and motion vector computation, motion trajectory generation, multi-path smoothing, and motion compensation. Experimental results show that, compared with traditional stabilization algorithms, the proposed algorithm stands out on the Stability metric. For videos disturbed by large-scale moving foregrounds, it improves Stability by about 9.6% over RTVSM (Robust Traffic Video Stabilization Method assisted by foreground feature trajectories); for videos disturbed by multiple moving foregrounds, it improves Stability by about 5.8% over the Bundled-paths algorithm, fully demonstrating its advantage for stabilizing complex scenes.
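As a minimal illustration of the multi-path smoothing stage, the sketch below smooths one grid cell's 1-D motion path with a moving average; the paper's actual smoothing objective, applied per grid cell after saliency-based foreground removal, is more elaborate.

```python
import numpy as np

def smooth_path(path, radius=15):
    """Moving-average smoothing of a 1-D camera motion path; the
    smoothed minus original path gives the per-frame compensation."""
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    padded = np.pad(np.asarray(path, float), radius, mode='edge')
    return np.convolve(padded, kernel, mode='valid')  # same length as input
```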
14.
In spite of the ever-increasing prevalence of low-cost color printing devices, gray-scale printers remain in widespread use. Authors producing documents with color images for any venue must account for the possibility that the color images might be reduced to gray scale before they are viewed. Because conversion to gray scale reduces the number of color dimensions, some loss of visual information is generally unavoidable. Ideally, we can restrict this loss to features that vary minimally within the color image. Nevertheless, with standard procedures in widespread use, this objective is not often achieved, and important image detail is often lost. Consequently, algorithms that convert color images to gray scale in a way that preserves information remain important. Human observers with color-deficient vision may experience the same problem, in that they may perceive distinct colors to be indistinguishable and thus lose image detail. The same strategy that is used in converting color images to gray scale provides a method for recoloring the images to deliver increased information content to such observers.
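One simple strategy in this spirit (illustrative only, not the paper's method) projects RGB pixels onto their first principal component, so the grayscale axis follows the direction of maximum color variance rather than fixed luminance weights.

```python
import numpy as np

def pca_decolorize(rgb):
    """Gray = projection of RGB pixels onto their first principal
    component, limiting the variance lost in the 3-to-1 reduction."""
    pixels = rgb.reshape(-1, 3).astype(np.float64)
    centered = pixels - pixels.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    gray = centered @ vt[0]                          # first principal axis
    gray = (gray - gray.min()) / (np.ptp(gray) + 1e-12)
    return gray.reshape(rgb.shape[:2])
```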
15.
16.
Weiwei Xing, Pingping Bai, Shunli Zhang, Peng Bao 《Automatic Control and Computer Sciences》2017, 51(3): 180-192
Pedestrian detection is a fundamental problem in video surveillance and has achieved great progress in recent years. However, the performance of a generic pedestrian detector trained on public datasets drops significantly when it is applied to specific scenes, owing to the difference between the source training samples and the pedestrian samples in the target scenes. We propose a novel transfer learning framework that automatically transfers a generic detector to a scene-specific pedestrian detector without manually labeling training samples from target scenes. In our method, we first obtain initial detection results, and several cues are used to filter out target templates, taken from those results, whose labels we are confident about. A Gaussian mixture model (GMM) is used to obtain the motion areas in each video frame and some additional target samples. The relevancy between target samples and target templates, and between source samples and target templates, is estimated by sparse coding and later used to calculate weights for the source and target samples. Saliency detection is an essential step before computing the relevancy between source samples and target templates, as it eliminates interference from non-salient regions. We demonstrate the effectiveness of our scene-specific detector on a public dataset and compare it with the generic detector. The detection rate improves significantly and is comparable with that of a detector trained on many manually labeled samples from the target scene.
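A minimal sketch of the GMM motion-area step, using OpenCV's MOG2 background subtractor; the video path is hypothetical and the abstract does not specify the paper's GMM configuration.

```python
import cv2

# MOG2 maintains a per-pixel Gaussian mixture background model.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

cap = cv2.VideoCapture('target_scene.mp4')  # hypothetical target-scene video
while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = subtractor.apply(frame)  # 255 = foreground, 127 = shadow
    # A small opening removes speckle; the surviving blobs approximate
    # the motion areas from which target-scene samples are mined.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    motion_areas = cv2.morphologyEx(fg_mask, cv2.MORPH_OPEN, kernel)
cap.release()
```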
17.
Objective: Visual saliency plays an important role in many vision-driven applications, and these application domains are shifting from 2D to 3D vision, so saliency models based on RGB-D data have attracted wide attention. Unlike saliency in 2D images, RGB-D saliency involves cues from many different modalities. Multimodal cues are both complementary and competing, and how to exploit and fuse them effectively remains a challenge. Traditional fusion models can hardly exploit the full strengths of multimodal cues, so we study the fusion of multimodal cues in the formation of RGB-D saliency. Method: We propose an RGB-D saliency detection model based on a superpixel-level conditional random field (CRF). Saliency cues of different modalities are extracted, including planar, depth, and motion cues. A CRF model is built over superpixels; combining the influence of the multimodal cues with a smoothness constraint on saliency values over image neighborhoods, a global energy function is designed as the model's optimization objective, characterizing the interaction mechanism among the multimodal cues. The weighting factors of the multimodal cues in the energy function are learned by a convolutional neural network. Results: Experiments on two public RGB-D video saliency datasets compare the proposed model with six saliency detection methods; it outperforms the current state-of-the-art models on all datasets and evaluation metrics. Relative to the second-best results, its AUC (area under curve), sAUC (shuffled AUC), SIM (similarity), PCC (Pearson correlation coefficient), and NSS (normalized scanpath saliency) scores improve by 2.3%, 2.3%, 18.9%, 21.6%, and 56.2% on the IRCCyN dataset, and by 2.0%, 1.4%, 29.1%, 10.6%, and 23.3% on the DML-iTrack-3D dataset. An internal comparison further verifies that the proposed fusion method outperforms other traditional fusion methods. Conclusion: The conditional random field and convolutional neural network in the proposed model take full advantage of the strengths of cues from different modalities and fuse them effectively, improving the performance of saliency detection and benefiting vision-driven applications.
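In schematic form (the notation below is mine, not the paper's), such a global energy combines CNN-weighted unary costs from each modality cue with a pairwise smoothness term over neighboring superpixels:

```latex
E(S) = \sum_{i} \sum_{m \in \{\text{planar},\,\text{depth},\,\text{motion}\}} w_m\, \psi_m(s_i)
     \;+\; \lambda \sum_{(i,j) \in \mathcal{N}} \mu_{ij}\, (s_i - s_j)^2
```

Here s_i is the saliency of superpixel i, ψ_m the cost under the modality-m cue, w_m the CNN-learned weight of that cue, 𝒩 the superpixel adjacency set, and μ_ij an appearance-similarity weight realizing the neighborhood smoothness constraint.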
18.
Effective annotation and content-based search for videos in a digital library require a preprocessing step of detecting, locating, and classifying scene transitions, i.e., temporal video segmentation. This paper proposes a novel approach, spatial-temporal joint probability image (ST-JPI) analysis, for temporal video segmentation. A joint probability image (JPI) is derived from the joint probabilities of intensity values of corresponding points in two images. The ST-JPI, which is a series of JPIs derived from consecutive video frames, presents the evolution of the intensity joint probabilities in a video. The evolution in an ST-JPI during various transitions falls into one of several well-defined linear patterns. Based on the patterns in an ST-JPI, our algorithm detects and classifies video transitions effectively. Our study shows that temporal video segmentation based on ST-JPIs is distinguished from previous methods in the following ways: (1) it is effective and relatively robust not only for video cuts but also for gradual transitions; (2) it classifies transitions on the basis of predefined evolution patterns of ST-JPIs during transitions; (3) it is efficient, scalable, and suitable for real-time video segmentation. Theoretical analysis and experimental results are presented to illustrate the method's efficacy and efficiency.
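A JPI as defined above is simply a normalized joint intensity histogram of two frames; a minimal sketch, assuming 256 gray levels:

```python
import numpy as np

def joint_probability_image(frame_a, frame_b, levels=256):
    """JPI: entry (u, v) is the probability that a pixel has intensity
    u in frame_a and v in frame_b."""
    jpi, _, _ = np.histogram2d(frame_a.ravel(), frame_b.ravel(),
                               bins=levels, range=[[0, levels], [0, levels]])
    return jpi / jpi.sum()
```

For two identical frames the mass sits on the diagonal; a cut scatters it off-diagonal, and gradual transitions trace the intermediate linear patterns the paper classifies.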
19.
Objective: The detection and recognition of aircraft in remote sensing images has been a research hotspot in recent years. Traditional aircraft recognition algorithms generally segment the target first and then extract invariant features for training to complete recognition. With little interference, traditional algorithms recognize well; but remote sensing images contain many interfering factors, such as illumination changes, complex backgrounds, and noise, so traditional algorithms suffer from low recognition accuracy and high time cost. To recognize aircraft in remote sensing images quickly and accurately, we propose an aircraft recognition algorithm based on saliency maps combined with global and local features. Method: First, an improved Itti saliency algorithm extracts salient targets from the remote sensing image. Next, connected regions are found by region growing and line labeling to determine the number and positions of candidate targets. Then MSA (multi-scale autoconvolution), Pseudo-Zernike moment, and Harris-Laplace feature descriptors are extracted, the stability of each feature is evaluated by the ratio of its standard deviation to its mean, and the extracted features are combined into a feature vector. Finally, a support vector machine completes the recognition of the candidate targets. Results: Experimental results show that the algorithm's detection rate and recognition rate are 97.2% and 94.9% respectively, both higher than existing algorithms, with low time cost, a low false-alarm rate (0.03), and good robustness to noise, background interference, illumination changes, and affine transformations. Conclusion: The algorithm uses three kinds of image features (MSA, Pseudo-Zernike moments, and Harris-Laplace descriptors), effectively overcoming the shortcomings of any single feature and improving the recognition rate and anti-interference capability for aircraft in remote sensing images.
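The stability criterion mentioned above, the ratio of standard deviation to mean (the coefficient of variation), is straightforward to sketch; the epsilon guard against zero means is my addition.

```python
import numpy as np

def stability_score(feature_samples):
    """std/mean per feature dimension over a set of samples; lower
    values indicate features that stay more stable across conditions."""
    s = np.asarray(feature_samples, dtype=np.float64)
    return s.std(axis=0) / (np.abs(s.mean(axis=0)) + 1e-12)
```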
20.
In this paper, we propose a method to jointly transfer the color and detail of multiple source images to a target video or image. Our method is based on a probabilistic segmentation scheme using a Gaussian mixture model (GMM) to divide each source image, as well as the target video frames or image, into soft regions and determine the relevant source regions for each target region. For detail transfer, we first decompose each source image, as well as the target video frames or image, into base and detail components. Histogram matching is then performed on the detail components to transfer the detail of matching regions from the source images to the target. We propose a unified framework that performs both color and detail transfer in an integrated manner. We also propose a method to maintain consistency for video targets by enforcing consistent region segmentations for consecutive video frames using GMM-based parameter propagation and adaptive scene-change detection. Experimental results demonstrate that our method automatically produces consistent color- and detail-transferred videos and images from a set of source images.
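The histogram-matching step for detail components can be sketched with scikit-image; deciding which source region matches which target region is the job of the paper's GMM segmentation, which this snippet does not reproduce.

```python
from skimage.exposure import match_histograms

def transfer_detail(target_detail, source_detail):
    """Impose the source region's detail-layer statistics on the
    target region's detail layer via histogram matching."""
    # For multi-channel detail layers, pass channel_axis=-1.
    return match_histograms(target_detail, source_detail)
```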