期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Tiny videos: a large data set for nonparametric video retrieval and frame classification

Karpenko A Aarabi P 《IEEE transactions on pattern analysis and machine intelligence》2011,33(3):618-630

In this paper, we present a large database of over 50,000 user-labeled videos collected from YouTube. We develop a compact representation called "tiny videos" that achieves high video compression rates while retaining the overall visual appearance of the video as it varies over time. We show that frame sampling using affinity propagation-an exemplar-based clustering algorithm-achieves the best trade-off between compression and video recall. We use this large collection of user-labeled videos in conjunction with simple data mining techniques to perform related video retrieval, as well as classification of images and video frames. The classification results achieved by tiny videos are compared with the tiny images framework [24] for a variety of recognition tasks. The tiny images data set consists of 80 million images collected from the Internet. These are the largest labeled research data sets of videos and images available to date. We show that tiny videos are better suited for classifying scenery and sports activities, while tiny images perform better at recognizing objects. Furthermore, we demonstrate that combining the tiny images and tiny videos data sets improves classification precision in a wider range of categories. 相似文献

2.

Heterogeneous bag-of-features for object/scene recognition

Loris Nanni Alessandra Lumini 《Applied Soft Computing》2013,13(4):2171-2178

相似文献

3.

Discriminative compact pyramids for object and scene recognition

Noha M. Elfiky Fahad Shahbaz Khan Joost van de Weijer Jordi Gonzàlez 《Pattern recognition》2012,45(4):1627-1636

Spatial pyramids have been successfully applied to incorporating spatial information into bag-of-words based image representation. However, a major drawback is that it leads to high dimensional image representations. In this paper, we present a novel framework for obtaining compact pyramid representation. First, we investigate the usage of the divisive information theoretic feature clustering (DITC) algorithm in creating a compact pyramid representation. In many cases this method allows us to reduce the size of a high dimensional pyramid representation up to an order of magnitude with little or no loss in accuracy. Furthermore, comparison to clustering based on agglomerative information bottleneck (AIB) shows that our method obtains superior results at significantly lower computational costs. Moreover, we investigate the optimal combination of multiple features in the context of our compact pyramid representation. Finally, experiments show that the method can obtain state-of-the-art results on several challenging data sets. 相似文献

4.

运用Vague集进行多传感器目标识别

万树平《传感器与微系统》2011,30(1):58-59,78

针对具有多个特征指标的多传感器目标识别问题,采用 Vague 集表达目标特征的不确定信息,提出了一种新的多传感器目标识别方法.定义两 Vague 集之间的加权 Hamming 距离和相似度,建立了Vague 集表达的多传感器目标识别模型,通过最小化各目标类型的 Vague 度优化模型客观地确定了各特征的权重,利用相似度... 相似文献

5.

Combining monoSLAM with object recognition for scene augmentation using a wearable camera

R.O. Castle G. Klein D.W. Murray 《Image and vision computing》2010

In wearable visual computing, maintaining a time-evolving representation of the 3D environment along with the pose of the camera provides the geometrical foundation on which person-centred processing can be built. In this paper, an established method for the recognition of feature clusters is used on live imagery to identify and locate planar objects around the wearer. Objects’ locations are incorporated as additional 3D measurements into a monocular simultaneous localization and mapping process, which routinely uses 2D image measurements to acquire and maintain a map of the surroundings, irrespective of whether objects are present or not. Augmenting the 3D maps with automatically recognized objects enables useful annotations of the surroundings to be presented to the wearer. After demonstrating the geometrical integrity of the method, experiments show its use in two augmented reality applications. 相似文献

6.

Optimal linear representations of images for object recognition

Liu X Srivastava A Gallivan K 《IEEE transactions on pattern analysis and machine intelligence》2004,26(5):662-666

Although linear representations are frequently used in image analysis, their performances are seldom optimal in specific applications. This paper proposes a stochastic gradient algorithm for finding optimal linear representations of images for use in appearance-based object recognition. Using the nearest neighbor classifier, a recognition performance function is specified and linear representations that maximize this performance are sought. For solving this optimization problem on a Grassmann manifold, a stochastic gradient algorithm utilizing intrinsic flows is introduced. Several experimental results are presented to demonstrate this algorithm. 相似文献

7.

Indoor scene recognition by a mobile robot through adaptive object detection

P. Espinace T. Kollar N. Roy A. Soto 《Robotics and Autonomous Systems》2013,61(9):932-947

Mobile robotics has achieved notable progress, however, to increase the complexity of the tasks that mobile robots can perform in natural environments, we need to provide them with a greater semantic understanding of their surrounding. In particular, identifying indoor scenes, such as an Office or a Kitchen, is a highly valuable perceptual ability for an indoor mobile robot, and in this paper we propose a new technique to achieve this goal. As a distinguishing feature, we use common objects, such as Doors or furniture, as a key intermediate representation to recognize indoor scenes. We frame our method as a generative probabilistic hierarchical model, where we use object category classifiers to associate low-level visual features to objects, and contextual relations to associate objects to scenes. The inherent semantic interpretation of common objects allows us to use rich sources of online data to populate the probabilistic terms of our model. In contrast to alternative computer vision based methods, we boost performance by exploiting the embedded and dynamic nature of a mobile robot. In particular, we increase detection accuracy and efficiency by using a 3D range sensor that allows us to implement a focus of attention mechanism based on geometric and structural information. Furthermore, we use concepts from information theory to propose an adaptive scheme that limits computational load by selectively guiding the search for informative objects. The operation of this scheme is facilitated by the dynamic nature of a mobile robot that is constantly changing its field of view. We test our approach using real data captured by a mobile robot navigating in Office and home environments. Our results indicate that the proposed approach outperforms several state-of-the-art techniques for scene recognition. 相似文献

8.

基于Vague集的多传感器目标识别方法 总被引：3，自引：0，他引：3

万树平《控制与决策》2009,24(7)

采用Vague集来表达传感器的模糊测量信息,提出一种基于Vague集的多传感器信息融合方法.建立Vague集表达的多目标模型数据库,并定义两Vague值之间的贴近度,利用多目标规划模型客观地确定各特征的权重,根据综合贴近度给出目标识别算法.实例分析表明了算法的有效性. 相似文献

9.

Optimization methods of video images processing for mobile object recognition

Xiao Shuo Li Tianxu Wang Jiawei 《Multimedia Tools and Applications》2020,79(25-26):17245-17255

Multimedia Tools and Applications - Recognition of moving objects in video images is mainly based on acquiring the target information in a certain time series. After image processing, relevant... 相似文献

10.

Handprinted word recognition on a NIST data set 总被引：1，自引：0，他引：1

Paul Gader Michael Whalen Margaret Ganzberger Dan Hepp 《Machine Vision and Applications》1995,8(1):31-40

An approach to handprinted word recognition is described. The approach is based on the use of generating multiple possible segmentations of a word image into characters and matching these segmentations to a lexicon of candidate strings. The segmentation process uses a combination of connected component analysis and distance transform-based, connected character splitting. Neural networks are used to assign character confidence values to potential character within word images. Experimental results are provided for both character and word recognition modules on data extracted from the NIST handprinted character database. 相似文献

11.

Fast algorithms for nonparametric population modeling of large data sets

Gianluigi Pillonetto Author Vitae Giuseppe De Nicolao^{Author Vitae} 《Automatica》2009,45(1):173-179

Population models are widely applied in biomedical data analysis since they characterize both the average and individual responses of a population of subjects. In the absence of a reliable mechanistic model, one can resort to the Bayesian nonparametric approach that models the individual curves as Gaussian processes. This paper develops an efficient computational scheme for estimating the average and individual curves from large data sets collected in standardized experiments, i.e. with a fixed sampling schedule. It is shown that the overall scheme exhibits a “client-server” architecture. The server is in charge of handling and processing the collective data base of past experiments. The clients ask the server for the information needed to reconstruct the individual curve in a single new experiment. This architecture allows the clients to take advantage of the overall data set without violating possible privacy and confidentiality constraints and with negligible computational effort. 相似文献

12.

Adaptive RANSAC and extended region-growing algorithm for object recognition over remote-sensing images

Hossein-Nejad Zahra Nasri Mehdi 《Multimedia Tools and Applications》2022,81(22):31685-31708

Multimedia Tools and Applications - In this paper, a new approach is proposed for object recognition in remote-sensing images. In the proposed approach, the matching process between the object in... 相似文献

13.

Using spin images for efficient object recognition in cluttered 3Dscenes

Johnson A.E. Hebert M. 《IEEE transactions on pattern analysis and machine intelligence》1999,21(5):433-449

相似文献

14.

Precise candidate selection for large character set recognition byconfidence evaluation

Liu C.-L. Nakagawa M. 《IEEE transactions on pattern analysis and machine intelligence》2000,22(6):636-641

This paper proposes a precise candidate selection method for large character set recognition by confidence evaluation of distance-based classifiers. The proposed method is applicable to a wide variety of distance metrics and experiments on Euclidean distance and city block distance have achieved promising results. By confidence evaluation, the distribution of distances is analyzed to derive the probabilities of classes in two steps: output probability evaluation and input probability inference. Using the input probabilities as confidences, several selection rules have been tested and the rule that selects the classes with high confidence ratio to the first rank class produced best results. The experiments were implemented on the ETL9B database and the results show that the proposed method selects about one-fourth as many candidates with accuracy preserved compared to the conventional method that selects a fixed number of candidates 相似文献

15.

Overlapping object recognition: a paradigm for multiple sensorfusion

Intaek Kim Vachtsevanos G. 《Robotics & Automation Magazine, IEEE》1998,5(3):37-44

Recognizing and identifying overlapping or occluded objects is a problem typically encountered in a manufacturing setting. The resultant image distortion tends to limit the applicability of current recognition systems in that case. The proposed recognition scheme involves the utility of an appropriate suite of complementary sensors and is based upon a systematic methodology that addresses the modeling problem through a polygonal approximation and the matching task between the sensor data and stored templates through a construction, called the intervertex matrix. An example is included to illustrate the simplicity and flexibility of the proposed approach 相似文献

16.

A local approach for 3D object recognition through a set of size functions

Mohammed Ayoub Alaoui Mhamdi Djemel Ziou 《Image and vision computing》2014

In this paper, a local approach for 3D object recognition is presented. It is based on the topological invariants provided by the critical points of the 3D object. The critical points and the links between them are represented by a set of size functions obtained after splitting the 3D object into portions. A suitable similarity measure is used to compare the sets of size functions associated with the 3D objects. In order to validate our approach's recognition performance, we used different collections of 3D objects. The obtained scores are favourably comparable to the related work. 相似文献

17.

CarveNet: a channel-wise attention-based network for irregular scene text recognition

Wu Guibin Zhang Zheng Xiong Yongping 《International Journal on Document Analysis and Recognition》2022,25(3):177-186

International Journal on Document Analysis and Recognition (IJDAR) - Although it has achieved considerable progress in recent years, recognizing irregular text in natural scene is still a... 相似文献

18.

A paradigm for invariant object recognition of brightness,optical flow and binocular disparity images

Lowell Jacobson Harry Wechsler 《Pattern recognition letters》1982,1(1):61-68

We suggest the Wigner distribution (WD) for the analysis of 2-D images. The WD can be used to rigorously define a local power-spectrum at each point of an image. Furthermore, an invariant representation of a given image can be obtained by applying a complex-logarithmic (CL) conformal mapping to the spatial-frequency domain of the WD. The representation is such that all local spectra are invariant, within a linear shift, with respect to linear transformations of the image. A discrete WD has been implemented and results are shown. We next describe how the same CL-mapped WD of a scalar or vector field could be used for binocular disparity and motion analysis, respectively, where the goal is object recognition. 相似文献

19.

11K Hands: Gender recognition and biometric identification using a large dataset of hand images

Afifi Mahmoud 《Multimedia Tools and Applications》2019,78(15):20835-20854

Human hand not only possesses distinctive feature for gender information, it is also considered one of the primary biometric traits used to identify a person. Unlike face images, which are usually unconstrained, an advantage of hand images is they are usually captured under a controlled position. Most state-of-the-art methods, that rely on hand images for gender recognition or biometric identification, employ handcrafted features to train an off-the-shelf classifier or be used by a similarity metric for biometric identification. In this work, we propose a deep learning-based method to tackle the gender recognition and biometric identification problems. Specifically, we design a two-stream convolutional neural network (CNN) which accepts hand images as input and predicts gender information from these hand images. This trained model is then used as a feature extractor to feed a set of support vector machine classifiers for biometric identification. As part of this effort, we propose a large dataset of human hand images, 11K Hands, which contains dorsal and palmar sides of human hand images with detailed ground-truth information for different problems including gender recognition and biometric identification. By leveraging thousands of hand images, we could effectively train our CNN-based model achieving promising results. One of our findings is that the dorsal side of human hands is found to have effective distinctive features similar to, if not better than, those available in the palmar side of human hand images. To facilitate access to our 11K Hands dataset, the dataset, the trained CNN models, and our Matlab source code are available at (https://goo.gl/rQJndd).

相似文献

20.

Accurate object recognition in the underwater images using learning algorithms and texture features

K. Srividhya M. M. Ramya 《Multimedia Tools and Applications》2017,76(24):25679-25695

Underwater image processing is very challenging due to its environmental conditions and poor sunlight. Images captured from the ocean using autonomous vehicles are often non-uniformly illuminated and contain noise due to the underlying environment. Object recognition is a challenging task under water due to the variation in the environment, target shape and orientation. Traditional algorithms based on spatial information may not lead to accurate segmentation as the intensity variation is often less in underwater images. Texture information representing the characteristics of the object is needed. Statistical features like autocorrelation, sum average, sum variance and sum entropy were extracted. These were fed as input to learning algorithms and training was done to effectively classify the object of interest and background. Chain coding was further applied for object recognition. The proposed methodology achieved a maximum classification accuracy of 96%. 相似文献