
1.

Content-based image retrieval using joint correlograms

Adam Williams Peter Yoon 《Multimedia Tools and Applications》2007,34(2):239-248

The comparison of digital images to determine their degree of similarity is one of the fundamental problems of computer vision. Many techniques exist which accomplish this with a certain level of success, most of which involve either the analysis of pixel-level features or the segmentation of images into sub-objects that can be geometrically compared. In this paper we develop and evaluate a new variation of the pixel feature and analysis technique known as the color correlogram in the context of a content-based image retrieval system. Our approach is to extend the autocorrelogram by adding multiple image features in addition to color. We compare the performance of each index scheme with our method for image retrieval on a large database of images. The experiment shows that our proposed method gives a significant improvement over histogram or color correlogram indexing, and it is also memory-efficient.
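The abstract above builds on the color autocorrelogram. As a rough illustration only, the following minimal sketch computes a plain color autocorrelogram for an image that has already been quantized to color indices. The function names `ring` and `autocorrelogram`, the use of Chebyshev (chessboard) distance, and the nested-list image layout are assumptions made for this example; the paper's joint extension with additional image features is not reproduced here.

```python
def ring(y, x, d):
    """Yield coordinates at Chebyshev distance exactly d from (y, x)."""
    for dy in range(-d, d + 1):
        for dx in range(-d, d + 1):
            if max(abs(dy), abs(dx)) == d:
                yield y + dy, x + dx


def autocorrelogram(image, num_colors, distances):
    """Return {d: [p_0, ..., p_(num_colors-1)]}.

    p_c is the fraction of in-bounds pixels at distance d from a pixel of
    color c that also have color c. `image` is a 2D list of color indices.
    """
    height, width = len(image), len(image[0])
    result = {}
    for d in distances:
        same = [0] * num_colors   # neighbor pairs sharing the color
        total = [0] * num_colors  # all in-bounds neighbor pairs
        for y in range(height):
            for x in range(width):
                c = image[y][x]
                for ny, nx in ring(y, x, d):
                    if 0 <= ny < height and 0 <= nx < width:
                        total[c] += 1
                        if image[ny][nx] == c:
                            same[c] += 1
        result[d] = [s / t if t else 0.0 for s, t in zip(same, total)]
    return result
```

Two images can then be compared by, for example, summing the absolute differences of their autocorrelogram entries; the choice of distance measure is left open here.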



2.

Context-based environmental audio event recognition for scene understanding

Tong Lu Gongyou Wang Feng Su 《Multimedia Systems》2015,21(5):507-524

Automatic audio content recognition has attracted increasing attention for developing multimedia systems, for which the most popular approaches combine frame-based features with statistical models or discriminative classifiers. The existing methods are effective for clean single-source event detection but may not perform well for unstructured environmental sounds, which have a broad noise-like flat spectrum and a diverse variety of compositions. We present an automatic acoustic scene understanding framework that detects audio events through two hierarchies, acoustic scene recognition and audio event recognition, in which the former is preceded by following dominant audio sources and in turn helps infer non-dominant audio events within the same scene through modeling their occurrence correlations. On the scene recognition hierarchy, we perform adaptive segmentation and feature extraction for every input acoustic scene stream through Eigen-audiospace and an optimized feature subspace, respectively. After background filtering, scene streams are recognized by modeling the observation density of dominant features using a two-level hidden Markov model. On the audio event recognition hierarchy, scene knowledge is characterized by an audio context model that essentially describes the occurrence correlations of dominant and non-dominant audio events within this scene. Monte Carlo integration and gradient descent techniques are employed to maximize the likelihood and correctly tag each audio event. To the best of our knowledge, this is the first work that models event correlations as scene context for robust audio event detection from complex and noisy environments. According to a recent report, the mean accuracy of human listeners on the acoustic scene classification task is only around 71% on the data collected in office environments from the DCASE dataset. None of the existing methods performs well on all scene categories, and the average accuracy of the best performances of the 11 most recent methods is 53.8%. The proposed method achieves an average accuracy of 62.3% on the same dataset. Additionally, we create a 10-CASE dataset by manually collecting 5,250 audio clips of 10 scene types and 21 event categories. Our experimental results on 10-CASE show that the proposed method achieves an improved average performance of 78.3%, and that the average accuracy of audio event recognition can be effectively improved by capturing dominant audio sources and reasoning about non-dominant events from the dominant ones through acoustic context modeling. In future work, exploring the interactions between acoustic scene recognition and audio event detection, and incorporating other modalities to improve accuracy, are required to further advance the proposed framework.

3.

Robust symbolic representation for shape recognition and retrieval

Mohammad Reza Daliri Vincent Torre 《Pattern recognition》2008,41(5):1782-1798


4.

Multimedia document retrieval using speech and speaker recognition

Mahesh Viswanathan Homayoon S.M. Beigi Satya Dharanipragada Fereydoun Maali Alain Tritschler 《International Journal on Document Analysis and Recognition》2000,2(4):147-162

Speech and speaker recognition systems are rapidly being deployed in real-world applications. In this paper, we discuss the details of a system and its components for indexing and retrieving multimedia content derived from broadcast news sources. The audio analysis component calls for real-time speech recognition for converting the audio to text and concurrent speaker analysis consisting of the segmentation of audio into acoustically homogeneous sections followed by speaker identification. The output of these two simultaneous processes is used to abstract statistics to automatically build indexes for text-based and speaker-based retrieval without user intervention. The real power of multimedia document processing is the possibility of Boolean queries in the form of combined text- and speaker-based user queries. Retrieval for such queries entails combining the results of individual text-based and speaker-based searches. The underlying techniques discussed here can easily be extended to other speech-centric applications and transactions.

5.

Special issue on document recognition and retrieval 2009

Kathrin Berkner Laurence Likforman-Sulem 《International Journal on Document Analysis and Recognition》2010,13(2):77-78


6.

Context-based scene recognition from visual data in smart homes: an Information Fusion approach

Juan Gómez-Romero Miguel A. Serrano Miguel A. Patricio Jesús García José M. Molina 《Personal and Ubiquitous Computing》2012,16(7):835-857


7.

Face recognition by generalized two-dimensional FLD method and multi-class support vector machines (total citations: 2; self-citations: 0; citations by others: 2)

Shiladitya Chowdhury Jamuna Kanta Sing Dipak Kumar Basu Mita Nasipuri 《Applied Soft Computing》2011,11(7):4282-4292

This paper presents a novel scheme for feature extraction, namely, the generalized two-dimensional Fisher's linear discriminant (G-2DFLD) method, and its use for face recognition with multi-class support vector machines as the classifier. The G-2DFLD method is an extension of the 2DFLD method for feature extraction. Like the 2DFLD method, the G-2DFLD method is based on the original 2D image matrix. However, unlike the 2DFLD method, which maximizes class separability from either the row or the column direction, the G-2DFLD method maximizes class separability from both the row and column directions simultaneously. To realize this, two alternative Fisher's criteria have been defined, corresponding to the row-wise and column-wise projection directions. Unlike in the 2DFLD method, the principal components extracted from an image matrix in the G-2DFLD method are scalars, yielding a much smaller image feature matrix. The proposed G-2DFLD method was evaluated on two popular face recognition databases, the AT&T (formerly ORL) and the UMIST face databases. The experimental results using different experimental strategies show that the new G-2DFLD scheme outperforms the PCA, 2DPCA, FLD and 2DFLD schemes, not only in terms of computation time, but also for the task of face recognition using multi-class support vector machines (SVM) as the classifier. The proposed method also outperforms some of the neural network and other SVM-based methods for face recognition reported in the literature.

8.

On the consecutive retrieval property for generalized binary queries

Shinsei Tazawa 《Information Processing Letters》1984,18(5):291-293


9.

3D scene retrieval and recognition with Depth Gradient Images

Antonio Adán 《Pattern recognition letters》2011,32(9):1337-1353

The intention of the strategy proposed in this paper is to solve the object retrieval problem in highly complex scenes using 3D information. In the worst case scenario the complexity of the scene includes several objects with irregular or free-form shapes, viewed from any direction, which are self-occluded or partially occluded by other objects with which they are in contact and whose appearance is uniform in intensity/color. This paper introduces and analyzes a new 3D recognition/pose strategy based on DGI (Depth Gradient Images) models. After comparing it with current representative techniques, we can affirm that DGI has very interesting prospects. The DGI representation synthesizes both surface and contour information, thus avoiding restrictions concerning the layout and visibility of the objects in the scene. This paper first explains the key concepts of the DGI representation and shows the main properties of this method in comparison to a set of known techniques. The performance of this strategy in real scenes is then reported. Details are also presented of a wide set of experimental tests, including results under occlusion, performance with injected noise and experiments with cluttered scenes of a high level of complexity.

10.

Homotopic image pseudo-invariants for openset object recognition and image retrieval

Yoshihisa Shinagawa 《IEEE transactions on pattern analysis and machine intelligence》2008,30(11):1891-1901

This paper presents novel homotopic image pseudo-invariants for face recognition based on pixelwise analysis. An exemplar face and test images are matched, and the most similar image is determined first. The homotopic image pseudo-invariants are calculated next to judge whether the most similar image is the same person as the exemplar. The proposed method can be applied to openset recognition. The recognition task can be performed with or without face databases, although the recognition rate is higher when a database is available. This fact facilitates the recognition of faces and various other objects on the Internet. We benchmark the method using FERET as well as images downloaded from the Internet.

11.

Context-based authentication and transport of cultural assets

Leonardo Mostarda Changyu Dong Naranker Dulay 《Personal and Ubiquitous Computing》2010,14(4):321-334

We present a ubiquitous system that combines context information, security mechanisms and a transport infrastructure to provide authentication and secure transport of works of art. Authentication is provided for both auctions and exhibitions, where users can use their own mobile devices to authenticate works of art. Transport is provided by a secure protocol that makes use of position–time information and wireless sensors providing context information. The system has been used in several real case studies in the context of the CUSPIS project and continues to be used as a commercial product for the transportation and exhibition of cultural assets in Italy.

12.

R-theta local neighborhood pattern for unconstrained facial image recognition and retrieval

Chakraborty Soumendu Singh Satish Kumar Chakraborty Pavan 《Multimedia Tools and Applications》2019,78(11):14799-14822


13.

Construction site image retrieval based on material cluster recognition

Ioannis K. Brilakis Lucio Soibelman Yoshihisa Shinagawa 《Advanced Engineering Informatics》2006,20(4):443-452

The capability to automatically identify shapes, objects and materials from the image content through direct and indirect methodologies has enabled the development of several civil engineering related applications that assist in the design, construction and maintenance of construction projects. This capability is a product of the technological breakthroughs in the area of image processing that have allowed for the development of a large number of digital imaging applications in all industries. In this paper, an automated, content-based construction site image retrieval method is presented. This method is based on image retrieval techniques, specifically those related to material and object identification, and matches known material samples with material clusters within the image content. The results demonstrate the suitability of this method for construction site image retrieval purposes and reveal the capability of existing image processing technologies to accurately identify a wealth of materials from construction site images.

14.

Logo recognition based on weighted ranking retrieval and visual pattern mining

《计算机应用研究》2015,(11)


15.

Context-based global multi-class semantic image segmentation by wireless multimedia sensor networks

Chen Wei Xiaorong Jiang Zhibo Tang Wang Qian Na Fan 《Artificial Intelligence Review》2015,43(4):579-591


16.

Robust face recognition using generalized neural reflectance model

Siu-Yeung Cho Tommy W. S. Chow 《Neural computing & applications》2006,15(2):170-182

A generalized neural reflectance (GNR) model for enhancing face recognition under variations in illumination and posture is presented in this paper. Our work is based on training with a number of synthesized images of each face taken at a single lighting direction with frontal/posture view. This way of synthesizing images can be used to build training cases for each face under different known illumination conditions, from which face recognition can be significantly improved. However, reconstructing face shape may not easily be achieved, and human face images are usually formed by a highly complex structure that suffers from strong specular and unknown reflective conditions. In this paper, these limitations are addressed by Cho and Chow (IEEE Trans Neural Netw 12(5):1204–1214, 2002). Face surfaces are recovered by this GNR model, and face images in different poses are synthesized to create a database for training. Our training algorithm recognizes face identity using a similarity measure on face features that are first extracted by the principal component analysis (PCA) method and then further processed by Fisher's discriminant analysis (FDA) to acquire lower-dimensional patterns. Experimental results on the Yale Face Database B show that lower classification and recognition error rates are achieved under different variations in lighting and pose, and that the performance significantly exceeds that of recognition without the proposed GNR model.

17.

Context-based unsupervised ensemble learning and feature ranking

Erfan Soltanmohammadi Mort Naraghi-Pour Mihaela van der Schaar 《Machine Learning》2016,105(3):459-485


18.

Design and implementation of context-based adaptive arithmetic coding (total citations: 1; self-citations: 0; citations by others: 1)

安向明 张丹 邹红 《电脑学习》2009,(3):107-108

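The paper implements the workflow design of classical arithmetic coding and proposes a context-based adaptive arithmetic coding algorithm. A context-based, multi-order adaptive probability model is built so that the compressed code length of each symbol is as close as possible to its entropy.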
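To illustrate why a context-based adaptive probability model can shorten the code length, the sketch below computes the ideal code length (the sum of -log2 p over all symbols) that an arithmetic coder would approach under a simple adaptive order-k model. It is not an arithmetic coder and is not the algorithm from the paper above; the function name `adaptive_code_length`, the count initialization of 1 per symbol, and the fixed `order` parameter are all assumptions made for this example.

```python
import math
from collections import defaultdict


def adaptive_code_length(symbols, alphabet, order=1):
    """Ideal code length in bits under an adaptive order-`order` context model.

    Each context (the previous `order` symbols) keeps its own symbol counts,
    all starting at 1 so that no symbol ever has probability zero.
    """
    counts = defaultdict(lambda: {s: 1 for s in alphabet})
    bits = 0.0
    for i, s in enumerate(symbols):
        context = tuple(symbols[max(0, i - order):i])
        table = counts[context]
        p = table[s] / sum(table.values())  # current estimate for s
        bits += -math.log2(p)               # cost an ideal coder would pay
        table[s] += 1                       # adapt the model after coding
    return bits
```

For a strongly patterned input such as a long run of alternating symbols, calling the function with `order=1` should give a noticeably smaller total than `order=0`, because the previous symbol predicts the next one.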

19.

An improved image retrieval method based on the generalized image co-occurrence matrix

HONG Qing Qi WANG Bei Zhan DONG Huai Lin ZHANG Lei CHEN Bing 《微型机与应用》2007,(Z1)

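Taking into account the respective strengths of the traditional gray-level co-occurrence matrix method and the generalized image gray-level co-occurrence matrix method, the paper proposes an improved image retrieval method based on the generalized image gray-level co-occurrence matrix. The new method constructs gray-level co-occurrence matrices of the generalized image in four directions and extracts texture parameters from the four matrices for retrieval. Experimental results show that the new method gives better retrieval performance under image rotation and size changes.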
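As a minimal sketch of the classical building block mentioned above, the following code computes a symmetric, normalized gray-level co-occurrence matrix for one of four directions (0°, 45°, 90°, 135°) at distance 1 and derives a few common texture parameters. The abstract does not say how the generalized image is constructed or which texture parameters are used, so this example works on an ordinary quantized gray image, and the names `glcm` and `texture_features` as well as the choice of energy, contrast, entropy and homogeneity are assumptions.

```python
import math

# (row offset, column offset) for each direction at distance 1
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, levels, angle):
    """Symmetric, normalized co-occurrence matrix of a 2D list of gray levels."""
    dy, dx = OFFSETS[angle]
    height, width = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                i, j = image[y][x], image[ny][nx]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both orders
    total = sum(map(sum, counts))
    return [[c / total if total else 0.0 for c in row] for row in counts]


def texture_features(p):
    """Energy, contrast, entropy and homogeneity of a normalized matrix p."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }
```

A retrieval feature vector could then be formed by concatenating the parameters from all four directions, for example `[texture_features(glcm(img, levels, a)) for a in (0, 45, 90, 135)]`.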

20.

Efficient 3-D model search and retrieval using generalized 3-D radon transforms (total citations: 4; self-citations: 0; citations by others: 4)

Daras P. Zarpalas D. Tzovaras D. Strintzis M.G. 《Multimedia, IEEE Transactions on》2006,8(1):101-114
