Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
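To address problems in traditional color image segmentation, such as relying solely on the color space, considering only the global distribution of the image, or considering only local region and edge information, a hierarchical color image segmentation algorithm based on the Fuzzy-ART model is proposed. The algorithm makes effective use of the spatial distribution of image luminance, detail information, and color-space information to extract features level by level. It exploits the stable, fast online learning and memory capabilities of the Fuzzy-ART model, which is grounded in characteristics of human vision, to partition the image into regions hierarchically, forming a layered representation of the image and thereby achieving good segmentation results. A comparison with the FFCM algorithm yields favorable results.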
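
As a rough illustration of the Fuzzy-ART clustering step named in this abstract, the following is a minimal sketch of the standard Fuzzy ART update (complement coding, choice function, vigilance test, weight update). It is not the paper's hierarchical segmentation pipeline: the function names, the default parameter values, and the use of plain feature tuples in [0, 1] are assumptions made for the example.

```python
def complement_code(x):
    """Append 1 - x_i for every feature so that the L1 norm of the input is constant."""
    return list(x) + [1.0 - v for v in x]


def fuzzy_and(a, b):
    """Component-wise minimum (the fuzzy AND operator)."""
    return [min(u, v) for u, v in zip(a, b)]


def fuzzy_art(samples, rho=0.75, alpha=0.001, beta=1.0):
    """Cluster samples online; returns (labels, weights).

    rho is the vigilance, alpha the choice parameter, beta the learning rate.
    """
    weights, labels = [], []
    for x in samples:
        i = complement_code(x)
        # Rank existing categories by the choice function T_j.
        order = sorted(
            range(len(weights)),
            key=lambda j: sum(fuzzy_and(i, weights[j])) / (alpha + sum(weights[j])),
            reverse=True,
        )
        for j in order:
            overlap = fuzzy_and(i, weights[j])
            if sum(overlap) / sum(i) >= rho:  # vigilance test passed
                weights[j] = [beta * m + (1 - beta) * w
                              for m, w in zip(overlap, weights[j])]
                labels.append(j)
                break
        else:
            weights.append(i)  # no category matched: commit a new one
            labels.append(len(weights) - 1)
    return labels, weights


# Hypothetical two-feature samples (for example, normalized luminance and hue).
labels, weights = fuzzy_art([(0.1, 0.1), (0.12, 0.09), (0.9, 0.9), (0.88, 0.92)])
```

Raising `rho` makes the vigilance test stricter and tends to produce more, finer categories, which is one plausible way to obtain coarse and fine partitions of the same data.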

2.
We present an algorithm for the halftoning of greyscale image sequences. This facilitates the display of video sequences on black-and-white visual displays (e.g., high-quality X-terminals) for multimedia applications. The main problem to be overcome when halftoning sequences is the temporal flicker between successive images. The classical problem of halftoning a static greyscale image may be posed as an optimization problem. We present an iterative algorithm for its solution. At the expense of being slower, our algorithm achieves visual results on static images better than those obtained from classic halftoning algorithms. We extend our static halftoning algorithm to image sequence halftoning. Temporal correlation between the halftoned versions of image sequences is guaranteed by using an incremental algorithm transferring information between the images. This results in binary sequences with relatively little high-frequency temporal noise, a feature that facilitates efficient no-loss compression (14) of the results.
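
For readers unfamiliar with static halftoning, the sketch below shows Floyd-Steinberg error diffusion, one well-known classical method. It is offered only as background: it is not the iterative optimization algorithm of this abstract, and it does nothing about temporal flicker. The function name and the representation of an image as rows of floats in [0, 1] are assumptions.

```python
def floyd_steinberg(image):
    """Halftone a greyscale image (rows of floats in [0, 1]) to 0/1 pixels."""
    h, w = len(image), len(image[0])
    work = [list(row) for row in image]  # copy that accumulates diffused error
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = 1 if old >= 0.5 else 0
            out[y][x] = new
            err = old - new  # quantization error pushed to unvisited neighbors
            if x + 1 < w:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


# A flat 25% grey patch becomes a binary pattern with roughly 25% of pixels set.
binary = floyd_steinberg([[0.25] * 16 for _ in range(16)])
```

Applying such a method to each frame independently gives no control over how the dot pattern changes from frame to frame, which is the flicker problem the abstract sets out to solve.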

3.
4.
Ontologies have been intensively applied for improving multimedia search and retrieval by providing explicit meaning to visual content. Several multimedia ontologies have been recently proposed as knowledge models suitable for narrowing the well known semantic gap and for enabling the semantic interpretation of images. Since these ontologies have been created in different application contexts, establishing links between them, a task known as ontology matching, promises to fully unlock their potential in support of multimedia search and retrieval. This paper proposes and compares empirically two extensional ontology matching techniques applied to an important semantic image retrieval issue: automatically associating common-sense knowledge to multimedia concepts. First, we extend a previously introduced textual concept matching approach to use both textual and visual representation of images. In addition, a novel matching technique based on a multi-modal graph is proposed. We argue that the textual and visual modalities have to be seen as complementary rather than as exclusive sources of extensional information in order to improve the efficiency of the application of an ontology matching approach in the multimedia domain. An experimental evaluation is included in the paper.

5.
A hyperclique is an itemset containing items that are strongly correlated with each other, based on a user-specified threshold. Hypercliques (HC) have been successfully used in a number of applications, for example clustering and noise removal. Even though the HC collection has been shown to respond well to datasets with skewed support distribution and low support threshold, it may still grow very large for dense datasets and lower h-confidence threshold. Recently, we proposed a condensed representation of HC, the Non-Derivable Hypercliques (NDHC). NDHC was shown to have substantial advantages over HC, especially for dense datasets and lower h-confidence values. In this paper, we propose an approximation of the NDHC collection, called Approximate Non-Derivable Hypercliques (ANDHC). This collection is a subset of NDHC and is generated based on a user-input error. We present an efficient algorithm to mine all ANDHC sets, and an algorithm to approximately derive the HC collection from the ANDHC collection. Through experiments with several datasets and parameter combinations, the ANDHC collection is shown to have significant collection size reduction from NDHC even for small error values. It is also shown that based on ANDHC we can generate a good approximation of the entire HC collection. Finally, we experiment with applying our proposed collections to enhance clustering with encouraging results.
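
The h-confidence threshold mentioned in this abstract can be made concrete with a small sketch. It assumes the usual definition, in which the h-confidence of an itemset is its support divided by the largest support of any single item it contains. The function names and the toy transactions are hypothetical, and the sketch does not cover the NDHC or ANDHC condensed representations.

```python
def occurrences(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))


def h_confidence(itemset, transactions):
    """Support of the itemset divided by the largest single-item support."""
    largest_single = max(occurrences([i], transactions) for i in itemset)
    if largest_single == 0:
        return 0.0
    return occurrences(itemset, transactions) / largest_single


def is_hyperclique(itemset, transactions, min_hconf, min_support=0.0):
    """True if the itemset meets both the support and h-confidence thresholds."""
    support = occurrences(itemset, transactions) / len(transactions)
    return support >= min_support and h_confidence(itemset, transactions) >= min_hconf


# Toy data: "a" and "b" mostly occur together, "c" rarely joins "a".
baskets = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"}, {"a"}]
```

With these baskets, `{"a", "b"}` has h-confidence 3/4, while `{"a", "c"}` has 1/4, so a threshold of 0.5 keeps only the first pair.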

6.
7.
朱杰, 吴树芳, 谢博鋆, 马丽艳. Journal of Computer Applications (《计算机应用》), 2017, 37(11): 3238-3243
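The spatial pyramid model divides an image into cells at each level to supply spatial information for image representation, but this scheme cannot match the different parts of an object well. To address this, a color-based level (CL) partition algorithm is proposed. Starting from the perspective of multi-feature fusion, the CL algorithm obtains, through optimization, the discriminative colors of each class at different levels. It then iteratively partitions the image level by level according to the discriminative colors at each level, and finally concatenates the histograms of the different levels to represent the image. To address the excessively high dimensionality of this representation, the Divisive Information-Theoretic feature Clustering (DITC) method is used to cluster the dictionary for dimensionality reduction, and the compressed dictionary is used for the final image representation. Experimental results show that the proposed method achieves good recognition performance on the Soccer, Flower 17 and Flower 102 datasets.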

8.
Smoothness is a quality that feels aesthetic and pleasing to the human eye. We present an algorithm for finding “as‐smooth‐as‐possible” sequences in image collections. In contrast to previous work, our method does not assume that the images show a common 3D scene, but instead may depict different object instances with varying deformations, and significant variation in lighting, texture, and color appearance. Our algorithm does not rely on a notion of camera pose, view direction, or 3D representation of an underlying scene, but instead directly optimizes the smoothness of the apparent motion of local point matches among the collection images. We increase the smoothness of our sequences by performing a global similarity transform alignment, as well as localized geometric wobble reduction and appearance stabilization. Our technique gives rise to a new kind of image morphing algorithm, in which the in‐between motion is derived in a data‐driven manner from a smooth sequence of real images without any user intervention. This new type of morph can go far beyond the ability of traditional techniques. We also demonstrate that our smooth sequences allow exploring large image collections in a stable manner.

9.
Object-based image analysis using multiscale connectivity   Cited by: 2 (self-citations: 0, citations by others: 2)
This paper introduces a novel approach for image analysis based on the notion of multiscale connectivity. We use the proposed approach to design several novel tools for object-based image representation and analysis, which exploit the connectivity structure of images in a multiscale fashion. More specifically, we propose a nonlinear pyramidal image representation scheme, which decomposes an image at different scales by means of multiscale grain filters. These filters gradually remove connected components from an image that fail to satisfy a given criterion. We also use the concept of multiscale connectivity to design a hierarchical data partitioning tool. We employ this tool to construct another image representation scheme, based on the concept of component trees, which organizes partitions of an image in a hierarchical multiscale fashion. In addition, we propose a geometrically-oriented hierarchical clustering algorithm which generalizes the classical single-linkage algorithm. Finally, we propose two object-based multiscale image summaries, reminiscent of the well-known (morphological) pattern spectrum, which can be useful in image analysis and image understanding applications.
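
The classical single-linkage algorithm that this abstract generalizes can be sketched briefly. Cutting a single-linkage hierarchy at a distance threshold gives the connected components of the graph that links every pair of points no farther apart than that threshold, so a union-find structure is enough. This is a sketch of the classical baseline only, not of the paper's geometrically-oriented generalization; the function name and the sample points are assumptions.

```python
import math
from itertools import combinations


def single_linkage(points, threshold):
    """Label points by single-linkage cluster at the given distance cut."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the clusters of every pair that lies within the threshold.
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(points))]


# The first three points chain together even though the end points are 2 apart.
sample = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
coarse = single_linkage(sample, 1.5)
fine = single_linkage(sample, 0.5)
```

Running the function at several thresholds gives nested partitions, coarse to fine, which is the kind of hierarchical multiscale structure the abstract refers to.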

10.
In this paper, a hierarchical algorithm, HierarchyScan, is proposed to efficiently locate one-dimensional subsequences within a collection of sequences of arbitrary length. The proposed algorithm performs correlation between the stored sequences and the template pattern in the transformed domain to identify subsequences in a scale- and phase-independent fashion. This is in contrast to approaches based on the computation of Euclidean distance in the transformed domain. In the proposed hierarchical algorithm, the transformed domain representation of each original sequence is divided into multiple groups of coefficients. The matching is performed hierarchically from the group with the greatest filtering capability to the group with the lowest filtering capability. Only those subsequences whose maximum correlation value is higher than a predefined threshold are selected for additional screening. This approach is compared with sequential scanning, and an order-of-magnitude speedup is observed.

11.
Robust visual tracking remains a technical challenge in real-world applications, as an object may involve many appearance variations. In existing tracking frameworks, objects in an image are often represented as vector observations, which discounts the 2-D intrinsic structure of the image. By considering an image in its actual form as a matrix, we construct a third-order tensor-based object representation to preserve the spatial correlation within the 2-D image and fully exploit the useful temporal information. We perform incremental updates of the object template using the N-mode SVD to model the appearance variations, which reduces the influence of template drifting and object occlusions. The proposed scheme efficiently learns a low-dimensional tensor representation by adaptively updating the eigenbasis of the tensor. Tensor-based Bayesian inference in the particle filter framework is then utilized to realize tracking. We validate the proposed tracking system by conducting real-time facial expression recognition with video data and a live camera. Experimental evaluation on challenging benchmark image sequences undergoing appearance variations demonstrates the significance and effectiveness of the proposed algorithm.

12.
The large collections of news images available from stock photo agencies provide interesting insights into how different celebrities are related to each other, in terms of the events they attend together and also in terms of how often they are photographed together. In this paper, we leverage such collections to predict which celebrities will attend future events. The main motivation for this is in the event-based indexing of online collections of multimedia content, an area that has attracted much attention in recent years. Based on the metadata associated with a corpus of stock photos, we propose a language model for predicting celebrities attending future events. A temporal hierarchical version of the language model exploits fresh data while still making use of all historical data. We extract a social network from co-appearance of public figures in the events depicted in the photographs, and combine this latent social information with the language model to further improve prediction accuracy. The experimental results show that combining textual, network and temporal information gives the best prediction performance. Our analysis also shows that the prediction models, when trained by the most recent data, are most accurate for political and sports events.

13.
The falling cost of digital cameras and camcorders has encouraged the creation of massive collections of personal digital media. However, once captured, this media is infrequently accessed and often lies dormant on users' PCs. We present a system to breathe life into home digital media collections, drawing upon artistic stylization to create a “Digital Ambient Display” that automatically selects, stylizes and transitions between digital contents in a semantically meaningful sequence. We present a novel algorithm based on multi-label graph cut for segmenting video into temporally coherent region maps. These maps are used to both stylize video into cartoons and paintings, and measure visual similarity between frames for smooth sequence transitions. The system automatically structures the media collection into a hierarchical representation based on visual content and semantics. Graph optimization is applied to adaptively sequence content for display in a coarse-to-fine manner, driven by user attention level (detected in real-time by a webcam). Our system is deployed on embedded hardware in the form of a compact digital photo frame. We demonstrate coherent segmentation and stylization over a variety of home videos and photos. We evaluate our media sequencing algorithm via a small-scale user study, indicating that our adaptive display conveys a more compelling media consumption experience than simple linear “slide-shows”.

14.

Unsupervised representation learning of unlabeled multimedia data is an important yet challenging problem for their indexing, clustering, and retrieval. There have been many attempts to learn representations from a collection of unlabeled 2D images. In contrast, less attention has been paid to unsupervised representation learning for unordered sets of high-dimensional feature vectors, which are often used to describe multimedia data. One such example is a set of local visual features describing a 2D image. This paper proposes a novel algorithm called Feature Set Aggregator (FSA) for accurate and efficient comparison among sets of high-dimensional features. FSA learns a representation, or embedding, of unordered feature sets via optimization using a combination of two training objectives, namely set reconstruction and set embedding, carefully designed for set-to-set comparison. Experimental evaluation under three multimedia information retrieval scenarios using 3D shapes, 2D images, and text documents demonstrates the efficacy as well as the generality of the proposed algorithm.


15.
16.
Geographic location estimation for web images has received a lot of attention in recent years. With the help of smartphones, it has become very popular to capture photos and share them on social media networks. Users often generate several tags to describe image content. Many images are embedded with geo-tags. In this paper, we propose an effective image GPS (geo-coordinates or geo-tags) estimation approach that fuses multi-source features of web images, such as textual, temporal and visual features. We propose a hierarchical strategy to infer the GPS of a web image. We preselect several geographic locations of higher expected relevance and perform a deeper analysis inside the selected locations to return the coordinates most likely to be related to the input image by an enhanced language model. Experiments show the effectiveness of our proposed approach.

17.
Multimedia data mining refers to pattern discovery, rule extraction and knowledge acquisition from multimedia databases. Two typical tasks in multimedia data mining are visual data classification and clustering in terms of semantics. The performance of such classification or clustering systems is often unfavorable because low-level features are used for image representation and because improper similarity metrics are used to measure the closeness between multimedia objects. This paper considers the problem of modeling similarity for semantic image clustering. A collection of semantic images and feed-forward neural networks are used to approximate a characteristic function of equivalence classes, which is termed a learning pseudo metric (LPM). Empirical criteria for evaluating the goodness of the LPM are established. An LPM-based k-means rule is then employed for semantic image clustering, where two impurity indices, classification performance and robustness are used for performance evaluation. An artificial image database with 11 semantics is employed for our simulation studies. Results demonstrate the merits and usefulness of our proposed techniques for multimedia data mining.

18.
Retrieving images from large and varied collections using image content as a key is a challenging and important problem. We present a new image representation that provides a transformation from the raw pixel data to a small set of image regions that are coherent in color and texture. This "Blobworld" representation is created by clustering pixels in a joint color-texture-position feature space. The segmentation algorithm is fully automatic and has been run on a collection of 10,000 natural images. We describe a system that uses the Blobworld representation to retrieve images from this collection. An important aspect of the system is that the user is allowed to view the internal representation of the submitted image and the query results. Similar systems do not offer the user this view into the workings of the system; consequently, query results from these systems can be inexplicable, despite the availability of knobs for adjusting the similarity metrics. By finding image regions that roughly correspond to objects, we allow querying at the level of objects rather than global image properties. We present results indicating that querying for images using Blobworld produces higher precision than does querying using color and texture histograms of the entire image in cases where the image contains distinctive objects.

19.
20.
In multimedia databases, k-nearest neighbor queries are popular and frequently contain non-spatial predicates. Among the available techniques for such queries, the incremental nearest neighbor algorithm proposed by Hjaltason and Samet is known as the most useful algorithm [16]. The reason is that if k′ > k neighbors are needed, it can provide the next neighbor for the upper operator without restarting the query from scratch. However, the R-tree in their algorithm has no facility for partially pruning tuple candidates that will turn out not to satisfy the remaining predicates, which makes their algorithm inefficient. In this paper, we propose an RS-tree-based incremental nearest neighbor algorithm complementary to their algorithm. The RS-tree used in our algorithm is a hybrid of the R-tree and the S-tree, as its buddy tree, based on the hierarchical signature file. Experimental results show that our RS-tree enhances the performance of Hjaltason and Samet's algorithm.
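
The idea of handing out "the next neighbor" on demand can be sketched with a best-first traversal driven by a priority queue, in the spirit of incremental nearest neighbor search. The sketch below is a simplification under stated assumptions: it uses a small hand-built tree of bounding boxes rather than a real R-tree or the RS-tree of this abstract, it does no signature-based pruning, and the `Node` class, function names and sample points are all hypothetical.

```python
import heapq
import math
from itertools import count, islice


class Node:
    """Tree node holding child nodes and/or 2-D points, plus its bounding box."""

    def __init__(self, children=(), points=()):
        self.children = list(children)
        self.points = list(points)
        xs = [p[0] for p in self.points] + [v for c in self.children for v in (c.box[0], c.box[2])]
        ys = [p[1] for p in self.points] + [v for c in self.children for v in (c.box[1], c.box[3])]
        self.box = (min(xs), min(ys), max(xs), max(ys))


def min_dist(q, box):
    """Smallest possible distance from query point q to anything inside box."""
    x0, y0, x1, y1 = box
    dx = max(x0 - q[0], 0.0, q[0] - x1)
    dy = max(y0 - q[1], 0.0, q[1] - y1)
    return math.hypot(dx, dy)


def incremental_nn(root, q):
    """Yield (distance, point) pairs in order of increasing distance from q."""
    tie = count()  # tie-breaker so the heap never compares a Node with a tuple
    heap = [(min_dist(q, root.box), next(tie), root)]
    while heap:
        d, _, item = heapq.heappop(heap)
        if isinstance(item, Node):
            for child in item.children:
                heapq.heappush(heap, (min_dist(q, child.box), next(tie), child))
            for p in item.points:
                heapq.heappush(heap, (math.dist(q, p), next(tie), p))
        else:
            yield d, item  # a point at the head of the queue is the next neighbor


root = Node(children=[Node(points=[(0, 0), (1, 1)]), Node(points=[(5, 5), (6, 5)])])
# Non-spatial predicate applied lazily on top of the neighbor stream.
matches = (p for _, p in incremental_nn(root, (0.9, 0.9)) if p[0] >= 1)
first_two = list(islice(matches, 2))
one_more = next(matches)  # resumes the same traversal, no restart
```

Because the generator keeps its queue between calls, asking for one more match continues the same traversal. The predicate here is checked only after a point has been dequeued, which is the inefficiency the abstract attributes to the plain R-tree approach.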


Copyright©北京勤云科技发展有限公司  京ICP备09084417号