Similar literature
 20 similar documents found; search took 31 ms
1.
While people compare images using semantic concepts, computers compare images using low-level visual features that sometimes have little to do with these semantics. To reduce the gap between the high-level semantics of visual objects and the low-level features extracted from them, in this paper we develop a framework for learning pseudo metrics (LPM) using neural networks for semantic image classification and retrieval. Performance analysis and comparative studies on an image database show that LPM has potential applications in multimedia information processing.

2.
Auditory scenes are temporal audio segments with coherent semantic content. Automatically classifying and grouping auditory scenes with similar semantics into categories is beneficial for many multimedia applications, such as semantic event detection and indexing. For such semantic categorization, auditory scenes are first characterized with either low-level acoustic features or mid-level representations such as audio effects, and then supervised classifiers or unsupervised clustering algorithms are employed to group scene segments into semantic categories. In this paper, we focus on the problem of automatically categorizing audio scenes in an unsupervised manner. To achieve more reasonable clustering results, we introduce a co-clustering scheme that exploits potential grouping trends among different dimensions of the feature spaces (either low-level or mid-level) and provides a more accurate similarity measure for comparing auditory scenes. We also extend the co-clustering scheme with a strategy based on the Bayesian information criterion (BIC) to automatically estimate the number of clusters. Evaluation on 272 auditory scenes extracted from 12 hours of audio data shows very encouraging categorization results. Co-clustering performed better than some traditional one-way clustering algorithms, both on the low-level acoustic features and on the mid-level audio effect representations. Finally, we present our vision regarding the applicability of this approach to general multimedia data, and also show some preliminary results on content-based image clustering.
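For reference, the generic textbook form of the Bayesian information criterion is given below. The abstract does not say how the authors adapt it to co-clustering, so the symbols here are assumptions for illustration only: \hat{L}_k is the maximized likelihood of a model with k clusters, p_k is its number of free parameters, and n is the number of observations.

```latex
\[
  \mathrm{BIC}(k) = -2 \ln \hat{L}_k + p_k \ln n ,
  \qquad
  k^{*} = \arg\min_{k} \mathrm{BIC}(k)
\]
```

Under this sign convention the preferred number of clusters is the one with the smallest BIC; some texts use the opposite sign and maximize instead.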

3.
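Architecture and methods of multimedia data mining   Total citations: 6, self-citations: 1, citations by others: 6
Proposes a general architecture for a multimedia data mining system (M3), comprising a multimedia database (MD), a multimedia mining engine (MME), and a multimedia mining interface (MMI), and analyzes how several mining methods (classification, association, and clustering) apply to multimedia mining. For different media such as images, audio, and video, the mining characteristics and main mining content of each are discussed.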

4.
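A relevance feedback method integrating visual features and semantic information   Total citations: 1, self-citations: 0, citations by others: 1
To make effective use of both the semantic classification information and the visual features in an image retrieval system, a Bayes-based relevance feedback retrieval method that integrates visual features and semantic information is proposed. First, the image database is partitioned into small clusters by a semantically supervised visual-feature clustering algorithm, so that the data in each cluster have similar visual features and the same semantic category. Then positive and negative feedback examples are labeled per cluster, which differs markedly from relevance feedback processes that work per image. Finally, a visual-feature-based Bayes classifier and a semantics-based Bayes classifier are each used to adjust the similarity distance. Experiments on an image database show that high retrieval accuracy can be reached with only a few feedback rounds.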

5.
One major challenge in content-based image retrieval (CBIR) and computer vision research is to bridge the so-called “semantic gap” between low-level visual features and high-level semantic concepts, that is, to extract semantic concepts from a large database of images effectively. In this paper, we tackle the problem by mining decisive feature patterns (DFPs). Intuitively, a decisive feature pattern is a combination of low-level feature values that are unique and significant for describing a semantic concept. Interesting algorithms are developed to mine the decisive feature patterns and construct a rule base to automatically recognize semantic concepts in images. A systematic performance study on large image databases containing many semantic concepts shows that our method is more effective than some previously proposed methods. Importantly, our method can be generally applied to any domain of semantic concepts and low-level features.
Wei Wang received his Ph.D. degree in Computing Science and Engineering from the State University of New York (SUNY) at Buffalo in 2004, under Dr. Aidong Zhang's supervision. He received the B.Eng. in Electrical Engineering from Xi'an Jiaotong University, China in 1995 and the M.Eng. in Computer Engineering from National University of Singapore in 2000. He joined Motorola Inc. in 2004, where he is currently a senior research engineer in the Multimedia Research Lab, Motorola Applications Research Center. His research interests can be summarized as developing novel techniques for multimedia data analysis applications. He is particularly interested in multimedia information retrieval, multimedia mining and association, multimedia database systems, multimedia processing and pattern recognition. He has published 15 research papers in refereed journals, conferences, and workshops, has served on the organization committees and the program committees of IADIS International Conference e-Society 2005 and 2006, and has been a reviewer for some leading academic journals and conferences. In 2005, his research prototype of “seamless content consumption” was awarded the “most innovative research concept of the year” from the Motorola Applications Research Center.
Dr. Aidong Zhang received her Ph.D. degree in computer science from Purdue University, West Lafayette, Indiana, in 1994. She was an assistant professor from 1994 to 1999, an associate professor from 1999 to 2002, and has been a professor since 2002 in the Department of Computer Science and Engineering at the State University of New York at Buffalo. Her research interests include bioinformatics, data mining, multimedia systems, content-based image retrieval, and database systems. She has authored over 150 research publications in these areas. Dr. Zhang's research has been funded by NSF, NIH, NIMA, and Xerox. Dr. Zhang serves on the editorial boards of International Journal of Bioinformatics Research and Applications (IJBRA), ACM Multimedia Systems, the International Journal of Multimedia Tools and Applications, and International Journal of Distributed and Parallel Databases. She was the editor for ACM SIGMOD DiSC (Digital Symposium Collection) from 2001 to 2003. She was co-chair of the technical program committee for ACM Multimedia 2001. She has also served on various conference program committees. Dr. Zhang is a recipient of the National Science Foundation CAREER Award and SUNY Chancellor's Research Recognition Award.

6.
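Image semantic annotation techniques that connect high-level semantics with low-level visual features can represent image semantics well. An image annotation method that combines relevance feedback logs with a semantic network is proposed and implemented. Based on collected user relevance feedback logs, the method obtains semantic information about images, performs semantic clustering by computing the semantic similarity between images, and annotates images through semantic propagation. Experimental results show that as the relevance feedback log grows, more and more images in the database are annotated during feedback, and annotation accuracy stabilizes as the number of feedback rounds increases.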

7.
Measuring image similarity is an important task for various multimedia applications. Similarity can be defined at two levels: the syntactic (lower, context-free) level and the semantic (higher, contextual) level. As long as one deals with the syntactic level, defining and measuring similarity is a relatively straightforward task, but as soon as one starts dealing with semantic similarity, the task becomes very difficult. We examine the use of simple, readily available syntactic image features combined with other multimodal features to derive a similarity measure that captures the weak semantics of an image. The weak semantics can be seen as an intermediate step between low-level image understanding and full semantic image understanding. We investigate the use of single modalities alone and see how the combination of modalities affects the similarity measures. We also test the measure on a multimedia retrieval task on TV series data, even though the motivation is in understanding how different modalities relate to each other.

8.
Zhang Hong, Huang Yu, Xu Xin, Zhu Ziqi, Deng Chunhua. Multimedia Tools and Applications, 2018, 77(3): 3353-3368

Due to the rapid development of multimedia applications, cross-media semantics learning is becoming increasingly important. One of the most challenging issues for cross-media semantics understanding is how to mine semantic correlation between different modalities. Most traditional multimedia semantics analysis approaches are based on unimodal data and neglect the semantic consistency between different modalities. In this paper, we propose a novel multimedia representation learning framework via latent semantic factorization (LSF). First, the posterior probability under the learned classifiers serves as the latent semantic representation for different modalities. Moreover, we explore the semantic representation for a multimedia document, which consists of image and text, by latent semantic factorization. In addition, two projection matrices are learned to project images and text into the same semantic space, which is more similar to the multimedia document. Experiments conducted on three real-world datasets for cross-media retrieval demonstrate the effectiveness of our proposed approach compared with state-of-the-art methods.


9.
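Image classification is an important research area in computer vision, and choosing a feature on which to build a similarity measure between images is a key problem in image classification. Given the characteristics of mural images, contour features are important features that can express their semantics. Research has shown that contours can serve as important image features for image recognition and classification, but earlier work usually either computed image similarity from the chamfer distance between the most similar pairs of contours, or built local descriptors for contours, clustered them into a dictionary, described image features with statistical histograms, and then classified images with a support vector machine (SVM). These methods all ignore the overall structural relationships among contours and lack a holistic description of all contours, whereas in practice the semantics of an image is more often holistic. In the image similarity measure based on overall contour structure, the similarity computation between the contours of two images is constrained by their spatial structural relationships with other contours, so the resulting similarity better expresses the overall similarity of the two images. Experimental results show that the proposed method improves accuracy on mural image classification over methods without the overall structural constraint.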
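The chamfer distance mentioned above as the baseline can be sketched as follows. This is a minimal illustration of one common definition (the mean nearest-neighbor distance, averaged over both directions), not the contour-structure method of the abstract; the function names and toy contours are hypothetical.

```python
import math

def directed_chamfer(a, b):
    """Mean distance from each point in a to its nearest point in b."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)

def chamfer_distance(a, b):
    """Symmetric chamfer distance: average of the two directed terms."""
    return 0.5 * (directed_chamfer(a, b) + directed_chamfer(b, a))

# Two toy contours given as lists of (x, y) points (illustrative data only).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
shifted = [(x + 0.5, y) for x, y in square]

print(chamfer_distance(square, square))   # identical contours give 0.0
print(chamfer_distance(square, shifted))  # every nearest neighbor is 0.5 away
```

Because each contour is matched only to its nearest counterpart, a measure like this carries no information about how contours are arranged relative to one another, which is the limitation the abstract points to.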

10.
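Image classification is an important research area in computer vision, and choosing a feature on which to build a similarity measure between images is a key problem in image classification. Given the characteristics of mural images, contour features are important features that can express their semantics. Many studies have shown that contours can serve as important image features for image recognition and classification, but earlier work usually either computed image similarity from the chamfer distance between the most similar pairs of contours, or built local descriptors for contours, clustered them into a dictionary, described image features with statistical histograms, and then classified images with an SVM. These methods all ignore the overall structural relationships among contours and lack a holistic description of all contours, whereas in practice the semantics of an image is more often holistic. This paper studies an image similarity measure based on overall contour structure, in which the similarity computation between the contours of two images is constrained by their spatial structural relationships with other contours, so the resulting similarity better expresses the overall similarity of the two images. Experimental results show that our method improves accuracy on mural image classification over methods without the overall structural constraint.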

11.
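Extracting the main regions of an image is the basis for image semantics extraction and its applications. To better extract image semantics, a semantics-oriented method for automatically extracting the main regions of an image is proposed. The method first divides the image into fixed-size blocks and obtains an initial region segmentation by clustering block features. A series of post-processing steps then refines the segmentation and separates foreground from background. Finally, irrelevant background regions are removed by analyzing the importance of each background region. Experiments on outdoor images containing salient objects show that the method removes a large amount of content irrelevant to the image semantics while retaining the main information of the image, which lays a foundation for further image semantics applications.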

12.
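Multimedia semantics mining and cross-media retrieval based on comprehensive reasoning   Total citations: 6, self-citations: 0, citations by others: 6
For more accurate cross-media retrieval, the semantic associations between multimedia objects of different types need to be mined and learned. To this end, a multimedia semantics mining and cross-media retrieval technique based on a comprehensive reasoning model is proposed. First, reasoning sources are constructed from the low-level features of multimedia objects, and influence source fields are constructed from their co-occurrence relations, in order to carry out comprehensive reasoning and build a multimedia semantic space. Then, based on pseudo-relevance feedback, a retrieval method is adaptively selected for each query example to perform cross-media retrieval. To handle query examples that fall outside the training set, a two-stage learning method is proposed to complete retrieval; a log-based long-term feedback learning algorithm is also proposed to improve system performance. Experimental results show that the technique mines multimedia semantics accurately, and that multimedia document retrieval and cross-media retrieval results are accurate and stable.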

13.
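Research on news video mining techniques   Total citations: 4, self-citations: 0, citations by others: 4
News video mining is an emerging research field and a typical representative of multimedia data mining. This paper gives a comprehensive, in-depth discussion of news video mining techniques. It first defines news video mining conceptually and proposes a hierarchical framework and a technical framework for it, pointing out that news video mining comprises two levels: low-level video mining and high-level video mining. Low-level video mining is the process of analyzing video content with data mining methods, while high-level video mining builds on low-level mining to further discover knowledge in the video. The technical framework analyzes the specific techniques involved in mining. Finally, tasks in news video mining such as structure mining, semantic content mining, video summarization, trend mining, and association mining are described in detail, with concrete examples given for each task.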

14.
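Image semantic annotation based on mutual-information-constrained clustering   Total citations: 2, self-citations: 0, citations by others: 2
An image annotation algorithm based on mutual-information-constrained clustering is proposed. The information bottleneck algorithm is improved with semantic constraints, and the improved algorithm is used to cluster segmented image regions, establishing the relationship between image semantic concepts and clustered regions. For unannotated images, a method for computing the conditional probability of semantic concepts is proposed that takes into account both prior knowledge from the training images and the low-level features of regions. Finally, the semantic keyword with the highest conditional probability is used to annotate each image region automatically. Experiments on a database of 500 images show that the method is more effective than other methods.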

15.
Semantic entities carry the most important semantics of text data. Therefore, the identification and the relationship integration of semantic entities are very important for applications requiring semantics of text data. However, current strategies still face many problems, such as semantic entity identification, new word identification, and relationship integration among semantic entities. To address these problems, a two-phase framework for semantic entity identification with relationship integration in large-scale text data is proposed in this paper. In the first phase, semantic entity identification, we propose a novel strategy to extract unknown text semantic entities by integrating statistical features, Decision Tree (DT), and Support Vector Machine (SVM) algorithms. Compared with traditional approaches, our strategy is more effective in detecting semantic entities and more sensitive to new entities that have just appeared in fresh data. After extracting the semantic entities, the second phase of our framework integrates Semantic Entities Relationships (SER), which can help to cluster the semantic entities. A novel classification method using features such as similarity measures and co-occurrence probabilities is applied to tackle the clustering problem and discover the relationships among semantic entities. Comprehensive experimental results show that our framework can beat state-of-the-art strategies in semantic entity identification and discover over 80% of relationship pairs among related semantic entities in large-scale text data.

16.
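Image and video retrieval techniques   Total citations: 10, self-citations: 0, citations by others: 10
With the development of network technology, multimedia data will become the main content of network services, so research on multimedia data management has become a hot topic in recent years. Because media information is represented differently, the retrieval methods of traditional relational databases no longer suit images and video, and content-based retrieval must be used instead. This article gives a comprehensive, level-by-level survey of content-based image and video retrieval techniques, covering techniques based on basic features (color, texture, shape, and positional relationships), video scene segmentation and key-frame extraction, and retrieval based on audio and text. It also sets out the strengths and weaknesses, current state, and future directions of each approach.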

17.
18.
The popularity of GPS-equipped gadgets and mapping mashup applications has motivated the growth of geotagged Web resources as well as georeferenced multimedia applications. More and more research attention has been paid to mining collaborative knowledge from mass user-contributed geotagged content. However, little attention has been paid to generating high-quality geographical clusters, which is an important preliminary data-cleaning step for most geographical mining work. Previous work mainly uses geotags to derive geographical clusters. Using a single channel of information is not sufficient for generating distinguishable clusters, especially when the location ambiguity problem occurs. In this paper, we propose a two-level clustering framework that utilizes both the spatial and the semantic features of photographs. In the first-level geoclustering phase, we cluster geotagged photographs according to their spatial ties to roughly partition the dataset in an efficient way. Then, in the second-level semantic clustering phase, we leverage the textual semantics in the photographs' annotations to further refine the grouping results. To effectively measure the semantic correlation between photographs, a semantic enhancement method and a new term weighting function are proposed. We also propose a method for automatic parameter determination for the second-level spectral clustering process. Evaluation of our implementation on a real georeferenced photograph dataset shows that our algorithm performs well, producing distinguishable geographical clusters with high accuracy and mutual information.
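The two-level idea can be sketched in a few lines. This is a simplified illustration, not the paper's algorithm: a fixed latitude/longitude grid stands in for the first-level geoclustering, and a greedy Jaccard grouping on tags stands in for the second-level spectral clustering with term weighting. All names, the cell size, the threshold, and the sample photographs are assumptions.

```python
import math
from collections import defaultdict

def geo_cluster(photos, cell=0.01):
    """Level 1: bucket photos into a coarse latitude/longitude grid."""
    cells = defaultdict(list)
    for p in photos:
        key = (math.floor(p["lat"] / cell), math.floor(p["lon"] / cell))
        cells[key].append(p)
    return list(cells.values())

def jaccard(a, b):
    """Tag-set overlap in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def semantic_refine(group, threshold=0.3):
    """Level 2: greedily split one geo-cluster by tag overlap."""
    clusters = []
    for p in group:
        for c in clusters:
            if jaccard(p["tags"], c["tags"]) >= threshold:
                c["members"].append(p["id"])
                c["tags"] |= p["tags"]  # grow the cluster's tag vocabulary
                break
        else:
            clusters.append({"tags": set(p["tags"]), "members": [p["id"]]})
    return [c["members"] for c in clusters]

# Hypothetical geotagged photographs.
photos = [
    {"id": "a", "lat": 48.8584, "lon": 2.2945, "tags": {"eiffel", "tower"}},
    {"id": "b", "lat": 48.8586, "lon": 2.2948, "tags": {"eiffel", "paris", "tower"}},
    {"id": "c", "lat": 48.8583, "lon": 2.2941, "tags": {"cafe", "coffee"}},
    {"id": "d", "lat": 40.6892, "lon": -74.0445, "tags": {"liberty", "statue"}},
]
result = [m for g in geo_cluster(photos) for m in semantic_refine(g)]
print(result)
```

Here photographs a, b, and c share a grid cell, and the tag step separates c from the other two, which is the kind of location ambiguity that a geotag-only clustering cannot resolve.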

19.
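Music similarity computation based on semantic music tags is another new hot topic in music information retrieval. This paper proposes a song classification method based on tag mining, studying song similarity using user tags from the Last.fm music website as features. It combines latent semantic analysis (LSA), commonly used in text clustering, with an improved K-means clustering method and applies them to the automatic extraction of semantic music tags. More than 8,000 user tags for 600 songs in 6 broad categories were extracted from last.fm as semantic music features, and LSA was used to reduce the dimensionality of the song vectors, yielding a 600×150 matrix that represents the similarity relations between songs. Finally, K-means is used to classify the songs according to their similarity, completing the song similarity comparison. Comparison with results obtained without LSA dimensionality reduction and with existing HCC results shows that classifying songs with the proposed tag-based model gives good classification results.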

20.
In this paper, we present ICICLE (Image ChainNet and Incremental Clustering Engine), a prototype system that we have developed to efficiently and effectively retrieve WWW images based on image semantics. ICICLE has two distinguishing features. First, it employs a novel image representation model called Weight ChainNet to capture the semantics of the image content. A new formula, called the list space model, for computing semantic similarities is also introduced. Second, to speed up retrieval, ICICLE employs an incremental clustering mechanism, ICC (Incremental Clustering on ChainNet), to cluster images with similar semantics into the same partition. Each cluster has a summary representative, and all clusters' representatives are further summarized into a balanced and full binary tree structure. We conducted an extensive performance study to evaluate ICICLE. Compared with some recently proposed methods, our results show that ICICLE provides better recall and precision. Our clustering technique ICC facilitates speedy retrieval of images without sacrificing recall and precision significantly.
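Incremental clustering with one summary representative per cluster can be sketched generically as below. This is not ICC or the list space model, whose details the abstract does not give: it is a simple leader-style scheme over plain feature vectors with cosine similarity, and the class name, threshold, and sample vectors are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

class IncrementalClusters:
    """Assign each arriving item to the most similar cluster, or open a new one."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.centroids = []  # one summary vector (running mean) per cluster
        self.members = []    # item ids per cluster

    def add(self, item_id, vec):
        best, best_sim = None, self.threshold
        for i, c in enumerate(self.centroids):
            sim = cosine(vec, c)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            # No existing representative is similar enough: start a new cluster.
            self.centroids.append(list(vec))
            self.members.append([item_id])
            return len(self.centroids) - 1
        # Update the running mean so the representative tracks its members.
        n = len(self.members[best])
        self.centroids[best] = [
            (c * n + x) / (n + 1) for c, x in zip(self.centroids[best], vec)
        ]
        self.members[best].append(item_id)
        return best

clusters = IncrementalClusters(threshold=0.8)
for item_id, vec in [("img1", [1.0, 0.0]), ("img2", [0.9, 0.1]), ("img3", [0.0, 1.0])]:
    clusters.add(item_id, vec)
print(clusters.members)
```

A query then only needs to be compared against the cluster representatives before searching inside the best-matching cluster, which is the general reason such schemes speed up retrieval.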
