Similar Documents
Found 20 similar documents (search time: 671 ms)
1.
Objective: With the rapid development of handheld mobile devices and the arrival of the big-data era, research and applications centered on multimedia data, such as visual search, have attracted wide attention, and the compression, storage, and transmission of local feature descriptors play a pivotal role in them. We therefore propose an efficient compact representation of local visual features within the traditional image/video compression framework, so that conventional content coding can serve a broad range of retrieval and analysis tasks. Method: To obtain a compact, discriminative, and efficient local feature representation, we first introduce a multi-reference prediction mechanism that removes spatio-temporal redundancy and, by fully exploiting the information in video texture coding, also eliminates the redundancy between texture and features. We further propose a new rate-distortion optimization method, rate-accuracy optimization, which optimizes performance for matching and retrieval applications. Results: Validation experiments on several datasets show that, compared with the latest video local-descriptor compression framework, our method significantly reduces the bit cost of the features while preserving matching and retrieval performance, reaching a compression ratio of roughly 150:1. Conclusion: The method fits into traditional image/video coding frameworks: by embedding a small amount of feature information in the bitstream, it achieves efficient retrieval, providing a new multimedia content coding framework for retrieval-oriented intelligent applications.
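The rate-accuracy optimization described above replaces the usual rate-distortion cost with one that trades bits against retrieval accuracy. A minimal sketch of the selection rule, assuming a Lagrangian cost J = R + λ·(1 − A) and entirely hypothetical coding modes:

```python
def rate_accuracy_select(candidates, lam):
    """Pick the coding mode minimizing J = R + lam * (1 - A),
    where R is the bit cost and A the expected matching accuracy."""
    return min(candidates, key=lambda c: c["rate"] + lam * (1.0 - c["accuracy"]))

# Invented example modes: more bits buy higher matching accuracy.
modes = [
    {"name": "coarse", "rate": 40,  "accuracy": 0.80},
    {"name": "medium", "rate": 120, "accuracy": 0.92},
    {"name": "fine",   "rate": 400, "accuracy": 0.95},
]
choice = rate_accuracy_select(modes, lam=1000.0)
```

With λ = 1000, the medium mode wins: the accuracy gained by the fine mode does not justify its extra bits, which is exactly the trade-off the paper's optimization automates.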

2.
Objective: Traditional floating-point local image descriptors (e.g., SIFT and SURF) have become essential tools in computer vision research and applications. However, their high-dimensional feature vectors suffer from the curse of dimensionality in large-scale content-based visual retrieval, which poses a serious challenge for floating-point features in large-scale multimedia applications. To address their high computational complexity and large storage overhead, a growing number of computer vision groups have turned to binary local features and made substantial progress. Method: We first review related work on binary features and categorize the existing methods, and on that basis propose a descriptor based on intensity-difference quantization. Unlike traditional binary descriptors, our algorithm samples random pixel pairs in a local image patch, computes the intensity difference of each pair, and quantizes these differences into bits to obtain the local binary feature. Results: Compared with several mainstream binary descriptors on public datasets, our binary feature surpasses them in matching precision and recall while retaining very high computational speed and storage efficiency. Conclusion: The experiments verify that the proposed binary feature remains robust under changing imaging conditions.
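The intensity-difference quantization step can be illustrated in a few lines. This is a sketch in the spirit of the paper (and of BRIEF-style descriptors), not the authors' exact sampling scheme; the patch size, bit count, and seed are assumptions:

```python
import random

def binary_descriptor(patch, n_bits=32, seed=0):
    """Sample random pixel pairs in the patch and quantize each pair's
    intensity difference to one bit (1 if the first pixel is brighter)."""
    h, w = len(patch), len(patch[0])
    rng = random.Random(seed)  # fixed seed: identical sampling pattern for every patch
    bits = []
    for _ in range(n_bits):
        y1, x1 = rng.randrange(h), rng.randrange(w)
        y2, x2 = rng.randrange(h), rng.randrange(w)
        bits.append(1 if patch[y1][x1] > patch[y2][x2] else 0)
    return bits

def hamming(a, b):
    """Binary descriptors are matched by Hamming distance
    (XOR plus popcount on packed bits in a real implementation)."""
    return sum(x != y for x, y in zip(a, b))

patch = [[(x + y) % 256 for x in range(8)] for y in range(8)]
d = binary_descriptor(patch)
```

Matching two descriptors costs only a Hamming distance, which is where the speed and storage advantages over floating-point descriptors come from.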

3.

Unsupervised representation learning of unlabeled multimedia data is an important yet challenging problem for indexing, clustering, and retrieval. There have been many attempts to learn representations from collections of unlabeled 2D images. In contrast, far less attention has been paid to unsupervised representation learning for unordered sets of high-dimensional feature vectors, which are often used to describe multimedia data; one such example is a set of local visual features describing a 2D image. This paper proposes a novel algorithm called Feature Set Aggregator (FSA) for accurate and efficient comparison among sets of high-dimensional features. FSA learns a representation, or embedding, of unordered feature sets via optimization with a combination of two training objectives, set reconstruction and set embedding, carefully designed for set-to-set comparison. Experimental evaluation under three multimedia information retrieval scenarios using 3D shapes, 2D images, and text documents demonstrates both the efficacy and the generality of the proposed algorithm.
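FSA's embedding itself is learned, but the key property it must satisfy, invariance to the ordering of the feature set, can be shown with a hand-written aggregator. Mean pooling here is only a stand-in for the learned encoder, not the paper's architecture:

```python
def set_embedding(feature_set, dim):
    """Toy permutation-invariant embedding of an unordered feature set:
    aggregate the vectors by mean pooling, so the result does not
    depend on the order in which the features are listed."""
    n = len(feature_set)
    return [sum(v[i] for v in feature_set) / n for i in range(dim)]

s1 = [[1.0, 2.0], [3.0, 4.0]]
s2 = [[3.0, 4.0], [1.0, 2.0]]  # the same set, enumerated in a different order
e1 = set_embedding(s1, 2)
e2 = set_embedding(s2, 2)
```

Because both orderings map to the same vector, sets can be compared by an ordinary vector distance on their embeddings, which is what makes set-to-set comparison efficient.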


4.
Objective: The rapid growth of massive data poses profound challenges for multimedia computing. Unlike the traditional paradigm built around hand-crafted features, data-driven deep learning (feature learning) has become the mainstream approach to media computing. Method: We analyze the latest progress and open challenges of deep learning in several areas of media computing, namely retrieval ranking and annotation, cross-modal retrieval and semantic understanding, and video analysis and understanding, and discuss future trends. Results: In retrieval ranking and annotation, deep-learning-based methods such as neural codes have achieved strong results. In cross-modal retrieval and semantic understanding, deep learning is used to bridge the "heterogeneity gap" between modalities and the "semantic gap" between low-level features and high-level semantics, and deep compositional semantic learning has become a research hotspot. In video analysis and understanding, deep neural networks are used to learn effective video representations and to recognize actions, with good results. However, deep learning is data-driven and therefore sensitive to data noise, and it remains immature for online incremental learning; how to combine deep learning with crowdsourced computation is a promising open problem. Conclusion: Building on a close analysis of existing methods, this survey offers new ideas for bridging the heterogeneity gap and the semantic gap within the deep learning framework.

5.
Context: A software product line is a family of software systems that share common features but also exhibit significant variability. A feature model is a variability modeling artifact that represents the differences among software products in terms of the variability relationships among their features. Given a feature model and a reference model developed in the domain-engineering lifecycle, a concrete product of the family is derived by binding the variation points in the feature model (the configuration process) and instantiating the reference model. Objective: In this work we address the feature model configuration problem and propose a framework that automatically selects features satisfying both the functional and non-functional preferences and constraints of stakeholders. The framework also takes interdependencies between non-functional properties into account. Method: The proposed framework combines the Analytical Hierarchy Process (AHP) and Fuzzy Cognitive Maps (FCM) to compute weights for the non-functional properties from stakeholders' preferences and the interdependencies among those properties. Hierarchical Task Network (HTN) planning is then applied to find the optimal feature model configuration. Result: Our approach improves the state of the art in feature model configuration by considering the positive or negative impacts of features on non-functional properties, the stakeholders' preferences, and non-functional interdependencies. It extends earlier work presented in [1] in several distinct respects, including mechanisms for handling interdependencies between non-functional properties, a novel tooling architecture, and visualization and interaction techniques for representing the functional and non-functional aspects of feature models. Conclusion: Our experiments show the scalability of the configuration approach when both functional and non-functional stakeholder requirements are considered.
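The AHP step of such a framework derives non-functional-property weights from a pairwise-comparison matrix; the standard computation takes the matrix's principal eigenvector. A sketch using power iteration, with an invented three-property example (the judgments are assumptions, not from the paper):

```python
def ahp_weights(pairwise, iters=100):
    """Approximate the principal eigenvector of a pairwise-comparison
    matrix by power iteration; after normalization it gives the
    relative weights of the compared properties."""
    n = len(pairwise)
    w = [1.0 / n] * n
    for _ in range(iters):
        w = [sum(pairwise[i][j] * w[j] for j in range(n)) for i in range(n)]
        s = sum(w)
        w = [x / s for x in w]  # renormalize so the weights sum to 1
    return w

# Hypothetical judgments: security is 3x as important as performance
# and 5x as important as cost; the matrix is reciprocal by construction.
matrix = [
    [1.0,     3.0, 5.0],
    [1 / 3.0, 1.0, 2.0],
    [1 / 5.0, 0.5, 1.0],
]
weights = ahp_weights(matrix)
```

The resulting weights would then be adjusted by the FCM interdependency model before HTN planning searches for the optimal configuration.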

6.
Objective: As video surveillance technology matures and surveillance equipment becomes ubiquitous, surveillance applications are spreading rapidly and surveillance video is growing explosively, making it a key data object of the big-data era. Because video is inherently unstructured, however, processing and analyzing surveillance video remains relatively difficult. Facing the massive video captured by large camera networks, there is an urgent need to transmit, store, analyze, and recognize these data effectively according to their content and characteristics. Method: Targeting large-scale visual perception and intelligent processing in smart video surveillance, we systematically survey the state of the technology in 2013 across four main research directions, namely surveillance video coding, object detection and tracking, surveillance video enhancement, and motion and abnormal-behavior recognition, and discuss future trends. Results: China's newly issued national standard AVS2 doubles the coding efficiency of the latest international standard H.265/HEVC on surveillance video, marking a leap for Chinese video coding technology and standards in the surveillance domain. Research on moving-object detection and tracking focuses on effective feature extraction and classifier training; with the introduction of machine learning, detection and tracking based on multiple-instance learning and sparse representation have become hot topics. Surveillance video enhancement covers dehazing, night-scene enhancement, rain and snow removal, deblurring, and super-resolution; existing algorithms each handle one class of degradation well but generalize poorly to others. Intelligent action analysis and abnormal-behavior recognition keep improving in performance, but from a practical standpoint, apart from simple, specific, or controlled scenes, few mature application systems exist. Conclusion: With the arrival of the big-data era, the demand for intelligent video surveillance will become ever more pressing; alongside many challenges, the field faces unprecedented opportunities and will yield an increasing number of practical results.

7.
Xu  Xing  Wu  Haiping  Yang  Yang  Shen  Fumin  Xie  Ning  Ji  Yanli 《Multimedia Tools and Applications》2018,77(17):22185-22198

Recent years have witnessed unprecedented efforts in visual representation for enabling various efficient and effective multimedia applications. In this paper, we propose a novel visual representation learning framework that generates efficient semantic hash codes for visual samples by thoroughly exploring concepts, semantic attributes, and their inter-correlations. Specifically, we construct a conceptual space in which the semantic knowledge of concepts and attributes is embedded. We then develop an effective online feature coding scheme for visual objects that leverages inter-concept relationships through the intermediate representative power of attributes. The coding process is formulated as an overlapping group lasso problem, which can be solved efficiently. Finally, the visual representation may be binarized to generate compact hash codes. Extensive experiments illustrate the superiority of the proposed framework on visual retrieval tasks compared with state-of-the-art methods.
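The final binarization step mentioned above can be as simple as thresholding each dimension of the learned real-valued representation. This sketch assumes zero thresholds (i.e., the sign function) and is not necessarily the paper's exact scheme:

```python
def binarize(representation, thresholds=None):
    """Turn a real-valued visual representation into a compact hash code
    by thresholding each dimension (default threshold 0, i.e. sign)."""
    t = thresholds or [0.0] * len(representation)
    return [1 if v > ti else 0 for v, ti in zip(representation, t)]

code = binarize([0.7, -0.2, 0.0, 1.3])
```

The bits can then be packed into machine words so that retrieval reduces to Hamming-distance comparisons, which is what makes hash codes efficient at scale.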


8.

This paper presents a formal framework for representing knowledge about the use of multimedia information in the context of designing user-tailored information presentations. The aim is to support efficient development and maintenance of knowledge bases and effective utilization of knowledge for adaptive information presentations. It is conjectured that a hybrid knowledge representation framework that mixes formal knowledge structures with canned fragments of multimedia information, and that distinguishes between categories of medium-independent information items and types of presentation forms, can comprehensively achieve this aim. The framework supports subject-based classification of reusable multimedia objects and their utilization according to presentation intentions and pragmatic constraints. This approach, although considered complementary, contrasts with approaches that require systems to build every presentation from scratch.

9.
10.
11.
Detecting multimedia events in web videos is an emerging hot research area in multimedia and computer vision. In this paper, we introduce the core methods and technologies of the framework we recently developed for our Event Labeling through Analytic Media Processing (E-LAMP) system, which addresses different aspects of the overall event detection problem. First, we have developed efficient feature extraction methods so that we can handle large collections of video data with thousands of hours of footage. Second, we represent the extracted raw features in a spatial bag-of-words model with more effective tilings, so that the spatial layout of different features and events is better captured and overall detection performance improves. Third, unlike the widely used early and late fusion schemes, a novel algorithm is developed to learn a more robust and discriminative intermediate feature representation from multiple features, so that better event models can be built on it. Finally, to tackle the additional challenge of event detection with only very few positive exemplars, we have developed a novel algorithm that effectively adapts knowledge learnt from auxiliary sources to assist event detection. Both our empirical results and the official evaluation results on TRECVID MED'11 and MED'12 demonstrate the excellent performance of the integration of these ideas.
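The spatial bag-of-words representation with tilings works by splitting the frame into grid cells, building one visual-word histogram per cell, and concatenating them so that spatial layout survives the pooling. The grid and vocabulary below are toy assumptions, not E-LAMP's actual configuration:

```python
def spatial_bow(features, vocab_size, grid=(2, 2)):
    """Spatial bag-of-words: one visual-word histogram per grid cell,
    concatenated into a single vector. `features` are (x, y, word_id)
    tuples with x, y normalized to [0, 1)."""
    gx, gy = grid
    hists = [[0] * vocab_size for _ in range(gx * gy)]
    for x, y, word in features:
        cell = int(y * gy) * gx + int(x * gx)  # which tile the feature falls in
        hists[cell][word] += 1
    return [count for h in hists for count in h]  # concatenated histograms

# Four quantized local features, one per quadrant of the frame.
feats = [(0.1, 0.1, 0), (0.9, 0.1, 1), (0.1, 0.9, 0), (0.9, 0.9, 2)]
vec = spatial_bow(feats, vocab_size=3)
```

A plain bag-of-words would collapse all four quadrants into one histogram; the tiled version keeps "word 1 appeared top-right" distinct from "word 1 appeared bottom-left", which is the layout information the paper exploits.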

12.
Querying live media streams is a challenging problem that is becoming an essential requirement in a growing number of applications. Research in multimedia information systems has made good progress on archived data, while research in stream databases has focused on querying alphanumeric symbolic streams. The lack of a data model that can represent heterogeneous multimedia data declaratively, hide media heterogeneity, and provide reasonable abstractions for querying live multimedia streams raises the challenge of how to make the best use of video, audio, and other media sources in various applications. In this paper we propose a system that captures media streams directly from sensors and automatically generates more meaningful feature streams that can be queried by a data stream processor. The system effectively combines extensible digital processing techniques with general data stream management research. Together with other query techniques developed in related data stream management systems, our system can be used in application areas where diverse live media sensors are deployed, such as surveillance, disaster response, live conferencing, and telepresence.

13.
This paper presents a novel ranking framework for content-based multimedia information retrieval (CBMIR). The framework introduces relevance features and a new ranking scheme. Each relevance feature measures the relevance of an instance with respect to a profile of the targeted multimedia database. We show that CBMIR can be performed more effectively with the relevance features than with the original features. Additional performance gains come from incorporating the new ranking scheme, which modifies instance rankings based on the weighted average of relevance feature values. Experiments on image and music databases validate the efficacy and efficiency of the proposed framework.
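The re-ranking step, a weighted average of relevance-feature values, is straightforward to sketch; the weights and feature values below are invented for illustration:

```python
def rank_by_relevance(instances, weights):
    """Re-rank instances by the weighted average of their relevance-feature
    values; a higher score means more relevant to the database profile."""
    def score(feat):
        return sum(w * f for w, f in zip(weights, feat)) / sum(weights)
    return sorted(instances, key=lambda item: score(item[1]), reverse=True)

# (id, relevance-feature vector) pairs; equal weights are an assumption.
ranked = rank_by_relevance(
    [("a", [0.2, 0.9]), ("b", [0.8, 0.7]), ("c", [0.1, 0.1])],
    weights=[0.5, 0.5],
)
```

Instance "b" rises to the top because it scores well on both relevance features, even though "a" beats it on one of them, which is the smoothing effect a weighted average provides.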

14.
Feature selection is an essential data processing step that removes irrelevant and redundant attributes for shorter learning time, better accuracy, and better comprehensibility. A number of algorithms have been proposed in both the data mining and machine learning areas. These algorithms are usually used in a single-table environment, where data are stored in one relational table or one flat file; they are not suitable for a multi-relational environment, where data are stored in multiple tables joined by semantic relationships. To address this problem, in this article we propose a novel approach called FARS that conducts both Feature And Relation Selection for efficient multi-relational classification. With this approach, we not only extend traditional feature selection to select relevant features from multiple relations, but also develop a new method that reconstructs the multi-relational database schema and eliminates irrelevant tables to further improve classification performance. Experiments on both real and synthetic databases show that FARS can effectively choose a small set of relevant features, significantly enhancing classification efficiency and prediction accuracy.

15.
Time series classification arises in many domains, such as health informatics, finance, and bioinformatics. Because of its broad applicability, researchers have developed many algorithms for this kind of task, e.g., multivariate time series classification. Among the classification algorithms, k-nearest-neighbor (k-NN) classification (particularly 1-NN) combined with dynamic time warping (DTW) achieves state-of-the-art performance. Its deficiency is that as the dataset grows large, 1-NN with DTW becomes very expensive. In contrast, feature-based classification methods are more efficient but less effective, since their performance usually depends on the quality of hand-crafted features. In this paper, we aim to improve traditional feature-based approaches through feature learning. Specifically, we propose a novel deep learning framework, multi-channels deep convolutional neural networks (MC-DCNN), for multivariate time series classification. The model first learns features from each individual univariate time series in its own channel and combines the information from all channels into a feature representation at the final layer; the learnt features are then fed into a multilayer perceptron (MLP) for classification. Extensive experiments on real-world datasets show that our model is not only more efficient than the state of the art but also competitive in accuracy. This study implies that feature learning is worth investigating for the problem of time series classification.
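The DTW distance behind the 1-NN baseline is a classic dynamic program; a minimal implementation with squared point-wise cost (one of several common choices of local cost):

```python
def dtw(a, b):
    """Dynamic time warping distance between two 1-D series:
    an O(len(a) * len(b)) DP over an alignment-cost table."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # A step may advance either series or both (the warping).
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

d_same = dtw([1, 2, 3], [1, 2, 3])
d_shift = dtw([1, 2, 3], [1, 1, 2, 3])  # warping absorbs the repeated point
```

The quadratic cost per pair, multiplied by every training series a 1-NN query must visit, is exactly the expense the paper's learned features avoid.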

16.
Combined SVM-Based Feature Selection and Classification
Feature selection is an important combinatorial optimisation problem in the context of supervised pattern classification. This paper presents four novel continuous feature selection approaches that directly optimise classifier performance. In particular, we include linear and nonlinear Support Vector Machine classifiers. The key ideas of our approaches are additional regularisation and embedded nonlinear feature selection. To solve our optimisation problems, we apply difference-of-convex-functions programming, a general framework for non-convex continuous optimisation. Experiments with artificial data and with various real-world problems, including organ classification in computed tomography scans, demonstrate that our methods accomplish the desired feature selection and classification performance simultaneously.

17.
Context: Several issues affect software defect data, including redundancy, correlation, feature irrelevance, and missing samples. It is also hard to ensure a balanced distribution between data for defective and non-defective software; in most experimental cases, data for the latter class dominate the dataset. Objective: The objectives of this paper are to demonstrate the positive effects of combining feature selection and ensemble learning on defect classification performance. Along with efficient feature selection, a new two-variant (with and without feature selection) ensemble learning algorithm is proposed to provide robustness to both data imbalance and feature redundancy. Method: We carefully combine selected ensemble learning models with efficient feature selection to address these issues and mitigate their effects on defect classification performance. Results: Forward selection showed that only a few features contribute to a high area under the receiver-operating curve (AUC). On the tested datasets, the greedy forward selection (GFS) method outperformed other feature selection techniques such as Pearson's correlation, suggesting that the features are highly unstable. However, ensemble learners such as random forests and the proposed algorithm, average probability ensemble (APE), are not as affected by poor features as weighted support vector machines (W-SVMs). Moreover, the APE model combined with greedy forward selection (enhanced APE) achieved AUC values of approximately 1.0 for the NASA datasets PC2, PC4, and MC1. Conclusion: This paper shows that the features of a software dataset must be carefully selected for accurate classification of defective components. Furthermore, tackling the software data issues mentioned above with the proposed combined learning model yielded remarkable classification performance, paving the way for successful quality control.
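Greedy forward selection, as used by GFS, adds at each step the feature that most improves the evaluation score and stops when no candidate helps. The sketch below uses a toy additive score as a hypothetical stand-in for a cross-validated AUC:

```python
def greedy_forward_selection(features, score_fn, max_features=None):
    """Repeatedly add the feature whose inclusion most improves the
    score (e.g. AUC of a classifier trained on the current subset);
    stop when no remaining feature improves it."""
    selected, best = [], score_fn([])
    remaining = list(features)
    limit = max_features or len(features)
    while remaining and len(selected) < limit:
        gains = [(score_fn(selected + [f]), f) for f in remaining]
        top_score, top_feat = max(gains, key=lambda t: t[0])
        if top_score <= best:
            break  # adding any further feature would not help
        selected.append(top_feat)
        remaining.remove(top_feat)
        best = top_score
    return selected, best

# Toy score: each genuinely useful feature adds a fixed gain,
# any other feature costs a small penalty (a stand-in for noise).
useful = {"loc": 0.10, "complexity": 0.08, "churn": 0.05}
score = lambda subset: 0.5 + sum(useful.get(f, -0.05) for f in subset)
picked, final_score = greedy_forward_selection(
    ["loc", "noise", "complexity", "churn"], score
)
```

The noisy feature is never picked: its candidate score falls below the current best, which triggers the stopping rule. In the paper this greedy loop is wrapped around the APE ensemble's cross-validated AUC instead of a toy score.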

18.
Objective: Physically based smoke simulation is an important part of computer graphics, and rendering high-resolution smoke with fine structures requires substantial computational resources and high-accuracy numerical solvers. To address the slow speed and difficulty of high-accuracy turbulent smoke simulation, we propose a dictionary-neural-network-based method that rapidly synthesizes turbulent smoke, adding detail while preserving the important structures of the high-resolution result. Method: We obtain high- and low-resolution turbulent smoke data with a high-accuracy numerical solver, build a dataset by sampling local velocity-field patches together with their spatial positions and temporal features, design the dictionary neural network architecture, train a predictor for the dictionary of high-frequency smoke components, and parallelize it on the GPU (graphics processing unit) to quickly synthesize high-resolution turbulent smoke. Results: Experiments show that the method synthesizes spatially and temporally coherent high-resolution turbulent smoke from very low-resolution smoke data, an order of magnitude faster than obtaining high-resolution turbulent smoke by direct simulation on the GPU, and the synthesized results are sufficiently close to those of high-resolution numerical simulation. Conclusion: The method solves the smoke upsampling problem: starting from very low-resolution simulation results, the dictionary neural network and feature descriptors encoding local and global information of the smoke velocity field quickly synthesize high-resolution turbulent smoke that retains fine detail, and comparisons with numerical simulation demonstrate its effectiveness.

19.
Multimedia understanding for high-dimensional data is still challenging because of the redundant features, noise, and insufficient label information such data contain. Graph-based semi-supervised feature learning is an effective approach to this problem. Nevertheless, existing graph-based semi-supervised methods usually depend on a pre-constructed Laplacian matrix and rarely modify it in subsequent classification tasks. In this paper, an adaptive local manifold learning based semi-supervised feature selection method is proposed. Compared with the state of the art, the proposed algorithm has two advantages: 1) adaptive local manifold learning and feature selection are jointly integrated into a single framework in which both labeled and unlabeled data are utilized, and the correlations between different components are also considered; 2) a group sparsity constraint, i.e., the ℓ2,1-norm, is imposed to select the most relevant features. We also apply the proposed algorithm to several kinds of multimedia understanding applications. Experimental results demonstrate the effectiveness of the proposed algorithm.
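The ℓ2,1-norm used as the group sparsity constraint sums the ℓ2 norms of a weight matrix's rows, so minimizing it drives entire rows, i.e., whole features, to zero. A direct computation:

```python
import math

def l21_norm(W):
    """l2,1-norm of a matrix: the sum over rows of each row's l2 norm.
    Penalizing it during training zeroes out entire rows, which is what
    makes it a feature-selection (rather than element-wise) regularizer."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in W)

W = [[3.0, 4.0],   # row norm 5: this feature is kept
     [0.0, 0.0],   # zeroed-out feature contributes nothing to the penalty
     [0.0, 2.0]]   # row norm 2
val = l21_norm(W)
```

By contrast, an element-wise ℓ1 penalty can leave a feature's row half-zeroed; the row-wise grouping is why the ℓ2,1-norm selects features rather than individual weights.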

20.

In the proposed technique we merge two layers of security, namely the watermarking layer and the secure channel management layer, used in the framework of a real-time multimedia security system. The joint layer embeds the watermark into the host signal and applies encryption simultaneously to protect the multimedia data, thereby reducing processing delay. The scheme supports the G.729 codec bitstream, and encryption is done using the Data Encryption Standard (DES) protocol. The computation time and watermark-embedding rate are evaluated for the proposed algorithm.
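The joint watermark-and-encrypt idea can be sketched as a single pass over the host samples; LSB embedding and an XOR keystream are used here purely as placeholders for the paper's actual G.729-domain embedding and DES encryption:

```python
def embed_and_encrypt(samples, watermark_bits, keystream):
    """Joint-layer sketch: embed one watermark bit into the LSB of each
    host sample, then encrypt in the same pass (hence the reduced delay
    compared with running the two layers sequentially)."""
    out = []
    for i, s in enumerate(samples):
        wm = watermark_bits[i % len(watermark_bits)]
        marked = (s & ~1) | wm                     # LSB watermark embedding
        out.append(marked ^ keystream[i % len(keystream)])  # placeholder cipher
    return out

def decrypt_and_extract(cipher, keystream):
    """Reverse the cipher, then read the watermark back from the LSBs."""
    plain = [c ^ keystream[i % len(keystream)] for i, c in enumerate(cipher)]
    return plain, [s & 1 for s in plain]

samples = [100, 57, 200, 33]          # toy 8-bit host samples
wm_bits = [1, 0]                      # repeating watermark pattern
key = [0xA5, 0x3C]                    # toy keystream, NOT real DES
cipher = embed_and_encrypt(samples, wm_bits, key)
plain, extracted = decrypt_and_extract(cipher, key)
```

Fusing the two layers means each sample is touched once instead of twice, which is the source of the processing-delay reduction the abstract claims.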
