Similar literature
20 similar documents found.
1.
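Scene graph generation is an active research direction in computer vision that links upstream and downstream visual tasks. A scene graph consists of triples of the form <subject-predicate-object>, and a model must encode the global visual information of the whole image to support scene understanding. However, current models still struggle with special visual relations such as one-to-many, many-to-one, and symmetric relations. Based on the similarity between knowledge graphs and scene graphs, we transfer translation embedding models from knowledge graphs to scene graph generation. To better encode such visual relations, this paper proposes a scene graph generation framework based on multimodal feature translation embedding. It remaps the extracted multimodal features, such as visual and linguistic features, and then uses the remapped features to predict the predicate category, building better relation representations without noticeably increasing model complexity. The framework covers and supplements the scene graph implementations of almost all existing translation embedding models: it applies four translation embedding models (TransE, TransH, TransR, TransD) to scene graph generation and describes in detail which model suits which type of visual relation. The framework also extends the traditional mode of application. Besides standalone models, it introduces a new mode in which the framework is inserted into other network models as a plug-and-play submodule. Experiments on the large-scale Visual Genome dataset for semantic understanding verify the effectiveness of the proposed framework; at the same time, the richer category prediction results obtained...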
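
As a rough illustration of the translation embedding idea named in this abstract, the sketch below scores a <subject-predicate-object> triple in the TransE style, where the predicate acts as a translation from subject to object (s + p ≈ o). The embeddings, the predicate names, and the `predict_predicate` helper are hypothetical and chosen by hand; the paper's multimodal feature remapping is not reproduced here.

```python
import math

def transe_score(subject, predicate, obj):
    """TransE-style distance ||s + p - o||_2; smaller means a better fit."""
    return math.sqrt(sum((s + p - o) ** 2 for s, p, o in zip(subject, predicate, obj)))

def predict_predicate(subject, obj, predicate_embeddings):
    """Pick the predicate whose translation best maps the subject onto the object."""
    return min(
        predicate_embeddings,
        key=lambda name: transe_score(subject, predicate_embeddings[name], obj),
    )

# Hypothetical 2-D embeddings, picked by hand for the example (not learned).
predicates = {"on": [0.0, 1.0], "next to": [1.0, 0.0]}
cup = [0.2, 1.1]
table = [0.2, 2.0]
best = predict_predicate(cup, table, predicates)
```

TransH, TransR, and TransD differ mainly in projecting the subject and object before applying the translation, which is what lets them represent one-to-many and many-to-one relations that plain TransE handles poorly.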

2.
This paper proposes a method for event recognition in photo albums, which aims at predicting the event categories of groups of photos. We propose a probabilistic graphical model (PGM) for event prediction based on high-level visual features consisting of objects and scenes, which are extracted directly from images. For better discrimination between different event categories, we develop a scheme to integrate feature relevance into our model, which yields more powerful inference when album images exhibit a large number of objects and scenes. It also mitigates the influence of the non-informative images usually contained in albums. The performance of the proposed method is validated through extensive experiments on the recently proposed PEC dataset, which contains over 61,000 images. Our method achieves the highest accuracy, outperforming previous work.

3.
The main challenge in wireless sensor network deployment pertains to optimizing energy consumption when collecting data from sensor nodes. This paper proposes a new centralized clustering method for a data collection mechanism in wireless sensor networks, which is based on network energy maps and Quality-of-Service (QoS) requirements. The clustering problem is modeled as a hypergraph partitioning problem, and its resolution is based on a tabu search heuristic. Our approach defines moves using largest-size cliques in a feasibility cluster graph. Compared to other methods (a CPLEX-based method, a distributed method, and a simulated annealing-based method), the results show that our tabu search-based approach returns high-quality solutions in terms of cluster cost and execution time. As a result, this approach is suitable for handling network extensibility in a satisfactory manner.

4.
This paper focuses on the task of human-object interaction (HOI) recognition, which aims to classify the interaction between humans and objects. It is a challenging task, partly because the data are extremely imbalanced among classes. To solve this problem, we propose a language-guided graph parsing attention network (LG-GPAN) that makes use of the word distribution in language to guide classification in vision. We first associate each HOI class name with a word embedding vector, and all the vectors together construct a language space specific to HOI recognition. Simultaneously, the visual feature is extracted from the inputs via the proposed graph parsing attention network (GPAN) for better visual representation. The visual feature is then transformed into a linguistic one in the language space. Finally, the output score is obtained by measuring the distance between the linguistic feature and the word embeddings of the classes in the language space. Experimental results on the popular CAD-120 and V-COCO datasets validate our design choices and demonstrate superior performance in comparison to the state of the art.

5.
The number of people collecting photos has surged in recent years owing to social media and cloud services. A typical approach to summarizing a photo collection is to divide it into events and select key photos from each event. Although an event usually comprises several sub-events, few studies have proposed sub-event segmentation. We propose the sentiment analysis-based photo summarization (SAPS) method, which automatically summarizes personal photo collections by utilizing metadata and visual sentiment features. For this purpose, we first cluster events using the metadata of photos and then calculate novelty scores to determine the sub-event boundaries. Next, we summarize the photo collections using a ranking algorithm that measures sentiment, emotion, and aesthetics. We evaluate the proposed method by applying it to the photo collections of six participants, consisting of 5,480 photos in total. We observe that our sub-event segmentation based on sentiment features outperforms the existing baseline methods. Furthermore, the proposed method is also more effective in finding sub-event boundaries and key photos, because it focuses on detailed sentiment features instead of general content features.

6.
Exploring context information for visual recognition has recently received significant research attention. This paper proposes a novel and highly efficient approach, named semantic diffusion, that utilizes semantic context for large-scale image and video annotation. Starting from the initial annotation of a large number of semantic concepts (categories), obtained by either machine learning or manual tagging, the proposed approach refines the results using a graph diffusion technique, which recovers the consistency and smoothness of the annotations over a semantic graph. Unlike existing graph-based learning methods that model relations among data samples, the semantic graph captures context by treating the concepts as nodes and the concept affinities as the weights of edges. In particular, our approach is capable of simultaneously improving annotation accuracy and adapting the concept affinities to new test data. The adaptation provides a means to handle domain change between training and test data, which often occurs in practice. Extensive experiments are conducted to improve concept annotation results using Flickr images and TV program videos. Results show a consistent and significant performance gain (10+% on both image and video data sets). Source code for the proposed algorithms is available online.
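
To make the idea of smoothing annotations over a concept graph more concrete, here is a minimal sketch in which each concept's score is repeatedly blended with the affinity-weighted mean of its neighbors' scores. This is a generic neighbor-averaging scheme written for illustration, not the update rule used in the paper; the `diffuse` function, the `alpha` and `steps` parameters, and the concept names are all assumptions.

```python
def diffuse(scores, affinity, alpha=0.5, steps=10):
    """Repeatedly blend each concept's score with its neighbors' weighted mean."""
    current = dict(scores)
    for _ in range(steps):
        updated = {}
        for concept, value in current.items():
            neighbors = affinity.get(concept, {})
            weight_sum = sum(neighbors.values())
            if weight_sum == 0:
                updated[concept] = value  # isolated concept: leave unchanged
                continue
            neighbor_mean = sum(
                weight * current[other] for other, weight in neighbors.items()
            ) / weight_sum
            updated[concept] = (1 - alpha) * value + alpha * neighbor_mean
        current = updated
    return current

# Hypothetical initial annotation scores for one image and concept affinities.
scores = {"beach": 0.9, "sea": 0.2, "office": 0.1}
affinity = {"beach": {"sea": 1.0}, "sea": {"beach": 1.0}, "office": {}}
smoothed = diffuse(scores, affinity)
```

In this toy case the weak "sea" score is pulled up by the strongly related "beach" concept, while the unrelated "office" score is left alone.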

7.
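Optimizing SPARQL by combining semantic reduction and selectivity estimation. Cited by: 1 (self-citations: 0, citations by others: 1)
Ye Yuxin, Ouyang Dantong. 《电子学报》 (Acta Electronica Sinica), 2010, 38(5): 1205-1210
Building on a definition of the SPARQL query optimization problem, this paper proposes a semantic reduction optimization scheme that exploits the semantic relations between concepts in an ontology. By combining it with a selectivity estimation strategy, the paper presents the RS-Opti optimization algorithm and its implementation. Tests show that RS-Opti outperforms both semantic reduction and selectivity estimation used on their own. Comparative tests against other query engines show that the scheme is most effective when a query contains many tuple patterns and has complex semantics.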

8.
In the proposed photo certificate, the principal component is an image, for example, the user's photo. User-related fields that are meaningful to users, such as the subject's name, the issuer's name, and the expiration period, are embedded into the surface of the photo using a visible watermark algorithm, so that a reader can see this information without special software. The remaining fields of the certificate are embedded into the marked photo. The whole photo certificate is then cryptographically signed with the certification authority (CA) private key to guarantee its integrity. With this arrangement, certificate verification is divided into two layers: the first is oriented to the human visual system and the second is software oriented. The user can first determine whether the user's photo and the subject's name are consistent and check whether the expiration period is valid. The second layer's verification is launched only when the first layer's verification has passed. To sum up, the proposed photo certificate not only inherits the functions of a traditional certificate, but also provides a friendlier operating environment for X.509 certificates.

9.
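To better analyze how taxis operate across urban areas, a graph-based visual analysis method for urban traffic flow is proposed. The road network is clustered so that connected road segments are partitioned into regions, and the urban road network is represented in node-link form. Using regional traffic volume as the weight, the importance of each region is analyzed with the concept of graph centrality. Taking the area within Beijing's Fourth Ring Road as an example, more than 100 million taxi GPS records are visually analyzed. The experimental results show that the method intuitively and effectively reveals how taxi flow in different regions changes over time and how important the different regions are.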
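
The combination of flow weights and graph centrality can be sketched as follows. The abstract does not say which centrality measure is used, so this example assumes weighted degree centrality (the total flow on edges touching a region, normalized by total flow); the region names and trip counts are invented.

```python
from collections import defaultdict

def weighted_degree_centrality(flows):
    """Sum the flow on every edge touching a region, normalized by total flow."""
    strength = defaultdict(float)
    for (region_a, region_b), volume in flows.items():
        strength[region_a] += volume
        strength[region_b] += volume
    total = sum(flows.values())
    return {region: value / total for region, value in strength.items()}

# Hypothetical taxi trip counts between three clustered regions.
flows = {("A", "B"): 120, ("B", "C"): 60, ("A", "C"): 20}
centrality = weighted_degree_centrality(flows)
ranking = sorted(centrality, key=centrality.get, reverse=True)
```

Recomputing the same measure per time window would give one way to follow how a region's importance changes over the day.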

10.
In this paper, we concentrate on a challenging problem, weakly supervised image parsing, in which only weak image-level labels are available in the dataset. Traditionally, an affinity graph of superpixels is constructed to strengthen the weak information by leveraging neighbors from the perspective of image-level labels. Existing work constructs the affinity graph purely from visual relevance, so context homogenization is a common phenomenon that hinders the performance of label prediction. To overcome the context homogenization problem, we consider not only the visual and semantic relevance but also the semantic distinction between every target superpixel and its neighboring superpixels when constructing the affinity graph. We propose a novel way of constructing the inter-image contextual graph and design a label propagation framework that jointly combines visual relevance, semantic relevance, and discriminative ability. Extensive experiments on real-world datasets demonstrate that our approach obtains significant gains.

11.
B 《电子学报:英文版》 (Chinese Journal of Electronics), 2021, 30(2): 258-267
Frequent subgraph mining (FSM) is a subset of the graph mining domain that is extensively used for graph classification and clustering. Over the past decade, many efficient FSM algorithms have been developed, with improvements generally focused on reducing the time complexity by changing the algorithm structure or using parallel programming techniques. FSM algorithms also consume a large amount of memory, which is another problem that should be solved. In this paper, we propose a new approach called Predictive dynamic sized structure packing (PDSSP) to minimize the memory needs of FSM algorithms. Our approach redesigns the internal data structures of FSM algorithms without making algorithmic modifications. PDSSP offers two contributions. The first is the Dynamic Sized Integer Type, a newly designed unsigned integer data type, and the second is a data structure packing technique to change the behavior of the compiler. We examined the effectiveness and efficiency of the PDSSP approach by experimentally embedding it into two state-of-the-art algorithms, gSpan and Gaston, and compared our implementations to the performance of the originals. Nearly all results show that our proposed implementation consumes less memory at each support level, suggesting that PDSSP extensions could save memory, with peak memory usage decreasing by up to 38% depending on the dataset.

12.

The Parallel Coordinates Plot (PCP) is a popular technique for exploring high-dimensional data, and researchers often apply it as an effective method to analyze and mine data. However, as data volumes grow, visual clutter and data clarity become two of the main challenges in parallel coordinates plots. Although the Arc Coordinates Plot (ACP) is a popular approach to addressing these challenges, few optimizations and improvements have been made to it. In this paper, we make three main contributions to state-of-the-art PCP methods. One improves the visual method itself. The other two mainly improve perceptual scalability when the scale or the dimensionality of the data becomes large in some mobile and wireless practical applications. 1) We present an improved visualization method based on ACP, termed the double arc coordinates plot (DACP). It not only reduces the visual clutter in ACP, but also uses a dimension-based bundling method with further optimization to deal with the issues of the conventional PCP. 2) To reduce the clutter caused by the order of the axes and reveal patterns that are hidden in the data sets, we propose our first dimensional reordering method, a contribution-based method in DACP, which is based on the singular value decomposition (SVD) algorithm. The approach computes the importance score of the attributes (dimensions) of the data using SVD and visualizes the dimensions from left to right in DACP according to that score. 3) Moreover, a similarity-based method, which is based on the combination of a nonlinear correlation coefficient and the SVD algorithm, is also proposed. To measure the correlation between two dimensions and explain how the two dimensions interact with each other, we propose a reordering method based on nonlinear correlation information measurements. We mainly use mutual information to calculate the partial similarity of dimensions in high-dimensional data visualization, and SVD is used to measure global data. Lastly, we use five case scenarios to evaluate the effectiveness of DACP, and the results show that our approaches not only do well in visualizing multivariate datasets, but also effectively alleviate the visual clutter in the conventional PCP, which brings users a better visual experience.
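
The mutual information measure mentioned in this abstract can be sketched for two data dimensions as below. The example assumes the values have already been discretized into bins and uses the standard plug-in estimate from observed frequencies; how the paper bins the data and combines the result with SVD is not stated, so those steps are left out. The sample columns are invented.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in count_xy.items():
        p_joint = count / n
        p_independent = (count_x[x] / n) * (count_y[y] / n)
        mi += p_joint * math.log2(p_joint / p_independent)
    return mi

# Hypothetical binned values of three dimensions over four records.
dim_a = [0, 0, 1, 1]
dim_b = [0, 0, 1, 1]   # moves together with dim_a
dim_c = [0, 1, 0, 1]   # unrelated to dim_a
similar = mutual_information(dim_a, dim_b)
unrelated = mutual_information(dim_a, dim_c)
```

A reordering method could place axes with high pairwise mutual information next to each other, since unlike a linear correlation coefficient this measure also picks up nonlinear dependence.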

13.
Current image search systems use a paged image list to show search results. However, problems such as query ambiguity make it hard for users to find their search targets in such a list. In this work, we propose an image search result grouping system that summarizes image search results in semantic and visual groups. We use MapReduce-based image graph construction and image clustering methods to deal with the scalability problem of this system. By precomputing image graphs and image clusters offline, the system can respond to user queries efficiently. Experiments on two large-scale Flickr image datasets are conducted for our system. Compared with using a single machine, our graph construction method is 69 times faster. We conduct a comprehensive user study to compare our approach with state-of-the-art baseline methods. We find that our approach generates competent image groups with a speedup of 2 to 100 times.

14.
Photos are becoming spontaneous, objective, and universal sources of information. This paper explores evolving situation recognition using photo streams coming from disparate sources, combined with advances in deep learning. Using visual concepts in photos together with space and time information, we formulate situation detection as a semi-supervised learning problem and propose new graph-based models to solve it. To extend the method to unknown situations, we introduce a soft label method that enables the traditional semi-supervised learning framework to accurately predict predefined labels as well as effectively form new clusters. To cope with noisy data, which degrades graph quality and leads to poor recognition results, we take advantage of two kinds of noise-robust norms that can eliminate the adverse effects of outliers in visual concepts and improve the accuracy of situation recognition. Finally, we demonstrate the idea and the effectiveness of the proposed models on Yahoo Flickr Creative Commons 100 Million.

15.
In this paper, we present an interleaver design method for SCCC. Our design criterion is to minimize the message-round probability in the SCCC graph, a criterion that is especially well suited to SCCC. The message-round probability characterizes the message flow in the SCCC graph. By minimizing it, we can obtain a large interleaving gain. The simulation results confirm our approach.

16.
Bistable perception arises when an ambiguous stimulus under continuous view is perceived as an alternation of two mutually exclusive states. Such a stimulus provides a unique opportunity for understanding the neural basis of visual perception because it dissociates the perception from the visual input. In this paper, we focus on extracting percept-related features from the local field potential (LFP) in monkey visual cortex for decoding its bistable structure-from-motion (SFM) perception. Our proposed feature extraction approach consists of two stages. First, because the perception is dissociated from the stimulus in our experimental paradigm, we estimate and remove from each LFP trial the non-percept-related stimulus-evoked activity via a local regression method called locally weighted scatterplot smoothing. Second, we use the common spatial patterns approach to design spatial filters based on the residual signals of multiple channels to extract the percept-related features. We apply a support vector machine (SVM) classifier to the extracted features to decode the reported perception on a single-trial basis. We apply the proposed approach to multichannel intracortical LFP data collected from the middle temporal (MT) visual cortex of a macaque monkey performing an SFM task. We demonstrate that our approach is effective in extracting the discriminative features of the percept-related activity from LFP and achieves excellent decoding performance. We also find that enhanced gamma band synchronization and reduced alpha and beta band desynchronization may be the underpinnings of the percept-related activity.

17.
Visual saliency is an effective tool for perceptual image processing. In the past decades, many saliency models have been proposed that primarily consider visual cues such as local contrast and global rarity. However, such explicit cues derived only from input stimuli are often insufficient to separate targets from distractors, leading to noisy saliency maps. In fact, latent cues, especially the latent signal correlations that link visually distinct stimuli (e.g., various parts of a salient target), may also play an important role in saliency estimation. In this paper, we propose a graph-based approach for image saliency estimation that incorporates both explicit visual cues and latent signal correlations. In our approach, the latent correlations between various image patches are first derived according to the statistical prior obtained from 10 million reference images. After that, the informativeness of image patches and their latent correlations are jointly considered to construct a directed graph, on which a random walk process is performed to generate saliency maps that pop out only the most salient locations. Experimental results show that our approach achieves impressive performance on three public image benchmarks.

18.
In this paper, we address the design of ID caching technology (IDCT) for graph databases, with the aim of accelerating queries on graph database data and avoiding redundant query operations that consume substantial computing resources. Traditional graph database caching technology (GDCT) needs a large amount of memory to store data and suffers from serious data consistency problems and low cache utilization. To address these issues, we propose a new technology that focuses on the ID allocation mechanism and high-speed ID queries on graph databases. Specifically, the IDs of query results are cached in memory, and data consistency is achieved through real-time synchronization and cache memory adaptation. In addition, we set up complex queries and simple queries to satisfy all query requirements and design a cache replacement mechanism based on query action time, query count, and memory capacity, which further improves performance. Extensive experiments show the superiority of our techniques over the traditional query approach of graph databases.
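
A minimal sketch of the caching idea follows: the cache stores only the IDs returned by a query and, when full, evicts an entry chosen from its query count and last access time. The abstract names these replacement factors but not how they are combined, so the eviction rule here (fewest hits first, ties broken by least recent use) is an assumption, as are the `IdCache` class, its methods, and the sample queries. Real-time synchronization with the database is not modeled.

```python
import itertools

class IdCache:
    """Caches the IDs returned by a query rather than the full records."""

    def __init__(self, capacity):
        self.capacity = capacity         # maximum number of cached queries
        self._entries = {}               # query -> [ids, hit_count, last_used]
        self._clock = itertools.count()  # logical time, avoids wall-clock ties

    def get(self, query):
        entry = self._entries.get(query)
        if entry is None:
            return None                  # cache miss: caller queries the database
        entry[1] += 1
        entry[2] = next(self._clock)
        return entry[0]

    def put(self, query, ids):
        if query not in self._entries and len(self._entries) >= self.capacity:
            # Evict the least-queried entry; break ties by least recent use.
            victim = min(
                self._entries,
                key=lambda q: (self._entries[q][1], self._entries[q][2]),
            )
            del self._entries[victim]
        self._entries[query] = [list(ids), 1, next(self._clock)]

cache = IdCache(capacity=2)
cache.put("friends of alice", [3, 7])
cache.put("friends of bob", [5])
cache.get("friends of alice")        # second use of this query
cache.put("friends of carol", [9])   # cache is full: evicts "friends of bob"
```

Storing IDs instead of full records keeps each entry small, which is the memory saving the abstract attributes to IDCT; the records themselves would still be fetched by ID on a hit.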

19.
A multiple-hypothesis approach for multiobject visual tracking. Cited by: 3 (self-citations: 0, citations by others: 3)
In multiple-object tracking applications, it is essential to address the problem of associating targets and observation data. For visual tracking of multiple targets that involves objects that split and merge, a target may be associated with multiple measurements, and many targets may be associated with a single measurement. The space of such data associations is exponential in the number of targets, and exhaustive enumeration is impractical. We pose the association problem as a bipartite graph edge covering problem, given the targets and the object detection information. We propose an efficient method of maintaining multiple association hypotheses with the highest probabilities over all possible histories of associations. Our approach handles objects entering and exiting the field of view, merging and splitting objects, as well as objects that are detected as fragmented parts. Experimental results are given for tracking multiple players in a soccer game and for tracking people with complex interactions in a surveillance setting. Quantitative evaluation shows that our method tracks through varying degrees of interaction among the targets with a high success rate.

20.
Wireless mesh networks (WMNs) have emerged as a significant technology for applications because their multi-radio, multi-channel capability lets them perform better than wireless LANs. Furthermore, quality-of-service (QoS) support can be achieved in several distinct ways in a WMN. In this paper, QoS requirements are recorded in a traffic profile, and QoS constraints are formulated as the delay time of transmitting all the requested data flows in the network. Multi-commodity flow techniques are applied to handle this issue. After minimizing the network delay with the help of multi-commodity flow techniques and a resource contention graph, we use an effective channel assignment algorithm to schedule the data flows under the QoS constraints. Our evaluation indicates that our techniques successfully route flows with different priorities under their specific QoS requirements.
