Similar literature
 20 similar documents found
1.
Semantic search attempts to go beyond the current state of the art in information access by addressing information needs on the semantic level, i.e. by considering the meaning of users’ queries and of the available resources. In recent years, there have been significant advances in developing and applying semantic technologies to the problem of semantic search. To collate these various approaches and to better understand what the concept of semantic search entails, we study semantic search under a general model. Extending this model, we introduce the notion of process-based semantic search, where semantics is exploited not only for query processing but may be involved in all steps of the search process. We propose a particular approach that instantiates this process-based model. Finally, the usefulness of using semantics throughout the search process is assessed via a task-based evaluation performed in a real-world scenario.

2.
In this paper, we present a novel framework for personalized retrieval of sports video, which includes two research tasks: semantic annotation and user preference acquisition. For semantic annotation, web-casting texts corresponding to sports videos are first captured from webpages using data region segmentation and labeling. Using this text, we detect events in the sports video and generate video event clips. These video clips are annotated with the semantics extracted from the web-casting texts and indexed in a sports video database. Based on the annotation, the clips can be retrieved by different semantic attributes according to the user's preference. For user preference acquisition, we use click-through data as feedback from the user. Relevance feedback is applied to the text annotation and visual features to infer the user's intention and points of interest. A user preference model is learned to re-rank the initial results. Experiments conducted on broadcast soccer and basketball videos show encouraging performance for the proposed method.

Yi-Fan Zhang   received the B.E. degree from Southeast University, Nanjing, China, in 2004. He is currently pursuing the Ph.D. degree at National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China. In 2007, he was an intern student in Institute for Infocomm Research, Singapore. Currently he is an intern student in China-Singapore Institute of Digital Media. His research interests include multimedia, video analysis and pattern recognition.
Changsheng Xu   (M’97–SM’99) received the Ph.D. degree from Tsinghua University, Beijing, China in 1996. Currently he is Professor of Institute of Automation, Chinese Academy of Sciences and Executive Director of China-Singapore Institute of Digital Media. He was with Institute for Infocomm Research, Singapore from 1998 to 2008. He was with the National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences from 1996 to 1998. His research interests include multimedia content analysis, indexing and retrieval, digital watermarking, computer vision and pattern recognition. He published over 150 papers in those areas. Dr. Xu is an Associate Editor of ACM/Springer Multimedia Systems Journal. He served as Short Paper Co-Chair of ACM Multimedia 2008, General Co-Chair of 2008 Pacific-Rim Conference on Multimedia (PCM2008) and 2007 Asia-Pacific Workshop on Visual Information Processing (VIP2007), Program Co-Chair of VIP2006, Industry Track Chair and Area Chair of 2007 International Conference on Multimedia Modeling (MMM2007). He also served as Technical Program Committee Member of major international multimedia conferences, including ACM Multimedia Conference, International Conference on Multimedia & Expo, Pacific-Rim Conference on Multimedia, and International Conference on Multimedia Modeling.
Xiaoyu Zhang   received the B.S. degree in computer science from Nanjing University of Science and Technology in 2005. He is a Ph.D. candidate of National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences. He is currently a student in China-Singapore Institute of Digital Media. His research interests include image retrieval, video analysis, and machine learning.
Hanqing Lu   (M’05–SM’06) received the Ph.D. degree in Huazhong University of Sciences and Technology, Wuhan, China in 1992. Currently he is Professor of Institute of Automation, Chinese Academy of Sciences. His research interests include image similarity measure, video analysis, object recognition and tracking. He published more than 100 papers in those areas.

3.
4.
5.
This paper reports on a study exploring how semantic relations can be used to expand a query for objects in an image. The study is part of a project whose overall objective is to provide semantic annotation and search facilities for a virtual collection of art resources. In this study we used semantic relations from WordNet for 15 image content queries. The results show that, next to the hyponym/hypernym relation, the meronym/holonym (part-of) relation is particularly useful in query expansion. We identified a number of relation patterns that improve recall without jeopardising precision.
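
The kind of relation-based query expansion described in this abstract can be sketched as follows. This is a minimal illustration only: the `RELATIONS` table, the image annotations, and the function names are hypothetical stand-ins, and the sketch does not query WordNet or reproduce the relation patterns identified in the study.

```python
# Hypothetical toy relation table standing in for WordNet lookups.
RELATIONS = {
    "tree": {"hyponym": ["oak", "pine"], "meronym": ["branch", "leaf"]},
    "face": {"hyponym": [], "meronym": ["eye", "nose"]},
}

# Hypothetical image annotations: image id -> set of annotated objects.
ANNOTATIONS = {
    "img1": {"tree", "meadow"},
    "img2": {"oak", "river"},
    "img3": {"leaf", "insect"},
    "img4": {"boat"},
}


def expand_query(term, relation_types=("hyponym", "meronym")):
    """Return the term followed by related terms for the chosen relation types."""
    related = RELATIONS.get(term, {})
    expanded = [term]
    for rel in relation_types:
        for word in related.get(rel, []):
            if word not in expanded:
                expanded.append(word)
    return expanded


def search(annotations, terms):
    """Return sorted ids of images whose annotations contain any of the terms."""
    wanted = set(terms)
    return sorted(img for img, words in annotations.items() if words & wanted)


if __name__ == "__main__":
    print(search(ANNOTATIONS, ["tree"]))
    print(search(ANNOTATIONS, expand_query("tree")))
```

In this toy collection the unexpanded query matches only the image annotated with "tree" itself, while the expanded query also reaches images annotated with a kind of tree or a part of a tree, which is the recall gain the abstract refers to.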

6.
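Research on Latent Semantic Indexing and Chinese Information Retrieval   Total citations: 4, self-citations: 0, citations by others: 4
1 Introduction. Typical traditional information retrieval systems, such as the Boolean logic model and the vector space model, return relevant results for the query conditions supplied by the user, based on keyword matching or on similarity coefficients in the vector space. The same concept can be expressed with different words, such as synonyms or near-synonyms, and the same word can carry different meanings in different contexts (polysemy); as a result, query methods based on term matching are unsatisfactory in both accuracy and completeness. Although using a synonym dictionary improves the recall of information retrieval to some extent, it lowers the precision of queries, and in practice the synonym database must be updated continually to keep up with the system's changing requirements.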

7.
HIRMA is an integrated environment for querying any full-text document base with natural language sentences and obtaining the set of documents relevant to the query. It also supports hypertextual navigation of the document base. The system uses content-based document representation and retrieval methods.

In this paper the representation framework and the retrieval and navigation algorithms used by HIRMA are described. Coverage and portability across application domains are supported by the lexical acquisition system ARIOSTO, which provides the lexical knowledge and processing methods needed to extract the semantic representation of document content from raw text.


8.
We seek to leverage an expert user's knowledge about how information is organized in a domain, and how information is presented in typical documents within a particular domain-specific collection, to meet the expert's targeted information needs effectively and efficiently. We have developed the semantic components model to describe important semantic content within documents. The semantic components model for a given collection (based on a general understanding of the type of information needs expected) consists of a set of document classes, where each class has an associated set of semantic components. Each semantic component instance consists of segments of text about a particular aspect of the main topic of the document and may not correspond to structural elements in the document. The semantic components model represents document content in a manner that is complementary to full text and keyword indexing. This paper describes how the semantic components model can be used to improve an information retrieval system. We present experimental evidence from a large interactive searching study that compared a system with full text and keyword indexing plus semantic components, in which the query language was extended to allow users to search using semantic components, to a base system without semantic components. We evaluate the systems from a system perspective, where semantic components were shown to improve document ranking for precision-oriented searches, and from a user perspective. We also evaluate the systems from a session-based perspective, considering not only the results of individual queries but also the results of multiple queries during a single interactive query session.

9.
In this paper, we study the problem of mining temporal semantic relations between entities. The goal is to mine and annotate a semantic relation with temporal, concise, and structured information that can reveal the explicit, implicit, and diverse semantic relations between entities. The temporal semantic annotations can help users learn and understand unfamiliar or newly emerged semantic relations between entities. The proposed temporal semantic annotation structure integrates the features from IEEE and Renlifang. We propose a general method to generate the temporal semantic annotation of a semantic relation between entities by constructing its connection entities, lexical syntactic patterns, context sentences, context graph, and context communities. Empirical experiments on two different datasets, a LinkedIn dataset and a movie star dataset, show that the proposed method is effective and accurate. Unlike manually generated annotation repositories such as Wikipedia and LinkedIn, the proposed method can automatically mine the semantic relation between entities and does not need any prior knowledge such as an ontology or a hierarchical knowledge base. The proposed method can be used in several applications, which demonstrates the usefulness of the proposed temporal semantic relations for many web mining tasks.

10.
Modern Web search engines still have many limitations: search terms are not disambiguated, search terms in one query cannot be in different languages, the retrieved media items have to be in the same language as the search terms, and search results are not integrated across a live stream of different media channels, including TV, online news and social media. The system described in this paper addresses all of these limitations by combining a media stream processing architecture with cross-lingual and cross-modal semantic annotation, search and recommendation. All of these components were developed in the xLiMe project.

11.
Recent work on searching the Semantic Web has yielded a wide range of approaches with respect to the underlying search mechanisms, results management and presentation, and style of input. Each approach affects the quality of the information retrieved and the user’s experience of the search process. However, despite the wealth of experience accumulated from evaluating Information Retrieval (IR) systems, the evaluation of Semantic Web search systems has largely been developed in isolation from mainstream IR evaluation, with a far less unified approach to the design of evaluation activities. This has led to slow progress and low interest when compared to other established evaluation series, such as TREC for IR or OAEI for Ontology Matching. In this paper, we review existing approaches to IR evaluation and analyse evaluation activities for Semantic Web search systems. Through a discussion of these, we identify their weaknesses and highlight the future need for a more comprehensive evaluation framework that addresses current limitations.

12.
From a user perspective, data and services provide complementary views of an information source: data provide detailed information about specific needs, while services execute processes that involve data and also return an informative result. For this reason, users need to perform aggregated searches to identify not only relevant data, but also services able to operate on them. At the current state of the art, such aggregated search can only be performed manually by expert users, who first identify relevant data and then identify existing relevant services.

13.
We investigate the possibility of using Semantic Web data to improve hypertext Web search. In particular, we use relevance feedback to create a ‘virtuous cycle’ between data gathered from the Semantic Web of Linked Data and web-pages gathered from the hypertext Web. Previous approaches have generally treated search over the Semantic Web and search over the hypertext Web as entirely disparate, indexing and searching over different domains. While relevance feedback has traditionally improved information retrieval performance, it is normally used to improve rankings over a single data-set. Our novel approach is to use relevance feedback from hypertext Web results to improve Semantic Web search, and results from the Semantic Web to improve the retrieval of hypertext Web data. In both cases, an evaluation is performed based on certain kinds of informational queries (abstract concepts, people, and places) selected from a real-life query log and checked by human judges. We evaluate our work over a wide range of algorithms and options, and show that it also improves baseline performance on these queries for deployed systems, such as the Semantic Web search engine FALCON-S and Yahoo! Web search. We further show that the use of Semantic Web inference seems to hurt performance, while pseudo-relevance feedback increases performance in both cases, although not as much as actual relevance feedback. Lastly, our evaluation is the first rigorous ‘Cranfield’ evaluation of Semantic Web search.
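
As a rough illustration of how relevance feedback can shift a ranking, the sketch below uses a Rocchio-style query update. This is an assumption made for illustration: the abstract does not say which feedback algorithm the authors used, and the term weights, documents, and the `alpha` and `beta` values are hypothetical.

```python
from collections import Counter


def rocchio(query, relevant_docs, alpha=1.0, beta=0.75):
    """Move a sparse query vector toward the centroid of documents judged relevant."""
    updated = Counter({term: alpha * weight for term, weight in query.items()})
    for doc in relevant_docs:
        for term, weight in doc.items():
            updated[term] += beta * weight / len(relevant_docs)
    return dict(updated)


def score(query, doc):
    """Dot product between two sparse term-weight dictionaries."""
    return sum(weight * doc.get(term, 0.0) for term, weight in query.items())


if __name__ == "__main__":
    # Hypothetical query and feedback: results judged relevant in one collection.
    query = {"jaguar": 1.0}
    judged_relevant = [
        {"jaguar": 1.0, "cat": 1.0},
        {"jaguar": 1.0, "cat": 1.0, "animal": 1.0},
    ]
    # Hypothetical candidates from the other collection, tied before feedback.
    car_doc = {"jaguar": 1.0, "car": 1.0}
    cat_doc = {"jaguar": 1.0, "cat": 1.0}

    updated = rocchio(query, judged_relevant)
    print(score(query, car_doc), score(query, cat_doc))
    print(score(updated, car_doc), score(updated, cat_doc))
```

Before feedback the two candidates are indistinguishable for the ambiguous query; after the update, the candidate that shares vocabulary with the judged-relevant results ranks higher. Feeding judgments from one collection into the ranking of another is the cross-collection idea the abstract describes, reduced here to a toy case.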

14.
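BTopicMiner: a domain-specific hot topic mining system for Chinese microblogs   Total citations: 1, self-citations: 0, citations by others: 1
李劲, 张华, 吴浩雄, 向军. 《计算机应用》 (Journal of Computer Applications), 2012, 32(8): 2346-2349
With the rapid growth of microblogging applications, automatically extracting the hot topics that interest users from massive volumes of microblog posts has become a challenging research problem. To this end, a hot topic extraction algorithm for Chinese microblogs based on an extended topic model is studied and proposed. To address the data sparsity inherent in microblog posts, the algorithm first uses text clustering to merge content-related posts into microblog documents. Based on the assumption that reply relationships between microblogs imply topical relatedness, the algorithm then extends the traditional latent Dirichlet allocation (LDA) topic model to model those reply relationships. Finally, mutual information (MI) is used to compute the topic vocabulary of the extracted topics for hot topic recommendation. To verify the effectiveness of the extended topic extraction model, a prototype system for domain-specific hot topic mining in Chinese microblogs, BTopicMiner, was implemented. Experimental results show that the extended topic model based on microblog reply relationships extracts hot topics from microblogs more accurately, and that the semantic similarity between the topic vocabulary computed automatically with the MI measure and manually selected hot words exceeds 75%.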
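
To make the mutual-information step concrete, the sketch below ranks candidate topic words by pointwise mutual information (PMI) with a topic label. It is an illustration under stated assumptions: the abstract does not give the exact MI formulation used in BTopicMiner, and the documents, topic labels, and function names here are hypothetical.

```python
import math


def pmi(word, topic, docs):
    """Pointwise mutual information (base 2) between a word and a topic label.

    docs is a list of (topic_label, set_of_words) pairs.
    """
    n = len(docs)
    n_word = sum(1 for _, words in docs if word in words)
    n_topic = sum(1 for label, _ in docs if label == topic)
    n_both = sum(1 for label, words in docs if label == topic and word in words)
    if n_both == 0:
        return float("-inf")  # the word never occurs with this topic
    return math.log2((n_both * n) / (n_word * n_topic))


def top_words(topic, docs, k=2):
    """Return the k words with the highest PMI for the topic, ties broken alphabetically."""
    vocab = set().union(*(words for _, words in docs))
    return sorted(vocab, key=lambda w: (-pmi(w, topic, docs), w))[:k]


if __name__ == "__main__":
    # Hypothetical clustered microblog documents with assigned topic labels.
    docs = [
        ("sports", {"match", "goal", "today"}),
        ("sports", {"goal", "team", "today"}),
        ("tech", {"phone", "today"}),
        ("tech", {"phone", "chip"}),
    ]
    print(pmi("goal", "sports", docs))
    print(pmi("today", "sports", docs))
    print(top_words("sports", docs))
```

A word such as "today" that appears across topics scores lower than a word concentrated in one topic, which is why an MI-style measure is a plausible way to pick representative topic vocabulary.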

15.
A large volume of research in temporal data mining focuses on discovering temporal rules from time-stamped data. Most of the methods proposed so far have been devoted to mining temporal rules that describe relationships between data sequences or instantaneous events, and do not consider the presence of complex temporal patterns in the dataset. Such complex patterns, for example trends or up-and-down behaviors, are often very interesting to users. In this paper we propose a new kind of temporal association rule and the related extraction algorithm; the learned rules involve complex temporal patterns in both their antecedent and consequent. Within our proposed approach, the user defines a set of complex patterns of interest that constitute the basis for the construction of the temporal rule; such complex patterns are represented and retrieved in the data through the formalism of knowledge-based Temporal Abstractions. An Apriori-like algorithm then looks for meaningful temporal relationships (in particular, precedence temporal relationships) among the complex patterns of interest. The paper presents the results obtained by the rule extraction algorithm on a simulated dataset and on two different datasets related to biomedical applications: the first concerns the analysis of time series from the monitoring of different clinical variables during hemodialysis sessions, while the second deals with the biological problem of inferring relationships between genes from DNA microarray data.

16.
Recently, the class imbalance problem has attracted much attention from researchers in the field of data mining. When learning from imbalanced data, in which most examples are labeled as one class and only a few belong to another class, traditional data mining approaches predict the crucial minority instances poorly. Unfortunately, many real-world data sets, such as those from health examination, inspection, credit fraud detection, spam identification and text mining, are all faced with this situation. In this study, we present a novel model called the “Information Granulation Based Data Mining Approach” to tackle this problem. The proposed methodology, which imitates the human ability to process information, acquires knowledge from Information Granules rather than from numerical data. This method also introduces a Latent Semantic Indexing based feature extraction tool that uses Singular Value Decomposition to dramatically reduce the data dimensions. In addition, several data sets from the UCI Machine Learning Repository are employed to demonstrate the effectiveness of our method. Experimental results show that our method can significantly increase the ability to classify imbalanced data.

17.
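A Survey of Semantic Search Research   Total citations: 2, self-citations: 0, citations by others: 2
Semantic search introduces Semantic Web technology into search engines to improve the results of current search engines, and has received wide attention in recent years. This article introduces the research foundations of the semantic search field, including the current state of research and commonly used research methods, and classifies and analyses semantic search in depth: semantic search can mainly be divided into enhanced semantic search built on traditional search and knowledge-based semantic search built on ontology reasoning. The article points out open problems in semantic search research, and summarizes and looks ahead to future work on semantic search.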

18.
Hepatocellular carcinoma (HCC) is the third leading cause of cancer-related mortality worldwide. New insights into the pathogenesis of this lethal disease are urgently needed. Chromosomal copy number alterations (CNAs) can lead to activation of oncogenes and inactivation of tumor suppressors in human cancers. Thus, identification of cancer-specific CNAs will not only provide new insight into the molecular basis of tumorigenesis but also facilitate the identification of HCC biomarkers using CNA.

19.
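Research on XML-based Web Data Mining   Total citations: 4, self-citations: 1, citations by others: 4
1. Introduction. Traditional data mining methods generally target structured data in databases or data warehouses, but in the real world most of the data people face is unstructured or semi-structured, for example Web pages. The volume of data on the Web can currently be measured in at least hundreds of terabytes and is still growing rapidly. On the one hand these data provide rich resources for data mining; on the other hand they pose severe challenges for data mining techniques. Compared with traditional data mining, the main difficulties in Web data mining lie in three aspects: first, Web pages lack a uniform structure, and every site on the Web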

20.
The number of vertical search engines and portals has increased rapidly in recent years, making the importance of a topic-driven (focused) crawler self-evident. In this paper, we develop a latent semantic indexing classifier that combines link analysis with text content in order to retrieve and index domain-specific web documents. Our implementation presents a different approach to focused crawling and aims to overcome the limitations imposed by the need to provide initial data for training, while maintaining a high recall/precision ratio. We compare its efficiency with other well-known web information retrieval techniques.
