Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Recent work on searching the Semantic Web has yielded a wide range of approaches with respect to the underlying search mechanisms, results management and presentation, and style of input. Each approach impacts upon the quality of the information retrieved and the user’s experience of the search process. However, despite the wealth of experience accumulated from evaluating Information Retrieval (IR) systems, the evaluation of Semantic Web search systems has largely been developed in isolation from mainstream IR evaluation with a far less unified approach to the design of evaluation activities. This has led to slow progress and low interest when compared to other established evaluation series, such as TREC for IR or OAEI for Ontology Matching. In this paper, we review existing approaches to IR evaluation and analyse evaluation activities for Semantic Web search systems. Through a discussion of these, we identify their weaknesses and highlight the future need for a more comprehensive evaluation framework that addresses current limitations.

2.
Current web IR systems retrieve relevant information based only on keywords, which is inadequate for the vast amount of data available. They provide limited capabilities to capture the concepts behind the user's needs and the relations between the keywords. These limitations lead to the idea of user conceptual search, which includes concepts and meanings. This study deals with a Semantic Based Information Retrieval System for semantic web search and presents an improved algorithm to retrieve information more efficiently. The architecture takes as input a list of plain keywords provided by the user, and the query is converted into a semantic query. This conversion is carried out with the help of the domain concepts of pre-existing domain ontologies and a third-party thesaurus, and semantic relationships between them are discovered at runtime. The relevant information for the semantic query is retrieved and ranked according to relevancy with the help of the improved algorithm. The performance analysis shows that the proposed system can improve the accuracy and effectiveness of retrieving relevant web documents compared to existing systems.
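The keyword-to-semantic-query step in this abstract can be sketched as follows. This is a minimal illustration only: the toy `THESAURUS` and `ONTOLOGY` dictionaries, the function name `expand_query`, and the output shape are all hypothetical and are not taken from the paper, which does not publish its algorithm in the abstract.

```python
# Hypothetical toy data (assumption): a thesaurus mapping plain keywords to
# ontology concepts, and an ontology listing relations between concepts.
THESAURUS = {"car": "Automobile", "auto": "Automobile", "engine": "Engine"}
ONTOLOGY = {
    "Automobile": {"hasPart": ["Engine", "Wheel"]},
    "Engine": {"partOf": ["Automobile"]},
}


def expand_query(keywords):
    """Turn plain keywords into concepts plus the relations that link them."""
    concepts = []
    for kw in keywords:
        concept = THESAURUS.get(kw.lower())
        # Unknown keywords are skipped; duplicates are kept only once.
        if concept and concept not in concepts:
            concepts.append(concept)

    relations = []
    for concept in concepts:
        for rel, targets in ONTOLOGY.get(concept, {}).items():
            for target in targets:
                # Keep only relations whose target is also in the query.
                if target in concepts:
                    relations.append((concept, rel, target))
    return {"concepts": concepts, "relations": relations}
```

Calling `expand_query(["car", "engine"])` yields the two concepts and the `hasPart`/`partOf` relations between them, which a ranking step could then use in place of raw keyword matches.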

3.
Visual interfaces are potentially powerful tools for users to explore a representation of a collection and opportunistically discover information that will guide them toward relevant documents. Semantic fisheye views (SFEVs) are focus + context visualization techniques that manage visual complexity by selectively emphasizing and increasing the detail of information related to the user's focus and deemphasizing or filtering less important information. In this paper we describe a prototype for visualizing an annotated image collection and an experiment to compare the effectiveness of two distinctly different SFEVs for a complex opportunistic search task. The first SFEV calculates relevance based on keyword-content similarity and the second based on conceptual relationships between images derived using WordNet. The results of the experiment suggest that semantic-guided search is significantly more effective than similarity-guided search for discovering and using domain knowledge in a collection.

4.
In this paper, we present the results of a project that seeks to transform low-level features to a higher level of meaning. This project concerns a technique, latent semantic indexing (LSI), in conjunction with normalization and term weighting, which have been used for full-text retrieval for many years. In this environment, LSI determines clusters of co-occurring keywords, sometimes called concepts, so that a query which uses a particular keyword can then retrieve documents perhaps not containing this keyword, but containing other keywords from the same cluster. In this paper, we examine the use of this technique for content-based image retrieval, using two different approaches to image feature representation. We also study the integration of visual features and textual keywords, and the results show that it can help improve retrieval performance significantly.

5.
We investigate the possibility of using Semantic Web data to improve hypertext Web search. In particular, we use relevance feedback to create a ‘virtuous cycle’ between data gathered from the Semantic Web of Linked Data and web-pages gathered from the hypertext Web. Previous approaches have generally considered searching over the Semantic Web and the hypertext Web to be entirely disparate, indexing and searching over different domains. While relevance feedback has traditionally improved information retrieval performance, it is normally used to improve rankings over a single data-set. Our novel approach is to use relevance feedback from hypertext Web results to improve Semantic Web search, and results from the Semantic Web to improve the retrieval of hypertext Web data. In both cases, an evaluation is performed based on certain kinds of informational queries (abstract concepts, people, and places) selected from a real-life query log and checked by human judges. We evaluate our work over a wide range of algorithms and options, and show it improves baseline performance on these queries for deployed systems as well, such as the Semantic Web Search engine FALCON-S and Yahoo! Web search. We further show that the use of Semantic Web inference seems to hurt performance, while pseudo-relevance feedback increases performance in both cases, although not as much as actual relevance feedback. Lastly, our evaluation is the first rigorous ‘Cranfield’ evaluation of Semantic Web search.

6.
7.
User profiles play an important role in information retrieval systems. In this paper, we propose a novel method for the acquisition of ontology-based user profiles. With this method, the ontology-based user profiles can maintain representations of personal interests. In addition, user ontologies can be constructed automatically. The method makes user profiles more expressive and reduces the need for manual intervention.

8.
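BTopicMiner: a domain-specific hot-topic mining system for Chinese microblogs   Total citations: 1 (self-citations: 0, citations by others: 1)
Li Jin (李劲), Zhang Hua (张华), Wu Haoxiong (吴浩雄), Xiang Jun (向军). 《计算机应用》 (Journal of Computer Applications), 2012, 32(8): 2346-2349
With the rapid growth of microblogging, automatically extracting the hot topics that interest users from massive volumes of microblog posts has become a challenging research problem. To address it, this work studies and proposes a hot-topic extraction algorithm for Chinese microblogs based on an extended topic model. To overcome the data sparsity inherent in microblog posts, the algorithm first uses text clustering to merge content-related posts into microblog documents. Based on the assumption that reply (follow-up post) relationships between microblog posts imply topical relatedness, the algorithm then extends the traditional latent Dirichlet allocation (LDA) topic model to model those reply relationships. Finally, mutual information (MI) is used to compute the topic words of the extracted topics for hot-topic recommendation. To validate the extended topic extraction model, a prototype system for domain-specific hot-topic mining of Chinese microblogs, BTopicMiner, was implemented. Experimental results show that the extended topic model based on microblog reply relationships extracts hot topics from microblogs more accurately, and that the semantic similarity between the topic words computed automatically with the MI measure and manually selected hot words exceeds 75%.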
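The final step of this abstract, scoring topic words with mutual information, might look roughly like the sketch below. It assumes pointwise mutual information computed from document frequencies, which is one common reading of "MI"; the abstract does not give the exact formula, and the function name `top_topic_words` and the input layout are hypothetical.

```python
import math
from collections import Counter


def top_topic_words(docs, topic, k=2):
    """Rank words by pointwise mutual information with `topic`.

    docs: list of (topic_label, set_of_words) pairs (assumed input layout).
    PMI(w, t) = log2( p(w, t) / (p(w) * p(t)) ), estimated from document counts.
    """
    n_total = len(docs)
    n_topic = sum(1 for label, _ in docs if label == topic)
    word_df = Counter()   # number of documents containing each word
    joint_df = Counter()  # number of `topic` documents containing each word
    for label, words in docs:
        for w in set(words):
            word_df[w] += 1
            if label == topic:
                joint_df[w] += 1

    scores = {
        w: math.log2(joint_df[w] * n_total / (word_df[w] * n_topic))
        for w in joint_df
    }
    # Highest PMI first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

Note that raw PMI favours rare words, so a practical system would probably add a minimum-frequency cutoff before ranking.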

9.
10.
The relational database model is widely used in real applications. We propose a way of complementing such a database with an XML data warehouse. The approach we propose is generic, and driven by a domain ontology. The XML data warehouse is built from data extracted from the Web, which are semantically tagged using terms belonging to the domain ontology. The semantic tagging is fuzzy, since, instead of tagging the values of the Web document with one value of the domain ontology, we propose to use tags expressed in terms of a possibility distribution representing a set of possible terms, each term being weighted by a possibility degree. The querying of the XML data warehouse is also fuzzy: the end-users can express their preferences by means of fuzzy selection criteria. We present our approach on a first application domain: predictive microbiology.
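To make the fuzzy tagging and fuzzy querying concrete, here is a small sketch using the standard possibility measure from possibility theory (the maximum over terms of the minimum of the two degrees). Whether the paper uses exactly this measure is an assumption; the function name and the example terms are hypothetical.

```python
def possibility_of_match(tag, preference):
    """Possibility that a fuzzily tagged value satisfies a fuzzy criterion.

    tag:        dict term -> possibility degree in [0, 1] (the fuzzy tag)
    preference: dict term -> preference degree in [0, 1] (the user's criterion)
    Returns max over terms of min(tag degree, preference degree).
    """
    terms = set(tag) | set(preference)
    return max(
        (min(tag.get(t, 0.0), preference.get(t, 0.0)) for t in terms),
        default=0.0,
    )


# Hypothetical example: a Web value tagged as probably "Listeria",
# possibly "Salmonella", queried by a user who mainly wants "Salmonella".
tag = {"Listeria": 1.0, "Salmonella": 0.4}
preference = {"Salmonella": 1.0, "E. coli": 0.7}
score = possibility_of_match(tag, preference)  # 0.4
```

Results could then be ordered by this score so that values tagged with more plausible matching terms appear first.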

11.
A novel approach to clustering for image segmentation and a new object-based image retrieval method are proposed. The clustering is achieved using the Fisher discriminant as an objective function. The objective function is improved by adding a spatial constraint that encourages neighboring pixels to take on the same class label. A six-dimensional feature vector is used for clustering by way of the combination of color and busyness features for each pixel. After clustering, the dominant segments in each class are chosen based on area and used to extract features for image retrieval. The color content is represented using a histogram, and Haar wavelets are used to represent the texture feature of each segment. The image retrieval is segment-based; the user can select a query segment to perform the retrieval and assign weights to the image features. The distance between two images is calculated using the distance between features of the constituent segments. Each image is ranked based on this distance with respect to the query image segment. The algorithm is applied to a pilot database of natural images and is shown to improve upon the conventional classification and retrieval methods. The proposed segmentation leads to a higher number of relevant images retrieved, 83.5% on average compared to 72.8% and 68.7% for the k-means clustering and the global retrieval methods, respectively.
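The segment-based ranking described here can be sketched as below. Several details are assumptions, since the abstract does not state them: the L1 distance between feature vectors, taking an image's distance as that of its closest segment, and the names `segment_distance` and `rank_images`.

```python
def l1(a, b):
    """L1 distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def segment_distance(query, seg, w_color=0.5, w_texture=0.5):
    """User-weighted distance between a query segment and a candidate segment."""
    return (w_color * l1(query["color"], seg["color"])
            + w_texture * l1(query["texture"], seg["texture"]))


def rank_images(query, images, w_color=0.5, w_texture=0.5):
    """Rank image names by the distance of their closest segment to the query.

    images: dict name -> list of segments, each with "color" (histogram)
    and "texture" (e.g. wavelet coefficients) feature lists.
    """
    scored = {
        name: min(segment_distance(query, s, w_color, w_texture) for s in segs)
        for name, segs in images.items()
    }
    return sorted(scored, key=lambda name: (scored[name], name))
```

Changing `w_color` and `w_texture` corresponds to the user assigning weights to the image features, and can reorder the results.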

12.
13.
The rapid growth of the Linked Open Data cloud, as well as the increasing ability to lift relational enterprise datasets to a semantic, ontology-based level means that vast amounts of information are now available in a representation that closely matches the conceptualizations of the potential users of this information. This makes it interesting to create ontology-based, user-oriented tools for searching and exploring this data. Although initial efforts were intended for tech users with knowledge of SPARQL/RDF, there are ongoing proposals designed for lay users. One of the most promising approaches is to use visual query interfaces, but more user studies are needed to assess their effectiveness. In this paper, we compare the effect on usability of two important paradigms for ontology-based query interfaces: form-based and graph-based interfaces. In order to reduce the number of variables affecting the comparison, we performed a user study with two state-of-the-art query tools developed by ourselves, sharing a large part of the code base: the graph-based tool OptiqueVQS*, and the form-based tool PepeSearch. We evaluated these tools in a formal comparison study with 15 participants searching a Linked Open Data version of the Norwegian Company Registry. Participants had to respond to 6 non-trivial search tasks using alternately OptiqueVQS* and PepeSearch. Even without previous training, retrieval performance and user confidence were very high, thus suggesting that both interface designs are effective for searching RDF datasets. Expert searchers had a clear preference for the graph-based interface, and mainstream searchers obtained better performance and confidence with the form-based interface. While a number of participants spontaneously praised the capability of the graph interface for composing complex queries, our results indicate that graph interfaces are difficult to grasp. In contrast, form interfaces are more learnable and relieve problems with disorientation for mainstream users. We have also observed positive results introducing faceted search and dynamic term suggestion in semantic search interfaces.

14.
The volume of publicly available data in biomedicine is constantly increasing. However, these data are stored in different formats and on different platforms. Integrating these data will enable us to accelerate the pace of medical discoveries by providing scientists with a unified view of this diverse information. Under the auspices of the National Center for Biomedical Ontology (NCBO), we have developed the Resource Index – a growing, large-scale ontology-based index of more than twenty heterogeneous biomedical resources. The resources come from a variety of repositories maintained by organizations from around the world. We use a set of over 200 publicly available ontologies contributed by researchers in various domains to annotate the elements in these resources. We use the semantics that the ontologies encode, such as different properties of classes, the class hierarchies, and the mappings between ontologies, in order to improve the search experience for the Resource Index user. Our user interface enables scientists to search the multiple resources quickly and efficiently using domain terms, without even being aware that there is semantics “under the hood.”

15.
Modern Web search engines still have many limitations: search terms are not disambiguated, search terms in one query cannot be in different languages, the retrieved media items have to be in the same language as the search terms and search results are not integrated across a live stream of different media channels, including TV, online news and social media. The system described in this paper enables all of this by combining a media stream processing architecture with cross-lingual and cross-modal semantic annotation, search and recommendation. All those components were developed in the xLiMe project.

16.
The semantic web vision is one in which rich, ontology-based semantic markup will become widely available. The availability of semantic markup on the web opens the way to novel, sophisticated forms of question answering. AquaLog is a portable question-answering system which takes queries expressed in natural language and an ontology as input, and returns answers drawn from one or more knowledge bases (KBs). We say that AquaLog is portable because the configuration time required to customize the system for a particular ontology is negligible. AquaLog presents an elegant solution in which different strategies are combined in a novel way. It makes use of the GATE NLP platform, string metric algorithms, WordNet and a novel ontology-based relation similarity service to make sense of user queries with respect to the target KB. Moreover, it also includes a learning component, which ensures that the performance of the system improves over time, in response to the particular community jargon used by end users.

17.
This research extends text mining and information retrieval research to the digital forensic text string search process. Specifically, we used a self-organizing neural network (a Kohonen Self-Organizing Map) to conceptually cluster search hits retrieved during a real-world digital forensic investigation. We measured the information retrieval effectiveness (e.g., precision, recall, and overhead) of the new approach and compared it against that of the current approach. The empirical results indicate that the clustering process significantly reduces the information retrieval overhead of the digital forensic text string search process, which is currently a very burdensome endeavor.
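For readers unfamiliar with Kohonen maps, the sketch below trains a tiny one-dimensional self-organizing map in plain Python. It is an illustration under stated assumptions, not the study's setup: the feature vectors standing in for search hits, the deterministic initialisation, the two-unit map, and the learning-rate and neighbourhood schedules are all hypothetical choices.

```python
def best_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - xi) ** 2 for w, xi in zip(weights[j], x)),
    )


def train_som(data, n_units=2, epochs=20, lr=0.5):
    """Train a tiny 1-D Kohonen map and return the unit weight vectors."""
    # Assumption: initialise units from the first samples so runs are repeatable.
    weights = [list(data[i]) for i in range(n_units)]
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)             # decaying learning rate
        radius = 1 if epoch < epochs // 2 else 0     # shrinking neighbourhood
        for x in data:
            bmu = best_unit(weights, x)
            for j in range(n_units):
                if abs(j - bmu) <= radius:
                    # The winning unit moves fully; neighbours move half as far.
                    influence = 1.0 if j == bmu else 0.5
                    weights[j] = [
                        w + rate * influence * (xi - w)
                        for w, xi in zip(weights[j], x)
                    ]
    return weights
```

After training, `best_unit(weights, hit_vector)` assigns each hit to a map unit, so hits that land on the same unit can be reviewed together as one cluster.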

18.
The general public is increasingly using search engines to seek information on risks and threats. Based on a search log from a large search engine, spanning three months, this study explores user patterns of query submission and subsequent clicks in sessions, for two important risk-related topics, healthcare and information security, and compares them to other randomly sampled sessions. We investigate two session-level metrics reflecting users' interactivity with a search engine: session length and query click rate. Drawing from information foraging theory, we find that session length can be characterized well by the Inverse Gaussian distribution. Among three types of sessions on different topics (healthcare, information security, and other randomly sampled sessions), we find that healthcare sessions have the most queries and the highest query click rate, and information security sessions have the lowest query click rate. In addition, sessions initiated by the users with greater search engine activity level tend to have more queries and higher query click rates. Among three types of sessions, search engine activity level shows the strongest effect on query click rate for information security sessions and weakest for healthcare sessions. We discuss theoretical and practical implications of the study.
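For reference, the Inverse Gaussian distribution mentioned in this abstract has the standard density below. The symbols are not from the abstract: here $x$ stands for session length, $\mu$ for the mean, and $\lambda$ for the shape parameter, and how the study parameterises or fits the distribution is not stated.

```latex
f(x;\mu,\lambda) = \sqrt{\frac{\lambda}{2\pi x^{3}}}\,
  \exp\!\left(-\frac{\lambda\,(x-\mu)^{2}}{2\mu^{2}x}\right),
  \qquad x > 0,\ \mu > 0,\ \lambda > 0,
\qquad
\mathbb{E}[X] = \mu, \quad \operatorname{Var}[X] = \frac{\mu^{3}}{\lambda}.
```

The density is right-skewed, which is consistent with many short sessions and a long tail of lengthy ones.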

19.
20.
Keyword‐based search engines such as Google index Web pages for human consumption. Sophisticated as such engines have become, surveys indicate almost 25% of Web searchers are unable to find useful results in the first set of URLs returned (Technology Review, March 2004). The lack of machine‐interpretable information on the Web limits software agents from matching human searches to desirable results. Tim Berners‐Lee, inventor of the Web, has architected the Semantic Web in which machine‐interpretable information provides an automated means to traversing the Web. A necessary cornerstone application is the search engine capable of bringing the Semantic Web together into a searchable landscape. We implemented a Semantic Web Search Engine (SWSE) that performs semantic search, providing predictable and accurate results to queries. To compare keyword search to semantic search, we constructed the Google CruciVerbalist (GCV), which solves crossword puzzles by reformulating clues into Google queries processed via the Google API. Candidate answers are extracted from query results. Integrating GCV with SWSE, we quantitatively show how semantic search improves upon keyword search. Mimicking the human brain's ability to create and traverse relationships between facts, our techniques enable Web applications to ‘think’ using semantic reasoning, opening the door to intelligent search applications that utilize the Semantic Web. Copyright © 2007 John Wiley & Sons, Ltd.
