Similar literature
 20 similar documents found; search took 15 ms
1.
Distributed hash tables (DHTs) excel at exact-match lookups, but they do not directly support complex queries such as content-based semantic search. In this paper, we propose a novel approach to efficient semantic search on DHT overlays. The basic idea is to place the indexes of semantically close files on the same peer nodes with high probability by exploiting information retrieval algorithms and locality-sensitive hashing. A query for semantically close files is answered with high recall by consulting only a small number (e.g., 10–20) of nodes that store the indexes of the files semantically close to the query. Our approach adds only index information to peer nodes, imposing only a small storage overhead. Via detailed simulations, we show that our approach achieves high recall at very low cost: the number of nodes visited per query is about 10–20, independent of the overlay size.
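
As a rough illustration of how locality-sensitive hashing can steer similar items to the same node, the sketch below uses random-hyperplane signatures over term vectors. It is a minimal sketch under stated assumptions, not the method of the paper above: the abstract does not name an LSH family, and the function names, the choice of SHA-1, and the modulo node assignment are all hypothetical.

```python
import hashlib
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes in dim dimensions (assumed LSH family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_signature(vector, hyperplanes):
    """Build a bit signature: one bit per hyperplane, set when the dot product is >= 0.

    Vectors separated by a small angle agree on most bits with high probability.
    """
    bits = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


def node_for_signature(signature, num_nodes):
    """Map a signature to a node index (hypothetical stand-in for a DHT key lookup)."""
    digest = hashlib.sha1(str(signature).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


planes = make_hyperplanes(num_bits=8, dim=4, seed=1)
doc_vector = [0.5, 1.0, 0.0, 2.0]
signature = lsh_signature(doc_vector, planes)
print(node_for_signature(signature, num_nodes=16))
```

Vectors that share a signature land on the same node, so a query only needs to visit the nodes for its own signature (and, in practice, a few neighbouring signatures).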

2.
3.
Object-Fuzzy Concept Network (O-FCN) is a recent knowledge representation model for integrating fuzzy ontologies into information retrieval systems. O-FCNs handle huge data collections and have to face the inherent complexity of semantic manipulation during the retrieval process. Distributing them is therefore essential for good scalability. We present ‘Grid2Peer’: a distributed architecture for O-FCN-based semantic information retrieval that exploits the self-organization characteristics of both Grid and P2P systems. The most relevant features of Grid2Peer are the use of fuzzy sets to organize the overlay itself, the ability to migrate knowledge towards the location where it is accessed, and dynamic load balancing among peers. Numerical simulations are performed to analyze these characteristics and to evaluate the fuzzy precision and fuzzy recall achieved by the distributed retrieval algorithm of the Grid2Peer architecture.

4.
This paper presents a novel method for semantic annotation and search of a target corpus using several knowledge resources (KRs). The method relies on a formal statistical framework in which KR concepts and corpus documents are homogeneously represented using statistical language models. Under this framework, we can perform all the operations necessary for an efficient and effective semantic annotation of the corpus. First, we propose a coarse tailoring of the KRs with respect to the target corpus, with the main goal of reducing the ambiguity of the annotations and their computational overhead. Then, we propose the generation of concept profiles, which allow measuring the semantic overlap of the KRs as well as tailoring them more finely. Finally, we propose how to semantically represent documents and queries in terms of the KR concepts and the statistical framework in order to perform semantic search. Experiments have been carried out with a corpus about web resources which includes several Life Sciences catalogs and Wikipedia pages related to web resources in general (e.g., databases, tools, services). Results demonstrate that the proposed method is more effective and efficient than state-of-the-art methods relying on either context-free annotation or keyword-based search.

5.
Traditional content-based music retrieval systems retrieve a specific music object which is similar to what a user has requested. However, the need exists for the development of category search for the retrieval of a specific category of music objects which share a common semantic concept. The concept of category search in content-based music retrieval is subjective and dynamic. Therefore, this paper investigates a relevance feedback mechanism for category search of polyphonic symbolic music based on semantic concept learning. For the consideration of both global and local properties of music objects, a segment-based music object modeling approach is presented. Furthermore, in order to discover the user semantic concept in terms of discriminative features of discriminative segments, a concept learning mechanism based on data mining techniques is proposed to find the discriminative characteristics between relevant and irrelevant objects. Moreover, three strategies, the Most-Positive, the Most-Informative, and the Hybrid, to return music objects concerning user relevance judgments are investigated. Finally, comparative experiments are conducted to evaluate the effectiveness of the proposed relevance feedback mechanism. Experimental results show that, for a database of 215 polyphonic music objects, 60% average precision can be achieved through the use of the proposed relevance feedback mechanism.
Fang-Fei Kuo

6.
Current web IR systems retrieve relevant information based only on keywords, which is inadequate for the vast amount of data on the web. They provide limited capabilities for capturing the concepts behind user needs and the relations between keywords. These limitations motivate conceptual search, which takes concepts and meanings into account. This study presents a semantic-based information retrieval system for semantic web search, together with an improved algorithm for retrieving information more efficiently. The architecture takes as input a list of plain keywords provided by the user and converts the query into a semantic query. The conversion uses the domain concepts of pre-existing domain ontologies and a third-party thesaurus, and discovers the semantic relationships between them at runtime. The information relevant to the semantic query is retrieved and ranked by relevance with the help of the improved algorithm. The performance analysis shows that the proposed system can improve the accuracy and effectiveness of retrieving relevant web documents compared with existing systems.

7.
Peer-to-peer (P2P) technology provides a popular way of distributing, sharing, and locating resources in a large-scale distributed environment. However, most existing P2P systems only support queries over a single resource attribute, such as file name. Current multi-attribute search methods often incur high maintenance cost and lack resilience to the highly dynamic environment of P2P networks. In this paper, we propose a Flabellate overlAy Network (FAN), a scalable and structured P2P overlay supporting resource queries over multi-dimensional attributes. In FAN, the resources are mapped into a multi-dimensional Cartesian space based on the consistent hash values of the resource attributes. The mapping space is divided into non-overlapping and continuous subspaces based on the peer’s distance. This paper presents strategies for managing the extended adjacent subspaces, which is crucial to network maintenance and resource search in FAN. The algorithms for basic resource search and range queries over FAN are also presented. To alleviate the load on hot nodes, a virtual replica network (VRN) consisting of the nodes that hold the same replicas is proposed for replicating popular resources adaptively. Queries can be forwarded from heavily loaded nodes to lightly loaded ones through the VRN. Theoretical analysis and experimental results show that FAN has higher routing efficiency and lower network maintenance cost than existing multi-attribute search methods. In addition, the VRN efficiently balances the network load and reduces the query delay in FAN while incurring relatively low overhead.
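
To make the attribute-to-coordinate mapping concrete, the following sketch hashes each attribute of a resource to one coordinate of a point in the unit hypercube. It is an illustration only, assuming SHA-1 and a `[0, 1)` coordinate range; the abstract does not specify FAN's hash function or how subspaces are assigned to peers, and the names `attribute_coordinate` and `resource_point` are hypothetical.

```python
import hashlib


def attribute_coordinate(name, value):
    """Hash one (attribute name, value) pair to a coordinate in [0, 1).

    Assumption: SHA-1 stands in for the consistent hash mentioned in the abstract.
    Six bytes (48 bits) are used so the division below is exact in a float.
    """
    digest = hashlib.sha1(f"{name}={value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:6], "big") / 2 ** 48


def resource_point(attributes, dimensions):
    """Map a resource to a point with one coordinate per attribute in `dimensions`."""
    return tuple(attribute_coordinate(name, attributes[name]) for name in dimensions)


resource = {"cpu": "4", "memory": "16GB", "os": "linux"}
print(resource_point(resource, ["cpu", "memory", "os"]))
```

A peer responsible for a subspace would then store every resource whose point falls inside it. A plain hash like this one scatters nearby attribute values, so it does not show how range queries are supported; that part is left out of the sketch.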

8.
The increasing performance and wider use of automated semantic annotation and entity linking platforms have made it possible to use semantic information in information retrieval. While keyword-based information retrieval techniques have shown impressive performance, adding semantic information can increase retrieval performance by allowing for more accurate sense disambiguation, intent determination, and instance identification, to name a few. Researchers have already explored integrating semantic information into practical search engines using techniques such as graph databases, hybrid indices, and adapted inverted indices, among others. One challenge in efficiently designing a search engine that considers semantic information is that it needs to index information beyond what is traditionally stored in inverted indices, including entity mentions and type relationships. The objective of our work in this paper is to investigate how different data structure types can be adopted to integrate three types of information: keywords, entities, and types. We systematically compare the performance of the different data structures for scenarios where (i) the same data structure type is adopted for all three types of information, and (ii) different data structure types are combined for storing and retrieving the three information types. We report our findings in terms of the performance of various query processing tasks, such as Boolean and ranked intersection, for the different indices, and discuss which index type is appropriate under different conditions for semantic search.

9.
XML documents are extensively used in several applications and evolve over time. Identifying the semantics of these changes is fundamental to understanding their evolution. Existing approaches to understanding changes (diff) in XML documents focus only on syntactic changes. These approaches compare XML documents based on their structure, without considering the associated semantics. However, for large XML documents that have undergone many changes from one version to the next, a large number of syntactic changes may correspond to fewer semantic changes, which are then easier to analyze and understand. For instance, increasing the annual salary and the gross pay, and changing the job title of an employee (three syntactic changes) may mean that this employee was promoted (one semantic change). In this paper, we explore this idea and present the XChange approach. XChange considers the semantics of the changes when calculating the diff of different versions of XML documents. To this end, our approach analyzes the granular syntactic changes in XML attributes and elements, using inference rules to combine them into semantic changes. Thus, unlike existing approaches, XChange uses the syntactic changes between versions of an XML document to infer the real reason for the change and support the process of semantic diff. Results of an experimental study indicate that XChange can provide higher effectiveness and efficiency than the (syntactic) state-of-the-art approaches when used to understand changes between versions of XML documents.
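
The promotion example above can be sketched as a single hand-written rule over field-level changes. This is a minimal sketch, not XChange's rule language or implementation, neither of which is described in the abstract: employee records are assumed to be already parsed from XML into dictionaries, and the field names and both function names are hypothetical.

```python
def syntactic_changes(old, new):
    """Return {field: (old_value, new_value)} for every field that differs."""
    fields = sorted(set(old) | set(new))
    return {f: (old.get(f), new.get(f)) for f in fields if old.get(f) != new.get(f)}


def infer_promotion(changes):
    """Hypothetical rule: salary up, gross pay up, and job title changed => promotion."""

    def increased(field):
        if field not in changes:
            return False
        before, after = changes[field]
        return before is not None and after is not None and after > before

    return (
        increased("annual_salary")
        and increased("gross_pay")
        and "job_title" in changes
    )


version_1 = {"name": "Ana", "annual_salary": 50000, "gross_pay": 4500, "job_title": "Analyst"}
version_2 = {"name": "Ana", "annual_salary": 60000, "gross_pay": 5400, "job_title": "Senior Analyst"}

changes = syntactic_changes(version_1, version_2)
print(len(changes), infer_promotion(changes))
```

Three syntactic changes collapse into one semantic change here; a real system would hold many such rules and report whichever ones match.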

10.
11.
In this paper, we present an ontology-based information extraction and retrieval system and its application in the soccer domain. In general, we deal with three issues in semantic search, namely usability, scalability, and retrieval performance. We propose a keyword-based semantic retrieval approach. The performance of the system is improved considerably using domain-specific information extraction, inferencing, and rules. Scalability is achieved by adapting a semantic indexing approach and representing the whole world as small independent models. The system is implemented using state-of-the-art Semantic Web technologies, and its performance is evaluated against traditional systems as well as query expansion methods. Furthermore, a detailed evaluation is provided to observe the performance gain due to domain-specific information extraction and inferencing. Finally, we show how we use semantic indexing to solve simple structural ambiguities.

12.
Current search engines are rapidly changing to embrace more powerful mechanisms that are capable of reasoning on semantic attributes of content in a distributed repository. Formalisms have been proposed to represent the semantic attributes. Yet, traditional approaches for content sharing in peer-to-peer systems cannot be adapted to use semantic information. Some novel proposals in the literature consider semantic aspects in P2P systems. However, they either make strong assumptions on the semantic model or have high communication and computation overhead.

13.
14.
15.
Peer-to-peer (P2P) networks are beginning to form the infrastructure of future applications. Heavy network traffic limits the scalability of P2P networks. Indexing is one way to reduce this traffic, but indexes tend to become large as the network grows, and limiting their size causes loss of indexing information. In this paper we introduce a novel ontology-based index (OI) which limits the size of the indexes without sacrificing indexing information. We show that the method can be employed by many P2P networks. The OI sits on top of the routing and maintenance modules of a P2P network and enhances it. The OI prunes branches of search trees that have no chance of leading to a response. The OI also guarantees that an enhanced routing algorithm and its basic version return the same result set for a given search query. This means that the OI reduces traffic without reducing quality of service. To measure the performance of the OI, we apply it to Chord (DHT-based) and HyperCup (non-DHT-based) P2P networks and show that it reduces the networks’ traffic significantly.

16.
Various types of applications access objects distributed in peer-to-peer (P2P) overlay networks. Even if the locations of target objects are detected by algorithms such as flooding and distributed hash tables (DHTs), applications cannot manipulate the target objects without access requests. It is critical to discover which peer can manipulate an object with which method, i.e. only a peer with an access right is allowed to manipulate an object. Hence, the application has to find target peers that can manipulate a target object rather than detect the location of the target object. Due to the scale and variety of peers, it is difficult, possibly impossible, to maintain a centralized directory showing which peer holds each object. An acquaintance peer of a peer p is a peer whose service the peer p knows and with which the peer p can directly communicate. We discuss types of acquaintance relations of peers with respect to what objects each peer holds, is allowed to manipulate, and can grant access rights on. Acquaintance peers of a peer may notify the peer of different information on target peers due to communication and propagation delay. Here, it is critical to discuss how much a peer trusts each acquaintance peer. We first define the satisfiability, i.e. how much a peer is satisfied by issuing an access request to another peer. For example, if a peer locally manipulates a target object o and sends a response, the requesting peer p_i is mostly satisfied. On the other hand, if that peer has to ask another peer to manipulate the object o, p_i is less satisfied. We define the trustworthiness of an acquaintance peer of a peer from the satisfiability and the ranking factor.
Makoto Takizawa

17.
An increasing amount of structured data on the Web has attracted industry attention and renewed research interest in what is collectively referred to as semantic search. These solutions exploit the explicit semantics captured in structured data such as RDF for enhancing document representation and retrieval, or for finding answers by directly searching over the data. These data have been used for different tasks, and a wide range of corresponding semantic search solutions have been proposed in the past. However, it has been widely recognized that a standardized setting for evaluating and analyzing the current state of the art in semantic search is needed to monitor and stimulate further progress in the field. In this paper, we present an evaluation framework for semantic search, analyze the framework with regard to repeatability and reliability, and report on our experiences in applying it in the Semantic Search Challenge 2010 and 2011.

18.
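Many researchers now build semantic overlay networks to improve the efficiency of resource queries in P2P networks, but the optimal size of a semantic overlay network has received little study. This work models semantic overlay networks mathematically, studies how to solve for the routing performance metrics of routing algorithms, and analyzes the relationship between the size of a semantic overlay network and those metrics. Analyzing and solving the model yields the optimal community size, providing strong support for the construction and study of semantic overlay networks.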

19.
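Research on a Construction Method for Semantic Overlay Networks with Associations   Total citations: 1, self-citations: 0, citations by others: 1
In the real world, information resources are linked by many kinds of association relationships, yet current search engines offer only keyword-based search and cannot provide users with the various kinds of keyword-related information they care about. To address this problem, we propose building a semantic overlay network that is driven by user needs and centered on the information the user cares about, gathering all information related to a resource and presenting it to the user. Nodes are first clustered by semantic similarity; on top of the clusters, an association-based semantic overlay network is then built from the various association relationships.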

20.
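Recently, building semantic overlay networks to improve the performance of information retrieval services in large-scale distributed network environments has become a research hotspot in peer-to-peer computing. Researchers have done extensive work on semantic overlay protocols and search algorithms, showing that semantic overlays are highly effective for content location in applications based on the peer-to-peer model. However, work on analyzing and evaluating the characteristics of semantic overlay networks remains very limited. By building a mathematical model and designing a heuristic hybrid backtracking-greedy algorithm, this paper identifies a major intrinsic property of semantic overlay networks: community structure. Using the evaluation model, we compare the SemreX semantic overlay network with the Gnutella network; the experimental results show that the SemreX overlay has a pronounced community structure, whereas the Gnutella network does not. In addition, simulating a flooding protocol on both overlays shows that the overlay with pronounced community structure locates content more efficiently.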
