1.

Mapping the semantics of Web text and links (total citations: 1; self-citations: 0; citations by others: 1)

Menczer F. 《Internet Computing, IEEE》2005,9(3):27-36

Search engines use content and links to search, rank, cluster, and classify Web pages. These information discovery applications use similarity measures derived from this data to estimate relatedness between pages. However, little research exists on the relationships between similarity measures or between such measures and semantic similarity. The author analyzes and visualizes similarity relationships in massive Web data sets to identify how to integrate content and link analysis for approximating relevance. He uses human-generated metadata from Web directories to estimate semantic similarity and semantic maps to visualize relationships between content and link cues and what these cues suggest about page meaning. Highly heterogeneous topical maps point to a critical dependence on search context.
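The abstract above does not say which content and link similarity measures are used. As a rough illustration only, the sketch below assumes cosine similarity over term counts for content and over sets of linked URLs for links; the function names `content_similarity` and `link_similarity` are hypothetical and not taken from the article.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty page is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def content_similarity(text1: str, text2: str) -> float:
    """Assumed content cue: cosine similarity of whitespace-split term counts."""
    return cosine_similarity(Counter(text1.lower().split()),
                             Counter(text2.lower().split()))


def link_similarity(links1, links2) -> float:
    """Assumed link cue: cosine similarity of binary link-neighborhood vectors."""
    return cosine_similarity(Counter(set(links1)), Counter(set(links2)))
```

Two pages can score high on one cue and low on the other, which is the kind of relationship the article sets out to map against semantic similarity.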

2.

Discovering and ranking Semantic Web sites

Zhang Xiang (张祥), Ge Weiyi (葛唯益), Qu Yuzhong (瞿裕忠) 《Journal of Software (软件学报)》2009,20(10):2834-3843

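With the rapid growth of RDF data on the Semantic Web, semantic search engines have made it easier for users to search RDF data. However, how to automatically discover sites that contain Semantic Web information resources, and how to collect those resources efficiently from such sites, remain open problems for semantic search engines. This paper first introduces a link model of Semantic Web sites, which characterizes the relationships among Semantic Web sites, Semantic Web information resources, RDF models, and Semantic Web entities. Based on this model, it discusses the problem of attributing Semantic Web entities to sites and defines rules for discovering Semantic Web sites. In addition, starting from the site link model, it defines a dependency graph of Semantic Web sites and gives an algorithm for ranking them. The algorithms were preliminarily tested in a real semantic search engine. Experimental results show that the proposed method can effectively discover and rank Semantic Web sites.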

3.

Relationship Web: Blazing Semantic Trails between Web Resources

Sheth A.P. Ramakrishnan C. 《Internet Computing, IEEE》2007,11(4):77-81

Using keywords as inputs to search engines and receiving documents as responses remains the prevalent way to access information on the Web. Although a shift toward entity awareness is a fairly recent trend in information access, such methods remain devoid of semantics, which are increasingly recognized as the lynchpin of search, integration, and analysis. We argue that relationships are at the heart of semantics, and, as such, we envision a Web of relationships to relate content across Web resources. Under this powerful new paradigm, information access over the Web would switch from a mere document-retrieval operation to an information framework that supports insight elicitation and semantic analytics over Web resources. In this column, we outline our vision and discuss how recent improvements in content extraction and semantic annotation will ultimately help us realize this relationship Web.

4.

Automatic generation of semantically enriched web pages by a text mining approach

Hsin-Chang Yang 《Expert systems with applications》2009,36(6):9709-9718

5.

WebOWL: A Semantic Web search engine development experiment

Alexandros Batzios Pericles A. Mitkas 《Expert systems with applications》2012,39(5):5052-5060

This paper presents WebOWL, an experiment in using the latest technologies to develop a Semantic Web search engine. WebOWL consists of a community of intelligent agents, acting as crawlers, that are able to discover and learn the locations of Semantic Web neighborhoods on the Web, a semantic database to store data from different ontologies, a query mechanism that supports semantic queries in OWL, and a ranking algorithm that determines the order of the returned results based on the semantic relationships of classes and individuals. The system has been implemented using Jade, Jena and the db4o object database engine and has successfully stored over one million OWL classes, individuals and properties.

6.

Improving large-scale search engines with semantic annotations

Damaris Fuentes-Lorenzo Norberto Fernández Jesús A. Fisteus Luis Sánchez 《Expert systems with applications》2013,40(6):2287-2296

Traditional search engines have become the most useful tools to search the World Wide Web. Even though they are good for certain search tasks, they may be less effective for others, such as satisfying ambiguous or synonym queries. In this paper, we propose an algorithm that, with the help of Wikipedia and collaborative semantic annotations, improves the quality of web search engines in the ranking of returned results. Our work is supported by (1) the logs generated after query searching, (2) semantic annotations of queries and (3) semantic annotations of web pages. The algorithm makes use of this information to elaborate an appropriate ranking. To validate our approach we have implemented a system that can apply the algorithm to a particular search engine. Evaluation results show that the number of relevant web resources obtained after executing a query with the algorithm is higher than the one obtained without it.

7.

i-TagRanker: an efficient tag ranking system for image sharing and retrieval using the semantic relationships between tags

Jin-Woo Jeong Hyun-Ki Hong Dong-Ho Lee 《Multimedia Tools and Applications》2013,62(2):451-478

Folksonomy, considered a core component of the Web 2.0 user-participation architecture, is a classification system built from users' tags on web resources. Recently, various approaches to image retrieval that exploit folksonomy have been proposed to improve image search results. However, characteristics of tags such as semantic ambiguity and the lack of a controlled vocabulary limit their effectiveness for image retrieval. In particular, tags associated with images in a random order do not provide any information about the relevance between a tag and an image. In this paper, we propose a novel image tag ranking system called i-TagRanker, which exploits the semantic relationships between tags to re-order the tags according to their relevance to an image. The proposed system consists of two phases: 1) a tag propagation phase and 2) a tag ranking phase. In the tag propagation phase, we first collect the most relevant tags from similar images, and then propagate them to an untagged image. In the tag ranking phase, tags are ranked according to their semantic relevance to the image. Experimental results on a Flickr photo collection of over 30,000 images show the effectiveness of the proposed system.

8.

Evolutionary approach for semantic-based query sampling in large-scale information sources

Jason J. Jung 《Information Sciences》2012,182(1):30-39

Metadata about information sources (e.g., databases and repositories) can be collected by Query Sampling (QS). Such metadata can include topics and statistics (e.g., term frequencies) about the information sources. This provides important evidence for determining which sources in the distributed information space should be selected for a given user query. The aim of this paper is to find out the semantic relationships between the information sources in order to distribute user queries to a large number of sources. Thereby, we propose an evolutionary approach for automatically conducting QS using multiple crawlers and obtaining the optimized semantic network from the sources. The aim of combining QS and evolutionary methods is to collaboratively extract metadata about target sources and optimally integrate the metadata, respectively. For evaluating the performance of contextualized QS on 122 information sources, we have compared the ranking lists recommended by the proposed method with user feedback (i.e., ideal ranks), and also computed the precision of the discovered subsumptions in terms of the semantic relationships between the target sources.

9.

Mining Web search engines for query suggestion

Zheng Xu Xiangfeng Luo Jie Yu Weimin Xu 《Concurrency and Computation》2011,23(10):1101-1113

Queries to Web search engines are usually short and ambiguous, and so convey too little about users' information needs to retrieve relevant Web pages effectively. To address this problem, most search engines implement query suggestion. However, existing methods (e.g. Google's ‘Search related to’ and Yahoo's ‘Also Try’) do not balance the trade-off between accuracy and computational complexity appropriately. In this paper, the recommended words are extracted from the search results of the query, which keeps query suggestion real-time. A scheme for ranking words based on semantic similarity presents a list of words as the query suggestion results, which ensures the accuracy of query suggestion. Moreover, the experimental results show that the proposed method significantly improves the quality of query suggestion over some popular Web search engines (e.g. Google and Yahoo). Finally, an offline experiment that compares the accuracy of snippets in capturing the number of words in a document is performed, which increases confidence in the proposed method. Copyright © 2010 John Wiley & Sons, Ltd.

10.

Web Semantics in the Clouds (total citations: 3; self-citations: 0; citations by others: 3)

《Intelligent Systems, IEEE》2008,23(5):82-87

Cloud computing refers to the use of large-scale computer clusters often built from low-cost hardware and network equipment, where resources are allocated dynamically among users of the cluster. While the paradigm is not entirely novel, recent developments in software frameworks for cloud computing are making it increasingly easy for programmers to parallelize and thereby scale-up complex data-processing tasks. This article investigates how this trend is impacting the semantic Web field and shows how cloud computing can be used to analyze, query, and reason with the massive amounts of metadata handled by semantic search engines.

11.

Research on the description and generation of semantic metadata

Tao Wan (陶皖), Liao Shumei (廖述梅) 《Computer and Modernization (计算机与现代化)》2006,(12):1-3

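Semantic metadata is data that describes the semantic information of Web content; representing and generating it effectively is a key technology for building the Semantic Web. After discussing the various ways of representing semantic metadata, this paper studies techniques for generating it and, after analyzing the characteristics and shortcomings of existing techniques, reviews development trends in semantic metadata generation.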

12.

Webspam demotion: Low complexity node aggregation methods

Thomas Largillier Sylvain Peyronnet 《Neurocomputing》2012,76(1):105-113

Search engine result pages (SERPs) for a specific query are constructed according to several mechanisms. One of them consists in ranking Web pages according to their importance, regardless of their semantics. Indeed, relevance to a query is not enough to provide high quality results, and popularity is used to arbitrate between equally relevant Web pages. The most well-known algorithm that ranks Web pages according to their popularity is PageRank. The term Webspam was coined to denote Web pages created with the sole purpose of fooling ranking algorithms such as PageRank. Indeed, the goal of Webspam is to promote a target page by increasing its rank. It is an important issue for Web search engines to spot and discard Webspam to provide their users with an unbiased list of results. Webspam techniques are evolving constantly to remain efficient, but most of the time they still consist in creating a specific linking architecture around the target page to increase its rank. In this paper we propose to study the effects of node aggregation on the well-known ranking algorithm of Google (PageRank) in the presence of Webspam. Our node aggregation methods aim to construct clusters of nodes that are treated as a single node in the PageRank computation. Since the Web graph is far too big to apply classic clustering techniques, we present four lightweight aggregation techniques suitable for its size. Experimental results on the WEBSPAM-UK2007 dataset show the interest of the approach, which is moreover confirmed by statistical evidence.
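The abstract does not describe its four aggregation techniques, so the following is only a minimal sketch of the general idea: collapse each cluster into one node, then run a standard power-iteration PageRank on the collapsed graph. The cluster assignment `cluster_of` is assumed to be given, and the names `pagerank` and `aggregate` are hypothetical, not the authors' API.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank. graph maps each node to a list of successors."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Every node receives the uniform teleportation share first.
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            targets = graph.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[u] / n
                for v in nodes:
                    new_rank[v] += share
        rank = new_rank
    return rank


def aggregate(graph, cluster_of):
    """Collapse clusters into single nodes, dropping intra-cluster and duplicate links.

    cluster_of maps a node to its cluster label; unmapped nodes stay as they are.
    """
    all_nodes = set(graph) | {v for targets in graph.values() for v in targets}
    merged = {}
    for u in all_nodes:
        cu = cluster_of.get(u, u)
        merged.setdefault(cu, set())
        for v in graph.get(u, []):
            cv = cluster_of.get(v, v)
            if cv != cu:
                merged[cu].add(cv)
    return {u: sorted(targets) for u, targets in merged.items()}
```

For example, with `graph = {"s1": ["t"], "s2": ["t"], "t": ["s1", "s2"], "h": ["t", "g"], "g": ["h"]}` and `cluster_of` mapping `s1`, `s2` and `t` to `"farm"`, calling `pagerank(aggregate(graph, cluster_of))` ranks the whole link farm as one node, so links inside the farm no longer contribute to the target page.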

13.

Toward a New Generation of Semantic Web Applications

d'Aquin M. Motta E. Sabou M. Angeletou S. Gridinoc L. Lopez V. Guidi D. 《Intelligent Systems, IEEE》2008,23(3):20-28

Although research on integrating semantics with the Web started almost as soon as the Web was in place, a concrete Semantic Web (that is, a large-scale collection of distributed semantic metadata) emerged only over the past four to five years. The Semantic Web's embryonic nature is reflected in its existing applications. Most of these applications tend to produce and consume their own data, much like traditional knowledge-based applications, rather than actually exploiting the Semantic Web as a large-scale information source. These first-generation semantic Web applications typically use a single ontology that supports integration of resources selected at design time.

14.

Semantic integration of enterprise information systems using meta-metadata ontology

Igor Cverdelj-Fogaraši Goran Sladić Stevan Gostojić Milan Segedinac Branko Milosavljević 《Information Systems and E-Business Management》2017,15(2):257-304

This paper proposes a non-domain-specific metadata ontology as a core component in a semantic model-based document management system (DMS), a potential contender towards the enterprise information systems of the next generation. What we developed is the core semantic component of an ontology-driven DMS, providing a robust semantic base for describing documents’ metadata. We also enabled semantic services such as automated semantic translation of metadata from one domain to another. The core semantic base consists of three semantic layers, each one serving a different view of documents’ metadata. The core semantic component’s base layer represents a non-domain-specific metadata ontology founded on the ebRIM specification. The main purpose of this ontology is to serve as a meta-metadata ontology for other domain-specific metadata ontologies. The base semantic layer provides a generic metadata view. For the sake of enabling domain-specific views of documents’ metadata, we implemented two domain-specific metadata ontologies, semantically layered on top of ebRIM, serving domain-specific views of the metadata. In order to enable semantic translation of metadata from one domain to another, we established model-to-model mappings between these semantic layers by introducing SWRL rules. Having the semantic translation of metadata automated not only allows for effortless switching between different metadata views, but also opens the door for automating the process of long-term document archiving. For the case study, we chose the judicial domain as a promising ground for improving the efficiency of the judiciary by introducing semantics in this field.

15.

Semantic annotation metadata and its extraction techniques (total citations: 4; self-citations: 0; citations by others: 4)

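Ling Haiyun (凌海云), Zuo Zhihong (左志宏), Chen Lan (陈兰), Duan Enze (段恩泽), Yuan Junying (袁军英) 《Application Research of Computers (计算机应用研究)》2004,21(7):147-149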

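This paper discusses methods for annotating metadata on the Semantic Web using XML or RDF/XML, and the two forms in which metadata annotations exist on the Semantic Web: standalone files or XML packets. On this basis, it introduces methods for extracting metadata from these standalone files or from the host files of XML packets, including the XML parsers SAX and DOM and the construction of an XML packet scanner.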

16.

Algorithmic Computation and Approximation of Semantic Similarity

Ana G. Maguitman Filippo Menczer Fulya Erdinc Heather Roinestad Alessandro Vespignani 《World Wide Web》2006,9(4):431-456

Automatic extraction of semantic information from text and links in Web pages is key to improving the quality of search results. However, the assessment of automatic semantic measures is limited by the coverage of user studies, which do not scale with the size, heterogeneity, and growth of the Web. Here we propose to leverage human-generated metadata, namely topical directories, to measure semantic relationships among massive numbers of pairs of Web pages or topics. The Open Directory Project classifies millions of URLs in a topical ontology, providing a rich source from which semantic relationships between Web pages can be derived. While semantic similarity measures based on taxonomies (trees) are well studied, the design of well-founded similarity measures for objects stored in the nodes of arbitrary ontologies (graphs) is an open problem. This paper defines an information-theoretic measure of semantic similarity that exploits both the hierarchical and non-hierarchical structure of an ontology. An experimental study shows that this measure improves significantly on the traditional taxonomy-based approach. This novel measure allows us to address the general question of how text and link analyses can be combined to derive measures of relevance that are in good agreement with semantic similarity. Surprisingly, the traditional use of text similarity turns out to be ineffective for relevance ranking.
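The abstract gives no formula for either measure. As an illustration of what a taxonomy-based, information-theoretic similarity can look like, the sketch below uses a Lin-style measure on a tree, sim(t1, t2) = 2 log Pr[t0] / (log Pr[t1] + log Pr[t2]), where t0 is the lowest common ancestor and Pr[t] is the fraction of pages filed under the subtree rooted at t. Treating this as the tree-based baseline is an assumption, the graph-based measure proposed in the paper is not reproduced here, and all names are hypothetical. The sketch also assumes every topic has at least one page in its subtree.

```python
import math


def subtree_counts(children, pages, root):
    """Number of pages filed under each topic's subtree."""
    counts = {}

    def visit(node):
        total = pages.get(node, 0)
        for child in children.get(node, []):
            total += visit(child)
        counts[node] = total
        return total

    visit(root)
    return counts


def ancestors(parent, node):
    """Path from a node up to the root, starting with the node itself."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def tree_similarity(children, pages, root, t1, t2):
    """Lin-style similarity of two topics in a taxonomy (tree)."""
    counts = subtree_counts(children, pages, root)
    parent = {c: p for p, cs in children.items() for c in cs}
    up_from_t2 = set(ancestors(parent, t2))
    # The first ancestor of t1 that is also an ancestor of t2 is the LCA.
    lca = next(a for a in ancestors(parent, t1) if a in up_from_t2)
    total = counts[root]

    def log_p(topic):
        return math.log(counts[topic] / total)

    denominator = log_p(t1) + log_p(t2)
    if denominator == 0:
        return 1.0  # both topics are the root
    return 2 * log_p(lca) / denominator
```

With `children = {"Top": ["Arts", "Science"], "Science": ["Physics", "Biology"]}` and `pages = {"Arts": 4, "Physics": 2, "Biology": 2}`, the topics Physics and Biology share the ancestor Science and score 0.5, while Physics and Arts meet only at the root and score 0.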

17.

Graph visualization techniques for web clustering engines (total citations: 1; self-citations: 0; citations by others: 1)

Di Giacomo E Didimo W Grilli L Liotta G 《IEEE transactions on visualization and computer graphics》2007,13(2):294-304

One of the most challenging issues in mining information from the World Wide Web is the design of systems that present the data to the end user by clustering them into meaningful semantic categories. We show that the analysis of the results of a clustering engine can significantly take advantage of enhanced graph drawing and visualization techniques. We propose a graph-based user interface for Web clustering engines that makes it possible for the user to explore and visualize the different semantic categories and their relationships at the desired level of detail.

18.

Semantic Web approach to smart link generation for Web navigations

Shang‐Juh Kao I‐Ching Hsu 《Software》2007,37(8):857-879

19.

A ranking mechanism for Semantic Web service matching based on a semantic distance metric model

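Zeng Zhihao (曾志浩), Ying Shi (应时), Chen Rui (陈锐), Ni Youcong (倪友聪), Zhao Kai (赵楷) 《Computer Engineering & Science (计算机工程与科学)》2010,32(6):138-141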

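As research on Semantic Web service technology deepens, the number of Semantic Web services on the Internet is increasing sharply. How to locate usable Semantic Web services quickly and conveniently has become an urgent and critical problem. In research on Semantic Web service matching, one important topic is the mechanism for ranking matching results. After summarizing and analyzing existing ranking mechanisms for Semantic Web service matching results, this paper proposes a ranking mechanism based on a semantic distance metric model. The mechanism computes a semantic similarity measure for the Semantic Web services to be matched and ranks the matching results by that measure. The metric model converts the semantic relationships between the concepts referenced by Semantic Web services into quantitative values that can be compared precisely. It can also compute separate semantic distances for candidate services that belong to the same semantic matching type, distinguishing how well candidates of the same type match the service request, and thereby improves the user's experience of searching for Semantic Web services.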

20.

Search personalization through query and page topical analysis

Sofia Stamou Alexandros Ntoulas 《User Modeling and User-Adapted Interaction》2009,19(1-2):5-33

Thousands of users issue keyword queries to the Web search engines to find information on a number of topics. Since the users may have diverse backgrounds and may have different expectations for a given query, some search engines try to personalize their results to better match the overall interests of an individual user. This task involves two great challenges. First, the search engines need to be able to effectively identify the user interests and build a profile for every individual user. Second, once such a profile is available, the search engines need to rank the results in a way that matches the interests of a given user. In this article, we present our work towards a personalized Web search engine and we discuss how we addressed each of these challenges. Since users are typically not willing to provide information on their personal preferences, for the first challenge, we attempt to determine such preferences by examining the click history of each user. In particular, we leverage a topical ontology for estimating a user’s topic preferences based on her past searches, i.e. previously issued queries and pages visited for those queries. We then explore the semantic similarity between the user’s current query and the query-matching pages, in order to identify the user’s current topic preference. For the second challenge, we have developed a ranking function that uses the learned past and current topic preferences in order to rank the search results to better match the preferences of a given user. Our experimental evaluation on the Google query-stream of human subjects over a period of 1 month shows that user preferences can be learned accurately through the use of our topical ontology and that our ranking function which takes into account the learned user preferences yields significant improvements in the quality of the search results.