20 similar documents found; search took 15 ms.
1.
Eli Rohn 《Knowledge and Information Systems》2010,24(2):283-304
We examine data definition languages (DDLs) from various computing eras spanning almost 50 years to date. We prove that contemporary DDLs are indistinguishable from older ones using the Zipf distribution of words, the Zipf distribution of meanings, and information theory. None addresses the Law of Requisite Variety, which is necessary for enabling automatic data integration from autonomous heterogeneous data sources and for the realization of the Semantic Web. The growth of the entire computing industry is hampered by the lack of progress in the development of DDLs suited to these two goals. Our findings set the stage for the future development of a mathematically sound DDL better suited to the aforementioned purposes.
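As an illustration of the rank-frequency analysis this abstract refers to, the sketch below estimates a Zipf exponent for a token sample by log-log regression. It is not from the paper; the tiny token list is a made-up stand-in for a real DDL corpus.

```python
import math
from collections import Counter

def zipf_exponent(tokens):
    """Estimate the Zipf exponent s (frequency ~ rank**-s) by
    least-squares regression of log(frequency) on log(rank)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope  # roughly Zipfian text gives s close to 1

tokens = "the of the and the of a the in the a of".split()
print(round(zipf_exponent(tokens), 2))
```

Comparing the fitted exponents of two vocabularies (e.g. an old DDL's keyword usage versus a new one's) is one way to make "indistinguishable" quantitative.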
2.
Design and Implementation of Semantic Heterogeneous Data Integration Based on Web Services  (Total citations: 2, self-citations: 1, citations by others: 1)
Universities have accumulated a large number of heterogeneous data sources in the course of informatization, and how to effectively integrate the related data resources held in these sources is a pressing problem. To this end, a new Web Services-based solution for heterogeneous data integration is proposed. Exploiting the advantages of Web Services technology for heterogeneous data integration and adopting a virtual-database design approach, the scheme resolves semantic heterogeneity, the hardest problem in heterogeneous data integration, by designing a domain dictionary table and a field-mapping table. The feasibility of the scheme is discussed in terms of data-model conversion and global querying. Experimental results show that the scheme is usable and efficient.
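The field-mapping-table idea described above can be pictured with a minimal sketch: each source's local column names are mapped onto attributes of a global (virtual) schema. The FIELD_MAP contents, source names, and field names (oa_system, lib_system, xm, xh, and so on) are invented for illustration and do not come from the paper.

```python
# Hypothetical field-mapping table: maps each source's local column
# names onto attributes of a global (virtual) schema.
FIELD_MAP = {
    "oa_system":  {"xm": "name", "xh": "student_id"},
    "lib_system": {"reader_name": "name", "card_no": "student_id"},
}

def to_global(source, record):
    """Rewrite one source record into the global schema, dropping
    fields that have no mapping entry."""
    mapping = FIELD_MAP[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}

print(to_global("oa_system", {"xm": "Li Lei", "xh": "2021001"}))
# {'name': 'Li Lei', 'student_id': '2021001'}
```

A global query can then be answered by translating each source's records through its mapping and unioning the results, which is the essence of the virtual-database approach.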
3.
4.
To search for and exploit the deep data in Web resources, and building on an analysis of the problems of retrieving Web resources with search engines, this paper uses Semantic Web and Web Service technology to build a Web-resource ontology model for semantically annotating Web resources, constructs a data-mediation service application model combined with a service-management agent, and implements the Web-resource data-mediation service as a Web service. Experiments verify the effectiveness and feasibility of the service, which helps users effectively acquire and share Web resource data from massive, diverse Web resources.
5.
Semantic annotation is required to add machine-readable content to natural language text. A global initiative such as the Semantic Web directly depends on the annotation of massive amounts of textual Web resources. However, considering the amount of those resources, manual semantic annotation of their contents is neither feasible nor scalable. In this paper we introduce a methodology to partially annotate the textual content of Web resources in an automatic and unsupervised way. It uses several well-established learning techniques and heuristics to discover relevant entities in text and to associate them with classes of an input ontology by means of linguistic patterns. It also relies on the Web information distribution to assess the degree of semantic co-relation between entities and classes of the input domain ontology. Special effort has been put into minimizing the number of Web accesses required to evaluate entities, in order to ensure the scalability of the approach. A manual evaluation carried out to test the methodology for several domains shows promising results.
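The Web-based co-relation assessment described above is commonly realized with a pointwise-mutual-information-style score over co-occurrence counts. The sketch below assumes a pre-fetched table of hypothetical hit counts (HITS, TOTAL) standing in for live Web queries; it illustrates the general scoring idea, not the paper's exact function.

```python
import math

# Hypothetical Web hit counts standing in for search-engine queries;
# minimizing such accesses is what keeps the approach scalable.
HITS = {"jaguar": 1000, "animal": 50000, "jaguar animal": 200,
        "car": 80000, "jaguar car": 400}
TOTAL = 10**6  # assumed size of the indexed Web sample

def web_pmi(entity, cls):
    """Pointwise mutual information between an entity and a class,
    estimated from co-occurrence counts: log2(P(e,c) / (P(e)P(c)))."""
    p_e = HITS[entity] / TOTAL
    p_c = HITS[cls] / TOTAL
    p_ec = HITS[f"{entity} {cls}"] / TOTAL
    return math.log2(p_ec / (p_e * p_c))

print(web_pmi("jaguar", "animal") > 0)  # True: positively associated
```

An entity would be attached to whichever ontology class maximizes this score, subject to a minimum threshold.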
6.
Claudio Gennaro, Rita Lenzi, Federica Mandreoli, Riccardo Martoglia, Matteo Mordacchini, Wilma Penzo, Simona Sassatelli 《Information Systems》2011
In recent years, the diffusion of peer-to-peer networks has been moving beyond the single-domain paradigm, such as mono-thematic file sharing (e.g. Napster for music). Peers are increasingly heterogeneous data sources that need to share data for commercial, educational, and/or collaborative purposes, to mention a few. Moreover, in current information-processing applications, data cannot be meaningfully searched by precise database queries that return exact matches (e.g. when dealing with multimedia, proteomic, or statistical data).
7.
Research on Data Integration in Web Data Mining  (Total citations: 3, self-citations: 0, citations by others: 3)
Based on an analysis of the characteristics of data sources in the Web environment, this paper studies the data-integration problem in Web data mining in depth and presents an integration scheme based on XML technology. The scheme integrates different data sources through a Web data-access layer, providing Web data mining with a unified, effective data set and solving the difficult problem of integrating heterogeneous Web data sources. A concrete example illustrates the Web data-integration process.
8.
This paper analyzes the current state of, and problems with, enterprise resource sharing and information integration in networked manufacturing environments. Combining Semantic Web technology, it proposes an architecture for an enterprise resource-information integration platform in such environments, providing a way to integrate enterprise resources at the semantic level and over an enterprise ontology. The architecture was applied in the non-ferrous metals industry, and practical use verified the reliability and stability of the system.
9.
This paper describes a process for mashing heterogeneous data sources based on the Multi-data source Fusion Approach (MFA) (Nachouki and Quafafou, 2008 [52]). The aim of MFA is to facilitate the fusion of heterogeneous data sources in dynamic contexts such as the Web. Data sources are either static or active: static data sources can be structured or semi-structured (e.g. XML documents or databases), whereas active sources are services (e.g. Web services). Our main objective is to combine (Web) data sources with minimal effort required from the user. This objective is crucial because the mashing process implies easy and fast integration of data sources. We suppose that the user is not an expert in this field but understands the meaning of the data being integrated. In this paper, we consider two important aspects of the Web mashing process. The first concerns information extraction from the Web; the results of this process are the static data sources that are used later, together with services, to create a new result/application. The second concerns the problem of semantic reconciliation of data sources: this step generates a Conflicts data source in order to ease the rewriting of semantic queries into sub-queries (not addressed in this paper) over the data sources. We give the design of our system, MDSManager, and show this process through a real-life application.
10.
Research on Integrating Heterogeneous Data Resources in a Semantic Grid Environment  (Total citations: 2, self-citations: 1, citations by others: 2)
At present, university information resources are built on different platforms with different data structures and implementations, and the underlying databases are geographically distributed as well as heterogeneous, which makes it very inconvenient for users to select and use them. Eliminating these "information islands" among universities through an effective mechanism has therefore become a pressing problem in the informatization process. Applying semantic grid theory, a semantic grid model, CI-Grid, is constructed to realize a mechanism for integrating distributed heterogeneous database resources by means of an ontology; it effectively solves this problem and has good application prospects.
11.
12.
13.
E. Sakkopoulos, D. Antoniou, N. Tsirakis 《Journal of Systems and Software》2010,83(11):2200-2210
The explosive growth in the size and use of the World Wide Web continuously creates great new challenges and needs. The need to predict users' preferences in order to expedite and improve browsing through a site can be met by personalizing Websites. Recommendation and personalization algorithms aim at suggesting Web pages to users based on their current visit and past users' navigational patterns. The problem that we address is the case where a few Web pages become very popular for short periods of time and are accessed very frequently in a limited temporal space. Our aim is to deal with these bursts of visits and suggest these highly accessed pages to future users that have common interests. Hence, in this paper, we propose a new Web personalization technique based on advanced data structures. The data structures used are the splay tree (1) and binary heaps (2). We describe the architecture of the technique, analyze its time and space complexity, and demonstrate its performance. In addition, we compare the proposed technique both theoretically and experimentally to another approach to verify its efficiency. Our solution achieves O(P^2) space complexity and runs in O(k log P) time, where k is the number of pages and P the number of categories of Web pages.
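A minimal sketch of the burst-detection idea: count page accesses in a visit window and surface the top-k hot pages with a heap. This uses Python's heapq rather than the paper's splay-tree/binary-heap pairing, so it illustrates the goal, not the exact data structures or complexity bounds; the visit log is invented.

```python
import heapq
from collections import Counter

def bursty_pages(visit_log, k=2):
    """Return the k most frequently accessed pages in a visit window.
    heapq.nlargest keeps the selection at O(n log k); the paper instead
    pairs binary heaps with a splay tree so hot pages stay near the root."""
    counts = Counter(visit_log)
    return heapq.nlargest(k, counts, key=counts.__getitem__)

log = ["home", "news", "news", "promo", "news", "promo"]
print(bursty_pages(log))  # ['news', 'promo']
```

In a recommender, the returned pages would be suggested to subsequent visitors with similar navigation patterns for as long as the burst lasts.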
14.
许国艳 (Xu Guoyan) 《计算机工程与设计》(Computer Engineering and Design) 2006,27(10):1791-1792,1796
Data integration is the core problem in sharing distributed heterogeneous data resources. Building on an analysis of common data-integration techniques, and combining Web Services and component technology, a data-integration scheme that implements a mediated system with Web Services and components is proposed. Finally, a service-oriented, loosely coupled data-integration framework on the J2EE platform is given: the mediator and the wrappers are implemented as EJB components, and the Web services deployed from these components provide users with a transparent, unified interface, realizing the sharing and integration of remote heterogeneous data resources.
15.
16.
Federica Cena 《Journal of Intelligent Information Systems》2011,36(2):131-166
Nowadays there are a great number of Web information systems that build a model of the user and adapt their services to the needs and preferences maintained by the user model (UM). One of the most challenging issues in this scenario is enabling different systems to cooperate in order to exchange the available information about a user. Our aim is to create rich (and scalable) communication protocols and infrastructures that enable consumers and providers of UM data to interact. Our solution is to exploit Web standards for interoperability (i.e. the Semantic Web and Web Services) for implementing simple atomic communication, and a dialogue model for implementing enhanced communication capabilities. In particular, two systems can start a semantics-enhanced Dialogue Game as a form of negotiation to clarify the meaning of the requested concepts when a shared knowledge model does not exist, and to approximate the response when the exact one is not available. We propose a distributed semantic conversation framework based on the Sesame semantic environment for the exchange of user-model knowledge on the Web. Systems have to expose their user-model data as a Web Service and exploit a public dialogue knowledge base to start the dialogue. The main advantage of the approach is that it allows systems to deal with difficult situations by starting an appropriate dialogue game instead of stopping the communication, as in the traditional "all-or-nothing" Web Service approach. In a preliminary evaluation, the approach improved the adaptation results provided by the systems we tested.
17.
Hung-Yu Kao Shian-Hua Lin Jan-Ming Ho Ming-Syan Chen 《Knowledge and Data Engineering, IEEE Transactions on》2004,16(1):41-55
We study the problem of mining the informative structure of a news Web site that consists of thousands of hyperlinked documents. We define the informative structure of a news Web site as a set of index pages (also referred to as TOC, i.e. table-of-contents, pages) and a set of article pages linked by these TOC pages. Based on the Hyperlink-Induced Topic Search (HITS) algorithm, we propose an entropy-based analysis mechanism (LAMIS) for analyzing the entropy of anchor texts and links to eliminate the redundancy of the hyperlinked structure, so that the complex structure of a Web site can be distilled. However, to increase the value and accessibility of pages, most content sites tend to publish their pages with intrasite redundant information, such as navigation panels, advertisements, copyright announcements, etc. To further eliminate such redundancy, we propose another mechanism, called InfoDiscoverer, which applies the distilled structure to identify sets of article pages. InfoDiscoverer also employs the entropy information to analyze the information measures of article sets and to extract informative content blocks from these sets. Our result is useful for search engines, information agents, and crawlers to index, extract, and navigate significant information from a Web site. Experiments on several real news Web sites show that the precision and recall of our approaches are much superior to those obtained by conventional methods of mining the informative structures of news Web sites. On average, the augmented LAMIS leads to prominent performance improvement, increasing precision by a factor ranging from 122 to 257 percent when the desired recall falls between 0.5 and 1. In comparison with manual heuristics, the precision and recall of InfoDiscoverer are greater than 0.956.
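The anchor-text entropy at the heart of this analysis can be sketched as the Shannon entropy of the distribution of anchor strings pointing at a page: uniform anchors (e.g. a repeated "Home" navigation link) score zero, while diverse, content-bearing anchors score high. The example anchor lists are invented, and this is only the scoring primitive, not the full LAMIS link analysis.

```python
import math
from collections import Counter

def anchor_entropy(anchor_texts):
    """Shannon entropy (in bits) of the anchor-text distribution for a
    page; low entropy suggests a redundant navigation link, high entropy
    suggests links carrying varied, informative descriptions."""
    counts = Counter(anchor_texts)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values())

print(anchor_entropy(["Home", "Home", "Home", "Home"]) == 0.0)  # True
print(round(anchor_entropy(["Election", "Vote", "Results", "Poll"]), 1))  # 2.0
```

Links whose entropy falls below a threshold would be pruned as boilerplate before the HITS-style structure distillation.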
18.
The aim of this paper is to propose and discuss the structure of an automatic system for reasoning about data in inspection missions concerning underwater structures. The system is mainly intended to support the operator's activity during surveys of undersea pipelines, and it represents a first step in the development of possible fully automatic systems for unmanned inspection missions. The structure of the system and the procedures it implements are described in detail, and the results of suitable simulations are presented.
19.
20.
The study is concerned with the development of information granules and their use in data analysis. The design of the information granules exploits a concept of overshadowing, meaning that we retain a given level of membership in a given concept unless faced with contrary evidence. A detailed algorithm is provided and illustrated through a number of numerical studies. The idea of noninvasive data analysis is then introduced and discussed from the standpoint of the minimal level of structural dependencies to be used in the model.