Similar literature
 20 similar documents found; search time 0 ms
1.
This paper presents and evaluates a simple but very effective method for implementing large data warehouses on an arbitrary number of computers, achieving very high query execution performance and scalability. The data is distributed and processed across a potentially large number of autonomous computers using our technique, called data warehouse striping (DWS). The major problem with the DWS technique is that it would require a very expensive cluster of computers with fault-tolerant capabilities to prevent a fault in a single computer from stopping the whole system. In this paper, we propose a radically different approach to the unavailability of one or more computers in the cluster, allowing DWS to be used with a very large number of inexpensive computers. The proposed approach is based on approximate query answering techniques that make it possible to deliver an approximate answer to the user even when one or more computers in the cluster are unavailable. The evaluation presented in the paper shows, both analytically and experimentally, that the approximate results obtained this way have a very small error that is negligible in most cases.
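The idea of answering from the surviving nodes can be illustrated with a minimal sketch. It assumes rows are striped uniformly across the nodes, so that each node holds a roughly representative share of the data; the scaling estimator and every name below are illustrative assumptions, not the formulas from the paper.

```python
from typing import Optional, Sequence


def approximate_sum(partial_sums: Sequence[Optional[float]]) -> float:
    """Estimate a global SUM from per-node partial sums.

    partial_sums[i] is the SUM computed locally by node i, or None when
    node i is unavailable. Assumes uniform striping (hypothetical setup).
    """
    available = [s for s in partial_sums if s is not None]
    if not available:
        raise ValueError("no node is available")
    # Scale the observed total by (all nodes / available nodes).
    return sum(available) * len(partial_sums) / len(available)


def approximate_avg(partial_sums: Sequence[Optional[float]],
                    partial_counts: Sequence[Optional[int]]) -> float:
    """Estimate a global AVG using only the nodes that answered."""
    pairs = [(s, c) for s, c in zip(partial_sums, partial_counts)
             if s is not None and c is not None]
    total_count = sum(c for _, c in pairs)
    if total_count == 0:
        raise ValueError("no rows available")
    return sum(s for s, _ in pairs) / total_count
```

With four nodes and one of them down, `approximate_sum([10.0, 12.0, None, 8.0])` scales the observed total of 30 by 4/3. When every node answers, the estimate equals the exact sum.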

2.
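Conditional functional dependencies (CFDs) extend functional dependencies (FDs) with semantic constraints, and they outperform FDs in database consistency checking and data cleaning. This paper discusses the concepts and basic properties of CFDs and how to apply them to data cleaning, proposes improvements to existing CFD-based data cleaning schemes, and demonstrates the feasibility of the improvements through experiments.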
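As an illustration of how a CFD can flag dirty tuples, here is a minimal sketch using the common textbook formulation: a CFD is an FD plus a pattern tuple whose entries are either constants or a wildcard. The representation, the function names, and the sample data are assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations

WILDCARD = "_"  # pattern entry that matches any value


def matches(row, attrs, pattern):
    """True if row agrees with the pattern on every attribute in attrs."""
    return all(pattern[a] == WILDCARD or row[a] == pattern[a] for a in attrs)


def cfd_violations(rows, lhs, rhs, pattern):
    """Return sorted indices of rows that violate the CFD (lhs -> rhs, pattern)."""
    bad = set()
    # Only rows matching the left-hand-side pattern are constrained.
    selected = [(i, r) for i, r in enumerate(rows) if matches(r, lhs, pattern)]
    # Single-tuple violations: a right-hand-side constant is not respected.
    for i, r in selected:
        if not matches(r, rhs, pattern):
            bad.add(i)
    # Pairwise violations: equal on lhs but different on rhs.
    for (i, r), (j, s) in combinations(selected, 2):
        if all(r[a] == s[a] for a in lhs) and any(r[a] != s[a] for a in rhs):
            bad.update((i, j))
    return sorted(bad)


rows = [
    {"country": "UK", "zip": "EH4", "street": "Mayfield"},
    {"country": "UK", "zip": "EH4", "street": "Crichton"},
    {"country": "US", "zip": "10001", "street": "A"},
    {"country": "US", "zip": "10001", "street": "B"},
]
# "Within the UK, zip determines street"; US rows are not constrained.
pattern = {"country": "UK", "zip": WILDCARD, "street": WILDCARD}
violations = cfd_violations(rows, ["country", "zip"], ["street"], pattern)
```

A plain FD `[country, zip] -> [street]` would also flag the two US rows; the pattern tuple restricts the check to the rows where the rule is meant to hold.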

3.
4.
The use of Generalized Quantifiers (GQs) in query languages was introduced independently in (Hsu and Parker, 1995; Gyssens et al., 1995). In both cases it is argued that GQs make query languages better able to handle complex queries in a declarative way and provide a syntax closer to natural language. In this paper we argue that query languages with Generalized Quantifiers can be used to support cooperative question answering (Gaasterland et al., 1992). We introduce the Query Language with Generalized Quantifiers (QLGQ) and review related work in cooperative query answering, focusing on research that has direct connections with the results of this paper. Then we show how to use Generalized Quantifiers in dealing with false presuppositions, constructing justifications, and query relaxation. For each technique, we give examples that suggest that Generalized Quantifiers are better suited to the application of the technique than traditional approaches.

5.
Query Answering for OWL-DL with rules   Total citations: 2, self-citations: 0, citations by others: 2
Both OWL-DL and function-free Horn rules are decidable fragments of first-order logic with interesting, yet orthogonal, expressive power. A combination of OWL-DL and rules is desirable for the Semantic Web; however, it might easily lead to the undecidability of interesting reasoning problems. Here, we present such a decidable combination, in which rules are required to be DL-safe: each variable in the rule is required to occur in a non-DL-atom in the rule body. We discuss the expressive power of such a combination and present an algorithm for query answering in the related logic extended with DL-safe rules, based on a reduction to disjunctive programs.

6.
Caching stores the results of previously answered queries in order to answer succeeding queries faster by reusing these results. We propose two different approaches for using caches of XSLT-transformed XML data in order to answer queries. The first approach checks whether a current query Q can be answered directly from the result of a previously answered query Qi stored in the cache. Otherwise, the new query is submitted to the source over the network, the answer to the query is determined, transmitted back to the client, and stored in the cache. The second approach determines only the difference Q−Qi and integrates the result of Q−Qi into the previous results in the cache, which requires applying a numbering scheme to the output of the XSLT stylesheet. We show by experimental results that the second approach can significantly speed up the answering time in comparison to the first approach, and that in the few worst cases it is not significantly slower than the first approach.

7.
8.
Schema matching is one of the key challenges in information integration. It is a labor-intensive and time-consuming process. To alleviate the problem, many automated solutions have been proposed. Most of the existing solutions rely mainly on the textual similarity of the data to be matched. However, there exist instances of the schema matching problem to which they do not even apply. Such problem instances typically arise when the column names in the schemas and the data in the columns are opaque or very difficult to interpret. In our previous work [36] we proposed a two-step technique to address this problem. In the first step, we measure the dependencies between attributes within tables using an information-theoretic measure and construct a dependency graph for each table capturing the dependencies among attributes. In the second step, we find matching node pairs across the dependency graphs by running a graph matching algorithm. In our previous work, we experimentally validated the accuracy of the approach. One remaining challenge is the computational complexity of the graph matching problem in the second step. In this paper we extend the previous work by improving the second phase of the algorithm, incorporating efficient approximation algorithms into the framework.
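The first step described above can be sketched as follows. The abstract says only "an information-theoretic measure"; this sketch assumes mutual information between columns, estimated from value frequencies, and the function names and table layout are illustrative assumptions. The graph matching of the second step is not shown.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired column values."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def dependency_graph(table):
    """Build a weighted dependency graph for one table.

    table maps a column name to its list of values (hypothetical layout).
    Each edge (a, b) is weighted by the mutual information of the two
    columns; the self-edge (a, a) reduces to the entropy of column a.
    """
    cols = list(table)
    return {(a, b): mutual_information(table[a], table[b])
            for a in cols for b in cols}


table = {
    "c1": [0, 0, 1, 1],
    "c2": ["x", "x", "y", "y"],  # fully determined by c1
    "c3": [0, 1, 0, 1],          # independent of c1
}
graph = dependency_graph(table)
```

Because the weights depend only on how values co-occur and not on what the values or column names mean, two tables with opaque names can still be compared through the shape of their graphs.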

9.
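Cache Processing of Queries under Disconnection   Total citations: 5, self-citations: 0, citations by others: 5
吴婷婷, 章文嵩, 周兴铭. 《计算机学报》 (Chinese Journal of Computers), 2003, 26(10): 1393-1399
In mobile environments, wireless networks are unreliable and expensive, and mobile hosts are constrained by power supply and resources, so mobile hosts are often disconnected, actively or passively, that is, left without a network connection. To improve mobile clients' access to data while disconnected and to make effective use of the mobile cache, this paper proposes QPID, a query processing algorithm based on semantic caching under disconnection. The main idea of the algorithm is to first find the cache items related to the current query, and then to process the data of those items further to obtain the results in the cache that satisfy the query. Experiments show that query processing based on the QPID algorithm better satisfies clients' query requests under disconnection.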

10.
When answering queries using external information sources, the contents of the sources can be described by views. To answer a query, we must rewrite it using the set of views presented by the sources. When the external information sources also have the ability to answer some (perhaps limited) sets of queries that require performing operations on their data, the set of views presented by a source may be infinite (albeit encoded in some finite fashion). Previous work on answering queries using views has only considered the case where the set of views is finite. In order to exploit the ability of information sources to answer more complex queries, we consider the problem of answering conjunctive queries using infinite sets of conjunctive views. Our first result is that an infinite set of conjunctive views can be partitioned into a finite number of equivalence classes, such that picking one view from every nonempty class is sufficient to determine whether the query can be answered using the views. Second, we show how to compute the set of equivalence classes for sets of conjunctive views encoded by a datalog program. Furthermore, we extend our results to the case when the query and the views use the built-in predicates <, ⩽, =, and ≠, and they are interpreted over a dense domain. Finally, we extend our results to conjunctive queries and views with the built-in predicates <, ⩽, and = interpreted over the integers. In doing so we present a result of independent interest, namely, an algorithm to minimize such queries.

11.
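Natural language input is segmented into words, and LINQ is used to match the words against the features stored in a feature database, returning the query results that satisfy the constraints. Practice shows that the LINQ-based data query matching algorithm improves query efficiency considerably while maintaining recall and precision.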

12.
13.
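This paper analyzes the anomalous data dependencies that occur in XML documents and proposes the normal forms and rules that correspond to their normalization.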

14.
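Using VBA statements to operate on Excel documents makes it possible to quickly unify the format of document data, to generate new ID card numbers from old ones, and to compare corresponding records across large volumes of data and write the matching records into a new Excel document. This reduces the workload and allows tasks to be completed on time and with good quality.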
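The ID number conversion mentioned above can be sketched as follows. The article works in VBA and its code is not shown here, so this Python version is an independent illustration. It assumes the usual upgrade from a 15-digit to an 18-digit mainland Chinese resident ID: a century prefix is inserted before the two-digit birth year, and a check character is appended using the weighted mod-11 scheme of GB 11643-1999. Defaulting the century to "19" is an assumption that the caller can override.

```python
# Position weights and check characters of the mod-11 scheme (GB 11643-1999).
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
CHECK_CHARS = "10X98765432"


def upgrade_id(old_id: str, century: str = "19") -> str:
    """Convert a 15-digit ID number to its 18-digit form.

    The century prefix is an assumption: holders of 15-digit numbers
    were almost always born in the 1900s, but that is not guaranteed.
    """
    if len(old_id) != 15 or not (old_id.isascii() and old_id.isdigit()):
        raise ValueError("expected exactly 15 ASCII digits")
    # 6-digit region code + 4-digit year + MMDD + 3-digit sequence number.
    body = old_id[:6] + century + old_id[6:]
    total = sum(int(d) * w for d, w in zip(body, WEIGHTS))
    return body + CHECK_CHARS[total % 11]


new_id = upgrade_id("110105491231002")
```

In a spreadsheet workflow, a function like this would be applied to each cell of the ID column, and the result would be written to the output sheet.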

15.
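Schema matching is a difficult problem in schema integration, data warehousing, e-commerce, semantic querying, and related fields. It mainly uses element-level information (such as element names and data types), data instance information (the data in the schema), and structural information (the relationships among schema elements) to mine element semantics and obtain correct mappings. This paper introduces a new method that combines data instance information with structural information to assist matching. The method first computes the degree of partial functional dependency between schema elements (schema structural information) from the data instances of the schema, then builds a dependency graph among the schema elements from the partial functional dependencies, then computes the structural similarity between elements from the dependency graphs, and finally obtains the mappings between schema elements. Because it uses more structural information to assist matching, the method outperforms other methods that match using only the structural information of full functional dependencies. Experiments show that the method is better than other existing methods on precision, recall, and the overall measure.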

16.
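李志平, 孙瑜. 《计算机工程与应用》 (Computer Engineering and Applications), 2004, 40(34): 186-187, 194
This paper studies query systems in depth, proposes a formal model of an ontology-based intelligent query system, and analyzes the system's operation in detail. The model takes full account of the semantic information in user queries and introduces an ontology environment and a user query environment to model the system. An ontology layer that describes the semantic information of the databases is introduced into a heterogeneous, distributed database system, which makes querying more convenient for users while increasing the relevance of the query results and user satisfaction. In addition, the system can promptly reflect dynamic changes in the database information.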

17.
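With the development of the Semantic Web, ontologies have become the main means of expressing knowledge in many fields. Many fields have built ontologies according to their own needs to describe their domain knowledge. However, many current semantic queries over ontologies can query only a single ontology. To enable one query to access multiple ontologies and return appropriate results, this paper proposes a method for querying multiple ontologies using ontology mapping. The mapping method is a semantics-based combination of multiple strategies. Experiments show that query time grows roughly linearly with the number of ontologies and does not increase with the degree of heterogeneity among the ontologies.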

18.
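With the development of the Semantic Web, ontologies have become the main means of expressing knowledge in many fields. Many fields have built ontologies according to their own needs to describe their domain knowledge. However, many current semantic queries over ontologies can query only a single ontology. To enable one query to access multiple ontologies and return appropriate results, this paper proposes a method for querying multiple ontologies using ontology mapping. The mapping method is a semantics-based combination of multiple strategies. Experiments show that query time grows roughly linearly with the number of ontologies and does not increase with the degree of heterogeneity among the ontologies.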

19.
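A Method for Cleaning Incomplete Data in Data Cleaning   Total citations: 7, self-citations: 0, citations by others: 7
This paper proposes an effective cleaning method for the incomplete data that appears in data sources.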

20.
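An Intelligent Query System Model Based on Multi-Domain Ontologies   Total citations: 5, self-citations: 0, citations by others: 5
孙瑜, 李志平. 《计算机工程》 (Computer Engineering), 2005, 31(13): 148-150
This paper proposes a formal model of an intelligent query system based on multi-domain ontologies and analyzes the system and its characteristics in detail. The model introduces an ontology layer to describe the semantics and information content of the knowledge base system and fully captures the semantic information in user queries, which both makes querying more convenient for users and increases the relevance of the query results. In addition, the system can promptly reflect dynamic changes in the knowledge base, and it alleviates the consistency and efficiency problems caused by a single global ontology.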
