Similar literature
Found 8 similar articles (search time: 15 ms)
1.
仲志平  仲晓辉 《微机发展》2012,(1):217-220,224
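Data conflicts are one of the central data-quality problems in databases. In a centralized database, SQL-based techniques can effectively detect tuples that violate a given set of conditional functional dependencies (CFDs). However, when the data is partitioned horizontally or vertically and distributed across different sites, detecting conflicts becomes more challenging and often requires moving data from one site to another. This paper proposes an algorithm for detecting CFD conflicts in distributed databases; the algorithm effectively detects CFD conflicts in horizontally partitioned data while reducing the amount of data transferred. Experimental results confirm that the algorithm is effective.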
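
The idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a single CFD with a wildcard right-hand side, and every name in it (`local_summary`, `find_violations`, the sample rows) is hypothetical. Each site reduces its horizontal fragment to a small summary of left-hand-side values and the right-hand-side values seen with them, so only summaries, not tuples, travel to the coordinator.

```python
from collections import defaultdict

WILDCARD = "_"


def matches(row, attrs, pattern):
    """True if the row agrees with every constant in the pattern; wildcards match anything."""
    return all(p == WILDCARD or row[a] == p for a, p in zip(attrs, pattern))


def local_summary(fragment, lhs, rhs, lhs_pattern):
    """Run at one site: map each matching LHS value to the set of RHS values seen locally."""
    summary = defaultdict(set)
    for row in fragment:
        if matches(row, lhs, lhs_pattern):
            key = tuple(row[a] for a in lhs)
            summary[key].add(row[rhs])
    return dict(summary)


def find_violations(summaries):
    """Run at the coordinator: merge the per-site summaries and keep the conflicting keys."""
    merged = defaultdict(set)
    for summary in summaries:
        for key, values in summary.items():
            merged[key] |= values
    return {key: values for key, values in merged.items() if len(values) > 1}


# Hypothetical data: one relation split horizontally across two sites.
site_a = [
    {"country": "UK", "zip": "EH4", "city": "Edinburgh"},
    {"country": "US", "zip": "10001", "city": "New York"},
]
site_b = [
    {"country": "UK", "zip": "EH4", "city": "London"},
    {"country": "US", "zip": "10001", "city": "NYC"},
]

# Assumed CFD: for tuples with country = "UK", zip determines city.
lhs, rhs, lhs_pattern = ("country", "zip"), "city", ("UK", WILDCARD)
summaries = [local_summary(f, lhs, rhs, lhs_pattern) for f in (site_a, site_b)]
violations = find_violations(summaries)
```

Neither site sees a conflict on its own; the conflict on `("UK", "EH4")` only appears once the summaries are merged. The US rows disagree as well, but they fall outside the pattern and are ignored.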

2.
Database Research: Achievements and Challenges   (Cited by: 1 total, 1 self-citation, 1 by others)
Database systems are the infrastructure of modern information systems, and R&D in database systems and their technologies is one of the important research topics in the field. Database R&D in China took off later than elsewhere, but it has advanced by giant steps. This report presents the achievements that Renmin University of China (RUC) has made in the past 25 years and describes some of the research projects that RUC is currently working on. The National Natural Science Foundation of China has supported and initiated most of these research projects, and the successfully completed projects have produced fruitful results.

3.
Several studies have repeatedly demonstrated that both the performance and scalability of a shared-nothing parallel database system depend on the physical layout of data across the processing nodes of the system. Today, data is allocated in these systems using horizontal partitioning strategies. This approach has a number of drawbacks. If a query involves the partitioning attribute, then typically only a small number of the processing nodes can be used to speed up the execution of this query. On the other hand, if the predicate of a selection query includes an attribute other than the partitioning attribute, then the entire data space must be searched. Again, this results in a waste of computing resources. In recent years, several multidimensional data declustering techniques have been proposed to address these problems. However, these schemes are either too restrictive (e.g., FX, ECC, etc.) or optimized for certain types of queries (e.g., DM, HCAM, etc.). In this paper, we introduce a new technique which is flexible and performs well for general queries. We prove its optimality properties, and present experimental results showing that our scheme outperforms DM and HCAM by a significant margin.
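
The two drawbacks of single-attribute horizontal partitioning described above can be made concrete with a small sketch. It assumes range partitioning on one attribute; the function names, the attribute names and the boundary values are hypothetical, and the sketch does not implement the declustering technique the paper introduces.

```python
from bisect import bisect_right


def node_for(value, boundaries):
    """Node holding value under range partitioning.

    boundaries are sorted split points: node 0 holds values below boundaries[0],
    and the last node holds values at or above boundaries[-1].
    """
    return bisect_right(boundaries, value)


def nodes_for_selection(attribute, low, high, partition_attribute, boundaries):
    """Nodes that must be scanned to answer low <= attribute <= high."""
    num_nodes = len(boundaries) + 1
    if attribute != partition_attribute:
        # The layout says nothing about this attribute, so every node is searched.
        return list(range(num_nodes))
    return list(range(node_for(low, boundaries), node_for(high, boundaries) + 1))


# Hypothetical layout: four nodes, range-partitioned on "salary".
boundaries = [100, 200, 300]
on_partition_attribute = nodes_for_selection("salary", 120, 150, "salary", boundaries)
on_other_attribute = nodes_for_selection("age", 30, 40, "salary", boundaries)
```

The selection on `salary` touches a single node, so the other three cannot contribute any parallelism, while the selection on `age` has to scan all four nodes.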

4.
Matching dependencies (MDs) are used to declaratively specify the identification (or matching) of certain attribute values in pairs of database tuples when some similarity conditions on other values are satisfied. Their enforcement can be seen as a natural generalization of entity resolution. In what we call the pure case of MD enforcement, an arbitrary value from the underlying data domain can be used for the value in common that is used for a matching. However, the overall number of changes of attribute values is expected to be kept to a minimum. We investigate this case in terms of semantics and the properties of data cleaning through the enforcement of MDs. We characterize the intended clean instances, and also the clean answers to queries, as those that are invariant under the cleaning process. The complexity of computing clean instances and clean query answering is investigated. Tractable and intractable cases depending on the MDs are identified and characterized.
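
As a rough illustration of what enforcing a single MD involves, the sketch below uses an assumed dependency "if two names are similar, their phone values must be made identical". The similarity test (a `difflib` ratio with a 0.9 threshold), the helper names and the sample tuples are all assumptions made for this example; the abstract does not specify any of them, and the sketch does not reproduce the paper's semantics of clean instances.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar(a, b, threshold=0.9):
    """Assumed similarity condition: case-insensitive difflib ratio at or above threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def unmatched_pairs(rows, similar_on, match_on):
    """Index pairs whose similar_on values are similar but whose match_on values differ."""
    return [
        (i, j)
        for (i, r), (j, s) in combinations(enumerate(rows), 2)
        if similar(r[similar_on], s[similar_on]) and r[match_on] != s[match_on]
    ]


def enforce_once(rows, pair, match_on, value):
    """Return a copy of rows in which both tuples of the pair take the chosen common value."""
    updated = [dict(r) for r in rows]
    for index in pair:
        updated[index][match_on] = value
    return updated


# Hypothetical instance.
people = [
    {"name": "John Smith", "phone": "555-0100"},
    {"name": "Jon Smith", "phone": "555-0199"},
    {"name": "Mary Jones", "phone": "555-0142"},
]

pairs = unmatched_pairs(people, "name", "phone")
# In the pure case any domain value may serve as the common value; reusing one of
# the two existing values changes only a single attribute value.
cleaned = enforce_once(people, pairs[0], "phone", people[0]["phone"])
```

After this step `unmatched_pairs(cleaned, "name", "phone")` is empty, so applying the same MD again changes nothing, which mirrors the abstract's notion of an instance that is invariant under the cleaning process.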

5.
An inherent limitation in mobile data access is the unreliable, low-bandwidth wireless communication channel. Caching useful database items from the database server in the local storage of mobile clients is effective in reducing data access latency and wireless bandwidth consumption. In the event of disconnection, cached data can also be used for partial query processing. In this paper, we present the implementation and evaluation of MODEC, a new caching mechanism for object-oriented database systems in a mobile environment. MODEC can cache at multiple granularities and adapt to changes in the data access pattern, and it improves performance by tolerating limited inconsistency for read-only transactions. These caching capabilities are supported via standard ODMG modeling constructs. The prototype of MODEC is implemented using the ODE database. Empirical system performance results are obtained from experiments on the prototype with data from a real-life database. The results are validated against results obtained via detailed simulation studies on MODEC. Both sets of results are found to be consistent, and they indicate that MODEC is a feasible solution to the mobile data access problem under the constraints of a mobile environment.

6.
In this paper, a Graph-based semantic Data Model (GDM) is proposed with the primary objective of bridging the gap between the human perception of an enterprise and the need of the computing infrastructure to organize information in a particular manner for efficient storage and retrieval. GDM is proposed as an alternative data model that combines the advantages of the relational model with the positive features of semantic data models. It offers the designer a structural representation to interact with, making it easier to comprehend the complex relations among basic data items. GDM allows an entire database to be viewed as a Graph (V, E) in a layered organization. Here, a graph is created in a bottom-up fashion, where V represents the basic instances of data or a functionally abstracted module, called a primary semantic group (PSG) or a secondary semantic group (SSG). An edge in the model implies a relationship among the secondary semantic groups. The contents of the lowest layer are the semantically grouped data values in the form of primary semantic groups. The SSGs are higher-level abstractions, created by encapsulating various PSGs, SSGs and basic data elements. This encapsulation continues to generate secondary semantic groups until the designer considers the abstraction sufficient to declare the actual problem domain. GDM thus uses the standard abstractions available in a semantic data model, with a structural representation in terms of a graph. The operations on the data model are formalized in the proposed graph algebra. A Graph Query Language (GQL) is also developed, maintaining similarity with the widely accepted, user-friendly SQL. Finally, the paper presents a methodology for making GDM compatible with a distributed environment, and, for the sake of completeness, suggests a corresponding query processing technique for that environment.

7.
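Obtaining provincial carbon emission data quickly and accurately is a prerequisite for formulating differentiated carbon-reduction policies in real time. Based on DMSP/OLS and NPP-VIIRS nighttime light data, this study uses the statistical data comparison method to extract the total nighttime light value (denoted TDN) of the built-up areas of each province in mainland China (excluding Tibet) for 1997 to 2017, and builds a carbon emission prediction model for each province from its 1997 to 2014 TDN values and the carbon emissions accounted for the same period. The 2015 to 2017 TDN values are then used as the independent variable to estimate each province's carbon emissions. In parallel, the entropy method and a carbon emission allocation model are used to allocate China's carbon emissions, as published by four authoritative international databases (IEA, EIA, EDGAR and CEADs), to the individual provinces. Finally, the estimates are compared with the provincial emission values allocated from the four databases. The results show that the estimated provincial emissions are broadly consistent with the allocated ones, with a mean absolute percentage error (MAPE) of 6.45% to 9.12%, and that the estimates based on nighttime light data are closer to the values allocated from the IEA and EIA databases. For every province, the estimated and allocated emissions fall near the 1:1 line. The MAPE of individual provinces ranges from 0.68% to 14.85%, and for most provinces it is within 10.0%. These results show that estimating provincial carbon emissions from TDN values extracted from nighttime light data is feasible and accurate.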
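
The mean absolute percentage error (MAPE) reported in the abstract above can be computed as in the sketch below. Which series serves as the reference is an assumption here (the database-allocated values are used as the denominator), and all the numbers are made up for illustration; none of them comes from the study.

```python
def mape(reference, estimated):
    """Mean absolute percentage error, in percent, of estimated relative to reference."""
    if not reference or len(reference) != len(estimated):
        raise ValueError("both series must be non-empty and of equal length")
    if any(r == 0 for r in reference):
        raise ValueError("reference values must be non-zero")
    relative_errors = [abs((r - e) / r) for r, e in zip(reference, estimated)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Hypothetical provincial emissions (arbitrary units).
allocated = [100.0, 200.0, 400.0]   # assumed reference: values allocated from a database
estimated = [110.0, 190.0, 400.0]   # assumed estimates from nighttime light data

error_percent = mape(allocated, estimated)
```

With these made-up values the per-province errors are 10%, 5% and 0%, so the MAPE comes out at about 5%.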

8.

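With the growing popularity of smart mobile devices, people increasingly use social network platforms (such as Twitter and Sina Weibo) to obtain information, comment and communicate. Although Global Positioning System (GPS) devices can obtain location information precisely, many users do not share their location directly, out of privacy and security concerns. How to obtain the geographic location of online users has therefore become a frontier research area and an important topic of shared interest to academia and industry, and it underpins many downstream applications, such as location-based targeted advertising, event and place recommendation, early warning of natural disasters or disease, and the tracking of cybercrime. This survey summarizes in detail the methods, data, evaluation frameworks and basic algorithms for predicting the geographic location of social network users. First, it categorizes the different geolocation tasks and their evaluation metrics. Second, for each task it reviews the data types and data fusion approaches used, and analyzes in detail the existing information extraction and feature selection approaches together with their strengths and weaknesses. Third, it classifies existing geolocation models and algorithms, describing and analyzing user geolocation methods from three perspectives: gazetteers, traditional machine learning and deep learning. Finally, it summarizes the difficulties and challenges of predicting the geographic location of social network users and looks ahead to the field's development trends and the directions that future research should address.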

