Similar literature
 18 similar documents found (search time: 218 ms)
1.
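To address the bottlenecks that relational databases encounter in big data processing and to meet enterprise demand for big data processing, this paper proposes migrating relational databases to a NoSQL document database. It studies how to convert the relational model of an RDBMS into the collection model of MongoDB, and proposes a directed-graph model that represents referential integrity between relations, together with an algorithm that automatically converts a relational data model into a MongoDB document model. An insertion algorithm that migrates data from the RDBMS into MongoDB is also implemented. The scheme and algorithms are validated on an instance of MySQL, a typical open-source RDBMS, covering generation of the relational directed-graph model, the graph-based conversion algorithm, and the data migration algorithm. The experimental results show that an RDBMS can be migrated smoothly to MongoDB following a defined data structure.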
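The abstract above does not give the algorithms themselves, so the following is only a minimal sketch of the general idea, under stated assumptions: foreign keys are modeled as directed edges from the referencing table to the referenced table, and rows of referencing tables are embedded as sub-documents of the rows they point to. The names `build_reference_graph` and `to_documents`, the sample tables, and the `id` primary-key convention are all hypothetical and not taken from the paper.

```python
from collections import defaultdict


def build_reference_graph(foreign_keys):
    """Directed graph with one edge child_table -> parent_table per foreign key."""
    graph = defaultdict(list)
    for child, column, parent in foreign_keys:
        graph[child].append((column, parent))
    return dict(graph)


def to_documents(tables, foreign_keys, root, pk="id"):
    """Embed the rows of each table that references `root` into the root rows."""
    documents = []
    for row in tables[root]:
        doc = dict(row)
        for child, column, parent in foreign_keys:
            if parent == root:
                # Drop the foreign-key column: nesting already expresses the link.
                doc[child] = [
                    {k: v for k, v in r.items() if k != column}
                    for r in tables[child]
                    if r[column] == row[pk]
                ]
        documents.append(doc)
    return documents


# Hypothetical sample data standing in for MySQL tables.
tables = {
    "customers": [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}],
    "orders": [
        {"id": 10, "customer_id": 1, "total": 25},
        {"id": 11, "customer_id": 1, "total": 40},
    ],
}
foreign_keys = [("orders", "customer_id", "customers")]

graph = build_reference_graph(foreign_keys)
docs = to_documents(tables, foreign_keys, "customers")
```

Each dictionary in `docs` has the shape of a document that could be inserted into a MongoDB collection. The sketch embeds one level only; a full conversion would have to walk the graph and decide between embedding and referencing.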

2.
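To address the storage and management of massive data in cloud computing, this paper analyzes the respective characteristics of the relational data model and the NoSQL data model and proposes a new data model. Based on the characteristics of the data itself, the model partitions the data horizontally into a collection of entities, with different data entities serving different data applications, and thereby combines the usability of the relational data model with the scalability of the NoSQL data model. The integrity of the data model is ensured by a detailed definition of its data structures, constraints, and data operations. A running example on a prototype system verifies the effectiveness of the model, which offers a feasible approach to cloud data management.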

3.
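Data Models and Their Development History   Total citations: 1 (self-citations: 0, citations by others: 1)
Databases are the technology of data management and an important branch of computer science. After nearly half a century of development, database technology has built a solid theoretical foundation, mature commercial products, and a broad range of applications. A data model describes how data in a database is stored and manipulated. By the form of data organization, data models can be divided into structured models, semi-structured models, OLAP analytical models, and big data models. Structured models were the first to be proposed, from the mid-to-late 1960s to the early 1990s, and mainly include the hierarchical, network, relational, and object-oriented models. In the late 1990s, with the rapid growth of complex applications such as Internet applications and scientific computing, semi-structured models began to appear, including the XML, JSON, and graph models. In the 21st century, as e-commerce, business intelligence, and similar applications developed, data analysis models became a research focus, chiefly relational ROLAP and multidimensional MOLAP. Since 2010, with the rapid development of industrial big data applications, big data models represented by NoSQL and NewSQL database systems have become a new research focus. The paper surveys these data models and analyzes the performance of a representative database system for each.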

4.
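With the development of information technology, data in materials science and related fields is multi-source, heterogeneous, highly extensible, and growing explosively, and traditional relational databases cannot store it; the schemaless storage and high scalability of NoSQL can address this problem. JSON, a storage format commonly used by NoSQL databases, is popular for its simplicity and flexibility. However, NoSQL databases lack schema information, so JSON documents need to be validated and analyzed before they are stored. Most current methods validate the format of JSON documents against JSON Schema and cannot effectively handle anomaly detection or semantic ambiguity in JSON documents. This paper therefore proposes doctorJSON, a model for anomaly detection and semantic disambiguation of JSON documents for NoSQL databases. Based on JSON Schema, the model provides an anomaly detection algorithm, deoutJSON, and a semantic disambiguation algorithm, disemaJSON, to detect anomalies and ambiguities in incoming JSON documents. Experiments on real and synthetic datasets verify the effectiveness and efficiency of the proposed model.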
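To make the idea of checking a JSON document before it is stored more concrete, here is a minimal sketch. It is not the deoutJSON or disemaJSON algorithm, whose details the abstract does not give, and it does not use real JSON Schema: the `schema` here is a simplified, hypothetical mapping from field name to expected Python type, and `find_anomalies` is an illustrative name.

```python
import json


def find_anomalies(document, schema):
    """Return a list of problems; an empty list means the document passes."""
    problems = []
    # Fields the schema expects: must be present and have the right type.
    for field, expected_type in schema.items():
        if field not in document:
            problems.append(f"missing field: {field}")
        elif not isinstance(document[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    # Fields the schema does not know about are reported as well.
    for field in document:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems


# Hypothetical schema and documents for a materials record.
schema = {"name": str, "density": float}
good = json.loads('{"name": "steel", "density": 7.85}')
bad = json.loads('{"name": "steel", "density": "7.85", "colour": "grey"}')

good_report = find_anomalies(good, schema)
bad_report = find_anomalies(bad, schema)
```

A purely structural check like this would flag the string-valued `density`, but it cannot tell whether two differently named fields mean the same thing, which is the kind of semantic ambiguity the abstract says schema validation alone misses.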

5.
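Aimed at IT practitioners and university students, and based on an analysis of the shortcomings of traditional modeling tools, this paper designs a data modeling tool, StarDbDesigner. For different database instances, the tool can build data dictionaries, flexibly configure table structures, and define database objects. The data model design can then be iterated based on the design results. The paper describes the system's goals, logical structure, data model, implementation, and operating workflow, as well as its intended users and usage advantages, and offers a useful reference for IT practitioners and especially for students in computer-related majors.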

6.
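A Survey of NoSQL Systems Supporting Big Data Management   Total citations: 6 (self-citations: 0, citations by others: 6)
Shen Derong, Yu Ge, Wang Xite, Nie Tiezheng, Kou Yue. Journal of Software (《软件学报》), 2013, 24(8): 1786-1803
To meet the new requirements of big data management, many NoSQL database systems targeting specific applications have emerged. This paper surveys research on NoSQL databases based on the key-value data model. It first introduces the characteristics of big data and the key technical problems facing systems that support big data management. It then presents the relevant frontier research and research challenges, typically including system architecture, data models, access methods, indexing techniques, transaction properties, system elasticity, dynamic load balancing, replication strategies, data consistency strategies, flash-based multi-level caching, MapReduce-based data processing strategies, and next-generation data management systems. Finally, it gives a research outlook.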

7.
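This paper introduces two representative NoSQL databases: Bigtable and Dynamo. It first describes the scope of application of Bigtable and Dynamo and the reasons they emerged; both can process web data efficiently and provide the corresponding services. It then introduces the architecture and features of the two systems and their distinctive design approaches. Finally, it compares the two databases with traditional relational databases and describes the differences. The comparison shows that NoSQL databases are efficient and highly available when handling web application data and have an advantage over traditional relational databases.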

8.
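New applications such as social networks and microblogs pose new challenges for data management technology, including efficient storage of massive data, highly concurrent access, high scalability, and high availability. Traditional relational database technology cannot meet the needs of these applications, so the research, development, and application of NoSQL data management technology are receiving growing attention. This paper surveys the current state and trends of NoSQL data management technology from the perspectives of NoSQL data models, data storage, query processing, and hybrid SQL and NoSQL database solutions, and introduces several typical NoSQL products.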

9.
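Lin Feng, Ren Kaiyin, Ni Bin. Computer Engineering (《计算机工程》), 2009, 35(2): 42-43,4
Based on an analysis of conventional data exchange techniques, this paper proposes a general data exchange technique that controls exchange according to specific time intervals and permits exchange according to specific release instructions, meeting the need for precise control over data exchange in real business applications. It describes the time gate and release token techniques in this exchange mechanism, including their main design goals, types, data structures, basic working principles and modes, and the associated data exchange processing flow.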
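The abstract names a time gate and a release token but does not specify how they combine, so the sketch below rests on an assumption: an exchange is allowed only when the current time falls inside the gate window and a release token is present. The function names `in_time_gate` and `may_exchange` and the sample windows are hypothetical.

```python
from datetime import time


def in_time_gate(now, gate_open, gate_close):
    """True if `now` lies in the half-open window [gate_open, gate_close)."""
    if gate_open <= gate_close:
        return gate_open <= now < gate_close
    # The window wraps past midnight, e.g. 22:00 to 02:00.
    return now >= gate_open or now < gate_close


def may_exchange(now, gate, token_present):
    """Assumed policy: both the time gate and the release token must allow it."""
    gate_open, gate_close = gate
    return in_time_gate(now, gate_open, gate_close) and token_present


# Hypothetical gates: a business-hours window and a night window.
day_gate = (time(9, 0), time(17, 0))
night_gate = (time(22, 0), time(2, 0))

allowed = may_exchange(time(12, 0), day_gate, token_present=True)
blocked_no_token = may_exchange(time(12, 0), day_gate, token_present=False)
allowed_at_night = may_exchange(time(23, 30), night_gate, token_present=True)
```

Treating the two controls as a logical AND is the strictest reading; the paper may define other modes in which either control alone suffices.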

10.
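Before enterprises share data, the first consideration is the mapping between their data models, yet data models are continually replaced and upgraded, which makes data integration very difficult. This paper designs a data dictionary metamodel that describes data source structures uniformly, uses data elements to standardize data items, and applies the idea of the edit distance algorithm to match data items against the data elements in the data element dictionary by similarity. A semantic tree representation describes the structure of data elements, and a semantic similarity algorithm checks similarity and consistency between data elements, finds the associations between them, and thereby indirectly locates the semantic relationships between data items, laying a good foundation for data mapping. Standard Sinopec data elements are used to normalize the data items of an oilfield enterprise search engine, which ensures the practical value of the research.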
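The edit-distance matching step described above can be sketched as follows. This is the standard Levenshtein distance with a length-normalized similarity score, which is one common way to apply the idea; the paper's exact scoring is not given in the abstract. The function names and the sample data element dictionary are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute or keep
                )
            )
        previous = current
    return previous[-1]


def similarity(a, b):
    """Normalize the distance into [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def best_match(item, dictionary):
    """Pick the dictionary entry most similar to the data item (case-insensitive)."""
    return max(dictionary, key=lambda element: similarity(item.lower(), element.lower()))


# Hypothetical data element dictionary and a misspelled data item.
data_elements = ["well_name", "well_depth", "oil_rate"]
match = best_match("Well_Nme", data_elements)
```

String similarity of this kind only catches near-identical names, which is presumably why the abstract pairs it with a semantic similarity check over a semantic tree.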

11.
The data model is core knowledge in a database course, and a deep understanding of data models is the key to mastering database design and application. The data models of NoSQL databases are categorized as key-value stores, column-oriented stores, document-oriented stores, and graph databases. This paper makes a comparative analysis of the characteristics of the relational data model and the NoSQL data models, and presents the design and implementation of the different data models through case studies, so that students can master the relevant theory and application methods of database models.

12.
NoSQL systems have gained popularity for many reasons, including the flexibility they provide in organizing data, as they relax the rigidity of the relational model and of other structured models. This flexibility, and the heterogeneity that has emerged in the area, have led to little use of traditional modeling techniques, in contrast to what has happened with databases for decades. In this paper, we argue that traditional notions related to data modeling can be useful in this context as well. Specifically, we propose NoAM (NoSQL Abstract Model), a novel abstract data model for NoSQL databases, which exploits the commonalities of various NoSQL systems. We also propose a database design methodology for NoSQL systems based on NoAM, with initial activities that are independent of the specific target system. NoAM is used to specify a system-independent representation of the application data; this intermediate representation can then be implemented in target NoSQL databases, taking into account their specific features. Overall, the methodology aims at supporting scalability, performance, and consistency, as needed by next-generation web applications.

13.
NoSQL databases are known for high scalability, high availability, and high fault tolerance, and are therefore used in many applications. The data partitioning strategy and the fragment allocation strategy directly affect the performance of NoSQL database systems. Large, global databases are partitioned horizontally, vertically, or by a combination of both. In general, the system scatters related fragments as widely as possible to increase the degree of parallelism of operations. In some applications, however, operations are usually not very complicated, and an operation may access more than one fragment. At the same time, the fragments accessed by an operation may interact with each other. General allocation strategies then increase the system's communication cost when operations execute across sites. To improve the performance of such applications and enable NoSQL database systems to work efficiently, their fragments have to be allocated in a way that reduces the communication cost, i.e., minimizes the total volume of data transmitted when operations execute across sites. A hypergraph-based fragment clustering strategy is proposed that places fragments accessed together by most operations in the same cluster. The method uses a weighted hypergraph to represent the fragment access pattern of operations, and a hypergraph partitioning algorithm to cluster the fragments. This reduces the number of sites that an operation has to span, and thus the communication cost across sites. Experimental results confirm that the proposed technique contributes effectively to solving the fragment re-allocation problem in a specific application environment of a NoSQL database system.
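The access-pattern hypergraph described above can be illustrated with a small sketch. Each operation is a hyperedge over the fragments it touches, weighted here by an assumed access frequency. The clustering step below is a simple greedy merge over hyperedges in descending weight, used only as a stand-in: it is not the hypergraph partitioning algorithm the paper uses, which the abstract does not name. The function name `cluster_fragments`, the `capacity` limit, and the sample workload are hypothetical.

```python
def cluster_fragments(hyperedges, capacity):
    """Greedily co-locate fragments that are accessed together.

    hyperedges: list of (weight, set_of_fragment_names), one per operation.
    capacity: maximum number of fragments allowed in one cluster.
    Returns a sorted list of clusters, each a sorted list of fragment names.
    """
    parent, size = {}, {}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, fragments in hyperedges:
        for fragment in fragments:
            parent.setdefault(fragment, fragment)
            size.setdefault(fragment, 1)

    # Heaviest hyperedges first: they represent the most frequent operations.
    for _, fragments in sorted(hyperedges, key=lambda edge: -edge[0]):
        roots = {find(fragment) for fragment in fragments}
        if len(roots) > 1 and sum(size[r] for r in roots) <= capacity:
            target, *others = sorted(roots)
            for root in others:
                parent[root] = target
                size[target] += size[root]

    clusters = {}
    for fragment in parent:
        clusters.setdefault(find(fragment), []).append(fragment)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical workload: (access frequency, fragments touched by the operation).
workload = [(9, {"f1", "f2"}), (7, {"f3", "f4"}), (2, {"f2", "f3"})]
clusters = cluster_fragments(workload, capacity=2)
```

With a capacity of two fragments per cluster, the two frequent operations each end up on a single site, and only the rare operation touching `f2` and `f3` still spans two sites.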

14.
Integrating data stored in heterogeneous database systems is a very challenging task that can hide several difficulties. As NoSQL databases grow in popularity, the integration of different NoSQL systems and the interoperability of NoSQL systems with SQL databases are becoming increasingly important issues. In this paper, we propose a novel data integration methodology for querying data individually from different relational and NoSQL database systems. The suggested solution does not support joins or aggregates across data sources; it only collects data from separate database management systems according to the filtering options and migrates them. The proposed method is based on a metamodel approach and covers the structural, semantic, and syntactic heterogeneities of the source systems. To demonstrate the applicability of the proposed methodology, we developed a web-based application, which confirms the usefulness of the method.

15.
16.
Wide-column NoSQL databases are an important class of NoSQL (Not only SQL) databases which scale horizontally and feature high access performance on sparse tables. With current trends towards big Data Warehouses (DWs), it is attractive to run existing business intelligence/data warehousing applications on higher volumes of data in wide-column NoSQL databases for low latency, by mapping multidimensional models to wide-column NoSQL models or by using additional SQL add-ons. For example, applications like retail management can run over integrated data sets stored in big DWs or in the cloud to capture current item-selling trends. Many of these systems also employ Snapshot Isolation (SI) as a concurrency control mechanism to achieve high throughput for read-heavy workloads. SI works well in a DW environment, as analytical queries can work on (consistent) snapshots and are not impacted by concurrent update jobs performed by online incremental Extract-Transform-Load (ETL) flows that refresh fact/dimension tables. However, the snapshot made available in the DW is often stale, since at the moment when an analytical query is issued, the source updates (e.g. in a remote retail store) may not have been extracted and processed by the ETL process in time, due to high input data volume or slow processing speed. This staleness may cause incorrect results for time-critical decision support queries. To address this problem, snapshots which are supposed to be accessed by analytical queries need to be first maintained by corresponding ETL flows to reflect source updates based on given freshness needs. Snapshot maintenance in this work means maintaining the distributed data partitions that are required by a query.
Since most NoSQL databases are not ACID compliant and do not provide full-fledged distributed transaction support, a snapshot may be derived inconsistently when its data partitions are updated by different ETL maintenance jobs. This paper describes an extended version of the HBelt system [1], which tightly integrates the wide-column NoSQL database HBase with a clustered and pipelined ETL engine. Our objective is to efficiently refresh HBase tables with remote source updates while guaranteeing a consistent snapshot across distributed partitions for each scan request in analytical queries. A consistency model is defined and implemented to address so-called distributed snapshot maintenance. To achieve this, ETL jobs and analytical queries are scheduled in a distributed processing environment. In addition, a partitioned, incremental ETL pipeline is introduced to increase the performance of ETL (update) jobs. We validate the efficiency gain in terms of data pipelining and data partitioning using the TPC-DS benchmark, which simulates a modern decision support system for a retail product supplier. Experimental results show that high query throughput can be achieved in HBelt when distributed, refreshed snapshots are demanded.

17.
Changqing Li, Jianhua Gu. 《Software》, 2019, 49(3): 401-422
As big data applications in cloud computing environments grow, many existing systems need to expand their services to support the dramatic increase in data. Modern software development for services computing and cloud computing systems is no longer based on a single database but on multiple existing databases, and this convergence calls for new software architecture design. This paper proposes an integration approach to support a hybrid database architecture, including MySQL, MongoDB, and Redis, that allows users to query data simultaneously from both relational SQL systems and NoSQL systems in a single SQL query. Two mechanisms are provided, one for constructing Redis indexes and one for semantic transformation between SQL and the MongoDB API, to add SQL features to these NoSQL databases. With the proposed approach, hybrid database systems can be operated in a flexible manner, i.e., access can go to either the relational database or NoSQL, depending on the size of the data. The approach can effectively reduce development complexity and improve the development efficiency of software systems with multiple databases. This work is the result of further research on the topic, fills a gap that previous work in the field has overlooked, and contributes to the further development of NoSQL technology.
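The abstract mentions a semantic transformation between SQL and the MongoDB API without giving its rules. As a rough illustration only, the sketch below turns a very restricted SQL `WHERE` clause (simple comparisons joined by `AND`) into a MongoDB-style filter dictionary using the standard query operators `$eq`, `$ne`, `$gt`, `$gte`, `$lt`, and `$lte`. The function names and the supported SQL subset are assumptions for this example, not the paper's mechanism.

```python
import re

# SQL comparison operators mapped to MongoDB query operators.
OPERATORS = {"=": "$eq", "!=": "$ne", ">": "$gt", ">=": "$gte", "<": "$lt", "<=": "$lte"}
# Two-character operators are listed first so ">=" is not read as ">".
CONDITION = re.compile(r"^\s*(\w+)\s*(>=|<=|!=|=|>|<)\s*(.+?)\s*$")


def parse_value(text):
    """Interpret a literal: 'quoted' strings, otherwise a float or an int."""
    if len(text) >= 2 and text.startswith("'") and text.endswith("'"):
        return text[1:-1]
    return float(text) if "." in text else int(text)


def where_to_filter(where_clause):
    """Translate 'a >= 1 AND b = 'x'' into a MongoDB-style filter dict.

    Limitation: string literals must not contain the word AND, and OR,
    parentheses, and functions are not supported.
    """
    query = {}
    for part in re.split(r"\s+AND\s+", where_clause, flags=re.IGNORECASE):
        match = CONDITION.match(part)
        if match is None:
            raise ValueError(f"unsupported condition: {part!r}")
        field, operator, raw_value = match.groups()
        # Several conditions on one field are merged into one operator document.
        query.setdefault(field, {})[OPERATORS[operator]] = parse_value(raw_value)
    return query


mongo_filter = where_to_filter("age >= 18 AND age < 65 AND city = 'Paris'")
```

The resulting dictionary has the form accepted as the filter argument of a MongoDB `find` call; building the filter separately keeps the sketch runnable without a database connection.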

18.