Similar Documents
A total of 20 similar documents were found (search time: 46 ms).
1.
The rise of patient health-management models such as mobile healthcare and home telemonitoring generates massive volumes of medical monitoring data, and processing patients' massive health data runs into performance bottlenecks. This paper first introduces the characteristics and application model of real-time medical monitoring big data, analyzes the problems of storing massive medical monitoring data in today's relational databases, and compares the features of several NoSQL databases. It then proposes a scheme for storing medical monitoring big data in the HBase distributed non-relational database, presents the main table-structure design, and finally describes how the HBase database is deployed and how table data is accessed.
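
As a concrete illustration of such a scheme, the sketch below writes monitoring samples into an HBase table through the happybase Python client. The table name, column family, and patient-id-plus-reversed-timestamp row key are illustrative assumptions, not the table design given in the paper.

```python
import time

import happybase  # assumes an HBase Thrift server and an existing
                  # 'vital_signs' table with column family 'v'

connection = happybase.Connection('localhost')
table = connection.table('vital_signs')

def row_key(patient_id: str, ts: float) -> bytes:
    # Patient id + reversed millisecond timestamp: a patient's newest
    # readings sort first and scan together under one key prefix.
    return f'{patient_id}#{2**63 - int(ts * 1000)}'.encode()

# Write one monitoring sample.
table.put(row_key('patient-0042', time.time()),
          {b'v:heart_rate': b'72', b'v:spo2': b'98', b'v:sys_bp': b'118'})

# Read back the ten most recent samples for this patient.
for key, data in table.scan(row_prefix=b'patient-0042#', limit=10):
    print(key, data)
```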

2.
Traditional relational databases can no longer meet the storage and access demands of massive data. To address this problem, a distributed storage and scaling solution based on non-relational (NoSQL) databases is proposed. The paper analyzes and improves NoSQL, discusses distributed storage of key-value pairs based on consistent hashing and a method for scaling database server nodes based on a double hash ring, and proposes introducing NoSQL into the database architecture as a mirror. Results from practical deployment show that the method avoids resource waste and server overload.
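
A minimal consistent-hash ring, sketched below, shows how key-value pairs can be mapped to database servers; virtual nodes keep the distribution even, and adding a node remaps only the keys between its points and their predecessors. The MD5 hash and virtual-node count are illustrative choices, and the paper's double-hash-ring expansion scheme is not reproduced here.

```python
import bisect
import hashlib

class HashRing:
    """Maps keys to server nodes via consistent hashing with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self.ring = []  # sorted list of (hash, node) points
        for node in nodes:
            self.add_node(node, vnodes)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str, vnodes: int = 100) -> None:
        # Only keys falling between the new virtual points and their
        # predecessors move -- the property that makes scaling cheap.
        for i in range(vnodes):
            bisect.insort(self.ring, (self._hash(f'{node}#{i}'), node))

    def get_node(self, key: str) -> str:
        h = self._hash(key)
        idx = bisect.bisect(self.ring, (h, chr(0x10FFFF)))
        return self.ring[idx % len(self.ring)][1]

ring = HashRing(['db1', 'db2', 'db3'])
print(ring.get_node('patient-0042'))  # server responsible for this key
```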

3.
With the rapid development of cloud computing, the Internet of Things, and the mobile Internet, massive volumes of data are growing explosively in these emerging fields, and big data, as a disruptive technology, offers enormous potential for processing them. Traditional relational databases are no longer adequate, which has given rise to distributed NoSQL databases. To address practical challenges in the big data field, this paper designs and implements a new distributed big data management system (DBDMS) based on Hadoop and NoSQL that provides real-time collection, retrieval, and permanent storage of big data. Experiments show that DBDMS significantly improves big data processing capability and suits applications such as massive log backup and retrieval, and massive network packet capture and analysis.

4.
石清 《软件工程》2021,(3):48-51
在"互联网+"概念的影响下,越来越多的信息技术应用于体育产业.本文通过构建基于MEAN框架的体育竞赛实时数据管理系统,设计了一种结合本地存储与远端云数据库的分布式存储方案,既实现数据的实时分享,又保障数据的可靠性.并通过实验的方式比较了基于HTML5 Local Storage本地存储的两种方法与本地NoSQL数据库的...  相似文献   

5.
The volume of data collected and processed in real-world engineering is extremely large, posing a major challenge to traditional database technology. To address the slow storage speed and high hardware requirements of traditional relational databases, this paper proposes a big data processing method based on NoSQL databases that breaks with the relational model: data are stored in a free-form way rather than depending on a fixed table structure. The method combines empirical mode decomposition (EMD) with NoSQL database technology and applies it to deformation monitoring of large structural components, building a deformation monitoring system for such components on top of a NoSQL database. Simulation results show that the method processes deformation monitoring data in real time and outperforms traditional database technology in computational convergence, algorithmic stability, and processing speed.
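
For reference, a single sifting step of empirical mode decomposition, the signal-processing half of the proposed method, can be sketched as follows. Boundary handling and the stopping criterion are omitted; this is an illustrative sketch, not the paper's implementation.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def sift_once(t: np.ndarray, x: np.ndarray) -> np.ndarray:
    # Interior local maxima and minima of the signal.
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] < x[i + 1]]
    # Upper/lower envelopes: cubic splines through the extrema.
    upper = CubicSpline(t[maxima], x[maxima])(t)
    lower = CubicSpline(t[minima], x[minima])(t)
    # Removing the envelope mean is one sift; iterating yields an IMF.
    return x - (upper + lower) / 2.0

t = np.linspace(0.0, 1.0, 500)
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
imf_candidate = sift_once(t, x)  # first step toward the fast 40 Hz mode
```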

6.
As the construction of digital archive resource systems accelerates, archival data are becoming ever more diverse and growing rapidly, exhibiting the characteristics of big data. Traditional relational databases and centralized storage fall short in adaptability, reliability, and scalability when handling archival big data. After analyzing the limitations of traditional archive storage models, this paper applies distributed NoSQL databases, distributed file systems, and distributed search engines to the management of archival big data, designs a storage and retrieval scheme for archival big data based on a distributed NoSQL database, and validates it with a prototype system.
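
The division of labor such a scheme implies can be sketched as follows: bulk archive files live in a distributed file system, while searchable metadata goes to a distributed search engine. Elasticsearch, the index name, and the document fields are stand-in assumptions, not the paper's design.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')  # assumes a reachable cluster

# Index only the metadata; the digitized file itself would sit in a
# distributed file system (e.g. HDFS), referenced here by path.
es.index(index='archives', id='fonds-01-0001', document={
    'title': 'Construction permit, 1987',
    'fonds': 'fonds-01',
    'year': 1987,
    'hdfs_path': '/archives/fonds-01/0001.tif',
})

# Full-text retrieval over the indexed metadata.
hits = es.search(index='archives', query={'match': {'title': 'permit'}})
for h in hits['hits']['hits']:
    print(h['_source']['hdfs_path'])
```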

7.
Emerging applications such as social networks and microblogs pose new challenges to data management technology, including efficient storage of massive data, highly concurrent access, high scalability, and high availability. Traditional relational database technology cannot meet the needs of these applications, so the research, development, and application of NoSQL data management technology are receiving increasing attention. This paper surveys the state of the art and trends in NoSQL data management from the perspectives of NoSQL data models, data storage, query processing, and hybrid SQL/NoSQL solutions, and introduces several representative NoSQL products.

8.
王宏伟  方群  陈伟 《微机发展》2013,(7):242-244,248
Industrial OPC real-time monitoring systems must respond quickly and process large volumes of real-time data in a timely manner, which traditional relational databases struggle to do; in-memory databases handle the real-time processing of massive monitoring data well and feed information back promptly. This paper introduces in-memory database technology into industrial OPC real-time monitoring and fuses it with a traditional relational database, proposing an architectural model for an industrial OPC real-time monitoring system based on in-memory database technology. While still storing massive historical data, the model improves the real-time performance and stability of the OPC monitoring system, meets its requirements, achieves good real-time monitoring results, and can be used in industrial OPC real-time monitoring systems.
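
One way to read this fusion architecture is sketched below: an in-memory store absorbs high-rate OPC tag updates on the hot path, and a relational database periodically receives snapshots as history. Redis and SQLite here are illustrative stand-ins for the in-memory and relational layers; the abstract does not name specific products.

```python
import sqlite3
import time

import redis

r = redis.Redis()                         # in-memory layer (assumed local)
hist = sqlite3.connect('opc_history.db')  # relational history layer
hist.execute('CREATE TABLE IF NOT EXISTS history (tag TEXT, value REAL, ts REAL)')

def on_opc_update(tag: str, value: float) -> None:
    # Hot path: write only to memory so the monitor stays responsive.
    r.hset('opc:current', tag, value)

def flush_to_history() -> None:
    # Cold path: periodically persist the current snapshot as history.
    now = time.time()
    rows = [(k.decode(), float(v), now)
            for k, v in r.hgetall('opc:current').items()]
    hist.executemany('INSERT INTO history VALUES (?, ?, ?)', rows)
    hist.commit()

on_opc_update('boiler.pressure', 1.82)
flush_to_history()
```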

10.
As building information models (BIM) grow in scale and complexity, storing and analyzing massive BIM data on a single computer becomes increasingly difficult. Traditional relational and object-oriented databases can no longer meet the construction industry's needs for storing and managing massive, diverse data, whereas big data technology brings great potential for storing, managing, and analyzing massive BIM data. Exploiting the strengths of big data technology for managing structured and unstructured BIM data, this paper discusses the overall architecture and storage model of the distributed big data platform Hadoop and the HBase database; devises a strategy and table designs for storing IFC (Industry Foundation Classes) structured and unstructured data in HBase; and builds a BIM storage system on a Hadoop/HBase big data environment that supports the basic management operations on IFC data. A real-world case study verifies the feasibility of the system.
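
A hedged sketch of the storage split this suggests: one HBase column family for structured IFC attributes and another for unstructured payloads, keyed by the entity's GlobalId. The table and family names and the sample entity are illustrative assumptions, not the paper's table design.

```python
import happybase  # assumes an HBase Thrift server

connection = happybase.Connection('localhost')
connection.create_table('ifc_entities', {
    's': dict(),                # structured attributes (type, name, ...)
    'b': dict(max_versions=1),  # unstructured blobs (geometry, documents)
})

table = connection.table('ifc_entities')
# Row key: the IFC GlobalId, unique per entity across the model.
table.put(b'1kTvXnbbzCWw8lcMd1dR4o', {
    b's:ifc_type': b'IfcWall',
    b's:name': b'Basic Wall:Interior',
    b'b:geometry': b'<serialized geometry bytes>',
})
print(table.row(b'1kTvXnbbzCWw8lcMd1dR4o', columns=[b's']))
```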

11.
12.
With the rapid development of the Internet, and especially the recent emergence of cloud computing, the Internet of Things, and the wide adoption of services such as social networking, the volume of data in human society is growing rapidly: the big data era has arrived. How to acquire and analyze big data has become a widespread problem, but the data security issues it brings must be taken seriously. Starting from the concept and characteristics of big data, this paper describes the security challenges big data faces and proposes strategies for addressing them.

13.
The optimization capabilities of RDBMSs make them attractive for executing data transformations. However, although many useful data transformations can be expressed as relational queries, an important class of data transformations that produce several output tuples for a single input tuple cannot be expressed in that way.

To overcome this limitation, we propose to extend Relational Algebra with a new operator named data mapper. In this paper, we formalize the data mapper operator and investigate some of its properties. We then propose a set of algebraic rewriting rules that enable the logical optimization of expressions with mappers and prove their correctness. Finally, we experimentally study the proposed optimizations and identify the key factors that influence the optimization gains.
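
An illustrative reading of the operator in code: a mapper may emit several output tuples per input tuple, and a selection on an attribute the mapper merely copies can be pushed below it, which is the flavor of the rewriting rules studied. The relation and mapper below are invented examples, not the paper's formalization.

```python
from typing import Iterable, Iterator, Tuple

Row = Tuple[str, str]

def data_mapper(rows: Iterable[Row]) -> Iterator[Row]:
    # One input tuple (order_id, csv_of_items) -> one output tuple per item.
    for order_id, items_csv in rows:
        for item in items_csv.split(','):
            yield (order_id, item)

orders = [('o1', 'apple,pear'), ('o2', 'milk')]
print(list(data_mapper(orders)))
# [('o1', 'apple'), ('o1', 'pear'), ('o2', 'milk')]

# Rewrite-rule flavor: a selection on an attribute the mapper merely copies
# (order_id) can be pushed below the mapper, shrinking the mapper's input.
pushed = data_mapper(row for row in orders if row[0] == 'o2')
print(list(pushed))  # [('o2', 'milk')]
```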


14.
As the amount of multimedia data increases day by day, thanks to cheaper storage devices and a growing number of information sources, machine learning algorithms are faced with very large datasets. When the original data is huge, small sample sizes are preferred for various applications; this is typically the case for multimedia. But a simple random sample may not give satisfactory results, because such a sample may not adequately represent the entire data set due to random fluctuations in the sampling process, and the difficulty is particularly apparent when small sample sizes are needed. Fortunately, a good training sample can improve the final results significantly. In KDD'03 we proposed EASE, which outputs a sample based on its 'closeness' to the original sample; reported results show that EASE outperforms simple random sampling (SRS). In this paper we propose EASIER, which extends EASE in two ways. (1) EASE is a halving algorithm: to achieve the required sample ratio it starts from a suitable initial large sample and iteratively halves it. EASIER, on the other hand, does away with the repeated halving by obtaining the required sample ratio directly in one iteration. (2) EASE was shown to work on the IBM QUEST dataset, a categorical count data set. EASIER, in addition, is shown to work on continuous image and audio features. We have successfully applied EASIER to image classification and audio event identification. Experimental results show that EASIER outperforms SRS significantly.
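
The abstract does not spell out EASE/EASIER themselves; the sketch below only illustrates the underlying idea of preferring, among candidate random samples, the one whose distribution stays closest to the original data. The histogram criterion and all names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=10_000)

edges = np.histogram_bin_edges(data, bins=20)

def hist(x: np.ndarray) -> np.ndarray:
    counts, _ = np.histogram(x, bins=edges)
    return counts / counts.sum()

target = hist(data)

def distribution_aware_sample(ratio=0.01, candidates=50):
    # Among several SRS draws, keep the one closest to the full data's
    # histogram -- a stand-in for "closeness to the original sample".
    best, best_dist = None, np.inf
    for _ in range(candidates):
        s = rng.choice(data, size=int(len(data) * ratio), replace=False)
        d = np.abs(hist(s) - target).sum()  # L1 histogram distance
        if d < best_dist:
            best, best_dist = s, d
    return best

sample = distribution_aware_sample()  # beats a single SRS draw on L1 distance
```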

15.
Time series analysis has always been an important and interesting research field due to its frequent appearance in different applications. In the past, many approaches based on regression, neural networks and other mathematical models were proposed to analyze time series. In this paper, we attempt to use data mining techniques to analyze time series. Many previous studies on data mining have focused on handling binary-valued data; time series data, however, are usually quantitative values. We thus extend our previous fuzzy mining approach to handle time-series data and find linguistic association rules. The proposed approach first uses a sliding window to generate continuous subsequences from a given time series and then analyzes the fuzzy itemsets from these subsequences. Appropriate post-processing is then performed to remove redundant patterns. Experiments are also made to show the performance of the proposed mining algorithm. Since the final results are represented by linguistic rules, they are friendlier to humans than a quantitative representation.
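
The front end of such an approach can be sketched as follows: slide a window over the series to obtain subsequences, then fuzzify each value into linguistic terms before itemset mining. The triangular membership functions and term names are illustrative assumptions.

```python
def sliding_windows(series, width):
    return [series[i:i + width] for i in range(len(series) - width + 1)]

def memberships(v, low=0.0, mid=50.0, high=100.0):
    # Degrees to which a value is "Low", "Middle", "High" (overlapping triangles).
    return {
        'Low':    max(0.0, min(1.0, (mid - v) / (mid - low))),
        'Middle': max(0.0, 1.0 - abs(v - mid) / (mid - low)),
        'High':   max(0.0, min(1.0, (v - mid) / (high - mid))),
    }

series = [30, 45, 80, 95, 60, 20]
for window in sliding_windows(series, width=3):
    terms = []
    for v in window:
        m = memberships(v)
        terms.append(max(m, key=m.get))  # dominant linguistic term
    print(terms)
# Each window becomes a sequence of linguistic terms, ready for itemset mining.
```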

16.
Compression-based data mining of sequential data
The vast majority of data mining algorithms require the setting of many input parameters. The dangers of working with parameter-laden algorithms are twofold. First, incorrect settings may cause an algorithm to fail in finding the true patterns. Second, and perhaps more insidious, the algorithm may report spurious patterns that do not really exist, or greatly overestimate the significance of the reported patterns. This is especially likely when the user fails to understand the role of parameters in the data mining process. Data mining algorithms should have as few parameters as possible. A parameter-light algorithm would limit our ability to impose our prejudices, expectations, and presumptions on the problem at hand, and would let the data itself speak to us. In this work, we show that recent results in bioinformatics, learning, and computational theory hold great promise for a parameter-light data-mining paradigm. The results are strongly connected to Kolmogorov complexity theory. However, as a practical matter, they can be implemented using any off-the-shelf compression algorithm with the addition of just a dozen lines of code. We show that this approach is competitive with or superior to many of the state-of-the-art approaches in anomaly/interestingness detection, classification, and clustering, with empirical tests on time series/DNA/text/XML/video datasets. As further evidence of the advantages of our method, we demonstrate its effectiveness in solving a real-world classification problem in recommending printing services and products.
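
A compression-based dissimilarity of this kind really can be implemented in a few lines with an off-the-shelf compressor. The sketch below uses the form CDM(x, y) = C(xy) / (C(x) + C(y)) associated with this line of work, with zlib as the compressor; treat it as a minimal illustration rather than the paper's exact code.

```python
import zlib

def cdm(x: bytes, y: bytes) -> float:
    # Compression Dissimilarity Measure: how much better does the pair
    # compress together than separately?
    c = lambda s: len(zlib.compress(s, 9))
    return c(x + y) / (c(x) + c(y))

a = b'ACGTACGTACGTACGT' * 20
b = b'ACGTACGTACGTACGA' * 20
c_other = b'TTGGAACCTTGGAACC' * 20
print(cdm(a, b))        # expected smaller: the sequences share structure
print(cdm(a, c_other))  # expected larger: little shared structure
```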

17.
18.
Linear combinations of translates of a given basis function have long been successfully used to solve scattered data interpolation and approximation problems. We demonstrate how the classical basis function approach can be transferred to the projective space ℙ^{d−1}. To be precise, we use concepts from harmonic analysis to identify positive definite and strictly positive definite zonal functions on ℙ^{d−1}. These can then be applied to solve problems arising in tomography, since the data given there consists of integrals over lines. Here, enhancing known reconstruction techniques with the use of a scattered data interpolant in the "space of lines" naturally leads to reconstruction algorithms well suited to limited angle and limited range tomography. In the medical setting, algorithms for such incomplete data problems are desirable, as using them can limit radiation dosage.
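
Schematically, and only as an illustration of the basis-function approach described (the paper's harmonic-analysis results govern which φ are admissible), the zonal interpolant on the space of lines takes the form:

```latex
% Zonal kernels on projective space depend on |<x, y>|; strict positive
% definiteness of \varphi makes the linear system below uniquely solvable.
s(x) \;=\; \sum_{j=1}^{N} \alpha_j\, \varphi\bigl(\lvert \langle x, x_j \rangle \rvert\bigr),
\qquad x \in \mathbb{P}^{d-1},
\quad \text{with } \sum_{j=1}^{N} \alpha_j\, \varphi\bigl(\lvert \langle x_k, x_j \rangle \rvert\bigr) = f(x_k),
\; k = 1, \dots, N.
```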

19.
Existing automated test data generation techniques tend to start from scratch, implicitly assuming that no pre‐existing test data are available. However, this assumption may not always hold, and where it does not, there may be a missed opportunity; perhaps the pre‐existing test cases could be used to assist the automated generation of additional test cases. This paper introduces search‐based test data regeneration, a technique that can generate additional test data from existing test data using a meta‐heuristic search algorithm. The proposed technique is compared to a widely studied test data generation approach in terms of both efficiency and effectiveness. The empirical evaluation shows that test data regeneration can be up to 2 orders of magnitude more efficient than existing test data generation techniques, while achieving comparable effectiveness in terms of structural coverage and mutation score.
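
A toy sketch of the idea: rather than searching from scratch, run a local search outward from each existing test case. The numeric-tuple encoding and fitness function below are stand-ins for illustration, not the paper's actual algorithm.

```python
def neighbors(test_input, step=1):
    # Small perturbations of an existing test case, one field at a time.
    for i in range(len(test_input)):
        for delta in (-step, step):
            t = list(test_input)
            t[i] += delta
            yield tuple(t)

def regenerate(seed_tests, fitness, rounds=100):
    new_tests = set()
    for seed in seed_tests:
        current = seed
        for _ in range(rounds):  # plain hill climbing from the seed
            best = max(neighbors(current), key=fitness)
            if fitness(best) <= fitness(current):
                break            # local optimum reached
            current = best
        new_tests.add(current)
    return new_tests

# Toy fitness: pull inputs toward the uncovered branch condition a + b == 10.
fitness = lambda t: -abs(t[0] + t[1] - 10)
print(regenerate({(0, 0), (7, 7)}, fitness))  # -> {(10, 0), (3, 7)}
```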

20.
Data protection has been a difficult problem ever since the Internet emerged, and from the moment social media sites began to dominate the digital market, the protection of user data and information has kept policymakers on alert. In the digital economy era, data has gradually become a key factor in corporate competitiveness, and market competition increasingly revolves around it. Enterprises' emphasis on and pursuit of data resources has pushed disputes over platform rights versus the protection of users' personal information, and over unfair data competition between Internet companies, into the spotlight. How to balance the reasonable use and the protection of data, and how to regulate unfair competition so as to secure a competitive advantage amid the rapid development of the digital economy, is therefore especially important. By analyzing the dual nature of data, this article discusses the value of data in the digital economy era and, drawing on the Anti-Unfair Competition Law and practical cases, further examines the relationship between data use and data protection.
