Similar Articles
20 similar articles found.
1.
2.
A dynamic hashing scheme based on extendible hashing is proposed whose directory can grow into a multilevel directory. The scheme is compared to the extendible hashing and the extendible hashing tree schemes. The simulation results reveal that the proposed scheme is superior to the other two with respect to directory space utilization, especially for files with nonuniform record distributions. The scheme can also be easily extended to multikey file systems, where it shows good performance as well.
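The abstract above builds on standard extendible hashing, where a flat directory of bucket pointers doubles whenever an overflowing bucket's local depth reaches the global depth. The sketch below shows only that baseline single-level scheme (the paper's contribution, letting the directory grow into a multilevel structure, is not reproduced); class and method names are illustrative, not taken from the paper.

```python
class Bucket:
    def __init__(self, depth, capacity=4):
        self.depth = depth            # local depth of this bucket
        self.capacity = capacity
        self.items = {}

class ExtendibleHash:
    """Baseline single-level extendible hashing (directory = flat pointer array)."""
    def __init__(self, capacity=4):
        self.global_depth = 1
        self.directory = [Bucket(1, capacity), Bucket(1, capacity)]

    def _index(self, key):
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key):
        return self.directory[self._index(key)].items.get(key)

    def insert(self, key, value):
        bucket = self.directory[self._index(key)]
        if key in bucket.items or len(bucket.items) < bucket.capacity:
            bucket.items[key] = value
            return
        self._split(bucket)
        self.insert(key, value)       # retry after the split

    def _split(self, bucket):
        if bucket.depth == self.global_depth:   # directory must double
            self.directory = self.directory * 2
            self.global_depth += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth, bucket.capacity)
        high_bit = 1 << (bucket.depth - 1)
        for i, b in enumerate(self.directory):  # repoint half of the entries
            if b is bucket and (i & high_bit):
                self.directory[i] = sibling
        old, bucket.items = bucket.items, {}
        for k, v in old.items():                # redistribute the records
            self.directory[self._index(k)].items[k] = v

h = ExtendibleHash(capacity=2)
for i in range(20):
    h.insert(f"rec{i}", i)
print(h.global_depth, h.get("rec7"))
```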

3.
The Hadoop Distributed File System (HDFS) is typically used to store and manage large files; when it has to store and process massive numbers of small files, it consumes a large amount of NameNode memory and access time, which becomes a major constraint on HDFS performance. To address the massive-small-file problem in multimodal medical data, a storage optimization method based on double-layer hash coding and HBase is proposed. When small files are merged, an extendible hash function builds the buckets of the index file, so that the index file can grow dynamically as needed and supports file appending. Within each bucket, an MWHC hash function records the position of each file's index entry in the index file; when a file is accessed, only the index information in the corresponding bucket has to be read instead of the index information of all files, so a file can be located in O(1) time and lookup efficiency improves. To meet the storage needs of multimodal medical data, HBase stores the file index information, and an identifier column marks the modality of each medical record, which simplifies managing data of different modalities and speeds up file reads. To further optimize storage performance, an LRU-based metadata prefetching mechanism is built, and the LZ4 compression algorithm is applied to the merged files. Comparisons of file access performance and NameNode memory usage show that, compared with the original HDFS, HAR, MapFile, TypeStorage and ...
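As a rough illustration of the two-level index described above, the sketch below hashes a file name to an extendible-hash-style bucket and keeps a per-bucket table mapping each small file to its (merged file, offset, length, modality) entry, so a lookup touches only one bucket. A plain dict stands in for the MWHC minimal perfect hash used in the paper, and all names and fields here are hypothetical.

```python
import hashlib

def bucket_id(filename, depth):
    """Low `depth` bits of a stable hash pick the index bucket."""
    h = int(hashlib.md5(filename.encode()).hexdigest(), 16)
    return h & ((1 << depth) - 1)

class SmallFileIndex:
    def __init__(self, depth=3):
        self.depth = depth
        self.buckets = [dict() for _ in range(1 << depth)]  # dict ~ stand-in for MWHC

    def add(self, filename, merged_file, offset, length, modality):
        entry = (merged_file, offset, length, modality)
        self.buckets[bucket_id(filename, self.depth)][filename] = entry

    def locate(self, filename):
        # Only the matching bucket is consulted, mirroring the O(1) lookup goal.
        return self.buckets[bucket_id(filename, self.depth)].get(filename)

idx = SmallFileIndex()
idx.add("ct_0001.dcm", "merged_000.bin", 0, 524288, "CT")
idx.add("mri_0042.dcm", "merged_000.bin", 524288, 262144, "MRI")
print(idx.locate("mri_0042.dcm"))
```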

4.
游小容  曹晟 《计算机科学》2015,42(10):76-80
As a mature distributed cloud platform, Hadoop provides reliable and efficient storage services and is commonly used to store large files, but its efficiency drops significantly when handling massive numbers of small files. A storage optimization scheme for the massive small files of educational resources on Hadoop is proposed: the association relationships among the small educational-resource files are exploited to merge them into large files and thus reduce the file count, an index mechanism is used to access the small files, and a metadata cache together with a prefetching mechanism for associated small files improves read efficiency. Experiments show that these methods improve the access efficiency of the Hadoop file system for small files.
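A minimal sketch of the merge-plus-index idea from the abstract above: small files are appended into one container file and an in-memory index records each file's offset and length, so a read seeks directly to the right slice. The metadata cache and associated-file prefetching from the paper are omitted; all paths and names are made up for the example.

```python
import os, tempfile

def merge_small_files(paths, merged_path):
    """Append small files into one container and record (offset, length)."""
    index = {}
    with open(merged_path, "wb") as out:
        for p in paths:
            with open(p, "rb") as src:
                data = src.read()
            index[os.path.basename(p)] = (out.tell(), len(data))
            out.write(data)
    return index

def read_small_file(merged_path, index, name):
    offset, length = index[name]
    with open(merged_path, "rb") as f:
        f.seek(offset)                 # one seek + one read per small file
        return f.read(length)

tmp = tempfile.mkdtemp()
paths = []
for i in range(3):
    p = os.path.join(tmp, f"res{i}.txt")
    with open(p, "wb") as f:
        f.write(f"educational resource {i}".encode())
    paths.append(p)

merged = os.path.join(tmp, "merged.bin")
index = merge_small_files(paths, merged)
print(read_small_file(merged, index, "res1.txt"))
```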

5.
A hybrid method of block truncation coding (BTC) and differential pulse code modulation (DPCM) offers better visual quality than standard BTC for small block sizes owing to its inherent multitone representation. Recently, a two-level quantizer design method was proposed to increase the coding performance of the DPCM-BTC framework. However, that design method is only near-optimal, since its coding performance depends on the initial bit plane patterns. In this paper, we propose a bit plane modification (BPM) algorithm to achieve further performance improvement. The BPM algorithm, inspired by error diffusion, effectively distributes a large quantization error at a given pixel to neighboring pixels with small quantization errors by changing partial bit patterns. Experimental results show that the proposed algorithm achieves much higher coding performance than various conventional BTC methods. The average PSNR of the proposed method is 2.31 dB, 5.15 dB, and 5.15 dB higher than that of BTC, DPCM-BTC, and a recently developed BTC scheme using error diffusion and bilateral filtering, respectively.
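For context, the sketch below implements only the classic moment-preserving BTC step that the DPCM-BTC framework and the proposed BPM algorithm build on: each block is reduced to a bit plane plus two reconstruction levels chosen to preserve the block's mean and standard deviation. The DPCM stage and the error-diffusion-style bit plane modification are not reproduced; NumPy is assumed for brevity.

```python
import numpy as np

def btc_encode(block):
    """Classic moment-preserving BTC: bit plane + two reconstruction levels."""
    x = block.astype(np.float64)
    mean, std = x.mean(), x.std()
    bitplane = x >= mean
    n, q = x.size, int(bitplane.sum())
    if q in (0, n):                          # flat block: one level is enough
        return bitplane, mean, mean
    a = mean - std * np.sqrt(q / (n - q))    # level assigned to the 0-bits
    b = mean + std * np.sqrt((n - q) / q)    # level assigned to the 1-bits
    return bitplane, a, b

def btc_decode(bitplane, a, b):
    return np.where(bitplane, b, a)

block = np.array([[2, 9, 12, 15],
                  [2, 11, 11, 9],
                  [2, 3, 12, 15],
                  [3, 3, 4, 14]])
plane, a, b = btc_encode(block)
print(np.round(btc_decode(plane, a, b), 1))
```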

6.

With the fast increase of multimedia content, efficient forensic investigation methods for multimedia files are required. For multimedia files, similarity means that identical media (audio and video) data exist among the files. This paper proposes an efficient multimedia file forensics system based on file-similarity search of video contents. The proposed system relies on two key techniques. The first is media-aware information detection: the critical first step for similarity search is to find the meaningful keyframes or key sequences in the shots of a multimedia file, so that files altered from the same source file can be recognized. The second is a video fingerprint-based (VFB) technique for file-similarity search. Byte-by-byte comparison is an inefficient way to search for similarity in large files such as multimedia; the VFB technique efficiently extracts video features from large multimedia files and also provides an independent, media-aware identification method for detecting alterations to the source video file (e.g., frame rates, resolutions, and formats). In this paper, we focus on two key challenges: generating robust video fingerprints by finding meaningful boundaries of a multimedia file, and measuring video similarity by fingerprint-based matching. Our evaluation shows that the proposed system can be applied to realistic multimedia file forensics tools.
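The sketch below is a simplified stand-in for the fingerprint-and-match idea: each (key)frame is reduced to a 64-bit average-hash fingerprint and two fingerprint sequences are compared by Hamming distance. The paper's VFB fingerprints and its shot/keyframe boundary detection are more elaborate; the hashing choice, the threshold, and the synthetic frames here are all assumptions.

```python
import numpy as np

def ahash(frame, hash_size=8):
    """64-bit average-hash fingerprint of one grayscale frame."""
    h, w = frame.shape
    small = frame[:h - h % hash_size, :w - w % hash_size].reshape(
        hash_size, h // hash_size, hash_size, w // hash_size).mean(axis=(1, 3))
    bits = (small > small.mean()).flatten()
    return int("".join("1" if b else "0" for b in bits), 2)

def hamming(a, b):
    return bin(a ^ b).count("1")

def video_similarity(fps_a, fps_b, threshold=10):
    """Fraction of A's keyframe fingerprints with a close match in B."""
    hits = sum(1 for fa in fps_a if min(hamming(fa, fb) for fb in fps_b) <= threshold)
    return hits / len(fps_a)

rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, (64, 64)) for _ in range(5)]     # stand-in keyframes
altered = [f + rng.integers(-3, 4, f.shape) for f in frames]    # mildly altered copy
fp_a = [ahash(f) for f in frames]
fp_b = [ahash(f) for f in altered]
print(video_similarity(fp_a, fp_b))
```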

7.
Parallel file systems are seeing more and more applications from various fields. Different applications have different I/O workload characteristics and therefore diverse requirements on accessing storage resources. However, parallel file systems often adopt a “one-size-fits-all” solution, which fails to meet specific application needs and hinders full exploitation of the potential performance. This paper presents a framework that enables dynamic, fine-grained file I/O path selection at runtime. The framework adopts a file handle-rich scheme that lets the file system choose corresponding optimizations to serve I/O requests. Consistency control algorithms are proposed to ensure data consistency while optimizations are changed at runtime. One case study on our prototype shows that choosing proper optimizations can improve I/O performance for small files and large files by up to 40% and 64.4%, respectively. Another case study shows that data prefetch performance for real-world application traces can be improved by up to 193% by selecting the correct prefetch patterns. Simulations in a large-scale environment also show that our method is scalable and that both the memory consumption and the consistency control overhead are negligible.

8.
It has been proven that network coding can provide significant benefits to networks. However, network coding is very vulnerable to pollution attacks. In recent years, many schemes have been designed to defend against these attacks, but as far as we know almost all of them are inapplicable to multi-source network coding systems. This paper proposes a novel homomorphic signature scheme based on bilinear pairings to defend against pollution attacks in multi-source network coding, which has a broader application background than single-source network coding. Our signatures are publicly verifiable, and the public keys are independent of the files, so the scheme can authenticate multiple files without updating public keys. The signature length of the proposed scheme matches the shortest signatures for single-source network coding, and its verification is faster than that of signature schemes based on elliptic curves for single-source networks.
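The scheme above relies on signatures that are homomorphic with respect to the linear combinations performed by coding nodes. The toy below is not the paper's pairing-based, publicly verifiable construction; it is only a secret-key linear tag over a prime field, shown to illustrate why tags that are linear in the packet survive re-coding while a polluted packet fails verification. Parameters and names are illustrative only.

```python
import random

P = (1 << 61) - 1                     # prime modulus (toy parameter)

def keygen(n, seed=1):
    rnd = random.Random(seed)
    return [rnd.randrange(1, P) for _ in range(n)]

def tag(key, v):
    """Linear tag: <key, v> mod P, so tags follow linear combinations of packets."""
    return sum(k * x for k, x in zip(key, v)) % P

v1, v2 = [3, 1, 4, 1], [2, 7, 1, 8]   # packets, possibly from different sources
key = keygen(len(v1))
t1, t2 = tag(key, v1), tag(key, v2)

c1, c2 = 5, 9                         # coding coefficients at an intermediate node
w = [(c1 * a + c2 * b) % P for a, b in zip(v1, v2)]
tw = (c1 * t1 + c2 * t2) % P          # the tags are combined the same way

assert tag(key, w) == tw              # honest coded packet verifies
w[0] = (w[0] + 1) % P                 # pollute one symbol ...
assert tag(key, w) != tw              # ... and verification fails
print("homomorphic check ok")
```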

9.
As data volumes keep growing, big data storage plays a central role in the overall big data application stack. Performance evaluation of big data storage systems helps application developers analyze performance bottlenecks and optimize their systems. Previous work typically either benchmarks different big data frameworks or instruments distributed file systems and analyzes trace files. These two approaches take different analytical perspectives and do not form a coherent evaluation methodology for big data distributed storage systems. This paper proposes an architecture, together with a concrete implementation, for a performance evaluation methodology that combines active and passive evaluation. On the active side, benchmark programs covering more than 20 applications in 6 domains actively drive performance tests against the storage system and measure its baseline performance indicators; on the passive side, methods for analyzing and locating inefficient tasks, operators, and functions diagnose why big data applications running on top of the storage system are inefficient. Experiments show that this evaluation architecture can assess the performance of big data storage systems comprehensively.

10.
One of the important features of distributed computing systems (DCSs) is their potential for high reliability. Once the hardware configuration of a DCS is fixed, the system reliability depends mainly on the allocation of various resources, and one of the important resources in a DCS is its files. In this paper, we develop a reliability-oriented file allocation scheme for distributed systems, in which files are allocated to the nodes of a DCS so that the reliability of executing a program that requires files from remote node(s) is maximized. Several variations of this problem are solved to illustrate the Genetic Algorithm-based solution approach. The paper also gives the relation between the degree of file redundancy and the maximum achievable reliability of executing a program. The proposed method is compared with optimal solutions to demonstrate the accuracy of the solutions obtained from the Genetic Algorithm-based methodology.
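A compact sketch of the GA-based allocation idea, under toy assumptions: a program on node 0 needs a subset of files, the fitness of an allocation is the product of the link reliabilities used to fetch those files, and a simple generational GA (truncation selection, one-point crossover, point mutation) searches the allocation space. File redundancy and the paper's actual reliability model are not modeled; all constants are invented.

```python
import random

rnd = random.Random(42)

NODES, FILES = 4, 6
LINK_REL = [1.0, 0.95, 0.90, 0.80]    # reliability of reaching node i from node 0
NEEDED = [0, 2, 3, 5]                 # files read by a program running on node 0

def reliability(alloc):
    """Probability that every needed file can be fetched (independent links)."""
    r = 1.0
    for f in NEEDED:
        r *= LINK_REL[alloc[f]]
    return r

def ga(pop_size=30, generations=60, mut_rate=0.1):
    pop = [[rnd.randrange(NODES) for _ in range(FILES)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=reliability, reverse=True)
        survivors = pop[:pop_size // 2]            # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rnd.sample(survivors, 2)
            cut = rnd.randrange(1, FILES)
            child = a[:cut] + b[cut:]              # one-point crossover
            if rnd.random() < mut_rate:
                child[rnd.randrange(FILES)] = rnd.randrange(NODES)  # mutation
            children.append(child)
        pop = survivors + children
    best = max(pop, key=reliability)
    return best, reliability(best)

print(ga())   # tends to place every needed file on node 0 (perfectly reliable link)
```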

11.
Frame-sliced signature files
A superimposed coding method, the frame-sliced signature file, is proposed, and its performance is studied and compared with that of other signature file methods. The response time of the method is improved by its ability to partition the signature file effectively, so that fewer random disk accesses are required on both retrieval and insertion, while the good characteristics of the conventional sequential signature file, i.e., low space overhead, low maintenance cost, and the write-once property, are retained. The generalized version of the method is shown to be a unified framework for several popular signature file methods, including the sequential signature file (SSF) method, the bit-sliced signature file (BSSF) method, and its enhanced version B'SSF. A prototype system was implemented in C on UNIX workstations. Experimental results on a 2.8-MB database of 2800 technical reports and a 28-MB database of 28000 technical reports are presented.
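A minimal sketch of the frame-sliced signature idea: the block signature is split into F frames, each word superimposes a few bits inside exactly one frame chosen by a hash, and a single-word query therefore needs to inspect only that frame (with the usual signature-file false positives). Frame counts, bit widths, and hash choices below are arbitrary illustration values.

```python
import hashlib

FRAMES, FRAME_BITS, BITS_PER_WORD = 8, 64, 3

def _h(word, salt):
    return int(hashlib.md5(f"{salt}:{word}".encode()).hexdigest(), 16)

def word_signature(word):
    """Each word sets BITS_PER_WORD bits inside exactly one frame."""
    frame = _h(word, "frame") % FRAMES
    mask = 0
    for i in range(BITS_PER_WORD):
        mask |= 1 << (_h(word, f"bit{i}") % FRAME_BITS)
    return frame, mask

def block_signature(words):
    frames = [0] * FRAMES
    for w in words:
        f, m = word_signature(w)
        frames[f] |= m                 # superimpose within the chosen frame
    return frames

def maybe_contains(frames, word):
    f, m = word_signature(word)        # a single-word query reads one frame only
    return frames[f] & m == m          # false positives possible, no false negatives

sig = block_signature(["parallel", "file", "system"])
print(maybe_contains(sig, "file"), maybe_contains(sig, "hashing"))
```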

12.
Light Detection and Ranging (LIDAR) has become one of the prime technologies for rapid collection of vast spatial data, usually stored in the LAS file format (the LIDAR data exchange format standard). In this article, a new method for lossless LIDAR LAS file compression is presented. The method applies three consecutive steps: predictive coding, variable-length coding, and arithmetic coding. The key to the method is the prediction scheme, in which four different predictors are used: three predictors for the x, y, and z coordinates and one for the scalar values associated with each LIDAR point. The method has been compared with popular general-purpose methods and with a method developed specifically for compressing LAS files, and it turns out to be the most efficient in all test cases. On average, a LAS file is losslessly compressed to 12% of its original size.
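The sketch below shows the flavor of the first two stages: each coordinate is predicted from the previous point, and the signed residuals are zigzag-mapped and written as variable-length (varint) bytes. It uses a single previous-point predictor rather than the paper's four specialized predictors, and the arithmetic-coding stage is omitted.

```python
def zigzag(n):
    """Map signed residuals to unsigned ints: 0,-1,1,-2,2 -> 0,1,2,3,4."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)

def varint(u):
    """7 bits per byte, high bit set while more bytes follow."""
    out = bytearray()
    while True:
        b = u & 0x7F
        u >>= 7
        out.append(b | (0x80 if u else 0))
        if not u:
            return bytes(out)

def encode_points(points):
    """Delta-predict x, y, z from the previous point, then zigzag + varint."""
    prev = (0, 0, 0)
    out = bytearray()
    for p in points:
        for prev_c, c in zip(prev, p):
            out += varint(zigzag(c - prev_c))
        prev = p
    return bytes(out)

points = [(1000, 2000, 150), (1003, 1998, 151), (1007, 1997, 151)]
encoded = encode_points(points)
print(len(encoded), "bytes vs", 3 * 4 * len(points), "bytes for raw 32-bit ints")
```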

13.
To address the problems in electronic document circulation on the Internet, where documents are not registered in a unified way, their whereabouts are not tracked, and the circulation process is not standardized, a blockchain-based electronic document circulation scheme is proposed. First, using the multi-center architecture of a consortium blockchain, the design goals and architecture of a blockchain-based document circulation system are laid out. Then, a cloud storage platform holds the documents to provide upload functionality, and ownership-transfer records are timestamped so that the circulation process is continuous, linked, traceable, and trustworthy; on this basis the blockchain-based circulation system is implemented, with database calls handling data access to provide data synchronization and tracing. Finally, smart contracts for transferring and querying document ownership are proposed; these contracts verify and protect document content by reading the document identifier. Security analysis and performance tests show that, compared with existing document circulation systems, the scheme is more secure and makes the circulation information more credible, while the short execution time of the smart contracts gives the system better reliability and traceability.
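To make the ownership-transfer chain concrete, here is a minimal single-node sketch: every transfer record carries a timestamp and the hash of the previous record, so the history is linked and any tampering is detected on verification. It deliberately ignores the consortium-chain consensus, the cloud storage of file content, and the smart-contract layer described above; field names are invented for the example.

```python
import hashlib, json, time

def _hash(body):
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

class TransferLedger:
    """Append-only, hash-chained log of ownership transfers for electronic files."""
    def __init__(self):
        self.chain = []

    def transfer(self, file_id, from_owner, to_owner):
        body = {
            "index": len(self.chain),
            "timestamp": time.time(),               # timestamped transfer record
            "file_id": file_id,
            "from": from_owner,
            "to": to_owner,
            "prev_hash": self.chain[-1]["hash"] if self.chain else "0" * 64,
        }
        self.chain.append(dict(body, hash=_hash(body)))

    def verify(self):
        for i, rec in enumerate(self.chain):
            body = {k: v for k, v in rec.items() if k != "hash"}
            if rec["hash"] != _hash(body):
                return False                        # record was tampered with
            if i and rec["prev_hash"] != self.chain[i - 1]["hash"]:
                return False                        # chain linkage broken
        return True

ledger = TransferLedger()
ledger.transfer("doc-001", "unit-A", "unit-B")
ledger.transfer("doc-001", "unit-B", "unit-C")
print(ledger.verify())                  # True
ledger.chain[0]["to"] = "intruder"      # tampering is detected
print(ledger.verify())                  # False
```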

14.
Co-allocation architectures were developed to enable parallel transfer of files from multiple replicas stored on different servers. Several co-allocation strategies have been proposed to exploit the different transfer rates among the various client-server links and to cope with dynamic rate fluctuations by dividing files into multiple blocks of equal size. This paper presents a dynamic file transfer scheme, called the dynamic adjustment strategy (DAS), for co-allocation architectures that concurrently transfer a file from multiple replicas stored on multiple servers within a data grid. The scheme overcomes the performance penalty caused by the idle waiting time of faster servers in co-allocation-based file transfers and therefore reduces file transfer time. A tool with a user-friendly interface for managing replicas and downloads in a data grid environment is also described. Experimental results show that DAS achieves high file transfer speeds and reduces the cost of reassembling data blocks.
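A toy sketch of the idle-time problem DAS addresses: if the next block is always handed to the server projected to finish it earliest, fast servers keep working instead of waiting for slow ones. Real DAS re-measures transfer rates and adjusts during the transfer; the static rates and block size below are assumptions.

```python
def assign_blocks(rates_mbps, n_blocks, block_mb=16):
    """Greedy co-allocation: give the next block to the server that would
    finish it earliest, so faster servers are never left idle."""
    finish = [0.0] * len(rates_mbps)          # projected busy-until time per server
    plan = [[] for _ in rates_mbps]
    for blk in range(n_blocks):
        costs = [finish[s] + block_mb / r for s, r in enumerate(rates_mbps)]
        s = costs.index(min(costs))
        finish[s] = costs[s]
        plan[s].append(blk)
    return plan, max(finish)

plan, total = assign_blocks([40.0, 10.0, 25.0], n_blocks=12)
print([len(p) for p in plan], round(total, 1), "s")
```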

16.
Inverted file partitioning schemes in multiple disk systems
Multiple-disk I/O systems (disk arrays) have been an attractive approach to meeting the high-performance I/O demands of data-intensive applications such as information retrieval systems. When files are partitioned and distributed across multiple disks to exploit the potential for I/O parallelism, a balanced I/O workload distribution becomes important for good performance. Naturally, the performance of a parallel information retrieval system using an inverted file structure is affected by how the inverted file is partitioned. In this paper, we propose two different partitioning schemes for an inverted file system on a shared-everything multiprocessor machine with multiple disks, and we study their performance by simulation under a number of workloads in which the term frequencies in the documents, the term frequencies in the queries, the number of disks, and the multiprogramming level are varied.
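For illustration, the sketch below contrasts the two classic ways an inverted file can be spread over disks, which is the design space such partitioning schemes explore: term partitioning keeps a whole posting list on one disk, while document partitioning scatters each list so that one query engages all disks. The tiny posting lists and the disk count are made up.

```python
from collections import defaultdict

DISKS = 3
postings = {                       # tiny inverted file: term -> sorted doc ids
    "hash":  [1, 4, 7, 9],
    "file":  [1, 2, 3, 4, 5, 8],
    "index": [2, 6],
}

def term_partition(postings):
    """The whole posting list of a term lives on a single disk."""
    disks = defaultdict(dict)
    for term, docs in postings.items():
        disks[hash(term) % DISKS][term] = docs
    return dict(disks)

def document_partition(postings):
    """Each posting goes to the disk that owns its document, so every disk
    holds a fragment of each long list and one query can use all disks."""
    disks = defaultdict(lambda: defaultdict(list))
    for term, docs in postings.items():
        for d in docs:
            disks[d % DISKS][term].append(d)
    return {k: dict(v) for k, v in disks.items()}

print(term_partition(postings))
print(document_partition(postings))
```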

17.
A proof-of-ownership method for deduplication of encrypted data in cloud storage
With the wide adoption of cloud computing services, a new technique, client-side deduplication, has emerged to save disk space and bandwidth. Recently, however, a new attack against this technique was discovered: an attacker who obtains only a digest of the original file, namely its hash value, can retrieve the entire original file from the server. To solve this security problem, a cryptographically secure and efficient proof scheme is proposed to support deduplication of encrypted files across multiple clients. Spot-check sampling, dynamic coefficients, and randomly chosen retrieval values of the original file allow the scheme to be both secure and efficient; in addition, a clever distributed piggybacking technique carries out the distribution of the file encryption key in parallel with the proof-of-ownership process. Finally, a rigorous security proof and an in-depth performance analysis and simulation show that the scheme not only reaches a provable security level but also executes efficiently, especially in reducing the client-side computation load.
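The sketch below shows the basic sampled challenge-response shape of a proof of ownership: the server, which already stores the file, challenges a would-be uploader with a fresh nonce and a few random block indices, so knowing only the file hash is no longer enough. The paper's dynamic coefficients, encrypted-file handling, and piggybacked key distribution are not modeled; block size and sample size are arbitrary.

```python
import hashlib, os, random

BLOCK = 1024

def split(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

class Server:
    """Already holds the file; challenges a new uploader before deduplicating."""
    def __init__(self, data):
        self.blocks = split(data)

    def challenge(self, k=4):
        return random.sample(range(len(self.blocks)), k), os.urandom(16)

    def verify(self, indices, nonce, responses):
        expected = [hashlib.sha256(nonce + self.blocks[i]).digest() for i in indices]
        return expected == responses

class Client:
    def __init__(self, data):
        self.blocks = split(data)

    def respond(self, indices, nonce):
        # A fresh nonce means a precomputed file hash is not enough to answer.
        return [hashlib.sha256(nonce + self.blocks[i]).digest() for i in indices]

data = os.urandom(8 * BLOCK)
server = Server(data)
owner, pretender = Client(data), Client(data[:BLOCK] * 8)   # pretender lacks the file
idx, nonce = server.challenge()
print(server.verify(idx, nonce, owner.respond(idx, nonce)))      # True
print(server.verify(idx, nonce, pretender.respond(idx, nonce)))  # False
```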

18.
The Ceph storage system suffers from metadata-server performance bottlenecks and low read efficiency when storing small files. Starting from the inherent data associations among small files, this paper uses a lightweight pattern-matching algorithm to extract association features and merges small files accordingly, which makes the grouping within merged files more reasonable; when a file is read, the other small files in the same merged file are placed in the client cache to raise the cache hit rate. Experiments verify that the scheme effectively improves small-file access efficiency.

19.
To address the problems that data backups occupy a large amount of space and require extra measures to protect their confidentiality, a file backup scheme based on network coding is proposed. Its core idea is to apply network coding operations to multiple backup files and store the resulting coded backup files on the backup server. The scheme covers the basic backup principle, the backup process, the impact of file updates on backup recovery and how it is handled, backup updating, and a global backup control system. Experiments and theoretical analysis show that, compared with traditional backup schemes, the scheme saves a large amount of storage space and improves the confidentiality of backup files, at the cost of a slight drop in file recoverability and increased system complexity.
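As the simplest possible stand-in for coding several backup files together, the sketch below stores a single XOR parity of the files (a linear code over GF(2)): the coded copy takes the space of one file, does not by itself reveal any individual file, and can rebuild one lost original from the parity and the other originals. The paper's scheme uses richer network-coding operations and a global backup controller, which are not reproduced here.

```python
def xor_encode(files):
    """Pad files to equal length and keep their XOR as the coded backup copy."""
    size = max(len(f) for f in files)
    parity = bytearray(size)
    for f in files:
        for i, byte in enumerate(f.ljust(size, b"\0")):
            parity[i] ^= byte
    return bytes(parity), [len(f) for f in files]

def xor_recover(parity, surviving, lost_len):
    """Rebuild the single missing original from the parity and the survivors."""
    buf = bytearray(parity)
    for f in surviving:
        for i, byte in enumerate(f.ljust(len(parity), b"\0")):
            buf[i] ^= byte
    return bytes(buf[:lost_len])

f1, f2, f3 = b"payroll 2024", b"contract draft v3", b"audit log"
parity, lengths = xor_encode([f1, f2, f3])
print(xor_recover(parity, [f1, f3], lengths[1]))   # b'contract draft v3'
```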

20.