Similar Documents
20 similar documents found (search time: 31 ms)
1.
Bit transposition for very large scientific and statistical databases (Total citations: 2; self-citations: 0; citations by others: 2)
Conventional access methods cannot be used effectively in large Scientific/Statistical Database (SSDB) applications. A file structure, called the bit transposed file (BTF), is proposed which offers several attractive features better suited to the special characteristics that SSDBs exhibit. This file structure is an extreme version of the (attribute) transposed file. The data are stored by vertical bit partitions. The bit patterns of attributes are assigned using one of several data encoding methods, each appropriate for different query types. The bit partitions can also be compressed using a version of the run-length encoding scheme. Efficient operators on compressed bit vectors have been developed and form the basis of a query language. Because of the simplicity of the file structure and query language, optimization problems for database design, query evaluation, and common subexpression removal can be formalized, and efficient exact or near-optimal solutions can be achieved. In addition to selective power with low overheads for SSDBs, the BTF is also amenable to special parallel hardware. Results from experiments with the file structure suggest that this approach may be a reasonable alternative file structure for large SSDBs. Supported by the Office of Energy Research, U.S. DOE under Contract No. DE-AC03-76SF00098. On leave from the Department of Computer Science, Heilongjiang University, China.
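The core of the BTF idea is simple enough to sketch. Below is a minimal, illustrative Python version (all names are invented here, not from the paper): attribute values become vertical bit planes, a plane can be run-length encoded, and an equality query reduces to bitwise operations across planes.

```python
def to_bit_planes(values, width):
    """Transpose integer values into `width` vertical bit planes."""
    return [[(v >> b) & 1 for v in values] for b in range(width)]

def rle(bits):
    """Run-length encode a bit vector as (first_bit, run_lengths)."""
    runs, count = [], 1
    for prev, cur in zip(bits, bits[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return bits[0], runs

def equality_query(planes, width, key):
    """Return a selection bit vector: 1 where the stored value == key."""
    n = len(planes[0])
    result = [1] * n
    for b in range(width):
        # take the plane directly where the key bit is 1, its complement where 0
        match = planes[b] if (key >> b) & 1 else [1 - x for x in planes[b]]
        result = [r & m for r, m in zip(result, match)]
    return result

values = [3, 5, 3, 7, 0, 3]
planes = to_bit_planes(values, width=3)
print(equality_query(planes, 3, 3))  # -> [1, 0, 1, 0, 0, 1]
print(rle(planes[2]))                # -> (0, [1, 1, 1, 1, 2])
```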

2.
Digital devices are increasingly being used in various crimes, and it is therefore important for law enforcement agencies to be able to investigate and analyze them. Accordingly, there is an increasing demand for digital forensic technologies that can recover data concealed or deleted by criminals. Various digital forensic tools are available for Windows-based systems, but relatively few for Linux-based systems. This paper therefore proposes a deleted-file recovery technique for the Ext2/3 filesystems commonly used in Linux. The research involved analyzing the Ext filesystem structure, the file storage structure, and file metadata. The shortcomings of the existing methods, and the ways the proposed technique addresses them, are presented, along with a comparison of the performance of the proposed technique against that of the existing methods.
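To make the recovery step concrete, here is a hedged Python sketch of scanning an Ext2 inode table for deleted inodes. It assumes the inode table offset and inode count are already known (real tools derive them from the superblock and group descriptors), and it relies on the fact that Ext2, unlike Ext3, leaves direct block pointers intact on deletion. Offsets follow the classic 128-byte Ext2 inode layout; this is an illustration, not the paper's tool.

```python
import struct

INODE_SIZE = 128  # classic ext2 inode size

def scan_deleted_inodes(image_path, inode_table_offset, inode_count):
    """List inodes whose deletion time is set, with surviving block pointers."""
    deleted = []
    with open(image_path, "rb") as img:
        for i in range(inode_count):
            img.seek(inode_table_offset + i * INODE_SIZE)
            raw = img.read(INODE_SIZE)
            i_size = struct.unpack_from("<I", raw, 4)[0]    # file size
            i_dtime = struct.unpack_from("<I", raw, 20)[0]  # deletion time
            i_block = struct.unpack_from("<12I", raw, 40)   # direct blocks only
            if i_dtime != 0 and i_size > 0:  # marked deleted, had content
                blocks = [b for b in i_block if b != 0]
                deleted.append((i + 1, i_size, i_dtime, blocks))
    return deleted

# hypothetical usage, with an assumed inode table location:
# for no, size, dtime, blocks in scan_deleted_inodes("disk.img", 5242880, 1000):
#     print(no, size, blocks)
```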

3.
We believe that currently marketed programs leave unexploited much of the potential of the spreadsheet interface. The purpose of our work is to obtain suggestions for wider application of this interface by showing how to obtain its main features as a subset of logic programming. Our work is based on two observations. The first is that spreadsheets would already be a useful enhancement to interactive languages such as APL and Basic. Although Prolog is also an interactive language, this interface cannot be used in the same direct way. Hence our second observation: the usual query mechanism of Prolog does not provide the kind of interaction this application requires. But it can be provided by the Incremental Query, a new query mechanism for Prolog. The two observations together yield the spreadsheet as a display of the state of the substitution of an incremental query in Prolog. Recalculation of dependent cells is achieved by automatic modification of the query in response to a new increment that would make it unsolvable without the modification.
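For readers unfamiliar with the target behaviour, the following tiny Python sketch shows the spreadsheet semantics being reconstructed: formulas over cells, with dependents recomputed when a cell changes. It mirrors only the interface; the paper obtains this declaratively through incremental queries, and the naive full recalculation here is purely illustrative.

```python
class Sheet:
    def __init__(self):
        self.formulas = {}   # cell -> function(sheet) returning a value
        self.values = {}

    def set(self, cell, formula):
        self.formulas[cell] = formula
        self.recalculate()

    def recalculate(self):
        # naive full recalculation; dependency order resolved lazily via get()
        self.values = {}
        for cell in self.formulas:
            self.get(cell)

    def get(self, cell):
        if cell not in self.values:
            self.values[cell] = self.formulas[cell](self)
        return self.values[cell]

s = Sheet()
s.set("A1", lambda sh: 2)
s.set("A2", lambda sh: 3)
s.set("A3", lambda sh: sh.get("A1") + sh.get("A2"))
print(s.get("A3"))          # 5
s.set("A1", lambda sh: 10)  # dependent cell A3 is recomputed
print(s.get("A3"))          # 13
```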

4.
5.
6.
Design of a Geographic Information Query Shell (Total citations: 1; self-citations: 0; citations by others: 1)
Based on an analysis of the traditional development model for application-oriented geographic information systems, this paper proposes the development model of a Geographical Information System Shell (GISS). It then describes in detail the implementation of a Geographical Information Query Shell (GIQS) guided by the GISS approach, covering the system architecture, file types, the classification and coding scheme, and the implementation of bidirectional multi-function queries. The indexing mechanism and compressed-tree binary search method of GIQS greatly improve operational efficiency.

7.
This paper defines a data gene model suited to P2P file sharing and presents the architecture of a P2P file-sharing platform based on it. The platform uses each file's data genome to organize and manage shared files. Because different versions of the same file carry different gene information, they can coexist in the system for users to access, which simplifies the data consistency problem. Since query processing can exploit the records of related files kept in a file's data genome, queries are executed more efficiently. The paper also gives the platform's data query algorithm and update strategy.

8.
Clark's query evaluation procedure for computing negative information in deductive databases using a "negation as failure" inference rule requires a safe computation rule, which may select negative literals only if they are ground. This is a very restrictive condition, which weakens the usefulness of negation as failure in a query evaluation procedure. This paper studies the definition and properties of the "not" predicate as defined in most Prolog systems, which do not enforce the above condition of a safe computation rule. We show that negation in clauses and the "not" predicate of Prolog are not the same; in fact, a Prolog program may not be in clause form. An extended query evaluation procedure with an extended safe computation rule is proposed to evaluate queries involving the "not" predicate. The soundness and completeness of this extended query evaluation procedure with respect to a class of logic programs are proved. Such an extended query evaluation procedure can be realized in a Prolog system by a preprocessor for executing range-restricted programs, and requires no modification to the interpreter/compiler of an existing Prolog system. We compare the proposed procedure with the extended program proposed by Lloyd and Topor and with the negation constructs in NU-Prolog. The use of the "not" predicate for integrity constraint checking in deductive databases is also presented.
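The following minimal Python sketch (with an invented tuple representation for atoms, and None for unbound variables) illustrates negation as failure and the safeness condition at issue: a negative literal may be evaluated only once it is ground, otherwise the computation flounders.

```python
facts = {("parent", "ann", "bob"), ("parent", "bob", "cal")}

def is_ground(goal):
    return None not in goal

def holds(goal):
    """Succeeds iff a ground goal is a known fact."""
    return goal in facts

def naf(goal):
    """not(goal): fail if goal succeeds, succeed if it finitely fails."""
    if not is_ground(goal):
        # a safe computation rule must delay (here: reject) non-ground
        # negative literals, otherwise unsound answers can be derived
        raise ValueError("floundering: negative literal is not ground")
    return not holds(goal)

print(naf(("parent", "cal", "ann")))  # True: not provable, so "not" succeeds
print(naf(("parent", "ann", "bob")))  # False: provable, so "not" fails
# naf(("parent", None, "ann")) would raise -> must be delayed instead
```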

9.
This paper suggests a general method for compiling OR-parallelism into AND-parallelism. An interpreter for an AND/OR-parallel language written in the AND-parallel subset of the language induces a source-to-source transformation from the full language into the AND-parallel subset. This transformation can be identified and implemented as a special-purpose compiler or applied using a general-purpose partial evaluator. The method is demonstrated by compiling a variant of Concurrent Prolog into an AND-parallel subset of the language called Flat Concurrent Prolog (FCP). It is also shown to be applicable to the compilation of OR-parallel Prolog to FCP. The transformation identified is simple and efficient. The performance of the method is discussed in the context of programming examples, which compare well with conventionally compiled Prolog programs.

10.
We consider the parallel time complexity of logic programs without function symbols, called logical query programs, or Datalog programs. We give a PRAM algorithm for computing the minimum model of a logical query program, and show that for programs with the "polynomial fringe property" this algorithm runs in time that is logarithmic in the input size, assuming that concurrent writes are allowed if they are consistent. As a result, the "linear" and "piecewise linear" classes of logic programs are in NC. Then we examine several nonlinear classes in which the program has a single recursive rule that is an "elementary chain." We show that certain nonlinear programs are related to GSM mappings of a balanced parentheses language, and that this relationship implies the "polynomial fringe property"; hence such programs are in NC. Finally, we describe an approach for demonstrating that certain logical query programs are log-space complete for P, and apply it to both elementary single rule programs and nonelementary programs.
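The intuition behind the NC membership of linear programs can be sketched with transitive closure: the linear rule path(X,Y) :- edge(X,Z), path(Z,Y) can be evaluated in O(log n) rounds by rewriting it into the doubling form path(X,Y) :- path(X,Z), path(Z,Y) and repeatedly squaring a boolean reachability matrix, each round being highly parallel. A sequential Python illustration (not the paper's PRAM algorithm):

```python
def transitive_closure(n, edges):
    """Reach the minimum model of path/2 by repeated boolean matrix squaring."""
    reach = [[False] * n for _ in range(n)]
    for u, v in edges:
        reach[u][v] = True
    rounds, changed = 0, True
    while changed:                      # at most ~log2(n) productive rounds
        changed = False
        new = [row[:] for row in reach]
        for i in range(n):
            for k in range(n):
                if reach[i][k]:
                    for j in range(n):
                        if reach[k][j] and not new[i][j]:
                            new[i][j] = True   # path doubles in length
                            changed = True
        reach = new
        rounds += 1
    return reach, rounds

reach, rounds = transitive_closure(5, [(0, 1), (1, 2), (2, 3), (3, 4)])
print(reach[0][4], rounds)  # True, reached in a logarithmic number of rounds
```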

11.
As data exploration has grown rapidly in recent years, datastores and data processing have received more and more attention for extracting important information. Finding a scalable way to process large-scale data is a critical issue in both relational database systems and the emerging NoSQL databases. With the inherent scalability and fault tolerance of Hadoop, MapReduce is attractive for processing massive data in parallel. Most previous research has focused on developing SQL or SQL-like query translators over the Hadoop distributed file system. However, it can be difficult to update data frequently in such a file system. We therefore need a flexible datastore such as HBase, not only to place the data over a scale-out storage system, but also to manipulate changeable data in a transparent way. The HBase interface, however, is not friendly enough for most users; a GUI composed of a SQL client application and a database connection to HBase eases the learning curve. In this paper, we propose the JackHare framework, with a SQL query compiler, a JDBC driver, and a systematic method using the MapReduce framework for processing unstructured data in HBase. After importing the JDBC driver into a SQL client GUI, we can use HBase as the underlying datastore to execute ANSI-SQL queries. Experimental results show that our approach performs well in terms of efficiency and scalability.
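The query-translation idea can be sketched without the real HBase API: a simple SQL selection compiles into a scan with a pushed-down per-row predicate, which is the shape the generated MapReduce job takes. The in-memory table, column names, and data below are invented stand-ins for an HBase wide-column table.

```python
table = {  # row key -> {column: value}, mimicking a wide-column store
    "r1": {"cf:name": "ann", "cf:age": "34"},
    "r2": {"cf:name": "bob", "cf:age": "27"},
    "r3": {"cf:name": "cal", "cf:age": "41"},
}

def select(columns, where_col, op, literal):
    """SELECT columns FROM table WHERE where_col op literal (op in <, =, >)."""
    ops = {"<": lambda a, b: a < b, "=": lambda a, b: a == b,
           ">": lambda a, b: a > b}
    for row_key, row in table.items():   # the "map" phase: scan all rows
        value = int(row[where_col])
        if ops[op](value, literal):      # predicate pushed into the scan
            yield row_key, {c: row[c] for c in columns}

# equivalent of: SELECT cf:name FROM table WHERE cf:age > 30
for key, cols in select(["cf:name"], "cf:age", ">", 30):
    print(key, cols)  # r1 {'cf:name': 'ann'}, then r3 {'cf:name': 'cal'}
```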

12.
High-definition video applications often require heavy computation, high bandwidth, and large amounts of memory, which make their real-time implementation difficult. Multi-core architectures with parallelism provide new solutions for implementing complex multimedia applications in real time. It is well known that the speed of an H.264 encoder can be increased on a multi-core architecture using parallelism. Most of the parallelization methods proposed earlier for this purpose suffer from limited scalability and data dependency. In this paper, we present results obtained using data-level parallelism at the Group-Of-Pictures (GOP) level for the video encoder. In the proposed technique each GOP is encoded independently; it is implemented on JM 18.0 using advanced data structures and OpenMP programming techniques. The performance of the parallelized video encoder is evaluated at various resolutions in terms of encoding speed, bit rate, memory requirements, and PSNR. The results show that with GOP-level parallelism very high speedups can be achieved without much degradation in video quality.
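A Python sketch of the GOP-level decomposition follows (the paper's implementation uses OpenMP inside JM 18.0; the encoder below is a placeholder): because every GOP starts from an I-frame, GOPs can be encoded in independent workers and their bitstreams concatenated in order.

```python
from multiprocessing import Pool

GOP_SIZE = 8

def encode_gop(gop):
    # placeholder "encoder": real work would be H.264 encoding of the GOP,
    # which starts from an I-frame and so depends on no other GOP
    return bytes(frame % 256 for frame in gop)

def encode_parallel(frames, workers=4):
    gops = [frames[i:i + GOP_SIZE] for i in range(0, len(frames), GOP_SIZE)]
    with Pool(workers) as pool:
        chunks = pool.map(encode_gop, gops)  # GOP order is preserved
    return b"".join(chunks)

if __name__ == "__main__":
    bitstream = encode_parallel(list(range(64)))
    print(len(bitstream))  # 64 "encoded" frames
```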

13.
14.
Semistructured data has no absolute schema fixed in advance and its structure may be irregular or incomplete. Such data commonly arises in sources that do not impose a rigid structure (such as the World-Wide Web) and when data is combined from several heterogeneous sources. Data models and query languages designed for well structured data are inappropriate in such environments. Starting with a lightweight object model adopted for the TSIMMIS project at Stanford, in this paper we describe a query language and object repository designed specifically for semistructured data. Our language provides meaningful query results in cases where conventional models and languages do not: when some data is absent, when data does not have regular structure, when similar concepts are represented using different types, when heterogeneous sets are present, and when object structure is not fully known. This paper motivates the key concepts behind our approach, describes the language through a series of examples (a complete semantics is available in an accompanying technical report [23]), and describes the basic architecture and query processing strategy of the lightweight object repository we have developed.
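The flavour of such tolerant querying can be sketched in a few lines of Python: a path expression is evaluated over irregular nested data, and objects with missing fields or unexpected types simply contribute no answers. The data and the dot-path encoding below are illustrative only, not the paper's language.

```python
db = {"people": [
    {"name": "ann", "address": {"city": "Oslo", "zip": "0150"}},
    {"name": "bob", "address": "unknown"},   # irregular type
    {"name": "cal"},                         # missing field
]}

def path(value, steps):
    """Yield every value reachable by following `steps` through dicts/lists."""
    if not steps:
        yield value
        return
    head, rest = steps[0], steps[1:]
    if isinstance(value, dict) and head in value:
        yield from path(value[head], rest)
    elif isinstance(value, list):
        for item in value:
            yield from path(item, steps)  # descend into each set member

print(list(path(db, ["people", "address", "city"])))  # ['Oslo']
```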

15.
This paper presents some benchmark timings from an optimising Prolog compiler that uses global analysis, on a RISC workstation, the MIPS R2030. These results are extremely promising; for example, the infamous naive reverse benchmark runs at 2 mega LIPS. We compare these timings with those of other Prolog implementations running on the same workstation, and with published timings for the KCM, a recent piece of special-purpose Prolog hardware. The comparison suggests that global analysis is a fruitful source of information for an optimising Prolog compiler and that the performance of special-purpose Prolog hardware can be at least matched by the code from a compiler using such information. We include some analysis of the sources of the improvement that global analysis yields. An overview of the compiler is given and some implementation issues are discussed. This paper is an extended version of Ref. 15).
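For concreteness, the naive reverse benchmark can be transcribed with an inference counter: reversing a 30-element list costs 496 logical inferences, so 2 mega LIPS corresponds to roughly 4,000 such reversals per second. A Python transcription of the standard Prolog definitions:

```python
inferences = 0

def append(xs, ys):
    global inferences
    inferences += 1                 # one inference per append/3 call
    return ys if not xs else [xs[0]] + append(xs[1:], ys)

def nrev(xs):
    global inferences
    inferences += 1                 # one inference per nrev/2 call
    return [] if not xs else append(nrev(xs[1:]), [xs[0]])

nrev(list(range(30)))
print(inferences)                   # 496
```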

16.
With the rapid growth of video surveillance applications, the energy consumed by video surveillance storage has become more noticeable, but existing energy-saving methods for massive storage systems concentrate mainly on data centers dominated by random accesses. Video surveillance storage has an inherent access pattern and requires a special energy-saving approach to save more energy. Semi-RAID, an energy-efficient data layout for video surveillance, is proposed. It adopts a partial-parallelism strategy, which partitions disk data into different groups and implements parallel accesses within each group. Grouping makes it possible to keep only some of the disks working while the rest idle, and inner-group parallelism provides the performance guarantee. In addition, a greedy strategy for address allocation is adopted to effectively prolong the idle periods of the disks, and dedicated cache strategies are used to filter out the small number of random accesses. The energy-saving efficiency of Semi-RAID is verified on a simulated video surveillance system consisting of 32 cameras at D1 resolution. The experiments show that Semi-RAID saves 45% more energy than Hibernator, 80% more than PARAID, 33% more than MAID, and 79% more than eRAID-5, while providing single-disk fault tolerance and meeting performance requirements such as throughput.
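The partial-parallelism layout can be sketched as follows (group size, disk count, and the capacity unit are invented; the real greedy policy and cache filtering are more involved): sequential surveillance writes fill one group at a time, striped across its members for throughput, so all other groups can stay spun down.

```python
DISKS, GROUP_SIZE, BLOCKS_PER_DISK = 8, 2, 4

groups = [list(range(g, g + GROUP_SIZE))
          for g in range(0, DISKS, GROUP_SIZE)]   # [[0,1],[2,3],[4,5],[6,7]]

def allocate(n_blocks):
    """Greedily fill one group before waking the next; returns placements."""
    placements, active, used = [], 0, 0
    for block in range(n_blocks):
        if used == GROUP_SIZE * BLOCKS_PER_DISK:  # group full -> next group
            active += 1
            used = 0
        disk = groups[active][used % GROUP_SIZE]  # stripe within the group
        placements.append((block, disk))
        used += 1
    return placements, active

placements, active = allocate(20)
print(placements[:4])  # blocks striped over disks 0 and 1 only
print(active)          # 2 -> only 3 of 4 groups ever spun up; the rest idle
```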

17.
In this paper, we propose an efficient encoding and labeling scheme for XML, called EXEL, which is a variant of the region labeling scheme using ordinal, insert-friendly bit strings. We devise a binary encoding method to generate the ordinal bit strings, and an algorithm that creates a new bit string inserted between two bit strings without affecting the order of the preexisting ones. This binary encoding method and bit-string insertion algorithm form the basis of efficient query processing and the complete avoidance of re-labeling on updates. We present query processing and update processing methods based on EXEL. In addition, the Stack-Tree-Desc algorithm is used for efficient structural joins, and String B-tree indexing is utilized to improve join performance. Finally, the experimental results show that EXEL enables complete avoidance of re-labeling for updates while providing fairly reasonable query processing performance.
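The property the labels must have, a total order plus unlimited insertion without re-labeling, can be illustrated with one simple scheme (not EXEL's actual encoding): read each bit string as a binary fraction in (0,1), and generate a label between two others by averaging. Labels produced this way always end in 1, so their fraction order agrees with their lexicographic order.

```python
from fractions import Fraction

def to_fraction(bits):
    return Fraction(int(bits, 2), 2 ** len(bits))

def to_bits(frac):
    bits = ""
    while frac:                 # terminates: denominator is a power of 2
        frac *= 2
        bits += "1" if frac >= 1 else "0"
        frac -= int(frac)
    return bits

def between(a, b):
    """A label strictly between labels a and b (with a < b as fractions)."""
    return to_bits((to_fraction(a) + to_fraction(b)) / 2)

a, b = "01", "1"                # 0.25 and 0.5
m = between(a, b)               # "011" = 0.375
print(m, to_fraction(a) < to_fraction(m) < to_fraction(b))  # 011 True
```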

18.
The paper describes a high-level interactive conversational language, L-A-S (Linear Algebra and Systems), used in the analysis and design of linear control systems. The L-A-S language is written in FORTRAN-IV (DEC PDP-10), but its use does not require knowledge of standard programming languages. The modular structure of the language permits easy modification, updating, and extension. The syntax and semantics are simple and straightforward, so that familiarity with the language may be easily attained.

19.
Traditional random access file methods provide access to individual records only through the key attribute. For more versatile queries, a more flexible file structure is desirable. A linked-list structure built on top of traditional random access files provides a more flexible access path to a record. The purpose of this work is to examine various types of linked-list structures for application to database queries involving multiple keys, and it is shown how the technique can improve the performance of traditional random access files for a Grants Information System. The paper also describes the design and implementation of the Grants Information System and concludes that the linked-list structure built on top of traditional random access files provides a more flexible path to a record. The language used in the implementation was COBOL.
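The multi-key linked-list ("multilist") organisation is easy to sketch: each record carries, per secondary key, a pointer to the next record with the same key value, and a directory maps each key value to the head of its chain. The Python below uses dicts as stand-ins for fixed-length file slots; the key names (agency, field) are invented for illustration.

```python
records = []   # slot number -> record
heads = {}     # (key_name, value) -> first slot holding that value

def insert(record):
    slot = len(records)
    for key in ("agency", "field"):       # the secondary keys
        head = heads.get((key, record[key]))
        record["next_" + key] = head      # chain onto the previous head
        heads[(key, record[key])] = slot
    records.append(record)

def lookup(key, value):
    slot = heads.get((key, value))
    while slot is not None:               # follow the per-key chain
        yield records[slot]
        slot = records[slot]["next_" + key]

insert({"title": "G1", "agency": "NSF", "field": "CS"})
insert({"title": "G2", "agency": "DOE", "field": "CS"})
insert({"title": "G3", "agency": "NSF", "field": "Bio"})
print([r["title"] for r in lookup("agency", "NSF")])  # ['G3', 'G1']
```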

20.
An XPath-Oriented Compression Method for XML Data Streams (Total citations: 13; self-citations: 0; citations by others: 13)
Because XML (extensible markup language) is self-describing, XML data streams carry a large amount of redundant structural information. How to compress an XML stream so as to reduce network transmission cost while effectively supporting query processing over the compressed stream has become a new research area. Existing XML compression techniques either require multiple scans of the data or do not support real-time query processing over streams. This paper proposes XSC (XML stream compression), a technique that compresses and decompresses XML streams in real time. XSC dynamically builds a dictionary of XML element event sequences and emits the corresponding indexes. Given the DTD that an XML stream conforms to, XSC can derive an XML element event sequence graph and thus produce a more suitable structural sequence encoding before the compression scan. The compressed XML stream can be decompressed directly for XPath evaluation. Experiments show that, in an XML streaming environment, XSC outperforms traditional algorithms in both compression ratio and compression time, while the cost of executing queries over the compressed data remains acceptable.
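The dictionary step at the heart of such a scheme can be sketched as follows (this illustrates the general idea, not XSC's exact algorithm, and the DTD-guided pre-encoding is omitted): structural events of the stream are replaced by small integer codes defined on first occurrence, so repeated structure costs a code instead of a repeated tag name.

```python
def compress(events):
    """events: sequence like ('start','book'), ('text','...'), ('end','book')."""
    dictionary, out = {}, []
    for kind, value in events:
        if kind == "text":
            out.append(("T", value))            # character data passed through
            continue
        key = (kind, value)
        if key not in dictionary:
            dictionary[key] = len(dictionary)
            out.append(("D", key))              # first occurrence: define code
        else:
            out.append(("C", dictionary[key]))  # later occurrences: code only
    return out

stream = [("start", "book"), ("start", "title"), ("text", "XSC"),
          ("end", "title"), ("end", "book"),
          ("start", "book"), ("start", "title"), ("text", "again"),
          ("end", "title"), ("end", "book")]
for item in compress(stream):
    print(item)  # the second <book> element costs only integer codes
```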
