Similar literature
 Found 20 similar documents; search took 8 ms
1.
The programming language PLAIN has been designed to support conversational access to a data base, and incorporates relations as a built-in data type. This paper describes the architecture of the data base handler for PLAIN, emphasizing the separation of the data base handler from other aspects of the language processor, and the modularization of the data base architecture to support modifications to the language and its implementation with minimal difficulty. The data base architecture is layered in order to provide the greatest possible degree of information hiding and separation of functionality. The paper shows the structure of the data base handler and the functions of the various modules of the system.

2.
After a short introduction to the field of data base machines, the design of the RDBM (Relational Data Base Machine) is presented, which provides all the functions required of a relational data base system. Frequently used and time-consuming functions are supported by appropriate hardware components. A transaction-oriented multi-user system was designed which exploits the inherent parallelism of user tasks and data base functions. The RDBM consists of a quasi-associative mass store together with a system of special function processors with common access to a large main memory, and a general purpose mini-computer exercising overall control over all hardware components.

3.
4.
Hash join is used to join large, unordered relations and operates independently of the data distributions of the join relations. Real-world data sets are not uniformly distributed and often contain significant skew. Although partition skew has been studied for hash joins, no prior work has examined how exploiting data skew can improve the performance of hash join. In this paper, we present histojoin, a join algorithm that uses histograms to identify data skew and improve join performance. Experimental results show that for skewed data sets histojoin performs significantly fewer I/O operations and is faster by 10–60% than hybrid hash join.
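The idea of using frequency information to exploit skew can be illustrated with a small sketch. This is not the authors' histojoin implementation: it assumes the "histogram" is simply an exact frequency count of probe-side join keys, that build tuples for the most frequent keys stay in an in-memory table, and that all other tuples go through ordinary hash partitioning. The function name `skew_aware_hash_join` and the parameters `hot_k` and `num_partitions` are hypothetical.

```python
from collections import Counter, defaultdict


def skew_aware_hash_join(build, probe, hot_k=1, num_partitions=4):
    """Join two lists of (key, payload) tuples on key.

    Returns (results, deferred), where deferred counts probe tuples that
    had to be partitioned instead of being joined immediately. In a
    disk-based join those are the tuples that would cost extra I/O.
    """
    # Assumption: an exact frequency count stands in for a real histogram.
    freq = Counter(key for key, _ in probe)
    hot_keys = {key for key, _ in freq.most_common(hot_k)}

    # Build phase: tuples with hot keys stay in memory, the rest are partitioned.
    hot_table = defaultdict(list)
    build_parts = [defaultdict(list) for _ in range(num_partitions)]
    for key, payload in build:
        if key in hot_keys:
            hot_table[key].append(payload)
        else:
            build_parts[hash(key) % num_partitions][key].append(payload)

    # Probe phase: hot keys are joined at once, the rest are deferred.
    results = []
    probe_parts = [[] for _ in range(num_partitions)]
    deferred = 0
    for key, payload in probe:
        if key in hot_keys:
            results.extend((key, b, payload) for b in hot_table.get(key, ()))
        else:
            probe_parts[hash(key) % num_partitions].append((key, payload))
            deferred += 1

    # Cleanup phase: join each deferred partition pair.
    for table, rows in zip(build_parts, probe_parts):
        for key, payload in rows:
            results.extend((key, b, payload) for b in table.get(key, ()))
    return results, deferred
```

With skewed probe data, most probe tuples hit the in-memory table and only the tail is deferred, which is the effect the abstract attributes to exploiting skew.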

5.
The problem of efficiently finding similar items in a large corpus of high-dimensional data points arises in many real-world tasks, such as music, image, and video retrieval. Beyond the scaling difficulties that arise with lookups in large data sets, the complexity in these domains is exacerbated by an imprecise definition of similarity. In this paper, we describe a method to learn a similarity function from only weakly labeled positive examples. Once learned, this similarity function is used as the basis of a hash function to severely constrain the number of points considered for each lookup. Tested on a large real-world audio dataset, only a tiny fraction of the points (~0.27%) are ever considered for each lookup. To increase efficiency, no comparisons in the original high-dimensional space of points are required. The performance far surpasses, in terms of both efficiency and accuracy, a state-of-the-art Locality-Sensitive-Hashing-based (LSH) technique for the same problem and data set.

6.
Computers employing some degree of data flow organisation are now well established as providing a possible vehicle for concurrent computation. Although data-driven computation frees the architecture from the constraints of the single program counter, processor and global memory, inherent in the classic von Neumann computer, there can still be problems with the unconstrained generation of fresh result tokens if a pure data flow approach is adopted. The advantages of allowing serial processing for those parts of a program which are inherently serial, and of permitting a demand-driven, as well as data-driven, mode of operation are identified and described. The MUSE machine described here is a structured architecture supporting both serial and parallel processing which allows the abstract structure of a program to be mapped onto the machine in a logical way.

7.
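Full-life-cycle computer-aided automatic animation generation technology is applied to the domain of ancient Chinese architecture, yielding an automatic animation system for ancient buildings. Given a user's description of an ancient building, the system automatically generates a 3D animation showing how the building is assembled. The whole automatic process is supported by an ancient-architecture knowledge base, which is designed and implemented with semantic network technology and comprises two parts: an ontology base and a rule base. The paper mainly discusses the structure of the knowledge base and the composition of the ontology and rule bases, and illustrates the rule-inference process using the generation of the building assembly order as an example. Two schemes were implemented, one based on the ontology construction environment Protégé and the other on the rule inference system Jess; the latter cuts inference time by more than 99% compared with the former. The paper focuses on the design ideas and implementation techniques of the efficient scheme and analyzes its strengths and weaknesses.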

8.
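The support vector machine (SVM) is a new machine learning method with a solid foundation in statistical learning theory and excellent learning performance; it copes well with defects such as overfitting and poor generalization. However, when SVM classification algorithms are applied to practical problems, they compute slowly and handle problems inefficiently. This paper introduces a new learning algorithm, the rough SVM classification method, which combines rough sets with support vector machines: rough sets preprocess the SVM training samples, attribute reduction decreases the number of attributes, and during attribute reduction several suitable attribute sets are selected and merged into a new attribute set, giving the model some resistance to information loss. At the same time, the method makes full use of the good generalization performance of the SVM, thereby shortening sample training time and enabling fast fault diagnosis. Experimental results on aero-engine fault diagnosis demonstrate the superiority of the method.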

9.
10.
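Design and Implementation of a Data-Quality-Oriented ETL Framework   Cited by: 1 (self-citations: 0, citations by others: 1)
To address the shortcomings of the traditional extract-transform-load (ETL) architecture in data quality control, an ETL architecture oriented toward data quality management is proposed. Based on the characteristics of the ETL process, it comprises a multi-data-source interface module, an ETL metadata description module, an ETL task description module, a data quality control module, and others. With data quality at its core, the architecture builds a data analysis model and uses a rule inference engine to generate data cleaning plans from the data analysis results, thereby effectively assessing and managing the quality of the data flow. An ETL tool, DQETL, is developed on this design. DQETL is designed with the Unified Modeling Language and provides a friendly interface for centralized management of the ETL process. Finally, an example illustrates the general steps of data quality management under this framework.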

11.
One of the problems facing penetration testers is that a test can generate vast quantities of information that need to be stored, analysed and cross-referenced for later use. Consequently, this paper will present an architecture based on the encoding of information within an XML document. We will also demonstrate how, through application of the architecture, large quantities of security-related information can be captured within a single database schema. This database can then be used to ensure that systems are conforming to an organisation's network security policy.

12.
In this paper, we observe that in the seminal work on indifferentiability analysis of iterated hash functions by Coron et al. and in subsequent works, the initial value ($IV$) of hash functions is fixed. In addition, these indifferentiability results do not depend on the Merkle–Damgård (MD) strengthening in the padding functionality of the hash functions. We propose a generic $n$-bit-iterated hash function framework based on an $n$-bit compression function called suffix-free-prefix-free (SFPF) that works for arbitrary $IV$s and does not possess MD strengthening. We formally prove that SFPF is indifferentiable from a random oracle (RO) when the compression function is viewed as a fixed input-length random oracle (FIL-RO). We show that some hash function constructions proposed in the literature fit in the SFPF framework while others that do not fit in this framework are not indifferentiable from a RO. We also show that the SFPF hash function framework with the provision of MD strengthening generalizes any $n$-bit-iterated hash function based on an $n$-bit compression function and with an $n$-bit chaining value that is proven indifferentiable from a RO.

13.
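The network is an important piece of data center infrastructure and is usually divided into two parts: the front-end compute network and the back-end storage network. The back-end storage network is currently mainly an FC network, but with the development of IP technology, using FCoE to converge the front-end and back-end networks is becoming a trend. This article focuses on the front-end compute network.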

14.
Recent advances in microarray technology permit monitoring of the expression levels of a large set of genes across a number of time points simultaneously. For extracting knowledge from such a huge volume of microarray gene expression data, computational analysis is required. Clustering is one of the important data mining tools for analyzing such microarray data to group similar genes into clusters. Researchers have proposed a number of clustering algorithms for this purpose. In this article, an attempt has been made to improve the performance of fuzzy clustering by combining it with a support vector machine (SVM) classifier. A recently proposed real-coded variable string length genetic algorithm based clustering technique and an iterated version of fuzzy C-means clustering have been utilized for this purpose. The performance of the proposed clustering scheme has been compared with that of some well-known existing clustering algorithms and their SVM boosted versions for one simulated and six real life gene expression data sets. A statistical significance test based on analysis of variance (ANOVA) followed by a posteriori Tukey-Kramer multiple comparison test has been conducted to establish the statistical significance of the superior performance of the proposed clustering scheme. Moreover, the biological significance of the clustering solutions has been established.

15.
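When a data set contains insufficient training information, the supervised extreme learning machine is hard to apply. Semi-supervised learning is therefore applied to the extreme learning machine, and a semi-supervised extreme learning machine classification model is proposed. The model, however, is non-convex and non-smooth, so its global optimum is hard to obtain directly. To this end, a combinatorial optimization approach recasts the proposed semi-supervised extreme learning machine as a mixed-integer linear program, whose global optimum can be obtained directly. Furthermore, using near-infrared spectroscopy, the semi-supervised extreme learning machine is applied to pattern classification of near-infrared spectral data of drugs and hybrid seeds. Compared with traditional methods, numerical experiments in different spectral regions show that, when the data set contains insufficient training information, the proposed semi-supervised extreme learning machine improves the generalization ability of the model, verifying the feasibility and effectiveness of the proposed method.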

16.
The output of 18 software architecture evaluations is analyzed. The goal of the analysis is to find patterns in the important quality attributes and risk themes identified in the evaluations. The major results are:
A categorization of risk themes.
The observation that twice as many risk themes are risks of “omission” as are risks of “commission”.
A failure to find a relationship between the business and mission goals of a system and the risk themes from an evaluation of that system.
A failure to find a correlation between the domain of a system being evaluated and the important quality attributes for that system.
A wide diversity of names used for various quality attributes.
The results of this investigation have application to practitioners by suggesting activities on which developers should put greater focus. They also have application to researchers by suggesting further areas of investigation.

17.
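Zheng Yu, Gao Bao. Application Research of Computers (《计算机应用研究》), 2013, 30(7): 2007-2009
Based on a study of common data change capture strategies and techniques, and taking into account the characteristics of tree-structured directory services, a homomorphic-hash-based method for capturing data changes in directory services is proposed, and its overall architecture and implementation scheme are described. The method combines a homomorphic hash function with a special shadow-table technique built on in-memory database technology. Relying on the homomorphism of the hash algorithm, it can quickly obtain the changed data of a directory service when the data change rate is low. Performance analysis shows that the method is feasible and efficient.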
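The abstract does not say which hash function or shadow-table layout the authors use, so the following is only a sketch under stated assumptions. It assumes an additive hash, where a subtree's hash is the sum modulo 2^128 of its entries' hashes (each derived from SHA-256), and it models the shadow table as a plain dictionary keyed by distinguished name. The names `Node`, `entry_hash`, `snapshot` and `diff` are hypothetical. Comparing a stored snapshot with a fresh one then only descends into subtrees whose hashes differ, which is cheap when few entries have changed.

```python
import hashlib

MOD = 2 ** 128  # assumption: additive hash over integers modulo 2^128


class Node:
    """A directory entry with a distinguished name, attributes and children."""

    def __init__(self, dn, attrs=None, children=None):
        self.dn = dn
        self.attrs = attrs or {}
        self.children = children or []


def entry_hash(dn, attrs):
    """Hash a single entry to an integer in [0, MOD)."""
    data = repr((dn, sorted(attrs.items()))).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:16], "big")


def snapshot(node, table=None):
    """Build {dn: (own_hash, subtree_hash, child_dns)} for a whole tree.

    The subtree hash is the sum of all entry hashes below the node, so a
    parent's hash can be derived from its children's hashes (homomorphism).
    """
    table = {} if table is None else table
    own = entry_hash(node.dn, node.attrs)
    total = own
    for child in node.children:
        snapshot(child, table)
        total = (total + table[child.dn][1]) % MOD
    table[node.dn] = (own, total, [c.dn for c in node.children])
    return table


def diff(dn, old, new, out=None):
    """Collect (kind, dn) changes, pruning subtrees with equal hashes."""
    out = [] if out is None else out
    o, n = old.get(dn), new.get(dn)
    if o is not None and n is not None and o[1] == n[1]:
        return out  # equal subtree hash: treated as unchanged, not visited
    if o is None:
        out.append(("added", dn))
    elif n is None:
        out.append(("deleted", dn))
    elif o[0] != n[0]:
        out.append(("modified", dn))
    # Visit children known to either snapshot, old order first.
    children = list(dict.fromkeys((o[2] if o else []) + (n[2] if n else [])))
    for child_dn in children:
        diff(child_dn, old, new, out)
    return out
```

Equal hashes are taken to mean "unchanged", which holds only up to hash collisions. The additive form also means that a single changed entry can update every ancestor's hash by subtracting the old entry hash and adding the new one, without rehashing siblings.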

18.
This paper aims at describing the transition procedure from conventional data files to a data base, beginning with data models of reality and ending with the data definition using the CODASYL DDL. The transition process is explained and a case study is also provided.

19.
20.
Pattern Analysis and Applications - Clustering is a long-standing challenging task in pattern recognition and computer vision. In recent years, with development of multimedia technologies and...
