Found 20 similar documents; search time 0 ms
1.
Mehrdad Aliasgari, Marina Blanton, Fattaneh Bayatbabolghani 《International Journal of Information Security》2017,16(6):577-601
The hidden Markov model (HMM) is a popular statistical tool with a large number of applications in pattern recognition. In some of these applications, such as speaker recognition, the computation involves personal data that can identify individuals and must be protected. We thus treat the problem of designing privacy-preserving techniques for HMM and companion Gaussian mixture model computation suitable for use in speaker recognition and other applications. We provide secure solutions for both two-party and multi-party computation models and both semi-honest and malicious settings. In the two-party setting, the server does not have access in the clear to either the user-based HMM or user input (i.e., current observations) and thus the computation is based on threshold homomorphic encryption, while the multi-party setting uses threshold linear secret sharing as the underlying data protection mechanism. All solutions use floating-point arithmetic, which allows us to achieve high accuracy and provable security guarantees, while maintaining reasonable performance. A substantial part of this work is dedicated to building secure protocols for floating-point operations in the two-party setting, which are of independent interest.
2.
3.
By splitting the mantissa of a multiple-precision number into blocks of constant width, it is shown that the precision of a computer can be raised as high as desired simply by writing a FORTRAN program that forces the computer to carry out all arithmetic to any required number of significant decimal places. The salient features of such a program are summarized as follows:
- 1 It reduces the inherent errors that arise when a number that cannot be represented exactly in the digits available on a particular installation is approximated by a finite number of digits.
- 2 The computer works just as if it were a decimal computer, so better results can be expected even for the same number of digits as the computer normally carries.
- 3 It handles very small or very large numbers occurring in intermediate calculations, since one word is reserved to store the exponent itself.
- 4 Being problem-oriented, the FORTRAN language is understood by a large section of programmers. For this reason, although the approach requires comparatively more time and space, it saves the programmer the time needed to learn complicated assembly languages, which differ from computer to computer.
- 5 It can be easily extended to complex numbers.
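The block-splitting idea in this abstract can be illustrated with a minimal sketch. It is written in Python rather than the FORTRAN of the original work, and everything in it is an assumption made for illustration: the block width of four decimal digits, the function names, and the restriction to non-negative integer mantissas (the exponent word mentioned in item 3 is not modeled).

```python
BLOCK_DIGITS = 4            # assumed block width: 4 decimal digits per block
BASE = 10 ** BLOCK_DIGITS   # each block holds a value in [0, BASE)


def to_blocks(n):
    """Split a non-negative integer mantissa into blocks, least significant first."""
    blocks = []
    while True:
        n, low = divmod(n, BASE)
        blocks.append(low)
        if n == 0:
            return blocks


def from_blocks(blocks):
    """Reassemble the integer mantissa from its blocks."""
    value = 0
    for block in reversed(blocks):
        value = value * BASE + block
    return value


def add_blocks(a, b):
    """Add two block lists, propagating the carry from block to block."""
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        total = carry
        if i < len(a):
            total += a[i]
        if i < len(b):
            total += b[i]
        carry, digit = divmod(total, BASE)
        result.append(digit)
    if carry:
        result.append(carry)
    return result


def mul_blocks(a, b):
    """Schoolbook multiplication of two block lists."""
    result = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        carry = 0
        for j, y in enumerate(b):
            total = result[i + j] + x * y + carry
            carry, result[i + j] = divmod(total, BASE)
        result[i + len(b)] += carry
    # Drop leading zero blocks, keeping at least one block.
    while len(result) > 1 and result[-1] == 0:
        result.pop()
    return result
```

Because every block stays below `BASE`, no intermediate value exceeds `BASE * BASE`, which is the property that lets a fixed-width machine word carry each step of the arithmetic.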
4.
It is shown that an otherwise stable digital control system may become unstable due to signal quantization if the controller operates on floating-point arithmetic. Sufficient conditions for instability are developed.
5.
Charles Farnum 《Software》1988,18(7):701-709
Predictability is a basic requirement for compilers of floating-point code: it must be possible to determine the exact floating-point operations that will be executed for a particular source-level construction. Experience shows that many compilers fail to provide predictability, either because of an inadequate understanding of its importance or from an attempt to produce locally better code. Predictability can be attained through careful attention to code generation and a knowledge of the common pitfalls. Most language standards do not completely define the precision of floating-point operations, and so a good compiler must also make a good choice in assigning precisions of subexpression computation. Choosing the widest precision that will be used in the expression usually gives the best trade-off between efficiency and accuracy. Finally, certain optimizations are particularly useful for floating-point and should be included in a compiler aimed at scientific computation. But predictability is more important than efficiency; obtaining incorrect answers fast helps no one.
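One reason the exact sequence of operations matters is that floating-point addition is not associative, so a compiler that regroups an expression can change the result. The sketch below is only an illustration of that general point in Python with IEEE 754 double precision; the function names are hypothetical and the example is not taken from the paper.

```python
def sum_left(a, b, c):
    """Evaluate (a + b) + c, the grouping written in the source."""
    return (a + b) + c


def sum_right(a, b, c):
    """Evaluate a + (b + c), a grouping a reordering compiler might choose."""
    return a + (b + c)


left = sum_left(0.1, 0.2, 0.3)
right = sum_right(0.1, 0.2, 0.3)

# The two groupings round differently at the intermediate step,
# so the final results are not bit-for-bit equal.
print(left == right)  # False
```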
6.
An overview is given of Motorola's DSP96002, a digital signal processor that implements IEEE-standard floating-point arithmetic. It is designed for graphics, image processing, spectral analysis and scientific computing applications. Performance peaks at 40.5 Mflops (million floating-point operations per second) and 13.5 MIPS (million instructions per second), and reaches 18 Mflops on assembly-language benchmarks. The DSP is software-compatible with the fixed-point 56000/1 family architecture and instruction set. The 96002 achieves compatibility with other processors and databases, higher mathematical accuracy, and better error handling than implementations that do not conform to the IEEE standard. The 96002's on-chip memories, dual-bus architecture, and transparent DMA are suitable for multiprocessor systems in which many 96002s connect with minimum external components. These features result in a smaller-footprint, lower-cost system than other microprocessors or data-path chips. On-chip support for the fast access modes of external memories achieves near-SRAM (static random-access memory) performance with high-density DRAM/VRAM (dynamic RAM/video RAM) devices. An on-chip circuit emulation controller provides full access and control of the machine state for system debugging. A variety of software and hardware development tools support the 96002.
7.
Thomas Gross 《Journal of Parallel and Distributed Computing》1985,2(4):362-375
Current single-chip implementations of reduced-instruction-set processors do not support hardware floating-point operations. Instead, floating-point operations have to be provided either by a coprocessor or by software. This paper discusses issues arising from a software implementation of floating-point arithmetic for the MIPS processor, an experimental VLSI architecture. Measurements indicate that an acceptable level of performance is achieved, but this approach is no substitute for a hardware accelerator if higher-precision results are required. This paper includes instruction profiles for the basic floating-point operations and evaluates the usefulness of some aspects of the instruction set.
8.
The performance of a digital state regulator system having an A/D converter of finite wordlength and a floating-point estimator/controller computer of finite mantissa length is analyzed. An upper bound on the mean-square state error, as a function of the two wordlengths, is derived in closed form. This bound can be used in analyzing the stability of open-loop unstable systems and to give insight into the error-performance/hardware-cost tradeoff for wordlength design choices. An example compares the error-performance estimates for a second-order digitally regulated plant obtained from the upper bound, from the system covariance equation, and from simulation.
9.
The major parallel architecture classes are considered: single-instruction multiple-data (SIMD) computers, tightly coupled multiple-instruction multiple-data (MIMD) computers, hypercuboid computers and constant-valence MIMD computers. An argument that the PRAM model is universal over tightly coupled and hypercube systems, but not over constant-valence-topology, loosely coupled systems, is reviewed, showing precisely how the PRAM model is too powerful to permit broad universality. Ways in which a model of computation can be restricted to become universal over less powerful architectures are discussed. The Bird-Meertens formalism (R.S. Bird, 1989) is introduced and it is shown how it is used to express computations in a compact way. It is also shown that the Bird-Meertens formalism is universal over all four architecture classes and that nontrivial restrictions of functional programming languages exist that can be efficiently executed on disparate architectures. The use of the Bird-Meertens formalism as the basis for a programming language is discussed, and it is shown that it is expressive enough to be used for general programming. Other models and programming languages with architecture-independent properties are reviewed.
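The compact style associated with the Bird-Meertens formalism rests on composing `map` with a reduction over an associative operator, which gives a list homomorphism that can be evaluated independently on pieces of a list. The following is a hedged Python stand-in, not the formalism's own notation; the helper `hom` and the example function are hypothetical names chosen for illustration.

```python
from functools import reduce
import operator


def hom(f, op, identity):
    """Build a list homomorphism: apply f to each element, then reduce with op.

    op is assumed to be associative with the given identity element, which is
    what allows sublists to be processed independently and combined afterwards.
    """
    def h(xs):
        return reduce(op, map(f, xs), identity)
    return h


sum_of_squares = hom(lambda x: x * x, operator.add, 0)

xs, ys = [1, 2, 3], [4, 5]

# Evaluating on the whole list ...
whole = sum_of_squares(xs + ys)
# ... agrees with evaluating each part separately and combining with op.
parts = operator.add(sum_of_squares(xs), sum_of_squares(ys))

print(whole, parts)  # 55 55
```

The equality of `whole` and `parts` is the property that makes such a computation independent of how the data are distributed across processors.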
10.
Prof. Dr. F. Stummel 《Computing》1986,37(2):103-124
Exact representations of errors and residuals of approximate solutions of linear algebraic systems under data perturbations and rounding errors of a floating-point arithmetic are established from which strict optimal a posteriori error and residual bounds are obtained. These bounds are formulated by means of a posteriori error and residual condition numbers. Condition numbers, error and residual bounds can be computed completely in the range of nonnegative numbers using the arithmetic operations +, ×, / only. It is shown that computations in this range are numerically very stable. The general results are applied to a series of numerical examples.
11.
The current paper explores the capability and flexibility of field programmable gate-arrays (FPGAs) to implement variable-precision floating-point (VP) arithmetic. First, the VP exact dot product algorithm, which uses exact fixed-point operations to obtain an exact result, is presented. A VP multiplication and accumulation unit (VPMAC) on FPGA is then proposed. In the proposed design, the parallel multipliers generate the partial products of mantissa multiplication in parallel, which is the most time-consuming part in the VP multiplication and accumulation operation. This method fully utilizes DSP performance on FPGAs to enhance the performance of the VPMAC unit. Several other schemes, such as two-level RAM bank, carry-save accumulation, and partial summation, are used to achieve high frequency and pipeline throughput in the product accumulation stage. The typical algorithms in Basic Linear Algebra Subprograms (i.e., vector dot product, general matrix vector product, and general matrix multiply product), LU decomposition, and Modified Gram–Schmidt QR decomposition, are used to evaluate the performance of the VPMAC unit. Two schemes, called the VPMAC coprocessor and matrix accelerator, are presented to implement these applications. Finally, prototypes of the VPMAC unit and the matrix accelerator based on the VPMAC unit are created on a Xilinx XC6VLX760 FPGA chip. Compared with a parallel software implementation based on OpenMP running on an Intel Xeon Quad-core E5620 CPU, the VPMAC coprocessor, equipped with one VPMAC unit, achieves a maximum acceleration factor of 18X. Moreover, the matrix accelerator, which mainly consists of a linear array of eight processing elements, achieves 12X–65X better performance.
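The motivation for an exact dot product can be seen in a small software sketch. It is not the paper's VPMAC design: Python's `fractions.Fraction` is used here as an assumed stand-in for a long exact fixed-point accumulator, and the function names and test vectors are hypothetical.

```python
from fractions import Fraction


def naive_dot(xs, ys):
    """Dot product with a rounding step after every multiply and add."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def exact_dot(xs, ys):
    """Dot product accumulated exactly, rounded to a float only once at the end.

    Fraction(x) converts a float to its exact rational value, so the products
    and the running sum carry no rounding error.
    """
    total = Fraction(0)
    for x, y in zip(xs, ys):
        total += Fraction(x) * Fraction(y)
    return float(total)


xs = [1e17, 1.0, -1e17]
ys = [1.0, 1.0, 1.0]

print(naive_dot(xs, ys))  # 0.0: the 1.0 is absorbed when added to 1e17
print(exact_dot(xs, ys))  # 1.0: the exact accumulator keeps it
```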
12.
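When floating-point matrix multiplication is implemented with the Altera floating-point IP core, increasing the matrix order raises the device resources consumed while system performance actually declines. To address the existing IP core's shortcomings of discontinuous data loading and uneven memory bandwidth, the IP core is improved by storing data in parallel and by loading and processing data according to a lookup table. The improved floating-point matrix operation is then implemented on an FPGA. Joint simulation with Quartus and Matlab, followed by comparison of the results, shows an error of no more than one part in ten thousand, along with reduced device resource usage and improved system performance. The simulation results show that the design is feasible and can help increase the speed of floating-point matrix operations in many high-performance fields.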
13.
14.
D. B. Skillicorn 《International journal of parallel programming》1991,20(2):133-158
A major reason for the lack of practical use of parallel computers has been the absence of a suitable model of parallel computation. Many existing models are either theoretical or are tied to a particular architecture. A more general model must be architecture independent, must realistically reflect execution costs, and must reduce the cognitive overhead of managing massive parallelism. A growing number of models meeting some of these goals have been suggested. We discuss their properties and relative strengths and weaknesses. We conclude that data parallelism is a style with much to commend it, and discuss the Bird-Meertens formalism as a coherent approach to data parallel programming. This work was supported by the Natural Sciences and Engineering Research Council of Canada.
15.
The survey focuses on algebraic-grammatical models of parallel processes, representation of knowledge about classes of algorithms in terms of a variety of production systems (structured design grammars), and program generation tools. Translated from Kibernetika i Sistemnyi Analiz, No. 5, pp. 5–13, September–October, 1991.
16.
《Computer aided design》1987,19(9):503-507
This paper presents a unified representation scheme for the implicit equations of points, lines and circles, together with algorithms for incident and tangent constructions. These algorithms operate successfully on degenerate and nearly degenerate geometry and when necessary produce degenerate geometric results. Care is taken to ensure the accuracy of results.
17.
Domagoj Jakobović, Marin Golub, Marko Čupić 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2014,18(6):1225-1236
This paper presents the design and the application of asynchronous models of parallel evolutionary algorithms. An overview of the existing parallel evolutionary algorithm (PEA) models and available implementations is given. We present new PEA models in the form of asynchronous algorithms and implicit parallelization, as well as experimental data on their efficiency. The paper also discusses the definition of speedup in PEAs and proposes an appropriate speedup measurement procedure. The described parallel EA algorithms are tested on problems with varying degrees of computational complexity. The results show good efficiency of asynchronous and implicit models compared to existing parallel algorithms.
18.
19.
Charles Koelbel, Piyush Mehrotra, John Van Rosendale 《International journal of parallel programming》1987,16(5):365-382
Automatic process partitioning is the operation of automatically rewriting an algorithm as a collection of tasks, each operating primarily on its own portion of the data, to carry out the computation in parallel. Hybrid shared memory systems provide a hierarchy of globally accessible memories. To achieve high performance on such machines one must carefully distribute the work and the data so as to keep the workload balanced while optimizing the access to nonlocal data. In this paper we consider a semi-automatic approach to process partitioning in which the compiler, guided by advice from the user, automatically transforms programs into such an interacting set of tasks. This approach is illustrated with a picture processing example written in BLAZE, which is transformed by the compiler into a task system maximizing locality of memory reference. Research supported by an IBM Graduate Fellowship. Research supported under NASA Contract No. 520-1398-0356. Research supported by NASA Contract No. NAS1-18107 while the last two authors were in residence at ICASE, NASA, Langley Research Center.
20.
We present a new method for computing solutions of conservation laws based on the use of cellular automata with the method of characteristics. The method exploits the high degree of parallelism available with cellular automata and retains important features of the method of characteristics. It yields high numerical accuracy and extends naturally to adaptive meshes and domain decomposition methods for perturbed conservation laws. We describe the method and its implementation for a Dirichlet problem with a single conservation law for the one-dimensional case.
Numerical results for the one-dimensional law with the classical Burgers nonlinearity or the Buckley-Leverett equation show good numerical accuracy outside the neighborhood of the shocks. The error in the area of the shocks is of the order of the mesh size. The algorithm is well suited for execution on both massively parallel computers and vector machines. We present timing results for an Alliant FX/8, Connection Machine Model 2, and CRAY X-MP.