Found 20 similar documents (search time: 15 ms)
1.
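Some digital signal processing programs exhibit strong data dependences. Partitioning such programs across a multi-core DSP requires exploiting fine-grained parallelism, which in turn requires a fast inter-core communication mechanism. This paper proposes a new fast inter-core communication mechanism for multi-core DSPs: the Tagged Shared Register File (TSRF). The TSRF is shared by all DSP cores, and each register in the file is associated with a valid tag bit that provides synchronization support for inter-core communication. We built a cycle-accurate simulator of a multi-core DSP prototype with four processor cores that integrates the TSRF mechanism. Through detailed simulation, we evaluated the TSRF mechanism using digital signal processing algorithms with strong data dependences: IIR filtering and ADPCM encoding and decoding. Compared with a single-core DSP, TSRF achieves speedups of about 1.8, 1.2, and 1.9, respectively.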
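The synchronizing role of a per-register valid tag can be illustrated in software. The following is a minimal sketch, not the hardware design from the paper: it assumes full/empty semantics (a write waits for an empty register and sets the tag; a read waits for a set tag and clears it), which the abstract does not specify. The names `TaggedRegisterFile` and `run_pipeline` are hypothetical, and threads stand in for DSP cores.

```python
import threading


class TaggedRegisterFile:
    """Software model of a shared register file with one valid tag per register.

    Assumed semantics (not taken from the abstract): write() waits until the
    register is empty, stores the value and sets the tag; read() waits until
    the tag is set, returns the value and clears the tag.
    """

    def __init__(self, num_registers):
        self._values = [None] * num_registers
        self._valid = [False] * num_registers
        self._cond = threading.Condition()

    def write(self, index, value):
        with self._cond:
            # Block while the previous value has not been consumed yet.
            self._cond.wait_for(lambda: not self._valid[index])
            self._values[index] = value
            self._valid[index] = True
            self._cond.notify_all()

    def read(self, index):
        with self._cond:
            # Block until a producer has set the valid tag.
            self._cond.wait_for(lambda: self._valid[index])
            self._valid[index] = False
            self._cond.notify_all()
            return self._values[index]


def run_pipeline(samples):
    """Two 'cores' (threads) pass intermediate results through register 0."""
    tsrf = TaggedRegisterFile(4)
    results = []

    def producer():
        for x in samples:
            tsrf.write(0, x * 2)  # stage 1: scale the sample

    def consumer():
        for _ in samples:
            results.append(tsrf.read(0) + 1)  # stage 2: add an offset

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because a single register alternates between full and empty, the consumer sees the producer's values in order without any separate lock or flag in the application code.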
2.
We describe a nonconstructive extension to primitive recursive arithmetic, both abstractly and as implemented on the Boyer-Moore prover. Abstractly, this extension is obtained by adding the unbounded µ operator applied to primitive recursive functions; doing so, one can define the Ackermann function and prove the consistency of primitive recursive arithmetic. The implementation does not mention the µ operator explicitly but has the strength to define the µ operator through the built-in functions EVAL$ and V&C$.
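For readers unfamiliar with the two objects named in this abstract, here is a minimal sketch of unbounded minimization and of the Ackermann function. It is an illustration only and says nothing about how the Boyer-Moore prover realizes them; the function names `mu` and `ackermann` are chosen for this example.

```python
from itertools import count


def mu(predicate):
    """Unbounded minimization: the least n >= 0 for which predicate(n) holds.

    The search has no upper bound, so the call does not terminate when no
    such n exists. This is what separates it from bounded (primitive
    recursive) search.
    """
    for n in count():
        if predicate(n):
            return n


def ackermann(m, n):
    """Ackermann function, computed iteratively with an explicit stack.

    A(0, n) = n + 1
    A(m, 0) = A(m - 1, 1)
    A(m, n) = A(m - 1, A(m, n - 1))
    """
    stack = [m]
    while stack:
        m = stack.pop()
        if m == 0:
            n += 1
        elif n == 0:
            stack.append(m - 1)
            n = 1
        else:
            # Evaluate A(m, n - 1) first, then feed the result to A(m - 1, .).
            stack.append(m - 1)
            stack.append(m)
            n -= 1
    return n
```

As a usage example, `mu(lambda n: (n + 1) ** 2 > 10)` searches upward from 0 for the integer square root of 10. The Ackermann function is total and computable but grows too fast to be primitive recursive, so keep the arguments small.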
3.
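This paper introduces the concept and research scope of computational and mathematical organization theory and reviews the basic views, theories, and methods of research on organizational adaptability. Organizational adaptability is the process by which an organization continually learns, adjusts, optimizes, and transforms itself to fit its environment, thereby improving its adaptive capacity. Taking this process as the main thread, the paper summarizes the current state of adaptability research and discusses its value and application prospects.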
4.
A parallel implementation of the preconditioned GMRES method is described. The method is used to solve the discretized incompressible Navier–Stokes equations. A parallel implementation of the inner product is given, which appears to be scalable on a massively parallel computer. The most difficult part to parallelize is the ILU-preconditioner. We parallelize the preconditioner using ideas proposed by Bastian and Horton (P. Bastian, G. Horton, SIAM. J. Stat. Comput. 12 (1991) 1457–1470). Contrary to some other parallel methods, the required number of iterations is independent of the number of processors used. A model is presented to predict the efficiency of the method. Experiments are done on the Cray T3D, computing the solution of a two-dimensional incompressible flow. Predictions of computing time show good correspondence with measurements.
5.
Despite the noticeable progress in perceptual tasks like detection, instance segmentation and human parsing, computers still perform unsatisfactorily on visually understanding humans in crowded scenes, such as group behavior analysis, person re-identification, e-commerce, media editing, video surveillance, autonomous driving and virtual reality, etc. To perform well, models need to comprehensively perceive the semantic information and the differences between instances in a multi-human image, which is recently defined as the multi-human parsing task. In this paper, we first present a new large-scale database “Multi-human Parsing (MHP v2.0)” for algorithm development and evaluation to advance the research on understanding humans in crowded scenes. MHP v2.0 contains 25,403 elaborately annotated images with 58 fine-grained semantic category labels and 16 dense pose key point labels, involving 2–26 persons per image captured in real-world scenes from various viewpoints, poses, occlusion, interactions and background. We further propose a novel deep Nested Adversarial Network (NAN) model for multi-human parsing. NAN consists of three Generative Adversarial Network-like sub-nets, respectively performing semantic saliency prediction, instance-agnostic parsing and instance-aware clustering. These sub-nets form a nested structure and are carefully designed to learn jointly in an end-to-end way. NAN consistently outperforms existing state-of-the-art solutions on our MHP and several other datasets, including MHP v1.0, PASCAL-Person-Part and Buffy. NAN serves as a strong baseline to shed light on generic instance-level semantic part prediction and drive the future research on multi-human parsing. With the above innovations and contributions, we have organized the CVPR 2018 Workshop on Visual Understanding of Humans in Crowd Scene (VUHCS 2018) and the Fine-Grained Multi-human Parsing and Pose Estimation Challenge. These contributions together significantly benefit the community. 
Code and pre-trained models are available at https://github.com/ZhaoJ9014/Multi-Human-Parsing_MHP.
6.
Several strategies of parallelism for spectral algorithms are discussed. The investigation shows that, despite the intrinsic lack of locality of spectral methods, they are amenable to parallel implementations, even on fine grain architectures. Typical algorithms for the spectral approximation of the viscous, incompressible Navier-Stokes equations serve as examples in the discussion.
7.
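As an important educational concept, computational thinking has attracted wide attention from the scientific and educational communities in China and abroad, and has accordingly placed new demands on the training of computer science students. This paper analyzes the intrinsic relationship between cultivating computational thinking and teaching discrete mathematics, and on this basis discusses how to integrate the two in the course-introduction and course-teaching stages. Through case studies, it focuses on how to carry the two core ideas of abstraction and automation through the whole teaching process, and how to introduce other basic concepts and ways of thinking from computational thinking at appropriate points, according to the topics being taught.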
8.
We present a distributed algorithm for implementing α-β search on a tree of processors. Each processor is an independent computer with its own memory and is connected by communication lines to each of its nearest neighbors. Measurements of the algorithm's performance on the Arachne distributed operating system are presented. A theoretical model is developed that predicts at least order of k^(1/2) speedup with k processors.
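As background for this entry, the following is a minimal sketch of ordinary sequential α-β search, the procedure that the paper distributes over a tree of processors. It is not the distributed algorithm itself. The tree encoding (a leaf is a number, an inner node is a list of children) and the name `alpha_beta` are assumptions made for this example.

```python
import math


def alpha_beta(node, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Minimax value of a game tree, computed with alpha-beta pruning.

    A node is either a number (leaf score) or a list of child nodes.
    """
    if not isinstance(node, list):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alpha_beta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # cutoff: the minimizing parent will avoid this branch
        return value
    value = math.inf
    for child in node:
        value = min(value, alpha_beta(child, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break  # cutoff: the maximizing parent already has a better option
    return value
```

For the tree `[[3, 5], [2, 9], [0, 1]]` the root is a maximizing node over three minimizing children. After the first child yields 3, the leaves 9 and 1 are never examined because 2 and 0 already rule out their subtrees.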
9.
There is a tension between the objectives of avoiding irrelevant computation and extracting parallelism, in that a computational step used to restrict another must precede the latter. Our thesis, following [3], is that evaluation methods can be viewed as implementing a choice of sideways information propagation graphs, or sips, which determines the set of goals and facts that must be evaluated. Two evaluation methods that implement the same sips can then be compared to see which obtains a greater degree of parallelism, and we provide a formal measure of parallelism to make this comparison. Using this measure, we prove that transforming a program using the Magic Templates algorithm and then evaluating the fixpoint bottom-up provides a most parallel implementation for a given choice of sips, without taking resource constraints into account. This result, taken in conjunction with earlier results from [3,27], which show that bottom-up evaluation performs no irrelevant computation and is sound and complete, suggests that a bottom-up approach to parallel evaluation of logic programs is very promising. A more careful analysis of the relative overheads in the top-down and bottom-up evaluation paradigms is needed, however, and we discuss some of the issues. The abstract model allows us to establish several results comparing other proposed parallel evaluation methods in the logic programming and deductive database literature, thereby showing some natural, and sometimes surprising, connections. We consider the limitations of the abstract model and of the proposed bottom-up evaluation method, including the inability of sips to describe certain evaluation methods, and the effect of resource constraints. Our results shed light on the limits of the sip paradigm of computation, which we extend in the process.
10.
The process of gene assembly in ciliates, an ancient group of organisms, is one of the most complex instances of DNA manipulation known in any organisms. This process is fascinating from the computational point of view, with ciliates even using the linked lists data structure. Three molecular operations (ld, hi, and dlad) have been postulated for the gene assembly process. We initiate here the study of parallelism in this process, raising several natural questions, such as: when can a number of operations be applied in parallel to a gene pattern; or how many steps are needed to assemble (in parallel) a micronuclear gene. In particular, this gives rise to a new measure of complexity for the process of gene assembly in ciliates.
“One of the oldest forms of life on Earth has been revealed as a natural born computer programmer.”
13.
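Database systems are responsible for centrally processing large volumes of information. A growing number of applications require finer-grained control over data access: not just at the table or view level, but down to individual records. This paper describes in detail how query modification, a technique for implementing fine-grained access control in databases, works and analyzes its problems. It focuses on the Truman and Non-Truman models, compares their characteristics and shortcomings, and summarizes future research trends.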
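The idea behind query modification can be sketched in a few lines: before a user's query runs, each table reference is replaced by a subquery that keeps only the rows the user is allowed to see, which is the behavior associated with the Truman model. The example below is a simplified illustration using Python's standard `sqlite3` module. The table `grades`, the per-student policy, and the helpers `rewrite_query` and `run_as` are hypothetical, and the textual substitution is a stand-in for rewriting a parsed query.

```python
import sqlite3


def rewrite_query(query, table, policy_predicate):
    """Replace a table reference with a policy-filtered subquery.

    Naive textual substitution for illustration only; a real system would
    rewrite the parsed query tree instead of the SQL string.
    """
    filtered = f"(SELECT * FROM {table} WHERE {policy_predicate}) AS {table}"
    return query.replace(f"FROM {table}", f"FROM {filtered}")


def run_as(user, query):
    """Run a query on a small in-memory table under a per-user row policy."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE grades (student TEXT, score INTEGER)")
    conn.executemany(
        "INSERT INTO grades VALUES (?, ?)",
        [("alice", 90), ("bob", 70), ("carol", 80)],
    )
    # Hypothetical policy: each student may see only their own row.
    modified = rewrite_query(query, "grades", "student = :user")
    rows = conn.execute(modified, {"user": user}).fetchall()
    conn.close()
    return rows
```

Calling `run_as("alice", "SELECT student, score FROM grades")` returns only Alice's row. An aggregate such as `SELECT AVG(score) FROM grades` is silently computed over the visible rows alone, so the user receives an answer that differs from the true average without any indication that filtering took place. This kind of misleading result is a commonly cited weakness of the Truman model.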
15.
This paper presents an abstract specification of an enforcement mechanism of usage control for Grids, and formally verifies that such a mechanism enforces UCON policies. Our technique is based on KAOS, a goal-oriented requirements engineering methodology with a formal LTL-based language and semantics. KAOS is used in a bottom-up form: we abstract the specification of the enforcement mechanism from current implementations of usage control for Grids. The result of this process is a set of agent and operation models that describe the main components and operations of the enforcement mechanism. KAOS is also used in a top-down form, by applying goal refinement in order to refine UCON policies. The result of this process is a goal-refinement tree, which shows how a goal (policy) can be decomposed into sub-goals. Verifying that a policy can be enforced is then equivalent to proving that a goal can be implemented by the enforcement mechanism represented by the agent and operation models.
16.
This paper presents an approach for camera auto-calibration from uncalibrated video sequences taken by a hand-held camera. The novelty of this approach lies in that the line parallelism is transformed to the constraints on the absolute quadric during camera auto-calibration. This makes some critical cases solvable and the reconstruction more Euclidean. The approach is implemented and validated using simulated data and real image data. The experimental results show the effectiveness of the approach.
17.
Register constraints are usually taken into account during the scheduling pass of a directed acyclic data dependence graph (DAG): any schedule of the instructions inside a basic block must keep the register requirement under a certain limit. In this work, we show how to handle the register pressure before the instruction scheduling of a DAG. We mathematically study an approach which consists in managing the exact upper bound of the register need for all the valid schedules of a considered DAG, independently of the functional unit constraints. We call this computed limit the register saturation (RS) of the DAG. Its aim is to detect possible obsolete register constraints, i.e., when RS does not exceed the number of available registers. If it does, we add some serial edges to the original DAG such that the worst register need does not exceed the number of available registers. We propose an appropriate mathematical formalism for this problem. Our generic processor model takes into account superscalar, VLIW and EPIC/IA64 architectures. Our deeper analysis of the problem and our formal methods enable us to provide nearly optimal heuristics and strategies for register optimization in the face of ILP.
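The notion of register saturation can be made concrete with a brute-force sketch. Under simplifying assumptions that are not from the paper (one instruction per step, a value is live from its definition until its last consumer has issued, values without consumers are ignored), the code below enumerates every valid sequential schedule of a small DAG and reports the largest register need. The function names and the DAG encoding are hypothetical, and the enumeration is exponential, unlike the heuristics the abstract refers to.

```python
from itertools import permutations


def register_need(order, consumers):
    """Maximum number of simultaneously live values for one sequential order.

    `consumers` maps each node to the list of nodes that read its value.
    A value counts as live after step t if it was defined at or before t
    and its last consumer is scheduled after t.
    """
    position = {node: i for i, node in enumerate(order)}
    need = 0
    for step in range(len(order)):
        live = sum(
            1
            for node, users in consumers.items()
            if users and position[node] <= step < max(position[u] for u in users)
        )
        need = max(need, live)
    return need


def register_saturation(consumers):
    """Largest register need over all valid (topological) orders of the DAG."""
    best = 0
    for order in permutations(consumers):
        position = {node: i for i, node in enumerate(order)}
        valid = all(
            position[node] < position[user]
            for node, users in consumers.items()
            for user in users
        )
        if valid:
            best = max(best, register_need(order, consumers))
    return best
```

Take the DAG for `e = (a + b) + c` with `d = a + b`, encoded as `{"a": ["d"], "b": ["d"], "c": ["e"], "d": ["e"], "e": []}`. Scheduling all three loads first keeps `a`, `b`, and `c` live at once, while the order `a, b, d, c, e` never needs more than two registers. The saturation is the worst case over all orders, so a machine with at least that many registers imposes no real register constraint on this block.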
18.
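This paper studies the Linux clock mechanism, analyzes the factors that make the clock slow, and designs a fine-grained timer that reduces the granularity of the system clock tick, considerably improving the real-time performance of Linux.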
19.
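As the performance gap between processors and main memory keeps widening, long-latency memory accesses have become one of the main factors limiting processor performance. Memory-level parallelism reduces the impact of long-latency accesses on processor performance by executing multiple memory accesses in parallel. This paper reviews the background against which memory-level parallelism emerged, introduces the concept and its relationship to processor performance models, analyzes the main factors that limit memory-level parallelism in processors, surveys in detail the various techniques for improving it, and ...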