Similar literature
1.
2.
A class of critical computer requirements for real-time scan T.V. computer graphics is examined in relation to commercially available CPU architectures. Finding general-purpose processors unsuited to the task, a new processor is proposed which is designed around the concept of ‘instruction set partitioning.’ In this design, special hardware-implemented algorithms may be included in the machine instruction set, and these processors are allowed to operate asynchronously from each other. The design is projected to generate a complete new frame of a color T.V. picture every 0.1-0.8 s, depending on image complexity. Due to its inherent generality, the CPU may be similarly expanded to encompass a wide variety of other specialized or real-time tasks with minimal additional hardware. The 32-bit parallel processor has a design cycle time of 100 ns and is in the price class of a minicomputer.

3.
A distributed system is said to be self-stabilizing if it will eventually reach a legitimate system state regardless of its initial state. Because of this property, a self-stabilizing system is extremely robust against failures; it tolerates any finite number of transient failures. The ring orientation problem for a ring is the problem of all the processors agreeing on a common ring direction. This paper focuses on the problem of designing a deterministic self-stabilizing ring orientation system with a small number of processor states under the distributed daemon. Because of the impossibility of symmetry breaking, under the distributed daemon, no such systems exist when the number n of processors is even. Provided that n is odd, the best known upper bound on the number of states is 256 in the link-register model, and eight in the state-reading model. We improve the bound down to 6^3 = 216 in the link-register model

4.
In the future, multicore processors with hundreds of cores will collaborate on a single chip. More advanced network-on-chip (NoC) topologies will then be needed than today's shared buses for dual-core processors. Multistage interconnection networks, which are already used in parallel computers, seem to be a promising alternative. In this paper, a new network topology is introduced that particularly applies to multicast traffic in multicore systems and parallel computers. These multilayer multistage interconnection networks are described by defining the main parameters of such a topology. The performance and costs of the new architecture are determined and compared to other network topologies. Network traffic consisting of constant-size packets and of varying-size packets is investigated. It is shown that all kinds of multicast traffic particularly benefit from the new topology.

5.
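Array many-core processors are widely used in high-performance computing because of their high computational performance and energy efficiency. To build future high-performance computing systems, however, processors must overcome the severe "memory wall" challenge and the problem of coordinating cores. In typical array processors, most cores use a single-threaded structure to reduce overhead, but this places high demands on memory access. This work introduces hardware simultaneous multithreading into the individual cores of an array many-core processor. To address the problem, observed in experiments, that the L1 instruction cache hit rate drops significantly as the number of threads increases, it proposes a redundant instruction cache storage structure for array many-core processors and, based on this structure, the use of FIFO and LRU-like replacement policies. With this optimized cache design, simulation experiments show that the overall instruction cache miss rate with two threads is reduced by 25.2% and overall CPI performance is improved by 30.2%.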
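
The abstract names FIFO and LRU-like replacement but does not spell out either policy. As a rough illustration only, the sketch below compares plain FIFO and plain LRU on a fully associative cache. It does not model the paper's redundant instruction cache structure, and the function names, the trace, and the capacity are assumptions made for this example.

```python
from collections import OrderedDict, deque


def miss_rate_fifo(trace, capacity):
    """Miss rate of a fully associative cache with FIFO replacement."""
    resident = set()
    order = deque()  # insertion order, oldest line on the left
    misses = 0
    for line in trace:
        if line in resident:
            continue  # a hit does not change FIFO order
        misses += 1
        if len(resident) == capacity:
            resident.discard(order.popleft())  # evict the oldest insertion
        resident.add(line)
        order.append(line)
    return misses / len(trace)


def miss_rate_lru(trace, capacity):
    """Miss rate of a fully associative cache with LRU replacement."""
    cache = OrderedDict()  # least recently used line first
    misses = 0
    for line in trace:
        if line in cache:
            cache.move_to_end(line)  # a hit refreshes recency
            continue
        misses += 1
        if len(cache) == capacity:
            cache.popitem(last=False)  # evict the least recently used line
        cache[line] = True
    return misses / len(trace)


# Hypothetical instruction-line trace that keeps returning to line 1.
TRACE = [1, 2, 3, 1, 4, 1, 2]
print(miss_rate_fifo(TRACE, 3), miss_rate_lru(TRACE, 3))
```

On this trace LRU keeps the frequently reused line resident while FIFO evicts it, which is the kind of difference a replacement-policy study would measure on real instruction streams.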

6.
A model-based vision system has been successfully implemented in a small computer environment. This approach uses a basic solid modeling system to develop three-dimensional models of mechanical parts. From those models, two-dimensional projections are taken for every stable state of the object, with many orientations around the object's vertical axis for each stable state. These two-dimensional projections are treated as synthetic binary images, from which a variety of features may be measured and extracted. A similar procedure is used for a binary image of an object from a real scene, and features are also extracted for that image. A simple matching procedure uses the model-based feature sets to determine the real object's stable state, position, and orientation. This paper describes the system in detail and shows examples of its use.
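
The abstract does not say which features or which matching rule the system uses. One simple possibility, shown here purely as an assumed sketch, is nearest-neighbor matching between the observed feature vector and a table of model feature vectors. The feature choice (area, perimeter, bounding-box width), the table entries, and the name `match_pose` are all hypothetical.

```python
import math

# Hypothetical model table: (stable_state, orientation_deg) -> features.
# Features are (area, perimeter, bounding-box width), chosen for illustration.
MODEL_FEATURES = {
    ("flat", 0): (120.0, 46.0, 20.0),
    ("flat", 90): (120.0, 46.0, 6.0),
    ("on_edge", 0): (60.0, 52.0, 20.0),
}


def match_pose(observed, models=MODEL_FEATURES):
    """Return the (stable_state, orientation) whose model features lie
    closest to the observed features in Euclidean distance."""
    return min(models, key=lambda pose: math.dist(models[pose], observed))


# Features measured from a (made-up) real binary image.
print(match_pose((118.0, 47.0, 7.0)))
```

In practice the features would need comparable scales, for example by normalizing each one, before a plain Euclidean distance is meaningful.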

7.
In a classic paper,¹ Wirth describes a language which combines the readability of ALGOL 60 with the flexibility and degree of control of a conventional assembly language. This paper gives an outline of a similar language for a small 16-bit computer, the Honeywell DDP-516. Implementing the compiler in its own language by recoding an ALGOL version of the compiler has shown that the language is suitable for large systems. With the compiler written in a high-level language, many enhancements have been possible even though these were not envisaged in the original coding. This use of the language clearly demonstrated that a high-level assembly language can be a very effective tool for a small machine as well as for computers like the 360 series.

8.
The popularity of multimedia applications has made them a major theme in embedded systems. The key component for supporting multimedia applications well is the embedded processor. We have therefore designed and implemented an embedded processor, called the UniDual processor, to achieve this objective. Its key features are the integration of instructions of reduced instruction set computers (RISCs) and digital signal processors (DSPs), as well as the support of a special instruction set and a shared‐based clustered register architecture. However, an important issue of UniDual that remains open is how to allocate registers efficiently. In this paper, we present a scheduling and instruction transformation approach to resolve this issue. The proposed approach schedules instructions and then transforms overlapped instructions into RISC and DSP instructions by taking communication overhead and hardware limitations into account. Compared with the greedy approach, the evaluation shows that our work is relatively effective in performance and code size reduction. Copyright © 2012 John Wiley & Sons, Ltd.

9.
A large subset of the language Algol 68 has been implemented on a small computer, the TESLA 200 with a 64 Kbyte store. In this paper, the general structure of the compiler and the organization of the program at run-time are described. In particular, the original methods and devices that considerably increase the speed of compilation and of program execution are described. These concern the syntax analysis (consistent notation, error recovery), the organization of the resulting program (procedure organization, algorithms for working with multiple values) and the techniques of code generation. We try to show that Algol 68 is a suitable programming language for small as well as large computers and that it can successfully compete with traditional programming languages.

10.
This paper presents a layout for a hierarchical computer system structure for an electric power utility. The pyramidal control structure, or hierarchical computer system, is divided into four levels: 1) corporate computer, 2) factory computers, 3) center computers, and 4) cell/remote computers. The paper also discusses the domain and the components of the proposed structure.

11.
This paper presents a study and comparison of shape design sensitivity analysis algorithms that are based on the continuum adjoint variable method, the continuum direct differentiation method, and the finite difference method, implemented on a supermini computer with an attached array processor. The basic algorithms and their differences in evaluating shape design sensitivity coefficients are outlined. A solution method for solving a system of equations, using a general sparse storage technique, is used for numerical implementation of shape design sensitivity analysis. It is found that computing shape design sensitivity coefficients using the direct differentiation method is significantly more efficient than using the adjoint variable method or the finite difference method. A detailed performance evaluation of the methods, using an attached array processor, is presented. The performance of the attached array processor, compared to a supermini computer, is shown to depend strongly on the type of computations to be carried out. When only parts of a program are running on an attached array processor, the CPU time distribution among the different subroutines of the program can change significantly, compared to using the host processor only.
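
To make the three approaches concrete, the sketch below applies them to a tiny discrete problem. This is an assumption for illustration, not the paper's continuum shape formulation: two springs in series with stiffness matrix K(b), a unit tip load f, and the tip displacement as the response. The direct method solves K u' = -(dK/db) u, the adjoint method solves K^T λ = ∂ψ/∂u and evaluates -λ^T (dK/db) u, and the finite difference method perturbs b. All names and numbers are hypothetical.

```python
def solve2(K, r):
    """Solve the 2x2 linear system K x = r by Cramer's rule."""
    (a, b), (c, d) = K
    det = a * d - b * c
    return ((r[0] * d - b * r[1]) / det, (a * r[1] - c * r[0]) / det)


def matvec(M, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])


def stiffness(b, k2=4.0):
    """Two springs in series; the design variable b is the first stiffness."""
    return ((b + k2, -k2), (-k2, k2))


LOAD = (0.0, 1.0)                # unit load at the tip
DK_DB = ((1.0, 0.0), (0.0, 0.0))  # derivative of K with respect to b
DPSI_DU = (0.0, 1.0)             # response psi is the tip displacement u[1]


def tip_displacement(b):
    return solve2(stiffness(b), LOAD)[1]


def sensitivity_direct(b):
    """Direct differentiation: solve K u' = -(dK/db) u, then read off u'[1]."""
    K = stiffness(b)
    u = solve2(K, LOAD)
    dKu = matvec(DK_DB, u)
    du = solve2(K, (-dKu[0], -dKu[1]))
    return du[1]


def sensitivity_adjoint(b):
    """Adjoint variable: solve K^T lam = dpsi/du (K is symmetric here)."""
    K = stiffness(b)
    u = solve2(K, LOAD)
    lam = solve2(K, DPSI_DU)
    dKu = matvec(DK_DB, u)
    return -(lam[0] * dKu[0] + lam[1] * dKu[1])


def sensitivity_fd(b, h=1e-6):
    """Forward finite difference with a hypothetical step size h."""
    return (tip_displacement(b + h) - tip_displacement(b)) / h


print(sensitivity_direct(2.0), sensitivity_adjoint(2.0), sensitivity_fd(2.0))
```

For this system the tip displacement is 1/b + 1/k2, so the exact sensitivity is -1/b², and all three methods should agree with it up to the finite difference truncation error.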

12.
R. Kingslake. Software, 1971, 1(4): 391-401
TALK is an interactive file creation and editing system implemented on a very small computer. It provides facilities for FORTRAN syntax checking and an interactive ‘desk-calculator’. Major computing tasks are sent over a high-speed link to CDC 6600/6400 computers. A number of teleprinters can be controlled by TALK in the usual way, but the card reader/line printer pair may also be used as a terminal. This is useful for creating or printing large files. When a user has logged in, he can work in one of several modes. These include INPUT, EDIT and CALCULATE modes. It is possible to nest modes, for example, by entering CALCULATE mode while in INPUT mode without losing one's place in the input file. An unusual feature of TALK is the ‘inclusion’ facility. This enables a user to specify previously filed text to be included within another file, either as the file is being created or dynamically whenever the file is used. It also gives users the ability to build up macros of commonly used commands. Many system commands are in fact macros, with only simple primitives provided as executable code.

13.
In a mobile ad hoc (multi-hop) wireless network, the logical structure of a ring is likely to become volatile or expensive to maintain over time due to changeable network topology. Additional adverse effects take place when a process joins or leaves the computation in the presence of mobility. This paper presents a distributed algorithm that adapts a ring among mobile nodes to the network dynamics to reflect overall communication efficiency. This is achieved by modifying the ring structure in a localized, mutually exclusive fashion, thereby allowing concurrent segment-wise modifications to proceed. Remarkably, our proposal operates without global knowledge of the logical structure and can be embodied as an underlying protocol stratum that supports transparent deployment of conventional algorithms in mobile environments. Following a correctness proof, simulation results show that our proposal is promising in several regards.

14.
15.
16.
17.
This paper proposes extending a multi-core processor with a common matrix unit to maximize on-chip resource utilization and to leverage the advantages of the current multi-core revolution to improve the performance of data-parallel applications. Each core fetches scalar/vector/matrix instructions from its instruction cache. Scalar instructions continue the execution on the scalar datapath; however, vector/matrix instructions are issued by the decode stage to the shared matrix unit through the corresponding FIFO queue. Moreover, scalar results from reduction vector/matrix instructions are sent back from the matrix unit to the scalar core that sent these instructions. Some dense linear algebra kernels (scalar–vector multiplication, scalar times vector plus another, apply Givens rotation, rank-1 update, vector–matrix multiplication, and matrix–matrix multiplication) as well as discrete cosine transform, sum of absolute differences, and affine transformation are used in the performance evaluation. Our results show that the improvement in the utilization of the shared matrix unit with a dual-core ranges from 9% to 26% compared to extending a matrix unit to a single-core. Moreover, the average speedup of the dual-core shared matrix unit over a single-core extended with a matrix unit ranges from 6% to 24% and the maximum speedup ranges from 13% to 46%.

18.
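Big data applications such as large-scale data sorting, search engines, and streaming media show low resource utilization when run on latency-oriented multi-core/many-core processors: the L1 cache hit rate is high, the L2/L3 cache hit rates are low, and increasing LLC capacity yields no significant improvement in IPC. To address this low utilization of cache resources, the memory access behavior of big data applications is analyzed, and two cache structure designs for many-core processors targeting big data applications are proposed. Both structures have only a single level of cache: the Share structure is a fully shared cache, and the Partition structure is a partially shared cache. Evaluation results show that both designs save substantial chip area with only a modest increase in memory access latency. When cache capacity is low, the Partition structure outperforms the Share structure; as cache capacity grows, the Share structure gradually becomes better than the Partition structure. Because the capacity allocated to each core in a many-core processor is limited, the Partition structure has a certain advantage.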

19.
The essential components of an effective program development environment are described. These assist in the interactive investigation of program behaviour and accelerate the rate at which programs are modified. The whole system is compact and simple, being designed for use on small machines. It is considered as important to understand how to use these tools as it is to provide them.

20.