Similar literature
1.
We present three parallel sorting algorithms suitable for implementation on tightly coupled multiprocessors and compare their performance on the Denelcor HEP. Two of the algorithms implemented, parallel Shellsort and quickmerge, are new. Shellsort is amenable to parallelization; however, since Shellsort has higher complexity than quicksort, parallel Shellsort is inferior to parallel quicksort. A second new parallel algorithm, called quickmerge, is based upon both quicksort and mergesort. Our implementation of quickmerge achieves significantly higher speedup than our implementation of parallel quicksort.
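For orientation, the sequential Shellsort that the abstract starts from can be sketched as below. This is only the textbook algorithm with a simple halving gap sequence (an assumption; the abstract does not state which gaps the authors use), and it is not the parallel variant from the paper. It does hint at why the method parallelizes: for a fixed gap, the interleaved chains of elements are sorted independently of one another.

```python
def shellsort(items):
    """Return a sorted copy of items using textbook Shellsort.

    The halving gap sequence is an illustrative assumption, not the
    sequence used in the paper.
    """
    data = list(items)
    gap = len(data) // 2
    while gap > 0:
        # Gapped insertion sort: each chain data[k], data[k+gap], ...
        # is independent of the other chains for this gap.
        for i in range(gap, len(data)):
            value = data[i]
            j = i
            while j >= gap and data[j - gap] > value:
                data[j] = data[j - gap]
                j -= gap
            data[j] = value
        gap //= 2
    return data
```

Because the final pass uses a gap of 1, it is an ordinary insertion sort, which guarantees a sorted result whatever the earlier gaps were.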

2.
In this paper we present a new parallel sorting algorithm suitable for implementation on a tightly coupled multiprocessor. The algorithm utilizes P processors to sort a set of N data items subdivided into M subsets. The performance of the algorithm is investigated, and the results of experiments carried out on the Balance 8000 multiprocessor are presented.

3.
A sorting algorithm, dubbed MeshSort, for multidimensional mesh-connected multiprocessors is introduced. Bitonic Sort and ShearSort are shown to be special cases of MeshSort. MeshSort thus provides some insight into the operation of parallel sorting. It requires operations only along orthogonal vectors of processors, simplifying the control of the multiprocessor. This allows MeshSort to be used on any reduced architecture where a multidimensional memory structure is interconnected with a lower dimensional structure of processors. A modified version of MeshSort, called FastMeshSort, is presented. This algorithm applies the same basic principle as MeshSort, and is almost as simple to implement, but achieves much better performance. The modified algorithm is shown to be very efficient for reasonably sized meshes. FastMeshSort is presented as a practical sorting and routing algorithm for real multidimensional mesh-connected multiprocessors. The algorithms can easily be extended to other multiprocessor structures.

4.
We consider the iterative solution of large sparse linear systems of equations arising from elliptic and parabolic partial differential equations in two or three space dimensions. Specifically, we focus our attention on nonsymmetric systems of equations whose eigenvalues lie on both sides of the imaginary axis, or whose symmetric part is not positive definite. These systems of equations are solved using a block Kaczmarz projection method with conjugate gradient acceleration. The algorithm has been designed with special emphasis on its suitability for multiprocessors. In the first part of the paper, we study the numerical properties of the algorithm and compare its performance with other algorithms such as the conjugate gradient method on the normal equations, and conjugate gradient-like schemes such as ORTHOMIN(k), GCR(k) and GMRES(k). We also study the effect of using various preconditioners with these methods. In the second part of the paper, we describe the implementation of our algorithm on the CRAY X-MP/48 multiprocessor, and study its behavior as the number of processors is increased.
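As a rough illustration of the projection idea behind this abstract, the sketch below implements the classical row-by-row Kaczmarz iteration on a small dense system. It is a simplification under stated assumptions: the paper uses a block variant with conjugate gradient acceleration on large sparse systems, none of which is reproduced here, and the function name `kaczmarz` and the fixed sweep count are illustrative choices.

```python
def kaczmarz(A, b, sweeps=200):
    """Classical (row-action) Kaczmarz iteration for A x = b.

    A is a list of rows, b a list of right-hand-side values.
    Not the block, CG-accelerated scheme of the paper; a plain sketch.
    """
    x = [0.0] * len(A[0])
    for _ in range(sweeps):
        for row, b_i in zip(A, b):
            norm_sq = sum(a * a for a in row)
            if norm_sq == 0.0:
                continue  # an all-zero row carries no information
            # Orthogonally project x onto the hyperplane row . x = b_i.
            residual = b_i - sum(a * x_j for a, x_j in zip(row, x))
            step = residual / norm_sq
            x = [x_j + step * a for x_j, a in zip(x, row)]
    return x


# Small nonsymmetric example: 4x + y = 9, -x + 3y = 5.
solution = kaczmarz([[4.0, 1.0], [-1.0, 3.0]], [9.0, 5.0])
```

Each projection touches only one row (or, in a block method, one block of rows), which is one reason this family of methods is attractive for multiprocessors.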

5.
The Erlangen General Purpose Array (EGPA) consists of a grid-like array of memory-coupled processor-modules. Above the array there is a hierarchy of processors for supervising and for data transports. An experimental pilot pyramid was realized. For a broad spectrum of applications the measured efficiency of the 4 worker-processors of the pilot pyramid ranged between 80 and 100%.

6.
Parallel Computing, 1997, 23(13): 2075-2093
This paper studies the parallel solution of large-scale sparse linear least squares problems on distributed-memory multiprocessors. The key components required for solving a sparse linear least squares problem are sparse QR factorization and sparse triangular solution. A block-oriented parallel algorithm for sparse QR factorization has already been described in the literature. In this paper, new block-oriented parallel algorithms for sparse triangular solution are proposed. The arithmetic and communication complexities of the new algorithms applied to regular grid problems are analyzed. The proposed parallel sparse triangular solution algorithms together with the block-oriented parallel sparse QR factorization algorithm result in a highly efficient approach to the parallel solution of sparse linear least squares problems. Performance results obtained on an IBM Scalable POWERparallel system SP2 are presented. The largest least squares problem solved has over two million rows and more than a quarter million columns. The execution speed for the numerical factorization of this problem achieves over 3.7 gigaflops per second on an IBM SP2 machine with 128 processors.

7.
Although various strategies have been developed for scheduling parallel applications with independent tasks, very little work exists for scheduling tightly coupled parallel applications on cluster environments. In this paper, we compare four different strategies based on performance models of tightly coupled parallel applications for scheduling the applications on clusters. In addition to algorithms based on existing popular optimization techniques, we also propose a new algorithm called Box Elimination that searches the space of performance model parameters to determine the best schedule of machines. By means of real and simulation experiments, we evaluated the algorithms on single cluster and multi-cluster setups. We show that our Box Elimination algorithm generates up to 80% more efficient schedules than other algorithms. We also show that the execution times of the schedules produced by our algorithm are more robust against the performance modeling errors. Copyright © 2009 John Wiley & Sons, Ltd.

8.
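To address the accumulated error of low-cost micro-electromechanical (MEMS) inertial navigation systems, a real-time indoor pedestrian navigation system is proposed that fuses pedestrian dead reckoning (PDR) with ultra-wideband (UWB) wireless positioning. The accelerometer and magnetometer are used for initial attitude alignment. The inertial navigation algorithm is derived with filter error estimation taken into account. Pedestrian gait detection relies on an AND logic over the accelerometer and gyroscope. Zero-velocity updates (ZUPT) provide the velocity-error observations, and the UWB system provides the position-error observations. An extended Kalman filter with an outlier identification mechanism is designed for data fusion. Experimental validation shows that, compared with traditional positioning methods, the proposed pedestrian navigation algorithm effectively improves positioning accuracy. In the experiments, the algorithm achieves a positioning error below 0.2 m and remains stable without diverging.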
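The AND logic for gait detection mentioned in this abstract can be sketched roughly as follows: a sample is treated as a stance (zero-velocity) phase only when both the accelerometer and the gyroscope agree that the foot is still. The function name `is_stance` and every threshold value here are hypothetical, chosen only for illustration; the abstract does not give the actual test quantities or thresholds.

```python
import math

GRAVITY = 9.81  # m/s^2, nominal value (assumption)


def is_stance(accel, gyro, accel_tol=0.5, gyro_max=0.5):
    """Return True when both sensors indicate a stationary foot.

    accel: (ax, ay, az) in m/s^2; gyro: (wx, wy, wz) in rad/s.
    accel_tol and gyro_max are hypothetical thresholds.
    """
    accel_norm = math.sqrt(sum(c * c for c in accel))
    gyro_norm = math.sqrt(sum(c * c for c in gyro))
    accel_still = abs(accel_norm - GRAVITY) < accel_tol  # only gravity sensed
    gyro_still = gyro_norm < gyro_max                    # almost no rotation
    # The AND: a zero-velocity update is allowed only if both tests pass.
    return accel_still and gyro_still
```

Requiring both conditions is more conservative than either test alone, which reduces the chance of applying a ZUPT while the foot is still moving.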

9.
New standards in signal, multimedia, and network processing for embedded electronics are characterized by computationally intensive algorithms and by a need for high flexibility due to swift changes in specifications. In order to meet the demanding challenges of increasing computational requirements and stringent constraints on area and power consumption in embedded engineering, there is a gradual trend towards coarse-grained parallel embedded processors. Furthermore, such processors are enabled with dynamic reconfiguration features for supporting time- and space-multiplexed execution of the algorithms. However, the formidable problem of efficiently mapping applications (mostly loop algorithms) onto such architectures has been a hindrance to their mass acceptance. In this paper we present (a) a highly parameterizable, tightly coupled, and reconfigurable parallel processor architecture together with the corresponding power breakdown and reconfiguration time analysis of a case study application, (b) a retargetable methodology for mapping of loop algorithms, (c) a co-design framework for modeling, simulation, and programming of such architectures, and (d) loosely coupled communication with the host processor.

10.
11.
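Routine inspection of underground cable trenches is labor-intensive and hazardous, and it is a problem that urban power systems urgently need to solve in order to keep operating stably; intelligent inspection robot systems are the emerging way to address it. Simultaneous localization and real-time map building are prerequisites for autonomous robot inspection of underground cable trenches. Scenes such as underground cable trenches are low-texture and structured, have complex road-surface flatness, and receive poor GPS signals; when an inspection robot maps such structured scenes, map degradation and reduced localization accuracy occur. To address these problems, a multi-sensor SLAM system is designed that fuses data from a 2D lidar, an inertial measurement unit, and a wheel odometer, and that optimizes the robot odometry through adaptive initialization. To correct the matching error between adjacent laser keyframes under different road-surface flatness, an adaptive inter-frame registration method is designed. Field tests show that, in underground cable trench scenes with complex road conditions, the method reduces the map degradation rate and the localization error by an average of 7.42% and 8.73%, respectively, compared with existing methods, and has clear value for engineering applications.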

12.
The perceptual cycle model (PCM) underpins much Ergonomics research, particularly in a team context, for example in its theoretical underpinning of distributed situation awareness. Despite this, the PCM framework has not been explicitly applied to explore team processes, which is surprising given the prevalence of teamwork in safety critical systems. This paper explores team processes in the context of search and rescue (SAR) by applying the PCM and an association classification scheme with a network analysis approach utilising the event analysis of systemic teamwork (EAST) method. Data were collected via observations and communication recordings during training flights with SAR crews and were amalgamated into a representative case study. The analysis demonstrates how the SAR team function within a distributed perceptual cycle whereby the actions of one team member become world information for another team member. Advancements to the EAST method are proposed and the implications of the research are discussed.

Practitioner Summary: This paper explores the perceptual cycle interactions of SAR crews using a novel EAST approach. The analysis demonstrates how the crew function as a distributed cognitive unit and applications in terms of training and design are discussed.


13.
A joint range-velocity closed tracking loop, based on a tightly coupled range and velocity filter, is proposed. When the measured velocity is used in the range tracking loop to modify the velocity and acceleration equations of the traditional α-β-γ filter, the resulting tracking loop can not only track range and velocity simultaneously but also improve the range tracking accuracy. The experimental results show that, when the filter parameters satisfy the least mean-square error criterion, the tracking error due to range thermal noise in the proposed loop is more than 2.2 dB lower than in the traditional loop. Moreover, as the filter parameter increases, the tracking performance of the proposed scheme improves accordingly.
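The traditional α-β-γ filter that this abstract takes as its baseline can be sketched as follows. This shows only the conventional range-only filter, not the proposed modification that feeds measured velocity into the velocity and acceleration equations. The acceleration gain is written as 2γ/T², which is one common convention and an assumption here; the gain values in the example are illustrative only.

```python
def alpha_beta_gamma(measurements, dt, alpha, beta, gamma):
    """Track range, velocity and acceleration from range measurements.

    Conventional alpha-beta-gamma filter; the 2*gamma/dt**2 acceleration
    gain is an assumed convention, not taken from the paper.
    """
    x, v, a = measurements[0], 0.0, 0.0  # start at the first measurement
    estimates = []
    for z in measurements[1:]:
        # Predict one step ahead with a constant-acceleration model.
        x_pred = x + v * dt + 0.5 * a * dt * dt
        v_pred = v + a * dt
        # Correct all three states with the range residual.
        residual = z - x_pred
        x = x_pred + alpha * residual
        v = v_pred + (beta / dt) * residual
        a = a + (2.0 * gamma / (dt * dt)) * residual
        estimates.append((x, v, a))
    return estimates


# Noise-free target receding at a constant 3 units per step.
ranges = [100.0 + 3.0 * k for k in range(400)]
track = alpha_beta_gamma(ranges, dt=1.0, alpha=0.5, beta=0.2, gamma=0.01)
```

In this baseline the velocity and acceleration estimates are driven only by the range residual, which is the part the abstract says it changes by bringing in the measured velocity.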

14.
This paper addresses mapping of streaming applications (such as MPEG) on multiprocessor platforms with time-division-multiplexed network-on-chip. In particular, we solve processor selection, path selection and router configuration problems. Given the complexity of these problems, state of the art approaches in this area largely rely on greedy heuristics, which do not guarantee optimality. Our approach is based on a constraint programming formulation that merges a number of steps, usually tackled in sequence in classic approaches. Thus, our method has the potential of finding optimal solutions with respect to resource usage under throughput constraints. The experimental evaluation presented here shows that our approach is capable of exploring a range of solutions while giving the designer the opportunity to emphasize the importance of various design metrics.

15.
16.
Innovation extrapolation method based on a tightly coupled GPS/SINS system   Total citations: 2, self-citations: 0, citations by others: 2
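In a tightly coupled GPS/SINS system, integrity is an extremely important indicator. To shorten the satellite fault detection time, an innovation extrapolation method is proposed that builds on the innovation detection method. The algorithm processes the innovations generated during extrapolation to form a test statistic for detecting satellite faults. Simulation results for a tightly coupled GPS/SINS system show that the innovation extrapolation method detects slowly varying faults faster than the innovation detection method and can, to some extent, reduce the influence of outliers on fault detection.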
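As background for the test statistic mentioned in this abstract, a generic innovation-based check on a single scalar measurement can be sketched as below. It is an assumption-laden baseline, not the extrapolation method of the paper: the statistic is the normalized innovation squared, and the default threshold 10.83 is the approximate 0.999 quantile of a chi-square distribution with one degree of freedom, picked here only as an example false-alarm setting. The name `innovation_test` is hypothetical.

```python
def innovation_test(measured, predicted, innovation_var, threshold=10.83):
    """Chi-square style check on one scalar Kalman-filter innovation.

    innovation_var is the filter's predicted innovation variance.
    Returns (statistic, fault_flag); the threshold is an assumed value.
    """
    innovation = measured - predicted
    statistic = innovation * innovation / innovation_var
    return statistic, statistic > threshold


# A consistent measurement and a clearly faulty one (illustrative numbers).
ok_stat, ok_fault = innovation_test(10.0, 10.5, 1.0)
bad_stat, bad_fault = innovation_test(20.0, 10.0, 4.0)
```

A slowly growing fault can stay below such a per-epoch threshold for a long time because the filter gradually absorbs it, which is the weakness the abstract's extrapolation approach targets.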

17.
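To address the low utilization of motion information and the insufficient attention to temporal information in video-based human action recognition, a human action recognition model based on a tightly coupled spatiotemporal two-stream convolutional neural network is proposed. First, two 2D convolutional neural networks separately extract the spatial and temporal features of the video. Then, the forget-gate module of a long short-term memory (LSTM) network is used to establish feature-level tightly coupled connections between the sampled segments so that information flows between them. Next, a bidirectional long short-term memory (Bi-LSTM) network evaluates the importance of each sampled segment and assigns it an adaptive weight. Finally, the spatial and temporal two-stream features are combined to complete human action recognition. In experiments on the UCF101 and HMDB51 datasets, the model achieves accuracies of 94.2% and 70.1%, respectively. The results show that the proposed tightly coupled spatiotemporal two-stream convolutional network model effectively improves the utilization of temporal information and the overall representation of actions, and thereby clearly improves the accuracy of human action recognition.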

18.
We address a variant of the scheduling problem on two identical machines, where we are given an additional speed-up resource. If a job uses the resource, its processing time may decrease. However, at any time the resource can be used by at most one job. The objective is to minimize the makespan. For the offline version, we present an FPTAS. For the online version where jobs arrive over list, we propose an online algorithm with a competitive ratio of 1.781, and show a lower bound of 1.686 for any online algorithm.

19.
Mixed-criticality scheduling on multiprocessors   Total citations: 1, self-citations: 0, citations by others: 1
The scheduling of mixed-criticality implicit-deadline sporadic task systems on identical multiprocessor platforms is considered. Two approaches, one for global and another for partitioned scheduling, are described. Theoretical analyses and simulation experiments are used to compare the global and partitioned scheduling approaches.
