Similar literature
Found 20 similar documents (search time: 15 ms)
1.
Over the years, heterogeneous systems have come to dominate concurrent job execution. A heterogeneous system is a natural choice because it can be built on top of legacy systems. Scheduling on such systems is an important activity because it affects job execution characteristics, and heterogeneity introduces many challenges for efficient job execution. Heterogeneity in core architecture opens the possibility of a heterogeneous memory architecture in many-core and multi-core heterogeneous systems. As a result, it is often impossible to determine, for the same instruction, whether a high-frequency core has lower or higher memory latency than a low-frequency core. The work proposes an improved scheduler for systems in which both cores and memory are heterogeneous. It defines the average effective time (\(\mathrm{AE}_\mathrm{t}\)) as the base parameter for this purpose. The priorities of each thread (workload) and each core are generated dynamically from \(\mathrm{AE}_\mathrm{t}\) for effective mapping. Experimental results on benchmark data show that the proposed scheduler outperforms other similar models in core utilization, speedup, and efficiency.
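As an illustration only, the idea of ranking threads and cores by an average effective time can be sketched as follows. The abstract does not give the formula for \(\mathrm{AE}_\mathrm{t}\), so the cost model here (compute time plus memory stall time), the `Core` and `Thread` fields, and the greedy pairing are all assumptions made for the sketch, not the authors' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    name: str
    freq_ghz: float        # clock frequency in GHz (cycles per ns)
    mem_latency_ns: float  # latency of the memory attached to this core


@dataclass(frozen=True)
class Thread:
    name: str
    cycles: float          # compute cycles (hypothetical workload model)
    mem_accesses: float    # number of memory accesses


def effective_time(t: Thread, c: Core) -> float:
    """Assumed cost model: compute time plus memory stall time, in ns."""
    return t.cycles / c.freq_ghz + t.mem_accesses * c.mem_latency_ns


def map_threads(threads, cores):
    """Pair the most demanding threads with the cores that are fastest on average.

    If there are more threads than cores, only the first len(cores) threads
    are mapped in this simplified sketch.
    """
    # Thread priority: average effective time over all cores (higher = more demanding).
    t_avg = {t: sum(effective_time(t, c) for c in cores) / len(cores) for t in threads}
    # Core priority: average effective time over all threads (lower = better core).
    c_avg = {c: sum(effective_time(t, c) for t in threads) / len(threads) for c in cores}
    by_demand = sorted(threads, key=lambda t: t_avg[t], reverse=True)
    by_speed = sorted(cores, key=lambda c: c_avg[c])
    return {t.name: c.name for t, c in zip(by_demand, by_speed)}
```

With a 3 GHz core attached to slow memory and a 1.5 GHz core attached to fast memory, a memory-bound thread can end up on the lower-frequency core, which is the kind of case where frequency alone does not decide the mapping.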

2.
Several studies have shown that Asymmetric Multicore Processor (AMP) systems, which are composed of processors with different hardware characteristics, offer better performance and power efficiency than homogeneous systems. With Moore's law still holding, core-count growth typically leads to non-uniform memory access (NUMA). Existing schedulers assume that the underlying architecture is homogeneous and, as a consequence, may not be well suited to AMP and NUMA systems, because they neither properly exploit the asymmetry of hardware elements nor improve memory utilization by avoiding data starvation across multiple processes. In this paper we propose a new scheduler, the NUMA-aware Scheduler, to accommodate the next generation of AMP architectures with respect to architecture asymmetry and process starvation. Experimental results on the PARSEC benchmarks show an average speedup of 1.36 times over the default Linux scheduler, demonstrating that the proposed technique is promising when compared to prior studies.

3.
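This paper presents the design and implementation of an intelligent water supply system. The system consists of smart water meters, card readers, and management software. The smart water meter is based on the C8051F021 microcontroller and the USOSⅡ embedded real-time operating system. The system supports multiple levels, multiple modes, and permission-based restrictions; it automates the quantitative management of an organization's daily water use and saves a large amount of water.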

4.
Computational grids have become an appealing research area because they solve compute-intensive problems in the scientific community and in industry. A grid's computational power is aggregated from a huge set of distributed heterogeneous workers; hence, grid computing is becoming a mainstream technology for large-scale distributed resource sharing and system integration. Unfortunately, current grid schedulers suffer from the haste problem, which is the scheduler's inability to successfully allocate all input tasks. Some tasks fail to complete execution because they are allocated to unsuitable workers; others may never start execution because suitable workers have already been allocated to other peers. This paper is the first to introduce the scheduling haste problem. It also presents a reliable grid scheduler that uses a fuzzy inference system to select the most suitable worker for each input grid task, and hence minimizes the turnaround time for a set of grid tasks. Moreover, the scheduler is system-oriented in that it avoids the scheduling haste problem. Experimental results show that the proposed scheduler outperforms traditional grid schedulers in scheduling efficiency.

5.
The International Parallel and Distributed Processing Symposium (IPDPS) 2008 panel titled "How to avoid making the same Mistakes all over again: What the parallel-processing Community has (failed) to offer the multi/many-core Generation" sought to provoke discussion on current and recent computer science education in relation to the emergence of fundamentally parallel multi/many-core systems. Is today's, tomorrow's, or yesterday's computer science graduate equipped to deal with the challenges of parallel software development for such systems? Are mistakes from the past being unnecessarily repeated? Which fundamental contributions of the parallel processing research community to the current state of affairs are possibly being ignored? What new challenges have not been addressed in past parallel processing research? What should computer science education in parallel processing look like? Should it be taught at all?

6.

This article presents the design and implementation of an air-crew assignment system for producing and refining a solution to this problem, based on the artificial intelligence principles and techniques of abductive reasoning as captured by the framework of abductive logic programming (ALP). The system offers a high level of flexibility in addressing both crew scheduling and rescheduling. It can be used to generate a valid, good-quality initial solution and then help human operators further adjust and refine this solution to meet extra requirements of the problem. These additional needs can arise either from new foreseen requirements that the company wants to adopt or experiment with for a particular period of time, or from unexpected events that occur while the solution (crew roster) is in operation. This work shows the ability and flexibility of abduction, and more specifically of ALP, in tackling problems of this type with complex and changing requirements.

7.

Measuring the vessel pattern in fingers is a superior method for identifying individuals owing to its convenience and the security it offers. In this paper we introduce a new perspective on finger vein recognition. The method regards deformations as discriminative information, in contrast to existing methods that attempt to suppress the influence of deformations. The proposed technique is based on the observation that regular deformation, which corresponds to a posture change, can exist only in genuine vein patterns. In terms of methodology, we incorporate optimized matching to generate pixel-based 2D displacements that correspond to deformations. The uniformity texture extracted from the displacement fields is taken as the final matching score. In extensive experiments on two publicly available databases, PolyU and SDU-MLA, the new feature derived from deformations showed better discriminability, and the equal error rate (EER) achieved is the lowest among state-of-the-art techniques.


8.
An implementable parallel scheduler for input-queued switches   Total citations: 1 (self-citations: 0, citations by others: 1)
Giaccone, P.; Shah, D.; Prabhakar, B. IEEE Micro, 2002, 22(1): 19-25
The Apsara algorithm is an input-queued switch scheduler that uses limited parallelism to find a matching in a single iteration, as compared to the \(O(N^3)\) iterations of the more common maximum-weight matching algorithm. The Apsara algorithm also achieves a throughput of up to 100 percent and has very good delay properties.
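The abstract does not spell out how the single-iteration search works. One plausible reading, assumed here for illustration, is a local search that compares the previous matching with its neighbors (matchings that differ by swapping the outputs of two inputs) and keeps the heaviest. The function names and the queue-length weight are hypothetical, and the candidates that hardware would evaluate in parallel are evaluated sequentially in this sketch.

```python
from itertools import combinations


def weight(matching, queues):
    """Total queue length served; matching[i] is the output assigned to input i."""
    return sum(queues[i][out] for i, out in enumerate(matching))


def neighbors(matching):
    """Yield every matching obtained by swapping the outputs of two inputs."""
    for i, j in combinations(range(len(matching)), 2):
        swapped = list(matching)
        swapped[i], swapped[j] = swapped[j], swapped[i]
        yield tuple(swapped)


def next_matching(previous, queues):
    """Pick the heaviest matching among the previous one and its neighbors."""
    candidates = [tuple(previous), *neighbors(previous)]
    return max(candidates, key=lambda m: weight(m, queues))
```

Because the previous matching is always a candidate, the weight of the selected matching never drops below that of the previous matching under the same queue state.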

9.
Performance in superscalar processing depends strongly on the compiler's ability to generate code that the hardware can execute in an optimal or near-optimal order. Generating optimal code is an NP-complete problem. However, highly optimized code is needed in settings such as superscalar and real-time systems. In this paper, an instruction scheduling scheme for optimizing a program trace is proposed. Optimized code can be obtained without much redundant work if some important features of the code are well explored and used in scheduling. To formalize the task, two abstract models are given, one for a superscalar processor and the other for a program trace. These two models reflect most of the characteristics of the scheduling problem. The interrelations between instructions and partial schedules are thoroughly studied, and dominance and equivalence relations on them are defined. These relations are then used to reduce the solution space and eventually help to produce optimal schedules. Experimental results that show the promise of the proposed scheme are also presented.

10.
Lock synchronization is a key programming primitive for shared-memory many-core CMPs. However, as the number of cores increases, conventional software implementations cannot reach the desired levels of performance and scalability. Meanwhile, most existing hardware-supported lock proposals require modifications at some level of the memory hierarchy, thus degrading the QoS of applications through synchronization traffic.

11.
Emerging non-volatile memory technologies, especially flash-based solid state drives (SSDs), have increasingly been adopted in the storage stack. They provide numerous advantages over traditional mechanically rotating hard disk drives (HDDs) and tend to replace them. Because HDDs have long been the primary building blocks of storage systems, however, much of the system software has been designed specifically for HDDs and may not be optimal for non-volatile memory media. Therefore, to make full use of the superior raw performance of SSDs, the existing upper-layer software has to be re-evaluated or redesigned. To this end, we propose PASS, an optimized I/O scheduler at the Linux block layer that accommodates the shift of underlying storage devices toward flash-based SSDs. PASS takes the rich internal parallelism of SSDs into account when dispatching requests to the device driver in order to achieve high performance. Specifically, it partitions the logical storage space into fixed-size regions (preferably of the component package size) that serve as scheduling units. These scheduling units are serviced in a round-robin manner, and on each turn the chosen unit issues only a batch of either read or write requests, which suppresses excessive mutual interference. Additionally, requests waiting in the dispatching queues are sorted by address to exploit the high sequential performance of SSDs. Experimental results with a variety of workloads show that PASS outperforms the four off-the-shelf Linux I/O schedulers by 3% to 41%, while also improving device lifetime significantly by reducing internal write amplification.
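The dispatching policy described above (fixed-size regions, round-robin service, single-type batches, address-sorted queues) can be sketched in user space as follows. This is a minimal model for illustration, not the PASS kernel code: the class name, the region and batch sizes, and the choice to serve reads before writes within a region are assumptions.

```python
from collections import defaultdict

REGION_SIZE = 1024  # hypothetical region size, in sectors
BATCH_SIZE = 4      # hypothetical maximum number of requests per turn


class RegionScheduler:
    def __init__(self, region_size=REGION_SIZE, batch_size=BATCH_SIZE):
        self.region_size = region_size
        self.batch_size = batch_size
        # region id -> separate read ("R") and write ("W") queues of addresses
        self.queues = defaultdict(lambda: {"R": [], "W": []})
        self.last_region = -1

    def submit(self, op, address):
        """Queue a request in the region that contains its logical address."""
        self.queues[address // self.region_size][op].append(address)

    def dispatch(self):
        """Return the next batch as a list of (op, address), or [] when idle."""
        pending = sorted(r for r, q in self.queues.items() if q["R"] or q["W"])
        if not pending:
            return []
        # Round-robin: first pending region after the last one served, wrapping around.
        region = next((r for r in pending if r > self.last_region), pending[0])
        self.last_region = region
        q = self.queues[region]
        op = "R" if q["R"] else "W"  # reads before writes is an assumption
        q[op].sort()                 # issue in ascending address order
        batch, q[op] = q[op][: self.batch_size], q[op][self.batch_size:]
        return [(op, a) for a in batch]
```

Each call to `dispatch` returns requests of a single type from a single region, so reads and writes never share a batch.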

12.
A neuro-fuzzy system specially suited for efficient implementations is presented. The system is of the same type as the well-known "adaptive network-based fuzzy inference system" (ANFIS) method. However, different restrictions are applied to the system that considerably reduce the complexity of the inference mechanism, so that efficient implementations can be developed. Experiments are presented that demonstrate the good performance of the proposed system despite its restrictions. Finally, an efficient digital hardware implementation is presented for a two-input single-output neuro-fuzzy system.
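The abstract does not list the restrictions it applies. As a hedged illustration of how restrictions can simplify ANFIS-type inference, the sketch below assumes triangular membership functions that overlap only in adjacent pairs (so memberships sum to 1) and constant rule consequents; both choices, and all names, are assumptions rather than the paper's design.

```python
def tri_memberships(x, centers):
    """Memberships of x in triangular sets centred at the sorted `centers`.

    At most two adjacent sets are active and their memberships sum to 1.
    """
    mu = [0.0] * len(centers)
    x = min(max(x, centers[0]), centers[-1])  # clamp x to the covered range
    for k in range(len(centers) - 1):
        lo, hi = centers[k], centers[k + 1]
        if lo <= x <= hi:
            mu[k + 1] = (x - lo) / (hi - lo)
            mu[k] = 1.0 - mu[k + 1]
            break
    return mu


def infer(x1, x2, centers1, centers2, consequents):
    """Two-input, single-output inference with product rule activation.

    consequents[i][j] is the constant output of the rule that pairs
    set i of the first input with set j of the second input.
    """
    mu1 = tri_memberships(x1, centers1)
    mu2 = tri_memberships(x2, centers2)
    num = den = 0.0
    for i, a in enumerate(mu1):
        for j, b in enumerate(mu2):
            w = a * b  # rule activation
            num += w * consequents[i][j]
            den += w
    return num / den  # weighted average of rule outputs
```

Under these assumptions at most four rules are active for any input and the activations already sum to 1, which is the kind of simplification that makes a hardware implementation cheaper.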

13.
14.
15.

Compound images comprise not only photographic images but also graphics and text; they are found in magazines, brochures, and websites. Segmenting and compressing compound images (for instance, computer-generated images and scanned documents) is difficult, and existing segmentation and compression techniques do not provide a comprehensive solution. To address the problems of existing techniques, we segment compound images dynamically using an optimization-based K-means clustering technique together with an AC (alternating current) coefficient method, and then compress each segment individually. The AC-coefficient-based segmentation separates smooth (background) areas from non-smooth (text, image, and overlapping) areas. The non-smooth part is then segmented further with the optimization-based K-means clustering technique. The segmented objects are compressed with different strategies, namely the Huffman coder, the arithmetic coder, and JPEG coders. The entire proposed architecture is implemented in MATLAB, and its performance is measured and compared with existing approaches. Our proposed system achieves a better compression ratio (21.16) and also improves the image quality index (0.931574), PSNR (Peak Signal to Noise Ratio) (34.91338), RMSE (Root Mean Square Error) (0.931574), SSIM (Structural Similarity) (0.546882), and SDME (Second Derivative-like Measure of Enhancement) (44.91293) relative to the available CS K-means algorithm.
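For readers unfamiliar with the reported metrics, the standard definitions of compression ratio, RMSE, and PSNR can be written out as below. This is a generic sketch over flat pixel lists with an assumed 8-bit peak value of 255; it is not the MATLAB evaluation code used in the paper.

```python
import math


def rmse(original, reconstructed):
    """Root mean square error between two equally sized pixel sequences."""
    n = len(original)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / n)


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    e = rmse(original, reconstructed)
    return math.inf if e == 0 else 20 * math.log10(peak / e)


def compression_ratio(original_bytes, compressed_bytes):
    """Uncompressed size divided by compressed size."""
    return original_bytes / compressed_bytes
```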


16.
This paper describes the design and development of IS (for Intelligent Scheduler), a true multiple-criteria knowledge-based scheduler that can be used for operational-level scheduling of batch manufacturing systems, sometimes called job shops. IS incorporates a heuristic algorithm coupled with two knowledge bases, one for job scheduling and the other for selecting a suitable schedule based on the user-provided criterion or criteria. With fourteen dispatching rules, it can generate both static and dynamic schedules. IS is a far more realistic and sophisticated model, accounting for many important factors, such as multiple machines, multiple fixtures, multiple tools, alternate processing routes, machine setup time, machine processing time, due date, job arrival time, initial shop loading, and hot jobs, and considering either one criterion or multiple criteria simultaneously. In addition, IS is coded in C and has all the features of a modern professional-quality interactive program. It has moving-bar and pull-down menus, an on-line help function, a friendly human-computer interface, and an intuitive, easy-to-understand representation of the schedules.
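The abstract does not name its fourteen dispatching rules or its selection logic. To show the general pattern of generating schedules with several rules and then choosing one by a user criterion, the sketch below uses two textbook rules (shortest processing time and earliest due date), a single machine, and total tardiness as the criterion. All of these choices, and the names `Job`, `RULES`, `schedule`, and `best_rule`, are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    processing_time: int
    due_date: int
    hot: bool = False  # hot jobs are sequenced ahead of all others


# Two textbook dispatching rules; the paper's fourteen rules are not listed.
RULES = {
    "SPT": lambda job: job.processing_time,  # shortest processing time first
    "EDD": lambda job: job.due_date,         # earliest due date first
}


def schedule(jobs, rule):
    """Sequence jobs on one machine; return (job order, total tardiness)."""
    order = sorted(jobs, key=lambda j: (not j.hot, RULES[rule](j)))
    clock = tardiness = 0
    for job in order:
        clock += job.processing_time
        tardiness += max(0, clock - job.due_date)
    return [j.name for j in order], tardiness


def best_rule(jobs):
    """Select the rule whose schedule has the lowest total tardiness."""
    return min(RULES, key=lambda r: schedule(jobs, r)[1])
```

Swapping the value returned by `schedule` for another measure, such as makespan or the number of late jobs, changes the selection criterion without touching the rules.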

17.
The human eye cannot see subtle motion signals that fall outside human visual limits, due to either limited resolution of intensity variations or lack of sensitivity to lower spatial and temporal frequencies. Yet these invisible signals can be highly informative when amplified to be observable by a human operator or an automatic machine vision system. Many video magnification techniques have recently been proposed to magnify and reveal these signals in videos and image sequences. Existing techniques are limited by noise level, video quality, and long execution time, so there is value in developing a new magnification method that treats these issues as the main consideration. This study presents a new magnification method that outperforms other magnification techniques in noise removal, video quality at large magnification factors, and execution time. The proposed method is compared with four methods: Eulerian video magnification, phase-based video magnification, Riesz pyramid for fast phase-based video magnification, and enhanced Eulerian video magnification. The experimental results demonstrate the superior performance of the proposed method on all video quality metrics used. Our method is also 60–70% faster than Eulerian video magnification, whereas the other competing methods take longer to execute than Eulerian video magnification.

18.
19.
Many current graphical display systems use a buffer memory system to hold a two-dimensional image array that is modified and displayed. To speed up updates, the buffer memory system must access many image points within an image subarray in parallel. This paper proposes an efficient buffer memory system for a fast, high-resolution graphical display system. The memory system provides parallel access to pq image points within a block (p×q), a horizontal (1×pq), a vertical (pq×1), a forward-diagonal, or a backward-diagonal subarray of a two-dimensional M×N image array, where the design parameters p and q are both powers of two. In the address calculation and routing circuit of the proposed buffer memory system, the address differences of the five subarrays are prearranged according to the index numbers of the memory modules and stored in two static random access memories (SRAMs), so that the addresses are obtained simply by adding the address differences to the base address. In addition, for fast address calculation when N is not a power of two, the single multiplication in the base address calculation is replaced by an SRAM access, so that the multiplication is carried out during the SRAM access for the address differences. The proposed address calculation and routing circuit improves on previous circuits in hardware cost, control complexity, and speed.

20.
A structured video consists of a collection of background objects, characters, spatial and temporal constructs, and rendering features. Assuming a platform with a fixed amount of memory and a magnetic disk drive, this study presents a resource scheduler for the continuous display of structured video that minimizes both the latency observed by a display and the amount of memory it requires.

