Affinity-Based Network Interfaces for Efficient Communication on Multicore Architectures
Authors: Andrés Ortiz, Julio Ortega, Antonio F. Díaz, Alberto Prieto
Affiliation: 1. Department of Communications Engineering, University of Málaga, Málaga 29071, Spain
2. Department of Computer Architecture and Technology/CITIC, University of Granada, Granada 18071, Spain
Abstract: Network interface performance must improve to meet the demands of applications with high communication requirements (for example, some multimedia, real-time, and high-performance computing applications) and to exploit network links that provide bandwidths of multiple gigabits per second, which can require many processor cycles for communication tasks. Multicore architectures, the current trend in microprocessor development for coping with the difficulty of further increasing clock frequencies and microarchitecture efficiencies, provide new opportunities to exploit the parallelism available in the nodes when designing efficient communication architectures. Nevertheless, although present OS network stacks include multiple threads that make it possible to execute network tasks concurrently in the kernel, implementing packet-based or connection-based parallelism is not trivial, because it must take into account the cost of synchronization in accesses to shared resources and the efficient use of caches. Therefore, a common trend in recent research on this topic is to assign network interrupts and the corresponding protocol and network application processing to the same core, since this affinity scheduling can reduce contention for shared resources and cache misses. In this paper we propose and analyze several configurations that distribute the network interface among the different cores available in the server. These alternatives have been devised according to the affinity of the corresponding communication tasks with the location (proximity to the memories where the different data structures are stored) and characteristics of the processing core.
Because this approach uses several cores to accelerate the communication path of a given connection, it can be seen as complementary to approaches that use several cores to simultaneously process packets belonging to either the same or different connections. Message passing interface (MPI) workloads and dynamic web servers have been used as applications to evaluate and compare the communication performance of these alternatives. In our experiments, performed by full-system simulation, we observed improvements of up to 35% in throughput and up to 23% in latency for MPI workloads, and of up to 100% in throughput, up to 500% in response time, and up to 82% in requests attended per second for dynamic web servers.
Keywords: interrupt affinity; processor affinity; network interface; offloading; SIMICS
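
As a rough illustration of the two affinity mechanisms named in the keywords, the sketch below shows how processor affinity and an interrupt affinity mask are commonly expressed on Linux. It is not the configuration evaluated in the paper, which was studied by full-system simulation. The helper names `pin_to_core` and `irq_affinity_mask` are hypothetical, and the sketch assumes a Linux host with fewer than 32 cores, so that a single hexadecimal word is enough for the mask.

```python
import os


def pin_to_core(core_id: int) -> set:
    """Pin the calling process to one core and return the resulting affinity set.

    Uses the Linux-only os.sched_setaffinity; pid 0 means "this process".
    """
    os.sched_setaffinity(0, {core_id})
    return os.sched_getaffinity(0)


def irq_affinity_mask(core_id: int) -> str:
    """Return the hexadecimal CPU bitmask that selects only `core_id`.

    This is the format accepted by /proc/irq/<irq>/smp_affinity on Linux.
    Writing the mask requires root, so this sketch only computes it.
    """
    return format(1 << core_id, "x")


if __name__ == "__main__":
    # Pick a core this process is already allowed to run on.
    core = min(os.sched_getaffinity(0))
    print("process affinity:", pin_to_core(core))
    print("matching IRQ mask:", irq_affinity_mask(core))
```

Pinning the protocol-processing process to the core that also receives the network interrupt is the simplest form of the affinity scheduling described in the abstract. The configurations proposed by the authors instead distribute the communication tasks over several cores.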
This article is indexed in CNKI, Wanfang Data, SpringerLink, and other databases.