首页 | 本学科首页   官方微博 | 高级检索  
     


Highly reliable message-passing mechanism for cluster file system
Abstract:With the increase in personal computer clusters in popularity and quantity, message passing between nodes has been an important issue for high failure rate in the network. File access in a cluster file system often contains several sub-operations; each includes one or more network transmissions. Any network failures cause the file system service unavailable. In this paper, we describe a highly reliable message-passing mechanism (HR-NET), which tolerates both software and hardware network failures. HR-NET provides fine-grained, connection-level failover across redundant communication paths. With it, the file system can keep passing messages because HR-NET handles failures automatically by either recovery from network failures or failed over to a backup; therefore, it screens network failures from requests or data transmission of cluster file system. Load balance for messages is also achieved to relieve network traffic. For transmission timeout, HR-NET proposes a priority-based message scheduling which dynamically manages messages in an appropriate order to tolerate request–response failures between clients and servers. HR-NET is implemented upon standard network protocol stack. Performance results show that HR-NET can provide almost full underlying network bandwidth with average 6.17% throughput loss and provide a fast recovery. Experiments with cluster file system show that the overall performance degradation is below 8% due to failover of HR-NET while the reliability is highly enhanced.
Keywords:cluster file system  message-passing mechanism  high reliability  fault tolerance
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号