Abstract: | This research proposes and tests an approach to engineering distributed file systems that are aimed at wide-scale, Internet-based use. The premise is that replication is essential to deliver performance and availability, yet the traditional conservative replica consistency algorithms do not scale to this environment. Our Ficus replicated file system uses a single-copy availability, optimistic update policy with reconciliation algorithms that reliably detect concurrent updates and automatically restore the consistency of directory replicas. The system uses the peer-to-peer model in which all machines are architectural equals but still permits configuration in a client-server arrangement where appropriate. Ficus has been used for six years at several geographically scattered installations. This paper details and evaluates the use of optimistic replica consistency, automatic update conflict detection and repair, the peer-to-peer (as opposed to client-server) interaction model, and the stackable file system architecture in the design and construction of Ficus. The paper concludes with a number of lessons learned from the experience of designing, building, measuring, and living with an optimistically replicated file system. © 1998 John Wiley & Sons, Ltd. |