Wikipedia workload analysis for decentralized hosting
Authors:Guido Urdaneta  Guillaume Pierre  Maarten van Steen
Abstract:We study an access trace containing a sample of Wikipedia's traffic over a 107-day period, aiming to identify appropriate replication and distribution strategies for a fully decentralized hosting environment. We perform a global analysis of the whole trace and a detailed analysis of the requests directed to the English edition of Wikipedia. In our study, we classify client requests and examine aspects such as the number of read and save operations, significant load variations, and requests for nonexistent pages. We also review proposed decentralized wiki architectures and discuss how they would handle Wikipedia's workload. We conclude that decentralized architectures must focus on techniques that handle read operations efficiently while maintaining consistency and coping with issues typical of decentralized systems, such as churn, unbalanced loads, and malicious participating nodes.
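As an illustration of the kind of request classification the abstract describes, the sketch below sorts simplified access-log URLs into read, save, and other categories. This is not the authors' tooling; the URL patterns (`/wiki/` article paths and `action=submit` edit submissions) are assumptions based on common MediaWiki URL conventions.

```python
# Hypothetical sketch: classify Wikipedia-style access-log URLs into the
# request categories examined in the study (reads, saves, other traffic).
# The path and query-string patterns are assumptions, not the paper's rules.
from urllib.parse import urlparse, parse_qs

def classify(url: str) -> str:
    """Return 'save', 'read', or 'other' for one logged request URL."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    if query.get("action") == ["submit"]:     # edit-form submission -> save
        return "save"
    if parsed.path.startswith("/wiki/"):      # plain article view -> read
        return "read"
    return "other"                            # images, API calls, etc.

if __name__ == "__main__":
    sample = [
        "http://en.wikipedia.org/wiki/Distributed_computing",
        "http://en.wikipedia.org/w/index.php?title=Distributed_computing&action=submit",
        "http://en.wikipedia.org/w/api.php?action=query",
    ]
    counts: dict[str, int] = {}
    for url in sample:
        kind = classify(url)
        counts[kind] = counts.get(kind, 0) + 1
    print(counts)
```

Running the sketch over the three sample URLs tallies one request in each category; a real analysis would stream the full trace through the same kind of classifier and aggregate counts per time window.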
This article is indexed in ScienceDirect and other databases.