首页 | 本学科首页   官方微博 | 高级检索  
     


Web robot detection techniques: overview and limitations
Authors:Derek Doran  Swapna S. Gokhale
Affiliation:1. Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, 06269, USA
Abstract:
Most modern Web robots that crawl the Internet to support value-added services and technologies possess sophisticated data collection and analysis capabilities. Some of these robots, however, may be ill-behaved or malicious, and hence, may impose a significant strain on a Web server. It is thus necessary to detect Web robots in order to block undesirable ones from accessing the server. Such detection is also essential to ensure that the robot traffic is considered appropriately in the performance and capacity planning of Web servers. Despite a variety of Web robot detection techniques, there is no consensus regarding a single technique, or even a specific ??type?? of technique, that performs well in practice. Therefore, to aid in the development of a practically applicable robot detection technique, this survey presents a critical analysis and comparison of the prevalent detection approaches. We propose a framework to classify the existing detection techniques into four categories based on their underlying detection philosophy. We compare the different classes to gain insights into those characteristics that make up an effective robot detection scheme. Finally, we discuss why the contemporary techniques fail to offer a general solution to the robot detection problem and propose a set of key ingredients necessary for strong Web robot detection.
Keywords:
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号