KPS: a Web information mining algorithm

Affiliation:	1. Department of Marketing, Clemson University, United States; 2. Department of Management, Clemson University, United States; 1. Department of Information Systems, College of Business, City University of Hong Kong, Hong Kong Special Administrative Region; 2. School of Information, Renmin University of China, Beijing 100872, PR China; 3. Smart City Research Center, Renmin University of China, Beijing 100872, PR China

Abstract:	The Web mostly contains semi-structured information, yet it is not easy to search for and extract the structured data hidden in a Web page. Current practice addresses this problem through (1) syntax analysis (i.e., of HTML tags) or (2) wrappers or user-defined declarative languages. The former is suitable only for highly structured Web sites; the latter is time-consuming and scales poorly, since wrappers can handle tens, but certainly not thousands, of information sources. In this paper, we present a novel information mining algorithm, KPS, for semi-structured information on the Web. KPS employs keywords, patterns, and/or samples to mine the desired information. Experimental results show that KPS is more efficient than existing Web extraction methods.
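
The abstract does not spell out how KPS works, so the following is only a minimal sketch of the general idea of combining a keyword with a pattern to pull values out of semi-structured HTML; it is not the authors' algorithm. The names `TextExtractor` and `mine_by_keyword`, the sample page, and the price pattern are all hypothetical and chosen purely for illustration.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the non-empty text chunks of an HTML page, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def mine_by_keyword(html, keyword, pattern):
    """Hypothetical helper: return pattern matches found in a text chunk
    that contains the keyword, or in the chunk immediately after it."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    regex = re.compile(pattern)
    results = []
    for i, chunk in enumerate(parser.chunks):
        if keyword.lower() in chunk.lower():
            # Search the keyword chunk together with the next chunk, so the
            # match does not depend on which tags separate label and value.
            window = " ".join(parser.chunks[i:i + 2])
            results.extend(regex.findall(window))
    return results


# Two differently marked-up "price" fields (illustrative data only).
page = """
<html><body>
  <div><b>Price:</b> <span>$19.99</span></div>
  <p>Shipping is free.</p>
  <table><tr><td>Price</td><td>$5.00</td></tr></table>
</body></html>
"""

prices = mine_by_keyword(page, "price", r"\$\d+(?:\.\d{2})?")
print(prices)
```

Because the lookup is driven by the keyword and the value pattern rather than by a fixed tag path, the same call finds the price in both the `div` layout and the `table` layout, which is the kind of independence from page-specific wrappers that the abstract argues for.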

Keywords:
This article is indexed in ScienceDirect and other databases.