Learning Information Extraction Rules for Semi-Structured and Free Text

Authors:	Stephen Soderland

Affiliation:	(1) Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2350

Abstract:	A wealth of on-line text information can be made available to automatic processing by information extraction (IE) systems. Each IE application needs a separate set of rules tuned to the domain and writing style. WHISK helps to overcome this knowledge-engineering bottleneck by learning text extraction rules automatically. WHISK is designed to handle text styles ranging from highly structured to free text, including text that is neither rigidly formatted nor composed of grammatical sentences. Such semi-structured text has largely been beyond the scope of previous systems. When used in conjunction with a syntactic analyzer and semantic tagging, WHISK can also handle extraction from free text such as news stories.
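
To make the notion of a "text extraction rule" for semi-structured text concrete, here is a minimal sketch in Python. It is an illustration only: the rule is a hand-written regular expression, not a rule learned by WHISK, and it does not reproduce WHISK's rule syntax. The sample ad, the field names (bedrooms, price), and the function extract_rentals are all assumptions introduced for this example.

```python
import re

# Hypothetical semi-structured text: a terse rental ad that is neither
# rigidly formatted nor made of grammatical sentences.
AD = "Capitol Hill - 1 br twnhme, D/W W/D, pkg incl $675. 3 BR upper flr, incl gar $995."

# Hand-written rule (an assumption, not a learned rule): capture a number
# directly before "br", skip as little text as possible, then capture the
# number that follows the next "$".
RULE = re.compile(r"(\d+)\s*br\b.*?\$(\d+)", re.IGNORECASE)


def extract_rentals(text):
    """Return one record per rule match; a single ad may yield several."""
    return [
        {"bedrooms": int(bedrooms), "price": int(price)}
        for bedrooms, price in RULE.findall(text)
    ]


if __name__ == "__main__":
    for record in extract_rentals(AD):
        print(record)
```

A rule of this kind has to be written and tuned by hand for each domain and writing style, which is the knowledge-engineering bottleneck the abstract refers to; a rule learner would instead induce such patterns from annotated examples.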

Keywords:	natural language processing, information extraction, rule learning
This article is indexed in SpringerLink and other databases.