Extracting exact answers from large-scale corpus based on hybrid strategy |
| |
Cite this article: | LI Peng, WANG Xiao-long, WANG Bao-xun. Extracting exact answers from large-scale corpus based on hybrid strategy[J]. Journal of Communication and Computer (通讯和计算机), 2007, 4(8): 44-52 |
| |
Authors: | LI Peng, WANG Xiao-long, WANG Bao-xun |
| |
Affiliation: | School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China |
| |
Abstract: | This paper presents a novel and efficient method for extracting exact textual answers from documents returned by a traditional IR system over a large-scale text collection. Its main intended contribution is the System Similarity Model (SSM), which can be considered an extension of the vector space model (VSM) for ranking passages. The paper also presents a formalized answer extraction method based on pattern learning and applies a binary logistic regression model (LRM), which is seldom used in information extraction (IE), to extract specific information from candidate data sets. Because the data used for parameter estimation suffer from severe sparsity, we adopt stratified sampling and improve the traditional parameter estimation method for the logistic regression model. Experimental results show that the system performs well overall and that the approach is effective. Our system, lnsun05QAl, which participated in the QA track of TREC 2005, obtained excellent results.
|
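Keywords: | question answering; answer extraction; large-scale collection; system similarity model; stratified sampling; regression model; hybrid strategy |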
Keywords: | question answering; answer extraction; system similarity model; stratified sampling; logistic regression model |
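The abstract describes SSM as an extension of the vector space model for ranking passages, but this record gives no details of SSM itself. The following is therefore only a minimal sketch of the plain VSM baseline (TF-IDF weights with cosine similarity), not the authors' method. The function names `tokenize` and `rank_passages`, the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def rank_passages(question, passages):
    """Rank passages by TF-IDF cosine similarity to the question (plain VSM).

    Returns a list of (score, passage_index) pairs, best match first.
    """
    docs = [Counter(tokenize(p)) for p in passages]
    n = len(docs)
    # Document frequency: number of passages that contain each term.
    df = Counter(term for doc in docs for term in doc)
    # Smoothed IDF (an assumption), so a term found in every passage
    # still receives a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def weigh(counts):
        # Terms never seen in any passage are dropped from the vector.
        return {t: tf * idf[t] for t, tf in counts.items() if t in idf}

    def cosine(u, v):
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        norm_u = math.sqrt(sum(w * w for w in u.values()))
        norm_v = math.sqrt(sum(w * w for w in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    q = weigh(Counter(tokenize(question)))
    scored = [(cosine(q, weigh(d)), i) for i, d in enumerate(docs)]
    # Highest score first; ties are broken by original passage order.
    return sorted(scored, key=lambda s: (-s[0], s[1]))
```

Calling `rank_passages(question, passages)` returns the passage indices ordered by similarity. In a pipeline like the one the abstract outlines, the top-ranked passages would then be handed to the answer extraction stage.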