Extracting exact answers from large-scale corpus based on hybrid strategy
Cite this article: LI Peng, WANG Xiao-long, WANG Bao-xun. Extracting exact answers from large-scale corpus based on hybrid strategy[J]. 通讯和计算机, 2007, 4(8): 44-52
Authors: LI Peng, WANG Xiao-long, WANG Bao-xun
Affiliation: School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Abstract: This paper presents a novel and efficient method for extracting exact textual answers from documents retrieved by a traditional IR system from a large-scale text collection. Its main contribution is the System Similarity Model (SSM), which can be regarded as an extension of the vector space model (VSM), for ranking passages. It also presents a formalized answer extraction method based on pattern learning, and applies a binary logistic regression model (LRM), which is seldom used in IE, to extract specific information from candidate data sets. Because the data sets used for parameter estimation suffer from severe data sparseness, we adopt stratified sampling and improve the traditional parameter estimation method for the logistic regression model. A series of experiments shows that the overall performance of our system is good and that our approach is effective. Our system, Insun05QA1, which participated in the QA track of TREC 2005, obtained excellent results.

Keywords: question answering   answer extraction   large-scale collection   system similarity model   stratified sampling   logistic regression model   hybrid strategy
This article is indexed in 维普 (VIP) and other databases.