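Using Matching Tree and Decision Trees to Identify BaseNP in English Sentences
Cite this article: XUN En-Dong, LI Sheng, ZHAO Tie-Jun. Using matching tree and decision trees to identify BaseNP in English sentences [J]. Journal of Computer Research and Development, 2000, 37(7): 826-832.
Authors: XUN En-Dong  LI Sheng  ZHAO Tie-Jun
Affiliation: Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin 150001
Funding: National Natural Science Foundation of China; National "863" High-Tech Research and Development Program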
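Abstract: This paper proposes a method that combines a corpus-based approach with machine learning to identify simple, non-recursive noun phrases (BaseNP) in English sentences. From a training corpus annotated with part-of-speech (POS) tags and BaseNP boundaries, the POS sequences corresponding to all distinct types of BaseNP (BaseNP rules) are extracted. Using rule ranking and linguistic knowledge, rules that have low precision and are clearly ungrammatical are removed. At identification time, a rule matching tree is used to perform longest-length matching, and context information is introduced through the inductive machine learning algorithm C4.5, with the C4.5 algorithm learning effective (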

Keywords: BaseNP  matching tree  decision tree  English sentences  natural language processing

USING MATCHING TREE AND DECISION TREES TO IDENTIFY BaseNP IN ENGLISH SENTENCES
XUN En-Dong,LI Sheng,ZHAO Tie-Jun.USING MATCHING TREE AND DECISION TREES TO IDENTIFY BaseNP IN ENGLISH SENTENCES[J].Journal of Computer Research and Development,2000,37(7):826-832.
Authors:XUN En-Dong  LI Sheng  ZHAO Tie-Jun
Abstract:This paper proposes a method that combines a corpus-based approach with machine learning to identify simple, non-recursive noun phrases (BaseNP). First, all distinct part-of-speech (POS) sequences that correspond to BaseNPs (BaseNP rules) are extracted from a training corpus annotated with POS tags and BaseNP boundaries. Using training statistics and linguistic knowledge, rules that have low precision and are clearly ungrammatical are deleted. Second, the remaining BaseNP rules are used to identify BaseNPs in new sentences: a heuristic longest-match algorithm is applied, combined with inductive decision-tree learning so that context is taken into account. Experiments show that the method achieves higher precision and recall.
Keywords:BaseNP  noun phrase  matching tree  decision tree  
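
The rule extraction and longest-match steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the B/I/O boundary labels, the Penn-style tag names, the toy corpus, and the function names `extract_rules`, `build_matching_tree`, and `longest_match` are all assumptions made for illustration, and the precision-based rule filtering and the C4.5 context step are omitted.

```python
from collections import Counter

END = object()  # sentinel key marking "a complete rule ends at this node"

def extract_rules(corpus):
    """Count the POS sequence of every annotated BaseNP.

    corpus: list of sentences; each sentence is a list of (pos_tag, label)
    pairs, where label is "B" (begins a BaseNP), "I" (inside) or "O" (outside).
    This annotation scheme is an assumption, not taken from the paper.
    """
    counts = Counter()
    for sentence in corpus:
        current = []
        for tag, label in sentence:
            if label == "B":
                if current:
                    counts[tuple(current)] += 1
                current = [tag]
            elif label == "I" and current:
                current.append(tag)
            else:
                if current:
                    counts[tuple(current)] += 1
                current = []
        if current:  # BaseNP that runs to the end of the sentence
            counts[tuple(current)] += 1
    return counts

def build_matching_tree(rules):
    """Store POS-sequence rules in a trie keyed by POS tag."""
    root = {}
    for rule in rules:
        node = root
        for tag in rule:
            node = node.setdefault(tag, {})
        node[END] = True
    return root

def longest_match(tree, tags):
    """Scan left to right, taking the longest rule that matches at each position.

    Returns a list of (start, end) spans with end exclusive.
    """
    spans = []
    i = 0
    while i < len(tags):
        node, best = tree, 0
        for j in range(i, len(tags)):
            node = node.get(tags[j])
            if node is None:
                break
            if END in node:
                best = j + 1 - i
        if best:
            spans.append((i, i + best))
            i += best
        else:
            i += 1
    return spans

# Toy training corpus (hypothetical data).
corpus = [
    [("DT", "B"), ("JJ", "I"), ("NN", "I"), ("VBZ", "O"), ("DT", "B"), ("NN", "I")],
    [("NNP", "B"), ("VBD", "O"), ("DT", "B"), ("NN", "I"), ("IN", "O"), ("NNS", "B")],
]
rules = extract_rules(corpus)
tree = build_matching_tree(rules)
spans = longest_match(tree, ["DT", "JJ", "NN", "VBZ", "NNP", "IN", "DT", "NN"])
print(spans)
```

Storing the rules in a trie lets the matcher extend a candidate one tag at a time and remember the last position where a complete rule ended, which is what makes the longest match cheap to find. Purely greedy matching of this kind cannot tell when a matching POS sequence is not actually a BaseNP in its context, which is the gap the abstract assigns to the decision-tree step.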
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.