
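Identification of Chinese Nominalization Compounds
Cite this article: Chen Changxiong, Zhao Jinglei. Identification of Chinese Nominalization Compounds [J]. Computer Applications and Software, 2008, 25(9)
Authors: Chen Changxiong, Zhao Jinglei
Affiliation: Department of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai 200240
Abstract: Identifying nominalization compounds is one of the hard problems in Chinese compound recognition. The difficulty is that when a Chinese verb and a noun co-occur, they can form either a verb phrase or a nominalization compound. Traditional Chinese compound recognition often uses only corpus statistics as features, and the results are often unsatisfactory. Based on a maximum entropy model, this work adds lexical features and Web features to the baseline contextual features in order to classify the nominalization candidates that arise when a verb and a noun co-occur, and obtains good experimental results: precision reaches 86.31% and recall reaches 70.00%.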

Keywords: maximum entropy model; nominal compounds; compound ability

THE IDENTIFICATION OF CHINESE NOMINALIZATION COMPOUNDS
Chen Changxiong, Zhao Jinglei. THE IDENTIFICATION OF CHINESE NOMINALIZATION COMPOUNDS [J]. Computer Applications and Software, 2008, 25(9)
Authors: Chen Changxiong, Zhao Jinglei
Affiliation: Department of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai 200240, China
Abstract: Identifying nominalization compounds is a difficult problem in Chinese compound recognition. When a verb and a noun co-occur, it is ambiguous whether the expression is a verb phrase or a compound. Traditional identification of nominalization compounds usually relies only on corpus features, and the results are not very good. This paper uses a Maximum Entropy model to identify nominalization compounds. Besides the baseline contextual features, the model also adopts lexical and ...
Keywords: Maximum entropy model; Nominal compounds (NC); Compound ability (CA); Thesaurus; Web features; Pointwise mutual information based on information retrieval (PMI-IR)
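
The PMI-IR keyword refers to estimating pointwise mutual information from search-engine hit counts. The abstract does not give the formulation used in the paper, so the following is only a minimal sketch of the general idea, assuming the standard PMI definition. The function name `pmi_ir`, its parameters, and all hit counts are hypothetical and chosen for illustration.

```python
import math


def pmi_ir(hits_joint, hits_verb, hits_noun, total_pages):
    """Estimate PMI (in bits) of a verb-noun pair from page hit counts.

    Assumes each probability is approximated as hits / total_pages, so
    PMI = log2(p(verb, noun) / (p(verb) * p(noun))).
    """
    if min(hits_joint, hits_verb, hits_noun) <= 0:
        # No co-occurrence evidence: treat the association as minimal.
        return float("-inf")
    # Algebraically the same ratio, written to avoid tiny intermediate floats.
    return math.log2(hits_joint * total_pages / (hits_verb * hits_noun))


# Invented counts, not taken from the paper.
score = pmi_ir(hits_joint=4_000, hits_verb=20_000, hits_noun=50_000,
               total_pages=1_000_000)
print(score)  # 2.0 for these invented counts
```

A higher score means the verb and noun appear together more often than chance would predict. In a setup like the one the abstract describes, such a score could serve as one Web feature alongside the contextual and lexical features of the classifier.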
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.