THE DESIGN OF A STATISTICAL ALGORITHM FOR RESOLVING STRUCTURAL AMBIGUITY IN "V NP1 usde NP0"
Authors: Wenjie Li and Kam-Fai Wong
Affiliation: Department of Systems Engineering and Engineering Management, Chinese University of Hong Kong
Abstract: Structural ambiguity in modifying clauses complicates noun phrase (NP) extraction from running Chinese text. Previous experiments show that nearly 33% of the errors made by an NP extractor were caused by clause modifiers. For example, consider the sequence "V + NP1 + 的 (of) + NP0." It can be interpreted in two ways: as a verb phrase (i.e., [V [NP1 + 的 + NP0]NP]VP) or as a noun phrase (i.e., [[V NP1]VP + 的 + NP0]NP). To resolve this ambiguity, syntactic, contextual, and semantics-based approaches are investigated in this article. The conclusion is that the problem can be overcome only when semantic knowledge about words is used. Therefore, a structural disambiguation algorithm based on lexical association is proposed. The algorithm uses the semantic class relation between a word pair, derived from a standard Chinese thesaurus, to work out whether the noun phrase or the verb phrase reading has the stronger lexical association within the collocation. This, in turn, determines the intended phrase structure. With the proposed algorithm, the best accuracy and coverage are 79% and 100%, respectively. The experiment also shows that the backed-off model is more effective for this purpose. With this disambiguation algorithm, parsing performance can be significantly improved.
Keywords: Chinese language processing; noun phrase extraction; structural disambiguation
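The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a backed-off lexical association decision of this general kind might look, not the authors' implementation. It assumes that the noun phrase reading is supported by the association between V and the head of NP1, and the verb phrase reading by the association between V and the head of NP0. The thesaurus table `SEM_CLASS`, the class labels, the English glosses, the counts, and the function `disambiguate` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy thesaurus mapping words to semantic class labels.
# These entries are illustrative only and are not taken from the article.
SEM_CLASS = {
    "eat": "ACT_CONSUME",
    "buy": "ACT_TRADE",
    "apple": "FOOD",
    "pear": "FOOD",
    "child": "PERSON",
    "farmer": "PERSON",
}


def disambiguate(verb, n1, n0, word_counts, class_counts, default="NP"):
    """Choose a reading for "V NP1 de NP0" from verb-object association counts.

    Assumption: evidence for (verb, n1) supports the NP reading
    [[V NP1]VP de NP0]NP, and evidence for (verb, n0) supports the
    VP reading [V [NP1 de NP0]NP]VP.
    Returns a pair (reading, level), where level names the evidence used.
    """
    # Level 1: compare counts of the exact word pairs.
    a1 = word_counts[(verb, n1)]
    a0 = word_counts[(verb, n0)]
    if a1 != a0:
        return ("NP" if a1 > a0 else "VP"), "word"

    # Level 2: back off to counts of semantic class pairs.
    vc, c1, c0 = SEM_CLASS.get(verb), SEM_CLASS.get(n1), SEM_CLASS.get(n0)
    b1 = class_counts[(vc, c1)]
    b0 = class_counts[(vc, c0)]
    if b1 != b0:
        return ("NP" if b1 > b0 else "VP"), "class"

    # Level 3: no usable evidence, so fall back to a fixed default.
    return default, "default"


# Hypothetical counts, as if collected from a bracketed training corpus.
word_counts = Counter({("eat", "apple"): 5})
class_counts = Counter({
    ("ACT_TRADE", "FOOD"): 7,
    ("ACT_TRADE", "PERSON"): 1,
})

# "eat apple de child": word-level evidence favours the NP reading.
print(disambiguate("eat", "apple", "child", word_counts, class_counts))
# "buy farmer de pear": only class-level evidence exists; it favours VP.
print(disambiguate("buy", "farmer", "pear", word_counts, class_counts))
```

Because the final level always returns a default, a scheme of this shape produces a decision for every input; whether that default, or these particular back-off levels, matches the model evaluated in the article is not stated in the abstract.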