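Design and Implementation of a Bilingual Collocation Extraction System Based on a Hybrid Strategy
Cite this article: Xu Dongying, Zhang Tong. Design and Implementation of a Bilingual Collocation Extraction System Based on a Hybrid Strategy[J]. Computer Engineering and Applications, 2004, 40(25): 173-175, 178.
Authors: Xu Dongying, Zhang Tong
Affiliation: China Center for Information Industry Development, Beijing 100044
Abstract: This paper presents a hybrid-strategy method for extracting collocations from a Chinese-English bilingual corpus. Mutual information is used to extract the initial candidate collocations, and the t-test value is used to assess their reliability: candidate collocation strings with t-score < 1.65 are filtered out. The remaining candidates are then screened by part-of-speech tagging and shallow parsing. Experiments demonstrate the effectiveness of the method. The paper also discusses ways of applying the extracted collocations to the construction of bilingual dictionaries and machine translation systems.

Keywords: collocation extraction  hybrid strategy  mutual information  t-test  statistical methods  rule-based methods
Article ID: 1002-8331-(2004)25-0173-03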

Research and Implementation of Bilingual Collocation Extraction System Based on Hybrid Strategy
Xu Dongying, Zhang Tong. Research and Implementation of Bilingual Collocation Extraction System Based on Hybrid Strategy[J]. Computer Engineering and Applications, 2004, 40(25): 173-175, 178.
Authors: Xu Dongying, Zhang Tong
Abstract: This paper introduces a new approach to extracting collocations from Chinese-English parallel corpora using a hybrid strategy. We extract initial collocation candidates using mutual information, then calculate the t-score of each candidate and eliminate those below the threshold of 1.65. The remaining candidates are further filtered through POS patterns and chunking. Experiments confirm the validity of this method. Finally, we show how the results can be applied to bilingual dictionary construction and machine translation systems.
Keywords: collocation extraction  hybrid strategy  mutual information  t-score  statistical methods  rule-based methods
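
The abstract names the two statistics but does not give their formulas, so the following is only a minimal sketch under the usual textbook definitions: mutual information as log2(P(x,y) / (P(x)P(y))) and the t-score approximated as (observed - expected) / sqrt(observed). The function names `score_bigrams` and `filter_candidates` and the `min_mi` parameter are hypothetical; only the 1.65 t-score cutoff comes from the abstract. The sketch treats adjacent word pairs in a single token list as candidates and does not cover the bilingual alignment or the POS-pattern and chunking filters.

```python
import math
from collections import Counter


def score_bigrams(tokens):
    """Return {(w1, w2): (mutual_information, t_score)} for adjacent word pairs.

    Assumption: probabilities are estimated as count / len(tokens) for both
    single words and pairs, a common simplification.
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), observed in bigrams.items():
        # Expected pair count if w1 and w2 occurred independently.
        expected = unigrams[w1] * unigrams[w2] / n
        mi = math.log2(observed / expected)
        # t-score with the sample variance approximated by the observed count.
        t = (observed - expected) / math.sqrt(observed)
        scores[(w1, w2)] = (mi, t)
    return scores


def filter_candidates(scores, min_mi=0.0, min_t=1.65):
    """Keep pairs whose MI and t-score both reach their thresholds.

    min_t=1.65 is the cutoff stated in the abstract; min_mi is a
    hypothetical placeholder, since the abstract gives no MI threshold.
    """
    return {
        pair: (mi, t)
        for pair, (mi, t) in scores.items()
        if mi >= min_mi and t >= min_t
    }
```

A call such as `filter_candidates(score_bigrams(tokens))` then yields the statistically supported candidates, which in the described pipeline would be passed on to the rule-based filtering stage.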
This article is indexed in CNKI, Wanfang Data, and other databases.