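A Combinational Ambiguity Resolution Algorithm for Chinese Word Segmentation
Cite this article: Yuan Dingrong, Li Xinyou, Shao Yanzhen. A Combinational Ambiguity Resolution Algorithm for Chinese Word Segmentation [J]. Computer Applications and Software, 2011, 28(6)
Authors: Yuan Dingrong, Li Xinyou, Shao Yanzhen
Affiliations: 1. The International WIC Institute, Beijing University of Technology, Beijing 100022; College of Computer Science and Information Engineering, Guangxi Normal University, Guilin, Guangxi 541004
2. College of Computer Science and Information Engineering, Guangxi Normal University, Guilin, Guangxi 541004
Funding: National Natural Science Foundation of China, Major Research Plan Cultivation Project (90718020); Australian ARC project (DP0667060)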
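Abstract: Segmentation ambiguity is the bottleneck of automatic word segmentation. It falls into two types: overlapping (intersection-type) ambiguity and combinational ambiguity. Taking the sentence that contains a combinationally ambiguous string as the object of study, we examine the support, across the full text, of the words obtained by pairing each candidate segmentation of the ambiguous string with its preceding and following context. From these supports we construct a support measure factor for the merged and the split segmentation, and resolve the combinational ambiguity according to this factor. Worked examples and experiments show that the method is feasible and outperforms existing techniques.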

Keywords: Chinese information processing; combinational ambiguity; co-occurrence support; ambiguity resolution; support factor

COMBINATORIAL WORD SENSES DISAMBIGUATION ALGORITHM FOR CHINESE WORD SEGMENTATION
Yuan Dingrong, Li Xinyou, Shao Yanzhen. COMBINATORIAL WORD SENSES DISAMBIGUATION ALGORITHM FOR CHINESE WORD SEGMENTATION [J]. Computer Applications and Software, 2011, 28(6)
Authors: Yuan Dingrong (1, 2), Li Xinyou (2), Shao Yanzhen (2)
Affiliation: 1. The International WIC Institute, Beijing University of Technology, Beijing 100022, China; 2. College of Computer Science and Information Technology, Guangxi Normal University, Guilin 541004, Guangxi, China
Abstract: The bottleneck of automatic word segmentation is segmentation ambiguity, which can be divided into crossing ambiguity and combinational ambiguity. In this paper, we took sentences containing a combinationally ambiguous string as our research object, examined the support degree in the text of the words formed by each segmentation result of the ambiguous string together with its co-occurring neighbouring words, constructed the metric factor of s...
Keywords: Chinese text information processing; Combinational ambiguity; Co-occurrence support degree; Disambiguation of word senses; Support factor
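
The abstract describes the method only in outline, so the following Python sketch is an illustration of the general idea, not the authors' algorithm. It assumes a preliminary segmentation of the full text is available as lists of words, takes "support" to mean the number of adjacent word pairs observed in that text, and defines a hypothetical factor as the merged support divided by the total support. The names `bigram_support`, `merge_factor`, and `resolve`, the 0.5 threshold, and the toy corpus are all assumptions made for this example.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def bigram_support(sentences: Iterable[Sequence[str]]) -> Counter:
    """Count adjacent word pairs over a pre-segmented text (assumed input)."""
    counts: Counter = Counter()
    for words in sentences:
        counts.update(zip(words, words[1:]))
    return counts


def merge_factor(counts: Counter, left: str, first: str, second: str,
                 right: str) -> float:
    """Hypothetical support factor in [0, 1]; higher values favour merging.

    The merged reading is supported by pairs (left, first+second) and
    (first+second, right); the split reading by (left, first) and
    (second, right). This formula is an assumption, not taken from the paper.
    """
    merged = first + second
    merged_support = counts[(left, merged)] + counts[(merged, right)]
    split_support = counts[(left, first)] + counts[(second, right)]
    total = merged_support + split_support
    # With no evidence either way, return the neutral value.
    return 0.5 if total == 0 else merged_support / total


def resolve(counts: Counter, left: str, first: str, second: str,
            right: str, threshold: float = 0.5) -> List[str]:
    """Return the merged word or the two split words for the ambiguous string."""
    if merge_factor(counts, left, first, second, right) >= threshold:
        return [first + second]
    return [first, second]


# Toy full text: "才能" as one word (talent) versus "才" + "能" (only then can).
corpus = [
    ["他", "的", "才能", "很", "高"],
    ["她", "的", "才能", "很", "突出"],
    ["只有", "努力", "才", "能", "成功"],
]
counts = bigram_support(corpus)

merged_case = resolve(counts, "的", "才", "能", "很")
split_case = resolve(counts, "努力", "才", "能", "成功")
```

In this toy text the context "的 ... 很" only ever co-occurs with the merged word, while "努力 ... 成功" only co-occurs with the split words, so the two calls choose different segmentations. A real system would need a principled way to obtain the preliminary segmentation and to set the threshold, neither of which the abstract specifies.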
This article is indexed in CNKI, Wanfang Data, and other databases.