
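A Tibetan Collation Method Conforming to ISO 14651 Semantics (一种符合ISO14651语义的藏文排序实现方法)
Cite this article: LIN He-shui, CHENG Wei, CAO Hui, LI Wen-bo, WU Jian, SUN Yu-fang. A Tibetan Collation Method Conforming to ISO 14651 Semantics [J]. Journal of Chinese Information Processing (中文信息学报), 2004, 18(5): 37-42.
Authors: LIN He-shui (林河水), CHENG Wei (程伟), CAO Hui (曹晖), LI Wen-bo (李文波), WU Jian (吴健), SUN Yu-fang (孙玉芳)
Affiliations: Open System and Chinese Information Processing Center, Institute of Software, Chinese Academy of Sciences; Graduate School of the Chinese Academy of Sciences
Funding: National High Technology Research and Development Program of China (863 Program); Knowledge Innovation Program of the Chinese Academy of Sciences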
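Abstract: This paper presents a method for sorting Tibetan text in dictionary order, designed for the Tibetan "large stack character set" (大字丁字符集) encoding scheme. By introducing the notion of a base character with (or without) a pre-consonant, the method preprocesses each Tibetan syllable to be sorted into a string consisting of the base character with (or without) a pre-consonant, the pre-consonant character, the base unit (a base character or a stack), the post-consonant character, and the post-post-consonant character. The resulting strings are then compared, so stacks never have to be decomposed. The method conforms to the semantics of ISO/IEC 14651.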

Keywords: computer applications; Chinese information processing; Tibetan; dictionary order; machine ordering
Article ID: 1003-0077(2004)05-0036-06
Revised: April 15, 2004

A Method for Ordering Tibetan Text in Accord with ISO 14651
LIN He-shui, CHENG Wei, CAO Hui, LI Wen-bo, WU Jian, SUN Yu-fang. A Method for Ordering Tibetan Text in Accord with ISO 14651 [J]. Journal of Chinese Information Processing, 2004, 18(5): 37-42.
Authors: LIN He-shui, CHENG Wei, CAO Hui, LI Wen-bo, WU Jian, SUN Yu-fang
Affiliation: Open System and Chinese Information Processing Center, Institute of Software, Chinese Academy of Sciences; Graduate School of the Chinese Academy of Sciences
Abstract: This paper discusses the machine ordering of Tibetan words on the basis of linear characters, meaning that any pre-composed form or vertical stack is processed as a single Tibetan character. The method divides Tibetan words into two types: those with a pre-consonant character and those without. By defining base characters without pre-consonants and base characters with pre-consonants, it converts each Tibetan word into a string made up of a base character without (or with) a pre-consonant, the pre-consonant character, the base character, the post-consonant character, and the post-post-consonant character. These units are then compared by weight to obtain the ordering. The method conforms to the semantics of ISO/IEC 14651.
Keywords: computer application; Chinese information processing; Tibetan; dictionary order; machine ordering
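
The preprocessing described in the abstract can be illustrated with a small Python sketch. It is only one reading of the abstract, not the authors' implementation: the `Syllable` type, the `sort_key` function, and every weight table are hypothetical, the syllables are written in Wylie transliteration instead of the paper's encoding, vowels are ignored, and the input is assumed to be already segmented into its components. Each stack is looked up as a single unit, so it is never decomposed during comparison.

```python
from dataclasses import dataclass

# All tables below are hypothetical illustrations, not the paper's weights.
ROOT_WEIGHT = {"k": 1, "g": 3}  # alphabet position of the root consonant

# Base unit (plain base character or whole stack) -> (root consonant, unit weight).
# A stack such as "gr" is one table entry, so it is never split apart.
UNIT_TABLE = {
    "k": ("k", 1),
    "g": ("g", 1),
    "gr": ("g", 2),
}

PREFIX_WEIGHT = {"": 0, "g": 1, "d": 2, "b": 3, "m": 4, "'": 5}
SUFFIX_WEIGHT = {"": 0, "g": 1, "ng": 2, "d": 3, "n": 4, "b": 5,
                 "m": 6, "'": 7, "r": 8, "l": 9, "s": 10}
POST_SUFFIX_WEIGHT = {"": 0, "d": 1, "s": 2}


@dataclass(frozen=True)
class Syllable:
    """A pre-segmented syllable; empty strings mean the slot is unused."""
    base_unit: str
    prefix: str = ""
    suffix: str = ""
    post_suffix: str = ""


def sort_key(s: Syllable) -> tuple:
    """Build a fixed-shape key that is compared element by element."""
    root, unit_weight = UNIT_TABLE[s.base_unit]
    return (
        ROOT_WEIGHT[root],                  # base character ...
        1 if s.prefix else 0,               # ... without (0) or with (1) a pre-consonant
        PREFIX_WEIGHT[s.prefix],            # pre-consonant character
        unit_weight,                        # base unit: base character or stack
        SUFFIX_WEIGHT[s.suffix],            # post-consonant character
        POST_SUFFIX_WEIGHT[s.post_suffix],  # post-post-consonant character
    )


words = {
    "dga'": Syllable("g", prefix="d", suffix="'"),
    "gra": Syllable("gr"),
    "bka'": Syllable("k", prefix="b", suffix="'"),
    "gang": Syllable("g", suffix="ng"),
    "ka": Syllable("k"),
    "ga": Syllable("g"),
}

ordered = sorted(words, key=lambda name: sort_key(words[name]))
print(ordered)
```

Because Python compares tuples lexicographically, the key orders syllables first by root consonant, then by whether a pre-consonant is present, and only then by the remaining components. Whether the paper assigns its weights in exactly this priority is an assumption of the sketch.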
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.