A Method for Ordering Tibetan Text in Accord with ISO 14651

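Cite this article:	LIN He-shui, CHENG Wei, CAO Hui, LI Wen-bo, WU Jian, SUN Yu-fang. A Method for Ordering Tibetan Text in Accord with ISO 14651 [J]. Journal of Chinese Information Processing (中文信息学报), 2004, 18(5): 37-42.

Authors:	LIN He-shui (林河水), CHENG Wei (程伟), CAO Hui (曹晖), LI Wen-bo (李文波), WU Jian (吴健), SUN Yu-fang (孙玉芳)

Affiliation:	Open System and Chinese Information Processing Center, Institute of Software, Chinese Academy of Sciences; Graduate School of the Chinese Academy of Sciences

Funding:	National High Technology Research and Development Program of China (863 Program); Knowledge Innovation Program of the Chinese Academy of Sciences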

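Abstract:	This paper presents a method for sorting Tibetan in dictionary order, targeting the Tibetan "large stack character set" (大字丁字符集) encoding scheme. By introducing the concept of base characters with and without a pre-consonant, it preprocesses each Tibetan syllable to be sorted into a string made up of a base character with (or without) pre-consonant, the pre-consonant character, the base (a base character or a stack), the post-consonant character, and the post-post-consonant character, and only then compares the strings. This avoids having to decompose stacks. The implementation conforms to the semantics of ISO/IEC 14651.
Keywords:	computer application; Chinese information processing; Tibetan; dictionary order; machine ordering
Article ID:	1003-0077(2004)05-0036-06
Revised:	April 15, 2004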
A Method for Ordering Tibetan Text in Accord with ISO 14651

LIN He-shui, CHENG Wei, CAO Hui, LI Wen-bo, WU Jian, SUN Yu-fang. A Method for Ordering Tibetan Text in Accord with ISO 14651 [J]. Journal of Chinese Information Processing, 2004, 18(5): 37-42.

Authors:	LIN He-shui, CHENG Wei, CAO Hui, LI Wen-bo, WU Jian, SUN Yu-fang

Affiliation:	Open System and Chinese Information Processing Center, Institute of Software, Chinese Academy of Sciences; Graduate School of the Chinese Academy of Sciences

Abstract:	This paper discusses the machine ordering of Tibetan words on the basis of linear characters, meaning that any precomposed form or vertical stack is processed as a single Tibetan character. Our method divides Tibetan words into two types: those with a pre-consonant character and those without. By defining base characters without pre-consonants and base characters with pre-consonants, we convert each Tibetan word into a string of units: a base character without (or with) pre-consonant, the pre-consonant character, the base character, the post-consonant character, and the post-post-consonant character. We then compare these units by their weights to obtain the ordering. The method conforms to the semantics of ISO/IEC 14651.
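
The preprocessing described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: syllables are assumed to be already segmented and are written in Wylie transliteration, and the `Syllable` type, the `sort_key` function, and every weight table are hypothetical placeholders. The first two key fields together stand in for the "base character with or without pre-consonant" unit, and a stack such as `kya` is looked up as one unit and never split.

```python
from dataclasses import dataclass

# All weights below are illustrative assumptions, not values from the
# paper or from any ISO/IEC 14651 table.
ROOT_WEIGHT = {"ka": 1, "kha": 2, "ga": 3, "nga": 4}

# A base unit is a plain base character or a precomposed stack.
# It maps to (root letter, weight of the unit within that root).
BASE_UNIT = {
    "ka": ("ka", 0), "kya": ("ka", 1),
    "kha": ("kha", 0),
    "ga": ("ga", 0), "gya": ("ga", 1),
    "nga": ("nga", 0),
}

PREFIX_WEIGHT = {"": 0, "ga": 1, "da": 2, "ba": 3, "ma": 4, "'a": 5}
SUFFIX_WEIGHT = {"": 0, "ga": 1, "nga": 2, "da": 3, "na": 4, "ba": 5,
                 "ma": 6, "'a": 7, "ra": 8, "la": 9, "sa": 10}
POST_SUFFIX_WEIGHT = {"": 0, "sa": 1, "da": 2}


@dataclass(frozen=True)
class Syllable:
    """A pre-segmented syllable; an empty string means 'absent'."""
    base: str              # base character or whole stack
    prefix: str = ""       # pre-consonant
    suffix: str = ""       # post-consonant
    post_suffix: str = ""  # post-post-consonant


def sort_key(s: Syllable) -> tuple:
    """Build the comparison key unit by unit, without splitting stacks."""
    root, unit_weight = BASE_UNIT[s.base]
    return (
        ROOT_WEIGHT[root],           # root letter of the base ...
        1 if s.prefix else 0,        # ... tagged as without/with pre-consonant
        PREFIX_WEIGHT[s.prefix],     # pre-consonant character
        unit_weight,                 # base character or stack, as one unit
        SUFFIX_WEIGHT[s.suffix],     # post-consonant character
        POST_SUFFIX_WEIGHT[s.post_suffix],  # post-post-consonant character
    )


words = [
    Syllable("ga", prefix="da"),                    # dga
    Syllable("ka", suffix="ga"),                    # kag
    Syllable("kya"),                                # kya
    Syllable("ka", prefix="ba"),                    # bka
    Syllable("ka"),                                 # ka
    Syllable("ga", suffix="nga"),                   # gang
    Syllable("ka", prefix="da"),                    # dka
    Syllable("ga", suffix="ga", post_suffix="sa"),  # gags
]

for s in sorted(words, key=sort_key):
    print(s)
```

Because every unit is compared by a table weight, a precomposed stack only needs its own table entry. In a real system the weights would presumably come from a tailored ISO/IEC 14651 ordering table rather than these placeholder numbers.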

Keywords:	computer application; Chinese information processing; Tibetan; dictionary order; machine ordering
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.