
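农业古籍断句标点模式研究 (Sentence Segmentation and Punctuation Patterns for Ancient Agricultural Books)
Cite this article: 黄建年, 侯汉清. 农业古籍断句标点模式研究[J]. 中文信息学报, 2008, 22(4): 31-38.
Authors: 黄建年 (HUANG Jian-nian)  侯汉清 (HOU Han-qing)
Affiliation: 1. College of Humanities and Social Sciences, Nanjing Agricultural University, Nanjing, Jiangsu 210095; 2. Library, Nanjing University of Finance & Economics, Nanjing, Jiangsu 210046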
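Abstract: The collation of ancient agricultural books has attracted the attention of many scholars and specialists, yet automatic sentence segmentation and punctuation patterns for these texts remain unstudied. This study explores and summarizes a set of recognition patterns for segmenting and punctuating ancient agricultural books. Initial segmentation uses syntactic feature words and synonym marker words. Clauses are then further segmented and punctuated using antonym compounds, citation markers, temporal sequence, numeral-classifier expressions, reduplicated characters and words, verb-noun structures, and comparative constructions. Finally, agricultural terms and a forbidden-pattern list are applied to further improve the readability and accuracy of the segmented and punctuated text. In testing, the average precision of segmentation and of punctuation reached 48% and 35%, respectively, which indicates that the method is sound and feasible to a certain extent.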

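Keywords: computer application  Chinese information processing  ancient agricultural books  ancient agricultural treatises  collation of ancient books  sentence segmentation  punctuation  pattern matching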
  

On Sentence Segmentation and Punctuation Model for Ancient Books on Agriculture
HUANG Jian-nian, HOU Han-qing. On Sentence Segmentation and Punctuation Model for Ancient Books on Agriculture[J]. Journal of Chinese Information Processing, 2008, 22(4): 31-38.
Authors: HUANG Jian-nian  HOU Han-qing
Affiliation: 1. Nanjing Agricultural University, Nanjing, Jiangsu 210095, China;
2. Nanjing University of Finance & Economics, Nanjing, Jiangsu 210046, China
Abstract: The collation of ancient books on agriculture has attracted the attention of the research community, but automatic sentence segmentation and punctuation models for these books have received little study. This article examines the issue and summarizes a set of patterns for sentence segmentation and punctuation in ancient books on agriculture. Sentences are first segmented using syntactic feature words (such as function words, conjunctions, and modal words) and synonym indicator words. Antonym compounds, citation markers for quoted books, time sequences, quantifiers, reduplicated words, and verb-noun structures are then used for further segmentation and for inserting punctuation. Comparative sentences provide an auxiliary means of identifying complex sentences and punctuating clauses. Finally, agricultural terms and a list of forbidden patterns are applied to improve the readability of the punctuated text. In experiments, the average precision of sentence segmentation and of punctuation reached 48% and 35%, respectively, which shows the feasibility and potential of the proposed method.
Keywords: computer application  Chinese information processing  ancient books on agriculture  agricultural treatises of ancient China  collation of ancient books  sentence segmentation  punctuation  pattern matching
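
The pipeline outlined in the abstract (marker-word segmentation, followed by a filter built from agricultural terms and forbidden patterns) might look roughly like the minimal Python sketch below. It is not the authors' implementation. The marker lists, the protected term, the sample string, and every function name are hypothetical and chosen only for illustration; the abstract does not give the paper's actual word lists. The intermediate stages (antonym compounds, citation markers, quantifiers, and so on) are omitted.

```python
# Hypothetical marker lists for illustration only; the paper's real lists
# are not given in the abstract.
FINAL_PARTICLES = "也矣焉哉"   # characters assumed to close a clause
INITIAL_WORDS = "夫凡故若"     # characters assumed to open a clause
PROTECTED_TERMS = ("杜若",)    # assumed agricultural term that must not be split


def candidate_breaks(text):
    """Return indices i where a break is proposed immediately before text[i]."""
    breaks = set()
    for i, ch in enumerate(text):
        if ch in FINAL_PARTICLES and i + 1 < len(text):
            breaks.add(i + 1)  # break after a closing particle
        if ch in INITIAL_WORDS and i > 0:
            breaks.add(i)      # break before an opening word
    return breaks


def protected_positions(text, terms):
    """Return break indices that would fall inside an occurrence of a term."""
    inside = set()
    for term in terms:
        start = text.find(term)
        while start != -1:
            inside.update(range(start + 1, start + len(term)))
            start = text.find(term, start + 1)
    return inside


def segment(text, terms=PROTECTED_TERMS):
    """Split text at marker-based breaks, skipping breaks inside protected terms."""
    breaks = candidate_breaks(text) - protected_positions(text, terms)
    pieces, prev = [], 0
    for i in sorted(breaks):
        pieces.append(text[prev:i])
        prev = i
    pieces.append(text[prev:])
    return pieces


# Invented sample string, not a quotation from any agricultural text.
sample = "种杜若宜湿地也故春种之"
print(segment(sample))            # ['种杜若宜湿地也', '故春种之']
print(segment(sample, terms=()))  # ['种杜', '若宜湿地也', '故春种之']
```

In the second call the term filter is disabled, so the marker character 若 wrongly splits the plant name 杜若. This is the kind of error that a list of agricultural terms is meant to suppress.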
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.