
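A Method for Automatically Extracting Link Information Between Code and Chinese Documentation
Cite this article: Chen Hua, Qian Jianfei, Yu Ruizhao. A Method for Automatically Extracting Link Information Between Code and Chinese Documentation[J]. Computer Applications and Software, 2005, 22(9): 48-49, 110.
Authors: Chen Hua, Qian Jianfei, Yu Ruizhao
Affiliation: College of Computer Science and Engineering, Zhejiang University, Hangzhou, Zhejiang 310027 (all three authors)
Abstract: Maintaining the links between code and its corresponding documentation is important for software engineering activities such as software maintenance, program comprehension, and requirements tracing. The key to maintaining these links is extracting the link information. This paper proposes a method that uses information retrieval techniques to automatically extract link information between program source code and Chinese documentation. First, the vocabulary of the documents is extracted to build a probabilistic language model of the documents. On that basis, queries composed of information taken from the code are run against the document set, which yields a ranked list of relevant documents for each piece of code and a relevance matrix between code and documents. Test results show that more than 95% of the links are recovered once the number of retrieved items exceeds 5.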

Keywords: software maintenance; information retrieval; program comprehension
Received: 2004-06-04
Revised: 2004-06-04
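
The pipeline outlined in the abstract (build a probabilistic language model from the documents, query it with terms taken from the code, rank the documents, derive a relevance matrix) can be illustrated with a minimal Python sketch. Everything specific here is an assumption and not taken from the paper: the unigram model, the Jelinek-Mercer smoothing weight `lam`, the function names, and the toy data. The sketch also assumes the Chinese documents have already been segmented into words.

```python
import math
from collections import Counter

def build_models(docs):
    """docs maps a document name to its list of (pre-segmented) tokens."""
    doc_counts = {name: Counter(tokens) for name, tokens in docs.items()}
    collection = Counter()
    for counts in doc_counts.values():
        collection.update(counts)
    return doc_counts, collection

def log_likelihood(query, counts, collection, lam=0.5):
    """Log-probability of the query under a smoothed unigram document model.

    Smoothing mixes the document model with the collection model
    (Jelinek-Mercer); this choice is an assumption, not the paper's.
    """
    doc_len = sum(counts.values())
    coll_len = sum(collection.values())
    score = 0.0
    for term in query:
        p_doc = counts[term] / doc_len if doc_len else 0.0
        p_coll = collection[term] / coll_len if coll_len else 0.0
        p = lam * p_doc + (1 - lam) * p_coll
        if p == 0.0:
            continue  # term absent from every document: cannot affect the ranking
        score += math.log(p)
    return score

def rank_documents(query, doc_counts, collection):
    """Return document names ordered from most to least likely."""
    return sorted(
        doc_counts,
        key=lambda name: log_likelihood(query, doc_counts[name], collection),
        reverse=True,
    )

def relevance_matrix(code_queries, docs, top_k):
    """Mark, for each code file, its top_k ranked documents with 1, others with 0."""
    doc_counts, collection = build_models(docs)
    matrix = {}
    for code_name, query in code_queries.items():
        top = set(rank_documents(query, doc_counts, collection)[:top_k])
        matrix[code_name] = {name: int(name in top) for name in docs}
    return matrix

# Hypothetical toy data: two segmented documents and two code-derived queries.
docs = {
    "login_spec": ["用户", "登录", "密码", "验证", "login", "password"],
    "report_spec": ["报表", "生成", "导出", "report", "export"],
}
code_queries = {
    "LoginService.java": ["login", "password", "用户"],
    "ReportExporter.java": ["report", "export"],
}
matrix = relevance_matrix(code_queries, docs, top_k=1)
```

With this toy data, each code file is linked to the document that shares its vocabulary. How the paper actually builds queries from code (identifiers, comments, or both) is not stated in the abstract, so the query lists above are placeholders.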

A METHOD OF TRACING LINKS BETWEEN CODE AND CHINESE DOCUMENTATION
Chen Hua,Qian Jianfei,Yu Ruizhao.A METHOD OF TRACING LINKS BETWEEN CODE AND CHINESE DOCUMENTATION[J].Computer Applications and Software,2005,22(9):48-49,110.
Authors:Chen Hua  Qian Jianfei  Yu Ruizhao
Abstract: Tracing and maintaining links between free-text documents in Chinese and their source code plays an important role in software engineering. A new method based on Information Retrieval (IR) that does this work automatically is proposed. First, a stochastic language model is built that assigns a probability to every query string of words taken from the documents. Then, for each source code file, a list of documents ranked by probability of relevance is generated. From these lists, a relevance matrix linking each source code file to the documents can be obtained. Experiments show that more than 95 percent of the links can be traced when only the top 5 documents of the ranked list are taken.
Keywords: Software maintenance; Information retrieval; Program understanding
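
The reported figure (more than 95 percent of links traced from the top 5 documents) reads like a recall measurement at a rank cutoff. A minimal sketch of such a measurement follows, assuming that "links traced" means the fraction of known code-to-document links that appear within the top k ranked documents. The abstract does not define the metric, and the function name and sample data are hypothetical.

```python
def recall_at_k(ranked, true_links, k):
    """Fraction of known links found within the top k documents per code file.

    ranked:     maps a code file to its ordered list of candidate documents
    true_links: maps a code file to the set of documents it is really linked to
    """
    total = sum(len(relevant) for relevant in true_links.values())
    if total == 0:
        return 0.0
    found = 0
    for code_name, relevant in true_links.items():
        top = set(ranked.get(code_name, [])[:k])
        found += len(relevant & top)
    return found / total

# Hypothetical ranked lists and ground-truth links, for illustration only.
ranked = {
    "a.c": ["d1", "d2", "d3"],
    "b.c": ["d3", "d1", "d2"],
}
true_links = {
    "a.c": {"d1", "d3"},
    "b.c": {"d2"},
}
recall_top1 = recall_at_k(ranked, true_links, 1)
recall_top3 = recall_at_k(ranked, true_links, 3)
```

Raising the cutoff k can only keep recall the same or increase it, which matches the abstract's framing that a small cutoff already recovers most links.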
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.