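Mining and Organizing Software Functional Features Based on StackOverflow Data
Cite this article:ZHU Zi-Xiao, ZOU Yan-Zhen, HUA Chen-Yan, SHEN Qi, ZHAO Jun-Feng. Mining and Organizing Software Functional Features Based on StackOverflow Data[J]. Journal of Software, 2018, 29(8): 2210-2225.
Authors:ZHU Zi-Xiao  ZOU Yan-Zhen  HUA Chen-Yan  SHEN Qi  ZHAO Jun-Feng
Affiliations:(1) Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, Beijing 100871; (2) School of Electronics Engineering and Computer Science, Peking University, Beijing 100871; (3) Beida (Binhai) Information Research, Tianjin 300450. ZHU Zi-Xiao: 1, 2; ZOU Yan-Zhen: 1, 2, 3; HUA Chen-Yan: 1, 2; SHEN Qi: 1, 2; ZHAO Jun-Feng: 1, 2, 3
Funding:National Key Research and Development Program of China (2016YFB1000801); National Science Fund for Distinguished Young Scholars (61525201)
Abstract:Functional description documents are an important basis for developers to understand software. Not every existing software project has documentation that fully describes its functions, but the communication records produced while a project is developed and used contain a great deal of information discussing those functions. This paper therefore proposes a method for mining and organizing software functional features based on StackOverflow question-and-answer data. The method describes software functional features as verb-object phrases, mines and organizes the functional features contained in StackOverflow data, and automatically generates a hierarchically organized functional feature document for a software project. In experiments on real projects, the functional documents generated by the method covered 97.6% of the commonly used functions listed in the official documentation. The method can also be extended to generate documents that comprehensively describe software functional features from other forms of project communication records.

Keywords:software reuse  functional feature  software documentation  StackOverflow  natural language syntactic parsing  frequent subgraph mining
Received:2017/7/19
Revised:2017/9/28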

Mining and Organizing Software Functional Features Based on StackOverflow Data
ZHU Zi-Xiao, ZOU Yan-Zhen, HUA Chen-Yan, SHEN Qi, ZHAO Jun-Feng. Mining and Organizing Software Functional Features Based on StackOverflow Data[J]. Journal of Software, 2018, 29(8): 2210-2225.
Authors:ZHU Zi-Xiao  ZOU Yan-Zhen  HUA Chen-Yan  SHEN Qi and ZHAO Jun-Feng
Affiliation:(1) Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, Beijing, 100871, China; (2) Software Institute, School of Electronics Engineering and Computer Science, Peking University, Beijing, 100871, China; (3) Beida (Binhai) Information Research, Tianjin, 300450, China. ZHU Zi-Xiao: 1, 2; ZOU Yan-Zhen: 1, 2, 3; HUA Chen-Yan: 1, 2; SHEN Qi: 1, 2; ZHAO Jun-Feng: 1, 2, 3
Abstract:Functional specification documents are important for developers who want to understand and reuse unfamiliar software libraries. Because writing them costs considerable time and human effort, many software projects do not provide official functional documentation. However, the communication records produced during software development contain valuable discussions of software functions and usage. In this paper, we propose a novel approach to automatically mining and organizing the functional features of open source software based on StackOverflow data. We describe functional features in the form of verb phrases. The approach generates a hierarchical list of software functional features as a supplement to the software documentation. In an experimental evaluation on real-world projects, the automatically generated documents covered 97.6% of the frequently used functional features in the official documents. The approach can also be adapted to different types of software communication records and applied to software in different domains.
Keywords:software reuse  functional feature  software documentation  StackOverflow  syntax parsing  frequent subgraph mining
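
Illustrative sketch (not the authors' implementation): the abstract describes functional features as verb phrases organized into a hierarchical list. The Python below shows one simple way such a two-level list could be assembled, assuming the verb-object pairs have already been extracted by a syntactic parser. The sample phrases, the function names build_hierarchy and render, and the min_count threshold are all hypothetical, and plain frequency counting stands in for the frequent subgraph mining named in the keywords.

```python
from collections import Counter, defaultdict

# Hypothetical verb-object phrases, assumed to have been extracted already
# from question text by a syntactic parser (that step is not shown here).
PHRASES = [
    ("create", "index"),
    ("create", "index"),
    ("create", "document"),
    ("search", "index"),
    ("delete", "document"),
    ("search", "index"),
    ("search", "index"),
]


def build_hierarchy(phrases, min_count=1):
    """Group (verb, object) pairs by verb, keeping pairs seen at least min_count times."""
    counts = Counter(phrases)
    tree = defaultdict(list)
    for (verb, obj), n in counts.items():
        if n >= min_count:
            tree[verb].append((obj, n))
    # Under each verb, order objects by descending frequency, then by name.
    for verb in tree:
        tree[verb].sort(key=lambda item: (-item[1], item[0]))
    # Order verbs by the total frequency of their kept phrases, then by name.
    return sorted(tree.items(), key=lambda kv: (-sum(n for _, n in kv[1]), kv[0]))


def render(hierarchy):
    """Render the hierarchy as an indented plain-text list."""
    lines = []
    for verb, objects in hierarchy:
        lines.append(verb)
        for obj, n in objects:
            lines.append(f"  {verb} {obj} ({n})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(build_hierarchy(PHRASES)))
```

Raising min_count drops rarely mentioned phrases, which is one crude way to keep only frequently discussed features; the paper's own filtering and grouping criteria are not stated in the abstract.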