Query expansion based on statistical learning from code changes
Authors: Qing Huang, Yangrui Yang, Xue Zhan, Hongyan Wan, Guoqing Wu
Affiliation: 1. State Key Laboratory of Software Engineering, Computer School, Wuhan University, Wuhan, China; 2. College of Information Engineering, North China University of Water Resources and Electric Power, Zhengzhou, China; 3. Deepin Technologies Co Ltd, Wuhan, China
Abstract: Thesaurus‐based, code‐related, and software‐specific query expansion techniques are the main approaches to free‐form query search. However, these techniques still cannot place the most relevant result in the first position, because they cannot infer, from a given query, the expansion words that represent the user's needs. In this paper, we find that code changes can indicate what users want, and we propose a novel query expansion technique with code changes (QECC). QECC extracts (changes, contexts) pairs from changed methods. Through statistical learning over these pairs, it infers the code changes relevant to a given query. It then expands the query with those code changes and recommends results that meet actual needs perfectly. In addition, we implement InstaRec to perform QECC and evaluate it with 195,039 change commits from GitHub and our code tracker. The results show that QECC can improve the precision of 3 code search algorithms (i.e., IR, Portfolio, and VF) by up to 52% to 62% and outperform the state‐of‐the‐art query expansion techniques (i.e., query expansion based on crowd knowledge and CodeHow) by 13% to 16% when the top 1 result is inspected.
Keywords: code changes; code search; information retrieval; software reuse; statistical learning; query expansion
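
The abstract describes learning from (changes, contexts) pairs and then inferring code changes for a query. The sketch below illustrates that general idea only; it is not the paper's InstaRec implementation. The function names `train` and `expand_query`, the co-occurrence scoring, and the toy pairs are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train(pairs):
    """Count how often each change token co-occurs with each context word.

    `pairs` is an iterable of (change_tokens, context_tokens) tuples.
    This co-occurrence model is a hypothetical stand-in for the paper's
    statistical learning step, which the abstract does not specify.
    """
    cooc = defaultdict(Counter)  # context word -> Counter of change tokens
    ctx_count = Counter()        # context word -> number of pairs containing it
    for changes, contexts in pairs:
        for word in set(contexts):
            ctx_count[word] += 1
            for change in set(changes):
                cooc[word][change] += 1
    return cooc, ctx_count


def expand_query(query, model, top_k=2):
    """Append the top_k change tokens most associated with the query words."""
    cooc, ctx_count = model
    scores = Counter()
    for word in query.lower().split():
        if ctx_count[word] == 0:
            continue  # word never seen in any context, so it adds no evidence
        for change, n in cooc[word].items():
            # Estimated P(change | context word), summed over query words.
            scores[change] += n / ctx_count[word]
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return query.split() + [change for change, _ in ranked[:top_k]]


# Toy (changes, contexts) pairs, invented for this example.
pairs = [
    (["close"], ["read", "file", "stream"]),
    (["close", "flush"], ["write", "file"]),
    (["lock"], ["thread", "shared"]),
]
model = train(pairs)
print(expand_query("read file", model, top_k=1))
```

With these toy pairs, the query "read file" gains the token "close", because "close" co-occurs with both query words. A search backend would then run the expanded query in place of the original one.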