A noun-based approach to feature location using time-aware term-weighting
Affiliation: 1. Faculty of Computer Science & Information Technology, University of Malaya, Kuala Lumpur, Malaysia; 2. Department of Computer Science, Central Washington University, Ellensburg, WA, USA
Abstract:
Context: Feature location aims to identify the source code location corresponding to the implementation of a software feature. Many existing feature location methods apply text retrieval to determine the relevance of the features to the text data extracted from software repositories. One of the preprocessing activities in text retrieval is term-weighting, which adjusts the importance of a term within a document or corpus. Common term-weighting techniques may not be optimal for text data from software repositories, because these techniques originate from a natural-language context.
Objective: This paper describes how considering when terms were used in the repositories, under the condition of weighting only the noun terms, can improve a feature location approach.
Method: We propose a feature location approach using a new term-weighting technique that takes into account how recently a term has been used in the repositories. In this approach, only the noun terms are weighted, to reduce the dataset volume and avoid dealing with dimensionality reduction.
Results: An empirical evaluation of the approach on four open-source projects reveals improvements in accuracy, effectiveness, and performance of up to 50%, 17%, and 13%, respectively, compared to the commonly used Vector Space Model approach. A comparison of the proposed term-weighting technique with the Term Frequency-Inverse Document Frequency (TF-IDF) technique shows accuracy, effectiveness, and performance improvements of as much as 15%, 10%, and 40%, respectively. An investigation of using only noun terms, instead of all terms, in the proposed approach also indicates improvements of up to 28%, 21%, and 58% in accuracy, effectiveness, and performance, respectively.
Conclusion: In general, the use of time in the weighting of terms, along with the use of only the noun terms, makes significant improvements to a feature location approach that relies on textual information.
Keywords: Feature location; Software change request; Time-metadata; Term-weighting; Noun usage
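
The abstract does not give the weighting formula, so the following is only a minimal sketch of what a time-aware term weight over noun terms could look like. It assumes that nouns have already been extracted by some part-of-speech tagger, that each document carries a day number, and that recency is modeled as an exponential half-life decay multiplied into a smoothed TF-IDF score. The function name `time_aware_weights`, the `half_life_days` parameter, and the decay form are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def time_aware_weights(docs, now, half_life_days=180.0):
    """Illustrative time-aware weighting (not the paper's formula).

    docs: list of (noun_terms, day_number) pairs, where noun_terms is a
          list of strings assumed to be nouns already.
    now:  day number against which recency is measured.
    Returns a dict mapping each term to its weight.
    """
    n_docs = len(docs)
    tf = Counter()      # total occurrences of each term in the corpus
    df = Counter()      # number of documents containing each term
    last_used = {}      # most recent day on which each term appeared

    for terms, day in docs:
        tf.update(terms)
        unique_terms = set(terms)
        df.update(unique_terms)
        for term in unique_terms:
            last_used[term] = max(last_used.get(term, day), day)

    weights = {}
    for term in tf:
        # Smoothed inverse document frequency (assumed form).
        idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
        # Recency factor: halves every half_life_days since last use.
        age = max(0.0, now - last_used[term])
        recency = 0.5 ** (age / half_life_days)
        weights[term] = tf[term] * idf * recency
    return weights
```

Under these assumptions, two terms with identical frequency statistics receive different weights if one was last used more recently, which is the intuition the Method paragraph describes.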
This article is indexed in ScienceDirect and other databases.