Affiliation: | (1) Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, Beijing, 100871, China; (2) School of Electronics Engineering and Computer Science, Peking University, Beijing, 100871, China |
Abstract: | In this paper we present an approach to automate the architecture recovery process of software systems. The approach builds on information retrieval and clustering techniques: it uses Latent Semantic Indexing (LSI) to compute similarities among software entities (e.g., programs or classes) and the k-means clustering algorithm to group software entities that implement similar functionality. To reduce computation time as the software evolves, and thereby reduce energy waste, the architecture recovery process can also be applied using fold-in and fold-out mechanisms that, respectively, add software entities to and remove them from the LSI representation of the software system under study. The approach has been implemented in a prototype supporting tool, realized as an Eclipse plug-in. Finally, to assess the approach and the plug-in, we have conducted an empirical investigation on five open source software systems implemented in Java and C/C++. In the investigation, special emphasis has also been given to the effect of using the fold-in and fold-out mechanisms. |
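
The pipeline summarized in the abstract (LSI similarities, k-means grouping, fold-in and fold-out) can be illustrated with a minimal sketch. This is not the plug-in's implementation: the toy term-by-entity matrix, the entity names, the function names (`lsi`, `fold_in`, `kmeans`, `unit_rows`), the farthest-point initialization, and the modeling of fold-out as dropping a row are all assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def lsi(term_doc, k):
    """Rank-k LSI. Returns (U_k, docs_k); docs_k has one row per entity."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k = U[:, :k]
    docs_k = term_doc.T @ U_k  # equals V_k * Sigma_k for the indexed entities
    return U_k, docs_k

def fold_in(U_k, term_vector):
    """Project a new entity into the existing latent space (no new SVD)."""
    return term_vector @ U_k

def unit_rows(m):
    """Scale rows to unit length so Euclidean distance tracks cosine similarity."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    return m / np.where(norms == 0, 1.0, norms)

def kmeans(points, k, iters=50):
    """Plain k-means with a deterministic farthest-point initialization (assumed)."""
    centroids = [points[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centroids], axis=0)
        centroids.append(points[np.argmax(d)])
    centroids = np.array(centroids)
    for _ in range(iters):
        dist = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([points[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# Hypothetical toy corpus: rows are terms, columns are software entities.
# terms:    parse, token, socket, packet
# entities: Lexer, Parser, TcpClient, UdpServer
term_doc = np.array([[2, 3, 0, 0],
                     [3, 1, 0, 0],
                     [0, 0, 2, 1],
                     [0, 0, 1, 3]], dtype=float)

U_k, docs_k = lsi(term_doc, k=2)
labels, centroids = kmeans(unit_rows(docs_k), k=2)

# Fold-in: a new entity (say, HttpClient) is projected with the existing U_k.
new_entity = np.array([0, 0, 3, 1], dtype=float)
folded = unit_rows(fold_in(U_k, new_entity)[None, :])
new_label = int(np.linalg.norm(centroids - folded, axis=1).argmin())

# Fold-out, modeled here simply as dropping the entity's row (an assumption).
docs_without_lexer = np.delete(docs_k, 0, axis=0)
```

In this sketch the folded-in entity is assigned to the nearest existing centroid, which avoids recomputing the SVD; as more entities are folded in or out, the latent space may drift from what a full recomputation would give, so a periodic rebuild is a reasonable design choice.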