Detecting approximate clones in business process model repositories
Affiliation: 1. School of Computer Science and Technology, Shandong University, Jinan, China; 2. School of Engineering, Brown University, Providence, United States; 3. Engineering Research Center of Digital Media Technology, Ministry of Education of PRC, Jinan, China; 1. Laboratoire de Conception et Application de Molécules Bioactives, CNRS - Université de Strasbourg UMR 7199, Faculté de Pharmacie, 74, Route du Rhin, F-67400 Illkirch, France; 2. Ecole Supérieure de Biotechnologie de Strasbourg, CNRS - Université de Strasbourg UMR, Bld Sébastien Brant, F-67412 Illkirch, France; 3. Laboratoire de Biophotonique et Pharmacologie, CNRS - Université de Strasbourg UMR 7213, Faculté de Pharmacie, 74, Route du Rhin, F-67400 Illkirch, France; 4. Université de Lorraine, CNRS, CRAN, UMR 7039, Campus Sciences, 54500 Vandoeuvre les Nancy, France
Abstract: Empirical evidence shows that repositories of business process models used in industrial practice contain significant amounts of duplication. This duplication arises, for example, when the repository covers multiple variants of the same process or when fragments are copy-pasted. Previous work has addressed the problem of efficiently retrieving exact clones that can be refactored into shared subprocess models. This paper studies the broader problem of approximate clone detection in process models. It proposes techniques for detecting clusters of approximate clones based on two well-known clustering algorithms: DBSCAN and Hierarchical Agglomerative Clustering (HAC). It also defines a measure of standardizability of an approximate clone cluster, meaning the potential benefit of replacing the approximate clones with a single standardized subprocess. Experiments show that both techniques, in conjunction with the proposed standardizability measure, accurately retrieve clusters of approximate clones that originate from copy-pasting followed by independent modifications to the copied fragments. Additional experiments show that both techniques produce clusters that match those produced by human subjects and that are perceived to be standardizable.
Keywords: Business process model; Clone detection; Model collection; Repository; Standardization
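
The abstract names DBSCAN as one of the two clustering algorithms but does not describe the distance measure or parameters used. As a rough illustration only, the sketch below runs a textbook DBSCAN over a precomputed distance matrix between process fragments. The function name `dbscan`, the matrix `example_dist`, and the values of `eps` and `min_pts` are all hypothetical and are not taken from the paper.

```python
NOISE = -1  # label for fragments that belong to no cluster


def dbscan(dist, eps, min_pts):
    """Textbook DBSCAN over a symmetric, precomputed distance matrix.

    dist[i][j] is an assumed distance between fragments i and j.
    Returns one label per fragment: a cluster id (0, 1, ...) or NOISE.
    """
    n = len(dist)
    labels = [None] * n  # None means "not visited yet"
    cluster = 0
    for p in range(n):
        if labels[p] is not None:
            continue
        # The eps-neighborhood includes p itself, since dist[p][p] == 0.
        neighbors = [q for q in range(n) if dist[p][q] <= eps]
        if len(neighbors) < min_pts:
            labels[p] = NOISE  # may later be relabeled as a border point
            continue
        labels[p] = cluster
        queue = [q for q in neighbors if q != p]
        while queue:
            q = queue.pop()
            if labels[q] == NOISE:
                labels[q] = cluster  # border point: joins, but is not expanded
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbors = [r for r in range(n) if dist[q][r] <= eps]
            if len(q_neighbors) >= min_pts:
                queue.extend(q_neighbors)  # q is a core point: keep expanding
        cluster += 1
    return labels


# Hypothetical distances between six fragments: 0-2 are near-duplicates,
# 3-4 are near-duplicates, and 5 resembles nothing else.
example_dist = [
    [0.0, 0.1, 0.2, 0.9, 0.9, 0.8],
    [0.1, 0.0, 0.15, 0.9, 0.9, 0.8],
    [0.2, 0.15, 0.0, 0.9, 0.9, 0.8],
    [0.9, 0.9, 0.9, 0.0, 0.1, 0.7],
    [0.9, 0.9, 0.9, 0.1, 0.0, 0.7],
    [0.8, 0.8, 0.8, 0.7, 0.7, 0.0],
]

labels = dbscan(example_dist, eps=0.3, min_pts=2)
print(labels)
```

With these made-up distances, fragments 0 to 2 and fragments 3 to 4 form two candidate clone clusters and fragment 5 is left as noise. In a real setting, each cluster would still need to be assessed, for example with a measure such as the standardizability measure the abstract mentions.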
This article is indexed in ScienceDirect and other databases.