
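Explainable Feature-based Hierarchical Approach to Predict Remaining Process Time
Cite this article:GUO Na,LIU Cong,LI Cai-Hong,LU Ting,WEN Li-Jie,ZENG Qing-Tian.Explainable Feature-based Hierarchical Approach to Predict Remaining Process Time[J].Journal of Software,2024,35(3):1341-1356.
Authors:GUO Na  LIU Cong  LI Cai-Hong  LU Ting  WEN Li-Jie  ZENG Qing-Tian
Affiliation:School of Electrical and Electronic Engineering, Shandong University of Technology, Zibo 255000, China;School of Computer Science and Technology, Shandong University of Technology, Zibo 255000, China;College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China;School of Software, Tsinghua University, Beijing 100084, China
Funding:National Natural Science Foundation of China (61902222);Taishan Scholars Program of Shandong Province (ts20190936, tsqn201909109);Excellent Youth Fund of the Natural Science Foundation of Shandong Province (ZR2021YQ45);Youth Innovation Science and Technology Program for Innovation Teams of Shandong Higher Education Institutions (2021KJ031);Leading Talents and Excellent Research Team Program of Shandong University of Science and Technology (2015TDJH102)
Abstract:Predicting the remaining time of a process is of significant value for preventing and intervening in business anomalies. Existing remaining time prediction methods have reached higher accuracy through deep learning, but most deep models have complex structures whose predictions are hard to explain, which is the unexplainability problem. Moreover, besides the key attribute of activity, remaining time prediction selects several other attributes as input features according to domain knowledge. No general feature selection method is available, and this affects both prediction accuracy and model explainability. To address these problems, a remaining process time prediction framework based on an explainable feature-based hierarchical model (EFH model) is proposed. Specifically, a feature self-selection strategy is first proposed: through priority-based backward feature deletion and forward feature selection based on feature importance values, the attributes that have a positive effect on the prediction task are obtained as model input. An explainable feature-based hierarchical model architecture is then proposed, which adds different features layer by layer to obtain a prediction result at each layer and thereby explains the inherent relationship between feature values and prediction results. The approach is instantiated with the LightGBM (light gradient boosting machine) and LSTM (long short-term memory) algorithms; the framework is general and not limited to the chosen algorithms. Finally, the approach is compared with state-of-the-art methods on eight real-life event logs. The experimental results show that the proposed approach can select effective features, improve prediction accuracy, and explain the prediction results.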

Keywords:process mining  remaining time prediction  feature selection  explainability  hierarchical model
Received:2022/7/18
Revised:2022/9/12
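
The feature self-selection strategy named in the abstract (priority-based backward deletion followed by importance-based forward selection) can be illustrated with a minimal sketch. The abstract does not give the acceptance rule, the priority ordering, or the scoring procedure, so everything below is an assumption made for illustration: the `score` callback stands for a validation error such as MAE (lower is better), and `priority`, `importance`, `GAIN`, and `toy_error` are hypothetical names and values, not the authors' implementation.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical scoring callback: returns a validation error (lower is better).
Score = Callable[[Sequence[str]], float]


def backward_delete(features: Sequence[str],
                    priority: Dict[str, int],
                    score: Score) -> Tuple[List[str], float]:
    """Try dropping features from lowest to highest priority.

    Assumed rule: a deletion is kept when the error does not get worse.
    """
    selected = list(features)
    best = score(selected)
    for name in sorted(features, key=lambda f: priority[f]):
        if len(selected) == 1:  # always keep at least one feature
            break
        trial = [f for f in selected if f != name]
        err = score(trial)
        if err <= best:
            selected, best = trial, err
    return selected, best


def forward_select(base: Sequence[str],
                   candidates: Sequence[str],
                   importance: Dict[str, float],
                   score: Score) -> Tuple[List[str], float]:
    """Try adding candidates from most to least important.

    Assumed rule: an addition is kept only when the error strictly improves.
    """
    selected = list(base)
    best = score(selected)
    for name in sorted(candidates, key=lambda f: importance[f], reverse=True):
        if name in selected:
            continue
        err = score(selected + [name])
        if err < best:
            selected.append(name)
            best = err
    return selected, best


# Toy stand-in for "train a model and measure its error" (made-up numbers).
GAIN = {"activity": 5.0, "resource": 2.0, "noise": -1.0}


def toy_error(features: Sequence[str]) -> float:
    return 10.0 - sum(GAIN[f] for f in features)


kept, kept_err = backward_delete(
    ["activity", "resource", "noise"],
    {"noise": 0, "resource": 1, "activity": 2},
    toy_error,
)
chosen, chosen_err = forward_select(
    ["activity"], ["resource", "noise"],
    {"resource": 0.6, "noise": 0.1},
    toy_error,
)
```

In practice `score` would retrain the chosen learner on the trial feature set and evaluate it on held-out prefixes of the event log; the toy function only makes the control flow easy to follow.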

Explainable Feature-based Hierarchical Approach to Predict Remaining Process Time
GUO Na,LIU Cong,LI Cai-Hong,LU Ting,WEN Li-Jie,ZENG Qing-Tian.Explainable Feature-based Hierarchical Approach to Predict Remaining Process Time[J].Journal of Software,2024,35(3):1341-1356.
Authors:GUO Na  LIU Cong  LI Cai-Hong  LU Ting  WEN Li-Jie  ZENG Qing-Tian
Affiliation:School of Electrical and Electronic Engineering, Shandong University of Technology, Zibo 255000, China;School of Computer Science and Technology, Shandong University of Technology, Zibo 255000, China;College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China;School of Software, Tsinghua University, Beijing 100084, China
Abstract:Remaining process time prediction is important for preventing and intervening in abnormal business operations. Existing approaches have achieved higher accuracy through deep learning techniques. However, most deep models have complex structures, and their prediction results are difficult to explain, which is known as the unexplainability problem. In addition, besides the key attribute of activity, remaining time prediction selects several other attributes as input features of the prediction model according to domain knowledge. A general feature selection method is missing, which affects both prediction accuracy and model explainability. To tackle these two challenges, this study introduces a remaining process time prediction framework based on an explainable feature-based hierarchical (EFH) model. Specifically, a feature self-selection strategy is first proposed: the attributes that have a positive impact on the prediction task are obtained as the input features of the model through priority-based backward feature deletion and forward feature selection based on feature importance. Then an EFH model architecture is proposed. The prediction result of each layer is obtained by adding different features layer by layer, which explains the relationship between feature values and prediction results. The study instantiates the proposed approach with the light gradient boosting machine (LightGBM) and long short-term memory (LSTM) algorithms; the framework is general and not limited to these algorithms. Finally, the proposed approach is compared with state-of-the-art methods on eight real-life event logs. The experimental results show that the proposed approach can select effective features, improve prediction accuracy, and explain the prediction results.
Keywords:process mining  remaining time prediction  feature selection  explainability  hierarchical model
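
The layer-by-layer idea in the abstract (add one feature per layer, record the prediction at each layer, and read the change between layers as the effect of the added feature) can be sketched as follows. This is a sketch under stated assumptions rather than the EFH model itself: the per-layer learner here is a plain group mean standing in for LightGBM or LSTM, and the names `fit_layers`, `explain`, the feature order, and the toy event-log rows are all hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, str]


def fit_layers(rows: Sequence[Row],
               targets: Sequence[float],
               feature_order: Sequence[str]) -> List[Dict[tuple, float]]:
    """Layer k predicts the mean target of rows sharing the first k feature values.

    Layer 0 uses no features, so it is the global mean.
    """
    layers = []
    for k in range(len(feature_order) + 1):
        groups = defaultdict(list)
        for row, y in zip(rows, targets):
            key = tuple(row[f] for f in feature_order[:k])
            groups[key].append(y)
        layers.append({key: mean(values) for key, values in groups.items()})
    return layers


def explain(layers: Sequence[Dict[tuple, float]],
            feature_order: Sequence[str],
            row: Row) -> Tuple[List[float], List[Tuple[str, float]]]:
    """Return per-layer predictions and the shift attributed to each added feature."""
    preds: List[float] = []
    for k, table in enumerate(layers):
        key = tuple(row[f] for f in feature_order[:k])
        if key in table:
            preds.append(table[key])
        else:
            # Unseen value combination: carry the previous layer's prediction.
            preds.append(preds[-1])
    contributions = [
        (feature_order[k - 1], preds[k] - preds[k - 1])
        for k in range(1, len(preds))
    ]
    return preds, contributions


# Toy event-log prefixes with made-up remaining times (hours).
rows = [
    {"activity": "A", "resource": "r1"},
    {"activity": "A", "resource": "r2"},
    {"activity": "B", "resource": "r1"},
    {"activity": "B", "resource": "r2"},
]
targets = [10.0, 6.0, 2.0, 2.0]
order = ["activity", "resource"]

layers = fit_layers(rows, targets, order)
preds, contributions = explain(layers, order, {"activity": "A", "resource": "r1"})
```

For the queried row the per-layer predictions move from the global mean to the activity-level mean to the activity-and-resource mean, so each step shows how much one added feature changed the estimate. How the published model couples its layers is not stated in the abstract, so this decomposition should be read only as an illustration of the general principle.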