WKAG: Fraud Detection Method for Imbalanced Medical Insurance Data


Cite this article:	吴文龙, 周喜, 王轶, 王保全. WKAG：一种针对不平衡医保数据的欺诈检测方法[J]. 计算机工程与应用, 2021, 57(9): 247-254.

Authors:	吴文龙 (WU Wenlong), 周喜 (ZHOU Xi), 王轶 (WANG Yi), 王保全 (WANG Baoquan)

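Affiliations:	1. Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China; 3. Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China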

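Funding:	STS Program of the Chinese Academy of Sciences; Youth Innovation Promotion Association of the Chinese Academy of Sciences; Tianshan Youth Program of the Autonomous Region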

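Abstract:	Medical insurance fraud detection is of urgent practical importance. Current work relies mainly on machine learning methods but faces two major problems: (1) the data are severely imbalanced, with fraud samples making up a very small share, which degrades detection performance; (2) feature selection and construction depend too heavily on domain business knowledge, making it hard to guarantee that the features are effective. To address these problems, this paper proposes WKAG, a fraud detection method for imbalanced medical insurance data. It uses WGAN-KDE (Wasserstein Generative Adversarial Network-Kernel Density Estimation) to mitigate the data imbalance, an Auto-Encoder to extract deep hidden features from the data, and a Gradient Boosted Decision Tree (GBDT) to detect medical insurance fraud. The method's effectiveness is verified on several public datasets and in experiments on a real medical insurance business dataset; the results show that WKAG can serve as an effective method for detecting medical insurance fraud.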
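Keywords:	generative adversarial network; imbalanced classes; auto-encoder; feature representation; medical insurance fraud detection; ensemble learning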
WKAG: Fraud Detection Method for Imbalanced Medical Insurance Data

WU Wenlong, ZHOU Xi, WANG Yi, WANG Baoquan. WKAG: Fraud Detection Method for Imbalanced Medical Insurance Data[J]. Computer Engineering and Applications, 2021, 57(9): 247-254.

Authors:	WU Wenlong, ZHOU Xi, WANG Yi, WANG Baoquan

Affiliation:	1. Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China; 3. Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China

Abstract:	Medical insurance fraud detection has urgent practical significance. Current work concentrates mainly on machine learning methods and is confronted with two important issues: (1) data imbalance is prominent, and the proportion of fraud data among medical insurance data is extremely small, which harms identification performance; (2) the selection and construction of data features depend on domain business knowledge, so it is difficult to guarantee the validity of the features. To address these problems, this paper proposes WKAG, a fraud detection method for imbalanced healthcare data. The Wasserstein Generative Adversarial Network-Kernel Density Estimation (WGAN-KDE) method is used to reduce the imbalance of medical insurance data, the Auto-Encoder is used to extract deep hidden features of the data, and the Gradient Boosted Decision Tree (GBDT) is used to detect medical insurance fraud. The validity of the method has been verified on multiple public datasets as well as on a real medical insurance business dataset. The results show that WKAG can be used as an effective detection method for medical insurance fraud.

Keywords:	generative adversarial network; imbalanced dataset; auto-encoder; feature representation; medical insurance fraud detection; ensemble learning
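
The abstract names the three stages of WKAG but not their internals. As a rough illustration of the rebalancing idea only, the sketch below oversamples the minority (fraud) class by sampling from a Gaussian kernel density estimate. It is not the paper's WGAN-KDE: there is no adversarial generator here, and the function names, the bandwidth value, the binary-label assumption, and the toy data are all assumptions made for this example.

```python
import random


def kde_oversample(minority, n_new, bandwidth=0.1, seed=0):
    """Draw n_new synthetic rows from a Gaussian KDE fitted to `minority`.

    Sampling from a Gaussian KDE is equivalent to picking a stored row at
    random and adding independent Gaussian noise with std = bandwidth.
    """
    if n_new > 0 and not minority:
        raise ValueError("cannot oversample an empty minority class")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        synthetic.append([x + rng.gauss(0.0, bandwidth) for x in base])
    return synthetic


def rebalance(X, y, minority_label=1, bandwidth=0.1, seed=0):
    """Append KDE samples until both classes have equally many rows.

    Assumes binary labels (hypothetical convention: 1 = fraud, 0 = normal).
    """
    minority = [row for row, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    n_new = max(0, n_majority - len(minority))
    extra = kde_oversample(minority, n_new, bandwidth, seed)
    return X + extra, y + [minority_label] * len(extra)


# Toy data (made up for illustration): 8 normal claims, 2 fraudulent ones.
X = [[0.1 * i, 1.0] for i in range(8)] + [[5.0, 9.0], [5.2, 9.1]]
y = [0] * 8 + [1] * 2
X_bal, y_bal = rebalance(X, y)
print(len(X_bal), sum(y_bal))  # 16 rows in total, 8 of them labeled fraud
```

In a pipeline shaped like the one the abstract describes, the rebalanced rows would then pass through a feature extractor (an auto-encoder in the paper) before reaching the classifier (GBDT in the paper); those stages are omitted here.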
This article is indexed in Wanfang Data and other databases.