Application of improved self-training model in the identification of users with poor service quality
Citation: Li YU, Zhe LI, Fei GAO, Xiangyang YUAN, Yong YANG. Application of improved self-training model in the identification of users with poor service quality[J]. Telecommunications Science, 2021, 37(10): 136-142. DOI: 10.11959/j.issn.1000-0801.2021191
Authors: Li YU, Zhe LI, Fei GAO, Xiangyang YUAN, Yong YANG
Affiliation: 1. China Mobile Research Institute, Beijing 100053, China; 2. China Mobile Communications Corporation, Beijing 100033, China
Abstract: Identifying users with poor service quality (poor-quality users) is an important step in reducing complaint rates and improving user satisfaction. In current telecommunications network systems, the large volume of structured and unstructured data related to service perception is difficult to label effectively, the labels for poor-quality users are incomplete, and the training samples of existing supervised learning models are imbalanced, all of which lead to a low identification rate. To address these problems, an improved self-training semi-supervised learning model is adopted: a small number of users with low satisfaction scores or complaints serve as poor-quality labels for annotating network data, and label transfer is then used to train on a large amount of unlabeled data to identify poor-quality users. Experiments show that, compared with fully supervised learning (high identification accuracy but high training cost) and unsupervised learning (low identification accuracy), semi-supervised learning makes full use of unlabeled samples for effective training, significantly improving the identification accuracy for poor-quality users while keeping training cost low.

Keywords: semi-supervised learning; improved self-training model; poor quality user identification; unlabeled data
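
The abstract does not spell out the improved model itself, so the following is only a minimal sketch of plain self-training, the general technique the paper builds on. It assumes a simple nearest-centroid base classifier, an illustrative distance-based confidence score, and a confidence threshold of 0.8; the function names (`centroids`, `predict_with_confidence`, `self_train`) and all parameter values are hypothetical and are not taken from the paper.

```python
import math


def centroids(X, y):
    """Fit a nearest-centroid model: the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def predict_with_confidence(model, row):
    """Return the closest class and an illustrative confidence score.

    With two classes the score lies in [0.5, 1]: it is the share of the
    total distance that belongs to the class that was NOT chosen.
    """
    dists = {c: math.dist(row, mu) for c, mu in model.items()}
    label = min(dists, key=dists.get)
    total = sum(dists.values())
    conf = 1.0 - dists[label] / total if total > 0 else 1.0
    return label, conf


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_rounds=10):
    """Plain self-training loop (assumed setup, not the paper's variant).

    Each round: fit on the labeled set, pseudo-label the unlabeled rows
    whose confidence reaches `threshold`, move them into the labeled set,
    and stop when nothing new is added or the pool is empty.
    """
    X_lab, y_lab = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(max_rounds):
        model = centroids(X_lab, y_lab)
        remaining, added = [], 0
        for row in pool:
            label, conf = predict_with_confidence(model, row)
            if conf >= threshold:
                X_lab.append(row)
                y_lab.append(label)
                added += 1
            else:
                remaining.append(row)
        pool = remaining
        if added == 0 or not pool:
            break
    # Refit once more so the returned model includes the last pseudo-labels.
    return centroids(X_lab, y_lab), pool


# Toy usage: label 1 stands for a "poor-quality" user, label 0 for a normal one.
model, leftover = self_train(
    X_lab=[[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [2.9, 3.1]],
    y_lab=[0, 0, 1, 1],
    X_unlab=[[0.1, 0.2], [3.1, 2.8], [1.5, 1.5]],
)
print(predict_with_confidence(model, [2.8, 3.0]), leftover)
```

In this sketch, ambiguous rows that never reach the threshold stay in `leftover` instead of receiving a pseudo-label, which limits the spread of wrong labels; how the paper's improved variant handles such samples is not stated in the abstract.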
This article is indexed by Wanfang Data and other databases.