Multi-Granularity Based Feature Interaction Pruning Model for CTR Prediction
Citation: Bai Ting, Liu Xuanning, Wu Bin, Zhang Zibin, Xu Zhiyuan, Lin Kangyi. Multi-Granularity Based Feature Interaction Pruning Model for CTR Prediction[J]. Journal of Computer Research and Development, 2024, 61(5): 1290-1298. DOI: 10.7544/issn1000-1239.202220943
Authors: Bai Ting  Liu Xuanning  Wu Bin  Zhang Zibin  Xu Zhiyuan  Lin Kangyi
Affiliation: 1. School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876; 2. Weixin Open Platform, Tencent, Guangzhou 510220
Funding: National Natural Science Foundation of China (62102038, 61972047); Tencent Weixin Open Platform project (S2021120)
Received: 2022-11-11
Revised: 2023-03-09

Abstract: Learning effective high-order feature interactions is crucial for click-through rate (CTR) prediction in recommender systems. Existing methods build high-order feature combinations by reassembling low-order ones, such as 2nd-order (pairwise) interactions, and must compute an interaction weight for every combination, so their time complexity grows exponentially with the feature dimension. Deep neural network-based methods can be viewed as universal function approximators that can, in principle, learn any kind of feature interaction. However, they have been shown to approximate low-order interactions (e.g., 2nd- or 3rd-order) inefficiently, which can reduce the accuracy of CTR prediction. To address these problems, we propose FeatNet, a multi-granularity feature interaction pruning network for CTR prediction. First, at the explicit feature granularity, FeatNet applies a threshold pruning operation to generate different feature subsets that contain the meaningful feature combinations, which preserves the diversity of feature combinations while reducing the complexity of high-order feature interactions. Then, based on the pruned feature subsets, it performs implicit high-order feature interactions at the granularity of feature elements, where a filter automatically removes invalid feature interactions. Extensive experiments were conducted on two real-world datasets, and FeatNet achieved the best CTR prediction performance on both.

Keywords: CTR prediction  high-order feature interaction  multi-granularity  feature pruning  feature denoising
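
The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how feature importance scores are obtained, how the element-level interaction is computed, or what form the filter takes. The sketch assumes given per-field scores, an element-wise product as the interaction, and a sigmoid gate as the filter. All function names (`count_interactions`, `prune_feature_subsets`, `gated_interaction`) are hypothetical.

```python
import numpy as np


def count_interactions(num_fields):
    """Number of feature combinations of order >= 2 over num_fields fields.

    All subsets (2^m) minus the empty set and the m single fields.
    This is the exponential growth the abstract refers to.
    """
    return 2 ** num_fields - num_fields - 1


def prune_feature_subsets(scores, threshold):
    """Stage 1 (assumed form): threshold pruning at the feature level.

    scores: (num_subsets, num_fields) importance scores, one row per subset.
    Returns a boolean mask; True means the field is kept in that subset.
    """
    return scores >= threshold


def gated_interaction(embeddings, mask, gate_logits):
    """Stage 2 (assumed form): element-level interaction with a filter.

    embeddings:  (num_fields, dim) field embeddings.
    mask:        (num_fields,) boolean mask for one pruned subset.
    gate_logits: (dim,) hypothetical learnable parameters of the filter.
    """
    kept = embeddings[mask]
    if kept.shape[0] == 0:
        return np.zeros(embeddings.shape[1])
    # Element-wise product across the kept fields: an implicit
    # interaction whose order equals the number of kept fields.
    crossed = np.prod(kept, axis=0)
    # Sigmoid gate that can damp individual embedding elements.
    gate = 1.0 / (1.0 + np.exp(-gate_logits))
    return gate * crossed


scores = np.array([[0.9, 0.1, 0.7],
                   [0.2, 0.8, 0.6]])
masks = prune_feature_subsets(scores, threshold=0.5)
embeddings = np.array([[1.0, 2.0],
                       [3.0, 4.0],
                       [5.0, 6.0]])
output = gated_interaction(embeddings, masks[0], gate_logits=np.zeros(2))
```

With 10 fields, `count_interactions(10)` already gives 1013 candidate combinations, which is why pruning before crossing is attractive. Each row of `masks` defines a different subset, so several distinct combinations can be kept in parallel, in the spirit of the diversity the abstract mentions.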