
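Adaptive Online Kernel Density Estimation Method (一种自适应在线核密度估计方法)
Cite this article:DENG Qi-Lin, QIU Tian-Yu, SHEN Fu-Rao, ZHAO Jin-Xi. Adaptive Online Kernel Density Estimation Method[J]. Journal of Software (软件学报), 2020, 31(4): 1173-1188.
Authors:DENG Qi-Lin (邓齐林)  QIU Tian-Yu (邱天宇)  SHEN Fu-Rao (申富饶)  ZHAO Jin-Xi (赵金熙)
Affiliation:State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing, Jiangsu 210023; Department of Computer Science and Technology, Nanjing University, Nanjing, Jiangsu 210023
Funding:National Natural Science Foundation of China (61876076); Natural Science Foundation of Jiangsu Province (BK20171344)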
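Abstract (translated from the Chinese):Given a set of observations, estimating the underlying probability density function is a fundamental task in statistics, known as the density estimation problem. With advances in data collection technology, large volumes of real-time streaming data have emerged. Such data is large in volume and generated at high speed, and its underlying distribution may change over time, so estimating the distribution of such data has become a pressing problem. Among traditional density estimation algorithms, however, parametric algorithms have limited expressive power because of their strong model assumptions, while non-parametric algorithms are more expressive but usually have high computational complexity. Neither therefore applies well to streaming data. By analyzing the learning process of competitive learning, this paper proposes an online density estimation algorithm for streaming data and analyzes its close connection to the Gaussian mixture model. Finally, the proposed algorithm is compared experimentally with existing density estimation algorithms. The results show that it obtains better estimates than existing online density estimation algorithms and essentially matches the estimation performance of the best current offline density estimation algorithms.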

Keywords:density estimation  Gaussian mixture model  data stream  online learning  competitive learning
Received:2017/3/3
Revised:2018/4/2

Adaptive Online Kernel Density Estimation Method
DENG Qi-Lin,QIU Tian-Yu,SHEN Fu-Rao,ZHAO Jin-Xi.Adaptive Online Kernel Density Estimation Method[J].Journal of Software,2020,31(4):1173-1188.
Authors:DENG Qi-Lin  QIU Tian-Yu  SHEN Fu-Rao  ZHAO Jin-Xi
Affiliation:State Key Laboratory for Novel Software Technology(Nanjing University), Nanjing 210023, China;Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China
Abstract:Density estimation is the construction, from observed data, of an estimate of an unobservable underlying probability density function. With the development of data collection technology, real-time streaming data has become the main subject of many related tasks. Such data has high throughput and a high generation speed, and its underlying distribution may change over time. Among traditional density estimation algorithms, however, parametric methods make unrealistic assumptions about the estimated density function, while non-parametric ones suffer from unacceptable time and space complexity. Neither therefore scales well enough to meet the requirements of a streaming data environment. Based on an analysis of the learning strategy in competitive learning, this study proposes a novel online density estimation algorithm for such streaming data and points out its close relationship with the Gaussian mixture model. Finally, the proposed algorithm is compared with existing density estimation algorithms. The experimental results show that it obtains better estimates than the existing online algorithms and achieves estimation performance comparable to state-of-the-art offline density estimation algorithms.
Keywords:density estimation  Gaussian mixture model  data stream  online learning  competitive learning
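
The abstract names the ingredients (competitive learning, online updates, a Gaussian mixture view) but not the update rules. As a rough illustration only, the sketch below shows one generic way a winner-take-all scheme can maintain a Gaussian mixture over a 1-D stream with constant memory per component. It is not the algorithm from the paper: the class name `OnlineGaussianMixture1D`, the parameters `new_component_distance` and `initial_variance`, and the rule for creating components are all assumptions made for this example.

```python
import math
import random


class OnlineGaussianMixture1D:
    """Toy winner-take-all online density estimator for 1-D streams.

    Illustrative assumption, not the paper's method: thresholds and
    update formulas here are chosen only to keep the sketch short.
    """

    def __init__(self, new_component_distance=3.0, initial_variance=1.0):
        self.new_component_distance = new_component_distance  # hypothetical, in std units
        self.initial_variance = initial_variance              # hypothetical prior variance
        self.counts = []  # number of samples won by each component
        self.means = []   # running mean of each component
        self.m2 = []      # running sum of squared deviations (Welford)

    def _variance(self, k):
        # The prior variance acts as one pseudo-observation, so a freshly
        # created component never has zero width.
        return (self.initial_variance + self.m2[k]) / self.counts[k]

    def update(self, x):
        """Consume one sample; cost is linear in the number of components."""
        if self.means:
            # Competition: the smallest standardized distance wins.
            dists = [abs(x - m) / math.sqrt(self._variance(k))
                     for k, m in enumerate(self.means)]
            winner = min(range(len(dists)), key=dists.__getitem__)
            if dists[winner] <= self.new_component_distance:
                # Welford's incremental update, applied to the winner only.
                self.counts[winner] += 1
                delta = x - self.means[winner]
                self.means[winner] += delta / self.counts[winner]
                self.m2[winner] += delta * (x - self.means[winner])
                return
        # No component is close enough: start a new one at the sample.
        self.counts.append(1)
        self.means.append(x)
        self.m2.append(0.0)

    def pdf(self, x):
        """Mixture density; each weight is the component's share of wins."""
        total = sum(self.counts)
        density = 0.0
        for k, mean in enumerate(self.means):
            var = self._variance(k)
            gauss = math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)
            density += self.counts[k] / total * gauss
        return density


if __name__ == "__main__":
    rng = random.Random(0)
    est = OnlineGaussianMixture1D()
    for _ in range(2000):  # stream from two well-separated Gaussians
        centre = -5.0 if rng.random() < 0.5 else 5.0
        est.update(rng.gauss(centre, 1.0))
    print(len(est.means), est.pdf(-5.0), est.pdf(0.0))
```

Each sample is processed once and then discarded, which is what makes this kind of estimator suitable for streams. The trade-off is that the result depends on the order of arrival and on the distance threshold, and this toy version has no mechanism for forgetting old data when the distribution drifts.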
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).