
UNI64 Sampling Method on Online Social Networks
Citation: XU Nan-Shan, LI Hao, LU Gang. UNI64 Sampling Method on Online Social Networks[J]. Computer Systems & Applications, 2014, 23(12): 206-212.
Authors: XU Nan-Shan, LI Hao, LU Gang
Affiliation: College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
Funding: Beijing Higher Education Young Elite Teacher Project (YETP0506)
Abstract: In studies of sampling methods for online social networks (OSNs), samples collected by the acceptance-rejection method are commonly used as the "ground truth" against which other sampling methods are evaluated. Since OSN sites upgraded their user IDs from 32-bit to 64-bit, however, the acceptance rate of the original acceptance-rejection method has dropped to nearly zero. Taking Sina Weibo as an example, we analyze the distribution of its user IDs and propose an improved acceptance-rejection method called UNI64. The method analyzes the distribution of valid sample IDs and uses clustering to divide the user ID space into valid intervals and vacant intervals. Candidate IDs are then generated only within valid intervals, which effectively improves the acceptance rate even when valid IDs are extremely sparse in the ID space.
Keywords: online social networks; sampling method; random sampling; Sina Weibo; hierarchical clustering
Received: 2014/4/10
Revised: 5/9/2014
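The interval-restricted acceptance-rejection idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the clustering details, so the sketch assumes a simple gap-based split of known valid IDs (in one dimension, equivalent to cutting a single-linkage hierarchical clustering at a distance threshold). The names `cluster_intervals`, `sample_valid_ids`, `max_gap`, and the `is_valid` lookup are all hypothetical; in practice `is_valid` would be a query against the OSN.

```python
import random
from typing import Callable, List, Tuple

Interval = Tuple[int, int]  # inclusive bounds (lo, hi)


def cluster_intervals(known_ids: List[int], max_gap: int) -> List[Interval]:
    """Group known valid IDs into intervals (assumed clustering rule).

    Sorted IDs are split wherever two neighbours are more than `max_gap`
    apart. Everything outside the returned intervals is treated as vacant.
    """
    ids = sorted(set(known_ids))
    if not ids:
        return []
    intervals: List[Interval] = []
    lo = prev = ids[0]
    for x in ids[1:]:
        if x - prev > max_gap:
            intervals.append((lo, prev))  # close the current cluster
            lo = x                        # start a new one
        prev = x
    intervals.append((lo, prev))
    return intervals


def sample_valid_ids(
    intervals: List[Interval],
    is_valid: Callable[[int], bool],
    n: int,
    rng: random.Random,
    max_tries: int = 1_000_000,
) -> Tuple[List[int], int]:
    """Acceptance-rejection sampling restricted to the valid intervals.

    An interval is picked with probability proportional to its length and a
    candidate is drawn uniformly inside it, so candidates are uniform over
    the union of intervals. Candidates failing `is_valid` are rejected.
    Sampling is with replacement, so an ID may be accepted more than once.
    Returns the accepted IDs and the number of candidates tried.
    """
    if not intervals:
        return [], 0
    weights = [hi - lo + 1 for lo, hi in intervals]
    accepted: List[int] = []
    tries = 0
    while len(accepted) < n and tries < max_tries:
        tries += 1
        lo, hi = rng.choices(intervals, weights=weights, k=1)[0]
        candidate = rng.randint(lo, hi)
        if is_valid(candidate):
            accepted.append(candidate)
    return accepted, tries
```

The ratio of accepted IDs to tries gives the empirical acceptance rate. Drawing candidates from the full 64-bit range instead of the clustered intervals would correspond to the original method, whose acceptance rate the abstract reports as nearly zero.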
This article is indexed in databases including VIP and Wanfang Data.