
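Large-scale duplicate image retrieval technical research for the internet
Cite this article: WANG Shu-peng, CHEN Ming, WU Guang-jun. Large-scale duplicate image retrieval technical research for the internet[J]. Journal on Communications, 2014, 35(12): 23-202.
Authors: WANG Shu-peng  CHEN Ming  WU Guang-jun
Affiliation: 1. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093; 2. School of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou, Henan 450000
Funding: National Natural Science Foundation of China (61271275, 61202067); National High Technology Research and Development Program of China ("863" Program) (2013AA013205, 2012AA013001, 2013AA013204); Beijing Municipal Science and Technology Program (Z131100001113034)
Abstract: Targeting typical social media applications on the internet, this paper proposes a large-scale distributed duplicate image retrieval method based on random projection and block DCT coefficients. Built on a Hadoop cluster, the method first generates image signatures by random projection mapping, then uses the signatures to query an HBase table efficiently and obtain a high-recall set of candidate images, and finally filters the candidates further with block DCT coefficients to improve retrieval precision. Experiments on 12 million microblog images show that, with H = 2 and T = 150, the method achieves a recall of 98%, a precision of 93.2%, and an average retrieval time of 6.7 s.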
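
The signature step described in the abstract can be illustrated with a minimal sketch. It assumes a sign-of-random-projection scheme (one bit per random direction) and a Hamming-radius lookup, which is a common way to build such signatures but is not confirmed as the authors' exact construction. The function names, the in-memory `index` dictionary standing in for an HBase table, and the `radius` parameter are all hypothetical; the abstract does not define H or T, so no correspondence with them is claimed.

```python
import random


def make_projections(dim, bits, seed=0):
    """Draw `bits` random Gaussian direction vectors of length `dim`."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def signature(features, projections):
    """Pack the sign of each random projection into an integer, one bit each."""
    sig = 0
    for direction in projections:
        dot = sum(f * d for f, d in zip(features, direction))
        sig = (sig << 1) | (1 if dot >= 0 else 0)
    return sig


def hamming(a, b):
    """Number of differing bits between two integer signatures."""
    return bin(a ^ b).count("1")


def candidates(query_sig, index, radius):
    """Return ids whose stored signature is within `radius` bits of the query.

    `index` is a plain dict {image_id: signature}; it is only a stand-in for
    a real key-value store such as an HBase table (assumption).
    """
    return [img_id for img_id, sig in index.items()
            if hamming(query_sig, sig) <= radius]
```

A linear scan like `candidates` is only for illustration; a distributed store would normally be keyed on the signature (or on pieces of it) so that nearby signatures can be fetched without touching every row.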

Keywords: social media  random projection mapping  image signature  block DCT coefficients  Hadoop cluster

Large-scale duplicate image retrieval technical research for the internet
Shu-peng WANG, Ming CHEN, Guang-jun WU. Large-scale duplicate image retrieval technical research for the internet[J]. Journal on Communications, 2014, 35(12): 23-202.
Authors: Shu-peng WANG  Ming CHEN  Guang-jun WU
Affiliation: 1. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China; 2. School of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450000, China
Abstract: For typical social media applications on the internet, a large-scale distributed duplicate image retrieval approach based on random projection and block DCT coefficients is proposed. Running on a Hadoop cluster, the approach uses image signatures generated by random projection mapping to query HBase efficiently and obtain a high-recall set of candidate images. To improve retrieval precision, block DCT coefficients are then used to further filter the candidates. Experiments on 12 million images show that, with H=2 and T=150, the approach achieves a recall of 98%, a precision of 93.2%, and an average retrieval time of 6.7 s.
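
The final filtering stage can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it assumes grayscale images given as lists of rows, 8x8 blocks, the top-left 2x2 low-frequency coefficients of each block as the feature, and an L1 distance compared against a threshold. The names `block_dct_features`, `filter_candidates`, `keep`, and `threshold` are hypothetical, and `threshold` is not claimed to be the T of the abstract.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def block_dct_features(image, block=8, keep=2):
    """Concatenate the top-left keep x keep DCT coefficients of every block."""
    feats = []
    for r in range(0, len(image) - block + 1, block):
        for c in range(0, len(image[0]) - block + 1, block):
            tile = [row[c:c + block] for row in image[r:r + block]]
            coeffs = dct2(tile)
            feats.extend(coeffs[u][v] for u in range(keep) for v in range(keep))
    return feats


def l1_distance(a, b):
    """Sum of absolute differences between two equal-length feature lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def filter_candidates(query_img, candidate_imgs, threshold):
    """Keep candidate ids whose block-DCT features lie within `threshold`."""
    q = block_dct_features(query_img)
    return [img_id for img_id, img in candidate_imgs.items()
            if l1_distance(q, block_dct_features(img)) <= threshold]
```

The direct quadruple loop in `dct2` keeps the sketch dependency-free; in practice a library DCT routine would be used for speed.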
Keywords:social media  random projection mapping  image signature  block DCT coefficients  Hadoop cluster