
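Research on Similarity Search in Print-on-Demand Platforms
Cite this article: ZHANG Ming-xi, ZHANG Lei-hong, LYU Wei, SUN Liu-jie. Similarity Search over Print-on-demand Platform[J]. Packaging Engineering, 2015, 36(23): 135-139.
Authors: ZHANG Ming-xi, ZHANG Lei-hong, LYU Wei, SUN Liu-jie
Affiliation: University of Shanghai for Science and Technology, Shanghai 200093 (all four authors)
Funding: Scientific Research Innovation Project of the Shanghai Municipal Education Commission (15ZZ074); Shanghai University Young Teacher Training Funding Program (ZZSLG14021); Tendered Research Project of the Shanghai Publishing and Media Research Institute (SAYB1410); Doctoral Start-up Fund of the University of Shanghai for Science and Technology (1D-14-309-001)
Abstract: Objective: To study the efficiency of similarity search in print-on-demand platforms. Methods: A "user-product" relation is built from the "purchase" relationship between users and products, and an efficient similarity search method based on P-Rank, called POD-Rank, is proposed for discovering similar products from the "user-product" relation. POD-Rank computes user similarity offline from the "user-product" relation and then uses the user similarity to compute product similarity online. An optimized online query processing algorithm is further proposed to reduce the time cost of query processing. Results: The computation time and storage cost of POD-Rank are significantly lower than those of P-Rank, and POD-Rank responds quickly to query requests. Conclusion: The similarity computation cost of POD-Rank is 0.03% of that of P-Rank and its storage cost is 0.06% of that of P-Rank, while its effectiveness is close to that of P-Rank. It can therefore meet the demands of large-scale product data processing in print-on-demand platforms.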

Keywords: print-on-demand; P-Rank; similarity search; "user-product" relation graph
Received: 2015/5/23
Revised: 2015/12/10

Similarity Search over Print-on-demand Platform
ZHANG Ming-xi, ZHANG Lei-hong, LYU Wei, SUN Liu-jie. Similarity Search over Print-on-demand Platform[J]. Packaging Engineering, 2015, 36(23): 135-139.
Authors: ZHANG Ming-xi, ZHANG Lei-hong, LYU Wei, SUN Liu-jie
Abstract: This work studies the efficiency of similarity search over print-on-demand (POD) platforms. A "user-product" relation graph is built from the purchasing relationships between users and products, and the similarity between products is measured according to the structure of this graph. To improve efficiency, we propose a similarity search method, POD-Rank, which divides the computation into two steps. In the first step, the similarity between users is computed offline; in the second step, the similarity between the query and each candidate product is computed online from the user similarity. To further reduce the response time of online query processing, we propose an optimized online query processing algorithm that skips unnecessary accumulation operations on zero values. The space cost and pre-computation time of POD-Rank are markedly lower than those of P-Rank, with little loss of effectiveness and a short online query time. With the two-step similarity computation, the computation time is only 0.03% of that of P-Rank, the similarity matrix is only 0.06% of the size of that of P-Rank, and the effectiveness is close to that of P-Rank. The method can therefore be applied efficiently to large datasets on POD platforms.
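The two-step idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the POD-Rank formulas, so everything below is an assumption: a plain SimRank-style neighbor average stands in for the P-Rank-based measure, and the names `build_graph`, `pair_score`, `offline_user_similarity`, `search`, and the `decay` and `iterations` parameters are hypothetical. User similarity is stored sparsely (non-zero pairs only), so the online step never accumulates zero values.

```python
from collections import defaultdict


def build_graph(purchases):
    """Turn (user, product) purchase pairs into two adjacency maps."""
    bought = defaultdict(set)   # user -> products purchased
    buyers = defaultdict(set)   # product -> users who purchased it
    for user, product in purchases:
        bought[user].add(product)
        buyers[product].add(user)
    return dict(bought), dict(buyers)


def pair_score(a, b, neighbors, other_sim, decay):
    """SimRank-style score of (a, b) from the similarity of their neighbors.

    other_sim is sparse: other_sim[x][y] exists only for non-zero pairs
    with x != y, so zero-valued terms are never visited.
    """
    if a == b:
        return 1.0
    na, nb = neighbors[a], neighbors[b]
    total = float(len(na & nb))          # shared neighbors contribute s(x, x) = 1
    for x in na:
        for y, s in other_sim.get(x, {}).items():
            if y in nb:
                total += s
    return decay * total / (len(na) * len(nb))


def all_pairs(neighbors, other_sim, decay):
    """Sparse similarity table for every pair of nodes on one side of the graph."""
    nodes = sorted(neighbors)
    sim = defaultdict(dict)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            score = pair_score(a, b, neighbors, other_sim, decay)
            if score > 0.0:
                sim[a][b] = score
                sim[b][a] = score
    return dict(sim)


def offline_user_similarity(bought, buyers, decay=0.8, iterations=3):
    """Step 1 (offline): iterate between product and user similarity, keep users."""
    user_sim = {}
    for _ in range(iterations):
        product_sim = all_pairs(buyers, user_sim, decay)
        user_sim = all_pairs(bought, product_sim, decay)
    return user_sim


def search(query, buyers, user_sim, decay=0.8, top_k=10):
    """Step 2 (online): score each candidate product against the query."""
    scored = ((p, pair_score(query, p, buyers, user_sim, decay))
              for p in buyers if p != query)
    ranked = sorted((item for item in scored if item[1] > 0.0),
                    key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Toy purchase log (made-up data for illustration only).
purchases = [("u1", "a"), ("u1", "b"), ("u2", "b"), ("u2", "c"), ("u3", "d")]
bought, buyers = build_graph(purchases)
user_sim = offline_user_similarity(bought, buyers, iterations=1)
ranked = search("a", buyers, user_sim)   # "d" is omitted: its only buyer is unrelated to u1
```

In this sketch only the user similarity table is kept between queries, which mirrors the abstract's split between offline and online work; the actual storage layout and scoring formula used by POD-Rank may differ.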