
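A Hierarchical-Semantics-Based Intelligent Collection Method for Web Fashion Images
Cite this article: Geng Zengmin, Shang Shuyuan, Shao Xinyan, Zhou Yiling, Ma Ling. A Hierarchical-Semantics-Based Intelligent Collection Method for Web Fashion Images [J]. Computer Science, 2016, 43(Z11): 252-255.
Authors: Geng Zengmin, Shang Shuyuan, Shao Xinyan, Zhou Yiling, Ma Ling
Affiliations (in author order): Beijing Key Laboratory of Digital and Interactive Media, Beijing Institute of Fashion Technology, Beijing 100029, and Computer Information Center, Beijing Institute of Fashion Technology, Beijing 100029; Computer Information Center, Beijing Institute of Fashion Technology, Beijing 100029; School of Fashion Art and Engineering, Beijing Institute of Fashion Technology, Beijing 100029; Computer Information Center, Beijing Institute of Fashion Technology, Beijing 100029; School of Fashion Art and Engineering, Beijing Institute of Fashion Technology, Beijing 100029
Funding: This work was supported by a Key Project of the Beijing Education Science "Twelfth Five-Year Plan" (AJA11174), the Ministry of Education Humanities and Social Sciences Project (12YJA760014), and a special grant from the Beijing Municipal Education Commission.
Abstract: With the goal of large-scale intelligent collection of fashion images from the Internet, this paper studies how to exploit the association between the accompanying text of fashion images on the Web and fashion image concepts, so as to automatically collect the fashion images that correspond to each semantic concept. Building on the HITS (Hyperlink-Induced Topic Search) algorithm, it proposes SICR (Semantic-based Image Collection Robot), an image collection algorithm based on hierarchical semantics. Supported by a hierarchical semantic library, the algorithm expands the root set and removes link-farm pages at the same time. Before crawling a linked page, it computes the similarity of the anchor text and analyzes the concepts in the page content, discards pages that do not match the target semantics, and downloads only the fashion images that do. The algorithm overcomes the shortcomings of automatic image extraction algorithms based on text analysis or link analysis alone and achieves high precision and recall. Experimental results demonstrate the effectiveness of SICR.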
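The abstract names HITS as the starting point for SICR. As background only, the standard HITS hub/authority iteration can be sketched as below. This is a minimal illustration of plain HITS, not the authors' SICR implementation: the function name `hits`, the adjacency-dict input format, and the fixed iteration count are assumptions made for this sketch, and none of the semantic filtering or link-farm removal described in the abstract is included.

```python
import math


def hits(graph, iterations=50):
    """Plain HITS on a link graph (illustrative sketch, not SICR).

    graph: dict mapping each page to a list of pages it links to.
    Returns (hub, authority) score dicts, each L2-normalized.
    """
    # Collect every page that appears as a source or a target.
    pages = set(graph) | {t for targets in graph.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority update: sum the hub scores of pages linking in.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in graph.items():
            for t in targets:
                new_auth[t] += hub[src]
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {p: v / norm for p, v in new_auth.items()}

        # Hub update: sum the authority scores of pages linked to.
        new_hub = {p: sum(auth[t] for t in graph.get(p, ())) for p in pages}
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {p: v / norm for p, v in new_hub.items()}

    return hub, auth
```

In a toy graph where pages `a` and `b` both link to `c`, page `c` receives all of the authority weight while `a` and `b` share the hub weight equally, which is the behavior a link-analysis crawler relies on when ranking candidate pages.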

Keywords: image semantics; image retrieval; fashion image; Web mining

Hierarchical Semantic-based Web Intelligent Fashion Image Retrieval Method
GENG Zeng-min, SHANG Shu-yuan, SHAO Xin-yan, ZHOU Yi-ling and MA Ling. Hierarchical Semantic-based Web Intelligent Fashion Image Retrieval Method [J]. Computer Science, 2016, 43(Z11): 252-255.
Authors: GENG Zeng-min, SHANG Shu-yuan, SHAO Xin-yan, ZHOU Yi-ling and MA Ling
Affiliations (in author order): Beijing Lab of Digital and Interactive Media, Beijing Institute of Fashion Technology, Beijing 100029, China, and Computer Information Center, Beijing Institute of Fashion Technology, Beijing 100029, China; Computer Information Center, Beijing Institute of Fashion Technology, Beijing 100029, China; School of Fashion Art and Engineering, Beijing Institute of Fashion Technology, Beijing 100029, China; Computer Information Center, Beijing Institute of Fashion Technology, Beijing 100029, China; School of Fashion Art and Engineering, Beijing Institute of Fashion Technology, Beijing 100029, China
Abstract: Aiming at large-scale automatic collection of fashion images from the Web, this paper studies how to use the association between the accompanying text and the concepts of fashion images on Web pages to collect images automatically. Building on semantic-based Web content acquisition and addressing the drawbacks of HITS (Hyperlink-Induced Topic Search), it proposes SICR (Semantic-based Image Collection Robot), a method for collecting fashion images from the Web. Supported by a hierarchical semantic library, SICR removes link-farm pages while expanding the root set and computes anchor-text similarity when crawling linked pages. In addition, it performs a brief conceptual analysis of the page content before downloading images. Experimental results on a large-scale dataset demonstrate that the method overcomes the deficiencies of purely text-based or purely link-based analysis and improves the precision and recall of fashion image retrieval.
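The anchor-text check described in the abstract can be illustrated with a small sketch. The abstract does not specify the similarity measure or the format of the hierarchical semantic library, so everything here is an assumption made for illustration: the toy `HIERARCHY` dictionary, the token-overlap score in `anchor_similarity`, the `filter_links` helper, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
# Hypothetical hierarchical semantic library: concept -> narrower concepts.
# The real library used by SICR is not described in the abstract.
HIERARCHY = {
    "clothing": ["dress", "coat"],
    "dress": ["evening dress", "sundress"],
    "coat": ["trench coat"],
}


def expand(concept, hierarchy=HIERARCHY):
    """Return the concept together with all of its descendants."""
    terms, stack = set(), [concept]
    while stack:
        term = stack.pop()
        if term not in terms:
            terms.add(term)
            stack.extend(hierarchy.get(term, []))
    return terms


def anchor_similarity(anchor_text, concept, hierarchy=HIERARCHY):
    """Fraction of anchor-text tokens found in the concept's vocabulary.

    Assumed measure for this sketch; the paper's actual similarity
    computation may differ.
    """
    vocab = {
        tok
        for term in expand(concept, hierarchy)
        for tok in term.lower().split()
    }
    tokens = anchor_text.lower().split()
    if not tokens:
        return 0.0
    return sum(tok in vocab for tok in tokens) / len(tokens)


def filter_links(links, concept, threshold=0.5):
    """Keep (url, anchor_text) pairs whose anchor text matches the concept."""
    return [
        (url, text)
        for url, text in links
        if anchor_similarity(text, concept) >= threshold
    ]
```

A crawler built this way would call `filter_links` on the outgoing links of each page and only fetch the survivors, so off-topic pages are dropped before any download cost is paid. The threshold trades precision against recall and would need tuning on real data.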
Keywords: Image semantics; Image retrieval; Fashion image; Web mining