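A Survey of Web Community Discovery Techniques
Cite this article: Yang Nan, Gong Danzhi, Li Xian, Meng Xiaofeng. A Survey of Web Community Discovery Techniques [J]. Journal of Computer Research and Development, 2005, 42(3): 439-447.
Authors: Yang Nan, Gong Danzhi, Li Xian, Meng Xiaofeng
Affiliation: School of Information, Renmin University of China, Beijing 100872
Funding: National "863" High Technology Research and Development Program of China (2002AA116030); National Natural Science Foundation of China (60073014, 60273018); Key Science and Technology Project of the Ministry of Education (03044); Excellent Young Teachers Program of the Ministry of Education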
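Abstract: The Web is a huge information source composed of complex hypertext, and it keeps expanding rapidly. Given such a constantly changing source, discovering and exploiting useful information on the Web has become challenging. A large number of communities have emerged as the Web has evolved, and these communities are very important information about how the Web is organized. Understanding them helps us gain an overview of the whole Web. Organizing the Web by community has many advantages: communities can guide users to information they are interested in; they can help Internet/Intranet service providers organize portals effectively; and they can help manufacturers find consumers accurately. Communities also represent the social activity of the Web, because the Web is itself a social network. At present, many communities are discovered and maintained manually, which makes maintenance costly and modification difficult; in addition, there are many unknown, or latent, communities that cannot be discovered manually. Many studies are therefore devoted to automatic or semi-automatic community discovery techniques. Community discovery mainly relies on link analysis over the Web graph. The methods fall roughly into two categories: topic-oriented community discovery and topic-free community discovery. This paper gives a fairly comprehensive analysis of community discovery techniques and summarizes the challenging problems that remain and future research trends.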

Keywords: Web resource discovery; community; link analysis; Web model

Survey of Web Communities Identification
Yang Nan,Gong Danzhi,Li Xian,Meng Xiaofeng.Survey of Web Communities Identification[J].Journal of Computer Research and Development,2005,42(3):439-447.
Authors:Yang Nan  Gong Danzhi  Li Xian  Meng Xiaofeng
Abstract: The WWW is a complicated collection of hypertext that expands at tremendous speed. Finding and applying useful information on the Web is a challenging job. Many communities have emerged as the Web has evolved, and these communities are very important information about how the Web is organized. Knowing them helps in gaining an overview of the whole Web. Organizing the Web into communities has many advantages: users can navigate to the information that interests them, Internet/Intranet service providers can organize portals efficiently, and manufacturers can find the right consumers. Communities also reflect the social nature of the Web, because the Web is a social network. At present, many communities are found and maintained by human effort, which is costly and makes them difficult to update. Moreover, there are still many unknown and newly emerged communities that cannot be found manually. This has motivated much research on automatic or semi-automatic discovery technologies. Community extraction methods fall into two categories: topic-oriented and non-topic-oriented. They use different data sources: the former uses the results a search engine returns for a query term, and the latter uses raw data from a crawler. The field is still new, and many problems remain. This paper analyzes current community-finding algorithms and describes the challenging problems and promising research trends.
Keywords:Web resource discovery  community  link analysis  Web modeling
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.