Similar literature
Found 20 similar documents (search time: 281 ms)
1.
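To enable real-time, effective search of blog resources, this paper proposes a blog search engine built on Lucene, a high-performance and extensible search engine framework. It uses RSS and web-spider (crawler) technology to collect blog resources quickly, supports blog search well, and reaches a reasonably good level of efficiency and cost in collection, index generation, and retrieval.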
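The collect-then-index pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's system: it uses only the Python standard library as a stand-in for Lucene, and the sample feed, the URLs, and the helper names `parse_rss_items`, `build_inverted_index`, and `search` are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample feed; a real collector would download this from a blog's RSS URL.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>Indexing with Lucene</title>
      <link>http://example.com/posts/1</link>
      <description>Notes on building a search index</description>
    </item>
    <item>
      <title>RSS basics</title>
      <link>http://example.com/posts/2</link>
      <description>How feeds describe new posts</description>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return one dict per <item> element of an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
        })
    return items


def build_inverted_index(items):
    """Map each lowercase word to the set of post links that contain it."""
    index = defaultdict(set)
    for post in items:
        text = post["title"] + " " + post["description"]
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(post["link"])
    return index


def search(index, query):
    """Return the sorted links of posts that contain every word of the query."""
    tokens = re.findall(r"\w+", query.lower())
    if not tokens:
        return []
    # Use .get so that looking up a missing word does not add it to the index.
    result = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(result)


items = parse_rss_items(SAMPLE_RSS)
index = build_inverted_index(items)
print(search(index, "lucene"))
```

Because a feed already exposes each post's title, link, and summary as structured fields, the collector avoids most page-specific HTML parsing; that is one plausible reading of why the abstract pairs RSS with a crawler.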

2.
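On December 6, 2006, Baidu, the world's largest Chinese-language search engine, announced the official release of its authoritative 2006 report on blog development in China, drawing on its extensive search technology and comprehensive analysis of blog data.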

3.
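Trend watch 1: comment search. The first important trend is searching user-generated content. 2005 can fairly be called the year of the blog search engine. The Internet industry paid so much attention to mining blog information that nearly every major search engine, Google included, felt obliged to offer blog search.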

4.
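With the rapid rise of blogs as a new way of publishing personal information, blog circles, which bring together bloggers who share interests, play an important role in Internet applications. This paper incorporates authors' interests and combines them with traditional machine-learning ideas to propose a method that automatically aggregates blogs into blog circles with well-defined categories. Experimental results show that the method meets practical requirements in both accuracy and efficiency.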

5.
伦宏 《电脑开发与应用》2008,21(7):F0003-F0003
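Data on the Internet is vast and unordered. Making real use of online information requires making good use of its retrieval tool, the search engine. After several generations of development, search engines can now use intelligent-agent technology to track users' retrieval behavior, analyze user models, and build agent-based information filtering and personalized services. China's two major search engines, Baidu and Google's Chinese service, are evenly matched in overall strength, each with its own merits. Google's advantage lies in the volume and richness of its information; Baidu's lies in its better understanding of Chinese and its closer fit to the search habits of Chinese users.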

6.
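Spam blog identification based on multiple structural features   Total citations: 1 (self-citations: 0, citations by others: 1)
To address the growing problem of spam blogs, this paper studies the spamming techniques that produce them and the corresponding identification techniques. Based on an analysis of a large number of Chinese spam blogs and a study of spammers' goals, it proposes a feature-extraction method that starts from structural features of a blog: user name, posting interval, post content, anchor text and link addresses, and category tags. On top of these features it proposes an identification method based on multiple structural features and builds a corresponding system model. Experiments use support vector machines and a naive Bayes model as classifiers and compare the results with the classic content-based method. On a small training set, the multi-structural-feature method reaches an accuracy above 90%, 6 percentage points higher than the content-based method, and distinguishes spam blogs from normal blogs effectively.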

7.
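Blog comment filtering based on the Bayesian method and information fingerprints   Total citations: 1 (self-citations: 0, citations by others: 1)
Blogs have enriched and changed the Internet and affected how people pass on information. Blog comments, a widespread form of interaction on blogs, raise new problems for information supervision. After analyzing existing blog filtering systems, this paper applies the Bayesian method, widely used in text filtering, to blog comments and, targeting the advertising bots that are common in blog comments, combines it with information fingerprints to identify and filter them. It also analyzes and experimentally compares the fingerprint functions that affect filtering quality and execution speed. The experiments show that combining the Bayesian method with information fingerprints filters blog comments effectively and, compared with the Bayesian method alone, improves system efficiency and detects advertising bots better.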
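One way to picture the combination described in this abstract is a two-stage filter: a cheap fingerprint check catches comments that a bot posts repeatedly, and a naive Bayes classifier handles the rest. The sketch below is illustrative only. The abstract does not specify its fingerprint function or thresholds, so the MD5-of-normalized-text fingerprint, the `max_repeats` parameter, the class names, and the toy training data are all assumptions.

```python
import hashlib
import math
import re
from collections import Counter


def fingerprint(text):
    """Assumed fingerprint: MD5 of the text with case and non-word characters removed."""
    normalized = re.sub(r"\W+", "", text.lower())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over whitespace tokens."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def predict(self, text):
        # Assumes at least one training example per label (otherwise log(0) fails).
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in self.LABELS:
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


class CommentFilter:
    """Flag a comment if its fingerprint repeats too often or the classifier says spam."""

    def __init__(self, classifier, max_repeats=2):
        self.classifier = classifier
        self.max_repeats = max_repeats  # hypothetical threshold
        self.seen = Counter()

    def is_spam(self, text):
        fp = fingerprint(text)
        self.seen[fp] += 1
        if self.seen[fp] > self.max_repeats:
            return True  # the same normalized text keeps reappearing: likely a bot
        return self.classifier.predict(text) == "spam"


nb = NaiveBayes()
nb.train("buy cheap pills now", "spam")
nb.train("cheap pills discount buy", "spam")
nb.train("great post thanks for sharing", "ham")
nb.train("i agree with your analysis", "ham")

comment_filter = CommentFilter(nb)
print(comment_filter.is_spam("buy cheap pills"))
```

The fingerprint lookup is a single hash and a counter update, so repeated bot comments are rejected without running the classifier, which is consistent with the abstract's claim that the combination improves running efficiency.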

8.
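Design and implementation of an ontology-based meta-search engine   Total citations: 1 (self-citations: 0, citations by others: 1)
Chinese meta-search engines still lag behind English ones, and existing Chinese meta-search engines need to improve their keyword expansion. Drawing on research into ontologies and meta-search engine technology, this paper proposes and implements an ontology-based meta-search engine system and describes how it works. After analyzing existing meta-search ranking algorithms, it improves the summary-based ranking algorithm. Finally, the system is tested and the results are analyzed: the system expands queries with synonyms and English equivalents of the keywords and effectively improves recall and precision.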

9.
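Given the surge in the number of web pages and the vast, disorderly state of online information resources, this paper introduces the development of personalized information retrieval systems and related technologies. After explaining the meaning and functions of Really Simple Syndication (RSS), it focuses on how RSS's two basic functions, information aggregation and information push, can be applied to personalized retrieval in e-commerce.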

10.
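A Chinese intelligent search engine based on a knowledge-base system   Total citations: 2 (self-citations: 0, citations by others: 2)
As information technology develops and the volume of information grows, traditional search engine technology is increasingly unable to meet users' information needs. The combination of search engine technology with artificial intelligence (AI) has become the key technology and core idea of web information search, and intelligent search engines based on knowledge-base systems are now a research hotspot. This paper introduces a Chinese intelligent search engine based on a knowledge-base system, the techniques used to implement it, and the main development directions for Chinese intelligent search engines.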

11.
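As the mobile industry grows and mobile technology improves, location-based services such as emergency assistance and information lookup are developing rapidly, and innovation in location-based services has become a major driver of the mobile industry. This paper designs and implements a location-based mobile blog system on the ISG platform. Unlike traditional mobile blog systems, it introduces the user's location: when a user writes a post, the system automatically records the user's location and stores it dynamically bound to the post, and users can search for blog posts dynamically according to their own location.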
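The bind-then-search behavior described in this abstract might look roughly like the sketch below. It is not the ISG-based implementation: the in-memory `LocationBlog` class, its method names, and the use of great-circle (haversine) distance with a search radius are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


class LocationBlog:
    """Hypothetical in-memory store that binds each post to the author's position."""

    def __init__(self):
        self.posts = []

    def publish(self, text, lat, lon):
        # The position is recorded at write time and stored together with the post.
        self.posts.append({"text": text, "lat": lat, "lon": lon})

    def search_nearby(self, lat, lon, radius_km):
        """Return the texts of posts within radius_km of the reader, nearest first."""
        hits = []
        for post in self.posts:
            distance = haversine_km(lat, lon, post["lat"], post["lon"])
            if distance <= radius_km:
                hits.append((distance, post["text"]))
        return [text for _, text in sorted(hits)]


blog = LocationBlog()
blog.publish("Lunch near Tiananmen", 39.9042, 116.4074)
blog.publish("Evening on the Bund", 31.2304, 121.4737)
print(blog.search_nearby(39.91, 116.40, radius_km=50))
```

A linear scan like this is only reasonable for small collections; a production system would more likely use a spatial index, which the abstract does not discuss.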

12.
Blog retrieval is a complex task because of informal language usage. Blogs deviate largely from the language used in traditional corpora for various reasons. Spelling errors, grammatical irregularity, overuse of abbreviations, and symbolic characters like emotions are a few reasons for the irregularity of blog corpora. To make the retrieval of blogs easier, the novel idea of a personalized semantic based blog retrieval (PSBBR) system is discussed in this paper. The blogs are tagged with a relationship to one another with reference to an ontology. The meanings of the blog content and key terms are tagged as XML tags. The query term accesses the XML tags to retrieve the entire blog content. The system is evaluated with a huge number of blogs extracted from various blog sources. Relevance score is calculated for every blog associated with

13.
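As a vehicle for users to publish their views and opinions, blogs have become an important platform on the Web for expressing and exchanging sentiment, and blog post search offers a quick and convenient route to that exchange. When searching blog posts, users often care most about the author's opinion or sentiment toward an event, yet current blog post search mostly returns results by topic rather than by sentiment orientation. This paper therefore proposes SOAD (sentiment orientation analysis based on syntactic dependency), an algorithm based on syntactic dependency analysis, to analyze the sentiment orientation of blog post search results. A prototype Chinese blog post search system built on SOAD reprocesses the search results. Experiments show that SOAD has a clear advantage in analyzing blog post sentiment and that the prototype system achieves the goal of returning search results according to sentiment orientation.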

14.
Although many studies focus on information sharing in communities and organisations, little research has been carried out on the antecedents of continuance intention of blog sharing. This study focuses on amateur blogs, which are the major customers for blog service providers (BSPs). The purposes are to investigate the antecedents of continuous blog sharing and determine whether they change with gender, age, and blog experience differences. Based on the Unified Theory of Acceptance and Use of Technology (UTAUT) framework and related social-psychological foundations, this study proposes outcome expectancy of financial capital, knowledge capital, and social capital, perceived usability, social influence, self-disclosure, and information literacy as the antecedents of continuous blog sharing. A survey of 268 blog authors reveals that usability is a necessary condition for continuous blog sharing. Outcome expectancy for knowledge capital and social capital can encourage continuous sharing behaviour, but expectancy for financial capital does not. Meanwhile, blog sharing is primarily a personal endeavour facilitated by inner self-disclosure, not extrinsic information literacy or social influence. In addition, the antecedents differ according to gender, age, and blog experience differences.

15.
Many modern consumers use blogs as important information sources, which they evaluate on the basis of blog-specific cues. Using the theory of self-disclosure, this study posits that bloggers' product evaluation self-disclosures, social self-disclosures, and blog popularity are key determinants of readers' cognitive and affective trust. Readers' trust in turn should affect their product attitudes and feedback intentions towards the blog. With a survey study involving seven blog articles about dining experience and a structural equation model, this research confirms the positive influences of product evaluation self-disclosures and popularity on readers' cognitive trust and of social self-disclosures on readers' affective trust. Both cognitive and affective forms of trust enhance product attitudes. Affective trust also increases readers' feedback intentions towards the blog. With these findings, this study offers suggestions for bloggers and companies that use blogs as marketing tools.

16.
A preliminary study of the role of blogs in subject teaching   Total citations: 1 (self-citations: 0, citations by others: 1)
倪海良 《数字社区&智能家居》2007,1(2):1146-1146,1150
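Blogs are an emerging form of online communication with many distinctive features and advantages over other forms. This paper discusses the use of blogs in subject teaching, analyzes the roles that teacher blogs and teacher-student blogs play in it, and offers several reflections on blogs.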

17.
This study examines factors influencing students’ continuance intention to use blogs to learn in an undergraduate-level course. The research uses constructs from relevant theoretical frameworks, including the technology acceptance model, social cognitive theory, innovation diffusion theory, and expectation–confirmation model. A survey administered to 108 university students in a Canadian university was analysed using the partial least squares technique. The results show that perceived usefulness and perceived compatibility have positive effects on students’ attitudes towards blog use; perceived ease of use did not. Perceived compatibility, perceived self-efficacy, and perceived support for enhancing social ties with blogs have significant effects on the positive impacts of learning with such tools. Attitude and positive impacts of learning with blogs influence satisfaction with blog use. Both attitude and satisfaction are determinants of students’ continuance intention to use blogs to learn. Satisfaction with blog use is the main predictor of continued use intention.

18.
Blogs are increasingly accepted as a useful means to proliferate a variety of information on the web. As the popularity of blogs grows rapidly, a number of blog search engines have appeared recently to help users access and discover blog posts efficiently. Nevertheless, existing approaches tend to focus on ranking the blog posts according to their recency or popularity only, leaving the problem of retrieving more topic relevant posts to a user’s query largely unexplored. In this paper, we present a novel blog ranking framework, called PTRank, that improves search quality by taking account of relevance feedback from users as well as various information available from RSS feeds. A neural network method is employed to learn ranking functions that provide a relevance score between a keyword and a blog post. Extensive experiments on real blog data have been conducted to validate the proposed ranking framework for blog post search, and the results indicate that PTRank performs significantly better than the existing popular approach.

19.
20.
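A blog article and comment extraction model based on the information content of web page formatting   Total citations: 3 (self-citations: 0, citations by others: 3)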
曹冬林  廖祥文  许洪波  白硕 《软件学报》2009,20(5):1282-1291
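Taking an information-theoretic view, this paper proposes a model for extracting blog articles and comments based on the information content of web page formatting. First, it locates the main body of a page by combining the visual position of page elements with the effective information of the text. Second, it treats the formatting information in blog pages as information units, computes the amount of formatting information contained in each information block, and separates the article from the comments within the main body by finding the split position with the minimum information content. The model is language-independent and therefore has a degree of generality. Experimental results show that it achieves high precision in both locating and splitting the main body of blog pages.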
