Similar literature
 A total of 20 similar documents were found.
1.
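Zhou Bo, Liu Yiqun, Zhang Min, Jin Yijiang, Ma Shaoping. Journal of Software (《软件学报》), 2011, 22(8): 1714-1724
Anchor text has been shown to improve Web information retrieval performance and is widely used by commercial Web search engines. However, because anchor text is created without any control, it contains a large amount of useless information that is irrelevant to the target page or intended as spam. In addition, for transactional queries, where the service quality of the retrieved results matters, the target pages recommended by raw anchor text are often inconsistent with real user experience. To address these problems, this work studies large-scale logs of real users' Web browsing behavior. It first proposes a framework for evaluating the retrieval effectiveness of anchor text, then analyzes the relationship between users' browsing and click behavior and anchor text retrieval effectiveness, and identifies behavioral features that help select high-quality anchor text. Based on these features, two methods for generating hyperlink documents are proposed. Experimental results show that anchor text selected using browsing and click behavior features significantly improves Web retrieval performance compared with the original anchor text.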

2.
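With the development of Internet technology, research on users' information behavior is receiving growing attention. Browsing behavior is an important part of information behavior, and studying it is important for improving user experience. Taking users of Toutiao (今日头条) as the subjects and drawing on customer perceived value theory, this study combines interviews with a questionnaire survey to explore the factors that influence users' news browsing behavior. It finds that, when users browse news on mobile devices, only utilitarian value has a significant positive effect on news browsing behavior. Because Toutiao is highly similar to other news clients, the conclusions can be generalized. News creators and platform operators should focus on improving content quality, reducing useless information, and ensuring that news content is truthful and reliable.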

3.
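RSS Technology and Its Development   Total citations: 4 (self-citations: 0, citations by others: 4)
The growth of the Internet has made the network an important source of information, but traditional browsing has certain shortcomings. RSS, a new browsing technology, has developed rapidly in recent years, and more and more Web sites offer RSS-based browsing. This paper surveys RSS technology: it reviews the origin and development of RSS, compares the different RSS versions, and explains how RSS works and how it differs from traditional browsing. It also discusses the advantages and application areas of RSS and briefly covers some of its shortcomings. As a new way of browsing the Web, RSS has both strengths and weaknesses, but it is bound to mature over time.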

4.
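Implementation of a Personalized Web Information Filtering Agent   Total citations: 13 (self-citations: 0, citations by others: 13)
This paper describes the implementation of a personalized information filtering agent for the Internet. It uses the vector space model [3] as the basis for document representation, extracts features from the Web pages the user browses, and uses a BP (backpropagation) neural network to learn and track the user's interests. In this way it keeps a dynamic picture of the user's browsing behavior and, when the user issues a query, effectively picks out the information the user is interested in.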
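The vector space representation mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the BP neural network is replaced here by a fixed term-frequency profile compared with cosine similarity, and the names `tf_vector`, `cosine`, `filter_pages`, and the `threshold` value are all hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Term-frequency vector of a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_pages(profile, pages, threshold=0.3):
    """Keep pages whose similarity to the interest profile meets the threshold.

    `profile` stands in for the learned user interest model (an assumption;
    the abstract describes a BP neural network instead).
    """
    return [p for p in pages if cosine(profile, tf_vector(p)) >= threshold]


if __name__ == "__main__":
    profile = tf_vector("neural network learning")
    pages = ["neural network training tips", "cooking pasta at home"]
    print(filter_pages(profile, pages))
```

A learned model would replace the fixed profile; the sketch only shows how documents and interests can share one vector representation.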

5.
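The rapid development of network technology and hardware has changed the reading and thinking habits of Internet users, which increasingly show the characteristics of "feminized" thinking, that is, Internet thinking that is emotional, experience-oriented, and decentralized. These characteristics are in fact users' demands in terms of visual perception, emotional interaction, and pluralistic architecture. The main goal of research on innovative Web interface design is to give users a better immersive experience and service. Taking the core flow of user behavior as the starting point, the paper analyzes the phenomenon of feminized Internet thinking, studies the needs of Internet users, and, drawing on case studies, discusses strategies and implementation methods for innovative Web interface design from three angles: strengthening the appeal of information, promoting users' emotional resonance, and building a future-oriented intelligent, pluralistic service system.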

6.
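To address the slow speed caused by the large amounts of data that rich Internet applications must transfer, a context-aware data transmission strategy is studied. By sensing the user's browsing history and current behavior, the system automatically predicts the data the user will need next and uses idle network time during user interaction to transfer it, improving the system's responsiveness. Experimental results show that the strategy greatly shortens users' waiting time, with a more pronounced effect in more complex scenarios.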

7.
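Legal Control of Online Obscenity and Pornography Crimes   Total citations: 1 (self-citations: 0, citations by others: 1)
Disseminating obscene material online generally takes the form of uploading or downloading obscene and pornographic material and creating hyperlinks that display such material. Uploading refers to the act of placing obscene or pornographic material on the Internet via the network, for example by setting up a website or personal home page and then uploading the material to its pages for others to browse or view. Downloading obscene material takes two forms; one is the act of sending such material from the Internet to the e-mail of unspecified persons, so that users see it when they open their e-mail.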

8.
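Limited by AI technology and the complexity of network conditions for remote intelligent terminals, tracking end users' Web browsing behavior tends to produce redundant data, which makes user identification difficult. This paper therefore proposes a reinforcement-learning-based method for identifying users of AI remote terminals. Behavioral regularities of remote terminal users are assessed in terms of unlocking, operation, and communication behavior, and browsing behavior is defined on the client through attributes such as user ID, visited page address, and page title. The browsing information is transmitted to a central server and stored in the terminal database so that complete end-user data are collected. Redundant information is removed with a wavelet threshold method; a reward-based continuous adjustment method from reinforcement learning is used to extract the behavior data set of AI remote terminal users; and the coupling between identity features and behavior features is computed to obtain the identification result. Simulation results show that the proposed method identifies target users quickly and accurately, safeguards user data, and provides a more reliable environment for AI remote operation.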

9.
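To improve the personalized experience of online shopping, an e-commerce shopping system based on intelligent recommendation was designed and implemented. To analyze and predict users' Web browsing behavior, the recommendation algorithm combines user-based and item-based collaborative filtering, and the system builds a hierarchy graph of users' key browsing paths with a partial-order structure. Data analysis shows that the improved recommendation algorithm helps raise the performance of the recommender system and thus meets users' personalized needs.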
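One way to combine user-based and item-based collaborative filtering is to blend the two predictions linearly. The sketch below assumes that approach; the abstract does not say how the two are combined, and the browsing-path hierarchy graph is not modeled. The function names, the `alpha` weight, and the sample ratings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse rating dictionaries."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[k] * v[k] for k in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def user_based(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, row in ratings.items():
        if other == user or item not in row:
            continue
        sim = cosine(ratings[user], row)
        num += sim * row[item]
        den += sim
    return num / den if den else 0.0


def item_based(ratings, user, item):
    """Similarity-weighted average of the user's own ratings of other items."""
    by_item = {}  # transpose: item -> {user: rating}
    for u, row in ratings.items():
        for i, r in row.items():
            by_item.setdefault(i, {})[u] = r
    if item not in by_item:
        return 0.0
    num = den = 0.0
    for other_item, r in ratings[user].items():
        if other_item == item:
            continue
        sim = cosine(by_item[item], by_item[other_item])
        num += sim * r
        den += sim
    return num / den if den else 0.0


def hybrid(ratings, user, item, alpha=0.5):
    """Linear blend of both predictions; alpha is an assumed tuning weight."""
    return (alpha * user_based(ratings, user, item)
            + (1 - alpha) * item_based(ratings, user, item))


if __name__ == "__main__":
    ratings = {
        "u1": {"a": 5, "b": 3},
        "u2": {"a": 5, "b": 3, "c": 4},
        "u3": {"a": 1, "c": 2},
    }
    print(round(hybrid(ratings, "u1", "c"), 2))
```

In a shopping system the ratings would typically be implicit scores derived from browsing or purchase logs rather than explicit stars.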

10.
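Computer & Network (《计算机与网络》), 2007, (15): 39
The anonymity of the Internet protects users' information and the security of their network use. Yet the media often report serious privacy violations on the Internet. Today, the greatest threat to user privacy is not the cookies used to track users, spyware, or sites that analyze browsing behavior, but the search engines we use every day. Most search engines, when a user uses their services, record information such as the user's IP address, the search keywords, and which site the user jumped to from the search results, and process it with techniques such as data mining.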

11.
Surfing the World Wide Web (WWW) involves traversing hyperlink connections among documents. The ability to predict surfing patterns could solve many problems facing producers and consumers of WWW content. We analyzed WWW server logs for a WWW site, collected over ten days, to compare different path reconstruction methods and to investigate how past surfing behavior predicts future surfing choices. Since log files do not explicitly contain user paths, various methods have evolved to reconstruct user paths. Session times, number of clicks per visit, and Levenshtein Distance analyses were performed to show the impact of various reconstruction methods. Different methods for measuring surfing patterns were also compared. Markov model approximations were used to model the probability of users choosing links conditional on past surfing paths. Information-theoretic (entropy) measurements suggest that information is gained by using longer paths to estimate the conditional probability of link choice given surf path. The improvements diminish, however, as one increases the length of path beyond one. Information-theoretic (total divergence to the average entropy) measurements suggest that the conditional probabilities of link choice given surf path are more stable over time for shorter paths than longer paths. Direct examination of the accuracy of the conditional probability models in predicting test data also suggests that shorter paths yield more stable models and can be estimated reliably with less data than longer paths. This revised version was published online in August 2006 with corrections to the Cover Date.
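The Markov approximation and entropy measurement described in this abstract can be sketched as follows. This is an illustration under assumptions, not the authors' code: paths are taken to be already reconstructed lists of page identifiers, and `transition_counts`, `conditional_entropy`, and `predict_next` are hypothetical names.

```python
import math
from collections import Counter, defaultdict


def transition_counts(paths, order=1):
    """Count next-page choices conditional on the preceding `order` pages."""
    counts = defaultdict(Counter)
    for path in paths:
        for i in range(order, len(path)):
            context = tuple(path[i - order:i])
            counts[context][path[i]] += 1
    return counts


def conditional_entropy(counts):
    """H(next page | context) in bits, weighted by how often each context occurs."""
    total = sum(sum(c.values()) for c in counts.values())
    h = 0.0
    for c in counts.values():
        n = sum(c.values())
        for k in c.values():
            h -= (k / total) * math.log2(k / n)
    return h


def predict_next(counts, context):
    """Most frequent next page after `context`, or None if the context is unseen."""
    c = counts.get(tuple(context))
    return c.most_common(1)[0][0] if c else None


if __name__ == "__main__":
    paths = [["A", "B", "C"], ["A", "B", "D"], ["X", "B", "C"]]
    first_order = transition_counts(paths, order=1)
    print(predict_next(first_order, ["B"]))
    print(round(conditional_entropy(first_order), 3))
```

Raising `order` conditions on longer paths but splits the data across more contexts, so each conditional probability is estimated from fewer observations; this is consistent with the abstract's remark that shorter paths can be estimated reliably with less data.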

12.
The Semantic Web (SW) is a meta-web built on the existing WWW to facilitate its access. SW expresses and exploits dependencies between web pages to yield focused search results. Manual annotation of web pages towards building a SW is hindered by at least two user-dependent factors: users do not agree on an annotation standard, which could be used to extricate their pages' interdependencies; and they are simply too lazy to use, undertake and maintain annotation of pages. In this paper, we present an alternative way to exploit web page dependencies: as users surf the net, they create a virtual surfing trail which can be shared with other users; this parallels social navigation for knowledge. We capture and use these trails to allow subsequent intelligent search of the web. People surfing the net with different interests and objectives do not leave similar and mutually beneficial trails. However, individuals in a given interest group produce trails that are of interest to the whole group. Moreover, special interest groups will be more highly motivated than casual users to rate the utility of pages they browse. In this paper, we introduce our system KAPUST 1.2 (Keeper And Processor of User Surfing Trails). It captures user trails as they search the internet. It constructs a semantic web structure from the trails. The semantic web structure is expressed as a conceptual lattice guiding future searches. KAPUST is deployed as e-learning software for an undergraduate class. First results indicate that it is indeed possible to process surfing trails into useful knowledge structures which can later be used to produce intelligent searching.

13.
Currently, computers are changing from single, isolated devices into entry points to a worldwide network of information exchange and business transactions called the World Wide Web (WWW). For this reason, support in data, information, and knowledge exchange has become a key issue in current computer technology. The success of the WWW has made it increasingly difficult to find, access, present, and maintain the information required by a wide variety of users. In response to this problem, many new research initiatives and commercial enterprises have been set up to enrich available information with machine-processable semantics. This semantic web will provide intelligent access to heterogeneous, distributed information, enabling software products (agents) to mediate between user needs and the information sources available. This paper summarizes ongoing research in the area of the semantic web, focusing especially on ontology technology.

14.
In this paper, we present a temporal web data model designed for warehousing historical data from the World Wide Web (WWW). As the Web is now populated with a large volume of information, it has become necessary to capture selected portions of web information in a data warehouse that supports further information processing such as data extraction, data classification, and data mining. Nevertheless, due to the unstructured and dynamic nature of the Web, the traditional relational model and its temporal variants cannot be used to build such a data warehouse. We therefore propose a temporal web data model that represents web documents and their connectivities in the form of temporal web tables. To represent web data that evolve with time, a visible time interval is associated with each web document. To manipulate temporal web tables, we have defined a set of web operators with capabilities ranging from extracting WWW information into web tables to merging information from different web tables. We further illustrate the use of our temporal web data model using some realistic motivating examples.

15.
Characterizing Web usage regularities with information foraging agents   Total citations: 1 (self-citations: 0, citations by others: 1)
Researchers have recently discovered several interesting, self-organized regularities from the World Wide Web, ranging from the structure and growth of the Web to the access patterns in Web surfing. What remains a great challenge in Web log mining is how to explain the user behavior underlying observed Web usage regularities. We address the issue of how to characterize the strong regularities in Web surfing in terms of user navigation strategies, and present an information foraging agent-based approach to describing user behavior. By experimenting with the agent-based decision models of Web surfing, we aim to explain how some Web design factors as well as user cognitive factors may affect the overall behavioral patterns in Web usage.

16.
The World Wide Web (WWW) has been recognized as the ultimate and unique source of information for the information retrieval and knowledge discovery communities. A tremendous amount of knowledge is recorded using various types of media, producing an enormous number of web pages in the WWW. Retrieval of required information from the WWW is thus an arduous task. Different schemes for retrieving web pages have been used by the WWW community. One of the most widely used schemes is to traverse predefined web directories to reach a user's goal. These web directories are compiled or classified folders of web pages and are usually organized into hierarchical structures. The classification of web pages into proper directories and the organization of directory hierarchies are generally performed by human experts. In this work, we provide a corpus-based method that applies a text mining technique to a corpus of web pages to automatically create web directories and organize them into hierarchies. The method is based on the self-organizing map learning algorithm and requires no human intervention during the construction of web directories and hierarchies. The experiments show that our method can produce comprehensible and reasonable web directories and hierarchies.

17.
Search engines are useful because they allow the user to find information of interest from the World Wide Web (WWW). However, most of the popular search engines today are textual; they do not allow the user to find images from the web. For effective retrieval, determining the semantics of the images is essential. In this paper, we describe the problems in determining the semantics of images on the WWW and the approach of AMORE, a WWW search engine that we have developed. AMORE's techniques can be extended to other media like audio and video. We explain how we assign keywords to the images based on HTML pages and the method to determine similar images based on the assigned text. We also discuss some statistics showing the effectiveness of our technique. Finally, we present the visual interface of AMORE with the help of several retrieval scenarios.

18.
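Computer networks are developing rapidly, and Web data mining has become an important research field. Web data are widely distributed, large in volume, varied in structure, and span long periods of time. How to query such massive data efficiently has become a concern for researchers. Genetic algorithms use population-based search, which helps in finding the optimal query result. Applying genetic algorithms to data querying, query optimization, and distributed data mining can greatly improve query results from several angles.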

19.
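An adaptive, noise-resistant PPM prediction model based on Web browsing characteristics is proposed. During construction, the model uses the inverse Gaussian distribution, which describes the depth of users' browsing, together with Web popularity characteristics to dynamically remove noise pages and stale data, controlling the size of the PPM model both vertically and horizontally. Experiments show that the model substantially reduces the effect of noisy data and predicts users' Web browsing behavior dynamically and well; it improves both prediction accuracy and storage complexity to some degree and effectively controls the network traffic caused by prefetching.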

20.
The WWW has become one of the most important media for sharing information. Web information provides another emerging and important avenue and source of competitive intelligence (CI) for companies. CI is critical for companies to stay competitive in the marketplace. Apart from business users, there are other types of CI users, such as technical users, casual users, news awareness users and others, who would like to be kept informed of the latest developments in their areas of interest over the WWW. To discover web information, CI users need to constantly monitor certain web sites and web pages for related information. However, the dynamic nature of the web has made this monitoring task complicated and time-consuming. This paper proposes a web monitoring system, WebMon, to help users monitor specified web pages for the latest changes and updates in information. Four monitoring functions, namely date monitoring, keywords monitoring, link monitoring and portion monitoring, are supported by the system. The performance of these monitoring functions is also evaluated.
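Two of the four monitoring functions named here, link monitoring and keywords monitoring, can be illustrated by comparing two snapshots of a page. The sketch below is not WebMon's implementation; it uses only Python's standard `html.parser`, and `LinkCollector`, `extract_links`, `link_changes`, and `keyword_hits` are hypothetical names.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)


def extract_links(html):
    """Return the set of link targets found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def link_changes(old_html, new_html):
    """Link monitoring: which links appeared or disappeared between snapshots."""
    old, new = extract_links(old_html), extract_links(new_html)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


def keyword_hits(html, keywords):
    """Keyword monitoring: case-insensitive substring match on the raw HTML.

    A simplification: markup is not stripped, so tag and attribute text
    can also match.
    """
    text = html.lower()
    return [k for k in keywords if k.lower() in text]


if __name__ == "__main__":
    old = '<a href="/a">A</a>'
    new = '<a href="/a">A</a><a href="/b">B</a><p>New product launch</p>'
    print(link_changes(old, new))
    print(keyword_hits(new, ["product", "merger"]))
```

Date and portion monitoring would need page metadata and a way to address a region of the page, which this sketch leaves out.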
