首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 140 毫秒
1.
Web使用模式挖掘的研究   总被引:6,自引:0,他引:6  
Web挖掘是传统数据挖掘技术在Web环境下的应用,Web挖掘分为Web内容挖掘、Web结构挖掘和Web使用模式挖掘。Web使用模式挖掘是从用户浏览网站的数据中抽取感兴趣的模式,理解用户的浏览兴趣行为,以便进一步改善网站结构或为用户提供个性化的服务。文章主要论述了Web使用模式挖掘。  相似文献   

2.
网络使用挖掘是通过分析记录在Web服务器上的用户使用数据,来自动发现用户访问信息网的方式。其挖掘结果可以用于改善网站设计、商业决策支持、个性化服务等方面。序列模式分析是数据挖掘使用的模式分析的一种方式。本文主要介绍了一种适应复杂条件限制的序列模式分析在网络使用挖掘中的应用及其一般步骤。  相似文献   

3.
Hierarchical semistructured data arise frequently in the Web, or in biological information processing applications. Semistructured objects describing the same type of information have similar but not identical structure. Usually they share some common ‘schema’. Finding the common schema of a collection of semistructured objects is a very important task and due to the huge amount of such data encountered, data mining techniques have been employed.In this paper, we study the problem of discovering frequently occurring structures in semistructured objects using the notion of association rules. We identify that discovering the frequent structures in the early phases of the mining procedure is the dominant cost and we provide a fast algorithm addressing this issue. We present experimental results, which demonstrate the superiority of the proposed algorithm and also its efficiency in reducing dramatically the processing cost.  相似文献   

4.
Web使用模式挖掘是数据挖掘技术在 Web领域的应用。介绍了 Web使用模式挖掘的基本概况 ,重点讨论了 Web使用模式挖掘过程中的几个关键问题 ,即源数据的收集与集成 ,挖掘方法的不断更新及 Web使用模式分析等问题  相似文献   

5.
Web使用挖掘研究   总被引:5,自引:1,他引:5  
Web数据挖掘是数据挖掘技术在Web信息仓库中的应用与研究。Web数据挖掘包括Web内容挖掘、Web结构挖掘和Web使用挖掘三个研究方向,文中研究的重点是Web使用挖掘。Web使用挖掘研究的主要对象是用户的使用记录,研究的主要过程包括数据预处理、模式发现和模式分析。文中详细介绍了Web使用挖掘的最新研究成果,并对将来技术的研究方向和发展趋势进行了探讨性的预测与分析,为进一步的理论研究和实际应用工作提供了指导性的建议。  相似文献   

6.
The rapid growth of the World Wide Web has resulted in more data being accessed over the Internet. In turn there is an increase in the use of semistructured data, which plays a crucial role in many web applications particularly with the introduction of XML and its related technologies. This increase in use makes the design of good semistructured data structures essential. The Object Relationship Attribute model for Semistructured data (ORA-SS) is a graphical notation for designing and representing semistructured data. In this paper, we demonstrate an approach to formally validate the ORA-SS data models in order to enhance the correctness of semistructured data design. A mathematical semantics for the ORA-SS notation is defined using the Z formal language, and further validation processes are carried out to check the correctness of the semistructured data models at both the schema and instance levels.  相似文献   

7.
Web挖掘研究   总被引:27,自引:2,他引:27  
Internet的迅速发展,使得worldwideweb已经成为一个巨大的、蕴涵着具有潜在价值知识的分布式信息空间,为数据挖掘研究提供了丰富的资源的同时也提出了新的挑战。该文首先概述了数据挖掘的概念、挖掘算法及其主要应用领域,然后结合Web数据的多样性、丰富和动态的超链接信息以及Web用户访问信息,详细阐述了Web内容挖掘、Web结构挖掘和Web用户访问信息挖掘的概念、定义、主要的挖掘算法及最新研究进展,文章最后介绍了Web挖掘的研究方向和发展趋势。  相似文献   

8.
Many modern applications(e-commerce,digital library,etc.)require integrated access to various information sources(from tr5aditional RDBMS to semistructured Web repositories).Extracting schema from semistructured data is a prereuisite to integrated heterogeneous information sources.The traditional method that extracts global schema may require time (and space)to increase exponentially with the number of objects and edges in the source.A new method is presented in this paper.which is about extracting local schema,In this method,the algorithm controls the scale of extracting schema within the “schema diameter“ by examining the semantic distance of the target set and using the Hash class and its path distance operation.This method is very efficient for restraining schema from expanding.The prototype validates the new approach.  相似文献   

9.
近几年来,不确定数据广泛出现在传感器网络、Web应用等领域中。不确定数据挖掘已经成为了新的研究热点,主要包括聚类、分类、频繁项集挖掘、孤立点检测等方面,其中频繁项集挖掘是重点研究的问题之一。综述了传统的频繁项集挖掘的两类基本算法,分析了在此基础上提出的适用于不确定数据以及不确定数据流的频繁项集挖掘的方法,并探讨了今后可能的研究方向。  相似文献   

10.
Current query languages for the Web (e.g., W3QL, WebLog and WebSQL) explore the structure of the Web. However, usually, the structure of the Web has little to do with the semantics of the data. Therefore, it is practically difficult to pose database queries over the Web. We introduce a new type of tags for denoting the semantics of data stored in HTML pages. These semantic tags (implemented as HTML comments) superimpose on HTML pages semistructured objects in the style of the OEM model. The paper discusses two implemented tools for fully utilizing the semantics. The first is a visualization tool for displaying both the HTML reading of Web pages and the OEM reading of Web pages. The second tool is a query language, similar to LOREL, that can query the HTML structure and/or the OEM reading. The above formalism and tools provide data-modeling capabilities for the Web that fit its heterogeneous nature. Real database queries, taking the OEM point of view, can be formulated, including queries about the schema as well as queries about the HTML structure of Web pages. Therefore, the query language is not restricted to portions of the Web in which semantic tags are used.  相似文献   

11.
基于Web的数据挖掘技术   总被引:4,自引:0,他引:4  
对Web数据挖掘技术的国内外研究成果进行了评价.阐述了Web数据挖掘的流程及其特点,针对Web内容挖掘、Web结构挖掘、Web使用挖掘的方法及实现技术分别进行了讨论分析,介绍了Web数据挖掘的典型应用,并对该领域进一步研究的问题进行了展望。  相似文献   

12.
国际互联网的广泛应用使得数据挖掘技术在Web数据挖掘得到了最大的发展,文章就Web数据挖掘技术的存储数据源、分类、实现技术作了详细的阐述,并介绍了一些实用的Web挖掘工具,对Web数据挖掘进行了探讨和分析,并指出了国内外的发展趋势和待解决的问题。  相似文献   

13.
Web数据挖掘分析   总被引:1,自引:0,他引:1  
国际互联网的广泛应用使得数据挖掘技术在Web数据挖掘得到了最大的发展,文章就Web数据挖掘技术的存储数据源、分类、实现技术作了详细的阐述,并介绍了一些实用的Web挖掘工具,对Web数据挖掘进行了探讨和分析,并指出了国内外的发展趋势和待解决的问题。  相似文献   

14.
Complete Mining of Frequent Patterns from Graphs: Mining Graph Data   总被引:16,自引:0,他引:16  
Basket Analysis, which is a standard method for data mining, derives frequent itemsets from database. However, its mining ability is limited to transaction data consisting of items. In reality, there are many applications where data are described in a more structural way, e.g. chemical compounds and Web browsing history. There are a few approaches that can discover characteristic patterns from graph-structured data in the field of machine learning. However, almost all of them are not suitable for such applications that require a complete search for all frequent subgraph patterns in the data. In this paper, we propose a novel principle and its algorithm that derive the characteristic patterns which frequently appear in graph-structured data. Our algorithm can derive all frequent induced subgraphs from both directed and undirected graph structured data having loops (including self-loops) with labeled or unlabeled nodes and links. Its performance is evaluated through the applications to Web browsing pattern analysis and chemical carcinogenesis analysis.  相似文献   

15.
如何有效地分析用户的需求,帮助用户从因特网的信息海洋中发现他们感兴趣的信息和资源.已经成为一项迫切而重要的课题。解决这些问题的一个途径,就是将传统的数据挖掘技术与Web结合起来,进行Web数据挖掘。其中的Web日志挖掘可以掌握用户在浏览站点时的行为,并且将挖掘出的用户访问模式应用于网站上,在改善Web站点的结构以及页面间的超链接结构,提高站点的服务质量等方面有重要的意义。  相似文献   

16.
如何有效地分析用户的需求,帮助用户从因特网的信息海洋中发现他们感兴趣的信息和资源,已经成为一项迫切而重要的课题。解决这些问题的一个途径,就是将传统的数据挖掘技术与Web结合起来,进行Web数据挖掘。其中的Web日志挖掘可以掌握用户在浏览站点时的行为,并且将挖掘出的用户访问模式应用于网站上,在改善Web站点的结构以及页面间的超链接结构,提高站点的服务质量等方面有重要的意义。  相似文献   

17.
电子商务是随着网络的发展产生的一种新兴事物,电子商务的迅速崛起,使得不管是商家还是客户对基于Web数据检索、挖掘等需求不断提高。目前静态结构的Web页面显然已经被众多个性化的动态结构站点所代替。网站如何根据Web服务器日志文件,客户交易数据中挖掘出有意义的用户访问模式和潜在的客户群,为企业提供全方位信息服务和开展有针对性的电子商务活动。针对电子商务方面论述了数据挖掘的优势和应用。介绍了数据挖掘、数据挖掘的分类、电子商务中Web数据挖掘的步骤等。  相似文献   

18.
Web挖掘研究   总被引:289,自引:4,他引:285  
因特网目前是一个巨大,分布广泛,全球性的信息服务中心,它涉及新闻,广告,消费信息,金融管理,教育,政府,电子商务和许多其它信息服务,Web包含了丰富和动态的超链接信息,以及Web页面的访问和使用信息,这为数据挖掘提供了丰富的资源,Web挖掘就是从Web活动中抽取感兴趣的潜在有用模式和隐藏的信息,对Web挖掘最新技术及发展方向做了全面分析,包括Web结构挖掘,多层次Web数据仓库方法以及W eb,Log挖掘等。  相似文献   

19.
Web日志挖掘中的数据预处理技术研究   总被引:30,自引:0,他引:30  
赵伟  何丕廉  陈霞  谢振亮 《计算机应用》2003,23(5):62-64,67
在Web数据挖掘研究领域中,Web日志挖掘是Web数据挖掘研究领域中一个最重要的应用方面。而数据预处理在Web日志挖掘过程中起着至关重要的作用。文中深入探讨了数据预处理环节的主要任务,并介绍这个过程中一些特殊情况的处理方法。  相似文献   

20.
Web mining involves the application of data mining techniques to large amounts of web-related data in order to improve web services. Web traversal pattern mining involves discovering users’ access patterns from web server access logs. This information can provide navigation suggestions for web users indicating appropriate actions that can be taken. However, web logs keep growing continuously, and some web logs may become out of date over time. The users’ behaviors may change as web logs are updated, or when the web site structure is changed. Additionally, it can be difficult to determine a perfect minimum support threshold during the data mining process to find interesting rules. Accordingly, we must constantly adjust the minimum support threshold until satisfactory data mining results can be found.The essence of incremental data mining and interactive data mining is the ability to use previous mining results in order to reduce unnecessary processes when web logs or web site structures are updated, or when the minimum support is changed. In this paper, we propose efficient incremental and interactive data mining algorithms to discover web traversal patterns that match users’ requirements. The experimental results show that our algorithms are more efficient than other comparable approaches.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号