Similar literature
1.
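Outlines the components, taxonomy, and current state of Web mining, and points out the limitations of some existing Web mining methods. Introduces a relatively new technology, soft computing, and summarizes its applications in Web mining. Soft computing is well suited to the inherently unlabeled, imprecise, heterogeneous, and dynamic nature of Web data, as well as to human-machine interaction, context sensitivity and approximate queries, and personalized learning.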

2.
Data mining in soft computing framework: a survey   Total citations: 19, self-citations: 0, citations by others: 19
The present article provides a survey of the available literature on data mining using soft computing. A categorization has been provided based on the different soft computing tools and their hybridizations used, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally, fuzzy sets are suitable for handling the issues related to understandability of patterns, incomplete/noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms to select a model, from mixed media data, based on some preference criterion/objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and the application of soft computing methodologies are indicated. An extensive bibliography is also included.

3.
Distributed data mining implements techniques for analyzing data on distributed computing systems by exploiting data distribution and parallel algorithms. The grid is a computing infrastructure for implementing distributed high-performance applications and solving complex problems, offering effective support to the implementation and use of data mining and knowledge discovery systems. The Web Services Resource Framework has become the standard for the implementation of grid services and applications, and it can be exploited for developing high-level services for distributed data mining applications. This paper describes how distributed data mining patterns, such as collective learning, ensemble learning, and meta-learning models, can be implemented as Web Services Resource Framework mining services by exploiting the grid infrastructure. The goal of this work was to design a distributed architectural model that can be exploited for different distributed mining patterns deployed as grid services for the analysis of dispersed data sources. To validate this approach, we also present the implementation of two clustering algorithms on the developed architecture. In particular, distributed k-means and distributed expectation maximization are used as pilot examples to show the suitability of the implemented service-oriented framework. An extensive evaluation of its performance is provided. Copyright © 2011 John Wiley & Sons, Ltd.

4.
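Applications of soft computing methods in data mining   Total citations: 5, self-citations: 0, citations by others: 5
Gives a comprehensive account of data mining and soft computing methods, and analyzes the characteristics of different soft computing methods, including fuzzy logic, neural networks, neuro-fuzzy methods, genetic algorithms, rough sets, and hybrid methods, with particular attention to fuzzy logic. In light of the current state of soft computing applications in data mining, it points out the challenges facing data mining and the prospects for applying soft computing methods.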

5.
The service-oriented architecture paradigm can be exploited for the implementation of data and knowledge-based applications in distributed environments. The Web services resource framework (WSRF) has recently emerged as the standard for the implementation of Grid services and applications. WSRF can be exploited for developing high-level services for distributed data mining applications. This paper describes Weka4WS, a framework that extends the widely used open source Weka toolkit to support distributed data mining on WSRF-enabled Grids. Weka4WS adopts the WSRF technology for running remote data mining algorithms and managing distributed computations. The Weka4WS user interface supports the execution of both local and remote data mining tasks. On every computing node, a WSRF-compliant Web service is used to expose all the data mining algorithms provided by the Weka library. The paper describes the design and implementation of Weka4WS using the WSRF libraries and services provided by Globus Toolkit 4. A performance analysis of Weka4WS for executing distributed data mining tasks in different network scenarios is presented. Copyright © 2008 John Wiley & Sons, Ltd.

6.
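This paper studies the analysis of Web data mining algorithms. It first outlines the key technologies of cloud computing and stresses the importance of extracting useful information from massive amounts of information. It then points out that performing Web data mining in a cloud computing environment offers more data mining solutions. Finally, it discusses the algorithms commonly used for Web data mining in a cloud computing environment.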

7.
杨楠  罗省贤 《数字社区&智能家居》2011,(19):4526-4528,4536
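Web data mining combines data mining technology with the Web: it uses data mining techniques to extract interesting, useful patterns and implicit information from WWW-related resources and behavior. The paper summarizes the key technologies of cloud computing, introduces the definition, process, and classification of Web data mining, and introduces the Web Graph, a graph-theoretic data structure related to Web data mining. The focus is on the algorithm for mining frequent subgraphs in the Web Graph on a cloud computing platform, 抑rior...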

8.
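Web usage mining is the application of data mining techniques to Web information repositories. It predicts user browsing behavior from knowledge mined from Web server logs and is an important research direction in Web mining. The knowledge discovered, or some unexpected rules, are often imprecise and incomplete, which calls for soft computing techniques such as rough sets. The paper proposes a clustering method based on rough approximation that clusters Web transactions from Web access logs. The method mines Web log records effectively and thereby discovers the patterns in which users access Web pages.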

9.
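In recent years, computer science and technology have developed rapidly and play an increasingly important role in people's lives, work, and study. The Internet offers very rich information resources, but fragmented and massive data also greatly increase the cost and time needed to obtain valuable information. Web data mining on cloud computing platforms now makes the processing and analysis of massive data much more convenient. Research on Web data mining on cloud computing platforms aims to further improve and optimize Web structure mining, reduce the cost of storing and processing large volumes of data, and raise system efficiency. This paper briefly introduces cloud computing and Web data mining, and describes a Web data mining system on a cloud computing platform.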

10.
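A survey of research on Web Services, a new network architecture   Total citations: 17, self-citations: 0, citations by others: 17
As a new loosely coupled distributed computing paradigm, Web services have become a focus of current research in the industry. From the perspective of the convergence of distributed computing, Grid computing, XML, and related technologies, this paper argues that Web services are the fourth stage in the development of distributed technology, and compares them comprehensively with the technical characteristics of the previous three stages. It proposes a new Web Services architecture model, the RSRPM model, and, on the basis of a formal definition, describes its implementation steps, methods, and protocols. In addition, the paper compares and analyzes current applications of Web services and the main development platforms, and points out the main technical challenges and development trends of Web services.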

11.
Web services technology is critical for the success of business integration and other application fields such as bioinformatics. However, there are two challenges facing the practicality of Web services: (a) efficient location of the Web service registries that contain the requested Web services and (b) efficient retrieval of the requested services from these registries with high quality of service (QoS). The main reason for this problem is that current Web services technology is not semantic-oriented. Several proposals have been made to add semantics to Web services to facilitate discovery and composition of relevant Web services. Such proposals are referred to as semantic Web services (SWS). However, most of these proposals do not address the second problem, the retrieval of Web services with high QoS. In this paper, we propose a framework called soft semantic Web services agent (soft SWS agent) for providing high-QoS semantic Web services using soft computing methodology. Since different application domains have different requirements for QoS, it is impractical to use classical mathematical modeling methods to evaluate the QoS of semantic Web services. We use fuzzy neural networks with Genetic Algorithms (GA) as our case study. Simulation results show that the soft computing methodology can handle fuzzy and uncertain QoS metrics effectively.

12.
Nowadays, the application of Web mining techniques in e-learning and Web-based adaptive educational systems is increasing exponentially. In this paper, we propose an advanced architecture for a personalization system to facilitate Web mining. A specific Web mining tool is developed and a recommender engine is integrated into the AHA! system in order to help the instructor carry out the whole Web mining process. Our objective is to be able to recommend to a student the most appropriate links/Web pages within the AHA! system to visit next. Several experiments are carried out with real data provided by Eindhoven University of Technology students in order to test both the proposed architecture and the algorithms used. Finally, we also describe the meaning of several recommendations, starting from the rules discovered by the Web mining algorithms.

13.
14.
Web Services: a new architecture for distributed networks   Total citations: 6, self-citations: 1, citations by others: 6
饶元 《计算机工程》2004,30(22):1-3
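As a new distributed network architecture, Web services have become a focus of current research. This paper analyzes the development history and definition of Web services from the perspective of the convergence of distributed computing with Grid computing, XML, and related technologies. It proposes a new Web Services architecture model, RSRPM, together with a formal definition, and analyzes and describes the core steps, command methods, and core protocol stack required for its implementation. In addition, based on an analysis of current applications of Web services, it points out their limitations, four main technical problems, and future development trends.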

15.
Web site owners have trouble identifying customer purchasing patterns from their Web logs because the two aren't directly related. Thus, organizations must understand their customers' behavior, preferences, and future needs. This imperative leads many companies to develop a great many e-service systems for data collection and analysis. Web mining is a popular technique for analyzing visitor activities in e-service systems. It mainly includes Web text mining, Web structure mining and Web log mining. Our Web log mining approach classifies a particular site's visitors into different groups on the basis of their purchase interest.

16.
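Mining Web logs to reduce the time cost of information searching   Total citations: 4, self-citations: 0, citations by others: 4
How to optimize site design on the basis of user behavior is an important research problem. The paper proposes a new Web usage mining scheme that supports site design optimization. Based on the search paths recorded in Web logs, the scheme computes the average time users spend finding their targets and uses it to quantify the search cost of Web pages. On this basis, it proposes an efficient data mining method that finds a set of hyperlinks able to shorten search paths effectively, that is, to reduce the time cost. Experiments show that the mining results provide much useful information and help administrators promptly discover problems in site design.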
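
The cost-quantification step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the paper's algorithm: it assumes each session is a time-ordered list of (page, timestamp in seconds) pairs, that the last page of a session is the user's target, and that the search cost of a target is the time elapsed between the start of the session and arrival at the target. The function name `average_search_cost` and the session format are hypothetical.

```python
from collections import defaultdict


def average_search_cost(sessions):
    """Return the mean time (seconds) users spent reaching each target page.

    Assumption: `sessions` is a list of sessions, each a time-ordered list of
    (page, timestamp_seconds) tuples, and the last page is the search target.
    """
    totals = defaultdict(float)  # summed search time per target page
    counts = defaultdict(int)    # number of search paths per target page

    for session in sessions:
        if len(session) < 2:
            # A single page view contains no search path to measure.
            continue
        start_time = session[0][1]
        target_page, arrival_time = session[-1]
        totals[target_page] += arrival_time - start_time
        counts[target_page] += 1

    return {page: totals[page] / counts[page] for page in totals}


if __name__ == "__main__":
    logs = [
        [("home", 0), ("list", 20), ("item", 50)],
        [("home", 100), ("item", 130)],
        [("home", 0)],
    ]
    print(average_search_cost(logs))
```

Pages with a high average cost would be natural candidates for a new shortcut hyperlink, although how the paper actually selects such links is not stated in the abstract.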

17.
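Web document clustering based on Web log mining   Total citations: 2, self-citations: 1, citations by others: 2
Web log mining is one kind of Web mining. The paper introduces the general process of Web log mining, studies the k-means clustering algorithm, and analyzes its shortcomings. In every iteration, k-means must compute the distance from each data object to every cluster centroid, which makes clustering inefficient. To address this, the paper proposes an improved k-means algorithm that avoids recomputing the distances from data objects to cluster centroids, and uses both algorithms to cluster Web documents. Experimental results show that the improved algorithm raises clustering efficiency.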
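
The abstract does not say how the improved algorithm avoids the repeated distance computations, so the sketch below shows one possible shortcut and should not be read as the paper's method. It caches each point's distance to its assigned centroid; if the point is no farther from that centroid after the centroid moves, it keeps its label and the scan over the other centroids is skipped. This is a heuristic that may assign some points differently from standard k-means. The function name `kmeans_cached` and the returned distance counter are assumptions made for illustration.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_cached(points, centroids, max_iter=100):
    """Heuristic k-means that skips full scans using cached distances.

    Returns (labels, centroids, number_of_distance_computations).
    """
    centroids = [tuple(c) for c in centroids]
    labels = [None] * len(points)
    cached = [math.inf] * len(points)  # distance to the assigned centroid
    n_dist = 0

    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            if labels[i] is not None:
                d = dist(p, centroids[labels[i]])
                n_dist += 1
                if d <= cached[i]:
                    # Not farther from its own centroid: keep the label
                    # and skip the other k - 1 distance computations.
                    cached[i] = d
                    continue
            # Full scan over all centroids, as in standard k-means.
            best, best_d = 0, math.inf
            for j, c in enumerate(centroids):
                dj = dist(p, c)
                n_dist += 1
                if dj < best_d:
                    best, best_d = j, dj
            if best != labels[i]:
                changed = True
            labels[i], cached[i] = best, best_d

        # Recompute centroids; an empty cluster keeps its old centroid.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centroids.append(mean(members) if members else old)

        if not changed and new_centroids == centroids:
            break
        centroids = new_centroids

    return labels, centroids, n_dist
```

For Web document clustering, each point would be a document feature vector (for example, term weights); that representation is likewise an assumption, since the abstract does not describe it.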

18.
Yu  Xiao  Li  Qing  Liu  Jin 《World Wide Web》2019,22(1):295-324
World Wide Web - The performance of the existing parallel sequential pattern mining algorithms is often unsatisfactory due to high IO overhead and imbalanced load among the computing nodes. To...

19.
Semistructured data are specified without any fixed and rigid schema, even though some implicit structure typically appears in the data. The huge number of on-line applications makes it important and imperative to mine the schema of semistructured data, both for users (e.g., to gather useful information and facilitate querying) and for systems (e.g., to optimize access). The critical problem is to discover the hidden structure in the semistructured data. Current methods for extracting Web data structure are either general and independent of the application background, or bound to some concrete environment such as HTML or XML. Both are expensive and have difficulty keeping up with the frequent and complicated variations of Web data. In this paper, the problem of incrementally mining the schema of semistructured data after the raw data are updated is discussed. An algorithm for incrementally mining the schema of semistructured data is provided, and some experimental results are given, which show that incremental mining of semistructured data is more efficient than non-incremental mining.

20.
Since their early development, computers have had a profound impact on how we conduct modern scientific research. The disciplines of mathematics and operations research are perhaps the earliest to be dramatically transformed by information technology. However, over the years, computing technologies have provided many new opportunities for information processing, problem solving and knowledge creation. In this paper, we explore the potential of data mining technology for providing support for systematic theory testing based on Peirce's theory of abduction. We propose a data mining approach to abducting and evaluating hypotheses based on Peirce's scientific method. We believe that this approach could help scientists explore alternative hypotheses for existing theories more efficiently. We demonstrate our approach with empirical observations collected using instruments from the well-known user performance area of information systems research.

