Similar literature
1.
Relational reinforcement learning (RRL) combines traditional reinforcement learning (RL) with a strong emphasis on a relational (rather than attribute-value) representation. Earlier work used RRL on a learning version of the classic Blocks World planning problem (a version where the learner does not know what the result of taking an action will be) and the Tetris game. Learning results based on the structure of training examples were obtained, such as learning in a mixed 3–5 block environment and being able to perform in a 3 or 10 block environment. Here, we instead take a function approximation approach to RL for the Blocks World problem. We obtain similar learning accuracies, with better running times, allowing us to consider much larger problem sizes. For instance, we can train on 15 blocks and then perform well on worlds with 100–800 blocks, using less running time than the relational method required to perform well for 3–10 blocks.

2.
It is well known that many hard tasks considered in machine learning and data mining can be solved in a rather simple and robust way with an instance- and distance-based approach. In this work we present another difficult task: learning, from large numbers of complex performances by concert pianists, to play music expressively. We model the problem as a multi-level decomposition and prediction task. We show that this is a fundamentally relational learning problem and propose a new similarity measure for structured objects, which is built into a relational instance-based learning algorithm named DISTALL. Experiments with data derived from a substantial number of Mozart piano sonata recordings by a skilled concert pianist demonstrate that the approach is viable. We show that the instance-based learner operating on structured, relational data outperforms a propositional k-NN algorithm. In qualitative terms, some of the piano performances produced by DISTALL after learning from the human artist are of substantial musical quality; one even won a prize in an international ‘computer music performance’ contest. The experiments thus provide evidence of the capabilities of ILP in a highly complex domain such as music. Editors: Tamás Horváth and Akihiro Yamamoto

3.
For each decision problem, there is a competence set, proposed by Yu (1990), consisting of the ideas, knowledge, information, and skills required for solving the problem. It is therefore reasonable to view a set of useful patterns discovered from a relational database by data mining techniques as the competence set needed for solving a problem. Significantly, when decision makers have not acquired the competence set, they may lack confidence in making decisions. In order to effectively acquire a needed competence set to cope with the corresponding problem, it is necessary to find appropriate learning sequences for acquiring those useful patterns, the so-called competence set expansion. This paper thus proposes an effective two-phase method to generate learning sequences. The first phase finds a competence set consisting of useful patterns by using a proposed data mining technique. The second phase expands that competence set with minimum learning cost by the minimum spanning table method (Feng and Yu, 1998). A numerical example shows that the data mining technique and the competence set expansion together can help decision makers solve decision problems and make better decisions.

4.
Attribute-value based representations, standard in today's data mining systems, have a limited expressiveness. Inductive Logic Programming provides an interesting alternative, particularly for learning from structured examples whose parts, each with its own attributes, are related to each other by means of first-order predicates. Several subsets of first-order logic (FOL) with different expressive power have been proposed in Inductive Logic Programming (ILP). The challenge lies in the fact that the more expressive the subset of FOL the learner works with, the more critical the dimensionality of the learning task. The Datalog language is expressive enough to represent realistic learning problems when data is given directly in a relational database, making it a suitable tool for data mining. Consequently, it is important to elaborate techniques that will dynamically decrease the dimensionality of learning tasks expressed in Datalog, just as Feature Subset Selection (FSS) techniques do in attribute-value learning. The idea of re-using these techniques in ILP runs immediately into a problem, as ILP examples have variable size and do not share the same set of literals. We propose here the first paradigm that brings Feature Subset Selection to the level of ILP, in languages at least as expressive as Datalog. The main idea is to first perform a change of representation, which approximates the original relational problem by a multi-instance problem. The resulting representation is suitable for FSS techniques, which we adapted from attribute-value learning by taking into account some of the characteristics of the data due to the change of representation. We present the simple FSS proposed for the task, the requisite change of representation, and the entire method combining those two algorithms. The method acts as a filter, preprocessing the relational data prior to model building, and outputs relational examples with empirically relevant literals. We discuss experiments in which the method was successfully applied to two real-world domains.

5.
Machine Learning for Information Extraction in Informal Domains (cited by 13: 0 self-citations, 13 by others)
Dayne Freitag. Machine Learning, 2000, 39(2-3): 169-202
We consider the problem of learning to perform information extraction in domains where linguistic processing is problematic, such as Usenet posts, email, and finger plan files. In place of syntactic and semantic information, other sources of information can be used, such as term frequency, typography, formatting, and mark-up. We describe four learning approaches to this problem, each drawn from a different paradigm: a rote learner, a term-space learner based on Naive Bayes, an approach using grammatical induction, and a relational rule learner. Experiments on 14 information extraction problems defined over four diverse document collections demonstrate the effectiveness of these approaches. Finally, we describe a multistrategy approach which combines these learners and yields performance competitive with or better than the best of them. This technique is modular and flexible, and could find application in other machine learning problems.

6.
Current trends clearly indicate that online learning has become an important learning mode. However, no effective assessment mechanism for learning performance yet exists for e-learning systems. Learning performance assessment aims to evaluate what learners learned during the learning process. Traditional summative evaluation only considers final learning outcomes, without considering the learning processes of learners. With the evolution of learning technology, learning portfolios in a web-based learning environment can be used to record the learning process, to evaluate the learning performance of learners, and to produce feedback that enhances their learning. Accordingly, this study presents a mobile formative assessment tool using data mining, which involves six computational intelligence theories, i.e. statistical correlation analysis, fuzzy clustering analysis, grey relational analysis, K-means clustering, fuzzy association rule mining and fuzzy inference, in order to identify the key formative assessment rules from the web-based learning portfolios of an individual learner and thereby improve web-based learning performance. In other words, the proposed method can help teachers to precisely assess the learning performance of individual learners using only the learning portfolios in a web-based learning environment. Hence, teachers can devote themselves to teaching and designing courseware, since they save a lot of time in measuring learning performance. More importantly, teachers can understand the main factors influencing learning performance in a web-based learning environment based on the interpretable learning performance assessment rules obtained. Experimental results indicate that the evaluation results of the proposed scheme are very close to those of summative assessment, and that the factor analysis provides simple and clear learning performance assessment rules. Furthermore, the proposed learning feedback with formative assessment can clearly promote the learning performance and interest of learners.

7.
Many data mining applications have a large amount of data, but labeling data is often difficult, expensive, or time consuming, as it requires human experts for annotation. Semi-supervised learning addresses this problem by using unlabeled data together with labeled data to improve the performance. Co-Training is a popular semi-supervised learning algorithm that assumes that each example is represented by two or more redundantly sufficient sets of features (views) and, additionally, that these views are independent given the class. However, these assumptions are not satisfied in many real-world application domains. In this paper, a framework called Co-Training by Committee (CoBC) is proposed, in which an ensemble of diverse classifiers is used for semi-supervised learning that requires neither redundant and independent views nor different base learning algorithms. The framework is a general single-view semi-supervised learner that can be applied on any ensemble learner to build diverse committees. Experimental results of CoBC using Bagging, AdaBoost and the Random Subspace Method (RSM) as ensemble learners demonstrate that error diversity among classifiers leads to an effective Co-Training style algorithm that maintains the diversity of the underlying ensemble.

8.
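A survey of research on applications of graphics processing units in data management (cited by 1: 0 self-citations, 1 by others)
This survey compares the architectures of the central processing unit (CPU) and the graphics processing unit (GPU), and briefly introduces the latest general-purpose GPU computing platforms and the differences between parallel algorithms on different architectures. It describes in detail the technical characteristics of GPU applications in spatial databases, relational databases, data streams, data mining, and information retrieval; discusses GPU-based internal and external memory sorting algorithms and their performance; describes GPU-based data structures and indexing techniques; and reviews work on optimizing GPU algorithms. Finally, it outlines the prospects for applying GPUs to data management and analyzes the challenges this field will face.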

9.
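kNN is a classical algorithm frequently used in machine learning and data mining programs. As the volume of data grows, the running time of kNN rises sharply. To make effective use of computing units such as the GPU in modern computers and reduce the computation time of kNN, a parallel kNN algorithm based on OpenCL is proposed. The algorithm parallelizes the two bottlenecks, distance computation and sorting: the distance computation stage uses a fine-grained parallelization strategy and an optimized thread model, and the sorting stage uses a bitonic sort with an optimized memory model. Tested on the UCI letter data set, running kNN on an E8400 (CPU) and a GTS450 (GPU), the GPU-accelerated parallel kNN algorithm achieved a 40.79-fold speedup over the CPU version.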
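
The two stages named in this abstract can be illustrated with a minimal sequential sketch in Python. This is not the paper's OpenCL code: the function names and the toy data are hypothetical, and the built-in sort stands in for the bitonic sort the authors use. It only shows where the two bottlenecks sit.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Stage 1 (distance computation): each distance is independent of the
    # others, which is why this stage suits fine-grained data parallelism.
    distances = [squared_distance(p, query) for p in train_points]
    # Stage 2 (sorting): order the training indices by distance. The built-in
    # sort is a stand-in here; it is not a bitonic sort.
    order = sorted(range(len(distances)), key=distances.__getitem__)
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data, not the UCI letter data set.
points = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
labels = ["a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (0.2, 0.1), k=3))
```

Because stage 1 touches every training point once per query, its cost grows linearly with the training set, which matches the abstract's remark that running time rises sharply with data volume.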

10.
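Moving the computationally intensive parts to the GPU is an effective way to accelerate classical data mining algorithms. This paper first introduces the characteristics of GPUs and the main GPU programming models, then reviews GPU-accelerated work for each major type of data mining task, including classification, clustering, association analysis, time-series analysis, and deep learning. Finally, two classes of classical collaborative filtering recommendation algorithms are implemented on both CPU and GPU, and experiments on the classic MovieLens data set verify the significant effect of the GPU in accelerating data mining applications, giving further insight into how GPU acceleration works and its practical significance.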

11.
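Most current data mining methods discover patterns in a single relation, whereas multi-relational data mining (MRDM) can extract valid patterns directly from multiple tables of a relational database. MRDM can solve problems that earlier propositional data mining methods cannot: it has greater representational power, can express and discover more complex patterns, and can make effective use of background knowledge during mining to improve efficiency and accuracy. In recent years, drawing on inductive logic programming (ILP) techniques, many multi-relational data mining methods have emerged, such as relational association rule mining and relational classification and clustering.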

12.
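Multi-relational data mining can be divided, by representation, into graph-based MRDM and logic-based MRDM. This paper discusses the relationship between graph-based data mining and graph-based relational learning, focusing on the graph-based relational learning algorithm Subdue and its strengths and weaknesses. To address its weaknesses, an optimized algorithm, F_Subdue, is proposed, which improves the subgraph isomorphism computation and reduces the number of subgraph isomorphism tests. Experimental results on real and synthetic data sets show that it is more efficient than the original algorithm. The paper closes with conclusions and directions for future work.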

13.
Feature selection methods often improve the performance of attribute-value learning. We explore whether also in relational learning, examples in the form of clauses can be reduced in size to speed up learning without affecting the learned hypothesis. To this end, we introduce the notion of safe reduction: a safely reduced example cannot be distinguished from the original example under the given hypothesis language bias. Next, we consider the particular, rather permissive bias of bounded treewidth clauses. We show that under this hypothesis bias, examples of arbitrary treewidth can be reduced efficiently. We evaluate our approach on four data sets with the popular system Aleph and the state-of-the-art relational learner nFOIL. On all four data sets we make learning faster in the case of nFOIL, achieving an order-of-magnitude speed up on one of the data sets, and more accurate in the case of Aleph.

14.
Most e-Learning systems store data about the learner’s actions in log files, which give us detailed information about learner behaviour. Data mining and machine learning techniques can give meaning to these data and provide valuable information for learning improvement. One area that is of particular importance in the design of e-Learning systems is learner motivation, as it is a key factor in the quality of learning and in the prevention of attrition. One aspect of motivation is engagement, a necessary condition for effective learning. Using data mining techniques for log file analysis, our research investigates the possibility of predicting users’ level of engagement, with a focus on disengaged learners. As demonstrated previously across two different e-Learning systems, HTML-Tutor and iHelp, disengagement can be predicted by monitoring the learners’ actions (e.g. reading pages and taking tests/quizzes). In this paper we present the findings of three studies that refine this prediction approach. Results from the first study show that two additional reading speed attributes can increase the accuracy of prediction. The second study suggests that distinguishing between two different patterns of disengagement (spending a long time on a page/test and browsing quickly through pages/tests) may improve prediction in some cases. The third study demonstrates the influence of exploratory behaviour on prediction, as most users at the first login familiarize themselves with the system before starting to learn.

15.
Inductive databases integrate database querying with database mining. In this article, we present an inductive database system that does not rely on a new data mining query language, but on plain SQL. We propose an intuitive and elegant framework based on virtual mining views, which are relational tables that virtually contain the complete output of data mining algorithms executed over a given data table. We show that several types of patterns and models that are implicitly present in the data, such as itemsets, association rules, and decision trees, can be represented and queried with SQL using a unifying framework. As a proof of concept, we illustrate a complete data mining scenario with SQL queries over the mining views, which is executed in our system.
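
The idea of querying mining output with plain SQL can be sketched as follows. This is a loose illustration under stated assumptions, not the authors' system: the `itemsets(items, size, support)` table is a hypothetical schema, it is materialized by brute force rather than kept virtual, and the transactions are made up. Python's standard `sqlite3` module stands in for the database.

```python
import sqlite3
from itertools import combinations

# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per itemset that occurs at least once.
conn.execute(
    "CREATE TABLE itemsets (items TEXT PRIMARY KEY, size INTEGER, support INTEGER)"
)

# Materialize every occurring itemset by brute force. A real mining view
# would be virtual and filled on demand by a mining algorithm.
all_items = sorted(set().union(*transactions))
for size in range(1, len(all_items) + 1):
    for combo in combinations(all_items, size):
        support = sum(1 for t in transactions if set(combo) <= t)
        if support > 0:
            conn.execute(
                "INSERT INTO itemsets VALUES (?, ?, ?)",
                (",".join(combo), size, support),
            )

# The mining task itself is now an ordinary SQL query over the table.
frequent_pairs = conn.execute(
    "SELECT items, support FROM itemsets "
    "WHERE size = 2 AND support >= 2 ORDER BY items"
).fetchall()
print(frequent_pairs)
```

The point of the sketch is the last query: once patterns live in a relational table, constraints such as minimum support become `WHERE` clauses rather than parameters of a dedicated mining language.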

16.
Evolutionary multi-feature construction for data reduction: A case study (cited by 1: 0 self-citations, 1 by others)
Real-world data are often prepared for purposes other than data mining and machine learning and, therefore, are represented by primitive attributes. When data representation is primitive, preprocessing data before looking for patterns becomes necessary. The low-level primitive representation of real-world problems facilitates the existence of complex interactions among attributes. If a lack of domain experts prevents traditional methods from uncovering patterns in data because of complex attribute interactions, then the use of soft computing techniques such as genetic algorithms becomes necessary. This article introduces MFE3/GADR, a data reduction method derived from the learning preprocessing system MFE3/GA. The method restructures the primitive data representation by capturing and compacting hidden information into new features in order to highlight regularities to the learner. We thoroughly analyze the empirical results obtained on the poker hand data set. The results show that this approach successfully compacts the set of low-level primitive attributes into a smaller set of highly informative features which outline patterns to the learner; thus, the new approach provides data reduction and leads to a smaller and more accurate classifier.

17.
On data mining, compression, and Kolmogorov complexity (cited by 1: 1 self-citation, 0 by others)
Will we ever have a theory of data mining analogous to the relational algebra in databases? Why do we have so many clearly different clustering algorithms? Could data mining be automated? We show that the answer to all these questions is negative, because data mining is closely related to compression and Kolmogorov complexity; and the latter is undecidable. Therefore, data mining will always be an art, where our goal will be to find better models (patterns) that fit our datasets as well as possible.

18.
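Ji Bin. 《微机发展》 (Microcomputer Development), 2008, 18(2): 126-128
With the rise of data mining, many methods for classification and prediction have appeared. Data mining research is mostly applied to relational databases, which makes rough set methods very convenient to apply. A relational table can be viewed as a decision table in rough set theory, and handling data mining with rough set theory has advantages that traditional mining tools lack. Rough set theory is a mathematical tool for dealing with uncertain and imprecise problems. The paper introduces the basic theory of rough sets through an example, and uses an example to describe in detail how rules are obtained with a variable precision rough set model on the basis of attribute reduction of the decision table. The example shows that applying rough set theory to data mining is very effective for incomplete information systems.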
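
The view of a relational table as a decision table can be made concrete with a short Python sketch of lower and upper approximations. It uses one common formulation of the variable precision model: an equivalence class joins the lower approximation when at least a fraction `beta` of it lies in the target set, and the upper approximation when more than `1 - beta` does. The table, the attribute names, and the `beta` parameterization are assumptions made for illustration, not the paper's own example, and attribute reduction is not shown.

```python
from collections import defaultdict


def equivalence_classes(table, attributes):
    """Group objects that agree on all the given condition attributes."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attributes)].add(obj)
    return list(classes.values())


def approximations(table, attributes, target, beta=1.0):
    """Return (lower, upper) approximations of `target`.

    beta = 1.0 gives the classical rough set approximations; 0.5 < beta < 1
    tolerates a bounded share of misclassified objects per class.
    """
    lower, upper = set(), set()
    for block in equivalence_classes(table, attributes):
        overlap = len(block & target) / len(block)
        if overlap >= beta:
            lower |= block
        if overlap > 1.0 - beta:
            upper |= block
    return lower, upper


# Hypothetical decision table: condition attributes plus the decision "flu".
table = {
    1: {"headache": "yes", "temp": "high", "flu": "yes"},
    2: {"headache": "yes", "temp": "high", "flu": "yes"},
    3: {"headache": "yes", "temp": "high", "flu": "yes"},
    4: {"headache": "yes", "temp": "high", "flu": "no"},
    5: {"headache": "no", "temp": "normal", "flu": "no"},
    6: {"headache": "no", "temp": "high", "flu": "yes"},
}
has_flu = {obj for obj, row in table.items() if row["flu"] == "yes"}

print(approximations(table, ["headache", "temp"], has_flu))
print(approximations(table, ["headache", "temp"], has_flu, beta=0.75))
```

In this toy table, objects 1 to 4 are indiscernible but disagree on the decision, so the classical model can derive a certain rule only from object 6. With `beta=0.75` the class {1, 2, 3, 4} is admitted to the lower approximation, which is the kind of tolerance that lets rules be extracted from inconsistent data.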

19.
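A teaching diagnosis and evaluation model based on OLAP technology (cited by 11: 0 self-citations, 11 by others)
Wang Lu, Li Yawen. 《计算机工程》 (Computer Engineering), 2003, 29(5): 49-50, 194
Building on the virtual learning community network teaching support platform of Capital Normal University, this paper proposes a teaching diagnosis and evaluation model based on online analytical processing (OLAP). Using the DMQL language, it defines a data cube with four dimensions (student, knowledge point, time, and cognitive skill) and presents a solution for mining this cube with OLAP operations such as roll-up, drill-down, dicing, and slicing. The proposed OLAP-based data mining scheme can address questions in teaching diagnosis and evaluation such as identifying learners' difficulties, understanding the learning characteristics of groups or individual learners, and capturing learners' cognitive processes, thereby supporting individualized teaching tailored to each student and student-centered autonomous learning.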
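
The roll-up and slice operations on such a four-dimensional cube can be sketched in a few lines of Python. The paper defines its cube in DMQL; the version below is only an analogy under stated assumptions: the fact rows, the `week` granularity for time, the score measure, and the function names are all hypothetical.

```python
from collections import defaultdict

# The four dimensions named in the abstract; "week" is an assumed granularity.
DIMENSIONS = ("student", "knowledge_point", "week", "skill")

# Hypothetical fact rows: one value per dimension, then a score measure.
facts = [
    ("s1", "recursion", 1, "recall", 80),
    ("s1", "recursion", 2, "apply", 40),
    ("s2", "recursion", 1, "recall", 60),
    ("s2", "sorting", 1, "apply", 90),
    ("s1", "sorting", 2, "apply", 70),
]


def roll_up(rows, keep):
    """Aggregate away every dimension not in `keep`, averaging the score."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(lambda: [0.0, 0])
    for row in rows:
        key = tuple(row[i] for i in idx)
        totals[key][0] += row[-1]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def slice_cube(rows, dimension, value):
    """Fix one dimension to a single value, keeping the matching rows."""
    i = DIMENSIONS.index(dimension)
    return [row for row in rows if row[i] == value]


# Which knowledge points are hard for the whole group?
print(roll_up(facts, ("knowledge_point",)))
# How does one learner do per cognitive skill?
print(roll_up(slice_cube(facts, "student", "s1"), ("skill",)))
```

The first query corresponds to finding group-level learning difficulties and the second to profiling an individual learner. Drill-down is the reverse of roll-up: call `roll_up` with more dimensions in `keep`.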

20.
A metapattern (also known as a metaquery) is a new approach for integrated data mining systems. As opposed to a typical “toolbox”-like integration, where components must be picked and chosen by users without much help, metapatterns provide a common representation for inter-component communication as well as a human interface for hypothesis development and search control. One weakness of this approach, however, is that the task of generating fruitful metapatterns is still a heavy burden for human users. In this paper, we describe a metapattern generator and an integrated discovery loop that can automatically generate metapatterns. Experiments in both artificial and real-world databases have shown that this new system goes beyond the existing machine learning technologies, and can discover relational patterns without requiring humans to pre-label the data as positive or negative examples for some given target concepts. With this technology, future data mining systems could discover high-quality, human-comprehensible knowledge in a much more efficient and focused manner, and data mining could be managed easily by both expert and less-expert users.
