20 similar documents were found; search time: 31 ms.
1.
2.
Coding Decision Trees (total citations: 4; self-citations: 0; citations by others: 4)
3.
Combining Classifiers with Meta Decision Trees (total citations: 4; self-citations: 0; citations by others: 4)
The paper introduces meta decision trees (MDTs), a novel method for combining multiple classifiers. Instead of giving a prediction, MDT leaves specify which classifier should be used to obtain a prediction. We present an algorithm for learning MDTs based on the C4.5 algorithm for learning ordinary decision trees (ODTs). An extensive experimental evaluation of the new algorithm is performed on twenty-one data sets, combining classifiers generated by five learning algorithms: two algorithms for learning decision trees, a rule learning algorithm, a nearest neighbor algorithm and a naive Bayes algorithm. In terms of performance, stacking with MDTs combines classifiers better than voting and stacking with ODTs. In addition, the MDTs are much more concise than the ODTs and are thus a step towards comprehensible combination of multiple classifiers. MDTs also perform better than several other approaches to stacking.
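The core MDT idea can be sketched in a few lines. This is a hypothetical toy illustration, not the paper's C4.5-based learner: the classifiers, the meta-attribute, and the single split are all invented for the example.

```python
# Hypothetical sketch of the meta decision tree (MDT) idea: interior nodes
# test meta-attributes of an instance, and each leaf names which base
# classifier to trust, instead of predicting a class directly.

def base_a(x):
    # Toy base classifier: accurate when the first feature is large.
    return 1 if x[0] > 0.5 else 0

def base_b(x):
    # Toy base classifier: accurate when the second feature is large.
    return 1 if x[1] > 0.5 else 0

def mdt_predict(x, confidence_a):
    """A one-split MDT: if the meta-attribute says base_a is confident
    on x, delegate to base_a, otherwise delegate to base_b."""
    chosen = base_a if confidence_a(x) >= 0.5 else base_b
    return chosen(x)

# Example: route by a toy meta-attribute (here, the first feature itself).
conf = lambda x: x[0]
print(mdt_predict((0.9, 0.1), conf))  # delegates to base_a -> 1
print(mdt_predict((0.2, 0.8), conf))  # delegates to base_b -> 1
```

The point of the leaf is delegation: the MDT never emits a class label itself, which is what keeps the combined model small and readable.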
4.
Multivariate Decision Trees (total citations: 24; self-citations: 0; citations by others: 24)
Unlike a univariate decision tree, a multivariate decision tree is not restricted to splits of the instance space that are orthogonal to the features' axes. This article addresses several issues for constructing multivariate decision trees: representing a multivariate test, including symbolic and numeric features, learning the coefficients of a multivariate test, selecting the features to include in a test, and pruning of multivariate decision trees. We present several new methods for forming multivariate decision trees and compare them with several well-known methods. We compare the different methods across a variety of learning tasks, in order to assess each method's ability to find concise, accurate decision trees. The results demonstrate that some multivariate methods are in general more effective than others (in the context of our experimental assumptions). In addition, the experiments confirm that allowing multivariate tests generally improves the accuracy of the resulting decision tree over a univariate tree.
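The contrast between univariate and multivariate tests can be made concrete. A minimal sketch, with invented weights and thresholds; a real learner would fit the coefficients from data as the article describes:

```python
# A univariate test is axis-parallel; a multivariate (oblique) test splits
# along a linear combination of features. Weights here are illustrative.

def univariate_test(x, feature=0, threshold=0.5):
    # Split orthogonal to one feature axis.
    return x[feature] > threshold

def multivariate_test(x, weights=(1.0, -1.0), threshold=0.0):
    # Split along a tilted hyperplane: w . x > threshold.
    return sum(w * xi for w, xi in zip(weights, x)) > threshold

point = (0.6, 0.2)
print(univariate_test(point))    # True: x0 > 0.5
print(multivariate_test(point))  # True: x0 - x1 = 0.4 > 0
```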
5.
Hongyan Liu Jun He Tingting Wang Wenting Song Xiaoyang Du 《Electronic Commerce Research and Applications》2013,12(1):14-23
Recommendation systems represent a popular research area with a variety of applications. Such systems provide personalized services to the user and help address the problem of information overload. However, traditional recommendation methods such as collaborative filtering suffer from low accuracy because of data sparseness. We propose a novel recommendation algorithm based on analysis of online reviews. The algorithm incorporates two new methods for opinion mining and recommendation. As opposed to traditional methods, which are usually based on the similarity of ratings to infer user preferences, the proposed recommendation method analyzes the difference between the ratings and opinions of the user to identify the user's preferences. This method considers both explicit ratings and implicit opinions, which helps address the problem of data sparseness. We propose a new feature and opinion extraction method, based on the characteristics of online reviews, to effectively extract the opinion of the user from a customer review written in Chinese. Based on these methods, we also conduct an empirical study of online restaurant customer reviews to create a restaurant recommendation system and demonstrate the effectiveness of the proposed methods.
6.
Induction of Decision Trees (total citations: 390; self-citations: 5; citations by others: 390)
The technology for building knowledge-based systems by inductive inference from examples has been demonstrated successfully in several practical applications. This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, and it describes one such system, ID3, in detail. Results from recent studies show ways in which the methodology can be modified to deal with information that is noisy and/or incomplete. A reported shortcoming of the basic algorithm is discussed and two means of overcoming it are compared. The paper concludes with illustrations of current research directions.
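At the heart of ID3 is entropy-based attribute selection: split on the attribute with the largest information gain. An illustrative sketch (not Quinlan's original code), on an invented toy data set:

```python
# Entropy-based attribute selection as used by ID3: pick the attribute
# whose split yields the largest information gain.
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Gain of splitting (rows, labels) on attribute index `attr`."""
    total = entropy(labels)
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return total - remainder

# Tiny example: attribute 0 separates the classes perfectly, attribute 1 doesn't.
rows = [("sunny", "hot"), ("sunny", "cool"), ("rain", "hot"), ("rain", "cool")]
labels = ["no", "no", "yes", "yes"]
print(information_gain(rows, labels, 0))  # 1.0 bit: perfect split
print(information_gain(rows, labels, 1))  # 0.0 bits: no information
```

ID3 applies this selection recursively, growing a subtree for each value of the chosen attribute until the remaining labels are pure.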
7.
8.
We suppose that a given instance may have more than one possible classification. A possibility distribution is assigned at each terminal node of a fuzzy decision tree, and the possibility distribution of a given instance with known attribute values is determined by simple fuzzy reasoning. This diminishes the inconsistency of forcing a single class onto a given instance.
9.
Locating neighbour users by similarity alone has a serious negative impact on the performance of traditional collaborative filtering algorithms. We introduce the trust mechanism of social networks, modelling from the perspectives of an individual's subjective trust within a social circle and global reputation. Direct trust is generated by separately considering user interactions, rating differences, and user-preference adjustment; indirect trust is generated by aggregation using reputation and an expert-trust-priority model, and the two are dynamically weighted to form trust relationships between users. A parameter [η] coordinates the dual attributes of trust and similarity, tightening user relationships and effectively addressing the new-user and sparsity problems. Experiments show that the improved model is effective.
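A minimal, hypothetical sketch of the trust-weighted neighbourhood idea described above: a parameter eta (the abstract's [η]) blends trust with rating similarity when weighting neighbours. All names, formulas, and numbers below are illustrative assumptions, not the paper's actual model.

```python
# Blending trust and similarity for neighbour weighting in collaborative
# filtering; eta in [0, 1] trades off the two signals.

def neighbour_weight(similarity, trust, eta):
    """Linear blend of similarity and trust."""
    return eta * trust + (1 - eta) * similarity

def predict_rating(user_mean, neighbours, eta):
    """neighbours: list of (similarity, trust, rating_deviation) tuples,
    where rating_deviation is the neighbour's rating minus their mean."""
    num = sum(neighbour_weight(s, t, eta) * dev for s, t, dev in neighbours)
    den = sum(abs(neighbour_weight(s, t, eta)) for s, t, _ in neighbours)
    return user_mean if den == 0 else user_mean + num / den

# Example: a cold-start user with almost no similarity signal still gets a
# usable prediction because trust contributes to the weights.
neigh = [(0.0, 0.9, 1.0), (0.1, 0.2, -0.5)]
print(round(predict_rating(3.5, neigh, eta=0.8), 3))  # -> 4.2
```

With eta near 1 the prediction leans on trust, which is what mitigates the new-user and sparsity problems the abstract targets.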
10.
《Ergonomics》2012,55(3):247-260
In this study, a prototype liquid container combined with auxiliary handles was designed to increase the safety of manual handling and to protect users of these containers from hand contamination. A Likert summated rating method as well as a pairwise ranking test was applied to evaluate the user preferences for handles provided for the container under the conditions of different shapes and positions. The results show that the participants preferred a perpendicular orientation of the handle on the top of the liquid container while carrying the containers and a crosswise position of the handle at the side of the container while pouring the liquid. In order to satisfy both conditions, the container needs to be designed with handles in perpendicular as well as crosswise positions for selective application. A prototype liquid container with the auxiliary handles was developed based on the results of the evaluation. It is recommended that a liquid container provide extra handles to reduce musculoskeletal stress and in turn increase user satisfaction.
11.
Vitaly Schetinin Livia Jakaite Wojtek J. Krzanowski 《Expert systems with applications》2013,40(14):5466-5476
Practitioners use Trauma and Injury Severity Score (TRISS) models for predicting the survival probability of an injured patient. The accuracy of TRISS predictions is acceptable for patients with up to three typical injuries, but unacceptable for patients with a larger number of injuries or with atypical injuries. Based on a regression model, the TRISS methodology does not provide the predictive density required for accurate assessment of risk. Moreover, the regression model is difficult to interpret. We therefore consider Bayesian inference for estimating the predictive distribution of survival. The inference is based on decision tree models which recursively split data along explanatory variables, and so practitioners can understand these models. We propose the Bayesian method for estimating the predictive density and show that it outperforms the TRISS method in terms of both goodness-of-fit and classification accuracy. The developed method has been made available for evaluation purposes as a stand-alone application.
12.
Although research on and applications of selective ensemble methods have produced many important results, the high computational complexity and low efficiency of existing implementations remain a bottleneck for applying them. We therefore propose a new, fast-converging selective ensemble method. It uses C4.5 decision tree classifiers as base learners and a fast-converging swarm intelligence algorithm to search for the optimal ensemble model, and was evaluated on multi-class data sets from the UCI repository. The experimental results show that the method is computationally efficient and exceeds Bagging in both accuracy and stability, making it an efficient implementation of selective ensembles.
13.
For two disjoint sets of variables, X and Y, and a class of functions C, we define DT(X, Y, C) to be the class of all decision trees over X whose leaves are functions from C over Y. We study the learnability of DT(X, Y, C) using membership and equivalence queries. Boolean decision trees were shown to be exactly learnable by Bshouty, but does this imply the learnability of decision trees that have nonboolean leaves? A simple encoding of all possible leaf values will work provided that the size of C is reasonable. Our investigation involves several cases where simple encoding is not feasible, i.e., when |C| is large. We show how to learn decision trees whose leaves are learnable concepts belonging to a class C, DT(X, Y, C), when the separation between the variables X and Y is known. A simple algorithm for decision trees whose leaves are constants is also presented. Each case above requires at least s separate executions of the algorithm due to Bshouty, where s is the number of distinct leaves of the tree, but we show that if C is a bounded lattice, the class is learnable using only one execution of this algorithm.
Received September 23, 1995; revised January 15, 1996.
14.
15.
Although there has been significant research on modelling and learning user preferences for various types of objects, there has been relatively little work on the problem of representing and learning preferences over sets of objects. We introduce a representation language, DD-PREF, that balances preferences for particular objects with preferences about the properties of the set. Specifically, we focus on the depth of objects (i.e. preferences for specific attribute values over others) and on the diversity of sets (i.e. preferences for broad vs. narrow distributions of attribute values). The DD-PREF framework is general and can incorporate additional object- and set-based preferences. We describe a greedy algorithm, DD-Select, for selecting satisfying sets from a collection of new objects, given a preference in this language. We show how preferences represented in DD-PREF can be learned from training data. Experimental results are given for three domains: a blocks world domain with several different task-based preferences, a real-world music playlist collection, and rover image data gathered in desert training exercises.
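The depth/diversity trade-off behind a greedy selector like DD-Select can be sketched in miniature. The scoring function below is a made-up stand-in for the DD-PREF language, with objects reduced to a single attribute:

```python
# Greedy set selection balancing depth (match to a preferred attribute
# value) against diversity (spread of attribute values). Illustrative only.

def score(subset, preferred, alpha):
    """alpha in [0, 1] blends depth with diversity for 1-attribute objects."""
    if not subset:
        return 0.0
    depth = sum(1.0 for v in subset if v == preferred) / len(subset)
    diversity = len(set(subset)) / len(subset)
    return alpha * depth + (1 - alpha) * diversity

def greedy_select(pool, k, preferred, alpha):
    """Repeatedly add the object that most improves the set's score."""
    chosen, pool = [], list(pool)
    for _ in range(k):
        best = max(pool, key=lambda v: score(chosen + [v], preferred, alpha))
        chosen.append(best)
        pool.remove(best)
    return chosen

# With alpha = 0.4 (diversity-leaning), the second pick favours a new
# value over a second "red".
print(greedy_select(["red", "red", "blue"], 2, "red", 0.4))  # -> ['red', 'blue']
```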
16.
Incremental Induction of Decision Trees (total citations: 25; self-citations: 11; citations by others: 25)
Paul E. Utgoff 《Machine Learning》1989,4(2):161-186
This article presents an incremental algorithm for inducing decision trees equivalent to those formed by Quinlan's nonincremental ID3 algorithm, given the same training instances. The new algorithm, named ID5R, lets one apply the ID3 induction process to learning tasks in which training instances are presented serially. Although the basic tree-building algorithms differ only in how the decision trees are constructed, experiments show that incremental training makes it possible to select training instances more carefully, which can result in smaller decision trees. The ID3 algorithm and its variants are compared in terms of theoretical complexity and empirical behavior.
17.
We consider the problem of modeling and reasoning about statements of ordinal preferences expressed by a user, such as monadic statements like "X is good," dyadic statements like "X is better than Y," etc. Such qualitative statements may be explicitly expressed by the user, or may be inferred from observable user behavior. This paper presents a novel technique for efficient reasoning about sets of such preference statements in a semantically rigorous manner. Specifically, we propose a novel approach for generating an ordinal utility function from a set of qualitative preference statements, drawing upon techniques from knowledge representation and machine learning. We provide theoretical evidence that the new method provides an efficient and expressive tool for reasoning about ordinal user preferences. Empirical results further confirm that the new method is effective on real-world data, making it promising for a wide spectrum of applications that require modeling and reasoning about user preferences.
18.
We give a (ln n + 1)-approximation for the decision tree (DT) problem. An instance of DT is a set of m binary tests T = (T_1, …, T_m) and a set of n items X = (X_1, …, X_n). The goal is to output a binary tree where each internal node is a test, each leaf is an item, and the total external path length of the tree is minimized. Total external path length is the sum of the depths of all the leaves in the tree. DT has a long history in computer science, with applications ranging from medical diagnosis to experiment design. It also generalizes the problem of finding optimal average-case search strategies in partially ordered sets, which includes several alphabetic tree problems. Our work decreases the previous best upper bound on the approximation ratio by a constant factor. We provide a new analysis of the greedy algorithm that uses a simple accounting scheme to spread the cost of a tree among pairs of items split at a particular node. We conclude by showing that our upper bound also holds for the DT problem with weighted tests.
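The greedy algorithm analysed above can be sketched compactly: at each node, pick the test that splits the surviving items most evenly, recurse, and sum leaf depths. This is an illustrative toy, not the paper's analysis; tests are represented as sets of the items that pass them.

```python
# Greedy construction for the decision tree (DT) problem, returning the
# total external path length (sum of leaf depths) of the tree it builds.

def greedy_cost(items, tests, depth=0):
    """items: set of item indices; tests: list of sets (items passing)."""
    if len(items) <= 1:
        return depth * len(items)
    # Most balanced test: minimise the larger side of the split.
    best = min(tests, key=lambda t: max(len(items & t), len(items - t)))
    yes, no = items & best, items - best
    if not yes or not no:          # no test distinguishes these items
        return depth * len(items)
    return (greedy_cost(yes, tests, depth + 1)
            + greedy_cost(no, tests, depth + 1))

# Four items distinguishable by two tests -> a perfectly balanced tree of
# depth 2, so total external path length is 4 * 2 = 8.
print(greedy_cost({0, 1, 2, 3}, [{0, 1}, {0, 2}]))  # -> 8
```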
19.
In machine learning, classifier fusion has become a new research area. This paper introduces a new method of combining multiple classifiers with meta decision trees (MDTs), and explains MDTs, meta-attributes, and the stacking framework for combining multiple classifiers with MDTs.
20.
《Computers and biomedical research》1997,30(1):19-33
Decision tree models may be more realistic if branching probabilities (and possibly utilities) are represented as distributions rather than point estimates. However, numerical analysis of such "probabilistic" trees is more difficult. This study employed the Mathematica computer algebra system to implement and verify previously described probabilistic methods. Both algebraic approximations and Monte Carlo simulation methods were used; in particular, simulations with beta, logistic-normal, and triangular distributions for branching probabilities were compared. Algebraic and simulation methods of sensitivity analysis were also implemented and compared. Computation required minimal programming and was reasonably fast using Mathematica on a standard personal computer. This study verified previously published results, including methods of sensitivity analysis. Changing the input distributional form had little effect. Computation is no longer a significant barrier to the use of probabilistic methods for analysis of decision trees.
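The Monte Carlo approach described above is easy to reproduce outside Mathematica. A minimal sketch, assuming a single chance node and invented beta parameters: each simulation draws the branching probability from a distribution instead of using a point estimate, then summarises the resulting spread of expected utility.

```python
# Monte Carlo analysis of a probabilistic decision tree branch: the
# branching probability is Beta-distributed, not a point estimate.
import random
import statistics

def expected_utility(p_success, u_success=1.0, u_failure=0.0):
    # One chance node with two branches.
    return p_success * u_success + (1 - p_success) * u_failure

def simulate(alpha, beta, n=10000, seed=42):
    rng = random.Random(seed)
    draws = [expected_utility(rng.betavariate(alpha, beta)) for _ in range(n)]
    return statistics.mean(draws), statistics.stdev(draws)

# Beta(8, 2) has mean 0.8, so the simulated expected utility clusters
# around 0.8, and the stdev quantifies the uncertainty a point estimate hides.
mean, sd = simulate(8, 2)
print(round(mean, 2), round(sd, 2))
```

Swapping in a logistic-normal or triangular draw for `rng.betavariate` changes only one line, mirroring the distributional-form comparison the study reports.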