20 similar documents were found; search time: 31 ms.
1.
2.
Coding Decision Trees (total citations: 4; self-citations: 0; citations by others: 4)
3.
Combining Classifiers with Meta Decision Trees (total citations: 4; self-citations: 0; citations by others: 4)
The paper introduces meta decision trees (MDTs), a novel method for combining multiple classifiers. Instead of giving a prediction, MDT leaves specify which classifier should be used to obtain a prediction. We present an algorithm for learning MDTs based on the C4.5 algorithm for learning ordinary decision trees (ODTs). An extensive experimental evaluation of the new algorithm is performed on twenty-one data sets, combining classifiers generated by five learning algorithms: two algorithms for learning decision trees, a rule learning algorithm, a nearest neighbor algorithm and a naive Bayes algorithm. In terms of performance, stacking with MDTs combines classifiers better than voting and stacking with ODTs. In addition, the MDTs are much more concise than the ODTs and are thus a step towards comprehensible combination of multiple classifiers. MDTs also perform better than several other approaches to stacking.
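The core MDT idea can be sketched in a few lines. This is a hypothetical toy illustration, not the paper's C4.5-based learner: the classifiers, the meta-attribute, and the single split are all invented for the example.

```python
# Hypothetical sketch of the meta decision tree (MDT) idea: interior nodes
# test meta-attributes of an instance, and each leaf names which base
# classifier to trust, instead of predicting a class directly.

def base_a(x):
    # Toy base classifier: accurate when the first feature is large.
    return 1 if x[0] > 0.5 else 0

def base_b(x):
    # Toy base classifier: accurate when the second feature is large.
    return 1 if x[1] > 0.5 else 0

def mdt_predict(x, confidence_a):
    """A one-split MDT: if the meta-attribute says base_a is confident
    on x, delegate to base_a, otherwise delegate to base_b."""
    chosen = base_a if confidence_a(x) >= 0.5 else base_b
    return chosen(x)

# Example: route by a toy meta-attribute (here, the first feature itself).
conf = lambda x: x[0]
print(mdt_predict((0.9, 0.1), conf))  # delegates to base_a -> 1
print(mdt_predict((0.2, 0.8), conf))  # delegates to base_b -> 1
```

The point of the leaf is delegation: the MDT never emits a class label itself, which is what keeps the combined model small and readable.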
4.
Multivariate Decision Trees (total citations: 24; self-citations: 0; citations by others: 24)
Unlike a univariate decision tree, a multivariate decision tree is not restricted to splits of the instance space that are orthogonal to the features' axes. This article addresses several issues for constructing multivariate decision trees: representing a multivariate test, including symbolic and numeric features, learning the coefficients of a multivariate test, selecting the features to include in a test, and pruning of multivariate decision trees. We present several new methods for forming multivariate decision trees and compare them with several well-known methods. We compare the different methods across a variety of learning tasks, in order to assess each method's ability to find concise, accurate decision trees. The results demonstrate that some multivariate methods are in general more effective than others (in the context of our experimental assumptions). In addition, the experiments confirm that allowing multivariate tests generally improves the accuracy of the resulting decision tree over a univariate tree.
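The contrast between univariate and multivariate tests can be made concrete. A minimal sketch, with invented weights and thresholds; a real learner would fit the coefficients from data as the article describes:

```python
# A univariate test is axis-parallel; a multivariate (oblique) test splits
# along a linear combination of features. Weights here are illustrative.

def univariate_test(x, feature=0, threshold=0.5):
    # Split orthogonal to one feature axis.
    return x[feature] > threshold

def multivariate_test(x, weights=(1.0, -1.0), threshold=0.0):
    # Split along a tilted hyperplane: w . x > threshold.
    return sum(w * xi for w, xi in zip(weights, x)) > threshold

point = (0.6, 0.2)
print(univariate_test(point))    # True: x0 > 0.5
print(multivariate_test(point))  # True: x0 - x1 = 0.4 > 0
```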
5.
Hongyan Liu Jun He Tingting Wang Wenting Song Xiaoyang Du 《Electronic Commerce Research and Applications》2013,12(1):14-23
Recommendation systems represent a popular research area with a variety of applications. Such systems provide personalized services to the user and help address the problem of information overload. However, traditional recommendation methods such as collaborative filtering suffer from low accuracy because of data sparseness. We propose a novel recommendation algorithm based on analysis of online reviews. The algorithm incorporates two new methods for opinion mining and recommendation. As opposed to traditional methods, which are usually based on the similarity of ratings to infer user preferences, the proposed recommendation method analyzes the difference between the ratings and opinions of the user to identify the user's preferences. This method considers both explicit ratings and implicit opinions, which helps address the problem of data sparseness. We propose a new feature and opinion extraction method, based on the characteristics of online reviews, to effectively extract the opinion of the user from a customer review written in Chinese. Based on these methods, we also conduct an empirical study of online restaurant customer reviews to create a restaurant recommendation system and demonstrate the effectiveness of the proposed methods.
6.
Induction of Decision Trees (total citations: 390; self-citations: 5; citations by others: 390)
The technology for building knowledge-based systems by inductive inference from examples has been demonstrated successfully in several practical applications. This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, and it describes one such system, ID3, in detail. Results from recent studies show ways in which the methodology can be modified to deal with information that is noisy and/or incomplete. A reported shortcoming of the basic algorithm is discussed and two means of overcoming it are compared. The paper concludes with illustrations of current research directions.
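At the heart of ID3 is entropy-based attribute selection: split on the attribute with the largest information gain. An illustrative sketch (not Quinlan's original code), on an invented toy data set:

```python
# Entropy-based attribute selection as used by ID3: pick the attribute
# whose split yields the largest information gain.
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Gain of splitting (rows, labels) on attribute index `attr`."""
    total = entropy(labels)
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return total - remainder

# Tiny example: attribute 0 separates the classes perfectly, attribute 1 doesn't.
rows = [("sunny", "hot"), ("sunny", "cool"), ("rain", "hot"), ("rain", "cool")]
labels = ["no", "no", "yes", "yes"]
print(information_gain(rows, labels, 0))  # 1.0 bit: perfect split
print(information_gain(rows, labels, 1))  # 0.0 bits: no information
```

ID3 applies this selection recursively, growing a subtree for each value of the chosen attribute until the remaining labels are pure.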
7.
8.
We suppose that a given instance may have more than one possible classification. A possibility distribution is assigned at each terminal node of a fuzzy decision tree, and the possibility distribution of a given instance with known attribute values is determined by simple fuzzy reasoning. This diminishes the inconsistency of forcing a single class onto a given instance.
9.
Locating neighbour users by similarity alone has a serious negative impact on the performance of traditional collaborative filtering algorithms. We introduce the trust mechanism of social networks, modelling from the perspectives of an individual's subjective trust within a social circle and global reputation. Direct trust is generated by separately considering user interactions, rating differences, and user-preference adjustment; indirect trust is generated by aggregation using reputation and an expert-trust-priority model, and the two are dynamically weighted to form trust relationships between users. A parameter [η] coordinates the dual attributes of trust and similarity, tightening user relationships and effectively addressing the new-user and sparsity problems. Experiments show that the improved model is effective.
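A minimal, hypothetical sketch of the trust-weighted neighbourhood idea described above: a parameter eta (the abstract's [η]) blends trust with rating similarity when weighting neighbours. All names, formulas, and numbers below are illustrative assumptions, not the paper's actual model.

```python
# Blending trust and similarity for neighbour weighting in collaborative
# filtering; eta in [0, 1] trades off the two signals.

def neighbour_weight(similarity, trust, eta):
    """Linear blend of similarity and trust."""
    return eta * trust + (1 - eta) * similarity

def predict_rating(user_mean, neighbours, eta):
    """neighbours: list of (similarity, trust, rating_deviation) tuples,
    where rating_deviation is the neighbour's rating minus their mean."""
    num = sum(neighbour_weight(s, t, eta) * dev for s, t, dev in neighbours)
    den = sum(abs(neighbour_weight(s, t, eta)) for s, t, _ in neighbours)
    return user_mean if den == 0 else user_mean + num / den

# Example: a cold-start user with almost no similarity signal still gets a
# usable prediction because trust contributes to the weights.
neigh = [(0.0, 0.9, 1.0), (0.1, 0.2, -0.5)]
print(round(predict_rating(3.5, neigh, eta=0.8), 3))  # -> 4.2
```

With eta near 1 the prediction leans on trust, which is what mitigates the new-user and sparsity problems the abstract targets.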
10.
《Ergonomics》2012,55(3):247-260
In this study, a prototype liquid container combined with auxiliary handles was designed to increase the safety of manual handling and to protect users of these containers from hand contamination. A Likert summated rating method as well as a pairwise ranking test was applied to evaluate the user preferences for handles provided for the container under the conditions of different shapes and positions. The results show that the participants preferred a perpendicular orientation of the handle on the top of the liquid container while carrying the containers and a crosswise position of the handle at the side of the container while pouring the liquid. In order to satisfy both conditions, the container needs to be designed with handles in perpendicular as well as crosswise positions for selective application. A prototype liquid container with the auxiliary handles was developed based on the results of the evaluation. It is recommended that a liquid container provide extra handles to reduce musculoskeletal stress and in turn increase user satisfaction.
11.
Vitaly Schetinin Livia Jakaite Wojtek J. Krzanowski 《Expert systems with applications》2013,40(14):5466-5476
Practitioners use Trauma and Injury Severity Score (TRISS) models for predicting the survival probability of an injured patient. The accuracy of TRISS predictions is acceptable for patients with up to three typical injuries, but unacceptable for patients with a larger number of injuries or with atypical injuries. Based on a regression model, the TRISS methodology does not provide the predictive density required for accurate assessment of risk. Moreover, the regression model is difficult to interpret. We therefore consider Bayesian inference for estimating the predictive distribution of survival. The inference is based on decision tree models which recursively split data along explanatory variables, and so practitioners can understand these models. We propose the Bayesian method for estimating the predictive density and show that it outperforms the TRISS method in terms of both goodness-of-fit and classification accuracy. The developed method has been made available for evaluation purposes as a stand-alone application.
12.
Although research on and applications of selective ensemble methods have produced many important results, the high computational complexity and low efficiency of existing implementations remain a bottleneck for applying them. We therefore propose a new, fast-converging selective ensemble method. It uses C4.5 decision tree classifiers as base learners and a fast-converging swarm intelligence algorithm to search for the optimal ensemble model, and was evaluated on multi-class data sets from the UCI repository. The experimental results show that the method is computationally efficient and exceeds Bagging in both accuracy and stability, making it an efficient implementation of selective ensembles.
13.
For two disjoint sets of variables, X and Y, and a class of functions C, we define DT(X, Y, C) to be the class of all decision trees over X whose leaves are functions from C over Y. We study the learnability of DT(X, Y, C) using membership and equivalence queries. Boolean decision trees were shown to be exactly learnable by Bshouty, but does this imply the learnability of decision trees that have nonboolean leaves? A simple encoding of all possible leaf values will work provided that the size of C is reasonable. Our investigation involves several cases where simple encoding is not feasible, i.e., when |C| is large. We show how to learn decision trees whose leaves are learnable concepts belonging to a class C, DT(X, Y, C), when the separation between the variables X and Y is known. A simple algorithm for decision trees whose leaves are constants is also presented. Each case above requires at least s separate executions of the algorithm due to Bshouty, where s is the number of distinct leaves of the tree, but we show that if C is a bounded lattice, the class is learnable using only one execution of this algorithm.
Received September 23, 1995; revised January 15, 1996.
14.
15.
Although there has been significant research on modelling and learning user preferences for various types of objects, there has been relatively little work on the problem of representing and learning preferences over sets of objects. We introduce a representation language, DD-PREF, that balances preferences for particular objects with preferences about the properties of the set. Specifically, we focus on the depth of objects (i.e. preferences for specific attribute values over others) and on the diversity of sets (i.e. preferences for broad vs. narrow distributions of attribute values). The DD-PREF framework is general and can incorporate additional object- and set-based preferences. We describe a greedy algorithm, DD-Select, for selecting satisfying sets from a collection of new objects, given a preference in this language. We show how preferences represented in DD-PREF can be learned from training data. Experimental results are given for three domains: a blocks world domain with several different task-based preferences, a real-world music playlist collection, and rover image data gathered in desert training exercises.
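The depth/diversity trade-off behind a greedy selector like DD-Select can be sketched in miniature. The scoring function below is a made-up stand-in for the DD-PREF language, with objects reduced to a single attribute:

```python
# Greedy set selection balancing depth (match to a preferred attribute
# value) against diversity (spread of attribute values). Illustrative only.

def score(subset, preferred, alpha):
    """alpha in [0, 1] blends depth with diversity for 1-attribute objects."""
    if not subset:
        return 0.0
    depth = sum(1.0 for v in subset if v == preferred) / len(subset)
    diversity = len(set(subset)) / len(subset)
    return alpha * depth + (1 - alpha) * diversity

def greedy_select(pool, k, preferred, alpha):
    """Repeatedly add the object that most improves the set's score."""
    chosen, pool = [], list(pool)
    for _ in range(k):
        best = max(pool, key=lambda v: score(chosen + [v], preferred, alpha))
        chosen.append(best)
        pool.remove(best)
    return chosen

# With alpha = 0.4 (diversity-leaning), the second pick favours a new
# value over a second "red".
print(greedy_select(["red", "red", "blue"], 2, "red", 0.4))  # -> ['red', 'blue']
```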
16.
Incremental Induction of Decision Trees (total citations: 25; self-citations: 11; citations by others: 25)
Paul E. Utgoff 《Machine Learning》1989,4(2):161-186
This article presents an incremental algorithm for inducing decision trees equivalent to those formed by Quinlan's nonincremental ID3 algorithm, given the same training instances. The new algorithm, named ID5R, lets one apply the ID3 induction process to learning tasks in which training instances are presented serially. Although the basic tree-building algorithms differ only in how the decision trees are constructed, experiments show that incremental training makes it possible to select training instances more carefully, which can result in smaller decision trees. The ID3 algorithm and its variants are compared in terms of theoretical complexity and empirical behavior.
17.
We consider the problem of modeling and reasoning about statements of ordinal preferences expressed by a user, such as monadic statements like "X is good," dyadic statements like "X is better than Y," etc. Such qualitative statements may be explicitly expressed by the user, or may be inferred from observable user behavior. This paper presents a novel technique for efficient reasoning about sets of such preference statements in a semantically rigorous manner. Specifically, we propose a novel approach for generating an ordinal utility function from a set of qualitative preference statements, drawing upon techniques from knowledge representation and machine learning. We provide theoretical evidence that the new method provides an efficient and expressive tool for reasoning about ordinal user preferences. Empirical results further confirm that the new method is effective on real-world data, making it promising for a wide spectrum of applications that require modeling and reasoning about user preferences.
18.
We give a (ln n + 1)-approximation for the decision tree (DT) problem. An instance of DT is a set of m binary tests T = (T_1, …, T_m) and a set of n items X = (X_1, …, X_n). The goal is to output a binary tree where each internal node is a test, each leaf is an item, and the total external path length of the tree is minimized. Total external path length is the sum of the depths of all the leaves in the tree. DT has a long history in computer science, with applications ranging from medical diagnosis to experiment design. It also generalizes the problem of finding optimal average-case search strategies in partially ordered sets, which includes several alphabetic tree problems. Our work decreases the previous best upper bound on the approximation ratio by a constant factor. We provide a new analysis of the greedy algorithm that uses a simple accounting scheme to spread the cost of a tree among pairs of items split at a particular node. We conclude by showing that our upper bound also holds for the DT problem with weighted tests.
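The greedy algorithm analysed above can be sketched compactly: at each node, pick the test that splits the surviving items most evenly, recurse, and sum leaf depths. This is an illustrative toy, not the paper's analysis; tests are represented as sets of the items that pass them.

```python
# Greedy construction for the decision tree (DT) problem, returning the
# total external path length (sum of leaf depths) of the tree it builds.

def greedy_cost(items, tests, depth=0):
    """items: set of item indices; tests: list of sets (items passing)."""
    if len(items) <= 1:
        return depth * len(items)
    # Most balanced test: minimise the larger side of the split.
    best = min(tests, key=lambda t: max(len(items & t), len(items - t)))
    yes, no = items & best, items - best
    if not yes or not no:          # no test distinguishes these items
        return depth * len(items)
    return (greedy_cost(yes, tests, depth + 1)
            + greedy_cost(no, tests, depth + 1))

# Four items distinguishable by two tests -> a perfectly balanced tree of
# depth 2, so total external path length is 4 * 2 = 8.
print(greedy_cost({0, 1, 2, 3}, [{0, 1}, {0, 2}]))  # -> 8
```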
19.
In machine learning, classifier fusion has become a new research area. This paper introduces a new method of combining multiple classifiers with meta decision trees (MDTs), and explains MDTs, meta-attributes, and the stacking framework for combining multiple classifiers with MDTs.
20.
《Computers and biomedical research》1997,30(1):19-33
Decision tree models may be more realistic if branching probabilities (and possibly utilities) are represented as distributions rather than point estimates. However, numerical analysis of such "probabilistic" trees is more difficult. This study employed the Mathematica computer algebra system to implement and verify previously described probabilistic methods. Both algebraic approximations and Monte Carlo simulation methods were used; in particular, simulations with beta, logistic-normal, and triangular distributions for branching probabilities were compared. Algebraic and simulation methods of sensitivity analysis were also implemented and compared. Computation required minimal programming and was reasonably fast using Mathematica on a standard personal computer. This study verified previously published results, including methods of sensitivity analysis. Changing the input distributional form had little effect. Computation is no longer a significant barrier to the use of probabilistic methods for analysis of decision trees.
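The Monte Carlo approach described above is easy to reproduce outside Mathematica. A minimal sketch, assuming a single chance node and invented beta parameters: each simulation draws the branching probability from a distribution instead of using a point estimate, then summarises the resulting spread of expected utility.

```python
# Monte Carlo analysis of a probabilistic decision tree branch: the
# branching probability is Beta-distributed, not a point estimate.
import random
import statistics

def expected_utility(p_success, u_success=1.0, u_failure=0.0):
    # One chance node with two branches.
    return p_success * u_success + (1 - p_success) * u_failure

def simulate(alpha, beta, n=10000, seed=42):
    rng = random.Random(seed)
    draws = [expected_utility(rng.betavariate(alpha, beta)) for _ in range(n)]
    return statistics.mean(draws), statistics.stdev(draws)

# Beta(8, 2) has mean 0.8, so the simulated expected utility clusters
# around 0.8, and the stdev quantifies the uncertainty a point estimate hides.
mean, sd = simulate(8, 2)
print(round(mean, 2), round(sd, 2))
```

Swapping in a logistic-normal or triangular draw for `rng.betavariate` changes only one line, mirroring the distributional-form comparison the study reports.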