Similar literature
20 similar articles found (search time: 0 ms)
1.
We present ELEM2, a machine learning system that induces classification rules from a set of data based on a heuristic search over a hypothesis space. ELEM2 is distinguished from other rule induction systems in three aspects. First, it uses a new heuristic function to guide the heuristic search. The function reflects the degree of relevance of an attribute-value pair to a target concept and leads to selection of the most relevant pairs for formulating rules. Second, ELEM2 handles inconsistent training examples by defining an unlearnable region of a concept based on the probability distribution of that concept in the training data. The unlearnable region is used as a stopping criterion for the concept learning process, which resolves conflicts without removing inconsistent examples. Third, ELEM2 employs a new rule quality measure in its post-pruning process to prevent rules from overfitting the data. The rule quality formula measures the extent to which a rule can discriminate between the positive and negative examples of a class. We describe features of ELEM2, its rule induction algorithm and its classification procedure. We report experimental results that compare ELEM2 with C4.5 and CN2 on a number of datasets.

2.
3.
This paper proposes a cellular automata-based solution to a binary classification problem. The proposed method is based on a two-dimensional, three-state cellular automaton (CA) with the von Neumann neighborhood. Since the number of possible CA rules (potential CA-based classifiers) is huge, the search for efficient rules is conducted with a genetic algorithm (GA). Experiments show excellent performance of the discovered rules in solving the classification problem. The best rules found perform better than the heuristic CA rule designed by a human and also better than one of the most widely used statistical methods, the k-nearest neighbors algorithm (k-NN). Experiments also show that CA rules can be successfully reused in the process of searching for new rules.
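To make the setting in the abstract above concrete, here is a minimal sketch of one synchronous update of a two-dimensional, three-state CA with the von Neumann neighborhood. The state encoding (0 = unlabeled, 1 and 2 = the two classes), the fixed boundary, and the majority-vote rule are all assumptions made for illustration; this is not the GA-discovered rule or the hand-designed heuristic from the paper, whose details the abstract does not give.

```python
def ca_step(grid):
    """One synchronous update of a hypothetical majority-vote CA rule.

    States (assumed encoding): 0 = unlabeled, 1 and 2 = the two classes.
    Only the four von Neumann neighbors (up, down, left, right) are inspected.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 0:
                continue  # labeled cells keep their state in this sketch
            counts = {1: 0, 2: 0}
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != 0:
                    counts[grid[nr][nc]] += 1
            if counts[1] > counts[2]:
                new[r][c] = 1
            elif counts[2] > counts[1]:
                new[r][c] = 2
            # ties leave the cell unlabeled
    return new


def classify(grid, max_steps=100):
    """Iterate the rule until the grid stops changing or max_steps is reached."""
    for _ in range(max_steps):
        nxt = ca_step(grid)
        if nxt == grid:
            break
        grid = nxt
    return grid
```

Starting from a grid with one labeled cell of each class in opposite corners, the labels spread toward each other and the cells where both classes tie remain unlabeled. A GA as described in the abstract would search over the rule table itself rather than hard-coding a majority vote.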

4.
5.
Pattern Analysis and Applications - Sparse-representation-based classification (SRC) has been widely studied and developed for various practical signal classification applications. However, the...

6.
Rule induction has attracted a great deal of attention in Machine Learning and Data Mining. However, generating rules is not an end in itself, because their applicability is not straightforward, especially when their number is large. Ideally, the end user would like to use these rules to decide which actions to undertake. In the literature, this notion is usually referred to as actionability. We propose a new framework to address actionability. Our goal is to lighten the burden of analyzing a large set of classification rules when the user is confronted with an “unsatisfactory situation” and needs help deciding on the appropriate actions to remedy it. The method consists in comparing the situation to a set of classification rules. For this purpose, we propose a new framework for learning action recommendations that deals with complex notions of feasibility and quality of actions. Our approach was motivated by an environmental application that aims to build a tool to help specialists in charge of managing a catchment preserve stream-water quality. The results show the utility of this methodology for enhancing the actionability of a set of classification rules in a real-world application.

7.
Neural Computing and Applications - Context of data points, which is usually defined as the other data points in a data set, has been found to play important roles in data representation and...

8.
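A decision rule acquisition algorithm based on classification consistency   Total citations: 3, self-citations: 3, citations by others: 3
Dai Jianhua, Pan Yunhe 《控制与决策》 (Control and Decision) 2004,19(10):1086-1090
A rule acquisition algorithm based on classification consistency is proposed. It is a specialization-direction (general-to-specific) method: starting from the empty set, it measures the importance of attributes by the classification consistency of condition-attribute subsets and adds important attributes step by step. Once the selected attribute subset classifies the data correctly, the decision rules are obtained. The algorithm includes a rule reduction procedure that simplifies the acquired rules and strengthens their generalization ability. Experimental results show that the rules obtained by the proposed algorithm are more concise and efficient.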
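The greedy procedure described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: "classification consistency" is taken here to be the fraction of objects whose values on the chosen attributes determine a single decision (a rough-set style dependency degree), and the names `consistency`, `select_attributes` and `extract_rules` are hypothetical. The rule reduction step mentioned in the abstract is omitted.

```python
from collections import defaultdict


def consistency(rows, decisions, attrs):
    """Fraction of objects whose values on `attrs` map to exactly one decision.

    Assumed definition of classification consistency, for illustration only.
    """
    groups = defaultdict(list)
    for row, decision in zip(rows, decisions):
        groups[tuple(row[a] for a in attrs)].append(decision)
    consistent = sum(len(ds) for ds in groups.values() if len(set(ds)) == 1)
    return consistent / len(rows)


def select_attributes(rows, decisions):
    """Start from the empty set and greedily add the attribute that raises
    consistency the most, until the subset is as consistent as the full set."""
    all_attrs = list(range(len(rows[0])))
    target = consistency(rows, decisions, all_attrs)
    selected = []
    while consistency(rows, decisions, selected) < target:
        remaining = [a for a in all_attrs if a not in selected]
        best = max(remaining,
                   key=lambda a: consistency(rows, decisions, selected + [a]))
        selected.append(best)
    return selected


def extract_rules(rows, decisions, attrs):
    """Read off one rule per consistent attribute-value combination."""
    seen = defaultdict(set)
    for row, decision in zip(rows, decisions):
        seen[tuple((a, row[a]) for a in attrs)].add(decision)
    return {cond: next(iter(ds)) for cond, ds in seen.items() if len(ds) == 1}


# Toy decision table: three condition attributes (columns) and one decision.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
decisions = ["no", "no", "yes", "yes"]
attrs = select_attributes(rows, decisions)
rules = extract_rules(rows, decisions, attrs)
```

On this toy table the first attribute alone already separates the two decisions, so the selection stops after one step and each rule has a single condition.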

9.
Learning middle-level image representations is very important for the computer vision community, especially for scene classification tasks. The middle-level image representations currently available are not sparse enough to keep training and testing times compatible with the increasing number of classes that users want to recognize. In this work, we propose a middle-level image representation based on patterns that are heavily shared among different classes, reducing both training and test time. The proposed learning algorithm first finds some class-specific patterns and then uses lasso regularization to select the most discriminative patterns shared among different classes. Experimental results on several widely used scene classification benchmarks (15 Scenes, MIT-indoor 67, SUN 397) show that only a small number of patterns is needed to achieve remarkable performance with reduced computation time.

10.
11.
Personalized medicine requires the analysis of epidemiological data for the identification of subgroups sharing some risk factors and exhibiting dedicated outcome risks. We investigate the potential of data mining methods for the analysis of subgroups of cohort participants on hepatic steatosis. We propose a workflow for data preparation and mining on epidemiological data and we present InteractiveRuleMiner, an interactive tool for the inspection of rules in each subpopulation, including functionalities for the juxtaposition of labeled individuals and unlabeled ones. We report on our insights on specific subpopulations that have been discovered in a data-driven rather than hypothesis-driven way.

12.
13.
14.
The application of the CD3 decision tree induction algorithm to telecommunications customer call data to obtain classification rules is described. CD3 is robust against drift in the underlying rules over time (concept drift): it both detects drift and protects the induction process from its effects. Specifically, the task is to data mine customer details and call records to determine whether the profile of customers registering for a friends and family service is changing over time and to maintain a rule set profiling such customers. CD3 and the rationale behind it are described and experimental results on customer data are presented.

15.
When a set of rules generates (conflicting) values for a virtual attribute of some tuple, the system must resolve the inconsistency and decide on a unique value that is assigned to that attribute. In most current systems, the conflict is resolved based on criteria that choose one of the rules in the conflicting set and use the value that it generated. There are several applications, however, where inconsistencies of the above form arise, whose semantics demand a different form of resolution. We propose a general framework for the study of the conflict resolution problem, and suggest a variety of resolution criteria, which collectively subsume all previously known solutions. With several new criteria being introduced, the semantics of several applications are captured more accurately than in the past. We discuss how conflict resolution criteria can be specified at the schema or the rule-module level. Finally, we suggest some implementation techniques based on rule indexing, which allow conflicts to be resolved efficiently at compile time, so that at run time only a single rule is processed. An earlier version of this work appeared under the title Conflict Resolution of Rules Assigning Values to Virtual Attributes in Proceedings of the 1989 ACM-Sigmod Conference, Portland, OR, June 1989, pp. 205–214. Partially supported by the National Science Foundation under Grant IRI-9157368 (PYI Award) and by grants from DEC, HP, and AT&T. Partially supported by the National Science Foundation under Grant IRI-9057573 (PYI Award), IBM, DEC, and the University of Maryland Institute for Advanced Computer Studies (UMIACS).

16.
Yang Sen, Feng Dawei, Liu Yang, Li Dongsheng 《Applied Intelligence》2022,52(2):1672-1685

Text generation from abstract meaning representation is a fundamental task in natural language generation. An interesting challenge is that distant context can influence the surface realization of each node. In previous encoder-decoder approaches, graph neural networks have commonly been used to encode abstract meaning representation graphs and have exhibited superior performance over sequence and tree encoders. However, most of them cannot stack many layers and are therefore too shallow to capture distant context. In this paper, we propose solutions from three aspects. Firstly, we introduce a Transformer-based graph encoder to embed abstract meaning representation graphs. This encoder can stack more layers to encode larger context without performance degradation. Secondly, we expand the receptive field of each node, i.e. we build direct connections between node pairs, to capture the information of its distant neighbors. We also exploit relative position embedding to make the model aware of the original hierarchy of the graphs. Thirdly, we encode the linearized version of the abstract meaning representation with a pre-trained language model to obtain a sequence encoding, and incorporate it into the graph encoding to enrich the features. We conduct experiments on LDC2015E86 and LDC2017T10. Experimental results demonstrate that our method outperforms previous strong baselines. In particular, we investigate the performance of our model on large graphs and find a larger performance gain. Our best model achieves 31.99 BLEU and 37.02 METEOR on LDC2015E86, and 34.21 BLEU and 39.26 METEOR on LDC2017T10, which are new states of the art.


17.
Extracting fuzzy classification rules from partially labeled data   Total citations: 1, self-citations: 1, citations by others: 0
The interpretability and flexibility of fuzzy if-then rules make them a popular basis for classifiers. It is common to extract them from a database of examples. However, the data available in many practical applications are often unlabeled, and must be labeled manually by the user or by expensive analyses. The idea of semi-supervised learning is to use as much labeled data as available and try to additionally exploit the information in the unlabeled data. In this paper we describe an approach to learn fuzzy classification rules from partially labeled datasets.

18.
Supporting production rules using ECA rules in an object-oriented context   Total citations: 2, self-citations: 0, citations by others: 2
This paper presents an approach to implementing production rules for object-oriented databases (OODBs). The approach builds upon earlier work on production rule algorithms for relational databases, and exploits fundamental differences in the structuring mechanisms employed by OODBs. An implementation is described whereby the production rules are mapped onto event-condition-action (ECA) rules for execution. It is shown how the resulting implementation has minimal space overheads, and a time performance close to that of the widely used TREAT algorithm, which uses significantly more space.

19.
The successor representation was introduced into reinforcement learning by Dayan (1993) as a means of facilitating generalization between states with similar successors. Although reinforcement learning in general has been used extensively as a model of psychological and neural processes, the psychological validity of the successor representation has yet to be explored. An interesting possibility is that the successor representation can be used not only for reinforcement learning but for episodic learning as well. Our main contribution is to show that a variant of the temporal context model (TCM; Howard & Kahana, 2002), an influential model of episodic memory, can be understood as directly estimating the successor representation using the temporal difference learning algorithm (Sutton & Barto, 1998). This insight leads to a generalization of TCM and new experimental predictions. In addition to casting a new normative light on TCM, this equivalence suggests a previously unexplored point of contact between different learning systems.
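For readers unfamiliar with the successor representation (SR), the following is a minimal sketch of the standard tabular temporal difference update for it, where M[s][j] estimates the expected discounted number of future visits to state j when starting from state s. It illustrates the textbook SR update only, not the TCM variant analyzed in the paper; the function name, the parameter defaults and the list-of-lists layout are assumptions chosen for the example.

```python
def td_successor_representation(trajectory, n_states,
                                alpha=0.1, gamma=0.9, sweeps=1):
    """Estimate the successor representation from one observed state sequence.

    For each transition s -> s_next the row M[s] is moved toward
    onehot(s) + gamma * M[s_next], the standard tabular TD(0) target for the SR.
    """
    M = [[0.0] * n_states for _ in range(n_states)]
    for _ in range(sweeps):
        for s, s_next in zip(trajectory, trajectory[1:]):
            for j in range(n_states):
                indicator = 1.0 if j == s else 0.0
                td_error = indicator + gamma * M[s_next][j] - M[s][j]
                M[s][j] += alpha * td_error
    return M


# Deterministic chain 0 -> 1 -> 2, replayed twice with a full learning rate.
M = td_successor_representation([0, 1, 2], n_states=3,
                                alpha=1.0, gamma=0.5, sweeps=2)
```

After the second sweep, row 0 credits state 1 with the discount factor 0.5 because state 1 is exactly one step ahead of state 0. The row for the final state stays at zero in this sketch, since no transition out of it is ever observed.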

20.