Similar literature
20 similar documents found (search time: 734 ms)
1.
Formal concept analysis of real set formal contexts generalizes that of classical formal contexts. By dividing the attributes into condition attributes and decision attributes, the notion of real decision formal contexts is introduced. Based on an implication mapping, the problems of rule acquisition and attribute reduction in real decision formal contexts are examined. The extraction of “if–then” rules from real decision formal contexts and an approach to their attribute reduction are discussed. With the proposed approach, attributes that are non-essential to the maximal s rules or l rules (defined later in the text) can be removed. Furthermore, discernibility matrices and discernibility functions for computing the attribute reducts of real decision formal contexts are constructed to determine all attribute reducts of the real set formal contexts without affecting the acquired maximal s rules or l rules.

2.
Ji Xia, Li Longshu. Control and Decision (《控制与决策》), 2013, 28(12): 1837-1842

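A rule extraction algorithm for incomplete decision tables based on attribute discernibility degree is proposed; it is a method that works in the direction of specialization. Starting from the empty set, the algorithm repeatedly selects the currently most important condition attribute to classify the object set, extracting certain rules from consistent blocks whose generalized decision value is unique and uncertain rules from the other consistent blocks. An attribute necessity check then removes the redundant attributes from each rule. Finally, a rule reduction step simplifies the obtained rules and improves their generalization ability. Experimental results show that the proposed algorithm is more efficient and that the obtained rules are concise and effective.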
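The split between certain and uncertain rules can be illustrated with a minimal sketch. It uses tolerance classes rather than the consistent blocks of the abstract, which is a simplifying assumption, and the names `tolerant`, `generalized_decisions` and the use of `None` for a missing value are hypothetical. An object whose generalized decision contains a single value would give rise to a certain rule.

```python
from typing import Dict, Hashable, List, Optional, Sequence, Set

# Assumption: None marks a missing attribute value in the incomplete table.
Row = Dict[str, Optional[Hashable]]


def tolerant(x: Row, y: Row, attrs: Sequence[str]) -> bool:
    """Two objects are tolerant if they agree wherever both values are known."""
    return all(x[a] is None or y[a] is None or x[a] == y[a] for a in attrs)


def generalized_decisions(
    rows: Sequence[Row], decisions: Sequence[str], attrs: Sequence[str]
) -> List[Set[str]]:
    """For each object, collect the decisions of all objects tolerant with it."""
    return [
        {d for y, d in zip(rows, decisions) if tolerant(x, y, attrs)}
        for x in rows
    ]


rows = [
    {"a": 1, "b": 0},
    {"a": 1, "b": None},  # value of b is unknown
    {"a": 0, "b": 1},
]
decisions = ["yes", "no", "no"]

for i, gd in enumerate(generalized_decisions(rows, decisions, ["a", "b"])):
    kind = "certain" if len(gd) == 1 else "uncertain"
    print(i, kind, sorted(gd))
```

In this toy table only the third object has a unique generalized decision, so only it would support a certain rule.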


3.
In this work, we present hierarchical object-driven action rules, a hybrid action rule extraction approach that combines key elements of the classical action rule mining approach, first proposed by Raś and Wieczorkowska (2000), and the more recent object-driven action rule extraction approach proposed by Hajja et al. (2012, 2013), to extract action rules from object-driven information systems. Action rules, as defined in Raś and Wieczorkowska (2000), are actionable tasks that describe possible transitions of instances from one state to another with respect to a distinguished attribute, called the decision attribute. Recently, a new specialized case of action rules, namely object-driven action rules, has been introduced by Hajja et al. (2012, 2013). Object-driven action rules are action rules extracted from information systems of a temporal and object-based nature. By object-driven information systems, we mean systems that contain multiple observations for each object, in which objects are determined by an attribute that presumably defines some unique distribution; by temporally-based information systems, we mean systems in which each instance is attached to a timestamp that, by definition, must have an intrinsic meaning for the corresponding instance. Although object-driven and temporal-based action rules have had their successes, some argue that the object-driven assumptions, which are in large part the reason for the approach's effectiveness, also impose a few limitations. Object-driven approaches treat an entire system as multiple subsystems from which action rules are extracted; as a result, more accurate and specific action rules are obtained. However, the diversity of the extracted action rules is then much less apparent than with the classical action rule extraction approach, which treats the information system as a whole. For that reason, we propose a hybrid approach that builds a hierarchy of clusters of subsystems; a novel way of clustering by the similarity of treatment responses is introduced.

4.
The problem of automatically extracting multiple news attributes from news pages is studied in this paper. Most previous work on web news article extraction focuses only on content. To meet a growing demand for web data integration applications, more useful news attributes, such as title, publication date, author, etc., need to be extracted from news pages and stored in a structured way for further processing. An automatic unified approach to extract such attributes based on their visual features, including independent and dependent visual features, is proposed. Unlike conventional methods, such as extracting attributes separately or generating template-dependent wrappers, the basic idea of this approach is twofold. First, candidates for each news attribute are extracted from the page based on their independent visual features. Second, the true value of each attribute is identified from the candidates based on dependent visual features such as the layout relationships among the attributes. Extensive experiments with a large number of news pages show that the proposed approach is highly effective and efficient.

5.
6.

This work describes a method that combines a Bayesian feature selection approach with a clustering genetic algorithm to obtain classification rules in data-mining applications. A Bayesian network is generated from a data set, and the Markov blanket of the class variable is used for the feature subset selection task. The general rule extraction method is simple and consists of applying the clustering process to the examples of each class separately. In this way, clusters of similar examples are found for each class. These clusters can be viewed as subclasses and can, consequently, be modeled as logical rules. In this context, the problem of finding the optimal number of classification rules can be viewed as the problem of finding the best number of clusters. The Clustering Genetic Algorithm can find the best clustering in a data set according to the Average Silhouette Width criterion, and it was applied to extract classification rules. The proposed methodology is illustrated by means of simulations on three data sets that are benchmarks for data-mining methods: Wisconsin Breast Cancer, Mushroom, and Congressional Voting Records. The rules extracted with all the attributes are compared to those extracted with the features belonging to the Markov blanket, and the obtained results show that the proposed method is very promising.
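As an illustration of the Average Silhouette Width criterion mentioned above, the following is a minimal sketch in plain Python. It assumes Euclidean distance and assigns a silhouette of 0 to points in singleton clusters; the function names are hypothetical and this is not the Clustering Genetic Algorithm itself, only the score such an algorithm could use to compare candidate clusterings.

```python
import math
from typing import Dict, List, Sequence


def euclidean(p: Sequence[float], q: Sequence[float]) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def average_silhouette_width(
    points: Sequence[Sequence[float]], labels: Sequence[int]
) -> float:
    """Mean of s(i) = (b - a) / max(a, b) over all points."""
    clusters: Dict[int, List[int]] = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # convention: a singleton contributes s(i) = 0
        # a: mean distance to the other members of the same cluster
        a = sum(euclidean(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(average_silhouette_width(pts, [0, 0, 1, 1]))  # well separated: close to 1
print(average_silhouette_width(pts, [0, 1, 0, 1]))  # poor grouping: negative
```

A search over the number of clusters would keep the labeling with the highest score, which is how the number of rules and the number of clusters become the same question.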

7.
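A new two-level learning method for rough sets is proposed, in which a genetic algorithm performs the outer-level learning and a rule extraction method performs the inner-level learning. The basic idea is as follows: first, a genetic algorithm is introduced, the attributes are encoded, and rules are extracted for different attribute combinations; next, the rule set is checked against test samples, and a fitness function is built from the resulting recognition rate; finally, the best attribute combination and the corresponding knowledge rules are obtained under suitable genetic operators. Compared with other methods, the proposed two-level rough set learning method integrates attribute reduction and rule extraction, and the whole process is highly adaptive. The method is verified with a worked example.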

8.
Information Systems, 2001, 26(6): 425-444
Mining association rules on large data sets has received considerable attention in recent years. Association rules are useful for determining correlations between attributes of a relation and have applications in the marketing, financial and retail sectors. Furthermore, optimized association rules are an effective way to focus on the most interesting characteristics involving certain attributes. Optimized association rules are permitted to contain uninstantiated attributes, and the problem is to determine instantiations such that the support, confidence or gain of the rule is maximized. In this paper, we generalize the optimized support association rule problem by permitting rules to contain disjunctions over uninstantiated numeric attributes. Our generalized association rules enable us to extract more useful information about seasonal and local patterns involving the uninstantiated attribute. For rules containing a single numeric attribute, we present a dynamic programming algorithm for computing optimized association rules. Furthermore, we propose a bucketing technique for reducing the input size, and a divide-and-conquer strategy that improves performance significantly without sacrificing optimality. We also present approximation algorithms based on dynamic programming for two numeric attributes. Our experimental results for a single numeric attribute indicate that our bucketing and divide-and-conquer enhancements are very effective in reducing the execution time and memory requirements of our dynamic programming algorithm. Furthermore, they show that our algorithms scale up almost linearly with the attribute's domain size as well as the number of disjunctions.
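To make the optimized support problem concrete, here is a deliberately simplified sketch: a single numeric attribute already split into buckets, a single interval (no disjunctions), and an exhaustive search over intervals with prefix sums. It is not the paper's dynamic programming algorithm; the function name `best_interval` and the example counts are hypothetical.

```python
from itertools import accumulate
from typing import List, Optional, Tuple


def best_interval(
    support: List[int], hits: List[int], min_conf: float
) -> Optional[Tuple[int, int, int]]:
    """Return (support, lo, hi) for the bucket range [lo, hi] with maximal
    support whose confidence hits/support is at least min_conf.

    support[i]: number of tuples whose attribute value falls in bucket i
    hits[i]:    how many of those tuples also satisfy the rule's consequent
    """
    pre_s = [0] + list(accumulate(support))
    pre_h = [0] + list(accumulate(hits))
    best: Optional[Tuple[int, int, int]] = None
    n = len(support)
    for lo in range(n):
        for hi in range(lo, n):
            s = pre_s[hi + 1] - pre_s[lo]
            h = pre_h[hi + 1] - pre_h[lo]
            # confidence constraint written without division
            if s > 0 and h >= min_conf * s:
                if best is None or s > best[0]:
                    best = (s, lo, hi)
    return best


support = [10, 20, 30, 20, 10]
hits = [2, 15, 24, 6, 1]
print(best_interval(support, hits, min_conf=0.6))  # (70, 1, 3)
```

With these counts, buckets 1 to 3 cover 70 tuples at a confidence of 45/70, while any wider range drops below the 0.6 threshold. Allowing several disjoint intervals is what turns the search into the dynamic programming problem described in the abstract.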

9.
Zhu Zhen, Sun Yuan. Journal of Chinese Information Processing (《中文信息学报》), 2015, 29(6): 220-227
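This paper proposes a Tibetan person attribute extraction method based on the cooperation of SVM and generalized templates. The method first builds a template system based on Tibetan linguistic rules, collecting features that carry clear semantic information, such as case particles and special verbs, to construct templates and then generalize them. To address the limitations of the rule-based approach, the paper applies SVM machine learning on top of the templates, designs a hierarchical classifier structure for the multi-class problem, and explains the selection of the diverse features. Finally, experimental results show that combining SVM with templates considerably improves the performance of person attribute extraction.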

10.
In sentiment analysis, a finer-grained opinion mining method focuses not only on the view of the product itself but also on product features, which can be a component or attribute of the product. Previous related research mainly relied on explicit features and ignored implicit features. However, implicit features, which are implied by some words or phrases, are significant: they can express the users' opinions and help us to better understand the users' comments. Detecting these implicit features in Chinese product reviews is a big challenge because of the complexity of Chinese. This paper centers on implicit feature identification in Chinese product reviews. A novel hybrid association rule mining method is proposed for this task. The core idea of the approach is to mine as many association rules as possible via several complementary algorithms. First, we extract candidate feature indicators based on word segmentation, part-of-speech (POS) tagging and feature clustering, then compute the co-occurrence degree between the candidate feature indicators and the feature words using five collocation extraction algorithms. Each indicator and the corresponding feature word constitute a rule (feature indicator → feature word). The best rules in the five different rule sets are chosen as the basic rules. Next, three methods are proposed to mine further plausible rules from the feature indicators with lower co-occurrence and from non-indicator words. Finally, the resulting rules are used to identify implicit features, and the results are compared with the previous results. Experimental results demonstrate that the proposed approach is competent at the task, especially when the expansion methods are used. The recall is effectively improved, suggesting that the shortcomings of the basic rules have been overcome to a certain extent. Besides the indicators with a high co-occurrence degree, the final rules also contain uncommon rules.
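The rule form (feature indicator → feature word) can be sketched with a single co-occurrence measure. The abstract does not name its five collocation extraction algorithms, so pointwise mutual information (PMI) over review-level co-occurrence is used here purely as an assumed stand-in; `mine_rules`, the threshold and the toy reviews are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Set, Tuple


def mine_rules(
    reviews: List[List[str]], feature_words: Set[str], min_pmi: float = 0.0
) -> Dict[str, str]:
    """Map each indicator word to the feature word it co-occurs with most
    strongly, measured by PMI over reviews (one count per review)."""
    n = len(reviews)
    word_count: Counter = Counter()
    pair_count: Counter = Counter()
    for tokens in reviews:
        present = set(tokens)
        word_count.update(present)
        features_here = present & feature_words
        for indicator in present - feature_words:
            for feature in features_here:
                pair_count[(indicator, feature)] += 1

    best: Dict[str, Tuple[str, float]] = {}
    for (indicator, feature), c in pair_count.items():
        pmi = math.log2(c * n / (word_count[indicator] * word_count[feature]))
        if pmi >= min_pmi and (indicator not in best or pmi > best[indicator][1]):
            best[indicator] = (feature, pmi)
    return {indicator: feature for indicator, (feature, _) in best.items()}


reviews = [
    ["expensive", "price"],
    ["expensive", "price"],
    ["heavy", "weight"],
    ["heavy", "weight"],
    ["expensive", "weight"],
]
print(mine_rules(reviews, {"price", "weight"}))
```

A review that contains "expensive" but no explicit feature word could then be assigned the implicit feature "price" by looking up the indicator in the returned rule table.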

11.
Unsupervised word sense disambiguation based on k-means clustering   Total citations: 5, self-citations: 3, citations by others: 5
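Unsupervised word sense disambiguation avoids the heavy workload of manual sense annotation, scales to the disambiguation of polysemous words in large volumes, and has broad application prospects. This article proposes an unsupervised word sense disambiguation method that builds context vectors from second-order context, clusters them with the k-means algorithm, and finally disambiguates word senses by computing similarity. The experiments were carried out on the basis of extracted terms; in two groups of tests on several high-frequency polysemous Chinese words, the method achieved good average accuracies of 82.67% and 80.87%.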
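The pipeline described above (second-order context vectors, then k-means) can be sketched as follows. The sketch assumes that each context word already has a first-order co-occurrence vector, averages those vectors to represent one occurrence of the ambiguous word, and uses a deterministic initialization from the first k points; the names and the toy vectors are hypothetical, and the similarity-based sense labeling step is left out.

```python
from typing import Dict, List, Sequence


def second_order_vector(
    context: Sequence[str], word_vectors: Dict[str, List[float]], dim: int
) -> List[float]:
    """Average the co-occurrence vectors of the words around one occurrence."""
    vecs = [word_vectors[w] for w in context if w in word_vectors]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def kmeans(points: List[List[float]], k: int, iterations: int = 20) -> List[int]:
    """Plain k-means; centroids start at the first k points (an assumption
    made so that the example is reproducible)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Toy co-occurrence vectors for words seen near an ambiguous word such as "bank".
word_vectors = {
    "river": [1.0, 0.0],
    "water": [1.0, 0.0],
    "money": [0.0, 1.0],
    "loan": [0.0, 1.0],
}
contexts = [["river", "water"], ["money", "loan"], ["water"], ["loan", "money"]]
points = [second_order_vector(c, word_vectors, dim=2) for c in contexts]
print(kmeans(points, k=2))  # [0, 1, 0, 1]
```

Each resulting cluster stands for one induced sense; a new occurrence would be assigned to the cluster whose centroid is most similar to its context vector.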

12.
A statistics-based method for extracting rules from neural networks   Total citations: 6, self-citations: 0, citations by others: 6
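Starting from a functional point of view, a statistics-based method for extracting rules from neural networks is proposed. The method uses statistical techniques to evaluate the extracted rules so that they cover the instance space well. A distinctive treatment of continuous attributes reduces the subjectivity and complexity of discretization. Rules are expressed in a priority-ordered form, which not only makes the rule representation concise and compact but also removes the need for consistency handling when the rules are applied. The method does not depend on a specific network architecture or training algorithm and can be readily applied to various classifier-type neural networks. Experiments show that the method can extract symbolic rules that are comprehensible, concise, compact and of high fidelity.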
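Why a priority-ordered rule form needs no consistency handling can be seen in a small sketch: rules are tried in order and the first one whose conditions hold decides the class, so overlapping rules never conflict. The representation below (a list of condition dictionaries with predicates) and the example thresholds are hypothetical and are not taken from the paper.

```python
from typing import Any, Callable, Dict, List, Tuple

Conditions = Dict[str, Callable[[Any], bool]]
Rule = Tuple[Conditions, str]


def classify(rules: List[Rule], example: Dict[str, Any], default: str) -> str:
    """Apply rules in priority order; the first matching rule wins."""
    for conditions, label in rules:
        if all(test(example[attr]) for attr, test in conditions.items()):
            return label
    return default


# Both rules match a temperature of 39.0, but the higher-priority rule decides.
rules: List[Rule] = [
    ({"temp": lambda v: v > 38.0}, "fever"),
    ({"temp": lambda v: v > 37.0}, "elevated"),
]
print(classify(rules, {"temp": 39.0}, default="normal"))  # fever
print(classify(rules, {"temp": 37.5}, default="normal"))  # elevated
print(classify(rules, {"temp": 36.6}, default="normal"))  # normal
```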

13.
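Attribute reduction and rule acquisition based on rough concept lattices   Total citations: 2, self-citations: 0, citations by others: 2
Huang Jiazeng. Software (《软件》), 2011, (10): 16-19, 23
Combining rough set theory and concept lattice theory, a method for multi-attribute reduction and rule extraction in decision contexts is given. To this end, a concrete attribute reduction method for decision contexts is derived from their discernibility matrix and discernibility function; on this basis, a method for rule extraction and attribute reduction in decision contexts is obtained, and an example demonstrates the feasibility and effectiveness of the reduction method.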
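The role of the discernibility matrix and discernibility function can be illustrated on an ordinary decision table. This is the classical rough set construction, used here as an assumed simplification of the rough-concept-lattice setting: each matrix entry lists the attributes that distinguish two objects with different decisions, and the reducts are the minimal attribute sets that intersect every entry. The brute-force search and all names are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Hashable, List, Sequence, Tuple

Row = Dict[str, Hashable]


def discernibility_entries(
    rows: Sequence[Row], decisions: Sequence[str], attrs: Sequence[str]
) -> List[FrozenSet[str]]:
    """Non-empty matrix entries for object pairs with different decisions."""
    entries = []
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] != decisions[j]:
            diff = frozenset(a for a in attrs if rows[i][a] != rows[j][a])
            if diff:
                entries.append(diff)
    return entries


def reducts(
    rows: Sequence[Row], decisions: Sequence[str], attrs: Sequence[str]
) -> List[Tuple[str, ...]]:
    """All minimal attribute subsets that hit every entry, i.e. the prime
    implicants of the discernibility function (exponential search)."""
    entries = discernibility_entries(rows, decisions, attrs)
    found: List[Tuple[str, ...]] = []
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            chosen = set(subset)
            hits_all = all(chosen & entry for entry in entries)
            is_minimal = not any(set(r) <= chosen for r in found)
            if hits_all and is_minimal:
                found.append(subset)
    return found


rows = [
    {"a": 0, "b": 0, "c": 1},
    {"a": 0, "b": 1, "c": 1},
    {"a": 1, "b": 1, "c": 0},
]
decisions = ["n", "y", "y"]
print(discernibility_entries(rows, decisions, ["a", "b", "c"]))
print(reducts(rows, decisions, ["a", "b", "c"]))  # [('b',)]
```

Here the entries are {b} and {a, b, c}, so the discernibility function is b ∧ (a ∨ b ∨ c), which simplifies to b: attribute b alone preserves the decisions.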

14.
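Unsupervised word sense disambiguation based on MDL clustering   Total citations: 2, self-citations: 0, citations by others: 2
Unsupervised word sense disambiguation avoids the heavy workload of manual sense annotation, scales to the disambiguation of polysemous words in large volumes, and has broad application prospects. An unsupervised word sense disambiguation method is proposed that uses the HowNet lexicon as its dictionary, builds context vectors from second-order context, clusters them with the MDL algorithm, and finally disambiguates word senses by computing similarity. The experiments were carried out on the basis of extracted terms; in tests on 8 high-frequency polysemous Chinese words, the method achieved a good average accuracy of 81.12%.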

15.
Since the inception of the Senseval series there has been a great deal of debate in the word sense disambiguation (WSD) community on what the right sense distinctions are for evaluation, with the consensus of opinion being that the distinctions should be relevant to the intended application. A solution to the above issue is lexical substitution, i.e. the replacement of a target word in context with a suitable alternative substitute. In this paper, we describe the English lexical substitution task and report an exhaustive evaluation of the systems participating in the task organized at SemEval-2007. The aim of this task is to provide an evaluation where the sense inventory is not predefined and where performance on the task would bode well for applications. The task not only reflects WSD capabilities, but also can be used to compare lexical resources, whether man-made or automatically created, and has the potential to benefit several natural-language applications.
Roberto Navigli

16.
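Information extraction is an important area of data mining. Text information extraction means extracting specified information from a piece of free text and storing it as structured data in a knowledge base for user queries or further processing. Person attribute extraction is an important foundation for building intelligent person search engines, and structured information is also a data format that computers can understand. The authors propose a method for automatically acquiring person attributes from encyclopedias: it uses the part-of-speech information of each attribute value to locate the value in the encyclopedia's free text, discovers rules by statistical means, and then obtains person attribute information from the encyclopedia text by rule matching. Experiments show that the method is effective for extracting person attribute information from encyclopedia text. The extraction results can be used to build a person attribute knowledge base.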

17.
Attribute reduction in decision-theoretic rough set models   Total citations: 6, self-citations: 0, citations by others: 6
Yiyu Yao. Information Sciences, 2008, 178(17): 3356-3373
Rough set theory can be applied to rule induction. There are two different types of classification rules, positive and boundary rules, leading to different decisions and consequences. They can be distinguished not only by syntactic measures such as confidence, coverage and generality, but also by semantic measures such as decision-monotonicity, cost and risk. The classification rules can be evaluated locally for each individual rule, or globally for a set of rules. Both types of classification rules can be generated from, and interpreted by, a decision-theoretic model, which is a probabilistic extension of the Pawlak rough set model. As an important concept of rough set theory, an attribute reduct is a subset of attributes that are jointly sufficient and individually necessary for preserving a particular property of the given information table. This paper addresses attribute reduction in decision-theoretic rough set models with respect to different classification properties, such as decision-monotonicity, confidence, coverage, generality and cost. It is important to note that many of these properties can be truthfully reflected by a single measure γ in the Pawlak rough set model. In probabilistic models, on the other hand, they need to be considered separately, and a straightforward extension of the γ measure is unable to evaluate them. This study provides a new insight into the problem of attribute reduction.
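For reference, the γ measure of the Pawlak model is commonly defined as the fraction of objects in the positive region, that is, objects whose equivalence class under the chosen attributes carries a single decision. The sketch below computes it for a small table under that standard definition; the function name and data are hypothetical, and the probabilistic extensions discussed in the abstract are not covered.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence, Set, Tuple

Row = Dict[str, Hashable]


def gamma(rows: Sequence[Row], decisions: Sequence[str], attrs: Sequence[str]) -> float:
    """Dependency degree: |POS_attrs(D)| / |U| in the Pawlak rough set model."""
    # Group decisions by equivalence class (objects equal on all chosen attributes).
    class_decisions: Dict[Tuple[Hashable, ...], Set[str]] = defaultdict(set)
    for row, d in zip(rows, decisions):
        class_decisions[tuple(row[a] for a in attrs)].add(d)
    # An object is in the positive region if its class has exactly one decision.
    positive = sum(
        1 for row in rows if len(class_decisions[tuple(row[a] for a in attrs)]) == 1
    )
    return positive / len(rows)


rows = [
    {"a": 0, "b": 0},
    {"a": 0, "b": 1},
    {"a": 1, "b": 1},
    {"a": 1, "b": 1},
]
decisions = ["n", "y", "y", "n"]
print(gamma(rows, decisions, ["a", "b"]))  # 0.5
print(gamma(rows, decisions, ["a"]))       # 0.0
```

In this table dropping attribute b lowers γ from 0.5 to 0.0, so b is necessary for preserving the positive region.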

18.
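Entity attribute extraction is an important foundation for tasks such as information extraction and knowledge base construction. This paper proposes a method for acquiring entity attributes from online encyclopedias. The method first captures candidate attribute phrases using the structural features of online encyclopedias and domain-independent extraction patterns, then obtains as many surface forms of each attribute as possible through synonym expansion, and at the same time derives the sets of synonymous attributes for the corresponding entity category. Experiments show that, with the precision of attribute extraction unchanged, the method obtains a set of entity attributes with broader coverage than a method that uses frequency alone.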

19.
This paper proposes a method to extract rules for the anaphora resolution of Japanese zero pronouns in Japanese–English MT from aligned sentence pairs. After aligned sentence pairs that are unsuitable for rule extraction because of analysis errors or free translations are automatically rejected, zero pronouns in the Japanese sentences and the English translation equivalents of their antecedents are extracted from the remaining aligned sentence pairs using ten hand-developed alignment rules. Japanese zero pronouns whose translation equivalents are not explicitly expressed in the English sentence are identified as unalignable. Then, resolution rules for the remaining zero pronouns are automatically extracted using the aligned pairs, equivalent word/phrase pairs extracted from the aligned sentence pairs, and the syntactic and semantic structures of the Japanese sentences. This method was implemented in a Japanese–English MT system, ALT-J/E. 98.4% of all pairs were automatically aligned correctly in a window test, and 94.0% in a blind test. Furthermore, rules for zero pronouns with deictic references, extracted automatically from sentence pairs, correctly resolved 99.0% of the zero pronouns in a window test and 85.0% of the zero pronouns in a blind test.

20.
Artificial neural networks often achieve high classification accuracy, but they are regarded as black boxes because they cannot explain their decisions. This paper proposes a new rule extraction algorithm, RxREN, to overcome this drawback. Following a pedagogical approach, the algorithm extracts rules from trained neural networks for datasets with mixed-mode attributes. It relies on a reverse engineering technique to prune the insignificant input neurons and to discover the technological principles of each significant input neuron of the network in classification. The novelty of the algorithm lies in the simplicity of the extracted rules, whose conditions involve both discrete and continuous attributes. Experiments on six real datasets (iris, wbc, hepatitis, pid, ionosphere and creditg) show that the proposed algorithm is quite efficient at extracting a small set of rules with higher classification accuracy than those generated by other neural network rule extraction methods.
