A fuzzy approach to classification of text documents
This paper discusses the classification of text documents. Based on the concept of proximity degree, the set of words is partitioned into equivalence classes. In particular, the concepts of semantic field and association degree are introduced. Building on these concepts, the paper presents a fuzzy classification approach for document categorization. Furthermore, by applying the concept of information entropy, it derives approaches for selecting key words from the set of words covering the classification of documents and for constructing a hierarchical structure of key words.
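
The abstract does not say how entropy is used to pick key words. As one common reading, the sketch below ranks words by information gain, that is, by how much knowing whether a word occurs reduces the entropy of the class labels. The toy documents, the labels `A` and `B`, and the function names are all assumptions made for illustration, not the authors' procedure.

```python
import math
from collections import Counter
from typing import List, Set, Tuple

Document = Tuple[Set[str], str]  # (words in the document, class label)


def entropy(labels: List[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    if not labels:
        return 0.0
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def information_gain(word: str, docs: List[Document]) -> float:
    """Reduction in class entropy from knowing whether `word` occurs."""
    with_word = [label for words, label in docs if word in words]
    without_word = [label for words, label in docs if word not in words]
    total = len(docs)
    conditional = (
        len(with_word) / total * entropy(with_word)
        + len(without_word) / total * entropy(without_word)
    )
    return entropy([label for _, label in docs]) - conditional


# Hypothetical corpus: four tiny documents in two classes.
docs: List[Document] = [
    ({"fuzzy", "set"}, "A"),
    ({"fuzzy", "logic"}, "A"),
    ({"market", "price"}, "B"),
    ({"market", "set"}, "B"),
]
vocabulary = sorted(set().union(*(words for words, _ in docs)))
gains = {word: information_gain(word, docs) for word in vocabulary}
# Words with the highest gain are the strongest key word candidates.
ranked = sorted(vocabulary, key=lambda w: gains[w], reverse=True)
```

In this toy corpus, "fuzzy" and "market" separate the two classes perfectly and therefore score highest, while "set" appears equally in both classes and carries no information about the class.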

Among the computational intelligence techniques employed to solve classification problems, Fuzzy Rule-Based Classification Systems (FRBCSs) are a popular tool because of their interpretable models based on linguistic variables, which are easier for experts and end-users to understand. The aim of this paper is to enhance the performance of FRBCSs by extending the Knowledge Base with Interval-Valued Fuzzy Sets (IVFSs). We consider a post-processing genetic tuning step that adjusts the amplitude of the upper bound of the IVFS to contextualize the fuzzy partitions and to obtain a more accurate solution to the problem. We analyze the goodness of this approach using two basic and well-known fuzzy rule learning algorithms, the Chi et al. method and the fuzzy hybrid genetics-based machine learning algorithm. We show the improvement achieved by this model through an extensive empirical study with a large collection of data sets.

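To address the long encoding time of fractal image coding, six feature parameters of the edge-extracted image are selected as image block features. A two-step classification method is proposed: image blocks are first classified using fuzzy pattern recognition, and clustering based on the local gray-level mean then further narrows the search range for the best match. Experiments show that the method considerably increases encoding speed with no noticeable decline in decoding quality.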

In recent years, many methods have been proposed to generate fuzzy rules from training instances for handling the Iris data classification problem. In this paper, we present a new method to generate fuzzy rules from training instances for the Iris data classification problem based on the attribute threshold value α, the classification threshold value β and the level threshold value γ, where α ∈ [0, 1], β ∈ [0, 1] and γ ∈ [0, 1]. The proposed method achieves a higher average classification accuracy rate than the existing methods.

This paper introduces a methodology that serves as a decision support tool for consumers in Internet business. The tool takes into account the multiple attributes of a product, analyses them with respect to the consumer's desire, and finally classifies the products into different hierarchical levels according to the consumer's level of preference. The product attributes, which are in general conflicting, imprecise, and non-commensurable in nature, are handled using the concepts of fuzzy logic. Linguistic quantifiers are used to quantify the qualitatively defined items and also to classify the products into different preference levels as required by the customer. Classifying products into preference levels in any business, particularly in business through the Internet, supports the customer and helps in making a final product choice. The procedure described here can be used by virtual buying agents to generate a hierarchical classification based on the buyer's preference. Finally, a numerical example illustrates the procedure.

A risk clustering prediction method based on fuzzy sets
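Complex socio-technical systems contain many uncertain factors, which create major obstacles and risks for social decision-making and project process management, so effective risk prediction methods are very important. Based on the risk factor vectors of risk projects, the method of fuzzy equivalence classes is used to perform fuzzy clustering on the historical data of risk projects; a risk clustering prediction method for projects is then realized through fuzzy matching between a new risk project and the historical data. Analysis and practice show that the model effectively solves the problem of classifying the many uncertain factors in risk projects. The method is suitable for risk management applications in government decision-making, e-commerce, software project management, and similar areas.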
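
Clustering by fuzzy equivalence classes is usually done by taking the max-min transitive closure of a fuzzy similarity relation and then cutting it at a threshold λ. The sketch below shows that textbook procedure. The similarity values, the threshold, and the function names are hypothetical, and the abstract does not say how the similarity between risk factor vectors is computed.

```python
from typing import List

Matrix = List[List[float]]


def max_min_compose(a: Matrix, b: Matrix) -> Matrix:
    """Max-min composition of two square fuzzy relations."""
    n = len(a)
    return [
        [max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(relation: Matrix) -> Matrix:
    """Square a reflexive fuzzy relation until it stops changing."""
    current = relation
    while True:
        squared = max_min_compose(current, current)
        if squared == current:
            return current
        current = squared


def lambda_cut_clusters(equivalence: Matrix, lam: float) -> List[List[int]]:
    """Group indices whose equivalence degree is at least lam."""
    clusters: List[List[int]] = []
    assigned = set()
    for i in range(len(equivalence)):
        if i in assigned:
            continue
        members = [j for j in range(len(equivalence)) if equivalence[i][j] >= lam]
        assigned.update(members)
        clusters.append(members)
    return clusters


# Hypothetical similarity degrees between four historical risk projects
# (reflexive and symmetric, but not yet transitive).
similarity = [
    [1.0, 0.8, 0.3, 0.2],
    [0.8, 1.0, 0.4, 0.2],
    [0.3, 0.4, 1.0, 0.7],
    [0.2, 0.2, 0.7, 1.0],
]
closure = transitive_closure(similarity)
clusters = lambda_cut_clusters(closure, 0.7)
```

Lowering λ merges clusters: at λ = 0.7 the four projects split into two groups, and at λ = 0.4 they fall into a single group. A new project could then be matched against the resulting groups, although the matching rule used in the paper is not given in the abstract.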

Database classification suffers from two well-known difficulties: high dimensionality and non-stationary variations within large historic data. This paper presents a hybrid classification model that integrates a case-based reasoning technique, a fuzzy decision tree (FDT), and genetic algorithms (GAs) to construct a decision-making system for data classification in various database applications. The model is mainly based on the idea that the historic database can be transformed into a smaller case base together with a group of fuzzy decision rules. As a result, the model can respond more accurately to the data currently being classified, using the inductions from these smaller case-based fuzzy decision trees. Hit rate is used as the performance measure, and the effectiveness of the proposed model is demonstrated experimentally against other approaches on different database classification applications. The proposed model achieves the highest average hit rate among the compared approaches.

A categorical approach is proposed for the formalization of fuzzy graph grammars, which are obtained by generalizing sequential graph grammars. This approach takes into consideration the basic types of fuzziness that arise in constructing categories of fuzzy objects and describing transformations of fuzzy graphs generated by fuzzy sets. All the undecidability problems that are well known for Chomsky grammars are proved to hold for fuzzy graph grammars. Translated from Kibernetika i Sistemnyi Analiz, No. 4, pp. 130–144, July–August 2006.

This paper describes a database framework which is similar to a relational database in style but uses alternative knowledge structures to represent uncertain data. Two knowledge structures are used: the mass assignment to represent probabilistic information and fuzzy sets to hold subjective information. We describe how the query is modified so that the selection criteria are held in the form of specific knowledge, which can be updated with the more general knowledge held in the database. The updating procedure has the effect of filling in uncertain or missing information so that a final solution can be found. The operations required to perform a query are generated automatically, and optimisation is performed as the operations are determined. The output from the database is in the form of a distribution over a projection of the database domain space. An example is given in which a database of sea vessels is given uncertain or noisy evidence about the characteristics of a vessel, and a distribution of the likelihood of each of the vessels is determined from the evidence.

One of the main problems in practice is the difficulty in dealing with membership functions. Many decision makers ask for a graphical representation to help them visualize results. In this paper, we point out that some useful tools for fuzzy classification can be derived from fuzzy coloring procedures. In particular, we present a crisp grey coloring algorithm based upon a sequential application of a basic black and white binary coloring procedure, already introduced in a previous paper [D. Gómez, J. Montero, J. Yáñez, C. Poidomani, A graph coloring algorithm approach for image segmentation, Omega, in press]. In this article, the image is conceived as a fuzzy graph defined on the set of pixels, where fuzzy edges represent the distance between pixels. In this way, we can obtain a more flexible hierarchical structure of colors, which in turn should give useful hints about those classes with unclear boundaries.

We address the problem of the representation of resemblances involved in analogical reasoning. We use fuzzy relations to compare situations. We provide constructive methods to adapt the solution of an already solved situation to a similar new situation according to the degree of resemblance between these two situations. We give a general definition of analogical scheme which can be considered from a more or less constrained point of view.

The probabilistic fuzzy set (PFS) is designed to handle uncertainties of both stochastic and fuzzy nature. In this paper, the concept of distance between probabilistic fuzzy sets is introduced and its metric definition is given, in either finite or continuous form. Some related distances are also discussed. The proposed distance accounts for random perturbation by incorporating a distance between probability distributions, which improves its ability to handle random uncertainties and remedies some inadequacies of the distance between probability distributions. Finally, a PFS-based distance classifier is proposed for the classification problem, and a numerical experiment shows the superiority of the proposed distance under fuzzy and stochastic conditions.

Tian Xuedong, Hao Nan. 《计算机应用》 (Journal of Computer Applications), 2007, 27(8): 2036-2037
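Formula extraction is a fundamental step in recognizing printed mathematical formulas. Most existing recognition methods assume that the formula regions are already known, and related research is still lacking. By introducing fuzzy classification theory, an algorithm for extracting isolated mathematical formulas is proposed. Through statistics and analysis of data from a large number of training sample pages, six features are selected, including irregularity degree, width-to-height ratio, and density; from these, fuzzy classification rules are built for isolated formula lines, text lines, and title lines, achieving the extraction of isolated formula lines. Experimental results show that the method has high accuracy and robustness.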

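For the imbalanced classification problem, a fuzzy support vector machine model based on membership weighting is proposed. A traditional support vector machine is first trained on the samples, and fuzzy memberships are constructed from the distance between each sample point and the resulting separating hyperplane; this not only eliminates the influence of noise points and outliers but also reduces the sample set to some extent. A balance adjustment factor is computed from the average memberships and sample counts of the positive and negative classes to eliminate the shift of the separating hyperplane caused by data imbalance. The feasibility and effectiveness of the algorithm are verified experimentally. The results show that the algorithm effectively improves classification accuracy, especially on imbalanced data, and further improves on the traditional support vector machine and the fuzzy support vector machine in training speed and classification performance.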

Motivated by fuzzy control problems and by some investigations of eigen fuzzy sets, we deal with the closedness of fuzzy sets under fuzzy relations in two ways: in one sense by directly analyzing fuzzy concepts and in the other by investigating the corresponding crisp problems in the cutworthy framework. Our main task is to investigate particular fuzzy functional equations and inequations appearing in this context, which turn out to be essentially connected with fuzzy control problems. We analyze procedures and find solutions of these equations and inequations, pointing to important applications.

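A fuzzy concept lattice constructed with the product implication operator is established in a formal context, and its definition is given. Its properties and hierarchical structure are discussed, and an algorithm for computing fuzzy concepts is given. A numerical example illustrates how this type of concept lattice is constructed.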
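
The abstract does not reproduce the definition or the algorithm. As a rough illustration only, the sketch below uses the standard derivation operators of fuzzy formal concept analysis together with the product (Goguen) implication, a → b = 1 if a ≤ b and b / a otherwise. The 2×2 context and all names are hypothetical, and the paper's own construction may differ.

```python
from typing import List

Context = List[List[float]]  # context[x][y]: degree to which object x has attribute y


def product_implication(a: float, b: float) -> float:
    """Residuum of the product t-norm (Goguen implication)."""
    return 1.0 if a <= b else b / a


def up(extent: List[float], context: Context) -> List[float]:
    """Degree to which each attribute is shared by the fuzzy set of objects."""
    return [
        min(product_implication(extent[x], context[x][y]) for x in range(len(context)))
        for y in range(len(context[0]))
    ]


def down(intent: List[float], context: Context) -> List[float]:
    """Degree to which each object has the fuzzy set of attributes."""
    return [
        min(product_implication(intent[y], context[x][y]) for y in range(len(intent)))
        for x in range(len(context))
    ]


# Hypothetical fuzzy formal context with two objects and two attributes.
context = [
    [1.0, 0.5],
    [0.5, 1.0],
]
seed_extent = [1.0, 0.0]              # start from the first object alone
intent = up(seed_extent, context)     # attributes shared by the seed
extent = down(intent, context)        # closure of the seed extent
is_concept = up(extent, context) == intent
```

A pair (extent, intent) is a fuzzy concept when each component is the derivation of the other, which is what `is_concept` checks. Enumerating the closures of suitable seeds is one way to generate the concepts of a small context.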

Research on a rough neural intelligent method for classifying suspected breast cancer images
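A rough neural intelligent method for classifying mammograms (breast X-ray images) is proposed; it is a hybrid intelligent computing technique. First, a fuzzy image processing algorithm is used to improve the contrast of the whole original image in order to extract regions of interest and enhance region edges. Next, a gray-level co-occurrence matrix is built and feature attributes characterizing the texture of the regions of interest are extracted. Then the rough set method is used for attribute reduction and rule generation. Finally, a rough neural network is designed to classify regions of interest as benign or malignant. To evaluate the performance of the proposed rough set neural network, a number of mammogram samples were tested; the experimental results show that the overall accuracy of breast cancer recognition with this method is higher than with other techniques.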
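
The texture step can be illustrated with a minimal gray-level co-occurrence matrix (GLCM) and two commonly used features derived from it, contrast and energy. The pixel offset, the number of gray levels, the tiny test image, and the choice of features are assumptions made for this sketch; the abstract does not state which settings or features the authors used.

```python
from typing import List

Image = List[List[int]]  # gray levels in the range 0 .. levels - 1


def glcm(image: Image, levels: int, dx: int = 1, dy: int = 0) -> List[List[float]]:
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs exist for this offset")
    return [[value / total for value in row] for row in counts]


def contrast(p: List[List[float]]) -> float:
    """Weights each pair by the squared difference of its gray levels."""
    return sum((i - j) ** 2 * p[i][j] for i in range(len(p)) for j in range(len(p)))


def energy(p: List[List[float]]) -> float:
    """Sum of squared probabilities; larger for more uniform texture."""
    return sum(value ** 2 for row in p for value in row)


# Hypothetical 3x3 region of interest quantized to two gray levels.
region = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
p = glcm(region, levels=2)  # horizontal neighbours, one pixel apart
features = {"contrast": contrast(p), "energy": energy(p)}
```

Feature values such as these, computed per region of interest, would form the attribute table that a rough set reduction step then prunes before classification.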

