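This paper discusses a common ensemble method, the minimum sum-of-squared-distances criterion. It points out the desirable properties of the fused result obtained under this criterion by linear weighting, as well as an error in the information retrieval literature. It then analyzes the retrieval performance of the fused result obtained under the minimum sum-of-distances criterion and finds that this fused result lies closest to the original retrieval results. Finally, an example verifies the results of the method.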
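As a brief illustration of why the squared-distance criterion leads to an equally weighted linear combination, suppose each of the n component results is a score vector s_i in R^m. This notation is an assumption introduced here and is not taken from the abstract. Setting the gradient of the objective to zero gives the arithmetic mean:

```latex
\[
F^{*} = \arg\min_{F \in \mathbb{R}^{m}} \sum_{i=1}^{n} \lVert F - s_i \rVert^{2},
\qquad
\nabla_{F} \sum_{i=1}^{n} \lVert F - s_i \rVert^{2}
  = 2 \sum_{i=1}^{n} (F - s_i) = 0
\;\Longrightarrow\;
F^{*} = \frac{1}{n} \sum_{i=1}^{n} s_i .
\]
```

Under the unsquared criterion, the minimizer of \(\sum_{i} \lVert F - s_i \rVert\) is the geometric median, which in general has no closed form and is usually computed iteratively.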

An ACS-based framework for fuzzy data mining   Cited by: 1 (self-citations: 0, citations by others: 1)
Data mining is often used to find interesting and meaningful patterns in huge databases. It can generate different kinds of knowledge, such as classification rules, clusters, and association rules, among others. Much research on data mining has been proposed, most of it focused on mining binary-valued data. Fuzzy data mining was therefore proposed to discover fuzzy knowledge from linguistic or quantitative data. Recently, ant colony systems (ACS) have been successfully applied to optimization problems, but little work has been done on applying ACS to fuzzy data mining. This thesis therefore proposes an ACS-based framework for fuzzy data mining. In the framework, the membership functions are first encoded as binary bits and then fed into the ACS to search for the optimal set of membership functions. The problem is thereby transformed into a multi-stage graph, with each route representing a possible set of membership functions. When the termination condition is reached, the best membership function set (the one with the highest fitness value) is used to mine fuzzy association rules from a database. Finally, experiments compare the proposed framework with other approaches and show its performance.

Due to the steady increase in the number of heterogeneous types of location information on the internet, it is hard to organize a complete overview of geospatial information for knowledge acquisition tasks related to specific geographic locations. Text and photo geographical datasets contain numerous location data, such as location-based tourism information, and therefore define high-dimensional spaces of highly correlated attributes. In this work, we use text and photo location information with a novel information fusion approach that exploits effective image annotation and location-based text mining to enhance the identification of geographic locations and spatial cognition. We describe our feature extraction methods for annotating images and our use of text mining to analyze images and texts simultaneously, in order to carry out geospatial text mining and image classification tasks. Photo images and textual documents are then projected into a unified feature space to generate a co-constructed semantic space for information fusion. We also employ text mining to classify documents into categories based on their geospatial features, with the aim of discovering relationships between documents and geographical zones. The experimental results show that the proposed method can effectively enhance location-based knowledge discovery tasks.

Building Information Modelling (BIM) is a standard digital process that fuses building information from different sources into a 3D model over the building lifecycle. For new construction sites using BIM, it is possible to monitor cost, schedule, and changes throughout the lifecycle; existing buildings, however, do not have a BIM model. Manually creating BIM models for existing buildings is costly in both time and money, hence the need to extract information from available paper-based documentation and fuse it into a BIM model. The struggle of facility management and utility companies to fully adopt a BIM process (due to their high volumes of paper-based documentation of existing buildings) has led to research on creating these 3D BIM models from 2D floor plan images. This paper presents a novel processing pipeline that extracts 2D digital information from floor plans and fuses it into a 3D BIM model. The work focuses on fusing the available information to create the structure of the building in BIM format, which is considered the essential step before working with other sources of data. In this process, we introduce a type-2 fuzzy logic based Explainable Artificial Intelligence (XAI) approach for the semantic segmentation step. The approach uses the outputs of type-2 fuzzy logic systems to classify a pixel as wall or background, taking information from and around the pixel of interest as the inputs to the system. After the semantic segmentation step, the output of the type-2 fuzzy logic system goes through noise removal and finally a transformation from 2D to 3D, in which the corresponding BIM tag is assigned to each identified element. The proposed type-2 fuzzy logic semantic segmentation approach produced results (97.3% mean Intersection over Union (IoU)) comparable to those of the opaque-box approach based on a Convolutional Neural Network (CNN) (99.3% mean IoU).
However, the type-2 fuzzy XAI system has the benefit of being an augmentable and interpretable model: human users can understand the decision process and modify the model using their expert knowledge.

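To address the large volume of redundant data in big-data environments, a method for detecting similar and duplicate records is proposed that builds on rough set theory and combines Shannon entropy with fuzzy comprehensive evaluation. Attributes in the data set are first reduced using Shannon entropy; fuzzy comprehensive evaluation is then used to obtain an importance weight for each remaining attribute; finally, similar records are detected from the reduced attributes and their weights. Theoretical analysis and comparative experiments show that the method achieves high detection accuracy and efficiency for similar-record detection on structured big data sets.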
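The entropy step in the abstract above can be sketched as follows. This is a minimal sketch and not the authors' algorithm: the function names, the record layout (a list of dicts), and the rule "drop attributes whose entropy does not exceed a threshold" are assumptions made for illustration, and the rough-set reduction and fuzzy weighting stages are not reproduced.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def low_information_attributes(records, threshold=0.0):
    """Return attribute names whose entropy is <= threshold.

    Hypothetical reduction rule: an attribute that takes (almost) the same
    value in every record cannot help tell records apart, so it is a
    candidate for removal before similarity detection.
    """
    names = list(records[0].keys())
    return [
        name
        for name in names
        if shannon_entropy([r[name] for r in records]) <= threshold
    ]
```

For example, in a table where every record has the same `city` value but a distinct `id`, only `city` would be flagged by this rule.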

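How to obtain coarse-grained information is an active research topic in information management and information systems. A method for acquiring coarse-grained information from multiple instances is proposed, based on integrated mining that combines fuzzy cognitive maps (FCM) with information fusion. The FCM establishes fuzzy cognitive relations between multiple fine-grained concepts and a coarse-grained concept; information fusion constructs the information representation of the coarse-grained concept; and nonlinear Hebbian learning (NHL) provides automatic learning from the data source, so that the information value of the coarse-grained concept can be computed. The method is analyzed and validated on the public Fisher's Iris data set and applied to the evaluation and discovery of scientific and technical talent based on big data from the scientific literature.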

This study proposes a knowledge discovery model that integrates a modified fuzzy transaction data-mining algorithm (MFTDA) with the Adaptive-Network-Based Fuzzy Inference System (ANFIS) to discover implicit knowledge in a fuzzy database more efficiently and present it more concisely. A prototype was built to test the feasibility of the model. The test data come from a company's human resource management department. The results indicate that the generated rules (knowledge) are useful in helping the company predict its employees' future performance and assign suitable people to appropriate positions and projects. Furthermore, the convergence of ANFIS in the model proved more efficient than that of a generic fuzzy artificial neural network.

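Application of fuzzy time series mining to fuzzy modeling of complex systems   Cited by: 5 (self-citations: 0, citations by others: 5)
For fuzzy modeling of complex industrial processes, a time-series-based method for mining fuzzy quantitative data is proposed, and its application to structure identification of fuzzy logic inference models of complex systems is discussed. The method is built on the database of historical data collected from the system and handles well the problem of constructing the initial model when fuzzy-modeling multiple-input multiple-output (MIMO) nonlinear complex industrial processes. The paper closes by discussing the application of the method to modeling an alumina clinker sintering rotary kiln, where it achieved good results in on-site operation.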

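Traditional many-to-one data fusion methods cannot effectively support wireless sensor network (WSN) applications with multiple sources and multiple sinks, and they adapt poorly to network dynamics and scalability requirements. To address this, a distributed many-to-many data fusion method based on probabilistic routing is proposed. Without requiring global knowledge, multiple sink nodes can simultaneously and efficiently collect sensed data from multiple source nodes. The method uses chaotic ant colony optimization to guide the direction of data transmission and combines data fusion with multicast to reduce communication load. While the network environment remains stable, the routing structure gradually converges to a near-optimal data fusion structure; if the environment changes, the routing structure adjusts quickly. Simulation experiments comparing the method with existing ones verify its feasibility and efficiency.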

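Application of fuzzy neural networks to information fusion for mobile robots   Cited by: 9 (self-citations: 0, citations by others: 9)
For the sensors used on a mobile robot, a method for multi-sensor information fusion is proposed that combines fuzzy logic with neural networks to build a fuzzy neural network, for which a computational model is established. Fusing the robot's multi-sensor information with this network enables real-time recognition of obstacles and environment types in a dynamic environment, as well as collision-free motion. Training and testing of the network show that the method is feasible for a mobile robot avoiding moving objects.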

Data fusion is the process of combining the output of a number of Information Retrieval (IR) algorithms into a single result set, to achieve greater retrieval performance. ProbFuse is a data fusion algorithm that uses the history of the underlying IR algorithms to estimate the probability that subsequent result sets include relevant documents in particular positions. It has been shown to outperform CombMNZ, the standard data fusion algorithm against which performance is compared, in a number of previous experiments. This paper builds upon this previous work and applies probFuse to the much larger Web Track document collection from the 2004 Text REtrieval Conference. The performance of probFuse is compared against that of CombMNZ using a number of evaluation measures and is shown to achieve substantial performance improvements.
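For readers unfamiliar with the CombMNZ baseline named above, the following is a minimal sketch: each document's fused score is the sum of its normalized scores multiplied by the number of result sets that returned it. The function names, the dict-based input format, and the choice of min-max normalization are assumptions made here for illustration. The probFuse algorithm itself is not reproduced.

```python
def min_max_normalize(scores):
    """Scale one system's scores into [0, 1] (assumed normalization scheme)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every returned document as equally good.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_mnz(result_sets):
    """Fuse several {doc_id: score} result sets with CombMNZ.

    Returns doc ids ordered by fused score (ties broken by doc id).
    """
    totals, hits = {}, {}
    for scores in result_sets:
        if not scores:
            continue  # an empty result set contributes nothing
        for doc, s in min_max_normalize(scores).items():
            totals[doc] = totals.get(doc, 0.0) + s
            hits[doc] = hits.get(doc, 0) + 1
    # Sum of normalized scores times the number of systems returning the doc.
    fused = {doc: totals[doc] * hits[doc] for doc in totals}
    return sorted(fused, key=lambda d: (-fused[d], d))
```

With `run_a = {"d1": 3.0, "d2": 1.0}` and `run_b = {"d2": 0.9, "d3": 0.1}`, the document `d2` is returned by both systems and therefore ranks first.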

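A comparative study of association rule mining algorithms in data mining   Cited by: 27 (self-citations: 12, citations by others: 15)
The paper analyzes the state of research on association rule mining algorithms in data mining and proposes a new way of measuring the value of association rules, together with directions for further research. Taking the core Apriori algorithm as the starting point, it surveys typical association rule mining algorithms through literature search and comparative analysis, and concludes: (1) even when optimized, the Apriori approach has inherent defects that cannot be overcome and that need further study; (2) future research will focus on making algorithms more efficient on very large and unstructured data, on integration with OLAP, and on visualization of the generated results.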
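The Apriori algorithm that the survey above takes as its baseline can be sketched as follows. This is a compact textbook-style sketch of frequent-itemset mining with the join and prune steps, not code from the paper; the function name and the use of absolute support counts are choices made here. The repeated scans of the transaction list also hint at the cost on very large data that the abstract refers to.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    `min_support` is an absolute count of transactions.
    """
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}  # level-1 candidates
    frequent = {}
    k = 1
    while current:
        # One full pass over the data per level to count candidate support.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c
            for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent
```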

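Research on a gas disaster information fusion model based on data mining   Cited by: 1 (self-citations: 0, citations by others: 1)
Data mining and information fusion are two data-processing procedures with different functions; although their principles differ, they can complement each other functionally. The paper introduces the principle and algorithm for building an information fusion model with data mining techniques, studies how to build a gas disaster information fusion model using a data mining algorithm based on fuzzy rough sets, and analyzes the resulting model by simulating its error curve.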

Data fusion in information retrieval has been investigated by many researchers, and a number of data fusion methods have been proposed. However, questions such as why data fusion can increase effectiveness, and under which conditions data fusion methods work well, are poorly resolved at best. In this paper, we formally describe data fusion under a geometric framework, in which each component result returned from an information retrieval system for a given query is represented as a point in a multi-dimensional space. The Euclidean distance is the measure by which the effectiveness and similarity of search results are judged. This allows us to explain all component results and fused results using geometrical principles. In such a framework, score-based data fusion becomes a deterministic problem. Several interesting features of the centroid-based data fusion method and the linear combination method are discussed. In retrieval evaluation, however, ranking-based measures are the most popular. This paper therefore investigates the relation and correlation between the Euclidean distance and several typical ranking-based measures, and finds that a very strong correlation exists between them. This means that the theorems and observations obtained using the Euclidean distance remain valid when ranking-based measures are used. The proposed framework gives a better understanding of score-based data fusion and allows score-based data fusion methods to be used more precisely and effectively in various ways.
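The geometric view described above can be made concrete with a small sketch. It is an illustration under assumptions made here: each component result is a list of per-document scores, the "ideal" point is a vector of relevance values, and the function names are hypothetical. The centroid method averages the component points, and the linear combination method generalizes this with per-system weights.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length score vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def centroid(points):
    """Centroid-based fusion: average the component results coordinate-wise."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def linear_combination(points, weights):
    """Linear combination fusion: weighted sum of the component results."""
    return [sum(w * x for w, x in zip(weights, col)) for col in zip(*points)]
```

One property that follows from the triangle inequality is that the centroid is never farther from the ideal point than the component results are on average, which gives a geometric reading of why fusion can help.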

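Application of multi-sensor fuzzy information fusion to coal mine safety   Cited by: 2 (self-citations: 0, citations by others: 2)
Mine safety monitoring systems in China often use a single sensor to monitor the safety of the underground working environment. Because a single sensor cannot reliably determine whether a hazard has occurred, a monitoring method using multi-sensor fuzzy information fusion is proposed on the basis of fuzzy set theory: the information obtained from the multiple sensors of the monitoring equipment is fuzzified and then fused to obtain an accurate estimate of the equipment state. The method makes full use of the varied information provided by the sensors and improves the recognition rate of the system. Experimental results show that multi-sensor information fusion achieves higher recognition accuracy than a single sensor, so multi-sensor fuzzy information fusion is a promising new method.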

The goal of data mining is to find interesting and meaningful patterns in large databases. In some real applications, much of the data is quantitative or linguistic, and fuzzy data mining was proposed to discover fuzzy knowledge from this kind of data. In the past, two mining algorithms based on ant colony systems were proposed to find suitable membership functions for fuzzy association rules. They transformed the problem into a multi-stage graph, with each route representing a possible set of membership functions, and then used the ant colony system to solve it. They searched, however, in a discrete solution space in which the end points of membership functions could be adjusted only in discrete steps. This paper extends the original approaches to a continuous search space and proposes a fuzzy mining algorithm based on the continuous ant approach. The end points of the membership functions may be moved in the continuous real-number space. The encoding representation and the operators are also designed to suit the continuous space, so that the actual global optimal solution is contained in the search space. In addition, the proposed approach does not have fixed edges and nodes in the search process; it dynamically produces search edges according to the distribution functions of pheromones in the solution space. It can therefore reach a better near-global-optimal solution than the previous two ant-based fuzzy mining approaches. The experimental results also show the good performance of the proposed approach.

Recently, the class imbalance problem has attracted much attention from researchers in the field of data mining. When learning from imbalanced data, in which most examples are labeled as one class and only a few belong to another, traditional data mining approaches predict the crucial minority instances poorly. Unfortunately, many real-world data sets, such as those from health examination, inspection, credit fraud detection, spam identification, and text mining, face this situation. In this study, we present a novel model called the “Information Granulation Based Data Mining Approach” to tackle this problem. The proposed methodology, which imitates the human ability to process information, acquires knowledge from Information Granules rather than from numerical data. The method also introduces a Latent Semantic Indexing based feature extraction tool that uses Singular Value Decomposition to dramatically reduce the data dimensions. Several data sets from the UCI Machine Learning Repository are used to demonstrate the effectiveness of our method. Experimental results show that our method can significantly improve the classification of imbalanced data.

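As an important theory of reasoning under uncertainty, Dempster-Shafer (D-S) evidence theory offers a good way to handle the fuzziness and uncertainty of sensor information. How to generate the basic probability assignment function (mass function) for each piece of evidence, however, remains an open problem. To address it, a method is proposed that uses the Gaussian membership function from fuzzy theory to obtain the credibility of the information provided by each sensor and computes the mutual support between sensors; the credibility and support of each sensor are then converted into a mass function, and evidence theory is used to fuse the multi-sensor information. Simulation experiments show that the method effectively improves the accuracy and reliability of recognition.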
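Two building blocks named in the abstract above, the Gaussian membership function and Dempster's rule of combination, can be sketched as follows. This is a minimal sketch under assumptions made here: mass functions are represented as dicts mapping `frozenset` hypotheses to masses, and the function names are hypothetical. The paper's specific conversion of credibility and support into mass functions is not described in the abstract and is not reproduced.

```python
import math
from itertools import product


def gaussian_membership(x, mean, sigma):
    """Gaussian membership degree of reading x for a class centered at mean."""
    return math.exp(-((x - mean) ** 2) / (2 * sigma ** 2))


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    m1 and m2 map frozenset hypotheses to masses that each sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}
```

To fuse more than two sensors, the rule can be applied repeatedly, for example with `functools.reduce(dempster_combine, masses)`.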

《Information Fusion》2003,4(2):123-133
The study presented in this paper concerns an interactive fusion system that uses fuzzy set theory to detect regions in three-dimensional seismic images. To detect regions in 3D seismic images, attributes extracted from the images are fused using the knowledge of a geophysicist interpreter by means of a fuzzy rule-based classifier. The original contribution of this work lies in the means offered to the end user for tuning the fuzzy membership functions in a two-dimensional universe for particular 2D reference image sections. The proposed graphical user interface yields better region detection than is obtained without fusion. Moreover, a confidence index, based on information theory concepts, is introduced. This index is based on a coefficient of attribute influence and sheds some light on how the fusion results were obtained.
