Similar literature
 20 similar documents found; search time 31 ms
1.
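When missing values in an incomplete two-universe fuzzy probabilistic rough set are filled in, traditional static algorithms update the approximation sets with low time efficiency. To address this problem, a dynamic method for updating the approximation sets of labeled incomplete two-universe fuzzy probabilistic rough sets is studied. First, the relevant definitions of a labeled incomplete two-universe information system are given, a matrix-based model of the labeled incomplete two-universe fuzzy probabilistic rough set is proposed, its related theorems are proved, and a method for computing its approximation sets is given and analyzed. Second, for the case in which missing values are obtained, theorems for dynamically updating the approximation sets are stated and proved; on this basis, a dynamic update algorithm for approximation sets in labeled incomplete two-universe fuzzy probabilistic rough sets is designed and its complexity is analyzed. Finally, simulation experiments on six UCI data sets and three artificial data sets show that the dynamic update algorithm improves the time efficiency of updating approximation sets. A worked example shows that the dynamic algorithm does not affect the correctness of the results, which verifies its effectiveness.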

2.
Our main objective was to compare two discretization techniques, both based on cluster analysis, with a new rule induction algorithm called MLEM2, in which discretization is performed simultaneously with rule induction. The MLEM2 algorithm is an extension of the existing LEM2 rule induction algorithm. The LEM2 algorithm works correctly only for symbolic attributes and is a part of the LERS data mining system. For the two strategies, based on cluster analysis, rules were induced by the LEM2 algorithm. Our results show that MLEM2 outperformed both strategies based on cluster analysis, in terms of complexity (size of rule sets) and, more importantly, error rates.

3.
Jian Liu  Z. M. Ma  Li Yan 《World Wide Web》2013,16(3):325-353
As the next-generation language of the Internet, XML has become the de facto standard for information exchange over the web. A core operation in XML query processing is to find all the occurrences of a twig pattern in an XML database. In addition, the study of probabilistic data has become an emerging topic for various applications on the Web. Therefore, research on the combination of XML twig patterns and probabilistic data is quite significant. In prior work on probabilistic XML, the answers to a given twig query are always complete. However, complete answers with low probabilities may be deemed irrelevant, while incomplete answers with high probabilities are of great significance because they may be the potential answers that interest the users. Unlike complete evaluation, evaluating incomplete twigs in probabilistic XML introduces some new challenges. On one hand, incomplete queries not only obtain complete matches but also return answers that contain a considerable number of incomplete matches. On the other hand, the processing of incomplete evaluation is more complicated. A ranking approach should therefore be adopted along with the evaluation of incomplete answers. In this paper, we propose an efficient algorithm to handle the problem of querying incomplete twigs over a probabilistic XML database. We also present a novel algorithm for ranking the incomplete answers. The experimental results show that our proposed algorithms can significantly improve the performance of querying and ranking incomplete twigs.

4.
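Existing research on relational learning is based on complete data, whereas data in real-world problems are usually incomplete. A method for learning probabilistic relational models (PRMs) from incomplete relational data is proposed: MLTEC (maximum likelihood tree and evolutionary computing method). First, the incomplete relational data are filled in at random to obtain complete relational data. Then a maximum likelihood tree is generated from each randomly completed data sample and used as an initial PRM network, and the best network structure found during the evolutionary process is used repeatedly to revise the incomplete data set, finally yielding a probabilistic relational model. Experimental results show that MLTEC can learn good probabilistic relational models from incomplete relational data.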

5.
Learning rules from incomplete training examples by rough sets   Total citations: 1, self-citations: 0, citations by others: 1
Machine learning can extract desired knowledge from existing training examples and ease the development bottleneck in building expert systems. Most learning approaches derive rules from complete data sets. If some attribute values are unknown in a data set, it is called incomplete. Learning from incomplete data sets is usually more difficult than learning from complete data sets. In the past, rough-set theory was widely used in dealing with data classification problems. In this paper, we deal with the problem of producing a set of certain and possible rules from incomplete data sets based on rough sets. A new learning algorithm is proposed, which can simultaneously derive rules from incomplete data sets and estimate the missing values in the learning process. Unknown values are first assumed to be any possible values and are gradually refined according to the incomplete lower and upper approximations derived from the given training examples. The examples and the approximations then interact with each other to derive certain and possible rules and to estimate appropriate unknown values. The rules derived can then serve as knowledge concerning the incomplete data set.
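The lower and upper approximations that certain and possible rules rest on can be sketched as follows. This is a minimal illustration that assumes the common tolerance-relation treatment of unknown values (an unknown value matches anything); it does not reproduce the abstract's algorithm, which additionally estimates the missing values. The names `MISSING`, `tolerant`, `tolerance_classes`, and `approximations`, and the toy table, are hypothetical.

```python
MISSING = None  # assumed marker for an unknown attribute value


def tolerant(x, y):
    """Two objects are tolerant if every attribute agrees or is unknown in one of them."""
    return all(a is MISSING or b is MISSING or a == b for a, b in zip(x, y))


def tolerance_classes(table):
    """Map each object id to the set of object ids it is tolerant with."""
    return {i: {j for j in table if tolerant(table[i], table[j])} for i in table}


def approximations(table, target):
    """Return (lower, upper) approximations of `target` under the tolerance relation."""
    classes = tolerance_classes(table)
    lower = {i for i, c in classes.items() if c <= target}  # basis for certain rules
    upper = {i for i, c in classes.items() if c & target}   # basis for possible rules
    return lower, upper


# Hypothetical toy table: object id -> (attribute 1, attribute 2)
table = {
    1: ("high", "yes"),
    2: ("high", MISSING),
    3: ("low", "no"),
    4: (MISSING, "no"),
}
lower, upper = approximations(table, {1, 2})
print(lower, upper)
```

For the concept {1, 2}, only object 1 lies in the lower approximation, while objects 1, 2, and 4 lie in the upper approximation, because the unknown values of objects 2 and 4 make them indistinguishable from each other.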

6.
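Xu Yi (徐怡)  Xiao Peng (肖鹏) 《计算机应用》2019,39(5):1247-1251
To exploit the fact that missing values take on specific attribute values when an incomplete information system changes, and to address the low time efficiency of updating approximation sets in multigranulation rough sets, a dynamic update algorithm for approximation sets based on the tolerance relation is proposed. First, the properties of changes in tolerance-relation-based approximation sets are discussed, and from these properties the trends of change in the approximation sets of optimistic and pessimistic multigranulation rough sets are derived. Then, to address the inefficiency of updating tolerance classes, theorems for dynamically updating tolerance classes are proposed. Finally, on this basis, a tolerance-relation-based dynamic update algorithm for approximation sets is designed. Simulation experiments on four data sets from the UCI repository show that, as the data sets grow, the computation time of the proposed update algorithm is far less than that of the static update algorithm; that is, the proposed dynamic algorithm is more time-efficient than the static one, which verifies its correctness and efficiency.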

7.
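A parallel algorithm for approximation spaces of incomplete information systems in big data   Total citations: 1, self-citations: 0, citations by others: 1
Upper and lower approximation spaces are important concepts in rough set theory, and computing upper and lower approximations is the basis of mining massive data. Classical approximation-space algorithms are not suited to massive data, let alone massive data with missing information. Therefore, through an in-depth analysis of the characteristics of massive data with missing information, and using the MapReduce programming model, a parallel algorithm for approximation spaces under the MapReduce framework is proposed to process massive data with missing information. Experimental results demonstrate the effectiveness of the parallel algorithm.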

8.
This paper presents a real-time path planning algorithm that guarantees probabilistic feasibility for autonomous robots with uncertain dynamics operating amidst one or more dynamic obstacles with uncertain motion patterns. Planning safe trajectories under such conditions requires both accurate prediction and proper integration of future obstacle behavior within the planner. Given that available observation data is limited, the motion model must provide generalizable predictions that satisfy dynamic and environmental constraints, a limitation of existing approaches. This work presents a novel solution, named RR-GP, which builds a learned motion pattern model by combining the flexibility of Gaussian processes (GP) with the efficiency of RRT-Reach, a sampling-based reachability computation. Obstacle trajectory GP predictions are conditioned on dynamically feasible paths identified from the reachability analysis, yielding more accurate predictions of future behavior. RR-GP predictions are integrated with a robust path planner, using chance-constrained RRT, to identify probabilistically feasible paths. Theoretical guarantees of probabilistic feasibility are shown for linear systems under Gaussian uncertainty; approximations for nonlinear dynamics and/or non-Gaussian uncertainty are also presented. Simulations demonstrate that, with this planner, an autonomous vehicle can safely navigate a complex environment in real time while significantly reducing the risk of collisions with dynamic obstacles.

9.
Abstract: Machine learning can extract desired knowledge from training examples and ease the development bottleneck in building expert systems. Most learning approaches derive rules from complete and incomplete data sets. If attribute values are known as possibility distributions on the domain of the attributes, the system is called an incomplete fuzzy information system. Learning from incomplete fuzzy data sets is usually more difficult than learning from complete data sets and incomplete data sets. In this paper, we deal with the problem of producing a set of certain and possible rules from incomplete fuzzy data sets based on rough sets. The notions of lower and upper generalized fuzzy rough approximations are introduced. By using the fuzzy rough upper approximation operator, we transform each fuzzy subset of the domain of every attribute in an incomplete fuzzy information system into a fuzzy subset of the universe, from which fuzzy similarity neighbourhoods of objects in the system are derived. The fuzzy lower and upper approximations for any subset of the universe are then calculated and the knowledge hidden in the information system is unravelled and expressed in the form of decision rules.

10.
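Incomplete information systems generalize ordinary information systems and are widely used in practice. When an information system changes dynamically, the approximation sets of objects change accordingly, so it is important to study how existing approximation-set information can be used to update the approximation sets. Dynamic changes in an information system can be considered from three main aspects: coarsening and refining of attribute values, of the attribute set, and of the object set. Only the incremental updating of approximation sets under coarsening and refining of attribute values is discussed here. Definitions of attribute-value coarsening and refining in incomplete information systems are given, incremental update methods for approximation sets under attribute-value coarsening and refining are discussed for the characteristic-relation rough set model in incomplete information systems, and the effectiveness of the methods is verified with examples.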

11.
Neighbourhood rough set theory has already proven to be an efficient tool for knowledge discovery from heterogeneous data. However, some types of data are incomplete and noisy in practical environments, such as signal analysis and fault diagnosis. To solve this problem, a universal neighbourhood rough sets model (the variable precision tolerance neighbourhood rough sets [VPTNRS] model) is proposed based on a tolerance neighbourhood relation and probability theory. The proposed model can induce a family of much more comprehensive information granules to characterize arbitrary concepts in a complex universe. In this paper, we discuss the properties of the model and introduce and prove some important related theorems. Furthermore, a heuristic heterogeneous feature selection algorithm based on the model is given. Experimental results on 10 selected University of California Irvine (UCI) standard data sets show that the universal model performs well in both feature selection and classification, especially in incomplete environments.

12.
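The integrated probabilistic data association (IPDA) algorithm for tracking maneuvering targets in clutter is formed by introducing the probabilities of target existence and observability into the probabilistic data association (PDA) algorithm. This paper further introduces an adaptive adjustment factor and proposes an adaptive IPDA algorithm (CIPDA) for tracking highly maneuvering targets. Simulations show that, compared with conventional IPDA, CIPDA improves the stability and accuracy of tracking highly maneuvering targets.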

13.
We analyze the probabilistic convergence of random Galerkin approximations for a heat equation with a random initial condition. Almost sure L2-convergence results for both continuous-time and discrete-time Galerkin approximations are obtained by the Borel-Cantelli lemma. A criterion for determining the sample size is suggested.


14.
A Novel Approach for Phase-Type Fitting with the EM Algorithm   Total citations: 2, self-citations: 0, citations by others: 2
The representation of general distributions or measured data by phase-type distributions is an important and nontrivial task in analytical modeling. Although a large number of different methods for fitting parameters of phase-type distributions to data traces exist, many approaches lack efficiency and numerical stability. In this paper, a novel approach is presented that fits a restricted class of phase-type distributions, namely, mixtures of Erlang distributions, to trace data. For the parameter fitting, an algorithm of the expectation maximization type is developed. This paper shows that these choices result in a very efficient and numerically stable approach which yields phase-type approximations for a wide range of data traces that are as good as or better than approximations computed with other less efficient and less stable fitting methods. To illustrate the effectiveness of the proposed fitting algorithm, we present comparative results for our approach and two other methods using six benchmark traces and two real traffic traces as well as quantitative results from queueing analysis.
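To make the idea of EM fitting for a mixture of Erlang distributions concrete, here is a minimal sketch. It assumes the Erlang shape parameters are fixed in advance and only the mixture weights and rates are estimated; whether and how the paper selects the shapes is not stated in the abstract, and the function names `erlang_logpdf` and `fit_erlang_mixture` are hypothetical. All observations must be strictly positive.

```python
import math
import random


def erlang_logpdf(x, shape, rate):
    """Log density of an Erlang(shape, rate) distribution at x > 0."""
    return (shape * math.log(rate) + (shape - 1) * math.log(x)
            - rate * x - math.lgamma(shape))


def fit_erlang_mixture(data, shapes, iterations=50):
    """EM for a mixture of Erlang branches with fixed integer shapes (an assumption)."""
    m, n = len(shapes), len(data)
    weights = [1.0 / m] * m
    mean = sum(data) / n
    rates = [r / mean for r in shapes]  # every branch starts at the sample mean
    history = []                        # log-likelihood before each update
    for _ in range(iterations):
        # E-step: responsibility of each branch for each observation.
        resp, loglik = [], 0.0
        for x in data:
            logs = [math.log(w) + erlang_logpdf(x, r, lam)
                    for w, r, lam in zip(weights, shapes, rates)]
            top = max(logs)
            norm = top + math.log(sum(math.exp(v - top) for v in logs))
            loglik += norm
            resp.append([math.exp(v - norm) for v in logs])
        history.append(loglik)
        # M-step: closed-form updates for weights and rates.
        for j in range(m):
            total = sum(q[j] for q in resp)
            if total <= 0.0:
                continue  # keep the previous parameters of an empty branch
            weighted = sum(q[j] * x for q, x in zip(resp, data))
            weights[j] = total / n
            rates[j] = shapes[j] * total / weighted
    return weights, rates, history


# Hypothetical usage on synthetic data drawn from two Erlang branches.
random.seed(0)
trace = ([random.gammavariate(1, 1.0) for _ in range(200)]
         + [random.gammavariate(4, 0.5) for _ in range(200)])
weights, rates, history = fit_erlang_mixture(trace, shapes=[1, 4], iterations=30)
print(weights, rates)
```

With the shapes fixed, each rate update has the closed form shape × (sum of responsibilities) / (responsibility-weighted sum of observations), so every iteration is cheap and the log-likelihood recorded in `history` does not decrease.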

15.
Real-life events emerge and evolve in social and news streams. Recent methods have succeeded in capturing designed features of monolingual events, but lack interpretability and multilingual considerations. To this end, we propose a multilingual event mining model, namely MLEM, to automatically detect events and generate evolution graphs in multilingual hybrid-length text streams including English, Chinese, French, German, Russian and Japanese. Specifically, we merge the same entities and similar phrases and present multiple similarity measures using an incremental word2vec model. We propose an 8-tuple to describe an event for correlation analysis and evolution graph generation. We evaluate the MLEM model using a massive human-generated dataset containing real-world events. Experimental results show that our new model MLEM outperforms the baseline method in both efficiency and effectiveness.

16.
Probabilistic approaches to rough sets   Total citations: 6, self-citations: 0, citations by others: 6
Y. Y. Yao 《Expert Systems》2003,20(5):287-297
Abstract: Probabilistic approaches to rough sets in granulation, approximation and rule induction are reviewed. The Shannon entropy function is used to quantitatively characterize partitions of a universe. Both algebraic and probabilistic rough set approximations are studied. The probabilistic approximations are defined in a decision-theoretic framework. The problem of rule induction, a major application of rough set theory, is studied in probabilistic and information-theoretic terms. Two types of rules are analyzed: the local, low order rules, and the global, high order rules.
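Two of the notions reviewed here lend themselves to a short sketch: the Shannon entropy of a partition and threshold-based probabilistic approximations. The code below is illustrative only; it assumes a pair of thresholds `alpha` and `beta` with `beta < alpha` applied to the conditional probability of the concept within each block, and the function names and toy data are hypothetical rather than taken from the paper.

```python
import math


def partition_entropy(blocks, universe_size):
    """Shannon entropy (in bits) of a partition, with P(block) = |block| / |U|."""
    return -sum((len(b) / universe_size) * math.log2(len(b) / universe_size)
                for b in blocks)


def probabilistic_approximations(blocks, target, alpha, beta):
    """Lower: blocks with P(target | block) >= alpha. Upper: blocks with P > beta."""
    lower, upper = set(), set()
    for block in blocks:
        p = len(block & target) / len(block)
        if p >= alpha:
            lower |= block
        if p > beta:
            upper |= block
    return lower, upper


# Hypothetical universe {1, ..., 8} partitioned into three blocks.
blocks = [{1, 2, 3, 4}, {5, 6}, {7, 8}]
print(partition_entropy(blocks, 8))
print(probabilistic_approximations(blocks, {1, 2, 3, 5}, alpha=0.7, beta=0.3))
```

Setting `alpha = 1` and `beta = 0` recovers the algebraic lower and upper approximations, which is one way to see the probabilistic model as a relaxation of the classical one.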

17.
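Data in an information system change dynamically, and obtaining useful information from a dynamically changing information system has become a key problem in data processing. To address this problem, methods for dynamically obtaining approximation sets when attributes are added to or removed from an information system are discussed. By partitioning the existing equivalence classes of the information system, repartitioning the universe is avoided, which improves the efficiency of dynamically updating approximation sets. By examining the relationship between the equivalence classes and the original approximation sets, theorems relating the approximation sets after the dynamic update to the original approximation sets are given, and a method for dynamically obtaining approximation sets when attributes are added or removed in the classical rough set model is proposed. Experimental results verify the correctness and effectiveness of the method, and its efficiency is better than that of the original method.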
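The core idea for the attribute-addition case, splitting the existing equivalence classes instead of repartitioning the whole universe, can be sketched as follows. This is a simplified illustration for the classical (complete) rough set model, not the paper's algorithm or its theorems; the names `refine` and `lower_upper` and the toy data are hypothetical.

```python
from collections import defaultdict


def refine(classes, new_attribute):
    """Split each existing equivalence class by the values of a newly added attribute."""
    refined = []
    for block in classes:
        groups = defaultdict(set)
        for obj in block:
            groups[new_attribute[obj]].add(obj)
        refined.extend(groups.values())
    return refined


def lower_upper(classes, target):
    """Classical lower and upper approximations of `target` from equivalence classes."""
    lower, upper = set(), set()
    for block in classes:
        if block <= target:
            lower |= block
        if block & target:
            upper |= block
    return lower, upper


# Hypothetical example: two classes, then one attribute is added.
old_classes = [{1, 2, 3}, {4, 5}]
new_attribute = {1: "a", 2: "a", 3: "b", 4: "c", 5: "c"}
new_classes = refine(old_classes, new_attribute)
print(lower_upper(old_classes, {1, 2, 4}))
print(lower_upper(new_classes, {1, 2, 4}))
```

Adding an attribute can only split classes, so the lower approximation can only grow and the upper approximation can only shrink; in the example the lower approximation goes from the empty set to {1, 2} and the upper approximation from {1, 2, 3, 4, 5} to {1, 2, 4, 5}.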

18.
As the information available to naïve users through autonomous data sources continues to increase, mediators become important to ensure that the wealth of information available is tapped effectively. A key challenge that these information mediators need to handle is the varying levels of incompleteness in the underlying databases in terms of missing attribute values. Existing approaches such as QPIAD aim to mine and use Approximate Functional Dependencies (AFDs) to predict and retrieve relevant incomplete tuples. These approaches make independence assumptions about missing values, which critically hobbles their performance when there are tuples containing missing values for multiple correlated attributes. In this paper, we present a principled probabilistic alternative that views an incomplete tuple as defining a distribution over the complete tuples that it stands for. We learn this distribution in terms of Bayesian networks. Our approach involves mining/“learning” Bayesian networks from a sample of the database, and using them to do both imputation (predict a missing value) and query rewriting (retrieve relevant results with incompleteness on the query-constrained attributes, when the data sources are autonomous). We present empirical studies to demonstrate that (i) at higher levels of incompleteness, when multiple attribute values are missing, Bayesian networks do provide a significantly higher classification accuracy and (ii) the relevant possible answers retrieved by the queries reformulated using Bayesian networks provide higher precision and recall than AFDs while keeping query processing costs manageable.

19.
WEBSOM is a recently developed neural method for exploring full-text document collections, for information retrieval, and for information filtering. In WEBSOM the full-text documents are encoded as vectors in a document space somewhat like in earlier information retrieval methods, but in WEBSOM the document space is formed in an unsupervised manner using the Self-Organizing Map algorithm. In this article the document representations the WEBSOM creates are shown to be computationally efficient approximations of the results of a certain probabilistic model. The probabilistic model incorporates information about the similarity of use of different words to take into account their semantic relations.

20.
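A STAPLE algorithm for sulcal and gyral segmentation of the cerebral cortical surface   Total citations: 1, self-citations: 0, citations by others: 1
Hu Xintao (胡新韬)  Li Gang (李刚)  Guo Lei (郭雷) 《计算机科学》2011,38(3):279-282
A method is proposed that estimates the latent probabilistically optimal segmentation from the results of multiple sulcal and gyral segmentations of the cerebral cortical surface, while automatically estimating the performance parameters of each segmentation algorithm. The probabilistically optimal segmentation is modeled as a weighted combination of multiple segmentation decisions, and the expectation-maximization algorithm is used to solve iteratively for the optimal weights on the basis of the estimated performance parameters. A hidden Markov model is then used to introduce a spatial-consistency constraint into the estimated optimal segmentation, refining it into a spatially consistent segmentation decision. Results on simulated data and on the outputs of three typical cortical surface segmentation algorithms show that the algorithm effectively improves the accuracy of sulcal and gyral segmentation while automatically measuring the performance of the algorithms being combined.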
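The EM part of this kind of label fusion can be sketched for the simplest case of binary labels. The code below is a minimal illustration that assumes each input segmentation is characterized by a sensitivity and a specificity and that the prior probability of the positive label is fixed; it omits the spatial-consistency step described in the abstract, and the function name `staple_binary` and the toy inputs are hypothetical.

```python
def staple_binary(decisions, prior=0.5, iterations=20, eps=1e-6):
    """EM label fusion for binary decisions.

    decisions[j][i] is the 0/1 label that segmentation j assigns to element i.
    Returns (weights, sensitivity, specificity), where weights[i] is the
    estimated probability that the true label of element i is 1.
    """
    n_raters, n_items = len(decisions), len(decisions[0])
    sens = [0.9] * n_raters  # assumed initial performance parameters
    spec = [0.9] * n_raters
    weights = [prior] * n_items
    for _ in range(iterations):
        # E-step: posterior probability of the true label being 1.
        for i in range(n_items):
            a, b = prior, 1.0 - prior
            for j in range(n_raters):
                if decisions[j][i] == 1:
                    a *= sens[j]
                    b *= 1.0 - spec[j]
                else:
                    a *= 1.0 - sens[j]
                    b *= spec[j]
            weights[i] = a / (a + b)
        # M-step: re-estimate each segmentation's sensitivity and specificity.
        pos = max(sum(weights), eps)
        neg = max(n_items - sum(weights), eps)
        for j in range(n_raters):
            tp = sum(w for w, d in zip(weights, decisions[j]) if d == 1)
            tn = sum(1.0 - w for w, d in zip(weights, decisions[j]) if d == 0)
            # Clamp away from 0 and 1 so that no product collapses to zero.
            sens[j] = min(max(tp / pos, eps), 1.0 - eps)
            spec[j] = min(max(tn / neg, eps), 1.0 - eps)
    return weights, sens, spec


# Hypothetical inputs: two segmentations that agree and one noisier segmentation.
reference = [1] * 10 + [0] * 10
noisy = [0, 0, 0] + [1] * 7 + [1, 1, 1] + [0] * 7
weights, sens, spec = staple_binary([reference, reference, noisy])
print(sens, spec)
```

The estimated sensitivities and specificities act as the weights of the combination: a segmentation that often disagrees with the consensus ends up with lower values and therefore less influence on the fused result.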
