Similar Documents
20 similar documents found (search time: 15 ms)
1.
Markov logic networks (Cited by: 16; self-citations: 0, others: 16)
We propose a simple approach to combining first-order logic and probabilistic graphical models in a single representation. A Markov logic network (MLN) is a first-order knowledge base with a weight attached to each formula (or clause). Together with a set of constants representing objects in the domain, it specifies a ground Markov network containing one feature for each possible grounding of a first-order formula in the KB, with the corresponding weight. Inference in MLNs is performed by MCMC over the minimal subset of the ground network required for answering the query. Weights are efficiently learned from relational databases by iteratively optimizing a pseudo-likelihood measure. Optionally, additional clauses are learned using inductive logic programming techniques. Experiments with a real-world database and knowledge base in a university domain illustrate the promise of this approach. Editors: Hendrik Blockeel, David Jensen and Stefan Kramer. An erratum to this article is available at .
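The grounding semantics described above (one feature per grounding of each formula, with probability proportional to the exponentiated weighted count of satisfied groundings) can be illustrated by brute force on a tiny example; the predicates, constants, and weight below are illustrative, not from the paper:

```python
import itertools
import math

# Toy MLN with a single weighted formula Smokes(x) => Cancer(x);
# predicates, constants, and the weight are illustrative.
constants = ["Anna", "Bob"]
weight = 1.5

atoms = [(p, c) for p in ("Smokes", "Cancer") for c in constants]

def n_satisfied(world):
    # Number of true groundings of Smokes(x) => Cancer(x).
    return sum(1 for c in constants
               if not world[("Smokes", c)] or world[("Cancer", c)])

def unnormalized(world):
    # P(x) is proportional to exp(sum_i w_i * n_i(x)); one formula here.
    return math.exp(weight * n_satisfied(world))

# Brute-force partition function over all 2^4 possible worlds.
worlds = [dict(zip(atoms, bits))
          for bits in itertools.product([False, True], repeat=len(atoms))]
Z = sum(unnormalized(w) for w in worlds)

def prob(world):
    return unnormalized(world) / Z
```

Worlds that violate more groundings are exponentially less likely, which is the soft-constraint reading of weighted formulas.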

2.
3.
Link prediction, which predicts relationships between entities, is an important and complex task. Traditional methods that assume instances of the same class are independent and identically distributed introduce considerable noise and yield poor predictions. This work applies Markov logic networks (MLNs) to link prediction to alleviate this problem. An MLN is a statistical relational learning method that combines Markov networks with first-order logic. MLNs are used to build relational models that predict whether a link exists between two entities and, when it does, the type of that link. Experimental results on two datasets show that the MLN model outperforms traditional link-prediction models, providing evidence for applying MLNs to practical problems.

4.
Markov Logic Networks and Their Application to Text Classification (Cited by: 2; self-citations: 0, others: 2)
This paper introduces the theoretical model, learning algorithms, and inference algorithms of Markov logic networks and applies them to Chinese text classification. The experiments combine discriminatively trained learning with inference algorithms such as MC-SAT, Gibbs sampling, and simulated annealing. The results show that the MLN-based classifier outperforms the traditional K-nearest-neighbour (KNN) algorithm.

5.
Most existing knowledge-learning approaches are purely statistical: they ignore the relationships among pieces of knowledge and how those relationships change over time, and their results in practice are often unsatisfactory. Accurately capturing the statistical relationships among knowledge items has therefore become a key and difficult problem in knowledge research. In recent years, with the rise of statistical relational learning, Markov logic networks, which combine probabilistic graphical models with first-order logic, have been successfully applied to natural language processing, machine learning, and social network analysis. This paper proposes a knowledge-learning method based on Markov logic networks that extends traditional knowledge acquisition with first-order logic to learn the relationships among knowledge items and perform logical inference. In a text-classification experiment, learning classification knowledge with the proposed method improves accuracy by about 10% over a traditional SVM.

6.
7.
Statistical-relational learning combines logical syntax with probabilistic methods. Markov Logic Networks (MLNs) are a prominent model class that generalizes both first-order logic and undirected graphical models (Markov networks). The qualitative component of an MLN is a set of clauses and the quantitative component is a set of clause weights. Generative MLNs model the joint distribution of relationships and attributes. A state-of-the-art structure learning method is the moralization approach: learn a set of directed Horn clauses, then convert them to conjunctions to obtain MLN clauses. The directed clauses are learned using Bayes net methods. The moralization approach takes advantage of the high-quality inference algorithms for MLNs and their ability to handle cyclic dependencies. A weakness of moralization is that it leads to an unnecessarily large number of clauses. In this paper we show that using decision trees to represent conditional probabilities in the Bayes net is an effective remedy that leads to much more compact MLN structures. In experiments on benchmark datasets, the decision trees reduce the number of clauses in the moralized MLN by a factor of 5-25, depending on the dataset. The accuracy of predictions is competitive with the models obtained by standard moralization, and in many cases superior.
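The Bayes-net-to-MLN conversion that moralization builds on can be sketched as follows: each conditional-probability entry becomes a conjunctive clause with weight log p, so a full assignment's joint probability is recovered as the exponential of the sum of satisfied-clause weights. The two-node network and its numbers below are illustrative:

```python
import math

# Toy Bayes net over binary nodes A -> B; the probabilities are
# illustrative, not from the paper.
p_a = {True: 0.3, False: 0.7}                  # P(A)
p_b = {(True, True): 0.9, (True, False): 0.1,  # P(B=b | A=a), keyed (a, b)
       (False, True): 0.2, (False, False): 0.8}

# One conjunctive clause per CPT entry, weighted log(p). Each full
# assignment satisfies exactly one clause per factor, so the joint
# probability equals exp(sum of satisfied clause weights).
clauses = [((("A", a),), math.log(p_a[a])) for a in (True, False)]
clauses += [((("A", a), ("B", b)), math.log(p_b[(a, b)]))
            for a in (True, False) for b in (True, False)]

def satisfied(clause, world):
    return all(world[var] == val for var, val in clause)

def joint(world):
    return math.exp(sum(w for cl, w in clauses if satisfied(cl, world)))
```

With decision trees in place of full CPTs, several rows collapse into one clause, which is the compaction the paper exploits.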

8.
In current enterprise security risk management, the design of risk-treatment plans and the choice of controls lack quantitative support, and manual risk analysis is overly time-consuming. This paper proposes an information-security risk-management method based on Markov logic networks. The dependencies among the components and services of the assessed system are first described with a Markov logic network; marginal inference over the network then estimates the system-availability value under different security controls, providing a quantitative basis for selecting among them. A case study shows that the method offers a reliable quantitative basis for choosing security controls for enterprise information systems and is simple to apply.

9.
Most existing approaches to emotion-cause pair extraction (ECPE) either match emotion clauses to cause clauses one by one or focus on ranking candidate pairs, ignoring the event context of the clauses that determines whether an emotion-cause relation holds. This biases the model's understanding of emotional causality and prevents it from capturing long-distance causal relations. This paper proposes a hierarchical model based on an attention mechanism and emotion-clause convolution kernels, which embeds the event-context features of the original document into the emotion-cause pair feature extractor to create an integrated, enhanced representation. First, the emotion-clause category features obtained from sentiment analysis are used as convolution kernels; then the document's event-context features are used to extract emotion-cause pairs. The method improves the F1 score by 1.38% to 6.08% on a Chinese dataset and by 2.35% to 7.27% on an English dataset, demonstrating the effectiveness of sentiment analysis and causal event context for emotion-cause pair extraction.

10.
Trust-based recommender systems recommend items through trusted entities, but trust is a complex notion, and propagating and predicting it is an important task. This work uses a statistical relational model, the Markov logic network, to represent the transitivity of trust, discusses its theoretical model, and predicts trust relations with its inference algorithms. Experimental results show that in trust-based recommendation the Markov logic network approach outperforms MoleTrust in recommendation accuracy and in handling cold-start users.
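As a rough sketch of the trust-transitivity rule such an MLN would encode as a weighted formula (trust(a,b) and trust(b,c) imply trust(a,c)), the following propagates trust scores multiplicatively along paths. This is a deterministic stand-in for illustration, not MLN inference, and the names and scores are made up:

```python
# Hypothetical direct-trust edges with scores in [0, 1].
direct = {("alice", "bob"): 0.9, ("bob", "carol"): 0.8}

def propagated_trust(src, dst, direct, max_hops=3):
    # Best product of direct-trust scores over paths of up to max_hops edges,
    # a simple multiplicative-decay reading of trust transitivity.
    best = direct.get((src, dst), 0.0)
    if max_hops > 1:
        for (a, b), t in direct.items():
            if a == src:
                best = max(best, t * propagated_trust(b, dst, direct, max_hops - 1))
    return best
```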

11.
Extracting and recognizing characters from complex natural-scene signboard images has long been a hot topic in digital image processing, and current algorithms generally suffer from imprecise text extraction, low recall, and poor robustness. This paper proposes an efficient text-extraction algorithm for signboard images, which typically have complex natural backgrounds: the original image is first blurred, Laplacian edge extraction is applied, long non-text edges are then removed from the edge image, and finally edge scanning and connected-component analysis based on text-region features extract the signboard text. Tests on a large number of images from the ICDAR 2003 Robust Reading Competition show that the algorithm is robust to background complexity, text language, colour, font size, and text orientation, while achieving high precision and recall.
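A minimal sketch of the pipeline's first stages (blur, Laplacian edge extraction, thresholding, connected-component analysis), assuming a grayscale float image; the long-edge deletion and text-feature edge scanning steps of the actual algorithm are omitted:

```python
import numpy as np

def blur(img):
    # 3x3 box blur as a simple stand-in for the blurring step.
    p = np.pad(img, 1, mode="edge")
    return sum(p[i:i + img.shape[0], j:j + img.shape[1]]
               for i in range(3) for j in range(3)) / 9.0

def laplacian_edges(img):
    # Magnitude of the 4-neighbour Laplacian (edge-padded borders).
    p = np.pad(img, 1, mode="edge")
    return np.abs(p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
                  - 4 * p[1:-1, 1:-1])

def components(mask):
    # Label 4-connected components of a boolean mask via flood fill.
    labels = np.zeros(mask.shape, dtype=int)
    cur = 0
    for sx, sy in zip(*np.nonzero(mask)):
        if labels[sx, sy]:
            continue
        cur += 1
        stack = [(sx, sy)]
        while stack:
            x, y = stack.pop()
            if not (0 <= x < mask.shape[0] and 0 <= y < mask.shape[1]):
                continue
            if not mask[x, y] or labels[x, y]:
                continue
            labels[x, y] = cur
            stack += [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return labels, cur
```

The remaining text regions would then be filtered by geometric features (aspect ratio, stroke density) before recognition.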

12.
An efficient term mining method to build a general term network is presented. The resulting term network can be used for entity relation visualization and exploration, which is useful in many text-mining applications such as crime exploration and investigation from vast piles of crime news or official criminal records. In the proposed method, terms from each document in a text collection are first identified. They are subjected to an analysis for pairwise association weights. The weights are then accumulated over all the documents to obtain final similarity for each term pair. Based on the resulting term similarity, a general term network for the collection is built with terms as nodes and non-zero similarities as links. In application, a list of predefined terms having similar attributes was selected to extract the desired sub-network from the general term network for entity relation visualization. This text analysis scenario based on the collective terms of the similar type or from the same topic enables evidence-based relation exploration. Some practical instances of crime exploration and investigation are demonstrated. Our application examples show that term relations, be it causality, subordination, coupling, or others, can be effectively revealed by our method and easily verified by the underlying text collection. This work contributes by presenting an integrated term-relationship mining and exploration approach and demonstrating the feasibility of the term network to the increasingly important application of crime exploration and investigation.
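The accumulation step can be sketched with plain per-document co-occurrence counts as the pairwise association weight (the paper's actual weighting scheme may differ):

```python
from collections import Counter
from itertools import combinations

def term_network(docs):
    # Accumulate pairwise association weights over all documents,
    # then keep nonzero similarities as links of the term network.
    pair_weight = Counter()
    for doc in docs:
        terms = sorted(set(doc))           # terms identified in this document
        for a, b in combinations(terms, 2):
            pair_weight[(a, b)] += 1       # simple co-occurrence weight
    network = {}
    for (a, b), w in pair_weight.items():
        network.setdefault(a, {})[b] = w   # undirected: store both directions
        network.setdefault(b, {})[a] = w
    return network
```

A sub-network for exploration is then just the restriction of this adjacency structure to a predefined term list.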

13.
To address detection uncertainty in human activity recognition, this work improves the computation of potential functions in Markov logic networks (MLNs): the relational operators of first-order logic are softened, extending the range of the feature functions from Boolean values to the interval [0, 1], and the confidence of each sensor event is computed to obtain the probability that the corresponding ground atom is true. The improved MLN is combined with an ontology into a hybrid recognition framework, and the corresponding algorithm is implemented. Simulation results show that the improved MLN maintains high accuracy even on the error-containing dataset ADL-E.
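The "softening" idea, replacing a Boolean relational test with a value in [0, 1], might be sketched as follows; the exponential decay and the confidence weighting below are illustrative choices, not necessarily the paper's exact formulas:

```python
import math

def soft_eq(t, t0, scale=1.0):
    # Softened equality: instead of the Boolean test t == t0, return a
    # value in [0, 1] that decays with the distance between the operands.
    return math.exp(-abs(t - t0) / scale)

def weighted_feature(t, t0, confidence):
    # Feature value of a ground atom, scaled by sensor-event confidence,
    # so uncertain detections contribute proportionally less.
    return confidence * soft_eq(t, t0)
```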

14.
The capability of extracting and recognizing characters printed in color documents will widen immensely the applications of OCR systems. This paper describes a new method of color segmentation to extract character areas from a color document. At first glance, the characters seem to be printed in a single color, but actual measurements reveal that the color image has a distribution of components. Compared with clustering algorithms, our method prevents oversegmentation and fusion with the background while maintaining real-time usability. It extracts the representative colors based on a histogram analysis of the color space. Our method also contains a selective local color averaging technique that removes the problem of mesh noise on high-resolution color images. Received: 25 July 2003, Revised: 10 August 2003, Published online: 6 February 2004. Correspondence to: Hiroyuki Hase. Current address: 3-9-1 Bunkyo, Fukui-shi 910-8507, Japan
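A one-channel sketch of histogram-based representative-colour extraction (the method works in a full colour space; the bin count and peak criterion here are illustrative):

```python
import numpy as np

def representative_levels(channel, bins=16):
    # Keep local maxima of the intensity histogram as representative levels.
    hist, edges = np.histogram(channel, bins=bins, range=(0.0, 1.0))
    centers = (edges[:-1] + edges[1:]) / 2
    return np.array([centers[i] for i in range(bins)
                     if hist[i] > 0
                     and (i == 0 or hist[i] >= hist[i - 1])
                     and (i == bins - 1 or hist[i] >= hist[i + 1])])

def segment(channel, reps):
    # Assign every pixel to its nearest representative level.
    return np.abs(channel[..., None] - reps).argmin(axis=-1)
```

Pixels sharing a representative level form a candidate colour layer from which character areas can be extracted.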

15.
Shen Chen, Lin Hongfei, Guo Kai, Xu Kan, Yang Zhihao, Wang Jian. Neural Computing and Applications, 2019, 31(9): 4799-4808
As one of the most important medical field subjects, adverse drug reaction seriously affects the patient's life, health, and safety. Although many methods...

16.
Liu Yu, Hua Wen, Zhou Xiaofang. World Wide Web, 2021, 24(1): 135-156

Knowledge, in practice, is time-variant and many relations are only valid for a certain period of time. This phenomenon highlights the importance of harvesting temporal-aware knowledge, i.e., the relational facts coupled with their valid temporal interval. Inspired by pattern-based information extraction systems, we resort to temporal patterns to extract time-aware knowledge from free text. However, pattern design is extremely laborious and time consuming even for a single relation, and free text is usually ambiguous which makes temporal instance extraction extremely difficult. Therefore, in this work, we study the problem of temporal knowledge extraction with two steps: (1) temporal pattern extraction by automatically analysing a large-scale text corpus with a small number of seed temporal facts, (2) temporal instance extraction by applying the identified temporal patterns. For pattern extraction, we introduce various techniques, including corpus annotation, pattern generation, scoring and clustering, to improve both accuracy and coverage of the extracted patterns. For instance extraction, we propose a double-check strategy to improve the accuracy and a set of node-extension rules to improve the coverage. We conduct extensive experiments on real world datasets and compared with state-of-the-art systems. Experimental results verify the effectiveness of our proposed methods for temporal knowledge harvesting.
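The two-step idea, generalize sentences containing seed temporal facts into patterns and then apply the patterns to extract new instances, can be sketched as below. The seeds, corpus, and placeholder regexes are illustrative, and the paper's scoring, clustering, and double-check steps are omitted:

```python
import re

# Hypothetical seed facts (subject, relation keyword, year) and corpus.
seeds = [("Obama", "president", "2009")]
corpus = ["Obama became president in 2009.",
          "Biden became president in 2021.",
          "Merkel became chancellor in 2005."]

def extract_patterns(seeds, corpus):
    # Generalize each seed-matching sentence into a regex pattern by
    # replacing the subject and the year with capture groups.
    patterns = set()
    for subj, rel, year in seeds:
        for sent in corpus:
            if subj in sent and rel in sent and year in sent:
                patterns.add(sent.replace(subj, "(?P<subj>\\w+)")
                                 .replace(year, "(?P<time>\\d{4})"))
    return patterns

def extract_instances(patterns, corpus):
    # Apply the learned patterns to harvest new temporal instances.
    facts = set()
    for pat in patterns:
        for sent in corpus:
            m = re.fullmatch(pat, sent)
            if m:
                facts.add((m.group("subj"), m.group("time")))
    return facts
```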


17.
Bigram Text Feature Extraction for Uyghur Text (Cited by: 1; self-citations: 0, others: 1)
Text feature representation is the most important step in automatic text classification. In vector-space-model (VSM) representations, the granularity of the feature unit directly affects classification performance. In Uyghur text classification, single-word features do not characterize document content well. After analysing the contribution of Uyghur bigrams to classification, this work constructs a new statistic, CHIMI, and on that basis proposes a bigram feature-extraction algorithm for Uyghur. Using the extracted bigrams as text features, Uyghur texts are classified with a support vector machine (SVM). Experiments show that, compared with word features, bigram features improve the precision and recall of Uyghur text classification, verifying the effectiveness of the algorithm.
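CHIMI itself is the paper's own statistic (its exact form is not given in the abstract); as a sketch of the kind of building block it combines, here is plain chi-square scoring on the 2x2 contingency counts of a bigram against a class, the usual basis for this style of feature selection:

```python
def chi_square(a, b, c, d):
    # a: in-class docs containing the bigram, b: other docs containing it,
    # c: in-class docs without it,          d: other docs without it.
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0
```

Bigrams are then ranked by score per class and the top-k kept as VSM features.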

18.
This paper describes a novel method for extracting text from document pages of mixed content. The method works by detecting pieces of text lines in small overlapping columns of width , shifted with respect to each other by image elements (good default values are: of the image width, ) and by merging these pieces in a bottom-up fashion to form complete text lines and blocks of text lines. The algorithm requires about 1.3 s for a 300 dpi image on a PC with a Pentium II CPU, 300 MHz, MotherBoard Intel440LX. The algorithm is largely independent of the layout of the document, the shape of the text regions, and the font size and style. The main assumptions are that the background be uniform and that the text sit approximately horizontally. For a skew of up to about 10 degrees no skew correction mechanism is necessary. The algorithm has been tested on the UW English Document Database I of the University of Washington and its performance has been evaluated by a suitable measure of segmentation accuracy. Also, a detailed analysis of the segmentation accuracy achieved by the algorithm as a function of noise and skew has been carried out. Received April 4, 1999 / Revised June 1, 1999

19.
Semantic relation extraction is a significant topic in the semantic web and natural language processing, with important applications such as knowledge acquisition, web and text mining, information retrieval and search engines, and text classification and summarization. Many approaches, such as rule-based, machine-learning, and statistical methods, have been applied, targeting different types of relation ranging from hyponymy, hypernymy, meronymy, and holonymy to domain-specific relations. In this paper, we present a computational method for extracting explicit and implicit semantic relations from text by applying statistical and linear-algebraic approaches alongside syntactic and semantic processing.
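The linear-algebraic side of implicit-relation extraction is often LSA-style: embed terms via a truncated SVD of a term-document matrix and compare them by cosine similarity, so terms that never co-occur directly can still come out related. A sketch with an illustrative matrix (not the paper's data):

```python
import numpy as np

def latent_term_vectors(term_doc, k=2):
    # Truncated SVD of the term-document matrix; rows of the result are
    # the terms embedded in a k-dimensional latent space.
    u, s, _ = np.linalg.svd(term_doc, full_matrices=False)
    return u[:, :k] * s[:k]

def cosine(x, y):
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
```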

20.
A Survey of Automatic Web-Page Text Extraction Techniques (Cited by: 2; self-citations: 0, others: 2)
This paper gives a fairly comprehensive survey of automatic text-extraction techniques for Web pages. By analysing the development of the three information-extraction models and four classes of machine-learning algorithms commonly used in this area, it reviews the current mainstream techniques, compares the applicable scope of each method, and concludes with an outlook on current hot topics and future trends in the field.


Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号