Found 20 similar documents (search time: 0 ms)
A fundamental issue in natural language processing is that it requires an enormous quantity of preprogrammed knowledge about both the language and the domain under examination. Manual acquisition of this knowledge is tedious and error-prone, so an automated acquisition process would prove invaluable. This paper references and surveys a range of systems that have been developed at the intersection of machine learning and natural language processing. Each system is categorised under either the symbolic or the connectionist paradigm, and its characteristics and limitations are described.

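Knowledge graphs are a new method of knowledge representation. Starting from an ontological perspective, this paper compares the ontology of knowledge graphs with the ontologies underlying the knowledge representations of Aristotle, Kant, and Peirce. The comparison demonstrates the validity and primitiveness of the knowledge graph method and shows that knowledge graphs are a more general method of knowledge representation. From the viewpoint of knowledge graph ontology, the paper studies knowledge graph representations of the various classes of logic words. Taking the characteristics of Chinese into account, it studies and reveals, from a structural perspective, the commonalities and regularities of logic words, and further elaborates the knowledge graph idea that "structure is meaning". The knowledge graph analysis of logic words lays a foundation for building lexicons for natural language analysis.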

This paper addresses some of the issues that arise in representing temporal information in the database context. It deals not only with the explicit representation of temporal information but with mechanisms for reasoning with it as well. It addresses the issue of processing natural language queries with explicit temporal references. The three issues of knowledge representation, natural language processing and query processing are addressed using the axiomatic framework based on equational logic.

Statistical natural language processing (NLP) and evolutionary algorithms (EAs) are two very active areas of research which have been combined many times. In general, statistical models applied to NLP tasks require specific algorithms to be designed for training them and for applying them to new texts. Developing such algorithms may be hard. This makes EAs attractive, since they offer a general design while providing high performance under particular conditions of application. In this article, we present a survey of many works which apply EAs to different NLP problems, including syntactic and semantic analysis, grammar induction, summarisation and text generation, document clustering and machine translation. The review concludes by identifying which problems, or which particular aspects of those problems, are best suited to being solved with an evolutionary algorithm.

A formal, computational, semantically clean representation of natural language is presented. This representation captures the fact that logical inferences in natural language crucially depend on the semantic relation of entailment between sentential constituents such as determiner, noun, adjective, adverb, preposition, and verb phrases. The representation parallels natural language in that it accounts for human intuition about entailment of sentences, it preserves its structure, it reflects the semantics of different syntactic categories, it simulates conjunction, disjunction, and negation in natural language by computable operations with provable mathematical properties, and it allows one to represent coordination on different syntactic levels. The representation demonstrates that Boolean semantics of natural language can be successfully modeled in terms of representation and inference by knowledge representation formalisms with Boolean semantics. A novel approach to the problem of automatic inferencing in natural language is presented. The algorithm for updating a computer knowledge base and reasoning with explicit negative, disjunctive, and conjunctive information, based on computing the subsumption relation between the representations of the appropriate sentential constituents, is discussed with examples.

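Research on the Application of Natural Language Processing Techniques to Drug Patent Retrieval   Total citations: 2, self-citations: 2, citations by others: 0
This paper studies the application of natural language processing techniques to drug patent retrieval. It develops translation software that semi-automatically converts the textual descriptions of generic variables in drug patents into rule-conforming GSCCT format, laying a foundation for building drug patent retrieval databases accurately and efficiently.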

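With the rapid development of Internet technology, a large amount of case-related information now exists online, which gives investigators leads but also poses great challenges. This paper designs and implements an online case analysis system that uses natural language processing techniques to identify information such as online usernames and URLs in massive collections of online case files and to build the network of relations among them. Depending on the file type, it applies either structured analysis or a username recognition technique that relies mainly on a combination of rules and statistics, supplemented by a user-assisted knowledge base. Experiments show that applying this method in a cybercrime case analysis system helps investigators solve cases quickly.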

This article describes the natural language processing techniques used in two computer-assisted language instruction programs: VERBCON and PARSER. VERBCON is a template-type program which teaches students how to use English verb forms in written texts. In the exercises verbs have been put into the infinitive, and students are required to supply appropriate verb forms. PARSER is intended to help students learn English sentence structure. Using a lexicon and production rules, it generates sentences and asks students to identify their grammatical parts. The article contends that only by incorporating natural language processing techniques can these programs offer a substantial number of exercises and at the same time provide students with informative feedback. Alan Bailin is director of the Effective Writing Program at the University of Western Ontario, London, Ontario, Canada. Philip Thomson is a programmer in the Faculty of Medicine, University of Western Ontario.

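Natural language processing (NLP) can convert unstructured documents in the construction domain into structured information, making it easier for practitioners to manage construction projects efficiently on a daily basis. NLP algorithms have developed extensively in recent years, but research on NLP techniques in the construction domain is still at an early stage. Based on a survey of the past ten years of literature on NLP in construction engineering, this paper reviews domestic and international research at the technical and application levels. It introduces the technical development of NLP, its common methods, and the functions implemented by relevant open-source tools, and it summarises the applications of NLP across the stages of construction under three headings: statistical analysis tools, application systems, and others. In addition, it discusses the problems facing NLP applications in the construction domain, summarises their causes, and offers an outlook from three perspectives: technology, the construction industry, and government.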

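Xiang Wei (项炜), 《计算机应用》 (Journal of Computer Applications), 2013, 33(5): 1446-1449
The Common Information Model (CIM) is an open industry standard that has been implemented in many products, and a large number of bugs have been found and fixed. To reduce the time and effort needed to locate root causes manually, a method based on natural language processing is proposed for automatically debugging CIM bugs. First, a maximum entropy model is used to segment the document descriptions of resolved bugs into words. Then, based on the constructed dictionary, simHash is used to find fixed bugs that are highly duplicated. Finally, document processing methods are used to analyse the trace provided by the customer in order to locate the problem and its solution. The experiments achieved an accuracy of 87.5%, demonstrating the effectiveness of the method.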
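As an illustration of the duplicate-detection step mentioned in the abstract above, the following is a minimal SimHash sketch. It is not the paper's implementation: the equal token weights, the use of MD5 as the underlying hash, the 64-bit fingerprint width, and the function names `simhash` and `hamming` are all assumptions made for the example.

```python
import hashlib


def simhash(tokens, bits=64):
    """Compute a SimHash fingerprint from a list of string tokens.

    Assumption: every token has weight 1 and is hashed with MD5
    (truncated to `bits` bits); real systems often weight by frequency.
    """
    weights = [0] * bits
    for tok in tokens:
        # Take the first 8 bytes of the MD5 digest as a 64-bit integer.
        h = int.from_bytes(hashlib.md5(tok.encode("utf-8")).digest()[:8], "big")
        for i in range(bits):
            # Each set bit votes +1 for that position, each clear bit votes -1.
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Two bug descriptions would then be treated as near-duplicates when the Hamming distance between their fingerprints falls below a chosen threshold; the threshold itself is a tuning parameter and is not given in the abstract.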

Abstract: In this paper we present a novel approach that allows humans to create meaningful web annotations in controlled natural language. The controlled natural language serves as a high-level interface language which enables human annotators to summarize individual web pages of a website and to express domain-specific ontological knowledge about that website in an unambiguous subset of English. The annotation process is backed up by an intelligent text editor which supports the writing process of the controlled natural language with the help of predictive interface techniques. The text editor runs as a Java applet and is connected over the Internet to a controlled natural language processor and to a reasoning service (consisting of a theorem prover and a model builder). The controlled language processor translates the summaries of web pages and the ontological knowledge about a website into first-order predicate logic and the reasoning service combines this information into a set of micro-theories for consistency and informativity checking as well as for question-answering. Specification texts written in controlled natural language are both human-readable and machine-processable and can be easily exported and distributed as web feeds.

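The hidden Markov model is an important probabilistic model for sequence data processing and statistical learning, and in recent years it has been applied successfully to many natural language processing tasks. This paper briefly introduces hidden Markov models and discusses in detail the difficulties of applying them to part-of-speech tagging, the construction of the model, and the Viterbi algorithm. It also presents a hidden-Markov-model-based process for extracting header information from Chinese research papers, together with solutions to key problems such as learning the model structure and training its parameters.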
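To make the Viterbi decoding mentioned in the abstract above concrete, here is a minimal sketch for a toy part-of-speech tagger. The two-tag state set, the probability tables, and the function name `viterbi` are assumptions chosen for illustration; they do not come from the paper.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               the previous state on that path)
    best = [{s: (start_p[s] * emit_p[s].get(obs[0], 0.0), None) for s in states}]
    for t in range(1, len(obs)):
        row = {}
        for s in states:
            # Choose the predecessor that maximises the path probability.
            row[s] = max(
                (best[t - 1][p][0] * trans_p[p][s] * emit_p[s].get(obs[t], 0.0), p)
                for p in states
            )
        best.append(row)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return path


# Hypothetical toy model: N = noun, V = verb.
states = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
emit_p = {"N": {"dogs": 0.6, "bark": 0.1}, "V": {"dogs": 0.1, "bark": 0.7}}

tags = viterbi(["dogs", "bark"], states, start_p, trans_p, emit_p)
print(tags)  # ['N', 'V']
```

For long sequences a practical implementation would work with log probabilities to avoid numerical underflow; the sketch multiplies raw probabilities only to keep the recurrence easy to read.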

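Mutual Discovery of Semantic Relations and Syntactic Patterns in Natural Language Processing*   Total citations: 3, self-citations: 0, citations by others: 3
This paper addresses how to build a library of semantic relations between Chinese characters and words within the National Science and Technology Infrastructure Platform, and how to use an initial semantic relation library to acquire syntactic patterns and new relations automatically. It adopts the concept of syntactic patterns, proposes a method that discovers new patterns from existing relations and new relations from existing patterns, designs the corresponding models, and implements a Chinese semantic relation knowledge base system. Using this system together with natural language processing techniques, more than 200 valid relations were acquired automatically and at scale from the Sogou corpus and Baidu Baike page files, and more than 1,000 valid new relations such as inheritance and synonymy were extracted from them. Experiments show an efficiency of about 40%, which depends mainly on the distance setting between the query words in a relation and on the nature of the corpus itself.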

In conventional algorithms, the lack of entity information, reference, and semantic relations in the current corpus leads to low precision and efficiency in constructing cross-language bilingual mappings. Drawing on natural language processing and machine translation technology, this paper aims to solve the problem by establishing a parallel corpus for information extraction based on the OntoNotes corpus, combining automatic extraction with manual adjustment. To verify the validity of the parallel corpus constructed in this paper, a comparative experiment was carried out on the corpus. The corpus entity alignment rate, anaphora absence, and syntactic structure were analysed in detail based on statistics. The data set performs well in language processing and machine translation. The parallel corpus for information extraction constructed in this paper can produce highly precise, stable, and efficient information in the process of bilingual mapping, which provides an effective parallel corpus for research on bilingual mapping in machine translation.

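Traditional natural language processing methods feed large numbers of hand-crafted features into statistical learning models in order to process text. Conditional random field models currently achieve good results on many natural language processing tasks, but hand-crafting features and the sheer number of features make the model harder to build, slow down its computation, and make it prone to overfitting. To address these problems, a tensor-extended conditional random field model is proposed. It uses tensor transformations to construct complex features automatically, which reduces the workload of hand-crafting features, and it uses the Tucker decomposition algorithm to speed up the model. The resulting model can be used for a variety of natural language processing tasks. Experiments show that, with the same basic features extracted, the proposed model outperforms the traditional conditional random field model on several natural language processing tasks and has practical value.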

Computer-Interpretable Guidelines (CIGs) are the dominant medium for the delivery of clinical decision support, given the evidence-based nature of their source material. Therefore, these machine-readable versions have the ability to improve practitioner performance and conformance to standards, with availability at the point and time of care. The formalisation of Clinical Practice Guideline knowledge in a machine-readable format is a crucial task to make it suitable for the integration in Clinical Decision Support Systems. However, the current tools for this purpose reveal shortcomings with respect to their ease of use and the support offered during CIG acquisition and editing. In this work, we characterise the current landscape of CIG acquisition tools based on the properties of guideline visualisation, organisation, simplicity, automation, manipulation of knowledge elements, and guideline storage and dissemination. Additionally, we describe the CompGuide Editor, a tool for the acquisition of CIGs in the CompGuide model for Clinical Practice Guidelines that also allows the editing of previously encoded guidelines. The Editor guides the users throughout the process of guideline encoding and does not require proficiency in any programming language. The features of the CIG encoding process are revealed through a comparison with already established tools for CIG acquisition.

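Classification trees whose node labels are expressed in natural language are hard for automated software agents to reason over. To solve this problem, the natural language label of each node in the tree is converted into a machine-recognisable logical expression through part-of-speech tagging, word sense disambiguation, connective disambiguation, and the definition and conversion of a constrained natural language. The whole classification tree is thereby converted into a lightweight ontology suited to automated reasoning in semantic matching for data integration, document classification, and semantic search. This promotes automated reasoning over ontological knowledge and lays a foundation for future automatic text retrieval.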

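Augmented transition networks are intuitive and quantitative, and applying them to syntactic processing can advance the procedural study of natural language. Ambiguity resembles "multi-lane one-way" traffic: the cognitive choice among lanes has a non-backtracking priority. The garden path phenomenon is "single-lane" one-way traffic: an early default choice implies a later decision that overturns it, and the backtracking, leap-like reinterpretation gives it the "aha" experience known from psychological experiments. Procedural syntactic study of the two phenomena confirms the claim that ambiguity and garden path phenomena have distinguishing features.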
