期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Automatically building a knowledge base through natural language text analysis

Julia E. Hodges Jose L. Cordova 《国际智能系统杂志》1993,8(9):921-938

相似文献

2.

Logical form generation as abduction: Part I. Representation of linguistic concepts

Venu Dasigi 《国际智能系统杂志》1994,9(7):571-608

For some time, researchers have become increasingly aware that some aspects of natural language processing can be viewed as abductive inference. This article describes knowledge representation in dual-route parsimonious covering theory, based on an existing diagnostic abductive inference model, extended to address issues specific to logic form generation. the two routes of covering deal with syntactic and semantic aspects of language, and are integrated by attributing both syntactic and semantic facets to each “open class” concept. Such extensions reflect some fundamental differences between the two task domains. the syntactic aspect of covering is described to show the differences, and some interesting properties are established. the semantic associations are characterized in terms of how they can be used in an abductive model. A major significance of this work is that it paves the way for a nondeductive inference method for word sense disambiguation and logical form generation, exploiting the associative linguistic knowledge. This approach sharply contrasts with others, where knowledge has usually been laboriously encoded into pattern-action rules that are hard to modify. Further, this work represents yet another application for the general principle of parsimonious covering. © 1994 John Wiley & Sons, Inc. 相似文献

3.

融合形态特征的最大熵蒙古文词性标注模型

张贯虹斯·劳格劳乌达巴拉《计算机研究与发展》2011,48(12)

最大熵模型以其能够较好地包容各种约束信息及与自然语言模型相适应等优点在词性标注研究中取得了良好的效果.因此,将其作为基本框架,提出了一种融合语言特征的最大熵蒙古文词性标注模型.首先,根据蒙古文构词特点及统计分析结果,定义并选取特征模板,利用训练语料提取了大量的候选特征集合,针对错误或者无效的特征通过设置一些规则筛选特征.然后,训练最大熵概率模型参数.实验结果表明,融合蒙古文形态特征的最大熵模型可以较好地标注蒙古文. 相似文献

4.

电子词典与词汇知识表达 总被引：3，自引：0，他引：3

陈克健《中文信息学报》2002,16(4):2-12

词汇知识的表达与取得是自然语言处理极须克服的问题,本论文提出一个初步的架构与常识的抽取机制。语言处理系统是以词为讯息处理单元,登录在词项下的讯息可以包括统计、语法、语义、常识等。语言分析系统利用〈词〉为引得取得输入语句中相关词汇的语法、语义、常识等信息,让语言处理系统有更好的聚焦能力,可以藉以解决分词歧义、结构的歧义。对于不易以人工整理取得的常识,本论文也提出计算机自动学习的策略,以渐进式的方式累积概念与概念之间的语义关系,来增进语言系统的分析能力。这个策略可行的几个关键技术,包括(1)未登录词判别及语法语义自动分类, (2)词义分析, (3)应用语法语义及常识的剖析系统。相似文献

5.

基于配价的维吾尔语框架语义知识库的构建

吾买尔江·库尔班阿里甫·库尔班《中文信息学报》2007,21(6):36-42

本文阐述了以配价作为基本描写法、真实语料为事实依据的维吾尔语框架语义知识库(简称框架网FrameNet)的构建,该知识库在构建维吾尔语词汇及其所属框架的语义词典等诸多领域有着广阔的应用空间和发展前景。提出了研究维吾尔语中句法功能和概念结构(也就是语义结构) 之间的关系, 以及建立用于自然语言处理的维吾尔语网上词汇知识库的意义。在维吾尔语的研究中引入了框架语义知识库(框架网)。框架语义知识库作为一种网上词汇语料库, 包括对每个词位( lexeme)的各个涵义的句法、语义信息的详尽描述。本文为维吾尔语框架语义知识库中各个框架元素的句法、语义特征的说明等自然语言信息处理研究提出新的研究思路,对基于配价的维吾尔语框架语义知识库构建的方法进行了探讨。相似文献

6.

一种基于语义分析的汉语语音识别纠错方法

韦向峰张全熊亮《计算机科学》2006,33(10):152-155

汉语语音识别的研究越来越重视与语言处理的结合,语音识别已经不是单纯的语音信号处理。N-gram语言模型应用到语音识别系统中,大大增强了系统的正确率和稳定性,但它也有其自身的局限性,使得语音识别出现许多语法和语义的错误结果。本文分析了语音识别产生语音和文字方面的错误的原因和类型,在概念层次网络语言模型的基础上提出了一种基于语句语义分析和混淆音矩阵的语音识别纠错方法。通过三个发音人、5万字的声音语料和216句实验语句的纠错测试,本文的纠错系统在纠正语义搭配型错误方面有比较好的表现,可克服N-gram语言模型带来的一些缺陷。本文提出的纠错方法还可以融合到语音识别系统中,以便更好地为语音识别的纠错处理服务。相似文献

7.

A multilingual FrameNet-based grammar and lexicon for controlled natural language

Normunds Gruzitis Dana Dannélls 《Language Resources and Evaluation》2017,51(1):37-66

Berkeley FrameNet is a lexico-semantic resource for English based on the theory of frame semantics. It has been exploited in a range of natural language processing applications and has inspired the development of framenets for many languages. We present a methodological approach to the extraction and generation of a computational multilingual FrameNet-based grammar and lexicon. The approach leverages FrameNet-annotated corpora to automatically extract a set of cross-lingual semantico-syntactic valence patterns. Based on data from Berkeley FrameNet and Swedish FrameNet, the proposed approach has been implemented in Grammatical Framework (GF), a categorial grammar formalism specialized for multilingual grammars. The implementation of the grammar and lexicon is supported by the design of FrameNet, providing a frame semantic abstraction layer, an interlingual semantic application programming interface (API), over the interlingual syntactic API already provided by GF Resource Grammar Library. The evaluation of the acquired grammar and lexicon shows the feasibility of the approach. Additionally, we illustrate how the FrameNet-based grammar and lexicon are exploited in two distinct multilingual controlled natural language applications. The produced resources are available under an open source license. 相似文献

8.

数据库NL界面上汉语查询的EAAD模型 总被引：6，自引：0，他引：6

张亚南徐洁磐《计算机学报》1993,16(12):881-888

本文给出一种旨在描述数据库ＮＬ界面上汉语查询的语法，语义结构的ＥＡＡＤ模型。通过该模型，数据库自然语言界面上的查询分析与理解，可以与其相应的背景知识机地结合起来，ＥＡＡＤ模型适合于描述任意构形上的ＥＲ模型或与其相应的关系模型上的汉语查询，尤其是描述涉及多实体，多路径的查询的结构规律，有利于增强数据库ＮＬ界面的理解力和可移植性。相似文献

9.

Connectionist systems for natural language understanding

B. Selman 《Artificial Intelligence Review》1989,3(1):23-31

We will discuss various connectionist schemes for natural language understanding (NLU). In principle, massively parallel processing schemes, such as connectionist networks, are well-suited for modelling highly integrated forms of processing. The connectionist approach towards natural language processing is motivated by the belief that a NLU system should process knowledge from many different sources, e.g. semantic, syntactic, and pragmatic, in just this sort of integrated manner. The successful use of spreading activation for various disambiguation tasks in natural language processing models lead to the first connectionist NLU systems. In addition to describing in detail a connectionist disambiguation system, we will also discuss proposed connectionist approaches towards parsing and case role assignment. This paper is intended to introduce the reader to some of the basic ideas behind the connectionist approach to NLU. We will also suggest some directions for future research. 相似文献

10.

基于多语义融合的反讽识别

樊小超杨亮林鸿飞刁宇峰申晨楚永贺《中文信息学报》2021,35(6):103-111

反讽是一种复杂的语言现象,被广泛应用于社交媒体中.如何让计算机具有识别反讽的能力,成为了自然语言处理研究领域的热门研究内容之一.该文针对反讽识别中缺乏上下文语境信息和修辞表达信息的问题,提出了基于多语义融合的反讽识别方法.该方法采用ELMo从大规模反讽文本中训练得到领域词嵌入表示,并融合基于词性和基于风格信息的语义表示... 相似文献

11.

New Functionalities of the System for Processing Natural Language Specifications and its Operating Environment

N. M. Mishchenko M. K. Morokhovets O. D. Felizhanko Y. V. Shtelik N. N. Shchogoleva 《Cybernetics and Systems Analysis》2018,54(6):883-891

This paper describes the operating environment of the language processor intended for processing behavioral models of systems represented in natural language. This environment provides the tuning of the language processor; the main stage of this tuning is the construction of a syntactic table. An approach is proposed for automating the construction of grammar rules intended for replenishment of the syntactic table used by the language processor. 相似文献

12.

基于轻语义λ-演算的汉语陈述句灵活语序研究

刘冬宁邓春国滕少华张巍梁路《中文信息学报》2016,30(3):23-29

目前,自然语言处理已经从句法、语法层面走向轻语义层面。对于汉语陈述句的处理,传统的方法是采用Lambek演算来进行处理。但是传统的Lambek演算无法处理汉语中的灵活语序问题,而现有的方法,如加入模态词、新连接词等,又因为其进一步使得本已是NP-hard的Lambek演算时间复杂度变大,并不适合当前的计算机处理。基于此,该文提出了λ-Lambek演算,即采用Lambek演算来对汉语陈述句进行句法演算,并通过Curry-Howard对应理论与λ-演算来对汉语陈述句进行轻语义模型的构建。λ-Lambek演算不仅能够对汉语陈述句进行轻语义演算,而且还能对汉语陈述句灵活语序进行处理。相似文献

13.

基于多策略的藏语语义角色标注研究

龙从军康才畯李琳江荻《中文信息学报》2014,28(5):176-181

语义角色标注研究对自然语言处理具有十分重要的意义。英汉语语义角色标注研究已经获得了很多成果。然而藏语语义角色标注研究不管是资源建设,还是语义角色标注的技术探讨都鲜有报道。藏语具有比较丰富的句法标记,它们把一个句子天然地分割成功能不同的语义组块,而这些语义组块与语义角色之间存在一定的对应关系。根据这个特点,该文提出规则和统计相结合的、基于语义组块的语义角色标注策略。为了实现语义角色标注,文中首先对藏语语义角色进行分类,得到语义角色标注的分类体系;然后讨论标注规则的获得情况,包括手工编制初始规则集和采用错误驱动学习方法获得扩充规则集;统计技术上,选用了条件随机场模型,并添加了有效的语言特征,最终语义角色标注的结果准确率、召回率和F值分别达到82.78%、85.71%和83.91%。相似文献

14.

Shallow semantic labeling using two-phase feature-enhanced string matching

Samuel W.K. Chan 《Expert systems with applications》2009,36(6):9729-9736

A two-phase annotation method for semantic labeling in natural language processing is proposed. The dynamic programming approach stresses on a non-exact string matching which takes full advantage of the underlying grammatical structure of the parse trees in a Treebank. The first phase of the labeling is a coarse-grained syntactic parsing which is complementary to a semantic dissimilarities analysis in its latter phase. The approach goes beyond shallow parsing to a deeper level of case role identification, while preserving robustness, without being bogged down into a complete linguistic analysis. The paper presents experimental results for recognizing more than 50 different semantic labels in 10,000 sentences. Results show that the approach improves the labeling, even though with incomplete information. Detailed evaluations are discussed in order to justify its significances. 相似文献

15.

一种提高自然语言文本水印容量的算法 总被引：1，自引：0，他引：1

黄友荣徐向阳吴霞徐晓静《计算机应用与软件》2007,24(8):180-182

自然语言文本水印算法通过对文本句子的语法结构或语义结构进行转换来嵌入水印信息.对句子的语法和TMR(Text Meaning Representation)语义结构进行分析,利用句子语法结构的转换不会改变句子TMR语义结构这一性质将语法水印技术和语义水印技术有效结合起来,提出了一种提高自然语言文本水印嵌入容量的算法.该算法的优点是将控制信息和水印信息分离,并根据每个句子本身的特性动态嵌入相应数量的水印信息.实验表明该算法和原有的语法或语义水印算法相比,水印嵌入容量有一定程度的提高. 相似文献

16.

Terminological inconsistency analysis of natural language requirements

《Information and Software Technology》2016

Context: Terminological inconsistencies owing to errors in usage of terms in requirements specifications could result into subtle yet critical problems in interpreting and applying these specifications into various phases of SDLC.Objective: In this paper, we consider special class of terminological inconsistencies arising from term-aliasing, wherein multiple terms spread across a corpus of natural language text requirements may be referring to the same entity. Identification of such alias entity-terms is a difficult problem for manual analysis as well as for developing tool support.Method: We consider the case of syntactic as well as semantic aliasing and propose a systematic approach for identifying these. Identification of syntactic aliasing involves automated generation of patterns for identifying syntactic variances of terms including abbreviations and introduced-aliases. Identification of semantic aliasing involves extracting multidimensional features (linguistic, statistical, and locational) from given requirement text to estimate semantic relatedness among terms. Based upon the estimated relatedness and standard language database based refinement, clusters of potential semantic aliases are generated. Results of these analyses with user refinement lead to generation of entity-term alias glossary and unification of term usage across requirements.Results: A prototype tool was developed to assess the effectiveness of the proposed approach for an automated analysis of term-aliasing in the requirements given as plain English language text. Experimental results suggest that approach is effective in identifying syntactic as well as semantic aliases, however, when aiming for higher recall on larger corpus, user selection is necessary to eliminate false positives.Conclusion: This proposed approach reduces the time-consuming and error-prone task of identifying multiple terms which might be referring to the same entity to a process of tool assisted identification of such term-aliases. 相似文献

17.

SREC: Discourse-level semantic relation extraction from text

Mohammad-hadi Zahedi Mohsen Kahani 《Neural computing & applications》2013,23(6):1573-1582

Semantic relation extraction is a significant topic in semantic web and natural language processing with various important applications such as knowledge acquisition, web and text mining, information retrieval and search engine, text classification and summarization. Many approaches such rule base, machine learning and statistical methods have been applied, targeting different types of relation ranging from hyponymy, hypernymy, meronymy, holonymy to domain-specific relation. In this paper, we present a computational method for extraction of explicit and implicit semantic relation from text, by applying statistic and linear algebraic approaches besides syntactic and semantic processing of text. 相似文献

18.

Direct microprogrammed execution of the intermediate text from a high-level language compiler

Richard E. Merwin Francois Robert Broca 《Computer Languages, Systems and Structures》1975,1(1):17-28

Microprogramming commonly executed operations can improve the computational speed of data processing systems. This paper describes how microprogramming may be used to execute directly the intermediate text generated by a high-level language compiler after syntactic and semantic analysis of the input source program.Direct microprogrammed execution of common forms of intermediate text—i.e. quadruples, triples, and duos—has been simulated. A comparison is made, in terms of storage requirements and execution time, of this direct microprogrammed system with the present methods which result in machine language representation and execution of the intermediate text. Direct generation of a microprogram from the high-level language statements is also examined.Timing assumptions for comparative purposes have been based on the IBM 360 MOD 50 system. Simulation and timing estimates for the microprograms have been carried out on a microprogram directed simulator which closely represents the architectural organization of the MOD 50. 相似文献

19.

日汉机器翻译中词的自动切分技术

王启祥王锡江陈未竞《中文信息学报》1988,2(3):76

本文阐述NDJCMT系统中词的自动切分技术, NDJCMT是我们实现的一个日汉机器翻译实验系统, 词的自动切分是日语词素分析、句法及语义分析的基础, 是一项日本语计算机信息处理的基础性研完课题, 它涉及对语言本身的研究。日语和汉语类似, 词及词之间无分隔符, 通常假名、汉字混写, 给词的切分造成了困难。作者根据日语的特点, 提出了一种“ 句节数最少” 词的自动切分方法, 使用语言编程且在一机上获得实现。相似文献

20.

SyMSS: A syntax-based measure for short-text semantic similarity

Jesús Oliva^{Author Vitae} José Ignacio Serrano Author VitaeMaría Dolores del Castillo Author Vitae Ángel Iglesias Author Vitae 《Data & Knowledge Engineering》2011,70(4):390-405

Sentence and short-text semantic similarity measures are becoming an important part of many natural language processing tasks, such as text summarization and conversational agents. This paper presents SyMSS, a new method for computing short-text and sentence semantic similarity. The method is based on the notion that the meaning of a sentence is made up of not only the meanings of its individual words, but also the structural way the words are combined. Thus, SyMSS captures and combines syntactic and semantic information to compute the semantic similarity of two sentences. Semantic information is obtained from a lexical database. Syntactic information is obtained through a deep parsing process that finds the phrases in each sentence. With this information, the proposed method measures the semantic similarity between concepts that play the same syntactic role. Psychological plausibility is added to the method by using previous findings about how humans weight different syntactic roles when computing semantic similarity. The results show that SyMSS outperforms state-of-the-art methods in terms of rank correlation with human intuition, thus proving the importance of syntactic information in sentence semantic similarity computation. 相似文献