Similar Documents
A total of 20 similar documents were found.
1.
A parser applies grammar rules to generate a syntax tree that represents the syntactic structure of a sentence. This paper discusses using conceptual graphs to represent the semantics of natural language, and describes a method that starts from the syntax tree and generates conceptual graphs to interpret sentence meaning. The generation process is guided by the syntax tree: the conceptual graphs associated with each input word are joined into a larger conceptual graph that represents the semantics of the entire sentence.

2.
Semantic analysis is the foundation of natural language understanding by computers, and a breakthrough point and starting point for natural language understanding research. The main goal pursued by researchers in this field is correct semantic analysis of sentences. The conceptual graph is a concrete semantic model that supports the idea of conceptual structures; it is a directed connected graph with a graphical representation. This paper analyzes the necessity and current state of semantic research in Chinese information processing, describes the application of conceptual graphs in semantic research, and proposes directions for further work.

3.
Semantic dependency parsing, built on dependency theory, is a deep semantic analysis approach. It fuses the dependency structure of a sentence with its semantic information, better expressing the sentence's structure and implicit information. Semantic dependency parsing is useful in many high-level research tasks and applications. It faces two main difficulties: determining the semantic scheme, and designing an algorithm for automatic semantic dependency parsing. This paper gives a systematic introduction to semantic dependency parsing from these two angles.

4.
As software systems keep growing in complexity, guaranteeing the robustness and correctness of large, complex software systems has become a hot problem, and research on uncertain semantic computation is the key to solving it. This paper proposes a model of uncertain semantic computation, uses the model to design four different formal semantics for an example toy language, and demonstrates the model's correctness and flexibility by proving the equivalence of the four formal semantics.

5.
张志昌  曾扬扬  庞雅丽 《电子学报》2000,48(11):2162-2169
Textual entailment recognition aims to identify the logical relation between two given sentences. This paper combines the deep semantic information of sentences with the encoder of a Transformer model by constructing a fusion module for semantic roles and self-attention, strengthening the ability of self-attention to capture sentence semantics. To address the small scale and high noise of Chinese textual entailment datasets, a large-scale pre-trained language model is used to improve recognition performance on small datasets. Experimental results show that the proposed method reaches an accuracy of 80.28% on CNLI, the Chinese textual entailment evaluation dataset of the 17th China National Conference on Computational Linguistics.
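As a rough illustration of this kind of fusion, the sketch below adds a semantic-role-tag embedding to each token embedding before a standard Transformer encoder. This is a hypothetical minimal stand-in, not the authors' module: the vocabulary size, tag set, dimensions, and three-way output head are all placeholder assumptions.

```python
import torch
import torch.nn as nn

class SrlFusionEncoder(nn.Module):
    # Hypothetical sizes: 21128 tokens (BERT-Chinese vocab), 30 SRL tags.
    def __init__(self, vocab_size=21128, num_srl_tags=30, d_model=256, nhead=8, num_layers=4):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.srl_emb = nn.Embedding(num_srl_tags, d_model)   # one semantic-role tag per token
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.classifier = nn.Linear(d_model, 3)              # entail / neutral / contradict

    def forward(self, token_ids, srl_ids):
        # Fusion: add the role-tag embedding to each token embedding,
        # so self-attention sees semantic-role information directly.
        x = self.tok_emb(token_ids) + self.srl_emb(srl_ids)
        h = self.encoder(x)                                  # (batch, seq, d_model)
        return self.classifier(h.mean(dim=1))                # mean-pool then classify

model = SrlFusionEncoder()
logits = model(torch.randint(0, 21128, (2, 16)), torch.randint(0, 30, (2, 16)))
print(logits.shape)  # torch.Size([2, 3])
```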

6.
高留杰  赵文  张君福  姜波 《电子学报》2021,49(6):1132-1141
Understanding question intent is one of the main tasks in knowledge graph question answering, and semantic parsing is the mainstream approach to it. Its main challenges are making full use of knowledge graph context to understand implicit entities or relations in a question, as well as complex constraints such as time, ordering, and aggregation. To meet these challenges, this paper proposes Graph-to-Segment, a semantic parsing framework for knowledge graph question answering based on semantic segments. The parsing model in the framework combines the precision of rule-based methods with the coverage of deep learning, parsing a question into a sequence of semantic segments and constructing a semantic query graph. The framework represents question intent as a semantic query graph built from semantic segments, models semantic parsing as a segment-sequence generation task, uses an encoder-decoder neural network to parse the question into a segment sequence, and then assembles the segments into a semantic query graph. In addition, using context information from the knowledge graph, the model learns question representations with a graph neural network, improving the parsing of implicit entities or relations. Experiments on two knowledge graph question answering datasets show that the model achieves good performance.

7.
Semantic networks are one of the most important knowledge representation methods in expert systems. To better describe uncertain real-world knowledge with semantic networks, this paper defines the structure of nodes in a semantic network, representing each node as a six-tuple, and implements operations for initializing, adding, deleting, and modifying six-tuple nodes in a Prolog environment, improving the efficiency of problem solving when knowledge is represented with semantic networks.

8.
A question answering system is designed using latent semantic indexing (LSI), in which LSI is used to compute the similarity between the query question and candidate questions in the database. This is implemented by constructing a semantic matrix and applying singular value decomposition to remove "noise". The semantic correlations between words are thus represented more clearly, allowing the system to accept questions expressed in natural language. Finally, the whole system is tested experimentally and the results are analyzed; the system shows clear advantages over systems based on VSM and similar methods.
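A minimal sketch of the LSI step described above, under assumed toy data: build a term-question matrix, truncate its SVD to suppress "noise", and rank candidate questions by cosine similarity in the reduced concept space. This is not the paper's exact pipeline.

```python
import numpy as np

def lsi_similarity(term_doc, query_vec, k=2):
    """term_doc: (n_terms, n_docs) count matrix; query_vec: (n_terms,)."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    Uk, sk, Vk = U[:, :k], s[:k], Vt[:k, :]       # keep top-k concepts
    docs_k = (np.diag(sk) @ Vk).T                 # candidate questions in concept space
    q_k = query_vec @ Uk                          # fold the query into the same space
    sims = docs_k @ q_k / (np.linalg.norm(docs_k, axis=1) * np.linalg.norm(q_k) + 1e-12)
    return sims

# toy example: 5 terms, 4 candidate questions
A = np.array([[2, 0, 1, 0], [1, 1, 0, 0], [0, 2, 0, 1], [0, 0, 1, 2], [1, 0, 0, 1]], float)
q = np.array([1, 1, 0, 0, 0], float)
print(lsi_similarity(A, q))  # one similarity score per candidate question
```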

9.
In named entity recognition, dictionary matching can add rich text features, but the matched word-group information is usually combined by static normalization and lacks automatic inference ability. A semantically enhanced Chinese named entity recognition method based on dynamic dictionary matching is proposed. For each character of the input sentence, word groups are matched dynamically against the dictionary; a neural network weights the matched word groups, and enhanced character representations are obtained by combining word2vec with ALBERT; at the sequence modeling layer...
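Since the abstract is truncated, only the dictionary-matching step is sketched below: a trie over the dictionary yields, for each character position, every dictionary word starting there. The toy dictionary and sentence are illustrative assumptions; the weighting network and ALBERT features are omitted.

```python
def build_trie(words):
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["#"] = w  # end-of-word marker
    return root

def match_words(sentence, trie):
    matches = []  # (start, end, word)
    for i in range(len(sentence)):
        node = trie
        for j in range(i, len(sentence)):
            node = node.get(sentence[j])
            if node is None:
                break
            if "#" in node:
                matches.append((i, j + 1, node["#"]))
    return matches

trie = build_trie(["北京", "北京大学", "大学"])
print(match_words("我在北京大学学习", trie))
# [(2, 4, '北京'), (2, 6, '北京大学'), (4, 6, '大学')]
```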

10.
吴畏  赵川 《数字通信》2014,(4):32-34
For machines to become fully intelligent, natural language understanding is fundamental, and semantics is its most prominent problem. The choice of semantic analysis method plays an important role in processing different kinds of corpora. This paper introduces semantic analysis theory, focusing on methods such as semantic networks, case grammar, conceptual dependency theory, and ontologies; proposes principles to follow when choosing a semantic knowledge representation; and finally analyzes the importance and development trend of semantics-based natural language understanding.

11.
张光磊  徐雅斌 《通信学报》2014,35(Z2):30-227
An opinion mining method for hot topics on microblogs is proposed. Opinion sentences in a hot topic are first identified jointly by syntactic dependency templates and a support vector machine (SVM); opinion word pairs are then extracted using lexical and syntactic dependency relations, presenting the opinions in a hot topic concisely and clearly. Experiments demonstrate the effectiveness of the method.
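A minimal sketch of the SVM half of such a pipeline (the dependency-template rules are omitted): classifying sentences as opinion or non-opinion with character n-gram features. The tiny training set is a toy stand-in, not the paper's data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

train_sents = ["这部电影真好看", "今天股市上涨了", "我觉得服务太差", "会议下午三点开始"]
train_labels = [1, 0, 1, 0]  # 1 = opinion sentence, 0 = factual sentence

# character n-grams, since Chinese text has no whitespace word boundaries
clf = make_pipeline(CountVectorizer(analyzer="char", ngram_range=(1, 2)),
                    SVC(kernel="linear"))
clf.fit(train_sents, train_labels)
print(clf.predict(["我认为这个方案很糟糕"]))  # expected: opinion (1)
```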

12.
By analyzing the problem of summary sentence selection, this paper proposes an optimal sentence selection method: instead of the traditional approach of adding sentences one by one to build a summary, it deletes sentences one by one within a certain scope. Sentence selection proceeds in two stages. The first stage obtains a candidate summary sentence set, using a direct acquisition algorithm and an acquisition algorithm based on redundancy processing. The second stage deletes sentences step by step, measuring each sentence's contribution to the candidate set with different feature terms, yielding the optimal sentence selection algorithm. Using DUC2004 as the experimental corpus, the ROUGE scores of the generated summaries verify the necessity of sentence selection in summary generation, and comparison with the redundancy-based sentence selection method verifies the effectiveness of the proposed algorithm.

13.
In this paper, we propose a method for estimating emotion in Wakamono Kotoba that are not registered in the system, using Wakamono Kotoba example sentences as features. The proposed method applies Earth Mover's Distance (EMD) to vectors of words. In an evaluation experiment using 14 440 sentences, higher estimation accuracy was obtained by considering the emotional distance between words, an approach not used in conventional research, than by using only word importance values.
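A minimal sketch of applying EMD to word-vector sets, using the POT (Python Optimal Transport) library with uniform word weights and toy embeddings; the paper's example-sentence features and emotion labels are not reproduced here.

```python
import numpy as np
import ot  # pip install POT
from scipy.spatial.distance import cdist

def sentence_emd(vecs_a, vecs_b):
    # uniform mass on each word; cost = euclidean distance between vectors
    a = np.full(len(vecs_a), 1.0 / len(vecs_a))
    b = np.full(len(vecs_b), 1.0 / len(vecs_b))
    M = cdist(vecs_a, vecs_b)
    return ot.emd2(a, b, M)  # minimal transport cost = EMD

rng = np.random.default_rng(0)
s1 = rng.normal(size=(5, 4))  # 5 words, 4-d toy embeddings
s2 = rng.normal(size=(7, 4))
print(sentence_emd(s1, s2))
```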

14.
Speaker-dependent medium-vocabulary continuous speech recognition based on HMM/VQ
This paper discusses a continuous speech recognition method based on hidden Markov models (HMM) and vector quantization (VQ). With this method, an HMM is built for each word, and the best state-transition path is searched in a state-transition network composed of multiple models, so that continuous speech is recognized without prior word segmentation. Finite-state grammar constraints and other measures are used to improve recognition performance. The demonstration system recognizes 18 English sentence patterns and 150 words for a specific speaker. Tested with 312 utterances (2710 words in total), the recognition delay is 62% of the utterance duration, the average speaking rate is 2.32 words per second, and the word recognition accuracy is 97.3%.
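The core search in HMM/VQ recognition is a Viterbi pass over discrete VQ codeword observations. Below is a minimal single-model sketch with toy probabilities; the paper's word-network composition and grammar constraints are omitted.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """obs: VQ codeword indices; pi: (S,) priors; A: (S,S) transitions; B: (S,K) emissions."""
    S, T = len(pi), len(obs)
    logd = np.log(pi) + np.log(B[:, obs[0]])          # best log-prob ending in each state
    back = np.zeros((T, S), dtype=int)                # backpointers
    for t in range(1, T):
        scores = logd[:, None] + np.log(A)            # (from-state, to-state)
        back[t] = scores.argmax(axis=0)
        logd = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):                     # trace the best path backwards
        path.append(int(back[t][path[-1]]))
    return path[::-1], logd.max()

pi = np.array([0.9, 0.1])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])      # 3 VQ codewords
print(viterbi([0, 1, 2, 2], pi, A, B))
```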

15.
Two methods for generating training sets for a speech recognition system are studied. The first uses a nondeterministic statistical method to generate a uniform distribution of sentences from a finite state machine (FSM) represented in digraph form. The second method, a deterministic heuristic approach, takes into consideration the importance of word ordering to address the problem of coarticulation effects. The two methods are critically compared. The first algorithm, referred to as MARKOV, converts the FSM into a first-order Markov model. The digraphs are determined, transitive closure is computed, transition probabilities are assigned, and stopping criteria are established. An efficient algorithm for computing these parameters is described. Statistical tests are conducted to verify performance and demonstrate its utility. A second algorithm for generating training sentences, referred to as BIGRAM, uses heuristics to satisfy three requirements: adequate coverage of basic speech (subword) units; adequate coverage of words in the recognition vocabulary (intraword contextual units); and adequate coverage of word pairs, or bigrams (interword contextual units).
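In the spirit of the MARKOV algorithm, the sketch below treats an FSM as a first-order Markov model and samples training sentences by walking weighted transitions to a final state. The toy FSM and uniform transition probabilities are illustrative assumptions.

```python
import random

# state -> list of (word, next_state); "END" is the final state
fsm = {
    "S":    [("call", "VERB"), ("dial", "VERB")],
    "VERB": [("john", "END"), ("the", "DET")],
    "DET":  [("office", "END"), ("operator", "END")],
}

def sample_sentence(fsm, start="S", final="END", rng=random.Random(0)):
    state, words = start, []
    while state != final:
        word, state = rng.choice(fsm[state])  # uniform transition probabilities
        words.append(word)
    return " ".join(words)

for _ in range(3):
    print(sample_sentence(fsm))
```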

16.
Compared with the traditional method of adding sentences to form a summary in multi-document summarization, a two-stage sentence selection approach is proposed that generates the summary by deleting sentences from a candidate sentence set. It has two stages: acquisition of a candidate sentence set, and optimal selection of sentences. In the first stage, the candidate sentence set is obtained by a redundancy-based sentence selection approach. In the second stage, sentences are deleted from the candidate set according to their contribution to the whole set until the appointed summary length is reached. On a test corpus, the ROUGE scores of summaries obtained by the proposed approach prove its validity compared with the traditional method of sentence selection. The influence of the token chosen in the two-stage sentence selection approach on the quality of the generated summaries is also analyzed.
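A minimal sketch of the deletion-based second stage: repeatedly drop the sentence whose removal costs the least coverage, until the summary fits the length budget. The word-overlap coverage score is an illustrative stand-in for the paper's feature terms.

```python
def coverage(sentences):
    # number of distinct words covered by the set
    return len(set().union(*(set(s.split()) for s in sentences))) if sentences else 0

def delete_based_summary(candidates, max_words):
    summary = list(candidates)
    while sum(len(s.split()) for s in summary) > max_words and len(summary) > 1:
        # contribution of s = coverage lost if s were removed; drop the least useful
        drop = min(summary,
                   key=lambda s: coverage(summary) - coverage([t for t in summary if t != s]))
        summary.remove(drop)
    return summary

cands = ["the cat sat on the mat",
         "a dog barked at the cat",
         "the mat was red",
         "dogs and cats are pets"]
print(delete_based_summary(cands, 14))
```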

17.
In recent years, with the development of the social Internet of Things (IoT), all kinds of data have accumulated on the network. These data contain a lot of social information and opinions, yet they are rarely fully analyzed, which is a major obstacle to the intelligent development of the social IoT. In this paper, we propose a sentence similarity analysis model to analyze the similarity of people's opinions on hot topics in social media and news pages. Most of these data are unstructured or semi-structured sentences, so the accuracy of sentence similarity analysis largely determines the model's performance. To improve accuracy, we propose a novel method of sentence similarity computation that extracts the syntactic and semantic information of semi-structured and unstructured sentences. We mainly consider the subjects, predicates, and objects of sentence pairs and use the Stanford Parser to classify dependency relation triples to calculate the syntactic and semantic similarity between two sentences. Finally, we verify the performance of the model on the Microsoft Research Paraphrase Corpus (MRPC), which consists of 4076 pairs of training sentences and 1725 pairs of test sentences, mostly drawn from news and social data. Extensive simulations demonstrate that our method outperforms other state-of-the-art methods in terms of correlation coefficient and mean deviation.
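A minimal sketch of the triple-comparison idea, assuming (subject, predicate, object) triples have already been extracted (the paper uses the Stanford Parser for this). The toy embedding table and equal slot weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# toy embedding table; in practice these would be trained word vectors
emb = {w: rng.normal(size=8) for w in ["cat", "dog", "chased", "followed", "mouse", "ball"]}

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def triple_similarity(t1, t2, weights=(1/3, 1/3, 1/3)):
    """t1, t2: (subject, predicate, object) triples; weighted slot-wise cosine."""
    return sum(w * cos(emb[a], emb[b]) for w, a, b in zip(weights, t1, t2))

print(triple_similarity(("cat", "chased", "mouse"), ("dog", "followed", "ball")))
```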

18.
A model of a linguistic information source is proposed as a grammar that generates a language over some finite alphabet. It is pointed out that grammatical sentences generated by the source grammar contain intrinsic "redundancy" that can be exploited for error correction. Symbols occurring in the sentences are composed according to syntactic rules determined by the source grammar, and hence are different in nature from the lexicographical source symbols assumed in information theory and algebraic coding theory. Almost all programming languages and some simple natural languages can be described by the linguistic source model proposed in this paper. In order to combat excessive errors on very noisy channels, a conventional encoding-decoding scheme that does not utilize the source structure is introduced into the communication system. Decoded strings coming out of the lexicographical decoder may not be grammatical, which indicates that some uncorrected errors remain in the individual sentences; these are reprocessed by a syntactic decoder that converts ungrammatical strings into legal sentences of the source language by the maximum-likelihood criterion. Thus more errors in the strings coming out of the noisy channel can be corrected by the syntactic decoder using syntactic analysis than the lexicographical decoder is capable of correcting or even of detecting. To design the syntactic decoder we use parsing techniques from the study of compilers and formal languages.

19.
Laurence Danlos 《电信纪事》1989,44(1-2):94-100
Automatic generation is a recent domain dedicated to the production of written texts from abstract representations or from numerical data. One way to produce sentences while avoiding the difficulties of automatic generation is to use pre-recorded texts. However, this method becomes cumbersome when there are many pre-recorded texts. It can be improved by allowing them to include variables. Yet, it will be shown that the use of variables requires the pre-recorded sentences to be annotated with linguistic information so that syntactic operations can be applied to them. In fact, the use of variables entails adopting techniques that fall within the domain of automatic generation. Such techniques are presented briefly.

20.
Chinese question classification based on KNN
Chinese question classification is an important component of question answering systems, and the quality of classification directly affects the quality of the system. The semantic similarity between questions is computed using the sememe tree of HowNet and used as the distance measure between sentences; a KNN classifier is then built for question classification. The nearest-neighbor algorithm, the KNN algorithm, and an improved KNN algorithm are compared experimentally. The results show that the weighted KNN classifier performs best, reaching a precision of 89.8%.
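A minimal sketch of a weighted KNN classifier over a pluggable similarity function; the toy character-overlap similarity here is only a stand-in for the HowNet sememe-tree similarity.

```python
from collections import defaultdict

def weighted_knn(query, train, sim, k=3):
    """train: list of (question, label); sim(q1, q2) -> similarity in [0, 1]."""
    neighbors = sorted(train, key=lambda t: sim(query, t[0]), reverse=True)[:k]
    votes = defaultdict(float)
    for q, label in neighbors:
        votes[label] += sim(query, q)  # similarity-weighted vote
    return max(votes, key=votes.get)

# toy character-overlap (Jaccard) similarity as a placeholder for sememe similarity
def toy_sim(a, b):
    return len(set(a) & set(b)) / len(set(a) | set(b))

train = [("什么是人工智能", "DEFINITION"), ("谁发明了电话", "PERSON"),
         ("长城有多长", "NUMBER"), ("什么是机器学习", "DEFINITION")]
print(weighted_knn("什么是深度学习", train, toy_sim))  # expected: DEFINITION
```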

