20 similar documents found (search time: 125 ms)
1.
周继鹏. 《微电子学与计算机》, 1993, 10(11): 7-10
A parser applies grammar rules to build a syntax tree that represents the grammatical structure of a sentence. This paper discusses the use of conceptual graphs to represent the semantics of natural language and describes a method that starts from the syntax tree and generates a conceptual graph to interpret the sentence's meaning. The generation process is guided by the syntax tree: the conceptual graph associated with each input word is joined into a larger conceptual graph representing the semantics of the whole sentence.
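As a rough illustration of the joining step, the sketch below represents each word's conceptual graph as a set of (concept, relation, concept) triples and unions them into one sentence graph; the sample sentence and triples are invented for illustration, not taken from the paper.

```python
def join_graphs(word_graphs):
    """Union the per-word conceptual graphs into one graph for the sentence;
    shared concept labels act as the join points."""
    sentence_graph = set()
    for g in word_graphs:
        sentence_graph |= g
    return sentence_graph

# "A cat eats fish": each word contributes triples around shared concepts.
word_graphs = [
    {("CAT", "AGNT", "EAT")},   # the subject links to the verb concept
    {("EAT", "OBJ", "FISH")},   # the verb links to its object
]
graph = join_graphs(word_graphs)
```

In the paper the joining is driven by the traversal of the syntax tree; here the union over a word list stands in for that traversal.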
2.
Semantic analysis is the foundation of natural language understanding by computers and a breakthrough and starting point for research in the field. The main goal pursued by researchers in natural language understanding is the correct semantic analysis of sentences. The conceptual graph is a concrete semantic model supporting the idea of conceptual structures; it is a directed, connected graph with a graphical representation. This paper analyzes the necessity and current state of semantic research in Chinese information processing, describes the application of conceptual graphs in semantic research, and proposes directions for future work.
3.
4.
As the complexity of software systems keeps growing, ensuring the robustness and correctness of large, complex software systems has become a hot topic, and research on uncertainty semantics computation is key to solving this problem. This paper proposes a model of uncertainty semantics computation and applies it to design four different formal semantics for an example toy language; by proving the equivalence of the four formal semantics, the correctness and flexibility of the model are demonstrated.
5.
6.
Understanding question intent is one of the main tasks in knowledge graph question answering, and semantic parsing is the mainstream approach to it. Its main challenges are fully exploiting knowledge graph context to understand implicit entities or relations in a question, as well as complex constraints such as time, ordering, and aggregation. To address these challenges, this paper proposes Graph-to-Segment, a semantic parsing framework for knowledge graph question answering based on semantic segments. The semantic parsing model in the framework combines the precision of rule-based methods with the coverage of deep learning, parsing a question into a sequence of semantic segments and constructing a semantic query graph. The framework represents question intent as a segment-based semantic query graph, models semantic parsing as a segment-sequence generation task, uses an encoder-decoder neural network to parse the question into a segment sequence, and then assembles the segments into a semantic query graph. In addition, using context from the knowledge graph, the model learns question representations with a graph neural network, improving the parsing of implicit entities and relations. Experiments on two knowledge graph question answering datasets show that the model achieves good performance.
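To make the assembly step concrete, here is a minimal sketch (not the paper's implementation) in which "semantic segments" are typed fragments and the query graph is a set of edges; the segment format and the example entity are illustrative assumptions.

```python
segments = [
    ("entity", "Yao_Ming"),
    ("relation", "spouse"),
    ("variable", "?x"),
]

def assemble(segments):
    """Join consecutive entity/variable segments through relation segments."""
    edges, pending_node, pending_rel = set(), None, None
    for kind, value in segments:
        if kind == "relation":
            pending_rel = value
        else:                       # entity or variable: a graph node
            if pending_node is not None and pending_rel is not None:
                edges.add((pending_node, pending_rel, value))
                pending_rel = None
            pending_node = value
    return edges

graph = assemble(segments)
```

In the framework the segment sequence comes from the encoder-decoder model; here it is given directly.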
7.
The semantic network is one of the most important knowledge representation methods in expert systems. To better describe uncertain knowledge about the real world with semantic networks, this paper defines the structure of network nodes, representing each node as a six-tuple, and implements operations for initializing, adding, deleting, and modifying six-tuple nodes in a Prolog environment, improving problem-solving efficiency when knowledge is represented by semantic networks.
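The abstract does not give the six fields, so the sketch below assumes a plausible six-tuple (name, type, value, certainty, relation, target) and shows the add/modify/delete operations in Python rather than Prolog; field names and the example node are illustrative.

```python
class SemanticNet:
    FIELDS = ["name", "type", "value", "certainty", "relation", "target"]

    def __init__(self):
        self.nodes = {}                                  # name -> six-tuple

    def add(self, name, ntype, value, certainty, relation, target):
        self.nodes[name] = (name, ntype, value, certainty, relation, target)

    def modify(self, name, **changes):
        t = dict(zip(self.FIELDS, self.nodes[name]))
        t.update(changes)                                # patch selected fields
        self.nodes[name] = tuple(t[f] for f in self.FIELDS)

    def delete(self, name):
        del self.nodes[name]

net = SemanticNet()
net.add("bird", "class", "animal", 0.9, "can", "fly")    # uncertain knowledge
net.modify("bird", certainty=0.95)
```

The certainty slot is what lets the network carry the uncertainty the abstract mentions.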
8.
A question answering system is designed using latent semantic indexing (LSI); LSI theory is used to compute the similarity between a query question and the candidate questions in the database. The implementation mainly constructs a term-sentence matrix and applies singular value decomposition to remove "noise", which exposes the semantic relatedness between words more clearly and allows the system to accept questions posed in natural language. Finally, the whole system is tested and the results analyzed; the system shows clear advantages over systems based on VSM and similar methods.
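A minimal sketch of the LSI pipeline described above, assuming a toy term-question count matrix (the counts and the choice k=2 are illustrative, not the paper's data): truncate the SVD, fold the query into the latent space, and rank candidates by cosine similarity.

```python
import numpy as np

# Rows = terms, columns = candidate questions (toy counts).
A = np.array([
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 1],
    [0, 1, 1],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                              # keep the top-k singular values ("denoising")
doc_vecs = Vt[:k].T                # each row: a candidate question in latent space

def fold_in(query_counts):
    """Project a query's term-count vector into the k-dimensional latent space."""
    return query_counts @ U[:, :k] / s[:k]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

q = fold_in(np.array([1.0, 1.0, 0.0, 0.0]))     # query using terms 0 and 1
sims = [cosine(q, d) for d in doc_vecs]
best = int(np.argmax(sims))                      # most similar candidate question
```

Here the query matches candidate 0 exactly, so the ranking puts it first; with real data the SVD truncation is what lets near-synonymous questions score high despite sharing few literal terms.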
9.
10.
For machines to become fully intelligent, natural language understanding is fundamental, and semantics is its most prominent problem. The choice of semantic analysis method matters greatly for handling different kinds of corpora. This paper introduces semantic analysis theory, focusing on semantic networks, case grammar, conceptual dependency theory, and ontologies; proposes principles to follow when choosing a semantic knowledge representation; and finally analyzes the importance and development trends of semantics-based natural language understanding.
11.
An opinion mining method for trending microblog topics is proposed. First, opinion sentences within a trending topic are identified jointly by syntactic dependency templates and a support vector machine (SVM); then opinion word pairs are extracted using lexical and syntactic dependency relations, presenting the opinions in the topic concisely and clearly. Finally, experiments demonstrate the effectiveness of the method.
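A crude sketch of opinion-pair extraction: a single adjacency pattern over POS-tagged tokens stands in for the paper's dependency templates, and the tagged sentence is invented for illustration.

```python
def extract_pairs(tagged):
    """For each adjective, pair it with the nearest noun to its left."""
    pairs = []
    for i, (word, tag) in enumerate(tagged):
        if tag == "ADJ":
            for j in range(i - 1, -1, -1):
                if tagged[j][1] == "NOUN":
                    pairs.append((tagged[j][0], word))   # (target, opinion word)
                    break
    return pairs

tagged = [("battery", "NOUN"), ("life", "NOUN"), ("is", "VERB"), ("great", "ADJ")]
pairs = extract_pairs(tagged)
```

Real dependency templates would link "battery life" as a compound target; the nearest-noun heuristic is only a stand-in.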
12.
This paper analyzes the problem of selecting summary sentences and proposes an optimal sentence selection method: instead of the traditional approach of adding sentences one by one to build a summary, it deletes sentences one by one within a candidate set. Selection proceeds in two stages. The first stage obtains a candidate summary sentence set, using either a direct acquisition algorithm or an acquisition algorithm based on redundancy processing. The second stage deletes sentences step by step, measuring each sentence's contribution to the candidate set with different feature terms; this yields the proposed optimal selection algorithm. Using DUC2004 as the test corpus, the ROUGE scores of the generated summaries confirm that sentence selection is necessary in summary generation, and comparison with redundancy-based sentence selection confirms the effectiveness of the proposed algorithm.
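The second (deletion) stage can be sketched as follows, assuming a simple word-overlap "contribution" score in place of the paper's feature terms; the candidate sentences are invented.

```python
def contribution(sentence, others):
    """How many of this sentence's words are not covered by the rest of the set."""
    covered = set()
    for s in others:
        covered |= set(s.split())
    return len(set(sentence.split()) - covered)

def select_by_deletion(candidates, target_count):
    selected = list(candidates)
    while len(selected) > target_count:
        # Delete the sentence whose removal loses the least unique content.
        worst = min(selected,
                    key=lambda s: contribution(s, [t for t in selected if t is not s]))
        selected.remove(worst)
    return selected

candidates = [
    "the cat sat on the mat",
    "the cat sat on a mat",          # nearly redundant with the first
    "dogs chase cats in the park",
]
summary = select_by_deletion(candidates, 2)
```

The first sentence contributes no unique words once the other two are present, so it is the one deleted.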
13.
In this paper, we propose a method for estimating emotion in Wakamono Kotoba that were not registered in the system, by using Wakamono Kotoba example sentences as features. The proposed method applies Earth Mover's Distance (EMD) to vectors of words. In an evaluation experiment using 14,440 sentences, higher estimation accuracy is obtained by considering the emotional distance between words (an approach that had not been used in conventional research) than by using only word importance values.
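For intuition, a minimal 1-D Earth Mover's Distance between two normalized histograms can be computed as the L1 distance between their cumulative distributions; the "emotion bins" below are an illustrative assumption, not the paper's feature vectors.

```python
import numpy as np

def emd_1d(p, q):
    """EMD between two 1-D histograms (bins assumed unit distance apart)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()          # normalize to probability mass
    return float(np.abs(np.cumsum(p - q)).sum())

# Two "emotion distributions" over, say, (joy, anger, sadness):
d = emd_1d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])   # all mass moves 2 bins
```

Unlike a plain bin-by-bin comparison, EMD accounts for how far mass must move, which is the "emotional distance" intuition the abstract appeals to.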
14.
Speaker-dependent medium-vocabulary continuous speech recognition based on HMM/VQ
This paper discusses a continuous speech recognition method based on hidden Markov models (HMM) and vector quantization (VQ). With this method, an HMM is built for each word, and the best state-transition path is searched in a state-transition network composed of multiple models, so that continuous speech is recognized without prior word segmentation. Using finite-state grammar constraints and other measures to improve recognition performance, the demonstration system can recognize 18 English sentence patterns and 150 words for a specific speaker. Tested on 312 utterances (2710 words in total), the recognition delay was 62% of the utterance duration, the average speaking rate was 2.32 words per second, and the word recognition accuracy was 97.3%.
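The best-path search over the model network is a Viterbi search; the sketch below runs it on a toy two-state HMM (the probabilities are invented, standing in for the word models and grammar network of the paper).

```python
import numpy as np

def viterbi(log_a, log_b, log_pi, obs):
    """log_a[i, j]: transition i->j; log_b[i, o]: emission; log_pi: initial."""
    delta = log_pi + log_b[:, obs[0]]
    back = []
    for o in obs[1:]:
        scores = delta[:, None] + log_a          # scores[i, j]: best path into j via i
        back.append(np.argmax(scores, axis=0))
        delta = np.max(scores, axis=0) + log_b[:, o]
    path = [int(np.argmax(delta))]
    for bp in reversed(back):                    # trace the best path backwards
        path.append(int(bp[path[-1]]))
    return list(reversed(path))

log = np.log
a = log(np.array([[0.7, 0.3], [0.2, 0.8]]))
b = log(np.array([[0.9, 0.1], [0.1, 0.9]]))     # state 0 favors symbol 0, state 1 symbol 1
pi = log(np.array([0.5, 0.5]))
path = viterbi(a, b, pi, [0, 0, 1, 1])
```

In the recognizer, the same search runs over the concatenated word models, so the best path simultaneously segments and labels the utterance.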
15.
Brown M.K., McGee M.A., Rabiner L.R., Wilpon J.G. 《IEEE Transactions on Signal Processing》, 1991, 39(6): 1268-1281
Two methods for generating training sets for a speech recognition system are studied. The first uses a nondeterministic statistical method to generate a uniform distribution of sentences from a finite state machine (FSM) represented in digraph form. The second method, a deterministic heuristic approach, takes into consideration the importance of word ordering to address the problem of coarticulation effects. The two methods are critically compared. The first algorithm, referred to as MARKOV, converts the FSM into a first-order Markov model. The digraphs are determined, transitive closure is computed, transition probabilities are assigned, and stopping criteria are established. An efficient algorithm for computing these parameters is described. Statistical tests are conducted to verify performance and demonstrate its utility. A second algorithm for generating training sentences, referred to as BIGRAM, uses heuristics to satisfy three requirements: adequate coverage of basic speech (subword) units; adequate coverage of words in the recognition vocabulary (intraword contextual units); and adequate coverage of word-pair bigrams (interword contextual units).
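The MARKOV idea can be sketched as follows: treat the FSM (a digraph with word-labeled edges) as a first-order Markov chain with uniform transition probabilities and sample sentences from it. The tiny grammar is an illustrative assumption, and the sketch omits the paper's transitive-closure computation and stopping-criterion analysis.

```python
import random

fsm = {                                   # state -> list of (word, next_state)
    "S": [("call", "V"), ("dial", "V")],
    "V": [("home", "END"), ("the", "N")],
    "N": [("office", "END")],
}

def sample_sentence(fsm, start="S", end="END", rng=random):
    words, state = [], start
    while state != end:
        word, state = rng.choice(fsm[state])   # uniform over outgoing edges
        words.append(word)
    return " ".join(words)

random.seed(0)
sentences = {sample_sentence(fsm) for _ in range(100)}
```

Repeated sampling covers all four sentences this grammar generates; the paper's contribution is choosing the transition probabilities and stopping criteria so that the generated training set covers the language uniformly.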
16.
Compared with the traditional method of adding sentences to build a summary in multi-document summarization, a two-stage sentence selection approach is proposed that generates the summary by deleting sentences from a candidate sentence set. It has two stages: acquisition of a candidate sentence set, and optimal selection of sentences. In the first stage, the candidate sentence set is obtained by redundancy-based sentence selection. In the second stage, sentences are deleted from the candidate set according to their contribution to the whole set, until the appointed summary length is reached. On a test corpus, the ROUGE scores of summaries produced by the proposed approach demonstrate its validity compared with traditional sentence selection. The influence of the token chosen in the two-stage approach on the quality of the generated summaries is also analyzed.
17.
In recent years, with the development of the social Internet of Things (IoT), all kinds of data have accumulated on the network. These data contain a lot of social information and opinions, yet they are rarely fully analyzed, which is a major obstacle to the intelligent development of the social IoT. In this paper, we propose a sentence similarity analysis model to analyze the similarity of people's opinions on hot topics in social media and news pages. Most of these data are unstructured or semi-structured sentences, so the accuracy of sentence similarity analysis largely determines the model's performance. To improve accuracy, we propose a novel method of sentence similarity computation that extracts the syntactic and semantic information of semi-structured and unstructured sentences. We mainly consider the subjects, predicates and objects of sentence pairs and use Stanford Parser to classify the dependency relation triples to calculate the syntactic and semantic similarity between two sentences. Finally, we verify the performance of the model on the Microsoft Research Paraphrase Corpus (MRPC), which consists of 4076 pairs of training sentences and 1725 pairs of test sentences, most of them drawn from news and social data. Extensive simulations demonstrate that our method outperforms other state-of-the-art methods regarding the correlation coefficient and the mean deviation.
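A minimal sketch of the subject/predicate/object comparison: the triples are given directly (standing in for Stanford Parser output) and similarity is a weighted word-overlap per slot; the weights and sentences are illustrative assumptions, not the paper's formula.

```python
def slot_sim(a, b):
    """Jaccard overlap of the word sets filling one slot."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def spo_similarity(t1, t2, weights=(0.4, 0.3, 0.3)):
    """Weighted similarity over (subject, predicate, object) triples."""
    return sum(w * slot_sim(x, y) for w, x, y in zip(weights, t1, t2))

t1 = ("the president", "announced", "a new policy")
t2 = ("the president", "revealed", "a policy")
score = spo_similarity(t1, t2)
```

Comparing slot by slot keeps a shared subject from being diluted by differences elsewhere in the sentence, which is the motivation for working with dependency triples rather than bags of words.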
18.
《IEEE Transactions on Information Theory》, 1975, 21(4): 423-430
A model of a linguistic information source is proposed as a grammar that generates a language over some finite alphabet. It is pointed out that grammatical sentences generated by the source grammar contain intrinsic "redundancy" that can be exploited for error correction. Symbols occurring in the sentences are composed according to syntactic rules determined by the source grammar, and hence are different in nature from the lexicographical source symbols assumed in information theory and algebraic coding theory. Almost all programming languages and some simple natural languages can be described by the linguistic source model proposed in this paper. In order to combat excessive errors on very noisy channels, a conventional encoding-decoding scheme that does not utilize the source structure is introduced into the communication system. Decoded strings coming out of the lexicographical decoder may not be grammatical, which indicates that some uncorrected errors still remain in the individual sentences; these are reprocessed by a syntactic decoder that converts ungrammatical strings into legal sentences of the source language by the maximum-likelihood criterion. Thus more errors in the strings coming out of the noisy channel can be corrected by the syntactic decoder using syntactic analysis than the lexicographical decoder is capable of correcting or even of detecting. To design the syntactic decoder we use parsing techniques from the study of compilers and formal languages.
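For a toy finite language, the syntactic decoder's maximum-likelihood idea reduces to mapping a possibly ungrammatical decoded string to the nearest legal sentence by edit distance (a stand-in for likelihood under a symmetric channel); the "language" below is an illustrative assumption, not the paper's grammar-based parser.

```python
def edit_distance(a, b):
    """Levenshtein distance with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # delete from a
                           cur[-1] + 1,            # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]

LANGUAGE = ["begin end", "begin x := 1 end", "begin y := 2 end"]

def syntactic_decode(decoded):
    """Pick the legal sentence nearest to the channel-decoded string."""
    return min(LANGUAGE, key=lambda s: edit_distance(decoded, s))

fixed = syntactic_decode("begim x := 1 emd")      # two channel errors survive
```

For an infinite language generated by a grammar, the enumeration is replaced by error-correcting parsing, which is what the paper develops.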
19.
Laurence Danlos. 《电信纪事》, 1989, 44(1-2): 94-100
Automatic generation is a recent domain dedicated to the production of written texts from abstract representations or from numerical data. One way to produce sentences while avoiding the difficulties of automatic generation consists in using pre-recorded texts. However, this method becomes cumbersome when there is a large number of pre-recorded texts. It can be improved by allowing them to include variables. Yet, it will be shown that the use of variables requires the pre-recorded sentences to be annotated with linguistic information so that syntactic operations can be applied to them. In fact, the use of variables entails adopting techniques that fall in the domain of automatic generation. Such techniques will be presented briefly.
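A small example of why variables pull in syntactic machinery: filling a variable can force a syntactic operation on the surrounding template, here English indefinite-article agreement. The template notation `a(n) {var}` is an illustrative assumption.

```python
import re

def fill(template, **vars):
    """Substitute variables and adjust the indefinite article to the filler."""
    def sub(m):
        value = vars[m.group(1)]
        article = "an" if value[0].lower() in "aeiou" else "a"
        return f"{article} {value}"
    return re.sub(r"\ba\(n\) \{(\w+)\}", sub, template)

text = fill("The system produced a(n) {item}.", item="error report")
```

Even this one-word adjustment requires the template to be annotated (the `a(n)` marker) and a rule to apply, which is the paper's point that variables reintroduce generation techniques.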
20.
Chinese question classification based on KNN
Chinese question classification is an important component of question answering systems, and the quality of the classification directly affects the quality of the whole system. Semantic similarity between questions is computed using the HowNet sememe tree and used as the distance metric between sentences; a classifier built with the KNN algorithm then classifies the questions. The nearest-neighbor algorithm, the KNN algorithm, and an improved KNN algorithm are compared experimentally. Results show that the weighted KNN classifier performs best, reaching a precision of 89.8%.
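A sketch of similarity-weighted KNN with a pluggable similarity function; the toy word-overlap similarity and the English example questions stand in for the HowNet sememe-tree similarity and the Chinese data used in the paper.

```python
from collections import defaultdict

def similarity(q1, q2):
    """Toy stand-in for the HowNet-based semantic similarity."""
    a, b = set(q1.split()), set(q2.split())
    return len(a & b) / len(a | b)

def knn_classify(query, labeled, k=3):
    neighbors = sorted(labeled, key=lambda ex: similarity(query, ex[0]),
                       reverse=True)[:k]
    votes = defaultdict(float)
    for sent, lab in neighbors:
        votes[lab] += similarity(query, sent)    # similarity-weighted vote
    return max(votes, key=votes.get)

labeled = [
    ("who is the author of this book", "PERSON"),
    ("who wrote the novel", "PERSON"),
    ("where is the library", "LOCATION"),
    ("where can I find the museum", "LOCATION"),
]
label = knn_classify("who is the writer of the novel", labeled, k=3)
```

Weighting the votes by similarity is what distinguishes the best-performing classifier in the abstract from plain majority-vote KNN.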