Similar literature
 20 similar documents found
1.
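A Frame-Based Method for Automatic Extraction of Word Collocations   Total citations: 4, self-citations: 1, citations by others: 4
Qu Weiguang (曲维光), Chen Xiaohe (陈小荷), Ji Genlin (吉根林). 《计算机工程》 (Computer Engineering), 2004, 30(23): 22-24, 195
Proposes a frame-based method for extracting word collocations that obtains both the collocations and their structural information. The relative word order ratio (RRWR) is used to filter candidate collocates, linguistic rules of collocation are applied to restrict the parts of speech of the candidates, and statistical models such as mutual information are used to extract collocations automatically from a large-scale corpus. The average accuracy of the extracted collocations is 84.73%, which is 4.7% higher than the Xtract system and 50.79% higher than comparable work in China. Structural information for each collocation is obtained together with the collocation itself.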
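The mutual-information step mentioned in this abstract can be illustrated with a small sketch. This is not the authors' system: it only scores adjacent or near-adjacent word pairs by pointwise mutual information (PMI), and the function name `pmi_scores`, the `window` and `min_count` parameters, and the toy sentences are all assumptions made for illustration. The RRWR filter and the part-of-speech constraints are not covered.

```python
import math
from collections import Counter


def pmi_scores(sentences, window=2, min_count=2):
    """Score ordered word pairs by pointwise mutual information (PMI).

    `sentences` is a list of token lists. A pair (left, right) is counted
    whenever `right` occurs within `window` tokens to the right of `left`.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            # Pair each word with the next `window` words to its right.
            for right in tokens[i + 1 : i + 1 + window]:
                pair_counts[(left, right)] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (left, right), count in pair_counts.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_pair = count / total_pairs
        p_left = word_counts[left] / total_words
        p_right = word_counts[right] / total_words
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return scores


# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    ["drink", "strong", "tea"],
    ["drink", "strong", "tea"],
    ["drink", "hot", "water"],
]
toy_scores = pmi_scores(toy_corpus, window=1, min_count=2)
```

On the toy corpus, the pair ("strong", "tea") scores higher than ("drink", "strong"), because "drink" also occurs with other words. A real extractor would need a much larger corpus for the probability estimates to be meaningful.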

2.
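This paper proposes a method for acquiring phrase paraphrase instances from a bilingual corpus, which is particularly good at extracting paraphrase instances of ambiguous phrases. The method takes a bilingual phrase pair as input to constrain the meaning of the phrase and, using a word-aligned bilingual corpus, builds a bidirectional extraction model to extract paraphrase instances of the pair. The bidirectional model decides whether each candidate becomes a final paraphrase instance by comparing the semantic consistency between the candidate paraphrase and the input phrase. Experimental results show that the method reaches an overall accuracy of 60%, a good level of performance.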

3.
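Burmese is a low-resource language, and acquiring large-scale Chinese-Burmese bilingual vocabulary from the web can to some extent alleviate the shortage of sentence-aligned corpora in Chinese-Burmese machine translation. To this end, this paper proposes a Chinese-Burmese bilingual lexicon extraction method that fuses topic and context features. First, an LDA topic model is used to obtain the topic distributions of Chinese and Burmese documents; the cross-lingual topic vectors are mapped into a shared semantic space through bilingual word embeddings, and words with high similarity under the same topic are extracted as candidate Chinese-Burmese word pairs. Next, BERT is used to obtain lexical semantic representations of the candidates' contexts, from which context vectors are built. Finally, the candidates are weighted by the similarity of their context vectors to obtain higher-quality Chinese-Burmese translation pairs. Experimental results show that, compared with the bilingual-dictionary-based method and the bilingual LDA+CBW method, the proposed method improves accuracy by 11.07% and 3.82%, respectively.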

4.
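Bilingual parallel sentence pairs are an important resource for machine translation, but because acquisition channels are limited, sentence-level parallel corpora are not only small but also often concentrated in specific domains, which makes it hard to meet the needs of real applications. This paper presents a Web-based system for automatically acquiring bilingual parallel sentence pairs. The system combines the strengths of existing systems and improves their key techniques. A method is proposed for automatically discovering URL naming patterns in bilingual websites, and the technique for extracting parallel sentence pairs is improved. Experimental results show that the proposed method greatly increases the recall of candidate bilingual website discovery; the acquired parallel sentence pairs reach a recall of 93% and a precision of 96%, which demonstrates the effectiveness of the method. The paper also studies the extraction of parallel sentence pairs embedded within bilingual side-by-side web pages and reports preliminary results.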

5.
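Web-Based Automatic Acquisition of Bilingual Parallel Sentence Pairs   Total citations: 3, self-citations: 1, citations by others: 2
Bilingual parallel sentence pairs are an important resource for machine translation, but because acquisition channels are limited, sentence-level parallel corpora are not only small but also often concentrated in specific domains, which makes it hard to meet the needs of real applications. This paper presents a Web-based system for automatically acquiring bilingual parallel sentence pairs. The system combines the strengths of existing systems and improves their key techniques. A method is proposed for automatically discovering URL naming patterns in bilingual websites, and the technique for extracting parallel sentence pairs is improved. Experimental results show that the proposed method greatly increases the recall of candidate bilingual website discovery; the acquired parallel sentence pairs reach a recall of 93% and a precision of 96%, which demonstrates the effectiveness of the method. The paper also studies the extraction of parallel sentence pairs embedded within bilingual side-by-side web pages and reports preliminary results.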

6.
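Proposes a method for recognizing Chinese verb-verb collocations based on a maximum entropy model and voting. The method combines the contextual part-of-speech information of the target verb and the candidate collocate with statistical information on their degree of association to form five composite feature templates, trains a collocation recognizer for each template with the maximum entropy method, and finally builds a combined recognizer using a voting scheme in which the best recognizer dominates. Experimental results show that recognizers using both contextual part-of-speech information and statistical information outperform those using only one of the two, and that the combined recognizer in which the best recognizer dominates performs better still.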

7.
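A Feature Selection Method for Text Categorization Based on Category Feature Domains   Total citations: 11, self-citations: 2, citations by others: 11
Feature selection is one of the key problems in text categorization, and noise and data sparseness are the main obstacles encountered in the process. This paper introduces a feature selection method based on category feature domains. The method first uses "combined feature extraction" [1] to remove noise from the original feature space and extract candidate features. Here, "combined feature extraction" means first removing some low-frequency words by document frequency (DF) and then selecting candidate features by mutual information. Next, the method builds a category feature domain for each category in the classification scheme and merges and reinforces the candidate features that appear in it, thereby addressing data sparseness. Experiments show that the new method clearly improves on various traditional methods in feature selection and significantly improves the performance of text categorization systems.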

8.
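Bilingual parallel corpora have many important applications in natural language processing, but acquiring large-scale bilingual parallel corpora automatically is not easy. This paper proposes an effective scheme for obtaining high-quality bilingual parallel corpora from the Web and studies key techniques such as acquiring candidate bilingual mixed web pages and extracting parallel sentence pairs. The method yielded 2.58 million bilingual parallel sentence pairs with an average accuracy of 93.75%; the first 1.5 million pairs reach an average accuracy of 96%. The paper also proposes two methods, sentence pair quality ranking and domain information retrieval, for applying Web data to the training of statistical machine translation models; on IWSLT evaluation data, BLEU improves by 2 to 5 percentage points.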

9.
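Automatic Extraction of a Bilingual Term Dictionary from a Parallel Corpus   Total citations: 7, self-citations: 5, citations by others: 2
This paper proposes an algorithm for automatically extracting a term dictionary from an English-Chinese parallel corpus. First, an improved statistical method based on character length is used to align the parallel corpus at the sentence level; the English text is part-of-speech tagged, and the Chinese text is word-segmented and part-of-speech tagged. Nouns and noun phrases in the aligned and tagged bilingual corpus are counted to produce the candidate term set. Then, for each English candidate term, translation probabilities are computed for its associated Chinese translations. Finally, Chinese translations are selected using a threshold that varies with word frequency. Good results were obtained in term extraction experiments on real corpora.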

10.
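A Chinese Term Extraction System Based on Mutual Information   Total citations: 5, self-citations: 0, citations by others: 5
Introduces a system for automatic extraction of Chinese terms. The system first computes the internal association strength of character strings based on mutual information to obtain a set of term candidates. It then removes basic words from the candidate set and filters further using prefix and suffix information from ordinary word collocations. Finally, it performs lexical analysis on the candidates and applies part-of-speech composition rules for terms to produce the final extraction result. Experimental results show a precision of 72.19%, a recall of 77.98%, and an F-measure of 74.97%.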
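The three figures reported in this abstract can be checked against each other, assuming the F-measure is the usual balanced harmonic mean of precision and recall (the abstract does not state the weighting, so `beta=1.0` below is an assumption, as is the helper name `f_measure`).

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract, in percent.
reported_f = f_measure(72.19, 77.98)
print(round(reported_f, 2))  # 74.97
```

The result agrees with the reported F-measure of 74.97%, which is consistent with the balanced (F1) definition.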

11.
This paper continues the theme of the recent work Chen et al. (2008) [18], in which fast collocation methods are introduced for solving ill-posed Fredholm integral equations of the first kind. We develop in this paper multilevel augmentation algorithms, which lead to fast solutions of the discrete equations resulting from fast collocation methods. Regularization parameter choice strategies are given for the proposed methods. The theoretical analysis and numerical experiments illustrate the accuracy and efficiency of the algorithm.

12.
The original Legendre–Gauss collocation method is derived for impulsive differential equations, and its convergence is analysed. Then a new hp-Legendre–Gauss collocation method is presented for impulsive differential equations, and the convergence of the hp-version is also studied. The results obtained in this paper show that the convergence condition for the original Legendre–Gauss collocation method depends on the impulsive differential equation and cannot be improved. The convergence condition for the hp-Legendre–Gauss collocation method, however, depends on both the impulsive differential equation and the mesh size, and we can always choose a sufficiently small mesh size to satisfy it, which shows that the hp-Legendre–Gauss collocation method is superior to the original version. Our theoretical results are confirmed in two test problems.

13.
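Introduces a method based on confusion words and preposition collocations on top of the Winnow algorithm. First, a training set is obtained from the confusion sets; after preprocessing, text feature extraction yields a feature word set. Winnow training is then run on the feature word set to obtain weighted features, and the prepositions that appear after confusion words are extracted to form preposition vectors. Finally, features are extracted from the test set and tested with the combination of the Winnow algorithm and the confusion-word and preposition collocation method to obtain real-word error detection results. Adding the collocation method raises the precision, recall, and F1 measure of some confusion words by 10% to 20%, and in some cases to 100%.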
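For readers unfamiliar with Winnow, the following is a minimal sketch of the standard multiplicative update with binary features (promotion on a missed positive, demotion on a false alarm). It is not the system described in the abstract: the preposition-collocation component is not modelled, and the names `train_winnow` and `predict`, the threshold of half the feature count, the factor `alpha=2.0`, and the toy examples are assumptions for illustration.

```python
def predict(weights, threshold, active):
    """Predict True when the summed weights of the active features reach the threshold."""
    return sum(weights[i] for i in active) >= threshold


def train_winnow(examples, n_features, alpha=2.0, epochs=10):
    """Train a Winnow classifier on (active_feature_indices, label) pairs."""
    weights = [1.0] * n_features
    threshold = n_features / 2  # assumed choice; other thresholds are also used
    for _ in range(epochs):
        mistakes = 0
        for active, label in examples:
            if predict(weights, threshold, active) == label:
                continue
            mistakes += 1
            # Promote active features on a missed positive, demote on a false alarm.
            factor = alpha if label else 1.0 / alpha
            for i in active:
                weights[i] *= factor
        if mistakes == 0:
            break  # the training set is classified correctly
    return weights, threshold


# Hypothetical toy data: the label is True exactly when feature 0 is active.
toy_examples = [
    ({0}, True),
    ({1, 2}, False),
    ({0, 3}, True),
    ({3}, False),
]
toy_weights, toy_threshold = train_winnow(toy_examples, n_features=4)
```

On this toy set the weights settle after one pass of updates: feature 0 is promoted and features 1 and 2 are demoted. Because updates are multiplicative, Winnow tends to cope well with many irrelevant features, which is one reason it is used for context-sensitive spelling correction.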

14.
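A Chebyshev-Gauss Pseudospectral Method for Solving Optimal Control Problems   Total citations: 1, self-citations: 0, citations by others: 1
Tang Xiaojun (唐小军), Wei Jianli (尉建利), Chen Kai (陈凯). 《自动化学报》 (Acta Automatica Sinica), 2015, 41(10): 1778-1787
Proposes a Chebyshev-Gauss pseudospectral method for solving optimal control problems, with the Chebyshev-Gauss points as collocation points. By comparing the Karush-Kuhn-Tucker conditions of the nonlinear programming problem with the pseudospectrally discretized optimality conditions, estimation formulas for the costate and the Lagrange multipliers are derived. The state approximation uses the barycentric Lagrange interpolation formula, and a simple and effective way of computing the state pseudospectral differentiation matrix is proposed. The distinctive advantages of the method are good numerical stability and computational efficiency. Simulation results show that it can solve complex constrained optimal control problems with high accuracy.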
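Two ingredients named in this abstract, the Chebyshev-Gauss points and barycentric Lagrange interpolation, can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: it interpolates on the Chebyshev-Gauss points only (the paper's state approximation may include additional non-collocated points), it uses the standard closed-form barycentric weights for these points, and all function names are hypothetical.

```python
import math


def chebyshev_gauss_nodes(n):
    """Return the n Chebyshev-Gauss points (roots of T_n), all inside (-1, 1)."""
    return [math.cos((2 * k + 1) * math.pi / (2 * n)) for k in range(n)]


def barycentric_weights(n):
    """Barycentric weights for the Chebyshev-Gauss points, up to a common factor."""
    return [(-1) ** k * math.sin((2 * k + 1) * math.pi / (2 * n)) for k in range(n)]


def barycentric_interpolate(nodes, weights, values, x):
    """Evaluate the interpolating polynomial at x with the barycentric formula."""
    numerator = 0.0
    denominator = 0.0
    for node, weight, value in zip(nodes, weights, values):
        if x == node:
            return value  # avoid division by zero exactly at a node
        term = weight / (x - node)
        numerator += term * value
        denominator += term
    return numerator / denominator


def cubic(x):
    """Hypothetical test function: a cubic, reproduced exactly by 5 nodes."""
    return x**3 - 2 * x + 1


nodes = chebyshev_gauss_nodes(5)
weights = barycentric_weights(5)
values = [cubic(node) for node in nodes]
approx = barycentric_interpolate(nodes, weights, values, 0.3)
```

With five nodes the interpolant has degree at most four, so the cubic is recovered up to rounding error. The barycentric form needs only O(n) work per evaluation and is numerically stable for Chebyshev points, which fits the stability and efficiency claims in the abstract.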

15.
A pseudospectral method is presented for direct trajectory optimization and costate estimation of infinite-horizon optimal control problems using global collocation at flipped Legendre-Gauss-Radau points which include the end point +1. A distinctive feature of the method is that it uses a new smooth, strictly monotonically decreasing transformation to map the scaled left half-open interval τ∈(-1, +1] to the descending time interval t ∈ (+∞, 0]. As a result, the singularity of collocation at point +1 associated with the commonly used transformation, which maps the scaled right half-open interval τ∈[-1, +1) to the increasing time interval [0,+∞), is avoided. The costate and constraint multiplier estimates for the proposed method are rigorously derived by comparing the discretized necessary optimality conditions of a finite-horizon optimal control problem with the Karush-Kuhn-Tucker conditions of the resulting nonlinear programming problem from collocation. Another key feature of the proposed method is that it provides highly accurate approximation to the state and costate on the entire horizon, including approximation at t = +∞, with good numerical stability. Numerical results show that the method presented in this paper leads to the ability to determine highly accurate solutions to infinite-horizon optimal control problems.

16.
A unified framework is presented for the numerical solution of optimal control problems using collocation at Legendre-Gauss (LG), Legendre-Gauss-Radau (LGR), and Legendre-Gauss-Lobatto (LGL) points. It is shown that the LG and LGR differentiation matrices are rectangular and full rank whereas the LGL differentiation matrix is square and singular. Consequently, the LG and LGR schemes can be expressed equivalently in either differential or integral form, while the LGL differential and integral forms are not equivalent. Transformations are developed that relate the Lagrange multipliers of the discrete nonlinear programming problem to the costates of the continuous optimal control problem. The LG and LGR discrete costate systems are full rank while the LGL discrete costate system is rank-deficient. The LGL costate approximation is found to have an error that oscillates about the true solution and this error is shown by example to be due to the null space in the LGL discrete costate system. An example is considered to assess the accuracy and features of each collocation scheme.

17.
An important aspect of numerically approximating the solution of an infinite-horizon optimal control problem is the manner in which the horizon is treated. Generally, an infinite-horizon optimal control problem is approximated with a finite-horizon problem. In such cases, regardless of the finite duration of the approximation, the final time lies an infinite duration from the actual horizon at t=+∞. In this paper we describe two new direct pseudospectral methods using Legendre–Gauss (LG) and Legendre–Gauss–Radau (LGR) collocation for solving infinite-horizon optimal control problems numerically. A smooth, strictly monotonic transformation is used to map the infinite time domain t∈[0,∞) onto a half-open interval τ∈[−1,1). The resulting problem on the finite interval is transcribed to a nonlinear programming problem using collocation. The proposed methods yield approximations to the state and the costate on the entire horizon, including approximations at t=+∞. These pseudospectral methods can be written equivalently in either a differential or an implicit integral form. In numerical experiments, the discrete solution exhibits exponential convergence as a function of the number of collocation points. It is shown that the map φ:[−1,+1)→[0,+∞) can be tuned to improve the quality of the discrete approximation.
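The abstract does not say which transformation is used. As an illustration only, the sketch below takes one commonly used smooth, strictly increasing map from the half-open interval [−1, 1) onto the infinite time domain, t = (1 + τ)/(1 − τ), together with its inverse and its derivative dt/dτ, which is the factor that rescales the dynamics on the finite interval. The choice of map and the function names are assumptions, not details taken from the paper.

```python
def tau_to_time(tau):
    """Map tau in [-1, 1) to t in [0, +inf) via t = (1 + tau) / (1 - tau)."""
    if not -1.0 <= tau < 1.0:
        raise ValueError("tau must lie in [-1, 1)")
    return (1.0 + tau) / (1.0 - tau)


def time_to_tau(t):
    """Inverse map: t in [0, +inf) to tau in [-1, 1)."""
    if t < 0.0:
        raise ValueError("t must be non-negative")
    return (t - 1.0) / (t + 1.0)


def time_derivative(tau):
    """dt/dtau = 2 / (1 - tau)^2; it grows without bound as tau approaches 1."""
    if not -1.0 <= tau < 1.0:
        raise ValueError("tau must lie in [-1, 1)")
    return 2.0 / (1.0 - tau) ** 2
```

Under this map the dynamics dx/dt = f(x, u) become dx/dτ = (dt/dτ) f(x, u) on [−1, 1). The derivative is unbounded as τ approaches 1, which is why the end point τ = 1 is excluded from the interval.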

18.
The way boundary conditions are imposed when applying Chebyshev collocation methods to Poisson and biharmonic-type problems in rectangular domains is investigated. It is shown that careful selection of the number of collocation points leads to a linear system of n linearly independent equations in n unknowns.

19.
《国际计算机数学杂志》 (International Journal of Computer Mathematics), 2012, 89(7): 1079-1087
A numerical solution of a fifth-order non-linear dispersive wave equation is set up using collocation of seventh-order B-spline interpolation functions over finite elements. A linear stability analysis shows that this numerical scheme, based on a Crank–Nicolson approximation in time, is unconditionally stable. The method is used to model the behaviour of solitary waves.

20.
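To address complex feature extraction in sentiment analysis of online reviews, a rough-set-based sentiment analysis model for online reviews is proposed. Traditional bag-of-words features are analysed, the role of fixed collocation features in judging sentiment polarity is pointed out, and rough set methods are used to mine fixed collocation features from online reviews, which are then integrated into sentiment analysis models such as SVM and Naive Bayes. Results on real hotel reviews show that adding the rough rules improves the sentiment classification accuracy of both the SVM and the Naive Bayes models.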
