Similar literature
 19 similar documents found (search time: 234 ms)
1.
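To address the phrase sparsity problem caused by the exact-match strategy in phrase-based statistical machine translation (SMT) models, a phrase-similarity-based SMT model is proposed. The model introduces example-based translation into SMT. During translation, for phrases that never appeared in the training corpus, it computes the similarity between source-language phrases, uses a fuzzy-matching strategy to retrieve similar example phrases from the phrase table, and constructs a translation from those example phrases. Compared with exact matching, similarity-based fuzzy matching makes fuller use of the phrase table and alleviates phrase sparsity. Experiments show that the model clearly improves SMT quality and outperforms Moses, the current best phrase-based system.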
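
The fuzzy lookup described above can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the abstract does not say how phrase similarity is computed, so the word-level `difflib` ratio, the `threshold` value, and the names `phrase_similarity` and `lookup` are all assumptions, and the sketch only retrieves the example phrase without constructing a new translation from it.

```python
from difflib import SequenceMatcher


def phrase_similarity(a, b):
    """Word-level similarity in [0, 1] (assumed stand-in for the paper's measure)."""
    return SequenceMatcher(None, a.split(), b.split(), autojunk=False).ratio()


def lookup(phrase, phrase_table, threshold=0.6):
    """Try an exact match first; otherwise fall back to the most similar source phrase.

    Returns (matched_source_phrase, translation, similarity), or None when
    nothing in the table reaches the (hypothetical) similarity threshold.
    """
    if phrase in phrase_table:
        return phrase, phrase_table[phrase], 1.0
    best, best_score = None, 0.0
    for source in phrase_table:
        score = phrase_similarity(phrase, source)
        if score > best_score:
            best, best_score = source, score
    if best is None or best_score < threshold:
        return None
    return best, phrase_table[best], best_score
```

With a toy table containing "drink a cup of coffee", a query for the unseen phrase "drink a cup of tea" falls back to that entry, whereas exact matching would find nothing.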

2.
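The extraction of phrase translation pairs from Chinese-English bilingual sentence pairs is studied, and an active acquisition method for phrase translation pairs is proposed that evaluates translations using bilingual evaluation features. The method reduces the amount of manually annotated data by selecting representative phrase translation pairs. Based on the literal-translation rate of the phrase translation, the phrase translation probability, and the phrase length difference, annotated phrase translation pairs are used to train a support vector machine (SVM), and the optimized SVM is used to classify the test data ...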

3.
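To ease the difficulty of acquiring disambiguation knowledge and the data sparsity problem in translation disambiguation, an unsupervised translation disambiguation method is proposed that mines bilingual word associations from the Web. The method extends the indirect association of bilingual words in a corpus to the Web and proposes a Web-based model of indirect bilingual word association. On this basis, a disambiguation method based on Web-derived bilingual word relatedness is proposed: it constructs different queries, extracts the page counts returned by a search engine, and finally uses pointwise mutual information to compute the relatedness between words for the disambiguation decision. The method's best performance (P_mar = 0.464) exceeds that of TorMd, the best comparable unsupervised system on Task #5 of the international semantic evaluation SemEval-2007.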
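
As a rough illustration of the last step, pointwise mutual information can be estimated from page counts as PMI(x, y) = log2(N · c(x, y) / (c(x) · c(y))). The sketch below is a simplification under stated assumptions: the `counts` dictionary layout, the `total_pages` estimate N, and the sum-over-context decision rule are hypothetical, and the paper's indirect-association model is not reproduced.

```python
import math


def pmi(count_x, count_y, count_xy, total_pages):
    """Pointwise mutual information (base 2) estimated from page counts."""
    if min(count_x, count_y, count_xy) <= 0:
        return float("-inf")  # no co-occurrence evidence at all
    return math.log2(count_xy * total_pages / (count_x * count_y))


def disambiguate(candidates, context_words, counts, total_pages):
    """Pick the candidate translation with the highest summed PMI over the context.

    counts maps a word to its page count and a (candidate, word) tuple to the
    page count of the joint query (an assumed data layout).
    """
    def score(candidate):
        return sum(
            pmi(counts[candidate], counts[w], counts[(candidate, w)], total_pages)
            for w in context_words
        )
    return max(candidates, key=score)
```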

4.
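A method for recognizing image keywords using visual phrases built from character primitives is proposed. The method extracts the maximally stable extremal regions (MSERs) of an image keyword and normalizes them to obtain character primitives. Because each keyword usually consists of several character primitives, visual phrases constructed from adjacent character primitives are used to make the keyword's feature description more discriminative; and because different combinations of character primitives may form different image keywords, the similarity of the phrases' geometric structure is judged from the adjacency relations between character primitives. The method requires no preprocessing such as binarization, layout analysis, or text-region localization, and is therefore more flexible and robust. Experimental results show that it recognizes image keywords in different languages with high accuracy.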

5.
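Phraseology, long an ancillary topic within various branches of linguistics, has for decades remained on the margins of linguistic research. Most studies followed traditional, language-pattern-based lexico-grammatical approaches, defining and studying phraseological phenomena through roots, meaning, and grammatical and syntactic constraints, and relied largely on intuition. J. R. Firth was the first to propose studying the co-occurrence of lexical items with corpus methods; Halliday and Sinclair (1996) later inherited and developed Firth's theory, proposing a full set of concepts and methods, including node words and collocates, for extracting collocation evidence from corpora. After phraseology formally became an independent, systematic discipline in the 1980s, its methods still fell into two classes, traditional theory-driven and corpus data-driven. The breadth and diversity of phraseology, however, brought these methods to a bottleneck, and the integrated analysis approach proposed by Willy Martin (2008) opened a new way forward. This paper outlines how the traditional and corpus-based methods have evolved in phraseological research, along with their respective features, strengths, and weaknesses, and discusses the soundness and prospects of Willy Martin's integrated approach.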

6.
Rapid porting of English base noun phrase identification to Chinese   Cited: 1 (self: 0, others: 1)
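A strategy for identifying English base noun phrases that combines boundary statistics with part-of-speech sequence correction is proposed; it raises the F-measure for English base noun phrase identification to 96.90%, exceeding the best result reported so far. Through simple symbol substitution (modifying the program took no more than 1 h), the program for identifying English base noun phrases was used to identify Chinese base noun phrases, reaching an F-measure of 95.04%. The technique can be extended to the rapid porting of many kinds of phrases.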

7.
罗胜, 陈平, 叶忻泉, 沈龙. 《光电工程》 (Opto-Electronic Engineering), 2008, 35(12): 101-106
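A region-to-detail image segmentation algorithm that imitates the human visual mechanism is proposed. Image edges are first extracted and cut into segments to obtain a set of endpoints, from which a Delaunay triangulation is generated. A graph is then constructed with the Delaunay triangles as vertices and the attribute differences between adjacent triangles as edge weights, and a graph-based segmentation algorithm generates a minimum spanning tree to partition the regions. Finally, a Snake model precisely locates the region boundaries, producing accurate region edges. Experiments show that this combination of region segmentation and edge detection segments non-textured images accurately, largely overcomes blocking artifacts and discontinuous boundaries, yields better results than region segmentation or edge detection alone, and is relatively fast.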
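
The graph-partitioning step can be sketched as Kruskal-style merging with a union-find structure: edges are visited in order of increasing weight, and the accepted edges form a minimum spanning forest whose trees are the regions. This is only an illustration under assumptions: nodes stand for Delaunay triangles, the fixed `threshold` cut-off is hypothetical (the abstract does not give the merging criterion), and `segment` is an invented name.

```python
def segment(num_nodes, edges, threshold):
    """Group nodes into regions by Kruskal-style merging.

    edges is a list of (weight, u, v) tuples, where weight is the attribute
    difference between adjacent triangles u and v. Returns one region label
    per node; nodes sharing a label belong to the same region.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for weight, u, v in sorted(edges):
        if weight > threshold:
            break  # every remaining edge is heavier, so stop merging
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_v] = root_u  # accept the edge into the spanning forest
    return [find(i) for i in range(num_nodes)]
```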

8.
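When viscoelastic artificial boundaries are introduced into explicit dynamic analysis, the stiffness and damping of the artificial boundary, among other factors, make the numerical integration stability condition of the overall model more stringent, which to some extent limits the use of viscoelastic artificial boundaries in large-scale explicit dynamic analysis. Based on an analysis of the stability condition of explicit time-domain step-by-step integration with viscoelastic artificial boundaries, and of the factors that affect it, this paper proposes attaching lumped masses to the artificial boundary to improve its numerical integration stability, yielding an improved viscoelastic artificial boundary with better stability. To determine a reasonable boundary mass, the stability condition of the improved boundary is derived using a stability analysis method based on local subsystems, and recommended values of the boundary mass parameter are given through comparative analysis. With the recommended values, the stability condition in the artificial boundary zone is less restrictive than that of the interior computational domain, so the stability of the overall model is governed by the interior domain and the critical time step can be determined with conventional stability criteria. Numerical examples show that the proposed method improves computational efficiency while retaining the accuracy of the original artificial boundary, and is highly practical.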
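
To see why added mass can relax the time-step limit, consider the textbook central-difference bound for an undamped single-degree-of-freedom system. This is not the condition derived in the paper; the symbols $m$ (lumped mass), $k$ (stiffness), $\omega$ (natural frequency), and $\Delta m$ (added boundary mass) are assumptions introduced here for illustration.

$$
\Delta t \le \Delta t_{cr} = \frac{2}{\omega} = 2\sqrt{\frac{m}{k}},
\qquad
\Delta t_{cr}^{\text{added}} = 2\sqrt{\frac{m + \Delta m}{k}} > \Delta t_{cr}
\quad (\Delta m > 0).
$$

In explicit schemes that evaluate the damping force from a lagged velocity, damping lowers this limit further, which is consistent with the abstract's remark that boundary stiffness and damping tighten the stability condition.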

9.
刘宇鹏, 李生, 赵铁军. 《高技术通讯》 (Chinese High Technology Letters), 2011, 21(10): 1042-1047
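The alignment method for constructing word-level confusion networks (CNs) and the selection of the skeleton translation are studied. An incremental alignment method is introduced into the existing translation error rate (TER) alignment framework, and different orders of adding the candidate translations are tried; two minimum Bayes risk (MBR) strategies are used to select the skeleton translation; and synonym matching is performed to improve the alignment of synonyms in TER. In the experiments, the order of the candidate translations, the skeleton-translation selection ...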
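
One simple way to picture MBR skeleton selection is to pick the candidate with the lowest total distance to all the other candidates. The sketch below makes several assumptions that the truncated abstract does not confirm: a uniform posterior over candidates, plain word-level edit distance in place of TER (no shifts, no length normalization), and the invented names `word_edit_distance` and `select_skeleton`.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two whitespace-tokenized strings."""
    h, r = hyp.split(), ref.split()
    prev = list(range(len(r) + 1))
    for i, h_word in enumerate(h, 1):
        cur = [i]
        for j, r_word in enumerate(r, 1):
            cur.append(min(
                prev[j] + 1,                      # delete h_word
                cur[j - 1] + 1,                   # insert r_word
                prev[j - 1] + (h_word != r_word), # match or substitute
            ))
        prev = cur
    return prev[-1]


def select_skeleton(hypotheses):
    """Return the hypothesis with the lowest summed distance to all hypotheses."""
    return min(
        hypotheses,
        key=lambda h: sum(word_edit_distance(h, other) for other in hypotheses),
    )
```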

10.
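Objective: From the perspective of artificial intelligence alignment, to explore how generative AI models can align with designers' intentions in human-AI co-creation. Methods: Through an analysis of key elements, and taking interpretability and controllability in the AI alignment problem as the research perspective, the study examines how generative AI, as a design aid, can align with the specific intentions and needs of the three stages of design problem solving ("exploration, innovation, evaluation") and analyzes the alignment problems to be solved at each stage. Results: Based on the tasks of the alignment stages, an interactive alignment method, a representation expansion method, and a representation evaluation method are constructed on human-AI design concept representations. Conclusion: Three methods (intention alignment, design-space expansion, and design-rule matching) are built for the three alignment stages respectively, helping designers establish controllable and interpretable human-AI co-creation methods and thereby achieve controllable, trustworthy human-AI co-creation.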

11.
When Google proposed the Transformer in 2017, it was first applied to machine translation and achieved state-of-the-art results at the time. Although current neural machine translation models can generate high-quality translations, mistranslations and omissions still occur when translating the key information in long sentences. Yet the most important part of a traditional translation task is the translation of key information: as long as the key information is translated accurately and completely, the quality of the final translation can still be guaranteed even if other parts are translated incorrectly. To address mistranslation and omission and to improve the accuracy and completeness of long-sentence translation, this paper proposes a Transformer-based neural machine translation model that fuses key information. The model extracts the keywords of the source-language text separately and feeds them to the encoder. After being encoded in the same way as the source text, the keyword representation is fused with the encoder output for the source text, and the result is processed and passed to the decoder. By incorporating keyword information from the source sentence, the model performs reliably on long-sentence translation. To verify the effectiveness of the proposed key-information fusion method, a series of experiments was carried out on the validation set. The results show that the proposed model achieves a higher Bilingual Evaluation Understudy (BLEU) score than the original Transformer on the Workshop on Machine Translation (WMT) 2017 test dataset, demonstrating the advantages of the proposed model.

12.
Neural machine translation (NMT) is an end-to-end learning approach to automated translation that overcomes weaknesses of conventional phrase-based translation systems. Although NMT-based systems have become popular in commercial translation applications, there is still plenty of room for improvement. As the most widely used search algorithm in NMT, beam search is vital to the translation result. However, traditional beam search can produce duplicated or missing translations because of its target-sequence selection strategy. To alleviate this problem, this paper proposes improvements to NMT based on a novel beam search evaluation function, and uses reinforcement learning to train a translation evaluation system that selects better candidate words when generating translations. Extensive experiments were conducted on the CASIA corpus and 1,000,000 bilingual sentence pairs from NiuTrans. The results show that the proposed methods effectively improve English-to-Chinese translation quality.
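
For readers unfamiliar with the baseline being improved, a plain beam search over summed log-probabilities can be sketched as below. This shows standard beam search only, not the paper's evaluation function or its reinforcement-learning scorer; the `next_log_probs` callback, the `eos` marker, and the function name are assumptions made for the example.

```python
def beam_search(next_log_probs, beam_size, max_len, eos="</s>"):
    """Standard beam search.

    next_log_probs(prefix) must return {token: log_prob} for the next step,
    where prefix is a tuple of tokens. Returns the best (tokens, score) pair.
    """
    beams = [((), 0.0)]   # unfinished hypotheses: (tokens, summed log-prob)
    finished = []         # hypotheses that already ended with eos
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for token, log_prob in next_log_probs(prefix).items():
                candidates.append((prefix + (token,), score + log_prob))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    return max(finished + beams, key=lambda c: c[1])
```

With `beam_size=1` this reduces to greedy decoding, which can miss a sequence whose first token looks less likely but whose continuation is much better; a wider beam keeps such prefixes alive.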

13.
A phrase-based Chinese-Mongolian statistical machine translation and reordering model   Cited: 2 (self: 0, others: 2)
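Based on characteristics of the Mongolian language, a reordering model suited to phrase-based Chinese-Mongolian statistical machine translation is proposed, together with the corresponding training and decoding algorithms and preliminary experimental results. The Chinese-Mongolian bilingual corpus is very small and data sparsity is severe, while word order changes markedly in Chinese-Mongolian translation, so the reordering methods used in Chinese-English and similar machine translation are difficult to apply. By assuming that the changes in word order during Chinese-Mongolian translation follow a normal distribution, a probabilistic reordering model is built. Experiments show that this model outperforms the reordering method used in the Moses system.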
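
A normal-distribution reordering model can be pictured as fitting a mean and standard deviation to observed reordering jumps and then scoring a new jump by its Gaussian log-density. The sketch below is hypothetical in its details: the abstract does not define the jump, so "jump" here means a signed word-position offset, and `fit_jump_model`, `jump_log_prob`, and the `min_sigma` floor are assumptions.

```python
import math
from statistics import mean, pstdev


def fit_jump_model(jumps, min_sigma=0.5):
    """Estimate (mu, sigma) of observed reordering jumps.

    min_sigma is an assumed floor that avoids a zero variance on tiny corpora.
    """
    return mean(jumps), max(pstdev(jumps), min_sigma)


def jump_log_prob(jump, mu, sigma):
    """Log-density of a jump under the normal distribution N(mu, sigma^2)."""
    return (
        -0.5 * math.log(2 * math.pi * sigma ** 2)
        - (jump - mu) ** 2 / (2 * sigma ** 2)
    )
```

A decoder could add `jump_log_prob` as one feature score, so that jumps far from the typical offset are penalized smoothly rather than by a hard distortion limit.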

14.
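Cultural differences hinder cross-cultural communication, and this is especially evident in trademark translation. A long-standing rigid understanding of many trademark translation principles has caused numerous problems in practice. Starting from cases in which mistranslated trademarks hurt product sales, this paper analyzes the problems that trademark mistranslation brings. On this basis, it proposes four translation strategies, including understanding the product's characteristics and understanding the cultural traditions of the target country, in order to improve the quality of trademark translation and promote sales of exported goods.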

15.
The translation quality of neural machine translation (NMT) systems depends largely on the quality of the large-scale bilingual parallel corpora available. Research shows that NMT performance drops sharply under limited resources, and that a large amount of high-quality bilingual parallel data is needed to train a competitive translation model. However, not all languages have large-scale, high-quality bilingual corpora. In these cases, improving corpus quality becomes the main way to increase the accuracy of NMT results. This paper proposes a new method that improves data quality through data cleaning, data expansion, and other measures, expanding the data at the word and sentence levels and thus enriching the bilingual data. A long short-term memory (LSTM) language model is also used to ensure the fluency of the constructed sentences, and a variety of processing methods are applied to improve the quality of the bilingual data. Experiments on three standard test sets validate the proposed method, with the state-of-the-art fairseq Transformer NMT system used for training. The results show that the proposed method improves translation quality: its BLEU score is 2.34 higher than that of the baseline.

16.
The performance of state-of-the-art machine translation (MT) systems is still far from perfect. In this article, we investigate combining on-line MT systems to produce better outputs. The basic idea is to automatically apply the edit operations of substitution, insertion, and deletion to candidate outputs. The outputs of three on-line MT engines are combined, and the proposed method is evaluated on the Chinese-to-English translation task in the travel domain, as defined in the International Workshop on Spoken Language Translation evaluation campaign. An overall improvement of 1.4 in BLEU score over the best single-system baseline is achieved.
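
To make the three edit operations concrete, the sketch below lists the word-level substitutions, insertions, and deletions that turn one candidate output into another, using Python's standard `difflib`. It only shows how such operations can be enumerated; how the article chooses which operations to apply is not described in the abstract, and the function name `edit_operations` is an assumption.

```python
from difflib import SequenceMatcher


def edit_operations(candidate, target):
    """List the word-level edits that turn candidate into target.

    Each item is (operation, candidate_words, target_words), where operation
    is "replace" (substitution), "insert", or "delete".
    """
    cand, targ = candidate.split(), target.split()
    matcher = SequenceMatcher(None, cand, targ, autojunk=False)
    operations = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            operations.append((tag, cand[i1:i2], targ[j1:j2]))
    return operations
```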

17.
李瑛. 《包装工程》 (Packaging Engineering), 2023, 44(24): 377-387, 397
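Objective: To translate sign language, the visual communication language used by hearing-impaired people, into a graphic language that is more intuitive, engaging, easy to learn, and easy to disseminate, thereby helping people understand the form and meaning of signs and promoting the preservation and spread of sign language. Methods: Building on the concept of translation, basic principles of semiotics are used to analyze and explain the sign-communication logic of the visual translation of sign language; the linguistic theory of literal and free translation is used to build a model for visual translation design; and diagrammatic design methods are used to visualize the visual information of sign language. Conclusions: Visual translation of sign language still faces many pressing problems. Its communication logic runs from "abstract meaning" to "sign language visuals" to "graphic symbols". It should improve the efficiency of visual communication in terms of both applicability and artistry. From the perspective of translating form, "literal-translation diagrams" suited to recognition and communication are distinguished; from the perspective of translating meaning, "free-translation diagrams" suited to reading and comprehension are distinguished. Current literal-translation diagrams need improvement, while free-translation diagrams are seriously neglected. Diagram design should consider the whole process of information encoding and decoding, and innovative design should strengthen the exploration of diversity, accuracy, functionality, artistry, humanistic value, and appeal.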

18.
Dependency information has recently been used in various ways to improve neural machine translation (NMT). For example, dependency labels have been added to the hidden states of source words, or the words contiguous to a source word in the dependency tree have been identified, learned independently, and added to the NMT model as a unit in various ways. However, these works are all limited to using dependency information to enrich the hidden states of source words. Since many works in statistical machine translation (SMT) and NMT have demonstrated the validity and potential of dependency information, we believe there are still many ways to apply it within the NMT architecture. In this paper, we explore a new way of using dependency information to improve NMT. Based on the local attention mechanism, we present the Dependency-based Local Attention Approach (DLAA), a new attention mechanism that allows the NMT model to trace the dependency words related to the word currently being translated. Our work also indicates that dependency information can help supervise the attention mechanism. Experimental results on the shared training datasets of the WMT 17 Chinese-to-English translation task show that our model is effective and performs notably well on long-sentence translation.

19.
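Taking Adaptation Theory as its theoretical framework, the article argues that translating advertising language is a matter of adapting to the communicative context: adapting to mental-world factors such as the emotions and beliefs of the target-language audience, to social-world factors such as social environment and culture, and to physical-world factors such as time and space, so that the translated advertisement achieves its intended effect in the foreign market.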
