首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 687 毫秒
1.
专业术语的自动抽取对于领域机器翻译、领域知识图谱等方面均具有重要作用.近年来,新能源领域专利文本的申请量逐年增加,我国科技文献走向世界有了更高需求,专业术语翻译质量直接影响专利文本的翻译质量.为了提高新能源领域专利文本术语抽取结果的准确率和召回率,构建新能源领域术语库以及提高新能源领域专利文本的翻译质量打下基础.本文提出了基于BERT-BiLSTM-CRF的新能源专利术语抽取方法,以自建的3002条新能源领域专利文本作为实验对象,在中文数据集上的实验结果达到了0.9211的精确率,0.9245的召回率以及0.9228的F1值.与其他经典深度学习术语抽取模型相比,基于BERT-BiLSTM-CRF的新能源专利术语抽取模型能有效地将新能源领域专利文本中字符较多的长序列术语识别出来,提高术语抽取在实际应用中的效果.  相似文献   

2.
【背景】及时掌握领域术语有助于动态把握领域发展方向,揭示领域的核心知识与研究热点。【目的】为提高领域术语抽取准确率,提出一种基于深度学习和统计信息的领域术语抽取方法。【方法】首先,对领域中文专利文本进行字嵌入表示,基于BERT(Bidirectional Encoder Representations from Transformers)获取字符级的向量表征作为模型的输入;然后,利用BiLSTM-CRF(Bidirectional Long Short Term Memory-Conditional Random Field)深度学习模型提取序列化文本的语义特征,得到领域术语标注序列;最后,综合计算复合结构术语的互信息和左右熵,并结合领域知识库对抽取结果进行校正。【结果】模型在“盐湖提锂”领域进行实验,结果表明BERT-BiLSTM-CRF模型抽取该领域术语准确率达到77.33%,而对抽取结果进行校正进一步将准确率提升了3.68%,是一种有效的领域术语抽取方法。  相似文献   

3.
中文领域术语自动抽取是中文信息处理中的一项基础性课题,并在自然语言生成、信息检索、文本摘要等领域中有广泛的应用。针对领域术语抽取问题,采用基于规则和多种统计策略相融合的方法,从词语度和领域度两个角度出发,提出一种领域术语的抽取算法并构建出相应的抽取系统。系统流程包括基于左右信息熵扩展的候选领域术语获取、基于词性搭配规则与边界信息出现概率知识库相结合的词语度筛选策略以及基于词频-逆文档频率(TF?IDF)的领域度筛选策略。运用此算法不但能抽取出领域的常见用词,还可以挖掘出领域新词。实验结果显示,基于如上方法构建的领域术语抽取系统的准确率为84.33%,所提方法能够有效支持中文领域术语的自动抽取。  相似文献   

4.
林源  陈志泊  孙俏 《计算机工程》2011,37(2):172-174
设计一种能够自动获取计算机领域术语的方案,提出基于规则与统计相结合的抽取方法,使用亚马逊网站的计算机类图书作为语料库,通过分词、去停止词预处理以及词频统计的方法提取出计算机类领域术语,并插入到由ODP构建的树中,形成计算机领域术语的层次结构。实验结果表明,与人工标注结果相比,使用废方法自动获取的术语有很高的准确率与召回率。  相似文献   

5.
本文提出了一种规则与统计相结合的方法,针对计算机领域术语综合其领域术语特征和统计特征。算法在语料词性标注基础上,在原有词串扩展算法上糅合领域术语部件和领域术语特征获取候选术语。综合统计特征G-MI实现候选术语过滤。实验证明,算法能有效提高术语抽取的正确率和抽取效率。  相似文献   

6.
在分别研究了基于信息熵和基于词频分布变化的术语抽取方法的情况下,该文提出了一种信息熵和词频分布变化相结合的术语抽取方法。信息熵体现了术语的完整性,词频分布变化体现了术语的领域相关性。通过应用信息熵,即将信息熵结合到词频分布变化公式中进行术语抽取,且应用简单语言学规则过滤普通字符串。实验表明,在汽车领域的语料上,应用该方法抽取出1300个术语,其正确率达到73.7%。结果表明该方法对低频术语有更好的抽取效果,同时抽取出的术语结构更完整。  相似文献   

7.
多词领域术语抽取是自然语言处理技术中的一个重点和难点问题, 结合维吾尔语语言特征,该文提出了一种基于规则和统计相结合的维吾尔语多词领域术语的自动抽取方法。该方法分为四个阶段: ①语料预处理, 包括停用词过滤和词性标注; ② 对字串取N元子串, 利用改进的互信息算法和对数似然比率计算子串内部的联合强度, 结合词性构成规则, 构建候选维吾尔语多词领域术语集; ③ 利用相对词频差值, 得到尽可能多的维吾尔语多词领域术语; ④ 结合C_value值获取最终领域术语并作后处理。实验结果准确率为85.08%, 召回率为 73.19%, 验证了该文提出的方法在维吾尔语多词领域术语抽取上的有效性。  相似文献   

8.
中文领域本体学习中术语的自动抽取*   总被引:3,自引:0,他引:3  
提出一种领域术语自动抽取的混合策略,首先进行多字词候选术语抽取和分词,然后合并其结果,最后通过领域相关度和领域主题一致度抽取出最终领域术语。在多字词抽取和最终领域术语抽取阶段分别对现有方法进行了改进,降低了字符串分解的时间复杂度并提高了领域术语抽取的准确率和召回率。实验表明,术语抽取准确率为90.64%,优于现有的抽取方法。  相似文献   

9.
采用CRF技术的军事情报术语自动抽取研究   总被引:3,自引:0,他引:3       下载免费PDF全文
针对军事情报领域,提出了一种基于条件随机场的术语抽取方法,该方法将领域术语抽取看作一个序列标注问题,将领域术语分布的特征量化作为训练的特征,利用CRF工具包训练出一个领域术语特征模板,然后利用该模板进行领域术语抽取。实验采用的训练语料来自“搜狐网络军事频道”的新闻数据,测试语料选取《现代军事》杂志2007年第1~8期的所有文章。实验取得了良好的结果,准确率为73.24%,召回率为69.57%,F-测度为71.36%,表明该方法简单易行,且具有领域通用性。  相似文献   

10.
周浪  张亮  冯冲  黄河燕 《计算机科学》2009,36(5):177-180
提出了一种规则与统计相结合的术语抽取方法,用于抽取包含多个词语的词组型术语.目前,绝大多数的统计方法都侧重于衡量术语的结构完整性,但这些方法并不能体现术语与专业相关的领域特征.通过对术语在各文档中的分布情况进行观察,提出了一种利用术语在语料中词频分布变化程度的统计信息采检验术语的领域相关性的方法,同时结合机器学习方法获取的语言知识,从计算机领域的语料中抽取领域特征明显的词组型术语.实验证明,该方法对低频术语和高频普通词串有较强的分辨能力.  相似文献   

11.
Abstract This paper describes an approach to the design of interactive multimedia materials being developed in a European Community project. The developmental process is seen as a dialogue between technologists and teachers. This dialogue is often problematic because of the differences in training, experience and culture between them. Conditions needed for fruitful dialogue are described and the generic model for learning design used in the project is explained.  相似文献   

12.
European Community policy and the market   总被引:1,自引:0,他引:1  
Abstract This paper starts with some reflections on the policy considerations and priorities which are shaping European Commission (EC) research programmes. Then it attempts to position the current projects which seek to capitalise on information and communications technologies for learning in relation to these priorities and the apparent realities of the marketplace. It concludes that while there are grounds to be optimistic about the contribution EC programmes can make to the efficiency and standard of education and training, they are still too technology driven.  相似文献   

13.
融合集成方法已经广泛应用在模式识别领域,然而一些基分类器实时性能稳定性较差,导致多分类器融合性能差,针对上述问题本文提出了一种新的基于多分类器的子融合集成分类器系统。该方法考虑在度量层融合层次之上通过对各类基多分类器进行动态选择,票数最多的类别作为融合系统中对特征向量识别的类别,构成一种新的自适应子融合集成分类器方法。实验表明,该方法比传统的分类器以及分类融合方法识别准确率明显更高,具有更好的鲁棒性。  相似文献   

14.
Development of software intensive systems (systems) in practice involves a series of self-contained phases for the lifecycle of a system. Semantic and temporal gaps, which occur among phases and among developer disciplines within and across phases, hinder the ongoing development of a system because of the interdependencies among phases and among disciplines. Such gaps are magnified among systems that are developed at different times by different development teams, which may limit reuse of artifacts of systems development and interoperability among the systems. This article discusses such gaps and a systems development process for avoiding them.  相似文献   

15.
This paper presents control charts models and the necessary simulation software for the location of economic values of the control parameters. The simulation program is written in FORTRAN, requires only 10K of main storage, and can run on most mini and micro computers. Two models are presented - one describes the process when it is operating at full capacity and the other when the process is operating under capacity. The models allow the product quality to deteriorate to a further level before an existing out-of-control state is detected, and they can also be used in situations where no prior knowledge exists of the out-of-control causes and the resulting proportion defectives.  相似文献   

16.
Going through a few examples of robot artists who are recognized worldwide, we try to analyze the deepest meaning of what is called “robot art” and the related art field definition. We also try to highlight its well-marked borders, such as kinetic sculptures, kinetic art, cyber art, and cyberpunk. A brief excursion into the importance of the context, the message, and its semiotics is also provided, case by case, together with a few hints on the history of this discipline in the light of an artistic perspective. Therefore, the aim of this article is to try to summarize the main characteristics that might classify robot art as a unique and innovative discipline, and to track down some of the principles by which a robotic artifact can or cannot be considered an art piece in terms of social, cultural, and strictly artistic interest. This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008  相似文献   

17.
Although there are many arguments that logic is an appropriate tool for artificial intelligence, there has been a perceived problem with the monotonicity of classical logic. This paper elaborates on the idea that reasoning should be viewed as theory formation where logic tells us the consequences of our assumptions. The two activities of predicting what is expected to be true and explaining observations are considered in a simple theory formation framework. Properties of each activity are discussed, along with a number of proposals as to what should be predicted or accepted as reasonable explanations. An architecture is proposed to combine explanation and prediction into one coherent framework. Algorithms used to implement the system as well as examples from a running implementation are given.  相似文献   

18.
This paper provides the author's personal views and perspectives on software process improvement. Starting with his first work on technology assessment in IBM over 20 years ago, Watts Humphrey describes the process improvement work he has been directly involved in. This includes the development of the early process assessment methods, the original design of the CMM, and the introduction of the Personal Software Process (PSP)SM and Team Software Process (TSP){SM}. In addition to describing the original motivation for this work, the author also reviews many of the problems he and his associates encountered and why they solved them the way they did. He also comments on the outstanding issues and likely directions for future work. Finally, this work has built on the experiences and contributions of many people. Mr. Humphrey only describes work that he was personally involved in and he names many of the key contributors. However, so many people have been involved in this work that a full list of the important participants would be impractical.  相似文献   

19.
基于复小波噪声方差显著修正的SAR图像去噪   总被引:4,自引:1,他引:3  
提出了一种基于复小波域统计建模与噪声方差估计显著性修正相结合的合成孔径雷达(Synthetic Aperture Radar,SAR)图像斑点噪声滤波方法。该方法首先通过对数变换将乘性噪声模型转化为加性噪声模型,然后对变换后的图像进行双树复小波变换(Dualtree Complex Wavelet Transform,DCWT),并对复数小波系数的统计分布进行建模。在此先验分布的基础上,通过运用贝叶斯估计方法从含噪系数中恢复原始系数,达到滤除噪声的目的。实验结果表明该方法在去除噪声的同时保留了图像的细节信息,取得了很好的降噪效果。  相似文献   

20.
Abstract  This paper considers some results of a study designed to investigate the kinds of mathematical activity undertaken by children (aged between 8 and 11) as they learned to program in LOGO. A model of learning modes is proposed, which attempts to describe the ways in which children used and acquired understanding of the programming/mathematical concepts involved. The remainder of the paper is concerned with discussing the validity and limitations of the model, and its implications for further research and curriculum development.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号