首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 78 毫秒
1.
本文主要是对在线问诊中产生的医疗文本进行命名实体识别的研究.使用在线医疗问答网站的数据,采用{B,I,O}标注体系构建数据集,抽取疾病、治疗、检查和症状四个医疗实体.以BiLSTM-CRF为基准模型,提出两种深度学习模型IndRNN-CRF和IDCNN-BiLSTM-CRF,并在自构建数据集上验证模型的有效性.将新提出的两种模型与基准模型通过实验对比得出:模型IDCNN-BiLSTM-CRF的F1值0.8116,超过了BiLSTM-CRF的F1值0.8009,IDCNN-BiLSTM-CRF整体性能好于BiLSTM-CRF模型;模型IndRNN-CRF的精确率0.8427,但该模型在召回率上低于基准模型BiLSTM-CRF.  相似文献   

2.
临床电子病历命名实体识别(Clinical Named Entity Recognition,CNER)的主要任务是对给定的一组电子病历文档进行识别并抽取出与医学临床相关的命名实体,然后将它们归类到预先定义好的类别中,如疾病、症状、检查等实体。命名实体识别任务通常被看作一个序列标注问题。目前,深度学习方法已经被广泛应用于该任务并取得了非常好的效果。但其中大部分方法未能有效利用大量的未标注数据;并且目前使用的特征相对简单,未能深入捕捉病历文本自身的特征。针对这两个问题,文中提出一种融入语言模型和注意力机制的深度学习方法。该方法首先从未标注的临床医疗数据中训练字符向量和语言模型,然后利用标注数据来训练标注模型。具体地,将句子的向量表示送入一个双向门控循环网络(Bidirectional Gated Recurrent Units,BiGRU)和预训练好的语言模型,并将两部分的输出进行拼接。之后,将前一层的拼接向量输入另一个BiGRU和多头注意力(Multi-head Attention)模块。最后,将BiGRU和多头注意力模块的输出进行拼接并输入条件随机场(Conditional Randoin Field,CRF),预测全局最优的标签序列。通过利用语言模型特征和多头注意力机制,该方法在CCKS-2017 Shared Task2标准数据集上取得了良好的结果(F1值为91.34%)。  相似文献   

3.
为弥补现有方法不能很好捕获电子病历实体之间的长距离依赖关系的缺陷,提出一种结合自注意力的BiLSTM-CRF的命名实体识别方法.将输入文本转成神经网络可识别的数值形式;经过BiLSTM网络并结合自注意力计算得到每个字的输出特征向量;通过C RF层找到句子最适合的输出标签序列,从而确定命名实体.采用CCKS2018数据集进行实验,结果表明,改进的命名实体识别方法对电子病历具有一定的适应性,且与现有的方法相比,测试集的准确率提高了6.50~9.25个百分点.  相似文献   

4.
为弥补现有方法不能很好捕获电子病历实体之间的长距离依赖关系的缺陷,提出一种结合自注意力的BiLSTM-CRF的命名实体识别方法.将输入文本转成神经网络可识别的数值形式;经过BiLSTM网络并结合自注意力计算得到每个字的输出特征向量;通过C RF层找到句子最适合的输出标签序列,从而确定命名实体.采用CCKS2018数据集进行实验,结果表明,改进的命名实体识别方法对电子病历具有一定的适应性,且与现有的方法相比,测试集的准确率提高了6.50~9.25个百分点.  相似文献   

5.
针对中文电子病历中医疗嵌套实体难以处理的问题, 本文基于RoBERTa-wwm-ext-large预训练模型提出一种知识增强的中文电子病历命名实体识别模型ERBEGP. RoBERTa-wwm-ext-large采用的全词掩码策略能够获得词级别的语义表示, 更适用于中文文本. 首先结合知识图谱, 使模型学习到了大量的医疗实体名词, 进一步提高模型对电子病历实体识别的准确性. 然后通过BiLSTM对电子病历输入序列编码, 能够更好捕获病历的中上下语义信息. 最后利用全局指针网络模型EGP (efficient GlobalPointer)同时考虑实体的头部和尾部的特征信息来预测嵌套实体, 更加有效地解决中文电子病历命名实体识别任务中嵌套实体难以处理的问题. 在CBLUE中的4个数据集上本文方法均取得了更好的识别效果, 证明了ERBEGP模型的有效性.  相似文献   

6.
电子病历(EMR)是医疗信息快速发展的产物,目前以非结构化文本形式存储。通过使用自然语言处理(NLP)技术,在非结构化文本中提取出大量医学实体,将有助于提升医务人员查阅病历效率,同时识别的成果也将辅助于接下来的关系提取和知识图谱构建等研究。介绍常用的若干个数据集、语料标注标准和评价指标。从早期传统方法、深度学习方法、预训练模型、小样本问题处理四个方面详细阐述电子病历命名实体识别方法,对比分析各模型自身的优势及局限性。探讨了目前研究的不足,并对未来发展方向提出展望。  相似文献   

7.
为解决命名实体之间的复杂嵌套以及语料库中标注误差导致的相邻命名实体边界重叠问题,提出一种中文重叠命名实体识别方法。利用基于随机合并与拆分的层次化聚类算法将重叠命名实体标签划分到不同的聚类簇中,建立文字到实体标签之间的一对一关联关系,解决了实体标签聚类陷入局部最优的问题,并在每个标签聚类簇中采用融合中文部首的BiLSTM-CRF模型提高重叠命名实体的识别稳定性。实验结果表明,该方法通过标签聚类的方式有效避免标注误差对识别过程的干扰,F1值相比现有识别方法平均提高了0.05。  相似文献   

8.
电子病历命名实体识别(named entity recognition,NER)旨在识别电子病历文本中的医疗实体,并将其归为预定义的医疗实体类别,为进一步的医疗关系抽取、医疗信息检索、医疗智能问答等自然语言处理任务提供支持。系统梳理了电子病历命名实体识别的定义、标注方法、评价指标及难点;从电子病历命名实体识别难点及技术发展历程两个角度,综述了每类电子病历命名实体识别方法的优势与不足;详细梳理了国内医疗领域命名实体识别的评测任务及数据集;详细讨论和总结电子病历命名实体识别每一类难点的解决方案;总结全文并展望了医疗领域命名实体识别的发展方向。  相似文献   

9.
中文电子病历命名实体和实体关系语料库构建   总被引:1,自引:0,他引:1  
电子病历是由医务人员撰写的面向患者个体描述医疗活动的记录,蕴含了大量的医疗知识和患者的健康信息.电子病历命名实体识别和实体关系抽取等信息抽取研究对于临床决策支持、循证医学实践和个性化医疗服务等具有重要意义,而电子病历命名实体和实体关系标注语料库的构建是首当其冲的.在调研了国内外电子病历命名实体和实体关系标注语料库构建的基础上,结合中文电子病历的特点,提出适合中文电子病历的命名实体和实体关系的标注体系,在医生的指导和参与下,制定了命名实体和实体关系的详细标注规范,构建了标注体系完整、规模较大且一致性较高的标注语料库.语料库包含病历文本992份,命名实体标注一致性达到0.922,实体关系一致性达到0.895.为中文电子病历信息抽取后续研究打下了坚实的基础.  相似文献   

10.
《软件》2019,(8):208-211
电子病历是医疗单位对门诊部、住院患者临床诊疗与指导干预的、数字化的医疗服务工作的相关记录[1]。为了完成电子病历的高效的信息提取工作,本文使用深度学习的相关算法对电子病历中的文本进行命名实体的识别工作。其算法选择LSTM(Long-Short Term Memory,长短期记忆人工神经网络)和MLP(Multi-Layer Perception,多层神经网络),其用于构建算法模型。该本使用BP网络(Back—PropagationNetwork,后向传播)训练数据模型,应用已经标注的病历数据进行相应的训练与测试。该本通过实验证明,深度学习的算法在电子病历命名实体识别中是高效的[2]。  相似文献   

11.
Abstract This paper describes an approach to the design of interactive multimedia materials being developed in a European Community project. The developmental process is seen as a dialogue between technologists and teachers. This dialogue is often problematic because of the differences in training, experience and culture between them. Conditions needed for fruitful dialogue are described and the generic model for learning design used in the project is explained.  相似文献   

12.
European Community policy and the market   总被引:1,自引:0,他引:1  
Abstract This paper starts with some reflections on the policy considerations and priorities which are shaping European Commission (EC) research programmes. Then it attempts to position the current projects which seek to capitalise on information and communications technologies for learning in relation to these priorities and the apparent realities of the marketplace. It concludes that while there are grounds to be optimistic about the contribution EC programmes can make to the efficiency and standard of education and training, they are still too technology driven.  相似文献   

13.
融合集成方法已经广泛应用在模式识别领域,然而一些基分类器实时性能稳定性较差,导致多分类器融合性能差,针对上述问题本文提出了一种新的基于多分类器的子融合集成分类器系统。该方法考虑在度量层融合层次之上通过对各类基多分类器进行动态选择,票数最多的类别作为融合系统中对特征向量识别的类别,构成一种新的自适应子融合集成分类器方法。实验表明,该方法比传统的分类器以及分类融合方法识别准确率明显更高,具有更好的鲁棒性。  相似文献   

14.
Development of software intensive systems (systems) in practice involves a series of self-contained phases for the lifecycle of a system. Semantic and temporal gaps, which occur among phases and among developer disciplines within and across phases, hinder the ongoing development of a system because of the interdependencies among phases and among disciplines. Such gaps are magnified among systems that are developed at different times by different development teams, which may limit reuse of artifacts of systems development and interoperability among the systems. This article discusses such gaps and a systems development process for avoiding them.  相似文献   

15.
This paper presents control charts models and the necessary simulation software for the location of economic values of the control parameters. The simulation program is written in FORTRAN, requires only 10K of main storage, and can run on most mini and micro computers. Two models are presented - one describes the process when it is operating at full capacity and the other when the process is operating under capacity. The models allow the product quality to deteriorate to a further level before an existing out-of-control state is detected, and they can also be used in situations where no prior knowledge exists of the out-of-control causes and the resulting proportion defectives.  相似文献   

16.
Going through a few examples of robot artists who are recognized worldwide, we try to analyze the deepest meaning of what is called “robot art” and the related art field definition. We also try to highlight its well-marked borders, such as kinetic sculptures, kinetic art, cyber art, and cyberpunk. A brief excursion into the importance of the context, the message, and its semiotics is also provided, case by case, together with a few hints on the history of this discipline in the light of an artistic perspective. Therefore, the aim of this article is to try to summarize the main characteristics that might classify robot art as a unique and innovative discipline, and to track down some of the principles by which a robotic artifact can or cannot be considered an art piece in terms of social, cultural, and strictly artistic interest. This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008  相似文献   

17.
Although there are many arguments that logic is an appropriate tool for artificial intelligence, there has been a perceived problem with the monotonicity of classical logic. This paper elaborates on the idea that reasoning should be viewed as theory formation where logic tells us the consequences of our assumptions. The two activities of predicting what is expected to be true and explaining observations are considered in a simple theory formation framework. Properties of each activity are discussed, along with a number of proposals as to what should be predicted or accepted as reasonable explanations. An architecture is proposed to combine explanation and prediction into one coherent framework. Algorithms used to implement the system as well as examples from a running implementation are given.  相似文献   

18.
This paper provides the author's personal views and perspectives on software process improvement. Starting with his first work on technology assessment in IBM over 20 years ago, Watts Humphrey describes the process improvement work he has been directly involved in. This includes the development of the early process assessment methods, the original design of the CMM, and the introduction of the Personal Software Process (PSP)SM and Team Software Process (TSP){SM}. In addition to describing the original motivation for this work, the author also reviews many of the problems he and his associates encountered and why they solved them the way they did. He also comments on the outstanding issues and likely directions for future work. Finally, this work has built on the experiences and contributions of many people. Mr. Humphrey only describes work that he was personally involved in and he names many of the key contributors. However, so many people have been involved in this work that a full list of the important participants would be impractical.  相似文献   

19.
基于复小波噪声方差显著修正的SAR图像去噪   总被引:4,自引:1,他引:3  
提出了一种基于复小波域统计建模与噪声方差估计显著性修正相结合的合成孔径雷达(Synthetic Aperture Radar,SAR)图像斑点噪声滤波方法。该方法首先通过对数变换将乘性噪声模型转化为加性噪声模型,然后对变换后的图像进行双树复小波变换(Dualtree Complex Wavelet Transform,DCWT),并对复数小波系数的统计分布进行建模。在此先验分布的基础上,通过运用贝叶斯估计方法从含噪系数中恢复原始系数,达到滤除噪声的目的。实验结果表明该方法在去除噪声的同时保留了图像的细节信息,取得了很好的降噪效果。  相似文献   

20.
Abstract  This paper considers some results of a study designed to investigate the kinds of mathematical activity undertaken by children (aged between 8 and 11) as they learned to program in LOGO. A model of learning modes is proposed, which attempts to describe the ways in which children used and acquired understanding of the programming/mathematical concepts involved. The remainder of the paper is concerned with discussing the validity and limitations of the model, and its implications for further research and curriculum development.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号