Similar literature
19 similar documents found.
1.
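Construction of a corpus of named entities and entity relations for Chinese electronic medical records   Total citations: 1 (self-citations: 0, citations by others: 1)
Electronic medical records (EMRs) are records written by medical staff that describe the medical care of individual patients; they contain a large amount of medical knowledge and patient health information. Information extraction research on EMRs, such as named entity recognition and entity relation extraction, is important for clinical decision support, evidence-based medical practice, and personalized medical services, and building a corpus annotated with named entities and entity relations is the first step. After surveying the construction of such corpora in China and abroad, and taking the characteristics of Chinese EMRs into account, the authors propose an annotation scheme for named entities and entity relations suited to Chinese EMRs, draw up detailed annotation guidelines with the guidance and participation of physicians, and build an annotated corpus with a complete annotation scheme, a relatively large scale, and high consistency. The corpus contains 992 medical record texts; annotation consistency reaches 0.922 for named entities and 0.895 for entity relations. It lays a solid foundation for subsequent research on information extraction from Chinese EMRs.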

2.
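An electronic medical record is a file that records a patient's health status during diagnosis and treatment; its text contains a large number of medical entities that carry rich medical information. Current relation extraction models in the medical domain mainly identify the semantic relation between two given medical entities by relation classification. Chinese electronic medical records are characterized by a high density of entities. To address this, the paper proposes a relation triple recognition method based on conditional prompts and sequence labeling, which converts relation triple recognition into a sequence labeling task. The head entity and relation type of a triple serve as the conditional prompt, and sequence labeling identifies the tail entities in the record text that are associated with the prompt. Experiments on a Chinese electronic medical record dataset show that the method can effectively identify relation triples in Chinese electronic medical records.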
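
The idea of treating the head entity and relation type as a condition and labeling tail entities as a tag sequence can be illustrated with a small sketch. This is not the paper's model: the input template in `build_input`, the plain B/I/O tag set, and both function names are assumptions made for illustration, and the tags would in practice come from a trained tagger.

```python
from typing import List, Tuple


def build_input(head: str, relation: str, text: str) -> str:
    """Prepend the condition (head entity, relation type) to the record text.

    The [CLS]/[SEP] template is a hypothetical layout, not the paper's.
    """
    return f"[CLS]{head}[SEP]{relation}[SEP]{text}"


def decode_tails(text: str, tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert per-character B/I/O tags into (start, end, surface) spans.

    `end` is exclusive. An "I" tag that does not follow a "B" is ignored.
    """
    spans = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:  # close the previous span first
                spans.append((start, i, text[start:i]))
            start = i
        elif tag == "O":
            if start is not None:
                spans.append((start, i, text[start:i]))
                start = None
    if start is not None:  # span that runs to the end of the text
        spans.append((start, len(tags), text[start:len(tags)]))
    return spans
```

Given one head entity and one relation type, every tail entity in the sentence is read off a single tag sequence, which is why the formulation suits text in which entities are densely packed.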

3.
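This paper discusses the annotation of entities and inter-entity relations in Chinese electronic medical record texts for stroke, and proposes an annotation scheme and guidelines for entities and entity relations suited to stroke electronic medical record texts. Under the guidance of the scheme and guidelines, several rounds of manual annotation and correction were carried out, completing the entity and entity relation annotation of more than 1.58 million characters of stroke electronic medical record text. The result is the Stroke Electronic Medical Record entity and entity related Corpus (SEMRC). The corpus contains 10,594 named entities and 14,457 entity relations. The annotation agreement rate reaches 85.16% for entity names and 94.16% for entity relations.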

4.
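Research in the medical domain has found that electronic medical records differ across hospital departments, while annotating new corpora is very costly. To address this problem, transfer learning is used to study cross-department chunking in Chinese electronic medical records. On the constructed Chinese electronic medical record corpus, the SSVM and CRF models were compared on part-of-speech tagging and chunking; SSVM performed better and was chosen as the base tagging model. In addition, an improved structural correspondence learning (SCL) algorithm was used for chunking, so that the algorithm can be applied to the SSVM model for domain adaptation. Experimental results show that the algorithm effectively improves cross-department domain adaptation in sequence labeling tasks.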

5.
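JIANG Zhipeng (蒋志鹏), GUAN Yi (关毅). Acta Automatica Sinica (《自动化学报》), 2019, 45(2): 276-288
Full parsing is an important structuring process in natural language processing (NLP). Because syntactically annotated corpora for Chinese electronic medical records (CEMRs) are scarce, there has so far been no research on full parsing of CEMRs. Targeting the strongly patterned sublanguage characteristics of CEMRs, this paper is the first to formalize the patterns reused in CEMRs as tree fragments, and proposes a model that combines data-oriented parsing (DOP) with hierarchical parsing. In the tree fragment extraction stage, more efficient extraction algorithms for standard tree fragments and partial tree fragments are proposed, which respectively solve the repeated comparison problem of standard tree fragments and the inefficiency of the quadratic tree kernel (QTK), yielding a standard tree fragment set and a partial tree fragment set. Based on these two sets, a mixed word and part-of-speech matching strategy and a maximal tree fragment combination algorithm are proposed to improve the DOP model and reduce the noise introduced by invalid tree fragments. Experimental results show that the combined model effectively improves parsing of CEMRs: with a small amount of annotated data its F1 reaches 80.87%, the highest reported to date, and in cross-department parsing it exceeds the Stanford parser and the Berkeley parser by more than 2%.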

6.
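Electronic medical records contain a great deal of useful medical knowledge, and extracting it is important for building clinical decision support systems and personalized health information services. Automatic word segmentation is a key foundation for analyzing and mining Chinese electronic medical records. To overcome the difficulty of obtaining annotated corpora, an unsupervised word segmentation method for Chinese electronic medical records is proposed. First, a general-domain dictionary is used for an initial segmentation of the records; to better resolve ambiguity, a probabilistic model is introduced and word occurrence probabilities are estimated from raw text with the EM algorithm. Then, a goodness measure is built from the left and right branching entropy of character strings, so that unknown word recognition becomes an optimization problem that is solved by dynamic programming. Finally, experiments on 3,000 Chinese electronic medical records from a neurology department demonstrate the effectiveness of the method.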
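
As a rough illustration of the left and right branching entropy mentioned above, the sketch below counts the characters that precede or follow each occurrence of a string in raw text and computes the Shannon entropy of that distribution. The function names are hypothetical, and taking the minimum of the two entropies as the goodness score is an assumption (one common choice), since the abstract does not give the paper's exact definition.

```python
import math
from collections import Counter


def branch_entropy(corpus: str, s: str, side: str = "right") -> float:
    """Shannon entropy (in bits) of the characters adjacent to `s` in `corpus`.

    side="right" looks at the character after each occurrence,
    side="left" at the character before it. Overlapping occurrences count.
    """
    neighbors = Counter()
    start = corpus.find(s)
    while start != -1:
        idx = start + len(s) if side == "right" else start - 1
        if 0 <= idx < len(corpus):  # skip occurrences at the text boundary
            neighbors[corpus[idx]] += 1
        start = corpus.find(s, start + 1)
    total = sum(neighbors.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in neighbors.values())


def goodness(corpus: str, s: str) -> float:
    """Assumed goodness score: the smaller of the two branching entropies."""
    return min(branch_entropy(corpus, s, "left"),
               branch_entropy(corpus, s, "right"))
```

A string that is followed and preceded by many different characters behaves like a free-standing word and scores high, whereas a fragment that is always followed by the same character scores zero.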

7.
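In existing named entity recognition tasks for Chinese clinical electronic medical records, the annotation granularity is usually either too fine or too coarse. Annotations that are too fine are hard to match to practical application scenarios, while annotations that are too coarse usually need complex further processing before the canonical form and semantic type of an entity are clear enough for downstream data mining. To simplify processing, nine types of fine-grained resolving entities are defined to explain coarse-grained entities, based on the characteristics of seven common types of coarse-grained clinical entities. In addition, for the characteristics of multi-granularity entities, a multi-granularity clinical entity recognition model based on multi-task learning and self-attention is proposed, and 5,000 texts containing multi-granularity entities were annotated from a real hospital electronic medical record database to validate the model. Experimental results show that the model outperforms mainstream sequence labeling models, with F1 scores of 92.88 and 85.48 on the coarse-grained and fine-grained entity recognition tasks, respectively.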

8.
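To address the difficulty of handling nested medical entities in Chinese electronic medical records, this paper proposes ERBEGP, a knowledge-enhanced named entity recognition model for Chinese electronic medical records based on the RoBERTa-wwm-ext-large pretrained model. The whole-word masking strategy used by RoBERTa-wwm-ext-large yields word-level semantic representations and is better suited to Chinese text. First, a knowledge graph is incorporated so that the model learns a large number of medical entity terms, further improving the accuracy of entity recognition in electronic medical records. Then, a BiLSTM encodes the input sequence of the record to better capture its contextual semantic information. Finally, the global pointer network EGP (Efficient GlobalPointer) considers the features of both the head and the tail of an entity to predict nested entities, handling nested entities in Chinese electronic medical record named entity recognition more effectively. The method achieves better recognition results on four datasets from CBLUE, demonstrating the effectiveness of ERBEGP.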
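
The reason a head-and-tail span scorer can represent nested entities is easiest to see in the decoding step. The sketch below assumes that a model has already produced, for each entity type, a matrix whose entry `[i][j]` scores the span from token `i` to token `j`; the function name, the dictionary layout, and the threshold of 0 are illustrative assumptions, not details taken from the ERBEGP paper.

```python
from typing import Dict, List, Tuple


def decode_spans(
    scores: Dict[str, List[List[float]]], threshold: float = 0.0
) -> List[Tuple[int, int, str]]:
    """Return every (start, end, label) span whose score exceeds the threshold.

    Only the upper triangle (start <= end) is read, and `end` is inclusive.
    Spans are judged independently, so one span may sit inside another.
    """
    spans = []
    for label, matrix in scores.items():
        n = len(matrix)
        for i in range(n):
            for j in range(i, n):
                if matrix[i][j] > threshold:
                    spans.append((i, j, label))
    return sorted(spans)
```

Because each (start, end) pair is scored on its own, an inner span and the outer span that contains it can both be emitted, which a single tag per token cannot express.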

9.
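To address problems in existing methods for named entity recognition and entity relation extraction in Chinese electronic medical records, an entity recognition and entity relation extraction method is proposed that combines a bidirectional long short-term memory network (BiLSTM) with a conditional random field (CRF). The method first uses word embeddings to convert text into numerical vectors as input to the BiLSTM network, then applies the CRF chain structure for sequence labeling to output the most probable label sequence, and finally turns the recognition results into a knowledge graph. Experiments show that the method clearly improves precision, recall, and F-score for entity recognition and entity relation extraction on Chinese electronic medical records. The results meet the requirements of clinical system applications and offer guidance for building clinical decision support systems and personalized medical recommendation services.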

10.
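YU Jie (余杰), JI Bin (纪斌), LIU Lei (刘磊), LI Shasha (李莎莎), MA Jun (马俊), LIU Huijun (刘慧君). Computer Science (《计算机科学》), 2021, 48(11): 287-293
The spread of electronic clinical records makes it possible to extract high-value information from medical records quickly by automated methods. As an important kind of medical information, a tumor medical event consists of a series of attributes that describe a malignant tumor. In recent years, tumor medical event extraction has become a research focus in academia; many academic conferences have released it as an evaluation task and provided a series of high-quality annotated datasets. Given that the attributes of tumor medical events are discrete, this paper proposes a joint extraction method for Chinese medical events that jointly extracts two attributes, the primary tumor site and the primary tumor size, and also extracts the tumor metastasis site. In addition, to address the small number and limited variety of annotated tumor medical event texts, a pseudo-data generation algorithm based on global random replacement of key information is proposed, which improves the transfer learning ability of the joint extraction method across different types of tumor medical events. The method ranked third in the CCKS2020 evaluation task on clinical medical event extraction from Chinese electronic medical records, and extensive experiments on the CCKS2019 and CCKS2020 datasets verify its effectiveness.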

11.
Autoimmune diseases, such as antiphospholipid syndrome, systemic lupus erythematosus, and rheumatoid arthritis, are characterized by a high prevalence of cardiovascular (CV) disease (CVD), which is the leading cause of morbidity and mortality among such patients. Although such effects are partly explained by a higher prevalence of traditional CV risk factors, many studies indicate that such factors do not fully explain the enhanced CV risk in these patients. In addition, risk stratification algorithms based upon traditional CV risk factors are not as predictive in autoimmune diseases as in the general population. For these reasons, the timely and accurate assessment of CV risk in these high-risk populations remains an unmet clinical need. An enhanced contribution of different inflammatory components of the immune response, as well as autoimmune elements (e.g. autoantibodies, autoantigens, and cellular response), has been proposed to underlie the incremental CV risk observed in these populations. Recent advances in proteomic tools have contributed to the discovery of proteins involved in CVDs, including some that may be suitable for use as biological markers. In this review we summarize the main markers in the field of CVDs associated with autoimmunity, as well as the recent advances in proteomic technology and their application for biomarker discovery in autoimmune disease.

12.
We present an efficient and automatic image-recoloring technique for dichromats that highlights important visual details that would otherwise be unnoticed by these individuals. While previous techniques approach this problem by potentially changing all colors of the original image, causing their results to look unnatural to color vision deficients, our approach preserves, as much as possible, the image's original colors. Our approach is about three orders of magnitude faster than previous ones. The results of a paired-comparison evaluation carried out with fourteen color-vision deficients (CVDs) indicated the preference of our technique over the state-of-the-art automatic recoloring technique for dichromats. When considering information visualization examples, the subjects tend to prefer our results over the original images. An extension of our technique that exaggerates color contrast tends to be preferred when CVDs compared pairs of scientific visualization images. These results provide valuable information for guiding the design of visualizations for color-vision deficients.

13.
The US Federal Aviation Administration (FAA) has developed a standard set of colors for coding information on air traffic control (ATC) displays. A significant complication was that the air traffic controller population includes people who have color-vision deficiencies (CVDs). We wrote a software tool to assist the FAA in selecting a preliminary color set. It accepts a set of luminances and chromaticity coordinates as input and: (1) Draws graphics and calculates color-related figures of merit to predict whether the set will be acceptable for color-normal and CVD users; (2) Flags colors and pairings that violate human factors criteria; and (3) Allows designers to adjust the colors and see the resulting changes immediately. The tool has been used to perform a pilot study for the FAA’s color-set development project and should also be useful for designing other color-coding sets.

14.
BRICS (Brazil, Russia, India, China and South Africa) are currently viewed as pillars of relative political, economic and financial stability, with the prospect of a major shift in future world power. The paper investigates the relationships among the economic, financial and political country risk ratings of the BRICS and relates those risk factors to their respective national stock markets in the presence of representatives of the world's major stock markets and oil market. It also examines the interrelationships among the national country financial risk ratings factors to discern transmission of the risk spectrum among the countries of this group because of the relevance of this information to investors, traders and policy makers. The results demonstrate that only the Chinese stock market is sensitive to all the factors. Financial risk ratings generally demonstrate more sensitivity than economic and political risk ratings, and political risk is sensitive to both financial and economic risk ratings. Among the five BRICS, Brazil shows special sensitivity to economic and financial risks, while Russia and China hold strong sensitivity to political risk and India demonstrates special sensitivity to higher oil prices. Among the global factors, oil price is more sensitive to economic than financial risk, while the S&P 500 reverses this relationship. The two American quantitative easings (QEs) affect BRICS differently.

15.
The success rate for information technology (IT) projects continues to be low. With an increasing number of IT projects in developing countries such as China, it is important to understand the risks they are experiencing on IT projects. To date, there has been little research documenting Asian perceptions of IT project risk. In this research, we examine the risks identified by Chinese senior executives (SEs) and project managers (PMs), and compare these two groups. The importance of top management support in IT projects is well documented. Prior research has shown that from the perspective of IT PMs, lack of support from SEs is the number one risk in IT projects. Surprisingly, senior executives' perceptions towards IT project risk have never been systematically examined. One reason why lack of support from senior executives continues to represent a major risk may be that senior executives themselves do not realize the critical role that they can play in helping to deliver successful projects. In this study, we use the Delphi method to compare the risk perceptions of senior executives and project managers. By comparing risk factors selected by each group, zones of concordance and discordance are identified. In terms of perceived importance ascribed to risk factors, PMs tend to focus on lower-level risks with particular emphasis on risks associated with requirements and user involvement, whereas SEs tend to focus on higher-level risks such as those risks involving politics, organization structure, process, and culture. Finally, approaches for dealing with risk factors that are seen as important by both SEs and PMs are provided.

16.
The investigation of hemodynamic information for the assessment of cardiovascular diseases (CVDs) has gained importance in recent years. Improved flow measuring modalities and computational fluid dynamics (CFD) simulations yield reliable blood flow information. For a visual exploration of the flow information, domain experts are used to investigating the flow information combined with its enclosing vessel anatomy. Since the flow is spatially embedded in the surrounding vessel surface, occlusion problems have to be resolved. A visual reduction of the vessel surface that still provides important anatomical features is required. We accomplish this by applying an adaptive surface visualization inspired by the suggestive contour measure. Furthermore, an illustration is employed to highlight the animated pathlines and to emphasize nearby surface regions. Our approach combines several visualization techniques to improve the perception of surface shape and depth. Thereby, we ensure appropriate visibility of the embedded flow information, which can be depicted with established or advanced flow visualization techniques. We apply our approach to cerebral aneurysms and aortas with simulated and measured blood flow. Through informal user feedback from nine domain experts, we confirm the advantages of our approach compared with existing methods, e.g. semi-transparent surface rendering. Additionally, we assessed the applicability and usefulness of the pathline animation with highlighting of nearby surface regions.

17.
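With the arrival of the information age, information security problems have become increasingly complex and diverse, so a high-performance solution is urgently needed. Building on previous research, this paper further improves the application of Bayesian network models to information security risk assessment. First, the types of risk elements in an information system are analyzed and a new method for determining risk factors is proposed, namely establishing the common associations among factors. Then, the information system indicator system is determined from these associations and, combined with conditional probabilities accumulated from experience, the Matlab Bayes Net Toolbox (BNT) is used to build a complete Bayesian network risk assessment model, including an analysis of the assessment process, the use of the method, and the determination of risk levels. Finally, the improved Bayesian assessment model is analyzed on a case study, inferring the probability of each risk level from the experimental data. The simulation results agree with the actual conclusions, showing that the improved model accurately reflects the security risk level of an information system and is an effective and reasonable assessment method.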

18.
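Information security risk assessment using a game-theory-based comprehensive weighting method   Total citations: 2 (self-citations: 0, citations by others: 2)
To determine the weights of risk factors in risk assessment reasonably and to evaluate information security risk scientifically, this paper applies a game-theory-based comprehensive weighting method that integrates subjective and objective weights into a comprehensive weight for each risk factor. The method is applied to a case study of an information system, which shows that its assessment results are scientifically sound and reasonable and offers a new approach to information system risk assessment.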
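
The abstract does not spell out the combination step, so the following is only a sketch of the formulation commonly described for game-theoretic combination weighting, restricted to two weight vectors. It looks for coefficients that minimize the deviation between the combined vector and each input vector, which gives the linear system sum_j alpha_j (w_i . w_j) = w_i . w_i, and then normalizes the coefficients. The function names and the equal-mix fallback for parallel vectors are assumptions.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def game_theory_weights(
    w_subj: Sequence[float], w_obj: Sequence[float]
) -> Tuple[List[float], Tuple[float, float]]:
    """Combine a subjective and an objective weight vector.

    Solves [[a11, a12], [a12, a22]] @ alpha = [a11, a22] by Cramer's rule,
    where a11 = w_subj.w_subj, a12 = w_subj.w_obj, a22 = w_obj.w_obj,
    then normalizes |alpha| to sum to 1.
    """
    a11, a12, a22 = dot(w_subj, w_subj), dot(w_subj, w_obj), dot(w_obj, w_obj)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        # Parallel vectors: the system is singular, fall back to an equal mix.
        alphas = (0.5, 0.5)
    else:
        alpha1 = (a11 * a22 - a12 * a22) / det
        alpha2 = (a11 * a22 - a12 * a11) / det
        total = abs(alpha1) + abs(alpha2)
        alphas = (abs(alpha1) / total, abs(alpha2) / total)
    combined = [alphas[0] * s + alphas[1] * o for s, o in zip(w_subj, w_obj)]
    return combined, alphas
```

If both input vectors sum to 1, so does the combined vector, because the normalized coefficients themselves sum to 1.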

19.
Slip, trip and fall (STF) incidents, particularly falls from a height, are a leading cause of injury in the New Zealand residential construction industry. The most common origins of falls from a height in this sector are ladders, scaffolding and roofs, while slipping is the most frequent fall initiating event category. The study aimed to provide detailed information on construction industry STF risk factors for high-risk tasks, work equipment and environments, as identified from an earlier analysis of STF claims data, together with information to be used in the development of interventions to reduce STF risk in New Zealand residential construction. The study involved the use of both incident-centred and incident-independent methods of investigation, including detailed follow-up investigations of incidents and observations and interviews with workers on construction sites, to provide data on a wide range of risk factors. A large number of risk factors for residential construction STFs were identified, including factors related to the work environment, tasks and the use and availability of appropriate height work equipment. The different methods of investigation produced complementary information on factors related to equipment design and work organization, which underlie some of the site conditions and work practices identified as key risk factors for residential construction STFs. A conceptual systems model of residential construction STF risk is presented.
