Similar literature
Found 20 similar documents; search took 469 ms.
1.
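Real-world studies based on observational data from electronic medical records have become a hot topic in clinical research. However, the relational data model cannot directly support the representation of temporal relationships between medical events or the knowledge-fusion queries that research applications require. To address these problems, this paper proposes a new RDF-based representation model for medical observational data, which can clearly represent multiple event types such as clinical examinations, diagnoses, and treatments, as well as the temporal relationships between events. An event graph is built from hospital electronic medical record data in four steps: data preprocessing, data schema conversion, temporal relationship construction, and knowledge fusion. Specifically, using electronic medical record data from three Grade-A tertiary hospitals in Shanghai, the paper constructs a medical dataset covering 3 specialties, 173,395 medical events, and 501,335 temporal relationships between events, fused with 5,313 concepts from Chinese medical knowledge bases. Based on the clinical literature and physicians' research needs, the paper provides 40 example questions for this dataset, organized by types of public health epidemiology research such as etiological studies and treatment studies. Some of these questions are compared experimentally against a traditional relational database in terms of query construction and execution, demonstrating the advantages of the event graph. The dataset follows open linked data standards, is published on OpenKG, and provides an online SPARQL endpoint at https://peg.ecustnlplab.com/dataset.html.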
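The temporal relationship construction step named in the abstract above can be pictured with a minimal sketch. It assumes each event carries a patient identifier and an ISO-format timestamp, and it links consecutive events of the same patient with a `before` triple. The function name, the `ex:before` predicate, and the sample records are hypothetical illustrations, not the paper's schema or data.

```python
from collections import defaultdict


def build_temporal_triples(events):
    """Link consecutive events of each patient with an 'ex:before' triple.

    `events` is a list of (event_id, patient_id, timestamp) tuples, where the
    timestamp is an ISO date string so that string order equals time order.
    """
    by_patient = defaultdict(list)
    for event_id, patient_id, timestamp in events:
        by_patient[patient_id].append((timestamp, event_id))

    triples = []
    for patient_id in sorted(by_patient):
        ordered = sorted(by_patient[patient_id])  # sort by timestamp
        # Pair each event with its immediate successor in time.
        for (_, earlier), (_, later) in zip(ordered, ordered[1:]):
            triples.append((earlier, "ex:before", later))
    return triples


# Hypothetical records: three events for patient p1, one for patient p2.
sample_events = [
    ("e2", "p1", "2019-03-02"),
    ("e1", "p1", "2019-03-01"),
    ("e3", "p2", "2019-01-05"),
    ("e4", "p1", "2019-04-10"),
]
sample_triples = build_temporal_triples(sample_events)
```

Storing the order as explicit triples is what lets a graph query follow "what happened next" directly, instead of re-deriving it with self-joins on a timestamp column.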

2.
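Gu Peiyue, Liu Zheng, Li Yun, Li Tao. Journal of Computer Applications (《计算机应用》), 2019, 39(2): 421-428
For discovering temporal dependencies in event sequences, traditional frequent episode discovery methods rely on a time-window mechanism that mines only simple association dependencies between events, and they cannot effectively handle interleaved temporal associations between events. To address these problems, the concept of time-lag episode discovery is proposed and, building on frequent episode discovery, a time-lag episode discovery algorithm based on the Adjacent Event Matching set (AEM) is designed. First, a probabilistic statistical model of time lags is introduced for event sequence matching, which avoids presetting a time window and handles possible interleaved associations. Then, time-lag mining is formulated as an optimization problem, and the distribution of time intervals between time-lag episodes is obtained iteratively. Finally, hypothesis testing is used to distinguish serial time-lag episodes from parallel ones. Theoretical analysis and experimental results show that, compared with the Iterative Closest Event (ICE) algorithm, the latest time-lag mining method, the average KL distance between the time-lag distribution modeled by the AEM-based algorithm and the true time-lag distribution is 0.056, a reduction of 20.68%. By using the probabilistic statistical model of time lags to weigh the likelihood of multiple ways of matching events, the AEM-based algorithm obtains one-to-many adjacent event matching sets, which model real situations more effectively than the one-to-one matching in ICE.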

3.
4.
To satisfy a user’s need to find and understand the whole picture of an event effectively and efficiently, in this paper we formalize the problem of temporal event searches and propose a framework of event relationship analysis for search events based on user queries. We define three kinds of event relationships: temporal, content dependence, and event reference, that can be used to identify to what extent a component event is dependent on another in the evolution of a target event (i.e., the query event). The search results are organized as a temporal event map (TEM) that serves as the whole picture about an event’s evolution or development by showing the dependence relationships among events. Based on the event relationships in the TEM, we further propose a method to measure the degrees of importance of events, so as to discover the important component events for a query, as well as the several algebraic operators involved in the TEM, that allow users to view the target event. Experiments conducted on a real data set show that our method outperforms the baseline method Event Evolution Graph (EEG), and it can help discover certain new relationships missed by previous methods and even by human annotators.

5.
An important usage of time sequences is to discover temporal patterns. The discovery process usually starts with a user specified skeleton, called an event structure, which consists of a number of variables representing events and temporal constraints among these variables; the goal of the discovery is to find temporal patterns, i.e., instantiations of the variables in the structure that appear frequently in the time sequence. The paper introduces event structures that have temporal constraints with multiple granularities, defines the pattern discovery problem with these structures, and studies effective algorithms to solve it. The basic components of the algorithms include timed automata with granularities (TAGs) and a number of heuristics. The TAGs are for testing whether a specific temporal pattern, called a candidate complex event type, appears frequently in a time sequence. Since there are often a huge number of candidate event types for a usual event structure, heuristics are presented aiming at reducing the number of candidate event types and reducing the time spent by the TAGs testing whether a candidate type does appear frequently in the sequence. These heuristics exploit the information provided by explicit and implicit temporal constraints with granularity in the given event structure. The paper also gives the results of an experiment to show the effectiveness of the heuristics on a real data set.

6.
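Huang Yilong, Li Peifeng, Zhu Qiaoming. Computer Science (《计算机科学》), 2018, 45(6): 204-207, 234
Causal and temporal relations are two important kinds of event relations. Existing studies usually treat the identification of causal relations and of temporal relations as two independent tasks, which ignores the connection between the two. This paper proposes using integer linear programming to build a joint inference model for identifying event causal and temporal relations. The joint model imposes constraints across the two kinds of relations and optimizes the results on top of the classifier models. The final results show that the proposed joint inference model effectively improves identification performance.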

7.
Knowledge Discovery from Series of Interval Events (total citations: 4; self-citations: 0; citations by others: 4)
Knowledge discovery from data sets can be extensively automated by using data mining software tools. Techniques for mining series of interval events, however, have not been considered. Such time series are common in many applications. In this paper, we propose mining techniques to discover temporal containment relationships in such series. Specifically, an item A is said to contain an item B if an event of type B occurs during the time span of an event of type A, and this is a frequent relationship in the data set. Mining such relationships provides insight about temporal relationships among various items. We implement the technique and analyze trace data collected from a real database application. Experimental results indicate that the proposed mining technique can discover interesting results. We also introduce a quantization technique as a preprocessing step to generalize the method to all time series.
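The containment definition above (A contains B when a B event falls within the time span of an A event, and this happens frequently) can be illustrated with a brute-force sketch. This is not the paper's mining algorithm: the function names, the inclusive boundary test, the absolute `min_support` count, and the sample trace are assumptions made for illustration.

```python
from collections import Counter


def containment_counts(events):
    """Count how often an event of type B lies within an event of type A.

    `events` is a list of (item_type, start, end) tuples with start <= end.
    Boundaries are treated as inclusive (an assumption of this sketch).
    """
    counts = Counter()
    for type_a, start_a, end_a in events:
        for type_b, start_b, end_b in events:
            if type_a == type_b:
                continue  # only relate events of different item types
            if start_a <= start_b and end_b <= end_a:
                counts[(type_a, type_b)] += 1
    return counts


def frequent_containments(events, min_support):
    """Return the (A, B) pairs seen at least `min_support` times."""
    counts = containment_counts(events)
    return {pair for pair, n in counts.items() if n >= min_support}


# Hypothetical trace: B events sit inside A events; C runs past the end of A.
trace = [
    ("A", 0, 10), ("B", 2, 4), ("B", 6, 8),
    ("A", 20, 30), ("B", 22, 25), ("C", 28, 35),
]
frequent = frequent_containments(trace, min_support=2)
```

The quadratic pairwise scan is only meant to make the definition concrete; a practical miner would index intervals by start time.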

8.
Mining Nonambiguous Temporal Patterns for Interval-Based Events (total citations: 2; self-citations: 0; citations by others: 2)
Previous research on mining sequential patterns mainly focused on discovering patterns from point-based event data. Little effort has been put toward mining patterns from interval-based event data, where a pair of time values is associated with each event. Kam and Fu's work in 2000 identified 13 temporal relationships between two intervals. According to these temporal relationships, a new variant of temporal patterns was defined for interval-based event data. Unfortunately, the patterns defined in this manner are ambiguous, which means that the temporal relationships among events cannot be correctly represented in temporal patterns. To resolve this problem, we first define a new kind of nonambiguous temporal pattern for interval-based event data. Then, the TPrefixSpan algorithm is developed to mine the new temporal patterns from interval-based events. The completeness and accuracy of the results are also proven. The experimental results show that the efficiency and scalability of the TPrefixSpan algorithm are satisfactory. Furthermore, to show the applicability and effectiveness of temporal pattern mining, we execute experiments to discover temporal patterns from historical Nasdaq data.
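To make the 13 temporal relationships between two intervals concrete, the sketch below classifies a pair of intervals using the relation names from Allen's interval algebra. It assumes proper intervals (start strictly before end) given as `(start, end)` tuples; the function name is hypothetical, and the code illustrates the relations only, not the TPrefixSpan algorithm.

```python
def allen_relation(a, b):
    """Return the Allen relation that holds from interval a to interval b.

    Both arguments are (start, end) tuples with start < end.
    """
    a_start, a_end = a
    b_start, b_end = b
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Remaining case: the intervals partially overlap.
    return "overlaps" if a_start < b_start else "overlapped-by"


# One example pair per relation, each compared against the interval (5, 8).
reference = (5, 8)
examples = [(1, 3), (1, 5), (1, 6), (5, 6), (6, 7), (6, 8), (5, 8),
            (5, 9), (4, 8), (4, 9), (6, 9), (8, 9), (9, 10)]
relations = [allen_relation(x, reference) for x in examples]
```

A single relation name describes one pair of intervals; the ambiguity the abstract mentions arises when such pairwise labels are composed into patterns over three or more events.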

9.
In order to achieve an optimum and successful operation of an industrial process, it is important firstly to detect upsets, equipment malfunctions or other abnormal events as early as possible and secondly to identify and remove the cause of those events. Univariate and multivariate statistical process control methods have been widely applied in process industries for early fault detection and localization. The primary objective of the proposed research is the design of an anomaly detection and visualization tool that is able to present to the shift operator, and to the various levels of plant operation and company management, an early, global, accurate and consolidated presentation of the operation of major subgroups or of the whole plant, aided by a graphical form. Piecewise Aggregate Approximation (PAA) and Symbolic Aggregate Approximation (SAX) are considered two of the most popular representations for time series data mining, including clustering, classification, pattern discovery and visualization in time series datasets. However, SAX is preferred since it is able to transform a time series into a set of discrete symbols, e.g. into alphabet letters, and is thus far more appropriate for a graphical representation of the corresponding information, especially for the shift operator. The methods are applied on individual time records of each process variable, as well as on entire groups of time records of process variables in combination with Hidden Markov Models. In this way, the proposed visualization tool is not only associated with a process defect, but it also allows identifying which specific abnormal situation occurred and whether this has also occurred in the past. Case studies based on the benchmark Tennessee Eastman process demonstrate the effectiveness of the proposed approach. The results indicate that the proposed visualization tool captures meaningful information hidden in the observations and shows superior monitoring performance.
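The two representations named above can be sketched in a few lines. The code below shows the commonly described form of PAA (segment means) and SAX (z-normalize, apply PAA, then map each mean to a letter using equiprobable breakpoints of the standard normal distribution). The function names, the four-letter alphabet, and the sample readings are assumptions for illustration; this is not the paper's tool, and it omits the Hidden Markov Model stage.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each equal-length segment.

    Assumes len(series) is divisible by n_segments to keep the sketch short.
    """
    size = len(series) // n_segments
    return [fmean(series[i * size:(i + 1) * size]) for i in range(n_segments)]


def sax(series, n_segments, alphabet="abcd"):
    """Symbolic Aggregate Approximation of a non-constant numeric series."""
    mean, std = fmean(series), pstdev(series)  # std is 0 for a constant series
    normalized = [(x - mean) / std for x in series]
    k = len(alphabet)
    # Breakpoints split the standard normal into k equally probable regions.
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    return "".join(
        alphabet[bisect_right(breakpoints, value)]
        for value in paa(normalized, n_segments)
    )


# Hypothetical sensor readings, summarized as a four-letter word.
readings = [0, 0, 2, 2, 9, 9, 4, 4]
word = sax(readings, n_segments=4)
```

Because the output is a short string, it can be compared, counted, or color-coded directly, which is the property the abstract cites as the reason for preferring SAX in an operator-facing display.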

10.
When analyzing thousands of event histories, analysts often want to see the events as an aggregate to detect insights and generate new hypotheses about the data. An analysis tool must emphasize both the prevalence and the temporal ordering of these events. Additionally, the analysis tool must also support flexible comparisons to allow analysts to gather visual evidence. In a previous work, we introduced align, rank, and filter (ARF) to accentuate temporal ordering. In this paper, we present temporal summaries, an interactive visualization technique that highlights the prevalence of event occurrences. Temporal summaries dynamically aggregate events in multiple granularities (year, month, week, day, hour, etc.) for the purpose of spotting trends over time and comparing several groups of records. They provide affordances for analysts to perform temporal range filters. We demonstrate the applicability of this approach in two extensive case studies with analysts who applied temporal summaries to search, filter, and look for patterns in electronic health records and academic records.

11.
This paper introduces a method for mining co-occurring events from longitudinal data, and applies this method to detecting adverse drug reactions (ADRs) from patient data. Electronic health records are richer than older data sources (such as spontaneous report records) and thus are ideal for ADR mining. However, current data mining methods, such as disproportionality ratios and temporal itemset mining, ignore certain important aspects of the longitudinal data in patient records. In this paper, we highlight two specific problems with current methods, which we name temporal and contextual sensitivity, and discuss why these two properties are vital to mining patterns from longitudinal data. We also propose two sensitive longitudinal rate comparison measures, which utilize condition occurrence rates and length of drug eras, for mining ADRs from this type of data. These novel methods are then used to rank potential ADRs, along with existing state-of-the-art methods, under many simulated yet realistic datasets. In 48 out of 60 experiments, the proposed longitudinal rate comparison methods significantly outperform other methods in mining known ADRs from other drug / condition pairs.

12.
From association to classification: inference using weight of evidence (total citations: 1; self-citations: 0; citations by others: 1)
Association and classification are two important tasks in data mining and knowledge discovery. Intensive studies have been carried out in both areas. However, how to apply discovered event associations to classification is still seldom addressed in current publications. Trying to bridge this gap, this paper extends our previous paper on significant event association discovery to classification. We propose to use weight of evidence to evaluate the evidence of a significant event association in support of, or against, a certain class membership. Traditional weight of evidence in information theory is extended here to measure the event associations of different orders with respect to a certain class. After the discovery of significant event associations inherent in a data set, it is easy and efficient to apply the weight of evidence measure for classifying an observation according to any attribute. With this approach, we achieve flexible prediction.
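As a rough illustration of the measure, the sketch below computes the traditional weight of evidence that an observed event x provides for a class c against its complement, taken here as log(P(x | c) / P(x | not c)). A positive value supports membership in c and a negative value counts against it. The function name, the count-based estimates, and the add-one smoothing are assumptions of this sketch; the paper's extension to event associations of different orders is not reproduced.

```python
import math


def weight_of_evidence(n_event_in_class, n_class,
                       n_event_in_other, n_other, smoothing=1.0):
    """Weight of evidence of an event for a class versus its complement.

    The conditional probabilities are estimated from counts; the smoothing
    term (an assumption here) avoids division by zero and log of zero.
    """
    p_given_class = (n_event_in_class + smoothing) / (n_class + 2 * smoothing)
    p_given_other = (n_event_in_other + smoothing) / (n_other + 2 * smoothing)
    return math.log(p_given_class / p_given_other)


# Hypothetical counts: the event appears in 8 of 10 records of the class
# and in 2 of 10 records outside it.
supporting = weight_of_evidence(8, 10, 2, 10)
opposing = weight_of_evidence(2, 10, 8, 10)
neutral = weight_of_evidence(5, 10, 5, 10)
```

To classify an observation in this style, one would typically add up the weights of the events it exhibits for each candidate class and choose the class with the largest total.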

13.
Time series are often generated by continuous sampling or measurement of natural or social phenomena. In many cases, events cannot be represented by individual records, but instead must be represented by time series segments (temporal intervals). A consequence of this segment-based approach is that the analysis of events is reduced to analysis of occurrences of time series patterns that match segments representing the events. A major obstacle on the path toward event analysis is the lack of query languages for expressing interesting time series patterns. We have introduced SQL/LPP (Perng and Parker, 1999), which provides fairly strong expressive power for time series pattern queries, and are now able to attack the problem of specifying queries that analyze temporal coupling, i.e., temporal relationships obeyed by occurrences of two or more patterns. In this paper, we propose SQL/LPP+, a temporal coupling verification language for time series databases. Based on the pattern definition language of SQL/LPP (Perng and Parker, 1999), SQL/LPP+ enables users to specify a query that looks for occurrences of a cascade of multiple patterns using one or more of Allen's temporal relationships (Allen, 1983) and obtain desired aggregates or meta-aggregates of the composition. Issues of pattern composition control are also discussed.

14.
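Description of Temporal Association Rules in Data Warehouses (total citations: 1; self-citations: 1; citations by others: 1)
Starting from the concept of temporal types, this paper presents the temporal event space of different states described by a finite number of attributes over a temporal type, and defines temporal association rules between events. From these definitions, five different meaningful temporal association rules are derived. These rules are of general theoretical significance and can be applied to data mining problems in data warehouses, such as those holding merchandise sales or stock price data.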

15.
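Complex event processing analyzes the relationships among instances of multiple event types to produce composite events that are of interest to applications. Existing time models in event processing either use point timestamps to model both atomic and composite events, or define composite event timestamps without sufficient care, which leads to inconsistencies between composite event detection and composite event semantics. In addition, the accuracy of the time model must be traded off against the efficiency of composite event detection according to application requirements. To address these two problems, a composite event time model is defined in the service-oriented computing platform InforSIB, including composite event timestamps and solutions for event asynchrony and transmission delay. Finally, a corresponding efficient composite event detection algorithm is given based on the time model. Experimental results demonstrate the effectiveness of the time model.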

16.
Objective: To tackle the extraction of adverse drug reaction events in electronic health records. The challenge stands in inferring a robust prediction model from highly unbalanced data. According to our manually annotated corpus, only 6% of the drug-disease entity pairs trigger a positive adverse drug reaction event and this low ratio makes machine learning tough. Method: We present a hybrid system utilising a self-developed morpho-syntactic and semantic analyser for medical texts in Spanish. It performs named entity recognition of drugs and diseases and adverse drug reaction event extraction. The event extraction stage operates using rule-based and machine learning techniques. Results: We assess both the base classifiers, namely a knowledge-based model and an inferred classifier, and also the resulting hybrid system. Moreover, for the machine learning approach, an analysis of each particular bio-cause triggering the adverse drug reaction is carried out. Conclusions: One of the contributions of the machine learning based system is its ability to deal with both intra-sentence and inter-sentence events in a highly skewed classification environment. Moreover, the knowledge-based and the inferred model are complementary in terms of precision and recall. While the former provides high precision and low recall, the latter is the other way around. As a result, an appropriate hybrid approach seems to be able to benefit from both approaches and also improve them. This is the underlying motivation for selecting the hybrid approach. In addition, this is the first system dealing with real electronic health records in Spanish.

17.
Previous sequential pattern mining studies have dealt with either point-based event sequences or interval-based event sequences. In some applications, however, event sequences may contain both point-based and interval-based events. These sequences are called hybrid event sequences. Since the relationships among both kinds of events are more diverse, the patterns discovered from these events are more informative. In this study we introduce a hybrid temporal pattern mining problem and develop an algorithm to discover hybrid temporal patterns from hybrid event sequences. We carry out an experiment using both synthetic and real stock price data to compare our algorithm with the traditional algorithms designed exclusively for mining point-based patterns or interval-based patterns. The experimental results indicate that the efficiency of our algorithm is satisfactory. In addition, the experiment also shows that the predictive power of hybrid temporal patterns is higher than that of point-based or interval-based patterns.

18.
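In the rapidly developing age of information and data, cyber attacks pose serious threats to personal privacy, work and daily life, and even the safety of life and property. Hosts, as the essential devices on which people carry out everyday work communication, entertainment, and data storage, have become the main target of cyber attacks. Research on host attack discovery techniques is therefore urgent and necessary, and host events, as the carrier that records all behavior on a host, have become a key research object in today's cyber attack and defense field. An attacker's malicious operations on a host are inevitably recorded as host events, but malicious events are hidden among a huge volume of normal events and are hard to notice and filter out. This has prompted academic research on a series of questions: how to collect host events, how to identify and extract malicious events, how to reconstruct the attack process, and how to provide security protection. This paper surveys and summarizes research on host-event-based attack discovery techniques, traces its development, and compares the technique with two major research directions, intrusion detection and digital forensics, in four respects: object of analysis, method of analysis, time of action, and purpose of analysis. The comparison clarifies what is distinctive about the problem studied here and leads to a definition of it. The paper then explains the key concepts involved in host-event-based attack discovery, identifies the two major problems facing the field, dependency explosion and timeliness, and divides the research by stage into three categories: host event collection, host event processing, and host event analysis. For each category it presents the results and progress of research across a total of 12 sub-directions organized around the two problems. Finally, in light of the current state of research, it proposes five possible future research directions: the integrity and trustworthiness of host event records, the timeliness of attack discovery, cross-device attack discovery, the discovery of multi-step attacks, and the application of algorithms.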

19.
Many events in real world applications are long-lasting events which have certain durations. The temporal relationships among those durable events are often complex. Processing such complex events has become increasingly important in applications of wireless networks. An important issue of complex event processing is to extract patterns from event streams to support decision making in real-time. However, network latencies and machine failures in wireless networks may cause events to be out-of-order. In this work, we analyze the preliminaries of event temporal semantics. A tree-plan model of out-of-order durable events is proposed. A hybrid solution is correspondingly introduced. Extensive experimental studies demonstrate the efficiency of our approach.

20.
This article presents a novel framework for adapting the behavior of intelligent agents. The framework consists of an extended sequential pattern mining algorithm that, in combination with association rule discovery techniques, is used to extract temporal patterns and relationships from the behavior of human agents executing a procedural task. The proposed framework has been integrated within the CanadarmTutor, an intelligent tutoring agent aimed at helping students solve procedural problems that involve moving a robotic arm in a complex virtual environment. We present the results of an evaluation that demonstrates the benefits of this integration to agents acting in ill-defined domains.
