Similar documents
20 similar documents found (search time: 31 ms)
1.
A large volume of research in temporal data mining focuses on discovering temporal rules from time-stamped data. The majority of the methods proposed so far have been devoted mainly to mining temporal rules that describe relationships between data sequences or instantaneous events, and they do not consider the presence of complex temporal patterns in the dataset. Such complex patterns, for example trends or up-and-down behaviors, are often very interesting for users. In this paper we propose a new kind of temporal association rule and the related extraction algorithm; the learned rules involve complex temporal patterns in both their antecedent and consequent. Within our proposed approach, the user defines a set of complex patterns of interest that constitute the basis for the construction of the temporal rule; such complex patterns are represented and retrieved in the data through the formalism of knowledge-based Temporal Abstractions. An Apriori-like algorithm then looks for meaningful temporal relationships (in particular, precedence temporal relationships) among the complex patterns of interest. The paper presents the results obtained by the rule extraction algorithm on a simulated dataset and on two different datasets related to biomedical applications: the first concerns the analysis of time series coming from the monitoring of different clinical variables during hemodialysis sessions, while the other deals with the biological problem of inferring relationships between genes from DNA microarray data.
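As a rough, hedged illustration of the general idea behind such precedence rules (not the authors' algorithm), the Python sketch below counts how often one abstracted pattern label strictly precedes another across a set of interval-labeled episodes and keeps pairs above user-supplied support and confidence thresholds; the interval tuple format, the example labels, and the thresholds are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations

# Each episode is a list of (label, start, end) intervals produced by some
# temporal-abstraction step, e.g. [("BP_decreasing", 0, 10), ("HR_increasing", 12, 20)].
def mine_precedence_rules(episodes, min_support=0.3, min_confidence=0.6):
    pair_count = Counter()    # episodes containing "a ends before b starts"
    label_count = Counter()   # episodes containing each label at all
    n = len(episodes)
    for intervals in episodes:
        for lab in {lab for lab, _, _ in intervals}:
            label_count[lab] += 1
        seen = set()
        ordered = sorted(intervals, key=lambda x: x[1])
        for (la, sa, ea), (lb, sb, eb) in combinations(ordered, 2):
            if ea <= sb and la != lb:        # strict precedence: a finishes before b starts
                seen.add((la, lb))
        for pair in seen:
            pair_count[pair] += 1
    rules = []
    for (a, b), c in pair_count.items():
        support = c / n
        confidence = c / label_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules, key=lambda r: -r[2])

if __name__ == "__main__":
    episodes = [
        [("BP_decreasing", 0, 10), ("HR_increasing", 12, 20)],
        [("BP_decreasing", 5, 15), ("HR_increasing", 18, 25)],
        [("HR_increasing", 0, 8)],
    ]
    print(mine_precedence_rules(episodes))
```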

2.
HIQUAL is a component-oriented deep modeling language that supports the modeling of component hierarchies and the representation and analysis of temporal relations. We describe the semantics of a system of components as a set of temporally and causally related temporal intervals that are denoted by dynamic states and events of the components. Thus, we obtain a uniform semantics for single components, for a system of horizontally connected components at the same level, and for a system of vertically connected components at different levels of abstraction. We claim that in our approach parallelism and other temporal aspects including temporal uncertainty are more naturally represented than in other approaches, in particular those using global state semantics.

3.
Narayanan, Arvind; Verma, Saurabh; Zhang, Zhi-Li. 《World Wide Web》, 2019, 22(6): 2771-2798

We coin the term geoMobile data to emphasize datasets that exhibit geo-spatial features reflective of human behaviors. We propose and develop an EPIC framework to mine latent patterns from geoMobile data and provide meaningful interpretations: we first 'E'xtract latent features from high-dimensional geoMobile datasets via Laplacian Eigenmaps and perform clustering in this latent feature space; we then use a state-of-the-art visualization technique to 'P'roject these latent features into 2D space; and finally we obtain meaningful 'I'nterpretations by 'C'ulling cluster-specific significant feature sets. We illustrate that the local space contraction property of our approach is superior to that of other major dimension reduction techniques. Using diverse real-world geoMobile datasets, we show the efficacy of our framework via three case studies.
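A minimal sketch of that style of pipeline using off-the-shelf scikit-learn components (Laplacian Eigenmaps via SpectralEmbedding, k-means in the latent space, and t-SNE for the 2D projection) is shown below; the random data, dimensionalities, and cluster count are placeholders, and the cluster-specific feature-culling step of EPIC is not reproduced.

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding, TSNE
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.random((500, 200))                 # placeholder for a high-dimensional geoMobile matrix

# 'E'xtract: Laplacian Eigenmaps (spectral embedding of a nearest-neighbour graph).
latent = SpectralEmbedding(n_components=10, n_neighbors=15).fit_transform(X)

# Cluster in the latent feature space.
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(latent)

# 'P'roject the latent features to 2D for visual inspection.
coords_2d = TSNE(n_components=2, random_state=0).fit_transform(latent)

print(coords_2d.shape, np.bincount(labels))
```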


4.
In information exchange networks such as email or blog networks, most processes are carried out through the exchange of messages. Behavioral analysis in such networks leads to interesting insights that are quite valuable for organizational or social analysis. In this paper, we investigate user engagingness and responsiveness as two interaction behaviors that help us understand an email network, one kind of information exchange network. Engaging actors are those who can effectively solicit responses from other actors. Responsive actors are those who are willing to respond to other actors. By modeling such behaviors, we are able to measure them and to identify highly engaging or responsive actors. We systematically propose novel behavior models to quantify the engagingness and responsiveness of actors in the Enron email network. Furthermore, as a case study based on our proposed behavior models, we study an event detection problem in the Enron emails. Our empirical study uncovered meaningful events in Enron; see Sect. 5 for details.
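As a hedged, naive baseline for the two notions (not the behavior models proposed in the paper), one could measure engagingness as the fraction of an actor's sent messages that receive a reply and responsiveness as the fraction of received messages the actor replies to; the flat message-log format below is an assumption for illustration.

```python
from collections import defaultdict

# messages: list of dicts {"id", "sender", "recipient", "in_reply_to"}; in_reply_to is None
# for an original message. This flat single-recipient log format is an assumption.
def engagement_scores(messages):
    sent = defaultdict(int)          # messages sent by each actor
    received = defaultdict(int)      # messages received by each actor
    replied_to = set()               # ids of messages that got at least one reply
    responses = defaultdict(int)     # replies written by each actor
    by_id = {m["id"]: m for m in messages}
    for m in messages:
        sent[m["sender"]] += 1
        received[m["recipient"]] += 1
        if m["in_reply_to"] is not None and m["in_reply_to"] in by_id:
            replied_to.add(m["in_reply_to"])
            responses[m["sender"]] += 1
    engagingness = {a: sum(1 for m in messages
                           if m["sender"] == a and m["id"] in replied_to) / n
                    for a, n in sent.items()}
    responsiveness = {a: responses[a] / received[a] for a in received if received[a]}
    return engagingness, responsiveness

if __name__ == "__main__":
    msgs = [
        {"id": 1, "sender": "alice", "recipient": "bob", "in_reply_to": None},
        {"id": 2, "sender": "bob", "recipient": "alice", "in_reply_to": 1},
        {"id": 3, "sender": "carol", "recipient": "bob", "in_reply_to": None},
    ]
    print(engagement_scores(msgs))
```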

5.
The standard approach to feature construction and predictive learning in molecular datasets is to employ computationally expensive graph mining techniques and to bias the feature search exploration using frequency or correlation measures. These features are then typically employed in predictive models that can be constructed using, for example, SVMs or decision trees. We take a different approach: rather than mining for all optimal local patterns, we extract features from the set of pairwise maximum common subgraphs. The maximum common subgraphs are computed under the block-and-bridge-preserving subgraph isomorphism from the outerplanar examples in polynomial time. We empirically observe a significant increase in predictive performance when using maximum common subgraph features instead of correlated local patterns on 60 benchmark datasets from NCI. Moreover, we show that when we randomly sample the pairs of graphs from which to extract the maximum common subgraphs, we obtain a smaller set of features that still allows the same predictive performance as methods that exhaustively enumerate all possible patterns. The sampling strategy turns out to be a very good compromise between a slight decrease in predictive performance (while still remaining comparable with state-of-the-art methods) and a significant runtime reduction (two orders of magnitude on a popular medium-sized chemoinformatics dataset). This suggests that maximum common subgraphs are interesting and meaningful features.

6.
Abnormality detection in crowded scenes plays a very important role in the automatic monitoring of surveillance feeds. Here we present a novel framework for abnormality detection in crowd videos. The key idea of the approach is that rarely or sparsely occurring events correspond to abnormal activities, while regularly or commonly occurring events correspond to normal activities. Each input video is represented using feature matrices that capture the nature of the activity taking place while maintaining the spatial and temporal structure of the video. The feature matrices are decomposed into their low-rank and sparse components, where the sparse component corresponds to the abnormal activities. The approach does not require any explicit modeling of crowd behavior or training, but information from training data can be seamlessly incorporated if it is available. The estimation is further improved by ensuring temporal and spatial coherence of the sparse component across the videos using a Kalman filter-like framework. This not only reduces outliers and noise but also fills missing regions in the sparse component. Localization of the anomalies is obtained as a by-product of the proposed approach. Evaluation on the UMN and UCSD datasets and comparisons with several state-of-the-art crowd abnormality detection approaches show the effectiveness of the proposed approach. We also show results on a challenging crowd dataset created as part of this effort, with videos downloaded from the web.
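The low-rank plus sparse split described above is in the spirit of robust PCA; the sketch below shows a minimal principal component pursuit via the standard ADMM iterations (singular-value thresholding for the low-rank part, element-wise soft thresholding for the sparse part). It operates on a generic matrix and is not the paper's pipeline; in particular, the Kalman-filter-like temporal smoothing is omitted and the synthetic data are placeholders.

```python
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def rpca(M, lam=None, mu=None, tol=1e-7, max_iter=500):
    """Split M into a low-rank part L and a sparse part S (principal component pursuit)."""
    m, n = M.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))
    mu = mu or (m * n) / (4.0 * np.abs(M).sum())
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)
    norm_M = np.linalg.norm(M, "fro")
    for _ in range(max_iter):
        # Low-rank update: singular-value thresholding.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * soft_threshold(sig, 1.0 / mu)) @ Vt
        # Sparse update: element-wise soft thresholding.
        S = soft_threshold(M - L + Y / mu, lam / mu)
        # Dual update.
        residual = M - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual, "fro") / norm_M < tol:
            break
    return L, S

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.random((60, 5)) @ rng.random((5, 80))      # low-rank "normal activity"
    anomalies = (rng.random((60, 80)) < 0.02) * 5.0       # sparse "abnormal" entries
    L, S = rpca(base + anomalies)
    print("recovered sparse entries:", int((np.abs(S) > 1.0).sum()))
```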

7.
This paper proposes the mobile forensic reference set (MFReS), a mobile forensic investigation procedure and a tool for mobile forensics that we developed. The MFReS consists of repositories, databases, and services that can easily retrieve data from a database, which can be used to effectively classify meaningful data related to crime, among numerous data types in mobile devices. Mobile data consist of system data, application data, and multimedia data according to characteristics and format. We have developed a mobile forensic process that can effectively analyze information from installed applications and user behavior through these data. In particular, our tool can be useful for investigators because it can analyze the log files of all applications (apps) and analyze behavior based on timeline, geodata, and other characteristics. Our research can contribute to the study of mobile forensic support systems and suggest the direction of mobile data analysis tool development.

8.
Session-based recommendation (SBR) and multi-behavior recommendation (MBR) are both important problems and have attracted the attention of many researchers and practitioners. Different from SBR, which uses only a single type of behavior sequence, and MBR, which neglects sequential dynamics, heterogeneous SBR (HSBR) exploits different types of behavioral information (e.g., examinations like clicks or browses, purchases, adds-to-carts and adds-to-favorites) in sequences and is thus more consistent with real-world recommendation scenarios, but it is rarely studied. Early efforts towards HSBR focus on distinguishing different types of behaviors or exploiting homogeneous behavior transitions in a sequence with the same type of behaviors. However, all the existing solutions for HSBR fail to model the rich heterogeneous behavior transitions in an explicit way (e.g., in the form of graphs) and thus may fail to capture the semantic relations between different types of behaviors. This limitation hinders the development of HSBR and results in unsatisfactory performance. As a response, we propose a novel behavior-aware graph neural network (BGNN) for HSBR. Our BGNN adopts a dual-channel learning strategy for differentiated modeling of two different types of behavior sequences in a session. Moreover, our BGNN integrates the information of both homogeneous behavior transitions and heterogeneous behavior transitions in a unified way. We then conduct extensive empirical studies on three real-world datasets, and find that our BGNN outperforms the best baseline by 21.87%, 18.49%, and 37.16% on average, respectively. A series of further experiments and visualization studies demonstrate the rationality and effectiveness of our BGNN. An exploratory study on extending our BGNN to handle more than two types of behaviors shows that our BGNN can easily and effectively be extended to multi-behavior scenarios.

9.
Determining user geolocation from social media data is essential in various location-based applications: from improved transportation/supply management, through providing personalized services and targeted marketing, to better overall user experiences. Previous methods rely on the similarity of user posting content and neighboring nodes for user geolocation, and suffer from two problems: (1) position-agnostic network representation learning, which impedes their prediction accuracy; and (2) noisy and unstable user relation fusion due to the flat graph embedding methods employed. This work presents Hierarchical Graph Neural Networks (HGNN), a novel methodology for location-aware collaborative user-aspect data fusion and location prediction. It incorporates the geographical location information of users and the clustering effect of regions, and can capture topological relations while preserving their relative positions. By encoding the structure and features of regions with hierarchical graph learning, HGNN largely alleviates the problem of noisy and unstable signal fusion. We further design a relation mechanism to bridge connections between individual users and clusters, which not only leverages the information of isolated nodes that are useless in previous methods but also captures the relations between unlabeled nodes and labeled subgraphs. Furthermore, we introduce a robust statistics method to interpret the behavior of our model by identifying the importance of data samples when predicting the locations of the users. It provides meaningful explanations of the model's behaviors and outputs, overcoming the drawbacks of previous approaches that treat user geolocation as "black-box" modeling and lack interpretability. Comprehensive evaluations on real-world Twitter datasets verify the proposed model's superior performance and its ability to interpret the user geolocation results.

10.
With the rapid development of Internet technologies, security issues have always been a hot topic. Continuous identity authentication based on mouse behavior plays a crucial role in protecting computer systems, but there are still some problems to be solved. Aiming at the problems of low authentication accuracy and long authentication latency in existing mouse behavior authentication methods, a new continuous identity authentication method based on mouse behavior was proposed. The method divided the user's mouse event sequence into corresponding mouse behaviors according to their types, and mined mouse behavior characteristics from various aspects based on these behaviors. Thereby, the differences in the mouse behavior of different users can be better represented, and the authentication accuracy can be improved. Besides, the importance of mouse behavior features was obtained by the ReliefF algorithm, and on this basis, irrelevant or redundant mouse behavior features were removed by combining the neighborhood rough set to reduce model complexity and modeling time. Moreover, a binary classification algorithm was adopted to train the authentication model. During identity authentication, the authentication model was used to obtain a classification score for the mouse behavior collected each time, and the user's trust value was then updated in combination with the trust model. When the user's trust value fell below the threshold of the trust model, the user was judged to be an illegal user. The authentication effect of the proposed method was simulated on the Balabit and DFL datasets. The results show that, compared with the methods in other literature, this method not only improves the authentication accuracy and reduces the authentication latency, but also has a certain robustness against the illegal intrusion of external users. © 2022, Beijing Xintong Media Co., Ltd. All rights reserved.
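A toy illustration of the trust-update idea follows (the paper's actual trust model, thresholds, and update rules are not reproduced): each classification score nudges a trust value up or down, and the session is flagged once trust drops below a lockout threshold. Every constant here is an assumption chosen for the example.

```python
def update_trust(trust, score, reward=2.0, penalty=20.0,
                 score_threshold=0.5, floor=0.0, ceiling=100.0):
    """Nudge the trust value with one classification score in [0, 1].

    Scores above score_threshold (behavior looks like the legitimate user) raise trust;
    scores below it lower trust more aggressively. Purely illustrative constants.
    """
    if score >= score_threshold:
        trust += reward * (score - score_threshold)
    else:
        trust -= penalty * (score_threshold - score)
    return min(max(trust, floor), ceiling)

def authenticate(scores, initial_trust=90.0, lockout_threshold=50.0):
    trust = initial_trust
    for score in scores:
        trust = update_trust(trust, score)
        if trust < lockout_threshold:
            return False, trust          # judged an illegal user
    return True, trust

if __name__ == "__main__":
    # A session that starts normally, then shows consistently anomalous mouse behavior.
    print(authenticate([0.9, 0.8, 0.2, 0.1, 0.15, 0.1, 0.05, 0.1, 0.1, 0.1]))
```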

11.
12.
汤小月, 周康, 王凯. 《软件学报》, 2020, 31(4): 1189-1211
As an emerging social media user interaction service, the mention mechanism plays an important role in online user interaction and the diffusion of information across networks. Studying users' mention behavior can reveal the connection between users' implicit preferences and their explicit behaviors, providing new data support for applications such as information diffusion monitoring, business intelligence, and personalized recommendation. Current explorations of the mention mechanism focus mostly on its information diffusion properties and lack study of its user interaction properties from the perspective of ordinary users. By analyzing and modeling the mention behavior of ordinary users, we build a recommender system that generates target user recommendations for a given social media message. An analysis of a large real-world social media dataset shows that users' mention behavior is jointly influenced by the semantic and spatial context of their mention activities. Accordingly, we propose a joint probabilistic generative model, JUMBM (joint user mention behavior model), to simulate the generation of users' spatially correlated mention activities. By jointly modeling semantic- and spatial-context-aware mention behavior, JUMBM can simultaneously uncover users' movement patterns, region-dependent semantic interests, and the geographical clustering patterns of the corresponding target users. In addition, a hybrid pruning algorithm is proposed to speed up the recommender system's response to online top-k queries. Experimental results on large real-world datasets show that the proposed method outperforms the compared methods in both recommendation effectiveness and efficiency.

13.
Location prediction is a crucial need for location-aware services and applications. Given an object's recent movement and a future time, the goal of location prediction is to predict the location of the object at the specified future time. Different from traditional location prediction using motion functions, some research works have elaborated on mining movement behavior from historical trajectories for location prediction. Without loss of generality, given a set of trajectories of an object, prior works on mining movement behaviors will first extract regions of popularity, in which the object frequently appears, and then discover the sequential relationships among regions. However, the quality of the frequent regions extracted affects the accuracy of the location prediction. Furthermore, trajectory data has both spatial and temporal information. To further enhance the accuracy of location prediction, one could utilize not only spatial information but also temporal information to predict the locations of objects. In this paper, we propose a framework QS-STT (standing for QuadSection clustering and Spatial-Temporal Trajectory model) to capture the movement behaviors of objects for location prediction. Specifically, we have developed QuadSection clustering to extract a reasonable and near-optimal set of frequent regions. Then, based on the set of frequent regions, we propose a spatial-temporal trajectory model to explore the object's movement behavior as a probabilistic suffix tree with both spatial and temporal information of movements. Note that STT is not only able to discover sequential relationships among regions but also derives the corresponding probabilities of time, indicating when the object appears in each region. Based on STT, we further propose an algorithm that traverses STT for location prediction. By enhancing the quality of the frequent regions extracted and exploring both the spatial and temporal information of STT, the accuracy of location prediction in QS-STT is improved. QS-STT is designed for individual location prediction. To verify the effectiveness of QS-STT for location prediction under different spatial densities, we have conducted experiments on four types of real trajectory datasets with different speeds. The experimental results show that our proposed QS-STT is able to capture both spatial and temporal patterns of movement behaviors and that, by exploring QS-STT, our proposed prediction algorithm outperforms existing works.
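A much simpler stand-in for predicting the next region from recent movement is sketched below: an order-2 Markov model over region identifiers with a back-off to order-1 when the recent context was never observed. This is not QS-STT (there is no QuadSection clustering and no temporal probability component); region sequences are assumed to have been extracted already.

```python
from collections import Counter, defaultdict

class NextRegionPredictor:
    """Order-2 Markov predictor over region ids, with order-1 back-off."""

    def __init__(self):
        self.order2 = defaultdict(Counter)   # (r_{t-1}, r_t) -> Counter of r_{t+1}
        self.order1 = defaultdict(Counter)   # r_t -> Counter of r_{t+1}

    def fit(self, trajectories):
        for regions in trajectories:         # each trajectory: list of region ids
            for i in range(len(regions) - 1):
                self.order1[regions[i]][regions[i + 1]] += 1
                if i >= 1:
                    self.order2[(regions[i - 1], regions[i])][regions[i + 1]] += 1
        return self

    def predict(self, recent):
        if len(recent) >= 2 and tuple(recent[-2:]) in self.order2:
            counts = self.order2[tuple(recent[-2:])]
        elif recent and recent[-1] in self.order1:
            counts = self.order1[recent[-1]]
        else:
            return None
        return counts.most_common(1)[0][0]

if __name__ == "__main__":
    trajs = [["home", "road", "office"], ["home", "road", "office"], ["gym", "road", "home"]]
    model = NextRegionPredictor().fit(trajs)
    print(model.predict(["home", "road"]))   # -> "office"
```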

14.
Video visualization is a computation process that extracts meaningful information from original video data sets and conveys the extracted information to users in appropriate visual representations. This paper presents a broad treatment of the subject, following a typical research pipeline involving concept formulation, system development, a path-finding user study, and a field trial with real application data. In particular, we have conducted a fundamental study on the visualization of motion events in videos. We have, for the first time, deployed flow visualization techniques in video visualization. We have compared the effectiveness of different abstract visual representations of videos. We have conducted a user study to examine whether users are able to learn to recognize visual signatures of motions, and to assist in the evaluation of different visualization techniques. We have applied our understanding and the developed techniques to a set of application video clips. Our study has demonstrated that video visualization is both technically feasible and cost-effective. It has provided the first set of evidence confirming that ordinary users can be accustomed to the visual features depicted in video visualizations, and can learn to recognize visual signatures of a variety of motion events.

15.
16.
With the rapid development of location-based social networks (LBSNs), more and more media data are unceasingly uploaded by users. The asynchrony between the visual and textual information has made it extremely difficult to manage the multimodal information for manual annotation-free retrieval and personalized recommendation. Consequently, automated image semantic discovery from multimedia location-related user-generated contents (UGCs) has become mandatory for a good user experience. Most of the literature leverages single-modality data or correlated multimedia data for image semantic detection. However, the intrinsically heterogeneous UGCs in LBSNs are usually independent and uncorrelated, and it is hard to build correlations between textual information and visual information. In this paper, we propose a cross-domain semantic modeling method for the automatic annotation of visual information from social network platforms. First, we extract a set of hot topics from the collected textual information for image dataset preparation. Then the proposed noisy sample filtering is applied to remove low-relevance photos. Finally, we leverage cross-domain datasets to discover the common knowledge of each semantic concept from UGCs and boost the performance of semantic annotation by semantic transfer. Comparison experiments on cross-domain datasets were conducted to demonstrate the superiority of the proposed method.

17.
A contrast pattern is a set of items (itemset) whose frequency differs significantly between two classes of data. Such patterns describe distinguishing characteristics between datasets, are meaningful to human experts, have strong discriminating ability and can be used to build powerful classifiers. Incrementally mining such patterns is very important for evolving datasets, where transactions can be either inserted or deleted and mining needs to be repeated after changes occur. When the change is small, it is undesirable to carry out mining from scratch. Rather, the set of previously mined contrast patterns should be reused where possible to compute the new patterns. A primary example of evolving data is a data stream, where the data is a sequence of continuously arriving transactions (or itemsets). In this paper, we propose an efficient technique for incrementally mining contrast patterns. Our algorithm particularly aims to avoid redundant computation which might occur due to simultaneous transaction insertion and deletion, as is the case for data streams. In an experimental study using real and synthetic data streams, we show our algorithm can be substantially faster than the previous approach.
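To make the definition concrete, the brute-force sketch below enumerates small itemsets and keeps those whose support differs between two transaction sets by at least a threshold; unlike the incremental algorithm described above, it mines from scratch, and the size cap and threshold are assumptions.

```python
from itertools import combinations

def support(itemset, transactions):
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def contrast_patterns(class_a, class_b, min_diff=0.4, max_size=2):
    """Brute-force contrast patterns: itemsets whose support differs by >= min_diff."""
    items = sorted(set().union(*class_a, *class_b))
    patterns = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            diff = support(itemset, class_a) - support(itemset, class_b)
            if abs(diff) >= min_diff:
                patterns.append((itemset, round(diff, 3)))
    return sorted(patterns, key=lambda p: -abs(p[1]))

if __name__ == "__main__":
    class_a = [frozenset("abc"), frozenset("abd"), frozenset("ab")]
    class_b = [frozenset("cd"), frozenset("bcd"), frozenset("c")]
    print(contrast_patterns(class_a, class_b))
```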

18.
In this paper, we introduce a new algorithm for clustering and aggregating relational data (CARD). We assume that data is available in a relational form, where we only have information about the degrees to which pairs of objects in the data set are related. Moreover, we assume that the relational information is represented by multiple dissimilarity matrices. These matrices could have been generated using different sensors, features, or mappings. CARD is designed to aggregate pairwise distances from multiple relational matrices, partition the data into clusters, and learn a relevance weight for each matrix in each cluster simultaneously. The cluster dependent relevance weights offer two advantages. First, they guide the clustering process to partition the data set into more meaningful clusters. Second, they can be used in subsequent steps of a learning system to improve its learning behavior. The performance of the proposed algorithm is illustrated by using it to categorize a collection of 500 color images. We represent the pairwise image dissimilarities by six different relational matrices that encode color, texture, and structure information.

19.
《Advanced Robotics》, 2013, 27(4): 317-333
The purpose of this study is to improve the locomotion performance of autonomous mobile robots in outdoor environments. In this paper, improvement of an environment model is called empirical locomotion performance learning. The system avoids wasting time on observations and actions by analyzing data from the last run. We propose a method of empirical learning, expressed as rewriting rules applied to the trajectory data. Brief route information for navigating a robot is represented with motion directions at intersections and metric distances between intersections. The behavior of our robot is based on a locomotion strategy of 'sign pattern-based stereotyped motion'. The behaviors are implemented on our mobile robot HARUNOBU-4 and tested on our university campus. Experimental results show the robustness of our proposed behaviors in dynamic environments with obstacles. Furthermore, they show that our proposed rewriting rules improved the locomotion performance. In particular, searching time was shortened by 87% (from 453 to 61 s) and the travel distance was shortened by 10% (from 173.8 to 157.5 m).

20.
Recently, as damage caused by Internet threats has increased significantly, one of the major challenges is to accurately predict the period and severity of threats. In this study, a novel probabilistic approach is proposed to effectively forecast and detect network intrusions. It uses a Markov chain for probabilistic modeling of abnormal events in network systems. First, to define the network states, we perform K-means clustering, and then we introduce the concept of an outlier factor. Based on the defined states, the degree of abnormality of the incoming data is stochastically measured in real time. The performance of the proposed approach is evaluated through experiments using the well-known DARPA 2000 data set and further analyses. The proposed approach achieves high detection performance while representing the level of attacks in stages. In particular, our approach is shown to be very robust to training data sets and the number of states in the Markov model.
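A condensed sketch of that two-step idea: k-means maps feature vectors to discrete states, a transition matrix is estimated from the training state sequence, and incoming transitions with low estimated probability are flagged. The smoothing constant, cluster count, and synthetic data are placeholders, and the outlier-factor refinement from the paper is omitted.

```python
import numpy as np
from sklearn.cluster import KMeans

class MarkovAnomalyDetector:
    def __init__(self, n_states=8, alpha=1.0):
        self.kmeans = KMeans(n_clusters=n_states, n_init=10, random_state=0)
        self.n_states = n_states
        self.alpha = alpha                       # Laplace smoothing for unseen transitions

    def fit(self, X_train):
        states = self.kmeans.fit_predict(X_train)
        counts = np.full((self.n_states, self.n_states), self.alpha)
        for a, b in zip(states[:-1], states[1:]):
            counts[a, b] += 1
        self.P = counts / counts.sum(axis=1, keepdims=True)
        return self

    def score(self, X):
        """Negative log transition probability per consecutive pair (higher = more abnormal)."""
        states = self.kmeans.predict(X)
        return np.array([-np.log(self.P[a, b]) for a, b in zip(states[:-1], states[1:])])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    normal = rng.normal(0, 1, size=(1000, 4))              # "normal" training traffic features
    det = MarkovAnomalyDetector().fit(normal)
    burst = np.vstack([rng.normal(0, 1, size=(20, 4)),      # normal segment
                       rng.normal(8, 1, size=(5, 4))])      # abnormal burst
    print(det.score(burst).round(2))
```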
