Similar literature
1.
The growing popularity of graph databases has generated interesting data management problems, such as subgraph search, shortest path query, reachability verification, and pattern matching. Among these, a pattern match query is more flexible than a subgraph search and more informative than a shortest path or reachability query. In this paper, we address distance-based pattern match queries over a large data graph G. Because the search space is huge, we adopt a filter-and-refine framework to answer a pattern match query over a large graph. We first find a set of candidate matches by a graph embedding technique and then evaluate these to find the exact matches. Extensive experiments confirm the superiority of our method.

2.
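Sliding-window-based outlier detection is an important topic in data stream mining. In many applications, data streams are transmitted over a distributed network, and distributed computing techniques are commonly used to obtain real-time, high-quality results. This paper formally states the problem of continuous outlier detection over distributed evolving data streams, proposes two definitions of outliers based on kernel density estimation together with the corresponding detection algorithms, and shows through extensive experiments on real datasets that the algorithms are efficient and scalable and meet the needs of data stream applications.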
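As a rough illustration of kernel-density-based outlier detection over a sliding window, the sketch below flags a point when its estimated density, computed from the most recent points, falls under a threshold. It is a single-node simplification, not the distributed algorithms of the paper; the Gaussian kernel, the function names, and the parameters `window_size`, `bandwidth`, and `density_threshold` are all assumptions made for illustration.

```python
import math
from collections import deque


def gaussian_kde_density(x, window, bandwidth):
    """Estimate the density at x from the points in `window` (Gaussian kernel)."""
    if not window:
        return 0.0
    norm = 1.0 / (len(window) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in window)


def detect_outliers(stream, window_size=50, bandwidth=1.0, density_threshold=0.01):
    """Return (index, value) pairs whose density w.r.t. the previous window is low.

    Hypothetical helper: points are only tested once the window is full, and
    every point (outlier or not) then slides into the window.
    """
    window = deque(maxlen=window_size)
    outliers = []
    for i, x in enumerate(stream):
        if len(window) == window_size:
            if gaussian_kde_density(x, window, bandwidth) < density_threshold:
                outliers.append((i, x))
        window.append(x)
    return outliers


if __name__ == "__main__":
    stream = [0.0, 0.1, -0.1, 0.05] * 5 + [10.0]
    print(detect_outliers(stream, window_size=10))
```

In a real stream setting the bandwidth and threshold would have to be tuned to the data, and the full kernel sum would likely be replaced by a compact summary of the window.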

3.
In several emerging and important applications, such as location-based services, sensor monitoring and biological databases, the values of the data items are inherently imprecise. A useful query class for these data is the Probabilistic Nearest-Neighbor Query (PNN), which returns the IDs of the objects that may be the closest neighbor of a query point, together with their probability values. Previous studies showed that this query takes a long time to evaluate. To address this problem, we propose the Constrained Nearest-Neighbor Query (C-PNN), which returns the IDs of objects whose probabilities are higher than some threshold, with a given error bound in the answers. We show that the C-PNN can be answered efficiently with verifiers. These are methods that derive the lower and upper bounds of answer probabilities, so that it can be quickly decided whether an object should be included in the answer. We design five verifiers, which can be used on uncertain data with arbitrary probability density functions. We further develop a partial evaluation technique, so that a user can obtain some answers quickly, without waiting for the whole query evaluation process to be completed (which may incur a high response time). In addition, we examine the maintenance of a long-standing, or continuous, C-PNN query. This query requires any update to be applied to the result immediately, in order to reflect the changes to the database values (e.g., due to the change of the location of a moving object). We design an incremental update method based on previous query answers, in order to reduce the I/O and CPU cost of maintaining the correctness of the answers to such a query. Performance evaluation on realistic datasets shows that our methods are capable of yielding timely and accurate results.
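The role of a verifier can be sketched as a decision on a pair of probability bounds. The snippet below is one plausible reading of the abstract, not the paper's actual verifiers: the function `classify`, its labels, and the exact acceptance rule (accept when the lower bound reaches the threshold, or when the bounds are within the tolerance of each other and the upper bound reaches the threshold) are assumptions.

```python
def classify(lower, upper, threshold, tolerance):
    """Decide an object's status from bounds on its answer probability.

    lower, upper: bounds with 0 <= lower <= upper <= 1 (as a verifier might derive)
    threshold:    minimum probability for an object to be in the answer
    tolerance:    acceptable gap between the bounds (assumed error bound)
    """
    if not 0.0 <= lower <= upper <= 1.0:
        raise ValueError("expected 0 <= lower <= upper <= 1")
    if upper < threshold:
        return "reject"    # cannot reach the threshold
    if lower >= threshold or upper - lower <= tolerance:
        return "accept"    # certainly, or acceptably close to, qualifying
    return "unknown"       # bounds too loose; needs a tighter verifier


if __name__ == "__main__":
    for bounds in [(0.5, 0.7), (0.1, 0.2), (0.3, 0.6)]:
        print(bounds, classify(*bounds, threshold=0.4, tolerance=0.05))
```

Objects left as "unknown" are the ones that would be passed to a tighter (and presumably costlier) verifier or to exact evaluation.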

4.
Recently, uncertain graph data management and mining techniques have attracted significant interest and research effort due to potential applications such as protein interaction networks and social networks. As a fundamental problem, subgraph similarity all-matching is widely applied in exploratory data analysis. The purpose of subgraph similarity all-matching is to find all similar occurrences of the query graph in a large data graph. Numerous algorithms and pruning methods have been developed for the subgraph matching problem over a certain graph. However, little effort has been devoted to subgraph similarity all-matching over an uncertain data graph, which is quite challenging due to high computation costs. In this paper, we define the problem of subgraph similarity maximal all-matching over a large uncertain data graph and propose a framework to solve this problem. To further improve efficiency, several speed-up techniques are proposed, such as partial graph evaluation, vertex pruning, calculation model transformation, incremental evaluation and probability upper bound filtering. Finally, comprehensive experiments are conducted on real graph data to test the performance of our framework and optimization methods. The results verify that our solutions can outperform the basic approach by orders of magnitude in efficiency.

5.
6.
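Guo Li, Wang Kun. 《计算机应用》 (Journal of Computer Applications), 2006, 26(11): 2615-2617
This paper introduces the concept of staged learning; learning in stages distinguishes the completeness and incompleteness of patterns. Based on this concept, a new continuous learning algorithm for anomaly detection patterns (PADPL) is proposed. Simulation results show that PADPL can meet the requirements of continuous learning of anomaly detection patterns that arise from incompleteness.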

7.
Anomaly detection holds great potential for detecting previously unknown attacks. In order to be effective in a practical environment, anomaly detection systems have to be capable of online learning and handling concept drift. In this paper, a new adaptive anomaly detection framework, based on the use of unsupervised evolving connectionist systems, is proposed to address these issues. It is designed to adapt to normal behavior changes while still recognizing anomalies. The evolving connectionist systems learn a subject's behavior in an online, adaptive fashion through efficient local element tuning. Experiments with the KDD Cup 1999 network data and the Windows NT user profiling data show that our adaptive anomaly detection systems, based on Fuzzy Adaptive Resonance Theory (ART) and Evolving Fuzzy Neural Networks (EFuNN), can significantly reduce the false alarm rate while the attack detection rate remains high.

8.
In this paper, we study several physically feasible quantum secret sharing (QSS) schemes using continuous variable graph states (CVGS). Their implementation protocols are given, and the estimation error formulae are derived. Then, we present a variety of results on the theory of QSS with CVGS. Any $(k,n)$ threshold protocol of the three specific schemes satisfying $\frac{n}{2}<k\le n$, where $n$ denotes the total number of players and $k$ denotes the minimum number of players who can collaboratively access the secret, can be implemented by certain weighted CVGS. The quantum secret is absolutely confidential to any group of players smaller than the threshold. In addition, the effect of finite squeezing on these results is properly considered. Finally, the duality between two specific schemes is investigated.

9.
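Community detection and partitioning has become a critical problem in large-scale social networks. However, most existing algorithms are limited by computational cost, which severely restricts their applicability. To improve both partition quality and computational efficiency, a community detection algorithm based on unweighted graphs is proposed. First, the algorithm uses two new parameters to measure communities and perform detection, namely the clustering coefficient and common-neighbor similarity, and their effectiveness is demonstrated through theoretical analysis and formula derivation. Finally, extensive simulations on real social network datasets show that, compared with the traditional spanning tree algorithm and the CBCD algorithm, the proposed method is more effective and its running time has linear complexity, making it suitable for community detection in large-scale social networks.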
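The two measures named in this abstract can be sketched on an unweighted, undirected graph stored as adjacency sets. The abstract does not give the paper's exact formulas, so the definitions below are assumptions: the standard local clustering coefficient, and a Jaccard-style ratio for common-neighbor similarity. Function names and the toy graph are hypothetical.

```python
def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbors that are themselves connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    # Count each neighbor pair once (a < b) and check whether it is an edge.
    links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
    return 2.0 * links / (k * (k - 1))


def common_neighbor_similarity(adj, u, v):
    """Jaccard ratio of shared neighbors (assumed definition, not the paper's)."""
    union = adj[u] | adj[v]
    if not union:
        return 0.0
    return len(adj[u] & adj[v]) / len(union)


if __name__ == "__main__":
    # Toy graph: triangle 0-1-2 plus a pendant edge 2-3.
    adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(clustering_coefficient(adj, 2))
    print(common_neighbor_similarity(adj, 0, 1))
```

Both measures only inspect a vertex's immediate neighborhood, which is consistent with the low computational cost the abstract emphasizes, although the linear-time claim itself concerns the paper's full algorithm.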

10.
Snapshot Isolation (SI) is a multiversion concurrency control that has been implemented by several open source and commercial database systems (Oracle, Microsoft SQL Server, and previous releases of PostgreSQL). The main feature of SI is that a read operation does not block a write operation and vice versa, which allows a higher degree of concurrency than traditional two-phase locking. SI prevents many anomalies that appear in other isolation levels, but it can still result in non-serializable executions, in which database integrity constraints can be violated. Several techniques are known to modify the application code based on preanalysis, in order to ensure that every execution is serializable on engines running SI. We introduce a new technique called External Lock Manager (ELM). Applying such a technique requires choosing which pairs of transactions should have conflicts introduced. We measure the performance impact of the available choices, among techniques and conflicts.

11.
Constructions of woven graph codes based on constituent convolutional codes are studied, and examples of woven convolutional graph codes are presented. Existence of codes satisfying the Costello lower bound on the free distance within a random ensemble of woven graph codes based on s-partite, s-uniform hypergraphs is shown, where s depends only on the code rate. Simulation results for Viterbi decoding of woven graph codes are presented and discussed.

12.
This paper proposes a novel subspace approach towards identification of optimal residual models for process fault detection and isolation (PFDI) in a multivariate continuous-time system. We formulate the problem in terms of the state space model of the continuous-time system. The motivation for such a formulation is that the fault gain matrix, which links the process faults to the state variables of the system under consideration, is always available no matter how the faults vary with time. However, in the discrete-time state space model, the fault gain matrix is only available when the faults follow some known function of time within each sampling interval. To isolate faults, the fault gain matrix is essential. We develop subspace algorithms in the continuous-time domain to directly identify the residual models from sampled noisy data without separate identification of the system matrices. Furthermore, the proposed approach can also be extended towards the identification of the system matrices if they are needed. The newly proposed approach is applied to a simulated four-tank system, where a small leak from any tank is successfully detected and isolated. For comparison, we also apply the discrete-time residual models to the tank system for detection and isolation of leaks. It is demonstrated that the continuous-time PFDI approach is practical and performs better than the discrete-time PFDI approach.

13.
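Existing fake information detection methods are mainly based on single-modality data analysis and ignore the correlations between pieces of information. To address this, a multimodal fake information detection model that incorporates social network graphs is proposed. The model uses a pre-trained Transformer model and an image captioning model to extract the semantics of each modality from multiple perspectives, fuses the social network graph formed during information propagation to add propagation features to the text and image modalities, and finally uses a cross-modal attention mechanism to assign weights to the modalities for detection. In comparative experiments on two real datasets, Twitter and Weibo, the detection accuracy of the proposed model is stable at about 88%, higher than existing baseline models such as EANN and PTCA. The results show that the model can effectively fuse multimodal information and thereby improve the accuracy of fake information detection.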

14.
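For rumor detection, existing methods cannot effectively express the heterogeneous graph structure of rumor propagation in social networks, and they do not introduce external knowledge as a means of verifying content. Therefore, a graph convolutional network rumor detection method that incorporates knowledge representation is proposed, in which a knowledge graph serves as additional prior knowledge to help verify the authenticity of content. After text representations are obtained with a pre-trained word embedding model and a knowledge graph embedding model, a graph convolutional network is fused in, so that features can be better extracted from the topological graph of rumor propagation to improve the precision of rumor detection. Experimental results show that the model detects rumors in social networks more effectively. Compared with the baseline models, precision reaches 96.1% on the Weibo dataset, and the F1 scores on the Twitter15 and Twitter16 datasets improve by 3.1% and 3.3%, respectively. Ablation experiments also show that each component of the method clearly improves rumor detection, verifying the effectiveness and advancement of the model.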

15.
The Multiple Time Bucket Join (MTB-join) algorithm is the state of the art for processing the continuous intersection join (CI-join) query over moving objects. It considerably outperforms alternatives, but still falls short of real-time application performance requirements for large sets of moving objects. In this paper, we achieve real-time performance for the CI-join query over large sets of moving objects by exploiting the computational power of commodity graphics processing units (GPUs). We first analyze how the main characteristics of the MTB-join algorithm make it ill suited to GPUs and identify key challenges in designing efficient GPU-based algorithms for the query. We then address these challenges by developing the multi-layered grid join (MLG-join) algorithm which has the following key features: (i) memory locality friendly indexing, (ii) no dynamic memory allocation, (iii) in-place object updates, (iv) lock-free concurrent updates, and (v) massive parallelism. These features unleash the full potential of the memory bandwidth and parallel processing of GPUs. Furthermore, we conduct a theoretical analysis which can predict the pruning power of the MLG-join algorithm given certain parameter values used in the algorithm. This allows us to select optimal parameter values. Through extensive experimental results, we show that our analysis accurately models the MLG-join algorithm’s sensitivity to parameter values. The proposed MLG-join algorithm outperforms the MTB-join algorithm, and a GPU-based nested-loops join algorithm, by up to two orders of magnitude, and achieves real-time performance for CI-join queries on large sets of moving objects.

16.
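With the wider adoption of domestically produced computers, software originally developed for the X86 platform often needs to be adapted to domestic platforms, with the requirement that functionality and performance do not degrade after adaptation. Taking applications that render large volumes of images in real time as an example, performance is a frequent difficulty when adapting to domestic platforms. Using mainstream domestic software and hardware platforms as the baseline, this article demonstrates through comparative experiments the key technical points and solutions for performance optimization when adapting QtOpenGL-based real-time rendering software to domestic platforms. Six practical approaches to display performance optimization are proposed; these results can serve as a reference for optimizing the display performance of QtOpenGL-based software on domestic platforms.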

17.
Rumor detection has become an emerging and active research field in recent years. The core task is to model the rumor characteristics inherent in rich information, such as propagation patterns in the social network and semantic patterns in post content, and to differentiate them from the truth. However, existing works on rumor detection fall short in modeling heterogeneous information, either using a single information source only (e.g., social network or post content) or ignoring the relations among multiple sources (e.g., fusing social and content features via simple concatenation). Therefore, they may have drawbacks in comprehensively understanding rumors and detecting them accurately. In this work, we explore contrastive self-supervised learning on heterogeneous information sources, so as to reveal their relations and characterize rumors better. Technically, we supplement the main supervised task of detection with an auxiliary self-supervised task, which enriches post representations via post self-discrimination. Specifically, given two heterogeneous views of a post (i.e., representations encoding social patterns and semantic patterns), the discrimination is done by maximizing the mutual information between different views of the same post compared to that of other posts. We devise cluster-wise and instance-wise approaches to generate the views and conduct the discrimination, considering different relations of information sources. We term this framework self-supervised rumor detection (SRD). Extensive experiments on three real-world datasets validate the effectiveness of SRD for automatic rumor detection on social media.
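The self-discrimination idea, pulling together two views of the same post while pushing apart views of different posts, is commonly realized with an InfoNCE-style contrastive loss. The sketch below uses that loss as an assumed stand-in; the abstract does not state which objective SRD optimizes. The function name `info_nce`, the cosine similarity, and the `temperature` value are hypothetical, and plain lists stand in for learned embeddings.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    # Assumes a non-zero vector.
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]


def info_nce(social_views, semantic_views, temperature=0.5):
    """Mean InfoNCE loss; row i of each list holds the two views of post i."""
    s = [_normalize(v) for v in social_views]
    t = [_normalize(v) for v in semantic_views]
    total = 0.0
    for i, anchor in enumerate(s):
        # Cosine similarity of post i's social view to every semantic view.
        logits = [_dot(anchor, other) / temperature for other in t]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]  # equals -log softmax(logits)[i]
    return total / len(s)


if __name__ == "__main__":
    views = [[1.0, 0.0], [0.0, 1.0]]
    print(info_nce(views, views))        # matching views give a low loss
    print(info_nce(views, views[::-1]))  # mismatched views give a higher loss
```

Minimizing this quantity is a widely used surrogate for maximizing a lower bound on the mutual information between the two views, which is why it is a natural (though here assumed) fit for the description above.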

18.
We describe a construction of explicit affine extractors over large finite fields with exponentially small error and linear output length. Our construction relies on a deep theorem of Deligne giving tight estimates for exponential sums over smooth varieties in high dimensions.

19.
This paper presents a delay-tolerant mix-zone framework for protecting the location privacy of mobile users against continuous query correlation attacks. First, we describe and analyze the continuous query correlation attacks (CQ-attacks) that perform query-correlation-based inference to break the anonymity of road network-aware mix-zones. We formally study the privacy strengths of the mix-zone anonymization under the CQ-attack model and argue that spatial cloaking or temporal cloaking over road network mix-zones is ineffective and susceptible to attacks that carry out inference by combining query correlation with timing correlation (CQ-timing attack) and transition correlation (CQ-transition attack) information. Next, we introduce three types of delay-tolerant road network mix-zones (i.e., temporal, spatial and spatio-temporal) that are free from CQ-timing and CQ-transition attacks and, in contrast to conventional mix-zones, perform a combination of both location mixing and identity mixing of spatially and temporally perturbed user locations to achieve stronger anonymity under the CQ-attack model. We show that by combining temporal and spatial delay-tolerant mix-zones, we can obtain the strongest anonymity for continuous queries while making an acceptable tradeoff between anonymous query processing cost and the temporal delay incurred in anonymous query processing. We evaluate the proposed techniques through extensive experiments conducted on realistic traces produced by GTMobiSim on different scales of geographic maps. Our experiments show that the proposed techniques offer a high level of anonymity and attack resilience to continuous queries.

20.
Context: Inconsistency detection and resolution is critical for context-aware applications to ensure their normal execution. Contexts, which refer to pieces of environmental information used by applications, are checked against consistency constraints for potential errors. However, not all detected inconsistencies are caused by real context problems. Instead, they might be triggered by improper checking timing. Such inconsistencies are ephemeral and usually harmless. Their detection and resolution is unnecessary, and may even be detrimental. We name them inconsistency hazards. Objective: Inconsistency hazards should be prevented from being detected or resolved, but this is not straightforward since their occurrences resemble real inconsistencies. In this article, we present SHAP, a pattern-learning-based approach to suppressing the detection of such hazards automatically. Method: Our key insight is that detection of inconsistency hazards is subject to certain patterns of context changes. Although such patterns can be difficult to specify manually, they may be learned effectively with data mining techniques. With these patterns, we can reasonably schedule inconsistency detections. Results: The experimental results show that SHAP can effectively suppress the detection of most inconsistency hazards (over 90%) with negligible overhead. Conclusions: Compared with other approaches, our approach can effectively suppress the detection of inconsistency hazards, and at the same time allow real inconsistencies to be detected and resolved in a timely manner.
