Similar documents
 Found 20 similar documents; search took 24 ms
1.
Many malpractices in stock market trading (e.g., circular trading and price manipulation) use the modus operandi of collusion. Informally, a set of traders is a candidate collusion set when they have “heavy trading” among themselves, as compared to their trading with others. We formalize the problem of detecting collusion sets, if any, in a given trading database. We show that naïve approaches are inefficient for real-life situations. We adapt and apply two well-known graph clustering algorithms to this problem. We also propose a new graph clustering algorithm, specifically tailored for detecting collusion sets. A novel feature of our approach is the use of the Dempster–Shafer theory of evidence to combine the candidate collusion sets detected by individual algorithms. Treating individual experiments as evidence, this approach allows us to quantify the confidence (or belief) in the candidate collusion sets. We present detailed simulation experiments to demonstrate the effectiveness of the proposed algorithms.
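
The evidence-combination step mentioned in this abstract can be illustrated with Dempster's rule of combination. The following is a minimal sketch, not the authors' implementation: the abstract does not say how masses are assigned, so the two-hypothesis frame, the function name `combine_masses`, and all numbers are hypothetical.

```python
from itertools import product


def combine_masses(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to a mass, and the
    masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence: the product supports the intersection.
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Contradictory evidence is set aside and normalized away.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}


# Hypothetical frame: is a candidate trader set collusive or benign?
COLLUDE = frozenset({"collusion"})
BENIGN = frozenset({"benign"})
EITHER = COLLUDE | BENIGN  # mass left uncommitted

# Hypothetical evidence from two clustering algorithms.
evidence_a = {COLLUDE: 0.6, EITHER: 0.4}
evidence_b = {COLLUDE: 0.7, EITHER: 0.3}
belief = combine_masses(evidence_a, evidence_b)
```

With these assumed inputs the two sources never contradict each other, so the combined mass on the collusion hypothesis is 0.42 + 0.18 + 0.28 = 0.88 and the remaining 0.12 stays uncommitted.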

2.
3.
Fast agglomerative clustering using a k-nearest neighbor graph   Total citations: 1, self-citations: 0, citations by others: 1  
We propose a fast agglomerative clustering method using an approximate nearest neighbor graph for reducing the number of distance calculations. The time complexity of the algorithm is improved from O(τN²) to O(τN log N) at the cost of a slight increase in distortion; here, τ denotes the number of nearest neighbor updates required at each iteration. According to the experiments, a relatively small neighborhood size is sufficient to maintain the quality close to that of the full search.

4.
Multimedia Tools and Applications - A social graph is a common way to publish a social network, but such publication poses privacy risks. In this paper, we use attributed social graph as a graph...

5.

Embedded real-time systems generate state sequences where time elapses between state changes. Ensuring that such systems adhere to a provided specification of admissible or desired behavior is essential. Formal model-based testing is often a suitable cost-effective approach. We introduce an extended version of the formalism of symbolic graphs, which encompasses types as well as attributes, for representing states of dynamic systems. Relying on this extension of symbolic graphs, we present a novel formalism of timed graph transformation systems (TGTSs) that supports the model-based development of dynamic real-time systems at an abstract level where possible state changes and delays are specified by graph transformation rules. We then introduce an extended form of the metric temporal graph logic (MTGL) with increased expressiveness to improve the applicability of MTGL for the specification of timed graph sequences generated by a TGTS. Based on the metric temporal operators of MTGL and its built-in graph binding mechanics, we express properties on the structure and attributes of graphs as well as on the occurrence of graphs over time that are related by their inner structure. We provide formal support for checking whether a single generated timed graph sequence adheres to a provided MTGL specification. Relying on this logical foundation, we develop a testing framework for TGTSs that are specified using MTGL. Lastly, we apply this testing framework to a running example by using our prototypical implementation in the tool AutoGraph.


6.
In this paper, we are interested in the problem of graph clustering. We propose a new algorithm for computing the median of a set of graphs. The concept of median allows the extension of conventional algorithms such as the k-means to graph clustering, helping to bridge the gap between statistical and structural approaches to pattern recognition. Experimental results show the efficiency of the new median graph algorithm compared to the (only) existing algorithm in the literature. We also show its effective use in clustering a set of random graphs and in a content-based synthetic image retrieval system.
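
To make the notion of a median of a set of graphs concrete, here is a minimal sketch of the simpler set median: the member of the set with the smallest total distance to all the others. It is not the algorithm proposed in the paper. The edge-set representation and the `edge_distance` function (size of the symmetric difference of edge sets) are assumptions chosen to keep the example short; a real system would more likely use a graph edit distance.

```python
def edge_distance(g1, g2):
    """Stand-in graph distance: number of edges present in only one graph.

    Graphs are frozensets of (u, v) edges over a shared set of node labels.
    """
    return len(g1 ^ g2)


def set_median(graphs, dist=edge_distance):
    """Return the graph in `graphs` that minimizes the sum of distances."""
    return min(graphs, key=lambda g: sum(dist(g, other) for other in graphs))


# Three hypothetical graphs on nodes 1..4.
graphs = [
    frozenset({(1, 2), (2, 3)}),
    frozenset({(1, 2), (2, 3), (3, 4)}),
    frozenset({(1, 2)}),
]
median = set_median(graphs)
```

In a k-means style clustering of graphs, a median of this kind would play the role that the mean plays for vectors: each cluster is represented by its median, and graphs are reassigned to the nearest representative.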

7.
8.
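Entity disambiguation is a key supporting technology for applications such as knowledge base construction and information retrieval, and plays an important role in natural language processing. In short texts, however, traditional disambiguation methods that model an entity's context features can rarely extract enough features for disambiguation. To suit the characteristics of short texts, a graph-model disambiguation method for Chinese short texts based on entity topic relations is proposed. First, the TextRank algorithm is used to infer topics from a corpus built from knowledge base information, and the inferred topics serve as the representation of relations between entities. Next, a disambiguation network graph is built for the text to be disambiguated, incorporating the disambiguation scores given by a BERT-based semantic matching model. Finally, the disambiguation result is obtained by search and ranking. The method is evaluated on the dataset provided by the CCKS2020 short-text entity linking task. The experimental results show that it outperforms other methods on short-text entity disambiguation and effectively handles Chinese short-text entity disambiguation when knowledge base entity relations are lacking.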

9.
This paper introduces an approach to analyzing temporal patterns of knowledge construction (KC) in online discussions, including consequences of role assignments. The paper illustrates the power of this approach for illuminating collaborative processes using data from a semester-long series of discussions in which 21 university students were assigned weekly roles. The KC contributions of all 252 posts in the discussion were coded using a five-phase scheme (Gunawardena et al. 1997). Then, statistical discourse analysis was applied to identify segments of discussion characterized by particular aspects of KC, and “pivotal posts” (those posts which initiated new segments of discussion). Finally, the influences of assigned student roles on pivotal posts and KC were modeled. The results indicate that most online discussions had a single pivotal post separating the discussion into two distinct segments: the first dominated by a lower KC phase; the second dominated by a higher KC phase. This provides empirical evidence supporting the progressive nature of the KC process, but not the necessity of the full five-phase sequence. The pivotal posts that initiated later segments were often contributed mid-discussion by students playing one of two summarizing roles (Synthesizer and Wrapper). This suggests that assigning a summarizing role mid-discussion can aid group progress to more advanced phases of KC. Finally, in some discussion segments, the KC phase of a post was related to characteristics of the two preceding posts. Collectively, the results demonstrate the power of this temporal approach for investigating interdependencies in collaborative KC in online discussions.

10.
Surveillance of epidemic outbreaks and spread from social media is an important tool for governments and public health authorities. Machine learning techniques for nowcasting the Flu have made significant inroads into correlating social media trends to case counts and prevalence of epidemics in a population. There is a disconnect between data-driven methods for forecasting Flu incidence and epidemiological models that adopt a state-based understanding of transitions, which can lead to sub-optimal predictions. Furthermore, models of epidemiological activity and models of social activity, such as activity on Twitter, predict different shapes and have important differences. In this paper, we propose two temporal topic models (one unsupervised model as well as one improved weakly-supervised model) to capture hidden states of a user from their tweets and aggregate states in a geographical region for better estimation of trends. We show that our approaches help fill the gap between phenomenological methods for disease surveillance and epidemiological models. We validate our approaches by modeling the Flu using Twitter in multiple countries of South America. We demonstrate that our models can consistently outperform plain vocabulary assessment in Flu case-count predictions, and at the same time get better Flu-peak predictions than competitors. We also show that our fine-grained modeling can reconcile some contrasting behaviors between epidemiological and social models.

11.
Scientific research is conducted by researchers all over the world in every research field, which makes it hard to track and quantify. Although many studies have focused on scientific community discovery and researcher profiling, it is still a big challenge to track research patterns and assess research development for an individual researcher or a research group over time. In this study, we seek to model researchers’ scientific activities and quantify their outcomes over their research careers. A temporal tracking and assessing model is introduced to represent the research development and quantify the scientific outcome of both an individual and a group over time. Based on our model, a research topic analysis approach is developed to extract the topics covered by a research group for research pattern analysis. Furthermore, a latent research pattern discovery approach is proposed to discover and visualize the research of a group as contributed by its members. The effectiveness of our approach is evaluated on a real academic dataset.

12.
A multiresolution color image segmentation approach is presented that incorporates the main principles of region-based segmentation and cluster-analysis approaches. The contribution of this work may be divided into two parts. In the first part, a multiscale dissimilarity measure is proposed that makes use of a feature transformation operation to measure the interregion relations with respect to their proximity to the main clusters of the image. As a part of this process, an original approach is also presented to generate a multiscale representation of the image information using nonparametric clustering. In the second part, a graph theoretic algorithm is proposed to synthesize regions and produce the final segmentation results. The latter algorithm emerged from a brief analysis of fuzzy similarity relations in the context of clustering algorithms. This analysis indicates that the segmentation methods in general may be formulated sufficiently and concisely by means of similarity relations theory. The proposed scheme produces satisfying results and its efficiency is indicated by comparing it with: 1) the single scale version of dissimilarity measure and 2) several earlier graph theoretic merging approaches proposed in the literature. Finally, the multiscale processing and region-synthesis properties validate our method for applications, such as object recognition, image retrieval, and emulation of human visual perception.

13.
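Traversal-pattern data mining methods have been proposed for many applications, but traditional traversal pattern mining considers only unweighted traversals. To address weighted traversal pattern mining, a transformation model from an EWDG (edge-weighted directed graph) to a VWDG (vertex-weighted directed graph) is first proposed. Based on this model, LGTWFPMiner (local graph traversal weighted frequent pattern miner) is proposed for mining weighted frequent patterns in local graph traversals with hierarchical properties, together with a local evaluation method for its support/weight bounds. Experimental results on synthetic data show that the algorithm can effectively mine weighted frequent patterns based on graph traversal.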

14.
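A method based on spatial pattern clustering is adopted. It treats each pixel in the image as a pattern, and each pattern captures both the spatial information and the color information of the pixel it represents. Clustering pixels thus becomes clustering patterns, and the clustering process takes into account the three color components of the color image space. Experiments show that the method segments some color images by clustering reasonably well.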

15.
Mining semantic relations between concepts underlies many fundamental tasks including natural language processing, web mining, information retrieval, and web search. In order to describe the semantic relation between concepts, this paper studies the problem of automatically generating a spatial temporal relation graph (STRG) of the semantic relation between concepts. The STRG includes relation words, relation sentences, relation factor, relation graph, faceted feature, temporal feature, and spatial feature. The proposed method generates the STRG automatically, unlike manually built annotation repositories such as WordNet and Wikipedia. Moreover, the proposed method does not need any prior knowledge such as an ontology or a hierarchical knowledge base such as WordNet. Empirical experiments on a real dataset show that the proposed algorithm is effective and accurate.

16.
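Zhang Wanshan  Xiao Yao  Liang Junjie  Yu Dunhui 《Journal of Computer Applications》(计算机应用) 2014,34(11):3144-3146
To address the low accuracy of traditional Web text clustering algorithms on multi-topic Web texts, which results from ignoring the topic information of Web texts, a topic-based Web text clustering method is proposed. The method clusters multi-topic Web texts in three steps: topic extraction, feature extraction, and text clustering. Compared with traditional Web text clustering algorithms, the proposed method takes full account of the topic information of Web texts. Experimental results show that, for clustering multi-topic Web texts, the proposed method achieves better accuracy than the K-means-based text clustering method and the HowNet-based text clustering method.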

17.
Cao  Yi  Liu  Chen  Huang  Zilong  Sheng  Yongjian  Ju  Yongjian 《Multimedia Tools and Applications》2021,80(19):29139-29162

Skeleton-based action recognition has recently attracted much attention because skeletons can robustly convey action information. Many recent studies have shown that graph convolutional networks (GCNs), which generalize CNNs to more generic non-Euclidean structures, extract spatial features more precisely. Nevertheless, how to effectively extract global temporal features is still a challenge. In this work, firstly, a unique feature named temporal action graph is designed. It is a first attempt to express timing relationships in the form of a graph. Secondly, a temporal adaptive graph convolution structure (T-AGCN) is proposed. By generating a global adjacency matrix for the temporal action graph, it can flexibly extract global temporal features in temporal dynamics. Thirdly, we further propose a novel model named spatial-temporal adaptive graph convolutional network (ST-AGCN) for skeleton-based action recognition to extract spatial-temporal features and improve action recognition accuracy. ST-AGCN combines T-AGCN with spatial graph convolution to make up for the shortcomings of T-AGCN in handling spatial structure. Besides, ST-AGCN uses dual features to form a two-stream network, which further improves action recognition accuracy for hard-to-recognize samples. Finally, comparative experiments on two skeleton-based action recognition datasets, NTU-RGBD and SBU, demonstrate that T-AGCN and the temporal action graph can effectively explore global temporal information and that ST-AGCN achieves a certain improvement in recognition accuracy on both datasets.


18.
The development of micro-blogs, which generate large-scale short texts, provides people with convenient communication. At the same time, discovering topics from short texts has become a hard problem. Traditional topic models such as probabilistic latent semantic analysis (PLSA) and Latent Dirichlet Allocation (LDA) struggle to model short texts because they suffer from severe data sparsity when processing them. Moreover, the K-means clustering algorithm can make topics discriminative when the dataset is dense and the differences among topic documents are distinct. In this paper, the BTM (Biterm Topic Model) is employed to process short texts (micro-blog data) to alleviate the sparsity problem. At the same time, we integrate the K-means clustering algorithm into BTM for further topic discovery. The results of experiments on Sina micro-blog short text collections demonstrate that our method can discover topics effectively.
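
BTM addresses sparsity by modeling biterms, that is, unordered pairs of words that co-occur in the same short text, pooled over the whole corpus. The sketch below shows only that extraction step, under the assumption that each micro-blog post is already tokenized and treated as a single context window. The function names and the toy posts are hypothetical, and the topic inference itself is not shown.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair of one short text.

    Pairs are sorted so that (a, b) and (b, a) count as the same biterm.
    A word that appears twice in the text also yields a (w, w) pair.
    """
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def biterm_counts(docs):
    """Pool biterms over the whole corpus, as BTM does before inference."""
    counts = Counter()
    for tokens in docs:
        counts.update(extract_biterms(tokens))
    return counts


# Two hypothetical, already tokenized micro-blog posts.
posts = [["apple", "phone", "new"], ["new", "phone"]]
counts = biterm_counts(posts)
```

Pooling at the corpus level is the point of the design: a single post holds too few word co-occurrences to estimate topics, but the corpus-wide biterm counts do not.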

19.
Geometry-based edge clustering for graph visualization   Total citations: 4, self-citations: 0, citations by others: 4  
Graphs have been widely used to model relationships among data. For large graphs, excessive edge crossings make the display visually cluttered and thus difficult to explore. In this paper, we propose a novel geometry-based edge-clustering framework that can group edges into bundles to reduce the overall edge crossings. Our method uses a control mesh to guide the edge-clustering process; edge bundles can be formed by forcing all edges to pass through some control points on the mesh. The control mesh can be generated at different levels of detail either manually or automatically based on underlying graph patterns. Users can further interact with the edge-clustering results through several advanced visualization techniques such as color and opacity enhancement. Compared with other edge-clustering methods, our approach is intuitive, flexible, and efficient. The experiments on some large graphs demonstrate the effectiveness of our method.

20.
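Objective: With the development of massive data acquisition technology in practical application scenarios and the rising cost of data annotation, self-supervised learning has become an important strategy for analyzing massive data. However, how to extract useful supervision information from massive data and learn effectively under that supervision remains a research difficulty restricting progress in this direction. To this end, a self-supervised ensemble clustering framework based on consensus graph learning is proposed. Methods: The framework consists of three functional modules. First, a consensus graph is built from the multiple base learners of an ensemble. Second, a graph neural network analyzes the consensus graph to capture optimized node representations and the cluster structure of the nodes, and a high-confidence subset of nodes with their corresponding class labels is selected from the clusters to generate supervision information. Third, under the supervision of these labels, the base learners of the ensemble are updated jointly with the remaining unlabeled samples. These modules are iterated alternately to improve unsupervised clustering performance. Results: To verify the effectiveness of the framework, a series of experiments was designed on standard datasets (including image and text data). The experimental results show that the proposed method consistently outperforms existing clustering methods. In particular, on MNIST-Test (modified national institute of standards and technology database), the method achieves an accuracy of 97.78%, 3.85% higher than the best existing method. Conclusion: The method aims to use graph representation learning to improve the capture of supervision information in self-supervised learning...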
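
The first module, building a consensus graph from several base learners, can be sketched with a co-association matrix, a common construction in ensemble clustering. The abstract does not state how the authors build their graph, so this is an assumed stand-in: the edge weight between two samples is the fraction of base clusterings that put them in the same cluster. The function name and the toy labelings are hypothetical.

```python
def co_association(labelings):
    """Build a consensus graph as a dense co-association matrix.

    `labelings` holds one label list per base clustering, all of equal
    length. Entry [i][j] is the share of clusterings that place samples
    i and j in the same cluster.
    """
    n_samples = len(labelings[0])
    n_learners = len(labelings)
    graph = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        for j in range(n_samples):
            agree = sum(1 for labels in labelings if labels[i] == labels[j])
            graph[i][j] = agree / n_learners
    return graph


# Three hypothetical base clusterings of four samples. Label values are
# arbitrary per clustering; only co-membership matters.
base_labelings = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
consensus = co_association(base_labelings)
```

Here samples 0 and 1 are grouped together by all three learners, so their edge weight is 1.0, while samples 1 and 2 agree in only one of three clusterings. Edges with weights near 1.0 are the kind of high-agreement signal from which confident pseudo-labels could be selected.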
