首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
The concept of support is central to data mining. While the definition of support in transaction databases is intuitive and simple, that is not the case in graph datasets and databases. Most mining algorithms require the support of a pattern to be no greater than that of its subpatterns, a property called anti-monotonicity, or admissibility. This paper examines the requirements for admissibility of a support measure. Support measures for mining graphs are usually based on the notion of an instance graph---a graph representing all the instances of the pattern in a database and their intersection properties. Necessary and sufficient conditions for support measure admissibility, based on operations on instance graphs, are developed and proved. The sufficient conditions are used to prove admissibility of one support measure—the size of the independent set in the instance graph. Conversely, the necessary conditions are used to quickly show that some other support measures, such as weighted count of instances, are not admissible. *Partially supported by the KITE consortium under contract to the Israeli Ministry of Trade and Industry, and by the Paul Ivanier Center for Robotics and Production Management.  相似文献   

2.
We introduce a new filtering algorithm, called IDL(d)-filtering, for a global constraint dedicated to the graph isomorphism problem—the goal of which is to decide if two given graphs have an identical structure. The basic idea of IDL(d)-filtering is to label every vertex with respect to its relationships with other vertices around it in the graph, and to use these labels to filter domains by removing values that have different labels. IDL(d)-filtering is parameterized by a positive integer value d which gives a limit on the distance between a vertex to be labelled and the set of vertices considered to build its label. We experimentally compare different instantiations of IDL(d)-filtering with state-of-the-art dedicated algorithms and show that IDL(d)-filtering is more efficient on regular sparse graphs and competitive on other kinds of graphs.  相似文献   

3.
Mining patterns from graph traversals   总被引:12,自引:0,他引:12  
In data models that have graph representations, users navigate following the links of the graph structure. Conducting data mining on collected information about user accesses in such models, involves the determination of frequently occurring access sequences. In this paper, the problem of finding traversal patterns from such collections is examined. The determination of patterns is based on the graph structure of the model. For this purpose, three algorithms, one which is level-wise with respect to the lengths of the patterns and two which are not are presented. Additionally, we consider the fact that accesses within patterns may be interleaved with random accesses due to navigational purposes. The definition of the pattern type generalizes existing ones in order to take into account this fact. The performance of all algorithms and their sensitivity to several parameters is examined experimentally.  相似文献   

4.
为解决加权遍历模式挖掘问题,概括了加权有向图的种类,提出一种边加权有向图与顶点加权有向图间的变换模型,并基于该模型提出一种基于图遍历的加权序列模式挖掘算法GTWSPMiner.该算法根据遍历模式中的项的连续性特点,采用一种加权前缀投影序列模式增长方法,将原挖掘序列数据库的任务分解成一组挖掘局部投影数据库的小任务.对比实验结果表明,该算法能快速有效地挖掘加权频繁遍历模式.  相似文献   

5.
面向元数据流,提出有效评测用户订阅的方法.设计了索引结构对订阅进行分组索引,消除了一个订阅因为包含多个谓词而造成的多次索引、计数和比较;设计了新的基于分组的过滤算法,该算法通过缓存谓词匹配结果使得谓词匹配结果得以在订阅过滤过程中传播,取得了很高的过滤性能.实验结果表明,该系统可以有效地处理达上百万订阅的负载量,实验中引进提取词干和消除停用词,极大提高系统的查全率和精度.  相似文献   

6.
We present an algorithm to solve the graph isomorphism problem for the purpose of object recognition. Objects, such as those which exist in a robot workspace, may be represented by labelled graphs (graphs with attributes on their nodes and/or edges). Thereafter, object recognition is achieved by matching pairs of these graphs. Assuming that all objects are sufficiently different so that their corresponding representative graphs are distinct, then given a new graph, the algorthm efficiently finds the isomorphic stored graph (if it exists). The algorithm consists of three phases: preprocessing, link construction, and ambiguity resolution. Results from experiments on a wide variety and sizes of graphs are reported. Results are also reported for experiments on recognising graphs that represent protein molecules. The algorithm works for all types of graphs except for a class of highly ambiguous graphs which includes strongly regular graphs. However, members of this class are detected in polynomial time, which leaves the option of switching to a higher complexity algorithm if desired.  相似文献   

7.
We present a new block adaptive algorithm as a variant of the Toeplitz-preconditioned block conjugate gradient (TBCG) algorithm. The proposed algorithm is formulated by combining TBCG algorithm with a data-reusing scheme that is realized by processing blocks of data in an overlapping manner, as in the optimum block adaptive shifting (OBAS) algorithm. Simulation results show that the proposed algorithm is superior to the block conjugate gradient shifting (BCGS), TBCG and Toeplitz-OBAS (TOBAS) algorithms in both convergence rate and tracking property of input signal conditioning.  相似文献   

8.
This letter derives a data filtering based least squares iterative identification algorithm for output error autoregressive systems. The basic idea is to use the data filtering technique to transform the original identification model to an equation error model and to estimate the parameters of this model. The proposed algorithm is more efficient and can produce more accurate parameter estimation than the existing least squares iterative algorithm.  相似文献   

9.
The identification of nonlinear systems is a hot topic in the identification fields. In this paper, a data filtering based multi-innovation stochastic gradient algorithm is derived for Hammerstein nonlinear controlled autoregressive moving average systems by adopting the key-term separation principle and the data filtering technique. The proposed algorithm provides a reference to improve the identification accuracy of the nonlinear systems with colored noise. The simulation results show that the new algorithm can more effectively estimate the parameters of the Hammerstein nonlinear systems than the multi-innovation stochastic gradient algorithm.  相似文献   

10.
As the Internet flourishes, online advertising becomes essential for marketing campaigns for business applications. To perform a marketing campaign, advertisers provide their advertisements to Internet publishers and commissions are paid to the publishers of the advertisements based on the clicks made for the posted advertisements or the purchases of the products of which advertisements posted. Since the payment given to a publisher is proportional to the amount of clicks received for the advertisements posted by the publisher, dishonest publishers are motivated to inflate the number of clicks on the advertisements hosted on their web sites. Since the click frauds are critical for online advertising to be reliable, the online advertisers make the efforts to prevent them effectively. However, the methods used for click frauds are also becoming more complex and sophisticated.  相似文献   

11.
12.
协同过滤是迄今为止最成功的推荐系统,它可以产生高质量的推荐,但是其性能随着客户和产品数目的增加而下降.提出了一种基于特征表的协同过滤算法,该算法首先将原始数据划分成若干个特征集,然后通过建立特征表而避免顺序扫描.在真实数据集上的实验表明该算法对推荐系统的可伸缩性和推荐质量都有较大的提高.  相似文献   

13.
We develop a parallel algorithm for partitioning the vertices of a graph intop2 sets in such a way that few edges connect vertices in different sets. The algorithm is intended for a message-passing multiprocessor system, such as the hypercube, and is based on the Kernighan-Lin algorithm for finding small edge separators on a single processor.(1) We use this parallel partitioning algorithm to find orderings for factoring large sparse symmetric positive definite matrices. These orderings not only reduce fill, but also result in good processor utilization and low communication overhead during the factorization. We provide a complexity analysis of the algorithm, as well as some numerical results from an Intel hypercube and a hypercube simulator.Publication of this report was partially supported by the National Science Foundation under Grant DCR-8451385 and by AT&T Bell Laboratories through their Ph.D scholarship program.  相似文献   

14.
Sigma-Point直接式卡尔曼滤波惯性组合导航算法   总被引:2,自引:0,他引:2  
基于西格玛点采样加权的方法,以姿态、速度和位置等9个导航参数为状态向量,以卫星导航系统的速度和位置组成6维观测向量,构建直接式卡尔曼滤波器,融惯性导航系统求解和状态估计的过程为一体,直接描述系统导航参数动态过程.仿真结果验证了惯性组合导航Sigma-Point直接式滤波方法的有效性,表明该非线性直接式滤波方法可提高惯性组合导航系统的导航精度和对飞机、导弹等载体非线性机动过程的适应性.  相似文献   

15.
16.
A cycleC passing through two specific verticess andt of a biconnected graph is said to be anst-ambitus if its bridges do not interlace in some special way. We present algorithms forst-ambitus for planar biconnected graphs, which are much simpler than the one known for general graphs [MT]. Our algorithm runs inO(n) time on a sequential machine and (logn) parallel time usingO(n/logn) processors on an EREW PRAM.  相似文献   

17.
This paper reflects results of research related to developing a new methodology for automatic graph drawing based on applying genetic algorithms. The methodology has permitted the elaboration of a hybrid technique that combines the most popular, classical, topology-shape-metric approach to orthogonal drawings on the grid and a genetic algorithm that is directed, in its evolutionary process, at multicriteria decision making in a fuzzy environment. In the traditional use of the topology-shape-metric approach, a single fixed planar embedding is obtained in the planarization step. Thereafter this embedding is subjected to the orthogonalization and compaction steps. However, this sequence does not guarantee that the fixed planar embedding will generate a final drawing of a good quality. Moreover, every topology-shape-metric step is classified as a NP-hard problem, and choices as well as heuristics used in previous stages have a direct impact on subsequent ones. Taking this into account, the developed hybrid technique generates a greater number of planar embeddings by varying the order of edges’ insertion when forming the planar embeddings in planarization step. Thus, the problem is formulated as a permutation-based combinatorial optimization problem. The genetic algorithm is applied at the planarization step of the topology-shape-metric. This allows one to generate the population with the corresponding number of planar embeddings. Each planar embedding obtained in the planarization step is submitted to the orthogonalization and compaction. Their results serve for applying the procedures of multicriteria decision making in a fuzzy environment. Thus, in the evolutionary process, the genetic algorithm is able to select individuals, which provide more harmonious solutions (relatively of the solutions obtained with traditional applying the topology-shape-metric approach) from the point of view of the aesthetic criteria that are usually utilized at the three steps of automatic graph drawing. This is convincingly demonstrated by experimental results given in the paper.  相似文献   

18.
Based on the introduction to the user-based and item-based collaborative filtering algorithms, the problems related to the two algorithms are analyzed, and a new entropy-based recommendation algorithm is proposed. Aiming at the drawbacks of traditional similarity measurement methods, we put forward an improved similarity measurement method. The entropy-based collaborative filtering algorithm contributes to solving the cold-start problem and discovering users’ hidden interests. Using the data selected from Movielens and Book-Crossing datasets and MAE accuracy metric, three different collaborative filtering recommendation algorithms are compared through experiments. The experimental scheme and results are discussed in detail. The results show that the entropy-based algorithm provides better recommendation quality than user-based algorithm and achieves recommendation accuracy comparable to the item-based algorithm. At last, a solution to B2B e-commerce recommendation applications based on Web services technology is proposed, which adopts entropy-based collaborative filtering recommendation algorithm.  相似文献   

19.
传统主动队列管理(AQM)算法在处理传感器网络突发流时具有响应速度慢、抗网络突变性能弱的缺点.针对此问题,提出了一种新的AQM算法,算法首先将队列长度作为早期拥塞检测参量,运用卡尔曼滤波理论预测队列长度;其次根据队列长度在缓冲区的占用比来划分网络状态;最后根据不同占用比采取相应的丢包策略,自适应地调整丢包率,当出现网络突变时,加大调整幅度,使队列长度保持在理想区间.仿真实验表明:新算法能够较好地适应网络波动,提高网络服务质量(QoS),算法综合性能优于主流AQM算法.  相似文献   

20.
元数据(Meta Data)是用来描述应用中数据结构的数据或文档,即关于数据的数据,处在数据ETL系统的核心位置。分析了元数据在ETL中的作用,设计了元数据模型,提出了元数据管理系统的设计原则和步骤,使其具有广泛的适应性、很好的扩展性和可重用性。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号