共查询到20条相似文献,搜索用时 0 毫秒
1.
The concept of support is central to data mining. While the definition of support in transaction databases is intuitive and
simple, that is not the case in graph datasets and databases. Most mining algorithms require the support of a pattern to be
no greater than that of its subpatterns, a property called anti-monotonicity, or admissibility. This paper examines the requirements for admissibility of a support measure. Support measures for mining
graphs are usually based on the notion of an instance graph---a graph representing all the instances of the pattern in a database
and their intersection properties. Necessary and sufficient conditions for support measure admissibility, based on operations
on instance graphs, are developed and proved. The sufficient conditions are used to prove admissibility of one support measure—the
size of the independent set in the instance graph. Conversely, the necessary conditions are used to quickly show that some
other support measures, such as weighted count of instances, are not admissible.
*Partially supported by the KITE consortium under contract to the Israeli Ministry of Trade and Industry, and by the Paul Ivanier
Center for Robotics and Production Management. 相似文献
2.
We introduce a new filtering algorithm, called IDL(d)-filtering, for a global constraint dedicated to the graph isomorphism problem—the goal of which is to decide if two given
graphs have an identical structure. The basic idea of IDL(d)-filtering is to label every vertex with respect to its relationships with other vertices around it in the graph, and to
use these labels to filter domains by removing values that have different labels. IDL(d)-filtering is parameterized by a positive integer value d which gives a limit on the distance between a vertex to be labelled and the set of vertices considered to build its label.
We experimentally compare different instantiations of IDL(d)-filtering with state-of-the-art dedicated algorithms and show that IDL(d)-filtering is more efficient on regular sparse graphs and competitive on other kinds of graphs. 相似文献
3.
Mining patterns from graph traversals 总被引:12,自引:0,他引:12
In data models that have graph representations, users navigate following the links of the graph structure. Conducting data mining on collected information about user accesses in such models, involves the determination of frequently occurring access sequences. In this paper, the problem of finding traversal patterns from such collections is examined. The determination of patterns is based on the graph structure of the model. For this purpose, three algorithms, one which is level-wise with respect to the lengths of the patterns and two which are not are presented. Additionally, we consider the fact that accesses within patterns may be interleaved with random accesses due to navigational purposes. The definition of the pattern type generalizes existing ones in order to take into account this fact. The performance of all algorithms and their sensitivity to several parameters is examined experimentally. 相似文献
4.
5.
面向元数据流,提出有效评测用户订阅的方法.设计了索引结构对订阅进行分组索引,消除了一个订阅因为包含多个谓词而造成的多次索引、计数和比较;设计了新的基于分组的过滤算法,该算法通过缓存谓词匹配结果使得谓词匹配结果得以在订阅过滤过程中传播,取得了很高的过滤性能.实验结果表明,该系统可以有效地处理达上百万订阅的负载量,实验中引进提取词干和消除停用词,极大提高系统的查全率和精度. 相似文献
6.
We present an algorithm to solve the graph isomorphism problem for the purpose of object recognition. Objects, such as those which exist in a robot workspace, may be represented by labelled graphs (graphs with attributes on their nodes and/or edges). Thereafter, object recognition is achieved by matching pairs of these graphs. Assuming that all objects are sufficiently different so that their corresponding representative graphs are distinct, then given a new graph, the algorthm efficiently finds the isomorphic stored graph (if it exists). The algorithm consists of three phases: preprocessing, link construction, and ambiguity resolution. Results from experiments on a wide variety and sizes of graphs are reported. Results are also reported for experiments on recognising graphs that represent protein molecules. The algorithm works for all types of graphs except for a class of highly ambiguous graphs which includes strongly regular graphs. However, members of this class are detected in polynomial time, which leaves the option of switching to a higher complexity algorithm if desired. 相似文献
7.
We present a new block adaptive algorithm as a variant of the Toeplitz-preconditioned block conjugate gradient (TBCG) algorithm. The proposed algorithm is formulated by combining TBCG algorithm with a data-reusing scheme that is realized by processing blocks of data in an overlapping manner, as in the optimum block adaptive shifting (OBAS) algorithm. Simulation results show that the proposed algorithm is superior to the block conjugate gradient shifting (BCGS), TBCG and Toeplitz-OBAS (TOBAS) algorithms in both convergence rate and tracking property of input signal conditioning. 相似文献
8.
《Information Processing Letters》2014,114(10):573-578
This letter derives a data filtering based least squares iterative identification algorithm for output error autoregressive systems. The basic idea is to use the data filtering technique to transform the original identification model to an equation error model and to estimate the parameters of this model. The proposed algorithm is more efficient and can produce more accurate parameter estimation than the existing least squares iterative algorithm. 相似文献
9.
The identification of nonlinear systems is a hot topic in the identification fields. In this paper, a data filtering based multi-innovation stochastic gradient algorithm is derived for Hammerstein nonlinear controlled autoregressive moving average systems by adopting the key-term separation principle and the data filtering technique. The proposed algorithm provides a reference to improve the identification accuracy of the nonlinear systems with colored noise. The simulation results show that the new algorithm can more effectively estimate the parameters of the Hammerstein nonlinear systems than the multi-innovation stochastic gradient algorithm. 相似文献
10.
As the Internet flourishes, online advertising becomes essential for marketing campaigns for business applications. To perform a marketing campaign, advertisers provide their advertisements to Internet publishers and commissions are paid to the publishers of the advertisements based on the clicks made for the posted advertisements or the purchases of the products of which advertisements posted. Since the payment given to a publisher is proportional to the amount of clicks received for the advertisements posted by the publisher, dishonest publishers are motivated to inflate the number of clicks on the advertisements hosted on their web sites. Since the click frauds are critical for online advertising to be reliable, the online advertisers make the efforts to prevent them effectively. However, the methods used for click frauds are also becoming more complex and sophisticated. 相似文献
11.
12.
杨风召 《计算机工程与应用》2007,43(6):184-187
协同过滤是迄今为止最成功的推荐系统,它可以产生高质量的推荐,但是其性能随着客户和产品数目的增加而下降.提出了一种基于特征表的协同过滤算法,该算法首先将原始数据划分成若干个特征集,然后通过建立特征表而避免顺序扫描.在真实数据集上的实验表明该算法对推荐系统的可伸缩性和推荐质量都有较大的提高. 相似文献
13.
We develop a parallel algorithm for partitioning the vertices of a graph intop2 sets in such a way that few edges connect vertices in different sets. The algorithm is intended for a message-passing multiprocessor system, such as the hypercube, and is based on the Kernighan-Lin algorithm for finding small edge separators on a single processor.(1) We use this parallel partitioning algorithm to find orderings for factoring large sparse symmetric positive definite matrices. These orderings not only reduce fill, but also result in good processor utilization and low communication overhead during the factorization. We provide a complexity analysis of the algorithm, as well as some numerical results from an Intel hypercube and a hypercube simulator.Publication of this report was partially supported by the National Science Foundation under Grant DCR-8451385 and by AT&T Bell Laboratories through their Ph.D scholarship program. 相似文献
14.
15.
16.
A cycleC passing through two specific verticess andt of a biconnected graph is said to be anst-ambitus if its bridges do not interlace in some special way. We present algorithms forst-ambitus for planar biconnected graphs, which are much simpler than the one known for general graphs [MT]. Our algorithm runs inO(n) time on a sequential machine and (logn) parallel time usingO(n/logn) processors on an EREW PRAM. 相似文献
17.
Bernadete M.M. NetaAuthor Vitae Gustavo H.D. AraújoAuthor Vitae 《Applied Soft Computing》2012,12(4):1379-1389
This paper reflects results of research related to developing a new methodology for automatic graph drawing based on applying genetic algorithms. The methodology has permitted the elaboration of a hybrid technique that combines the most popular, classical, topology-shape-metric approach to orthogonal drawings on the grid and a genetic algorithm that is directed, in its evolutionary process, at multicriteria decision making in a fuzzy environment. In the traditional use of the topology-shape-metric approach, a single fixed planar embedding is obtained in the planarization step. Thereafter this embedding is subjected to the orthogonalization and compaction steps. However, this sequence does not guarantee that the fixed planar embedding will generate a final drawing of a good quality. Moreover, every topology-shape-metric step is classified as a NP-hard problem, and choices as well as heuristics used in previous stages have a direct impact on subsequent ones. Taking this into account, the developed hybrid technique generates a greater number of planar embeddings by varying the order of edges’ insertion when forming the planar embeddings in planarization step. Thus, the problem is formulated as a permutation-based combinatorial optimization problem. The genetic algorithm is applied at the planarization step of the topology-shape-metric. This allows one to generate the population with the corresponding number of planar embeddings. Each planar embedding obtained in the planarization step is submitted to the orthogonalization and compaction. Their results serve for applying the procedures of multicriteria decision making in a fuzzy environment. Thus, in the evolutionary process, the genetic algorithm is able to select individuals, which provide more harmonious solutions (relatively of the solutions obtained with traditional applying the topology-shape-metric approach) from the point of view of the aesthetic criteria that are usually utilized at the three steps of automatic graph drawing. This is convincingly demonstrated by experimental results given in the paper. 相似文献
18.
Chun-Hui Piao Jing Zhao Li-Juan Zheng 《Service Oriented Computing and Applications》2009,3(2):147-157
Based on the introduction to the user-based and item-based collaborative filtering algorithms, the problems related to the
two algorithms are analyzed, and a new entropy-based recommendation algorithm is proposed. Aiming at the drawbacks of traditional
similarity measurement methods, we put forward an improved similarity measurement method. The entropy-based collaborative
filtering algorithm contributes to solving the cold-start problem and discovering users’ hidden interests. Using the data
selected from Movielens and Book-Crossing datasets and MAE accuracy metric, three different collaborative filtering recommendation
algorithms are compared through experiments. The experimental scheme and results are discussed in detail. The results show
that the entropy-based algorithm provides better recommendation quality than user-based algorithm and achieves recommendation
accuracy comparable to the item-based algorithm. At last, a solution to B2B e-commerce recommendation applications based on
Web services technology is proposed, which adopts entropy-based collaborative filtering recommendation algorithm. 相似文献
19.
20.
管丽娟 《数字社区&智能家居》2007,2(8):469
元数据(Meta Data)是用来描述应用中数据结构的数据或文档,即关于数据的数据,处在数据ETL系统的核心位置。分析了元数据在ETL中的作用,设计了元数据模型,提出了元数据管理系统的设计原则和步骤,使其具有广泛的适应性、很好的扩展性和可重用性。 相似文献