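A Novel Graph Classification Approach Based on Embedding Sets (一种新的基于嵌入集的图分类方法)
Cite this article:Wang Guijuan, Yin Jian, Zhan Weixu. 一种新的基于嵌入集的图分类方法[J]. 计算机研究与发展 (Journal of Computer Research and Development), 2012, 49(11): 2311-2319.
Authors:Wang Guijuan (王桂娟)  Yin Jian (印鉴)  Zhan Weixu (詹卫许)
Affiliations:1. College of Information Science and Technology, Sun Yat-sen University, Guangzhou 510275; School of Computer, South China Normal University, Guangzhou 510631
2. College of Information Science and Technology, Sun Yat-sen University, Guangzhou 510275
3. Information Center, South Power Grid, Guangzhou 510000
Funding:National Natural Science Foundation project, Guangdong Provincial Natural Science Foundation project, Guangdong Provincial Science and Technology Plan project
Abstract:As graph data collection technology has developed in many scientific fields, classifying graph data has become an important topic in machine learning and data mining. Many graph classification methods have been proposed. Some build the classification model in three steps; others build it in two steps. When mining frequent subgraphs or feature subgraphs, these methods consider only the structural information of a subgraph and ignore its embedding information. To address this, we propose an embedding-set-based graph classification method built on the L-CCAM subgraph encoding. The method adopts a feature subgraph selection strategy based on category information. It considers the structural information of subgraphs and also makes full use of embedding information, in the form of embedding sets, during frequent subgraph mining, so that feature subgraphs are selected and classification rules are generated directly in a single step. Experimental results show that, when classifying chemical compound data, the method achieves higher classification accuracy than three-step graph classification methods and runs more efficiently than both two-step and three-step methods.

Keywords:frequent subgraph  graph classification  graph mining  feature selection  embedding set  data mining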

A Novel Graph Classification Approach Based on Embedding Sets
Wang Guijuan, Yin Jian, Zhan Weixu. A Novel Graph Classification Approach Based on Embedding Sets[J]. Journal of Computer Research and Development, 2012, 49(11): 2311-2319.
Authors:Wang Guijuan    Yin Jian    Zhan Weixu
Affiliation:1(College of Information Science and Technology, Sun Yat-sen University, Guangzhou 510275) 2(School of Computer, South China Normal University, Guangzhou 510631) 3(Information Center, South Power Grid, Guangzhou 510000)
Abstract:With the development of efficient graph data collection technology in many scientific fields, graph classification has become an important topic in the machine learning and data mining communities. Many graph classification approaches have been proposed. Some take three steps: mining frequent subgraphs, selecting feature subgraphs from the mined frequent subgraphs, and constructing a classification model from these subgraphs. Others take two steps: mining discriminative subgraphs directly from the graph data and learning a classification model from them. However, when mining frequent or discriminative subgraphs, these approaches exploit only the structural information of a pattern and ignore its embedding information, even though some efficient subgraph mining algorithms already maintain the embedding information of each pattern. We propose a graph classification approach that employs a novel subgraph encoding with category labels and a feature subgraph selection strategy based on category information. While mining frequent subgraphs, the approach makes full use of embedding sets to select feature subgraphs, so that classification rules are generated in a single step. Experimental results show that the proposed approach is effective and feasible for classifying chemical compounds.
Keywords:frequent subgraph pattern  graph classification  graph mining  feature selection  embedding set  data mining
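
To make the notion of an embedding set in the abstract more concrete, the following is a minimal sketch, not the authors' L-CCAM-based algorithm. It assumes a toy graph database, restricts patterns to single labeled edges to avoid general subgraph isomorphism, and uses hypothetical names (`GRAPHS`, `edge_embedding_sets`, `select_rules`) and thresholds (`min_support`, `min_confidence`) that do not come from the paper. Each pattern's embedding set records where it occurs, and the class labels of the supporting graphs decide whether the pattern becomes a classification rule during the same pass.

```python
from collections import defaultdict

# Toy graph database (an assumption for illustration): each graph has a class
# label, labeled vertices, and undirected edges given as vertex-id pairs.
GRAPHS = [
    {"label": "active",   "nodes": {0: "C", 1: "N", 2: "O"}, "edges": [(0, 1), (1, 2)]},
    {"label": "active",   "nodes": {0: "C", 1: "N"},         "edges": [(0, 1)]},
    {"label": "inactive", "nodes": {0: "C", 1: "O", 2: "O"}, "edges": [(0, 1), (0, 2)]},
    {"label": "inactive", "nodes": {0: "N", 1: "O"},         "edges": [(0, 1)]},
]


def edge_embedding_sets(graphs):
    """Map each single-edge pattern to its embedding set.

    A pattern is a sorted pair of vertex labels; an embedding is a
    (graph_index, (u, v)) occurrence of that pattern in the database.
    """
    embeddings = defaultdict(list)
    for gi, g in enumerate(graphs):
        for u, v in g["edges"]:
            pattern = tuple(sorted((g["nodes"][u], g["nodes"][v])))
            embeddings[pattern].append((gi, (u, v)))
    return embeddings


def select_rules(graphs, min_support=2, min_confidence=1.0):
    """Select frequent patterns and emit class rules in one pass.

    Support is the number of distinct graphs containing the pattern;
    confidence is the share of those graphs in the majority class.
    """
    rules = {}
    for pattern, emb in edge_embedding_sets(graphs).items():
        supporting = {gi for gi, _ in emb}  # distinct graphs with an embedding
        if len(supporting) < min_support:
            continue
        per_class = defaultdict(int)
        for gi in supporting:
            per_class[graphs[gi]["label"]] += 1
        best_class, count = max(per_class.items(), key=lambda kv: kv[1])
        confidence = count / len(supporting)
        if confidence >= min_confidence:
            rules[pattern] = (best_class, confidence)
    return rules


if __name__ == "__main__":
    for pattern, (cls, conf) in select_rules(GRAPHS).items():
        print(pattern, "->", cls, conf)
```

In this toy data the C-O pattern has two embeddings but both lie in the same graph, so its support is 1. That distinction between occurrences and supporting graphs is the kind of information an embedding set carries beyond the pattern's structure alone.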
This article is indexed in CNKI, Wanfang Data, and other databases.