Tensor Representation Based Dynamic Outlier Detection Method in Heterogeneous Network
Cite this article: Liu Lu, Zuo Wanli, Peng Tao. Tensor Representation Based Dynamic Outlier Detection Method in Heterogeneous Network[J]. Journal of Computer Research and Development, 2016, 53(8): 1729-1739. DOI: 10.7544/issn1000-1239.2016.20160178
Authors: Liu Lu  Zuo Wanli  Peng Tao
Affiliation: 1. College of Computer Science and Technology, Jilin University, Changchun 130012; 2. Key Laboratory of Symbol Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun 130012 (liulu12@mails.jlu.edu.cn)
Funding: National Natural Science Foundation of China (60903098); Jilin Province Industrial Technology Research and Development Project (JF2012c016-2); Jilin University Graduate Innovation Fund (2015040)
Abstract: Mining the rich semantic information hidden in heterogeneous information networks is an important task in data mining. Outliers differ markedly from normal data objects in their values, data distribution, and generation mechanism. Detecting outliers, analyzing their generation mechanisms, and ultimately eliminating them is of great practical significance. Outlier detection in homogeneous information networks has long been studied, but little work addresses dynamic outlier detection in heterogeneous networks, and many issues remain open. Because heterogeneous information networks are dynamic, normal data objects may become outliers over time. This paper proposes a tensor representation based dynamic outlier detection method for heterogeneous networks, called TRBOutlier. It constructs a tensor index tree from the high-order data represented by tensors. While the tensor index tree is searched, features are added to a direct item set and an indirect item set. Meanwhile, a clustering method based on short-text correlation judges whether data objects deviate from their original clusters, so that outliers are detected dynamically. The model preserves as much of the semantic information in heterogeneous networks as possible while substantially reducing time and space complexity. Experimental results show that the proposed method detects outliers dynamically in heterogeneous information networks effectively and efficiently.

Keywords: dynamic outlier detection  heterogeneous information network  tensor representation  tensor index tree  clustering
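
The core idea in the abstract, flagging an object as a dynamic outlier once it no longer fits its original cluster, can be illustrated with a minimal Python sketch. This is not the authors' TRBOutlier: the tensor index tree and the short-text correlation measure are not specified here, so the sketch swaps in plain Jaccard overlap between term sets. All function names, object labels, and sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two term sets (stand-in for short-text correlation)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_profiles(snapshot, assignment):
    """Build one term set per cluster from the objects initially assigned to it."""
    profiles = {}
    for obj, terms in snapshot.items():
        profiles.setdefault(assignment[obj], set()).update(terms)
    return profiles


def nearest_cluster(terms, profiles):
    """Return the cluster whose profile overlaps most with the given terms.

    Clusters are scanned in sorted label order, so ties go to the first label.
    """
    return max(sorted(profiles), key=lambda c: jaccard(terms, profiles[c]))


def dynamic_outliers(snapshots, initial_assignment):
    """Map each later time index to the objects that left their original cluster."""
    profiles = cluster_profiles(snapshots[0], initial_assignment)
    result = {}
    for t, snap in enumerate(snapshots[1:], start=1):
        result[t] = sorted(
            obj for obj, terms in snap.items()
            if nearest_cluster(terms, profiles) != initial_assignment[obj]
        )
    return result


# Hypothetical data: four objects observed at two time steps.
snapshots = [
    {
        "a1": {"tensor", "index", "tree"},
        "a2": {"tensor", "tree", "decomposition"},
        "a3": {"cluster", "text", "short"},
        "a4": {"cluster", "text", "correlation"},
    },
    {
        "a1": {"tensor", "index"},
        "a2": {"cluster", "text", "short"},  # a2 drifts toward cluster B
        "a3": {"cluster", "text"},
        "a4": {"text", "correlation"},
    },
]
initial_assignment = {"a1": "A", "a2": "A", "a3": "B", "a4": "B"}

result = dynamic_outliers(snapshots, initial_assignment)
print(result)
```

In this toy data only `a2` changes its best-matching cluster at the second time step, so it is the only object flagged. The cluster profiles are deliberately frozen at the first snapshot, which keeps "original cluster" well defined; a real system would also need a policy for updating them.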