Node similarity in the citation graph
Authors:Wangzhong Lu  J Janssen  E Milios  N Japkowicz  Yongzheng Zhang
Affiliation:(1) Faculty of Computer Science, Dalhousie University, 6050 University Ave., Halifax, Nova Scotia, B3H 1W5, Canada;(2) Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, B3H 3J5, Canada;(3) School of Information Technology and Engineering, University of Ottawa, Ottawa, Ontario, K1N 6N5, Canada
Abstract:Published scientific articles are linked together into a graph, the citation graph, through their citations. This paper explores the notion of similarity based on connectivity alone, and proposes several algorithms to quantify it. Our metrics take advantage of the local neighborhoods of the nodes in the citation graph. Two variants of link-based similarity estimation between two nodes are described, one based on the separate local neighborhoods of the nodes, and another based on the joint local neighborhood expanded from both nodes at the same time. The algorithms are implemented and evaluated on a subgraph of the citation graph of computer science in a retrieval context. The results are compared with text-based similarity, and demonstrate the complementarity of link-based and text-based retrieval.

Wangzhong Lu holds a Bachelor's degree from Hefei University of Technology (1993), and a Master's degree from Dalhousie University (2001), both in computer science. From 1993 to 1999 he worked as a developer with China National Computer Software and Technical Service Corp. in Beijing. From 2001 to 2005 he held industrial positions as a senior software architect in Atlantic Canada. He is currently with DST Systems, Charlotte, NC, as a senior data architect.

Jeannette Janssen's research area is applied graph theory. She has worked on the problem of frequency assignment in cellular and digital broadcasting networks. Her current interest is in graph theory applied to the World Wide Web and other networked information spaces. Dr. Janssen did her Master's studies at Eindhoven University of Technology in the Netherlands, and her doctorate at Lehigh University, USA. She is currently an associate professor at Dalhousie University, Canada.

Evangelos Milios received a diploma in electrical engineering from the National Technical University of Athens, and Master's and Ph.D. degrees in electrical engineering and computer science from the Massachusetts Institute of Technology. He held faculty positions at the University of Toronto and York University. He is currently a professor of computer science at Dalhousie University, Canada, where he was Director of the Graduate Program. He has served on the committees of the ACM Dissertation Award, and the AAAI/SIGART Doctoral Consortium. He has worked on the interpretation of visual and range signals for landmark-based positioning, navigation and map construction in single- and multi-agent robotics. His current research activity is centered on Networked Information Spaces, Web information retrieval, and aquatic robotics. He is a senior member of the IEEE.

Nathalie Japkowicz is an associate professor at the School of Information Technology and Engineering of the University of Ottawa. She obtained her Ph.D. from Rutgers University, her M.Sc. from the University of Toronto, and her B.Sc. from McGill University. Prior to joining the University of Ottawa, she taught at Ohio State University and Dalhousie University. Her area of specialization is Machine Learning, and her most recent research interests have focused on the class imbalance problem. She has made over 50 contributions in the form of journal articles, conference articles, workshop articles, magazine articles, technical reports or edited volumes.

Yongzheng Zhang obtained a B.E. in computer applications from Southeast University, China, in 1997 and an M.S. in computer science from Dalhousie University in 2002. From 1997 to 1999 he was an instructor and undergraduate advisor at Southeast University. He also worked as a software engineer at Ricom Information and Telecommunications Co. Ltd., China. He is currently a Ph.D. candidate at Dalhousie University. His research interests are in the areas of Information Retrieval, Machine Learning, Natural Language Processing, and Web Mining, particularly centered on Web Document Summarization. A paper based on his Master's thesis received the best paper award at the 2003 Canadian Artificial Intelligence conference.
Keywords:Networked information spaces  Document similarity metric  Citation graph  Digital libraries
This article is indexed in SpringerLink and other databases.
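
The abstract mentions a similarity variant built from the separate local neighborhoods of two nodes but does not spell out the metric. As a rough illustration only, the sketch below assumes an undirected view of the citation graph, a fixed hop radius, and Jaccard overlap as the score; all three choices, and every function name here, are assumptions made for this example and are not taken from the paper.

```python
from collections import deque


def undirected(citations):
    """Build an adjacency map from (citing, cited) pairs, ignoring direction."""
    graph = {}
    for citing, cited in citations:
        graph.setdefault(citing, set()).add(cited)
        graph.setdefault(cited, set()).add(citing)
    return graph


def neighborhood(graph, start, radius):
    """Return all nodes within `radius` hops of `start` (including `start`)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == radius:
            continue  # do not expand past the requested radius
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return seen


def neighborhood_similarity(graph, a, b, radius=1):
    """Jaccard overlap of the two separate neighborhoods (assumed metric).

    The two query nodes themselves are excluded so that a direct citation
    between them does not count as shared context.
    """
    na = neighborhood(graph, a, radius) - {a, b}
    nb = neighborhood(graph, b, radius) - {a, b}
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


# Hypothetical toy data: p1 and p2 both cite r2.
citations = [("p1", "r1"), ("p1", "r2"), ("p2", "r2"), ("p2", "r3")]
graph = undirected(citations)
score = neighborhood_similarity(graph, "p1", "p2", radius=1)
```

On this toy graph the two papers share one of three distinct neighbors. A joint-neighborhood variant, as the abstract describes it, would instead grow a single neighborhood from both nodes at once; that is not shown here.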