Deep context of citations using machine-learning models in scholarly full-text articles

Authors:	Saeed-Ul Hassan, Mubashir Imran, Sehrish Iqbal, Naif Radi Aljohani, Raheel Nawaz

Affiliation:	1. Information Technology University, Lahore, Pakistan; 2. Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia; 3. School of Computer Science, Manchester Metropolitan University, Manchester, UK

Abstract:	Information retrieval systems for scholarly literature rely heavily not only on text matching but on semantic- and context-based features. Readers nowadays are deeply interested in how important an article is, its purpose and how influential it is in follow-up research work. Numerous techniques to tap the power of machine learning and artificial intelligence have been developed to enhance retrieval of the most influential scientific literature. In this paper, we compare and improve on four existing state-of-the-art techniques designed to identify influential citations. We consider 450 citations from the Association for Computational Linguistics corpus, classified by experts as either important or unimportant, and further extract 64 features based on the methodology of four state-of-the-art techniques. We apply the Extra-Trees classifier to select 29 best features and apply the Random Forest and Support Vector Machine classifiers to all selected techniques. Using the Random Forest classifier, our supervised model improves on the state-of-the-art method by 11.25%, with 89% Precision-Recall area under the curve. Finally, we present our deep-learning model, the Long Short-Term Memory network, that uses all 64 features to distinguish important and unimportant citations with 92.57% accuracy.
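
The feature-selection and classification steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses synthetic data from `make_classification` as a stand-in for the 450 annotated citations and their 64 features, and picks hyperparameters (`n_estimators=200`, a 70/30 split) arbitrarily. It also uses average precision as a common summary of the precision-recall curve, which may differ from how the paper computes its area under the curve.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in (assumption): 450 samples with 64 features,
# mirroring only the sizes quoted in the abstract, not the real data.
X, y = make_classification(
    n_samples=450, n_features=64, n_informative=10, random_state=0
)

# Hold out a test set so feature selection never sees the evaluation data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Step 1: rank features by Extra-Trees importance and keep the top 29.
ranker = ExtraTreesClassifier(n_estimators=200, random_state=0)
ranker.fit(X_train, y_train)
top_k = np.argsort(ranker.feature_importances_)[::-1][:29]

# Step 2: train a Random Forest on the selected features only.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train[:, top_k], y_train)

# Step 3: score the held-out set with average precision, a summary
# of the precision-recall curve.
scores = clf.predict_proba(X_test[:, top_k])[:, 1]
pr_auc = average_precision_score(y_test, scores)
print(f"selected features: {len(top_k)}, PR AUC (average precision): {pr_auc:.3f}")
```

Swapping `RandomForestClassifier` for `sklearn.svm.SVC(probability=True)` would give the Support Vector Machine variant on the same selected features. The numbers this sketch prints come from synthetic data and say nothing about the results reported in the abstract.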

This article is indexed in SpringerLink and other databases.
