A Review on Authorship Identification Research
Citation: Zhang Yang, Jiang Ming-Hu. A review on authorship identification research. Acta Automatica Sinica, 2021, 47(11): 2501−2520 doi: 10.16383/j.aas.c200654
Authors: ZHANG Yang, JIANG Ming-Hu
Affiliation: 1. Lab of Computational Linguistics, School of Humanities, Tsinghua University, Beijing 100084
Funding: Supported by the National Natural Science Foundation of China (62036001)
Abstract: Authorship identification is an interdisciplinary field that infers the author of an unknown text from known texts. Traditional research is generally based on empirical knowledge from literature or linguistics, whereas modern research relies mainly on mathematical methods to quantify an author's writing style. In recent years, with the development of cognitive science, systems science, and information technology, authorship identification has attracted growing attention from researchers. This paper reviews the methods and ideas of modern authorship identification research, mainly from the perspective of computational linguistics. First, the history of authorship identification is briefly introduced. Then, stylistic features, authorship identification methods, and multi-faceted research in the field are described in detail. Next, some evaluations, datasets, and evaluation metrics related to authorship identification are introduced. Finally, some open problems in the field are pointed out, and the development trends of authorship identification are analyzed and forecast in light of these problems.

Keywords: authorship identification, stylometry, writing style, evaluation metrics
Received: 2020-08-14