Intrinsic plagiarism analysis |
| |
Authors: | Benno Stein, Nedim Lipka, Peter Prettenhofer |
| |
Affiliation: | 1. Faculty of Media, Media Systems, Bauhaus-Universität Weimar, Weimar, Germany |
| |
Abstract: | Research in automatic text plagiarism detection focuses on algorithms that compare suspicious documents against a collection
of reference documents. Recent approaches perform well in identifying copied or modified foreign sections, but they assume
a closed world where a reference collection is given. This article investigates whether plagiarism can be detected
by a computer program if no reference can be provided, e.g., if the foreign sections stem from a book that is not available
in digital form. We call this problem class intrinsic plagiarism analysis; it is closely related to the problem of authorship
verification. Our contributions are threefold. (1) We organize the algorithmic
building blocks for intrinsic plagiarism analysis and authorship verification and survey the state of the art. (2) We show
how the meta-learning approach of Koppel and Schler, termed “unmasking”, can be employed to post-process unreliable stylometric
analysis results. (3) We operationalize and evaluate an analysis chain that combines document chunking, style model computation,
one-class classification, and meta-learning. |
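The first three stages of the analysis chain named in the abstract (chunking, style model computation, one-class style outlier detection) can be illustrated with a minimal sketch. This is not the authors' implementation: the function-word list, the feature set, the z-score outlier rule, and all function names are assumptions chosen for illustration, and the unmasking meta-learning step is omitted entirely.

```python
import re
import statistics

# Hypothetical, deliberately tiny function-word list (an assumption for illustration).
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "a", "is", "that")


def chunk_words(text, size):
    """Split text into consecutive chunks of `size` words (the last may be shorter)."""
    words = re.findall(r"[a-z']+", text.lower())
    return [words[i:i + size] for i in range(0, len(words), size)]


def style_vector(words):
    """Toy style model: average word length plus relative function-word frequencies."""
    n = len(words)
    avg_len = sum(len(w) for w in words) / n
    return [avg_len] + [words.count(fw) / n for fw in FUNCTION_WORDS]


def outlier_chunks(text, size=50, threshold=2.0):
    """Return indices of chunks whose style deviates strongly from the document.

    A chunk is flagged when, for at least one feature, its absolute z-score
    relative to all chunks of the same document exceeds `threshold`.
    This simple rule stands in for a real one-class classifier.
    """
    vectors = [style_vector(c) for c in chunk_words(text, size)]
    if len(vectors) < 2:
        return []  # nothing to compare against
    scores = [0.0] * len(vectors)
    for feature in zip(*vectors):  # iterate feature by feature
        mean = statistics.fmean(feature)
        spread = statistics.pstdev(feature)
        if spread == 0:
            continue  # feature is constant across chunks, carries no signal
        for i, value in enumerate(feature):
            scores[i] = max(scores[i], abs(value - mean) / spread)
    return [i for i, s in enumerate(scores) if s > threshold]
```

Calling `outlier_chunks(document_text, size=200)` returns the indices of stylistically deviating chunks. In practice a short final chunk yields a noisy style vector, so a realistic pipeline would likely merge or discard it and use a much richer feature set.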
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|