Computing Reviews
Citation-based plagiarism detection : detecting disguised and cross-language plagiarism using citation pattern analysis
Gipp B., Springer Vieweg, New York, NY, 2014. 350 pp. Type: Book (978-3-658063-93-1)
Date Reviewed: Apr 29 2015

Gipp has published his doctoral dissertation in book form with a change in the original subtitle, “Citation-based Plagiarism Detection [CbPD]: Applying Citation Pattern Analysis to Identify Currently Non-Machine-Detectable Disguised Plagiarism in Scientific Publications.” Online access to the dissertation points only to a PDF preprint of this book with an added dissertation title-page for Otto-von-Guericke University Magdeburg (OvGU), from which the English title above was quoted. The book comprises seven chapters, ten appendices, and a glossary, with 372 bibliographical entries.

The author presents the conception, implementation, and evaluation of a novel approach to detecting plagiarism: the focus is not on matching character strings, but on matching sequences of citations. In many cases of acknowledged plagiarism, even when the text is heavily reordered, paraphrased, or translated into a different language, the citations and their order are little disturbed.
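The idea can be illustrated with a minimal sketch. It is not the book's method or the CitePlag implementation; the two passages, the bracketed citation keys, and the helper names are all hypothetical, and Python's standard difflib stands in for the book's citation similarity measures. It only shows that a paraphrase can leave the citation sequence intact while the wording diverges.

```python
import re
from difflib import SequenceMatcher

# Assumption: citations appear as bracketed author-year keys, e.g. [Adams2001].
CITATION = re.compile(r"\[([A-Za-z]+\d{4})\]")

def citation_sequence(text):
    """Return the citation keys in order of appearance."""
    return CITATION.findall(text)

def words(text):
    """Return the lowercased tokens of the text with citations removed."""
    return CITATION.sub("", text).lower().split()

def similarity(a, b):
    """Generic sequence similarity in [0, 1] from difflib (an illustrative stand-in)."""
    return SequenceMatcher(None, a, b).ratio()

# Hypothetical passages: the second paraphrases the first.
original = ("Plagiarism is widespread [Adams2001]. Detection tools compare "
            "strings [Baker2005], yet paraphrase defeats them [Clark2009].")
disguised = ("Copying of ideas occurs often [Adams2001]. Software usually matches "
             "characters [Baker2005], but rewording evades it [Clark2009].")

citation_sim = similarity(citation_sequence(original), citation_sequence(disguised))
word_sim = similarity(words(original), words(disguised))

print("citation sequences:", citation_sequence(original), citation_sequence(disguised))
print("citation similarity:", citation_sim)
print("word similarity:", word_sim)
```

In this toy case the citation sequences are identical even though few words survive the paraphrase, which is the situation the citation-based approach is designed to exploit.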

This notion is ingeniously extended to several measures of citation similarity and applied to real and realistically scaled models. The results broadly demonstrate the viability of the CbPD automated approach. Not only are known examples detected, but unknown instances have been discovered, leading to admittedly plagiarizing papers being withdrawn or retracted. Moreover, the citation matching can be computationally more efficient than existing string-matching systems and will detect plagiarism patterns they cannot. In short, the ideas are compelling, the execution is brilliant, and the reporting is lucid and a joy to read.

Posted on citeplag.org/thesis are links to the thesis, related publications, the prototype CbPD system CitePlag, and evaluations (the latter two on request). Other materials and examples can be found via SciPlore.org.

There are some weaknesses: The mathematics is clearly stated, but the explanation is not always so clear. For example, the crucial longest common citation sequence (LCCS) is twice (pp. 70, 82) stated to be unique, but the citation sequence pair 12345678 and 56781234 has two LCCSs of length four, namely 1234 and 5678. Also, the discussion of Cont.-Score needs n+1 instead of n in two places in the second paragraph on page 86.
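The counterexample can be checked mechanically. The following sketch uses the textbook dynamic program for longest common subsequences and treats each digit as one citation; the function name all_lccs is hypothetical, and nothing here is taken from the book's own implementation.

```python
from functools import lru_cache

def all_lccs(a, b):
    """Return the set of all longest common subsequences of strings a and b."""
    m, n = len(a), len(b)
    # table[i][j] holds the longest common subsequence length of a[i:] and b[j:].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    @lru_cache(maxsize=None)
    def collect(i, j):
        # Enumerate every maximal-length common subsequence of a[i:] and b[j:].
        if i == m or j == n:
            return frozenset({""})
        if a[i] == b[j]:
            return frozenset(a[i] + tail for tail in collect(i + 1, j + 1))
        found = set()
        # Follow only the branches that preserve the optimal length.
        if table[i + 1][j] == table[i][j]:
            found |= collect(i + 1, j)
        if table[i][j + 1] == table[i][j]:
            found |= collect(i, j + 1)
        return frozenset(found)

    return set(collect(0, 0))

# The pair from the review, one character per citation.
result = all_lccs("12345678", "56781234")
print(sorted(result))
```

For this pair the function returns two distinct sequences of length four, so a definition that relies on the LCCS being unique needs a tie-breaking rule.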

There are a number of pervasive editing errors, such as dropping the second comma of a pair or confusing “that” and “which.” An example of the latter appears on page 220: “We found no study on scientific fraud, which analyzes how many studies containing fabricated or falsified data also contain plagiarism.” On page 14 occurs one too rare: “in a clandestine manor.” A safe house?

Incoherent diagrams (especially Tables 1, 8, 15, 16, and 24, and Figures 27, 29–32, 43, 44, 50, 53, and 56) result from printing the colored thesis in black and white. For example, Figure 3 seems to illustrate autocloning in the absence of its yellow highlighting. Page 78 suffers from no colors and wrong color names (but color cannot rescue Figure 51’s curious caption). There are many reasons to buy a DVD of a streamed movie. In this case, purchasing a hardcopy so inferior to the index-searchable, enlargeable, multicolor PDF download, despite loyalty and royalty, cannot be recommended.

Structural and idea plagiarism are two types not yet addressable by computers. A third type requires little automation to discover: self-plagiarism (mentioned on pp. 13 and 15) is considered evil only “without such reuse being justified.” I have seen a doctoral dissertation that consisted of end-wrapping text around five papers that were previously published, all with additional authors. Two men submitted duplicate theses to Harvard and MIT. When this was questioned during one’s oral defense, he asserted: “The results are worth four PhDs. We only want two. Let’s move on.” A famous mathematician counseled many to follow his example of successfully mining the dissertation for journal articles for decades. The reviewer has obviously plied this trade too. Page iv acknowledges that this is in fact an OvGU dissertation, but the final word is that this book does not cite or credit the underlying work.

Reviewer: Benjamin Wells
Review #: CR143399 (1507-0564)
  Reviewer Selected
Featured Reviewer
 
 
Pattern Analysis (I.5.2 ... )
 
 
Pattern Matching (F.2.2 ... )
 
Other reviews under "Pattern Analysis": Date
Understanding data pattern processing
Inmon W., Osterfelt S., QED Information Sciences, Inc., Wellesley, MA, 1991. Type: Book (9780894353864)
Jun 1 1992
Parallel thinning with two-subiteration algorithms
Guo Z., Hall R. Communications of the ACM 32(3): 359-373, 1989. Type: Article
Jan 1 1990
A variable window approach to early vision
Boykov Y., Veksler O., Zabih R. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(12): 1283-1294, 1998. Type: Article
Oct 1 1999

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®