1 Definition

“Plagiarism is the practice of claiming or implying original authorship of (or incorporating material from) someone else's written or creative work, in whole or in part, into one's own without adequate acknowledgement. Unlike cases of forgery, in which the authenticity of the writing, document, or some other kind of object itself is in question, plagiarism is concerned with the issue of false attribution. (Wikipedia, retrieved 17:42, 25 March 2008 (MET))”

Plagiarism is one form of academic dishonesty. A related form is contract cheating, i.e. having someone else produce the work. Kuro5hin has an interesting article, shadow scholar (2010), in which a ghostwriter called Nathaniel Orenstam claims to have written about 5000 pages per year, including a PhD thesis.

The definition of plagiarism is cultural (see below). Particular attention needs to be paid to this in order to avoid intercultural misunderstandings and to help students and researchers from non-western cultures adapt to western practice.

According to Webster University [1], TurnitIn [2] [3] defines ten different forms of plagiarism. Some are more severe than others, some may be due to neglect.

  1. Clone: complete copy of a text (submitting another's work as one's own)
  2. CTRL-C: Quoting a source without citing it (with quotation marks but no citation), or quoting it without quotation marks.
  3. Find-replace: slight alterations, i.e. paraphrases that are not really rewritten.
  4. Remix: Piecing together different sources (that includes properly paraphrased ideas) without citing.
  5. Recycle: self-plagiarism (auto-plagiarism), e.g. from prior homework
  6. Hybrid: A mix of proper citations and plagiarised content
  7. Mashup: Similar to CTRL-C but using slight alterations
  8. 404 Error: Wrong citation, typically an author A who cited an author B
  9. Aggregator: Aggregated essay material without proper reflection (e.g. the case of this article here, which is OK since it does not pretend to invent anything).
  10. Re-tweet: Correct citations, but improper quotation (e.g. changing wording and/or structure without signalling it in a proper way).

Citation without correct page numbers is an increasing problem. Journals often provide HTML pages as alternatives to difficult-to-read PDFs but fail to provide paragraph numbers. E.g. in this article we do quote without page numbers, because HTML-compatible citation methods are not provided by the vendors. A typical example is Sutton, Taylor and Johnston (2014). Indeed, if we had to write a conference or journal paper we would identify a page number through the PDF version, but for a little bit of note-taking this represents too much work. Therefore, editors, please number paragraphs if you want to support precise citations.

2 Perception of plagiarism among students

Sutton, Taylor and Johnston (2014) [4], in a survey study among English and Australian students, identified three factors: “Dishonest behaviours were viewed as most serious, followed by actions representing Poor Referencing, with Group Work behaviours seen as least serious.”

3 Plagiarism in cultural contexts

According to Chien (2014:120) [5], “Pennycook (1996) and Sowden (2005) state that plagiarism is culturally conditioned and therefore is interpreted differently in diverse cultures. Pennycook (1996) suggests the complex nature of plagiarism, indicating that ideas of ownership, authorship, and intellectual property evolving in Western society contain distinctive cultural and historical elements.” Even more importantly, Chien [5] argues that “At a deeper level, knowledge in some collectivist cultures such as those in China, Japan, and Korea, is regarded as belonging to societies, and it is a responsibility for people to share knowledge with others for the benefit of societies as a whole (Introna, Hayes, Blair, & Wood, 2003). Textual authorship is thus not owned by an individual, but is instead shared by all members of society.”

Handa and Power (2005) [6] note that plagiarism by foreign students is often associated with their poor language skills. They argue that academic integrity and plagiarism among students from non-western cultures need to be addressed in specific ways. In their study, the authors found, for example, that:

  • Referencing is neither expected nor taught in undergraduate classes in the Indian context. “ [...] in an Indian context, lack of integrity (cheating) had nothing to do with referencing and acknowledging ideas and words from books and authors. Ideas and even words of well known writers and philosophers are considered as part of the collective bank of knowledge and learners are supposed to make use of these to learn and develop new knowledge.” (p.74)
  • In a similar vein, the act of writing itself differs, i.e. in some cultures it is perfectly acceptable to build sentences from phrases in books that are viewed as "common goods" (Pennycook 1996) [7]. “ [...] students taught to base their writing on books and other writers’ texts to get a good mark may have a different concept of plagiarism” (p.75)

Yusof (2009) [8] also argues that “knowledge according to some societies including Asian is considered to belong to the society as a whole and it is a duty to share it with others (Hu, 2001 in McDonnell, 2003, Introna et al, 2003). This asserts the idea of a collective society and the concept of societal interdependence advocated in Asian societies which opposes the view on the value of individual rights and ownership”.

Introna et al. (2003) [9] state that “in the UK, we typically expect a significant part of the assessment of a course to be some form of writing such as a critical review of reading material or an essay. For most our overseas students this is not familiar ground, as our data indicated.” In other words, students from non-western cultures are tested through "recall examinations" and are simply not used to producing written work that is then evaluated. The authors (p.51) [9] also point out that some authors (e.g. Howard (1993) [10], based on Hull and Rose (1989) [11]) argue that patchwriting, defined as “copying from a source text and then deleting some words, altering grammatical structures, or plugging in one-for-one synonym-substitutes (p. 213)”, “is a legitimate attempt to "interact with the text, relate it to your own experiences, derive your own meaning from it" (p. 150).”

Most researchers acknowledge that plagiarism is largely a Western concept and that there is a gap between western and non-western cultures. Michalska (2012) [12] looked at disparities in mindsets of students inside Europe. However, we couldn't find any results of this research so far.

4 Remediation and prevention strategies

Plagiarism most often occurs when learners are left alone to produce a term paper and/or if classes are too big. When learning activities are properly scenarized with a good project-oriented instructional design model, risks seem to be much lower.

Here is a short list of strategies to consider (see also Wikipedia's dealing with contract cheating and Gretchen Pearson's Plagiarism and Anti-Plagiarism. Both retrieved 17:42, 25 March 2008 (MET)):

  • Turn assignments into real personalized projects (like mini-research projects).
  • Change subject areas (i.e. paper topics) for each course.
  • Require step-wise delivery, lists of themes, goals, questions, resources etc. Each of these must be reified as a product and be discussed and evaluated. As an example see the C3MS project-based learning model.
  • In the same spirit, require an electronic research trail (e.g. have them use a wiki to work on concepts)
  • Tell students that prior work must be considered and that citations are encouraged, but also announce that you will use plagiarism detection software.
  • Have students present their paper and ask them tough questions.

5 Education of graduate students from non-western contexts

Appropriate action may take different forms in different contexts. Students must be trained for different cultural contexts before any sanctions are applied. Also, one should consider relaxing an all-inclusive definition of plagiarism, e.g. with respect to "patchwork writing". The latter, incidentally, is a strategy in line with a growing Internet subculture which believes that re-use and re-mixing are perfectly fair if the original authors allow it, e.g. through a Creative Commons licence.

Chien (2014) [5] looked at English teachers' perceptions of plagiarism in Taiwan. The study found that teachers adopt a developmental approach, e.g. they allow students to rewrite a text rather than punishing them.

Hayes and Introna (2005:229) [13], considering the different cultural values observed among overseas students, conclude: “One central implication arising from our research pertains to the need for Western academics not only to develop a broader understanding of how overseas students were taught and assessed but also to communicate their expectations and explain how they differ from those in the students’ own country, and to provide resources for students to meet these expectations. Further, it is important to ensure that students view their assessments as an opportunity to learn rather than as merely an externally imposed logic of judgment.”

6 Plagiarising from the Internet, a new frontier and rule changer?

Plagiarising from the Internet (other than from articles or books in electronic form) may not be seen as plagiarising by certain subcultures, since re-blogging, mashing up, re-tweeting etc. without proper referencing is current practice and since there is a belief that everything on the Internet is (by default) public. E.g. Introna et al. (2003:39) [9] found that “For all groups, plagiarising material from the internet was seen by students to be a much less serious form of cheating than plagiarising from non-electronic sources.”

A second issue concerns the tension between plagiarism defined extensively (as in TurnitIn's 10 types) and a newly emerging Internet culture that explicitly encourages reuse and mixing. While most "open" licenses do require proper citation, the idea that something of interest can be created by mixing (i.e. TurnitIn's "aggregator/RSS feed model") is not compatible with a traditional view of "original" work. We wonder whether a strong requirement for original work may explain the sad state of technology emerging from research (unusable, useless, ugly, forgotten, dead links, etc.). Our field could be in better shape if improving and combining others' work were more valued.

In any case, "moral rights" (i.e. paternity and maybe integrity of the work) should be respected in academia, even if some legal systems (e.g. the US) do not require that. People should make a strong distinction between contextualized plagiarism rules and various copyright models. There are situations where academic originality is required and situations where it is not (e.g. in this wiki). Various copyright schemes allow for very different types of reuse and authors should understand the sometimes subtle nuances.

7 Tools

We can identify several features that distinguish a good anti-plagiarism tool.

  • A large database of both public and non-public sources (e.g. as of 2016 TurnitIn has 60 billion Internet pages, 159 million closed-access publications, and 626 million student works)
  • Good matching algorithms for finding similar text (e.g. string matching, bags of words)
  • Good algorithms for finding similar ideas (e.g. through detection of style or citation analysis)
  • Cross-language detection/translation (e.g. match a work written in French against sources written in English and Italian)
  • Diverse filtering options, e.g. the ability to exclude quoted text.
  • Detection of other media (e.g. source code, pictures)
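
To make the "matching algorithms" item above more concrete, here is a minimal sketch in Python of two of the simplest text-similarity measures: overlap of word n-grams (a crude form of string matching) and a bag-of-words cosine similarity. This is only an illustration. We do not know which algorithms TurnitIn or any other service listed in this article actually use. The function names, the naive tokenizer and the n-gram size of 3 are assumptions made for this example.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lower-case the text and split it into word tokens (naive tokenizer, an assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shingles(tokens, n=3):
    """Return the set of word n-grams ("shingles") found in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard overlap of two sets: size of intersection divided by size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine_bag_of_words(tokens_a, tokens_b):
    """Cosine similarity of word-count vectors; word order is ignored."""
    count_a, count_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(count_a[w] * count_b[w] for w in count_a.keys() & count_b.keys())
    norm = sqrt(sum(v * v for v in count_a.values())) * sqrt(
        sum(v * v for v in count_b.values())
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    source = tokenize("The cat sat on the mat")
    suspect = tokenize("On the mat the cat sat")
    print("3-gram overlap:", jaccard(shingles(source), shingles(suspect)))
    print("bag of words:  ", cosine_bag_of_words(source, suspect))
```

In this toy example, reordering the words of a copied passage leaves the bag-of-words score essentially unchanged, while the n-gram overlap drops. This suggests why a tool might want to combine several measures rather than rely on one.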

Wikipedia's plagiarism detection article (retrieved 6/2016) includes a picture made in 2011 by an anonymous author that summarizes methods and performance of plagiarism detection systems:

Detection performance of CaPD approaches depending on the type of plagiarism being present (Wikipedia)

7.1 Commercial online

Below are a few services, but more exist. TurnitIn probably offers the best features overall. We do not know who offers the best price/quality ratio (and how this could be measured). Few services are multi-lingual. Some services may offer better performance in a given language, e.g. Compilatio may do a fairly good job in French (since it's a French company). Daniel K. Schneider (Updated May 2016).

In roughly alphabetical order:

  • PlagiarismCheck. Proficient online plagiarism checker tool. Finds exact matches, synonymization, rearrangements and paraphrase plagiarism. Page-based pricing model.
  • PlagAware. A German company. Includes relatively cheap single assessments (e.g. a big master's thesis can be checked for 10 Euros, i.e. with 250 ScanCredits one can check 62,500 words).
  • Plagscan. Free trial version (for 2000 words). Integrates with some LMS.
  • TurnitIn. Major institutional player. Integrates with some LMS.
    • Ephorus. Merged with TurnitIn in 2015; no longer offered, but still supported.
    • Ithenticate. A TurnitIn service for universities (dissertations, external publications), research institutes, and the publishing industry. Individual submissions are possible ($100 per paper).
  • UnPlag. Integrates with some LMS.
  • Urkund. Integrates with some LMS.

7.2 Free online services

Free online tools offer some minimal services, in particular checking of smaller text fragments. Most of them also offer commercial extensions.

7.2.1 Plagiarism detectors for text fragments and files

  • Plagtracker. Free online web service. Can check full papers as well as fragments. Can compare to 5 million academic papers as well as all (or most) Internet pages (added 14:24, 24 September 2012). The free version allows copy/paste of text fragments.
  • Helioblast. Allows copying/pasting text fragments of 1000 words and compares them to Medline abstracts (therefore of limited use).
  • NoPlag. Free copy/paste of text fragments. Premium services are commercial.

7.2.2 Web pages

  • CopyScape. Check a web page. There is also a premium version.

7.3 Free software

There seems to be no free anti-plagiarism software that offers a full analysis of longer papers. Some software (e.g. this) pretends to be free, but is not. Other software is not confidential, e.g. uploaded materials will be made public.

English texts
  • Viper (Windows only) is free client software. However, in the free version, an uploaded text will be entered in the public database after 9 months. It is therefore not suitable for teachers, since they are not allowed to publish student essays without the students' consent. Actually, Viper requires that one hold the intellectual property rights to scanned texts. The program (as of May 2016) does not work in "our" region (Switzerland).
  • WCopyfind examines and compares a collection of document files. Can handle text, HTML, and some word processor formats. Only useful once you have identified possible sources of plagiarism.
Multilingual texts
  • CopyTracker (dead link). Free software to download (also available from SourceForge). Not updated since 2013. Dead project?
  • Use a search engine like Google and just copy/paste some particularly well written sentence, first within quotes, then without quotes. This also works for computer code.
Software plagiarism
  • MOSS. A System for Detecting Software Plagiarism. Free for non-commercial use, subscription needed.
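
The search-engine tip listed under multilingual texts above can be partly automated. The following Python sketch picks a few long sentences from a text and builds, for each one, a quoted and an unquoted query string that can be pasted into a search engine. The function names and the thresholds (at least 8 words, at most 3 sentences) are hypothetical choices for illustration. Sentence length is only a rough stand-in for "particularly well written", and no search engine API is called.

```python
import re


def candidate_sentences(text, min_words=8, limit=3):
    """Return up to `limit` of the longest sentences with at least `min_words` words.

    The sentence splitter is naive (it also splits after abbreviations such as "e.g.").
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    long_enough = [s.strip() for s in sentences if len(s.split()) >= min_words]
    return sorted(long_enough, key=lambda s: len(s.split()), reverse=True)[:limit]


def search_queries(sentence):
    """Build the two queries from the tip: first within quotes, then without quotes."""
    cleaned = sentence.strip().rstrip(".!?").replace('"', "")
    return [f'"{cleaned}"', cleaned]


if __name__ == "__main__":
    essay = (
        "Short one. "
        "This sentence has clearly more than eight words in it. "
        "Tiny."
    )
    for sentence in candidate_sentences(essay):
        for query in search_queries(sentence):
            print(query)
```

The quoted query looks for an exact match, while the unquoted one may still surface sources after small wording changes.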

8 Instruments to study perceptions of plagiarism

