Spam: Difference between revisions

The educational technology and digital learning wiki
* [http://www.gearhack.com/Articles/FightSpam/ Fight Comment Spam, Ban IP's] A large list of banned IP addresses by Chieh Cheng. (There exist others)
* [http://www.stopbadware.org/ StopBadWare] (in case someone managed to upload code, e.g. JavaScript)
* [http://www.oecd.org/dataoecd/63/28/36494147.pdf Report of the OECD Task Force on Spam: Anti-Spam Toolkit of Recommended Policies and Measures] (2006), PDF.
* [http://news.netcraft.com/archives/2004/06/04/wikis_the_next_frontier_for_spammers.html Wikis: The Next Frontier for Spammers?] (Netcraft, 2004).
=== Legal issues and official policy ===
Note:
* Wiki spamming is worse than e-mail spamming because it also amounts to vandalism, so additional laws can apply.
* Official EU and OECD websites are often unstable and suffer from link decay; e.g. the official www.oecd-antispam.org website, which is linked to from many places, is dead.
* [http://spamlinks.net/legal-laws.htm Anti-Spam Laws] (good resource)
* [http://en.wikipedia.org/wiki/E-mail_spam_legislation_by_country E-mail spam legislation by country] (Wikipedia)
; USA (main direct or indirect source of spamming)
* [http://www.spamlaws.com/spam-laws.html Spam Laws: The United States CAN-SPAM Act]
* [http://www.ftc.gov/bcp/edu/pubs/business/ecommerce/bus61.shtm The CAN-SPAM Act: A Compliance Guide for Business]
* [http://en.wikipedia.org/wiki/CAN-SPAM_Act_of_2003 CAN-SPAM Act of 2003] (Wikipedia)
; EU
* [http://www.euro.cauce.org/en/index.html The European Coalition Against Unsolicited Commercial Email] (EuroCAUCE).
* [http://ec.europa.eu/information_society/policy/ecomm/todays_framework/privacy_protection/spam/index_en.htm Unsolicited communications - Fighting Spam] (EU Information society portal, retrieved 11:07, 16 July 2010 (UTC)).
* [http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:52006DC0688:EN:NOT Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions on fighting spam, spyware and malicious software], retrieved 11:07, 16 July 2010 (UTC).
; UK
* [http://www.scotchspam.co.uk/law.html Spam Law Summary] (Scotch Spam)
* [http://www.ico.gov.uk/what_we_cover/privacy_and_electronic_communications/guidance.aspx Privacy and Electronic Communications (EC Directive) Regulations 2003] (Information Commissioner's Office)


=== General wiki spamming ===

Revision as of 12:07, 16 July 2010

Draft

Lookup IP addresses and domain names

This may allow you to block whole domains (e.g. in the httpd.conf file or at the system level). Wikis are sometimes spammed manually, and in that case a lookup can help a bit.

If your MediaWiki is spammed, you will first have to either go through your web server logs (e.g. search for "submitlogin") or install an extension that shows the IP addresses of users.

The CheckUser extension allows you to find out where spammers connect from and may help you decide whether you should block one or more whole IP ranges (e.g. a whole country). You can enter either user names or IP addresses, and then both trace and block a user.
It is installed on EduTechwiki and also on some Wikipedia/Wikimedia sites.
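
As a rough illustration, CheckUser could be enabled in LocalSettings.php along the following lines. This is a minimal sketch, not the setup used on this wiki: it assumes a MediaWiki release that supports wfLoadExtension() (1.25 or later) and assumes that sysops are the group trusted with IP lookups.

```php
# Load CheckUser (MediaWiki 1.25+ syntax). Older releases used:
# require_once( "$IP/extensions/CheckUser/CheckUser.php" );
wfLoadExtension( 'CheckUser' );

# Assumption: sysops may look up IP data and read the lookup log.
# By default these rights belong to a separate "checkuser" group.
$wgGroupPermissions['sysop']['checkuser']     = true;
$wgGroupPermissions['sysop']['checkuser-log'] = true;
```

After installing the extension, maintenance/update.php has to be run once so that the database tables it needs are created.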

Alternatively, dig through the web server access logs and then consult one of these:

MediaWiki spamming

There exist several strategies:

Registered users

To fight spamming, only registered users should be able to edit. Edit LocalSettings.php:

$wgGroupPermissions['*']['edit']            = false;
$wgGroupPermissions['*']['createaccount']   = true;
$wgGroupPermissions['*']['read']            = true;
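
Registration alone can be tightened a little further. The following is a minimal sketch using core MediaWiki settings; the thresholds (4 days, 10 edits) and the choice of rights to restrict are illustrative assumptions, not values taken from this wiki.

```php
# Require a confirmed e-mail address before a registered user can edit.
$wgEmailConfirmToEdit = true;

# Accounts become "autoconfirmed" after 4 days and 10 edits (example values).
$wgAutoConfirmAge   = 4 * 86400;   # in seconds
$wgAutoConfirmCount = 10;

# Reserve riskier actions (moving pages, uploading files) for autoconfirmed accounts.
$wgGroupPermissions['user']['move']            = false;
$wgGroupPermissions['autoconfirmed']['move']   = true;
$wgGroupPermissions['user']['upload']          = false;
$wgGroupPermissions['autoconfirmed']['upload'] = true;
```

Group permissions are additive, so setting a right to false for 'user' only means that this group does not grant it; autoconfirmed accounts still receive it through their own group.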

Light-weight user creation that requires some math

This can defeat some scripts.
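
One way to get such an arithmetic question is the ConfirmEdit extension with its SimpleCaptcha class. This is only a sketch under that assumption (the extension originally linked here may have been a different one), using the MediaWiki 1.25+ loading syntax.

```php
# Load ConfirmEdit. Older releases used:
# require_once( "$IP/extensions/ConfirmEdit/ConfirmEdit.php" );
wfLoadExtension( 'ConfirmEdit' );

# SimpleCaptcha asks a small arithmetic question; it is ConfirmEdit's default class.
$wgCaptchaClass = 'SimpleCaptcha';

# Ask the question when an account is created.
$wgCaptchaTriggers['createaccount'] = true;
```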

Making user creation more difficult with captcha

This can defeat more scripts.

Making user creation more difficult with reCAPTCHA, which also contributes to a digitization project.

This extension is currently used in Edutechwiki with (roughly) the following setup:

# Anti Spam ConfirmEdit
# Recaptcha relies on ConfirmEdit, but only ONE needs to be loaded
# require_once("extensions/ConfirmEdit/ConfirmEdit.php");

# ReCaptcha
# See the docs in extensions/recaptcha/ConfirmEdit.php
# http://wiki.recaptcha.net/index.php/Main_Page
require_once( "$IP/extensions/recaptcha/ReCaptcha.php" );
$recaptcha_public_key = '................';
$recaptcha_private_key = '................';

# Users must be registered; once they are in, they still must fill in captchas (at least over the summer)
$wgCaptchaTriggers['edit']          = true;
$wgCaptchaTriggers['addurl']        = false;
$wgCaptchaTriggers['create']        = true;
$wgCaptchaTriggers['createaccount'] = true;
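
If captchas on every edit become too much of a burden for regular contributors, ConfirmEdit's skipcaptcha right can exempt trusted groups. The sketch below is an assumption about how one might relax the setup above, not part of the current configuration; which groups count as trusted is a local decision.

```php
# Let trusted groups skip captchas entirely (ConfirmEdit's "skipcaptcha" right).
$wgGroupPermissions['sysop']['skipcaptcha']         = true;
$wgGroupPermissions['autoconfirmed']['skipcaptcha'] = true;

# Also show a captcha after a failed login, to slow down password guessing.
$wgCaptchaTriggers['badlogin'] = true;
```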

Filtering edits and page names

Prevent creation of pages with bad words in the title and/or the text.

The built-in $wgSpamRegex variable

MediaWiki includes a $wgSpamRegex variable. The goal is to prevent three things: (a) bad words, (b) links to bad web sites and (c) CSS tricks to hide content.

Insert in LocalSettings.php something like:

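$wgSpamRegex = "/badword1|badword2|abcdefghi-website\.com|display_remove_:none|overflow_remove_:\s*auto;\s*height:\s*[0-4]px;/i";

I will not show ours here since I can't include it in this page ;) The "_remove_" markers in the example appear to be there for the same reason; delete them in your own configuration.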

Read the manual page for details. It includes a longer regular expression that you may adopt.

Don't forget to edit MediaWiki:Spamprotectiontext, the message shown to users whose edit is blocked.
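
A single-line pattern gets hard to maintain as it grows. The sketch below builds the same kind of expression from a list and reuses it for edit summaries through $wgSummarySpamRegex; the $spamPatterns name and the listed patterns are hypothetical placeholders.

```php
# Hypothetical placeholder patterns; replace them with your own.
# Patterns must not contain an unescaped "/" because it is used as the delimiter.
$spamPatterns = array(
    'badword1',
    'badword2',
    'abcdefghi-website\.com',                  # link to a bad web site
    'overflow:\s*auto;\s*height:\s*[0-4]px;',  # CSS trick to hide content
);

# Join the alternatives into one case-insensitive regular expression.
$wgSpamRegex = '/' . implode( '|', $spamPatterns ) . '/i';

# Apply the same filter to edit summaries.
$wgSummarySpamRegex = $wgSpamRegex;
```

Keeping one pattern per line makes it easier to see which entry caused a false positive and to comment on why it was added.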

Spam blacklist extensions (an alternative)

The SpamBlacklist extension prevents edits that contain URL hosts that match regular expression patterns defined in specified files or wiki pages.
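
A minimal sketch of such a setup, assuming a recent MediaWiki release (1.25+ loading syntax and the $wgBlacklistSettings variable; older releases used require_once and $wgSpamBlacklistFiles instead). The Wikimedia blacklist URL is the commonly used shared list, given here as an example source.

```php
wfLoadExtension( 'SpamBlacklist' );

# Sources of URL patterns: here the shared blacklist maintained on Wikimedia's Meta-Wiki.
$wgBlacklistSettings = array(
    'spam' => array(
        'files' => array(
            'https://meta.wikimedia.org/w/index.php?title=Spam_blacklist&action=raw&sb_ver=1',
        ),
    ),
);
```

Local additions and exceptions can then be maintained on the wiki pages MediaWiki:Spam-blacklist and MediaWiki:Spam-whitelist.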

Links

General

Legal issues and official policy

Note:

  • Wiki spamming is worse than e-mail spamming because it also amounts to vandalism, so additional laws can apply.
  • Official EU and OECD websites are often unstable and suffer from link decay; e.g. the official www.oecd-antispam.org website, which is linked to from many places, is dead.
USA (main direct or indirect source of spamming)
EU
UK

General wiki spamming

Examples from content guidelines - what is spam?

MediaWiki

  • Spam Filter (This is a development page of MediaWiki. It includes extra information, e.g. cleanup scripts.)
  • Help:Spam (Wikia) Wikia is a commercial wiki-hosting service with many user-managed subwikis that have their own aims and content policies.