Talk:Spam Patrol

From Katrina Help Info

SpammerBlockPattern

Pending and Past Updates for SpammerBlockPattern
When a user attempts to save a page that contains filtered text, the user is redirected to MediaWiki:Spamprotectiontext, which links to Spam Protection Comment.
SpammerBlockPattern latest version
Get latest version: http://www.myseattle.com/mediawiki/wgSpamRegex.txt
Test latest version: http://www.myseattle.com/mediawiki/index.php/More:Spam_Box
As of 3-17-2006, updates are processed daily by a cron job [1] (http://groups.yahoo.com/group/admin-katrina/message/1216)
SpammerBlockPattern contacts at KatrinaHelp.info
Administrator: user:jwalling
Tech support: Rudi Cilibrasi and AnnaLissa Cruz
Post request at http://groups.yahoo.com/group/admin-katrina/
Send request to mailto:admin-katrina@yahoogroups-dot-com (yahoogroups.com)

LocalSettings.php

To prevent spammers from saving edits that contain unwanted content, set the variable '$wgSpamRegex' in LocalSettings.php, which overrides the default value in DefaultSettings.php. Set it to a regular expression (RegEx) that matches any URLs (or parts of URLs) you do not want users to link to; it can also match any other content you wish to ban. When an edit matches, the user is shown an explanatory message indicating which part of the edit text is not allowed.

RegEx Examples

Sample Regular Expressions to block common spam fragments
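 # Two alternative samples; use only one, because a later assignment replaces an earlier one.
 $wgSpamRegex = "/overflow:\s*auto;\s*height:\s*\dpx/"; # big net: overflow:auto followed by a single-digit pixel height
 $wgSpamRegex = "/height:\s*\dpx/";                     # bigger net: any single-digit pixel height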
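
Several fragments can be blocked at once by joining them with alternation (|) into one expression. The following is a minimal sketch for LocalSettings.php, not the pattern used on this site: the variable name $spamFragments and the domain spam-example\.invalid are hypothetical placeholders, and the /i flag (case-insensitive matching) is an added assumption.

```php
# Sketch only: $spamFragments is an illustrative helper, not a MediaWiki setting.
# Single-quoted strings keep backslashes such as \s and \d intact.
$spamFragments = array(
    'overflow:\s*auto;\s*height:\s*\dpx', # hidden-div styling, from the sample above
    'spam-example\.invalid',              # hypothetical spam domain
);
# Join the fragments with | and wrap them in / delimiters; /i ignores case.
$wgSpamRegex = '/' . implode( '|', $spamFragments ) . '/i';
```

A fragment that contains a literal / must escape it as \/, because / is the pattern delimiter in this sketch.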

Regular Expressions Howto

The SpammerBlockPattern is formatted using Regular Expression (RegEx) syntax.

Resources for learning RegEx
http://en.wikipedia.org/wiki/Regular_expression
http://etext.lib.virginia.edu/services/helpsheets/unix/regex.html
http://www.regular-expressions.info/

Spam Blacklist Extension

A single regular expression becomes cumbersome as the number of blocked terms grows. Another approach is to keep a long blacklist of known spam URLs and spam terms in a more readable format than a single regular expression. With the Spam Blacklist extension you can let some of your users edit the blacklist on a wiki page, and you can fetch updates from external sources.
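
As a rough sketch of what enabling the extension in LocalSettings.php could look like, assuming the extension files are unpacked under extensions/SpamBlacklist: the setting name $wgSpamBlacklistFiles and the "DB:" source syntax follow the extension's documentation of that period as best understood here, and should be checked against the README of the installed version. The database name wikidb and the page title My_spam_blacklist are placeholders.

```php
# Sketch only; verify names and paths against the README shipped with your copy.
require_once( "$IP/extensions/SpamBlacklist/SpamBlacklist.php" );

$wgSpamBlacklistFiles = array(
    # Shared list maintained on Meta-Wiki, fetched as raw wikitext
    "http://meta.wikimedia.org/w/index.php?title=Spam_blacklist&action=raw&sb_ver=1",
    # Local wiki page editable by trusted users: "DB: <database name> <page title>"
    "DB: wikidb My_spam_blacklist",
);
```

Listing both a shared source and a local page lets the wiki benefit from outside updates while still allowing site-specific entries.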

Resources
http://meta.wikimedia.org/wiki/SpamBlacklist_extension
http://meta.wikimedia.org/wiki/Spam_blacklist
http://cvs.sourceforge.net/viewcvs.py/wikipedia/extensions/SpamBlacklist/
http://cyber.law.harvard.edu/globalvoices/wiki/index.php/Spam_blacklist
http://cyber.law.harvard.edu/dyn/globalvoices/wiki/index.php/User_talk:Sj#Spam_blacklist

More External Resources to Fight Wiki Spam

  • Wikimedia: Wiki Spam (http://meta.wikimedia.org/wiki/Wiki_Spam)
  • Wikimedia: Anti-spam Features (http://meta.wikimedia.org/wiki/Anti-spam_Features)
  • Wikipedia: Link spam (http://en.wikipedia.org/wiki/Link_spam)
  • Spam Chongqing (http://chongq.blogspot.com/) [2] (http://chongqed.org/chongqed.html)[3] (http://chongqed.blogspot.com/2004/11/two-reasons-why-indexing-kept-pages-is.html)
  • Interview with a link spammer (http://www.theregister.co.uk/2005/01/31/link_spamer_interview/)
  • Bad Behavior (http://www.ioerror.us/software/bad-behavior/) is a set of PHP scripts that keeps spambots from accessing your site by analyzing their actual HTTP requests and comparing them to profiles of known spambots. The analysis goes well beyond the User-Agent and Referer headers. Bad Behavior is available for several PHP-based software packages and can also be integrated quickly into any PHP script.
  • Fighting spam in Wikka (http://wikka.jsnx.com/WikkaSpamFighting?show_comments=1&showall=1)
  • PHP Naive Bayesian Filter (http://www.phpgeek.com/pragmacms/index.php?layout=main&cslot_1=14)

RSS checkpoint

This is an RSS checkpoint.

--jwalling 21:46, 15 Mar 2008 (CET)
--jwalling 20:43, 19 Mar 2008 (CET)
--jwalling 10:34, 1 Apr 2008 (CEST)
--jwalling 23:03, 14 Apr 2008 (CEST)
--jwalling 22:38, 29 Apr 2008 (CEST)
--jwalling 04:15, 3 May 2008 (CEST)
--jwalling 08:14, 9 May 2008 (CEST)
--jwalling 20:57, 9 May 2008 (CEST)
--jwalling 10:00, 3 Jun 2008 (CEST)
--jwalling 21:47, 12 Jun 2008 (CEST)
--jwalling 20:37, 20 Jun 2008 (CEST)
--jwalling 07:45, 2 Jul 2008 (CEST)
--jwalling 21:11, 7 Jul 2008 (CEST)
--jwalling 21:38, 20 Jul 2008 (CEST)

Spam protection filter

Special:Spam protection filter
