SpamFilter Ruby upload script

From Katrina Help Info

(see http://groups.yahoo.com/group/admin-katrina/message/1216)

From: "Rudi Cilibrasi" <cilibrar@...>
Date: Fri Mar 17, 2006  7:36 am
Subject: Re: [admin-katrina] SpammerBlockPattern update request 3/16/06

I've got it set to update by itself daily now from the URL given... So you should only need to email the list if something goes wrong and prevents updates. Best regards,

Rudi

P.S. Thanks to AnnaLissa for this:


#!/usr/local/bin/ruby
require 'fileutils'

scriptdir = File.dirname(__FILE__) + "/../htdoc/wiki"
filename = scriptdir + "/LocalSettings.php"
newname = scriptdir + "/LocalSettings.php.new"
backup = scriptdir + "/LocalSettings.php.bak"
regexfile = "/tmp/wgSpamRegex.txt"

# grab John's file and save it in a tmp location; stop if the download fails
system("wget http://www.myseattle.com/mediawiki/wgSpamRegex.txt -O #{regexfile}") or
  raise "Download of wgSpamRegex.txt failed"

regexp = ''
File.readlines(regexfile).each do |line|
   regexp = $1 if line =~ /^\$wgSpamRegex\s=\s(.*)/
end
raise "Can't find valid wgSpamRegex" unless regexp.length > 0

# read in text of original LocalSettings.php
src = File.readlines(filename)

# write a copy of the original text to a temporary filename
# (src is an array of lines, so join it before writing)
File.open(newname,"w") { |f| f.write(src.join) }

# move temporary file name to backup filename
File.rename(newname, backup)

File.open(filename,"w") do |f|
   src.each do |line|
     if line =~ /^\$wgSpamRegex\s=\s(.*)/
       # block form so backslashes in the new regex are inserted literally
       line.sub!(/\s=\s.*/) { " = #{regexp}" }
     end
     f.write(line)
   end
end

and here's my 'contribution' in root's crontab:

33 16 * * * /home/webuser/hosting/katrinahelp.info/admin/wikiSpamRegexChanger.rb
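
The extract-and-replace steps of the script above can also be written as two small functions that work on arrays of lines, which makes them easy to try out without touching LocalSettings.php. This is only a sketch: the names SPAM_REGEX_LINE, extract_spam_regex and replace_spam_regex are hypothetical and not part of the original script.

```ruby
# Hypothetical helpers; the names are illustrative, not from the original script.
SPAM_REGEX_LINE = /^\$wgSpamRegex\s=\s(.*)/

# Return the right-hand side of the last "$wgSpamRegex = ..." line, or nil.
def extract_spam_regex(lines)
  value = nil
  lines.each do |line|
    m = SPAM_REGEX_LINE.match(line)
    value = m[1] if m
  end
  value
end

# Return a copy of the settings lines with the $wgSpamRegex value replaced.
def replace_spam_regex(lines, new_value)
  lines.map do |line|
    if line =~ SPAM_REGEX_LINE
      # The block form inserts new_value literally, so backslashes in the
      # regex are not treated as backreferences.
      line.sub(/\s=\s.*/) { " = #{new_value}" }
    else
      line
    end
  end
end
```

Because "." does not match a newline, the line ending of the replaced line is kept as it was.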

On 3/16/06, John Walling <wallingconsulting@...> wrote:
> Anna Lissa or Rudi,
>
> We had a spambot hit yesterday. We had a 2 month streak without any
> bot spam.
>
> Please update SpammerBlockPattern from here
> http://www.myseattle.com/mediawiki/wgSpamRegex.txt
>
> Note: The regex string follows the line: "#current version:". For some
> strange reason, the line appears to be blank in my Firefox browser but
> is visible in my IE browser. The PCRE string is about 4658 characters
> long which is well below the system limit of 64KB.
>
> You can see the regex fragments listed alphabetically here
> http://www.myseattle.com/mediawiki/wgSpamRegexList.txt
>
> Thanks,
> =John
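
John's note about the 64KB limit suggests a cheap sanity check before the new pattern is written into LocalSettings.php. The sketch below assumes the limit means 64 * 1024 bytes; MAX_REGEX_BYTES and regex_size_ok? are hypothetical names, not part of the script above.

```ruby
# Assumption: the "64KB" limit from the message is 64 * 1024 bytes.
MAX_REGEX_BYTES = 64 * 1024

# True when the downloaded regex text is non-empty and within the assumed limit.
def regex_size_ok?(regex_text)
  size = regex_text.bytesize
  size > 0 && size <= MAX_REGEX_BYTES
end
```

In the upload script this could sit right after the pattern is read, for example: raise "wgSpamRegex is empty or too long" unless regex_size_ok?(regexp)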