Application updates

March 5th, 2007

This site is now running on WordPress 2.1.2 with integrated Vanilla 1.1.2 forum. The integration (documented here) is still experimental – what isn’t on this site?! – so if you find any unusual behaviour please let me know.

Edit: It turns out that the update wasn’t quite as simple as I thought. In the new version of Vanilla the authenticator code (library/People/People.Class.Authenticator.php) has been changed to allow for the new security patch. The alternative People.Class.WordpressAuthenticator.php used for WordPress integration needs to be modified in the same way.

In detail, I edited the file library/People/People.Class.WordpressAuthenticator.php and replaced:

function GetIdentity() {
    $this->GetWordpressSettings();

with this:

function GetIdentity() {
    if (!session_id()) {
        session_set_cookie_params(0,
                $this->Context->Configuration['COOKIE_PATH'],
                $this->Context->Configuration['COOKIE_DOMAIN']);
        session_start();
    }
    $this->GetWordpressSettings();

from the new People.Class.Authenticator.php.

Hope this helps someone…

Edit 2: Make that Vanilla 1.1.1 – just once I get in really quick with an upgrade and look what happens… ;)

Edit 3: Hmm, now Vanilla 1.1.2… At least the above still works with no new changes.

Simple WordPress anti-spam for shared hosting

December 21st, 2006

I hate spam! The e-mail based stuff is bad enough (and much worse for the past few months) but it’s also invading web sites. WordPress, being such a popular package, is a major target for referrer and comment spam. In case you haven’t already come across these, they’re both techniques by which spammers attempt to place links on your site to their grimy porn / pharma / small caps or whatever sites.


Now at the simplest level these attempts are easy to block by ensuring that logs are not published and all comments must be approved before publication. But comment moderation can become a chore if you’re trying to sort one genuine message from among hundreds of fake ones, so there are some nice WordPress plugins available to help – notably Akismet, Spam Karma 2 and Bad Behavior. These three plugins each take a different approach to dealing with the problem, and they all seem to be effective. I’ve only tested Akismet and SK2 so far – both work well.

So I was reasonably comfortable with the situation (complacent?!) until I saw this thread on WHT titled Drummed out by spambots. Here’s someone with a small WordPress blog and similar anti-spam measures in place, and the sheer volume of spam requests has caused the host to suspend the web site. Now I don’t blame the host for this – if a single site is overloading the server you have to shut it down for the sake of all the other customers – but that’s not much consolation for the innocent victim of the spammers. Remember, nothing was fundamentally wrong with the blog: the spam was not published, the site was not exploited and the anti-spam system was working correctly. The server was simply overloaded by the job of handling comment posts at an average rate of about one every second! In effect, it’s an amplified DoS (denial of service) attack.

For me, it’s an early warning. I run several WordPress blogs, both for my own projects and for client web sites and I really don’t want this to happen to any of them. Fortunately the discussions in that thread led to a participant “extras” proposing what looks to me like a viable solution. This post describes the implementation.

What’s wrong with existing solutions?

Generally, not much at all, but in this situation the processing power required was too great. I’m looking for a simple solution that takes minimum processing power – so no PHP.

Principles of the new system

In WordPress, visitor comments are posted to the PHP script wp-comments-post.php. Writers of evil comment-spam-bots know this and will generally go there directly, whereas real visitors who wish to comment on a post will always view the page first. So the aim is to change the name of that script (for genuine visitors), detect all bots going to the original script and ban them!

Implementation

So step 1 is to send real visitors to a different script – I’ll call it “my-wp-comments-post.php” but if you try this yourself choose a name of your own.

So I’ll make a copy of wp-comments-post.php called my-wp-comments-post.php in the main WordPress directory, and in my template (usually comments.php and comments-popup.php) I’ll change the line:

<form action="<?php echo get_settings('siteurl'); ?>/wp-comments-post.php" etc.

to this:

<form action="<?php echo get_settings('siteurl'); ?>/my-wp-comments-post.php" etc.
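The copy-and-repoint step can also be scripted, which helps when several blogs need the same change. The following is only a sketch: the function name rename_comment_script and its three arguments are made up for illustration, and it assumes the theme's forms reference the stock script exactly as in the line shown above.

```shell
#!/bin/sh
# Sketch: copy the comment handler under a private name and repoint the
# theme's comment forms at the copy. All names here are illustrative.
rename_comment_script() {
    wp_root=$1      # WordPress install directory
    theme_dir=$2    # directory holding comments.php / comments-popup.php
    new_name=$3     # private script name, e.g. my-wp-comments-post.php

    # The stock script is left in place at this stage.
    cp "$wp_root/wp-comments-post.php" "$wp_root/$new_name" || return 1

    for tpl in "$theme_dir/comments.php" "$theme_dir/comments-popup.php"; do
        [ -f "$tpl" ] || continue
        # Rewrite only form actions that end in the stock script name.
        sed "s#/wp-comments-post\.php\"#/$new_name\"#g" "$tpl" > "$tpl.tmp" &&
            mv "$tpl.tmp" "$tpl"
    done
    return 0
}

# Example call (paths are placeholders):
# rename_comment_script /full/path/to/wordpress \
#     /full/path/to/wordpress/wp-content/themes/mytheme my-wp-comments-post.php
```

Writing to a temporary file and moving it into place avoids relying on sed's in-place option, which differs between implementations. Check the file permissions on the templates afterwards, since the rewritten file is a new file.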

Now, I might allow people a day or so to refresh their cache but after that I’m going to assume that anyone using the original script is an evil spammer and block them. To do this, using the idea suggested by extras in the post linked above, I’ll replace wp-comments-post.php with a new script that records the visitor’s IP address and then sends them an error page (403 Forbidden). I’m going to record the visitor’s IP address by creating an empty file, for reasons that will become apparent later. Here’s the simple code:

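<?php
// Directory holding one empty file per banned IP address
$blockdir = 'my-blocked-ips';
// Keep only digits and dots so the address is safe to use as a file name
$ip = preg_replace('/[^\d.]/', '', $_SERVER['REMOTE_ADDR']);
$file = "$blockdir/$ip";
// Mode 'x' creates the file and fails if it already exists; @ hides that warning
if ( $h = @fopen($file, 'x') ) fclose($h);
header('HTTP/1.0 403 Forbidden');
exit;
?>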

I’ve also created a directory called my-blocked-ips (again choose a different name if you want to do this) and made it writable by PHP – chmod 777 for a mod_php system. So, each hit on the old comment script records the visitor’s IP address. Now to do something with it…

.htaccess modifications

What we want to do is find the visitor’s IP address, look for a file of that name in the my-blocked-ips directory and if it’s found, block all access to the site. Here’s the mod_rewrite black magic to do it:

RewriteEngine On
RewriteBase /
#Use simple html error document in place of shtml standard
ErrorDocument 403 /403.html
# Ban blocked IPs (recorded as files in blocked directory)
RewriteCond %{REQUEST_URI} !^/403\.html
RewriteCond /full/path/to/my-blocked-ips/%{REMOTE_ADDR} -f
RewriteRule . - [F,L]
# Prevent direct access to blocked directory
RewriteCond %{REQUEST_URI} ^/my-blocked-ips/
RewriteRule . - [F,L]

First I’ve set up a special error page for 403 errors – one that requires very little processing, so just a very small, simple HTML page. Then comes the block itself: any request from an IP address found in the block list gets a 403, with our new error page specifically excluded so that it can still be served. Finally, just as a precaution, I’ve blocked direct access to the my-blocked-ips directory (as it’s server-writable I don’t want anyone to place a PHP file in there and run it).
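Because the block list is nothing more than empty files named after IP addresses, it can also be checked and edited by hand from a shell. Here is a minimal sketch of that idea; the function names and the BLOCKDIR variable are my own inventions for illustration, and it only handles IPv4 dotted-quad addresses.

```shell
#!/bin/sh
# Sketch: manage the file-per-IP block list by hand.
# BLOCKDIR should be the directory the blocking script writes to.
BLOCKDIR=${BLOCKDIR:-/full/path/to/my-blocked-ips}

# Accept only non-empty strings of digits and dots that do not start
# with a dot, so a name can never escape the block directory.
valid_ip() {
    case $1 in
        ''|.*|*[!0-9.]*) return 1 ;;
    esac
    return 0
}

# Ban an address (re-banning refreshes the file's modification time).
block_ip()   { valid_ip "$1" && : > "$BLOCKDIR/$1"; }
# Lift a ban.
unblock_ip() { valid_ip "$1" && rm -f "$BLOCKDIR/$1"; }
# Succeeds if the address is currently banned.
is_blocked() { valid_ip "$1" && [ -f "$BLOCKDIR/$1" ]; }
```

For example, `unblock_ip 192.0.2.7` would lift the ban on a visitor who got caught by mistake, with no need to touch .htaccess.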

Cleaning up

It’s probably a good idea to periodically remove the files from my-blocked-ips, partly to improve performance checking for a file in that directory but mainly because innocent people on dynamic IP addresses could be unfairly banned – say if the IP address happens to be one previously (ab)used by an exploited computer. So a cron job to periodically un-ban older addresses would seem useful, something like this:

find /full/path/to/my-blocked-ips/ -maxdepth 1 -type f -mtime +2 -exec rm -f {} \;
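The same cleanup can be wrapped in a small script for cron. This is a sketch under my own assumptions: the function name purge_old_bans, the script name in the sample crontab line and the default of 2 days are all illustrative. Note the sign on -mtime: +N selects files whose age in whole days is greater than N (the old bans we want to lift), whereas -N would select the newest ones.

```shell
#!/bin/sh
# Sketch: lift bans older than a given number of days.
purge_old_bans() {
    dir=$1          # block directory, e.g. /full/path/to/my-blocked-ips
    days=${2:-2}    # lift bans whose age in whole days exceeds this
    # -mtime +N matches files whose age, in whole days, is greater than N.
    find "$dir" -maxdepth 1 -type f -mtime +"$days" -exec rm -f {} \;
}

# Example call:
# purge_old_bans /full/path/to/my-blocked-ips 2
#
# Hypothetical crontab entry to run a script containing that call nightly:
# 15 3 * * * /full/path/to/purge-bans.sh
```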

Afterthought

Ok, I should have thought of this before but I don’t want this thing blocking the Googlebot (or any other search-engine spiders). If everything goes right it shouldn’t happen anyway, since there are no links leading to that page, but just to be on the safe side I’ve added this to robots.txt:

User-agent: *
Disallow: /wp-comments-post.php

Well-mannered bots will read that and not touch the blocking script. Spam-bots will most likely ignore it.

Does it work?

Too soon to tell. This site (like most WordPress blogs I guess) does receive a fair amount of comment spam – Akismet is showing 81 comment spams from the past 12 days. I’ll keep an eye on the system before and after the change and update this post later. I think success hangs on one important question: do the spam-bots actually look at the comments form in the pages or just default to the WordPress standard script? I’m assuming the latter, but if I’m wrong then the comment spam will continue coming in on the real comment form, in which case something smarter will be needed – perhaps based on the Bad Behavior plugin. We shall see.

Meanwhile, if anyone has relevant comments I’d be interested to see them (if only so I know the site is still working!)

Update (22 Jan 2007)

First, doing this requires some care to avoid either blocking legitimate users or letting the spam through unabated. With the site updates to WordPress 2.0.6 and then 2.0.7 in quick succession it went through a short period in each of those states, so if any genuine visitors tried to post a comment and got themselves blocked for a day or two – sorry, mea culpa. I now have it set up with no modifications to the standard WordPress files or directories (except for my own template), which should allow for painless upgrades.

(If anyone’s interested in the file system setup, let me know and I’ll post the details.)

All that makes it hard to analyse results but it appears to me that:

  1. During the period when the system was working correctly, very few spammers were caught and blocked – suggesting that in most cases they aren’t simply going to wp-comments-post.php without checking that it’s the target of the comment form.
  2. The overall rate of comment spam has reduced dramatically. Even when the system was effectively disabled the site received 31 spam comments in 15 days, about one third of the earlier rate.
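To check the "one third" figure, here is the arithmetic as a quick sketch. It assumes the earlier rate is the Akismet count quoted in the original post (81 spams in 12 days); the helper name rate is just for illustration.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL   # force "." as the decimal separator

# Spam comments per day, given a count and a period in days.
rate() { awk -v n="$1" -v d="$2" 'BEGIN { printf "%.2f\n", n / d }'; }

before=$(rate 81 12)   # Akismet figure from the original post
after=$(rate 31 15)    # figure from this update
ratio=$(awk -v a="$after" -v b="$before" 'BEGIN { printf "%.2f\n", a / b }')

echo "before: $before/day, after: $after/day, ratio: $ratio"
# prints "before: 6.75/day, after: 2.07/day, ratio: 0.31"
```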

The best way I can interpret these rather conflicting results is that comment spammers may look for a standard WordPress install – anything non-standard and they simply move on to the next victim. If so, reducing comment spam may be easier than I thought…

I’ll update with some results after the system has been in action (and working) for a while longer.

Mirrored dynamic sites using WordPress

July 23rd, 2006

When synchronizing content across two or more servers one of the trickier aspects is handling user updates – forum posts, comments etc. If updates are accepted on all servers then merging the data would be a regular headache.

Instead, ClonePanel requires that each server be given a role – slave or master – and updates made on the master are copied to the slave. If any changes were made on the slave they would be overwritten and lost at the next sync. So it’s essential that the master server should handle all user updates and the slave must always be read-only.

WordPress handles this perfectly through the two setup options: WordPress address and Blog address.

The blog address is used throughout the site for links to posts, categories and pages – this can be blank to ensure that a visitor browsing the site (master or slave) will stay on the same site / server he first arrived at.

The WordPress address is used in links to admin functions (Register, Login, Dashboard) and in the action of forms – all the things that change the site in any way. So here we use the fully qualified URI of the master server and ensure that all changes are made only there.

This site itself is an example. It’s available at clonepanel.com and clonepanel.net – one hosted in Australia, the other in California (you may notice a slight speed difference, depending on your own location). You can browse either site without problems but as soon as you try to register, log in or leave a comment you will find yourself on clonepanel.com – the master server.

The next step is to introduce failover – if the master server goes offline, how can the slave take over? That’s more difficult, and will be the subject of many future posts…

ClonePanel web site (WordPress version)

July 21st, 2006

I was updating the old web site to the latest version of Xaraya (1.1) and getting frustrated with some of the changes that demand modification of existing themes.

I’ve also been working on a couple of WordPress sites recently and loved the tiny code-base and the simplicity of integrating my own PHP code into a WordPress site.

Don’t get me wrong – I like Xaraya but it’s a huge beast! And if I’m going to be developing and maintaining other WordPress sites it makes more sense for me to concentrate on that rather than learning how to do the same things through Xaraya…

So I deleted Xaraya, downloaded WordPress and a neat theme from themes.wordpress.net and a couple of hours later here it is – the new WordPress-powered ClonePanel.