Blog Comment Spam Fix « Dvorak News Blog

Blog Comment Spam Fix

By John C Dvorak Monday September 26, 2005

Most people who run blogs struggle with comment spam, and there are all sorts of fixes. Marc Perkel at ctyme.com (my host) was floored, he said, when he realized a simple command to the Apache software would kill most of it. And it does indeed work!

Here is the short code running on the ctyme server for my dvorak.org blog, which uses WordPress-based blogging software. Altering it for other blog software and other blogs should be simple for anyone running Apache.

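<Location /blog/wp-comments-newpost.php>
RewriteEngine On
# Match when the Referer is not a page on dvorak.org (or a subdomain)
RewriteCond %{HTTP_REFERER} !^https?://([^/]+\.)?dvorak\.org/ [NC]
# Send those requests to a static page instead of the comment script
RewriteRule ^ http://www.ctyme.com/comment-spam.html [R,L]
</Location>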

Essentially it makes the basic condition for any post rigid: the request has to come from a link within the blog itself (the “comment” link), which Apache judges by the Referer header the client sends. Most spam does not.
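
For readers who want to see the logic outside of Apache, here is a rough Python sketch of the same Referer test. It is an illustration only, not what runs on the ctyme server: the function name `referer_allowed` and its `site` parameter are made up for this example, and it compares the Referer's host name rather than pattern-matching the whole URL.

```python
from urllib.parse import urlsplit


def referer_allowed(referer, site="dvorak.org"):
    """Return True when the Referer names a page on `site` or a subdomain.

    Hypothetical helper for illustration; `site` defaults to the blog's domain.
    """
    if not referer:
        # A missing or empty Referer is rejected, the same way the
        # rewrite rule treats a request that did not come from the blog.
        return False
    try:
        host = (urlsplit(referer).hostname or "").lower()
    except ValueError:
        # Malformed URLs cannot be trusted, so reject them too.
        return False
    return host == site or host.endswith("." + site)
```

The function only sees whatever header value the client chooses to send, so it filters out careless bots rather than proving where a request really came from.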

My spam count on the blog has dropped from 50-100 to 2 per day without any other tricks.

38 Comments

BlueBoi says:

9/26/2005 at 2:24 am

This will block the stupid bots, but the smarter bots that spoof the referer will have no problem. Good tip though.
Clear Rivers says:

9/26/2005 at 3:07 am

This is only a temporary fix. It’s very easy to fake the Referer Header.
Lenny says:

9/26/2005 at 3:23 am

This post should have included a link to the “I get no spam” audio clip. What the hey, I could have saved it and used it as the “new email notification” sound event because “I get no spam” as well.
Miguel Lopes says:

9/26/2005 at 3:41 am

I’d like to have that ‘I GET NO SPAM’ logo in a bigger format so I could make a t-shirt. Can it be arranged?

Oooops, there goes a nice merchandising idea…

But I’d really like to get the logo…
Eideard says:

9/26/2005 at 3:53 am

Bravo, Marc!
Zen Curmudgeon says:

9/26/2005 at 4:02 am

So, John, now it’s “I get ALMOST no spam”? 🙂

Twitfully yours,

ZC
nullbit says:

9/26/2005 at 5:18 am

Works for now, but spammers will adapt the minute the method becomes widely used enough for them to notice. In this case all they need to do is spoof the Referer: header, which is technically trivial.
Ernie Miller says:

9/26/2005 at 5:30 am

Glad to hear that’s working out for you. It should be noted, however, that the “referer” field is a header the client sends with each HTTP GET or POST request, including the requests made by comment spammers. As such, it would be trivial for most spam software to bypass the check by making sure the domain being spammed is included in the Referer header of the POST.

Frankly, I’m surprised they aren’t already doing this. But then, nobody ever said spammers were smart. 🙂
Michael Cotterell says:

9/26/2005 at 5:38 am

That’s awesome. I’m gonna put this on http://michaelcotterell.com/blog/ !
Evilpig says:

9/26/2005 at 5:45 am

Nice Job 🙂

Dvorak = NO SPAM AT ALL 😉

Lol
Stuart Colville says:

9/26/2005 at 6:03 am

Surely if the re-write engine is seeing the referrer sent by the user-agent this is easily bypassed by spoofing the HTTP referrer.

In PHP you can use a token method to prevent bots posting forms. First create a random token (say, use rand in PHP and then run MD5 on the result), then put it into a hidden form field and also write it to the PHP session.

On receiving the form data, if the hidden field token doesn’t match the one in the session, then the form wasn’t sent from that site and it can be safely denied.
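
A minimal sketch of the token idea described above, written in Python rather than PHP. It assumes the session is any dict-like, server-side store. The names `issue_token` and `token_is_valid` and the `comment_token` key are hypothetical. It also swaps the rand-plus-MD5 suggestion for Python's `secrets` module, which is intended for generating unpredictable tokens.

```python
import hmac
import secrets


def issue_token(session):
    """Create a random token, remember it in the session, and return it.

    The returned value would be written into a hidden field of the comment form.
    """
    token = secrets.token_hex(16)
    session["comment_token"] = token
    return token


def token_is_valid(session, submitted):
    """Check the hidden-field value against the session copy (single use)."""
    # pop() removes the stored token so the same form cannot be replayed.
    expected = session.pop("comment_token", None)
    if expected is None or not submitted:
        return False
    # Constant-time comparison of the two values as bytes.
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

A bot that posts straight to the comment script without first loading the form has no session token to echo back, so its submission fails the check.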
Kathy P. says:

9/26/2005 at 6:22 am

Not exactly related, but there is a photo of a young man at the SPX (Small Press Expo) in Bethesda Maryland with a shirt that says “The Dvorak Zone”. http://209.198.111.165/thebeat/
site admin says:

9/26/2005 at 7:06 am

I agree this fix will fade if implemented by too many people. As a test I recommend nobody use it but me! 🙂

And, yes, I have looked at numerous special fixes and have tried most of them. Eventually I’m sure I’ll have to use the “post code” trick where you type something that appears in a box.
Durin Platnick says:

9/26/2005 at 8:55 am

This is a great hint and will definitely help combat some of the stupider spam bots. Like people have said so far, the referrer is trivial to spoof. At this time, if a smart spam bot were developed, I’m not sure there’s much to do except moderate any comments that have URLs in them.
Carnell says:

9/26/2005 at 9:49 am

That famous episode of TWIT with the “I Get No Spam” conversation is what made me check out this blog. I must say it is my first stop every day now! Thanks John, always informative and hilarious!
Cycincal Al says:

9/26/2005 at 10:14 am

great job! now if you could only find a way to stop subjecting us readers to your spam advertising ..
Miguel Lopes says:

9/26/2005 at 11:50 am

I think they call that a Turing test. Now, where can I get a big ‘I get no Spam’ logo? C’mon?
Gregory says:

9/26/2005 at 12:27 pm

WP Hash Cache uses a token method and stops spam dead. Pretty much the only spam it can’t stop is manually entered spam… and that’s basically impossible to stop.

No need for any extra input boxes, just works.
cavemonkey50 says:

9/26/2005 at 12:28 pm

I tried to add the code to my .htaccess but I now get an Internal Server Error on every page.
NullVariable says:

9/26/2005 at 12:44 pm

Like everyone said, manual spam seems to be the only problem after a hack like this. The problem is that there are companies in India that, for very small sums of money, will have 5-10 people sign up on forums and blogs and leave the spam comments for you. I imagine we’ll see more and more of this as the spammers get tired of fighting the great filters that keep coming out.
Jon Maddox says:

9/26/2005 at 1:24 pm

I manually changed the field names and the variables on the page that they are submitted to. That stops the scripts in their tracks….
Rust says:

9/26/2005 at 1:35 pm

SpamKarma is the only spam plugin I’ve used on any of my blogs for months now, and it’s killed all but maybe 10 spam posts (and those 10 were completely random letters – not even a poker link) with only 2 false positives in that time. It even nails trackback and pingback spam 🙂
Ryan says:

9/26/2005 at 1:57 pm

So this pretty much just blocks autospamming bots? It seems to me like this is something that blogging software should come with built in. It should only allow connections to the comment posting script from a page within the website.

What about people that come on and post links to their free iPod referral sites?
Vince Anido says:

9/26/2005 at 4:46 pm

I’ve pretty much killed comment spam on my WP site recently by using both Bad Behavior and Spam Karma 2. They’re pretty invisible to 95% of users, and they’ve been very effective so far.
BlueBoi says:

9/26/2005 at 5:04 pm

What about using CAPTCHA and the Referer trick? I also like the idea of using tokens. The session can’t be seen by anything on the client side, so if you put an always-changing token on the form and also in the session, you have a fix there. For one thing, a lot of these bots don’t support cookies, and a session won’t work without a cookie. So in theory it would be bulletproof, but you aren’t going to stop a human spammer, because they will always pass these tests.
Marc Perkel says:

9/26/2005 at 5:48 pm

Yes, they could spoof the referrer, but then they lose the diverse source IPs they get with the current proxy tricks. Then I can just block the IP. So it’s not as easy as you think.
Ben Franske says:

9/26/2005 at 7:09 pm

A few months ago I did some research into anti-spam techniques for the b2evolution blogging software. It was in regard to referer spam, for which this absolutely doesn’t work, but I still looked at, evaluated and rejected this option as a general anti-spam measure for the following reasons. It is also important to remember that I was doing this research for the b2evolution community, not just myself, so if it caused problems for basic users or could not be included by default, it was unworkable.

1) As others have mentioned, the referer is client-supplied and easy to change, especially in an automatic spamming script.

2) In addition, some site visitors intentionally block the referer via software on their PC (which they may not even know they have) and this prevents them from commenting.

3) This relies on your Apache installation supporting mod_rewrite, which not all installations do. Even among those that do, there is some debate in the community as to how much of a blow it is to server resources to implement such a solution.

4) Making such a modification requires either a dedicated server with access to the httpd.conf file OR support for .htaccess files, which are also not supported by all hosts.

In conclusion, while the technique may work for some people for a while it is far from an end all be all solution, is not usable by many people with basic shared hosting plans and has been discussed in the blogging community before and generally rejected.
E Mooney says:

9/27/2005 at 12:31 am

sorry to ask, but would you think there would be an alternative version in .asp ?
0x1d3 says:

9/27/2005 at 4:28 pm

Didn’t pay much attention to this at first because I don’t have a blog. But when I heard you talking about it on TWiT I came back to it. What about people that link to your page from a different place? For example, the new Google Personalized. I try to link from there to here, but it comes up with nothing but the headers. However, I just have to reload it to make it work. Not a big deal, but something to think about.
Squozen says:

9/27/2005 at 11:08 pm

I use SpamKarma – it’s astounding. I get NO spam!

http://unknowngenius.com/blog/wordpress/spam-karma/
