Comment & Trackback anti-SPAM Script

Filed in:b2evo

November 18th, 2005  · stk

Here is an anti-SPAM Perl script that provides an excellent defense against comment and trackback SPAM. It automatically renames the htsrv directory, using a random, 6-character/number sequence. It turns the "htsrv" directory - necessary for trackbacks and comments - into a moving target, making it difficult to SPAM. Hurrah!

3-Jun-2006: NOTE: The three techniques outlined in this article once worked as a defense against blog comment/trackback SPAM. They have all been defeated by spammers and are no longer effective. :(

It's recommended that you look here for a table of SPAM-fighting options.

11-Dec-2005: Added - Support for both pre/post Phoenix b2evolution releases. Setting a switch will allow the script to run on either.

Renaming the "htsrv" directory has been our first line of defense and this script makes it even better. I used to manually rename it (then edit the _advanced.php file - only one line - to note the change). Because it was manual, I didn't do it very often - maybe once every couple of months. (One time, I waited too long. The spammers found the new name and BOOM - SPAM). Grrr. :> Changing it often is ideal, but I want to blog, not spend my time renaming files and such. I'm no geek! :roll:

Thanks to Dan MacTough (and some handy-dandy modifications by yours truly) ... there's now a Perl script that does it for you, randomly and automatically! The script runs as a cron job, periodically renaming the "htsrv" directory. Even IF the spammers FIND the moving target, there's only a small window in which it'll accept SPAM, because the script will run again, change the name & yield 404 "File not Found" errors. HA! Take THAT spammers! :D

If your b2evolution blog has ever been under attack by spammers, either leaving automated comment or trackback spam, then you'll appreciate this tool. I'll also provide two other techniques we use ... both of which are a good defense, as well. I can't guarantee these techniques will keep your b2evolution blog SPAM-free, because that's the nature of SPAM. (You're only SPAM-free ... till BOOM ... you're not). :-/ Still, they've worked for us for nearly a year & this script only tightens the defenses.

For the details on these 3 easy anti-SPAM techniques & the code for the "hidehtsrv" script ... read on ...

(1) Renaming the HTSRV Directory (& the "HideHTSRV" Script)

You don't need the script to implement this easy technique, but because it does it automatically, it is definitely a hassle-free improvement. To do it manually, rename the "htsrv" directory. Then, edit the "conf/_advanced.php" file, looking for the "htsrv_subdir" line & changing [ 'htsrv'; ] to whatever new directory name you used. Easy.

The script does the same thing, appending a "(dot)6X", after the original "htsrv" name (where "6X" = Six random numbers/letters). You can see the current folder name, by clicking the "permalink" ... it shows in the "trackback" URL (the grey text between two horizontal lines immediately above the comment area).

To use the Perl script, cut-n-paste it from the window below and follow the instructions in the comments. It's as easy as all that. Dan did the scripting work and I prettied it up and made it easy to customize for your site. We hope you like it! (Too much work to cut-n-paste? Download it!)

#!/usr/bin/perl
# version 051211 (for b2evolution installs)
# By Dan MacTough
# Mod by stk
# This script renames the b2evolution htsrv directory and updates
# the _advanced.php configuration file with the new name.
# Run this manually or set up a cron job to run it periodically.
# This should help stop comment and trackback spam.
use strict;
use File::Copy;
# use CGI::Carp qw(fatalsToBrowser);
  #                       >>>> Instructions: <<<<                              #
  # (1) Change the 4 variables below to match your site                        #
  # (2) Upload script to your cgi-bin or bin directory (ASCII MODE)            #
  # (3) Chmod for execute capability (owner, group, world) e.g. 755            #
  # (4) Test via your browser (       #
  # (5) Success?  Yay!  -> Check trackback addy & comment ability              #
  #     [If you can't comment, chances are, the 3rd variable isn't correct]    #
  # (6) Failure?  Boo :( ->  Check your cPanel error logs.                     #
  #     It should be straight-forward to get this working.                     #
  #     IF you can't ... email me (Scott) at (via 'contact us'     #
  #     and I'll see if I can help).                                           #
  # (7) After successful testing, set up a cron job to run this script         #
  #     periodically (once a day, late at night, is recommended)               #
  #     (Can you say "good-bye automated SPAM?")  ;-)                          #
  #                                                                            #
  # Note: You may still receive manually entered SPAM.  If you do, use the     #
  #     Anti-spam tab to locally ban, delete and report the offending URL.     #
  #     This is the best defense for unwanted commentors (until moderated      #
  #     commenting is available).                                              #
  #                                                                            #
  #     Cheers, stk  :D                                                        #

  # CHANGE the next 3 variables
  # (1) Your root directory string (need trailing slash)
my $htmlroot = '/home/randsco/public_html/';
  # (2) The directory tree for your b2evolution install (need trailing slash)
my $blogroot = '/home/randsco/public_html/blogD/';
  # (3) Which version of b2evo are you running PRE-Phoenix (< 1.0)
  #     or POST-Phoenix (> v1.0) ... log into your back office to see
my $vers = 'pre'; # values must be either "post" or  "pre"

  # ADVANCED Setting (probably don't need to change this)
  # (4) Change this ONLY if you've manually
  #     modified the 'conf/_advanced.php' file, by adding something to the
  #     "htsrv_subdir = 'htsrv';" line (around line 300) that's
  #     IN FRONT of ['htsrv...]
  #     (Most installs do not need to worry about this 4th variable. If you've
  #     moved things around, you might have a directory or two there).  
  #     IF you do, you'll need to escape the trailing slash, like below:
  #     my $advHTSRV = 'folder\/';
my $advHTSRV = '';
  #  END of changes (You shouldn't have to modify anything below this line)    #

  # Generate random dir name
my $num = random_password($ARGV[0]);
sub random_password {
    my ($length, $vowels, $consonants, $numbers, $alt, $numpos, $s, $newchar, $i);
    ($length) = @_;
    # No length given (or one shorter than 3)? Default to 6 chars
    if (!defined $length or $length eq "" or $length < 3) { $length = 6; }
    $vowels = "aeiouyAEIOUY";
    $consonants = "bcdfghjklmnpqrstvwxzBCDFGHJKLMNPQRSTVWXZ";
    $numbers = "0123456789";
    srand(time() ^ ($$ + ($$ << 15)) );
    $alt = int(rand(1000) % 2); # better than rand(). 1 = start with a vowel
    $numpos = int(rand(5)); # position of the one digit; no dirs ending in a number.
    $s = "";
    $newchar = "";
    foreach $i (0..$length-1) {
        if ($i == $numpos) {
            $newchar = substr($numbers,rand(length($numbers)),1);
        } elsif ($alt == 1) {
            $newchar = substr($vowels,rand(length($vowels)),1);
            $alt = !$alt;
        } else {
            $newchar = substr($consonants,rand(length($consonants)),1);
            $alt = !$alt;
        }
        $s .= $newchar;
    }
    return $s;
}
  # End Random Dir Generation

  # Get the current htsrv directory
opendir(DIR, $blogroot) || die "can't opendir $blogroot: $!";
my @subdirs = grep {/^htsrv/} readdir(DIR);
my $htsrvDir = $subdirs[0]; # current b2e htsrv directory
defined $htsrvDir || die "no htsrv directory found in $blogroot";

closedir DIR;

  # Load the _advanced.php file
my $configfile = $blogroot.'conf/_advanced.php';
open(FILE, "<$configfile") || die "can't open $configfile to read: $!";
my @contents = <FILE>;
close FILE;

  # Rename the htsrv directory
my $old = $blogroot.$htsrvDir;
my $new = $blogroot.'htsrv.'.$num;
move($old, $new) || die "can't move $old to $new: $!";

  # Edit the _advanced.php file to replace $htsrv_subdir with the new name
open(FILE, ">$configfile") || die "can't open $configfile to write: $!\n".
  "Warning: $configfile does not point to $new! ".
  "Please correct this manually";

foreach (@contents) {
    if ($vers eq "post") { $_ =~ s/^(\$htsrv_subdir = '$advHTSRV)(htsrv).*?('.*)$/$1$2.$num\/$3/si; }
    if ($vers eq "pre" ) { $_ =~ s/^(\$htsrv_subdir = '$advHTSRV)(htsrv).*?('.*)$/$1$2.$num$3/si;   }
}
print FILE @contents;
close FILE;

  # Feedback if run from a browser client (GET or POST request)
if (defined $ENV{REQUEST_METHOD} and $ENV{REQUEST_METHOD} =~ /^(GET|POST)$/) {
    print "Content-type: text/html\n\n";
    print "<div style=\"border:3px double gray; padding:20px; text-align:center; width:500px; font-family:verdana,sans-serif; \"><h1>Hide HTSRV Script</h1><br />";
    if ($vers eq "post") {
        print "(for b2evo versions <strong>above</strong> v1.0)"; }
    if ($vers eq "pre") {
        print "(for b2evo versions <strong>below</strong> v1.0)"; }
    print "<h2 style=\"color:green;\">Success!</h2><br />";
    print "<p>The script renamed <strong>$htsrvDir</strong> to <strong><span style=\"color:red;\">htsrv.$num</span></strong><br />";
    print "Your _advanced.php file has been edited to reflect this change.</p>";
    print "<em><span style=\"color:gray;font-size:12pt;\">Hide HTSRV script, courtesy of Dan MacTough (</span></em>";
    print "<span style=\"color:gray;\"><em>& Scott Kimler (</em></span></div>";
}
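
If you'd like to see what the script does to your config file before letting it loose, here's a minimal sketch of just the substitution step, run against a made-up line. The sample line and the suffix are assumptions for illustration only (the real script reads "conf/_advanced.php" and makes its own random suffix), and it mimics the "pre" setting; the "post" setting also puts a trailing slash after the new name.

```perl
use strict;
use warnings;

# Hypothetical sample line, as it might appear in conf/_advanced.php
my $line = "\$htsrv_subdir = 'htsrv.Ab3dEf';\n";

# Hypothetical new suffix (the real script generates six random characters)
my $suffix = 'k7RomU';

# Keep the "htsrv" part, swap whatever follows it inside the quotes
(my $updated = $line) =~ s/^(\$htsrv_subdir = ')(htsrv)[^']*(')/$1$2.$suffix$3/;

print $updated;
```

The old suffix is thrown away and the new one takes its place, which is why the script can be run over and over without the name growing longer.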

(2) Blocking Off-Site Spammers

This technique, which I learned from Whoo, involves a change to your site's ".htaccess" file. The code blocks any remote calls to your comments file (meaning that one has to be LOOKING at your pages, in order to comment ... they can't be commented remotely, using a script that exploits the comment_post.php file).

Here's the .htaccess code:

# Block remote calls to comments
RewriteCond %{HTTP_REFERER} !^http://(www\.)?*$ [NC]
RewriteCond %{REQUEST_URI} .*comment_post.php$
RewriteRule .* - [F]
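
To get a feel for which requests those two conditions turn away, here's a rough Perl sketch of the same logic. It is not what Apache runs; it only imitates the test, and "example.com" and the is_blocked name are placeholders I made up (use your own domain).

```perl
use strict;
use warnings;

# Hypothetical site host; substitute your own domain
my $site = 'example.com';

# Returns 1 when a request would get the 403, mirroring the two conditions:
# the URI ends in comment_post.php AND the referer is not one of your own pages
sub is_blocked {
    my ($uri, $referer) = @_;
    $referer = '' unless defined $referer;
    my $targets_comments = $uri =~ /comment_post\.php$/;
    my $from_own_site    = $referer =~ m{^http://(www\.)?\Q$site\E(/|$)}i;
    return ($targets_comments && !$from_own_site) ? 1 : 0;
}

# Script calling the comment file directly, no referer: blocked
print is_blocked('/blog/htsrv/comment_post.php', ''), "\n";
# Visitor posting from one of your own pages: allowed
print is_blocked('/blog/htsrv/comment_post.php', 'http://www.example.com/blog/index.php'), "\n";
# Any other page on the site is never touched by the rule
print is_blocked('/blog/index.php', 'http://spam.invalid/'), "\n";
```

Note that an empty referer counts as "off-site", so a visitor whose browser or firewall strips the referer header would be turned away too.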

(3) Renaming the Comment File

Putting the cart before the horse a little bit, the 3rd technique we employ is renaming the "comment_post.php" file. Why? Because ALL the default b2evolution installs use this same name and spammers have learned it. By changing it, you're putting up another barrier to SPAM.

Changing the file name is easy. Make something up. Be creative. Be strange! Be crazy! :crazy: But don't forget ... you'll need to edit two files so that b2evolution knows just how crazy you can be:

• admin/_edit_showposts.php (around line 266)

(summer beta v1.8) • inc/VIEW/items/

• skins/_feedback.php (around line 98)

In BOTH of the files, look for the following line and change the "comment_post.php" portion to your new file name. (AND ... the reason the cart was before the horse ... if you've implemented (2) above, you'll need to change the "comment_post.php" name there too.)

<form action="<?php echo $htsrv_url ?>/comment_post.php" method="post" class="bComment">

That's it. Done.

I hope that you find all of this useful. Personally, I like these techniques because they're all preventative. By keeping a low profile, you'll ward off - not only comment/trackback SPAM, but referrer SPAM, as well. Oh ... regarding referrer SPAM:

• DO keep current with the anti-spam blacklist in the back office

• DO check your referrer stats periodically

• DO report suspected SPAM referrers to the b2evolution blacklist

• DON'T display your stats on your page.

(You might also consider renaming the "/skins/yourSkin/stats.php" file, as this can be called up directly, by anyone, to have a look at your stats. :-/ ) Ack!

Happy blogging!

Updated: 17-Jul-2006
1. Comment
If you replace "comment_post.php" with $comment_post_url you can add a new setting to your config and get your script to rename both folder and file ;)

2. Danny Comment
Great work. I may have to use that perl script. I've had some problems with the method described in number 3. Some skins override the /skins/_feedback.php file, so any skins that do have to be edited, too. And then when you install a new skin you have to check and see if it does. Or, if you create a new skin and upload it to you have to make sure that you change it back to the original value (comment_post.php) before zipping it and emailing it in. Otherwise any users who download it won't be able to post comments using that skin. Not that this has ever happened to me.

3. stk Comment
LOL ... yer always wanting to task me with SOMETHING!

"$comment_post_file" makes more sense as a variable name tho.

You know I hate core file hacking. All that might be too many changes.

4. Comment
Lol, I meant 'file' but you can't edit comments and I'm too blonde to get things right first time ;)

I agree with you Danny, but somebody who's being spammed to the hilt will find the effort worth it ;)

5. Comment
*edit* cos I really am too blonde to get it right first time :S

If you added if ( !isset( $post_comment_FILE ) ) $post_comment_file = 'comment_post.php'; to the top of any skins you created then you wouldn't need to change skins uploaded to b2evo.

6. Dan Comment
Good write up, Scott. As I just mentioned in an email to you, I think that if you're renaming the htsrv directory, you don't need to rename the comment_post file -- because it's IN the htsrv directory. Renaming the htsrv directory kills two birds with one stone!

If you implement the hidehtsrv Perl script, you may be interested in my follow-up post which describes a method to permanently ban the IP of anyone who repeatedly tries to spam you at the default htsrv address.

7. stk Comment
Danny & ¥ ...

I think Dan raises a good point. Renaming the dir that contains the comment-post file is enuff.

I had already renamed our comment-post file before I learned about the htsrv dir rename technique ... so it was easier to leave it.

Using the "hideHTSRV" script, method (3) is prolly unnecessary.

Dan ...

Keeping SPAM off our site is goal #1. I like your prophylactic solution (thanks again) ... since it works, of what benefit is banning IPs?

8. Sieg Comment
Hey Scott, I was looking at your anti spam hack, and I was just going to do it manual by renaming the htsrv folder, but I cant find it for the life of me. I remember you saying you changed a few things a while back on my blog dealing with spam, do you remember by chance if you may have renamed it to something else?

9. stk Comment

You're our first satellite ISP commentor!

I emailed you with your htsrv folder name (indeed, I renamed it a while back, when you were having problems with comment spam).

10. Ross Comment
I wonder ... if I use the script, will trackbacks that other have used on their sites become invalid when the htsrv folder name changes?

11. stk Comment

Changing the htsrv folder name will NOT "break" trackbacks done on other sites.

Reason: The trackback addy is used to LOG the "trackback", when one is made. It's the PERMALINK that is used in the "trackbacked" article on the other site ... and the permalink isn't affected by a 'htsrv' folder rename.

Make sense?

12. C.H. Truth Comment
That code that goes into the .htaccess file - where does it go and what is the best way to edit the file?

Should we change it to a .txt file or just edit it another way?

13. stk Comment
.htaccess is a text file and any text editor should be able to edit it. You may append a .txt to the end, but if you do, you'll need to change it back to ".htaccess" on the server. (b2evo ships with a sample .htaccess file).

One thing I forgot to mention is that you NEED (1) to have mod_rewrite capabilities AND (2) the rewrite engine needs to be ON.

(1) is up to your host and there's not a lot you can do if they don't allow it (other than complain).

(2) is easy ... just add the line "RewriteEngine on" (not quoted)

(3) Just add the text anywhere after the "RewriteEngine on" part.

Put it all together and you have the following: ( # are comment lines )

# Mod ReWrites
RewriteEngine on

# Block remote calls to comments
RewriteCond %{HTTP_REFERER} !^http://(www\.)?*$ [NC]
RewriteCond %{REQUEST_URI} .*comment_post.php$
RewriteRule .* - [F]

The only thing to change is the [] part.

Hope that helps.

14. Sieg Comment
Now this should be the second SAT post:) Any way Im lost in B2 hades again. Im scratching my head trying to figure out why I cant change this darn link :

Trackback address for this post:

Im sure the pros will laugh at me, I knew I should have just taken the spam and not changed any settings so here I am again pleading for help:)

15. stk Comment

I had a look this morning, changed a few things and yer back good as new!

(See my private email for the details).

16. Dan Comment
Scott, you asked "of what benefit is banning IPs". The answer is bandwidth and deterrence of a denial-of-service attack.

By banning IPs at the Apache (or whatever) server level, repeat spammers cannot mount a DOS attack by, for example, repeatedly calling a VALID url on your site, where, unless you're only using static files, each call involves filesystem and database calls. Multiply that by many thousands and your bandwidth could get burned quickly and your site could become unavailable as it gets over-taxed.

17. blony Comment
Nice try and thanks. Seems auto-spammers actually use scripts that go to a post, invoke a comment link, and post a comment. Thus, they get around the name changes. I have spam attempts from same source within minutes of htsrv directory name change - each using the respectively correct directory name - and with on-site referal info.

The hits are irritating but, at least, a captcha keeps the comment spam out.

18. stk Comment

You're certainly NOT full of blony, as spammers HAVE defeated this technique.

However, there are now some more simple techniques that can be employed to defeat the spammers, without having to make visitors jump through the CAPTCHA hoop. ;)

But, you are correct, the three techniques in this article - that USED TO WORK - no longer do.

I've posted a comment at the top of the entry, to this effect.

Thanks for commenting.

19. blony Comment
One thing I've done for a spammer that repeatedly checks stats, and which I suspect is responsible for putting trojans all over the internet to do comment spam, is add the following to .htaccess:
RewriteCond %{query_string} disp=stats
RewriteCond %{HTTP_REFERER} ^.*d4ap.*$ [NC]

RewriteRule \.*$ [R]

I'm watching for loops or other manifestation of retaliation but, for now it's satisfying.

I'm considering doing something similar for the auto-comment spam but have yet to figure how to distinguish from genuine users on the fly.

20. blony Comment
For what it's worth, I have a PERL code that operates on a raw log and the .htaccess file. It generates deny directives for all comment_post accesses in the raw log and adds them to .htaccess.

It avoids duplicating anything in .htaccess, including anything added during the current run.

It is indiscriminate, though, regarding comment posts in the raw log and so one must use last_comments to delete the deny for any valid commenters.

I may change that to use a list of valid IPs up front since it's far easier to get those than enter all the spam IPs by hand. In fact, if motivated by inconvenience (mine or someone else's) I'll add the functionality to acquire valid comment IPs from the database. That way, if you clean up the spam once and then run it periodically, it should keep offensive IPs out.

Issue: Whether the offending IPs are roughly static is an issue. For the experimenting I'm currently up to, I assume they are based upon the heuristic that many given IPs repeatedly show up - it's nice to see them end up with a 403 since I made the update.

I was trying to address this by hand. Glad I stopped to automate. With a raw log of roughly six days, from which I'd added about a hundred combined with a down load of a blacklist generated by someone else (that contained a lot of addresses I was seeing) of 700 plus IP addresses - the PERL code generated about 1300 more offenders.

Any interest, feel you're (anyone) welcome to it.

21. stk Comment
You're not full of baloney and you're not lazy!! That's quite a bit of work to put together such a PERL script. Congrats. (The only thing I know about PERL, is that it comes from oysters). :||

Personally, I'm not big on CAPTCHA (b/c I don't like having visitor jump over a hoop, no matter how small, because of spammers. Just my opinion.)

No offense to your PERL script, but I'm not big on deny in .htaccess either, for two reasons (1) it's generally after-the-fact (you get spammed, then have to deal with the spam and add the IP to a list) ... so to me, it's "reactionary". and (2) from our investigations, spammers change IP's faster than I change socks, so any list (a) can grow quite long and (b) will likely contain a bunch of obsolete/one-use IP's anyway. (AND - consider - once a spammer has discarded an IP, it can be reassigned to a legit commenter).

I know of two new methods that I'll turn you on to, both of which IMHO, have some merit. (Again, though, we're not focusing on either one because we have half-a-dozen trickier things up our sleeves.)

(1) There's a post on b2evolution by Whoo about mod-security. It can check the PAYLOAD of the comment, so you can pop 403's with a broad brush-stroke for words like "cialis"; "xanax"; "phentermine", etc. It's powerful and cheap, though you need to be wary of banning things like "ambien" because it'll kill any comments that contain "ambienT noise", as well.

(2) I don't think this is an option for you, because you're not running v1.8 CVS (I'm not either), but it is worth mentioning. There's a plug-in written for b2evolution that checks a comment for "spamminess" (but only if it's not obviously spammy ... you set the threshold), against a spam DB at . I suppose something could be manufactured for versions 0.9.x, but we're not doing it because we don't like having to access 3rd parties for scripts (no telling if their servers will be up or down). Might be worth a look though.

Thanks for posting! Let me know if people show interest in the Perl script.

22. m Comment
hello from germany,
thank you for these very nice and helpfull information - I like the idea of renaming comment_post.php best :-) hope this works out fine ...
all the best,