Archive for August, 2005

Bad Behavior, Enterprise Edition?

August 27th, 2005 by Michael Hampton

I’ve been doing some thinking. This means you might want to get to the nearest immediately.

Yesterday a fairly well known person who works for a very well known company contacted me about Bad Behavior. What’s not well known is that many of this very well known company’s customer-facing Web sites run WordPress. And, no, their sites don’t at all look like blogs. They must be the most complex, and most non-blog-like, themes ever designed for WordPress.

Anyway, who they are and what they do aren’t relevant. A few of you know, and it’s not a very well kept secret, but it’s irrelevant to the topic at hand, so I won’t be mentioning either the person’s name or the company’s name. (The person in question, who doubtless is reading this, can feel free to disclose it if he or she wishes, however.)

The topic of the day is Bad Behavior on very large sites. This person’s test of Bad Behavior yesterday on this very large site is, to date, the largest installation of any Web software I’ve ever written. So I’m proud of that. But the test was not without its problems, and Bad Behavior probably won’t be running on that site for a while.

The main problem that came up during the test is that one entire office of the company was blocked from access to the sites. Presumably for the company’s internal security or some such reason, I haven’t received any raw data from the test which would help me diagnose the problem, but I have been able to make some educated guesses.

Bad Behavior is known to be intolerant of some brands of Web content filtering software. These particular bits of software, if you’re stuck with one, will read a Web page before you do, and make an immediate decision on whether to allow you access to it. The problem arises because they feed the Web server a false user agent. Because Bad Behavior looks deeper than the user agent, it is easily able to tell that the request isn’t really coming from Internet Explorer, and does what it’s designed to do: it blocks the request.
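To make "looks deeper than the user agent" a little more concrete, here is a minimal sketch in Python. It is not Bad Behavior's actual code, and the single rule it applies (a client claiming to be Internet Explorer but sending no Accept header is suspect) is an assumption chosen purely for illustration. The function name is hypothetical too.

```python
from collections.abc import Mapping


def claims_msie_but_inconsistent(headers: Mapping[str, str]) -> bool:
    """Return True when a request claims to be Internet Explorer but
    fails an illustrative consistency check.

    Assumption (not taken from Bad Behavior itself): a genuine browser
    sends an Accept header, so a request that says "MSIE" and omits
    Accept is treated as not really coming from Internet Explorer.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    user_agent = normalized.get("user-agent", "")

    if "MSIE" not in user_agent:
        # The request does not claim to be IE; this check does not apply.
        return False

    return "accept" not in normalized
```

Note that a check like this cannot say why the user agent is fake. A content filter and a spambot that both omit the header look identical to it.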

Needless to say, if you block a Web content filtering program, it’s typically going to get annoyed, and block the user who’s stuck behind the filter. This is my current best guess as to what happened in yesterday’s test.

The problem for me, as the author of this spam killing software, is I can’t easily tell the difference between a Web content filter which pretends — poorly — to be Internet Explorer, and a spambot, which also pretends to be Internet Explorer. I can only tell that the request isn’t really coming from Internet Explorer, and presume the requesting user agent is up to no good.

The issue is that Web content filters feed Web servers a fake user agent for a different purpose. If the Web server knew the request was coming from filtering software, it could feed the filter clean, innocuous data. The filter, thus fooled, would then allow the user to access the site, even though it may contain pornography, black-hat hacking information, competitors’ job listings, or anything else the company has decided not to allow its employees to access on company time. Thus the content filter presents a fake user agent to the Web server.

Anyone responsible for the programming of any Web content filtering software, or for that matter just about anything with an HTTP client in it, should feel free to contact me, and I will immediately tell you exactly what your software needs to do to pass spambot filtering and properly maintain the fiction of being a real Web browser. And since several well-known link spammers read me and keep up with Bad Behavior, you also should feel free to contact me, and I will immediately tell you to go to hell.

Anyway, back to Bad Behavior in enterprise settings. If you plan to run Bad Behavior in such an environment, the first thing to do is to wait. I test each release myself, and a few relatively high-traffic sites also test new versions of Bad Behavior before release, but we can’t catch everything. Sometimes we miss things, and occasionally a third party will do something that throws a monkey wrench into the works, such as the recent release of Google Desktop. Wait for the minor version to stabilize before deploying it widely. Changes in the third digit of the version number are now reserved for bug and security fixes, so follow those releases as closely as your IT policies permit.
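If your deployment tooling needs to tell a fix-only release from a larger one, the third-digit rule is easy to check mechanically. The following Python sketch is only an illustration; the function names are made up, and it assumes purely numeric version strings such as "1.2" or "1.2.1" (it does not understand release candidate labels).

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Split a dotted numeric version into exactly three integers,
    padding missing parts with zero, so "1.2" becomes (1, 2, 0)."""
    parts = [int(piece) for piece in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def is_fix_only_update(installed: str, candidate: str) -> bool:
    """Return True when only the third digit increased, which under the
    policy described above means bug and security fixes only."""
    old = parse_version(installed)
    new = parse_version(candidate)
    return new[:2] == old[:2] and new[2] > old[2]
```

For example, going from 1.2 to 1.2.1 counts as a fix-only update, while going from 1.2.1 to 1.3 does not.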

The second thing to do is to whitelist your entire company’s internal networks. This especially means the RFC1918 addresses which most of you use extensively. They are already in the bad-behavior-whitelist.php file; just uncomment them. I’m assuming, of course, that your internal networks are not a source of spam. If they are, you have more problems than I alone can solve.
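For anyone unsure which addresses RFC1918 covers, here is a small Python sketch of the membership test. It is not the contents of bad-behavior-whitelist.php, which is a PHP file whose format is not shown here; the function name is hypothetical. The three address blocks themselves are the ones RFC 1918 defines.

```python
import ipaddress

# The three private IPv4 address blocks defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_internal_address(remote_addr: str) -> bool:
    """Return True if the address falls inside an RFC 1918 block."""
    address = ipaddress.ip_address(remote_addr)
    return any(address in network for network in RFC1918_NETWORKS)
```

The 172.16.0.0/12 block is the one people most often get wrong: it runs from 172.16.0.0 through 172.31.255.255, so 172.32.0.1 is a public address.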

If there are scripts which need to access your site, whitelist them only if Bad Behavior actually blocks them. An example of such a script would be the W3C Validator. (It passes, however, and does not need to be whitelisted; it’s only an example.)

Also consider whitelisting any public IP addresses used by your company, its partners, its vendors, etc. I say consider, not just do it, because in some circumstances this may not make sense. For instance, if you can’t trust one of your vendors to keep its systems secure, you may not wish to whitelist them.

Finally, if you run into a problem you’re unable to resolve, or if you have any suggestions for improving Bad Behavior, contact me as soon as possible. I’ll do whatever I can to assist you. And if you’re a spammer, bend over; I’ve got something special for you.

Bad Behavior 1.2.1

August 26th, 2005 by Michael Hampton

Bad Behavior 1.2.1 has been released to address issues people are having with whitelists not working, and with Google Desktop causing users to be blocked from the site. Bad Behavior is the Web’s premier link spam killer, protecting blogs, wikis, forums and CMS systems all over the Internet.

Obviously we’re not all perfect. A ridiculous omission caused Bad Behavior’s new whitelisting feature to simply not work. In order for whitelisting to work, install Bad Behavior 1.2.1. In addition, Google’s new Google Desktop is sending invalid HTTP headers to every site its users visit, causing Bad Behavior 1.2 sites to blacklist them.

Access will be automatically restored to affected users within 48 hours of installing the update, or you can empty the bad_behavior_log table after installing the update to restore access to affected users immediately.

Both of these issues have been fixed, so download Bad Behavior now!

Oh, and the next time I release software for testing, TEST it!

Bad Behavior whitelisting failure?

August 25th, 2005 by Michael Hampton

If you’re trying to use IP-based whitelisting in Bad Behavior, and finding that it fails to allow users through from that IP address or range, please contact me immediately. Send a copy of the whitelist entry and the Bad Behavior logs from the database showing the users from that IP address or range being denied after you added the entry. Then delete from your database any records containing that IP address, and contact me again if the trouble recurs.

Do the same if whitelisting by user agent fails, but remember that the user agent must match exactly for it to be whitelisted.
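To show what "match exactly" implies in practice, here is a tiny Python sketch. The user agent string and the names in it are invented for the example, and this is not how Bad Behavior stores its whitelist; it only demonstrates exact string comparison as opposed to a prefix or substring match.

```python
# Hypothetical whitelist entry, invented for this example.
WHITELISTED_USER_AGENTS = {
    "ExampleMonitor/1.0 (+http://example.com/monitor)",
}


def user_agent_is_whitelisted(user_agent: str) -> bool:
    """Exact comparison: the whole string must be identical,
    including version numbers, spacing and letter case."""
    return user_agent in WHITELISTED_USER_AGENTS
```

With an exact match, "ExampleMonitor/1.0" on its own is not whitelisted, and neither is the full string with a different version number or a stray trailing space. If a whitelist entry seems to be ignored, compare it character by character with the user agent recorded in the logs.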

Google Desktop can’t read RSS feeds with Bad Behavior installed

August 25th, 2005 by Michael Hampton

Due to a bug in Google Desktop, Bad Behavior is blocking it when it tries to download users’ RSS feeds. I’ve sent a message to Google (though I don’t really expect much to happen) and I’ll see if I can have a workaround in place shortly.

Affected users will see “Web Clip Error: Unknown error” in the Google Desktop.

FeedBurner users who use the FeedBurner .htaccess redirects are not affected by this issue. (And since I’m one of them, I never noticed.)

I have a ticket [#32426362] from Google for this issue. If you are seeing this, you can contact desktop-feedback@google.com and place the ticket number, with the brackets, in the subject line, and let them know you are adversely affected by this issue. Also run the program located at http://desktop.google.com/DiagnoseGoogleDesktop.exe and include the diagnostic output that it gives in your message.

Bad Behavior protects WordPress.com

August 20th, 2005 by Michael Hampton

Running behind the scenes of Matt Mullenweg’s new commercial WordPress project, WordPress.com, is of course WordPress, everyone’s favorite blogging platform. And running on WordPress.com is Bad Behavior, the premier solution for blog spam.

Bad Behavior 1.2

August 16th, 2005 by Michael Hampton

Update August 19: Bad Behavior is now available for Drupal.

Bad Behavior 1.2 has been released. Bad Behavior stops link spam at the front door by denying spammers the ability to access your PHP-based web site at all.

Thanks to all of you who tested the release candidates, and actually found fewer bugs than I was expecting. Either I’m getting better at this, or you guys aren’t actually installing the software. :)

Bad Behavior 1.2 Release Candidate 3

August 11th, 2005 by Michael Hampton

Bad Behavior 1.2 Release Candidate 3 has been posted. Bad Behavior stops link spam at the front door by denying spammers the ability to access your PHP-based web site at all.

As I close in on a final 1.2 release, the reports I have gotten have been quite encouraging. Most testers have reported a complete elimination of link spam to their sites. So I’ve cleaned up a bit, fixed one problem, and this will probably be the final 1.2 release, or very close to it.

Bad Behavior 1.2 Release Candidate 2

August 8th, 2005 by Michael Hampton

The second release candidate of version 1.2 of Bad Behavior is now available! Bad Behavior stops link spam at the front door by denying spammers the ability to access your PHP-based web site at all.

Surprisingly, no one reported any bugs in the first release candidate, but a very few spammers are still making it through. So I’ve made an update which attempts to address this and get that last 0.1% of the spam.

New from version 1.2 Release Candidate 1: When logging is turned on, Bad Behavior will identify spammers it has recently seen, even if their profile changes, and continue to block them. I believe this simple change should eliminate virtually all spam, even at the highest-traffic sites, while remaining fast.
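The idea of remembering a recently seen spammer can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Bad Behavior's implementation: it assumes offenders are remembered by IP address, it keeps them in memory rather than in a database log table, and the 48-hour default window and the class name are choices made for the example.

```python
import time


class RecentOffenders:
    """Remember addresses that were blocked recently, so a later request
    from the same address is refused even if it now looks harmless."""

    def __init__(self, window_seconds: float = 48 * 3600, clock=time.time):
        self._window = window_seconds
        self._clock = clock          # injectable so the logic can be tested
        self._last_blocked = {}      # IP address -> time of the last block

    def record_block(self, ip: str) -> None:
        """Note that a request from this address was just blocked."""
        self._last_blocked[ip] = self._clock()

    def is_recent_offender(self, ip: str) -> bool:
        """Return True if this address was blocked within the window."""
        blocked_at = self._last_blocked.get(ip)
        if blocked_at is None:
            return False
        if self._clock() - blocked_at >= self._window:
            # The entry has expired; forget it and let the request be
            # judged on its own merits again.
            del self._last_blocked[ip]
            return False
        return True
```

The trade-off is visible even in a sketch this small: a legitimate user who gets blocked by mistake stays blocked until the entry expires or someone deletes it.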

Again, I still need reports of any spammers which escape Bad Behavior’s notice. Please contact me and include output from phpMyAdmin showing the relevant records for the spammer. Verbose logging has been turned on for this build so that the necessary records will be available if this happens.

Update August 11: Please see the newer version Bad Behavior 1.2 Release Candidate 3.

Bad Behavior 1.2 Release Candidate 1

August 8th, 2005 by Michael Hampton

The first release candidate for Bad Behavior 1.2 is now available. Bad Behavior, the bane of link spammers everywhere, has been strong and stable. I’ve added some new features and need your feedback.
