I’ve been keeping an eye on the CarComplaints.com error log lately, watching for phishing attempts, misbehaving bots/scripts, & other random stupidity. Turns out the major offenders have something in common — they’re hosted on Amazon’s AWS platform.

One Amazon AWS customer was crawling pages in bursts at up to 100 per minute, but referencing our mixed-case URLs in all lowercase — racking up several hundred thousand 404 errors over several weeks. Luckily they had a “Ruby” user agent (Ruby script’s HTTP request?) … bye bye Ruby, at least until you change user agents.
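A rough Python sketch of one way to spot this kind of offender: tally 404s per client IP and user agent. It assumes the access log uses Apache's "combined" format, and the sample lines, IPs, and the `tally_404s` name are made up for illustration, not taken from the real logs.

```python
import re
from collections import Counter

# Assumes Apache's "combined" log format; adjust the pattern if your
# LogFormat differs.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def tally_404s(lines):
    """Count 404 responses per (client IP, user agent) pair."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "404":
            counts[(match.group("ip"), match.group("agent"))] += 1
    return counts


# Made-up sample lines for illustration only (documentation IP ranges).
sample = [
    '203.0.113.5 - - [01/Apr/2011:10:00:00 -0400] "GET /kia/sephia/2001/ HTTP/1.1" 404 512 "-" "Ruby"',
    '203.0.113.5 - - [01/Apr/2011:10:00:01 -0400] "GET /kia/sephia/2000/ HTTP/1.1" 404 512 "-" "Ruby"',
    '198.51.100.7 - - [01/Apr/2011:10:00:02 -0400] "GET /Kia/Sephia/2001/ HTTP/1.1" 200 9041 "-" "Mozilla/5.0"',
]

# Show the ten noisiest (IP, user agent) pairs.
for (ip, agent), hits in tally_404s(sample).most_common(10):
    print(hits, ip, agent)
```

Grouping by user agent as well as IP is what makes something like a bare "Ruby" agent stand out, since a crawler on cloud hosting may rotate through several addresses.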

Another Amazon AWS customer was requesting oggiPlayerLoader.htm in various locations. Anyone know what this “Frame Booster” is part of? (UPDATE: see my followup about Oggifinogi). Luckily they use a HEAD request, so those got banned too along with some other esoteric request methods suggested by Perishable Press.

RewriteCond %{HTTP_USER_AGENT} "Ruby" [NC,OR]
RewriteCond %{REQUEST_METHOD} ^(delete|head|trace|track) [NC]
RewriteRule ^(.*)$ - [F,L]

I cheerily reported both cases of AWS abuse to Amazon via their web abuse form. Turns out the abuse form is there only to mess with your head. Some form data has to be space-separated while other data must be comma-separated. Fields where you list IPs & URLs barely fit a single entry, much less multiple items. And good luck cutting your access log snippet down to their 2000 character limit. Amazon just launched their Cloud Drive — zillions of decaquintillobytes of storage space — but can they handle processing a few hundred lines of server logs? Nope.
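For what it's worth, squeezing a log excerpt under that 2000 character limit is easy enough to script. A minimal sketch, assuming the form counts each newline as one character (a guess, not something Amazon documents here); `trim_log_snippet` is a hypothetical helper name.

```python
def trim_log_snippet(lines, limit=2000):
    """Keep whole log lines, in order, until the next one would exceed `limit`."""
    kept = []
    used = 0
    for line in lines:
        line = line.rstrip("\n")
        # Every kept line after the first costs one extra character
        # for the newline that joins it to the previous line.
        cost = len(line) + (1 if kept else 0)
        if used + cost > limit:
            break
        kept.append(line)
        used += cost
    return "\n".join(kept)
```

Cutting on line boundaries keeps every entry in the snippet complete, which matters more for an abuse report than using up every last character.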

The kicker is that if they do accept, verify, & pass on your complaint to their AWS customer, Amazon won’t provide any details about the offender so that you could, oh I don’t know, blog mean things about them. You’ll need a subpoena for that.

Moving on to abuse not related to AWS — people are referencing themes/default/style.css all over the place. The requests look legitimate, from various random IPs & user agents, so I’m guessing it’s a misbehaving browser plugin. Searching Google indicates it could be something called OpenScape, which I didn’t have time to research. Anyone know what that’s all about? Those got forbidden…

RewriteRule themes?/default/style\.css$ - [F,L]

And finally there’s Microsoft. For about a year, MSNBot has managed to take legitimate page URLs & tack JavaScript onto the end, as in /Kia/Sephia/2001/engine/this.options[this.selectedIndex].value;” Only Microsoft could manage that.