Here’s a summary of the steps to prevent AboutUs.Org from stealing content from your site.
Block their robot
Add the following lines to your robots.txt (in the “root” folder of your website)
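As a rough sketch of what such a rule might look like: the ROBOTS_TXT text below is an assumption, built from the AboutUsBot token that appears in the bot’s user agent string quoted in this post, and it simply disallows the whole site for that token. The Python around it uses the standard library’s urllib.robotparser to check that the rule behaves as intended; the helper name is_allowed is made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content. The "AboutUsBot" token comes from the user agent
# string quoted in this post; the rule itself is only an illustration.
ROBOTS_TXT = """\
User-agent: AboutUsBot
Disallow: /
"""


def is_allowed(robots_txt: str, agent_token: str, path: str = "/") -> bool:
    """Return True if a robots.txt-respecting crawler may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Pass the short product token (e.g. "AboutUsBot"), not the full
    # "Mozilla/5.0 (...)" header: the parser only looks at the text before
    # the first "/" when matching user agents.
    return parser.can_fetch(agent_token, path)


if __name__ == "__main__":
    print("AboutUsBot allowed:", is_allowed(ROBOTS_TXT, "AboutUsBot"))
    print("SomeOtherBot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot"))
```

Bear in mind that robots.txt is only a request; it does nothing against a crawler that chooses to ignore it.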
Block their IP range
In your .htaccess file (if you’re on Apache) add the following lines:
deny from 66.249.16.
It appears that they now respect robots.txt and, as Boris pointed out, there are some useful services in that address space, so this block may no longer be necessary.
Block the bot’s user agent
If you do user agent blocking, block the bot’s user agent:
Mozilla/5.0 (compatible; AboutUsBot/0.9; +http://www.aboutus.org/AboutUsBot)
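If your pages happen to be served by a Python application rather than filtered by the web server, user agent blocking could be sketched roughly as below. This is an illustration under assumptions, not a recommended setup: the names BLOCKED_TOKENS, is_blocked_user_agent and block_bots are invented for this example, and the match is a plain case-insensitive substring test on the AboutUsBot token from the string above.

```python
# Lower-case tokens to look for anywhere in the User-Agent header.
BLOCKED_TOKENS = ("aboutusbot",)


def is_blocked_user_agent(user_agent: str) -> bool:
    """Case-insensitive substring match against the blocked tokens."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in BLOCKED_TOKENS)


def block_bots(app):
    """Wrap a WSGI app so that blocked user agents receive a 403 response."""
    def middleware(environ, start_response):
        if is_blocked_user_agent(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware


if __name__ == "__main__":
    ua = ("Mozilla/5.0 (compatible; AboutUsBot/0.9; "
          "+http://www.aboutus.org/AboutUsBot)")
    print("blocked:", is_blocked_user_agent(ua))
```

A user agent header is trivial to change, so this only stops a bot that identifies itself honestly.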
Block the DomainTools.com IP range
AboutUs.Org uses DomainTools services to generate thumbnail images of site content, so block their IP range too:
deny from 66.249.4.
(I’ve actually blocked the entire 66.249.x.x address space, just to be safe!)
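To see how much address space each of these deny lines covers: Apache’s Deny directive treats a partial address such as 66.249.16. as a match on the leading octets, so it behaves like a subnet. The sketch below converts the partial addresses used in this post into networks with Python’s ipaddress module; the helper names partial_to_network and is_denied are made up for this example.

```python
import ipaddress


def partial_to_network(partial: str) -> ipaddress.IPv4Network:
    """Convert a partial address like '66.249.16.' into the network it covers."""
    octets = [part for part in partial.split(".") if part]
    if not 1 <= len(octets) <= 3:
        raise ValueError("expected 1 to 3 leading octets, got %r" % partial)
    # Pad the missing octets with zeros; each given octet fixes 8 prefix bits.
    padded = octets + ["0"] * (4 - len(octets))
    return ipaddress.IPv4Network("%s/%d" % (".".join(padded), 8 * len(octets)))


def is_denied(client_ip: str, partial: str) -> bool:
    """Return True if client_ip falls inside the range named by partial."""
    return ipaddress.IPv4Address(client_ip) in partial_to_network(partial)


if __name__ == "__main__":
    for partial in ("66.249.16.", "66.249.4.", "66.249."):
        network = partial_to_network(partial)
        print(partial, "->", network, "(%d addresses)" % network.num_addresses)
```

Each three-octet line covers 256 addresses, while the two-octet form covers 65,536, which is why blocking all of 66.249.x.x can also catch unrelated services in that space.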
Unfortunately, these steps only help if you haven’t already been indexed. If you have been indexed and clearly copyrighted content has been stolen, there is one more thing you can try: contact the site’s upstream host.
Report the site to its upstream host
Once I get confirmation that they are the upstream provider involved, I’ll recommend that anyone who has had content stolen by AboutUs.Org contact this upstream host.