Here's what you do to bypass the "robot police":
So what if you don't want wget to obey the robots.txt file? Simply add -e robots=off to the command, like this:
wget -r -p -e robots=off http://www.example.com
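The -e option tells wget to execute its argument as if it were a line in a .wgetrc startup file, so the same setting can live in a config file instead of on the command line. Here is a minimal sketch of that idea, assuming you want the setting kept out of your personal ~/.wgetrc. The file name wgetrc-norobots is made up for illustration, and the final command is printed rather than run so the sketch makes no network requests.

```shell
#!/bin/sh
# Hypothetical per-project startup file (the name is an assumption).
rc_file="./wgetrc-norobots"

# "robots = off" is the startup-file form of "-e robots=off".
printf 'robots = off\n' > "$rc_file"

# wget reads the file named by the WGETRC environment variable.
# Printed, not executed: remove the echo and quotes to run it for real.
echo "WGETRC=$rc_file wget -r -p http://www.example.com"
```

Keeping the setting in a separate file means robots.txt is only ignored for downloads where you explicitly point WGETRC at that file.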
From "Using wget To Download Entire Websites", courtesy of Jam's Ubuntu Linux Blog.