Using WGET to download or back up an entire website

Simple to do from the Linux command line. The command below creates a folder named after the site and downloads all the content it can reach into that folder.

wget -c -r -k -U Mozilla www.thesite.com

-c resumes partially downloaded files and skips any that are already complete (handy if you're resuming after only downloading half the content)

-r downloads the whole site by following links recursively (by default wget only follows links five levels deep; add -l inf if you need unlimited depth)

-k converts the links in the downloaded pages to local links, so they still work when you click on them from the folder on your computer

-U tells the site that the request is coming from a Mozilla browser rather than from wget, in case it blocks automated crawling or scraping

www.thesite.com should be replaced with the name of the site you want to download
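For reference, the same command can also be written with wget's long-form options; www.example.com here is just a placeholder for the site you want to copy:

wget --continue --recursive --convert-links --user-agent=Mozilla www.example.com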


Footnote: If the site requires you to log in, the following two-step process can be used:

Step 1: Log in and store the session cookie locally

wget --save-cookies cookies.txt --keep-session-cookies --post-data 'user=foo&password=bar' http://www.thesite.com/auth

Step 2: Download the site, sending the stored cookie with each request for authentication

wget --load-cookies cookies.txt -c -r -k -p -U Mozilla http://www.thesite.com/
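Before running the second command it's worth checking that the login actually worked. --save-cookies writes a plain-text (Netscape-format) cookie file, so a quick look inside it will show whether a session cookie was stored:

cat cookies.txt

If the file only contains comment lines and no cookie entries, the login almost certainly failed, usually because the form field names (user and password in the example above) or the login URL don't match what the site actually uses.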