
Someone asked me if it was possible to download a web site and make it available offline. To some extent, this can be done. Interactive features will not work (searching, ordering, etc.), but you can use ‘wget’ to turn a website into a static version.

It goes like this:

wget \
 --recursive \
 --no-clobber \
 --page-requisites \
 --html-extension \
 --convert-links \
 --restrict-file-names=windows \
 --domains example.org \
 --no-parent \
 --wait=1 \
 --limit-rate=500K \
 example.org/

Let me explain:
The ‘--recursive’ option downloads the entire web site, and ‘--domains’ tells wget not to follow links outside example.org; otherwise you would download far too many pages. ‘--no-parent’ keeps wget from climbing above the starting directory. ‘--page-requisites’ makes sure we get all the elements that make up a page (images, CSS, etc.), ‘--html-extension’ saves files with the .html extension so they will work on a stand-alone PC, ‘--convert-links’ rewrites the links so they work offline, ‘--restrict-file-names=windows’ escapes characters that are not allowed in Windows file names, and ‘--no-clobber’ prevents existing files from being overwritten.
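
If you prefer short options, the same command can, as far as I know, be written more compactly. The single-letter flags map one-to-one to the long options above; ‘--restrict-file-names’ and ‘--limit-rate’ have no short form:

wget -r -nc -p -E -k \
 --restrict-file-names=windows \
 -D example.org \
 -np \
 -w 1 \
 --limit-rate=500K \
 example.org/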

Using ‘--limit-rate’ you can prevent wget from using all available bandwidth. While downloading will take longer, you can still browse the web while wget is running. Likewise, ‘--wait=1’ pauses one second between requests, which is friendlier to the server.
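
If you want to be even gentler on the server, wget also has a ‘--random-wait’ option, which varies the pause between requests. Something like this (just a variation on the command above, with a lower rate limit) should do:

wget \
 --recursive \
 --page-requisites \
 --html-extension \
 --convert-links \
 --domains example.org \
 --no-parent \
 --wait=2 \
 --random-wait \
 --limit-rate=200k \
 example.org/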

Give it a try; it works pretty nicely and is great if you’re about to make big changes to your site and want to keep a copy of the old version.
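
Once the download finishes, everything ends up in a directory named after the domain (example.org in this case). You can open the pages straight from disk in your browser, or, if you happen to have Python 3 installed, serve the copy locally with its built-in web server:

cd example.org
python3 -m http.server 8000

Then browse to http://localhost:8000/ and the site should behave much like the original, minus the interactive bits.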