
wget

wget is a command line tool for downloading files from the internet (via HTTP (incl. proxies), HTTPS and FTP), either non-interactively from batch files or directly on the command line (cmd.exe, bash etc.).

Important command line flags

IMHO, the most interesting command line flags for wget are:
-O --output-document=FILE write documents to FILE. Use - to write output to stdout. (Note, the lowercase -o specifies the log file)
-r --recursive specify recursive download.
-H --span-hosts go to foreign hosts when recursive.
-l --level=NUMBER maximum recursion depth (inf or 0 for infinite).
-np --no-parent don't ascend to the parent directory.
-nd --no-directories don't create directories.
-x --force-directories force creation of directories.
-nc --no-clobber skip downloads that would download to existing files (overwriting them).
-k --convert-links make links in downloaded HTML point to local files.
-p --page-requisites get all images, etc. needed to display HTML page.
-A --accept=LIST comma-separated list of accepted extensions.
-R --reject=LIST comma-separated list of rejected extensions.
-w --wait=SECONDS wait SECONDS between retrievals.
-m --mirror Same as -r -N -l inf -nr (--no-remove-listing)
--no-check-certificate don't validate the server's certificate.
--content-disposition honor the Content-Disposition header (Experimental)
See also the most important wget command line flags.

Examples

Force creation of directories: wget -x <url>
Page requisites: wget -p <url>
Download one level only (a page with a «table of contents»): wget.exe -r -l 1 -nd -k -p <url>
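
The difference between -O and -o mentioned in the flag list can be sketched as follows. The function name page_to_stdout and the log file name wget.log are assumptions made for this illustration only.

```shell
# page_to_stdout is a hypothetical helper name.
page_to_stdout() {
    # -O - writes the downloaded document to stdout,
    # -o wget.log writes wget's own messages to a log file instead of stderr.
    wget -o wget.log -O - "$1"
}

# Usage (not executed here): page_to_stdout http://foo.bar/ | grep -i '<title>'
```

Because the log goes to a file, the pipe only receives the document itself.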

Mirroring a website

wget -oc:\temp\wget.log -r -k -p -np -nc <dirs and subdirs>
-nc skips files that already exist locally, so that the command can be restarted after an interruption.
Prevent zip files from being downloaded, too:
wget -r -k -p -np -nc --reject=zip http://foo.bar/

Download requisites from other domains

At times, some requisites (such as images) are hosted on other domains. In this case, these requisites can be downloaded with -H, which allows wget to span to other hosts, and the -D flag, which specifies a comma-separated list of domains that may be spanned to:
wget -r -k -p -H -D other.domain.xy,target.xz https://target.xz

Download specific filetype only

wget  --no-directories --accept=pdf --recursive --level=1 url
or, same thing
wget -nd -Apdf -r --level=1 url
If the files reside on another host, also use -H.
If the server uses CGI to serve some different suffixes and uses the Content-Disposition header, the --content-disposition flag might help.
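
The same command can be wrapped in a small shell function so that the suffix becomes a parameter. This is only a sketch: fetch_by_suffix is a made-up name, and the flags are the ones shown above.

```shell
# fetch_by_suffix SUFFIX URL  (hypothetical helper name)
fetch_by_suffix() {
    # -nd: do not create a local directory tree
    # -A:  accept only files with the given suffix
    # -r -l 1: follow links, but only one level deep
    wget -nd -A "$1" -r -l 1 "$2"
}

# Usage (not executed here): fetch_by_suffix pdf http://foo.bar/
```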

Using --cut-dirs

wget -r -nH -np --cut-dirs=2 http://svn.openstreetmap.org/applications/utils/gary68/
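Using --cut-dirs cuts directory levels when directories are created. In the above example, -nH suppresses the host name directory and --cut-dirs=2 removes applications/utils, so that wget creates gary68 instead of svn.openstreetmap.org/applications/utils/gary68.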
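
To get a feeling for which directory results from a given --cut-dirs value, the path arithmetic can be imitated in plain shell. This is a simplified model and not wget's actual code: the function name local_dir is an assumption, and it expects a URL with a scheme and a directory path, without port, query string or file name.

```shell
# local_dir URL N: print the directory that -nH --cut-dirs=N would leave for URL.
# Simplified model (hypothetical helper), see the assumptions above.
local_dir() {
    path=${1#*://}     # drop the scheme
    path=${path#*/}    # drop the host name (the effect of -nH)
    path=${path%/}     # drop a trailing slash
    n=$2
    while [ "$n" -gt 0 ]; do
        case $path in
            */*) path=${path#*/} ;;   # cut one leading directory level
            *)   path=; break ;;      # nothing left to cut
        esac
        n=$((n - 1))
    done
    printf '%s\n' "$path"
}

local_dir http://svn.openstreetmap.org/applications/utils/gary68/ 2   # prints gary68
local_dir http://svn.openstreetmap.org/applications/utils/gary68/ 0   # prints applications/utils/gary68
```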

Specifying the result language

With the --header option, it is possible to alter the Accept-Language header of the HTTP request:
wget --header='Accept-Language: de' http://foo.bar.baz/
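
--header may be given more than once, so the language header can be combined with other request headers. The following is a sketch only: the function name fetch_in_language and the additional Accept header are assumptions chosen for illustration.

```shell
# fetch_in_language LANGUAGE URL  (hypothetical helper name)
fetch_in_language() {
    # Each --header adds one line to the HTTP request.
    wget --header="Accept-Language: $1" --header="Accept: text/html" "$2"
}

# Usage (not executed here): fetch_in_language de http://foo.bar.baz/
```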

Proxy

wget takes the proxy to be used from the environment variables http_proxy, https_proxy and ftp_proxy.
If no proxy should be used for certain hosts, store them in the environment variable no_proxy, separated by commas.
See also --proxy-user and --proxy-password (named --proxy-passwd in old versions of wget).
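
In a shell, these variables might be set as shown below. proxy.example.com:3128 and intranet.example.com are placeholders, not real hosts.

```shell
# Placeholder proxy address; replace with the real proxy of the network.
export http_proxy=http://proxy.example.com:3128/
export https_proxy=$http_proxy
export ftp_proxy=$http_proxy

# Hosts and domains that should be reached without the proxy, separated by commas.
export no_proxy=localhost,.intranet.example.com
```

Any wget started from this shell afterwards inherits the settings.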

wget initialization files

/etc/wgetrc (global, for all users)
~/.wgetrc (for a single user)
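
An initialization file consists of lines of the form name = value. The sketch below writes a small example to a temporary file and points wget at it with the WGETRC environment variable, which is read instead of ~/.wgetrc. The chosen settings are only an illustration.

```shell
# Write an example initialization file to a temporary location.
rc=$(mktemp)
cat > "$rc" <<'EOF'
# Retry each download up to three times.
tries = 3
# Wait one second between retrievals (like -w 1).
wait = 1
# Honor /robots.txt during recursive downloads.
robots = on
EOF

# If WGETRC is set, wget loads this file rather than ~/.wgetrc.
export WGETRC=$rc
```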

Installing on Windows

If wget for Windows is installed from the zip file rather than with the installer, it also needs the files libeay32.dll, libiconv2.dll, libintl3.dll and libssl32.dll. These can be downloaded from https://sourceforge.net/projects/gnuwin32/files/wget/1.11.4-1/wget-1.11.4-1-dep.zip/download.
The following PowerShell commands should be able to install wget together with these DLLs:
# Download the zip file with the wget binaries
$ua = new-object system.net.webClient
$ua.downloadFile("http://downloads.sourceforge.net/gnuwin32/wget-1.11.4-1-bin.zip" , "$home\Downloads\wget.zip")

# Copy wget.exe out of the zip file; the target directory $home\bin must already exist
$shell = new-object -comObject shell.application
$shell.nameSpace("$home\bin").copyHere("$home\Downloads\wget.zip\bin\wget.exe")
rm $home\Downloads\wget.zip

# Download the zip file with the required DLLs and copy its bin directory to $home
# $ua.downloadFile("https://sourceforge.net/projects/gnuwin32/files/wget/1.11.4-1/wget-1.11.4-1-dep.zip/download", "$home\Downloads\wget-dep.zip")
$ua.downloadFile("https://sourceforge.net/projects/gnuwin32/files/wget/1.11.4-1/wget-1.11.4-1-dep.zip"         , "$home\Downloads\wget-dep.zip")
$shell.nameSpace("$home").copyHere("$home\Downloads\wget-dep.zip\bin")
rm $home\Downloads\wget-dep.zip
2021-10-13: Apparently, a newer version of wget is available at eternallybored.org; it does not seem to depend on other DLLs.

Misc

During recursive downloads, wget respects /robots.txt.

See also

~/.wget-hsts
cURL (With Windows 10, it is already preinstalled under %SystemRoot%\System32\curl.exe)
get.vbs is a simple script (written in VBScript) that GETs a resource via HTTP and prints it to the console.
In PowerShell, wget is an alias for invoke-webRequest.
tools
