Download files and directories from web using curl and wget

Downloading files and whole directories from the web is something many of us have struggled with, without ever finding a simple and exact answer.

FYI, both curl and wget support the HTTP, HTTPS, and FTP protocols.

Let’s get our hands dirty.

Downloading Files and Directories from web with WGET

Our first goal is to download a directory tree, and the files underneath it, from a remote location.

$ wget -np -P . -r -R "index.html*" --cut-dirs=4 <url>

Let us understand this command in detail.

-np or --no-parent
Do not ever ascend to the parent directory when retrieving recursively. This is a useful option, since it guarantees that only the files below a certain hierarchy will be downloaded.

-P prefix or --directory-prefix=prefix
Set the directory prefix to prefix. This is the directory where all other files and subdirectories will be saved to, i.e. the top of the retrieval tree. The default is '.' (the current directory).

-r or --recursive
Turn on recursive retrieving. The default maximum depth is 5.

-R rejlist or --reject rejlist
Specify comma-separated lists of file name suffixes or patterns to accept or reject. Note that if any of the wildcard characters '*', '?', '[' or ']' appear in an element of acclist or rejlist, it will be treated as a pattern rather than a suffix. In this case, you have to enclose the pattern in quotes to prevent your shell from expanding it, as in -A "*.mp3" or -A '*.mp3'.

That is what we have done here to skip all "index.html" files: -R "index.html*"

--cut-dirs=n
If you do not want the first few remote directory components reflected locally, this option ignores n of them when building the local directory tree.
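As a sketch of how --cut-dirs changes where files land locally (the host and paths below are hypothetical, not from the original article):

```shell
# Hypothetical remote file: http://example.com/a/b/c/d/file.txt

# With plain -r, wget mirrors the full remote path locally:
#   example.com/a/b/c/d/file.txt
# Adding -nH drops the hostname directory:
#   a/b/c/d/file.txt
# Adding --cut-dirs=3 then drops the first three remote components (a/b/c):
#   d/file.txt
wget -r -np -nH --cut-dirs=3 -R "index.html*" http://example.com/a/b/c/d/
```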

Once run, you will have a local directory structure mirroring the remote one.

Now let us download only the files, without recreating the directory structure.

Here is the command.

$ wget -nd -np -P . -r -R "index.html*" <url>

Here we have used one additional option: -nd.

-nd or --no-directories
Do not create a hierarchy of directories when retrieving recursively. With this option turned on, all files will be saved to the current directory without clobbering (if a name shows up more than once, the filenames will get extensions '.n').

This command will download all the files into a single directory.

Downloading Files from web with CURL

curl does not provide recursive download, so we can only use it for downloading individual files.

Download a single file with curl

$ curl -O <url>

Download multiple files with curl

To download multiple files at the same time, pass -O multiple times, each followed by the URL of a file that you wish to download.

Use the following syntax for this purpose:

curl -O [url1] -O [url2]

You can also use curl's brace globbing inside a quoted URL to avoid repeating the common part of the URL:

curl -O "http://domain/path/to/{file1,file2}"

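Besides brace lists, curl can also expand numeric ranges in square brackets. The host and file names below are hypothetical, just to show the syntax:

```shell
# Hypothetical URLs for illustration only.

# Fetch file1.txt and file2.txt with one command via brace globbing:
curl -O "http://example.com/path/to/{file1,file2}.txt"

# Fetch page1.html through page5.html via a numeric range:
curl -O "http://example.com/page[1-5].html"
```

Note the quotes: without them, your shell may expand the braces or brackets before curl ever sees them.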

Downloading and Uploading files over FTP using CURL and WGET

Both wget and curl can be used to download files using the FTP protocol:

wget --user=ftpuser --password='ftpuserpassword' <ftp-url>
curl -u ftpuser:ftpuserpassword '<ftp-url>' -o testdoc.pdf
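Uploading goes the other way: wget cannot upload, but curl can with -T (--upload-file). A minimal sketch, assuming a hypothetical server and credentials:

```shell
# Hypothetical FTP server, directory, and credentials, for illustration only.
# Upload a local file to the server's uploads/ directory:
curl -T testdoc.pdf -u ftpuser:ftpuserpassword ftp://ftp.example.com/uploads/
```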

CURL vs WGET — few differences

The main differences are:

  • wget can download files recursively, whereas curl cannot.
  • wget is a standalone CLI utility with no library behind it, whereas curl is built on the feature-rich libcurl library.
  • curl offers rich upload and data-sending capabilities (FTP upload, HTTP PUT, multipart POST, and more), whereas wget only offers plain HTTP POST support.
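To illustrate the last point, here is how each tool sends a plain HTTP POST; the endpoint is hypothetical:

```shell
# Hypothetical endpoint, for illustration only.

# Plain form POST with wget:
wget --post-data='name=alice&age=30' http://example.com/api/submit

# The equivalent POST with curl (-d implies POST with form-encoded data):
curl -d 'name=alice&age=30' http://example.com/api/submit

# curl can additionally upload a file as the request body, which wget cannot:
curl -T report.pdf http://example.com/api/upload
```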

That’s all!

Thanks for reading.

Hope you like the article. Please let me know your feedback in the response section.


