The Downloads package centers on one function, download, which provides
cross-platform, multi-protocol, in-process download functionality implemented
with libcurl. It uses libcurl's multi-handle callback API to present a Julian
API: download(url) blocks the task in which it occurs but yields to Julia's
scheduler, allowing arbitrarily many tasks to download URLs concurrently and
efficiently. As of Julia 1.6, this package is a standard library that is
included with Julia, but it can also be used with Julia 1.3 through 1.5.
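As a rough illustration of the task-based concurrency described above, the following sketch downloads several URLs at once with asyncmap from Base. The URLs are an assumption made for the example: they point at small local files through the file:// protocol (assuming Unix-style paths) so that the sketch runs without network access; HTTP/S URLs can be substituted.

```julia
using Downloads

# Assumption for this sketch: local files served through file:// URLs,
# with Unix-style absolute paths so that "file://" * path is a valid URL.
paths = map(1:3) do i
    p = tempname()
    write(p, "payload $i")
    p
end
urls = ["file://" * p for p in paths]

# asyncmap runs each download in its own task. Each call blocks only its
# own task and yields to the scheduler while data is being transferred.
bodies = asyncmap(urls) do url
    read(Downloads.download(url), String)
end
```

asyncmap preserves the order of its input, so bodies[i] corresponds to urls[i].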
The public API of Downloads consists of two functions and three types:
- download: download a file from a URL, erroring if it can't be downloaded
- request: request a URL, returning a Response object indicating success
- Response: a type capturing the status and other metadata about a request
- RequestError: an error type thrown by download and request on error
- Downloader: an object encapsulating shared resources for downloading
download(url, [ output = tempfile() ];
[ method = "GET", ]
[ headers = <none>, ]
[ timeout = <none>, ]
[ progress = <none>, ]
[ verbose = false, ]
[ downloader = <default>, ]
) -> output
url :: AbstractString
output :: Union{AbstractString, AbstractCmd, IO}
method :: AbstractString
headers :: Union{AbstractVector, AbstractDict}
timeout :: Real
progress :: (total::Integer, now::Integer) --> Any
verbose :: Bool
downloader :: Downloader
Download a file from the given url, saving it to output or, if not specified,
to a temporary path. The output can also be an IO handle, in which case the
body of the response is streamed to that handle and the handle is returned. If
output is a command, the command is run and the downloaded data is sent to it
on stdin.
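The output variants described above can be sketched as follows. The file:// URL and file names are assumptions chosen so the example runs offline on a system with Unix-style paths; they are not part of the package.

```julia
using Downloads

# Local fixture (assumption for this sketch): a small file exposed as a
# file:// URL.
src = tempname()
write(src, "hello from disk")
url = "file://" * src

# 1. No output argument: a temporary path is chosen and returned.
tmp = Downloads.download(url)

# 2. Explicit output path: the same path is returned.
dest = joinpath(mktempdir(), "copy.txt")
ret = Downloads.download(url, dest)

# 3. IO handle: the body is streamed into the handle.
buf = IOBuffer()
Downloads.download(url, buf)
text = String(take!(buf))
```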
If the downloader keyword argument is provided, it must be a Downloader
object. Resources and connections will be shared between downloads performed by
the same Downloader and cleaned up automatically when the object is garbage
collected or there have been no downloads performed with it for a grace period.
See Downloader for more info about configuration and usage.
If the headers keyword argument is provided, it must be a vector or dictionary
whose elements are all pairs of strings. These pairs are passed as headers when
downloading URLs with protocols that support them, such as HTTP/S.
The timeout keyword argument specifies a timeout for the download in seconds,
with a resolution of milliseconds. By default no timeout is set, but this can
also be explicitly requested by passing a timeout value of Inf.
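A minimal sketch of passing the headers and timeout keywords together. The header names and values are illustrative assumptions, and the file:// URL (Unix-style paths assumed) is used only so the example runs offline; headers have no effect for that protocol and matter only for protocols such as HTTP/S.

```julia
using Downloads

# Local fixture (assumption for this sketch).
src = tempname()
write(src, "hello")
url = "file://" * src

# Headers are given as pairs of strings; a Dict{String,String} would also
# satisfy the documented requirement.
headers = ["Accept" => "text/plain", "User-Agent" => "example-client"]

buf = IOBuffer()
# Give up if the whole download takes longer than 10 seconds.
Downloads.download(url, buf; headers = headers, timeout = 10)
body = String(take!(buf))
```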
If the progress keyword argument is provided, it must be a callback function
which will be called whenever there are updates about the size and status of the
ongoing download. The callback must take two integer arguments, total and now,
which are the total size of the download in bytes and the number of bytes
which have been downloaded so far. Note that total starts out as zero and
remains zero until the server gives an indication of the total size of the
download (e.g. with a Content-Length header), which may never happen. So a
well-behaved progress callback should handle a total size of zero gracefully.
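One possible way to write a progress callback that tolerates a total of zero is sketched below. The names progress_line and show_progress are hypothetical helpers introduced for this example, and the file:// URL (Unix-style paths assumed) only keeps the example offline.

```julia
using Downloads

# Format a progress message; total == 0 means the size is not known (yet),
# so no percentage is computed in that case.
function progress_line(total::Integer, now::Integer)
    if total > 0
        pct = round(100 * now / total; digits = 1)
        return "$now / $total bytes ($pct%)"
    else
        return "$now bytes (total size unknown)"
    end
end

# The callback passed to download takes (total, now).
show_progress(total, now) = println(stderr, progress_line(total, now))

# Local fixture (assumption for this sketch).
src = tempname()
write(src, "some data")
path = Downloads.download("file://" * src; progress = show_progress)
```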
If the verbose option is set to true, libcurl, which is used to implement
the download functionality, will print debugging information to stderr.
request(url;
[ input = <none>, ]
[ output = <none>, ]
[ method = input ? "PUT" : output ? "GET" : "HEAD", ]
[ headers = <none>, ]
[ timeout = <none>, ]
[ progress = <none>, ]
[ verbose = false, ]
[ throw = true, ]
[ downloader = <default>, ]
) -> Union{Response, RequestError}
url :: AbstractString
input :: Union{AbstractString, AbstractCmd, IO}
output :: Union{AbstractString, AbstractCmd, IO}
method :: AbstractString
headers :: Union{AbstractVector, AbstractDict}
timeout :: Real
progress :: (dl_total, dl_now, ul_total, ul_now) --> Any
verbose :: Bool
throw :: Bool
downloader :: Downloader
Make a request to the given url, returning a Response object capturing the
status, headers and other information about the response. The body of the
response is written to output if specified and discarded otherwise. For HTTP/S
requests, if an input stream is given, a PUT request is made; otherwise if
an output stream is given, a GET request is made; if neither is given a
HEAD request is made. For other protocols, appropriate default methods are
used based on what combination of input and output are requested. The following
options differ from the download function:
- input: allows providing a request body; if provided, the method defaults to PUT
- progress: a callback taking four integers for upload and download progress
- throw: controls whether to throw or return a RequestError on request error
Note that unlike download, which throws an error if the requested URL could not
be downloaded (indicated by a non-2xx status code), request returns a Response
object no matter what the status code of the response is. If there is an error
with getting a response at all, then a RequestError is thrown or returned.
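The difference between a returned Response and a returned RequestError can be sketched with throw = false. The helper name describe is hypothetical, and the file:// URLs (Unix-style paths assumed) are an assumption that keeps the example offline: one points at an existing file, the other at a path that does not exist.

```julia
using Downloads

# Summarize the outcome of a request without letting it throw.
function describe(url::AbstractString)
    buf = IOBuffer()
    result = Downloads.request(url; output = buf, throw = false)
    if result isa Downloads.RequestError
        # No usable response could be obtained at all.
        return "failed: $(result.message) (libcurl code $(result.code))"
    else
        # A Response is returned whatever its status code is.
        n = length(take!(buf))
        return "response with status $(result.status), $n bytes"
    end
end

# Local fixtures (assumptions for this sketch).
src = tempname()
write(src, "hello world")
ok_summary = describe("file://" * src)
bad_summary = describe("file://" * tempname())  # file was never created
```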
struct Response
proto :: String
url :: String
status :: Int
message :: String
headers :: Vector{Pair{String,String}}
end
Response is a type capturing the properties of a successful response to a
request as an object. It has the following fields:
- proto: the protocol that was used to get the response
- url: the URL that was ultimately requested after following redirects
- status: the status code of the response, indicating success, failure, etc.
- message: a textual message describing the nature of the response
- headers: any headers that were returned with the response
The meaning and availability of some of these fields depends on the protocol used for the request. For many protocols, including HTTP/S and S/FTP, a 2xx status code indicates a successful response. For responses in protocols that do not support headers, the headers vector will be empty. HTTP/2 does not include a status message, only a status code, so the message will be empty.
struct RequestError <: ErrorException
url :: String
code :: Int
message :: String
response :: Response
end
RequestError is a type capturing the properties of a failed response to a
request as an exception object:
- url: the original URL that was requested without any redirects
- code: the libcurl error code; 0 if a protocol-only error occurred
- message: the libcurl error message indicating what went wrong
- response: response object capturing what response info is available
The same RequestError type is thrown by download if the request was
successful but there was a protocol-level error indicated by a status code that
is not in the 2xx range, in which case code will be zero and the message
field will be the empty string. The request API only throws a RequestError
if the libcurl error code is non-zero, in which case the included response
object is likely to have a status of zero and an empty message. There are,
however, situations where a curl-level error is thrown due to a protocol error,
in which case both the inner and outer code and message may be of interest.
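A sketch of how calling code might tell the two kinds of failure apart when download throws. The helper name try_download is hypothetical, and the file:// URLs (Unix-style paths assumed) are assumptions used so that the example runs offline; the missing file triggers a libcurl-level error.

```julia
using Downloads

# Return the downloaded path, or nothing if the download failed.
function try_download(url::AbstractString)
    try
        return Downloads.download(url)
    catch err
        err isa Downloads.RequestError || rethrow()
        if err.code == 0
            # Protocol-level failure: details are in the inner response.
            @warn "request completed with a failure status" status = err.response.status message = err.response.message
        else
            # libcurl-level failure: details are in the outer code and message.
            @warn "transfer failed" code = err.code message = err.message
        end
        return nothing
    end
end

# Local fixtures (assumptions for this sketch).
src = tempname()
write(src, "ok")
good = try_download("file://" * src)
bad = try_download("file://" * tempname())  # file was never created
```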
Downloader(; [ grace::Real = 30 ])
Downloader objects are used to perform individual download operations.
Connections, name lookups and other resources are shared within a Downloader.
These connections and resources are cleaned up after a configurable grace period
(default: 30 seconds) since anything was downloaded with it, or when it is
garbage collected, whichever comes first. If the grace period is set to zero,
all resources will be cleaned up immediately as soon as there are no more
ongoing downloads in progress. If the grace period is set to Inf, then
resources are not cleaned up until the Downloader is garbage collected.