Tim Rühsen edited this page Feb 21, 2016 · 6 revisions


This page is still a work in progress. The idea is to create an updated, unified readme file in pure Markdown for rendering online. Done well, the Markdown file will remain fully human-readable and may some day replace the old README files as well.

GNU Wget

Official Homepage: http://www.gnu.org/software/wget/

GNU Wget is a free utility for non-interactive download of files from the Web. It supports HTTP, HTTPS and FTP protocols, as well as retrieval through HTTP proxies.

It can follow links in HTML pages and create local versions of remote web sites, fully recreating the directory structure of the original site. This is sometimes referred to as "recursive downloading." While doing that, Wget respects the Robot Exclusion Standard (/robots.txt). Wget can be instructed to convert the links in downloaded HTML files to the local files for offline viewing.
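As a rough sketch of recursive downloading with link conversion, the function below wraps one possible combination of options. The function name `mirror_site`, the depth limit of 2, and the example URL are assumptions chosen for illustration, not part of Wget itself.

```shell
# mirror_site URL: recursively fetch part of a site for offline viewing.
mirror_site() {
    # --recursive        follow links found in HTML pages
    # --level=2          stop after two levels of links (illustrative value)
    # --no-parent        never ascend above the starting directory
    # --page-requisites  also fetch images, stylesheets, etc. needed to render pages
    # --convert-links    rewrite links so they point at the local copies
    wget --recursive --level=2 --no-parent --page-requisites --convert-links "$1"
}

# Usage (hypothetical URL):
#   mirror_site https://example.org/manual/
```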

Recursive downloading also works with FTP, where Wget can retrieve a hierarchy of directories and files.

With both HTTP and FTP, Wget can check whether a remote file has changed on the server since the previous run, and only download the newer files.
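A minimal sketch of that behaviour using the `--timestamping` option. The wrapper name `refresh_file` and the URL in the usage comment are hypothetical.

```shell
# refresh_file URL: fetch the file only if the remote copy is newer
# than the local one (or if no local copy exists yet).
refresh_file() {
    wget --timestamping "$1"
}

# Usage (hypothetical URL); running it twice in a row should skip the
# second download if the server copy has not changed:
#   refresh_file https://example.org/pub/data.tar.gz
```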

Wget has been designed for robustness over slow or unstable network connections; if a download fails due to a network problem, it will keep retrying until the whole file has been retrieved. If the server supports continuation, it will instruct the server to continue the download from where it left off.
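The retry and continuation behaviour can also be requested explicitly. The following is only a sketch: the function name `resume_download`, the chosen retry values, and the URL are assumptions.

```shell
# resume_download URL: keep retrying and resume a partial file where possible.
resume_download() {
    # --continue      continue a partially downloaded file if the server supports it
    # --tries=0       retry without limit (illustrative choice)
    # --waitretry=10  wait up to 10 seconds between retries of a failed download
    wget --continue --tries=0 --waitretry=10 "$1"
}

# Usage (hypothetical URL):
#   resume_download https://example.org/pub/large-image.iso
```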

Most of the features are configurable, either through command-line options, or via initialization file .wgetrc. Wget allows you to install a global startup file (/usr/local/etc/wgetrc by default) for site settings.
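For illustration, the snippet below writes a small initialization file to a scratch location and shows, in a comment, how it might be selected through the `WGETRC` environment variable. The file path, the settings chosen, and the URL are assumptions, not recommended defaults.

```shell
# Write a sample initialization file to a scratch path (hypothetical location).
rc_file="${TMPDIR:-/tmp}/wgetrc.example"
cat > "$rc_file" <<'EOF'
# Retry each download up to 10 times.
tries = 10
# Only fetch files that are newer than the local copy.
timestamping = on
# Wait 1 second between retrievals.
wait = 1
EOF

# Use this file for a single invocation instead of ~/.wgetrc:
#   WGETRC="$rc_file" wget https://example.org/file.tar.gz
```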

Wget works under almost all Unix variants in use today and, unlike many of its historical predecessors, is written entirely in C, thus requiring no additional software, such as Perl. The external software it does work with, such as OpenSSL, is optional. As Wget uses the GNU Autoconf, it is easily built on and ported to new Unix-like systems. The installation procedure is described in the INSTALL file.

As with other GNU software, the latest version of Wget can be found at the master GNU archive site, and its mirrors. Wget resides at ftp://ftp.gnu.org/pub/gnu/wget/

Please report bugs in Wget to [email protected].

See the file 'MAILING-LIST' for information about Wget mailing lists. Wget's home page is at http://www.gnu.org/software/wget/.

If you would like to contribute code for Wget, please read http://wget.addictivecode.org/PatchGuidelines.

Wget was originally written and maintained by Hrvoje Niksic. Please see the file AUTHORS for a list of major contributors, and the ChangeLogs for a detailed listing of all contributions.

Building From Git Sources

To reduce bandwidth and needless updates, the source code repository does not contain automatically-generated files, even when these are normally present in the distribution tarballs. Therefore, to build GNU Wget from the sources in the repository, you'll need to have one or more of the following (note that gettext, OpenSSL, GnuTLS, libidn, libiconv, libpsl, libpcre, pkg-config, libmetalink and GnuPG are not absolutely required):

  • autoconf: GNU Wget currently requires version 2.61. This is needed to generate the configure script from configure.in. It is not required when building from a tarball distribution; only when building from repository sources.

  • automake: GNU Wget currently requires version 1.10.1. This is needed for generating the Makefile.in templates that the configure script uses to generate the Makefiles. As with autoconf, it is not required when building from a tarball distribution; only when building from repository sources.

  • flex is needed to generate the CSS-parsing code.

  • Perl, if you wish to generate the wget(1) manpage or run the tests in the tests/ subdirectory. Tarball distributions include an already-generated wget.1 manual. The command make check runs the test suite, which is written in Perl and Python (see below). To execute all the tests you need the libwww-perl and libio-socket-ssl-perl Perl libraries. If perl -MCPAN -e 'install Bundle::LWP' fails, you most likely don't have the CPAN module installed; install CPAN first, then run perl -MCPAN -e 'install Bundle::LWP' again. After that, make check should pass all of the tests in the test suite.

  • Python 3, if you want to run the tests in the testenv/ subdirectory. Keep in mind that make check will try to run all the Perl and Python tests. More information about the test suite is given below in the section "Testing and development".

  • texinfo in order to generate Info, PostScript and/or HTML documentation. You don't need texinfo in order to generate the wget(1) manpage; however, note that the manpage does not include the full documentation. Tarball distributions include the already-generated documentation in these formats.

  • gettext, if you wish to compile with NLS (Native Language Support), which is enabled by default. If you do not have gettext, you can compile without it by specifying --disable-nls to the ./configure script. This is true regardless of where you obtained the source you're building. NOTE: if you get errors about AM_GNU_GETTEXT and/or AM_INTL_SUBDIR, you probably have a buggy version of GNU m4. Upgrade to the latest version. You may also need to export M4=<new m4 path>, to be sure that autoconf/automake use it instead of the old one.

  • GnuTLS to allow encrypted data transfer (HTTPS). You need the header files and the library installed. As an alternative, you can use OpenSSL by specifying --with-ssl=openssl to the ./configure script. If you do not want HTTPS support, specify --without-ssl to the ./configure script. If you want to compile and link against a non-system version of the library, use --with-libgnutls-prefix (or, if you have pkg-config, see the description below).

  • OpenSSL to allow encrypted data transfer (HTTPS), as an alternative to GnuTLS. You need the header files and the library installed. If you want to compile and link against a non-system version of the library, use --with-libssl-prefix (or, if you have pkg-config, see the description below).

  • libidn is required for IDN/IRI support (non-ASCII characters within what would otherwise be URLs).

  • libiconv is required on non-GNU systems, for IDN/IRI support. On GNU systems, the functionality provided by libiconv is already present in the system libraries.

  • git is used to fetch gnulib files through the bootstrap script.

  • libpsl is required for using a public suffix list to check for valid cookie domains. You need the header files and the library installed.

  • libpcre is required for using Perl-compatible regular expressions with --accept-regex and --reject-regex. You need the header files and the library installed to compile and link Wget with PCRE support.

  • pkg-config helps the ./configure script to find installed libraries. Most libraries provide a pkg-config file (.pc extension) with information about dependencies, header file and library locations. Distributions deliver their specific .pc file to each library. If you want to compile+link against your own library version, make a copy of the appropriate .pc file and amend it to your needs (e.g. edit the line starting with prefix=). Before you execute the ./configure script, set (and export) PKG_CONFIG_PATH to the directory where you saved the .pc file. Example:

    $ PKG_CONFIG_PATH="." ./configure
    
  • libmetalink is needed to enable Metalink files support.

  • GnuPG with GPGME is used to verify GPG-signed Metalink resources.
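Before bootstrapping, it can help to see which of the tools listed above are already on your PATH. The helper below is only a sketch: the function name `check_tools` is hypothetical, it checks for presence only (not for the minimum versions mentioned above), and the tool list in the sample call is an illustrative subset.

```shell
# check_tools NAME...: report which of the given commands are on PATH.
# Returns 0 if all were found, 1 if at least one is missing.
check_tools() {
    status=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Tools needed to build from repository sources (subset of the list above).
check_tools autoconf automake flex git ||
    echo "Install the missing tools before running ./bootstrap."
```

Optional helpers such as perl, python3, makeinfo (from texinfo) or pkg-config can be passed to the same function.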

For those who might be confused as to what to do once they check out the source code, considering configure and Makefile do not yet exist at that point, a shell script called bootstrap has been provided. After calling ./bootstrap you're ready to build GNU Wget in the normal fashion, with ./configure and make.

So, to sum up, after checking out the source code, you may proceed as follows:

  1. Change to the topmost GNU Wget directory:

    $ cd wget        # assumes you've cloned a repository to "./wget"
    
  2. Generate all the automatically-generated files required prior to configuring the package:

    $ ./bootstrap
    
  3. Configure the package and compile it:

    $ ./configure --enable-assert [some_parameters]
    $ make
    
  4. Hack, compile, test, hack, compile, test...

```
$ src/wget --version
GNU Wget 1.12-devel (9cb2563197bc)
```

Testing and development

All developers are requested to enable the assertions on their development builds to ensure a stable codebase. Assertions are added to state certain assumptions about the code and its data which all developers should be mindful of. To enable assertions, run the configure command with the --enable-assert option, like this:

    $ ./configure --enable-assert [other configure options]

Both the Perl and Python test suites (tests/ and testenv/) include support for GDB and Valgrind. The environment variables GDB_TESTS and VALGRIND_TESTS enable the corresponding wrappers. If either is set, Wget is run through that tool during the test. For example:

    $ TESTS_ENVIRONMENT="VALGRIND_TESTS=1" make check

Or to execute a single test:

    $ cd testenv
    $ VALGRIND_TESTS=1 ./Test-O.py

This executes the Test-O.py test case with Wget running under Valgrind.

GDB takes precedence over Valgrind. If both variables are set, Wget is run through GDB.
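The precedence rule can be pictured with the small sketch below. It is not the test suite's actual code: the function name `choose_wrapper` is hypothetical, and treating any non-empty value as "set" is an assumption.

```shell
# choose_wrapper: print which wrapper a test run would use, given the
# GDB_TESTS and VALGRIND_TESTS environment variables. GDB wins if both are set.
choose_wrapper() {
    if [ -n "${GDB_TESTS:-}" ]; then
        echo gdb
    elif [ -n "${VALGRIND_TESTS:-}" ]; then
        echo valgrind
    else
        echo none
    fi
}

# Usage:
#   GDB_TESTS=1 VALGRIND_TESTS=1 choose_wrapper
```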

If you run a test case through GDB, please bear in mind that it could report a spurious failure. Some tests that expect Wget to fail rely on Wget's return code, but when Wget is run through GDB the return code is always zero, so the test claims failure. The GDB wrapper is nevertheless very useful for tackling bugs: you can write a test case for a specific bug and then use GDB to fix it more easily. Otherwise, you would have to set up a dedicated server and write a custom CGI just to reproduce the bug, which might be tedious. Tests should only be run through GDB for that purpose.


All content (C) 2015 Free Software Foundation. For terms of use, redistribution, and modification, please see the WikiLicense page.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.

Additional permission under GNU GPL version 3 section 7

If you modify this program, or any covered work, by linking or combining it with the OpenSSL project's OpenSSL library (or a modified version of that library), containing parts covered by the terms of the OpenSSL or SSLeay licenses, the Free Software Foundation grants you additional permission to convey the resulting work. Corresponding Source for a non-source form of such a combination shall include the source code for the parts of OpenSSL used as well as that of the covered work.