Release Notes: Saving pages to local disk now works, and the program now sends the Host: header in outgoing requests for better virtual-server support. HTML entities are decoded before links are extracted, and gzip-encoded pages are requested from the server.
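The entry above mentions decoding HTML entities before link extraction. The release notes do not show how this is done; below is a minimal sketch of the idea, covering only a small assumed entity set (the class and method names are hypothetical, not the crawler's actual API). Note that "&amp;amp;" must be decoded last to avoid double-decoding:

```java
// Sketch: decode a few common HTML entities inside href values before
// treating them as URLs. The entity set here is a minimal assumption.
public class EntityDecoder {
    public static String decode(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&amp;", "&"); // last, so "&amp;lt;" yields "&lt;", not "<"
    }

    public static void main(String[] args) {
        // An href like a?x=1&amp;y=2 decodes to the real URL a?x=1&y=2.
        System.out.println(decode("http://example.com/a?x=1&amp;y=2"));
    }
}
```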
Release Notes: Crawler can now extract links from page content using regular expressions (with optional replacement for URL rewriting). Crawler can now log depth for easier debugging, and tracking of known URLs can be set to one of two modes (one saves memory, the other CPU time).
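The regex-based link extraction with optional rewriting described above could look roughly like this sketch using java.util.regex (the pattern and class names are illustrative assumptions, not the crawler's actual code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: extract href targets with a regex; optionally rewrite each URL
// with a find/replace pattern, as the "replacement for URL rewriting" suggests.
public class LinkExtractor {
    private static final Pattern HREF =
        Pattern.compile("href=\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    public static List<String> extract(String html, String find, String replace) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String url = m.group(1);
            if (find != null) {
                url = url.replaceAll(find, replace); // optional URL rewriting
            }
            links.add(url);
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://example.com/page?id=1\">x</a>";
        // Rewrite query-style URLs into path-style ones.
        System.out.println(extract(html, "\\?id=", "/id/"));
    }
}
```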
Release Notes: Support was added for escaping "&" and "," in URLs. The delay parameter can now take time units such as 1.3s and 2h. A new per-site parameter, "crawltime" (also usable on the command line), was added for limiting the time spent crawling a site.
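The notes do not specify how unit suffixes like 1.3s and 2h are interpreted; the sketch below shows one plausible conversion to milliseconds. The supported suffixes and the default unit are assumptions for illustration only:

```java
// Sketch: convert a delay spec such as "500", "1.3s", "2h" to milliseconds.
// Assumed suffixes: ms, s, m, h; a bare number is taken as milliseconds.
public class DelayParser {
    public static long toMillis(String spec) {
        spec = spec.trim().toLowerCase();
        long factor = 1; // default: milliseconds (an assumption)
        if (spec.endsWith("ms"))     { factor = 1;         spec = spec.substring(0, spec.length() - 2); }
        else if (spec.endsWith("s")) { factor = 1000;      spec = spec.substring(0, spec.length() - 1); }
        else if (spec.endsWith("m")) { factor = 60_000;    spec = spec.substring(0, spec.length() - 1); }
        else if (spec.endsWith("h")) { factor = 3_600_000; spec = spec.substring(0, spec.length() - 1); }
        return Math.round(Double.parseDouble(spec) * factor);
    }

    public static void main(String[] args) {
        System.out.println(toMillis("1.3s")); // 1300
        System.out.println(toMillis("2h"));   // 7200000
    }
}
```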
Release Notes: Support was added for crawling delays. Rejected links are now logged, which is useful for extracting URLs from a site. A crash that occurred when no default masks were used has been fixed.
Release Notes: This release fixes various crashes. The documentation has been converted to Docbook and updated. A .jar file is distributed instead of .class files.
Release Notes: This release supports and generates real Referer: headers (the HTTP spelling of "referrer"). This is needed for sites with very strict Referer header checks.
Release Notes: When a connection is closed, the program no longer retries indefinitely. Support for Smart Cache 0.87+ was added.
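Capping retries on a closed connection can be sketched as below. The retry limit and the Callable-based fetch hook are illustrative assumptions, not the program's actual structure:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Sketch: retry a fetch a bounded number of times instead of forever.
public class BoundedRetry {
    static final int MAX_RETRIES = 3; // assumed cap, for illustration

    public static <T> T withRetries(Callable<T> fetch) throws Exception {
        IOException last = null;
        for (int attempt = 1; attempt <= MAX_RETRIES; attempt++) {
            try {
                return fetch.call();
            } catch (IOException e) {
                last = e; // e.g. connection closed; try again up to the cap
            }
        }
        throw last; // give up after MAX_RETRIES instead of looping forever
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated fetch that fails twice, then succeeds on the third try.
        String result = withRetries(() -> {
            if (++calls[0] < 3) throw new IOException("connection closed");
            return "page body";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```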
Release Notes: Network I/O error handling is improved.
Release Notes: This release adds support for V4 .cacheinfo files, which are new in Smart Cache 0.74.
Release Notes: This release has a new, improved manual. A problem with masks not being merged when fetching documents from a known location specified by URL (rather than by alias) has been fixed.