Methanol is a modular, customizable Web crawling system with crawlers optimized for speed. It is designed to allow the administrator to set up any kind of filetype handling, parsing, and indexing rules.
A general-purpose Java library.
A fork of the bacula.org project.