Ex-Crawler

The Ex-Crawler Project is divided into three subprojects. The first and main part is the Ex-Crawler daemon server, a highly configurable and flexible Web crawler written in Java. It comes with its own socket server, through which you can manage the server, users, distributed grid/volunteer computing, and much more. Crawled information is stored in a database (currently MySQL, PostgreSQL, and MSSQL are supported). The second part is a graphical (Java Swing) distributed grid/volunteer computing client, including user computer state detection, based on the JADIF Project. The third part is the Web search engine, written in PHP. It comes with a Content Management System, user language detection, multi-language support, and templates using Smarty. It also includes an application framework that is partly forked from Joomla 1.5, so that Joomla components can be adapted quickly.

Recent releases

  •  10 Jun 2010 13:26

Release Notes: This release features a complete database rework, many speed improvements (up to 60% faster), PDF crawling, language detection, a URL filter, and hundreds of other improvements, bugfixes, and updates. Ex-Crawler can now be run as a daemon, and startup scripts and a process watcher are included. Setup was simplified: a utility that creates the required database tables was added, and an automatic performance benchmark test was implemented so that you no longer need to set the number of threads manually.
