Release Notes: Better handling of special characters, better HTML-to-text extraction, support for new URL scheduling algorithms (including score-based algorithms), and support for exceptions to GeoIP. Some tests were fixed.


Release Notes: This release is integrated with the Solr enterprise search server, and can feed records directly to a Solr server. There is also a new version numbering system that is compatible with CPAN requirements.


Release Notes: Code for simple Lucene integration has been added to the templates directory. The documentation HTML generator has been changed to use tex4ht.


Release Notes: This release adds the ZebraIndexing switch to combineExport, which enables updating the configured Zebra server with exported records, and fixes a bug in Zebra recordId handling. It also adds the 'collapseinlinks' and 'nooutlinks' switches to combineExport, improves indexing of PDF documents, and fixes a bug in the processing of plain-text documents.


Release Notes: A full-text index was added to the MySQL table search, along with a configuration variable to enable or disable it. Integration with the Zebra database system was fixed. Updates, fixes, and code cleanup were done. Support for SVM classifiers was added (which depends on SVMLight). Country determination was added (which adds a dependency on GeoIP). Two new plugin types were added: "relevant text extraction" and "extra analysis".