USCVprogs is a project developing a collection of utilities for the automated retrieval and processing of United States election returns and other U.S. election data. It includes parsers for the formats generated by commonly used voting machines from manufacturers such as ES&S, Diebold, and Sequoia, as well as Web spider software for monitoring and retrieving information from Board of Elections (BOE) websites. The utilities will work with direct data files as well as with OCR of bitmaps, scans, and faxes. The project is primarily being developed to mine data for USCountVotes.org, but the resulting utilities may also be of use to political campaigns and to local Town Clerks or Boards of Elections.
WWW::PkgFind watches Web sites, FTP sites, Git repositories, and similar sources for new code releases and downloads them. In other words, it is like a Web spider tuned for downloading software packages and patches. It can also generate a queue of incoming packages for subsequent processing, such as running tests on them.
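The watch-and-queue idea can be illustrated with a minimal sketch. This is not WWW::PkgFind's API (the module itself is Perl); it is a Python illustration under stated assumptions: the file-extension pattern, the `LinkCollector` class, and the `find_new_packages` function are all hypothetical names chosen for this example, and the page is passed in as a string rather than fetched over the network.

```python
import re
from html.parser import HTMLParser

# Assumption: release files are recognised purely by file extension.
PACKAGE_RE = re.compile(r"\.(tar\.gz|tar\.bz2|tgz|zip|patch|diff)$")


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_new_packages(listing_html, seen):
    """Return package links not seen before, in page order.

    `seen` is a set that is updated in place, so a second call on the
    same listing yields an empty queue.
    """
    parser = LinkCollector()
    parser.feed(listing_html)
    parser.close()

    queue = []
    for href in parser.links:
        if PACKAGE_RE.search(href) and href not in seen:
            seen.add(href)
            queue.append(href)
    return queue
```

A caller would keep the `seen` set between polling runs and hand the returned queue to whatever downloads or tests the packages.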
Search::Xapian is a Perl XS frontend to the Xapian C++ search library. It is a fairly complete wrapper: most features of the Xapian library are available from Perl. Xapian is a highly adaptable toolkit that lets developers add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model as well as a rich set of boolean query operators. It is fast and scales to hundreds of millions of documents.
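To show how boolean operators and probabilistic ranking fit together, here is a toy in-memory index in Python. It does not use Xapian or Search::Xapian and does not mirror their API; the `TinyIndex` class and its methods are hypothetical, and BM25 is used here as one well-known probabilistic weighting formula, chosen as an assumption for illustration. The sketch also assumes a non-empty document list and whitespace tokenisation.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class TinyIndex:
    """Toy inverted index: boolean matching plus BM25 ranking."""

    def __init__(self, docs):
        # Per-document term counts and lengths (assumes docs is non-empty).
        self.docs = [Counter(tokenize(d)) for d in docs]
        self.lengths = [sum(c.values()) for c in self.docs]
        self.avg_len = sum(self.lengths) / len(self.docs)
        # Postings: term -> set of document ids containing it.
        self.postings = {}
        for doc_id, counts in enumerate(self.docs):
            for term in counts:
                self.postings.setdefault(term, set()).add(doc_id)

    def boolean_and(self, *terms):
        sets = [self.postings.get(t, set()) for t in terms]
        return set.intersection(*sets) if sets else set()

    def boolean_or(self, *terms):
        return set().union(*(self.postings.get(t, set()) for t in terms))

    def bm25(self, doc_id, terms, k1=1.2, b=0.75):
        n = len(self.docs)
        score = 0.0
        for t in terms:
            tf = self.docs[doc_id][t]
            if tf == 0:
                continue
            df = len(self.postings[t])
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1.0)
            norm = tf + k1 * (1 - b + b * self.lengths[doc_id] / self.avg_len)
            score += idf * tf * (k1 + 1) / norm
        return score

    def search(self, terms, require_all=False):
        """Boolean filter first, then rank matches by BM25 (best first)."""
        terms = [t.lower() for t in terms]
        if require_all:
            candidates = self.boolean_and(*terms)
        else:
            candidates = self.boolean_or(*terms)
        return sorted(candidates, key=lambda d: (-self.bm25(d, terms), d))
```

The boolean step decides which documents match at all, and the weighting step only orders them; a real search library does both at far larger scale, with on-disk storage.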