SCAN is a personal information retrieval framework that combines search, text analysis, tagging, and metadata functions for managing document collections. It is component-based software that relies on plugins for specific features. The basic SCAN platform can be easily extended with plugins for different document formats and document location types.
Foxtrot is full-text indexing software for PDF, OpenOffice.org 1 and 2, MS Word, and XLS files. The package provides two different frontends: a Google-like search tool implemented with Perl-Gtk and a PHP-based Web interface. The backend scans directories asynchronously, converts files to text, and indexes them in a MySQL database.
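The scan, convert, and index steps of such a backend can be sketched roughly as follows. This is not Foxtrot's code: it is a minimal illustration that assumes plain-text input only, runs synchronously, and swaps in SQLite from the Python standard library in place of MySQL. The names `open_index`, `extract_text`, `index_directory`, and `search_word` are hypothetical.

```python
import os
import re
import sqlite3


def open_index(db_path=":memory:"):
    """Open (or create) the word -> file table. SQLite stands in for MySQL here."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS postings ("
        "word TEXT, path TEXT, PRIMARY KEY (word, path))"
    )
    return conn


def extract_text(path):
    """Placeholder converter: reads the file as plain text.

    A real backend would dispatch to a format-specific converter
    (PDF, word processor, spreadsheet) at this point.
    """
    with open(path, encoding="utf-8", errors="ignore") as handle:
        return handle.read()


def index_directory(conn, root):
    """Walk a directory tree and record which words occur in which file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            words = set(re.findall(r"\w+", extract_text(path).lower()))
            conn.executemany(
                "INSERT OR IGNORE INTO postings VALUES (?, ?)",
                [(word, path) for word in words],
            )
    conn.commit()


def search_word(conn, word):
    """Return the sorted paths of all indexed files containing the word."""
    rows = conn.execute(
        "SELECT path FROM postings WHERE word = ? ORDER BY path",
        (word.lower(),),
    )
    return [row[0] for row in rows]
```

Calling `index_directory(conn, "/some/dir")` and then `search_word(conn, "report")` would list the files that contain that word. Keeping one row per (word, file) pair keeps lookups simple, at the cost of dropping word positions and counts.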
Pinot is a D-Bus service that crawls and indexes your documents and monitors them for changes. It is also a GTK-based user interface that enables you to query the index built by the service or your favorite Web search engine, and to display and analyze the results. It makes full use of the advanced indexing and search facilities offered by Xapian, and features language detection, dynamic document summaries, easy labelling of documents, and internal support for common file types. The D-Bus interface allows easy integration with other applications.
Doodle is a desktop search engine for Linux. It searches your hard drive for files using pattern matching on meta-data. It extracts file-format-specific meta-data using libextractor and builds a suffix tree to index the files, so that the index can then be searched rapidly. It is similar to locate, but can take advantage of information such as ID3 tags. Full-text indexing is possible with the appropriate libextractor plugins. It also supports using FAM to keep the database up to date.
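To illustrate why indexing suffixes makes substring matching on meta-data fast, here is a minimal sketch. It is not Doodle's implementation: it uses a sorted list of suffixes (a simpler stand-in for a suffix tree), takes the meta-data as a ready-made dictionary instead of calling libextractor, and the names `build_suffix_index` and `search_metadata` are hypothetical.

```python
import bisect


def build_suffix_index(metadata):
    """Build a sorted list of (suffix, path) pairs.

    metadata maps a file path to a list of meta-data strings,
    for example the artist and album taken from ID3 tags.
    """
    entries = []
    for path, values in metadata.items():
        for value in values:
            text = value.lower()
            # Every suffix of the value is stored, so any substring of the
            # value is a prefix of at least one stored suffix.
            for start in range(len(text)):
                entries.append((text[start:], path))
    entries.sort()
    return entries


def search_metadata(index, pattern):
    """Return the sorted paths whose meta-data contains pattern."""
    pattern = pattern.lower()
    # Binary search for the first suffix that is >= pattern.
    position = bisect.bisect_left(index, (pattern,))
    matches = set()
    for suffix, path in index[position:]:
        if not suffix.startswith(pattern):
            break  # Matching suffixes are contiguous in sorted order.
        matches.add(path)
    return sorted(matches)
```

With an index built from two files tagged "Kind of Blue" and "Blue Train", `search_metadata(index, "blue")` returns both paths after a single binary search. A suffix tree answers the same query without storing each suffix separately, which matters for large collections.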