SCAN is a personal information retrieval framework that combines search, text analysis, tagging, and metadata functions for managing document collections. It is component-based software that relies on plugins for specific features, and the basic SCAN platform can be extended with plugins for additional document formats and document location types.
jSlovo is a fast database engine with a GUI, designed for free dictionaries. It creates a file-based database from a text file, which can then be searched for particular words. It can search any number of file-based databases, and the size of the databases is not limited. HTML tags can be used in the text files, including for cross-references.
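The idea of turning a text file into a searchable word list can be sketched roughly as follows. This is not jSlovo's code or file format: the tab-separated "headword<TAB>definition" layout, the class name DictionarySketch, and its methods are all assumptions made for illustration, and the sketch keeps entries in memory rather than in a file-based database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;

// Hypothetical sketch: not jSlovo's implementation or file format.
class DictionarySketch {
    // A sorted map turns a prefix search into a short range scan.
    private final TreeMap<String, String> entries = new TreeMap<>();

    // Assumed input: one entry per line, "headword<TAB>definition".
    void load(List<String> lines) {
        for (String line : lines) {
            int tab = line.indexOf('\t');
            if (tab <= 0) {
                continue; // skip blank or malformed lines
            }
            String word = line.substring(0, tab).trim().toLowerCase(Locale.ROOT);
            if (word.isEmpty()) {
                continue;
            }
            entries.put(word, line.substring(tab + 1).trim());
        }
    }

    // Exact, case-insensitive lookup; returns null when the word is absent.
    String lookup(String word) {
        return entries.get(word.toLowerCase(Locale.ROOT));
    }

    // All headwords starting with the given prefix, in sorted order.
    List<String> wordsWithPrefix(String prefix) {
        String from = prefix.toLowerCase(Locale.ROOT);
        List<String> result = new ArrayList<>();
        for (String word : entries.tailMap(from, true).keySet()) {
            if (!word.startsWith(from)) {
                break; // keys are sorted, so no later key can match
            }
            result.add(word);
        }
        return result;
    }
}
```

A list of lines suitable for load can be obtained from a real file with java.nio.file.Files.readAllLines.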
Associations Indexing Service (AIS) was originally built as an extension of human memory: it tags resources, URIs, bookmarks, and memos by storing them under personal keywords and associations, so that the information can later be retrieved quickly with the same keywords or with queries, much as in popular search engines. It can also serve as a local search engine that automatically indexes large file hierarchies (e.g. personal archives or file repositories). It is based on Lucene, so searches should stay fast even as the index grows.
LuSql is a command-line Java application for constructing a Lucene index from an arbitrary SQL query against a JDBC-accessible SQL database. It lets the user control a number of parameters, including the SQL query to use, the indexing, storage, and term-vector settings of individual fields, the analyzer, the stop word list, and other tuning parameters. In its default mode, it uses threading to take advantage of multiple cores. LuSql can handle complex queries, allows additional per-record sub-queries, and has a plug-in architecture for arbitrary Lucene document manipulation.
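The per-field indexing and storage choices described above can be illustrated with a small sketch. It is not LuSql's code and uses neither Lucene nor JDBC: plain Java collections stand in for both, each row is a map from column name to value, and the names RowIndexSketch and FieldSpec are hypothetical. The tokenizer is a simple split on non-word characters, an assumption in place of a real analyzer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: plain collections stand in for Lucene and JDBC.
class RowIndexSketch {
    // Per-field policy: is the field searchable, and is its value kept?
    static final class FieldSpec {
        final boolean indexed;
        final boolean stored;

        FieldSpec(boolean indexed, boolean stored) {
            this.indexed = indexed;
            this.stored = stored;
        }
    }

    private final Map<String, FieldSpec> specs;
    private final Set<String> stopWords;
    // term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // document id -> stored field values
    private final List<Map<String, String>> storedFields = new ArrayList<>();

    RowIndexSketch(Map<String, FieldSpec> specs, Set<String> stopWords) {
        this.specs = specs;
        this.stopWords = stopWords;
    }

    // Adds one result row (column name -> value) and returns its document id.
    int addRow(Map<String, String> row) {
        int docId = storedFields.size();
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> column : row.entrySet()) {
            FieldSpec spec = specs.get(column.getKey());
            if (spec == null || column.getValue() == null) {
                continue; // columns without a policy are ignored
            }
            if (spec.stored) {
                kept.put(column.getKey(), column.getValue());
            }
            if (spec.indexed) {
                // Naive tokenizer: lower-case, split on non-word characters.
                String[] terms = column.getValue().toLowerCase(Locale.ROOT).split("\\W+");
                for (String term : terms) {
                    if (term.isEmpty() || stopWords.contains(term)) {
                        continue;
                    }
                    postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
                }
            }
        }
        storedFields.add(kept);
        return docId;
    }

    // Ids of documents containing the term in any indexed field.
    Set<Integer> search(String term) {
        Set<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? new TreeSet<Integer>() : new TreeSet<>(hits);
    }

    // Stored values of one document; non-stored fields are absent.
    Map<String, String> stored(int docId) {
        return storedFields.get(docId);
    }
}
```

In this sketch a field that is indexed but not stored can be searched but not displayed, while a field that is stored but not indexed is returned with hits yet cannot be searched.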
Isobel is a framework for building complex information retrieval and analysis systems. Isobel is functionally divided into two subsystems, Isobel Gatherer (the crawling and filtering subsystem) and Isobel Analyzer (the analysis subsystem), which can also be used separately. Isobel Gatherer offers ready-to-use services such as content fetching, scheduling, document format conversion, hyperlink graph storage and analysis, and content storage and indexing. A programmer can add new services. Isobel Analyzer uses the IBM UIMA architecture so that analysis components developed for that architecture can be reused.
TiTLi is a Google-like search tool for relational databases. It builds on top of Apache Lucene to provide an API and a GWT-based UI for searching multiple databases from various vendors simultaneously. Searches run against the index, which keeps them fast, and the database itself is queried only when a record is chosen.
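The "search the index, query the database only on selection" pattern can be sketched as below. This is not TiTLi's API: LazySearchSketch, RecordRef, and the loader functions are hypothetical names, a plain map stands in for the Lucene index, and each database is represented by a function that loads one record on demand.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: not TiTLi's API; a map stands in for Lucene.
class LazySearchSketch {
    // Points at one row without holding its contents.
    static final class RecordRef {
        final String database;
        final String table;
        final String key;

        RecordRef(String database, String table, String key) {
            this.database = database;
            this.table = table;
            this.key = key;
        }

        @Override
        public String toString() {
            return database + "." + table + "#" + key;
        }
    }

    // term -> references to the rows that contain it
    private final Map<String, List<RecordRef>> index = new HashMap<>();
    // database name -> function that loads a full record on demand
    private final Map<String, Function<RecordRef, String>> loaders = new HashMap<>();

    void registerDatabase(String name, Function<RecordRef, String> loader) {
        loaders.put(name, loader);
    }

    // Called once per row at indexing time; only the reference is kept.
    void indexRecord(RecordRef ref, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (term.isEmpty()) {
                continue;
            }
            List<RecordRef> refs = index.computeIfAbsent(term, t -> new ArrayList<>());
            if (!refs.contains(ref)) {
                refs.add(ref);
            }
        }
    }

    // Answers from the index alone; no database is contacted here.
    List<RecordRef> search(String term) {
        List<RecordRef> hits = index.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? new ArrayList<RecordRef>() : new ArrayList<>(hits);
    }

    // Contacts the owning database only for the record the user chose.
    String open(RecordRef chosen) {
        Function<RecordRef, String> loader = loaders.get(chosen.database);
        if (loader == null) {
            throw new IllegalArgumentException("unknown database: " + chosen.database);
        }
        return loader.apply(chosen);
    }
}
```

Because search returns only references, hits from several databases can be listed together, and the cost of a database round trip is paid once, for the record that is actually opened.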