121 projects tagged "Indexing"
WAscii is a Web frontend for displaying an AsciiDoc documentation repository. It allows you to search and browse your documentation files and automatically converts AsciiDoc to HTML, PDF, and ODF documents. It is intended to work directly from a Subversion repository containing your AsciiDoc files.
TiTLi is a Google-like search tool for relational databases. It builds on top of Apache Lucene to provide an API and a GWT-based UI for searching multiple databases from various vendors simultaneously. It is very fast because searches run against the index, and the database is queried only when a record is chosen.
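The index-first lookup described for TiTLi can be illustrated with a minimal sketch. This is not TiTLi's code or API and it does not use Lucene: it assumes a plain in-memory inverted index over an SQLite table, and the `people` table and every function name are hypothetical, chosen only for illustration.

```python
import sqlite3
from collections import defaultdict

def build_index(conn):
    """Scan the hypothetical 'people' table once and map each lowercase token to row ids."""
    index = defaultdict(set)
    for row_id, name, city in conn.execute("SELECT id, name, city FROM people"):
        for token in f"{name} {city}".lower().split():
            index[token].add(row_id)
    return index

def search(index, query):
    """Answer a query from the index alone: ids of rows that contain every query token."""
    tokens = query.lower().split()
    if not tokens:
        return []
    hits = set.intersection(*(index.get(token, set()) for token in tokens))
    return sorted(hits)

def fetch_record(conn, row_id):
    """Touch the database only for the single record the user picked."""
    return conn.execute(
        "SELECT id, name, city FROM people WHERE id = ?", (row_id,)
    ).fetchone()

# Illustrative data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "Ada Lovelace", "London"), (2, "Alan Turing", "London"), (3, "Grace Hopper", "New York")],
)

index = build_index(conn)               # one full scan at indexing time
matches = search(index, "london alan")  # no SQL is issued here
record = fetch_record(conn, matches[0]) # one SQL query for the chosen hit
```

In this sketch the table is scanned once at indexing time, searches run entirely against the index, and the only per-search SQL is the single lookup for the selected record.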
SCAN is a personal information retrieval framework that combines search, text analysis, tagging, and metadata functions for managing document collections. SCAN is component-based software that uses plugins for specific features. The basic SCAN platform can be easily extended with plugins for different document formats and document location types.
Luke is a handy development and diagnostic tool for Apache Lucene. It accesses existing Lucene indexes and allows you to display and modify their contents in several ways. A user can browse by document number or by term; view documents and copy them to the clipboard; retrieve a ranked list of the most frequent terms; execute a search, then browse and analyze the results; selectively delete documents from the index; reconstruct the original document fields, edit them, and reinsert them into the index; optimize indexes; and much more. Luke can also be extended through plugins.
LuSql is a command-line Java application for constructing a Lucene index from an arbitrary SQL query against a JDBC-accessible SQL database. It allows a user to control a number of parameters, including the SQL query to use, the indexing, storage, and term-vector settings of individual fields, the analyzer, the stop word list, and other tuning parameters. In its default mode, it uses threading to take advantage of multiple cores. LuSql can handle complex queries, allows for additional per-record subqueries, and has a plug-in architecture for arbitrary Lucene document manipulation.
SILVERCODERS DocStorage is a utility to improve document management. You can have one database for all invoices, guarantees, protocols, and other documents. DocStorage can extract plain text from documents in DOC, XLS, PPT, PDF, RTF, ODT, ODS, ODP, DOCX, XLSX, PPTX, and many other formats. It can use an OCR engine to extract plain text even from scanned documents. It can perform a global full-text search across all documents regardless of format. It supports document versioning, duplicate detection, document notes, and document signing. It provides full integration with software suites like Microsoft Office and OpenOffice.
Itzam/Sharp is a C# class library for creating and manipulating keyed-access database files containing variable-length, random-access records. Information is referenced by a user-defined key value; indexes may be combined with or separate from data. Itzam/Sharp implements the Itzam engine (see Itzam/Core and Itzam/Java) in 100% managed C#. At 32K, the Itzam/Sharp engine is both small and powerful, and it works with any .NET language, including F# and Visual Basic.
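The idea of keyed access to variable-length records can be sketched roughly as follows. This is not Itzam's file format or API: the `KeyedStore` class, its length-prefixed record layout, and the in-memory index kept separate from the data are all assumptions made purely for illustration.

```python
import io
import struct

class KeyedStore:
    """Hypothetical append-only store: variable-length records located through a key index."""

    def __init__(self, stream):
        self.stream = stream  # any seekable binary stream (a file or BytesIO)
        self.index = {}       # key -> (payload offset, payload length), kept apart from the data

    def put(self, key, payload: bytes):
        # Append a 4-byte little-endian length prefix followed by the payload.
        self.stream.seek(0, io.SEEK_END)
        offset = self.stream.tell()
        self.stream.write(struct.pack("<I", len(payload)))
        self.stream.write(payload)
        # Point the key at the newest copy; any older copy is simply left unreferenced.
        self.index[key] = (offset + 4, len(payload))

    def get(self, key) -> bytes:
        # Random access: seek straight to the payload without scanning earlier records.
        offset, length = self.index[key]
        self.stream.seek(offset)
        return self.stream.read(length)

store = KeyedStore(io.BytesIO())
store.put("a", b"short")
store.put("b", b"a much longer record")
store.put("a", b"replaced")
```

Because each key maps to an offset and a length, records of different sizes can be read back directly. A real engine would also persist the index and reclaim the space left by replaced records, which this sketch omits.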