9 projects tagged "Indexing"

SWISH++ (updated 15 Mar 2006)

Pop 236.75
Vit 9.25

SWISH++ is a Unix-based file indexing and searching engine, typically used to index and search files on web sites. It is based on SWISH-E but is a complete rewrite. SWISH++ is at least 10 times faster than SWISH-E and can handle much larger numbers of files. It also has unique features such as selective non-indexing, on-the-fly filters, and user-selectable stemming.

WebGlimpse (updated 29 Oct 2008)

Pop 174.42
Vit 11.35

WebGlimpse is a scalable, feature-rich search engine for indexing your Web site or any collection of local and remote sites you choose. Features include customizable output formats, custom ranking and ordering of hits, fuzzy matching, boolean queries, a Web administration interface for multiple archives, query logging, and result caching. Localized search interfaces are provided in multiple languages, including Spanish, German, French, Italian, Norwegian, Finnish, Russian, and Hebrew. It supports third-party filters for indexing PDF, Word, and Excel files. It is free for academic and most nonprofit users.

GNU libextractor (updated 23 Dec 2013)

Pop 492.93
Vit 52.04

libextractor is a library used to extract meta-data from files of arbitrary type. It is designed to use helper-libraries to perform the actual extraction, and to be trivially extendable by linking against external extractors for additional file types. The goal is to provide developers of file-sharing networks, file managers, and WWW-indexing bots with a universal library to obtain meta-data about files. It includes a shell-command and bindings for Java (JNI) and Python.

POPsearch (updated 27 Apr 2005)

Pop 71.06
Vit 4.94

POPsearch is a desktop search engine designed to help you easily find information on your computer. With features that other search engines don't have, it lets you index your entire collection of email messages and files. As information is indexed, it becomes immediately available for analysis from any Web browser. When POPsearch is configured correctly, you can also access your data remotely through RSS feeds, email feeds, or any computer that has a Web browser.

Doodle (updated 14 Jan 2010)

Pop 165.99
Vit 6.28

Doodle is a desktop search engine for Linux. It searches your hard drive for files using pattern matching on meta-data. It extracts file-format specific meta-data using libextractor and builds a suffix tree to index the files, so the index can be searched rapidly. It is similar to locate, but can take advantage of information such as ID3 tags. Full-text indexing is possible using the appropriate libextractor plugins. It also supports using FAM (the File Alteration Monitor) to keep the database up to date.
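The idea of indexing meta-data by suffix so that any substring can be matched quickly can be sketched as follows. This is only an illustration, not Doodle's implementation: it swaps the suffix tree for a simpler sorted list of suffixes searched with binary search, and the names `build_suffix_index` and `search` and the sample paths are hypothetical.

```python
from bisect import bisect_left


def build_suffix_index(metadata):
    """Build a sorted list of (suffix, path) pairs.

    `metadata` maps a file path to a list of keyword strings, standing in
    for what a meta-data extractor would return (an assumption here).
    """
    entries = set()
    for path, keywords in metadata.items():
        for keyword in keywords:
            keyword = keyword.lower()
            # Every suffix of a keyword is stored, so a substring query
            # becomes a prefix query against the sorted suffixes.
            for i in range(len(keyword)):
                entries.add((keyword[i:], path))
    return sorted(entries)


def search(index, pattern):
    """Return the paths whose meta-data contains `pattern` as a substring."""
    pattern = pattern.lower()
    # (pattern,) sorts before every (pattern..., path) entry, so this finds
    # the first suffix that could start with the pattern.
    start = bisect_left(index, (pattern,))
    hits = set()
    for suffix, path in index[start:]:
        if not suffix.startswith(pattern):
            break  # matching suffixes are contiguous in sorted order
        hits.add(path)
    return sorted(hits)


if __name__ == "__main__":
    sample = {
        "/music/a.mp3": ["Miles Davis", "Kind of Blue"],
        "/docs/b.pdf": ["Blueprint"],
    }
    index = build_suffix_index(sample)
    print(search(index, "blue"))
```

A real suffix tree avoids storing every suffix explicitly; the sorted list trades memory for brevity.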

Amberfish (updated 03 Jul 2004; no download listed)

Pop 13.71
Vit 59.78

Amberfish is a general purpose text/XML retrieval utility. It features indexing of both free text and nested fields, built-in support for XML documents, structured queries allowing generalized field/tag paths, hierarchical result sets, automatic searching across multiple databases, efficient indexing, and relatively low memory requirements.

Sman (updated 28 May 2008)

Pop 64.80
Vit 3.43

Sman is "The Searcher for Man Pages", an enhanced version of "apropos" and "man -k". Sman adds several key abilities over its predecessors, including stemming and support for complex boolean text searches such as "(linux and kernel) or (mach and microkernel)". It shows results in ranked order, optionally with a summary of the man page in which the searched text is highlighted. Searches may be applied to the man page section, title, body, or filename. The complete contents of each man page are indexed, and a prebuilt index is used to perform fast searches.

Recoll (updated 30 Nov 2013)

Pop 270.38
Vit 31.04

Recoll is a personal full-text desktop search tool based on Xapian. It provides an easy-to-use, feature-rich Qt-based GUI that is simple to administer. Text, HTML, PDF, PostScript, MS Word, OpenOffice, WordPerfect, KWord, AbiWord, maildir, and mailbox mail folder formats are supported, along with their compressed versions and quite a few others. Powerful query facilities are provided. Multiple character sets are supported, and internal processing and storage use Unicode UTF-8. Stemming is performed at query time, and the stemming language can be switched after indexing.
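Query-time stemming means the index keeps the original terms and the stemmer is applied only when a query arrives, which is why the stemming language can change without reindexing. The sketch below illustrates that idea only and is not Recoll's or Xapian's code: `toy_stem` is a deliberately crude stand-in for a real stemmer, and all names and sample documents are hypothetical.

```python
from collections import defaultdict


def toy_stem(word):
    """Crude suffix stripper used only for illustration (not a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each unstemmed, lowercased term to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, term, stem=toy_stem):
    """Expand `term` to every indexed term that shares its stem, then collect hits.

    The stemmer is a parameter, so it can be swapped per query while the
    index itself stays untouched.
    """
    target = stem(term.lower())
    hits = set()
    for indexed_term, doc_ids in index.items():
        if stem(indexed_term) == target:
            hits |= doc_ids
    return sorted(hits)


if __name__ == "__main__":
    docs = {1: "indexing documents", 2: "indexed mail", 3: "document index"}
    index = build_index(docs)
    print(search(index, "indexes"))                     # stemmed match
    print(search(index, "index", stem=lambda w: w))     # exact match only
```

Scanning every indexed term per query is the simplest possible expansion; a production engine would presumably precompute a stem-to-terms mapping instead.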

Search::Xapian (updated 10 May 2012)

Pop 42.99
Vit 8.72

Search::Xapian is a Perl XS frontend to the Xapian C++ search library. It is a fairly complete wrapper: most features of the Xapian library are made available for use from Perl. Xapian is a highly adaptable toolkit that allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model as well as a rich set of boolean query operators. It's fast and scalable to hundreds of millions of documents.
