118 projects tagged "Indexing"

harvest (updated 30 Nov 2003)

Pop 188.75
Vit 8.65

Harvest is a system to collect information and make it searchable using a Web interface. It can collect information using HTTP, FTP, NNTP, and local files. Supported formats include HTML, DVI, PS, fulltext, mail, man pages, news, troff, WordPerfect, C sources, and many more. Adding support for new formats is easy due to Harvest's modular design.

HTMLDOC (updated 06 Jan 2014)

Pop 743.95
Vit 38.94

HTMLDOC converts HTML files and Web pages into indexed HTML, PostScript, and PDF files suitable for online viewing and printing. It can be used as a standalone GUI application, in a batch document processing environment, as a Web-based report generation application, or in embedded environments to support printing of HTML content. It runs on all Unix platforms as well as Mac OS X and Windows 2000 and higher.

Namazu (updated 15 Dec 2004)

Pop 148.92
Vit 3.69

Namazu is a full-text search system intended for easy use. It works not only as a small- or medium-scale Web search engine but also as a personal search system for email or other files. Supported document types: HTML, Mail/News, MHonArc, RFC, TeX (with detex), man (with groff), Word (with wvWare), PDF (with pdftotext), and plain text.

Net::Z3950::SimpleServer (updated 14 Jun 2004)

Pop 50.26
Vit 2.63

Net::Z3950::SimpleServer is a Perl module which implements the server side of the Z39.50 (information retrieval) protocol. It hides the complexity of network exchanges, packet serialization, and session handling. You are required only to implement simple callbacks to support searching and record retrieval. It is the basis of the "Zoogle" project, which is a Z39.50 gateway to the Google web index.

Sary (updated 30 Jan 2001)

Pop 25.38
Vit 2.66

Sary is a suffix array library and a set of tools. It provides fast full-text search for text files on the order of 10 to 100 MB using a data structure called a suffix array. It can also search specific fields in a text file by assigning index points to those fields.
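The idea behind a suffix array can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Sary's actual API or implementation: the function names `build_suffix_array` and `find_all` and the `index_points` parameter are hypothetical, and the naive sort used here would be far too slow for files in the 10 to 100 MB range.

```python
def build_suffix_array(text, index_points=None):
    """Return the chosen suffix start offsets, sorted by the suffix they begin.

    index_points is a hypothetical parameter: when given, only those offsets
    (for example, the start of each field) are indexed and searchable.
    """
    points = range(len(text)) if index_points is None else index_points
    return sorted(points, key=lambda i: text[i:])


def find_all(text, suffix_array, pattern):
    """Binary-search the suffix array for every indexed occurrence of pattern."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All matches sit in one contiguous slice; return offsets in text order.
    return sorted(suffix_array[start:lo])
```

For example, `find_all("banana", build_suffix_array("banana"), "ana")` returns `[1, 3]`. Passing only word-start offsets as `index_points` restricts matches to those positions, which is roughly the effect that field-level index points are described as having.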

SWISH++ (updated 15 Mar 2006)

Pop 238.95
Vit 9.24

SWISH++ is a Unix-based file indexing and searching engine, typically used to index and search files on Web sites. It was based on SWISH-E, although it is a complete rewrite. SWISH++ is at least 10 times faster than SWISH-E and can handle much larger numbers of files. Additionally, it has unique features such as selective non-indexing, on-the-fly filters, user-selectable stemming, and more.

wf (updated 17 Jan 2008)

Pop 41.35
Vit 2.94

wf scans a text file or standard input, counts how often each word occurs across the whole text, and writes each word with its frequency to stdout.
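The kind of counting described here can be sketched in Python as follows. This is only an illustration, not wf's own code: the function names, the definition of a "word" (runs of letters, digits, and apostrophes, case-folded), and the tab-separated output format are all assumptions.

```python
import re
import sys
from collections import Counter


def word_frequencies(text):
    """Count how often each word occurs in text (case-insensitive)."""
    # Assumption: a word is a run of letters, digits, or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    return Counter(words)


def print_frequencies(stream=sys.stdin, out=sys.stdout):
    """Read all of stream and write 'word<TAB>count' lines, most frequent first."""
    for word, count in word_frequencies(stream.read()).most_common():
        out.write(f"{word}\t{count}\n")
```

Calling `print_frequencies()` with no arguments reads standard input and writes to stdout; passing an open file object as `stream` covers the text-file case.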

YASE (updated 13 Apr 2009)

Pop 79.94
Vit 3.30

YASE is a text indexing and retrieval system. It allows you to index your document collection very easily. All words are indexed and can optionally be stemmed. The query tool supports searching for all or any of the given terms and can rank query results by relevance using the cosine measure.
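Ranking by the cosine measure can be sketched as below. This is a generic illustration, not YASE's implementation: it assumes raw term counts as vector weights (real systems often use tf-idf or similar), whitespace tokenization without stemming, and hypothetical function names.

```python
import math
from collections import Counter


def tokenize(text):
    """Split text into lowercase terms (assumption: whitespace-separated)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return (name, score) pairs for matching documents, best match first."""
    q = Counter(tokenize(query))
    scored = [(name, cosine(q, Counter(tokenize(body))))
              for name, body in documents.items()]
    # Drop documents that share no term with the query.
    matches = [pair for pair in scored if pair[1] > 0]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)
```

Because the cosine normalizes by vector length, a short document made up entirely of query terms can outrank a longer document that merely contains them.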

XM Tool (updated 03 Feb 2001)

Pop 25.61
Vit 1.00

XM Tool is a series of Perl snippets that can be called separately or combined into more complex Perl scripts. It uses XMLish (plain) text as the representation between stages. A sample processor that reads C/JavaDoc sources and generates HTML or even DocBook is provided.

Doli (updated 15 Jul 2002)

Pop 18.76
Vit 2.10

Doli (Documentation Libre Indexée) is a portable system to index and search documentation. The system consists of an indexer and a Tcl-based Web server that provides the search interface. It was designed to provide a platform-independent method for searching HTML documentation. A PHP and MySQL interface is also included.
