73 projects tagged "Indexing"

Download Website Updated 30 Nov 2003 harvest

Pop 190.33
Vit 8.65

Harvest is a system to collect information and make it searchable using a Web interface. It can collect information using HTTP, FTP, NNTP, and local files. Supported formats include HTML, DVI, PS, fulltext, mail, man pages, news, troff, WordPerfect, C sources, and many more. Adding support for new formats is easy due to Harvest's modular design.

Download Website Updated 30 Jan 2001 Sary

Pop 25.10
Vit 2.66

Sary is a suffix array library with accompanying tools. It provides fast full-text search for text files on the order of 10 to 100 MB using a data structure called a suffix array. It can also search specific fields in a text file by assigning index points to those fields.
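
To illustrate the idea behind a suffix array, here is a minimal Python sketch. It is not Sary's API or implementation: the function names `build_suffix_array` and `find_all` are hypothetical, and the naive sort-based construction is chosen for brevity and would be too slow for files in the 10 to 100 MB range.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text, in sorted order.

    Naive construction (assumption for illustration): sort positions by
    the suffix that begins there.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text, sa, pattern):
    """Return every position where pattern occurs, using binary search."""
    m = len(pattern)

    # Lower bound: first suffix whose first m characters are >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose first m characters are > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[start:lo] begin with pattern.
    return sorted(sa[start:lo])
```

Because the suffixes are sorted, all occurrences of a pattern sit next to each other in the array, so a search needs only two binary searches instead of a scan of the whole text.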

Download Website Updated 17 Jan 2008 wf

Pop 41.17
Vit 2.94

wf scans a text file or standard input, counts how often each word occurs in the whole text, and writes each word with its frequency to stdout.
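
The kind of word count described here can be sketched in a few lines of Python. This is not wf's source code: the function names, the definition of a "word" (runs of letters, digits, and apostrophes, lowercased), and the tab-separated output format are all assumptions made for illustration.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word occurs in text (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)


def report(text):
    """Return one 'word<TAB>count' line per word, most frequent first."""
    counts = word_frequencies(text)
    # Sort by descending count, then alphabetically to keep output stable.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return "\n".join(f"{word}\t{n}" for word, n in ordered)
```

Printing `report(sys.stdin.read())` after importing `sys` would turn the sketch into a filter that reads standard input and writes to stdout.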

Download Website Updated 03 Feb 2001 XM Tool

Pop 25.57
Vit 1.00

XM Tool is a series of Perl snippets that can be called separately or combined into more complex Perl scripts. It uses XMLish (plain) text as the representation between stages. A sample processor that reads C/JavaDoc sources and generates HTML or even DocBook is provided.

Download Website Updated 15 Jul 2002 Doli

Pop 18.97
Vit 2.10

Doli (Documentation Libre Indexée) is a portable system to index and search documentation. The system consists of an indexer and a Tcl-based Web server that provides the search interface. It was designed to provide a platform-independent method for searching HTML documentation. A PHP and MySQL interface is also included.

Download Website Updated 24 Jun 2001 LinkMaster

Pop 31.30
Vit 68.41

LinkMaster is a method of linking data between different applications on Palm devices. Few applications support this method so far, but the list is growing. Even without special application support, it tracks recently used programs and bookmarks for quick access.

Download Website Updated 24 Oct 2001 gelapas

Pop 20.05
Vit 1.00

gelapas crawls the file tree and extracts information from files. The default settings (and the shorthand options) are useful for extracting information such as the title or meta tags from HTML files, but it could also be used for other kinds of documents.
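
A rough Python sketch of this kind of crawl follows. It does not reflect gelapas's actual options or internals: the names `TitleMetaParser`, `extract_info`, and `crawl` are hypothetical, and the sketch assumes that only files ending in `.html` or `.htm` are of interest.

```python
import os
from html.parser import HTMLParser


class TitleMetaParser(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.meta[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_info(html_text):
    """Return the title and named meta tags found in one HTML document."""
    parser = TitleMetaParser()
    parser.feed(html_text)
    parser.close()
    return {"title": parser.title.strip(), "meta": parser.meta}


def crawl(root):
    """Walk the tree under root and extract info from each HTML file."""
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith((".html", ".htm")):
                path = os.path.join(dirpath, name)
                # Assumption: undecodable bytes are replaced, not fatal.
                with open(path, encoding="utf-8", errors="replace") as fh:
                    results[path] = extract_info(fh.read())
    return results
```

Calling `crawl(".")` would return a dictionary mapping each HTML file path under the current directory to its title and meta tags.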

Download Website Updated 09 Feb 2002 XMLPublication

Pop 38.79
Vit 1.01

XMLPublication is a set of tools to generate Web pages from (possibly large) desktop documents or other structured documents, such as books with paragraphs, or tabular data. It cuts documents into Web pages, and creates customizable multi-indices. All this is done through a repeatable process in which data is separated from presentation and user settings. It uses XML techniques, particularly XSLT and Ant.

No download Website Updated 04 Feb 2002 Java Search Engine

Pop 75.56
Vit 66.75

Java Search Engine is a server-side search engine program for Web sites, written completely in Java. It features HTML and PDF indexing, a built-in Web crawler, support for international encodings, word and phrase search, and results returned as quotations with highlighted words (like Google). It is available as an EJB, JSP, servlet, or Java API library. For non-Java environments, it is available as an XML server with XSLT support.

Download Website Updated 21 Apr 2002 Ron's Indexing Program

Pop 24.08
Vit 1.00

Ripunix is a command-line system for indexing, searching, and browsing very large (multi-gigabyte) collections of plain text such as Project Gutenberg. It is optimized for efficiently maintaining these very large databases on machines with very small computing resources.
