120 projects tagged "Indexing"

HTMLDOC (updated 06 Jan 2014)

Pop 724.34
Vit 33.93

HTMLDOC converts HTML files and Web pages into indexed HTML, PostScript, and PDF files suitable for online viewing and printing. It can be used as a standalone GUI application, in a batch document processing environment, as a Web-based report generation application, or in embedded environments to support printing of HTML content. It runs on all Unix platforms as well as Mac OS X and Windows 2000 and higher.

OpenSearchServer (updated 06 Apr 2014)

Pop 558.82
Vit 40.51

OpenSearchServer is a powerful, enterprise-class search engine. Using its Web user interface, crawlers (Web, file, database, etc.), and RESTful API, you can integrate advanced full-text search capabilities into your application.

GNU libextractor (updated 23 Dec 2013)

Pop 493.76
Vit 44.57

libextractor is a library for extracting metadata from files of arbitrary type. It is designed to use helper libraries to perform the actual extraction, and to be trivially extensible by linking against external extractors for additional file types. The goal is to provide developers of file-sharing networks, file managers, and WWW-indexing bots with a universal library for obtaining metadata about files. It includes a shell command and bindings for Java (JNI) and Python.

OpenGrok (updated 10 Apr 2014)

Pop 449.82
Vit 27.19

OpenGrok is a fast and usable source code search and cross-reference engine. It helps you search, cross-reference, and navigate your source tree. It understands various program file formats and the histories of version control systems such as Mercurial, Bazaar, Git, ClearCase, Perforce, SCCS, RCS, CVS, and Subversion. In other words, it lets you grok (profoundly understand) the source.

Xapian and Omega (updated 28 Jun 2012)

Pop 393.96
Vit 15.91

Xapian is a search engine library that scales to collections containing hundreds of millions of documents. It is written in C++ with bindings for Perl, Python, PHP, Java, Tcl, C#, Ruby, and Lua. It is a highly adaptable toolkit that allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model as well as a rich set of boolean query operators. Omega is a Web search application built upon the Xapian library. It can index a Web server's document tree (including HTML, PDF, OpenOffice, MS Word/Excel/PowerPoint/Works, WordPerfect, RTF, PS, etc.), or data exported from arbitrary sources (e.g. SQL databases).

Recoll (updated 06 May 2014)

Pop 379.03
Vit 78.18

Recoll is a personal full-text desktop search tool based on Xapian. It provides a feature-rich, easy-to-use, and easily administered interface with a Qt-based GUI. Text, HTML, PDF, PostScript, MS Word, OpenOffice, WordPerfect, KWord, AbiWord, maildir, and mailbox mail folder formats are supported, along with their compressed versions and quite a few others. Powerful query facilities are provided. Multiple character sets are supported, and internal processing and storage use Unicode UTF-8. Stemming is performed at query time, so the stemming language can be switched after indexing.

Emdros (updated 13 May 2014, no download)

Pop 362.32
Vit 137.96

Emdros is a corpus query system for storing and searching linguistically annotated text. It is very generic, supporting almost any kind of annotation from almost any linguistic theory. All linguistic levels of analysis are supported, including phonology, morphology, the lexical level, syntax, and discourse. The core libraries act as a middleware layer between a client and an underlying SQL database. MySQL, PostgreSQL, and SQLite (2 and 3) are supported.

Auto Directory Index PHP Script (updated 06 Jan 2007)

Pop 350.02
Vit 7.71

AutoIndex is a PHP script that generates a table listing the files in a directory and lets users access those files and subdirectories. It includes searching, icons for each file type, an admin panel, uploads, access logging, file descriptions, and more.

DocFetcher (updated 03 Mar 2014, no download)

Pop 346.54
Vit 26.41

DocFetcher is a desktop search application that lets you search the contents of documents on your computer. You can think of it as Google for your local files.

Apache Lucene (updated 05 Oct 2013)

Pop 287.97
Vit 19.47

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is suitable for nearly any application that requires full-text search, especially cross-platform ones.
