13 projects tagged "Indexing"

Pyndex (updated 02 May 2003)

Pop 43.63
Vit 1.74

Pyndex is a simple and fast full-text indexer implemented in Python. It also includes an easy-to-use Bayesian classifier. It uses Metakit as its storage back-end. It works well for quickly adding a search feature to an application, and is also well suited to in-memory indexing and searching. It can handle phrase queries. It performs best in applications involving a few thousand documents, but its scaling is mostly limited by available memory.
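To illustrate what in-memory indexing with phrase queries involves, here is a minimal sketch of a positional inverted index. It is not Pyndex's API or implementation: the class `TinyIndex` and its methods `add` and `phrase` are hypothetical names chosen for this example, and tokenization is assumed to be a plain whitespace split.

```python
from collections import defaultdict


class TinyIndex:
    """Toy in-memory positional index (hypothetical; not Pyndex's API)."""

    def __init__(self):
        # term -> {doc_id -> [positions of the term in that document]}
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        # Record the position of every lowercased, whitespace-separated term.
        for pos, term in enumerate(text.lower().split()):
            self.postings[term].setdefault(doc_id, []).append(pos)

    def phrase(self, query):
        """Return sorted ids of documents containing the words consecutively."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = []
        # Start from every occurrence of the first term, then check that each
        # following term appears at the next position in the same document.
        for doc_id, starts in self.postings.get(terms[0], {}).items():
            for start in starts:
                if all(
                    start + i in self.postings.get(term, {}).get(doc_id, [])
                    for i, term in enumerate(terms[1:], 1)
                ):
                    hits.append(doc_id)
                    break
        return sorted(hits)


index = TinyIndex()
index.add(1, "fast full text indexer")
index.add(2, "full speed text search")
result = index.phrase("full text")  # only document 1 has the words adjacent
```

Because every posting list lives in a Python dictionary, memory use grows with the size of the collection, which is the kind of limit the description mentions.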

Lupy (updated 11 May 2004)

Pop 60.47
Vit 3.66

Lupy is a full-text indexer for Python. It is a port of Jakarta Lucene to Python, and reads, writes, and searches indexes in Lucene binary format. Like Lucene, it is sophisticated, scalable, and Unicode aware.

Xapian and Omega (updated 28 Jun 2012)

Pop 402.16
Vit 16.25

Xapian is a search engine library, scalable to collections containing hundreds of millions of documents. It is written in C++ with bindings for Perl, Python, PHP, Java, Tcl, C#, Ruby, and Lua. It is a highly adaptable toolkit that allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model and also a rich set of Boolean query operators. Omega is a Web search application built upon the Xapian library. It can index a Web server's document tree (including HTML, PDF, OpenOffice, MS Word/Excel/PowerPoint/Works, WordPerfect, RTF, PS, etc.), or data exported from arbitrary sources (e.g. SQL databases).

ViIndex (updated 19 Aug 2003)

Pop 15.94
Vit 1.00

ViIndex is an indexing program made up of three parts: a flexible and powerful indexer, a finder that lets you combine AND, OR, regular expressions, and the relative positions of keywords in a query, and a powerful displayer with filter functionality.

PyLucene (updated 30 Oct 2011)

Pop 115.74
Vit 9.74

PyLucene is a Python extension for accessing Java Lucene from Python. Its goal is to allow use of Lucene's text indexing and searching capabilities from Python. It is designed to be API compatible with the latest version of Java Lucene.

Alb (updated 07 Jul 2004)

Pop 18.49
Vit 1.41

Alb generates hierarchical, captioned, XHTML 1.1/CSS 2.1 Web galleries from image directories.

Dowser (updated 06 Oct 2004)

Pop 48.93
Vit 2.24

Dowser is a Web research and archiving tool that clusters results from search engines, associates words that appear in previous searches, and keeps a local cache of all the results you click on in a searchable database along with summaries and links to related information. It helps you to keep track of what you find, with no advertising.

Isobel (updated 29 Nov 2005)

Pop 24.41
Vit 1.00

Isobel is a framework for building complex information retrieval and analysis systems. Isobel can be functionally divided into two subsystems: Isobel Gatherer (the crawling and filtering subsystem) and Isobel Analyzer (the analysis subsystem). The two subsystems can also be used separately. Isobel Gatherer offers ready-to-use services such as content fetching, scheduling, document format conversion, hyperlink graph storage and analysis, and content storage and indexing. A programmer can easily add new services. Isobel Analyzer uses the IBM UIMA architecture so that analysis components developed for that architecture can be reused.

pygccxml (updated 20 Oct 2008)

Pop 38.21
Vit 3.25

The purpose of the GCC-XML extension is to generate an XML description of a C++ program from GCC's internal representation. The purpose of pygccxml is to read a GCC-XML generated file and provide a simple framework to navigate C++ declarations using Python classes.

tag-not-ed (updated 12 Feb 2006)

Pop 15.49
Vit 1.00

tag-not-ed is a system that allows you to create and manage text documents by attaching tags to them. Later, documents can be retrieved by running queries on those tags (e.g., "show me all docs that deal with 'dogs' and 'cats'"). It is composed of a front-end (currently a mode for the jed text editor) and an indexer. The latter can be used to implement a rudimentary "tagging file system".
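The tag queries described above can be pictured as set intersections over a tag-to-documents map. The following is only a sketch under that assumption, not tag-not-ed's actual indexer: the class `TagIndex`, its methods `tag` and `query_all`, and the file names are all hypothetical.

```python
from collections import defaultdict


class TagIndex:
    """Toy tag index (hypothetical; not tag-not-ed's implementation)."""

    def __init__(self):
        # tag -> set of documents carrying that tag
        self.docs_by_tag = defaultdict(set)

    def tag(self, doc, *tags):
        # Attach each tag to the document, ignoring letter case.
        for name in tags:
            self.docs_by_tag[name.lower()].add(doc)

    def query_all(self, *tags):
        """Return the sorted documents that carry every one of the given tags."""
        sets = [self.docs_by_tag.get(name.lower(), set()) for name in tags]
        return sorted(set.intersection(*sets)) if sets else []


tags = TagIndex()
tags.tag("pets.txt", "dogs", "cats")
tags.tag("kennel.txt", "dogs")
both = tags.query_all("dogs", "cats")  # documents tagged with both terms
```

An OR query would use a set union in place of the intersection.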
