35 projects tagged "Information Management"

Download Website Updated 28 Jun 2012 Xapian and Omega

Pop 404.90
Vit 16.30

Xapian is a search engine library, scalable to collections containing hundreds of millions of documents. It is written in C++ with bindings for Perl, Python, PHP, Java, Tcl, C#, Ruby, and Lua. It is a highly adaptable toolkit that allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model and also a rich set of boolean query operators. Omega is a Web search application built upon the Xapian library. It can index a Web server's document tree (including HTML, PDF, OpenOffice, MS Word/Excel/PowerPoint/Works, WordPerfect, RTF, PS, etc.), or data exported from arbitrary sources (e.g. SQL databases).

Download Website Updated 30 Nov 2013 Recoll

Pop 270.29
Vit 30.72

Recoll is a personal full-text desktop search tool based on Xapian. It provides an easy-to-use, feature-rich Qt-based GUI with simple administration. Text, HTML, PDF, PostScript, MS Word, OpenOffice, WordPerfect, KWord, Abiword, maildir, and mailbox mail folder formats are supported, along with their compressed versions and quite a few others. Powerful query facilities are provided. Multiple character sets are supported, and internal processing and storage use Unicode UTF-8. Stemming is performed at query time, so the stemming language can be switched after indexing.
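The query-time stemming mentioned above can be illustrated with a minimal Python sketch. This is not Recoll's implementation: the `toy_stem` function, its suffix tables, and the sample documents are all hypothetical and exist only to show why storing unstemmed terms lets the stemming language change without reindexing.

```python
# Hypothetical toy stemmer: strips a few suffixes per language.
# A real system would use a proper stemming library instead.
SUFFIXES = {
    "english": ("ing", "ed", "es", "s"),
    "french": ("ons", "ez", "er", "s"),
}

def toy_stem(word, language="english"):
    for suffix in SUFFIXES[language]:
        # Keep at least three characters of the word after stripping.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def build_index(docs):
    """Index the terms exactly as written (no stemming at index time)."""
    index = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index

def search_with_stemming(index, query, language="english"):
    """Expand the query to every indexed term that shares its stem."""
    stem = toy_stem(query.lower(), language)
    hits = set()
    for term, doc_ids in index.items():
        if toy_stem(term, language) == stem:
            hits |= doc_ids
    return sorted(hits)

# Made-up sample documents.
docs = {
    1: "indexing large files",
    2: "the index was rebuilt",
    3: "indexed yesterday",
}
index = build_index(docs)
```

Because the index holds the original word forms, calling `search_with_stemming(index, "indexes")` and `search_with_stemming(index, "indexing", language="french")` run against the same index; only the expansion step changes.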

Download Website Updated 04 Apr 2014 Terrier

Pop 207.76
Vit 45.51

Terrier is software for the rapid development of Web, intranet, and desktop search engines. More generally, it is a modular platform for building large-scale information retrieval applications, providing indexing and probabilistic retrieval functionalities. It comes with a desktop search application.

Download Website Updated 14 Jan 2010 Doodle

Pop 166.73
Vit 6.28

Doodle is a desktop search engine for Linux. It searches your hard drive for files using pattern matching on meta-data. It extracts file-format specific meta-data using libextractor and builds a suffix tree to index the files. The index can then be searched rapidly. It is similar to locate, but can take advantage of information such as ID3 tags. It is possible to do full-text indexing using the appropriate libextractor plugins. It also supports using FAM to keep the database up-to-date.
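The indexing idea described above can be sketched in a few lines of Python. This is an illustration only, not Doodle's code: it uses a sorted list of suffixes (a simpler stand-in for a suffix tree), and the function names and sample metadata are hypothetical.

```python
import bisect

def build_suffix_index(metadata):
    """metadata maps a file path to a list of metadata strings.

    Every suffix of every value is stored with its file path, then sorted,
    so any substring of a value is a prefix of some stored suffix.
    """
    entries = []
    for path, values in metadata.items():
        for value in values:
            text = value.lower()
            for i in range(len(text)):
                entries.append((text[i:], path))
    entries.sort()
    return entries

def search_metadata(index, pattern):
    """Return the sorted paths whose metadata contains the pattern."""
    pattern = pattern.lower()
    # Binary search for the first suffix that is >= the pattern.
    start = bisect.bisect_left(index, (pattern,))
    matches = set()
    for suffix, path in index[start:]:
        if not suffix.startswith(pattern):
            break
        matches.add(path)
    return sorted(matches)

# Made-up metadata, such as might come from ID3 tags.
sample = {
    "/music/a.mp3": ["Miles Davis", "Kind of Blue"],
    "/music/b.mp3": ["John Coltrane", "Blue Train"],
}
suffix_index = build_suffix_index(sample)
```

The sorted list trades memory for simplicity: it stores every suffix explicitly, whereas a suffix tree shares common prefixes between them.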

No download No website Updated 21 Feb 2005 Zoe Intertwingle

Pop 164.75
Vit 7.12

Zoe is a Web-based email client with a built-in SMTP and POP3 server and Google-like search functionality that lives on your desktop. It is written in Java and uses Lucene technology to provide instant searching and threading of your email messages.

Download Website Updated 04 Jul 2011 Emdros

Pop 157.02
Vit 16.81

Emdros is a corpus query system for storing and searching linguistically annotated text. It is very generic, supporting almost any kind of annotation from almost any linguistic theory. All linguistic levels of analysis are supported, including phonology, morphology, the lexical level, syntax, and discourse. The core libraries act as a middleware layer between a client and an underlying SQL database. MySQL, PostgreSQL, and SQLite are supported.

Download Website Updated 18 Oct 2009 isbnsearch

Pop 116.85
Vit 2.98

isbnsearch provides a simple method for retrieving information about any book using only an ISBN or EAN barcode. It is intended to assist online libraries, user groups, and individual users, and is designed to provide a distributed ISBN database query system. Users can choose to view the summary information (author, title, publisher, date, edition, subject, ISBN) as HTML, XML, or a pre-formatted SQL statement.

Download Website Updated 12 Oct 2006 PDFBox

Pop 115.87
Vit 2.77

PDFBox is a Java library for manipulating PDF documents and extracting contents from existing PDF documents.

No download Website Updated 20 Dec 2005 DocMgr

Pop 106.32
Vit 2.03

DocMgr is a document management system that incorporates automatic indexing of uploaded files, automatic OCR and content indexing of pictures, group-level permissions, LDAP authentication, email notifications, WebDAV, and a discussion board for stored files. Beyond its stock indexing subsystem, DocMgr can also incorporate Tsearch2 (a full-text indexing add-on for PostgreSQL) for a responsive, full-text file indexing system.

No download Website Updated 09 Nov 2011 Pinot

Pop 93.87
Vit 11.53

Pinot is a D-Bus service that crawls and indexes your documents and monitors them for changes. It also includes a GTK-based user interface that enables you to query the index built by the service or your favorite Web engine, and to display and analyze the results. It makes full use of the advanced indexing and search facilities offered by Xapian, and features language detection, dynamic document summaries, easy labelling of documents, and internal support for common file types. The D-Bus interface allows easy integration with other applications.
