41 projects tagged "Indexing/Search"

Download Website Updated 04 Apr 2008 ApeSmit

Pop 40.64
Vit 1.00

ApeSmit is a very simple Python module to create XML sitemaps as defined at sitemaps.org. It doesn’t contain a Web spider or similar software; it just writes the data you provide to a file using the proper syntax.
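As a rough illustration of what "writing the data you provide using the proper syntax" involves, here is a minimal sketch that builds a sitemaps.org `urlset` document with Python's standard library. It does not use or reproduce ApeSmit's own API; the function name `build_sitemap` and the dictionary-based input format are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Return a sitemap XML string (hypothetical helper, not ApeSmit's API).

    entries: iterable of dicts with a required 'loc' key and optional
    'lastmod', 'changefreq', and 'priority' keys.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit child elements in the order the protocol lists them.
        for field in ("loc", "lastmod", "changefreq", "priority"):
            if field in entry:
                ET.SubElement(url, field).text = str(entry[field])
    return ET.tostring(urlset, encoding="unicode")

sitemap_xml = build_sitemap([
    {"loc": "https://example.com/", "changefreq": "daily", "priority": 1.0},
    {"loc": "https://example.com/about", "lastmod": "2008-04-04"},
])
```

No crawling happens here: as with the module described above, the caller supplies every URL and its metadata.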

Download No website Updated 29 Aug 2005 BTE

Pop 10.77
Vit 56.21

BTE (Body Text Extractor) is a Python module that extracts the main body of text from a Web page. Many Web articles consist of a main body which constitutes the relevant part of the particular page. Surrounding this body is irrelevant information such as copyright notices, advertising, links to sponsors, etc. BTE identifies and extracts the main body text of an article.
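One well-known heuristic for this kind of extraction treats the page as a sequence of tag tokens and word tokens, then picks the contiguous region that contains the most words and the fewest tags. The sketch below illustrates that idea only; whether BTE works exactly this way is an assumption, and the names `TOKEN_RE` and `extract_body` are hypothetical.

```python
import re

# A token is either a whole tag ("<...>") or a run of non-space text.
TOKEN_RE = re.compile(r"<[^>]*>|[^<\s]+")

def extract_body(html):
    """Return the words of the densest text region (illustrative heuristic)."""
    tokens = TOKEN_RE.findall(html)
    best_sum, best_start, best_end = 0, 0, 0   # empty region by default
    cur_sum, cur_start = 0, 0
    for idx, tok in enumerate(tokens):
        # Words score +1, tags score -1; we want the maximum-sum span.
        score = -1 if tok.startswith("<") else 1
        if cur_sum <= 0:
            # A non-positive running sum cannot help; restart the span here.
            cur_sum, cur_start = 0, idx
        cur_sum += score
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, idx + 1
    region = tokens[best_start:best_end]
    return " ".join(t for t in region if not t.startswith("<"))

page = (
    '<html><body><div><a href="/">Home</a><a href="/about">About</a></div>'
    "<p>This is the main article text with many words in it.</p>"
    '<div><a href="/c">Copyright</a></div></body></html>'
)
body_text = extract_body(page)
```

On this sample page the navigation links and the copyright link sit in tag-heavy regions, so they fall outside the selected span.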

Download Website Updated 29 Jun 2009 Chestnut FTP Search

Pop 29.46
Vit 3.51

Chestnut FTP Search is a Web application for searching files on FTP servers. Users can query files by part of the file name, the entire file name, a regular expression, or a shell pattern. File indexes are stored in PostgreSQL or MySQL.
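The four query styles listed above can be sketched with Python's standard library as follows. This is an illustration only: it matches against in-memory strings rather than a database index, and the function name `match_name` and the mode labels are assumptions, not Chestnut's actual interface.

```python
import fnmatch
import re

def match_name(name, query, mode):
    """Return True if a file name matches the query under the given mode."""
    if mode == "substring":      # part of the file name
        return query in name
    if mode == "exact":          # the entire file name
        return name == query
    if mode == "regex":          # regular expression, matched anywhere
        return re.search(query, name) is not None
    if mode == "shell":          # shell pattern such as *.tar.gz
        return fnmatch.fnmatchcase(name, query)
    raise ValueError("unknown mode: %r" % mode)

files = ["linux-2.6.tar.gz", "README", "photo.JPG"]
tarballs = [f for f in files if match_name(f, "*.tar.gz", "shell")]
```

`fnmatch.fnmatchcase` is used so that matching stays case-sensitive on every platform; a real service would more likely translate each mode into a query against its index.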

No download Website Updated 08 Apr 2004 DharmaDoc

Pop 28.25
Vit 1.00

DharmaDoc automates most of the tedious work involved in setting up a local Web server that contains a Buddhist reference library. The program allows you to download and install documents, and generates a search engine index. Afterward, you can just type your Buddhist topic of interest into a Web browser and get a wealth of information.

Download Website Updated 26 Sep 2003 Diamond Wiki

Pop 16.97
Vit 1.00

Diamond Wiki is an experimental wiki with an emphasis on metadata and faceted navigation.

Download Website Updated 14 Feb 2003 Digest

Pop 21.31
Vit 1.00

Digest generates HTML index pages and image previews for collections of images. It is fast and simple, and it creates HTML that is compact, quick-rendering, and does not rely on JavaScript or CSS.

Download Website Updated 06 Oct 2004 Dowser

Pop 48.54
Vit 2.24

Dowser is a Web research and archiving tool that clusters results from search engines, associates words that appear in previous searches, and keeps a local cache of all the results you click on in a searchable database along with summaries and links to related information. It helps you to keep track of what you find, with no advertising.

Download Website Updated 19 Oct 2003 Ferret CMS

Pop 23.24
Vit 1.42

Ferret CMS is a content management system based on Zope. It aims to be simple and intuitive for non-technical users and easy for administrators to install and maintain, while offering developers flexibility and extensibility. The goal is to get a Web site up and running within five minutes. It offers built-in tools such as a search engine and a workflow mechanism to facilitate content visualization, creation, and administration.

No download Website Updated 16 Jun 2011 Grub Next Generation

Pop 90.17
Vit 8.82

Grub Next Generation is a distributed client/server Web crawling system that helps to build and maintain indexes of the Web.

Download Website Updated 09 Sep 2005 HarvestMan

Pop 111.77
Vit 5.93

HarvestMan is a multithreaded offline browser. It has many features for customizing offline browsing, including URL filters, word filters, domain filters, URL priorities, depth-fetching, fetch levels, file limits, time limits, and support for the robot exclusion protocol. It is useful for downloading an entire Web site, or certain files from a Web site, to the hard disk for later offline browsing. It supports the HTTP/HTTPS and FTP protocols and can work through proxies.
