22 projects tagged "Indexing/Search"
Net::Z3950::SimpleServer is a Perl module which implements the server side of the Z39.50 (information retrieval) protocol. It hides the complexity of network exchanges, packet serialization, and session handling. You are required only to implement simple callbacks to support searching and record retrieval. It is the basis of the "Zoogle" project, which is a Z39.50 gateway to the Google web index.
PicBook automatically produces an HTML photo album from your scanned images or photographs. It comes with automatic image processing, a slideshow, transition effects, and other nifty features. It is easy to customise through its configuration file and HTML templates. PicBook is a Bourne shell script and should therefore run on any Unix or Linux system. It requires the standard grep and awk commands, which should be available on most systems. Additionally, it needs the "convert" and "identify" commands from the ImageMagick package to handle images.
Site Index is a simple script which generates HTML pages showing a site index for all of your local domains. It uses the natural hierarchy of your filesystem, so if you've organized your pages well, the job is fully automated. It breaks the index into multiple pages to respect a links-per-page limit, and can do some sorting/management of the results according to domain/page "importance".
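The links-per-page idea described above can be illustrated with a minimal Python sketch. This is not Site Index's own code: the function names (collect_pages, paginate, render_index) and the choice to treat .html/.htm files as pages are assumptions made for the example, and the "importance" sorting is left out.

```python
import html
import os
from typing import List


def collect_pages(root: str) -> List[str]:
    """Return sorted, root-relative paths of .html/.htm files under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith((".html", ".htm")):
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def paginate(links: List[str], limit: int) -> List[List[str]]:
    """Split links into consecutive chunks of at most `limit` entries."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [links[i:i + limit] for i in range(0, len(links), limit)]


def render_index(links: List[str], limit: int) -> List[str]:
    """Render one HTML list per index page (hypothetical, bare-bones layout)."""
    pages = []
    for chunk in paginate(links, limit):
        items = "\n".join(
            f'<li><a href="{html.escape(link)}">{html.escape(link)}</a></li>'
            for link in chunk
        )
        pages.append(f"<ul>\n{items}\n</ul>")
    return pages
```

Calling render_index(collect_pages("/var/www/example"), 50) would yield one HTML fragment per index page, each with at most 50 links; the directory path here is only a placeholder.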
WebSuck goes through a Web page, following links and making a list of the datafiles encountered along the way. It is useful for such tasks as downloading large image galleries without clicking all the links yourself. It can output a file list in a format appropriate for wget, and another for GetRight. It can be used either via a Swing GUI or in console mode.
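As a rough illustration of the approach (not WebSuck's actual implementation), the Python sketch below extracts links from a single page's HTML, keeps those that look like datafiles, and formats them one URL per line, which is the input format accepted by wget -i. The class and function names, the extension filter, and the restriction to one page with no link following are all assumptions of the example.

```python
from html.parser import HTMLParser
from typing import List, Tuple
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href> and <img src> attributes."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            wanted = (tag == "a" and name == "href") or (tag == "img" and name == "src")
            if wanted and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def data_files(html_text: str, base_url: str,
               extensions: Tuple[str, ...] = (".jpg", ".png", ".gif")) -> List[str]:
    """Return unique URLs whose path ends in one of the given extensions."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    seen = set()
    result = []
    for url in collector.links:
        path = urlparse(url).path.lower()
        if path.endswith(extensions) and url not in seen:
            seen.add(url)
            result.append(url)
    return result


def wget_list(urls: List[str]) -> str:
    """Format URLs one per line, suitable for saving and passing to `wget -i`."""
    return "".join(url + "\n" for url in urls)
```

Saving the output of wget_list to a file such as files.txt (a placeholder name) and running wget -i files.txt would then fetch every listed file.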
@1 Links Submission and Approval System I lets visitors submit free links to your site. Their links will only be added to the main datafile once you have viewed and approved them. A password-protected admin interface for searching, adding, marking, approving, holding, editing, and deleting is included. The user may also upload an optional site logo.
Java Search Engine is a server-side search engine program for Web sites, written completely in Java. It features HTML and PDF indexing, a built-in Web crawler, support for international encodings, word and phrase search, and results returned as quotations with highlighted words (like Google). It is available as an EJB, JSP, servlet, or Java API library. For non-Java environments, it is available as an XML server with XSLT support.
This is a tool to collect information from web servers and to spider web sites. It was written for the Open Source Security Testing Methodology (OSSTM), located at http://www.ideahamster.org/osstmm-description.htm. The spider is a multi-threaded, reusable module that can be used in other projects.