WebGlimpse is a scalable, feature-rich search engine for indexing your Web site or any collection of local and remote sites you choose. Features include customizable output formats, custom ranking and ordering of hits, fuzzy matching, Boolean queries, a Web administration interface for multiple archives, query logging, result caching, and more. Localized search interfaces are provided in multiple languages, including Spanish, German, French, Italian, Norwegian, Finnish, Russian, and Hebrew. It supports third-party filters for indexing PDF, Word, and Excel files. It is free for academic and most nonprofit users.
| Tags | Internet Web Indexing/Search Text Processing Indexing |
|---|---|
| Operating Systems | POSIX AIX BSD FreeBSD OpenBSD HP-UX IRIX Linux SCO Solaris |
| Implementation | C Perl |
Recent releases


Release Notes: Selective searching by directory was fixed. It had been broken for a few releases by a security enhancement.


Release Notes: The new HOOK feature for calling an external program for output customization needed some minor tweaks. A sample site that uses this feature for adding annotations to URLs in the search results has been made available. Users can view each other's notes as part of the search result page.


Release Notes: WebGlimpse now provides an optional hook for inserting data from an external module into the results output. The external subroutine can act on the URL of each hit, the original user query, and all other output variables configured on the system. This release also fixes the New Query box when using the simple search form and updates the install script to avoid compiler warnings for old code.


Release Notes: Improved handling of dynamic URLs, particularly those including queries in the path.


Release Notes: This release adds an option to output search results at URLs like http://yourserver.whatever/search/keyword instead of http://yourserver.whatever/cgi-bin/webglimpse.cgi?query=... This type of URL is more user-friendly, bookmarkable, and search engine friendly. It requires some Web server configuration.
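The mapping between the two URL styles described above can be sketched as follows. This is only an illustration in Python, not WebGlimpse's own code or its required server configuration: the function names `friendly_to_cgi` and `cgi_to_friendly` are hypothetical, and the sketch assumes `query` is the only parameter that needs to be carried over (the release note elides the rest of the query string).

```python
from urllib.parse import quote, unquote, urlencode

# Paths taken from the release note; everything else here is an assumption.
CGI_PATH = "/cgi-bin/webglimpse.cgi"
PREFIX = "/search/"


def friendly_to_cgi(path):
    """Map /search/<keyword> to the CGI URL; return None for other paths."""
    if not path.startswith(PREFIX):
        return None
    # Decode percent-escapes and drop any trailing slash.
    keyword = unquote(path[len(PREFIX):]).strip("/")
    if not keyword:
        return None
    # urlencode re-escapes the keyword for use in a query string.
    return CGI_PATH + "?" + urlencode({"query": keyword})


def cgi_to_friendly(keyword):
    """Build the friendly URL path for a keyword (hypothetical helper)."""
    return PREFIX + quote(keyword, safe="")


if __name__ == "__main__":
    print(friendly_to_cgi("/search/glimpse"))
    print(cgi_to_friendly("fuzzy match"))
```

In practice this translation would normally be done by a rewrite rule in the Web server rather than in application code; the sketch only shows what such a rule has to accomplish.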