Yioop! is a PHP search engine. It can be configured either as a general-purpose search engine for the whole Web or to provide search results for a set of URLs or domains. Yioop! can crawl pages or directly index archives such as ARC and WARC files. It supports indexing several file formats, including HTML, Atom, PDF, DOC, PPT, RTF, RSS, XML, SVG, PNG, JPG, BMP, and GIF, as well as sitemaps. The Yioop! crawler can be deployed on one or many machines. It supports one or more crawl scheduler processes, as well as multiple fetchers and mirrors. Crawling respects robots.txt, including Crawl-delay. Yioop! crawls are stored in a Web archive format that is easy to move around, so crawling can be done on one machine and the results deployed elsewhere. Yioop! supports mixing of crawls. It comes with a search front end that can be localized as desired using a GUI, which supports RTL languages. Crawls can also be managed from this GUI. Yioop! can be configured in a straightforward manner to make use of file caching or memcache, if available.
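To illustrate what respecting robots.txt and Crawl-delay means for a crawler, here is a minimal sketch in Python using the standard library's urllib.robotparser. It is not Yioop!'s code (Yioop! is written in PHP); the robots.txt content, the "ExampleBot" user agent, the example.com URLs, and the crawl_policy helper are all assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""


def crawl_policy(robots_txt, user_agent, url):
    """Return (allowed, delay) for a URL under the given robots.txt text.

    allowed: whether user_agent may fetch url.
    delay:   the Crawl-delay in seconds, or None if none is declared.
    """
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    allowed = parser.can_fetch(user_agent, url)
    delay = parser.crawl_delay(user_agent)
    return allowed, delay


if __name__ == "__main__":
    for url in ("https://example.com/public/page",
                "https://example.com/private/report"):
        allowed, delay = crawl_policy(ROBOTS_TXT, "ExampleBot", url)
        print(url, "allowed" if allowed else "blocked", "delay:", delay)
```

A polite fetcher would skip any URL reported as blocked and wait at least the returned delay between successive requests to the same host.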
XapianFu is a Ruby library for working with Xapian databases. It builds on the GPL-licensed Xapian Ruby bindings, but provides an interface more in line with "The Ruby Way"(tm) and is considerably easier to use. For example, you can work almost entirely with Hash objects, and XapianFu will handle converting the Hash keys into Xapian term prefixes when indexing and when parsing queries. It also handles storing and retrieving hash entries as Xapian::Document values. In effect, XapianFu gives you a persistent Hash with full-text indexing (and ACID transactions).
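The idea of turning hash keys into term prefixes can be sketched conceptually as follows. This is plain Python rather than Ruby, and it does not use XapianFu or Xapian at all; the field_prefix and terms_for helpers are hypothetical, and the "X" plus upper-cased field name scheme is an assumption modeled on Xapian's convention that user-defined prefixes begin with "X".

```python
def field_prefix(field):
    """Build a term prefix for a field name.

    Assumption for illustration: "X" followed by the upper-cased field name.
    """
    return "X" + str(field).upper()


def terms_for(doc):
    """Turn a dict of field -> value into a flat list of prefixed index terms.

    Each value is lower-cased and split on whitespace; every resulting word
    is tagged with the prefix of the field it came from.
    """
    terms = []
    for field, value in doc.items():
        prefix = field_prefix(field)
        for word in str(value).lower().split():
            terms.append(prefix + word)
    return terms


if __name__ == "__main__":
    doc = {"title": "Brokeback Mountain", "year": 2005}
    print(terms_for(doc))
```

Because each term carries its field's prefix, a query restricted to one field (for example, a title search) can be matched against only the terms indexed for that field.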
Arch is an extension of Apache Nutch (a popular, highly scalable general-purpose search engine) for intranet search. It includes blind test evaluation tools for comparing it with other search engines. Arch has many features critical for corporate environments, such as document-level security.
InstaSearch is an Eclipse IDE plug-in for performing quick and advanced searches of source code files. It uses the Apache Lucene library for indexing and fast searching of files in the workspace. The search is performed instantly as you type, and the resulting files are displayed in an Eclipse view. Each file can then be previewed using a few of its most closely matching and relevant lines. Double-clicking a match opens the file at the matching line.
FM SiteSearch Pro is a quick and simple solution for adding professional search capability to a Web site. It comes with a relevance engine, a control panel, large Web site support, optional MySQL support, search/keyword statistics, advanced searches, and specialized searches, and it is fully customizable. It also comes with a setup interface.