Xapian is a search engine library, scalable to collections containing hundreds of millions of documents. It's written in C++ with bindings for Perl, Python, PHP, Java, Tcl, C#, Ruby, and Lua. It is a highly adaptable toolkit that allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model and also a rich set of boolean query operators. Omega is a Web search application built upon the Xapian library. It can index a Web server's document tree (including HTML, PDF, OpenOffice, MS Word/Excel/PowerPoint/Works, WordPerfect, RTF, PS, etc.), or data exported from arbitrary sources (e.g. SQL databases).
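To give a rough feel for what boolean query operators do over an index, here is a minimal sketch in plain Python. It does not use Xapian or its bindings; the helper names (`build_index`, `query_and`, `query_or`, `query_and_not`) and the sample documents are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def query_and(index, *terms):
    """Documents containing every one of the given terms."""
    sets = [index.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def query_or(index, *terms):
    """Documents containing at least one of the given terms."""
    return set().union(*(index.get(term, set()) for term in terms))


def query_and_not(index, include, exclude):
    """Documents containing `include` but not `exclude`."""
    return index.get(include, set()) - index.get(exclude, set())


# Hypothetical sample collection, keyed by document id.
docs = {
    1: "fast search engine library",
    2: "web search application",
    3: "python movie database",
}
index = build_index(docs)

print(query_and(index, "search", "engine"))
print(query_or(index, "search", "python"))
print(query_and_not(index, "search", "web"))
```

A real engine adds stemming, positional data, and probabilistic ranking on top of this kind of term-to-document mapping; the sketch only covers the set logic behind AND, OR, and AND NOT.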
IMDbPY is a Python package for retrieving and managing data from the IMDb movie database about movies, people, characters, and companies. It can retrieve data from both the IMDb's Web server and a local copy of the whole database. Programmers can easily use the IMDbPY package to give their programs access to the IMDb's data. Some simple example scripts are included in the package.
HarvestMan is a multithreaded off-line browser. It has many features for customizing offline browsing, including URL filters, word filters, domain filters, URL priorities, depth-fetching, fetch levels, file limits, time limits, and robot exclusion protocols. It is useful for downloading an entire Web site, or certain files from a Web site, to the hard disk for later offline browsing. It supports the HTTP/HTTPS and FTP protocols and can work through proxies.
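The way domain filters, URL filters, and depth limits might combine in an offline browser can be sketched as a single decision function. This is an illustration only, not HarvestMan's code or configuration format; `CrawlRules` and `should_fetch` are hypothetical names, and the rule semantics are assumptions.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from urllib.parse import urlparse


@dataclass
class CrawlRules:
    """Hypothetical rule set; field names are illustrative only."""
    allowed_domains: set = field(default_factory=set)      # empty set = any domain
    blocked_patterns: list = field(default_factory=list)   # shell-style URL globs
    max_depth: int = 2                                     # link hops from the start page


def should_fetch(url, depth, rules):
    """Return True if the URL passes the scheme, depth, domain, and URL filters."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https", "ftp"):
        return False
    if depth > rules.max_depth:
        return False
    if rules.allowed_domains and parts.hostname not in rules.allowed_domains:
        return False
    # Reject the URL if any blocked glob pattern matches it.
    return not any(fnmatchcase(url, pattern) for pattern in rules.blocked_patterns)


rules = CrawlRules(
    allowed_domains={"example.com"},
    blocked_patterns=["*.zip"],
    max_depth=2,
)
print(should_fetch("http://example.com/index.html", 1, rules))
print(should_fetch("http://example.com/big.zip", 1, rules))
```

A crawler would call a check like this for every discovered link before queuing it, which keeps the filtering policy separate from the download logic.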
pacparser is a library to parse proxy auto-config (PAC) files. PAC files are a widely used proxy configuration method. Web browsers can use a PAC file to determine which proxy server to use, or whether to connect directly, for a given URL. The idea behind pacparser is to make it easy to add PAC file parsing capability to any program (C and Python are supported right now). It comes as a shared C library and a Python module that can be used to make any C or Python program PAC-aware. Good candidates for integration include popular Web software such as wget, curl, and python-urllib.
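A PAC file is JavaScript that defines a `FindProxyForURL(url, host)` function returning a string such as `DIRECT` or `PROXY host:port`. The sketch below mimics that decision in plain Python to show what a PAC-aware program has to evaluate. It does not use pacparser, and the rules, host names, and the `find_proxy_for_url` and `resolve` helpers are hypothetical.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def find_proxy_for_url(url, host):
    """Python stand-in for a PAC file's FindProxyForURL; the rules are made up."""
    # Plain host names and an assumed internal domain bypass the proxy.
    if "." not in host or fnmatchcase(host, "*.internal.example"):
        return "DIRECT"
    # Assumed dedicated proxy for FTP traffic.
    if url.startswith("ftp:"):
        return "PROXY ftp-proxy.example:3128"
    # Everything else: try the proxy first, then fall back to a direct connection.
    return "PROXY proxy.example:8080; DIRECT"


def resolve(url):
    """Extract the host from the URL and ask the PAC-style function for a route."""
    host = urlparse(url).hostname or ""
    return find_proxy_for_url(url, host)


print(resolve("http://intranet/"))
print(resolve("ftp://files.example.org/pub/file.txt"))
print(resolve("https://www.example.org/"))
```

Because real PAC files are arbitrary JavaScript, a program cannot hard-code rules like this; it needs an embedded evaluator, which is the gap a PAC parsing library is meant to fill.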
Flightdeck-UI is a project that applies ideas from the design of aircraft controls and instruments to general-purpose user interfaces. The project includes Flightdeck-UI Online (a Web-based monitoring system that works entirely through the browser), the Multi-Variable Monitor (MVM) application, and a Tkinter widget library. Flightdeck-UI Online is installed on a Web server. The MVM application provides a graphical editor with theme support for quickly creating Flightdeck-UI control panels. Writing code is possible but not required in order to use MVM.