ImdbPHP provides an API to the movie information stored at the IMDb.com site. As its name suggests, it is primarily aimed at PHP programmers who want to extend their programs or sites with this movie information. The classes must be used in accordance with IMDb's copyright and conditions of use.
LogicalDOC is a Web-based document management system that is easy to use and learn. Its architecture builds on established Java technology to achieve a powerful and flexible solution. It offers a search engine (Lucene), a Web service interface (JAX-WS via CXF) compatible with .NET and PHP, versioning, annotations on documents, a WebDAV interface, and import from and export to .zip files. Documents can be organized into hierarchical folders, searched using the integrated search engine, or browsed by tag. The system is extensible thanks to the technologies used (Spring and Hibernate) and its plugin architecture.
eXist is a native XML database featuring efficient, index-based XPath query processing, extensions for keyword search, XUpdate support, and tight integration with existing XML development tools. The database is lightweight and may be easily deployed in a number of ways, running either as a stand-alone server process, inside a servlet engine, or directly embedded into an application.
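For readers unfamiliar with XPath, the kind of query such a database answers can be sketched with Python's standard library. This is only an illustration of an XPath-style selection over a small in-memory document; it does not use eXist or its API, and the sample document and the `titles_for_year` helper are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from eXist.
SAMPLE = """
<library>
  <book year="2001"><title>First</title></book>
  <book year="2005"><title>Second</title></book>
  <book year="2001"><title>Third</title></book>
</library>
"""

def titles_for_year(xml_text, year):
    """Select book titles for one year using an XPath attribute predicate."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a limited XPath subset, including [@attr='value'].
    path = "./book[@year='{}']/title".format(year)
    return [node.text for node in root.findall(path)]

if __name__ == "__main__":
    print(titles_for_year(SAMPLE, "2001"))
```

A native XML database evaluates queries like this against stored collections and can use indexes instead of scanning every document.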
URD is a Web-based Usenet binary download manager. It stores the newsgroup information in a MySQL database and aggregates the articles into sets, each corresponding to a single download (e.g. one album or movie). The Web interface can be used to search with regular expressions. It uses its own downloading daemon, which supports scheduling downloads and updating databases. URD can also download directly from NZB files and even create NZB files. Further features include custom scripts, multiple languages, a template-based Web interface, support for multiple servers, automatic par2 and unrar support, and an intuitive user interface.
YaCy is a search engine that anyone can use to index the Internet (WWW and FTP) or to create a search portal for others (Internet or intranet). The scale of YaCy is limited only by the number of users, and it can index billions of web pages. In P2P mode it is fully decentralized. All users of the search engine network are equal, and it is not possible for anyone to censor the content of the distributed index.
urlwatch is a script intended to help you watch URLs and get notified (via email) of any changes. The change notification includes the URL that has changed and a unified diff of what has changed. The script works out of a single directory, so there is no need to install anything. State files are kept in the same folder. The script supports stripping parts of a page that are always changing through the use of a filter hook function. It is typically run as a cron job.
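The idea of diffing a page against its stored state, after a filter hook has removed volatile parts, can be sketched in a few lines of Python. This is a minimal illustration and not urlwatch's actual code: the names `strip_volatile` and `diff_for_url` and the "Generated at" marker are assumptions made for the example.

```python
import difflib

def strip_volatile(text):
    """Hypothetical filter hook: drop lines that change on every fetch."""
    return "\n".join(
        line for line in text.splitlines()
        if not line.startswith("Generated at")
    )

def diff_for_url(url, old_text, new_text, filter_hook=strip_volatile):
    """Return a unified diff of the filtered contents, or '' if unchanged."""
    old_lines = filter_hook(old_text).splitlines()
    new_lines = filter_hook(new_text).splitlines()
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile=url + " (previous)",
        tofile=url + " (current)",
        lineterm="",
    )
    return "\n".join(diff)

if __name__ == "__main__":
    old = "Title\nGenerated at 10:00\nprice: 5"
    new = "Title\nGenerated at 10:05\nprice: 6"
    print(diff_for_url("http://example.com/", old, new))
```

In this sketch an empty result means nothing relevant changed, so a caller would send a notification only when the returned diff is non-empty.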
PDFTextStream is a PDF text and metadata extraction library available for Java and .NET. It supports all versions of the PDF document specification (including v1.7, used by Acrobat 8, 9, and X), extraction of text encoded using double-byte character sets (including Chinese, Japanese, and Korean), decryption of documents encrypted using 40-bit, 128-bit, 256-bit, and variable bit length ciphers, and extraction of all document metadata provided by PDF documents (including form data, bookmarks, and annotations). Easy integration with Jakarta Lucene is included, as well as interactive form update capability.