Xapian is a search engine library, scalable to collections containing hundreds of millions of documents. It is written in C++ with bindings for Perl, Python, PHP, Java, Tcl, C#, Ruby, and Lua. It is a highly adaptable toolkit that allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model as well as a rich set of Boolean query operators. Omega is a Web search application built upon the Xapian library. It can index a Web server's document tree (including HTML, PDF, OpenOffice, MS Word/Excel/PowerPoint/Works, WordPerfect, RTF, PS, etc.), or data exported from arbitrary sources (e.g. SQL databases).
GPP is a general-purpose preprocessor with customizable syntax, suitable for a wide range of preprocessing tasks. Its independence from any programming language makes it much more versatile than cpp, while its syntax is lighter and more flexible than that of m4. The syntax is fully customizable, which makes it possible to process text files, HTML, or source code equally efficiently in a variety of languages.
Spring is a lightweight Java/J2EE application framework based on code published in "Expert One-on-One J2EE Design and Development" by Rod Johnson. It includes powerful JavaBeans-based configuration management applying Inversion-of-Control principles, a generic abstraction layer for transaction management allowing for pluggable transaction managers, a JDBC abstraction layer, integration with Hibernate, JDO, Apache OJB, and iBATIS SQL Maps, AOP functionality, and a flexible MVC Web application framework with multiple view technologies. There is also a .NET port available.
screen-scraper is a tool for extracting data from Web sites. It works much like a database that provides access to information on the Web. It provides a graphical interface that lets you designate URLs, data elements to be extracted, and scripting logic to traverse pages and work with scraped data. Once these items have been created, screen-scraper can be invoked from external languages such as .NET, Java, PHP, and Active Server Pages. It can be scheduled to scrape information at periodic intervals, and can automatically write extracted data to CSV files.
Ice is a modern alternative to object middleware such as CORBA or COM/DCOM/COM+. It is easy to learn, yet provides a powerful network infrastructure for demanding technical applications. It features an object-oriented specification language, easy-to-use C++ and Java mappings, a highly efficient protocol (including protocol compression), asynchronous method invocation and dispatch, dynamic transport plug-ins, TCP/IP and UDP/IP support, SSL-based security, a firewall solution, and much more.
DotNetWikiBot Framework is a full-featured client API with a console interface that allows you to easily build programs and Web robots that manage information on MediaWiki-powered sites. It is intended to help with the many complicated and routine tasks of wiki site development and maintenance. Any .NET language can be used to access DotNetWikiBot library functions. Only minimal programming skills are required to make bots with DotNetWikiBot Framework.
SOHT (Socket over HTTP Tunneling) allows you to tunnel socket connections through an HTTP proxy. Restrictive firewalls often prohibit all outgoing traffic except for HTTP; SOHT works around this by carrying socket connections over the HTTP protocol. It consists of a server that serves as a proxy and a client that tunnels a socket connection over an HTTP connection to the server. The current server is written in Java, and there are clients in Java and .NET.
mojoPortal is a cross-platform, object-oriented Web site framework. It supports PostgreSQL, MySQL, Firebird, SQLite, and MS SQL for the backend. It includes a content management system, forums, blogs, photo galleries, newsletters, polls, surveys, an event calendar, an RSS feed aggregator, and a skinnable design.