Ebot is a scalable, distributed Web crawler. It saves URLs to a NoSQL database that supports map/reduce queries, which you can query via RESTful HTTP requests or from your preferred programming language. URLs that still need to be analyzed are sent to AMQP queues, so you can run several crawlers in parallel and stop and restart them without losing URLs.
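The queue-based design is what lets crawlers stop and restart safely. The sketch below is not Ebot's code; it is a minimal in-memory stand-in for a broker queue with explicit acknowledgements, and the names AckQueue and crawl are hypothetical. It only illustrates the idea that a URL handed to a worker goes back on the queue unless the worker acknowledges it.

```python
from collections import deque


class AckQueue:
    """In-memory stand-in for a broker queue with explicit acknowledgements."""

    def __init__(self, urls):
        self._ready = deque(urls)   # URLs waiting for a worker
        self._unacked = {}          # delivery tag -> URL handed out, not yet confirmed
        self._next_tag = 0

    def get(self):
        """Hand out the next URL, or None if nothing is ready."""
        if not self._ready:
            return None
        tag = self._next_tag
        self._next_tag += 1
        url = self._ready.popleft()
        self._unacked[tag] = url
        return tag, url

    def ack(self, tag):
        """Confirm that the URL delivered under this tag was fully processed."""
        del self._unacked[tag]

    def requeue_unacked(self):
        """Put unconfirmed URLs back, as happens when a worker goes away."""
        while self._unacked:
            _, url = self._unacked.popitem()
            self._ready.appendleft(url)

    def pending(self):
        return len(self._ready) + len(self._unacked)


def crawl(queue, limit):
    """Process up to `limit` URLs; a real crawler would fetch and analyze each one."""
    done = []
    for _ in range(limit):
        item = queue.get()
        if item is None:
            break
        tag, url = item
        done.append(url)
        queue.ack(tag)
    return done
```

A worker that takes a URL and stops before acknowledging it loses nothing: after requeue_unacked() the URL is available to the next worker, and several crawl() loops can share one queue.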
couchCurl is a simple static PHP class that generates curl commands for working with CouchDB databases. It aims for quick and easy access to local CouchDB databases without authentication and with little PHP processing overhead. It assumes that your PHP installation has exec() enabled and that the user can run curl. It supports most of the API (PUT, POST, GET), adds a few helpers that make views easier to work with, and includes a function for compressing or building custom _ids.
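To make the approach concrete, here is a minimal Python sketch of generating curl commands for CouchDB's HTTP API. It is not couchCurl's PHP interface: the function names (curl_command, put_doc, get_view, as_shell) and the example database are hypothetical, and it assumes an unauthenticated CouchDB on the default local port 5984.

```python
import json
import shlex
from urllib.parse import quote, urlencode

# Assumption: local CouchDB on its default port, no authentication.
BASE = "http://127.0.0.1:5984"


def curl_command(method, path, body=None):
    """Build the argument list for one curl call against CouchDB."""
    args = ["curl", "-s", "-X", method]
    if body is not None:
        args += ["-H", "Content-Type: application/json", "-d", json.dumps(body)]
    args.append(BASE + path)
    return args


def put_doc(db, doc_id, doc):
    """PUT /{db}/{docid} creates or updates a document under a chosen _id."""
    return curl_command("PUT", f"/{quote(db, safe='')}/{quote(doc_id, safe='')}", doc)


def get_view(db, design, view, **params):
    """GET /{db}/_design/{ddoc}/_view/{view}; parameter values are JSON-encoded."""
    query = urlencode({k: json.dumps(v) for k, v in params.items()})
    path = f"/{db}/_design/{design}/_view/{view}"
    return curl_command("GET", path + ("?" + query if query else ""))


def as_shell(args):
    """Render an argument list as one safely quoted shell command line."""
    return shlex.join(args)
```

The argument lists can be passed directly to subprocess.run, which avoids shell quoting problems altogether; as_shell is only needed when the command has to be a single string, as it would be for an exec()-style call.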