With LinkChecker, you can check HTML documents and Web sites for broken links. It features recursion, robots.txt exclusion protocol support, HTTP proxy support, i18n support, multithreading, regular expression filtering rules for links, and user/password checking for authorized pages. Output can be colored or normal text, HTML, SQL, CSV, or a sitemap graph in DOT, GML, or XML format. Supported link types are HTTP/1.1 and 1.0, HTTPS, FTP, mailto:, news:, nntp:, Telnet, and local files.
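As a rough illustration of the link-gathering and regular-expression filtering described above, the following is a minimal sketch using only the Python standard library. It is not LinkChecker's own code or API; the names `LinkExtractor`, `links_to_check`, and `ignore_patterns` are hypothetical, and the actual checking of each URL is left out.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from href/src attributes (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative references against the page URL.
                self.links.append(urljoin(self.base_url, value))


def links_to_check(html, base_url, ignore_patterns=()):
    """Return links found in html, minus those matching any ignore regex."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    ignored = [re.compile(p) for p in ignore_patterns]
    return [u for u in parser.links if not any(r.search(u) for r in ignored)]


page = (
    '<a href="/docs/">Docs</a> '
    '<img src="logo.png"> '
    '<a href="mailto:me@example.org">Mail</a>'
)
found = links_to_check(
    page, "http://example.org/index.html", ignore_patterns=[r"^mailto:"]
)
print(found)
```

A recursive checker would feed each surviving HTTP link back into the same function, keeping a set of visited URLs to avoid loops.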
Dosage is designed to keep a local copy of specific Web comics and other picture-based content such as Picture of the Day sites. With the dosage command-line script, you can get the latest strip of a Web comic, catch up to the last strip downloaded, or download a strip for a particular date/index (if the comic's site layout makes this possible).
gPodder is a podcast receiver/catcher written in Python and pyGTK. It manages podcast feeds for you and automatically downloads all podcasts from as many feeds as you like. If you are interested in podcast feeds, simply put the feed URLs into gPodder and it will download all episodes for you automatically. If there is a new episode, it will get it for you. It can resume interrupted downloads if the server supports it.
web2ldap is a full-featured Web-based LDAPv3 client written in Python. It is designed to run either stand-alone with its built-in Web server or under the control of another Web server with FastCGI support (e.g., Apache with mod_fastcgi). It has support for various LDAPv3 bind methods and a powerful built-in schema browser. HTML templates are supported for displaying and editing entries, and LDIF templates can be used for quickly adding new entries. A built-in X.509 parser displays a detailed view of certificates and CRLs with active links.
Performance Co-Pilot (PCP) is a framework and set of services for supporting system-level performance monitoring and performance management. It provides a unifying abstraction for all of the interesting performance data in a system, and allows client applications to easily retrieve and process any subset of that data using a single API. A client-server architecture allows multiple clients to monitor the same host, and a single client to monitor multiple hosts. Archive logging and replay are integrated so that a client application can use the same API to process real-time data from a host or historical data from an archive.
ViewVC (formerly known as ViewCVS) is a Python/CGI-based system for viewing and interacting with Subversion and CVS repositories through your Web browser. It can browse directories, view changelogs, generate diffs, view arbitrary revisions, and display annotated ("blame") listings. It has full support for tags and branches, and contains a database-backed query system like Bonsai. It was initially based on the cvsweb work by Henner Zeller, but has been ported to Python and dramatically enhanced.
rawdog is a feed aggregator capable of producing a personal "river of news" or a public "planet" page. It supports all common feed formats, including all versions of RSS and Atom. By default, it is run from cron, collects articles from a number of feeds, and generates a static HTML page listing the newest articles in date order. It supports per-feed customizable update times, and uses ETags, Last-Modified, gzip compression, and RFC3229+feed to minimize network bandwidth usage. Its behavior is highly customizable using plugins written in Python.
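The bandwidth-saving headers mentioned above can be sketched as a conditional GET request. This is an illustrative sketch with the standard library, not rawdog's implementation; the function name `build_conditional_request` and the example feed URL and validator values are assumptions. A server that honors these headers may answer 304 Not Modified (nothing changed) or, for RFC 3229 with the `feed` instance manipulation, 226 IM Used with only the new entries.

```python
import urllib.request


def build_conditional_request(url, etag=None, last_modified=None):
    """Build a GET request that lets the server skip unchanged content.

    etag and last_modified are the validator values saved from the
    previous response for this feed (hypothetical storage, not shown).
    """
    request = urllib.request.Request(url)
    if etag:
        request.add_header("If-None-Match", etag)
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    # Ask for a gzip-compressed body and for RFC 3229 feed deltas.
    request.add_header("Accept-Encoding", "gzip")
    request.add_header("A-IM", "feed")
    return request


req = build_conditional_request(
    "http://example.org/feed.xml",  # placeholder URL
    etag='"abc123"',
    last_modified="Sat, 01 Jan 2022 00:00:00 GMT",
)
for name, value in sorted(req.header_items()):
    print(name, value, sep=": ")
```

The request is only constructed here, not sent. A caller would pass it to `urllib.request.urlopen` and treat an HTTP 304 response as "no new articles".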
QuizComposer is a system for quiz composition/presentation/response-evaluation on the Web in any language. It features many response types to questions (checkbox clicks, number intervals, character patterns/regular expressions, ordered and unordered sets, and subsets), re-presentation of incorrectly answered questions with/without hints, test quizzes for limited groups, and packaging of quizzes and sets of quizzes for transportation and exchange.
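To make the response types listed above more concrete, here is a small sketch of how such answers might be evaluated. It does not reflect QuizComposer's actual code; every function name is hypothetical, and the exact matching rules (inclusive interval bounds, full-string regex match, non-empty subsets) are assumptions chosen for the illustration.

```python
import re


def check_interval(answer, low, high):
    """Number interval: correct if low <= answer <= high (bounds inclusive)."""
    return low <= float(answer) <= high


def check_pattern(answer, pattern):
    """Character pattern: the whole trimmed answer must match the regex."""
    return re.fullmatch(pattern, answer.strip()) is not None


def check_ordered(answer, expected):
    """Ordered set: same items in the same order."""
    return list(answer) == list(expected)


def check_unordered(answer, expected):
    """Unordered set: same items, order ignored."""
    return set(answer) == set(expected)


def check_subset(answer, allowed):
    """Subset: at least one item, and every item is among the allowed ones."""
    return bool(answer) and set(answer) <= set(allowed)


results = {
    "interval": check_interval("3.14", 3.1, 3.2),
    "pattern": check_pattern(" Paris ", r"[Pp]aris"),
    "ordered": check_ordered(["a", "b", "c"], ["a", "b", "c"]),
    "unordered": check_unordered(["c", "a", "b"], ["a", "b", "c"]),
    "subset": check_subset(["a"], ["a", "b", "c"]),
}
print(results)
```

Keeping each response type as a separate predicate makes it easy to re-present only the questions whose check returned `False`.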
NetCrawler is the frontend to a Web crawling system. This command-line application will download all of the pages within a domain, and then parse and process all of the relevant content (images, text, audio, video), saving this content within an XML document for later processing. It is definitely alpha quality, but has been used quite extensively.