BTE (Body Text Extractor) is a Python module that extracts the main body of text from a Web page. Many Web articles consist of a main body which constitutes the relevant part of the particular page. Surrounding this body is irrelevant information such as copyright notices, advertising, links to sponsors, etc. BTE identifies and extracts the main body text of an article.
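The description above does not say how BTE locates the main body, so the following is only a minimal sketch of one simple heuristic for the same task, not BTE's own code or API. It assumes the body is the stretch of the page where text tokens most outnumber markup tags; the names `TOKEN` and `extract_body` are hypothetical.

```python
import re

# A token is either a whole tag ("<...>") or a whitespace-delimited word.
TOKEN = re.compile(r"<[^>]*>|[^<\s]+")


def extract_body(html):
    """Return the words in the token span where text most outnumbers tags.

    Assumption (not taken from BTE): score each word +1 and each tag -1,
    then pick the contiguous span with the highest total score.
    """
    tokens = TOKEN.findall(html)
    is_tag = [t.startswith("<") for t in tokens]

    best_score, best_span = 0, (0, 0)
    score, start = 0, 0
    for i, tag in enumerate(is_tag):
        if score <= 0:
            # A non-positive running score cannot help; restart the span here.
            score, start = 0, i
        score += -1 if tag else 1
        if score > best_score:
            best_score, best_span = score, (start, i + 1)

    lo, hi = best_span
    return " ".join(t for t, tag in zip(tokens[lo:hi], is_tag[lo:hi]) if not tag)
```

On a page whose article paragraph is surrounded by link-heavy navigation and a copyright footer, the span chosen this way tends to cover the paragraph and skip the surrounding markup. Real pages with scripts, comments, or inline styling would need more careful tokenising.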
urlwatch is a script intended to help you watch URLs and get notified (via email) of any changes. The change notification will include the URL that has changed and a unified diff of what has changed. The script works out of a single directory, so there is no need to install anything. State files are kept in the same folder. The script supports stripping parts of a page that are always changing through the use of a filter hook function. It is typically run as a cronjob.
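The combination of a filter hook and a unified diff can be sketched as follows. This is an illustration under stated assumptions, not urlwatch's actual code: the function names `strip_volatile` and `describe_change`, the hook signature, and the "Generated at" marker are all hypothetical. Only `difflib.unified_diff` from the Python standard library is a real API.

```python
import difflib


def strip_volatile(url, text):
    """Hypothetical filter hook: drop lines that change on every fetch."""
    return "\n".join(
        line for line in text.splitlines() if "Generated at" not in line
    )


def describe_change(url, old_text, new_text, filter_hook=strip_volatile):
    """Return a unified diff of the filtered texts, or None if unchanged."""
    old_lines = filter_hook(url, old_text).splitlines()
    new_lines = filter_hook(url, new_text).splitlines()
    if old_lines == new_lines:
        return None  # nothing worth notifying about
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile=url + " (old)",
        tofile=url + " (new)",
        lineterm="",
    )
    return "\n".join(diff)
```

A cron-driven script along these lines would fetch each URL, compare it with the copy saved in its state file, and send the returned diff by email only when the result is not `None`.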
PHP Content Management System (phpCMS) lets you use a single template for your whole Web site. It allows you to provide dynamic menus with unlimited levels, and use templates and sub-templates without a database. It is search engine-friendly and proxy-friendly, as the pages it generates cannot be distinguished from static HTML pages. With an optional module, PHP code can be added to any template or content file. It supports the caching of parsed pages and gzip compression.
MyHeadlines is a module that adds syndicated headline functionality to any PHP and MySQL-based Web site. Your users may subscribe to multiple RSS feeds from a fully categorized database of over 1,000 sources. It was previously a PHPNuke/PostNuke add-on, but can now be integrated with any Web site.
Historical Event Markup and Linking Project (Heml) provides an XML schema for historical events and a Java Web app which transforms conforming documents into hyperlinked timelines, maps, and tables. It aims to provide an information-rich interchange format for historical data, and thus add a historical component to the growing movement for a 'Semantic Web.'
POPsearch is a desktop search engine that is designed to help you easily find information on your computer. With features that other search engines don't have, it lets you index your entire collection of email messages and files. As information is indexed, it is immediately available for analysis from any Web browser. When POPsearch is configured correctly, you can also access your data remotely with RSS feeds, email feeds, or from any computer that has a Web browser.
phpFaber CMS is a Web-based content management system that fully supports MySQL. phpFaber CMS separates content from code and allows your organization to focus on content. The code, layout, and graphics are consistent through every single page of your site. phpFaber CMS (bundle edition) includes the CMS engine and modules for articles, banner ads, bookmarks, downloads, FAQs, feedback, GeoIP, Google sitemaps, links, news, newsletters, polls, recommendations, searching, and user management.