Vilistextum is a small and fast HTML to text converter. It is quite fault-tolerant and deals well with badly-formed or otherwise quirky HTML. It has full support for different character sets (e.g. Unicode). It is able to optimize for ebook reading, collapse multiple blank lines, and create footnotes out of links. A GUI frontend using kaptain is included.
gjots lets you organize text notes in a convenient, hierarchical way. It can be used for notes, jottings, bits and pieces, recipes, and even PINs and passwords, using encryption. It can also be used to "mind-map" larger compositions like manuals, Web pages, articles, etc. It is a bit like the KDE program "kjots", but uses the GTK library and supports a hierarchy of folders. Files can be output to HTML with an automatic table of contents or to docbook XML. Encryption is supported with ccrypt(1), gpg(1), and openssl(1), so that musings can be kept private.
Xidel is a command line tool to download Web pages and extract data from them. It can download files over HTTP/S connections, follow redirections, links, or extracted values, and process local files. The data can be extracted using XPath 2.0, XQuery 1.0, and JSONiq expressions, CSS 3 selectors, and custom, pattern-matching templates that are like an annotated version of the processed page. The extracted values can then be exported as plain text/XML/HTML/JSON, or assigned to variables to be used in other extract expressions or be exported to the shell. There is also an online CGI service for testing.
htmLawed is a PHP script that makes input text more secure, HTML standards-compliant, and suitable in general from the viewpoint of a Web-page administrator, for use in the body of HTML 4 or XHTML 1 or 1.1 documents. It is a customizable HTML/XHTML filter, processor, purifier, and sanitizer. It can ensure that HTML tags are balanced and properly nested tags, neutralize code that may be used for cross-site scripting (XSS) attacks, and limit the allowed HTML elements, tags, attributes, or URL protocols.
TWiki is a flexible, powerful, and simple Web based collaboration platform. It is suitable for dynamic intranets and knowledge bases, and for sharing and managing documents and collaborative projects. It resembles a normal Web site, but every page can be changed from a browser. It features automatic link generation, full text search, group authorization, Web forms, reporting, change notification, file attachments, revision control of pages and attachments, a modular templating system with skins, hierarchical navigation based on the topic parenting feature, and more. Plugins can be used to enhance the program and build groupware applications.
dlg2html is a set of Bash shell scripts which help automate the conversion of DLG Pro message bases to HTML for archiving or mirroring on the Web. (DLG Pro is a Bulletin Board System for Amiga personal computers.) The HTML message files contain the appropriate links to the next and previous messages, as well as to any replies or referenced messages. A message index is also produced. dlg2html works with terminal dumps of DLG message boards, which means that you do not need access to the actual BBS files to perform the conversion.