Template Data Interface (TDI, /ʹtedɪ/) is a markup templating system written in Python, with optional but recommended speedup code written in C. Unlike most templating systems, TDI does not invent its own language to provide functionality. Instead, you simply mark the nodes you want to manipulate within the template document. The template is parsed, and the marked nodes are presented to your Python code, where they can be modified in any way you want.
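The idea of marking nodes and handing them to Python code can be sketched with the standard library alone. This is not TDI's own API: the "mark" attribute, the handle_ method prefix, and the render function are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# A template is an ordinary document; nodes of interest carry a marker attribute.
TEMPLATE = """<html><body>
<h1 mark="title">placeholder</h1>
<p mark="intro">placeholder</p>
<p>static text</p>
</body></html>"""


class Model:
    """Hypothetical model: one handle_<name> method per marked node."""

    def handle_title(self, node):
        node.text = "Hello"

    def handle_intro(self, node):
        node.text = "Rendered from Python."
        node.set("class", "lead")


def render(template, model, attr="mark"):
    """Parse the template and pass each marked node to the matching method."""
    root = ET.fromstring(template)
    for node in root.iter():
        # Remove the marker so it does not leak into the output.
        name = node.attrib.pop(attr, None)
        if name is None:
            continue
        handler = getattr(model, "handle_" + name, None)
        if handler is not None:
            handler(node)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render(TEMPLATE, Model()))
```

The template itself stays free of logic; everything dynamic lives in the model class.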
libunibreak is an implementation of the line breaking and word breaking algorithms as described in Unicode Standard Annex 14 and Unicode Standard Annex 29. It is a superset of, and supersedes, liblinebreak. It is designed to be used in a generic text renderer. FBReader is one real-world example.
UverseWiki is a modular open source PHP framework designed for text processing. Unlike most existing solutions, it is not based on regular expressions but instead uses a recursive descent parser to build a document object model (DOM). Once parsing has finished and the DOM has been built, the original source is discarded and all further operations are performed on the document tree: nodes can be altered, serialized, or rendered into a particular format (such as HTML or RTF). The wiki syntax is language-neutral and the processing itself is carried out in UTF-8.
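The parse-then-operate-on-the-tree approach can be illustrated with a very small recursive descent parser. The two-marker syntax below (*bold* and _italic_) is a hypothetical stand-in, not UverseWiki's actual wiki syntax, and the Node, Parser, and render_html names are assumptions made for this sketch.

```python
import html
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                      # "doc", "bold", "italic" or "text"
    text: str = ""
    children: list = field(default_factory=list)


MARKERS = {"*": "bold", "_": "italic"}   # hypothetical mini-syntax
HTML_TAGS = {"bold": "b", "italic": "i"}


class Parser:
    def __init__(self, source):
        self.src = source
        self.pos = 0

    def parse(self):
        return Node("doc", children=self.parse_inline(stop=None))

    def parse_inline(self, stop):
        """Read nodes until the closing marker `stop` or the end of input."""
        nodes, buf = [], []
        while self.pos < len(self.src):
            ch = self.src[self.pos]
            if ch == stop:
                break
            if ch in MARKERS:
                if buf:
                    nodes.append(Node("text", "".join(buf)))
                    buf = []
                self.pos += 1                      # consume the opening marker
                children = self.parse_inline(stop=ch)
                self.pos += 1                      # consume the closing marker
                nodes.append(Node(MARKERS[ch], children=children))
            else:
                buf.append(ch)
                self.pos += 1
        if buf:
            nodes.append(Node("text", "".join(buf)))
        return nodes


def render_html(node):
    """Render the tree; the source string is no longer needed at this point."""
    if node.kind == "text":
        return html.escape(node.text)
    inner = "".join(render_html(child) for child in node.children)
    tag = HTML_TAGS.get(node.kind)
    return f"<{tag}>{inner}</{tag}>" if tag else inner


if __name__ == "__main__":
    doc = Parser("plain *bold _both_* end").parse()
    print(render_html(doc))
```

Because rendering works only on the tree, a second renderer for another output format could be added without touching the parser.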
Winnow efficiently trains and operates any number of unique Bayesian (Naive Bayes) classifiers on large sets of content. It has very high performance and works with very small and unbalanced training sets. It has been used to power an innovative Web feed reader that uses smart tags, which learn and find the content you want to see, from more sources than you can follow with traditional feed readers. It works particularly well with Ruby and Ruby on Rails.
libcsv_parser++ is a C++ library for parsing text files to extract records and fields. The delimiting characters are configurable. The parser makes the following assumptions: the record terminator is exactly one character long; the field terminator is exactly one character long; and fields, if enclosed at all, are enclosed by a single character. The parser can handle documents where fields are always enclosed, never enclosed, or optionally enclosed. When all fields are strictly enclosed, the parser assumes that any enclosure character inside a field is escaped with a preceding backslash. The software could be ported to Windows with very little effort.
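The assumptions above (single-character terminators, a single-character enclosure, backslash-escaped enclosure characters) can be sketched as a small state machine. This is an illustration in Python, not the library's C++ API; parse_records and its parameter names are hypothetical.

```python
def parse_records(text, field_sep=",", record_sep="\n", enclosure='"'):
    """Split text into records and fields.

    Assumes one-character terminators and a one-character enclosure.
    Inside an enclosed field, a backslash followed by the enclosure
    character yields a literal enclosure character.
    """
    records, fields, buf = [], [], []
    in_enclosure = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_enclosure:
            if ch == "\\" and i + 1 < len(text) and text[i + 1] == enclosure:
                buf.append(enclosure)   # escaped enclosure character
                i += 1                  # skip the escaped character as well
            elif ch == enclosure:
                in_enclosure = False    # closing enclosure
            else:
                buf.append(ch)
        elif enclosure and ch == enclosure and not buf:
            in_enclosure = True         # opening enclosure at field start
        elif ch == field_sep:
            fields.append("".join(buf))
            buf = []
        elif ch == record_sep:
            fields.append("".join(buf))
            records.append(fields)
            fields, buf = [], []
        else:
            buf.append(ch)
        i += 1
    # Flush a final record that lacks a trailing record terminator.
    if buf or fields:
        fields.append("".join(buf))
        records.append(fields)
    return records


if __name__ == "__main__":
    sample = r'name|note~"Ann"|"said \"hi\""~Bob|plain~'
    for record in parse_records(sample, field_sep="|", record_sep="~"):
        print(record)
```

The same function covers the always-enclosed, never-enclosed, and optionally enclosed cases, because the enclosure is only recognised at the start of a field.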
Ezphpconfig generates a set of PHP configuration classes from a supplied XML file. You can then access your configuration values very quickly without having to parse the XML file on every request. The element (tag) names become property names and the text contained in the elements becomes the property's value. It also supports nested elements. If the generated PHP file is older than the XML file, it is re-generated using the data in the newer XML file. This class also supports array types using the element inside an element whose type attribute is set to "array".
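The regenerate-when-stale behaviour and the mapping from element names to properties can be sketched as follows. This is a Python illustration, not Ezphpconfig's PHP code: it swaps the generated PHP classes for a JSON cache file, and element_to_dict, is_stale, and load_config are hypothetical names. Array-typed elements are left out of the sketch.

```python
import json
import os
import xml.etree.ElementTree as ET
from types import SimpleNamespace


def element_to_dict(elem):
    """Leaf elements become strings; elements with children become dicts."""
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    return {child.tag: element_to_dict(child) for child in children}


def is_stale(xml_path, cache_path):
    """True when the cache is missing or older than the XML source."""
    return (not os.path.exists(cache_path)
            or os.path.getmtime(cache_path) < os.path.getmtime(xml_path))


def load_config(xml_path, cache_path):
    """Return the configuration, rebuilding the cache only when needed."""
    if is_stale(xml_path, cache_path):
        data = element_to_dict(ET.parse(xml_path).getroot())
        with open(cache_path, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
    with open(cache_path, encoding="utf-8") as fh:
        # Nested elements become nested attribute-style objects.
        return json.load(fh, object_hook=lambda d: SimpleNamespace(**d))
```

With an XML file containing a db element that wraps host and port elements, load_config would return an object on which cfg.db.host and cfg.db.port are plain attribute lookups.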
seltz_analyzer is a PHP class that tries to find the most important words inside a well-formed XHTML fragment. Each word receives a score based on its role in the XHTML structure. For example, a word between strong tags receives 5 points. In addition, the class applies some simple syntax rules. For example, a word whose first character is uppercase receives 4 points. The score is cumulative, so the more often a word is used, the more weight it carries.
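The scoring scheme can be sketched in a few lines of Python. Only the 5 points for strong tags and the 4 points for a leading uppercase character come from the description; the baseline of 1 point per occurrence, the decision to add the rules together, and the score_words name are assumptions made for this illustration, not the class's actual behaviour.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

STRONG_POINTS = 5    # from the description: word between strong tags
CAPITAL_POINTS = 4   # from the description: first character uppercase
BASE_POINTS = 1      # assumption: every occurrence counts for something

WORD_RE = re.compile(r"[^\W\d_]+")   # runs of letters


def score_words(xhtml):
    """Return a Counter mapping lowercased words to cumulative scores."""
    scores = Counter()

    def add(text, in_strong):
        for word in WORD_RE.findall(text or ""):
            points = BASE_POINTS
            if in_strong:
                points += STRONG_POINTS
            if word[0].isupper():
                points += CAPITAL_POINTS
            scores[word.lower()] += points

    def walk(elem, in_strong):
        # Strip a namespace prefix such as {http://www.w3.org/1999/xhtml}.
        tag = elem.tag.rsplit("}", 1)[-1]
        inside = in_strong or tag == "strong"
        add(elem.text, inside)
        for child in elem:
            walk(child, inside)
            add(child.tail, inside)   # tail text belongs to the parent element

    walk(ET.fromstring(xhtml), False)
    return scores


if __name__ == "__main__":
    sample = "<p>Python is <strong>fast</strong>, and python is fun.</p>"
    for word, points in score_words(sample).most_common():
        print(word, points)
```

Because the scores accumulate per lowercased word, repeated use raises a word's rank even when no single occurrence is emphasised.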
CollabNet Connector Framework is an Openadaptor-based SDK for rapidly building integrations and migrations of the artifact data shared between the tools used in the ALM (application lifecycle management) cycle and the collaborative platforms from CollabNet. It features bidirectional, out-of-the-box tracker integration between HP Quality Center, CollabNet SourceForge Enterprise, CollabNet Enterprise Edition, database tables, and CSV files.
SitemapGen4j is a Java library to generate XML sitemaps. It supports gzipped output, sitemap validation, and sitemap index generation. It can also generate Google-specific sitemaps, such as Mobile sitemaps, Geo sitemaps, Code Search sitemaps, Google News sitemaps, and Video sitemaps.
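For readers unfamiliar with the output format, a basic gzipped XML sitemap can be produced as sketched below. This uses only the Python standard library and the public sitemaps.org 0.9 namespace; it is not SitemapGen4j's Java API, and build_sitemap and gzip_sitemap are hypothetical names. Validation, sitemap indexes, and the Google-specific variants are not covered.

```python
import gzip
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Build sitemap XML (bytes) from (loc, lastmod) pairs; lastmod may be None."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in urls:
        url = ET.SubElement(root, "url")
        ET.SubElement(url, "loc").text = loc      # text is XML-escaped on output
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(root, encoding="utf-8")
    return b'<?xml version="1.0" encoding="UTF-8"?>\n' + body


def gzip_sitemap(urls):
    """Return the gzip-compressed sitemap, ready to be written to sitemap.xml.gz."""
    return gzip.compress(build_sitemap(urls))


if __name__ == "__main__":
    pages = [
        ("https://www.example.com/", "2024-01-01"),
        ("https://www.example.com/search?a=1&b=2", None),
    ]
    print(build_sitemap(pages).decode("utf-8"))
```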