The Okapi project’s main purpose is to provide a set of building blocks for the creation of larger open source localization and translation tools. Many Okapi components, however, are generic enough to be of interest to the text mining, natural language processing, and text retrieval communities. Okapi’s many text filters (HTML, Properties, XML with ITS XPath-based rules, OpenXML, ODF, Regex, etc.) provide a straightforward way to access the text of multiple document formats. Its document events and pipeline can be integrated with other frameworks such as UIMA, LingPipe, OpenPipeline, OpenNLP, GATE, and Lucene. The advantage of Okapi’s text filters is that they not only extract the text but also preserve all non-textual formatting. It is possible to decompose a document into events, process them via the pipeline, and then rebuild the input document without loss. Structural information can be added to Okapi document events so that tables, lists, links, titles, etc. are grouped together and treated as a unit. This is useful when context based on a “universal” document structure is needed. The Okapi event model supports user-configurable annotations, similar to UIMA but simpler and more restricted in scope. Users can annotate spans of text or add new resources such as translation memory matches, terminology, token types, or part-of-speech information.
OmegaT is a translation memory application intended for professional translators. It does not translate for you (software that does this is called "machine translation"). It features fuzzy matching, match propagation, simultaneous processing of multiple-file projects, simultaneous use of multiple translation memories, and external glossaries. Supported document file formats include plain text, HTML, and OpenOffice.org/StarOffice. It supports Unicode (UTF-8), so it can be used with non-Latin alphabets. It is compatible with other translation memory applications through TMX Level 1.
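The idea behind fuzzy matching against a translation memory can be sketched in a few lines of Python. This is only an illustration, not OmegaT's actual algorithm: the similarity measure (the standard library's `difflib.SequenceMatcher`), the `fuzzy_matches` function, the 0.7 threshold, and the sample memory are all assumptions made for the example.

```python
from difflib import SequenceMatcher

def fuzzy_matches(segment, memory, threshold=0.7):
    """Return (score, source, target) tuples for stored segments that are
    similar enough to the new segment, best match first.

    `memory` maps previously translated source segments to their targets.
    The threshold is an arbitrary value chosen for this sketch.
    """
    scored = []
    for source, target in memory.items():
        # ratio() is 1.0 for identical strings and approaches 0.0 for unrelated ones.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold:
            scored.append((score, source, target))
    return sorted(scored, reverse=True)

# Hypothetical translation memory (English -> French).
memory = {
    "Open the file.": "Ouvrez le fichier.",
    "Close the file.": "Fermez le fichier.",
    "Save your work often.": "Enregistrez souvent votre travail.",
}

matches = fuzzy_matches("Open the files.", memory)
for score, source, target in matches:
    print(f"{score:.0%}  {source!r} -> {target!r}")
```

A translator would be shown the closest stored segment and its translation as a starting point, rather than a machine-generated translation.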
Openbakery Translation is an internationalization tool for Java. Unlike standard i18n in Java, openbakery translation uses the text in the default locale as the key. There is also a tool which checks all of the source code for translations. This tool then provides a list of key/value pairs which have to be added to a certain resource file, and another list of pairs which can be removed. The translation works by simply calling a static method called "translate". The code works out of the box, without writing any properties files. You only write properties files when you really translate the program to a second language.
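The approach can be illustrated with a short sketch. It is written in Python for brevity and is not Openbakery's Java API: the `translate` and `audit` functions, the `CATALOGS` dictionary, and the regular expression used to find calls are all hypothetical. It shows the two ideas described above: the default-locale text doubles as the lookup key (so untranslated text still works), and a scan of the source code reports which keys need to be added to or removed from a resource file.

```python
import re

# Hypothetical per-locale catalogs; the key is the text in the default locale.
CATALOGS = {
    "de": {"Save": "Speichern", "Cancel": "Abbrechen", "Old label": "Altes Etikett"},
}

def translate(text, locale="en"):
    """Look up `text` for `locale`; fall back to the text itself.

    Because the key is the default-locale text, the program works
    without any catalog until a second language is actually needed.
    """
    return CATALOGS.get(locale, {}).get(text, text)

# Matches translate("...") calls with a double-quoted string literal (assumed call style).
CALL_PATTERN = re.compile(r'translate\(\s*"((?:[^"\\]|\\.)*)"')

def audit(source_code, catalog):
    """Return (keys_to_add, keys_to_remove) for one catalog."""
    used = set(CALL_PATTERN.findall(source_code))
    known = set(catalog)
    return sorted(used - known), sorted(known - used)

source = 'print(translate("Save")); print(translate("Help"))'
to_add, to_remove = audit(source, CATALOGS["de"])
print("add:", to_add)
print("remove:", to_remove)
```

Here "Help" is used in the code but missing from the German catalog, while "Cancel" and "Old label" are in the catalog but no longer referenced.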
PEAR Validate is a set of useful methods for validating various kinds of data. It supports numbers (min/max, decimal or not), email addresses (syntax, domain check, RFC 822), strings (alpha, upper and/or lowercase, numeric), dates (min, max), URIs (RFC 2396), and more. It also has many locale-specific validation rules, specialized for a country or region (US, FR, UK, DE) or an application domain (such as finance).
PHP Net_IDNA is a class to convert between the Punycode and Unicode formats. Punycode is a standard described in RFC 3492 and is part of IDNA (Internationalizing Domain Names in Applications, RFC 3490). This class allows PHP scripts to convert internationalized domain names without having one of the PHP extensions installed. It supports both IDNA 2003 and IDNA 2008.
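To make the conversion concrete, here is a minimal sketch of the same round trip. It is written in Python rather than PHP and does not use Net_IDNA; it relies on Python's built-in `idna` codec, which implements IDNA 2003 only. The helper names `to_ascii` and `to_unicode` and the sample domain are chosen for illustration.

```python
def to_ascii(domain):
    """Convert a Unicode domain name to its ASCII (Punycode-based) form."""
    # The "idna" codec applies nameprep and Punycode to each label (IDNA 2003).
    return domain.encode("idna").decode("ascii")

def to_unicode(domain):
    """Convert an ASCII-compatible domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")

ascii_form = to_ascii("bücher.example")
print(ascii_form)              # xn--bcher-kva.example
print(to_unicode(ascii_form))  # bücher.example
```

Labels that contain non-ASCII characters are encoded with Punycode and prefixed with `xn--`; plain ASCII labels such as `example` pass through unchanged.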
Piragibe is a database-driven business application framework. Its main goal is to mimic, as closely as possible, the capabilities and behaviour of Oracle Forms. It offers a metaphor that resembles Oracle Developer, with data blocks, forms, triggers, and events; a neat layer of data validation capable of validating fields, records, and blocks of records under programmatic control; clear separation and independence between database access, programmatic views of database data, and visual presentation of data and information; access to any database supported by PHP; national language support; and the ability to develop CRUD applications with a few lines of code.
Pootle is a Web-based translation and translation management tool. It provides a rich set of features for managing a translation project. It integrates components of the Translate Toolkit to provide error checkers for translation messages and the ability to download files in a number of formats: PO, XLIFF, and CSV. Pootle can also provide compiled PO files for download. You can use it to assign work to translators in your team, and you can define goals to help focus your translation efforts. Pootle can run without a Web server or be proxied through your existing Apache server.