rospell is a UTF-8 text editor for programmers and general use. It features syntax highlighting, a code beautifier for C and C++, and support for gdb, ctags, grep, diff, and merge. It includes spell checkers for English, French, German, Spanish, and Romanian. Romanian aspell and hunspell dictionaries are also available.
Linguistico is a set of language tools based on the Italian language. It includes a dictionary, a thesaurus, word definitions, and other scripts and programs. You can use these tools with OpenOffice.org, Mozilla Thunderbird, Mozilla Firefox, MySpell, MyThes, Aspell, and HunSpell.
libuninum is a library for converting Unicode strings to integers and integers to Unicode strings. Internal computation uses arbitrary-precision arithmetic, so there is no limit on the size of the integer that can be converted. Values are passed and returned as ASCII decimal strings, GNU MP mpz_t objects, or unsigned long integers. The number system of an input string can be detected automatically, and many number systems are supported. Group delimitation for output strings is fully controllable. Command-line and graphical interfaces are also provided.
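As an illustration of the kind of conversion described above, here is a minimal Python sketch. It is not the libuninum API: the function names are hypothetical, and it handles only positional decimal digits (Unicode category Nd), not the non-positional number systems a full library would need to cover. Python integers are arbitrary precision, so no size limit applies here either.

```python
import unicodedata


def unicode_digits_to_int(text: str) -> int:
    """Convert a string of Unicode decimal digits (any script) to an int.

    Hypothetical helper for illustration; positional decimal systems only.
    """
    if not text:
        raise ValueError("empty digit string")
    value = 0
    for ch in text:
        # unicodedata.decimal raises ValueError for non-decimal characters.
        value = value * 10 + unicodedata.decimal(ch)
    return value


def int_to_unicode_digits(n: int, zero: str) -> str:
    """Render a non-negative int using the digit run that starts at `zero`.

    `zero` must be the zero digit of the target script, e.g. "\u0966"
    (DEVANAGARI DIGIT ZERO). Decimal digits of a script occupy ten
    consecutive code points, so each digit is an offset from `zero`.
    """
    if n < 0:
        raise ValueError("negative values are not handled in this sketch")
    if unicodedata.decimal(zero, None) != 0:
        raise ValueError("`zero` must be a Unicode decimal digit zero")
    base = ord(zero)
    return "".join(chr(base + int(d)) for d in str(n))
```

For example, `unicode_digits_to_int("\u0661\u0662\u0663")` reads the Arabic-Indic digits one, two, three as a single integer, and `int_to_unicode_digits(205, "\u0966")` writes 205 in Devanagari digits.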
OmegaT is a translation memory application intended for professional translators. It does not translate for you (software that does this is called "machine translation"). It features fuzzy matching, match propagation, simultaneous processing of multiple-file projects, simultaneous use of multiple translation memories, and external glossaries. Supported document formats include plain text, HTML, and OpenOffice.org/StarOffice. It supports Unicode (UTF-8), so it can be used with non-Latin alphabets. It is compatible with other translation memory applications through TMX Level 1.
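To make the idea of fuzzy matching against a translation memory concrete, here is a small Python sketch. It is not OmegaT's algorithm: it uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure, and the function name, the dictionary-based memory, and the 0.7 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def best_fuzzy_match(segment, memory, threshold=0.7):
    """Return (score, source, target) for the closest stored segment.

    `memory` maps source segments to their stored translations.
    Returns None when no stored segment reaches `threshold`.
    Hypothetical helper; the similarity measure is difflib's ratio,
    not whatever a real translation memory tool uses.
    """
    best = None
    for source, target in memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best


# Tiny in-memory translation memory (English -> French), for illustration.
memory = {
    "Open the file.": "Ouvrez le fichier.",
    "Close the window.": "Fermez la fenêtre.",
}
match = best_fuzzy_match("Open the files.", memory)
if match is not None:
    score, source, target = match
    print(f"{score:.0%} match with {source!r}: {target}")
```

The translator still decides whether to accept or adapt the suggested target text; the lookup only proposes earlier translations of similar segments.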
Solr is an enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g. Word and PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites. Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture when more advanced customization is required.
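As a rough illustration of the HTTP API mentioned above, the following Python sketch builds a query URL for Solr's select handler. The base URL, the core name "docs", and the helper function are assumptions for the example; the URL layout with a core name follows common recent deployments and may differ in older servlet-container setups. The parameters `q`, `rows`, `wt`, `facet`, and `facet.field` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, query, rows=10, facet_field=None):
    """Build a URL for a Solr select request that returns JSON.

    Hypothetical helper: `base_url` and `core` depend on the deployment.
    """
    params = [("q", query), ("rows", rows), ("wt", "json")]
    if facet_field:
        # Ask Solr to return facet counts for one field.
        params += [("facet", "true"), ("facet.field", facet_field)]
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


# Assumed local instance and core name, for illustration only.
url = build_select_url("http://localhost:8983/solr", "docs", "title:lucene", rows=5)
print(url)
```

Because the request is an ordinary HTTP GET, the resulting URL can be fetched with any HTTP client (for instance `urllib.request.urlopen`) and the response parsed as JSON.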
CharEntry is a tool for inserting non-ASCII characters into text, with particular emphasis on linguistic notation. It provides charts of the consonants, vowels, and diacritics of the International Phonetic Alphabet as well as a chart of precomposed accented characters. Clicking on a character inserts it into a text region, the contents of which may be saved to a file or copied and pasted elsewhere. A widget for inserting characters by Unicode codepoint is also provided. Furthermore, it is possible to read the definition of a custom character chart from a file.
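Inserting a character by Unicode codepoint amounts to parsing a hexadecimal number and mapping it to a character. The Python sketch below shows that step only; the function names and the accepted input forms ("U+0259", "0x0259", or bare hex) are assumptions and do not describe CharEntry's own widget.

```python
import unicodedata


def char_from_codepoint(spec: str) -> str:
    """Return the character for a codepoint written in hexadecimal.

    Accepts forms such as "U+0259", "0x0259", or "0259".
    Raises ValueError for input that is not a valid codepoint.
    """
    s = spec.strip()
    if s[:2].upper() == "U+" or s[:2].lower() == "0x":
        s = s[2:]
    return chr(int(s, 16))  # chr() rejects values above 0x10FFFF


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character, for display in a chart."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


schwa = char_from_codepoint("U+0259")
print(schwa, describe(schwa))
```

The same lookup works for any assigned codepoint, so a custom chart read from a file could store codepoints as text and resolve them this way.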
Esperantilo ("Tool for Esperanto") is a UTF-8 editor with linguistic functions for Esperanto, and is also a system for computer-aided translation. It contains a spell checker and a grammar checker for Esperanto. It can translate Esperanto text in various formats into Polish, German, English, and Swedish, and can translate from Polish and English. It also supports computer-aided translation through interactive machine translation. Translation memory can also be used for any language pair. It is an XLIFF editor and supports the XLIFF and TMX (Level 1) formats. Machine translation uses direct translation at the syntax level.
SenseClusters is a natural language processing package that allows you to cluster similar contexts or to identify clusters of related words. It supports its own native methods based on first and second order representations of context, and also supports Latent Semantic Analysis. It is fully unsupervised, and can automatically discover the optimal number of clusters in your text. SenseClusters is a complete system that takes users from preprocessing of raw text to providing clustered output.