Redet is a tool for developing and executing regular expressions using any of more than 50 search programs, editors, and programming languages, intended both for developing regular expressions for use elsewhere and as a search tool in its own right. For each program in each locale, a palette showing the available constructs is provided. The properties of each program are determined by runtime tests, which guarantees that they will be correct for the program version and locale. Additional features include persistent history, extensive help, a variety of character entry tools, and the ability to change locale while running. Redet is highly configurable and fully supports Unicode.
JOrtho is a spell checker for Java. The library works with any JTextComponent from the Swing framework and checks spelling as you type. The dictionaries are based on the free Wiktionary.org project and are available for multiple languages. You can select the spell-checking language via a context menu. JOrtho's features are the highlighting of potentially misspelled words, a context menu with suggestions for correct forms of the word, and a context menu with an option to change the checking language. At the moment, nine languages are available for spell checking: English, German, French, Spanish, Italian, Russian, Polish, Dutch, and Arabic.
Xlit converts text from one writing system into another. It allows the user to define a transliteration simply by typing the input strings in one window and the strings to which they are to be mapped in another. Transliteration may be restricted to regions bounded by specified delimiters or their complements. Transliteration may also be performed by external commands or plugins. Xlit can also convert one type of delimiter to another, e.g. from HZ escapes to XML. Xlit can read and write transliteration definitions in its own format and as Yudit keymaps. It can be run in batch mode without the GUI.
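The table-driven mapping described for Xlit can be illustrated with a minimal Python sketch. This is not Xlit's code: the function names, the longest-match-first rule, and the sample mapping are all assumptions made for illustration, and the delimiter handling covers only the simple case of transliterating inside a delimited region.

```python
import re

def transliterate(text, mapping):
    """Replace each occurrence of a mapping key with its value.

    Assumption: when keys overlap (e.g. "s" and "sh"), the longer key wins.
    """
    if not mapping:
        return text
    # Sort keys longest first so the regex alternation prefers longer matches.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)

def transliterate_regions(text, mapping, open_delim, close_delim):
    """Transliterate only the text between open_delim and close_delim."""
    region = re.compile(
        re.escape(open_delim) + "(.*?)" + re.escape(close_delim), re.DOTALL
    )
    return region.sub(
        lambda m: open_delim + transliterate(m.group(1), mapping) + close_delim,
        text,
    )

# Hypothetical Latin-to-Cyrillic mapping, used only for this example.
SAMPLE_MAPPING = {"sh": "ш", "s": "с", "a": "а"}
```

Calling `transliterate("sasha", SAMPLE_MAPPING)` converts the whole string, while `transliterate_regions("sa <sa> sa", SAMPLE_MAPPING, "<", ">")` leaves the text outside the angle brackets untouched.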
WordGenerator generates hypothetical words from specifications of their syllable structure. The user specifies the maximum length of the words in syllables, the abstract structure of syllables in the language (in terms of such units as consonants and vowels or onsets and rhymes), and the actual sounds that comprise each abstract class (e.g. the list of vowels in the language); WordGenerator then generates the words that conform to this specification. Such lists are useful to field linguists exploring the vocabulary of a language, and to designers of artificial languages.
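The generation step described for WordGenerator can be sketched as follows. This is an illustration under assumptions, not WordGenerator's implementation: the function name, the use of one-letter class symbols such as "C" and "V", and the exhaustive enumeration strategy are all hypothetical.

```python
from itertools import product

def generate_words(syllable_patterns, classes, max_syllables):
    """Return every word of 1..max_syllables syllables allowed by the spec.

    syllable_patterns: abstract syllable shapes, e.g. ["CV", "V"]
    classes: maps each abstract symbol to its sounds, e.g. {"V": ["a", "i"]}
    """
    # Expand each abstract pattern into the concrete syllables it permits.
    syllables = []
    for pattern in syllable_patterns:
        for combo in product(*(classes[symbol] for symbol in pattern)):
            syllables.append("".join(combo))
    syllables = list(dict.fromkeys(syllables))  # drop duplicates, keep order

    # Concatenate syllables into words of every permitted length.
    words = []
    for count in range(1, max_syllables + 1):
        for combo in product(syllables, repeat=count):
            words.append("".join(combo))
    return list(dict.fromkeys(words))

# Hypothetical toy language: two consonants, two vowels, CV syllables only.
SAMPLE_CLASSES = {"C": ["p", "t"], "V": ["a", "i"]}
```

With `generate_words(["CV"], SAMPLE_CLASSES, 2)` the sketch yields the four syllables pa, pi, ta, ti plus all sixteen two-syllable combinations. The number of words grows exponentially with the maximum length, so a practical tool would likely stream or filter the output rather than build the full list.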
OpenEphyra is a question answering (QA) system. It retrieves answers to natural language questions from the Web and other sources. OpenEphyra comes with implementations of algorithms that proved effective in Carnegie Mellon's Ephyra system, which participated in the TREC evaluations. It is platform independent and can be set up in just a few minutes. The goal of this project is to give researchers the opportunity to develop new QA techniques without worrying about the end-to-end system.
UnicodeDataBrowser is a browser for the UnicodeData.txt file, which contains much useful information but is not easily read by humans. It creates a scrollable table in which columns represent properties. The table may be sorted on any column. Abbreviations are expanded, and the characters cross-referenced in the decomposition and casing fields are named. Regular expression search restricted to a selected column is available. The set of characters for which information is displayed may be restricted to those characters matching a regular expression on a specified property.
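To show why UnicodeData.txt benefits from a browser, here is a minimal Python sketch of the operations described above: parsing the file into rows, restricting rows by a regular expression on one property, and sorting on a column. UnicodeData.txt stores 15 semicolon-separated fields per line; the short field names, the function names, and the three sample lines are chosen for this illustration and are not taken from UnicodeDataBrowser.

```python
import re

# The 15 semicolon-separated fields of UnicodeData.txt, with short
# illustrative names (assumed labels, not official property names).
FIELDS = [
    "code", "name", "category", "combining", "bidi", "decomposition",
    "decimal", "digit", "numeric", "mirrored", "old_name", "comment",
    "upper", "lower", "title",
]

def parse(lines):
    """Turn raw UnicodeData.txt lines into a list of dicts keyed by FIELDS."""
    rows = []
    for line in lines:
        line = line.strip()
        if line:
            rows.append(dict(zip(FIELDS, line.split(";"))))
    return rows

def restrict(rows, prop, pattern):
    """Keep only the rows whose value for prop matches the regular expression."""
    rx = re.compile(pattern)
    return [row for row in rows if rx.search(row[prop])]

def sort_by(rows, column):
    """Sort rows on one column; code points are compared numerically."""
    if column == "code":
        return sorted(rows, key=lambda row: int(row["code"], 16))
    return sorted(rows, key=lambda row: row[column])

# Three lines in UnicodeData.txt format, used as sample input.
SAMPLE = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041",
    "0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;",
]
```

For example, `restrict(parse(SAMPLE), "category", r"^L")` keeps only the two letters, and `sort_by(parse(SAMPLE), "code")` puts DIGIT ZERO first.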
Grammar Browser provides a simple-to-use graphical interface to the grammatical structure and relations of any text, as parsed by the Stanford Parser. It contains a grammatical relation editor to modify, import, and export grammatical relation definitions (tregex patterns and features).
libalinga is a C++ implementation of a multi-stream codec for the ALingA (Aligned Linguistic Annotation) format. It makes use of libogg++. Each ALingA stream holds at least one stream of annotation data, which is in the LingA format. It may also interleave the signal stream(s) against which the LingA streams are aligned, or it may simply reference such streams. It also provides metadata about the underlying manifold for the signals and the annotations. The metadata is ordered for runtime parsing of the number and type of signal and LingA codecs to enable decoding of the multiple logical streams in one pass.
libalinga-java is a Java Native Interface (JNI) to libalinga. It provides C++ and Java wrappers, as well as Java classes. It also provides control files to generate them from the libalinga interface using the program swig. The major and minor versions of this JNI will track those of libalinga, but its bugfixes are independent of libalinga bugfixes.