193 projects tagged "Linguistic"

Download Website Updated 30 Jul 2007 prbeditor

Pop 47.02
Vit 4.18

prbeditor is an editor for Java property resource bundle files. It is intended to help localize (l10n) programs that have been internationalized with Java's standard i18n mechanism. Unlike similar tools, it shows the keys and values of several languages at the same time in a spreadsheet, giving a global view of the resource files. The tool uses regular expressions to organize the keys and to filter which files are visible. It includes a spell checker for several languages, based on word lists which may be downloaded separately.
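The spreadsheet-style view described above can be illustrated with a minimal Python sketch. It is not prbeditor's code: the function names `parse_properties` and `side_by_side` and the sample bundle contents are hypothetical, and the parser assumes a simplified property format with no escapes or line continuations.

```python
import re

def parse_properties(text):
    """Parse a simplified .properties file: one key=value pair per line.

    Assumption: no line continuations or escape sequences, which real
    Java property files may contain.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:
            entries[key.strip()] = value.strip()
    return entries

def side_by_side(bundles, key_pattern=".*"):
    """Return the language columns and one row per key matching the regex."""
    pattern = re.compile(key_pattern)
    languages = sorted(bundles)
    all_keys = sorted({key for bundle in bundles.values() for key in bundle})
    rows = []
    for key in all_keys:
        if pattern.search(key):
            # A missing translation shows up as an empty cell.
            rows.append((key, [bundles[lang].get(key, "") for lang in languages]))
    return languages, rows

# Hypothetical bundle contents for two locales.
bundles = {
    "en": parse_properties("menu.file=File\nmenu.edit=Edit\ndialog.ok=OK\n"),
    "es": parse_properties("menu.file=Archivo\n# edit is untranslated\ndialog.ok=Aceptar\n"),
}
languages, rows = side_by_side(bundles, r"^menu\.")
print("key", *languages, sep="\t")
for key, values in rows:
    print(key, *values, sep="\t")
```

Laying the languages out as columns makes an untranslated key stand out as an empty cell, which is the kind of global view the description refers to.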

Download Website Updated 04 Feb 2007 Japana

Pop 62.77
Vit 4.03

Japana is a small HTTP proxy written in Perl. It converts Japanese characters (Hiragana, Katakana, and Kanji) into ASCII (Romaji) on the fly. The conversion is done with the kakasi library (an older version that does not require kakasi is still available).

Download Website Updated 12 Dec 2008 WordGenerator

Pop 60.97
Vit 3.95

WordGenerator generates hypothetical words from specifications of their syllable structure. The user specifies the maximum length of the words in syllables, the abstract structure of syllables in the language (in terms of such units as consonants and vowels or onsets and rhymes), and the actual sounds that make up each abstract class (e.g. the list of vowels in the language); WordGenerator then generates the words that conform to this specification. Such lists are useful to field linguists exploring the vocabulary of a language, and to designers of artificial languages.
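The generation step described above can be sketched in a few lines of Python. This is an illustration of the idea only, not WordGenerator's own implementation or input format; the function name `generate_words`, the pattern strings, and the sound inventory are all assumptions.

```python
from itertools import product

def generate_words(syllable_patterns, classes, max_syllables):
    """Yield every word made of 1..max_syllables syllables.

    syllable_patterns: abstract shapes such as "CV" or "CVC"
    classes: maps each class symbol to the sounds it may stand for
    """
    # Expand each abstract pattern into the concrete syllables it allows.
    syllables = []
    for pattern in syllable_patterns:
        for sounds in product(*(classes[symbol] for symbol in pattern)):
            syllables.append("".join(sounds))
    # Concatenate syllables into words of increasing length.
    for count in range(1, max_syllables + 1):
        for combination in product(syllables, repeat=count):
            yield "".join(combination)

# Hypothetical toy language: two consonants, two vowels, syllables V or CV.
classes = {"C": ["p", "t"], "V": ["a", "i"]}
words = list(generate_words(["V", "CV"], classes, max_syllables=2))
print(len(words), words[:6])
```

The number of candidates grows exponentially with the maximum word length, so the sketch uses a generator rather than building the full list up front.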

Download Website Updated 28 Jan 2007 ByteName

Pop 46.58
Vit 3.92

For each byte of its input, ByteName prints a line consisting of the byte offset, the byte value in hex, octal, binary, and decimal, and a description of the byte in a selected single-byte encoding. A command line flag suppresses the lines for ASCII characters, which is useful for locating stray non-ASCII codes. It can also generate a chart for a specified encoding or, for a specified codepoint, generate descriptions in all known encodings.
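A rough Python sketch of this kind of per-byte listing follows. It is not ByteName's code or output format: the function name `describe_bytes`, the column layout, and the `skip_ascii` parameter (standing in for the command line flag) are assumptions, and character descriptions come from Python's `unicodedata` module.

```python
import unicodedata

def describe_bytes(data, encoding="latin-1", skip_ascii=False):
    """Return one line per byte: offset, hex, octal, binary, decimal, name."""
    lines = []
    for offset, byte in enumerate(data):
        if skip_ascii and byte < 0x80:
            continue  # hide ASCII so stray high bytes stand out
        try:
            char = bytes([byte]).decode(encoding)
            name = unicodedata.name(char, "<control or unnamed>")
        except UnicodeDecodeError:
            name = "<undefined in this encoding>"
        lines.append(
            f"{offset:6d}  {byte:02X}  {byte:03o}  {byte:08b}  {byte:3d}  {name}"
        )
    return lines

# Hypothetical input: mostly ASCII with one Latin-1 byte in it.
sample = "café\n".encode("latin-1")
for line in describe_bytes(sample, skip_ascii=True):
    print(line)
```

With ASCII suppressed, only the byte at offset 3 is listed, which is how a stray non-ASCII code can be located in an otherwise plain file.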

No download Website Updated 25 Sep 2006 Connexor Machinese

Pop 28.35
Vit 3.89

Connexor Machinese analyzers process sequences of written words, identify and classify the various entities in them, and show how these relate to each other, marking the language with a simple and systematic notation. Currently, the Machinese product family includes: Machinese Phrase Tagger, a fast, light-weight morphosyntactic tagger; Machinese Syntax, a full-scale dependency parser; Machinese Semantics, a dependency parser with semantic analysis; and Machinese Metadata, an entity extractor.

Download Website Updated 24 Aug 2009 CharEntry

Pop 37.01
Vit 3.81

CharEntry is a tool for inserting non-ASCII characters into text, with particular emphasis on linguistic notation. It provides charts of the consonants, vowels, and diacritics of the International Phonetic Alphabet as well as a chart of precomposed accented characters. Clicking on a character inserts it into a text region, the contents of which may be saved to a file or copied and pasted elsewhere. A widget for inserting characters by Unicode codepoint is also provided. Furthermore, it is possible to read the definition of a custom character chart from a file.
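Entry by Unicode codepoint, as mentioned above, amounts to parsing a hexadecimal number and mapping it to a character. The Python sketch below shows that step only; the function name `char_from_codepoint` and the accepted input spellings are assumptions, not CharEntry's behavior.

```python
import unicodedata

def char_from_codepoint(text):
    """Turn input such as "U+0259", "0x259", or "259" (hex) into a character."""
    cleaned = text.strip().upper()
    for prefix in ("U+", "0X"):
        if cleaned.startswith(prefix):
            cleaned = cleaned[len(prefix):]
    return chr(int(cleaned, 16))

# U+0259 is the IPA mid central vowel (schwa).
schwa = char_from_codepoint("U+0259")
print(schwa, unicodedata.name(schwa))
```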

Download Website Updated 16 Jul 2005 Transtalo

Pop 17.18
Vit 3.71

Transtalo is an automatic translator. It consists of a library interface and modules for source and destination languages (called input and output modules). These modules communicate with each other through sentence files in an XML format.

Download Website Updated 09 Dec 2007 libuninum

Pop 65.60
Vit 3.71

libuninum is a library for converting Unicode strings to integers and integers to Unicode strings. Internal computation is done using arbitrary precision arithmetic, so there is no limit on the size of the integer that can be converted. Values are passed and returned as ASCII decimal strings, GNU MP mpz_t objects, or unsigned long integers. The number system of an input string can be detected automatically, and a large number of number systems are supported. Group delimitation for output strings is fully controllable. Command line and graphical interfaces are also provided.
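To make the idea concrete, here is a small Python sketch of the two conversions. It does not use or reproduce libuninum's API, and it covers only positional decimal digit sets, a small subset of what the library is described as handling. The function names `to_int` and `to_digits` are hypothetical. Python's built-in `int` accepts Unicode decimal digits and is arbitrary precision, which stands in for the GNU MP arithmetic.

```python
def to_int(text, separator=","):
    """Parse a string of Unicode decimal digits, ignoring group separators."""
    return int(text.replace(separator, ""))

def to_digits(value, zero, group_size=0, separator=","):
    """Render a non-negative integer using the digit set that starts at `zero`.

    zero: the character for 0 in the target script, e.g. DEVANAGARI DIGIT ZERO
    group_size: digits per group counted from the right; 0 disables grouping
    """
    digits = "".join(chr(ord(zero) + int(d)) for d in str(value))
    if group_size > 0:
        groups = []
        while digits:
            groups.insert(0, digits[-group_size:])
            digits = digits[:-group_size]
        digits = separator.join(groups)
    return digits

# Devanagari digits: "\u0966" is DEVANAGARI DIGIT ZERO.
print(to_int("१२३४५"))
print(to_digits(1234567, "\u0966", group_size=3))
print(to_int(to_digits(10**40, "\u0966", group_size=3)) == 10**40)
```

Number systems that are not positional, such as Roman numerals, would need per-system rules that this sketch does not attempt.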

Download No website Updated 06 May 2014 yawl

Pop 106.16
Vit 3.50

YAWL is a comprehensive "word game" word list for UNIX/Linux. It is a superset of the author's ENABLE list, the "OSW", and various lists researched by the author's colleague, Alan Beale. At 264,093 words, it is the largest list of its kind, suitable for use in all manner of crossword-type board games and word construction games, as well as for a spell checker dictionary. The YAWL package now includes two anagramming utilities (supplied as source code and built with the included Makefile). There is also a shell script that extends the UNIX "strings" system command. This is the word list package recommended for the author's Quackey word game.

Download Website Updated 18 Oct 2008 Cypher

Pop 27.02
Vit 3.50

Cypher is an AI program that generates the RDF graph and SPARQL query representations of plain language input, allowing users to speak plain language to update and query databases. With robust definition languages, Cypher's grammar and lexicon can quickly and easily be extended to process highly complex sentences and phrases of any natural language, and can cover any vocabulary. Equipped with Cypher, programmers can begin building next generation semantic Web applications that harness natural language.
