15 projects tagged "Linguistic"

lexica (updated 20 Sep 2001)

Pop 28.88
Vit 1.05

Lexica is a graphical interface to Unix/Linux dictionary resources. It is implemented in Tcl/Tk and provides access to dict, wn, and grep.

Mueller English-Russian Dictionary Kit (updated 23 Feb 2000; no download)

Pop 36.33
Vit 69.54

This is the first respectable English-Russian dictionary with IPA transcription released under the GNU GPL. The dictionary has 46,233 entries and is 5.5 MB in size. "MOVA" is a set of Bash and Tcl/Tk scripts for dictionary management under UNIX.

WordNet (updated 30 Jul 2007)

Pop 81.33
Vit 2.69

WordNet® is an on-line lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. Different relations link the synonym sets.

Ellogon (updated 15 Mar 2005)

Pop 65.20
Vit 1.83

Ellogon is a multi-lingual, cross-platform, general-purpose language engineering environment, developed to aid both researchers in computational linguistics and companies that produce and deliver language engineering systems. As a language engineering platform, it offers an extensive set of facilities, including tools for processing and visualising textual/HTML/XML data and associated linguistic information, support for lexical resources (such as creating and embedding lexicons), and tools for creating annotated corpora, accessing databases, comparing annotated data, and transforming linguistic information into vectors for use with various machine learning algorithms.

Redet (updated 12 Sep 2008)

Pop 277.67
Vit 11.45

Redet is a tool for developing and executing regular expressions using any of more than 50 search programs, editors, and programming languages, intended both for developing regular expressions for use elsewhere and as a search tool in its own right. For each program in each locale, a palette showing the available constructs is provided. The properties of each program are determined by runtime tests, which guarantees that they will be correct for the program version and locale. Additional features include persistent history, extensive help, a variety of character entry tools, and the ability to change locale while running. Redet is highly configurable and fully supports Unicode.

uni2ascii (updated 15 May 2011)

Pop 202.36
Vit 13.98

uni2ascii and ascii2uni provide conversion in both directions between UTF-8 Unicode and more than thirty 7-bit ASCII equivalents, including RFC 2396 URI format and RFC 2045 Quoted-Printable format, the representations used in HTML, SGML, XML, OOXML, the Unicode standard, Rich Text Format, POSIX portable charmaps, POSIX locale specifications, and Apache log files. They can also convert between the escapes used for Unicode in languages such as Ada, C, Common Lisp, Java, Pascal, Perl, PostScript, Python, Scheme, and Tcl.
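
To illustrate the kind of round trip described above, here is a minimal Python sketch that converts non-ASCII characters to one 7-bit representation (backslash-u escapes) and back. It is not uni2ascii's code or command-line interface; the function names `to_ascii_escapes` and `from_ascii_escapes` are hypothetical, and only a single escape format is handled.

```python
import re

def to_ascii_escapes(text: str) -> str:
    """Replace each non-ASCII character with a backslash-u style escape."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                 # plain ASCII passes through
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")     # four hex digits for the BMP
        else:
            out.append(f"\\U{cp:08X}")     # eight hex digits above the BMP
    return "".join(out)

# Matches the two escape shapes produced above (illustrative only).
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")

def from_ascii_escapes(text: str) -> str:
    """Turn the escapes produced by to_ascii_escapes back into characters."""
    def repl(match):
        return chr(int(match.group(1) or match.group(2), 16))
    return _ESCAPE.sub(repl, text)
```

Note that this sketch does not protect backslash sequences already present in the input, so text that literally contains such an escape will not round-trip unchanged.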

msort (updated 11 Jan 2010)

Pop 222.95
Vit 12.24

Msort sorts files in sophisticated ways. Records may be fixed size, newline-separated blocks, or terminated by any specified character. Key fields may be selected by position, tag, or character range. For each key, distinct exclusions, multigraphs, substitutions, and a sort order may be defined or locale collation rules used. Comparisons may be lexicographic, numeric, numeric string, hybrid, random, by string length, angle, domain name, date, time, month name, or ISO 8601 timestamp. Keys may be reversed so as to generate reverse dictionaries. Optional keys are supported. Unicode is supported, including full case-folding. Msort itself has a somewhat complex command-line interface, but may be driven by an optional GUI.
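
As a rough illustration of two of the ideas above (reversed keys for reverse dictionaries, and comparison by string length), the following Python sketch uses the built-in `sorted`. It does not use Msort or reproduce its options; the function names are hypothetical.

```python
def reverse_dictionary(words):
    """Sort words by their reversed spelling, so words sharing endings group together."""
    return sorted(words, key=lambda w: w[::-1])

def by_length(words):
    """Sort by string length, breaking ties lexicographically."""
    return sorted(words, key=lambda w: (len(w), w))

words = ["cat", "dog", "bat", "frog"]
print(reverse_dictionary(words))  # ['dog', 'frog', 'bat', 'cat']
print(by_length(words))           # ['bat', 'cat', 'dog', 'frog']
```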

minpair (updated 15 Nov 2009)

Pop 52.38
Vit 4.85

Minpair consists of two programs, a C command-line program and a Tcl/Tk GUI, each of which can independently generate a complete list of minimal pairs (words differing in exactly one segment) for use in linguistic research. The GUI may also be used to control the faster CLI program. Both allow sequences of characters to be defined as single segments. Unicode is fully supported. It is also possible to obtain a list of pairs differing in exactly two positions for use in finding phonological rules.
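
The idea of a minimal pair can be sketched in a few lines of Python. This is not Minpair's implementation: the names `segment` and `minimal_pairs` are hypothetical, longest-match segmentation is an assumption, and the sketch only compares words with the same number of segments (substitutions, not insertions or deletions).

```python
from itertools import combinations

def segment(word, multigraphs=()):
    """Split a word into segments, treating listed character sequences as single segments."""
    # Try longer multigraphs first; ignore empty strings to avoid an endless loop.
    ordered = sorted((m for m in multigraphs if m), key=len, reverse=True)
    segments, i = [], 0
    while i < len(word):
        for m in ordered:
            if word.startswith(m, i):
                segments.append(m)
                i += len(m)
                break
        else:
            segments.append(word[i])
            i += 1
    return segments

def minimal_pairs(words, multigraphs=(), differences=1):
    """Return pairs of words whose segment lists differ in exactly `differences` positions."""
    segmented = [(w, segment(w, multigraphs)) for w in words]
    pairs = []
    for (w1, s1), (w2, s2) in combinations(segmented, 2):
        if len(s1) != len(s2):
            continue
        if sum(a != b for a, b in zip(s1, s2)) == differences:
            pairs.append((w1, w2))
    return pairs

words = ["pat", "bat", "pit", "ship", "sip"]
print(minimal_pairs(words, multigraphs=["sh"]))
# [('pat', 'bat'), ('pat', 'pit'), ('ship', 'sip')]
```

Passing `differences=2` gives the pairs that differ in exactly two positions. The pairwise comparison is quadratic in the number of words, which is fine for a sketch but would be slow on a large lexicon.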

xlit (updated 30 Dec 2008)

Pop 68.50
Vit 5.57

Xlit converts text from one writing system into another. It allows the user to define a transliteration simply by typing the input strings in one window and the strings to which they are to be mapped in another. Transliteration may be restricted to regions bounded by specified delimiters or their complements. Transliteration may also be performed by external commands or plugins. Xlit can also convert one type of delimiter to another, e.g. from HZ escapes to XML. Xlit can read and write transliteration definitions in its own format and as Yudit keymaps. It can be run in batch mode without the GUI.
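
A table-driven transliteration of the kind described above might look like the following Python sketch. It is not Xlit's code or file format: the function name `transliterate` is hypothetical, and the choice to prefer the longest matching input string is an assumption made here for illustration.

```python
def transliterate(text, mapping):
    """Rewrite text using a table of input strings mapped to output strings."""
    # Prefer longer input strings so that "sh" wins over "s" (assumed policy).
    keys = sorted((k for k in mapping if k), key=len, reverse=True)
    out, i = [], 0
    while i < len(text):
        for k in keys:
            if text.startswith(k, i):
                out.append(mapping[k])
                i += len(k)
                break
        else:
            out.append(text[i])  # characters with no mapping pass through unchanged
            i += 1
    return "".join(out)

# A tiny Latin-to-Cyrillic table, written with escapes to avoid look-alike characters.
latin_to_cyrillic = {"sh": "\u0448", "s": "\u0441", "a": "\u0430"}
print(transliterate("sasha", latin_to_cyrillic))
```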

WordGenerator (updated 12 Dec 2008)

Pop 69.20
Vit 4.11

WordGenerator generates hypothetical words from specifications of their syllable structure. The user specifies the maximum length of the words in syllables, the abstract structure of syllables in the language (in terms of such units as consonants and vowels or onsets and rhymes), and the actual sounds that comprise each abstract class (e.g. the list of vowels in the language); WordGenerator then generates the words that conform to this specification. Such lists are useful to field linguists exploring the vocabulary of a language, and to designers of artificial languages.
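
The specification described above (maximum length in syllables, abstract syllable structure, and the sounds in each class) can be sketched in Python as follows. This is not WordGenerator's code or input format: the function names, the single-character class symbols, and the template strings are all assumptions made for illustration.

```python
from itertools import product

def syllables(templates, classes):
    """Expand each abstract template (e.g. "CV") into all concrete syllables."""
    result = []
    for template in templates:
        choices = [classes[symbol] for symbol in template]
        result.extend("".join(combo) for combo in product(*choices))
    return result

def generate_words(max_syllables, templates, classes):
    """List every word of 1..max_syllables syllables allowed by the templates."""
    inventory = syllables(templates, classes)
    words = []
    for n in range(1, max_syllables + 1):
        words.extend("".join(combo) for combo in product(inventory, repeat=n))
    # Different syllabifications can spell the same word; keep the first occurrence.
    return list(dict.fromkeys(words))

classes = {"C": ["p", "t"], "V": ["a", "i"]}
words = generate_words(2, ["CV"], classes)
print(words[:4])   # ['pa', 'pi', 'ta', 'ti']
print(len(words))  # 20
```

The output grows exponentially with the number of syllables, so a realistic sound inventory would call for a generator or a filter rather than a full list.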
