193 projects tagged "Linguistic"

Grok (download and website available; updated 05 Dec 2001)

Pop 53.74
Vit 2.37

Grok is a library of Java components for performing various natural language tasks. These include several preprocessing tasks, chart parsing, a large categorial grammar for English (induced from the Penn Treebank), and some knowledge representation components (basic coreference, salience tracking, etc.). The library also has a companion kit that provides a GUI for the components. Several of the components implement interfaces from the Quipu OpenNLP API.

The Quipu OpenNLP API (download and website available; updated 05 Dec 2001)

Pop 36.67
Vit 2.37

The Quipu OpenNLP API is a preliminary collection of Java interfaces for standardizing how natural language processing components interact.

The Quipu Maximum Entropy Package (download and website available; updated 04 Apr 2003)

Pop 38.26
Vit 2.50

Maximum entropy is a powerful method for constructing statistical models of classification tasks, such as part-of-speech tagging in natural language processing. The Quipu Maximum Entropy Package is a Java implementation of the maximum entropy framework. It allows you to train, evaluate, and use maxent models.
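To make the train-and-use cycle concrete, here is a minimal Python sketch of a maximum entropy classifier for a toy tagging task. It is not the package's API: the function names, the feature strings, and the four training samples are all hypothetical, and it trains with plain gradient ascent on the log-likelihood, which may differ from the training procedure the package uses.

```python
import math
from collections import defaultdict

LABELS = ["NOUN", "VERB"]  # hypothetical tag set for the toy example

def predict(weights, labels, feats):
    """Return P(label | feats) under a log-linear (maxent) model."""
    scores = {y: sum(weights[(f, y)] for f in feats) for y in labels}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}

def train(samples, labels, epochs=200, lr=0.5):
    """Fit one weight per (feature, label) pair by gradient ascent."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in samples:
            probs = predict(weights, labels, feats)
            for f in feats:
                for y in labels:
                    # Gradient: observed feature count minus expected count.
                    grad[(f, y)] += (1.0 if y == gold else 0.0) - probs[y]
        for key, g in grad.items():
            weights[key] += lr * g / len(samples)
    return weights

# Made-up training data: contextual features paired with a tag.
samples = [
    (["suffix=ing", "prev=is"], "VERB"),
    (["suffix=ed", "prev=has"], "VERB"),
    (["suffix=s", "prev=the"], "NOUN"),
    (["suffix=none", "prev=a"], "NOUN"),
]
weights = train(samples, LABELS)
probs = predict(weights, LABELS, ["suffix=ing", "prev=was"])
```

In this sketch an unseen feature such as `prev=was` simply contributes a weight of zero, so the prediction rests on the features the model did see during training.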

WorldPrint (download and website available; updated 14 May 2005)

Pop 47.25
Vit 4.84

WorldPrint is a filter for the PostScript output of Mozilla (Galeon, etc.), Htmldoc, and Netscape. It uses TrueType fonts so that pages written in Unicode, Big5, SJIS, KOI-8, ISO-8859-*, and other charsets can be printed.

polcnv (download and website available; updated 06 Sep 2003)

Pop 16.43
Vit 1.15

polcnv is designed to convert files between the different encodings used for Polish texts. It can also be used to convert plain text documents in any language, provided they use a supported character encoding. The program uses ISO-10646 UCS-4 (equivalent to Unicode UTF-32) as its internal representation.
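The pivot approach described above can be sketched in a few lines of Python. This is only an illustration of the idea, not polcnv's own code: the function names are hypothetical, the sample sentence is made up, and Python's built-in codecs stand in for the program's conversion tables.

```python
def to_ucs4(data: bytes, src: str) -> bytes:
    """Decode bytes in the source encoding into a UTF-32 (UCS-4) pivot."""
    return data.decode(src).encode("utf-32-be")

def from_ucs4(pivot: bytes, dst: str) -> bytes:
    """Encode the UTF-32 (UCS-4) pivot into the target encoding."""
    return pivot.decode("utf-32-be").encode(dst)

def convert(data: bytes, src: str, dst: str) -> bytes:
    """Convert between two encodings through the UCS-4 pivot."""
    return from_ucs4(to_ucs4(data, src), dst)

text = "Zażółć gęślą jaźń"            # Polish sample with diacritics
latin2 = text.encode("iso8859_2")      # ISO-8859-2 input bytes
pivot = to_ucs4(latin2, "iso8859_2")   # four bytes per character
cp1250 = convert(latin2, "iso8859_2", "cp1250")
```

With a single pivot format, each supported encoding needs only one decoder and one encoder, rather than a dedicated converter for every pair of encodings.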

Mueller English-Russian Dictionary Kit (no download; website available; updated 23 Feb 2000)

Pop 28.93
Vit 72.17

This is the first respectable English-Russian dictionary with IPA transcription released under the GNU GPL. The dictionary has 46,233 entries and is 5.5 MB in size. "MOVA" is a set of Bash and Tcl/Tk scripts for dictionary management under UNIX.

ANTLR (download and website available; updated 11 Feb 2013)

Pop 285.53
Vit 5.42

ANTLR (ANother Tool for Language Recognition) is a language tool that provides a framework for constructing recognizers, compilers, and translators from grammatical descriptions containing C++, Java, or Sather actions. It is similar to the popular compiler generator YACC, but it is more powerful and easier to use. ANTLR-produced parsers are not only highly efficient but also human-readable and human-debuggable (especially with the interactive ParseView debugging tool). ANTLR can generate parsers, lexers, and tree parsers in C++, Java, or Sather. ANTLR itself is currently written in Java.

Diogenes (download and website available; updated 22 Oct 2007)

Pop 110.93
Vit 4.69

Diogenes is a tool for searching and browsing the Latin and ancient Greek texts published on CD-ROM by the Packard Humanities Institute and the Thesaurus Linguae Graecae. It comes as an easy-to-install stand-alone application for GNU/Linux, Mac OS X, and Windows, based on the Firefox browser (i.e. XULRunner). Alternatively, a network administrator can install it as a server on a local network, and users then access it through an ordinary Web browser. There is also a command-line tool that can optionally format output as LaTeX instead of HTML.

xgrk (download and website available; updated 18 Dec 2010)

Pop 22.69
Vit 1.09

xgrk lets you switch the keyboard mapping with alt-shift or meta-shift key combinations, or by clicking on the flag image. This makes it possible to write Greek in X programs such as Netscape or xedit. Keycodes are loaded automatically on startup, so it should work with all Unix variants and keyboard layouts. Fonts are not included.

yawl (download available; no website; updated 06 May 2014)

Pop 106.16
Vit 3.50

This is a comprehensive "word game" word list for UNIX/Linux. It is a superset of the author's ENABLE list, the "OSW", and various lists researched by the author's colleague, Alan Beale. At 264,093 words, it is the largest list of its kind, suitable for all manner of crossword-type board games and word construction games, as well as for use as a spell checker dictionary. The YAWL package now includes two anagramming utilities (supplied as source code and built by the included Makefile). There is also a shell script that extends the UNIX "strings" system command. This is the word list package recommended for the author's Quackey word game.
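As a rough illustration of how a word list like this can drive anagram lookup, the Python sketch below indexes words by their sorted letters. It does not reproduce the package's own utilities: the function names and the five-word sample list are hypothetical.

```python
from collections import defaultdict

def build_index(words):
    """Group words under a key made of their letters in sorted order."""
    index = defaultdict(list)
    for word in words:
        index["".join(sorted(word))].append(word)
    return index

def anagrams(index, word):
    """Return every indexed word that uses the same letters as `word`."""
    key = "".join(sorted(word))
    return [w for w in index.get(key, []) if w != word]

# Tiny stand-in for the real word list, which would be read from a file.
words = ["listen", "silent", "enlist", "tinsel", "google"]
index = build_index(words)
found = sorted(anagrams(index, "listen"))
```

Building the index costs one pass over the list, after which each lookup is a single dictionary access.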
