12 projects tagged "Linguistic"

No download Website Updated 21 Sep 2004 Atlantida

Pop 15.23
Vit 56.24

Atlantida is a multilingual cross-platform dictionary. Currently it has 310,000 definitions, and knows how to pronounce 21,000 English words.

Download Website Updated 06 Oct 2004 Dowser

Pop 62.75
Vit 2.24

Dowser is a Web research and archiving tool that clusters results from search engines, associates words that appear in previous searches, and keeps a local cache of all the results you click on in a searchable database along with summaries and links to related information. It helps you to keep track of what you find, with no advertising.

Download Website Updated 15 Mar 2005 Ellogon

Pop 65.20
Vit 1.83

Ellogon is a multilingual, cross-platform, general-purpose language engineering environment, developed to aid both researchers in computational linguistics and companies that produce and deliver language engineering systems. As a language engineering platform, it offers an extensive set of facilities, including tools for processing and visualising textual/HTML/XML data and associated linguistic information, support for lexical resources (such as creating and embedding lexicons), and tools for creating annotated corpora, accessing databases, comparing annotated data, and transforming linguistic information into vectors for use with various machine learning algorithms.

Download Website Updated 29 Dec 2007 GNU Talk Filters

Pop 135.48
Vit 4.98

The GNU Talk Filters are filter programs that convert ordinary English text into text that mimics a stereotyped or otherwise humorous dialect. Some of these filters have been in the public domain for many years, but here they are provided as a single integrated package. The filters include austro, b1ff, brooklyn, chef, cockney, drawl, dubya, fudd, funetak, jethro, jive, kraut, pansy, pirate, postmodern, redneck, valspeak, and warez. This package provides the filters both as individual executables and collectively as a C library, so they can be easily embedded in other programs.

Download Website Updated 19 Sep 2004 HumAn Language GENerator

Pop 72.80
Vit 1.52

HALoGEN is a powerful, easy-to-use, general-purpose natural language generation system. It consists of a symbolic generator, a forest ranker, and some sample inputs. The symbolic generator includes the Sensus Ontology dictionary based on WordNet. The forest ranker includes a 250-million-word n-gram language model (unigram, bigram, and trigram) trained on Wall Street Journal newspaper text. The symbolic generator is written in Lisp and requires a Lisp interpreter.
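To illustrate the general idea of ranking candidate sentences with an n-gram language model, here is a minimal Python sketch. It is not HALoGEN's code or API: the function names, the three-sentence corpus, the use of bigrams only, and the add-one smoothing are all assumptions made for illustration. It also scores a flat list of candidates rather than a packed forest.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Count unigrams and bigrams over sentences padded with <s> and </s>."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Log-probability of a sentence under an add-one-smoothed bigram model."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    vocab_size = len(unigrams)
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Add-one smoothing (an assumption of this sketch) keeps unseen
        # bigrams from producing a probability of zero.
        numerator = bigrams[(prev, word)] + 1
        denominator = unigrams[prev] + vocab_size
        total += math.log(numerator / denominator)
    return total


def rank(candidates, unigrams, bigrams):
    """Return the candidates ordered from most to least probable."""
    return sorted(
        candidates,
        key=lambda s: log_prob(s, unigrams, bigrams),
        reverse=True,
    )


# Hypothetical toy corpus standing in for real training text.
CORPUS = ["the market rose", "the market fell", "stocks rose sharply"]

if __name__ == "__main__":
    uni, bi = train_bigram(CORPUS)
    for candidate in rank(["market the rose", "the market rose"], uni, bi):
        print(candidate)
```

A generator that produces several alternative word orders can use a ranker of this kind to prefer the one that most resembles its training text.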

Download Website Updated 08 Oct 2003 LinkGrammar-WN

Pop 63.09
Vit 1.00

LinkGrammar-WN is a lexicon expansion for the Link Grammar Parser (LGP). The Link Grammar Parser is a syntactic parser of the English language that is capable of handling a wide variety of syntactic constructions and is considered quite robust. The LinkGrammar-WN project aims to import lexical information from WordNet in an effort to increase the size of the LGP lexicon. This project should interest anyone working on NLP (natural language parsing) of English text.

Download Website Updated 03 Feb 2004 PolyGen

Pop 29.60
Vit 58.27

PolyGen is a program for generating random sentences according to a grammar definition, that is, by following custom syntactic and lexical rules. Formally, it is an interpreter for a language that is itself designed to define languages, where interpreting means executing a source program in real time and eventually outputting its result. Here, a source program is a grammar definition. Execution consists of exploring that grammar along a randomly selected path, and the result is the sentence built along the way.
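The idea of exploring a grammar along a random path can be sketched in a few lines of Python. This is not PolyGen's grammar syntax or implementation: the dictionary-based grammar, the function names, and the uniform choice between alternatives are assumptions made for illustration.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to a list of alternative
# productions. Any symbol that is not a key is treated as a terminal word.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["a", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "N": [["parser"], ["lexicon"], ["sentence"]],
    "V": [["generates"], ["reads"], ["sleeps"]],
}


def generate(symbol, grammar, rng):
    """Expand a symbol into a list of words by following one random path."""
    if symbol not in grammar:
        return [symbol]  # terminal: emit the word itself
    production = rng.choice(grammar[symbol])  # pick one alternative uniformly
    words = []
    for part in production:
        words.extend(generate(part, grammar, rng))
    return words


def random_sentence(grammar=GRAMMAR, start="S", seed=None):
    """Build one sentence from the start symbol; a seed makes it repeatable."""
    rng = random.Random(seed)
    return " ".join(generate(start, grammar, rng))


if __name__ == "__main__":
    print(random_sentence())
```

Each call walks the grammar from the start symbol, and the words collected along the chosen path form the output sentence. Passing a seed makes a run repeatable, which is convenient when testing a grammar.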

Download Website Updated 22 Jan 2004 PyBabelPhish

Pop 40.21
Vit 1.81

PyBabelPhish is a GTK-based program providing fast translations from one natural language to another. Texts translated to Spanish can be read aloud in Spanish through optional text-to-speech support.

No download Website Updated 12 Feb 2013 TAMS Analyzer

Pop 285.10
Vit 66.26

TAMS (Text Analysis Markup System) Analyzer is a qualitative or ethnographic coding and data extraction-analysis system.

Download Website Updated 25 May 2004 hiercat

Pop 28.57
Vit 1.00

Hiercat is an automatic text classifier that uses a keyword hierarchy to improve categorization. It uses the generative probabilistic model of Gaussier et al. (2001) as its document model.
