13 projects tagged "Linguistic"

polcnv (updated 06 Sep 2003)
Pop 16.06
Vit 1.15

polcnv is designed to convert files between the different encoding methods used for Polish texts. It can also be used to convert plain text documents in any language that uses one of the supported character encodings. The program uses ISO-10646 UCS-4 (equivalent to Unicode UTF-32) as its internal representation.
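The idea of converting through a single internal representation can be sketched in a few lines. This is not polcnv's own code; it is a minimal illustration that assumes Python's standard codecs (`iso8859_2`, `cp1250`, `utf-32-be`), and the function names and sample sentence are made up for the example.

```python
def to_code_points(data: bytes, source_encoding: str) -> str:
    # Decode the legacy bytes into Unicode code points. A Python str plays
    # the role of the UCS-4 internal representation here.
    return data.decode(source_encoding)


def convert(data: bytes, source_encoding: str, target_encoding: str) -> bytes:
    # Every conversion goes source -> code points -> target, so adding a new
    # encoding needs one decoder and one encoder, not one per pair.
    return to_code_points(data, source_encoding).encode(target_encoding)


# Hypothetical sample text containing Polish diacritics.
sample = "Zażółć gęślą jaźń"

latin2_bytes = sample.encode("iso8859_2")                    # ISO 8859-2 input
cp1250_bytes = convert(latin2_bytes, "iso8859_2", "cp1250")  # Windows-1250 output

# The explicit UCS-4 / UTF-32 form: four bytes per character, no byte order mark.
ucs4_bytes = sample.encode("utf-32-be")
```

Several Polish letters sit at different byte values in the two legacy encodings, which is why the byte strings differ even though the decoded text is identical.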

WorldPrint (updated 14 May 2005)
Pop 47.72
Vit 4.84

WorldPrint is a filter for Mozilla (Galeon, etc.), Htmldoc, and Netscape PostScript output that uses TrueType fonts to allow the printing of pages written in Unicode, Big5, SJIS, KOI-8, ISO-8859*, and other charsets.

SyNTeX - Syntactic tree drawing program (updated 24 Mar 2001)
Pop 58.22
Vit 69.13

SyNTeX is a LaTeX preprocessor that draws syntactic trees using the LaTeX picture environment. The preprocessor reads the comments in a LaTeX file and draws the tree based on commands that it finds in the comments.

Linguaphile (updated 22 Jul 2002)
Pop 63.34
Vit 1.49

Linguaphile is a simple command-line language translator. It is open source, platform independent, and written in Perl. Linguaphile currently supports the following languages: Afrikaans, Alawa, Albanian, Arrernte, Basque, Belarusian, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Finnish, French, Galician, German, Greek, Hawaiian, Hungarian, Icelandic, Indonesian, Interlingua, Irish, Italian, Kala Lagaw Ya, Korean, Kriol, Latvian, Lithuanian, Malay, Maltese, Maori, Norwegian, Pitjantjatjara, Polish, Portuguese, Romanian, Russian, Samoan, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Thai, Tok Pisin, Turkish, Ukrainian, Warlpiri, and Welsh. The Spanish to English translation is the most useful at this stage.

Mguesser (updated 06 Mar 2008)
Pop 31.37
Vit 2.51

Mguesser is a tool for guessing a text's character set and language. It is a standalone part of the mnoGoSearch engine. More than 100 character set and language combinations are supported.

respell (updated 18 Jun 2002)
Pop 18.44
Vit 1.00

Respell converts English text between the American, British, and Canadian spelling conventions. It prompts the user when more than one target spelling could be chosen for a source word. It can also create a 'universal' spelling that can be automatically converted to any of the three without loss of information.

Marko (updated 03 Nov 2002)
Pop 30.63
Vit 1.42

Marko is a simple toolset for creating Markov chain databases from a corpus (or two) of text and then comparing unknown texts against these databases. For any two Marko databases, you can calculate the probability that the unknown text is related to one rather than the other. Possible applications include intelligent mail filtering, plagiarism detection, and historical research.
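The comparison described above can be illustrated with a small sketch. It is not Marko's implementation or file format. It assumes character-level first-order chains, add-one smoothing, and equal prior weight for the two databases, and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_model(corpus: str):
    # Count how often each character is followed by each other character.
    counts = defaultdict(Counter)
    for current, following in zip(corpus, corpus[1:]):
        counts[current][following] += 1
    return counts


def log_likelihood(model, text: str, alphabet_size: int = 256) -> float:
    # Sum of log transition probabilities, with add-one smoothing so that
    # unseen transitions get a small non-zero probability.
    total = 0.0
    for current, following in zip(text, text[1:]):
        row = model.get(current, Counter())
        total += math.log((row[following] + 1) / (sum(row.values()) + alphabet_size))
    return total


def probability_first(model_a, model_b, text: str) -> float:
    # Probability that the text came from model_a rather than model_b,
    # assuming both are equally likely beforehand (logistic of the difference).
    diff = log_likelihood(model_a, text) - log_likelihood(model_b, text)
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


# Toy corpora, chosen only so that the two models differ clearly.
model_ab = build_model("ab" * 20)
model_cd = build_model("cd" * 20)
```

With these toy models, `probability_first(model_ab, model_cd, "abab")` is close to 1 and `probability_first(model_ab, model_cd, "cdcd")` is close to 0. Real corpora would need far more text for the estimate to mean anything.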

dbacl (updated 26 Mar 2006)
Pop 174.65
Vit 4.91

dbacl is a digramic Bayesian text classifier. Given some text, it calculates the posterior probabilities that the input resembles one of any number of previously learned document collections. It can be used to sort incoming email into arbitrary categories such as spam, work, and play, or simply to distinguish an English text from a French text. It fully supports international character sets, and uses sophisticated statistical models based on the Maximum Entropy Principle.
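To make "posterior probabilities over previously learned collections" concrete, here is a deliberately simplified sketch. It is not dbacl's model: it swaps in a plain word-unigram naive Bayes classifier with add-one smoothing and a uniform prior in place of dbacl's digram and maximum entropy machinery, and the category names and training sentences are invented.

```python
import math
from collections import Counter


def train(documents_by_category):
    # One bag-of-words count table per category.
    return {
        category: Counter(word for doc in docs for word in doc.lower().split())
        for category, docs in documents_by_category.items()
    }


def posteriors(models, text: str):
    # Shared vocabulary size (plus one slot for unknown words) for smoothing.
    vocabulary = set().union(*models.values())
    v = len(vocabulary) + 1
    words = text.lower().split()

    # Log-likelihood of the text under each category; the prior is uniform.
    scores = {}
    for category, counts in models.items():
        n = sum(counts.values())
        scores[category] = sum(math.log((counts[w] + 1) / (n + v)) for w in words)

    # Normalise in a numerically stable way so the posteriors sum to 1.
    top = max(scores.values())
    weights = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(weights.values())
    return {c: w / z for c, w in weights.items()}


# Tiny invented training collections.
models = train({
    "english": ["the cat sat on the mat", "the dog is in the house"],
    "french": ["le chat est sur le tapis", "le chien est dans la maison"],
})
result = posteriors(models, "the dog sat on the mat")
```

Here `result` maps each category to a probability, and the `"english"` entry dominates for the sample sentence. Any number of categories can be passed to `train`, matching the "one of any number of collections" wording above.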

DadaDodo (updated 09 Feb 2003)
Pop 43.45
Vit 1.00

DadaDodo is a program that analyses texts for word probabilities and then generates random cut-up sentences based on them. It is a travesty generator similar to Dissociated Press, but based on a Markov chain of length 1.
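A Markov chain of length 1 over words means that each generated word depends only on the word before it. The following is a minimal sketch of that technique, not DadaDodo's code; the function names and the sample text are assumptions made for the example.

```python
import random
from collections import defaultdict


def build_chain(text: str):
    # Map each word to the list of words that followed it in the source.
    # Repeated followers stay in the list, so frequent pairs are more likely.
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start: str, length: int, rng: random.Random) -> str:
    # Walk the chain: pick each next word at random from the followers of
    # the previous word, stopping early if a word has no followers.
    output = [start]
    for _ in range(length - 1):
        followers = chain.get(output[-1])
        if not followers:
            break
        output.append(rng.choice(followers))
    return " ".join(output)


# Invented source text for the example.
source = "the cat sat on the mat and the dog sat on the log near the cat"
chain = build_chain(source)
sentence = generate(chain, "the", 10, random.Random(0))
```

Every adjacent word pair in `sentence` also occurs somewhere in `source`, which is what gives the output its locally plausible, globally nonsensical character.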

GNU Talk Filters (updated 29 Dec 2007)
Pop 110.63
Vit 4.80

The GNU Talk Filters are filter programs that convert ordinary English text into text that mimics a stereotyped or otherwise humorous dialect. Some of these filters have been in the public domain for many years, but here they are provided as a single integrated package. The filters include austro, b1ff, brooklyn, chef, cockney, drawl, dubya, fudd, funetak, jethro, jive, kraut, pansy, pirate, postmodern, redneck, valspeak, and warez. This package provides the filters both as individual executables and collectively as a C library, so they can be easily embedded in other programs.
