193 projects tagged "Linguistic"

Download Website Updated 22 Dec 2009 Booleano

Pop 23.75
Vit 42.87

Booleano is an interpreter of Boolean expressions: a library to define and run filters expressed as text (e.g. in a natural language) or in Python code. To handle text-based filters, Booleano ships with a fully-featured parser whose grammar is adaptive: its properties can be overridden using simple configuration directives. The library also exposes a Pythonic API for filters written in pure Python. These filters are particularly useful for building reusable conditions from objects provided by a third-party library.
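As a rough illustration of the "reusable conditions" idea in pure Python, filters can be modelled as composable predicates. This is not Booleano's API: the `Condition` class and the example conditions below are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Condition:
    """Hypothetical wrapper around a predicate; not part of Booleano."""

    test: Callable[[Any], bool]

    def __call__(self, item: Any) -> bool:
        return self.test(item)

    def __and__(self, other: "Condition") -> "Condition":
        # Both conditions must hold.
        return Condition(lambda item: self(item) and other(item))

    def __or__(self, other: "Condition") -> "Condition":
        # At least one condition must hold.
        return Condition(lambda item: self(item) or other(item))

    def __invert__(self) -> "Condition":
        # Logical negation of the condition.
        return Condition(lambda item: not self(item))


# Small reusable building blocks, combined into a larger filter.
is_even = Condition(lambda n: n % 2 == 0)
is_large = Condition(lambda n: n > 10)
even_and_small = is_even & ~is_large

selected = [n for n in range(20) if even_and_small(n)]
```

Each combined condition is itself a `Condition`, so it can be stored and reused in further expressions.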

No download Website Updated 12 Feb 2013 TAMS Analyzer

Pop 177.49
Vit 31.58

TAMS (Text Analysis Markup System) Analyzer is a coding, data extraction, and analysis system for qualitative or ethnographic research.

Download Website Updated 10 Jan 2014 pyPEG

Pop 187.02
Vit 22.79

pyPEG is a quick and easy solution for creating a parser in Python programs. Grammars are written as a PEG expressed in Python data structures, so they can be built dynamically to parse nearly every context-free language. The output is a plain Python data structure called pyAST or, alternatively, XML.
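To make the idea of a grammar held in Python data structures concrete, here is a minimal sketch of a PEG-style matcher. It does not use pyPEG or its notation: the conventions (compiled regex as terminal, tuple as sequence, list as ordered choice) and the `parse` function are assumptions made for this example.

```python
import re


def parse(rule, text, pos=0):
    """Match `rule` against `text` at `pos`.

    Returns (tree, new_pos) on success, or None on failure.
    """
    if isinstance(rule, re.Pattern):  # terminal: a compiled regex
        match = rule.match(text, pos)
        return (match.group(), match.end()) if match else None
    if isinstance(rule, tuple):  # sequence: every part must match in order
        nodes = []
        for part in rule:
            result = parse(part, text, pos)
            if result is None:
                return None
            node, pos = result
            nodes.append(node)
        return nodes, pos
    if isinstance(rule, list):  # ordered choice: first matching alternative wins
        for alternative in rule:
            result = parse(alternative, text, pos)
            if result is not None:
                return result
        return None
    raise TypeError(f"unsupported rule type: {type(rule).__name__}")


# The grammar is ordinary data, so it can be assembled at run time.
name = re.compile(r"[A-Za-z_]\w*")
number = re.compile(r"\d+")
equals = re.compile(r"\s*=\s*")
assignment = (name, equals, [number, name])

tree, end = parse(assignment, "x = 42")
```

Because the grammar is plain tuples and lists, a program can extend or replace rules before parsing, which is the property the description above refers to as dynamic use.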

Download Website Updated 05 Oct 2013 Apache Lucene

Pop 287.97
Vit 19.47

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is suitable for nearly any application that requires full-text search, especially cross-platform applications.

No download Website Updated 05 Oct 2013 Apache Solr

Pop 163.58
Vit 12.31

Solr is an enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g. Word and PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites. Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture when more advanced customization is required.
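Because the API is plain HTTP, a client only needs to build a URL. The sketch below constructs a query URL for Solr's `/select` handler with JSON output and sends nothing over the network. The host, port, core name `articles`, and field name `title` are assumptions for illustration, and the exact URL path depends on how a given Solr installation is configured.

```python
from urllib.parse import urlencode


def build_select_url(base_url: str, core: str, query: str, rows: int = 10) -> str:
    """Build a query URL for a (hypothetical) Solr core's /select handler."""
    params = {
        "q": query,    # the search query
        "rows": rows,  # maximum number of results to return
        "wt": "json",  # response writer: ask for JSON output
    }
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


# Assumed local installation and core name; adjust for a real deployment.
url = build_select_url("http://localhost:8983/solr", "articles", "title:lucene")
```

The resulting URL can be fetched with any HTTP client in any language.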

Download Website Updated 15 May 2011 uni2ascii

Pop 187.87
Vit 12.06

uni2ascii and ascii2uni provide conversion in both directions between UTF-8 Unicode and more than thirty 7-bit ASCII equivalents, including RFC 2396 URI format and RFC 2045 Quoted Printable format, the representations used in HTML, SGML, XML, OOXML, the Unicode standard, Rich Text Format, POSIX portable charmaps, POSIX locale specifications, and Apache log files. They can also convert between the escapes used for Unicode in languages such as Ada, C, Common Lisp, Java, Pascal, Perl, PostScript, Python, Scheme, and Tcl.
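To show what a few of these 7-bit representations look like, here is a small sketch using only the Python standard library. It does not invoke uni2ascii or reproduce its options; the function names are hypothetical.

```python
from urllib.parse import quote, unquote


def to_u_escapes(text: str) -> str:
    """Replace non-ASCII characters with \\uXXXX or \\UXXXXXXXX escapes."""
    parts = []
    for ch in text:
        code_point = ord(ch)
        if code_point < 0x80:
            parts.append(ch)  # already 7-bit ASCII
        elif code_point <= 0xFFFF:
            parts.append(f"\\u{code_point:04X}")
        else:
            parts.append(f"\\U{code_point:08X}")
    return "".join(parts)


def to_html_refs(text: str) -> str:
    """Replace non-ASCII characters with decimal numeric character references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def to_uri(text: str) -> str:
    """Percent-encode the UTF-8 bytes of the text, as in URIs."""
    return quote(text, safe="")


word = "Føroyar"
escaped = to_u_escapes(word)
html = to_html_refs(word)
uri = to_uri(word)
round_trip = unquote(uri)  # the reverse direction for the URI form
```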

Download Website Updated 11 Jan 2010 msort

Pop 180.88
Vit 11.35

Msort sorts files in sophisticated ways. Records may be fixed size, newline-separated blocks, or terminated by any specified character. Key fields may be selected by position, tag, or character range. For each key, distinct exclusions, multigraphs, substitutions, and a sort order may be defined, or locale collation rules may be used. Comparisons may be lexicographic, numeric, numeric string, hybrid, random, by string length, angle, domain name, date, time, month name, or ISO 8601 timestamp. Keys may be reversed so as to generate reverse dictionaries. Optional keys are supported. Unicode is supported, including full case-folding. Msort itself has a somewhat complex command line interface, but may be driven by an optional GUI.
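A few of the comparison types listed above can be sketched with Python's built-in `sorted`. This only illustrates the ideas (length, numeric, and reversed keys); it is not msort's command line or its implementation, and the sample data is made up.

```python
words = ["sorting", "casting", "ring", "cat", "bat"]

# Comparison by string length, with the word itself as a tie-breaker.
by_length = sorted(words, key=lambda w: (len(w), w))

# Reversed keys: comparing words from their last character backwards
# groups words by ending, which is how a reverse dictionary is ordered.
reverse_dictionary = sorted(words, key=lambda w: w[::-1])

# Numeric comparison of fields that are stored as text.
records = ["10", "9", "100"]
lexicographic = sorted(records)
numeric = sorted(records, key=int)
```

Here `lexicographic` places "10" and "100" before "9", while `numeric` orders the same records by value.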

Download Website Updated 12 Sep 2008 Redet

Pop 219.34
Vit 10.98

Redet is a tool for developing and executing regular expressions using any of more than 50 search programs, editors, and programming languages, intended both for developing regular expressions for use elsewhere and as a search tool in its own right. For each program in each locale, a palette showing the available constructs is provided. The properties of each program are determined by runtime tests, which guarantees that they will be correct for the program version and locale. Additional features include persistent history, extensive help, a variety of character entry tools, and the ability to change locale while running. Redet is highly configurable and fully supports Unicode.

Download Website Updated 05 May 2011 Faroese Spell Checking Dictionary

Pop 24.19
Vit 10.97

This Faroese spell checking dictionary is intended to be used with programs like aspell and ispell.

Download Website Updated 21 Mar 2013 JOrtho

Pop 71.25
Vit 8.06

JOrtho is a spell checker for Java. The library works with any JTextComponent from the Swing framework and checks as you type. The dictionary is based on the free Wiktionary.org and is available for multiple languages. JOrtho's features include highlighting of potentially misspelled words, a context menu with suggestions for correct forms of the word, and a context menu with an option to change the checking language. At the moment, nine languages are available for spell checking: English, German, French, Spanish, Italian, Russian, Polish, Dutch, and Arabic.
