RSS 91 projects tagged "Linguistic"

Download Website Updated 11 Feb 2013 ANTLR

Screenshot
Pop 297.05
Vit 5.52

ANTLR (ANother Tool for Language Recognition) is a language tool that provides a framework for constructing recognizers, compilers, and translators from grammatical descriptions containing C++, Java, or Sather actions. It is similar to the popular compiler generator YACC, however ANTLR is much more powerful and easy to use. ANTLR-produced parsers are not only highly efficient, but are both human-readable and human-debuggable (especially with the interactive ParseView debugging tool). ANTLR can generate parsers, lexers, and tree-parsers in either C++, Java, or Sather. ANTLR is currently written in Java.

Download Website Updated 05 Oct 2013 Apache Lucene

Screenshot
Pop 259.46
Vit 21.14

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is suitable for nearly any application that requires full-text search, especially cross-platform.

Download Website Updated 12 Sep 2008 Redet

Screenshot
Pop 227.49
Vit 11.02

Redet is a tool for developing and executing regular expressions using any of more than 50 search programs, editors, and programming languages, intended both for developing regular expressions for use elsewhere and as a search tool in its own right. For each program in each locale, a palette showing the available constructs is provided. The properties of each program are determined by runtime tests, which guarantees that they will be correct for the program version and locale. Additional features include persistent history, extensive help, a variety of character entry tools, and the ability to change locale while running. Redet is highly configurable and fully supports Unicode.

Download Website Updated 10 Jan 2014 pyPEG

Screenshot
Pop 191.12
Vit 26.48

pyPEG is a quick and easy solution for creating a parser in Python programs. pyPEG uses a PEG language in Python data structures to parse, so it can be used dynamically to parse nearly every context free language. The output is a plain Python data structure called pyAST, or, as an alternative, XML.

Download Website Updated 11 Jan 2010 msort

Screenshot
Pop 184.92
Vit 11.42

Msort sorts files in sophisticated ways. Records may be fixed size, newline-separated blocks, or terminated by any specified character. Key fields may be selected by position, tag, or character range. For each key, distinct exclusions, multigraphs, substitutions, and a sort order may be defined or locale collation rules used. Comparisons may be lexicographic, numeric, numeric string, hybrid, random, by string length, angle, domain name, date, time, month name, or ISO8601 timestamp. Keys may be reversed so as to generate reverse dictionaries. Optional keys are supported. Unicode is supported, including full case-folding. Msort itself has a somewhat complex command line interface, but may be driven by an optional GUI.

Download Website Updated 26 May 2011 International Components for Unicode (C/C++)

Screenshot
Pop 184.79
Vit 12.37

ICU provides a Unicode implementation, with functions for formatting numbers, dates, times, and currencies (according to locale conventions, transliteration, and parsing text in those formats). It provides flexible patterns for formatting messages, where the pattern determines the order of the variable parts of the messages, and the format for each of those variables. These patterns can be stored in resource files for translation to different languages. Included are more than 100 codepage converters for interaction with non-unicode systems.

No download Website Updated 05 Oct 2013 Apache Solr

Screenshot
Pop 163.42
Vit 13.35

Solr is an enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g. Word and PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites. Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture when more advanced customization is required.

No download No website Updated 21 Feb 2005 Zoe Intertwingle

Screenshot
Pop 162.45
Vit 7.12

Zoe is a Web based email client with a built in SMTP and POP3 server and Google-like search functionality that lives on your desktop. It is written in Java and uses Lucene technology to provided instant searching and threading of your email messages.

Download Website Updated 22 Jan 2004 Pythoñol

Screenshot
Pop 102.76
Vit 2.11

Pythoñol is an all-in-one program that helps English speakers learn Spanish. It features pronunciation, verb conjugation, a dictionary with over 70,000 words, a thesaurus, quizzes, full-text translation, idioms, a verb browser, and a large reference section.

Download Website Updated 18 Feb 2009 Unicode Utilities

Screenshot
Pop 87.61
Vit 5.85

The Unicode Utilities are a set of programs for manipulating and analyzing Unicode text. uniname prints any combination of the character offset of each character, its byte offset, its hex code value, its encoding, the glyph itself, and its name. unidesc reports the character ranges to which different portions of the text belong. unihist generates a histogram of the characters in its input. ExplicateUTF8 determines and explains the validity of a sequence of bytes as a UTF-8 encoding. unirev reverses UTF-8 strings. unifuzz tests other programs' unicode handling.

Screenshot

Project Spotlight

PrestaShop Custom Fields

A PrestaShop module that adds extra fields to the checkout page.

Screenshot

Project Spotlight

YAGF

A graphical frontend for the Cuneiform OCR tool.