113 projects tagged "Linguistic"
ANTLR (ANother Tool for Language Recognition) is a language tool that provides a framework for constructing recognizers, compilers, and translators from grammatical descriptions containing C++, Java, or Sather actions. It is similar to the popular compiler generator YACC, but is intended to be more powerful and easier to use. ANTLR-produced parsers are not only highly efficient but also human-readable and human-debuggable (especially with the interactive ParseView debugging tool). ANTLR can generate parsers, lexers, and tree parsers in C++, Java, or Sather. ANTLR itself is currently written in Java.
After the Deadline for WordPress is a plugin that interfaces with After the Deadline, a Web service that helps you improve your writing and spend less time editing. This plugin adds a button for checking spelling and writing style to the WordPress visual editor mode. An API key is required to access the After the Deadline service.
An Gramadóir is a grammar checking engine that is designed for the rapid development of grammar checkers for minority languages and other languages with limited computational resources. Rule specifications are given according to a simple syntax combining XML and regular expressions. Part-of-speech tagging can be learned from text corpora using statistical methods. It is currently implemented for Irish (Gaeilge).
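To illustrate the idea of rules that combine XML-style tags with regular expressions, here is a minimal sketch. It is not An Gramadóir's actual rule syntax or code: the tag names (`T` for article, `N` for noun, `V` for verb), the English example rule, and the `check` function are all assumptions made for illustration. The input is assumed to be already part-of-speech tagged.

```python
import re

# Hypothetical rule set: each rule is a regular expression over tagged text
# plus a message. This rule flags the article "a" before a vowel-initial noun.
RULES = [
    (re.compile(r"<T>a</T> <N>[aeiou][^<]*</N>", re.IGNORECASE),
     "use 'an' before a vowel sound"),
]

def check(tagged_text):
    """Return (matched plain text, message) pairs for every rule that fires."""
    errors = []
    for pattern, message in RULES:
        for match in pattern.finditer(tagged_text):
            # Strip the tags so the report shows the words as the writer typed them.
            plain = re.sub(r"<[^>]+>", "", match.group(0))
            errors.append((plain, message))
    return errors

if __name__ == "__main__":
    print(check("<T>a</T> <N>apple</N> <V>fell</V>"))
```

Keeping rules as data rather than code is what makes it quick to add a new language in a design like this: only the rule list and the tagger change.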
Solr is an enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g. Word and PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites. Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture when more advanced customization is required.
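As a rough sketch of how the HTTP API can be used from another language, the Python function below builds a query URL for Solr's `/select` handler with JSON output, optional hit highlighting, and optional faceting. The base URL, the core name `docs`, and the field names are assumptions for illustration; the function only constructs the URL and does not contact a server.

```python
from urllib.parse import urlencode

def build_select_url(base_url, query, rows=10, highlight_field=None, facet_field=None):
    """Build a Solr /select URL. base_url and the field names are hypothetical."""
    params = [("q", query), ("rows", str(rows)), ("wt", "json")]
    if highlight_field:
        # Ask Solr to return highlighted snippets for this field.
        params += [("hl", "true"), ("hl.fl", highlight_field)]
    if facet_field:
        # Ask Solr to return facet counts for this field.
        params += [("facet", "true"), ("facet.field", facet_field)]
    return base_url.rstrip("/") + "/select?" + urlencode(params)

if __name__ == "__main__":
    # Hypothetical local core named "docs".
    print(build_select_url("http://localhost:8983/solr/docs", "title:parser",
                           rows=5, facet_field="category"))
```

The resulting URL can be fetched with any HTTP client, and the response parsed as ordinary JSON.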
Arabic Wordlist is a project to deliver an English-to-Arabic translated word list for use in translations and dictionaries. The word list contains more than 83,500 words (and growing) and spans a variety of categories (i.e. it is general in nature). The word list is encoded in UTF-8 and is expected to be used in many free online dictionaries.
BabelKit is an interface to a universal multilingual database code table. It takes all of the programming work out of maintaining multiple database code definition sets in multiple languages. The code administration and translation page lets developers define new virtual code tables, new languages, enter all codes and their descriptions, and then translate them into all languages of interest. Perl and PHP classes retrieve the code descriptions and automatically generate HTML code selection elements in the user's language. This makes internationalization and localization of Web sites and database interfaces much easier.
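The code-table idea can be sketched in a few lines of Python. This is not BabelKit's API (its classes are in Perl and PHP): the in-memory `CODE_TABLE`, the `describe` and `html_select` functions, and the fallback-to-English behavior are all assumptions made for illustration.

```python
from html import escape

# Hypothetical code table: (code set, code) -> {language: description}.
CODE_TABLE = {
    ("day", "1"): {"en": "Monday", "fr": "lundi"},
    ("day", "2"): {"en": "Tuesday", "fr": "mardi"},
    ("day", "3"): {"en": "Wednesday"},  # no French translation yet
}

def describe(code_set, code, lang, fallback="en"):
    """Return the description in lang, falling back to the fallback language."""
    names = CODE_TABLE[(code_set, code)]
    return names.get(lang, names[fallback])

def html_select(code_set, lang, selected=None):
    """Render an HTML <select> element for one code set in the user's language."""
    options = []
    for cset, code in sorted(CODE_TABLE):
        if cset != code_set:
            continue
        attr = " selected" if code == selected else ""
        label = escape(describe(code_set, code, lang))
        options.append('<option value="%s"%s>%s</option>' % (escape(code), attr, label))
    return '<select name="%s">\n%s\n</select>' % (escape(code_set), "\n".join(options))

if __name__ == "__main__":
    print(html_select("day", "fr", selected="2"))
```

Because the page stores only the code, the same stored value can be displayed in any language that has a translation in the table.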
Booleano is an interpreter of Boolean expressions: a library to define and run filters available as text (e.g. in a natural language) or in Python code. To handle text-based filters, Booleano ships with a fully featured parser whose grammar is adaptive: its properties can be overridden using simple configuration directives. For filters written in pure Python, the library exposes a Pythonic API. These filters are particularly useful for building reusable conditions from objects provided by a third-party library.
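The notion of reusable conditions composed in pure Python can be illustrated with a small sketch. This is not Booleano's API: the `Condition` class and the example predicates are hypothetical, and the sketch simply overloads Python's `&`, `|`, and `~` operators to combine filters.

```python
class Condition:
    """A reusable Boolean filter wrapping a predicate function (hypothetical class)."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __call__(self, item):
        return bool(self.predicate(item))

    def __and__(self, other):
        # Both conditions must hold.
        return Condition(lambda item: self(item) and other(item))

    def __or__(self, other):
        # At least one condition must hold.
        return Condition(lambda item: self(item) or other(item))

    def __invert__(self):
        # Logical negation.
        return Condition(lambda item: not self(item))

# Example conditions over words; these names are illustrative only.
is_long = Condition(lambda word: len(word) > 5)
starts_with_a = Condition(lambda word: word.startswith("a"))
long_but_not_a = is_long & ~starts_with_a

if __name__ == "__main__":
    words = ["grammar", "adverbs", "an", "parser"]
    print([w for w in words if long_but_not_a(w)])
```

Each combined condition is itself a `Condition`, so filters built once can be reused and combined further elsewhere.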