17 projects tagged "NLP"

Infovore (updated 14 Apr 2014)

Pop 614.23
Vit 51.81

Infovore is a map/reduce framework for processing large RDF data sets such as Freebase and DBpedia. It is based on Hadoop.
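As a rough illustration of the map/reduce style applied to RDF data, the sketch below counts how often each predicate occurs in a handful of N-Triples-like lines. It is not Infovore's code and does not use Hadoop; the function names and the sample triples are assumptions made up for this example.

```python
from collections import defaultdict

def map_triple(line):
    """Map step: emit (predicate, 1) for one 'subject predicate object .' line."""
    parts = line.strip().split(None, 2)  # subject, predicate, rest of line
    if len(parts) == 3:
        yield parts[1], 1

def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

def count_predicates(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = (pair for line in lines for pair in map_triple(line))
    return reduce_counts(pairs)

# Hypothetical sample data, not taken from Freebase or DBpedia.
SAMPLE = [
    '<http://example.org/a> <http://example.org/knows> <http://example.org/b> .',
    '<http://example.org/b> <http://example.org/knows> <http://example.org/c> .',
    '<http://example.org/a> <http://example.org/name> "Alice" .',
]

if __name__ == "__main__":
    print(count_predicates(SAMPLE))
```

In a real Hadoop job the map and reduce steps would run on separate machines over partitions of the data; here they run in one process to keep the idea visible.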

Language Detection Library for Java (updated 15 Oct 2010)

Pop 55.50
Vit 35.78

The Language Detection Library for Java is a Java library to detect the natural languages in which texts are written. This task is also known as "language identification", "language guessing", and "language recognition". It has over 99% precision for more than 40 languages. The supported languages are Afrikaans, Arabic, Bulgarian, Bengali, Czech, German, Greek, English, Spanish, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Kannada, Korean, Macedonian, Malayalam, Marathi, Nepali, Dutch, Punjabi, Polish, Portuguese, Romanian, Russian, Slovak, Somali, Albanian, Swedish, Swahili, Tamil, Telugu, Thai, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese, and Simplified/Traditional Chinese.

AlchemyAPI Android SDK (updated 11 Dec 2010)

Pop 81.66
Vit 34.98

The AlchemyAPI Android SDK enables real-time semantic analysis of text, HTML, or Internet-hosted Web page content. The SDK provides mechanisms to extract Concepts, Named Entities, Keywords and Tags, and Categories, to clean HTML into text, and to detect languages. It can analyze text in eight different languages: English, French, German, Italian, Portuguese, Russian, Spanish, and Swedish. Example code and a demo application are included to help you get started.

JOWKL (updated 14 Sep 2013)

Pop 31.02
Vit 14.72

JOWKL (Java OmegaWiki Library) is a Java-based application programming interface which allows the user to access all information in the free, multilingual online dictionary OmegaWiki.

eoconv (updated 15 Oct 2013)

Pop 57.10
Vit 11.57

eoconv is a tool that converts text files to and from various Esperanto transliteration schemes (e.g. h-notation, x-notation) and text encodings, including Unicode, ISO-8859-3, HTML, LaTeX, and ASCII.
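To make the idea of a transliteration scheme concrete, here is a minimal sketch that converts between x-notation and Unicode. It is not eoconv's implementation and covers only this one pair of schemes; the function names are invented for the example.

```python
import re

# Base letter -> accented Esperanto letter (written as escapes to avoid encoding issues).
X_TO_UNICODE = {
    "c": "\u0109",  # c with circumflex
    "g": "\u011d",  # g with circumflex
    "h": "\u0125",  # h with circumflex
    "j": "\u0135",  # j with circumflex
    "s": "\u015d",  # s with circumflex
    "u": "\u016d",  # u with breve
}

# Reverse table, including uppercase accented letters.
UNICODE_TO_X = {v: k + "x" for k, v in X_TO_UNICODE.items()}
UNICODE_TO_X.update({v.upper(): k.upper() + "x" for k, v in X_TO_UNICODE.items()})

def x_to_unicode(text):
    """Replace each base letter followed by x or X with its accented form."""
    def repl(match):
        base = match.group(1)
        accented = X_TO_UNICODE[base.lower()]
        return accented.upper() if base.isupper() else accented
    return re.sub(r"([cghjsuCGHJSU])[xX]", repl, text)

def unicode_to_x(text):
    """Replace each accented letter with its base letter plus x."""
    return "".join(UNICODE_TO_X.get(ch, ch) for ch in text)

if __name__ == "__main__":
    print(x_to_unicode("ehxosxangxo cxiujxauxde"))
```

The sketch treats every matching letter followed by x as a digraph, which is the usual reading because x is not a letter of the Esperanto alphabet; text mixed with other languages would need extra care.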

TreeTagger for Java (updated 14 Feb 2014)

Pop 163.41
Vit 7.44

TreeTagger for Java (TT4J) is a Java wrapper around the popular TreeTagger package by Helmut Schmid, a language-independent part-of-speech tagger and lemmatizer. It was written with a focus on platform independence and easy integration into applications.

UBY (updated 14 Oct 2013)

Pop 85.28
Vit 4.03

UBY is a large-scale unified lexical-semantic resource for natural language processing (NLP) based on the ISO standard Lexical Markup Framework (LMF).

DKPro Core (updated 23 Dec 2013)

Pop 58.31
Vit 2.34

DKPro Core is a collection of software components for natural language processing (NLP) based on the Apache UIMA framework. Many powerful, state-of-the-art NLP components are already freely available in the NLP research community, and new and improved components are developed and released continuously. Together they cover the whole range of NLP-related processing tasks. DKPro Core provides wrappers for such third-party tools as well as original NLP components. It builds heavily on uimaFIT, which allows for rapid and easy development of NLP processing pipelines.

DKPro WSD (updated 30 Nov 2013)

Pop 59.60
Vit 2.17

DKPro WSD provides UIMA components which encapsulate corpus readers, linguistic annotators, lexical semantic resources, WSD algorithms, and evaluation and reporting tools. You configure the components, or write new ones, and arrange them into a data processing pipeline. DKPro WSD is modular and flexible. Components which provide the same functionality can be freely swapped. You can easily run the same algorithm on different data sets, or test several different algorithms on the same data set.
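The idea of swapping components that share the same functionality can be sketched outside UIMA as follows. This is only an illustration in Python, not the DKPro WSD API: the sense inventory, the data set, and all function names are hypothetical, and the two algorithms are a most-frequent-sense baseline and a simplified Lesk-style gloss overlap.

```python
from collections import Counter

# Hypothetical toy sense inventory: sense id -> gloss words.
GLOSSES = {
    "bank@money": {"money", "deposit", "loan", "account"},
    "bank@river": {"river", "water", "shore", "slope"},
}

# Hypothetical gold-standard data set: (context words, correct sense).
DATASET = [
    (["deposit", "money", "at", "the", "bank"], "bank@money"),
    (["the", "bank", "approved", "the", "loan"], "bank@money"),
    (["fishing", "from", "the", "river", "bank"], "bank@river"),
]

def most_frequent_sense(train):
    """Build a baseline that always answers with the most common training sense."""
    top = Counter(sense for _, sense in train).most_common(1)[0][0]
    return lambda context: top

def simplified_lesk(context):
    """Pick the sense whose gloss shares the most words with the context."""
    return max(sorted(GLOSSES), key=lambda s: len(GLOSSES[s] & set(context)))

def evaluate(algorithm, dataset):
    """Accuracy of any callable that maps a context to a sense id."""
    correct = sum(algorithm(ctx) == gold for ctx, gold in dataset)
    return correct / len(dataset)

if __name__ == "__main__":
    # Toy setup: the baseline is trained and tested on the same data.
    algorithms = [("baseline", most_frequent_sense(DATASET)), ("lesk", simplified_lesk)]
    for name, algo in algorithms:
        print(name, evaluate(algo, DATASET))
```

Because both algorithms expose the same call signature, `evaluate` can run either one on any data set without changes, which is the property the description attributes to the framework's components.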

TWSI (updated 28 Nov 2013)

Pop 35.94
Vit 1.77

TWSI is software that produces lexical substitutions in context for over 1000 frequent nouns in English text. This functionality is realized by a supervised word sense disambiguation system, which is trained on sense-labeled occurrences of the target words. A classification model is trained for each word and used to decide which sense an unseen occurrence most likely belongs to. Each sense has an associated list of substitutions, which are injected into the text using inline annotation.
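The workflow described above (one classifier per target word, a substitution list per sense, inline annotation of the text) might look roughly like the following sketch. It is not TWSI's code: the context-overlap classifier stands in for whatever model the software actually uses, and the sense ids, substitution lists, and bracketed annotation format are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

class WordSenseModel:
    """Toy per-word classifier that scores each sense by context-word overlap."""

    def __init__(self):
        self.context_counts = defaultdict(Counter)

    def train(self, labeled_occurrences):
        """labeled_occurrences: iterable of (context words, sense id)."""
        for context_words, sense in labeled_occurrences:
            self.context_counts[sense].update(context_words)

    def predict(self, context_words):
        """Return the sense whose training contexts best match the given words."""
        def score(sense):
            return sum(self.context_counts[sense][w] for w in context_words)
        return max(sorted(self.context_counts), key=score)

# Hypothetical substitution lists keyed by hypothetical sense ids.
SUBSTITUTIONS = {
    "bank@1": ["financial institution", "lender"],
    "bank@2": ["shore", "riverside"],
}

def annotate(tokens, target, model, substitutions):
    """Append the substitutions for the predicted sense after each target token."""
    out = []
    for i, tok in enumerate(tokens):
        if tok.lower() == target:
            context = [t.lower() for j, t in enumerate(tokens) if j != i]
            sense = model.predict(context)
            out.append(f"{tok}[{' | '.join(substitutions[sense])}]")
        else:
            out.append(tok)
    return " ".join(out)

def build_bank_model():
    """Train the toy model on a few made-up sense-labeled occurrences of 'bank'."""
    model = WordSenseModel()
    model.train([
        (["deposit", "money", "account"], "bank@1"),
        (["loan", "money", "interest"], "bank@1"),
        (["river", "water", "fishing"], "bank@2"),
        (["river", "grass", "shore"], "bank@2"),
    ])
    return model

if __name__ == "__main__":
    sentence = "She sat on the bank of the river".split()
    print(annotate(sentence, "bank", build_bank_model(), SUBSTITUTIONS))
```

A separate `WordSenseModel` instance would be trained for each target noun, mirroring the one-model-per-word design in the description.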
