17 projects tagged "NLP"

AlchemyAPI Android SDK (updated 11 Dec 2010)

Pop 81.55
Vit 35.03

The AlchemyAPI Android SDK enables real-time semantic analysis of text, HTML, or Internet-hosted Web page content. The SDK can extract concepts, named entities, keywords and tags, and categories, convert HTML into clean text, and detect languages. It can analyze text in eight different languages: English, French, German, Italian, Portuguese, Russian, Spanish, and Swedish. Example code and a demo application are included to help you get started.

Apache OpenNLP (updated 29 Nov 2011)

Pop 85.84
Vit 1.49

Apache OpenNLP is a machine-learning-based toolkit for processing natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services.

DKPro Core (updated 23 Dec 2013)

Pop 56.78
Vit 2.32

DKPro Core is a collection of software components for natural language processing (NLP) based on the Apache UIMA framework. Many powerful, state-of-the-art NLP components are already freely available in the NLP research community, and new and improved components are being developed and released continuously. These components cover the whole range of NLP-related processing tasks. DKPro Core provides wrappers for such third-party tools as well as original NLP components. DKPro Core builds heavily on uimaFIT, which allows for rapid and easy development of NLP processing pipelines.

DKPro WSD (updated 30 Nov 2013)

Pop 58.99
Vit 2.16

DKPro WSD provides UIMA components for word sense disambiguation (WSD) that encapsulate corpus readers, linguistic annotators, lexical semantic resources, WSD algorithms, and evaluation and reporting tools. You configure the components, or write new ones, and arrange them into a data processing pipeline. DKPro WSD is modular and flexible: components that provide the same functionality can be freely swapped. You can easily run the same algorithm on different data sets, or test several different algorithms on the same data set.
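
The idea of swapping interchangeable algorithms over one data set can be sketched in a few lines of plain Python. This is not DKPro WSD or UIMA code: the sense inventory, the data set, and the function names (`first_sense`, `simplified_lesk`, `evaluate`) are all hypothetical, and the two algorithms are a first-sense baseline and a simplified Lesk gloss-overlap measure chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical two-sense inventory: (sense id, gloss). The first entry is
# assumed to be the most frequent sense of the word.
INVENTORY: Dict[str, List[Tuple[str, str]]] = {
    "bank": [
        ("bank%finance", "an institution that accepts deposits and lends money"),
        ("bank%river", "sloping land beside a river or stream"),
    ],
}

# Hypothetical gold-standard items: (target word, context, correct sense id).
DATASET: List[Tuple[str, str, str]] = [
    ("bank", "she sat on the bank of the river", "bank%river"),
    ("bank", "the bank lends money to customers", "bank%finance"),
]

# Every algorithm shares this signature, so any of them can be plugged in.
Disambiguator = Callable[[str, str], str]


def first_sense(word: str, context: str) -> str:
    """Baseline: ignore the context and return the first listed sense."""
    return INVENTORY[word][0][0]


def simplified_lesk(word: str, context: str) -> str:
    """Pick the sense whose gloss shares the most words with the context."""
    context_words = set(context.lower().split())

    def overlap(sense: Tuple[str, str]) -> int:
        return len(context_words & set(sense[1].lower().split()))

    # max() keeps the earliest sense on ties, i.e. falls back to the baseline.
    return max(INVENTORY[word], key=overlap)[0]


def evaluate(algorithm: Disambiguator, dataset: List[Tuple[str, str, str]]) -> float:
    """Return the fraction of items for which the algorithm matches the gold sense."""
    correct = sum(algorithm(word, context) == gold for word, context, gold in dataset)
    return correct / len(dataset)


if __name__ == "__main__":
    # Same data set, two interchangeable algorithms.
    for algorithm in (first_sense, simplified_lesk):
        print(algorithm.__name__, evaluate(algorithm, DATASET))
```

Because both algorithms share one signature, adding a third algorithm or a second data set only changes the loop at the bottom.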

Infovore (updated 14 Apr 2014)

Pop 613.27
Vit 31.76

Infovore is a map/reduce framework for processing large RDF data sets such as Freebase and DBpedia. It is based on Hadoop.
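
As a rough illustration of what a map/reduce job over RDF triples looks like, the sketch below counts how often each predicate occurs in a handful of N-Triples-style lines. It runs in memory in plain Python and uses neither Hadoop nor Infovore; the function names and the `example.org` triples are made up for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical sample data in an N-Triples-like layout.
TRIPLES = [
    '<http://example.org/a> <http://example.org/knows> <http://example.org/b> .',
    '<http://example.org/a> <http://example.org/name> "Alice Smith" .',
    '<http://example.org/b> <http://example.org/knows> <http://example.org/a> .',
    'malformed line',
]


def map_triple(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (predicate, 1) for each line that looks like a triple."""
    parts = line.strip().split(None, 2)  # subject, predicate, rest of the line
    if len(parts) == 3 and parts[2].endswith("."):
        yield parts[1], 1


def reduce_counts(key: str, values: Iterable[int]) -> Tuple[str, int]:
    """Reduce step: sum all counts emitted for one predicate."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, group by key (the shuffle), then reduce, all in memory."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for line in lines:
        for key, value in map_triple(line):
            groups[key].append(value)
    return dict(reduce_counts(key, values) for key, values in groups.items())


if __name__ == "__main__":
    for predicate, count in sorted(run_job(TRIPLES).items()):
        print(predicate, count)
```

In a real cluster the grouping step is carried out by the framework across machines; only the map and reduce functions are written by the user.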

JOWKL (updated 14 Sep 2013)

Pop 30.82
Vit 14.85

JOWKL (Java OmegaWiki Library) is a Java-based application programming interface which allows the user to access all information in the free, multilingual online dictionary OmegaWiki.

JWKTL (updated 15 Sep 2013)

Pop 37.26
Vit 1.01

JWKTL (Java-based Wiktionary Library) is an application programming interface for the free multilingual online dictionary Wiktionary. Wiktionary is collaboratively constructed by volunteers and is continually growing. JWKTL enables efficient and structured access to the information encoded in the English, German, and Russian Wiktionary language editions, including sense definitions, part-of-speech tags, etymology, example sentences, translations, semantic relations, and many other types of lexical information.

JobimText (updated 28 Nov 2013)

Pop 36.25
Vit 1.01

JobimText provides a software solution for automatic text expansion using contextualized distributional similarity.

Language Detection Library for Java (updated 15 Oct 2010)

Pop 55.28
Vit 35.83

The Language Detection Library for Java is a Java library to detect the natural languages in which texts are written. This task is also known as "language identification", "language guessing", and "language recognition". It has over 99% precision for more than 40 languages. The supported languages are Afrikaans, Arabic, Bulgarian, Bengali, Czech, German, Greek, English, Spanish, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Kannada, Korean, Macedonian, Malayalam, Marathi, Nepali, Dutch, Punjabi, Polish, Portuguese, Romanian, Russian, Slovak, Somali, Albanian, Swedish, Swahili, Tamil, Telugu, Thai, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese, and Simplified/Traditional Chinese.
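
To make the language identification task concrete, here is a toy detector in plain Python. It is not the library's API or its algorithm: it simply counts how many character trigrams of the input also occur in a one-sentence sample per language. The sample sentences, the two-language setup, and the names `trigrams`, `PROFILES`, and `detect` are assumptions made for this sketch, and a detector this small comes nowhere near the precision quoted above.

```python
from typing import Dict, Set

# Hypothetical one-sentence training samples; a real profile would be built
# from a large corpus per language.
SAMPLES: Dict[str, str] = {
    "en": "the quick brown fox jumps over the lazy dog and the cat",
    "de": "der schnelle braune fuchs springt, der faule hund und die katze schlafen",
}


def trigrams(text: str) -> Set[str]:
    """Return the set of character trigrams of the lowercased, space-normalized text."""
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + 3] for i in range(len(normalized) - 2)}


# One trigram profile per language.
PROFILES: Dict[str, Set[str]] = {lang: trigrams(sample) for lang, sample in SAMPLES.items()}


def detect(text: str) -> str:
    """Return the language whose profile shares the most trigrams with the text."""
    grams = trigrams(text)
    return max(PROFILES, key=lambda lang: len(grams & PROFILES[lang]))


if __name__ == "__main__":
    print(detect("the dog and the fox"))
    print(detect("der hund und die katze"))
```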

Okapi Framework (updated 26 Apr 2010)

Pop 34.23
Vit 1.46

The Okapi project’s main purpose is to provide a set of building blocks for creating larger open source localization and translation tools. However, many Okapi components are generic enough to be of interest to the text mining, natural language processing, and text retrieval communities. Okapi’s many text filters (HTML, Properties, XML with ITS XPath-based rules, OpenXML, ODF, Regex, etc.) provide a straightforward way to access the text of multiple document formats. Its document events and pipeline can be integrated with other frameworks such as UIMA, LingPipe, OpenPipeline, OpenNLP, GATE, and Lucene. The advantage of Okapi’s text filters is that they not only extract the text but also preserve all non-textual formatting. It is possible to decompose a document into events, process them via the pipeline, and then rebuild the input document without loss. Structural information can be added to Okapi document events so that tables, lists, links, titles, etc. are grouped together and treated as a unit. This is useful when context based on a “universal” document structure is needed. The Okapi event model supports user-configurable annotations, similar to UIMA’s but simpler and more restricted in scope. Users can annotate spans of text or add new resources such as translation memory matches, terminology, token types, or part-of-speech information.
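
The decompose, process, and rebuild cycle described above can be illustrated with a small plain-Python sketch. It does not use Okapi's classes or filters: the `Event` type, the regex-based splitter, and the step functions are hypothetical stand-ins that only show why keeping the non-textual pieces as events makes a lossless round trip possible.

```python
import re
from dataclasses import dataclass, replace
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Event:
    kind: str  # "text" for translatable content, "code" for markup to preserve
    data: str


# Very rough tag matcher; a real filter would use a proper parser.
_TAG = re.compile(r"(<[^>]*>)")


def to_events(document: str) -> List[Event]:
    """Decompose a document into text and code events, dropping nothing."""
    events = []
    for piece in _TAG.split(document):
        if piece:
            kind = "code" if _TAG.fullmatch(piece) else "text"
            events.append(Event(kind, piece))
    return events


def uppercase_text(event: Event) -> Event:
    """Example pipeline step: change text events and leave markup untouched."""
    if event.kind == "text":
        return replace(event, data=event.data.upper())
    return event


def run_pipeline(events: Iterable[Event], steps: List[Callable[[Event], Event]]) -> List[Event]:
    """Apply each step to every event, in order."""
    result = list(events)
    for step in steps:
        result = [step(event) for event in result]
    return result


def rebuild(events: Iterable[Event]) -> str:
    """Reassemble the document from its events."""
    return "".join(event.data for event in events)


if __name__ == "__main__":
    doc = "<p>Hello <b>world</b>!</p>"
    print(rebuild(to_events(doc)))                                   # unchanged input
    print(rebuild(run_pipeline(to_events(doc), [uppercase_text])))   # text changed, tags kept
```

With an empty list of steps the round trip returns the input byte for byte, which is the property the description calls rebuilding "without loss".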
