uimaFIT provides Java annotations for describing UIMA components which can be used to directly describe the UIMA components in Java code without the need for traditional UIMA XML descriptors. This greatly simplifies refactoring a component definition (e.g., changing a configuration parameter name). uimaFIT also makes it easy to instantiate UIMA components without using XML descriptor files by providing convenient factory methods. This makes uimaFIT an ideal library for testing UIMA components because the component can be easily instantiated and invoked without requiring a descriptor file to be created first. uimaFIT is very useful in research environments in which programmatic/dynamic instantiation of UIMA pipelines can simplify experimentation. For example, when performing 10-fold cross-validation across a number of experimental conditions, it can be quite laborious to create a different set of descriptor files for each run, or even a script which generates such descriptor files. uimaFIT is type system agnostic and does not depend on (or provide) a specific type system. This project has been superseded by the Apache uimaFIT project.
jWeb1T is an Java tool for efficiently searching n-gram data in the Web 1T 5-gram corpus format. It is based on a binary search algorithm that finds the n-grams and returns their frequency counts in logarithmic time. As the corpus is stored in many files, a simple index is used to retrieve the files containing the n-grams.
DKPro Core is a collection of software components for natural language processing (NLP) based on the Apache UIMA framework. Many powerful and state-of-the-art NLP components are already freely available in the NLP research community. New and improved components are being developed and released continuously. The components cover the whole range of NLP-related processing tasks. DKPro Core provides wrappers for such third-party tool as well as original NLP components. DKPro Core builds heavily on uimaFIT which allows for rapid and easy development of NLP processing pipelines.
Status Reporter is a time tracking application that was designed to track the time spent working on various activities associated with work breakdown structure (WBS) line items. Each activity is associated with a WBS line item, and a WBS line item is part of the overall departmental budget. Even though it was developed for a specific purpose, the ideas can be applied to almost any situation. For example, if you do not use WBS numbers, you could just create a single WBS number and associate all activities with it. Alternatively if you have multiple projects, you can use the WBS numbers as individual projects and have one or more activities associated with each. The WBS number does not have to be numeric, so it is flexible in its usage.
Midao JDBC simplifies development with Java JDBC. It is flexible, customizable, and simple/intuitive to use, and provides a lot of functionality: transactions, working with metadata, type handling, profiling, input/output processing/converting, pooled datasource libraries support, cached/lazy query execution, named parameters, multiple vendor support out of the box, custom exception handling, and overrides. With a single jar, it supports both JDBC 3.0 (Java 5) and JDBC 4.0 (Java 6). Midao JDBC is well tested. Not only does it have around 700 unit and functional tests, but it's also tested with the latest drivers of Derby, MySQL (MariaDB), PostgreSQL, Microsoft SQL, and Oracle. Midao is a data-centric project. Its goal is to shield Java developer from nuances of vendor implementation and standard boilerplate code. Midao JDBC is the first library released under it.
Apache uimaFIT provides Java annotations for describing UIMA components which can be used to directly describe the UIMA components in Java code without the need for traditional UIMA XML descriptors. This greatly simplifies refactoring a component definition (e.g., changing a configuration parameter name). It also makes it easy to instantiate UIMA components without using XML descriptor files by providing convenient factory methods. It is ideal for testing UIMA components because the component can be easily instantiated and invoked without requiring a descriptor file to be created first.
WebAnno is a general purpose Web-based annotation tool for a wide range of linguistic annotations. It offers annotation project management, freely configurable tagsets, and the management of users in different roles. It uses technology from the brat rapid annotation tool for visualizing and editing annotations in a Web browser. It supports annotation and visualization of arbitrarily large documents, pluggable import/export filters, the curation of annotations across various users, and farming out annotations to a crowdsourcing platform.
DKPro Similarity is a framework for text similarity. Its goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing (NLP) based on the Apache UIMA framework. DKPro Similarity comprises a wide variety of measures ranging from ones based on simple n-grams and common subsequences to high-dimensional vector comparisons and structural, stylistic, and phonetic measures. In order to promote the reproducibility of experimental results and to provide reliable, permanent experimental conditions for future studies, DKPro Similarity also comes with a set of full-featured experimental setups which can be run out-of-the-box and used for future systems to built upon.