Wandora is a general-purpose data extraction, management, and publishing application based on Topic Maps and Java. It has a graphical user interface, layered presentation of knowledge, several data storage options, rich data extraction, import, and export capabilities, and an embedded HTTP server that enables dynamic publication of Topic Maps. Wandora is well suited for rapid ontology construction and knowledge mashups.
AceWiki is a semantic wiki that is powerful yet easy to use. It uses the controlled natural language ACE (Attempto Controlled English), so the formal statements of the wiki are shown in a form that looks like natural English. To help users write correct ACE sentences, AceWiki provides a predictive editor.
OpenEphyra is a question answering (QA) system. It retrieves answers to natural language questions from the Web and other sources. OpenEphyra comes with implementations of algorithms that proved effective in Carnegie Mellon's Ephyra system, which participated in the TREC evaluations. It is platform-independent and can be set up in just a few minutes. The goal of this project is to give researchers the opportunity to develop new QA techniques without having to build the end-to-end system themselves.
RapidMiner (formerly YALE) is a flexible Java environment for knowledge discovery in databases, machine learning, and data mining. Many nestable learning and preprocessing operators (including those from Weka) are provided. It features an XML-based graphical user interface, a plugin mechanism, and high-dimensional plotting, and provides an easy-to-use extension mechanism that makes it possible to integrate new operators and adapt the system to your own requirements. A command-line version is also included.
IkeWiki is a new kind of Wiki (a so-called "Semantic Wiki") developed by Salzburg Research that allows users to collaboratively add semantic annotations to pages and to the links between pages. Such annotations are useful because they give machines a certain amount of "understanding" of the content that goes beyond merely displaying the page. This information can then be used, for example, for context-specific presentation of pages, advanced querying, consistency verification, or drawing conclusions.
Jigsaw is an embedded data store designed for the development of data warehouse, analytical, and machine learning applications. Jigsaw can perform over one million operations per second and scale to store terabytes of data. The object library contains classes for representing ordered and unordered mappings and highly compressed bit vectors with a range of set-theoretic operators, and it directly integrates a high-performance sort system.
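As a rough illustration of what set-theoretic operators on bit vectors look like, the following minimal Python sketch packs sets of small integer IDs into plain integers and combines them with bitwise operators. It is not Jigsaw's API and does no compression; the helper names `to_bits` and `from_bits` are hypothetical and exist only for this example.

```python
def to_bits(members):
    """Pack non-negative integer IDs into one int used as a bit vector."""
    bits = 0
    for m in members:
        bits |= 1 << m  # set the bit at position m
    return bits


def from_bits(bits):
    """Unpack a bit vector back into a sorted list of IDs."""
    out = []
    position = 0
    while bits:
        if bits & 1:
            out.append(position)
        bits >>= 1
        position += 1
    return out


a = to_bits([1, 3, 5, 7])
b = to_bits([3, 4, 5, 6])

union = from_bits(a | b)          # IDs present in either set
intersection = from_bits(a & b)   # IDs present in both sets
difference = from_bits(a & ~b)    # IDs in a but not in b
```

Each set operation here is a single bitwise instruction over the whole vector, which is the general reason bit vectors are attractive for analytical workloads.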
SenseClusters is a natural language processing package that allows you to cluster similar contexts or to identify clusters of related words. It supports its own native methods based on first- and second-order representations of context, and also supports Latent Semantic Analysis. It is fully unsupervised and can automatically discover the optimal number of clusters in your text. SenseClusters is a complete system that takes users from preprocessing of raw text to clustered output.
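The difference between first- and second-order context representations can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not SenseClusters' implementation: a first-order vector is taken to be the bag of words in a context, and a second-order vector is the average of the co-occurrence vectors of those words. All function names and the toy contexts are hypothetical.

```python
import math
from collections import Counter, defaultdict


def first_order(context):
    """Represent a context directly by the words it contains."""
    return Counter(context.lower().split())


def cooccurrence(contexts):
    """Count how often each pair of distinct words shares a context."""
    table = defaultdict(Counter)
    for context in contexts:
        words = set(context.lower().split())
        for w in words:
            for v in words:
                if v != w:
                    table[w][v] += 1
    return table


def second_order(context, table):
    """Average the co-occurrence vectors of the words in a context."""
    words = context.lower().split()
    total = Counter()
    for w in words:
        total.update(table[w])
    return {k: count / len(words) for k, count in total.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


contexts = [
    "river bank water",
    "bank money loan",
    "water flows river",
    "loan money interest",
]
table = cooccurrence(contexts)

# These two contexts share no words, so first-order similarity is zero,
# but their words co-occur with common neighbours such as "bank".
first_sim = cosine(first_order(contexts[2]), first_order(contexts[1]))
second_sim = cosine(second_order(contexts[2], table),
                    second_order(contexts[1], table))
```

In this toy data the second-order vectors overlap even though the contexts have no words in common, which is why second-order representations tend to help with short or sparse contexts. A clustering step would then group contexts by these similarities.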