Wandora is a general-purpose data extraction, management, and publishing application based on Topic Maps and Java. Wandora has a graphical user interface, layered presentation of knowledge, several data storage options, rich data extraction, import and export capabilities, and an embedded HTTP server that enables dynamic publication of Topic Maps. Wandora is well suited for rapid ontology construction and knowledge mashups.
Social Networks Visualizer (SocNetV) is a flexible and user-friendly tool for the analysis and visualization of Social Networks. It lets you construct mathematical graphs with a few clicks on a virtual canvas, load networks in various formats (GraphViz, GraphML, Adjacency, Pajek, UCINET, etc.), or create a network by crawling all links in a Web page. The application can compute basic network properties, such as density, diameter, and distances (shortest path lengths), as well as more advanced structural statistics, such as node and network centralities (e.g. closeness, betweenness, graph) and the clustering coefficient.
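The basic properties named above (density, distances, diameter) can be illustrated with a minimal Python sketch. This is not SocNetV's code; the function names and the adjacency-dictionary representation are assumptions chosen for illustration, and the graph is assumed to be undirected, unweighted, and connected.

```python
from collections import deque


def density(adj):
    """Density of an undirected simple graph: actual edges / possible edges."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # each edge is counted twice
    return 2 * m / (n * (n - 1))


def bfs_distances(adj, source):
    """Shortest path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Longest shortest path over all pairs of mutually reachable nodes."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)


# Hypothetical example: the path graph a - b - c - d
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(density(graph), diameter(graph), bfs_distances(graph, "a")["d"])
```

Running one breadth-first search per node is adequate for small graphs; weighted networks would need a different shortest-path algorithm.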
Harry is a small tool for comparing strings and measuring their similarity. It implements several common distance and kernel functions for strings, as well as some exotic similarity measures. For example, Harry supports the Levenshtein (edit) distance, the Jaro-Winkler distance, and the compression distance. Harry is parallelized with OpenMP, so the time needed to compare a set of strings decreases roughly in proportion to the number of available CPU cores. Efficient implementations and effective caching further speed up string comparison.
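As an illustration of one of the measures mentioned, the Levenshtein (edit) distance can be computed with the standard dynamic-programming recurrence. The sketch below is a plain Python version for illustration only; it is not Harry's implementation and does not attempt any of its parallelization or caching.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))
```

Keeping only two rows of the table keeps memory use proportional to the length of the second string.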
Sally is a tool for mapping a set of strings to a set of vectors. This mapping is referred to as embedding and allows techniques of machine learning and data mining to be applied for the analysis of string data. It can be used with data such as text documents, DNA sequences, or log files. The vector space model or bag-of-words model is used. Strings are characterized by a set of features, where each feature is associated with one dimension of the vector space. Occurrences of the features in each string are counted. Alternatively, binary or TF-IDF values can be computed. Vectors can be output in plain text, LibSVM, or Matlab format.
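The embedding described above can be sketched in a few lines of Python. This is an assumption-laden illustration, not Sally's implementation: features are taken to be whitespace-separated words, dimensions are assigned from a sorted vocabulary, and the helper names (`build_vocabulary`, `embed`, `to_libsvm`) are hypothetical. The output line follows the general LibSVM sparse convention of `label index:value` pairs.

```python
from collections import Counter


def build_vocabulary(strings):
    """Give each distinct word its own dimension (1-based, sorted for stability)."""
    words = sorted({w for s in strings for w in s.split()})
    return {w: i for i, w in enumerate(words, start=1)}


def embed(string, vocab, binary=False):
    """Map a string to a sparse vector {dimension: value}.

    Values are occurrence counts, or 1 for every present feature if binary=True.
    """
    counts = Counter(w for w in string.split() if w in vocab)
    return {vocab[w]: (1 if binary else c) for w, c in counts.items()}


def to_libsvm(label, vector):
    """Format a sparse vector as one LibSVM-style line with ascending indices."""
    feats = " ".join(f"{i}:{v}" for i, v in sorted(vector.items()))
    return f"{label} {feats}"


# Hypothetical input strings
docs = ["the cat sat", "the cat saw the dog"]
vocab = build_vocabulary(docs)
for label, doc in enumerate(docs):
    print(to_libsvm(label, embed(doc, vocab)))
```

The same structure works for other feature types, such as character n-grams in DNA sequences or log lines, by replacing the `split()` call with a different feature extractor.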
Salad (short for Letter Salad) is an efficient and flexible implementation of the well-known anomaly detection method Anagram by Wang et al. (RAID 2006). Salad is based on n-gram models, that is, data is represented as all of its substrings of length n. During training these n-grams are stored in a Bloom filter. This enables the detector to represent a large number of n-grams in little memory while still allowing efficient access to the data. Salad extends Anagram by allowing various n-gram types, a 2-class version of the detector for classification, and various model analysis modes.
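The idea of storing training n-grams in a Bloom filter and then checking unseen data against it can be sketched as follows. This is a simplified illustration, not Salad's code: the filter size, the number of hash functions, the use of SHA-256 for hashing, byte-level 3-grams, and the score (the fraction of n-grams absent from the filter) are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array with k hash functions; may report false positives."""

    def __init__(self, size_bits=1 << 16, num_hashes=3):
        self.size = size_bits  # assumed to be a multiple of 8
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive k bit positions by hashing the item with k different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def ngrams(data, n):
    """All substrings of length n of a bytes object."""
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def train(samples, n=3):
    """Store every n-gram of the training samples in a Bloom filter."""
    model = BloomFilter()
    for sample in samples:
        for gram in ngrams(sample, n):
            model.add(gram)
    return model


def anomaly_score(data, model, n=3):
    """Fraction of n-grams in data that were not seen during training."""
    grams = ngrams(data, n)
    if not grams:
        return 0.0
    unseen = sum(1 for gram in grams if gram not in model)
    return unseen / len(grams)


# Hypothetical training data and test inputs
model = train([b"GET /index.html HTTP/1.1"])
print(anomaly_score(b"GET /index.html HTTP/1.1", model))
print(anomaly_score(b"\x90\x90\x90\x90\xcc\xcc", model))
```

A Bloom filter never forgets an inserted n-gram, so training data always scores 0.0; the trade-off is a small chance that an unseen n-gram is reported as known, which grows as the filter fills up.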
BEAM is a toolbox and development platform for viewing, analysing, and processing remote sensing raster data. Originally developed to facilitate the utilisation of image data from Envisat's optical instruments, BEAM now supports a growing number of other raster data formats such as GeoTIFF and NetCDF, as well as data formats of other EO sensors such as MODIS, AVHRR, AVNIR, PRISM and CHRIS/Proba. Various data and algorithms are supported by dedicated extension plug-ins. BEAM includes VISAT, an intuitive desktop application; a set of scientific tools that run either from the command line or from within VISAT; and a rich Java API for the development of new remote sensing applications and BEAM extension plug-ins.