The FLOSSmole Data Collectors are programs that collect data about open source software development. They gather data from major open source software forges and directories, including Savannah, Tigris, GitHub, Google Code, Launchpad, and SourceForge.

Theorem Linker is a program for visualizing references between theorems in a paper written in LaTeX. Given a .tex document and the .aux file created by the LaTeX compiler, Theorem Linker searches the paper for theorems and for references to other theorems within each theorem's proof. It then writes a digraph to a .dot file (which can be opened with programs such as Graphviz or OmniGraffle) in which each theorem is a node and directed edges describe the relations between theorems. The longest path in the graph is highlighted in red. Theorem Linker also creates folders of graphs that show the relations of each theorem in the paper individually.
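
The pipeline described above (find theorems, find references inside their proofs, emit a digraph, highlight the longest path) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Theorem Linker's actual implementation: it assumes theorems are written as `theorem` environments containing a `\label`, that each proof is the next `proof` environment after its theorem, that references use plain `\ref`, and that the reference graph is acyclic. It ignores the .aux file entirely, and it draws an edge from a theorem to each theorem its proof cites, which is an assumed direction. All function names are hypothetical.

```python
import re

# Matches one theorem or proof environment (no nesting is handled).
ENV_RE = re.compile(r"\\begin\{(theorem|proof)\}(.*?)\\end\{\1\}", re.DOTALL)
LABEL_RE = re.compile(r"\\label\{([^}]*)\}")
REF_RE = re.compile(r"\\ref\{([^}]*)\}")


def theorem_edges(tex):
    """Return (labels, edges); an edge (a, b) means the proof of a cites b."""
    labels = []      # theorem labels in document order
    proof_refs = {}  # label -> labels referenced in that theorem's proof
    current = None   # most recent labelled theorem still awaiting a proof
    for match in ENV_RE.finditer(tex):
        env, body = match.group(1), match.group(2)
        if env == "theorem":
            found = LABEL_RE.search(body)
            current = found.group(1) if found else None
            if current is not None:
                labels.append(current)
                proof_refs[current] = []
        elif current is not None:
            proof_refs[current].extend(REF_RE.findall(body))
            current = None
    edges = []
    for label in labels:
        for target in proof_refs[label]:
            edge = (label, target)
            # Keep only references to other theorems, without duplicates.
            if target in proof_refs and target != label and edge not in edges:
                edges.append(edge)
    return labels, edges


def longest_path(labels, edges):
    """Longest path in an acyclic graph, found by memoized depth-first search."""
    successors = {label: [] for label in labels}
    for a, b in edges:
        successors[a].append(b)
    best = {}

    def walk(node):
        if node not in best:
            tails = [walk(nxt) for nxt in successors[node]]
            best[node] = [node] + max(tails, key=len, default=[])
        return best[node]

    return max((walk(label) for label in labels), key=len, default=[])


def to_dot(labels, edges):
    """Render the graph as DOT text, colouring the longest path red."""
    path = longest_path(labels, edges)
    red = set(zip(path, path[1:]))
    lines = ["digraph theorems {"]
    lines += [f'    "{label}";' for label in labels]
    for a, b in edges:
        attr = " [color=red]" if (a, b) in red else ""
        lines.append(f'    "{a}" -> "{b}"{attr};')
    lines.append("}")
    return "\n".join(lines)
```

Writing the string returned by `to_dot` to a file with a .dot extension gives something Graphviz can render. A circular chain of references would make `longest_path` recurse without end, so a real tool would need cycle detection.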

Treba is a command-line tool for training, decoding, and calculating with weighted (probabilistic) finite state automata (WFSA/PFSA). Training algorithms include Baum-Welch (EM), Viterbi training, and Baum-Welch augmented with deterministic annealing. Treba is optimized for speed and numerical stability, and the training algorithms can run multi-threaded on hardware with multiple cores or CPUs. Forward, backward, and Viterbi decoding are supported. Automata for training or decoding are read from a text file, or can be generated with random or uniform transition probabilities in one of several topologies: ergodic (fully connected), Bakis (left-to-right), or deterministic. Observations used for training or decoding are read from text files compatible with the AT&T finite state tools and OpenFst.
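
To make the idea of forward decoding concrete, here is a minimal Python sketch of the forward algorithm for a probabilistic finite state automaton, computed in log space to avoid underflow on long sequences. It does not reproduce Treba's code, file formats, or command-line options. The in-memory representation is an assumption made for illustration: `transitions` maps a (state, symbol) pair to a list of (next state, probability) pairs, `finals` maps a state to its stopping probability, and state 0 is the start state. The function names are hypothetical.

```python
import math
from collections import defaultdict


def logsumexp(values):
    """Stable log(sum(exp(v))) for a non-empty list of log values."""
    peak = max(values)
    if peak == -math.inf:
        return -math.inf
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def forward_logprob(transitions, finals, observation, start=0):
    """Log probability that the automaton generates `observation` and stops.

    transitions: {(state, symbol): [(next_state, prob), ...]} with prob > 0
    finals:      {state: stopping probability}
    """
    alpha = {start: 0.0}  # log forward score of each reachable state
    for symbol in observation:
        incoming = defaultdict(list)
        for src, score in alpha.items():
            for dst, prob in transitions.get((src, symbol), []):
                incoming[dst].append(score + math.log(prob))
        # Sum over all paths that arrive in the same state.
        alpha = {dst: logsumexp(scores) for dst, scores in incoming.items()}
    ends = [
        score + math.log(finals[state])
        for state, score in alpha.items()
        if finals.get(state, 0.0) > 0.0
    ]
    return logsumexp(ends) if ends else -math.inf


# A two-state example: outgoing probabilities of each state sum to 1.
transitions = {
    (0, "a"): [(0, 0.5), (1, 0.3)],
    (1, "b"): [(1, 0.6)],
}
finals = {0: 0.2, 1: 0.4}
```

Replacing the sum over incoming paths with a maximum turns the same loop into Viterbi scoring, which is why the two decoders are usually implemented side by side.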


