The Sefira library is meant to provide well-tested and reasonably optimized implementations of some tree comparison algorithms, so that they can be tried out in practical applications. Finding the largest common subtree of two ordered labeled trees is a problem with a growing number of applications (e.g. in XML processing and computational biology), and substantial progress has been made on it in recent years. However, papers on tree comparison may not be available without a subscription to the appropriate journals, and even when they are, they generally don't contain executable code, so using the new algorithms is non-trivial. Also, there isn't one "best" algorithm yet: distance definitions differ, and even for the same edit distance, time complexity can be measured in terms of different tree features.
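To make the problem concrete, here is a minimal Python sketch of one simple variant: the largest common top-down subtree of two ordered labeled trees, where the two roots must match and the order of children is preserved. This is only one of the possible definitions mentioned above and is not taken from Sefira; the names `Node` and `common_size` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An ordered labeled tree node (hypothetical helper type)."""
    label: str
    children: list = field(default_factory=list)


def common_size(a, b):
    """Number of nodes in the largest common top-down ordered subtree
    rooted at both a and b (0 if the root labels differ)."""
    if a.label != b.label:
        return 0
    ca, cb = a.children, b.children
    # LCS-style table over the two child sequences: best[i][j] is the best
    # total obtainable from the first i children of a and the first j of b.
    best = [[0] * (len(cb) + 1) for _ in range(len(ca) + 1)]
    for i in range(1, len(ca) + 1):
        for j in range(1, len(cb) + 1):
            best[i][j] = max(
                best[i - 1][j],                      # skip a's i-th child
                best[i][j - 1],                      # skip b's j-th child
                best[i - 1][j - 1]
                + common_size(ca[i - 1], cb[j - 1]),  # pair the two children
            )
    return 1 + best[len(ca)][len(cb)]


t1 = Node("a", [Node("b", [Node("d")]), Node("c")])
t2 = Node("a", [Node("b", [Node("d"), Node("e")]), Node("x"), Node("c")])
print(common_size(t1, t2))  # 4: the nodes a, b, d and c are shared
```

Edit-distance-based comparisons, which also allow relabeling and deleting inner nodes, need more elaborate dynamic programming than this sketch.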
JDiff is a tool that performs recursive folder comparisons and file comparisons based on Myers' algorithm for finding a minimal set of differences. Building on this file comparison algorithm, it also provides a 3-way merge, which can run without a graphical interface. JDiff is released as an executable JAR file.
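For readers unfamiliar with Myers' algorithm, the following Python sketch shows its greedy core. It computes only the size of a minimal set of differences (insertions plus deletions) between two sequences, not the edit script itself, and it is not JDiff's code; the function name `myers_distance` is hypothetical.

```python
def myers_distance(a, b):
    """Length of the shortest edit script (insertions + deletions)
    turning sequence a into sequence b, via Myers' greedy method."""
    n, m = len(a), len(b)
    # v[k] holds the furthest x reached so far on diagonal k = x - y.
    v = {1: 0}
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            # Decide whether to step down (insertion) or right (deletion).
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]
            else:
                x = v[k - 1] + 1
            y = x - k
            # Follow the "snake": consume matching elements for free.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return n + m


print(myers_distance("ABCABBA", "CBABAC"))  # 5
```

A file comparison would typically call such a function with two lists of lines rather than two strings.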
Grep.pm is a much-modified fork of tcgrep. It understands context and matching from the start or end of a file (with a line count or byte count), and it features size limits and highlighting. It extends matching to boolean expressions, structuring regular expressions, or even arbitrary pieces of Perl code. It can perform basic stemming and synonym expansion in regular expressions (using expansyn). It also handles \0-lines, paragraphs, file slurping, directory recursion, and compressed files. It can act either as a Perl module or as a command-line program. Grep.xchange is a support program that takes grep or Grep.pm output as its input and applies an expression at each grep match to the files specified in that output. This expression can be arbitrary Perl that modifies, e.g., just the line of the match with s///g, or that operates against the current pos() position in the whole file. Grep.xchange --modified goes one step further and replaces the matched lines with the (edited) text from the grep output. Changes are logged in diff -u format and can be revoked or redone with patch.
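The idea of matching with context can be illustrated with a short Python sketch. It does not reproduce Grep.pm's interface or options; the function name `grep_context` and its `before` and `after` parameters are hypothetical and chosen only for this example.

```python
import re


def grep_context(lines, pattern, before=0, after=0):
    """Return (line_number, text) pairs for every line matching pattern,
    plus the requested number of context lines around each match."""
    rx = re.compile(pattern)
    keep = set()
    for i, line in enumerate(lines):
        if rx.search(line):
            # Clamp the context window to the bounds of the input.
            keep.update(range(max(0, i - before),
                              min(len(lines), i + after + 1)))
    # Line numbers are reported 1-based, as grep -n does.
    return [(i + 1, lines[i]) for i in sorted(keep)]


log = ["start", "load config", "ERROR: disk full", "retry", "done"]
for number, text in grep_context(log, r"ERROR", before=1, after=1):
    print(number, text)
```

Overlapping context windows are merged because the selected line indices are collected in a set before being printed in order.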
Oxygen XML Diff is a complete solution for comparing and merging XML files. It offers both directory and file comparison, six different diff algorithms, and multiple levels of comparison. The comparison can also be performed inside ZIP-based archives (ZIP, JAR, ODF, OOXML). Oxygen XML Diff includes two specialized XML-aware comparison algorithms: XML Accurate, tuned for precise comparison, and XML Fast, tuned for speed at the expense of some accuracy.
DiffKit is both an application and a framework for comparing two tables of data field by field. The tables can come from any of a number of sources, such as an RDBMS or a CSV file, and DiffKit is able to mix different kinds of sources in the same diff operation. It is like the Unix diff utility, but for tables instead of lines of text. Diffs can be reported at both the row and field level, and the user can configure what to compare, how to compare it, and what to ignore. DiffKit is highly customizable with respect to the sources of tabular data, the details of the comparison, and the characteristics of the output (diff report).
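A row-level and field-level table diff of the kind described above can be sketched in a few lines of Python. This is an illustration of the concept only, not DiffKit's API or configuration format; `diff_tables`, its `key` and `ignore` parameters, and the sample data are all hypothetical.

```python
import csv
import io


def diff_tables(left, right, key, ignore=()):
    """Compare two tables (iterables of dicts) that share a key column.
    Returns (row_diffs, field_diffs)."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    row_diffs, field_diffs = [], []
    for k in sorted(left_by_key.keys() | right_by_key.keys()):
        l, r = left_by_key.get(k), right_by_key.get(k)
        if r is None:
            row_diffs.append(("left_only", k))
        elif l is None:
            row_diffs.append(("right_only", k))
        else:
            # Both sides have the row: compare it field by field.
            for name, value in l.items():
                if name == key or name in ignore:
                    continue
                if value != r.get(name):
                    field_diffs.append((k, name, value, r.get(name)))
    return row_diffs, field_diffs


# One table comes from CSV text, the other from in-memory records.
csv_text = "id,name,price\n1,apple,1.00\n2,pear,2.50\n3,plum,0.75\n"
left = list(csv.DictReader(io.StringIO(csv_text)))
right = [
    {"id": "1", "name": "apple", "price": "1.00"},
    {"id": "2", "name": "pear", "price": "2.75"},
    {"id": "4", "name": "fig", "price": "3.00"},
]
print(diff_tables(left, right, key="id"))
```

Passing `ignore=("price",)` in this example would suppress the single field-level difference while still reporting the rows that exist on only one side.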
This program is especially useful when the main difference between the two LaTeX files lies in permuted sections, moved equations, etc. Given two LaTeX files, it prepares two annotated LaTeX files showing which equations of the first file are identical, close, or similar to which equations of the second file. It does the same with the pieces of text. It also saves the snippets of text and equations from each file into separate files in two new folders, allowing the user to inspect the differences with a diff program.
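The matching step described above might look roughly like the following Python sketch. It is not the program's actual method: the use of `difflib.SequenceMatcher`, the restriction to `equation` environments, the threshold values, and all function names are assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Assumption: only plain equation environments are considered.
EQUATION = re.compile(r"\\begin\{equation\}(.*?)\\end\{equation\}", re.DOTALL)


def equations(tex):
    """Extract equation bodies with whitespace normalized."""
    return [" ".join(body.split()) for body in EQUATION.findall(tex)]


def similarity(a, b):
    return SequenceMatcher(None, a, b).ratio()


def classify(a, b, close=0.9, similar=0.6):
    """Label a pair of equations; the thresholds are arbitrary assumptions."""
    if a == b:
        return "identical"
    score = similarity(a, b)
    if score >= close:
        return "close"
    if score >= similar:
        return "similar"
    return None


def match_equations(tex1, tex2):
    """For each equation of tex1, find its best counterpart in tex2.
    Returns (index_in_first, index_in_second, label) triples."""
    second = equations(tex2)
    matches = []
    for i, eq in enumerate(equations(tex1)):
        if not second:
            break
        j = max(range(len(second)), key=lambda idx: similarity(eq, second[idx]))
        label = classify(eq, second[j])
        if label is not None:
            matches.append((i, j, label))
    return matches
```

Because equations are matched by content rather than by position, two files that contain the same equations in a different order are reported as fully matching, which is the case a line-based diff handles poorly.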