4 projects tagged "diff"
The Sefira library is meant to provide well-tested and reasonably optimized implementations of some tree comparison algorithms, so that they can be tried out in practical applications. Finding the largest common subtree of two ordered labeled trees is a problem with a growing number of applications (e.g. in XML processing and computational biology), on which substantial progress has been made in recent years. However, papers on tree comparison may not be available without a subscription to the relevant journals, and even when they are, they generally don't contain executable code, so using the new algorithms is non-trivial. Also, there isn't one "best" algorithm yet. Distance definitions differ, and even for the same edit distance, time complexity can be measured in terms of different tree features.
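To make the problem statement concrete, here is a minimal Python sketch of one simple variant: the size of the largest common top-down subtree of two ordered labeled trees, where both roots must match and the left-to-right order of children is preserved. This is not Sefira's code or API; the `Node` class and `common_size` function are hypothetical names chosen for illustration, and the algorithms in the literature use other definitions and are considerably more refined.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An ordered labeled tree node (hypothetical helper type)."""
    label: str
    children: list = field(default_factory=list)


def common_size(a, b):
    """Size of the largest common top-down ordered subtree rooted at a and b.

    Returns 0 when the root labels differ. Otherwise the roots are matched
    and the children are aligned with an LCS-style dynamic program whose
    weights are the recursive match sizes, so child order is preserved.
    """
    if a.label != b.label:
        return 0
    m, n = len(a.children), len(b.children)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            paired = dp[i - 1][j - 1] + common_size(a.children[i - 1],
                                                    b.children[j - 1])
            dp[i][j] = max(dp[i - 1][j], dp[i][j - 1], paired)
    return 1 + dp[m][n]


# Two small trees: a(b, c(d)) and a(c(d), e).
t1 = Node("a", [Node("b"), Node("c", [Node("d")])])
t2 = Node("a", [Node("c", [Node("d")]), Node("e")])
print(common_size(t1, t2))  # the common part is a -> c -> d
```

The sketch recomputes subproblems and is meant only to illustrate the definition, not to be efficient.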
Grep.pm is a much-modified fork of tcgrep. It understands context, matching from the start or end of a file (with a line count or byte count), and features size limits and highlighting. It extends matching to boolean expressions, structuring regular expressions, or even arbitrary pieces of Perl code. It can perform basic stemming and synonym-expansion in regular expressions (using expansyn). It also handles \0-lines, paragraphs, file slurping, directory recursion, and compressed files. It can act either as a Perl module or a command-line program. Grep.xchange is a support program that takes grep or Grep.pm output as its input and applies an expression at each grep match to the files specified in that output. This expression can be arbitrary Perl that modifies, e.g., just the line of the match with s///g, or that operates against the current pos() position in the whole file. Grep.xchange --modified goes one step further and replaces the matched lines with the (edited) text from the grep output. Changes are logged in diff -u format and can be revoked/redone with patch.
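The idea of applying a substitution to matching lines and logging the change in unified diff format can be sketched in Python with the standard `re` and `difflib` modules. This is only an illustration of the concept, not how Grep.xchange is implemented (it is a Perl program); the function name `substitute_and_log` and the file name `example.txt` are assumptions made for the example.

```python
import difflib
import re


def substitute_and_log(lines, pattern, replacement, filename="example.txt"):
    """Apply a regex substitution to each line and return (new_lines, diff).

    Lines that do not match the pattern are left unchanged, so only the
    matched lines show up in the unified diff.
    """
    new_lines = [re.sub(pattern, replacement, line) for line in lines]
    diff = list(difflib.unified_diff(
        lines, new_lines,
        fromfile=filename, tofile=filename + " (edited)",
        lineterm="",
    ))
    return new_lines, diff


original = ["foo bar", "baz", "foo foo"]
edited, log = substitute_and_log(original, r"foo", "qux")
print("\n".join(log))
```

A diff meant to be applied with patch would be generated from lines that keep their trailing newlines; they are omitted here to keep the printed output compact.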
This program is especially useful when the main differences between two LaTeX files are permuted sections, moved equations, etc. Given two LaTeX files, it prepares two annotated LaTeX files showing which equations of the first file are identical, close, or similar to which equations of the second file. It does the same with the pieces of text. It also saves the snippets of text and the equations from each file into separate files in two new folders, allowing the user to see the differences using a diff program.
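The identical/close/similar classification of equations could be approximated along the following lines. This is a rough sketch using Python's `difflib.SequenceMatcher`, not the program's actual method; the function names and the thresholds 0.9 and 0.6 are assumptions chosen purely for illustration.

```python
import difflib


def classify(eq_a, eq_b, close=0.9, similar=0.6):
    """Label a pair of equation strings by character-level similarity.

    The thresholds are illustrative assumptions, not values taken from
    the program described above.
    """
    if eq_a == eq_b:
        return "identical"
    ratio = difflib.SequenceMatcher(None, eq_a, eq_b).ratio()
    if ratio >= close:
        return "close"
    if ratio >= similar:
        return "similar"
    return "different"


def best_match(equation, candidates):
    """Return (index, label) of the candidate most similar to `equation`."""
    scores = [difflib.SequenceMatcher(None, equation, c).ratio()
              for c in candidates]
    index = max(range(len(candidates)), key=scores.__getitem__)
    return index, classify(equation, candidates[index])


# Equations from the second file may appear in a different order.
second_file = [r"\nabla \cdot B = 0", "E = m c^2"]
print(best_match("E = mc^2", second_file))
```

Because each equation is compared against every candidate regardless of position, a match is still found when sections or equations have been moved.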