similarity-utils


similarity-utils is a pair of programs that each give a quantitative measure of how similar two files are, on a scale from 0 to 1. similarity_by_diff bases its score on the number of difference lines reported by diff(1), while similarity_by_zlib compresses the two files, both separately and together, and compares the results.
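
The exact formulas the two programs use are not given here, so the following Python sketch only illustrates the two ideas under stated assumptions. The function names `similarity_by_diff_sketch` and `similarity_by_zlib_sketch` are hypothetical, Python's `difflib` stands in for diff(1), and the compression score uses a normalized-compression-distance style ratio, which may differ from what similarity_by_zlib actually computes.

```python
import difflib
import zlib


def similarity_by_diff_sketch(a_lines, b_lines):
    """Fraction of lines that a line-based diff leaves unchanged.

    Assumption: score = 1 - (added + removed lines) / (total lines).
    difflib is used here in place of diff(1).
    """
    total = len(a_lines) + len(b_lines)
    if total == 0:
        return 1.0  # two empty inputs are treated as identical
    changed = sum(
        1
        for line in difflib.ndiff(a_lines, b_lines)
        if line.startswith(("- ", "+ "))  # removed or added lines only
    )
    return 1.0 - changed / total


def similarity_by_zlib_sketch(a, b):
    """Compression-based similarity of two byte strings, clamped to [0, 1].

    Assumption: 1 minus the normalized compression distance, computed
    from the compressed sizes of a, b, and their concatenation a + b.
    """
    size_a = len(zlib.compress(a, 9))
    size_b = len(zlib.compress(b, 9))
    size_ab = len(zlib.compress(a + b, 9))
    distance = (size_ab - min(size_a, size_b)) / max(size_a, size_b)
    return min(1.0, max(0.0, 1.0 - distance))


if __name__ == "__main__":
    old = ["alpha\n", "beta\n"]
    new = ["alpha\n", "gamma\n"]
    print(similarity_by_diff_sketch(old, new))  # prints 0.5
    print(similarity_by_zlib_sketch(b"".join(old), b"".join(new)))
```

On very short inputs the zlib-based score is dominated by fixed compression overhead, so it is more informative on files of at least a few kilobytes. Because zlib only looks back 32 KiB, the concatenation trick also stops detecting shared content in much larger files.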

Recent releases

  •  07 Nov 2008 15:05

    Release Notes: similarity_by_diff now works with filenames containing spaces.

  •  23 Oct 2008 13:57

    Release Notes: The programs were fixed for recent coreutils changes.

  •  31 Aug 2003 14:05

    No changes have been submitted for this release.

