
fdupe

fdupe is a small Perl script that recursively scans directories and reports duplicate files. It compares file contents, not file names. It uses no Perl modules and is very fast.
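To illustrate what content-based duplicate detection involves, here is a minimal Python sketch. It is not fdupe's code, and nothing on this page says which strategy fdupe uses internally. The sketch assumes a common two-pass approach: group files by size first, then hash only the files that share a size. The names `file_digest` and `find_duplicates` are made up for this example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return a sorted list of groups; each group lists paths with equal content."""
    # Pass 1: group regular files by size. Files of different size cannot match.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and anything that is not a regular file
            by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share their size with at least one other file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return sorted(groups)
```

Grouping by size first means that a file with a unique size is never read at all, which matters on large trees. Equal digests are treated as equal content here; a stricter tool could follow up with a byte-by-byte comparison.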


Recent releases

  •  19 Oct 2011 22:52

    Release Notes: A list of files can now be accepted from stdin.

  •  09 Sep 2011 19:41

    Release Notes: This release skips checking of hard links. This significantly shortens the runtime on file systems containing many hard linked files.

  •  10 Apr 2004 20:35

    No changes have been submitted for this release.
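The two release notes above (a file list on stdin, and skipping hard links) can be sketched in Python as follows. This is an illustration under stated assumptions, not fdupe's implementation: it assumes one path per input line, and it treats two paths as hard links to the same file when they share a device and inode number. The names `read_paths` and `drop_hard_links` are hypothetical.

```python
import os
import sys


def read_paths(stream):
    """Yield one path per non-empty line of a text stream (assumed input format)."""
    for line in stream:
        path = line.rstrip("\n")
        if path:
            yield path


def drop_hard_links(paths):
    """Keep only the first path seen for each (device, inode) pair."""
    seen = set()
    unique = []
    for path in paths:
        st = os.lstat(path)
        key = (st.st_dev, st.st_ino)
        if key in seen:
            continue  # another name for a file that is already in the list
        seen.add(key)
        unique.append(path)
    return unique


def main(stream=sys.stdin):
    """Print the input paths with hard-linked repeats removed."""
    for path in drop_hard_links(read_paths(stream)):
        print(path)
```

Paths that are hard links to one another are the same data on disk, so comparing their contents is wasted work; dropping them up front is why such a check can shorten the runtime. Saved under a hypothetical name such as `unique_inodes.py` with a call to `main()` at the end, the sketch could be fed from `find DIR -type f`.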

Recent comments

        20 Oct 2011 23:32 hroptatyr

        sorry, wrong numbers :\
        freundt@gonzo:pts/6:~> find /data/data-source/ice/archive/2011 -type f | wc -l
        212374
        freundt@gonzo:pts/6:~> du -sm /data/data-source/ice/archive/2011
        20158

        I moved a .git directory into it earlier and forgot to move it out again before counting. Still, the timings were definitely measured on the directory as it is now.

        20 Oct 2011 23:26 hroptatyr

        It's an ftp mirror that is known to have tens of thousands of duplicates.
        freundt@gonzo:pts/6:~> find /data/data-source/ice/archive/2011 -type f | wc -l
        866949
        freundt@gonzo:pts/6:~> du -sm /data/data-source/ice/archive/2011
        49114

        IMPORTANT: The directory is on nfs4 and there are NO subdirectories.

        20 Oct 2011 21:53 xmo

        Thanks for the comment, hroptatyr.
        Unfortunately, I can't reproduce your results.

        Could anybody do some reproducible benchmarks?

        20 Oct 2011 19:01 hroptatyr

        Disappointing, way too slow for practical purposes :(
        I recommend fdupes (http://premium.caribe.net/~adrian2/fdupes.html).

        fdupes archive/2011 27.50s user 38.93s system 23% cpu 4:41.45 total
        fdupe.pl archive/2011 259.65s user 289.21s system 49% cpu 18:31.07 total
