
fdupe

fdupe is a small Perl script that recursively scans directories and finds duplicate files. It compares file contents, not file names. It uses no Perl modules and is very fast.
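As an illustration of content-based duplicate detection, here is a minimal Python sketch. It is not fdupe's actual code: the function names (`file_digest`, `find_duplicates`), the size pre-filter, and the choice of SHA-256 are all assumptions made for this example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return a sorted list of groups of paths with identical contents."""
    # Step 1 (assumed optimization): group files by size, since files
    # of different sizes can never have the same contents.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only regular files; symbolic links are ignored here.
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Step 2: within each size group, group files by content digest.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return sorted(groups)
```

Calling `find_duplicates("archive/2011")` would return one list of paths per set of identical files. File names play no part in the comparison; only sizes and content digests are used.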


Recent releases

  •  20 Oct 2011 01:42

Release Notes: A list of files can now be accepted from stdin.

Release Notes: This release skips checking of hard links. This significantly shortens the runtime on file systems containing many hard linked files.
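One common way to skip hard links is to remember each file's device and inode number and ignore any path that points at an inode already seen. The Python sketch below shows the idea. It is an assumption about how such a check could work, not fdupe's implementation, and `unique_inodes` is a hypothetical helper name.

```python
import os


def unique_inodes(paths):
    """Yield one path per underlying file, skipping additional hard links."""
    seen = set()
    for path in paths:
        st = os.lstat(path)
        # Hard links to the same file share the same (device, inode) pair.
        key = (st.st_dev, st.st_ino)
        if key in seen:
            continue
        seen.add(key)
        yield path
```

Hard-linked paths share the same data on disk, so comparing their contents is wasted work. Filtering the path list this way before any comparison avoids reading the same data more than once.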

  •  10 Apr 2004 13:35

No changes have been submitted for this release.

Recent comments

20 Oct 2011 23:32 hroptatyr Thumbs down

sorry, wrong numbers :\
freundt@gonzo:pts/6:~> find /data/data-source/ice/archive/2011 -type f | wc -l
212374
freundt@gonzo:pts/6:~> du -sm /data/data-source/ice/archive/2011
20158

I had moved a .git directory into it earlier and forgot to move it out again before counting. Still, the timings were definitely taken on the directory as it is now.

20 Oct 2011 23:26 hroptatyr Thumbs down

It's an ftp mirror that is known to have tens of thousands of duplicates.
freundt@gonzo:pts/6:~> find /data/data-source/ice/archive/2011 -type f | wc -l
866949
freundt@gonzo:pts/6:~> du -sm /data/data-source/ice/archive/2011
49114

IMPORTANT: The directory is on nfs4 and there are NO subdirectories.

20 Oct 2011 21:53 xmo

Thanks for the comment, hroptatyr.
Unfortunately, I can't reproduce your results.

Could anybody do some reproducible benchmarks?

20 Oct 2011 19:01 hroptatyr Thumbs down

Disappointing, way too slow for practical purposes :(
I recommend fdupes (premium.caribe.net/~ad...).

fdupes archive/2011 27.50s user 38.93s system 23% cpu 4:41.45 total
fdupe.pl archive/2011 259.65s user 289.21s system 49% cpu 18:31.07 total
