Comments for fdupe

20 Oct 2011 23:32 hroptatyr

sorry, wrong numbers :\
freundt@gonzo:pts/6:~> find /data/data-source/ice/archive/2011 -type f | wc -l
212374
freundt@gonzo:pts/6:~> du -sm /data/data-source/ice/archive/2011
20158

I had moved a .git directory into it earlier and forgot to move it out again before counting. Still, the timings were definitely measured against the directory as it is now.

20 Oct 2011 23:26 hroptatyr

It's an ftp mirror that is known to have tens of thousands of duplicates.
freundt@gonzo:pts/6:~> find /data/data-source/ice/archive/2011 -type f | wc -l
866949
freundt@gonzo:pts/6:~> du -sm /data/data-source/ice/archive/2011
49114

IMPORTANT: The directory is on nfs4 and there are NO subdirectories.

20 Oct 2011 21:53 xmo

Thanks for the comment hroptatyr,
unfortunately I can't reproduce your results.

Could anybody do some reproducible benchmarks?
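
One way such a benchmark could be made repeatable is sketched below, as an illustration only and not something either tool ships. It runs a command several times and records wall-clock, user and system time for each run, so that a first run against a cold cache can be told apart from later ones. The script name bench.py and the helpers time_command and summarize are made up for this sketch; child CPU times come from os.times() and are only meaningful on Unix-like systems.

```python
import os
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times; return a list of (wall, user, system) seconds."""
    results = []
    for _ in range(runs):
        before = os.times()
        start = time.perf_counter()
        # Discard the tool's output so terminal speed does not skew timings.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
        wall = time.perf_counter() - start
        after = os.times()
        results.append((
            wall,
            # CPU time accumulated by finished child processes during this run.
            after.children_user - before.children_user,
            after.children_system - before.children_system,
        ))
    return results


def summarize(results):
    """Median wall-clock time across runs."""
    return statistics.median(r[0] for r in results)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python bench.py fdupes archive/2011
    timings = time_command(sys.argv[1:])
    for wall, user, system in timings:
        print(f"{user:.2f}s user {system:.2f}s system {wall:.2f}s total")
    print(f"median wall: {summarize(timings):.2f}s")
```

Reporting every run rather than a single number matters here because the first pass over a directory usually has to fetch data that later passes find in the cache, and on a network filesystem that difference can dominate.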

20 Oct 2011 19:01 hroptatyr

Disappointing, way too slow for practical purposes :(
I recommend fdupes (http://premium.caribe.net/~adrian2/fdupes.html).

fdupes archive/2011 27.50s user 38.93s system 23% cpu 4:41.45 total
fdupe.pl archive/2011 259.65s user 289.21s system 49% cpu 18:31.07 total
