Gfarm is a distributed filesystem, generally used for large-scale cluster computing. It is implemented in userland and can be mounted via FUSE. It exploits file locality when choosing which data node to access, and supports Globus GSI for use over wide area networks. Users can explicitly control where file replicas are placed. Gfarm can serve as a storage backend for Hadoop (as an alternative to HDFS), Samba, MPI-IO, and GridFTP. Monitoring via ZABBIX and Ganglia is also supported.
Moose File System (MooseFS / MFS) is a fault-tolerant, network-distributed file system. It spreads data over several physical servers, which are visible to the user as one resource. For standard file operations, MooseFS mounted with FUSE acts like other Unix-like file systems: it has a hierarchical structure; it stores POSIX file attributes; and it supports special files, symbolic links, and hard links. Access to the file system can be limited based on IP address and/or password. It offers high reliability, since several copies of the data can be stored across separate computers. Capacity is dynamically expandable by attaching new computers or disks. Deleted files are retained for a configurable period of time (a file-system-level "trash bin"). MooseFS supports coherent snapshots of files, even while a file is being written or accessed.
Deduplicator is a simple and efficient data deduplicator that works by hard linking files that have the same content. It is ideal for reducing the size of backups. It can save and restore intermediate results, so you can run it in several short sessions, and it lets you review changes before they are committed to disk.
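
The hard-linking approach described above can be illustrated with a short Python sketch. This is not Deduplicator's actual code: the function names (`file_digest`, `plan_hardlinks`, `apply_plan`) are hypothetical, and the sketch assumes that all files live on one filesystem, that a matching SHA-256 digest is an acceptable stand-in for a byte-by-byte comparison, and that losing the duplicate's own ownership and permissions is acceptable. Planning and applying are kept separate so that the proposed links can be reviewed before anything on disk changes.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def plan_hardlinks(root):
    """Return (keep, duplicate) pairs under root; nothing on disk is modified."""
    # Group by size first so that only same-sized files need to be hashed.
    by_size = defaultdict(list)
    for dirpath, dirs, files in os.walk(root):
        dirs.sort()  # deterministic traversal order
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            by_size[os.path.getsize(path)].append(path)

    plan = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        for group in by_hash.values():
            keep = group[0]
            for dup in group[1:]:
                # Skip pairs that already share an inode.
                if not os.path.samefile(keep, dup):
                    plan.append((keep, dup))
    return plan


def apply_plan(plan):
    """Replace each duplicate with a hard link to the file being kept."""
    for keep, dup in plan:
        tmp = dup + ".dedup-tmp"  # hypothetical temporary name
        os.link(keep, tmp)        # create the new link next to the duplicate
        os.replace(tmp, dup)      # atomically swap it in place of the duplicate
```

A caller would typically print or inspect the list returned by `plan_hardlinks` and only then pass it to `apply_plan`. Because the plan is a plain list of path pairs, it could also be written to a file and reloaded later, which is one way the interrupted, multi-session operation mentioned above might be approximated.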