
multi-rsync

mrsync transfers whole files from one master machine to many remote machines on a LAN, using the multicast capability of UDP sockets. It has congestion control so that it does not jam the network. Transferring about 140GB to 100 targets on a 1Gbit LAN takes roughly 4 hours.
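The transfer runs over UDP multicast: the master sends each file page once to a multicast group and every target picks it up from the same datagram. A minimal sender-side sketch of that setup (illustrative only, not mrsync's actual code; the group address, port, and page size below are placeholders):

    # Sketch only: send one file page to a multicast group over UDP.
    # Group address, port, and page size are placeholders, not mrsync's values.
    import socket

    MCAST_GROUP = "239.0.0.1"
    MCAST_PORT = 4321
    PAGE_SIZE = 8192

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Keep the datagrams on the local LAN (TTL 1).
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)

    page = b"\x00" * PAGE_SIZE          # one file page (dummy payload)
    sock.sendto(page, (MCAST_GROUP, MCAST_PORT))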


Recent releases

  •  19 Feb 2009 21:06

No changes have been submitted for this release.

  •  28 Oct 2008 18:49

Release Notes: The mechanism by which target machines report back missing pages has been revised. Previously, missing pages were reported one page at a time; they are now reported in one or a few packets. As a consequence, overall network traffic is reduced and sync performance is visibly improved.
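As a rough illustration of the batched report (a sketch only; this is not mrsync's actual wire format, and the field layout is assumed), a target could pack all of its missing page numbers into one datagram like this:

    # Sketch only: report all missing page numbers in one packet instead of
    # one packet per page. The field layout here is assumed, not mrsync's.
    import socket
    import struct

    def build_missing_report(missing_pages):
        # 32-bit count followed by one 32-bit page number per entry.
        header = struct.pack("!I", len(missing_pages))
        body = struct.pack("!%dI" % len(missing_pages), *missing_pages)
        return header + body

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    report = build_missing_report([17, 42, 43, 44, 1023])
    sock.sendto(report, ("master.example", 4322))   # placeholder master address/port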

  •  30 Jul 2006 15:34

Release Notes: In addition to fixing some bugs, this release mainly improves the process of changing monitors and enhances the handshake during syncing. Daily syncing with 4 concurrent mrsync sessions shows a visible improvement in reliability.

  •  18 May 2006 14:54

Release Notes: Large file support. Platform independence (between Linux and Unix). Dynamic catching for slow machines. Removal of meta-file-info. Multicast IP address and port options. Multiple mrsync sessions may be run simultaneously. Code cleanup. Code fixes for 64-bit architectures, tested on 64-bit Debian. A fix for a logic flaw that had caused premature machine dropout. Code provision for IPv6 (not tested yet). Minor bugfixes. multicatcher now checks whether the system is ready for writing.
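The multicast address and port options are what make running several sessions side by side practical: each session gets its own group/port pair. A receiver-side sketch of joining one such group (placeholders throughout, not mrsync's actual code):

    # Sketch only: a target joins one session's multicast group. Different
    # group/port pairs give independent, concurrent sessions. Addresses are
    # placeholders, not mrsync defaults.
    import socket

    def join_group(group, port):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        # ip_mreq: group address followed by the local interface (any).
        mreq = socket.inet_aton(group) + socket.inet_aton("0.0.0.0")
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
        return sock

    rx_a = join_group("239.0.0.1", 4321)    # session A
    rx_b = join_group("239.0.0.2", 4322)    # session B, running alongside A
    data, sender = rx_a.recvfrom(65535)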

  •  08 Jun 2005 11:28

Release Notes: Congestion control has been implemented so that mrsync is more net-traffic-friendly. There is evidence that this actually improves the performance. A Python script is used as the glue to put everything together, i.e. mrsync.c becomes mrsync.py. Other changes collected over the years through interactions with users include an MCAST_ADDRESS option, a verbose control option, and replacing memory mapped file I/O with the usual seek() and write() sequence.
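For the I/O change, each received page is simply written at its offset with a seek() followed by a write(), roughly as in the sketch below (illustrative only; the page size and file handling are assumed):

    # Sketch only: write one received page at its file offset using the
    # plain seek()/write() path that replaced memory-mapped I/O.
    import os

    PAGE_SIZE = 8192    # placeholder; mrsync's actual page size may differ

    def write_page(fd, page_number, payload):
        os.lseek(fd, page_number * PAGE_SIZE, os.SEEK_SET)
        os.write(fd, payload)

    fd = os.open("target.file", os.O_WRONLY | os.O_CREAT, 0o644)
    write_page(fd, 42, b"\x00" * PAGE_SIZE)
    os.close(fd)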

Recent comments

30 May 2005 16:07 hwei

upcoming mrsync version 2.0
A new feature has been added, tested, and is in production use. Specifically, network congestion control has been implemented, which makes the multicaster more network friendly. The resulting code actually delivers data a little faster than code with the congestion control turned off.

Other changes:
* A Python script (mrsync.py, replacing mrsync.c) is used to set things up.
* rtt now measures the real round-trip time distribution.
* The multicast IP address can be set on the command line (per Robert Dack).
* Memory-mapped file I/O has been replaced with the usual seek() and write() sequence. This change was echoed by Clint Byrum.
* Verbose control has been added so that by default mrsync prints only essential info instead of a detailed status report. This was suggested by Clint Byrum.

--------------------
Currently, the code is waiting for clearance from our company before I can put it on Freshmeat.

If you want to take a look at it, drop me an email. :)
HP

09 May 2005 17:11 spamaps

Re: Mrsync implementation

> www.adicio.com/mrsync-...

We did some more mods..

www.adicio.com/mrsync-...

This patch takes mmap out and also reduces the output to a more suitable level when you need to see errors, but not status messages. ;)

16 Apr 2005 12:37 spamaps

Re: Mrsync implementation

> Overall, the performance is not all that bad: sending 100GIG to about
> 300 nodes equipped with 100MB connections takes about 8 hours. Dropping
> to a small number of nodes, 15 GIG goes to 6-10 nodes in about 25 minutes.

We're doing it with gigabit. We've seen a much better rate, though we're copying a lot less data. Also, our focus is on doing this in as short a time period as possible. We have 15 boxes on a gigabit LAN and about 1.6GB to transfer. It takes about 7 minutes whether we send to just 1 of them or to all 15.

One thing that I did was patch multicatcher to not use mmap. We're transferring some larger files, and mmaping 500MB or more on all our nodes was not efficient for the other processes running on them. I've put the patch up here for anyone who might need it, until "HP" can incorporate it into the baseline mrsync:

www.adicio.com/mrsync-...

18 Mar 2004 17:27 rdack

Mrsync implementation
We've been using Mrsync for about 1 year to move blocks of data around our clusters. I'm quite pleased with the utility, although it could use some additional features.

We found early on that the multicaster needed to be on the same subnet as the receiving hosts. For this purpose, we adapted the code to allow a user-defined interface for hosts that sit on more than one subnet.
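For reference, pinning multicast traffic to one interface on a multi-homed host usually comes down to a single socket option; a sketch of the idea (the interface address is a placeholder, and this is not the actual patch):

    # Sketch only: force outgoing multicast onto one interface of a
    # multi-homed host. The interface address is a placeholder.
    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                    socket.inet_aton("192.168.1.10"))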

Currently, we find that the software needs the ability to run multiple multicasts simultaneously. This will require the multicast address, and perhaps the port, to change for each running multicast.

Overall, the performance is not all that bad: sending 100GIG to about 300 nodes equipped with 100MB connections takes about 8 hours. Dropping to a small number of nodes, 15 GIG goes to 6-10 nodes in about 25 minutes.
