Using Rsync on Windows

rsync is an application for replicating a source directory to a destination directory, i.e. after running rsync the contents of the source and destination directories will be identical.

There are various applications for doing this, such as robocopy, xxcopy and Second Copy. Or indeed there is reconcile, written by an author not far from here. Some of these are free; others are not.

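However rsync is especially suitable for use over wide area networks because it works in client-server mode, it transfers only the parts of files that have changed, and it can compress the data as it is sent.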

See this article for the gory details.

The description client-server may raise some trepidation amongst Windows sysadmins, as such apps have a bad reputation for making ill-documented changes to the registry and scattering library files around the disk. However rsync needs no installing, makes no changes to the registry and consists of one executable and three dlls, which can be placed in whatever directory is convenient. On the other hand, like many unix apps it has a formidably complicated command line syntax, and getting the server side config right can be a struggle. The point of this article is to go through setting up rsync. Like so many things, once you get the knack of it you'll realise it's a lot simpler than it seems.

In this article I'm only going to be using rsync in client-server mode. It can also be used as a standalone app to replicate from one folder to another, but this only works well on a LAN. My focus is on replication over wide area networks and for this client-server mode is essential.

Downloading rsync

NB these notes apply to rsync version 3.0.4. There shouldn't be any changes to the way rsync works in later versions, but every now and then they change the support libraries needed. For example earlier versions didn't need cygiconv-2.dll, and future versions might need a library not described here.

rsync is part of the Cygwin applications and the latest version can be downloaded from http://www.cygwin.com/. Run the setup.exe from the home page and it will take you through the installation wizard. My preference is to choose the install option Download Without Installing, as you just want a copy of the files. Just keep clicking Next through the setup wizard until you get to the Select Packages page.

At this point you need to select rsync, which you'll find under the heading Net.

Carry on clicking Next and the setup will download a number of packages, of which you only need four: the ones containing the files listed below.

Extract these using the unzipper of your choice and place the four files together in a convenient directory:

rsync.exe
This is the rsync program
 
cygwin1.dll
This is the (in)famous Cygwin dll that implements the unix system functions in Windows
 
cygpopt-0.dll
This is a library used by rsync for parsing the command line
 
cygiconv-2.dll
GNU character set conversion library and utilities
 

If you open a command prompt and cd your way to wherever you placed the four files, you should be able to type rsync --help and have it spit out a long list of all the command line options. Assuming this works the next step is to configure rsync to act as a server.

An aside: the .bz2 compression isn't supported by my copy of WinZip. You can get a Windows decompressor for it at http://www.bzip.org/. Once you've decompressed the .bz2 file WinZip can open the .tar file.

Setting up the server

Unlike a Windows service there's no special setup necessary to make rsync run as a server. You just create a config file (traditionally called rsyncd.conf) and run:

rsync --config rsyncd.conf --daemon

Nothing will appear to happen, but if you open the Task Manager you'll see you now have a detached process rsync.exe running, and if you run netstat -a you'll see your server is listening on port 873. The only way to kill the rsync.exe process is through Task Manager or by logging out, so for testing it's easier to run:

rsync --config rsyncd.conf --daemon --no-detach

This stops rsync.exe from detaching itself from the command prompt, so you can kill it with ^C like any other command line app.
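If you'd rather check from a script than read through netstat output, something like the following Python sketch would do. Python isn't part of the setup described in this article, and the function name is_listening is just my own label; all it does is try to open a TCP connection to the port and report whether that worked.

```python
import socket


def is_listening(host, port=873, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    873 is the port the rsync server listens on, as noted above.
    This only proves that something accepts connections on the port;
    it does not prove that the something is rsync.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unknown host and so on all count as "not listening".
        return False
```

For example is_listening("peach") would tell you whether the server used in the examples later in this article is reachable on port 873.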

The rsyncd.conf config file is pretty simple. A good starting point is:

use chroot = false
strict modes = false
hosts allow = *
log file = rsyncd.log
pid file = rsyncd.pid

[asecretpwd]
path = /cygdrive/d/rsync
read only = false
transfer logging = yes

[anothersecretpwd]
path = /cygdrive/d/morersync
read only = false
transfer logging = yes

When a client connects it has to supply a password, and the password has to match one of the headings in square brackets. The heading it matches determines where the data is transferred. So in the example above, if the client supplies the password asecretpwd all the files, folders and subfolders it transfers will appear in d:\rsync. Alternatively if the client supplies the password anothersecretpwd all the files, folders and subfolders it transfers will appear in d:\morersync. Strictly speaking rsync calls these headings module names rather than passwords, but as long as you keep them secret they do much the same job, so I'll carry on calling them passwords.
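As a rough illustration of how the headings map to folders, here is a minimal Python sketch that reads a config like the one above and reports which path each heading in square brackets points at. The name read_modules is made up for this example, and the parser only copes with the simple key = value layout shown here, not with everything the real rsyncd.conf format allows.

```python
def read_modules(conf_text):
    """Map each [heading] in an rsyncd.conf style text to its path setting.

    Settings that appear before the first heading are global and are skipped.
    A heading with no path line maps to None.
    """
    modules = {}
    current = None
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
            modules[current] = None
        elif current is not None and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key == "path":
                modules[current] = value
    return modules


SAMPLE_CONF = """\
use chroot = false
log file = rsyncd.log

[asecretpwd]
path = /cygdrive/d/rsync
read only = false

[anothersecretpwd]
path = /cygdrive/d/morersync
read only = false
"""
```

Calling read_modules(SAMPLE_CONF) gives a dictionary with one entry per heading, which is a quick way to double check where each one will put its files.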

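NB: the rsync password should be well protected as it's all you need to know to upload or download files. Don't leave it lying around in scripts that anyone can read, and if you are transferring data through the Internet use a VPN. If you must transfer the data directly through the Internet at least use a firewall rule to allow connections to port 873 only from known trusted IP addresses. Be aware too that by default an rsync server will list the names in square brackets to any client that asks for them (try rsync peach:: against a server called peach), so add the line list = false to each section of rsyncd.conf if you are relying on the names being secret. If you need real authentication, rsyncd.conf also has the auth users and secrets file settings.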

One final point: the user account running rsync as a server must have full control over the destination directory (i.e. d:\rsync or d:\morersync in this example). It's good practice to create an rsync user account so you can grant it access only to the directories rsync uses and to nowhere else. This prevents accidental misconfigurations overwriting or deleting data that it shouldn't. Remember that should your rsync password become compromised a hacker just has to run:

rsync -vrtz --delete /cygdrive/c/emptyfolder peach::asecretpwd

where c:\emptyfolder is an empty folder, to delete everything in the folder the [asecretpwd] header specifies (see the next section for what this command does). You have been warned! Of course this applies equally to apps like xxcopy and RoboCopy. If you have the rights to copy the files you have the rights to delete them.

So that's how to run rsync as a server. In practice you'll probably want to run it from the scheduler so you can run your transfers overnight or at some other convenient time. The last section of this document lists a batch file, StartRsync.bat, that you can run from the scheduler to set rsync running as a server. To kill the rsync server run the batch file KillRsync.bat (note that you also need the VB script KillRsync.vbs). I run KillRsync.bat at 17:00 to kill off the previous day's rsyncs, then StartRsync.bat at 17:05 to restart the server. Then I start the rsync client around 17:30 to get it finished before the tape backup starts.

Uploading files from a client to a server

Suppose you have an rsync server called peach (or indeed peach.somedomain.co.uk) and you want to copy changes from the local directories d:\data\accounts and d:\data\personnel. On the server you have a directory e:\rsync and you have a config file looking like:

[asecretpwd]
path = /cygdrive/e/rsync
read only = false
transfer logging = yes

To copy the data you just need to run the commands:

rsync -vrtz --delete /cygdrive/d/data/accounts peach::asecretpwd
rsync -vrtz --delete /cygdrive/d/data/personnel peach::asecretpwd

Remember that to an app using Cygwin the drive D: is specified as /cygdrive/d, and Cygwin uses the unix standard of / to separate directories, so d:\data\accounts is /cygdrive/d/data/accounts. Thus the first line copies the accounts subdirectory and the second does the same for the personnel subdirectory. NB the commands above copy the whole directory to the rsync server. Because the rsyncd.conf file contains path = /cygdrive/e/rsync, the destination directories are e:\rsync\accounts and e:\rsync\personnel.
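If you generate your rsync commands from a script, the path translation is easy to automate. Here's a small Python sketch of the rule just described; the function name to_cygdrive is my own invention, and it only handles ordinary drive letter paths, not UNC paths like \\server\share.

```python
import re


def to_cygdrive(windows_path):
    """Convert a drive letter path like d:\\data\\accounts to /cygdrive/d/data/accounts.

    Backslashes become forward slashes and the drive letter is lower-cased.
    A trailing slash is kept, because it is significant to rsync.
    """
    match = re.match(r"^([A-Za-z]):[\\/]*(.*)$", windows_path)
    if not match:
        raise ValueError("not a drive letter path: %r" % windows_path)
    drive, rest = match.groups()
    root = "/cygdrive/" + drive.lower()
    rest = rest.replace("\\", "/")
    return root + "/" + rest if rest else root
```

So to_cygdrive(r"d:\data\accounts") gives /cygdrive/d/data/accounts, which is the form used in the commands above.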

rsync has dozens of command line flags, but many only really apply to unix, and of those that apply to Windows only a small subset are useful for our purpose. The four flags I've specified, plus a fifth that's worth knowing about, are:

-v
verbose i.e. list all files being copied. You don't have to use this but I find it nice to see what is being copied.
 
-r
recurse into subdirectories. If you don't do this only the top level directory gets copied.
 
-t
copy timestamps. Essential really as rsync normally uses timestamps to spot which files have changed.
 
-z
compress the data being transferred. You don't have to use this but it's a good thing on a WAN or other limited bandwidth link.
 
-n
dry run. OK I haven't used this flag in the example above, but it's useful to bear in mind. The -n flag shows you what would be copied without actually copying it. It's useful for testing when you're setting up a new rsync.
 

And that's about it; it's that simple. The thing I took a while to grasp (you may see it immediately!) is that the password you use, i.e. the one defined in the rsyncd.conf file, establishes a virtual root directory for the copy (e:\rsync in this case), so rsyncing /cygdrive/d/data/accounts copies the accounts directory into the e:\rsync directory on the rsync server. You could rsync any directory into this root, for example:

rsync -vrtz --delete /cygdrive/d/data/accounts/sage/2007 peach::asecretpwd

copies d:\data\accounts\sage\2007 to e:\rsync\2007 not e:\rsync\accounts\sage\2007. However you can rsync a subsubdirectory if you want by using the syntax:

rsync -vrtz --delete /cygdrive/d/data/accounts/sage/2007 peach::asecretpwd/accounts/sage

Again the rsync side acts as a root, so rsyncing to peach::asecretpwd/accounts/sage copies the 2007 directory into e:\rsync\accounts\sage, giving e:\rsync\accounts\sage\2007.

And one final variation (last one I promise). As you've seen rsyncing /cygdrive/d/data/accounts copies the whole accounts directory. But:

rsync -vrtz --delete /cygdrive/d/data/accounts/ peach::asecretpwd

(note the trailing / on the directory name) copies the contents of the directory, not the directory itself, i.e. in this example it's equivalent to copy d:\data\accounts\* e:\rsync.
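If it helps to see the rules from this section in one place, here is a little Python model of them. It's a simplification written for this article, not something taken from rsync itself: the function name landing_directory and its arguments are my own, and it only predicts the server directory the files end up in for the kinds of command shown above.

```python
import posixpath


def landing_directory(module_path, source, dest_subpath=""):
    """Predict the server directory that receives the copied files.

    module_path  : the path = setting for the heading used, e.g. /cygdrive/e/rsync
    source       : the local source given on the rsync command line
    dest_subpath : anything after the heading, e.g. "accounts/sage"
    """
    target = module_path
    if dest_subpath:
        target = posixpath.join(module_path, dest_subpath.strip("/"))
    if source.endswith("/"):
        # Trailing slash: only the contents are copied, straight into the target.
        return target
    # No trailing slash: the directory itself is recreated inside the target.
    return posixpath.join(target, posixpath.basename(source))
```

For instance landing_directory("/cygdrive/e/rsync", "/cygdrive/d/data/accounts/sage/2007", "accounts/sage") returns /cygdrive/e/rsync/accounts/sage/2007, which matches the example above.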

And that really is it. You'll probably be using a batch file to run the rsync overnight, and there's a sample batch file at the end of this article.

Downloading files from a server to a client

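It's hardly worth having this section, as downloading files from an rsync server is done by swapping the source and the destination. There is one catch: a password on its own refers to the contents of the server's root directory for that password, so to get a directory back you name it on the server side and give its parent directory as the local destination. So suppose you have used rsync as described in the previous section to upload data e.g.:

rsync -vrtz --delete /cygdrive/d/data/accounts peach::asecretpwd

You download the data again using:

rsync -vrtz --delete peach::asecretpwd/accounts /cygdrive/d/data

This copies e:\rsync\accounts on the server back to d:\data\accounts. You can use the same variants of the arguments as described in the section on uploading. If you're downloading to recover files from a backup remember the -n flag, which shows you what would be copied without actually copying it. I strongly recommend you try -n first before you go ahead and actually do the copy, particularly as --delete removes files in the destination directory that aren't on the server.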
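One way to make the dry run a habit is to build your rsync command lines in a script that adds -n unless you explicitly tell it not to. The Python sketch below does just that. The function name build_rsync_command and its parameters are my own; the flags it emits are the ones used throughout this article.

```python
def build_rsync_command(source, dest, dry_run=True, bwlimit=None):
    """Return the rsync command as a list of arguments.

    dry_run : when True (the default) add -n so nothing is actually copied
    bwlimit : optional value for --bwlimit
    """
    flags = "-vrtz"
    if dry_run:
        flags += "n"  # short flags can be combined, so this gives -vrtzn
    command = ["rsync", flags, "--delete"]
    if bwlimit is not None:
        command.append("--bwlimit=%d" % bwlimit)
    command.extend([source, dest])
    return command
```

The list it returns can be handed to subprocess.run once you're happy with it. Call it with dry_run=False only after you've read the dry run output and are sure it's doing what you want.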

Troubleshooting

Assuming you have rsync correctly configured, the only problem I've seen is that occasionally rsync hangs while it's transferring a file, and it will sit there until you get bored and kill it. This seems to be due to a dodgy connection between the two sites. At least I've found it happens frequently between some pairs of sites and never happens between others. If you encounter this problem the fix is to use the --bwlimit flag to limit the data transfer speed. For example:

rsync -vrtz --delete --bwlimit=20 /cygdrive/d/data/accounts peach::asecretpwd

restricts the data transfer speed to 20 KB/sec (the value is in kilobytes per second). Play with the bwlimit setting until you find a value at which data transfers reliably.
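Before settling on a limit it's worth checking that the night's transfer will still fit in the time you have. The Python sketch below does the arithmetic. The function name transfer_seconds is my own, and it assumes the limit is in units of 1024 bytes per second and that all of the data actually has to cross the wire, so with compression and unchanged files the real figure should be lower.

```python
def transfer_seconds(size_bytes, bwlimit_kb_per_sec):
    """Estimate how long size_bytes takes to send at the given --bwlimit value.

    Assumes 1 KB = 1024 bytes and ignores compression and protocol overhead.
    """
    if bwlimit_kb_per_sec <= 0:
        raise ValueError("bwlimit must be greater than zero")
    return size_bytes / (bwlimit_kb_per_sec * 1024.0)
```

For example 100 MB of changed data at --bwlimit=20 works out at 5120 seconds, or a little under an hour and a half.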

For other problems, note that with the examples described here both the server and the client log everything transferred, so you can see what's going on. Remember the -n flag that shows you what would be transferred without actually transferring it.

Batch files

I use rsync every day for backing up gigabytes of data, and these are (minor variants of) the batch files I use. Feel free to copy and modify these as you wish.

Server side

To run the rsync server I have a scheduled task to run this batch file at some convenient time each day:

rem ********************************************************************
rem StartRsync
rem ==========
rem Run rsync as a detached daemon mode process
rem This assumes the rsync binaries are in c:\win32app\rsync.
rem ********************************************************************

set STARTRSYNCLOG=c:\temp\StartRsync.log
set RSYNCDIR=c:\win32app\Rsync

echo StartRsync >%STARTRSYNCLOG%
echo ========== >>%STARTRSYNCLOG%
date /t >>%STARTRSYNCLOG%
time /t >>%STARTRSYNCLOG%

rem *** Change to the rsync folder

cd /d %RSYNCDIR%

rem *** Start rsync as a detached process

echo Starting rsync >>%STARTRSYNCLOG%

set CYGWIN=nontsec
rsync.exe --config rsyncd.conf --daemon

echo Errorlevel = %errorlevel% >>%STARTRSYNCLOG%

rem *** All done

echo Finished >>%STARTRSYNCLOG%
time /t >>%STARTRSYNCLOG%

The only problem is that I'd get a new instance of rsync.exe every day as the old one would still be running. To prevent this I run the following script to kill the old rsync five minutes before I start the new one.

rem ********************************************************************
rem KillRsync.bat
rem =============
rem Kill all processes called rsync.exe
rem ********************************************************************

set KILLRSYNCLOG=c:\temp\KillRsync.log

echo KillRsync >%KILLRSYNCLOG%
echo ========== >>%KILLRSYNCLOG%
date /t >>%KILLRSYNCLOG%
time /t >>%KILLRSYNCLOG%

rem *** Change to the rsync folder

c:
cd \win32app\Rsync

rem *** Start the KillRsync script

echo Starting KillRsync.vbs >>%KILLRSYNCLOG%

cscript KillRsync.vbs 1>>%KILLRSYNCLOG% 2>>&1

echo Errorlevel = %errorlevel% >>%KILLRSYNCLOG%

rem *** All done

echo Finished >>%KILLRSYNCLOG%
time /t >>%KILLRSYNCLOG%


' **********************************************************************
' KillRsync.vbs
' =============
' Kill all rsync.exe processes
' **********************************************************************

option explicit

dim wmiMgmt, wmiProcesses, wmiProcess

const PROCNAME = "rsync.exe"

' *** Open WMI

set wmiMgmt = GetObject("WinMgmts:")

' *** Go through all processes and kill any called rsync.exe

set wmiProcesses = wmiMgmt.InstancesOf("Win32_Process")

for each wmiProcess in wmiProcesses
  if lcase(wmiProcess.caption) = lcase(PROCNAME) then
    wmiProcess.terminate
    ' Wait 5 seconds to make sure it's dead
    wscript.sleep 5000
  end if
next

' *** Finished

set wmiProcesses = nothing
set wmiMgmt = nothing

You might argue it's a bit silly to keep killing off a perfectly good instance of rsync then restarting it five minutes later. However this guarantees that any crashed instances get cleaned up once a day.

Client side

To run rsync as a client and copy data to the server I have a scheduled task to run this batch file at some convenient time each day. The script uses the command line mailer blat to mail me the log so I can see it worked. This is optional of course.

rem ********************************************************************
rem RunRsync.bat
rem ============
rem Use rsync to synchronise to an rsync server
rem ********************************************************************

set CYGWIN=nontsec
set SERVER=myserver
set RSYNCID=myid

set BLAT=c:\win32app\stdutils\blat.exe
set FROM=backup@mydomain.co.uk
set TO=someone@mydomain.co.uk
set SMTPSRVR=%COMPUTERNAME%

set LOGFILE=c:\temp\RunRsync.log

c:
cd \win32app\rsync

rem *** Open the logfile

echo Starting RunRsync >%LOGFILE%
date /t >>%LOGFILE%
time /t >>%LOGFILE%

rem *** Send the backup notification

echo "Sending backup notification" 1>>%LOGFILE% 2>>&1

c:\win32app\WWWMonitor\WWWMonitor.exe 2 0 %COMPUTERNAME% 1>>%LOGFILE% 2>>&1

rem *** Run Rsync

rsync -vrtz --delete /cygdrive/d/data %SERVER%::%RSYNCID% >>%LOGFILE% 2>>&1

time /t >>%LOGFILE%

rem *** Send the backup notification

echo "Sending backup notification" >>%LOGFILE%

c:\win32app\WWWMonitor\WWWMonitor.exe 3 0 %COMPUTERNAME% 1>>%LOGFILE% 2>>&1

rem *** Mail the log file

%BLAT% %LOGFILE% -f %FROM% -t %TO% -s "Rsync: %COMPUTERNAME%" -server %SMTPSRVR% 1>>c:\temp\blat.log 2>>&1

rem *** All done

John Rennie
23rd September 2007