#!/usr/bin/perl -w

=head1 NAME

drsync - rsync wrapper for synchronizing file repositories that are changed
on both sides

=head1 SYNOPSIS

  drsync.pl [ --rsync=/usr/bin/rsync ] [ --state-file=state_file.gz ] \
    [ --bzip2=/usr/bin/bzip2 ] [ --gzip=/usr/bin/gzip ] \
    rsync-args ... SRC [ ... ] DEST

=head1 DESCRIPTION

drsync is a wrapper for rsync. Unless you specify the --state-file argument,
it adds nothing of its own: it simply calls rsync with the given parameters.

If you specify --state-file, it generates a file list of your source
repositories and compares it with the list saved in the state file. Files
that were added or deleted since the last run are propagated to the
destination (new files are created there, deleted files are deleted there),
and the saved file list is updated.

The state file can optionally be compressed with bzip2 or gzip; the program
detects this from the extension of the --state-file name.

You can use --rsync, --bzip2 and --gzip to specify the paths of these
programs.

=head1 EXAMPLE

=head2 Mailbox synchronization

I use this script to synchronize my mail folders between two Linux machines.
The plan was to use my notebook and my desktop computer to read and write
emails, and I wanted to see all the folders in both places.

I have a lot of incoming folders. All of them are located in the ~/mail
directory and are named INBOX.*. They are all in "maildir" format (one mail =
one file), because it is better suited to synchronization than the mbox
format.

I use this simple script on the notebook computer to synchronize the desktop
and the notebook mailboxes:

  drsync.pl --verbose --rsh=ssh --exclude=BACKUP --recursive \
    --state-file=.mail.desktop.drsync.bz2 desktop:mail ~
  drsync.pl --verbose --rsh=ssh --recursive \
    --state-file=.mail.notebook.drsync.bz2 mail desktop:

In the first step drsync copies the new mails from the desktop to the
notebook, and in the second, it copies the changes from the notebook back to
the desktop.

It works properly unless you change the same file on both sides. When you do
that, the newer version overwrites the older one! This is why maildir is
better for this purpose (there is less chance of modifying the same file on
both sides).

=head1 HOW IT WORKS

rsync does most of the work, so rsync is required on both sides of the
synchronization. drsync is required only on the calling side.

First, it loads the saved file list from the file specified by the
"--state-file" command-line argument.

Then it generates the current file list (by calling rsync -n) and compares
the two states.
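
The comparison amounts to a set difference over two hashes keyed by filename.
The following is a minimal sketch of that idea, not the script's exact code;
the "diff_filelists" name and the sample paths are made up for illustration.

```perl

  use strict;
  use warnings;

  # Hypothetical helper: returns the names that appear only in the new
  # list (added) and the names that appear only in the old list (deleted).
  sub diff_filelists {
      my ($old, $new) = @_;
      my @added   = sort { $a cmp $b } grep { !exists $old->{$_} } keys %$new;
      my @deleted = sort { $a cmp $b } grep { !exists $new->{$_} } keys %$old;
      return (\@added, \@deleted);
  }

  # Sample data: the saved state and the freshly generated state.
  my %old = map { $_ => 1 } qw(mail/INBOX.a/1 mail/INBOX.a/2);
  my %new = map { $_ => 1 } qw(mail/INBOX.a/2 mail/INBOX.a/3);

  my ($added, $deleted) = diff_filelists(\%old, \%new);
  print "+$_\n" for @$added;      # files to create on the destination
  print "-$_\n" for @$deleted;    # files to delete on the destination

```

Files present in both lists are left alone at this stage; the final rsync run
decides whether their contents need to be copied.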

Then it deletes the deleted files and creates the newly created files in the
destination, using the remote shell (see --rsh) if the destination is on
another host. The new files are created empty (0 bytes long) with a
1970-01-01 timestamp (the Unix epoch).
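
These changes are applied by feeding plain shell commands to "sh" on the
destination side. The sketch below shows how such commands can be built for
one file, covering only the case without --backup; the helper names
("sh_quote", "delete_command", "create_commands") and the sample paths are
assumptions made for this example.

```perl

  use strict;
  use warnings;
  use File::Basename qw(dirname);

  # Wrap a string in single quotes for sh, escaping embedded single quotes.
  sub sh_quote {
      my ($s) = @_;
      $s =~ s/'/'\\''/g;
      return "'$s'";
  }

  # Hypothetical helper: command that removes a file deleted on the source.
  sub delete_command {
      my ($name) = @_;
      my $q = sh_quote($name);
      return "[ -f $q ] && rm -f $q";
  }

  # Hypothetical helper: commands that create an empty placeholder file
  # (and its directory) with a timestamp on 1970-01-01.
  sub create_commands {
      my ($name) = @_;
      my ($q, $d) = (sh_quote($name), sh_quote(dirname($name)));
      return ("[ -d $d ] || mkdir -p $d",
              "[ -f $q ] || touch -t 197001011200 $q");
  }

  print "$_\n" for delete_command("INBOX.old/cur/1"),
                   create_commands("INBOX.new/cur/it's here");

```

Quoting every name matters because mail filenames may contain spaces and
quote characters.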

Then the new file list is written back to disk. Note: the state file must be
on the machine where drsync.pl runs.

Last but not least, it calls rsync again with "--existing" and "--update" to
do the actual synchronization. Because the placeholder files are older than
any real file, "--update" replaces them with the real content, while
"--existing" keeps rsync from re-creating files that are missing from the
destination.

The state file can be compressed with gzip or bzip2; this is detected from
the extension of the file name.
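
The extension check can be sketched as follows. This is an illustration
only: "reader_command" is a hypothetical helper, and it merely builds the
command list that would decompress the state file to standard output, using
a case-insensitive match on the extension as the script does.

```perl

  use strict;
  use warnings;

  # Hypothetical helper: returns the decompression command for a state
  # file as a list, or an empty list for a plain text file.
  sub reader_command {
      my ($state_file, %prog) = @_;
      return ($prog{bzip2} || "bzip2", "-cd", $state_file)
          if $state_file =~ /\.bz2$/i;
      return ($prog{gzip} || "gzip", "-cd", $state_file)
          if $state_file =~ /\.gz$/i;
      return ();
  }

  for my $name (qw(state.bz2 state.GZ state.txt)) {
      my @cmd = reader_command($name);
      print "$name: ", (@cmd ? "@cmd" : "read as plain text"), "\n";
  }

```

The optional "bzip2" and "gzip" keys stand in for the paths given with the
--bzip2 and --gzip switches.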

=head1 COMMAND-LINE SWITCHES

The script accepts most of the rsync options and passes them on to rsync.

If you _DO NOT_ specify the --state-file option, it calls rsync with the
command line unchanged, so check L<rsync> for more information. The following
options apply _ONLY_ if --state-file is given on the command line.

=head2 Command Line switches for drsync.pl

=over 4

=item --state-file=filename

Sets the name of the file that stores the filenames of the current state. If
it has a '.bz2' extension, it is automatically compressed and decompressed
with bzip2; if it has a '.gz' extension, gzip is used. Any other filename is
treated as a plain text file with one filename per line.

=item --verbose, -v, -v=2, --verbose=2

If you specify -v or --verbose, drsync prints some debug messages. If you use
the -v=2 or --verbose=2 form, it also calls rsync in verbose mode.

=item --bzip2=bzip2_path

Specifies the path to the "bzip2" executable. If not specified, "bzip2" is
used.

=item --gzip=gzip_path

Specifies the path to the "gzip" executable. If not specified, "gzip" is
used.

=item --rsync=rsync_path

Specifies the path to the "rsync" executable. If not specified, "rsync" is
used.

=item --rsh=rsh_path

This is also an rsync option; it specifies the "rsh" command to run. You can
also use the RSYNC_RSH environment variable to set it.

=back

=head2 Overwritten rsync command-line options

These switches are available for rsync, but their meanings are changed or
lost when you use "drsync".

=over 4

=item --verbose, -v

See above.

=item --existing, --update

drsync.pl always passes these options to rsync, so you don't need to specify
them.

=item -n, --dry-run

This argument instructs drsync and rsync not to make any changes to the
repositories or the file lists. Use it together with the -v option.

=back

=head1 TODO

=head2 Short-term

We need more error-handling on pipe opens.

=head2 Long-term

There are no long-term plans for this software, because this mode of
operation has very strict limitations. I am currently either testing the
"unison" project, which addresses similar problems, or trying to integrate
this functionality into rsync.

=head1 COPYRIGHT

Copyright (c) 2000-2002 Szabó, Balázs (dLux)

All rights reserved. This program is free software; you can redistribute it
and/or modify it under the same terms as Perl itself.

=head1 AUTHOR

dLux (Szabó, Balázs) <dlux@dlux.hu>

=head1 CREDITS

  - Paul Hedderly <paul@mjr.org> (debian packaging, fixes for new rsync)

=cut

use File::Basename;
use Getopt::Mixed;
use File::Copy qw(move);
use IO::Handle;

$VERSION = 0.4;
$REVISION = q$Id: drsync.pl,v 1.27 2002/02/17 02:04:19 dlux Exp $;
$BASENAME = basename($0);

# reading and parsing options

Getopt::Mixed::init( 
    "rsync=s rsh=s bzip2=s gzip=s state_file=s state-file>state_file ".
    "n dry-run>n v:i verbose>v q quiet>q u update>u existing progress ".
    "b backup>b suffix:s version" 
);

$Getopt::Mixed::badOption = sub { my ($pos,$option,$reason) = @_;
    push @rsync_opts, $option;
    return ("","","");
};

delete $ENV{LANG}; # Make sure that rsync uses english messages

# setting up default options

$opt_rsync = "rsync";
$opt_bzip2 = "bzip2";
$opt_gzip = "gzip";
$opt_rsh = $ENV{RSYNC_RSH} || "rsh";
$opt_state_file = undef;
$opt_suffix = '~';

@COPY_ARGV=@ARGV;
Getopt::Mixed::getOptions();

$opt_v ||= 0;
push @rsync_opts,"--rsh=$opt_rsh";
push @rsync_opts,"--exclude","*$opt_suffix" if $opt_b;
push @optional_args,"-v" if $opt_v>1;
push @optional_args,"-q" if $opt_q;
push @optional_args,"--progress" if $opt_progress;
push @optional_args,"-n" if $opt_n;

die "$BASENAME $VERSION ($REVISION)\n" if $opt_version;
die "Usage: $BASENAME [options] src [...] dest\n".
    "'perldoc drsync.pl' for more information\n\n" if @ARGV<2;

#avoid warnings:
$opt_q++; $opt_progress++; $opt_version++;

@srcdir=@ARGV;
($dest, $desthost, $destdir) = pop(@srcdir) =~ /^((?:(.*):)?(.*))$/;


# If "state-file" is not provided, we simply call rsync

exec($opt_rsync, @COPY_ARGV) if !defined $opt_state_file;

# recovering a broken session

if (-f $opt_state_file."~") {
    move $opt_state_file."~", $opt_state_file
        or die "Cannot recover state file from backup: $!";
}

# opening state file

$listfile_perms = 0666 ^ umask();
if (-f $opt_state_file) {
    $listfile_perms = (stat($opt_state_file))[2] & 07777;
    if ($opt_state_file =~ /\.bz2$/i) {
        $file = ospawn($opt_bzip2, "-cd", $opt_state_file) or die $!;
    } elsif ($opt_state_file =~ /\.gz$/i) {
        $file = ospawn($opt_gzip, "-cd", $opt_state_file) or die $!;
    } else {
        open $file, $opt_state_file or die $!;
    }

    # reading the file

    while (<$file>) {
        chomp;
        $old_filelist->{$_}=1;
    }

    close $file;
} else {
    $old_filelist={};
}

# generating filelist

print "Getting file list: $opt_rsync -n --stats @rsync_opts @srcdir /dev\n"
    if $opt_v;

$file = ospawn($opt_rsync,"-n","--stats",@rsync_opts,@srcdir,"/dev");

my $rsyncstate = 0; # start state
while (<$file>) {
    chomp;

    # pre-filenames
    if ($rsyncstate == 0) {
        next if /^rsync/ || /^receiving file list/;
        $rsyncstate = 1;
    }

    # filename processing
    if ($rsyncstate == 1) {
        if (/^$/ || /^rsync\[.*heap statistics/) {
            $rsyncstate = 2 ; # last state, do nothing
        } else {
            $new_filelist->{$_}=1;
        }
    }
}

close $file;

die "Filelist generation error, exiting\n" if $rsyncstate!=2;

# Creating "add" and "del" filelist hash

foreach my $key (keys %$new_filelist) {
    if (exists $old_filelist->{$key}) {
        delete $old_filelist->{$key};
    } else {
        $add_filelist->{$key} = 1;
    }
}

%$del_filelist = %$old_filelist;

if ($desthost) {
    print "Making r/ssh connection: $opt_rsh $desthost sh\n" if $opt_v;
    $file=ispawn($opt_rsh,$desthost,"sh");
} else {
    open $file, "|sh" or die "Cannot start sh: $!\n";
}

print $file "cd '$destdir'\n";

# Removing files which are removed here

foreach my $key (keys %$del_filelist) {
    $key =~ s/'/'\\''/g;
    print "-$key\n" if $opt_v;
    print $file "[ -f '$key' ] && ".
        ($opt_b ? "mv -f '$key' '$key$opt_suffix'" : "rm -f '$key'" ).
        "\n"
            if !$opt_n;
};

# Adding new files to the other side

foreach my $key (keys %$add_filelist) {
    $key =~ s/'/'\\''/g; # shell escape
    print "+$key\n" if $opt_v;
    $dir = dirname $key;
    if (!$opt_n) {
        print $file "[ -d '$dir' ] || mkdir -p '$dir'\n" if $dir;
        print $file "[ -f '$key' ] || touch -t 197001011200 '$key'\n";
    }
}

close $file;

if (!$opt_n) {

    # Writing out the filelist to a temporary file

    $tempname = $opt_state_file.".new.$$";

    if ($opt_state_file =~ /\.bz2$/i) {
        $file = ispawn("'$opt_bzip2' >'$tempname'") or die $!;
    } elsif ($opt_state_file =~ /\.gz$/i) {
        $file = ispawn("'$opt_gzip' >'$tempname'") or die $!;
    } else {
        open $file, ">$tempname" or die $!;
    }

    print $file join("\n",keys %$new_filelist);

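    # A failed close means the file (or the compressor pipe) was not
    # written completely, so do not replace the old state file with it.
    close $file
        or die "Cannot write $tempname: ".($! || "exit status $?")."\n";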

    chmod $listfile_perms, $tempname
        or warn "Cannot chmod state file: $!";

    # Moving the temp file into place, keeping a backup of the old state
    # file until the move has succeeded

    if (-f $opt_state_file) {
        move $opt_state_file, $opt_state_file."~" 
            or die "Cannot make backup: $!"
    }

    move $tempname,$opt_state_file 
        or die "Cannot move temp state file: $!";
    if (-f $opt_state_file."~") {
        unlink $opt_state_file."~" 
            or die "Cannot unlink backup state file: $!";
    }

}

# Calling the final "rsync" to do the rest of the work

print 
    "Executing: $opt_rsync --update --existing ".
    "@rsync_opts @optional_args @srcdir $dest\n";

exec($opt_rsync, "--update", "--existing", 
    @rsync_opts, @optional_args, @srcdir,$dest) if !$opt_n;

# ospawn: forks a process, and returns the filehandle of the stdout of the
# process

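sub ospawn { my (@args)=@_;
    my $pid = open OFD, "-|";
    die "Cannot fork: $!\n" if !defined $pid;
    if (!$pid) { # child: its stdout is read by the parent through OFD
        exec @args or die "Cannot exec $args[0]: $!\n";
    }
    OFD->autoflush(1);
    return \*OFD;
}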

# ispawn: forks a process, and returns the filehandle of the stdin of the
# process

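sub ispawn { my (@args)=@_;
    my $pid = open IFD, "|-";
    die "Cannot fork: $!\n" if !defined $pid;
    if (!$pid) { # child: its stdin is written by the parent through IFD
        exec @args or die "Cannot exec $args[0]: $!\n";
    }
    IFD->autoflush(1);
    return \*IFD;
}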

