svlug-rsync_for_network_backup_tricks_for_dealing_with_non-quiescent_systems

This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.



Date: Sun, 6 Oct 2002 03:25:42 +0930 (CST)
From: Richard Sharpe <rsharpe@ns.aus.com>
To: svlug@lists.svlug.org
Subject: Re: [svlug] Backup strategies

On Sat, 5 Oct 2002, Karen Shaeffer wrote:

> On Sat, Oct 05, 2002 at 08:32:51AM -0700, hvrietsc@myrealbox.com wrote:
> > On Sat, Oct 05, 2002 at 01:36:23AM -0700, Karen Shaeffer wrote:
> > > 
> > > What if someone deletes a file, then cron (via rsync) deletes it on the
> > > other disk, and, finally, the individual who deleted the file realizes they
> > > didn't really want to delete the file?
> > > 
> > 
> > you can set up rsync so it does NOT delete the file on the backup which
> > is my preferred way of doing this, this way the file can be restored
> > until that point in time that the user creates a new file with the same
> > name.  (and even for that case rsync has some options). 
> 
> Thanks for the clarification. Although I have been aware of rsync for quite
> a while, I must confess to not having actually utilized it. I must make some
> time to read the manual... But what about this scenario: Say you have
> multiple hosts that each may initiate changes to a file. Can rsync be set up
> to sync all host copies to the most recent changed version--regardless of
> the host or user that initiates the changes? Can rsync maintain multiple
> versions of the same file?

While rsync is great for some types of backups, you have to have a 
quiescent file system for it to work properly. 

A couple of years ago I was using it to mirror an active system to a 
semi-hot standby every hour or so. However, our disquiet with that 
approach grew so great that we attempted to move to drbd for active 
mirroring, but we ran into problems with drbd because it was not 
SMP-safe at the time.

In the end, the business just has to understand that backup is not perfect 
and you have to be able to deal with the loss of all data since the last 
backup, and/or use network-wide logging techniques ...


===

Date: Sat, 5 Oct 2002 10:01:54 -0700 (PDT)
From: Robert Hajime Lanning <lanning@lanning.cc>
To: "svlug@lists.svlug.org" <svlug@lists.svlug.org>
Subject: Re: [svlug] Backup strategies

On Fri, 4 Oct 2002, hvrietsc@myrealbox.com wrote:
> and if you pour some coffee on your box then both disks will be gone...
> backup is all about being safe; putting all your eggs in one basket
> is not a good idea. go with rsync...
>
> On Fri, Oct 04, 2002 at 06:28:13PM -0700, Drew Bertola wrote:
> > On Fri, Oct 04, 2002 at 05:30:08PM -0700, Rick Schultz wrote:
> > > Anybody have a backup solution they particularly like?
> >
> > I just can't find anything I don't like about rsync.
> >
> > Grab an old box like a pentium 150 or so, pop in a pci ide controller
> > (if necessary), a new 100MB IDE hard drive, and a good NIC.  Load
> > Linux.  Then run rsync scripts periodically via cron.  You can back up
> > a whole lot of data on several boxes that way.
> >
> > If you just have one box to worry about, just tuck in a second hard
> > drive and do the same thing locally.
> >
> > It's very unlikely that you'll lose both spindles at the same time, so
> > it's a good, cheap, and fast backup technique.

rsync would be fine for that "oops, I didn't mean to rm that file..." and
certain hardware failures.

But, what happens when you find that a datafile for an application that
you haven't used in a week was corrupted? (and your cron rsyncs once a day.)

rsync will give you no history.

The concept of timestamped savesets and incrementals would be something
to consider.
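
One way to get timestamped savesets out of rsync itself, sketched under
assumptions: it relies on the --link-dest option found in newer rsync
releases (not discussed in this thread), and the function name
take_snapshot, the RSYNC override variable and the "latest" symlink are
illustrative choices rather than anything rsync provides. Each run creates
a new dated directory; files unchanged since the previous run are
hard-linked rather than copied, so every saveset looks complete while only
changed files take up new space.

```shell
#!/bin/sh
# Sketch: timestamped savesets using rsync --link-dest.
# take_snapshot, RSYNC and the "latest" symlink are illustrative names.

RSYNC=${RSYNC:-rsync}   # can be overridden, e.g. for testing

# take_snapshot SRC ROOT
#   SRC  - directory to back up
#   ROOT - absolute path of an existing directory that holds the savesets
take_snapshot() {
    src=$1
    root=$2
    stamp=$(date +%Y-%m-%d_%H%M%S)

    if [ -d "$root/latest" ]; then
        # Unchanged files become hard links into the previous saveset.
        "$RSYNC" -a --link-dest="$root/latest" \
            "$src/" "$root/$stamp/" || return
    else
        # First run: nothing to link against, so take a full copy.
        "$RSYNC" -a "$src/" "$root/$stamp/" || return
    fi

    # Point "latest" at the saveset just written.
    ln -sfn "$stamp" "$root/latest"
}
```

From cron this could be called as "take_snapshot /home /backup/home"
(hypothetical paths); getting back an old version of a file is then a plain
cp from the matching dated directory.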

===



Date: Sat, 5 Oct 2002 10:36:13 -0700
From: Drew Bertola <drew@drewb.com>
To: "hvrietsc@myrealbox.com" <hvrietsc@myrealbox.com>,
Subject: Re: [svlug] Backup strategies

On Sat, Oct 05, 2002 at 10:01:54AM -0700, Robert Hajime Lanning wrote:
> rsync would be fine for that "oops, I didn't mean to rm that file..." and
> certain hardware failures.
> 
> But, what happens when you find that a datafile for an application that
> you haven't used in a week was corrupted? (and your cron rsyncs once a day.)
> 
> rsync will give you no history.
> 
> The concept of timestamped savesets and incrementals would be something
> to consider.

"man rsync" would be something to consider.

===

Date: Sat, 5 Oct 2002 10:03:40 -0700
From: Drew Bertola <drew@drewb.com>
To: svlug@lists.svlug.org
Subject: Re: [svlug] Backup strategies

On Sat, Oct 05, 2002 at 09:14:34AM -0700, Karen Shaeffer wrote:
> On Sat, Oct 05, 2002 at 08:39:49AM -0700, Drew Bertola wrote:
> > 
> > The point was that two spindles are good for even the simplest
> > scenario.
> 
> That was clear to me. But, the original question was concerned with backups.
> I think the central point of others, and certainly my comments, was that your
> solution has limitations that may be unacceptable to a large group of folks
> interested in backups.

Sorry, I only meant to suggest rsync rather than a commercial "wrapper
around tar" backup solution.  The examples were just there to
illustrate the ease with which a hobbyist could set up a backup
solution.

As far as two spindles on one machine vs. two machines, that was there
for the case of a home user who doesn't want to generate additional
heat / noise / space.  It's much more "environmentally friendly."

rsync is very versatile.  It can, as someone else stated, be used to
back up without deleting files that no longer exist locally (the
default).  It also can be used to create incremental updates of a
directory tree from a given date.  For example, I could do a complete
backup on Sunday (week 1), then incrementals throughout the week.  I
could then do the same for week 2.  That way, you have the state of
the machine at any point during the week.

rsync has some great basic features:

- It can compress data during transit (with -z; this is not the
  default), making it efficient over the network.

- It compares the state of every file between the working copy and the
  backup copy, thus it only updates files that have changed since the
  last backup.  This improves speed / network efficiency greatly.

- It plays well with ssh, so it's very secure over hostile networks
  (see: "-e ssh").

I typically do something like:

rsync -e ssh -avz /home/drew drew@backup.net:/home

(run on the workstation; rsync will not copy between two remote hosts in
one command, so one side has to be local.)

If I want to delete files that were deleted locally, I add --delete.

I can exclude files and directories using a stored list using:

--exclude-from="$HOME/.rsync-excludes"

(written with $HOME because most shells do not expand ~ after the "=".)

where:

$ cat ~/.rsync-excludes
*~
*.rpm
*.tar
*.gz
*.tgz
*.tar.gz
*.zip
.ssh/
.bash_history
.rsync-excludes
weblogs/

And so much more.  rsync is very cool.

Just don't get src and dest confused.  That could be a bummer.  Think
of the cp command wrt src and dest.
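
Since mixing up source and destination is the main hazard here, a cautious
habit is to preview with rsync's --dry-run (-n) option first. The wrapper
below is a minimal sketch, not Drew's actual script: the function name
backup_home and the RSYNC and EXCLUDES variables are made up for
illustration, and the host and paths are the ones from the example above.
It is meant to be run on the workstation, with a local source.

```shell
#!/bin/sh
# Sketch: preview an rsync backup before letting it change anything.
# backup_home, RSYNC and EXCLUDES are illustrative names.

RSYNC=${RSYNC:-rsync}
EXCLUDES=${EXCLUDES:-$HOME/.rsync-excludes}

# backup_home preview|commit SRC DEST
backup_home() {
    mode=$1
    src=$2
    dest=$3
    case $mode in
        preview)
            # --dry-run lists what would be transferred and changes nothing.
            "$RSYNC" -avz --dry-run -e ssh \
                --exclude-from="$EXCLUDES" "$src" "$dest"
            ;;
        commit)
            "$RSYNC" -avz -e ssh \
                --exclude-from="$EXCLUDES" "$src" "$dest"
            ;;
        *)
            echo "usage: backup_home preview|commit SRC DEST" >&2
            return 2
            ;;
    esac
}
```

Typical use would be "backup_home preview /home/drew drew@backup.net:/home",
a look at the file list, and then the same command with "commit".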


===

Date: Sat, 5 Oct 2002 10:05:42 -0700
From: Drew Bertola <drew@drewb.com>
To: svlug@lists.svlug.org
Subject: Re: [svlug] Backup strategies

BTW, there's a good article from a couple weeks back on rsync wrt
backup solutions.  It's listed at linux.com, but it's on imaclinux.net:

http://docs.linux.com/article.pl?sid=02/05/24/1625228&mode=thread&tid=31

http://www.imaclinux.net/gh.php?single=92+index=0


===

Date: Sat, 5 Oct 2002 12:13:43 -0700
From: Karen Shaeffer <shaeffer@neuralscape.com>
To: Drew Bertola <drew@drewb.com>
Cc: svlug@lists.svlug.org
Subject: Re: [svlug] Backup strategies

On Sat, Oct 05, 2002 at 10:03:40AM -0700, Drew Bertola wrote:
> 
> 
> rsync has some great basic features:

Thanks for the comments. I need to read up about rsync, realizing it could
play a role for me. I guess as long as the state of files is frozen for a
window of time during synchronization, then rsync could be used to advantage
in a custom designed system. That constraint isn't too draconian for my needs.

===

Date: Sat, 05 Oct 2002 12:28:15 -0700
From: Steve M Bibayoff <smb23@csufresno.edu>
To: svlug@lists.svlug.org
Subject: Re: [svlug] Re: word

Graham Freeman <graham@calteg.org> wrote:

> In this case, this organization called me up and offered me a job.  One of
> the prerequisites for getting this job was that I give them a CV in Word
> format.  I'm not going to jeopardize that by getting high-and-mighty over
> their choice of word processing software.  Now that I have the job, I'm in
> a position to recommend superior replacements for the proprietary software
> that they're using.

I always like these arguments. It basically boils down to this: a person is
more than willing to give up their constitutional rights just to use
certain types of software (of course they make the argument that the
job/whatever is more important than "idealism"). Most people won't give
up any of their rights when they buy a car, TV, or almost anything else,
but why are they so willing to with software? Would you be willing to
work for a company that dictates what type of religion (if any) you had to
have in order to work for them?

This wasn't directed at anybody in particular, so please don't think I'm
personally attacking anyone. And yes, my generalized examples could be
considered extreme, but they are generally true. I just wish people would
start to question how much they value Freedom and its place in their
value system.

Steve

===
Date: Sat, 5 Oct 2002 12:49:58 -0700 (PDT)
From: Robert Hajime Lanning <lanning@lanning.cc>
To: "svlug@lists.svlug.org" <svlug@lists.svlug.org>
Subject: Re: [svlug] Backup strategies

On Sat, 5 Oct 2002, Karen Shaeffer wrote:
> Thanks for the comments. I need to read up about rsync, realizing it could
> play a role for me. I guess as long as the state of files is frozen for a
> window of time during synchronization, then rsync could be used to advantage
> in a custom designed system. That constraint isn't too draconian for my needs.

The Linux LVM can create a snapshot logical volume of a live volume.  You
could do a snapshot, mount it then use rsync.
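
A rough sketch of that snapshot-then-rsync sequence, assuming a volume
group vg0 with a logical volume home, a snapshot named backsnap, and a
destination such as backuphost:/backup/home/ (all hypothetical names). It
would have to run as root, so by default it only prints the commands; the
DRY_RUN variable and the run and snapshot_backup helpers are illustrative.
The 1G size is the space reserved for changes made to the live volume while
the snapshot exists, and would need tuning.

```shell
#!/bin/sh
# Sketch: back up from an LVM snapshot instead of the live volume.
# vg0, home, backsnap and backuphost are hypothetical names.

DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only; 0 = really run them (root)

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# snapshot_backup VG LV DEST
snapshot_backup() {
    vg=$1
    lv=$2
    dest=$3
    snap=backsnap
    mnt=/mnt/$snap

    # Freeze a point-in-time view of the volume.
    run lvcreate --snapshot --size 1G --name "$snap" "/dev/$vg/$lv" || return
    run mkdir -p "$mnt"

    # Copy from the read-only snapshot, never from the live filesystem.
    run mount -o ro "/dev/$vg/$snap" "$mnt" &&
        run rsync -a --delete "$mnt/" "$dest"
    status=$?

    # Always clean up, even if the copy failed.
    run umount "$mnt"
    run lvremove -f "/dev/$vg/$snap"
    return $status
}
```

For example, "snapshot_backup vg0 home backuphost:/backup/home/" prints the
six commands in order; setting DRY_RUN=0 would execute them.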

===

Date: Sat, 5 Oct 2002 13:13:48 -0700
From: Vince Hoang <svlug@az0.altern8.net>
To: svlug@lists.svlug.org
Subject: Re: [svlug] Backup strategies

On Sat, Oct 05, 2002 at 09:06:44AM -0700, Karen Shaeffer wrote:
> Can rsync maintain multiple versions of the same file?

A quick two-way rsync can be set up by using --update from the
secondary to the primary replica and then --delete from the
primary to the secondary replica.

Another file synchronization utility to look at is unison:
    http://www.cis.upenn.edu/~bcpierce/unison/

    "Unlike simple mirroring or backup utilities, Unison can deal
    with updates to both replicas of a distributed directory
    structure. Updates that do not conflict are propagated
    automatically. Conflicting updates are detected and displayed."
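
Spelled out, the two rsync passes might look like the sketch below. The
function name two_way_sync and the RSYNC variable are illustrative, and the
directory arguments are placeholders. One caveat follows from the order of
the passes: a file deleted on only one side is copied back from the other,
so deletions do not propagate, and if the same file changed on both sides
the copy with the newer timestamp silently wins.

```shell
#!/bin/sh
# Sketch: two-pass "two-way" sync between a primary and a secondary replica.
# two_way_sync and RSYNC are illustrative names.

RSYNC=${RSYNC:-rsync}

# two_way_sync PRIMARY SECONDARY
two_way_sync() {
    primary=$1
    secondary=$2

    # Pass 1: pull in anything new or newer from the secondary.
    # --update skips files that are newer on the receiving side.
    "$RSYNC" -a --update "$secondary/" "$primary/" || return

    # Pass 2: make the secondary an exact mirror of the merged primary.
    "$RSYNC" -a --delete "$primary/" "$secondary/"
}
```

Usage would be along the lines of "two_way_sync /data otherhost:/data",
with both names standing in for the real replicas.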



===

Date: Sat, 5 Oct 2002 13:37:20 -0700
From: Karen Shaeffer <shaeffer@neuralscape.com>
To: svlug@lists.svlug.org
Subject: Re: [svlug] Backup strategies

On Sat, Oct 05, 2002 at 12:49:58PM -0700, Robert Hajime Lanning wrote:
> On Sat, 5 Oct 2002, Karen Shaeffer wrote:
> 
> The Linux LVM can create a snapshot logical volume of a live volume.  You
> could do a snapshot, mount it then use rsync.

That's an interesting observation.

Check out EVMS: it's more fully featured than Linux LVM.

http://evms.sourceforge.net/

Hmm, I am convinced the EVMS project is going to win out in the end. In fact,
as we speak, the EVMS development team is working out the details with the
Linux kernel maintainers to get EVMS included in the 2.5.40 kernel source.
EVMS has been ported to the Linux 2.4.19 source code as well.

===

Date: Sat, 5 Oct 2002 16:11:48 -0700
From: hvrietsc@myrealbox.com
To: "svlug@lists.svlug.org" <svlug@lists.svlug.org>
Subject: Re: [svlug] Backup strategies

On Sat, Oct 05, 2002 at 10:01:54AM -0700, Robert Hajime Lanning wrote:
> On Fri, 4 Oct 2002, hvrietsc@myrealbox.com wrote:
> > and if you pour some coffee on your box then both disks will be gone...
> > backup is all about being safe; putting all your eggs in one basket
> > is not a good idea. go with rsync...
> >
> > On Fri, Oct 04, 2002 at 06:28:13PM -0700, Drew Bertola wrote:
> > > On Fri, Oct 04, 2002 at 05:30:08PM -0700, Rick Schultz wrote:
> > > > Anybody have a backup solution they particularly like?
> > >
> > > I just can't find anything I don't like about rsync.
> > >
> > > Grab an old box like a pentium 150 or so, pop in a pci ide controller
> > > (if necessary), a new 100MB IDE hard drive, and a good NIC.  Load
> > > Linux.  Then run rsync scripts periodically via cron.  You can back up
> > > a whole lot of data on several boxes that way.
> > >
> > > If you just have one box to worry about, just tuck in a second hard
> > > drive and do the same thing locally.
> > >
> > > It's very unlikely that you'll lose both spindles at the same time, so
> > > it's a good, cheap, and fast backup technique.
> 
> rsync would be fine for that "oops, I didn't mean to rm that file..." and
> certain hardware failures.
> 
> But, what happens when you find that a datafile for an application that
> you haven't used in a week was corrupted? (and your cron rsyncs once a day.)
> 
> rsync will give you no history.
> 
> The concept of timestamped savesets and incrementals would be something
> to consider.

to get at least a week's worth of rollbackability:

DAY=`date +%a`
rsync  -varu /local/dir/ remotehost::backup/$DAY/

(and by default it will not delete files on the remote)
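
A minimal sketch of the same day-of-week rotation as a cron-friendly
script. It assumes the remote rsync daemon exports a module named backup,
as in the command above; the backup_target helper and the LC_ALL=C prefix
are additions for illustration, the latter pinning day names to English so
the directory names do not change with the locale. Because nothing is
deleted on the remote, each day's directory keeps files that have since
been removed locally.

```shell
#!/bin/sh
# Sketch: one backup directory per weekday, reused a week later.
# backup_target is an illustrative helper name.

# backup_target BASE: print BASE/<Day>/ for today, e.g. BASE/Sat/.
# LC_ALL=C pins %a to English abbreviations regardless of locale.
backup_target() {
    day=$(LC_ALL=C date +%a)
    printf '%s/%s/\n' "$1" "$day"
}

# Intended use from cron (not executed here):
#   rsync -au /local/dir/ "$(backup_target remotehost::backup)"
```

The -r from the original command is left out of the usage line because -a
already implies it.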

===

Date: Sat, 5 Oct 2002 16:28:45 -0700
From: hvrietsc@myrealbox.com
To: "svlug@lists.svlug.org" <svlug@lists.svlug.org>
Subject: Re: [svlug] Backup strategies


On Sat, Oct 05, 2002 at 12:49:58PM -0700, Robert Hajime Lanning wrote:
> On Sat, 5 Oct 2002, Karen Shaeffer wrote:
> > Thanks for the comments. I need to read up about rsync, realizing it could
> > play a role for me. I guess as long as the state of files is frozen for a
> > window of time during synchronization, then rsync could be used to advantage
> > in a custom designed system. That constraint isn't too draconian for my needs.
> 
> The Linux LVM can create a snapshot logical volume of a live volume.  You
> could do a snapshot, mount it then use rsync.

but unless you quiesce the application your snapshot could still contain
half-written files and such. that is why commercial backup programs
spend a lot of time quiescing applications, but the only safe way
is if the application itself has quiescing built in and IT starts the
backup.

but for most home uses all that isn't necessary.

===

Date: Sat, 5 Oct 2002 22:37:47 -0500
From: <clee@spiralis.merseine.nu>
To: svlug@lists.svlug.org
Subject: Re: [svlug] Backup strategies

>>>>> "Vince" == Vince Hoang <svlug@az0.altern8.net> writes:

    Vince> On Sat, Oct 05, 2002 at 09:06:44AM -0700, Karen Shaeffer
    Vince> wrote:
    >> Can rsync maintain multiple versions of the same file?

    Vince> A quick two-way rsync can be set up by using --update from
    Vince> the secondary to the primary replica and then --delete from
    Vince> the primary to the secondary replica.

    Vince> Another file synchronization utility to look at is unison:
    Vince> http://www.cis.upenn.edu/~bcpierce/unison/

Just to add to the list of backup programs...

One step beyond rsync is rdiff-backup (http://rdiff-backup.stanford.edu).  It
combines the features of a mirror and incremental backup and uses the same
bandwidth-saving algorithms as rsync.  I found it when I was looking for
programs to implement a reasonable "backup-to-hard-disk strategy".

===

Date: Tue, 8 Oct 2002 09:06:55 +0200
From: Ira Abramov <lists-svlug@ira.abramov.org>
To: "svlug@lists.svlug.org" <svlug@lists.svlug.org>
Subject: [svlug] Re: Backup strategies

> > > > > Anybody have a backup solution they particularly like?

well, I just saw something called "cdbackup" on Debian, check this out:

Description: CD-R(W) backup utility
 cdbackup and cdrestore are a pair of utilities designed to facilitate
 streaming backup to and from CD-R(W) disks.  Specifically, they were
 designed to work with dump/restore, but tar/cpio/whatever you want should
 work, so long as it writes to stdout for backups and reads from stdin for
 restores.

http://www.muempf.de/cdbackup.html

Sounds primitive though. maybe someone knows something smarter? maybe
a combo of AMANDA and slocate, where you ask for "locate
/home/ira/blah.txt" and it lists the versions, which CDs/tars/media
they are on, and from which dates.

Haven't used Amanda in a while, I wonder if they now allow splitting
large staging archives across several media, or manage some sort of
smart backup-to-disk.

http://sourceforge.net/projects/amanda/

one word of warning about backups: remember it's better to use file
serializers like cpio or tar, and avoid dump. other than the fact dump
is OS and FS specific, there are apparently major problems with using it
on 2.4 with all its caching features, and I'd imagine journaling
filesystems make it even more problematic. read this from Linus:

http://old.lwn.net/2001/0503/a/lt-dump.php3

AMANDA warns about this too (scroll to about the middle of the page)
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/amanda/amanda-2/docs/SYSTEM.NOTES?rev=1.48&content-type=text/vnd.viewcvs-markup
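
As a small illustration of the stdout/stdin convention mentioned above,
here is a sketch of a tar-based backup and restore pair. The function names
are made up, and cdbackup's own command line is not shown because it is not
given in this message; any tool that reads the stream from stdin could sit
on the other side of the pipe.

```shell
#!/bin/sh
# Sketch: file-level backup with tar over stdout/stdin (no dump involved).
# backup_stream and restore_stream are illustrative names.

# backup_stream DIR: write a tar archive of DIR's contents to stdout.
backup_stream() {
    tar -C "$1" -cf - .
}

# restore_stream DIR: unpack a tar archive read from stdin into DIR.
restore_stream() {
    mkdir -p "$1" && tar -C "$1" -xpf -
}
```

For instance, "backup_stream /home/ira | gzip > /backup/ira.tar.gz" writes
a compressed archive, and "gunzip -c /backup/ira.tar.gz | restore_stream
/tmp/restore" unpacks it again (paths are placeholders).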

===

