This page is about personal backup practices.

Backup is an ambiguous term. This page focuses on backing up personal files, and also covers two related activities: synchronization and archival.

Synchronization enables offline editing of files on two or more peers. Periodically, changes are synchronized between the peers and any editing conflicts are resolved, possibly by consulting the user.
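To make the idea of conflict detection concrete, here is a minimal sketch of how a synchronizer might decide what to do with each file. It assumes each peer can produce a map from path to content hash, and that a map from the last successful sync is kept as a common base. The names `reconcile`, `base`, `a` and `b` are illustrative only and do not come from any particular tool.

```python
from typing import Dict, Optional

# path -> content hash; a missing key means the file is absent on that peer
Snapshot = Dict[str, Optional[str]]


def reconcile(base: Snapshot, a: Snapshot, b: Snapshot) -> Dict[str, str]:
    """Compare two peers against the state recorded at the last sync."""
    actions = {}
    for path in sorted(set(base) | set(a) | set(b)):
        old, left, right = base.get(path), a.get(path), b.get(path)
        if left == right:
            actions[path] = "in sync"
        elif left == old:
            # only peer b changed (edit, creation, or deletion)
            actions[path] = "propagate b -> a"
        elif right == old:
            # only peer a changed
            actions[path] = "propagate a -> b"
        else:
            # both sides changed differently: ask the user
            actions[path] = "conflict"
    return actions


if __name__ == "__main__":
    base = {"notes.txt": "h1", "todo.txt": "h2"}
    a = {"notes.txt": "h1-edited-on-a", "todo.txt": "h2"}
    b = {"notes.txt": "h1-edited-on-b", "todo.txt": "h3", "new.txt": "h9"}
    for path, action in reconcile(base, a, b).items():
        print(path, "=>", action)
```

Only the paths marked as conflicts need the user's attention; everything else can be propagated automatically.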

Unison is the best tool I’ve found for synchronization of files between multiple hosts.

  • it is cross-platform and maintained
  • it supports 2-way merge, or n-way merge via a hub host
  • it has a GUI for conflict resolution
  • it does no versioning
  • it requires the same major version of Unison (for example 2.40.x) to be installed on each host
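The version requirement is easy to trip over, so a small check before syncing can help. The sketch below assumes that each host reports its version as a string containing something like `2.40.102` (for instance the output of `unison -version`; the exact wording of that output is an assumption here), and it reads "same major version" as "same first two components", following the 2.40.x example above. The function names are hypothetical.

```python
import re


def release_series(version_output: str) -> str:
    """Extract the first two version components, e.g. '2.40' from '2.40.102'."""
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return f"{match.group(1)}.{match.group(2)}"


def compatible(local_output: str, remote_output: str) -> bool:
    """True when both hosts report the same release series."""
    return release_series(local_output) == release_series(remote_output)


if __name__ == "__main__":
    # Sample strings; real output may be formatted differently.
    print(compatible("unison version 2.40.102", "unison version 2.40.63"))
    print(compatible("unison version 2.40.102", "unison version 2.48.4"))
```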

Another option is a distributed filesystem such as AFS (and maybe in the future Btrfs), which allows offline editing of files on multiple intermittently connected peers. This is something of a pie-in-the-sky option, and may not be very portable.

Archival safeguards against data loss due to system failure or accidental deletion by making periodic copies of files to another medium or host. An ideal archival tool is:

  • Incremental: it only transmits changes since the last backup, for efficiency.
  • History-preserving: it allows retrieval of old versions of files from the backup copy.
  • Encrypted: if you are backing up sensitive files to an insecure host, you should use a tool that encrypts backups, like Duplicity.
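The first two properties can be illustrated with a toy in-memory archive. This is only a sketch under simplifying assumptions: it skips whole files whose content is already stored, rather than computing deltas the way rdiff-backup and Duplicity do, and it leaves encryption out entirely. The `Archive` class and its methods are made up for this example.

```python
import hashlib
from typing import Dict, List


class Archive:
    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}          # content hash -> data, stored once
        self.snapshots: List[Dict[str, str]] = []  # one path -> hash map per backup

    def backup(self, files: Dict[str, bytes]) -> int:
        """Record a snapshot; return how many file contents had to be sent."""
        sent = 0
        manifest = {}
        for path, data in files.items():
            digest = hashlib.sha256(data).hexdigest()
            if digest not in self.blobs:  # incremental: skip content we already hold
                self.blobs[digest] = data
                sent += 1
            manifest[path] = digest
        self.snapshots.append(manifest)
        return sent

    def restore(self, path: str, snapshot: int) -> bytes:
        """History-preserving: fetch a file as it was in a given snapshot."""
        return self.blobs[self.snapshots[snapshot][path]]


if __name__ == "__main__":
    archive = Archive()
    print(archive.backup({"a.txt": b"one", "b.txt": b"two"}))    # both files are new
    print(archive.backup({"a.txt": b"one", "b.txt": b"three"}))  # only b.txt changed
    print(archive.restore("b.txt", 0))                           # the old version of b.txt
```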

Two archival tools which I can recommend are rdiff-backup and Duplicity, both of which are incremental and history-preserving. The major difference between the two is that Duplicity supports encrypted delta backups to insecure hosts, while rdiff-backup allows for easy browsing of backups on secured hosts. Additionally, archfs is a FUSE filesystem for browsing rdiff-backup repositories.

A combined strategy

My goals:

  • Have up-to-date editable copies of my files available on each peer.
  • Also be able to retrieve old versions of files.

Outline:

  • Use Unison to sync each peer with a central, always-on server host.
  • Periodically make incremental backups on that host, possibly to another location.
  • GUI to browse/retrieve old versions?
  • use Git and gitweb?
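The first two steps of the outline could be wired together roughly as follows. This is a sketch, not a tested setup: the host name and paths are placeholders, the Unison flags `-batch` and `-auto` are used so that the sync can run unattended (conflicting files are then skipped rather than resolved), and the rdiff-backup call uses the classic `rdiff-backup SOURCE DEST` form, which newer releases may spell differently. The first command would run on each peer and the second on the server, for example from cron.

```python
import subprocess
from typing import Callable, List, Sequence


def peer_sync_command(local_root: str, hub_root: str) -> List[str]:
    """Run on each peer: two-way sync between a local directory and the hub."""
    # -batch asks no questions; -auto accepts non-conflicting changes
    return ["unison", local_root, hub_root, "-batch", "-auto"]


def hub_archive_command(source: str, dest: str) -> List[str]:
    """Run on the hub: incremental, history-preserving backup of the synced tree."""
    # Classic invocation; check your rdiff-backup version for the exact syntax.
    return ["rdiff-backup", source, dest]


def run_all(commands: Sequence[List[str]], runner: Callable = subprocess.run) -> None:
    """Execute commands in order, stopping at the first failure."""
    for cmd in commands:
        runner(cmd, check=True)


if __name__ == "__main__":
    # Placeholder paths and host; printed rather than executed.
    print(peer_sync_command("/home/me/files", "ssh://hub.example.org//home/me/files"))
    print(hub_archive_command("/home/me/files", "/mnt/backup/files"))
```

Keeping command construction separate from execution makes it possible to print and inspect the plan before trusting it with real files.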

Another way to combine synchronization with archival would be to use a DVCS such as Git, Mercurial, or Darcs. This needs only one level of tooling, rather than the two in the above outline, because the repository history doubles as the archive. The main obstacle is the lack of a convenient workflow to automate commits, fetches, and merges. This would likely be a fruitful programming project.
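As a starting point for such a project, the automation could be as small as the sketch below, shown here for Git. It assumes the repository already has a remote with an upstream branch configured, and it makes no attempt to resolve merge conflicts: if `git pull` stops on a conflict, the user has to step in. The function names and the commit message are illustrative.

```python
import subprocess
from typing import List


def git(repo: str, *args: str) -> List[str]:
    """Build a git command that operates on the given repository."""
    return ["git", "-C", repo, *args]


def is_dirty(repo: str) -> bool:
    """True when the working tree has uncommitted or untracked changes."""
    result = subprocess.run(
        git(repo, "status", "--porcelain"),
        capture_output=True, text=True, check=True,
    )
    return bool(result.stdout.strip())


def autosync_commands(repo: str, dirty: bool, message: str = "autosync") -> List[List[str]]:
    """Commit local changes if there are any, then merge remote changes and publish."""
    commands = []
    if dirty:
        commands.append(git(repo, "add", "--all"))
        commands.append(git(repo, "commit", "-m", message))
    commands.append(git(repo, "pull", "--no-edit"))  # fetch and merge
    commands.append(git(repo, "push"))
    return commands


def autosync(repo: str) -> None:
    for cmd in autosync_commands(repo, is_dirty(repo)):
        subprocess.run(cmd, check=True)  # stops at the first failure, e.g. a conflict
```

Running `autosync` from a timer on every peer would approximate the sync-plus-history behaviour described above, at the cost of a noisy commit log.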

References

https://wiki.archlinux.org/index.php/Sync_laptop_desktop