Platform backups are an essential part of systems administration. As a result, they are a widely studied subject and many tools have been created for them. However, making good backups is still a complex task that involves defining a policy, evaluating and choosing the tool(s), and implementing them.
1 · Backup policy
The backup policy defines the different aspects of backups: what should be backed up? How often will backups take place? How long should they be retained? Where are the backups stored? How much time is acceptable to spend recovering data?
Ideally, every file would be copied daily, retained for a long time, stored locally for quick recovery, and also stored externally for greater protection. However, this conflicts with efficiency and costs: tape robots, disk occupancy, physical space to keep copies, etc. Therefore, it is necessary to find a compromise by setting maximum costs and prioritizing backups of the most critical resources.
Every project or platform has its own requirements that need to be defined to establish the policy. To do this, it’s useful to:
- Differentiate between different environments (preproduction, development, testing, production, etc.)
- Determine the costs of possible data losses
- Estimate the time it would take to recover
- Assess available resources (hardware, network speed, remote disks, etc.)
- Analyze what’s essential to copy and what’s not.
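As an illustration, the answers to these questions can be written down as data so that they can be reviewed and checked automatically. The following is a minimal sketch, not a prescribed format: the `BackupPolicy` class, its field names, the paths and the values are all hypothetical, and the retention check assumes one backup per calendar date.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Tuple


@dataclass(frozen=True)
class BackupPolicy:
    """Hypothetical record of the policy answers for one environment."""
    environment: str          # e.g. "production" or "development"
    paths: Tuple[str, ...]    # what is backed up
    interval_hours: int       # how often backups take place
    retention_days: int       # how long copies are kept
    destination: str          # where copies are stored


def expired(backup_dates: List[date], policy: BackupPolicy, today: date) -> List[date]:
    """Return the backup dates that fall outside the retention period."""
    cutoff = today - timedelta(days=policy.retention_days)
    return sorted(d for d in backup_dates if d < cutoff)


# Illustrative values only; every platform needs its own.
production = BackupPolicy(
    environment="production",
    paths=("/etc", "/var/www"),
    interval_hours=24,
    retention_days=30,
    destination="/mnt/backups/production",
)
```

Keeping one such record per environment makes the differences between, say, production and development explicit instead of leaving them implicit in tool configuration.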
2 · Backup saving and retention
An important security aspect of a backup policy is where the backup copies will be stored. When defining technology risk prevention plans, vaulting is usually considered to mitigate the effects of a possible incident at the site where backups are made. This practice consists of periodically moving a complete copy of the data to another location, for example, once a month. This is common when the storage medium is tape.
This is different from archiving, which consists of moving old, unused data to a different location. A backup is always a copy, while archiving transfers the original data because it is no longer in use but should not be permanently deleted.
Among the different storage options, Amazon’s Glacier service is worth considering. Its low cost is a great advantage, but restoration is not guaranteed within a specific time and may take a few hours. This rules it out for vaulting but makes it an interesting candidate for archiving, where there is no requirement to recover data within a limited time.
3 · Restoration
The ultimate goal of a backup is to be able to restore it in case of data loss. Hence, keeping restoration in mind when defining a backup policy or choosing a tool is key. To do so, it’s important to have previously decided (as part of risk management) the following points:
- RTO (Recovery Time Objective)
It’s the maximum time within which a minimum level of service must be restored after downtime (for example, due to loss of data) to avoid unacceptable consequences for the business.
- RPO (Recovery Point Objective)
It’s the maximum period of time over which service data can be lost. If that period is 6 hours, backups must be made at intervals shorter than 6 hours, so that the data can always be recovered to a point within that period.
Restoring a backup after data loss is part of the downtime itself, so the less time the restoration takes, the sooner the business process will be restored.
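The relationship between these objectives and the backup schedule can be checked with simple arithmetic. The sketch below rests on assumptions that are not part of the definitions above: each backup captures the data as it was when the backup started, it only becomes usable once it has finished, and the function and parameter names are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta, backup_duration: timedelta) -> timedelta:
    """Data lost if the failure happens just before the next backup finishes.

    The latest usable copy then reflects the state one full interval plus
    one backup duration in the past.
    """
    return backup_interval + backup_duration


def meets_rpo(backup_interval: timedelta, backup_duration: timedelta, rpo: timedelta) -> bool:
    """True if the worst-case data loss stays within the RPO."""
    return worst_case_data_loss(backup_interval, backup_duration) <= rpo


def meets_rto(detection: timedelta, restoration: timedelta, rto: timedelta) -> bool:
    """True if detecting the incident plus restoring the backup fits in the RTO."""
    return detection + restoration <= rto
```

With an RPO of 6 hours, a backup every 6 hours that takes 30 minutes does not pass this check, while a backup every 4 hours does.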
4 · Tools
Tools allow us to deploy a backup policy. Given the variety of platforms, many tools acting at different levels have been created. Some types of backups and tools are listed below (there are many more):
4.1 · Synchronization
This type of backup keeps the same files in two directories in different locations (on the same machine or on different hosts). Many of the tools that allow synchronization are based on rsync or on its library.
- Rsync: the best-known file synchronization tool. It has many options that allow great flexibility.
- Duplicity: uses the rsync library (librsync) to make compressed and encrypted file backups.
- Unison: allows directory synchronization leveraging features from different systems and tools.
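As an example of synchronization, rsync can be driven from a small script. This is a sketch under assumptions: rsync is installed, the host `backup@example.org` and the paths in the usage note are placeholders, and the helper names `rsync_command` and `synchronize` are hypothetical. Only the standard rsync options `--archive`, `--compress`, `--delete` and `--dry-run` are used.

```python
import subprocess
from typing import List


def rsync_command(source: str, destination: str,
                  delete: bool = False, dry_run: bool = False) -> List[str]:
    """Build an rsync command line that mirrors source into destination."""
    cmd = ["rsync", "--archive", "--compress"]
    if delete:
        # Remove files from the destination that no longer exist in the source.
        cmd.append("--delete")
    if dry_run:
        # Report what would be transferred without changing anything.
        cmd.append("--dry-run")
    # A trailing slash makes rsync copy the directory contents,
    # not the directory itself.
    cmd += [source.rstrip("/") + "/", destination]
    return cmd


def synchronize(source: str, destination: str, delete: bool = False) -> None:
    """Run rsync and raise CalledProcessError if it exits with an error."""
    subprocess.run(rsync_command(source, destination, delete=delete), check=True)
```

For instance, `synchronize("/srv/data", "backup@example.org:/backups/data")` would copy the directory to a remote host over SSH. Running with `dry_run=True` first is a cheap way to review the effect of `--delete`, which makes the destination lose any file deleted at the source.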
4.2 · Copies
The most basic way of making a backup is copying files to a separate location. For this, there are tools of widely varying scope and complexity.
- fwbackups: a tool with a simple interface but plenty of options that lets you schedule backups at different levels.
- Bacula: a very complete tool that allows you to make backups at different levels (full, differential, incremental), from different clients (Linux, Solaris, Windows) and to various media (tape and disk). It is free software, although it has a commercial support option.
- Mondorescue: this software allows you to make a backup of an entire installation and leave copies on several physical media.
4.3 · Databases
Databases require special treatment. Copying the files that contain the databases is not usually a good backup method, since a copy taken while the server is writing to those files can yield an inconsistent database. Therefore, each server usually provides its own backup system, often based on data dumps in different formats.
- MySQL: mysqldump dumps database data as SQL statements. This can be used to make backups and to create replicas (slaves), among other uses.
- PostgreSQL: pg_dump, just like mysqldump, dumps the data as SQL.
- SQL Server: Microsoft’s database server offers the SQL Server Management Studio utility, which lets you schedule different backup tasks, as well as tasks to run before and after them, specifying when and at what level each backup is carried out so that it is consistent and easily restorable.
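A scheduled dump for the first two servers might be scripted along the following lines. This is a minimal sketch with several assumptions: the `dump_command` helper and its file-naming scheme are hypothetical, credentials are expected to come from the client configuration (for example `~/.my.cnf` or `~/.pgpass`), and `--single-transaction` only gives a consistent mysqldump for transactional tables such as InnoDB.

```python
import subprocess
from datetime import datetime
from pathlib import PurePosixPath
from typing import List


def dump_command(engine: str, database: str, output_dir: str, now: datetime) -> List[str]:
    """Build the command that dumps one database to a timestamped SQL file."""
    stamp = now.strftime("%Y%m%dT%H%M%S")
    target = str(PurePosixPath(output_dir) / f"{database}-{stamp}.sql")
    if engine == "mysql":
        # Dump inside one transaction so the copy is consistent (InnoDB).
        return ["mysqldump", "--single-transaction", f"--result-file={target}", database]
    if engine == "postgresql":
        # Plain format writes SQL statements, like mysqldump.
        return ["pg_dump", "--format=plain", f"--file={target}", database]
    raise ValueError(f"unsupported engine: {engine}")


def dump(engine: str, database: str, output_dir: str) -> None:
    """Run the dump and raise CalledProcessError if the tool reports an error."""
    subprocess.run(dump_command(engine, database, output_dir, datetime.now()), check=True)
```

A dump is only a backup once it has been restored successfully at least once, so it is worth loading the resulting file into a scratch server from time to time.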
4.4 · Snapshots
Snapshots are “photographs” of the system, or of a part of it, that make it possible to return it to a state known to be good. Snapshots can be taken at different levels:
- File system: some file systems allow you to make backups in the form of snapshots. ZFS has this feature.
- Disk volume: LVM offers this possibility for recovering volumes.
- Virtual machines: many virtual machine managers allow taking snapshots of the virtual machines themselves. KVM, Xen and VMware, among others, have this feature.
- Files: with rsnapshot you can make snapshot-style backups that leverage rsync and use hard links transparently. Back In Time also makes directory snapshots, although for desktop environments only.
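The hard-link idea behind file-level snapshots can be sketched in a few lines. This is an illustration of the general technique, not of how rsnapshot is implemented: the `take_snapshot` function is hypothetical, it handles only regular files and directories, and it compares file contents, whereas rsync by default compares size and modification time.

```python
import filecmp
import os
import shutil
from pathlib import Path
from typing import Optional


def take_snapshot(source: Path, snapshot: Path, previous: Optional[Path] = None) -> None:
    """Copy source into snapshot, hard-linking files unchanged since previous."""
    snapshot.mkdir(parents=True, exist_ok=True)
    for src_file in sorted(source.rglob("*")):
        relative = src_file.relative_to(source)
        target = snapshot / relative
        if src_file.is_dir():
            target.mkdir(parents=True, exist_ok=True)
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        old = previous / relative if previous is not None else None
        if old is not None and old.is_file() and filecmp.cmp(src_file, old, shallow=False):
            # Unchanged: both snapshots share the same data on disk.
            os.link(old, target)
        else:
            # New or modified: store a real copy, preserving metadata.
            shutil.copy2(src_file, target)
```

Every snapshot directory looks like a full copy, but unchanged files occupy disk space only once. Hard links require all snapshots to live on the same file system, and the approach assumes that files inside a snapshot are never edited in place.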
4.5 · Continuous Data Protection
Continuous data protection consists of automatically saving a copy of every change made to the data, building up a remote copy of all versions. This allows data to be recovered as it was at any point in time. Even though there are optimized systems that save only the differences and take up little disk space, this approach puts a load on the network because of the continuous data transfer.
- AIMstor: it allows you to easily define policies through a graphical interface and supports different types of backup, replication and archiving.
- RecoverPoint: supports remote data replication using synchronous and asynchronous protocols.
- InMage DR-Scout: it has an optimized capacity repository and supports various platforms (Windows, Linux, Solaris …).
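The principle of recovering data as it was at any point in time can be shown with a toy in-memory model. It is not how any of the products above work: the `VersionLog` class is hypothetical, it keeps whole versions rather than differences, and it assumes that changes for a given file are recorded in increasing time order.

```python
import bisect
from typing import Dict, List, Optional


class VersionLog:
    """Keeps every recorded version of each file, ordered by time."""

    def __init__(self) -> None:
        self._times: Dict[str, List[float]] = {}
        self._contents: Dict[str, List[bytes]] = {}

    def record(self, name: str, timestamp: float, content: bytes) -> None:
        """Store a new version; timestamps must increase for each name."""
        self._times.setdefault(name, []).append(timestamp)
        self._contents.setdefault(name, []).append(content)

    def restore(self, name: str, timestamp: float) -> Optional[bytes]:
        """Return the content as it was at timestamp, or None if it did not exist yet."""
        times = self._times.get(name, [])
        # Index of the first version recorded strictly after the timestamp.
        index = bisect.bisect_right(times, timestamp)
        if index == 0:
            return None
        return self._contents[name][index - 1]
```

A real system would send each change to a remote store as it happens, which is where the continuous network transfer comes from.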
5 · What to take into account
With countless tools available, making a choice can be complicated. To simplify the search and narrow down the options, it is essential to define your own needs and compare them with what each solution offers, in order to find the tool that best suits them. Some questions that can help when choosing a tool are the following:
- Installation: is it packaged, or is compilation necessary? Is it easy to install? Does it have any special requirements?
- Setup and maintenance: is it easy to maintain? Is it able to deploy a policy? How much learning time does it require? Does it have a graphical interface?
- Restoration: is the restoration easy and quick? Can a user restore a file of their own, or does it always have to be the administrator?
- Compatibility: does it work for all systems of the platform? Does the server have to run in a specific system?
- Storage media: does it support backups to tape, DVD, remote file systems, disk…?
- License: is it free or commercial software? Is commercial support available for companies?
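One possible way to compare candidates against these questions is a weighted score per criterion. The sketch below is only an illustration: the `rank_tools` function is hypothetical, and the tool names, weights and scores in the usage note are invented placeholders, not an evaluation of any real product.

```python
from typing import Dict, List, Tuple


def rank_tools(weights: Dict[str, int],
               scores: Dict[str, Dict[str, int]]) -> List[Tuple[str, int]]:
    """Rank tools by the weighted sum of their per-criterion scores.

    weights maps each criterion to its importance for the platform;
    scores maps each tool to its score per criterion. A criterion that
    was not scored for a tool counts as 0.
    """
    totals = {
        tool: sum(weight * tool_scores.get(criterion, 0)
                  for criterion, weight in weights.items())
        for tool, tool_scores in scores.items()
    }
    # Highest total first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

For example, with weights `{"restoration": 3, "installation": 1}`, a placeholder `tool_a` scored 2 and 5 totals 11, while `tool_b` scored 4 and 1 totals 13 and ranks first. The numbers matter less than the exercise of writing down which criteria weigh most for the platform.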
6 · Conclusions
The variety of options makes it possible to define the backup policy very flexibly: one or several tools can be used for the same platform, with different levels and retention periods for different files. Often there isn’t a single, fixed solution, and the same policy rarely works for two different platforms.
Although it is advisable to stay flexible for small changes (the policy or resources can be modified), making large changes to the backup system can be complex, so it’s important to carry out a careful study before choosing the best option. The key is defining and meeting the needs of each platform by adjusting the system to the available resources.