Platform backups are an essential part of systems administration. As a result, they are a widely studied subject, and many tools have been created for them. However, making good backups is still a complex task that involves defining a policy, studying and choosing the tool(s), and implementing them.

1 · Backup policy

The backup policy defines the different aspects of backups: What should be backed up? How often will backups take place? How long should they be retained? Where are the backups stored? How much time is acceptable to spend recovering data?
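As a minimal sketch, the questions above can be written down as a small record, which makes the policy explicit and checkable. The class name, field names, paths, and the 30-day retention below are all hypothetical values chosen for illustration, not recommendations from this post.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class BackupPolicy:
    """Hypothetical record of the questions a backup policy answers."""
    paths: tuple[str, ...]      # what to back up
    interval_days: int          # how often a backup runs
    retention_days: int         # how long each copy is kept
    destination: str            # where the copies are stored
    max_restore_hours: float    # acceptable time to recover data


def expired(policy: BackupPolicy, backup_dates: list[date], today: date) -> list[date]:
    """Return the backup dates that fall outside the retention window."""
    cutoff = today - timedelta(days=policy.retention_days)
    return [d for d in backup_dates if d < cutoff]


# Example values only; every platform needs its own numbers.
policy = BackupPolicy(
    paths=("/etc", "/var/www"),
    interval_days=1,
    retention_days=30,
    destination="/mnt/backups",
    max_restore_hours=4.0,
)
```

Calling `expired(policy, dates, today)` lists the copies that the retention rule would allow to be deleted; a backup made exactly `retention_days` ago is still kept.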

Ideally, every file would be copied daily and kept with a long retention, with copies stored locally for quick recovery and externally for greater protection. However, this conflicts with efficiency and cost: tape robots, disk usage, physical space to keep copies, etc. It is therefore necessary to find a compromise by setting maximum costs and prioritizing backups of the most critical resources.

Every project or platform has its own requirements that need to be defined to establish the policy. To do this, it’s useful to:

2 · Backup saving and retention

An important security aspect of a backup policy is where the backup copies will be stored. When defining technology risk prevention plans, vaulting is usually considered to mitigate the effects of a possible incident at the site where backups are made. This practice consists of periodically moving a complete copy of the data to another location, for example, once a month. It is common when the physical medium is tape.

This is different from archiving, which consists of moving old, unused data to a different location. A backup is always a copy, while archiving transfers the original data because it is no longer in use but should not be permanently deleted.

Among the different storage options, Amazon’s Glacier service is worth considering. Its low cost is a great advantage, but restoration is not guaranteed within a specific time and may take a few hours. This rules it out for vaulting while making it an interesting candidate for archiving, where data does not need to be recovered within a limited time.

3 · Restoration

A backup’s final goal is to be able to restore it in case of data loss. Hence, keeping restoration in mind when defining a backup policy or choosing a tool is key. To do so, it’s important to have previously decided (in risk management) the following points:

The time it takes to restore a backup after data loss is part of the downtime itself, so the faster the restoration, the sooner the business process is back in service.
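A rough way to check whether a backup can be restored within an acceptable downtime is to estimate the transfer time from the data size and a sustained throughput. The sketch below is a back-of-the-envelope calculation, assuming a constant throughput and an optional fixed overhead (for example, time to fetch media); the function names and figures are hypothetical.

```python
def restore_hours(data_gb: float, throughput_mb_s: float,
                  fixed_overhead_h: float = 0.0) -> float:
    """Estimate the hours needed to restore data_gb at a sustained throughput."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = (data_gb * 1024) / throughput_mb_s  # 1 GB taken as 1024 MB
    return fixed_overhead_h + seconds / 3600


def meets_target(data_gb: float, throughput_mb_s: float, target_h: float,
                 fixed_overhead_h: float = 0.0) -> bool:
    """True if the estimated restore time fits within the target downtime."""
    return restore_hours(data_gb, throughput_mb_s, fixed_overhead_h) <= target_h
```

For instance, 500 GB at 100 MB/s comes to 5120 seconds, a little under an hour and a half, so `meets_target(500, 100, 2)` holds while `meets_target(500, 100, 1)` does not. Real restores are rarely this linear, so the result is best treated as a lower bound.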

4 · Tools

Tools allow us to implement a backup policy. Given the variety of platforms, many tools acting at different levels have been created. Some types of backups and tools (there are many more) are described below:

4.1 · Synchronization

This type of backup keeps the same files in two directories in different locations (on the same machine or on different hosts). Many of the tools that allow synchronization are based on rsync or on its library.
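To illustrate the idea of synchronization, here is a minimal one-way mirror using only the Python standard library. It is a sketch, not a replacement for rsync: it compares whole files instead of transferring deltas, works only on local paths, and never deletes files that disappeared from the source. The function name `mirror` is hypothetical.

```python
import filecmp
import shutil
from pathlib import Path


def mirror(source: Path, target: Path) -> list[Path]:
    """Copy new or changed files from source to target; return the copied paths."""
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        dst = target / src.relative_to(source)
        # Skip files whose content is already identical in the target.
        if dst.exists() and filecmp.cmp(src, dst, shallow=False):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves timestamps
        copied.append(dst)
    return copied
```

Running `mirror` twice in a row copies nothing the second time, which is the property that makes synchronization cheaper than a full copy when few files change.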

4.2 · Copies

The most basic way to make a backup is to copy files to a separate location. A wide range of tools, of varying complexity, can be used for this.
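The simplest form of this can be sketched with Python's standard `tarfile` module: pack a directory into a compressed archive whose name carries a timestamp, so successive copies do not overwrite each other. The function name, the naming scheme, and the optional `now` parameter (there to make the result reproducible) are assumptions made for this example.

```python
import tarfile
from datetime import datetime
from pathlib import Path
from typing import Optional


def archive(source: Path, backup_dir: Path, now: Optional[datetime] = None) -> Path:
    """Write a compressed, timestamped tar archive of source into backup_dir."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    backup_dir.mkdir(parents=True, exist_ok=True)
    out = backup_dir / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(out, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the directory name.
        tar.add(source, arcname=source.name)
    return out
```

Each call produces a full copy, so disk usage grows with every run; the retention defined in the policy decides when old archives can be removed.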

4.3 · Databases

Databases require special treatment. Saving the files that contain the databases is not usually a good backup method, since a file copied while the server is writing to it can yield an inconsistent database. Therefore, each server usually provides its own backup system, often based on data dumps in different formats.
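As an illustration of going through the database engine instead of copying its files, the sketch below uses SQLite, chosen here only because it ships with Python; the post does not name a specific database. `Connection.backup` and `Connection.iterdump` are part of the standard `sqlite3` module, while the wrapper function names are hypothetical.

```python
import sqlite3


def backup_sqlite(live: sqlite3.Connection, target_path: str) -> None:
    """Copy a live SQLite database through the engine's online backup API."""
    target = sqlite3.connect(target_path)
    try:
        live.backup(target)  # the engine produces a consistent copy
    finally:
        target.close()


def dump_sql(live: sqlite3.Connection) -> str:
    """Return the database as a text dump of SQL statements."""
    return "\n".join(live.iterdump())
```

The first function yields a binary copy that can be opened directly, and the second yields a portable text dump that can be replayed to rebuild the database. Other database servers ship their own equivalents of both approaches.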

4.4 · Snapshots

Snapshots are “photographs” of the system, or a part of it, that allow it to be recovered to a known good state. Snapshots can be carried out at different levels:

4.5 · Continuous Data Protection

It consists of automatically saving a copy of every change made to the data, so that a remote copy of all versions is kept. This allows data to be recovered as it was at any moment in time. Even though there are optimized systems that save only the differences and take up little disk space, this approach puts load on the network because of the continuous data transfer.
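The core idea, keeping every version so that any moment can be recovered, can be sketched with a small in-memory log for a single file. This is a toy model under stated assumptions: a real system would ship changes to a remote location and typically store differences, whereas this hypothetical `VersionLog` class stores whole contents locally and only avoids saving the same content twice.

```python
import bisect
import hashlib


class VersionLog:
    """Hypothetical in-memory log that keeps every saved version of one file."""

    def __init__(self) -> None:
        self._times: list[float] = []        # timestamps, in increasing order
        self._digests: list[str] = []        # content digest at each timestamp
        self._blobs: dict[str, bytes] = {}   # digest -> content, stored once

    def record(self, timestamp: float, content: bytes) -> bool:
        """Store content if it differs from the latest version; True if stored."""
        digest = hashlib.sha256(content).hexdigest()
        if self._digests and self._digests[-1] == digest:
            return False  # unchanged since the last version
        if self._times and timestamp <= self._times[-1]:
            raise ValueError("timestamps must increase")
        self._times.append(timestamp)
        self._digests.append(digest)
        self._blobs.setdefault(digest, content)
        return True

    def at(self, timestamp: float) -> bytes:
        """Return the content as it was at the given moment."""
        i = bisect.bisect_right(self._times, timestamp)
        if i == 0:
            raise LookupError("no version recorded at or before this time")
        return self._blobs[self._digests[i - 1]]
```

After recording versions at times 1.0 and 3.0, `at(1.5)` returns the first version and `at(3.0)` the second, which mirrors the point-in-time recovery described above.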

5 · What to take into account

With countless tools available, making a choice can be complicated. To simplify the search and narrow the options, it is essential to define your own needs and what each solution offers, so as to find the tool that best suits them. Some issues that can help when choosing a tool are the following:

6 · Conclusions

The variety of options allows the backup policy to be defined flexibly, so that one or several tools can be used for the same platform, with different levels and retentions for different files. Often there isn’t a single, fixed solution, and the same policy rarely works for two different platforms.

Although flexibility is advisable for small changes (the policy or resources can be modified), making large changes to the backup system can be complex, so it’s important to carry out a thorough study to choose the best option. The key is defining and meeting the needs of each platform by adjusting the system to available resources.

TAGS: Archiving, backup, Resources, Risk prevention plan, RPO, RTO, Vaulting

José Hdz | May 13, 2017 3:37 am

Excellent post!

Cristofer | April 3, 2017 7:51 am

Hi, at our company we make daily backups that cover all information and emails. I have a question: if I delete all the emails in my Outlook, how does the backup work? Will it be copied without the emails? I’m told they are kept for about 7 days. Is that true? I mean, will the technician be able to recover that information or not?

Enrique | February 7, 2017 3:19 am

What does “site” refer to in point number 2?

Alba Ferrer | February 7, 2017 1:30 pm

Hi Enrique,

In this context, ‘site’ refers to the location where the backups are made. Traditionally this would be the same data center where the servers reside (on a tape robot, on a disk array where the copies are made, etc.), although it also applies to the cloud (a specific provider). The idea of vaulting is to keep data outside that location in case it becomes unavailable. For example, if an AWS region has problems, or if there is a power failure in the data center, the data is available elsewhere and can still be accessed.

Antonio Carvallo | November 1, 2016 10:48 pm

Fascinating post

