For those who have gone through the pain ...
How to recover data with Restic
In the first part of this article series, we described how you can easily and quickly make backups of containers using Restic. However, backing up data is not an end in itself. Rather, it is about recovering data if the backed-up system fails. This article is dedicated to that aspect of backup: with Docker, restoring is just as simple as creating the backup was. Finally, we will also cover how to delete backups if you run out of storage space.
This article assumes that Restic is already installed and that snapshots have already been created with it. You can find installation instructions in the first article of this series.
Restic commands that are covered in this article
In this article, we are interested in the following commands in particular:
- snapshots: to view existing backups
- restore: to restore files and directories
- forget: to delete backups
To review: Restic can tell you which arguments are accepted at each command level via the --help argument, e.g., restic --help or restic backup --help. This article assumes the use of root privileges via sudo su in order to avoid any problems with file permissions.
Restoring from a backup
Once backups exist in a Restic repository, they can serve their main purpose: restoring the backed-up data. If the container is still working, the content can simply be restored into the volume. To illustrate how this works, the content of the Docker volume is deleted in order to artificially induce a fault.
rm -r /var/lib/docker/volumes/nginxData/_data/*
If the page is accessed in the browser at http://localhost:8080, then you should see an error message. Now we can restore the data. However, the rules for open files when performing a restore are analogous to those that are applied to backups. This means that the logical or technical dependencies between containers must also be followed in order to perform a restore:
- Stop the container
- Restore the volume(s)
- Start the container
Before restoring, make sure that the previously used Minio server has also been started, since it will hold the backups from the first part of this series for testing purposes:
docker start minio
The most reliable way to select a backup to restore is to pass the desired snapshot ID. This can be determined via restic snapshots or from the output of restic backup. Since the saved path is an absolute path (i.e., from / downwards), the --target option must also be set to /. If you were to use the complete path /var/lib/docker/volumes/nginxData/_data/index.html as the target instead, then the entire directory tree would be recreated recursively below the _data/ directory.
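To see how --target affects where files end up, it can help to restore into a scratch directory first. The following is only a minimal sketch: the function name restore_to_scratch and the directory /tmp/restic-restore-test are assumptions made for illustration and are not part of the original setup.

```shell
# Hypothetical helper: restore a snapshot into a scratch directory instead of /.
# Restic recreates the saved absolute path below the target, so the page ends up at
#   <target>/var/lib/docker/volumes/nginxData/_data/index.html
restore_to_scratch() {
    snapshot_id="$1"
    target="${2:-/tmp/restic-restore-test}"   # assumed scratch location
    restic restore "$snapshot_id" --target "$target"
}

# Usage (not executed here):
#   restore_to_scratch <snapshot-id>
#   ls /tmp/restic-restore-test/var/lib/docker/volumes/nginxData/_data/
```

Restoring to a scratch directory leaves the live volume untouched, which makes it a low-risk way to check the contents of a snapshot.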
First, it should be clarified which snapshots are available:
restic snapshots
repository 00d7d2bb opened successfully, password is correct
ID        Time                 Host          Tags                  Paths
----------------------------------------------------------------------------------------------------------
aed06d2f  2019-04-05 13:00:01  MY-HOST-1337  Complete backup KW15  /var/lib/docker/volumes/nginxData/_data
d7e6092d  2019-04-12 15:52:32  MY-HOST-1337  Complete backup KW16  /var/lib/docker/volumes/nginxData/_data
----------------------------------------------------------------------------------------------------------
The ID d7e6092d from the listing is now used for the specific restore. The command sequence to restore the volume contents looks like this:
docker stop prod-nginx
prod-nginx
restic restore d7e6092d --target /
docker start prod-nginx
prod-nginx
Alternatively, you can use the command restic restore latest, which uses the latest snapshot to perform the restore. However, this is not recommended, because the situation is similar to container images with the latest tag: if there are different backups of different volumes, then it is not clear what data is actually in the most recently created snapshot. In principle, you can narrow the selection with a path filter (--path), but this runs contrary to the goal of ease of use.
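If latest is used anyway, the selection can at least be narrowed with the --path and --host filters of restic restore. This is a sketch under assumptions: the function name restore_latest_for_path is hypothetical, and the path and host in the usage comment are taken from the snapshot listing in this article.

```shell
# Hypothetical helper: restore the latest snapshot that matches a given
# backup path and host, instead of the latest snapshot of the whole repository.
restore_latest_for_path() {
    path="$1"
    host="$2"
    restic restore latest --target / --path "$path" --host "$host"
}

# Usage (not executed here):
#   restore_latest_for_path /var/lib/docker/volumes/nginxData/_data MY-HOST-1337
```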
Since caution is usually required with production data, it is advisable to include the --verify argument when performing a restore:
restic restore d7e6092d --target / --verify
This is an additional safety measure: after restoring, Restic compares the restored files with the data in the backup repository.
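The stop, restore, and start steps can be bundled into one small helper so that no step is forgotten. This is a minimal sketch, not a definitive implementation: the function name restore_volume is hypothetical, and leaving the container stopped after a failed restore is a design choice assumed here.

```shell
# Hypothetical helper: stop the container, restore with verification,
# and start the container again only if the restore succeeded.
restore_volume() {
    container="$1"
    snapshot_id="$2"

    docker stop "$container" || return 1

    if restic restore "$snapshot_id" --target / --verify; then
        docker start "$container"
    else
        # Assumed policy: do not start a container on top of a broken restore.
        echo "restore of $snapshot_id failed, $container left stopped" >&2
        return 1
    fi
}

# Usage (not executed here):
#   restore_volume prod-nginx d7e6092d
```

Keeping the container stopped on failure avoids serving half-restored data; if availability matters more, the start could be moved outside the condition.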
Deleting backups
With this knowledge, you are well equipped for backup and recovery. However, when you are backing up data to your own disks, experience shows that storage space runs out rather quickly. If, on the other hand, you are backing up data to the cloud, then depending on the storage approach and the contracts you have concluded, you may have access to what essentially amounts to an infinite amount of storage space. Even in this case, it can be quite desirable to reduce the list of snapshots to a manageable size.
For these cases, Restic provides a dedicated command, namely restic forget. Before looking into the specific features of restic forget, it is a good idea to take a look at the way Restic works. In order to ensure speed, Restic works intensively with references and hashes in addition to using encryption. Before each transfer, the hash of the part being backed up is calculated. If the hash shows that this part already exists in the repository, then it is not transferred again but only referenced. This deduplication saves both time and storage space. For the inquisitive: the restic stats --mode raw-data command indicates the actual amount of storage space that is used by the backup repository.
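The effect of deduplication can be made visible by comparing two modes of restic stats. This is a sketch; the function name show_dedup_effect is hypothetical, while restore-size and raw-data are modes of the stats command.

```shell
# Hypothetical helper: compare the logical size of the snapshots with the
# space they actually occupy in the repository after deduplication.
show_dedup_effect() {
    echo "Size of the data if the snapshots were restored:"
    restic stats --mode restore-size
    echo "Space actually used in the repository:"
    restic stats --mode raw-data
}

# Usage (not executed here):
#   show_dedup_effect
```

The more unchanged data the snapshots share, the larger the gap between the two numbers should be.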
If you merely remove snapshots from the backup repository, they disappear from the overview, but their data still takes up space on the hard disk. This is because finding unreferenced data takes time. Restic offers two ways to actually free up the space: either the separate restic prune command or the --prune parameter of restic forget.
The easiest way to remove backups is to use snapshot IDs. This is also the most explicit approach, since exactly the snapshots you name are removed and nothing else. For example, this command removes the three specified snapshots and frees up the space on the hard disk:
restic forget 40dc1520 79766175 590c8fc8 --prune
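The same removal can also be split into the two separate steps, for example to forget snapshots during the day and run the slower prune at night. A minimal sketch, assuming a hypothetical helper named forget_then_prune:

```shell
# Hypothetical helper: remove snapshots first, free the space afterwards.
forget_then_prune() {
    # Step 1: drop the snapshot entries from the repository (fast).
    restic forget "$@" || return 1
    # Step 2: find and delete data that is no longer referenced (can take a while).
    restic prune
}

# Usage (not executed here):
#   forget_then_prune 40dc1520 79766175 590c8fc8
```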
Policies
If you have set up an automated backup, it is common practice to automate the rotation of old backups as well, e.g., to keep only a certain number of backups per interval of time. In that case, it is impractical to use snapshot IDs. The alternative is so-called policies, which let you select, based on criteria, the snapshots that should not be removed.
In practice, it is helpful to first run restic forget with the --dry-run parameter to see the effect without fear of data loss. Restic goes to some lengths to avoid accidental data loss: if a policy combination would result in all snapshots being deleted, Restic does not apply it and deletes no snapshots, as in the following example.
restic forget --keep-last 0 --prune
repository 8460094c opened successfully, password is correct
no policy was specified, no snapshots will be removed
A simple policy is provided by the --keep-last parameter, which takes the number of most recent snapshots to keep. This example retains the last three snapshots of each path:
restic forget --keep-last 3 --prune
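Before running a policy like this for real, its effect can be previewed. The sketch below wraps the --dry-run parameter in a hypothetical helper named preview_forget; any policy parameters are passed through unchanged.

```shell
# Hypothetical helper: show which snapshots a policy would keep and remove.
# With --dry-run, restic forget only reports its decision and deletes nothing.
preview_forget() {
    restic forget --dry-run "$@"
}

# Usage (not executed here):
#   preview_forget --keep-last 3
```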
In addition, there are a number of alternatives to narrow down the selection of snapshots that will be retained. For example, --keep-hourly keeps the most recent snapshot for each of the given number of hours in which snapshots exist. There are equivalents at the daily, weekly, monthly, and yearly level (--keep-daily, --keep-weekly, --keep-monthly, --keep-yearly).
There are two other interesting policy parameters that differ from the time-based parameters above. While --keep-tag retains the snapshots with a given tag, --keep-within {duration} retains all snapshots within a defined period counted back from the latest snapshot. The following example retains all snapshots taken in the 2 years, 5 months, 7 days, and 3 hours before the latest snapshot:
restic forget --keep-within 2y5m7d3h --prune
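The --keep-tag parameter works the same way and can be used together with --keep-within. The following sketch assumes a hypothetical helper named forget_except_tagged; the tag name and duration in the usage comment are placeholders and do not come from the repository used in this article.

```shell
# Hypothetical helper: keep every snapshot that carries the given tag,
# plus all snapshots within the given duration before the latest snapshot.
forget_except_tagged() {
    tag="$1"
    duration="$2"
    restic forget --keep-tag "$tag" --keep-within "$duration"
}

# Usage (not executed here; "important" and "30d" are placeholder values):
#   forget_except_tagged important 30d
```

No --prune is passed here, so the snapshots are only removed from the overview until a prune is run.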
Policy modules
All Restic policies can be easily combined by passing several policy parameters in the same command. If, for example, you want Restic to keep one snapshot per month, week, and day, you can specify this very elegantly using the following parameters:
restic forget --keep-daily 1 --keep-weekly 1 --keep-monthly 1 --prune
It is worth taking a look at the documentation, especially if the policy is augmented with tag lists, which we are not able to discuss in further detail here.
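For an automated rotation, a combined policy can be placed in a small script that is called by a scheduler such as cron. This is only a sketch: the function name rotate_backups and the retention numbers are assumptions chosen for illustration, not recommendations from this article.

```shell
# Hypothetical rotation job: keep 7 daily, 4 weekly and 12 monthly snapshots
# (assumed values) and free the space of everything else.
rotate_backups() {
    restic forget \
        --keep-daily 7 \
        --keep-weekly 4 \
        --keep-monthly 12 \
        --prune
}

# Usage (not executed here):
#   rotate_backups
```

When run unattended, Restic needs to know the repository and password without prompting, for example through the RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE environment variables.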
Summary
Restic is a powerful tool that solves many important aspects of backup and restore, especially since it offers ease of use combined with speed and security. It can be a bit of a hassle in more complex cases to delete snapshots with policies, but there is always the method of using snapshot IDs. You can rely on this method as a backup, so to speak.