How to Develop an Effective Backup Policy
Can we all agree that bulletproof backups are essential for all businesses? For serious backup administration, you have to write a documented backup policy that ensures reliable backups are created and tested, and then enforce it. Here's how to create a backup policy that's right for your organization.
A Google search will turn up several example backup policies. Notice how short and practical they are. This is essential to every backup policy – it has to be focused, with clear instructions and definitions. Backup policies are not standardized, but all of them have to address the following issues, not necessarily in this order:
What to Back Up
Backups should contain only otherwise unobtainable data files. You can skip temporary and cache files because they don't contain essential data or can be easily recreated. You can usually ignore non-configuration system files unless the server installation is very specific and would require too much time to reinstall from scratch.
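As a rough illustration of this selection rule, the sketch below filters a list of paths against a set of exclusion patterns. The patterns and the function names are hypothetical placeholders, not a recommended list; what counts as temporary or cache data depends on your own systems.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion patterns for temporary and cache files.
# Adjust these to match what is actually disposable on your servers.
EXCLUDE_PATTERNS = ["*.tmp", "*.cache", "*/tmp/*", "*/cache/*", "*~"]


def should_back_up(path, exclude_patterns=EXCLUDE_PATTERNS):
    """Return True unless the path matches one of the exclusion patterns."""
    return not any(fnmatchcase(path, pattern) for pattern in exclude_patterns)


def select_files(paths, exclude_patterns=EXCLUDE_PATTERNS):
    """Keep only the paths that are worth backing up, preserving order."""
    return [path for path in paths if should_back_up(path, exclude_patterns)]


if __name__ == "__main__":
    candidates = [
        "/home/alice/report.docx",
        "/var/tmp/session.dat",
        "/srv/app/cache/page1.html",
        "/etc/nginx/nginx.conf",
    ]
    for path in select_files(candidates):
        print(path)
```

Keeping the patterns in one named list makes it easy to quote them in the policy document itself, so that the written rule and the implemented rule stay the same.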
Because it's not always easy to differentiate which files are custom, sometimes it's easier to back up whole systems. Many popular backup solutions, especially in virtualization, offer exactly such functionality. The biggest benefit of such solutions is that they let you restore entire servers in the shortest time possible and with as little confusion as possible. However, they are usually proprietary and non-transparent, limiting you to what the software designers think your needs are.
How to Back Up
A typical backup policy will involve three types of backups. Full backups contain all the information needed to restore a service. Naturally, they require the most resources and take the longest time to run, which means you won't want to run full backups every time you want to save data. Differential backups contain the cumulative changes since the last full backup. Such backups can be performed more frequently because they take fewer resources and less time. Restoring from a differential backup, however, takes longer than restoring from a full backup alone, because you must work with both the target differential backup and the most recent full backup. Incremental backups are similar to differential backups, except that they reflect only the changes since the previous backup of any type, whether full, differential, or incremental; they don't compare with the original full backup directly, so they demand even fewer resources to create, but take more time and resources to restore, because every backup in the chain has to be applied in order.
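The difference between the three types can be sketched in a few lines. The example below is a simplified model, not how any particular backup product works: it assumes changes are detected by modification timestamp, and that an incremental backup captures changes since the most recent backup of any type. The function names and the shape of the inputs are made up for illustration.

```python
def files_to_copy(mtimes, kind, last_full, last_backup):
    """Pick the files a backup of the given kind would copy.

    mtimes      -- dict mapping path to modification timestamp
    kind        -- "full", "differential", or "incremental"
    last_full   -- timestamp of the most recent full backup
    last_backup -- timestamp of the most recent backup of any type
    """
    if kind == "full":
        return sorted(mtimes)
    if kind == "differential":
        cutoff = last_full        # everything changed since the last full
    elif kind == "incremental":
        cutoff = last_backup      # only what changed since the last backup
    else:
        raise ValueError(f"unknown backup kind: {kind}")
    return sorted(path for path, mtime in mtimes.items() if mtime > cutoff)


def restore_chain(history):
    """Return the indices of the backups needed to restore the latest state.

    history is a chronological list of backup kinds. The chain is the last
    full backup, then the last differential after it (if any), then every
    incremental taken after that point.
    """
    if "full" not in history:
        raise ValueError("no full backup to restore from")
    last_full = len(history) - 1 - history[::-1].index("full")
    chain = [last_full]
    base = last_full
    for i in range(len(history) - 1, last_full, -1):
        if history[i] == "differential":
            base = i
            chain.append(i)
            break
    chain.extend(
        i for i in range(base + 1, len(history)) if history[i] == "incremental"
    )
    return chain
```

The length of the list returned by `restore_chain` is a rough proxy for restore effort: a schedule built mostly from incrementals is cheap to run but yields long chains, while regular differentials keep the chain short.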
What's the best mix of full, differential, and incremental backups? The answer differs for every business, depending on several factors, including budget. Restore time, for instance, plays a key role in determining backup policy. The shorter the restore time your business needs, the more you need your backups in a form closest to the original and in an easily accessible location. Another factor to consider is your backup window, which is the time your data is available to be backed up.
When to Schedule Backups
The backup process uses CPU, memory, and network resources, along with disk I/O operations. Because this load degrades a server's performance, backups should be scheduled for less busy hours, such as after midnight. In the case of a 24/7 business, you can decrease the impact on a server's performance by performing incremental backups more frequently and thus avoiding a long backup window.
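A scheduler or wrapper script can enforce the agreed hours with a simple check. The sketch below assumes a hypothetical window of 00:30 to 05:00; the function name and default times are illustrative only. It also handles windows that cross midnight, such as 22:00 to 04:00.

```python
from datetime import time


def in_backup_window(now, start=time(0, 30), end=time(5, 0)):
    """Return True if the time of day `now` falls inside the backup window.

    The window includes `start` and excludes `end`. If `start` is later
    than `end`, the window is taken to wrap around midnight.
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end


if __name__ == "__main__":
    from datetime import datetime

    if in_backup_window(datetime.now().time()):
        print("Inside the backup window: safe to start a backup.")
    else:
        print("Outside the backup window: postpone the backup.")
```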
When scheduling backups, also consider the possible data loss from files not saved since the last successful backup. When you cannot afford any data loss, you can implement more advanced solutions that provide near-real-time backups, or use mirroring for constant backup.
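One way to make this concrete is to compare the gaps between successful backups with the amount of data loss the business says it can tolerate (often called the recovery point objective, a term this article does not otherwise use). The sketch below is a minimal illustration with hypothetical function names; it treats the largest gap between consecutive successful backups as the worst-case loss.

```python
from datetime import datetime, timedelta


def worst_case_loss(backup_times):
    """Return the largest gap between consecutive successful backups.

    A failure just before the later backup of that pair would lose
    everything written during the gap.
    """
    ordered = sorted(backup_times)
    gaps = (later - earlier for earlier, later in zip(ordered, ordered[1:]))
    return max(gaps, default=timedelta(0))


def meets_tolerance(backup_times, tolerated_loss):
    """Return True if no gap between backups exceeds the tolerated loss."""
    return worst_case_loss(backup_times) <= tolerated_loss


if __name__ == "__main__":
    # Example data: three nightly backups, one of them delayed by six hours.
    history = [
        datetime(2024, 1, 1, 1, 0),
        datetime(2024, 1, 2, 1, 0),
        datetime(2024, 1, 3, 7, 0),
    ]
    print(worst_case_loss(history))
    print(meets_tolerance(history, timedelta(hours=24)))
```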
Ensure Backup Security, Reliability, and Availability
You cannot take chances with backups, and you have to consider every possible point of failure. Common backup design flaws are connected with:
- Physical storage – Ensure that your backup media are in a different physical location from the main site. This will protect them from natural disasters and physical theft. In the best case, the backup media should be present only during the backup and restore processes; immediately after that they should be returned to a safe location.
- Encryption – All backups should be considered by default to contain sensitive information and thus require encryption. If an unencrypted backup were to fall into the wrong hands, the information it held could be retrieved and could cause serious damage. If your backup application doesn't provide encryption on the fly, you can encrypt the backup media. TrueCrypt was long a popular application for this, implementing a combination of strong encryption algorithms on any media, but its development ended in 2014; an actively maintained successor such as VeraCrypt is a safer choice today.
- Online security – If you are security-paranoid, you know that nothing connected to the network is secure. Given that, you can see why a login ID with access to the main site should not have access to the backup site too. An intruder who has stolen such a login could then compromise your backups. To make access even harder, backup media should be stored offline.
How to Perform Specific Service Backups
So far we've covered the basic questions and concerns that you have to address in every backup policy. However, backup policies should also answer more specific problems for your environment, in order to avoid loose interpretation and unexpected behavior. Take database backups, for example.
Database backups are trickier than simple file backups because database files are more complex than those in a filesystem, and more frequently updated. Just copying a database's data directory while the server is running would almost certainly result in serious data integrity problems when you try to restore it. Let's look at MySQL, a typical open source database, as an example.
You could dump the contents of a MySQL database with standard utilities like mysqldump, but this usually requires a lot of system resources and locking of tables. This usually means downtime unless you have set up MySQL replication. Without replication or binary logging, any information that changes in the time between backups is unrecoverable.
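For reference, a dump can be scripted along the following lines. This is a minimal sketch, not a complete backup job: the database name, output path, and credentials file are hypothetical, and the helper functions are made up for illustration. The `--single-transaction` option asks mysqldump for a consistent snapshot without locking tables, which works only for transactional storage engines such as InnoDB; other engines still need locking.

```python
import subprocess


def build_mysqldump_command(database, output_file, defaults_file):
    """Build the argument list for a mysqldump run.

    defaults_file is an option file holding the MySQL user and password,
    so that credentials do not appear on the command line. It has to be
    the first option passed to mysqldump.
    """
    return [
        "mysqldump",
        f"--defaults-extra-file={defaults_file}",
        "--single-transaction",   # consistent snapshot for InnoDB tables
        "--routines",             # include stored procedures and functions
        f"--result-file={output_file}",
        database,
    ]


def run_dump(database, output_file, defaults_file):
    """Run mysqldump and raise CalledProcessError if it fails."""
    command = build_mysqldump_command(database, output_file, defaults_file)
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical names; replace with your own database and paths.
    print(" ".join(build_mysqldump_command(
        "shop", "/backup/shop.sql", "/etc/mysql/backup.cnf")))
```

Passing `check=True` matters here: a dump that fails silently is exactly the kind of backup that looks fine until someone tries to restore it.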
Specific application-level backups can be complex and tedious, and can significantly increase the risk of failures and incorrect backups. That's why in more complex environments it's better to create whole server backups containing everything from the server's operating system to the database service. If you're running in a virtual environment, such backups are easy to create with system snapshots and through the use of third-party applications such as VMProtect for VMware.
How to Test and Verify Backups
The last and most neglected point in every backup policy is backup verification. At some sites, backups are created but not verified and thus unreliable. Whether it is a MySQL dump with corrupted encoding or a forgotten independent VMware image, there is always a chance that something may go terribly wrong.
The only way to prevent disasters with backups is to test them regularly – each month or even more frequently – in order to ensure that your data restores properly and within the allowed timeframe. That's why your backup policy should clearly state who should verify the backups and when.
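Part of such a test can be automated. The sketch below compares a test restore against a reference copy by SHA-256 checksum and reports files that are missing or differ. The function names are hypothetical, and the approach assumes the reference copy has not changed since the backup was taken; in practice you would more likely record checksums at backup time and compare against those.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map each file's path relative to root onto its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in root.rglob("*")
        if path.is_file()
    }


def verify_restore(reference_root, restored_root):
    """Return (missing, corrupted) lists of relative file paths.

    missing   -- files in the reference copy that the restore lacks
    corrupted -- files present in both whose contents differ
    """
    expected = snapshot(reference_root)
    actual = snapshot(restored_root)
    missing = sorted(set(expected) - set(actual))
    corrupted = sorted(
        name for name in expected.keys() & actual.keys()
        if expected[name] != actual[name]
    )
    return missing, corrupted
```

A check like this confirms only that the files came back intact. It does not replace a full test restore, which also shows whether the service starts and whether the restore finishes within the allowed timeframe.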
Backups are important to every business, but they are expensive in terms of human and system resources. That's why their planning and implementation are an important process for IT managers. Good managers test their backups regularly, while bad ones test them only during disasters.
This work is licensed under a Creative Commons Attribution 3.0 Unported License.