Introduction to Exchange Server 2016 Backup and Recovery

There’s plenty of excitement to be had when it comes to backup and recovery of Exchange servers. The excitement isn’t so much in the setup and day-to-day operation of the backups; it’s more in hoping that you’ll be able to get back the data you need when you find yourself in a recovery situation.

Over the years I have witnessed many unfortunate data loss incidents that were ultimately the fault of incorrectly configured or non-operational backups. The simple fact is that failures happen. One day your server or storage will fail, and you’ll need to recover the data that it was hosting. You should expect that to happen. If you don’t have reliable backups, then you might find yourself suddenly needing to update your resume and get your best suit dry cleaned.

This is the first part of a series on backup and recovery for Exchange Server 2016, and introduces the concepts you’ll need for the rest of the series. Throughout this series of articles I’ll also cover:

How to back up Exchange Server 2016 with Windows Server Backup
How to restore an Exchange Server 2016 database from backup
How to restore individual Exchange Server 2016 mailboxes and mailbox items
How to recover a failed Exchange Server 2016 standalone server
How to recover a failed Exchange Server 2016 database availability group member

Let’s get started by covering some of the general concepts around Exchange Server 2016 backup and recovery.

Backup and Recovery Terminology

As you deal with different Exchange Server backup and recovery scenarios you’ll encounter a lot of the same terminology, so let’s start with that.

Types of Backup

There are four backup types that you’ll generally see referred to in backup products and documentation.

Full – a complete copy of the data on a server, volume, application or file system. For Exchange 2016 database backups a “full” backup is sometimes also referred to as a “VSS Full” or an “application aware” backup. A full backup will include all of the data regardless of whether the data has changed since the last backup or not. A full backup is a complete set of data that can be used for a restore.
Incremental – a partial copy of the data on a server, volume, application or file system. An incremental backup will only include data that has changed since the last full or incremental backup. In a restore scenario the last full backup, plus all subsequent incremental backups up to the point in time you’re restoring to, will be required for the recovery to be successful.
Differential – similar to an incremental backup; however, a differential backup does not mark the data as having been backed up, so each differential includes everything that has changed since the last full backup. This means that differential backup sets tend to get larger and larger as you get further away from the last full backup. However, in a restore scenario differentials can be simpler than incrementals because you only need the last full backup plus the latest differential backup to perform the recovery.
Copy – similar to a full, however the data is not marked as having been backed up. Copy backups are typically used to make a copy of data to another system for testing purposes. Copy backups are not suitable for recovery scenarios involving Exchange databases.

Each backup type has pros and cons. Full backups are the simplest to operate and recover from, but take the longest to run. Using a mixture of full backups plus incremental or differential backups can shorten some of your backup job times, but at the cost of extra time and complexity when you need to perform a recovery.
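
To make the recovery trade-off concrete, here is a minimal Python sketch that works out which backup sets you would need for a restore. It is only an illustration of the rules described above and not how any backup product works internally. The `BackupSet` class and `restore_chain` function are hypothetical names, and the sketch assumes a schedule that follows each full backup with either incrementals or differentials, not a mixture of both.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class BackupSet:
    kind: str            # "full", "incremental", or "differential"
    taken_at: datetime


def restore_chain(history: List[BackupSet], restore_to: datetime) -> List[BackupSet]:
    """Return the backup sets needed to restore to `restore_to`, oldest first."""
    # Only backups taken at or before the restore point are useful.
    usable = sorted((b for b in history if b.taken_at <= restore_to),
                    key=lambda b: b.taken_at)
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup exists at or before the restore point")

    base = fulls[-1]  # the most recent full backup is the starting point
    later = [b for b in usable if b.taken_at > base.taken_at]
    incrementals = [b for b in later if b.kind == "incremental"]
    differentials = [b for b in later if b.kind == "differential"]

    if incrementals and differentials:
        # Assumption of this sketch: schedules are not mixed.
        raise ValueError("mixed schedules are outside the scope of this sketch")
    if differentials:
        # Full + latest differential is enough.
        return [base, differentials[-1]]
    # Full + every incremental since that full, in order.
    return [base] + incrementals


if __name__ == "__main__":
    full = BackupSet("full", datetime(2016, 1, 3, 22, 0))
    nightly = [datetime(2016, 1, d, 22, 0) for d in (4, 5, 6)]
    target = datetime(2016, 1, 6, 23, 0)

    with_incrementals = [full] + [BackupSet("incremental", t) for t in nightly]
    with_differentials = [full] + [BackupSet("differential", t) for t in nightly]

    print(len(restore_chain(with_incrementals, target)))
    print(len(restore_chain(with_differentials, target)))
```

With one full backup followed by three nightly backups, the incremental schedule needs every set in the chain for the restore, while the differential schedule needs only the full backup and the most recent differential.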

In addition to the backup types listed above you’ll encounter other terminology in various backup products such as “synthetic full” backups. Those terms can mean many different things depending on the backup vendor, so you should refer to the specific documentation for those products to find out more.

Backup Storage

Different backup products support writing and storing backup sets on a wide variety of storage types.

Tape – magnetic tape backup media that is available in many different formats and capacities. Tape is still commonly used today but not always as the primary backup medium. Instead it is often used to replicate backup sets from disk storage so that a copy of the backed-up data can be taken offsite.
Disk – very large capacity disk storage is very cheap these days, faster than tape for many backup and recovery scenarios, and often has attractive features such as hardware-based de-duplication, compression, and replication.
Cloud – there are many cloud-based storage providers to choose from these days, such as Amazon Web Services and Microsoft Azure. These providers sell storage by the gigabyte, usually at very low cost. Cloud-based backup storage often includes built-in replication of your data to protect it from failures in the cloud provider’s infrastructure. Cloud-based backup storage is also very practical in that you do not need to purchase large amounts of it up front as you do with on-site backup storage.

Cloud-based backup is becoming very popular these days; however, backing up large amounts of data to the cloud does require good network bandwidth between you and your provider. It can also be slower to restore from. Some organizations use cloud-based storage as an off-site replica of their on-premises disk-based backup storage. Some even combine all three, backing up primarily to on-site disk, then replicating that to the cloud while also making copies of specific data to tape (usually multiple tapes) to be stored off-site for specific retention requirements.

When you are considering backup storage for your Exchange 2016 servers remember to follow the 3-2-1 Rule:

At least 3 copies of the data
Stored on at least 2 different media
At least 1 copy kept off-site
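
If you keep an inventory of where your backup copies live, the 3-2-1 Rule is easy to check mechanically. The following Python sketch is one way to express it. The `BackupCopy` class and `meets_3_2_1` function are hypothetical names invented for this example, and the sketch assumes the list contains every copy of the data that you choose to count.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackupCopy:
    label: str
    media: str      # e.g. "disk", "tape", "cloud"
    offsite: bool


def meets_3_2_1(copies: List[BackupCopy]) -> bool:
    """Check a list of copies against the 3-2-1 Rule."""
    enough_copies = len(copies) >= 3                    # at least 3 copies
    enough_media = len({c.media for c in copies}) >= 2  # on at least 2 media
    has_offsite = any(c.offsite for c in copies)        # at least 1 off-site
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    plan = [
        BackupCopy("on-site backup disk", "disk", offsite=False),
        BackupCopy("cloud replica", "cloud", offsite=True),
        BackupCopy("monthly tape", "tape", offsite=True),
    ]
    print(meets_3_2_1(plan))
    print(meets_3_2_1(plan[:1]))
```

The first plan satisfies all three conditions, while a single on-site disk copy fails on every one of them.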

Other General Terminology

Here are some additional terms you may need to be familiar with.

RPO – stands for Recovery Point Objective. The RPO is the point in time that you are attempting to recover data from. For example, attempting to recover a mailbox from 5pm last Monday. The RPO may also define how much data loss a business is willing to accept in the event of a disaster, and your backup solution should be designed to meet that RPO. For example, if the business tells you that they are willing to accept up to 24 hours of data loss, then running only a weekly backup is obviously not acceptable.
RTO – stands for Recovery Time Objective. The RTO defines the amount of time that is acceptable to perform a recovery after a disaster. Your backup solution should be designed to meet the RTO as well. For example, if the business requires an RTO of 8 hours but it would take you 20 hours to retrieve tapes from off-site storage and recover from them, then you would not be able to meet the RTO. However, you should also be aware that the RTO can be impacted by infrastructure other than the Exchange server itself. If you virtualize your Exchange servers and the virtualization hosts are lost in a disaster, then obviously you can’t start to recover the Exchange VM until some other host is available.
VSS – stands for Volume Shadow Copy Service. VSS is part of the Windows Server operating system and is used to make application-aware backups of Exchange 2016 databases.
Recovery Database – a special type of Exchange server database that is used as the target for a database restore operation. Data within the mailboxes of a recovery database can’t be accessed by clients but can be extracted by the administrator and restored into a user’s mailbox.
Database Portability – this refers to Exchange Server 2016’s capability to mount databases that have been copied or restored from other Exchange 2016 servers within the same Exchange organization. This is useful when the original server that hosted the database is no longer available for the recovery operation.
Dial Tone Recovery – this refers to Exchange Server 2016’s capability to mount a temporary database with empty mailboxes for end users to connect to so that they can continue to send and receive emails. A dial tone recovery is often used to restore service for end users while the much longer process of recovering the mailbox data from backup is performed.
Log Truncation – all changes (transactions) to an Exchange 2016 database are stored in a memory buffer and also written to transaction log files. Periodically the memory buffer is flushed by committing changes to the database file itself. Because there is generally some gap between what has been written to the transaction log files and what has been committed to the database, the log files become very important in a recovery scenario. Transaction logs accumulate on the server (and consume disk space) until the next database backup. When a full backup of the database is taken, the server removes the transaction log files that are no longer needed for recovery, because a backup of the database up to that point in time has been successfully taken.
Circular Logging – when circular logging is enabled the transaction logs are automatically truncated as the changes are committed to the database file. This reduces the disk space consumption by the transaction logs, but removes the ability to recover the database beyond the point of the most recent backup.
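
The RPO and RTO examples above boil down to two simple comparisons, which the following Python sketch spells out. The function names `meets_rpo` and `meets_rto` are hypothetical, and the sketch makes a deliberately pessimistic assumption: that the worst-case data loss equals the time between backups, ignoring any transaction logs that might survive a failure and allow a more recent recovery.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Assume worst-case data loss is the full gap between two backups."""
    return backup_interval <= rpo


def meets_rto(estimated_recovery_time: timedelta, rto: timedelta) -> bool:
    """The estimate should include everything: retrieving media, rebuilding hosts, restoring data."""
    return estimated_recovery_time <= rto


if __name__ == "__main__":
    # A weekly backup against a 24-hour RPO.
    print(meets_rpo(timedelta(weeks=1), timedelta(hours=24)))
    # A nightly backup against the same RPO.
    print(meets_rpo(timedelta(hours=24), timedelta(hours=24)))
    # 20 hours to retrieve tapes and recover, against an 8-hour RTO.
    print(meets_rto(timedelta(hours=20), timedelta(hours=8)))
```

The weekly backup fails the 24-hour RPO and the 20-hour tape recovery fails the 8-hour RTO, matching the examples in the definitions above. The value of writing it down this way is that it forces you to put real numbers on both your backup schedule and your recovery estimate.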

As an additional note, you may encounter snapshot-based backup systems in the real world, especially when you’re running virtualized Exchange servers. While a snapshot-based backup solution may still be supported for backups, provided it takes an application-aware backup that properly truncates the transaction logs, snapshots are not supported for recovery purposes. Many snapshot-based backup products provide different processes or tools to use for recovering data from their backup sets that do not involve “rolling back” the VM to the last snapshot, which is fine. I mention this because a common mistake by administrators is to take a snapshot of an Exchange VM before any routine maintenance (such as monthly security patching) with the expectation that they can “roll back” the VM using that snapshot if something goes wrong with the patching. Unfortunately this type of snapshot recovery can be catastrophic for an Exchange server.

What to Back Up for Exchange Server 2016

Exchange Server 2016 has two server roles: Mailbox and Edge Transport. The backup requirements for each server role are different.

Edge Transport – a full server backup is generally advisable, however it is not necessarily a requirement. If your ability to rebuild and reinstall the Edge Transport server (for example with an automated operating system deployment and pre-scripted Exchange installation and configuration) allows you to restore operation within an acceptable timeframe then you would not necessarily need to also use traditional backups for the server.
Mailbox – similar to the Edge Transport server you may consider not backing up the server operating system itself if you have fast enough rebuild processes. However, Mailbox servers also host the databases containing mailbox and public folder data, as well as the transport databases that may contain email messages still in transit. Therefore it is recommended to back up at least the databases, if not the entire server.

Aside from the considerations above you should also think about the various log files that are stored on the Exchange servers, such as event logs or message tracking logs. Those are important for historical purposes.

If you’re in any doubt as to what you should be backing up on your Exchange servers, I recommend you err on the side of caution and back up everything.

Summary

In this article I’ve provided an overview of backup and recovery concepts that may apply to Exchange Server 2016. In the next part of this series we’ll look at backing up Exchange Server 2016 with Windows Server Backup.

Comments

Ovi 11 Aug 2020

Hi Paul.

I have an argument about enabling circular logging on CAS (hybrid) servers. Is there any reason in the world not to? We have multiple servers for this purpose and none of them has a backup configured (load balanced). Transaction logs are years older and I really see no logic in keeping this config. There is no mailbox on these servers, there is no intention to restore if something goes south (leverage the remaining boxes), so keeping the transaction logs seems unreasonable from my perspective.
Thank you.

Jerwin S Ravelo 24 Sep 2016

Thank you so much of your great information.

Phat Do 8 Nov 2015

Thanks a tons!
Very helpful for newbie Exchange Administrator