Exchange Server 2013 Autoreseed in Action

At my IT/Dev Connections session in September I advocated for simple designs for database availability groups. This included some points about Exchange Server 2013 storage design and layout, such as:

  • JBOD vs RAID
  • Multiple databases per volume
  • Volumes mounted in folders not drive letters
  • Co-locate the database and transaction log files on the same volume

Those recommendations came with caveats of course, depending on various factors. Simple designs are easier to manage, and they also let you take advantage of a terrific new feature in Exchange Server 2013 called Autoreseed.

Autoreseed in Exchange Server 2013

With Autoreseed the members of an Exchange 2013 DAG are pre-configured with one or more spare volumes. When a disk fails the Exchange server is able to automatically replace the failed disk with a spare, and then reseed the lost database copies to the new volume.

This means the recovery workflow in Exchange 2013 goes like this:

  1. Disk fails (resiliency of your DAG is impacted)
  2. Spare disk automatically mounted
  3. Database copies reseeded (resiliency is restored automatically)
  4. Manual intervention to replace the failed disk with a new spare

In Exchange Server 2010, which also supported JBOD storage, the recovery workflow goes like this:

  1. Disk fails (resiliency of your DAG is impacted)
  2. Manual intervention to replace disk
  3. Manual intervention to reseed database copies (resiliency is restored)

The Exchange 2010 recovery workflow involves too many manual steps to restore the resiliency of the DAG, requires response by admins at any hour of the day, and is simply not efficient at scale.

The Exchange 2013 recovery workflow can automatically restore the resiliency of the DAG without manual intervention, requires response by admins at a lower urgency, and is far more efficient at scale.

Laying the foundation for Autoreseed involves implementing those recommendations I mentioned earlier. Let’s take a look at them in a little more detail.

RAID vs JBOD

For single datacenter deployments:

  • Always use RAID for the system/OS volume
  • Always use RAID when there are fewer than 3 database copies
  • Use JBOD when there are 3 or more database copies
  • Use JBOD for lagged copies only when 2 or more lagged database copies exist

For multiple datacenter deployments:

  • Always use RAID for the system/OS volume
  • Always use RAID when there are fewer than 2 database copies in a datacenter
  • Use JBOD when 2 or more database copies exist in a datacenter
  • Use JBOD for lagged copies as long as 2 or more lagged copies exist, or log play down is enabled
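
The copy-count thresholds above can be expressed as a small decision helper. This is a minimal Python sketch of the rules as listed, not an official sizing tool. The function and parameter names are hypothetical, and `copies` is assumed to mean the total number of copies in a single datacenter deployment, or the number of copies within one datacenter in a multiple datacenter deployment.

```python
def database_volume_storage(copies: int, multi_datacenter: bool) -> str:
    """Return "RAID" or "JBOD" for database/log volumes.

    The system/OS volume is always RAID and is not covered here.
    """
    jbod_threshold = 2 if multi_datacenter else 3
    return "JBOD" if copies >= jbod_threshold else "RAID"


def jbod_ok_for_lagged(lagged_copies: int, multi_datacenter: bool,
                       log_play_down: bool = False) -> bool:
    """Return True if the lagged-copy rule allows JBOD."""
    if lagged_copies >= 2:
        return True
    # Log play down is only listed as an alternative for multiple datacenters.
    return multi_datacenter and log_play_down


print(database_volume_storage(copies=2, multi_datacenter=False))  # RAID
print(database_volume_storage(copies=3, multi_datacenter=False))  # JBOD
print(database_volume_storage(copies=2, multi_datacenter=True))   # JBOD
```

The two rules are kept as separate functions because the list above states them separately; how they combine for a given lagged copy is left to the designer.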

Multiple Databases Per Volume

Use multiple databases per volume when 3 or more database copies exist. These volumes can be placed on RAID or JBOD (with a preference for JBOD, as I’ll explain shortly).

The number of databases per volume should equal the number of copies of the databases.
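
As a rough illustration of that rule, the sketch below works out how many data volumes a DAG member would need. The helper name `plan_volumes` is hypothetical, and it assumes every database has the same number of copies and that volumes are filled evenly. Spare volumes for Autoreseed are not counted.

```python
import math
from typing import Tuple


def plan_volumes(copies_per_database: int, copies_hosted_on_server: int) -> Tuple[int, int]:
    """Return (databases per volume, data volumes needed on this server)."""
    # Rule from the article: databases per volume equals copies per database.
    databases_per_volume = copies_per_database
    data_volumes = math.ceil(copies_hosted_on_server / databases_per_volume)
    return databases_per_volume, data_volumes


# Four copies of each database, four database copies hosted on this server.
print(plan_volumes(4, 4))   # (4, 1)
# Same DAG, but this server hosts ten database copies.
print(plan_volumes(4, 10))  # (4, 3)
```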

Volumes Mounted in Folders not Drive Letters

Mounting your volumes as drive letters is fine for non-DAG deployments, and works for DAG deployments as well, but is not recommended.

There is the obvious limitation of the size of the alphabet. With only 23 usable letters after A:, B:, and C: are consumed, and Exchange 2013 Enterprise capable of hosting 100 databases, you can easily run into problems or at the very least find yourself juggling a complex configuration to work around it.

Instead mount your volumes as folders, using a RAID-protected host volume (the C: volume for system/OS is fine for this).

Co-Locate Database and Transaction Log Files

Exchange admins are used to placing the database and transaction log files on separate volumes for recoverability from disk failures. This is still recommended for non-DAG scenarios.

For DAG scenarios the fact that you have multiple copies of each database mitigates the risk of a single disk failure taking out an entire database. So co-locating the database and transaction log files is recommended for DAG scenarios, especially when using multiple databases per volume, and also when using JBOD.

Combining the above, along with evenly distributed active, passive and lagged database copies, gives you an Exchange 2013 DAG that looks similar to this example.

Example Exchange 2013 DAG

This example obviously assumes that a four node DAG in two datacenters is the right solution for the environment. Your own requirements will vary of course, but this example is being used mainly to demonstrate Autoreseed.

Example Storage Layout for DAG Members

With all of the above in mind here is an example of how the storage layout would be configured for an Exchange 2013 DAG member.

We start with a RAID protected system/OS volume, and create two folders in the root of C:.

  • ExchangeDatabases
  • ExchangeVolumes

These match up with the default settings of an Exchange 2013 DAG for root folder paths.

Next, the volumes that will be hosting the databases and log files are configured. For this simple example a single volume is being configured to host active data and a single volume is being configured as a spare. These are mounted into sub-folders of C:\ExchangeVolumes named Volume1 and Volume2.

Volume1 is then mounted into additional folders for hosting the database and log files. These folder names match the names of the databases in the DAG, for example DB01, DB02, DB03 and DB04. These are created as sub-folders of the C:\ExchangeDatabases folder.

If you’re wondering what I mean by this, all I am referring to is mounting the volume into multiple paths instead of as a drive letter, just as you would normally see when first creating the volume.

Finally, create sub-folders of each database folder to host the DB and log files. These are named according to the database names again, so DB01 needs sub-folders named DB01.db and DB01.log.

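These folders are then used as the paths when creating the mailbox databases themselves. For example, in this environment DB01 keeps its database file in C:\ExchangeDatabases\DB01\DB01.db and its transaction logs in C:\ExchangeDatabases\DB01\DB01.log.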
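
Because the layout follows a strict naming convention, the folder paths for any database can be derived mechanically. The following Python sketch builds them from the two root folders used in this example. The helper names are hypothetical, and the code only constructs path strings; it does not create folders or mount volumes.

```python
from pathlib import PureWindowsPath

# Root folders used in this article's example.
DATABASES_ROOT = PureWindowsPath(r"C:\ExchangeDatabases")
VOLUMES_ROOT = PureWindowsPath(r"C:\ExchangeVolumes")


def volume_mount_point(volume_number: int) -> PureWindowsPath:
    """Folder where a volume is mounted, e.g. Volume1 or Volume2."""
    return VOLUMES_ROOT / f"Volume{volume_number}"


def database_layout(db_name: str) -> dict:
    """Mount point plus the .db and .log sub-folders for one database."""
    mount_point = DATABASES_ROOT / db_name
    return {
        "mount_point": mount_point,
        "database_folder": mount_point / f"{db_name}.db",
        "log_folder": mount_point / f"{db_name}.log",
    }


for name in ["DB01", "DB02", "DB03", "DB04"]:
    layout = database_layout(name)
    print(layout["database_folder"], layout["log_folder"])
```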

Autoreseed in Action

When a disk fails in an Exchange Server 2013 DAG member the Autoreseed workflow begins. However, the following conditions must be met for Autoreseed to take place:

  1. The database copies are not blocked from resuming replication or reseeding.
  2. The log and database files for the database are co-located on the same volume.
  3. The log and database folder structure matches the naming convention required for Autoreseed.
  4. There are no other database copies on the volume that are in an “Active” state.
  5. All database copies on the volume are in a “FailedAndSuspended” state.
  6. The server has no more than 8 “FailedAndSuspended” database copies.

If those conditions are met then Autoreseed can attempt to resolve the issue.
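
To make the six conditions concrete, here is a minimal Python sketch that checks them against a simplified model of the copies on one volume. The `CopyOnVolume` class, its fields and the `autoreseed_blockers` function are hypothetical names for illustration only; Exchange performs these checks internally and does not expose them in this form.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CopyOnVolume:
    name: str
    status: str               # e.g. "FailedAndSuspended" or "Healthy"
    is_active: bool = False   # True if this is the active copy of the database
    blocked: bool = False     # blocked from resuming replication or reseeding
    colocated: bool = True    # database and log files on the same volume
    naming_ok: bool = True    # folder structure follows the naming convention


def autoreseed_blockers(volume_copies: List[CopyOnVolume],
                        failed_suspended_on_server: int) -> List[str]:
    """Return the reasons Autoreseed cannot proceed (empty list means it can)."""
    blockers = []
    for db_copy in volume_copies:
        if db_copy.blocked:
            blockers.append(f"{db_copy.name}: blocked from resume or reseed")
        if not db_copy.colocated:
            blockers.append(f"{db_copy.name}: logs and database not co-located")
        if not db_copy.naming_ok:
            blockers.append(f"{db_copy.name}: folder naming does not match")
        if db_copy.is_active:
            blockers.append(f"{db_copy.name}: active copy on the volume")
        if db_copy.status != "FailedAndSuspended":
            blockers.append(f"{db_copy.name}: status is {db_copy.status}")
    if failed_suspended_on_server > 8:
        blockers.append("server has more than 8 FailedAndSuspended copies")
    return blockers


copies = [CopyOnVolume("DB01", "FailedAndSuspended"),
          CopyOnVolume("DB02", "FailedAndSuspended")]
print(autoreseed_blockers(copies, failed_suspended_on_server=2))  # []
```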

The workflow begins with detection of the failed volume. Database copies are regularly checked to see whether any of them have been at a status of “FailedAndSuspended” for 15 minutes or longer. This is the state that a database copy will be in when there is an underlying storage issue. The 15 minute threshold exists to ensure that remedial action is not taken too quickly.
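
The detection rule amounts to a status check plus a 15 minute age threshold. A minimal Python sketch, using a hypothetical function name and plain timestamps rather than any real Exchange API:

```python
from datetime import datetime, timedelta

FAILED_THRESHOLD = timedelta(minutes=15)


def is_autoreseed_candidate(status: str, status_since: datetime, now: datetime) -> bool:
    """True if the copy has been FailedAndSuspended for 15 minutes or longer."""
    return status == "FailedAndSuspended" and (now - status_since) >= FAILED_THRESHOLD


now = datetime(2013, 10, 1, 12, 0)
print(is_autoreseed_candidate("FailedAndSuspended", datetime(2013, 10, 1, 11, 50), now))  # False
print(is_autoreseed_candidate("FailedAndSuspended", datetime(2013, 10, 1, 11, 45), now))  # True
```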

The server attempts to resume the FailedAndSuspended database copy 3 times.

The server attempts to assign a spare volume once per hour for up to 5 attempts.

The server attempts to reseed the database copies to the new volume, with up to 5 attempts at 1 hour intervals.

If the process was not successful after 5 attempts, it stops.

After 3 days, if the database copies are still “FailedAndSuspended”, the workflow begins again.
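
The retry behaviour described above can be sketched as three bounded stages. This Python model is an illustration under stated assumptions, not how Exchange implements it: the function names are hypothetical, each stage is represented by a callable that returns True on success, the waits between attempts are not simulated, and it assumes the workflow ends as soon as a resume attempt succeeds.

```python
from typing import Callable

RESUME_ATTEMPTS = 3
SPARE_ATTEMPTS = 5
RESEED_ATTEMPTS = 5


def try_up_to(action: Callable[[], bool], attempts: int) -> bool:
    """Call action until it returns True or the attempts run out."""
    for _ in range(attempts):
        if action():
            return True
    return False


def autoreseed_workflow(resume: Callable[[], bool],
                        assign_spare: Callable[[], bool],
                        reseed: Callable[[], bool]) -> str:
    """Run the stages in order and report how the workflow ended."""
    if try_up_to(resume, RESUME_ATTEMPTS):
        return "resumed"
    if not try_up_to(assign_spare, SPARE_ATTEMPTS):
        return "stopped"   # would start again after 3 days if still failed
    if try_up_to(reseed, RESEED_ATTEMPTS):
        return "reseeded"
    return "stopped"       # would start again after 3 days if still failed


# Resume never works, a spare is assigned at once, reseed succeeds on the 2nd try.
reseed_results = iter([False, True])
outcome = autoreseed_workflow(lambda: False, lambda: True, lambda: next(reseed_results))
print(outcome)  # reseeded
```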

Summary

As you can see, Autoreseed is quite intelligent and effective, resolving a straightforward issue like storage failure with no manual intervention by the administrator except for replacing the failed disk with a new spare.

Just how good is Autoreseed?

In my test lab I tend to treat my servers pretty rough. To test Autoreseed I would regularly open up Server Manager on a DAG member and offline one of the volumes hosting database copies. Then I would go away and do something else for an hour or two.

Every single time Autoreseed successfully restored the resiliency of my DAG. Looking at the event logs, it typically achieves this in a little over an hour. In the real world, if there are delays or retries on some of the Autoreseed workflow steps, or the databases are larger and take longer to reseed, then it may take longer to recover, but I would have full confidence that it would work.

Autoreseed is a feature of a highly intelligent server application that is designed to run efficiently at scale. As with many features in Exchange Server 2013 to take full advantage of Autoreseed you design for *simpler* DAGs. This is counter-intuitive for some people who are used to adding complexity to their designs to make them more resilient.

But as you can see, by getting the right foundations in place you can easily take advantage of the benefits of Autoreseed in your deployment.

Paul is a Microsoft MVP for Office Servers and Services. He works as a consultant, writer, and trainer specializing in Office 365 and Exchange Server. Paul is a co-author of Office 365 for IT Pros and several other books, and is also a Pluralsight author.

26 comments

  1. Victor Lambert says:

    Great Article!!
    We are implementing an Exchange 2016 environment using 5.5Tb SATA drives ReFS as Mount Points with 4 MailDB folders under each Mount Point.
    I’d like to run JetStress on the servers to verify the config.
    Can/should I run JetStress using the above config?
    Or should I format the drives NTFS with Drive Letters, run JetStress, then reconfig the drives as above?

      • Victor Lambert says:

        Thanks Paul…
        We are using physical hardware with multiple servers (including 9 6Tb SATA drives in each).
        When configuring the RAID controller, is the preferred method to create each drive as a single drive with RAID0 & a 256K stripe, then create mount points? Or 1 big drive RAID0?
        Do I use the RAID controller to configure the HotSpare, or will that be in the DAG setup?

        Thanks again…I’m finding little pieces to this config, but nothing all together

        • Luke V. says:

          I think you might be missing the point. JBoD is the setup for the volumes. Raid 1 or 5 the OS drive and that’s all there is to do. DAG mutes the need for redundant disks.

          • Victor Lambert says:

            Correct. Leaving the 6TB drives as RAW disk doesn’t leverage the cache on the RAID controller…leading to JetStress failing miserably.
            To setup a HotSpare with AutoReseed, is that done within the DAG configuration??
            Should the drives be individual RAID0 (so it uses the RAID controller cache), and config DAG to see a single 6TB RAID0 ReFS drive as the HotSpare?
            I thought you didn’t config RAID1 or 5 and let Exchange manage the DAG and HotSpare.

          • If you have RAID, then the volume is unlikely to ever completely fail, which means Autoreseed will never need to do anything. Autoreseed uses an available spare volume that it can see in Windows. It doesn’t integrate with your RAID controller or any RAID controller managed hot spare disk.

            The point of Autoreseed is to not use RAID for your Exchange database/log volumes. You get more overall disk capacity to use because you aren’t losing any of it to RAID, and you get a simpler storage configuration because you’re just presenting disks as volumes instead of having to manage RAID controllers, RAID configs, monitor RAID health, rebuild RAID when a disk fails, etc etc. Microsoft’s experience with Office 365 is that most Exchange failures are caused by disk controllers or disks, hence why Autoreseed was developed to automatically recover from those.

            In short, if you’re going to RAID all your Exchange database/log storage then you can pretty much ignore Autoreseed.

          • Victor Lambert says:

            Unfortunately, if the 6TB SATA disks are left RAW then they don’t leverage the RAID controller cache and fail JetStress. I have configured each individual 6TB SATA drive as RAID0 using a 256K block size. NOT one big 54TB RAID0 drive. The point of this was to leverage the Autoreseed functionality.
            Win2012R2 DiskMgr sees each drive as a 6TB drive and I’ve created mount points per Paul’s instructions above. Leaving one drive as an empty 6TB ReFS drive. I’ll use the spare 6TB drive when configuring Autoreseed.
            I guess my point is…you need to leverage the RAID controller cache with the drives or I/O latency is horrible and will fail JetStress.

            Thanks for the clarification Paul

  2. Hazem says:

    Dear Paul,

    kindly what do you mean by “The number of databases per volume should equal the number of copies of the databases”

    • Let’s say you have four copies of each database, because you have a four-member DAG and you’ve added a database copy to each DAG member.

      In that scenario, it’s recommended to put four databases on the same disk/volume, rather than one database per disk/volume. So you might have 4 x 200GB databases and their log files on a single 2TB volume.

  3. Jan Dye says:

    I’m seeing this during the Auto Reseed – and I see it in some of your screen shots above. My reseed isn’t starting. What can I do to resolve this?

    “Please check the file system permissions. Error: System.IO.DirectoryNotFoundException: Could not find a part of the path . . . . .”

    Thank you for your great instructions.!

  4. Robert says:

    Hey Paul,

    Going through your Pluralsight videos and trying to make sure i really understand autoreseed. In the past i have had a tough time with understanding autoreseed.

    1) C:\ExchangeDatabases\>DB01, DB02, DB03 DB04> those are mount points, and not folder names on the drive correct?
    Example: DB01 would read or show C:\exchangedatabases\db01.db (folder) and db01.log(folder) ?

    Instead of c:\exchangedatabases\db01\db01\db01.db db01.log – Correct? I know this is a psycho question but i just want to make sure i have it down.

      • Robert Bollinger says:

        Right. Ok. So in this case (From your pluralsight videos) its expected that when looking at the DB01 mount point i will also see the DB02 folders?

        Robert

        • That’s a design decision that you make. In the example above you can see a screenshot where I’ve mounted one volume for DB01, DB02, DB03 and DB04.

          So that is 4 databases per volume, in that example. Which means when I go to C:\ExchangeDatabases\DB01 I will see the sub-folders for all 4 databases and their logs, because it’s the same underlying storage volume.

  5. TS79 says:

    I am struggling with the spare selection portion of autoreseed:

    Where is the spare volume configured? How does Exchange select the volume to mount as the spare? Is it simply looking for a volume with sufficient capacity which is not yet mounted?

    • Don’t overthink it. I demonstrate in the article above how to configure a simple autoreseed scenario with Volume1 and Volume2 used in the example. Volume1 hosts the databases, Volume2 is the spare. By following the Microsoft guidance for configuring the storage paths, volume names, number of databases per volume, etc, the DAG is aware of what storage is in use and what is available as spares.

      • TS79 says:

        Thank you Paul,

        So the spare volume mounted in the ExchangeVolumes folder but not in the ExchangeDatabases volume will implicate it as a spare for the DAG.

        Once triggered autoreseed will adjust the ExchangeDatabases\ mount points from the failed volume to the spare volume during the reseed?

        Much appreciated.
