Introduction to Exchange Server 2016 Database Availability Groups

For Exchange Server 2016 the high availability building block is the database availability group (DAG). Exchange 2016 DAGs are very similar to Exchange 2013 DAGs; however, there are some new features and behaviours to be aware of, which I’ll demonstrate in this article series.

Let’s begin with an overview of database availability group concepts.

Exchange Server 2016 DAG Concepts

Database availability groups can contain up to 16 Exchange 2016 mailbox servers, each of which hosts copies of one or more databases that are replicated with database copies on other members of the same DAG.

When a DAG is first created it has zero members. A minimum of two members is required for the DAG to provide high availability. Two-member DAGs are reasonably common as a simple HA deployment of Exchange. For example, in the diagram below two Exchange 2016 servers and a file share witness make up a database availability group.

exchange-2016-dag-01

Database Availability Groups and Quorum

Exchange Server DAGs make use of an underlying Windows Failover Cluster. You don’t need to create, configure, or even touch the Windows Failover Cluster using cluster management tools, except in specific maintenance scenarios that are clearly documented. When you add members to a DAG the failover clustering components are automatically installed and configured for you.

Quorum is the voting process that the cluster uses to determine whether the DAG should remain online or go offline. If the DAG goes offline all of the databases in the DAG are dismounted and inaccessible to end users, causing an outage.

There are two quorum models:

  • Node Majority – when the DAG has an odd number of members the file share witness is not required for the quorum voting process, because the DAG members can determine a “majority” themselves. For example, in a three-member DAG, if one member fails, 2/3 DAG members are still online (a majority) and the DAG can remain online. If two DAG members fail, 1/3 DAG members are still online, which may result in quorum being lost and the DAG going offline.
  • Node and File Share Majority – when the DAG has an even number of members the file share witness is included in the quorum voting process to ensure that a “majority” can be determined. For example, in a two-member DAG if one member fails, 1/2 members are still online (not a majority), but you would expect the DAG to be able to withstand a single node failure. The file share witness is used as the tie-breaker, meaning 2/3 “votes” are still available, and the DAG can stay online. Similarly with a four-member DAG, if two members failed, with the file share witness there are still 3/5 “votes” online, so the DAG can stay online.

All database availability groups are configured with a file share witness, whether it is used for voting or not. The quorum model is adjusted automatically by the DAG as you add or remove members.
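
The vote counting in the two quorum models can be sketched in a few lines of Python. This is only an illustrative model of the arithmetic described above, not how the cluster service is implemented; the function names `quorum_model` and `has_quorum` are made up for this example, and the model ignores Dynamic Quorum entirely.

```python
def quorum_model(member_count: int) -> str:
    """Return the quorum model described above for a DAG of this size."""
    if member_count < 1:
        raise ValueError("a DAG needs at least one member to have a quorum model")
    if member_count % 2 == 1:
        return "Node Majority"
    return "Node and File Share Majority"


def has_quorum(member_count: int, members_online: int, witness_online: bool = True) -> bool:
    """Return True if the votes still online are a strict majority of all votes."""
    if not 0 <= members_online <= member_count:
        raise ValueError("members_online must be between 0 and member_count")
    # The file share witness only gets a vote when the member count is even.
    uses_witness = quorum_model(member_count) == "Node and File Share Majority"
    total_votes = member_count + (1 if uses_witness else 0)
    votes_online = members_online + (1 if uses_witness and witness_online else 0)
    return votes_online * 2 > total_votes


if __name__ == "__main__":
    # (members, members still online) pairs taken from the examples in the text
    for members, online in [(3, 2), (3, 1), (2, 1), (4, 2)]:
        print(members, online, quorum_model(members), has_quorum(members, online))
```

The strict "more than half" comparison is the important design choice: exactly half of the votes is never enough, which is why an even-sized DAG needs the witness as a tie-breaker.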

I wrote above that in some failure scenarios the DAG may lose quorum and go offline. In some circumstances the DAG can sustain a majority of nodes being offline if there have been sequential failures. This is thanks to a feature of Windows Server 2012 clusters called Dynamic Quorum.
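
To see why sequential failures behave differently from simultaneous ones, here is a deliberately simplified Python sketch. It assumes that after each failure the cluster survives, the failed nodes' votes are removed before the next majority is calculated. The function name `survives` is hypothetical, and the model counts node votes only: it leaves out the file share witness and whatever tie-breaking the real cluster does when an even number of votes remains, so don't read anything into its answer once only two voters are left.

```python
from __future__ import annotations


def survives(initial_members: int, failure_batches: list[int]) -> bool:
    """Return True if quorum is kept through every batch of failures.

    Each entry in failure_batches is the number of nodes lost at the same
    moment. After a batch that the cluster survives, the majority for the
    next batch is calculated against the surviving voters only.
    """
    voters = initial_members
    for failed in failure_batches:
        if failed < 0 or failed > voters:
            raise ValueError("cannot lose more nodes than are currently voting")
        remaining = voters - failed
        if remaining * 2 <= voters:
            # The survivors are not a strict majority of the current voters.
            return False
        voters = remaining
    return True


if __name__ == "__main__":
    # Five members losing three nodes one at a time, then all at once.
    print(survives(5, [1, 1, 1]))
    print(survives(5, [3]))
```

In both calls three of the five members end up offline, but only the simultaneous failure takes the modelled cluster below a majority in a single step.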

Database Copies and Continuous Replication

Each member of the Exchange 2016 DAG hosts one or more database copies, and participates in the process of continuous replication to keep those database copies updated with changes. The Exchange 2016 server edition determines how many database copies a DAG member can host. A Standard edition server can host up to 5 database copies, and an Enterprise edition server can host up to 100 database copies.

Exchange 2016 DAG members can host a mix of active and passive database copies, because the switchover/failover occurs at the database level, not the server level. So there is no concept of an “active server” or a “passive server”.
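
A small Python sketch may help to picture database-level failover. It assumes each database has an ordered list of servers holding a copy, and that the active copy is simply the first server in that list that is still online. That ordering loosely mirrors the idea of an activation preference, but it is a simplification: the real selection of which copy to activate weighs more criteria than this. The server and database names, and the function `active_servers`, are invented for the example.

```python
from __future__ import annotations


def active_servers(copies: dict[str, list[str]], failed: frozenset[str] = frozenset()) -> dict[str, str | None]:
    """Map each database to the server hosting its active copy.

    copies lists, per database, the servers holding a copy in preferred
    order. A database with no surviving copy maps to None (dismounted).
    """
    result: dict[str, str | None] = {}
    for database, servers in copies.items():
        result[database] = next((s for s in servers if s not in failed), None)
    return result


if __name__ == "__main__":
    # Hypothetical two-member DAG: each server is active for one database
    # and passive for the other, so neither is "the active server".
    layout = {"DB01": ["EX1", "EX2"], "DB02": ["EX2", "EX1"]}
    print(active_servers(layout))
    print(active_servers(layout, failed=frozenset({"EX1"})))
```

When EX1 fails in this model, only DB01 changes its active server; DB02 was already active on EX2 and is unaffected.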

exchange-2016-dag-02

During continuous replication the transaction log data that is generated on the active database copy is shipped across the network to the DAG members hosting passive database copies. Those DAG members then replay the transaction log data to update their passive database copy. Replay can occur immediately, or the copy can be configured as a lagged database copy, which delays replay.
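
The ship-then-replay flow, and the difference a lagged copy makes, can be modelled roughly as follows. This is a toy sketch, not Exchange code: the `PassiveCopy` class, its fields, and the abstract integer "time units" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PassiveCopy:
    """Toy model of a passive database copy (not an Exchange API)."""

    replay_lag: int = 0  # time units a log must age before it is replayed
    pending: List[Tuple[int, int]] = field(default_factory=list)  # (generation, created_at)
    replayed: List[int] = field(default_factory=list)

    def receive(self, generation: int, created_at: int) -> None:
        """Accept a transaction log shipped from the active copy."""
        self.pending.append((generation, created_at))

    def replay(self, now: int) -> None:
        """Replay, in order, every pending log that is old enough."""
        while self.pending and now - self.pending[0][1] >= self.replay_lag:
            generation, _ = self.pending.pop(0)
            self.replayed.append(generation)


def ship_logs(passive_copies: List[PassiveCopy], log_count: int) -> None:
    """The active copy creates log n at time n and ships it to every passive copy."""
    for n in range(1, log_count + 1):
        for passive in passive_copies:
            passive.receive(generation=n, created_at=n)
            passive.replay(now=n)


immediate = PassiveCopy(replay_lag=0)
lagged = PassiveCopy(replay_lag=3)
ship_logs([immediate, lagged], log_count=5)
print(immediate.replayed)
print(lagged.replayed, [generation for generation, _ in lagged.pending])
```

Both copies receive every log, so both hold the same data on disk. The lagged copy simply holds the newest logs unreplayed for a while.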

Incremental Deployment

There is not a special installation of Exchange Server 2016 for DAG members. An Exchange 2016 mailbox server can be added to a DAG, or removed from a DAG, at any time without impacting the databases and other services hosted on that server. Incremental deployment makes it possible for organizations to deploy a single server today, and then scale out to a DAG at a later time if necessary, without any impact to production services.

Database Availability Group Networks

A DAG network is one or more IP subnets that the DAG members are directly connected to. Every Exchange 2016 database availability group has at least one DAG network that is used for client traffic. A DAG can also have one or more separate, dedicated networks for database replication traffic.

exchange-2016-dag-03

With the speed of modern networks it is generally recommended to use only one DAG network, which is simpler to manage and creates a more predictable failure scenario.

Site Resilience

An Exchange 2016 database availability group provides high availability for Exchange within a single datacenter or Active Directory site. Exchange 2016 DAGs can also be deployed across multiple datacenters to provide site resilience as well, allowing the Exchange services to remain online in the event of a complete datacenter outage.

Summary

In this article I’ve covered an introduction to Exchange Server 2016 database availability groups by explaining some of the fundamental concepts of DAGs. In the next part of this article series I’ll walk through the step by step process of creating an Exchange 2016 database availability group.

Paul is a Microsoft MVP for Office Servers and Services. He works as a consultant, writer, and trainer specializing in Office 365 and Exchange Server. Paul is a co-author of Office 365 for IT Pros and several other books, and is also a Pluralsight author.

59 comments

  1. David Klein says:

    Good description of the process. I would like to know if the witness server requires any Exchange components installed and if it requires an Exchange license.

  2. Rob Gordon says:

    Hi,
    Correction: Do I need two SSL certificates, one for each server, or can I use the same SSL certificate for both servers?

    • One certificate is the recommended approach. But the certificate is not directly related to DAG functionality; it is for client access services, which also need to be considered and planned properly in an HA deployment.

  3. Gordon says:

    Hi – Do I need to configure internal and external access URLs for virtual directories when adding a second Exchange 2016 server as part of DAG creation? Thanks!

  4. Christian Poortvliet says:

    This does not, by any means, create an active-active Exchange server deployment, right? So a mailbox of user X resides in an Exchange server deployment in location A within DB01 and simultaneously in an Exchange server deployment in location B within a replica of DB01 (or even a DB02).

    As far as I know an active-active email server DB with Exchange is not possible, only HA, right?

    You can’t have a single mailbox active-active amongst two Exchange servers which replicate each other’s databases (correct me if I’m wrong).

      • Christian Poortvliet says:

        Thanks, hope it will come some day, for now we have to move away from Exchange because of some specific client requirements. We do really need the active-active database where a single mailbox can be active on both databases simultaneously.

          • Christian Poortvliet says:

            Well, imagine a site with very limited, expensive and slow internet connectivity (satellite internet) which is high in latency and might drop out often.

            When you are on-site (30-40 users) you have an exchange deployment there so emails sent between those people stay local and don’t use the limited and expensive bandwidth.

            When leaving the site (which happens often, as they work on rotation) they want/need email connectivity, so they connect to the on-site server via that very slow link, and they also send/receive emails through that link.

            This ‘hammers’ the satellite connection and creates huge delays in email. Also when the connection is offline (which does happen quite often) there is no email connectivity at all.

            So if an offsite Exchange deployment would exist and we have a sync between the on-site and off-site deployment of all boxes emails would only sync once.

            We let all external emails go to the off-site deployment first and let it sync through with the on-site deployment.

            Via DNS we resolve (on-site) to the local Exchange server and off-site automatically to the other (off-site) deployment.

            You see the problem? This solution is not part of Exchange is it?

            I’ve been looking at hybrid office 365 deployments but you need to transfer the mailbox. That is of course no option through that slow link.

            Also when they go off-site for one day or a couple of hours this does not work.

            Maybe you have another bright idea, but we are (Exchange wise) a bit out of options.

            Thanks for your replies really appreciate it!

          • No, Exchange can’t do anything like that. I doubt that solution is really going to help you, to be honest. You’re trying to reduce client traffic over a slow link, but you’re going to replace it with replication/sync traffic anyway. One way or another, the bits need to cross the wire. And trying to maintain sync over an unreliable connection is going to be a nightmare. I predict a lot of data inconsistency issues with anything that tries to work like that over a bad network link.

            Not knowing all the details of your situation, I would suggest that you put the server wherever the most people are located. If that is on-site, so be it. For anyone accessing email off-site, they should use the most lightweight connection option available, which is Outlook Web App (OWA), now called “Outlook on the web” in Exchange 2016.

  5. Christian Poortvliet says:

    Thanks Paul, yes we know, but we will test it thoroughly before implementing. The OWA won’t help us when the remote site is down for a couple of days. And yes it is going to be a nightmare, but we’ll give it a shot, without an MS product though.

    • Christian Poortvliet says:

      Forgot to add: when we place the server off-site we have the problem that all ‘internal’ emails need to go offsite and back on-site again to be received. That way they won’t have communication when the site is cut-off the internet…. With the server on-site at least the local business can continue. Know it sounds crazy/rare but we have dozens of clients facing this issue…

      • If the remote site is down you lose sync between two replicas anyway. There’s no solution here that protects you from every failure scenario. At some point you’ll need to decide which parts of the service you’re willing to lose under certain failure scenarios. Otherwise you’re looking for a unicorn.

        • Christian Poortvliet says:

          Well, let’s say we have a sync between the on-premise and off-premise DB (assume this is possible). MX comes in to off-premise, so when not on site emails will be received and also senders will get no bounce back.

          On-site the people can still email between one and another and prepare emails to be sent to the ‘outside’ world.

          In theory this sounds better than having (with the downsides outlined above) one location where an Exchange environment exists.

          Only thing is what ‘happens’ if the sync becomes available again between the two DBs.

          • Whatever hypothetical sync product you’re looking for would have to reconcile a lot of change between the two sync partners any time the link was lost for a period of time. In fact, I imagine that if you did lose a link for a few days, the sheer volume of sync needed between the two copies to catch up again would saturate the link for several more days anyway, rendering it useless for normal connectivity.

            Bottom line, an Exchange DAG won’t meet those requirements that you’ve spelled out.

  6. Michael Woods says:

    I’m trying to deploy a two datacenter site resilient solution that can survive a WAN failure. I have found Exchange 2010 documentation that says to create two four-member DAGs. The goal is to have both sites operational if a network failure is preventing communication between the two sites. Do you know if this is still the model in Exchange 2016, or have there been changes in the architecture that no longer require two DAGs?

    • Yes, if you need two separate datacenters to be the “active” datacenter in a WAN failure so that they can serve different geographic regions, then two DAGs is still the solution in Exchange 2016.

      • Gary Jackson says:

        What if you didn’t have this solution but only needed to have an automatic failover and run solely out of the secondary site if the primary location was offline? Right now I have a 3 node DAG (2 primary, 1 secondary) for Exchange 2010 but will be transitioning to 2016. My requirement is a more automatic failover solution and the ability to fully serve all users out of the secondary site (OWA/Autodiscovery). Would the 3 server situation remain?

        • If you want automatic site failover you need a third datacenter that can host the file share witness, and you need the primary and secondary datacenters to host an equal number of DAG members each.

          • PETR SAMONCHEV says:

            What will happen in this case when the link between the first and second datacenters goes down, but the links to the third datacenter remain up for both datacenters? Will this lead to a split-brain situation even with DAC mode configured on the DAG?

  7. Tim says:

    In your example of a two member DAG, with an FSW, if the FSW fails, the DAG will stay up, correct? If one DAG member and the FSW fail, then the DAG shuts down, correct?

    • Kiran Ramesh says:

      Hello Tim,

      FSW acts as a tie-breaker in a two member DAG. I shut down one of the Exchange servers and all is well!! There is simply a warning if another node or access to the FSW is lost the cluster will fail.

      The quorum model is designed to be automatically adjusted:
      A Node Majority quorum model is used for DAGs with an odd number of members.
      A Node and File Share Majority quorum is used for DAGs with an even number of members.

      Thanks
      Kiran

  8. David says:

    Hi Paul,

    As others have said your articles are an invaluable source of information, the go-to site for Exchange related matters.

    We currently have Exchange 2010 SP3 RU12 configured in a cross-site DAG with a single client access server in its own CAS Array in each site. The two data centres are connected at 12mbit. All the databases in the secondary site are passive and only activated (manually) in the event the primary data centre is unavailable.

    Our data centre situation is going to improve which includes much better network bandwidth to our secondary data centre. The data centres will be separated geographically but connected over dark fibre in the same subnet (AD site).

    It looks like we will go with Exchange 2016 (now that all our third-party applications are supported, and the goal of using Exchange Online archives soonish), configure the DAG and perform mailbox moves to the new Exchange 2016 servers, giving us pretty much zero downtime for Email services during our data centre move.

    We plan to split the databases 50/50, i.e. half active on one DAG member and the other half active on the other DAG member, automatic activation should one of the DAG members become unavailable. The complete opposite of how we are doing it today.

    1) What is the behaviour of Exchange 2016 when it comes to accessing mailboxes? If I have a namespace e.g. outlook.company.com with an internal A record pointing to Exchange server 1, will Outlook use Exchange server 1 for its Autodiscover lookup and then connect the Outlook client to the Exchange server on which the database is mounted or will Outlook use Exchange server 1 to access a mailbox whose database is mounted on Exchange server 2?

    2) Without a hardware load-balancer in play (possibly in the near future), in the event site 1 is unavailable can I simply amend (small TTL) the internal DNS record of our namespace to the IP address of the other Exchange server?

    Regards,
    David

  9. Lee Tourgee says:

    Hey Paul,
    What’s your opinion of dedicated replication NICs for 2016 Exchange? It used to be that was always recommended but with 2016 it’s not anymore? What is best practice in your opinion?

  10. Praveen says:

    Dear Paul

    We are running Active Directory on Windows 2008 and Microsoft Exchange 2010 SP3 with standalone servers.
    I am planning to migrate AD to Windows 2012 and Exchange to 2016. The Exchange servers will be in a DAG. I am also planning to add a new Active Directory and an Exchange server, which will be part of the DAG, in a second location as a DR approach.
    Is it possible to migrate AD to 2012 and afterwards Exchange to 2016, and configure the DAG in the DR site? How many witness servers are required in total?

      • Your question is a bit confusing but I’ll try to answer it. I think what you’re saying about the “new Active Directory” is you plan to add a new AD domain controller to a second location. Yes, you can add an Exchange server to that second location as well, but it can’t be the same server that is also the domain controller, you’ll need to make them two separate servers.

        Yes, you can configure a DAG using the two Exchange servers in each site. You’ll need one witness server, and it should be located in the primary datacenter.

  11. Bikram says:

    Hi,

    The article is great, detailed and self explanatory…
    Thanks for all the efforts…

    Currently I have 2013 DAG,

    1) 2 Servers Each having all roles installed on it.
    2) I have barracuda Spam filter too.
    3) I have pointed my mail.company.com and barracuda to DAG IP.. functioning flawlessly from past 2 years.
    4) In case of a server failure the second server takes over automatically, without CAS, HW loadbalancer in picture

    I have some queries..

    Since exchange 2013 creates IP less DAG

    1) Where will clients connect to?
    2) What will happen to Clients… OWA & Outlook?

    I am planning to upgrade my Ex2013 to 2016… a lot depends upon the answer to this..

    Thanks…

  12. Oleg Mironov says:

    Hi Paul,

    I have a quick question regarding Exchange 2016 DAG DR site fail over. I currently have the following environment:

    2 Datacenters (Production and DR)
    1 DAG
    3 Exchange 2016 CU1 Servers (2 in the Production datacenter, 1 in DR datacenter)
    1 Witness server located in the Production datacenter (I know it’s not really doing anything because I have 3 Exchange servers in my DAG)

    During a disaster scenario where my Production datacenter is completely down, will the DAG automatically switch over to DR or do I have to run the following commands:

    Stop-DatabaseAvailabilityGroup DAG01 -MailboxServer Server1 -ConfigurationOnly

    Stop-DatabaseAvailabilityGroup DAG01 -MailboxServer Server2 -ConfigurationOnly

    Then from the Exchange 2016 Server in DR:

    Stop-Service ClusSvc
    Restore-DatabaseAvailabilityGroup DAG01 -ActiveDirectorySite Datacenter-2

  13. Aletheia Khabo says:

    HI Paul
    I have 2 questions
    1. I have 2 Exchange servers, currently both in the primary site, and would like the other one to be moved to the secondary site. The problem is that Outlook users currently choose one or the other.
    I would like Outlook to choose only the primary server name, and choose the secondary only when the primary is unreachable. How can this be effected?
    2. Do the secondary and primary servers necessarily have to be in different AD sites?

      • Jawarah says:

        Hi Paul,

        My scenario is similar to Aletheia’s.

        As client access services (in Exchange 2016) only authenticate and proxy connections to back end services, would there be any performance issues for end users if I have a client access server in the primary site and another in the secondary (using DNS) over a slow link? All users are in the primary site.

        Regards,
        Jawarah

  14. Erik says:

    Hi Paul,
    I am getting an error when adding the third server into the DAG.

    A server-side database availability group administrative operation failed. Error The operation failed. CreateCluster errors may result from incorrectly configured static addresses. Error: An error occurred while attempting a cluster operation. Error: Cluster API failed: “AddClusterNode() (MaxPercentage=100) failed with 0x5b4. Error: This operation returned because the timeout period expired”. [Server: server.domain.domain.com]

  15. Jake says:

    Hi Paul, looking for recommendation for 2 servers to be used at MBX servers with 1 Edge Transport server of lesser capability. All in the same datacenter.

    The 2 mbx servers’ databases will be members of a DAG so that will provide High Availability.

    When using the role requirements calculator, I can’t seem to get away from having to rely on backups and not relying instead on the Exchange Native Data Protection? Is this not possible with only 2 servers in one location?

  16. Kevin says:

    Hi Paul,
    Broad stroke recommended plan if I may: We have a (flaky) 2 site DAG on Ex2010. Main site is the office where all staff are located, remote site is a backup data centre. Sites connected by 100Mb VLAN.
    We are moving to Ex2016 across the two sites with an Azure based Kemp load balancer.
    I’m happy enough with the risk to roll back the Ex2010 estate to a single (clustered) server at the main site, so I’m thinking to do that then deploy Ex2016 as a single (clustered) server at the main site, migrate the single server over and remove Ex2010, then expand the Ex2016 estate to a DAG across the two sites with external client access coming in via the Azure Kemp load balancer. There’s a few new things to learn along the way so I’m cautious to not bite off more than I can chew in one go! Does that sound like the “easiest” approach, or do you have other recommendations?

      • Kevin says:

        Hi Paul,
        Apologies for being so brief – The HA cluster is obviously two (or more) servers with shared storage. The Exchange server is a VM on the cluster. So from an Exchange server point of view it’s just a single server, but from a hardware point of view HA is provided by the cluster. My description was supposed to show it was a single server, running in a cluster environment.
        Hope that helps!
        KR,
        Kevin

  17. Prabodha says:

    Great Article Paul, I would like to know more about the local search instance ‘read from passive copy’ (came with Exch 2016 CU3), when you make a passive copy to be active. And if there are 5 servers in a DAG, and one is active, on the event of active copy server failed, what parameter DAG use to choose one passive copy to become active ? If you have already any article please give.

    Thanks

  18. Matt says:

    Hi Paul,

    I have added a 4th DAG member at our DR site. I have included an IP from the DR subnet in “Database availability group IP addresses” but was just wondering if I need to add an additional DNS entry for the new DAG IP address?

    thanks!

  19. Md Shaifullah Mozide Palash says:

    Dear Paul,
    Please guide me regarding DC-DR network configuration for Exchange 2016. Currently I am using 10.50.10.0 in my DC site and 10.90.10.0 in my DR site. My questions are:
    01. Can I use the same directly connected network in both sites (10.5.1.0)?
    02. Or must I use different networks (10.5.1.0 for DC and 10.6.1.0 for DR)?
    03. I need your best recommendation regarding Microsoft

    Please respond soon
