If you’re running an Exchange 2016 database availability group, and one of the DAG members fails, you can recover the DAG member to restore the high availability of your Exchange mailbox databases. Provided that your DAG is healthy and configured correctly, your remaining DAG member(s) should be able to maintain service availability while you perform the recovery.
Recovering a failed DAG member makes use of the Exchange recovery installation method, which reinstalls Exchange onto a server of the same name and pulls configuration information from Active Directory. However, there are some additional steps required before and after you perform the recovery install.
For this demonstration scenario I have a two-member Exchange 2016 DAG named EX2016DAG01 with the following members:
- EX2016SRV1 has failed
- EX2016SRV2 remains healthy
Removing the Failed Member from the Database Availability Group
The failed member needs to be removed from the DAG configuration. First, remove the database copies hosted on the failed DAG member. They should have a status of ServiceDown, and you can remove them with the Remove-MailboxDatabaseCopy cmdlet.
[PS] C:\>Get-MailboxDatabaseCopyStatus -Server EX2016SRV1

Name             Status       CopyQueueLength ReplayQueueLength LastInspectedLogTime ContentIndexState
----             ------       --------------- ----------------- -------------------- -----------------
DB05\EX2016SRV1  ServiceDown  0               0                                      Unknown
DB06\EX2016SRV1  ServiceDown  0               0                                      Unknown
DB07\EX2016SRV1  ServiceDown  0               0                                      Unknown
DB08\EX2016SRV1  ServiceDown  0               0                                      Unknown

[PS] C:\>Get-MailboxDatabaseCopyStatus -Server EX2016SRV1 | Remove-MailboxDatabaseCopy -Confirm:$false
Next, remove the failed server from DAG membership using the Remove-DatabaseAvailabilityGroupServer cmdlet. The -ConfigurationOnly switch is used to make the change in Active Directory without needing to communicate with the failed server.
[PS] C:\>Remove-DatabaseAvailabilityGroupServer -Identity EX2016DAG01 -MailboxServer EX2016SRV1 -ConfigurationOnly
The failed DAG member also needs to be manually evicted from the underlying Windows Failover Cluster. Run the eviction from a surviving DAG member, since the failed server is offline.
[PS] C:\>Get-ClusterNode EX2016SRV1 | Remove-ClusterNode

Remove-ClusterNode
Are you sure you want to evict node EX2016SRV1?
[Y] Yes  [N] No  [S] Suspend  [?] Help (default is "Y"): y
Removing EdgeSync Credentials
If the AD site has an Edge Transport server subscribed, you’ll need to remove the EdgeSync credentials from the Exchange server using ADSIEdit. If you don’t complete this step, Exchange setup will fail with the following error:
The internal transport certificate for the local server was damaged or missing in Active Directory. The problem has been fixed. However, if you have existing Edge Subscriptions,
you must subscribe all Edge Transport servers again by using the New-EdgeSubscription cmdlet in the Shell.
To remove the EdgeSync credentials, open ADSIEdit and connect to the well-known naming context “Configuration”. Browse to Services -> Microsoft Exchange -> Your Org Name -> Administrative Groups -> Admin Group Name -> Servers. Right-click the object for the failed server and select Properties. Find the msExchEdgeSyncCredentials attribute in the list, and edit it to remove all entries.
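If you prefer the shell to ADSIEdit, the same attribute can be cleared with the Active Directory PowerShell module. The following is only a sketch under a few assumptions: the ActiveDirectory module (RSAT) is installed, the account has rights to modify the Configuration partition, and the server object is located by searching on its name rather than by browsing the path described above. The variable names are illustrative.

```powershell
# Sketch only: clears msExchEdgeSyncCredentials on the failed server's AD object.
# Assumes the ActiveDirectory module is available and you have write access
# to the Configuration naming context.
Import-Module ActiveDirectory

# Locate the Configuration naming context for the forest.
$configNC = (Get-ADRootDSE).configurationNamingContext

# Find the Exchange server object for the failed server by name.
$server = Get-ADObject -SearchBase "CN=Microsoft Exchange,CN=Services,$configNC" `
    -LDAPFilter '(&(objectClass=msExchExchangeServer)(cn=EX2016SRV1))' `
    -Properties msExchEdgeSyncCredentials

# Show how many credential entries are currently stored.
$server.msExchEdgeSyncCredentials.Count

# Remove all entries from the attribute (equivalent to clearing it in ADSIEdit).
Set-ADObject -Identity $server -Clear msExchEdgeSyncCredentials
```

Appending -WhatIf to the Set-ADObject line first lets you confirm that the right object was found before changing anything.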
Preparing the New Server for Exchange Recovery
Replace or rebuild the failed server with a new installation of Windows Server, using the same computer name as the failed server, and join it to the Active Directory domain. You should configure the server to match the failed server and your other DAG members in terms of networking and storage. Once you have the server ready, you can check which build of Exchange 2016 to install by running Get-ExchangeServer from a healthy Exchange server and noting the build number.
[PS] C:\>Get-ExchangeServer | Select Name,AdminDisplayVersion

Name        AdminDisplayVersion
----        -------------------
EX2013SRV1  Version 15.0 (Build 1210.3)
EX2010SRV1  Version 14.3 (Build 123.4)
EX2016SRV1  Version 15.1 (Build 396.30)
EX2016SRV2  Version 15.1 (Build 396.30)
EX2016EDGE  Version 15.1 (Build 225.42)

The failed server in this demo, EX2016SRV1, was running Exchange 2016 Cumulative Update 1, which corresponds to build 15.1.396.30 in Microsoft’s published list of Exchange Server build numbers.
Performing a Recovery Install of Exchange 2016
Open a CMD prompt, navigate to the folder where you’ve mounted the Exchange 2016 ISO or extracted the setup files, and run setup with the following parameters.
G:\>setup /m:recoverserver /iacceptexchangeserverlicenseterms
Wait for setup to complete, then restart the server.
Adding the Recovered Server as a DAG Member
After restarting the server you can add it back to the database availability group.
[PS] C:\>Add-DatabaseAvailabilityGroupServer -Identity EX2016DAG01 -MailboxServer EX2016SRV1
After adding the DAG member, you can add the mailbox database copies as well.
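As a rough sketch, the database copies can be re-added with Add-MailboxDatabaseCopy. The database names below are taken from the Get-MailboxDatabaseCopyStatus output earlier in this article; the activation preference of 2 is an assumption for this two-member demo DAG, so adjust it to suit your own design.

```powershell
# Sketch: re-add a passive copy of each database to the recovered server.
# The activation preference value is an assumption for a two-member DAG.
$databases = 'DB05', 'DB06', 'DB07', 'DB08'

foreach ($db in $databases) {
    # Adding a copy also starts seeding it from the active copy by default.
    Add-MailboxDatabaseCopy -Identity $db -MailboxServer EX2016SRV1 -ActivationPreference 2
}

# Check seeding progress and copy health on the recovered server.
Get-MailboxDatabaseCopyStatus -Server EX2016SRV1
```

Seeding large databases can take some time, so expect the copies to take a while to reach a Healthy status.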
Since every environment is different, here are a few additional steps you might need to look at as well:
- Export/import the SSL certificate from another server to the recovered server
- Verify, and if necessary re-apply, the client access namespaces on the virtual directories
- Recreate the Edge Subscription
- Reinstall antivirus, backup, and monitoring agents
- Rebalance activation preferences
- Run Exchange Analyzer
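Two of the steps above, checking the virtual directory namespaces and rebalancing activation preferences, can be sketched in the Exchange Management Shell as follows. This is an illustration rather than a complete checklist: only the OWA virtual directory is shown, and the activation preference value is an assumption about how you want your databases distributed.

```powershell
# Sketch: review the namespaces configured on the recovered server (OWA shown as one example).
Get-OwaVirtualDirectory -Server EX2016SRV1 | Select-Object Name, InternalUrl, ExternalUrl

# Review where each database is currently mounted and its activation preferences.
Get-MailboxDatabase | Sort-Object Name | Select-Object Name, Server, ActivationPreference

# Example only: make EX2016SRV1 the preferred server for DB05 again.
Set-MailboxDatabaseCopy -Identity DB05\EX2016SRV1 -ActivationPreference 1

# Move active copies back to their preferred servers using the script that ships with Exchange.
cd $exscripts
.\RedistributeActiveDatabases.ps1 -DagName EX2016DAG01 -BalanceDbsByActivationPreference -Confirm:$false
```

Only run the redistribution once the recovered copies report as healthy, since the script activates database copies on the recovered server.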