Figuring Out Recovery Strategies

This session was an open panel conversation between Brian Hymer, Brian Desmond, and Patrick Ancipink, three acknowledged experts with a core focus on Active Directory.

The key goal of the session was to discuss Active Directory (AD) issues, especially how to recover from them. Feedback came from the panel leaders and several attendees.

You’ve Been the Victim of an AD Attack

To start the ball rolling, the panel asked the audience to “… raise your hand if your organization has been a victim of an attack recently or some time ago…” Unfortunately, but not surprisingly, several audience members raised their hands, underlining the importance of having this panel discussion!

Next, Brian Desmond shared a scenario from one of his customers:

The customer’s environment comprised 17 datacenters worldwide, each hosting an Active Directory domain controller, all belonging to a single forest. After a security breach infected several servers across different sites, the situation looked disastrous for the organization. Eventually, the team found one last domain controller that apparently showed no ransomware patterns. As a rescue mechanism, the technical teams isolated the machine, which preserved the domain credential information and allowed a successful recovery operation.

Brian emphasized that, besides the technical challenge of the situation and the recovery process, an even bigger challenge existed within the technical teams: deciding how to move forward. He shared a crucial piece of advice with the audience:

When you are under attack, find a stakeholder who can make the needed decision to pull the plug, revise, and recover.

To keep the discussion going, Brian Desmond threw another question to the audience:

Have you tested Active Directory Recovery? Reaction: why do we need to do this?

From this discussion, it seems that organizations typically test for physical failures like server or storage outages, but you should validate the full stack, including Active Directory, the application layer, and network and storage endpoints. Any component can fail, and all are susceptible to malware attacks. Some are harder to test and mitigate than just replacing failed physical components.

Backup for Recovery

The next question involved backup as a means of disaster recovery. Surprisingly, several people admitted they still rely on traditional tape drives, stored off-site in a secured location due to compliance regulations in their business industry.

The panel briefly touched on the importance of running adequate recovery testing, even when it involves tape-based backups. Beyond the technical mechanics of running the restore itself, the panel emphasized reviewing aspects such as RPO (Recovery Point Objective) and RTO (Recovery Time Objective), and how the chosen backup approach can extend the actual recovery time and potential data loss.
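To make the RPO and RTO discussion concrete, here is a minimal sketch of how the timestamps from a restore drill could be compared against both objectives. It was not shown in the session; the `RestoreDrill` fields, the `evaluate_drill` helper, and the 24-hour targets are all hypothetical and only illustrate the arithmetic.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RestoreDrill:
    """Timestamps recorded during one restore test (hypothetical fields)."""
    last_good_backup: datetime   # when the restored backup was taken
    outage_start: datetime       # when the (simulated) failure began
    service_restored: datetime   # when the service was usable again


def evaluate_drill(drill: RestoreDrill, rpo: timedelta, rto: timedelta) -> dict:
    """Compare measured data loss and downtime with the RPO and RTO targets."""
    # Everything written between the backup and the outage is lost.
    data_loss_window = drill.outage_start - drill.last_good_backup
    # Downtime runs from the outage until the service is back.
    downtime = drill.service_restored - drill.outage_start
    return {
        "data_loss_window": data_loss_window,
        "downtime": downtime,
        "rpo_met": data_loss_window <= rpo,
        "rto_met": downtime <= rto,
    }


# Example drill: a nightly backup, then a slow restore (made-up numbers).
drill = RestoreDrill(
    last_good_backup=datetime(2023, 1, 10, 2, 0),
    outage_start=datetime(2023, 1, 10, 14, 0),
    service_restored=datetime(2023, 1, 12, 8, 0),
)
result = evaluate_drill(drill, rpo=timedelta(hours=24), rto=timedelta(hours=24))
print(result["rpo_met"], result["rto_met"])
```

In this made-up drill the data loss window is 12 hours, so the RPO is met, but the restore takes 42 hours, so the RTO is missed. That gap between a workable backup and a slow recovery is the kind of finding a restore test is meant to surface.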

Input from the audience highlighted the importance of retaining recovery data and backup jobs. Often, the initial compromise or malware infection occurred days, weeks, or months before the attack becomes visible. This means that malware could already have compromised your recovery solution, making the recovery useless. The consensus across the audience and the panel was to keep a retention period of at least 90 days (preferably more) and to conduct yearly backups.
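The retention argument can be sketched in a few lines: a backup only helps if it is still retained and was taken before the compromise began. The `usable_backups` function, the weekly schedule, and all dates below are hypothetical, and a backup that predates the suspected compromise is still only a candidate that needs verification before it is trusted.

```python
from datetime import date, timedelta


def usable_backups(backup_dates, suspected_compromise, today, retention_days=90):
    """Return retained backups taken before the suspected compromise date."""
    oldest_retained = today - timedelta(days=retention_days)
    # Backups older than the retention window have already been purged.
    retained = [d for d in backup_dates if d >= oldest_retained]
    # Backups taken on or after the compromise may already contain malware.
    return sorted(d for d in retained if d < suspected_compromise)


# Hypothetical weekly backups from 2022-12-04 through 2023-02-26.
backups = [date(2022, 12, 4) + timedelta(weeks=i) for i in range(13)]
today = date(2023, 3, 1)
compromise = date(2023, 1, 15)  # attacker got in weeks before detection

short = usable_backups(backups, compromise, today, retention_days=30)
long = usable_backups(backups, compromise, today, retention_days=90)
print(len(short), len(long))
```

With 30 days of retention, every surviving backup postdates the compromise, so nothing clean is left to restore. With 90 days, six candidates remain, the most recent from 2023-01-08.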

From here, the panel presented a few interactive questions on screen, asking for input from the audience. The opening question was how often a DR plan gets tested. Out of roughly 60 responses, “never” and “once a year” tied at 27 each (about 45%). Only 7 responses reported running tests once per quarter (Figure 1).

Testing Disaster Recovery

Heard at TEC: Hacked and Afraid – dramatic tales from AD disaster recovery scenarios
Figure 1: How Often Do You Test Your DR Plan?

Based on these numbers, someone in the audience asked: what is the best practice for performing a DR drill?

The panel’s answer, as well as feedback from the attendees, was to focus on creating and continuously updating Response Plans, working out details of playbooks, and covering different aspects of recovery procedures and operations.

Brian picked up this topic to add that it is also important to outline who will perform the recovery operations. Is it the internal IT team, or do you rely on an external support team? Is this already part of the Response Plan, or is it ad hoc (much like a hotline or 911, which you only call during an emergency)? It is also crucial to map out the partnership between your internal teams and the support partner, including whether you rely on the external partner to detect, mitigate, and fix the cyberattack.

The next question touched on how long recovery took after a cyberattack. Surprisingly, 28 of 42 respondents needed 2-5 weeks to recover. Another 12 (about 29%) needed several months, and a minority admitted they never really recovered from a cyberattack (Figure 2).

Recovering from a Cyberattack

Figure 2: How Long to Recover from a Cyberattack?

These numbers are based on a quick poll of the room. The important takeaway from this question is that several weeks of downtime usually occur before an attacked organization can be back up and running.

Brian emphasized this point, noting that he had several customers who literally had to close their business because everything was down.

Privileged Access Workstation

Providing secured access to your applications and data centers often brings up the conversation of a Privileged Access Workstation (PAW). This solution received only a mild welcome because it assumes that the PAW is correctly set up (isolated from the production network, monitoring in place, and continuous anti-malware scanning). The audience saw a risk in using PAWs: if an attacker compromises the workstation, the attacker gains access to a VM, which could allow a compromise of the entire server environment.

One comment emphasized that in many cyberattacks, even if it is not visible at first, Active Directory is the target, NOT the application or database layer. Since identity is the key to the kingdom, it’s one of the most popular targets in any attack.

Question: Any predictions for 2023 around cyber-attacks?

This last question from the audience to the panel asked for their view and assumptions for 2023. The easy answer is: More of the same. Much more!!

One noticeable trend was that organizations rely more heavily on buying tools to resist attacks. The risk here is that there is blind trust in the tool, without digging into the specifics or configuration options. This could lead to a false impression of safety and trust while still leaving the door open for cyberattacks.

Question: How do you plan on using Azure AD in the next 24-36 months?

To make the audience aware of the additional challenges of extending on-premises data centers to the cloud, specifically the hybrid aspect of identity with Azure Active Directory, the panel asked about the audience’s hybrid identity roadmap. Hybrid was the way forward for 70% of the audience. About 20% expect to move to a 100% cloud architecture, and 10% will stay on-premises for the near future. The decision is influenced by business verticals and by compliance regulations in specific industries (Figure 3).

Figure 3: Adoption of Azure Active Directory in the next 24-36 Months

Summary

This panel was a very interactive approach to learning more about (Active Directory) recovery challenges and how to prepare for them.

Both Brians and Patrick shared a lot of great guidance from their own experiences, keeping the discussion going with a few baseline questions. The audience responded with a few questions of their own.

While the numbers in the charts are a good representation of the answers, keep in mind that they only capture the views of the attendees who participated in the polls, not everyone in the room. That said, they show some trends around the state of disaster recovery at organizations, how painful and time-intensive a recovery operation can be, and how long an organization could be out of business.

Last, I also learned that there is huge interest in hybrid cloud from an identity perspective, and that about 20% of the respondents plan to use Azure Active Directory as their only identity platform.

About the Author

Peter De Tender

Peter looks back on a 25+ year career as an IT expert, with a background in Microsoft datacenter technologies. In early 2012, Peter started shifting to cloud technologies (Office 365, Intune), and quickly jumped onto the Azure platform, working as a cloud solution architect and trainer. In mid-2019, Peter took on a position as Azure Technical Trainer within Microsoft Corp, providing Azure Readiness Workshops to larger customers and partners within the EMEA region and globally, with a focus on Azure DevOps, Apps and Infra, and SAP workloads. At the beginning of 2022, Peter relocated to Redmond, WA to continue this role out of the West US team. Peter was an Azure MVP for 5 years, has been a Microsoft Certified Trainer (MCT) for 12+ years, and is still actively involved in the community as a public speaker, technical writer, book author, and publisher.