HCX for disaster recovery – Exploring VMware Cloud on AWS-Integrated Services

HCX offers limited disaster recovery (DR) functionality alongside its workload migration features, providing a basic, low-cost DR solution. It protects workloads with the same replication engine it uses for migration and transmits data over an optimized WAN. HCX can be used to protect development platforms or temporary environments where a low-cost option is sufficient, but advanced VMware Site Recovery (VSR) features, such as protection groups, recovery plans, and automated recovery workflows, are not available through VMware HCX.

Information

Although HCX DR replication and orchestration are not recommended for production DR use cases, the HCX layer 2 extension capability can be used in combination with VSR and VMware Cloud Disaster Recovery (VCDR). Take any dependencies on the source network (for example, the location of the default gateway) into account when preparing the DR plan.

VMware Site Recovery service

VMware Site Recovery (VSR) builds on the time-tested VMware Site Recovery Manager (SRM) product and is delivered as an integrated add-on service for VMware Cloud on AWS. The service simplifies traditional DR operations: it can reduce the need for a physical secondary site and can quickly scale to a full production environment. The following figure illustrates protecting an organization's data center with VMware Cloud on AWS using VSR.

Figure 3.10 – VSR architecture

VSR leverages vSphere Replication (vR) to provide native hypervisor-based replication. All infrastructure services are delivered through software, and orchestration is handled by the VSR add-on. Organizations can replicate VMs and create automated recovery plans that define the startup order and the recovery steps. The following figure shows the components involved in the replication process on the on-premises and VMware Cloud on AWS sides:

Figure 3.11 – vR architecture

Admins can then run tests to validate the environment within an isolated bubble network, which allows them to prepare fully for a disaster.
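To make the idea of a startup order in a recovery plan more concrete, here is a minimal Python sketch. It is not the VSR or SRM API; the `ProtectedVM` class, the `priority` scale (lower starts first), and the `depends_on` field are hypothetical names chosen only for illustration. The sketch orders VMs by priority while making sure that any VM a workload depends on is powered on before it.

```python
from dataclasses import dataclass, field


@dataclass
class ProtectedVM:
    # Hypothetical model of a VM in a recovery plan (not a VMware data type).
    name: str
    priority: int  # assumed scale: lower number powers on earlier
    depends_on: list[str] = field(default_factory=list)


def startup_order(vms):
    """Return VM names in power-on order: by priority, dependencies first."""
    by_name = {vm.name: vm for vm in vms}
    ordered, seen = [], set()

    def visit(vm, stack=()):
        if vm.name in seen:
            return
        if vm.name in stack:
            raise ValueError(f"dependency cycle involving {vm.name}")
        # Power on everything this VM depends on before the VM itself.
        for dep in vm.depends_on:
            visit(by_name[dep], stack + (vm.name,))
        seen.add(vm.name)
        ordered.append(vm.name)

    # Walk VMs from highest priority (lowest number) to lowest.
    for vm in sorted(vms, key=lambda v: (v.priority, v.name)):
        visit(vm)
    return ordered


if __name__ == "__main__":
    plan = [
        ProtectedVM("web", priority=3, depends_on=["app"]),
        ProtectedVM("app", priority=2, depends_on=["db"]),
        ProtectedVM("db", priority=1),
        ProtectedVM("dns", priority=1),
    ]
    print(startup_order(plan))
```

In this sketch, a dependency always wins over priority, so a misnumbered priority cannot cause an application tier to start before its database.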

VSR can protect multiple on-premises sites or another VMware Cloud on AWS SDDC, as shown in the following diagram:

Figure 3.12 – VSR protecting multiple sources

VSR uses automated recovery plans that are quick to set up and easy to maintain, and many of the steps in the recovery process are automated.

Auditors also benefit from non-disruptive tests, which let them verify a company's resilience to disasters and its ability to meet recovery time objectives (RTOs). Organizations can run their recovery plans at any time without impacting users or applications, so they can test regularly and check for configuration drift. Admins can then address the drift or fix problems in recovery plans that environmental changes may have caused:

Figure 3.13 – Configuration drift and recovery risk

This non-disruptive testing reduces recovery risk and surfaces configuration drift earlier. Once a disaster has been declared, the Site Recovery service executes its recovery process through a series of automated workflows.
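The kind of check a regular test run performs can be sketched in a few lines of Python. This is an illustrative assumption rather than how VSR implements it: the function names `find_drift` and `meets_rto`, and the idea of comparing plain lists of VM names, are hypothetical. The sketch flags VMs that exist in the environment but are missing from the recovery plan, plan entries that no longer match a real VM, and whether a test recovery finished within the RTO.

```python
def find_drift(inventory, protected):
    """Compare the live VM inventory with the VMs a recovery plan covers."""
    inventory, protected = set(inventory), set(protected)
    return {
        # VMs added to the environment but never added to the plan.
        "unprotected": sorted(inventory - protected),
        # Plan entries that point at VMs that no longer exist.
        "stale": sorted(protected - inventory),
    }


def meets_rto(test_duration_minutes, rto_minutes):
    """True if a test recovery completed within the recovery time objective."""
    return test_duration_minutes <= rto_minutes


if __name__ == "__main__":
    drift = find_drift(
        inventory=["db", "app", "web", "cache"],
        protected=["db", "app", "web", "old-report"],
    )
    print(drift)
    print(meets_rto(test_duration_minutes=45, rto_minutes=60))
```

Running a comparison like this after every environment change, rather than only during an audit, is what lets drift be caught before it turns into a failed recovery.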

Failback can be enabled with bidirectional workflows. Once an administrator initiates the failback, workloads can be migrated back to their original site through the same automated workflows.

The downside of the VMware Site Recovery (VSR) service is that it relies on a VMware Cloud on AWS SDDC running vSAN-based storage as the replication target. While vSphere Replication to vSAN-based target storage is robust and resilient, organizations need a 2-host SDDC cluster as a pilot light environment, which incurs costs for compute and storage capacity. The following section explores this challenge and presents a cost-effective Disaster Recovery as a Service (DRaaS) solution in which cloud storage and compute resources are disaggregated and paying in advance for compute resources is not required.
