HCX for hybrid network extension – Exploring VMware Cloud on AWS-Integrated Services

One of the many unique functionalities of VMware HCX is its Layer 2 extension. We have already provided a quick summary of its features. In this section, we will cover architecture and design decisions that will help cloud architects to create a network design for hybrid deployments.

From a high-level network perspective, a Layer 2 extension allows us to extend the Layer 2 broadcast domain (ARP and reverse ARP flows) between two disjoint network infrastructures while retaining the same IP addresses for the application workloads. Without this feature, communication could only be established through IP routing, which would require reassigning IP addresses.

Retaining IP addresses during migration drastically reduces downtime, the preparation window, and overall risk; however, it also poses several challenges:

  • Traffic flow: To facilitate proper ARP communication, all broadcast network traffic must be captured and retransmitted between the two segments. This retransmission happens over the WAN connection (either over the internet or AWS Direct Connect (DX)) and might raise concerns for the network security team.
  • Bandwidth: Traffic needs to be intercepted and retransmitted. This is accomplished by Network Extension (NE) appliances, deployed in pairs. Each pair establishes its own tunnel to carry the traffic, with a maximum throughput of 1.5 GB/s per extended VLAN.
  • Availability: By default, the extension is handled by a single appliance pair. If one of the appliances fails, VMs residing on the extended segment are not able to communicate with the rest of the network.
  • Routing: Each VLAN has only a single default gateway. This gateway normally resides on the source side of the network extension. For the target (extended) environment, all network traffic sent to a destination outside of the segment must traverse back to the source side through the HCX NE tunnel.
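The routing challenge above can be illustrated with a minimal Python sketch. It is not an HCX API: the function name, the site labels ("source" and "target"), and the sample addresses are hypothetical, and the model assumes the default behavior described above, where the only default gateway sits on the source side.

```python
import ipaddress


def routed_via_ne_tunnel(vm_site: str, dst_ip: str, segment_cidr: str,
                         gateway_site: str = "source") -> bool:
    """Return True if routed traffic from a VM must cross the NE tunnel.

    Hypothetical model: the extended segment has a single default
    gateway located at `gateway_site`.
    """
    segment = ipaddress.ip_network(segment_cidr)
    if ipaddress.ip_address(dst_ip) in segment:
        # Same broadcast domain: delivered at Layer 2, no gateway involved.
        # (It may still cross the tunnel if the peer VM is on the other side.)
        return False
    # Off-segment traffic must reach the gateway; if the VM is not on the
    # gateway's side, the packet first travels back through the tunnel.
    return vm_site != gateway_site


# A migrated VM on 10.1.0.0/24 talking to another subnet hairpins via the source.
print(routed_via_ne_tunnel("target", "10.2.0.5", "10.1.0.0/24"))
```

A VM that is still on the source side reaches the gateway locally, so the same call with "source" as the first argument returns False.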

However, the version of HCX available with VMware Cloud on AWS mitigates some of the aforementioned challenges:

  • Traffic flow: HCX supports encryption of the Layer 2 extension traffic via an IPSec VPN. The VPN tunnel is established between HCX appliances and is independent of the underlying connectivity between the on-premises site and the VMware Cloud on AWS SDDC.
  • Bandwidth: Although the throughput of each tunnel is limited, you can increase overall performance by deploying multiple NE pairs and distributing the extended VLANs among them.
  • Availability: With the recent addition of the NE high availability feature, the HCX version deployed with VMware Cloud on AWS now supports the creation of an NE group consisting of four appliances. Inside each group, two pairs are established: one is active and transmits the traffic, and the other is on standby.
  • Routing: With the recently introduced Mobility Optimized Networking (MON) feature, HCX now supports optimized traffic flow in the target SDDC. VMs residing on the extended segment can communicate with other segments within the same SDDC or reach the AWS S3 endpoint without traversing back to the source side.
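As a planning aid for the bandwidth point above, the following Python sketch shows one possible way to spread extended VLANs across several NE pairs. It is only an illustration under stated assumptions: the function name, the VLAN IDs, and the demand figures are hypothetical, the demand values are in arbitrary but consistent units, and the greedy "largest demand to least-loaded pair" heuristic is our choice, not something HCX performs for you.

```python
import heapq


def distribute_vlans(vlan_demand: dict[int, float],
                     pair_count: int) -> dict[int, list[int]]:
    """Assign each VLAN to an NE pair, balancing the expected load.

    vlan_demand maps a VLAN ID to its expected throughput (hypothetical
    planning figures). Returns a mapping of pair index -> list of VLAN IDs.
    """
    if pair_count < 1:
        raise ValueError("at least one NE pair is required")

    # Min-heap of (current load, pair index): the least-loaded pair pops first.
    heap = [(0.0, pair) for pair in range(pair_count)]
    heapq.heapify(heap)
    assignment: dict[int, list[int]] = {pair: [] for pair in range(pair_count)}

    # Place the most demanding VLANs first so they end up on different pairs.
    ordered = sorted(vlan_demand.items(), key=lambda kv: kv[1], reverse=True)
    for vlan, demand in ordered:
        load, pair = heapq.heappop(heap)
        assignment[pair].append(vlan)
        heapq.heappush(heap, (load + demand, pair))
    return assignment


# Hypothetical example: four extended VLANs spread over two NE pairs.
plan = distribute_vlans({100: 0.9, 200: 0.6, 300: 0.5, 400: 0.2}, pair_count=2)
print(plan)
```

In this example the two busiest VLANs land on different pairs, and both pairs end up with a similar total expected load.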

Traditionally, the HCX network extension is used only during migration, with a switchover to a routed segment once the migration completes. However, with this new capability, you can choose to keep running your workloads on an extended segment indefinitely.

It is important to emphasize that HCX does not include a Layer 2 loop prevention protocol, such as the Spanning Tree Protocol (STP). Therefore, following a best-practice architecture is critical. The following unsupported topologies can create a Layer 2 broadcast storm.
