Tuesday 29 January 2013

Real World Fault Tolerance Use Cases

I've always been a big fan of VMware's Fault Tolerance feature but could never really find a valid use case to actually implement it.  Fault Tolerance (FT) has some very strict requirements / limitations:

  • VM is limited to one vCPU
  • VM disks need to be thick-eager zeroed
  • VM must be running a supported OS
  • Snapshots are not supported on the VM (think about how you will back up the VM)
  • VM Hot add memory or CPU cannot be utilised
  • DRS cannot be utilised
  • Physical processors need to support FT
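
The VM-level items in the checklist above can be sketched as a quick pre-check before enabling FT.  This is only a minimal illustration: `VmProfile`, its fields and `ft_blockers` are hypothetical names made up for this post, not part of any VMware API, and you would have to fill in the values yourself from your own inventory.  The DRS item is left out because it is not something you can read off a single VM.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VmProfile:
    """Hypothetical summary of a VM; not a vSphere API object."""
    name: str
    vcpus: int
    disks_eager_zeroed: bool    # all disks are thick-eager zeroed
    guest_os_supported: bool    # guest OS is on the FT supported list
    snapshot_count: int
    hot_add_enabled: bool       # memory or CPU hot add is switched on
    host_cpu_supports_ft: bool  # physical processors support FT


def ft_blockers(vm: VmProfile) -> List[str]:
    """Return the checklist items this VM fails (empty list if none)."""
    blockers = []
    if vm.vcpus != 1:
        blockers.append("VM must have exactly one vCPU")
    if not vm.disks_eager_zeroed:
        blockers.append("disks must be thick-eager zeroed")
    if not vm.guest_os_supported:
        blockers.append("guest OS is not on the supported list")
    if vm.snapshot_count > 0:
        blockers.append("VM has snapshots")
    if vm.hot_add_enabled:
        blockers.append("hot add of memory or CPU is enabled")
    if not vm.host_cpu_supports_ft:
        blockers.append("physical processors do not support FT")
    return blockers


# Example values for an appliance whose OS is not on the supported list.
appliance = VmProfile(
    name="appliance-01",
    vcpus=1,
    disks_eager_zeroed=True,
    guest_os_supported=False,
    snapshot_count=0,
    hot_add_enabled=False,
    host_cpu_supports_ft=True,
)
print(ft_blockers(appliance))
```

A check like this only reports what the checklist says; it cannot tell you whether an unsupported configuration will happen to work anyway.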

It is also recommended that you have a dedicated, redundant FT network with at least 1Gb pNICs for the FT traffic.  In our current infrastructure we are limited in the number of physical NICs we can present to the ESXi hosts because of the blade chassis we use.  We were unable to dedicate 2 NICs to FT, so we never really used it.

We recently had a requirement to move our DMZ to our ISP and reused some decommissioned Dell R610 hosts for it.  We bumped up the memory and pNIC count, so we could now comfortably allocate 2 pNICs per host for FT.  We use a Citrix Access Gateway (CAG) VPX appliance for remote access to both published apps and desktops.  Although this is protected with VMware HA, which restarts the VM on another host after a failure, FT offered more protection in the event of a complete host outage because the secondary VM takes over without a restart.  The only issue I had was that the CAG was not on the list of supported OSes, but it seems to work fine.

We configured the hosts with the required FT network and enabled FT on the VM while it was powered off.  15 minutes later the VM was fully protected with FT.
