FT, or Fault Tolerance, provides continuous availability for protected virtual machines in the event of a VM or hardware failure. It ensures zero downtime and prevents data loss, and this is where FT differs from HA, which recovers from a failure by restarting the VM on another host. FT achieves this by creating a live shadow instance of a VM, running on another host and referred to as the Secondary VM, that is always up to date with the original VM, referred to as the Primary VM. FT works as follows:
Enabling FT is highly recommended for any mission-critical VMs whose downtime would cause an SLA breach or revenue loss. FT works by maintaining a secondary VM running in virtual lockstep with the primary VM.
VMware's virtual lockstep records all inputs and events that occur on the primary VM and replicates them to the secondary VM. Because every event on the primary is replicated to the secondary, the secondary is always in a state where it can take over from the primary.
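The record-and-replicate idea can be illustrated with a toy model. The sketch below is not VMware's implementation. `ToyVM` and `LockstepPair` are hypothetical names, and the "VM" is just a deterministic state machine. It only shows why applying the same events in the same order keeps two instances identical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVM:
    """Hypothetical deterministic machine: state depends only on applied events."""
    state: int = 0
    applied: int = 0  # number of events applied so far

    def apply(self, event: int) -> None:
        # Any deterministic update would do; this one is arbitrary.
        self.state = (self.state * 31 + event) % 1_000_003
        self.applied += 1


@dataclass
class LockstepPair:
    """Hypothetical pair that feeds every recorded event to both machines."""
    primary: ToyVM = field(default_factory=ToyVM)
    secondary: ToyVM = field(default_factory=ToyVM)
    log: list = field(default_factory=list)

    def handle_input(self, event: int) -> None:
        self.log.append(event)        # record the event seen by the primary
        self.primary.apply(event)     # the primary executes it
        self.secondary.apply(event)   # the secondary receives the same event

    def in_sync(self) -> bool:
        return self.primary.state == self.secondary.state


lockstep = LockstepPair()
for ev in [7, 42, 3, 19]:
    lockstep.handle_input(ev)
print(lockstep.in_sync())
```

Because both machines see an identical event sequence, the secondary's state matches the primary's after every input, which is what lets it take over without losing data.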
The primary and secondary continuously exchange heartbeats, so each monitors the status of the other. If the primary fails, the secondary takes over as the new primary, and another secondary VM is started immediately to restore FT protection.
If the secondary VM fails, it is likewise replaced with a new secondary immediately, avoiding any interruption to the user experience and any data loss. The primary and secondary VMs are not allowed to run on the same host, so that a single host failure cannot take down both virtual machines.
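The failover and placement rules described above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not a vSphere API: `FTPair`, `pick_host`, `handle_missed_heartbeat`, and the host names are all hypothetical, and a "missed heartbeat" is modeled simply as a host being reported failed.

```python
from dataclasses import dataclass


@dataclass
class FTPair:
    """Hypothetical record of where the primary and secondary VMs run."""
    primary_host: str
    secondary_host: str


def pick_host(hosts, healthy, exclude):
    """Return the first healthy host not in `exclude` (the anti-affinity rule)."""
    for host in hosts:
        if host in healthy and host not in exclude:
            return host
    raise RuntimeError("no eligible host for a new secondary")


def handle_missed_heartbeat(pair, failed_host, hosts, healthy):
    """Return the FT pair after `failed_host` stops sending heartbeats."""
    if failed_host == pair.primary_host:
        # The secondary is promoted, then a new secondary is placed elsewhere.
        new_primary = pair.secondary_host
        new_secondary = pick_host(hosts, healthy, exclude={new_primary})
        return FTPair(new_primary, new_secondary)
    if failed_host == pair.secondary_host:
        # The primary keeps running; only the secondary is replaced.
        new_secondary = pick_host(hosts, healthy, exclude={pair.primary_host})
        return FTPair(pair.primary_host, new_secondary)
    return pair  # the failure does not affect this pair


hosts = ["esx-a", "esx-b", "esx-c"]  # hypothetical host names
ft_pair = FTPair(primary_host="esx-a", secondary_host="esx-b")

after_primary_loss = handle_missed_heartbeat(
    ft_pair, "esx-a", hosts, healthy={"esx-b", "esx-c"})
after_secondary_loss = handle_missed_heartbeat(
    ft_pair, "esx-b", hosts, healthy={"esx-a", "esx-c"})
print(after_primary_loss)
print(after_secondary_loss)
```

In both cases the replacement secondary lands on a host other than the one running the primary, so the pair never shares a host.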