This articles addresses a particular aspect of redundancy management. This is a result from my experiments at Alstom & personal thoughts.
In a Highly Available architecture with routing needs, it is of course necessary ensure continuous routing. There are a few different solutions for that, but the one I am interested in here is router redundancy.
Virtual Router Redundancy Protocol (VRRP) is one of the most widespread router redundancy protocol across router manufacturers.
I'm not going to go thoroughly into all VRRP principles and features, but some important aspects have to be reminded:
- VRRP uses a virtual IP & virtual MAC address
- RRP elects a Master & Backup instances across all available instances, the master gets to carry the virtual IP
- VRRP uses announces between all instances to manage who gets to be master.
- Each VRRP instance has a priority (0-255)
- A pre-empt attribute can be defined to takeover the Mastership is one's priority is higher.
- Conditions are configurable, depending on the implementation on how to behave when an instance detects a pre-empt possibility. For example it can choose to wait for X seconds before pre-empting the Mastership.
During boot process, some device polarize their interfaces to perform a health check. Some manufacturers even leave these interfaces up for a while even though the device is not able to forward any data during the process. Moreover, these interface polarization may happen multiple time during boot process. This is illustrated on figures 1 to 3 below.
While it may be understandable for the equipment manufacturer, directly connected devices are not aware of the booting process and may mis-interpret it.
For example a router with VRRP enabled, in Init of Backup State will see its interface up and start announcing again its priority as if back to nominal mode.
If the device ends up getting back in Master state, then all traffic routed on this interface will be sent towards the still-booting device, which is still not ready to forward anything.
The router has absolutely no clue the traffic is lost in most cases (Ethernet + IP don't have any acknowledgement mechanisms).
Such a booting process stage may take -and I am using real life numbers here- up to one minute ! One minute of traffic is the death of most on-going TCP connections, VoIP calls (humans are not as patient as UDP) etc. I feel that a huge deal in many cases, with as worst case scenario vital function relying on those redundancy features!
It bugs me a bit, and I can't explain why there is so little ways of protecting another device from this annoying phenomenon that are actually implemented in many manufacturer software.
OK, so what theoretical ways do we have to prevent these naive VRRP role change?
There are actually a lot of possibilities, here are a few of them I have seen:
- Delay the interface operational "Up" status.
This way, the interface is a little bit less naive and says "OK, let's see if you can last more than a minute" to the adjacent booting device. In other cases (plugging a wire for example), it should not interfere with anything else.
It's brutal but it should work in any cases as long as the timer is long enough.
- Configure pre-empt conditions.
Like a timer or more complex conditions. This does work only when the VRRP announces are not using the failed device.
- Configure BFD for the interface May only work if BFD doesn't use the failed device/BFD is available on the failed device/the VRRP router and the failed device belongs to the same organization.
- Configuring a ping-server/Tracking an IP
It is a bit of an overkill, and may not always work if firewalls are present in the neighbourhood (not so uncommon, as we are at a routing point).
- Delay the VRRP instance operational "Up" status.
This is probably the cleanest way as it only affects the VRRP instance (which is the only one having a problem here) and not other functions (administration of the failed device for example). Also, it works even if the VRRP announces passes through the failed device (which to me feels like a cheap use of VRRP).
This is different from an initialization delay, as the Init state is reached way before the link is falsely restored.
- Other ways may exist, but that is all I have seen so far. If you happen to know some, feel free to contact me, I'll be glad to learn !
As a conclusion, I would like to point out that these workarounds are mostly implementation dependent, as I don't think the RFC details this type of cases.
Also, this situation only happens when the VRRP announces uses the same medium as the traffic itself, or when a pre-empt is configured.