Redundant Routing Protocols

HSRP, GLBP, and VRRP

 

HSRP (Hot Standby Router Protocol) - RFC 2281 - mostly used with Cisco routers - a well-known protocol that is built into the router software and enabled via specific config commands.  A single HSRP group does not support load sharing, because only the active router forwards traffic (the RFC only notes that load splitting is possible by distributing hosts among multiple standby groups).  Cisco has since documented this approach as "MHSRP" (Multigroup HSRP) so that HSRP can be used for load sharing.  

***  also see

Using HSRP for Fault-Tolerant Routing (Cisco)   Load Sharing with HSRP  

 

GLBP (Gateway Load Balancing Protocol) - no RFC (Cisco proprietary) - used with Cisco routers - this is an enhancement over HSRP, in that it offers load sharing by default.  You can configure GLBP in such a way that traffic from LAN clients is shared by multiple routers, thereby spreading the traffic load more equitably among the available routers.  GLBP supports up to 1024 virtual routers (GLBP groups) on each physical interface of a router, and up to 4 virtual forwarders per group.

*** also see  Cisco GLBP 

Cisco's Data Sheet on GLBP   High Availability in Campus Networks with GLBP  

VRRP (Virtual Router Redundancy Protocol) - RFC 3768 - this is typically used with non-Cisco routers (such as Juniper), although the Cisco 3000 uses it - and it is similar to HSRP.

NOTE on Load Sharing   -   GLBP vs HSRP/VRRP - GLBP performs a similar, but not identical, function for the user as HSRP and VRRP.   Both HSRP and VRRP allow multiple routers to participate in a virtual router group configured with a virtual IP address. One member is elected to be the active router to forward packets sent to the virtual IP address for the group. The other routers in the group are redundant until the active router fails. With standard HSRP and VRRP, these standby routers pass no traffic in normal operation - which is wasteful.  Therefore the concept came about of using multiple virtual router groups, all configured on the same set of routers, with a different router active for each group.  But to share the load, the hosts must be configured for different default gateways, which adds the administrative burden of configuring every host and creating 2 or more groups of hosts that each use a different default gateway. 

GLBP is similar in that it provides load balancing over multiple routers (gateways) - but it can do this using only ONE virtual IP address !!!  Underneath that one virtual IP address are multiple virtual MAC addresses, and this is how the load is balanced between the routers.  Instead of the hassle of splitting the hosts between different static default gateways, you can let them use ARP to find out which router to use. Multiple gateways in a "GLBP redundancy group" respond to client Address Resolution Protocol (ARP) requests in a shared and ordered fashion, each with its own unique virtual MAC address. As such, workstation traffic is divided across all possible gateways.  Each host is configured with the same virtual IP address, and all routers in the virtual router group participate in forwarding packets.
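
The ARP-based sharing described above can be pictured with a small Python sketch. It is only a toy model: it assumes round-robin hand-out of virtual MACs (GLBP's default load-balancing method), it models one responder answering on behalf of the group, and the class name, the host names and the "vMAC-..." labels are all made up for illustration rather than real GLBP values.

```python
from itertools import cycle


class GlbpArpResponder:
    """Toy model of the gateway that answers ARP requests for a GLBP group."""

    def __init__(self, virtual_ip, forwarder_macs):
        self.virtual_ip = virtual_ip
        # Round-robin iterator over the virtual MACs, one per forwarding router.
        self._next_mac = cycle(forwarder_macs)

    def arp_reply(self, requested_ip):
        """Return the virtual MAC given to the next host that ARPs for the gateway."""
        if requested_ip != self.virtual_ip:
            return None  # not the group's address, so no answer
        return next(self._next_mac)


# Hypothetical labels standing in for the per-router virtual MAC addresses.
avg = GlbpArpResponder("10.0.0.1", ["vMAC-RouterA", "vMAC-RouterB"])
hosts = ["host1", "host2", "host3", "host4"]

# Every host asks for the SAME gateway IP but learns a different MAC in turn.
assignment = {host: avg.arp_reply("10.0.0.1") for host in hosts}
for host, mac in assignment.items():
    print(host, "->", mac)
```

All four hosts keep the same default gateway address, yet half of them end up sending frames to RouterA and half to RouterB.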

 

Cisco's HSRP Detailed

Cisco’s “Hot Standby Router Protocol” – RFC2281

(VRRP is for non-Cisco routers and is very similar to HSRP)

From RFC2281:  Using HSRP, a set of routers work in concert to present the illusion of a single virtual router to the hosts on the LAN.  This set is known as an HSRP group or a standby group.  A single router elected from the group is responsible for forwarding the packets that hosts send to the virtual router.  This router is known as the active router.  Another router is elected as the standby router.  In the event that the active router fails, the standby assumes the packet forwarding duties of the active router.  Although an arbitrary number of routers may run HSRP, only the active router forwards the packets sent to the virtual router.

 

Two "Real Routers" become one "Virtual Router" with HSRP

 

HSRP is a Cisco proprietary first-hop redundancy scheme, which has one active router and one or more standby routers (usually just one standby), all on the same LAN segment - and together they form one virtual router.  HSRP can protect just the LAN-side gateway, or it can also cover WAN access, where the routers each have their own access circuit into the IP cloud. For cost reasons, the active router's circuit is often provisioned at a higher bandwidth than the standby's, but they can be the same data rate.  HSRP does not inherently support load sharing, but there are workarounds to configure it to work that way. 

 

HSRP Group (also called "Standby Group") - HSRP is designed so that two or more routers can be grouped together as a single "Virtual Router", by sharing a single virtual IP address and a single virtual MAC address

 

HSRP isn't a routing protocol !!!  It's simply a way for routers on the same multi-access network to present one or more reliable (due to multiple routers and paths) virtual IP addresses.  HSRP has the benefit that it keeps host configuration simple: a commonly used static default gateway is all that's required. It also reacts to failures in a matter of seconds.

 

Overview of How it Works

 

The routers share the same IP and MAC addresses, therefore in the event of failure of one router, the hosts on the LAN are able to continue forwarding packets to a consistent IP and MAC address. The process of transferring the routing responsibilities from one device to another is transparent to the user.

The Hot Standby Router Protocol, HSRP, provides a mechanism which is designed to support non-disruptive failover of IP traffic in certain circumstances. In particular, the protocol protects against the failure of the first hop router when the source host cannot learn the IP address of the first hop router dynamically. 

 

The protocol is designed for use over multi-access, multicast or broadcast capable LANs (e.g., Ethernet). HSRP is not intended as a replacement for existing dynamic router discovery mechanisms and those protocols should be used instead whenever possible. A large class of legacy host implementations that do not support dynamic discovery are capable of configuring a default router. HSRP provides failover services to those hosts.

 

Active vs Standby Routers - using HSRP, a set of routers work in concert to present the illusion of a single virtual router to the hosts on the LAN. This set is known as an HSRP group or a standby group. A single router elected from the group is responsible for forwarding the packets that hosts send to the virtual router. This router is known as the active router. Another router is elected as the standby router. In the event that the active router fails, the standby assumes the packet forwarding duties of the active router. Although an arbitrary number of routers may run HSRP, only the active router forwards the packets sent to the virtual router.

 

To minimize network traffic, only the active and the standby routers send periodic HSRP messages once the protocol has completed the election process. If the active router fails, the standby router takes over as the active router. If the standby router fails or becomes the active router, another router is elected as the standby router.
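
As a rough illustration of the active/standby roles, here is a minimal Python sketch. It assumes routers are ranked by configured priority, with the higher interface IP address breaking ties as RFC 2281 describes. It ignores preemption and the real HSRP state machine, and the router names, priorities and addresses are hypothetical.

```python
import ipaddress


def elect(routers):
    """Return (active, standby) names from (name, priority, ip) tuples.

    Simplified ranking: highest priority first, higher IP address wins a tie.
    """
    ranked = sorted(
        routers,
        key=lambda r: (r[1], int(ipaddress.ip_address(r[2]))),
        reverse=True,
    )
    active = ranked[0][0] if ranked else None
    standby = ranked[1][0] if len(ranked) > 1 else None
    return active, standby


# Hypothetical three-router standby group.
routers = [
    ("R1", 120, "10.0.0.251"),
    ("R2", 100, "10.0.0.252"),
    ("R3", 100, "10.0.0.253"),
]

print(elect(routers))      # ('R1', 'R3')
# R1 fails: the old standby becomes active and a new standby is elected.
print(elect(routers[1:]))  # ('R3', 'R2')
```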

 

On a particular LAN, multiple hot standby groups may coexist and overlap. Each standby group emulates a single virtual router. For each standby group, a single well-known MAC address is allocated to the group, as well as an IP address. The IP address SHOULD belong to the primary subnet in use on the LAN, but MUST differ from the addresses allocated as interface addresses on all routers and hosts on the LAN, including virtual IP addresses assigned to other HSRP groups.
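
The two addressing rules above are easy to check mechanically. The following Python sketch is only an illustration: the function name is made up, and the /24 subnet and the sample addresses are assumed example values.

```python
import ipaddress


def check_virtual_ip(virtual_ip, subnet, addresses_in_use):
    """Return a list of problems with a proposed standby-group virtual IP."""
    vip = ipaddress.ip_address(virtual_ip)
    problems = []
    if vip not in ipaddress.ip_network(subnet):
        problems.append("SHOULD be in the primary subnet")
    if vip in {ipaddress.ip_address(a) for a in addresses_in_use}:
        problems.append("MUST differ from addresses already in use")
    return problems


# Assumed example: two router interfaces already hold .253 and .254.
in_use = ["10.0.0.253", "10.0.0.254"]

print(check_virtual_ip("10.0.0.1", "10.0.0.0/24", in_use))    # no problems
print(check_virtual_ip("10.0.0.254", "10.0.0.0/24", in_use))  # clashes with a router
```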

 

Although HSRP itself does not support load sharing (see GLBP for that) - if multiple groups are used on a single LAN, load splitting can still be achieved by distributing hosts among different standby groups.
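
Here is a small Python sketch of that load-splitting idea, assuming two standby groups on the same pair of routers, with each router active for one group and standby for the other. The virtual addresses, host names and helper functions are hypothetical, and failover is reduced to "use the standby if the active is down".

```python
# Two standby groups on the same LAN (hypothetical addresses and names).
groups = {
    "10.0.0.1": {"active": "RouterA", "standby": "RouterB"},
    "10.0.0.2": {"active": "RouterB", "standby": "RouterA"},
}


def assign_gateways(hosts, virtual_ips):
    """Alternate hosts across the available virtual gateway addresses."""
    return {host: virtual_ips[i % len(virtual_ips)] for i, host in enumerate(hosts)}


def forwarding_router(gateway_ip, failed=()):
    """Router that forwards traffic sent to a virtual IP, given failed routers."""
    group = groups[gateway_ip]
    if group["active"] not in failed:
        return group["active"]
    if group["standby"] not in failed:
        return group["standby"]
    return None  # both routers are down


hosts = ["host1", "host2", "host3", "host4"]
gateways = assign_gateways(hosts, sorted(groups))

for host, gw in gateways.items():
    normal = forwarding_router(gw)
    after_failure = forwarding_router(gw, failed={"RouterA"})
    print(host, gw, "normal:", normal, "RouterA down:", after_failure)
```

In normal operation the hosts are split evenly between the two routers. If RouterA fails, all four hosts are served by RouterB without any change to their configuration.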

 

Shared IP and MAC Addresses - besides sharing an IP address, the routers share a common MAC address that goes with that IP address.  For example, say you have a workgroup of 100 computers.  Each of these machines has been configured with a default gateway, and any machine that has used the default gateway has its MAC address in its ARP cache.  Since the routers in the HSRP group share the same virtual IP address with a corresponding virtual MAC address, the workstations are unaware of the change when the routers fail over.  What they see is a "virtual" router.  
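
For reference, the well-known virtual MAC address of an HSRP (version 1) group on Ethernet is 00-00-0C-07-AC-xx, where xx is the group number in hex. A minimal Python sketch of that mapping (the function name is just for illustration):

```python
def hsrp_v1_virtual_mac(group):
    """Virtual MAC for an HSRP version 1 standby group (group 0-255)."""
    if not 0 <= group <= 255:
        raise ValueError("HSRP version 1 group numbers are 0-255")
    return "00:00:0c:07:ac:%02x" % group


print(hsrp_v1_virtual_mac(1))   # 00:00:0c:07:ac:01
print(hsrp_v1_virtual_mac(10))  # 00:00:0c:07:ac:0a
```

Because this address depends only on the group number and not on which router is currently active, the entry in a host's ARP cache stays valid across a failover.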

 

ARP Issues - if for some reason a host loses its ARP entry for the static default gateway and sends out an ARP request, the response from the router must point the host at the virtual router.  It is important that the routers answer any ARP requests from the hosts for the virtual IP address with the group's "virtual" MAC address - not the MAC address of their actual interface.  If a host instead ends up sending its traffic to a router's own interface address, it will still be able to communicate through that address - but if that router goes down then the host loses connectivity.

 

HSRP messages between Routers (UDP) - there are 3 types:  Keep-Alives (Hello), Resign, and Coup - the routers in an HSRP group send and receive keep-alives using the multicast address 224.0.0.2 and UDP port 1985.  By default the hello interval is 3 seconds and the hold time is 10 seconds.  Once the hold time passes without a hello from the active router (a little over 3 hello intervals), the standby router automatically becomes the active router.  Each router is configured with a priority number; the router with the highest priority number in a standby group is the active router, and everyone else just relaxes.  If the active router must
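
To make the message format a little more concrete, here is a Python sketch that packs a Hello the way RFC 2281 lays out the 20-byte HSRP (version 0) message: version, op code, state, hello time, hold time, priority, group, a reserved byte, 8 bytes of authentication data and the 4-byte virtual IP address. The defaults shown (3 second hello, 10 second hold, "cisco" as authentication data) are the RFC defaults. Priority 110, group 1 and the virtual IP are assumed example values, and the sketch only builds the bytes, it does not send anything.

```python
import socket
import struct

HSRP_MULTICAST = "224.0.0.2"  # all-routers multicast address
HSRP_UDP_PORT = 1985

OP_HELLO, OP_COUP, OP_RESIGN = 0, 1, 2
STATE_ACTIVE = 16


def build_hsrp_message(opcode, state, priority, group, virtual_ip,
                       hellotime=3, holdtime=10, auth=b"cisco"):
    """Pack a 20-byte HSRP version 0 message."""
    return struct.pack(
        "!8B8s4s",
        0,            # version
        opcode,
        state,
        hellotime,    # seconds
        holdtime,     # seconds
        priority,
        group,
        0,            # reserved
        auth,         # padded to 8 bytes with NULs by struct
        socket.inet_aton(virtual_ip),
    )


hello = build_hsrp_message(OP_HELLO, STATE_ACTIVE, priority=110, group=1,
                           virtual_ip="10.0.0.1")
print(len(hello), "bytes for", (HSRP_MULTICAST, HSRP_UDP_PORT))
print(hello.hex())
```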

 

 

Example HSRP Networks

 

Standard Dual-Router, Dual Internet Access Circuit HSRP

 

In this case, the customer has purchased a primary T3, and a backup T1 to the Internet - each connected to a different router.  The idea is that in the event of a failure, at least a minimal amount of traffic will still flow so that critical functions can continue, albeit at a much degraded pace.  The customer needs to configure each router with HSRP so that RouterA is the "Active" router and RouterB is the "Standby" router.  The "perimeter network" (the LAN segment) interface of Border RouterA is configured with IP address 10.0.0.253, and Border RouterB is given 10.0.0.254.  These are the actual IP addresses assigned to the Ethernet interfaces of each router.  Both routers also share a "virtual" IP address, 10.0.0.1.

 

HSRP on RouterA (the primary, "active" router) is configured so that it normally also holds the shared virtual interface address (10.0.0.1) on its perimeter network interface. HSRP on Border RouterB is configured to monitor the health of Border RouterA. Internet traffic from the host follows the static default route toward 10.0.0.1 to Border RouterA and exits on the T3 when both border routers are operating.

HSRP with Two Border Routers in Normal Operation

 

 

 

 

But suppose Border RouterA fails as shown in the next diagram:

HSRP with Failed Primary Border Router

 

Within seconds of Border RouterA's failure, Border RouterB's perimeter network interface takes over the shared virtual interface address (10.0.0.1). The static default route in the host now points to Border RouterB with no work on the host's part. Its Internet traffic now exits on the T1 via Border RouterB.

 

 

Now suppose that the T3 fails but Border RouterA continues to operate. We want Border RouterB to take over the shared virtual address even though Border RouterA is still functioning. This case is handled by configuring Border RouterA to "give up" the address whenever it loses carrier detect on the T3.

 

HSRP with Failed Primary Internet Connection

 

This behavior is implemented with a priority system. Border RouterA is configured to lower its priority whenever carrier detect is lost on the T3. Border RouterB, which must be allowed to preempt, seizes control of the shared virtual interface address whenever it notices that its priority is now the highest in the group of routers sharing the address.
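
The priority mechanism can be sketched in a few lines of Python. This is a toy model, not router code: the priorities (110 and 100), the decrement of 20 and the class and function names are all assumptions chosen for illustration, and the takeover rule is reduced to "a router that may preempt takes over when its priority is higher".

```python
class HsrpRouter:
    """Toy HSRP router whose priority drops while a tracked link is down."""

    def __init__(self, name, priority, track_decrement=0, preempt=False):
        self.name = name
        self.base_priority = priority
        self.track_decrement = track_decrement
        self.preempt = preempt
        self.tracked_link_up = True

    @property
    def priority(self):
        if self.tracked_link_up:
            return self.base_priority
        return self.base_priority - self.track_decrement


def active_router(current_active, standby):
    """Standby takes over only if it may preempt and now has the higher priority."""
    if standby.preempt and standby.priority > current_active.priority:
        return standby
    return current_active


router_a = HsrpRouter("RouterA", priority=110, track_decrement=20, preempt=True)
router_b = HsrpRouter("RouterB", priority=100, preempt=True)

print(active_router(router_a, router_b).name)  # RouterA while the T3 is up
router_a.tracked_link_up = False                # T3 loses carrier: 110 - 20 = 90
print(active_router(router_a, router_b).name)  # RouterB now has the higher priority
```

Note that the decrement has to be large enough to push RouterA below RouterB. With a decrement of 5 in this example, RouterA would stay active even with the T3 down.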

 

NOTE:  these examples show 2 routers, but HSRP supports more than two routers.  Multiple routers can share a single virtual interface address !!!

 

At this point, HSRP may sound pretty good (and it is), but there are a few things you should keep in mind.

 

 

 

Load Sharing with BGP only, vs BGP and HSRP together

 

HSRP does not support load sharing all by itself !!!  For that you could use Cisco's GLBP (Gateway Load Balancing Protocol).  But, you can use a combination of BGP and HSRP to offer load sharing and reliability.  

 

For pure load sharing - BGP all by itself is the way to go.  HSRP does a lot for reliability, but it can work against outbound load sharing in some cases because it only allows one router to act as the “Active router” - which means during normal operation there is no load sharing.  The standby router is just that - standby only, and does not pass traffic during normal operation.  This can really crimp the bandwidth of a site that has 2 T1s and more than one T1's worth of outbound traffic.

Consider the network below:

 

Load Sharing with BGP but Without HSRP

 

Since both ISPs are sending only default routes, each border router will use its Internet connection for all exit traffic it receives. If each host generates about the same amount of outbound traffic, reasonably good outbound load sharing is achieved. (This might be especially desirable if both hosts together generated more traffic than would fit on either Internet connection individually.)

 

Although the outbound load sharing might be good with this configuration, your outbound traffic might be reaching its destination through some pretty circuitous paths.  As a quick reminder, think about what happens to traffic from HostB that is destined for a customer of ISPA. It would have to be carried by at least ISPB (and perhaps several other ASes) before reaching ISPA.

 

If an Internet connection fails in a BGP-only topology (no HSRP) -  BGP will lose the default route it had heard through that connection. Exit traffic sent to either router will eventually exit on the remaining (working) Internet connection.

 

·        As a comparison – HSRP also can achieve the same effect, as shown above in the “HSRP with Failed Primary Internet Connection” diagram - although probably not quite as quickly.

 

If a border router fails in a BGP-only topology (no HSRP)  - any hosts using the failed border router as the destination for a static default route would lose Internet connectivity.  

 

·        As a comparison - HSRP dealt handily with this problem (see the “HSRP with Failed Primary Border Router” diagram above). 

 

Protection against BOTH possibilities (i.e. Internet connection failure OR a border router failure) - HSRP will protect against both possible failures.  You configure HSRP on both border routers and configure both hosts to use the HSRP virtual interface address for their static default route.  This gives better reliability, since either Internet connection or either border router could fail without loss of Internet connectivity.

 

HOWEVER, HSRP does not support Load Sharing !!!  So in the absence of failure, all exit traffic from the AS would go out one Internet connection while the outbound side of the other sat largely idle. This could lead to congestion, especially if the total exit traffic from HostA and HostB exceeded the capacity of either Internet connection. In short, adding just HSRP to a dual-router network gives you reliability – but at the expense of load sharing.
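
A quick back-of-the-envelope check of that congestion point, written as a Python sketch. The per-host traffic figures are invented for illustration, both links are assumed to be T1s at the nominal 1.544 Mbps line rate, and the function simply adds up each host's outbound traffic on whichever link it exits through.

```python
T1_MBPS = 1.544  # nominal T1 line rate


def link_utilization(host_traffic_mbps, host_to_link, capacity_mbps):
    """Sum each host's outbound traffic onto its exit link; return load / capacity."""
    load = {link: 0.0 for link in capacity_mbps}
    for host, mbps in host_traffic_mbps.items():
        load[host_to_link[host]] += mbps
    return {link: load[link] / capacity_mbps[link] for link in capacity_mbps}


traffic = {"HostA": 1.0, "HostB": 1.0}  # hypothetical outbound demand in Mbps
capacity = {"ISPA": T1_MBPS, "ISPB": T1_MBPS}

# HSRP only: every host exits through the single active router's link.
hsrp_only = link_utilization(traffic, {"HostA": "ISPA", "HostB": "ISPA"}, capacity)
# Hosts split across the two border routers (the BGP-only picture).
split = link_utilization(traffic, {"HostA": "ISPA", "HostB": "ISPB"}, capacity)

print("HSRP only:", hsrp_only)
print("Split    :", split)
```

With these assumed numbers the HSRP-only case offers more traffic to the active link than it can carry while the other link sits idle, whereas the split case keeps both links comfortably below capacity.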