Hello everyone,
Today, I'm going to share an analysis of M-LAG route forwarding exceptions on CE6850.
Issue Description
A bank customer uses two CE6850 switches as gateways, two CE6850 switches as access devices, and another vendor's firewalls as egress devices of a service zone in the data center. When a link between the CS 6850 and the firewall was disconnected during a service emergency drill, services on some servers were interrupted intermittently.
Topology diagram:
Figure 1 Topology diagram

Note: All IP addresses used here differ from the actual IP addresses at the site.
Figure 1 shows the customer's live network topology. CS 6850-1 and CS 6850-2 function as gateways of the service zone. The M-LAG technology is enabled on interfaces of the gateways connected to access switches and S firewalls. The server gateway is deployed on CS 6850. AS 6850-1 and AS 6850-2 function as access switches. The M-LAG technology is enabled on interfaces of the access switches connected to gateways and servers. Links between CS 6850-1 and CS 6850-2 and between AS 6850-1 and AS 6850-2 are peer-links.
Handling Process
The checks found the following:
1. When the server traffic is forwarded to AS 6850-1 based on the hash algorithm and all links are normal, the service traffic will be forwarded along the path shown in Figure 2. After the check, all services are normal.
Figure 2 Service traffic forwarding path – 1

2. When the server traffic is forwarded to AS 6850-2 based on the hash algorithm and all links are normal, the service traffic is forwarded along the path shown in Figure 3. The check shows that all services are forwarded normally.
Figure 3 Service traffic forwarding path – 2

3. When a link (involving interface 2 of the firewall) between CS 6850 and the firewall is disconnected and the server traffic is forwarded to AS 6850-2 based on the hash algorithm, the service traffic should be sent to CS 6850-2, then to CS 6850-1 through the peer-link, and then to interface 1 of the firewall, as shown in Figure 4. In practice, however, all traffic sent along this path fails to be forwarded. Traffic sent along service traffic forwarding path – 1 is still normal.
Figure 4 Service traffic forwarding path – 3

On CS 6850-2, ping the server IP address 192.168.10.21 and the firewall IP address 172.16.1.6. The ping operations succeed. On Server 2, ping the remote service IP address 10.1.1.1 and firewall IP address 172.16.1.6. The ping operations fail.
Check CS 6850-2. The M-LAG status is normal. The Vlanif interface that connects to the firewall runs VRRP; the virtual IP address is 172.16.1.1, and the virtual MAC address of the VRRP group is xxxx-yyyy-zzzz. The interface configuration is as follows:
interface Vlanif13
 ip binding vpn-instance Hexin
 ip address 172.16.1.3 255.255.255.248
 vrrp vrid 200 virtual-ip 172.16.1.1
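The virtual MAC address is anonymized in this post, but on a standards-based IPv4 VRRP implementation it is derived from the VRID as 00-00-5E-00-01-{VRID}. The following is a minimal sketch of that mapping, assuming the device uses the standard RFC mapping; the function name is only illustrative. It can help when you need to recognize the expected destination MAC address in a packet capture.

```python
def vrrp_virtual_mac(vrid: int) -> str:
    """Return the standard IPv4 VRRP virtual MAC for a VRID.

    The result uses the xxxx-xxxx-xxxx notation seen in the switch output.
    Assumption: the device follows the RFC mapping 00-00-5E-00-01-{VRID}.
    """
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1-255")
    raw = bytes([0x00, 0x00, 0x5E, 0x00, 0x01, vrid])
    hex_str = raw.hex()
    # Group the 12 hex digits into three blocks of four.
    return "-".join(hex_str[i:i + 4] for i in range(0, 12, 4))


# VRID 200 is taken from the Vlanif13 configuration above.
print(vrrp_virtual_mac(200))  # 0000-5e00-01c8
```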
The routing table and FIB table are normal, as shown in the following:
<CS 6850-02>dis ip routing-table vpn-instance Hexin
Proto: Protocol        Pre: Preference
Route Flags: R - relay, D - download to fib, T - to vpn-instance, B - black hole route
------------------------------------------------------------------------------
Routing Table : Hexin
         Destinations : 51       Routes : 51
Destination/Mask    Proto   Pre  Cost   Flags  NextHop      Interface
0.0.0.0/0           Static  60   0      RD     172.16.1.6   Vlanif13
172.16.1.0/29       Direct  0    0      D      172.16.1.3   Vlanif13
<CS 6850-02>dis ip fib slot 1 vpn-instance Hexin
Route Flags: G - Gateway Route, H - Host Route, U - Up Route
             S - Static Route, D - Dynamic Route, B - Black Hole Route
--------------------------------------------------------------------------------
FIB Table: Hexin       Total number of Routes: 51
Destination/Mask    Nexthop      Flag  Interface  TunnelID
0.0.0.0/0           172.16.1.6   GSU   Vlanif13   -
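To make the "routing table is normal" conclusion concrete, the following sketch replays a longest-prefix-match lookup over the two routes shown in the excerpt. It is only an illustration: the real table holds 51 routes, and the `ROUTES` list and `lookup` helper are names assumed for this example.

```python
import ipaddress

# Only the two routes visible in the excerpt above; the real table has 51.
ROUTES = [
    ("0.0.0.0/0", "172.16.1.6", "Vlanif13"),
    ("172.16.1.0/29", "172.16.1.3", "Vlanif13"),
]


def lookup(dst: str):
    """Return (prefix, next hop, interface) of the longest matching route."""
    addr = ipaddress.ip_address(dst)
    matches = []
    for prefix, next_hop, interface in ROUTES:
        network = ipaddress.ip_network(prefix)
        if addr in network:
            matches.append((network, next_hop, interface))
    if not matches:
        return None
    network, next_hop, interface = max(matches, key=lambda m: m[0].prefixlen)
    return str(network), next_hop, interface


# The remote service address follows the default route to the firewall.
print(lookup("10.1.1.1"))
# The firewall address itself sits on the directly connected /29.
print(lookup("172.16.1.6"))
```

With these two routes, 10.1.1.1 resolves to the default route via 172.16.1.6, and 172.16.1.6 falls inside the directly connected 172.16.1.0/29 segment, which matches the conclusion that the routing side is healthy.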
Check the ARP entry of the next-hop IP address. The ARP entry is normal.
<LS-XX-HX-CS-6850-02>dis arp inter vlan 13
ARP Entry Types: D - Dynamic, S - Static, I - Interface, O - OpenFlow
EXP: Expire-time
IP ADDRESS      MAC ADDRESS     EXP(M)  TYPE/VLAN  INTERFACE  VPN-INSTANCE
------------------------------------------------------------------------------
172.16.1.6      aaaa-bbbb-cccc          I          Vlanif13   Hexin
Repeat the preceding steps to check the routing table and ARP table on CS 6850-1. The routing table and ARP table are normal.
Capture packets on the interface of CS 6850-1 connected to the firewall. The capture shows that the packets sent from the firewall to CS 6850-1 are encapsulated abnormally.
Figure 5 Packet Capture Result

The destination MAC address of the packets is dddd-eeee-ffff rather than the virtual MAC address xxxx-yyyy-zzzz of the VRRP group. When CS 6850 receives such a packet, it finds that the destination MAC address is not one of its own, so it handles the packet at Layer 2: it looks up the MAC address table or floods the packet in the VLAN. It never performs a routing table lookup for the packet, so the packet is not routed and services are interrupted.
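The decision described here can be sketched as a small model. This is a simplification for illustration, not the switch's actual forwarding pipeline; the function name, the local interface MAC 1111-2222-3333, and the empty MAC table are assumptions, while the other two MAC addresses are the placeholders used in this post.

```python
def classify_frame(dst_mac: str, router_macs: set, mac_table: dict) -> str:
    """Decide how a switch handles a frame based on its destination MAC.

    router_macs: MACs the switch routes for (VRRP virtual MAC, own MAC).
    mac_table:   learned MAC -> egress port entries for the VLAN.
    """
    dst = dst_mac.lower()
    if dst in router_macs:
        return "route"                      # Layer 3: routing table lookup
    if dst in mac_table:
        return "bridge:" + mac_table[dst]   # Layer 2: known unicast
    return "flood"                          # Layer 2: unknown unicast


VRRP_VIRTUAL_MAC = "xxxx-yyyy-zzzz"   # placeholder from this post
LOCAL_IF_MAC = "1111-2222-3333"       # hypothetical MAC of the local Vlanif
router_macs = {VRRP_VIRTUAL_MAC, LOCAL_IF_MAC}
mac_table = {}                        # assume nothing learned yet

# Frame as captured: the firewall used a physical MAC as destination.
print(classify_frame("dddd-eeee-ffff", router_macs, mac_table))  # flood
# Frame as expected: the firewall uses the VRRP virtual MAC.
print(classify_frame("xxxx-yyyy-zzzz", router_macs, mac_table))  # route
```

Only the second frame reaches the routing lookup, which is why the return traffic was lost even though the routing, FIB, and ARP tables were all normal.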
The customer contacted the firewall vendor to check the firewall configuration. It turned out that the firewall has a reverse route forwarding function. If this function is not enabled, the firewall does not query the ARP table and routing table when forwarding packets; instead, it forwards the packets directly through the corresponding physical interface. As a result, the physical MAC address of the peer interface is encapsulated as the destination MAC address of the packets. Because the switches use M-LAG + VRRP networking, however, packets are forwarded normally only when the virtual MAC address of the VRRP group is encapsulated as the destination MAC address. After the customer enabled the reverse route forwarding function, the packet encapsulation became normal and services were restored.
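The difference between the two firewall modes can be modeled roughly as follows. This is a hedged sketch based only on the description above, not the vendor's implementation: the `Session` class, the idea that the firewall reuses a peer MAC remembered per session, and the assumption that the firewall's next hop toward the servers is the VRRP virtual IP 172.16.1.1 are all assumptions made for illustration.

```python
from dataclasses import dataclass

VRRP_VIRTUAL_MAC = "xxxx-yyyy-zzzz"    # placeholder from this post
PEER_PHYSICAL_MAC = "dddd-eeee-ffff"   # placeholder from this post


@dataclass
class Session:
    ingress_interface: str   # interface the session was created on
    learned_peer_mac: str    # assumed: physical MAC remembered for the peer


def reply_destination_mac(session: Session, reverse_route_enabled: bool,
                          arp_table: dict, next_hop_ip: str) -> str:
    """Pick the destination MAC for return traffic in each mode."""
    if reverse_route_enabled:
        # Route lookup plus ARP resolution of the next hop.
        return arp_table[next_hop_ip]
    # No route or ARP lookup: reuse the MAC tied to the physical interface.
    return session.learned_peer_mac


# Assumption: the firewall routes server traffic via the VRRP virtual IP.
arp_table = {"172.16.1.1": VRRP_VIRTUAL_MAC}
session = Session("interface 1", PEER_PHYSICAL_MAC)

print(reply_destination_mac(session, False, arp_table, "172.16.1.1"))
print(reply_destination_mac(session, True, arp_table, "172.16.1.1"))
```

In this model, the disabled mode yields dddd-eeee-ffff and the enabled mode yields xxxx-yyyy-zzzz, matching the before and after behavior seen in the packet capture.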
Root Cause
The reverse route forwarding function on the firewall was not enabled. In this state, the firewall does not query the ARP table and routing table when forwarding packets; it sends them directly through the corresponding physical interface and encapsulates the physical MAC address of the peer interface as the destination MAC address. Because the switches use M-LAG + VRRP networking, packets are forwarded normally only when the destination MAC address is the virtual MAC address of the VRRP group.
That is all I want to share with you! Thank you!