Hello everyone,
In this post I will share how to troubleshoot a service interruption caused by a dual-point fault in an HoVPN scenario.
1. Problem Description and Networking
The HoVPN solution is used on the live network. TE+LDP tunnels are used on the access ring, and LDP over TE tunnels are used on the aggregation ring. The CSG is dual-homed to two ASGs, and ASG1 is the preferred (master) ASG. The following figure shows the simplified networking. Services are interrupted when a dual-point fault occurs, that is, when both an access ring link and the interconnection link between the ASGs fail.
2. Fault Locating
The live network information in this example has been replaced with sample data.
First, analyze in theory what should happen when both the access ring link and the ASG interconnection link fail.
(1) Upstream traffic: The TE and LDP tunnels from CSG1 to ASG1 go Down, and upstream traffic is switched to ASG2.
(2) Downstream traffic: The U2000 manages devices on the live network using the public network management solution, so loopback routes on the access ring are imported into the aggregation ring. Therefore, when the dual-point fault occurs, ASG1 still has an LDP LSP to CSG1. Until the VPNv4 peer relationship between CSG1 and ASG1 times out and goes Down, downstream traffic is still preferentially sent to ASG1. After the peer relationship goes Down, downstream traffic is sent to ASG2.
Note:
On the live network, the ASGs advertise the public network default route to the access ring because of an earlier adjustment. (Services can be restored by disconnecting the VPNv4 peer relationship between CSG1 and ASG1.) Because the ASGs advertise this default route, CSG1 still has a default route to ASG1 when the dual-point fault occurs. However, CSG1 does not have an LDP LSP to ASG1, so upstream traffic is switched to ASG2. The VPNv4 peer relationship between CSG1 and ASG1 does not time out and go Down, so ASG1 is still preferentially selected for downstream traffic. In theory, services should not be interrupted.
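The expected behavior described above can be summarized as a small decision model. The Python sketch below is illustrative only: the class and function names are hypothetical, and it deliberately ignores everything except the two conditions discussed here (whether CSG1 still has a tunnel to ASG1, and whether the VPNv4 peer relationship between CSG1 and ASG1 is still up).

```python
from dataclasses import dataclass


@dataclass
class Csg1State:
    """Simplified view of CSG1 after a fault (hypothetical model)."""
    tunnel_to_asg1_up: bool      # TE/LDP tunnel from CSG1 to ASG1
    vpnv4_peer_to_asg1_up: bool  # VPNv4 peer relationship CSG1 <-> ASG1


def upstream_asg(state: Csg1State) -> str:
    """ASG that CSG1 uses for upstream traffic."""
    # ASG1 is preferred, but it is usable only if CSG1 still learns
    # routes from it and has a tunnel to carry the traffic.
    if state.vpnv4_peer_to_asg1_up and state.tunnel_to_asg1_up:
        return "ASG1"
    return "ASG2"


def downstream_asg(state: Csg1State) -> str:
    """ASG that receives downstream traffic destined for CSG1."""
    # As long as the VPNv4 peer relationship has not timed out,
    # the routes learned through ASG1 remain preferred.
    return "ASG1" if state.vpnv4_peer_to_asg1_up else "ASG2"


# Dual-point fault with the default route keeping the VPNv4 peer up.
fault = Csg1State(tunnel_to_asg1_up=False, vpnv4_peer_to_asg1_up=True)
print(upstream_asg(fault), downstream_asg(fault))
```

In this model the dual-point fault produces an asymmetric path: upstream through ASG2 and downstream through ASG1. That is acceptable only if ASG1 really has a working LSP to CSG1, which is what the following checks verify.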
After the dual-point fault occurs, check the private network default route on CSG1 and the specific routes to the base stations on RSG1. The routes are normal, but the ping operation fails.
//CSG1 has a default public network route to ASG1.
//The VPNv4 peer relationship between CSG1 and ASG1 is in the established state.
//CSG1 does not have a tunnel to ASG1.
//CSG1 has a default private network route, and the next hop is ASG2.
//RSG1 has specific routes to base stations, and the next hop is ASG1.
//The route is normal but the ping operation fails.
Based on the failure points, check whether anything is abnormal on the access ring side. Let's look at the troubleshooting on ASG1:
//The public network route from ASG1 to CSG1 is normal, and CSG1 can be pinged.
//ASG1 has an LDP LSP to CSG1.
//An LSP ping from ASG1 to CSG1 succeeds, but the reply comes from the wrong address. In normal cases, the reply should come from 1.1.1.1.
//Run tracert lsp to 1.1.1.1. The egress is 10.12.6.1 instead of 1.1.1.1.
//Check the IP address 10.12.6.1. It is the interface IP address of a P node.
//On the P node, MPLS LDP is not configured on the outbound interface to 1.1.1.1.
//Check the MPLS view. The lsp-trigger ip-prefix ldplsp command is configured in the MPLS view. (If the policy for triggering LSP establishment is set to all static and IGP routes, or if an IP prefix list is used to trigger LSP establishment, the establishment of proxy egress LSPs is also triggered.)
//1.1.1.1 matches the IP prefix list ldplsp.
//The P device generates a proxy egress LSP destined for 1.1.1.1, and downstream traffic is iterated to this proxy egress LSP, which ends on the P node instead of CSG1. As a result, services are interrupted.
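The key symptom in the checks above is that the LSP toward 1.1.1.1 ends at a different address (10.12.6.1). A minimal Python sketch of that comparison is shown below. It assumes the FEC address and the egress address have already been read from the LSP ping or tracert output by hand or by another tool; the function name is hypothetical, and no device output is parsed here.

```python
import ipaddress


def lsp_ends_before_destination(fec_address: str, egress_address: str) -> bool:
    """Return True if the LSP egress is not the FEC address itself.

    A mismatch means some other node terminates the LSP, which is
    what a proxy egress LSP looks like from the ingress side.
    """
    fec = ipaddress.ip_address(fec_address)
    egress = ipaddress.ip_address(egress_address)
    return fec != egress


# Values taken from the case above.
if lsp_ends_before_destination("1.1.1.1", "10.12.6.1"):
    print("LSP to 1.1.1.1 terminates early at 10.12.6.1; check that node for a proxy egress LSP")
```

A mismatch alone does not prove that a proxy egress LSP exists. It only tells you which node to inspect next, as was done with the P node in this case.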
3. Root Cause and Solution
(1) Root cause: MPLS LDP was not configured on an interface of a P node on the aggregation ring, and proxy egress LSP generation was enabled on the P node by default. When the dual-point fault occurred, downstream traffic reached ASG1 and was iterated to the proxy egress LSP destined for CSG1 on the access ring. As a result, services were interrupted.
(2) Solution: Add the missing MPLS LDP configuration to the interface on the live network. To prevent services from being iterated to a proxy egress LSP, disable proxy egress LSP generation on all devices configured with the lsp-trigger ip-prefix command.
mpls
 proxy-egress disable
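To find the devices that still need this change, saved configurations can be scanned offline. The Python sketch below is only an illustration under stated assumptions: it assumes each configuration file uses the usual layout in which a view name such as mpls starts in the first column and the commands inside the view are indented, and the function name and sample text are hypothetical.

```python
def needs_proxy_egress_disable(config_text: str) -> bool:
    """Return True if the mpls view has 'lsp-trigger ip-prefix ...'
    but no 'proxy-egress disable'.

    Assumes a view starts on an unindented line and its commands
    are indented by at least one space.
    """
    in_mpls_view = False
    has_trigger = False
    has_disable = False
    for raw_line in config_text.splitlines():
        line = raw_line.rstrip()
        if not line.startswith(" "):
            # Unindented line: a new view, a separator, or a blank line.
            in_mpls_view = (line == "mpls")
            continue
        if in_mpls_view:
            command = line.strip()
            if command.startswith("lsp-trigger ip-prefix"):
                has_trigger = True
            elif command == "proxy-egress disable":
                has_disable = True
    return has_trigger and not has_disable


# Hypothetical configuration fragments for illustration.
before_fix = "#\nmpls\n lsp-trigger ip-prefix ldplsp\n#\nmpls ldp\n#\n"
after_fix = "#\nmpls\n lsp-trigger ip-prefix ldplsp\n proxy-egress disable\n#\nmpls ldp\n#\n"
print(needs_proxy_egress_disable(before_fix))
print(needs_proxy_egress_disable(after_fix))
```

The check matches the mpls view by exact name, so commands in the separate mpls ldp view are ignored.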
That is all I wanted to share today. Thank you!