After the network cable is removed, the VIP is switched from node 1 to node 2. However, the NFS service switchover takes more than one minute, as shown in the following figure.
At 09:08:34, the network cable of node 1 was removed, and the NFS service on node 1 was interrupted.
09:08:30  pubeth5  147383.17  6745.54  217852.06  383.60  0.00  0.00  0.00
09:08:32  pubeth5  151990.45  7276.38  224677.70  412.64  0.00  0.00  0.00
09:08:34  pubeth5   91280.00  4227.50  134910.39  241.32  0.00  0.00  0.00
09:08:36  pubeth5       0.00     0.00       0.00    0.00  0.00  0.00  0.00
NFS traffic did not resume until 09:10:48, more than two minutes after the network cable was removed at 09:08:34. The service switchover therefore took well over one minute.
09:10:44  pubeth5      1.00  0.00    0.06  0.00  0.00  0.00  0.00
09:10:46  pubeth5      1.49  0.50    0.09  0.02  0.00  0.00  0.00
09:10:48  pubeth5    176.88  0.50   10.36  0.02  0.00  0.00  0.00
09:10:50  pubeth5  14355.78  1.01  841.16  0.04  0.00  0.00  0.00
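The interruption window can be read off the samples by hand, but for longer captures a small filter is convenient. The following is a minimal sketch, not part of the original procedure. It assumes the samples use the layout shown above (timestamp, interface name, then rate columns) and that the first numeric column is the receive packet rate, as in `sar -n DEV` output. The function name `outage_window` and the default threshold of 100 packets per second are hypothetical choices for illustration.

```shell
#!/bin/sh
# outage_window [THRESHOLD]
# Reads traffic samples on stdin and prints:
#   <first sample with zero receive rate> <first later sample above THRESHOLD> <seconds between them>
# Assumption: column 1 is HH:MM:SS and column 3 is the receive packet rate.
outage_window() {
    awk -v thr="${1:-100}" '
        # Convert HH:MM:SS to seconds since midnight.
        function secs(ts,   t) { split(ts, t, ":"); return t[1] * 3600 + t[2] * 60 + t[3] }
        # Skip headers and anything that does not start with a timestamp.
        $1 !~ /^[0-9]+:[0-9]+:[0-9]+$/ { next }
        # Remember the first sample where traffic has dropped to zero.
        down == "" && $3 + 0 == 0 { down = $1; next }
        # Remember the first later sample where traffic is clearly back.
        down != "" && up == "" && $3 + 0 > thr { up = $1 }
        END { if (down != "" && up != "") print down, up, secs(up) - secs(down) }
    '
}
```

Fed the samples above, the filter reports 09:08:36 as the first idle sample and 09:10:48 as the first sample above the threshold, 132 seconds apart.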
The test shows that the ARP cache table on the client is not updated for a long time after the IP address of the NAS is switched. As a result, the client keeps sending NFS requests to the old MAC address.
linux-tele2client:/proc/sys/net/ipv4/neigh/p3p1 # arp -n
Address      HWtype  HWaddress          Flags Mask  Iface
1.1.1.12     ether   fc:48:ef:28:80:83  C           p3p1
1.1.1.67     ether   fc:48:ef:28:7f:d0  C           p3p1
1.1.1.11     ether   fc:48:ef:28:7f:d0  C           p3p1
100.49.0.1   ether   dc:d2:fc:1c:a0:a7  C           em2
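To confirm that a client still holds the old MAC address, the cached entry can be compared with the MAC address of the node that now owns the VIP. The helpers below are a rough sketch under stated assumptions: they parse `arp -n` output in the column layout shown above, and the expected MAC address has to be obtained separately from the NAS side. The names `mac_of` and `arp_is_stale` are hypothetical.

```shell
#!/bin/sh
# mac_of IP
# Reads `arp -n` style output on stdin and prints the MAC address cached for IP.
mac_of() {
    awk -v ip="$1" '$1 == ip && $2 == "ether" { print tolower($3); exit }'
}

# arp_is_stale IP EXPECTED_MAC
# Reads `arp -n` style output on stdin. Succeeds (exit status 0) when an entry
# for IP exists but its MAC address differs from EXPECTED_MAC.
arp_is_stale() {
    cached=$(mac_of "$1")
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    [ -n "$cached" ] && [ "$cached" != "$expected" ]
}
```

A possible invocation on the client is `arp -n | arp_is_stale <VIP> <MAC of the new owner> && echo "stale ARP entry"`, where both placeholders must be filled in for the environment at hand.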
Handling suggestion: Ping the NAS VIP from the client continuously so that the client's ARP cache entry is refreshed. Also check the ARP aging time on the client. If the aging time is too long, change it to a smaller value. The default aging time on Linux (gc_stale_time) is 60 seconds.
linux-tele2client:/proc/sys/net/ipv4/neigh/p3p1 # echo 60 > gc_stale_time
linux-tele2client:/proc/sys/net/ipv4/neigh/p3p1 # cat gc_stale_time
60
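The aging-time check can also be scripted so that it is easy to repeat on several clients. This is a minimal sketch rather than a prescribed procedure. The function name `stale_time_exceeds` is hypothetical, and the interface name `p3p1` in the usage comments is taken from the example above and will differ on other clients.

```shell
#!/bin/sh
# stale_time_exceeds FILE LIMIT
# Succeeds (exit status 0) when the number stored in FILE is greater than LIMIT.
# FILE is expected to be a gc_stale_time file containing a value in seconds.
stale_time_exceeds() {
    current=$(cat "$1") || return 2
    [ "$current" -gt "$2" ]
}

# Example usage on a client (writing the value requires root):
#   f=/proc/sys/net/ipv4/neigh/p3p1/gc_stale_time
#   if stale_time_exceeds "$f" 60; then echo 60 > "$f"; fi
#
# One probe of the VIP, which can be repeated from a loop or a scheduler:
#   ping -c 1 -W 1 <VIP>
```

Taking the file path as a parameter keeps the check independent of the interface name and allows it to be tried against an ordinary file before it is pointed at /proc.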