Sound ISP management means continually eliminating single points of failure so that we can keep our customers' service levels as high as possible.
With this in mind, we should always plan our network and service architecture with at least N + 1 redundancy. What does that mean? We size the structure with the number of assets needed for full operation (N) and add at least one more asset to take over in case of failure.
One of the critical points in an ISP's structure is its subscriber authentication base, which uses the RADIUS protocol and is usually centralized. Unfortunately, some systems do not support redundancy with rapid convergence, so a failure leads to a long period of unavailability before services are re-established.
Even with efficient redundancy in place, ISO 27001 teaches us that all risks must be mapped and treated, which means our disaster recovery planning must include a way to quickly re-establish service to our subscribers.
What I bring you today is a drastic measure that should only be used as a last resort, but it can relieve the pressure on the technical team during a serious failure of the authentication base, giving them more time to resolve it.
By default, a Huawei BNG treats a subscriber's access as rejected when the RADIUS authentication request gets no response.
The configuration below changes this default behavior: if the authentication request times out, the subscriber is considered valid and access is allowed, which re-establishes service. The accounting scheme is adjusted as well, so that a failed accounting start does not take the session down.
aaa
#
authentication-scheme a1
authening radius-no-response online authen-domain example.com
#
accounting-scheme acct1
accounting start-fail online
#
domain example.com
authentication-scheme a1
accounting-scheme acct1
radius-server group rd1
#
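To make the change in behavior concrete, here is a minimal Python sketch of the decision described above. It models only what this article states, not the BNG's actual implementation; the names `RadiusResult`, `admit_subscriber` and `online_on_no_response` are hypothetical and exist only for illustration.

```python
from enum import Enum


class RadiusResult(Enum):
    ACCEPT = "accept"            # server answered and accepted the subscriber
    REJECT = "reject"            # server answered and rejected the subscriber
    NO_RESPONSE = "no_response"  # the request timed out without any answer


def admit_subscriber(result: RadiusResult, online_on_no_response: bool) -> bool:
    """Return True if the session should be brought up.

    online_on_no_response is a hypothetical flag standing in for the
    'authening radius-no-response online' line in the configuration.
    """
    if result is RadiusResult.ACCEPT:
        return True
    if result is RadiusResult.REJECT:
        # An explicit reject is still a reject; the fallback only covers silence.
        return False
    # NO_RESPONSE: rejected by default, admitted only when the fallback is on.
    return online_on_no_response
```

With the flag off, a timeout behaves like a reject, which is the default described earlier. With the flag on, only the timeout case changes; an explicit answer from the server is still honored.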
Leaving this configuration permanently enabled is not recommended, because every subscriber admitted after a timeout comes online without the controls that RADIUS attributes would normally apply to the session. After a mass disconnection, many subscribers may be online only because the RADIUS request timed out, and their sessions will have inadequate settings.
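Once RADIUS is back, it helps to know which sessions came up without their controls so they can be re-authenticated. The sketch below is one possible way to flag them, assuming you have already exported the online sessions and the RADIUS attributes each one received into some structure of your own. The `Session` class and the `sessions_missing_policy` function are hypothetical, and `Filter-Id` is used only as an example of an attribute your policy might require.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    # Hypothetical record built from your own export of online sessions.
    username: str
    radius_attributes: dict = field(default_factory=dict)


def sessions_missing_policy(sessions, required=("Filter-Id",)):
    """Return usernames whose session lacks any of the required attributes."""
    return [
        s.username
        for s in sessions
        if any(attr not in s.radius_attributes for attr in required)
    ]


# Illustrative data only: one normal session, one admitted on timeout.
online = [
    Session("alice@example.com", {"Filter-Id": "plan-100m"}),
    Session("bob@example.com"),  # no attributes: came up without RADIUS
]
to_reauthenticate = sessions_missing_policy(online)
```

The list in `to_reauthenticate` is a starting point for a controlled reconnection of those subscribers; how the sessions are exported and reset depends on your own tooling.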
I thank my friend @vagkaefer for helping to locate this command.
I hope this article helps in your day-to-day operations, and I invite you to comment on your experiences with this matter.
#MVE
#HuaweiEnterpriseCommunity


