1.1.1 HDFS service unavailable alarm reported during installation
[Problem Phenomenon] The controller log prints "all name services in BAD state. so need to send HDFS alarm".
[Possible causes]
1. A name service really is in an abnormal state.
2. The HDFS service unavailable alarm is a false positive.
[Problem Location Step] {R2C50 Tr5 Version}
1. If a name service were really abnormal, a name service exception alarm would have been reported first, but there is no name service exception alarm in the cluster.
2. During installation, the name service checking thread had not yet started, yet an HDFS service unavailable alarm was still sent.
3. Comparing the code shows that the check for whether an HDFS service unavailable alarm needs to be sent is implemented as follows:
private boolean isHDFSneedSendAlarm(ManagedRole role)
{
    int badNameServiceCount = 0;
    Map<String, Pair> pairs = role.getPairs();
    // Count the name services whose health state is BAD.
    for (java.util.Map.Entry<String, Pair> pairentry : pairs.entrySet())
    {
        Pair pair = pairentry.getValue();
        if (pair.getState().getState().equals(EntityHealthState.BAD))
        {
            badNameServiceCount++;
        }
    }
    // Note: this comparison also holds when pairs is empty (0 == 0).
    if (badNameServiceCount == pairs.size())
    {
        LOG.error("all name services are in BAD state. so need to send HDFS alarm, badNameServiceCount = {}",
            badNameServiceCount);
        return true;
    }
    return false;
}
With the code above, during cluster installation the number of pairs is 0 and badNameServiceCount is also 0. The two values are equal, so the condition is satisfied and a false alarm is raised. Therefore, a badNameServiceCount != 0 check should be added to the condition.
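The corrected condition can be sketched as follows. This is a minimal, self-contained illustration and not the product code: the HdfsAlarmCheck class, the needSendHdfsAlarm method, and the two-value EntityHealthState enum are stand-ins assumed for the example, and the name service states are passed in as a plain map instead of ManagedRole and Pair objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the controller's check; not the product code.
final class HdfsAlarmCheck {

    // Assumed two-value health state; the real EntityHealthState may have more values.
    enum EntityHealthState { GOOD, BAD }

    // Returns true only when at least one name service exists and all of them are BAD.
    static boolean needSendHdfsAlarm(Map<String, EntityHealthState> nameServiceStates) {
        int badNameServiceCount = 0;
        for (EntityHealthState state : nameServiceStates.values()) {
            if (state == EntityHealthState.BAD) {
                badNameServiceCount++;
            }
        }
        // The extra "!= 0" guard rejects the empty case seen during installation (0 == 0).
        return badNameServiceCount != 0 && badNameServiceCount == nameServiceStates.size();
    }

    public static void main(String[] args) {
        Map<String, EntityHealthState> states = new HashMap<>();
        System.out.println(needSendHdfsAlarm(states)); // false: no name services registered yet
        states.put("ns1", EntityHealthState.BAD);
        System.out.println(needSendHdfsAlarm(states)); // true: the only name service is BAD
    }
}
```

With this guard, an empty set of name services, as seen before the checking thread has started, no longer triggers the alarm, while a cluster whose name services are all BAD still does.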
