Hello all,
I want to talk about HMaster HA and the relationship between HBase and other components. The HMaster in HBase allocates regions and, when a RegionServer fails, migrates the corresponding metadata from the failed RegionServer to another RegionServer. The HMaster HA feature was introduced so that the HMaster is no longer a single point of failure (SPOF) for HBase.
Implementation
Figure 1 HMaster HA architecture
The HMaster HA architecture is implemented by creating an ephemeral znode in the ZooKeeper cluster.
Upon startup, both HMaster nodes try to create a master znode in the ZooKeeper cluster. The HMaster node that succeeds in creating the master znode becomes the Active HMaster. The other becomes the Standby HMaster and sets a watch on the master znode.
If the active node fails, it disconnects from the ZooKeeper cluster. After its session expires, the master znode disappears. The standby node detects the disappearance through its watch and creates a new master znode, making itself the Active HMaster. This completes the Active/Standby switchover. If the failed node is later restarted and finds that the master znode already exists, it enters the Standby state and sets a watch on the master znode.
When a client accesses HBase, it first obtains the HMaster's address from the master znode in ZooKeeper and then establishes a connection to the active HMaster.
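To make the sequence above concrete, here is a rough sketch in Python. It is only a toy in-memory simulation, not real ZooKeeper or HBase client code: FakeZooKeeper, HMaster, client_lookup, the /hbase/master path and the master1:16000 / master2:16000 addresses are all names I made up for illustration.

```python
class FakeZooKeeper:
    """Toy in-memory stand-in for a ZooKeeper cluster (NOT the real client API)."""

    def __init__(self):
        self.znodes = {}   # path -> (data, owning session id)
        self.watches = {}  # path -> callbacks fired once when the znode is deleted

    def create_ephemeral(self, path, data, session):
        """Create the znode if it is absent; return False if it already exists."""
        if path in self.znodes:
            return False
        self.znodes[path] = (data, session)
        return True

    def get_data(self, path):
        entry = self.znodes.get(path)
        return entry[0] if entry else None

    def watch_delete(self, path, callback):
        self.watches.setdefault(path, []).append(callback)

    def expire_session(self, session):
        """Session expiry deletes every ephemeral znode owned by that session."""
        owned = [p for p, (_, s) in self.znodes.items() if s == session]
        for path in owned:
            del self.znodes[path]
            # Watches are one-shot: remove them before firing.
            for callback in self.watches.pop(path, []):
                callback()


MASTER_ZNODE = "/hbase/master"  # assumed path, for illustration only


class HMaster:
    def __init__(self, zk, address):
        self.zk = zk
        self.address = address
        self.state = "STOPPED"

    def start(self):
        # The address doubles as the session id in this toy model.
        if self.zk.create_ephemeral(MASTER_ZNODE, self.address, self.address):
            self.state = "ACTIVE"
        else:
            self.state = "STANDBY"
            self.zk.watch_delete(MASTER_ZNODE, self._on_master_gone)

    def _on_master_gone(self):
        # Only a running standby reacts to the master znode disappearing.
        if self.state == "STANDBY":
            self.start()

    def crash(self):
        self.state = "STOPPED"
        self.zk.expire_session(self.address)


def client_lookup(zk):
    """A client reads the active HMaster's address from the master znode."""
    return zk.get_data(MASTER_ZNODE)


zk = FakeZooKeeper()
m1, m2 = HMaster(zk, "master1:16000"), HMaster(zk, "master2:16000")
m1.start()
m2.start()
print(m1.state, m2.state, client_lookup(zk))  # ACTIVE STANDBY master1:16000
m1.crash()                                    # session expires, znode vanishes
print(m1.state, m2.state, client_lookup(zk))  # STOPPED ACTIVE master2:16000
m1.start()                                    # restarted node rejoins
print(m1.state, m2.state)                     # STANDBY ACTIVE
```

In a real cluster the session expiry is driven by ZooKeeper's session timeout rather than an explicit call, so the switchover is not instantaneous as it is in this sketch.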
Relationship with Other Components
Relationship Between HDFS and HBase:
HDFS (Hadoop Distributed File System) is a subproject of Apache Hadoop. HBase uses HDFS as its file storage system and sits above it in the structured storage layer. HDFS provides highly reliable lower-layer storage for HBase. All HBase data files can be stored in HDFS, except for some log files generated by HBase.
Relationship Between ZooKeeper and HBase:
Figure 2 describes the relationship between ZooKeeper and HBase.
Figure 2 Relationship between ZooKeeper and HBase
1. Each HRegionServer registers itself in ZooKeeper as an ephemeral node. ZooKeeper stores HBase information, including the HBase metadata and the HMaster address.
2. HMaster uses ZooKeeper to monitor the health status of each HRegionServer.
3. HBase can deploy multiple HMasters (similar to the HDFS NameNode). When the active HMaster node fails, the standby HMaster node obtains the state information of the entire cluster from ZooKeeper. In this way, ZooKeeper allows HBase to avoid a single point of failure.
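Points 1 and 2 can also be sketched in a few lines of Python. Again, this is a simplified model under my own assumptions and not HBase code: RegionServerTracker and its methods are hypothetical names, the live set stands in for the ephemeral znodes of the registered HRegionServers, and the "move to the least loaded server" rule is just a placeholder policy for the example.

```python
class RegionServerTracker:
    """Toy model of an HMaster tracking HRegionServers through ephemeral znodes."""

    def __init__(self):
        self.live = set()      # servers whose ephemeral znode currently exists
        self.assignments = {}  # region name -> server name

    def register(self, server):
        """An HRegionServer registers itself (creates its ephemeral znode)."""
        self.live.add(server)

    def assign(self, region, server):
        if server not in self.live:
            raise ValueError(f"{server} is not registered")
        self.assignments[region] = server

    def _load(self, server):
        return sum(1 for owner in self.assignments.values() if owner == server)

    def on_session_expired(self, server):
        """Called when a server's ephemeral znode disappears.

        Moves each of its regions to the least loaded live server
        (placeholder policy) and returns the list of (region, new_server).
        """
        self.live.discard(server)
        if not self.live:
            return []  # nothing left to move the regions to
        moved = []
        orphaned = sorted(r for r, s in self.assignments.items() if s == server)
        for region in orphaned:
            target = min(sorted(self.live), key=self._load)
            self.assignments[region] = target
            moved.append((region, target))
        return moved


tracker = RegionServerTracker()
for name in ("rs1", "rs2", "rs3"):
    tracker.register(name)
tracker.assign("r1", "rs1")
tracker.assign("r2", "rs1")
tracker.assign("r3", "rs2")

moved = tracker.on_session_expired("rs1")  # rs1 fails, its znode vanishes
print(moved)                # [('r1', 'rs3'), ('r2', 'rs2')]
print(sorted(tracker.live)) # ['rs2', 'rs3']
```

The point of the sketch is only that the HMaster does not need to poll the HRegionServers directly: the disappearance of the ephemeral node is the failure signal.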
That is what I wanted to share with you today. Thank you!