HDFS vs. HBase

Latest reply: Oct 11, 2021 09:13:19

Original link

https://dzone.com/articles/hdfs-vs-hbase-all-you-need-to-know


Hello, everyone!

This post is about “HDFS vs. HBase”.


The sudden increase in the volume of data, from the order of gigabytes to zettabytes, has created the need for a more organized file system for storing and processing data. The demand stemming from the data market has brought Hadoop into the limelight, making it one of the biggest players in the industry. Hadoop Distributed File System (HDFS), the commonly known file system of Hadoop, and HBase, Hadoop’s database, are among the most topical and advanced data storage and management systems available in the market.

What Are HDFS and HBase?

HDFS is fault-tolerant by design and supports rapid data transfers between nodes, even during system failures. HBase is an open-source, non-relational (NoSQL) database that runs on top of Hadoop. In terms of the CAP theorem (consistency, availability, and partition tolerance), HBase is a CP system.

HDFS is most suitable for performing batch analytics. However, one of its biggest drawbacks is its inability to perform real-time analysis, which is an increasingly common requirement in the IT industry. HBase, on the other hand, can handle large datasets but is not appropriate for batch analytics. Instead, it is used to write and read data in Hadoop in real time.

Both HDFS and HBase are capable of storing structured, semi-structured, and unstructured data. HDFS is a storage layer with no in-memory processing engine of its own, so analysis over it has traditionally run as plain MapReduce jobs, which slows data analysis down. HBase, by contrast, buffers incoming writes in memory and caches frequently read blocks, which drastically increases the speed of reads and writes.
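
To make the in-memory point above more concrete, here is a minimal Python sketch of a write buffer that holds recent writes in memory and flushes them to sorted, immutable "files" once it fills up. This is a toy model under stated assumptions, not HBase code: the class `ToyMemStore`, the `flush_threshold` parameter, and the in-memory lists standing in for files are all made up for illustration.

```python
# Toy model of an in-memory write buffer that flushes to immutable sorted "files".
# Illustrative only: ToyMemStore and flush_threshold are hypothetical names,
# and the "files" are plain Python lists rather than anything stored on disk.

class ToyMemStore:
    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold
        self.buffer = {}         # recent writes, held in memory
        self.flushed_files = []  # older data; each entry is a sorted list of (key, value)

    def put(self, key, value):
        # Writes only touch the in-memory buffer, which is why they are fast.
        self.buffer[key] = value
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Persist the buffer as one sorted, immutable "file" and start a new buffer.
        if self.buffer:
            self.flushed_files.append(sorted(self.buffer.items()))
            self.buffer = {}

    def get(self, key):
        # Check memory first, then flushed files from newest to oldest.
        if key in self.buffer:
            return self.buffer[key]
        for flushed in reversed(self.flushed_files):
            for k, v in flushed:
                if k == key:
                    return v
        return None


store = ToyMemStore(flush_threshold=3)
for i in range(4):
    store.put(f"row{i}", f"value{i}")
```

After the four writes, the first three rows have been flushed as one sorted batch and the fourth is still served from memory.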

HDFS is very transparent in its execution of data analysis. HBase, on the other hand, being a NoSQL database in tabular format, keeps its rows sorted by row key and fetches values by looking up those keys.
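
The key-based access described above can be sketched with a small sorted table in Python. This is a toy model, not the HBase client API: `ToySortedTable`, its methods, and the sample row keys are hypothetical, and the standard-library `bisect` module stands in for the real storage engine.

```python
import bisect

# Toy model of a table whose rows are kept sorted by row key.
# ToySortedTable and the sample keys below are illustrative assumptions.

class ToySortedTable:
    def __init__(self):
        self._keys = []    # row keys, always kept in sorted order
        self._values = {}  # row key -> value

    def put(self, row_key, value):
        if row_key not in self._values:
            bisect.insort(self._keys, row_key)  # insert while preserving order
        self._values[row_key] = value

    def get(self, row_key):
        # Point lookup by key.
        return self._values.get(row_key)

    def scan(self, start_key, stop_key):
        # Range scan over [start_key, stop_key), cheap because keys are sorted.
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, stop_key)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


table = ToySortedTable()
table.put("user#003", "carol")
table.put("user#001", "alice")
table.put("order#17", "pending")
table.put("user#002", "bob")

# "$" sorts immediately after "#", so this range covers every key starting with "user#".
user_rows = table.scan("user#", "user$")
```

Because the keys are sorted, rows that share a prefix sit next to each other, so one range scan returns all `user#` rows in order without touching the `order#` row.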

HDFS and HBase Use Cases

The two use cases we'll look at involve Cloudera optimization for a bank and an analytics solution for a global CPG player.

Cloudera Optimization for European Bank With HBase

HBase is ideally suited for real-time environments, which can be best demonstrated by citing an example from one of our clients, a renowned European bank. To derive critical insights from the logs of application and web servers, we implemented a solution using Apache Storm together with Apache HBase. Given the high velocity of the data, we opted for HBase over HDFS, as HDFS does not support real-time writes. The results were striking: query time dropped from three days to three minutes.

Analytics Solution for Global CPG Player With HDFS and MapReduce

With our global beverage client, the primary objective was to perform batch analysis to gain SKU-level insights, which involved recursive/sequential calculations. HDFS and MapReduce were better suited to this than complex Hive queries on top of HBase. MapReduce was used for data wrangling and to prepare data for subsequent analytics. Hive was used for custom analytics on top of the data processed by MapReduce. The results were impressive again, with a drastic reduction in the time taken to generate custom analytics: from three days to three hours.

To offer a reasonable comparison between HDFS and HBase, the following points need to be emphasized:

| HDFS | HBase |
|------|-------|
| HDFS is a Java-based file system utilized for storing large datasets. | HBase is a Java-based NoSQL database. |
| HDFS has a rigid architecture that does not allow changes. It doesn’t facilitate dynamic storage. | HBase allows for dynamic changes and can be utilized for standalone applications. |
| HDFS is ideally suited for write-once and read-many use cases. | HBase is ideally suited for random writes and reads of data that is stored in HDFS. |
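
The last row of the comparison, write-once/read-many versus random writes and reads, can be illustrated with two toy Python classes. These are simplified sketches of the two access patterns only: `WriteOnceFile` and `RandomAccessTable` are hypothetical names, and neither reflects the real HDFS or HBase APIs.

```python
# Toy contrast between the two access patterns. Illustrative assumptions only:
# WriteOnceFile and RandomAccessTable are made-up classes, not HDFS or HBase APIs.

class WriteOnceFile:
    """Written sequentially once, then closed and only read."""

    def __init__(self):
        self._lines = []
        self._closed = False

    def write(self, line):
        if self._closed:
            raise PermissionError("file is closed; contents cannot be changed in place")
        self._lines.append(line)

    def close(self):
        self._closed = True

    def read_all(self):
        # Reading is allowed any number of times, before or after close().
        return list(self._lines)


class RandomAccessTable:
    """Any row can be written or read individually at any time."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value  # overwrites the previous value for this key

    def get(self, key):
        return self._rows.get(key)


log = WriteOnceFile()
log.write("2021-10-11 login alice")
log.write("2021-10-11 login bob")
log.close()

try:
    log.write("late edit")
    late_write_rejected = False
except PermissionError:
    late_write_rejected = True

accounts = RandomAccessTable()
accounts.put("alice", 100)
accounts.put("alice", 250)  # in-place update of a single row
```

The closed file can still be scanned in full as often as needed, which suits batch jobs, while the table supports the single-row updates and lookups that real-time workloads need.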

And that's it! Feel free to leave any questions or remarks in the comments below.




Detailed explanation, thanks my friend.

little_fish Created Oct 11, 2021 09:14:30

zaheernew Created Oct 11, 2021 13:49:11
Thanks for sharing.
