Hi, everyone!
In this post I will walk through the relationships between Hive and the components it works with.
Relationship Between Hive and HDFS
Hive started as a subproject of Apache Hadoop and uses the Hadoop Distributed File System (HDFS) as its file storage system. Hive parses and processes structured data, while HDFS provides highly reliable underlying storage. All data files of Hive databases are stored in HDFS, and all Hive data operations go through HDFS APIs.
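To make "data files are stored in HDFS" more concrete, here is a minimal sketch of how a managed table maps to an HDFS directory under stock Hive defaults. It assumes the default warehouse directory /user/hive/warehouse (the hive.metastore.warehouse.dir setting), which your cluster may override. The function name table_location and the sample database and table names are hypothetical, used only for illustration.

```python
import posixpath

# Default value of hive.metastore.warehouse.dir in stock Hive.
# Assumption: your cluster may be configured with a different path.
DEFAULT_WAREHOUSE = "/user/hive/warehouse"


def table_location(database, table, partition=None, warehouse=DEFAULT_WAREHOUSE):
    """Return the default HDFS directory of a managed table or of one partition.

    Hypothetical helper: it only mirrors the default layout convention and
    does not talk to Hive or HDFS.
    """
    parts = [warehouse]
    # Tables in the "default" database sit directly under the warehouse dir;
    # other databases get a "<name>.db" subdirectory.
    if database != "default":
        parts.append(database + ".db")
    parts.append(table)
    # Each partition column becomes one "key=value" directory level.
    for key, value in (partition or {}).items():
        parts.append(f"{key}={value}")
    return posixpath.join(*parts)


print(table_location("sales", "orders", {"dt": "2024-01-01"}))
# prints /user/hive/warehouse/sales.db/orders/dt=2024-01-01
```

External tables and tables created with an explicit LOCATION do not follow this layout, so treat it as the default case only.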
Relationship Between Hive and MapReduce
Hive relies on MapReduce for data computing. MapReduce is part of the Apache Hadoop project and is a parallel computing framework that runs on data stored in HDFS. During data analysis, Hive translates the HQL statements submitted by users into MapReduce jobs and submits them to MapReduce for execution.
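As a rough illustration of that translation, consider a query such as SELECT dept, COUNT(*) FROM emp GROUP BY dept (the emp table and its columns are made up for this example). The sketch below is not how Hive generates jobs internally; it only shows, in plain Python, the map, shuffle, and reduce steps that such a query conceptually breaks down into.

```python
from collections import defaultdict


def map_phase(rows):
    """Map step: emit (group key, 1) for every input row."""
    for row in rows:
        yield row["dept"], 1


def shuffle(pairs):
    """Shuffle step: collect all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: COUNT(*) per key is the sum of the emitted ones."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical rows of the emp table.
rows = [
    {"name": "ann", "dept": "hr"},
    {"name": "bob", "dept": "it"},
    {"name": "cat", "dept": "hr"},
]

result = reduce_phase(shuffle(map_phase(rows)))
print(result)
```

In a real cluster the map and reduce steps run in parallel on many nodes and the shuffle moves data across the network, which is exactly the part MapReduce handles for Hive.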
Relationship Between Hive and DBService
The Hive MetaStore (metadata service) manages the structure and attribute information of Hive databases, tables, and partitions. This information must be stored in a relational database, where MetaStore maintains and processes it. In FusionInsight HD, that relational database is maintained by the DBService component.
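To show why a relational database fits this job, here is a toy sketch that keeps database, table, and partition metadata in an in-memory SQLite database. Everything in it is an assumption made for illustration: the three-table schema is heavily simplified and is not the real MetaStore schema, SQLite merely stands in for the database that DBService maintains, and list_partitions is a hypothetical helper.

```python
import sqlite3

# In-memory SQLite stands in for the relational database behind MetaStore.
conn = sqlite3.connect(":memory:")

# Simplified, hypothetical schema: one table each for databases,
# tables, and partitions, linked by foreign keys.
conn.executescript("""
CREATE TABLE dbs   (db_id   INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE tbls  (tbl_id  INTEGER PRIMARY KEY,
                    db_id   INTEGER REFERENCES dbs(db_id),
                    name    TEXT,
                    location TEXT);
CREATE TABLE parts (part_id INTEGER PRIMARY KEY,
                    tbl_id  INTEGER REFERENCES tbls(tbl_id),
                    spec    TEXT);
""")

conn.execute("INSERT INTO dbs (db_id, name) VALUES (1, 'sales')")
conn.execute(
    "INSERT INTO tbls (tbl_id, db_id, name, location) VALUES (?, ?, ?, ?)",
    (1, 1, "orders", "/user/hive/warehouse/sales.db/orders"),
)
conn.executemany(
    "INSERT INTO parts (tbl_id, spec) VALUES (?, ?)",
    [(1, "dt=2024-01-01"), (1, "dt=2024-01-02")],
)


def list_partitions(conn, database, table):
    """Return the partition specs of one table, sorted by spec."""
    rows = conn.execute(
        """
        SELECT p.spec
        FROM parts p
        JOIN tbls t ON p.tbl_id = t.tbl_id
        JOIN dbs  d ON t.db_id  = d.db_id
        WHERE d.name = ? AND t.name = ?
        ORDER BY p.spec
        """,
        (database, table),
    ).fetchall()
    return [row[0] for row in rows]


print(list_partitions(conn, "sales", "orders"))
```

The point is that the metadata (names, locations, partition specs) lives in relational tables, while the actual table data stays in HDFS.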
Relationship Between Hive and Elasticsearch
Hive can use Elasticsearch as an extended storage system. Hive integrates the Elasticsearch-Hadoop plug-in, through which you create an external table whose data is stored in Elasticsearch, so that Hive can read and write Elasticsearch index data.
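For reference, a table backed by Elasticsearch is usually declared in HQL with the Elasticsearch-Hadoop storage handler. The small sketch below only builds such a statement as a string; it does not connect to Hive or Elasticsearch. The helper name es_external_table_ddl, the table name, the columns, the index name, and the host are all hypothetical, and the exact properties your FusionInsight HD version expects may differ, so please check the product documentation before using it.

```python
def es_external_table_ddl(table, columns, es_resource, es_nodes):
    """Build an HQL statement for an external table backed by Elasticsearch.

    Hypothetical helper for illustration only: it does no escaping or
    validation of its inputs.
    """
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} ({cols})\n"
        # Storage handler class shipped with the Elasticsearch-Hadoop plug-in.
        "STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'\n"
        # es.resource names the target index; es.nodes lists the ES hosts.
        f"TBLPROPERTIES ('es.resource' = '{es_resource}', "
        f"'es.nodes' = '{es_nodes}')"
    )


ddl = es_external_table_ddl(
    "logs_es",
    [("id", "BIGINT"), ("msg", "STRING")],
    es_resource="logs",
    es_nodes="es-host:9200",
)
print(ddl)
```

Once such a table exists, ordinary INSERT and SELECT statements on it are routed to the Elasticsearch index instead of HDFS files.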
That is all I wanted to share today. Thank you!