What is SmallFS used for?

The number of files that the HDFS NameNode can manage is limited by the node's heap memory, because the NameNode keeps the metadata of every file and data block in memory. Services often generate a large number of small files. Each of these needs its own metadata entry however little data it holds, so together they can rapidly consume NameNode memory and slow the NameNode down.
SmallFS is a background small-file merging feature developed to solve this problem. It automatically detects small files in the system based on a file size threshold, merges them, and stores their metadata in a local LevelDB to reduce the NameNode load. It also provides a new FileSystem interface so that users can access these small files transparently.
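The detect, merge, and index steps described above can be illustrated with a minimal sketch. This is not the SmallFS implementation: the class name `SmallFileStore`, the `IndexEntry` record, and the threshold value are hypothetical, and a plain in-memory dict stands in for the local LevelDB that SmallFS uses for metadata.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical threshold in bytes; the real SmallFS threshold is configurable
# and its default is not stated in this answer.
SMALL_FILE_THRESHOLD = 1024


@dataclass(frozen=True)
class IndexEntry:
    """Where one small file lives inside the merged container."""
    offset: int
    length: int


class SmallFileStore:
    """Toy model of small-file merging: one container plus a path index."""

    def __init__(self, threshold: int = SMALL_FILE_THRESHOLD) -> None:
        self.threshold = threshold
        self._container = bytearray()           # merged data of all small files
        self._index: Dict[str, IndexEntry] = {}  # stand-in for LevelDB metadata

    def is_small(self, data: bytes) -> bool:
        # Detection step: a file is "small" if it is below the size threshold.
        return len(data) < self.threshold

    def merge(self, files: Dict[str, bytes]) -> List[str]:
        """Append every small file to the container and record its location.

        Returns the paths that were merged; larger files are left untouched.
        """
        merged = []
        for path, data in sorted(files.items()):
            if not self.is_small(data):
                continue
            self._index[path] = IndexEntry(len(self._container), len(data))
            self._container += data
            merged.append(path)
        return merged

    def read(self, path: str) -> bytes:
        # Transparent access: the caller uses the original path, and the
        # index maps it to a byte range inside the merged container.
        entry = self._index[path]
        return bytes(self._container[entry.offset:entry.offset + entry.length])
```

In this sketch, merging `/logs/a.txt` and `/logs/b.txt` leaves a single container object to track instead of two separate files, while `read("/logs/a.txt")` still returns the original bytes. That is the sense in which merged files remain transparently accessible through a separate interface.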

Other related questions:
What is Loader used for?
Compared with conventional Extract-Transform-Load (ETL) tools, Loader has one main advantage and one main disadvantage: 1. Advantage: Loader is built on a MapReduce-based parallel computing architecture, so it processes data faster than conventional ETL. 2. Disadvantage: Loader focuses on importing data into and exporting data from FusionInsight Hadoop, so its data conversion capabilities are weaker than those of conventional ETL.
What is Yarn used for?
Yarn is the resource management system of Hadoop 2.0. It is a general-purpose resource module that manages and schedules resources for applications. Yarn can serve the MapReduce framework as well as other frameworks such as Tez, Spark, and Storm.
What is Kafka used for?
Kafka is a distributed, partitioned, and replicated publish-subscribe messaging system that provides features similar to those of the Java Message Service (JMS). Kafka offers message persistence, high throughput, distributed operation, multi-client support, and real-time processing, and it suits both online and offline message consumption. It is well suited to Internet service data collection scenarios, such as conventional data collection, website activity tracking, aggregation of operational data in statistics systems (monitoring data), and log collection.