Definition of SmallFS

The number of files that an HDFS NameNode can manage is limited by the heap memory of the node it runs on. Services can generate large numbers of small files, each of which holds little data yet still needs its own metadata in the NameNode. These files rapidly consume NameNode memory and slow the NameNode down.
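To make the memory pressure concrete, the following is a rough back-of-the-envelope sketch rather than a SmallFS or HDFS formula. It assumes the commonly quoted rule of thumb of about 150 bytes of NameNode heap per namespace object (file or block) and a 128 MiB block size. Both figures are assumptions, and the function name is hypothetical.

```python
# Assumed rule of thumb, not a measured SmallFS value:
# roughly 150 bytes of NameNode heap per namespace object (file or block).
BYTES_PER_OBJECT = 150
DEFAULT_BLOCK_SIZE = 128 * 1024 * 1024  # 128 MiB, an assumed block size


def estimate_namenode_heap(num_files: int, avg_file_size: int,
                           block_size: int = DEFAULT_BLOCK_SIZE) -> int:
    """Estimate heap bytes needed to track num_files non-empty files."""
    # Ceiling division: a non-empty file occupies at least one block.
    blocks_per_file = max(1, -(-avg_file_size // block_size))
    # One object for the file itself plus one per block.
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


MIB, GIB, TIB = 1024 ** 2, 1024 ** 3, 1024 ** 4
total_data = 10 * TIB

# The same 10 TiB of data stored as 1 MiB files versus 1 GiB files.
heap_small = estimate_namenode_heap(total_data // MIB, MIB)
heap_large = estimate_namenode_heap(total_data // GIB, GIB)

print(f"1 MiB files: {heap_small / MIB:.0f} MiB of heap")
print(f"1 GiB files: {heap_large / MIB:.0f} MiB of heap")
```

Under these assumptions the same volume of data costs a few hundred times more NameNode heap when it is stored as 1 MiB files instead of 1 GiB files, which is why the file count matters more than the data volume.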
A background small-file merging feature, SmallFS, was developed to solve this problem. SmallFS automatically detects small files in the system based on a file size threshold, merges them, and stores their metadata in a local LevelDB to reduce the NameNode load. It also provides a new FileSystem interface through which users can access these small files transparently.
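The detect, merge, and index flow described above can be illustrated with a minimal in-memory sketch. This is not the SmallFS implementation: a plain dictionary stands in for the LevelDB index, byte buffers stand in for merged files on HDFS, and every name here (`merge_small_files`, `read_small_file`, `IndexEntry`, `container_limit`) is hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    container: str  # hypothetical name of the merged file
    offset: int     # where the small file starts inside the container
    length: int     # size of the small file in bytes


def merge_small_files(files: dict[str, bytes], threshold: int,
                      container_limit: int):
    """Pack files smaller than threshold into containers.

    Returns (containers, index); index plays the role of the metadata store.
    """
    containers: dict[str, bytearray] = {}
    index: dict[str, IndexEntry] = {}
    current = None
    for path in sorted(files):
        data = files[path]
        if len(data) >= threshold:
            continue  # not a small file, so it is left untouched
        # Start a new container when the current one would overflow.
        if current is None or len(containers[current]) + len(data) > container_limit:
            current = f"container-{len(containers):04d}"
            containers[current] = bytearray()
        buf = containers[current]
        index[path] = IndexEntry(current, len(buf), len(data))
        buf.extend(data)
    return containers, index


def read_small_file(path: str, containers, index) -> bytes:
    """Resolve a small file through the index, as a transparent read would."""
    entry = index[path]
    buf = containers[entry.container]
    return bytes(buf[entry.offset:entry.offset + entry.length])


files = {"/a": b"aaa", "/b": b"bbbb", "/big": b"x" * 100, "/c": b"cc"}
containers, index = merge_small_files(files, threshold=10, container_limit=8)
print(sorted(containers), read_small_file("/b", containers, index))
```

In this sketch three small files collapse into two containers, so the namespace would track two merged files instead of three originals, while reads by original path still work through the index lookup.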
