What is the concept of RDD?

A Resilient Distributed Dataset (RDD) is the core abstraction of Spark. It is a read-only dataset that is split into partitions and distributed across the nodes of a cluster. All or part of an RDD can be cached in memory and reused in subsequent computations.
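The three properties above (read-only, partitioned, cacheable) can be sketched with a small toy model in plain Python. This is not Spark's API: the class `ToyRDD`, its methods, and the `compute_calls` counter are hypothetical names invented for illustration, and everything runs in a single process rather than on a cluster.

```python
# Toy model of the RDD idea in plain Python. This is NOT Spark's API;
# ToyRDD and all of its members are hypothetical names for illustration.
from typing import Callable, Dict, Iterable, List


class ToyRDD:
    def __init__(self, compute: Callable[[int], List], num_partitions: int):
        self._compute = compute              # how to build partition i
        self._num_partitions = num_partitions
        self._cached = False                 # whether cache() was requested
        self._cache: Dict[int, List] = {}    # partition index -> cached data
        self.compute_calls = 0               # partitions computed so far (demo only)

    @staticmethod
    def from_list(data: Iterable, num_partitions: int) -> "ToyRDD":
        snapshot = tuple(data)  # immutable copy: the source data is read-only
        # Partition i holds every num_partitions-th element, starting at i.
        return ToyRDD(lambda i: list(snapshot[i::num_partitions]), num_partitions)

    def map(self, fn: Callable) -> "ToyRDD":
        # A transformation never modifies this dataset; it returns a new one
        # that knows how to derive each of its partitions from the parent.
        return ToyRDD(
            lambda i: [fn(x) for x in self._partition(i)],
            self._num_partitions,
        )

    def cache(self) -> "ToyRDD":
        # Mark the dataset so that computed partitions are kept in memory.
        self._cached = True
        return self

    def _partition(self, i: int) -> List:
        if self._cached and i in self._cache:
            return self._cache[i]            # reuse the cached partition
        self.compute_calls += 1
        part = self._compute(i)
        if self._cached:
            self._cache[i] = part
        return part

    def collect(self) -> List:
        # Gather all partitions into one list (partition by partition).
        out: List = []
        for i in range(self._num_partitions):
            out.extend(self._partition(i))
        return out


base = ToyRDD.from_list(range(6), num_partitions=3)
squares = base.map(lambda x: x * x).cache()

first = squares.collect()   # computes and caches all three partitions
second = squares.collect()  # served from the cache, nothing is recomputed
```

In this sketch the second `collect()` does not trigger any recomputation, which is the reuse that caching is meant to provide. Caching happens per partition, so a dataset can also be only partly cached.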
