What is the concept of RDD?
Resilient Distributed Dataset (RDD) is a core concept of Spark. It is a read-only distributed dataset that can be partitioned. Some or all of the data in an RDD can be cached in memory and reused in subsequent computations.
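
The properties described above (read-only, partitioned, cacheable in part or in full) can be illustrated with a small toy model. This is a sketch in plain Python, not Spark's API: the class `ToyRDD`, its methods, and the `compute_calls` counter are hypothetical names introduced only for illustration, and everything runs in a single process rather than across a cluster.

```python
from typing import Callable, Dict, Iterable, List


class ToyRDD:
    """Hypothetical single-process stand-in for an RDD (not Spark's API)."""

    def __init__(self, compute: Callable[[int], List], num_partitions: int):
        self._compute = compute              # recipe for building partition i
        self.num_partitions = num_partitions
        self._cache_enabled = False
        self._cached: Dict[int, tuple] = {}  # partition index -> stored data
        self.compute_calls = 0               # how often a partition was rebuilt

    @classmethod
    def parallelize(cls, data: Iterable, num_partitions: int) -> "ToyRDD":
        items = tuple(data)  # immutable snapshot: the dataset is read-only
        size = len(items)

        def compute(index: int) -> List:
            # Split the data into contiguous, roughly equal partitions.
            start = index * size // num_partitions
            end = (index + 1) * size // num_partitions
            return list(items[start:end])

        return cls(compute, num_partitions)

    def map(self, fn: Callable) -> "ToyRDD":
        # A transformation never modifies the parent; it returns a new dataset.
        parent = self
        return ToyRDD(
            lambda index: [fn(x) for x in parent.partition(index)],
            self.num_partitions,
        )

    def cache(self) -> "ToyRDD":
        # Mark the dataset so computed partitions are kept in memory.
        self._cache_enabled = True
        return self

    def partition(self, index: int) -> List:
        # Reuse a cached partition if present; otherwise compute it.
        if index in self._cached:
            return list(self._cached[index])
        self.compute_calls += 1
        data = self._compute(index)
        if self._cache_enabled:
            self._cached[index] = tuple(data)
        return list(data)

    def collect(self) -> List:
        # Gather all partitions, in order, into one list.
        result: List = []
        for index in range(self.num_partitions):
            result.extend(self.partition(index))
        return result


numbers = ToyRDD.parallelize(range(6), num_partitions=3)
squares = numbers.map(lambda x: x * x).cache()

first = squares.collect()   # computes and caches all three partitions
second = squares.collect()  # served from the in-memory cache
```

Because the cache is kept per partition, reading only `squares.partition(0)` would cache that one partition and leave the others uncomputed, which mirrors the idea that part of a dataset can be cached. In this sketch, `compute_calls` stays at 3 after the second `collect()`, showing that cached data is reused rather than recomputed.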