What is Colocation

In the context of HDFS, colocation is the ability to write a set of files to the same set of DataNodes. Each DataNode in that set stores one replica of every block of every file in the set.
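
The placement rule described above can be illustrated with a small toy model. This is only a sketch, not the HDFS API: the `ToyCluster` class, its methods, the group name "orders", the node names, and the file paths are all hypothetical, and the replication factor of 3 is an assumption made for the example.

```python
from collections import defaultdict


class ToyCluster:
    """Toy model of colocated block placement. Hypothetical, not the HDFS API."""

    def __init__(self, datanodes, replication=3):
        self.datanodes = list(datanodes)
        self.replication = replication    # assumed replication factor
        self.groups = {}                  # group id -> tuple of DataNode names
        self.blocks = defaultdict(list)   # DataNode name -> [(path, block index)]

    def create_group(self, group_id, nodes):
        """Pin a colocation group to exactly `replication` distinct DataNodes."""
        if len(set(nodes)) != self.replication:
            raise ValueError("a group needs one distinct DataNode per replica")
        unknown = set(nodes) - set(self.datanodes)
        if unknown:
            raise ValueError(f"unknown DataNodes: {sorted(unknown)}")
        self.groups[group_id] = tuple(nodes)

    def write_file(self, path, num_blocks, group_id):
        """Place one replica of every block of the file on each group node."""
        for index in range(num_blocks):
            for node in self.groups[group_id]:
                self.blocks[node].append((path, index))

    def nodes_holding(self, path):
        """Return the set of DataNodes that hold at least one block of `path`."""
        return {
            node
            for node, stored in self.blocks.items()
            if any(p == path for p, _ in stored)
        }


# Two files written to the same group land on the same three DataNodes.
cluster = ToyCluster(["dn1", "dn2", "dn3", "dn4", "dn5"])
cluster.create_group("orders", ["dn1", "dn3", "dn4"])
cluster.write_file("/data/orders_a", num_blocks=2, group_id="orders")
cluster.write_file("/data/orders_b", num_blocks=3, group_id="orders")

print(sorted(cluster.nodes_holding("/data/orders_a")))
print(sorted(cluster.nodes_holding("/data/orders_b")))
```

In this model both files resolve to the same set of nodes, and each of those nodes holds one replica of all five blocks, which is the property the definition describes.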

