Philosophy behind YARN Resource Management by Bikas Saha.
From the post:
YARN is part of the next generation Hadoop cluster compute environment. It creates a generic and flexible resource management framework to administer the compute resources in a Hadoop cluster. The YARN application framework allows multiple applications to negotiate resources for themselves and perform their application specific computations on a shared cluster. Thus, resource allocation lies at the heart of YARN.
YARN ultimately opens up Hadoop to additional compute frameworks, like Tez, so that an application can optimize compute for their specific requirements.
The YARN Resource Manager service is the central controlling authority for resource management and makes allocation decisions. It exposes a Scheduler API that is specifically designed to negotiate resources and not schedule tasks. Applications can request resources at different layers of the cluster topology such as nodes, racks etc. The scheduler determines how much and where to allocate based on resource availability and the configured sharing policy.
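To make the negotiation idea concrete, here is a minimal toy sketch of locality-aware resource requests. It is not the YARN Scheduler API: the names `Node`, `Request`, `allocate`, and the `*` wildcard are hypothetical, memory is the only resource modeled, and a simple first-fit rule stands in for the configured sharing policy. Note that the function hands back resources only; what runs in them is left to the application.

```python
from dataclasses import dataclass

ANY = "*"  # hypothetical wildcard meaning "any node in the cluster"


@dataclass
class Node:
    name: str
    rack: str
    free_mb: int


@dataclass
class Request:
    location: str   # a node name, a rack name, or ANY
    memory_mb: int  # size of each container
    count: int      # number of containers wanted


def allocate(nodes, request):
    """Grant containers first-fit; return one node name per granted container."""
    granted = []
    for node in nodes:
        # A request matches a node by exact name, by rack, or via the wildcard.
        matches = request.location in (ANY, node.name, node.rack)
        while (matches
               and len(granted) < request.count
               and node.free_mb >= request.memory_mb):
            node.free_mb -= request.memory_mb
            granted.append(node.name)
    return granted


nodes = [
    Node("node1", "rack1", 4096),
    Node("node2", "rack1", 2048),
    Node("node3", "rack2", 8192),
]

on_node = allocate(nodes, Request("node2", 1024, 1))   # node-level request
on_rack = allocate(nodes, Request("rack1", 2048, 3))   # rack-level request
anywhere = allocate(nodes, Request(ANY, 4096, 2))      # cluster-wide request

print(on_node, on_rack, anywhere)
```

In this sketch the rack-level request asks for three containers but receives only two, because rack1 runs out of free memory: how much is granted, and where, depends on availability rather than on the request alone.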
If YARN does become the cluster operating system, knowing the “why” of its behavior will be as important as knowing the “how.”