Configuring fair scheduler 

YARN supports pluggable schedulers, and the capacity scheduler is the one configured by default. In this section, we will see how to enable the fair scheduler and set up its queues. By default, the fair scheduler lets all applications run, but it can be configured to limit the number of applications that run per user and per queue. Restricting the number of applications per user is sometimes important so that applications submitted by other users do not wait too long in the queue.

The first configuration change goes in yarn-site.xml, where we set the scheduler class so that YARN uses the fair scheduler:

<property>
     <name>yarn.resourcemanager.scheduler.class</name>
     <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>

The next step is to specify the location of the scheduler configuration file in yarn-site.xml by adding the following:

<property>
<name>yarn.scheduler.fair.allocation.file</name>
<value>/opt/packt/Hadoop/etc/Hadoop/fair-scheduler.xml</value>
</property>

Once this is done, the remaining changes go in the fair-scheduler.xml file. The first change in fair-scheduler.xml is to define the queue allocation policy, which looks as follows:

<?xml version="1.0"?>
<allocations>
  <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
  <queue name="root">
    <queue name="dev">
      <weight>40</weight>
    </queue>

    <queue name="prod">
      <weight>60</weight>
      <queue name="marketing" />
      <queue name="finance" />
      <queue name="sales" />
    </queue>
  </queue>
  <queuePlacementPolicy>
    <rule name="specified" create="false" />
    <rule name="primaryGroup" create="false" />
  </queuePlacementPolicy>
</allocations>

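The preceding allocation configuration defines a root queue, which is the parent of every other queue, so all jobs submitted to YARN are placed somewhere under root. The prod and dev queues are two subqueues of root, and prod is further divided into the marketing, finance, and sales subqueues. The weights of 60 and 40 mean that, when both queues have demand, prod is entitled to 60 percent of the resources and dev to 40 percent.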
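
The per-queue and per-user application limits mentioned at the start of this section are set in the same allocation file. The following is a minimal sketch rather than a recommended setup: the user name alice and all of the numeric limits are hypothetical values chosen for illustration, and only the dev queue is shown.

```xml
<?xml version="1.0"?>
<allocations>
  <queue name="root">
    <queue name="dev">
      <weight>40</weight>
      <!-- Hypothetical limit: at most 20 applications run at once in dev -->
      <maxRunningApps>20</maxRunningApps>
    </queue>
  </queue>

  <!-- Hypothetical user "alice" may run at most 3 applications at once -->
  <user name="alice">
    <maxRunningApps>3</maxRunningApps>
  </user>

  <!-- Assumed fallback limit for users that have no <user> entry -->
  <userMaxAppsDefault>5</userMaxAppsDefault>
</allocations>
```

With limits like these in place, applications submitted beyond a limit should stay pending in their queue until a running application finishes, instead of all running at once.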