Job and Task Scheduling In Hadoop


I am a little confused about the terms "job scheduling" and "task scheduling" in Hadoop. I came across them while reading about delayed fair scheduling in this slide.

Please correct me if I am wrong in my following assumptions:

  1. The default scheduler, the Capacity scheduler and the Fair scheduler only matter at the job level, when a user has submitted multiple jobs. They play no role if there is only a single job in the system. These scheduling algorithms form the basis for "job scheduling".

  2. Each job can have multiple map and reduce tasks. How are these tasks assigned to machines? How are tasks scheduled within a single job, and what is the basis for "task scheduling"?

Oct 31, 2018 in Big Data Hadoop by Frankie
• 9,830 points

1 answer to this question.


In the case of the Fair Scheduler, when only a single job is running, that job uses the entire cluster. When other jobs are submitted, task slots that free up are assigned to the new jobs, so that each job gets roughly the same amount of CPU time.
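To make the "freed slots go to the new jobs" idea concrete, here is a minimal sketch in Python. It is not Hadoop's implementation: the function name `assign_free_slot` and the two dictionaries are hypothetical, and the rule used (give the slot to the job with the fewest running tasks) is a simplification of fair sharing that ignores weights, pools and data locality.

```python
def assign_free_slot(running, pending):
    """Pick the job that receives a freed task slot (illustrative only).

    running: dict mapping job name -> number of tasks currently running
    pending: dict mapping job name -> number of tasks still waiting
    Returns the chosen job name, or None if no job has waiting tasks.
    Both dictionaries are updated in place.
    """
    candidates = [job for job, waiting in pending.items() if waiting > 0]
    if not candidates:
        return None
    # The job with the fewest running tasks is the furthest below its
    # equal share; ties are broken by name to keep the result deterministic.
    chosen = min(candidates, key=lambda job: (running.get(job, 0), job))
    running[chosen] = running.get(chosen, 0) + 1
    pending[chosen] -= 1
    return chosen


if __name__ == "__main__":
    # Job A holds all 10 slots; job B has just been submitted.
    running = {"A": 10, "B": 0}
    pending = {"A": 90, "B": 20}
    running["A"] -= 1  # one of A's tasks finishes and frees a slot
    print(assign_free_slot(running, pending), running)
```

With a single job in `pending`, every freed slot goes back to that job, which matches the statement that one job alone uses the entire cluster.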

Unlike the default Hadoop scheduler, which forms a queue of jobs, this lets short jobs finish in reasonable time while not starving long jobs. It is also an easy way to share a cluster among multiple users. Fair sharing can also work with job priorities: the priorities are used as weights to determine the fraction of total compute time that each job gets.
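The weighting idea can be illustrated with a small calculation. The sketch below is an assumption-laden example, not the scheduler's code: `fair_shares` is a hypothetical helper, and the weight values are made up for illustration (the actual mapping from Hadoop job priorities to weights is not reproduced here).

```python
def fair_shares(total_slots, weights):
    """Split total_slots among jobs in proportion to their weights.

    weights: dict mapping job name -> positive weight (hypothetical values
    standing in for job priorities).
    Returns a dict mapping job name -> share of slots (may be fractional).
    """
    if any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be positive")
    total_weight = sum(weights.values())
    return {job: total_slots * w / total_weight for job, w in weights.items()}


if __name__ == "__main__":
    # Three jobs on a 12-slot cluster; "urgent" has twice the weight.
    print(fair_shares(12, {"short": 1, "long": 1, "urgent": 2}))
```

With equal weights every job gets the same share, which is the plain fair-sharing case described above.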

The CapacityScheduler is designed to allow sharing a large cluster while giving each organization a minimum capacity guarantee. The central idea is that the available resources in the Hadoop MapReduce cluster are partitioned among multiple organizations that collectively fund the cluster based on their computing needs. An added benefit is that an organization can access any excess capacity not being used by others. This provides elasticity for the organizations in a cost-effective manner.
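The guarantee-plus-elasticity idea can be sketched as a two-step allocation. This is only an illustration under stated assumptions, not the CapacityScheduler's actual algorithm: `allocate_capacity`, the organization names and the slot counts are all hypothetical, guarantees are assumed to be positive integers that sum to at most the cluster size, and real features such as queues and per-user limits are left out.

```python
def allocate_capacity(total_slots, guaranteed, demand):
    """Allocate slots with minimum guarantees and borrowing of spare capacity.

    guaranteed: dict mapping organization -> guaranteed slots (positive ints)
    demand:     dict mapping organization -> slots it currently wants
    Returns a dict mapping organization -> slots allocated.
    """
    # Step 1: every organization gets what it asks for, up to its guarantee.
    alloc = {org: min(demand.get(org, 0), guaranteed[org]) for org in guaranteed}
    spare = total_slots - sum(alloc.values())

    # Step 2: lend unused capacity, one slot at a time, to the organization
    # with unmet demand that is lowest relative to its guarantee.
    while spare > 0:
        wanting = [org for org in guaranteed if demand.get(org, 0) > alloc[org]]
        if not wanting:
            break
        org = min(wanting, key=lambda o: (alloc[o] / guaranteed[o], o))
        alloc[org] += 1
        spare -= 1
    return alloc


if __name__ == "__main__":
    guaranteed = {"ads": 50, "search": 30, "research": 20}
    demand = {"ads": 10, "search": 60, "research": 20}
    print(allocate_capacity(100, guaranteed, demand))
```

In the example, "ads" uses only part of its guarantee, so "search" can borrow beyond its own 30 slots. When every organization is busy, each simply receives its guarantee.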


answered Oct 31, 2018 by Neha
• 6,300 points
