Before we jump into the good stuff, let’s ask ourselves:
What is CPU scheduling, and why do we need it?
You did? Good. Moving on now (bad joke).
What & Why
CPU scheduling is the act of determining which process should get the CPU next and start (or continue) executing.
The scheduler is itself a process, one that controls when other processes get to run based on parameters we will talk about below.
As for the ‘Why’, I believe that once you understand what a CPU scheduler does, it becomes clear that working without one would be nearly impossible.
There are two main types of schedulers — Preemptive and non-preemptive schedulers.
If a scheduler is preemptive, it might decide at some point that process A has had enough CPU time for now and hand the CPU to another process.
A non-preemptive scheduler doesn’t support this behavior: the CPU is yielded only when a process terminates, or when it sleeps while waiting for some I/O operation.
How Do We Measure Schedulers
There are a few main metrics we will focus on, but before we do, let’s try to give an illustration of what a schedule might look like:
In the above illustration, you can see that our machine has 3 cores.
The numbers indicate the order of arrival.
The first job came and demanded 1 core for 3 time units, then the second one came and demanded 2 cores for 5 time units, and so on.
Utilization is defined by the percentage of time that our CPU is busy.
In the case above we have 18 available blocks but only 16 of them are being used, meaning that the utilization here is 16/18 ≈ 0.889 (88.9%).
Throughput is defined by how much work is done per time unit.
In our case, 3 processes finish their execution in 6 time units, meaning that our throughput is 0.5.
Wait time is defined by the difference between the time the job was submitted and the time it actually started to run.
In our case, job 3 could hypothetically be submitted at time unit 2, but at that point jobs 1 and 2 occupied all the resources, which forced job 3 to wait until enough resources were free to start running.
Response time is defined by the difference between the time the job was submitted and the termination time.
Assuming job 3 was submitted at time unit 2 and terminated at time unit 6, its response time is 4.
We want to maximize utilization and throughput, and to minimize wait and response time.
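The metrics above can be computed directly from the example’s numbers. Here’s a quick sketch in Python; note that job 3 starting at time unit 3 is my assumption (the text only says it waited), while the rest comes straight from the example:

```python
# Metrics for the example machine: 3 cores observed over 6 time units.
total_blocks = 3 * 6             # 3 cores * 6 time units = 18 blocks
used_blocks = 16                 # blocks actually occupied by jobs
utilization = used_blocks / total_blocks   # ~0.889

throughput = 3 / 6               # 3 jobs finish within 6 time units -> 0.5

# Job 3's timeline (start time t=3 is an assumption for illustration).
submit, start, end = 2, 3, 6
wait_time = start - submit       # submission until it first runs
response_time = end - submit     # submission until termination

print(utilization, throughput, wait_time, response_time)
```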
First-Come First-Served (FCFS)
The name is pretty self-explanatory — Jobs are scheduled by their arrival time.
If there are enough free cores, an arriving job will start to run immediately.
Otherwise, it waits until enough cores are freed.
The above diagram illustrates how FCFS would work, and we can immediately see room for optimization.
As we see, job 4 only requires two cores for a single time unit, so it could be scheduled on the unutilized cores.
- Easy to implement — FIFO wait queue
- Perceived as most fair
- Creates fragmentation — the unutilized cores
- Small or short jobs might wait for a long time
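A minimal FCFS simulator can make the “only the head may start” rule concrete. This is a sketch under my own assumptions: jobs are `(arrival, cores, duration)` tuples, and the example jobs below are a reconstruction in the spirit of the illustration, not exact values from it:

```python
from collections import deque

def fcfs(jobs, total_cores):
    """First-Come First-Served on a multi-core machine.
    jobs: list of (arrival, cores, duration). Returns {job_index: start_time}."""
    pending = deque(sorted(enumerate(jobs), key=lambda x: x[1][0]))
    queue = deque()            # FIFO wait queue
    running = []               # (end_time, cores) of running jobs
    free, start, t = total_cores, {}, 0
    while pending or queue or running:
        # free the cores of jobs that finished by time t
        free += sum(c for e, c in running if e <= t)
        running = [(e, c) for e, c in running if e > t]
        # move newly arrived jobs into the wait queue
        while pending and pending[0][1][0] <= t:
            queue.append(pending.popleft())
        # strict FCFS: only the head of the queue may start
        while queue and queue[0][1][1] <= free:
            i, (_, cores, duration) = queue.popleft()
            start[i] = t
            free -= cores
            running.append((t + duration, cores))
        t += 1
    return start

# On 3 cores: job 0 (1 core, 3 units) and job 1 (2 cores, 5 units) start
# immediately; job 2 arrives at t=2 but must wait until both finish.
print(fcfs([(0, 1, 3), (0, 2, 5), (2, 2, 2)], total_cores=3))
# -> {0: 0, 1: 0, 2: 5}
```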
FCFS With Backfilling
This variation of FCFS reduces the number of unutilized cores.
Whenever a job arrives or terminates, we try to start the head of the wait queue — as we did in the original FCFS.
Then, iterate over the waiting jobs and try to backfill them.
Backfilling happens when a short waiting job can “jump over” the head of the wait queue without delaying its start time.
As you can see, job 3 wasn’t delayed: job 4 jumped over it and executed while job 3 waited for enough resources.
- Less fragmentation — better utilization
- Must know runtimes in advance in order to calculate the size of the “holes” and to know which candidates can be backfilled.
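The backfilling check can be sketched as follows. This follows the common EASY-backfilling idea rather than any specific implementation, and all the names are mine: `shadow_time` is the earliest time the head of the queue can start, and `extra_cores` is how many cores remain at that moment after the head starts. A waiting job may jump the queue if it fits now and either finishes before `shadow_time` or fits in the extra cores:

```python
def try_backfill(waiting, free_cores, now, shadow_time, extra_cores):
    """Start waiting jobs that can jump the queue head without delaying it.
    waiting: list of {"id", "cores", "duration"} dicts (started jobs are
    removed in place). Returns (started_ids, remaining_free_cores)."""
    started = []
    for job in list(waiting):
        cores, duration = job["cores"], job["duration"]
        if cores > free_cores:
            continue                  # not enough idle cores right now
        fits_in_hole = now + duration <= shadow_time
        fits_beside_head = cores <= extra_cores
        if fits_in_hole or fits_beside_head:
            waiting.remove(job)
            free_cores -= cores
            if fits_beside_head:
                extra_cores -= cores  # those cores stay taken past shadow_time
            started.append(job["id"])
    return started, free_cores
```

For instance, a 2-core job lasting 1 unit can backfill into a 2-core hole that closes at `shadow_time`, while a long 3-core job behind it keeps waiting.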
Shortest-Job First (SJF)
Unlike FCFS, instead of ordering jobs by their arrival time, we order jobs by their estimated runtime.
This algorithm is optimal with respect to average wait time; let’s try to build some intuition as to why.
Let’s assume that performing FCFS led us to this point:
Let’s try to think how it would be different with SJF and compute the respective average wait time.
Regarding the FCFS scheduler (first illustration):
- job 1 waits 0 time units
- job 2 waits 3 time units
- job 3 waits 4 time units
Hence, the average wait time is (0+3+4)/3 = 7/3
Let’s do the same for the SJF scheduler (second illustration):
- job 1 waits 2 time units
- job 2 waits 0 time units
- job 3 waits 1 time unit
The average wait time, in this case, is (2+0+1)/3 = 1
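The two calculations above can be reproduced in a few lines. This sketch assumes a single core and all jobs submitted at t=0, with durations of 3, 1, and 1 time units for jobs 1–3 (inferred from the wait times in the example, not stated explicitly in it):

```python
def avg_wait(durations, order):
    """Average wait time on one core; all jobs submitted at t=0.
    order: job indices in the order they are executed."""
    t, total = 0, 0
    for i in order:
        total += t          # job i waits until the clock reaches t
        t += durations[i]
    return total / len(durations)

durations = [3, 1, 1]                     # jobs 1..3, inferred from the text
fcfs_wait = avg_wait(durations, [0, 1, 2])                       # arrival order
sjf_wait = avg_wait(durations, sorted(range(3), key=lambda i: durations[i]))
print(fcfs_wait, sjf_wait)                # 7/3 vs. 1.0
```

Running the short jobs first means the long job’s runtime is counted in fewer jobs’ wait times, which is exactly why SJF wins on this metric.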
There are many more scheduling algorithms out there which I will cover in the future.
We didn’t even get to talk about preemptive schedulers, so there’s plenty more stuff to discuss, but let’s take a quick rest at this point.