Optimal job scheduling

Optimal job scheduling is a class of optimization problems related to scheduling. The inputs to such problems are a list of jobs (also called processes or tasks) and a list of machines (also called processors or workers). The required output is a schedule – an assignment of jobs to machines. The schedule should optimize a certain objective function. In the literature, problems of optimal job scheduling are often called machine scheduling, processor scheduling, multiprocessor scheduling, or just scheduling.

There are many different problems of optimal job scheduling, differing in the nature of jobs, the nature of machines, the restrictions on the schedule, and the objective function. A convenient notation for optimal scheduling problems was introduced by Ronald Graham, Eugene Lawler, Jan Karel Lenstra and Alexander Rinnooy Kan.[1][2] It consists of three fields: α, β and γ. Each field may be a comma-separated list of words. The α field describes the machine environment, β the job characteristics and constraints, and γ the objective function.[3] Since its introduction in the late 1970s the notation has been constantly extended, sometimes inconsistently. As a result, some problems now appear under different notations in different papers.
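As a rough illustration of the three-field structure, the sketch below splits an ASCII rendering such as `1|prec|Lmax` into its α, β and γ parts. The names `ThreeField` and `parse_three_field`, and the use of `|` as the separator in plain text, are assumptions made for this example rather than part of the notation itself.

```python
from typing import NamedTuple


class ThreeField(NamedTuple):
    alpha: list[str]  # machine environment
    beta: list[str]   # job characteristics and constraints
    gamma: list[str]  # objective function


def parse_three_field(text: str) -> ThreeField:
    """Split an ASCII 'alpha|beta|gamma' string into its three fields.

    Each field is treated as a comma-separated list of words; an empty
    field (as in 'P2||Cmax') becomes an empty list.
    """
    parts = text.split("|")
    if len(parts) != 3:
        raise ValueError(f"expected exactly three fields, got {len(parts)}")
    fields = [[w.strip() for w in part.split(",") if w.strip()] for part in parts]
    return ThreeField(*fields)


if __name__ == "__main__":
    print(parse_three_field("1|prec|Lmax"))
    print(parse_three_field("P2||Cmax"))
```

The parser only separates the fields; it does not check that the words in each field are valid entries of the notation.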

Single-stage jobs vs. multi-stage jobs

In the simpler optimal job scheduling problems, each job $j$ consists of a single execution phase, with a given processing time $p_j$. In more complex variants, each job consists of several execution phases, which may be executed in sequence or in parallel.

Machine environments

In single-stage job scheduling problems, there are four main categories of machine environments:

  • 1: Single-machine scheduling. There is a single machine.
  • P: Identical-machines scheduling. There are $m$ parallel machines, and they are identical. Job $j$ takes time $p_j$ on any machine it is scheduled on.
  • Q: Uniform-machines scheduling. There are $m$ parallel machines, and they have different given speeds. Job $j$ on machine $i$ takes time $p_j / s_i$.
  • R: Unrelated-machines scheduling. There are $m$ parallel machines, and they are unrelated: job $j$ on machine $i$ takes time $p_{ij}$.

These letters might be followed by the number of machines, which is then fixed. For example, P2 indicates that there are two parallel identical machines. Pm indicates that there are m parallel identical machines, where m is a fixed parameter. In contrast, P indicates that there are m parallel identical machines, but m is not fixed (it is part of the input).
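The difference between the four single-stage environments comes down to how the running time of job $j$ on machine $i$ is determined. A minimal sketch, assuming 0-based indices and a hypothetical helper named `processing_time` (neither is part of the notation):

```python
def processing_time(env, j, i, p, s=None):
    """Return the time job j takes on machine i in environment env.

    env : one of "1", "P", "Q", "R"
    p   : list p[j] for "1", "P", "Q"; matrix p[i][j] for "R"
    s   : list of machine speeds s[i], needed only for "Q"
    """
    if env in ("1", "P"):
        return p[j]          # identical machines: time depends on the job only
    if env == "Q":
        return p[j] / s[i]   # uniform machines: time scales with machine speed
    if env == "R":
        return p[i][j]       # unrelated machines: arbitrary per-pair times
    raise ValueError(f"unknown machine environment: {env!r}")


if __name__ == "__main__":
    print(processing_time("P", j=0, i=1, p=[6, 4]))
    print(processing_time("Q", j=0, i=1, p=[6, 4], s=[1, 2]))
    print(processing_time("R", j=1, i=0, p=[[6, 4], [3, 9]]))
```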

In multi-stage job scheduling problems, there are other options for the machine environments:

  • O: Open-shop problem. Every job $j$ consists of $m$ operations $O_{ij}$ for $i = 1, \ldots, m$. The operations can be scheduled in any order. Operation $O_{ij}$ must be processed for $p_{ij}$ units on machine $i$.
  • F: Flow-shop problem. Every job $j$ consists of $m$ operations $O_{ij}$ for $i = 1, \ldots, m$, to be scheduled in the given order. Operation $O_{ij}$ must be processed for $p_{ij}$ units on machine $i$.
  • J: Job-shop problem. Every job $j$ consists of $n_j$ operations $O_{kj}$ for $k = 1, \ldots, n_j$, to be scheduled in that order. Operation $O_{kj}$ must be processed for $p_{kj}$ units on a dedicated machine $\mu_{kj}$ with $\mu_{kj} \neq \mu_{k'j}$ for $k \neq k'$.
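To make the flow-shop setting concrete, the following sketch computes the makespan of a flow-shop schedule in which every machine processes the jobs in the same order. Restricting attention to such permutation schedules is an assumption of this example, as are the function name `flow_shop_makespan` and the 0-based indexing; each operation is started as early as possible.

```python
def flow_shop_makespan(p, order):
    """Makespan of a permutation flow-shop schedule.

    p[i][j] : processing time of operation O_ij (job j on machine i)
    order   : sequence of job indices, used on every machine
    """
    m = len(p)
    done = [0] * m  # done[i]: completion time of the latest operation on machine i
    for j in order:
        prev = 0  # completion time of job j's previous operation
        for i in range(m):
            # An operation starts once its machine is free and the job's
            # preceding operation has finished.
            start = max(done[i], prev)
            done[i] = start + p[i][j]
            prev = done[i]
    return done[-1] if m else 0


if __name__ == "__main__":
    p = [[3, 2],   # machine 0: times of jobs 0 and 1
         [1, 4]]   # machine 1: times of jobs 0 and 1
    print(flow_shop_makespan(p, [0, 1]))  # 9
    print(flow_shop_makespan(p, [1, 0]))  # 7
```

In this small instance the job order alone changes the makespan from 9 to 7, which is why choosing the order is the core of the optimization problem.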

Job characteristics

All processing times are assumed to be integers. In some older research papers, however, they are assumed to be rationals.

  • $p_j = p$, or $p_{ij} = p$: the processing time is equal for all jobs.
  • $p_j = 1$, or $p_{ij} = 1$: the processing time is equal to 1 time-unit for all jobs.
  • $r_j$: for each job a release time is given, before which it cannot be scheduled. The default is 0.
  • $\text{online-}r_j$: an online problem. Jobs are revealed at their release times. In this context the performance of an algorithm is measured by its competitive ratio.
  • $d_j$: for each job a due date is given. The idea is that every job should complete before its due date, and there is some penalty for jobs that complete late. This penalty is reflected in the objective value. The presence of the job characteristic $d_j$ is implicitly assumed and not denoted in the problem name, unless there are some restrictions, for example $d_j = d$, which means that all due dates are equal to some given date.
  • $\bar{d}_j$: for each job a strict deadline is given. Every job must complete before its deadline.
  • pmtn: Jobs can be preempted and resumed, possibly on another machine. Sometimes also denoted by 'prmp'.
  • $\text{size}_j$: Each job comes with a number of machines on which it must be scheduled at the same time. The default is 1. This is an important parameter in the variant called parallel task scheduling.

Precedence relations

Each pair of jobs may or may not have a precedence relation. A precedence relation between two jobs means that one job must be finished before the other job starts. For example, if job $i$ is a predecessor of job $j$, then job $j$ can only start once job $i$ is completed.

  • prec: There are no restrictions placed on the precedence relations.
  • chains: Each job is the predecessor of at most one other job and is preceded by at most one other job.
  • tree: The precedence relations must satisfy one of the following two restrictions.
    • intree: Each job is the predecessor of at most one other job.
    • outtree: Each job is preceded by at most one other job.
  • opposing forest: If the graph of precedence relations is split into connected components, then each connected component is either an intree or an outtree.
  • sp-graph: The graph of precedence relations is a series parallel graph.
  • bounded height: The length of the longest directed path is capped at a fixed value. (A directed path is a sequence of jobs where each job except the last is a predecessor of the next job in the sequence.)
  • level order: Each job has a level, which is the length of the longest directed path starting from that job. Each job with level $k$ is a predecessor of every job with level $k-1$.
  • interval order: Each job $x$ has an interval $[s_x, e_x)$, and job $x$ is a predecessor of $y$ if and only if the end of the interval of $x$ is strictly less than the start of the interval of $y$.
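The chains, intree and outtree restrictions are stated purely in terms of how many direct predecessors and successors each job has, so they can be checked by counting degrees. The sketch below follows the definitions in the list above; it assumes the input is a list of direct precedence pairs `(i, j)`, meaning job `i` is a predecessor of job `j`, that these pairs form an acyclic graph, and the function name `precedence_classes` is made up for this example.

```python
from collections import Counter


def precedence_classes(edges):
    """Return which of the restrictions 'chains', 'intree', 'outtree' hold.

    edges : iterable of pairs (i, j), job i is a direct predecessor of job j.
    The unrestricted class 'prec' is always included.
    """
    edges = list(edges)
    successors = Counter(i for i, _ in edges)    # jobs that i directly precedes
    predecessors = Counter(j for _, j in edges)  # jobs that directly precede j
    max_succ = max(successors.values(), default=0)
    max_pred = max(predecessors.values(), default=0)

    classes = {"prec"}
    if max_succ <= 1:
        classes.add("intree")   # each job is the predecessor of at most one job
    if max_pred <= 1:
        classes.add("outtree")  # each job is preceded by at most one job
    if max_succ <= 1 and max_pred <= 1:
        classes.add("chains")
    return classes


if __name__ == "__main__":
    print(precedence_classes([(1, 3), (2, 3), (3, 4)]))  # jobs 1 and 2 both feed job 3
    print(precedence_classes([(1, 2), (2, 3)]))          # a single chain
```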

In the presence of a precedence relation one might in addition assume time lags. The time lag between two jobs is the amount of time that must elapse after the first job is complete before the second job can begin. Formally, if job $i$ precedes job $j$, then $C_i + \ell_{ij} \leq S_j$ must hold, where $C_i$ is the completion time of job $i$ and $S_j$ is the start time of job $j$. If no time lag $\ell_{ij}$ is specified then it is assumed to be zero. Time lags can also be negative. A negative time lag means that the second job can begin a fixed time before the first job finishes.

  • $\ell$: The time lag is the same for each pair of jobs.
  • $\ell_{ij}$: Different pairs of jobs can have different time lags.
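The time-lag condition can be checked directly against a candidate schedule. The following is a minimal sketch, assuming the schedule is given as dictionaries of start and completion times keyed by job, and that the lags are given as a dictionary keyed by precedence pair; the name `respects_time_lags` is hypothetical.

```python
def respects_time_lags(start, completion, lags):
    """Check C_i + l_ij <= S_j for every precedence pair (i, j).

    start[j]      : start time S_j of job j
    completion[i] : completion time C_i of job i
    lags[(i, j)]  : time lag l_ij (may be negative); use 0 for "no lag"
    """
    return all(completion[i] + lag <= start[j] for (i, j), lag in lags.items())


if __name__ == "__main__":
    start = {1: 0, 2: 5}
    completion = {1: 3, 2: 7}
    print(respects_time_lags(start, completion, {(1, 2): 2}))  # True:  3 + 2 <= 5
    print(respects_time_lags(start, completion, {(1, 2): 3}))  # False: 3 + 3 >  5
```

With a negative lag such as `{(1, 2): -1}`, job 2 would be allowed to start as early as time 2, one unit before job 1 completes.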

Transportation delays

  • $t_{jk}$: Between the completion of operation $O_{kj}$ of job $j$ on machine $k$ and the start of operation $O_{k+1,j}$ of job $j$ on machine $k+1$, there is a transportation delay of at least $t_{jk}$ units.
  • $t_{jkl}$: Between the completion of operation $O_{kj}$ of job $j$ on machine $k$ and the start of operation $O_{l,j}$ of job $j$ on machine $l$, there is a transportation delay of at least $t_{jkl}$ units.
  • $t_k$: Machine-dependent transportation delay. Between the completion of operation $O_{kj}$ of job $j$ on machine $k$ and the start of operation $O_{k+1,j}$ of job $j$ on machine $k+1$, there is a transportation delay of at least $t_k$ units.
  • $t_{kl}$: Machine-pair-dependent transportation delay. Between the completion of operation $O_{kj}$ of job $j$ on machine $k$ and the start of operation $O_{l,j}$ of job $j$ on machine $l$, there is a transportation delay of at least $t_{kl}$ units.
  • $t_j$: Job-dependent transportation delay. Between the completion of operation $O_{kj}$ of job $j$ on machine $k$ and the start of operation $O_{l,j}$ of job $j$ on machine $l$, there is a transportation delay of at least $t_j$ units.

Various constraints

  • rcrc: Also known as recirculation or flexible job shop. The promise on $\mu$ is lifted, and for some pairs $k \neq k'$ we might have $\mu_{kj} = \mu_{k'j}$. In other words, it is possible for different operations of the same job to be assigned to the same machine.
  • no-wait: The operation $O_{k+1,i}$ must start exactly when operation $O_{k,i}$ completes. In other words, once one operation of a job finishes, the next operation must begin immediately. Sometimes also denoted as 'nwt'.
  • no-idle: No machine may ever be idle between the start of its first execution and the end of its last execution.
  • $\text{size}_j$: Multiprocessor tasks on identical parallel machines. The execution of job $j$ is done simultaneously on $\text{size}_j$ parallel machines.
  • $\text{fix}_j$: Multiprocessor tasks. Every job $j$ is given with a set of machines $\text{fix}_j \subseteq \{1, \ldots, m\}$, and needs all these machines simultaneously for execution. Sometimes also denoted by 'MPT'.
  • $M_j$: Multipurpose machines. Every job $j$ needs to be scheduled on one machine out of a given set $M_j \subseteq \{1, \ldots, m\}$. Sometimes also denoted by Mj.

Objective functions

Usually the goal is to minimize some objective value. One exception is the notation $\sum U_j$, where the goal is to maximize the number of jobs that complete before their deadline. This is also called the throughput. The objective value can be a sum, possibly weighted by given priority weights $w_j$ per job.

  • -: The absence of an objective value is denoted by a single dash. This means that the problem consists simply in producing a feasible scheduling, satisfying all given constraints.
  • $C_j$: The completion time of job $j$. $C_{\max}$ is the maximum completion time, also known as the makespan. Sometimes we are interested in the mean completion time (the average of $C_j$ over all $j$), which is sometimes denoted by mft (mean finish time).[4]
  • $F_j$: The flow time of a job is the difference between its completion time and its release time, i.e. $F_j = C_j - r_j$.
  • $L_j$: Lateness. Every job $j$ is given a due date $d_j$. The lateness of job $j$ is defined as $C_j - d_j$. Sometimes $L_{\max}$ is used to denote feasibility for a problem with deadlines. Indeed, using binary search, the complexity of the feasibility version is equivalent to the minimization of $L_{\max}$.
  • $U_j$: Throughput. Every job is given a due date $d_j$. There is a unit profit for jobs that complete on time, i.e. $U_j = 1$ if $C_j \leq d_j$ and $U_j = 0$ otherwise. Sometimes the meaning of $U_j$ is inverted in the literature, which is equivalent when considering the decision version of the problem, but which makes a huge difference for approximations.
  • $T_j$: Tardiness. Every job $j$ is given a due date $d_j$. The tardiness of job $j$ is defined as $T_j = \max\{0, C_j - d_j\}$.
  • $E_j$: Earliness. Every job $j$ is given a due date $d_j$. The earliness of job $j$ is defined as $E_j = \max\{0, d_j - C_j\}$. This objective is important for just-in-time scheduling.
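All of the per-job quantities above are simple functions of the completion time, release time and due date, so the common objectives can be evaluated in a few lines once a schedule is fixed. The sketch below is only an illustration: the function name `objectives` and the dictionary keys are made up, and $U_j$ follows the convention stated above (1 for a job that completes on time).

```python
def objectives(C, r, d):
    """Evaluate common objective values for given completion times.

    C[j], r[j], d[j] : completion time, release time and due date of job j.
    """
    n = len(C)
    F = [C[j] - r[j] for j in range(n)]              # flow time
    L = [C[j] - d[j] for j in range(n)]              # lateness
    T = [max(0, C[j] - d[j]) for j in range(n)]      # tardiness
    E = [max(0, d[j] - C[j]) for j in range(n)]      # earliness
    U = [1 if C[j] <= d[j] else 0 for j in range(n)] # on-time indicator
    return {
        "Cmax": max(C),
        "sum_C": sum(C),
        "sum_F": sum(F),
        "Lmax": max(L),
        "sum_U": sum(U),
        "sum_T": sum(T),
        "sum_E": sum(E),
    }


if __name__ == "__main__":
    values = objectives(C=[3, 7, 10], r=[0, 2, 4], d=[5, 6, 10])
    for name, value in values.items():
        print(name, value)
```

For this instance the makespan is 10, the maximum lateness is 1 (job 1 finishes one unit after its due date), and two of the three jobs complete on time.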

There are also variants with multiple objectives, but they are much less studied.[2]

Examples

Here are some examples for problems defined using the above notation.[1]

  • $P2 \parallel C_{\max}$ – assigning each of $n$ given jobs to one of two identical machines so as to minimize the maximum total processing time over the machines. This is an optimization version of the partition problem.
  • $1 \mid \text{prec} \mid L_{\max}$ – assigning processes with general precedence constraints to a single machine, minimizing maximum lateness.
  • $R \mid \text{pmtn} \mid \sum C_j$ – assigning tasks to a variable number of unrelated parallel machines, allowing preemption, minimizing total completion time.
  • $J3 \mid p_{ij} = 1 \mid C_{\max}$ – a 3-machine job shop problem with unit processing times, where the goal is to minimize the maximum completion time.
  • $P \mid \text{size}_j \mid C_{\max}$ – assigning jobs to $m$ parallel identical machines, where each job comes with a number of machines on which it must be scheduled at the same time, minimizing maximum completion time. See parallel task scheduling.
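To illustrate the first example, the sketch below solves small instances of the two-identical-machines makespan problem by trying every assignment of jobs to machines. Exhaustive enumeration is chosen here only because it is easy to verify; it takes time exponential in the number of jobs and is not how the problem is handled in practice. The function name `p2_cmax_bruteforce` is an assumption of this example.

```python
from itertools import product


def p2_cmax_bruteforce(p):
    """Return (makespan, assignment) minimizing the larger of two machine loads.

    p          : list of processing times p[j]
    assignment : tuple with assignment[j] in {0, 1}, the machine of job j
    """
    best = None
    for assignment in product((0, 1), repeat=len(p)):
        loads = [0, 0]
        for j, machine in enumerate(assignment):
            loads[machine] += p[j]
        makespan = max(loads)
        if best is None or makespan < best[0]:
            best = (makespan, assignment)
    return best


if __name__ == "__main__":
    makespan, assignment = p2_cmax_bruteforce([3, 1, 1, 2, 2, 1])
    print(makespan)    # 5: the total work of 10 splits evenly
    print(assignment)
```

When the optimal makespan equals half the total processing time, as in this instance, the job set can be partitioned into two equal-sum halves, which is the link to the partition problem.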

Other variants

  • All variants surveyed above are deterministic in that all data is known to the planner. There are also stochastic variants, in which the data is not known in advance or can be perturbed randomly.[2]
  • In a load balancing game, each job belongs to a strategic agent, who can decide where to schedule their job. The Nash equilibrium in this game may not be optimal. Aumann and Dombb[5] assess the inefficiency of equilibrium in several load-balancing games.

References

  1. Graham, R. L.; Lawler, E. L.; Lenstra, J. K.; Rinnooy Kan, A. H. G. (1979). "Optimization and Approximation in Deterministic Sequencing and Scheduling: a Survey". Proceedings of the Advanced Research Institute on Discrete Optimization and Systems Applications of the Systems Science Panel of NATO and of the Discrete Optimization Symposium. Elsevier. pp. (5) 287–326.
  2. Lawler, Eugene L.; Lenstra, Jan Karel; Rinnooy Kan, Alexander H. G.; Shmoys, David B. (1993-01-01). "Chapter 9 Sequencing and scheduling: Algorithms and complexity". Handbooks in Operations Research and Management Science. 4: 445–522. doi:10.1016/S0927-0507(05)80189-6. ISBN 9780444874726. ISSN 0927-0507.
  3. Chen, B.; Potts, C. N.; Woeginger, G. J. (1998). "A review of machine scheduling: Complexity, algorithms and approximability". In Du, D.-Z.; Pardalos, P. (eds.). Handbook of Combinatorial Optimization (Volume 3). Kluwer Academic Publishers. pp. 21–169. ISBN 0-7923-5285-8 (HB) 0-7923-5019-7 (Set).
  4. Horowitz, Ellis; Sahni, Sartaj (1976-04-01). "Exact and Approximate Algorithms for Scheduling Nonidentical Processors". Journal of the ACM. 23 (2): 317–327. doi:10.1145/321941.321951. ISSN 0004-5411. S2CID 18693114.
  5. Aumann, Yonatan; Dombb, Yair (2010). Kontogiannis, Spyros; Koutsoupias, Elias; Spirakis, Paul G. (eds.). "Pareto Efficiency and Approximate Pareto Efficiency in Routing and Load Balancing Games". Algorithmic Game Theory. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer: 66–77. doi:10.1007/978-3-642-16170-4_7. ISBN 978-3-642-16170-4.

External links

  • Scheduling zoo (by Christoph Dürr, Sigrid Knust, Damien Prot, Óscar C. Vásquez): an online tool for searching an optimal scheduling problem using the notation.
  • Complexity results for scheduling problems (by Peter Brucker, Sigrid Knust): a classification of optimal scheduling problems by what is known on their runtime complexity.