
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCC.2017.2693187, IEEE Transactions on Cloud Computing

JOURNAL OF LATEX CLASS FILES, VOL. 13, NO. 9, SEPTEMBER 2014

A Johnson's-Rule-Based Genetic Algorithm for Two-Stage-Task Scheduling Problem in Data-Centers of Cloud Computing

Yonghua Xiong, Suzhen Huang, Min Wu, Senior Member, IEEE, Jinhua She, Senior Member, IEEE, and Keyuan Jiang, Senior Member, IEEE

Abstract—One of the keys to making cloud data-centers (CDCs) proliferate impressively is the implementation of efficient task scheduling. Since all the resources of CDCs, even including operating systems (OSes) and application programs, can be stored and managed on remote data-centers, this study first analyzed the task scheduling problem for CDCs and established a mathematical model of the scheduling of two-stage tasks. Johnson's rule was combined with the genetic algorithm to create a Johnson's-rule-based genetic algorithm (JRGA), which takes into account the characteristics of multiprocessor scheduling in CDCs. New crossover and mutation operations were devised to make the algorithm converge more quickly. In the decoding process, Johnson's rule is used to optimize the makespan for each machine. Simulations were used to compare the performance of the JRGA with that of the list scheduling algorithm and an improved list scheduling algorithm. The results demonstrate the validity of the JRGA.

Index Terms—genetic algorithm, Johnson’s rule, task scheduling, data-center, cloud computing.

1 INTRODUCTION

THE proliferation of high-speed networks and high-performance computing systems in the last decade has created opportunities for cloud data centers (CDCs). At the core of a cloud computing system, CDCs not only have to provide centralized management of large-scale resources, but also need to offer quick responses to a large number of tasks. Consequently, the problems of allocating variable resources reasonably and scheduling highly concurrent tasks efficiently [1], [2] have become a challenge, whose solution in turn assures the service quality, maximizes the resource utilization, and reduces the costs of a CDC.

A CDC consists of a number of server clusters connected with various kinds of clients by a wired and/or wireless network. Since all the resources, even including OSes (OS as a service for users in a new paradigm of cloud computing [3]), are stored on the servers, a client just needs to generate a request to get the resource services from CDCs. When a client needs an OS or a program, it sends a service request to the server clusters, which then stream the appropriate software to the client over the network. Each server cluster may use a distributed storage strategy, such as Hadoop, or be regarded as one machine. There are usually numerous clients making thousands of requests and a great number of server clusters. Tasks and client requests should be allocated to available machines, and the order of the tasks on each machine should be arranged, so as to minimize the total completion time, which is called the makespan.

This study considered the problem of how to schedule data request tasks on the machines in a CDC system. A task is assumed to consist of two steps: task executing and results transmission (i.e. executing and transmission). Executing means getting data from a disk or distributed storage system, storing it in memory, and completing execution; transmission means transmitting the execution results from memory to a client over the network. Since the executing and transmission times depend on the size of the task, this is a two-stage-task scheduling problem: optimize the processing of a set of tasks in two process stages (i = 1, 2), each of which has multiple processors or machines [m(i) ≥ 1, (i = 1, 2)] operating in parallel. Detailed descriptions and classifications of the two-stage-task scheduling problem can be found in the next section. In most cases two-stage scheduling problems are NP-complete [4]. One important and difficult problem is how to minimize the makespan when two-stage tasks are scheduled on a multiprocessor system.

In this study, the general two-stage-task scheduling problem was dealt with by establishing a general model for two-stage-task scheduling for a CDC based on the process of task allocation. Our main contributions can be summarized as follows.

1) We study how the task scheduling problem in data centers of cloud computing can be conducted as a two-stage flow shop scheduling problem with [m(1) ≥ 2], [m(2) ≥ 2], and establish a mathematical model of task scheduling with the goal of minimizing the makespan.

• Y. H. Xiong, M. Wu, and J. She are with the School of Automation, China University of Geosciences, Wuhan 430074, China, and the Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan 430074, China; J. She is also with the School of Engineering, Tokyo University of Technology, Tokyo 192-0982, Japan. E-mail: {xiongyh, wumin}@cug.edu.cn, she@stf.teu.ac.jp.
• S. Z. Huang is with the School of Information Science and Engineering, Central South University, Changsha 410083, China. E-mail: huangsuzhen@csu.edu.cn.
• K. Jiang is with the Department of Computer Information Technology & Graphics, Purdue University Northwest, Hammond, 46323, USA. E-mail: kjiang@pnw.edu.
Manuscript received April 19, 2005; revised January 11, 2014.

2168-7161 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

2) We show that Johnson's rule can be well combined with the GA for the two-stage flow shop scheduling problem, and present a Johnson's-rule-based GA (JRGA) to solve it. More specifically, the GA is used to allocate tasks to appropriate machines, and Johnson's rule is used as a decoding technique to determine the order in which tasks are processed on each machine.

3) We study how to improve the original GA to make it more suitable for the nature of this scheduling problem, and design new crossover and mutation operations to increase the fitness of individuals in the JRGA. We finally compare the JRGA with the conventional LS algorithm and the improved LS (ILS) algorithm by experimental results, and show that the JRGA is able to converge to the global optimum.

We also classify the existing studies concerning the two-stage task scheduling problem depending on the number of machines used in each of the two stages, and show that the JRGA can be used not only in the data centers of cloud computing but also in other domains that have the special characteristics of [m(1) ≥ 2], [m(2) ≥ 2].

The rest of the paper is organized as follows: Section 2 reviews related work on task scheduling. Section 3 establishes a mathematical model for the task scheduling problem in a CDC. Section 4 explains the JRGA, the new operations, and a list distribution algorithm (LDA). Section 5 presents some simulation results and a discussion. Finally, Section 6 presents some conclusions and future directions.

2 RELATED WORK

Two-stage-task scheduling has been widely studied because of its impact on performance in a parallel-processing system (for example, [5], [6]). Studies are classified into four types depending on the number of machines used in each of the two stages [m(i) (i = 1, 2)]:

Type 1 Only one machine is used in each stage [m(1) = m(2) = 1].
Type 2 Two or more machines are used in the first stage, but only one is used in the second [m(1) ≥ 2, m(2) = 1].
Type 3 Only one machine is used in the first stage, but two or more are used in the second [m(1) = 1, m(2) ≥ 2].
Type 4 Two or more machines are used in each stage [m(1) ≥ 2, m(2) ≥ 2].

Type 1 is the standard two-stage flow shop problem. Johnson presented an optimal solution that employs Johnson's rule to arrange all the tasks [7]. For Type 1, most of the recent studies are concerned with scheduling problems from the viewpoint of customized jobs. An et al. considered an m(1) = m(2) = 1 two-stage flowshop scheduling problem in which jobs should be processed on the second machine within limited waiting times after those jobs are completed on the first machine, and presented a branch-and-bound algorithm for the objective of minimizing makespan [8]. Cheng et al. studied minimizing the makespan of a two-machine flowshop scheduling problem with a truncated learning function in which the actual processing time of a job is a function of the job's position [9].

For Type 2, Hsu et al. proposed a method that combines a modified Johnson's rule with the first-fit rule [10]. For the case m(1) = 2, which forms an open shop, and m(2) = 1, Dong et al. [11] devised a heuristic method that combines Gonzalez and Sahni's algorithms [12]. For more complex situations, many efforts have been made to find an efficient scheduling algorithm for a hybrid flow shop scheduling problem with multiple processors. For example, Tozkapan et al. developed a lower-bounding procedure and a dominance criterion, and incorporated them into a branch-and-bound procedure that minimizes total weighted flow time [13]. It works well when the problem is relatively small. For a large problem, self-adaptive differential evolution [14] provides better performance than the hybrid tabu search. Currently, for the case m(1) > 2 and m(2) = 1, the focus has been on the two-stage assembly scheduling problem, in which multiple machines in the first stage produce components while a single assembly machine assembles them into products in the second stage. Liao et al. formulated the two-stage assembly scheduling problem of multiple products to minimize makespan as a mixed integer programming model [15]. Xiong et al. addressed a distributed two-stage assembly scheduling problem to minimize the weighted sum of makespan and mean completion time; a variable neighborhood search algorithm and a hybrid genetic algorithm were proposed to approximately optimize the objectives for a set of small-sized instances [16].

For Type 3, heuristics yield the most successful solutions to task scheduling problems, such as list scheduling (LS) [17], clustering algorithms [18], and task duplication [19]. For a large problem, Allaoui collected three heuristics: the list algorithm (LA), the LPT (largest processing time first) heuristic, and H heuristics [20]. The results showed that H heuristics provide better performance than LPT heuristics do for the worst case if and only if m(2) ≥ 4. Wang et al. especially considered preventive maintenance on the first-stage machine, and investigated an integrated bi-objective optimization problem [21]. Mirabi et al. studied the machine-breakdown case, in which a machine may not always be available during the scheduling period, and provided a heuristic algorithm to find the optimal job combinations and the optimal job schedule [22]. Hojati addressed the two-stage disassembly flowshop scheduling problem with the objective of minimizing the makespan [23].

For Type 4, a great deal of effort has been devoted to devising an interactive search method with high efficiency and accuracy for the case in which there are at least two identical machines in each stage. Oĝuz et al. developed a heuristic algorithm that provides quick solutions to practical problems [24]. Nikzad et al. considered a special kind of two-stage hybrid flow shop scheduling problem with the aim of minimizing the makespan of products [25]. Afshin Mansouria introduced the emerging green scheduling as a new approach to two-stage flowshop scheduling [26]: a multi-objective mathematical model that treats energy consumption as an explicit decision criterion by leveraging variable processing times was developed, and a heuristic algorithm was proposed to find approximately optimal solutions in a shorter time.

For a CDC system, m(1) = m(2) = n (n ≥ 2) in the two-stage-task scheduling problem, which makes it a special case


of Type 4. This problem has characteristics that differentiate it from other Type-4 problems. In contrast to the other types, in which the two task operations (executing, transmission) can be carried out on different machines [27], both operations must be processed on the same machine in a CDC; that is, once a task is allocated to a machine, that machine does both the executing and the transmission. Each machine has two independent channels for task processing, one for executing and one for transmission; and transmission can only begin after executing is finished. In addition, the tasks in a CDC are usually more diverse and complex than those in other systems because they involve OSes, applications, user data, etc.

In the two-stage-task scheduling problem for a CDC, when there is only one machine in each stage [m(1) = m(2) = 1], both operations are processed on the same machine. This is a Type-1 problem, which can be solved by using Johnson's rule [7]. On the other hand, when there are two or more machines in each stage [m(1) = m(2) = n (n ≥ 2)], the two-stage-task scheduling problem for a CDC becomes hard to solve.

Genetic algorithms (GAs) are one of the best ways of solving NP-hard problems, and their effectiveness in handling many task scheduling problems has been demonstrated [28]–[30]. Zhang et al. [28] designed two selection methods for the initialization, and applied different crossover and mutation operations to minimize the makespan of a flexible job-shop scheduling problem with only one machine in every job-shop stage; room for exploration remains when there are multiple machines in every stage. For parallel task scheduling on multiple heterogeneous machines with only one stage, Mohammad et al. [29] divided the scheduling into two phases: a heuristic list-based algorithm to quickly generate a high-quality task schedule, located in an approximate area around the optimal schedule; and a customized genetic algorithm to search that approximate area to improve the schedule generated by the first phase. A new crossover operator performed well in the k-stage flowshop with mi identical parallel processors of the hybrid flowshop scheduling problem [30], but it only confirms a sequence of k tasks with no order, and in that environment no preemption of jobs is allowed. However, none of the existing studies has applied GAs to the two-stage task scheduling problem with [m(1) ≥ 2], [m(2) ≥ 2].

In recent years, GAs have been applied in several studies to task scheduling problems in cloud computing [31]–[36]. Evolutionary algorithms are discussed in [31] as a way to solve scheduling problems, especially the genetic algorithm [32]. Tsai et al. [33] combined the Taguchi method and an evolutionary algorithm to balance exploration and exploitation, and established a multi-objective cost model with a non-dominated sorting technique to obtain the assignment of tasks to available resources and the makespan. To schedule the most suitable cloud services for clients, Kaur et al. [34] improved the genetic algorithm for scheduling tasks by taking into consideration the tasks' computational complexity and the computing capacity of the processing elements, so as to adjust to the dynamics of particular cloud services. Bilgaiyan et al. [35] regard task scheduling as a two-stage process, transmission and execution of mounting data, and apply the genetic algorithm to manage the escalating costs of data-intensive applications, but do not arrange the order of tasks in each stage. To be fully aware of the special needs for services, Liu et al. [36] schedule different service requests to the given VMs under the constraints, with no ordering of incoming and outgoing requests, and optimize the objective function.

However, to the best of our knowledge, existing original and improved GAs only allocate tasks to available machines. They do not arrange the order of execution of tasks on each machine. Moreover, we have found no literature that combines GAs and Johnson's rule for scheduling problems in other fields in addition to cloud computing.

3 MODEL OF TWO-STAGE-TASK SCHEDULING

CDCs are resource pools that provide users with all kinds of cloud services. Clients do not even need an OS or any application programs, and may have either a very small amount of local storage or none at all.

Since OSes and application programs stored on CDCs, or the results of executing them on the relevant data, can be streamed to clients over a network, huge amounts of data need to be transferred in real time. How the data transfer is managed and scheduled directly affects the performance of the whole system. Tasks and requests from clients have to be allocated to available machines, and the order in which tasks are processed on each machine has to be arranged, with the overall goal of minimizing the makespan.

3.1 Task Scheduling Process in Cloud Data-Center

Fig. 1 shows a diagram of task scheduling in the CDC system, which contains various types of clients (PCs, mobile devices, tablets, etc.), an access control server, and a CDC. Each client generates one or more jobs (such as loading a file). Each job, Jj (j = 1, · · · , m), consists of two operations: executing and transmission. Both operations are carried out on the same machine, Mi (i = 1, · · · , n). Let Tj(1) be the time required for executing, and Tj(2) be the time required for transmission. The network connects all the clients and the CDC to an access control server, which collects the specifications of the jobs and the machines. Based on those specifications, it allocates the jobs to available machines. Then, the scheduling module in the access control server puts the jobs for each machine in a particular order based on the job specifications. Finally, the machines execute the jobs over the network.

3.2 Mathematical Model

Task scheduling has the goal of minimizing the makespan. It involves two problems: allocation, or how to allocate jobs to machines, and sorting, or how to determine the order in which the jobs are executed on each machine. These problems are formulated below.

Consider two sets:

M = {M1, M2, · · · , Mi, · · · , Mn}
J = {J1, J2, · · · , Jj, · · · , Jm}   (1)

where M contains n identical parallel machines, and J is a set of m jobs to be processed in a two-stage flow shop. Two parameters (namely, the times required for executing


[Fig. 1 shows: clients generate a job queue J1, J2, · · · , Jj, · · · , Jm, where each job Jj consists of Tj(1) and Tj(2); the access control server (get-job-parameters module, get-server-cluster-parameters module, and scheduling module) builds the job queues for the machines M1, M2, · · · , Mi, · · · , Mn in the server clusters.]

Fig. 1. Diagram of task scheduling in CDC system.

and transmission) are associated with each job: Jj = {Tj(1), Tj(2)}, where Tj(1), Tj(2) ≥ 0 for j = 1, 2, · · · , m.

Define a whole schedule as s, which includes two parts: distribution and sorting. In order to represent the distribution relationship in s, a distribution matrix, D = [dij], is introduced, which specifies the mapping of J onto M:

dij = 1, if Jj is allocated to Mi; dij = 0, if Jj is not allocated to Mi.   (2)

A distribution matrix can represent all possible mappings, and each mapping is represented uniquely. Since each job must be handled by just one machine, and since the schedule allocates all the jobs, we have

Σ_{i=1}^{n} dij = 1   (3)

Σ_{i=1}^{n} Σ_{j=1}^{m} dij = m.   (4)

As an example, below is a distribution matrix that allocates 15 jobs to 6 machines. Each row is for one machine, and each column is for one job.

D = | 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 |
    | 1 0 0 0 0 0 1 0 0 1 0 0 0 0 1 |
    | 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 |
    | 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 |
    | 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 |
    | 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 |   (5)

Practically speaking, Equations (3) and (4) mean that, in each column, there is one and only one entry with a value of one.

Another component of s is a sorting rule that determines the order of execution of jobs on each machine. Let Ci(s) be the time required for the completion of the jobs allocated to machine Mi, for a sorted order of execution. The completion time for all the jobs in s, which is the makespan, is the maximum completion time over all n machines in M. Thus,

C̄(s) = max{Ci(s)}, i = 1, 2, · · · , n.   (6)

The objective of task scheduling is to find a schedule that minimizes C̄(s).

For job Jj (j = 1, 2, · · · , m), let tbj(1) and tbj(2) be the beginning times for executing and transmission, respectively, and let tej(1) and tej(2) be the respective ending times. They are constrained by the following relationships:

tej(1) = tbj(1) + Tj(1)   (7)

tej(2) = tbj(2) + Tj(2).   (8)

Now, the scheduling optimization problem is

min C̄(s)   (9)

s.t.

tbj(2) ≥ tej(1)   (10)

dij tej(1) ≤ dik tbk(1) or dik tek(1) ≤ dij tbj(1)   (11)

dij tej(2) ≤ dik tbk(2) or dik tek(2) ≤ dij tbj(2)   (12)

i = 1, 2, · · · , n; j, k = 1, 2, · · · , m; j ≠ k.

In this problem,

1) (10) means that, for each job, transmission starts only after executing is finished;
2) The constraints (11) and (12) ensure that a machine carries out executing and transmission for one job at a time: for all the jobs assigned to the same machine, Mi, the executing (or transmission) for a new job (Jk, as in (11)) begins only after the executing (or transmission) for the previous job (Jj) is finished.

Two assumptions are needed for the analysis:

1) All the machines are identical and have both the same knowledge of the network and the same opportunities to receive requests. Since virtual machines (VMs) can fairly share the resources of physical machines and have been widely adopted in cloud computing [37], the assumptions are rational if all the machines are VMs.
2) The preemption of jobs is not allowed.
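The timing model of (6)-(12) can be made concrete with a short sketch. This is our illustrative reading of the model, not code from the paper: for a fixed job order on each machine, every operation is assumed to start as early as the constraints allow, so executing operations run back-to-back on the executing channel, and each transmission starts once both its own executing (10) and the previous job's transmission (12) have finished.

```python
def machine_completion_time(jobs):
    """Completion time C_i(s) of one machine for an ordered job list.

    `jobs` is a list of (T1, T2) pairs: executing and transmission
    times.  Executing is sequential on one channel (11); transmission
    is sequential on the other (12) and waits for the job's own
    executing to end (10).
    """
    exec_end = 0.0   # end of the last executing operation, per (7)
    trans_end = 0.0  # end of the last transmission operation, per (8)
    for t1, t2 in jobs:
        exec_end += t1                             # back-to-back executing
        trans_end = max(trans_end, exec_end) + t2  # (10) and (12)
    return trans_end


def makespan(schedule):
    """Equation (6): the makespan is the largest completion time."""
    return max(machine_completion_time(jobs) for jobs in schedule)


# Two machines; times in seconds.  Machine 1 runs two jobs, machine 2 one.
s = [[(3, 2), (4, 1)], [(5, 5)]]
print(makespan(s))  # -> 10 (machine 1 finishes at 8, machine 2 at 10)
```

The `max(trans_end, exec_end)` term is where the two channels interact: transmission of one job can overlap the executing of the next, which is exactly the overlap that makes ordering matter.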


4 JOHNSON-RULE-BASED GENETIC ALGORITHM

The two-stage-task scheduling problem is NP-hard [38], and the commonly used approximate algorithms are not suitable. In this study, a GA and Johnson's rule were combined to create a new algorithm that yields a proper schedule with the minimum makespan.

Johnson's rule was first put forward to solve the two-stage flowshop scheduling problem, and it reaches the optimal solution in O(n log n) time. The main idea is to leave the least idle time on both machines by arranging the tasks according to the rules named after Johnson. Johnson's rule has proven to be the best way of handling the problem when m(1) = m(2) = 1. However, when there are multiple machines in both stages, the problem of allocation must also be considered; and a GA, proposed by Holland [40], is a powerful tool for solving it.

A GA (Fig. 2) is a search heuristic that mimics the process of natural selection by implementing some of its basic mechanisms, such as reproduction, crossover, and mutation. It starts with a randomly generated population, which is an initial set of solutions to a problem. Each solution is encoded in a finite-length string, called an individual, based on certain rules. A fitness function is used to evaluate the fitness of each individual. Three operations are performed on the current population to create the next generation: selection, crossover, and mutation. Selection measures the fitness of each individual in the current population and selects the fittest ones to be members of the next generation. Crossover chooses a pair of individuals in the current population and exchanges some of their chromosomes to form new individuals. And mutation changes the value of some of the chromosomes in an individual. Selecting the fittest individuals to be members of the next generation ensures that the population evolves toward better solutions with each generation. Finally, the individual that is best adapted to the environment is an optimal solution to the problem.

[Fig. 2 flowchart: Initialization (set k = 0, select the parent population P0) → obtain Pk → crossover (obtain Hk), reserve Pk, and mutation (obtain Mk) → decode → calculate fitness → selection, k = k + 1 → if the stop criterion is not met, loop; otherwise output.]
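The cycle described above can be sketched generically. The following minimal Python loop illustrates the crossover-mutation-selection cycle of Fig. 2 on a toy maximization problem; it is not the JRGA itself, and all operator choices and parameter values here are placeholders of our own.

```python
import random


def ga(fitness, n_genes, n_values, pop_size=30, generations=50,
       crossover_rate=0.8, mutation_rate=0.1, seed=1):
    """Minimal GA loop following Fig. 2: initialize, then repeat
    crossover -> mutation -> fitness evaluation -> selection."""
    rng = random.Random(seed)
    # Initialization: random individuals, lists of integers in [1, n_values].
    pop = [[rng.randint(1, n_values) for _ in range(n_genes)]
           for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        # Crossover: one-point recombination of randomly chosen pairs.
        for _ in range(pop_size // 2):
            if rng.random() < crossover_rate:
                a, b = rng.sample(pop, 2)
                cut = rng.randrange(1, n_genes)
                children.append(a[:cut] + b[cut:])
        # Mutation: reassign one randomly chosen gene.
        for ind in pop:
            if rng.random() < mutation_rate:
                child = ind[:]
                child[rng.randrange(n_genes)] = rng.randint(1, n_values)
                children.append(child)
        # Selection: keep the fittest pop_size individuals (elitist).
        pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
    return pop[0]


# Toy run: evolve 10-gene individuals toward having many genes equal to 1.
best = ga(fitness=lambda ind: ind.count(1), n_genes=10, n_values=4)
```

Because selection here is elitist (the best individual always survives), the best fitness in the population never decreases from one generation to the next.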
Fig. 2. Flowchart of GA.

The JRGA is a hybrid GA that uses Johnson's rule in the decoding process. Moreover, in contrast to studies in which mutation can reduce an individual's fitness [27], [28], in this study a new evolutionary mutation operation is employed that always makes an individual's fitness increase. The JRGA is explained below.

4.1 Specification and Initialization of Population

Scheduling optimization (9) involves two problems: allocation and sorting. The sorting problem has been thoroughly studied [41], [42], so it is not difficult to decide the order of execution of the jobs allocated to a particular machine. Thus, this study only considered allocation.

As mentioned above, a distribution matrix D = [dij], which has (2) as entries, specifies the mapping of J onto M. For the problem considered in this study, in each column there is one and only one entry with a value of one. This allows us to encode the information in a distribution matrix in a simpler form. Since each job is allocated to just one machine, we can use a 1-dimensional matrix, D̂, that just lists the numbers of the machines to which the jobs are allocated. In the example distribution matrix (5) above, J1 is allocated to M2, so the first entry of D̂ is 2; J2 is allocated to M1, so the second entry is 1; J3 is allocated to M5, so the third entry is 5; and so on. That gives us

D̂ = [2 1 5 4 1 6 2 3 4 2 6 3 1 5 2]   (13)

for our example. Thus, each column represents one job, and the entry indicates the machine to which the job is allocated. More generally, the rule for converting an n × m distribution matrix, D, into a compact 1 × m distribution matrix, D̂, is as follows:

Algorithm 1 Creation of 1 × m distribution matrix
Require: D̂ = [ ]
1: for j = 1, 2, · · · , m do
2:   for i = 1, 2, · · · , n do
3:     if Dij = 1 then
4:       D̂ = [D̂, i]
5:     end if
6:   end for
7: end for
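Algorithm 1 amounts to recording, for each column of D, the row index of its single nonzero entry. A Python sketch (the function name is ours, not the paper's) that also checks the one-entry-per-column property implied by (3) and (4):

```python
def compress(D):
    """Convert an n x m 0/1 distribution matrix D into the 1 x m form:
    entry j is the (1-based) index of the machine to which job j is
    allocated.  Each column must contain exactly one 1, per (3)-(4)."""
    n, m = len(D), len(D[0])
    d_hat = []
    for j in range(m):
        col = [D[i][j] for i in range(n)]
        assert sum(col) == 1, "each job is allocated to exactly one machine"
        d_hat.append(col.index(1) + 1)
    return d_hat


# The 6 x 15 example matrix (5), allocating 15 jobs to 6 machines.
D = [
    [0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0],
]
print(compress(D))  # [2, 1, 5, 4, 1, 6, 2, 3, 4, 2, 6, 3, 1, 5, 2], i.e. (13)
```

The inverse direction (expanding D̂ back into D) is equally mechanical, which is why the compact form loses no information.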


Conveniently, this 1 × m matrix is precisely the mathematical form that we need for a chromosome (or individual) in a population for a genetic algorithm. We thus define an individual, I(x), to be a 1 × m matrix whose elements are integers in the range [1, n], where m is the number of jobs and n is the number of machines:

I(x) = [I1, I2, · · · , Ij, · · · , Im], Ij ∈ {1, · · · , n}, j = 1, · · · , m.   (14)

The population, X, to which the JRGA is applied, is a set of np individuals:

X = {I(1), . . . , I(x), . . . , I(np)}, x = 1, 2, ..., np.   (15)

Here np is the size of the population. It is determined by taking into account the trade-off between computational cost and optimization result; for this study, it was chosen to be np = 30. The initial population is formed by randomly generating the integers in each individual to ensure that the population is scattered around the search space.

4.2 Johnson's-Rule-based Decoding Method

The process of decoding involves applying the LDA (Algorithm 2) to an individual (or chromosome) to generate a schedule.

The LDA first examines an individual, (14), to obtain information on how jobs are allocated to machines: for each machine, Mi, it constructs a job list, Li, by first extracting the entries in I whose value is i, and then making a list of the corresponding jobs. For example, for the individual

I = [2 1 5 4 1 6 2 3 4 2 6 3 1 5 2],   (16)

the job list for M2 is constructed by first extracting all the entries with a value of 2 (I1, I7, I10, I15), and then listing the corresponding jobs: L2 = {J1, J7, J10, J15}. Thus, the algorithm for the LDA is as follows:

Algorithm 2 LDA
Require: Assume that J = {Jj} = {Tj(1), Tj(2)} and that an individual in the population has the form (14). Li = [ ]
1: for i = 1, 2, · · · , n do
2:   for j = 1, 2, · · · , m do
3:     if Ij = i then
4:       Li = [Li, Jj]
5:     end if
6:   end for
7:   Rearrange the jobs in Li into the Johnson order.
8: end for

[Fig. 3 shows the job lists decoded from individual (16): L1 = {J2, J5, J13}, L2 = {J1, J7, J10, J15}, L3 = {J8, J12}, L4 = {J4, J9}, L5 = {J3, J14}, L6 = {J6, J11}.]

Fig. 3. Job lists decoded from individual.

Once the jobs are properly allocated to the machines, the scheduling problem is reduced to determining the order of job execution for each machine, which is similar to the two-stage flow shop. Johnson has reported an optimal solution for this scheduling problem [7]. So, the job order for each machine is decided in accordance with Johnson's rule.

A job Jj ∈ J has the form Jj = {Tj(1), Tj(2)}. So, for each machine, we use Johnson's rule to put the jobs in its job list into the Johnson order, which yields the smallest makespan. The processing time for the sorting is of order O(n log n). The sorting process is given below.

Johnson's-rule-based sorting:
Step 1. Divide the jobs into two groups based on Tj(1) and Tj(2): Group 1 contains the jobs for which Tj(1) ≤ Tj(2), and Group 2 contains the rest.
Step 2. Put the jobs in Group 1 in non-descending order based on Tj(1).
Step 3. Put the jobs in Group 2 in non-ascending order based on Tj(2).
Step 4. Form a new list by appending Group 2 to Group 1.

Algorithm 2 decodes the job list of each machine from I (Fig. 3) and sorts it into the Johnson order. This yields a complete schedule. Below, we use the example individual (16) to illustrate the process of sorting a job list into the Johnson order. The timing data for each job are given in Table 1.

TABLE 1
Job parameters for individual (16).

Time (s)  J1   J2   J3   J4   J5   J6   J7   J8
Tj(1)     87   51   11   78   72   69   94   72
Tj(2)     98   45   28   88   64   49   64   59

Time (s)  J9   J10  J11  J12  J13  J14  J15
Tj(1)     36   28   27   74   24   78   45
Tj(2)     48   35   48   63   29   70   37

The job list for M2 is L2 = {J1, J7, J10, J15}. The jobs are divided into two groups based on Johnson's rule: Group 1 contains J1 and J10 because Tj(1) ≤ Tj(2) (j = 1, 10); Group 2 contains J7 and J15. The jobs in Group 1 are put in the order {J10, J1} based on the values of Tj(1) (that is, T10(1) < T1(1), since 28 < 87). The jobs in Group 2 are put in the order {J7, J15} based on the values of Tj(2) (that is, T7(2) > T15(2), since 64 > 37). Appending Group 2 to Group 1 gives us the order of job execution for M2: {J10, J1, J7, J15}.

The LDA is used to decode each I(x) in X into a full schedule, s.

2168-7161 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
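The Johnson's-rule-based sorting of Section 4.2 can be sketched as follows. The timing values below are hypothetical stand-ins for Table 1, chosen only to satisfy the inequalities quoted in the worked example (T10(1) = 28 < T1(1) = 87 and T7(2) = 64 > T15(2) = 37); Python is used here for illustration, even though the paper's simulations were run in MATLAB.

```python
# Johnson's-rule-based sorting of one machine's job list (Section 4.2).

def johnson_order(jobs):
    """jobs: dict mapping job name -> (t1, t2). Returns names in Johnson order."""
    # Step 1: split on t1 <= t2.
    group1 = [j for j, (t1, t2) in jobs.items() if t1 <= t2]
    group2 = [j for j, (t1, t2) in jobs.items() if t1 > t2]
    group1.sort(key=lambda j: jobs[j][0])   # Step 2: non-descending by t1
    group2.sort(key=lambda j: -jobs[j][1])  # Step 3: non-ascending by t2
    return group1 + group2                  # Step 4: append Group 2 to Group 1

# Hypothetical stage times consistent with the example in the text.
L2 = {"J1": (87, 90), "J10": (28, 40), "J7": (70, 64), "J15": (50, 37)}
print(johnson_order(L2))  # ['J10', 'J1', 'J7', 'J15']
```

The result reproduces the order of job execution for M2 derived in the example, {J10, J1, J7, J15}.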

4.3 Fitness

Mathematically, fitness is an objective function used to evaluate the individuals in a population. For a GA, a fitness function has non-negative values; and the larger the value is for an individual, the better the solution is that the individual represents.

The objective of the scheduling optimization problem is to minimize C̄(s) = max{Ci(s)}. So, we define the fitness of an individual to be

f(I(x)) = 1 / C̄(s) = 1 / max{Ci(s)}, x = 1, 2, · · · , np. (17)

After Algorithm 2 is used to decode each individual in the population into a full schedule, the time needed to complete the schedule, C̄(s), is calculated. Then, (17) is used to calculate the fitness of each individual.

4.4 Selection

The selection operation chooses the fittest individuals in a population to form the parent population for the next generation. The remaining individuals are discarded. The larger an individual's fitness is, the greater are its chances of survival. Thus, as the population evolves, more and more individuals approach the optimal fitness. In this study, selection is based on the roulette wheel method, which is the most widely used one. It employs a random number in the range [0, 1].

Roulette wheel method:
Step 1. Calculate the fitness, f(I(x)), of each individual in the population.
Step 2. Calculate the total fitness of the current population:
F = Σ_{x=1}^{np} f(I(x)).
Step 3. Calculate the probability of each individual being selected:
p(I(x)) = f(I(x)) / F.
Step 4. Calculate the cumulative probability of each individual:
P(I(x)) = Σ_{i=1}^{x} p(I(i)).
Step 5. Generate a random number r = rand[0, 1].
Step 6. Select the xth individual if the condition P(I(x−1)) < r < P(I(x)) holds for 2 ≤ x ≤ np, and select the first individual if r < P(I(1)).

4.5 Crossover

Without a special mechanism in a GA, the individuals in a new generation inherit all of their genes directly from their parents without alteration, which limits the search coverage. To eliminate this drawback, the crossover operation exchanges one or more genes in two individuals (or chromosomes) to produce two new individuals. This can improve the population and increase the probability of finding a better solution. After the selection operation is applied to a population, the resulting candidates for the next generation are broken up into pairs of individuals; and some of the pairs (called parents) are selected for crossover. The percentage of pairs selected is based on a predetermined number called the crossover rate, c, which is the probability of the crossover operation being applied to a pair. The choice of c is discussed in Section 5 using a numerical example.

This study tested two crossover operations: a linear crossover operation (LXO) and a two-point crossover (TPX), which is widely used in GAs and has proven to be effective in most cases.

LXO:
Input: Two parents in the current population
Output: Two offspring
Step 1. Name the parents I1 and I2, and name the offspring I1′ and I2′.
Step 2. Randomly generate a number rc = rand[0, 1].
Step 3. Generate two offspring by producing their genes in the following manner:
Î1 = rc I1 + (1 − rc)I2, Î2 = rc I2 + (1 − rc)I1.
Step 4. Round off the genes obtained in Step 3 to integers:
I1′ = [Î1], I2′ = [Î2],
where [x] means the integer that is closest to x. It is given by [x] = max{n ∈ Z | n ≤ x + 0.5}.

TPX:
Input: Two parents in the current population
Output: Two offspring
Step 1. Name the parents I1 and I2, and name the offspring I1′ and I2′.
Step 2. Randomly set two crossing points in I1 and I2.
Step 3. Exchange the subchromosomes of I1 and I2, as determined by the crossing points, to form the offspring I1′ and I2′.

The numerical example in Section 5 compares the effectiveness of the two crossover operations.

4.6 Evolutionary Mutation

The mutation operation changes one or more of the genes of an individual to improve the search coverage and also to maintain the diversity of the population so as to prevent premature convergence. The mutation rate, u, for a population is the probability that an individual will mutate. It is fixed and is related to the crossover rate, c. The numerical example in Section 5 illustrates how it is chosen.

A gene in an individual is chosen at random for mutation. Two new mutation operations are used to improve the fitness of the population: one-point mutation (OM) and cross mutation (XM).

After Algorithm 2 is used to decode an individual into a full schedule, the job list, Li, of each machine, Mi, is determined; and the time needed to complete all the jobs for each machine is calculated. Let Cmin(s) be the shortest completion time, and let Cmax(s) be the longest. Suppose that Cmin(s) = Ca(s) and Cmax(s) = Cb(s). Clearly, the fitness of an individual is determined by the maximum completion time for all the machines, that is, Cmax(s).

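Steps 1-6 of the roulette-wheel method in Section 4.4 can be sketched as follows. The makespans (and thus the fitness values, formed as reciprocals per (17)) are hypothetical, and the injectable rng argument is only an illustration device for deterministic testing, not part of the paper's method.

```python
import random

def roulette_select(fitness, rng=random.random):
    """Return the index of the selected individual (Steps 1-6, Section 4.4)."""
    F = sum(fitness)                  # Step 2: total fitness
    p = [f / F for f in fitness]      # Step 3: selection probabilities
    P, acc = [], 0.0                  # Step 4: cumulative probabilities
    for pi in p:
        acc += pi
        P.append(acc)
    r = rng()                         # Step 5: random number in [0, 1]
    for x, Px in enumerate(P):        # Step 6: first x with r < P(I(x))
        if r < Px:
            return x
    return len(fitness) - 1           # guard against rounding when r is near 1

# Hypothetical makespans; fitness is their reciprocal, as in (17).
fitness = [1 / 2297, 1 / 2567, 1 / 2458]
index = roulette_select(fitness, rng=lambda: 0.2)
```

Fitter individuals occupy a wider slice of the cumulative distribution, so they are selected more often, which is the survival bias described in Section 4.4.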

There are two situations in which changing the allocation of jobs to machines helps to shorten Cmax(s). In order to simplify the mutation operation, we take the length of Jj (Jj ∈ J) to be Tj(1) + Tj(2).

Situation 1: Assume that there is a job, Jk ∈ J, in Lb for Mb for which the length is less than the difference between Cmax(s) and Cmin(s); that is, ∃Jk ∈ Lb such that Tk(1) + Tk(2) < Cmax(s) − Cmin(s) = Cb(s) − Ca(s). If we move Jk to the job list of Ma, La, the completion time C̄(s) will be shorter than Cb(s). In this case, the OM mutation operation below changes a gene in a chromosome.

OM:
Input: An individual (14) in the current population. Li and Ci(s) of each Mi ∈ M. Tj(1) and Tj(2) of each Jj ∈ J.
Output: A new individual.
Step 1. Examine Ci(s) for all the Mi, and find the machine (Ma) for which it is a minimum (Ca(s)) and find the machine (Mb) for which it is a maximum (Cb(s)).
Step 2. Calculate the difference between Cmax(s) and Cmin(s):
∆ = Cmax(s) − Cmin(s) = Cb(s) − Ca(s).
Step 3. For all the jobs in the job list, Lb, of Mb, find out if there is a job, Jk, such that Ik = b and Tk(1) + Tk(2) < ∆.
Step 4. If such a Jk exists, move Jk from Mb to Ma.

Situation 2: Assume that the job Jk ∈ J is in the job list Lb of Mb and that the job Jl ∈ J is in the job list La of Ma. In addition, assume that the difference between the lengths of Jk and Jl is positive and less than the difference between Cmax(s) and Cmin(s); that is, ∃Jk ∈ Lb and Jl ∈ La such that 0 < Tk(1) + Tk(2) − (Tl(1) + Tl(2)) < Cmax(s) − Cmin(s) = Cb(s) − Ca(s).

Exchanging Jk and Jl (or more specifically, moving Jk to the job list for Ma and moving Jl to the job list for Mb) shortens the completion time. The XM mutation operation exchanges two genes in an individual.

XM:
Input: An individual (14) in the current population. Li and Ci(s) for each Mi ∈ M. Tj(1) and Tj(2) for Jj ∈ J.
Output: A new individual.
Step 1. Examine Ci(s) for all the Mi, and find the machine (Ma) for which it is a minimum (Ca(s)) and find the machine (Mb) for which it is a maximum (Cb(s)).
Step 2. Calculate the difference between Cmax(s) and Cmin(s):
∆ = Cmax(s) − Cmin(s) = Cb(s) − Ca(s).
Step 3. For all the jobs in the job list Lb for Mb and for all the jobs in the job list La for Ma, determine if there exist Jk and Jl such that Ik = b, Il = a, and 0 < Tk(1) + Tk(2) − (Tl(1) + Tl(2)) < ∆.
Step 4. If such a Jk and Jl exist, exchange Jk and Jl.

Algorithm 3 JRGA.
Require: Let n be the number of jobs in J and m be the number of machines in M. Let Tj(1) and Tj(2) be the times needed to complete the two steps of each job Jj ∈ J. Let s be the schedule and C̄(s) be the total completion time.
1: Initialize the three control parameters: size of population (np), crossover rate (c), and mutation rate (u). Initialize all the input parameters: number of jobs (n), number of machines (m), and Tj(1) and Tj(2) for each Jj ∈ J. Initialize the number of the current generation: k = 0.
2: Randomly generate an initial population, P0, of np individuals, as described in Subsection 4.1.
3: Use Algorithm 2 to decode each individual into a complete schedule, calculate Ci(s) for each machine, and obtain Cmax(s). Evaluate the fitness of all individuals.
4: Use the selection operation to select individuals in the current population to be the parent population, Pk, for the next generation.
5: Pair the individuals in Pk, and select a certain percentage of pairs, as determined by the crossover rate, c. Use a crossover operation (either LXO or TPX) to combine the individuals in the selected pairs. This yields a hybrid population, Hk.
6: Apply a mutation operation (either OM or XM) to the population Pk, which yields a middle population, Mk.
7: Decode the offspring produced in Steps 5 and 6. Calculate Ci(s) for each machine for all the individuals and evaluate their fitness.
8: Select the fittest np individuals in Pk ∪ Hk ∪ Mk to form the next generation, Pk+1.
9: If the stop criterion is satisfied, output the optimal schedule. Otherwise, let k = k + 1 and repeat Steps 4-8.

4.7 Procedure for JRGA

The steps in the JRGA are listed in Algorithm 3. The stop criterion used to terminate a GA is usually based on the number of generations, a time limit, a fitness limit, the number of stall generations, a stall time limit, or some other limit. In this study, the stop criterion for the JRGA is the maximum number of generations, which is set to 1000 based on experience and can be adjusted to an appropriate value during simulation.

5 SIMULATIONS

The JRGA was implemented and tested in a simulated environment using MATLAB. In the simulated data center, the numbers of machines and tasks were set to mimic the real environment. Consider three types of tasks: system requests (processing time: 1000-10000 µs), application requests (processing time: 100-1000 µs), and data requests (processing time: 10-100 µs). The proportions of the three used in the simulations were approximately 15%, 35%, and 50%, respectively. These numbers are based on statistics on the data types in CDCs over a period of time. To simulate the real situation more closely, all tasks are
randomly generated within the above constraint.
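The task mix just described can be generated as in the following sketch (Python for illustration). The split of a task's processing time into its two stages is an assumption: the text fixes only the per-class time ranges and proportions, so each stage is drawn here from the class range.

```python
import random

TASK_CLASSES = [  # (proportion, low_us, high_us)
    (0.15, 1000, 10000),  # system requests
    (0.35, 100, 1000),    # application requests
    (0.50, 10, 100),      # data requests
]

def generate_tasks(count, seed=0):
    """Return `count` two-stage tasks as (t1, t2) pairs in microseconds."""
    rng = random.Random(seed)
    tasks = []
    for _ in range(count):
        u, acc = rng.random(), 0.0
        for p, lo, hi in TASK_CLASSES:  # pick a class by its proportion
            acc += p
            if u < acc:
                break
        # Assumption: each of the two stages is drawn from the class range.
        tasks.append((rng.uniform(lo, hi), rng.uniform(lo, hi)))
    return tasks

tasks = generate_tasks(100)
```

Fixing the seed makes a run repeatable, which is convenient when comparing schedulers on the same randomly generated workload.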


TABLE 2
ρ for JRGA for different parameters.

c    u    LXO & OM  LXO & XM  TPX & OM  TPX & XM
1    1    1.0131    1.0168    1.0048    1.0162
0.8  1    1.0069    1.0251    1.0041    1.0137
0.6  1    1.0192    1.0115    1.0191    1.0125
0.5  1    1.0219    1.0819    1.0256    1.0241
1    0.7  1.0175    1.0427    1.0075    1.0252
1    0.5  1.0283    1.0653    1.0110    1.0263
1    0.1  1.0715    1.0665    1.0326    1.0331
Best c & u  0.8 & 1   0.6 & 1   0.8 & 1   0.6 & 1

TABLE 3
Makespan (µs) in situation 1 (m(1) = m(2) = 2) for different numbers of tasks.

n    Topt   TJRGA  TLS    TILS
20   2261   2297   2567   2458
40   3928   3939   4445   4159
60   6115   6128   6896   6184
80   7870   7883   8847   7890
100  9629   9643   10500  9651
150  15286  15292  15934  15303

TABLE 4
Makespan (µs) for 100 tasks in situation 2 for different numbers of machines.

m  Topt  TJRGA  TLS    TILS
2  9171  9183   10141  9235
3  6567  6625   7691   6661
4  4851  5015   6026   5056
5  4234  4587   5278   4432
6  3255  3593   4477   3388
7  2802  3347   4030   2945
8  2285  2854   3447   2514
9  2327  3076   3542   2577

The performance was evaluated by means of the approximation ratio

ρ = Tc / Topt, (18)

where Tc is the calculated makespan and Topt is the theoretically optimal makespan. ρ was used to evaluate the JRGA and other algorithms. Note that, since Topt is not available, the upper bound on the optimal makespan was used instead: We used the Johnson algorithm to calculate the optimal makespan for a single machine, T̄opt, and set

Topt = T̄opt / m. (19)

5.1 Preliminary simulations

The two types of crossover operations (LXO, TPX) and the two types of mutation operations (OM, XM) can be combined in four ways, yielding four kinds of GAs. The preliminary experiments used 20 randomly generated tasks and 2 machines for each stage to find the best combination of c and u. Four values of c (1.0, 0.8, 0.6, 0.5) and four values of u (1.0, 0.7, 0.5, 0.1) were tested. The values of c were taken from [39], and the values of u were set to a higher level than usual to explore the potential of the new mutation operations to improve fitness.

A comparison of ρ for the four kinds of JRGA (Table 2) shows that c and u have a great influence on makespan. Note that, for all the parameters, ρ was very close to 1 for all four kinds of JRGA, and that its maximum was only 1.0715. This means that the makespan for the JRGA is almost equal to the theoretical optimum.

Among the four kinds of JRGA, the best was the one with the crossover operation TPX and the mutation operation OM: It had the smallest ρ of 1.0041 for c = 0.8 and u = 1. In addition, the combination of TPX and OM produced satisfactory results for different parameters. For this reason, this combination was used for all further simulations.

5.2 Performance evaluation

GAs are often criticized for the high time complexity caused by the large number of generations needed to find a satisfactory result. The time required by the JRGA with TPX and OM to find the optimal solution was tested in two situations:
Situation 1. Two machines processed from 20 to 150 tasks (Fig. 4).
Situation 2. From 2 to 9 machines processed 100 tasks (Fig. 5).

Fig. 4 shows how the best and mean makespans change with each generation for situation 1 (m(1) = m(2) = 2). Convergence to the best makespan takes at most 70 generations. Moreover, convergence becomes faster as the number of tasks increases. More specifically, convergence to the best makespan takes around 70 generations for 20-40 tasks (Figs. 4(a) and 4(b)), around 50 generations for 60-80 tasks (Figs. 4(c) and 4(d)), and around 40 generations for 100-150 tasks (Figs. 4(e) and 4(f)).

The changes in best and mean makespans for 100 tasks and for different numbers of machines are shown in Fig. 5. In the best case (Fig. 5(a): m(1) = m(2) = 2), convergence to the best makespan took less than 40 generations. The search for the best solution becomes more difficult as the number of machines increases. When the number is 5, it takes a long time to find a solution: It takes 50-80 generations to reach the steady state. However, even under extremely harsh conditions (Fig. 5(h): m(1) = m(2) = 9), the JRGA yields an acceptable result in less than 140 generations.

5.3 Comparison

In this study, the conventional LS algorithm and the improved LS (ILS) [45] algorithm were also used to solve the task scheduling problem. The makespans for randomly generated tasks are shown in Tables 3 and 4 for the JRGA, the LS algorithm, and the ILS algorithm, along with the theoretically optimal schedule, which is calculated by (19). The results show that the JRGA produced the best solution for m(1) = m(2) = 2 and for different numbers of tasks varying from 20 to 150. However, Table 4 also shows that the ILS algorithm yielded better performance than the JRGA when 5 or more machines were used to process 100 tasks.

The values of ρ were compared for the JRGA, LS, and ILS. ρ decreases as the number of tasks increases (Fig. 6). In addition, the JRGA yields much lower values than the other algorithms do, with the difference from the value for an optimal schedule being less than 1%.
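The evaluation in (18) and (19) can be sketched as follows. The four jobs and the calculated makespan Tc are hypothetical; the two-stage flow-shop recurrence is the standard one for evaluating a Johnson-ordered sequence.

```python
# Approximation ratio rho = Tc / Topt, with Topt bounded via (19):
# Johnson's rule gives the optimal single-machine two-stage makespan
# T_bar_opt, and Topt is approximated by T_bar_opt / m.

def johnson_sequence(jobs):
    """Order (t1, t2) jobs by Johnson's rule."""
    g1 = sorted([j for j in jobs if j[0] <= j[1]], key=lambda j: j[0])
    g2 = sorted([j for j in jobs if j[0] > j[1]], key=lambda j: -j[1])
    return g1 + g2

def flow_shop_makespan(seq):
    """Two-stage flow-shop makespan for a fixed job order."""
    c1 = c2 = 0.0
    for t1, t2 in seq:
        c1 += t1               # stage-1 completion time
        c2 = max(c2, c1) + t2  # stage 2 waits for stage 1 and the prior job
    return c2

jobs = [(3.0, 6.0), (8.0, 2.0), (5.0, 4.0), (2.0, 7.0)]   # hypothetical
t_bar_opt = flow_shop_makespan(johnson_sequence(jobs))    # 21.0 for these jobs
m = 2                                                     # machines per stage
rho = 12.0 / (t_bar_opt / m)   # Tc = 12.0 is a hypothetical calculated makespan
```

Since T̄opt / m only bounds the true optimum, ρ computed this way can slightly overstate how far a schedule is from optimal, which is consistent with how (19) is used in the tables.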


[Fig. 4 appears here: six panels of best and mean makespan (µs) versus generations.]

Fig. 4. Changes in best and mean makespans for different numbers of tasks in situation 1 (m(1) = m(2) = 2): (a) 20 tasks; (b) 40 tasks; (c) 60 tasks; (d) 80 tasks; (e) 100 tasks; (f) 150 tasks.
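Curves like those in Fig. 4 amount to per-generation bookkeeping over the population. The sketch below records best and mean makespan around a toy keep-best-half-and-mutate loop; the loop is a stand-in for the JRGA (not the paper's operators), and the makespan bound used here simply sums job lengths Tj(1) + Tj(2) per machine.

```python
import random

def makespan_bound(assign, jobs, m):
    """Per-machine sum of job lengths t1 + t2; max over machines."""
    load = [0.0] * m
    for j, mach in enumerate(assign):
        load[mach] += jobs[j][0] + jobs[j][1]
    return max(load)

def toy_run(jobs, m, pop_size=20, gens=50, seed=1):
    """Return (best_curve, mean_curve) of makespan per generation."""
    rng = random.Random(seed)
    pop = [[rng.randrange(m) for _ in jobs] for _ in range(pop_size)]
    best_curve, mean_curve = [], []
    for _ in range(gens):
        spans = [makespan_bound(ind, jobs, m) for ind in pop]
        best_curve.append(min(spans))
        mean_curve.append(sum(spans) / pop_size)
        # Toy evolution: keep the best half, add mutated copies of it.
        ranked = [ind for _, ind in sorted(zip(spans, pop), key=lambda t: t[0])]
        survivors = ranked[: pop_size // 2]
        children = []
        for ind in survivors:
            child = ind[:]
            child[rng.randrange(len(child))] = rng.randrange(m)
            children.append(child)
        pop = survivors + children
    return best_curve, mean_curve

job_rng = random.Random(7)
jobs = [(job_rng.uniform(10, 100), job_rng.uniform(10, 100)) for _ in range(30)]
best, mean = toy_run(jobs, m=3)
```

Because the best individual always survives, the best curve is non-increasing, which is the qualitative shape of the curves in Figs. 4 and 5.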

However, as the number of machines increases, ρ also increases for the JRGA (Fig. 7). When the number is 5 or more, the ILS algorithm tends to show better performance than the JRGA; but the JRGA still yields a ρ of less than 1.1.

The ILS is an orderly allocation method in that it directly allocates a task to the machine with the smallest workload, while the JRGA is a random allocation method that allocates all the tasks randomly. For this reason, the ILS may allocate tasks to machines more uniformly when the number of tasks is relatively small. In contrast, when the number of tasks is large, the JRGA tends to find a better solution than the ILS in a smaller number of iterations, because the JRGA allocates tasks randomly and then optimizes the allocation through iterations. To verify this, a comparative experiment was designed with different ratios of the number of tasks to the number of machines. The number of tasks increased from 200 to 1000, and the number of machines varied from 2 to 9. The results show that, when the ratio of tasks to machines is relatively low, the ILS performs better; but as the ratio increases, at some point the JRGA starts yielding a shorter makespan that is closer to the optimum. In future work, more attention should be given to strengthening the performance at low ratios of tasks to machines.

5.4 Convergence Analysis

Markov chain theory has been used to analyze the convergence of a GA [43], and in this study it was used to examine the convergence of the JRGA.

Consider the scheduling optimization problem (9), which can be rewritten as

max{f(I(x)); I(x) ∈ X}, (20)


[Fig. 5 appears here: eight panels of best and mean makespan (µs) versus generations.]

Fig. 5. Changes in best and mean makespans for 100 tasks in situation 2: (a) m(1) = m(2) = 2; (b) m(1) = m(2) = 3; (c) m(1) = m(2) = 4; (d) m(1) = m(2) = 5; (e) m(1) = m(2) = 6; (f) m(1) = m(2) = 7; (g) m(1) = m(2) = 8; (h) m(1) = m(2) = 9.

where X = {I(1), I(2), · · · , I(x), · · · , I(np)} is the population, or space of individuals, which is bounded and compact. Each point, I(x), in X is an individual; and np individuals make up a population. f(I(x)) is the fitness function for I(x). The product space X^np = X × X × · · · × X is the population space. It can represent all populations, and


it contains all possible solutions.

The fitness function for a population is defined to be

F(X) = max{f(I(x)); I(x) ∈ X}. (21)

The optimal set is

Sopt = {I(x); I(x) ∈ X, f(I(x)) = fmax}, (22)

where fmax is the theoretically optimal fitness (that is, 1/Topt), and the global optimal set is

Sopt^np = {X; ∃I(x) ∈ X, I(x) ∈ Sopt}. (23)

Assume that ζ(k) is the population state of the kth generation (k = 1, 2, · · · , max). In the optimization problem, max is the maximum number of generations given in Section 4. ζc(k), ζm(k), and ζs(k) are the states after crossover, mutation, and selection, respectively. As shown in Fig. 2, ζ(k + 1) = ζs(k). ζ(k), ζc(k), ζm(k), and ζs(k) are all random variables in the population space X^np.

In the JRGA, the operations are invariant with respect to time. Thus, the transition probability function is also time-invariant. The transition probability functions for the crossover, mutation, and selection operations are

Pc(X, S) = Pr(ζc(k) ∈ S | ζ(k) = X), (24)
Pm(X, S) = Pr(ζm(k) ∈ S | ζ(k) = X), (25)
Ps(X, S) = Pr(ζs(k) ∈ S | ζc(k) = X) + Pr(ζs(k) ∈ S | ζm(k) = X), (26)

where S ⊆ X^np. So, the JRGA is modeled by a stationary Markov chain {ζ(k); k ∈ Z+} in the state space X^np with the transition probabilities

P1(X, S) = ∫ Pc(X, dx) Ps(x, S), (27)
P2(X, S) = ∫ Pm(X, dy) Ps(y, S), (28)

where P1(X, S) is the transition probability for crossover, and P2(X, S) is the transition probability for mutation. As shown in Fig. 2, the offspring produced by crossover and mutation in one generation have the same probability of participating in the selection operation, with the selected individuals being passed on to the next generation. So, the total transition probability of the JRGA is the sum of those two transition probabilities:

P(X, S) = P1(X, S) + P2(X, S) = ∫ Pc(X, dx) Ps(x, S) + ∫ Pm(X, dy) Ps(y, S). (29)

For the JRGA, we have the following theorem regarding convergence.

Theorem 1. Assume that X^np is a measurable space that is bounded and compact. If the Markov chain {ζ(k); k ∈ Z+} given by (29) is accessible and absorbing, then it converges to the global optimal set.

Proof. It has been proven that, if the global optimal set is accessible and absorbing, then a GA converges to its optimum, and the transition probability is [44]:

PGA(X, S) = ∫∫ Pc(X, dx) Pm(x, dy) Ps(y, S). (30)

In the JRGA, accessibility and absorption can be specified as follows:

Prop. 1. For any generation k, if ζ(k) ∈ Sopt^np, then ζc(k + 1) ∈ Sopt^np and ζm(k + 1) ∈ Sopt^np.
Prop. 2. For any generation k, if ζc(k) ∈ Sopt^np, then ζ(k + 1) ∈ Sopt^np.
Prop. 3. For any generation k, if ζm(k) ∈ Sopt^np, then ζ(k + 1) ∈ Sopt^np.

Applying Theorem 5 in [44] shows that both P1(X, S) and P2(X, S) converge due to the accessibility and absorption just described. So, P1(X, S) + P2(X, S) converges. That is, P(X, S) converges to the global optimum. This completes the proof.

[Fig. 6 appears here: ρ versus the number of tasks for TJRGA/Topt, TLS/Topt, and TILS/Topt.]

Fig. 6. Approximation ratio ρ in situation 1 (m(1) = m(2) = 2) for different numbers of tasks.

[Fig. 7 appears here: ρ versus the number of machines for TJRGA/Topt, TLS/Topt, and TILS/Topt.]

Fig. 7. Approximation ratio ρ for 100 tasks in situation 2 for different numbers of machines.

6 CONCLUSION

The two-stage-task scheduling problem for CDCs is quite different from other scheduling problems because both steps have to be processed by the same machine. This means that existing scheduling algorithms are not applicable to our problem. In this study, we analyzed the two-stage-task scheduling problem for CDCs and devised the JRGA to tackle the problem. This study has three main contributions:

1) The two-stage-task scheduling problem was studied, and a model of two-stage-task scheduling was established for CDCs.
2) The Johnson's rule and a GA were combined into the JRGA so as to take into account the special characteristics of multiprocessor scheduling in CDCs.


3) Two new evolutionary mutation operations were devised that have the potential to improve the fitness of individuals.

The JRGA was tested in simulations of different situations and was found to be effective. The results show that the JRGA yields a makespan that is very close to the theoretical optimum, that it converges quickly as the number of tasks increases, and that the speed of convergence decreases as the number of machines increases. A comparison of the JRGA with two other scheduling algorithms shows that the JRGA is better than the conventional LS algorithm in all cases, and that it is better than the ILS algorithm when the number of machines is small.

However, more effort is still needed to further this study, including consideration of scheduling with multiple heterogeneous machines and development of scheduling algorithms with lower complexity.

ACKNOWLEDGMENTS

This work was jointly supported by the National Natural Science Foundation of China under Grant No. 61202340, by the International Postdoctoral Exchange Fellowship Program under Grant No. 20140011, by the Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan), by the Hubei Provincial Natural Science Foundation of China under Grant No. 2015CFA010, and by the 111 Project under Grant B17040.

REFERENCES

[1] A. Wolke, M. Bichler, and T. Setzer, "Planning vs. dynamic control: Resource allocation in corporate clouds," IEEE Trans. Cloud Comput., vol. 4, no. 3, pp. 322-335, 2016.
[2] M. Żotkiewicz, M. Guzek, D. Kliazovich, and P. Bouvry, "Minimum dependencies energy-efficient scheduling in data centers," IEEE Trans. Parallel Distrib. Syst., vol. 27, no. 12, pp. 3561-3574, 2016.
[3] Y. H. Xiong, S. Z. Huang, M. Wu, Y. X. Zhang, and J. H. She, "A novel resource management method of providing operating system as a service for mobile transparent computing," The Scientific World Journal, vol. 2014, pp. 1-12, 2014.
[4] H. Topcuoglu, S. Hariri, and M. Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Trans. Parallel Distrib. Syst., vol. 13, no. 3, pp. 260-274, 2002.
[5] K. Li, "Scheduling precedence constrained tasks with reduced processor energy on multi-processor computers," IEEE Trans. Comput., vol. 61, no. 12, pp. 1668-1681, 2012.
[6] M. Hu and B. Veeravalli, "Requirement-aware scheduling of bag-of-tasks applications on grids with dynamic resilience," IEEE Trans. Comput., vol. 62, no. 10, pp. 2108-2114, 2013.
[7] S. M. Johnson, "Optimal two and three stage production schedules with setup times included," Nav. Res. Logist., vol. 1, no. 1, pp. 61-68, 1954.
[8] Y. J. An, Y. D. Kim, and S. W. Choi, "Minimizing makespan in a two-machine flowshop with a limited waiting time constraint and sequence-dependent setup times," Comput. & Oper. Res., no.
[12] T. Gonzalez and S. Sahni, "Open shop scheduling to minimize finish time," J. ACM, vol. 23, no. 4, pp. 665-679, 1976.
[13] A. Tozkapan, Ö. Kirca, and C. S. Chung, "A branch and bound algorithm to minimize the total weighted flow time for the two-stage assembly scheduling problem," Comput. & Oper. Res., vol. 30, no. 2, pp. 309-320, 2003.
[14] A. Allahverdi and F. S. Al-Anzi, "The two-stage assembly scheduling problem to minimize total completion time with setup times," Comput. & Oper. Res., vol. 36, no. 10, pp. 2740-2747, 2009.
[15] C. J. Liao, C. H. Lee, and H. C. Lee, "An efficient heuristic for a two-stage assembly scheduling problem with batch setup times to minimize makespan," Comput. & Indust. Engineering, no. 88, pp. 317-325, 2015.
[16] F. Xiong and K. Y. Xing, "Metaheuristics for the distributed two-stage assembly scheduling problem with bi-criteria of makespan and mean completion time," Int. J. of Production Research, vol. 52, no. 9, pp. 2743-2766, 2014.
[17] G. Q. Liu, K. L. Poh, and M. Xie, "Iterative list scheduling for heterogeneous computing," J. Parallel Distrib. Comput., vol. 65, no. 5, pp. 654-665, 2005.
[18] R. Xu and D. Wunsch, "Survey of clustering algorithms," IEEE Trans. Neural Networks, vol. 16, no. 3, pp. 645-678, 2005.
[19] I. Ahmad and Y. K. Kwok, "On exploiting task duplication in parallel program scheduling," IEEE Trans. Parallel Distrib. Syst., vol. 9, no. 9, pp. 872-892, 1998.
[20] H. Allaoui and A. Artiba, "Scheduling two-stage hybrid flow shop with availability constraints," Comput. & Oper. Res., vol. 33, no. 5, pp. 1399-1419, 2009.
[21] S. J. Wang and M. Liu, "Two-stage hybrid flow shop scheduling with preventive maintenance using multi-objective tabu search method," International Journal of Production Research, vol. 52, no. 5, pp. 1495-1508, 2014.
[22] M. Mirabi, S. M. T. Fatemi Ghomi, and F. Jolai, "A two-stage hybrid flowshop scheduling problem in machine breakdown condition," J. Intell. Manuf., no. 24, pp. 193-199, 2013.
[23] M. Hojati, "Minimizing make-span in 2-stage disassembly flowshop scheduling problem," Computers & Industrial Engineering, no. 94, pp. 1-5, 2016.
[24] C. Oğuz, M. Fikret Ercan, T. C. Edwin Cheng, and Y. F. Fung, "Heuristic algorithms for multiprocessor task scheduling in a two-stage hybrid flow-shop," Eur. J. Oper. Res., vol. 149, no. 2, pp. 390-403, 2003.
[25] F. Nikzad, J. Rezaeian, I. Mahdavi, and I. Rastgar, "Scheduling of multi-component products in a two-stage flexible flow shop," Applied Soft Computing, no. 32, pp. 132-143, 2015.
[26] S. A. Mansouri, E. Aktas, and U. Besikci, "Green scheduling of a two-machine flowshop: Trade-off between makespan and energy consumption," Eur. J. of Operational Research, no. 248, pp. 772-788, 2016.
[27] F. Pezzella, G. Morganti, and G. Ciaschetti, "A genetic algorithm for the flexible job-shop scheduling problem," Comput. & Oper. Res., vol. 35, no. 10, pp. 3202-3212, 2008.
[28] G. Zhang, L. Gao, and Y. Shi, "An effective genetic algorithm for the flexible job-shop scheduling problem," Expert Syst. Appl., vol. 38, no. 4, pp. 3563-3573, 2011.
[29] M. I. Daoud and N. Kharma, "A hybrid heuristic-genetic algorithm for task scheduling in heterogeneous processor networks," J. Parallel Distrib. Comput., vol. 71, no. 11, pp. 1518-1531, 2011.
[30] C. Oğuz and M. F. Ercan, "A genetic algorithm for hybrid flow-shop scheduling with multiprocessor tasks," J. Scheduling, vol. 8, no. 4, pp. 323-351, 2005.
[31] Z. H. Zhan, X. F. Liu, Y. J. Gong, and J. Zhang, "Cloud computing resource scheduling and a survey of its evolutionary approaches," ACM Computing Surveys, vol. 47, no. 4, pp. 1-33, 2015.
[32] S. H. Jang, T. Y. Kim, J. K. Kim, and J. S. Lee, "The study of genetic
71, pp. 127C136, 2016. algorithm-based task scheduling for cloud computing,” Int. J. of
[9] T.C.E. Cheng, C. C. Wu, J. C. Chen, W. H. Wu, and S. R. Cheng, Control and Automation, vol. 5, no. 4, pp. 157-162, 2012.
“Two-machine flowshop scheduling with a truncated learning [33] J. T. Tsai, J. C. Fang and J. H. Chou, “Optimized task scheduling
function to minimize the makespan” Int. J. Production Economics, and resource allocation on cloud computing environment using
no. 141, pp. 79-86, 2013. improved differential evolution algorithm,” Computers & Opera-
[10] C. J. Hsu, W. H. Kuo , D. L. Yang and M. S. Chern, “Minimizing tions Research, vol. 40, no. 12, pp. 3045-3055, 2013.
the makespan in a two-stage flowshop scheduling problem with a [34] S. Kaur and A. Verma, “An efficient approach to genetic algorithm
function constraint on alternative machines,” J. Mar. Sci. Technol., for task scheduling in cloud computing environment,” Int. J. of
vol. 14, no. 14, pp. 213-217, 2006. Information Technology and Computer Science, vol. 4, no. 10, pp. 74-
[11] J. Dong, J. Hu, and Y. Chen, “Minimizing makespan in a two- 79, 2012.
stage hybrid flow shop scheduling problem with open shop in [35] S. Bilgaiyan, S. Sagnika and M. Das, “An Analysis of Task Schedul-
one stage,” J. Chinese Univ., vol. 28, no. 3, pp. 358-368, 2013. ing in Cloud Computing using Evolutionary and Swarm-based

2168-7161 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCC.2017.2693187, IEEE
Transactions on Cloud Computing
JOURNAL OF LATEX CLASS FILES, VOL. 13, NO. 9, SEPTEMBER 2014 14

Algorithms,” International Journal of Computer Applications, vol. 89, Min Wu (SM’08) received his B.S. and M.S.
no. 2, pp. 11-18, 2014. degrees in engineering from Central South U-
[36] J. Liu, X. G. Luo, X. M. Zhang and F. Zhang, “Job scheduling model niversity, Changsha, China, in 1983 and 1986,
for cloud computing based on multi-objective genetic algorithm,” respectively, and his Ph.D. degree in engineering
International Journal of Computer Science Issues, vol. 10, no. 1, pp. from the Tokyo Institute of Technology, Tokyo,
134-139, 2013. Japan, in 1999. He was a faculty member of the
[37] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph and R. Katz, School of Information Science and Engineering
“A view of cloud computingłclearing the clouds away from the at Central South University, attaining the position
true potential and obstacles posed by this computing capabilities, of full professor, from 1986 to 2014. In 2014, he
Communications of the ACM, vol. 53, no. 4, pp. 50-58, 2010. moved to the China University of Geosciences,
[38] J. N. D. Gupta, “Two-stage, hybrid flowshop scheduling problem,” Wuhan, China, where he is a professor in the
J. Oper. Res. Society, vol. 39, no. 4, pp. 359-364, 1988. School of Automation. He was a visiting scholar with the Department of
[39] S. N. Sivanandam and S. N. Deepa, “Genetic Algorithm Optimiza- Electrical Engineering, Tohoku University, Sendai, Japan, from 1989 to
tion Problems,” Springer Berlin Heidelberg, 2008. 1990, and a visiting research scholar with the Department of Control
[40] J. H. Holland, Adaption in natural and artificial systems, University and Systems Engineering, Tokyo Institute of Technology, from 1996
of Michigan Press, Ann Arbor, 1975. to 1999. He was a visiting professor at the School of Mechanical,
[41] M. Hekmatfar, S. M. T. F. Ghomi, and B.Karimi, “Two stage Materials, Manufacturing Engineering and Management, University of
reentrant hybrid flow shop with setup times and the criterion of Nottingham, Nottingham, UK, from 2001 to 2002. His current research
minimizing makespan,” Applied Soft Computing, vol.11, no. 8, pp. interests include robust control and its applications, process control, and
4530-4539, 2011. intelligent control.
[42] A. Bellanger, S. Hanafi, and C. Wilbaut,“Three-stage hybrid- Dr. Wu received the IFAC Control Engineering Practice Prize Paper
flowshop model for cross-docking,” Computers & Operations Re- Award in 1999 (together with M. Nakano and J. She).
search, vol. 40, no. 4, pp. 1109-1121, 2013.
[43] J. A. Lozano, P. Larrañaga, M. Graña and F.X. Albizuri, “Genetic
algorithms: bridging the convergence gap,” Theor. Comput. Sci.,
vol. 229, no. 1, pp. 11-22, 1999.
[44] J. He and L. Kang, “On the convergence rates of genetic algorithm- Jinhua She (M’94-SM’08) received his B.S. de-
s,” Theor. Comput. Sci., vol. 229, no. 1, pp. 23-39, 1999. gree in engineering from Central South Universi-
[45] J. Ren, Y.X. Zhang and J.E. Chen, “Analysis on the Scheduling ty, Changsha, China in 1983, and his M.S. and
Problem in Transparent Computing,”Proc. IEEE Int. Conf. High Ph.D. degrees in engineering from the Tokyo
Performance Computing and Communications & Int. Conf. Embedded Institute of Technology, Tokyo, Japan in 1990
and Ubiquitous Computing (HPCC EUC 13), Zhangjiajie, China, and 1993, respectively. In 1993, he joined the
2013, pp.1832-1837. School of Engineering, Tokyo University of Tech-
nology, Tokyo, where he is currently a professor.
His research interests include the application of
control theory, repetitive control, process con-
trol, Internet-based engineering education, and
robotics.
Dr. She is a member of the Society of Instrument and Control Engi-
neers, the Institute of Electrical Engineers of Japan, the Japan Societyof
Mechanical Engineers, and the Asian Control Association. He was the
Yonghua Xiong received his M.S. and Ph.D. recipient of the International Federation of Automatic Control (IFAC)
degrees in engineering from Central South U- Control Engineering Practice Prize Paper Award in 1999 (jointly with M.
niversity, Changsha, China in 2004 and 2009. Wu and M. Nakano).
He was a visiting scholar with the Department
of Computer Science, City University of Hong
Kong, Hong Kong, from 2006 to 2008. He was
a lecturer and then an associate professor of the
School of Information Science and Engineering,
Central South University from January 2005 to Keyuan Jiang (SM’98) received his B.S. degree
August 2014. He joined the staff of the China in computer science from Southeast University,
University of Geosciences in September 2014, Nanjing, China in 1982, his M.S. in biomedical
where he is currently a professor of the School of Automation. He engineering from Shanghai JiaoTong University,
has been in the Department of Computer Information Technology & Shanghai, China in 1985, and his Ph.D. degrees
Graphics, Purdue University Northwest, USA, as a post-doctor since in biomedical engineering from Vanderbilt Uni-
January 2015. His research interests include computational intelligence, versity, Nashville, USA, respectively. He joined
scheduling algorithms, and cloud computing. the Department of Computer Information Tech-
nology & Graphics of Purdue University North-
west, where he is currently a professor and the
Department Head. His research expands from
bioinformatics, mining healthcare social media, and clinical research
and healthcare information technology. Dr. Jiang is a senior member
of the IEEE Computer Society and the IEEE Engineering in Medicine
and Biology Society. He is also a member of ACM.
Suzhen Huang received her B.S. and M.S. de-
gree in engineering from Central South Univer-
sity, Changsha, China, in 2012 and 2015, re-
spectively. Her research interests include cloud
computing and scheduling algorithms.
2168-7161 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.