Accepted Manuscript: 10.1016/j.osn.2018.06.001

Traffic prediction based on machine learning for elastic optical networks
Michal Aibin
PII: S1573-4277(17)30190-X
DOI: 10.1016/j.osn.2018.06.001
Reference: OSN 486
To appear in: Optical Switching and Networking
Received Date: 28 September 2017

Revised Date: 4 May 2018
Accepted Date: 1 June 2018
Please cite this article as: M. Aibin, Traffic prediction based on machine learning for elastic optical
networks, Optical Switching and Networking (2018), doi: 10.1016/j.osn.2018.06.001.
Traffic prediction based on machine learning for elastic optical networks
Michal Aibin∗
British Columbia Institute of Technology
Department of Computing, Vancouver, BC, Canada
Abstract
The increased data transfers and rapidly evolving cloud services lead to an inevitable need for new techniques applied to communication networks, such as AI, machine learning, and data analysis. In this paper, we present two approaches that employ machine learning techniques to enable traffic prediction in Elastic Optical Networks. Results show that the application of adaptive strategies has superior performance, which is a future opportunity for telecommunication operators to improve the efficiency of their network architectures.
Keywords: elastic optical networks, dynamic routing, cloud services, machine learning, traffic prediction
1. Introduction
Over the last two decades, optical networks have gone through a rapid evolution, from 16 wavelengths of 2.5 Gb/s in the late 1990s to 80 wavelengths of 100 Gb/s in 2012 [1, 2]. Today, the term optical networks denotes high-capacity telecommunications networks based on optical technologies and components that can provide capacity, provisioning, routing, grooming, and/or restoration at the wavelength level. With estimated exponential traffic growth, future networks have to boost their capacity. The channel capacity will need to be increased beyond 100 Gb/s per channel, with an increase of spectral efficiency.

∗ Corresponding author
Email address: maibin@bcit.ca (Michal Aibin)
Preprint submitted to Journal of Optical Switching and Networking, May 4, 2018

The backbone transport technique in today's optical networks is Wavelength Division Multiplexing (WDM). The main idea underlying WDM networks is to connect end users in the optical layer through all-optical WDM channels, which are called lightpaths [3]. A connection in a wavelength-routed WDM network is supported by a lightpath, which may span multiple fiber links. When there are no wavelength converters, a lightpath must occupy the same wavelength on all the fiber links it traverses, due to the wavelength-continuity constraint. Despite all the benefits of conventional WDM networks, their biggest problem is low bandwidth efficiency due to a fixed granularity [4].
Aiming to break the fixed-grid spectrum allocation limit of conventional WDM networks, a novel spectrum-efficient and scalable optical transport network architecture, called Elastic Optical Networks (EONs), has been introduced. The idea underlying EONs is to allocate appropriately sized optical bandwidth to an end-to-end optical path, in contrast to the fixed-size optical bandwidth allocation in WDM. Unlike the rigid bandwidth in WDM, an optical path in EONs expands according to the traffic volume [5, 6, 7].
Moreover, the growing popularity of cloud and content-oriented services has led to increased demand for data transfers. It is inevitable that current solutions will need to be upgraded or changed shortly. Cloud data centers (DCs) are no longer a novelty; they have become a standard resource used by many companies. Usage is measured in virtual resources and billed per hour of use [8].

One of the key challenges in increasing the efficiency of cloud computing is to predict the bandwidth requirement in the next control time interval based on the online measurement of traffic characteristics. By using machine learning methods, the goal is to forecast future traffic rate variations as precisely as possible, based on the measured history. In this paper, we propose a Monte Carlo Tree Search (MCTS) algorithm [9] as a mechanism for traffic prediction in cloud data center networks. Monte Carlo Tree Search is used to identify the best combination of cloud data centers and candidate path pairs for provisioning services related to specific requests. It builds a sparse search tree and selects actions using Monte Carlo sampling. These actions are used to deepen the tree in the most promising direction [10]. We then compare our results with those achieved by an Artificial Neural Network trained on a dataset with modeled data from the last weeks.
The main contribution is the evaluation of the benefits of the traffic prediction mechanisms using a specific provider-centric use case. To efficiently test the traffic prediction mechanisms, we use various dynamic routing algorithms in Wide Area Networks. The algorithms proposed in this paper do not depend on a particular implementation and, therefore, apply to other frameworks. Moreover, experiments demonstrate that the proposed methods for traffic prediction have superior performance when applied to standard heuristic algorithms used in Elastic Optical Networks and can potentially become a new direction for the optimization of optical network performance.
To the best of our knowledge, this is the first paper that introduces Monte Carlo Tree Search as a method of traffic prediction for Elastic Optical Networks. In related work, machine learning techniques for the optimization of communication networks are mostly used for classification of IP traffic [11, 12, 13] or network intrusion/anomaly detection [14, 15, 16, 17, 18, 19, 20, 21, 22]. In this paper, the Monte Carlo Tree Search algorithm [9] is adapted to the DC resource allocation problem. The first tutorial on using game theory in communication networks was presented in [23]. Furthermore, [24, 25] focus on the implementation of the Monte Carlo Tree Search algorithm for deflection routing in complex networks. However, the topic of probabilistic routing/prediction in optical networks is still not widely discussed in the literature, and there is a need for further study. For details and possible further research directions, we refer to [26].
The remainder of the paper is organized as follows. In Section 2, we introduce the optimization problem and describe the network model. Section 3 contains the information about the traffic prediction mechanisms used in the paper. In Section 4, we present the simulation setup and results, and finally, Section 5 concludes the work.
2. Problem description
2.1. Network model
We use similar notation as in [27]. The optical network is modeled as a graph G(V, E, B, L), where V denotes a set of vertices (nodes), E is a set of directed edges (fiber links), each fiber link can accommodate at most |B| frequency slices, and L = [l(1), l(2), ..., l(|E|)] represents link lengths for each e ∈ E. There are |R| cloud data centers located at nodes of the network. Their locations are provided by Amazon Web Services. DCs are characterized by five main parameters: the number of computational units (CPU units), the amount of memory (RAM units), the amount of disk space (storage units), the cost of use (expressed in USD/hour), and the location (which is used to calculate the distance between the client and the DC). Depending on the network, data centers are centralized (Euro28 network) or spread across the two coasts of the continent (US26 network).
A set of |D| traffic requests is created dynamically during the simulations. It contains multicast, unicast, and anycast requests. Furthermore, we assume that all DCs provide the same requested service. Each request d between client nodes may be of unicast or multicast type, whereas requests to and from DCs are of anycast type. Each anycast request may be assigned to any of the DCs. The downstream and upstream anycast requests are referred to as associated.
It is also assumed that various modulation formats may be used in the EONs. Higher-order modulation formats achieve higher spectral efficiency, leading to lower spectrum demand, at the cost of a shorter transmission distance. On the other hand, less spectrally efficient modulation formats can transmit over longer distances. Since regenerators in optical networks are costly, the selection of modulation formats is made to minimize the number of regenerators placed in the network. The spectrum requirement for a particular request is determined according to a distance-adaptive transmission (DAT) rule [28]. We use the physical model of EONs as in [29] and the transmission model proposed in [30], which estimates the transmission distance as a function of the modulation level and the transported bit-rate. Moreover, a 12.5 GHz guard band between neighboring connections is introduced. For all considered modulation formats, the transmission reach is extended by using regenerators, which are applied whenever necessary.
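As an illustration of the distance-adaptive rule described above, the following minimal sketch picks the most spectrally efficient modulation format whose reach covers the path and converts the requested bit-rate into a number of slices. The reach limits, the bits-per-symbol mapping, and the idealized assumption that 1 GHz carries as many Gb/s as the format has bits per symbol are hypothetical placeholders, not the values of [28] or [30]; only the 12.5 GHz slice width and guard band are taken from the text.

```python
import math

SLICE_GHZ = 12.5       # slice width used in the paper
GUARD_BAND_GHZ = 12.5  # guard band between neighboring connections

# HYPOTHETICAL (name, bits per symbol, reach in km) entries, most efficient first.
# The actual reach values would come from the transmission model in [30].
MODULATIONS = [
    ("64-QAM", 6, 250),
    ("16-QAM", 4, 1000),
    ("QPSK", 2, 2500),
    ("BPSK", 1, 5000),
]


def pick_modulation(path_km):
    """Return the most spectrally efficient format whose reach covers the path."""
    for name, bits, reach_km in MODULATIONS:
        if path_km <= reach_km:
            return name, bits
    # Path longer than any reach: use the most robust format (regenerators assumed).
    name, bits, _ = MODULATIONS[-1]
    return name, bits


def required_slices(bitrate_gbps, path_km):
    """Number of 12.5 GHz slices for a request, guard band included (idealized)."""
    _, bits = pick_modulation(path_km)
    bandwidth_ghz = bitrate_gbps / bits + GUARD_BAND_GHZ
    return math.ceil(bandwidth_ghz / SLICE_GHZ)


short_path = required_slices(100, 200)   # 64-QAM on a short path
long_path = required_slices(100, 2000)   # QPSK on a longer path
```

With these placeholder values, the same 100 Gb/s request needs fewer slices on the short path than on the long one, which is the trade-off between spectral efficiency and reach that the DAT rule exploits.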
To handle client requests in the network, in addition to solving the Routing, Modulation and Spectrum Assignment (RMSA) problem [31] in the optical network, one has to choose the best DC. To establish flows, one also needs to find routes for allocating requests in the network. Therefore, along with the problem of flow optimization, there is the problem of selecting candidate routing paths. The algorithms calculate a set of candidate paths P that includes precisely k paths.
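The candidate path set can be sketched as follows. The paper does not state which k-shortest-path method is used, so this example assumes a brute-force enumeration of loop-free paths sorted by length, which is only practical for small graphs; the function name and the toy topology are hypothetical.

```python
def k_candidate_paths(links, source, target, k):
    """Return up to k loop-free paths from source to target, shortest first.

    links maps (u, v) -> length in km for each directed fiber link.
    Each result is a (total_length, [nodes]) pair.
    """
    adjacency = {}
    for (u, v), length in links.items():
        adjacency.setdefault(u, []).append((v, length))

    found = []

    def dfs(node, visited, path, total):
        if node == target:
            found.append((total, list(path)))
            return
        for nxt, length in adjacency.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                dfs(nxt, visited, path, total + length)
                path.pop()
                visited.remove(nxt)

    dfs(source, {source}, [source], 0)
    found.sort(key=lambda item: (item[0], item[1]))
    return found[:k]


# Toy directed topology with hypothetical link lengths in km.
links = {
    ("A", "B"): 100,
    ("B", "C"): 100,
    ("A", "C"): 300,
    ("A", "D"): 50,
    ("D", "C"): 400,
}
paths = k_candidate_paths(links, "A", "C", k=2)
```

For anycast requests, the same routine would be called once per DC node, so that each (DC, candidate path) pair can be evaluated.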
2.2. Objective Function
Below is the ILP model used in our simulations. In particular, we address one objective function: the minimization of the average request blocking percentage, defined as the percentage of rejected requests for data center resources relative to all requests offered to the network (1).
Request Blocking ILP model

Sets
  B            slices
  D            all requests offered to the network
  D_bl         blocked requests
  D_any        anycast requests (upstream and downstream)
  D_any(DS)    anycast downstream requests
  D_uni        unicast requests
  D_multi      multicast requests
  C(d, p)      candidate channels for request d ∈ D on path p
  P(d)         candidate paths for request d ∈ D:
               if d ∈ D_uni, the candidate path connects the end nodes of the request;
               if d ∈ D_any (upstream), path p connects the client and the DC nodes;
               if d ∈ D_any (downstream), path p connects the DC and the client nodes;
               if d ∈ D_multi, the candidate path connects the end nodes of the multicast tree request
  E            network links

Constants
  δ_edp        1, if link e belongs to path p realizing request d; 0, otherwise
  n_dp         requested number of slices for request d on path p
  γ_dpcb       1, if channel c associated with request d on path p uses slice b; 0, otherwise
  τ(d)         index of the request associated with request d:
               if d is a downstream request, then τ(d) must be an upstream connection;
               if d is an upstream request, then τ(d) must be a downstream connection
  s(p)         source node of path p
  t(p)         destination node of path p

Variables
  x_dpc        1, if channel c on candidate path p is used to realize request d; 0, otherwise
  y_eb         1, if slice b is occupied on link e; 0, otherwise

Objective

$$\min \; \frac{|D_{bl}|}{|D|} \times 100\% \qquad (1)$$

Subject to

$$\sum_{p \in P(d)} \sum_{c \in C(d,p)} x_{dpc} = 1, \quad d \in D \qquad (2)$$

$$\sum_{d \in D} \sum_{p \in P(d)} \sum_{c \in C(d,p)} \gamma_{dpcb} \, \delta_{edp} \, x_{dpc} \le y_{eb}, \quad e \in E, \; b \in B \qquad (3)$$

$$\sum_{p \in P(d)} \sum_{c \in C(d,p)} x_{dpc} \, s(p) = \sum_{p \in P(\tau(d))} \sum_{c \in C(\tau(d),p)} x_{\tau(d)pc} \, t(p), \quad d \in D_{any(DS)} \qquad (4)$$

We take a similar approach as in [32] for formulating the ILP problem. Equation (2) ensures that for each request d precisely one candidate path and one candidate channel are selected. Next, constraint (3) guarantees that a slice on a particular link can be allocated to at most one lightpath. Moreover, constraint (4) ensures that both associated anycast requests use candidate paths connected to the same DC node.
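To make constraint (3) concrete, the sketch below keeps a per-link slice occupancy table (the role of y_eb) and reserves a channel only if all of its slices are free on every link of the path. The first-fit search over contiguous slices is a common heuristic swapped in here for illustration; it is an assumption and not the ILP itself or any of the RMSA algorithms evaluated in the paper.

```python
def first_fit(occupancy, path_links, n_slices):
    """Lowest start index of n_slices contiguous slices free on every link.

    occupancy maps link -> list of booleans (True = slice occupied).
    Returns None when no such window exists, i.e. the request is blocked.
    """
    total = len(occupancy[path_links[0]])
    for start in range(total - n_slices + 1):
        window = range(start, start + n_slices)
        if all(not occupancy[e][b] for e in path_links for b in window):
            return start
    return None


def allocate(occupancy, path_links, n_slices):
    """Reserve a channel so that each slice on each link serves at most one lightpath."""
    start = first_fit(occupancy, path_links, n_slices)
    if start is None:
        return None
    for e in path_links:
        for b in range(start, start + n_slices):
            occupancy[e][b] = True
    return start


# Two hypothetical links with 8 slices each.
occupancy = {"e1": [False] * 8, "e2": [False] * 8}
first = allocate(occupancy, ["e1", "e2"], 3)   # slices 0-2 on both links
second = allocate(occupancy, ["e2"], 4)        # slices 3-6 on e2 only
third = allocate(occupancy, ["e1", "e2"], 2)   # no common free window is left
```

The third request is rejected because no window of two slices is free on both links at once, and such rejections are what the objective (1) counts.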
3. Traffic Prediction Mechanisms
3.1. Monte Carlo Tree Search

The Monte Carlo Tree Search (MCTS) algorithm is implemented to enable traffic prediction in the network. In this approach, the DC requests are processed in batches. If the decision-making agent has access to a generative model of the system that is capable of generating samples of successor states ζ′ and rewards ι given a state ζ and an action a, the model may be used to perform a sampling-based look-ahead search for rewarding actions [33].
The nodes and edges of the search tree correspond to states and actions, respectively. The root of the tree corresponds to the initial state ζ_0. Let |A_ζ| be the number of available actions at a given state ζ. The search tree node that corresponds to this state has |A_ζ| child nodes, each corresponding to a possible next state ζ′ that results from selecting an action a ∈ A_ζ. Each tree node stores a value κ and a visit count σ. A path from the root to a leaf node defines an action policy π.
The higher the value of κ, the better the quality of the decision. The value κ is calculated using the trade-off between the cost of service and the request blocking percentage. The visit count σ indicates how often a particular pair of path and DC was chosen in the prediction. Frequently choosing the same DC and path pair leads to its higher utilization. The σ parameter balances the choices by lowering the reward of DC and path pairs that are overused. The total reward is the sum of scores, calculated from the root to the terminal leaf node, using equation (5). The goal is to choose the decision that leads to the highest reward.
The Monte Carlo Tree Search procedure is as follows. In the beginning, the MCTS tree consists only of the root node. The phases described next are then executed until a predefined computational budget β is used. In simple terms, β indicates the number of search tree levels that are going to be created.
The following phases of the MCTS algorithm can be distinguished. The first is selection, where the tree is traversed from the root until a non-terminal leaf node is reached. At each step of the traversal, a child node is selected based on a selection strategy, which may be exploratory or exploitative. In this paper, the Single-Player Upper Confidence Bounds for Trees (UCT) [34] is used as the selection strategy. Let θ denote the visit count of the current node of the search tree and Ψ the set of all its children. Furthermore, let κ_ψ and σ_ψ denote the value and visit count of the node with index ψ. UCT selects a child χ according to:

$$\chi = \arg\max_{\psi \in \Psi} \left( \frac{\kappa_\psi}{\sigma_\psi} + EX \sqrt{\frac{\ln \theta}{\sigma_\psi}} + \sqrt{\frac{\sum_{\psi \in \Psi} \iota_\psi^2 - \sigma_\psi \kappa_\psi + LPC}{\sigma_\psi}} \right) \qquad (5)$$

where ι_ψ^2 is the sum of the squared rewards that the ψ-th child node has received so far, EX is the constant that weights the exploration term, and LPC is a large positive constant. Next, in the expansion phase, one or more successors of the selected node are added to the tree. The new node corresponds to the next state of the prediction [35]. Finally, after the simulation time limit or a terminal state is reached, a reward is calculated. This reward is then propagated from the terminal node to the root to calculate the quality of the final solution.
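The selection rule can be sketched in code as follows. The function mirrors the three terms of equation (5) as printed in the manuscript (average value, exploration bonus, and deviation term). The values of the constants `ex` and `lpc`, the dictionary layout of a child node, and the convention of visiting unvisited children first are assumptions made for this illustration.

```python
import math


def uct_score(kappa, sigma, theta, sum_sq_rewards, ex=0.5, lpc=10_000.0):
    """Score of one child, following the three terms of equation (5).

    kappa: value of the child, sigma: its visit count (> 0),
    theta: visit count of the parent, sum_sq_rewards: squared-reward sum.
    ex and lpc are HYPOTHETICAL constants chosen for this example.
    """
    exploitation = kappa / sigma
    exploration = ex * math.sqrt(math.log(theta) / sigma)
    # The large positive constant keeps the radicand positive; max() is a safeguard.
    radicand = max(sum_sq_rewards - sigma * kappa + lpc, 0.0)
    deviation = math.sqrt(radicand / sigma)
    return exploitation + exploration + deviation


def select_child(children, theta):
    """Return the index of the child with the highest score.

    children: list of dicts with keys 'kappa', 'sigma', 'sum_sq' (assumed layout).
    Unvisited children get an infinite score so that they are tried first.
    """
    def score(child):
        if child["sigma"] == 0:
            return math.inf
        return uct_score(child["kappa"], child["sigma"], theta, child["sum_sq"])

    return max(range(len(children)), key=lambda i: score(children[i]))


children = [
    {"kappa": 5.0, "sigma": 5, "sum_sq": 5.0},  # often chosen (DC, path) pair
    {"kappa": 1.0, "sigma": 1, "sum_sq": 1.0},  # rarely chosen pair
]
chosen = select_child(children, theta=6)
```

Both children have the same average value here, so the rarely visited pair wins on the exploration and deviation terms. This matches the stated role of σ in balancing choices away from overused DC and path pairs.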

A search tree is first constructed where the root corresponds to the current
DC and the optical resource utilization in the network. The root has |R| × k
175 children for each (DC, candidate path) pair available for serving the current DC
request. Monte Carlo simulations are executed using the current distribution
8
ACCEPTED MANUSCRIPT
of the DC requests to deepen the search tree up to β levels. In this paper, the
five selection cycles for each request set was established as the computational
PT
budget β. When a leaf node at depth β is reached, its value is calculated as the
180 sum of all utilization scores of DCs and optical links in the network. The (DC,
candidate path) pair that corresponds to the root’s child with the highest value
RI
is then selected for serving the current request. The runtime of the algorithm
can be computed as O(|Aζ | × β), where |Aζ | is the number of random children
SC
to consider per search, and β is the computational budget.
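The decision step can be illustrated with a simplified sketch. It enumerates the |R| × k root children and scores each one by averaging sampled look-ahead values, which is a flat Monte Carlo evaluation rather than the full tree search described above. The `simulate` callback stands in for the generative model, and the load figures and noise model are hypothetical.

```python
import itertools
import random


def choose_pair(dcs, paths, simulate, beta=5, rollouts=20, seed=0):
    """Pick the (DC, path) root child with the highest average simulated value.

    simulate(pair, depth, rng) is a HYPOTHETICAL generative model returning the
    value of serving the request with `pair` and looking `depth` levels ahead.
    Returns the best pair and the number of root children that were evaluated.
    """
    rng = random.Random(seed)
    children = list(itertools.product(dcs, paths))  # |R| x k root children
    best_pair, best_value = None, float("-inf")
    for pair in children:
        value = sum(simulate(pair, beta, rng) for _ in range(rollouts)) / rollouts
        if value > best_value:
            best_pair, best_value = pair, value
    return best_pair, len(children)


# Hypothetical utilization levels in [0, 1]; lower load means more free capacity.
dc_load = {"DC1": 0.9, "DC2": 0.2}
path_load = {"p1": 0.5, "p2": 0.3}


def toy_simulate(pair, depth, rng):
    """Reward free capacity; small noise stands in for sampled future requests."""
    dc, path = pair
    noise = sum(rng.uniform(-0.01, 0.01) for _ in range(depth))
    return (1 - dc_load[dc]) + (1 - path_load[path]) + noise


best, n_children = choose_pair(["DC1", "DC2"], ["p1", "p2"], toy_simulate)
```

The cost of this sketch grows with the number of root children times the look-ahead depth, in line with the O(|A_ζ| × β) runtime stated above.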
3.2. Artificial Neural Network
In the second approach, we use an Artificial Neural Network (ANN), as in [36], to predict traffic changes in a non-supervised manner. The size of an ANN depends on the number of inputs, hidden layers, and neurons. We consider ANN models with x inputs, y neurons in a single hidden layer, and a single output. Consequently, y(x + 1) coefficients need to be found to specify every ANN. Our ANN was trained in three phases. First, we perform input data preprocessing, using normalization, to ensure that all the inputs are in a comparable range [-1, 1]. Next, we select the significant inputs, which in our case are link utilization, data center utilization, and information on how particular routing decisions made in previous (historical) iterations affected the network utilization. Finally, we perform dimensioning of the hidden layer. The tuning process yielded the following parameter values: x = 3, y = 5. The whole procedure is self-learning to improve its efficiency. The weights of the neurons were updated during training using the backpropagation technique. For more details, we refer to [36].
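The structure described above can be sketched as a forward pass with x = 3 inputs, y = 5 hidden neurons, and a single output. The tanh activation, the min-max normalization formula, and all weight values are assumptions for illustration; the paper does not specify them, and the weights below are placeholders rather than trained values.

```python
import math


def normalize(value, lo, hi):
    """Min-max scale a value from [lo, hi] to the range [-1, 1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Single hidden layer with tanh units (assumed) and one linear output."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


X, Y = 3, 5  # inputs and hidden neurons reported in the text

# Placeholder coefficients, NOT trained values.
hidden_weights = [[0.1 * (i + 1)] * X for i in range(Y)]
hidden_biases = [0.0] * Y
output_weights = [0.2] * Y

# Hidden-layer coefficients: X weights plus one bias per neuron, i.e. y(x + 1).
n_hidden_coefficients = sum(len(w) for w in hidden_weights) + len(hidden_biases)

# Inputs: link utilization, DC utilization, effect of past routing decisions.
features = [normalize(v, 0.0, 1.0) for v in (0.75, 0.5, 0.25)]
prediction = forward(features, hidden_weights, hidden_biases, output_weights, 0.0)
```

With x = 3 and y = 5, the hidden layer has 5 × (3 + 1) = 20 coefficients, which matches the y(x + 1) count given in the text.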

The runtime complexity of a trained ANN depends on the number of neurons. Since the output functions of neurons are relatively simple to calculate, we assume their cost is constant per neuron. The number of neurons in our optimization problem is also a constant. Thus, the overall complexity is equal to O(n^2). Finally, the backpropagation during the training procedure is linear with regard to the number of training samples. Therefore, the operation of updating the weights for a single sample is constant.
3.3. RMSA Algorithms
To assess the proposed traffic prediction mechanisms, we implemented three RMSA algorithms from the literature and used them with and without traffic prediction enabled: AMRA [37], MNC [38], and the Genetic Algorithm (GA) proposed in [39]. Moreover, for low traffic loads, we obtained the optimal results using the IBM CPLEX Solver, and for moderate and high traffic loads, we implemented the approach of choosing the data center nearest to the traffic origin (SPF algorithm).
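The nearest-DC baseline can be sketched as follows, assuming that "nearest" means the smallest shortest-path distance over link lengths and using Dijkstra's algorithm to compute it. The topology, node names, and function names are hypothetical, and the paper does not give the details of its SPF implementation.

```python
import heapq


def shortest_distances(links, source):
    """Dijkstra over directed links {(u, v): length_km}; returns node -> distance."""
    adjacency = {}
    for (u, v), length in links.items():
        adjacency.setdefault(u, []).append((v, length))
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, length in adjacency.get(node, []):
            nd = d + length
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist


def nearest_dc(links, client, dc_nodes):
    """Return the reachable DC node closest to the client, or None."""
    dist = shortest_distances(links, client)
    reachable = [dc for dc in dc_nodes if dc in dist]
    if not reachable:
        return None
    return min(reachable, key=lambda dc: dist[dc])


# Toy directed topology with hypothetical link lengths in km.
links = {("A", "B"): 100, ("B", "C"): 100, ("A", "D"): 250, ("D", "C"): 10}
choice = nearest_dc(links, "A", ["C", "D"])
```

Such a baseline ignores DC load and price, which is why it serves as a reference point for the prediction-based methods.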
4. Results discussion
4.1. Simulation Scenario

We consider the Euro28 network (28 nodes, 82 unidirectional links, and 7 DCs) and the US26 network (26 nodes, 84 unidirectional links, and 10 DCs), shown in Fig. 1. The locations of DCs, interconnection points, and submarine cable landing stations are obtained from the Data Center Map website [40]. In each DC location, ten m3.2xlarge AWS EC2 instances are available. The pricing model for DC resources shown in Table 1 is based on AWS.

Table 1: Prices of DC services provided by AWS.

Region          Number of DC locations   Price per hour (USD)
US-WEST         5                        0.431
US-CENTRAL      2                        0.474
US-EAST         3                        0.442
EURO-WEST       3                        0.585
EURO-CENTRAL    3                        0.499
EURO-EAST       1                        0.632

The EON technology is used for the optical layer. In the simulation scenarios, the entire available 4 THz spectrum band is divided into 12.5 GHz frequency slices, resulting in 320 slices. We use EON with bandwidth-variable transponders (BV-Ts) to implement the PDM-OFDM technology with multiple modulation formats, selected adaptively among BPSK, QPSK, and m-QAM, where m ∈ {8, 16, 32, 64}. Three types of BV-Ts are applied, with different bit-rate limits: 40 Gbps, 100 Gbps, and 400 Gbps. Each network has three interconnection points to other networks that carry international traffic. We take into consideration the physical impairments of links (fiber attenuation, component insertion loss) and use regenerators for signals that require higher modulation formats.
The traffic model is based on the 2018 projection of the “Cisco Visual Networking Index” forecast, with the following request types:
- Processing as a Cloud (PaaC): Cloud providers deliver a computing platform, typically including an operating system, programming-language execution environment, database, and web server. PaaC requests require CPU (maximum 16 units) and RAM (maximum 64 GB) from DCs. We consider node-to-DC and international-traffic-to-DC PaaC requests, served by anycast and unicast flows. They represent 17.8% of all traffic, with 10-200 Gbps of requested bit-rate.
- Storage as a Cloud (SaaC): This type of request requires storage resources from DCs. Cloud storage may be used for copying virtual machine images from the cloud to on-premises locations, importing a virtual machine image from an on-premises location to the cloud image library, or moving virtual machine images between user accounts or between data centers. We consider node-to-DC, DC-to-DC, and international-traffic-to-DC SaaC requests, served by all types of flows. They represent 5.6% of all traffic, with 100-400 Gbps of requested bit-rate and a maximum of 1 TB of storage per request.
- Software as a Service (SaaS): This is one of the most popular types of cloud services, sometimes referred to as “on-demand software”. It is often priced on a pay-per-use or subscription basis. SaaS requires all three types of DC resources. In more detail, it uses a maximum of 8 CPU units, 32 GB of RAM, and 100 GB of storage per request. We consider node-to-DC and international-traffic-to-DC SaaS requests, served by unicast and multicast flows. They represent 58% of all traffic, with 10-100 Gbps of requested bit-rate.
- Optical as a Service (OaaS): It requires storage resources from DCs and optical network resources to transfer large amounts of data between nodes and DCs or between DCs, served only by anycast flows. It represents 18.6% of all traffic, with 100-400 Gbps of requested bit-rate, a maximum of 4 CPU units, 4 GB of RAM, and 4 TB of storage per request.

Figure 1: The Euro28 (top) and the US26 (bottom) network topologies.

We assume that the requests arrive in batches according to a Poisson process with a mean arrival rate of λ requests per unit time, and that their lifetimes are exponentially distributed with mean 1/γ. The traffic load is expressed as λ/γ Erlangs. In the simulations of both Euro28 and US26, the number of requests is 510,000. The first 10,000 requests, which arrived before the network load reached a steady state, were not considered.
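The arrival model can be sketched with exponential inter-arrival and holding times. This example generates single arrivals and ignores the batching mentioned above. The function names, the seed, and the rates λ = 5 and γ = 0.01 (an offered load of 500 Erlangs) are chosen only for illustration.

```python
import random


def generate_requests(n, arrival_rate, departure_rate, seed=0):
    """Yield (arrival_time, holding_time) pairs.

    Inter-arrival times are Exp(arrival_rate), giving Poisson arrivals with rate lambda.
    Holding times are Exp(departure_rate), giving a mean lifetime of 1/gamma.
    """
    rng = random.Random(seed)
    now = 0.0
    for _ in range(n):
        now += rng.expovariate(arrival_rate)
        yield now, rng.expovariate(departure_rate)


def offered_load(arrival_rate, departure_rate):
    """Traffic load in Erlangs: lambda / gamma."""
    return arrival_rate / departure_rate


requests = list(generate_requests(20000, arrival_rate=5.0, departure_rate=0.01))
mean_holding = sum(h for _, h in requests) / len(requests)
load = offered_load(5.0, 0.01)
```

In a full simulation, each generated request would additionally carry a type, bit-rate, and DC resource demand drawn from the traffic model above.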

4.2. Results
The main goal of the simulations is to compare the results of allocating data center traffic with and without traffic prediction. The metric used for the evaluation is the Blocking Percentage (BP).
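As a small worked example of the metric, the sketch below computes the blocking percentage from a list of per-request outcomes and optionally discards an initial warm-up portion, in the spirit of the excluded first 10,000 requests. The function name and the sample data are hypothetical.

```python
def blocking_percentage(outcomes, warmup=0):
    """Percentage of blocked requests among those considered.

    outcomes: sequence of booleans, True if the request was blocked.
    warmup: number of initial requests to discard before counting.
    """
    considered = list(outcomes)[warmup:]
    if not considered:
        raise ValueError("no requests left after warm-up")
    return 100.0 * sum(considered) / len(considered)


# Hypothetical run: 2 warm-up requests followed by 8 measured ones, 1 of them blocked.
outcomes = [True, True] + [False] * 7 + [True]
bp_all = blocking_percentage(outcomes)
bp_steady = blocking_percentage(outcomes, warmup=2)
```

Discarding the warm-up changes the result from 30% to 12.5% in this toy run, which shows why the transient phase is excluded from the reported figures.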

In the first scenario, we evaluated the efficiency of the traffic prediction mechanisms using the Euro28 network, as shown in Tables 2 and 3. We used low traffic loads (50-150 ER) to obtain optimal results and compared them with the heuristic algorithms. Furthermore, the simulations were performed in moderate to high traffic load scenarios. There, the solution space was too large to obtain the optimal results; thus, we compared the results against the SPF method.

Table 2: Comparison of various dynamic routing approaches: average request blocking for the Euro28 network.

Traffic Load  Optimal    SPF        AMRA       AMRA+MCTS  AMRA+ANN   MNC        MNC+MCTS   MNC+ANN    GA         GA+MCTS    GA+ANN
50 ER         0.00%      0.51%      0.17%      0.00%      0.02%      0.43%      0.15%      0.17%      0.21%      0.00%      0.05%
100 ER        0.02%      0.95%      0.22%      0.04%      0.04%      0.60%      0.19%      0.23%      0.25%      0.04%      0.12%
200 ER        0.11%      1.81%      0.43%      0.18%      0.17%      0.88%      0.49%      0.53%      0.57%      0.19%      0.34%
500 ER        -          3.97%      1.08%      0.57%      0.80%      1.58%      0.99%      1.23%      1.07%      0.51%      1.01%
600 ER        -          4.51%      1.45%      0.71%      1.12%      1.74%      1.29%      1.37%      1.37%      0.69%      1.22%
700 ER        -          5.70%      1.89%      1.01%      1.32%      2.27%      1.63%      1.91%      2.07%      2.01%      2.05%
800 ER        -          8.78%      2.90%      1.42%      1.73%      3.24%      1.95%      2.03%      3.12%      2.81%      2.94%

Table 3: Cost per hour of service (in USD) for the Euro28 network.

Traffic Load  SPF        AMRA       AMRA+MCTS  AMRA+ANN   MNC        MNC+MCTS   MNC+ANN    GA         GA+MCTS    GA+ANN
50 ER         0.68       1.09       0.66       0.44       0.81       0.75       0.72       1.20       0.99       0.89
100 ER        0.87       1.34       1.01       0.67       1.21       1.06       1.01       1.52       1.13       1.00
200 ER        2.11       1.57       1.18       0.99       2.18       1.98       1.87       2.57       2.01       1.80
500 ER        6.28       5.44       4.44       3.89       5.58       5.01       4.98       5.81       5.15       5.20
600 ER        6.78       5.81       4.62       4.32       5.91       5.88       5.52       6.01       5.69       5.70
700 ER        7.02       6.22       6.31       5.78       6.19       6.52       6.11       6.77       6.66       6.53
800 ER        7.14       6.88       8.21       7.13       7.14       7.93       8.01       7.88       7.67       7.44

The performance of all RMSA algorithms was improved by using machine learning techniques. The use of MCTS and ANN made it possible to achieve results close to the optimal ones obtained by the IBM CPLEX Solver. Both MCTS and ANN techniques provided superior performance. It is worth noting that for low traffic loads the Genetic Algorithm achieved the best results, but its performance deteriorated for higher traffic loads. This is caused by the number of decisions that need to be calculated in a short amount of time. Due to the dynamic characteristics of the scenarios, we set a limit for calculations equal to 50 ms. The GA was unable to reach its full efficiency in this time, which was even more visible for the US26 network, where it performed poorly for moderate and higher traffic loads. The more adaptive method, AMRA, performed very well for all levels of traffic, resulting in an acceptable SLA even for high traffic loads. Finally, the ANN method worked better for lower traffic loads. The MCTS provided remarkable improvements to all RMSA methods in all traffic scenarios. Traffic patterns in today's networks change quite frequently. Thus, MCTS sampling is preferable to analyzing historical data (as in ANN), because the MCTS adapts to network changes in a shorter time.

Table 4: Comparison of various dynamic routing approaches: average request blocking for the US26 network.

Traffic Load  Optimal    SPF        AMRA       AMRA+MCTS  AMRA+ANN   MNC        MNC+MCTS   MNC+ANN    GA         GA+MCTS    GA+ANN
50 ER         0.01%      0.71%      0.33%      0.02%      0.05%      1.03%      0.75%      0.77%      0.81%      0.02%      0.11%
100 ER        0.07%      1.25%      0.52%      0.10%      0.12%      1.42%      0.99%      1.03%      0.85%      0.10%      0.32%
200 ER        0.19%      2.01%      0.73%      0.58%      0.56%      1.88%      1.00%      1.22%      0.97%      1.39%      2.34%
500 ER        -          7.07%      2.28%      0.72%      0.75%      2.98%      1.29%      1.38%      3.07%      1.91%      2.21%
600 ER        -          8.61%      3.55%      0.79%      0.92%      3.94%      1.89%      2.07%      4.37%      2.29%      2.72%
700 ER        -          9.01%      4.29%      1.21%      1.32%      6.72%      2.13%      2.18%      5.71%      3.63%      3.95%
800 ER        -          10.28%     5.05%      1.72%      1.93%      9.44%      3.85%      3.99%      6.32%      4.22%      4.44%

Table 5: Cost per hour of service (in USD) for the US26 network.

Traffic Load  SPF        AMRA       AMRA+MCTS  AMRA+ANN   MNC        MNC+MCTS   MNC+ANN    GA         GA+MCTS    GA+ANN
50 ER         0.88       1.22       0.89       0.67       1.12       0.85       0.89       1.44       1.22       1.01
100 ER        0.97       1.53       1.52       0.96       1.78       1.64       1.67       1.92       1.36       1.23
200 ER        2.32       1.98       1.89       1.24       2.46       2.01       1.99       2.92       2.14       1.89
500 ER        6.88       6.24       4.94       4.22       5.99       5.72       5.03       6.00       5.65       5.29
600 ER        7.01       6.78       5.02       4.99       6.12       6.81       6.62       6.18       5.99       6.07
700 ER        7.32       6.99       6.89       5.90       6.89       7.20       7.11       6.97       6.92       7.32
800 ER        7.99       7.72       7.99       7.53       7.76       8.11       8.51       8.22       8.01       7.92

Moreover, the costs of using traffic prediction methods were comparable to those of methods without traffic prediction. The most cost-efficient method was AMRA with ANN. On the other hand, the cost of using MCTS and ANN increased with higher traffic loads. This can be explained by the fact that, with higher network resource utilization, the decisions that made it possible to serve more traffic resulted in higher costs per hour of service than for the other approaches.
We also evaluated the network performance using a different network, US26. The revealed trends were similar, with values tending to be slightly higher than for the Euro28 network. Another visible change was a significant increase in the cost per hour of service. In US26, DCs are concentrated on the East and West Coasts, while DCs in the Euro28 network are more centralized. Hence, because of the larger distances, poor decisions affect the cost of request provisioning more significantly.
In addition, we evaluated the decision time with and without machine learning methods for the AMRA and GA methods for the Euro28 (see Fig. 2) and the US26 (see Fig. 3) networks. As we can observe, the use of machine learning techniques does not have a significant impact on the calculation time: the visible overhead is about 5-10 ms. Moreover, the GA method reached the maximum calculation time of 50 ms for higher traffic loads, which confirms the previous observation that its efficiency is lower at increased traffic rates.

Figure 2: Average decision time with and without traffic prediction methods enabled for the Euro28 network. [Plot: decision time (ms) versus traffic load (500-800 Erlang) for AMRA, AMRA+MCTS, AMRA+ANN, GA, GA+MCTS, and GA+ANN.]

The average decision time for the US26 network was higher than for Euro28. This is due to the higher complexity of the network (in particular, a higher degree of the vertices and larger link lengths).
5. Conclusion
In this paper, we focused on applying machine learning techniques for traffic prediction in EONs. We showed the benefits of using them with dynamic routing algorithms developed for cloud data center traffic. The main conclusion is that Monte Carlo sampling adapts better and in a shorter time to traffic changes than the Artificial Neural Network.

Figure 3: Average decision time with and without traffic prediction methods enabled for the US26 network. [Plot: decision time (ms) versus traffic load (500-800 Erlang) for AMRA, AMRA+MCTS, AMRA+ANN, GA, GA+MCTS, and GA+ANN.]

References
D
[1] I. Tomkos, B. Mukherjee, S. K. Korotky, R. Tucker, L. Lunardi, The Evolu-

TE
330 tion of Optical Networking, Proceedings of the IEEE 100 (5) (2012) 1017–
9219. doi:10.1109/JPROC.2012.2187363.
EP
[2] J. M. Simmons, Optical Network Design and Planning, no. 2nd Edition in
Optical Networks, Springer International Publishing, 2014. doi:10.1007/
978-3-319-05227-4.
C
335 [3] I. Chlamtac, A. Ganz, G. Karmi, Lightpath communications: An approach

AC
to high bandwidth optical WAN’s, IEEE Transactions on Communications

40 (7) (1992) 1171–1182. doi:10.1109/26.153361.
[4] O. Gerstel, On The Future of Wavelength Routing Networks, IEEE Net-

work 96 (11) (1996) 14–20.
17
ACCEPTED MANUSCRIPT
[5] M. Jinno, H. Takara, B. Kozicki, Concept and Enabling Technologies of
Spectrum-Sliced Elastic Optical Path Network (SLICE), in: Asia Communications
and Photonics Conference and Exhibition, Shanghai, China, 2009.
doi:10.1364/ACP.2009.FO2.
[6] M. Jinno, H. Takara, B. Kozicki, Dynamic optical mesh networks: Drivers,
challenges and solutions for the future, in: 35th European Conference on
Optical Communication, Vienna, Austria, 2009, pp. 2–5.
[7] I. Tomkos, S. Azodolmolky, J. Sole-Pareta, D. Careglio, E. Palkopoulou,
A tutorial on the flexible optical networking paradigm: State of the art,
trends, and research challenges, in: Proceedings of the IEEE, Vol. 102,
2014, pp. 1317–1337. doi:10.1109/JPROC.2014.2324652.
[8] M. Aibin, Dynamic Routing Algorithms for Cloud-Ready Elastic Optical
Networks, Ph.D. thesis, Wroclaw University of Science and Technology
(2017).
[9] L. Kocsis, C. Szepesvári, Bandit based monte-carlo planning, in:
Proceedings of ECML, 2006, pp. 282–293. doi:10.1007/11871842.
[10] C. D. Rosin, Nested rollout policy adaptation for Monte Carlo tree search,
in: IJCAI International Joint Conference on Artificial Intelligence, 2011,
pp. 649–654. doi:10.5591/978-1-57735-516-8/IJCAI11-115.
[11] S. Zander, T. Nguyen, G. Armitage, Automated traffic classification and
application identification using machine learning, in: IEEE Conference on
Local Computer Networks, 2005, pp. 250–257. doi:10.1109/LCN.2005.35.
[12] N. Williams, S. Zander, G. Armitage, A preliminary performance comparison
of five machine learning algorithms for practical IP traffic flow
classification, ACM SIGCOMM Computer Communication Review 36 (5) (2006) 5.
doi:10.1145/1163593.1163596.
[13] M. Mirza, J. Sommers, P. Barford, X. Zhu, A machine learning approach
to TCP throughput prediction, IEEE/ACM Transactions on Networking 18 (4)
(2010) 1026–1039. doi:10.1109/TNET.2009.2037812.
[14] S. Suthaharan, Big Data Classification: Problems and Challenges in
Network Intrusion Prediction with Machine Learning, ACM SIGMETRICS
Performance Evaluation Review 41 (4) (2014) 70–73.
doi:10.1145/2627534.2627557.
[15] W. Huang, G. Song, H. Hong, K. Xie, Deep architecture for traffic flow
prediction: Deep belief networks with multitask learning, IEEE Transactions
on Intelligent Transportation Systems 15 (5) (2014) 2191–2201.
doi:10.1109/TITS.2014.2311123.
[16] C. Sommer, Shortest-path queries in static networks, ACM Computing
Surveys 46 (4) (2014) 1–31. doi:10.1145/2530531.
[17] T. Subbulakshmi, S. M. Shalinie, Detection and Classification of DDoS
Attacks Using Machine Learning Algorithms, European Journal of Scientific
Research 47 (3) (2010) 334–346. doi:10.1007/978-3-642-14478-3_25.
[18] J. L. Berral, N. Poggi, J. Alonso, R. Gavaldà, J. Torres, M. Parashar,
Adaptive distributed mechanism against flooding network attacks based on
machine learning, in: Proceedings of the 1st ACM workshop on Workshop
on AISec - AISec ’08, 2008, p. 7. doi:10.1145/1456377.1456389.
[19] M. M. Najafabadi, T. M. Khoshgoftaar, C. Kemp, N. Seliya, R. Zuech,
Machine learning for detecting brute force attacks at the network level, in:
Proceedings - IEEE 14th International Conference on Bioinformatics and
Bioengineering, BIBE 2014, 2014, pp. 379–385. doi:10.1109/BIBE.2014.73.
[20] T. Ahmed, B. Oreshkin, M. Coates, Machine learning approaches to network
anomaly detection, Proceedings of the 2nd USENIX workshop on Tackling
computer systems problems with machine learning techniques (2007) 7:1–7:6.
[21] M. V. Mahoney, A Machine Learning Approach to Detecting Attacks by
Identifying Anomalies in Network Traffic, Ph.D. thesis (2003).
[22] C. F. Tsai, Y. F. Hsu, C. Y. Lin, W. Y. Lin, Intrusion detection by
machine learning: A review, Expert Systems with Applications 36 (10) (2009)
11994–12000. doi:10.1016/j.eswa.2009.05.029.
[23] W. Saad, Z. Han, M. Debbah, A. Hjorungnes, T. Basar, Coalitional Game
Theory for Communication Networks: A Tutorial, IEEE Signal Processing
Magazine 26 (2009) 77–97. doi:10.1109/MSP.2009.000000.
[24] S. Haeri, W. W. K. Thong, G. Chen, L. Trajkovic, A reinforcement
learning-based algorithm for deflection routing in optical burst-switched
networks, in: 14th International Conference on Information Reuse and
Integration, IEEE IRI, 2013, pp. 474–481. doi:10.1109/IRI.2013.6642508.
[25] S. Haeri, L. Trajkovic, Deflection routing in complex networks, in:
Proceedings - IEEE International Symposium on Circuits and Systems, 2014,
pp. 2217–2220. doi:10.1109/ISCAS.2014.6865610.
[26] F. Musumeci, C. Rottondi, A. Nag, I. Macaluso, D. Zibar, M. Ruffini,
M. Tornatore, A Survey on Application of Machine Learning Techniques
in Optical Networks (2018). arXiv:1803.07976v1.
URL http://arxiv.org/abs/1803.07976
[27] M. Aibin, K. Walkowiak, Dynamic routing of anycast and unicast traffic
in elastic optical networks with various modulation formats - Trade-off
between blocking probability and network cost, in: 15th International
Conference on High Performance Switching and Routing (HPSR), Vancouver,
Canada, 2014, pp. 64–69. doi:10.1109/HPSR.2014.6900883.
[28] M. Jinno, B. Kozicki, H. Takara, A. Watanabe, Y. Sone, T. Tanaka,
A. Hirano, Distance-adaptive spectrum resource allocation in spectrum-sliced
elastic optical path network, IEEE Communications Magazine 48 (8) (2010)
138–145. doi:10.1109/MCOM.2010.5534599.
[29] M. M. Klinkowski, K. Walkowiak, On the advantages of elastic optical
networks for provisioning of cloud computing traffic, IEEE Network 27 (6)
(2013) 44–51. doi:10.1109/MNET.2013.6678926.
[30] C. T. Politi, V. Anagnostopoulos, C. Matrakidis, A. Stavdas, A. Park,
M. Heath, U. Kingdom, Dynamic Operation of Flexi-Grid OFDM-based
Networks, Optical Fiber Communication Conference and Exposition and
the National Fiber Optic Engineers Conference (OFC/NFOEC) 1.
doi:10.1364/OFC.2012.OTh3B.2.
[31] B. C. Chatterjee, N. Sarma, E. Oki, Routing and Spectrum Allocation in
Elastic Optical Networks: A Tutorial, IEEE Communications Surveys &
Tutorials 17 (3) (2015) 1776–1800. doi:10.1109/COMST.2015.2431731.
[32] K. Walkowiak, R. Goscien, M. Klinkowski, On Minimization of the Spectrum
Usage in Elastic Optical Networks with Joint Unicast and Anycast
Traffic, in: Asia Communications and Photonics Conference (ACP), 2013,
pp. 4–6. doi:10.1364/ACPC.2013.AF4G.1.
[33] M. Kearns, Y. Mansour, A. Y. Ng, A sparse sampling algorithm for
near-optimal planning in large Markov decision processes, in: IJCAI
International Joint Conference on Artificial Intelligence, Vol. 2, 1999,
pp. 1324–1331. doi:10.1023/A:1017932429737.
[34] M. P. D. Schadd, M. H. M. Winands, M. J. W. Tak, J. W. H. M. Uiterwijk,
Single-player Monte-Carlo tree search for SameGame, Knowledge-Based
Systems 34 (2012) 3–11. doi:10.1016/j.knosys.2011.08.008.
[35] R. Coulom, Efficient Selectivity and Backup Operators in Monte-Carlo Tree
Search, in: Computers and games, Vol. 4630, 2007, pp. 72–83.
doi:10.1007/978-3-540-75538-8_7.
[36] F. Morales, M. Ruiz, L. Velasco, Data Analytics Based Origin-Destination
Core Traffic Modelling, in: International Conference on Transparent Optical
Networks, 2017, pp. 1–4.
[37] M. Aibin, K. Walkowiak, Adaptive modulation and regenerator-aware dynamic
routing algorithm in elastic optical networks, in: IEEE International
Conference on Communications (ICC), London, UK, 2015, pp. 5138–5143.
doi:10.1109/ICC.2015.7249139.
[38] N. Wang, J. P. Jue, Holding-time-aware routing, modulation, and spectrum
assignment for elastic optical networks, in: IEEE Global Communications
Conference, 2014, pp. 2180–2185. doi:10.1109/GLOCOM.2014.7037131.
[39] E. Arianyan, Efficient Resource Allocation in Cloud Data Centers Through
Genetic Algorithm (2012) 566–570.
[40] Data Center Map, DataCenterMap.com (2017).
URL http://www.datacentermap.com
