Sunteți pe pagina 1din 7

Combining the parareal algorithm and reduced order modeling

for time dependent partial differential equations


C. Harden, J. Peterson∗
School of Computational Science, Florida State University, Tallahassee FL 32306-4120

June 1, 2006

Abstract
Due to the computational intensity of solving nonlinear partial differential equations, there
has been much interest in developing algorithms which improve the ability to perform real-
time calculations. One approach to reducing computational costs is the development of low-
dimensional models; a second approach is to develop parallel algorithms. In this work we
combine these two approaches using reduced order modeling and a parallel-in-time algorithm
known as the parareal algorithm for solving evolution equations.

Keywords: Parareal; Reduced order modeling; Parallel algorithm, Proper orthogonal decom-
position

1 Introduction
In many applications, the ability to perform real-time calculations is essential. Consequently, there
is much interest in developing algorithms which reduce the computational costs of solving systems
of time-dependent, nonlinear partial differential equations (PDEs). One popular class of techniques
which offers some promise is the use of low-dimensional models such as the well-known reduced order
modeling (ROM) approach. Another approach in reducing computational costs is to use parallel
algorithms to take advantage of multiprocessor machines. Although a significant body of literature
exists which is devoted to developing and analyzing parallel algorithms for the spatial discretization
of differential equations (such as domain decomposition techniques), computational methods for
handling time integration have typically been sequential in nature.
The parareal algorithm was proposed by Lions, Maday and Turinici [1] in 2001 as a parallel-
in-time approach for solving time dependent differential equations. The name of the algorithm
indicates the intention of its design, that is, for parallel, real-time computations involving time
evolution equations whose solutions may not be readily obtainable using only a single processor
machine. Since the appearance of this paper, various authors have investigated properties of the
algorithm and have begun applying the algorithm in various settings; see, for example, [2], [3], [4], [5]
and the references cited therein.
∗ Corresponding author.

1
There has been much interest in the development of very low-dimensional models which can
be used to determine accurate approximations in simulation and control problems. Many examples
have been reported in the literature (see, for example, [6], [7], [8], [9] and the references cited therein)
that show that several classes of ROMs are effective, i.e., an accurate approximate solution can be
obtained using very small-dimensional ROMs.
In practice it is probably true that neither ROM nor parallel-in-time algorithms alone can be
enough to realize real-time solutions of practical nonlinear systems of PDEs. The goal of this paper
is to demonstrate that the two approaches can be combined to realize a much greater speed-up in
calculation time than either method alone can achieve.

2 Parareal algorithm
The parareal algorithm is a parallel-in-time method which is a purely parallel algorithm, i.e., it has
no value in a serial context. The algorithm is iterative in nature and allows the computation of the
solution at later times before the solution at earlier times is fully resolved; moreover, the solution
at multiple timesteps can be calculated simultaneously. The algorithm involves a correction step
which shares a common thread with the method of iteration improvement for linear systems and the
defect-correction algorithms for differential equations.
To describe the method, first assume that if the time integration over some interval (0, T ] is
performed in a sequential manner a timestep of length ∆t with a total of N ∗ M timesteps is needed.
Then we can briefly summarize the steps in the algorithm as follows:
Step (i) decompose the time interval [0, T ] into N coarse time intervals [Tn , Tn+1 ], n = 0, . . . , N −
1 of length ∆T and solve the differential equation sequentially in time; denote the solution at
each Tn by Cn , n = 0, . . . , N where C0 denotes the initial condition for the problem;
Step (ii) decompose each coarse interval [Tn , Tn+1 ], n = 0, . . . , N − 1, into M subintervals
of length ∆t; in parallel solve the differential equation over each subinterval using Ci , i =
0, . . . , N − 1 as the initial condition at Ti ; denote this solution at the points Ti , i = 1, . . . , N ,
by Fi ;
Step (iii) define the errors or defects at the points Tn , n = 1, . . . , N , by Sn = Fn − Cn ; solve the
coarse grid problem for the correction to F n where the right-hand side of the equation is the
jump S n ; call these corrections δ n , n = 1, . . . , N ;
Step (iv) set C n = F n + δ n ; return to Step (ii) if satisfactory convergence has not been achieved.
The calculations for Step (ii) for the fine grid solutions are done in parallel and are divided
across multiple processors; this is possible since the fine solutions over each coarse interval [Tn , Tn+1 ],
n = 0, . . . , N − 1, are independent. All other computations are serial. We will also see that when the
original problem is nonlinear, the algorithm can be implemented so that only the parallel calculations
will involve solving nonlinear systems; the serial calculations will be linear.
To illustrate the implementation of the algorithm, consider an initial boundary value problem
for a nonlinear parabolic equation where we use finite element methods (FEM) for the spatial dis-
cretization and a backward Euler approximation for temporal approximations. Specifically, consider
the problem

ut − ∆u + f (u) = g(x, t) in Ω × (0, T ] 
u(x, 0) = u0 (x) in Ω (1)
u(x, t) = 0 on ∂Ω × (0, T ].

2
Assuming that the spatial discretization has already been effected, the parareal algorithm for this
problem is given by the following steps.
Step (i) For n = 1, . . . , N , solve

Cn0 − Cn−1
0
Z Z Z Z
∇Cn0 · ∇φ = 0

φ+ g(x, Tn )φ − f Cn−1 φ where C00 = u0 .
Ω ∆T Ω Ω Ω

For k = 0, 1, . . . perform the following steps until satisfactory convergence is achieved:


Step (ii) for each interval [Tn , Tn+1 ], n = 0, . . . , N − 1, set ∆t = (Tn+1 − Tn )/M , Fn,0 = Cnk
and for m = 1, . . . , M solve in parallel the nonlinear problems
k k
Fn,m − Fn,m−1
Z Z Z Z
k k
φ+ ∇Fn,m ∇φ + f (Fn,m )φ = g(x, tm )φ ;
Ω ∆t Ω Ω Ω

denote this solution at each point Tn , n = 1, . . . , N , by Fnk ;


Step (iii) for n = 0, 1, . . . , N − 1, define Snk = Fnk − Cnk , set δ0k = 0 and for n = 1, . . . , N solve
the problems
Z k k k
δn − δn−1 Sn−1
Z Z Z
k 0 k k
φ+ ∇δn ∇φ + f (Fn )δn φ = φ;
Ω ∆T Ω Ω Ω ∆T

Step (iv) set


Cnk+1 = Fnk + δnk , n = 1, . . . , N .
Note that the serial computations only involve the solution of linear systems; all nonlinear systems
k
are solved in parallel. Also note that in Step (iii) we are using Sn−1 to compute the correction at
time Tn because we do not want to correct the fine grid solution at time T1 since this was computed
using the correct initial guess. In fact, the solution Cnk+1 from Step (iv) will yield the same solution
for the first k + 1 intervals as the corresponding serial calculation using a timestep of length ∆t.
To illustrate the computational properties of the algorithm we consider the following nonlinear
reaction diffusion equation

ut − ∆u + u2 = g(x, y, t) 0 < x, y < 1, t > 0 (2)

where g(x, y, t) is chosen so that the exact solution is u(x, y, t) = cos(10t) tan(x2 + y 2 − 1) with
Dirichlet boundary data set. All results reported use continuous piecewise quadratic elements on
triangles for the the spatial finite element approximation, a backward Euler approximation in time,
and Newton’s method to solve the nonlinear equations with a serial direct solver for the resulting
linear systems.
In Table 1 we summarize the errors obtained by solving this example using a standard serial finite
element approach with a timestep of length ∆t and the parareal algorithm using the same timestep
for the fine calculation (Step (ii) ) but a much larger timestep, ∆T , for the coarse grid calculations
(Steps (i) and (iii)). As can be seen from the table, it takes two iterations of the parareal algorithm
to obtain equivalent accuracy as the serial finite element approach. The errors reported are the
maximum L2 and H 1 errors over the time interval [0, 5]; here H 1 denotes the standard Sobolev
space of functions in L2 which have one weak derivative in L2 .
To compare the speedup of the parareal algorithm, calculations were performed on the Teragold,
an IBM SP3 machine. For the timings reported, a serial solver is used; of course, if an iterative
parallel solver had been employed the matrix assembly time would play a more significant role in the
timing. The only parallelization used is in Step (iii). To compare the timings for solving our example

3
Parareal Algorithm
h ∆T ∆t FEM Initial Coarse Iteration # 1 Iteration # 2
L2 −error L2 −error L2 −error L2 −error

1
10 0.1 0.01 9.873×10−3 8.424×10−2 1.749×10−2 7.339×10−3

1
20 0.1 0.0025 2.501×10−3 8.424×10−2 1.760×10−2 3.116 ×10−3

H 1 −error H 1 −error H 1 −error H 1 −error

1
10 0.1 0.01 6.117×10−2 0.4382 9.156×10−2 5.152×10−2

1
20 0.1 0.0025 1.588×10−2 0.4372 8.457×10−2 1.704 ×10−2

Table 1: Comparison of errors using standard finite element approach and the parareal/FEM ap-
proach.

using both a serial finite element approach and the parareal algorithm, in Table 2 we compared the
“wall clock” timing for each computation and report the ratio of the timing for the serial FEM
and the parareal/FEM method. Note that the second column of Table 2 gives the total number of
timesteps in the serial calculation, the third column gives the number of coarse time steps for the
parareal algorithm and the fourth column gives the number of fine steps each processor is calculating
in Step (ii).

No. of
T T ∆T
h processors speedup
∆t ∆T ∆t

1
10 100 10 10 10 3.62
1
10 400 20 20 20 4.82
1
10 6400 80 80 80 8.39

1
20 100 10 10 10 3.66
1
20 400 20 20 20 4.97
1
20 6400 80 80 80 9.13

Table 2: Speed-up results for the parareal/FEM approach compared to the serial FEM approach.

3 ROM and the Parareal Algorithm


The decrease in computational time achieved by using the parareal algorithm is oftentimes still not
sufficient to perform real-time calculations. ROM techniques have been shown to be effective in

4
reducing computational time. Here we combine the ideas of ROM and the parareal algorithm to
achieve a further reduction in computational efforts.
To briefly define what we mean by ROM, we consider approximating the solution u(x, t) of a
nonlinear, time dependent partial differential equation. Then a ROM method would proceed as
follows. First, one chooses a reduced basis {φi (x)}di=1 ; where d is small compared to the usual
number of functions used in a finite element approximation or the number of grid points used in a
finite difference approximation. Next, one seeks an approximation urom (x, t) to the state u of the
form
d
X
urom (x, t) = aj (t)φj (x) ∈ V ≡ span{φ1 , . . . , φd }.
j=1

Then, one determines the coefficients aj , j = 1, . . . , d, by solving the equation in the set V , e.g.,
one could find a Galerkin solution of the equation in a standard way, using V for the space of
approximations. The cost of such a computation would be very small if d is small. Oftentimes,
however, the off-line determination of the reduced basis {φi }di=1 can be quite large but this can be
viewed as a preprocessing step. The reduced basis vectors are global in nature and thus the resulting
ROM problem leads to dense systems. There have been many approaches suggested for determining
the reduced basis. Here we follow the approach of [6], [9] where the reduced basis is generated
by using proper orthogonal decomposition (POD) for a set of snapshots obtained by computing
approximations of the solution of the differential equation for a sampling of parameter values and
times.
To combine the ideas of the parareal algorithm and ROM, we simply use ROM to solve both the
coarse and fine time grid calculations. We will see that the parareal algorithm behaves in a similar
manner when applied in the ROM setting.
To illustrate the algorithm we consider a nonlinear reaction diffusion example which involves
four time dependent parameters on the boundary. These boundary parameters complicate the
ROM process; for details see [9]. Specifically, we consider the PDE given in (2) with the boundary
conditions 
2t t < 0.5
y = 1 u = 4x(1 − x)β1 where β1 =
2(1 − t) t ≥ 0.5
y = 0 u = 4x(1 − x)β2 where β2 = 4(t − t2 )
x = 0 u = 4y(1 − y)β3 where β3 = | sin(2πt)|
x = 1 u = 4y(1 − y)β4 where β4 = | sin(4πt)|
and initial condition u(x, y, 0) = 0.
For the snapshot generation we sampled points in the four-dimensional parameter space, then
solved the full finite element model with h = 0.1 for the equation by impulsively jumping between
the sampled parameters; snapshots were generated from the solution at various time intervals for
the choice of parameters. A total of 300 snapshots were generated and a POD technique was used to
determine the basis vectors. Satisfying the inhomogeneous Dirichlet boundary data requires extra
work and was handled by generating basis vectors which satisfied general inhomogeneous boundary
data; again see [9] for details.
The calculations reported in Table 3 are for the spatial discretization of h = 0.1. The errors
are computed by comparing with the full FEM solution of the problem since an exact solution is
not known; once again the errors measure the maximum L2 error over all time. In Table 3, column
four gives the error between the ROM solution and the standard FEM solution and the remaining
columns give the errors at each stage of the parareal/ROM algorithm. Note that similar to the first
example, it takes two iterations of the parareal/ROM algorithm to match the error in the ROM
solution.

5
Parareal/ROM Algorithm
# basis ∆T ∆t ROM Initial Coarse Iteration # 1 Iteration # 2
vectors L2 −error L2 −error L2 −error L2 −error

4 0.1 0.01 2.442×10−2 0.1064 1.899×10−2 1.344×10−2

8 0.1 0.01 2.338×10−3 0.1054 1.405×10−2 1.940×10−3

16 0.1 0.01 1.126 ×10−4 0.1053 1.403×10−2 1.569×10−3

Table 3: Comparison of errors for 4-parameter problem using standard ROM approach and the
parareal/ROM algorithm. Errors are calculated by comparing to the full finite element solution.

In Figure 3 we illustrate the speed-up gained by implementing the parareal algorithm for this
4-parameter problem using various choices of the number of processors. The plot on the left in
Figure 3 compares the timings for the FEM and Parareal/FEM approach (i.e., no ROM is used)
for solving this problem while the plot on the right compares the ROM and Parareal/ROM timings
when eight basis vectors are used. In these calculations we choose a fixed fine timestep ∆t and vary
the number of coarse time intervals which we set to the number of processors. The two extremes for
the choice of the number of coarse intervals is one (∆T = T ) and the choice of T /∆t; i.e., ∆T = ∆t.
Of course, each of these two choices give no speed-up. Figure 3 suggests that an optimal speedup of
approximately four is achieved using 16 processors with ∆t = 0.01 and with ∆t = 0.005 an optimal
speedup of approximately six is achieved using 20 processors in both cases. Once again the speed-
up is calculated as the ratio of the wall clock time for the ROM solution and the Parareal/ROM
solution. A significant speedup for using the ROM approach compared with the full FEM approach
is only realized when the problem size is much larger than in this example. See [10] for examples of
realistic, large-scale solutions of the Navier-Stokes equations using a parareal/ROM/FEM approach.

References
[1] J.-L. Lions, Y. Maday, G. Turinici, Résolution d’EDP par un schéma en temps “pararéel”, C.
R. Acad. Sci. Paris, Ser I 332 (2001) 661–668.
[2] G. Bal, On the convergence and stability of the parareal algorithm to solve partial differential
equations, Proceedings of Fifteen International Conference on Domain Decomposition Methods,,
Berlin, Vol. 40, LNCSE Series, Springer Verlag, 2004, 425–432.
[3] M. J. Gander, S. Vandewalle, Analysis of the parareal time-parallel time-integration method,
Report TW443, Dept. of Computer Science, Katholieke Universiteit Leuven (2005) 1–23.
[4] Y. Maday, G. Turinici, A parareal in time procedure for the control of partial differential
equations, C. R. Acad. Sci. Paris, Ser I 335 (2002) 387–392.
[5] P. F. Fishcher, F. Hecht, Y. Maday, A parareal in time semi-implicit approximation of the
Navier-Stokes equations, Proceedings of Fifteen International Conference on Domain Decom-
position Methods,, Berlin, Vol. 40, LNCSE Series, Springer Verlag, 2004.

6
6 6

5 5

4 4
SpeedUp

SpeedUp
3 3

2 2

1 1

0 0
0 10 20 30 40 50 60 0 10 20 30 40 50 60
Processors Processors

Figure 1: Speed-up results for parareal and parareal/ROM as the number of processors is varied.
The plot on the left compares the FEM and parareal/FEM approach whereas the plot on the right
compares the ROM and the parareal/ROM timings. The solid line indicates ∆t = 0.01 and the
dotted line indicates ∆t = 0.005.

[6] J. Burkardt, M. Gunzburger, H. Lee , POD and CVT-based reduced-order modeling of Navier-
Stokes flows. Comp. Meth. Appl. Mech. Engrg., to appear
[7] K. Kunisch and S. Volkwein , Control of Burger’s equation by a reduced order approach using
proper orthogonal decomposition, JOTA 102 (1999) 345–371.
[8] M. Barrault, Y. Maday, N. Nguyen, A. Patera, An ‘empirical interpolation’ method: application
to efficient reduced-basis discretization of partial differential equations , C. R. Acad. Sci. Paris,
Ser, I 339 (2004) 667–672.
[9] M. Gunzburger, J. Peterson, Reduced-order modeling with multiple parameters, submitted to
Comp. Meth. Appl. Mech. Engrg.
[10] C. Harden, J. Peterson, Combining the parareal algorithm and reduced order modeling for
real-time calculations of time dependent PDEs, International conference on recent advances in
scientific computation, Beijing, China, June 2006.

S-ar putea să vă placă și