ON THE IMPORTANCE OF DUALITY IN OPTIMAL DESIGN

Aharon Ben-Tal
MINERVA Optimization Center
Technion, Israel Institute of Technology

WHAT IS DUALITY THEORY?

Kuhn (SIAM-AMS Proc., Vol. 9, 1976):
A duality theory is made of the following elements:
(a) A pair of optimization problems, one a minimization problem (problem (P)) and the other a maximization problem (problem (D)), based on the same data.
(b) Weak duality holds: $\min(P) \ge \max(D)$.
(c) A necessary and sufficient condition for optimality of a feasible pair is equality of the corresponding objective function values.

My addition:
(d) A computable (tractable) relation between the optimal solutions of (P) and (D).
(e) A relation between primal feasibility, dual boundedness and dual attainability.

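LAGRANGIAN DUALITY

$$(P) \qquad \inf_{x \in \mathbb{R}^n} \{ f(x) \mid g_i(x) \le 0,\ i = 1, \dots, m \}, \qquad f, g_i \text{ convex}$$

Lagrangian: $L(x, y) = f(x) + \sum_{i=1}^{m} y_i g_i(x)$

Dual objective function:

$$g(y) = \inf_{x \in \mathbb{R}^n} L(x, y) \quad \text{(a concave function)}, \qquad \operatorname{dom} g = \{ y \mid g(y) > -\infty \}$$

$$(D) \qquad \sup_{y \in \mathbb{R}^m} \{ g(y) \mid y \in \operatorname{dom} g,\ y \ge 0 \}$$

Duality Theorem. If (P) satisfies

$$\exists \hat{x} : \ g_i(\hat{x}) < 0, \quad i = 1, \dots, m \qquad \text{(Slater condition)}$$

then

$$\inf(P) = \max(D).$$

Extensions to $\infty$-dimensional spaces.
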
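As a small illustration of the Duality Theorem, the sketch below works through a one-dimensional toy instance that is not taken from the slides: $f(x) = x^2$ with the single constraint $g_1(x) = 1 - x \le 0$. The problem data, the multiplier grid and the function name `dual_objective` are assumptions made for this example only.

```python
# Toy instance of (P): minimize f(x) = x^2 subject to g(x) = 1 - x <= 0.
# The Slater condition holds (x = 2 gives g(x) = -1 < 0), so inf(P) = max(D).

def dual_objective(y: float) -> float:
    """g(y) = inf_x [x^2 + y*(1 - x)]; the infimum is attained at x = y/2."""
    x = y / 2.0
    return x * x + y * (1.0 - x)

# Primal optimum: the feasible point closest to 0 is x = 1, so min(P) = 1.
primal_value = 1.0

# Maximize the concave dual objective over a grid of multipliers y >= 0.
grid = [k / 1000.0 for k in range(0, 5001)]
best_y = max(grid, key=dual_objective)
dual_value = dual_objective(best_y)

# Weak duality: every dual value is a lower bound on the primal optimum.
weak_duality_ok = all(dual_objective(y) <= primal_value + 1e-12 for y in grid)

print(best_y, dual_value, weak_duality_ok)
```

Every grid value of the dual objective stays below the primal optimum (weak duality), and the largest one closes the gap, as the theorem predicts under the Slater condition.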
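CONIC DUALITY

$K$ = closed convex pointed cone, $\operatorname{int} K \ne \emptyset$

$$(P) \qquad \inf_{x \in \mathbb{R}^n} \{ c^T x \mid Ax - b \in K \}, \qquad A \text{ is } m \times n$$

(P) is strictly feasible if $\exists \hat{x} : A\hat{x} - b \in \operatorname{int} K$.

The conic dual of (P) is

$$(D) \qquad \sup_{y \in \mathbb{R}^m} \{ b^T y \mid A^T y = c,\ y \in K^* \}$$

where $K^*$ is the dual cone of $K$:

$$K^* = \{ y \mid y^T x \ge 0 \ \ \forall x \in K \}.$$

(D) is strictly feasible if $\exists \hat{y} \in \operatorname{int} K^* : A^T \hat{y} = c$.

Examples of dual cones

$K = \mathbb{R}^n_+$, $\quad K^* = K$
$K = L^n = \{ (x_1, \dots, x_{n-1}, x_n) \mid x_n \ge \sqrt{x_1^2 + \dots + x_{n-1}^2} \}$, $\quad K^* = K$
$K = S^n_+ = \{ A \in \mathbb{R}^{n \times n} \mid A = A^T,\ A \succeq 0 \}$, $\quad K^* = K$
$K = \{ x \mid Ax \ge 0 \}$, $\quad K^* = \{ A^T y \mid y \ge 0 \}$
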
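SDP DUALITY

$$(P) \qquad \min_{x \in \mathbb{R}^n} \{ \langle c, x \rangle \mid \mathcal{A}x - B \succeq 0 \}$$

$$\mathcal{A}x = x_1 A_1 + \dots + x_n A_n, \qquad \mathcal{A} : \mathbb{R}^n \to S^m, \qquad A_i, B \in S^m$$

Dual of (P):

$$\max_{Y \in S^m} \{ \langle B, Y \rangle \mid \mathcal{A}^* Y = c,\ Y \succeq 0 \}$$

$$\langle B, Y \rangle = \operatorname{Trace}(BY), \qquad \mathcal{A}^* : S^m \to \mathbb{R}^n, \qquad \mathcal{A}^* Y = (\operatorname{Tr}(Y A_1), \dots, \operatorname{Tr}(Y A_n))^T$$

that is,

$$(D) \qquad \max_{Y \in S^m} \{ \operatorname{Tr}(BY) \mid \operatorname{Tr}(Y A_i) = c_i,\ i = 1, \dots, n,\ Y \succeq 0 \}$$
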
CONIC DUALITY THEOREM

$$(P) \qquad \inf \{ c^T x \mid Ax - b \in K \}$$

$$(D) \qquad \sup \{ b^T y \mid A^T y = c,\ y \in K^* \}$$

1. Dual of (D) = (P).
2. Weak duality: $\inf(P) \ge \sup(D)$.
3. Strong duality: If one of the problems (P), (D) is strictly feasible and bounded, then its dual is solvable and $\inf(P) = \sup(D)$. If both are strictly feasible, then $\min(P) = \max(D)$.
4. Complementarity: If (P) or (D) is strictly feasible and bounded, then a feasible pair $(x, y)$ is an optimal pair if and only if

$$y^T (Ax - b) = 0.$$
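
The statements of the theorem can be checked by hand on a tiny linear program, i.e. the conic problem with $K = K^* = \mathbb{R}^3_+$. The data `A`, `b`, `c` and the candidate pair `x`, `y` below are made up for illustration and do not come from the slides; the helper functions are plain-Python stand-ins for matrix operations.

```python
# (P): min c^T x  s.t.  Ax - b in K,   (D): max b^T y  s.t.  A^T y = c, y in K*,
# with K = K* = the nonnegative orthant of R^3.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 2.0]
c = [1.0, 1.0]

def matvec(M, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

x = [1.0, 2.0]        # primal candidate
y = [1.0, 1.0, 0.0]   # dual candidate

slack = [s - bi for s, bi in zip(matvec(A, x), b)]   # Ax - b
primal_feasible = all(s >= 0.0 for s in slack)
dual_feasible = all(yi >= 0.0 for yi in y) and matvec(transpose(A), y) == c

primal_value = dot(c, x)
dual_value = dot(b, y)
complementarity = dot(y, slack)   # y^T (Ax - b)

print(primal_feasible, dual_feasible, primal_value, dual_value, complementarity)
```

For any feasible pair, $c^T x - b^T y = y^T(Ax - b) \ge 0$, so a zero complementarity term is exactly what makes the two objective values coincide.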
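
THE POWER OF DUALITY IN STRUCTURAL DESIGN

A typical Structural Design problem is

$$\mathrm{Compl}_F(t) \equiv \sup_{f \in F} \mathrm{Compl}_f(t) \to \min$$

$$t_i \succeq 0, \qquad \underline{\rho}_i \le \operatorname{Tr}(t_i) \le \overline{\rho}_i, \quad i = 1, \dots, n, \qquad \sum_{i=1}^{n} \operatorname{Tr}(t_i) \le w$$

where

$$\mathrm{Compl}_f(t) = \sup_{v \in V} \left[ f^T v - \tfrac{1}{2} v^T A(t) v \right], \qquad A(t) = \sum_{i=1}^{n} \sum_{s=1}^{S} b_{is} t_i b_{is}^T, \qquad V = \{ v \in \mathbb{R}^m \mid Rv \le r \},$$

$t_i \in S^d$, $i = 1, \dots, n$, are the design variables, and $F \subset \mathbb{R}^m$ is the set of loading scenarios.

From now on, we assume that

A) $V$ satisfies the Slater condition: $\exists \bar{v} : R\bar{v} < r$.
B) The ground structure does not admit rigid body motions: $\sum_{i,s} b_{is} b_{is}^T \succ 0$.
C) $0 \le \underline{\rho}_i < \overline{\rho}_i$ and $\sum_i \underline{\rho}_i < w$.
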
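Semidefinite reformulation of the Structural Design problem

The function

$$\mathrm{Compl}_f(t) = \sup_{v : Rv \le r} \left[ f^T v - \tfrac{1}{2} v^T \Big( \sum_{i,s} b_{is} t_i b_{is}^T \Big) v \right]$$

of $f \in \mathbb{R}^m$ and $t = (t_1, \dots, t_n)$, $t_i \in S^d_+$, is semidefinite representable:

$$f \in \mathbb{R}^m,\ t \in (S^d_+)^n,\ \mathrm{Compl}_f(t) \le \tau$$

$$\Updownarrow$$

$$\exists \mu \ge 0 : \quad \begin{pmatrix} 2\tau - 2 r^T \mu & -f^T + \mu^T R \\ -f + R^T \mu & \underbrace{\sum_{i=1}^{n} \sum_{s=1}^{S} b_{is} t_i b_{is}^T}_{A(t)} \end{pmatrix} \succeq 0; \qquad t_i \succeq 0,\ i = 1, \dots, n.$$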
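
To see the representation at work in the simplest possible setting, the sketch below takes a single degree of freedom ($m = 1$), no obstacles (so the multiplier $\mu$ is absent), and a positive scalar stiffness `a` standing in for $A(t)$. The numbers and the helper names `is_psd_2x2`, `compliance` and `lmi_holds` are assumptions for illustration only.

```python
def is_psd_2x2(p, q, r, tol=1e-12):
    """PSD test for the symmetric matrix [[p, q], [q, r]]:
    nonnegative diagonal entries and nonnegative determinant."""
    return p >= -tol and r >= -tol and p * r - q * q >= -tol

def compliance(f, a):
    """sup_v [f*v - 0.5*a*v^2] for a > 0; the supremum is attained at v = f/a."""
    return f * f / (2.0 * a)

def lmi_holds(tau, f, a):
    # Obstacle-free scalar case of the block matrix: [[2*tau, -f], [-f, a]].
    return is_psd_2x2(2.0 * tau, -f, a)

f, a = 3.0, 2.0
c = compliance(f, a)

# The LMI should hold exactly for those tau that bound the compliance from above.
checks = [(tau, lmi_holds(tau, f, a), c <= tau)
          for tau in (0.5 * c, 0.9 * c, 1.1 * c, 2.0 * c)]
for tau, lmi, bound in checks:
    print(tau, lmi, bound)
```

In this scalar case the equivalence is just the Schur complement condition $2\tau a - f^2 \ge 0$.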
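
With the semidefinite representation of $\mathrm{Compl}_f(t)$, we can immediately pose the multi-load SD problem as a semidefinite program:

$$\min \tau \quad \text{s.t.}$$

$$\begin{pmatrix} 2\tau - 2 r^T \mu_\ell & -f_\ell^T + \mu_\ell^T R \\ -f_\ell + R^T \mu_\ell & \sum_{i=1}^{n} \sum_{s=1}^{S} b_{is} t_i b_{is}^T \end{pmatrix} \succeq 0, \quad \ell = 1, \dots, k;$$

$$t_i \succeq 0, \quad i = 1, \dots, n; \qquad \sum_{i=1}^{n} \operatorname{Tr}(t_i) \le w; \qquad \underline{\rho}_i \le \operatorname{Tr}(t_i) \le \overline{\rho}_i, \quad i = 1, \dots, n; \qquad \mu_\ell \ge 0, \quad \ell = 1, \dots, k$$

with design variables $t_i \in S^d$, $i = 1, \dots, n$, $\mu_\ell \in \mathbb{R}^{\#\mathrm{obst}}$, $\tau \in \mathbb{R}$, where $\#\mathrm{obst} = \dim r$.
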
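The robust SD problem can also be posed as a semidefinite program, provided that there are no obstacles: $V = \mathbb{R}^m$.

Proposition: Let

$$F = \{ f = Qu \mid u^T u \le 1 \} \qquad [Q \in M^{m,k}].$$

Then the function

$$\mathrm{Compl}_F(t) = \sup_{v} \ \sup_{u : u^T u \le 1} \left[ u^T Q^T v - \tfrac{1}{2} v^T \Big( \sum_{i,s} b_{is} t_i b_{is}^T \Big) v \right]$$

of $t = (t_1, \dots, t_n) \in (S^d_+)^n$ is semidefinite representable:

$$t \in (S^d_+)^n,\ \mathrm{Compl}_F(t) \le \tau \quad \Longleftrightarrow \quad \begin{pmatrix} 2\tau I_k & Q^T \\ Q & \sum_{i=1}^{n} \sum_{s=1}^{S} b_{is} t_i b_{is}^T \end{pmatrix} \succeq 0.$$
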
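Consequently, in the case in question, the robust SD problem is equivalent to the semidefinite program

$$\min \tau \quad \text{s.t.} \quad \begin{pmatrix} 2\tau I_k & Q^T \\ Q & \sum_{i=1}^{n} \sum_{s=1}^{S} b_{is} t_i b_{is}^T \end{pmatrix} \succeq 0;$$

$$t_i \succeq 0, \quad i = 1, \dots, n; \qquad \sum_{i=1}^{n} \operatorname{Tr}(t_i) \le w; \qquad \underline{\rho}_i \le \operatorname{Tr}(t_i) \le \overline{\rho}_i, \quad i = 1, \dots, n;$$

with design variables $t_i \in S^d$, $\tau \in \mathbb{R}$.
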
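Universal semidefinite form of the SD problem

Both the multi-load and the robust cases of the SD problem are covered by the following generic semidefinite program:

$$(Pr) \qquad \min \tau \quad \text{s.t.}$$

$$\begin{pmatrix} 2\tau I_p + \mathcal{D}_\ell z + D_\ell & [\mathcal{E}_\ell z + E_\ell]^T \\ [\mathcal{E}_\ell z + E_\ell] & \sum_{i=1}^{n} \sum_{s=1}^{S} b_{is} t_i b_{is}^T \end{pmatrix} \succeq 0, \quad \ell = 1, \dots, K;$$

$$t_i \succeq 0, \quad i = 1, \dots, n; \qquad \sum_{i=1}^{n} \operatorname{Tr}(t_i) \le w; \qquad \underline{\rho}_i \le \operatorname{Tr}(t_i) \le \overline{\rho}_i, \quad i = 1, \dots, n; \qquad z \ge 0,$$

where
the design variables are $\{ t_i \in S^d \}_{i=1}^{n}$, $z \in \mathbb{R}^N$, $\tau \in \mathbb{R}$;
the data are the $m \times d$ matrices $b_{is}$, the affine mappings $z \mapsto \mathcal{D}_\ell z + D_\ell : \mathbb{R}^N \to S^p$ and $z \mapsto \mathcal{E}_\ell z + E_\ell : \mathbb{R}^N \to M^{m,p}$, $\ell = 1, \dots, K$, and the reals $0 \le \underline{\rho}_i < \overline{\rho}_i$, $i = 1, \dots, n$, $w > 0$.
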
From the computational viewpoint, a disadvantage of (Pr) is its huge design dimension. E.g., when (Pr) comes from obstacle-free truss design, the dimension of the design vector is

$$\frac{M^2}{2} - O(M),$$

$M$ being the number of nodes. For a planar truss with a $15 \times 15$ nodal grid, this dimension is > 25,000. For a spatial truss with a $10 \times 10 \times 10$ nodal grid, the dimension is > 500,000.

It turns out that the semidefinite dual to (Pr) admits analytical elimination of most of the variables, which in many important cases makes it possible to reduce dramatically the dimension (and thus the computational complexity) of the problem.

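Passing from (Pr) to its dual, we get the problem

$$(D_{ini}) \qquad \text{maximize} \quad -\sum_{\ell=1}^{K} \operatorname{Tr}(D_\ell \alpha_\ell + 2 E_\ell^T V_\ell) - \sum_{i=1}^{n} \left[ \overline{\rho}_i \sigma_i^+ - \underline{\rho}_i \sigma_i^- \right] - w\gamma$$

subject to

$$\begin{pmatrix} \alpha_\ell & V_\ell^T \\ V_\ell & \beta_\ell \end{pmatrix} \succeq 0, \quad \ell = 1, \dots, K \qquad [\alpha_\ell \in S^p,\ \beta_\ell \in S^m,\ V_\ell \in M^{m,p}],$$

$$\tau_i \succeq 0, \quad i = 1, \dots, n \qquad [\tau_i \in S^d_+],$$

$$\sigma_i^+, \sigma_i^- \ge 0, \quad i = 1, \dots, n \qquad [\sigma_i^+, \sigma_i^- \in \mathbb{R}],$$

$$\gamma \ge 0 \quad [\gamma \in \mathbb{R}], \qquad \eta \ge 0 \quad [\eta \in \mathbb{R}^N],$$

$$2 \sum_{\ell=1}^{K} \operatorname{Tr}(\alpha_\ell) = 1,$$

$$\sum_{\ell=1}^{K} \left[ \mathcal{D}_\ell^* \alpha_\ell + 2 \mathcal{E}_\ell^* V_\ell \right] + \eta = 0,$$

$$\sum_{\ell=1}^{K} \sum_{s=1}^{S} b_{is}^T \beta_\ell b_{is} + \tau_i + \left[ \sigma_i^- - \sigma_i^+ - \gamma \right] I_d = 0, \quad i = 1, \dots, n,$$

the design variables being $\alpha_\ell, \beta_\ell, V_\ell$, $\ell = 1, \dots, K$, $\sigma_i^\pm$, $\gamma$, $\eta$, $\tau_i$.

However, $\eta$ and $\tau_i$ can be immediately eliminated. Also, it can be shown that at optimality $\beta_\ell = V_\ell \alpha_\ell^{-1} V_\ell^T$.
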
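With this elimination, the dual problem becomes

$$(D1) \qquad \text{minimize} \quad \sum_{\ell=1}^{K} \operatorname{Tr}(D_\ell \alpha_\ell + 2 E_\ell^T V_\ell) + \sum_{i=1}^{n} \left[ \overline{\rho}_i \sigma_i^+ - \underline{\rho}_i \sigma_i^- \right] + w\gamma$$

subject to

$$\begin{pmatrix} A(\alpha) & B_i^T(V) \\ B_i(V) & (\gamma + \sigma_i^+ - \sigma_i^-) I_d \end{pmatrix} \succeq 0, \quad i = 1, \dots, n,$$

$$\sigma_i^+, \sigma_i^- \ge 0, \quad i = 1, \dots, n; \qquad \gamma \ge 0;$$

$$2 \sum_{\ell=1}^{K} \operatorname{Tr}(\alpha_\ell) = 1; \qquad \sum_{\ell=1}^{K} \left[ \mathcal{D}_\ell^* \alpha_\ell + 2 \mathcal{E}_\ell^* V_\ell \right] \le 0,$$

the design variables of the problem being

$$\{ \alpha_\ell \in S^p \}_{\ell=1}^{K}, \qquad \{ V_\ell \in M^{m,p} \}_{\ell=1}^{K}, \qquad \{ \sigma_i^\pm \in \mathbb{R} \}_{i=1}^{n}, \qquad \gamma \in \mathbb{R}.$$
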
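When the dual problem is solved by path-following interior point methods, it is easy to recover a (nearly) optimal design from nearly optimal central path solutions to (D1).

Computational advantages of (D1) as compared to (Pr) are especially significant in the case of truss design:

$$(Pr) \qquad \min \tau \quad \text{s.t.} \quad \begin{pmatrix} 2\tau & f_\ell^T \\ f_\ell & \sum_{i=1}^{n} b_i b_i^T t_i \end{pmatrix} \succeq 0, \quad \ell = 1, \dots, k; \qquad t_i \ge 0, \quad i = 1, \dots, n; \qquad \sum_{i=1}^{n} t_i \le w$$

$$(D1) \qquad 2 \sum_{\ell=1}^{k} f_\ell^T v_\ell + w\gamma \to \min \quad \text{s.t.}$$

$$\begin{pmatrix} \alpha_1 & & & & b_i^T v_1 \\ & \alpha_2 & & & b_i^T v_2 \\ & & \ddots & & \vdots \\ & & & \alpha_k & b_i^T v_k \\ b_i^T v_1 & b_i^T v_2 & \cdots & b_i^T v_k & \gamma \end{pmatrix} \succeq 0, \quad i = 1, \dots, n; \qquad 2 \sum_{\ell=1}^{k} \alpha_\ell = 1$$
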
(Pr) has $n + 1 \approx \frac{M^2}{2} - O(M)$ design variables and $k$ large LMIs (of size $(m + 1) \times (m + 1)$), where $m = 2M - O(\sqrt{M})$ for planar and $m = 3M - O(M^{2/3})$ for spatial trusses.

(D1) has $k(m + 1) = O(1) M k \ll n = O(M^2)$ design variables and $n$ small LMIs (of size $(k + 1) \times (k + 1)$).

E.g., for a planar truss with a $15 \times 15$ nodal grid and 3 loads:

Setting    Design dimension    Effort of analyzing LMIs at a point, a.o.
(Pr)       25,096              37,309,230
(D1)       1,264               267,680

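In the case of Shape design, the advantages of the dual setting can also be quite significant. Consider, e.g., the obstacle-free planar Shape problem with rectangular cells and with simple bounds.

The primal problem is:

$$\min \tau \quad \text{s.t.} \quad \begin{pmatrix} 2\tau & f_\ell^T \\ f_\ell & \sum_{i=1}^{n} \sum_{s=1}^{4} b_{is} t_i b_{is}^T \end{pmatrix} \succeq 0, \quad \ell = 1, \dots, k; \qquad t_i \succeq 0, \quad i = 1, \dots, n; \qquad \sum_{i=1}^{n} \operatorname{Tr}(t_i) \le w$$

$$[\tau \in \mathbb{R},\ t_i \in S^3]$$
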
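The dual problem is

$$2 \sum_{\ell=1}^{k} f_\ell^T v_\ell + w\gamma \to \min \quad \text{s.t.}$$

$$\begin{pmatrix} \alpha_1 & & & & & & & v_1^T b_{i1} \\ & \ddots & & & & & & \vdots \\ & & \alpha_1 & & & & & v_1^T b_{iS} \\ & & & \ddots & & & & \vdots \\ & & & & \alpha_k & & & v_k^T b_{i1} \\ & & & & & \ddots & & \vdots \\ & & & & & & \alpha_k & v_k^T b_{iS} \\ b_{i1}^T v_1 & \cdots & b_{iS}^T v_1 & \cdots & b_{i1}^T v_k & \cdots & b_{iS}^T v_k & \gamma I_3 \end{pmatrix} \succeq 0, \quad i = 1, \dots, n; \qquad 2 \sum_{\ell=1}^{k} \alpha_\ell = 1$$

$$[\alpha_\ell, \gamma \in \mathbb{R},\ v_\ell \in \mathbb{R}^m]$$

E.g., for a planar shape with $14 \times 14$ cells and 3 loads:

Setting    Design dimension    Effort of analyzing LMIs at a point, a.o.
(Pr)       1,177               37,309,230
(D1)       1,264               71,608
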
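From dual back to primal

If (D1) were the usual semidefinite dual of (Pr), the problem dual to (D1) would be (Pr). In fact, (D1) is not the semidefinite dual to (Pr); it is obtained from this dual by eliminating part of the variables. It turns out that the semidefinite dual to (D1) is a nontrivial (and instructive) equivalent reformulation of (Pr), namely, the problem

$$(Pr^+) \qquad \min \tau \quad \text{s.t.}$$

$$\begin{pmatrix} 2\tau I_p + \mathcal{D}_\ell z + D_\ell & [q_{11}^\ell]^T & \cdots & [q_{1S}^\ell]^T & \cdots & [q_{n1}^\ell]^T & \cdots & [q_{nS}^\ell]^T \\ q_{11}^\ell & t_1 \\ \vdots & & \ddots \\ q_{1S}^\ell & & & t_1 \\ \vdots & & & & \ddots \\ q_{n1}^\ell & & & & & t_n \\ \vdots & & & & & & \ddots \\ q_{nS}^\ell & & & & & & & t_n \end{pmatrix} \succeq 0, \quad \ell = 1, \dots, K;$$

$$\underline{\rho}_i \le \operatorname{Tr}(t_i) \le \overline{\rho}_i, \quad i = 1, \dots, n; \qquad \sum_{i=1}^{n} \operatorname{Tr}(t_i) \le w;$$

$$\sum_{i=1}^{n} \sum_{s=1}^{S} b_{is} q_{is}^\ell = \mathcal{E}_\ell z + E_\ell, \quad \ell = 1, \dots, K; \qquad z \ge 0,$$

the design variables being the symmetric $d \times d$ matrices $t_i$, $i = 1, \dots, n$, the $d \times p$ matrices $q_{is}^\ell$, $\ell = 1, \dots, K$, $i = 1, \dots, n$, $s = 1, \dots, S$, the real $\tau$ and $z \in \mathbb{R}^N$.
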
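E.g., in the case of single-load obstacle-free Truss design, $(Pr^+)$ becomes

$$\min \tau \quad \text{s.t.} \quad \begin{pmatrix} 2\tau & q_1 & \cdots & q_n \\ q_1 & t_1 \\ \vdots & & \ddots \\ q_n & & & t_n \end{pmatrix} \succeq 0; \qquad \sum_{i=1}^{n} q_i b_i = f; \qquad t_i \ge 0, \quad i = 1, \dots, n; \qquad \sum_{i=1}^{n} t_i \le w.$$

Since, by the Lemma on the Schur Complement, for $t_i > 0$ one has

$$\begin{pmatrix} 2\tau & q_1 & \cdots & q_n \\ q_1 & t_1 \\ \vdots & & \ddots \\ q_n & & & t_n \end{pmatrix} \succeq 0 \quad \Longleftrightarrow \quad \tau \ge \frac{1}{2} \sum_{i=1}^{n} \frac{q_i^2}{t_i},$$

the problem can be rewritten as

$$\min \left\{ \frac{1}{2} \sum_{i=1}^{n} \frac{q_i^2}{t_i} \ \Big|\ t_i \ge 0,\ \sum_{i=1}^{n} t_i \le w,\ \sum_{i=1}^{n} q_i b_i = f \right\}.$$
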
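We can analytically carry out partial minimization in $t_i$:

$$\min_{t} \left\{ \sum_{i=1}^{n} \frac{q_i^2}{t_i} \ \Big|\ t_i \ge 0,\ \sum_i t_i = w \right\} = \frac{1}{w} \Big( \sum_i |q_i| \Big)^2, \qquad \text{attained at } t_i = \frac{w |q_i|}{\sum_\ell |q_\ell|},$$

thus coming to the problem in $q$-variables:

$$\min_{q} \left\{ \frac{1}{2w} \Big( \sum_i |q_i| \Big)^2 \ \Big|\ \sum_i q_i b_i = f \right\} \quad \Longleftrightarrow \quad \underbrace{\min_{q} \left\{ \sum_i |q_i| \ \Big|\ \sum_i q_i b_i = f \right\}}_{\text{LP!}}$$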
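
The partial minimization over $t$ can be sanity-checked numerically. The sketch below uses made-up bar forces `q` and a made-up budget `w`, and compares the closed-form optimum with randomly sampled feasible designs; none of the numbers or function names come from the slides.

```python
import random

def bar_cost(q, t):
    """sum_i q_i^2 / t_i for strictly positive t_i."""
    return sum(qi * qi / ti for qi, ti in zip(q, t))

def optimal_t(q, w):
    """Closed-form minimizer t_i = w*|q_i| / sum_l |q_l|."""
    total = sum(abs(qi) for qi in q)
    return [w * abs(qi) / total for qi in q]

q = [1.5, -0.5, 2.0, -1.0]   # illustrative bar forces
w = 10.0                     # illustrative material budget

t_star = optimal_t(q, w)
closed_form = sum(abs(qi) for qi in q) ** 2 / w

# Sample random designs with t_i > 0 and sum_i t_i = w, and record the best cost seen.
random.seed(0)
best_random = float("inf")
for _ in range(2000):
    raw = [random.uniform(0.05, 1.0) for _ in q]
    scale = w / sum(raw)
    t = [scale * ri for ri in raw]
    best_random = min(best_random, bar_cost(q, t))

print(bar_cost(q, t_star), closed_form, best_random)
```

No sampled design beats the closed-form value, which is what the Cauchy-Schwarz inequality $\big(\sum_i |q_i|\big)^2 \le \big(\sum_i q_i^2 / t_i\big)\big(\sum_i t_i\big)$ guarantees.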

OPTIMAL CONTROL EXAMPLE

$t_0$ = initial time, $t_1 > t_0$ final time
$x(t) \in \mathbb{R}^n$ state vector at time $t$
$u(t) \in \mathbb{R}$ control function, $u \in L_2[t_0, t_1]$
$A(t)$ $n \times n$ matrix, $b(t) \in \mathbb{R}^n$, $c \in \mathbb{R}^n$

PRIMAL PROBLEM

$$\min J = \frac{1}{2} \int_{t_0}^{t_1} u^2(t)\, dt$$

subject to

$$(1) \qquad x(t_1) \ge c$$

$$(2) \qquad \dot{x}(t) = A(t) x(t) + b(t) u(t).$$

Let $\{X^1(t), \dots, X^n(t)\}$ be a set of linearly independent solutions of the homogeneous system $\dot{x}(t) = A(t) x(t)$, and set

$$X(t) = [X^1(t), \dots, X^n(t)] \qquad (n \times n \text{ fundamental matrix}).$$

Then the solution of (2) is given by

$$x(t) = X(t) X^{-1}(t_0) x(t_0) + X(t) \int_{t_0}^{t} X^{-1}(s) b(s) u(s)\, ds.$$

Define

$$\Phi(t_1, t_2) = X(t_1) X^{-1}(t_2).$$

Then, with $x_0 = x(t_0)$, the primal problem can be written

$$\min \frac{1}{2} \int_{t_0}^{t_1} u^2(t)\, dt \quad \text{subject to} \quad \underbrace{\int_{t_0}^{t_1} \Phi(t_1, t) b(t) u(t)\, dt}_{Ku} \ \ge\ \underbrace{c - \Phi(t_1, t_0) x_0}_{d}$$

COMPUTING THE LAGRANGIAN DUAL

Dual objective function:

$$g(y) = \min_{u \in L_2[t_0, t_1]} \left\{ \int_{t_0}^{t_1} \frac{1}{2} u^2(t)\, dt + y^T (d - Ku) \right\} = y^T d + \min_{u(\cdot)} \int_{t_0}^{t_1} \left[ \frac{1}{2} u^2(t) - y^T \Phi(t_1, t) b(t) u(t) \right] dt.$$

Pointwise minimization inside the integral gives

$$u^*(t) = y^T \Phi(t_1, t) b(t),$$

hence

$$g(y) = y^T d \ \underbrace{- \frac{1}{2} \int_{t_0}^{t_1} y^T \Phi(t_1, t) b(t) b(t)^T \Phi(t_1, t)^T y\, dt}_{y^T Q y}$$

where $Q$ is the N.S.D. $n \times n$ matrix

$$Q = -\frac{1}{2} \int_{t_0}^{t_1} \Phi(t_1, t) b(t) b(t)^T \Phi(t_1, t)^T\, dt.$$

The Dual Problem is

$$\max_{y \ge 0} \ y^T d + y^T Q y,$$

a (finite-dimensional) quadratic program, with $Q$ as above and $d = c - \Phi(t_1, t_0) x_0$.
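
To make the reduction concrete, the following sketch treats a scalar instance ($n = 1$) with constant coefficients, $\dot{x} = a x + u$, so that $\Phi(t_1, t) = e^{a(t_1 - t)}$ and $b \equiv 1$. The values of `a`, `x0`, `c`, the midpoint-rule integrator and all function names are assumptions chosen for illustration; they are not part of the slides.

```python
import math

a, t0, t1 = 0.5, 0.0, 1.0    # scalar dynamics x' = a*x + u, i.e. b(t) = 1
x0, c = 0.0, 1.0             # terminal requirement x(t1) >= c

def phi(t):
    """Phi(t1, t) for the scalar system: exp(a*(t1 - t))."""
    return math.exp(a * (t1 - t))

def integrate(func, lo, hi, n=20000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))

d = c - phi(t0) * x0
Q = -0.5 * integrate(lambda t: phi(t) ** 2, t0, t1)   # negative number (N.S.D. for n = 1)

# Dual problem: maximize y*d + Q*y^2 over y >= 0 (a concave parabola in y).
y_star = max(0.0, -d / (2.0 * Q))
dual_value = y_star * d + Q * y_star ** 2

# Recover the control u*(t) = y* Phi(t1, t) b(t) and evaluate the primal problem.
def u_star(t):
    return y_star * phi(t)

primal_value = 0.5 * integrate(lambda t: u_star(t) ** 2, t0, t1)
x_final = phi(t0) * x0 + integrate(lambda t: phi(t) * u_star(t), t0, t1)

print(y_star, dual_value, primal_value, x_final)
```

The recovered control steers the state exactly to the bound $x(t_1) = c$, and its cost matches the dual optimum, so the one-dimensional quadratic program carries all the information of the infinite-dimensional control problem in this instance.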
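
EXAMPLE: A NONCONVEX QUADRATIC PROBLEM

$$(A) \qquad \min_{z \in \mathbb{R}^n} \left\{ \frac{1}{2} z^T Q z + c^T z \ :\ z^T z \le 1 \right\}$$

$Q$ = $n \times n$ symmetric indefinite matrix
eigenvalues: $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$ $\ (\lambda_1 < 0)$
orthonormal eigenvectors: $u_1, u_2, \dots, u_n$
$\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$; $\quad P = (u_1, \dots, u_n)^T$

With the change of variables $x = Pz$, and using $Q = P^T \Lambda P$, $P^T P = I$, problem (A) is converted to:

$$(P) \qquad \min_{x \in \mathbb{R}^n} \left\{ \frac{1}{2} \sum_i \lambda_i x_i^2 + \bar{c}^T x \ :\ x^T x \le 1 \right\}, \qquad \bar{c} = Pc.$$

Dual of (P) via Lagrangian duality:

$$(D) \qquad \max_{\mu} \left\{ -\frac{1}{2} \sum_i \frac{\bar{c}_i^2}{\lambda_i + \mu} - \frac{\mu}{2} \ :\ \lambda_i + \mu \ge 0 \ \ \forall i,\ \mu \ge 0 \right\}$$

a concave program. We now compute the dual of the dual.
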
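Rewriting (D) as

$$(D) \qquad \max_{y \in \mathbb{R}^n,\ \mu \in \mathbb{R}} \ -\frac{1}{2} \sum_i \frac{\bar{c}_i^2}{y_i} - \frac{\mu}{2}$$

subject to

$$-y_i + \lambda_i + \mu = 0 \quad \Big( \text{multiplier } \frac{u_i}{2} \Big), \qquad y_i \ge 0, \quad \mu \ge 0,$$

the (Lagrangian) dual of (D) is

$$(DD) \qquad \min \ \frac{1}{2} \sum_i \lambda_i u_i - \sum_i |\bar{c}_i| \sqrt{u_i} \quad \text{subject to} \quad \sum_i u_i \le 1, \quad u_i \ge 0.$$

(DD) is, of course, a convex program.

What is the relation between (P) and (DD)?

$$(P) \qquad \min_{x} \ \frac{1}{2} \sum_i \lambda_i x_i^2 + \sum_i \bar{c}_i x_i \quad \text{subject to} \quad \sum_i x_i^2 \le 1$$
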
Theorem 1. The nonconvex program (P) is equivalent to the convex program (DD):
$\{u_i : i = 1, \dots, n\}$ solves (DD) iff $\{x_i = -(\operatorname{sign} \bar{c}_i) \sqrt{u_i},\ i = 1, \dots, n\}$ solves (P).

Proof:

$$\min(P) \ge \max(D) = \min(DD),$$

but $x_i = -(\operatorname{sign} \bar{c}_i) \sqrt{u_i}$ is feasible for (P), and the objective function of (P) evaluated at $x$ is equal to $\min(DD)$. Hence $x$ attains $\min(P)$.
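
A brute-force check of Theorem 1 on a two-dimensional instance may help build intuition. The eigenvalues `lam`, the vector `cbar` and the grid resolutions below are arbitrary choices for illustration, and the recovery step assumes the sign convention $x_i = -(\operatorname{sign} \bar{c}_i)\sqrt{u_i}$, which is the one that matches an objective containing $+\bar{c}^T x$.

```python
import math

lam = [-1.0, 0.5]     # lambda_1 < 0, so (P) is nonconvex
cbar = [0.3, -0.4]

def p_objective(x):
    quad = 0.5 * sum(l * xi * xi for l, xi in zip(lam, x))
    return quad + sum(ci * xi for ci, xi in zip(cbar, x))

def dd_objective(u):
    lin = 0.5 * sum(l * ui for l, ui in zip(lam, u))
    return lin - sum(abs(ci) * math.sqrt(ui) for ci, ui in zip(cbar, u))

# (DD): grid search over {u >= 0, u_1 + u_2 <= 1}.
N = 200
u_grid = ([i / N, j / N] for i in range(N + 1) for j in range(N + 1 - i))
u_best = min(u_grid, key=dd_objective)
dd_value = dd_objective(u_best)

# (P): independent polar grid search over the unit disc.
p_value = min(
    p_objective([(r / 50) * math.cos(2 * math.pi * k / 3600),
                 (r / 50) * math.sin(2 * math.pi * k / 3600)])
    for r in range(51) for k in range(3600)
)

# Recover a primal point from the (DD) solution.
x_rec = [-math.copysign(1.0, ci) * math.sqrt(ui) for ci, ui in zip(cbar, u_best)]

print(dd_value, p_value, x_rec, p_objective(x_rec))
```

Up to grid resolution the two optimal values agree, and the recovered point is feasible for (P) with the same objective value as the (DD) minimizer.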

EXAMPLE: STATISTICAL INFORMATION THEORY

$X$: random variable (nondegenerate)
$B$: support of $X$
$f_X$: density of $X$
$D$: class of density functions with support $B$ which are absolutely continuous w.r.t. a nonnegative measure $dt$
$A_i(t)$: summable functions

$$(P) \qquad \inf_{f \in D} \ I(f, f_X) = \int_B f(t) \log \frac{f(t)}{f_X(t)}\, dt \quad \text{s.t.} \quad \int_B f(t) A_i(t)\, dt = \theta_i, \quad i = 1, \dots, m$$

An $\infty$-dimensional problem.

Fundamental Problem in Statistical Information Theory. Applications in Traffic Engineering, Accounting, Marketing, Signal Processing, Statistics . . .

$D \subset L^p(\Omega, \mathcal{F}, P)$, the linear space of measurable real-valued functions $f : \Omega \to \mathbb{R}$ with $\|f\|_p < \infty$, where

$$\|f\|_p = \left( \int |f(\omega)|^p\, dP(\omega) \right)^{1/p}, \qquad (1 < p < \infty).$$

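The Dual Problem

$$(D) \qquad \sup_{y \in \mathbb{R}^m} \left\{ \sum_i y_i \theta_i - \log \int_B f_X(t)\, e^{\sum_i A_i(t) y_i}\, dt \right\}$$

Unconstrained! Finite Dimensional!

Duality Theorem
(i) $\inf(P) = \sup(D)$
(ii) $\inf(P) = \min(P)$
(iii) $\sup(D) = \max(D)$ iff (P) is superconsistent
(iv) $\sup(D) < \infty \iff$ (P) is feasible
(v) If $y^*$ solves (D), then

$$f^*(t) = \frac{f_X(t)\, e^{\sum_i A_i(t) y_i^*}}{\int_B f_X(t)\, e^{\sum_i A_i(t) y_i^*}\, dt}$$

solves (P).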
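
A discrete analogue shows how little computation the dual requires. In the sketch below the measure $dt$ is taken to be the counting measure on the faces of a die, $f_X$ is uniform, and there is one moment constraint $A_1(t) = t$ with a prescribed mean `theta`. All of these choices, and the bisection solver, are assumptions made for illustration.

```python
import math

support = [1, 2, 3, 4, 5, 6]
f_X = [1.0 / 6.0] * 6     # reference density w.r.t. the counting measure
theta = 4.5               # prescribed mean: sum_t f(t) * t = theta

def dual_objective(y):
    """y*theta - log sum_t f_X(t) * exp(t*y)."""
    return y * theta - math.log(sum(p * math.exp(t * y) for p, t in zip(f_X, support)))

def tilted(y):
    """Candidate primal solution f(t) proportional to f_X(t) * exp(t*y)."""
    weights = [p * math.exp(t * y) for p, t in zip(f_X, support)]
    z = sum(weights)
    return [wgt / z for wgt in weights]

def mean(f):
    return sum(p * t for p, t in zip(f, support))

# The dual objective is concave with derivative theta - mean(tilted(y)),
# and mean(tilted(y)) increases with y, so bisection finds the maximizer.
lo, hi = -10.0, 10.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if mean(tilted(mid)) < theta:
        lo = mid
    else:
        hi = mid
y_star = 0.5 * (lo + hi)

f_star = tilted(y_star)
primal_value = sum(p * math.log(p / g) for p, g in zip(f_star, f_X))
dual_value = dual_objective(y_star)

print(y_star, f_star, primal_value, dual_value)
```

The recovered density meets the moment constraint, and its divergence from $f_X$ coincides with the dual optimum, in line with items (i) and (v) of the theorem.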
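
* If $f_X(t)$ is not a density function, but just a positive summable function, then the dual problem is:

$$\sup_{y \in \mathbb{R}^m} \left\{ \sum_i y_i \theta_i - \int_B f_X(t)\, e^{\sum_i A_i(t) y_i - 1}\, dt \right\}$$

Examples

(1) UNIFORM

$$\inf \left\{ \int_a^b f(t) \log f(t)\, dt \ :\ \int_a^b f(t)\, dt = 1 \right\}, \qquad \text{sol.} \quad f(t) = \begin{cases} 1/(b - a), & a \le t \le b \\ 0, & \text{otherwise} \end{cases}$$

(2) NORMAL

$$\inf_{f \in D} \left\{ \int f(t) \log f(t)\, dt \ :\ \int t^2 f(t)\, dt = \sigma^2 \right\}, \qquad \text{sol.} \quad f(t) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-t^2 / (2\sigma^2)}$$

(3) GAMMA

$$\inf_{f \in D} \left\{ \int_0^\infty f(t) \log \frac{f(t)}{t^{K-1}}\, dt \ :\ \int_0^\infty t f(t)\, dt = K/\lambda \right\}, \qquad \text{sol.} \quad f(t) = \frac{\lambda^K t^{K-1} e^{-\lambda t}}{\Gamma(K)}$$
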
LIST OF P.D.F.s DERIVED FROM THE ABOVE DUALITY

Discrete r.v.: uniform, geometric, binomial, Poisson, log series, truncated geometric, ...

Continuous r.v.: normal, Laplace, generalized Cauchy, exponential, gamma, beta, log normal, ...

Multivariate r.v.: multivariate normal, multivariate log normal, Dirichlet, multivariate beta of the 2nd kind, (generalized) multivariate logistic, ...
