
ON THE IMPORTANCE OF DUALITY IN OPTIMAL DESIGN

Aharon Ben-Tal
MINERVA Optimization Center
Technion - Israel Institute of Technology

WHAT IS DUALITY THEORY?

Kuhn (SIAM-AMS Proc., Vol. 9, 1976):
A Duality Theory is made of the following elements:
(a) A pair of optimization problems, one a minimization problem (problem (P)) and the other a maximization problem (problem (D)), based on the same data.
(b) Weak duality holds: $\min(P) \ge \max(D)$.
(c) A necessary and sufficient condition for optimality of a feasible pair is the equality of the corresponding objective function values.
My addition:
(d) A computable (tractable) relation between the optimal solutions of (P) and (D).
(e) Relations between primal feasibility, dual boundedness, and dual attainability.

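LAGRANGIAN DUALITY

$$(P)\qquad \inf_{x\in\mathbb{R}^n}\{\, f(x) \mid g_i(x)\le 0,\ i=1,\dots,m \,\},\qquad f,\ g_i \text{ convex}$$

Lagrangian: $L(x,y) = f(x) + \sum_{i=1}^{m} y_i\, g_i(x)$

Dual objective function:
$$g(y) = \inf_{x\in\mathbb{R}^n} L(x,y)\quad\text{(a concave function)},\qquad \operatorname{dom} g = \{\, y \mid g(y) > -\infty \,\}$$

$$(D)\qquad \sup_{y\in\mathbb{R}^m}\{\, g(y) \mid y\in\operatorname{dom} g,\ y\ge 0 \,\}$$

Duality Theorem. If (P) satisfies
$$\exists\,\bar x:\ g_i(\bar x) < 0,\quad i=1,\dots,m \qquad\text{(Slater condition)}$$
then $\inf(P) = \max(D)$.

Extensions to $\infty$-dimensional spaces.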
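As a small illustration of the Lagrangian construction, the sketch below works through a one-dimensional toy problem that is not taken from the slides: minimize x^2 subject to 1 - x <= 0. The problem data, the grid of multipliers, and the function names are all assumptions made for this example. The inner minimization over x is done in closed form, and the dual is then maximized over a grid.

```python
# Toy instance (an assumption for illustration, not from the slides):
#   (P)  inf { x^2 | 1 - x <= 0 }     with optimal value 1 at x = 1.
# Lagrangian: L(x, y) = x^2 + y * (1 - x), minimized over x at x = y / 2,
# which gives the concave dual objective g(y) = y - y^2 / 4.

def dual_objective(y: float) -> float:
    """g(y) = inf_x L(x, y), evaluated at the minimizer x = y / 2."""
    x = y / 2.0
    return x * x + y * (1.0 - x)

def primal_value() -> float:
    """f is increasing on the feasible set x >= 1, so the infimum is at x = 1."""
    return 1.0 ** 2

# Maximize g over a grid of multipliers y >= 0.
grid = [i / 1000.0 for i in range(0, 5001)]
best_y = max(grid, key=dual_objective)
dual_value = dual_objective(best_y)

print(best_y, dual_value, primal_value())
```

The Slater condition holds here (x = 2 is strictly feasible), so the dual maximum is attained and matches the primal value, while every other multiplier on the grid gives a lower bound.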

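CONIC DUALITY

$K$ = closed convex pointed cone, $\operatorname{int} K \neq \emptyset$

$$(P)\qquad \inf_{x\in\mathbb{R}^n}\{\, c^T x \mid Ax - b \in K \,\},\qquad A \text{ is } m\times n$$

(P) is strictly feasible if $\exists\,\bar x:\ A\bar x - b \in \operatorname{int} K$.

The conic dual of (P) is
$$(D)\qquad \sup_{y\in\mathbb{R}^m}\{\, b^T y \mid A^T y = c,\ y \in K^* \,\}$$

$K^*$ is the dual cone of $K$:
$$K^* = \{\, y \mid y^T x \ge 0\ \ \forall x \in K \,\}$$

(D) is strictly feasible if $\exists\,\bar y \in \operatorname{int} K^*:\ A^T \bar y = c$.

Examples of dual cones
$K = \mathbb{R}^n_+$, $\quad K^* = K$
$K = L^n = \{\, (x_1,\dots,x_{n-1},x_n) \mid x_n \ge \sqrt{x_1^2 + \dots + x_{n-1}^2} \,\}$, $\quad K^* = K$
$K = S^n_+ = \{\, A \in \mathbb{R}^{n\times n} \mid A = A^T,\ A \succeq 0 \,\}$, $\quad K^* = K$
$K = \{\, x \mid Ax \ge 0 \,\}$, $\quad K^* = \{\, A^T y \mid y \ge 0 \,\}$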
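The self-duality of the Lorentz cone can be probed numerically. The following is a minimal sketch, assuming n = 3 and a fixed random seed; the helper names are hypothetical. It checks that sampled points of the cone have nonnegative pairwise inner products, and that one particular point outside the cone is separated by a cone element, so it cannot belong to the dual cone. It is a sanity check, not a proof.

```python
import math
import random

def in_lorentz_cone(x, tol=1e-12):
    """True if x_n >= sqrt(x_1^2 + ... + x_{n-1}^2)."""
    return x[-1] + tol >= math.sqrt(sum(v * v for v in x[:-1]))

def random_cone_point(n, rng):
    """Sample a point of L^n: random head, last coordinate at least its norm."""
    head = [rng.uniform(-1.0, 1.0) for _ in range(n - 1)]
    norm = math.sqrt(sum(v * v for v in head))
    return head + [norm + rng.uniform(0.0, 1.0)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

rng = random.Random(0)  # fixed seed, so the run is reproducible
points = [random_cone_point(3, rng) for _ in range(200)]

# K is contained in K*: any two points of the cone have a nonnegative inner product.
min_inner = min(dot(x, y) for x in points for y in points)

# A point outside K is also outside K*: some x in K gives y^T x < 0.
y_out = [1.0, 0.0, 0.5]      # violates 0.5 >= 1.0, so it is not in L^3
witness = [-1.0, 0.0, 1.0]   # lies on the boundary of L^3

print(min_inner >= 0.0, in_lorentz_cone(witness), dot(y_out, witness))
```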

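SDP DUALITY

$$(P)\qquad \min_{x\in\mathbb{R}^n}\{\, \langle c, x\rangle \mid \mathcal{A}x - B \succeq 0 \,\}$$

$\mathcal{A}x = x_1 A_1 + \dots + x_n A_n$, $\quad \mathcal{A}: \mathbb{R}^n \to S^m$, $\quad A_i,\ B \in S^m$

Dual of (P):
$$\max_{Y\in S^m}\{\, \langle B, Y\rangle \mid \mathcal{A}^* Y = c,\ Y \succeq 0 \,\}$$

$\langle B, Y\rangle = \mathrm{Trace}(BY)$
$\mathcal{A}^*: S^m \to \mathbb{R}^n$, $\quad \mathcal{A}^* Y = (\mathrm{Tr}(YA_1),\dots,\mathrm{Tr}(YA_n))^T$

Hence
$$(D)\qquad \max_{Y\in S^m}\{\, \mathrm{Tr}(BY) \mid \mathrm{Tr}(YA_i) = c_i,\ i=1,\dots,n,\ Y \succeq 0 \,\}$$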

CONIC DUALITY THEOREM

$$(P)\qquad \inf\{\, c^T x \mid Ax - b \in K \,\}$$
$$(D)\qquad \sup\{\, b^T y \mid A^T y = c,\ y \in K^* \,\}$$

1. Dual of (D) = (P)
2. Weak duality: $\inf(P) \ge \sup(D)$
3. Strong duality: If one of (P), (D) is strictly feasible and bounded, then its dual is solvable and $\inf(P) = \sup(D)$. If both are strictly feasible, then $\min(P) = \max(D)$.
4. Complementarity: If (P) or (D) is strictly feasible and bounded, then a feasible pair $x, y$ is an optimal pair if and only if
$$y^T(Ax - b) = 0$$
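Weak duality and complementarity can be checked by hand on a tiny linear program, the special case K = R^m_+ (so K* = K). The sketch below uses made-up data (A, b, c and the two feasible pairs are assumptions for illustration). It relies on the identity c^T x - b^T y = y^T (Ax - b), which holds whenever A^T y = c: the gap is nonnegative for every feasible pair and vanishes exactly when complementarity holds.

```python
# Illustrative data (assumed, not from the slides), with K = R^3_+ so K* = K:
#   (P) min x1 + x2   s.t.  x1 >= 1,  x2 >= 2,  x1 + x2 >= 4
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 4.0]
c = [1.0, 1.0]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def slack(x):
    """Ax - b, which must lie in K = R^3_+ for primal feasibility."""
    return [dot(row, x) - bi for row, bi in zip(A, b)]

def At_times(y):
    """A^T y, which must equal c for dual feasibility."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(c))]

def duality_gap(x, y):
    """c^T x - b^T y; equals y^T (Ax - b) whenever A^T y = c."""
    return dot(c, x) - dot(b, y)

x_feas, y_feas = [3.0, 3.0], [0.5, 0.5, 0.5]   # feasible, not optimal
x_opt, y_opt = [2.0, 2.0], [0.0, 0.0, 1.0]     # feasible and optimal

for x, y in [(x_feas, y_feas), (x_opt, y_opt)]:
    # Both pairs are primal and dual feasible.
    assert min(slack(x)) >= 0 and min(y) >= 0 and At_times(y) == c
    print(duality_gap(x, y), dot(y, slack(x)))
```

For the first pair the two printed numbers agree and are positive; for the second pair both are zero, which is the complementarity condition of item 4.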

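THE POWER OF DUALITY IN STRUCTURAL DESIGN

A typical Structural Design problem is
$$\begin{array}{l}
\mathrm{Compl}_F(t) \equiv \sup_{f\in F} \mathrm{Compl}_f(t) \to \min \\
t_i \succeq 0,\quad \underline{\rho}_i \le \mathrm{Tr}(t_i) \le \overline{\rho}_i,\quad i=1,\dots,n, \\
\sum_{i=1}^n \mathrm{Tr}(t_i) \le w
\end{array}$$
where
$$\mathrm{Compl}_f(t) = \sup_{v\in V}\Big[ f^T v - \tfrac12 v^T A(t) v \Big],\qquad A(t) = \sum_{i=1}^n\sum_{s=1}^S b_{is}\, t_i\, b_{is}^T,\qquad V = \{\, v\in\mathbb{R}^m \mid Rv \le r \,\}$$
$t_i \in S^d$, $i=1,\dots,n$, are design variables;
$F \subset \mathbb{R}^m$ is the set of loading scenarios.

From now on, we assume that
A) $V$ satisfies the Slater condition: $\exists\,\bar v:\ R\bar v < r$
B) The ground structure does not admit rigid body motions:
$$\sum_{i,s} b_{is} b_{is}^T \succ 0$$
C) $0 \le \underline{\rho}_i < \overline{\rho}_i$ and $\sum_i \underline{\rho}_i < w$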

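Semidefinite reformulation of Structural Design problem

The function
$$\mathrm{Compl}_f(t) = \sup_{v:\,Rv\le r}\Big[ f^T v - \tfrac12 v^T\Big(\sum_{i,s} b_{is}\, t_i\, b_{is}^T\Big) v \Big]$$
of $f\in\mathbb{R}^m$ and $t = (t_1,\dots,t_n)$, $t_i \in S^d_+$, is semidefinite representable:
$$f\in\mathbb{R}^m,\quad t\in(S^d_+)^n,\quad \mathrm{Compl}_f(t) \le \tau$$
$$\Updownarrow$$
$$\exists\,\mu\ge 0:\quad
\begin{pmatrix}
2\tau - 2r^T\mu & -f^T + \mu^T R \\
-f + R^T\mu & \underbrace{\sum_{i=1}^n\sum_{s=1}^S b_{is}\, t_i\, b_{is}^T}_{A(t)}
\end{pmatrix} \succeq 0;\qquad t_i \succeq 0,\ i=1,\dots,n.$$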

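With the semidefinite representation of $\mathrm{Compl}_f(t)$, we can immediately pose the multi-load SD problem as a semidefinite program:
$$\begin{array}{ll}
\min\ \tau & \\
\text{s.t.} &
\begin{pmatrix}
2\tau - 2r^T\mu_\ell & -f_\ell^T + \mu_\ell^T R \\
-f_\ell + R^T\mu_\ell & \sum_{i=1}^n\sum_{s=1}^S b_{is}\, t_i\, b_{is}^T
\end{pmatrix} \succeq 0,\quad \ell=1,\dots,k; \\
& t_i \succeq 0,\quad i=1,\dots,n; \\
& \sum_{i=1}^n \mathrm{Tr}(t_i) \le w; \\
& \underline{\rho}_i \le \mathrm{Tr}(t_i) \le \overline{\rho}_i,\quad i=1,\dots,n; \\
& \mu_\ell \ge 0,\quad \ell=1,\dots,k
\end{array}$$
with design variables $t_i\in S^d$, $i=1,\dots,n$, $\mu_\ell\in\mathbb{R}^{\#\mathrm{obst}}$, $\tau\in\mathbb{R}$, where $\#\mathrm{obst} = \dim r$.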

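The robust SD problem can also be posed as a semidefinite program, provided that there are no obstacles: $V = \mathbb{R}^m$.

Proposition: Let
$$F = \{\, f = Qu \mid u^T u \le 1 \,\}\qquad [Q \in M^{m,k}].$$
Then the function
$$\mathrm{Compl}_F(t) = \sup_{v}\ \sup_{u:\,u^Tu\le 1}\Big[ u^T Q^T v - \tfrac12 v^T\Big(\sum_{i,s} b_{is}\, t_i\, b_{is}^T\Big) v \Big]$$
of $t = (t_1,\dots,t_n)\in(S^d_+)^n$ is semidefinite representable:
$$t\in(S^d_+)^n,\quad \mathrm{Compl}_F(t)\le\tau$$
$$\Updownarrow$$
$$\begin{pmatrix}
2\tau I_k & Q^T \\
Q & \sum_{i=1}^n\sum_{s=1}^S b_{is}\, t_i\, b_{is}^T
\end{pmatrix} \succeq 0.$$

Consequently, in the case in question, the robust SD problem is equivalent to the semidefinite program
$$\begin{array}{ll}
\min\ \tau & \\
\text{s.t.} &
\begin{pmatrix}
2\tau I_k & Q^T \\
Q & \sum_{i=1}^n\sum_{s=1}^S b_{is}\, t_i\, b_{is}^T
\end{pmatrix} \succeq 0; \\
& t_i \succeq 0,\quad i=1,\dots,n; \\
& \sum_{i=1}^n \mathrm{Tr}(t_i) \le w; \\
& \underline{\rho}_i \le \mathrm{Tr}(t_i) \le \overline{\rho}_i,\quad i=1,\dots,n
\end{array}$$
with design variables $t_i\in S^d$, $\tau\in\mathbb{R}$.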

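Universal semidefinite form of the SD problem

Both multi-load and robust cases of the SD problem are covered by the following generic semidefinite program:
$$(Pr)\qquad\begin{array}{ll}
\min\ \tau & \\
\text{s.t.} &
\begin{pmatrix}
2\tau I_p + \mathcal{D}_\ell z + D_\ell & [\mathcal{E}_\ell z + E_\ell]^T \\
{[\mathcal{E}_\ell z + E_\ell]} & \sum_{i=1}^n\sum_{s=1}^S b_{is}\, t_i\, b_{is}^T
\end{pmatrix} \succeq 0,\quad \ell=1,\dots,K; \\
& t_i \succeq 0,\quad i=1,\dots,n; \\
& \sum_{i=1}^n \mathrm{Tr}(t_i) \le w; \\
& \underline{\rho}_i \le \mathrm{Tr}(t_i) \le \overline{\rho}_i,\quad i=1,\dots,n; \\
& z \ge 0,
\end{array}$$
where
design variables are $\{t_i\in S^d\}_{i=1}^n$, $z\in\mathbb{R}^N$, $\tau\in\mathbb{R}$;
data are $m\times d$ matrices $b_{is}$, affine mappings $z\mapsto\mathcal{D}_\ell z + D_\ell:\ \mathbb{R}^N\to S^p$ and $z\mapsto\mathcal{E}_\ell z + E_\ell:\ \mathbb{R}^N\to M^{m,p}$, $\ell=1,\dots,K$, and reals $0\le\underline{\rho}_i<\overline{\rho}_i$, $i=1,\dots,n$, $w>0$.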


From the computational viewpoint, a disadvantage of (Pr) is its huge design dimension. E.g., when (Pr) comes from obstacle-free truss design, the dimension of the design vector is
$$\frac{M^2}{2} - O(M),$$
$M$ being the number of nodes. For a planar truss with a $15\times 15$ nodal grid, this dimension is > 25,000. For a spatial truss with a $10\times 10\times 10$ nodal grid, the dimension is > 500,000.
It turns out that the semidefinite dual to (Pr) admits analytical elimination of most of the variables, which in many important cases allows one to reduce dramatically the dimension (and thus the computational complexity) of the problem.

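Passing from (Pr) to its dual, we get the problem
$$(D_{\mathrm{ini}})\qquad\begin{array}{ll}
\text{maximize} & -\sum_{\ell=1}^K \mathrm{Tr}(D_\ell\alpha_\ell + 2E_\ell^T V_\ell) - \sum_{i=1}^n\big[\overline{\rho}_i\sigma_i^+ - \underline{\rho}_i\sigma_i^-\big] - w\gamma \\
\text{s.t.} &
\begin{pmatrix}\alpha_\ell & V_\ell^T \\ V_\ell & \beta_\ell\end{pmatrix}\succeq 0,\quad \ell=1,\dots,K \qquad [\alpha_\ell\in S^p,\ \beta_\ell\in S^m,\ V_\ell\in M^{m,p}], \\
& \tau_i \succeq 0,\quad i=1,\dots,n \qquad [\tau_i\in S^d], \\
& \sigma_i^+,\ \sigma_i^- \ge 0,\quad i=1,\dots,n \qquad [\sigma_i^+,\ \sigma_i^-\in\mathbb{R}], \\
& \gamma \ge 0 \qquad [\gamma\in\mathbb{R}], \\
& \eta \ge 0 \qquad [\eta\in\mathbb{R}^N], \\
& 2\sum_{\ell=1}^K \mathrm{Tr}(\alpha_\ell) = 1, \\
& \sum_{\ell=1}^K\big[\mathcal{D}_\ell^*\alpha_\ell + 2\mathcal{E}_\ell^* V_\ell\big] + \eta = 0, \\
& \sum_{\ell=1}^K\sum_{s=1}^S b_{is}^T\beta_\ell\, b_{is} + \tau_i + (\sigma_i^- - \sigma_i^+ - \gamma)\, I_d = 0,\quad i=1,\dots,n,
\end{array}$$
the design variables being $\alpha_\ell, \beta_\ell, V_\ell$, $\ell=1,\dots,K$, $\sigma_i^\pm$, $\gamma$, $\eta$, $\tau_i$.

However, the slack variables $\eta$, $\tau_i$ can be immediately eliminated (the corresponding equalities become inequalities). Also, it can be shown that at optimality $\beta_\ell = V_\ell\,\alpha_\ell^{-1} V_\ell^T$.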

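With this elimination, the dual problem becomes
$$(D1)\qquad\begin{array}{ll}
\text{minimize} & \sum_{\ell=1}^K \mathrm{Tr}(D_\ell\alpha_\ell + 2E_\ell^T V_\ell) + \sum_{i=1}^n\big[\overline{\rho}_i\sigma_i^+ - \underline{\rho}_i\sigma_i^-\big] + w\gamma \\
\text{s.t.} &
\begin{pmatrix} A(\alpha) & B_i^T(V) \\ B_i(V) & (\gamma + \sigma_i^+ - \sigma_i^-)\, I_d \end{pmatrix}\succeq 0,\quad i=1,\dots,n, \\
& \sigma_i^+,\ \sigma_i^- \ge 0,\quad i=1,\dots,n; \\
& \gamma \ge 0; \\
& 2\sum_{\ell=1}^K \mathrm{Tr}(\alpha_\ell) = 1; \\
& \sum_{\ell=1}^K\big[\mathcal{D}_\ell^*\alpha_\ell + 2\mathcal{E}_\ell^* V_\ell\big] \le 0,
\end{array}$$
the design variables of the problem being
$$\{\alpha_\ell\in S^p\}_{\ell=1}^K,\quad \{V_\ell\in M^{m,p}\}_{\ell=1}^K,\quad \{\sigma_i^\pm\in\mathbb{R}\}_{i=1}^n,\quad \gamma\in\mathbb{R}.$$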

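When the dual problem is solved by path-following interior point methods, it is easy to recover a (nearly) optimal design from nearly optimal central path solutions to (D1).

Computational advantages of (D1) as compared to (Pr) are especially significant in the case of truss design:
$$(Pr)\qquad\begin{array}{ll}
\min\ \tau & \\
\text{s.t.} &
\begin{pmatrix} 2\tau & f_\ell^T \\ f_\ell & \sum_{i=1}^n t_i\, b_i b_i^T \end{pmatrix}\succeq 0,\quad \ell=1,\dots,k; \\
& t_i \ge 0,\quad i=1,\dots,n; \\
& \sum_{i=1}^n t_i \le w
\end{array}$$
$$(D1)\qquad\begin{array}{ll}
& 2\sum_{\ell=1}^k f_\ell^T v_\ell + w\gamma \to \min \\
\text{s.t.} &
\begin{pmatrix}
\lambda_1 & & & b_i^T v_1 \\
 & \ddots & & \vdots \\
 & & \lambda_k & b_i^T v_k \\
b_i^T v_1 & \cdots & b_i^T v_k & \gamma
\end{pmatrix}\succeq 0,\quad i=1,\dots,n; \\
& \sum_{\ell=1}^k \lambda_\ell = 1
\end{array}$$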

(Pr) has $n + 1 = \frac{M^2}{2} - O(M)$ design variables and $k$ large LMIs (of size $(m+1)\times(m+1)$), where
$m = 2M - O(\sqrt{M})$ for planar and
$m = 3M - O(M^{2/3})$ for spatial trusses.
(D1) has $k(m+1) = O(1)\,Mk \ll n = O(M^2)$ design variables and $n$ small LMIs (of sizes $(k+1)\times(k+1)$).

E.g., for a planar truss with a $15\times 15$ nodal grid and 3 loads:

Setting    Design dimension    Effort of analyzing LMIs at a point, a.o.
(Pr)       25,096              37,309,230
(D1)       1,264               267,680

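In the case of Shape design, the advantages of the dual setting can also be quite significant. Consider, e.g., the obstacle-free planar Shape problem with rectangular cells and with simple bounds.
The primal problem is:
$$\begin{array}{ll}
\min\ \tau & \\
\text{s.t.} &
\begin{pmatrix} 2\tau & f_\ell^T \\ f_\ell & \sum_{i=1}^n\sum_{s=1}^4 b_{is}\, t_i\, b_{is}^T \end{pmatrix}\succeq 0,\quad \ell=1,\dots,k; \\
& t_i \succeq 0,\quad i=1,\dots,n; \\
& \sum_{i=1}^n \mathrm{Tr}(t_i) \le w
\end{array}\qquad [\tau\in\mathbb{R},\ t_i\in S^3]$$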

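The dual problem is
$$\begin{array}{ll}
& \sum_{\ell=1}^k f_\ell^T v_\ell + w\gamma \to \min \\
\text{s.t.} &
\begin{pmatrix}
\lambda_1 & & & & & & & v_1^T b_{i1} \\
 & \ddots & & & & & & \vdots \\
 & & \lambda_1 & & & & & v_1^T b_{iS} \\
 & & & \ddots & & & & \vdots \\
 & & & & \lambda_k & & & v_k^T b_{i1} \\
 & & & & & \ddots & & \vdots \\
 & & & & & & \lambda_k & v_k^T b_{iS} \\
b_{i1}^T v_1 & \cdots & b_{iS}^T v_1 & \cdots & b_{i1}^T v_k & \cdots & b_{iS}^T v_k & \gamma I_3
\end{pmatrix}\succeq 0,\quad i=1,\dots,n; \\
& \sum_{\ell=1}^k \lambda_\ell = 1
\end{array}\qquad [\lambda_\ell,\ \gamma\in\mathbb{R},\ v_\ell\in\mathbb{R}^m]$$

E.g., for a planar shape with $14\times 14$ cells and 3 loads:

Setting    Design dimension    Effort of analyzing LMIs at a point, a.o.
(Pr)       1,177               37,309,230
(D1)       1,264               71,608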

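From dual back to primal

If (D1) were the usual semidefinite dual of (Pr), the problem dual to (D1) would be (Pr).
In fact, (D1) is not the semidefinite dual to (Pr): it is obtained from this dual by eliminating part of the variables. It turns out that the semidefinite dual to (D1) is a nontrivial (and instructive) equivalent reformulation of (Pr), namely, the problem
$$(Pr^+)\qquad\begin{array}{ll}
\min\ \tau & \\
\text{s.t.} &
\begin{pmatrix}
2\tau I_p + \mathcal{D}_\ell z + D_\ell & [q^\ell_{11}]^T & \cdots & [q^\ell_{1S}]^T & \cdots & [q^\ell_{n1}]^T & \cdots & [q^\ell_{nS}]^T \\
q^\ell_{11} & t_1 & & & & & & \\
\vdots & & \ddots & & & & & \\
q^\ell_{1S} & & & t_1 & & & & \\
\vdots & & & & \ddots & & & \\
q^\ell_{n1} & & & & & t_n & & \\
\vdots & & & & & & \ddots & \\
q^\ell_{nS} & & & & & & & t_n
\end{pmatrix}\succeq 0,\quad \ell=1,\dots,K; \\
& \underline{\rho}_i \le \mathrm{Tr}(t_i) \le \overline{\rho}_i,\quad i=1,\dots,n; \\
& \sum_{i=1}^n \mathrm{Tr}(t_i) \le w; \\
& \sum_{i=1}^n\sum_{s=1}^S b_{is}\, q^\ell_{is} = \mathcal{E}_\ell z + E_\ell,\quad \ell=1,\dots,K; \\
& z \ge 0,
\end{array}$$
the design variables being symmetric $d\times d$ matrices $t_i$, $i=1,\dots,n$, $d\times p$ matrices $q^\ell_{is}$, $\ell=1,\dots,K$, $i=1,\dots,n$, $s=1,\dots,S$, real $\tau$ and $z\in\mathbb{R}^N$.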

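E.g., in the case of single-load obstacle-free Truss design, $(Pr^+)$ becomes
$$\begin{array}{ll}
\min\ \tau & \\
\text{s.t.} &
\begin{pmatrix}
2\tau & q^1 & \cdots & q^n \\
q^1 & t_1 & & \\
\vdots & & \ddots & \\
q^n & & & t_n
\end{pmatrix}\succeq 0; \\
& \sum_{i=1}^n q^i b_i = f; \\
& t_i \ge 0,\quad i=1,\dots,n; \\
& \sum_{i=1}^n t_i \le w.
\end{array}$$
Since by the Lemma on Schur Complement, for $t_i > 0$ one has
$$\begin{pmatrix}
2\tau & q^1 & \cdots & q^n \\
q^1 & t_1 & & \\
\vdots & & \ddots & \\
q^n & & & t_n
\end{pmatrix}\succeq 0
\quad\Longleftrightarrow\quad
\tau \ge \frac12\sum_{i=1}^n \frac{(q^i)^2}{t_i},$$
the problem can be rewritten as
$$\min\Big\{\, \frac12\sum_{i=1}^n \frac{(q^i)^2}{t_i} \ \Big|\ t_i\ge 0,\ \sum_{i=1}^n t_i \le w,\ \sum_{i=1}^n q^i b_i = f \,\Big\}.$$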

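We can analytically carry out partial minimization in $t_i$:
$$\min\Big\{\, \sum_{i=1}^n \frac{(q^i)^2}{t_i} \ \Big|\ t_i\ge 0,\ \sum_i t_i \le w \,\Big\}
\quad\Rightarrow\quad
t_i = \frac{w\,|q^i|}{\sum_\ell |q^\ell|},$$
thus coming to the problem in $q$-variables:
$$\min\Big\{\, \frac{1}{2w}\Big(\sum_i |q^i|\Big)^2 \ \Big|\ \sum_i q^i b_i = f \,\Big\}$$
$$\Updownarrow$$
$$\underbrace{\min\Big\{\, \sum_i |q^i| \ \Big|\ \sum_i q^i b_i = f \,\Big\}}_{\text{LP!}}$$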
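The partial minimization over the bar volumes t_i can be sanity-checked numerically. The sketch below uses made-up values for the forces q^i and the budget w (both are assumptions for illustration). It evaluates the closed-form allocation t_i = w|q^i| / sum_l |q^l|, compares the resulting objective with (sum_i |q^i|)^2 / w, and confirms that randomly sampled allocations using the full budget never do better.

```python
import random

def closed_form_t(q, w):
    """Closed-form minimizer: t_i = w * |q_i| / sum_l |q_l|."""
    total = sum(abs(v) for v in q)
    return [w * abs(v) / total for v in q]

def objective(q, t):
    """sum_i q_i^2 / t_i for strictly positive t."""
    return sum(v * v / ti for v, ti in zip(q, t))

q = [3.0, -1.0, 2.0]   # illustrative bar forces (assumed data), all nonzero
w = 2.0                # illustrative material budget (assumed)

t_star = closed_form_t(q, w)
best = objective(q, t_star)
predicted = sum(abs(v) for v in q) ** 2 / w   # (sum_i |q_i|)^2 / w

# Compare against random positive allocations that use the whole budget w.
rng = random.Random(1)
smallest_gap = float("inf")
for _ in range(1000):
    raw = [rng.uniform(0.01, 1.0) for _ in q]
    t = [w * r / sum(raw) for r in raw]
    smallest_gap = min(smallest_gap, objective(q, t) - best)

print(best, predicted, smallest_gap)
```

The inequality behind this is Cauchy-Schwarz: (sum_i |q^i|)^2 <= (sum_i (q^i)^2 / t_i)(sum_i t_i), with equality when t_i is proportional to |q^i|.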

OPTIMAL CONTROL EXAMPLE

$t_0$ = initial time, $\quad t_1 > t_0$ final time
$x(t)\in\mathbb{R}^n$: state vector at time $t$
$u(t)\in\mathbb{R}$: control function, $u\in L_2[t_0,t_1]$
$A(t)$: $n\times n$ matrix, $\quad b(t)\in\mathbb{R}^n$, $\quad c\in\mathbb{R}^n$

PRIMAL PROBLEM
$$\min\ J = \frac12\int_{t_0}^{t_1} u^2(t)\,dt$$
subject to
$$(1)\qquad x(t_1) \ge c$$
$$(2)\qquad \dot x(t) = A(t)x(t) + b(t)u(t).$$

Let $\{X^1(t),\dots,X^n(t)\}$ be a set of linearly independent solutions of the homogeneous system
$$\dot x(t) = A(t)x(t),$$
and set $X(t) = [X^1(t),\dots,X^n(t)]$, the $n\times n$ fundamental matrix.

Then the solution of (2) is given by
$$x(t) = X(t)X^{-1}(t_0)\,x(t_0) + X(t)\int_{t_0}^{t} X^{-1}(s)\,b(s)\,u(s)\,ds.$$

Define
$$\Phi(t_1,t_2) = X(t_1)X^{-1}(t_2).$$
Then, with $x_0 = x(t_0)$, the primal problem can be written
$$\min\ \frac12\int_{t_0}^{t_1} u^2(t)\,dt$$
subject to
$$\underbrace{\int_{t_0}^{t_1}\Phi(t_1,t)\,b(t)\,u(t)\,dt}_{Ku} \ \ge\ \underbrace{c - \Phi(t_1,t_0)\,x_0}_{d}$$
COMPUTING THE LAGRANGIAN DUAL

Dual objective function
$$g(y) = \min_{u\in L_2[t_0,t_1]}\Big\{ \int_{t_0}^{t_1}\frac12 u^2(t)\,dt + y^T(d - Ku) \Big\}
= y^T d + \min_{u(\cdot)}\int_{t_0}^{t_1}\Big[\frac12 u^2(t) - y^T\Phi(t_1,t)\,b(t)\,u(t)\Big]dt.$$
Pointwise minimization inside the integral gives
$$u^*(t) = y^T\Phi(t_1,t)\,b(t),$$
hence
$$g(y) = y^T d \underbrace{{}-\frac12\int_{t_0}^{t_1} y^T\Phi(t_1,t)\,b(t)\,b(t)^T\Phi(t_1,t)^T y\,dt}_{y^T Q y}$$
where $Q$ is the N.S.D. $n\times n$ matrix
$$Q = -\frac12\int_{t_0}^{t_1}\Phi(t_1,t)\,b(t)\,b(t)^T\Phi(t_1,t)^T\,dt.$$
The Dual Problem is
$$\max_{y\ge 0}\ y^T d + y^T Q y,$$
a (finite-dimensional) quadratic programming problem, with $Q$ as above and $d = c - \Phi(t_1,t_0)\,x_0$.
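To make the reduction concrete, here is a minimal sketch for a scalar system (n = 1), with made-up data: dynamics x' = a x + u on [0, 1], b(t) = 1, and the terminal constraint x(1) >= c. All numerical values, the midpoint quadrature, and the variable names are assumptions for illustration. The one-dimensional dual is solved in closed form, the control is recovered from the pointwise minimization formula, and the primal cost and terminal state are then evaluated.

```python
import math

# Scalar instance (assumed for illustration): x' = a*x + u on [0, 1],
# x(0) = x0, terminal constraint x(1) >= c.
a, x0, c = 0.5, 0.0, 1.0
t0, t1, N = 0.0, 1.0, 2000
h = (t1 - t0) / N
mid = [t0 + (j + 0.5) * h for j in range(N)]      # midpoint quadrature nodes

def phi(t):
    """Transition function Phi(t1, t) for the scalar system; b(t) = 1."""
    return math.exp(a * (t1 - t))

d = c - phi(t0) * x0
Q = -0.5 * h * sum(phi(t) ** 2 for t in mid)      # negative 1x1 "matrix"

# Dual: maximize y*d + Q*y^2 over y >= 0 (a concave parabola).
y_star = max(0.0, d / (-2.0 * Q))
dual_value = y_star * d + Q * y_star ** 2

# Recover the control by pointwise minimization: u*(t) = y* Phi(t1, t) b(t).
u = [y_star * phi(t) for t in mid]
primal_value = 0.5 * h * sum(v * v for v in u)
x_final = phi(t0) * x0 + h * sum(phi(t) * v for t, v in zip(mid, u))

print(dual_value, primal_value, x_final)
```

In this instance the dual and primal values coincide and the terminal constraint is active, so the infinite-dimensional control problem has been solved through a one-variable quadratic program.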

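EXAMPLE: A NONCONVEX QUADRATIC PROBLEM

$$(A)\qquad \min_{z\in\mathbb{R}^n}\Big\{\, \frac12 z^T Q z + c^T z \ :\ z^T z \le 1 \,\Big\}$$

$Q$ = $n\times n$ symmetric indefinite
eigenvalues: $\lambda_1\le\lambda_2\le\dots\le\lambda_n$ $\ (\lambda_1 < 0)$
orthonormal e-vectors: $u_1, u_2, \dots, u_n$
$\Lambda = \mathrm{diag}(\lambda_1,\dots,\lambda_n)$; $\quad P = (u_1,\dots,u_n)^T$

With the change of variables $x = Pz$, and using $Q = P^T\Lambda P$, $P^TP = I$, problem (A) is converted to:
$$(P)\qquad \min_{x\in\mathbb{R}^n}\Big\{\, \frac12\sum_i \lambda_i x_i^2 + \bar c^T x \ :\ x^T x \le 1 \,\Big\},\qquad \bar c = Pc.$$

Dual of (P) via Lagrangian duality:
$$(D)\qquad \max_{\mu\ge 0}\Big\{\, -\frac12\Big[\sum_i \frac{\bar c_i^2}{\lambda_i + \mu} + \mu\Big] \ :\ \lambda_i + \mu \ge 0,\ \forall i \,\Big\}$$
a concave program.

Computing the dual of the dual.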

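Rewriting (D) as
$$(D)\qquad \max_{y\in\mathbb{R}^n,\ \mu\in\mathbb{R}}\ -\frac12\Big[\sum_i \frac{\bar c_i^2}{y_i} + \mu\Big]$$
subject to
$$-y_i + \lambda_i + \mu = 0 \quad \Big(\text{multiplier } \frac{u_i}{2}\Big),\qquad y_i \ge 0,\quad \mu \ge 0.$$
The (Lagrangian) dual of (D) is
$$(DD)\qquad \min\ \frac12\sum_i \lambda_i u_i - \sum_i |\bar c_i|\sqrt{u_i}\qquad\text{subject to}\qquad \sum_i u_i \le 1,\quad u_i \ge 0.$$

(DD) is, of course, a convex program.

What is the relation between (P) and (DD)?
$$(P)\qquad \min_x\Big\{\, \frac12\sum_i \lambda_i x_i^2 + \sum_i \bar c_i x_i \ :\ \sum_i x_i^2 \le 1 \,\Big\}$$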

Theorem 1. The nonconvex program (P) is equivalent to the convex program (DD):
$\{u_i^* : i=1,\dots,n\}$ solves (DD) iff
$\{x_i^* = -(\operatorname{sign}\bar c_i)\sqrt{u_i^*},\ i=1,\dots,n\}$ solves (P).

Proof:
$$\min(P) \ge \max(D) = \min(DD),$$
but $x_i^* = -(\operatorname{sign}\bar c_i)\sqrt{u_i^*}$ is feasible to (P), and the objective function of (P) evaluated at $x^*$ is equal to $\min(DD)$.
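The equivalence between the nonconvex problem (P) and the convex problem (DD) can be checked on a small instance. The sketch below assumes n = 2 with made-up eigenvalues and linear term; the grid resolutions and function names are also assumptions. It solves (DD) by grid search, builds the candidate x from u through x_i = -sign(c-bar_i) sqrt(u_i), and compares against a brute-force search of (P) over the unit disk. Grid search is used only to keep the example dependency-free; it is not an efficient method.

```python
import math

# Illustrative data (assumed): eigenvalues with lam[0] < 0, and the rotated
# linear term c-bar, for n = 2.
lam = [-1.0, 2.0]
cbar = [1.0, 0.5]

def obj_P(x):
    """Objective of (P): 1/2 sum lam_i x_i^2 + sum cbar_i x_i."""
    return sum(0.5 * l * v * v + ci * v for l, ci, v in zip(lam, cbar, x))

def obj_DD(u):
    """Objective of (DD): 1/2 sum lam_i u_i - sum |cbar_i| sqrt(u_i)."""
    return sum(0.5 * l * v - abs(ci) * math.sqrt(v) for l, ci, v in zip(lam, cbar, u))

def sign(v):
    return (v > 0) - (v < 0)

# Grid search for (DD) over {u >= 0, u_1 + u_2 <= 1}.
n_grid = 400
u_grid = [(i / n_grid, j / n_grid)
          for i in range(n_grid + 1) for j in range(n_grid + 1 - i)]
u_star = min(u_grid, key=obj_DD)

# Candidate solution of (P) built from u*, as in Theorem 1.
x_star = [-sign(ci) * math.sqrt(v) for ci, v in zip(cbar, u_star)]

# Brute-force (P) on a polar grid of the unit disk for comparison.
brute = min(obj_P((r / 100 * math.cos(k * math.pi / 900),
                   r / 100 * math.sin(k * math.pi / 900)))
            for r in range(101) for k in range(1800))

print(obj_DD(u_star), obj_P(x_star), brute)
```

The first two printed values agree to rounding error, because substituting x_i^2 = u_i and the sign choice turns the objective of (P) into that of (DD); the brute-force value agrees only up to the grid resolution.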

EXAMPLE: STATISTICAL INFORMATION THEORY

$X$: random variable (nondegenerate)
$B$: support of $X$
$f_X$: density of $X$
$D$: class of density functions with support $B$ which are absolutely continuous w.r.t. a nonnegative measure $dt$
$A_i(t)$: summable functions

$$(P)\qquad \inf_{f\in D}\ I(f, f_X) = \int_B f(t)\log\frac{f(t)}{f_X(t)}\,dt
\qquad\text{s.t.}\qquad \int_B f(t)A_i(t)\,dt = \theta_i,\quad i=1,\dots,m$$
An $\infty$-dimensional problem.

Fundamental Problem in Statistical Information Theory. Applications in Traffic Engineering, Accounting, Marketing, Signal Processing, Statistics, ...

$D \subset L_p(\Omega,\mathcal{F},P)$, the linear space of measurable real-valued functions $f:\Omega\to\mathbb{R}$ with $\|f\|_p < \infty$, where
$$\|f\|_p = \Big(\int_\Omega |f(\omega)|^p\,dP(\omega)\Big)^{1/p},\qquad (1 < p < \infty).$$

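The Dual Problem

$$(D)\qquad \sup_{\lambda\in\mathbb{R}^m}\Big\{\, \sum_i \theta_i\lambda_i - \log\int_B f_X(t)\,e^{\sum_i A_i(t)\lambda_i}\,dt \,\Big\}$$

Unconstrained! Finite Dimensional!

Duality Theorem
(i) $\inf(P) = \sup(D)$
(ii) $\inf(P) = \min(P)$
(iii) $\sup(D) = \max(D)$ iff (P) is superconsistent
(iv) $\sup(D) < \infty \iff$ (P) is feasible
(v) If $\lambda^*$ solves (D), then
$$f^*(t) = \frac{f_X(t)\,e^{\sum_i A_i(t)\lambda_i^*}}{\int_B f_X(t)\,e^{\sum_i A_i(t)\lambda_i^*}\,dt}$$
solves (P).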

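* If $f_X(t)$ is not a density function, but just a positive summable function, then the dual problem is:
$$\sup_{\lambda\in\mathbb{R}^m}\Big\{\, \sum_i \theta_i\lambda_i - \int_B f_X(t)\,e^{\sum_i A_i(t)\lambda_i - 1}\,dt \,\Big\}$$

Examples

(1) UNIFORM
$$\inf\Big\{\, \int_a^b f(t)\log f(t)\,dt \ :\ \int_a^b f(t)\,dt = 1 \,\Big\},\qquad
\text{sol. } f^*(t) = \begin{cases} 1/(b-a) & a\le t\le b \\ 0 & \text{otherwise}\end{cases}$$

(2) NORMAL
$$\inf_{f\in D}\Big\{\, \int f(t)\log f(t)\,dt \ :\ \int t^2 f(t)\,dt = \sigma^2 \,\Big\},\qquad
\text{sol. } f^*(t) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-t^2/2\sigma^2}$$

(3) GAMMA
$$\inf_{f\in D}\Big\{\, \int_0^\infty f(t)\log\frac{f(t)}{t^{K-1}}\,dt \ :\ \int_0^\infty t f(t)\,dt = K/\beta \,\Big\},\qquad
\text{sol. } f^*(t) = \frac{\beta^K t^{K-1}e^{-\beta t}}{\Gamma(K)}$$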
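Example (1) can be reproduced through the dual. Taking f_X = 1 on [a, b] and the single constraint that f integrates to 1, the Lagrangian dual works out to the one-variable concave function h(lambda) = lambda - (b - a) exp(lambda - 1), and the primal solution is recovered as f = exp(lambda - 1) on [a, b]. The sketch below is only an illustration: the interval endpoints, the choice of Newton's method, and the variable names are assumptions.

```python
import math

a, b = 2.0, 5.0   # illustrative support [a, b] (assumed values)

def dual(lmbda):
    """h(lambda) = lambda - integral_a^b exp(lambda - 1) dt, for f_X = 1 and one constraint."""
    return lmbda - (b - a) * math.exp(lmbda - 1.0)

# Newton's method on h'(lambda) = 1 - (b - a) * exp(lambda - 1) = 0.
lmbda = 0.0
for _ in range(50):
    grad = 1.0 - (b - a) * math.exp(lmbda - 1.0)
    hess = -(b - a) * math.exp(lmbda - 1.0)
    lmbda -= grad / hess

f_star = math.exp(lmbda - 1.0)                 # constant density value on [a, b]
primal = (b - a) * f_star * math.log(f_star)   # integral of f log f over [a, b]

print(lmbda, f_star, dual(lmbda), primal)
```

The recovered density value equals 1/(b - a), the uniform density, and the dual optimal value matches the primal objective -log(b - a).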

LIST OF P.D.F.s DERIVED FROM THE ABOVE DUALITY

discrete r.v.           continuous r.v.
uniform                 Normal
geometric               Laplace
binomial                Generalized Cauchy
Poisson                 exponential
log Series              gamma
Truncated Geometric     beta
...                     log normal
                        ...

multivariate r.v.
Multi Normal
Multi log Normal
Dirichlet
Multivariate Beta of 2nd kind
(Generalized) multivariate logistic
...
