
DISCRETE CONVEX ANALYSIS

SIAM Monographs on
Discrete Mathematics and Applications

The series includes advanced monographs reporting on the most recent theoretical, computational, or
applied developments in the field; introductory volumes aimed at mathematicians and other mathematically
motivated readers interested in understanding certain areas of pure or applied combinatorics; and graduate
textbooks. The volumes are devoted to various areas of discrete mathematics and its applications.
Mathematicians, computer scientists, operations researchers, computationally oriented natural and social
scientists, engineers, medical researchers, and other practitioners will find the volumes of interest.
Editor-in-Chief
Peter L. Hammer, RUTCOR, Rutgers, The State University of New Jersey

Editorial Board
M. Aigner, Freie Universität Berlin, Germany
N. Alon, Tel Aviv University, Israel
E. Balas, Carnegie Mellon University, USA
J.-C. Bermond, Université de Nice-Sophia Antipolis, France
J. Berstel, Université Marne-la-Vallée, France
N. L. Biggs, The London School of Economics, United Kingdom
B. Bollobás, University of Memphis, USA
R. E. Burkard, Technische Universität Graz, Austria
D. G. Corneil, University of Toronto, Canada
I. Gessel, Brandeis University, USA
F. Glover, University of Colorado, USA
M. C. Golumbic, Bar-Ilan University, Israel
R. L. Graham, AT&T Research, USA
A. J. Hoffman, IBM T. J. Watson Research Center, USA
T. Ibaraki, Kyoto University, Japan
H. Imai, University of Tokyo, Japan
M. Karoński, Adam Mickiewicz University, Poland, and Emory University, USA
R. M. Karp, University of Washington, USA
V. Klee, University of Washington, USA
K. M. Koh, National University of Singapore, Republic of Singapore
B. Korte, Universität Bonn, Germany
A. V. Kostochka, Siberian Branch of the Russian Academy of Sciences, Russia
F. T. Leighton, Massachusetts Institute of Technology, USA
T. Lengauer, Gesellschaft für Mathematik und Datenverarbeitung mbH, Germany
S. Martello, DEIS University of Bologna, Italy
M. Minoux, Université Pierre et Marie Curie, France
R. Möhring, Technische Universität Berlin, Germany
C. L. Monma, Bellcore, USA
J. Nešetřil, Charles University, Czech Republic
W. R. Pulleyblank, IBM T. J. Watson Research Center, USA
A. Recski, Technical University of Budapest, Hungary
C. C. Ribeiro, Catholic University of Rio de Janeiro, Brazil
H. Sachs, Technische Universität Ilmenau, Germany
A. Schrijver, CWI, The Netherlands
R. Shamir, Tel Aviv University, Israel
N. J. A. Sloane, AT&T Research, USA
W. T. Trotter, Arizona State University, USA
D. J. A. Welsh, University of Oxford, United Kingdom
D. de Werra, École Polytechnique Fédérale de Lausanne, Switzerland
P. M. Winkler, Bell Labs, Lucent Technologies, USA
Yue Minyi, Academia Sinica, People's Republic of China

Series Volumes
Murota, K., Discrete Convex Analysis
Toth, P. and Vigo, D., The Vehicle Routing Problem
Anthony, M., Discrete Mathematics of Neural Networks: Selected Topics
Creignou, N., Khanna, S., and Sudan, M., Complexity Classifications of Boolean Constraint Satisfaction Problems
Hubert, L., Arabie, P., and Meulman, J., Combinatorial Data Analysis: Optimization by Dynamic Programming
Peleg, D., Distributed Computing: A Locality-Sensitive Approach
Wegener, I., Branching Programs and Binary Decision Diagrams: Theory and Applications
Brandstädt, A., Le, V. B., and Spinrad, J. P., Graph Classes: A Survey
McKee, T. A. and McMorris, F. R., Topics in Intersection Graph Theory
Grilli di Cortona, P., Manzi, C., Pennisi, A., Ricca, F., and Simeone, B., Evaluation and Optimization of Electoral Systems
DISCRETE CONVEX ANALYSIS

KAZUO MUROTA
University of Tokyo; PRESTO, JST
Tokyo, Japan

Society for Industrial and Applied Mathematics


Philadelphia
Copyright © 2003 by the Society for Industrial and Applied Mathematics.

10 9 8 7 6 5 4 3 2 1

All rights reserved. Printed in the United States of America. No part of this book may
be reproduced, stored, or transmitted in any manner without the written permission of
the publisher. For information, write to the Society for Industrial and Applied
Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688.

Library of Congress Cataloging-in-Publication Data


Murota, Kazuo, 1955-
Discrete convex analysis / Kazuo Murota.
p. cm. — (SIAM monographs on discrete mathematics and applications)
Includes bibliographical references and index.
ISBN 0-89871-540-7
1. Convex functions. 2. Convex sets. 3. Mathematical analysis. I. Title. II. Series.

QA331.5.M87 2003
515'.8—dc21
2003042468

SIAM is a registered trademark.
Contents

List of Figures xi

Notation xiii

Preface xxi

1 Introduction to the Central Concepts 1


1.1 Aim and History of Discrete Convex Analysis 1
1.1.1 Aim 1
1.1.2 History 5
1.2 Useful Properties of Convex Functions 9
1.3 Submodular Functions and Base Polyhedra 15
1.3.1 Submodular Functions 16
1.3.2 Base Polyhedra 18
1.4 Discrete Convex Functions 21
1.4.1 L-Convex Functions 21
1.4.2 M-Convex Functions 25
1.4.3 Conjugacy 29
1.4.4 Duality 32
1.4.5 Classes of Discrete Convex Functions 36
Bibliographical Notes 36

2 Convex Functions with Combinatorial Structures 39


2.1 Quadratic Functions 39
2.1.1 Convex Quadratic Functions 39
2.1.2 Symmetric M-Matrices 41
2.1.3 Combinatorial Property of Conjugate Functions . . 47
2.1.4 General Quadratic L-/M-Convex Functions 51
2.2 Nonlinear Networks 52
2.2.1 Real-Valued Flows 52
2.2.2 Integer-Valued Flows 56
2.2.3 Technical Supplements 58
2.3 Substitutes and Complements in Network Flows 61
2.3.1 Convexity and Submodularity 61


2.3.2 Technical Supplements 63


2.4 Matroids 68
2.4.1 From Matrices to Matroids 68
2.4.2 From Polynomial Matrices to Valuated Matroids . . 71
Bibliographical Notes 74

3 Convex Analysis, Linear Programming, and Integrality 77


3.1 Convex Analysis 77
3.2 Linear Programming 86
3.3 Integrality for a Pair of Integral Polyhedra 90
3.4 Integrally Convex Functions 92
Bibliographical Notes 99

4 M-Convex Sets and Submodular Set Functions 101


4.1 Definition 101
4.2 Exchange Axioms 102
4.3 Submodular Functions and Base Polyhedra 103
4.4 Polyhedral Description of M-Convex Sets 108
4.5 Submodular Functions as Discrete Convex Functions 111
4.6 M-Convex Sets as Discrete Convex Sets 114
4.7 M♮-Convex Sets 116
4.8 M-Convex Polyhedra 118
Bibliographical Notes 119

5 L-Convex Sets and Distance Functions 121


5.1 Definition 121
5.2 Distance Functions and Associated Polyhedra 122
5.3 Polyhedral Description of L-Convex Sets 123
5.4 L-Convex Sets as Discrete Convex Sets 125
5.5 L♮-Convex Sets 128
5.6 L-Convex Polyhedra 131
Bibliographical Notes 131

6 M-Convex Functions 133


6.1 M-Convex Functions and M♮-Convex Functions 133
6.2 Local Exchange Axiom 135
6.3 Examples 138
6.4 Basic Operations 142
6.5 Supermodularity 145
6.6 Descent Directions 146
6.7 Minimizers 148
6.8 Gross Substitutes Property 152
6.9 Proximity Theorem 156
6.10 Convex Extension 158
6.11 Polyhedral M-Convex Functions 160
6.12 Positively Homogeneous M-Convex Functions 164

6.13 Directional Derivatives and Subgradients 166


6.14 Quasi M-Convex Functions 168
Bibliographical Notes 175

7 L-Convex Functions 177


7.1 L-Convex Functions and L♮-Convex Functions 177
7.2 Discrete Midpoint Convexity 180
7.3 Examples 181
7.4 Basic Operations 183
7.5 Minimizers 185
7.6 Proximity Theorem 186
7.7 Convex Extension 187
7.8 Polyhedral L-Convex Functions 189
7.9 Positively Homogeneous L-Convex Functions 193
7.10 Directional Derivatives and Subgradients 196
7.11 Quasi L-Convex Functions 198
Bibliographical Notes 202

8 Conjugacy and Duality 205


8.1 Conjugacy 205
8.1.1 Submodularity under Conjugacy 206
8.1.2 Polyhedral M-/L-Convex Functions 208
8.1.3 Integral M-/L-Convex Functions 212
8.2 Duality 216
8.2.1 Separation Theorems 216
8.2.2 Fenchel-Type Duality Theorem 221
8.2.3 Implications 224
8.3 M2-Convex Functions and L2-Convex Functions 226
8.3.1 M2-Convex Functions 226
8.3.2 L2-Convex Functions 229
8.3.3 Relationship 234
8.4 Lagrange Duality for Optimization 234
8.4.1 Outline 234
8.4.2 General Duality Framework 235
8.4.3 Lagrangian Function Based on M-Convexity . . . . 238
8.4.4 Symmetry in Duality 241
Bibliographical Notes 244

9 Network Flows 245


9.1 Minimum Cost Flow and Fenchel Duality 245
9.1.1 Minimum Cost Flow Problem 245
9.1.2 Feasibility 247
9.1.3 Optimality Criteria 248
9.1.4 Relationship to Fenchel Duality 253
9.2 M-Convex Submodular Flow Problem 255
9.3 Feasibility of Submodular Flow Problem 258

9.4 Optimality Criterion by Potentials 260


9.5 Optimality Criterion by Negative Cycles 263
9.5.1 Negative-Cycle Criterion 263
9.5.2 Cycle Cancellation 265
9.6 Network Duality 268
9.6.1 Transformation by Networks 269
9.6.2 Technical Supplements 273
Bibliographical Notes 278

10 Algorithms 281
10.1 Minimization of M-Convex Functions 281
10.1.1 Steepest Descent Algorithm 281
10.1.2 Steepest Descent Scaling Algorithm 283
10.1.3 Domain Reduction Algorithm 284
10.1.4 Domain Reduction Scaling Algorithm 286
10.2 Minimization of Submodular Set Functions 288
10.2.1 Basic Framework 288
10.2.2 Schrijver's Algorithm 293
10.2.3 Iwata-Fleischer-Fujishige's Algorithm 296
10.3 Minimization of L-Convex Functions 305
10.3.1 Steepest Descent Algorithm 305
10.3.2 Steepest Descent Scaling Algorithm 308
10.3.3 Reduction to Submodular Function Minimization . . 308
10.4 Algorithms for M-Convex Submodular Flows 308
10.4.1 Two-Stage Algorithm 309
10.4.2 Successive Shortest Path Algorithm 311
10.4.3 Cycle-Canceling Algorithm 312
10.4.4 Primal-Dual Algorithm 313
10.4.5 Conjugate Scaling Algorithm 318
Bibliographical Notes 321

11 Application to Mathematical Economics 323


11.1 Economic Model with Indivisible Commodities 323
11.2 Difficulty with Indivisibility 327
11.3 M♮-Concave Utility Functions 330
11.4 Existence of Equilibria 334
11.4.1 General Case 334
11.4.2 M♮-Convex Case 337
11.5 Computation of Equilibria 340
Bibliographical Notes 344

12 Application to Systems Analysis by Mixed Matrices 347


12.1 Two Kinds of Numbers 347
12.2 Mixed Matrices and Mixed Polynomial Matrices 353
12.3 Rank of Mixed Matrices 356
12.4 Degree of Determinant of Mixed Polynomial Matrices 359

Bibliographical Notes 361

Bibliography 363

Index 379
List of Figures

1.1 Convex set and nonconvex set 2


1.2 Convex function 2
1.3 Conjugate function (Legendre-Fenchel transform) 11
1.4 Separation for convex and concave functions 12
1.5 Discrete separation 14
1.6 Convex and nonconvex discrete functions 14
1.7 Exchange property (B-EXC[Z]) 19
1.8 Definition of L-convexity 22
1.9 Discrete midpoint convexity 23
1.10 Property of a convex function 26
1.11 Exchange property in the definition of M-convexity 27
1.12 Conjugacy in discrete convexity 31
1.13 Duality theorems (f: M♮-convex function, h: M♮-concave function) 35
1.14 Separation for convex sets 35
1.15 Classes of discrete convex functions (M♮-convex ∩ L♮-convex = separable convex) 37

2.1 Electrical network 42


2.2 Multiterminal network 53
2.3 Characteristic curve 54
2.4 Conjugate discrete convex functions f_a(ξ) and g_a(η) 57
2.5 Discrete characteristic curve Γ_a 57

3.1 Conjugate function (Legendre-Fenchel transform) 81


3.2 Separation for convex sets 83
3.3 Separation for convex and concave functions 84
3.4 Nonconvexity in Minkowski sum 91
3.5 Integral neighborhood N(x) of x (○: point of N(x)) 94
3.6 Concept of integrally convex sets 97

4.1 M♮-convex sets 117

5.1 L♮-convex sets 129


5.2 Discrete midpoint convexity 129


6.1 Scaling f_α for α = 2 145


6.2 Minimum spanning tree problem 149
6.3 Quasi-convex function 168

7.1 Discrete midpoint convexity 181

8.1 Conjugacy in discrete convex functions 215


8.2 Duality theorems (f: M♮-convex function, h: M♮-concave function) 224

9.1 Characteristic curve (kilter diagram) for linear cost 251


9.2 Minimum cost flow problem for Fenchel duality 254
9.3 Submodular flow problem for M-convex intersection problem . . . 265
9.4 Transformation by a network 269
9.5 Bipartite graphs for aggregation and convolution operations . . . . 272
9.6 Rooted directed tree for a laminar family 273

10.1 Structure of G and G at v* 316


10.2 Conjugate scaling f^α and scaling g_α for α = 2 320

11.1 Consumer's behavior 325


11.2 Exchange economy with no equilibrium for x° = (1,1) 328
11.3 Minkowski sum Di(p) + D2(p) 329
11.4 Aggregate cost function Ψ and its convex closure Ψ̄ for an exchange
economy with no equilibrium 336
11.5 Graph for computing a competitive equilibrium 341

12.1 Electrical network with mutual couplings 348


12.2 Hypothetical ethylene dichloride production system 350
12.3 Jacobian matrix in the chemical process simulation 351
12.4 Mechanical system 352
12.5 Accurate numbers 353
Notation
              function    set                positively homogeneous    combinatorial
                                             function                  function
M-convex      f ∈ M       B = B(ρ) ∈ M₀      γ̂ ∈ ₀M                    γ ∈ T
L-convex      g ∈ L       D = D(γ) ∈ L₀      ρ̂ ∈ ₀L                    ρ ∈ S

0: = (0, 0, ..., 0)
1: = (1, 1, ..., 1)
2^V: set of all subsets of set V (i.e., power set of V)
∀: "for all," "for any," or "for each"
∃: "there exists" or "for some"
ᵀ: transpose of a vector or a matrix
+: sum, Minkowski sum (3.21), (3.52)
□: infimal convolution over Rⁿ (3.20)
□_Z: infimal convolution over Zⁿ (i.e., integer infimal convolution) (6.43)
∨: componentwise maximum (1.28)
∨: "join" operation in a lattice Note 10.15
∧: componentwise minimum (1.28)
∧: "meet" operation in a lattice Note 10.15
| · |: cardinality (number of elements) of a set
[·, ·]: interval (of reals or integers) (3.1), (3.54)
[·, ·]_R: interval of real numbers (3.1)
[·, ·]_Z: interval of integers (3.54)
⟨·, ·⟩: inner product, pairing (1.7), (3.18)
‖ · ‖₁: ℓ₁-norm of a vector (4.2)
‖ · ‖∞: ℓ∞-norm of a vector (3.60)
 ̄ (overline): convex hull of a set, convex closure of a function (3.56)
⌈ · ⌉: rounding up to the nearest integer section 3.4
⌊ · ⌋: rounding down to the nearest integer section 3.4
∂_R f(x): subdifferential of (convex) function f at x (3.23), (6.86)
∂_Z f(x): integral subdifferential of (convex) function f at x (6.88)
∂°_R h(x): subdifferential of (concave) function h at x (8.19)
∂°_Z h(x): integral subdifferential of (concave) function h at x (8.19)
∂⁺a: initial vertex of arc a section 2.2
∂⁻a: terminal vertex of arc a section 2.2
∂ξ: boundary of flow ξ (2.27)
∂Ξ: set of boundaries of feasible flows section 9.3


∂Ξ*: set of boundaries of optimal flows section 9.4


ξ: flow section 2.2, section 9.1
ξ: current section 2.2
η: tension section 2.2, section 9.1
η: voltage section 2.2
δ⁺v: set of arcs leaving vertex v section 2.2
δ⁻v: set of arcs entering vertex v section 2.2
δp: coboundary of potential p (2.28), (9.20)
δ_S: indicator function of set S (3.12), (3.51)
δ_S*: support function of set S (3.31)
Δ: Laplacian section 2.1.2
Δf(x; v, u): directional difference of f at x in the direction of χ_v − χ_u (6.2)
Δ⁺X: set of arcs leaving vertex subset X (9.14)
Δ⁻X: set of arcs entering vertex subset X (9.15)
γ: distance function section 5.2
γ̄: shortest path length with respect to distance function γ section 5.2
γ̂: extension of distance function γ (6.82)
γ: cost per unit flow section 9.1.1
γ_p: reduced cost (9.33)
Γ₁: cost function in network flow problem (9.2), (9.38)
Γ₂: cost function in network flow problem (9.42)
Γ₃: cost function in network flow problem (9.7), (9.46)
Γ_a: characteristic curve of arc a (2.31)
κ: cut function (9.16)
μ: supermodular set function section 4.3
Π*: set of optimal potentials section 9.4
ρ: submodular set function (4.9)
ρ: rank function of a matroid section 2.4
ρ̂: Lovász extension (linear extension) of set function ρ (4.6)
χ₀: zero vector section 2.1.3
χ_i: ith unit vector section 2.1.3
χ_X: characteristic vector of subset X (1.14)
ω: valuation of a matroid section 2.4.2
aff S: affine hull of set S section 3.1
argmax f: set of maximizers of function f
argmin f: set of minimizers of function f (3.16)
A[J]: submatrix of matrix A with column indices in J section 2.4
B: M-convex set, M-convex polyhedron section 4.1
(B): simultaneous exchange axiom of matroids section 2.4

B: base family of a matroid section 2.4
B(ρ): base polyhedron defined by submodular set function ρ (4.13)
(B-EXC[R]): exchange axiom of M-convex polyhedra section 4.8
(B-EXC₊[R]): exchange axiom of M-convex polyhedra section 4.8
(B-EXC[Z]): exchange axiom of M-convex sets section 4.1
(B-EXC_w[Z]): exchange axiom of M-convex sets section 4.2
(B-EXC₊[Z]): exchange axiom of M-convex sets section 4.2
(B-EXC₋[Z]): exchange axiom of M-convex sets section 4.2
(B♮-EXC[Z]): exchange axiom of M♮-convex sets section 4.7
c: upper capacity function section 9.2
c: lower capacity function section 9.2
C_l: cost function of producer l section 11.1
C[Z → R]: set of univariate discrete convex functions (3.68)
C[Z → Z]: set of univariate integer-valued discrete convex functions section 3.4
C[R → R]: set of univariate polyhedral convex functions section 3.1
C[Z|R → R]: set of univariate integral polyhedral convex functions section 6.11
C[R → R|Z]: set of univariate dual-integral polyhedral convex functions section 6.11
D: L-convex set, L-convex polyhedron section 5.1, section 5.6
D_h: demand function of consumer h section 11.1
D(γ): L-convex polyhedron defined by distance function γ (5.4)
D(x): family of tight sets for base x (4.22)
deg: degree of a polynomial section 2.4.2
dep(x, u): smallest tight set for base x that contains element u Note 10.11
det: determinant of a matrix section 2.4.1
dom ρ: effective domain of set function ρ (4.3)
dom f: effective domain of function f on Rⁿ or Zⁿ (3.3), (1.25), (1.26)
dom_R f: effective domain of function f on Rⁿ (1.26)
dom_Z f: effective domain of function f on Zⁿ (1.25)
epi f: epigraph of function f (3.14)
f: convex function, M-convex function section 3.1, section 1.4.2
f′(x; ·): directional derivative of function f at x (3.24)
f̄: convex closure of function f (3.56)
f̃: local convex extension of function f (3.61)
f̂(x, y): a lower bound for f(y) − f(x) (6.55)
f*: (convex) conjugate of function f (3.26), (8.11)
f**: biconjugate (f*)* of function f section 3.1
f[a,b]: restriction of function f to interval [a, b] (3.55)

f_U: restriction of function f to subset U (6.40)
f^U: projection of function f to subset U (6.41)
f_U*: aggregation of function f to subset U (6.42)
f_α: scaling of function f (6.47)
f^α: conjugate scaling of function f (10.77)
f[−p](x): = f(x) − ⟨p, x⟩ (3.22), (3.69)
F: a field section 12.2
F(s): field of rational functions in variable s over F section 12.2
g: L-convex function section 1.4.1
G = (V, A): directed graph with vertex set V and arc set A section 2.2, section 9.2
h: concave function, M-concave function section 3.1, section 8.2
h°: (concave) conjugate of function h (3.28), (8.12)
H: set of consumers section 11.1
inf: infimum
k: L-concave function section 8.2
K: set of indivisible commodities section 11.1
K: subfield of field F section 12.2
K(s): field of rational functions in variable s over K section 12.2
L: set of producers section 11.1
L₀[Z]: set of L-convex sets section 5.1
L₀[Z]: set of indicator functions of L-convex sets section 1.4.3
L₀[R]: set of L-convex polyhedra section 5.6
L₀[Z|R]: set of integral L-convex polyhedra section 5.6
L♮₀[Z]: set of L♮-convex sets section 5.5
L♮₀[R]: set of L♮-convex polyhedra section 5.6
L♮₀[Z|R]: set of integral L♮-convex polyhedra section 5.6
L[Z → R]: set of L-convex functions section 7.1
L[Z → Z]: set of integer-valued L-convex functions section 7.1
L[R → R]: set of polyhedral L-convex functions section 7.8
L[Z|R → R]: set of integral polyhedral L-convex functions section 7.8
L[R → R|Z]: set of dual-integral polyhedral L-convex functions section 7.8
L♮[Z → R]: set of L♮-convex functions section 7.1
L♮[Z → Z]: set of integer-valued L♮-convex functions section 7.1
L♮[R → R]: set of polyhedral L♮-convex functions section 7.8
L♮[Z|R → R]: set of integral polyhedral L♮-convex functions section 7.8
L♮[R → R|Z]: set of dual-integral polyhedral L♮-convex functions section 8.1.2

₀L[R → R]: set of positively homogeneous polyhedral L-convex functions section 7.9
₀L[Z|R → R]: set of positively homogeneous integral polyhedral L-convex functions section 7.9
₀L[R → R|Z]: set of positively homogeneous dual-integral polyhedral L-convex functions section 8.1.2
₀L[Z → R]: set of positively homogeneous L-convex functions section 7.9
₀L[Z → Z]: set of positively homogeneous integer-valued L-convex functions section 7.9
L₂[Z → R]: set of L₂-convex functions section 8.3
L₂[Z → Z]: set of integer-valued L₂-convex functions section 8.3
L♮₂[Z → R]: set of L♮₂-convex functions section 8.3
L♮₂[Z → Z]: set of integer-valued L♮₂-convex functions section 8.3
(L♮-APR[Z]): property of L♮-convex functions section 7.2
max: maximum
min: minimum
M₀[Z]: set of M-convex sets section 4.1
M₀[Z]: set of indicator functions of M-convex sets (1.21)
M₀[R]: set of M-convex polyhedra section 4.8
M₀[Z|R]: set of integral M-convex polyhedra section 4.8
M♮₀[Z]: set of M♮-convex sets section 4.7
M♮₀[R]: set of M♮-convex polyhedra section 4.8
M♮₀[Z|R]: set of integral M♮-convex polyhedra section 4.8
M[Z → R]: set of M-convex functions section 6.1
M[Z → Z]: set of integer-valued M-convex functions section 6.1
M[R → R]: set of polyhedral M-convex functions section 6.11
M[Z|R → R]: set of integral polyhedral M-convex functions section 6.11
M[R → R|Z]: set of dual-integral polyhedral M-convex functions section 6.11
M♮[Z → R]: set of M♮-convex functions section 6.1
M♮[Z → Z]: set of integer-valued M♮-convex functions section 6.1
M♮[R → R]: set of polyhedral M♮-convex functions section 6.11
M♮[Z|R → R]: set of integral polyhedral M♮-convex functions section 6.11
M♮[R → R|Z]: set of dual-integral polyhedral M♮-convex functions section 8.1.2
₀M[R → R]: set of positively homogeneous polyhedral M-convex functions section 6.12
₀M[Z|R → R]: set of positively homogeneous integral polyhedral M-convex functions section 6.12
₀M[R → R|Z]: set of positively homogeneous dual-integral polyhedral M-convex functions section 8.1.2
₀M[Z → R]: set of positively homogeneous M-convex functions section 6.12

₀M[Z → Z]: set of positively homogeneous integer-valued M-convex functions section 6.12
M₂[Z → R]: set of M₂-convex functions section 8.3
M₂[Z → Z]: set of integer-valued M₂-convex functions section 8.3
M♮₂[Z → R]: set of M♮₂-convex functions section 8.3
M♮₂[Z → Z]: set of integer-valued M♮₂-convex functions section 8.3
(M-EXC[Z]): exchange axiom of M-convex functions section 1.4.2, section 6.1
(M-EXC′[Z]): exchange axiom of M-convex functions section 1.4.2, section 6.1
(M-EXC_loc[Z]): local exchange axiom of M-convex functions section 6.2
(M-EXC_w[Z]): weak exchange axiom of M-convex functions section 6.2
(M-EXC[R]): exchange axiom of polyhedral M-convex functions section 1.4.2, section 6.11
(M-EXC′[R]): exchange axiom of polyhedral M-convex functions section 6.11
(M♮-EXC[Z]): exchange axiom of M♮-convex functions section 1.4.2, section 6.1
(M♮-EXC[R]): exchange axiom of polyhedral M♮-convex functions section 1.4.2, section 6.11
(M♮-EXC′[R]): exchange axiom of polyhedral M♮-convex functions section 6.11
(M♮-EXC₊[R]): exchange axiom of polyhedral M♮-convex functions section 2.1.3
(M♮-EXC_d[R]): exchange axiom of polyhedral M♮-convex functions section 2.1.3
(M♮-EXC″[R]): exchange axiom of polyhedral M♮-convex functions section 2.1.3
(M-GS[Z]): gross substitutes property of M-convex functions section 6.8
(M♮-GS[Z]): gross substitutes property of M♮-convex functions section 6.8
(M♮-SWGS[Z]): stepwise gross substitutes property of M♮-convex functions section 6.8
(M-SI[Z]): descent property of M-convex functions section 6.6
(M♮-SI[Z]): descent property of M♮-convex functions section 6.6
(−M♮-EXC[Z]): exchange axiom of M♮-concave functions section 11.3
(−M♮-GS[Z]): gross substitutes property of M♮-concave functions section 11.3
(−M♮-SWGS[Z]): stepwise gross substitutes property of M♮-concave functions section 11.3
(−M♮-SI[Z]): ascent property of M♮-concave functions section 11.3
MCFP₀: minimum cost flow problem (linear arc cost) section 9.1.1
MCFP₃: minimum cost flow problem (nonlinear cost) section 9.1.1
MSFP₁: submodular flow problem (linear arc cost) section 9.2
MSFP₂: M-convex submodular flow problem (linear arc cost) section 9.2
MSFP₃: M-convex submodular flow problem (nonlinear arc cost) section 9.2
maxSFP: maximum submodular flow problem section 9.3

N(x): integral neighborhood of point x (3.58)

p: variable of an L-convex function section 1.4.1



p: potential section 2.2, section 9.4
p ∨ q: vector of componentwise maxima of p and q (1.28)
p ∧ q: vector of componentwise minima of p and q (1.28)
P(ρ): submodular polyhedron defined by submodular set function ρ (4.28)
P(γ, γ, γ): L♮-convex polyhedron defined by (γ, γ, γ) (5.16)
Q: set of rational numbers
Q(ρ, μ): M♮-convex polyhedron (g-polymatroid) defined by (ρ, μ) (4.36)
R: set of real numbers
R+: set of nonnegative real numbers
R++: set of positive real numbers
ri S: relative interior of set S section 3.1
sup: supremum
supp⁺: positive support (2.21)
supp⁻: negative support (2.21)
S̄: convex hull of set S section 3.1
S[R]: set of real-valued submodular set functions (4.10)
S[Z]: set of integer-valued submodular set functions (4.11)
(SBF[Z]): submodularity of functions on Zⁿ section 1.4.1, section 7.1
(SBF[R]): submodularity of functions on Rⁿ section 1.4.1, section 7.8
(SBF♮[Z]): translation submodularity of functions on Zⁿ section 1.4.1, section 7.1
(SBF♮[R]): translation submodularity of functions on Rⁿ section 1.4.1, section 7.8
(SBS[Z]): submodularity of sets in Zⁿ section 1.4.1, section 5.1
(SBS[R]): submodularity of sets in Rⁿ section 5.6
(SBS♮[Z]): translation submodularity of sets in Zⁿ section 5.5
(SBS♮[R]): translation submodularity of sets in Rⁿ section 5.6
S_l: supply function of producer l section 11.1
(SI): single improvement property section 11.3
T[R]: set of real-valued distance functions with triangle inequality section 5.2
T[Z]: set of integer-valued distance functions with triangle inequality section 5.2
(TRF[Z]): linearity in direction 1 of functions on Zⁿ section 1.4.1, section 7.1
(TRF[R]): linearity in direction 1 of functions on Rⁿ section 1.4.1, section 7.8
(TRS[Z]): translation property in direction 1 of sets in Zⁿ section 1.4.1, section 5.1
(TRS[R]): translation property in direction 1 of sets in Rⁿ section 5.6
U_h: utility function of consumer h section 11.1

(VM): axiom of valuated matroids section 2.4.2


x: variable of an M-convex function section 1.4.2
x°: total initial endowment (11.13)
Z: set of integers
Z+: set of nonnegative integers
Z++: set of positive integers
Preface

Discrete Convex Analysis is aimed at establishing a novel theoretical frame-


work for solvable discrete optimization problems by means of a combination of the
ideas in continuous optimization and combinatorial optimization. The theoretical
framework of convex analysis is adapted to discrete settings and the mathemati-
cal results in matroid/submodular function theory are generalized. Viewed from
the continuous side, the theory can be classified as a theory of convex functions
f : Rⁿ → R that have additional combinatorial properties. Viewed from the dis-
crete side, it is a theory of discrete functions f : Zⁿ → Z that enjoy certain nice
properties comparable to convexity. Symbolically,

Discrete convex analysis = Convex analysis + Matroid theory.
The theory emphasizes duality and conjugacy as well as algorithms. This results in
a novel duality framework for nonlinear integer programming.
Two convexity concepts, called L-convexity and M-convexity, play primary
roles, where "L" stands for "lattice" and "M" for "matroid." L-convex functions and
M-convex functions are convex functions with additional combinatorial properties
distinguished by "L" and "M," which are conjugate to each other through a discrete
version of the Legendre-Fenchel transformation. L-convex functions and M-convex
functions generalize, respectively, the concepts of submodular set functions and base
polyhedra of (poly)matroids.
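The conjugacy just mentioned can be made concrete in the simplest univariate case. The following sketch is illustrative only (the code and the finite windows are ours, not the book's): for f(x) = x² on Z, the discrete Legendre-Fenchel transform f*(p) = max_x {px − f(x)} is again discrete convex, and conjugating twice recovers f.

```python
# Toy sketch (ours, not the book's): the discrete Legendre-Fenchel
# transform f*(p) = max_x { p*x - f(x) } of a univariate convex
# function on Z, maximized over a finite window wide enough here.

def conjugate(f, xs, ps):
    # f* evaluated at each integer slope p in ps, maximizing over xs
    return {p: max(p * x - f(x) for x in xs) for p in ps}

f = lambda x: x * x                        # convex on Z
fstar = conjugate(f, range(-20, 21), range(-10, 11))

# f* is again discrete convex: f*(p-1) + f*(p+1) >= 2 f*(p)
assert all(fstar[p - 1] + fstar[p + 1] >= 2 * fstar[p]
           for p in range(-9, 10))

# Conjugating twice recovers f (f is already convex on Z)
fss = {x: max(p * x - fstar[p] for p in fstar) for x in range(-5, 6)}
assert all(fss[x] == f(x) for x in range(-5, 6))
print("biconjugation recovers f on {-5, ..., 5}")
```

The finite ranges stand in for the maximization over all of Z; they are wide enough that the maxima are attained in the interior.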
L-convexity and M-convexity prevail in discrete systems.
• In network flow problems, flow and tension are dual objects. Roughly speak-
ing, flow corresponds to M-convexity and tension to L-convexity.
• In matroids, the rank function corresponds to L-convexity and the base family
to M-convexity.
• M-matrices in matrix theory correspond to L-convexity and their inverses to
M-convexity. Hence, in a discretization of the Poisson problem of partial
differential equations, for example, the differential operator corresponds to
L-convexity and the Green function to M-convexity.
• Dirichlet forms in probability theory are essentially the same as quadratic
L-convex functions.
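The M-matrix bullet can be checked numerically on a small example. The following sketch (the specific matrix and the random test are ours, not the book's) verifies that the quadratic form of a symmetric M-matrix is submodular with respect to the lattice operations of componentwise maximum and minimum, the hallmark of (quadratic) L-convexity.

```python
import numpy as np

# Illustrative sketch (ours, not the book's): a symmetric M-matrix has
# nonpositive off-diagonal entries; this one is also diagonally dominant,
# hence positive semidefinite.
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

def f(p):
    # Quadratic form f(p) = (1/2) p^T A p
    return 0.5 * p @ A @ p

rng = np.random.default_rng(0)
for _ in range(1000):
    p = rng.integers(-5, 6, size=3).astype(float)
    q = rng.integers(-5, 6, size=3).astype(float)
    # Submodularity: f(p) + f(q) >= f(p v q) + f(p ^ q)
    lhs = f(p) + f(q)
    rhs = f(np.maximum(p, q)) + f(np.minimum(p, q))
    assert lhs >= rhs - 1e-9
print("submodularity verified on 1000 random integer pairs")
```

The inequality holds for any quadratic form whose off-diagonal coefficients are nonpositive, which is exactly the M-matrix condition.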
This book is intended to be read profitably by graduate students in operations
research, mathematics, and computer science and also by mathematics-oriented


practitioners and application-oriented mathematicians. Self-contained presentation


is envisaged. In particular, no familiarity with matroid theory nor with convex
analysis is assumed. On the contrary, I hope the reader will acquire a unified view
on matroids and convex functions through a variety of examples of discrete systems
and the axiomatic approach presented in this book.
I would like to express my appreciation for the encouragement, support,
help, and criticism that I have received during my research on the theory of Dis-
crete Convex Analysis. Joint work with Akiyoshi Shioura and Akihisa Tamura
has been most substantial and collaborations with Satoru Fujishige, Satoru Iwata,
Gleb Koshevoy, and Satoko Moriguchi enjoyable. Moral support offered by Bill
Cunningham, András Frank, and Laci Lovász has been encouraging. I have ben-
efited from discussions with and/or comments by Andreas Dress, Atsushi Kajii,
Mamoru Kaneko, Takahiro Kawai, Takashi Kumagai, Tomomi Matsui, Makoto
Matsumoto, Shiro Matuura, Tom McCormick, Yoichi Miyaoka, Kiyohito Nagano,
Maurice Queyranne, András Recski, András Sebő, Maiko Shigeno, Masaaki Sug-
ihara, Zoltan Szigeti, Takashi Takabatake, Yoichiro Takahashi, Tamaki Tanaka,
Fabio Tardella, Levent Tunçel, Jens Vygen, Jun Wako, Walter Wenzel, Yoshitsugu
Yamamoto, and Zaifu Yang. In preparing this book I have been supported by sev-
eral friends. Among others, Akiyoshi Shioura and Akihisa Tamura went through the
text and provided comments and Satoru Iwata agreed that his unpublished results
be included in this book. A significant part of this book is based on my previous
book [147] in Japanese published by Kyoritsu Publishing Company. Finally, I ex-
press my deep gratitude to Peter Hammer, the chief editor of this monograph series,
for his support in the realization of this book.
October 2002 Kazuo Murota
Chapter 1

Introduction to the
Central Concepts

Discrete Convex Analysis aims at establishing a new theoretical framework of dis-


crete optimization through mathematical studies of convex functions with combina-
torial structures or discrete functions with convexity structures. This chapter is a
succinct introduction to the central issues discussed in this book, including the role
of convexity in optimization, several classes of well-behaved discrete functions, and
duality theorems. We start with an account of the aim and the history of discrete
convex analysis.

1.1 Aim and History of Discrete Convex Analysis


The motive for Discrete Convex Analysis is explained in general terms of optimiza-
tion. Also included in this section is a brief chronological account of discrete convex
functions in relation to the theory of matroids and submodular functions.

1.1.1 Aim
An optimization problem, or a mathematical programming problem, may be ex-
pressed generically as follows:

    Minimize f(x) subject to x ∈ S.

This means that we are to find an x that minimizes the value of f ( x ) subject to
the constraint that x should belong to the set 5. Both / and S are given as the
problem data, whereas x is a variable to be determined. The function / is called
the objective function and the set S the feasible set.
In continuous optimization, the variable x typically denotes a finite-dimensional
real vector, say, x 6 R™, and accordingly we have S C R™ and / : R™ —> R (or
/ : S —> R).1 An optimization problem with 5 a convex set and / a convex function
lr
The notation R means the set of all real numbers and Rn the set of n-dimensional real vectors.

2 Chapter 1. Introduction to the Central Concepts

Figure 1.1. Convex set and nonconvex set.

Figure 1.2. Convex function.

is referred to as a convex program, where a set S is convex if the line segment join-
ing any two points in S is contained in S (see Fig. 1.1) and a function f : S → R
defined on a convex set S is convex if

    f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y)                              (1.1)

whenever x, y ∈ S and 0 ≤ λ ≤ 1 (see Fig. 1.2). Convex programs constitute a class
of optimization problems that are tractable both theoretically and practically, with
a firm theoretical basis provided by "convex analysis." The tractability of convex
programs is largely based on the following properties of convex functions:

1. Local optimality (or minimality) guarantees global optimality. This implies,
in particular, that a global optimum can be found by descent algorithms.
2. Duality, such as the min-max relation or the separation theorem, holds good.
This leads, for instance, to primal-dual algorithms using dual variables and
also to sensitivity analysis in terms of dual variables.

Some more details on these issues will be discussed in section 1.2.


In discrete optimization (or combinatorial optimization), on the other hand,
the variable x takes discrete values; most typically, x is an integer vector or a {0,1}-
vector. Whereas almost all discrete optimization problems arising from practical
applications are difficult to solve efficiently, network flow problems are recognized as
tractable discrete optimization problems. In the minimum cost flow problem with
linear arc costs, for instance, we have the following fundamental facts that render
the problem tractable:
1. A flow is optimal if and only if it cannot be improved by augmentation along
a cycle. This statement means that the global optimality of a solution can
be characterized by the local optimality with respect to augmentation along
a cycle.

2. A flow is optimal if and only if there exists a potential on the vertex set
such that the reduced arc cost with respect to the potential is nonnegative
on every arc. This is a duality statement characterizing the optimality of
a flow in terms of the dual variable (potential). This provides the basis for
primal-dual algorithms.
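The potential criterion in item 2 above lends itself to a direct computational check. The sketch below is an illustration only (the graph, capacities, costs, and flows are made up): it verifies a candidate flow against a candidate potential by inspecting the reduced cost on every residual arc.

```python
def reduced_cost_optimal(arcs, flow, potential):
    """Potential-based optimality test for a minimum cost flow.

    arcs: list of (u, v, capacity, cost) with linear arc costs.
    flow: dict mapping (u, v) to its flow value (default 0).
    The flow is optimal iff every residual arc has nonnegative reduced cost
    cost(u, v) + potential[u] - potential[v].
    """
    for u, v, cap, cost in arcs:
        f = flow.get((u, v), 0)
        rc = cost + potential[u] - potential[v]
        if f < cap and rc < 0:   # unsaturated forward arc violates the criterion
            return False
        if f > 0 and -rc < 0:    # backward residual arc violates the criterion
            return False
    return True

# tiny made-up instance: ship 2 units from s to t; the path s-a-t is cheaper
arcs = [("s", "a", 2, 1), ("a", "t", 2, 1), ("s", "t", 2, 3)]
potential = {"s": 0, "a": 1, "t": 2}
good = reduced_cost_optimal(arcs, {("s", "a"): 2, ("a", "t"): 2}, potential)
bad = reduced_cost_optimal(arcs, {("s", "t"): 2}, potential)
```

Here `good` is the cheap routing certified by the potential, while `bad` saturates the expensive direct arc and fails the criterion.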
In more abstract terms, it is accepted that the tractability of the network flow
problems stems from the matroidal structure (or submodularity) inherent therein.
Whereas the meaning of this statement will be substantiated later, it is mentioned
at this point that a matroid is an abstract combinatorial object defined as a pair
of a finite set, say, V, and a family B of subsets of V that satisfies certain abstract
axioms. We refer to V as the ground set, a member of B as a base, and a subset
of a base as an independent set. The matroid is considered to be fundamental in
combinatorial optimization, which is evidenced by the following facts:2
1. A base is optimal with respect to a given weight vector if and only if it
cannot be improved by an elementary exchange, which means a modification
of a base B to another base (B \ {u}) ∪ {v} with u in B and v not in B.
Thus, the local optimality with respect to elementary exchanges guarantees
the global optimality. Moreover, an optimal base can be found by the so-called
greedy algorithm, which may be compared to the steepest descent algorithm
in nonlinear optimization.

2. Given a pair of matroids on a common ground set, the intersection problem
is to find a common independent set of maximum cardinality. Edmonds's
intersection theorem is a min-max duality theorem that characterizes the max-
imum cardinality as the minimum of a submodular function defined by the
rank functions of the matroids.
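The greedy algorithm of item 1 above can be sketched concretely on the graphic matroid, whose bases are the spanning trees of a connected graph. The code below is an illustration (the graph and its edge weights are made up): scanning elements in nonincreasing weight order and keeping each one that preserves independence (creates no cycle) returns a maximum-weight base.

```python
def greedy_max_weight_base(n, weighted_edges):
    """Matroid greedy algorithm on the graphic matroid of a graph on n vertices.

    weighted_edges: list of (weight, u, v).  Independence (no cycle) is
    tracked with a union-find structure; the greedy choice is exact precisely
    because the independent sets form a matroid.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    base = []
    for w, u, v in sorted(weighted_edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:                       # adding (u, v) keeps the set independent
            parent[ru] = rv
            base.append((w, u, v))
    return base

# triangle with made-up weights: the greedy base keeps the two heaviest edges
base = greedy_max_weight_base(3, [(3, 0, 1), (2, 1, 2), (1, 0, 2)])
```

This is exactly Kruskal's procedure for a maximum-weight spanning tree, viewed as a matroid greedy algorithm.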
With the above facts it is natural to think of matroidal structure as a discrete
or combinatorial analogue of convexity. The connection of matroidal structure to
convexity was formulated in the early 1980s as a relationship between submodular
2
A more specific account of these facts will be given in section 1.3.

functions and convex functions. It was shown by Frank that Edmonds's inter-
section theorem can be rewritten as a separation theorem for a pair of submod-
ular/supermodular functions, with an integrality (discreteness) assertion for the
separating hyperplane in the case of integer-valued functions. Another reformula-
tion of Edmonds's intersection theorem is Fujishige's Fenchel-type min-max duality
theorem for a pair of submodular/supermodular functions, again with an integrality
assertion in the case of integer-valued functions. A precise statement, beyond anal-
ogy, about the relationship between submodular functions and convex functions was
made by Lovasz: A set function is submodular if and only if the so-called Lovasz
extension of that set function is convex. These results led to the recognition that
the essence of the duality for submodular/supermodular functions consists of the
discreteness (integrality) assertion in addition to the duality for convex/concave
functions. Namely,

    duality for submodular functions ~ convexity + discreteness.

Such developments notwithstanding, our understanding of convexity in discrete
optimization seems to be only partial. In convex programming, a convex ob-
jective function is minimized over a convex feasible region, which may be described
by a system of inequalities in (other) convex functions. In matroid optimization,
explained above, the objective function is restricted to be linear and the feasible
region is described by a system of inequalities using submodular functions. This
means that the convexity argument for submodular functions applies to the con-
vexity of feasible regions and not to the convexity of objective functions. In the
literature, however, we can find a number of nice structural results on discrete op-
timization of nonlinear objective functions. For example, the minimum cost flow
problem with a separable convex cost function admits optimality criteria similar
to those for linear arc costs (Minoux [131] and others), which can be carried over
to the submodular flow problem with a separable convex cost function (Fujishige
[65]). The minimization of a separable convex function over a base polyhedron also
admits a local optimality criterion with respect to elementary exchanges (Fujishige
[60], Girlich-Kowaljow [78], Groenevelt [81]). This fact is used in the literature of
resource allocation problems (Ibaraki-Katoh [93], Hochbaum [90], Hochbaum-Hong
[91], Girlich-Kovalev-Zaporozhets [77]). The convexity argument concerning sub-
modular functions, however, does not help us understand these results in relation
to convex analysis. We are thus waiting for a more general theoretical framework
for discrete optimization that can be compared to convex analysis for continuous
optimization.
Discrete Convex Analysis is aimed at establishing a general theoretical frame-
work for solvable discrete optimization problems by means of a combination of the
ideas in continuous optimization and combinatorial optimization. The theoretical
framework of convex analysis is adapted to discrete settings and the mathemati-
cal results in matroid/submodular function theory are generalized. Viewed from
the continuous side, the theory can be classified as a theory of convex functions
f : R^n → R that have additional combinatorial properties. Viewed from the dis-
crete side, it is a theory of discrete functions f : Z^n → Z that enjoy certain nice

properties comparable to convexity.3 Symbolically,

    discrete convex analysis = convex analysis + matroid theory.

The theory emphasizes duality and conjugacy with a view to providing a novel
duality framework for nonlinear integer programming. It may be in order to mention
that the present theory extends the direction set forth by J. Edmonds, A. Frank,
S. Fujishige, and L. Lovasz (see section 1.1.2), but it is, rather, independent of the
convexity arguments in the theories of greedoids, antimatroids, convex geometries,
and oriented matroids (Bjorner-Las Vergnas-Sturmfels-White-Ziegler [16], Korte-
Lovasz-Schrader [114]).
Two convexity concepts, called L-convexity and M-convexity, play primary
roles in the present theory. L-convex functions and M-convex functions are both
(extensible to) convex functions and they are conjugate to each other through a
discrete version of the Legendre-Fenchel transformation. L-convex functions and M-
convex functions generalize, respectively, the concepts of submodular set functions
and base polyhedra. It is noted that the "L" in "L-convexity" stands for "lattice"
and the "M" in "M-convexity" for "matroid."

1.1.2 History
This section is devoted to an account of the history of discrete convex functions
in matroid theory that led to L-convex and M-convex functions (see Table 1.1).
There are, however, many other previous and recent studies on discrete convex-
ity outside the literature of the matroid (Hochbaum-Shamir-Shanthikumar [92],
Ibaraki-Katoh [93], Kindler [112], Miller [130], and so on).
The concept of matroids was introduced by H. Whitney [218] in 1935, together
with the equivalence between the submodularity of rank functions and the exchange
property of independent sets. This equivalence is the germ of the conjugacy between
L-convex and M-convex functions in the present theory of discrete convex analysis.
In the late 1960s, J. Edmonds found a fundamental duality theorem on the
intersection problem for a pair of (poly)matroids. This theorem, Edmonds's inter-
section theorem, shows a min-max relation between the maximum of a common
independent set and the minimum of a submodular function derived from the rank
functions. The famous article of Edmonds [44] convinced us of the fundamental
role of submodularity in discrete optimization. Analogies of submodular functions
to convex functions and to concave functions were discussed at the same time. The
min-max relation supported the analogy to convex functions, whereas some other
facts pointed to concave functions. No unanimous conclusion was reached at this
point.
The relationship between submodular functions and convex functions, which
was made clear in the early 1980s through the work of A. Frank, S. Fujishige,
and L. Lovasz, was described in section 1.1.1 but is mentioned again in view of
its importance. The fundamental relationship between submodular functions and
convex functions, due to Lovasz [123], says that a set function is submodular if
3 The notation Z means the set of all integers and Z^n the set of n-dimensional integer vectors.

Table 1.1. History (matroid and convexity).

Year (ca.)  Author(s)                      Result

1935        Whitney [218]                  axioms of matroid;
                                           exchange property <=> submodularity
1965        Edmonds [44]                   polymatroid; polyhedral method;
                                           intersection theorem
1975                                       weighted matroid intersection:
            Edmonds [45]
            Lawler [118]
            Tomizawa-Iri [201]             potential
            Iri-Tomizawa [96]              potential
            Frank [54]                     weight splitting
1982                                       relationship to convexity:
            Frank [55]                     discrete separation theorem
            Fujishige [62]                 Fenchel-type duality
            Lovasz [123]                   Lovasz (linear) extension
1990        Dress-Wenzel [41], [42]        valuated matroid:
                                           axiom, greedy algorithm
            Favati-Tardella [49]           integrally convex function
1995        Murota [135], [139]            valuated matroid intersection
            Murota [137], [140]            L-/M-convex function;
                                           Fenchel-type duality;
                                           separation theorem
            Murota-Shioura [151]           M♮-convex function
2000        Fujishige-Murota [68]          L♮-convex function
            Murota-Shioura [152]           polyhedral L-/M-convex function
            Murota-Shioura [156], [157]    continuous L-/M-convex function

and only if the Lovasz extension of that function is convex. Reformulations of
Edmonds's intersection theorem into a separation theorem for a pair of submod-
ular/supermodular functions by Frank [55] and a Fenchel-type min-max duality
theorem by Fujishige [62] indicate its similarity to convex analysis. The discrete
mathematical content of these theorems, which cannot be captured by the rela-
tionship of submodularity to convexity, lies in the integrality assertion for integer-
valued submodular/supermodular functions. Further analogy to convex analysis,
such as subgradients, was conceived by Fujishige [63]. These developments in the
1980s led us to the understanding that (i) submodularity should be compared to
convexity, not to concavity, and (ii) the essence of the duality for a pair of sub-
modular/supermodular functions lies in the discreteness (integrality) assertion in

addition to the duality for convex/concave functions:

(i) submodular functions ~ convex functions,


(ii) duality for submodular functions ~ convexity + discreteness.

A remark is in order here, although it involves technical terminology from convex
analysis. The Lovasz extension of a submodular set function is a convex function,
but it is bound to be positively homogeneous (f(λx) = λf(x) for λ > 0). As a matter
of fact, it coincides with the support function of the base polyhedra associated
with the submodular function. This suggests that the convexity arguments on
submodularity deal with a restricted class of convex functions, namely, the class of
support functions of convex sets. The relationship of submodular set functions to
convex functions summarized in (i) and (ii) above is generalized to the full extent
by the concept of L-convex functions in the present theory.
Addressing the issue of local vs. global optimality for functions defined on
integer lattice points, P. Favati and F. Tardella [49] came up with the concept
of integrally convex functions in 1990. This concept successfully captures a fairly
general class of functions on integer lattice points, for which a local optimality
implies the global optimality. Moreover, the class of submodular integrally convex
functions (i.e., integrally convex functions that are submodular on integer lattice
points) was considered as a subclass of integrally convex functions. It turns out
that this concept is equivalent to a variant of L-convex functions, called L♮-convex
functions, in the present theory.
We have so far seen major milestones on the road toward L-convex functions
and now turn to M-convex functions.
A weighted version of the matroid intersection problem was introduced by
Edmonds [44]. The problem is to find a maximum-weight common independent set
(or a common base) with respect to a given weight vector. Efficient algorithms for
this problem were developed in the 1970s by Edmonds [45], Lawler [118], Tomizawa-
Iri [201], and Iri-Tomizawa [96] on the basis of a nice optimality criterion in terms
of dual variables. The optimality criterion of Frank [54] in terms of weight splitting
can be thought of as a version of such an optimality criterion using dual variables.
The weighted matroid intersection problem was generalized to the polymatroid
intersection problem as well as to the submodular flow problem. It should be
noted, however, that in all of these generalizations the weighting remained linear or
separable convex.
The concept of valuated matroids, introduced by Dress and Wenzel [41], [42]
in 1990, provides a nice framework for nonlinear optimization on matroids. A val-
uation of a matroid is a nonlinear and nonseparable function of bases satisfying a
certain exchange axiom. It was shown by Dress and Wenzel that a version of the
greedy algorithm works to maximize a matroid valuation and this property in turn
characterizes a matroid valuation. Not only the greedy algorithm but also the inter-
section problem extends to valuated matroids. The valuated matroid intersection
problem, introduced by Murota [135], is to maximize the sum of two valuations.
This generalizes the weighted matroid intersection problem, since linear weighting
is a special case of matroid valuation. Optimality criteria, such as weight splitting,

as well as algorithms for the weighted matroid intersection, are generalized to the
valuated matroid intersection (Murota [136]). An analogy of matroid valuations to
concave functions resulted in a Fenchel-type min-max duality theorem for matroid
valuations (Murota [139]). This Fenchel-type duality is neither a generalization nor
a special case of Fujishige's Fenchel-type duality for submodular functions, but these
two can be generalized into a single min-max equation, which is the Fenchel-type
duality theorem in the present theory.
A further analogy of valuated matroids to concave functions led to the concept
of M-convex/concave functions in Murota [137], 1996. M-convexity is a concept of
"convexity" for functions defined on integer lattice points in terms of an exchange
axiom and affords a common generalization of valuated matroids and (integral)
polymatroids. A valuated matroid can be identified with an M-concave function
defined on {0, 1}-vectors. The base polyhedron of an integral polymatroid is a syn-
onym for a {0, +∞}-valued M-convex function. The valuated matroid intersection
problem and the polymatroid intersection problem are unified into the M-convex
intersection problem. The Fenchel-type duality theorem for matroid valuations is
generalized for M-convex functions and the submodular flow problem is general-
ized to the M-convex submodular flow problem (Murota [142]), which involves an
M-convex function as a nonlinear cost. The nice optimality criterion using dual
variables survives in this generalization. Thus, M-convex functions yield fruitful
generalizations of many important optimization problems on matroids.
The two independent lines of development, namely, the convexity argument
for submodular functions in the early 1980s and that for valuated matroids and
M-convex functions in the early 1990s, were merged into a unified framework of
discrete convex analysis advocated by Murota [140] in 1998. The concept of L-
convex functions was introduced as a generalization of submodular set functions.
L-convex functions form a conjugate class of M-convex functions with respect to
the Legendre-Fenchel transformation. This completes the picture of conjugacy
advanced by Whitney [218] in 1935 as the equivalence between the submodularity
of the rank function of a matroid and the exchange property of independent sets of
a matroid. The duality theorems carry over to L-convex and M-convex functions.
In particular, the separation theorem for L-convex functions is a generalization of
Frank's separation theorem for submodular functions.
Ramifications of the concepts of L- and M-convexity followed. M♮-convex func-
tions,4 introduced by Murota-Shioura [151], are essentially equivalent to M-convex
functions, but are sometimes more convenient. For example, a convex function in
one variable, when considered only for integer values of the variable, is an M♮-convex
function that is not M-convex. L♮-convex functions, due to Fujishige-Murota [68],
are an equivalent variant of L-convex functions. It turns out that L♮-convex func-
tions are exactly the same as the submodular integrally convex functions that had
been introduced by Favati-Tardella [49] in their study of local vs. global optimality.
The success of polyhedral methods in combinatorial optimization naturally sug-
gests the possibility of polyhedral versions of L- and M-convex functions. This idea
was worked out by Murota-Shioura [152] with the introduction of the concepts of
4 "M♮-convex" should be read "M-natural-convex" and similarly for "L♮-convex."

L- and M-convexity for polyhedral functions (piecewise linear functions in real vari-
ables). These convexity concepts were defined also for quadratic functions (Murota-
Shioura [155]) and for closed convex functions (Murota-Shioura [156], [157]).
We conclude this section with a remark on a subtle point in the relationship
between submodularity and convexity. From the discussion in the early 1980s we
have agreed that submodularity should be compared to convexity. This statement
is certainly true for set functions. When it comes to functions on integer points,
however, we need to be careful. As a matter of fact, an M♮-concave function is
submodular and concave extensible (Theorems 6.19 and 6.42), whereas an L♮-convex
function is submodular and convex extensible (Theorem 7.20). This shows that
submodularity and convexity are mutually independent properties for functions on
integer points. It is undoubtedly true, however, that submodularity is essentially
related to discrete convexity.

1.2 Useful Properties of Convex Functions


We have already mentioned that convex functions are tractable in optimization (or
minimization) problems, which is mainly because of the following properties:
1. Local optimality (or minimality) guarantees global optimality.
2. Duality, e.g., the min-max relation or the separation theorem, holds good.
The purpose of this section is to give more specific descriptions of these properties
and to discuss their possible versions for discrete functions.
Let us first recall the definition of a convex function. A function f : R^n →
R ∪ {+∞} is said to be convex if

    f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y)                              (1.2)

for all x, y ∈ R^n and for all λ with 0 ≤ λ ≤ 1, where it is understood that the
inequality is satisfied if f(x) or f(y) is equal to +∞. The inequality (1.2) implies
that the set

    dom f = {x ∈ R^n | f(x) < +∞},

called the effective domain of f, is a convex set. Hence, the present definition of
a convex function coincides with the one in (1.1) that makes explicit reference to
the effective domain S. A special case of inequality (1.2) for λ = 1/2 yields the
midpoint convexity

    (f(x) + f(y))/2 ≥ f((x + y)/2)

and, conversely, this implies convexity provided f is continuous. We often assume
(explicitly or implicitly) that f(x) < +∞ for some x ∈ R^n whenever we talk about
a convex function f. A function h : R^n → R ∪ {−∞} is said to be concave if −h is
convex.
A point (or vector) x is said to be a global optimum of f if the inequality

    f(y) ≥ f(x)

holds for every y and a local optimum if this inequality holds for every y in some
neighborhood of x. Obviously, global optimality implies local optimality. The
converse is not true in general, but it is true for convex functions.

Theorem 1.1. For a convex function, global optimality (or minimality) is guaran-
teed by local optimality.

Proof. Let x be a local optimum of a convex function f. Then we have f(z) ≥ f(x)
for any z in some neighborhood U of x. For any y, z = λx + (1 − λ)y belongs to U
for λ < 1 sufficiently close to 1 and it follows from (1.2) that

    f(x) ≤ f(z) ≤ λf(x) + (1 − λ)f(y).

This implies f(y) ≥ f(x).

The above theorem is significant and useful in that it reduces the global prop-
erty to a local one. Still it refers to an infinite number of points or directions around
x for the local optimality. In considering discrete structures on top of convexity we
may hope that a fixed and finite set of directions suffices to guarantee the local
optimality. For example, in the simplest case of a separable convex function

    f(x) = f_1(x(1)) + f_2(x(2)) + ··· + f_n(x(n)),

which is the sum of univariate convex functions5 f_i(x(i)) in each component of
x = (x(i) | i = 1, ..., n), it suffices to check for local optimality in 2n directions:
the positive and negative directions of the coordinate axes. Such a phenomenon of
discreteness in direction, so to speak, is a reflection of the combinatorial structure
of separable convex functions. Although the combinatorial structure of separable
convex functions is too simple for further serious consideration, similar phenom-
ena of discreteness in direction occur in nontrivial ways for L-convex or M-convex
functions, as we will see in section 1.4.
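The 2n-direction test for a separable convex function is easy to state in code. The following is a minimal sketch (the particular separable function below is a made-up example, not from the text): a point passing the 2n coordinate-direction checks is a global minimizer because each univariate piece can be minimized independently.

```python
def is_local_min_2n(f, x):
    """Test x in Z^n against the 2n directions +e_i and -e_i.

    For a separable convex f, passing this local test certifies a global
    minimum, since each univariate summand is minimized independently.
    """
    n = len(x)
    for i in range(n):
        for d in (1, -1):
            y = list(x)
            y[i] += d               # step one unit along a coordinate axis
            if f(y) < f(x):
                return False
    return True

# made-up separable convex example: f(x) = (x(1) - 2)^2 + |x(2) + 1|
f = lambda x: (x[0] - 2) ** 2 + abs(x[1] + 1)
at_minimum = is_local_min_2n(f, [2, -1])   # the global minimizer
elsewhere = is_local_min_2n(f, [0, 0])
```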
We now go on to the second issue of duality and conjugacy. For a function f
(not necessarily convex), the convex conjugate f* : R^n → R ∪ {+∞} is defined by

    f*(p) = sup{⟨p, x⟩ − f(x) | x ∈ R^n},

where

    ⟨p, x⟩ = p(1)x(1) + p(2)x(2) + ··· + p(n)x(n)

for p = (p(i) | i = 1, ..., n) and x = (x(i) | i = 1, ..., n). The function f* is
also referred to as the (convex) Legendre-Fenchel transform of f and the mapping
f ↦ f* as the (convex) Legendre-Fenchel transformation.
5 A univariate function means a function in a single variable.

Figure 1.3. Conjugate function (Legendre-Fenchel transform)

For example, for f(x) = exp(x), where n = 1, we see

    f*(p) = p log p − p  (p > 0),    f*(0) = 0,    f*(p) = +∞  (p < 0)

by a simple calculation. See Fig. 1.3 for the geometric meaning in the case of n = 1.
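The conjugate of exp can also be checked numerically. The sketch below approximates the supremum defining f*(p) on a truncated grid (an illustration only; the grid bounds are chosen so that the supremum for p = 2, attained at x = log 2, lies well inside them) and compares the result with the closed form p log p − p.

```python
import math

def conjugate_1d(f, p, grid):
    """Approximate f*(p) = sup_x (p*x - f(x)) by maximizing over a finite grid."""
    return max(p * x - f(x) for x in grid)

grid = [i / 1000.0 for i in range(-8000, 8001)]   # x in [-8, 8], step 0.001
approx = conjugate_1d(math.exp, 2.0, grid)
exact = 2.0 * math.log(2.0) - 2.0                 # closed form p log p - p at p = 2
```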
The Legendre-Fenchel transformation gives a one-to-one correspondence in
the class of well-behaved convex functions, called closed proper convex functions,
where the precise meaning of this technical terminology (not important here) will
be explained later in section 3.1. The notation f** means (f*)*, the conjugate of
the conjugate function of /.

Theorem 1.2 (Conjugacy). The Legendre-Fenchel transformation f ↦ f* gives a
symmetric one-to-one correspondence in the class of all closed proper convex func-
tions. That is, for a closed proper convex function f, f* is a closed proper convex
function and f** = f.

Similarly, for a function h, the concave conjugate h° : R^n → R ∪ {−∞} is
defined by

    h°(p) = inf{⟨p, x⟩ − h(x) | x ∈ R^n}.
The duality principle in convex analysis can be expressed in a number of dif-
ferent forms. One of the most appealing statements is in the form of the separation
theorem, which asserts the existence of a separating affine function y = α* + ⟨p*, x⟩
for a pair of convex and concave functions (see Fig. 1.4).

Figure 1.4. Separation for convex and concave functions.

Theorem 1.3 (Separation theorem). Let f : R^n → R ∪ {+∞} and h : R^n →
R ∪ {−∞} be convex and concave functions, respectively (satisfying certain regularity
conditions). If6

    f(x) ≥ h(x)  (∀x ∈ R^n),

there exist α* ∈ R and p* ∈ R^n such that

    f(x) ≥ α* + ⟨p*, x⟩ ≥ h(x)  (∀x ∈ R^n).

It is admitted that the statement above is mathematically incomplete, refer-
ring to certain regularity conditions, which will be specified later in section 3.1.
Another expression of the duality principle is in the form of the Fenchel duality.
This is a min-max relation between a pair of convex and concave functions and their
conjugate functions. The certain regularity conditions in the statement below will
be specified later.

Theorem 1.4 (Fenchel duality). Let f : R^n → R ∪ {+∞} and h : R^n → R ∪
{−∞} be convex and concave functions, respectively (satisfying certain regularity
conditions). Then

    inf{f(x) − h(x) | x ∈ R^n} = sup{h°(p) − f*(p) | p ∈ R^n}.

Such a min-max theorem is computationally useful in that it affords a cer-
tificate of optimality. Suppose that we want to minimize f(x) − h(x) and have
x = x̂ as a candidate for the minimizer. How can we verify or prove that x̂ is
indeed an optimal solution? One possible way is to find a vector p such that
f(x̂) − h(x̂) = h°(p) − f*(p). This implies the optimality of x̂ by virtue of the
min-max theorem. The vector p, often called a dual optimal solution, serves as
6 The notation ∀ means "for all," "for any," or "for each."

a certificate for the optimality of x̂. It is emphasized that the min-max theorem
guarantees the existence of such a certificate p for any optimal solution x̂. It is also
mentioned that the min-max theorem does not tell us how to find optimal solutions
x̂ and p.
It is one of the recurrent themes in discrete convexity how the conjugacy
and the duality above should be adapted in discrete settings. To be specific, let
us consider integer-valued functions on integer lattice points and discuss possible
notions of conjugacy and duality for f : Z^n → Z ∪ {+∞} and h : Z^n → Z ∪
{−∞}. Some ingredients of discreteness (integrality) are naturally expected in the
formulation of conjugacy and duality. This amounts to discussing another kind
of discreteness, discreteness in value, in contrast with discreteness in direction,
mentioned above.
Discrete versions of the Legendre-Fenchel transformations can be defined by

    f*(p) = sup{⟨p, x⟩ − f(x) | x ∈ Z^n}  (p ∈ Z^n),                    (1.9)
    h°(p) = inf{⟨p, x⟩ − h(x) | x ∈ Z^n}  (p ∈ Z^n).                    (1.10)

They are meaningful as transformations of discrete functions in that the resulting
functions f* and h° are also integer valued on integer points. We call (1.9) and
(1.10), respectively, convex and concave discrete Legendre-Fenchel transformations.
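In one variable the integer-valuedness of the transform is easy to observe directly. A small sketch (the maximization is truncated to a window that contains the maximizer for the chosen f and p; the example f(x) = x² is made up for illustration):

```python
def discrete_conjugate(f, p, lo=-50, hi=50):
    """Convex discrete Legendre-Fenchel transform in one variable:
    f*(p) = max over integer x in [lo, hi] of p*x - f(x)."""
    return max(p * x - f(x) for x in range(lo, hi + 1))

f = lambda x: x * x                 # integer valued on Z
values = [discrete_conjugate(f, p) for p in range(-4, 5)]
# integer arguments produce integer values, as claimed for transform (1.9)
```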
With these definitions, a discrete version of the Fenchel duality would read as
follows.
[Discrete Fenchel-type duality theorem] Let f : Z^n → Z ∪ {+∞} and
h : Z^n → Z ∪ {−∞} be convex and concave functions, respectively (in
an appropriate sense). Then

    inf{f(x) − h(x) | x ∈ Z^n} = sup{h°(p) − f*(p) | p ∈ Z^n}.

Such a theorem, if any, claims a min-max duality relation for integer-valued non-
linear functions, which is not likely to be true for an arbitrary class of discrete
functions. It is emphasized that the definition of convexity itself is left open in the
above generic statement, although h should be called concave when —h is convex.
As for the separation theorem, a possible discrete version would read as fol-
lows, imposing integrality (α* ∈ Z, p* ∈ Z^n) on the separating affine function (see
Fig. 1.5).

[Discrete separation theorem] Let f : Z^n → Z ∪ {+∞} and h : Z^n → Z ∪
{−∞} be convex and concave functions, respectively (in an appropriate
sense). If

    f(x) ≥ h(x)  (∀x ∈ Z^n),

there exist α* ∈ Z and p* ∈ Z^n such that

    f(x) ≥ α* + ⟨p*, x⟩ ≥ h(x)  (∀x ∈ Z^n).

Again the precise definition of convexity remains unspecified here.



Figure 1.5. Discrete separation.

Figure 1.6. Convex and nonconvex discrete functions.

To motivate the framework we will introduce in the subsequent sections, let
us try a naive and natural candidate for the convexity concept, which turns out to
be insufficient.
Let us (temporarily) define f : Z^n → Z ∪ {+∞} to be convex if it can be
extended to a convex function on R^n, i.e., if there exists a convex function f̄ :
R^n → R ∪ {+∞} such that

    f(x) = f̄(x)  (∀x ∈ Z^n).

This is illustrated in Fig. 1.6.


In the one-dimensional case (with n = 1), this is equivalent to defining f :
Z → Z ∪ {+∞} to be convex if

    f(x − 1) + f(x + 1) ≥ 2f(x)  (∀x ∈ Z).

As is easily verified, the discrete separation theorem, as well as the discrete Fenchel
duality, holds with this definition in the case of n = 1.
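In code, the one-dimensional condition f(x − 1) + f(x + 1) ≥ 2f(x) can be checked on a finite window. A minimal sketch (the two test functions below are made-up examples):

```python
def is_discrete_convex_1d(f, lo, hi):
    """Check f(x - 1) + f(x + 1) >= 2 f(x) at every integer x strictly inside [lo, hi]."""
    return all(f(x - 1) + f(x + 1) >= 2 * f(x) for x in range(lo + 1, hi))

square_ok = is_discrete_convex_1d(lambda x: x * x, -10, 10)       # convex on Z
neg_abs_ok = is_discrete_convex_1d(lambda x: -abs(x), -10, 10)    # fails at x = 0
```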
When it comes to higher dimensions, the situation is not that simple. The
following examples demonstrate that the discrete separation fails with this naive
definition of convexity.

Example 1.5 (Failure of discrete separation). Consider two discrete functions
defined by

    f(x) = max(0, x(1) + x(2)),    h(x) = min(x(1), x(2)),

where x = (x(1), x(2)) ∈ Z^2. They are integer valued on the integer lattice Z^2,
with f(0) = h(0) = 0, and can be extended, respectively, to a convex function
f̄ : R^2 → R and a concave function h̄ : R^2 → R given by

    f̄(x) = max(0, x(1) + x(2)),    h̄(x) = min(x(1), x(2)),

where x = (x(1), x(2)) ∈ R^2. Since f̄(x) ≥ h̄(x) (∀x ∈ R^2), the separation theorem
in convex analysis (Theorem 1.3) applies to the pair (f̄, h̄) to yield a (unique)
separating affine function ⟨p*, x⟩, with p* = (1/2, 1/2). We have f̄(x) ≥ ⟨p*, x⟩ ≥
h̄(x) for all x ∈ R^2 and, a fortiori, f(x) ≥ ⟨p*, x⟩ ≥ h(x) for all x ∈ Z^2. However,
there exists no integral vector p* ∈ Z^2 such that f(x) ≥ ⟨p*, x⟩ ≥ h(x) for all
x ∈ Z^2. This demonstrates the failure of the desired discreteness in the separating
affine function. •
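Example 1.5 can be probed by brute force. The sketch below assumes the pair f(x) = max(0, x(1) + x(2)) and h(x) = min(x(1), x(2)) — a standard instance consistent with the separating vector p* = (1/2, 1/2) stated in the example — and confirms on a finite window of Z^2 that the half-integral p* separates while no integral p* in a small range does.

```python
from itertools import product

f = lambda x: max(0, x[0] + x[1])   # assumed convex-extensible function
h = lambda x: min(x[0], x[1])       # assumed concave-extensible function

box = list(product(range(-3, 4), repeat=2))   # finite window of Z^2

def separates(p):
    """Does f(x) >= <p, x> >= h(x) hold at every point of the window?"""
    return all(f(x) >= p[0] * x[0] + p[1] * x[1] >= h(x) for x in box)

half_integral_works = separates((0.5, 0.5))
some_integral_works = any(separates((p1, p2))
                          for p1 in range(-3, 4) for p2 in range(-3, 4))
```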

Example 1.6 (Failure of real-valued separation). This example shows that even the existence of a separating affine function can be denied. For the discrete functions

f(x) = |x(1) + x(2) − 1|,    h(x) = 1 − |x(1) − x(2)|,

where x = (x(1), x(2)) ∈ Z², we have f(x) ≥ h(x) (∀x ∈ Z²). There exists, however, no pair of a real number α* ∈ R and a real vector p* ∈ R² for which f(x) ≥ α* + ⟨p*, x⟩ ≥ h(x) for all x ∈ Z². Note that the separation theorem in convex analysis (Theorem 1.3) does not apply to the pair of their convex/concave extensions (f̄, h̄), which are given by

f̄(x) = |x(1) + x(2) − 1|,    h̄(x) = 1 − |x(1) − x(2)|

for x = (x(1), x(2)) ∈ R², since f̄(1/2, 1/2) < h̄(1/2, 1/2). This example also shows that f̄ ≥ h̄ on R^n does not follow from f ≥ h on Z^n. ■
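Both claims can be confirmed numerically. The sketch below assumes the concrete pair f(x) = |x(1) + x(2) − 1| and h(x) = 1 − |x(1) − x(2)| (a common rendering of this example, consistent with f̄(1/2, 1/2) < h̄(1/2, 1/2) above); the nonexistence of a separator follows because f and h agree at four integer points that overdetermine any affine interpolant.

```python
from itertools import product

def f(x1, x2):  # convex on Z^2 (same formula extends to R^2)
    return abs(x1 + x2 - 1)

def h(x1, x2):  # concave on Z^2 (same formula extends to R^2)
    return 1 - abs(x1 - x2)

grid = list(product(range(-3, 4), repeat=2))
assert all(f(*x) >= h(*x) for x in grid)   # f >= h on Z^2 ...
assert f(0.5, 0.5) < h(0.5, 0.5)           # ... but not for the extensions on R^2

# f and h coincide at four integer points, so a separating affine function
# a + p1*x1 + p2*x2 would have to interpolate f there exactly:
#   (0,0) -> 1 forces a = 1; (1,0) -> 0 forces p1 = -1; (0,1) -> 0 forces p2 = -1.
a, p1, p2 = 1, -1, -1
assert f(1, 1) == h(1, 1) == 1
assert a + p1 * 1 + p2 * 1 != 1            # (1,1) -> 1 is violated: no separator exists
```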

Similarly, the discrete Fenchel duality fails under the naive definition of convexity. The above two examples serve to demonstrate this.
Thus, the naive approach to discrete convexity does not work, and some deep
combinatorial or discrete-mathematical considerations are needed. We are now
motivated to look at some results in the area of matroids and submodular functions,
which we hope will provide a clue for fruitful definitions of discrete convexity.

1.3 Submodular Functions and Base Polyhedra


We describe here a few results on submodular functions and base polyhedra that are relevant to our discussion in this introductory chapter; a more comprehensive treatment is given in section 4.3. Emphasis is placed on the conjugacy relationship between these two objects and the analogy to convex functions recognized in the early 1980s.
16 Chapter 1. Introduction to the Central Concepts

1.3.1 Submodular Functions


A set function⁷ ρ : 2^V → R ∪ {+∞}, which assigns a real number (or +∞) to each subset of a given finite set V, is said to be submodular if

ρ(X) + ρ(Y) ≥ ρ(X ∪ Y) + ρ(X ∩ Y)    (∀X, Y ⊆ V),    (1.13)

where it is understood that the inequality is satisfied if ρ(X) or ρ(Y) is equal to +∞. This is called the submodularity inequality. We assume, for a set function ρ in general, that ρ(∅) = 0 and ρ(V) is finite. A function μ : 2^V → R ∪ {−∞} is supermodular if −μ is submodular.
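As a concrete illustration of the submodularity inequality (a sketch, not from the text; the triangle graph is a hypothetical instance): the cut function of a graph is a classic submodular function, and the inequality can be verified exhaustively on a small ground set.

```python
from itertools import combinations

V = [0, 1, 2]
E = [(0, 1), (1, 2), (0, 2)]   # a triangle graph on V

subsets = [frozenset(c) for r in range(len(V) + 1) for c in combinations(V, r)]

def cut(X):   # number of edges crossing X: a classic submodular function
    return sum(1 for u, v in E if (u in X) != (v in X))

for X in subsets:              # rho(X) + rho(Y) >= rho(X ∪ Y) + rho(X ∩ Y)
    for Y in subsets:
        assert cut(X) + cut(Y) >= cut(X | Y) + cut(X & Y)
assert cut(frozenset()) == 0   # rho(empty) = 0 and rho(V) is finite
```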
The relationship between submodularity and convexity can be formulated in terms of the Lovász extension (also called the Choquet integral or the linear extension). For any set function ρ : 2^V → R ∪ {±∞}, the Lovász extension of ρ is a function ρ̂ : R^V → R ∪ {±∞}, a real-valued function in real variables, defined as follows.⁸
For each p ∈ R^V, we index the elements of V in nonincreasing order of the components of p; i.e., V = {v₁, v₂, ..., vₙ} and

p(v₁) ≥ p(v₂) ≥ ⋯ ≥ p(vₙ),

where⁹ n = |V|. Using the notation p_j = p(v_j) and V_j = {v₁, v₂, ..., v_j} for j = 1, ..., n, and χ_X for the characteristic vector of a subset X ⊆ V defined by

χ_X(v) = 1 (v ∈ X),    χ_X(v) = 0 (v ∈ V \ X),    (1.14)

we can write

p = Σ_{j=1}^{n−1} (p_j − p_{j+1}) χ_{V_j} + p_n χ_{V_n}.

This is an expression of p as a linear combination of the characteristic vectors of the subsets V_j. The linear interpolation of ρ according to this expression yields

ρ̂(p) = Σ_{j=1}^{n−1} (p_j − p_{j+1}) ρ(V_j) + p_n ρ(V_n),    (1.16)

which is the definition of the Lovász extension ρ̂ of ρ. Note that 0 × (±∞) = 0 in (1.16) by convention. The Lovász extension ρ̂ is indeed an extension of ρ in that ρ̂(χ_X) = ρ(X) for X ⊆ V.
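The formula (1.16) transcribes directly into code. The sketch below uses a hypothetical instance (ρ is the rank function of the uniform matroid U_{2,3}, not an example from the text) and checks both the extension property ρ̂(χ_X) = ρ(X) and midpoint convexity of ρ̂ on a small grid, consistent with Theorem 1.7.

```python
from itertools import combinations, product

V = [0, 1, 2]
subsets = [frozenset(c) for r in range(4) for c in combinations(V, r)]

def rho(X):                 # rank of the uniform matroid U_{2,3}: submodular
    return min(len(X), 2)

def lovasz(rho, p):         # formula (1.16); p is indexable by elements of V
    order = sorted(V, key=lambda v: -p[v])       # p_1 >= p_2 >= ... >= p_n
    vals = [p[v] for v in order]
    total = 0.0
    for j in range(len(order)):                  # (p_j - p_{j+1}) * rho(V_j), p_{n+1} := 0
        nxt = vals[j + 1] if j + 1 < len(order) else 0
        total += (vals[j] - nxt) * rho(frozenset(order[:j + 1]))
    return total

for X in subsets:           # extension property: rho_hat(chi_X) = rho(X)
    chi = tuple(1 if v in X else 0 for v in V)
    assert lovasz(rho, chi) == rho(X)

for p in product(range(-1, 2), repeat=3):        # midpoint convexity of rho_hat
    for q in product(range(-1, 2), repeat=3):
        mid = tuple((x + y) / 2 for x, y in zip(p, q))
        assert 2 * lovasz(rho, mid) <= lovasz(rho, p) + lovasz(rho, q) + 1e-9
```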
The relationship between submodularity and convexity reads as follows.¹⁰

⁷The notation 2^V means the set of all subsets of V (the power set of V). Hence, X ∈ 2^V is equivalent to saying that X is a subset of V.
⁸The notation R^V means the real vector space with coordinates indexed by the elements of V. If V consists of n elements, then R^V may be identified with R^n. In the original definition, ρ̂(p) is defined only for nonnegative vectors p.
⁹The notation |V| means the number of elements of V.
¹⁰The proofs of Theorems 1.7 and 1.8 are given in Chapter 4, when we come to their rigorous treatments in Theorems 4.16 and 4.17.

Theorem 1.7 (Lovász). A set function ρ is submodular if and only if its Lovász extension ρ̂ is convex.

Duality for a pair of submodular/supermodular functions is formulated in the following discrete separation theorem. We use the notation

x(X) = Σ_{v∈X} x(v)

for a vector x = (x(v) | v ∈ V) ∈ R^V and a subset X ⊆ V.

Theorem 1.8 (Frank's discrete separation theorem). Let ρ : 2^V → R ∪ {+∞} and μ : 2^V → R ∪ {−∞} be submodular and supermodular functions, respectively, with ρ(∅) = μ(∅) = 0, ρ(V) < +∞, and μ(V) > −∞. If

ρ(X) ≥ μ(X)    (∀X ⊆ V),

there exists x* ∈ R^V such that

ρ(X) ≥ x*(X) ≥ μ(X)    (∀X ⊆ V).    (1.17)

Moreover, if ρ and μ are integer valued, the vector x* can be chosen to be integer valued.
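The integral assertion can be checked by brute force on a small instance. The sketch below uses hypothetical functions ρ (a matroid rank, submodular) and μ (supermodular) with ρ ≥ μ, and exhibits an integral separating vector x*.

```python
from itertools import combinations, product

V = [0, 1, 2]
subsets = [frozenset(c) for r in range(4) for c in combinations(V, r)]

rho = lambda X: min(len(X), 2)        # submodular (matroid rank), rho(empty) = 0
mu  = lambda X: max(0, len(X) - 1)    # supermodular, mu(empty) = 0

assert all(mu(X) <= rho(X) for X in subsets)   # hypothesis: rho >= mu

def val(x, X):                                 # x(X) = sum of components over X
    return sum(x[v] for v in X)

integral = [x for x in product(range(-1, 3), repeat=3)
            if all(mu(X) <= val(x, X) <= rho(X) for X in subsets)]
assert (1, 1, 0) in integral                   # an integral separating x* exists
```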

Let us elaborate on this theorem in reference to the separation theorem in convex analysis. Let ρ̂ and μ̂ be the Lovász extensions of ρ and μ, respectively. We have ρ̂ ≥ μ̂ on the nonnegative orthant R₊^V by the assumption ρ ≥ μ as well as the definition (1.16) of the Lovász extension. Define functions g and k by g = ρ̂ and k = μ̂ on R₊^V and g = +∞ and k = −∞ elsewhere. Then g is convex and k is concave, by Theorem 1.7, and the separation theorem in convex analysis (Theorem 1.3) applies to the pair of g and k, yielding β* ∈ R and x* ∈ R^V such that

g(p) ≥ β* + ⟨p, x*⟩ ≥ k(p)    (∀p ∈ R^V).

This inequality for p = χ_X yields the inequality (1.17) above, where β* = 0 follows from g(0) = ρ(∅) = 0 and k(0) = μ(∅) = 0. Thus, the first half of the discrete separation theorem, the existence of a real vector x*, can be proved on the basis of the separation theorem in convex analysis and the relationship between submodularity and convexity. The combinatorial essence of the above theorem, therefore, consists of the second half, claiming the existence of an integer vector for integer-valued functions. Hence, we have the accepted understanding

Duality for submodular functions = Convexity + Discreteness,

mentioned in section 1.1.1.
We denote by 𝒮 = 𝒮[Z] the class of integer-valued submodular set functions and by ₀ℒ = ₀ℒ[Z → Z] that of discrete functions obtained as the restriction to Z^V of the Lovász extension of some member of 𝒮. That is, ₀ℒ consists of functions g : Z^V → Z ∪ {+∞} such that g(p) = ρ̂(p) (∀p ∈ Z^V) for some ρ ∈ 𝒮. In view of the above theorems, ₀ℒ is a promising class of discrete convex functions. This is indeed true, as we will see in section 1.4.1.

1.3.2 Base Polyhedra


A submodular function ρ : 2^V → R ∪ {+∞} is associated with a polyhedron B(ρ), called the base polyhedron, defined by

B(ρ) = {x ∈ R^V | x(X) ≤ ρ(X) (∀X ⊆ V), x(V) = ρ(V)}.

We are particularly interested in the case of integer-valued ρ, for which the base polyhedron is integral in the sense of

B(ρ) = \overline{B(ρ) ∩ Z^V},

where the overline designates the convex hull¹¹ in R^V. This integrality means, in particular, that all the vertices of the polyhedron B(ρ) are integer points. In this integral case, we refer to B(ρ) as the integral base polyhedron associated with ρ.
Assuming the integrality of ρ, we consider a discrete set

B = B(ρ) ∩ Z^V,

the set of integer points contained in the integral base polyhedron B(ρ). If integer-valued submodular functions can be viewed as well-behaved discrete convex functions, there is a fair chance of such discrete sets B being well-behaved discrete convex sets. This is indeed the case in many senses, as we will see in Chapter 4.
Here we focus on an axiomatic characterization of such a B that makes no explicit reference to the defining submodular function ρ. Denoting the positive support and the negative support of a vector x = (x(v) | v ∈ V) ∈ Z^V by

supp⁺(x) = {v ∈ V | x(v) > 0},    supp⁻(x) = {v ∈ V | x(v) < 0},

we consider a simultaneous exchange property for a nonempty set B ⊆ Z^V:

(B-EXC[Z]) For x, y ∈ B and u ∈ supp⁺(x − y), there exists v ∈ supp⁻(x − y) such that x − χ_u + χ_v ∈ B and y + χ_u − χ_v ∈ B,

where χ_u is the characteristic vector of u ∈ V; i.e., χ_u = χ_{u} in the notation of (1.14). See Fig. 1.7 for an illustration of this exchange property.
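The axiom can be verified exhaustively for a small base polyhedron. In the sketch below, ρ is the (hypothetical) rank function of the uniform matroid U_{2,3}, whose base polyhedron has the integer points {(1,1,0), (1,0,1), (0,1,1)}.

```python
from itertools import combinations, product

V = [0, 1, 2]
subsets = [frozenset(c) for r in range(4) for c in combinations(V, r)]
rho = lambda X: min(len(X), 2)        # integer-valued submodular, rho(V) = 2

def val(x, X):
    return sum(x[v] for v in X)

# B = B(rho) ∩ Z^V: integer points with x(X) <= rho(X) for all X and x(V) = rho(V)
B = {x for x in product(range(-2, 4), repeat=3)
     if val(x, V) == rho(frozenset(V)) and all(val(x, X) <= rho(X) for X in subsets)}
assert B == {(1, 1, 0), (1, 0, 1), (0, 1, 1)}   # bases of U_{2,3}

def swap(z, u, v):                    # z - chi_u + chi_v
    z = list(z); z[u] -= 1; z[v] += 1; return tuple(z)

for x in B:                           # (B-EXC[Z]) holds for every x, y, u
    for y in B:
        for u in range(3):
            if x[u] - y[u] > 0:
                assert any(x[v] - y[v] < 0 and swap(x, u, v) in B and swap(y, v, u) in B
                           for v in range(3))
```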
The following is a fundamental theorem connecting submodularity and exchangeability.¹²

Theorem 1.9. The class of integer-valued submodular functions ρ : 2^V → Z ∪ {+∞} with ρ(∅) = 0 and ρ(V) < +∞ and the class of nonempty subsets B ⊆ Z^V satisfying (B-EXC[Z]) are in one-to-one correspondence through the mutually inverse mappings

ρ ↦ B = {x ∈ Z^V | x(X) ≤ ρ(X) (∀X ⊆ V), x(V) = ρ(V)},
B ↦ ρ with ρ(X) = sup{x(X) | x ∈ B}    (X ⊆ V).
¹¹The convex hull of a set means the smallest convex set containing the set.
¹²The proofs of Theorems 1.9, 1.10, 1.11, and 1.12 are given later when we come to their rigorous or more general treatments in Theorems 4.15, 8.12, 6.26, and 4.18.

Figure 1.7. Exchange property (B-EXC[Z]).

The relationship between submodularity and exchangeability, stated in Theorem 1.9 above, can be reformulated as a conjugacy with respect to the discrete Legendre-Fenchel transformation (1.9). This reformulation establishes a connection to convex analysis.
Let M₀[Z] denote the class of nonempty sets B satisfying the exchange axiom (B-EXC[Z]) and ℳ₀[Z] be the class of the indicator functions δ_B of B ∈ M₀[Z]; i.e.,

ℳ₀[Z] = {δ_B | B ∈ M₀[Z]},

where δ_B : Z^V → {0, +∞} is defined by

δ_B(x) = 0 (x ∈ B),    δ_B(x) = +∞ (x ∉ B).

Recall also the notation ₀ℒ[Z → Z] for the class of the restrictions to Z^V of the Lovász extensions of integer-valued submodular set functions. Then Theorem 1.9 can be rewritten as follows.

Theorem 1.10. Two classes of discrete functions, ₀ℒ = ₀ℒ[Z → Z] and ℳ₀ = ℳ₀[Z], are in one-to-one correspondence under the discrete Legendre-Fenchel transformation (1.9). That is, for g ∈ ₀ℒ and f ∈ ℳ₀, we have g• ∈ ℳ₀, f• ∈ ₀ℒ, g•• = g, and f•• = f.

The conjugacy relationship between submodularity and exchangeability set forth in the above theorem will be fully generalized to the conjugacy between L-convexity and M-convexity in the present theory, as will be described in section 1.4.3.
Fundamental optimization problems on base polyhedra are tractable even under integrality constraints. We consider two representative problems here:

1. the optimal base problem, to discuss the issue of local vs. global optimality, and

2. the (unweighted) intersection problem, to show a min-max duality theorem with discreteness assertion.
The two optimization problems on matroids mentioned in section 1.1.1 are special cases of the above problems. This is because the base family of a matroid can be identified, through characteristic vectors of bases, with a nonempty set B of {0, 1}-vectors having the exchange property (B-EXC[Z]).
Let B ⊆ Z^V be a nonempty set satisfying the exchange axiom (B-EXC[Z]) and c ∈ R^V be a given cost (weight) vector. The optimal base problem is to find x ∈ B that minimizes the cost f(x) = ⟨c, x⟩ = Σ_{v∈V} c(v)x(v). This problem admits the following local optimality criterion for global optimality.¹³

Theorem 1.11. Assume B ⊆ Z^V satisfies (B-EXC[Z]). A point x ∈ B minimizes f(x) = ⟨c, x⟩ over B if and only if f(x) ≤ f(x − χ_u + χ_v) for all u, v ∈ V such that x − χ_u + χ_v ∈ B.
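Theorem 1.11 can be confirmed by brute force on a small hypothetical instance: the set B below satisfies (B-EXC[Z]) (it is the base set of the uniform matroid U_{2,3}), and c is an arbitrary cost vector.

```python
B = {(1, 1, 0), (1, 0, 1), (0, 1, 1)}   # an M-convex set (satisfies B-EXC[Z])
c = (3, 1, 2)                           # a hypothetical cost vector

def cost(x):
    return sum(ci * xi for ci, xi in zip(c, x))

def swap(z, u, v):                      # z - chi_u + chi_v
    z = list(z); z[u] -= 1; z[v] += 1; return tuple(z)

best = min(cost(x) for x in B)
for x in B:
    local = all(cost(x) <= cost(swap(x, u, v))
                for u in range(3) for v in range(3) if swap(x, u, v) in B)
    assert local == (cost(x) == best)   # one-exchange local optimality <=> global
```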

To describe the intersection problem we need to introduce another polyhedron, called the submodular polyhedron, associated with a submodular function ρ : 2^V → R ∪ {+∞}:

P(ρ) = {x ∈ R^V | x(X) ≤ ρ(X) (∀X ⊆ V)}.

Given a pair of submodular functions ρ₁ and ρ₂ defined on a common ground set V, the intersection problem is to find a vector x in P(ρ₁) ∩ P(ρ₂) that maximizes the sum of the components x(V). Edmonds's intersection theorem below shows a min-max duality relation in this problem.

Theorem 1.12 (Edmonds's intersection theorem). Let ρ₁, ρ₂ : 2^V → R ∪ {+∞} be submodular functions with ρ₁(∅) = ρ₂(∅) = 0, ρ₁(V) < +∞, and ρ₂(V) < +∞. Then

max{x(V) | x ∈ P(ρ₁) ∩ P(ρ₂)} = min{ρ₁(X) + ρ₂(V \ X) | X ⊆ V}.    (1.24)

Moreover, if ρ₁ and ρ₂ are integer valued, the polyhedron P(ρ₁) ∩ P(ρ₂) is integral in the sense of

P(ρ₁) ∩ P(ρ₂) = \overline{P(ρ₁) ∩ P(ρ₂) ∩ Z^V},

and there exists an integer-valued vector x* that attains the maximum on the left-hand side of (1.24).
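Both sides of the min-max relation (1.24) can be computed by brute force on a small hypothetical instance (ρ₁ a matroid rank, ρ₂ the cardinality function):

```python
from itertools import combinations, product

V = [0, 1, 2]
subsets = [frozenset(c) for r in range(4) for c in combinations(V, r)]
rho1 = lambda X: min(len(X), 2)   # submodular
rho2 = lambda X: len(X)           # submodular (modular, in fact)

def val(x, X):
    return sum(x[v] for v in X)

feasible = [x for x in product(range(-1, 3), repeat=3)
            if all(val(x, X) <= min(rho1(X), rho2(X)) for X in subsets)]
primal = max(val(x, V) for x in feasible)                       # integer maximum of x(V)
dual = min(rho1(X) + rho2(frozenset(V) - X) for X in subsets)   # minimum over subsets
assert primal == dual == 2       # min-max equality, attained by an integral vector
```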

Discreteness is twofold in Edmonds's intersection theorem. First, the minimum on the right-hand side of (1.24) is taken over combinatorial objects, i.e., subsets of V, independently of whether the submodular functions are integer valued or not. Second, the maximum can be taken over discrete (integer) points in the case of integer-valued submodular functions. The former is sometimes referred to as the dual integrality and the latter as the primal integrality.
¹³This is a generalization of a well-known optimality criterion for the minimum spanning tree problem: a spanning tree is optimal if and only if no improvement is possible by exchanging arcs in and out of the tree. Details are given in Example 6.27.

In sections 1.4.2 and 1.4.4, the exchange property (B-EXC[Z]) is generalized to define the concept of M-convex functions and, accordingly, Edmonds's intersection theorem is generalized to the Fenchel-type duality theorem for M-convex functions.

1.4 Discrete Convex Functions


The backbone of the theory of discrete convex analysis is outlined in this section as a
quick preview of the main structural results to be presented in subsequent chapters.
The definitions of L-convex and M-convex functions are given, together with concise
descriptions of their major properties, including local optimality criteria for global
optimality, conjugacy between L-convexity and M-convexity, and various forms of
duality theorems.
We use the notation

dom_Z f = {x ∈ Z^V | −∞ < f(x) < +∞},    dom_R g = {p ∈ R^V | −∞ < g(p) < +∞}

for the effective domains of f : Z^V → R ∪ {±∞} and g : R^V → R ∪ {±∞}.

1.4.1 L-Convex Functions


The first kind of discrete convex functions, L-convex functions, is obtained from a generalization of the Lovász extension of submodular set functions.
Let ρ : 2^V → R ∪ {+∞} be a submodular set function and ρ̂ be its Lovász extension, which is indeed an extension of ρ in the sense that ρ̂(χ_X) = ρ(X) for X ⊆ V. The submodularity of ρ on 2^V, or that of ρ̂ on {0, 1}^V, extends to the entire space. In fact, it can be shown¹⁴ that g = ρ̂ satisfies

g(p) + g(q) ≥ g(p ∨ q) + g(p ∧ q)    (∀p, q ∈ R^V),    (1.27)

where p ∨ q and p ∧ q are, respectively, the vectors of componentwise maxima and minima of p and q; i.e.,

(p ∨ q)(v) = max(p(v), q(v)),    (p ∧ q)(v) = min(p(v), q(v))    (v ∈ V).
Note that the submodularity inequality (1.13) for ρ is a special case of (1.27) with p = χ_X and q = χ_Y because of the identities

χ_X ∨ χ_Y = χ_{X∪Y},    χ_X ∧ χ_Y = χ_{X∩Y}.

It also follows immediately from the definition (1.16) that

g(p + λ1) = g(p) + λr    (∀p ∈ R^V, ∀λ ∈ R)    (1.30)

for r = ρ(V), where 1 = (1, 1, ..., 1) ∈ R^V. This shows the linearity of g with respect to the translation of p in the direction of 1.

Figure 1.8. Definition of L-convexity.

The properties (1.27) and (1.30) of the Lovász extension of a submodular set function are discretized to the following definition of L-convex functions.
We say that a function g : Z^V → R ∪ {+∞} with dom_Z g ≠ ∅ is L-convex if it satisfies¹⁵,¹⁶

(SBF[Z]) g(p) + g(q) ≥ g(p ∨ q) + g(p ∧ q)    (∀p, q ∈ Z^V),
(TRF[Z]) ∃r ∈ R such that g(p + 1) = g(p) + r    (∀p ∈ Z^V).

¹⁴Proofs of the claims in this subsection are given in Chapter 7.
Naturally, a function k is said to be L-concave if −k is L-convex.
Figure 1.8 illustrates, in the case of n = 2, how properties (SBF[Z]) and (TRF[Z]) together can serve as a discrete analogue of convexity. By (SBF[Z]) and (TRF[Z]) we obtain

g(p) + g(q) ≥ g(p̂) + g(q̂)

for certain points p̂ and q̂ (shown in Fig. 1.8) that are discrete approximations to the midpoint (p + q)/2. This inequality may be thought of as a discrete approximation to the midpoint convexity (1.3). We return to midpoint convexity in (1.33) below.
It follows from (SBF[Z]) and (TRF[Z]) that the effective domain, say, D, of an L-convex function satisfies¹⁷

(SBS[Z]) p, q ∈ D ⟹ p ∨ q, p ∧ q ∈ D,
(TRS[Z]) p ∈ D ⟹ p + 1 ∈ D, p − 1 ∈ D.

A nonempty set D ⊆ Z^V is called L-convex if it satisfies (SBS[Z]) and (TRS[Z]) above. Obviously, a set D is L-convex if and only if its indicator function δ_D is an L-convex function.
Since an L-convex function g is linear in the direction of 1, we may dispense with this direction as far as we are interested in its nonlinear behavior. Namely, instead of the function g in n = |V| variables, we may consider a function g′ in n − 1 variables defined by

g′(p′) = g(0, p′)    (p′ ∈ Z^{V′}),    (1.31)
¹⁵SBF stands for submodularity for functions and TRF for translation for functions.
¹⁶The notation ∃ means "there exists" or "for some," in contrast to ∀ meaning "for all" or "for any."
¹⁷SBS stands for submodularity for sets and TRS for translation for sets.

Figure 1.9. Discrete midpoint convexity.

where, for an arbitrarily fixed element v₀ ∈ V, a vector p ∈ Z^V is represented as p = (p₀, p′), with p₀ = p(v₀) ∈ Z and p′ ∈ Z^{V′} for V′ = V \ {v₀}. Note that the effective domain dom_Z g′ of g′ is the restriction of dom_Z g to the coordinate plane defined by p₀ = 0. A function g′ derived from an L-convex function by such a restriction is called an L♮-convex¹⁸ function.
More formally, an L♮-convex function is defined as follows. Let 0 denote a new element not in V and put Ṽ = {0} ∪ V. A function g : Z^V → R ∪ {+∞} is called L♮-convex if the function g̃ : Z^Ṽ → R ∪ {+∞} defined by

g̃(p₀, p) = g(p − p₀1)    (p₀ ∈ Z, p ∈ Z^V)
is L-convex. It turns out that L♮-convexity can be characterized by a kind of generalized submodularity:

(SBF♮[Z]) g(p) + g(q) ≥ g((p − α1) ∨ q) + g(p ∧ (q + α1))    (∀p, q ∈ Z^V, ∀α ∈ Z₊),

which we name translation submodularity. Note that this inequality for α = 0 coincides with the original submodularity (SBF[Z]).
An alternative characterization of L♮-convexity is by discrete midpoint convexity (see Fig. 1.9):

g(p) + g(q) ≥ g(⌈(p + q)/2⌉) + g(⌊(p + q)/2⌋)    (∀p, q ∈ Z^V),    (1.33)

where ⌈(p + q)/2⌉ and ⌊(p + q)/2⌋ denote, respectively, the integer vectors obtained from (p + q)/2 by componentwise round-up and round-down to the nearest integers. Discrete midpoint convexity is a natural approximation to the midpoint convexity (1.3) of ordinary convex functions.
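Discrete midpoint convexity can be checked exhaustively for a concrete function. The sketch below uses the hypothetical example g(p) = Σ_{i<j} |p_i − p_j|, a sum of convex functions of coordinate differences (such functions are L-convex, hence L♮-convex), and verifies both (SBF[Z]) and (1.33) on a grid.

```python
from itertools import product
from math import ceil, floor

def g(p):   # sum of convex functions of differences: L-convex, hence L♮-convex
    n = len(p)
    return sum(abs(p[i] - p[j]) for i in range(n) for j in range(i + 1, n))

grid = list(product(range(-2, 3), repeat=3))
for p in grid:
    for q in grid:
        jo = tuple(max(a, b) for a, b in zip(p, q))
        me = tuple(min(a, b) for a, b in zip(p, q))
        assert g(p) + g(q) >= g(jo) + g(me)     # submodularity (SBF[Z])
        up = tuple(ceil((a + b) / 2) for a, b in zip(p, q))
        dn = tuple(floor((a + b) / 2) for a, b in zip(p, q))
        assert g(p) + g(q) >= g(up) + g(dn)     # discrete midpoint convexity (1.33)
```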
Whereas L♮-convex functions are conceptually equivalent to L-convex functions, the class of L♮-convex functions is strictly larger than that of L-convex functions. In fact, it is easy to derive the translation submodularity (SBF♮[Z]) from (SBF[Z]) and (TRF[Z]) or, more intuitively, a comparison of Figs. 1.9 and 1.8 indicates this. The simplest example of an L♮-convex function that is not L-convex is the one-dimensional discrete convex function depicted in Fig. 1.6 (left).
¹⁸"L♮-convex" should be read "L-natural-convex."

L-convex functions enjoy the following nice properties that are expected of discrete convex functions.

• An L-convex function can be extended to a convex function.

• Local optimality (or minimality) guarantees global optimality. Specifically, we have the following:

— For an L-convex function g and a point p ∈ dom_Z g,
  g(p) ≤ g(q) (∀q ∈ Z^V) ⟺ r = 0 and g(p) ≤ g(p + χ_X) (∀X ⊆ V),
  where r is the constant in (TRF[Z]).

— For an L♮-convex function g and a point p ∈ dom_Z g,
  g(p) ≤ g(q) (∀q ∈ Z^V) ⟺ g(p) ≤ g(p + χ_X) and g(p) ≤ g(p − χ_X) (∀X ⊆ V).

Thus, L-convex functions are endowed with the property of discreteness in direction.

• Discrete duality, e.g., the Fenchel-type min-max duality or discrete separation, holds good. Thus, L-convex functions are endowed with the property of discreteness in value. (This will be explained in section 1.4.4.)

• Efficient algorithms can be designed for the minimization of an L-convex function and for the Fenchel-type min-max duality.
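The L♮ local optimality criterion can be confirmed by brute force for a hypothetical L♮-convex function, here separable convex terms plus convex functions of differences (an illustrative choice, not an example from the text):

```python
from itertools import combinations, product

a = (0, 1, 3)

def g(p):   # separable convex + convex functions of differences: L♮-convex
    quad = sum((pi - ai) ** 2 for pi, ai in zip(p, a))
    diff = sum(abs(p[i] - p[j]) for i in range(3) for j in range(i + 1, 3))
    return quad + diff

def shift(p, X, s):     # p + s * chi_X
    return tuple(pi + (s if i in X else 0) for i, pi in enumerate(p))

Xs = [set(c) for r in range(4) for c in combinations(range(3), r)]
window = list(product(range(-2, 6), repeat=3))
best = min(g(p) for p in window)
for p in window:
    local = all(g(p) <= g(shift(p, X, +1)) and g(p) <= g(shift(p, X, -1)) for X in Xs)
    assert local == (g(p) == best)   # local minimality in ±chi_X directions <=> global
```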

L-convexity is closely related to network flow problems such as the minimum cost flow problem and the shortest path problem. As an indication of this connection we mention that, given an integer-valued distance function¹⁹ γ on V, the set of admissible integer-valued potentials

{p ∈ Z^V | p(v) − p(u) ≤ γ(u, v) (∀u, v ∈ V)}

is an L-convex set. The converse is also true; i.e., any L-convex set has such a polyhedral description for some γ satisfying the triangle inequality

γ(u, v) + γ(v, w) ≥ γ(u, w)    (∀u, v, w ∈ V).    (1.35)
The concepts of L-/L♮-convexity can also be defined for functions in real variables through an appropriate adaptation of the conditions (SBF[Z]) and (TRF[Z]). Namely, we can define a function g : R^V → R ∪ {+∞} with dom_R g ≠ ∅ to be L-convex if

(SBF[R]) g(p) + g(q) ≥ g(p ∨ q) + g(p ∧ q)    (∀p, q ∈ R^V),
(TRF[R]) ∃r ∈ R such that g(p + λ1) = g(p) + λr    (∀p ∈ R^V, ∀λ ∈ R).
¹⁹An integer-valued distance function on V means a function γ : V × V → Z ∪ {+∞} such that γ(v, v) = 0 for all v ∈ V.

L♮-convex functions are defined as the restriction of L-convex functions, as in (1.31), and are characterized by

(SBF♮[R]) g(p) + g(q) ≥ g((p − α1) ∨ q) + g(p ∧ (q + α1))    (∀p, q ∈ R^V, ∀α ∈ R₊).
More precisely, L-convexity can be defined for closed proper convex functions.²⁰ Instead of dealing with this most general class of functions, this monograph focuses on polyhedral convex functions²¹ and quadratic functions. The Lovász extension of a submodular set function is a polyhedral L-convex function that has the additional property of being positively homogeneous. Quadratic L♮-convex functions are characterized in section 2.1.2 as quadratic forms defined by diagonally dominant symmetric M-matrices²² and hence they are equivalent to the (finite-dimensional) Dirichlet forms known in probability theory.
We conclude this section by identifying the four types of L-convex functions that we are concerned with: real-valued L-convex functions on integers, integer-valued L-convex functions on integers, real-valued polyhedral L-convex functions on reals, and quadratic L-convex functions on reals. For the first three classes we introduce the following notation:

ℒ[Z → R], ℒ[Z → Z], ℒ[R → R].

Note the inclusion

₀ℒ[Z → Z] ⊆ ℒ[Z → Z],

where ₀ℒ[Z → Z] is the notation from section 1.3.1 for the class of the restrictions to Z^V of the Lovász extensions of integer-valued submodular set functions.

1.4.2 M-Convex Functions


The second kind of discrete convex functions, M-convex functions, is obtained from a generalization of the simultaneous exchange property (B-EXC[Z]) of base polyhedra.
As a motivation for the axiom of M-convex functions, let us first observe that a convex function f : R^n → R ∪ {+∞} satisfies the inequality

f(x) + f(y) ≥ f(x + α(y − x)) + f(y − α(y − x))    (1.39)

for every α with 0 ≤ α ≤ 1. The validity of this inequality can be verified easily from the definition of a convex function by adding the inequality (1.2) for λ = α and (1.2) for λ = 1 − α.
²⁰The definition of closed proper convex function can be found in section 3.1.
²¹A polyhedral convex function is a function that can be represented as the maximum of a finite number of affine functions on a polyhedral effective domain.
²²Here is an unfortunate conflict of our notation with the standard terminology in matrix theory. M-matrices do not correspond to M-convex functions but to L-convex functions.

Figure 1.10. Property of a convex function.

The inequality (1.39) above shows that the sum of the function values evaluated at two points, x and y, does not increase if the two points approach each other by the same distance on the line segment connecting them (see Fig. 1.10). For a function defined on the discrete points Z^n, we simulate this property by moving two points along the coordinate axes rather than along the connecting line segment.
We say that a function f : Z^V → R ∪ {+∞} with dom_Z f ≠ ∅ is M-convex if it satisfies the following exchange axiom:

(M-EXC[Z]) For x, y ∈ dom_Z f and u ∈ supp⁺(x − y), there exists v ∈ supp⁻(x − y) such that

f(x) + f(y) ≥ f(x − χ_u + χ_v) + f(y + χ_u − χ_v).    (1.40)

See Fig. 1.11 for an illustration of this exchange property. The inequality (1.40) implicitly imposes the condition that x − χ_u + χ_v ∈ dom_Z f and y + χ_u − χ_v ∈ dom_Z f for the finiteness of the right-hand side. With the use of the notation

Δf(z; v, u) = f(z + χ_v − χ_u) − f(z)    (1.41)

for z ∈ dom_Z f and u, v ∈ V, the exchange axiom (M-EXC[Z]) can be expressed alternatively as follows:

max_{u∈supp⁺(x−y)} min_{v∈supp⁻(x−y)} { Δf(x; v, u) + Δf(y; u, v) } ≤ 0    (∀x, y ∈ dom_Z f),    (1.42)

where the maximum and the minimum over an empty set are −∞ and +∞, respectively. Naturally, a function h is said to be M-concave if −h is M-convex.
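The axiom (M-EXC[Z]) can be verified exhaustively for a hypothetical example: a separable convex function restricted to the hyperplane x(V) = r, a standard source of M-convex functions.

```python
from itertools import product

r = 3
dom = [x for x in product(range(-2, 6), repeat=3) if sum(x) == r]

def f(x):   # separable convex on {x : x(V) = 3}: M-convex
    return sum(xi * xi for xi in x)

def swap(z, u, v):   # z - chi_u + chi_v (stays on the hyperplane)
    z = list(z); z[u] -= 1; z[v] += 1; return tuple(z)

for x in dom:
    for y in dom:
        for u in range(3):
            if x[u] - y[u] > 0:   # u in supp+(x - y): some v must satisfy (1.40)
                assert any(x[v] - y[v] < 0 and
                           f(x) + f(y) >= f(swap(x, u, v)) + f(swap(y, v, u))
                           for v in range(3))
```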
It follows from (M-EXC[Z]) that the effective domain of an M-convex function satisfies the exchange axiom (B-EXC[Z]) that characterizes the set of integer points in an integral base polyhedron, since x − χ_u + χ_v ∈ dom_Z f and y + χ_u − χ_v ∈ dom_Z f for x, y ∈ dom_Z f in (1.40).

Figure 1.11. Exchange property in the definition of M-convexity.

In particular, the indicator function δ_B : Z^V → {0, +∞} of a set B ⊆ Z^V is M-convex if and only if B is the set of integer points in an integral base polyhedron. Accordingly, we refer to a set of integer points satisfying (B-EXC[Z]) as an M-convex set.
The effective domain of an M-convex function f, being an M-convex set, lies on a hyperplane {x ∈ R^V | x(V) = r} for some integer r and, accordingly, we may consider the projection of f along a coordinate axis. This means that, instead of the function f in n = |V| variables, we consider a function f′ in n − 1 variables defined by

f′(x′) = inf{ f(x₀, x′) | x₀ ∈ Z }    (x′ ∈ Z^{V′}),    (1.43)

where V′ = V \ {v₀} for an arbitrarily fixed element v₀ ∈ V and a vector x ∈ Z^V is represented as x = (x₀, x′) with x₀ = x(v₀) ∈ Z and x′ ∈ Z^{V′}. Note that the effective domain dom_Z f′ of f′ is the projection of dom_Z f along the chosen coordinate axis v₀. A function f′ derived from an M-convex function by such a projection is called an M♮-convex²³ function.
More formally, an M♮-convex function is defined as follows. Let 0 denote a new element not in V and put Ṽ = {0} ∪ V. A function f : Z^V → R ∪ {+∞} is called M♮-convex if the function f̃ : Z^Ṽ → R ∪ {+∞} defined by

f̃(x₀, x) = f(x) (x₀ = −x(V)),    f̃(x₀, x) = +∞ (otherwise)
is an M-convex function. It turns out²⁴ that an M♮-convex function f can be characterized by a similar exchange property:

(M♮-EXC[Z]) For x, y ∈ dom_Z f and u ∈ supp⁺(x − y),

f(x) + f(y) ≥ min{ f(x − χ_u) + f(y + χ_u),  min_{v∈supp⁻(x−y)} [ f(x − χ_u + χ_v) + f(y + χ_u − χ_v) ] }.

²³"M♮-convex" should be read "M-natural-convex."
²⁴Proofs of the claims in this subsection are given in Chapter 6.

Whereas M♮-convex functions are conceptually equivalent to M-convex functions, the class of M♮-convex functions is strictly larger than that of M-convex functions. This follows from the implication (M-EXC[Z]) ⟹ (M♮-EXC[Z]). The simplest example of an M♮-convex function that is not M-convex is the one-dimensional discrete convex function depicted in Fig. 1.6 (left).
M-convex functions enjoy the following nice properties that are expected of discrete convex functions.

• An M-convex function can be extended to a convex function.

• Local optimality (or minimality) guarantees global optimality. Specifically, we have the following:

— For an M-convex function f and a point x ∈ dom_Z f,
  f(x) ≤ f(y) (∀y ∈ Z^V) ⟺ f(x) ≤ f(x − χ_u + χ_v) (∀u, v ∈ V).
  (This is a generalization of Theorem 1.11.)

— For an M♮-convex function f and a point x ∈ dom_Z f,
  f(x) ≤ f(y) (∀y ∈ Z^V) ⟺ f(x) ≤ f(x − χ_u + χ_v) (∀u, v ∈ V ∪ {0}), where χ₀ = 0.

Thus, M-convex functions are endowed with the property of discreteness in direction.

• Discrete duality, e.g., the Fenchel-type min-max duality or discrete separation, holds good. Thus, M-convex functions are endowed with the property of discreteness in value. (This will be explained in section 1.4.4.)

• Efficient algorithms can be designed for the minimization of an M-convex function and for the Fenchel-type min-max duality.
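The one-exchange local optimality criterion can be confirmed by brute force for a hypothetical M-convex function of the separable type, f(x) = Σ (x_i − a_i)² on the hyperplane x(V) = 3:

```python
from itertools import product

a = (2, 0, 0)
dom = [x for x in product(range(-3, 7), repeat=3) if sum(x) == 3]

def f(x):   # separable convex on {x : x(V) = 3}: M-convex
    return sum((xi - ai) ** 2 for xi, ai in zip(x, a))

def swap(z, u, v):   # z - chi_u + chi_v (stays on the hyperplane)
    z = list(z); z[u] -= 1; z[v] += 1; return tuple(z)

best = min(f(x) for x in dom)
for x in dom:
    local = all(f(x) <= f(swap(x, u, v)) for u in range(3) for v in range(3))
    assert local == (f(x) == best)   # one-exchange local minimality <=> global
```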

M-convex functions are closely related to network flow problems, such as the minimum cost flow problem and the shortest path problem. As an indication of this connection we mention that the distance γ on V defined by γ(u, v) = Δf(x; v, u) for u, v ∈ V with a fixed x ∈ dom_Z f satisfies the triangle inequality (1.35). This is because the exchange property (M-EXC[Z]) applied to x̂ = x − χ_v + χ_w and ŷ = x − χ_u + χ_v, for which supp⁺(x̂ − ŷ) = {u, w} and supp⁻(x̂ − ŷ) = {v}, yields f(x̂) + f(ŷ) ≥ f(x̂ − χ_u + χ_v) + f(ŷ + χ_u − χ_v), which is equivalent to Δf(x; w, v) + Δf(x; v, u) ≥ Δf(x; w, u).
The concepts of M-/M♮-convexity can also be defined for functions in real variables through an appropriate adaptation of the exchange axiom. Namely, we can define a function f : R^V → R ∪ {+∞} with dom_R f ≠ ∅ to be M-convex if it satisfies the following exchange property:

(M-EXC[R]) For x, y ∈ dom_R f and u ∈ supp⁺(x − y), there exist v ∈ supp⁻(x − y) and a positive number α₀ ∈ R₊₊ such that

f(x) + f(y) ≥ f(x − α(χ_u − χ_v)) + f(y + α(χ_u − χ_v))

for all α ∈ R with 0 ≤ α ≤ α₀.

M♮-convex functions are defined as the projection of M-convex functions, as in (1.43), and are characterized by the following:

(M♮-EXC[R]) For x, y ∈ dom_R f and u ∈ supp⁺(x − y), there exist v ∈ supp⁻(x − y) ∪ {0} and a positive number α₀ ∈ R₊₊ such that

f(x) + f(y) ≥ f(x − α(χ_u − χ_v)) + f(y + α(χ_u − χ_v))

for all α ∈ R with 0 ≤ α ≤ α₀,

where χ₀ = 0 by convention. More precisely, M-convexity can be defined for closed proper convex functions. Instead of dealing with this most general class of functions, this monograph focuses on polyhedral convex functions and quadratic functions. Polyhedral M-convex functions are a quantitative generalization of the base polyhedra explained in section 1.3.2, whereas quadratic M-convex functions are characterized in section 2.1.3 as quadratic forms defined by the inverses of diagonally dominant symmetric M-matrices.
We conclude this section by identifying the four types of M-convex functions that we are concerned with: real-valued M-convex functions on integers, integer-valued M-convex functions on integers, real-valued polyhedral M-convex functions on reals, and quadratic M-convex functions on reals. For the first three classes we introduce the following notation:

ℳ[Z → R], ℳ[Z → Z], ℳ[R → R].

Note the inclusion

ℳ₀[Z] ⊆ ℳ[Z → Z],

where ℳ₀[Z] is the notation from section 1.3.2 for the class of indicator functions of sets of integer points contained in integral base polyhedra.

1.4.3 Conjugacy

The conjugacy relationship between L-convexity and M-convexity is a distinguishing feature of the present theory. Whereas conjugacy in ordinary convex analysis gives a symmetric one-to-one correspondence within a single class of closed proper convex functions (Theorem 1.2), the conjugacy described in this section establishes a one-to-one correspondence between two different classes of discrete functions having different combinatorial properties denoted by "L" and "M." We describe the conjugacy for integer-valued L-convex and M-convex functions on integer points, namely, for ℒ = ℒ[Z → Z] and ℳ = ℳ[Z → Z], although a similar conjugacy relationship exists between L♮-convex and M♮-convex functions and also between their polyhedral versions.
In Theorem 1.10 we saw the conjugacy between ₀ℒ = ₀ℒ[Z → Z] and ℳ₀ = ℳ₀[Z] as a reformulation of the equivalence between submodularity for set functions and exchangeability for discrete sets stated in Theorem 1.9. Since ₀ℒ and ℳ₀ are subclasses of ℒ and ℳ, respectively, we can summarize our present knowledge as

₀ℒ ⟷ ℳ₀,

where ⟷ above denotes the conjugacy with respect to the discrete Legendre-Fenchel transformation (1.9). The following theorem²⁵ shows that the conjugacy extends to a relation between ℒ and ℳ.

Theorem 1.13 (Discrete conjugacy theorem). The classes of integer-valued L-convex functions and M-convex functions, ℒ = ℒ[Z → Z] and ℳ = ℳ[Z → Z], are in one-to-one correspondence under the discrete Legendre-Fenchel transformation (1.9). That is, for g ∈ ℒ and f ∈ ℳ, we have g• ∈ ℳ, f• ∈ ℒ, g•• = g, and f•• = f.
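The conjugacy can be glimpsed numerically on a tiny hypothetical instance: the discrete Legendre-Fenchel transform of the indicator of the M-convex set B = {(1,0), (0,1)} is g(p) = max(p(1), p(2)), which satisfies (SBF[Z]) and (TRF[Z]) (with r = 1), and the biconjugate recovers the indicator. (A hedged sketch; the finite grid stands in for Z².)

```python
from itertools import product

B = {(1, 0), (0, 1)}                        # an M-convex set (bases of U_{1,2})

def f_star(p):                              # discrete Legendre transform of delta_B
    return max(p[0] * x[0] + p[1] * x[1] for x in B)

P = list(product(range(-5, 6), repeat=2))
assert all(f_star(p) == max(p) for p in P)  # f* = max(p1, p2): an L-convex function

for p in P:                                 # (SBF[Z]): submodularity of f*
    for q in P:
        jo = (max(p[0], q[0]), max(p[1], q[1]))
        me = (min(p[0], q[0]), min(p[1], q[1]))
        assert f_star(p) + f_star(q) >= f_star(jo) + f_star(me)
assert all(f_star((p[0] + 1, p[1] + 1)) == f_star(p) + 1 for p in P)  # (TRF[Z]), r = 1

def f_bidual(x):                            # sup over the grid approximates f**
    return max(p[0] * x[0] + p[1] * x[1] - f_star(p) for p in P)

assert all(f_bidual(x) == 0 for x in B)     # f** = 0 on B ...
assert f_bidual((0, 0)) >= 5                # ... and grows without bound off B
```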

The essence of the relationship between ℳ₀ and ₀ℒ is the conjugacy between M-convex sets and their support functions, the latter being positively homogeneous L-convex functions. Symmetrically, we can formulate the conjugacy between L-convex sets and their support functions in the following theorem, where we denote by ℒ₀[Z] the class of the indicator functions of L-convex sets and by ₀ℳ[Z → Z] that of positively homogeneous M-convex functions.

Theorem 1.14. Two classes of discrete functions, ℒ₀ = ℒ₀[Z] and ₀ℳ = ₀ℳ[Z → Z], are in one-to-one correspondence under the discrete Legendre-Fenchel transformation (1.9). That is, for g ∈ ℒ₀ and f ∈ ₀ℳ, we have g• ∈ ₀ℳ, f• ∈ ℒ₀, g•• = g, and f•• = f.

Just as a positively homogeneous L-convex function can be identified with a submodular set function, so can a positively homogeneous M-convex function f be identified with a distance function γ on V satisfying the triangle inequality (1.35). The correspondence is given by

γ(u, v) = f(χ_v − χ_u)    (u, v ∈ V),

which establishes a one-to-one mapping between ₀ℳ and 𝒯, where 𝒯 = 𝒯[Z] denotes the class of distance functions on V satisfying the triangle inequality (1.35).
Figure 1.12 demonstrates the conjugacy relations as well as the one-to-one correspondences explained in the above. This diagram clarifies the relationship among various classes of combinatorial objects, including submodular functions (𝒮), distance functions (𝒯), and base polyhedra (ℳ₀). It is recalled again that the conjugacy is defined by the (discrete) Legendre-Fenchel transformation (1.9).

[Figure 1.12 arranges the classes in two columns: on the M side, M-convex functions, positively homogeneous M-convex functions (identified with distance functions, Theorem 6.59), and M-convex sets, i.e., base polyhedra (identified with submodular set functions, Theorem 4.15); on the L side, L-convex functions, L-convex sets (corresponding to distance functions, Theorem 5.5), and positively homogeneous L-convex functions (corresponding to submodular set functions, Theorem 7.40). Conjugacy pairs each M-class with the opposite L-class.]

Figure 1.12. Conjugacy in discrete convexity.

²⁵The proofs of Theorems 1.13 and 1.14 are given in Theorem 8.12 and (8.17), respectively.
The pair of L- and M-convexity prevails in discrete systems.

• In network flow problems, flow and tension are dual objects. Roughly speaking, flow corresponds to M-convexity and tension to L-convexity. Namely,

tension : L ⟷ M : flow.

In multiterminal electrical networks consisting of nonlinear resistors, the equilibrium state can be characterized as a stationary point of a convex function representing the energy (or power). The function is M-convex when expressed in terms of the terminal current supplied by current sources. It is L-convex when expressed in terms of the terminal voltage (or potential) specified by voltage sources. Network flow problems are discussed in section 2.2 and Chapter 9.

• In a matroid, the rank function corresponds to L-convexity and the base family to M-convexity:

rank function : L ⟷ M : base family.

In a valuated matroid, the valuation of bases is an M-concave function defined on the unit cube {0, 1}^V. Matroids and valuated matroids are explained in section 2.4.

• The concept of M-matrices corresponds to L-convexity. Specifically, a quadratic function is L♮-convex if and only if it is defined by a diagonally dominant symmetric M-matrix. The inverse of such a matrix corresponds to M-convexity. A diagonally dominant symmetric M-matrix arises, for instance, from a discretization of the Poisson problem of partial differential equations, where the matrix is an approximation to the differential operator (Laplacian) and its inverse corresponds to the Green function. Hence

differential operator : L ⟷ M : Green function.

Dirichlet forms in probability theory are exactly the same as L♮-convex quadratic
functions. These quadratic forms are discussed in section 2.1.

1.4.4 Duality
Duality theorems for L- and M-convex functions are stated here in the case of
integer-valued functions defined on integer points.26 We explain their significance in
relation to previous results, such as Frank's discrete separation theorem for submod-
ular/supermodular functions, Edmonds's intersection theorem, and Frank's weight-
splitting theorem for the weighted matroid intersection problem.
Recall from section 1.2 the generic form of a Fenchel-type min-max duality
theorem:
[Discrete Fenchel-type duality theorem] Let f : Z^V → Z ∪ {+∞} and
h : Z^V → Z ∪ {−∞} be convex and concave functions, respectively (in
an appropriate sense). Then

inf{f(x) − h(x) | x ∈ Z^V} = sup{h°(p) − f•(p) | p ∈ Z^V}.

We can now specify the meaning of convexity left open in this generic statement by
L-convexity or M-convexity. Then the following theorems result.

Theorem 1.15 (Fenchel-type duality for L-convex functions). Let g : Z^V →
Z ∪ {+∞} be an L♮-convex function and k : Z^V → Z ∪ {−∞} be an L♮-concave
function such that dom_Z g ∩ dom_Z k ≠ ∅ or dom_Z g• ∩ dom_Z k° ≠ ∅. Then we have

inf{g(p) − k(p) | p ∈ Z^V} = sup{k°(x) − g•(x) | x ∈ Z^V}.   (1.49)

If this common value is finite, the infimum is attained by some p ∈ dom_Z g ∩ dom_Z k
and the supremum is attained by some x ∈ dom_Z g• ∩ dom_Z k°.

Theorem 1.16 (Fenchel-type duality for M-convex functions). Let f : Z^V →
Z ∪ {+∞} be an M♮-convex function and h : Z^V → Z ∪ {−∞} be an M♮-concave
function such that dom_Z f ∩ dom_Z h ≠ ∅ or dom_Z f• ∩ dom_Z h° ≠ ∅. Then we have

inf{f(x) − h(x) | x ∈ Z^V} = sup{h°(p) − f•(p) | p ∈ Z^V}.   (1.50)

If this common value is finite, the infimum is attained by some x ∈ dom_Z f ∩ dom_Z h
and the supremum is attained by some p ∈ dom_Z f• ∩ dom_Z h°.

Although the above theorems look different, they are actually the same theo-
rem if we assume the conjugacy between L♮-convex functions and M♮-convex func-
tions (a variant of Theorem 1.13). In fact, substitution of g = f• and k = h° in
(1.49) yields

inf{f•(p) − h°(p) | p ∈ Z^V} = sup{(h°)°(x) − (f•)•(x) | x ∈ Z^V},

which is equivalent to (1.50) by (f•)• = f and (h°)° = h. Thus, the Fenchel-type
min-max theorem is self-conjugate.

²⁶The proofs of Theorems 1.15, 1.16, 1.17, 1.18, 1.23, and 1.24 are given in Theorems 8.21, 8.21,
8.16, 8.15, 5.9, and 4.21, respectively.
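The min-max relation can be verified numerically for small one-dimensional instances. The sketch below uses hypothetical choices f(x) = x² (convex) and h(x) = −|x − 1| (concave), with a finite search range standing in for Z, and computes the discrete conjugates by brute force.

```python
# Brute-force check of the discrete Fenchel-type duality in one dimension,
# with hypothetical f(x) = x^2 and h(x) = -|x - 1|; the finite range X
# stands in for Z (large enough that truncation does not affect the optima).
X = range(-30, 31)

def f(x):
    return x * x

def h(x):
    return -abs(x - 1)

def f_conj(p):           # convex conjugate: sup { p*x - f(x) }
    return max(p * x - f(x) for x in X)

def h_conc_conj(p):      # concave conjugate: inf { p*x - h(x) }
    return min(p * x - h(x) for x in X)

primal = min(f(x) - h(x) for x in X)
dual = max(h_conc_conj(p) - f_conj(p) for p in range(-5, 6))
assert primal == dual == 1
```

Here both sides equal 1, attained at x = 0 or 1 in the infimum and at p = 1 in the supremum.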
Next we turn to discrete separation theorems. Recall, again from section 1.2,
the generic form of a discrete separation theorem:

[Discrete separation theorem] Let f : Z^V → Z ∪ {+∞} and h : Z^V → Z ∪
{−∞} be convex and concave functions, respectively (in an appropriate
sense). If f(x) ≥ h(x) for all x ∈ Z^V,
there exist α* ∈ Z and p* ∈ Z^V such that

f(x) ≥ α* + ⟨p*, x⟩ ≥ h(x)   (∀x ∈ Z^V).
We can substitute L-convexity or M-convexity for convexity in this generic statement


to obtain a conjugate pair of discrete separation theorems.

Theorem 1.17 (L-separation theorem). Let g : Z^V → Z ∪ {+∞} be an L♮-
convex function and k : Z^V → Z ∪ {−∞} be an L♮-concave function such that
dom_Z g ∩ dom_Z k ≠ ∅ or dom_Z g• ∩ dom_Z k° ≠ ∅. If g(p) ≥ k(p) (∀p ∈ Z^V), there
exist β* ∈ Z and x* ∈ Z^V such that

g(p) ≥ β* + ⟨p, x*⟩ ≥ k(p)   (∀p ∈ Z^V).   (1.51)

Theorem 1.18 (M-separation theorem). Let f : Z^V → Z ∪ {+∞} be an M♮-
convex function and h : Z^V → Z ∪ {−∞} be an M♮-concave function such that
dom_Z f ∩ dom_Z h ≠ ∅ or dom_Z f• ∩ dom_Z h° ≠ ∅. If f(x) ≥ h(x) (∀x ∈ Z^V), there
exist α* ∈ Z and p* ∈ Z^V such that

f(x) ≥ α* + ⟨p*, x⟩ ≥ h(x)   (∀x ∈ Z^V).   (1.52)

These duality theorems include a number of previous important results as
special cases. We demonstrate this for Frank's discrete separation theorem for
submodular/supermodular functions, Edmonds's intersection theorem, and Frank's
weight-splitting theorem for the weighted matroid intersection problem.

Example 1.19. Frank's discrete separation theorem (Theorem 1.8) in the in-
tegral case can be derived from the L-separation theorem (Theorem 1.17). The
submodular and supermodular functions ρ and μ can be identified, respectively,
with an L♮-convex function g : Z^V → Z ∪ {+∞} and an L♮-concave function
k : Z^V → Z ∪ {−∞} by ρ(X) = g(χ_X) and μ(X) = k(χ_X) for X ⊆ V, where
dom_Z g ⊆ {0,1}^V and dom_Z k ⊆ {0,1}^V. The L-separation theorem applies, since
the first assumption, dom_Z g ∩ dom_Z k ≠ ∅, is met by g(0) = k(0) = 0, which follows
from ρ(∅) = μ(∅) = 0. We see β* = 0 from the inequality (1.51) for p = 0, and then
the desired inequality (1.17) is obtained from (1.51) with p = χ_X for X ⊆ V. ■

Example 1.20. Edmonds's intersection theorem (Theorem 1.12) in the integral
case,

max{x(V) | x ∈ P(ρ₁) ∩ P(ρ₂) ∩ Z^V} = min{ρ₁(X) + ρ₂(V \ X) | X ⊆ V},   (1.53)

can be derived from the Fenchel-type duality theorem for M-convex functions (The-
orem 1.16). Define f(x) = δ₁(x) and h(x) = ⟨1, x⟩ − δ₂(x) by using the indicator
functions δ_i(x) of P(ρ_i) ∩ Z^V (i = 1, 2). Then f is M♮-convex and h is M♮-concave
with dom_Z f ∩ dom_Z h ≠ ∅. An easy calculation yields

f•(p) = max{⟨p, x⟩ | x ∈ P(ρ₁) ∩ Z^V},   h°(p) = −max{⟨1 − p, x⟩ | x ∈ P(ρ₂) ∩ Z^V},

which implies dom_Z f• ∩ dom_Z h° ⊆ {0,1}^V and

f•(χ_X) = ρ₁(X),   h°(χ_X) = −ρ₂(V \ X)   (X ⊆ V).

Substituting these expressions into the Fenchel-type min-max relation (1.50) yields

−max{x(V) | x ∈ P(ρ₁) ∩ P(ρ₂) ∩ Z^V} = −min{ρ₁(X) + ρ₂(V \ X) | X ⊆ V},

which is equivalent to the desired equation (1.53). ■

Example 1.21. Frank's weight-splitting theorem for the matroid intersection prob-
lem with integer weights is a special case of the M-separation theorem (Theorem
1.18). Given two matroids (V, B₁) and (V, B₂) on a common ground set V with
base families B₁ and B₂, as well as an integer-valued weight vector w : V → Z, the
optimal common base problem is to find B ∈ B₁ ∩ B₂ that minimizes the weight
w(B) = Σ_{v∈B} w(v). Frank's weight-splitting theorem says that a common base
B* ∈ B₁ ∩ B₂ is optimal if and only if there exist integer vectors w₁* and w₂* such
that
(i) w = w₁* + w₂*,
(ii) B* is a minimum-weight base of (V, B₁) with respect to w₁*, and
(iii) B* is a minimum-weight base of (V, B₂) with respect to w₂*.
The "if" part is easy and the content of this theorem lies in the assertion about the
existence of such a weight splitting.
For an optimal common base B*, define

f(x) = ⟨w, x⟩ + δ₁(x),   h(x) = w(B*) − δ₂(x),

where δ_i denotes the indicator function of {χ_B | B ∈ B_i} (i = 1, 2). These
are M-convex and M-concave, respectively (h is constant on B₂). Noting that
f(x) ≥ h(x) (x ∈ Z^V), as well as dom_Z f ∩ dom_Z h ≠ ∅, we apply the M-separation
theorem to obtain α* ∈ Z and p* ∈ Z^V for which the inequality (1.52) is true. A
weight splitting constructed by

w₁* = w − p*,   w₂* = p*

has the desired properties (i) to (iii). In fact, (1.52) with x = χ_{B*} reads w(B*) ≥
α* + p*(B*) ≥ w(B*), which shows α* = w(B*) − p*(B*) = w₁*(B*). It follows

[Figure 1.13. Duality theorems (f: M♮-convex function, h: M♮-concave function).]

[Figure 1.14. Separation for convex sets.]

also from (1.52) that w(B) ≥ α* + p*(B) for every B ∈ B₁ (namely, (ii)) and
α* + p*(B) ≥ w(B*) for every B ∈ B₂ (namely, (iii)).
Moreover, the valuated matroid intersection theorem, a generalization of the
weight-splitting theorem, can be regarded as a special case of the M-separation
theorem (to be explained in Example 8.28). ■

The relationship among duality theorems is summarized in Fig. 1.13. A deriva-
tion of Fujishige's Fenchel-type duality theorem from the Fenchel-type duality the-
orem for L-convex functions will be explained in Example 8.26.
We conclude this section with discrete separation theorems for a pair of L-
convex sets and for a pair of M-convex sets. First we recall the separation theorem
for a pair of convex sets (see Fig. 1.14).

Theorem 1.22 (Separation for convex sets). If S₁ and S₂ are disjoint convex sets
in R^V, there exists a nonzero vector p* ∈ R^V such that

inf{⟨p*, x⟩ | x ∈ S₁} ≥ sup{⟨p*, y⟩ | y ∈ S₂}.
The discrete versions of this theorem read as follows.

Theorem 1.23 (Discrete separation for L-convex sets). If D₁ and D₂ are disjoint
L-convex sets, there exists x* ∈ {−1, 0, 1}^V such that

⟨p, x*⟩ ≥ ⟨q, x*⟩ + 1   (∀p ∈ D₁, ∀q ∈ D₂).   (1.54)

Theorem 1.24 (Discrete separation for M-convex sets). If B₁ and B₂ are disjoint
M-convex sets, there exists p* ∈ {0,1}^V ∪ {0,−1}^V such that

⟨p*, x⟩ ≥ ⟨p*, y⟩ + 1   (∀x ∈ B₁, ∀y ∈ B₂).   (1.55)

Let us dwell on the content of these theorems, referring to the latter. The first
implication, explicit in the statement of Theorem 1.24, is that the separating vector
p* is so special that p* or −p* is a {0,1}-vector. The second, less conspicuous and
more subtle, is that B̄₁ ∩ B̄₂ = ∅ follows from B₁ ∩ B₂ = ∅, where B̄_i denotes the
convex hull of B_i, since (1.55) implies B̄₁ ∩ B̄₂ = ∅. The implication

B₁ ∩ B₂ = ∅  ⟹  B̄₁ ∩ B̄₂ = ∅

for a pair of discrete sets comprises an essential ingredient in a successful theory of
discrete convexity, as will be discussed in section 3.3.

1.4.5 Classes of Discrete Convex Functions


Besides L-, M-, L♮-, and M♮-convex functions, we will consider in this book some
other classes of discrete convex functions, including integrally convex functions,
L₂-convex functions, and M₂-convex functions, whose definitions are given later.
The inclusion relationships among these classes of discrete convex functions are
depicted in Fig. 1.15 for ease of reference. The properties of these discrete convex
functions with respect to various fundamental operations are summarized in Table
1.2; counterexamples for the failure of the properties can be found in Murota–
Shioura [153].

Bibliographical Notes
References for optimization abound in the literature. See, e.g., Nemhauser-Rinnooy
Kan-Todd [166] as a general handbook; Bazaraa-Sherali-Shetty [8], Bertsekas [10],
Fletcher [52], Mangasarian [126], and Nocedal-Wright [169] for nonlinear optimiza-
tion; and Cook-Cunningham-Pulleyblank-Schrijver [26], Du-Pardalos [43], Korte-
Vygen [115], Lawler [119], and Nemhauser-Wolsey [167] for combinatorial optimiza-
tion. References for convex analysis are included in the bibliographical notes at the

[Figure 1.15. Classes of discrete convex functions (M₂-convex ∩ L₂-convex
= M♮-convex ∩ L♮-convex = separable convex).]

end of Chapter 3, and those for network flow theory and matroid theory are in
Chapter 2.
Fujishige [65] is a standard reference for submodular functions, and Narayanan
[165] and Topkis [203] cover some other topics related to electrical networks and
economics, respectively. Theorem 1.7, connecting submodularity and convexity, is
due to Lovász [123], and the name "Lovász extension" was coined by Fujishige [63],
[65]. The discrete separation for submodular functions, Theorem 1.8, is due to Frank
[55]. Theorem 1.9, the equivalence between submodularity and exchangeability, is
folklore from the 1980s. Seeing that no explicit and rigorous proof can be found in
the literature, we will provide a proof in Theorem 4.15 in this book. The recasting
into Theorem 1.10 is by Murota [140]. The local optimality criterion for the linearly
weighted base problem in a matroid (Theorem 1.11) is a standard result (see, e.g.,
Corollary 8.7 of [26]). The intersection theorem, Theorem 1.12, is due to Edmonds
[44]. The weight-splitting theorem described in Example 1.21 is due to Frank [54].
M-convex functions are introduced in Murota [137], followed by L-convex func-
tions in Murota [140]. Their fundamental properties are established in Murota
[137], [140], [141], [142]; the discrete conjugacy theorem (Theorem 1.13) and the
L-separation theorem (Theorem 1.17) in [140]; the M-separation theorem (Theo-
rem 1.18) in [137], [140], [142]; and the Fenchel-type duality theorem for M-convex
functions (Theorem 1.16) in [137], [140]. The separation theorems for L-convex
sets and M-convex sets (Theorems 1.23 and 1.24) are due to [140]. M-convexity

Table 1.2. Operations for discrete convex sets and functions (f: function,
S: set; ○: Yes [cf. Theorem, Prop.], ×: No).

                 Miller's          convex       integrally    separable
                 discrete convex   extensible   convex        convex
  f₁ + f₂             ×                ○             ×             ○
  S₁ ∩ S₂             ×                ○             ×             ○
  f + sep-conv        ×                ○          ○ [3.24]         ○
  S ∩ [a, b]          ○                ○             ○             ○
  f + affine          ×                ○          ○ [3.25]         ○
  f₁ □_Z f₂           ×                ×             ×             ○
  S₁ + S₂             ×                ×             ×             ○
  f•                  ×                ○             ×             ○
  dom f               ○                ○          ○ [3.28]         ○
  arg min f           ○                ○          ○ [3.28]         ○

                 M₂-convex        L₂-convex    M♮-convex     L♮-convex
  f₁ + f₂             ×                ×        × (M₂-conv)   ○ [7.11]
  S₁ ∩ S₂             ×                ×        × (M₂-conv)   ○ [5.7]
  f + sep-conv        ○                ×          ○ [6.15]    ○ [7.11]
  S ∩ [a, b]          ○                ×             ○             ○
  f + affine          ○                ○          ○ [6.15]    ○ [7.11]
  f₁ □_Z f₂           ×                ×          ○ [6.15]   × (L₂-conv)
  S₁ + S₂             ×                ×          ○ [4.23]   × (L₂-conv)
  f•             × (L₂-conv)     × (M₂-conv)   × (L♮-conv)   × (M♮-conv)
  dom f            ○ [8.29]        ○ [8.39]       ○ [6.7]      ○ [7.8]
  arg min f        ○ [8.30]        ○ [8.40]       ○ [6.29]     ○ [7.16]

sep-conv: separable convex function, affine: affine function,
dom: effective domain (1.25), arg min: set of minimizers (3.16),
□_Z: integer infimal convolution (6.43),
f•: conjugate (1.9) of integer-valued function f

and L-convexity are investigated also for functions in real variables for polyhedral,
quadratic, and closed convex functions in Murota-Shioura [152], [155], [156], [157].
M♮-convex functions are introduced by Murota–Shioura [151] and L♮-convex
functions by Fujishige–Murota [68]. The concept of submodular integrally convex
functions, together with a characterization by discrete midpoint convexity, is due
to Favati–Tardella [49]. The equivalence of this concept to L♮-convexity is shown in
[68]. Table 1.2 is taken from Murota–Shioura [153].
Chapter 2

Convex Functions with


Combinatorial Structures

The objective of this chapter is to demonstrate how convex functions with combi-
natorial structures arise naturally from a variety of discrete systems, such as (i)
discretizations of the Poisson equation, (ii) electrical networks consisting of linear
(ohmic) and nonlinear resistors, and (iii) matrices (matroids) and polynomial matri-
ces (valuated matroids). It is emphasized that such functions are always equipped
with a pair of combinatorial properties, namely, submodularity (L-convexity) and
exchangeability (M-convexity).

2.1 Quadratic Functions


In this section we see how quadratic convex functions with combinatorial structures
arise naturally from linear discrete systems such as discretizations of the Poisson
partial differential equations and electrical networks consisting of linear (ohmic) re-
sistors. In so doing we intend to illustrate the rather vague idea of discreteness in
direction introduced in the previous chapter. In accordance with the correspondence
between quadratic functions and symmetric matrices, submodularity (L-convexity)
and exchangeability (M-convexity) for quadratic functions and their conjugate func-
tions are translated into combinatorial properties of symmetric matrices and their
inverses.

2.1.1 Convex Quadratic Functions


A quadratic form is associated with a symmetric matrix A as²⁷

f(x) = (1/2) x^T A x.   (2.1)

²⁷The notation ^T means the transpose of a vector or a matrix. In section 2.1 we denote the ith
component of a vector x by x_i instead of x(i).

Recall that a symmetric matrix A is said to be positive semidefinite if x^T A x ≥ 0
for any vector x and positive definite if x^T A x > 0 for any nonzero vector x. As is

well known, the convexity (resp., strict convexity) of f is equivalent to the positive
semidefiniteness (resp., positive definiteness) of A:

f is convex (resp., strictly convex)  ⟺  A is positive semidefinite (resp., positive definite).   (2.2)
Positive (semi)definiteness admits a number of characterizations. The first is
in terms of eigenvalues: A is positive semidefinite (resp., positive definite) if and
only if all the eigenvalues of A are nonnegative (resp., positive). Note that the
eigenvalues of a symmetric matrix are all real.


The second characterization is in terms of minors (subdeterminants). Let
N = {1, . . . , n} be the index set of rows and columns of A. For I ⊆ N and J ⊆ N
we denote by A[I, J] the submatrix of A with row indices in I and column indices in
J. A submatrix of the form A[I, I] for some I ⊆ N is called a principal submatrix
and its determinant a principal minor. A principal submatrix of the form A[I, I]
with I = {1, . . . , k} for some k (≤ n) is called a leading principal submatrix and its
determinant a leading principal minor. Then we have

A is positive semidefinite  ⟺  det A[I, I] ≥ 0 for every I ⊆ N,   (2.6)
A is positive definite  ⟺  det A[I, I] > 0 for every I ⊆ N,   (2.7)

and

A is positive definite  ⟺  det A[{1, . . . , k}, {1, . . . , k}] > 0 for k = 1, . . . , n.   (2.8)

The criterion (2.8) compares favorably with (2.7) in that there are only n leading
principal minors as opposed to 2^n principal minors. Positive (semi)definiteness can
be checked with O(n³) arithmetic operations by an algorithm similar to Gaussian
elimination.
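The leading-principal-minor criterion (2.8) is straightforward to implement. The sketch below (with an illustrative matrix not taken from the text) compares it against the eigenvalue characterization; note that (2.8) tests positive definiteness only, while semidefiniteness needs all principal minors as in (2.6).

```python
import numpy as np

def is_positive_definite(A):
    # Leading principal minor criterion (2.8): n determinants suffice.
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
assert is_positive_definite(A)
# The eigenvalue characterization agrees: all eigenvalues are positive.
assert np.all(np.linalg.eigvalsh(A) > 0)
```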
A change of the variable in (2.1), x = Sy with a nonsingular matrix S, re-
sults in another quadratic form f_S(y) = f(Sy), which is associated with another
symmetric matrix S^T A S. The convexity of a quadratic form is preserved under
such linear transformations of the variable, and the positive semidefiniteness of a
symmetric matrix also remains invariant.
The change of the variable x = Sy with a general nonsingular S, rotating the
coordinate axes, does not respect any special coordinate directions. It would be
reasonable to expect that such a general transformation should not be compatible
with any combinatorial properties relevant to discreteness in direction. Conversely,
we may regard properties of a quadratic form or of a symmetric matrix as being
combinatorial with discreteness in direction if they are not invariant with respect to
the entire class of transformations but are invariant with respect to some restricted
subclass thereof (the class of diagonal scalings, for example). A typical combinato-
rial property of this kind is the sign pattern of the entries of a matrix. This is what
we will study in the following subsection.

2.1.2 Symmetric M-Matrices


As a typical combinatorial property we consider a particular sign pattern of a sym-
metric matrix that arises naturally in applications. The main theme here is the
translation of this sign pattern of a symmetric matrix into a combinatorial property
of the quadratic form associated with it.
We consider symmetric matrices L = (l_ij | i, j = 1, . . . , n) that satisfy the
following two conditions:

l_ij ≤ 0   (i ≠ j),   (2.9)
Σ_{j=1}^n l_ij ≥ 0   (i = 1, . . . , n).   (2.10)

Note that the second condition (2.10), the diagonal dominance of L, can also be
expressed, under (2.9), as

l_ii ≥ Σ_{j≠i} |l_ij|   (i = 1, . . . , n).

Such matrices often appear in applications, as demonstrated below.

Example 2.1. Consider the Poisson equation −Δu = σ, where Δ is the Laplacian
Σ_{i=1}^d ∂²/∂x_i², σ denotes the source term, and d is the dimension of the space.²⁸ A
standard discretization scheme for this differential equation, where we assume d = 1
for illustration purposes, gives rise to a system of linear equations described by a
matrix like

        (  2  −1   0   0 )
    L = ( −1   2  −1   0 )   (2.11)
        (  0  −1   2  −1 )
        (  0   0  −1   2 ).

This matrix satisfies the two conditions (2.9) and (2.10) above.
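The two conditions are easy to check mechanically. A small sketch, building the 1-D discrete Laplacian of an arbitrarily chosen size n and verifying (2.9) and (2.10):

```python
import numpy as np

n = 6   # illustrative size
# 1-D discrete Laplacian with Dirichlet boundary: 2 on the diagonal,
# -1 on the first off-diagonals.
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

off_diagonal = L - np.diag(np.diag(L))
assert np.all(off_diagonal <= 0)     # (2.9) off-diagonal nonpositivity
assert np.all(L.sum(axis=1) >= 0)    # (2.10) nonnegative row sums
```

The interior rows have row sum 0 and the two boundary rows have row sum 1, so (2.10) holds with equality except at the boundary.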

Example 2.2. Consider the simple electrical network depicted in Fig. 2.1. It
consists of five branches (linear resistors) connected at four nodes. We denote the
conductance (the reciprocal of resistance) of branch j by g_j > 0 (j = 1, . . . , 5),
the potential at node i by p_i (i = 1, . . . , 4), the voltage across branch j by η_j
(j = 1, . . . , 5), and the current in branch j by ξ_j (j = 1, . . . , 5). The underlying
graph can be represented by the incidence matrix A,
whose rows and columns correspond, respectively, to the nodes and branches; the
jth column has entry 1 at its initial node and −1 at its terminal node. The voltage

²⁸We assume the Dirichlet boundary condition.

[Figure 2.1. Electrical network.]

vector η = (η_j | j = 1, . . . , 5) is expressed in terms of the potential vector p = (p_i |
i = 1, . . . , 4) as η = A^T p. The constitutive equation (Ohm's law) is represented as
ξ = Yη, where ξ = (ξ_j | j = 1, . . . , 5) is the current vector and Y = diag(g_j | j =
1, . . . , 5) is the conductance matrix. When a current source represented by a vector
c = (c_i | i = 1, . . . , 4) is applied, Kirchhoff's current law is described by Aξ = c.
Combining these equations yields A Y A^T p = c for an admissible potential p. The
coefficient matrix L = A Y A^T here satisfies the two conditions (2.9) and (2.10)
above. The matrix L is called the node admittance matrix. ■
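The construction L = A Y A^T can be reproduced numerically. The incidence matrix below is a hypothetical 4-node, 5-branch topology (the exact wiring of Fig. 2.1 is not reproduced here), and the conductances are arbitrary positive values.

```python
import numpy as np

# Hypothetical incidence matrix: each column has +1 at the branch's initial
# node and -1 at its terminal node (not necessarily the topology of Fig. 2.1).
A = np.array([[ 1,  1,  0,  0,  0],
              [-1,  0,  1,  1,  0],
              [ 0, -1, -1,  0,  1],
              [ 0,  0,  0, -1, -1]], dtype=float)
g = np.array([1.0, 2.0, 3.0, 4.0, 5.0])      # branch conductances g_j > 0

L = A @ np.diag(g) @ A.T                      # node admittance matrix
assert np.allclose(L, L.T)
assert np.all(L - np.diag(np.diag(L)) <= 0)   # (2.9)
assert np.all(L.sum(axis=1) >= -1e-12)        # (2.10): row sums are 0 here
```

Since each column of A sums to 0, L annihilates the all-ones vector, so the row sums are exactly 0 and (2.10) holds with equality; off-diagonal entries are nonpositive because distinct rows of A never share the same sign in a column.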

Note 2.3. A matrix L is called an M-matrix if it can be represented as L = sI − B
with a matrix B consisting of nonnegative entries and a real number s ≥ ρ(B),
where ρ(B) denotes the spectral radius (the largest modulus of an eigenvalue) of B.
A nonsingular M-matrix is characterized as a matrix whose off-diagonal entries are
all nonpositive and the entries of whose inverse matrix are all nonnegative. With
this terminology we can say that symmetric matrices with off-diagonal nonposi-
tivity and diagonal dominance considered in this section are exactly the same as
diagonally dominant symmetric M-matrices. In passing we mention the fact that
any symmetric M-matrix can be transformed into a diagonally dominant symmetric
M-matrix by a symmetric diagonal scaling. M-matrices are fundamental concepts in
control system theory (Kodama–Suda [113], Siljak [193]), numerical linear algebra
(Axelsson [6], Varga [207]), and economics. The reader is referred to Berman–
Plemmons [9] for mathematical properties of M-matrices. It is also mentioned that

a symmetric compartmental matrix (Anderson [3]) is the same as the negative of a


diagonally dominant symmetric M-matrix. •

Proposition 2.4. A symmetric matrix L with properties (2.9) and (2.10) is positive
semidefinite.

Proof. By (2.6) it suffices to show that any principal minor is nonnegative. We
prove this by induction on the size n of the matrix L. Any principal submatrix of
order ≤ n − 1 satisfies (2.9) and (2.10) and therefore its determinant is nonnegative
by the induction hypothesis. It remains to show det L ≥ 0. Partition L as

    L = ( L₁₁  L₁₂ )
        ( L₂₁  L₂₂ ),

where L₁₁ is of order n − 1 and L₂₂ = l_nn. If l_nn = 0, then l_ni = l_in = 0
(i = 1, . . . , n − 1) and therefore det L = 0. Suppose l_nn > 0 and put L̃₁₁ =
L₁₁ − L₁₂ L₂₂⁻¹ L₂₁. An off-diagonal entry l_ij − l_in l_nn⁻¹ l_nj of L̃₁₁, where 1 ≤ i ≠
j ≤ n − 1, is nonpositive by l_ij, l_in, l_nj ≤ 0 and l_nn > 0. For the ith row sum of
L̃₁₁ we have

Σ_{j=1}^{n−1} (l_ij − l_in l_nn⁻¹ l_nj) ≥ Σ_{j=1}^{n−1} l_ij + l_in = Σ_{j=1}^n l_ij ≥ 0.

Thus, L̃₁₁ satisfies (2.9) and (2.10), and, by the induction hypothesis, its determi-
nant is nonnegative. Hence, det L = l_nn · det L̃₁₁ ≥ 0. □

We now look at the associated quadratic form

g(p) = (1/2) p^T L p,   (2.14)

which is convex by Proposition 2.4 and (2.2). Our goal here is to reveal a key
combinatorial property of g(p) that reflects the combinatorial properties (2.9) and
(2.10) of the matrix L.

Note 2.5. It is quite natural to consider a quadratic form in association with a
linear system of equations. For a positive-definite symmetric matrix L, the solu-
tion p to a system of linear equations Lp = c can be characterized as the unique
minimizer of the quadratic function (1/2) p^T L p − p^T c (variational formulation). Such
a quadratic function often has a physical meaning. For instance, in the electrical
network of Example 2.2, this quadratic function
represents the power (energy) consumed in the network. The Poisson equation
−Δu = σ (with d = 1) in Example 2.1 can be translated into a variational problem
of minimizing a functional

I[u] = ∫ ( (1/2)(du/dx)² − σu ) dx.

In this case, the quadratic function represents a discretization of I[u]. ■

For vectors p, q ∈ R^n, we denote the vectors of componentwise maxima and
minima by p ∨ q and p ∧ q, respectively:

(p ∨ q)_i = max(p_i, q_i),   (p ∧ q)_i = min(p_i, q_i)   (i = 1, . . . , n).   (2.16)

A function g : R^n → R ∪ {+∞} is said to be submodular if it satisfies the inequality

g(p) + g(q) ≥ g(p ∨ q) + g(p ∧ q)   (p, q ∈ R^n).   (2.17)

This inequality is referred to as the submodularity inequality. It is understood that
the inequality (2.17) is satisfied if g(p) = +∞ or g(q) = +∞.
Submodularity corresponds to off-diagonal nonpositivity.

Proposition 2.6. For a symmetric matrix L, the off-diagonal nonpositivity (2.9)
of L is equivalent to the submodularity (2.17) of the associated quadratic form g(p).

Proof. The inequality (2.17) for p = χ_i (ith unit vector) and q = χ_j (jth unit
vector) yields (2.9). For the converse, put a = p ∧ q, p = a + p̄, and q = a + q̄. Then
p ∨ q = a + p̄ + q̄. Substitution of these into (2.14) shows that the right-hand side
of (2.17) minus the left-hand side of (2.17) is represented as

p̄^T L q̄ = Σ_{i≠j} l_ij p̄_i q̄_j ≤ 0,

where the equality uses p̄_i q̄_i = 0 (i = 1, . . . , n) and the inequality follows from
(2.9) with p̄ ≥ 0 and q̄ ≥ 0. □
Off-diagonal nonpositivity can thus be translated into submodularity. Then
how does the combination of the off-diagonal nonpositivity and the diagonal domi-
nance of L translate to g(p)? To this end, we strengthen submodularity to

(SBF♮[R])  g(p) + g(q) ≥ g((p − α1) ∨ q) + g(p ∧ (q + α1))   (∀p, q ∈ R^n, ∀α ≥ 0),

which we call translation submodularity. Submodularity (2.17) is a special case of
this with α = 0.
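For a concrete diagonally dominant symmetric M-matrix, translation submodularity can be spot-checked by random sampling. The sketch below assumes the standard form g(p) + g(q) ≥ g((p − α1) ∨ q) + g(p ∧ (q + α1)), α ≥ 0, with an illustrative 3×3 matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
L = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  3.0, -1.0],
              [ 0.0, -1.0,  2.0]])   # satisfies (2.9) and (2.10)

def g(p):
    return 0.5 * p @ L @ p

one = np.ones(3)
for _ in range(1000):
    p, q = rng.normal(size=3), rng.normal(size=3)
    alpha = rng.uniform(0.0, 2.0)
    lhs = g(p) + g(q)
    rhs = g(np.maximum(p - alpha * one, q)) + g(np.minimum(p, q + alpha * one))
    assert lhs >= rhs - 1e-9     # translation submodularity holds
```

Replacing L by, say, a matrix with a positive off-diagonal entry makes the assertion fail, in line with Theorem 2.7 below.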

Theorem 2.7. For a symmetric matrix L, conditions (a) and (b) are equivalent.
(a) L has off-diagonal nonpositivity (2.9) and diagonal dominance (2.10).
(b) g(p) has translation submodularity (SBF♮[R]).

Proof. (b) ⇒ (a): Proposition 2.6 shows (2.9). (SBF♮[R]) with p = χ_i, q = −1,
and α = 1 yields (2.10), since then (p − α1) ∨ q = χ_i − 1 and p ∧ (q + α1) = 0,
and the inequality g(χ_i) + g(−1) ≥ g(χ_i − 1) + g(0) reduces to Σ_{j=1}^n l_ij ≥ 0.
(a) ⇒ (b): Put I = {i | α < p_i − q_i}, and let J be the complement of I.
Expanding the difference of the two sides of (SBF♮[R]) according to this partition,
and using Σ_{k∈I} l_ik ≥ −Σ_{j∈J} l_ij (i ∈ I), which is a consequence of (2.10), as well as
(2.9), we obtain the desired inequality. Hence follows (SBF♮[R]). □

Theorem 2.7 and Proposition 2.4 show that

translation submodularity ⇒ convexity

for a quadratic function. The converse, however, is not true. It is emphasized that
translation submodularity is a combinatorial property in that it is not invariant
under coordinate rotations but respects the fixed coordinate axes and the particular
direction 1.

Note 2.8. The quadratic form considered in this section coincides with what
is known as the Dirichlet form (in finite dimension) in the theory of the Markov
process and potential theory; see Fukushima–Oshima–Takeda [71]. We mention here
the equivalence among the five conditions (a), (b), (c), (d), and (e) below.²⁹ The
five conditions are equivalent, by Theorem 2.7, to the translation submodularity
(SBF♮[R]) of g(p).
(a) L has off-diagonal nonpositivity (2.9) and diagonal dominance (2.10).
(b) The normal contraction operates on g(p) = (1/2) p^T L p; i.e., for p, q ∈ R^n,

|q_i| ≤ |p_i| (1 ≤ i ≤ n) and |q_i − q_j| ≤ |p_i − p_j| (1 ≤ i, j ≤ n)  ⟹  g(q) ≤ g(p).

(c) Every unit contraction operates on g(p) = (1/2) p^T L p; i.e.,

g((0 ∨ p) ∧ 1) ≤ g(p)   (p ∈ R^n).

(d) For any α > 0, S_α = (I + α⁻¹L)⁻¹ exists and is Markovian; i.e.,

0 ≤ x ≤ 1  ⟹  0 ≤ S_α x ≤ 1.

(e) For any t > 0, T_t = exp(−tL) is Markovian.

²⁹In the terminology of the theory of Dirichlet forms, −L corresponds to the generator, α⁻¹S_α
to the resolvent, and T_t to the semigroup.


The proof of the equivalence of the above five conditions follows.
[(a) ⇒ (b)]: Denote the distinct values in {p_i | 1 ≤ i ≤ n} ∪ {0} by π₁ >
π₂ > ⋯ > π_m, and put X = {i | p_i = π₁} and Y = {i | p_i = π_m}. It suffices to
prove g(p) ≥ g(q) for q = p − αχ_X with 0 ≤ α ≤ 2(π₁ − π₂) or q = p + βχ_Y with
0 ≤ β ≤ 2(π_{m−1} − π_m), since any normal contraction q can be obtained from p by a
series of transformations of such forms. We consider the former case (the latter can
be dealt with similarly). We may assume X ≠ ∅, α > 0, and π₁ > π₂ ≥ 0. Then
g(p) − g(q) ≥ 0 follows by a direct calculation using (2.9) and (2.10).
[(b) ⇒ (c)]: Note that q = (0 ∨ p) ∧ 1 is a normal contraction of p.
[(c) ⇒ (a)]: Let α be a sufficiently small positive number. Then (2.9) follows
from (c) for p = χ_i − αχ_j and (2.10) from (c) for p = 1 + αχ_i.
[(c) ⇒ (d)]: Since (c) ⇒ (a), L is positive semidefinite by Proposition 2.4, and
therefore, S_α exists. For a fixed x with 0 ≤ x ≤ 1 and α > 0, the function

ψ(p) = g(p) + (α/2)(p − x)^T (p − x)

takes the unique minimum at p = p₀, where p₀ = S_α x. For q₀ = (0 ∨ p₀) ∧ 1 we have
g(p₀) ≥ g(q₀) by (c) and (p₀ − x)^T (p₀ − x) ≥ (q₀ − x)^T (q₀ − x) by 0 ≤ x ≤ 1. Hence
ψ(p₀) ≥ ψ(q₀), which implies p₀ = q₀ = (0 ∨ p₀) ∧ 1. This shows 0 ≤ p₀ = S_α x ≤ 1.
[(d) ⇒ (c)]: Since S_α = (s_ij) is Markovian, we have s_ij ≥ 0 (1 ≤ i, j ≤ n) and
Σ_{j=1}^n s_ij ≤ 1 (1 ≤ i ≤ n). Define

g^(α)(p) = (α/2) p^T (I − S_α) p

for α > 0. Then g^(α)(p) tends to g(p) as α → +∞. On the other hand, the
expression

g^(α)(p) = (α/2) [ Σ_{i=1}^n (1 − Σ_{j=1}^n s_ij) p_i² + (1/2) Σ_{i≠j} s_ij (p_i − p_j)² ]

shows that every normal contraction operates on g^(α)(p). The limit of α → +∞
establishes (c).
[(d) ⇒ (e)]: This is due to the formula T_t = exp(−tL) = lim_{k→∞} (S_{k/t})^k.
[(e) ⇒ (d)]: This is due to the formula S_α x = α ∫₀^∞ e^{−αt} T_t x dt. ■
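Conditions (d) and (e) are easy to probe numerically: for a small graph Laplacian (an illustrative choice), both the resolvent and the semigroup should have nonnegative entries and row sums at most one.

```python
import numpy as np

L = np.array([[ 1.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])   # satisfies (2.9) and (2.10)
alpha, t = 2.0, 0.7

S = np.linalg.inv(np.eye(3) + L / alpha)   # resolvent S_alpha
w, V = np.linalg.eigh(L)
T = V @ np.diag(np.exp(-t * w)) @ V.T      # semigroup T_t = exp(-tL)

for K in (S, T):
    assert np.all(K >= -1e-12)                  # nonnegative entries
    assert np.all(K.sum(axis=1) <= 1 + 1e-12)   # row sums at most one
```

Nonnegative entries with row sums at most one are exactly what makes 0 ≤ x ≤ 1 imply 0 ≤ Kx ≤ 1, i.e., the Markovian property of (d) and (e).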

2.1.3 Combinatorial Property of Conjugate Functions

As a continuation of our study of quadratic convex functions with translation sub-
modularity, we now consider the conjugate of such functions. The conjugate of
a quadratic form is another quadratic form, which is associated with the matrix
inverse of the original coefficient matrix.

Proposition 2.9. Let M and L be positive-definite symmetric matrices. The
quadratic forms f(x) = (1/2) x^T M x and g(p) = (1/2) p^T L p are conjugate to each other
with respect to the Legendre–Fenchel transformation (1.6) if and only if M and L
are inverse to each other.

Proof. This follows from a straightforward calculation based on (1.6). □
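The calculation behind Proposition 2.9 can be illustrated numerically: the supremum in the Legendre–Fenchel transform of g(p) = (1/2) p^T L p is attained at the stationary point Lp = x, and the resulting value matches (1/2) x^T L⁻¹ x. The matrix and test point below are arbitrary illustrative choices.

```python
import numpy as np

L = np.array([[ 2.0, -1.0],
              [-1.0,  2.0]])        # positive definite
M = np.linalg.inv(L)
x = np.array([0.7, -1.3])

# sup_p { <p, x> - 0.5 p^T L p } is attained at the stationary point L p = x.
p_star = np.linalg.solve(L, x)
legendre_value = x @ p_star - 0.5 * p_star @ L @ p_star
assert np.isclose(legendre_value, 0.5 * x @ M @ x)
```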

Hence, part of our study consists of investigating the combinatorial structure of
the inverse of symmetric matrices with off-diagonal nonpositivity (2.9) and diagonal
dominance (2.10). We introduce the following notation:

ℒ = {L | L is a symmetric matrix satisfying (2.9) and (2.10)},
ℒ⁻¹ = {L⁻¹ | L ∈ ℒ, L nonsingular}.
Example 2.10. Recall the Poisson equation −Δu = σ in Example 2.1. The inverse
of the matrix L in (2.11) is given by

                ( 4  3  2  1 )
    M = (1/5) · ( 3  6  4  2 )
                ( 2  4  6  3 )
                ( 1  2  3  4 ).

Whereas the matrix L represents a differential operator, M = L⁻¹ corresponds to
the Green function. The function g(p) is an approximation to the functional I[u]
for the variational formulation (Note 2.5), and the conjugate of g(p) is that for the
inverse problem of finding σ for a given u. ■
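The Green-function interpretation can be checked numerically: the inverse of the 1-D discrete Laplacian is entrywise nonnegative and, as is well known for this particular tridiagonal matrix, has the closed form m_ij = min(i, j)(n + 1 − max(i, j))/(n + 1) in 1-based indexing.

```python
import numpy as np

n = 5
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
M = np.linalg.inv(L)

# The inverse of a nonsingular M-matrix is entrywise nonnegative.
assert np.all(M >= -1e-12)

# Closed form for the discrete Green function (1-based indices i, j).
for i in range(1, n + 1):
    for j in range(1, n + 1):
        expected = min(i, j) * (n + 1 - max(i, j)) / (n + 1)
        assert np.isclose(M[i - 1, j - 1], expected)
```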

Let us consider the quadratic form

f(x) = (1/2) x^T M x

associated with M ∈ ℒ⁻¹. We are to show that f(x) possesses an exchange property:

(M♮-EXC[R]) For x, y ∈ dom_R f and i ∈ supp⁺(x − y), there exist
j ∈ supp⁻(x − y) ∪ {0} and a positive number α₀ ∈ R₊₊ such that

f(x) + f(y) ≥ f(x − α(χ_i − χ_j)) + f(y + α(χ_i − χ_j))

for all α ∈ R with 0 ≤ α ≤ α₀.

It should be clear that χ_i designates the ith unit vector for 1 ≤ i ≤ n while χ₀ is
the zero vector and

supp⁺(x) = {i | x_i > 0},   supp⁻(x) = {i | x_i < 0}

for x = (x_i | i = 1, . . . , n) ∈ R^n. Recall that such an exchange property can be
viewed as a combinatorial analogue of the basic inequality (1.39) valid for a general
convex function.
The following is the main theorem of this section, stating the conjugacy rela-
tionship between L♮-convexity and M♮-convexity for strictly convex quadratic func-
tions.

Theorem 2.11. Suppose that strictly convex quadratic forms g and f are conjugate
to each other. Then g satisfies translation submodularity (SBF♮[R]) if and only if
f has exchange property (M♮-EXC[R]).

The conjugacy relationship between (SBF♮[R]) and (M♮-EXC[R]) stated above
for strictly convex quadratic forms is in fact valid for a more general class of func-
tions, as is fully developed in Chapter 8. This particular case, however, deserves
separate consideration, in that it admits a matrix-algebraic proof using the Farkas
lemma and thereby provides a new insight into (SBF♮[R]) vs. (M♮-EXC[R]) conju-
gacy.
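The exchange side of Theorem 2.11 can be spot-checked numerically in a directional-derivative form: for f(x) = (1/2) x^T M x one has f′(x; d) = x^T M d, and the property requires, for each i ∈ supp⁺(x − y), some j ∈ supp⁻(x − y) ∪ {0} with f′(x; −χ_i + χ_j) + f′(y; χ_i − χ_j) ≤ 0. The sketch below takes M as the inverse of an illustrative diagonally dominant symmetric M-matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
Ldd = np.array([[ 3.0, -1.0, -1.0],
                [-1.0,  3.0, -1.0],
                [-1.0, -1.0,  3.0]])   # diagonally dominant symmetric M-matrix
M = np.linalg.inv(Ldd)

def f_dir(x, d):
    # directional derivative of f(x) = 0.5 x^T M x in direction d
    return x @ M @ d

e = np.eye(3)
zero = np.zeros(3)
for _ in range(500):
    x, y = rng.normal(size=3), rng.normal(size=3)
    z = x - y
    for i in np.flatnonzero(z > 1e-9):            # i in supp+(x - y)
        cands = [zero] + [e[j] for j in np.flatnonzero(z < -1e-9)]
        best = min(f_dir(x, -e[i] + d) + f_dir(y, e[i] - d) for d in cands)
        assert best <= 1e-9    # some exchange direction does not increase f
```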
In what follows we prove Theorem 2.11 by establishing Theorem 2.12 below.³⁰
As variants of (M♮-EXC[R]) we consider the following:

(M♮-EXC+[R]) For x, y ∈ dom_R f and i ∈ supp⁺(x − y), there exist
j ∈ supp⁻(x − y) ∪ {0} and a positive number α₀ ∈ R₊₊ such that

f(x) + f(y) > f(x − α(χ_i − χ_j)) + f(y + α(χ_i − χ_j))

for all α ∈ R with 0 < α ≤ α₀.

(M♮-EXC_d[R]) For x, y ∈ dom_R f and i ∈ supp⁺(x − y),

min{ f′(x; −χ_i + χ_j) + f′(y; χ_i − χ_j) | j ∈ supp⁻(x − y) ∪ {0} } ≤ 0.

The latter is motivated by the identity

f(x − α(χ_i − χ_j)) + f(y + α(χ_i − χ_j)) − f(x) − f(y)
    = α [ f′(x; −χ_i + χ_j) + f′(y; χ_i − χ_j) ] + o(α),   (2.22)

valid for the directional derivative f′(x; d) and sufficiently small α > 0. We also
denote by (M♮-EXC_d+[R]) the property (M♮-EXC_d[R]) with ≤ replaced by strict
inequality <.

³⁰The reader may go on to section 2.2, skipping the technical matters given in the rest of
section 2.1.

Theorem 2.12. For an n×n nonsingular symmetric matrix M = [m₁, m₂, . . . , m_n],
with m_j ∈ R^n denoting the jth column vector of M, the following nine conditions
(a), (b), (b+), . . . , (e), (e+) are equivalent (with m₀ = 0, the zero vector).
(a) M ∈ ℒ⁻¹.
(b) For any x ∈ R^n and i ∈ supp⁺(x),

x^T m_i ≥ min{ x^T m_j | j ∈ ({1, . . . , n} ∖ {i}) ∪ {0} }.

(b+) For any x ∈ R^n and i ∈ supp⁺(x), the inequality of (b) holds with ≥ replaced by >.
(c) For any x ∈ R^n and i ∈ supp⁺(x),

x^T m_i ≥ min{ x^T m_j | j ∈ supp⁻(x) ∪ {0} }.

(c+) For any x ∈ R^n and i ∈ supp⁺(x), the inequality of (c) holds with ≥ replaced by >.
(d) f(x) = (1/2) x^T M x satisfies (M♮-EXC_d[R]).
(d+) f(x) = (1/2) x^T M x satisfies (M♮-EXC_d+[R]).
(e) f(x) = (1/2) x^T M x satisfies (M♮-EXC[R]).
(e+) f(x) = (1/2) x^T M x satisfies (M♮-EXC+[R]).

Proof. We prove the equivalence by showing the following implications:

The implications indicated by <— or j are easy to see and those by <&• or f|- are
proved below.
(a) <=» (b+): Denoting M~l by L = (iij), we have ML = /; i.e., X)?=i ^jimj =
Xi, which can be rewritten as
50 Chapter 2. Convex Functions with Combinatorial Structures

The condition (a), L ∈ ℒ, is equivalent to all the coefficients in this expression being nonnegative, whereas the latter is equivalent, by the Farkas lemma (Proposition 2.13 below), to

This is nothing but (b⁺), since xᵀχ_i > 0 is the same as i ∈ supp⁺(x).
(b) ⟹ (b⁺): This follows from the above argument and the latter half of Proposition 2.13 below.
(a) ⟹ (c⁺): Fix x ∈ Rⁿ. It suffices to show

Let i ∈ supp⁺(x) attain the minimum on the left-hand side. Put S = supp⁺(x) ∪ supp⁻(x) and let x̄ ∈ R^S denote the restriction of x to S. The submatrix of M with row and column indices in S is denoted by M̄ = (m̄_j | j ∈ S), where m̄_j ∈ R^S. Then we have supp⁺(x̄) = supp⁺(x), supp⁻(x̄) = supp⁻(x), i ∈ supp⁺(x̄), x̄_j ≠ 0 (∀j ∈ S), and x̄ᵀm̄_j = xᵀm_j (∀j ∈ S). Since M̄ ∈ ℒ⁻¹ by Proposition 2.14 below, M̄ satisfies (b⁺). Hence

in which

by the choice of i. Hence, we obtain

as well as a similar expression for f′(x; −χ_i + χ_j) + f′(y; χ_i − χ_j). We then replace x − y with x.
(d⁺) ⟹ (e⁺): This follows easily from (2.22). □

Proposition 2.13. For a matrix A and a vector b, the conditions (a) and (b) below are equivalent (Farkas lemma):³¹
(a) Ax = b for some nonnegative vector x ≥ 0.
³¹Inequality between vectors means componentwise inequality; e.g., x ≥ 0 for x = (x_i)ⁿ_{i=1} means x_i ≥ 0 for i = 1, ..., n.

(b) yᵀb ≥ 0 for any y such that yᵀA ≥ 0ᵀ.

If A is nonsingular, condition (b) is equivalent to
(c) yᵀb ≥ 0 for any y such that yᵀA > 0ᵀ.

Proof. (a) ⟹ (b) ⟹ (c) is obvious; (b) ⟹ (a) is proved later in Theorem 3.9. For (c) ⟹ (b), there exists z such that zᵀA = 1ᵀ by the assumed nonsingularity of A. Then (y + εz)ᵀA > 0ᵀ for any ε > 0, and (c) yields (y + εz)ᵀb ≥ 0, which implies yᵀb ≥ 0, since ε > 0 is arbitrary. □
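The easy direction of the Farkas lemma can be checked numerically. The sketch below (illustrative random data; `numpy` assumed available) constructs b = Ax with x ≥ 0 and samples vectors y with yᵀA ≥ 0ᵀ by writing y = A⁻ᵀz with z ≥ 0, which is exactly the substitution used in the proof of (a) ⟺ (b⁺) above.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))   # a generic (hence nonsingular) matrix
x = rng.random(3)                 # nonnegative x
b = A @ x                         # condition (a) holds by construction

# Condition (b): y^T b >= 0 whenever y^T A >= 0^T.  Since A is nonsingular,
# such y are exactly y = A^{-T} z with z = A^T y >= 0, and then
# y^T b = z^T A^{-1} b = z^T x >= 0.
min_val = min(
    np.linalg.solve(A.T, z) @ b
    for z in rng.random((1000, 3))
)
assert min_val >= -1e-9           # (b) holds up to roundoff
```

The parametrization y = A⁻ᵀz is also the reason the nonsingular case admits the strict variant (c): perturbing y by εz with zᵀA = 1ᵀ moves it into the strict cone without changing the limit of yᵀb.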

Proposition 2.14. Any principal submatrix of M ∈ ℒ⁻¹ belongs to ℒ⁻¹.

Proof. Partition M and L = M⁻¹ compatibly as

To prove M₁₁ ∈ ℒ⁻¹ by induction on the size of M₁₁, we may assume that M₂₂ and L₂₂ are 1 × 1. Since L₂₂ = l_nn > 0, we have M₁₁⁻¹ = L₁₁ − L₁₂L₂₂⁻¹L₂₁, which shows the nonsingularity of M₁₁. Then the proof of Proposition 2.4 shows M₁₁ ∈ ℒ⁻¹. □

Note 2.15. It is worth noting that conditions (c) and (c⁺) in Theorem 2.12 immediately imply positive semidefiniteness and positive definiteness, respectively. The proof for the former reads as follows, while a similar proof works for the latter. Let μ be an eigenvalue of the matrix M and x a corresponding eigenvector with supp⁺(x) ≠ ∅. Then (c) shows

for i ∈ supp⁺(x). This implies μ ≥ 0, since, otherwise, the left-hand side is negative and the right-hand side is zero. Hence M is positive semidefinite by (2.4). It is also noted that m_ii ≥ 0 follows from (c) with x = χ_i and m_ij ≥ 0 from (c) with x = χ_i + σχ_j with σ > 0 large. ■

2.1.4 General Quadratic L-/M-Convex Functions

We have so far considered strictly convex quadratic forms defined by positive-definite matrices. The conjugacy relationship carries over to convex quadratic forms defined by positive-semidefinite matrices, as follows.
Suppose that g is a quadratic convex function given by

with a positive-semidefinite symmetric matrix L and a linear subspace K ⊆ Rⁿ. Then the conjugate of g is also a quadratic convex function given by

with a positive-semidefinite symmetric matrix M and a linear subspace H ⊆ Rⁿ such that

where

Note that (2.26) can be rewritten as

The conjugacy stated in Theorem 2.11 for strictly convex quadratic forms is generalized as follows. See Murota–Shioura [155] for the proof as well as the structure of the coefficient matrices L and M.

Theorem 2.16. Suppose that g : Rⁿ → R ∪ {+∞} in (2.24) and f : Rⁿ → R ∪ {+∞} in (2.25) are conjugate to each other. Then g satisfies translation submodularity (SBF♮[R]) if and only if f has the exchange property (M♮-EXC[R]).
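In the positive-definite case the subspace terms drop out and the conjugacy reduces to the familiar pairing g(p) = ½pᵀLp ⟷ f(x) = ½xᵀL⁻¹x, which can be checked numerically. The matrix below is an illustrative diagonally dominant M-matrix, not one taken from the text.

```python
import numpy as np

# An illustrative positive-definite, diagonally dominant symmetric M-matrix L.
L = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
M = np.linalg.inv(L)

g = lambda p: 0.5 * p @ L @ p
f = lambda x: 0.5 * x @ M @ x

rng = np.random.default_rng(1)
max_gap = 0.0
for _ in range(200):
    x = rng.standard_normal(3)
    # sup_p { <p,x> - g(p) } is attained where the gradient vanishes, p* = L^{-1} x,
    # and the supremum equals (1/2) x^T L^{-1} x = f(x).
    p_star = np.linalg.solve(L, x)
    max_gap = max(max_gap, abs((p_star @ x - g(p_star)) - f(x)))
assert max_gap < 1e-9
```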

2.2 Nonlinear Networks

In the previous section we have seen that an electrical network gives rise to a convex function with combinatorial properties (Example 2.2). The node admittance matrix L is a diagonally dominant symmetric M-matrix (with off-diagonal nonpositivity and diagonal dominance), and the associated quadratic function (2.15) representing the power (energy) consumed in the network has translation submodularity (SBF♮[R]). In this section we shall see a similar phenomenon in an electrical network of nonlinear resistors or, equivalently, in a nonlinear minimum cost flow problem. General convex functions, not necessarily quadratic, arise as a result of nonlinearity. Two aspects of discreteness, discreteness in direction and discreteness in value, both appear naturally in the network flow problem. Accordingly, we consider functions of type Rⁿ → R in section 2.2.1 and those of type Zⁿ → Z in section 2.2.2.

2.2.1 Real-Valued Flows


Let G = (V, A) be a directed graph with the set of vertices (nodes) V and the set of arcs (branches) A, and let T be a set of distinguished vertices called terminals; see Fig. 2.2. For each vertex v ∈ V, δ⁺v and δ⁻v denote the sets of arcs leaving v and

Figure 2.2. Multiterminal network.

entering v, respectively. For each arc a ∈ A, ∂⁺a designates the initial vertex of a and ∂⁻a the terminal vertex of a. We consider here a minimum cost flow problem, in which each arc is associated with a nonlinear convex cost function and the supply (or demand) of flow is specified at the terminal vertices.
The physical model we have in mind is a multiterminal electrical network that consists of nonlinear resistors and is driven by a (current or voltage) source applied to the terminal vertices. To reinforce physical intuition, we sometimes use terminology such as current and voltage instead of flow and tension, but no physics is really involved in our arguments. A reader who is more comfortable with combinatorial optimization terminology may replace the terminology as follows:

electrical network ↔ network
current ↔ flow
voltage ↔ tension
current source ↔ supply of flow
potential ↔ potential (dual variable)
current potential ↔ cost function in flow
voltage potential ↔ cost function in tension
characteristic curve ↔ kilter diagram
Each arc a ∈ A is associated with a flow (or current) ξ(a) and a tension (or voltage) η(a), and each vertex v ∈ V with a potential p(v). The boundary of a flow ξ is defined to be³²

∂ξ(v) = Σ_{a ∈ δ⁺v} ξ(a) − Σ_{a ∈ δ⁻v} ξ(a)    (v ∈ V),

which represents the net flow leaving vertex v. The coboundary of a potential p is

δp(a) = p(∂⁺a) − p(∂⁻a)    (a ∈ A),

which expresses the difference in the potentials at the end vertices of an arc a.
³²Rockafellar's notations div and Δ in [178] are related to ours by div = ∂ and Δ = −δ.
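In code, the boundary and coboundary operators are a few lines each. The digraph below is a made-up four-vertex example (not the network of Fig. 2.2), with arcs stored as (initial vertex, terminal vertex) pairs.

```python
# Boundary and coboundary on a small made-up digraph; arcs are (tail, head) pairs.
V = ["s", "u", "v", "t"]
A = [("s", "u"), ("u", "v"), ("s", "v"), ("v", "t"), ("u", "t")]

def boundary(xi):
    """(boundary of xi)(w): flow on arcs leaving w minus flow on arcs entering w."""
    d = {w: 0 for w in V}
    for (tail, head), flow in zip(A, xi):
        d[tail] += flow
        d[head] -= flow
    return d

def coboundary(p):
    """(coboundary of p)(a): potential difference across the end vertices of a."""
    return [p[tail] - p[head] for (tail, head) in A]

# a unit flow along s -> u -> v -> t leaves net amounts +1 at s and -1 at t
assert boundary([1, 1, 0, 1, 0]) == {"s": 1, "u": 0, "v": 0, "t": -1}

# the coboundary is additive along paths: delta_p(s,u) + delta_p(u,v) = delta_p(s,v)
p = {"s": 3, "u": 2, "v": 1, "t": 0}
dp = coboundary(p)
assert dp[0] + dp[1] == dp[2]
```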

Figure 2.3. Characteristic curve.

For each terminal vertex v ∈ T, let x(v) denote the amount of flow going out of the network at v and p̄(v) be the potential at v. We have the structural equations

expressing the conservation laws as well as the obvious relation

Σ_{v ∈ T} x(v) = 0.

The vectors x = (x(v) | v ∈ T) ∈ R^T and p̄ = (p̄(v) | v ∈ T) ∈ R^T play the principal roles in our discussion below.
Each arc a ∈ A is associated with a characteristic curve Γ_a ⊆ R², which describes the admissible pairs of flow ξ(a) and tension η(a):

(ξ(a), η(a)) ∈ Γ_a    (a ∈ A).    (2.31)

In physical terms a characteristic curve shows the constitutive equation for a nonlinear resistor, describing the possible pairs of current and voltage. In the linear case (as in Example 2.2) we have

Γ_a = {(ξ, η) ∈ R² | η = R_a ξ}

for some R_a > 0, which represents the resistance of an ohmic resistor. We consider here a nonlinear case, where the monotonicity

(ξ₁ − ξ₂)(η₁ − η₂) ≥ 0    for (ξ₁, η₁), (ξ₂, η₂) ∈ Γ_a    (2.32)

is assumed (see Fig. 2.3).


A conjugate pair of convex functions is induced from the characteristic curve Γ_a. Define

which means that f_a(ξ) is the area below Γ_a in Fig. 2.3 and g_a(η) is the area above Γ_a. The functions f_a(ξ) and g_a(η) are both convex as a consequence of the assumed

monotonicity (2.32). Moreover, with suitable choices of integration constants, we have

f_a(ξ) + g_a(η) ≥ ξη,  with equality if and only if (ξ, η) ∈ Γ_a.    (2.35)

Hence follows

g_a(η) = sup_ξ {ξη − f_a(ξ)},    f_a(ξ) = sup_η {ξη − g_a(η)}.    (2.36)

That is, f_a and g_a are conjugate to each other with respect to the Legendre–Fenchel transformation (1.6).
In the theory of electrical networks, the function f_a(ξ) is sometimes called the current potential (or content) and g_a(η) the voltage potential (or cocontent). In the case of a linear resistor, the functions f_a and g_a are quadratic, i.e.,

f_a(ξ) = R_a ξ²/2,    g_a(η) = η²/(2R_a),

and they are both equal to half the power consumed in the resistor.
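For a linear resistor the conjugacy between content and cocontent can be verified directly: with f_a(ξ) = R_a ξ²/2, the supremum sup_ξ {ξη − f_a(ξ)} equals η²/(2R_a). A numerical sketch (the resistance value is illustrative):

```python
import numpy as np

R = 2.0                            # an illustrative resistance R_a > 0
f_a = lambda xi: R * xi**2 / 2     # current potential (content)
g_a = lambda eta: eta**2 / (2 * R) # voltage potential (cocontent)

# Legendre-Fenchel conjugacy: g_a(eta) = sup_xi { xi*eta - f_a(xi) },
# with the supremum attained on the characteristic curve eta = R*xi.
xis = np.linspace(-10, 10, 100001)
max_err = 0.0
for eta in [-3.0, -1.0, 0.0, 0.5, 2.0]:
    sup_val = np.max(xis * eta - f_a(xis))
    max_err = max(max_err, abs(sup_val - g_a(eta)))
assert max_err < 1e-4              # grid approximation of the supremum
```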
When a current source described by x ∈ R^T with Σ_{v∈T} x(v) = 0 is applied to the terminal vertices of the network, the equilibrium state of (ξ(a) | a ∈ A) and (η(a) | a ∈ A) is determined as a solution to the structural equations (2.29) and the constitutive equations (2.31). A variational formulation of this problem is to minimize the total current potential Σ_{a∈A} f_a(ξ(a)) among all possible current distributions ξ subject to the conservation law (2.29), with the current vector at the equilibrium state being characterized as a minimizer of this problem (see Note 2.18 in section 2.2.3). We define f(x) to be the minimum value of the total current potential in this variational problem; i.e.,

When a voltage source described by p̄ ∈ R^T (with respect to some reference point) is applied to the terminal vertices of the network, the equilibrium state of (ξ(a) | a ∈ A) and (η(a) | a ∈ A) is determined as a solution to the structural equations (2.29) and (2.30) and the constitutive equations (2.31). A variational formulation of this problem is to minimize the total voltage potential Σ_{a∈A} g_a(η(a)) among all possible voltage distributions η subject to the conservation laws (2.29) and (2.30), with the voltage vector at the equilibrium state being characterized as a minimizer of this problem (see Note 2.18). We define g(p̄) to be the minimum value of the total voltage potential in this variational problem; i.e.,
The functions f and g introduced above are both convex (see Note 2.17 in section 2.2.3) and they are conjugate to each other (see Note 2.18). In this sense

they stand on equal footing and there seems to be no concept in convex analysis that distinguishes between f and g.
When it comes to combinatorial properties, however, these functions have distinctive features. As is shown in Notes 2.19 and 2.20 in section 2.2.3, the function f : R^T → R ∪ {+∞} is endowed with an exchange property:

(M-EXC[R]) For x, y ∈ dom_R f and u ∈ supp⁺(x − y), there exist v ∈ supp⁻(x − y) and a positive number α₀ ∈ R₊₊ such that

f(x) + f(y) ≥ f(x − α(χ_u − χ_v)) + f(y + α(χ_u − χ_v))

for all α ∈ R with 0 ≤ α ≤ α₀,

and the function g : R^T → R ∪ {+∞} satisfies

(SBF[R]) g(p) + g(q) ≥ g(p ∨ q) + g(p ∧ q)    (p, q ∈ dom_R g),
(TRF[R]) g(p + α1) = g(p) + αr    (p ∈ dom_R g, α ∈ R)

with r = 0. (SBF[R]) is the submodularity and (TRF[R]) with r = 0 is the invariance in the direction of 1 = (1, 1, ..., 1), which corresponds to the fact that the reference point of a potential can be chosen arbitrarily. Recall that we have already seen (M-EXC[R]) in the definition of M-convex functions in section 1.4.2 and (SBF[R]) and (TRF[R]) in the definition of L-convex functions in section 1.4.1.
The following points are emphasized here for the conjugate pair of convex functions, f and g, appearing in the network flow problem:
• The functions f and g cannot be categorized with respect to convexity alone.
• The functions f and g can be classified into different categories (M-convexity and L-convexity) with respect to combinatorial properties.
• Exchangeability (M-convexity) and submodularity (L-convexity) appear as conjugate properties.
The combinatorial properties (M-EXC[R]), (SBF[R]), and (TRF[R]) above capture the kind of discreteness we call discreteness in direction. In the next subsection we turn to the other kind of discreteness, discreteness in value, inherent in the network flow problem, by considering integer-valued flows.
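These properties are easy to observe experimentally. The sketch below takes a hypothetical three-terminal star network with quadratic voltage potentials g_a(η) = η²/2 on each arc, computes g by minimizing over the internal potential, and samples the submodular inequality (SBF[R]) together with the translation invariance (TRF[R]) with r = 0. The network and values are illustrative assumptions, not data from the text.

```python
import random

# A hypothetical star network: terminals t1, t2, t3 each joined to one internal
# node m, every arc carrying the quadratic voltage potential g_a(eta) = eta^2/2.
def g(p):
    # minimize the total voltage potential over the internal potential p(m);
    # for quadratic arc costs the optimal p(m) is the mean of the terminal values
    pm = sum(p) / 3.0
    return sum((pi - pm) ** 2 for pi in p) / 2.0

random.seed(0)
ok_sbf = ok_trf = True
for _ in range(1000):
    p = [random.uniform(-5, 5) for _ in range(3)]
    q = [random.uniform(-5, 5) for _ in range(3)]
    join = [max(a, b) for a, b in zip(p, q)]
    meet = [min(a, b) for a, b in zip(p, q)]
    ok_sbf &= g(p) + g(q) >= g(join) + g(meet) - 1e-9          # (SBF[R])
    alpha = random.uniform(-5, 5)
    ok_trf &= abs(g([pi + alpha for pi in p]) - g(p)) < 1e-9   # (TRF[R]), r = 0
assert ok_sbf and ok_trf
```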

2.2.2 Integer-Valued Flows

By replacing R in the previous argument with Z systematically, we consider integer-valued flows in a network specified by integral data. In particular, we assume that all the vectors representing flow, tension, potential, etc., are integer valued; i.e., ξ ∈ Z^A, η ∈ Z^A, p ∈ Z^V, x ∈ Z^T, p̄ ∈ Z^T, etc.
As for the cost functions, we assume that each arc a ∈ A is associated with a pair of integer-valued functions f_a, g_a : Z → Z ∪ {+∞} such that

Figure 2.4. Conjugate discrete convex functions f_a(ξ) and g_a(η).

Figure 2.5. Discrete characteristic curve Γ_a.

and

It should be clear that (2.41) is a discrete analogue of the conjugacy (2.36), the discrete Legendre–Fenchel transformation (1.9) for univariate functions. An example of such a conjugate pair of cost functions is demonstrated in Fig. 2.4.
The characteristic curve Γ_a in this discrete setting is defined to be a subset of Z² induced from (f_a, g_a) by (2.35). It can be characterized as a subset of Z² with the monotonicity property (2.32). Figure 2.5 shows the characteristic curve Γ_a induced from (f_a, g_a) in Fig. 2.4.
In parallel with (2.37) and (2.38) we define functions f : Z^T → Z ∪ {+∞} and g : Z^T → Z ∪ {+∞}.

Note that these expressions are identical to (2.37) and (2.38) except that the vectors are now integer valued and, in particular, the infima are taken over integer vectors.
Fortunately, such a discretization in the definitions of f and g does not destroy the combinatorial properties discussed above. On the contrary, the discretization turns out to be compatible with natural discretizations of the combinatorial properties. Namely, it can be shown (see Note 2.19) that the function f has a discrete version of the exchange property:

(M-EXC[Z]) For x, y ∈ dom_Z f and u ∈ supp⁺(x − y), there exists v ∈ supp⁻(x − y) such that

f(x) + f(y) ≥ f(x − χ_u + χ_v) + f(y + χ_u − χ_v).

This is essentially the same as (M-EXC[R]) with α₀ = α = 1. On the other hand, the function g satisfies

(SBF[Z]) g(p) + g(q) ≥ g(p ∨ q) + g(p ∧ q)    (p, q ∈ dom_Z g),
(TRF[Z]) g(p + 1) = g(p) + r    (p ∈ dom_Z g)

with r = 0 (see Note 2.20). Furthermore, these functions f and g are conjugate to each other with respect to the discrete Legendre–Fenchel transformation (1.9), as is proved later in section 9.6.
We have thus seen that the transition from R to Z is quite smooth in the network flow problem. The combinatorial properties are discretized compatibly in the discretization of the problem data. We emphasize that this is by no means a general phenomenon but is an outstanding characteristic of the network flow problem.
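The discrete exchange property can be verified exhaustively on a small instance. The sketch below uses a hypothetical three-terminal triangle network with integer flows and arc costs f_a(t) = t², adopting the sign convention x(v) = ∂ξ(v) (an assumption made here for illustration), and checks (M-EXC[Z]) over a range of boundary vectors.

```python
import itertools

# A hypothetical three-terminal triangle network: arcs (1,2), (2,3), (1,3),
# integer flows, arc cost f_a(t) = t^2.  We take x(v) = boundary of xi at v
# (net flow leaving v through the arcs; sign convention assumed here).
def f(x):
    x1, x2, x3 = x
    assert x1 + x2 + x3 == 0            # boundaries of a circulation sum to zero
    # xi12 = s, xi13 = x1 - s, xi23 = x2 + s: one integer degree of freedom s
    return min(s * s + (x1 - s) ** 2 + (x2 + s) ** 2 for s in range(-20, 21))

def shift(z, u, v):                      # z - chi_u + chi_v
    z = list(z); z[u] -= 1; z[v] += 1
    return z

pts = [x for x in itertools.product(range(-3, 4), repeat=3) if sum(x) == 0]
ok = True
for x in pts:
    for y in pts:
        for u in range(3):
            if x[u] <= y[u]:
                continue                 # need u in supp+(x - y)
            # (M-EXC[Z]): some v in supp-(x - y) admits the exchange
            ok &= any(f(shift(x, u, v)) + f(shift(y, v, u)) <= f(x) + f(y)
                      for v in range(3) if x[v] < y[v])
assert ok
```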

2.2.3 Technical Supplements

This section provides a series of proofs for the major properties of the functions f and g defined in (2.37) and (2.38).



Note 2.17. We prove that f in (2.44) and g in (2.45) are convex functions under the assumption that f > −∞ and g > −∞.
To show the convexity of f, fix x, y ∈ dom_R f. For any ε > 0 there exist ξ_x and ξ_y such that

where ·|_T denotes the restriction of a vector to T. For λ ∈ [0, 1]_R we have ∂(λξ_x + (1 − λ)ξ_y) = λ∂ξ_x + (1 − λ)∂ξ_y and, therefore,

This implies λf(x) + (1 − λ)f(y) ≥ f(λx + (1 − λ)y), since ε > 0 is arbitrary.
To show the convexity of g, fix p, q ∈ dom_R g. For any ε > 0 there exist η_p, η_q, p, and q such that

For λ ∈ [0, 1]_R we have δ(λp + (1 − λ)q) = λδp + (1 − λ)δq = −[λη_p + (1 − λ)η_q] and, therefore,

This implies λg(p) + (1 − λ)g(q) ≥ g(λp + (1 − λ)q), since ε > 0 is arbitrary. ■

Note 2.18. The variational formulations of the network equilibrium are derived here under the assumption of the existence of an equilibrium. Also shown is the conjugacy between f in (2.44) and g in (2.45). It follows from (2.34) that

for any ξ, η, p, x, and p̄ satisfying the conservation laws expressed by (2.29) and (2.30). By (2.35), the inequality above is an equality if and only if (ξ(a), η(a)) ∈ Γ_a for each a ∈ A. Suppose that an equilibrium state exists when a current source x = x* is applied to the network and let ξ*, η*, p*, p̄* be the vectors at the equilibrium. The inequality (2.50) with η = η*, p = p*, p̄ = p̄*, and x = x* yields

which shows that the minimum of the current potential Σ_{a∈A} f_a(ξ(a)) is attained by ξ = ξ*. A similar result holds for the variational formulation using the voltage potential Σ_{a∈A} g_a(η(a)) when a voltage source described by p̄ = p̄* is applied. For the conjugacy of f and g, note that

follows from (2.50) and the definitions of f and g in (2.44) and (2.45). This inequality for x = x* turns into an equality for p̄ = p̄*, which shows that f(x*) = sup_{p̄} {⟨p̄, x*⟩ − g(p̄)} = g•(x*). A similar argument works for g = f•. The argument here is admittedly lacking in mathematical rigor, for which the reader is referred to Rockafellar [178]. ■

Note 2.19. We prove that f in (2.44) satisfies (M-EXC[R]) under the assumption that f > −∞. Fix x, y ∈ dom_R f. For any ε > 0 there exist ξ_x and ξ_y satisfying (2.46) and (2.47). Consider the difference of the flows, ξ_y − ξ_x ∈ R^A, for which we have

Since u ∈ supp⁺(x − y), there exists a path compatible with ξ_y − ξ_x that connects u to some vertex in supp⁻(x − y) (i.e., an augmenting path with respect to the pair of flows ξ_x and ξ_y). More formally, there exist π : A → {0, ±1} and v ∈ supp⁻(x − y) such that

where χ_u, χ_v ∈ R^V are the characteristic vectors of u and v. For the two flows ξ_x + απ and ξ_y − απ with 0 ≤ α ≤ α₀, where α₀ = min{|ξ_y(a) − ξ_x(a)| : |π(a)| = 1} (> 0), we have

It then follows that

This implies (M-EXC[R]) if α₀ = α₀(ε) does not tend to zero as ε → 0 and v = v(ε) remains the same as ε → 0. For the former property, we can take an augmenting path π such that α₀ ≥ (x(u) − y(u))/|A|, and for the latter we may take a subsequence of ε that corresponds to a single v. In the discrete case of (2.39), α₀ is a positive integer and α = 1 is a valid choice. Hence follows (M-EXC[Z]). ■

Note 2.20. We prove that g in (2.45) satisfies (SBF[R]) and (TRF[R]) under the assumption that g > −∞. First, (TRF[R]) is obvious from δ(p + α1) = δp. Fix p, q ∈ dom_R g. For any ε > 0 there exist η_p, η_q, p, and q satisfying (2.48) and (2.49). For η_∨ = −δ(p ∨ q) and η_∧ = −δ(p ∧ q), we have

Hence, for each a ∈ A, there exists λ_a (0 ≤ λ_a ≤ 1) such that

which, together with the convexity of g_a, implies that

Therefore, we obtain

which implies (SBF[R]), since ε > 0 is arbitrary. The discrete case (2.43) with g_a in (2.40) can be treated similarly. ■

2.3 Substitutes and Complements in Network Flows

In section 1.3 we explained that submodularity should be compared to convexity. This statement is certainly true for set functions, but, when it comes to functions of real or integer vectors, it is more appropriate to regard convexity and submodularity as mutually independent properties. In this section we address this issue with reference to substitutes and complements in network flows discussed in the literature and show that the concepts of L-convexity and M-convexity help us better understand the relationship between convexity and submodularity.

2.3.1 Convexity and Submodularity


We consider a network flow problem. Let G = (V, A) be a directed graph with vertex set V and arc set A. For each arc a ∈ A, we are given a nonnegative capacity c(a) for flow and a weight w(a) per unit flow. The maximum weight circulation problem is to find a flow ξ = (ξ(a) | a ∈ A) that maximizes the total weight Σ_{a∈A} w(a)ξ(a) subject to the capacity (feasibility) constraint

0 ≤ ξ(a) ≤ c(a)    (a ∈ A)

and the conservation constraint

∂ξ(v) = 0    (v ∈ V).    (2.52)


We denote by F the maximum weight of a feasible circulation.


Our concern here is how the weight -F depends on the problem parameters
(w,c). Namely, we are interested in the function F = F(w,c) in w e R/4 and
c 6 R+. We first look at convexity and concavity.

Proposition 2.21. F is convex in w and concave in c.

Proof. F = max{u>T£ | 7V£ = 0, 0 < £ < c] is the maximum of linear functions


in w and hence convex in w, where 7V£ = 0 represents the conservation constraint
(2.52). By linear programming duality (see Theorem 3.10 (2)), we obtain an alter-
native expression .F = min{cT?7 | N^p + rj >w,n> 0}, which shows the concavity
of F in c. D

We next consider submodularity and supermodularity. A function f : Rⁿ → R ∪ {+∞} is said to be submodular if

f(x) + f(y) ≥ f(x ∨ y) + f(x ∧ y)    (x, y ∈ Rⁿ)

and supermodular if

f(x) + f(y) ≤ f(x ∨ y) + f(x ∧ y)    (x, y ∈ Rⁿ),

where x ∨ y and x ∧ y are, respectively, the vectors of componentwise maxima and minima of x and y as defined in (2.16). With the economic terms substitutes and complements we have the following correspondences:

f is submodular ⟺ goods are substitutes,
f is supermodular ⟺ goods are complements,

where f is interpreted as representing a utility function.


Two arcs are said to be parallel if every simple cycle³³ containing both of them orients them in the opposite direction, and series if every simple cycle containing both of them orients them in the same direction. A set of arcs is said to be parallel if it consists of pairwise parallel arcs, and series if it consists of pairwise series arcs. With the notation w_P = (w(a) | a ∈ P), c_P = (c(a) | a ∈ P), w_S = (w(a) | a ∈ S), and c_S = (c(a) | a ∈ S), the following statements hold true, where the proof is given later.

Theorem 2.22. Let P be a parallel arc set and S a series arc set.
(1) F is submodular in w_P and in c_P.
(2) F is supermodular in w_S and in c_S.

³³Formally, a simple cycle is an alternating sequence (v₀, a₁, v₁, a₂, ..., v_{k−1}, a_k, v_k) of vertices v_i (i = 0, 1, ..., k) and arcs a_i (i = 1, ..., k) such that {∂⁺a_i, ∂⁻a_i} = {v_{i−1}, v_i} (i = 1, ..., k), v₀ = v_k, and v_i ≠ v_j (1 ≤ i < j ≤ k).

Combining Proposition 2.21 and Theorem 2.22 yields that

Thus, all combinations of convexity/concavity and submodularity/supermodularity arise in our network flow problem. This demonstrates that convexity and submodularity are mutually independent properties.
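These combinations can be observed on a toy instance of the maximum weight circulation problem. In the hypothetical three-arc network below (arcs a1, a2 from u to v and a3 from v to u), {a1, a2} is a parallel pair and {a1, a3} a series pair; with small integral capacities the optimum is attained at an integral circulation, so brute force over integer flows suffices.

```python
import itertools, random

# Tiny network: a1: u->v, a2: u->v, a3: v->u (a made-up example).
# {a1, a2} is a parallel pair, {a1, a3} a series pair.
CAP = (2, 2, 4)

def F(w):
    best = 0.0
    for t1, t2 in itertools.product(range(CAP[0] + 1), range(CAP[1] + 1)):
        t3 = t1 + t2                       # conservation at u and v
        if t3 <= CAP[2]:
            best = max(best, w[0] * t1 + w[1] * t2 + w[2] * t3)
    return best

random.seed(0)
ok_sub = ok_sup = True
for _ in range(500):
    w = [random.uniform(-3, 3) for _ in range(3)]
    # vary the parallel pair (a1, a2): F should be submodular in (w1, w2)
    wp = list(w)
    wp[0] = random.uniform(-3, 3); wp[1] = random.uniform(-3, 3)
    jn = [max(a, b) for a, b in zip(w, wp)]
    mt = [min(a, b) for a, b in zip(w, wp)]
    ok_sub &= F(w) + F(wp) >= F(jn) + F(mt) - 1e-9
    # vary the series pair (a1, a3): F should be supermodular in (w1, w3)
    ws = list(w)
    ws[0] = random.uniform(-3, 3); ws[2] = random.uniform(-3, 3)
    jn = [max(a, b) for a, b in zip(w, ws)]
    mt = [min(a, b) for a, b in zip(w, ws)]
    ok_sup &= F(w) + F(ws) <= F(jn) + F(mt) + 1e-9
assert ok_sub and ok_sup
```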
Although convexity and submodularity are mutually independent, the combinations of convexity/concavity and submodularity/supermodularity in (2.55) are not accidental phenomena but logical consequences that can be explained in terms of L♮-convexity and M♮-convexity. The function F is endowed with L♮-convexity and M♮-convexity, as follows, where the proof is given in section 2.3.2.

Theorem 2.23. Let P be a parallel arc set and S a series arc set.
(1) F is L♮-convex in w_P and M♮-concave in c_P.
(2) F is M♮-convex in w_S and L♮-concave in c_S.

In general, L♮-convexity implies submodularity by (SBF♮[R]) in the definition, whereas M♮-convexity implies supermodularity, as will be shown in Theorem 6.51. Accordingly, L♮-concavity implies supermodularity and M♮-concavity submodularity. With the aid of these general results on L♮-convex and M♮-convex functions, Theorem 2.23 provides us with a somewhat deeper understanding of (2.55). Namely, it is understood that

2.3.2 Technical Supplements

This section gives the proof of Theorem 2.23. We start with basic properties of parallel and series arc sets that we use in the proof. Let us call π : A → {0, ±1} a circuit if ∂π = 0 and supp⁺(π) ∪ supp⁻(π) forms a simple cycle.

Proposition 2.24. Let π be a circuit.
(1) |supp⁺(π) ∩ P| ≤ 1 and |supp⁻(π) ∩ P| ≤ 1 for a parallel arc set P.
(2) |supp⁺(π) ∩ S| = 0 or |supp⁻(π) ∩ S| = 0 for a series arc set S.

Proposition 2.25. Let S be a series arc set and π₁ and π₂ be circuits. If supp⁺(π₁) ∩ supp⁺(π₂) ∩ S ≠ ∅, there exists a circuit π such that supp⁺(π) ⊆ supp⁺(π₁) ∪ supp⁺(π₂), supp⁻(π) ⊆ supp⁻(π₁) ∪ supp⁻(π₂), and supp⁺(π) ∩ S = (supp⁺(π₁) ∪ supp⁺(π₂)) ∩ S.

Proof. Suppose a ∈ (supp⁺(π₂) \ supp⁺(π₁)) ∩ S. By an elementary graph argument we can find a circuit π′ such that supp⁺(π′) ⊆ supp⁺(π₁) ∪ supp⁺(π₂), supp⁻(π′) ⊆ supp⁻(π₁) ∪ supp⁻(π₂), and supp⁺(π′) ∩ S ⊇ (supp⁺(π₁) ∩ S) ∪ {a}. Repeating this we can find π. □

The main technical tool in the proof is the conformal decomposition³⁴ of a circulation ξ, which is a representation of ξ as a positive sum of circuits conformal to ξ; i.e.,

ξ = Σ^m_{i=1} β_i π_i,

where β_i > 0 and π_i : A → {0, ±1} is a circuit with supp⁺(π_i) ⊆ supp⁺(ξ) and supp⁻(π_i) ⊆ supp⁻(ξ) for i = 1, ..., m.
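A conformal decomposition can be computed greedily: walk along arcs in the direction of the sign of ξ (forward where ξ(a) > 0, backward where ξ(a) < 0) until a cycle closes, extract that circuit with the largest coefficient that keeps all signs, and recurse on the remainder. The following is a minimal sketch of this idea (assuming the input is a genuine circulation), not an optimized implementation.

```python
def conformal_decomposition(V, A, values):
    # xi maps each arc (tail, head) to its flow value
    xi = dict(zip(A, values))
    parts = []                                # list of (beta, pi) pairs
    while any(abs(f) > 1e-12 for f in xi.values()):
        # succ[v]: one arc through which residual flow leaves v, with its sign;
        # conservation guarantees every vertex reached this way has a successor
        succ = {}
        for (t, h), f in xi.items():
            if f > 1e-12:
                succ[t] = ((t, h), 1)
            elif f < -1e-12:
                succ[h] = ((t, h), -1)
        v = next(iter(succ))
        seen, walk = {}, []
        while v not in seen:                  # deterministic walk must close a cycle
            seen[v] = len(walk)
            arc, sgn = succ[v]
            walk.append((arc, sgn))
            v = arc[1] if sgn > 0 else arc[0]
        cycle = walk[seen[v]:]                # keep only the closed part of the walk
        beta = min(abs(xi[a]) for a, _ in cycle)
        pi = {a: s for a, s in cycle}         # the circuit, conformal by construction
        for a, s in cycle:
            xi[a] -= beta * s                 # signs shrink toward 0, never flip
        parts.append((beta, pi))
    return parts

# Example: xi is 2x a single circuit, with arc ("p","r") traversed backward.
parts = conformal_decomposition(["p", "q", "r"],
                                [("p", "q"), ("q", "r"), ("p", "r")],
                                [2.0, 2.0, -2.0])
assert parts == [(2.0, {("p", "q"): 1, ("q", "r"): 1, ("p", "r"): -1})]
```

Each extraction zeroes at least one arc, so at most |A| circuits are produced, matching the finiteness of the decomposition used in the proofs below.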

Proof of L♮-Convexity in w_P

The L♮-convexity of F in w_P is equivalent to the submodularity of F(w − w₀χ_P, c) in (w_P, w₀), which in turn is equivalent to

for a, b ∈ P with a ≠ b and λ, μ ∈ R₊.
To show (2.58) let ξ and ξ̂ be optimal circulations for w and w + λχ_a + μχ_b. We can establish (2.58) by constructing feasible circulations ξ_a and ξ_b such that

since this implies

where the left-hand side is bounded by F(w + λχ_a, c) + F(w + μχ_b, c) and the right-hand side is equal to F(w, c) + F(w + λχ_a + μχ_b, c). If ξ(a) ≤ ξ̂(a), we can take ξ_a = ξ̂ and ξ_b = ξ to meet (2.60). If ξ(b) ≤ ξ̂(b), we can take ξ_a = ξ and ξ_b = ξ̂ to meet (2.60). Otherwise, we make use of the conformal decomposition ξ − ξ̂ = Σ^m_{i=1} β_i π_i. Since a ∈ supp⁺(ξ − ξ̂), we may assume π_i(a) = 1 for i = 1, ..., ℓ and π_i(a) = 0 for i = ℓ + 1, ..., m. We have π_i(b) = 0 for i = 1, ..., ℓ by Proposition 2.24 (1), since P is parallel and {a, b} ⊆ supp⁺(ξ − ξ̂). Then ξ_a = ξ̂ + Σ^ℓ_{i=1} β_i π_i and ξ_b = ξ̂ + Σ^m_{i=ℓ+1} β_i π_i are feasible circulations that satisfy (2.60).
To show (2.59) let ξ and ξ̂ be optimal circulations for w and w + λχ_a − μχ_P. We can establish (2.59) by constructing feasible circulations ξ_a and ξ_P such that

³⁴More generally, the conformal decomposition is defined for a vector in a subspace in terms of elementary vectors of the subspace; see Iri [94] and Rockafellar [178].

since this implies

If ξ(a) ≤ ξ̂(a), we can take ξ_a = ξ̂ and ξ_P = ξ to meet (2.61). Otherwise, we use the conformal decomposition ξ − ξ̂ = Σ^m_{i=1} β_i π_i, in which we assume π_i(a) = 1 for i = 1, ..., ℓ and π_i(a) = 0 for i = ℓ + 1, ..., m. Since P is parallel, we have |supp⁻(π_i) ∩ P| ≤ 1 by Proposition 2.24 (1) and hence π_i(P) ≥ 0 for i = 1, ..., ℓ. Therefore, ξ_a = ξ̂ + Σ^ℓ_{i=1} β_i π_i and ξ_P = ξ̂ + Σ^m_{i=ℓ+1} β_i π_i are feasible circulations with the properties in (2.61).

Proof of M♮-Concavity in c_P

We prove the M♮-concavity of F in c_P by establishing (M♮-EXC[R]) for −F as a function in c_P. In our notation this reads as follows:

Let c₁, c₂ ∈ R₊^A be capacities with c₁(a′) = c₂(a′) for all a′ ∈ A \ P. For each a ∈ supp⁺(c₁ − c₂), there exist b ∈ supp⁻(c₁ − c₂) ∪ {0} and a positive number α₀ such that

for all α ∈ [0, α₀]_R, where χ₀ = 0.

Let ξ₁ and ξ₂ be optimal circulations for c₁ and c₂, respectively. We shall find α₀ > 0 and b ∈ supp⁻(c₁ − c₂) ∪ {0} such that, for any α ∈ [0, α₀]_R, there exist circulations ξ₁′ and ξ₂′ such that

If ξ₁(a) < c₁(a), we can take α₀ = c₁(a) − ξ₁(a), b = 0, ξ₁′ = ξ₁, and ξ₂′ = ξ₂ to meet (2.62). Suppose ξ₁(a) = c₁(a). We have ξ₁(a) = c₁(a) > c₂(a) ≥ ξ₂(a). Let π be a circuit such that a ∈ supp⁺(π) ⊆ supp⁺(ξ₁ − ξ₂) and supp⁻(π) ⊆ supp⁻(ξ₁ − ξ₂). Since P is parallel and a ∈ supp⁺(π), we have supp⁺(π) ∩ P = {a} and |supp⁻(π) ∩ P| ≤ 1 by Proposition 2.24 (1). If |supp⁻(π) ∩ P| = 1, define b by {b} = supp⁻(π) ∩ P; otherwise put b = 0. We can take α₀ > 0 such that α₀ ≤ |ξ₁(a′) − ξ₂(a′)| for all a′ ∈ supp⁺(π) ∪ supp⁻(π). Then ξ₁′ = ξ₁ − απ and ξ₂′ = ξ₂ + απ satisfy (2.62) if 0 ≤ α ≤ α₀.

Proof of M♮-Convexity in w_S

We prove the M♮-convexity of F in w_S by establishing (M♮-EXC[R]). In our notation this reads as follows:

Let w₁, w₂ ∈ R^A be weight vectors with w₁(a′) = w₂(a′) for all a′ ∈ A \ S. For each a ∈ supp⁺(w₁ − w₂), there exist b ∈ supp⁻(w₁ − w₂) ∪ {0} and a positive number α₀ such that

for all α ∈ [0, α₀]_R, where χ₀ = 0.


Let ξ₁ and ξ₂ be optimal circulations for w₁ and w₂, respectively, with ξ₁(a) minimum and ξ₂(a) maximum.

Proposition 2.26. There exists α₀ > 0 such that ξ₁ is optimal for w₁ − αχ_a and ξ₂ is optimal for w₂ + αχ_a for all α ∈ [0, α₀]_R.

Proof. For any circuit π such that π(a) = −1 and 0 ≤ ξ₁ + βπ ≤ c for some β > 0, we have ⟨w₁, ξ₁ + βπ⟩ < ⟨w₁, ξ₁⟩ by the choice of ξ₁. Let α₁ > 0 be the minimum of −⟨w₁, π⟩ over all such circuits π. Then ξ₁ is optimal for w₁ − αχ_a for all α ∈ [0, α₁]_R, since ⟨w₁ − αχ_a, ξ₁ + βπ⟩ ≤ ⟨w₁ − αχ_a, ξ₁⟩ for any β > 0 and circuit π such that 0 ≤ ξ₁ + βπ ≤ c. Similarly, let α₂ > 0 be the minimum of −⟨w₂, π⟩ over all circuits π such that π(a) = 1 and 0 ≤ ξ₂ + βπ ≤ c for some β > 0. Then ξ₂ is optimal for w₂ + αχ_a for all α ∈ [0, α₂]_R. Put α₀ = min(α₁, α₂). □

where the last equality is by Proposition 2.26. In what follows we assume ξ₁(a) < ξ₂(a).
By Proposition 2.24 (2), we can impose the further conditions on ξ₁ and ξ₂ that, for each b ∈ S \ {a}, ξ₁(b) is maximum among all optimal ξ₁ for w₁ with ξ₁(a) minimum, and ξ₂(b) is minimum among all optimal ξ₂ for w₂ with ξ₂(a) maximum.

Proposition 2.27. There exists α₀ > 0 such that ξ₁ is optimal for w₁ − α(χ_a − χ_b) and ξ₂ is optimal for w₂ + α(χ_a − χ_b) for all b ∈ S \ {a} and for all α ∈ [0, α₀]_R.

Proof. For any circuit π such that π(a) − π(b) = −1 for some b ∈ S \ {a} and 0 ≤ ξ₁ + βπ ≤ c for some β > 0, we have ⟨w₁, ξ₁ + βπ⟩ < ⟨w₁, ξ₁⟩ by the choice of ξ₁. Let α₁ > 0 be the minimum of −⟨w₁, π⟩ over all such circuits π. Then ξ₁ is optimal for w₁ − α(χ_a − χ_b) for all α ∈ [0, α₁]_R. Similarly, let α₂ > 0 be the minimum of −⟨w₂, π⟩ over all circuits π such that π(a) − π(b) = 1 for some b ∈ S \ {a} and 0 ≤ ξ₂ + βπ ≤ c for some β > 0. Then ξ₂ is optimal for w₂ + α(χ_a − χ_b) for all α ∈ [0, α₂]_R. Put α₀ = min(α₁, α₂). □

Proposition 2.27 implies

We want to find b ∈ supp⁻(w₁ − w₂) for which (2.63) is nonnegative.
We make use of the conformal decomposition ξ₂ − ξ₁ = Σ^m_{i=1} β_i π_i. Since S is series, we may assume, by Proposition 2.25, that

Proposition 2.28. There exists b ∈ (supp⁺(π₁) ∩ S) ∩ supp⁻(w₁ − w₂).

Proof. We have ⟨w₁, π₁⟩ ≤ 0, since ξ₁ is optimal for w₁ and 0 ≤ ξ₁ + β₁π₁ ≤ c. Similarly, we have ⟨w₂, −π₁⟩ ≤ 0. Hence,

Since w₁(a) − w₂(a) > 0 in this summation, we must have w₁(b) − w₂(b) < 0 for some b ∈ supp⁺(π₁) ∩ S. □

For b ∈ (supp⁺(π₁) ∩ S) ∩ supp⁻(w₁ − w₂) in Proposition 2.28, we have

which shows the nonnegativity of (2.63).

Proof of L♮-Concavity in c_S

The L♮-concavity of F in c_S is equivalent to the supermodularity of F(w, c − c₀χ_S) in (c_S, c₀), which in turn is equivalent to

for a, b ∈ S with a ≠ b and λ, μ ∈ R₊.
To show (2.64), let ξ_a and ξ_b be optimal circulations for c + λχ_a and c + μχ_b. We can establish (2.64) by constructing circulations ξ and ξ̂ such that

If ξ_a(a) ≤ c(a), we can take ξ = ξ_a and ξ̂ = ξ_b to meet (2.66). If ξ_b(b) ≤ c(b), we can take ξ = ξ_b and ξ̂ = ξ_a to meet (2.66). Otherwise, we have ξ_a(a) > c(a) ≥ ξ_b(a) and ξ_a(b) ≤ c(b) < ξ_b(b), and therefore a ∈ supp⁺(ξ_a − ξ_b) and b ∈ supp⁻(ξ_a − ξ_b). We make use of the conformal decomposition ξ_a − ξ_b = Σ^m_{i=1} β_i π_i, where we assume π_i(a) = 1 for i = 1, ..., ℓ and π_i(a) = 0 for i = ℓ + 1, ..., m. We have π_i(b) = 0 for i = 1, ..., ℓ by Proposition 2.24 (2), since S is series and a ∈ supp⁺(ξ_a − ξ_b) and b ∈ supp⁻(ξ_a − ξ_b). Then ξ = ξ_a − Σ^ℓ_{i=1} β_i π_i and ξ̂ = ξ_b + Σ^ℓ_{i=1} β_i π_i satisfy (2.66).
To show (2.65), let ξ_a and ξ_S be optimal circulations for c + λχ_a and c − μχ_S. We can establish (2.65) by constructing circulations ξ and ξ̂ such that

If ξ_a(a) ≤ c(a), we can take ξ = ξ_a and ξ̂ = ξ_S to meet (2.67). Otherwise, we have ξ_a(a) > c(a) ≥ ξ_S(a), and therefore a ∈ supp⁺(ξ_a − ξ_S). We use the conformal

decomposition ξ_a − ξ_S = Σ^m_{i=1} β_i π_i. Since S is series, we may assume by Proposition 2.25 that

and π_i(a) = 0 for i = ℓ + 1, ..., m. Then supp⁻(π_i) ∩ S = ∅ for i = 1, ..., ℓ. Noting Σ^ℓ_{i=1} β_i = ξ_a(a) − ξ_S(a) ≥ ξ_a(a) − c(a), let k be the smallest integer with Σ^k_{i=1} β_i ≥ ξ_a(a) − c(a) and define β′ = [ξ_a(a) − c(a)] − Σ^{k−1}_{i=1} β_i. Then ξ = ξ_a − Σ^{k−1}_{i=1} β_i π_i − β′π_k and ξ̂ = ξ_S + Σ^{k−1}_{i=1} β_i π_i + β′π_k satisfy (2.67), since ξ(a) = ξ_a(a) − Σ^{k−1}_{i=1} β_i − β′ = c(a), ξ̂(a) = ξ_S(a) + Σ^{k−1}_{i=1} β_i + β′ = ξ_S(a) + ξ_a(a) − c(a) ≤ c(a) + λ − μ, and, for any b ∈ supp⁺(π_k) ∩ S, we have

This completes the proof of Theorem 2.23.

2.4 Matroids
In section 1.3.2 we introduced the concept of base polyhedra in terms of an abstract
exchange axiom and mentioned that a matroid can be identified with a base polyhedron
having vertices of {0, 1}-vectors. To compensate for such an abstract definition
of matroids, we explain here some linear-algebraic facts behind the abstract axioms.
The key is the Grassmann–Plücker relation for determinants, qualitative analyses
of which lead to the concepts of matroids and valuated matroids.

2.4.1 From Matrices to Matroids


Suppose we are given a matrix, say,

where a₁, …, a₅ ∈ R³ denote the column vectors. Let V denote the set of its
column indices; we have V = {1, …, 5} in our example.
The concept of matroids is derived from a combinatorial consideration of linear
dependence among column vectors. We say that a subset J of V is independent if
the corresponding column vectors {a, | j <E J} are linearly independent. Since a
subset of an independent set is obviously independent, we may focus on maximal
independent sets (maximal with respect to set inclusion). A maximal independent

set is called a base and the family of bases (or base family) is denoted by B. In our
example we have

The base family B has a remarkable combinatorial property:


(B) For any J, J′ ∈ B and any i ∈ J \ J′, there exists j ∈ J′ \ J such
that J − i + j ∈ B and J′ + i − j ∈ B,
where J \ J′ = {k | k ∈ J, k ∉ J′} and J − i + j and J′ + i − j are shorthand
notations for (J \ {i}) ∪ {j} and (J′ ∪ {i}) \ {j}, respectively. For instance, take
J = {1, 2, 3} and J′ = {3, 4, 5} in our example. For i = 1 we can take j = 4 to
obtain J − i + j = {4, 2, 3} ∈ B and J′ + i − j = {3, 1, 5} ∈ B. The choice of j = 5
is not valid, since J − i + j = {5, 2, 3} ∉ B. The property (B) above is called the
(simultaneous) exchange property.
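The exchange argument above can be checked by brute force. The sketch below uses an illustrative 3 × 5 matrix of our own choosing (not the matrix displayed in the text), enumerates the base family via 3 × 3 determinants, and verifies the simultaneous exchange property (B) for every pair of bases:

```python
from itertools import combinations

def det3(cols):
    # determinant of a 3x3 matrix given by its three columns
    (a, d, g), (b, e, h), (c, f, i) = cols
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# illustrative 3x5 matrix (columns a1..a5); not the book's example matrix
A = {1: (1, 0, 0), 2: (0, 1, 0), 3: (0, 0, 1), 4: (1, 1, 0), 5: (0, 1, 1)}

# base family: 3-element column sets with nonzero determinant
B = [set(J) for J in combinations(sorted(A), 3)
     if det3([A[j] for j in J]) != 0]

def exchange_ok(J, Jp):
    # property (B): every i in J \ J' admits j in J' \ J with
    # J - i + j and J' + i - j both bases
    return all(
        any((J - {i}) | {j} in B and (Jp | {i}) - {j} in B for j in Jp - J)
        for i in J - Jp
    )

assert all(exchange_ok(J, Jp) for J in B for Jp in B)
print(len(B), "bases; simultaneous exchange property (B) holds")
```

For this matrix the instance in the text is mirrored: {1, 2, 3} and {3, 4, 5} are bases, while {2, 3, 5} is not.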
For the proof of the exchange property (B) we need to introduce the Grassmann–
Plücker relation, a fundamental fact in linear algebra. For J ⊆ V we denote³⁵ the
determinant of the submatrix A[J] = (a_j | j ∈ J) by det A[J]. The Grassmann–
Plücker relation is an identity

det A[J] · det A[J′] = Σ_{j ∈ J′\J} det A[J − i + j] · det A[J′ + i − j],   (2.69)

valid for any J, J′ ⊆ V and any i ∈ J \ J′ (the proof is sketched in Note 2.30). The
notation det A[J − i + j] here means the determinant of A[J] with column i replaced
with column j.
To prove (B), suppose that J, J′ ∈ B. Then the left-hand side of (2.69) is
distinct from zero, and therefore there exists a nonzero term, say, indexed by
j ∈ J′ \ J, in the summation on the right-hand side of (2.69). Then we have
det A[J − i + j] ≠ 0 and det A[J′ + i − j] ≠ 0; i.e., J − i + j ∈ B and J′ + i − j ∈ B.
This proves (B). We emphasize that the exchange property (B) is derived from
the Grassmann–Plücker relation by a qualitative consideration that distinguishes
between zero and nonzero, while disregarding the numerical information.
An alternative representation of linear independence among column vectors is
given by the rank function ρ : 2^V → Z defined by

ρ(X) = rank (a_j | j ∈ X)   (X ⊆ V).   (2.70)

The rank function has the following properties (proved in Note 2.31):

(R1) 0 ≤ ρ(X) ≤ |X| for X ⊆ V.
(R2) ρ(X) ≤ ρ(Y) for X ⊆ Y ⊆ V.
(R3) ρ(X) + ρ(Y) ≥ ρ(X ∪ Y) + ρ(X ∩ Y) for X, Y ⊆ V.
³⁵ We consider J such that A[J] is square and implicitly assume an ordering of the elements of
J, which affects the sign of the determinant.

The third property (R3) shows that ρ is a submodular function. The rank function
and the base family determine each other by

ρ(X) = max{|J ∩ X| : J ∈ B}   (X ⊆ V),   (2.71)
B = {J ⊆ V | |J| = ρ(J) = ρ(V)}.   (2.72)
From a given matrix we have thus derived a set family B ⊆ 2^V with property
(B) and a set function ρ : 2^V → Z with property (R) (i.e., (R1) to (R3)), where
V is the set of columns of the given matrix. It is all-important to realize that the
properties (B) and (R) are stated without reference to the given matrix and, as such,
they make sense as properties of a set family and a set function in general. Matroid
theory adopts these properties as abstract axioms and studies the combinatorial
structure implied by these axioms (and nothing else).
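These facts can be verified computationally for a small matrix. The sketch below (an illustrative matrix of our own, with exact rational arithmetic) computes ρ on every subset, checks (R1)–(R3), and confirms the standard translations between ρ and B: a base is a set J with |J| = ρ(J) = ρ(V), and ρ(X) is the maximum overlap of X with a base:

```python
from fractions import Fraction
from itertools import combinations

def rank(vectors):
    # rank via Gaussian elimination over the rationals (each vector is a row here)
    rows = [list(map(Fraction, v)) for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

# illustrative 3x5 matrix (columns a1..a5), not the book's example
A = {1: (1, 0, 0), 2: (0, 1, 0), 3: (0, 0, 1), 4: (1, 1, 0), 5: (0, 1, 1)}
V = sorted(A)
subsets = [frozenset(s) for k in range(len(V) + 1) for s in combinations(V, k)]
rho = {X: rank([A[j] for j in X]) for X in subsets}

# (R1) bounds, (R2) monotonicity, (R3) submodularity
assert all(0 <= rho[X] <= len(X) for X in subsets)
assert all(rho[X] <= rho[Y] for X in subsets for Y in subsets if X <= Y)
assert all(rho[X] + rho[Y] >= rho[X | Y] + rho[X & Y]
           for X in subsets for Y in subsets)

# bases are the sets J with |J| = rho(J) = rho(V); rho(X) = max |J ∩ X| over bases
B = [X for X in subsets if len(X) == rho[X] == rho[frozenset(V)]]
assert all(rho[X] == max(len(J & X) for J in B) for X in subsets)
print(len(B), "bases; (R1)-(R3) and the mutual translations verified")
```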
It turns out that a set family B satisfying axiom (B) and a set function ρ
satisfying axiom (R) are equivalent to each other. More precisely, we have the
following statement.

Theorem 2.29. The class of set functions ρ : 2^V → Z satisfying (R1), (R2), and
(R3) and the class of nonempty families B ⊆ 2^V satisfying (B) are in one-to-one
correspondence through the mutually inverse mappings (2.71) and (2.72).

In this sense, the two objects B and ρ represent one and the same combinatorial
structure, which is called a matroid. That is, a matroid is a pair (V, B) of a finite
set V and a family B of subsets of V that satisfies (B), or, alternatively, a matroid
is a pair (V, ρ) of a finite set V and a set function ρ on V that satisfies (R). We
may also denote a matroid by a triple (V, B, ρ). The set V is called the ground set,
B is the base family, and ρ is the rank function of the matroid. A member of B is
a base and a subset of a base is an independent set.
Though defined by such simple axioms, the concept of matroids is fundamental
and fruitful in studying combinatorial structures. The exchange property (B) above
is the germ of (B-EXC[Z]) treated in section 1.3.2 and (R3) is the submodularity
featured in section 1.3.1. The one-to-one correspondence between B and ρ stated
in Theorem 2.29 above is the germ of Theorem 1.9 that establishes the equivalence
between exchangeability and submodularity.

Note 2.30. The idea of the proof of the Grassmann–Plücker relation (2.69) is
indicated here for a 3 × 5 matrix

where a′_j = a_j (j = 1, 3, 4, 5). The generalized Laplace expansion applied to det A
yields

On the other hand, subtracting the lower half (three rows) from the upper half
(three rows) of A yields

which is obviously singular. Hence, det A = 0, and

establishing (2.69) for J = {1,2,3}, J' = {3,4,5}, and i = 1.

Note 2.31. We prove (R1), (R2), and (R3) for the rank function (2.70) associated
with a matrix. (R1) and (R2) are obvious. To prove (R3), let {a_j | j ∈ J_XY} (where
J_XY ⊆ X ∩ Y) be a base of {a_j | j ∈ X ∩ Y}. There exists some J_X ⊆ X \ Y such that
{a_j | j ∈ J_XY ∪ J_X} is a base of {a_j | j ∈ X}, since any set of independent vectors
can be augmented to a base. For the same reason, there exists some J_Y ⊆ Y \ X
such that {a_j | j ∈ J_XY ∪ J_X ∪ J_Y} is a base of {a_j | j ∈ X ∪ Y}. Then we have
|J_XY| = ρ(X ∩ Y), |J_XY| + |J_X| = ρ(X), |J_XY| + |J_X| + |J_Y| = ρ(X ∪ Y), and
|J_XY| + |J_Y| ≤ ρ(Y), where the last inequality is due to the independence of the
vectors indexed by J_XY ∪ J_Y. Hence follows (R3). ∎

2.4.2 From Polynomial Matrices to Valuated Matroids


In section 2.4.1 we abstracted the axiom of a matroid from the Grassmann–Plücker
relation for matrices. A similar argument for polynomial matrices leads us to the
concept of valuated matroids, which may be thought of as discrete concave functions.
Suppose we are given a polynomial matrix in variable s, say,

with the column set V = {1, 2, 3, 4}.


Since the determinant det A[J] is a polynomial in s, we can talk of its degree,
which we denote by ω(J):

ω(J) = deg_s det A[J],

where we put ω(J) = −∞ if det A[J] = 0 or A[J] is nonsquare (when det A[J] is
not defined). Using the notation B for the family of bases we have

B = {J ⊆ V | ω(J) ≠ −∞}.

We now look at the Grassmann–Plücker relation (2.69) with respect to the
degree in s. For J, J′ ∈ B, the degree of the left-hand side of (2.69) is equal to
ω(J) + ω(J′). This implies that at least one term on the right-hand side must have
degree not lower than this. Hence, the function ω has the following property:

(VM) For any J, J′ ∈ B and any i ∈ J \ J′, there exists j ∈ J′ \ J such
that J − i + j ∈ B, J′ + i − j ∈ B, and

ω(J) + ω(J′) ≤ ω(J − i + j) + ω(J′ + i − j).

The inequality can be strict due to possible cancellations of the highest-degree terms
on the right-hand side of (2.69).
In our example (2.73) we have det A[{1,2}] = det A[{3,4}] = 1, and hence
ω(J) = ω(J′) = 0 for J = {1,2} and J′ = {3,4}. For i = 1 ∈ J \ J′ we can choose
j = 3 ∈ J′ \ J, for which ω(J − i + j) + ω(J′ + i − j) = 2.
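The derivation of (VM) can be replayed numerically. The sketch below uses a hypothetical 2 × 4 polynomial matrix (not the book's (2.73)) with columns a₁ = (1, 0), a₂ = (0, 1), a₃ = (1, s), a₄ = (1, s²), computes ω(J) = deg det A[J] for all 2-element column sets, and checks the exchange axiom (VM) by brute force:

```python
from itertools import combinations

def pmul(p, q):
    # product of polynomials given as coefficient lists, lowest degree first
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def psub(p, q):
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def deg(p):
    nz = [i for i, a in enumerate(p) if a != 0]
    return max(nz) if nz else float('-inf')

# hypothetical 2x4 polynomial matrix; each entry is a coefficient list in s
A = {1: ([1], [0]), 2: ([0], [1]), 3: ([1], [0, 1]), 4: ([1], [0, 0, 1])}

def omega(J):
    # omega(J) = deg det A[J] for a 2-element column set J
    (p11, p21), (p12, p22) = A[J[0]], A[J[1]]
    return deg(psub(pmul(p11, p22), pmul(p12, p21)))

def swap(J, i, j):
    # J - i + j as a sorted tuple
    return tuple(sorted((set(J) - {i}) | {j}))

B = [J for J in combinations((1, 2, 3, 4), 2) if omega(J) != float('-inf')]

for J in B:
    for Jp in B:
        for i in set(J) - set(Jp):
            assert any(
                swap(J, i, j) in B and swap(Jp, j, i) in B
                and omega(J) + omega(Jp)
                    <= omega(swap(J, i, j)) + omega(swap(Jp, j, i))
                for j in set(Jp) - set(J)
            ), "(VM) fails"
print("(VM) holds;", {J: omega(J) for J in B})
```

Here B is the uniform matroid on four elements, and the degrees supply a nontrivial valuation on it.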
The concept of valuated matroids is obtained by adopting (VM) as an axiom.
Namely, a valuated matroid is a pair (V, ω) of a finite set V and a function
ω : 2^V → R ∪ {−∞} that satisfies (VM), where it is assumed that

B = {J ⊆ V | ω(J) ≠ −∞}   (2.75)

is nonempty.
Not surprisingly, valuated matroids are closely related to matroids. First,
(VM) for ω implies (B) for B. This means that (V, B) is a matroid if (V, ω) is a
valuated matroid. Accordingly, B is called the base family of the valuated matroid
(V, ω). It is also said that ω is a valuation of the matroid (V, B).
Next, the maximizers of ω form the base family of a matroid. This is because,
for two maximizers J and J′ with ω(J) = ω(J′) = max ω, we have ω(J − i + j) =
ω(J′ + i − j) = max ω in (VM), which means (B) for the family of maximizers of ω.
Furthermore, for any p = (p(i) | i ∈ V) ∈ R^V, the function ω[−p] : 2^V → R ∪ {−∞}
defined by

ω[−p](J) = ω(J) − Σ_{i∈J} p(i)   (J ⊆ V)

is a valuated matroid, and therefore, the maximizers of ω[−p] form the base family
of a matroid for each p ∈ R^V. This property, in turn, characterizes a valuated
matroid as follows.

Theorem 2.32. Let (V, B) be a matroid with ground set V and base family B. A
function ω : B → R is a valuation if and only if for any p : V → R the maximizers
of ω[−p] form the base family of a matroid.

Proof. This is a special case of Theorem 6.30 to be shown later. ∎

This theorem seems to suggest that we should compare valuated matroids to


concave functions and matroids to convex sets. Here is a connection of valuated

matroids to M-convexity. A set function ω : 2^V → R ∪ {−∞} can be identified with
a function f : Z^V → R ∪ {+∞} with dom_Z f ⊆ {0, 1}^V by

f(χ_J) = −ω(J)   (J ⊆ V),

where dom_Z f = {χ_J | J ∈ B} with B in (2.75). It is easy to observe that (VM) for
ω is equivalent to (M-EXC[Z]) for f. That is, ω is a valuated matroid if and only
if f is an M-convex function. For instance, the valuated matroid (V, ω) associated
with the polynomial matrix in (2.73) can be identified with an M-convex function

f(x) = 0 (x ∈ B₀),  −1 (x ∈ B \ B₀),  +∞ (x ∈ Z^V \ B),

where B = {(1,1,0,0), (0,0,1,1), (1,0,1,0), (1,0,0,1), (0,1,1,0), (0,1,0,1)} and B₀ =
{(1,1,0,0), (0,0,1,1)}.
In parallel with the generalization of the base family to a valuation ω, the
rank function of a matroid can be generalized as follows. Assuming that ω is an
integer-valued valuation, we define a function g : Z^V → Z ∪ {+∞} by

g(p) = max{ω(J) + Σ_{j∈J} p(j) | J ∈ B}   (p ∈ Z^V).   (2.78)

In our example of (2.73) we have

As is shown in Note 2.34, the function g is submodular over the integer lattice:

g(p) + g(q) ≥ g(p ∨ q) + g(p ∧ q)   (p, q ∈ Z^V),   (2.79)

which is a generalization of the submodularity (R3) of the rank function of a matroid.
The connection to a matroid rank function is conspicuous in the special case
where ω(J) = 0 for all J ∈ B. Then we have

g(χ_X) = ρ(X)   (X ⊆ V)

by (2.78) and (2.71), where ρ is the rank function of the underlying matroid (V, B).

Note 2.33. We have started with a polynomial matrix to define a function ω with
property (VM). As is evident from the proof, the same construction works for a
matrix over a non-Archimedean valuated field (van der Waerden [205]). The name
"valuated matroid" comes from this fact. ∎

Note 2.34. The submodularity (2.79) of the function g in (2.78) can be proved as
follows. First we show

g(χ_i + χ_j) + g(0) ≤ g(χ_i) + g(χ_j),   (2.80)

which is a special case of (2.79) for p = χ_i and q = χ_j. Take I, J ∈ B with
g(0) = ω(I) and g(χ_i + χ_j) = ω(J) + |J ∩ {i, j}|. If |J ∩ {i, j}| ≤ 1, we have
g(χ_i + χ_j) = max(g(χ_i), g(χ_j)), which implies (2.80). The case of |J ∩ {i, j}| =
|I ∩ {i, j}| = 2 is also easy. Suppose that |J ∩ {i, j}| = 2 and |I ∩ {i, j}| ≤ 1, and
assume j ∈ J \ I without loss of generality. By (VM), there exists k ∈ I \ J for which
ω(I) + ω(J) ≤ ω(I + j − k) + ω(J − j + k). This establishes (2.80), since ω(I) = g(0),
ω(J) = g(χ_i + χ_j) − 2, ω(I + j − k) ≤ g(χ_j) − 1, and ω(J − j + k) ≤ g(χ_i) − 1.
For p = (p ∧ q) + χ_i and q = (p ∧ q) + χ_j (i ≠ j), the same argument applies to
ω′(J) = ω(J) + Σ_{j∈J}(p ∧ q)(j) to prove (2.79). The general case can be treated by
induction on ‖p − q‖₁; we may assume supp⁺(p − q) ≠ ∅ and add the inequalities
(2.79) for (p − χ_i, q) and for (p, (p ∨ q) − χ_i) with i ∈ supp⁺(p − q). ∎

Bibliographical Notes
M-matrices are well-studied objects in applied mathematics and a comprehensive
treatment of their mathematical properties can be found in Berman-Plemmons
[9]. The connection of symmetric M-matrices to L-/M-convex quadratic functions
presented in section 2.1 is mostly based on Murota [147], whereas the general case
described in section 2.1.4 is due to Murota-Shioura [155]. See Fukushima-Oshima-
Takeda [71] and Doyle-Snell [39] for connections to probability theory.
For network flow problems in combinatorial optimization, Ford-Fulkerson [53]
is the classic, whereas Ahuja-Magnanti-Orlin [1] describes recent algorithmic devel-
opments. Thorough treatments of the network flow problem on the basis of convex
analysis can be found in Iri [94] and Rockafellar [178], the former putting more
emphasis on physical issues and the latter more mathematical. In particular, the
functions / and g in (2.37) and (2.38) are considered in the case of |T| = 2 in
[94] and [178]. The variational formulations are also discussed in Brayton-Moser
[19] and Clay [25]. The terminologies of current potential and voltage potential are
taken from [19], though they seem to be more often called content and cocontent.
M-convexity and L-convexity in the network flow problem are pointed out in Murota
[140], [141], [142].
Substitutes and complements in network flows are discussed in Gale-Politof
[73], Granot-Veinott [79], and Shapley [184], [185]. In particular, Theorem 2.22 is
due to Gale-Politof [73]. The connection to M-convexity and L-convexity (Theorem
2.23) is due to Murota-Shioura [158].
A number of books on matroids are available: Oxley [170], Welsh [211], and
White [212], [213], [214] are standard mathematical textbooks; Recski [175] realizes
a successful balance between theory and application; and Murota [146] empha-
sizes linear-algebraic motivations. For optimization on matroids, see, e.g., Cook-
Cunningham-Pulleyblank-Schrijver [26], Faigle [48], Korte-Vygen [115], Lawler
[119], and Schrijver [183]. Key papers in the development of matroid theory, in-
cluding Whitney [218], are collected in Kung [116]. Nakasawa [164] gives simple

exchange axioms for matroids. The simultaneous exchange property for matroids
is due to Brualdi [20]. The concept of valuated matroids is due to Dress-Wenzel
[41], [42]. Chapter 5 of [146] is a systematic presentation of the theory of valuated
matroids including duality. Circuit axioms are investigated in Murota-Tamura
[159] and constrained optimizations in Althöfer-Wenzel [2]. Oriented matroids are
another ramification of matroids, for which a comprehensive monograph of Björner-
Las Vergnas-Sturmfels-White-Ziegler [16] is available.
Chapter 3

Convex Analysis,
Linear Programming,
and Integrality

This chapter provides technical elements that are needed in subsequent chapters.
Some basic facts in convex analysis and linear programming are given in the first
two sections. The following two sections address integrality issues, i.e., integrality
for a pair of integral polyhedra and the concept of integrally convex functions.

3.1 Convex Analysis


A minimal set of prerequisites from convex analysis is given in this section, while
the reader is referred to the textbooks listed in the bibliographical notes at the end
of this chapter for comprehensive accounts.
For two vectors a = (a(i))_{i=1}^n and b = (b(i))_{i=1}^n ∈ (R ∪ {±∞})^n we define

[a, b] = {x ∈ R^n | a(i) ≤ x(i) ≤ b(i) (i = 1, …, n)},
(a, b) = {x ∈ R^n | a(i) < x(i) < b(i) (i = 1, …, n)},

where, if a(i) = −∞, for example, a(i) ≤ x(i) is to be understood as −∞ < x(i).
Sets such as [a, b] and (a, b) are referred to as closed intervals and open intervals,
respectively.
For a function f : R^n → R ∪ {±∞} we define

dom f = {x ∈ R^n | −∞ < f(x) < +∞},

which is the effective domain of f.


A function f : R^n → R ∪ {+∞} is said to be convex if it satisfies

f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y)   (x, y ∈ R^n, 0 ≤ λ ≤ 1).   (3.4)

Note that −∞ is excluded from the possible function values of a convex function
and that the inequality (3.4) is satisfied, by convention, if it is in the form of
+∞ ≤ +∞. A convex function with a nonempty effective domain is called a
proper convex function. A function is strictly convex if it satisfies (3.4) with strict
inequalities, i.e., if

f(λx + (1 − λ)y) < λf(x) + (1 − λ)f(y)   (x, y ∈ dom f, x ≠ y, 0 < λ < 1).

A function h : R^n → R ∪ {−∞} is concave if −h is convex, i.e., if

h(λx + (1 − λ)y) ≥ λh(x) + (1 − λ)h(y)   (x, y ∈ R^n, 0 ≤ λ ≤ 1).

A subset S of R^n is a convex set if λx + (1 − λ)y ∈ S for any x, y ∈ S and
0 ≤ λ ≤ 1, where an empty set is a convex set. A set S is a cone if it satisfies

λx ∈ S   (x ∈ S, λ > 0).

A convex cone is a cone that is convex and a set S is a convex cone if and only if
it satisfies the condition

x + y ∈ S,  λx ∈ S   (x, y ∈ S, λ > 0).

A convex polyhedron is a typical convex set S described by a finite number of linear
inequalities as

S = {x ∈ R^n | Σ_{j=1}^n a_ij x(j) ≤ b_i (i = 1, …, m)},   (3.10)

where a_ij ∈ R and b_i ∈ R (i = 1, …, m, j = 1, …, n). If b_i = 0 for all i, then S is
a convex cone.
For a finite number of points x₁, …, x_m in a set S, a point represented as

x = Σ_{i=1}^m λ_i x_i

with nonnegative coefficients λ_i (1 ≤ i ≤ m) with unit sum (Σ_{i=1}^m λ_i = 1) is called
a convex combination of those points. The convex closure of S, denoted as S̄, is
defined to be the set of all possible convex combinations of a finite number of points
of S. If S is convex, any convex combination of any finite set of points of S belongs
to S, and vice versa, and therefore S is convex if and only if S = S̄.
For a set S, the intersection of all the convex sets containing S is the smallest
convex set containing S, which is called the convex hull of S. The convex hull of S
coincides with the convex closure of S. The convex hull of a set S is not necessarily
closed (in the topological sense). The smallest closed convex set containing S is
called the closed convex hull of S. For a finite set S, the convex hull is always
closed.
The affine hull of a set S is defined to be the smallest affine set (a translation
of a linear space) containing S and is denoted by aff S. The relative interior of S,
denoted as ri S, is the set of points x ∈ S such that {y ∈ R^n | ‖y − x‖ < ε} ∩ aff S
is contained in S for some ε > 0. In other words, the relative interior of S is the
set of the interior points of S with respect to the topology induced from aff S.
We have so far defined convex functions and convex sets independently, but
they are actually closely related to each other. The indicator function of a set
S ⊆ R^n is a function δ_S : R^n → {0, +∞} defined by

δ_S(x) = 0 (x ∈ S),  +∞ (x ∉ S).

Then, as is easily seen,

S is a convex set ⟺ δ_S is a convex function.   (3.13)

This shows how the concept of convex sets can be defined in terms of that of convex
functions. Conversely, convex functions can be defined in terms of convex sets. The
epigraph of a function f : R^n → R ∪ {+∞}, denoted as epi f, is the set of points
in R^n × R lying above the graph of Y = f(x). Namely,

epi f = {(x, Y) ∈ R^{n+1} | Y ≥ f(x)}.   (3.14)

Then we have

f is a convex function ⟺ epi f is a convex set,

which shows that the convexity concept for functions can be induced from that for
sets. In passing, we mention that a function f is said to be closed convex if epi f is
a closed convex set in R^{n+1}.
A (global) minimizer of f is a point x such that f(x) ≤ f(y) for all y. The
set of the minimizers of f, denoted by

argmin f = {x ∈ R^n | f(x) ≤ f(y) (∀y ∈ R^n)},

is a convex set for a convex function f. A global minimizer of a convex function
can be characterized by local minimality (Theorem 1.1).
For a family of convex functions {f_k | k ∈ K}, indexed by K, the pointwise
maximum, f(x) = sup{f_k(x) | k ∈ K}, is again a convex function, where the index
set K here may possibly be infinite. In particular, the maximum of a finite or
infinite number of affine functions

f(x) = sup{⟨p_k, x⟩ + α_k | k ∈ K}   (3.17)

is a convex function, where α_k ∈ R and p_k ∈ R^n for k ∈ K and

⟨p, x⟩ = Σ_{i=1}^n p(i)x(i)

designates the inner product³⁶ of p = (p(i))_{i=1}^n and x = (x(i))_{i=1}^n.

³⁶ More precisely, ⟨p, x⟩ is not so much an inner product as a pairing, since p and x belong to
different (mutually dual) spaces.

A function defined on R^n is said to be polyhedral convex if its epigraph is a
convex polyhedron in R^{n+1}. A polyhedral convex function is exactly such a function
that can be represented as the maximum of a finite number of affine functions (i.e.,
(3.17) with finite K) on an effective domain represented as (3.10). In the case
of n = 1 (univariate case), a polyhedral convex function is nothing but a convex
piecewise linear function consisting of a finite number of linear pieces. We denote
by C[R → R] the family of univariate polyhedral convex functions.
The sum of two functions f_k : R^n → R ∪ {+∞} (k = 1, 2) is a function
f₁ + f₂ : R^n → R ∪ {+∞} defined naturally by

(f₁ + f₂)(x) = f₁(x) + f₂(x),   (3.19)

and their infimal convolution is a function f₁ □ f₂ : R^n → R ∪ {±∞} defined by

(f₁ □ f₂)(x) = inf{f₁(x₁) + f₂(x₂) | x₁ + x₂ = x, x₁, x₂ ∈ R^n}.   (3.20)

The sum of two convex functions is convex, and the infimal convolution of two
convex functions is convex if it does not take the value of −∞. If f₁ and f₂ are
the indicator functions of sets S₁ and S₂, then f₁ + f₂ and f₁ □ f₂ are the indicator
functions of the intersection S₁ ∩ S₂ and the Minkowski sum S₁ + S₂, respectively,
where

S₁ + S₂ = {x₁ + x₂ | x₁ ∈ S₁, x₂ ∈ S₂}.   (3.21)

Modifying a function by a linear function is a fundamental operation. For a
function f and a vector p, we denote by f[−p] the function defined by

f[−p](x) = f(x) − ⟨p, x⟩   (x ∈ R^n).   (3.22)

This is a convex function for f convex.


The subdifferential of a function f at a point x ∈ dom f is defined to be the
set

∂_R f(x) = {p ∈ R^n | f(y) − f(x) ≥ ⟨p, y − x⟩ (∀y ∈ R^n)}.   (3.23)

Note that p ∈ ∂_R f(x) if and only if x ∈ argmin f[−p]. Being the intersection of
(infinitely many) half-spaces indexed by y, ∂_R f(x) is convex (possibly empty) for
any f and any x. The set ∂_R f(x) is nonempty for f convex and x in the relative
interior of dom f. An element of ∂_R f(x) is called a subgradient of f at x. If f
is convex and differentiable at x, the subdifferential ∂_R f(x) consists of a single
element, which is the gradient ∇f = (∂f/∂x(i))_{i=1}^n of f at x.
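The equivalence p ∈ ∂_R f(x) ⟺ x ∈ argmin f[−p] is easy to test numerically. For f(x) = |x| at x = 0 this recovers the subdifferential [−1, 1]; the grid and step sizes below are arbitrary illustrative choices:

```python
# f(x) = |x|: p is a subgradient at x = 0 iff 0 minimizes f[-p](x) = f(x) - p*x.
xs = [i / 100 for i in range(-300, 301)]

def zero_minimizes(p):
    # f[-p](x) >= f[-p](0) = 0 for every grid point
    return all(abs(x) - p * x >= 0 for x in xs)

subgradients = [p / 10 for p in range(-20, 21) if zero_minimizes(p / 10)]
assert subgradients == [p / 10 for p in range(-10, 11)]   # exactly [-1, 1]
print("subdifferential of |x| at 0 spans [%g, %g]"
      % (min(subgradients), max(subgradients)))
```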
The directional derivative of a function f at a point x ∈ dom f in a direction
d ∈ R^n is defined by

f′(x; d) = lim_{α↓0} (f(x + αd) − f(x))/α   (3.24)

when this limit (finite or infinite) exists, where α ↓ 0 means that α tends to 0 from
the positive side (α > 0). For convex f, the limit exists for all d, and f′(x; d) is a
convex function in d. For polyhedral convex f, there exists ε > 0, independent of
x ∈ dom f, such that

Figure 3.1. Conjugate function (Legendre-Fenchel transform).

The conjugate (or convex conjugate) of a function f : R^n → R ∪ {+∞}, where
dom f ≠ ∅ is assumed, is a function f* : R^n → R ∪ {+∞} defined by

f*(p) = sup{⟨p, x⟩ − f(x) | x ∈ R^n}   (p ∈ R^n).   (3.26)

This is a convex function in p, being the maximum of (infinitely many) affine functions
in p indexed by x. The function f* is also called the (convex) Legendre-Fenchel
transform of f, and the mapping f ↦ f* is referred to as the (convex) Legendre-
Fenchel transformation.
In the favorable situation where f is a smooth convex function and the supremum
in (3.26) is attained by a unique x = x(p) for each p, we have

f*(p) = ⟨p, x(p)⟩ − f(x(p)),   (3.27)

where x = x(p) is determined as the solution to the equation ∇f(x) = p. This
hints at a geometrical interpretation of the conjugate function. In the case of n = 1
(see Fig. 3.1), for simplicity, the tangent line to the graph Y = f(x) with slope p
intersects the Y-axis at a point with the Y-coordinate equal to −f*(p).
Similarly, the concave conjugate of a function h : R^n → R ∪ {−∞}, where
dom h ≠ ∅, is a function h° : R^n → R ∪ {−∞} defined by

h°(p) = inf{⟨p, x⟩ − h(x) | x ∈ R^n}   (p ∈ R^n).   (3.28)

Note that h°(p) = −(−h)*(−p).

Example 3.1. For a convex function

f(x) = x log x (x > 0),  0 (x = 0),  +∞ (x < 0),

the conjugate is given by f*(p) = exp(p − 1). This can be verified by a simple
calculation based on (3.27). ∎
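A direct numerical check of this example (for f(x) = x log x on x > 0, the standard function whose conjugate is exp(p − 1)): approximating the supremum in (3.26) on a fine grid reproduces exp(p − 1). The grid bounds are arbitrary but wide enough to contain each maximizer x(p) = exp(p − 1):

```python
import math

# f(x) = x*log x (x > 0); grid approximation of f*(p) = sup_x (p*x - f(x))
xs = [i / 1000 for i in range(1, 20001)]   # grid on (0, 20]

def conj(p):
    return max(p * x - x * math.log(x) for x in xs)

for p in (-1.0, 0.0, 0.5, 1.0, 2.0):
    assert abs(conj(p) - math.exp(p - 1)) < 1e-3
print("f*(p) = exp(p - 1) confirmed on sample slopes")
```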

For a function f, we may think of (f*)*, the conjugate of the conjugate of f,
which is called the biconjugate of f and denoted as f**. The biconjugate of f is the
largest closed convex function that is dominated pointwise by f. In particular, the
biconjugate δ_S** of the indicator function δ_S of a set S is the indicator function of
the closed convex hull of S.

Theorem 3.2. The Legendre-Fenchel transform f* is a closed proper convex function
for any f with dom f ≠ ∅, and f** = f for a closed proper convex function f.
Hence, the Legendre-Fenchel transformation f ↦ f* gives a symmetric one-to-one
correspondence in the class of all closed proper convex functions.

As a consequence of Theorem 3.2 and the definition (3.23), we obtain the
relationships

p ∈ ∂_R f(x) ⟺ x ∈ ∂_R f*(p) ⟺ f(x) + f*(p) = ⟨p, x⟩

for a closed proper convex function f and vectors x, p ∈ R^n.


The conjugate δ_S* of the indicator function δ_S of a set S ⊆ R^n is expressed
as

δ_S*(p) = sup{⟨p, x⟩ | x ∈ S}   (p ∈ R^n),

which is the support function of S. The support function of a nonempty set is
a positively homogeneous closed proper convex function, where a function g, in
general, is said to be positively homogeneous if

g(λp) = λg(p)

holds for any λ > 0 and p ∈ R^n (this condition yields g(0) = 0 if dom g ≠ ∅).
Theorem 3.2 implies a one-to-one correspondence between closed convex sets and
positively homogeneous closed proper convex functions. In this sense, positively
homogeneous convex functions are convex sets in disguise. For a closed convex
function f and a point x in the relative interior of dom f, for example, the directional
derivative f′(x; d) is a positively homogeneous closed proper convex function in d
and it coincides with the support function of the subdifferential ∂_R f(x):

f′(x; d) = sup{⟨p, d⟩ | p ∈ ∂_R f(x)}.

The support function of a convex cone S ⊆ R^n agrees with the indicator function
of another convex cone,

S* = {p ∈ R^n | ⟨p, x⟩ ≤ 0 (∀x ∈ S)},

called the polar cone of S. By Theorem 3.2, (S*)* = S for a closed convex cone S.
When S is represented as

Figure 3.2. Separation for convex sets.

with a_k ∈ R^n (k = 1, …, m), we have

Note 3.3. A bounded polyhedron can be represented in two different ways: as
the convex hull of the vertices (vertex-oriented representation) and as the intersection
of finitely many half-spaces described by linear inequalities (face-oriented
representation). Let S be a bounded polyhedron, S° be the set of its vertices, and

be a nonredundant representation of S. Then we have b_k = δ_{S°}*(p_k). This shows
that the translation between the two representations, S° ⟷ {(b_k, p_k)}, can be
regarded as a special case of the Legendre-Fenchel transformation. In fact we have
already seen this phenomenon in Theorem 1.9, which gives two equivalent characterizations
of base polyhedra, one by exchangeability and the other by submodularity.
The exchangeability (B-EXC[Z]) is for vertices and the submodularity for faces;
ρ(X) in Theorem 1.9 corresponds to b_k in the present notation (and χ_X = p_k). Recall
that we formulated this as the Legendre-Fenchel conjugacy in Theorem 1.10.
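The identity b_k = δ_{S°}*(p_k) can be confirmed on a toy polyhedron. For the square S = [−1, 1]² (an illustrative choice), the support function of the vertex set evaluated at each outer face normal returns exactly the right-hand side 1 of the corresponding inequality:

```python
# vertices of the square S = [-1, 1]^2
vertices = [(sx, sy) for sx in (-1, 1) for sy in (-1, 1)]

def support(p):
    # support function of the vertex set S° at p
    return max(p[0] * v[0] + p[1] * v[1] for v in vertices)

# nonredundant face description <p_k, x> <= b_k of the square
faces = {(1, 0): 1, (-1, 0): 1, (0, 1): 1, (0, -1): 1}
assert all(support(p) == b for p, b in faces.items())
print("b_k = support of the vertex set at p_k, for every face")
```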

The duality principle constitutes the core of convex analysis. It can be stated
in many different forms, but we focus here on separation theorems (for sets and for
functions) and the Fenchel duality theorem.
The following is the separation theorem for convex sets (see Fig. 3.2).

Theorem 3.4 (Separation for convex sets). Let S₁, S₂ ⊆ R^n be nonempty convex
sets.

Figure 3.3. Separation for convex and concave functions.

(1) If S₁ ∩ S₂ = ∅, there exists a nonzero vector p* ∈ R^n such that

If, in addition, S₁ and S₂ are closed and at least one of them is bounded, then the
inequality ≥ above can be replaced with strict inequality >.
(2) ri S₁ ∩ ri S₂ = ∅ if and only if there exists a vector p* ∈ R^n such that (3.37)
holds and

(3) If S₁ is polyhedral, S₁ ∩ ri S₂ = ∅ if and only if there exists p* ∈ R^n such that
(3.37) holds and

Proof. See Theorems 11.3 and 20.2 and Corollary 11.4.2 of Rockafellar [176]. ∎

The separation theorem for convex functions, illustrated in Fig. 3.3, asserts
the existence of an affine function that lies between a convex function and a concave
function.

Theorem 3.5 (Separation for convex functions). Let f : R^n → R ∪ {+∞} be a
proper convex function and h : R^n → R ∪ {−∞} a proper concave function, and
assume that (a1) or (a2) below is satisfied:
(a1) ri(dom f) ∩ ri(dom h) ≠ ∅,
(a2) f and h are polyhedral and dom f ∩ dom h ≠ ∅.
If f(x) ≥ h(x) (∀x ∈ R^n), there exist α* ∈ R and p* ∈ R^n such that

f(x) ≥ α* + ⟨p*, x⟩ ≥ h(x)   (∀x ∈ R^n).   (3.40)

Proof. The proof is based on Theorem 3.4 applied to epigraphs. See Corollary 5.1.6
in Stoer-Witzgall [194] and the proof of Theorem 31.1 in Rockafellar [176]. ∎

Another expression of the duality principle is in the form of the Fenchel duality.
This is a min-max relation between a pair of convex function / and concave function
h and their conjugate functions /* and h°. We include a proof to demonstrate the
equivalence of the Fenchel duality and the separation for functions.

Theorem 3.6 (Fenchel duality). Let f : R^n → R ∪ {+∞} be a proper convex
function and h : R^n → R ∪ {−∞} a proper concave function, and assume that at
least one of the following four conditions (a1)-(b2) is satisfied:
(a1) ri(dom f) ∩ ri(dom h) ≠ ∅,
(a2) f and h are polyhedral, and dom f ∩ dom h ≠ ∅,
(b1) f and h are closed³⁷, and ri(dom f*) ∩ ri(dom h°) ≠ ∅,
(b2) f and h are polyhedral, and dom f* ∩ dom h° ≠ ∅.
Then it holds that

inf{f(x) − h(x) | x ∈ R^n} = sup{h°(p) − f*(p) | p ∈ R^n}.   (3.41)

Moreover, if this common value is finite, the supremum is attained by some p ∈
dom f* ∩ dom h° under the assumption of (a1) or (a2), and the infimum is attained
by some x ∈ dom f ∩ dom h under the assumption of (b1) or (b2).

Proof. By the definitions (3.26) and (3.28) of the conjugate functions, we have

f(x) − h(x) ≥ h°(p) − f*(p)

for any x and p. This shows inf ≥ sup in (3.41). Hence (3.41) holds if inf = −∞ or
sup = +∞. In what follows we assume inf > −∞ and sup < +∞.
Suppose that (a1) or (a2) is satisfied. Then inf in (3.41) is of finite value, say
λ, and Theorem 3.5 applies to (f − λ, h) and yields some α* ∈ R and p* ∈ R^n
such that

f(x) − λ ≥ α* + ⟨p*, x⟩ ≥ h(x)   (∀x ∈ R^n).

This means that f*(p*) ≤ −α* − λ and h°(p*) ≥ −α*, implying inf = λ ≤ h°(p*) −
f*(p*) ≤ sup and hence (3.41). This also shows that p* attains the supremum.
In the remaining case where (b1) or (b2) is satisfied, we can use a similar
argument for (f*, h°) on the basis of the identities (f*)* = f and (h°)° = h shown
in Theorem 3.2. In the case (b2) note that the conjugate function of a polyhedral
convex function is again polyhedral. ∎

We note that, if the supremum in (3.41) is attained by p = p*, then

³⁷ By this we mean that f and −h are closed convex functions.

Example 3.7. The separation theorem and the Fenchel duality theorem are illustrated
for the convex function

f(x) = x log x (x > 0),  0 (x = 0),  +∞ (x < 0)

and the concave function h(x) = −f(−x). The graphs of Y = f(x) and Y = h(x) are
tangent to the Y-axis at the origin (0, 0), and therefore there exists no separating
affine function, although f(x) ≥ h(x) (∀x). This does not contradict Theorem
3.5, since neither (a1) nor (a2) is satisfied. This shows the importance of the
conditions (a1) and (a2) in Theorem 3.5. The conjugate functions are given by
f*(p) = exp(p − 1) and h°(p) = −exp(p − 1), and hence

h°(p) − f*(p) = −2 exp(p − 1).

Therefore, the infimum and the supremum in the Fenchel duality (3.41) are both
equal to 0; the infimum is attained by x = 0, whereas the supremum is not attained.
Note that the condition (b1) in Theorem 3.6 is met. ∎

Example 3.8. The infimum and the supremum in the Fenchel duality (3.41) can
be distinct if none of the conditions (a1)-(b2) in Theorem 3.6 is satisfied. For the
convex function f and the concave function h in x = (x(1), x(2)) defined by

we have inf = 0 and sup = −1 in (3.41). Note that dom f* = {p | p(2) ≤ 0} and
dom h° = {p | p(1) ≥ 0, p(2) ≥ 0}, which shows that ri(dom f*) ∩ ri(dom h°) = ∅,
the failure of condition (b1) in Theorem 3.6. ∎
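The two sides of (3.41) can also be compared numerically for a well-behaved pair, say f(x) = x² and h(x) = −(x − 1)² (an illustrative choice, not one of the examples above; both sides equal 1/2, attained at x = 1/2 and p = 1):

```python
# f(x) = x^2, h(x) = -(x - 1)^2; Fenchel duality gives inf = sup = 1/2.
xs = [i / 1000 for i in range(-5000, 5001)]
ps = [i / 100 for i in range(-300, 301)]

def f(x): return x * x
def h(x): return -(x - 1) ** 2
def fstar(p): return max(p * x - f(x) for x in xs)   # = p^2 / 4
def hcirc(p): return min(p * x - h(x) for x in xs)   # = p - p^2 / 4

primal = min(f(x) - h(x) for x in xs)                # attained at x = 1/2
dual = max(hcirc(p) - fstar(p) for p in ps)          # attained at p = 1
assert abs(primal - 0.5) < 1e-6 and abs(dual - 0.5) < 1e-4
print("inf(f - h) and sup(h° - f*) both approximately 1/2")
```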

The addition (3.19) and the infimal convolution (3.20) are conjugate operations
with respect to the Legendre-Fenchel transformation. For proper convex
functions f₁ and f₂ we have

(f₁ □ f₂)* = f₁* + f₂*,   (f₁ + f₂)* = f₁* □ f₂*,

where the latter is true under the assumption that ri(dom f₁) ∩ ri(dom f₂) ≠ ∅.
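The first identity, (f₁ □ f₂)* = f₁* + f₂*, can be checked on a grid, e.g., for f₁(x) = |x| and f₂(x) = x² (whose infimal convolution is the Huber-like function):

```python
# grid check of (f1 □ f2)* = f1* + f2* for f1(x) = |x|, f2(x) = x^2
xs = [i / 100 for i in range(-500, 501)]

def f1(x): return abs(x)
def f2(x): return x * x

# precompute the infimal convolution on the grid
ic = {x: min(f1(x1) + f2(x - x1) for x1 in xs) for x in xs}

def conj(vals, p):
    # grid conjugate of a function given as a dict {x: value}
    return max(p * x - v for x, v in vals.items())

F1 = {x: f1(x) for x in xs}
F2 = {x: f2(x) for x in xs}
for p in (-0.9, -0.5, 0.0, 0.3, 0.9):
    assert abs(conj(ic, p) - (conj(F1, p) + conj(F2, p))) < 1e-2
print("(f1 □ f2)* = f1* + f2* confirmed on sample slopes")
```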

3.2 Linear Programming


Linear programming is, undoubtedly, the most important subclass of convex
optimization problems. Some fundamental facts about duality and integrality in
linear programming are described here.

We start with a fundamental fact about linear inequality systems, known as


the Farkas lemma, with a proof based on the separation theorem for convex sets.

Theorem 3.9 (Farkas lemma). For a matrix A and a vector b, the conditions (a)
and (b) below are equivalent:^38
(a) Ax = b for some nonnegative x ≥ 0.
(b) y^T b ≥ 0 for any y such that y^T A ≥ 0^T.

Proof. [(a) ⇒ (b)]: It follows from Ax = b, x ≥ 0, and y^T A ≥ 0^T that y^T b =
y^T Ax ≥ 0.
[(b) ⇒ (a)]: Let S be the convex cone generated by the column vectors a_j
(j = 1, ..., n) of A; i.e., S = {Σ_{j=1}^n x(j) a_j | x(j) ≥ 0}. If (a) fails, then b ∉ S, which
implies, by the separation theorem for convex sets (Theorem 3.4), that y^T a_j ≥ 0
(j = 1, ..., n) and y^T b < 0 for some y. □
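The two alternatives of the Farkas lemma can be checked on a concrete instance. The matrix A and the certificate vectors below are hand-picked illustrations, not data from the text.

```python
# A concrete instance of the Farkas alternative (Theorem 3.9).

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

A = [[1, 0, 1],
     [0, 1, 1]]

# Case (a): b = (2, 1) is a nonnegative combination of the columns of A,
# witnessed by x = (1, 0, 1) >= 0 with Ax = b.
b1 = [2, 1]
x = [1, 0, 1]
assert matvec(A, x) == b1 and all(xi >= 0 for xi in x)

# For b = (-1, 0), no nonnegative x has Ax = b (the first equation forces
# x1 + x3 = -1 < 0), and y = (1, 0) certifies this via condition (b):
# y^T A = (1, 0, 1) >= 0 while y^T b = -1 < 0.
b2 = [-1, 0]
y = [1, 0]
yT_A = matvec([list(col) for col in zip(*A)], y)
assert all(v >= 0 for v in yT_A)
assert sum(yi * bi for yi, bi in zip(y, b2)) < 0
```

Either a nonnegative solution x or a separating vector y exists, never both, exactly as the theorem asserts.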

A linear programming problem is an optimization problem to minimize or max-


imize a linear objective function subject to linear equality/inequality constraints.
Such a problem is also termed a linear program, often abbreviated to LP. Given an
m × n matrix A, an m-dimensional vector b, and an n-dimensional vector c, it is
convenient to consider a pair of LPs:

      Minimize c^T x subject to Ax = b, x ≥ 0;    Maximize b^T y subject to A^T y ≤ c.    (3.45)

The LPs in such a pair are said to be dual to each other. For convenience of
reference, we call the problem on the left the primal problem and the one on the
right the dual problem. We denote the feasible regions of the above problems by

      P = {x ∈ R^n | Ax = b, x ≥ 0},    D = {y ∈ R^m | A^T y ≤ c}.
The linear programming duality is stated in the following theorem.

Theorem 3.10 (LP duality).


(1) [Weak duality] c^T x ≥ b^T y for any x ∈ P and y ∈ D.
(2) [Strong duality] If P ≠ ∅ or D ≠ ∅, then^39

      inf {c^T x | x ∈ P} = sup {b^T y | y ∈ D}.

This common value is finite if and only if both P and D are nonempty, and in
that case, the infimum and the supremum are attained by some x ∈ P and y ∈ D,
respectively.
^38
Inequality between vectors means componentwise inequality; e.g., x ≥ 0 for x = (x(j))_{j=1}^n
means x(j) ≥ 0 for j = 1, ..., n.
^39
By convention, inf_{x∈P} = +∞ if P = ∅ and sup_{y∈D} = −∞ if D = ∅.

(3) [Complementarity] Assume x ∈ P and y ∈ D. Then x is optimal in the primal

problem and y is optimal in the dual problem if and only if

      x(j) (A^T y − c)(j) = 0    (j = 1, ..., n),

where (A^T y − c)(j) denotes the jth component of A^T y − c.

Proof. (1) is easy to see. The essence of this theorem lies in (2), which can be
derived from the Farkas lemma. Then (3) follows. See, e.g., Chvátal [24], Dantzig
[36], Schrijver [181], and Vanderbei [206]. □
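A small numeric sketch of Theorem 3.10: a primal-dual feasible pair with equal objective values is optimal on both sides. The LP data are a made-up example, not from the text.

```python
# Weak duality and complementary slackness on a tiny LP in the form (3.45).

A = [[1, 1]]      # single constraint x1 + x2 = 4
b = [4]
c = [3, 5]

x = [4, 0]        # primal feasible: Ax = b, x >= 0
y = [3]           # dual feasible: A^T y = (3, 3) <= c componentwise

assert [sum(A[i][j] * x[j] for j in range(2)) for i in range(1)] == b
assert all(sum(A[i][j] * y[i] for i in range(1)) <= c[j] for j in range(2))

primal = sum(ci * xi for ci, xi in zip(c, x))    # c^T x = 12
dual = sum(bi * yi for bi, yi in zip(b, y))      # b^T y = 12
assert primal >= dual                            # weak duality
assert primal == dual == 12                      # equality certifies optimality;
# complementarity holds: x(2) = 0 exactly where (A^T y - c)(2) = -2 is nonzero.
```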

Linear programming acquires combinatorial flavor through integrality consid-


erations. An LP described by integer data (an integer matrix A and integer vectors
b and c) may or may not have an integer optimal solution. The major interest in
this context is under which condition an integer optimal solution is guaranteed.
An integer matrix is totally unimodular if every minor (the determinant of every
square submatrix) is equal to ±1 or 0. In particular, each entry of a totally
unimodular matrix is either ±1 or 0.

Example 3.11. The incidence matrix of a graph is a typical example of a totally

unimodular matrix. Let G = (V, E) be a directed graph with vertex set V and arc
set E, where we assume no self-loops exist. The incidence matrix of G, say, A, is
a matrix such that the row set is indexed by V and the column set by E, and the
(v, a)-entry is given by

      A(v, a) = +1 if arc a leaves vertex v,  −1 if arc a enters vertex v,  0 otherwise.

An example of an incidence matrix is (2.12). •

Example 3.12. Let V be a finite set. For a chain C : X_1 ⊊ X_2 ⊊ ··· ⊊ X_m of

subsets of V, the incidence matrix of C is an m × |V| matrix C defined by

      C(i, v) = 1 if v ∈ X_i,  0 otherwise    (i = 1, ..., m; v ∈ V).

Note that the ith row of C is the characteristic vector of X_i. For two chains C^1 and
C^2, with incidence matrices C_1 and C_2, the matrix A = [C_1; C_2] obtained by
stacking C_1 above C_2 is totally unimodular.
To prove this, it suffices to assume that A is square and to show det A ∈
{0, ±1}. Let C^k : X_1^k ⊊ X_2^k ⊊ ··· ⊊ X_{m_k}^k (k = 1, 2) be the chains and, for k = 1, 2,
define D_k to be the matrix with the ith row of C_k replaced with the characteristic
vector of X_i^k \ X_{i−1}^k for i = 1, ..., m_k, where X_0^k = ∅. Put Ã = [D_1; −D_2]. Then
det A = ± det Ã by the construction. We also have det Ã ∈ {0, ±1}, since Ã, having
at most one entry of 1 and at most one entry of −1 in each column, can be regarded
as a submatrix of the incidence matrix of a graph (see Example 3.11). •
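The chain construction can be verified by brute force: total unimodularity just requires every square submatrix to have determinant 0 or ±1. The chains below are hypothetical small examples; the determinant routine is a plain cofactor expansion, adequate for matrices of this size.

```python
# Brute-force total unimodularity check, applied to stacked chain incidence
# matrices as in Example 3.12.
from itertools import combinations

def det(M):
    """Integer determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def is_totally_unimodular(A):
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                if abs(det([[A[i][j] for j in cols] for i in rows])) > 1:
                    return False
    return True

# Incidence matrices of the chains {1} ⊊ {1,2} ⊊ {1,2,3} and {2} ⊊ {2,3} on V = {1,2,3}.
C1 = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
C2 = [[0, 1, 0], [0, 1, 1]]
assert is_totally_unimodular(C1 + C2)

# A matrix with a 2x2 minor of determinant 2 is not totally unimodular.
assert not is_totally_unimodular([[1, 1], [-1, 1]])
```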

The following theorem relates the total unimodularity of the coefficient matrix
to the integrality of optimal solutions of LPs.

Theorem 3.13. Let A be a totally unimodular matrix.

(1) If b is integral, the primal LP in (3.45) has an integral optimal solution x ∈ Z^n
as long as it has an optimal solution.
(2) If c is integral, the dual LP in (3.45) has an integral optimal solution y ∈ Z^m
as long as it has an optimal solution.

Proof. See, e.g., Chvátal [24], Cook-Cunningham-Pulleyblank-Schrijver [26], Korte-

Vygen [115], Lawler [119], and Schrijver [181]. □

Such a theorem enables us to treat combinatorial problems via linear pro-


gramming. Let us demonstrate this for the weighted bipartite matching problem.
Let G = (V^+, V^−; E) be a bipartite graph with vertex bipartition (V^+, V^−) and arc
set E. A set M of arcs of G is called a matching if each vertex of G is incident to at
most one arc of M and a perfect matching if each vertex of G is incident to exactly
one arc of M. We have |M| = |V^+| = |V^−| for a perfect matching M.

Proposition 3.14. Let G = (V^+, V^−; E) be a bipartite graph with a perfect match-
ing, and let c : V^+ × V^− → R ∪ {+∞} be a (weight or cost) vector such that
c(u, v) < +∞ ⇔ (u, v) ∈ E. Then there exist a vector^40 p : V^+ ∪ V^− → R and
orderings of vertices V^+ = {u_1, ..., u_m} and V^− = {v_1, ..., v_m} such that

      c(u_i, v_i) = p(v_i) − p(u_i)    (i = 1, ..., m),
      c(u, v) ≥ p(v) − p(u)           ((u, v) ∈ E).              (3.48)

The set of arcs {(u_i, v_i) | i = 1, ..., m} is a perfect matching of minimum weight,
and, therefore,

      Minimum weight of a perfect matching = Σ_{i=1}^m (p(v_i) − p(u_i)).    (3.49)

Proof. Consider the primal LP in (3.45) in which A is the incidence matrix of G

with arcs directed from V^+ to V^−, b is an integer vector defined by

      b(u) = 1  (u ∈ V^+),    b(v) = −1  (v ∈ V^−),

and c is the vector of weights (c(u, v) | (u, v) ∈ E). Since A is totally unimodular
by Example 3.11, the optimal solution x may be chosen, by Theorem 3.13, to
be an integer vector, which, being a {0, 1}-vector because of the constraints, can
be interpreted as the incidence vector of an optimal matching. The dual optimal
solution can be identified with a vector p : V^+ ∪ V^− → R, and the condition (3.48)
follows from the dual feasibility and the complementarity. □

^40
This p is called a potential or an optimal potential.
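Proposition 3.14 can be checked numerically on a small instance. The 3 × 3 weight matrix and the potential below are made-up examples found by hand; a real solver would use the LP in (3.45) or the Hungarian method instead of enumeration.

```python
# Minimum-weight perfect matching by brute force, plus an optimal potential
# certifying the value via (3.48)-(3.49).
from itertools import permutations

# c[u][v] = weight of arc (u, v) in a complete bipartite graph with m = 3.
c = [[4, 1, 3],
     [2, 0, 5],
     [3, 2, 2]]

best = min(sum(c[u][pi[u]] for u in range(3)) for pi in permutations(range(3)))
assert best == 5   # matching arcs (0,1), (1,0), (2,2): 1 + 2 + 2 = 5

# A potential p (found by hand): c(u, v) >= p_minus[v] - p_plus[u] for all
# arcs, with equality on the optimal matching arcs.
p_plus = [0, 1, 0]     # p on V+
p_minus = [3, 1, 2]    # p on V-
for u in range(3):
    for v in range(3):
        assert c[u][v] >= p_minus[v] - p_plus[u]
# The minimum weight equals the telescoping sum of potential differences (3.49).
assert best == sum(p_minus) - sum(p_plus)
```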

3.3 Integrality for a Pair of Integral Polyhedra


Discrete duality often boils down to integrality for a pair of integral polyhedra.
In this section we observe some fundamental facts about the intersection and the
Minkowski sum of a pair of integral polyhedra. In so doing we intend to gain a
better understanding of the subtlety in the relationship between the integrality of
polyhedra and the convexity of discrete sets.
A polyhedron is said to be rational if it is described by a finite system of linear
inequalities with rational coefficients, i.e., if all the coefficients a_i and b_i in (3.10)
can be chosen to be rational numbers. A rational polyhedron P ⊆ R^n is an integral
polyhedron if P = conv(P ∩ Z^n), i.e., if it coincides with the convex hull (convex
closure) of the integer points contained in it.
Let us say that a discrete set S ⊆ Z^n is hole free if

      S = conv(S) ∩ Z^n,                                   (3.50)

which means that all the integer points contained in the convex hull of S belong to
S itself. A finite set of integer points is hole free if and only if it is the set of integer
points in some integral polytope.^41
The hole-free property (3.50) seems to be a natural requirement for a discrete
set to be qualified as being convex. This is indeed compatible with our previous
naive idea of convexity for discrete functions in terms of the extensibility to convex
functions formulated in (1.11). We define the indicator function δ_S : Z^n → {0, +∞}
of a discrete set S by

      δ_S(x) = 0 if x ∈ S,    +∞ if x ∉ S.                 (3.51)

Then a discrete set is hole free if and only if its indicator function is extensible to
a convex function.
We are now interested in the compatibility of the hole-free property with the
Minkowski addition (3.21), which is one of the fundamental operations in convex
analysis. We define the Minkowski sum S_1 + S_2 of two discrete sets S_1, S_2 ⊆ Z^n by

      S_1 + S_2 = {x + y | x ∈ S_1, y ∈ S_2},
which we also call the discrete Minkowski sum or the integral Minkowski sum to
emphasize the discreteness. If the hole-free property can be qualified as a discrete
version of convexity, this property should be preserved in Minkowski addition. Con-
trary to this optimistic expectation, the Minkowski sum of hole-free sets can have
a hole, as is demonstrated in Example 3.15 below.

Example 3.15. Two sets

      S_1 = {(0, 0), (1, 1)},    S_2 = {(1, 0), (0, 1)}

are hole free with S_1 = conv(S_1) ∩ Z^2 and S_2 = conv(S_2) ∩ Z^2 (see Fig. 3.4). Nevertheless, the
discrete Minkowski sum

      S_1 + S_2 = {(1, 0), (0, 1), (2, 1), (1, 2)}

^41
A polytope is a bounded polyhedron.

Figure 3.4. Nonconvexity in Minkowski sum.

has a hole at (1, 1) and, therefore, S_1 + S_2 ≠ conv(S_1 + S_2) ∩ Z^2. We observe, in passing,

that

      conv(S_1) ∩ conv(S_2) = {(1/2, 1/2)} ≠ ∅,    S_1 ∩ S_2 = ∅,

which shows conv(S_1) ∩ conv(S_2) ≠ conv(S_1 ∩ S_2). •
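Example 3.15 is easy to reproduce by machine, using the standard choice of the two sets (consistent with the hole at (1, 1) stated in the text).

```python
# The Minkowski sum of two hole-free sets need not be hole free (Example 3.15).
S1 = {(0, 0), (1, 1)}
S2 = {(1, 0), (0, 1)}

msum = {(a[0] + b[0], a[1] + b[1]) for a in S1 for b in S2}
assert msum == {(1, 0), (0, 1), (2, 1), (1, 2)}

# (1, 1) is the midpoint of (1, 0) and (1, 2), hence lies in conv(S1 + S2),
# yet it is not an element of the discrete sum: a hole.
assert (1, 1) not in msum
assert tuple((p + q) / 2 for p, q in zip((1, 0), (1, 2))) == (1.0, 1.0)
```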

The above example issues the following warnings to us about the integrality
for a pair of hole-free discrete sets S_1 and S_2 with S_k = conv(S_k) ∩ Z^n (k = 1, 2).

1. [S_1 + S_2 = conv(S_1 + S_2) ∩ Z^n] is not always true.

2. [conv(S_1) ∩ conv(S_2) ≠ ∅ ⇒ S_1 ∩ S_2 ≠ ∅] is not always true.

3. [conv(S_1) ∩ conv(S_2) = conv(S_1 ∩ S_2)] is not always true.

4. The intersection P_1 ∩ P_2 of integral polyhedra P_k ⊆ R^n (k = 1, 2) is not

always an integral polyhedron.

The above facts suggest that the hole-free property (3.50) alone is not appro-
priate as the condition of discrete convexity for sets. Some deeper combinatorial
properties are needed.
The first two properties above will turn out to be critical in many situations,
and, in fact, they are essentially equivalent to each other.

Proposition 3.16. Suppose that a family F of sets of integer points has the
property

      S ∈ F, x ∈ Z^n  ⇒  x − S ∈ F,

where x − S = {x − y | y ∈ S}. Then conditions (a) and (b) below are equivalent
for F.
(a) ∀S_1, S_2 ∈ F: conv(S_1) ∩ conv(S_2) ≠ ∅ ⇒ S_1 ∩ S_2 ≠ ∅.
(b) ∀S_1, S_2 ∈ F: S_1 + S_2 = conv(S_1 + S_2) ∩ Z^n.

Proof. (a) ⇒ (b): For x ∈ conv(S_1 + S_2) ∩ Z^n we have x ∈ (conv(S_1) + conv(S_2)) ∩ Z^n by Proposition
3.17 (4) below. Hence conv(S'_1) ∩ conv(S'_2) ≠ ∅ for S'_1 = S_1 and S'_2 = x − S_2. By (a), there exists
y ∈ S'_1 ∩ S'_2. Then y ∈ S_1 and ∃z ∈ S_2 : y = x − z. Therefore, x ∈ S_1 + S_2.

(b) ⇒ (a): Suppose conv(S_1) ∩ conv(S_2) ≠ ∅ and put S'_1 = S_1 and S'_2 = −S_2. Then 0 ∈

conv(S'_1) + conv(S'_2) = conv(S'_1 + S'_2) (see Proposition 3.17 (4) below). By (b) we obtain 0 ∈ S'_1 + S'_2,
which is equivalent to S_1 ∩ S_2 ≠ ∅. □

We say that a family F of sets of integer points has convexity in intersection


if (a) above is true and convexity in Minkowski sum if (b) is true. It will be
shown in sections 4.6 and 5.4 that the families of M-convex sets and L-convex sets,
respectively, have these properties.
Finally, we mention basic relations that are always true.

Proposition 3.17. Assume S_k = conv(S_k) ∩ Z^n for k = 1, 2.
(1) S_1 ∩ S_2 = (conv(S_1) ∩ conv(S_2)) ∩ Z^n.
(2) conv(S_1 ∩ S_2) ⊆ conv(S_1) ∩ conv(S_2).
(3) S_1 + S_2 ⊆ (conv(S_1) + conv(S_2)) ∩ Z^n.
(4) conv(S_1 + S_2) = conv(S_1) + conv(S_2).

Proof. (1), (2), and (3) are obvious. We prove (4). First, note that conv(S_1) + conv(S_2) ⊇
conv(S_1 + S_2), which follows from conv(S_1) + conv(S_2) ⊇ S_1 + S_2 and the convexity of conv(S_1) + conv(S_2). To
show the reverse inclusion, take

      x = Σ_i λ_i y_i + Σ_j μ_j z_j ∈ conv(S_1) + conv(S_2),

where λ_i ≥ 0, Σ_i λ_i = 1, y_i ∈ S_1, μ_j ≥ 0, Σ_j μ_j = 1, and z_j ∈ S_2 (the summations
being finite sums). With ν_{ij} = λ_i μ_j we obtain

      x = Σ_{i,j} ν_{ij} (y_i + z_j),    ν_{ij} ≥ 0,    Σ_{i,j} ν_{ij} = 1,

which shows x ∈ conv(S_1 + S_2). □

3.4 Integrally Convex Functions


Integrally convex functions form a fairly general class of discrete convex functions,
for which global optimality is guaranteed by local optimality (in an appropriate
sense). Almost all discrete convex functions treated in this book, including L-convex
and M-convex functions, fall into this category.
For two vectors a, b ∈ (Z ∪ {±∞})^n, the integer interval [a, b] = [a, b]_Z
is defined by

      [a, b] = {x ∈ Z^n | a(i) ≤ x(i) ≤ b(i) (i = 1, ..., n)},

where, if a(i) = −∞, for example, a(i) ≤ x(i) is to be understood as −∞ < x(i).
The restriction of a function f : Z^n → R ∪ {+∞} to an interval [a, b] is defined as
the function f_{[a,b]} : Z^n → R ∪ {+∞} given by

      f_{[a,b]}(x) = f(x) if x ∈ [a, b],    +∞ otherwise.

Let f : Z^n → R ∪ {+∞} be a function defined on the integer lattice, where

it is a tacit agreement that the effective domain dom_Z f is nonempty. The convex
closure of f is defined to be a function f̄ : R^n → R ∪ {±∞} given by

      f̄(x) = sup {⟨p, x⟩ + α | ⟨p, y⟩ + α ≤ f(y) (∀y ∈ Z^n)}    (x ∈ R^n).    (3.56)

If this function f̄ coincides with f on integer points, i.e., if

      f̄(x) = f(x)    (∀x ∈ Z^n),

we say that f is convex extensible and call f̄ the convex extension of f.^42 The
following fact is easy to see.

Proposition 3.18. If a function f : Z^n → R ∪ {+∞} is convex extensible, then

argmin f[−p] is hole free for each p ∈ R^n. The converse is also true if dom_Z f is
bounded.

A local version of the convex extension of a function f can be defined by

relaxing the requirement in the definition (3.56) of the convex closure. Instead of
imposing the inequality ⟨p, y⟩ + α ≤ f(y) for all y ∈ Z^n, we ask for this condition
only for points y ∈ Z^n lying in a neighborhood of x ∈ R^n.
To be specific, we define the integral neighborhood of x ∈ R^n (see Fig. 3.5) by

      N(x) = {z ∈ Z^n | ⌊x(i)⌋ ≤ z(i) ≤ ⌈x(i)⌉ (i = 1, ..., n)},

where, for z ∈ R in general, ⌈z⌉ denotes the smallest integer not smaller than z
(rounding up to the nearest integer) and ⌊z⌋ the largest integer not larger than z
(rounding down to the nearest integer). Note an alternative expression

      N(x) = {z ∈ Z^n | ‖x − z‖_∞ < 1}

using the ℓ∞-norm

      ‖x‖_∞ = max {|x(i)| | i = 1, ..., n}.

With this neighborhood we define the local convex extension of f by

      f̃(x) = sup {⟨p, x⟩ + α | ⟨p, y⟩ + α ≤ f(y) (∀y ∈ N(x))}    (x ∈ R^n).

Note the obvious relations

      f̄(x) ≤ f̃(x)  (∀x ∈ R^n),    f̃(x) = f(x)  (∀x ∈ Z^n).    (3.62)

We have an alternative expression

      f̃(x) = min {Σ_{y∈N(x)} λ_y f(y) | Σ_{y∈N(x)} λ_y y = x, (λ_y) ∈ Λ(x)}    (3.63)
^42
We say that f is concave extensible if −f is convex extensible; the concave extension of f
is then the negative of the convex extension of −f.

Figure 3.5. Integral neighborhood N(x) of x (○: point of N(x)).

with

      Λ(x) = {(λ_y | y ∈ N(x)) | λ_y ≥ 0, Σ_{y∈N(x)} λ_y = 1},

as a consequence of LP duality (Theorem 3.10). In the univariate case (n = 1), the

graph of f̃ consists of line segments connecting the points {(z, f(z)) | z ∈ Z} in the
natural order.
The local convex extension f̃ is convex on every unit interval

      [z, z + 1] = {x ∈ R^n | z(i) ≤ x(i) ≤ z(i) + 1 (i = 1, ..., n)}

with an integral point z ∈ Z^n, but is not necessarily convex in the entire space R^n.
If f̃ is convex on R^n, the function f is said to be integrally convex. Alternatively,
we can define

      f is integrally convex  ⇔  f̃ = f̄.    (3.64)

In particular, an integrally convex function is convex extensible. A function h is

called integrally concave if −h is integrally convex. Note the following fact.

Proposition 3.19. For a function f : Z^n → R ∪ {+∞},

Example 3.20. Here is an example of a convex-extensible function that is not

integrally convex. Let f : Z^2 → R be defined by f(x) = |x(1) − 2x(2)| for x =
(x(1), x(2)) ∈ Z^2. Obviously, this function is extensible to a convex function f̄(x) =
|x(1) − 2x(2)| defined for x = (x(1), x(2)) ∈ R^2. In particular, we have f̄(1, 1/2) =
0. On the other hand, we have f̃(1, 1/2) = 1, since N(x) = {(1, 0), (1, 1)} for
x = (1, 1/2) and f(1, 0) = f(1, 1) = 1. Hence f̃(1, 1/2) ≠ f̄(1, 1/2), which shows
that f is not integrally convex. •
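Example 3.20 can be recomputed directly: at x = (1, 1/2) the local convex extension, which only sees the integral neighborhood N(x), exceeds the convex closure.

```python
# Reproducing Example 3.20 for f(x) = |x(1) - 2 x(2)| on Z^2.
from itertools import product
from math import floor, ceil

def N(x):
    """Integral neighborhood: all roundings of the coordinates of x."""
    return set(product(*[{floor(xi), ceil(xi)} for xi in x]))

f = lambda z: abs(z[0] - 2 * z[1])

x = (1, 0.5)
assert N(x) == {(1, 0), (1, 1)}

# The only convex combination of (1, 0) and (1, 1) averaging to x has weights
# 1/2 and 1/2, so the local convex extension at x is (f(1,0) + f(1,1)) / 2 = 1,
f_tilde = 0.5 * f((1, 0)) + 0.5 * f((1, 1))
assert f_tilde == 1.0
# while the convex closure, given by the same formula |x(1) - 2 x(2)| on R^2,
# vanishes at x.
assert abs(x[0] - 2 * x[1]) == 0
```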

The global minimum of an integrally convex function can be characterized

by a local optimality. This is the key property of integrally convex functions that
justifies this notion.

Theorem 3.21. For an integrally convex function f : Z^n → R ∪ {+∞} and

x ∈ dom_Z f, we have

      f(x) ≤ f(y)  (∀y ∈ Z^n)  ⇔  f(x) ≤ f(y)  (∀y ∈ Z^n, ‖y − x‖_∞ ≤ 1).    (3.65)

Proof. It suffices to show ⇐. Put N_1(x) = {y ∈ R^n | ‖y − x‖_∞ ≤ 1} for
x ∈ dom_Z f. By (3.63) and (3.65) we have f(x) ≤ f̃(y) for all y ∈ N_1(x). Combining
this with integral convexity (3.64) shows f̃(x) ≤ f̃(y) (∀y ∈ N_1(x)), the local
minimality of f̃ at x. Since f̃ is convex, x is a global minimizer of f̃ by Theorem
1.1 and, a fortiori, a global minimizer of f. □

The following variant of the above theorem will be used later.

Proposition 3.22. Let f : Z^n → R ∪ {+∞} be an integrally convex function such

that f(z + 1) = f(z) for all z ∈ Z^n, where 1 = (1, 1, ..., 1). For x ∈ dom_Z f we have

      f(x) ≤ f(y)  (∀y ∈ Z^n)  ⇔  f(x) ≤ f(x + χ_Y)  (∀Y ⊆ {1, ..., n}).    (3.66)

Proof. It suffices to show that the latter condition in (3.66) implies f(x) ≤ f(x +
χ_Y − χ_Z) for any disjoint Y and Z. On putting U = {1, ..., n} \ (Y ∪ Z) and
x° = x − 1 we have f(x) = f(x°) and, since x + χ_Y − χ_Z = x° + χ_U + 2χ_Y and f̃ is convex,

      (1/2){f(x°) + f(x + χ_Y − χ_Z)} ≥ f̃(x° + (1/2)χ_U + χ_Y).

Here we have f̃(x° + (1/2)χ_U + χ_Y) ≥ f(x°), since f̃(x° + (1/2)χ_U + χ_Y) can be represented,
by integral convexity, as a convex combination of the values f(x° + χ_W + χ_Y) with W ⊆ U,
and f(x° + χ_W + χ_Y) ≥ f(x°) for any W ⊆ U by the assumption. □

Note 3.23. The optimality criterion in Theorem 3.21 is certainly local, but not
satisfactory from the computational complexity viewpoint. We need O(3^n) function
evaluations to verify the local optimality condition in (3.65). •
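The local criterion of (3.65) translates directly into code: compare f(x) with f(x + d) for all 3^n − 1 nonzero sign vectors d. The test function below, a separable convex quadratic (hence integrally convex by Proposition 3.25 (1)), is a made-up example.

```python
# Direct implementation of the local optimality check in (3.65).
from itertools import product

def is_local_min(f, x):
    """Check f(x) <= f(x + d) for all d in {-1, 0, +1}^n (3^n evaluations)."""
    return all(f(x) <= f(tuple(xi + di for xi, di in zip(x, d)))
               for d in product((-1, 0, 1), repeat=len(x)))

f = lambda z: (z[0] - 1) ** 2 + z[1] ** 2

assert is_local_min(f, (1, 0))        # by Theorem 3.21, a global minimum
assert not is_local_min(f, (0, 0))    # f(1, 0) = 0 < 1 = f(0, 0)
```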

A function f : Z^n → R ∪ {+∞} is called a separable convex function if it can

be represented as

      f(x) = f_1(x(1)) + ··· + f_n(x(n))

with univariate discrete convex functions f_i ∈ C[Z → R] (i = 1, ..., n), where

      C[Z → R] = {f : Z → R ∪ {+∞} | dom_Z f ≠ ∅, f(t − 1) + f(t + 1) ≥ 2f(t) (∀t ∈ Z)}

is the set of univariate discrete convex functions. Similarly, we denote by C[Z → Z]
the set of integer-valued univariate discrete convex functions.

Proposition 3.24. The sum of an integrally convex function and a separable

convex function is an integrally convex function.

Proof. Put f(x) = f_0(x) + Σ_{i=1}^n f_i(x(i)), where f_0 is integrally convex and f_i ∈
C[Z → R] for i = 1, ..., n. For any (λ_y) ∈ Λ(x) with Σ_{y∈N(x)} λ_y y = x, we have

      Σ_{y∈N(x)} λ_y f(y) = Σ_{y∈N(x)} λ_y f_0(y) + Σ_{i=1}^n f̄_i(x(i)),

since y(i) ∈ {⌊x(i)⌋, ⌈x(i)⌉} for y ∈ N(x) and Σ_y λ_y y(i) = x(i). It follows from this and (3.63) that

      f̃(x) = f̃_0(x) + Σ_{i=1}^n f̄_i(x(i)),

which shows the convexity of f̃. □

The following proposition is an immediate corollary of Proposition 3.24, where,

for p ∈ R^n, we define

      f[−p](x) = f(x) − ⟨p, x⟩    (x ∈ Z^n).

Proposition 3.25.
(1) A separable convex function is integrally convex.
(2) f[−p] is integrally convex for integrally convex f and vector p ∈ R^n.

A set of integer points S ⊆ Z^n is said to be integrally convex if its indicator

function δ_S is an integrally convex function. This means that a set S is integrally
convex if and only if

      x ∈ conv(S)  ⇒  x ∈ conv(S ∩ N(x))    (3.71)

for any x ∈ R^n. We also have

S is an integrally convex set ⇔ conv(S) ∩ conv(N(x)) = conv(S ∩ N(x)) (∀x ∈ R^n)  (3.72)

(see Fig. 3.6).


An integrally convex set is hole free.

Proposition 3.26. S = conv(S) ∩ Z^n for an integrally convex set S.

Proof. This follows from (3.72), since N(x) = {x} for an integer point x. □

Note 3.27. The family of integrally convex sets has neither convexity in intersec-
tion nor convexity in Minkowski sum. Example 3.15 shows this. •

The integral convexity of a function can be characterized by the integral con-


vexity of the minimizers (Theorem 3.29 below).

Figure 3.6. Concept of integrally convex sets.

Proposition 3.28. Let f : Z^n → R ∪ {+∞} be an integrally convex function.

(1) dom_Z f is an integrally convex set.
(2) For each p ∈ R^n, argmin f[−p] is an integrally convex set.

Proof. (1) By f̄ = f̃ and the definition of f̃ we have

conv(dom f) ∩ conv(N(x)) = dom f̄ ∩ conv(N(x)) = dom f̃ ∩ conv(N(x)) = conv(dom f ∩ N(x)).

Then (3.72) shows the integral convexity of dom f.

(2) We assume p = 0 by Proposition 3.25 (2) and use (3.71) for S = argmin f.
For x ∈ conv(S) we have min f = f̄(x) = f̃(x) and therefore x ∈ conv(S ∩ N(x)). □

Theorem 3.29. Suppose a function f : Z^n → R ∪ {+∞} has a nonempty bounded

effective domain. Then
f is an integrally convex function
⇔ argmin f[−p] is an integrally convex set for each p ∈ R^n.

Proof. The implication ⇒ was shown in Proposition 3.28. For the converse we are
to show f̄(x) = f̃(x) for x ∈ R^n. If x ∉ conv(dom f), we have f̄(x) = f̃(x) = +∞ by
dom f̄ = conv(dom f) and (3.62). Assume x ∈ conv(dom f), and consider a pair of (mutually
dual) LPs:

(P)  Maximize ⟨p, x⟩ + α  subject to  ⟨p, y⟩ + α ≤ f(y)  (y ∈ dom f);
(D)  Minimize Σ_{y∈dom f} λ_y f(y)  subject to  Σ_{y∈dom f} λ_y y = x,  Σ_{y∈dom f} λ_y = 1,  λ_y ≥ 0  (y ∈ dom f).

Here (p, α) and (λ_y | y ∈ dom f) are the variables of (P) and (D), respectively.
Problem (P) is obviously feasible, and so is (D) by x ∈ conv(dom f). Let (p*, α*) and
λ* = (λ*_y | y ∈ dom f) be optimal solutions of (P) and (D), respectively. Then

(3.56), (3.62), (3.63), and LP duality (Theorem 3.10 (2)) imply

      f̄(x) = ⟨p*, x⟩ + α* = Σ_{y∈dom f} λ*_y f(y) ≤ f̃(x).    (3.73)

It remains to show that the inequality here is in fact an equality.

To denote the set of tight constraints at (p*, α*), we put

      S = {y ∈ dom f | ⟨p*, y⟩ + α* = f(y)} = argmin f[−p*].

We have {y ∈ dom f | λ*_y > 0} ⊆ S by the complementarity (Theorem 3.10 (3)).

Hence x ∈ conv(S), and furthermore, x ∈ conv(S ∩ N(x)) by the integral convexity of S and
(3.71). Therefore, there exists another optimal solution λ = (λ_y | y ∈ dom f) to
(D) satisfying {y | λ_y > 0} ⊆ S ∩ N(x). Then, by (3.63), we obtain

      f̃(x) ≤ Σ_{y∈dom f} λ_y f(y) = f̄(x),

which shows that the inequality in (3.73) is an equality. □

We mention a technical fact to be used in section 8.1.

Proposition 3.30. For an integer-valued integrally convex function f : Z^n →

Z ∪ {+∞} and p ∈ R^n, we have argmin f[−p] ≠ ∅ if inf f[−p] > −∞.

Proof. The proof is not difficult; see Lemma 6.13 in Murota-Shioura [152]. □

Note 3.31. The intersection of integrally convex sets is not necessarily integrally
convex; e.g., S_1 = {(0,0,0), (0,1,1), (1,1,0), (1,2,1)} and S_2 = {(0,0,0), (0,1,0),
(1,1,1), (1,2,1)} are integrally convex, but their intersection S_1 ∩ S_2 = {(0,0,0),
(1,2,1)} is not, since (S_1 ∩ S_2) ∩ N(x) = ∅ for x = (1/2, 1, 1/2) ∈ conv(S_1 ∩ S_2). This
implies also that the sum of integrally convex functions is not necessarily integrally
convex. The discrete separation theorem (see section 1.2) does not hold for integrally
convex functions; Example 1.5 shows this.
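The counterexample of Note 3.31 is mechanical to check: the midpoint x = (1/2, 1, 1/2) of the two points of the intersection has an integral neighborhood N(x) that misses the intersection entirely, so condition (3.71) fails.

```python
# Checking the counterexample of Note 3.31.
from itertools import product
from math import floor, ceil

def N(x):
    """Integral neighborhood of a real vector x."""
    return set(product(*[{floor(xi), ceil(xi)} for xi in x]))

S1 = {(0, 0, 0), (0, 1, 1), (1, 1, 0), (1, 2, 1)}
S2 = {(0, 0, 0), (0, 1, 0), (1, 1, 1), (1, 2, 1)}
inter = S1 & S2
assert inter == {(0, 0, 0), (1, 2, 1)}

x = (0.5, 1, 0.5)   # midpoint of (0,0,0) and (1,2,1), hence in conv(S1 ∩ S2)
assert N(x) == {(0, 1, 0), (0, 1, 1), (1, 1, 0), (1, 1, 1)}
assert N(x) & inter == set()   # (3.71) fails for the intersection
```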

Note 3.32. A function f : Z^n → R ∪ {+∞} is said to be a Miller's discrete convex

function if

holds for any x, y ∈ dom_Z f and any α ∈ [0, 1]_R. An integrally convex function
satisfies this condition. The optimality criterion (3.65) stated for integrally convex
functions in Theorem 3.21 is in fact valid for Miller's discrete convex functions. •

Bibliographical Notes
The introduction to convex analysis in section 3.1 is kept to the minimum needed
for later developments in this book. For a systematic and comprehensive account,
see Borwein-Lewis [17], Hiriart-Urruty-Lemaréchal [89], Rockafellar [176], Rock-
afellar [177], Rockafellar-Wets [179], and Stoer-Witzgall [194]. In particular, see
Theorems 11.3 and 20.2 and Corollary 11.4.2 of [176] for separation for convex sets
(Theorem 3.4); Corollary 5.1.6 in [194] and the proof of Theorem 31.1 in [176]
for separation for convex functions (Theorem 3.5); and Theorem 31.1 in [176] and
Corollary 5.1.4 in [194] for Fenchel duality (Theorem 3.6). Example 3.8 is taken
from [194].
References on linear programming abound in the literature; see, e.g., Chvátal
[24], Dantzig [36], Schrijver [181], and Vanderbei [206]. Matching is one of the
central topics in graph theory, the standard reference being Lovász-Plummer [125].
Matching is also fundamental in combinatorial optimization; see Cook-Cunningham-
Pulleyblank-Schrijver [26], Du-Pardalos [43], Korte-Vygen [115], Lawler [119], and
Nemhauser-Wolsey [167].
Section 3.3 is a collection of basic facts, as presented in Murota [147]. The
terms convexity in intersection and convexity in Minkowski sum are coined here.
Proposition 3.16 is explicit in Danilov-Koshevoy [32], where a general framework
for convexity in intersection and convexity in Minkowski sum is provided.
The concept of integrally convex functions was introduced by Favati-Tardella
[49], where the effective domains are assumed to be integer intervals. The optimality
criterion (Theorem 3.21) is in [49]. Propositions 3.24 and 3.28 are due to Murota-
Shioura [153], and Theorem 3.29 (implicit in [153]) is taken from Murota [147].
Miller's discrete convex functions are introduced by Miller [130] along with the
optimality criterion (3.65).
Chapter 4

M-Convex Sets and Submodular Set Functions

M-convex sets form a class of well-behaved discrete convex sets. They are defined
in terms of an exchange axiom and correspond one-to-one to integer-valued sub-
modular set functions. An M-convex set is exactly the same as the integer points
contained in the base polyhedron associated with some integral submodular func-
tion. This chapter, accordingly, is a systematic presentation of known results in the
theory of matroids and submodular functions from the viewpoint of discrete convex
analysis.

4.1 Definition
Let V be a finite set, say, V = {1, ..., n}. A nonempty set of integer points B ⊆ Z^V
is defined to be an M-convex set if it satisfies the following exchange axiom:

(B-EXC[Z]) For x, y ∈ B and u ∈ supp^+(x − y), there exists v ∈

supp^−(x − y) such that x − χ_u + χ_v ∈ B and y + χ_u − χ_v ∈ B.

Here supp^+(x − y) and supp^−(x − y) are the positive support and the negative
support of x − y defined in (1.19) and χ_u is the characteristic vector of u ∈ V. We
denote by M_0[Z] the set of M-convex sets.
M-convexity thus defined for a set B ⊆ Z^V is equivalent to the M-convexity
of the indicator function δ_B : Z^V → {0, +∞} (defined in (3.51)). Namely, B is
an M-convex set satisfying (B-EXC[Z]) if and only if δ_B is an M-convex function
satisfying (M-EXC[Z]) introduced in section 1.4.2.
Recall that we encountered (B-EXC[Z]) in section 1.3.2 as the exchange prop-
erty that characterizes the sets of integer points associated with a submodular
function (see Theorem 1.9). Hence, an M-convex set is exactly the same as the
set of integer points contained in the base polyhedron defined by an integer-valued
submodular set function.
An immediate consequence of the exchange axiom (B-EXC[Z]) is that an M-
convex set lies on a hyperplane {x ∈ R^V | x(V) = r} for some r ∈ Z, where we use

the notation

      x(X) = Σ_{v∈X} x(v)    (X ⊆ V).
Proposition 4.1. For an M-convex set B we have x(V) = y(V) for any x, y ∈ B.

Proof. The proof is by induction on ‖x − y‖_1. If ‖x − y‖_1 = 0, we obviously

have x(V) = y(V). The case of ‖x − y‖_1 = 1 is excluded by (B-EXC[Z]). If
‖x − y‖_1 ≥ 2, (B-EXC[Z]) implies y' = y + χ_u − χ_v ∈ B, for which we have
y'(V) = y(V), ‖x − y'‖_1 = ‖x − y‖_1 − 2, and also x(V) = y'(V) by the induction
hypothesis. □

Since an M-convex set lies on a hyperplane {x ∈ R^V | x(V) = r}, we may

equivalently consider the projection of an M-convex set along an arbitrarily cho-
sen coordinate axis. We call the projection of an M-convex set an M♮-convex set.
Whereas M♮-convex sets are conceptually equivalent to M-convex sets, the class of
M♮-convex sets is strictly larger than that of M-convex sets. The simplest example
of an M♮-convex set that is not M-convex is an integer interval [a, b]_Z.
We focus on M-convex sets in the development of the theory and deal with
M♮-convex sets in section 4.7.

4.2 Exchange Axioms

There are a number of equivalent variants of the exchange axiom (B-EXC[Z]).
Whereas (B-EXC[Z]) requires that both x − χ_u + χ_v and y + χ_u − χ_v belong to B,
(B-EXC+[Z]) below imposes this only on y + χ_u − χ_v.

Proposition 4.2. For a set B ⊆ Z^V, (B-EXC[Z]) is equivalent to the following:

(B-EXC+[Z]) For x, y ∈ B and u ∈ supp^+(x − y), there exists v ∈
supp^−(x − y) such that y + χ_u − χ_v ∈ B.

Proof. It suffices to show (B-EXC+[Z]) ⇒ (B-EXC[Z]). First, it is easy to see that

(B-EXC+[Z]) implies the following:
(B-EXC_loc[Z]) For x, y ∈ B with ‖x − y‖_1 = 4 and v ∈ supp^−(x − y),
there exists u ∈ supp^+(x − y) such that y + χ_u − χ_v ∈ B.
To prove the claim by contradiction, we assume that there exists a pair (x, y) for
which (B-EXC[Z]) fails. That is, we assume that the set of such pairs

      D = {(x, y) | x, y ∈ B, ∃u* ∈ supp^+(x − y), ∀v ∈ supp^−(x − y) :

      x − χ_{u*} + χ_v ∉ B or y + χ_{u*} − χ_v ∉ B}

is nonempty. Take a pair (x, y) ∈ D with minimum ‖x − y‖_1; we have ‖x − y‖_1 ≥ 4.
Fix u* ∈ supp^+(x − y) as above, take any u_0 ∈ supp^+(x − y − χ_{u*}), and put

      X = {v ∈ supp^−(x − y) | x − χ_{u*} + χ_v ∈ B},
      Y = {v ∈ supp^−(x − y) | y + χ_{u_0} − χ_v ∈ B},

where Y ≠ ∅ by (B-EXC+[Z]). Take any v_0 ∈ Y, where we assume v_0 ∈ X ∩ Y if

X ∩ Y ≠ ∅. Then y' = y + χ_{u_0} − χ_{v_0} satisfies y' ∈ B and ‖x − y'‖_1 = ‖x − y‖_1 − 2.
We also have (x, y') ∈ D, as shown below, a contradiction to the choice of (x, y).
It remains to show (x, y') ∈ D. We have u* ∈ supp^+(x − y') and want to show

      x − χ_{u*} + χ_v ∉ B or y' + χ_{u*} − χ_v ∉ B for each v ∈ supp^−(x − y').

Put y'' = y' + χ_{u*} − χ_v = y + χ_{u_0} + χ_{u*} − χ_{v_0} − χ_v. Note that y + χ_{u*} − χ_v ∉ B,
since (x, y) ∈ D and x − χ_{u*} + χ_v ∈ B. If X ∩ Y ≠ ∅, we have y + χ_{u*} − χ_v ∉ B and
y + χ_{u*} − χ_{v_0} ∉ B, and therefore, y'' ∉ B by (B-EXC+[Z]). If X ∩ Y = ∅, we have
y + χ_{u*} − χ_v ∉ B and y + χ_{u_0} − χ_v ∉ B, and therefore, y'' ∉ B by (B-EXC_loc[Z]).
In either case we have (x, y') ∈ D. □

We introduce two other variants:

(B-EXCw[Z]) For distinct x, y ∈ B, there exist u ∈ supp^+(x − y) and
v ∈ supp^−(x − y) such that x − χ_u + χ_v ∈ B and y + χ_u − χ_v ∈ B.
(B-EXC−[Z]) For x, y ∈ B and u ∈ supp^+(x − y), there exists v ∈
supp^−(x − y) such that x − χ_u + χ_v ∈ B.

Theorem 4.3. Conditions (B-EXC[Z]), (B-EXCw[Z]), (B-EXC+[Z]), and (B-

EXC−[Z]) are equivalent for a set B ⊆ Z^V.

Proof. The implication (B-EXC[Z]) ⇒ (B-EXCw[Z]) is obvious, and Proposition

4.2 shows (B-EXC[Z]) ⇔ (B-EXC+[Z]). We also have (B-EXC[Z]) ⇔ (B-EXC−[Z]),
since (B-EXC−[Z]) for B is equivalent to (B-EXC+[Z]) for −B, and (B-EXC[Z]) for
B is equivalent to (B-EXC[Z]) for −B. We show (B-EXCw[Z]) ⇒ (B-EXC−[Z]) by
induction on ‖x − y‖_1. Suppose x, y ∈ B and u ∈ supp^+(x − y). By (B-EXCw[Z])
there exist u_1 ∈ supp^+(x − y) and v_1 ∈ supp^−(x − y) such that x − χ_{u_1} + χ_{v_1} ∈ B
and y' = y + χ_{u_1} − χ_{v_1} ∈ B. If u_1 = u, we are done. Otherwise (u_1 ≠ u), we
have ‖x − y'‖_1 = ‖x − y‖_1 − 2 and, by the induction hypothesis, (B-EXC−[Z])
applies to (x, y') and u ∈ supp^+(x − y'). Hence, x − χ_u + χ_v ∈ B for some
v ∈ supp^−(x − y') ⊆ supp^−(x − y). □

4.3 Submodular Functions and Base Polyhedra

We introduce here some fundamental facts about submodular set functions, which
turn out to describe the convex hull of M-convex sets.
Let ρ : 2^V → R ∪ {±∞} be a set function. Its effective domain, denoted as
dom ρ, is defined to be the family of subsets at which ρ is finite; i.e.,

      dom ρ = {X ⊆ V | −∞ < ρ(X) < +∞}.

Throughout this book we assume, for a set function ρ in general, that ρ(∅) = 0,
ρ(V) is finite, and either ρ : 2^V → R ∪ {+∞} or ρ : 2^V → R ∪ {−∞}.
The Lovász extension^43 of a set function ρ is a function ρ̂ : R^V → R ∪ {±∞}
defined as follows. Given a vector p ∈ R^V, we denote by p_1 > p_2 > ··· > p_m the
distinct values of its components and put

      U_i = {v ∈ V | p(v) ≥ p_i}    (i = 1, ..., m).    (4.4)

Then we have an identity

      p = Σ_{i=1}^{m−1} (p_i − p_{i+1}) χ_{U_i} + p_m χ_{U_m},    (4.5)
which is a representation of p as a linear combination of χ_{U_i} (i = 1, ..., m). The

Lovász extension ρ̂ is the linear interpolation on the basis of this representation.
Namely, ρ̂ is defined by

      ρ̂(p) = Σ_{i=1}^{m−1} (p_i − p_{i+1}) ρ(U_i) + p_m ρ(U_m)    (4.6)

with reference to the representation (4.5). The Lovász extension ρ̂ is a positively

homogeneous function that coincides with ρ on {0, 1}^V in the sense of

      ρ̂(χ_X) = ρ(X)    (X ⊆ V).

Since p_i − p_{i+1} > 0 (1 ≤ i ≤ m − 1) and ρ(U_m) = ρ(V) is finite on the right-hand
side of (4.6), we have a well-defined value ρ̂(p), possibly infinite.
A set function ρ : 2^V → R ∪ {+∞} is said to be submodular if it satisfies

      ρ(X) + ρ(Y) ≥ ρ(X ∪ Y) + ρ(X ∩ Y)    (X, Y ⊆ V).

We denote by S[R] the set of submodular set functions ρ with ρ(∅) = 0 and ρ(V) <
+∞ and by S[Z] the set of integer-valued such submodular set functions; i.e.,

      S[Z] = {ρ ∈ S[R] | ρ : 2^V → Z ∪ {+∞}}.

For ρ ∈ S[R], the effective domain D = dom ρ forms a ring family (sublattice of the
Boolean lattice 2^V), which means, by definition, that

      X, Y ∈ D  ⇒  X ∪ Y, X ∩ Y ∈ D.

^43
The Lovász extension is also called the Choquet integral or the linear extension. Often ρ̂(p)
is defined only for nonnegative p, although it is more convenient for us to define it over the entire
space R^V.

We call a set function μ : 2^V → R ∪ {−∞} supermodular if −μ is submodular.

The conditions μ(∅) = 0 and μ(V) > −∞ are always assumed for a supermodular
function μ (i.e., −μ ∈ S[R]).
For a submodular function ρ ∈ S[R] we consider a polyhedron

      B(ρ) = {x ∈ R^V | x(X) ≤ ρ(X) (X ⊆ V), x(V) = ρ(V)},

the base polyhedron associated with ρ. A point (element) of B(ρ) is called a base
and an extreme point of B(ρ) is an extreme base.

Proposition 4.4. B(ρ) is nonempty for ρ ∈ S[R].

Proof. For simplicity of description we assume that dom ρ has a maximal chain
of length n = |V|. On suitably indexing the elements of V, V = {v_1, v_2, ..., v_n},
we have V_j = {v_1, v_2, ..., v_j} ∈ dom ρ for j = 1, ..., n. Define a vector x ∈ R^V by
x(v_j) = ρ(V_j) − ρ(V_{j−1}) for j = 1, ..., n with V_0 = ∅. We show x(X) ≤ ρ(X) by
induction on |X|. When |X| = 0, this is obviously true by ρ(∅) = 0. When |X| ≥ 1,
let j be the maximum index such that v_j ∈ X. Then we have

      x(X) = x(X \ {v_j}) + x(v_j) ≤ ρ(X \ {v_j}) + ρ(V_j) − ρ(V_{j−1}) ≤ ρ(X),

where the last inequality is submodularity applied to X and V_{j−1}.
Hence follows B(ρ) ≠ ∅. □
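The proof of Proposition 4.4 is constructive: walk along the chain V_1 ⊂ V_2 ⊂ ··· and set x(v_j) = ρ(V_j) − ρ(V_{j−1}). The sketch below applies this to the (hypothetical) rank function of the uniform matroid U(2,3) and verifies that the resulting vector is a base.

```python
# Greedy construction of a base of B(rho), as in the proof of Proposition 4.4.
from itertools import combinations

def greedy_base(rho, order):
    """Return x with x(v_j) = rho(V_j) - rho(V_{j-1}) along the given order."""
    x, prev, chain = {}, 0, []
    for v in order:
        chain.append(v)
        cur = rho(frozenset(chain))
        x[v] = cur - prev
        prev = cur
    return x

rho = lambda X: min(len(X), 2)     # submodular: rank of U(2,3)
V = ["a", "b", "c"]
x = greedy_base(rho, V)
assert x == {"a": 1, "b": 1, "c": 0}

# x belongs to B(rho): x(X) <= rho(X) for all X, with equality at X = V.
for k in range(len(V) + 1):
    for X in combinations(V, k):
        assert sum(x[v] for v in X) <= rho(frozenset(X))
assert sum(x.values()) == rho(frozenset(V))
```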

The support function of B(ρ) coincides with the Lovász extension of ρ.

Proposition 4.5. For a submodular set function ρ ∈ S[R], we have

      sup {⟨p, x⟩ | x ∈ B(ρ)} = ρ̂(p)    (p ∈ R^V),    (4.14)

where ρ̂ is the Lovász extension (4.6) of ρ.

Proof. For simplicity we assume that dom ρ has a maximal chain of length n = |V|.
Consider a pair of linear programs (LPs):

(A)  Maximize ⟨p, x⟩  subject to  x(X) ≤ ρ(X)  (X ∈ dom ρ),  x(V) = ρ(V);
(B)  Minimize Σ_{X∈dom ρ} y_X ρ(X)  subject to  Σ_{X∈dom ρ} y_X χ_X = p,  y_X ≥ 0  (X ∈ dom ρ \ {V}).

Here x ∈ R^V and y = (y_X | X ∈ dom ρ) are the variables of (A) and (B), respec-
tively, and p ∈ R^V is regarded as a parameter. Note that the equality constraints
in (B) can be written as

      Σ {y_X | X ∈ dom ρ, v ∈ X} = p(v)    (v ∈ V).

Problem (A) is feasible by B(ρ) ≠ ∅ in Proposition 4.4. By LP duality (The-
orem 3.10 (2)), the optimal value of (A), max(A), is equal to the optimal value of
(B), min(B). Obviously, max(A) is equal to the left-hand side of (4.14). Thus,

For the feasibility of (B) we have the following statement, where U_i (i =
1, ..., m) are the subsets determined from p by (4.4).
Claim 1: (B) is feasible ⟺ U_i ∈ dom ρ (i = 1, ..., m).
(Proof of Claim 1) For the proof of ⟸, take a maximal chain {V_j} of dom ρ such
that {U_i | i = 1, ..., m} ⊆ {V_j | j = 1, ..., n}, where we put V_j = {v_1, v_2, ..., v_j}
(j = 1, ..., n). Then the vector y* defined by

is a feasible solution for (B). For the proof of ⟹, take a feasible y that maximizes
Γ = Σ{y_X |X|² | X ∈ dom ρ}.
Claim 2: C = {X ∈ dom ρ \ {V} | y_X > 0} ∪ {V} forms a chain.
(Proof of Claim 2) For Y, Z ∈ C \ {V} with y_Y ≥ y_Z > 0, we have the identity

    y_Y χ_Y + y_Z χ_Z = (y_Y − y_Z) χ_Y + y_Z χ_{Y∪Z} + y_Z χ_{Y∩Z},

where Y ∩ Z, Y ∪ Z ∈ dom ρ. The maximality of Γ implies that |Y \ Z| = 0 or
|Z \ Y| = 0, since otherwise Γ would be increased by 2 y_Z |Y \ Z| |Z \ Y| when the
feasible solution is modified according to the above identity. Therefore Y ⊆ Z or
Z ⊆ Y. Thus Claim 2 is proven.
Since C is a chain with Σ_{X∈C} y_X χ_X = p and y_X > 0 for X ∈ C \ {V}, the
family C must coincide with {U_i | i = 1, ..., m} (cf. (4.5)), and therefore, U_i ∈ dom ρ
(i = 1, ..., m). This completes the proof of Claim 1.
Suppose that (B) is feasible, and let y* be defined by (4.17) with reference to
a maximal chain {V_j} of dom ρ containing {U_i}. Define another vector x* ∈ R^V by

    x*(v_j) = ρ(V_j) − ρ(V_{j−1})  (j = 1, ..., n)

with V_0 = ∅. The solutions x = x* and y = y* are feasible in (A) and (B),
respectively, and have the same objective value:

By LP duality (Theorem 3.10 (1)), this shows the optimality of y* for (B); hence
Σ_X y*_X ρ(X) = min(B). On the other hand, Σ_X y*_X ρ(X) = ρ̂(p) by (4.6) and
(4.17). Combining these with (4.16) shows (4.14).

In the case of infeasible (B), we have min(B) = +∞ in (4.16), whereas ρ̂(p) =
+∞ by Claim 1 and (4.8). Hence (4.14) follows. □

Proposition 4.6. B(ρ) is an integral polyhedron for ρ ∈ S[Z].

Proof. For each p, the optimal base (4.18) is an integer vector for ρ ∈ S[Z]. □

Note 4.7. A ring family (4.12) typically arises from a graph, and conversely, any
ring family can be represented by a graph. For a directed graph G = (V, A) with
vertex set V and arc set A, we call a subset X of V an ideal if (u, v) ∈ A and u ∈ X
imply v ∈ X. That is, X is an ideal if and only if no arc leaves X. The set of ideals

is a ring family with {∅, V} ⊆ D. Conversely, for a ring family D on a set V, we
consider a directed graph G = (V, A) with

Denote by min D the minimum element of D and by max D the maximum element.
Then D coincides with the family of ideals of G that contain min D and are contained
in max D. In particular, D is the family of all ideals of G if {∅, V} ⊆ D. The graph
G given by (4.20) is transitive; i.e., (u, v) ∈ A, (v, w) ∈ A ⟹ (u, w) ∈ A. The
constructions (4.19) and (4.20) establish a one-to-one correspondence between the
set of ring families on V including {∅, V} and the set of transitive directed graphs
with vertex set V.
An acyclic graph⁴⁴ G = (V, A) represents a partial order ⪯ on V defined by
[v ⪯ u ⟺ v is reachable from u by a directed path], and the corresponding ring
family D has the property that a maximal chain of D is of length |V|. In this
particular case, (4.19) reads

Thus, we have a one-to-one correspondence between the set of partial orders on V
and the set of ring families on V including {∅, V} and having a maximal chain of
length |V|. ■
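The graph–ring-family correspondence of Note 4.7 can be verified on a toy instance. The following Python sketch (illustrative only; the graph and element names are made up) enumerates the ideals of a small transitive directed graph and checks the ring-family conditions.

```python
from itertools import combinations

# A transitive directed graph on V = {a, b, c}: an arc (u, v) forces
# every ideal containing u to contain v (no arc leaves an ideal).
V = ['a', 'b', 'c']
A = [('a', 'b'), ('b', 'c'), ('a', 'c')]

def is_ideal(X):
    """X is an ideal iff (u, v) in A and u in X imply v in X."""
    return all(not (u in X and v not in X) for (u, v) in A)

ideals = [frozenset(S) for r in range(len(V) + 1)
          for S in combinations(V, r) if is_ideal(set(S))]

# Ring-family check: closure under union and intersection, with {∅, V} present.
assert all(X | Y in ideals and X & Y in ideals for X in ideals for Y in ideals)
assert frozenset() in ideals and frozenset(V) in ideals
```

Since this particular graph is acyclic and encodes a total order, the ideals form a maximal chain of length |V|, as in the last paragraph of Note 4.7.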

Note 4.8. The minimizers of a submodular set function ρ form a ring family. Let
α denote the minimum value of ρ. If ρ(X) = ρ(Y) = α, we have

    2α ≤ ρ(X ∪ Y) + ρ(X ∩ Y) ≤ ρ(X) + ρ(Y) = 2α

by submodularity (4.9). This implies ρ(X ∪ Y) = ρ(X ∩ Y) = α. Hence,

44
A directed graph is called acyclic if it does not contain directed cycles.

Note 4.9. For a base x ∈ B(ρ), a subset X ⊆ V is said to be a tight set at x if
x(X) = ρ(X). The family of tight sets at x, denoted by

    D(x) = {X ⊆ V | x(X) = ρ(X)},     (4.22)

is a ring family satisfying

This follows from Note 4.8 applied to ρ(X) − x(X). Note that {∅, V} ⊆ D(x). ■

Note 4.10. LP (A) in the proof of Proposition 4.5 is the problem of maximizing
the weight of a base x ∈ B(ρ) with respect to a given weight vector p. An optimal
solution is given by (4.18) for an ordering of the elements of V satisfying p(v_1) ≥
⋯ ≥ p(v_n), where dom ρ is assumed to have a maximal chain of length n = |V|.
Often we refer to this fact by saying that the greedy algorithm works for finding an
optimal base. ■
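The greedy algorithm of Note 4.10 admits a compact sketch in Python (not from the book; rho and the weights are illustrative): sort the elements by decreasing weight and take the chain increments of ρ along the induced maximal chain. The final assertion compares against every ordering, since each extreme base arises from some ordering via (4.18).

```python
from itertools import permutations

def rho(X):
    """Rank of the uniform matroid U_{2,3}, used as a toy submodular function."""
    return min(len(X), 2)

def chain_base(order):
    """Extreme base (4.18) for a given ordering of the elements."""
    x, prev = {}, frozenset()
    for v in order:
        cur = prev | {v}
        x[v] = rho(cur) - rho(prev)
        prev = cur
    return x

def greedy_base(V, p):
    """Note 4.10: order the elements so that p(v_1) >= ... >= p(v_n)."""
    return chain_base(sorted(V, key=lambda v: -p[v]))

V, p = [1, 2, 3], {1: 5, 2: 3, 3: 1}
x = greedy_base(V, p)
weight = sum(p[v] * x[v] for v in V)
# Greedy is optimal among all extreme bases (hence over B(rho)).
assert weight == max(sum(p[v] * chain_base(o)[v] for v in V)
                     for o in permutations(V))
```

Here the greedy base puts weight on the two heaviest elements, matching the matroid greedy algorithm.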

Note 4.11. A partial order on V is associated with each extreme base. Assume
that D = dom ρ has a maximal chain of length n = |V| and denote by ⪯ the partial
order on V associated with D, as in (4.21). A linear order ≤ on V, or an ordering
of elements of V, is said to be an extension of ⪯ if [v ⪯ u ⟹ v ≤ u]. A linear
extension of ⪯ generates an extreme base in (4.18). Conversely, any extreme base
x is generated in this way, but there can be several linear orders that generate x.
We define a partial order ⪯_x on V associated with x by

The partially ordered set P(x) = (V, ⪯_x) thus defined corresponds to the family of
tight sets D(x) in the sense of Note 4.7. In particular,

4.4 Polyhedral Description of M-Convex Sets


An M-convex set is hole free (Theorem 4.12 below), which allows us to identify an
M-convex set with its convex hull. The convex hull of an M-convex set is called
an M-convex polyhedron, which is indeed a polyhedron described by a submodular
function (Proposition 4.13 below).
Let us start with the hole-free property of an M-convex set.

Theorem 4.12. B = conv(B) ∩ Z^V for an M-convex set B ⊆ Z^V, where conv(B)
denotes the convex hull of B.

Proof. Obviously, B ⊆ conv(B) ∩ Z^V. To show the reverse inclusion, take an
arbitrary x ∈ conv(B) ∩ Z^V, which can be represented as

with distinct x_i (1 ≤ i ≤ m). We may assume that there is a positive integer N
such that Nλ_i ∈ Z_+ (1 ≤ i ≤ m). For a representation of the form (4.24), we define

which is intended to measure the complexity of the representation. The representa-
tion (4.24) with m = 1 means x ∈ B (we are done). If m ≥ 2, there are two distinct
indices j, k and u ∈ V such that x_j(u) < x(u) < x_k(u). (B-EXC[Z]) shows the exis-
tence of v ∈ supp⁻(x_k − x_j) with x'_k = x_k − χ_u + χ_v ∈ B and x'_j = x_j + χ_u − χ_v ∈ B.
A modification of the representation (4.24), according to the following:

gives another representation of the form (4.24), for which Φ is smaller at least by
4 min(λ_j, λ_k) (≥ 4/N). The condition Nλ_i ∈ Z_+ is preserved in this modification.
The process of modification ends with m = 1, showing x ∈ B. □
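The exchange axiom (B-EXC[Z]) used in this proof can be checked by brute force on small sets. A Python sketch follows (illustrative; the sets below are toy examples): the base vectors of the uniform matroid U_{2,3} pass, while an arbitrary two-point set fails.

```python
# Incidence vectors of the bases of the uniform matroid U_{2,3}:
# a standard example of an M-convex set.
B = {(1, 1, 0), (1, 0, 1), (0, 1, 1)}

def move(x, u, v):
    """Return x - chi_u + chi_v."""
    z = list(x)
    z[u] -= 1
    z[v] += 1
    return tuple(z)

def satisfies_b_exc(B):
    """Brute-force check of (B-EXC[Z]): for x, y in B and u in supp+(x - y),
    some v in supp-(x - y) keeps both exchanged points in B."""
    for x in B:
        for y in B:
            n = len(x)
            for u in (i for i in range(n) if x[i] > y[i]):
                if not any(move(x, u, v) in B and move(y, v, u) in B
                           for v in range(n) if x[v] < y[v]):
                    return False
    return True

assert satisfies_b_exc(B)
assert not satisfies_b_exc({(1, 0, 1, -2), (0, 1, 0, -1)})  # exchange fails
```

Such a checker is exponential in |B|² and is only meant for experimenting with tiny examples.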

The convex hull of an M-convex set is a polyhedron—the base polyhedron
defined by some submodular set function.

Proposition 4.13. For an M-convex set B, define ρ : 2^V → Z ∪ {+∞} by

    ρ(X) = sup{x(X) | x ∈ B}  (X ⊆ V).

(1) ρ ∈ S[Z].
(2) conv(B) = B(ρ).

Proof. (1) First we show the submodularity inequality (4.9) for X and Y with
ρ(X ∪ Y) and ρ(X ∩ Y) both finite. Take y, z ∈ B with ρ(X ∪ Y) = y(X ∪ Y) and
ρ(X ∩ Y) = z(X ∩ Y); we choose such (y, z) with ‖y − z‖₁ minimum. Then we have
y(v) ≥ z(v) (v ∈ X ∩ Y), since, otherwise, ∃u ∈ (X ∩ Y) ∩ supp⁺(z − y) and, by
(B-EXC⁺[Z]), ∃v ∈ supp⁻(z − y) such that y' = y + χ_u − χ_v ∈ B, for which we
have y'(X ∪ Y) ≥ y(X ∪ Y) and ‖y' − z‖₁ ≤ ‖y − z‖₁ − 2, a contradiction to our
choice of (y, z). Therefore,

    ρ(X ∪ Y) + ρ(X ∩ Y) = y(X ∪ Y) + z(X ∩ Y) ≤ y(X ∪ Y) + y(X ∩ Y)
                        = y(X) + y(Y) ≤ ρ(X) + ρ(Y),

which shows (4.9).

In the case of ρ(X ∪ Y) + ρ(X ∩ Y) = +∞, we consider a sequence of M-convex
sets B_k = {x ∈ B | −k ≤ x(v) ≤ k (v ∈ V)} and the corresponding ρ_k ∈ S[Z] for
k = 1, 2, .... The submodularity inequality (4.9) follows from

by letting k → +∞.



(2) The inclusion conv(B) ⊆ B(ρ) is obvious. For the converse we may assume
that ρ(X) is finite for all X ⊆ V (by the same argument as in the latter half of the
proof of (1)). Let z ∈ R^V be an extreme point of B(ρ). As we have seen in (4.18),
there is an ordering of the elements of V, say, V = {v_1, ..., v_n}, such that z(V_j) = ρ(V_j)
with V_j = {v_1, ..., v_j} for j = 1, ..., n, where n = |V|. For each j = 1, ..., n, there
exists x_j ∈ B with ρ(V_j) = x_j(V_j). By repeated applications of (B-EXC⁺[Z]), as in
the proof of (1) above, we can show the existence of x ∈ B such that x(V_j) = x_j(V_j)
for j = 1, ..., n. We then have z(V_j) = ρ(V_j) = x_j(V_j) = x(V_j) for j = 1, ..., n,
which means z = x ∈ B. Since any extreme point of B(ρ) is contained in B, we
must have B(ρ) ⊆ conv(B). □

The converse of the above proposition is also true.

Proposition 4.14. Let ρ ∈ S[Z] be an integer-valued submodular set function.

(1) B = B(ρ) ∩ Z^V is an M-convex set.
(2) ρ(X) = sup{x(X) | x ∈ B(ρ)}  (X ⊆ V).

Proof. (1) First, B is nonempty by Propositions 4.4 and 4.6. By Proposition 4.2
it suffices to show (B-EXC⁺[Z]). Suppose, to the contrary, that (B-EXC⁺[Z]) fails
for some x, y ∈ B and some u ∈ supp⁺(x − y). For each v ∈ supp⁻(x − y), we have
y + χ_u − χ_v ∉ B, which, together with the integrality of ρ, implies the existence of a
tight set X_v ∈ D(y) with u ∈ X_v and v ∉ X_v. For Z = ∩_{v ∈ supp⁻(x−y)} X_v, we have
y(Z) = ρ(Z) by (4.23), whereas x(Z) > y(Z) by u ∈ Z and Z ∩ supp⁻(x − y) = ∅.
It then follows that x(Z) > ρ(Z), a contradiction to x ∈ B(ρ).
(2) This follows from (4.14) with p = χ_X and (4.7). An alternative proof is as
follows. Since ρ(X) ≥ sup{x(X) | x ∈ B(ρ)} is obvious, it suffices to establish the
equality in the case of sup < +∞. Let x ∈ B(ρ) attain the supremum. Then
for each u ∈ X and v ∈ V \ X there exists X_uv ∈ D(x) with u ∈ X_uv and v ∉ X_uv.
Since D(x) is a ring family (see (4.23)), we have X = ∪_{u∈X} ∩_{v∈V\X} X_uv ∈ D(x),
which means x(X) = ρ(X). □

Propositions 4.13 and 4.14 together imply a one-to-one correspondence be-
tween the family M₀[Z] of M-convex sets and the family S[Z] of integer-valued
submodular set functions. In this sense, the exchange property (B-EXC[Z]) and
the submodularity (4.9) are equivalent.

Theorem 4.15. A set B ⊆ Z^V is M-convex if and only if B = B(ρ) ∩ Z^V for an
integer-valued submodular set function ρ ∈ S[Z]. More specifically, the mappings
Φ : M₀[Z] → S[Z] and Ψ : S[Z] → M₀[Z] defined by

are inverse to each other, establishing a one-to-one correspondence between M₀[Z]
and S[Z].

Proof. For B ∈ M₀[Z], we have Φ(B) ∈ S[Z] and Ψ ∘ Φ(B) = conv(B) ∩ Z^V = B by
Proposition 4.13 and Theorem 4.12, respectively. For ρ ∈ S[Z], we have Ψ(ρ) ∈
M₀[Z] and Φ ∘ Ψ(ρ) = ρ by Propositions 4.6 and 4.14. □

4.5 Submodular Functions as Discrete Convex Functions

We prove two fundamental theorems connecting submodularity and convexity, which
we have already seen in section 1.3.1.
The Lovász extension ρ̂ of a submodular set function ρ is a convex function,
since, by the expression (4.14), ρ̂ is the support function of the base polyhedron.
The converse is also true.

Theorem 4.16 (Lovász). A set function ρ : 2^V → R ∪ {+∞} with ρ(∅) = 0 and
ρ(V) < +∞ is submodular if and only if its Lovász extension ρ̂ : R^V → R ∪ {+∞}
defined in (4.6) is convex; i.e.,

    ρ̂(λp + (1 − λ)q) ≤ λρ̂(p) + (1 − λ)ρ̂(q)  (p, q ∈ R^V, 0 ≤ λ ≤ 1).

Proof. It suffices to prove ⟸ only. Since ρ̂ is positively homogeneous, the convexity
of ρ̂ implies

    ρ̂(χ_X + χ_Y) ≤ ρ̂(χ_X) + ρ̂(χ_Y).

This shows the submodularity of ρ, since ρ̂(χ_X) = ρ(X), ρ̂(χ_Y) = ρ(Y), and ρ̂(χ_X +
χ_Y) = ρ(X ∪ Y) + ρ(X ∩ Y) by (4.7) and (4.6). □
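The chain formula (4.6) for the Lovász extension can be implemented directly. The Python sketch below (not from the book; rho is the illustrative rank function of U_{2,3}) sorts the coordinates of p in decreasing order and weights the chain increments; the assertions check ρ̂(χ_X) = ρ(X) and the subadditivity ρ̂(χ_X) + ρ̂(χ_Y) ≥ ρ̂(χ_X + χ_Y) used in the proof.

```python
def lovasz_ext(rho, p):
    """Chain formula for the Lovasz extension: sort the coordinates of p in
    decreasing order and weight the increments rho(U_i) of the induced chain."""
    order = sorted(p, key=lambda v: -p[v])
    vals = [p[v] for v in order]
    total, prev = 0.0, frozenset()
    for i, v in enumerate(order):
        prev = prev | {v}
        coeff = vals[i] - (vals[i + 1] if i + 1 < len(order) else 0.0)
        total += coeff * rho(prev)
    return total

rho = lambda X: min(len(X), 2)  # rank of the uniform matroid U_{2,3}

# rho^(chi_X) = rho(X), cf. (4.7).
assert lovasz_ext(rho, {1: 1, 2: 1, 3: 0}) == rho(frozenset({1, 2}))
# Subadditivity on characteristic vectors, as in the proof of Theorem 4.16:
# rho^(chi_X) + rho^(chi_Y) >= rho^(chi_X + chi_Y) = rho(X∪Y) + rho(X∩Y).
X, Y = {1: 1, 2: 1, 3: 0}, {1: 0, 2: 1, 3: 1}
XY = {v: X[v] + Y[v] for v in X}
assert lovasz_ext(rho, X) + lovasz_ext(rho, Y) >= lovasz_ext(rho, XY)
```

The last coefficient multiplies ρ(V), so the formula is valid over all of R^V, including vectors with negative components (cf. the footnote to (4.6)).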

The connection of submodularity to convexity is reinforced by the following
discrete separation theorem for a pair of submodular/supermodular set functions.

Theorem 4.17 (Frank's discrete separation theorem). Let ρ : 2^V → R ∪ {+∞}
and μ : 2^V → R ∪ {−∞} be submodular and supermodular functions, respectively,
with ρ(∅) = μ(∅) = 0, ρ(V) < +∞, and μ(V) > −∞ (namely, ρ, −μ ∈ S[R]). If

    ρ(X) ≥ μ(X)  (∀X ⊆ V),     (4.26)

there exists x* ∈ R^V such that

    μ(X) ≤ x*(X) ≤ ρ(X)  (∀X ⊆ V).     (4.27)

Moreover, if ρ and μ are integer valued (namely, ρ, −μ ∈ S[Z]), the vector x* can
be chosen to be integer valued (namely, x* ∈ Z^V).

The combinatorial essence of the above theorem lies in the second half, claim-
ing the existence of an integer vector for integer-valued functions, whereas the exis-
tence of a real vector x* alone can be proved on the basis of the separation theorem
in convex analysis and the relationship between submodularity and convexity stated
in Theorem 4.16 (see Note 4.20).
The proof of Theorem 4.17 is based on Edmonds's intersection theorem below,
which is the most important duality theorem in the theory of submodular functions.
For a submodular set function ρ ∈ S[R], we define a polyhedron

    P(ρ) = {x ∈ R^V | x(X) ≤ ρ(X)  (∀X ⊆ V)},     (4.28)

called the submodular polyhedron associated with ρ. Note the relationship

    P(ρ) ∩ {x ∈ R^V | x(V) = ρ(V)} = B(ρ)

to the base polyhedron B(ρ).

Theorem 4.18 (Edmonds's intersection theorem). Let ρ₁, ρ₂ : 2^V → R ∪ {+∞} be
submodular set functions with ρ₁(∅) = ρ₂(∅) = 0, ρ₁(V) < +∞, and ρ₂(V) < +∞
(namely, ρ₁, ρ₂ ∈ S[R]). Then

    max{x(V) | x ∈ P(ρ₁) ∩ P(ρ₂)} = min{ρ₁(X) + ρ₂(V \ X) | X ⊆ V}.     (4.29)

Moreover, if ρ₁ and ρ₂ are integer valued (namely, ρ₁, ρ₂ ∈ S[Z]), the polyhedron
P(ρ₁) ∩ P(ρ₂) is integral in the sense of

    P(ρ₁) ∩ P(ρ₂) = conv(P(ρ₁) ∩ P(ρ₂) ∩ Z^V),     (4.30)

and there exists an integer-valued vector x* that attains the maximum on the left-
hand side of (4.29).

Proof. Denoting D_i = (dom ρ_i) \ {∅} (i = 1, 2), we consider a pair of LPs:

Here x ∈ R^V and (y_{iX} | X ∈ D_i, i = 1, 2) are the variables of (A) and (B),
respectively, and p ∈ R^V is a parameter. Note that the equality constraints in (B)
can be written as

    Σ{y_{1X} | v ∈ X ∈ D₁} + Σ{y_{2X} | v ∈ X ∈ D₂} = p(v)  (v ∈ V).

Problem (A) is feasible. Let p be such that (A) has an optimal solution. Then
Problem (B) also has an optimal solution by LP duality (Theorem 3.10).

There exists an optimal solution to (B) such that C_i = {X ∈ D_i | y_{iX} > 0}
forms a chain for i = 1, 2. To prove this, take an optimal solution that maximizes

If y_{iY} ≥ y_{iZ} > 0 for some i ∈ {1, 2} and Y, Z ∈ D_i, we have

where the latter is due to the submodularity of ρ_i. This means that the modification
of (y_{iX}) to (y'_{iX}) defined by

would increase Γ by 2 y_{iZ} |Y \ Z| |Z \ Y| while maintaining the optimality. By the
maximality of Γ we must have |Y \ Z| = 0 or |Z \ Y| = 0; i.e., Y ⊆ Z or Z ⊆ Y.
Let A_i be the incidence matrix of C_i for i = 1, 2. Namely, A_i is a |C_i| × |V|
matrix with rows indexed by C_i and columns by V; for X ∈ C_i and v ∈ V, the
(X, v)-entry is equal to 1 if v ∈ X and to 0 otherwise. Define A to be the matrix
obtained by stacking A₁ and A₂, which is a totally unimodular matrix, as shown
in Example 3.12. The vector y = (y_{iX} | X ∈ C_i, i = 1, 2) of nonzero entries of the
optimal solution to (B) is determined as a solution to yᵀA = p; see (4.31). By the
total unimodularity of A, y can be chosen to be integral for an integral p.
For p = 1, y is a {0, 1}-vector, which implies C₁ = {X} and C₂ = {V \ X} for
some X ⊆ V. Hence, the optimal value of Problem (B) for p = 1 is equal to the
right-hand side of (4.29). On the other hand, the optimal value of Problem (A) for
p = 1 is obviously equal to the left-hand side of (4.29). Then the identity (4.29)
follows from LP duality (Theorem 3.10 (2)).
It remains to show the integrality (4.30) for ρ₁, ρ₂ ∈ S[Z]. Define a vector
q = (ρ_i(X) | X ∈ C_i, i = 1, 2). By the complementarity (Theorem 3.10 (3)), a
feasible solution x to (A) is optimal if and only if it satisfies Ax = q. Such an x
can be chosen to be integral by the integrality of q and the total unimodularity of
A. The above argument shows that Problem (A) has an integral optimal solution
for any p ∈ Z^V for which (A) has an optimal solution. This implies (4.30). □
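The min-max identity (4.29) can be confirmed by exhaustive search on a tiny instance. In the Python sketch below (illustrative; ρ₁ and ρ₂ are toy rank functions), the primal maximum is searched over integer points in a small box, which suffices here because the singleton constraints bound each coordinate from above.

```python
from itertools import combinations, product

V = (0, 1, 2)
subsets = [frozenset(S) for r in range(4) for S in combinations(V, r)]

rho1 = lambda X: min(len(X), 2)  # rank of the uniform matroid U_{2,3}
rho2 = lambda X: len(X)          # rank of the free matroid

def in_P(x, rho):
    """Membership in the submodular polyhedron P(rho), cf. (4.28)."""
    return all(sum(x[v] for v in S) <= rho(S) for S in subsets)

# Primal: maximize x(V) over integer points of P(rho1) ∩ P(rho2).
primal = max(sum(x) for x in product(range(-1, 3), repeat=3)
             if in_P(x, rho1) and in_P(x, rho2))
# Dual: minimize rho1(X) + rho2(V \ X) over X ⊆ V, as in (4.29).
dual = min(rho1(X) + rho2(frozenset(V) - X) for X in subsets)
assert primal == dual == 2
```

By the primal integrality in Theorem 4.18, restricting the search to integer points loses nothing for integer-valued ρ₁, ρ₂.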

As an immediate corollary of (4.30) we obtain

    B(ρ₁) ∩ B(ρ₂) = conv(B(ρ₁) ∩ B(ρ₂) ∩ Z^V)     (4.32)

for ρ₁, ρ₂ ∈ S[Z], the integrality of the intersection of integral base polyhedra.
We are now in the position to prove Frank's discrete separation theorem (The-
orem 4.17). Consider Edmonds's intersection theorem (Theorem 4.18) for

    ρ₁(X) = ρ(X),  ρ₂(X) = μ(V) − μ(V \ X)  (X ⊆ V).

It follows from (4.26) that the minimum on the right-hand side of (4.29) is equal
to μ(V). Hence, there exists x* ∈ P(ρ₁) ∩ P(ρ₂) such that x*(V) = μ(V). The
condition x* ∈ P(ρ₁) is equivalent to x*(X) ≤ ρ(X) (∀X ⊆ V), and x* ∈ P(ρ₂)
is equivalent to x*(V \ X) ≤ ρ₂(V \ X) (∀X ⊆ V), which is equivalent further to
x*(X) ≥ μ(X) (∀X ⊆ V) by x*(V) = μ(V). Hence, ρ(X) ≥ x*(X) ≥ μ(X) for all
X ⊆ V. For integer-valued ρ and μ, we can take an integral x* by the integrality
assertion in Theorem 4.18. This completes the proof of Theorem 4.17.

Note 4.19. Discreteness is twofold in Edmonds's intersection theorem. First,


the minimum on the right-hand side of (4.29) is taken over combinatorial objects,
i.e., subsets of V, independently of whether the submodular functions are integer
valued or not. Second, the maximum can be taken over discrete points in the case
of integer-valued submodular functions. The former is sometimes referred to as the
dual integrality and the latter as the primal integrality. •

Note 4.20. The first half of the discrete separation theorem (Theorem 4.17) is
derived here from the separation theorem in convex analysis and the relationship
between submodularity and convexity given in Theorem 4.16. Let ρ̂ and μ̂ be the
Lovász extensions of ρ and μ, respectively. We have ρ̂(p) ≥ μ̂(p) (∀p ∈ R₊^V) by the
assumption ρ ≥ μ as well as the definition (4.6) of the Lovász extension. Define
functions g : R^V → R ∪ {+∞} and k : R^V → R ∪ {−∞} by

Then g is convex and k is concave by Theorem 4.16; these functions are polyhedral,
and dom g ∩ dom k ≠ ∅ by g(0) = k(0) = 0. The separation theorem in convex
analysis (Theorem 3.5) applies to the pair of g and k, yielding β* ∈ R and x* ∈ R^V
such that

This inequality for p = χ_X (X ⊆ V) yields the inequality (4.27), where β* = 0
follows from g(0) = ρ̂(0) = 0 and k(0) = μ̂(0) = 0. ■

4.6 M-Convex Sets as Discrete Convex Sets

We show a number of nice properties of M-convex sets that qualify them as well-
behaved discrete convex sets.
We start with a discrete separation theorem for two M-convex sets.

Theorem 4.21 (Discrete separation for M-convex sets). Let B₁ and B₂ (⊆ Z^V) be
M-convex sets. If they are disjoint (B₁ ∩ B₂ = ∅), there exists p* ∈ {0, 1}^V ∪ {0, −1}^V
such that

Proof. By Theorem 4.15, we have B_i = B(ρ_i) ∩ Z^V for some submodular functions
ρ_i ∈ S[Z] (i = 1, 2). If ρ₁(V) ≠ ρ₂(V), we can take p* = χ_V or −χ_V. Suppose
ρ₁(V) = ρ₂(V). Theorem 4.17 applied to ρ(X) = ρ₁(X) and μ(X) = ρ₂(V) − ρ₂(V \
X) yields

Since B₁ ∩ B₂ = ∅ by assumption, there exists such an X. Noting

we see that p* = −χ_X (or p* = 1 − χ_X ∈ {0, 1}^V) is a valid choice for (4.33). □

The content of Theorem 4.21 consists of two claims. The first, explicit in the
statement, is that the separating vector p* is so special that p* or −p* is a {0, 1}-
vector. The second, less conspicuous and more subtle, is that conv(B₁) ∩ conv(B₂) = ∅
is implied by B₁ ∩ B₂ = ∅, since otherwise the inequality (4.33) is impossible. The
implication

    B₁ ∩ B₂ = ∅  ⟹  conv(B₁) ∩ conv(B₂) = ∅     (4.34)

was named convexity in intersection in section 3.3.
The following theorem shows another integrality property, stronger than (4.34),
of the intersection of two M-convex polyhedra.

Theorem 4.22. For M-convex sets B₁, B₂ ⊆ Z^V we have conv(B₁ ∩ B₂) =
conv(B₁) ∩ conv(B₂).

Proof. For the representation B_i = B(ρ_i) ∩ Z^V with ρ_i ∈ S[Z] (i = 1, 2), we have
conv(B_i) = B(ρ_i) (i = 1, 2). Then the claim follows from (4.32). □

We now turn to the Minkowski sum of M-convex sets. Claim (3) in the
theorem below says that the Minkowski sum of two M-convex sets is again an
M-convex set. An important consequence of this is that the family of M-convex
sets has the property of convexity in Minkowski sum considered in section 3.3.

Theorem 4.23.
(1) For submodular set functions ρ₁, ρ₂ ∈ S[R], we have

    B(ρ₁) + B(ρ₂) = B(ρ₁ + ρ₂).

(2) For integer-valued submodular set functions ρ₁, ρ₂ ∈ S[Z], we have

    (B(ρ₁) ∩ Z^V) + (B(ρ₂) ∩ Z^V) = B(ρ₁ + ρ₂) ∩ Z^V.

(3) For M-convex sets B₁, B₂ ⊆ Z^V, B₁ + B₂ is an M-convex set and


Proof. (1) The proof of B(ρ₁) + B(ρ₂) ⊆ B(ρ₁ + ρ₂) is easy: A vector x ∈
B(ρ₁) + B(ρ₂) can be decomposed as x = x₁ + x₂ with x_i ∈ B(ρ_i) (i = 1, 2), which
implies x(X) = x₁(X) + x₂(X) ≤ ρ₁(X) + ρ₂(X) (with equality for X = V). Con-
versely, for x ∈ B(ρ₁ + ρ₂), we have x(X) − ρ₂(X) ≤ ρ₁(X), and by the discrete
separation theorem (Theorem 4.17), there exists y ∈ R^V such that x(X) − ρ₂(X) ≤
y(X) ≤ ρ₁(X), with equality for X = V. For z = x − y, we have y ∈ B(ρ₁) and
z ∈ B(ρ₂). Hence x = y + z ∈ B(ρ₁) + B(ρ₂).
(2) This is because the vectors x, y, z above can be taken to be integral.
(3) We can represent B_i = B(ρ_i) ∩ Z^V with ρ_i ∈ S[Z] (i = 1, 2). Then the
left-hand side of (2) is B₁ + B₂, whereas for the right-hand side of (2) we have

by (1) and Proposition 3.17 (4). Since ρ₁ + ρ₂ ∈ S[Z], B₁ + B₂ = B(ρ₁ + ρ₂) ∩ Z^V
is an M-convex set. □

Finally, we show the integral convexity of an M-convex set.

Theorem 4.24. An M-convex set is integrally convex.

Proof. Let B be an M-convex set and H = {x ∈ R^V | x(V) = r} be the hyperplane
containing it. Theorem 4.22 applied to B₁ = B and B₂ = N(x) ∩ H with N(x)
defined in (3.58) yields conv(B ∩ N(x) ∩ H) = conv(B) ∩ conv(N(x) ∩ H). This
implies conv(B ∩ N(x)) = conv(B) ∩ conv(N(x)), since B ∩ H = B, conv(N(x)) ∩
H = conv(N(x) ∩ H), and conv(B) ∩ H = conv(B). Then the integral convexity of
B follows from (3.72). □

Note 4.25. The intersection of two M-convex sets is not necessarily M-convex,
though it is integrally convex (see Theorem 8.31). Such a set is referred to as an
M₂-convex set. An example of an M₂-convex set that is not M-convex is given by

which is the intersection of two M-convex sets B₁ = S ∪ {(0, 1, 1, −2)} and B₂ =
S ∪ {(1, 1, 0, −2)}. Note that (B-EXC[Z]) fails for S with x = (1, 0, 1, −2), y =
(0, 1, 0, −1), and u = 1. ■

4.7 M♮-Convex Sets

In section 4.1 we introduced the concept of M♮-convex sets as the projection of M-
convex sets along an arbitrarily chosen coordinate axis. The concepts of M♮-convex
sets and M-convex sets are essentially equivalent, since an M-convex set lies on a
hyperplane {x ∈ R^V | x(V) = r} for some r ∈ Z (Proposition 4.1). All the results
for M-convex sets can be translated for M♮-convex sets. Here we state some of these
(nontrivial) translations.
The definition of an M♮-convex set by projection may be stated more formally
as follows. Let 0 denote a new element not in V and put Ṽ = {0} ∪ V.

Figure 4.1. M♮-convex sets.

A set Q ⊆ Z^V is an M♮-convex set if it can be represented as

for some M-convex set B ⊆ Z^Ṽ. It turns out that an M♮-convex set Q can be
characterized by an exchange axiom:
(B♮-EXC[Z]) For x, y ∈ Q and u ∈ supp⁺(x − y),
(i) x − χ_u ∈ Q and y + χ_u ∈ Q, or
(ii) there exists v ∈ supp⁻(x − y) such that x − χ_u + χ_v ∈ Q and
y + χ_u − χ_v ∈ Q.
It is required in (B♮-EXC[Z]) that at least one of (i) and (ii) be satisfied, depending
on a given triple (x, y, u). Examples of M♮-convex sets are given in Fig. 4.1.
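The axiom (B♮-EXC[Z]) is directly checkable on small sets. A Python sketch follows (not from the book; the examples are toy sets in Z²): the integer box {0,1}² satisfies the axiom, while the "diagonal" pair {(0,0), (1,1)} does not.

```python
from itertools import product

def b_nat_exc(Q):
    """Brute-force check of (B-natural-EXC[Z]): for each u in supp+(x - y),
    case (i) (shift by chi_u alone) or case (ii) (exchange chi_u against
    some chi_v) must keep both points inside Q."""
    def shifted(z, u, s, v=None):
        w = list(z)
        w[u] += s
        if v is not None:
            w[v] -= s
        return tuple(w)
    for x in Q:
        for y in Q:
            n = len(x)
            for u in (i for i in range(n) if x[i] > y[i]):
                ok = shifted(x, u, -1) in Q and shifted(y, u, +1) in Q
                ok = ok or any(shifted(x, u, -1, v) in Q and
                               shifted(y, u, +1, v) in Q
                               for v in range(n) if x[v] < y[v])
                if not ok:
                    return False
    return True

box = set(product(range(2), repeat=2))  # the integer interval [0,1] x [0,1]
assert b_nat_exc(box)
assert not b_nat_exc({(0, 0), (1, 1)})  # neither (i) nor (ii) applies
```

The box example matches the remark below that an integer interval [a, b]_Z is M♮-convex but, in general, not M-convex.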
Whereas M♮-convex sets are conceptually equivalent to M-convex sets, the
class of M♮-convex sets is strictly larger than that of M-convex sets. This follows
from the implication (B-EXC[Z]) ⟹ (B♮-EXC[Z]), as well as from the example of an
integer interval [a, b]_Z that is not M-convex but M♮-convex. We denote by M♮₀[Z]
the set of M♮-convex sets.
The projection of a base polyhedron is known to coincide with what is called a
generalized polymatroid (or g-polymatroid for short); see Theorem 3.58 of Fujishige
[65]. Hence, an M♮-convex set is precisely the set of integer points of an integral
g-polymatroid, and the convex hull of an M♮-convex set (M♮-convex polyhedron) is
represented as

    Q(ρ, μ) = {x ∈ R^V | μ(X) ≤ x(X) ≤ ρ(X)  (∀X ⊆ V)}     (4.36)

for a pair (ρ, μ) of an integer-valued submodular function ρ ∈ S[Z] and a super-
modular function μ (i.e., −μ ∈ S[Z]) such that

A set Q ⊆ Z^V satisfies (B♮-EXC[Z]) if and only if it can be represented as Q =
Q(ρ, μ) ∩ Z^V in this way.
The intersection of two M♮-convex sets is called an M♮₂-convex set. An M♮₂-
convex set is a projection of an M₂-convex set, and the class of M♮₂-convex sets is
strictly larger than that of M₂-convex sets.

4.8 M-Convex Polyhedra

M-convex polyhedra were defined in section 4.4 as the convex hulls of M-convex sets,
and as such, they are necessarily integral polyhedra. The concept of M-convexity,
however, can also be defined for general (nonintegral) polyhedra.
A nonempty polyhedron B ⊆ R^V is defined to be an M-convex polyhedron if
it satisfies the following:
(B-EXC[R]) For x, y ∈ B and u ∈ supp⁺(x − y), there exist v ∈
supp⁻(x − y) and a positive number α₀ ∈ R₊₊ such that x − α(χ_u − χ_v) ∈
B and y + α(χ_u − χ_v) ∈ B for all α ∈ [0, α₀]_R.
The following weaker exchange axiom:
(B-EXC⁺[R]) For x, y ∈ B and u ∈ supp⁺(x − y), there exist v ∈
supp⁻(x − y) and a positive number α₀ ∈ R₊₊ such that y + α(χ_u − χ_v) ∈
B for all α ∈ [0, α₀]_R,
is in fact equivalent to (B-EXC[R]). That is,

for a nonempty polyhedron B ⊆ R^V (cf. Proposition 4.2).


The one-to-one correspondence of M-convex sets with submodular set func-
tions (Theorem 4.15) is generalized as follows:

where an integral M-convex polyhedron means an M-convex polyhedron B such
that B = conv(B ∩ Z^V). For an integral M-convex polyhedron B and integer points
x, y ∈ B ∩ Z^V, we can take α₀ = 1 in (B-EXC[R]) by (4.40). An integral polyhedron
B is M-convex if and only if B ∩ Z^V is an M-convex set. We denote by M₀[R] the
set of M-convex polyhedra and by M₀[Z|R] the set of integral M-convex polyhedra.
The projection of an M-convex polyhedron along a coordinate axis is called
an M♮-convex polyhedron, for which we have

where Q(ρ, μ) is defined in (4.36), and the following exchange axiom:
(B♮-EXC[R]) For x, y ∈ Q and u ∈ supp⁺(x − y), there exist v ∈
supp⁻(x − y) ∪ {0} and a positive number α₀ ∈ R₊₊ such that x −
α(χ_u − χ_v) ∈ Q and y + α(χ_u − χ_v) ∈ Q for all α ∈ [0, α₀]_R.
We denote by M♮₀[R] and M♮₀[Z|R] the sets of M♮-convex polyhedra and integral
M♮-convex polyhedra, respectively.

An M-convex cone means a cone that is an M-convex polyhedron. It is char-
acterized as a convex cone spanned by vectors of the form χ_u − χ_v (u, v ∈ V), to
be proved in Note 8.9. That is,

where we may assume that A is transitive; i.e., (u, v) ∈ A, (v, w) ∈ A ⟹ (u, w) ∈ A.
See Theorem 3.26 of Fujishige [65] for the extreme rays of an M-convex cone.
An M-convex polyhedron is characterized as a polyhedron such that the tan-
gent cone at each point is an M-convex cone (by (a) ⟺ (b) in Theorem 6.63).
Combining this with (4.42) yields a characterization of an M-convex polyhedron in
terms of the directions of edges:
B is an M-convex polyhedron

Similarly,
Q is an M♮-convex polyhedron

where χ₀ = 0.
It is noted again that M-convex polyhedra and M♮-convex polyhedra are syn-
onyms of base polyhedra and g-polymatroids, respectively.

Bibliographical Notes
As remarked already, this chapter is a reorganization of known results in the the-
ory of submodular functions; see Fujishige [65] and Schrijver [183]. The proof of
Theorem 4.12 (hole-free property) is taken from Murota [141]. The equivalence
of exchangeability and submodularity (Theorem 4.15) is well known, but neither
precise statement nor proof can be found in the literature; Theorem 4.15 and the
proof are taken from [141]. The name "Lovász extension" was introduced by Fu-
jishige [63], [65]. Theorem 4.16 (submodularity vs. convexity) is by Lovász [123]
and Theorem 4.17 (discrete separation theorem) by Frank [55]. The intersection
theorem (Theorem 4.18) and the related statements (Theorem 4.22 (convexity in
intersection) and Theorem 4.23 (convexity in Minkowski sum)) are due to Edmonds
[44]. Theorem 4.21 (separation of M-convex sets) is observed in Murota [140]. In-
tegral convexity of M-convex sets (Theorem 4.24) is due to Murota-Shioura [153].
The example in Note 4.25 is an adaptation of Example 3.7 of [153]. The ter-
minology of M^-convex sets as well as that of the exchange axiom (B^-EXCfZ])
is introduced by Murota-Shioura [151]. The concept of g-polymatroids is due to
Frank [57] (see also Frank-Tardos [58]), whereas the characterization as the pro-
jection of base polyhedra is by Fujishige [64]. Proofs of (4.38) and (4.39) can be
120 Chapter 4. M-Convex Sets and Submodular Set Functions

found in Murota-Shioura [152]. The characterization (4.43) of base polyhedra by
edges seems to have appeared first in Tomizawa [200] (without proof); a proof
can be found in Fujishige-Yang [69] and an alternative proof is given in Note 8.9.
Variants of (4.43) yield other classes of polyhedra with combinatorial structures;
see Danilov-Koshevoy [32], Fujishige-Makino-Takabatake-Kashiwabara [67], and
Kashiwabara-Takabatake [109]. A relaxation (weakening) of the exchange axiom
gives rise to the concept of the jump systems of Bouchet-Cunningham [18]; see also
Lovász [124].
Chapter 5

L-Convex Sets and Distance Functions

L-convex sets form another class of well-behaved discrete convex sets. They are
defined in terms of an abstract axiom and correspond one-to-one to integer-valued
distance functions satisfying the triangle inequality. L-convex sets (or their convex
hull) are, in fact, a familiar object in the theory of network flows, though the
terminology of L-convexity is not used there. Emphasis here is placed on a systematic
convex analysis.

5.1 Definition
A nonempty set of integer points D ⊆ Z^V is defined to be an L-convex set if it
satisfies the following two conditions:
(SBS[Z]) p, q ∈ D ⟹ p ∨ q, p ∧ q ∈ D,
(TRS[Z]) p ∈ D ⟹ p ± 1 ∈ D,
where p ∨ q and p ∧ q denote the componentwise maximum and minimum of p and
q, and 1 = (1, 1, ..., 1). We denote by L₀[Z] the set of L-convex sets.
L-convexity thus defined for a set D ⊆ Z^V is equivalent to the L-convexity of
the indicator function δ_D : Z^V → {0, +∞} of D, defined in (3.51). Namely, D is
an L-convex set, satisfying (SBS[Z]) and (TRS[Z]), if and only if δ_D is an L-convex
function, satisfying (SBF[Z]) and (TRF[Z]) introduced in section 1.4.1.
Since an L-convex set is homogeneous in the direction of 1 by (TRS[Z]), we
may consider the restriction of an L-convex set to the coordinate plane defined
by p(v₀) = 0 for an arbitrary v₀ ∈ V. A set derived from an L-convex set by
such a restriction (intersection with a coordinate plane) is called an L♮-convex set.
Whereas L♮-convex sets are conceptually equivalent to L-convex sets, the class of
L♮-convex sets is strictly larger than that of L-convex sets. The simplest example
of an L♮-convex set that is not L-convex is an integer interval [a, b]_Z.
We focus on L-convex sets in the development of the theory and deal with
L♮-convex sets in section 5.5.
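Condition (SBS[Z]) is easy to test by brute force. The Python sketch below (illustrative only; the sets are made-up finite examples, so (TRS[Z]) is not at issue) checks closure under componentwise max and min for a finite integer box — the typical cross-section of an L-convex set — and exhibits a violation for a set missing a join.

```python
from itertools import product

def join(p, q):
    """Componentwise maximum p ∨ q."""
    return tuple(max(a, b) for a, b in zip(p, q))

def meet(p, q):
    """Componentwise minimum p ∧ q."""
    return tuple(min(a, b) for a, b in zip(p, q))

# A finite integer box is closed under join and meet, so it satisfies (SBS[Z]).
D = set(product(range(3), range(2)))
assert all(join(p, q) in D and meet(p, q) in D for p in D for q in D)

# A set violating (SBS[Z]): the join of (1,0) and (0,1) is missing.
D_bad = {(0, 0), (1, 0), (0, 1)}
assert join((1, 0), (0, 1)) not in D_bad
```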


5.2 Distance Functions and Associated Polyhedra

We introduce here some fundamental facts about distance functions and the asso-
ciated polyhedra, which turn out to be the convex hulls of L-convex sets.
By a distance function we mean a function γ : V × V → R ∪ {+∞} such
that γ(v, v) = 0 (∀v ∈ V), where γ may take negative values and is not necessarily
symmetric (i.e., γ(u, v) ≠ γ(v, u) in general). With a distance function γ we can
associate a directed graph G_γ = (V, A_γ) with vertex set V and arc set

where γ(u, v) represents the length of arc (u, v). We denote by γ̄(u, v) the shortest
length of a path from u to v in G_γ. The function γ̄ is well defined if there exists
no negative cycle in G_γ, where a negative cycle means a directed cycle of negative
length.
The triangle inequality

    γ(u, v) + γ(v, w) ≥ γ(u, w)  (u, v, w ∈ V)     (5.2)

is a natural property for a distance function γ. We denote by T[R] the set of
distance functions satisfying the triangle inequality and by T[Z] the set of integer-
valued such functions. We have γ̄ ∈ T[R] for any distance function γ such that G_γ
contains no negative cycle, and γ̄ = γ for γ ∈ T[R].
For a distance function γ, a vector p ∈ R^V is said to be an admissible potential
or a feasible potential if it satisfies the system of inequalities

    p(v) − p(u) ≤ γ(u, v)  ((u, v) ∈ A_γ).     (5.3)

The set of admissible potentials is denoted by

    D(γ) = {p ∈ R^V | p(v) − p(u) ≤ γ(u, v)  (∀(u, v) ∈ A_γ)}.     (5.4)

Note that the triangle inequality (5.2) is not assumed here.


The following are fundamental facts well known in network flow theory.

Proposition 5.1. Let γ be a distance function.

(1) D(γ) ≠ ∅ ⟺ no negative cycle exists in the graph G_γ.
(2) If D(γ) ≠ ∅, we have

    γ̄(u, v) = max{p(v) − p(u) | p ∈ D(γ)}  (u, v ∈ V, u ≠ v),     (5.5)

and D(γ̄) = D(γ).
(3) For γ ∈ T[R], D(γ) is nonempty and

(4) D(γ) is an integral polyhedron for an integer-valued γ.



Proof. For x ∈ R^V we consider a pair of linear programs (LPs):

    (P)  Minimize  Σ_{(u,v)∈A_γ} γ(u, v) λ(u, v)
         subject to  Σ_{v:(v,u)∈A_γ} λ(v, u) − Σ_{v:(u,v)∈A_γ} λ(u, v) = x(u)  (u ∈ V),
                     λ(u, v) ≥ 0  ((u, v) ∈ A_γ);

    (D)  Maximize  ⟨p, x⟩ = Σ_{v∈V} p(v) x(v)
         subject to  p(v) − p(u) ≤ γ(u, v)  ((u, v) ∈ A_γ).

Here λ = (λ(u, v) | (u, v) ∈ A_γ) and p are the variables of (P) and (D), respectively. The
coefficient matrix is totally unimodular, being the negative of the incidence matrix
of graph G_γ (see Example 3.11). The set of feasible solutions to (D) coincides with
D(γ).
(1) If (D) is feasible, the sum of the inequalities (5.3) for arcs in a directed
cycle shows the nonnegativity of the cycle. Conversely, if no negative cycle exists,
we can define p(v) to be the shortest path length from a fixed starting vertex to v
to obtain a feasible solution p to (D) (with an obvious modification for vertices v
not reachable from the starting vertex).
(2) In the particular case of x = χ_{v0} − χ_{u0} for distinct u0, v0 ∈ V, the objective
function of (D) is equal to ⟨p, x⟩ = p(v0) − p(u0), and hence the optimal value of
(D) equals the right-hand side of (5.5) for (u, v) = (u0, v0). By Theorem 3.13, the
optimal solution λ to (P) can be chosen to be an integer vector, which is in fact a
{0, 1}-vector. Such an optimal solution to (P) represents a shortest path from u0 to
v0, and therefore, the optimal value of (P) is equal to γ̄(u0, v0). By the feasibility
of (D), LP duality (Theorem 3.10 (2)) applies to show (5.5). By γ ≥ γ̄ we have
D(γ̄) ⊆ D(γ). For p ∈ D(γ), adding the inequalities (5.3) for arcs in the shortest
path from u0 to v0 yields p(v0) − p(u0) ≤ γ̄(u0, v0), which shows D(γ) ⊆ D(γ̄).
(3) The condition γ ∈ T[R] implies the nonexistence of negative cycles and
γ̄ = γ in (5.5).
(4) By Theorem 3.13, the integrality of D(γ) follows from the total unimodularity
of the coefficient matrix. □
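Proposition 5.1 is constructive: the closure γ̄ is obtained by an all-pairs
shortest-path computation, and an admissible potential by single-source shortest
paths. The following Python sketch illustrates this on a toy instance (the
dict-based encoding and the particular numbers are our own illustration, not
from the text):

```python
INF = float("inf")

def closure(V, gamma):
    """Shortest-path closure (gamma-bar) of a distance function, via Floyd-Warshall."""
    d = {(u, v): (0 if u == v else gamma.get((u, v), INF)) for u in V for v in V}
    for w in V:
        for u in V:
            for v in V:
                d[u, v] = min(d[u, v], d[u, w] + d[w, v])
    return d

# a toy asymmetric distance function on V = {0, 1, 2}, with no negative cycle
V = [0, 1, 2]
gamma = {(0, 1): 2, (1, 2): -1, (0, 2): 5, (2, 0): 4}
d = closure(V, gamma)
assert d[0, 2] == 1          # the path 0 -> 1 -> 2 beats the direct arc of length 5
# the closure satisfies the triangle inequality (5.2)
assert all(d[u, w] <= d[u, v] + d[v, w] for u in V for v in V for w in V)
# shortest path lengths from a fixed root give an admissible potential (Prop 5.1 (1))
p = {v: d[0, v] for v in V}
assert all(p[v] - p[u] <= gamma[u, v] for (u, v) in gamma)
```

All three assertions mirror statements of the proposition: γ̄ ≤ γ with the
triangle inequality, and feasibility of the shortest-path potential.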

5.3 Polyhedral Description of L-Convex Sets


An L-convex set is hole free (Theorem 5.2 below), which allows us to identify an
L-convex set with its convex hull. The convex hull of an L-convex set is called an
L-convex polyhedron, which is indeed a polyhedron described by a distance function
(Proposition 5.3 below).
Let us start with the hole-free property of an L-convex set.

Theorem 5.2. D = D̄ ∩ Z^V for an L-convex set D ⊆ Z^V.



Proof. Obviously, D ⊆ D̄ ∩ Z^V. To show the reverse inclusion, take an arbitrary
p ∈ D̄ ∩ Z^V, which can be represented as

    p = Σ_{i=1}^m λ_i p_i   (λ_i > 0, Σ_{i=1}^m λ_i = 1, p_i ∈ D)     (5.7)

with distinct p_i (1 ≤ i ≤ m). The representation (5.7) with m = 1 means p ∈ D
(we are done). When m ≥ 2, repeated modifications of (5.7) as

    if λ_k ≥ λ_j :  λ_j p_j + λ_k p_k  ⟹  λ_j [(p_j ∨ p_k) + (p_j ∧ p_k)] + (λ_k − λ_j) p_k

result in another representation of the form (5.7), with p_1 ≤ p_2 ≤ ⋯ ≤ p_m. Then
we have p_1 ≤ p ≤ p_m, in particular, and another kind of modification is applicable
to (5.7):

    if λ_m ≥ λ_1 :  λ_1 p_1 + λ_m p_m  ⟹  λ_1 (p_1′ + p_m′) + (λ_m − λ_1) p_m,

    if λ_1 > λ_m :  λ_1 p_1 + λ_m p_m  ⟹  λ_m (p_1′ + p_m′) + (λ_1 − λ_m) p_1,

where p_1′ = p_1 + 1 and p_m′ = p_m − 1. Using these modifications we eventually arrive
at (5.7) such that p − 1 ≤ p_1 ≤ p ≤ p_m ≤ p + 1. Then p_1 = p − χ_X and p_m = p + χ_X
for some X ⊆ V, and hence p = (p_1 + 1) ∧ p_m ∈ D by (SBS[Z]) and (TRS[Z]). □

The convex hull of an L-convex set is a polyhedron described by some distance
function.

Proposition 5.3. For nonempty D ⊆ Z^V, define γ : V × V → Z ∪ {+∞} by

    γ(u, v) = sup{p(v) − p(u) | p ∈ D}   (u, v ∈ V).     (5.8)

(1) γ satisfies the triangle inequality (5.2); i.e., γ ∈ T[Z].
(2) D̄ = D(γ) if D is an L-convex set.

Proof. (1) γ(v1, v2) + γ(v2, v3) = sup{p(v2) − p(v1) | p ∈ D} + sup{p(v3) − p(v2) | p ∈ D} ≥
sup{p(v3) − p(v1) | p ∈ D} = γ(v1, v3).
(2) Obviously, D̄ ⊆ D(γ). By the integrality of D(γ) shown in Proposition 5.1
(4), the converse is also true if any q ∈ D(γ) ∩ Z^V belongs to D. For distinct u
and v, we have γ(u, v) ≥ q(v) − q(u), and, by the definition of γ and (TRS[Z]), there
exists p_uv ∈ D such that p_uv(u) = q(u) and p_uv(v) ≥ q(v). For p_u = ⋁_{v≠u} p_uv,
we have p_u(u) = q(u) and p_u(v) ≥ q(v) (∀v ∈ V), and also p_u ∈ D by (SBS[Z]).
Hence, for p = ⋀_{u∈V} p_u, we have p = q and also p ∈ D by (SBS[Z]). □

A sort of converse of the above proposition is true. Note that the triangle
inequality (5.2) is not assumed in the proposition below.

Proposition 5.4. For an integer-valued distance function γ, D = D(γ) ∩ Z^V is an
L-convex set provided that D(γ) is nonempty.

Proof. (TRS[Z]) is obvious. For (SBS[Z]) we can easily show that p(v) − p(u) ≤
γ(u, v) and q(v) − q(u) ≤ γ(u, v) imply (p ∨ q)(v) − (p ∨ q)(u) ≤ γ(u, v) and
(p ∧ q)(v) − (p ∧ q)(u) ≤ γ(u, v). □
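Proposition 5.4 lends itself to an experimental check on small instances: the
integer points of D(γ), normalized so that p(0) = 0 (D(γ) is invariant under
adding multiples of 1), are closed under componentwise max and min. A minimal
Python sketch with an ad hoc integer-valued distance function of our own choosing:

```python
from itertools import product

def in_D(p, gamma):
    """Admissible potential test: p(v) - p(u) <= gamma(u, v) for every arc."""
    return all(p[v] - p[u] <= g for (u, v), g in gamma.items())

# an integer-valued distance function on V = {0, 1, 2} (triangle inequality not assumed)
gamma = {(0, 1): 2, (1, 0): 1, (1, 2): 1, (2, 1): 3, (0, 2): 2, (2, 0): 2}
# integer points of D(gamma), normalized to p(0) = 0
pts = [(0,) + q for q in product(range(-4, 5), repeat=2) if in_D((0,) + q, gamma)]
assert pts                                   # D(gamma) is nonempty
# (SBS[Z]): the point set is closed under componentwise max and min
for p in pts:
    for q in pts:
        assert in_D(tuple(max(a, b) for a, b in zip(p, q)), gamma)
        assert in_D(tuple(min(a, b) for a, b in zip(p, q)), gamma)
```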

Propositions 5.3 and 5.4 together imply a one-to-one correspondence between
the family ℒ₀[Z] of L-convex sets and the family T[Z] of integer-valued distance
functions with the triangle inequality.

Theorem 5.5. A set D ⊆ Z^V is L-convex if and only if D = D(γ) ∩ Z^V for an
integer-valued distance function γ ∈ T[Z] satisfying the triangle inequality. More
specifically, the mappings Φ : ℒ₀[Z] → T[Z] and Ψ : T[Z] → ℒ₀[Z] defined by

    Φ(D) = γ with γ(u, v) = sup{p(v) − p(u) | p ∈ D},   Ψ(γ) = D(γ) ∩ Z^V

are inverse to each other, establishing a one-to-one correspondence between ℒ₀[Z]
and T[Z].

Proof. For D ∈ ℒ₀[Z] we have Φ(D) ∈ T[Z] and Ψ ∘ Φ(D) = D̄ ∩ Z^V = D
by Proposition 5.3 and Theorem 5.2. For γ ∈ T[Z] we have Ψ(γ) ∈ ℒ₀[Z] and
Φ ∘ Ψ(γ) = γ by Propositions 5.4 and 5.1. □

Note 5.6. An M-convex polyhedron is described by a submodular set function


(Theorem 4.15), and the correspondence is one-to-one:
M-convex polyhedra <—> submodular set functions.
An L-convex polyhedron is described by a distance function with the triangle in-
equality (Theorem 5.5), which gives another one-to-one correspondence:
L-convex polyhedra <—> distance functions with the triangle inequality.
These two one-to-one correspondences will be unified into a single conjugacy rela-
tionship between M-convex functions and L-convex functions in Chapter 8. •

5.4 L-Convex Sets as Discrete Convex Sets


We show a number of nice properties of L-convex sets that qualify them as well-
behaved discrete convex sets.
First we consider the intersection of two L-convex sets. Recall from (5.1) and
(5.4) the definitions of a graph G7 and a polyhedron 0(7) associated with a distance
function 7.

Theorem 5.7. Let D1, D2 ⊆ Z^V be L-convex sets.

(1) D̄1 ∩ D̄2 coincides with the convex hull of D1 ∩ D2.
On representing D_i = D(γ_i) ∩ Z^V with γ_i ∈ T[Z] (i = 1, 2) and defining γ12(u, v) =
min(γ1(u, v), γ2(u, v)), we have the following.
(2) D1 ∩ D2 = D(γ12) ∩ Z^V.
(3) D1 ∩ D2 ≠ ∅ ⟺ no negative cycle exists in graph G_γ12.

(4) D1 ∩ D2 is an L-convex set if it is nonempty.

Proof. (1), (2) It follows from D(γ1) ∩ D(γ2) = D(γ12) that D1 ∩ D2 = (D(γ1) ∩
Z^V) ∩ (D(γ2) ∩ Z^V) = (D(γ1) ∩ D(γ2)) ∩ Z^V = D(γ12) ∩ Z^V. Since D(γ12) is an
integral polyhedron, we obtain that the convex hull of D1 ∩ D2 equals D(γ12) =
D(γ1) ∩ D(γ2) = D̄1 ∩ D̄2.
(3) This is by (2) and Proposition 5.1 (1).
(4) This follows from (2) and Proposition 5.4. □
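Claim (3) suggests a simple computational test for disjointness of two L-convex
sets: form γ12 = min(γ1, γ2) and look for a negative cycle, e.g., by a
Bellman-Ford sweep. A minimal Python sketch on a two-element ground set (the
encodings and the instance are our own illustration):

```python
def has_negative_cycle(V, gamma):
    """Bellman-Ford with a virtual source: if relaxation is still possible
    after |V| full passes, G_gamma contains a negative directed cycle."""
    dist = {v: 0 for v in V}
    updated = False
    for _ in range(len(V)):
        updated = False
        for (u, v), g in gamma.items():
            if dist[u] + g < dist[v]:
                dist[v] = dist[u] + g
                updated = True
    return updated

V = [0, 1]
g1 = {(0, 1): 0, (1, 0): 0}    # describes D1 = {p : p(1) - p(0) = 0}
g2 = {(0, 1): 1, (1, 0): -1}   # describes D2 = {p : p(1) - p(0) = 1}
g12 = {a: min(g1[a], g2[a]) for a in g1}
assert has_negative_cycle(V, g12)      # D1 and D2 are disjoint (Theorem 5.7 (3))
assert not has_negative_cycle(V, g1)   # D1 by itself is nonempty
```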

The first claim (1) in the above theorem shows

    the convex hull of D1 ∩ D2 = D̄1 ∩ D̄2   (D1, D2 ∈ ℒ₀[Z]),     (5.9)

the property called convexity in intersection in section 3.3.
Convexity in Minkowski sum is also shared by L-convex sets.

Theorem 5.8. For L-convex sets D1, D2 ⊆ Z^V, we have D1 + D2 = (D̄1 + D̄2) ∩ Z^V.

Proof. The family ℒ₀[Z] meets the condition (3.53) in Proposition 3.16 and has the
property (a) there by (5.9). □

The following discrete separation theorem holds for two L-convex sets.

Theorem 5.9 (Discrete separation for L-convex sets). Let D1 and D2 (⊆ Z^V) be
L-convex sets. If they are disjoint (D1 ∩ D2 = ∅), there exists x* ∈ {−1, 0, 1}^V such
that

    max{⟨x*, p⟩ | p ∈ D1} < min{⟨x*, q⟩ | q ∈ D2}.     (5.10)
Proof. We use the notation in Theorem 5.7; in particular, D_i = D(γ_i) ∩ Z^V with
γ_i ∈ T[Z] (i = 1, 2). By Theorem 5.7 (3), there exists in the graph G_γ12 a negative
cycle with respect to the arc length γ12 = min(γ1, γ2). Let v0, v1, v2, …, v_{k−1} be the
sequence of vertices in a negative cycle with the minimum number of vertices,
where k ≥ 2. Since γ1 and γ2 satisfy the triangle inequality, k is even, and we
may assume γ1(v_{2i}, v_{2i+1}) ≤ γ2(v_{2i}, v_{2i+1}) and γ1(v_{2i+1}, v_{2i+2}) ≥ γ2(v_{2i+1}, v_{2i+2})
for 0 ≤ i ≤ k/2 − 1, where v_k = v0. Define x* ∈ {−1, 0, 1}^V by x*(v_{2i}) = 1,
x*(v_{2i+1}) = −1 (0 ≤ i ≤ k/2 − 1), and x*(v) = 0 for other v. It follows from LP
duality (Theorem 3.10 (2)) and the minimality of k that

and, therefore,

This shows (5.10). □
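The separation can be observed numerically on a small example: for the disjoint
L-convex sets D1 = {p : p(2) − p(1) = 0} and D2 = {p : p(2) − p(1) = 1}, the
vector x* = (−1, 1), whose entries sum to zero so that ⟨x*, p⟩ is invariant
under p ↦ p + α1, separates the two sets. A Python check restricted to a finite
box (the instance is our own illustration):

```python
from itertools import product

box = list(product(range(-2, 3), repeat=2))
D1 = [p for p in box if p[1] - p[0] == 0]     # L-convex: p(2) - p(1) = 0
D2 = [p for p in box if p[1] - p[0] == 1]     # L-convex: p(2) - p(1) = 1
xs = (-1, 1)                                  # a separating {0, +-1}-vector
# strict separation as in (5.10): max over D1 is below min over D2
assert max(sum(a * b for a, b in zip(xs, p)) for p in D1) < \
       min(sum(a * b for a, b in zip(xs, q)) for q in D2)
```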

The content of Theorem 5.9 consists of two claims. The first, explicit in the
statement, is that the separating vector x* is special in that it is a {0, ±1}-vector.
The second, less conspicuous and more subtle, is the implication (5.9), convexity in
intersection, since otherwise the inequality (5.10) would be impossible.
Finally, we show the integral convexity of an L-convex set by deriving an
expression of the convex hull of an L-convex set.
For a vector p ∈ R^V, let a_1 > a_2 > ⋯ > a_m be the distinct values of the
nonzero components of the vector a = p − ⌊p⌋, and define

    U_i = {v ∈ V | a(v) ≥ a_i}   (i = 1, …, m),

where m ≥ 0. Then we have

    p = Σ_{i=0}^{m−1} (a_i − a_{i+1}) (⌊p⌋ + χ_{U_i}) + a_m (⌊p⌋ + χ_{U_m})     (5.11)

with a_0 = 1 and U_0 = ∅. This is a representation of p as a convex combination
of ⌊p⌋ + χ_{U_i} (i = 0, 1, …, m), since a_i − a_{i+1} > 0 (i = 0, 1, …, m − 1), a_m > 0,
and Σ_{i=0}^{m−1}(a_i − a_{i+1}) + a_m = 1. Note that these points ⌊p⌋ + χ_{U_i} (i = 0, 1, …, m)
belong to the integral neighborhood N(p) of p defined by (3.58).
The convex hull of an L-convex set can be characterized with reference to the
expression (5.11).

Theorem 5.10. For an L-convex set D ⊆ Z^V, we have

    D̄ = {p ∈ R^V | ⌊p⌋ + χ_{U_i(p)} ∈ D (i = 0, 1, …, m)},     (5.12)

where U_i(p) denotes the set U_i of (5.11) determined by p. Hence, an L-convex set
is integrally convex.

Proof. The expression (5.11) shows the inclusion ⊇ in (5.12). To show the converse,
take p ∈ D̄ and put p_0 = ⌊p⌋, a = p − ⌊p⌋, and q_i = ⌊p⌋ + χ_{U_i(p)} (i = 0, 1, …, m).
In the representation D̄ = D(γ) with an integer-valued distance function γ (by
Theorem 5.5), we have

    p(v) − p(u) ≤ γ(u, v)   (u, v ∈ V).

Since |a(v) − a(u)| < 1, p_0(v) − p_0(u) ∈ Z, and γ(u, v) ∈ Z, we have p_0(v) − p_0(u) ≤
γ(u, v) for any u, v, and furthermore p_0(v) − p_0(u) + 1 ≤ γ(u, v) if a(v) > a(u).
Hence follows

    q_i(v) − q_i(u) ≤ γ(u, v)   (u, v ∈ V),

which shows q_i ∈ D(γ), and therefore, q_i ∈ D(γ) ∩ Z^V = D. Hence, we have ⊆ in
(5.12). Since q_i ∈ N(p) for i = 0, 1, …, m, p belongs to the convex hull of D ∩ N(p);
see (3.71). Thus D is integrally convex. □
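The representation (5.11) is easy to compute: sort the distinct fractional parts
in decreasing order and accumulate the sets U_i. The following Python sketch (our
own illustration) builds the convex combination and verifies that it reproduces p:

```python
import math

def l_decomposition(p):
    """Represent a real vector p as the convex combination (5.11) of the
    integral points floor(p) + chi_{U_i}."""
    fl = [math.floor(t) for t in p]
    a = [t - f for t, f in zip(p, fl)]                     # fractional parts
    vals = sorted({t for t in a if t > 0}, reverse=True)   # a_1 > ... > a_m
    pieces = [(1 - (vals[0] if vals else 0), tuple(fl))]   # coefficient of floor(p)
    for i, ai in enumerate(vals):
        Ui = [j for j in range(len(p)) if a[j] >= ai]      # U_i = {v : a(v) >= a_i}
        pt = tuple(f + (1 if j in Ui else 0) for j, f in enumerate(fl))
        nxt = vals[i + 1] if i + 1 < len(vals) else 0
        pieces.append((ai - nxt, pt))                      # coefficient a_i - a_{i+1}
    return pieces

p = (1.2, -0.5, 2.0)
pieces = l_decomposition(p)
assert abs(sum(c for c, _ in pieces) - 1) < 1e-9           # a convex combination
for j in range(3):                                         # it reproduces p
    assert abs(sum(c * q[j] for c, q in pieces) - p[j]) < 1e-9
```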

Note 5.11. The Minkowski sum of two L-convex sets, to be called an L2-convex
set, is not necessarily L-convex, though it is integrally convex (see Theorem 8.42).
An example of an L2-convex set that is not L-convex is given by

    D = {(0, 0, 0, 0), (1, 1, 0, 0), (0, 1, 1, 0), (1, 2, 1, 0)} + {α1 | α ∈ Z},

which is the Minkowski sum of the two L-convex sets D1 = {(0, 0, 0, 0), (1, 1, 0, 0)} +
{α1 | α ∈ Z} and D2 = {(0, 0, 0, 0), (0, 1, 1, 0)} + {α1 | α ∈ Z}; note that (SBS[Z])
fails for (0, 1, 1, 0) and (1, 1, 0, 0). •

5.5 L♮-Convex Sets


In section 5.1 we introduced the concept of L♮-convex sets as the restriction of L-
convex sets to an arbitrarily chosen coordinate plane. The concepts of L♮-convex sets
and L-convex sets are essentially equivalent, since an L-convex set is homogeneous
in the direction of 1 by (TRS[Z]). All the results for L-convex sets can be translated
for L♮-convex sets. Here we state some additional results.
The definition of an L♮-convex set by restriction may be stated more formally
as follows. Let 0 denote a new element not in V and put Ṽ = {0} ∪ V. A set
P ⊆ Z^V is an L♮-convex set if it can be represented as

    P = {p ∈ Z^V | (0, p) ∈ D}     (5.13)

for some L-convex set D ⊆ Z^Ṽ. It turns out that an L♮-convex set P can be
characterized by the property

(SBS♮[Z])  p, q ∈ P ⟹ (p − α1) ∨ q, p ∧ (q + α1) ∈ P  (∀α ∈ Z₊)

(see Note 5.12 for the proof). This condition for α = 0 agrees with (SBS[Z]).
Examples of L♮-convex sets are given in Fig. 5.1.
Whereas L♮-convex sets are conceptually equivalent to L-convex sets, the class
of L♮-convex sets is strictly larger than that of L-convex sets. This follows from the
implication [(SBS[Z]) and (TRS[Z])] ⟹ (SBS♮[Z]), as well as from the example of
an integer interval [a, b]_Z that is not L-convex but L♮-convex. We denote by ℒ♮₀[Z]
the set of L♮-convex sets.
For a set P ⊆ Z^V, (SBS♮[Z]) above is equivalent to either of the following
conditions:

    p, q ∈ P, supp⁺(p − q) ≠ ∅ ⟹ p − χ_X, q + χ_X ∈ P,
        where X = arg max_{v∈V} (p(v) − q(v));     (5.14)

    p, q ∈ P ⟹ ⌈(p + q)/2⌉, ⌊(p + q)/2⌋ ∈ P     (5.15)

Figure 5.1. L♮-convex sets.

Figure 5.2. Discrete midpoint convexity.

(see Note 5.12 for the proof). The property (5.15) is called discrete midpoint
convexity (see Fig. 5.2), where ⌈z⌉ and ⌊z⌋ denote, respectively, the integer vectors
obtained from z ∈ R^V by componentwise round-up and round-down to the nearest
integer. Thus, L♮-convex sets can be characterized by any one of the three equivalent
conditions (SBS♮[Z]), (5.14), and (5.15).
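Discrete midpoint convexity is the easiest of the three conditions to test by
brute force on a finite point set. A Python sketch (the example sets are our own
illustration): an integer box passes, while a diagonal set with a gap fails.

```python
import math
from itertools import product

def midpoint_convex(P):
    """Discrete midpoint convexity (5.15): for all p, q in P, both
    componentwise round-up and round-down of (p + q)/2 lie in P."""
    S = set(P)
    for p in S:
        for q in S:
            up = tuple(math.ceil((a + b) / 2) for a, b in zip(p, q))
            dn = tuple(math.floor((a + b) / 2) for a, b in zip(p, q))
            if up not in S or dn not in S:
                return False
    return True

box = list(product(range(3), repeat=2))       # the integer interval [0, 2]^2
assert midpoint_convex(box)
diag = [(0, 0), (1, 1), (2, 2), (0, 2)]       # (0,0) and (0,2) have midpoint (0,1)
assert not midpoint_convex(diag)
```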
The convex hull of an L♮-convex set (called an L♮-convex polyhedron) can be
represented as

    P(γ, γ̂, γ̌) = {p ∈ R^V | γ̌(v) ≤ p(v) ≤ γ̂(v) (v ∈ V),
                   p(v) − p(u) ≤ γ(u, v) (u, v ∈ V)}     (5.16)

with an integer-valued distance function γ : V × V → Z ∪ {+∞} and integer-valued
functions γ̂ : V → Z ∪ {+∞} and γ̌ : V → Z ∪ {−∞}. We may impose an additional
condition on (γ, γ̂, γ̌): the distance function γ̃ on Ṽ = V ∪ {0} defined by

    γ̃(u, v) = γ(u, v),  γ̃(0, v) = γ̂(v),  γ̃(v, 0) = −γ̌(v)   (u, v ∈ V)     (5.17)

(as well as by γ̃(v, v) = 0 (∀v ∈ Ṽ)) satisfies the triangle inequality. A set P ⊆ Z^V
satisfies (SBS♮[Z]) if and only if it can be represented as P = P(γ, γ̂, γ̌) ∩ Z^V.
The Minkowski sum of two L♮-convex sets is called an L♮₂-convex set. An L♮₂-
convex set is a restriction of an L2-convex set, and the class of L♮₂-convex sets is
strictly larger than that of L2-convex sets.

Note 5.12. We prove the equivalence among L♮-convexity (as induced from L-
convexity by restriction), (SBS♮[Z]), (5.14), and (5.15) for P ⊆ Z^V.
[L♮-convexity ⟺ (SBS♮[Z])]: By (5.13) and (TRS[Z]) we have

    D = {(p0, p) ∈ Z × Z^V | p − p0·1 ∈ P}.

(SBS[Z]) for D is equivalent to the following condition for P:

    p − p0·1 ∈ P, q − q0·1 ∈ P ⟹ (p ∨ q) − (p0 ∨ q0)1 ∈ P, (p ∧ q) − (p0 ∧ q0)1 ∈ P.

Assuming α = q0 − p0 ≥ 0, put p′ = p − p0·1 and q′ = q − q0·1. Then (p ∨ q) − (p0 ∨
q0)1 = (p′ − α1) ∨ q′ and (p ∧ q) − (p0 ∧ q0)1 = p′ ∧ (q′ + α1). Hence, the above
condition is equivalent to (SBS♮[Z]).
[(SBS♮[Z]) ⟹ (5.14)]: For α = max_{v∈V}{p(v) − q(v)} − 1, we have α ≥ 0, (p −
α1) ∨ q = q + χ_X, and p ∧ (q + α1) = p − χ_X. Then (SBS♮[Z]) implies (5.14).
[(5.14) ⟹ (5.15)]: Put p″ = ⌈(p+q)/2⌉ and q″ = ⌊(p+q)/2⌋, and define p′, q′ ∈ Z^V by

Note that |p′(v) − q′(v)| ≤ 1 (∀v ∈ V), supp⁺(p′ − q′) ⊆ supp⁺(p − q), and supp⁻(p′ −
q′) ⊆ supp⁻(p − q). Repeated applications of (5.14) to (p, q) yield p′, q′ ∈ P, and
an application of (5.14) to (p′, q′) gives p″, q″ ∈ P.
[(5.15) ⟹ (SBS♮[Z])]: For p, q ∈ P, define a sequence (q⁽⁰⁾, q⁽¹⁾, …) of integer
points as follows:

Here, note that q⁽ᵏ⁾ ∈ P (k = 0, 1, …). We see that

It follows that there exists some positive integer N such that q⁽ᵏ⁾ = q⁽ᴺ⁾ for any
integer k ≥ N. Because of (i)–(iii) such a q⁽ᴺ⁾ is equal to (p − 1) ∨ (p ∧ q), and hence
we have (p − 1) ∨ (p ∧ q) ∈ P. Replacing p with (p − 1) ∨ (p ∧ q) and repeating the
above argument, we also have (p − 2·1) ∨ (p ∧ q) ∈ P. Repeating this argument
(or more rigorously by induction), we have (p − α1) ∨ (p ∧ q) ∈ P for α ∈ Z₊. In
particular, we have p ∧ q ∈ P. By symmetry we also have (p ∨ q) ∧ (q + α1) ∈ P
for α ∈ Z₊ and, in particular, p ∨ q ∈ P. Now, replacing p with p ∨ q in the above
argument from the beginning, we have (p − α1) ∨ q ∈ P for α ∈ Z₊. By symmetry
we also have p ∧ (q + α1) ∈ P for α ∈ Z₊. •

5.6 L-Convex Polyhedra


L-convex polyhedra are defined in section 5.3 as the convex hulls of L-convex sets,
and as such they are necessarily integral polyhedra. The concept of L-convexity,
however, can also be defined for general (nonintegral) polyhedra.
A nonempty polyhedron D ⊆ R^V is defined to be an L-convex polyhedron if
it satisfies

(SBS[R])  p, q ∈ D ⟹ p ∨ q, p ∧ q ∈ D,
(TRS[R])  p ∈ D ⟹ p + α1 ∈ D  (∀α ∈ R).

By an integral L-convex polyhedron we mean an L-convex polyhedron D such that
D coincides with the convex hull of D ∩ Z^V. An integral polyhedron D is L-convex
if and only if D ∩ Z^V is an L-convex set. We denote by ℒ₀[R] the set of L-convex
polyhedra and by ℒ₀[Z|R] the set of integral L-convex polyhedra.
Let γ be a distance function and assume D(γ) ≠ ∅. Then D(γ) is an L-convex
polyhedron. If, in addition, γ is integer valued, D(γ) is an integral L-convex
polyhedron. The one-to-one correspondence of L-convex sets with distance functions
(Theorem 5.5) is generalized as follows:

    L-convex polyhedra ⟷ distance functions in T[R],
    integral L-convex polyhedra ⟷ distance functions in T[Z].

The restriction of an L-convex polyhedron to a coordinate plane is called an
L♮-convex polyhedron. A polyhedron P ⊆ R^V is L♮-convex if and only if

(SBS♮[R])  p, q ∈ P ⟹ (p − α1) ∨ q, p ∧ (q + α1) ∈ P  (∀α ∈ R₊).

We also have

    L♮-convex polyhedra ⟷ polyhedra P(γ, γ̂, γ̌) with γ̃ ∈ T[R],

where P(γ, γ̂, γ̌) is defined in (5.16) and γ̃ in (5.17) belongs to T[R]. We denote by
ℒ♮₀[R] and ℒ♮₀[Z|R] the sets of L♮-convex polyhedra and integral L♮-convex polyhedra,
respectively.
An L-convex cone means a cone that is an L-convex polyhedron. We have

    L-convex cones ⟷ distance functions γ ∈ T[R] with γ(u, v) ∈ {0, +∞},

as is proved in Note 8.10. An L-convex polyhedron is characterized as a polyhedron
such that the tangent cone at each point is an L-convex cone (by (a) ⟺ (b) in
Theorem 7.45).

Bibliographical Notes
The concept of L-convex sets was introduced by Murota [140] and that of L♮-convex
sets by Fujishige–Murota [68]. The polyhedron D(γ) associated with a distance
function is a well-studied object, appearing, e.g., in the dual of the transshipment
problem. In particular, (5.5) is known as the maximum-separation minimum-route
theorem (Theorem 21.1 of Iri [94]) or as the max tension min path theorem (sec-
tion 6C of Rockafellar [178]). Theorem 5.5 (the description of L-convex sets by
D(γ)) and Theorem 5.9 (separation of L-convex sets) are due to Murota [140].
Theorem 5.2 (hole-free property), Theorem 5.7 (convexity in intersection), and The-
orem 5.8 (convexity in Minkowski sum) are by Murota [141]. Integral convexity of
L-convex sets (Theorem 5.10) is due to Murota–Shioura [153]. The example in Note
5.11 is an adaptation of Example 3.11 of [153]. Nonintegral L-convex polyhedra in
section 5.6 are considered in Murota–Shioura [152].
Chapter 6

M-Convex Functions

M-convex functions form a class of well-behaved discrete convex functions. They are
defined in terms of an exchange axiom and are characterized as functions obtained
by piecing together M-convex sets in a consistent way or as collections of distance
functions with some consistency. Fundamental properties of M-convex functions
are established in this chapter, including the local optimality criterion for global
optimality, the proximity theorem for minimizers, integral convexity, and extensi-
bility to convex functions. Duality and conjugacy issues are treated in Chapter 8
and algorithms in Chapter 10.

6.1 M-Convex Functions and M♮-Convex Functions


We recall the definitions of M-convex functions and M♮-convex functions from sec-
tion 1.4.2.
A function f : Z^V → R ∪ {+∞} with dom f ≠ ∅ is said to be an M-convex
function if it satisfies the following exchange axiom:

(M-EXC[Z]) For x, y ∈ dom f and u ∈ supp⁺(x − y), there exists
v ∈ supp⁻(x − y) such that

    f(x) + f(y) ≥ f(x − χ_u + χ_v) + f(y + χ_u − χ_v).     (6.1)

Inequality (6.1) implicitly imposes the condition that x − χ_u + χ_v ∈ dom f and
y + χ_u − χ_v ∈ dom f. With the use of the notation

    Δf(x; v, u) = f(x − χ_u + χ_v) − f(x),     (6.2)

the exchange axiom (M-EXC[Z]) can be expressed alternatively as follows:

(M-EXC′[Z]) For x, y ∈ dom f,

    max{ min{ Δf(x; v, u) + Δf(y; u, v) | v ∈ supp⁻(x − y) } | u ∈ supp⁺(x − y) } ≤ 0,

where the maximum and the minimum over an empty set are −∞ and +∞, respec-
tively. We denote by ℳ[Z → R] the set of M-convex functions and by ℳ[Z → Z]
the set of integer-valued M-convex functions.

Proposition 6.1. The effective domain of an M-convex function is an M-convex
set. Therefore, it lies on a hyperplane {x ∈ R^V | x(V) = r} for some integer r.

Proof. It follows from (M-EXC[Z]) that B = dom f satisfies (B-EXC[Z]). Then
the latter half follows from Proposition 4.1. □

Since the effective domain of an M-convex function f lies on a hyperplane, we
may consider, instead of the function f in n = |V| variables, the projection f′ of
f along an arbitrarily chosen coordinate axis u0 ∈ V, where the projection f′ is a
function in n − 1 variables defined by

    f′(x′) = f(x0, x′)  with x0 = r − x′(V′),

with the notation V′ = V \ {u0} and (x0, x′) ∈ Z × Z^{V′}. A function derived from
an M-convex function by such a projection is called an M♮-convex function.
More formally, let 0 denote a new element not in V, and put Ṽ = {0} ∪ V. A
function f : Z^V → R ∪ {+∞} is called M♮-convex if the function f̃ : Z^Ṽ → R ∪ {+∞}
defined by

    f̃(x0, x) = f(x)  if x0 = −x(V);  f̃(x0, x) = +∞ otherwise     (6.4)

is an M-convex function. We denote by ℳ♮[Z → R] the set of M♮-convex functions
and by ℳ♮[Z → Z] the set of integer-valued M♮-convex functions.
To characterize M♮-convex functions we introduce another exchange axiom:

(M♮-EXC[Z]) For x, y ∈ dom f and u ∈ supp⁺(x − y),

    f(x) + f(y) ≥ min[ f(x − χ_u) + f(y + χ_u),
        min{ f(x − χ_u + χ_v) + f(y + χ_u − χ_v) | v ∈ supp⁻(x − y) } ].     (6.5)

An alternative form of (M♮-EXC[Z]) using the notation (6.2) is as follows:

(M♮-EXC′[Z]) For x, y ∈ dom f,

    max{ min{ Δf(x; v, u) + Δf(y; u, v) | v ∈ supp⁻(x − y) ∪ {0} } | u ∈ supp⁺(x − y) } ≤ 0,

where, by convention, χ₀ is the zero vector.

Theorem 6.2. For a function f : Z^V → R ∪ {+∞} with dom f ≠ ∅,

    f is M♮-convex  ⟺  f satisfies (M♮-EXC[Z]).

Proof. (M-EXC[Z]) for f̃ in (6.4) is translated into conditions on f as follows:

where S⁺ = supp⁺(x − y) and S⁻ = supp⁻(x − y). As is easily seen, these conditions
imply (6.5). The converse is shown in Note 6.6 in section 6.2. □

M♮-convex functions are conceptually equivalent to M-convex functions, but
the class of M♮-convex functions is larger than that of M-convex functions.

Theorem 6.3. An M-convex function is M♮-convex. Conversely, an M♮-convex
function is M-convex if and only if the effective domain is contained in {x ∈ Z^V |
x(V) = r} for some r ∈ Z.

Proof. The first half follows from the obvious implication (M-EXC[Z]) ⟹ (M♮-
EXC[Z]) and Theorem 6.2. The second half is from the equivalence of (M-EXC[Z])
and (M♮-EXC[Z]) under the condition on the effective domain. □

For ease of reference we summarize the relationship between M and M♮ as

    ℳ_n ⊊ ℳ♮_n ≃ ℳ_{n+1},

where ℳ_n and ℳ♮_n denote, respectively, the sets of M-convex functions and M♮-
convex functions in n variables, and the expression ℳ♮_n ≃ ℳ_{n+1} means a correspon-
dence of their elements (functions) up to a translation of the effective domain along
a coordinate axis, where (6.4) gives the correspondence under the normalization of
r = 0.
By the equivalence between M-convex functions and M♮-convex functions, all
theorems stated for M-convex functions can be rephrased for M♮-convex functions,
and vice versa. In this book we primarily work with M-convex functions, making
explicit statements for M♮-convex functions when appropriate.

6.2 Local Exchange Axiom


There are a number of axioms equivalent to (M-EXC[Z]). We consider here a local
exchange axiom:

(M-EXC_loc[Z]) For x, y ∈ dom f with ‖x − y‖₁ = 4, there exist u ∈
supp⁺(x − y) and v ∈ supp⁻(x − y) such that (6.1) holds true.

On expressing y = x − χ_{u1} − χ_{u2} + χ_{v1} + χ_{v2} with u1, u2, v1, v2 ∈ V and {u1, u2} ∩
{v1, v2} = ∅, we see that (M-EXC_loc[Z]) can be written as the conditions (6.7)–(6.9).
Theorem 6.4. If dom f is an M-convex set, then (M-EXC[Z]) ⟺ (M-EXC_loc[Z]).

Proof. It suffices to show that (M-EXC_loc[Z]) ⟹ (M-EXC[Z]). To prove this by
contradiction, we assume that there exists a pair (x, y) for which (M-EXC[Z]) fails.
That is, we assume the set of such pairs

    𝒟 = {(x, y) | x, y ∈ B, (6.1) fails for some u ∈ supp⁺(x − y)
         and every v ∈ supp⁻(x − y)}

is nonempty, where B = dom f. Take a pair (x, y) ∈ 𝒟 with minimum ‖x − y‖₁,
where ‖x − y‖₁ > 4 by (M-EXC_loc[Z]), and fix u* ∈ supp⁺(x − y) appearing in the
definition of 𝒟. For a fixed ε > 0, define p : V → R by

We use the notation

Claim 1:

(Proof of Claim 1) The equality (6.12) is obvious from the definition of p, and
(6.13) can be seen as follows. If x − χ_{u*} + χ_v ∈ B, (6.13) follows from Δf_p(x; v, u*) =
0 (by (6.12)) and

(by the definition of u*). If x − χ_{u*} + χ_v ∉ B, (6.13) follows from the fact that
Δf_p(y; u*, v) = ε or +∞ depending on whether y + χ_{u*} − χ_v ∈ B or not.
Claim 2: There exist u0 ∈ supp⁺(x − y) and v0 ∈ supp⁻(x − y) such that
y + χ_{u0} − χ_{v0} ∈ B, u* ∈ supp⁺(x − (y + χ_{u0} − χ_{v0})), and

(Proof of Claim 2) Put u0 = u* if x(u*) ≥ y(u*) + 2; otherwise, take any
u0 ∈ supp⁺(x − y) \ {u*}, which is possible by ‖x − y‖₁ > 4. By (B-EXC[Z]) there
exists v ∈ supp⁻(x − y) such that y + χ_{u0} − χ_v ∈ B. Let v0 ∈ supp⁻(x − y) be such
a v that minimizes Δf_p(y; u0, v). Then we have (6.14).
Claim 3: (x, y′) ∈ 𝒟 for y′ = y + χ_{u0} − χ_{v0}.
(Proof of Claim 3) It suffices to show

We may assume x − χ_{u*} + χ_v ∈ B, since otherwise Δf_p(x; v, u*) = +∞. Then
Δf_p(x; v, u*) = 0 by (6.12) and

by (M-EXC_loc[Z]), (6.13), and (6.14). This establishes Claim 3.

Since ‖x − y′‖₁ = ‖x − y‖₁ − 2, Claim 3 contradicts the choice of (x, y).
Therefore, 𝒟 must be an empty set. □

As a corollary to Theorem 6.4 we see that (M-EXC[Z]) is equivalent to a weak
exchange axiom:

(M-EXC_w[Z]) For distinct x, y ∈ dom f, there exist u ∈ supp⁺(x − y)
and v ∈ supp⁻(x − y) such that (6.1) holds true.

Note the difference between the two axioms: "∀u, ∃v" in (M-EXC[Z]) and "∃u, ∃v" in
(M-EXC_w[Z]).

Theorem 6.5. (M-EXC[Z]) ⟺ (M-EXC_w[Z]).

Proof. It suffices to show ⟸ when dom f ≠ ∅. (M-EXC_w[Z]) for f implies (B-
EXC_w[Z]) for dom f, and therefore, dom f is an M-convex set by Theorem 4.3.
Then Theorem 6.4 establishes the claim. □
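For a function given as a finite table, (M-EXC[Z]) can be verified by brute force
directly from the definition. The following Python sketch (the test cases are our
own) accepts the zero function on the bases of a uniform matroid and rejects a
domain that violates the exchange property:

```python
from itertools import product

def satisfies_M_EXC(dom, f):
    """Brute-force check of (M-EXC[Z]) on a finite effective domain dom,
    where f is a dict mapping points of dom to values."""
    for x in dom:
        for y in dom:
            for u in range(len(x)):
                if x[u] <= y[u]:
                    continue                    # u runs over supp+(x - y)
                ok = False
                for v in range(len(x)):
                    if x[v] >= y[v]:
                        continue                # v runs over supp-(x - y)
                    x2 = tuple(a - (i == u) + (i == v) for i, a in enumerate(x))
                    y2 = tuple(a + (i == u) - (i == v) for i, a in enumerate(y))
                    if x2 in dom and y2 in dom and f[x2] + f[y2] <= f[x] + f[y]:
                        ok = True               # inequality (6.1) holds for this v
                if not ok:
                    return False
    return True

# the zero function on the bases {x in {0,1}^3 : x(1)+x(2)+x(3) = 2}: M-convex
dom = {x for x in product((0, 1), repeat=3) if sum(x) == 2}
assert satisfies_M_EXC(dom, {x: 0 for x in dom})
# a domain with a "hole": exchanging between (2,0) and (0,2) needs (1,1)
dom2 = {(2, 0), (0, 2)}
assert not satisfies_M_EXC(dom2, {x: 0 for x in dom2})
```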

Note 6.6. The proof of Theorem 6.2 is completed here. It remains to show that
(M♮-EXC[Z]) for f implies (M-EXC[Z]) for f̃ defined by (6.4). First, dom f̃ is an
M-convex set, since (M♮-EXC[Z]) for f implies (B♮-EXC[Z]) for dom f and dom f
is the projection of dom f̃ (see section 4.7 and (4.35) in particular). By Theorem
6.4, (M-EXC[Z]) is equivalent to (M-EXC_loc[Z]), and therefore, it suffices to show
(6.7), (6.8), and (6.9) for x, y such that x − y = χ_{u1} + χ_{u2} − χ_{v1} − χ_{v2} with
{u1, u2, v1, v2} ⊆ V ∪ {0} and {u1, u2} ∩ {v1, v2} = ∅. Since (6.7) is obvious from
(6.5), it remains to show (6.8) for four cases:

where u1, u2, v1, v2 are all distinct. We deal with (a1) and (b4) below; the other
cases are left to the reader.

Case (a1): We abbreviate z = (x ∧ y) + α1·χ_{u1} + α2·χ_{u2} + β1·χ_{v1} + β2·χ_{v2}
to (α1 α2 β1 β2); for instance, x = (1100) and y = (0011). We are to derive (6.16).

By (6.5) for u = u1, we have (6.16) or (6.17).

By (6.5) for u = u2, we have (6.16) or (6.18).

Furthermore, by (6.5) we have (6.19) and (6.20).

Adding (6.17), (6.18), (6.19), and (6.20) yields an inequality that implies (6.16).

Case (b4): We abbreviate z = (x ∧ y) + α1·χ_{u1} + β1·χ_{v1} to (α1 β1); for instance,
x = (20) and y = (02). We are to show the corresponding inequality.
This, however, is derived from (6.5) for (02), (10), and u = v1. •

6.3 Examples
We have already seen M-convexity in network flows and in matroids (section 2.2,
section 2.4). In this section we see some other examples of M-convex functions, such
as linear functions, quadratic functions, and separable convex functions.
We start by recalling the following facts from Proposition 6.1.

Proposition 6.7.
(1) The effective domain of an M-convex function is an M-convex set.
(2) The effective domain of an M♮-convex function is an M♮-convex set.

Linear functions  A linear (or affine) function45

    f(x) = ⟨p, x⟩ + α,

with p ∈ Rⁿ and α ∈ R, is M-convex or M♮-convex according as dom f is M-convex
or M♮-convex. The inequalities (6.1) in (M-EXC[Z]) and (6.5) in (M♮-EXC[Z]) are
satisfied with equality.

Quadratic functions  A separable quadratic function

    f(x) = Σ_{i=1}^n a_i x(i)²,

with a_i ∈ R₊ (i = 1, …, n), is M-convex if dom f is an M-convex set. A quadratic
function of the form

    f(x) = (1/2) Σ_{i=1}^n a_i x(i)² + b Σ_{1≤i<j≤n} x(i) x(j)     (6.23)

with a_i ∈ R (i = 1, …, n) and b ∈ R, is M♮-convex if

    a_i ≥ b ≥ 0   (i = 1, …, n).     (6.24)

The proof is given in Note 6.10. Note that the cross-terms in (6.23) have a common
coefficient b.
The following proposition shows a necessary and sufficient condition for a
quadratic form to have M-convexity.
The following proposition shows a necessary and sufficient condition for a
quadratic form to have M-convexity.

Proposition 6.8. Let A = (a_ij) (i, j = 1, …, n) be a symmetric matrix and
f(x) = (1/2) xᵀAx for x ∈ dom f.
(1) Suppose dom f = {x ∈ Zⁿ | Σ_{i=1}^n x(i) = r} for some r ∈ Z. Then f is
M-convex if and only if46 the condition (6.25) holds.
(2) Suppose dom f = Zⁿ. Then f is M♮-convex if and only if the conditions
(6.26)–(6.28) hold.

45 In this section, V = {1, …, n}, f denotes a real-valued function in integer variables, i.e., f :
Zⁿ → R ∪ {+∞}, and x(i) is the ith component of an integer vector x = (x(i) | i = 1, …, n) ∈ Zⁿ.
46 As a special case of (6.25) and (6.26) with i = j and k = l, we have a_ii + a_kk ≥ 2a_ik. Similarly,
we have a_ii ≥ a_il as a special case of (6.27).

Proof. (1) The local exchange property (M-EXC_loc[Z]), which is equivalent to
M-convexity by Theorem 6.4, reduces to (6.25).
(2) A straightforward translation of (6.25) using (6.4) gives this result. □

A symmetric matrix A = (a_ij) with

    a_ij = w_max(i,j)   (i, j = 1, …, n)

for some w1 ≥ w2 ≥ ⋯ ≥ wn satisfies (6.25), and the associated quadratic form
f(x) = (1/2) xᵀAx defined on {x ∈ Zⁿ | Σ_{i=1}^n x(i) = r} for some r ∈ Z is M-convex
by Proposition 6.8 (1) above. For n = 4, such a matrix looks like

    [ w1  w2  w3  w4 ]
    [ w2  w2  w3  w4 ]
    [ w3  w3  w3  w4 ]
    [ w4  w4  w4  w4 ]

Separable convex functions  A separable convex function

    f(x) = Σ_{i=1}^n f_i(x(i)),

with univariate discrete convex functions f_i ∈ C[Z → R] (i = 1, …, n), is M-convex
or M♮-convex according to whether dom f is M-convex or M♮-convex; see (3.68) for
the notation C[Z → R]. As special cases we see the following.

Proposition 6.9. Let φ ∈ C[Z → R] be a univariate discrete convex function.

(1) φ is M♮-convex.
(2) The function f : Z² → R ∪ {+∞} defined by

    f(x(1), x(2)) = φ(x(1))  if x(1) + x(2) = 0;  f(x(1), x(2)) = +∞ otherwise

is M-convex.

Moreover, a quasi-separable convex function

    f(x) = f_0(x(V)) + Σ_{i=1}^n f_i(x(i)),     (6.32)

with f_i ∈ C[Z → R] (i = 0, 1, …, n), is M♮-convex. Note that the function f̃(x0, x)
in (6.4) associated with f(x) of (6.32) is a separable convex function. A special case
of (6.32) with

    f_0(t) = (b/2) t²,   f_i(s) = ((a_i − b)/2) s²   (i = 1, …, n)

coincides with the quadratic function in (6.23).



Laminar convex functions  By a laminar family we mean a nonempty family 𝒯
of subsets of {1, …, n} such that

    X, Y ∈ 𝒯 ⟹ X ∩ Y = ∅ or X ⊆ Y or X ⊇ Y.

For a laminar family 𝒯 and a family of univariate discrete convex functions f_X ∈
C[Z → R] indexed by X ∈ 𝒯, the function defined by

    f(x) = Σ_{X∈𝒯} f_X(x(X))   (x ∈ Zⁿ)     (6.34)

is an M♮-convex function (the proof is given in Note 6.11). We refer to such an f as a
laminar convex function. A special case of (6.34) with 𝒯 = {{1, …, n}, {1}, …, {n}}
coincides with the quasi-separable convex function in (6.32). Laminarity is indis-
pensable for M♮-convexity; e.g., f(x) = (x(1) + x(2))² + (x(2) + x(3))², with the
nonlaminar family {{1, 2}, {2, 3}}, is not an M♮-convex function.
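Both the laminar example and the nonlaminar counterexample can be checked
against (M♮-EXC[Z]) by brute force on the 0-1 box (restriction to an integer box
preserves M♮-convexity). A Python sketch (our own test harness), with the
candidate v = 0 handled separately as in (6.5):

```python
from itertools import product

def satisfies_Mnat_EXC(dom, f):
    """Brute-force check of (M-natural-EXC[Z]) on a finite domain dom
    (a list of tuples); f maps a tuple to a value."""
    dom = set(dom)
    for x in dom:
        for y in dom:
            for u in range(len(x)):
                if x[u] <= y[u]:
                    continue                    # u in supp+(x - y)
                cands = []
                xu = tuple(a - (i == u) for i, a in enumerate(x))
                yu = tuple(a + (i == u) for i, a in enumerate(y))
                if xu in dom and yu in dom:     # the v = 0 alternative of (6.5)
                    cands.append(f(xu) + f(yu))
                for v in range(len(x)):
                    if x[v] >= y[v]:
                        continue                # v in supp-(x - y)
                    xv = tuple(a - (i == u) + (i == v) for i, a in enumerate(x))
                    yv = tuple(a + (i == u) - (i == v) for i, a in enumerate(y))
                    if xv in dom and yv in dom:
                        cands.append(f(xv) + f(yv))
                if not cands or f(x) + f(y) < min(cands):
                    return False
    return True

box = list(product((0, 1), repeat=3))
laminar = lambda x: (x[0] + x[1] + x[2]) ** 2 + x[1] ** 2    # family {{1,2,3},{2}}
assert satisfies_Mnat_EXC(box, laminar)
nonlam = lambda x: (x[0] + x[1]) ** 2 + (x[1] + x[2]) ** 2   # family {{1,2},{2,3}}
assert not satisfies_Mnat_EXC(box, nonlam)
```

The nonlaminar function fails, for instance, at x = (1, 0, 1), y = (0, 1, 0), where
f(x) + f(y) = 4 while every exchange candidate sums to 6.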

Minimum-value functions  Given a vector (a_i | i = 1, …, n) ∈ Rⁿ, we denote
by μ(X) the minimum value of a_i with index i belonging to X ⊆ {1, …, n}. More
formally, choosing a* ∈ R ∪ {+∞} with a* ≥ max{a_i | i = 1, …, n}, we define a
set function μ by

    μ(X) = min{a_i | i ∈ X}  (X ≠ ∅),   μ(∅) = a*.

Identifying X with its characteristic vector χ_X yields a function f : Zⁿ → R ∪ {+∞},
which is given by

    f(x) = μ(X)  if x = χ_X for some X ⊆ {1, …, n};  f(x) = +∞ otherwise.

This is an M♮-convex function.
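A direct transcription of the minimum-value function into Python (the particular
numbers are our own illustration):

```python
def min_value_fn(a, a_star):
    """f(chi_X) = min of a_i over i in X, with the value a_star for X = empty;
    x is a 0/1 characteristic vector of X."""
    def f(x):
        vals = [ai for ai, xi in zip(a, x) if xi == 1]
        return min(vals) if vals else a_star
    return f

f = min_value_fn([3, 1, 2], a_star=5)
assert f((1, 0, 1)) == 2     # X = {1, 3}: min(3, 2) = 2
assert f((0, 0, 0)) == 5     # X = empty: the chosen a*
```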

Note 6.10. The M♮-convexity of the quadratic function in (6.23) under (6.24) is
proved here. Let i ∈ supp⁺(x − y). If x ≥ y, we have

    f(x) + f(y) ≥ f(x − χ_i) + f(y + χ_i).

Otherwise, there exists j ∈ supp⁻(x − y) such that

    f(x) + f(y) ≥ f(x − χ_i + χ_j) + f(y + χ_i − χ_j).

Hence f satisfies (M♮-EXC[Z]). It is also easy to verify the conditions (6.26) to
(6.28) in Proposition 6.8 (2). •

Note 6.11. An elementary proof of the M♮-convexity of a laminar convex function
in (6.34) is given here; an alternative simpler proof using network flows is in Note
9.31. For X ∈ 𝒯, denote by 𝒯(X) the family of all maximal elements of 𝒯 properly
contained in X and note that

    x(X) = Σ_{v ∈ X \ ∪𝒯(X)} x(v) + Σ_{Y ∈ 𝒯(X)} x(Y).     (6.37)

Take x, y ∈ dom f and u ∈ supp⁺(x − y) in (M♮-EXC[Z]). To prove (6.5) it suffices
to show the existence of v ∈ supp⁻(x − y) ∪ {0} such that (6.38) and (6.39) hold.

If x(X) > y(X) for all X ∈ 𝒯 containing u, we can take v = 0 to meet (6.38) and
(6.39). Otherwise, let X0 be the unique minimal element of 𝒯 such that u ∈ X0
and x(X0) ≤ y(X0). By the minimality of X0 and (6.37) we have
(i) ∃v ∈ X0 \ ∪_{Y∈𝒯(X0)} Y such that x(v) < y(v), or
(ii) ∃X1 ∈ 𝒯(X0) such that x(X1) < y(X1).
In case (i), this v is valid for (6.38) and (6.39). In case (ii), from (6.37) it follows that
(i′) ∃v ∈ X1 \ ∪_{Y∈𝒯(X1)} Y such that x(v) < y(v), or
(ii′) ∃X2 ∈ 𝒯(X1) such that x(X2) < y(X2).
Repeating this argument, we eventually arrive at case (i). •

Note 6.12. In section 2.1 we considered M♮-convex quadratic functions in real
variables, whereas we have investigated functions in integer variables in this section;
namely, Rⁿ → R in section 2.1 and Zⁿ → R here. Both are characterized by
exchange properties: the former by (M♮-EXC[R]) and the latter by (M♮-EXC[Z]).
One of the main results of section 2.1, Theorem 2.12, says that a positive-definite
symmetric matrix belongs to the class ℒ⁻¹ of (2.19) if and only if the associated
quadratic form satisfies (M♮-EXC[R]). This statement, however, does not carry
over to the discrete setting. For instance, there exists a 3 × 3 matrix A ∈ ℒ⁻¹ whose
associated quadratic form f(x) = (1/2) xᵀAx satisfies (M♮-EXC[R]) as a function
f : R³ → R in real variables but does not meet (M♮-EXC[Z]) when viewed as
f : Z³ → R in integer variables. This phenomenon seems to be indicative of the
subtleties inherent in discreteness. See Note 8.13 for the conjugate of M♮-convex
quadratic functions. •

6.4 Basic Operations


Basic operations on M-convex functions are presented here, whereas the most important operation, transformation by networks, is treated later in section 9.6.

First we introduce some operations on a function f : Z^V → R ∪ {+∞} in general. For a subset U ⊆ V, the restriction, the projection, and the aggregation of f to U are functions f_U : Z^U → R ∪ {+∞}, f^U : Z^U → R ∪ {±∞}, and f_{U*} : Z^U × Z → R ∪ {±∞} defined respectively by

where 0_{V\U} means the zero vector in Z^{V\U}. For a pair of functions f1, f2 : Z^V → R ∪ {+∞}, the integer infimal convolution is a function f1 □Z f2 : Z^V → R ∪ {±∞} defined by

Provided that f1 □Z f2 stays away from the value −∞, we have

where the right-hand side means the discrete Minkowski sum (3.52). The projection f^U can be represented as

which says that f^U coincides with the restriction to U of the integer infimal convolution of f with the indicator function δ of {x ∈ Z^V | x(v) = 0 (v ∈ U)}. We continue to use the notation f[−p] for p ∈ R^V defined in (3.69) and f_{[a,b]} for an integer interval [a, b] defined in (3.55).
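For intuition, when both functions have finite effective domains the integer infimal convolution can be evaluated by brute force. The following sketch (an illustrative helper of my own, not from the book) represents a function as a dict from integer tuples to values, with missing keys meaning +∞; it also makes visible that the effective domain of f1 □Z f2 is the discrete Minkowski sum of the two effective domains.

```python
def infimal_convolution(f1, f2):
    """Integer infimal convolution of f1 and f2 given as dicts
    {integer tuple: value}; absent keys represent +infinity.

    (f1 conv f2)(x) = min { f1(y) + f2(z) : y + z = x },
    so its effective domain is dom f1 + dom f2 (Minkowski sum)."""
    result = {}
    for y, v1 in f1.items():
        for z, v2 in f2.items():
            x = tuple(a + b for a, b in zip(y, z))
            v = v1 + v2
            if x not in result or v < result[x]:
                result[x] = v
    return result

# Tiny usage example with two functions on Z^2.
f1 = {(0, 0): 0.0, (1, 0): 1.0}
f2 = {(0, 0): 0.0, (0, 1): 2.0}
h = infimal_convolution(f1, f2)
```

Note that the support of `h` is exactly the sum of the two supports, as the discrete Minkowski-sum identity asserts.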
M-convex functions admit the following operations.

Theorem 6.13. Let f, f1, f2 ∈ M[Z → R] be M-convex functions.

(1) For λ ∈ R++, λf is M-convex.
(2) For a ∈ Z^V, f(a − x) and f(a + x) are M-convex in x.
(3) For p ∈ R^V, f[−p] is M-convex.
(4) For φu ∈ C[Z → R] (u ∈ V),

is M-convex provided its effective domain is nonempty.
(5) For a, b ∈ (Z ∪ {±∞})^V, the restriction f_{[a,b]} to the integer interval [a, b] is M-convex provided dom f_{[a,b]} ≠ ∅.
(6) For U ⊆ V, the restriction f_U is M-convex provided dom f_U ≠ ∅.
(7) For U ⊆ V, the aggregation f_{U*} is M-convex provided f_{U*} > −∞.
(8) The integer infimal convolution f = f1 □Z f2 is M-convex provided f > −∞.

Proof. (1), (2), (5), and (6) are obvious and (3) is a special case of (4).

(4) For x and y in the effective domain of the function in (4), which is contained in dom f, and u ∈ supp+(x − y), use (M-EXC[Z]) for f to obtain v ∈ supp−(x − y) satisfying (6.1) for f. Then

(7) We show this in Note 9.29 using transformation by a network.
(8) We show this in Note 9.30 using transformation by a network. □

As is easily seen, the converse of Theorem 6.13 (5) is also true.

Proposition 6.14. For a function f : Z^V → R ∪ {+∞}, we have

f is M-convex ⟺ f_{[a,b]} is M-convex for any a, b ∈ Z^V with dom f_{[a,b]} ≠ ∅.

The operations in Theorem 6.13 are also valid for M♮-convex functions. In addition, the projection is allowed for M♮-convex functions.

Theorem 6.15. Let f, f1, f2 ∈ M♮[Z → R] be M♮-convex functions.

(1) Operations (1)–(8) of Theorem 6.13 are valid for M♮-convex functions.
(2) For U ⊆ V, the projection f^U is M♮-convex provided f^U > −∞.

Proof. (2) This follows from (6.45) as well as the M♮-versions of (6) and (8) of Theorem 6.13. □

Note 6.16. The sum of two M-convex functions is not necessarily M-convex. For example, recall the M-convex sets B1 and B2 in Note 4.25 such that B1 ∩ B2 is not M-convex. Their indicator functions are M-convex, but their sum, which is the indicator function of B1 ∩ B2, is not M-convex. The sum of two M-convex functions is studied under the name M2-convex function in section 8.3. A similar argument applies to the sum of two M♮-convex functions. •

Note 6.17. The proviso f_{U*} > −∞ in Theorem 6.13 (7) can be weakened to f_{U*}(x0) > −∞ for some x0. A similar weakening holds for f > −∞ in Theorem 6.13 (8) and for f^U > −∞ in Theorem 6.15 (2). •

Note 6.18. For a function f : Z^V → R ∪ {+∞} and a positive integer α, we define a function f^α : Z^V → R ∪ {+∞} by

Figure 6.1. Scaling f^α for α = 2.

This is called a scaling in the domain or a domain scaling. If α = 2, for instance, this amounts to considering the function values only on vectors of even integers (see Fig. 6.1). Scaling is one of the common techniques used in designing efficient algorithms; this is particularly true of network flow algorithms (see Ahuja–Magnanti–Orlin [1]).
M-convexity (or M♮-convexity) is not preserved under scaling. For example, the indicator function f of the M-convex set

{c1(1,0,−1,0) + c2(1,0,0,−1) + c3(0,1,−1,0) + c4(0,1,0,−1) | ci ∈ {0,1}}

is M-convex, but f^α for α = 2 is not, because it is the indicator function of {(0,0,0,0), (1,1,−1,−1)}, which is not an M-convex set. Nevertheless, scaling an M-convex function is useful in designing efficient algorithms, as we will see in section 10.1 as well as in Theorem 6.37 (a proximity theorem for M-convex functions). It is worth mentioning that some subclasses of M-convex functions are closed under the scaling operation; linear, quadratic, separable, and laminar M-convex functions form such subclasses. See Proposition 10.41 for a type of scaling operation for M-convex functions. •
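The counterexample in this note can be replayed mechanically. The sketch below (enumeration-based and illustrative only; it tests a one-sided exchange property, which an M-convex set satisfies a fortiori) confirms that the original set admits exchanges while its α = 2 scaling does not.

```python
from itertools import product

def scaled_support(B, alpha):
    """Support of the scaled indicator f^alpha: points x with alpha*x in B."""
    return {tuple(c // alpha for c in x) for x in B
            if all(c % alpha == 0 for c in x)}

def has_exchange(B):
    """One-sided exchange: for all x, y in B and u with x(u) > y(u),
    some v with x(v) < y(v) gives x - chi_u + chi_v in B."""
    for x in B:
        for y in B:
            for u in range(len(x)):
                if x[u] <= y[u]:
                    continue
                if not any(x[v] < y[v]
                           and tuple(c - (i == u) + (i == v)
                                     for i, c in enumerate(x)) in B
                           for v in range(len(x))):
                    return False
    return True

# The M-convex set of this note: all 0/1 combinations of the four vectors.
gens = [(1, 0, -1, 0), (1, 0, 0, -1), (0, 1, -1, 0), (0, 1, 0, -1)]
B = {tuple(sum(c * g[i] for c, g in zip(cs, gens)) for i in range(4))
     for cs in product((0, 1), repeat=4)}
S = scaled_support(B, 2)  # indicator support of f^2: exchange fails here
```

Running this shows `S` is exactly {(0,0,0,0), (1,1,−1,−1)} and that no exchange step connects its two points, as claimed in the text.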

6.5 Supermodularity
M♮-convex functions are supermodular on the integer lattice.

Theorem 6.19. An M♮-convex function f ∈ M♮[Z → R] is supermodular; i.e.,

f(x) + f(y) ≤ f(x ∨ y) + f(x ∧ y)   (x, y ∈ Z^V).   (6.48)

Proof. For x ∈ Z^V and distinct u, v ∈ V, (M♮-EXC[Z]) applied to (x + χu + χv, x) yields

f(x + χu) + f(x + χv) ≤ f(x) + f(x + χu + χv).   (6.49)

We prove (6.48) by induction on ||x − y||1, where we may assume supp+(x − y) ≠ ∅ and supp−(x − y) ≠ ∅. By (6.49), (6.48) is true if ||x − y||1 ≤ 2. For x, y with

||x − y||1 ≥ 3, we may assume x ∨ y, x ∧ y ∈ dom f and also Σ{x(u) − y(u) | u ∈ supp+(x − y)} ≥ 2, by symmetry. Take u ∈ supp+(x − y) and put x′ = (x ∧ y) + χu and y′ = y + χu. Since dom f is an M♮-convex set, it includes the integer interval [x ∧ y, x ∨ y]_Z and, in particular, x′, y′ ∈ dom f. By ||x′ − y||1 ≤ ||x − y||1 − 1 and ||x − y′||1 = ||x − y||1 − 1, the induction hypothesis yields

which shows (6.48). □

Example 6.20. The converse of Theorem 6.19 is not true. For instance, the function f : Z³ → R ∪ {+∞} defined by dom f = {0,1}³ and f(1,1,1) = 2, f(1,1,0) = f(1,0,1) = 1, f(0,0,0) = f(1,0,0) = f(0,1,0) = f(0,0,1) = f(0,1,1) = 0 is supermodular and not M♮-convex; (M♮-EXC[Z]) fails for x = (0,1,1), y = (1,0,0), and u = 2. •
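This example is small enough to verify by enumeration. The sketch below (illustrative code of my own; indices are 0-based, so the element u = 2 of the text is index 1) checks supermodularity on {0,1}³ and the failure of the exchange axiom at the stated triple.

```python
from itertools import product

# Example 6.20: supermodular on {0,1}^3 yet not M-natural-convex.
f = {p: 0 for p in product((0, 1), repeat=3)}
f[1, 1, 1] = 2
f[1, 1, 0] = 1
f[1, 0, 1] = 1

def is_supermodular(f):
    """f(x) + f(y) <= f(x join y) + f(x meet y) for all stored points."""
    for x in f:
        for y in f:
            join = tuple(max(a, b) for a, b in zip(x, y))
            meet = tuple(min(a, b) for a, b in zip(x, y))
            if f[x] + f[y] > f[join] + f[meet]:
                return False
    return True

def exchange_holds(f, x, y, u):
    """(M-natural-EXC[Z]) at (x, y, u): some v in supp-(x - y) or v = 0
    gives f(x) + f(y) >= f(x - chi_u + chi_v) + f(y + chi_u - chi_v)."""
    def shift(z, i, d):
        z = list(z)
        if i is not None:
            z[i] += d
        return tuple(z)
    candidates = [None] + [i for i in range(len(x)) if x[i] < y[i]]
    for v in candidates:
        x2 = shift(shift(x, u, -1), v, +1)
        y2 = shift(shift(y, u, +1), v, -1)
        if x2 in f and y2 in f and f[x] + f[y] >= f[x2] + f[y2]:
            return True
    return False
```

Both checks come out as the example asserts: the function is supermodular, yet the exchange fails for x = (0,1,1), y = (1,0,0) at the second element.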

Note 6.21. We have repeatedly said that submodularity corresponds to convexity (in section 4.5, in particular). Theorem 6.19 says, however, that M♮-concave functions are submodular. Though somewhat annoying, this is not a contradiction, but provides a better understanding of the fact that motivated the analogy of submodularity to concavity in the 1970s. The fact is that, for a univariate concave function h, the set function ρ defined by ρ(X) = h(|X|) for X ⊆ V is submodular (Edmonds [44], Lovász [123]). A possible understanding based on Theorem 6.19 is as follows: ρ is an M♮-concave function, viewed as a function on {0, 1}^V, and, therefore, it is submodular. Recall also section 2.3.1 for the issue of convexity vs. submodularity. •

Note 6.22. The supermodular inequality (6.48) is void for an M-convex function f because x ∨ y, x ∧ y ∈ dom f occurs only when x = y. For an M-convex function f, the property corresponding to (6.49) is expressed as

where u, v, w are distinct elements of V and x ∈ Z^V. •

6.6 Descent Directions

One of the most conspicuous features of an M-convex function f is that it has a prescribed set of possible descent directions in the sense that

f(y) < f(x) ⟹ f(x − χu + χv) < f(x) for some u ∈ supp+(x − y) and v ∈ supp−(x − y).   (6.50)

This is an exemplar of what we understand as discreteness in direction.



Proposition 6.23. An M-convex function f ∈ M[Z → R] satisfies (6.50).

Proof. By (M-EXC[Z]) there exist u1 ∈ supp+(x − y) and v1 ∈ supp−(x − y) such that

where y2 = y + χu1 − χv1. By (M-EXC[Z]) applied to (x, y2), there exist u2 ∈ supp+(x − y2) and v2 ∈ supp−(x − y2) such that

where y3 = y2 + χu2 − χv2 = y + χu1 + χu2 − χv1 − χv2. Repeating this m = ||x − y||1/2 times, we obtain (ui, vi) (i = 1, ..., m) such that y = x − Σ_{i=1}^m (χui − χvi) and

Therefore, f(x − χui + χvi) − f(x) < 0 for some i. □

The property (6.50) is essential for M-convexity. For an M-convex function f and any p ∈ R^V, f[p] is again M-convex, and, therefore, f satisfies the following property:

(M-SI[Z]) For p ∈ R^V and x, y ∈ dom f with f[p](x) > f[p](y),

As the M♮-version we consider the following:

(M♮-SI[Z]) For p ∈ R^V and x, y ∈ dom f with f[p](x) > f[p](y),

where χ0 = 0, as usual.

Theorem 6.24. Let f : Z^V → R ∪ {+∞} be a function with dom f ≠ ∅.

(1) f is an M-convex function ⟺ f satisfies (M-SI[Z]).
(2) f is an M♮-convex function ⟺ f satisfies (M♮-SI[Z]).

Proof. It suffices to prove (1). The implication ⟹ is immediate from Theorem 6.13 (3) and Proposition 6.23. The converse ⟸ follows from Claims 1 and 2 below by Theorem 6.4.
Claim 1: B = dom f is an M-convex set.
(Proof of Claim 1) For x, y ∈ B and u ∈ supp+(x − y), take a sufficiently large M > 0 and define p : V → R by

Then f[p](x) > f[p](y), and by (M-SI[Z]) there exist w ∈ supp+(x − y) and v ∈ supp−(x − y) such that f[p](x) − f[p](x − χw + χv) = f(x) − f(x − χw + χv) + p(w) − M > 0. This is possible only if w = u, which shows (B-EXC−[Z]) for B. Then B is an M-convex set by Theorem 4.3.
Claim 2: f satisfies the local exchange axiom (M-EXCloc[Z]).
(Proof of Claim 2) Take x, y ∈ B with ||x − y||1 = 4 and put y = x − χu1 − χu2 + χv1 + χv2 with u1, u2, v1, v2 ∈ V and {u1, u2} ∩ {v1, v2} = ∅. In the following we assume u1 ≠ u2 and v1 ≠ v2 (the other cases can be treated similarly). Consider a bipartite graph G = (V+, V−; E) with vertex bipartition V+ = {u1, u2}, V− = {v1, v2} and arc set E = {(ui, vj) | Δf(x; vj, ui) < +∞ (i, j = 1, 2)}. The graph G has a perfect matching as a consequence of (B-EXC[Z]) for B. We think of Δf(x; vj, ui) as the weight of arc (ui, vj) and apply Proposition 3.14 to obtain p : V → R such that

and p(u1) + p(u2) − p(v1) − p(v2) is equal to the right-hand side of (6.11) (and p(v) = 0 for v ∈ V \ {u1, u2, v1, v2}). Failure of inequality (6.11) would imply

a contradiction to (M-SI[Z]). □

The proof of Proposition 6.23 shows the following.

Proposition 6.25. For an M-convex function f ∈ M[Z → R] and x, y ∈ dom f, we have

where

6.7 Minimizers
Global optimality for an M-convex function is characterized by local optimality.

Theorem 6.26 (M-optimality criterion).

(1) For an M-convex function f ∈ M[Z → R] and x ∈ dom f, we have

f(x) ≤ f(y) (∀ y ∈ Z^V) ⟺ f(x) ≤ f(x − χu + χv) (∀ u, v ∈ V).

(2) For an M♮-convex function f ∈ M♮[Z → R] and x ∈ dom f, we have

f(x) ≤ f(y) (∀ y ∈ Z^V) ⟺ f(x) ≤ f(x − χu + χv) (∀ u, v ∈ V ∪ {0}),

where χ0 = 0.


Figure 6.2. Minimum spanning tree problem.

Proof. It suffices to prove ⟸ in (1), but this follows from Proposition 6.23. □

Example 6.27. The minimum spanning tree problem serves as a canonical example to illustrate the M-optimality criterion. Let G = (U, E) be a graph with vertex set U and arc set E. A set T of arcs is called a spanning tree if it forms a connected subgraph that contains no circuit and covers all the vertices. The minimum spanning tree problem is to find a spanning tree T that has the minimum weight with respect to a given weight w : E → R, where the weight of T is defined as Σ_{e∈T} w(e). It is well known that a spanning tree T has the minimum weight if and only if w(e) ≤ w(e′) for any e ∈ T and e′ ∈ E \ T such that T − e + e′ is a spanning tree (see Fig. 6.2). This well-known optimality criterion is a special case of Theorem 6.26 (1) applied to an M-convex function f : Z^E → R ∪ {+∞} defined by

Note that, for a spanning tree T and arcs e ∈ T and e′ ∈ E \ T, we have f(χT) ≤ f(χT − χe + χe′) if and only if (i) w(e) ≤ w(e′) or (ii) T − e + e′ is not a spanning tree. In this connection it is noted that T − e + e′ is a spanning tree if and only if e belongs to the unique circuit contained in T ∪ {e′}, called the fundamental circuit with respect to (T, e′). •
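The exchange criterion of this example can be tested directly on a small graph. The sketch below (illustrative code of my own; vertices are 0..n−1 and edges are pairs) declares a spanning tree minimum exactly when no single swap T − e + e′ yields a lighter spanning tree, which is the criterion quoted above.

```python
def is_spanning_tree(edges, n):
    """Check that `edges` forms a spanning tree on vertices 0..n-1
    (acyclic via union-find, and n-1 edges, hence connected)."""
    parent = list(range(n))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            return False  # adding (a, b) would close a circuit
        parent[ra] = rb
    return len(edges) == n - 1

def is_minimum_by_exchange(tree, all_edges, w, n):
    """Local criterion of Example 6.27: T is minimum iff no swap
    T - e + e' gives a spanning tree with w(e') < w(e)."""
    for e in tree:
        for e2 in all_edges:
            if e2 in tree:
                continue
            swapped = [x for x in tree if x != e] + [e2]
            if is_spanning_tree(swapped, n) and w[e2] < w[e]:
                return False
    return True

# Toy graph: a triangle on vertices 0, 1, 2 with weights 1, 2, 3.
edges = [(0, 1), (1, 2), (0, 2)]
w = {(0, 1): 1, (1, 2): 2, (0, 2): 3}
good = is_minimum_by_exchange([(0, 1), (1, 2)], edges, w, 3)
bad = is_minimum_by_exchange([(0, 1), (0, 2)], edges, w, 3)
```

On the triangle, the tree of weight 3 passes the swap test while the tree of weight 4 fails it via the swap (0,2) → (1,2), matching the local criterion.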

Theorem 6.26 above shows how to verify the optimality of a given point with
O(n 2 ) function evaluations. The next theorem suggests how to find a minimizer.
Stating that a given point can be easily separated from some minimizer, it serves
as a basis of the domain reduction algorithm for M-convex function minimization,
to be explained in section 10.1.3.
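Theorem 6.26 already yields a naive minimization scheme: repeatedly move to a better point of the form x − χu + χv and stop when no such move exists, since a point admitting no improving exchange is a global minimizer. A brute-force sketch for a function stored on a finite set of points (an illustrative assumption of mine; the refined algorithms appear in section 10.1):

```python
def descent_minimize(f, x):
    """Minimize an M-convex function given as a dict {integer tuple: value},
    starting from x in dom f. By Theorem 6.26 (1), a point with no
    improving exchange x - chi_u + chi_v is a global minimizer."""
    n = len(x)
    while True:
        best_val, best = f[x], None
        for u in range(n):
            for v in range(n):
                if u == v:
                    continue
                y = list(x)
                y[u] -= 1
                y[v] += 1
                y = tuple(y)
                if y in f and f[y] < best_val:
                    best_val, best = f[y], y
        if best is None:
            return x  # locally, hence globally, optimal
        x = best

# Separable convex function on {x in Z^3 : x(1)+x(2)+x(3) = 0, |x(i)| <= 2},
# an M-convex function; its unique minimizer is (1, 0, -1).
f = {}
for a in range(-2, 3):
    for b in range(-2, 3):
        c = -a - b
        if -2 <= c <= 2:
            f[a, b, c] = (a - 1) ** 2 + b ** 2 + (c + 1) ** 2
```

Each iteration costs O(n²) function evaluations, in line with the verification cost mentioned above; for a non-M-convex function the loop would only certify local optimality.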

Theorem 6.28 (M-minimizer cut). Let f : Z^V → R ∪ {+∞} be an M-convex function with arg min f ≠ ∅.
(1) For x ∈ dom f and v ∈ V, let u ∈ V be such that

Then there exists x* ∈ arg min f with

(2) For x ∈ dom f and u ∈ V, let v ∈ V be such that

Then there exists x* ∈ arg min f with

(3) For x ∈ dom f \ arg min f, let u, v ∈ V be such that

Then there exists x* ∈ arg min f with

Proof. (1) Put x′ = x − χu + χv. Assume, to the contrary, that there is no x* ∈ arg min f with x*(u) ≤ x′(u). Let x* be an element of arg min f with x*(u) being minimum. Then we have x*(u) > x′(u). By applying (M-EXC[Z]) to x*, x′, and u we obtain some w ∈ supp−(x* − x′) such that if Δf(x*; w, u) > 0 then Δf(x′; u, w) < 0. Since Δf(x*; w, u) > 0 by the choice of x*, we have f(x′) > f(x′ + χu − χw) = f(x − χw + χv), a contradiction to the property of u.
(2) The proof is similar to that for (1).
(3) Put x′ = x − χu + χv (≠ x). By (1) there exists x* ∈ arg min f such that x*(u) ≤ x′(u); we assume that x* maximizes x*(v) among all such vectors. If x*(v) ≥ x′(v) is not satisfied, (M-EXC[Z]) applies to x′, x*, and v to yield some w ∈ supp−(x′ − x*) satisfying (a) Δf(x′; w, v) < 0, (b) Δf(x*; v, w) < 0, or (c) Δf(x′; w, v) = Δf(x*; v, w) = 0. We have Δf(x′; w, v) ≥ 0 by x′ − χv + χw = x − χu + χw and the choice of u and v. We also have Δf(x*; v, w) ≥ 0 by x* ∈ arg min f. Therefore, we have (c), which implies x* + χv − χw ∈ arg min f, a contradiction to the choice of x*. □

The minimizers of an M-convex function form an M-convex set, a property that is essential for a function to be M-convex.

Proposition 6.29. For an M-convex function f ∈ M[Z → R], arg min f is an M-convex set if it is not empty.

Proof. For x, y ∈ arg min f, we have x − χu + χv, y + χu − χv ∈ arg min f in (6.1). This shows that arg min f satisfies (B-EXC[Z]). □

The following theorem reveals that M-convex functions are characterized as functions obtained by piecing together M-convex sets in a consistent way. This shows how the concept of M-convex functions can be defined from that of M-convex sets.

Theorem 6.30. Let f : Z^V → R ∪ {+∞} be a function with a bounded nonempty effective domain.

(1) f is M-convex ⟺ arg min f[−p] is an M-convex set for each p ∈ R^V.
(2) f is M♮-convex ⟺ arg min f[−p] is an M♮-convex set for each p ∈ R^V.

Proof. It suffices to prove (1). The implication ⟹ is immediate from Theorem 6.13 (3) and Proposition 6.29. For ⟸, it suffices, by Theorem 6.4, to show that B = dom f is an M-convex set and f satisfies the local exchange axiom (M-EXCloc[Z]).
Claim 1: B is an M-convex set.
(Proof of Claim 1) Put Bp = arg min f[−p] for each p. Then we have B̄ = ∪p B̄p for the convex hulls B̄ of B and B̄p of Bp. For x, y ∈ B, there exists p such that y ∈ Bp and z = tx + (1 − t)y ∈ B̄p for some t > 0. It follows from (B-EXC+[R]) of B̄p that, for u ∈ supp+(x − y) = supp+(z − y), there exists v ∈ supp−(z − y) = supp−(x − y) such that y + α(χu − χv) ∈ B̄p ⊆ B̄ for all sufficiently small α > 0. This shows (B-EXC+[R]) for B̄. Therefore, B is an M-convex set.
For (M-EXCloc[Z]), take x, y ∈ B with ||x − y||1 = 4. Let f̄ : R^V → R ∪ {+∞} be the convex closure of f, where it is noted that f is not assumed to be convex extensible. Let p ∈ R^V be a subgradient of f̄ at c = (x + y)/2 ∈ R^V. We have c ∈ arg min f̄[−p] = B̄p, where B̄p is an integral M-convex polyhedron. Hence, the intersection of B̄p with the interval I = [x ∧ y, x ∨ y]_R is an integral M-convex polyhedron, in which c is contained. This means that c can be represented as a convex combination of some integral vectors, say, z1, ..., zm ∈ (I ∩ B̄p) ∩ Z^V = I ∩ Bp:

where Σ_{k=1}^m λk = 1 and λk > 0 (k = 1, ..., m).
Since ||x − y||1 = 4, we have y = x − χv1 − χv2 + χv3 + χv4 for some v1, v2, v3, v4 ∈ V with {v1, v2} ∩ {v3, v4} = ∅. In the following we assume that v1, v2, v3, and v4 are all distinct (the other cases can be treated similarly). Noting that any element z of I ∩ Bp can be represented as z = (x ∧ y) + χvi + χvj (i ≠ j), we consider an undirected graph G = (V0, E0) with vertex set V0 = {v1, v2, v3, v4} and edge set E0 = {{vi, vj} | zk = (x ∧ y) + χvi + χvj for some k = 1, ..., m}.
Claim 2: G has a perfect matching (of size 2).
(Proof of Claim 2) For each i (1 ≤ i ≤ 4), we have c(vi) − (x ∧ y)(vi) = 1/2, whereas zk(vi) − (x ∧ y)(vi) ∈ {0, 1} for all k in (6.58). Hence, for each i, there exist k1 and k0 such that

Translating this into G, we see that for each vertex vi there is an edge that covers (is incident to) vi and also there is another edge that avoids (is not incident to) vi. This condition implies the existence of a perfect matching in G.

Finally we derive (M-EXCloc[Z]) from Claim 2. We divide into two cases. (i) If {{v1, v2}, {v3, v4}} ⊆ E0, both x and y appear among the zk's, and hence x, y ∈ Bp. By (B-EXC[Z]) for Bp, we have x − χvi + χvj ∈ Bp and y + χvi − χvj ∈ Bp for some i ∈ {1, 2} and j ∈ {3, 4}. Hence,

which shows (6.11) with equality. (ii) If {{v1, v2}, {v3, v4}} ⊄ E0, it follows from Claim 2 that {{v1, vi}, {v2, vj}} ⊆ E0 for some i, j with {i, j} = {3, 4}. Then

both belong to Bp; i.e.,

Hence,

which establishes (6.11). □

Note 6.31. The boundedness assumption on dom f in Theorem 6.30 is not substantially restrictive, since we know from Proposition 6.14 that f is M-convex if and only if its restriction f_{[a,b]} to every bounded integer interval [a, b] is M-convex (as long as dom f_{[a,b]} ≠ ∅). On the other hand, the boundedness assumption seems inevitable. The function

in x = (x(1), x(2)) ∈ Z² is not M-convex, but, for each p ∈ R^V, arg min f[−p] is equal to {0} (an M-convex set) if it is not empty. •

6.8 Gross Substitutes Property

In the previous section we saw that the minimizers of f[−p] for an M-convex function f form an M-convex set for each fixed p ∈ R^V. We investigate here how the minimizers of f[p] change with the variation of p. The term gross substitutes stems from an economic interpretation, where p represents the price vector; some background in mathematical economics will be given in section 11.3.
We first observe a general phenomenon, independent of M-convexity, in the variation of minimizers. Let f : Z^V → R ∪ {+∞} be any function and assume x ∈ arg min f[p] and y ∈ arg min f[q] for p, q ∈ R^V. It follows from f[p](y) ≥ f[p](x) and f[q](x) ≥ f[q](y) that

A particular case of this inequality with q = p + αχu for u ∈ V and α > 0 yields y(u) ≤ x(u). Namely, we have

This is a well-known phenomenon valid for any function f, a kind of monotonicity in the variation of minimizers. Note that nothing is claimed here about the other components x(v) and y(v) with v ≠ u.
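This monotonicity is easy to watch numerically. The sketch below is illustrative only and adopts the convention f[p](x) = f(x) + ⟨p, x⟩, so raising p(u) penalizes large x(u); the u-component of the minimizer then cannot increase, while the other components are unconstrained, exactly as the text notes.

```python
def argmin_priced(f, p):
    """A minimizer of f[p](x) = f(x) + <p, x> over the stored points."""
    return min(f, key=lambda x: f[x] + sum(pi * xi for pi, xi in zip(p, x)))

# Any function exhibits this monotonicity; take a toy quadratic on a grid.
f = {(a, b): (a - 2) ** 2 + (b - 1) ** 2 for a in range(4) for b in range(4)}
x = argmin_priced(f, (0.0, 0.0))   # minimizer before the price change
y = argmin_priced(f, (1.5, 0.0))   # price of component u = 1 raised
```

Here the minimizer moves from (2, 1) to (1, 1): its first component drops when the first price rises, matching the inequality y(u) ≤ x(u).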
The gross substitutes property that we consider in this section is concerned with the variation of the other components. In contrast to (6.59) we introduce a condition on f:

Obviously, this condition is equivalent to the following:

(M-GS[Z]) If x ∈ arg min f[p], p ≤ q, and arg min f[q] ≠ ∅, there exists y ∈ arg min f[q] such that y(v) ≥ x(v) for all v ∈ V with p(v) = q(v).

It should be clear that the inequality p ≤ q above means p(v) ≤ q(v) (∀ v ∈ V).

Proposition 6.32. An M-convex function f ∈ M[Z → R] satisfies (M-GS[Z]).

Proof. For x ∈ arg min f[p] and p ≤ q, let y be an element of arg min f[q] with ||y − x||1 minimum. Suppose, to the contrary, that p(u) = q(u) and x(u) > y(u) for some u ∈ V. By (M-EXC[Z]) there exists v ∈ supp−(x − y) such that

By x ∈ arg min f[p] and y ∈ arg min f[q] we have

and hence

Adding (6.61) and (6.63) yields

This shows that (6.61), (6.62), and (6.63) are satisfied with equality. In particular, we have y + χu − χv ∈ arg min f[q], a contradiction to our choice of y. □

For a function f, let f̃ be the function given by (6.4). It is easy to see that f̃ satisfies (M-GS[Z]) if and only if f satisfies the following:

(M♮-GS[Z]) If x ∈ arg min f[p − p0·1], p ≤ q, p0 ≤ q0, and arg min f[q − q0·1] ≠ ∅, there exists y ∈ arg min f[q − q0·1] such that
(i) y(v) ≥ x(v) for every v ∈ V with p(v) = q(v), and
(ii) y(V) ≤ x(V) if p0 = q0,
where p, q ∈ R^V and p0, q0 ∈ R. Note that (M♮-GS[Z]) is equivalent to the pair of (6.64) and (6.65) below:

Proposition 6.33. An M♮-convex function f ∈ M♮[Z → R] satisfies (M♮-GS[Z]).

Proof. This follows from Proposition 6.32 applied to f̃ in (6.4). □

Note that (M♮-GS[Z]) ⟹ (M-GS[Z]) as well as M[Z → R] ⊆ M♮[Z → R]. Hence Proposition 6.32 is contained in Proposition 6.33 as a special case.
The properties (M-GS[Z]) and (M♮-GS[Z]) characterize M-convex and M♮-convex functions, respectively.

Theorem 6.34. Let f : Z^V → R ∪ {+∞} be a function that is convex extensible⁴⁷ and has a bounded nonempty effective domain.

(1) f is M-convex ⟺ f satisfies (M-GS[Z]).
(2) f is M♮-convex ⟺ f satisfies (M♮-GS[Z]).

(⁴⁷ It will be shown in Theorem 6.42 that M♮-convex functions are convex extensible.)

Proof. The implications ⟹ in (1) and (2) have been shown in Propositions 6.32 and 6.33. We give a proof of ⟸ for (1) by using Theorem 6.30. It suffices to show that B = arg min f is an M-convex set, since (M-GS[Z]) for f implies this for f[−p] for any p ∈ R^V. Since B = B̄ ∩ Z^V by the convex extensibility of f (see Proposition 3.18), this is further reduced to showing that every edge of the polyhedron B̄ is parallel to χu − χv for some u, v ∈ V (see (4.43)). Let E be an edge of B̄. By B = arg min f we have E ∩ Z^V = arg min f[p] for some p ∈ R^V. For two distinct integer points x, y on E, neither supp+(x − y) nor supp−(x − y) is empty by B ⊆ {z ∈ Z^V | z(V) = r}. By (6.60) with u ∈ supp+(x − y) and sufficiently small α > 0, there exists y′ ∈ arg min f[q] such that y′(v) ≥ x(v) (∀ v ≠ u), where q = p + αχu. Note that x ≠ y′ since f[q](x) > f[q](y) ≥ f[q](y′). Since α > 0 is sufficiently small, we have y′ ∈ arg min f[p] from y′ ∈ arg min f[q]. This means that y′ ∈ E and that x − y′ is a scalar multiple of x − y. In particular, supp+(x − y) = supp+(x − y′) = {u}. Similarly, supp−(x − y) = {v} for some v. Since x(V) = y(V), this means that x − y is a scalar multiple of χu − χv. □

A function f : Z^V → R ∪ {+∞} is said to have the stepwise gross substitutes property (SWGS) if it satisfies the following:

(M♮-SWGS[Z]) For x ∈ arg min f[p], p ∈ R^V, and u ∈ V, at least one of (i) and (ii) holds true:
(i) x ∈ arg min f[p + αχu] for any α > 0,
(ii) there exist α > 0 and y ∈ arg min f[p + αχu] such that y(u) = x(u) − 1 and y(v) ≥ x(v) for all v ∈ V \ {u}.

This property also characterizes M♮-convex functions.

Proposition 6.35. An M♮-convex function f ∈ M♮[Z → R] satisfies (M♮-SWGS[Z]).

Proof. We may assume p = 0. Suppose that (i) in (M♮-SWGS[Z]) fails, and let α* be the maximum value of α such that x ∈ arg min f[αχu]. By the M-optimality criterion (Theorem 6.26 (2)), x ∈ arg min f[αχu] if and only if

which can be rewritten as

Noting also that Δf(x; t, s) ≥ 0 (∀ s, t ∈ V ∪ {0}), we see

Let w ∈ (V ∪ {0}) \ {u} be such that α* = Δf(x; w, u) and put y = x − χu + χw. Then f[α*χu](x) = f[α*χu](y) as well as x ∈ arg min f[α*χu]. Hence follows (ii) in (M♮-SWGS[Z]). □

Theorem 6.36. For a convex-extensible function f : Z^V → R ∪ {+∞} with a nonempty effective domain,

f is M♮-convex ⟺ f satisfies (M♮-SWGS[Z]).

Proof. The implication ⟹ was shown in Proposition 6.35. We give a proof of ⟸ by using Theorem 6.30 (2). It suffices to show that B = arg min f is an M♮-convex set, since (M♮-SWGS[Z]) for f implies this for f[−p] for any p ∈ R^V. Since B = B̄ ∩ Z^V by the convex extensibility of f (see Proposition 3.18), this is further reduced to showing that every edge of the polyhedron B̄ is parallel to χu − χv or χu for some u, v ∈ V (see (4.43)). Let E be an edge of B̄. By B = arg min f we have E ∩ Z^V = arg min f[p] for some p ∈ R^V. Let x and y be two distinct integer points on E with supp+(x − y) ≠ ∅. By (M♮-SWGS[Z]) with u ∈ supp+(x − y), there exist α > 0 and x′ ∈ arg min f[p + αχu] such that x′(u) = x(u) − 1 and x′(w) ≥ x(w) (∀ w ≠ u), since (i) of (M♮-SWGS[Z]) fails by f[p + αχu](y) < f[p + αχu](x). Since

we have x′ ∈ arg min f[p]. This means that x′ ∈ E and that x − x′ is a scalar multiple of x − y. In particular, supp+(x − y) = supp+(x − x′) = {u}. If supp−(x − y) = ∅, then x − y is a scalar multiple of χu. Otherwise, a similar argument shows that supp−(x − y) = {v} for some v and there exists y′ ∈ E ∩ Z^V such that y′(v) = y(v) − 1 and y′(w) ≥ y(w) (∀ w ≠ v). Since x − x′ is a scalar multiple of y − y′, we have x′(v) = x(v) + β and y′(u) = y(u) + 1/β for some β > 0. We must have β = 1 since x′(v) and y′(u) are integers. Therefore, x − y is a scalar multiple of χu − χv. □

6.9 Proximity Theorem

Suppose that we have an optimization problem to solve and another optimization problem approximating the original problem. Proximity theorem is a generic term for a theorem that guarantees the existence of an optimal solution to the original problem in some neighborhood of an optimal solution to the approximate problem.
Our optimization problem here is the minimization of an M-convex function f, and the approximation to it is the problem of (locally) minimizing the scaling f^α of f with a positive integer α, as defined in (6.47). Recall from Note 6.18 that the scaling of an M-convex function is not necessarily M-convex, and hence a local optimum of f^α may not be a global optimum of f^α.
The following proximity theorem, named the M-proximity theorem, shows that a global optimum of the original function f exists in a neighborhood of a local optimum of f^α.

Theorem 6.37 (M-proximity theorem). Assume α ∈ Z++ and n = |V|.
(1) Let f : Z^V → R ∪ {+∞} be an M-convex function. If x^α ∈ dom f satisfies

f(x^α) ≤ f(x^α + α(χv − χu))   (∀ u, v ∈ V),   (6.66)

then arg min f ≠ ∅ and there exists x* ∈ arg min f with

||x^α − x*||∞ ≤ (n − 1)(α − 1).   (6.67)

(2) Let f : Z^V → R ∪ {+∞} be an M♮-convex function. If x^α ∈ dom f satisfies

f(x^α) ≤ f(x^α + α(χv − χu))   (∀ u, v ∈ V ∪ {0}),   (6.68)

then arg min f ≠ ∅ and there exists x* ∈ arg min f with

||x^α − x*||∞ ≤ n(α − 1).   (6.69)


Proof. It suffices to prove (1) by showing that, for any γ ∈ R with γ > inf f, there exists some x* ∈ dom f satisfying f(x*) ≤ γ and (6.67). Suppose that x* ∈ dom f minimizes ||x* − x^α||1 among all vectors satisfying f(x*) ≤ γ. In the following, we fix v ∈ V and prove x^α(v) − x*(v) ≤ (n − 1)(α − 1). (The inequality x*(v) − x^α(v) ≤ (n − 1)(α − 1) can be shown similarly.) We may assume x^α(v) > x*(v); put k = x^α(v) − x*(v).
Claim 1: There exist w1, w2, ..., wk ∈ V \ {v} and y0 (= x^α), y1, ..., yk ∈ dom f such that

(Proof of Claim 1) We prove the claim by induction on i. Suppose yi−1 ∈ dom f. By (M-EXC[Z]) for yi−1, x*, and v ∈ supp+(yi−1 − x*), there exists wi ∈ supp−(yi−1 − x*) ⊆ supp−(x^α − x*) ⊆ V \ {v} such that

By the choice of x* we have f(x* + χv − χwi) > f(x*), and hence f(yi) = f(yi−1 − χv + χwi) ≤ f(yi−1).
Claim 2: For any w ∈ V \ {v} with yk(w) > x^α(w) and μ ∈ [0, yk(w) − x^α(w) − 1]_Z, we have

(Proof of Claim 2) We prove this by induction on μ. For μ ∈ [0, yk(w) − x^α(w) − 1]_Z, put x′ = x^α − μ(χv − χw) and assume x′ ∈ dom f. Let j* (1 ≤ j* ≤ k) be the largest index such that wj* = w. Then yj*(w) = yk(w) > x′(w) and supp−(yj* − x′) = {v}. (M-EXC[Z]) implies f(x′) + f(yj*) ≥ f(x′ − χv + χw) + f(yj* + χv − χw). By Claim 1 we have f(yj* + χv − χw) ≥ f(yj*). This establishes Claim 2.
Claim 2 and (6.66) imply

for any w with μw = yk(w) − x^α(w) > 0. Hence yk(w) − x^α(w) ≤ α − 1 for all w ∈ V \ {v}. Then we obtain

where the second equality is by x^α(V) = yk(V). □

Example 6.38. The M-proximity theorem is illustrated by the univariate M♮-convex function in Fig. 6.1 in section 6.4, where α = 2. Obviously, x^α = 0 is a minimizer of f^α satisfying (6.68) and x* = 1 is the minimizer of f. We have |x^α − x*| = 1 = n(α − 1), in agreement with (6.69). •

The minimizer cut theorem (Theorem 6.28) can be adapted to scaling. Theorem 6.28, except for (3), is a special case of the following theorem with α = 1.

Theorem 6.39 (M-minimizer cut with scaling). Let f : Z^V → R ∪ {+∞} be an M-convex function with arg min f ≠ ∅, and assume α ∈ Z++ and n = |V|.
(1) For x ∈ dom f and v ∈ V, let u ∈ V be such that

Then there exists x* ∈ arg min f with

(2) For x ∈ dom f and u ∈ V, let v ∈ V be such that

Then there exists x* ∈ arg min f with

Proof. We prove (2), while (1) can be proved similarly. Put x^α = x + α(χv − χu). We may assume max{x*(v) | x* ∈ arg min f} < x^α(v); otherwise we are done. Let x* be an element of arg min f with x*(v) maximum and k = x^α(v) − x*(v) (≥ 1). The rest of the proof is the same as the proof of Theorem 6.37 (from Claim 1 until the end). □

The algorithmic use of the above theorems, M-proximity and minimizer cut with scaling, is shown in sections 10.1.2 and 10.1.4, respectively.

Note 6.40. An ℓ1-norm version of Theorem 6.37 (1), with (6.67) replaced with

can be obtained from a slight modification of the proof; see Murota–Tamura [162]. •

Note 6.41. The M-proximity theorem (Theorem 6.37) is closely related to the result of Hochbaum [90]. See also Moriguchi–Shioura [134]. •

6.10 Convex Extension

This section establishes one of the major properties of M-convex functions: that they can be extended to convex functions in real variables. The extensibility to convex functions is by no means obvious from the definition of M-convex functions; note that the exchange axiom (M-EXC[Z]) refers only to function values at integer points. The convex extension of an M-convex function can be obtained by piecing together M-convex polyhedra in a consistent way.

The first theorem shows that the convex extension of an M-convex function can be constructed locally.

Theorem 6.42. An M♮-convex function is integrally convex. In particular, an M♮-convex function is convex extensible.

Proof. It suffices to consider an M-convex function f. The restriction of f to any bounded integer interval [a, b], denoted by f_{[a,b]}, is an M-convex function (Proposition 6.14). For any p ∈ R^V, arg min(f_{[a,b]}[−p]) is an M-convex set by Proposition 6.29, and hence it is an integrally convex set by Theorem 4.24. Therefore, f_{[a,b]} is an integrally convex function by Theorem 3.29. This implies the integral convexity of f by Proposition 3.19. □

The next theorem characterizes the convex extension of an M-convex function as a collection of M-convex polyhedra.

Theorem 6.43. Let f : Z^V → R ∪ {+∞} be a function with dom f ≠ ∅ and let f̄ be its convex closure.

Proof. It suffices to prove (1). The implication ⟹ is due to Theorem 6.42 and Proposition 6.29. The converse ⟸ can be established by Theorem 6.30 applied to the restriction of f to every bounded integer interval. □

By integral convexity, the convex extension f̄(x) of an M-convex function f can be represented as a convex combination of the values f(y) with y ∈ N(x), where N(x) is the integral neighborhood of x ∈ R^V defined in (3.58). The following theorem states that we can use a single set of convex combination coefficients for a pair of M♮-convex functions. This fact, though technical, is crucial in establishing the separation theorem for M♮-convex functions (Theorem 8.15).

Theorem 6.44. For two M♮-convex functions f1, f2 ∈ M♮[Z → R] and x ∈ R^V, there exists λ = (λy | y ∈ N(x)) such that

Proof. We may assume f1 and f2 to be M-convex and x ∈ dom f̄1 ∩ dom f̄2. For i = 1, 2, let (pi, αi) ∈ R^V × R be such that

(see (3.61)). Then

is an M-convex set. Since x ∈ B̄1 ∩ B̄2, where B̄1 ∩ B̄2 is the convex hull of B1 ∩ B2 by Theorem 4.22, there exists λ = (λy | y ∈ N(x)) satisfying {y | λy > 0} ⊆ B1 ∩ B2 and (6.71). Such a λ also satisfies (6.72) by the complementarity (Theorem 3.10 (3)), as in the proof of Theorem 3.29. □

6.11 Polyhedral M-Convex Functions

As we have seen, M-convex functions on the integer lattice can be extended to convex functions in real variables, and the convex extension of an M-convex function is a polyhedral convex function when restricted to a finite interval. Motivated by this, we define here the concept of M-convexity for polyhedral convex functions in general and show that the major properties of M-convex functions survive in this generalization.

A polyhedral convex function f : R^V → R ∪ {+∞} with dom_R f ≠ ∅ is said to be M-convex if it satisfies the following exchange property:

(M-EXC[R]) For x, y ∈ dom_R f and u ∈ supp⁺(x − y), there exist v ∈ supp⁻(x − y) and a positive number α₀ ∈ R₊₊ such that

for all α ∈ [0, α₀]_R.


Note that, if the inequality above holds for α = α₀, then it holds for all α ∈ [0, α₀]_R by the convexity of f. With the notation

for directional derivatives (see (3.24)), (M-EXC[R]) can be rewritten as follows:

(M-EXC′[R]) For x, y ∈ dom_R f,

We denote by M[R → R] the set of polyhedral M-convex functions. Polyhedral M-concave functions are defined in an obvious way.
An M-convex function on integer points naturally induces a polyhedral M-convex function via convex extension (which exists by Theorem 6.42).

Theorem 6.45. The convex extension f̄ of an M-convex function f ∈ M[Z → R] on the integer lattice is a polyhedral M-convex function, i.e., f̄ ∈ M[R → R], provided that f̄ is polyhedral.

Proof. The proof is given later in Note 8.8. □

Example 6.46. The convex extension f̄ of an M-convex function f ∈ M[Z → R] may consist of an infinite number of linear pieces, in which case f̄ is not polyhedral convex. For example, we have f ∈ M[Z → R] and f̄ ∉ M[R → R] for f : Z² → R ∪ {+∞} defined by

It is worth noting that, if dom_Z f is bounded, f̄ is polyhedral and therefore f̄ ∈ M[R → R] by Theorem 6.45. ■
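Since the displayed formula for f is not reproduced above, the phenomenon can still be illustrated with a hypothetical stand-in (my own choice, not the example from the text): f(x₁, x₂) = x₁² on dom f = {x ∈ Z² : x₁ + x₂ = 0} is M-convex, and along the line its convex extension is linear with slope 2k + 1 between consecutive integer points — a new slope on every unit interval, hence infinitely many pieces:

```python
# Hypothetical stand-in (not the function from the text):
# f(x1, x2) = x1**2 on dom f = {x in Z^2 : x1 + x2 = 0}.
import math

def f(x1):                 # f along the line x2 = -x1
    return x1 ** 2

def f_bar(t):              # convex extension along that line, t in R
    k = math.floor(t)      # interpolate linearly between k and k + 1
    return f(k) + (f(k + 1) - f(k)) * (t - k)

def slope(t):              # slope of the linear piece containing t
    k = math.floor(t)
    return f_bar(k + 1) - f_bar(k)

# the slope 2k + 1 changes at every integer: infinitely many linear
# pieces, so the extension is not polyhedral on the unbounded domain
print([slope(k + 0.5) for k in range(-2, 3)])   # [-3, -1, 1, 3, 5]
print(f_bar(1.5))                                # 2.5
```

Restricting dom f to a bounded interval leaves only finitely many of these pieces, matching the closing remark of the example.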

We now define integrality for polyhedral convex functions in general. By an integral polyhedral convex function we mean a polyhedral convex function f such that

We say that a polyhedral convex function f has dual integrality, or is a dual-integral polyhedral convex function, if its conjugate function f* has the integrality (6.75). Since argmin f*[−x] = ∂_R f(x), as in (3.30), f has dual integrality if and only if

We denote by C[Z|R → R] and C[R → R|Z] the sets of univariate polyhedral convex functions with integrality (6.75) and dual integrality (6.76), respectively.
Polyhedral M-convex functions with integrality (6.75) are referred to as integral polyhedral M-convex functions, the set of which is denoted by M[Z|R → R]. Polyhedral M-convex functions with dual integrality (6.76) are referred to as dual-integral polyhedral M-convex functions, the set of which is denoted by M[R → R|Z].

By Theorems 6.45 and 6.43, an integral polyhedral M-convex function is nothing but a polyhedral M-convex function that can be obtained as the convex extension of an M-convex function on integer points. Therefore, we have

where the second expression means that there exists an injection from M[Z|R → R] to M[Z → R], representing an embedding of M[Z|R → R] into M[Z → R].

The effective domain of a polyhedral M-convex function is an M-convex polyhedron lying on a hyperplane {x ∈ R^V | x(V) = r} for some r ∈ R. Hence, polyhedral M♮-convex functions can be defined as the projections of polyhedral M-convex functions, just as M♮-convex functions on integer points are defined from
M-convex functions via (6.4). We denote by M♮[R → R] the set of polyhedral M♮-convex functions and by M♮[Z|R → R] the set of integral polyhedral M♮-convex functions. The relationship between M and M♮ is described by M_n ⊊ M♮_n ≃ M_{n+1}, where M_n and M♮_n denote, respectively, the sets of polyhedral M-convex functions and polyhedral M♮-convex functions in n variables.

The following are the R-counterparts of (M♮-EXC[Z]) and (M♮-EXC′[Z]):

(M♮-EXC[R]) For x, y ∈ dom_R f and u ∈ supp⁺(x − y), there exist v ∈ supp⁻(x − y) ∪ {0} and a positive number α₀ ∈ R₊₊ such that

for all α ∈ [0, α₀]_R,

where χ₀ = 0.

(M♮-EXC′[R]) For x, y ∈ dom_R f,

where

Theorem 6.47. For a polyhedral convex function f : R^V → R ∪ {+∞} with dom_R f ≠ ∅, we have

polyhedral M♮-convexity ⇔ (M♮-EXC[R]) ⇔ (M♮-EXC′[R]).

Theorem 6.48. A polyhedral M-convex function is polyhedral M♮-convex. Conversely, a polyhedral M♮-convex function is polyhedral M-convex if and only if its effective domain is contained in {x ∈ R^V | x(V) = r} for some r ∈ R.

Almost all properties of M-convex functions on integer points carry over to polyhedral M-convex functions. To be specific, Theorems 6.13, 6.15, 6.19, and 6.26 and Proposition 6.29 are adapted as follows. Note, however, that the proofs are not straightforward adaptations; see Murota–Shioura [152].

For a subset U ⊆ V, the restriction f_U : R^U → R ∪ {+∞}, the projection f^U : R^U → R ∪ {±∞}, and the aggregation f_{U*} : R^U × R → R ∪ {±∞} are defined similarly to (6.40), (6.41), and (6.42). Note in Theorem 6.49 (2) below that a scaling factor β is allowed, unlike in the discrete case (cf. Note 6.18).

Theorem 6.49. Let f, f₁, f₂ ∈ M[R → R] be polyhedral M-convex functions.

(1) For λ ∈ R₊₊, λf is polyhedral M-convex.
(2) For a ∈ R^V and β ∈ R \ {0}, f(a + βx) is polyhedral M-convex in x.
(3) For p ∈ R^V, f[−p] is polyhedral M-convex.
(4) For ψ_v ∈ C[R → R] (v ∈ V),

is polyhedral M-convex provided its effective domain is nonempty.
(5) For a, b ∈ (R ∪ {±∞})^V, the restriction f_{[a,b]} to the real interval [a, b] is polyhedral M-convex provided dom_R f_{[a,b]} ≠ ∅.
(6) For U ⊆ V, the restriction f_U is polyhedral M-convex provided dom_R f_U ≠ ∅.
(7) For U ⊆ V, the aggregation f_{U*} is polyhedral M-convex provided f_{U*} > −∞.
(8) The infimal convolution f = f₁ □ f₂ is polyhedral M-convex provided f > −∞.

Theorem 6.50. Let f, f₁, f₂ ∈ M♮[R → R] be polyhedral M♮-convex functions.

(1) Operations (1)–(8) of Theorem 6.49 are valid for polyhedral M♮-convex functions.
(2) For U ⊆ V, the projection f^U is polyhedral M♮-convex provided f^U > −∞.

Theorem 6.51. A polyhedral M♮-convex function f ∈ M♮[R → R] is supermodular; i.e.,

Theorem 6.52 (M-optimality criterion).
(1) For a polyhedral M-convex function f ∈ M[R → R] and x ∈ dom_R f, we have

(2) For a polyhedral M♮-convex function f ∈ M♮[R → R] and x ∈ dom_R f, we have

Proposition 6.53. Let f ∈ M[R → R] be a polyhedral M-convex function. For any p ∈ R^V, argmin f[−p] is an M-convex polyhedron if it is not empty.

The property in Proposition 6.53 in fact characterizes polyhedral M-convexity, as will be shown in Theorem 6.63.

Note 6.54. Here are two remarks on α₀ in (M-EXC[R]). First, for an integral polyhedral M-convex function f ∈ M[Z|R → R], we can take α₀ = 1. Second, if (M-EXC[R]) is true at all, we can take α₀ = (x(u) − y(u))/|supp⁻(x − y)| independently of f; see Murota–Shioura [152] for the proof. ■

Note 6.55. The proviso f_{U*} > −∞ in Theorem 6.49 (7) can be weakened to f_{U*}(x₀) > −∞ for some x₀. The same can be said for f > −∞ in Theorem 6.49 (8) and f^U > −∞ in Theorem 6.50 (2). ■

6.12 Positively Homogeneous M-Convex Functions

There exists a one-to-one correspondence between positively homogeneous M-convex functions and distance functions satisfying the triangle inequality.

We denote by ₀M[R → R] the set of polyhedral M-convex functions that are positively homogeneous in the sense of (3.32) and by ₀M[Z|R → R] the set of integral polyhedral M-convex functions that are positively homogeneous. Also we denote by ₀M[Z → R] the set of M-convex functions f ∈ M[Z → R] on integer points such that the convex extension f̄ is positively homogeneous.

These three families of functions can be identified with one another, i.e.,

by the following proposition. We introduce yet another notation, ₀M[Z → Z], for the set of integer-valued functions belonging to ₀M[Z → R].

Proposition 6.56.
(1) ₀M[Z|R → R] = ₀M[R → R].
(2) The convex extension of a function in ₀M[Z → R] belongs to ₀M[R → R].

Proof. (1) Take f ∈ ₀M[R → R]. For any p ∈ R^V, argmin f[−p] is a cone that is an M-convex polyhedron (or empty) by Proposition 6.53. Hence, argmin f[−p] = B(ρ) for a {0, +∞}-valued submodular set function ρ; see section 4.8. This shows the integrality of argmin f[−p], and therefore f ∈ ₀M[Z|R → R].
(2) Take f ∈ ₀M[Z → R]. Since f is integrally convex and f̄ is positively homogeneous, f̄ can be represented as the maximum of a finite number of linear functions. Hence, f̄ is polyhedral and f̄ ∈ M[R → R] by Theorem 6.45. □

A positively homogeneous M-convex function f induces a distance function γ = γ_f satisfying the triangle inequality by

More precisely, we have the following, where T[R] and T[Z] denote, respectively, the sets of real-valued and integer-valued distance functions satisfying the triangle inequality.

Proposition 6.57.
(1) For f ∈ ₀M[R → R], we have γ_f ∈ T[R].
(2) For f ∈ ₀M[Z → Z], we have γ_f ∈ T[Z].

Proof. For (1) we apply (M-EXC[R]) to x = χ_{v₃} − χ_{v₂}, y = χ_{v₂} − χ_{v₁}, and u = v₁, where we can take α = 1 by Proposition 6.56 (1) and Note 6.54. This yields the triangle inequality (5.2). For (2) we use (M-EXC[Z]) in a similar manner. □

Conversely, a distance function satisfying the triangle inequality induces a positively homogeneous M-convex function. For γ ∈ T[R], we define γ̄ : R^V → R ∪ {+∞} by

which is called the extension of γ. Proposition 5.1 as well as its proof shows

where D(γ) is the L-convex polyhedron (5.4) associated with γ. Denote by γ̄_Z : Z^V → R ∪ {+∞} the restriction of γ̄ to Z^V, and note that for x ∈ Z^V we may assume λ_{uv} ∈ Z₊ in (6.82), as is explained in the proof of Proposition 5.1.

Proposition 6.58.
(1) For γ ∈ T[R], we have γ̄ ∈ ₀M[R → R].
(2) For γ ∈ T[Z], we have γ̄_Z ∈ ₀M[Z → Z].

Proof. Expression (6.82) is a special case of the M-convex function (2.37) appearing in network flow problems (section 2.2), where T = V, A = {a = (u, v) | u, v ∈ V, u ≠ v}, and f_a(ξ) = γ(u, v)ξ (for ξ ≥ 0) and +∞ (for ξ < 0). See also Note 2.19. □
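For intuition about the extension, here is a hedged numeric sketch with hypothetical arc lengths (my own data, not from the text): the shortest-path closure of any positive arc-length function satisfies the triangle inequality, and for such a closed γ the extension evaluated at x = χ_t − χ_s — a unit of flow from s to t — recovers γ(s, t) itself:

```python
import itertools

V = range(4)
# hypothetical positive arc lengths on a complete digraph
d = {(u, v): (3 * u + 2 * v + 1) % 7 + 1 for u in V for v in V if u != v}

# shortest-path closure (Floyd-Warshall); the closure g lies in T[Z]
g = dict(d)
for w in V:
    for u in V:
        for v in V:
            if u != v and w not in (u, v):
                g[(u, v)] = min(g[(u, v)], g[(u, w)] + g[(w, v)])

assert all(g[(u, v)] <= g[(u, w)] + g[(w, v)]
           for u, v, w in itertools.permutations(V, 3))

def gamma_bar_unit(s, t):
    """Value of the extension at chi_t - chi_s: cheapest routing of one unit."""
    best = g[(s, t)]
    rest = [w for w in V if w not in (s, t)]
    for k in range(1, len(V) - 1):
        for mid in itertools.permutations(rest, k):
            path = [s, *mid, t]
            best = min(best, sum(g[(a, b)] for a, b in zip(path, path[1:])))
    return best

# with the triangle inequality in force, the extension recovers gamma itself
assert all(gamma_bar_unit(s, t) == g[(s, t)] for s in V for t in V if s != t)
print("extension at chi_t - chi_s equals gamma(s, t)")
```

This is only the single-unit slice of (6.82); general x requires a genuine minimum-cost flow, as in the proof above.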

The next theorem shows a one-to-one correspondence between positively homogeneous M-convex functions and distance functions satisfying the triangle inequality.

Theorem 6.59. For ₀M = ₀M[R → R] and T = T[R], the mappings Φ : ₀M → T and Ψ : T → ₀M defined by

are inverse to each other, establishing a one-to-one correspondence between ₀M and T. The same statement is true for ₀M = ₀M[Z → Z] and T = T[Z].

Proof. For γ ∈ T, we have Ψ(γ) ∈ ₀M by Proposition 6.58 and Φ ∘ Ψ(γ) = γ by (6.83). For f ∈ ₀M, we have Φ(f) ∈ T by Proposition 6.57. Since f is a positively homogeneous convex function, we have

whenever Σ_{u,v∈V} λ_{uv}(χ_v − χ_u) = x and λ_{uv} ≥ 0 (u, v ∈ V). This implies f ≤ Ψ ∘ Φ(f). In the case of ₀M = ₀M[Z → Z] and T = T[Z], the opposite inequality f ≥ Ψ ∘ Φ(f) is given by Proposition 6.25, whereas, in the case of ₀M = ₀M[R → R] and T = T[R], the inequality can be established by an argument similar to the proof of Proposition 6.23. □

6.13 Directional Derivatives and Subgradients

Directional derivatives and subgradients of M-convex functions are considered in this section. For a polyhedral M-convex function f, the directional derivative f′(x; d) is a positively homogeneous M-convex function in d, and the subgradients of f at a point form an L-convex polyhedron. Furthermore, each of these properties characterizes M-convexity.

We start with directional derivatives of a polyhedral M-convex function f ∈ M[R → R]. Recall from (3.25) that, for each x ∈ dom_R f, there exists ε > 0 such that

Proposition 6.60. For f ∈ M[R → R] and x ∈ dom_R f, we have f′(x; ·) ∈ ₀M[R → R].

Proof. By (6.85), f′(x; ·) has the exchange property in a neighborhood of d = 0. The claim then follows from the positive homogeneity of f′(x; ·). □

For a function f : Z^V → R ∪ {+∞} and a point x ∈ dom_Z f, we define

and call it the subdifferential of f at x (cf. (3.23)). An element of ∂_R f(x) is called a subgradient of f at x. If f is convex extensible, we have

where f̄ is the convex extension of f. The set of integer-valued subgradients

is called the integer subdifferential of f at x ∈ dom_Z f.

Directional derivatives and subdifferentials of M-convex functions are given as follows. It is recalled that L₀[R], L₀[Z|R], L₀[Z], and M[R → R|Z] denote, respectively, the sets of L-convex polyhedra, integral L-convex polyhedra, L-convex sets, and dual-integral polyhedral M-convex functions. Also recall the definition of γ̄ in (6.82).

Theorem 6.61.
(1) For f ∈ M[R → R] and x ∈ dom_R f, define γ_{f,x}(u, v) = f′(x; −χ_u + χ_v) (u, v ∈ V). Then

and ∂_R f(x) ≠ ∅ in particular. If f ∈ M[R → R|Z], then

(2) For f ∈ M[Z → R] and x ∈ dom_Z f, define γ_{f,x}(u, v) = f(x − χ_u + χ_v) − f(x) (u, v ∈ V). Then

and ∂_R f(x) ≠ ∅ in particular. If f ∈ M[Z → Z], then

γ_{f,x} ∈ T[Z],  ∂_R f(x) ∈ L₀[Z|R],  ∂_Z f(x) ∈ L₀[Z],  ∂_R f(x) = \overline{∂_Z f(x)},

and ∂_Z f(x) ≠ ∅ in particular.

Proof. (1) Proposition 6.60 shows f′(x; ·) ∈ ₀M[R → R], from which γ_{f,x} ∈ T[R] follows by Proposition 6.57. By the definition of a subdifferential and Theorem 6.52 (M-optimality criterion) we see

We have D(γ_{f,x}) ∈ L₀[R] by (5.18) and f′(x; ·) = γ̄_{f,x}(·) by (3.31), (3.33), and (6.84). If f ∈ M[R → R|Z], then ∂_R f(x) is an integral polyhedron by (6.76) and γ_{f,x}(u, v) = sup{p(v) − p(u) | p ∈ ∂_R f(x)} ∈ Z.
(2) Applying (M-EXC[Z]) to x + χ_{v₃} − χ_{v₂}, x + χ_{v₂} − χ_{v₁}, and u = v₁ shows the triangle inequality (5.2), and hence γ_{f,x} ∈ T[R]. The rest of the proof is similar to (1), where we use Theorem 6.26 (1) instead of Theorem 6.52 and Theorem 5.5 instead of (5.18). □
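The triangle-inequality claim of Theorem 6.61 (2) can be checked directly on a small hypothetical instance (my own choice, not from the text): a separable convex function on the hyperplane x(V) = 0 is M-convex, and the resulting γ_{f,x} indeed lies in T[R]:

```python
import itertools

def f(x):   # hypothetical M-convex function: separable convex on x(V) = 0
    if sum(x) != 0:
        return float("inf")
    return x[0] ** 2 + 2 * x[1] ** 2 + x[2] ** 2

x = (1, -2, 1)     # a point of dom f, with f(x) = 10

def gamma(u, v):   # gamma_{f,x}(u, v) = f(x - chi_u + chi_v) - f(x)
    if u == v:
        return 0
    y = list(x)
    y[u] -= 1
    y[v] += 1
    return f(tuple(y)) - f(x)

# triangle inequality (5.2): gamma(u, w) <= gamma(u, v) + gamma(v, w)
for u, v, w in itertools.product(range(3), repeat=3):
    assert gamma(u, w) <= gamma(u, v) + gamma(v, w)
print(gamma(0, 1), gamma(1, 0))   # -7 13
```

Note that γ_{f,x} may take negative values — it is a distance function only in the generalized sense used here, with the triangle inequality as the defining property.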

The following fact shows the consistency of (1) and (2) in Theorem 6.61.

Proposition 6.62. For f ∈ M[Z|R → R] and x ∈ dom_R f ∩ Z^V, we have

f′(x; −χ_u + χ_v) = f(x − χ_u + χ_v) − f(x)  for u, v ∈ V.

Proof. This follows from integrality (6.75) and (4.40). □

The following theorem affords characterizations of polyhedral M-convex functions in terms of the M-convexity of directional derivatives, the L-convexity of subdifferentials, and the M-convexity of minimizers.

Theorem 6.63. For a polyhedral convex function f : R^V → R ∪ {+∞} with dom_R f ≠ ∅, the four conditions (a), (b), (c), and (d) below are equivalent.
(a) f ∈ M[R → R].

Figure 6.3. Quasi-convex function.

(b) f′(x; ·) ∈ ₀M[R → R] for every x ∈ dom_R f.
(c) ∂_R f(x) ∈ L₀[R] for every x ∈ dom_R f.
(d) argmin f[−p] ∈ M₀[R] for every p ∈ R^V with inf f[−p] > −∞.

Proof. (a) ⇒ (b) is by Proposition 6.60, (a) ⇒ (c) by Theorem 6.61, and (a) ⇒ (d) by Proposition 6.53. The rest is proved later in Note 8.7. □

An integrality consideration in the equivalence of (a) and (d) in the above theorem yields a characterization of integral polyhedral M-convex functions.

Theorem 6.64. For a polyhedral convex function f : R^V → R ∪ {+∞} with dom_R f ≠ ∅, the two conditions (a) and (d) below are equivalent.
(a) f ∈ M[Z|R → R].
(d) argmin f[−p] ∈ M₀[Z|R] for every p ∈ R^V with inf f[−p] > −∞.

Note 6.65. By Theorem 6.61 we can identify f̂(x, y) in Proposition 6.25 as the directional derivative of the convex extension f̄ of f ∈ M[Z → R]. That is, we have f̂(x, y) = f̄′(x; y − x). ■

6.14 Quasi M-Convex Functions

Quasi M-convex functions are introduced as a generalization of M-convex functions. The optimality criterion and the proximity theorem survive in this generalization.

A function f : R^n → R ∪ {+∞} is said to be quasi convex if it satisfies

whenever x, y ∈ dom_R f and 0 ≤ λ ≤ 1, and semistrictly quasi convex if

whenever x, y ∈ dom_R f, f(x) ≠ f(y), and 0 < λ < 1. See Fig. 6.3 for an illustration of a (semistrictly) quasi-convex function.

Quasi convexity is ordinal convexity in the sense that the definition involves no addition of function values but relies only on comparisons. In this connection note that, if f(x) is convex and φ is a nondecreasing function representing a nonlinear scaling, then φ(f(x)) is quasi convex.
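A quick numeric sanity check of this remark (illustrative only; the particular f and φ below are arbitrary choices, not from the text):

```python
# f convex and phi nondecreasing => phi(f(x)) satisfies the
# quasi-convexity inequality h(lam*x + (1-lam)*y) <= max(h(x), h(y))
import itertools
import math

f = lambda x: (x - 1.0) ** 2        # convex
phi = math.atan                      # nondecreasing nonlinear scaling
h = lambda x: phi(f(x))              # quasi convex (though no longer convex)

pts = [i / 4.0 for i in range(-20, 21)]
for x, y in itertools.product(pts, repeat=2):
    for lam in (0.25, 0.5, 0.75):
        z = lam * x + (1 - lam) * y
        assert h(z) <= max(h(x), h(y)) + 1e-12   # tolerance for rounding
print("quasi-convexity inequality verified on the sample")
```

The reasoning mirrors the text: convexity gives f(z) ≤ max(f(x), f(y)), and applying the nondecreasing φ preserves the inequality.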
Quasi-convex functions enjoy the following nice properties:

• A strict local minimum of a quasi-convex function is a strict global minimum.

• A local minimum of a semistrictly quasi-convex function is a global minimum.

• Level sets of quasi-convex functions are convex sets.

Due to these properties, quasi convexity also plays an important role in continuous optimization (see, e.g., Avriel–Diewert–Schaible–Zang [5]).
The concept of quasi M-convexity is defined for a function f : Z^V → R ∪ {+∞} as follows. Recall the exchange axiom for M-convex functions:

(M-EXC[Z]) For x, y ∈ dom f and u ∈ supp⁺(x − y), there exists v ∈ supp⁻(x − y) satisfying

The sign patterns of Δf(x; v, u) and Δf(y; u, v) compatible with (implied by) inequality (6.91) are as follows:

Here ○ and × denote possible and impossible cases, respectively. Relaxing condition (6.91) to the compatible sign patterns leads to two versions of quasi M-convex functions. We say that a function f : Z^V → R ∪ {+∞} with dom f ≠ ∅ is quasi M-convex if it satisfies the following:

(QM) For x, y ∈ dom f and u ∈ supp⁺(x − y), there exists v ∈ supp⁻(x − y) satisfying

Similarly, a function f : Z^V → R ∪ {+∞} with dom f ≠ ∅ is semistrictly quasi M-convex if it satisfies the following:

(SSQM) For x, y ∈ dom f and u ∈ supp⁺(x − y), there exists v ∈ supp⁻(x − y) satisfying

Example 6.66. A quasi M-convex function arises from a nonlinear scaling of an M-convex function. For an M-convex function f : Z^V → R ∪ {+∞} and a function φ : R → R ∪ {+∞}, define f̃ : Z^V → R ∪ {+∞} by

Then f̃ satisfies (QM) if φ is nondecreasing and (SSQM) if φ is strictly increasing. ■
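The (QM) part of this example can be verified exhaustively on a small hypothetical instance (my own data, not from the text); here (QM) is read in its compatible-sign-pattern form: for each u ∈ supp⁺(x − y) there is some v ∈ supp⁻(x − y) with Δf̃(x; v, u) ≤ 0 or Δf̃(y; u, v) ≤ 0:

```python
import itertools

dom = {(k, -k) for k in range(-3, 4)}        # an M-convex set in Z^2
f = lambda x: x[0] ** 2                       # M-convex on dom
phi = lambda t: t ** 3 + t                    # strictly increasing scaling
ft = lambda x: phi(f(x))                      # the scaled function phi o f

def delta(x, v, u):                           # Delta ft(x; v, u)
    y = list(x); y[u] -= 1; y[v] += 1
    y = tuple(y)
    return ft(y) - ft(x) if y in dom else float("inf")

for x, y in itertools.product(dom, repeat=2):
    d = [x[i] - y[i] for i in range(2)]
    supp_plus = [i for i in range(2) if d[i] > 0]
    supp_minus = [i for i in range(2) if d[i] < 0]
    for u in supp_plus:
        assert any(delta(x, v, u) <= 0 or delta(y, u, v) <= 0
                   for v in supp_minus)
print("(QM) verified for phi o f on the sample domain")
```

Replacing `phi` by a merely nondecreasing function keeps (QM); strict monotonicity is what the example requires for the stronger (SSQM).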

The following weaker variants of (QM) and (SSQM) turn out to be useful for our subsequent discussion:

(QMw) For distinct x, y ∈ dom f, there exist u ∈ supp⁺(x − y) and v ∈ supp⁻(x − y) satisfying

(SSQMw) For distinct x, y ∈ dom f, there exist u ∈ supp⁺(x − y) and v ∈ supp⁻(x − y) satisfying

The property (QMw) can be expressed in two alternative forms below. The first, (6.93), may be regarded as a variant of (6.89) with discreteness in direction, and the second, (6.94), is similar to (6.50) in section 6.6.

Theorem 6.67. For f : Z^V → R ∪ {+∞}, (QMw) is equivalent to each of the following conditions:

Proof. Obviously, (6.94) implies (QMw) and (6.93). We prove (QMw) ⇒ (6.94) and (6.93) ⇒ (6.94) by induction on ‖x − y‖₁. Suppose x, y ∈ dom f and f(x) > f(y); we may assume ‖x − y‖₁ ≥ 2. If (QMw) is true, there exist u ∈ supp⁺(x − y) and v ∈ supp⁻(x − y) such that Δf(x; v, u) < 0 or Δf(y; u, v) < 0; in the latter case the induction hypothesis for x and y′ = y + χ_u − χ_v yields Δf(x; v′, u′) < 0 for some u′ ∈ supp⁺(x − y′) ⊆ supp⁺(x − y) and v′ ∈ supp⁻(x − y′) ⊆ supp⁻(x − y). If (6.93) is true, there exist u ∈ supp⁺(x − y) and v ∈ supp⁻(x − y) such that Δf(x; v, u) < 0 or f(y + χ_u − χ_v) < f(x); in the latter case we have f(x) > f(y′) for y′ = y + χ_u − χ_v and the induction hypothesis yields Δf(x; v′, u′) < 0 for some u′ ∈ supp⁺(x − y′) ⊆ supp⁺(x − y) and v′ ∈ supp⁻(x − y′) ⊆ supp⁻(x − y). □

The relationships among the various versions of quasi M-convexity are summarized as follows. The second statement below shows that all the conditions are equivalent for f if they are imposed on every perturbation of f by a linear function.

Theorem 6.68. For f : Z^V → R ∪ {+∞} the following implications hold true.

Proof. (1) The equivalence of (M-EXC[Z]) and (M-EXCw[Z]) is due to Theorem 6.5. The remaining implications are obvious.
(2) Combining Theorems 6.72 and 6.74 below establishes this. □

The quasi M-convexity of a set B ⊆ Z^V can be defined as the quasi M-convexity of its indicator function δ_B : Z^V → {0, +∞}. The properties (QM) and (QMw) for δ_B correspond, respectively, to the following properties of B:

(Q-EXC) For x, y ∈ B and u ∈ supp⁺(x − y), there exists v ∈ supp⁻(x − y) such that x − χ_u + χ_v ∈ B or y + χ_u − χ_v ∈ B.

(Q-EXCw) For distinct x, y ∈ B, there exist u ∈ supp⁺(x − y) and v ∈ supp⁻(x − y) such that x − χ_u + χ_v ∈ B or y + χ_u − χ_v ∈ B.

Proposition 6.69. A set B ⊆ Z^V satisfies (Q-EXCw) if and only if, for distinct x, y ∈ B, there exist u ∈ supp⁺(x − y) and v ∈ supp⁻(x − y) with x − χ_u + χ_v ∈ B.

Proof. Theorem 6.67 for f = δ_B reduces to this statement. □

Proposition 6.70. For a set B ⊆ Z^V satisfying (Q-EXCw), we have x(V) = y(V) for any x, y ∈ B.

Proof. The proof is easy (similar to that of Proposition 4.1). □

Example 6.71. Whereas we have the obvious implications (B-EXC[Z]) ⇒ (Q-EXC) ⇒ (Q-EXCw), these conditions are not equivalent. For instance,

satisfies (Q-EXC) and not (B-EXC[Z]), and

satisfies (Q-EXCw) and not (Q-EXC). ■
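Since the displayed sets of this example are not reproduced above, the distinctions can still be illustrated with hypothetical sets in Z³ (my own choices, not the book's) together with a brute-force checker of the three exchange properties:

```python
import itertools

def checks(B):
    """Return (B-EXC, Q-EXC, Q-EXCw) truth values for a finite B in Z^3."""
    def move(z, u, v):                 # z - chi_u + chi_v
        w = list(z); w[u] -= 1; w[v] += 1
        return tuple(w)
    b_exc = q_exc = q_excw = True
    for x, y in itertools.permutations(B, 2):
        d = [x[i] - y[i] for i in range(3)]
        sp = [i for i in range(3) if d[i] > 0]
        sm = [i for i in range(3) if d[i] < 0]
        pair_ok = False                # does some (u, v) work for (Q-EXCw)?
        for u in sp:
            both = any(move(x, u, v) in B and move(y, v, u) in B for v in sm)
            one = any(move(x, u, v) in B or move(y, v, u) in B for v in sm)
            b_exc = b_exc and both
            q_exc = q_exc and one
            pair_ok = pair_ok or one
        q_excw = q_excw and pair_ok
    return b_exc, q_exc, q_excw

B1 = {(0, 0, 0), (1, 0, -1), (2, -1, -1)}
B2 = {(0, 0, 0), (1, 0, -1), (1, 1, -2), (2, 1, -3)}
print(checks(B1))   # (False, True, True):  (Q-EXC) holds, (B-EXC[Z]) fails
print(checks(B2))   # (False, False, True): (Q-EXCw) holds, (Q-EXC) fails
```

In B1 the pair x = (2, −1, −1), y = (0, 0, 0) admits an exchange landing in B on one side but never on both, breaking (B-EXC[Z]); in B2 the pair x = (2, 1, −3), y = (0, 0, 0) has a u ∈ supp⁺(x − y) with no valid v at all, breaking (Q-EXC).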



The weaker version (QMw) of quasi M-convexity for functions can be characterized by the corresponding quasi M-convexity (Q-EXCw) of level sets. For any α ∈ R ∪ {+∞}, the level set is defined as

Note that L(f, +∞) = dom f and L(f, α₀) = argmin f for α₀ = min f.

Theorem 6.72. A function f : Z^V → R ∪ {+∞} satisfies (QMw) if and only if the level set L(f, α) satisfies (Q-EXCw) for all α ∈ R.

Proof. ["only if"]: Let x and y be distinct elements of L(f, α). By (QMw), we have Δf(x; v, u) ≤ 0 or Δf(y; u, v) ≤ 0 for some u ∈ supp⁺(x − y) and v ∈ supp⁻(x − y). Then x − χ_u + χ_v ∈ L(f, α) or y + χ_u − χ_v ∈ L(f, α).
["if"]: For any distinct x, y ∈ dom f with f(x) > f(y), we have x − χ_u + χ_v ∈ L(f, f(x)) for some u ∈ supp⁺(x − y) and v ∈ supp⁻(x − y) by (Q-EXCw) and Proposition 6.69. Hence f(x − χ_u + χ_v) ≤ f(x). □

Proposition 6.73. If f satisfies (QMw), then dom f satisfies (Q-EXCw).

Proof. In the proof of the "only if" part of Theorem 6.72, replace L(f, α) with dom f. □

An M-convex function can be characterized by the quasi M-convexity of the level sets of its perturbed functions.

Theorem 6.74. A function f : Z^V → R ∪ {+∞} satisfies (M-EXC[Z]) if and only if the level set L(f[p], α) satisfies (Q-EXCw) for all p ∈ R^V and α ∈ R.

Proof. The "only if" part follows from Theorem 6.72. To prove the "if" part, we first observe that Theorem 6.4 can be strengthened to the statement that (M-EXC[Z]) and (M-EXC_loc[Z]) are equivalent if dom f satisfies (Q-EXCw). (This can be shown by modifying the proof of Claim 2 in the proof of Theorem 6.4.) Note that (Q-EXCw) holds for dom f by Theorem 6.72 and Proposition 6.73. To show (M-EXC_loc[Z]), take x, y ∈ dom f with ‖x − y‖₁ = 4 and put y = x − χ_{u₁} − χ_{u₂} + χ_{v₁} + χ_{v₂} with u₁, u₂, v₁, v₂ ∈ V and {u₁, u₂} ∩ {v₁, v₂} = ∅. In the following we assume u₁ ≠ u₂ and v₁ ≠ v₂ (the other cases can be treated similarly). Consider the bipartite graph G = (V⁺, V⁻; E) with vertex bipartition V⁺ = {u₁, u₂}, V⁻ = {v₁, v₂} and arc set E = {(u_i, v_j) | Δf(x; v_j, u_i) < +∞ (i, j = 1, 2)}.

Claim 1: G has a perfect matching (of size 2).
(Proof of Claim 1) It suffices to show that every vertex has an edge incident to it. Take p ∈ R^V such that p(u₁) + p(u₂) − p(v₁) − p(v₂) = f(y) − f(x) and p(v_j) ≥ p(u₁) − Δf(x; v_j, u₁) for j = 1, 2 (and p(v) = 0 for v ∈ V \ {u₁, u₂, v₁, v₂}). Then f[p](x) = f[p](y) ≤ f[p](x − χ_{u₁} + χ_{v_j}) for j = 1, 2. By (Q-EXCw) for L(f[p], f[p](x)) we have f[p](x − χ_{u₂} + χ_{v_j}) ≤ f[p](x) < +∞ for some j ∈ {1, 2}. Hence there is an edge incident to u₂, and similarly for the other vertices.

Claim 2: Inequality (6.11) is satisfied.

(Proof of Claim 2) By Claim 1 we may assume {(u₁, v₁), (u₂, v₂)} ⊆ E. We can take p ∈ R^V such that f[p](x) = f[p](y) and f[p](x − χ_{u₁} + χ_{v₁}) = f[p](x − χ_{u₂} + χ_{v₂}) (and p(v) = 0 for v ∈ V \ {u₁, u₂, v₁, v₂}). If {(u₁, v₂), (u₂, v₁)} ⊆ E, we can choose p satisfying the additional condition f[p](x − χ_{u₁} + χ_{v₂}) = f[p](x − χ_{u₂} + χ_{v₁}); then (Q-EXCw) for L(f[p], f[p](x)) yields (6.11). If (u₁, v₂) ∈ E and (u₂, v₁) ∉ E, we can choose p satisfying the additional condition f[p](x) < f[p](x − χ_{u₁} + χ_{v₂}), and then (Q-EXCw) for L(f[p], f[p](x)) yields (6.11). The remaining cases are similar. □

Next we turn to the minimization of quasi M-convex functions. The following properties, respectively weaker than (SSQM) and (SSQMw), turn out to be relevant:

(SSQM≠) For x, y ∈ dom f with f(x) ≠ f(y) and u ∈ supp⁺(x − y), there exists v ∈ supp⁻(x − y) satisfying

Δf(x; v, u) < 0 or Δf(y; u, v) < 0 or Δf(x; v, u) = Δf(y; u, v) = 0.

(SSQM≠w) For x, y ∈ dom f with f(x) ≠ f(y), there exist u ∈ supp⁺(x − y) and v ∈ supp⁻(x − y) satisfying

Δf(x; v, u) < 0 or Δf(y; u, v) < 0 or Δf(x; v, u) = Δf(y; u, v) = 0.

The property (SSQM≠w) can be expressed in two alternative forms below. The first, (6.96), may be regarded as a variant of (6.90) with discreteness in direction, and the second, (6.97), is identical to (6.50) in section 6.6.

Theorem 6.75. For f : Z^V → R ∪ {+∞}, (SSQM≠w) is equivalent to each of the following conditions:

Proof. The proof is much the same as that for Theorem 6.67. □

Global optimality (minimality) of a quasi M-convex function is characterized by local optimality.

Theorem 6.76 (Quasi M-optimality criterion).
(1) For f : Z^V → R ∪ {+∞} satisfying (QMw) and x ∈ dom f, we have

(2) For f : Z^V → R ∪ {+∞} satisfying (SSQM≠w) and x ∈ dom f, we have

Proof. It suffices to show ⇐. (1) is immediate from (6.94) in Theorem 6.67, and (2) from (6.97) in Theorem 6.75. □
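The ⇐ direction is what makes greedy descent work: a point admitting no improving exchange step is a global minimizer. A hedged sketch on a hypothetical instance (a strictly increasing scaling of an M-convex function, in the spirit of Example 6.66; the data are my own):

```python
import itertools

def f(x):   # hypothetical: phi(t) = t**3 + t applied to a separable
    #         convex function on {x in Z^3 : x(V) = 0}, within a box
    if sum(x) != 0 or not all(-5 <= c <= 5 for c in x):
        return float("inf")
    base = (x[0] - 1) ** 2 + x[1] ** 2 + (x[2] + 1) ** 2
    return base ** 3 + base

def local_search(x):
    """Descend over the exchange neighborhood {x - chi_u + chi_v}."""
    while True:
        nbrs = []
        for u, v in itertools.permutations(range(3), 2):
            y = list(x); y[u] -= 1; y[v] += 1
            nbrs.append(tuple(y))
        best = min(nbrs, key=f)
        if f(best) >= f(x):       # local minimum: global by Theorem 6.76
            return x
        x = best

dom = [x for x in itertools.product(range(-5, 6), repeat=3) if sum(x) == 0]
x_star = local_search((5, -5, 0))
assert f(x_star) == min(f(x) for x in dom)
print(x_star, f(x_star))   # (1, 0, -1) 0
```

For a general quasi-convex (not quasi M-convex) function such a descent could stall at a nonglobal local minimum; the exchange axiom is what rules that out.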

The minimizer cut theorem for M-convex functions (Theorem 6.28) can be generalized to quasi M-convex functions.

Theorem 6.77 (Quasi M-minimizer cut). Let f : Z^V → R ∪ {+∞} be a function satisfying (SSQM≠), and assume argmin f ≠ ∅. Then (1), (2), and (3) in Theorem 6.28 hold true.

Proof. The proof of Theorem 6.28 remains valid when (M-EXC[Z]) is replaced with (SSQM≠). □

The proximity theorem for M-convex functions (Theorem 6.37) can be generalized to quasi M-convex functions.

Theorem 6.78 (Quasi M-proximity theorem). Let f : Z^V → R ∪ {+∞} be a function satisfying (SSQM≠), n = |V|, and α ∈ Z₊₊. If x_α ∈ dom f satisfies (6.66), then argmin f ≠ ∅ and there exists x* ∈ argmin f satisfying (6.67).

Proof. It suffices to show that, for any γ ∈ R with γ > inf f, there exists some x* ∈ dom f satisfying f(x*) ≤ γ and (6.67). Suppose that x* ∈ dom f minimizes ‖x* − x_α‖₁ among all vectors satisfying f(x*) ≤ γ. In the following, we fix v ∈ V and prove x_α(v) − x*(v) ≤ (n − 1)(α − 1). (The inequality x*(v) − x_α(v) ≤ (n − 1)(α − 1) can be shown similarly.) We may assume x_α(v) > x*(v). Put

Claim 1: For y ∈ argmin{f(y′) | y′ ∈ S}, we have y(v) = x*(v).
(Proof of Claim 1) Suppose that y(v) > x*(v). From the definition of x* we have f(y) > f(x*). By (SSQM≠) for y, x*, and v ∈ supp⁺(y − x*) ⊆ supp⁺(x_α − x*), there exists w ∈ supp⁻(y − x*) ⊆ supp⁻(x_α − x*) such that if Δf(x*; v, w) > 0 then Δf(y; w, v) < 0. By the choice of x*, we have Δf(x*; v, w) > 0 and hence f(y − χ_v + χ_w) < f(y). Since y − χ_v + χ_w ∈ S, this contradicts the minimality of f(y). Thus Claim 1 is proved.

Take any y from argmin{f(y′) | y′ ∈ S}, and represent it as

We have λ = x_α(v) − x*(v) by Claim 1.


Claim 2: For any w ∈ supp⁻(x_α − x*) with μ_w > 0 and μ ∈ [0, μ_w − 1]_Z,

(Proof of Claim 2) We prove the claim by induction on μ. For μ ∈ [0, μ_w − 1]_Z, put x′ = x_α − μ(χ_v − χ_w), and assume x′ ∈ dom f. Note that x′ ∈ S and x′(v) > x*(v), and hence Claim 1 implies f(x′) > f(y). Since supp⁻(y − x′) = {v}, (SSQM≠) for y, x′, and w ∈ supp⁺(y − x′) implies that if Δf(y; v, w) > 0 then Δf(x′; w, v) < 0. By Claim 1 we have Δf(y; v, w) > 0, from which Claim 2 follows.

Claim 2 and (6.66) imply

where the third equality follows from x_α(V) = y(V). □

Theorem 6.79 (Quasi M-minimizer cut with scaling). Let f : Z^V → R ∪ {+∞} be a function satisfying (SSQM≠) with argmin f ≠ ∅, and assume α ∈ Z₊₊ and n = |V|. Then (1) and (2) in Theorem 6.39 hold true.

Proof. We prove (2); (1) can be proved similarly. Put x_α = x + α(χ_v − χ_u). We may assume max{x*(v) | x* ∈ argmin f} < x_α(v); otherwise we are done. Let x* be an element of argmin f with x*(v) maximum. The rest of the proof is the same as the proof of Theorem 6.78 (from Claim 1 until the end). □
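Theorems 6.78 and 6.79 are what justify scaled descent: minimize with a coarse step α, then halve it, with the proximity bound (n − 1)(α − 1) guaranteeing that each phase starts near an optimum of the next. A rough sketch on a hypothetical instance (again a strictly increasing scaling of an M-convex function, with my own data):

```python
import itertools

def f(x):   # hypothetical quasi M-convex function on {x in Z^3 : x(V) = 0}
    if sum(x) != 0:
        return float("inf")
    base = (x[0] - 7) ** 2 + 2 * (x[1] + 3) ** 2 + (x[2] + 4) ** 2
    return base ** 3

def descend(x, alpha):
    """Greedy descent with exchange steps of length alpha."""
    while True:
        cand = []
        for u, v in itertools.permutations(range(3), 2):
            y = list(x); y[u] -= alpha; y[v] += alpha
            cand.append(tuple(y))
        best = min(cand, key=f)
        if f(best) >= f(x):
            return x
        x = best

x, alpha = (0, 0, 0), 8
while alpha >= 1:
    x = descend(x, alpha)   # an alpha-local minimum
    alpha //= 2             # proximity theorem: an optimum is nearby

print(x, f(x))   # (7, -3, -4) 0
```

The final phase (α = 1) is plain local search, so the output is a true global minimizer; the coarse phases merely shorten the walk.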

Bibliographical Notes
The concept of M-convex functions was introduced by Murota [137] and that of M♮-convex functions by Murota–Shioura [151]; Theorems 6.2 and 6.3 are due to [151].

The local exchange axiom (Theorem 6.4) is given in Murota [137], and the weak exchange axiom (Theorem 6.5) is explicit in Murota [147].
Quadratic functions of the form (6.23) are treated in Camerini–Conforti–Naddef [22], and their M♮-convexity is observed in Murota–Shioura [151]. Proposition 6.8 (characterization of quadratic M-convex functions) is due to Murota–Shioura [155]. Quadratic functions defined by symmetric matrices of the form (6.29), or (6.30), are treated in Hochbaum–Shamir–Shanthikumar [92], and their M-convexity is noted by A. Shioura. Quasi-separable convex functions (6.32) are considered by [22], and their M♮-convexity is pointed out in [151]. The M♮-convexity of laminar convex functions (6.34) and minimum-value functions (6.36) is due to Danilov–Koshevoy–Murota [34], [35]. The names laminar convex functions and minimum-value functions are coined in this book.

The basic operations in section 6.4 are listed in Murota [141], [144], [147]. Theorem 6.13 (8) (infimal convolution) is due to Murota [137], whereas Theorem 6.15 (2) (projection of M♮-convex functions) is due to [141]. The scaling operation for M-convex functions is considered by Moriguchi–Murota–Shioura [133].

The supermodularity of M♮-convex functions (Theorem 6.19) is observed by Murota–Shioura [153]. A special case for valuated matroids was noted earlier by Dress–Terhalle [40] and Murota [138].

Theorem 6.24 (descent direction) is observed by Murota–Tamura [160] as a generalization of its special (but essential) case with dom f ⊆ {0, 1}^V due to Fujishige–Yang [69]. Proposition 6.25 is in Murota [137], [140], [142].

Theorems on minimizers are of fundamental importance. Theorem 6.26 (M-optimality criterion) and Theorem 6.30 (characterization by minimizers) are by Murota [137]. Theorem 6.28 (M-minimizer cut) is by Shioura [190].

The connection to the gross substitutes property was studied almost simultaneously by Danilov–Koshevoy–Lang [33], Fujishige–Yang [69], and Murota–Tamura [160]. The concave version of (6.60) is identical to condition GS in [33]. Propositions 6.32 and 6.33 and Theorem 6.34 are due to [160], and Proposition 6.35 and Theorem 6.36 are due to [33].

Results about minimizers under scaling are relatively new. Theorem 6.37 (M-proximity theorem) is by Moriguchi–Murota–Shioura [133]. Theorem 6.39 (M-minimizer cut with scaling) is by Tamura [197].

The convex extension of M-convex functions has been understood step by step. Convex extensibility (the latter half of Theorem 6.42) and the characterization by minimizers (Theorem 6.43) are in Murota [137]. Integral convexity (Theorem 6.42) and convex extension for a pair of M-convex functions (Theorem 6.44) are by Murota–Shioura [153].

Polyhedral M-convex functions are investigated by Murota–Shioura [152], to which all the theorems in section 6.11 (Theorems 6.45, 6.47, 6.48, 6.49, 6.50, 6.51, 6.52) as well as Proposition 6.53 are ascribed. M-convexity for nonpolyhedral convex functions is considered in Murota–Shioura [156], [157].

The correspondence between positively homogeneous M-convex functions and distance functions (Theorem 6.59) is established for the case of Z in Murota [141] and generalized to the case of R in Murota–Shioura [152]. Proposition 6.56 is stated in Murota [147].

Theorem 6.61 for directional derivatives and subgradients is shown for the case of Z in Murota [140], [141] and generalized to the case of R in Murota–Shioura [152]. Theorem 6.63 (characterizations in terms of directional derivatives, subdifferentials, and minimizers) is by [152], whereas its ramification with integrality (Theorem 6.64) is stated in Murota [147].

The concept of quasi M-convex functions was introduced by Murota–Shioura [154], to which almost all major results in section 6.14 (Theorems 6.67, 6.68, 6.72, 6.75, 6.76, 6.77, 6.78) are ascribed. Exceptions are Theorem 6.74 (characterization of M-convex functions by level sets) by Shioura [191] and Theorem 6.79 (quasi M-minimizer cut with scaling) by Tamura [197]. Zimmermann [221] considers combinatorial optimization problems with quasi-convex objective functions in real variables.

M-convex functions find applications in resource allocation problems (Katoh–Ibaraki [110], Moriguchi–Shioura [134]), mathematical economics (to be treated in Chapter 11), and analysis of polynomial matrices (to be treated in Chapter 12).
Chapter 7

L-Convex Functions

L-convex functions form another class of well-behaved discrete convex functions.


They are defined in terms of an abstract axiom involving submodularity and are
characterized as functions obtained by piecing together L-convex sets in a consistent
way or as collections of submodular set functions with some consistency. Funda-
mental properties of L-convex functions are established in this chapter, including
the local optimality criterion for global optimality, the proximity theorem for mini-
mizers, discrete midpoint convexity, integral convexity, and extensibility to convex
functions. Duality and conjugacy issues are treated in Chapter 8 and algorithms in
Chapter 10.

7.1 L-Convex Functions and L♮-Convex Functions


We recall the definitions of L-convex functions and L♮-convex functions from section 1.4.1.
A function g : Z^V → R ∪ {+∞} with dom g ≠ ∅ is said to be an L-convex function if it satisfies

(SBF[Z])  g(p) + g(q) ≥ g(p ∨ q) + g(p ∧ q)  (∀ p, q ∈ Z^V),
(TRF[Z])  ∃ r ∈ R such that g(p + 1) = g(p) + r  (∀ p ∈ Z^V).

(SBF[Z]) is submodularity on the integer lattice and (TRF[Z]) linearity in the direction of 1. Note that we have r ∈ Z if g is integer valued. Also recall the submodularity inequality

g(p) + g(q) ≥ g(p ∨ q) + g(p ∧ q)  (∀ p, q ∈ Z^V).  (7.1)

We denote by ℒ[Z → R] the set of L-convex functions and by ℒ[Z → Z] the set of integer-valued L-convex functions.
Since an L-convex function g is linear in the direction of 1, we may dispense with this direction as far as we are concerned with its nonlinear behavior. Namely, instead of the function g in n = |V| variables, we may consider the restriction g' to an arbitrarily chosen coordinate plane, say, p(u₀) = 0 for some u₀ ∈ V, where the restriction g' is a function in n − 1 variables defined by

g'(p') = g(0, p')  (p' ∈ Z^{V'})

with the notation V' = V \ {u₀} and (p₀, p') ∈ Z × Z^{V'}. A function derived from an L-convex function by such a restriction is called an L♮-convex function.
More formally, an L♮-convex function is defined as follows. Let 0 denote a new element not in V, and put Ṽ = {0} ∪ V. A function g : Z^V → R ∪ {+∞} is called L♮-convex if the function g̃ : Z^Ṽ → R ∪ {+∞} defined by

g̃(p₀, p) = g(p − p₀1)  ((p₀, p) ∈ Z × Z^V)  (7.2)

is L-convex. We denote by ℒ♮[Z → R] the set of L♮-convex functions and by ℒ♮[Z → Z] the set of integer-valued L♮-convex functions.
It turns out that L♮-convexity can be characterized by a kind of generalized submodularity:

(SBF♮[Z])  g(p) + g(q) ≥ g((p − α1) ∨ q) + g(p ∧ (q + α1))  (∀ p, q ∈ Z^V, ∀ α ∈ Z₊),

which we call translation submodularity. Note that α is restricted to be nonnegative and this inequality for α = 0 agrees with the original submodularity (SBF[Z]).

Theorem 7.1. For a function g : Z^V → R ∪ {+∞} with dom g ≠ ∅, we have

g is L♮-convex ⟺ g satisfies (SBF♮[Z]).

Proof. Let g̃ be defined by (7.2). The submodularity of g̃, i.e.,

g̃(p₀, p) + g̃(q₀, q) ≥ g̃((p₀, p) ∨ (q₀, q)) + g̃((p₀, p) ∧ (q₀, q)),

is translated to a condition

g(p − p₀1) + g(q − q₀1) ≥ g((p ∨ q) − (p₀ ∨ q₀)1) + g((p ∧ q) − (p₀ ∧ q₀)1)

on g. Assuming α = q₀ − p₀ ≥ 0, put p' = p − p₀1 and q' = q − q₀1. Then (p ∨ q) − (p₀ ∨ q₀)1 = (p' − α1) ∨ q' and (p ∧ q) − (p₀ ∧ q₀)1 = p' ∧ (q' + α1). Hence the above inequality is equivalent to (SBF♮[Z]). □

L♮-convex functions are conceptually equivalent to L-convex functions, but the class of L♮-convex functions is larger than that of L-convex functions. The condition (7.3) below is stronger than (SBF♮[Z]) in that it requires the inequality not only for nonnegative α but also for negative α.

Theorem 7.2. An L-convex function g ∈ ℒ[Z → R] satisfies

g(p) + g(q) ≥ g((p − α1) ∨ q) + g(p ∧ (q + α1))  (∀ p, q ∈ Z^V, ∀ α ∈ Z).  (7.3)


Proof. By (SBF[Z]) and (TRF[Z]) we see

g(p) + g(q) = g(p − α1) + g(q) + αr ≥ g((p − α1) ∨ q) + g((p − α1) ∧ q) + αr = g((p − α1) ∨ q) + g(p ∧ (q + α1)),

where the last equality uses (p − α1) ∧ q = (p ∧ (q + α1)) − α1. □

Theorem 7.3. An L-convex function is L♮-convex. Conversely, an L♮-convex function is L-convex if and only if it satisfies (TRF[Z]).

Proof. This follows from Theorem 7.1 and the obvious implications (7.3) ⟹ (SBF♮[Z]) ⟹ (SBF[Z]). □

For ease of reference we summarize the relationship between L and L♮ as

ℒₙ ⊊ ℒₙ♮ ≃ ℒₙ₊₁,  (7.4)

where ℒₙ and ℒₙ♮ denote, respectively, the sets of L-convex functions and L♮-convex functions in n variables, and the expression ℒₙ♮ ≃ ℒₙ₊₁ means a correspondence of their elements (functions) up to the constant r in (TRF[Z]), where (7.2) gives the correspondence under the normalization of r = 0.
By the equivalence between L-convex functions and L♮-convex functions all theorems stated for L-convex functions can be rephrased for L♮-convex functions, and vice versa. In this book we primarily work with L-convex functions, making explicit statements for L♮-convex functions when appropriate.
A set function ρ : 2^V → R ∪ {+∞} can be identified with a function g : Z^V → R ∪ {+∞} with dom g ⊆ {0,1}^V through

g(p) = ρ(X) if p = χ_X for some X ∈ 2^V, and g(p) = +∞ otherwise.  (7.5)

The following states that the submodularity of ρ is the same as the L♮-convexity of g.

Proposition 7.4. Let ρ : 2^V → R ∪ {+∞} be a set function with dom ρ ≠ ∅ and g : Z^V → R ∪ {+∞} be the associated function in (7.5). Then we have

ρ is submodular ⟺ g is L♮-convex.

Proof. (SBF♮[Z]) for α = 0 is equivalent to the submodularity (4.9) of ρ. (SBF♮[Z]) for α ≥ 1 is void for a function g with dom g ⊆ {0,1}^V since (p − α1) ∨ q = q and p ∧ (q + α1) = p for any p, q ∈ {0,1}^V and α ≥ 1. □

The proposition above shows that L♮-convex functions effectively contain submodular set functions as a subclass; i.e.,

𝒮[R] ↪ ℒ♮[Z → R],  𝒮[Z] ↪ ℒ♮[Z → Z],

where ↪ denotes the embedding by (7.5).
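Proposition 7.4 can be illustrated numerically. The sketch below uses a hypothetical submodular ρ (a concave function of |X|), embeds it on {0,1}-vectors via (7.5), and brute-forces translation submodularity (SBF♮[Z]) on a small grid.

```python
# Check that the embedding (7.5) of a submodular set function satisfies
# (SBF♮[Z]): g(p)+g(q) >= g((p - a*1) v q) + g(p ^ (q + a*1)) for a >= 0.
from itertools import product

V = [0, 1, 2]
INF = float("inf")

def rho(X):                      # concave function of |X| -> submodular (hypothetical example)
    return [0.0, 1.0, 1.7, 2.1][len(X)]

def g(p):                        # embedding (7.5): g(chi_X) = rho(X), +inf otherwise
    if all(x in (0, 1) for x in p):
        return rho([v for v in V if p[v] == 1])
    return INF

def vec_max(p, q): return tuple(max(a, b) for a, b in zip(p, q))
def vec_min(p, q): return tuple(min(a, b) for a, b in zip(p, q))

def sbf_natural_holds(g, points, alphas):
    for p, q in product(points, repeat=2):
        for a in alphas:
            pa = tuple(x - a for x in p)          # p - a*1
            qa = tuple(x + a for x in q)          # q + a*1
            if g(p) + g(q) < g(vec_max(pa, q)) + g(vec_min(p, qa)) - 1e-9:
                return False
    return True

points = list(product(range(-1, 3), repeat=3))
print(sbf_natural_holds(g, points, alphas=[0, 1, 2]))   # expected True
```

For α ≥ 1 the check is vacuous on {0,1}-vectors, exactly as the proof of Proposition 7.4 states.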



We mention here the following fundamental fact, showing that submodularity (on the integer lattice) is in fact a local property. The proof is easy and omitted.

Proposition 7.5 (Local submodularity). Let g : Z^V → R ∪ {+∞} be a function with dom g being L♮-convex. Then g satisfies the submodularity inequality (7.1) for all p, q ∈ Z^V if and only if it satisfies (7.1) for all p, q ∈ Z^V with ‖p − q‖∞ ≤ 1.

Note 7.6. With an M♮-concave function h : Z^V → R ∪ {−∞} with dom h = {0,1}^V, we may associate a function g : Z^V → R ∪ {+∞} such that dom g = {0,1}^V and g(p) = h(p) for p ∈ {0,1}^V. Since h is submodular by Theorem 6.19, g is L♮-convex by Proposition 7.4. In this sense we have M♮-concave ⟹ L♮-convex for functions on {0,1}-vectors. The converse does not hold, as is demonstrated by an L♮-convex function g : Z³ → R ∪ {+∞} with dom g = {0,1}³ defined by g(1,1,1) = −2, g(1,1,0) = g(1,0,1) = −1, and g(0,0,0) = g(1,0,0) = g(0,1,0) = g(0,0,1) = g(0,1,1) = 0. Note that h = −∞ and g = +∞ outside {0,1}^V and that any function on {0,1}^V can be extended to a convex function and to a concave function. •
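The counterexample in Note 7.6 can be verified by brute force. The script below checks that the stated set function is submodular (so g is L♮-convex by Proposition 7.4) while it violates the exchange property of M♮-concave set functions; the exchange test encoded here is a sketch of the standard axiom, written out for this illustration.

```python
# Verify Note 7.6: the function below is submodular but not M♮-concave.
from itertools import chain, combinations

V = frozenset({1, 2, 3})
val = {frozenset({1, 2, 3}): -2, frozenset({1, 2}): -1, frozenset({1, 3}): -1}

def h(X):                        # g(1,1,1)=-2, g(1,1,0)=g(1,0,1)=-1, all others 0
    return val.get(frozenset(X), 0)

def subsets(S):
    return [frozenset(c) for c in
            chain.from_iterable(combinations(sorted(S), r) for r in range(len(S) + 1))]

def is_submodular(h, V):
    return all(h(X) + h(Y) >= h(X | Y) + h(X & Y)
               for X in subsets(V) for Y in subsets(V))

def mnat_exchange_holds(h, V):
    # h(X)+h(Y) <= max( h(X-i)+h(Y+i), max_{j in Y\X} h(X-i+j)+h(Y+i-j) )
    for X in subsets(V):
        for Y in subsets(V):
            for i in X - Y:
                cands = [h(X - {i}) + h(Y | {i})]
                cands += [h((X - {i}) | {j}) + h((Y | {i}) - {j}) for j in Y - X]
                if h(X) + h(Y) > max(cands):
                    return False
    return True

print(is_submodular(h, V), mnat_exchange_holds(h, V))   # expected: True False
```

The exchange test fails, e.g., at X = {1}, Y = {2, 3} with i = 1, where every exchange gives a strictly smaller sum.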

7.2 Discrete Midpoint Convexity

We show a characterization of L♮-convexity in terms of discrete midpoint convexity

g(p) + g(q) ≥ g(⌈(p + q)/2⌉) + g(⌊(p + q)/2⌋)  (∀ p, q ∈ Z^V),  (7.7)

which is an obvious approximation to the midpoint convexity (1.3) of ordinary convex functions; see Fig. 7.1. We consider another property for a function g:

(L♮-APR[Z])  g(p) + g(q) ≥ g(p − χ_X) + g(q + χ_X) for p, q ∈ Z^V with supp⁺(p − q) ≠ ∅,

where

X = arg max_{v∈V} {p(v) − q(v)}.

This says that the sum of the function values at a pair of points (p, q) does not increase when the pair is replaced with another pair (p − χ_X, q + χ_X) of closer points.

Theorem 7.7. For a function g : Z^V → R ∪ {+∞} with dom g ≠ ∅, we have

(SBF♮[Z]) ⟺ (L♮-APR[Z]) ⟺ discrete midpoint convexity (7.7).

Hence, each of these is a necessary and sufficient condition for g to be L♮-convex.

Proof. First, Theorem 7.1 shows the equivalence of (SBF♮[Z]) to L♮-convexity.
[(SBF♮[Z]) ⟹ (L♮-APR[Z])]: Suppose that supp⁺(p − q) ≠ ∅ and put α = max_{v∈V}{p(v) − q(v)} − 1. We have α ≥ 0, (p − α1) ∨ q = q + χ_X, and p ∧ (q + α1) = p − χ_X. Hence (L♮-APR[Z]) follows from (SBF♮[Z]).
Figure 7.1. Discrete midpoint convexity.

[(L♮-APR[Z]) ⟹ (7.7)]: Put p'' = ⌈(p + q)/2⌉ and q'' = ⌊(p + q)/2⌋, and note that p + q is invariant under the replacement of (p, q) by (p − χ_X, q + χ_X). We have |p'(v) − q'(v)| ≤ 1 (v ∈ V), supp⁺(p' − q') ⊆ supp⁺(p − q), and supp⁻(p' − q') ⊆ supp⁻(p − q). Starting with (p, q) and applying (L♮-APR[Z]) repeatedly, we obtain g(p) + g(q) ≥ g(p') + g(q'). Applying (L♮-APR[Z]) to (p', q') yields g(p') + g(q') ≥ g(p'') + g(q''). Hence follows g(p) + g(q) ≥ g(p'') + g(q'').
[(7.7) ⟹ (SBF♮[Z])]: (SBF♮[Z]) for g is equivalent to the submodularity of g̃ in (7.2) (cf. proof of Theorem 7.1). Since dom g is an L♮-convex set by (7.7) and (5.15), dom g̃ is an L-convex set. By Proposition 7.5, the submodularity of g̃ is equivalent to the local submodularity of g̃, and the latter holds if and only if two inequalities hold for all p ∈ Z^V and X, Y ⊆ V with X ∩ Y = ∅; these two conditions follow easily from discrete midpoint convexity (7.7). □
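As a numerical illustration of Theorem 7.7, the sketch below checks discrete midpoint convexity (7.7) for a hypothetical L♮-convex function, built as a separable convex part plus a convex function of a difference.

```python
# Brute-force check of discrete midpoint convexity (7.7):
#   g(p) + g(q) >= g(ceil((p+q)/2)) + g(floor((p+q)/2)).
import math
from itertools import product

def g(p):                        # hypothetical L♮-convex example
    p1, p2 = p
    return (p1 - p2) ** 2 + (p1 - 2) ** 2 + (p2 + 1) ** 2

def midpoint_convex(g, points):
    for p, q in product(points, repeat=2):
        up = tuple(math.ceil((a + b) / 2) for a, b in zip(p, q))    # ceil midpoint
        dn = tuple(math.floor((a + b) / 2) for a, b in zip(p, q))   # floor midpoint
        if g(p) + g(q) < g(up) + g(dn):
            return False
    return True

points = list(product(range(-3, 4), repeat=2))
print(midpoint_convex(g, points))   # expected True
```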

7.3 Examples
We have already seen L-convexity in network flows and in matroids (section 2.2,
section 2.4). In this section we see some other examples of L-convex functions, such
as linear functions, quadratic functions, and separable convex functions.
First we note the following facts.

Proposition 7.8.
(1) The effective domain of an L-convex function is an L-convex set.
(2) The effective domain of an L♮-convex function is an L♮-convex set.

Proof. (1) (SBF[Z]) and (TRF[Z]) for g imply (SBS[Z]) and (TRS[Z]) for D = dom g. (2) Similarly, (SBF♮[Z]) for g implies (SBS♮[Z]) for D = dom g. □

Linear functions A linear (or affine) function⁴⁸

g(p) = ⟨x, p⟩ + α = Σ_{i=1}^n x(i)p(i) + α

with x ∈ Rⁿ and α ∈ R is L-convex or L♮-convex according as dom g is L-convex or L♮-convex.

Quadratic functions A quadratic function

g(p) = Σ_{i=1}^n Σ_{j=1}^n a_ij p(i)p(j)

with a_ij = a_ji ∈ R (i, j = 1,…,n) is L♮-convex if and only if

a_ij ≤ 0 (i ≠ j),  Σ_{j=1}^n a_ij ≥ 0 (i = 1,…,n),  (7.10)

which can be proved as in Theorem 2.7. Accordingly, g is L-convex if and only if

a_ij ≤ 0 (i ≠ j),  Σ_{j=1}^n a_ij = 0 (i = 1,…,n).  (7.11)

In Example 2.1 (Poisson equation) and Example 2.2 (electrical network), we have seen matrices that satisfy (7.10) and (7.11), respectively.
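The quadratic criterion can be exercised numerically. Below, a hypothetical matrix A with nonpositive off-diagonal entries and nonnegative row sums, the pattern of condition (7.10), is checked for translation submodularity (SBF♮[Z]) by brute force on a small grid.

```python
# Quadratic g(p) = sum_ij a_ij p(i) p(j) with the sign pattern of (7.10)
# should satisfy (SBF♮[Z]).
from itertools import product

A = [[2, -1, 0],
     [-1, 3, -1],
     [0, -1, 2]]       # a_ij <= 0 (i != j), row sums >= 0

def g(p):
    return sum(A[i][j] * p[i] * p[j] for i in range(3) for j in range(3))

def check_sbf_natural(g, points, alphas):
    for p, q in product(points, repeat=2):
        for a in alphas:
            x = tuple(max(pi - a, qi) for pi, qi in zip(p, q))   # (p - a*1) v q
            y = tuple(min(pi, qi + a) for pi, qi in zip(p, q))   # p ^ (q + a*1)
            if g(p) + g(q) < g(x) + g(y):
                return False
    return True

assert all(A[i][j] <= 0 for i in range(3) for j in range(3) if i != j)
assert all(sum(row) >= 0 for row in A)
print(check_sbf_natural(g, list(product(range(-2, 3), repeat=3)), [0, 1, 2]))   # expected True
```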

Separable convex functions A separable convex function

g(p) = Σ_{i=1}^n g_i(p(i))

with univariate discrete convex functions g_i ∈ 𝒞[Z → R] (i = 1,…,n) is L♮-convex if dom g is an L♮-convex set.⁴⁹ In particular, a separable convex function with a chain condition is L♮-convex. For g_ij ∈ 𝒞[Z → R] (i ≠ j; i, j = 1,…,n), the function

g(p) = Σ_{i≠j} g_ij(p(i) − p(j))

is L-convex if dom g is an L-convex set. As special cases of (7.12) and (7.14) we see the following.

⁴⁸ In this section, V = {1,…,n}, g denotes a real-valued function in integer variables, i.e., g : Zⁿ → R ∪ {+∞}, and p(i) is the ith component of an integer vector p = (p(i) | i = 1,…,n) ∈ Zⁿ.
⁴⁹ It is easy to verify discrete midpoint convexity (7.7) for g.

Proposition 7.9. Let ψ ∈ 𝒞[Z → R] be a univariate discrete convex function.
(1) ψ is L♮-convex.
(2) The function g : Z² → R ∪ {+∞} defined by g(p) = ψ(p(1) − p(2)) is L-convex.

Maximum-component functions The function

g(p) = max{p(v) | v ∈ V}  (p ∈ Z^V),

which gives the maximum value of the components of p, is L-convex.
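A direct check of the two L-convexity axioms for the maximum-component function, with r = 1 in (TRF[Z]):

```python
# g(p) = max_v p(v) satisfies (SBF[Z]) and (TRF[Z]) with r = 1.
from itertools import product

def g(p):
    return max(p)

points = list(product(range(-2, 3), repeat=3))

# (TRF[Z]): g(p + 1) = g(p) + 1
trf_ok = all(g(tuple(x + 1 for x in p)) == g(p) + 1 for p in points)

# (SBF[Z]): g(p) + g(q) >= g(p v q) + g(p ^ q)
sbf_ok = all(
    g(p) + g(q) >= g(tuple(map(max, p, q))) + g(tuple(map(min, p, q)))
    for p, q in product(points, repeat=2)
)
print(trf_ok, sbf_ok)   # expected: True True
```

Submodularity holds here with equality or slack because g(p ∨ q) = max(g(p), g(q)) while g(p ∧ q) ≤ min(g(p), g(q)).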


Multimodular functions A function h : Zⁿ → R ∪ {+∞} is said to be multimodular if the function g : Z^{n+1} → R ∪ {+∞} defined by

g(p₀, p) = h(p(1) − p₀, p(2) − p(1), …, p(n) − p(n−1))  (p₀ ∈ Z, p ∈ Zⁿ)  (7.16)

is submodular. This means that a function h : Zⁿ → R ∪ {+∞} is multimodular if and only if it can be represented as

h(x) = g(x(1), x(1) + x(2), …, x(1) + ⋯ + x(n))  (x ∈ Zⁿ)  (7.17)

for some L♮-convex function g.

Submodular set functions Any submodular set function may be regarded as an L♮-convex function by Proposition 7.4.
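The transformation (7.16) can be tested numerically. Starting from a hypothetical L♮-convex quadratic g, the sketch builds h by cumulative sums and verifies that the associated function of (p₀, p) in (7.16) is submodular on a small grid.

```python
# Multimodularity via (7.16): h(x) = g(cumulative sums of x) for L♮-convex g
# makes G(p0, p) = h(p(1)-p0, p(2)-p(1), ...) submodular.
from itertools import product, accumulate

def g(p):                        # hypothetical L♮-convex quadratic (sign pattern of (7.10))
    p1, p2 = p
    return (p1 - p2) ** 2 + p1 * p1 + p2 * p2

def h(x):                        # candidate multimodular function
    return g(tuple(accumulate(x)))

def G(r):                        # transform (7.16), r = (p0, p(1), p(2))
    p0, p = r[0], r[1:]
    diffs = [p[0] - p0] + [p[i] - p[i - 1] for i in range(1, len(p))]
    return h(tuple(diffs))

points = list(product(range(-2, 3), repeat=3))
ok = all(
    G(a) + G(b) >= G(tuple(map(max, a, b))) + G(tuple(map(min, a, b)))
    for a, b in product(points, repeat=2)
)
print(ok)   # expected True
```

Unwinding the definitions gives G(p₀, p) = g(p − p₀1), which is submodular precisely because g is L♮-convex (cf. (7.2)).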

7.4 Basic Operations

Basic operations on L-convex functions are presented here, whereas the most important operation, transformation by networks, is treated later in section 9.6.
L-convex functions admit the following operations. See (6.41) for the definition of the projection g^U.

Theorem 7.10. Let g, g₁, g₂ ∈ ℒ[Z → R] be L-convex functions.
(1) For λ ∈ R₊₊, λg is L-convex.
(2) For a ∈ Z^V and β ∈ Z \ {0}, g(a + βp) is L-convex in p.
(3) For x ∈ R^V, g[−x] is L-convex.
(4) For U ⊆ V, the projection g^U is L-convex provided g^U > −∞.
(5) For ψ_v ∈ 𝒞[Z → R] (v ∈ V),

g̃(p) = inf { g(q) + Σ_{v∈V} ψ_v(p(v) − q(v)) | q ∈ Z^V }  (7.18)

is L-convex provided g̃ > −∞.
(6) The sum g₁ + g₂ is L-convex provided dom (g₁ + g₂) ≠ ∅.

Proof. (1), (2), (3), and (6) are obvious.
(5) (TRF[Z]) is easy to see. For (SBF[Z]) we indicate the idea by assuming that, for each p₁, p₂ ∈ dom g̃, the infimum in (7.18) is attained by some q₁, q₂ ∈ Z^V. Proposition 7.9 (2) and (SBF[Z]) for g respectively yield two inequalities; on the other hand, (7.18) bounds g̃(p₁ ∨ p₂) + g̃(p₁ ∧ p₂) from above by the corresponding sums with q₁ ∨ q₂ and q₁ ∧ q₂. Adding these inequalities yields (SBF[Z]) for g̃.
(4) A special case of (5) with ψ_v = δ_{{0}} (v ∈ U) and ψ_v = δ_Z (v ∈ V \ U) shows the L-convexity of the resulting function g̃, where δ_{{0}} is the indicator function of {0}. By (6.45), g^U is the restriction of g̃ to U. Then (SBF[Z]) of g^U is immediate from that of g̃, and (TRF[Z]) of g^U follows from that of g̃ since g̃(p + χ_V) = g̃(p + χ_U). □

Note in Theorem 7.10 (2) that we have a scaling factor β, in contrast to the similar statement (Theorem 6.13 (2)) for M-convex functions. Also note that g̃ in Theorem 7.10 (5) is the infimal convolution of g with a separable convex function.
Operations in Theorem 7.10 are also valid for L♮-convex functions. In addition, restrictions are allowed for L♮-convex functions. See (3.55) and (6.40) for the definitions of the restrictions g_{[a,b]} and g_U.

Theorem 7.11. Let g, g₁, g₂ ∈ ℒ♮[Z → R] be L♮-convex functions.
(1) Operations (1)–(6) of Theorem 7.10 are valid for L♮-convex functions.
(2) For a, b ∈ (Z ∪ {±∞})^V, the restriction g_{[a,b]} to the integer interval [a, b] is L♮-convex provided dom g_{[a,b]} ≠ ∅.
(3) For U ⊆ V, the restriction g_U is L♮-convex provided dom g_U ≠ ∅.

Proof. (2), (3) It is easy to verify (SBF♮[Z]) for g_{[a,b]} and g_U. □

Note 7.12. The infimal convolution of two L-convex functions is not necessarily L-convex, and similarly for the infimal convolution of two L♮-convex functions. Such functions are studied in section 8.3 under the names L₂-convex functions and L₂♮-convex functions, respectively. •

Note 7.13. The proviso g^U > −∞ in Theorem 7.10 (4) can be weakened to g^U(p₀) > −∞ for some p₀, and similarly for g̃ > −∞ in Theorem 7.10 (5). •

7.5 Minimizers

Global optimality for an L-convex function is characterized by local optimality.

Theorem 7.14 (L-optimality criterion).
(1) For an L-convex function g ∈ ℒ[Z → R] and p ∈ dom g, we have

g(p) ≤ g(q) (∀ q ∈ Z^V) ⟺ r = 0 and g(p) ≤ g(p + χ_X) (∀ X ⊆ V).  (7.19)

(2) For an L♮-convex function g ∈ ℒ♮[Z → R] and p ∈ dom g, we have

g(p) ≤ g(q) (∀ q ∈ Z^V) ⟺ g(p) ≤ g(p + χ_X) and g(p) ≤ g(p − χ_X) (∀ X ⊆ V).  (7.20)

Proof. It suffices to prove ⇐ in (1) and (2). We first consider (2). For any disjoint Y, Z ⊆ V, condition (7.20) together with submodularity yields an inequality that implies the optimality criterion (3.65) for integrally convex functions. Since an L♮-convex function is integrally convex (to be shown in Theorem 7.20), Theorem 3.21 establishes ⇐ in (2). Next, (1) follows from (2), since an L-convex function is L♮-convex (Theorem 7.3) and the right-hand side of (7.19) implies g(p − χ_Y) = g(p + χ_{V\Y}) ≥ g(p). □

The well-known optimality criterion for a submodular set function is an immediate corollary of (2) above.

Theorem 7.15. Let ρ be a submodular set function. A subset X ∈ dom ρ is a minimizer of ρ if and only if ρ(X) ≤ ρ(Y) for any Y that includes X or is included in X.

Proof. Let g be the L♮-convex function associated with ρ (see (7.5) and Proposition 7.4), and apply Theorem 7.14 (2) with p = χ_X. □

Although Theorem 7.14 affords a local criterion for global optimality of a point p, a straightforward verification of (7.19) requires O(2ⁿ) function evaluations. The verification can be done in polynomial time as follows. We consider a submodular set function ρ_p defined by ρ_p(Y) = g(p + χ_Y) − g(p) and note that (7.19) is equivalent to saying that ρ_p achieves its minimum at Y = ∅. This condition can be verified in polynomial time by the submodular function minimization algorithms in section 10.2.
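Theorem 7.14 (2) can be illustrated by brute force: every point satisfying the local condition (7.20) attains the global minimum. The quadratic below is a hypothetical L♮-convex example.

```python
# Every local minimizer (in the sense of condition (7.20)) of an
# L♮-convex function is a global minimizer.
from itertools import product

def g(p):                        # hypothetical L♮-convex quadratic
    p1, p2 = p
    return (p1 - p2) ** 2 + (p1 - 3) ** 2 + (p2 + 1) ** 2

def is_local_min(p):             # (7.20): g(p) <= g(p + chi_X) and g(p) <= g(p - chi_X)
    for sign in (1, -1):
        for X in product((0, 1), repeat=len(p)):
            q = tuple(a + sign * x for a, x in zip(p, X))
            if g(q) < g(p):
                return False
    return True

box = list(product(range(-5, 8), repeat=2))
global_min = min(g(p) for p in box)
local_minima = [p for p in box if is_local_min(p)]
print(all(g(p) == global_min for p in local_minima))   # expected True
```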
The minimizers of an L-convex function form an L-convex set, a property that is essential for a function to be L-convex.

Proposition 7.16. For an L-convex function g ∈ ℒ[Z → R], arg min g is an L-convex set if it is not empty.

Proof. Suppose D = arg min g is nonempty. We have r = 0 in (TRF[Z]) and hence D satisfies (TRS[Z]). For p, q ∈ D, we have p ∨ q, p ∧ q ∈ D by (SBF[Z]). This shows (SBS[Z]) for D. □

The following theorem reveals that L-convex functions are characterized as


functions obtained by piecing together L-convex sets in a consistent way. This
shows how the concept of L-convex functions can be defined from that of L-convex
sets.

Theorem 7.17. Let g : Z^V → R ∪ {+∞} be a function with a bounded nonempty effective domain.
(1) g is L-convex ⟺ arg min g[−x] is an L-convex set for each x ∈ R^V.
(2) g is L♮-convex ⟺ arg min g[−x] is an L♮-convex set for each x ∈ R^V.

Proof. It suffices to prove (1). The implication ⟹ is immediate from Theorem 7.10 (3) and Proposition 7.16. The converse is shown later in Note 7.47. □

7.6 Proximity Theorem

We show a proximity theorem for L-convex function minimization, stating that a global optimum of an L-convex function g exists in a neighborhood of a local optimum of its scaling g^α defined by g^α(p) = g(αp)/α. Note that g^α is L-convex by Theorem 7.10 (2), and accordingly, a local minimizer of g^α is a global minimizer of g^α by the L-optimality criterion (Theorem 7.14).

Theorem 7.18 (L-proximity theorem). Assume α ∈ Z₊₊ and n = |V|.
(1) Let g : Z^V → R ∪ {+∞} be an L-convex function with g(p) = g(p + 1) (∀ p ∈ Z^V). If p^α ∈ dom g satisfies

g(p^α) ≤ g(p^α + αχ_X)  (∀ X ⊆ V),  (7.21)

then arg min g ≠ ∅ and there exists p* ∈ arg min g with

p^α ≤ p* ≤ p^α + (n − 1)(α − 1)1.  (7.22)

(2) Let g : Z^V → R ∪ {+∞} be an L♮-convex function. If p^α ∈ dom g satisfies

g(p^α) ≤ g(p^α + αχ_X) and g(p^α) ≤ g(p^α − αχ_X)  (∀ X ⊆ V),  (7.23)

then arg min g ≠ ∅ and there exists p* ∈ arg min g with

‖p^α − p*‖∞ ≤ n(α − 1).  (7.24)
Proof. (1) It suffices to show that, for any β > inf g, there exists p* that satisfies g(p*) ≤ β and (7.22); note that there exist only a finite number of p* satisfying (7.22). We may assume p^α = 0. By (TRF[Z]) with r = 0 there exists p* such that g(p*) ≤ β and p* ≥ 0; let p* be minimal (with respect to the order ≥) among such vectors. We have p*(v) = 0 for some v ∈ V and

g(p* − χ_X) > g(p*) for any nonempty X ⊆ supp⁺(p*).  (7.25)

We can represent p* as p* = Σ_{i=1}^k μ_i χ_{X_i}, where μ_i ∈ Z₊₊ (i = 1,…,k), ∅ ≠ X₁ ⊊ X₂ ⊊ ⋯ ⊊ X_k ⊊ V, and 0 ≤ k ≤ n − 1.
Claim 1: g(p + χ_{X_j}) < g(p) for p = Σ_{i=1}^{j−1} μ_i χ_{X_i} + νχ_{X_j} ∈ dom g with 0 ≤ ν < μ_j.
(Proof of Claim 1) Put p = Σ_{i=1}^{j−1} μ_i χ_{X_i} + νχ_{X_j} and suppose p ∈ dom g. By X_j ⊆ supp⁺(p*) and (7.25) we have g(p* − χ_{X_j}) > g(p*). Since X_j = arg max_{v∈V}{p*(v) − p(v)}, (L♮-APR[Z]) shows that g(p* − χ_{X_j}) > g(p*) ⟹ g(p + χ_{X_j}) < g(p). Note that g satisfies (L♮-APR[Z]) by Theorem 7.7.
Claim 2: g(q) > g(q + χ_{X_j}) for q = νχ_{X_j} ∈ dom g with 0 ≤ ν < μ_j.
(Proof of Claim 2) Put p = Σ_{i=1}^{j} μ_i χ_{X_i} and q = νχ_{X_j} and suppose q ∈ dom g. Since V \ X_j = arg max_{v∈V}{q(v) − p(v)} and g(p + χ_{V\X_j}) = g(p − χ_{X_j}) > g(p) by Claim 1, (L♮-APR[Z]) implies g(q) > g(q − χ_{V\X_j}) = g(q + χ_{X_j}).
It follows from Claim 2 and (7.21) that μ_i < α for i = 1,…,k, and hence p* = Σ_{i=1}^k μ_i χ_{X_i} ≤ (n − 1)(α − 1)1, establishing (7.22).
(2) This follows from (1) applied to g̃ in (7.2). □

The algorithmic use of the above theorem is shown in sections 10.3.2 and 10.4.5.
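The proximity theorem suggests a scaling algorithm: descend with step width α, halve α, and repeat; at α = 1 the L-optimality criterion (Theorem 7.14 (2)) certifies a global minimum. The sketch below is a naive version with brute force over subsets X and a hypothetical objective, not the refined algorithms of Chapter 10.

```python
# Naive proximity-scaling descent for a small L♮-convex function.
from itertools import product

def g(p):                        # hypothetical L♮-convex quadratic
    p1, p2 = p
    return (p1 - p2) ** 2 + (p1 - 12) ** 2 + (p2 + 7) ** 2

def best_step(p, alpha):         # best move among p ± alpha*chi_X (brute force)
    cands = [p]
    for sign in (1, -1):
        for X in product((0, 1), repeat=len(p)):
            cands.append(tuple(a + sign * alpha * x for a, x in zip(p, X)))
    return min(cands, key=g)

def minimize(p, alpha0=16):
    alpha = alpha0
    while alpha >= 1:
        while True:              # steepest descent at the current scale
            q = best_step(p, alpha)
            if g(q) >= g(p):
                break
            p = q
        alpha //= 2
    return p                     # local opt at alpha = 1  ->  global opt

p_star = minimize((0, 0))
box = list(product(range(-30, 31), repeat=2))
print(g(p_star) == min(g(p) for p in box))   # expected True
```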

7.7 Convex Extension

This section establishes one of the major properties of L-convex functions, which is that they can be extended to convex functions in real variables. The extensibility to convex functions is by no means obvious from the definition of L-convex functions in terms of axioms referring only to function values on integer points. The convex extension of an L-convex function can be obtained by piecing together the Lovász extensions.
With a function g : Z^V → R ∪ {+∞} and a point p ∈ dom g we associate a set function ρ_{g,p} : 2^V → R ∪ {+∞} defined by

ρ_{g,p}(X) = g(p + χ_X) − g(p)  (X ∈ 2^V).  (7.26)

If g is L-convex, the associated set function ρ_{g,p} belongs to 𝒮[R] (i.e., ρ_{g,p} is submodular, ρ_{g,p}(∅) = 0, and ρ_{g,p}(V) < +∞).
The next theorem shows that an L-convex function g can be extended to a convex function, and that the convex extension can be constructed from the Lovász extension of ρ_{g,p} for varying p ∈ Z^V.

Theorem 7.19. Let g ∈ ℒ[Z → R] be an L-convex function and ḡ be its convex closure.
(1) For p ∈ dom g and q ∈ [0,1]_R, we have

ḡ(p + q) = g(p) + ρ̂_{g,p}(q),  (7.27)

where ρ̂_{g,p} denotes the Lovász extension (4.6) of the associated set function ρ_{g,p} in (7.26), q₁ > q₂ > ⋯ > q_m are the distinct values of the components of q, and

U_i = U_i(q) = {v ∈ V | q(v) ≥ q_i}  (i = 1,…,m).  (7.28)

(2) For p ∈ Z^V and q ∈ [0,1]_R, we have⁵⁰

ḡ(p + q) = (1 − q₁) g(p) + Σ_{i=1}^m (q_i − q_{i+1}) g(p + χ_{U_i})  (q_{m+1} = 0).  (7.29)

Proof. (2) For each p ∈ Z^V, let h_p(q) denote the function in q ∈ [0,1]_R defined by the right-hand side of (7.29). If p ∈ dom g, ρ_{g,p} belongs to 𝒮[R], and hence h_p is a polyhedral convex function by Theorem 4.16. With the representation of a real vector s = p + q with p = ⌊s⌋ and q = s − ⌊s⌋ we define a function h : R^V → R ∪ {+∞} by h(s) = h_p(q), where U₀ = ∅ and U_i = U_i(q) for i = 1,…,m, as in (7.28). By construction, h is convex in [p, p + 1]_R for each p ∈ Z^V. Furthermore, it is convex in the entire space R^V, because h(s − α1) = h(s) − αr for s ∈ R^V and α ∈ R by (TRF[Z]) of g, and for each s ∈ R^V there exist α ∈ R and p ∈ Z^V such that s − α1 is an interior point of [p, p + 1]_R. Obviously, we have h(p) = g(p) for every p ∈ Z^V, and h is the maximum among convex functions with this property. Therefore, h = ḡ.
(1) If p ∈ dom g, the right-hand side of (7.29) can be rewritten as (7.27).
(3) This is a special case of (7.29) with q = 0.
(4) This is immediate from (7.29) and (TRF[Z]).
(5) For q ∈ N₀ we put α = min_{v∈V} q(v). By (4), ḡ(p + q) = ḡ(p + (q − α1)) + αr. We can apply (7.27) to ḡ(p + (q − α1)) since q − α1 ∈ [0,1]_R. □

⁵⁰ Expression (7.29) does not involve ∞ − ∞ even for p ∉ dom g.
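The Lovász extension (4.6), which drives the convex extension formula (7.27), is straightforward to implement. The set function ρ below is a hypothetical submodular example; at a 0-1 vector χ_X the extension returns ρ(X) (cf. (4.7)).

```python
# Lovász extension: for q with distinct values q_1 > ... > q_m >= 0 and
# level sets U_i = {v : q(v) >= q_i},
#   rho^(q) = sum_i (q_i - q_{i+1}) * rho(U_i),  with q_{m+1} = 0.
def lovasz(rho, q):
    vals = sorted(set(q), reverse=True)          # q_1 > q_2 > ... > q_m
    total = 0.0
    for i, qi in enumerate(vals):
        U = frozenset(v for v in range(len(q)) if q[v] >= qi)
        q_next = vals[i + 1] if i + 1 < len(vals) else 0.0
        total += (qi - q_next) * rho(U)
    return total

def rho(U):                                      # submodular: concave function of |U|
    return [0.0, 2.0, 3.0, 3.5][len(U)]

print(lovasz(rho, (1, 0, 1)))   # expected 3.0 (= rho({0, 2}))
```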

The above theorem implies the integral convexity of an L-convex function and hence that of an L♮-convex function.

Theorem 7.20. An L♮-convex function is integrally convex. In particular, an L♮-convex function is convex extensible.

An integrally convex function with submodularity (SBF[Z]) is called a submodular integrally convex function. This turns out to be a synonym of L♮-convex function.

Theorem 7.21. For a function g : Z^V → R ∪ {+∞} with dom g ≠ ∅,

g is L♮-convex ⟺ g is submodular integrally convex.

Proof. The implication ⟹ follows from (SBF[Z]) and Theorem 7.20. The converse can be shown as follows. By the integral convexity of g, the convex closure ḡ coincides with the local convex extension, and, by the submodularity of g, the latter is obtained as the Lovász extension (7.27) of ρ_{g,p} in (7.26). Therefore, we have

ḡ((p + q)/2) = (g(⌈(p + q)/2⌉) + g(⌊(p + q)/2⌋))/2

for any p, q ∈ Z^V. On the other hand, we have

ḡ((p + q)/2) ≤ (g(p) + g(q))/2

from convex extensibility and midpoint convexity (1.3). From these follows the discrete midpoint convexity (7.7) of g, which means L♮-convexity by Theorem 7.7. □

Note 7.22. Submodularity (SBF[Z]) alone does not guarantee convex extensibility. For example, any function g : Z² → R ∪ {+∞} with dom g = {p ∈ Z² | p(1) = p(2)} is submodular. •

7.8 Polyhedral L-Convex Functions

As we have seen, L-convex functions on the integer lattice can be extended to convex functions in real variables. The convex extension of an L-convex function is a polyhedral convex function when restricted to a finite interval. Motivated by this we define here the concept of L-convexity for polyhedral convex functions in general and show that major properties of L-convex functions survive in this generalization.
A polyhedral convex function g : R^V → R ∪ {+∞} with dom_R g ≠ ∅ is said to be L-convex if it satisfies

(SBF[R])  g(p) + g(q) ≥ g(p ∨ q) + g(p ∧ q)  (∀ p, q ∈ R^V),
(TRF[R])  ∃ r ∈ R such that g(p + α1) = g(p) + αr  (∀ p ∈ R^V, ∀ α ∈ R).

(SBF[R]) is submodularity on the real-vector lattice and (TRF[R]) is linearity in the direction of 1. We denote by ℒ[R → R] the set of polyhedral L-convex functions. Polyhedral L-concave functions are defined in an obvious way.
Submodularity (SBF[R]) is in fact a local property. Under auxiliary conditions on the effective domain, it is implied by submodularity for a certain set of local pairs of p and q. The following two propositions are mentioned here; the proofs are straightforward and omitted (see Theorems 4.26 and 4.27 of Murota-Shioura [152]).

Proposition 7.23. Let g : R^V → R ∪ {+∞} be a function with dom_R g a closed set. Then g is submodular (SBF[R]) if, for each p₀ ∈ dom_R g, there exists ε = ε(p₀) > 0 such that inequality (7.1) is satisfied for all p, q with ‖p − p₀‖∞ ≤ ε and ‖q − p₀‖∞ ≤ ε.

Proposition 7.24. Let g : R^V → R ∪ {+∞} be a function with dom_R g an interval.
(1) g is submodular (SBF[R]) if

g(p + λχ_u) + g(p + μχ_v) ≥ g(p) + g(p + λχ_u + μχ_v)  (7.30)

for all p ∈ dom_R g; u, v ∈ V with u ≠ v; and λ, μ ∈ R₊.
(2) g is submodular (SBF[R]) if (7.30) holds for all p ∈ dom_R g; u, v ∈ V with u ≠ v; λ ∈ [0, p_{i−1} − p_i]_R; and μ ∈ [0, p_{j−1} − p_j]_R, where p₁ > p₂ > ⋯ > p_m denote the distinct values of the components of p, the indices i and j are such that p(u) = p_i and p(v) = p_j, and [0, p₀ − p₁]_R means [0, +∞)_R by convention.

The Lovász extension ρ̂ of a submodular set function ρ is a polyhedral L-convex function.

Proposition 7.25. For ρ ∈ 𝒮[R] we have ρ̂ ∈ ℒ[R → R].

Proof. (TRF[R]) for g = ρ̂ is obvious, and we show (SBF[R]) below.
First, assume ρ is finite valued. Then dom_R ρ̂ = R^V. By Proposition 7.24 (2), (SBF[R]) follows from inequality (7.30) for u ∈ U_i \ U_{i−1}, v ∈ U_j \ U_{j−1}, λ ∈ [0, p_{i−1} − p_i]_R, and μ ∈ [0, p_{j−1} − p_j]_R, where U_i (i = 1,…,m) are defined in (4.4) and U₀ = ∅. Expression (4.6) shows how each term of (7.30) is evaluated. If i ≠ j, (7.30) is verified directly, and, if i = j, we may assume λ ≥ μ and then (7.30) follows from the submodularity of ρ. Hence follows (7.30).
The general case where ρ may possibly take the value of +∞ can be reduced to the finite-valued case. For each k ∈ Z₊₊, we define ρ_k by

ρ_k(X) = min_{Y⊆X} { ρ(Y) + k|X \ Y| }

and put g_k = ρ̂_k. We have ρ_k < +∞ and ρ_k ∈ 𝒮[R] and therefore (SBF[R]) for g_k. Since ρ(X) = lim_{k→∞} ρ_k(X) (∀ X ⊆ V), we have g(p) = lim_{k→∞} g_k(p) for each p ∈ R^V. Hence follows (SBF[R]) for g. □

An L-convex function on integer points naturally induces a polyhedral L-convex function via convex extension (which exists by Theorem 7.20).

Theorem 7.26. The convex extension ḡ of an L-convex function g ∈ ℒ[Z → R] on the integer lattice is a polyhedral L-convex function, i.e., ḡ ∈ ℒ[R → R], provided that ḡ is polyhedral.

Proof. We use (7.27) in Theorem 7.19. (TRF[R]) is obvious. By Proposition 7.25, ρ̂_{g,p} is submodular on R^V, and hence ḡ is submodular on [p, p + 1]_R for each p ∈ Z^V. The submodularity (SBF[R]) of ḡ on R^V follows from this. □

Example 7.27. The convex extension ḡ of an L-convex function g ∈ ℒ[Z → R] may consist of an infinite number of linear pieces, in which case ḡ is not polyhedral convex. For example, we have g ∈ ℒ[Z → R] and ḡ ∉ ℒ[R → R] for g : Z² → R defined by g(p) = (p(1) − p(2))². It is worth noting that, if dom_Z g is bounded, ḡ is polyhedral and therefore ḡ ∈ ℒ[R → R] by Theorem 7.26. •

Polyhedral L-convex functions with integrality (6.75) are referred to as integral polyhedral L-convex functions, the set of which is denoted by ℒ[Z|R → R]. Polyhedral L-convex functions with dual integrality (6.76) are referred to as dual-integral polyhedral L-convex functions, the set of which is denoted by ℒ[R → R|Z].
By Theorem 7.26 and Proposition 7.16 an integral polyhedral L-convex function is nothing but a polyhedral L-convex function that can be obtained as the convex extension of an L-convex function on integer points. Therefore, we have

ℒ[Z|R → R] ≃ {g ∈ ℒ[Z → R] : ḡ is polyhedral},  ℒ[Z|R → R] ↪ ℒ[Z → R],

where the second expression means that there exists an injection from ℒ[Z|R → R] to ℒ[Z → R], representing an embedding of ℒ[Z|R → R] into ℒ[Z → R].
The effective domain of a polyhedral L-convex function is an L-convex polyhedron, which is homogeneous in direction 1. Hence, polyhedral L♮-convex functions can be defined as the restriction of polyhedral L-convex functions, just as L♮-convex functions on integer points are defined from L-convex functions via (7.2). We denote by ℒ♮[R → R] the set of polyhedral L♮-convex functions and by ℒ♮[Z|R → R] the set of integral polyhedral L♮-convex functions. The relationship between L and L♮ is described by ℒₙ ⊊ ℒₙ♮ ≃ ℒₙ₊₁, where ℒₙ and ℒₙ♮ denote, respectively, the sets of polyhedral L-convex functions and polyhedral L♮-convex functions in n variables.
The R-counterpart of (SBF♮[Z]) is the following:

(SBF♮[R])  g(p) + g(q) ≥ g((p − α1) ∨ q) + g(p ∧ (q + α1))  (∀ p, q ∈ R^V, ∀ α ∈ R₊).

Theorem 7.28. For a polyhedral convex function g : R^V → R ∪ {+∞} with dom_R g ≠ ∅, (SBF♮[R]) is equivalent to polyhedral L♮-convexity.

The condition (7.32) below is stronger than (SBF♮[R]) in that it requires the inequality not only for nonnegative α but also for negative α.

Theorem 7.29. A polyhedral L-convex function g ∈ ℒ[R → R] satisfies

g(p) + g(q) ≥ g((p − α1) ∨ q) + g(p ∧ (q + α1))  (∀ p, q ∈ R^V, ∀ α ∈ R).  (7.32)

Theorem 7.30. A polyhedral L-convex function is polyhedral L♮-convex. Conversely, a polyhedral L♮-convex function is polyhedral L-convex if and only if it satisfies (TRF[R]).

Almost all properties of L-convex functions on integer points carry over to polyhedral L-convex functions. To be specific, Theorems 7.10, 7.11, and 7.14 and Proposition 7.16 are adapted as follows. Note, however, that the proofs are not straightforward adaptations; see Murota-Shioura [152].

Theorem 7.31. Let g, g₁, g₂ ∈ ℒ[R → R] be polyhedral L-convex functions.
(1) For λ ∈ R₊₊, λg is polyhedral L-convex.
(2) For a ∈ R^V and β ∈ R \ {0}, g(a + βp) is polyhedral L-convex in p.
(3) For x ∈ R^V, g[−x] is polyhedral L-convex.
(4) For U ⊆ V, the projection g^U is polyhedral L-convex provided g^U > −∞.
(5) For ψ_v ∈ 𝒞[R → R] (v ∈ V),

g̃(p) = inf { g(q) + Σ_{v∈V} ψ_v(p(v) − q(v)) | q ∈ R^V }  (7.33)

is polyhedral L-convex provided g̃ > −∞.
(6) The sum g₁ + g₂ is polyhedral L-convex provided dom (g₁ + g₂) ≠ ∅.

Theorem 7.32. Let g, g₁, g₂ ∈ ℒ♮[R → R] be polyhedral L♮-convex functions.
(1) Operations (1)–(6) of Theorem 7.31 are valid for polyhedral L♮-convex functions.
(2) For a, b ∈ (R ∪ {±∞})^V, the restriction g_{[a,b]} to the real interval [a, b] is polyhedral L♮-convex provided dom g_{[a,b]} ≠ ∅.
(3) For U ⊆ V, the restriction g_U is polyhedral L♮-convex provided dom g_U ≠ ∅.

Theorem 7.33 (L-optimality criterion).
(1) For a polyhedral L-convex function g ∈ ℒ[R → R] and p ∈ dom_R g, we have

g(p) ≤ g(q) (∀ q ∈ R^V) ⟺ r = 0 and g(p) ≤ g(p + λχ_X) (∀ X ⊆ V, ∀ λ ∈ R₊).

(2) For a polyhedral L♮-convex function g ∈ ℒ♮[R → R] and p ∈ dom_R g, we have

g(p) ≤ g(q) (∀ q ∈ R^V) ⟺ g(p) ≤ g(p ± λχ_X) (∀ X ⊆ V, ∀ λ ∈ R₊).

Proposition 7.34. Let g ∈ ℒ[R → R] be a polyhedral L-convex function. For any x ∈ R^V, arg min g[−x] is an L-convex polyhedron if it is not empty.

The property in Proposition 7.34 characterizes polyhedral L-convexity, to be shown in Theorem 7.45.

Note 7.35. The proviso g^U > −∞ in Theorem 7.31 (4) can be weakened to g^U(p₀) > −∞ for some p₀, and similarly for g̃ > −∞ in Theorem 7.31 (5). •

Note 7.36. For L♮-convex functions on integer points we have seen characterizations in terms of discrete midpoint convexity and submodular integral convexity (Theorems 7.7 and 7.21). These characterizations, however, do not carry over to polyhedral L♮-convex functions. In contrast, translation submodularity (SBF♮[Z]) is generalized to (SBF♮[R]), as stated in Theorem 7.28. •

7.9 Positively Homogeneous L-Convex Functions

Positively homogeneous L-convex functions coincide with the Lovász extensions of submodular set functions.
We denote by ₀ℒ[R → R] the set of polyhedral L-convex functions that are positively homogeneous in the sense of (3.32) and by ₀ℒ[Z|R → R] the set of integral polyhedral L-convex functions that are positively homogeneous. Also we denote by ₀ℒ[Z → R] the set of L-convex functions g ∈ ℒ[Z → R] on integer points such that the convex extensions ḡ are positively homogeneous.
These three families of functions can be identified with each other, i.e.,

₀ℒ[Z|R → R] = ₀ℒ[R → R] ≃ ₀ℒ[Z → R],

by the following proposition. We introduce yet another notation, ₀ℒ[Z → Z], for the set of integer-valued functions belonging to ₀ℒ[Z → R].

Proposition 7.37.
(1) ₀ℒ[Z|R → R] = ₀ℒ[R → R].
(2) The convex extension of a function in ₀ℒ[Z → R] belongs to ₀ℒ[R → R].

Proof. (1) Take g ∈ ₀ℒ[R → R]. For any x ∈ R^V, arg min g[−x] is a cone that is an L-convex polyhedron (or empty) by Proposition 7.34. Hence, arg min g[−x] = D(γ) for a {0, +∞}-valued distance function γ; see section 5.6. This shows the integrality of arg min g[−x], and therefore g ∈ ₀ℒ[Z|R → R].
(2) Take g ∈ ₀ℒ[Z → R]. Since g is integrally convex and ḡ is positively homogeneous, ḡ can be represented as the maximum of a finite number of linear functions. Hence ḡ is polyhedral, and ḡ ∈ ℒ[R → R] by Theorem 7.26. □

A positively homogeneous L-convex function g induces a submodular set function ρ_g by

ρ_g(X) = g(χ_X)  (X ∈ 2^V).  (7.35)

More precisely, we have the following statements, where 𝒮[R] and 𝒮[Z] denote respectively the sets of real-valued and integer-valued submodular set functions defined in (4.10) and (4.11).

Proposition 7.38.
(1) For g ∈ ₀ℒ[R → R], we have ρ_g ∈ 𝒮[R].
(2) For g ∈ ₀ℒ[Z → Z], we have ρ_g ∈ 𝒮[Z].

Proof. The submodularity of ρ_g is immediate from that of g. Note also that ρ_g(∅) = g(0) = 0 and ρ_g(V) = g(1) < +∞. □

Conversely, the Lovász extension ρ̂ of a submodular set function ρ ∈ 𝒮[R] is a positively homogeneous L-convex function. We recall from (4.6) the definition

ρ̂(p) = Σ_{i=1}^{m−1} (p_i − p_{i+1}) ρ(U_i) + p_m ρ(U_m),  (7.36)

where p₁ > p₂ > ⋯ > p_m denote the distinct values of the components of p, and U_i = U_i(p) = {v ∈ V | p(v) ≥ p_i} for i = 1,…,m. Denote by ρ̂_Z the restriction of ρ̂ to Z^V, and note that we have ρ̂_Z : Z^V → Z ∪ {+∞} for integer-valued ρ.
7.9. Positively Homogeneous L-Convex Functions 195

Proposition 7.39.
(1) For ρ ∈ S[R], we have ρ̂ ∈ 0£[R -> R].
(2) For ρ ∈ S[Z], we have ρ̂_Z ∈ 0£[Z -> Z].

Proof. (1) We have ρ̂ ∈ £[R -> R] by Proposition 7.25, whereas the positive
homogeneity of ρ̂ is obvious. (2) follows easily from (1). □
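The definition (4.6) is easy to evaluate directly. The sketch below is an illustration, not part of the original text: the cut function `rho` of the path 0-1-2 is an arbitrary choice of a submodular set function, and the code checks, in the spirit of Proposition 7.39 and (4.7), that its Lovasz extension agrees with `rho` on 0-1 vectors, is positively homogeneous, and is submodular on integer points.

```python
from itertools import product

def lovasz(rho, p):
    # Lovasz extension (4.6): with p_1 > ... > p_m the distinct values of p and
    # U_i = {v : p(v) >= p_i}, return sum_{i<m}(p_i - p_{i+1}) rho(U_i) + p_m rho(U_m).
    vals = sorted(set(p), reverse=True)
    nxts = vals[1:] + [0]
    return sum((v - w) * rho(frozenset(j for j in range(len(p)) if p[j] >= v))
               for v, w in zip(vals, nxts))

# Illustrative submodular set function: the cut function of the path 0-1-2.
EDGES = [(0, 1), (1, 2)]
def rho(X):
    return sum(1 for u, v in EDGES if (u in X) != (v in X))

# rho is recovered on 0-1 vectors (cf. (4.7)): lovasz(rho, chi_X) == rho(X).
for bits in product([0, 1], repeat=3):
    X = frozenset(j for j in range(3) if bits[j])
    assert lovasz(rho, bits) == rho(X)

# Positive homogeneity, and submodularity on integer points (Proposition 7.39).
grid = list(product(range(-1, 3), repeat=3))
for p in grid:
    assert lovasz(rho, tuple(3 * c for c in p)) == 3 * lovasz(rho, p)
for p in grid:
    for q in grid:
        lo = tuple(map(min, p, q)); hi = tuple(map(max, p, q))
        assert lovasz(rho, p) + lovasz(rho, q) >= lovasz(rho, lo) + lovasz(rho, hi)
```

All arithmetic is integral here, so the comparisons are exact.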

The next theorem shows a one-to-one correspondence between positively ho-


mogeneous L-convex functions and submodular set functions.

Theorem 7.40. For 0£ = 0£[R -> R] and S = S[R], the mappings Φ : 0£ -> S
and Ψ : S -> 0£ defined by

    Φ(g) = ρ_g,    Ψ(ρ) = ρ̂

are inverse to each other, establishing a one-to-one correspondence between 0£ and
S. The same statement is true for 0£ = 0£[Z -> Z] and S = S[Z] with Φ : g ↦ ρ_g
and Ψ : ρ ↦ ρ̂_Z.

Proof. For ρ ∈ S we have Ψ(ρ) ∈ 0£ by Proposition 7.39 and Φ ∘ Ψ(ρ) = ρ by
ρ(X) = ρ̂(χ_X) in (4.7). For g ∈ 0£ we have ρ = Φ(g) ∈ S by Proposition 7.38.
Denote by g_Z the restriction of g to Z^V. By Theorem 7.19 (5) we have

for q ∈ N_0, which remains valid for all q ∈ R^V since g is positively homogeneous
and the origin 0 is an interior point of N_0. Hence g = Ψ ∘ Φ(g). □
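The bijection of Theorem 7.40 can be exercised numerically. In this hedged sketch (names and the particular functions are illustrative choices, not from the book), g is the Lovasz extension of the triangle cut function, Φ restricts g to 0-1 vectors, and Ψ rebuilds the Lovasz extension; both round trips return to where they started on integer points.

```python
from itertools import product

def lovasz(rho, p):
    # Lovasz extension (4.6) of a set function rho with rho(frozenset()) == 0.
    vals = sorted(set(p), reverse=True)
    nxts = vals[1:] + [0]
    return sum((v - w) * rho(frozenset(j for j in range(len(p)) if p[j] >= v))
               for v, w in zip(vals, nxts))

# A positively homogeneous L-convex g: the Lovasz extension of the
# (submodular) cut function of the triangle on V = {0, 1, 2}.
EDGES = [(0, 1), (1, 2), (0, 2)]
def rho(X):
    return sum(1 for u, v in EDGES if (u in X) != (v in X))

def g(p):
    return lovasz(rho, p)

# Phi maps g to rho_g(X) = g(chi_X); Psi maps a set function to its
# Lovasz extension.  Theorem 7.40: Phi and Psi are mutually inverse.
subsets = [frozenset(j for j in range(3) if b >> j & 1) for b in range(8)]
rho_g = {X: g(tuple(1 if j in X else 0 for j in range(3))) for X in subsets}

for X in subsets:                              # Phi(Psi(rho)) = rho, by (4.7)
    assert rho_g[X] == rho(X)
for p in product(range(-2, 3), repeat=3):      # Psi(Phi(g)) = g on integer points
    assert lovasz(lambda Y: rho_g[Y], p) == g(p)
```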

The above argument leads to the following proposition, to be used in section 7.10.

Proposition 7.41. For a positively homogeneous polyhedral convex function g :
R^V -> R ∪ {+∞} with dom_R g ≠ ∅, conditions (a) and (b) below are equivalent.
(a) g ∈ 0£[R -> R].
(b) argmin g[-x] ∈ £0[R] for every x ∈ R^V with inf g[-x] > -∞.

Proof. (a) ⇒ (b) is immediate from Proposition 7.34. For (b) ⇒ (a), define ρ by
ρ(X) = g(χ_X) (X ⊆ V) and denote by ρ̂ its Lovasz extension (7.36).
Claim 1: g(p) = ρ̂(p) for all p ∈ R^V.
(Proof of Claim 1) The positively homogeneous convexity of g as well as (7.36)
yields ρ̂(p) ≥ g(p). We may assume p ∈ dom_R g, since otherwise ρ̂(p) = g(p) = +∞.
Take x such that p ∈ argmin g[-x], put D = argmin g[-x], and let δ_D be its
indicator function. Since D is an L-convex cone, we have δ_D ∈ 0£[R -> R]. A set
function μ defined by μ(X) = δ_D(χ_X) (X ⊆ V) belongs to S[R] and its Lovasz
extension coincides with δ_D by Theorem 7.40. From this and (4.8) we see

In view of the expression (4.6) and the linearity of g on D we obtain g(p) = ρ̂(p).
By Claim 1 and the convexity of g, ρ is submodular by Theorem 4.16. In
addition we have ρ(∅) = g(0) = 0 and ρ(V) = g(1) < +∞. Therefore ρ ∈ S[R].
This implies g ∈ 0£[R -> R] by Proposition 7.39 (1). □

7.10 Directional Derivatives and Subgradients


Directional derivatives and subgradients of L-convex functions are considered in this
section. For a polyhedral L-convex function g, the directional derivative g'(p; d) is a
positively homogeneous L-convex function in d, and the subgradients of g at a point
form an M-convex polyhedron. Furthermore, each of these properties characterizes
L-convexity.
We start with directional derivatives of a polyhedral L-convex function g ∈
£[R -> R]. Recall from (3.25) that, for p ∈ dom_R g, there exists ε > 0 such that

    g(p + d) = g(p) + g'(p; d)    (d ∈ R^V, ||d|| ≤ ε).    (7.37)

Proposition 7.42. If g ∈ £[R -> R] and p ∈ dom_R g, then g'(p; ·) ∈ 0£[R -> R].

Proof. By (7.37), g'(p; ·) satisfies (SBF[R]) and (TRF[R]) in the neighborhood of
d = 0. Then the claim follows from the positive homogeneity of g'(p; ·). □

Directional derivatives and subdifferentials of L-convex functions are given as
follows. It is recalled that M0[R], M0[Z|R], M0[Z], and £[R -> R|Z] denote, re-
spectively, the sets of M-convex polyhedra, integral M-convex polyhedra, M-convex
sets, and dual-integral polyhedral L-convex functions. See (3.23), (6.86), and (6.88)
for the notation ∂_R and ∂_Z, and note that ḡ'(p; ·) in Theorem 7.43 (2) denotes the
directional derivative of the convex extension ḡ of g at p.

Theorem 7.43.
(1) For g ∈ £[R -> R] and p ∈ dom_R g, define ρ_{g,p}(X) = g'(p; χ_X) (X ⊆ V).
Then

and ∂_R g(p) ≠ ∅ in particular. If g ∈ £[R -> R|Z], then

(2) For g ∈ £[Z -> R] and p ∈ dom_Z g, define ρ_{g,p}(X) = g(p + χ_X) - g(p)
(X ⊆ V). Then

and ∂_R g(p) ≠ ∅ in particular. If g ∈ £[Z -> Z], then

and ∂_Z g(p) ≠ ∅ in particular.

Proof. (1) Proposition 7.42 shows g'(p; ·) ∈ 0£[R -> R], from which follows
ρ_{g,p} ∈ S[R] by Proposition 7.38. By Theorem 7.33 (1) (L-optimality criterion),

We have B(ρ_{g,p}) ∈ M0[R] by (4.39) and g'(p; ·) = ρ̂_{g,p}(·) by (3.31), (3.33), and
(4.14). If g ∈ £[R -> R|Z], then ∂_R g(p) is an integral polyhedron by (6.76) and
ρ_{g,p}(X) = sup{x(X) | x ∈ ∂_R g(p)} ∈ Z.
(2) It is easy to see ρ_{g,p} ∈ S[R]. The rest of the proof is similar to (1), where
we use Theorem 7.14 (1) instead of Theorem 7.33 (1) and Theorem 4.15 instead of
(4.39). □

We have consistency between (1) and (2) in Theorem 7.43.

Proposition 7.44. For g ∈ £[Z|R -> R] and p ∈ dom_R g ∩ Z^V, we have g'(p; χ_X) =
g(p + χ_X) - g(p) for X ⊆ V.

Proof. This follows from integrality (6.75) and (5.19). □

The next theorem affords characterizations of polyhedral L-convex functions


in terms of the L-convexity of directional derivatives, the M-convexity of subdiffer-
entials, and the L-convexity of minimizers.

Theorem 7.45. For a polyhedral convex function g : R^V -> R ∪ {+∞} with
dom_R g ≠ ∅, the four conditions (a), (b), (c), and (d) below are equivalent.
(a) g ∈ £[R -> R].
(b) g'(p; ·) ∈ 0£[R -> R] for every p ∈ dom_R g.
(c) ∂_R g(p) ∈ M0[R] for every p ∈ dom_R g.
(d) argmin g[-x] ∈ £0[R] for every x ∈ R^V with inf g[-x] > -∞.

Proof. (a) ⇒ (b) is by Proposition 7.42, (a) ⇒ (c) by Theorem 7.43, and (a) ⇒
(d) by Proposition 7.34.
(b) ⇒ (a): By (7.37), g is submodular and linear in the direction of 1 in a
neighborhood of each p ∈ R^V. This implies (SBF[R]) and (TRF[R]) (see Proposi-
tion 7.23).
(b) ⇔ (c): This follows from the relation (δ_{∂Rg(p)})* = g'(p; ·) in (3.33) and the
one-to-one correspondence between M0[R] and 0£[R -> R], which is a consequence
of (4.39) and Theorem 7.40.
(d) ⇒ (b): To use Proposition 7.41 let x ∈ R^V be such that inf g'(p; ·)[-x] >
-∞. Then inf g[-x] > -∞ and argmin g[-x] ∈ £0[R] by (d). By (5.18) we have

    argmin g[-x] = {q ∈ R^V | q(v) - q(u) ≤ γ(u, v) (u, v ∈ V)}

for some γ ∈ T[R]. Since argmin(g'(p; ·)[-x]) is a cone, we see

with A_p = {(u, v) | p(v) - p(u) = γ(u, v)}. This shows that argmin(g'(p; ·)[-x]) is
an L-convex cone. Then (b) follows from Proposition 7.41. □

An integrality consideration in the equivalence of (a) and (d) in the above


theorem yields a characterization of integral polyhedral L-convex functions.

Theorem 7.46. For a polyhedral convex function g : R^V -> R ∪ {+∞} with
dom_R g ≠ ∅, the two conditions (a) and (d) below are equivalent.
(a) g ∈ £[Z|R -> R].
(d) argmin g[-x] ∈ £0[Z|R] for every x ∈ R^V with inf g[-x] > -∞.

Note 7.47. We prove ⇐ of Theorem 7.17 (1). We have argmin g[-x] ∈ £0[Z]
for every x ∈ R^V by the assumption. Since an L-convex set is integrally con-
vex (Theorem 5.10), g is an integrally convex function by Theorem 3.29. By the
boundedness of dom g, the convex closure ḡ of g is a polyhedral convex function
and argmin ḡ[-x] = argmin g[-x] ∈ £0[Z|R]. This implies ḡ ∈ £[Z|R -> R] by
Theorem 7.46 and therefore g ∈ £[Z -> R]. •

7.11 Quasi L-Convex Functions


Quasi L-convex functions are introduced as a generalization of L-convex functions.
The optimality criterion and the proximity theorem survive in this generalization.
To define quasi L-convexity, we relax the submodularity inequality

    g(p) + g(q) ≥ g(p ∨ q) + g(p ∧ q)    (p, q ∈ Z^V)

to sign patterns of g(p ∧ q) - g(p) and g(p ∨ q) - g(q) compatible with (implied by)
this inequality, which are given as follows:

Here ○ and × denote possible and impossible cases, respectively.


We call a function g : Z^V -> R ∪ {+∞} quasi submodular if it satisfies the
following:
(QSB) For any p, q ∈ Z^V, g(p ∧ q) ≤ g(p) or g(p ∨ q) ≤ g(q).
Since p and q are symmetric, (QSB) implies also that g(p ∧ q) ≤ g(q) or g(p ∨ q) ≤
g(p). Similarly, we call g semistrictly quasi submodular if it satisfies the following
property:51
51 The condition (SSQSB) was introduced by Milgrom-Shannon [129], in which g : Z^V -> R ∪
{-∞} is called quasi supermodular if -g satisfies (SSQSB).

(SSQSB) For any p, q ∈ Z^V, both (i) and (ii) hold:
(i) g(p ∧ q) > g(p) ⟹ g(p ∨ q) < g(q),
(ii) g(p ∧ q) ≥ g(p) ⟹ g(p ∨ q) ≤ g(q).

Furthermore, a function g : Z^V -> R ∪ {+∞} with dom g ≠ ∅ is called quasi
L-convex if it satisfies (QSB) and (TRF[Z]) and semistrictly quasi L-convex if it
satisfies (SSQSB) and (TRF[Z]).

Example 7.48. A quasi L-convex function arises from a nonlinear scaling of an
L-convex function. For a submodular function g : Z^V -> R ∪ {+∞} and a function
φ : R -> R ∪ {+∞}, define ĝ : Z^V -> R ∪ {+∞} by

    ĝ(p) = φ(g(p))    (p ∈ Z^V).

Then ĝ satisfies (QSB) if φ is nondecreasing and (SSQSB) if φ is strictly increasing.
If g satisfies (TRF[Z]) with r = 0, this property is inherited by ĝ. •
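The preceding example can be checked by brute force on a small box. In the sketch below, the submodular g, the scaling φ, and all names are illustrative choices, not taken from the book; the code verifies (QSB) for the scaled function.

```python
from itertools import product

def g(p):                # submodular on Z^2: g(p) = max(p)
    return max(p)

def phi(t):              # a strictly increasing scaling (illustrative choice)
    return t ** 3

def g_hat(p):            # the scaled function of Example 7.48
    return phi(g(p))

def satisfies_qsb(f, box):
    # (QSB): for all p, q:  f(p ^ q) <= f(p)  or  f(p v q) <= f(q)
    return all(f(tuple(map(min, p, q))) <= f(p) or f(tuple(map(max, p, q))) <= f(q)
               for p in box for q in box)

box = list(product(range(-2, 3), repeat=2))
assert satisfies_qsb(g, box)        # g itself is submodular, hence (QSB)
assert satisfies_qsb(g_hat, box)    # the nondecreasing scaling preserves (QSB)
```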

Weaker variants of (QSB) and (SSQSB) can be conceived by considering pos-
sible sign patterns of the four values g(p ∧ q) - g(p), g(p ∧ q) - g(q), g(p ∨ q) - g(p),
and g(p ∨ q) - g(q).
(QSBW) For any p, q ∈ dom g, max{g(p), g(q)} ≥ min{g(p ∧ q), g(p ∨ q)}.
(SSQSBW) For any p, q ∈ dom g, either (i) or (ii) holds:
(i) max{g(p), g(q)} > min{g(p ∧ q), g(p ∨ q)},
(ii) g(p) = g(q) = g(p ∧ q) = g(p ∨ q).
The relationship among various versions of quasi submodularity is summarized
as follows. The second statement below shows that all the conditions are equivalent
for g if they are imposed on every perturbation of g by a linear function. Recall the
definition of g[x]; i.e., g[x](p) = g(p) + ⟨p, x⟩.

Theorem 7.49. For g : Z^V -> R ∪ {+∞}, the following implications hold true.

Proof. (1) This is immediate from the definitions.
(2) Combining Theorems 7.51 and 7.52 below establishes this. □

As is easily seen from the definitions of quasi L-convexity, most of the prop-
erties of quasi-submodular functions can be restated naturally in terms of quasi
L-convex functions, and vice versa. We will work mainly with quasi-submodular
functions.
The following are quasi versions of Theorem 7.2 for L-convex functions.

Proposition 7.50. Assume that g : Z^V -> R ∪ {+∞} satisfies g(p) = g(p + 1) for
all p ∈ Z^V.
(1) For g satisfying (QSBW) and for p, q ∈ Z^V and α ∈ Z, we have

    max{g(p), g(q)} ≥ min{g((p + α1) ∧ q), g(p ∨ (q - α1))}.    (7.39)

In particular, for p, q ∈ dom g and α ∈ [0, α_1 - α_2]_Z, we have

    max{g(p), g(q)} ≥ min{g(p + αχ_X), g(q - αχ_X)},    (7.40)

where X ⊆ V, α_1 ∈ Z, and α_2 ∈ Z ∪ {-∞} are defined by

    α_1 = max_{v∈V} {q(v) - p(v)},    X = argmax_{v∈V} {q(v) - p(v)},
    α_2 = max{q(v) - p(v) | v ∈ V \ X}    (α_2 = -∞ if X = V).

(2) For g satisfying (SSQSBW) and for p, q ∈ Z^V with g(p) ≠ g(q) and α ∈ Z,
we have inequality (7.39) with strict inequality. In particular, for p, q ∈ dom g with
g(p) ≠ g(q) and α ∈ [0, α_1 - α_2]_Z, we have (7.40) with strict inequality.
(3) For g satisfying (SSQSB) and for p, q ∈ Z^V and α ∈ Z, we have

In particular, for p, q ∈ dom g and α ∈ [0, α_1 - α_2]_Z, we have

Proof. Inequality (7.39) follows from

    max{g(p), g(q)} = max{g(p), g(q - α1)} ≥ min{g(p ∨ (q - α1)), g(p ∧ (q - α1))},

in which g(p ∧ (q - α1)) = g((p ∧ (q - α1)) + α1) = g((p + α1) ∧ q). Inequality (7.40)
is obvious from (7.39) since p ∨ (q - (α_1 - α)1) = p + αχ_X and (p + (α_1 - α)1) ∧ q =
q - αχ_X for α ∈ [0, α_1 - α_2]_Z. The proofs of (2) and (3) are similar. □

The quasi submodularity of a set D ⊆ Z^V can be defined as the quasi submod-
ularity of the indicator function δ_D : Z^V -> {0, +∞}. (QSB) for δ_D is equivalent to
(QDL) p, q ∈ D ⟹ p ∧ q ∈ D or p ∨ q ∈ D
for D, whereas (SSQSB) for δ_D is equivalent to (SBS[Z]) for D.
Level sets of quasi-submodular functions have quasi submodularity. Further-
more, the weaker version (QSBW) of quasi submodularity for functions can be char-
acterized by the property (QDL) of level sets; recall the notation L(g, a) from (6.95).

Theorem 7.51. A function g : Z^V -> R ∪ {+∞} satisfies (QSBW) if and only if
the level set L(g, a) satisfies (QDL) for every a ∈ R.

Proof. For the "if" part, take p, q ∈ dom g and put a = max{g(p), g(q)}. Since
p, q ∈ L(g, a), we have p ∧ q ∈ L(g, a) or p ∨ q ∈ L(g, a); i.e., max{g(p), g(q)} ≥
min{g(p ∧ q), g(p ∨ q)}. The "only if" part is even easier. □
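Theorem 7.51 lends itself to exhaustive verification on a finite box. The sketch below is an illustration under two stated assumptions: the box stands in for dom g, and the level sets are taken as L(g, a) = {p : g(p) ≤ a} in accordance with (6.95). Both sides of the equivalence are evaluated by brute force on seeded random functions.

```python
from itertools import product
import random

PTS = list(product(range(3), repeat=2))   # a finite box standing in for Z^V

def meet(p, q): return tuple(map(min, p, q))
def join(p, q): return tuple(map(max, p, q))

def qsbw(g):
    # (QSBW): max{g(p), g(q)} >= min{g(p ^ q), g(p v q)} for all p, q in dom g
    return all(max(g[p], g[q]) >= min(g[meet(p, q)], g[join(p, q)])
               for p in PTS for q in PTS)

def qdl_all_levels(g):
    # (QDL) for every level set L(g, a):  p, q in L  ==>  p ^ q in L or p v q in L
    for a in set(g.values()):             # level sets change only at values of g
        L = {p for p in PTS if g[p] <= a}
        if any(meet(p, q) not in L and join(p, q) not in L
               for p in L for q in L):
            return False
    return True

random.seed(0)
for _ in range(300):
    g = {p: random.randint(0, 3) for p in PTS}    # an arbitrary function
    assert qsbw(g) == qdl_all_levels(g)           # Theorem 7.51
```

The same scaffolding, applied to every linear perturbation g[x], would illustrate Theorem 7.52 below.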

A submodular function over the integer lattice can be characterized by using


level sets of functions perturbed by linear functions.

Theorem 7.52. A function g : Z^V -> R ∪ {+∞} satisfies (SBF[Z]) if and only if
the level set L(g[x], a) satisfies (QDL) for all x ∈ R^V and a ∈ R.

Proof. The "only if" part follows from Theorem 7.51 and the submodularity of g[x].
For the proof of the "if" part, take p, q ∈ dom g. By (QDL) for L(g, max{g(p), g(q)})
we have p ∧ q ∈ dom g or p ∨ q ∈ dom g. We consider the former case, where we
may assume p ∧ q ≠ p, q. For any ε > 0, we can choose some x ∈ R^V such that
g[x](p) = g[x](q) = g[x](p ∧ q) - ε. (QDL) for L(g[x], a) with a = g[x](p) shows
p ∨ q ∈ L(g[x], a), which implies g[x](p) + g[x](q) = 2a ≥ g[x](p ∧ q) + g[x](p ∨ q) - ε.
Since ε > 0 is arbitrary, this means (SBF[Z]). □

Next we turn to the minimization of a quasi L-convex function. We assume
r = 0 in (TRF[Z]) since otherwise no minimizer exists.
Global minimality is characterized by local minimality.

Theorem 7.53 (Quasi L-optimality criterion). Assume that g : Z^V -> R ∪ {+∞}
satisfies g(p) = g(p + 1) (∀p ∈ Z^V).
(1) For g satisfying (QSBW) and p ∈ dom g, we have: g(p) < g(q) for all
q ∈ Z^V such that q - p is not a multiple of 1 ⟺ g(p) < g(p + χ_X) for all X ⊆ V
with X ∉ {∅, V}.
(2) For g satisfying (SSQSBW) and p ∈ dom g, we have: g(p) ≤ g(q) (∀q ∈
Z^V) ⟺ g(p) ≤ g(p + χ_X) (∀X ⊆ V).
Proof. We prove ⇐ of (1) by contradiction. Suppose that g(q) ≤ g(p) for some
q ∈ dom g such that q - p is not a multiple of 1. We may assume that q ≥ p
by (TRF[Z]) with r = 0 and that q minimizes max_{v∈V}{q(v) - p(v)} among such
vectors. Put X = argmax_{v∈V}{q(v) - p(v)}, where X ≠ V. By Proposition 7.50
(1), we obtain

    max{g(p), g(q)} ≥ min{g(p + χ_X), g(q - χ_X)},

whereas g(p) < g(q - χ_X) by the choice of q. Hence follows g(p) ≥ g(p + χ_X), a
contradiction to the strict local minimality of p. The other direction ⇒ of (1) is
obvious, and (2) can be shown similarly by Proposition 7.50 (2). □
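The criterion of Theorem 7.53 (1) can be observed on a concrete function. The sketch below uses g(p) = max(p) - min(p), an illustrative choice of a submodular (hence (QSBW)) function with g(p + 1) = g(p), and checks that strict local minimality in the directions χ_X coincides with strict global minimality; a finite box stands in for Z^V, which is an assumption of this sketch.

```python
from itertools import product

def g(p):                # g(p) = max(p) - min(p): submodular, and g(p + 1) = g(p)
    return max(p) - min(p)

n = 3
proper_X = [tuple(1 if b >> j & 1 else 0 for j in range(n))
            for b in range(1, 2 ** n - 1)]          # X not in {emptyset, V}

def local_strict(p):
    # g(p) < g(p + chi_X) for every X not in {emptyset, V}
    return all(g(tuple(a + d for a, d in zip(p, X))) > g(p) for X in proper_X)

def global_strict(p, box):
    # g(p) < g(q) whenever q - p is not a multiple of 1; note that q - p is a
    # multiple of 1 exactly when its components are all equal.
    return all(g(q) > g(p) for q in box
               if len(set(b - a for a, b in zip(p, q))) > 1)

box = list(product(range(-1, 4), repeat=n))
for p in product(range(3), repeat=n):
    assert local_strict(p) == global_strict(p, box)   # Theorem 7.53 (1)
```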

The proximity theorem for L-convex functions (Theorem 7.18) can be gener-
alized for quasi L-convex functions.

Theorem 7.54 (Quasi L-proximity theorem). Let g : Z^V -> R ∪ {+∞} be a
function satisfying (SSQSB) and g(p) = g(p + 1) (∀p ∈ Z^V), and assume n = |V|
and α ∈ Z++. If p_α ∈ dom g satisfies (7.21), then argmin g ≠ ∅ and there exists
p* ∈ argmin g with (7.22).

Proof. The proof of Theorem 7.18 works with (7.43) and (7.44) in place of
(L♮-APR[Z]). □
Bibliographical Notes
The concept of L-convex functions was introduced by Murota [140]. L♮-convex
functions are defined by Fujishige-Murota [68] as a variant of L-convex functions,
together with the observation that they coincide with the submodular integrally
convex functions considered earlier by Favati-Tardella [49]. Theorems 7.1 and 7.3
are due to [68], and Theorem 7.2 is stated in Murota [147].
Discrete midpoint convexity is considered by Favati-Tardella [49] with an ob-
servation of its equivalence to submodular integral convexity. The equivalence of
discrete midpoint convexity to translation submodularity (SBF♮[Z]) in Theorem 7.7
is by Fujishige-Murota [68], whereas that to (L♮-APR[Z]) is noted in Murota [147].
Condition (7.11) for quadratic L-convex functions is given in Murota [141].
Separable convex functions with chain conditions (7.13) are considered in Best-
Chakravarti-Ubhaya [11]. Multimodular functions are treated in Hajek [85].
The basic operations in section 7.4 are listed in Murota [141], [144], [147].
The theorems on minimizers of L-convex functions are of fundamental impor-
tance. Theorem 7.14 (L-optimality criterion) is stated in Murota [145]. Theorem
7.15 (optimality for submodular set functions) can be found as Theorem 7.2 in Fu-
jishige [65]. Theorem 7.17 (characterization by minimizers) is a corollary of Theorem
7.45 due to Murota-Shioura [152]. A thorough study of minimizers of submodular
functions is made in Topkis [202].
The L-proximity theorem (Theorem 7.18) is due to Iwata-Shigeno [105]. The
present proof based on (L♮-APR[Z]) is by Murota-Shioura [154].
The construction of the convex extension of an L-convex function by means
of the Lovasz extensions (Theorem 7.19) is due to Murota [140]. The same idea,
however, was used earlier by Favati-Tardella [49] for submodular integrally con-
vex functions. The equivalence of L♮-convexity to submodular integral convexity
(Theorem 7.21) is due to Fujishige-Murota [68].
Polyhedral L-convex functions are investigated by Murota-Shioura [152], to
which all the theorems in section 7.8 (Theorems 7.26, 7.28, 7.29, 7.30, 7.31, 7.32,
and 7.33) as well as Proposition 7.34 are ascribed. L-convexity for nonpolyhedral
convex functions is considered in Murota-Shioura [156], [157].
The correspondence between positively homogeneous L-convex functions and
submodular set functions (Theorem 7.40) is established for the case of Z in Murota
[140] and generalized to the case of R in Murota-Shioura [152]. Proposition 7.37 is
stated in Murota [147].
Theorem 7.43 for directional derivatives and subgradients is shown for the case
of Z in Murota [140], [141], and generalized to the case of R in Murota-Shioura [152].
Theorem 7.45 (characterizations in terms of directional derivatives, subdifferentials,

and minimizers) is by [152], whereas its ramification with integrality (Theorem 7.46)
is stated in Murota [147].
The concept of quasi L-convex functions was introduced by Murota-Shioura
[154] on the basis of the idea of Milgrom-Shannon [129]. Theorem 7.52 is due to
[129], and the other theorems in section 7.11 (Theorems 7.49, 7.51, 7.53, and 7.54)
are in [154].
Chapter 8

Conjugacy and Duality

By addressing the issues of conjugacy and duality, this chapter provides the the-
oretical climax of discrete convex analysis. Whereas conjugacy in convex analysis
gives a symmetric one-to-one correspondence within a single class of closed convex
functions, conjugacy in discrete convex analysis establishes a one-to-one correspon-
dence between two different classes of discrete functions with different combinatorial
properties distinguished by "L" and "M." The conjugacy between L-convexity and
M-convexity is thus one of the most remarkable features of discrete convex analysis.
Discrete duality is another distinguishing feature. It is expressed in a number of the-
orems, such as the separation theorems for M-convex/M-concave functions and for
L-convex/L-concave functions (M- and L-separation theorems) and the Fenchel-type
duality theorem. Besides formal parallelism with convex analysis, these discrete du-
ality theorems carry deep combinatorial facts, implying, for example, Edmonds's
intersection theorem and Frank's discrete separation theorem for submodular set
functions as special cases.

8.1 Conjugacy
M-convex functions and L-convex functions form two distinct classes of discrete
functions that are conjugate to each other under the Legendre-Fenchel transfor-
mation. This stands in sharp contrast with conjugacy in convex analysis, which
is a symmetric one-to-one correspondence within a single class of closed convex
functions. The conjugacy correspondence between M-convexity and L-convexity is
in fact a translation of two different combinatorial properties, exchangeability and
submodularity, on top of convexity. The relationship between submodularity and
supermodularity with respect to conjugacy is discussed first in section 8.1.1. Con-
jugacy for polyhedral M-/L-convex functions is established in section 8.1.2 and that
for integer-valued M-/L-convex functions on integer points in section 8.1.3.


8.1.1 Submodularity under Conjugacy


Submodularity and supermodularity are not symmetric under the Legendre-Fenchel
transformation. The conjugate of a submodular function is always supermodular,
whereas the conjugate of a supermodular function is not necessarily submodular.
In this subsection we assume that f is a function in real variables, f : R^V ->
R ∪ {+∞}, with a nonempty effective domain. Recall that f is submodular if

    f(x) + f(y) ≥ f(x ∨ y) + f(x ∧ y)    (x, y ∈ R^V)    (8.1)

and supermodular if

    f(x) + f(y) ≤ f(x ∨ y) + f(x ∧ y)    (x, y ∈ R^V).    (8.2)

Also recall from (3.26) that the Legendre-Fenchel transform f* : R^V -> R ∪ {+∞}
is defined by

    f*(p) = sup{⟨p, x⟩ - f(x) | x ∈ R^V}    (p ∈ R^V).    (8.3)
Theorem 8.1. For a submodular function f, the Legendre-Fenchel transform f*
is supermodular.

Proof. For x, y ∈ R^V and p, q ∈ R^V, we have

    ⟨p, x⟩ + ⟨q, y⟩ ≤ ⟨p ∧ q, x ∧ y⟩ + ⟨p ∨ q, x ∨ y⟩.

From this inequality, submodularity (8.1), and the definition (8.3), we see that

    ⟨p, x⟩ - f(x) + ⟨q, y⟩ - f(y) ≤ ⟨p ∧ q, x ∧ y⟩ - f(x ∧ y) + ⟨p ∨ q, x ∨ y⟩ - f(x ∨ y)
                                  ≤ f*(p ∧ q) + f*(p ∨ q).

Taking the supremum over x and y we obtain

    f*(p) + f*(q) ≤ f*(p ∧ q) + f*(p ∨ q),

which shows the supermodularity of f*. □

In contrast to Theorem 8.1, the Legendre-Fenchel transform of a supermodular
function is not necessarily submodular. For example, consider a pair of convex
quadratic functions f(x) = (1/2) xᵀAx and g(p) = (1/2) pᵀA⁻¹p with

We have g = f* by Proposition 2.9, whereas f is supermodular and g is not sub-
modular by Proposition 2.6.
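Both phenomena can be checked numerically for quadratics, whose conjugates are available in closed form (f(x) = ½xᵀAx has f* = ½pᵀA⁻¹p, cf. Proposition 2.9). The matrices below are illustrative choices made for this sketch — the book's specific example matrix is not reproduced here.

```python
from itertools import product

def quad(M, x):                  # x^T M x / 2
    n = len(x)
    return 0.5 * sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))

def meet(p, q): return tuple(map(min, p, q))
def join(p, q): return tuple(map(max, p, q))

# Theorem 8.1 direction: f submodular (nonpositive off-diagonal) => f* supermodular.
B = [[2, -1], [-1, 2]]
BINV = [[2 / 3, 1 / 3], [1 / 3, 2 / 3]]        # B^{-1}, entered by hand (det B = 3)
fb = lambda x: quad(B, x)                      # submodular
gb = lambda p: quad(BINV, p)                   # gb = fb* by Proposition 2.9
pts2 = list(product(range(3), repeat=2))
assert all(fb(x) + fb(y) >= fb(meet(x, y)) + fb(join(x, y)) - 1e-9
           for x in pts2 for y in pts2)
assert all(gb(p) + gb(q) <= gb(meet(p, q)) + gb(join(p, q)) + 1e-9
           for p in pts2 for q in pts2)        # fb* is supermodular

# The converse fails for n = 3: A is positive definite with nonnegative
# off-diagonal entries, so f is supermodular, yet (A^{-1})_{13} = 1/4 > 0
# makes g = f* fail submodularity.
A = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
AINV = [[0.75, -0.5, 0.25], [-0.5, 1.0, -0.5], [0.25, -0.5, 0.75]]  # det A = 4
f = lambda x: quad(A, x)
g = lambda p: quad(AINV, p)                    # g = f*
pts3 = list(product(range(2), repeat=3))
assert all(f(x) + f(y) <= f(meet(x, y)) + f(join(x, y)) + 1e-9
           for x in pts3 for y in pts3)        # f is supermodular
p, q = (1, 0, 0), (0, 0, 1)
assert g(p) + g(q) < g(meet(p, q)) + g(join(p, q))   # submodularity violated
```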

If n = 2, however, supermodularity does imply submodularity of the Legendre-
Fenchel transform.

Proposition 8.2. For a supermodular function f in two variables, the Legendre-
Fenchel transform f* is submodular.

Proof. It suffices to show that

for p = (p(1), p(2)) ∈ R² and q = (q(1), q(2)) ∈ R² with p(1) ≥ q(1) and p(2) ≤
q(2). We claim that

for any x = (x(1), x(2)) ∈ R² and y = (y(1), y(2)) ∈ R². The inequality (8.4) is an
immediate consequence of (8.5), since the supremum of the left-hand side of (8.5)
over x and y coincides with the left-hand side of (8.4).
Proof of (8.5): If x(1) ≥ y(1) and x(2) ≤ y(2), we have

and, therefore,

If x(1) < y(1), we have

and, therefore,

A similar argument holds for the case of x(2) > y(2). □

By Theorem 8.1, submodularity is preserved under the transformation f ↦
-f*. However, this does not establish a symmetric one-to-one correspondence
within the class of submodular functions. It is not true, either, that the map-
ping f ↦ f* gives a one-to-one correspondence between the class of submodular
functions and the class of supermodular functions.

8.1.2 Polyhedral M-/L-Convex Functions


Conjugacy for polyhedral M-convex and L-convex functions is considered here. We
start with a technical lemma.

Proposition 8.3. Let g ∈ £[R -> R] be a polyhedral L-convex function. For
x, y ∈ R^V with inf g[-x] > -∞ and inf g[-y] > -∞ and for u ∈ supp+(x - y),
there exists v ∈ supp-(x - y) such that

Proof. We may assume argmin g[-x] ≠ ∅ and argmin g[-y] ≠ ∅. By Proposition
7.34, we have argmin g[-x] ∈ £0[R] and argmin g[-y] ∈ £0[R]. It suffices to
demonstrate the existence of v ∈ supp-(x - y) such that p(v) ≤ q(v) for all p ∈ D_x
and q ∈ D_y, where

To prove this by contradiction, suppose that for every v ∈ supp-(x - y) there exist
p_v ∈ D_x and q_v ∈ D_y with p_v(v) > q_v(v). Then, for

we have p* ∈ D_x, q* ∈ D_y, and p*(v) > q*(v) (∀v ∈ supp-(x - y)). By defining

with λ = min{p*(v) - q*(v) | v ∈ supp+(p* - q*)} > 0, we obtain

from Theorem 7.29. By supp-(x - y) ⊆ supp+(p* - q*), on the other hand, we see

where the last equality is due to x(V) = y(V) = r (the constant in (TRF[R])).
Combining (8.7) and (8.8) results in

which is a contradiction to p* ∈ argmin g[-x] and q* ∈ argmin g[-y]. □

The conjugacy theorem for polyhedral M-convex and L-convex functions is


now stated.

Theorem 8.4 (Conjugacy theorem).
(1) The classes of polyhedral M-convex functions and polyhedral L-convex func-
tions, M = M[R -> R] and £ = £[R -> R], are in one-to-one correspondence
under the Legendre-Fenchel transformation (8.3). That is, for f ∈ M and g ∈ £,
we have f* ∈ £, g* ∈ M, f** = f, and g** = g.
(2) The classes of polyhedral M♮-convex functions and polyhedral L♮-convex
functions, M♮[R -> R] and £♮[R -> R], are in one-to-one correspondence under
(8.3) in a similar manner.

Proof. (1) and (2) are equivalent, so we prove (1). We first note that f** = f for
any polyhedral convex function f.
For f ∈ M we have ∂_R f*(p) = argmin f[-p] ∈ M0[R] (∀p ∈ dom f*) by
Proposition 6.53. Then f* ∈ £ by (c) ⇒ (a) in Theorem 7.45. (An alternative proof
is described in the proof of Theorem 8.6.)
Conversely, take g ∈ £ and x, y ∈ dom g*. Since inf g[-x] > -∞ and
inf g[-y] > -∞, Proposition 8.3 shows that for every u ∈ supp+(x - y) there
exists v ∈ supp-(x - y) satisfying (8.6). Noting that (δ_{argmin g[-x]})* = (g*)'(x; ·),
which follows from (3.30) and (3.33), we obtain

This shows (M-EXC'[R]) for g*, and hence g* ∈ M. □

Theorem 8.4 (2) states that L♮-convex functions and M♮-convex functions are
transformed to each other, where L♮-convex functions are submodular (Theorem
7.28) and M♮-convex functions are supermodular (Theorem 6.51). It is noted that
Theorem 8.4 (2) does not imply, nor is it implied by, Theorem 8.1, which shows the
supermodularity of the conjugate of a submodular function.
Recalling the basic fact that the conjugate of the indicator function of a con-
vex set is a positively homogeneous convex function, and vice versa, we see from
Theorem 8.4 above that M-convex polyhedra M0[R] and positively homogeneous
L-convex functions 0£[R -> R] are conjugate to each other and also that L-convex
polyhedra £0[R] and positively homogeneous M-convex functions 0M[R -> R] are
conjugate to each other. On the other hand, we can identify positively homogeneous
L-convex functions 0£[R -> R] with submodular set functions S[R] (Theorem 7.40)
and positively homogeneous M-convex functions 0M[R -> R] with distance func-
tions with the triangle inequality T[R] (Theorem 6.59). We can summarize these

one-to-one correspondences in the following diagram:

In addition, the polarity between M-convex cones and L-convex cones follows
from (8.9). This is because two convex cones are polar to each other if and only if
their indicator functions are conjugate to each other. Thus we obtain the following
theorem.

Theorem 8.5. A polyhedral cone is M-convex if and only if its polar cone is L-
convex. Hence, the classes of M-convex cones and L-convex cones are in one-to-one
correspondence under polarity (3.34).

Taking integrality into account in diagram (8.9), we obtain

where M[Z|R -> R] and £[Z|R -> R] denote the sets of integral polyhedral M-
convex and L-convex functions, respectively; M[R -> R|Z] and £[R -> R|Z] denote
the sets of dual-integral polyhedral M-convex and L-convex functions, respectively;
and 0M[R -> R|Z] and 0£[R -> R|Z] are their subclasses with positive homogene-
ity.52
It is known that the conjugacy relationship between M-convexity and L-
convexity holds more generally for closed proper convex functions. Recall that
the Legendre-Fenchel transformation gives a symmetric one-to-one correspondence
in the class of all closed proper convex functions (Theorem 3.2).

Theorem 8.6. A closed proper convex function f satisfies (M-EXC[R]) if and
only if f = g* for a closed proper convex function g that satisfies (SBF[R]) and
(TRF[R]).

Proof. The proof of the "if" part is essentially the same as the latter half of
the proof of Theorem 8.4. The "only if" part needs a new approach, since the
implication (c) ⇒ (a) in Theorem 7.45 does not carry over to nonpolyhedral convex
functions. Suppose that f satisfies (M-EXC[R]) and put g = f*. It is easy to show
(TRF[R]) for g. The proof of the submodularity (SBF[R]) for g consists of the
following steps.
52 The notation for dual integrality extends naturally to other classes of functions. For example,
M♮[R -> R|Z] and £♮[R -> R|Z] denote the sets of dual-integral polyhedral M♮-convex and
L♮-convex functions, respectively.

1. We may assume that dom f is bounded, so that dom g = R^V.
2. For p_0 ∈ R^V and U ⊆ V with |U| = 2, denote by f̃ : R^U -> R ∪ {+∞} the
projection of f[-p_0] to U and by g̃ : R^U -> R the restriction of g(p_0 + p) to
U. Then we have g̃ = (f̃)*.
3. (M-EXC[R]) of f implies the supermodularity of f̃.
4. The supermodularity of f̃ implies the submodularity of (f̃)* by Proposi-
tion 8.2.
5. The submodularity of g̃ for any p_0 and any U implies the submodularity of g.
The details are given in Murota-Shioura [156], [157]. □

Note 8.7. With Theorem 8.4 we complete the proof of Theorem 6.63 (characteriza-
tions of polyhedral M-convex functions). (b) ⇔ (c) follows from (δ_{∂Rf(x)})* = f'(x; ·)
in (3.33) and the correspondence between £0[R] and 0M[R -> R], which is a
special case of Theorem 8.4. To show (a) ⇔ (c) ⇔ (d), put g = f* and note
that ∂_R f(x) = argmin g[-x] and argmin f[-p] = ∂_R g(p). By Theorem 8.4 and
Theorem 7.45 (characterizations of polyhedral L-convex functions), we see that
f ∈ M[R -> R] ⇔ g ∈ £[R -> R] ⇔ argmin g[-x] ∈ £0[R] ⇔ ∂_R g(p) ∈ M0[R]. •

Note 8.8. We complete the proof of Theorem 6.45 using Theorem 6.63 ((d) ⇒
(a)) established in Note 8.7. Let f̄ be the convex extension of f ∈ M[Z -> R]. For
any p ∈ R^V, argmin f̄[-p] is an M-convex polyhedron if it is not empty (Theorem
6.43). Since f̄ is polyhedral by the assumption, Theorem 6.63 ((d) ⇒ (a)) shows
f̄ ∈ M[R -> R]. •

Note 8.9. Using Theorem 8.5, we complete the proof of (4.42), the representation
of an M-convex cone in terms of vectors χ_u - χ_v (u, v ∈ V). Let B be an M-
convex cone and D be the polar of B. By Theorem 8.5, D is an L-convex cone,
and by (5.18) it can be represented as D = {p ∈ R^V | ⟨p, a_i⟩ ≤ 0 (i = 1, ..., m)}
for some a_i = χ_{u_i} - χ_{v_i} (i = 1, ..., m). Since B is the polar of D, this implies
B = {x ∈ R^V | x is a nonnegative combination of a_i (i = 1, ..., m)} by (3.36).
Conversely, a convex cone of this form is M-convex, as can be shown by reversing
the above argument. •

Note 8.10. Using Theorem 8.5, we complete the proof of (5.21), the representation
of an L-convex cone in terms of a ring family 𝒟 ⊆ 2^V. Let D be an L-convex cone
and B be the polar of D. By Theorem 8.5, B is an M-convex cone, and, by (4.39),
it can be represented as B = B(ρ) using ρ ∈ S[R] with ρ : 2^V -> {0, +∞}; i.e.,

with 𝒟 = dom ρ, which is a ring family. Since D is the polar of B, this implies

by (3.36). Conversely, a convex cone of this form is L-convex, as can be shown by
reversing the above argument. •

8.1.3 Integral M-/L-Convex Functions


We turn to functions defined on integer points.
For functions f : Z^V -> R ∪ {+∞} and h : Z^V -> R ∪ {-∞}, discrete versions
of the Legendre-Fenchel transformation are defined by

    f*(p) = sup{⟨p, x⟩ - f(x) | x ∈ Z^V}    (p ∈ R^V),    (8.11)
    h°(p) = inf{⟨p, x⟩ - h(x) | x ∈ Z^V}    (p ∈ R^V).    (8.12)

We call (8.11) and (8.12), respectively, the convex and concave discrete Legendre-
Fenchel transformations. The functions f* : R^V -> R ∪ {±∞} and h° : R^V ->
R ∪ {±∞} are called the convex conjugate of f and the concave conjugate of h,
respectively. Note that h°(p) = -(-h)*(-p).
For an integer-valued function f, f*(p) is integral for an integer vector p.
Hence, (8.11) with p ∈ Z^V defines a transformation of f : Z^V -> Z ∪ {+∞} to
f* : Z^V -> Z ∪ {±∞}; we refer to (8.11) with p ∈ Z^V as (8.11)_Z. We call (f*)*
using (8.11)_Z the integer biconjugate of f and denote it by f**. Similarly, (8.12)
with p ∈ Z^V is designated by (8.12)_Z, and we define h°° = (h°)°.
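When the effective domain is a finite set, the transformations (8.11) and (8.12) are straightforward to evaluate by brute force. A minimal sketch (the domain and the test function are illustrative choices), including a check of the identity h°(p) = -(-h)*(-p) noted above:

```python
from itertools import product

# Brute-force discrete Legendre-Fenchel transforms (8.11) and (8.12);
# dom f is a small finite box, so sup and inf become max and min.
DOM = list(product(range(-2, 3), repeat=2))

def conv_conj(f, p):
    # f*(p) = sup { <p, x> - f(x) | x in Z^V }
    return max(p[0] * x[0] + p[1] * x[1] - f(x) for x in DOM)

def conc_conj(h, p):
    # h°(p) = inf { <p, x> - h(x) | x in Z^V }
    return min(p[0] * x[0] + p[1] * x[1] - h(x) for x in DOM)

f = lambda x: x[0] ** 2 + x[1] ** 2
h = lambda x: -f(x)

# The identity h°(p) = -(-h)*(-p):
for p in product(range(-3, 4), repeat=2):
    assert conc_conj(h, p) == -conv_conj(f, (-p[0], -p[1]))
```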
The following fact is fundamental for the conjugacy of discrete functions.

Proposition 8.11. For a function f : Z^V -> Z ∪ {+∞} and a point x ∈ dom_Z f,
we have f**(x) = f(x) if ∂_Z f(x) ≠ ∅, where f** means the integer biconjugate with
respect to the discrete Legendre-Fenchel transformation (8.11)_Z.

Proof. For p ∈ ∂_Z f(x), we have f*(p) = ⟨p, x⟩ - f(x) (cf. (3.30)), and therefore
f**(x) = sup{⟨q, x⟩ - f*(q) | q ∈ Z^V} ≥ ⟨p, x⟩ - f*(p) = f(x). On the other hand,
f**(x) ≤ f(x) for any f and x. □
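The integer biconjugate of Proposition 8.11 can be computed on a toy one-dimensional example. In this sketch the suprema over Z are truncated to finite ranges — an assumption of the sketch, harmless here because the relevant subgradients are small.

```python
XS = range(-2, 3)          # dom f (f = +infinity outside this range)
PS = range(-6, 7)          # truncated stand-in for Z in the conjugates

f = {x: x * x for x in XS}                                 # integer-valued
fstar = {p: max(p * x - f[x] for x in XS) for p in PS}     # (8.11)_Z
fss = {x: max(p * x - fstar[p] for p in PS) for x in XS}   # integer biconjugate

assert fss == f            # f** = f, as Proposition 8.11 predicts
```

Every point of this f has an integral subgradient, so the proposition applies at all of dom f.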

The conjugacy theorem for discrete M-convex and L-convex functions reads
as follows.

Theorem 8.12 (Discrete conjugacy theorem).
(1) The classes of integer-valued M-convex functions and integer-valued L-
convex functions, M = M[Z -> Z] and £ = £[Z -> Z], are in one-to-one cor-
respondence under the discrete Legendre-Fenchel transformation (8.11)_Z. That is,
for f ∈ M and g ∈ £, we have f* ∈ £, g* ∈ M, f** = f, and g** = g.
(2) The classes of integer-valued M♮-convex functions and integer-valued L♮-
convex functions, M♮[Z -> Z] and £♮[Z -> Z], are in one-to-one correspondence
under (8.11)_Z in a similar manner.

Proof. The basic idea of the proof is to apply Theorem 8.4 to the convex exten-
sions of / and g with additional arguments for discreteness. Since (1) and (2) are
equivalent, we deal with (2).
8.1. Conjugacy 213

(i) Take f ∈ M♮[Z → Z]. Let f̄ be the convex extension of f and f̄• be the
conjugate of f̄ in the sense of (8.3). We have f•(p) = f̄•(p) for p ∈ Z^V.
If dom_Z f is bounded, f̄ is polyhedral convex, and therefore f̄ ∈ M♮[R → R]
by Theorem 6.45. Then Theorem 8.4 shows f̄• ∈ L♮[R → R]. Since f•(p) = f̄•(p)
for p ∈ Z^V, (SBF♮[R]) for f̄• implies (SBF♮[Z]) for f•. Hence f• ∈ L♮[Z → Z].
If dom_Z f is unbounded, we consider the restriction f_k of f to the integer interval
[−k1, k1]_Z for k ∈ Z large enough to ensure dom_Z f ∩ [−k1, k1]_Z ≠ ∅. Then we
have f_k ∈ M♮[Z → Z] and f_k• ∈ L♮[Z → Z] by the argument above. For each
p ∈ dom_Z f•, there exists k_p such that f•(p) = f_k•(p) for all k ≥ k_p (cf. Theorem
6.42 and Proposition 3.30). Therefore, (SBF♮[Z]) for f_k• implies (SBF♮[Z]) for f•.
Hence f• ∈ L♮[Z → Z].
(ii) Take g ∈ L♮[Z → Z]. Let ḡ be the convex extension of g and ḡ• be the
conjugate of ḡ in the sense of (8.3). We have

    ḡ•(x) = sup{ ⟨p, x⟩ − ḡ(p) | p ∈ R^V } = sup{ ⟨p, x⟩ − g(p) | p ∈ Z^V }    (8.13)

and, in particular, g•(x) = ḡ•(x) (x ∈ Z^V).

If dom_Z g is bounded, ḡ is polyhedral convex, and therefore ḡ ∈ L♮[R → R]
by Theorem 7.26. Then ḡ• ∈ M♮[R → R] by Theorem 8.4, and ḡ• satisfies
(M♮-EXC[R]) by Theorem 6.47. We claim that α₀ = 1 is valid in (M♮-EXC[R]) for
x, y ∈ dom_R ḡ• ∩ Z^V = dom_Z g•. Then it follows that g• satisfies (M♮-EXC[Z])
and g• ∈ M♮[Z → Z] by Theorem 6.2. To show α₀ = 1, fix x, y ∈ dom_Z g•,
u ∈ supp⁺(x − y), and v ∈ supp⁻(x − y) ∪ {0} in (M♮-EXC[R]). By the assumed
boundedness of dom_Z g, the supremum in (8.13) is attained by some p. Moreover,
there exist p₀ ∈ Z^V and α₁ > 0 such that

for all α ∈ [0, α₁]_R. Condition (8.14) can be written as

which is equivalent, by the L-optimality criterion (Theorem 7.14 (2)), to

Note that the right-hand side is an integer and the coefficient of α on the left is
either ±1 or 0. By virtue of this integrality, the above inequality is satisfied by all
α ∈ [0, 1]_R if it is satisfied by some α > 0. Therefore, (8.14) holds for all α ∈ [0, 1]_R.
Similarly, there exists q₀ ∈ Z^V such that

for all α ∈ [0, 1]_R. Combining (8.14) and (8.16) shows

for all α ∈ [0, 1]_R. Hence α₀ = 1 is valid.⁵³

214 Chapter 8. Conjugacy and Duality


If donizgr is unbounded, we consider the restriction g^ of g to integer interval
[—fcl,fcl]z for k £ Z large enough to ensure domzg n [—fcl,fcl]z ^ 0. Then we
have gk e ^[Z —> Z] and g* e .M^Z —> Z] by the argument above. For each
x € doniz<?*, there exists kx such that g*(x) = g%.(x) for all k > kx (cf. Theorem
7.20 and Proposition 3.30). Therefore, (M^-EXC[Z]) for g*k implies (M&-EXC[Z])
for g*. Hence 5* 6 A^[Z ^ Z].
(iii) Finally, /" = / and g" = g follow from Proposition 8.11, Theorem
6.61 (2), and Theorem 7.43 (2). D

As the discrete counterpart of diagram (8.9), we obtain the following:

This follows from the discrete conjugacy theorem (Theorem 8.12) in combination
with Theorems 7.40, 6.59, 4.15, and 5.5. In addition, we can obtain the M♮-/L♮-
version of (8.17).
The conjugacy relationship among discrete convex functions is schematized in
Fig. 8.1, where M₂-convex and L₂-convex functions are defined in section 8.3. This is
the ultimate picture for the discrete conjugacy relationship, which originated in the
equivalence between the base family and the rank function of a matroid (section 2.4).
In other words, the exchange property and submodularity are conjugate to each
other at various levels.
Examples of mutually conjugate M-convex and L-convex functions are demon-
strated below for integer-valued functions defined on integer points.
In the network flow problem in section 2.2, if f_a ∈ C[Z → Z] and g_a ∈ C[Z → Z]
are conjugate for each arc a ∈ A, then

in (2.42) and (2.43) are conjugate to each other. We will dwell on the conjugacy
in network flow in section 9.6.
In a valuated matroid (V, B, ω), which arises, e.g., from a polynomial matrix
(section 2.4.2),

⁵³ An alternative proof is possible on the basis of (i)-(iii) below if (iii) is accepted as a known
fact: (i) (8.15) is equivalent to x − α(χ_u − χ_v) ∈ ∂_R ḡ(p₀), (ii) ∂_R ḡ(p₀) is an integral M♮-convex
polyhedron (Theorem 7.43 (2)), and (iii) for an integral M♮-convex polyhedron Q and x, y ∈ Q ∩ Z^V,
α₀ = 1 is valid in (B♮-EXC[R]).

(FNC = function, SET = set, PHF = positively homogeneous function)

Figure 8.1. Conjugacy in discrete convex functions.

in (2.77) and (2.78) are conjugate to each other.


If f ∈ M♮[Z → Z] and g ∈ L♮[Z → Z] are conjugate, then the restriction of f
and the projection of g to a subset U ⊆ V are conjugate to each other, and the
projection of f and the restriction of g are conjugate to each other.
If f ∈ M[Z → Z] and g ∈ L[Z → Z] are conjugate and φ_v ∈ C[Z → Z] and
ψ_v ∈ C[Z → Z] are conjugate for each v ∈ V, then

in (6.46) and (7.18) are conjugate to each other.


If f_i ∈ M♮[Z → Z] and g_i ∈ L♮[Z → Z] are conjugate for i = 1, 2, then
f₁ □_Z f₂ ∈ M♮[Z → Z] and g₁ + g₂ ∈ L♮[Z → Z] are conjugate to each other.

Note 8.13. In section 2.1 we saw the conjugacy between M♮-convex and L♮-convex
quadratic functions in real variables (Theorems 2.11 and 2.16). This conjugacy
relationship does not have a discrete counterpart. Let f : Z^n → Z be a quadratic
function represented as f(x) = xᵀAx, with a positive-definite symmetric matrix A
with integer entries, and f• : Z^n → Z be the discrete Legendre-Fenchel transform
(8.11) of f. If A satisfies (6.26) through (6.28), then f is M♮-convex and f• is
L♮-convex, but f• is not necessarily a quadratic function. Likewise, if A satisfies
(7.10), then f is L♮-convex and f• is M♮-convex, but f• is not necessarily a quadratic
function. For instance, f(x) = x², where x ∈ Z, is an M♮-convex function with

    f•(p) = ⌈p/2⌉ ⌊p/2⌋    (p ∈ Z),

which is not quadratic since f•(−1) = f•(0) = f•(1) = 0.
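The non-quadratic conjugate in this note can be verified numerically. The sketch below (our own, not from the book) brute-forces f•(p) for f(x) = x² and checks it against the closed form ⌈p/2⌉⌊p/2⌋ together with the flat spot at p ∈ {−1, 0, 1}; the window size is the only assumption.

```python
# f(x) = x^2 on Z:  f*(p) = max{ p*x - x^2 } equals ceil(p/2)*floor(p/2).
# The maximizer of p*x - x^2 lies near x = p/2, so the window [-W, W] is
# wide enough for the tested range of p.

import math

W = 50
def fstar(p):
    return max(p * x - x * x for x in range(-W, W + 1))

for p in range(-40, 41):
    assert fstar(p) == math.ceil(p / 2) * math.floor(p / 2)
assert fstar(-1) == fstar(0) == fstar(1) == 0   # flat, hence not quadratic
```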

8.2 Duality
Discrete duality theorems lie at the heart of discrete convex analysis. Major theo-
rems presented in this section are the separation theorem for M-convex/M-concave
functions (M-separation theorem), the separation theorem for L-convex/L-concave
functions (L-separation theorem), and the Fenchel-type duality theorem. These
theorems look quite similar to the corresponding theorems in convex analysis, but
they express, in fact, some deep facts of a combinatorial nature. Almost all dual-
ity results in optimization on matroids and submodular functions are corollaries of
these theorems.

8.2.1 Separation Theorems


We start by reviewing the preliminary general discussion in section 1.2. A discrete
separation theorem is a statement that, for f : Z^V → Z ∪ {+∞} and h : Z^V →
Z ∪ {−∞} belonging to certain classes of functions, if f(x) ≥ h(x) for all x ∈ Z^V,
then there exist α* ∈ Z and p* ∈ Z^V such that

    f(x) ≥ α* + ⟨p*, x⟩ ≥ h(x)    (∀x ∈ Z^V).

Denoting by f̄ the convex closure of f and by h̄ the concave closure of h (i.e., −h̄ is
the convex closure of −h), we observed the following phenomena in Examples 1.5
and 1.6:

2. f(x) ≥ h(x) (∀x ∈ Z^V) ⇏ existence of α* ∈ R and p* ∈ R^V,

3. existence of α* ∈ R and p* ∈ R^V ⇏ existence of α* ∈ Z and p* ∈ Z^V.

We will see below that all three implications hold true for M-convex and L-convex
functions. The following proposition addresses the first.

Proposition 8.14.
(1) If f, −h ∈ M♮[Z → R], then f(x) ≥ h(x) (∀x ∈ Z^V) implies f̄(x) ≥ h̄(x) (∀x ∈ R^V).
(2) If g, −k ∈ L♮[Z → R], then g(p) ≥ k(p) (∀p ∈ Z^V) implies ḡ(p) ≥ k̄(p) (∀p ∈ R^V).

Proof. (1) Theorem 6.44 with f₁ = f and f₂ = −h shows that f₁ + f₂ ≥ 0 implies
f̄₁ + f̄₂ ≥ 0.
(2) It suffices to prove the claim when g, −k ∈ L[Z → R]. Theorem 7.19
applied to g and −k shows this. □

The separation theorem for M-convex/M-concave functions reads as follows.


It should be clear that f• and h° are the convex and concave conjugate functions of
f and h defined by (8.11) and (8.12), respectively. In the proof we use the notations
∂°_R and ∂°_Z for the concave version of subdifferentials, defined as

Theorem 8.15 (M-separation theorem). Let f : Z^V → R ∪ {+∞} be an M♮-convex
function and h : Z^V → R ∪ {−∞} be an M♮-concave function such that
dom_Z f ∩ dom_Z h ≠ ∅ or dom_R f• ∩ dom_R h° ≠ ∅. If f(x) ≥ h(x) (∀x ∈ Z^V), there
exist α* ∈ R and p* ∈ R^V such that

    f(x) ≥ α* + ⟨p*, x⟩ ≥ h(x)    (∀x ∈ Z^V).        (8.20)

Moreover, if f and h are integer valued, there exist integer-valued α* ∈ Z and
p* ∈ Z^V.

Proof. We may assume f, −h ∈ M[Z → R].
(i) Suppose that dom_Z f ∩ dom_Z h ≠ ∅. For the convex closure f̄ of f and the
concave closure h̄ of h, we have f̄(x) ≥ h̄(x) (∀x ∈ R^V) by Proposition 8.14 (1).
Since dom_R f̄ ∩ dom_R h̄ ≠ ∅, the separation theorem in convex analysis (Theorem
3.5) gives α* ∈ R and p* ∈ R^V such that f̄(x) ≥ α* + ⟨p*, x⟩ ≥ h̄(x) (∀x ∈ R^V)
(see Note 8.19). This implies (8.20) since f = f̄ and h = h̄ on Z^V by Theorem 6.42.
The integrality assertion is proved from the facts that the integer subdifferential
of an integer-valued M-convex function is an L-convex set and that L-convex
sets have the property of convexity in intersection. We may assume that
inf{ f(x) − h(x) | x ∈ Z^V } = 0. Then there exists x₀ ∈ Z^V with f(x₀) − h(x₀) = 0
(by the integrality of the function value). By (6.87) and Theorem 6.61 (2) we have

which is nonempty since p* ∈ ∂_R f(x₀) ∩ ∂°_R h(x₀). Since ∂_Z f(x₀) and ∂°_Z h(x₀)
above are L-convex, convexity in intersection for L-convex sets (5.9) guarantees
the existence of an integer vector p** ∈ ∂_Z f(x₀) ∩ ∂°_Z h(x₀). With this p** and
α** = h(x₀) − ⟨p**, x₀⟩ ∈ Z, the inequality (8.20) is satisfied.
(ii) Next suppose that dom_Z f ∩ dom_Z h = ∅ and dom_R f• ∩ dom_R h° ≠ ∅. For
a fixed p₀ ∈ dom_R f• ∩ dom_R h° and for any p ∈ R^V, we have

from which follows

Since dom_Z f and dom_Z h are disjoint M-convex sets, the separation theorem for
M-convex sets (Theorem 4.21) gives p* ∈ R^V such that the right-hand side of (8.21)
with p = p* is nonnegative. With this p* and α* ∈ R such that f•(p*) ≤ −α* ≤
h°(p*), the inequality (8.20) is satisfied.
For integer-valued f and h, we have f•, −h° ∈ L[Z|R → R] and, hence,
dom_R f•, dom_R h° ∈ L₀[Z|R]. We may assume p₀ ∈ Z^V by (5.9) and p* ∈ Z^V by
Theorem 4.21. Then f•(p*) and h°(p*) are integers, and therefore we can take an
integer α* ∈ Z. □

Next we state the separation theorem for L-convex/L-concave functions.

Theorem 8.16 (L-separation theorem). Let g : Z^V → R ∪ {+∞} be an L♮-convex
function and k : Z^V → R ∪ {−∞} be an L♮-concave function such that
dom_Z g ∩ dom_Z k ≠ ∅ or dom_R g• ∩ dom_R k° ≠ ∅. If g(p) ≥ k(p) (∀p ∈ Z^V), there
exist β* ∈ R and x* ∈ R^V such that

    g(p) ≥ β* + ⟨p, x*⟩ ≥ k(p)    (∀p ∈ Z^V).        (8.22)

Moreover, if g and k are integer valued, there exist integer-valued β* ∈ Z and
x* ∈ Z^V.
Proof. We may assume g, −k ∈ L[Z → R].
(i) Suppose that dom_Z g ∩ dom_Z k ≠ ∅. For the convex closure ḡ of g and the
concave closure k̄ of k, we have ḡ(p) ≥ k̄(p) (∀p ∈ R^V) by Proposition 8.14 (2).
Since dom_R ḡ ∩ dom_R k̄ ≠ ∅, the separation theorem in convex analysis (Theorem
3.5) gives β* ∈ R and x* ∈ R^V such that ḡ(p) ≥ β* + ⟨p, x*⟩ ≥ k̄(p) (∀p ∈ R^V)
(see Note 8.19). This implies (8.22) since g = ḡ and k = k̄ on Z^V by Theorem 7.20.
The integrality assertion is proved from the facts that the integer subdifferential
of an integer-valued L-convex function is an M-convex set and that M-convex
sets have the property of convexity in intersection. We may assume that
inf{ g(p) − k(p) | p ∈ Z^V } = 0. Then there exists p₀ ∈ Z^V with g(p₀) − k(p₀) = 0
(by the integrality of the function value). By (6.87) and Theorem 7.43 (2), we have

which is nonempty since x* ∈ ∂_R g(p₀) ∩ ∂°_R k(p₀). Since ∂_Z g(p₀) and ∂°_Z k(p₀)
above are M-convex, convexity in intersection for M-convex sets (4.34) guarantees
the existence of an integer vector x** ∈ ∂_Z g(p₀) ∩ ∂°_Z k(p₀). With this x** and
β** = k(p₀) − ⟨p₀, x**⟩ ∈ Z, the inequality (8.22) is satisfied.
(ii) Next suppose that dom_Z g ∩ dom_Z k = ∅ and dom_R g• ∩ dom_R k° ≠ ∅. For
a fixed x₀ ∈ dom_R g• ∩ dom_R k° and for any x ∈ R^V, we have

from which follows

Since dom_Z g and dom_Z k are disjoint L-convex sets, the separation theorem for
L-convex sets (Theorem 5.9) gives x* ∈ R^V such that the right-hand side of (8.23)
with x = x* is nonnegative. With this x* and β* ∈ R such that g•(x*) ≤ −β* ≤ k°(x*),
the inequality (8.22) is satisfied.
For integer-valued g and k we have g•, −k° ∈ M[Z|R → R] and, hence,
dom_R g•, dom_R k° ∈ M₀[Z|R]. We may assume x₀ ∈ Z^V by (4.34) and x* ∈ Z^V by
Theorem 5.9. Then g•(x*) and k°(x*) are integers, and therefore we can take an
integer β* ∈ Z. □

As an immediate corollary of the M-separation theorem we can obtain an
optimality criterion for the problem of minimizing the sum of two M-convex functions,
which we call the M-convex intersection problem. Note that the sum of M-convex
functions is no longer M-convex and Theorem 6.26 (M-optimality criterion) does
not apply.

Theorem 8.17 (M-convex intersection theorem). For M♮-convex functions f₁, f₂ ∈
M♮[Z → R] and a point x* ∈ dom_Z f₁ ∩ dom_Z f₂, we have

    f₁(x*) + f₂(x*) ≤ f₁(x) + f₂(x)    (∀x ∈ Z^V)        (8.24)

if and only if there exists p* ∈ R^V such that

    f₁[−p*](x*) ≤ f₁[−p*](x)    (∀x ∈ Z^V),        (8.25)
    f₂[+p*](x*) ≤ f₂[+p*](x)    (∀x ∈ Z^V).        (8.26)

Conditions (8.25) and (8.26) are equivalent, respectively, to

    f₁[−p*](x*) ≤ f₁[−p*](x* − χ_u + χ_v)    (∀u, v ∈ V ∪ {0}),        (8.27)
    f₂[+p*](x*) ≤ f₂[+p*](x* − χ_u + χ_v)    (∀u, v ∈ V ∪ {0})        (8.28)

with the notation χ₀ = 0, and for such a p* we have

    argmin(f₁ + f₂) = argmin f₁[−p*] ∩ argmin f₂[+p*].        (8.29)

Moreover, if f₁ and f₂ are integer valued, i.e., f₁, f₂ ∈ M♮[Z → Z], we can choose
integer-valued p* ∈ Z^V.

Proof. The sufficiency of (8.25) and (8.26) is obvious. Conversely, suppose that
(8.24) is true and apply the M-separation theorem (Theorem 8.15) to f(x) = f₁(x)
and h(x) = f₁(x*) + f₂(x*) − f₂(x) to obtain α* and p* satisfying (8.20). We have
α* = f₁[−p*](x*) from (8.20) with x = x* and hence

This implies (8.25) and (8.26), which are equivalent to (8.27) and (8.28), respectively,
by the M-optimality criterion (Theorem 6.26 (2)). To prove (8.29) take
x ∈ argmin(f₁ + f₂). Then

which, along with (8.25) and (8.26), implies x ∈ argmin f₁[−p*] ∩ argmin f₂[+p*].
Hence follows ⊆ in (8.29), whereas ⊇ is obvious. Finally, the integrality of p* is due
to the integrality assertion in Theorem 8.15. □
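In one variable every integer-valued discrete convex function is M♮-convex, so the theorem admits a small brute-force illustration. The instance below is our own toy (the functions and search ranges are arbitrary choices, not from the book): x* minimizes f₁ + f₂, and an integer multiplier p* certifying (8.25) and (8.26) is found by enumeration.

```python
# M-convex intersection in one variable: x* minimizes f1 + f2 iff some p*
# makes x* a minimizer of both f1[-p*](x) = f1(x) - p*x and
# f2[+p*](x) = f2(x) + p*x.

dom = range(-5, 6)
f1 = lambda x: (x - 2) ** 2      # univariate discrete convex
f2 = lambda x: abs(x + 1)

xstar = min(dom, key=lambda x: f1(x) + f2(x))

def certifies(p):
    return (f1(xstar) - p * xstar == min(f1(x) - p * x for x in dom)
            and f2(xstar) + p * xstar == min(f2(x) + p * x for x in dom))

# an integer p* exists because f1 and f2 are integer valued
assert any(certifies(p) for p in range(-10, 11))
```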

Note 8.18. The assumptions on the effective domains are necessary in the separation
theorems (Theorems 8.15 and 8.16). For instance, for an M-convex function
f : Z² → Z ∪ {+∞} and an M-concave function h : Z² → Z ∪ {−∞} defined by

we have

dom_Z f ∩ dom_Z h = ∅, and dom_Z f• ∩ dom_Z h° = ∅. There exists no separating
affine function for (f, h) or (f•, h°).

Note 8.19. This is a technical supplement to the proof of the M-separation
theorem (Theorem 8.15). (A similar remark applies to the proof of the L-separation
theorem.) We applied the separation theorem to f̄ and h̄, the convex and concave
extensions of f and h, without verifying the assumption in Theorem 3.5. If f̄ and
h̄ are polyhedral, the assumption (a2) of Theorem 3.5 is met and the theorem is
literally applicable. If inf{ f(x) − h(x) | x ∈ Z^V } is attained by some x = x₀ ∈ Z^V,
we have

in which the directional derivatives f̄′(x₀; ·) and h̄′(x₀; ·) of f̄ and h̄ at x₀ are
polyhedral and Theorem 3.5 may be used for the pair of f̄(x₀) + f̄′(x₀; x − x₀) and
h̄(x₀) + h̄′(x₀; x − x₀). Otherwise we have to resort to a variant of the separation
theorem such as the following: Let f : Z^V → R ∪ {+∞} and h : Z^V → R ∪ {−∞}
be integrally convex and concave functions with dom_Z f ∩ dom_Z h ≠ ∅, and denote
their convex and concave extensions by f̄ and h̄. If f̄(x) ≥ h̄(x) (∀x ∈ R^V), then
there exist α* ∈ R and p* ∈ R^V such that f̄(x) ≥ α* + ⟨p*, x⟩ ≥ h̄(x) (∀x ∈ R^V).

Note 8.20. The original proof of the M-separation theorem is based on an algo-
rithmic argument for a generalization of the submodular flow problem involving an
M-convex cost function (Murota [142]). In particular, the argument is purely dis-
crete, not relying on the separation theorem in convex analysis. See sections 9.1.4
and 9.5. •

8.2.2 Fenchel-Type Duality Theorem


The Fenchel-type duality theorem is discussed here. Before giving a precise
statement of the theorem we explain the essence of the assertion. For any functions
f : Z^V → R ∪ {+∞} and h : Z^V → R ∪ {−∞}, we have a chain of inequalities

    inf{ f(x) − h(x) | x ∈ Z^V } ≥ inf{ f̄(x) − h̄(x) | x ∈ R^V }
        ≥ sup{ h°(p) − f•(p) | p ∈ R^V } ≥ sup{ h°(p) − f•(p) | p ∈ Z^V }        (8.30)

from the definitions (8.11) and (8.12) of conjugate functions, where f̄ and h̄ are the
convex and concave closures of f and h, respectively. We observe the following:

1. The second inequality is in fact an equality (under certain regularity
assumptions) by the Fenchel duality theorem in convex analysis (Theorem 3.6).

2. The first inequality can be strict even when / is convex extensible and h is
concave extensible, as is demonstrated by Example 1.6. A similar statement
applies to the third inequality.

The following theorem asserts that the first and second inequalities in (8.30) turn
into equalities for M♮-convex/M♮-concave functions and L♮-convex/L♮-concave functions
and that all three inequalities are equalities for such integer-valued functions.

Theorem 8.21 (Fenchel-type duality theorem).
(1) Let f : Z^V → R ∪ {+∞} be an M♮-convex function and h : Z^V → R ∪ {−∞}
be an M♮-concave function, i.e., f, −h ∈ M♮[Z → R], such that dom_Z f ∩ dom_Z h ≠ ∅
or dom_R f• ∩ dom_R h° ≠ ∅. Then we have

    inf{ f(x) − h(x) | x ∈ Z^V } = sup{ h°(p) − f•(p) | p ∈ R^V }.        (8.31)

If this common value is finite, the supremum is attained by some p ∈ dom_R f• ∩
dom_R h°.
(2) Let g : Z^V → R ∪ {+∞} be an L♮-convex function and k : Z^V → R ∪ {−∞}
be an L♮-concave function, i.e., g, −k ∈ L♮[Z → R], such that dom_Z g ∩ dom_Z k ≠ ∅
or dom_R g• ∩ dom_R k° ≠ ∅. Then we have

    inf{ g(p) − k(p) | p ∈ Z^V } = sup{ k°(x) − g•(x) | x ∈ R^V }.        (8.32)

If this common value is finite, the supremum is attained by some x ∈ dom_R g• ∩
dom_R k°.
(3) Let f : Z^V → Z ∪ {+∞} be an integer-valued M♮-convex function and
h : Z^V → Z ∪ {−∞} be an integer-valued M♮-concave function, i.e., f, −h ∈ M♮[Z → Z],
such that dom_Z f ∩ dom_Z h ≠ ∅ or dom_Z f• ∩ dom_Z h° ≠ ∅. Then we have

    inf{ f(x) − h(x) | x ∈ Z^V } = sup{ h°(p) − f•(p) | p ∈ Z^V }.        (8.33)

If this common value is finite, the infimum is attained by some x ∈ dom_Z f ∩ dom_Z h
and the supremum is attained by some p ∈ dom_Z f• ∩ dom_Z h°.
(4) Let g : Z^V → Z ∪ {+∞} be an integer-valued L♮-convex function and
k : Z^V → Z ∪ {−∞} be an integer-valued L♮-concave function, i.e., g, −k ∈ L♮[Z → Z],
such that dom_Z g ∩ dom_Z k ≠ ∅ or dom_Z g• ∩ dom_Z k° ≠ ∅. Then we have

    inf{ g(p) − k(p) | p ∈ Z^V } = sup{ k°(x) − g•(x) | x ∈ Z^V }.        (8.34)

If this common value is finite, the infimum is attained by some p ∈ dom_Z g ∩ dom_Z k
and the supremum is attained by some x ∈ dom_Z g• ∩ dom_Z k°.

Proof. (1) Suppose that dom_Z f ∩ dom_Z h ≠ ∅. By (8.30) we may assume that
A = inf{ f(x) − h(x) | x ∈ Z^V } is finite. By the M-separation theorem (Theorem
8.15) for (f − A, h), there exist α* ∈ R and p* ∈ R^V such that

    f(x) − A ≥ α* + ⟨p*, x⟩ ≥ h(x)

for all x ∈ Z^V, which implies h°(p*) − f•(p*) ≥ A. Combining this with (8.30)
shows (8.31) as well as the attainment of the supremum by p*. Next suppose that
dom_Z f ∩ dom_Z h = ∅ and dom_R f• ∩ dom_R h° ≠ ∅. The separation theorem for
M-convex sets (Theorem 4.21) applied to B₁ = dom_Z h and B₂ = dom_Z f gives
p* ∈ {0, ±1}^V satisfying (4.33). Putting p = p₀ + cp* in (8.21) (within the proof
of the M-separation theorem) and letting c → +∞, we obtain sup = +∞ in (8.31),
whereas inf = +∞ by dom_Z f ∩ dom_Z h = ∅.
(2) (The proof goes in parallel with (1).) Suppose that dom_Z g ∩ dom_Z k ≠ ∅.
By (8.30) we may assume that A = inf{ g(p) − k(p) | p ∈ Z^V } is finite. By the
L-separation theorem (Theorem 8.16) for (g − A, k), there exist β* ∈ R and x* ∈ R^V
such that

    g(p) − A ≥ β* + ⟨p, x*⟩ ≥ k(p)

for all p ∈ Z^V, which implies k°(x*) − g•(x*) ≥ A. Combining this with (8.30)
shows (8.32) as well as the attainment of the supremum by x*. Next suppose
that dom_Z g ∩ dom_Z k = ∅ and dom_R g• ∩ dom_R k° ≠ ∅. The separation theorem
for L-convex sets (Theorem 5.9) applied to D₁ = dom_Z k and D₂ = dom_Z g gives
x* ∈ {0, ±1}^V satisfying (5.10). Putting x = x₀ + cx* in (8.23) (within the proof
of the L-separation theorem) and letting c → +∞, we obtain sup = +∞ in (8.32),
whereas inf = +∞ by dom_Z g ∩ dom_Z k = ∅.
(3) In the proof of (1) we can take α* ∈ Z, p* ∈ Z^V, and c ∈ Z. The supremum
and infimum for finite (8.33) are attained since the functions are integer valued.
(4) In the proof of (2) we can take β* ∈ Z, x* ∈ Z^V, and c ∈ Z. The
supremum and infimum for finite (8.34) are attained since the functions are integer
valued. □

The M-separation and L-separation theorems are parallel or conjugate in their
statements as well as in their proofs. In contrast, the Fenchel-type duality theorem
for integer-valued functions is self-conjugate in that the substitution of f = g• and
h = k° into (8.33) results in (8.34) by virtue of g = g•• and k = k°°. To emphasize
the parallelism we have proved the M-separation theorem and the L-separation
theorem independently and derived the Fenchel-type duality theorem therefrom. It
is noted, however, that, with the knowledge of M-/L-conjugacy, these three duality
theorems are almost equivalent to one another; once one of them is established, the
other two can be derived by relatively easy formal calculations.
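As a sanity check (our own, not from the book), both sides of (8.33) can be compared by brute force in a univariate toy instance; univariate discrete convex and concave functions are M♮-convex and M♮-concave, and the finite windows below are ad hoc but wide enough.

```python
# Verify inf{ f(x) - h(x) } = sup{ h°(p) - f*(p) }  (8.33) by enumeration.

dom = range(-5, 6)
f = lambda x: x * x              # integer-valued discrete convex
h = lambda x: -abs(x - 1)        # integer-valued discrete concave

fstar = lambda p: max(p * x - f(x) for x in dom)   # convex conjugate (8.11)
hcirc = lambda p: min(p * x - h(x) for x in dom)   # concave conjugate (8.12)

primal = min(f(x) - h(x) for x in dom)
dual = max(hcirc(p) - fstar(p) for p in range(-20, 21))
assert primal == dual   # no duality gap, and both sides are integers
```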

Note 8.22. In Theorem 8.21 (1) the infimum is not necessarily attained by any
x ∈ Z^V (and similarly for (2)). For example, consider f : Z → R ∪ {+∞} and
h : Z → R ∪ {−∞} defined by

which are M♮-convex and M♮-concave, respectively. We have dom_Z f = dom_Z h =
Z₊, dom_R f• = (−∞, 0]_R, dom_R h° = [0, +∞)_R, and inf = sup = 0 in (8.31).
However, no x attains the infimum, whereas the supremum is attained by p = 0.

Note 8.23. The assumptions on the effective domains are necessary in Theorem
8.21. For the M-convex and M-concave functions f and h in Note 8.18, we have
dom_Z f ∩ dom_Z h = ∅ and dom_Z f• ∩ dom_Z h° = ∅. The identity (8.33) fails, with
infimum = +∞ and supremum = −∞.

Figure 8.2. Duality theorems (f: M♮-convex function, h: M♮-concave function).

8.2.3 Implications
In spite of the apparent similarity to the corresponding theorems in convex analysis,
the discrete duality theorems established above convey deep combinatorial proper-
ties of M-convex and L-convex functions. We now demonstrate this by deriving
major duality results in optimization on matroids and submodular functions as im-
mediate corollaries of these theorems (see also Fig. 8.2). The connection to the
duality in network flow problems is discussed in Chapter 9.

Example 8.24. Frank's discrete separation theorem (Theorem 4.17) is a special
case of the L-separation theorem (Theorem 8.16). By Proposition 7.4, the submodular
and supermodular set functions ρ and μ can be identified, respectively,
with an L♮-convex function g : Z^V → R ∪ {+∞} with dom_Z g ⊆ {0, 1}^V and an
L♮-concave function k : Z^V → R ∪ {−∞} with dom_Z k ⊆ {0, 1}^V by ρ(X) = g(χ_X)
and μ(X) = k(χ_X) for X ⊆ V. The L-separation theorem applies to (g, k) since
the first assumption, dom_Z g ∩ dom_Z k ≠ ∅, is met by g(0) = k(0) = 0, which follows
from ρ(∅) = μ(∅) = 0. We see β* = 0 from the inequality (8.22) for p = 0, and
then the desired inequality (4.27) is obtained from (8.22) with p = χ_X for X ⊆ V.
When ρ and μ are integer valued, g and k are also integer valued, and the integrality
assertion in the L-separation theorem implies the integrality assertion in Theorem
4.17.
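A tiny numerical companion to this example (our own toy data): for a submodular ρ and a supermodular μ with ρ ≥ μ on V = {0, 1}, enumeration finds an integral vector x* with ρ(X) ≥ x*(X) ≥ μ(X) for all X, as Frank's theorem guarantees.

```python
# Frank's discrete separation on V = {0, 1}, checked by enumeration.

from itertools import product

subsets = [frozenset(s) for s in ([], [0], [1], [0, 1])]
rho = {frozenset(): 0, frozenset({0}): 2, frozenset({1}): 2,
       frozenset({0, 1}): 3}                       # submodular
mu = {frozenset(): 0, frozenset({0}): 1, frozenset({1}): 1,
      frozenset({0, 1}): 3}                        # supermodular

assert all(rho[X] >= mu[X] for X in subsets)

separating = [x for x in product(range(-4, 5), repeat=2)
              if all(rho[X] >= sum(x[v] for v in X) >= mu[X]
                     for X in subsets)]
assert separating   # e.g. (1, 2) and (2, 1) separate rho and mu
```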

Example 8.25. Edmonds's intersection theorem (Theorem 4.18) in the integral


case is a special case of the Fenchel-type duality theorem (Theorem 8.21 (3)). This
is explained in Example 1.20. •

Example 8.26. The Fenchel-type duality theorem for submodular set functions is a
special case of the Fenchel-type duality theorem for L♮-convex functions (Theorem
8.21 (2), (4)). The conjugate functions of a submodular set function ρ : 2^V →
R ∪ {+∞} and a supermodular set function μ : 2^V → R ∪ {−∞} (i.e., ρ, −μ ∈ S[R])
are defined by

The Fenchel-type duality theorem for submodular set functions is an identity

with an additional integrality assertion that, for integer-valued ρ and μ, the maximum
on the right-hand side of (8.35) can be attained by an integer vector x ∈ Z^V.
As in Example 8.24, we consider an L♮-convex function g and an L♮-concave function
k associated with ρ and μ. We have g• = ρ•, k° = μ°, and dom_Z g ∩ dom_Z k ≠ ∅,
and, therefore, (8.35) is obtained as a special case of (8.32) and (8.34).

Example 8.27. Frank's weight-splitting theorem for the weighted matroid intersection
problem is a special case of the optimality criterion for the M-convex
intersection problem (Theorem 8.17). Given two matroids (V, B₁) and (V, B₂) on
a common ground set V with base families B₁ and B₂, as well as a weight vector
w : V → R, the optimal common base problem is to find B ∈ B₁ ∩ B₂ that
minimizes the weight w(B) = Σ_{v∈B} w(v). Frank's weight-splitting theorem says
that a common base B* ∈ B₁ ∩ B₂ is optimal if and only if there exist real vectors
w₁*, w₂* : V → R such that
(i) w = w₁* + w₂*,
(ii) B* is a minimum-weight base of (V, B₁) with respect to w₁*, and
(iii) B* is a minimum-weight base of (V, B₂) with respect to w₂*.
In addition, the theorem states that, if w is integer valued, the vectors w₁* and w₂*
can be chosen to be integer valued. The combinatorial content of this theorem lies
in the assertion about the existence of an integer weight splitting in the case of
integer-valued weight. Applying Theorem 8.17 to a pair of M-convex functions

yields p* satisfying (8.25) and (8.26) with additional integrality in the case of
integer-valued w. A weight splitting constructed by

has the properties (ii) and (iii) because of (8.25) and (8.26). In Example 1.21 we
derived the weight-splitting theorem (integer-weight case) from the M-separation
theorem (Theorem 8.15).
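For intuition, the weight splitting can be found by exhaustive search in a tiny instance. Everything below is our own toy data (two partition matroids on four elements; the integer search box for w₁ is an arbitrary choice):

```python
# Frank's weight splitting, brute-forced on two partition matroids.

from itertools import product

B1 = [{0, 2}, {0, 3}, {1, 2}, {1, 3}]  # one element from {0,1}, one from {2,3}
B2 = [{0, 1}, {0, 3}, {1, 2}, {2, 3}]  # one element from {0,2}, one from {1,3}
w = [3, 1, 4, 1]

weight = lambda wt, B: sum(wt[v] for v in B)

common = [B for B in B1 if B in B2]                  # common bases
Bstar = min(common, key=lambda B: weight(w, B))      # optimal: {0, 3}

found = None
for w1 in product(range(-2, 3), repeat=4):           # small integer box
    w2 = [w[v] - w1[v] for v in range(4)]
    if (weight(w1, Bstar) == min(weight(w1, B) for B in B1)
            and weight(w2, Bstar) == min(weight(w2, B) for B in B2)):
        found = (list(w1), w2)
        break
assert found is not None   # an integer splitting exists, as the theorem asserts
```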

Example 8.28. Suppose we are given two valuated matroids (V, ω₁) and (V, ω₂)
as well as a weight vector w : V → R. The valuated matroid intersection problem
is to find B ⊆ V that maximizes w(B) + ω₁(B) + ω₂(B). The weight-splitting
theorem for valuated matroid intersection says that a common base B* maximizes
w(B) + ω₁(B) + ω₂(B) if and only if there exist real vectors w₁*, w₂* : V → R
such that
(i) w = w₁* + w₂*,
(ii) B* maximizes ω₁[w₁*], and
(iii) B* maximizes ω₂[w₂*],
where ω₁[w₁*] and ω₂[w₂*] are defined by (2.76). In addition, the theorem states that,
if ω₁, ω₂, and w are all integer valued, the vectors w₁* and w₂* can be chosen to be
integer valued. Let f₁ and f₂ be the M-convex functions associated, respectively,
with ω₁ and ω₂ by (2.77). Maximizing w(B) + ω₁(B) + ω₂(B) is equivalent to
minimizing f₁(x) + f₂[−w](x), and a desired weight splitting can be obtained from
the M-convex intersection theorem (Theorem 8.17) as in Example 8.27.

It is emphasized again that the discrete duality theorems are of combinatorial


nature and cannot be obtained through mere combination of the convex-extensibility
theorem (Theorems 6.42 and 7.20) with the separation theorem (Theorem 3.5) or the
Fenchel duality theorem (Theorem 3.6) for (ordinary) convex functions. Examples
1.5 and 1.6 should be convincing enough to demonstrate this point.

8.3 M₂-Convex Functions and L₂-Convex Functions


Two additional classes of discrete functions, called M₂-convex functions and
L₂-convex functions, are considered here. An M₂-convex function is a function
representable as the sum of two M-convex functions, and an L₂-convex function is the
integer infimal convolution of two L-convex functions. These functions play crucial
roles in combinatorial optimization. In Edmonds's intersection theorem (Theorem
4.18), for example, the left-hand side of the min-max relation (4.29) corresponds to
M₂-convexity and the right-hand side to L₂-convexity.

8.3.1 M₂-Convex Functions


A function f : Z^V → R ∪ {+∞} with dom f ≠ ∅ is said to be M₂-convex if it can
be represented as the sum of two M-convex functions, i.e., if f = f₁ + f₂ for some
f₁, f₂ ∈ M[Z → R]. We denote by M₂[Z → R] the set of M₂-convex functions
and by M₂[Z → Z] the subclass of M₂-convex functions f = f₁ + f₂ with some
f₁, f₂ ∈ M[Z → Z]. An M₂♮-convex function is defined similarly as the sum of two
M♮-convex functions, which is obtained as the projection of an M₂-convex function.
The notations M₂♮[Z → R] and M₂♮[Z → Z] are defined in an obvious way. We have

Note that a set is M₂-convex (resp., M₂♮-convex) if and only if its indicator function
is M₂-convex (resp., M₂♮-convex).
The effective domain and the set of minimizers of an M₂-convex function are
M₂-convex sets; the latter is a consequence of the M-convex intersection theorem
(Theorem 8.17).

Proposition 8.29.
(1) For an M₂-convex function f, dom f is M₂-convex.
(2) For an M₂♮-convex function f, dom f is M₂♮-convex.

Proof. This follows from the relation dom(f₁ + f₂) = dom f₁ ∩ dom f₂ and the M-
or M♮-convexity of dom fᵢ (i = 1, 2) given in Proposition 6.7. □

Proposition 8.30.
(1) For an M₂-convex function f, argmin f is M₂-convex if it is not empty.
(2) For an M₂♮-convex function f, argmin f is M₂♮-convex if it is not empty.

Proof. This follows from argmin(f₁ + f₂) = argmin f₁[−p*] ∩ argmin f₂[+p*] in
(8.29) and the M- or M♮-convexity of argmin f₁[−p*] and argmin f₂[+p*] given in
Proposition 6.29. □

M₂-convexity implies integral convexity.

Theorem 8.31. An M₂♮-convex function is integrally convex. In particular, an
M₂♮-convex set is integrally convex.

Proof. For f = f₁ + f₂ with f₁, f₂ ∈ M♮[Z → R] and x ∈ R^V, Theorem 6.44
implies

where f̃, f̃₁, and f̃₂ are the local convex extensions (3.61) of f, f₁, and f₂, respectively,
and f̄₁ and f̄₂ are the convex closures (3.56) of f₁ and f₂. Since f̄₁ + f̄₂ is
convex, so is f̃. □

For the minimality of an M₂-convex function we have the following criterion.

Theorem 8.32 (M₂-optimality criterion). For an M₂-convex function f ∈ M₂[Z → R]
and x ∈ dom f, we have

Proof. By Theorem 8.31 the optimality criterion for an integrally convex function
(Theorem 3.21) applies. We may impose the condition d(V) = 0 because x(V) is
constant for any x ∈ dom f. □

The optimality criterion above is not suitable for polynomial-time verification.
If the summands f₁ and f₂ in f = f₁ + f₂ are known, the minimality can be verified
in polynomial time by the following criterion, as will be explained in Note 9.21. We
mention that the M-convex intersection theorem (Theorem 8.17) also serves as an
optimality criterion for M₂-convex functions when the summands are known.

Theorem 8.33 (M₂-optimality criterion). For M-convex functions f₁, f₂ ∈ M[Z → R]
and a point x ∈ dom f₁ ∩ dom f₂, we have

if and only if

for any u₁, ..., u_k, v₁, ..., v_k ∈ V with {u₁, ..., u_k} ∩ {v₁, ..., v_k} = ∅, where
u_{k+1} = u₁ by convention.

Proof. The proof is given later in Note 9.21. □

A scaling version of the optimality criterion above leads to a proximity theorem
for M₂-convex functions.

Theorem 8.34 (M₂-proximity theorem). Let f₁, f₂ ∈ M[Z → R] be M-convex
functions, and assume α ∈ Z₊₊ and n = |V|. If x^α ∈ dom f₁ ∩ dom f₂ satisfies

for any u₁, ..., u_k, v₁, ..., v_k ∈ V with {u₁, ..., u_k} ∩ {v₁, ..., v_k} = ∅, where
u_{k+1} = u₁, then argmin(f₁ + f₂) ≠ ∅ and there exists x* ∈ argmin(f₁ + f₂) with

Proof. See Murota-Tamura [162]. □

Straightforward calculations based on the M-convex intersection theorem yield
the following two theorems.

Theorem 8.35.

Proof. Using the M-convex intersection theorem (Theorem 8.17), we see that

For (2) and (3), note that ∂_R fᵢ(x) ∈ L♮[Z|R] from Theorem 6.61 (2). □

where □_Z denotes the integer infimal convolution (6.43) and • is the discrete
Legendre-Fenchel transformation (8.11)_Z. A relation conjugate to this also holds
for M-convex functions as follows.

Theorem 8.36. For integer-valued M♮-convex functions f₁, f₂ ∈ M♮[Z → Z] with
dom f₁ ∩ dom f₂ ≠ ∅, we have (f₁ + f₂)• = f₁• □_Z f₂• and (f₁ + f₂)•• = f₁ + f₂.

Proof. For p ∈ dom (f₁ + f₂)• there exists q ∈ Z^V such that

    (f₁ + f₂)•(p) = f₁•(q) + f₂•(p − q)

by the M-convex intersection theorem (Theorem 8.17). This shows that (f₁ + f₂)• ≥
f₁• □_Z f₂•, whereas ≤ is obvious from ⟨p, x⟩ − f₁(x) − f₂(x) ≤ f₁•(q) + f₂•(p − q).
The second identity follows from the first because of (8.38) and Theorem 8.12. □
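The identity can be confirmed by brute force in one variable (our own sketch; univariate integer-valued discrete convex functions are M♮-convex, and the q-window is an ad hoc choice that is wide enough for the tested p):

```python
# Check (f1 + f2)*(p) = min_q { f1*(q) + f2*(p - q) }  (integer infimal
# convolution of the conjugates) for small p, with conjugates computed by
# enumeration over the bounded effective domain.

dom = range(-4, 5)
f1 = lambda x: x * x
f2 = lambda x: abs(x - 1)

conj = lambda f: (lambda p: max(p * x - f(x) for x in dom))
f1s, f2s, f12s = conj(f1), conj(f2), conj(lambda x: f1(x) + f2(x))

qs = range(-40, 41)   # window for the convolution variable
for p in range(-5, 6):
    assert f12s(p) == min(f1s(q) + f2s(p - q) for q in qs)
```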

8.3.2 L2-Convex Functions


A function g : Zv —> R U {+00} is said to be L-2-convex if it can be represented
as the infimal convolution of two L-convex functions, i.e., if g = <?iClz52 for some
(?i,<72 G £[Z —»• R.]- We denote by £ 2 [Z —> R] the set of L2-convex functi
and by £2[Z —> Z] the subclass of L2-convex functions 5 = ffidz^ with some
<?i) 52 € £[Z —> Z]. An L^-convex function is defined similarly as the integer infimal
convolution of two lAconvex functions, which is obtained as the restriction of an
L2-convex function. The notations C\[Z —> R] and jC^[Z —> Z] are defined in n
obvious way. We have

Note that a set is L2-convex (resp., L2-convex) if and only if its indicator function
is L2-convex (resp., L^-convex).

Note 8.37. Here is a technical supplement concerning the definition of an L2-convex function. By definition, a function g : Z^V → R ∪ {+∞} is L2-convex if it can be represented as

for some g1, g2 ∈ L[Z → R]. We may assume that the infimum is attained for each p ∈ dom g. Namely, it is known that for an L2-convex function g there exist L-convex functions g1, g2 ∈ L[Z → R] such that

As an example, consider a pair of L-convex functions in two variables:

The infimal convolution g = g1 □Z g2 is identically zero, with the infimum in (8.40) unattained. An obvious valid choice for (8.41) is g1 = g2 = 0 (identically). ■

Note 8.38. For g1, g2 ∈ L[Z → R], it can be shown that (g1 □Z g2)(p0) = −∞ (∃ p0) ⟹ (g1 □Z g2)(p) = −∞ (∀ p ∈ dom g1 + dom g2). ■

The effective domain and the set of minimizers of an L2-convex function are L2-convex sets.

Proposition 8.39.
(1) For an L2-convex function g, dom g is L2-convex.
(2) For an L♮2-convex function g, dom g is L♮2-convex.

Proof. This follows from the relation dom (g1 □Z g2) = dom g1 + dom g2 and the L- or L♮-convexity of dom gi (i = 1, 2) given in Proposition 7.8. □

Proposition 8.40.
(1) For an L2-convex function g, arg min g is L2-convex if it is not empty.
(2) For an L♮2-convex function g, arg min g is L♮2-convex if it is not empty.

Proof. By (8.41) this follows from a general fact in Proposition 8.41 below and the L- or L♮-convexity of arg min gi (i = 1, 2) given in Proposition 7.16. □

Proposition 8.41. If g1, g2 : Z^V → R ∪ {+∞} are such that

then we have

Proof. It suffices to prove

since the converse inclusion is always true, independently of (8.42). Take p* ∈ arg min(g1 □Z g2)[−x]. By (8.42) there exist p1* and p2* such that p* = p1* + p2* and (g1 □Z g2)(p*) = g1(p1*) + g2(p2*). If g1[−x](p1*) > g1[−x](p1) for some p1, we would have

a contradiction to the choice of p*. Hence p1* ∈ arg min g1[−x]. Similarly, we have p2* ∈ arg min g2[−x]. □

The assumption (8.42) above is necessary for the identity (8.43) to hold. For the functions g1 and g2 in Note 8.37, for instance, we have arg min(g1 □Z g2) = Z^2 but arg min g1 = arg min g2 = ∅. For integer-valued functions, however, (8.42) is always satisfied.
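The set identity of Proposition 8.41 holds for arbitrary functions once the infimum in the convolution is attained, so it can be sanity-checked numerically. The sketch below uses two arbitrarily chosen convex functions on Z^2 (they are illustrative only, not taken from the text) and verifies the identity by brute force on a finite window.

```python
from itertools import product

# Sanity check of the set identity behind Proposition 8.41: when the
# infimum in (g1 [Z] g2)(p) = min{ g1(q) + g2(p - q) } is attained,
#   argmin(g1 [Z] g2) = argmin g1 + argmin g2.
# g1, g2 are arbitrary illustrative convex functions on Z^2.

def g1(p):
    return abs(p[0]) + abs(p[0] - p[1])

def g2(p):
    return p[0] ** 2 + p[1] ** 2

W = [(i, j) for i in range(-4, 5) for j in range(-4, 5)]  # window of Z^2

def conv(p):
    # integer infimal convolution, inner minimization restricted to W
    return min(g1(q) + g2((p[0] - q[0], p[1] - q[1])) for q in W)

def argmin(f, pts):
    m = min(map(f, pts))
    return {p for p in pts if f(p) == m}

P = [(i, j) for i in range(-2, 3) for j in range(-2, 3)]
lhs = argmin(conv, P)
rhs = {(a[0] + b[0], a[1] + b[1])
       for a in argmin(g1, P) for b in argmin(g2, P)}
print(lhs == rhs)  # True: both sides equal {(0, 0)}
```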
L2-convexity implies integral convexity.

Theorem 8.42. An L2-convex function is integrally convex. In particular, an L2-convex set is integrally convex.

Proof. It suffices to consider L2-convex sets and functions. To emphasize the essence we give a proof for the integral convexity of an L2-convex set. This implies, by Theorem 3.29, the integral convexity of an L2-convex function g with a bounded effective domain, since arg min g[−x] is an L2-convex set for any x ∈ R^V by Proposition 8.40. A complete proof can be found in Murota-Shioura [153].

Let S be an L2-convex set represented as S = D1 + D2 with D1, D2 ∈ L0[Z]. We will show that every point p of the convex hull of S lies in the convex hull of S ∩ N(p) (see (3.71)). Since the convex hull of S is the sum of the convex hulls of D1 and D2, we have p = p1 + p2 for some p1 in the convex hull of D1 and p2 in the convex hull of D2. Put a1 = p1 − ⌊p1⌋ and a2 = ⌈p2⌉ − p2, where 0 ≤ ak(v) < 1 for k = 1, 2 and v ∈ V. Denoting the distinct values among {a1(v), a2(v) | v ∈ V} by α1 > α2 > ⋯ > αm (≥ 0) and defining Uki = {v ∈ V | ak(v) ≥ αi} for k = 1, 2 and i = 1, ..., m, we have

and, hence,

where α0 = 1, α_{m+1} = 0, and U10 = U20 = ∅. This implies that p lies in the convex hull of S ∩ N(p), since qi = ⌊p1⌋ + χ_{U1i} + ⌈p2⌉ − χ_{U2i} belongs to S ∩ N(p) for i = 0, 1, ..., m, as shown below.

[Proof of qi ∈ S] We have ⌊p1⌋ + χ_{U1i} ∈ D1 by Theorem 5.10 for p1 in the convex hull of D1. Since −p2 lies in the convex hull of −D2, we similarly see ⌈p2⌉ − χ_{U2i} ∈ D2. Hence qi ∈ D1 + D2 = S.

[Proof of qi ∈ N(p)] We are to show

we have

If p(v) ∈ Z, (8.46) shows χ_{U1i}(v) = χ_{U2i}(v), which implies (8.44). Suppose p(v) ∉ Z. We put

and divide into two cases: (i) v ∈ W and (ii) v ∈ V \ W. In case (i), let i be such that v ∈ U1i. Then v ∈ U2i follows from

Therefore, −1 ≤ χ_{U1i}(v) − χ_{U2i}(v) ≤ 0, which implies (8.45). In case (ii), let i be such that v ∈ U2i. Then v ∈ U1i follows from

Therefore, 0 ≤ χ_{U1i}(v) − χ_{U2i}(v) ≤ 1, which implies (8.45). □

For the minimality of an L2-convex function, we have the following criterion.

Theorem 8.43 (L2-optimality criterion). For an L2-convex function g ∈ L2[Z → R] and p ∈ dom g, we have

Proof. By Theorem 8.42 this is obtained as a special case of the optimality criterion for an integrally convex function (Proposition 3.22). □

A scaling version of the optimality criterion above leads to a proximity theorem


for L2-convex functions.

Theorem 8.44 (L2-proximity theorem). Let g : Z^V → R ∪ {+∞} be an L2-convex function such that g(p) = g(p + 1) (∀ p ∈ Z^V), and assume α ∈ Z++ and n = |V|. If pα ∈ dom g satisfies

then arg min g ≠ ∅ and there exists p* ∈ arg min g with

Proof. Let g be represented as (8.41) with L-convex functions g1 and g2, where gi(p + 1) = gi(p) (∀ p) for i = 1, 2 as a consequence of g(p) = g(p + 1) (∀ p). There exist p1^α, p2^α ∈ Z^V such that g(pα) = g1(p1^α) + g2(p2^α) and pα = p1^α + p2^α. For any Y ⊆ V we have

by the definition of infimal convolution, whereas g1(p1^α) + g2(p2^α) = g(pα) ≤ g(pα + α χ_Y) by (8.49). Hence

By the L-proximity theorem (Theorem 7.18) there exist p1* ∈ arg min g1 and p2* ∈ arg min g2 such that

Then p* = p1* + p2* satisfies (8.50). Moreover, p* is a minimizer of g because p1* ∈ arg min g1 and p2* ∈ arg min g2. □

The following two theorems are the counterparts of Theorems 8.35 and 8.36.

Theorem 8.45.
(1) For g1, g2 ∈ L[Z → R] with g1 □Z g2 > −∞ and (8.42), and p ∈ dom(g1 □Z g2), there exist pi ∈ dom gi (i = 1, 2) such that p = p1 + p2 and

(2) For g1, g2 ∈ L[Z → Z] with g1 □Z g2 > −∞ and p ∈ dom(g1 □Z g2), there exist pi ∈ dom gi (i = 1, 2) such that p = p1 + p2 and

(3) For g ∈ L♮2[Z → Z] and p ∈ dom g, ∂Z g(p) is an M♮2-convex set. For g ∈ L2[Z → Z] and p ∈ dom g, ∂Z g(p) is an M2-convex set.

Proof. Recall the relation x ∈ ∂R(g1 □Z g2)(p) ⟺ p ∈ arg min(g1 □Z g2)[−x] from (3.30), and use (8.43) and ∂R g1(p1) ∈ M0[Z|R], which is a variant of Theorem 7.43 (2). □

Theorem 8.46. For integer-valued L-convex functions g1, g2 ∈ L[Z → Z] with g1 □Z g2 > −∞, we have (g1 □Z g2)** = g1 □Z g2, where * means the discrete Legendre-Fenchel transformation (8.11)_Z.

Proof. Applying Theorem 8.36 to fi = gi* ∈ M[Z → Z] shows this. □

Note 8.47. An L2-convex function g(p) = min{g1(q) + g2(p − q) | q ∈ Z^V}, represented as in (8.41), can be evaluated efficiently, since g1(q) + g2(p − q) is an L-convex function in q, to which the minimization algorithms in section 10.3 can be applied. ■
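The evaluation scheme of Note 8.47 can be sketched as follows: for a fixed p, the inner function h(q) = g1(q) + g2(p − q) is minimized by a naive steepest-descent loop over steps ±χ_X, the idea behind the L-convex minimization algorithms of section 10.3. The concrete g1 and g2 below are illustrative L♮-convex choices, not taken from the text.

```python
from itertools import product

# Sketch of Note 8.47: evaluate g(p) = min_q { g1(q) + g2(p - q) } by
# steepest descent on the inner function h(q), moving by +/- the
# characteristic vector of a subset X (cf. section 10.3).
# g1, g2 are illustrative L-natural-convex functions on Z^2.

def g1(q):
    return q[0] ** 2 + q[1] ** 2 + max(q[0] - q[1], 0)

def g2(q):
    return abs(q[0]) + abs(q[1]) + max(q[1] - q[0], 0)

def steepest_descent(h, q):
    # at each step, try q + eps * chi_X over all subsets X and eps = +/-1
    while True:
        best_val, best_q = h(q), q
        for eps in (1, -1):
            for bits in product((0, 1), repeat=len(q)):
                cand = tuple(qi + eps * b for qi, b in zip(q, bits))
                if h(cand) < best_val:
                    best_val, best_q = h(cand), cand
        if best_q == q:
            return q, best_val
        q = best_q

def g(p):
    h = lambda q: g1(q) + g2((p[0] - q[0], p[1] - q[1]))
    return steepest_descent(h, (0, 0))[1]

print(g((2, 2)))  # 4, attained e.g. at q = (1, 1)
```

For L♮-convex h a local minimum with respect to these steps is a global minimum, which is what makes this simple loop correct.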

8.3.3 Relationship
The relationship between M2- and L2-convex functions is discussed here. The first theorem shows the conjugacy relationship between M2- and L2-convex functions.

Theorem 8.48. The two classes of functions M2[Z → Z] and L2[Z → Z] are in one-to-one correspondence under the discrete Legendre-Fenchel transformation (8.11)_Z, and similarly for M♮2[Z → Z] and L♮2[Z → Z].

Proof. This is due to (8.38) and Theorems 8.36, 8.46, and 8.12. □

Separable convex functions are characterized as functions possessing both M2-convexity and L2-convexity.

Theorem 8.49. For a function f : Z^V → R ∪ {+∞}, we have

f is M2-convex and L2-convex ⟺ f is M♮2-convex and L♮2-convex ⟺ f is separable convex.

Proof. It suffices to show that, if f is both M♮2-convex and L♮2-convex, then it is separable convex. We may assume that dom f is bounded. Take any p ∈ R^V. By Propositions 8.30 and 8.40, the set arg min f[−p] is both M♮2-convex and L♮2-convex, and therefore it is an integer interval. This means that f is a separable convex function. □

8.4 Lagrange Duality for Optimization


8.4.1 Outline
On the basis of the conjugacy and duality theorems we can develop a Lagrange
duality theory for a (nonlinear) integer program:

where c : Z^V → Z ∪ {+∞} and ∅ ≠ B ⊆ Z^V. The canonical "convex" case consists of problems in which

(REG) B is an M-convex set, and


(OBJ) c is an M-convex function.
We refer to a problem with (REG) and (OBJ) as an M-convex program.
We follow Rockafellar's conjugate duality approach [177] to convex/nonconvex
programs in nonlinear optimization. The whole scenario of the present section is
a straightforward adaptation of it, whereas the technical development leading to
a strong duality assertion for "convex" programs relies heavily on fundamental
theorems of a combinatorial nature. An adaptation of the Lagrangian function
in nonlinear programming affords a duality framework that covers "nonconvex"
programs. We follow the notation of [177] to emphasize the parallelism.
In the canonical "convex" case, the problem dual to P turns out to be a
maximization of an L2-concave function, where the strong duality holds between the
pair of primal/dual problems. This is a consequence of the conjugacy between M2-
and L2-convexity and the Fenchel-type duality theorem for M-/L-convex functions.
In the literature of integer programming we can find a number of duality
frameworks, such as the subadditive duality. The present approach is distinguished
from those in the following ways:
1. It is primarily concerned with nonlinear objective functions.
2. The theory parallels the perturbation-based duality formalism in nonlinear
programming.
3. In particular, the dual problem is derived from an embedding of the given
problem in a family of perturbed problems with a certain convexity in the
direction of perturbation.
4. It identifies M-convex programs as the well-behaved core structure to be com-
pared to convex programs in nonlinear programming.

8.4.2 General Duality Framework


We describe the general framework, in which neither (REG) nor (OBJ) is assumed. First we rewrite the problem P as follows:

with

where δB : Z^V → {0, +∞} is the indicator function of B. We say that the problem P is feasible if f(x) < +∞ for some x ∈ Z^V.

Next we embed the optimization problem P in a family of perturbed problems. As the perturbation of f we consider F : Z^V × Z^U → Z ∪ {+∞}, with U being a finite set, such that

Here the second condition (8.55) means that the integer biconjugate of F(x, u) as a function in u for each fixed x coincides with F(x, u) itself.

Note 8.50. By Proposition 8.11, the condition (8.55) is satisfied if, for each x, either F(x, ·) ≡ +∞ or the integer subdifferential of F(x, u) with respect to u is nonempty for each u ∈ dom F(x, ·). Recall that an integer-valued M♮2- or L♮2-convex function has a nonempty integer subdifferential (Theorems 8.35 and 8.45). ■

The resulting family of optimization problems, parametrized by u ∈ Z^U, reads as follows:

We define the optimal value function φ : Z^U → Z ∪ {±∞} by

and the Lagrangian function K : Z^V × Z^U → Z ∪ {±∞} by

For each x ∈ Z^V, the function K(x, ·) : y ↦ K(x, y) is the concave discrete Legendre-Fenchel transform of the function −F(x, ·) : u ↦ −F(x, u).
Our assumptions (8.54) and (8.55) on F(x,u) guarantee the following.

Proposition 8.51.
(1) F(x, u) = sup{K(x, y) − ⟨u, y⟩ | y ∈ Z^U}.
(2) f(x) = sup{K(x, y) | y ∈ Z^U} (x ∈ Z^V).

Proof. (1) Abbreviate F(x, u) and K(x, y) to F(u) and K(y), respectively. We have F*(y) = −K(−y) by (8.58), while F(u) = F**(u) by (8.55). Therefore,

(2) This follows from (1) with u = 0 and (8.54). □

We define the dual problem to P as follows:

where the objective function g : Z^U → Z ∪ {±∞} is defined by

We say that the problem D is feasible if g(y) > −∞ for some y ∈ Z^U. If the problem P is feasible, we have g(y) < +∞ for all y ∈ Z^U, since g(y) ≤ K(x, y) ≤ f(x) for all x ∈ Z^V and y ∈ Z^U.

We use the following notations:

We write min(P) instead of inf(P) if the problem P is feasible and the infimum is finite, in which case the infimum is attained (i.e., opt(P) ≠ ∅), and similarly for max(D).

Theorem 8.52 (Weak duality). inf(P) ≥ sup(D).

Proof. We have

Hence, sup(D) = sup_y g(y) ≤ inf(P). □
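Weak duality is an instance of the generic minimax inequality inf_x sup_y K(x, y) ≥ sup_y inf_x K(x, y), which holds for an arbitrary Lagrangian K. A quick randomized check over finite ranges of x and y (purely illustrative; the abstract problems P and D are not modeled here):

```python
import random

# Weak duality as the minimax inequality: for any finite "Lagrangian" K,
#   min_x max_y K(x, y) >= max_y min_x K(x, y).
random.seed(0)
for _ in range(100):
    K = [[random.randint(-10, 10) for _ in range(5)] for _ in range(5)]
    inf_sup = min(max(row) for row in K)                             # inf(P) side
    sup_inf = max(min(K[i][j] for i in range(5)) for j in range(5))  # sup(D) side
    assert inf_sup >= sup_inf
print("weak duality verified on 100 random instances")
```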

Our main interest lies in the strong duality, namely, in the case where the inequality in the weak duality turns to an equality with a finite common value.

Theorem 8.53.
(1) g(y) = −φ*(−y).
(2) sup(D) = φ**(0).
(3) inf(P) = φ(0).
(4) inf(P) = sup(D) ⟺ φ(0) = φ**(0).
(5) Suppose inf(P) is finite. Then min(P) = max(D) ⟺ ∂Z φ(0) ≠ ∅.
(6) If min(P) = max(D), then opt(D) = −∂Z φ(0).

Proof. (1) By the definitions we have

(2) By using (1) we have

(3) This is obvious from (8.54) and (8.57).
(4) The equivalence is due to (2) and (3).
(5), (6) We have the following chain of equivalence: y ∈ −∂Z φ(0) ⟺ φ(u) − φ(0) ≥ ⟨u, −y⟩ (∀ u ∈ Z^U) ⟺ inf_u(φ(u) + ⟨u, y⟩) = φ(0) ⟺ g(y) = φ(0), where

inf_u(φ(u) + ⟨u, y⟩) = g(y) is shown in the proof of (1). This implies the claims when combined with the weak duality (Theorem 8.52). □

Theorem 8.54 (Saddle-point theorem). Both inf(P) and sup(D) are finite and min(P) = max(D) if and only if there exist x ∈ Z^V and y ∈ Z^U such that K(x, y) is finite and

If this is the case, we have x ∈ opt(P) and y ∈ opt(D).

Proof. By Proposition 8.51 (2) we have f(x) = sup_y K(x, y) for any x ∈ Z^V, whereas g(y) = inf_x K(x, y) for any y ∈ Z^U by the definition (8.60). In view of the weak duality (Theorem 8.52) and the relation

we see that
8.4.3 Lagrangian Function Based on M-Convexity


As the perturbation F, we choose F = Fr : Z^V × Z^V → Z ∪ {+∞} defined by

where r : Z^V → Z ∪ {+∞} is an M-convex function with r(0) = 0. (We take V as the U in the general framework.) The special case with r = 0 is distinguished by the subscript 0. Namely,

We single out the case of r = 0 because the technical development in this special case can be made within the framework of M-/L-convex functions, whereas the general case involves M2-/L2-convex functions. Throughout this section we assume (REG), i.e., that B is an M-convex set.
We use the subscript r to denote the quantities derived from Fr; namely,

Our choice of the perturbation (8.61) is legitimate, meeting the requirements


(8.54) and (8.55), as follows.

Proposition 8.55. Assume (REG).
(1) Fr(x, 0) = f(x) (x ∈ Z^V).
(2) For each x ∈ Z^V, F0(x, u) is M-convex in u or F0(x, u) = +∞ for all u.
(3) For each x ∈ Z^V, Fr(x, u) is M2-convex in u or Fr(x, u) = +∞ for all u.
(4) Fr(x, ·)** = Fr(x, ·) (x ∈ Z^V).
(5) Fr(x, u) = sup{Kr(x, y) − ⟨u, y⟩ | y ∈ Z^V} (x, u ∈ Z^V).
Assume (REG) and (OBJ).
(6) For each u ∈ Z^V, Fr(x, u) is M2-convex in x or Fr(x, u) = +∞ for all x.

Proof. (1) This follows from r(0) = 0.
(2) We have F0(x, u) = +∞ unless x ∈ dom c. For each x, δB(x + u) = δ_{B−x}(u) is the indicator function of B − x (the translation of B by −x), which is again an M-convex set. Therefore, δB(x + u) is M-convex in u.
(3) We have Fr(x, u) = +∞ unless x ∈ dom c. Besides δB(x + u), r(u) is M-convex by the assumption. Hence, Fr(x, ·) is the sum of two M-convex functions for each x ∈ dom c. By definition, such a function is either M2-convex or identically equal to +∞.
(4) This follows from (3) and Theorem 8.36.
(5) This follows from (4) and Proposition 8.51 (1).
(6) The proof is similar to (3) by the symmetry between c(x) and r(u). □

The Lagrangian function Kr(x, y) has the following properties. It should be clear that δB* is the support function of B and δ_{−B} □Z r[y] means the integer infimal convolution of the indicator function of −B = {x | −x ∈ B} and r[y](u) = r(u) + ⟨u, y⟩.

Proposition 8.56.

Proof. It suffices to prove (2), since (1) is its special case with r = 0. Assume x ∈ dom c. Substituting (8.61) into (8.64) we obtain

The alternative expression is easy to see. □



Theorem 8.57. Assume (REG).
(1) For each x ∈ Z^V, K0(x, y) is L-concave in y or K0(x, y) = +∞ for all y.
(2) For each x ∈ Z^V, Kr(x, y) is L2-concave in y or Kr(x, y) = +∞ for all y.
Assume (REG) and (OBJ).
(3) For each y ∈ Z^V, K0(x, y) is M-convex in x or K0(x, y) ∈ {+∞, −∞} for all x.
(4) For each y ∈ Z^V, Kr(x, y) is M2-convex in x or Kr(x, y) ∈ {+∞, −∞} for all x.

Proof. (1), (3) The expression of K0(x, y) in Proposition 8.56 (1) shows these.
(2) In the expression of Kr(x, y) in Proposition 8.56 (2) we have δ_{B−x} + r ∈ M2[Z → Z] or ≡ +∞ (see Note 6.17). Then the conjugacy in Theorem 8.48 implies this.
(4) In the expression of Kr(x, y) in Proposition 8.56 (2) the second term (δ_{−B} □Z r[y])(−x) is M-convex or ∈ {+∞, −∞} since it is the integer infimal convolution of two M-convex functions (Theorem 6.13 (8)). □

In the case of M-convex programs the dual objective function gr and the optimal value function φr are well behaved, as follows.

Theorem 8.58. Assume (REG) and (OBJ).
(1) g0 is L-concave or g0(y) = −∞ for all y.
(2) gr is L2-concave, gr(y) = −∞ for all y, or gr(y) = +∞ for all y.
(3) φ0 is M-convex or φ0(u) ∈ {+∞, −∞} for all u.
(4) φr is M2-convex or φr(u) ∈ {+∞, −∞} for all u.

Proof. We prove (1), (3), (4), and, finally, (2).
(1) Using Proposition 8.56 (1) we obtain

This shows g0 is L-concave or ≡ −∞, since the sum of two L-concave functions is again L-concave provided the effective domains of the summands are not disjoint.
(3) We have φ0(u) = inf_x(c(x) + δB(x + u)) = (c □Z δ_{−B})(−u). The assertion follows from Theorem 6.13 (8).
(4) It follows from Fr(x, u) = F0(x, u) + r(u) that φr(u) = +∞ unless u ∈ dom r and that φr(u) = φ0(u) + r(u) if u ∈ dom r. If φ0 ∈ M[Z → Z], then φr ∈ M2[Z → Z] or φr ≡ +∞. If φ0(u) ∈ {+∞, −∞}, then φr(u) ∈ {+∞, −∞}.
(2) First recall the relation gr(y) = −φr*(−y) (Theorem 8.53 (1)). If φr ∈ M2[Z → Z], the conjugacy in Theorem 8.48 implies the L2-concavity of gr. If φr ≡ +∞, then gr ≡ +∞. If φr(u) = −∞ for some u, then gr ≡ −∞. □

Strong duality holds true for M-convex programs with the Lagrangian function Kr(x, y).

Theorem 8.59 (Strong duality). Assume (REG), (OBJ), and that the problem P is feasible and bounded from below.
(1) min(P) = φr(0) = φr**(0) = max(Dr).
(2) opt(Dr) = −∂Z φr(0).

Proof. Since φr(0) is finite by the assumption, φr is M2-convex by Theorem 8.58 (4) and ∂Z φr(0) ≠ ∅ by Theorem 8.35. Then the assertions follow from Theorem 8.53. □

It should be emphasized that the M-convexity of the objective function c is a


sufficient condition and not an absolute prerequisite for the strong duality to hold.

Example 8.60. Let us consider the case where c(x) is a linear function on another M-convex set B′ ⊆ Z^V. The primal problem with c(x) = ⟨x, w⟩ + δ_{B′}(x) (where w ∈ Z^V denotes a weight vector) reads as follows:

The Lagrangian function K0 is given by

from which is derived the following dual problem:

This is the polymatroidal version of the optimal common base problem explained in Example 8.27. The optimal solution y = y* to D gives the weight splitting w1* = w − y* and w2* = y*.
For a concrete instance, take V = {1, 2},

We have B ∩ B′ = {(0, 0), (1, −1)} and

8.4.4 Symmetry in Duality


So far we have derived the dual problem D from the primal P by means of a
perturbation function F(x,u) such that F(x,0) = f ( x ) and F(x, •) 6 Mi\L —> Z].
Namely,
242 Chapter 8. Conjugacy and Duality

We have seen that g is L2-concave, i.e., —g € £a[Z —> Z], in the "convex" case
where (REG) and (OBJ) are satisfied.
We are now interested in the reverse process, i.e., how to restore the primal
problem P from the dual D in a way consistent with the general duality framework of
section 8.4.1. We embed the dual problem D in a family of maximization problems
defined in terms of another perturbation function G(y, v) such that G(y, 0) = g(y)
and -G(y, •) e £2[Z -> Z]. Namely,

With reference to (8.60) and Proposition 8.51 we define a perturbation func-


tion G : Zu x Zv -» Z U {±00} by54

By this we intend to consider a family of maximization problems parametrized by v ∈ Z^V:

Maximize G(y, v) subject to y ∈ Z^U.

The optimal value function γ : Z^V → Z ∪ {±∞} is accordingly defined by

It is then natural to introduce the dual Lagrangian function K̄ : Z^V × Z^U → Z ∪ {±∞} as

The problem dual to the problem D is to minimize

As can be imagined from the corresponding constructions in convex analysis (cf. section 4 of Rockafellar [177]), K̄(x, y) and f̄(x) thus constructed do not necessarily coincide with the original K(x, y) and f(x). We show, however, that the dual of the dual comes back to the primal in the canonical case with a bounded M-convex set B using the Lagrangian function Kr.

Example 8.61. For K0 of (8.66) in Example 8.60 we can calculate

We observe that K̄0(x, y) = K0(x, y) where they take finite values.

⁵⁴ Here we have v ∈ Z^V and not v ∈ Z^U.

In what follows we always assume (REG), (OBJ), and that B is bounded. We consider the Lagrangian function Kr with U = V.

Proposition 8.62. Assume (REG), (OBJ), and that B is bounded. Then

Proof. The definitions (8.67) and (8.69) show Gr(y, ·) = −(Kr(·, y))* and K̄r(·, y) = (−Gr(y, ·))* for each y. Since Kr(·, y) ∈ M2[Z → Z] or ≡ +∞ by Theorem 8.57 (4) when B is bounded, we have Kr(·, y) = (Kr(·, y))** for each y. Hence follows K̄r = Kr. Then Proposition 8.51 (2) and (8.70) imply f̄ = f. □

Proposition 8.63. Assume (REG), (OBJ), and that B is bounded.
(1) Gr(y, 0) = gr(y) (y ∈ Z^V).
(2) For each y ∈ Z^V, Gr(y, v) is L2-concave in v or Gr(y, v) = +∞ for all v.
(3) For each v ∈ Z^V, Gr(y, v) is L2-concave in y, Gr(y, v) = −∞ for all y, or Gr(y, v) = +∞ for all y.

Proof. (1) This is obvious from (8.65) and (8.67).
(2) The definition (8.67) shows Gr(y, ·) = −(Kr(·, y))* for each y, while Kr(·, y) ∈ M2[Z → Z] or ≡ +∞ by Theorem 8.57 (4) when B is bounded. Hence −Gr(y, ·) ∈ L2[Z → Z] or ≡ −∞ by Theorem 8.48.
(3) By (8.67), (8.64), and (8.61), we have the expression

in which c[−v] is M-convex. On the other hand, Theorem 8.58 (2) shows that

is L2-concave, gr(y) = −∞ for all y, or gr(y) = +∞ for all y. By replacing c with c[−v] we obtain the claim. □

The optimal value function γr, defined by (8.68) with reference to Gr, enjoys the following properties.

Theorem 8.64. Assume (REG), (OBJ), and that B is bounded.
(1) f(x) = −γr∘(−x).
(2) γr is L2-concave or γr(v) = +∞ for all v.

Proof. (1) Using Proposition 8.51 (2), (8.69), and Proposition 8.62 we obtain

(2) Since f ∈ M2[Z → Z] or ≡ +∞, the assertion follows from (1) and the conjugacy between L2[Z → Z] and M2[Z → Z]. □

Theorem 8.65. Assume (REG), (OBJ), and that the problem P is feasible and B is bounded.
(1) min(P) = γr(0) = γr∘∘(0) = max(Dr).
(2) opt(P) = ∂Z(−γr)(0) ≠ ∅.

Proof. The proof is essentially the same as that of Theorem 8.59. To be specific, we have the following chain of equivalence: x ∈ ∂Z(−γr)(0) ⟺ γr(v) − γr(0) ≤ ⟨v, −x⟩ (∀ v ∈ Z^V) ⟺ inf_v(⟨v, −x⟩ − γr(v)) = −γr(0) ⟺ f(x) = γr(0). This implies the claim when combined with the weak duality (Theorem 8.52). □

Bibliographical Notes
The conjugacy relationship between M-convexity and L-convexity was established first for integer-valued functions (Theorem 8.12) by Murota [140], whereas the present proof is based on Murota [147]. The conjugacy theorem for polyhedral M-/L-convex functions (Theorem 8.4) is due to Murota-Shioura [152]. The polarity between M-/L-convex cones in Theorem 8.5 is stated in [147] and Proposition 8.11 for the integer biconjugate is in [140]. Theorem 8.1 is a special case of a theorem of Topkis [202], stated explicitly as Corollary 2.7.3 in Topkis [203].
The M-separation theorem (Theorem 8.15) is given in Murota [137], [140], [142] and the L-separation theorem (Theorem 8.16) in [140]. The Fenchel-type duality theorem for M-convex functions originated in [137] (see also [140]); the present form (Theorem 8.21) is in Murota [147]. The M-convex intersection theorem (Theorem 8.17) is in [137], [142]. The Fenchel-type duality theorem for submodular set functions described in Example 8.26 is due to Fujishige [62]. The weight-splitting theorem for weighted matroid intersection in Example 8.27 is due to Frank [54], and that for valuated matroid intersection in Example 8.28 is due to Murota [135]; see also Theorem 5.2.40 of Murota [146].
M2-convex and L2-convex functions were introduced by Murota [140], to which Theorems 8.35, 8.36, 8.45, and 8.46 and the conjugacy theorem (Theorem 8.48) are ascribed. Theorems 8.31 and 8.42 (integral convexity) as well as Theorem 8.49 are due to Murota-Shioura [153]. Theorems 8.32 and 8.43 (M2-/L2-optimality criteria) and Theorems 8.34 and 8.44 (M2-/L2-proximity theorems) are given by Murota-Tamura [162]. See Tamura [198] for Notes 8.37 and 8.38.
The Lagrange duality of section 8.4 is developed in Murota [140]. See Nemhauser-Rinnooy Kan-Todd [166] and Nemhauser-Wolsey [167] for the subadditive duality.
Chapter 9

Network Flows

In Chapter 2 we had a glimpse of the intrinsic relationship between M-/L-convexity and network flows (nonlinear electrical networks). Pursuing this direction further, we show the following facts in this chapter: (i) the minimum cost flow problem can be generalized to the submodular flow problem, where M-/L-convexity plays a fundamental role; (ii) the submodular flow problem with an M-convex function admits nice optimality criteria in terms of potentials and negative cycles; (iii) the optimality criterion using potentials is equivalent to the Fenchel-type duality theorem; (iv) a conjugate pair of M-convex and L-convex functions is transformed to another conjugate pair of M-convex and L-convex functions through network flows. Algorithms are treated in Chapter 10.

9.1 Minimum Cost Flow and Fenchel Duality


To single out the role of M-/L-convexity we first review standard results on the
conventional minimum cost flow problem. Emphasis is placed on the equivalence of
the optimality criterion in terms of potentials and the Fenchel duality theorem for
convex functions.

9.1.1 Minimum Cost Flow Problem


Let G = (V, A) be a directed graph with vertex set V and arc set A. Suppose that each arc a ∈ A is associated with an upper capacity c̄(a), a lower capacity c̲(a), and a cost γ(a) per unit flow. Furthermore, for each vertex v ∈ V, the amount of flow supply at v is specified by x(v). The minimum cost flow problem is to find a flow ξ = (ξ(a) | a ∈ A) that minimizes the total cost ⟨γ, ξ⟩_A = Σ_{a∈A} γ(a) ξ(a) subject to the capacity constraint and the supply specification. Here the supply specification means a constraint that the boundary ∂ξ of ξ, defined by

∂ξ(v) = Σ{ξ(a) | arc a leaves vertex v} − Σ{ξ(a) | arc a enters vertex v}   (v ∈ V),

should be equal to the given x. The problem is described by a graph G = (V, A), an upper capacity c̄ : A → R ∪ {+∞}, a lower capacity c̲ : A → R ∪ {−∞}, a cost vector γ : A → R, and a supply vector x : V → R, where it is assumed that c̄(a) ≥ c̲(a) for each a ∈ A. The variable to be optimized is the flow ξ : A → R.
Minimum cost flow problem MCFP0 (linear arc cost)55

The minimum cost flow problem is a typical well-behaved combinatorial problem that has nice properties, such as
1. an optimality criterion in terms of potentials (dual variables),
2. an optimality criterion in terms of negative cycles,
3. the integrality of optimal solutions, and
4. efficient algorithms.
Precise statements for the first three above are given later in Theorems 9.4, 9.5, and 9.6, respectively. In particular, the integrality of optimal solutions refers to the fact that, if the capacity constraint and the supply specification are given in terms of integer-valued functions, c̄ : A → Z ∪ {+∞}, c̲ : A → Z ∪ {−∞}, and x : V → Z, then there exists an integer-valued optimal flow ξ to the above problem. This implies that the problem MCFP0 specified by such integer-valued data remains essentially the same even if the integrality condition

is additionally imposed on the flow ξ. We refer to the problem with (9.6) in place of (9.5) as the minimum cost integer-flow problem.
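On a toy instance of MCFP0 (the graph, capacities, costs, and supplies below are made up for illustration), the minimum cost integer-flow problem can be solved by exhaustively enumerating all integer flows within the capacity bounds, which also exhibits an integral optimal solution:

```python
from itertools import product

# A toy MCFP0 instance solved by enumerating all integer flows within
# the capacity bounds (graph, capacities, costs, supplies are made up).
V = ['s', 'v', 't']
A = [('s', 'v'), ('s', 't'), ('v', 't')]                 # arcs (tail, head)
lo = {a: 0 for a in A}                                    # lower capacities
hi = {('s', 'v'): 2, ('s', 't'): 1, ('v', 't'): 2}        # upper capacities
gamma = {('s', 'v'): 1, ('s', 't'): 3, ('v', 't'): 1}     # arc costs
x = {'s': 2, 'v': 0, 't': -2}                             # required boundary

def boundary(xi):
    b = {v: 0 for v in V}
    for (u, w), f in xi.items():
        b[u] += f     # flow leaves the tail u
        b[w] -= f     # flow enters the head w
    return b

best = None
for vals in product(*(range(lo[a], hi[a] + 1) for a in A)):
    xi = dict(zip(A, vals))
    if boundary(xi) == x:
        cost = sum(gamma[a] * xi[a] for a in A)
        if best is None or cost < best[0]:
            best = (cost, xi)

print(best)  # (4, {('s', 'v'): 2, ('s', 't'): 0, ('v', 't'): 2})
```

Here the optimum routes both units through the cheap path s → v → t rather than the expensive direct arc.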
To discuss the relationship to convex analysis it is convenient to consider a more general form of the minimum cost flow problem. The generalization is twofold. First, the linear arc cost Σ_{a∈A} γ(a) ξ(a) is replaced with a nonlinear cost represented by a separable convex function Σ_{a∈A} fa(ξ(a)) with a family of univariate polyhedral convex functions fa ∈ C[R → R] indexed by a ∈ A. Second, with a polyhedral convex function f : R^V → R ∪ {+∞}, an additional term f(∂ξ) for the flow boundary ∂ξ is introduced in the cost function as a generalization of the supply specification ∂ξ = x.

Minimum cost flow problem MCFP3 (nonlinear cost)⁵⁶

⁵⁵ MCFP stands for minimum cost flow problem.
⁵⁶ We have MCFPi for i = 0, 3 and not for i = 1, 2. This is for consistency with section 9.2.

Obviously, MCFP0 is a special case of MCFP3, where

for a ∈ A and f is the indicator function δ{x} of the singleton set {x}.
Among the four nice properties of MCFP0 listed above, the optimality criterion by potentials is generalized to MCFP3, as we will see in section 9.1.3, whereas the other three fail to survive for a general f. In considering the integer-flow version of the problem it is natural to assume fa ∈ C[Z → R] (or fa ∈ C[Z|R → R]) for each a ∈ A, but it is not clear what combinatorial property to impose on f to ensure the integrality of optimal solutions. M-convexity gives an answer to this, as we will see in section 9.4.

Note 9.1. In MCFP3 we have restricted f and fa (a ∈ A) to be polyhedral convex functions. This is for consistency with our theoretical framework of polyhedral M-/L-convex functions. The optimality criterion by potentials (Theorem 9.4), as well as its equivalence to the Fenchel duality to be discussed in section 9.1.4, remains valid for nonpolyhedral convex functions under appropriate assumptions; see Iri [94] and Rockafellar [178]. ■

9.1.2 Feasibility
For the minimum cost flow problem MCFP0, a feasible flow means a function ξ : A → R that satisfies

We say that MCFP0 is feasible if it admits a feasible flow.
For X ⊆ V we denote the sets of arcs leaving and entering X by

and define the cut capacity function κ : 2^V → R ∪ {+∞} by

Proposition 9.2. The cut capacity function κ is submodular.

Proof. It is easy to verify

where the summation is taken over all arcs a connecting X \ Y and Y \ X. □
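Proposition 9.2 can also be checked by brute force on a small example. The sketch below assumes the usual cut-capacity definition, κ(X) = Σ{c̄(a) | a leaving X} − Σ{c̲(a) | a entering X}, on a made-up digraph with c̄ ≥ c̲:

```python
from itertools import combinations

# Brute-force check of Proposition 9.2 on a made-up digraph, using
# kappa(X) = sum of upper capacities on arcs leaving X
#          - sum of lower capacities on arcs entering X.
V = [0, 1, 2, 3]
A = [(0, 1), (1, 2), (2, 0), (1, 3), (3, 2)]
hi = {a: 3 + i for i, a in enumerate(A)}   # upper capacities (arbitrary)
lo = {a: -1 for a in A}                    # lower capacities (arbitrary)

def kappa(X):
    leave = sum(hi[(u, w)] for (u, w) in A if u in X and w not in X)
    enter = sum(lo[(u, w)] for (u, w) in A if u not in X and w in X)
    return leave - enter

subsets = [frozenset(S) for r in range(len(V) + 1)
           for S in combinations(V, r)]
ok = all(kappa(X) + kappa(Y) >= kappa(X | Y) + kappa(X & Y)
         for X in subsets for Y in subsets)
print(ok)  # True: kappa is submodular
```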

If a flow ξ meets the capacity constraint (9.12), its boundary x = ∂ξ satisfies

for all X ⊆ V and also x(V) = 0 = κ(V). This means x ∈ B(κ), where B(κ) is the base polyhedron (4.13) associated with κ.
The above argument shows that the condition x ∈ B(κ) is necessary for MCFP0 to be feasible. It is also sufficient, as stated in the following theorem.

Theorem 9.3 (Feasibility). For c̄ : A → R ∪ {+∞}, c̲ : A → R ∪ {−∞}, and x : V → R, there exists a flow ξ : A → R satisfying (9.12) and (9.13) if and only if

That is,

If c̄ and c̲ are integer valued, we may restrict ξ to be integer flows; namely,

Proof. This follows from the max-flow min-cut theorem or a variant thereof, called Hoffman's circulation theorem (see, e.g., (2.65) of Fujishige [65] or Theorem 3.18 of Cook-Cunningham-Pulleyblank-Schrijver [26]). □

9.1.3 Optimality Criteria

The minimum cost flow problem MCFP3, which has a convex boundary cost and separable convex arc costs, admits a nice optimality criterion in terms of potentials. The conventional case MCFP0 admits, in addition, an optimality criterion in terms of negative cycles and the integrality of optimal solutions.

A potential means a function p : V → R (or a vector p ∈ R^V) on the vertex set. The coboundary of a potential p is a function δp : A → R defined by δp(a) = p(∂⁺a) − p(∂⁻a) for a ∈ A, where ∂⁺a and ∂⁻a denote the initial (tail) and terminal (head) vertices of the arc a. The inner product (pairing) of tension η : A → R and flow ξ : A → R can be expressed as

⟨η, ξ⟩_A = ⟨p, x⟩_V    (9.21)

if x = ∂ξ and p is a potential such that η = δp.



The identity (9.21) is a fundamental relation, frequently used in the subsequent
arguments. It should be clear that

    ⟨δp, ξ⟩ = ⟨p, ∂ξ⟩.
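The boundary/coboundary conventions and the pairing identity can be verified numerically on a random small network. A minimal sketch (the identity holds exactly for integer data, independently of the random draw):

```python
import random

# Random small network; all names are illustrative.
random.seed(0)
V = list(range(5))
A = [(u, v) for u in V for v in V if u != v and random.random() < 0.4]

xi = {a: random.randint(-3, 3) for a in A}   # a flow xi : A -> Z
p = {v: random.randint(-5, 5) for v in V}    # a potential p : V -> Z

def boundary(xi):
    """boundary(xi)(v) = flow leaving v minus flow entering v."""
    b = {v: 0 for v in V}
    for (u, v), f in xi.items():
        b[u] += f
        b[v] -= f
    return b

def coboundary(p):
    """coboundary(p)(a) = p(initial vertex) - p(terminal vertex), as in (9.20)."""
    return {(u, v): p[u] - p[v] for (u, v) in A}

dxi, dp = boundary(xi), coboundary(p)
lhs = sum(dp[a] * xi[a] for a in A)    # <coboundary(p), xi>
rhs = sum(p[v] * dxi[v] for v in V)    # <p, boundary(xi)>
```

Each arc contributes p(u)ξ(a) − p(v)ξ(a) to both sides, which is why the adjointness of ∂ and δ holds identically.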

With reference to a potential p we modify the cost functions f and f_a (a ∈ A)
to the reduced cost functions f[−p] and f_a[δp(a)] (a ∈ A) defined by

    f[−p](x) = f(x) − ⟨p, x⟩,   f_a[δp(a)](t) = f_a(t) + δp(a) t.

A straightforward calculation with the use of (9.21) yields

    F₃(ξ) = f[−p](∂ξ) + Σ_{a∈A} f_a[δp(a)](ξ(a)) ≥ inf f[−p] + Σ_{a∈A} inf f_a[δp(a)],   (9.25)

where inf f[−p] and inf f_a[δp(a)] with a ∈ A mean the infima of the reduced cost
functions. The inequality (9.25) gives a lower bound for the minimum of F₃. In
particular, if

    ξ(a) ∈ argmin f_a[δp(a)]  (a ∈ A)   and   ∂ξ ∈ argmin f[−p]

for some p, then ξ is an optimal flow satisfying (9.25) with equality. This statement
is true for any functions f and f_a (a ∈ A).
The converse is also true under a fairly general assumption that f and f_a
(a ∈ A) are convex.

Theorem 9.4 (Potential criterion). In the minimum cost flow problem MCFP₃
with polyhedral convex f and f_a (a ∈ A), we have the following:
(1) For a feasible flow ξ : A → R, the two conditions (OPT) and (POT) below
are equivalent.
(OPT) ξ is an optimal flow.
(POT) There exists a potential p : V → R such that
(i) ξ(a) ∈ argmin f_a[δp(a)] for every a ∈ A, and
(ii) ∂ξ ∈ argmin f[−p].
(2) Suppose that a potential p : V → R satisfies (i) and (ii) above for an
optimal flow ξ. A feasible flow ξ′ is optimal if and only if
(i) ξ′(a) ∈ argmin f_a[δp(a)] for every a ∈ A, and
(ii) ∂ξ′ ∈ argmin f[−p].

Proof. (1) (POT) ⇒ (OPT) is already shown. To prove (OPT) ⇒ (POT), suppose
that ξ is an optimal flow. Putting

we see

where inf F₃ is finite and x = ∂ξ attains the infimum of the last expression. Noting
that f_A is a polyhedral convex function (see Note 2.17) and dom f_A ∩ dom f ≠ ∅,
we apply the Fenchel duality theorem in convex analysis (Theorem 3.6 and (3.42))
to obtain p : V → R such that

The second equation shows (ii) in (POT). We will show that the first equation above
implies (i) in (POT). It follows from (9.26) and (9.21) that

for any x′ ∈ R^V, and therefore,

On the other hand, the optimality of ξ implies f_A(∂ξ) = Σ_{a∈A} f_a(ξ(a)), which, in
combination with (9.21), yields

Substituting (9.28) and (9.29) into the first equation in (9.27) shows

Figure 9.1. Characteristic curve (kilter diagram) for linear cost.

which is equivalent to (i) in (POT).
(2) This is obvious from (1) and (9.25). □

A potential p satisfying (i) and (ii) in (POT) is called an optimal potential.
Though this definition refers to a particular optimal flow ξ, it is, in fact, independent
of the choice of ξ by Theorem 9.4 (2).
Condition (i) in (POT) is closely related to the characteristic curve (or kilter
diagram) Γ_a introduced in section 2.2 with an illustration in Fig. 2.3. Since

by (2.34) and (2.35), condition (i) in (POT) says that flow ξ(a) and tension η(a) =
−δp(a) should satisfy the constitutive equation in every arc a ∈ A. In the case of
linear arc cost, the characteristic curve Γ_a takes the form of Fig. 9.1, and,
accordingly, condition (i) in (POT) is expressed as

    ξ(a) = c(a) if γ_p(a) > 0,   c(a) ≤ ξ(a) ≤ c̄(a) if γ_p(a) = 0,   ξ(a) = c̄(a) if γ_p(a) < 0   (9.32)

in terms of the reduced cost γ_p : A → R defined by

    γ_p(a) = γ(a) + δp(a)   (a ∈ A).   (9.33)

In the conventional case MCFP₀ with linear arc cost, the optimality criterion
can be reformulated in terms of negative cycles in an auxiliary network. For a
feasible flow ξ : A → R, let G_ξ = (V, Ã_ξ) be a directed graph with vertex set V and
arc set Ã_ξ = A_ξ ∪ B_ξ consisting of two disjoint parts:

    A_ξ = {a | a ∈ A, ξ(a) < c̄(a)},
    B_ξ = {ā | a ∈ A, c(a) < ξ(a)}   (ā: reorientation of a),

and define a function ℓ_ξ : Ã_ξ → R, representing arc lengths, by

    ℓ_ξ(a) = γ(a)   (a ∈ A_ξ),   ℓ_ξ(ā) = −γ(a)   (ā ∈ B_ξ).   (9.34)

We refer to (G_ξ, ℓ_ξ) as the auxiliary network. We call a directed cycle of negative
length a negative cycle.

Theorem 9.5 (Negative-cycle criterion). For a feasible flow ξ : A → R to the mini-
mum cost flow problem MCFP₀, conditions (OPT) and (NNC) below are equivalent.
(OPT) ξ is an optimal flow.
(NNC) There exists no negative cycle in (G_ξ, ℓ_ξ) with ℓ_ξ of (9.34).

Proof. By (9.31), (9.32), and the definition (9.34) of ℓ_ξ, condition (i) of (POT) in
Theorem 9.4 is equivalent to

    ℓ_ξ(a) + p(∂⁺a) − p(∂⁻a) ≥ 0   for every arc a of G_ξ,   (9.35)

whereas condition (ii) of (POT) is void for MCFP₀. On the other hand, the existence
of a potential p : V → R satisfying (9.35) is equivalent to (NNC), as is well known
in network flow theory. Hence follows the equivalence of (NNC) and (OPT) by
Theorem 9.4. □
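The negative-cycle test behind (NNC) is a standard Bellman–Ford computation on the auxiliary network. A minimal sketch with made-up capacities, costs, and flow:

```python
# Auxiliary network of a feasible flow, and negative-cycle detection by
# Bellman-Ford. Data are made up; arcs are (tail, head, lower, upper, cost).
arcs = [("s", "a", 0, 2, 1), ("a", "t", 0, 2, 1),
        ("s", "b", 0, 2, 3), ("b", "t", 0, 2, 1)]
xi = {("s", "a"): 0, ("a", "t"): 0, ("s", "b"): 1, ("b", "t"): 1}

def auxiliary(arcs, xi):
    """Forward arc of length gamma(a) when xi(a) < upper capacity;
    reversed arc of length -gamma(a) when xi(a) > lower capacity (cf. (9.34))."""
    aux = []
    for (u, v, lb, ub, g) in arcs:
        f = xi[(u, v)]
        if f < ub:
            aux.append((u, v, g))
        if f > lb:
            aux.append((v, u, -g))
    return aux

def has_negative_cycle(nodes, aux):
    """Bellman-Ford started from all nodes at distance 0; a pass that still
    relaxes after |V| rounds certifies a negative cycle."""
    dist = {v: 0 for v in nodes}
    for _ in range(len(nodes)):
        updated = False
        for (u, v, w) in aux:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                updated = True
        if not updated:
            return False
    return True
```

Here the expensive path s→b→t carries flow while the cheap path s→a→t is unused, so the auxiliary cycle s→a→t→b→s has length 1 + 1 − 1 − 3 = −2 and is detected.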

The minimum cost flow problem MCFP₀ is endowed with remarkable inte-
grality properties:
1. An integer-valued optimal flow exists if the upper and lower capacities and
the supply vector are integer valued (primal integrality).
2. An integer-valued optimal potential exists if the cost vector is integer valued
(dual integrality).

Theorem 9.6 (Integrality). Suppose that the minimum cost flow problem MCFP₀
has an optimal solution.
(1) [Primal integrality] If c̄ : A → Z ∪ {+∞}, c : A → Z ∪ {−∞}, and
x : V → Z, then there exists an integer-valued optimal flow ξ : A → Z.
(2) [Dual integrality] The set of optimal potentials

    Π* = {p | p : optimal potential}

is an L-convex polyhedron. If γ : A → Z, then Π* is an integral L-convex polyhedron
and there exists an integer-valued optimal potential p : V → Z.

Proof. (1) Let p be an optimal potential. By (9.31) and (9.32), a flow ξ is optimal if
and only if it is a feasible flow with respect to a more restrictive capacity constraint
c_*(a) ≤ ξ(a) ≤ c^*(a) with

for each a ∈ A. Since c_*(a) and c^*(a) are integers for every a ∈ A, the claim follows
from (9.19) in Theorem 9.3.
(2) Since condition (i) of (POT) in Theorem 9.4 is equivalent to (9.35) in the
proof of Theorem 9.5, Π* coincides with the polyhedron described by (9.35) with
an optimal ξ. This implies the L-convexity of Π* (see section 5.6). The integrality
assertion follows from Proposition 5.1 (4). □

The nice features of the minimum cost flow problem discussed so far (Theorems
9.4, 9.5, and 9.6) are derived mainly from the combinatorial structure inherent
in the underlying graph, as well as the convexity of the cost functions. Further
combinatorial properties stemming from the M-convexity of the cost functions will
be investigated in section 9.4 and section 9.5.

Note 9.7. Here is a comment on the definition of the coboundary. In this book
we follow the convention of defining δp(a) by

δp(a) = (p at the initial vertex of a) − (p at the terminal vertex of a).

The boundary ∂ξ(v) is defined to be the amount of flow leaving v and the tension η
is defined as η = −δp (see (9.1), (9.20), and (9.22)). Then follows the fundamental
identity

    ⟨δp, ξ⟩ = ⟨p, ∂ξ⟩.

Another convention of defining δp(a) by

δp(a) = (p at the terminal vertex of a) − (p at the initial vertex of a)

and the tension η by η = δp results in

    ⟨δp, ξ⟩ = −⟨p, ∂ξ⟩.

The notations div and Δ in Rockafellar [178] are related to ours as div = ∂ and
Δ = −δ. ■

9.1.4 Relationship to Fenchel Duality
We discuss here the relationship between the potential criterion for optimality for
the minimum cost flow problem MCFP₃ and the Fenchel duality in convex analysis.
The potential criterion for MCFP₃ (Theorem 9.4 in section 9.1.3) has been
derived from the Fenchel duality applied to f and −f_A, where

and the evaluation of f_A amounts to solving a minimum cost flow problem with
nonlinear arc cost f_a but without boundary cost f. Thus, the minimum cost flow
problem MCFP₃ with boundary cost can be understood as a composition of the

Figure 9.2. Minimum cost flow problem for Fenchel duality.

minimization/maximization problem of the Fenchel duality and the minimum cost
flow problem without boundary cost.
The proof of Theorem 9.4 yields, as a byproduct, a min-max identity for
MCFP₃:

where

with g = f* and g_a = f_a* for a ∈ A. The identity (9.36) is an immediate conse-
quence of the Fenchel duality (3.41):

in which f*(p) = g(p) and

by (9.28). The left-hand side of (9.36) is MCFP₃ in disguise, and accordingly,
we may think of the maximization problem on the right-hand side of (9.36) as an
optimization problem dual to MCFP₃.
Although the potential criterion for MCFP₃ has been derived from the Fenchel
duality, they are essentially equivalent, which we demonstrate here. To be specific,
we derive the Fenchel duality theorem (Theorem 3.6, Case (a2)) from the optimality
criterion for MCFP₃ (Theorem 9.4).
Given a polyhedral convex function f₁ : R^V → R ∪ {+∞} and a polyhedral
concave function h₂ : R^V → R ∪ {−∞} with dom f₁ ∩ dom h₂ ≠ ∅, we consider
a minimum cost flow problem MCFP₃ on the bipartite graph G = (V₁ ∪ V₂, A) in
Fig. 9.2. The vertex set of G consists of two copies of V, i.e., V₁ and V₂, and the

arc set is A = {(v₁, v₂) | v ∈ V}, with v₁ ∈ V₁ and v₂ ∈ V₂ denoting the copies of
v ∈ V. We define the boundary cost function f : R^{V₁} × R^{V₂} → R ∪ {+∞} by

    f(x₁, x₂) = f₁(x₁) − h₂(−x₂)

and assume that the arc cost functions f_a (a ∈ A) are identically zero without
capacity constraints. Note that x₁ = −x₂ if (x₁, x₂) = ∂ξ for a flow ξ in this
network. Assuming inf(f₁ − h₂) > −∞, let ξ be an optimal flow, which exists since
f is a polyhedral convex function. Let (p₁, p₂) ∈ R^{V₁} × R^{V₂} be an optimal potential
satisfying (POT) in Theorem 9.4. Condition (i) of (POT) implies p₁ = p₂. Since

condition (ii) of (POT) gives

for x = ∂ξ|_{V₁} and p = p₁. This implies the Fenchel duality (3.41) for f₁ and h₂; see
also (3.30) and (3.42).

9.2 M-Convex Submodular Flow Problem
A series of generalizations of the minimum cost flow problem to the M-convex
submodular flow problem is described. Recall the conventional minimum cost flow
problem MCFP₀ introduced in section 9.1.1. It is described by a graph G = (V, A),
an upper capacity c̄ : A → R ∪ {+∞}, a lower capacity c : A → R ∪ {−∞}, a cost
vector γ : A → R, and a supply vector x : V → R, where c̄(a) ≥ c(a) for each
a ∈ A.
A generalization of MCFP₀ is obtained by relaxing the supply specification
∂ξ = x to the constraint that ∂ξ belong to a given set B of feasible or admissible
supplies:

    ∂ξ ∈ B.

The nice properties described in section 9.1 are maintained if B is a base polyhedron
represented as B = B(ρ) with a submodular set function ρ : 2^V → R ∪ {+∞}. Such
a problem described by some ρ ∈ S[R] is called the submodular flow problem.
Submodular flow problem MSFP₁ (linear arc cost)⁵⁷

⁵⁷MSFP stands for M-convex submodular flow problem. We use the notation MSFPᵢ with
i = 1, 2, 3 to indicate the hierarchy of generality in the problems.

In the integer-flow version of the problem, with ξ(a) ∈ Z (a ∈ A) instead of (9.41),
we assume ρ ∈ S[Z].
A further generalization of the problem is obtained by introducing a cost
function for the flow boundary ∂ξ rather than merely imposing the constraint ∂ξ ∈
B. Namely, with a function f : R^V → R ∪ {+∞} we add a new term f(∂ξ) to
the objective function, thereby imposing the constraint ∂ξ ∈ B = dom f implicitly.
The aforementioned nice properties are maintained if f is a polyhedral M-convex
function. Such a problem described by some f ∈ M[R → R] is called the M-convex
submodular flow problem.
M-convex submodular flow problem MSFP₂ (linear arc cost)

Note that the M-convex submodular flow problem with a {0, +∞}-valued f reduces
to the submodular flow problem MSFP₁. In the integer-flow version of the problem
we assume c̄ : A → Z ∪ {+∞}, c : A → Z ∪ {−∞}, and f ∈ M[Z → R] (or
f ∈ M[Z|R → R]).
A still further generalization is possible by replacing the linear arc cost in F₂
with a separable convex function. Namely, using univariate polyhedral convex func-
tions f_a ∈ C[R → R] (a ∈ A), we consider Σ_{a∈A} f_a(ξ(a)) instead of Σ_{a∈A} γ(a)ξ(a)
to obtain MSFP₃ below, a special case of MCFP₃ with f being M-convex.
M-convex submodular flow problem MSFP₃ (nonlinear arc cost)

In the integer-flow version of the problem we assume f ∈ M[Z → R] and f_a ∈
C[Z → R] for a ∈ A (or f ∈ M[Z|R → R] and f_a ∈ C[Z|R → R] for a ∈ A).
Obviously, MSFP₂ is a special case of MSFP₃ with

    f_a(t) = γ(a) t   (a ∈ A).   (9.50)

The converse is also true; i.e., MSFP₃ can be put into a problem of the form of
MSFP₂, as is explained in Note 9.8.
Throughout this chapter we assume

since ∂ξ(V) = 0 for any flow ξ and ∂ξ ∈ dom f = B = B(ρ) is imposed.
In subsequent sections we will see that the optimality criteria in terms of
potentials and negative cycles, as well as efficient algorithms for the conventional
minimum cost flow problem MCFP₀, can be generalized for the M-convex submod-
ular flow problem.

Note 9.8. The problem MSFP₃ on G = (V, A) can be written in the form of
MSFP₂ on a larger graph G̃ = (Ṽ, Ã). We replace each arc a = (u, v) ∈ A with a
pair of arcs, a⁺ = (u, v_a⁻) and a⁻ = (v_a⁺, v), where v_a⁺ and v_a⁻ are newly introduced
vertices. Accordingly, we have Ã = {a⁺, a⁻ | a ∈ A} and Ṽ = V ∪ {v_a⁺, v_a⁻ | a ∈ A}.
For each a ∈ A we consider a function f̃_a : R² → R ∪ {+∞} given by

and define f̃ : R^Ṽ → R ∪ {+∞} by

where x|_V denotes the restriction of x to V. For a flow ξ : Ã → R, we have
ξ(a⁺) = ξ(a⁻) if (∂ξ(v_a⁺), ∂ξ(v_a⁻)) ∈ dom f̃_a. The problem MSFP₃ is thus reduced
to MSFP₂ with the objective function F̃₂(ξ) = f̃(∂ξ). Note that, if f ∈ M[R → R]
and f_a ∈ C[R → R] for a ∈ A, then f̃ ∈ M[R → R].
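The graph-surgery step of this reduction is simple data manipulation; a sketch with illustrative encodings for the new vertices (the boundary cost f̃_a tying each pair of new vertices together is omitted):

```python
def split_arcs(V, A):
    """Replace each arc a = (u, v) by u -> v_a_minus and v_a_plus -> v,
    introducing two new vertices per arc, as in the reduction of Note 9.8."""
    V_new = list(V)
    A_new = []
    for i, (u, v) in enumerate(A):
        vp, vm = ("v+", i), ("v-", i)   # stand-ins for v_a_plus, v_a_minus
        V_new += [vp, vm]
        A_new.append((u, vm))           # the arc a_plus
        A_new.append((vp, v))           # the arc a_minus
    return V_new, A_new
```

The two halves of a split arc are connected only through the boundary cost at the new vertices, which forces equal flow on them and charges f_a for it.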

Note 9.9. The cost function F₃ of MSFP₃ consists of two terms, the separable arc
cost Σ_{a∈A} f_a(ξ(a)) and the M-convex boundary cost f(∂ξ). Noting that the former
is M♮-convex, one might be tempted to consider a (nonseparable) M♮-convex cost
function defined on the arc set. The integer-flow version of such a problem, however,
contains the Hamiltonian path problem, a well-known NP-complete problem, as a
special case.
Suppose that we want to check for the existence of an (s, t)-Hamiltonian path
in a directed graph G = (V, A), where we may assume s ≠ t ∈ V and δ⁻s = δ⁺t = ∅.
We construct another directed graph G̃ = (Ṽ, Ã) by replacing each arc a = (u, v) ∈
A with three arcs connected in series:

where v_a⁺ and v_a⁻ are newly introduced vertices. Hence, Ṽ = V ∪ {v_a⁺, v_a⁻ | a ∈ A}
and Ã = A⁺ ∪ A° ∪ A⁻, with A⁺ = {a⁺ | a ∈ A}, A° = {a° | a ∈ A}, and
A⁻ = {a⁻ | a ∈ A}. We consider three matroids, say, M⁺, M⁻, and M° on A⁺,
A⁻, and A°, respectively. M⁺ is a partition matroid in which B⁺ ⊆ A⁺ is a base
if and only if

M⁻ is another partition matroid defined similarly (with + replaced with −), and M°
is the graphic matroid in which B° ⊆ A° is a base if and only if {a ∈ A | a° ∈ B°} is
a tree of the original graph G. Let Q be the set of characteristic vectors of a subset
B of Ã such that B ∩ A⁺ is a base of M⁺, B ∩ A⁻ is a base of M⁻, and B ∩ A° is
an independent set of M°. Then a {0, 1}-flow ξ in G̃ with ξ ∈ Q and ∂ξ = χ_s − χ_t
corresponds to an (s, t)-Hamiltonian path in G. Since Q is an M♮-convex set, the
constraint ξ ∈ Q can be represented by a {0, +∞}-valued M♮-convex cost function
on the arc set Ã. ■

9.3 Feasibility of Submodular Flow Problem
The feasibility of the submodular flow problem MSFP₁ is investigated here. Recall
that we are given a graph G = (V, A), an upper capacity c̄ : A → R ∪ {+∞}, a lower
capacity c : A → R ∪ {−∞}, and a submodular set function ρ : 2^V → R ∪ {+∞},
where c̄(a) ≥ c(a) for a ∈ A and ρ(∅) = ρ(V) = 0. A feasible flow means a function
ξ : A → R that satisfies

    c(a) ≤ ξ(a) ≤ c̄(a)   (a ∈ A),   (9.52)
    ∂ξ ∈ B(ρ).   (9.53)

The problem MSFP₁ is said to be feasible if it admits a feasible flow.
In section 9.1.2 we considered (9.52) to obtain Theorem 9.3. We now combine
(9.52) and (9.53) for the feasibility of MSFP₁.

Theorem 9.10 (Feasibility). A submodular flow problem MSFP₁ is feasible if and
only if

    c(Δ⁺X) − c̄(Δ⁻X) ≤ ρ(X)   (X ⊆ V).   (9.54)

Moreover, if c̄, c, and ρ are integer valued and the problem is feasible, there exists
an integer-valued feasible flow ξ : A → Z.

Proof. Let κ be the cut capacity function defined by (9.16). By Theorem 9.3 a
feasible flow exists if and only if B(κ) ∩ B(ρ) ≠ ∅. The latter condition is equiva-
lent to

    κ(X) + ρ(V \ X) ≥ 0   (X ⊆ V)

by Edmonds's intersection theorem (Theorem 4.18) and further to (9.54) by

    κ(X) = c̄(Δ⁺X) − c(Δ⁻X).

In a feasible problem with integer-valued c̄, c, and ρ, both B(κ) and B(ρ) are
integral base polyhedra (integral M-convex polyhedra), and B(κ) ∩ B(ρ) ∩ Z^V is
nonempty by (4.32). Then (9.19) in Theorem 9.3 guarantees the existence of an
integer flow ξ : A → Z with ∂ξ ∈ B(κ) ∩ B(ρ) ∩ Z^V. □
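For small instances, condition (9.54) can again be checked by brute-force enumeration of subsets. A sketch with made-up data, where ρ(X) = min(|X|, |V \ X|) is a concave function of cardinality and hence submodular with ρ(∅) = ρ(V) = 0:

```python
from itertools import combinations

def feasible_msfp1(V, arcs, rho):
    """Brute-force check of (9.54): c(Delta+ X) - cbar(Delta- X) <= rho(X)
    for all X, with arcs given as (tail, head, lower, upper) and rho a
    submodular set function with rho(empty) = rho(V) = 0."""
    subsets = (frozenset(c) for r in range(len(V) + 1) for c in combinations(V, r))
    for X in subsets:
        lower_out = sum(lb for (a, b, lb, ub) in arcs if a in X and b not in X)
        upper_in = sum(ub for (a, b, lb, ub) in arcs if a not in X and b in X)
        if lower_out - upper_in > rho(X):
            return False
    return True
```

An infeasible example: a single arc forced to carry two units (lower = upper = 2) with ρ ≡ 0 violates (9.54) at X = {tail}, since two units must leave X but ρ(X) = 0 forbids any net supply there.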

Note 9.11. The necessity of (9.54) is easy to see. For any X ⊆ V, the net amount
of flow entering X is equal to zero:

and the constraints (9.52) and (9.53) should be satisfied:

Combining these two yields (9.54). Theorem 9.10 claims that this "obvious" neces-
sary condition is in fact sufficient. ■

Note 9.12. In a feasible submodular flow problem, the set of boundaries of feasible
flows, ∂Ξ = {∂ξ | ξ : feasible flow}, is an M₂-convex polyhedron, and it is an integral
M₂-convex polyhedron if c̄, c, and ρ are integer valued. This can be seen from the
proof of Theorem 9.10. ■

The maximum submodular flow problem is to find a feasible flow ξ that max-
imizes ξ(a₀) for a specified arc a₀ ∈ A.
Maximum submodular flow problem maxSFP:

    Maximize ξ(a₀) subject to
    c(a) ≤ ξ(a) ≤ c̄(a)   (a ∈ A),   (9.55)
    ∂ξ ∈ B(ρ).   (9.56)

A max-flow min-cut theorem holds for this problem. Note that for any X ⊆ V
with a₀ ∈ Δ⁺X we have an "obvious" inequality:

    ξ(a₀) ≤ ρ(X) + c̄(Δ⁻X) − c(Δ⁺X \ {a₀})

by (9.55) and (9.56).

Theorem 9.13 (Max-flow min-cut theorem). For a feasible maximum submodular
flow problem maxSFP,

    max{ξ(a₀) | ξ : feasible flow} = min{ρ(X) + c̄(Δ⁻X) − c(Δ⁺X \ {a₀}) | a₀ ∈ Δ⁺X, X ⊆ V},   (9.61)

where this common value can be +∞. If c̄, c, and ρ are integer valued and (9.61)
is finite, there exists an integer-valued maximum flow ξ : A → Z.

Proof. Divide the arc a₀ = (u, v) into two arcs in series, say, a₀ = (u, w) and
a₀′ = (w, v), and denote by G̃ = (Ṽ, Ã) the resulting graph, where Ṽ = V ∪ {w}
and Ã = A ∪ {a₀′}. Define the capacities of a₀′ by c(a₀′) = t and c̄(a₀′) = +∞ with
a parameter t, and let ρ̃ be defined for all subsets of Ṽ by ρ̃(X ∪ {w}) = ρ̃(X) = ρ(X)
for X ⊆ V. The maximum in (9.61) is equal to the maximum (or supremum) of t such
that the submodular flow problem on G̃ = (Ṽ, Ã) is feasible. With this relationship,
Theorem 9.10 implies (9.61) as well as the integrality assertion. □

If ξ and X attain the maximum and the minimum in (9.61), respectively, and
if ξ(a₀) < c̄(a₀), then we have

    ξ(a) = c̄(a)  (a ∈ Δ⁻X),   ξ(a) = c(a)  (a ∈ Δ⁺X \ {a₀}),   ∂ξ(X) = ρ(X).

9.4 Optimality Criterion by Potentials
In section 9.1.3 we saw a potential criterion for optimality (Theorem 9.4) for the
minimum cost flow problem MCFP₃. Since the M-convex submodular flow problem
MSFP₃ is a special case of MCFP₃, the following optimality criterion for MSFP₃ is
immediate from Theorem 9.4.

Theorem 9.14 (Potential criterion). In the M-convex submodular flow problem
MSFP₃ with f_a ∈ C[R → R] (a ∈ A) and f ∈ M[R → R], we have the following.
(1) For a feasible flow ξ : A → R, the two conditions (OPT) and (POT) below
are equivalent.
(OPT) ξ is an optimal flow.
(POT) There exists a potential p : V → R such that
(i) ξ(a) ∈ argmin f_a[δp(a)] for every a ∈ A, and
(ii) ∂ξ ∈ argmin f[−p].
(2) Suppose that a potential p : V → R satisfies (i) and (ii) above for an
optimal flow ξ. A feasible flow ξ′ is optimal if and only if
(i) ξ′(a) ∈ argmin f_a[δp(a)] for every a ∈ A, and
(ii) ∂ξ′ ∈ argmin f[−p].

A comment is in order on the role of the M-convexity of f. Since f_a is a
univariate convex function for every a ∈ A, condition (i) in (POT) can be expressed
in terms of directional derivatives as:

    f_a′(ξ(a); +1) ≥ −δp(a),   f_a′(ξ(a); −1) ≥ δp(a)   (a ∈ A).   (9.63)

If f is M-convex, condition (ii) in (POT) can also be expressed in terms of directional
derivatives as:

    f(∂ξ − χ_u + χ_v) − f(∂ξ) ≥ p(v) − p(u)   (u, v ∈ V)   (9.64)

by the M-optimality criterion in Theorem 6.52 (1). These expressions show how the
conditions in (POT) can be verified efficiently for a given p. It is also mentioned that
these expressions lead to another optimality criterion in terms of negative cycles, to
be established in section 9.5, and furthermore to the cycle-canceling algorithm for
the M-convex submodular flow problem, to be explained in section 10.4.3.
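The exchange tests above can be coded directly against a function oracle. The sketch below assumes integer vectors (so the unit exchange of (9.64) applies), linear uncapacitated arc costs (so condition (i) forces zero reduced cost), and an illustrative M-convex f, a separable quadratic restricted to the hyperplane x(V) = 0:

```python
INF = float("inf")

def check_pot(V, arcs, gamma, dxi, f, p):
    """Check (POT) for linear, uncapacitated arc costs.
    (i)  gamma(a) + p(u) - p(v) must vanish on every arc (u, v): with free
         capacities the linear reduced cost has a minimizer only at 0;
    (ii) the exchange test (9.64):
         f(dxi - chi_u + chi_v) - f(dxi) >= p(v) - p(u) for all u, v."""
    for (u, v) in arcs:
        if gamma[(u, v)] + p[u] - p[v] != 0:
            return False
    for u in V:
        for v in V:
            if u == v:
                continue
            y = dict(dxi)
            y[u] -= 1
            y[v] += 1
            if f(y) - f(dxi) < p[v] - p[u]:
                return False
    return True
```

Only O(|A| + |V|²) oracle calls are needed, which is what makes the verification of (POT) for a given p efficient.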

An alternative representation of condition (ii) in (POT) is obtained from the
M-convexity of f. The function conjugate to f, say, g, is a polyhedral L-convex
function with g(p + 1) = g(p) for all p (by Theorem 8.4 and (9.51)). It follows from
(3.30) and the L-optimality criterion (Theorem 7.33 (1)) that

This shows that argmin f[−p] coincides with the base polyhedron B(g_p) associated
with the set function g_p defined by

which is submodular by Theorem 7.43 (1). Hence,

    argmin f[−p] = B(g_p).   (9.65)

This expression is used in the primal-dual algorithm for the M-convex submodular
flow problem, to be explained in section 10.4.4.
We go on to discuss integrality properties of the M-convex submodular flow
problem. This generalizes the well-known facts (Theorem 9.6) for the minimum cost
flow problem MCFP₀. Recall the notation M[Z|R → R] and M[R → R|Z] for the
sets of integral and dual-integral polyhedral M-convex functions, respectively.

Theorem 9.15. Suppose that an optimal solution exists in the M-convex submod-
ular flow problem MSFP₃ with f_a ∈ C[R → R] (a ∈ A) and f ∈ M[R → R].
(1) The set of the boundaries of optimal flows,

    ∂Ξ* = {∂ξ | ξ : optimal flow},

is an M₂-convex polyhedron, and the set of optimal potentials,

    Π* = {p | p : optimal potential},

is an L-convex polyhedron.
(2) [Primal integrality] If f_a ∈ C[Z|R → R] (a ∈ A) and f ∈ M[Z|R → R],
then ∂Ξ* is an integral M₂-convex polyhedron, and there exists an integer-valued
optimal flow ξ : A → Z.
(3) [Dual integrality] If f_a ∈ C[R → R|Z] (a ∈ A) and f ∈ M[R → R|Z], then
Π* is an integral L-convex polyhedron, and there exists an integer-valued optimal
potential p : V → Z.

Proof. (1) Let p be an optimal potential. Since argmin f_a[δp(a)] forms an interval,
say, [c_*(a), c^*(a)]_R, condition (i) in (POT) of Theorem 9.14 can be expressed as
c_*(a) ≤ ξ(a) ≤ c^*(a) (a ∈ A). Just as in (9.16) and (9.18), the set of ∂ξ for such ξ
coincides with the base polyhedron B(κ*) for κ* defined by

    κ*(X) = c^*(Δ⁺X) − c_*(Δ⁻X)   (X ⊆ V).

Combining this with (9.65) we obtain ∂Ξ* = B(κ*) ∩ B(g_p), which is an M₂-convex
polyhedron. Now let ξ be an optimal flow. Potentials p satisfying (i) in (POT) form
an L-convex polyhedron, say, D₁, by (9.63), whereas those satisfying (ii) in (POT)
form another L-convex polyhedron D₂ by (9.64). Therefore, Π* = D₁ ∩ D₂ is an
L-convex polyhedron.
(2) Both B(κ*) and B(g_p) are integral M-convex polyhedra. The integrality
of ∂Ξ* = B(κ*) ∩ B(g_p) follows from (4.32).
(3) Both D₁ and D₂ are integral L-convex polyhedra by Theorem 6.61 (1).
The integrality of Π* = D₁ ∩ D₂ follows from Theorem 5.7. □

For linear arc cost, with f_a given by (9.50), the integrality conditions are
simplified as follows:

Finally, we state the optimality criterion for the integer-flow version of the
M-convex submodular flow problem MSFP₃. This is a corollary of Theorems 9.14
and 9.15.

Theorem 9.16 (Potential criterion). Consider the M-convex submodular integer-
flow problem MSFP₃ with f_a ∈ C[Z → R] (a ∈ A) and f ∈ M[Z → R].
(1) For a feasible integer flow ξ : A → Z, the two conditions (OPT) and
(POT) below are equivalent.
(OPT) ξ is an optimal integer flow.
(POT) There exists a potential p : V → R such that
(i) ξ(a) ∈ argmin f_a[δp(a)] for every a ∈ A, and
(ii) ∂ξ ∈ argmin f[−p].
(2) Suppose that a potential p : V → R satisfies (i) and (ii) above for an
optimal integer flow ξ. A feasible integer flow ξ′ is optimal if and only if
(i) ξ′(a) ∈ argmin f_a[δp(a)] for every a ∈ A, and
(ii) ∂ξ′ ∈ argmin f[−p].
(3) The set of the boundaries of optimal integer flows,

    ∂Ξ* = {∂ξ | ξ : optimal integer flow},

is an M₂-convex set.
(4) If the cost functions are integer valued, i.e., if f_a ∈ C[Z → Z] (a ∈ A) and
f ∈ M[Z → Z], then there exists an integer-valued potential p : V → Z in (POT).
Moreover, the set of integer-valued optimal potentials,

    Π* = {p | p : integer-valued optimal potential},

is an L-convex set.

In connection to (i) and (ii) in (POT) in Theorem 9.16, note the equivalences

These are the discrete counterparts of (9.63) and (9.64).

Note 9.17. The Fenchel-type duality theorem for M-convex functions (Theorem
8.21) is essentially equivalent to the optimality criterion for the M-convex submod-
ular integer-flow problem (Theorem 9.16). See section 9.1.4 and note that, for an
M-convex function f₁ and an M-concave function h₂, f(x₁, x₂) = f₁(x₁) − h₂(−x₂)
is an M-convex function. ■

9.5 Optimality Criterion by Negative Cycles
The optimality of an M-convex submodular flow can also be characterized by the
nonexistence of negative cycles in an auxiliary network. This fact leads to the
cycle-canceling algorithm to be described in section 10.4.3.

9.5.1 Negative-Cycle Criterion
We consider the M-convex submodular flow problem MSFP₂ with M-convex bound-
ary cost and linear arc cost. This is not restrictive, since MSFP₃, having nonlinear
convex arc cost, can be put in the form of MSFP₂, as explained in Note 9.8. We
consider real-valued flows and then integer-valued flows.
We assume f ∈ M[R → R] in considering real-valued flows. For a feasible
flow ξ : A → R, we define an auxiliary network as follows. Let G_ξ = (V, Ã_ξ) be a
directed graph with vertex set V and arc set Ã_ξ = A_ξ ∪ B_ξ ∪ C_ξ consisting of three
disjoint parts:

    A_ξ = {a | a ∈ A, ξ(a) < c̄(a)},
    B_ξ = {ā | a ∈ A, c(a) < ξ(a)}   (ā: reorientation of a),
    C_ξ = {(u, v) | u, v ∈ V, u ≠ v}.

We define a function ℓ_ξ : Ã_ξ → R, representing arc lengths, by

    ℓ_ξ(a) = γ(a)  (a ∈ A_ξ),   ℓ_ξ(ā) = −γ(a)  (ā ∈ B_ξ),
    ℓ_ξ((u, v)) = f′(∂ξ; χ_v − χ_u)  ((u, v) ∈ C_ξ).   (9.71)

We refer to (G_ξ, ℓ_ξ) as the auxiliary network. We call a directed cycle of negative
length a negative cycle.
The following theorem gives an optimality criterion in terms of negative cycles.

Theorem 9.18 (Negative-cycle criterion). For a feasible flow ξ : A → R to the
M-convex submodular flow problem MSFP₂ with f ∈ M[R → R], the conditions
(OPT) and (NNC) below are equivalent.
(OPT) ξ is an optimal flow.
(NNC) There exists no negative cycle in (G_ξ, ℓ_ξ) with ℓ_ξ of (9.71).

Proof. As is well known in network flow theory, (NNC) is equivalent to the existence
of a potential p : V → R such that

    ℓ_ξ(a) + p(∂⁺a) − p(∂⁻a) ≥ 0   for every arc a of G_ξ.

By (9.31), (9.32), (9.64), and the definition (9.71) of ℓ_ξ, this condition is equivalent
to conditions (i) and (ii) of (POT) in Theorem 9.14. Hence follows the equivalence
of (NNC) and (OPT) by Theorem 9.14. □

Note 9.19. In a problem with dual integrality the arc length ℓ_ξ is integer valued.
The integrality of ℓ_ξ(a) for a ∈ A_ξ ∪ B_ξ is due to (9.67) and that for a ∈ C_ξ is
by Theorem 6.61 (1). For integer-valued ℓ_ξ, we can take an integer-valued p in the
proof of Theorem 9.18. ■

Next we consider the integer-flow problem under the assumptions

    f ∈ M[Z → R],   c̄ : A → Z ∪ {+∞},   c : A → Z ∪ {−∞}.   (9.72)

For a feasible integer flow ξ : A → Z, we define an auxiliary network (G_ξ, ℓ_ξ) in a
similar manner, while modifying the definitions of C_ξ and ℓ_ξ to

    C_ξ = {(u, v) | u, v ∈ V, Δf(∂ξ; v, u) < +∞},   (9.73)
    ℓ_ξ(a) = Δf(∂ξ; v, u) = f(∂ξ − χ_u + χ_v) − f(∂ξ)   (a = (u, v) ∈ C_ξ).   (9.74)
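The exchange arcs (9.73) and their lengths (9.74) can be generated directly from a function oracle. A minimal sketch with an illustrative M-convex f (a separable quadratic restricted to the hyperplane x(V) = 0):

```python
def exchange_arcs(V, dxi, f):
    """Arcs (u, v) of length f(dxi - chi_u + chi_v) - f(dxi), cf. (9.74);
    pairs of infinite length are dropped, cf. (9.73)."""
    base = f(dxi)
    arcs = {}
    for u in V:
        for v in V:
            if u == v:
                continue
            y = dict(dxi)
            y[u] -= 1
            y[v] += 1
            w = f(y) - base
            if w != float("inf"):
                arcs[(u, v)] = w
    return arcs
```

Each oracle call probes one unit exchange, so building all of C_ξ costs O(|V|²) evaluations of f.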

Theorem 9.20 (Negative-cycle criterion). For a feasible integer flow ξ : A → Z to
the M-convex submodular integer-flow problem MSFP₂ with (9.72), the conditions
(OPT) and (NNC) below are equivalent.
(OPT) ξ is an optimal flow.
(NNC) There exists no negative cycle in (G_ξ, ℓ_ξ) with ℓ_ξ of (9.74).

Proof. This is similar to the proof of Theorem 9.18. Note, however, that Theorem
9.16 is used here in place of Theorem 9.14. □

Note 9.21. The M-convex intersection problem introduced in section 8.2.1 can
be formulated as an M-convex submodular flow problem. Given two M-convex
functions f₁, f₂ : Z^V → R ∪ {+∞}, we consider an M-convex submodular flow
problem on the bipartite graph G = (V₁ ∪ V₂, A) in Fig. 9.3, where V₁ and V₂ are
copies of V and A = {(v₁, v₂) | v ∈ V} with v₁ ∈ V₁ and v₂ ∈ V₂ denoting the copies
of v ∈ V.

Figure 9.3. Submodular flow problem for M-convex intersection problem.

The boundary cost function f : Z^{V₁} × Z^{V₂} → R ∪ {+∞} is defined by
f(x₁, x₂) = f₁(x₁) + f₂(−x₂) for x₁ ∈ Z^{V₁} and x₂ ∈ Z^{V₂}, whereas the arc costs are
identically zero without capacity constraints. Since x₁ = −x₂ if (x₁, x₂) = ∂ξ for
a flow ξ in this network, the M-convex submodular flow problem is equivalent to
minimizing f₁(x) + f₂(x). The negative-cycle optimality criterion (Theorem 9.20)
for this M-convex submodular flow problem yields the M₂-optimality criterion in
Theorem 8.33. This argument shows also that the M₂-optimality criterion can be
verified in polynomial time.

9.5.2 Cycle Cancellation
The negative-cycle optimality criterion states that the existence of a negative cycle
implies the nonoptimality of a feasible flow. This suggests the possibility of im-
proving a nonoptimal feasible flow by the cancellation of a suitably chosen negative
cycle.
Let us consider the integer-flow problem with (9.72). Suppose that negative
cycles exist in the auxiliary network (G_ξ, ℓ_ξ) for a feasible integer flow ξ, where the
arc length ℓ_ξ is defined by (9.74). Choose a negative cycle with the smallest number
of arcs and let Q (⊆ Ã_ξ) be the set of its arcs. Modifying the flow ξ along Q we
obtain a new integer flow ξ̃ defined by

    ξ̃(a) = ξ(a) + 1  (a ∈ Q ∩ A_ξ),   ξ̃(a) = ξ(a) − 1  (ā ∈ Q ∩ B_ξ),   ξ̃(a) = ξ(a)  (otherwise).   (9.75)

The following theorem shows that ξ̃ is a feasible flow with an improvement in the
objective function:

    F₂(ξ̃) ≤ F₂(ξ) + ℓ_ξ(Q) < F₂(ξ).

This gives an alternative proof for "(OPT) ⇒ (NNC)," which has already been
established in Theorem 9.20 with the aid of Theorem 9.16.
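The flow update of a cycle cancellation is simple bookkeeping; a sketch, with an illustrative arc encoding ("fwd" for arcs of A_ξ, "rev" for reoriented arcs of B_ξ):

```python
def cancel_cycle(xi, Q):
    """Apply the +1/-1 update along a cycle Q: add one unit on forward arcs,
    subtract one unit on reversed arcs; exchange arcs (from C_xi) change only
    the boundary and leave xi itself untouched."""
    new = dict(xi)
    for kind, a in Q:
        if kind == "fwd":       # a in Q intersect A_xi
            new[a] += 1
        elif kind == "rev":     # reorientation of a in Q intersect B_xi
            new[a] -= 1
    return new
```

The substance of the section is not this update but the proof that, for a negative cycle with the fewest arcs, the updated flow really is feasible and cheaper.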

Theorem 9.22. For a feasible integer flow ξ to the M-convex submodular integer-
flow problem MSFP₂ with (9.72), let Q be a negative cycle with the smallest number
of arcs in (G_ξ, ℓ_ξ). Then ξ̃ in (9.75) is a feasible integer flow and

    F₂(ξ̃) ≤ F₂(ξ) + ℓ_ξ(Q) < F₂(ξ).

The rest of this section is devoted to the proof of Theorem 9.22. The key
ingredient of the proof is the unique-min condition, defined as follows.
For a pair (x, y) of integer vectors satisfying x ∈ dom f and ‖x − y‖_∞ = 1, we
consider a bipartite graph G(x, y) = (V⁺, V⁻; E) with vertex sets V⁺ = supp⁺(x −
y) and V⁻ = supp⁻(x − y) and arc set

    E = {(u, v) | u ∈ V⁺, v ∈ V⁻},

and associate c(u, v) = Δf(x; v, u) with arc (u, v) ∈ E as its weight. We say
that (x, y) satisfies the unique-min condition if there exists in G(x, y) exactly one
minimum-weight perfect matching with respect to c.
Denote by f̂(x, y) the minimum weight of a perfect matching in G(x, y), where
f̂(x, y) = +∞ if no perfect matching exists. Proposition 6.25 shows f(y) − f(x) ≥
f̂(x, y) for any x ∈ dom f and y ∈ Z^V. The unique-min condition is a sufficient
condition for this inequality to be an equality.

Proposition 9.23. Let f ∈ M[Z → R] be an M-convex function, and assume
x ∈ dom f, y ∈ Z^V, and ‖x − y‖_∞ = 1. If (x, y) satisfies the unique-min condition,
then y ∈ dom f and

    f(y) − f(x) = f̂(x, y).

Proof. The set function ω defined by ω(X) = −f(x ∧ y + χ_X) (X ⊆ V) is a valuated
matroid; see (2.77). The present claim is a reformulation of the unique-max lemma
for valuated matroids (see Theorem 5.2.35 in Murota [146]). □

The following proposition gives a necessary and sufficient condition for a bi-
partite graph to have a unique minimum-weight perfect matching. It also shows
that the unique-min condition for a pair of integer vectors can be checked by an
efficient algorithm.

Proposition 9.24. Let G = (V⁺, V⁻; E) be a bipartite graph with |V⁺| = |V⁻| (=
m) and c : V⁺ × V⁻ → R ∪ {+∞} be a weight function such that c(u, v) < +∞ ⟺
(u, v) ∈ E. There exists a unique minimum-weight perfect matching if and only if
there exist a potential p : V⁺ ∪ V⁻ → R and orderings of vertices V⁺ = {u₁, ..., u_m}
and V⁻ = {v₁, ..., v_m} such that

Proof. This follows from the complementarity (Theorem 3.10 (3)) in the linear
program formulation in the proof of Proposition 3.14. □
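For small graphs the unique-min condition can also be checked by brute force, enumerating all perfect matchings of the weighted complete bipartite graph; Proposition 9.24 is what makes an efficient (polynomial-time) check possible instead. A sketch:

```python
from itertools import permutations

def min_matchings(weight, m):
    """All minimum-weight perfect matchings of K_{m,m}, each encoded as a
    permutation; weight[i][j] may be float('inf') for a missing edge."""
    best, argbest = float("inf"), []
    for perm in permutations(range(m)):
        w = sum(weight[i][perm[i]] for i in range(m))
        if w < best:
            best, argbest = w, [perm]
        elif w == best:
            argbest.append(perm)
    return best, argbest

def unique_min(weight, m):
    """The unique-min condition: exactly one minimum-weight perfect matching."""
    best, arg = min_matchings(weight, m)
    return best < float("inf") and len(arg) == 1
```

With weights [[0, 5], [5, 0]] the identity matching is the unique minimum; with all-zero weights both matchings tie, so the condition fails.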

The following is the key fact. Note that <9£ G domz/ and \\d£ — <9£||oo — 1-
9.5. Optimality Criterion by Negative Cycles 267

Proposition 9.25. (3£,<9£) satisfies the unique-min condition.

Proof. Consider the bipartite graph G(d£,d£) = (V+,V~;E), where V+ =


supp+(<9£ - d£), V' = supp-(<9£ - <9£), and

We have \V+\ = \V \ — m for m = ||<9£ — d£||i/2 and the weight of arc (u,v)
equal to A/(9£; v, u). We may think of G(d£, d£) as a subgraph of the graph Gj of
section 9.5.1 by regarding E as a subset of C^ in (9.73). Then Q n C% determines a
perfect matching in G(d£,d£).
Let M = {(u_i, v_i) | i = 1, ..., m} be a minimum-weight perfect matching in
G(∂ξ, ∂ξ̃) and p be an optimal potential in Proposition 3.14. Note that M is a
subset of

Regarding M as a subset of C_ξ, we define Q′ = (Q \ C_ξ) ∪ M. Since M is a minimum-
weight perfect matching, Q ∩ C_ξ is a perfect matching, and γ(a) = Δf(∂ξ; v, u) for
a = (u, v) ∈ C_ξ, we have γ(M) ≤ γ(Q ∩ C_ξ), from which follows

Since Q' is a union of disjoint cycles with \Q'\ = \Q\ and Q is a negative cycle with
the smallest number of arcs, (9.78) implies that Q' is also a negative cycle with the
smallest number of arcs.
To prove by contradiction, suppose that (∂ξ, ∂ξ̃) does not satisfy the unique-
min condition. Since (u_i, v_i) ∈ C_ξ for i = 1, ..., m, it follows from Proposition 9.24
that there exist distinct indices i_k (k = 1, ..., q; q ≥ 2) such that (u_{i_k}, v_{i_{k+1}}) ∈ C_ξ
for k = 1, ..., q, where i_{q+1} = i₁. That is,

On the other hand, we have

It then follows that

i.e.,

For k = 1, ..., q, let P′(v_{i_{k+1}}, u_{i_k}) denote the path on Q′ from v_{i_{k+1}} to u_{i_k},
and let Q′_k be the directed cycle consisting of the arc (u_{i_k}, v_{i_{k+1}}) and the path P′(v_{i_{k+1}}, u_{i_k}).
Obviously,

where the union here (and also below) means the multiset union, counting the
number of occurrences of elements. A simple but crucial observation is that

for some integer q′ with 1 ≤ q′ ≤ q. Hence,

where (9.79) and (9.78) are used. This implies that γ(Q′_k) < 0 for some k, which,
however, is a contradiction, since Q′_k has a smaller number of arcs than Q′. This
completes the proof. □

Proof of Theorem 9.22: It follows from Propositions 9.25 and 9.23 as well as
the definition of f̂ that

whereas

Adding these two results in Γ₂(ξ̃) ≤ Γ₂(ξ) + γ(Q).

9.6 Network Duality


Transformation by a network is one of the most important operations for M-convex
and L-convex functions. A given pair of M-convex and L-convex functions defined
on entrance vertices of a network is transformed through the network to another
pair of M-convex and L-convex functions on exit vertices. Moreover, if the functions
in the given pair are conjugate to each other, the resulting pair is also conjugate.
This fact reveals a deeper intrinsic relationship of M-/L-convexity to network flow,
partly discussed in section 2.2. The theorems as well as their implications are stated
in section 9.6.1, and the proofs are given in section 9.6.2.

Figure 9.4. Transformation by a network.

9.6.1 Transformation by Networks


We first deal with functions of the Z → Z type, integer-valued functions defined on
integer points, and then functions of other types, Z → R and R → R.
Let G = (V, A; S, T) be a directed graph with vertex set V, arc set A, entrance
set S, and exit set T, where S and T are disjoint subsets of V; see Fig. 9.4 for
an illustration. For each a ∈ A, the costs of integer-valued flow and tension are
represented, respectively, by functions f_a : Z → Z ∪ {+∞} and g_a : Z → Z ∪ {+∞}.
Given functions f, g : Z^S → Z ∪ {+∞} associated with the entrance set S of
the network, we define functions f̃, g̃ : Z^T → Z ∪ {±∞} on the exit set T by

We may think of f̃(y) as the minimum cost to meet a demand specification y at the
exit, where the cost consists of two parts, the cost f(x) of supply or production of
x at the entrance and the cost Σ_{a∈A} f_a(ξ(a)) of transportation through arcs; the
sum of these is to be minimized over varying supply x and flow ξ subject to the
flow conservation constraint ∂ξ = (x, −y, 0). A similar interpretation is possible
for g̃(q). We regard f̃ and g̃ as the results of transformations of f and g by the
network; (9.81) and (9.82) are called transformations of flow type and of potential
type, respectively.
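To make the flow-type transformation (9.81) concrete, here is a brute-force evaluation on a toy network of our own devising (not from the text): one entrance s, one exit t, and two parallel arcs a₁, a₂ from s to t, so that flow conservation forces x = ξ(a₁) + ξ(a₂) = y. The search window and all names are illustrative assumptions:

```python
INF = float("inf")

def transform_flow_type(f, arc_costs, y, bound=20):
    """f~(y) on the 2-parallel-arc network: minimize the supply cost f(y)
    plus the arc costs f_a1(xi1) + f_a2(xi2) over splits xi1 + xi2 = y."""
    f1, f2 = arc_costs
    best = INF
    for xi1 in range(-bound, bound + 1):
        xi2 = y - xi1
        best = min(best, f(y) + f1(xi1) + f2(xi2))
    return best

f = lambda x: x * x        # supply cost at the entrance
f1 = lambda t: abs(t)      # univariate discrete convex arc costs
f2 = lambda t: 2 * abs(t)
# f(3) = 9 plus min(|xi1| + 2|3 - xi1|) = 3, attained at xi1 = 3:
print(transform_flow_type(f, (f1, f2), 3))  # 12
```

On this network f̃(y) = f(y) + (f_{a₁} □ f_{a₂})(y), i.e., the arc costs enter through an infimal convolution, which previews the operations discussed in the notes below.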

The following theorem reveals the harmonious relationship between network
flow and M-/L-convexity, by which a conjugate pair of M-convex and L-convex func-
tions is transformed to another conjugate pair of M-convex and L-convex functions.
Note that C[Z → Z] denotes the set of univariate integer-valued discrete convex
functions, and * means the discrete Legendre–Fenchel transformation (8.11).

Theorem 9.26. Assume f_a, g_a ∈ C[Z → Z] for each a ∈ A. For f, g : Z^S →
Z ∪ {+∞}, let f̃, g̃ : Z^T → Z ∪ {±∞} be the functions induced by G = (V, A; S, T)
according to (9.81) and (9.82), where it is assumed that f̃ > −∞, f̃ ≢ +∞, g̃ > −∞,
and g̃ ≢ +∞.

We explain the implications of this theorem by considering three special cases.
The first special case is a well-known construction in matroid theory, induction of
a matroid through a graph. Given a graph G = (V, A; S, T) and a matroid (S, 𝓑) on
S with base family 𝓑, let 𝓑̃ be the family of subsets of T that can be linked with
some base of (S, 𝓑) by a vertex-disjoint linking in G. Then 𝓑̃ forms the base family
of a matroid on T, which is referred to as the matroid induced from (S, 𝓑) through
G. To formulate this as a special case of Theorem 9.26 (1), we split each vertex
v ∈ V into two copies, v′ and v″, to consider a graph G̃ = (V′ ∪ V″, Ã; S′, T″) with

where S′ = {v′ | v ∈ S} and T″ = {v″ | v ∈ T}. Let f : Z^{S′} → Z ∪ {+∞} be the
indicator function of the set of the characteristic vectors of bases, i.e., f = δ_B for
B = {χ_{X′} | X ∈ 𝓑}, and, for each arc a ∈ Ã, define f_a to be the indicator function
of {0, 1}. Then the induced function f̃ : Z^{T″} → Z ∪ {+∞} represents the family 𝓑̃
in the sense that f̃ = δ_{B̃} for B̃ = {χ_{Y″} | Y ∈ 𝓑̃}. Then the M-convexity of f̃ stated
in Theorem 9.26 (1) shows that (T, 𝓑̃) is a matroid.
The second case is where (S, T) = (V, ∅). Then the induced functions f̃ and
g̃ are constants, having no arguments, and the conjugacy asserted in Theorem 9.26
(3) amounts to a min-max relation

for

This is the discrete counterpart of (9.36), showing the duality nature of the assertion
of Theorem 9.26 (3).
The third case is where S = ∅. Then the induced functions are

which are identical to (2.42) and (2.43), respectively, and the claims in Theorem
9.26 reduce to the facts observed in section 2.2.2.
Whereas Theorem 9.26 deals with integer-valued functions defined on integer
points, similar statements are true for functions of type Z → R and R → R. Note
that the conjugacy assertion is missing in the case of Z → R.

Theorem 9.27. Assume f_a, g_a ∈ C[Z → R] for each a ∈ A. For f, g : Z^S →
R ∪ {+∞}, let f̃, g̃ : Z^T → R ∪ {±∞} be the functions induced by G = (V, A; S, T)
according to (9.81) and (9.82), where it is assumed that f̃ > −∞, f̃ ≢ +∞, g̃ > −∞,
and g̃ ≢ +∞.

Theorem 9.28. Assume f_a, g_a ∈ C[R → R] for each a ∈ A. For f, g : R^S →
R ∪ {+∞}, let f̃, g̃ : R^T → R ∪ {±∞} be the functions induced by G = (V, A; S, T)
according to

where it is assumed that f̃ > −∞, f̃ ≢ +∞, g̃ > −∞, and g̃ ≢ +∞.

Figure 9.5. Bipartite graphs for aggregation and convolution operations.

where * means the Legendre-Fenchel transformation (3.26).

A number of fundamental operations on M-convex and L-convex functions can
be formulated as transformations by networks, as is partly demonstrated below.

Note 9.29. The M-convexity of the aggregation f_{U∗} of an M-convex function f
(Theorem 6.13 (7)) is proved here as an application of Theorem 9.27. Let V′ be
a copy of V and consider a bipartite graph G = (S ∪ T, A; S, T) with S = V′,
T = U ∪ {u₀}, and A = {(v′, v) | v ∈ U} ∪ {(v′, u₀) | v ∈ V \ U}, where v′ ∈ V′ is
the copy of v ∈ V and u₀ is a distinguished vertex (see Fig. 9.5 (left)). We regard
f as being defined on S and assume that the arc cost functions f_a (a ∈ A) are
identically zero. The function f̃ induced on T coincides with the aggregation f_{U∗}.
This also means that the aggregation f_{U∗} can be evaluated by solving an M-convex
submodular flow problem. ■

Note 9.30. The M-convexity of the infimal convolution f₁ □_Z f₂ of M-convex
functions (Theorem 6.13 (8)) is proved here as an application of Theorem 9.27. Let
V₁ and V₂ be copies of V and consider a bipartite graph G = (S ∪ T, A; S, T) with
S = V₁ ∪ V₂, T = V, and A = {(v₁, v) | v ∈ V} ∪ {(v₂, v) | v ∈ V}, where v_i ∈ V_i is
the copy of v ∈ V for i = 1, 2 (see Fig. 9.5 (right)). We regard f_i as being defined
on V_i for i = 1, 2 and assume that the arc cost functions f_a (a ∈ A) are identically
zero. The function f̃ induced on T coincides with the infimal convolution f₁ □_Z f₂.
This also means that the infimal convolution f₁ □_Z f₂ can be evaluated by solving
an M-convex submodular flow problem. ■
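In one dimension the infimal convolution of Note 9.30 reads (f₁ □_Z f₂)(x) = min{f₁(x₁) + f₂(x₂) : x₁ + x₂ = x, x₁, x₂ ∈ Z}, which can be evaluated by brute force over a finite window; this sketch and its bounds are our own illustration, not from the text:

```python
def infimal_convolution(f1, f2, x, bound=50):
    """(f1 [] f2)(x) = min over integer splits x1 + x2 = x of
    f1(x1) + f2(x2), searched on the window [-bound, bound]."""
    return min(f1(x1) + f2(x - x1) for x1 in range(-bound, bound + 1))

f1 = lambda z: (z - 1) ** 2   # minimized at 1
f2 = lambda z: (z - 4) ** 2   # minimized at 4
# The convolution is minimized at 1 + 4 = 5 with value 0.
print([infimal_convolution(f1, f2, x) for x in range(4, 7)])  # [1, 0, 1]
```

For univariate convex summands the inner minimization is itself convex in x₁, so in practice the window scan can be replaced by a ternary search.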

Figure 9.6. Rooted directed tree for a laminar family.

Note 9.31. An alternative proof of the M♮-convexity of a laminar convex function
(6.34) is given here as an application of Theorem 9.27. Let 𝒯 be a laminar family
of subsets of V, where we may assume that ∅ ∉ 𝒯, V ∈ 𝒯, and every singleton
set belongs to 𝒯. We represent 𝒯 by a directed tree G = (U, A; S, T) with root
u₀, where U = {u_X | X ∈ 𝒯} ∪ {u₀}, A = {a_X | X ∈ 𝒯}, S = {u₀}, T =
{u_{{v}} | v ∈ V}, and ∂⁻a_X = u_X and ∂⁺a_X = u_{X̂} for X ∈ 𝒯, where X̂ denotes
the smallest member of 𝒯 that properly contains X (and u_{X̂} = u₀ for X = V by convention).
As an example, the rooted directed tree (arborescence) for V = {1, 2, 3, 4, 5} and
𝒯 = {{1}, {2}, {3}, {4}, {5}, {2, 3}, {1, 2, 3}, {4, 5}, V} is depicted in Fig. 9.6. We
associate the given function f_X with arc a_X for X ∈ 𝒯. The function f̃ on T induced
from f = 0 on S by this network coincides with the laminar convex function (6.34),
and its M♮-convexity follows from Theorem 9.27. ■

9.6.2 Technical Supplements


Proof of Theorem 9.26
It suffices to consider the case of f ∈ M[Z → Z] and g ∈ L[Z → Z].
(1) To prove (M-EXC[Z]) for f̃, we fix y₁, y₂ ∈ dom f̃ and u ∈ supp⁺(y₁ − y₂)
and look for v ∈ supp⁻(y₁ − y₂) such that⁵⁸

by a refinement of the augmenting path argument used in Note 2.19 for the special
case of S = ∅. We take ξ_i ∈ Z^A and x_i ∈ Z^S for i = 1, 2 such that

We search for a kind of augmenting path with respect to the pair (ξ₁, ξ₂) that yields
the desired inequality (9.84). If we are not successful in finding such a path, we
modify the flow pair to a new pair with smaller ℓ₁-distance ||ξ₁ − ξ₂||₁, so that we
can eventually find an appropriate augmenting path.
⁵⁸For w ∈ T, χ_w^T is the characteristic vector of {w} in Z^T.

Before giving a formal proof we explain the idea of the proof in a typical
situation. Consider the difference of the flows, ξ₂ − ξ₁ ∈ Z^A, for which we have
∂(ξ₂ − ξ₁) = (x₂ − x₁, y₁ − y₂, 0). Since u ∈ supp⁺(∂(ξ₂ − ξ₁)), there exists a
simple path, say, P₁, compatible with ξ₂ − ξ₁ that connects u to some vertex v₁
in supp⁻(∂(ξ₂ − ξ₁)) = supp⁺(x₁ − x₂) ∪ supp⁻(y₁ − y₂). This is an augmenting
path with respect to the pair of flows ξ₁ and ξ₂. Suppose that we have the case of
v₁ ∈ supp⁺(x₁ − x₂). By (M-EXC[Z]) for f we obtain u₁ ∈ supp⁻(x₁ − x₂) such
that⁵⁹

Since u₁ ∈ supp⁺(∂(ξ₂ − ξ₁)), there exists a simple path, say, P₂, compatible with
ξ₂ − ξ₁ that connects u₁ to some v₂ ∈ supp⁻(∂(ξ₂ − ξ₁)) = supp⁺(x₁ − x₂) ∪
supp⁻(y₁ − y₂). Suppose further that v₂ ∈ supp⁻(y₁ − y₂) and P₂ is vertex disjoint
from P₁. Putting v = v₂, we represent the path P₁ ∪ P₂ by π : A → {0, ±1}
such that supp⁺(π) ⊆ supp⁺(ξ₂ − ξ₁), supp⁻(π) ⊆ supp⁻(ξ₂ − ξ₁), and ∂π =
(χ_{u₁} − χ_{v₁}, χ_u − χ_v, 0). For the augmented flows ξ₁′ = ξ₁ + π and ξ₂′ = ξ₂ − π and
the new bases x₁′ = x₁ − χ_{v₁} + χ_{u₁} and x₂′ = x₂ + χ_{v₁} − χ_{u₁}, we have

and

since

By (9.85), (9.86), (9.88), and (9.87), we obtain

which shows the inequality (9.84). Having presented the rough idea, we are now in
a position to start the proof that works in general.
We shall construct a pair of flows ξ₁′ and ξ₂′ that satisfy (9.87) for some v ∈
supp⁻(y₁ − y₂) and x₁′, x₂′ ∈ Z^S, and also

where

⁵⁹For w ∈ S, χ_w^S is the characteristic vector of {w} in Z^S.

with b|_S denoting the restriction of b to S. To obtain such (ξ₁′, ξ₂′) we generate a
sequence of tuples (φ₁, φ₂, b₁, b₂, w) with φ₁, φ₂ ∈ Z^A, b₁, b₂ ∈ Z^V, and w ∈ V
such that

where χ_w ∈ Z^V in (9.93). We call (φ₁, φ₂, b₁, b₂, w) a flow-boundary tuple and w
the frontier vertex. Note that (b₁, b₂) is determined uniquely from (φ₁, φ₂, w) by
(9.93).
We start with (φ₁, φ₂, b₁, b₂, w) = (ξ₁, ξ₂, ∂ξ₁ + χ_u, ∂ξ₂ − χ_u, u), which obviously
satisfies (9.90) to (9.94). If the frontier vertex w is in supp⁻(y₁ − y₂), we end
successfully with (ξ₁′, ξ₂′, v) = (φ₁, φ₂, w); note that for w ∈ T we have ∂φ₁ =
(b₁|_S, −(y₁ − χ_u + χ_w), 0) and ∂φ₂ = (b₂|_S, −(y₂ + χ_u − χ_w), 0).
A flow-boundary tuple (φ₁, φ₂, b₁, b₂, w) is updated to another flow-boundary
tuple (φ₁′, φ₂′, b₁′, b₂′, w′) in three ways: (i) flow push, (ii) basis exchange, and (iii)
crossover. In all three cases, ||φ₁′ − φ₂′||₁ ≤ ||φ₁ − φ₂||₁ and the conditions (9.90) to (9.94)
are maintained.
A flow push augments one unit of flow along an arc a* incident to the frontier
vertex w. If w = ∂⁺a* (w is the initial vertex of a*) and φ₁(a*) < φ₂(a*), the
flows are updated as φ₁′(a*) = φ₁(a*) + 1, φ₂′(a*) = φ₂(a*) − 1, and the frontier
vertex is changed to w′ = ∂⁻a*, the terminal vertex of a*. Symmetrically, if
w = ∂⁻a* and φ₁(a*) > φ₂(a*), the flows and the frontier vertex are updated as
φ₁′(a*) = φ₁(a*) − 1, φ₂′(a*) = φ₂(a*) + 1, w′ = ∂⁺a*. The boundaries remain
unchanged: b₁′ = b₁ and b₂′ = b₂. See (9.88) for the condition (9.94).
A basis exchange applies when w ∈ S and b₁(w) > b₂(w). By (M-EXC[Z]) for
f there exists w′ ∈ S such that b₁(w′) < b₂(w′) and

The frontier vertex is updated to this w′ and the boundaries to b₁′ = b₁ − χ_w + χ_{w′},
b₂′ = b₂ + χ_w − χ_{w′}. The flows remain the same: φ₁′ = φ₁ and φ₂′ = φ₂.
A crossover updates (φ₁, φ₂, b₁, b₂, w) with reference to another flow-boundary
tuple (φ₁°, φ₂°, b₁°, b₂°, w°) with the same frontier vertex w° = w. The flow-boundary
tuple is updated to

according to whether

or not. The condition (9.94) is maintained, since


276 Chapter 9. Network Flows

Under the additional conditions that

we have

We use crossovers to avoid cycling in generating the flow-boundary tuples.


The generation of flow-boundary tuples consists of stages. Each stage consists
of repeated applications of flow push and basis exchange, possibly followed by an
application of crossover. Denote by (φ₁^(1), φ₂^(1), b₁^(1), b₂^(1), w^(1)) the flow-boundary tuple
at the beginning of a stage and by (φ₁^(j), φ₂^(j), b₁^(j), b₂^(j), w^(j)) (j = 2, ..., k) the flow-
boundary tuples generated so far in the stage. We end the stage if either (a) w^(k) ∈
supp⁻(y₁ − y₂) or (b) w^(k) = u ≠ w^(1); we are done in case (a), whereas in case (b)
we go on to the next stage. If w^(k) = w^(j) for some j with 1 ≤ j ≤ k − 1, we apply
crossover to (φ₁^(k), φ₂^(k), b₁^(k), b₂^(k), w^(k)) with reference to (φ₁^(j), φ₂^(j), b₁^(j), b₂^(j), w^(j))
to obtain the next flow-boundary tuple (φ₁^(k+1), φ₂^(k+1), b₁^(k+1), b₂^(k+1), w^(k+1)) and
end this stage to go on to the next stage. Otherwise, we apply flow push or
basis exchange, whichever is applicable, to generate the next flow-boundary tuple
(φ₁^(k+1), φ₂^(k+1), b₁^(k+1), b₂^(k+1), w^(k+1)) in the current stage. We prohibit, however,
applying a flow push on an arc a* right after a flow push on the same arc a* (this
is possible if |φ₁(a*) − φ₂(a*)| = 1). We also prohibit applying a basis exchange
with a pair (w, w′) right after a basis exchange with (w′, w) (this is possible if
b₁(w) = b₂(w) + 1 and b₁(w′) = b₂(w′) − 1).
Thus, a stage terminates if (a) w^(k) ∈ supp⁻(y₁ − y₂), (b) w^(k) = u ≠ w^(1),
or (c) a crossover is applied. In case (a) we have successfully found v = w^(k)
for (9.84). In case (b) we have b₁ = ∂φ₁ + χ_u, b₂ = ∂φ₂ − χ_u, and w = u for
(φ₁, φ₂, b₁, b₂, w) = (φ₁^(k), φ₂^(k), b₁^(k), b₂^(k), w^(k)), just as we had for (φ₁, φ₂, b₁, b₂, w) =
(ξ₁, ξ₂, ∂ξ₁ + χ_u, ∂ξ₂ − χ_u, u) at the beginning of the generation process. In case (c)
we obtain a closer pair of flows, with which the next stage starts.
The above generation process terminates with a finite number of flow-boundary
tuples. This is because (i) the frontier vertices in one stage are distinct and hence
the number of flow-boundary tuples generated in one stage is bounded by |V|,
(ii) ||φ₁ − φ₂||₁ decreases at least by one at a stage ending in case (c) (note that
the conditions in (9.95) are met and (9.96) holds true), and (iii) a stage ending in
case (b) must be preceded by a stage ending in case (c).
We have thus shown how to construct a desired pair of flows (ξ₁′, ξ₂′) by gener-
ating flow-boundary tuples starting with (φ₁, φ₂, b₁, b₂, w) = (ξ₁, ξ₂, ∂ξ₁ + χ_u, ∂ξ₂ −
χ_u, u). This completes the proof of (1).
(2) Put α = g(p + 1) − g(p), which is independent of p by (TRF[Z]). Since
S(p + 1, q + 1, r + 1) = S(p, q, r), we have
9.6. Network Duality 277

which shows (TRF[Z]) for g̃. Suppose that g̃(q) is finite for q = q₁, q₂. There exist
(η₁, p₁, r₁) and (η₂, p₂, r₂) such that

Here we have

by the submodularity (SBF[Z]) of g and

with

by the convexity of g_a (see Note 2.20). The submodularity (SBF[Z]) of g̃ then
follows because

(cf. (9.21)), whereas the assumed conjugacy implies

Therefore, we have

from which follows the weak duality

Fix y with f̃(y) finite. By Theorem 9.16 (4) there exists (p*, q*, r*) such that

with η* = −S(p*, q*, r*). This implies

Combining this with (9.97) shows g̃ = f̃*.


The proof of Theorem 9.26 is completed. It is mentioned that (1) follows from
(2) and (3) with the aid of the conjugacy theorem (Theorem 8.12).

Proof of Theorem 9.27


Because the functions are real valued, the infima in (9.81) and (9.82) may not be
attained. The proof for Theorem 9.26 can be adapted to this case by introducing
ε > 0 and letting ε → 0 as in Notes 2.19 and 2.20.

Proof of Theorem 9.28


The augmenting flow argument for (1) suffers from a technical difficulty in that the
amount of augmenting flow may converge to zero. To circumvent this difficulty
we first prove (2) and (3) and then use the conjugacy (Theorem 8.4) to show (1).
See Murota–Shioura [152] for details.

Bibliographical Notes
The minimum cost flow problem treated in section 9.1 is one of the most fundamental
problems in combinatorial optimization. For network flows, Ford-Fulkerson [53]
is the classic, whereas Ahuja-Magnanti-Orlin [1] describes recent algorithmic
developments; see also Cook-Cunningham-Pulleyblank-Schrijver [26], Du-Pardalos [43],
Korte-Vygen [115], Lawler [119], and Nemhauser-Wolsey [167]. Thorough
treatments of the network flow problem on the basis of convex analysis can be found in
Iri [94] and Rockafellar [178].
The submodular flow problem was introduced by Edmonds-Giles [46] using
crossing-submodular functions. The present form avoids crossing-submodular
functions on the basis of the fact, due to Fujishige [61], that the base polyhedron defined
by a crossing-submodular function can also be described by a submodular function.
See Fujishige [65] for other equivalent neoflow problems, such as the independent
flow (Fujishige [59]) and the polymatroidal flow (Hassin [87], Lawler-Martel [120],
[121]). The M-convex submodular flow problem was introduced by Murota [142].
Section 9.3 is a collection of standard results on the submodular flow problem.
Theorems 9.10 and 9.13 are taken from Fujishige [65] (Theorems 5.1 and 5.11,
respectively), where the former is ascribed to Frank [56].
The optimality criterion by potentials for the M-convex submodular flow problem
was established by Murota [142] for the integer-flow version (Theorem 9.16) and
adapted to the real-valued case (Theorem 9.14) with the integrality assertion
(Theorem 9.15) in Iwata-Shigeno [105] and Murota [147].
The optimality criterion by negative cycles (Theorem 9.20) was established
by Murota [140], [142] for the integer-flow version and adapted to the real-valued
case (Theorem 9.18) in Murota-Shioura [152]. Theorem 9.22 is in [140], [142].
Proposition 9.23 is a reformulation of the unique-max lemma due to Murota [135].
Proposition 9.25 is also from [135]; the proof technique using (9.80) originated in
Fujishige [59].
Transformation by networks was found first for M-convex functions f ∈ M[Z →
R] by Murota [137]; the proof given in section 9.6.2 is due to Shioura [188], [189].
Transformation of L-convex functions is stated explicitly in Murota [145]. The
extension to polyhedral M-convex and L-convex functions is made in Murota-Shioura

[152]. Theorems 9.26, 9.27, and 9.28 are explicit in Murota [147]. Induction of
matroids through graphs is due to Perfect [171] (for bipartite graphs) and Brualdi [21];
see also Schrijver [183], Welsh [211], and White [213]. The alternative proof of the
M♮-convexity of a laminar convex function described in Note 9.31 was communicated
by A. Shioura.
Chapter 10

Algorithms

Algorithmic aspects of M-convex and L-convex functions are discussed in this
chapter. Three fundamental optimization problems tractable by efficient algorithms are
(i) M-convex function minimization, which is a nonlinear extension of the minimum-
weight base problem for matroids; (ii) L-convex function minimization, which
includes submodular set function minimization as a special case; and (iii)
minimization/maximization in the Fenchel-type min-max duality, which is equivalent to the
M-convex submodular flow problem.

10.1 Minimization of M-Convex Functions


Four kinds of algorithms for M-convex function minimization are described: the
steepest descent algorithm, the steepest descent scaling algorithm, the domain
reduction algorithm, and the domain reduction scaling algorithm. Throughout this
section we assume that f : Z^V → R ∪ {+∞} is an M-convex function, n = |V|, and
F is an upper bound on the time to evaluate f.

10.1.1 Steepest Descent Algorithm


The local characterization of global minimality for M-convex functions (Theorem
6.26) immediately suggests the following algorithm of steepest descent type.

Steepest descent algorithm for an M-convex function f ∈ M[Z → R]

S0: Find a vector x ∈ dom f.
S1: Find u, v ∈ V (u ≠ v) that minimize f(x − χ_u + χ_v).
S2: If f(x) ≤ f(x − χ_u + χ_v), then stop (x is a minimizer of f).
S3: Set x := x − χ_u + χ_v and go to S1.
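Steps S0–S3 can be sketched directly. The following toy implementation is ours (not from the text), applied to a separable convex function restricted to the hyperplane x(V) = r, a standard example of an M-convex function:

```python
def steepest_descent(f, x):
    """Minimize an M-convex f over Z^V by steepest descent:
    repeatedly move x -> x - chi_u + chi_v while it strictly improves."""
    x = list(x)
    n = len(x)
    while True:
        best, bu, bv = f(x), None, None
        for u in range(n):               # S1: scan all n^2 exchanges
            for v in range(n):
                if u == v:
                    continue
                y = list(x)
                y[u] -= 1
                y[v] += 1
                if f(y) < best:
                    best, bu, bv = f(y), u, v
        if bu is None:                   # S2: local minimum = global minimum
            return x
        x[bu] -= 1                       # S3: move to the best neighbor
        x[bv] += 1

# f(x) = sum (x_i - t_i)^2 on the hyperplane x(V) = 6 (an M-convex function)
targets = [0, 2, 4]
f = lambda x: sum((xi - t) ** 2 for xi, t in zip(x, targets)) if sum(x) == 6 else float("inf")
print(steepest_descent(f, [6, 0, 0]))  # -> [0, 2, 4]
```

Each exchange preserves the component sum, so the iterates stay in dom f, and the iteration count matches the ||x⁰ − x*||₁/2 bound of Proposition 10.1 below for this unique-minimizer example.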

Step S1 can be done with n² evaluations of the function f. At the termination of
the algorithm in step S2, x is a global minimizer by Theorem 6.26 (M-optimality
criterion). The function value f(x) decreases monotonically with the iterations. This


property alone does not ensure finite termination in general, although it does if f
is integer valued and bounded from below.
Let us derive an upper bound on the number of iterations by considering the
distance to the optimal solution rather than the function value.

Proposition 10.1. If f has a unique minimizer, say, x*, the number of iterations
in the steepest descent algorithm is bounded by ||x⁰ − x*||₁/2, where x⁰ denotes the
initial vector found in step S0.

Proof. Put x′ = x − χ_u + χ_v in step S2. By Theorem 6.28 (M-minimizer cut),
we have x*(u) ≤ x(u) − 1 = x′(u) and x*(v) ≥ x(v) + 1 = x′(v), which implies
||x′ − x*||₁ = ||x − x*||₁ − 2. Note that ||x⁰ − x*||₁ is an even integer. □

When given an M-convex function f, which may have multiple minimizers, we
consider a perturbation of f so that we can use Proposition 10.1. Assume now that
the effective domain is bounded and denote its ℓ₁-size by

K₁ = max{||x − y||₁ : x, y ∈ dom f}.    (10.1)

We arbitrarily fix a bijection φ : V → {1, 2, ..., n} to represent an ordering of the
elements of V, put v_i = φ⁻¹(i) for i = 1, ..., n, and define a vector p ∈ R^V by
p(v_i) = εⁱ for i = 1, ..., n, where ε > 0. The function f_ε = f[p] is M-convex by
Theorem 6.13 (3) and, for a sufficiently small ε, it has a unique minimizer that is
also a minimizer of f. Suppose that the steepest descent algorithm is applied to the
perturbed function f_ε. Since f_ε(x − χ_u + χ_v) = f(x − χ_u + χ_v) + Σ_{i=1}^n εⁱ x(v_i) −
ε^{φ(u)} + ε^{φ(v)}, this amounts to employing a tie-breaking rule:

where

in the case of multiple candidates in step S1 of the steepest descent algorithm
applied to f.
With this tie-breaking rule we have the following complexity bound.
With this tie-breaking rule we have the following complexity bound.

Proposition 10.2. For an M-convex function f with finite K₁ in (10.1), the
number of iterations in the steepest descent algorithm with the tie-breaking rule (10.2)
is bounded by K₁/2. Hence, if a vector in dom f is given, the algorithm finds a
minimizer of f in O(F · n²K₁) time.

By Theorem 6.76 (quasi M-optimality criterion) and Theorem 6.77 (quasi M-
minimizer cut), the steepest descent algorithm can also be used for minimizing quasi
M-convex functions satisfying (SSQM).

Note 10.3. For integrally convex functions we have the local optimality criterion
for global optimality (Theorem 3.21). This naturally suggests the following.
10.1. Minimization of M-Convex Functions 283

Steepest descent algorithm for an integrally convex function f

S0: Find a vector x ∈ dom f.
S1: Find disjoint Y, Z ⊆ V that minimize f(x − χ_Y + χ_Z).
S2: If f(x) ≤ f(x − χ_Y + χ_Z), then stop (x is a minimizer of f).
S3: Set x := x − χ_Y + χ_Z and go to S1.

The steepest descent algorithm for M-convex functions is a special case of this. It is
emphasized that no efficient algorithm for step SI is available for general integrally
convex functions. •
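For small n, step S1 of the integrally convex variant can be carried out by enumerating all 3ⁿ vectors in {−1, 0, +1}^V, each of which encodes a disjoint pair (Y, Z); this illustrates the exponential cost just mentioned. The encoding and the example function are our own assumptions:

```python
from itertools import product

def steepest_descent_ic(f, x):
    """Local search for an integrally convex f: move by -chi_Y + chi_Z
    over all disjoint Y, Z, i.e., all directions d in {-1,0,+1}^n."""
    x = list(x)
    n = len(x)
    while True:
        best, bd = f(x), None
        for d in product((-1, 0, 1), repeat=n):   # 3^n candidate moves
            y = [xi + di for xi, di in zip(x, d)]
            if f(y) < best:
                best, bd = f(y), d
        if bd is None:          # no improving move: x is a global minimizer
            return x
        x = [xi + di for xi, di in zip(x, bd)]

# A 2-D integrally convex (in fact L-natural-convex) quadratic:
f = lambda x: (x[0] - 2) ** 2 + (x[1] + 1) ** 2 + (x[0] - x[1]) ** 2
print(steepest_descent_ic(f, [0, 0]))  # -> [1, 0]
```

The M-convex steepest descent above corresponds to restricting the candidate moves to |Y| = |Z| = 1.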

10.1.2 Steepest Descent Scaling Algorithm


Scaling is one of the fundamental general techniques in designing efficient algo-
rithms. The proximity theorem for M-convex functions leads us to the following
steepest descent scaling algorithm for M-convex function minimization. We assume
that the effective domain is bounded and denote its £00-size by

Steepest descent scaling algorithm for an M-convex function
f ∈ M[Z → R]
S0: Find a vector x ∈ dom f, and set α := 2^⌈log₂(K∞/2n)⌉ and B := dom f.
S1: Find an integer vector y that locally minimizes
        f̃(y) = f(x + αy) if x + αy ∈ B, and f̃(y) = +∞ if x + αy ∉ B,
    in the sense of f̃(y) ≤ f̃(y − χ_u + χ_v) (∀ u, v ∈ V) by the steepest descent
    algorithm of section 10.1.1 with initial vector 0, and set x := x + αy.
S2: If α = 1, then stop (x is a minimizer of f).
S3: Set B := B ∩ {y ∈ Z^V : ||y − x||∞ ≤ (n − 1)(α − 1)} and α := α/2 and go
    to S1.
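Step S1 minimizes f over the coarse grid x + αZ^V. As a sketch, the scaled auxiliary function can be wrapped as follows; representing B by a coordinate box and the example data are our own illustrative assumptions:

```python
INF = float("inf")

def scaled(f, x, alpha, B):
    """Auxiliary function of step S1: f~(y) = f(x + alpha*y) if
    x + alpha*y lies in the current box B, +inf otherwise."""
    def ft(y):
        z = [xi + alpha * yi for xi, yi in zip(x, y)]
        inside = all(lo <= zi <= hi for zi, (lo, hi) in zip(z, B))
        return f(z) if inside else INF
    return ft

# f(z) = (z1 - 0)^2 + (z2 - 8)^2 on the hyperplane z(V) = 8
f = lambda z: sum((zi - t) ** 2 for zi, t in zip(z, [0, 8])) if sum(z) == 8 else INF
ft = scaled(f, [8, 0], 4, [(0, 8), (0, 8)])
print(ft([-1, 1]), ft([-2, 2]))  # prints 32 0
```

Minimizing ft over exchange moves then drives x toward the minimizer at the current resolution α, after which B is shrunk and α halved as in step S3.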

By the M-proximity theorem (Theorem 6.37 (1)), the set B always contains a global
minimizer of f and, at the termination of the algorithm in step S2, x is a global
minimizer by the M-optimality criterion (Theorem 6.26 (1)). The number of
iterations is bounded by ⌈log₂(K∞/4n)⌉. Some remarks are in order concerning step S1.
If the function f is such that the scaled function f̃ remains M-convex, step S1 can
be done in O(F · n⁴) time by the steepest descent algorithm with the tie-breaking rule
(10.2). This time bound follows from Proposition 10.2 with K₁ ≤ 4n². For a general
f, however, f̃ is not necessarily M-convex (see Note 6.18) and no polynomial bound
for step S1 is guaranteed, although we have an obvious exponential time bound
O(F · (4n)ⁿn²).
On the basis of Theorem 6.76 (quasi M-optimality criterion) and Theorem
6.78 (quasi M-proximity theorem), the steepest descent scaling algorithm can be
adapted to the minimization of quasi M-convex functions satisfying (SSQM).

10.1.3 Domain Reduction Algorithm

The domain reduction algorithm is a kind of bisection method that searches for the
minimum of an M-convex function by generating a sequence of nested subsets of
the domain on the basis of the M-minimizer cut theorem (Theorem 6.28). For an
M-convex function with bounded effective domain, the algorithm finds a minimizer
in time polynomial in n and log₂ K∞, where K∞ is defined by (10.3).
We introduce the following notations for a bounded nonempty set B ⊆ Z^V:

The set B° is intended to represent the central part of B, i.e., the set of vectors of
B lying away from the boundary. The set B° is nonempty if B is M-convex; see
Proposition 10.6 below.
Domain reduction algorithm for an M-convex function f ∈ M[Z → R]
S0: Set B := dom f.
S1: Find a vector x ∈ B°.
S2: Find u, v ∈ V (u ≠ v) that minimize f(x − χ_u + χ_v).
S3: If f(x) ≤ f(x − χ_u + χ_v), then stop (x is a minimizer of f).
S4: Set B := B ∩ {y ∈ Z^V : y(u) ≤ x(u) − 1, y(v) ≥ x(v) + 1} and go to S1.
The vector x ∈ B° in step S1 can be found with O(n² log₂ K∞) evaluations of f
by the procedure to be described below. The sets B form a decreasing sequence
of M-convex sets, which contain a minimizer of f because of the M-minimizer cut
theorem (Theorem 6.28). Since x is taken from the central part of B, u_B(w) − ℓ_B(w)
for w ∈ {u, v} decreases by a factor of (1 − 1/n), and hence the number of iterations
is bounded by O(n² log₂ K∞). The above algorithm, therefore, finds a minimizer of
f with O(n⁴(log₂ K∞)²) evaluations of f, provided that it is given a vector in dom f.

Proposition 10.4. If a vector in dom f is given, the domain reduction algorithm
finds a minimizer of an M-convex function f in O(F · n⁴(log₂ K∞)²) time.

It remains to show how to find x ∈ B° in step S1 when given a vector of
B. For a vector y of an M-convex set B and two distinct elements u, v of V, the
exchange capacity is defined by

c̃(y; v, u) = max{α ∈ Z₊ : y + α(χ_v − χ_u) ∈ B},

which is a nonnegative integer representing the distance from y to the boundary of
B in the direction of χ_v − χ_u. In the domain reduction algorithm, B is always an
M-convex set and the exchange capacity can be computed by a binary search with
⌈log₂ K∞⌉ evaluations of f. For x ∈ B we define

with the obvious observation that V°(x) = V if and only if x ∈ B°.
If V°(x) ≠ V, we can modify x to x′ ∈ B with the property |V°(x′)| ≥
|V°(x)| + 1 as follows. Take any u ∈ V \ V°(x) and assume x(u) > u°_B(u);
the other case with x(u) < ℓ°_B(u) can be treated in a similar manner. Putting
{v₁, v₂, ..., v_{n−1}} = V \ {u} and x₀ = x, we define a sequence x₁, x₂, ..., x_{n−1} by
x_i = x_{i−1} + α_i(χ_{v_i} − χ_u) with

for i = 1, 2, ..., n − 1 and put x′ = x_{n−1}.

Proposition 10.5. V°(x′) ⊇ V°(x) ∪ {u}.

Proof. The inclusion V°(x′) ⊇ V°(x) is obvious. To prove x′(u) = u°_B(u) by
contradiction, suppose x′(u) > u°_B(u) and take x* ∈ B°, where B° ≠ ∅ by
Proposition 10.6 below. Since u ∈ supp⁺(x′ − x*), it follows from (M-EXC[Z]) that
x″ = x′ + χ_{v_i} − χ_u ∈ B and x′(v_i) < x*(v_i) ≤ u°_B(v_i) for some v_i. Since v_i ∈
supp⁺(x″ − x_i) and supp⁻(x″ − x_i) = {u}, (M-EXC[Z]) implies x_i + χ_{v_i} − χ_u ∈ B.
But this contradicts the definition of x_i. □

The modification of x to x′ described above can be done with n evaluations
of the exchange capacity. Repeating such modifications at most n times, we arrive
at x with V°(x) = V. Thus, given a vector in B, we can find x ∈ B° with at most
n² evaluations of the exchange capacity.
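The binary search for the exchange capacity needs only a membership oracle for B. The sketch below is ours (not from the text); it takes B to be the integer points of a box intersected with a hyperplane {x : x(V) = r}, an M-convex set, so that feasibility along the direction χ_v − χ_u is an interval of α values containing 0:

```python
def exchange_capacity(member, y, v, u, K):
    """max alpha in [0, K] with y + alpha*(chi_v - chi_u) in B, by binary
    search on a membership oracle (feasibility is assumed monotone)."""
    lo, hi = 0, K
    while lo < hi:
        mid = (lo + hi + 1) // 2
        z = list(y)
        z[v] += mid
        z[u] -= mid
        lo, hi = (mid, hi) if member(z) else (lo, mid - 1)
    return lo

cap = [3, 5, 4]   # coordinate upper bounds of the box
member = lambda x: sum(x) == 6 and all(0 <= xi <= ci for xi, ci in zip(x, cap))
# From y = [3, 2, 1], exchanging from coordinate 0 into coordinate 1:
print(exchange_capacity(member, [3, 2, 1], 1, 0, 10))  # -> 3
```

The ⌈log₂ K∞⌉ bound in the text corresponds to taking K = K∞ here, with one membership test (one evaluation of f) per halving step.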
We finally prove the nonemptiness of B°.

Proposition 10.6. B° ^ 0 if B is an M-convex set.

Proof. Let ρ ∈ S[Z] be the submodular function satisfying B = B(ρ) ∩ Z^V and ρ̂
be its Lovász extension. Then we have ℓ_B(v) = ρ(V) − ρ(V \ {v}) and u_B(v) = ρ(v).
We can assert the nonemptiness of B° by establishing

(see Theorem 3.8 in Fujishige [65]). We prove (i) here; a similar argument works for
(ii). Fix X ⊆ V and put p = χ_X, p_v = 1 − χ_v (v ∈ X), and k = |X|. It follows from

that kρ(V) = ρ̂(k·1) = ρ̂(p + Σ_{v∈X} p_v) and


286 Chapter 10. Algorithms

With these identities we see

The last inequality holds true since

by the positive homogeneity and convexity of ρ̂. Hence follows ℓ°_B(X) ≤ ρ(X). □

By Theorem 6.77 (quasi M-minimizer cut), the domain reduction algorithm
can also be used for minimizing quasi M-convex functions satisfying (SSQM),
provided that the effective domain is a bounded M-convex set.

10.1.4 Domain Reduction Scaling Algorithm

We present here the domain reduction scaling algorithm for M-convex function
minimization, a combination of the idea of the domain reduction algorithm of
section 10.1.3 with a scaling technique based on the theorem of M-minimizer cut with
scaling (Theorem 6.39).
The algorithm works with a pair (x, ℓ) of integer vectors, where x is the current
solution and ℓ is a lower bound for an optimal solution. Specifically, two conditions

are maintained, where

The algorithm consists of scaling phases parametrized (or labeled) by a nonnegative
integer α, called the scaling factor, which is initially set to be sufficiently large and
is decreased until it reaches unity. In each scaling phase with a fixed α, the pair
(x, ℓ) is modified so that it satisfies an additional condition

At the end of the algorithm we have α = 1 and hence x = ℓ by (10.6). This means
S(ℓ) ∩ dom f = {x}, since x(V) = y(V) for any y ∈ dom f by Proposition 6.1 and
since x(V) < y(V) for any y ∈ S(ℓ) distinct from x. Furthermore, we see from the
second condition in (10.5) that x is a minimizer of f, since {x} = S(ℓ) ∩ dom f ⊇
S(ℓ) ∩ arg min f ≠ ∅.
The outline of the algorithm reads as follows, where step S1 for the α-scaling
phase is described later and K∞ is defined in (10.3).
10.1. Minimization of M-Convex Functions 287

Domain reduction scaling algorithm for an M-convex function f ∈ M[Z → R]
S0: Find a vector x ∈ dom f and set ℓ := x − K_∞·1, α := 2^⌈log₂(K_∞/2n)⌉.
S1: Modify (x, ℓ) to meet (10.6) (α-scaling phase).
S2: If α = 1, then stop (x is a minimizer of f).
S3: Set α := α/2 and go to S1.
The α-scaling phase is now described. In view of (10.6) we employ a subset
V* of V such that

Initially, V* is set to V and then decreases monotonically to the empty set.


α-scaling phase for (x, ℓ, α)
S0: Set V* := V.
S1: If V* = ∅, then output (x, ℓ) and stop.
S2: Take any u ∈ V*.
S3: Find v ∈ V that minimizes f(x + α(χ_v − χ_u)).
S4: If v = u or x(u) − α < ℓ(u), then set ℓ(u) := max[ℓ(u), x(u) − (n−1)(α−1)]
and V* := V* \ {u} and go to S1.
S5: Otherwise, set ℓ(v) := max[ℓ(v), x(v) + α − (n−1)(α−1)], x := x + α(χ_v − χ_u),
and V* := V* \ {v} and go to S1.
As is easily seen, the first condition in (10.5) is maintained in steps S4 and S5.
The second condition in (10.5) is also maintained by virtue of Theorem 6.39 (M-minimizer
cut with scaling). The subset V* is nonincreasing, although it may be
that v ∉ V* in step S5, and then the operation V* := V* \ {v} is void and V*
remains unchanged. Denote the initial value of (x, ℓ) by (x°, ℓ°). For each w ∈ V
the value of x(w) is decreased at most (x°(w) − ℓ°(w))/α times before it is deleted
from V*. Hence, the number of iterations in the α-scaling phase is bounded by
‖x° − ℓ°‖₁/α. In particular, the α-scaling phase terminates with V* = ∅.
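The exchange move in step S3 is the workhorse of these algorithms. As a minimal sketch (not the scaled algorithm itself): with the scaling factor fixed at α = 1, repeating the best exchange x := x + χ_v − χ_u while it improves f already minimizes an M-convex function, by the local optimality criterion for M-convex functions. The separable quadratic f below on {x ∈ Z³ : x ≥ 0, x(V) = 4} is an assumed toy example, not from the text.

```python
def minimize_m_convex_descent(f, x, n):
    """Unscaled exchange descent: repeatedly apply the best move
    x -> x + chi_v - chi_u (the alpha = 1 case of step S3) while it
    improves f.  For an M-convex f, a point admitting no improving
    exchange is a global minimizer.  f returns +inf outside dom f."""
    while True:
        best, best_move = f(x), None
        for u in range(n):
            for v in range(n):
                if u == v:
                    continue
                y = list(x)
                y[u] -= 1          # x - chi_u
                y[v] += 1          # ... + chi_v
                y = tuple(y)
                if f(y) < best:
                    best, best_move = f(y), y
        if best_move is None:
            return x
        x = best_move

INF = float("inf")
target = (3, 0, 1)                 # sums to 4, so it lies in dom f

def f(x):
    # separable convex function on the M-convex set {x >= 0, x(V) = 4}
    if any(c < 0 for c in x) or sum(x) != 4:
        return INF
    return sum((c - t) ** 2 for c, t in zip(x, target))

x_min = minimize_m_convex_descent(f, (4, 0, 0), 3)
```

Since f vanishes exactly at the target vector, the descent reaches it from any starting point of dom f.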
The time complexity of the domain reduction scaling algorithm is given as
follows, where it is assumed for simplicity that K_∞ is known.

Proposition 10.7. If a vector in dom f is given, the domain reduction scaling
algorithm finds a minimizer of an M-convex function f in O(F·n³ log₂(K_∞/n))
time.

Proof. At the beginning of the α-scaling phase we have ‖x° − ℓ°‖₁ ≤ n(n−1)(2α−1)
by (10.6). Since step S3 in the α-scaling phase can be done with n evaluations of
f, the α-scaling phase terminates in O(F·n³) time. The number of scaling phases
is equal to ⌈log₂(K_∞/2n)⌉. □

On the basis of Theorem 6.79 (quasi M-minimizer cut with scaling), the domain
reduction scaling algorithm can be adapted to the minimization of quasi
M-convex functions satisfying (SSQM) provided that the effective domain is a
bounded M-convex set.

10.2 Minimization of Submodular Set Functions


The minimization of submodular set functions is one of the most fundamental prob-
lems in combinatorial optimization. In this section we deal with algorithms for
minimizing submodular set functions, which we will use as an essential component
in algorithms for L-convex functions.

10.2.1 Basic Framework


Let ρ : 2^V → R be a submodular set function,⁶⁰ where ρ(∅) = 0 and n = |V|. In

discussing the efficiency or complexity of algorithms it is customary to categorize
them into finite, pseudopolynomial, weakly polynomial, and strongly polynomial
algorithms. For our problem of minimizing ρ, a finite algorithm is trivial; we may
evaluate ρ(X) for all subsets X to find the minimum. This takes O(F·2ⁿ) time,
where F is an upper bound on the time to evaluate ρ. The complexity of an
algorithm may depend on the complexity or size of ρ; if ρ is integer valued,

M = max{|ρ(X)| : X ⊆ V} (10.8)

often serves as a measure of the size of ρ. An algorithm for minimizing ρ is said to be
(i) pseudopolynomial, (ii) weakly polynomial, or (iii) strongly polynomial, according
as the total number of evaluations of ρ as well as other arithmetic operations involved
is bounded by a polynomial in (i) n and M, (ii) n and log₂ M, or (iii) n alone.
Our objective in this section is to describe two strongly polynomial algorithms for
minimizing ρ.
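The trivial finite algorithm mentioned above can be sketched as follows; the cut function of a small graph, used here as the example submodular function, is an assumption of this illustration, not part of the text.

```python
from itertools import combinations

def minimize_submodular_bruteforce(rho, V):
    """Trivial finite algorithm: evaluate rho on all 2^n subsets.

    rho : function mapping a frozenset to a number, with rho(frozenset()) == 0
    V   : list of ground-set elements
    Returns (best_value, best_subset)."""
    best_val, best_set = rho(frozenset()), frozenset()
    for r in range(1, len(V) + 1):
        for X in combinations(V, r):
            X = frozenset(X)
            if rho(X) < best_val:
                best_val, best_set = rho(X), X
    return best_val, best_set

# Example: the cut function of an undirected graph is submodular.
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
def cut(X):
    return sum(1 for (u, v) in edges if (u in X) != (v in X))

val, X = minimize_submodular_bruteforce(cut, [1, 2, 3, 4])
```

Here the minimum cut value 0 is attained at the empty set, illustrating the O(F·2ⁿ) baseline that the polynomial algorithms below improve on.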
Let

B(ρ) = {x ∈ R^V | x(X) ≤ ρ(X) (X ⊆ V), x(V) = ρ(V)}

be the base polyhedron associated with ρ. Recall that a point in B(ρ) is called a
base and an extreme point of B(ρ) is an extreme base. For any base x and any
subset X we obviously have

x⁻(V) ≤ x(X) ≤ ρ(X), (10.10)

where x⁻ is the vector in R^V defined by x⁻(v) = min(0, x(v)) for v ∈ V. The
inequalities are tight for some x and X, as follows.

Proposition 10.8. For a submodular set function ρ : 2^V → R, we have

max{x⁻(V) | x ∈ B(ρ)} = min{ρ(X) | X ⊆ V}. (10.11)

If ρ is integer valued, the maximizer x can be chosen to be an integer vector.

Proof. Although this is an easy consequence of Edmonds's intersection theorem
(Theorem 4.18) with ρ₁ = ρ and ρ₂ = 0, a direct proof is given here. Let x be a

⁶⁰Note that we assume ρ to be finite valued for all subsets. An adaptation to the general case
ρ : 2^V → R ∪ {+∞} is explained in Note 10.14.

maximizer on the left-hand side. For any u ∈ supp⁻(x) and v ∈ supp⁺(x), there
exists a subset X_uv such that u ∈ X_uv ⊆ V \ {v} and x(X_uv) = ρ(X_uv). Put
X = ⋃_{u∈supp⁻(x), v∈supp⁺(x)} X_uv. We have x(X) = ρ(X) by (4.23) and x⁻(V) =
x(X) since supp⁻(x) ⊆ X and supp⁺(x) ⊆ V \ X. The integrality assertion can be
established by the same argument starting with an integral base x that maximizes
x⁻(V) over all integral bases. □

The min-max relation (10.11) shows that we can demonstrate the optimality
of a subset X by finding a base x with x⁻(V) = ρ(X). But how can we verify that
a vector x belongs to B(ρ)? By definition, x ∈ B(ρ) if and only if min_X (ρ(X) −
x(X)) = ρ(V) − x(V) = 0. Thus, testing for membership in B(ρ) for an arbitrary
x seems to need a submodular function minimization procedure.
To circumvent this difficulty, we recall from Note 4.10 that we can generate an
extreme base by the greedy algorithm. Let L = (v₁, v₂, ..., v_n) be a linear ordering
of V and define L(v_j) = {v₁, v₂, ..., v_j} for j = 1, ..., n. Then the extreme base y
associated with L is given by

y(v_j) = ρ(L(v_j)) − ρ(L(v_{j−1}))  (j = 1, ..., n), (10.12)

where L(v₀) = ∅. Any base x can be represented as a convex combination of a number
of extreme bases, say, {y_i | i ∈ I}, as

x = Σ_{i∈I} λ_i y_i,  λ_i > 0 (i ∈ I),  Σ_{i∈I} λ_i = 1, (10.13)

where we may assume |I| ≤ n by the Carathéodory theorem. Combining (10.12)
and (10.13) shows that any base can be represented by a list of linear orderings
{L_i | i ∈ I} (that generate {y_i | i ∈ I}) and coefficients of the convex combination
{λ_i | i ∈ I}. With this representation of x we can be sure that x is a member of
B(ρ). For u, v ∈ V and i ∈ I we use the notation

(u, v]_{≺_i} = {w ∈ V | u ≺_i w ≼_i v}, (10.14)

where w ≼_i v means w ≺_i v or w = v.
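The greedy generation (10.12) of an extreme base from a linear ordering can be sketched as follows; the uniform-matroid rank function used below is an assumed toy example.

```python
def extreme_base(rho, L):
    """Greedy algorithm (10.12): y(v_j) = rho(L(v_j)) - rho(L(v_{j-1}))
    for the linear ordering L = (v_1, ..., v_n)."""
    y, prefix, prev = {}, set(), 0
    for v in L:
        prefix.add(v)
        cur = rho(frozenset(prefix))
        y[v] = cur - prev          # marginal value of v on top of its prefix
        prev = cur
    return y

# Example submodular function: rank of a uniform matroid, rho(X) = min(|X|, 2).
rho = lambda X: min(len(X), 2)
y = extreme_base(rho, [1, 2, 3, 4])
# y is a base: y(L(v_j)) = rho(L(v_j)) for every prefix; in particular
# y(V) = rho(V) by telescoping.
```

Different orderings generate different extreme bases, and (10.13) represents an arbitrary base as a convex combination of such vectors.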


The following proposition gives a sufficient condition for optimality in terms
of the linear orderings {L_i | i ∈ I}.

Proposition 10.9. Let x be a base represented as (10.13) with {(L_i, λ_i) | i ∈ I}
and W be a subset of V.
(1) If supp⁻(x) ⊆ W and supp⁺(x) ⊆ V \ W, then x⁻(V) = x(W).
(2) If u ≺_i v for every u ∈ W, v ∈ V \ W, and i ∈ I, then x(W) = ρ(W).
(3) If the conditions in (1) and (2) are satisfied, x and W are optimal in
(10.11).

Proof. (1) is obvious. By (10.12) the condition in (2) implies ρ(W) = y_i(W) for
every i ∈ I. Then x(W) = Σ_{i∈I} λ_i y_i(W) = ρ(W). (3) follows from (1), (2), and
(10.10). □

We are thus led to a basic algorithmic framework:

1. We maintain (and update) a number of linear orderings {L_i | i ∈ I}, together
with the associated extreme bases {y_i | i ∈ I} and the coefficients of convex
combination {λ_i | i ∈ I}, to represent a base x.

2. We terminate when the conditions in Proposition 10.9 are satisfied.


Two strongly polynomial algorithms using this framework are described in subse-
quent sections.

Note 10.10. Here is a brief historical account of submodular function minimization.
Its importance seems to have been recognized around 1970 by J. Edmonds [44]
and others. The first polynomial algorithm was given by M. Grötschel, L. Lovász,
and A. Schrijver—weakly polynomial in 1981 [82], and strongly polynomial in 1988
[83]. These algorithms, however, are based on the ellipsoid method and, as such, are
not so much combinatorial as geometric. Efforts for a combinatorial polynomial algorithm
have been continued with major contributions made by W. H. Cunningham
and others [15], [28], [29], who showed the basic framework above as well as a combinatorial
pseudopolynomial algorithm for submodular function minimization. In
1994, extending the min-cut algorithm of Nagamochi–Ibaraki [163], M. Queyranne
[172] came up with a combinatorial algorithm for symmetric submodular function
minimization, which is to minimize over nonempty proper subsets a submodular
function ρ such that ρ(X) = ρ(V \ X) for all X ⊆ V. Combinatorial strongly polynomial
algorithms for general submodular functions were found, independently, in
the summer of 1999 by two groups, S. Iwata, L. Fleischer, and S. Fujishige [101],
[102] and A. Schrijver [182]. Both of these follow Cunningham's framework, but
they are significantly distinct in technical aspects. Subsequently, a new problem was
recognized by Schrijver. These two algorithms are certainly combinatorial, but they
rely on arithmetic operations (division, in particular) in computing the coefficients
of convex combination in (10.13). The question posed by Schrijver is as follows: Is
it possible to design a fully combinatorial strongly polynomial algorithm that is free
from division and relies only on addition, subtraction, and comparison? This was
answered in the affirmative by Iwata [99] in the fall of 2000. •

Note 10.11. Because the minimizers form a ring family (see Note 4.8), there exists
a unique minimal minimizer as well as a unique maximal minimizer of a submodular
set function ρ : 2^V → R. Given an optimal base x with the representation x =
Σ_{i∈I} λ_i y_i in (10.13), we can compute the minimal and maximal minimizers in
strongly polynomial time as follows.
With the notation 𝒟(x) for the family of tight sets at x, introduced in (4.22)
of Note 4.9, we have a representation

for the family of the minimizers of ρ. Noting that 𝒟(x) is a ring family, let G_x =
(V, A_x) be the directed graph associated with 𝒟(x), as defined by (4.20) in Note
4.7. This is equivalent to saying that (u, v) ∈ A_x if and only if v ∈ dep(x, u), where
dep(x, u) means the smallest tight set at x that contains u. Then the minimal
minimizer can be identified as the set of vertices reachable from supp⁻(x) in G_x
and the maximal minimizer as the complement of the set of vertices reachable to
supp⁺(x) in G_x. Moreover, the graph G_x enables us to enumerate all the minimizers.
By (10.13) we have

𝒟(x) = ⋂_{i∈I} 𝒟(y_i).

Since each y_i is an extreme base, we can easily compute dep(y_i, u). For each u ∈ V,
we start with D := V and update D to D \ {v} as long as D \ {v} ∈ 𝒟(y_i)
for some v ∈ D \ {u}; we obtain D = dep(y_i, u) at the termination. We can thus
compute dep(y_i, u) with O(n²) evaluations of ρ. The number of function evaluations
can be reduced to O(n) if a linear ordering L_i = (v₁, v₂, ..., v_n) generating y_i
is also available; assuming u = v_k, start with D = {v₁, v₂, ..., v_k} and, for
j = k−1, k−2, ..., 1, update D to D \ {v_j} if D \ {v_j} ∈ 𝒟(y_i). Therefore, the
graph G_x can be constructed with O(n³|I|) or O(n²|I|) evaluations of function ρ,
where it is reasonable to assume |I| ≤ n. •
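The O(n) prefix-shrinking computation of dep(y_i, u) described in Note 10.11 can be sketched as follows; tightness D ∈ 𝒟(y) means y(D) = ρ(D), and the partition-matroid rank function below is an assumed toy example.

```python
def dep_from_ordering(rho, L, y, u):
    """Compute dep(y, u), the smallest tight set of the extreme base y
    that contains u, by scanning the generating ordering L backwards:
    start from the prefix L(u) (which is tight) and drop each earlier
    element whenever the set stays tight (y(D) = rho(D)) without it."""
    k = L.index(u)
    D = set(L[: k + 1])                        # the prefix L(u)
    for j in range(k - 1, -1, -1):
        cand = D - {L[j]}
        if sum(y[w] for w in cand) == rho(frozenset(cand)):
            D = cand                           # still tight: v_j is droppable
    return D

# Toy example: rank of the partition matroid with blocks {1,2} and {3,4},
# one unit of capacity per block.
def rho(X):
    return min(1, len(X & {1, 2})) + min(1, len(X & {3, 4}))

L = [1, 2, 3, 4]
y = {1: 1, 2: 0, 3: 1, 4: 0}   # greedy extreme base (10.12) for this ordering
dep_2 = dep_from_ordering(rho, L, y, 2)
```

Each of the n prefix-shrinking tests costs one evaluation of ρ, matching the O(n) count stated above.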

Note 10.12. The minimal minimizer of a submodular set function ρ : 2^V → R
can also be computed by using any submodular function minimization algorithm n + 1
times. Let X be a minimizer of ρ. For each v ∈ X, compute a minimizer Y_v of the
submodular set function ρ_v : 2^{X\{v}} → R, the restriction of ρ to X \ {v}, defined
by ρ_v(Y) = ρ(Y) for Y ⊆ X \ {v}. Then the minimal minimizer of ρ is given
by {v ∈ X | ρ(Y_v) > ρ(X)}, since ρ(Y_v) > ρ(X) if v is contained in the minimal
minimizer and ρ(Y_v) = ρ(X) if not. The maximal minimizer can be computed
similarly. •

Note 10.13. An alternative way to find the minimal minimizer of a submodular
set function ρ : 2^V → R is to introduce a penalty term to represent the size of a
subset and to minimize a modified submodular set function ρ̃(X) = ρ(X) + ε|X|
with a sufficiently small positive parameter ε. If ρ is integer valued, ε = 1/(n+1)
is a valid choice. The maximal minimizer can be computed similarly. •
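A sketch of this penalty method follows; brute-force enumeration stands in for a real minimization algorithm, and the modular example function is assumed.

```python
from itertools import chain, combinations

def subsets(V):
    return (frozenset(c) for c in chain.from_iterable(
        combinations(V, r) for r in range(len(V) + 1)))

def extreme_minimizers(rho, V):
    """Minimal and maximal minimizers of an integer-valued submodular rho
    via the perturbations rho(X) + eps|X| and rho(X) - eps|X| with
    eps = 1/(n+1); brute force is used here only for illustration."""
    eps = 1.0 / (len(V) + 1)
    minimal = min(subsets(V), key=lambda X: rho(X) + eps * len(X))
    maximal = min(subsets(V), key=lambda X: rho(X) - eps * len(X))
    return minimal, maximal

# Example: a modular (hence submodular) function given by element weights.
w = {1: 1, 2: 0, 3: -2, 4: 0}
rho = lambda X: sum(w[v] for v in X)
mn, mx = extreme_minimizers(rho, [1, 2, 3, 4])
```

The zero-weight elements 2 and 4 may or may not belong to a minimizer, so the minimal minimizer is {3} while the maximal one is {2, 3, 4}; the ±ε|X| term breaks exactly these ties.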

Note 10.14. It has been assumed that the submodular function ρ is finite valued for
all subsets. This assumption is not restrictive but for convenience of description. Let
ρ ∈ S[R] be a submodular set function on V taking values in R ∪ {+∞}; 𝒟 = dom ρ
is a ring family with {∅, V} ⊆ 𝒟. For v ∈ V we denote by M_v the smallest member
of 𝒟 containing v and by N_v the largest member of 𝒟 not containing v. For X ⊆ V
we denote by X̄ the smallest member of 𝒟 including X. We assume that we can
compute M_v for each v ∈ V efficiently, say, in time polynomial in n. Then we can
compute X̄ and N_v by

X̄ = ⋃_{u∈X} M_u,  N_v = ⋃{M_u | u ∈ V, v ∉ M_u}.

Let us assume, without loss of generality, that the length of a maximal chain of 𝒟
is equal to n = |V|. Consider now a set function ρ̄ : 2^V → R defined by

ρ̄(X) = ρ(X̄) + c(X̄) − c(X)

with c ∈ R^V given by

c(v) = max{0, ρ(N_v) − ρ(N_v ∪ {v})}  (v ∈ V).

As is shown below, (i) ρ̄ is a finite-valued submodular function and (ii) X ∈ arg min ρ̄
implies X̄ ∈ arg min ρ. Thus we can minimize ρ via the minimization of ρ̄.
Proof of (i): ρ̄ is obviously finite valued. If X ∈ 𝒟, v ∉ X, and Y = X ∪ {v} ∈
𝒟, we have Y ∪ N_v = N_v ∪ {v} and Y ∩ N_v = X and, therefore,

This means that μ(X) = ρ(X) + c(X) is nondecreasing on 𝒟. Putting μ̄(X) = μ(X̄)
for X ⊆ V and noting that the closure of X ∪ Y equals X̄ ∪ Ȳ and the closure of
X ∩ Y is included in X̄ ∩ Ȳ, we see

which shows the submodularity of μ̄ and hence that of ρ̄.
Proof of (ii): For any Y ∈ 𝒟 we have ρ(X̄) ≤ ρ(X̄) + c(X̄) − c(X) = ρ̄(X) ≤
ρ̄(Y) = ρ(Y). •

Note 10.15. The method in Note 10.14 can be used to minimize a submodular
function defined on a (finite) distributive lattice. By a lattice we mean a triple
(S, ∨, ∧) of a nonempty set S and two binary operations ∨ and ∧ on S such that

for any a, b, c ∈ S. A lattice (S, ∨, ∧) is a distributive lattice if, in addition, the
distributive law

a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c)

holds true. A ring family 𝒟 is a typical distributive lattice, where (S, ∨, ∧) =
(𝒟, ∪, ∩). The converse is essentially true: any distributive lattice (S, ∨, ∧) can be
represented in the form of a ring family (Birkhoff's representation theorem). The
size of the underlying set of the ring family is equal to the length of a maximal
chain of S. A function ρ : S → R defined on a distributive lattice (S, ∨, ∧) is said
to be submodular if it satisfies

ρ(a ∨ b) + ρ(a ∧ b) ≤ ρ(a) + ρ(b)  (a, b ∈ S).

Thus, with an appropriate representation of (S, ∨, ∧), a submodular function ρ on
S can be minimized by using the method in Note 10.14. •

Note 10.16. Maximizing a submodular set function is a difficult task in general.
It is known that no polynomial algorithm exists for it (and this statement is independent
of the P ≠ NP conjecture); see Jensen–Korte [106] and Lovász [122], [123].
In this context, M♮-concave functions on {0,1}-vectors form a tractable subclass of
submodular set functions. Recall that an M♮-concave function is submodular (Theorem
6.19) and that it can be maximized efficiently by algorithms in section 10.1. •

10.2.2 Schrijver's Algorithm


We explain here Schrijver's strongly polynomial algorithm for submodular func-
tion minimization. This algorithm achieves strong polynomiality using a distance
labeling with an ingenious lexicographic rule.
Following the basic framework introduced in section 10.2.1, the algorithm
employs the representation of a base in terms of a convex combination of extreme
bases associated with linear orderings. Given {(I/i,Ai) | i € /}, the algorithm
constructs a directed graph G = (V, A) with arc set

(see (10.14) for the notation -<i) and searches for a directed path from P — supp + (x)
to N = supp~(a;). If there is no such path, the algorithm terminates by setting W
to be the set of vertices reachable to N. Then W and x satisfy all the conditions in
Proposition 10.9, and hence W is a minimizer of p and a; is a maximizer of x~(V).
Otherwise, it modifies {(Li, Aj) | i 6 /} with reference to a path from P to N.
Schrijver's algorithm for submodular function minimization
S0: Take any linear ordering L₁ and set I := {1}, λ₁ := 1.
S1: Construct the graph G = (V, A) for {(L_i, λ_i) | i ∈ I} by (10.17).
S2: Set P := supp⁺(x), N := supp⁻(x) for the base x = Σ_{i∈I} λ_i y_i in (10.13).
S3: If there exists no directed path from P to N in G, let W be the set
of vertices reachable to N and stop (W is a minimizer of ρ).
S4: Update {(L_i, λ_i) | i ∈ I} and go to S1.
We now describe the concrete procedure for step S4, where a directed path exists
from P to N in G. Let d(v) denote the distance (= minimum number of arcs in a
directed path) in G from P to v.
We choose s, t ∈ V as follows (a lexicographic rule). We fix a linear ordering ≺₀
of elements of V; this is independent of the linear orderings L_i. Let t be the element
in N reachable from P with d(t) maximum; in the case of multiple candidates,
choose the largest with respect to ≺₀. Let s be an element with (s, t) ∈ A, d(s) =
d(t) − 1; in the case of multiple candidates, choose the largest with respect to ≺₀. Let
α be the maximum of |(s, t]_{≺_i}| over i ∈ I and let k ∈ I be such that |(s, t]_{≺_k}| = α.
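The graph construction (10.17), the BFS distances d, and the lexicographic choice of (t, s) can be sketched as follows; this is a toy illustration, with the orderings, P, N, and ≺₀ all assumed data and the graph built naively.

```python
from collections import deque

def schrijver_select(orderings, P, N, order0):
    """Build G = (V, A) with (u, v) in A iff u precedes v in some L_i,
    compute BFS distances d from P, and pick (t, s) lexicographically:
    t in N reachable from P with d(t) maximum, ties broken by the
    largest element w.r.t. order0; then s with (s, t) in A and
    d(s) = d(t) - 1, same tie rule."""
    V = set(orderings[0])
    A = {(L[i], L[j]) for L in orderings
         for i in range(len(L)) for j in range(i + 1, len(L))}
    d = {v: None for v in V}
    for v in P:
        d[v] = 0
    q = deque(sorted(P))
    while q:                                   # breadth-first search from P
        u = q.popleft()
        for v in V:
            if (u, v) in A and d[v] is None:
                d[v] = d[u] + 1
                q.append(v)
    rank = {v: i for i, v in enumerate(order0)}    # larger = later in order0
    t = max((v for v in N if d[v] is not None),
            key=lambda v: (d[v], rank[v]))
    s = max((u for u in V if (u, t) in A and d[u] == d[t] - 1),
            key=lambda v: rank[v])
    return d, t, s

# Assumed toy data: two orderings, P = {1}, N = {4}.
d, t, s = schrijver_select([[1, 2, 3, 4], [2, 1, 4, 3]], {1}, {4}, [1, 2, 3, 4])
```

On this data every element of N is one arc away from P, so the rule returns t = 4 and s = 1.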

Index the elements of V so that L_k = (v₁, ..., v_n) and assume v_p = s. Then
we have v_{p+α} = t and (s, t]_{≺_k} = {v_{p+1}, ..., v_{p+α}}. For j = 1, ..., α, consider a
linear ordering

L_k^j = (v₁, ..., v_{p−1}, v_{p+j}, v_p, ..., v_{p+j−1}, v_{p+j+1}, ..., v_n),

which is obtained from L_k by moving v_{p+j} to the position just before v_p = s, and
let z_j be the extreme base associated with L_k^j.

Proposition 10.17. For some δ ≥ 0, y_k + δ(χ_t − χ_s) can be represented as a
convex combination of {z_j | j = 1, ..., α}.

Proof. Put V_h = L_k(v_h) = {v₁, ..., v_h} for h = 1, ..., n and V₀ = ∅. By (10.12)
we have

and, therefore,

where the inequalities follow from submodularity; for instance, for h = p + j, we have

in which (V_{p−1} ∪ {v_{p+j}}) ∪ V_{p+j−1} = V_{p+j} and (V_{p−1} ∪ {v_{p+j}}) ∩ V_{p+j−1} = V_{p−1}.
The sign pattern of (10.18), as well as that of (χ_t − χ_s)(v_h), for h with p < h ≤ p + α
looks like:

Note also that each row sum is equal to zero since z_j(V) = y_k(V). If all the diagonal
entries marked by ⊛ are strictly positive, we can represent χ_t − χ_s as a nonnegative
combination of {z_j − y_k | j = 1, ..., α} with a positive coefficient for j = α; namely,
χ_t − χ_s = Σ_{j=1}^{α} μ_j (z_j − y_k) with μ_j ≥ 0 (j = 1, ..., α−1) and μ_α > 0. Then the
claim is true for δ = 1/(Σ_{j=1}^{α} μ_j). If a diagonal entry, say, in the j₀th row, vanishes,
then z_{j₀} = y_k and the claim is true for δ = 0. □

Define x̃ = x + λ_k δ(χ_t − χ_s). This vector can be represented as a convex
combination of Y = {y_i | i ∈ I \ {k}} ∪ {z_j | j = 1, ..., α} by Proposition 10.17 and
(10.13). Let x′ be the point on the line segment connecting x and x̃ that is closest
to x̃ with the t-component x′(t) ≤ 0. This means

Note that x′ can be represented as a convex combination of Y ∪ {y_k} and, moreover,
{y_k} can be dispensed with if x′(t) < 0. By a variant of Gaussian elimination we
can obtain a convex combination representation of x′ using at most n vectors from
Y ∪ {y_k}. We update {(L_i, λ_i) | i ∈ I} according to this representation. Since
|Y| + 1 ≤ 2n, step S4 can be done with O(n³) arithmetic operations.

Proposition 10.18. The number of iterations in Schrijver's algorithm is bounded
by O(n⁶). Hence, Schrijver's algorithm finds a minimizer of a submodular set function
ρ : 2^V → R with O(n⁸) function evaluations and O(n⁹) arithmetic operations,
where n = |V|.

Proof. Denote by β the number of indices i ∈ I such that |(s, t]_{≺_i}| = α. Let
x′, d′, A′, P′, N′, t′, s′, α′, β′ be the objects x, d, A, P, N, t, s, α, β in the next iteration.
We first observe that a new arc appears only if it connects two vertices lying
between s and t with respect to ≺_k:
(a) For each arc (v, w) ∈ A′ \ A we have s ≼_k w ≺_k v ≼_k t.
Proof of (a): By (v, w) ∉ A we have w ≺_k v and by (v, w) ∈ A′ we have
v ≺_k^j w for some j, where 1 ≤ j ≤ α and v ≺_k^j w means that v precedes w in L_k^j.
Hence v = v_{p+j} and (a) follows.
The crucial properties are the monotonicity in the sense that
(b) d′(v) ≥ d(v) for all v ∈ V and,
(c) if d′(v) = d(v) for all v ∈ V, then (d′(t′), t′, s′, α′, β′) is lexicographically
smaller than (d(t), t, s, α, β).
Proof of (b): Note that P′ ⊆ P. If (b) fails, there exists an arc (v, w) ∈ A′ \ A
with d(w) ≥ d(v) + 2. By (a) we have s ≼_k w ≺_k v ≼_k t and hence d(w) ≤ d(s) + 1 =
d(t) ≤ d(v) + 1, a contradiction. This shows (b).
Proof of (c): Assume d′(v) = d(v) for all v ∈ V. Since x′(t′) < 0, we have
x(t′) < 0 or t′ = s by (10.19). By the choice of t and the inequality d(s) < d(t),
we see that d(t′) ≤ d(t) and that if d(t′) = d(t) then t′ ≼₀ t. Next assume also
that t′ = t. We have (s′, t) ∈ A′, whereas (s′, t) ∉ A′ \ A by (a). Hence (s′, t) ∈ A
and the maximality of s implies s′ ≼₀ s. Finally assume also that s′ = s. For each
j = 1, ..., α, (s, t]_{≺_k^j} is a proper subset of (s, t]_{≺_k}. This implies α′ ≤ α. If α′ = α,
then β′ < β, since x′(t) = x′(t′) < 0 and L_k disappears in the update. Hence (c).
It follows from (c) that d(v) increases for some v ∈ V in O(n⁴) iterations
because each of d(t), t, s, α, β is bounded by n and there are at most n pairs
(d(t), t) if d does not change. For each v ∈ V, d(v) can increase at most n times.
Therefore, the total number of iterations is bounded by O(n⁶). □

Note 10.19. A more detailed analysis of Vygen [208] yields an improved bound of
O(n⁵) on the number of iterations in Schrijver's algorithm. •

10.2.3 Iwata-Fleischer-Fujishige's Algorithm


We explain here Iwata-Fleischer-Fujishige's strongly polynomial algorithm for sub-
modular function minimization. While sharing the basic framework of section 10.2.1,
this algorithm differs substantially from Schrijver's in that it is based on a scaling
technique rather than distance labeling.

Weakly Polynomial Scaling Algorithm

We start with a scaling algorithm for minimizing a submodular set function ρ : 2^V →
R. The algorithm is weakly polynomial for integer-valued ρ. It is emphasized that
the value of M in (10.8) need not be computed.
Recall from Proposition 10.8 that the problem dual to minimizing ρ is to
maximize x⁻(V) over x ∈ B(ρ). To add flexibility in solving this maximization
problem we introduce a scaling parameter δ > 0 to relax (or enlarge) the feasible
region B(ρ) to B(ρ + κ_δ), where

κ_δ(X) = δ |X| |V \ X|  (X ⊆ V). (10.20)

The function κ_δ is submodular and therefore B(ρ + κ_δ) = B(ρ) + B(κ_δ) by Theorem
4.23 (1). For a concrete representation of B(κ_δ) we observe that κ_δ is the cut
capacity function associated with a complete directed graph G = (V, A), where

A = {(u, v) | u, v ∈ V, u ≠ v}

and the arc capacities are all equal to δ; indeed, κ_δ coincides with κ in (9.16) with
c̄(a) = δ and c(a) = 0 for every arc a ∈ A. By a δ-feasible flow we mean a function
φ : A → R such that for each (u, v) ∈ A we have (i) 0 ≤ φ(u, v) ≤ δ and (ii)
either φ(u, v) = 0 or φ(v, u) = 0. Then B(κ_δ) = {∂φ | φ is δ-feasible} by (9.18) in
Theorem 9.3. Thus our relaxation problem with parameter δ reads as follows:

Maximize z⁻(V) over z = x + ∂φ with x ∈ B(ρ) and δ-feasible φ. (10.21)
The algorithm consists of scaling phases, each of which corresponds to a fixed
parameter value δ. We start with an arbitrary linear ordering, the extreme base x
associated with it, zero flow φ = 0, and

δ := min{x⁺(V), |x⁻(V)|} / n²,

where x⁺ is the vector in R^V defined by x⁺(v) = max(0, x(v)) for v ∈ V. In each
scaling phase, we construct an approximate solution to (10.21) from a given pair of
a base x and a δ-feasible flow φ and then cut δ and φ in half for the next scaling
phase. We terminate the algorithm when δ is sufficiently small; specifically, when
δ < 1/n² for integer-valued ρ.
In the scaling phase with parameter value δ we maintain a δ-feasible flow φ
and a directed graph G_φ = (V, A_φ) with arc set

A_φ = {(u, v) ∈ A | φ(u, v) = 0}. (10.22)

We aim at increasing z⁻(V) by sending flow along directed paths in G_φ from S to
T defined by

S = {v ∈ V | z(v) ≤ −δ},  T = {v ∈ V | z(v) ≥ δ}. (10.23)

Such a directed path is called a δ-augmenting path. We also maintain a base x
represented in the form of (10.13) with {(L_i, λ_i) | i ∈ I}.
If there exists a δ-augmenting path P, we modify φ to another δ-feasible flow
by setting

φ(u, v) := δ − φ(v, u),  φ(v, u) := 0

for each arc (u, v) in P. This results in an increase of z⁻(V) by δ, an improvement
in the objective function in our optimization problem (10.21). We refer to this
operation as Augment(φ, P).
Suppose that no δ-augmenting path exists and denote by W the set of vertices
reachable from S in G_φ; we have S ⊆ W ⊆ V \ T. In this case we cannot increase
z⁻(V) by flow augmentation but the current solution may or may not be optimal.
With an additional condition we have approximate optimality for (10.21), as is
stated in the following theorem, which is a relaxation version of the min-max relation
in Proposition 10.8. A triple (i, u, v) of i ∈ I, u ∈ W, and v ∈ V \ W is called active
if v is the immediate predecessor of u in L_i.

Proposition 10.20. If S ⊆ W ⊆ V \ T, no arcs leave W in G_φ, and no active
triples exist, then

z⁻(V) ≥ ρ(W) − nδ  and  x⁻(V) ≥ ρ(W) − n²δ.

Moreover, W is a minimizer of ρ if δ < Λ/n² with

Λ ≤ min{ρ(X) − ρ(Y) | X, Y ⊆ V, ρ(X) > ρ(Y)}. (10.24)

Proof. Since S ⊆ W ⊆ V \ T, we have z(v) < δ for every v ∈ W and z(v) > −δ
for every v ∈ V \ W. Therefore,

Since x(W) = ρ(W) by the nonexistence of active triples and Proposition 10.9 (2)
and ∂φ(W) ≥ 0 by the nonexistence of arcs leaving W in G_φ, we have z⁻(V) ≥
ρ(W) − nδ. Since ∂φ(v) ≤ (n − 1)δ for every v ∈ V, we have

With δ < Λ/n² we have x⁻(V) ≥ ρ(W) − n²δ > ρ(W) − Λ, whereas x⁻(V) ≤ ρ(Y)
for all Y ⊆ V by (10.10). Hence W is a minimizer of ρ. □

On the basis of Proposition 10.20 above we terminate the scaling phase if neither
augmenting path nor active triple exists. Otherwise, while keeping z invariant,
we aim at "improving the situation" by either
1. eliminating an active triple or
2. enlarging the reachable set W.
With a view to eliminating an active triple (i, u, v) we modify L_i by swapping u and
v in L_i; denote the old pair (L_i, y_i) by (L_k, y_k) with a new index k. The extreme
base associated with the updated L_i is given by y_i = y_k + β(χ_u − χ_v) with

β = ρ(L_i(u) \ {v}) − ρ(L_i(u)) + y_k(v)

(see (10.12)). Defining α = min(φ(u, v), λ_i β), let us modify x and φ by setting

x := x + α(χ_u − χ_v),  φ(u, v) := φ(u, v) − α.

Then z = x + ∂φ is invariant and φ remains δ-feasible. The updated x is equal to

x = Σ_{j∈I\{i}} λ_j y_j + (λ_i − α/β) y_k + (α/β) y_i, (10.25)

which is a convex combination of {y_j | j ∈ I ∪ {k}}. In the saturating case where
α = λ_i β, the old extreme base y_k disappears from (10.25). Since u precedes v in
the new L_i, the active triple (i, u, v) is successfully eliminated, whereas the size of
the index set I remains the same. In the nonsaturating case where α < λ_i β, the
old extreme base y_k remains and the size of I increases by one. Nevertheless, the
situation is somewhat improved; namely, the reachable set W is enlarged to contain
v as a result of φ(u, v) = 0 for the updated flow φ. The above task is done by the
procedure Double-Exchange(i, u, v) below.

Procedure Double-Exchange(i, u, v)
S1: Set β := ρ(L_i(u) \ {v}) − ρ(L_i(u)) + y_i(v), α := min(φ(u, v), λ_i β).
S2: If α < λ_i β, then let k be a new index and set
I := I ∪ {k}, λ_k := λ_i − α/β, λ_i := α/β, y_k := y_i, L_k := L_i.
S3: Set y_i := y_i + β(χ_u − χ_v), x := x + α(χ_u − χ_v), φ(u, v) := φ(u, v) − α.
Update L_i by swapping u and v.
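The identity behind step S1 can be checked numerically: for an active pair with v immediately preceding u in L_i, the extreme base of the swapped ordering equals y_i + β(χ_u − χ_v). A sketch, with an assumed cardinality-based submodular function as toy data:

```python
def extreme_base(rho, L):
    """Greedy construction (10.12) of the extreme base of an ordering."""
    y, prefix, prev = {}, set(), 0
    for v in L:
        prefix.add(v)
        cur = rho(frozenset(prefix))
        y[v] = cur - prev
        prev = cur
    return y

def double_exchange_beta(rho, L, y, u, v):
    """beta of Double-Exchange for an active pair: v immediately precedes
    u in L; the new extreme base is y + beta*(chi_u - chi_v)."""
    k = L.index(u)
    Lu = frozenset(L[: k + 1])             # the prefix L_i(u)
    return rho(Lu - {v}) - rho(Lu) + y[v]

# Toy submodular function (assumed): a concave function of |X|.
rho = lambda X: [0, 3, 5, 6, 6][len(X)]
L = [1, 2, 3, 4]
y = extreme_base(rho, L)
beta = double_exchange_beta(rho, L, y, 3, 2)   # active pair: 2 precedes 3
y_swapped = extreme_base(rho, [1, 3, 2, 4])    # L with 2 and 3 swapped
delta = {w: y_swapped[w] - y[w] for w in L}    # should be beta*(chi_3 - chi_2)
```

Only the two swapped components change, by +β and −β respectively, which is exactly what steps S2 and S3 exploit.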

The overall structure of the algorithm, which we name the IFF scaling algorithm,
is described below. Reduce(x, I) is a procedure that computes an expression
of x as a convex combination of at most n affinely independent extreme bases chosen
from the current extreme bases indexed by I; this can be done by a variant
of Gaussian elimination. Parameter Λ for the stopping criterion in step S4 should
satisfy (10.24); we take Λ = 1 for integer-valued ρ.

IFF scaling algorithm for submodular function minimization

S0: Take any linear ordering L₁ and let y₁ be the associated extreme base.
If y₁⁺(V) = 0, then output V as a minimizer and stop.
If y₁⁻(V) = 0, then output ∅ as a minimizer and stop.
Set I := {1}, λ₁ := 1, x := y₁, φ := 0, δ := min{x⁺(V), |x⁻(V)|}/n².
S1: Let W be the set of vertices reachable from S in G_φ.
S2: If W ∩ T ≠ ∅, then let P be a δ-augmenting path, apply Augment(φ, P)
and Reduce(x, I), and go to S1.
S3: If there exists an active triple, then apply Double-Exchange to an active
triple (i, u, v) and go to S1.
S4: If δ < Λ/n², then output W as a minimizer and stop.
S5: Apply Reduce(x, I), set δ := δ/2 and φ := φ/2, and go to S1.
Iterations of steps S1 to S3 constitute a scaling phase. Step S2 increases z⁻(V)
by flow augmentation, whereas step S3 improves the situation by Double-Exchange.
Whenever flow is augmented in step S2 we apply Reduce to reduce the size of I,
which may have grown as a result of repeated executions of step S3. The correctness
of the algorithm follows from the second half of Proposition 10.20, which guarantees
the optimality of W at the termination in step S4. Note, however, that the base
x is not necessarily optimal in (10.11). As for complexity the algorithm is weakly
polynomial for integer-valued ρ.

Proposition 10.21. The IFF scaling algorithm finds a minimizer of an integer-valued
submodular set function ρ : 2^V → Z with O(n⁵ log₂ M) function evaluations
and arithmetic operations, where n = |V| and M = max{|ρ(X)| : X ⊆ V}.

Proof. This can be derived from the properties listed in Proposition 10.22. □

Proposition 10.22.
(1) The number of scaling phases is O(log₂(M/Λ)).
(2) The first scaling phase calls Augment O(n²) times.
(3) A subsequent scaling phase calls Augment O(n²) times.
(4) Between calls to Augment, there are at most n − 1 calls to nonsaturating
Double-Exchange.
(5) Between calls to Augment, there are at most 2n³ calls to saturating Double-Exchange.
(6) We always have |I| < 2n.
(7) Reduce(x, I) with |I| < 2n can be done in O(n³) arithmetic operations.

Proof. (1) We have δ ≤ M/n² in step S0, since x⁺(V) = x(X) ≤ ρ(X) ≤ M for
X = supp⁺(x). The number of scaling phases is bounded by log₂((M/n²)/(Λ/n²)) =
log₂(M/Λ).
(2) Let x̄ denote the initial base in step S0. Then z⁻(V) = x̄⁻(V) at the
beginning of the scaling phase. Throughout the scaling phase we have z⁻(V) ≤
z(V) = x̄(V) as well as z⁻(V) ≤ 0. Since Augment increases z⁻(V) by δ, the
number of calls to Augment is bounded by

(3) At the beginning of a subsequent scaling phase, we have z̄⁻(V) ≥ ρ(W) − nδ̄
by Proposition 10.20, where z̄ = x + ∂φ̄ with φ̄ = 2φ and δ̄ = 2δ. Since

this implies that z⁻(V) ≥ ρ(W) − 2nδ − n²δ/4 at the beginning of the scaling phase.
Throughout the scaling phase we have

Therefore, the number of calls to Augment is bounded by (2nδ + n²δ/2)/δ = 2n +
n²/2.
(4) Each nonsaturating Double-Exchange adds a new element to W.
(5), (6) A call to Reduce results in |I| ≤ n. A new index is added to I only in
a nonsaturating Double-Exchange. By (4), |I| grows to at most 2n − 1. Hence, the
number of triples (i, u, v) is bounded by 2n³.
(7) Reduce can be performed by a variant of Gaussian elimination. □

The following is a key property of the scaling algorithm that we make use of in
designing a strongly polynomial algorithm. Recall that a scaling phase ends when
the algorithm reaches step S4.

Proposition 10.23. At the end of a scaling phase with parameter δ, the following
hold true.
(1) If x(w) < −n²δ, then w is contained in every minimizer of ρ.
(2) If x(w) > n²δ, then w is not contained in any minimizer of ρ.

Proof. We have x⁻(V) ≥ ρ(W) − n²δ by Proposition 10.20, whereas, for any
minimizer X of ρ, we have ρ(W) ≥ ρ(X) ≥ x(X) ≥ x⁻(X). Hence, x⁻(V) ≥
x⁻(X) − n²δ. Therefore, if x(w) < −n²δ, then w ∈ X. On the other hand,

Therefore, if x(w) > n²δ, then w ∉ X. □

Strongly Polynomial Fixing Algorithm

Using the scaling algorithm as a subroutine, we can devise a strongly polynomial al-
gorithm for submodular function minimization. The strongly polynomial algorithm,
which we call the IFF fixing algorithm, exploits two fundamental facts:
• The minimizers of p form a ring family (see Note 4.8).
10.2. Minimization of Submodular Set Functions 301

• A ring family can be represented as the set of ideals of a directed graph (see
Note 4.7).
Let D° be the directed graph representing the family 𝒟° of minimizers of ρ. This
means that W ⊆ V is a minimizer of ρ if and only if it is an ideal of D° such
that min 𝒟° ⊆ W ⊆ max 𝒟°. The algorithm aims at constructing this graph by
identifying arcs of D° or elements of (min 𝒟°) ∪ (V \ max 𝒟°) one by one, with the
aid of the scaling algorithm.
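These two facts can be checked by brute force on a toy example. The following sketch (mine, not the book's) uses a graph cut function, whose minimizers are exactly the unions of connected components; it verifies that they form a ring family and rebuilds them as the ideals of the digraph D°.

```python
from itertools import combinations

def subsets(ground):
    for r in range(len(ground) + 1):
        for c in combinations(ground, r):
            yield frozenset(c)

# cut function of the graph on {0,1,2} with the single edge (0,1); cut
# functions are submodular, and their minimizers (cut value 0) are the
# unions of connected components: {}, {2}, {0,1}, {0,1,2}
V = [0, 1, 2]
edges = [(0, 1)]
def rho(X):
    return sum(1 for (u, v) in edges if (u in X) != (v in X))

minval = min(rho(X) for X in subsets(V))
minimizers = [X for X in subsets(V) if rho(X) == minval]

# ring family: closed under union and intersection
is_ring = all((X | Y) in minimizers and (X & Y) in minimizers
              for X in minimizers for Y in minimizers)

# digraph D0: arc (u, w) iff every minimizer containing u also contains w
arcs = [(u, w) for u in V for w in V
        if u != w and all(w in X for X in minimizers if u in X)]

# ideals of D0 (sets closed under outgoing arcs) between min D0 and max D0
bottom = frozenset.intersection(*minimizers)
top = frozenset.union(*minimizers)
ideals = [X for X in subsets(V)
          if bottom <= X <= top and all(w in X for (u, w) in arcs if u in X)]
```

Here the only arcs are (0, 1) and (1, 0), reflecting that 0 and 1 enter or leave a minimizer together, and the ideals coincide with the minimizers.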
To be more specific, we maintain an acyclic graph D = (U, F) and two disjoint
subsets Z and H of V such that
• Z is included in every minimizer of ρ, i.e., Z ⊆ min 𝒟°;
• H is disjoint from every minimizer of ρ, i.e., H ⊆ V \ max 𝒟°;
• each u ∈ U corresponds to a nonempty subset, say, Γ(u), of V, and {Γ(u) |
u ∈ U} is a partition of V \ (Z ∪ H);
• for each u ∈ U and any minimizer W of ρ, either Γ(u) ⊆ W or Γ(u) ∩ W = ∅;
• an arc (u, w) ∈ F implies that every minimizer of ρ including Γ(u) includes Γ(w).

Using the notation Γ(Y) = ⋃_{u∈Y} Γ(u) for Y ⊆ U, we define a function ρ̄ : 2^U → R
by

It is easy to verify that
• ρ̄ is submodular;
• a subset W of V is a minimizer of ρ if and only if W = Γ(Y) ∪ Z for a
minimizer Y ⊆ U of ρ̄;
• an arc (u, w) ∈ F implies that every minimizer of ρ̄ containing u contains w;
i.e., a minimizer of ρ̄ is an ideal of D.
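The displayed definition of ρ̄ is lost in this copy; the reading used below, ρ̄(Y) = ρ(Γ(Y) ∪ Z), is my assumption, chosen to be consistent with the three listed properties. A brute-force check on a small hypothetical state (Z, H, Γ) of the algorithm (all instance data invented for illustration):

```python
from itertools import combinations

def subsets(ground):
    ground = list(ground)
    for r in range(len(ground) + 1):
        for c in combinations(ground, r):
            yield frozenset(c)

# submodular rho on V = {0,1,2,3}: a scaled cut term makes {0,1} behave as
# one block, and the modular weights put 3 into every minimizer
V = [0, 1, 2, 3]
w = {0: 1, 1: -2, 2: 0, 3: -1}
def rho(X):
    cut = 2 if ((0 in X) != (1 in X)) else 0
    return cut + sum(w[v] for v in X)

# hypothetical state of the fixing algorithm
Z, H = frozenset({3}), frozenset()
Gamma = {'a': frozenset({0, 1}), 'b': frozenset({2})}
U = list(Gamma)

def rho_bar(Y):  # assumed reading of the lost display
    return rho(Z.union(*(Gamma[u] for u in Y)))

def is_submodular(f, ground):
    fam = list(subsets(ground))
    return all(f(X) + f(Y) >= f(X | Y) + f(X & Y) for X in fam for Y in fam)

mins_rho = {X for X in subsets(V) if rho(X) == min(map(rho, subsets(V)))}
mins_bar = {Y for Y in subsets(U) if rho_bar(Y) == min(map(rho_bar, subsets(U)))}
```

On this instance the minimizers of ρ are {0,1,3} and {0,1,2,3}, matching Γ(Y) ∪ Z for the minimizers Y of ρ̄, and both functions test as submodular.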
The algorithm consists of iterations. Initially we set U := V, Z := ∅, H := ∅,
and F := ∅. At the beginning of each iteration, we compute

where R(u) denotes the set of vertices reachable from u ∈ U in D. If η ≤ 0, we
are done by Proposition 10.24 below. Otherwise, we either enlarge Z ∪ H or add
an arc to D, where directed cycles that may possibly arise in this modification are
contracted to a single vertex; the partition Γ of V \ (Z ∪ H) is modified accordingly.

Proposition 10.24. If η ≤ 0, then V \ H is the maximal minimizer of ρ.

Proof. Let Y be the unique maximal minimizer of ρ̄. If Y ≠ U, there is an
element u ∈ U \ Y such that Y ∪ {u} is an ideal of D. By Y ∪ {u} ⊇ R(u) and the
submodularity of ρ̄, we have

which contradicts the definition of Y. Thus, U is the maximal minimizer of ρ̄, and
hence Γ(U) ∪ Z = V \ H is the maximal minimizer of ρ. □

Suppose that η > 0 and let u ∈ U be the vertex that attains the maximum in
(10.26). Then

and we have at least one of the following three cases: (i) ρ̄(U) ≥ η/3, (ii) ρ̄(R(u) \
{u}) ≤ −η/3, or (iii) ρ̄(R(u)) − ρ̄(U) ≥ η/3.
Case (i): If ρ̄(U) ≥ η/3, we invoke a procedure Fix⁺(ρ̄, D, η), described below,
to find an element w ∈ U that is not contained in any minimizer of ρ̄. Since Γ(w)
cannot be included in any minimizer of ρ, we add Γ(w) to H and delete w from D.
Case (ii): If ρ̄(R(u) \ {u}) ≤ −η/3, we invoke another procedure Fix⁻(ρ̄, D, η),
described below, to find an element w ∈ U that is contained in every minimizer of
ρ̄. Since Γ(w) must be included in every minimizer of ρ, we add Γ(w) to Z and
delete w from D.
Case (iii): If ρ̄(R(u)) − ρ̄(U) ≥ η/3, we consider the contraction of ρ̄ by R(u),
which is a submodular function ρ̄* on U \ R(u) defined by

and find an element w ∈ U \ R(u) that is contained in every minimizer of ρ̄*. As
explained below, we can do this by applying Fix⁻ to (ρ̄*, D*, η), where D* means
the subgraph of D induced on the vertex set U \ R(u). A subset X ⊆ U \ R(u) is
a minimizer of ρ̄* if and only if X ∪ R(u) minimizes ρ̄ over the subsets of U containing
R(u). Therefore, if a minimizer of ρ̄ containing u exists, then it must contain w.
Equivalently, if a minimizer of ρ including Γ(u) exists, then it must include Γ(w).
Accordingly, we add a new arc (u, w) to F, where the arc (u, w) is new because
w ∉ R(u). If the added arc yields directed cycles, we contract the cycles to a single
vertex, with corresponding modifications of U and F.
Thus, in each iteration with η > 0, we either enlarge Z ∪ H or add a new arc
to D. Therefore, after at most n² iterations, we can terminate the algorithm with
η ≤ 0, when we have a minimizer of ρ by Proposition 10.24.
The procedures Fix⁻(ρ̄, D, η) and Fix⁺(ρ̄, D, η) are as follows. Given a
submodular function ρ̄ : 2^U → R, an acyclic graph D = (U, F), and a positive real
number η such that

the procedure Fix⁻(ρ̄, D, η) finds an element w ∈ U that is contained in every
minimizer of ρ̄. Similarly, Fix⁺(ρ̄, D, η) finds an element w ∈ U that is not contained
in any minimizer of ρ̄ when

The procedures Fix⁻(ρ̄, D, η) and Fix⁺(ρ̄, D, η) are the same as the IFF scaling
algorithm except that they start with δ = η and a linear extension of the partial
order represented by D, and return w in step S4. We put n = |U|.

Procedure Fix⁻(ρ̄, D, η)
S0: Take any linear extension L₁ of the partial order represented by D and
let y₁ be the associated extreme base.
Set I := {1}, λ₁ := 1, x := y₁, φ := 0, δ := η.
S1: Let W be the set of vertices reachable from S in G_φ.
S2: If W ∩ T ≠ ∅, then let P be a φ-augmenting path, apply Augment(φ, P)
and Reduce(x, I), and go to S1.
S3: If there exists an active triple (i, u, v), then apply Double-Exchange to it
and go to S1.
S4: If there exists w ∈ U with x(w) < −n²δ, then return such a w.
S5: Apply Reduce(x, I), set δ := δ/2 and φ := φ/2, and go to S1.
Procedure Fix⁺(ρ̄, D, η) is identical to Fix⁻(ρ̄, D, η) except that step S4 is replaced
with
S4: If there exists w ∈ U with x(w) > n²δ, then return such a w.
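Step S0 relies on the greedy construction of an extreme base from a linear order. The steps can be sketched as follows (not from the book; the submodular function and the order are illustrative, and any linear extension of the partial order of D would serve as the order):

```python
def extreme_base(rho, order):
    """Greedy extreme base of the base polyhedron B(rho) for a linear
    order (u1, ..., un): y(ui) = rho({u1..ui}) - rho({u1..u(i-1)})."""
    y, prefix, prev = {}, set(), rho(frozenset())
    for u in order:
        prefix.add(u)
        cur = rho(frozenset(prefix))
        y[u] = cur - prev
        prev = cur
    return y

# rank function of the uniform matroid of rank 2 on {0,1,2,3} (submodular)
V = [0, 1, 2, 3]
def rho(X):
    return min(len(X), 2)

y = extreme_base(rho, [2, 0, 3, 1])
```

The result is a base: y(V) = ρ(V), and y(X) ≤ ρ(X) for every subset X, which the test below confirms by enumeration.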
The correctness of the above procedures at the termination in step S4 is
guaranteed by Proposition 10.23. As to the complexity, we have the following as well as
Proposition 10.22 (3)-(7).

Proposition 10.25. The following statements hold true for Fix±(ρ̄, D, η).
(1) The number of scaling phases is O(log₂ n), where n = |U|.
(2) If

then the first scaling phase calls Augment O(n) times.

Proof. (1) Assume δ < η/(3n³). By (10.28) we have x(U) = ρ̄(U) ≥ η/3 > n³δ
and hence x(w) > n²δ for some w ∈ U. Therefore, the number of scaling phases
in Fix⁺ is bounded by ⌈log₂(3n³)⌉ = O(log₂ n). For Y in (10.27), we have x(Y) ≤
ρ̄(Y) ≤ −η/3 < −n³δ and hence x(w) < −n²δ for some w ∈ Y. Therefore, the
number of scaling phases in Fix⁻ is bounded by O(log₂ n).
(2) Let x denote the initial base in step S0. By the proof of Proposition
10.22 (2), the number of calls to Augment is bounded by x⁺(U)/δ, whereas x⁺(U) ≤
nη by (10.29). Since δ = η, the number of calls to Augment is bounded by n. □

The applications of Fix± in the IFF fixing algorithm are legitimate.

Proposition 10.26. Let η be defined by (10.26).
(1) Conditions (10.28) and (10.29) are satisfied by (ρ̄, D, η) in Case (i).
(2) Conditions (10.27) and (10.29) are satisfied by (ρ̄, D, η) in Case (ii).
(3) Conditions (10.27) and (10.29) are satisfied by (ρ̄*, D*, η) in Case (iii).

Proof. In Case (i), (10.28) is obviously satisfied. Condition (10.27) is satisfied with
Y = R(u) \ {u} in Case (ii) and with Y = U \ R(u) in Case (iii). To show (10.29) for

(p,D,rj), let y be an extreme base generated by a linear extension of the partial


order of D. For each u € U we have y(u) = p(Y) — p(Y \ {u}) for some Y D R(u),
whereas

by the submodularity of p and the definition of n. This proves (10.29) for (p, D, 77).
Finally, for each u e U \ R(u), put J?*(u) = R(u) \ R(u) and observe

This shows (10.29) for (p,,D*,ri). D

We are now in a position to assert the correctness and the strong polynomiality
of the IFF fixing algorithm.

Proposition 10.27. The IFF fixing algorithm finds a minimizer of a submodular
set function ρ : 2^V → R with O(n⁷ log₂ n) function evaluations and arithmetic
operations, where n = |V|.

Proof. This follows from Proposition 10.25 and Proposition 10.22 (3)-(7). □

Finally, we note that the maximal minimizer is found.

Proposition 10.28. The IFF fixing algorithm finds the maximal minimizer of ρ.

Proof. This follows from Proposition 10.24. □

Note 10.29. For minimization of a submodular set function ρ : 2^V → R ∪ {+∞}
defined effectively on a general ring family, the IFF scaling/fixing algorithm can
be applied to the associated finite-valued submodular function of Note 10.14.
Alternatively, the IFF scaling/fixing algorithm can be tailored to this general case
if the ring family is represented as the set of ideals of a directed graph (V, E). The
min-max relation (10.11) in Proposition 10.8 holds true in this general case. The
representation (10.13) of a base x as a convex combination of extreme bases yᵢ
(i ∈ I) should be augmented by an additional term ∂ξ as

where ∂ξ is the boundary of a nonnegative flow ξ : E → R₊. Then we have


where ψ is a flow in the complete graph on V representing the superposition of ξ and
φ. It is possible to design an algorithm that finds the minimum of ρ by maintaining
the extreme bases and the flow ψ. See Iwata [99] for more details. •

10.3 Minimization of L-Convex Functions


Three kinds of algorithms for L-convex function minimization are described: the
steepest descent algorithm, the steepest descent scaling algorithm, and the reduction
to submodular function minimization on a distributive lattice. All of them depend
heavily on the algorithms for submodular function minimization in section 10.2.
Throughout this section g : Z^V → R ∪ {+∞} denotes an L- or L♮-convex function
with |V| = n. For an L-convex function g, it is assumed that

    g(p + 1) = g(p)    (∀ p ∈ Z^V),  where 1 = (1, 1, . . . , 1),    (10.31)

since otherwise g does not have a minimum.

10.3.1 Steepest Descent Algorithm


The local characterization of global minimality for L-convex functions (Theorem
7.14) naturally leads to the following steepest descent algorithm.

Steepest descent algorithm for an L-convex function g ∈ 𝓛[Z → R]
S0: Find a vector p ∈ dom g.
S1: Find X ⊆ V that minimizes g(p + χ_X).
S2: If g(p) ≤ g(p + χ_X), then stop (p is a minimizer of g).
S3: Set p := p + χ_X and go to S1.

Step S1 amounts to minimizing the set function

    ρ_p(X) = g(p + χ_X) − g(p)    (X ⊆ V)    (10.32)

over all subsets X of V. As a consequence of the submodularity of g, ρ_p is
submodular and can be minimized in strongly polynomial time by the algorithms in
section 10.2. At the termination in step S2, p is a global minimizer by Theorem
7.14 (1) (L-optimality criterion). The function value g decreases monotonically with
the iterations. This property alone does not ensure finite termination in general, but it
does if g is integer valued and bounded from below.
We introduce a tie-breaking rule in step S1:

    take the minimal minimizer X of ρ_p.    (10.33)
Thus, we can guarantee an upper bound on the number of iterations. Let p° be the
initial vector found in step S0. If g has a minimizer at all, it has a minimizer p*
satisfying p° ≤ p* by (10.31). Let p* denote the smallest of such minimizers, which
exists since p* ∧ q* ∈ arg min g for p*, q* ∈ arg min g.

Proposition 10.30. In step S1, p ≤ p* implies p + χ_X ≤ p*. Hence the number
of iterations is bounded by ‖p° − p*‖₁.

Proof. Put Y = {v ∈ V | p(v) = p*(v)} and p′ = p + χ_X. By submodularity we
have

    g(p*) + g(p′) ≥ g(p* ∨ p′) + g(p* ∧ p′),

whereas g(p*) ≤ g(p* ∨ p′) since p* is a minimizer of g. Hence g(p′) ≥ g(p* ∧ p′). Here
we have p′ = p + χ_X and p* ∧ p′ = p + χ_{X\Y}, whereas X is the minimal minimizer
by the tie-breaking rule (10.33). This means that X \ Y = X; i.e., X ∩ Y = ∅.
Therefore, p′ = p + χ_X ≤ p*. □

It is easy to find the minimal minimizer of ρ_p using the existing algorithms
for submodular set function minimization (see Notes 10.11, 10.12, and 10.13 and
Proposition 10.28). Assuming that the minimal minimizer of a submodular set
function can be computed with O(σ(n)) function evaluations and O(τ(n)) arithmetic
operations, and denoting by F an upper bound on the time to evaluate g, we can
perform step S1 in O(σ(n)F + τ(n)) time. We measure the size of the effective
domain of g by

    K₁ = max{ ‖p − q‖₁ | p, q ∈ dom g, p(v) = q(v) for some v ∈ V },    (10.34)

where it is noted that dom g itself is unbounded by (10.31).

Proposition 10.31. For an L-convex function g with finite K₁, the number of
iterations in the steepest descent algorithm with tie-breaking rule (10.33) is bounded
by K₁. Hence, if a vector in dom g is given, the algorithm finds a minimizer of g
in O((σ(n)F + τ(n))K₁) time.

Proof. We have ‖p° − p*‖₁ ≤ K₁ since p°(v) = p*(v) for some v ∈ V. Then the
claim follows from Proposition 10.30. □

The steepest descent algorithm can be adapted to L♮-convex functions. Let g
be an L♮-convex function and recall from (7.2) that it is associated with an L-convex
function g̃ as

The steepest descent algorithm above applied to this L-convex function g̃ yields the
following algorithm for the L♮-convex function g.
Steepest descent algorithm for an L♮-convex function g ∈ 𝓛♮[Z → R]
S0: Find a vector p ∈ dom g.
S1: Find ε ∈ {1, −1} and X ⊆ V that minimize g(p + εχ_X).
S2: If g(p) ≤ g(p + εχ_X), then stop (p is a minimizer of g).
S3: Set p := p + εχ_X and go to S1.
Step S1 amounts to minimizing the pair of submodular set functions

    ρ⁺(X) = g(p + χ_X) − g(p),    ρ⁻(X) = g(p − χ_X) − g(p)    (X ⊆ V).

Let X⁺ be the minimal minimizer of ρ⁺ and let X⁻ be the maximal minimizer of ρ⁻.
The tie-breaking rule in step S1 above reads

This is a translation of the tie-breaking rule (10.33) for g̃ in (10.35) through the
correspondence

where p̃ = (0, p) ∈ Z × Z^V. Since (1, χ_{V\X⁻}) cannot be minimal in the presence of
(0, χ_{X⁺}), we choose (1, X⁺) in the case of min ρ⁺ = min ρ⁻. At the termination in
step S2, p is a global minimizer by Theorem 7.14 (2) (L-optimality criterion).
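As an illustration (mine, not the book's), the following sketch runs the L♮-convex steepest descent above on a small function of the form φ₁(p₁) + φ₂(p₂) + ψ(p₁ − p₂) with convex φᵢ and ψ, which is L♮-convex. Step S1 is done by brute force over ε and X, and the best strictly improving step is taken; the tie-breaking rule above refines this choice but is not implemented here.

```python
from itertools import combinations

V = (0, 1)
def g(p):  # L-natural-convex: separable convex plus convex in p0 - p1
    return (p[0] - 3) ** 2 + p[1] ** 2 + 2 * abs(p[0] - p[1])

def steepest_descent(g, p):
    steps = 0
    while True:
        best, move = g(p), None
        for eps in (1, -1):                      # S1: scan all (eps, X)
            for r in range(len(V) + 1):
                for X in combinations(V, r):
                    q = tuple(p[i] + (eps if i in X else 0) for i in V)
                    if g(q) < best:
                        best, move = g(q), q
        if move is None:                          # S2: no improvement
            return p, steps
        p, steps = move, steps + 1                # S3
```

A short run from p = (0, 0) reaches the global minimizer (2, 1) in two steps, which the test checks against brute force over a box.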
In view of the complexity bound given in Proposition 10.31, we will derive a
bound on the size of dom g̃ in terms of the size of dom g. Let K₁(g̃) be defined by
(10.34) for g̃. The ℓ₁-size and ℓ∞-size of dom g are denoted, respectively, by

    K₁ = max{ ‖p − q‖₁ | p, q ∈ dom g },    K∞ = max{ ‖p − q‖∞ | p, q ∈ dom g }.

Proposition 10.32. K₁(g̃) ≤ K₁ + nK∞ ≤ min[(n + 1)K₁, 2nK∞].

Proof. Take p̃ = (p₀, p) and q̃ = (q₀, q) in dom g̃ such that K₁(g̃) = |p₀ − q₀| +
‖p − q‖₁ and either (i) p₀ = q₀ or (ii) p(v) = q(v) for some v ∈ V. We may assume
p₀ ≥ q₀ and p ≥ q, since p̃ ∨ q̃, p̃ ∧ q̃ ∈ dom g̃ and ‖(p̃ ∨ q̃) − (p̃ ∧ q̃)‖₁ = ‖p̃ − q̃‖₁.
The vectors p′ = p − p₀1 and q′ = q − q₀1 belong to dom g. In case (i), we have
K₁(g̃) = ‖p − q‖₁ = ‖p′ − q′‖₁ ≤ K₁. In case (ii), we have p₀ − q₀ = q′(v) − p′(v) and

Note finally that K₁ ≤ nK∞ and K∞ ≤ K₁. □

The steepest descent algorithm could be used for minimizing quasi L-convex
functions satisfying (SSQSB♮) because of Theorem 7.53 (quasi L-optimality
criterion). Note, however, that the set function ρ_p of (10.32), to be minimized in step
S1, is not necessarily submodular, and hence no efficient procedure is available for
step S1.

10.3.2 Steepest Descent Scaling Algorithm


The steepest descent algorithm for L-convex function minimization can be made
more efficient with the aid of a scaling technique. The efficiency of the resulting
steepest descent scaling algorithm is guaranteed by the complexity analysis in
section 10.3.1 combined with the proximity theorem for L-convex functions.
The algorithm for an L-convex function g with (10.31) reads as follows, where

Steepest descent scaling algorithm for an L-convex function g ∈ 𝓛[Z → R]
S0: Find a vector p ∈ dom g and set α := 2^⌈log₂(K∞/2n)⌉.
S1: Find an integer vector q that locally minimizes g̃(q) = g(p + αq) in the
sense of g̃(q) ≤ g̃(q + χ_X) (∀ X ⊆ V) by the steepest descent algorithm
of section 10.3.1 with initial vector 0, and set p := p + αq.
S2: If α = 1, then stop (p is a minimizer of g).
S3: Set α := α/2 and go to S1.
Note first that the function g̃(q) = g(p + αq) is an L-convex function. By the
L-proximity theorem (Theorem 7.18 (1)), there exists a minimizer q of g̃ satisfying
0 ≤ q ≤ (n − 1)1. Then, by Propositions 10.30 and 10.31, the steepest
descent algorithm with the tie-breaking rule (10.33) finds this minimizer in step S1 in
O((σ(n)F + τ(n))n²) time, where σ(n), τ(n), and F are defined in section 10.3.1.
The number of executions of step S1 is bounded by ⌈log₂(K∞/2n)⌉ and, at the
termination of the algorithm in step S2 with α = 1, p is a minimizer of g by
Theorem 7.14 (L-optimality criterion). Thus, the complexity of the steepest descent
scaling algorithm is bounded by a polynomial in n and log₂(K∞/2n).
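A toy run of this scheme (my own sketch, using the L♮ variant with steps of both signs): each phase performs a local search over steps ±α·χ_X, and α is halved down to 1. The test function and the starting scale α = 32 are chosen by hand for illustration.

```python
from itertools import combinations

V = (0, 1)
def g(p):  # L-natural-convex test function with a distant minimizer
    return (p[0] - 30) ** 2 + (p[1] - 4) ** 2 + abs(p[0] - p[1])

def local_min(g, p, alpha):
    """Descend by steps of +/- alpha * chi_X until none improves g."""
    improved = True
    while improved:
        improved = False
        for eps in (alpha, -alpha):
            for r in range(len(V) + 1):
                for X in combinations(V, r):
                    q = tuple(p[i] + (eps if i in X else 0) for i in V)
                    if g(q) < g(p):
                        p, improved = q, True
    return p

p, alpha = (0, 0), 32          # alpha would be 2**ceil(log2(K_inf/2n)) in S0
phases = []
while True:
    p = local_min(g, p, alpha)  # S1 at the current scale
    phases.append((alpha, p))
    if alpha == 1:              # S2
        break
    alpha //= 2                 # S3
```

Only six phases (α = 32, 16, 8, 4, 2, 1) are needed, and the final local minimum at scale 1 is a global minimum by the L-optimality criterion.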
The steepest descent scaling algorithm can be adapted to quasi L-convex
functions satisfying (SSQSB) because of Theorem 7.53 (quasi L-optimality criterion)
and Theorem 7.54 (quasi L-proximity theorem). Note, however, that no efficient
procedure is available for the minimization in step S1.

10.3.3 Reduction to Submodular Function Minimization


The effective domain of an L♮-convex function g is a distributive lattice (a sublattice
of Z^V) on which g is submodular. Hence we can make use of submodular function
minimization algorithms as adapted to functions on distributive lattices (see Note
10.15). If the ℓ₁-size of dom g is given by K₁, then dom g is isomorphic to a sublattice
of a Boolean lattice 2^Ṽ for a set Ṽ of cardinality K₁. Hence, the complexity of
this algorithm is polynomial in n and K₁. It may be noted, however, that, being
dependent only on the submodularity of g, this approach does not fully exploit
L♮-convexity.

10.4 Algorithms for M-Convex Submodular Flows


Five algorithms for the M-convex submodular flow problem are described: the
two-stage algorithm, the successive shortest path algorithm, the cycle-canceling
algorithm, the primal-dual algorithm, and the conjugate scaling algorithm. Because
the optimality criterion for the M-convex submodular flow problem is essentially
equivalent to the duality theorems for M-/L-convex functions, these algorithms can
be used for finding a separating affine function in the separation theorem and the
optimal solutions in the minimization/maximization problems in the Fenchel-type
duality.

10.4.1 Two-Stage Algorithm


This section is intended to provide a general structural view of the duality nature
of the M-convex submodular flow problem. It is based on the recognition of the
M-convex submodular flow problem as a composition of the Fenchel-type duality
and a minimum cost flow problem that does not involve an M-convex function.
The algorithm presented in this section, called the two-stage algorithm, computes
an optimal potential by solving an L-convex minimization problem in the dual
problem and constructs an optimal flow as a feasible flow of another submodular
flow problem.
As an adaptation of our discussion in section 9.1.4, the relationship between
the M-convex submodular flow problem MSFP₃ and the Fenchel-type duality may
be summarized as follows. To be specific, we consider the integer-flow version of
MSFP₃ on the graph G = (V, A) with f ∈ 𝓜[Z → Z] and f_a ∈ 𝓒[Z → Z] for a ∈ A.
M-convex submodular flow problem MSFP₃ (integer flow)

We assume the existence of an optimal solution.
First, we identify the problem dual to MSFP₃ and indicate how to compute
an optimal potential. With the introduction of a function

we obtain

where f ∈ 𝓜[Z → Z] and f_A ∈ 𝓜[Z → Z]. Putting g = f*, g_a = f_a* for
a ∈ A, and

we have g_A = f_A* (see (9.28)) and also g ∈ 𝓛[Z → Z], g_a ∈ 𝓒[Z → Z] for a ∈ A,
and g_A ∈ 𝓛[Z → Z]. The Fenchel-type duality (Theorem 8.21 (3)) gives

which is equivalent to⁶¹

The function

to be minimized on the right-hand side of (10.44) is an L-convex function, and a
minimizer of g is an optimal potential for MSFP₃, and vice versa, in the sense of
Theorem 9.16.
Next we discuss how to construct an optimal flow. Let p* be a minimizer of g
and define c* : A → Z ∪ {−∞}, c̄* : A → Z ∪ {+∞}, and B* ⊆ Z^V by

where B* is an M-convex set by Proposition 6.29. Since p* is an optimal potential,
a flow ξ* is optimal if and only if

The two-stage algorithm is described as follows.

Two-stage algorithm for MSFP₃ (integer flow)
S1: Find a minimizer p* of g in (10.45).
S2: Find a flow ξ* satisfying (10.48).

The feasibility of this approach is guaranteed by the following facts, provided that
the given functions, f and f_a for a ∈ A, can be evaluated.
1. We can evaluate g by applying an M-convex function minimization algorithm
to f. Similarly, we can evaluate g_a for a ∈ A.
2. We can find a minimizer p* in step S1 by applying an L-convex function
minimization algorithm to g.
3. We can find a member of B* by applying an M-convex function minimization
algorithm to f[−p*].
4. We can find a flow ξ* in step S2 as a feasible flow of the submodular flow
problem defined by c*, c̄*, and B*. This can be done, e.g., by the successive
shortest path algorithm described in section 10.4.2.
⁶¹ We have seen (10.44) in (9.83) as a special case of Theorem 9.26 (3).

10.4.2 Successive Shortest Path Algorithm


We present the successive shortest path algorithm for finding a feasible integer flow
of an integral submodular flow problem. We adopt the most primitive form of the
algorithm to better explain the basic idea without being bothered by technicalities.
Given a graph G = (V, A), an upper capacity c̄ : A → Z ∪ {+∞}, a lower
capacity c : A → Z ∪ {−∞}, and an M-convex set B ⊆ Z^V, we are to find an
integer flow ξ : A → Z satisfying

It is assumed that c(a) ≤ c̄(a) for each a ∈ A.


The algorithm maintains a pair (ξ, x) ∈ Z^A × Z^V of an integer flow ξ satisfying
(10.49) and a base x ∈ B, and repeats modifying (ξ, x) to resolve the discrepancy
between ∂ξ and x. For such (ξ, x) let G_{ξ,x} = (V, A_{ξ,x}) be a directed graph with
vertex set V and arc set A_{ξ,x} = A_ξ ∪ B_ξ ∪ C_x consisting of three disjoint parts:

and define

In order to reduce the discrepancy ‖x − ∂ξ‖₁ = Σ_{v∈V} |x(v) − ∂ξ(v)|, the
algorithm augments a unit flow along a shortest path P from S⁺ to S⁻ (shortest
with respect to the number of arcs) and modifies ξ to ξ̃ given by

Obviously, ξ̃ satisfies the capacity constraint (10.49). The algorithm also updates
the base x to

which remains a base belonging to B; see Note 10.33. For the initial vertex s ∈ S⁺
of the path P, either ∂ξ(s) increases or x(s) decreases by one, and hence |x(s) −
∂ξ(s)| decreases by one. A similar statement is true for the terminal vertex of P in S⁻,
whereas x(v) − ∂ξ(v) is kept invariant at every inner vertex v of P. Therefore,
each augmentation along a shortest path decreases ‖x − ∂ξ‖₁ by two. Repeating
this process until the source S⁺ and consequently the sink S⁻ become empty, the
algorithm constructs a pair (ξ, x) with ∂ξ = x. Then ξ is a feasible flow satisfying
both (10.49) and (10.50).

Successive shortest path algorithm for finding a feasible integer flow
S0: Find ξ ∈ Z^A satisfying (10.49) and x ∈ B.
S1: If S⁺ is empty, then stop (ξ is a feasible flow).
S2: If there is no path from S⁺ to S⁻, then stop (no feasible flow exists).
S3: Let P be a shortest path from S⁺ to S⁻ and, for each arc a ∈ P, set

and go to S1.
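The steps above can be sketched in code on a tiny instance (mine, not the book's; the graph, capacities, and base polyhedron are invented for illustration). B is enumerated explicitly so that membership, and hence the exchange arcs C_x, can be tested by brute force; the BFS finds a path from S⁺ to S⁻ that is shortest in the number of arcs, as step S3 requires.

```python
from itertools import product
from collections import deque

# M-convex set B: integer points of a base polyhedron on V = {0,1,2}
# (x(V) = 0 and x(X) <= 2 for every proper nonempty subset X)
V = [0, 1, 2]
def in_B(x):
    if sum(x) != 0:
        return False
    return all(sum(x[i] for i in V if m >> i & 1) <= 2
               for m in range(1, 2 ** len(V) - 1))
Bset = {x for x in product(range(-2, 3), repeat=3) if in_B(x)}

arcs = [(0, 1), (1, 2), (2, 0)]                  # arc a = (tail, head)
lower = {a: 0 for a in arcs}
upper = {(0, 1): 2, (1, 2): 1, (2, 0): 2}

def boundary(xi):
    b = [0] * len(V)
    for (u, v), f in xi.items():
        b[u] += f
        b[v] -= f
    return tuple(b)

def feasible_flow(x):
    xi = {a: 0 for a in arcs}
    while True:
        d = boundary(xi)
        Splus = [v for v in V if x[v] > d[v]]
        if not Splus:
            return xi, x                         # boundary(xi) == x in B
        Sminus = {v for v in V if x[v] < d[v]}
        adj = {v: [] for v in V}                 # auxiliary graph G_{xi,x}
        for a in arcs:                           # A_xi and B_xi arcs
            if xi[a] < upper[a]:
                adj[a[0]].append((a[1], ('fwd', a)))
            if xi[a] > lower[a]:
                adj[a[1]].append((a[0], ('bwd', a)))
        for u in V:                              # exchange arcs C_x
            for v in V:
                y = list(x)
                y[u] -= 1
                y[v] += 1
                if u != v and tuple(y) in Bset:
                    adj[u].append((v, ('exch', (u, v))))
        prev = {s: None for s in Splus}          # BFS: shortest S+ -> S- path
        queue, t = deque(Splus), None
        while queue and t is None:
            v = queue.popleft()
            if v in Sminus:
                t = v
                break
            for w, lab in adj[v]:
                if w not in prev:
                    prev[w] = (v, lab)
                    queue.append(w)
        if t is None:
            return None, x                       # no feasible flow exists
        x = list(x)                              # apply unit updates along P
        while prev[t] is not None:
            u, (kind, a) = prev[t]
            if kind == 'fwd':
                xi[a] += 1
            elif kind == 'bwd':
                xi[a] -= 1
            else:
                x[a[0]] -= 1
                x[a[1]] += 1
            t = u
        x = tuple(x)

xi, x = feasible_flow((-1, 2, -1))
```

On this instance the second augmentation has to use an exchange arc, since the only forward arc out of the source is saturated; the run terminates with ∂ξ = x ∈ B within the capacity bounds.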

Note 10.33. The updated vector x̃ in (10.52) remains a base, i.e., x̃ ∈ B, by
Proposition 9.23 with f = δ_B (the indicator function of B). Let G(x, x̃) be the
bipartite graph defined as in section 9.5.2. All the arcs have zero weight. Hence, (x, x̃)
meets the unique-min condition if and only if G(x, x̃) has a unique perfect matching. The
latter condition holds in the algorithm because P is chosen to be a shortest path,
whereas Proposition 9.23 says that the unique-min condition implies x̃ ∈ B. •

Note 10.34. Instead of augmenting a unit flow along P, it is more efficient to
augment as much as possible. The maximum admissible amount is given by δ =
min{c̃(a) | a ∈ P} with

where c̃(·, ·, ·) on the right-hand side means the exchange capacity defined in (10.4).
It can be shown that

stays in B; see Lemma 4.5 of Fujishige [65]. The successive shortest path algorithm
can be adapted to finding a real-valued feasible flow of a nonintegral submodular
flow problem. For a polynomial complexity bound of the algorithm, it is important
to choose, in the case of multiple candidates, an appropriate shortest path with
reference to some lexicographic ordering. The successive shortest path algorithm
can be generalized to optimal flow problems involving cost functions; see [65],
Fujishige-Iwata [66], and Iwata [98] for such algorithms for MSFP₁ (without
M-convex cost), whereas such an algorithm for the integer-flow version of MSFP₂
(with M-convex cost) is given in Moriguchi-Murota [132]. •

Note 10.35. A number of algorithms are available for finding a feasible submodular
flow; e.g., Fujishige [59], Frank [56], Tardos-Tovey-Trick [199], and Fujishige-Zhang
[70]. The reader is referred to Fujishige [65], Fujishige-Iwata [66], and Iwata [98]
for expositions. •

10.4.3 Cycle-Canceling Algorithm


For the M-convex submodular integer-flow problem with linear arc cost, we have
seen in section 9.5 that a feasible flow is optimal if and only if there exists no negative
cycle in an auxiliary network (Theorem 9.20) and that a nonoptimal flow can be
improved by augmenting a flow along a suitably chosen negative cycle (Theorem
9.22). These two facts suggest the following cycle-canceling algorithm, which works
on the auxiliary network (G_ξ, ℓ_ξ) introduced in section 9.5.
Cycle-canceling algorithm for MSFP₂ (integer flow)
S0: Find a feasible integer flow ξ.
S1: If (G_ξ, ℓ_ξ) has no negative cycle, then stop (ξ is an optimal flow).
S2: Let Q be a negative cycle with the smallest number of arcs.
S3: Modify ξ to ξ̃ of (9.75) and go to S1.
The objective function value decreases monotonically by Theorem 9.22. This
property alone does not ensure finite termination in general, but it does if the
objective function is integer valued and bounded from below. In the special case with
dom f ⊆ {0, 1}^V, which corresponds to the valuated matroid intersection problem
(Example 8.28), some variants of the cycle-canceling algorithm are known to be
strongly polynomial (see Murota [136]).
Cycle-canceling algorithms can also be designed for problems with real-valued
flows on the basis of Theorem 9.18.
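Steps S1 and S2 hinge on negative-cycle detection in the auxiliary network. A generic Bellman-Ford sketch follows (my illustration; the node set and arc costs are arbitrary, and the refinement of step S2 to a cycle with the fewest arcs is omitted here):

```python
def find_negative_cycle(nodes, arcs):
    """Bellman-Ford with a virtual source at distance 0 to every vertex:
    returns the vertex sequence of some negative-cost cycle, or None."""
    dist = {v: 0 for v in nodes}
    prev = {v: None for v in nodes}
    last = None
    for _ in range(len(nodes)):
        last = None
        for u, v, c in arcs:
            if dist[u] + c < dist[v]:
                dist[v] = dist[u] + c
                prev[v] = u
                last = v
    if last is None:                 # nth pass relaxed nothing: no negative cycle
        return None
    v = last                         # walk back n steps to land on the cycle
    for _ in range(len(nodes)):
        v = prev[v]
    cycle, u = [v], prev[v]
    while u != v:
        cycle.append(u)
        u = prev[u]
    cycle.reverse()                  # consecutive pairs are arcs (u, v)
    return cycle

nodes = [0, 1, 2, 3]
arcs = [(0, 1, 2), (1, 2, -3), (2, 3, -2), (3, 1, 4), (2, 0, 1)]
cyc = find_negative_cycle(nodes, arcs)
```

On this example the unique negative cycle 1 → 2 → 3 → 1 (total cost −1) is recovered.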

Note 10.36. For MSFP₁ (submodular flow problem with linear arc cost and
without M-convex cost), a number of cycle-canceling algorithms have been proposed,
including Fujishige [59], Cui-Fujishige [27], Zimmermann [222], Wallacher-Zimmermann
[210], and Iwata-McCormick-Shigeno [103]. •

10.4.4 Primal-Dual Algorithm


A primal-dual algorithm for the M-convex submodular flow problem is described.
The algorithm maintains a pair of a flow and a potential and modifies them to
optimality. We deal with the case of linear arc cost with dual integrality.
M-convex submodular flow problem MSFP₂

Here, c̄ : A → R ∪ {+∞}, c : A → R ∪ {−∞}, f ∈ 𝓜[R → R|Z], and γ : A → Z
(see (9.67)). The feasibility of the problem is also assumed.
By Theorems 9.14 and 9.15 as well as (9.31), (9.32), and (9.65), a feasible
flow ξ : A → R is optimal if and only if there exists an integer-valued potential
p : V → Z such that

where γ_p : A → Z is the (integer-valued) reduced cost defined by

and g_p : 2^V → R ∪ {+∞} is the submodular set function derived from g = f* ∈
𝓛[Z|R → R] by

with g_p(V) = 0 by (9.51), and B(g_p) is the base polyhedron (M-convex polyhedron)
associated with g_p.
associated with gp.
The algorithm maintains a pair (ξ, p) ∈ R^A × Z^V of a feasible flow ξ and
an integer-valued potential p that satisfies (10.59), and repeats modifying (ξ, p) to
increase the set of arcs satisfying (10.57) and (10.58). We say an arc is in kilter
with respect to (ξ, p) if it satisfies (10.57) and (10.58), and out of kilter otherwise.
Note that exactly one of (10.57) and (10.58) fails for an out-of-kilter arc.
An initial (ξ, p) satisfying (10.59) can be found as follows. For any feasible
flow ξ we consider a graph G_ξ = (V, Q) with vertex set V and arc set

and define the length of an arc (u, v) as f′(∂ξ; −χ_u + χ_v), which is an integer by the
assumed dual integrality (Note 9.19). By solving a shortest path problem we can
find an integral vector p such that

which implies ∂ξ ∈ arg min f[−p] = B(g_p) by (9.64) and (9.65).


To classify out-of-kilter arcs we define

where the dependence on p is implicit in the notation. Note that D_ξ⁺(v) is the set
of arcs leaving v for which (10.58) fails, D_ξ⁻(v) is the set of arcs entering v for which
(10.57) fails, and {D_ξ(v) | v ∈ V} gives a partition of the set of out-of-kilter arcs.
If D_ξ(v) = ∅ for all v ∈ V, condition (i) of (POT) is satisfied and the current
(ξ, p) is optimal. Otherwise, the algorithm picks⁶² any v* ∈ V with nonempty
D_ξ(v*) and tries to meet the conditions (10.57) for a ∈ D_ξ⁻(v*) and (10.58) for
a ∈ D_ξ⁺(v*) by changing the flow to ξ′. The flow ξ′ is determined by solving the
following maximum submodular flow problem on G = (V, A) with a more restrictive
capacity constraint c*(a) ≤ ξ′(a) ≤ c̄*(a), with

⁶² The original primal-dual algorithm picks an out-of-kilter arc and tries to meet (10.57) and
(10.58) for that arc. The present strategy of picking a vertex improves the worst-case complexity
bound.

for each a ∈ A.
Maximum submodular flow problem maxSFP

where ξ′ : A → R is the variable to be optimized.
Since the capacity interval [c*(a), c̄*(a)] is included in the original capacity
interval [c(a), c̄(a)], any feasible flow of maxSFP is feasible for MSFP₂. Note also
that maxSFP has a feasible flow ξ′ = ξ. In maximizing the objective function

it is intended to meet the conditions (10.58) and (10.57) by increasing the flow
on a ∈ D_ξ⁺(v*) to the upper capacity and decreasing the flow on a ∈ D_ξ⁻(v*) to
the lower capacity. The maximum submodular flow problem is a special case of
the feasibility problem for the submodular flow problem, and a number of efficient
algorithms are available for it (see section 10.4.2).
Let ξ′ be an optimal solution to maxSFP above. If D_{ξ′}(v*) is empty, the
algorithm updates ξ to ξ′ without changing p. The condition (10.59) is maintained
because of (10.65). If D_{ξ′}(v*) is nonempty, the algorithm finds a minimum cut
W ⊆ V (explained later) and updates p to

    p′ = p + χ_W

as well as ξ to ξ′. The condition (10.59) is maintained, as is shown in Proposition
10.38 below.
The primal-dual algorithm for the M-convex submodular flow problem with
linear arc cost (MSFP₂) with dual integrality is summarized as follows.

Primal-dual algorithm for MSFP₂ with dual integrality
S0: Find (ξ, p) ∈ R^A × Z^V satisfying (10.54) and (10.59).
S1: If D_ξ(v) = ∅ for all v ∈ V, then stop ((ξ, p) is optimal).
S2: Take any v* ∈ V with D_ξ(v*) ≠ ∅ and solve maxSFP to obtain ξ′.
S3: If D_{ξ′}(v*) ≠ ∅, then find a minimum cut W and set p := p + χ_W.
S4: Set ξ := ξ′ and go to S1.

It remains to explain the minimum cut for maxSFP. For W ⊆ V containing
v*, we define the cut capacity ν(W) by

Figure 10.1. Structure of G and G̃ at v*.

where Δ⁺W and Δ⁻W mean the sets of arcs leaving W and entering W, respectively,
as in (9.14) and (9.15). The following proposition states that the flow value
ξ′(D_ξ⁺(v*)) − ξ′(D_ξ⁻(v*)) is bounded by ν(W) for any W ⊆ V with v* ∈ W, and that
this bound is tight for some W, which is referred to as a minimum cut in the above.
A minimum cut can be found with the aid of an appropriately defined auxiliary
network.

Proposition 10.37. In the maximum submodular flow problem maxSFP, we have

For a maximum flow ξ′ and a minimum cut W, we have

Proof. We prove this by applying Theorem 9.13 (max-flow min-cut theorem) to a
maximum submodular flow problem on a graph G̃ = (Ṽ, Ã), which is obtained from
G = (V, A) by a local modification at v* illustrated in Fig. 10.1. The vertex v* in
G is split into two vertices, v* and v̂*, and a new arc a₀ = (v*, v̂*) is introduced;
Ṽ = V ∪ {v̂*} and Ã = A ∪ {a₀}. The initial vertex of a ∈ D_ξ⁺(v*) is changed to v̂*
and the terminal vertex of a ∈ D_ξ⁻(v*) is changed to v̂*. For W̃ ⊆ Ṽ we denote by
Δ̃⁺W̃ and Δ̃⁻W̃ the sets of arcs leaving and entering W̃, respectively, in G̃.
The problem maxSFP is equivalent to maximizing the flow ξ′(a₀) on a₀ in G̃ =
(Ṽ, Ã), where the conservation of flow at v̂* (i.e., ∂ξ′(v̂*) = 0) is assumed and no

capacity constraint is imposed on a₀. Note that ξ′(a₀) = ξ′(D_ξ⁺(v*)) − ξ′(D_ξ⁻(v*)) as a consequence of the flow conservation at v̄*. As for cuts, we note the correspondence between W̃ ⊆ Ṽ with a₀ ∈ Δ̃⁺W̃ and W ⊆ V with v* ∈ W and observe the identities

for such a W. Then we obtain (10.68) from (9.61) in Theorem 9.13. Note that g(p + χ_W) − g(p) = g_p(W) in ν(W) corresponds to ρ in (9.61). Finally, (10.69), (10.70), and (10.71) are shown in (9.62). □

The condition (10.59) is maintained when (ξ, p) is modified to (ξ′, p′).

Proposition 10.38. ∂ξ′ ∈ B(g_{p′}) for an optimum flow ξ′ in maxSFP and potential p′ = p + χ_W with a minimum cut W.

Proof. It follows from (10.65), (10.69), and discrete midpoint convexity (7.7) that

This shows ∂ξ′ ∈ B(g_{p′}). □

The following proposition shows the key properties for the correctness and
complexity of the primal-dual algorithm.

Proposition 10.39.
(1) The set of out-of-kilter arcs is nonincreasing.
(2) For each arc a, |γ_p(a)| is nonincreasing while the arc stays out of kilter.
(3) The potential is changed in at most |V| iterations and, each time the potential is changed, the value of max_{a∈D_ξ(v*)} |γ_p(a)| decreases at least by one.
(4) Each time (ξ, p) is changed, the value of

decreases at least by one. Therefore, the primal-dual algorithm terminates in at most N₀ iterations, where N₀ denotes the value of N at step S0.

Proof. The reader is referred to the kilter diagram in Fig. 9.1.


(1) We show that an in-kilter arc a with respect to (ξ, p) remains in kilter in updating (ξ, p). It follows from ξ′(a) ∈ [c_*(a), c^*(a)] and (10.62) that a remains in kilter with respect to (ξ′, p). Suppose that p is updated to p′ = p + χ_W. Since

we may assume that a is in kilter with respect to (ξ′, p) and (i) a ∈ Δ⁺W \ D_ξ⁺(v*) or (ii) a ∈ Δ⁻W \ D_ξ⁻(v*). In case (i) we have ξ′(a) = c^*(a) from (10.70), whereas

In case (ii) we have ξ′(a) = c_*(a) from (10.71), whereas

Thus the conditions (10.57) and (10.58) are preserved in either case.
(2) By (10.73), it suffices to show that, if (i) γ_p(a) > 0, a ∈ Δ⁺W, or (ii) γ_p(a) < 0, a ∈ Δ⁻W, then a is in kilter with respect to (ξ′, p′). In case (i), we have a ∈ Δ⁺W \ D_ξ⁺(v*), from which follows ξ′(a) = c^*(a) = c̄(a) by (10.70). Since γ_{p′}(a) > 0, a is in kilter with respect to (ξ′, p′), and similarly for case (ii).
(3) If the potential does not change, we have D_ξ′(v*) = ∅ as well as D_ξ′(v) ⊆ D_ξ(v) for all v ∈ V by (1). Therefore, the potential must be updated within |V| iterations. Suppose now that p is changed to p′. An arc a ∈ D_ξ⁺(v*) \ Δ⁺W is in kilter with respect to (ξ′, p′), since γ_{p′}(a) ≤ γ_p(a) < 0 and ξ′(a) = c_*(a) = c(a) by (10.71). Similarly, a ∈ D_ξ⁻(v*) \ Δ⁻W is in kilter with respect to (ξ′, p′) because of (10.70). For a ∈ D_ξ⁺(v*) ∩ Δ⁺W, we have |γ_{p′}(a)| = |γ_p(a)| − 1 from γ_{p′}(a) = γ_p(a) + 1 and γ_p(a) < 0. Similarly, for a ∈ D_ξ⁻(v*) ∩ Δ⁻W, we have |γ_{p′}(a)| = |γ_p(a)| − 1 from γ_{p′}(a) = γ_p(a) − 1 and γ_p(a) > 0. Therefore, max_{a∈D_ξ(v*)} |γ_p(a)| decreases at least by one.
(4) When the potential changes, N decreases because of (3). When the potential remains invariant, N decreases because of D_ξ′(v*) = ∅. □

Primal-dual algorithms can also be designed for problems without dual integrality by using real-valued potential functions.

Note 10.40. The framework of the primal-dual algorithm for MSFP₁ (submodular flow problem without M-convex cost) was established in Frank [55] and Cunningham-Frank [30]. See Fujishige [65], Fujishige-Iwata [66], and Iwata [98] for expositions.

10.4.5 Conjugate Scaling Algorithm


With the use of conjugate scaling of the M-convex cost function, the primal-dual algorithm is enhanced to a polynomial-time algorithm. We continue to deal with the M-convex submodular flow problem MSFP₂ with dual integrality; i.e., we assume γ : A → Z and f ∈ M[R → R|Z].
First we explain the intuition behind the cost-scaling algorithm for the submodular flow problem MSFP₁ without M-convex function (see section 9.2 for MSFP₁). As Proposition 10.39 (4) shows, the time complexity of the primal-dual algorithm depends essentially on

Motivated by this fact, we consider MSFP₁ with a new objective function

where α is a positive integer representing cost scaling and ⌈ · ⌉ means rounding up to the nearest integer. It is expected (or hoped) that such a scaling will result in smaller values of (10.74) and hence in an improvement in the computation time of the algorithm. On the other hand, the scaled problem with (10.75) is fairly close to the original problem, since α⌈γ(a)/α⌉ ≈ γ(a), and, therefore, the solution to the scaled problem is likely to be a good approximation that can be used as an initial solution in solving the original problem by the primal-dual algorithm. The scaling algorithm embodies the above idea by starting with a large α and successively halving α until α = 1.
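The rounding α⌈γ(a)/α⌉ and the halving schedule can be sketched with toy arc costs (the numbers below are ours, chosen only for illustration):

```python
import math

# Halving schedule of the cost-scaling idea: each arc cost gamma(a) is
# replaced by alpha * ceil(gamma(a)/alpha), and alpha is halved down to 1.
gamma = {'a1': 13, 'a2': -7, 'a3': 40}            # toy arc costs (ours)
K = max(abs(g) for g in gamma.values())
alpha = 2 ** math.ceil(math.log2(K))              # start with a large alpha
while alpha >= 1:
    scaled = {a: alpha * math.ceil(g / alpha) for a, g in gamma.items()}
    # the scaled cost overestimates gamma(a) by less than alpha
    assert all(0 <= scaled[a] - gamma[a] < alpha for a in gamma)
    alpha //= 2
# at alpha = 1 the scaled problem is the original problem
assert {a: math.ceil(g) for a, g in gamma.items()} == gamma
```

The assertion inside the loop is exactly why the scaled problem approximates the original: the rounding error per arc stays below the current α.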
When an M-convex function f is involved, as in MSFP₂, it is natural to try a scaling of the form ⌈f(·)/α⌉. This approach, however, does not seem to work in general, since ⌈f(·)/α⌉ is not necessarily M-convex for an M-convex function f. Conjugate scaling is a kind of scaling operation compatible with M-convexity.
Let f : R^V → R ∪ {+∞} be a polyhedral convex function with dual integrality in the sense that the conjugate function g = f* has integrality (6.75). Then we have

where the supremum is taken over integer points. Replacing g(p) with g_α(p) = g(αp)/α (p ∈ Z^V) in this expression we define

which we call the conjugate scaling of f with scaling factor α ∈ Z₊₊. Note that f^(α) is again a dual-integral polyhedral convex function. It is easy to see that

and that dom_R f^(α) = dom_R f provided f^(α) > −∞. Figure 10.2 illustrates the conjugate scaling of a univariate function f with α = 2.
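Figure 10.2 is not reproduced here, but the construction can be sketched by brute force for a univariate function with bounded effective domain; the quadratic f and the slope range below are our own choices:

```python
# Conjugate scaling of a univariate discrete convex function by brute force:
# g(p) = max_x { p*x - f(x) }, g_alpha(p) = g(alpha*p)/alpha, and f^(alpha)
# is recovered as the conjugate of g_alpha (all over finite integer ranges).

def conjugate(f, xs):
    return lambda p: max(p * x - f(x) for x in xs)

def conjugate_scaling(f, xs, alpha, ps):
    g = conjugate(f, xs)
    return lambda x: max(p * x - g(alpha * p) / alpha for p in ps)

f = lambda x: x * x                  # discrete convex on {-4, ..., 4}
xs = range(-4, 5)
ps = range(-10, 11)                  # enough integer slopes for this domain
f2 = conjugate_scaling(f, xs, 2, ps)
assert all(f2(x) <= f(x) for x in xs)          # f^(alpha) minorizes f
assert [f2(x) for x in range(-2, 3)] == [2.0, 0.5, 0.0, 0.5, 2.0]
```

For α = 2 the conjugate scaling of x² flattens to roughly x²/2 near the origin, a coarser convex minorant of f, consistent with what Figure 10.2 depicts.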

Proposition 10.41. For a dual-integral polyhedral M-convex function f ∈ M[R → R|Z], we have f^(α) ∈ M[R → R|Z] provided f^(α) > −∞.

Proof. We have f* = g ∈ L[Z|R → R] and hence g_α ∈ L[Z → R] by Theorem 7.10 (2). Therefore, f^(α) = (g_α)* ∈ M[R → R|Z] by (8.10). □

We are now in the position to present the conjugate scaling algorithm. Initially the algorithm finds a feasible flow ξ₀ and an integer-valued potential p₀ satisfying (10.59) and applies the conjugate scaling to the objective function rewritten as

Figure 10.2. Conjugate scaling f^(α) and scaling g_α for α = 2.

where γ = γ_{p₀} and f = f[−p₀]. We denote the conjugate scaling of f by f^(α) and put g = f*. Note that f ∈ M[R → R|Z] and

where we regard g as a member of L[Z → R]. Recall the notation n = |V|.

Conjugate scaling algorithm for MSFP₂ with dual integrality

S0: Find a feasible flow ξ₀ ∈ R^A and an integer-valued potential p₀ ∈ Z^V satisfying (10.59) and define γ, f, and g accordingly. Set p* := 0, K := max_{a∈A} |γ(a)|, α := 2^⌈log₂ K⌉.
S1: If α < 1, then stop ((ξ, p₀ + p*) is optimal).
S2: Find an integer vector p ∈ Z^V that minimizes g_α(p) − ⟨p, ∂ξ⟩ subject to 2p* ≤ p ≤ 2p* + (n − 1)1.
S3: Solve MSFP₂ for (γ, f) = (⌈γ/α⌉, f^(α)) by the primal-dual algorithm starting with (ξ, p) to obtain an optimal (ξ*, p*) ∈ R^A × Z^V.
S4: Set ξ := ξ* and α := α/2 and go to S1.
The correctness of the algorithm is ensured by Theorem 7.18 (L-proximity theorem), which implies that the minimizer p found in step S2 under the restriction 2p* ≤ p ≤ 2p* + (n − 1)1 is in fact a global minimizer of g_α(p) − ⟨p, ∂ξ⟩. Hence, the condition ∂ξ ∈ B((g_α)_p) = argmin f^(α)[−p] for (10.59) is maintained.
In step S2, the minimizer p can be found by the L-convex function minimization algorithms of section 10.3, where the number of evaluations of g_α is bounded by a polynomial in n. Given f, an evaluation of g_α amounts to minimizing a polyhedral M-convex function, since g(p) = sup{⟨p, x⟩ − f(x) | x ∈ R^V}. If f has primal

integrality, i.e., if f ∈ M[Z|R → R], then g(p) = sup{⟨p, x⟩ − f(x) | x ∈ Z^V}, which can be computed by the algorithms in section 10.1.
In step S3, the number of iterations (updates of (ξ, p)) within the primal-dual algorithm is bounded by n². Denote by p_α the value of p at the beginning of step S3 and put p_{2α} = p*, γ_α = ⌈γ/α⌉, and γ_{2α} = ⌈γ/(2α)⌉. Then we have

from which follows

This means that at the beginning of step S3 we have

for every out-of-kilter arc a and therefore N in (10.72) is bounded by n².


Obviously, steps S1 through S4 are repeated ⌈log₂ K⌉ times. For the value of K we have

if the initial potential p₀ in step S0 is computed from a shortest path on the graph G_γ, as explained in section 10.4.4, where in the second term on the right-hand side we consider only those x, u, v for which f′(x; −χ_u + χ_v) is finite. If f ∈ M[Z|R → R], the second term can be bounded as

which implies

Bibliographical Notes
The tie-breaking rule (10.2) for M-convex function minimization, as well as Propo-
sition 10.2, is due to Murota [148]. Variants of steepest descent algorithms are re-
ported in Moriguchi-Murota-Shioura [133] with some computational results. Scal-
ing algorithms for M-convex function minimization, including the one described in
section 10.1.2, were considered first in [133], although the proposed algorithms run
in polynomial time only for a subclass of M-convex functions that are closed un-
der scaling. A polynomial-time scaling algorithm for general M-convex functions is
given by Tamura [197]. The domain reduction algorithm in section 10.1.3 is due to
Shioura [190] and its extension to quasi M-convex functions is observed in Murota-
Shioura [154]. The domain reduction scaling algorithm in section 10.1.4, with its
extension to quasi M-convex functions, is due to Shioura [192]. Minimizing an M-
convex function on {0, 1}-vectors is equivalent to maximizing a matroid valuation,
for which a greedy algorithm of Dress-Wenzel [41] works; see also section 5.2.4 of
Murota [146].

The literature of submodular function minimization was described in Note


10.10. The algorithmic framework expounded in section 10.2.1 is due to Cun-
ningham [28], [29] as well as Bixby-Cunningham-Topkis [15]. The algorithm in
section 10.2.2 is by Schrijver [182], whereas an earlier version of this algorithm
based on partial orders associated with extreme bases (presented at the Workshop
on Polyhedral and Semidefinite Programming Methods in Combinatorial Optimiza-
tion, Fields Institute, November 1-6, 1999) is described in Murota [147]. The
algorithm in section 10.2.3 is due to Iwata-Fleischer-Fujishige [102]. Improvements
on those algorithms in terms of time complexity were made by Fleischer-Iwata [50]
and Iwata [100]. See McCormick [127] for a detailed survey on submodular function
minimization. Note 10.12 was communicated by K. Nagano, Note 10.14 is based on
[182], and Proposition 10.28 was communicated by S. Iwata.
Favati-Tardella [49] proposes a weakly polynomial algorithm for submodular
integrally convex function minimization. This is the first polynomial algorithm for
L-convex function minimization, when translated through the equivalence between
submodular integrally convex functions and L♮-convex functions. The steepest descent algorithm for L-convex function minimization in section 10.3.1 is given in
Murota [145]. The tie-breaking rule (10.33), as well as Proposition 10.31, is due
to Murota [148]. The steepest descent scaling algorithm in section 10.3.2 is due
to S. Iwata (presented at Workshop on Matroids, Matching, and Extensions, Uni-
versity of Waterloo, December 6-11, 1999), where step SI is performed not by the
steepest descent algorithm but by the algorithm in section 10.3.3.
The framework of M-convex submodular flow problems is advanced by Murota
[142]. The successive shortest path algorithm for a feasible flow described in sec-
tion 10.4.2 originates in Fujishige [59] and the present form is due to Frank [56].
The cycle-canceling algorithm of section 10.4.3 is devised in [142] as a proof of
the negative-cycle criterion for optimality (Theorem 9.20). In the special case of
valuated matroid intersection, the algorithm can be polished to a strongly polyno-
mial algorithm (Murota [136]); see also Note 10.36. The primal-dual algorithm of
section 10.4.4 is due to Iwata-Shigeno [105]; see also Note 10.40. A strongly polyno-
mial primal-dual algorithm for the valuated matroid intersection problem is given
in [136]. The conjugate scaling algorithm of section 10.4.5 is due to [105]. A scal-
ing algorithm for a subclass of the M-convex submodular flow problem is given by
Moriguchi-Murota [132]. Capacity scaling algorithms for submodular flow problems
(without M-convex costs) are given in Iwata [97] and Fleischer-Iwata-McCormick
[51]. For other algorithms for submodular flow problems (without M-convex costs),
see the book of Fujishige [65] and surveys of Fujishige-Iwata [66] and Iwata [98].
Chapter 11

Application to
Mathematical Economics

This chapter presents an application of discrete convex analysis to a subject in


mathematical economics: competitive equilibria in economies with indivisible (or
discrete) commodities. For economies consisting of continuous commodities, repre-
sented by real-valued vectors, a rigorous mathematical framework was established
around 1960 on the basis of convexity, compactness, and fixed-point theorems. For
indivisible commodities, however, no general mathematical framework seems to have
been established. Such a framework, if any, should embrace both convexity and dis-
creteness; the present theory of discrete convex analysis appears to be a promising
candidate for it. It is shown that, in an Arrow-Debreu type model of competitive
economies with indivisible commodities, an equilibrium exists under the assumption
of the M♮-concavity of consumers' utility functions and the M♮-convexity of producers' cost functions. Moreover, the equilibrium prices form an L♮-convex polyhedron,
and, therefore, they have maximum and minimum elements. The conjugacy between
M-convexity and L-convexity corresponds to the relationship between commodities
and prices.

11.1 Economic Model with Indivisible Commodities


As an application of discrete convex analysis we deal with competitive equilibria in
economies with a number of indivisible commodities and money. Indivisible com-
modities mean commodities (goods) whose quantities are represented by integers,
such as houses, cars, and aircraft, whereas money is a real number representing the
aggregation of the markets of other commodities.
We consider an economy (of Arrow-Debreu type) with a finite set L of pro-
ducers, a finite set H of consumers, a finite set K of indivisible commodities, and
a perfectly divisible commodity called money. Productions of producers and con-
sumptions of consumers are integer-valued vectors in ZK representing the numbers
of indivisible commodities that they produce or consume. Here producers' out-
puts are represented by positive numbers, while negative numbers are interpreted
as inputs to them, and consumers' inputs are represented by positive numbers,


while negative numbers are interpreted as outputs from them. Given a price vector
p = (p(k) : k ∈ K) ∈ R^K of commodities, each producer (assumed to be male) independently schedules a production in order to maximize his profit, each consumer
(assumed to be female) independently schedules a consumption to maximize her
utility under the budget constraint, and all agents exchange commodities by buy-
ing or selling them through money. An important feature of this model is that the
independent agents take the price as granted; i.e., they assume that their individual
behaviors do not affect the price. Such an economy is called a competitive economy.
We assume that a producer l ∈ L is described by his cost function C_l : Z^K → R ∪ {+∞}, whose value is expressed in units of money. He wishes to maximize the profit ⟨p, y⟩ − C_l(y) in determining his production y = y_l ∈ Z^K. This means that y_l is chosen from the supply set

where the function S_l : R^K → 2^{Z^K} is called the supply correspondence. Accordingly, the profit function π_l : R^K → R is defined by

To avoid possible technical complications irrelevant to discreteness issues, we assume that dom C_l is a bounded subset of Z^K for each l ∈ L. This guarantees, for instance, that S_l(p) is nonempty for any p.
Each consumer h ∈ H has an initial endowment of indivisible commodities and money, represented by a vector (x_h⁰, m_h⁰) ∈ Z^K × R₊, where x_h⁰(k) denotes the number of the commodity k ∈ K and m_h⁰ the amount of money in her initial endowment. Consumers share in the profits of the producers. We denote by θ_{lh} the share of the profit of producer l owned by consumer h, where

Thus, consumer h gains an income
where β_h : R^K → R, and accordingly her schedule (x, m) = (x_h, m_h) should belong to her budget set

We assume that a consumer h is associated with a utility function U_h : Z^K × R → R ∪ {−∞} that is quasi-linear in money; namely,

Figure 11.1. Consumer's behavior.

with a function⁶³ U_h : Z^K → R ∪ {−∞}. Consumer h maximizes U_h under the budget constraint; that is, (x, m) = (x_h, m_h) is a solution to the following optimization problem:

(see Fig. 11.1). Under the assumption that dom U_h is bounded⁶⁴ and m_h⁰ is sufficiently large, we can take

to reduce the above problem to an unconstrained optimization problem:

This means that x_h is chosen from the demand set

The function D_h : R^K → 2^{Z^K} is called the demand correspondence.
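In concrete terms, with a bounded effective domain the demand set (11.8) can be computed by exhaustive search; the utility below is an illustrative choice of ours:

```python
# D_h(p) = argmax { U(x) - <p,x> } over the bounded effective domain of U,
# following the quasi-linear reduction above (brute-force sketch).

def demand(U, p):
    def payoff(x):
        return U[x] - sum(pk * xk for pk, xk in zip(p, x))
    best = max(payoff(x) for x in U)
    return {x for x in U if payoff(x) == best}

U = {(0, 0): 0.0, (1, 0): 2.0, (0, 1): 2.0, (1, 1): 3.0}
print(demand(U, (0.5, 0.5)))   # low prices: the full bundle is demanded
print(demand(U, (1.5, 1.5)))   # higher prices: either single commodity
```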


A tuple ((x_h | h ∈ H), (y_l | l ∈ L), p), where x_h ∈ Z^K, y_l ∈ Z^K, and p ∈ R^K, is called an equilibrium or a competitive equilibrium if

⁶³ In economic terminology, U_h is called the reservation value function, although we refer to it as the utility function in this book.
⁶⁴ The boundedness of dom U_h is a natural assumption because no one can consume an infinite number of indivisible commodities. This assumption is also convenient for concentrating on discreteness issues in our discussion.

That is, each agent achieves what he or she wishes to achieve, the balance of supply and demand holds, and an equilibrium price vector is nonnegative. Denoting the total initial endowment of indivisible commodities by

we can rewrite the supply-demand balance (11.11) as

On eliminating x_h and y_l using (11.9) and (11.10), we see that p ∈ R₊^K is an equilibrium price if and only if

where the right-hand side is a Minkowski sum in Z^K. It is noted that money balance

is implied by (11.11) with (11.3), (11.4), (11.7), and π_l(p) = ⟨p, y_l⟩ − C_l(y_l).


We are concerned with mathematical properties of equilibria, rather than their
economic-theoretical significance. A most fundamental question would be as follows:
When does an equilibrium exist? Namely, the first problem we should address is this:
Problem 1: Give a (sufficient) condition for the existence of an equilibrium in terms of utility functions U_h and cost functions C_l.
The conditions (11.9) and (11.10) for an equilibrium are given in terms of demand correspondences D_h and supply correspondences S_l without explicit reference to utility functions U_h and cost functions C_l. This motivates the following:
Problem 2: Give a (sufficient) condition for the existence of an equilibrium in terms of demand correspondences D_h and supply correspondences S_l.
When an equilibrium exists, we may be interested in its structure:
Problem 3: Investigate the structure of the set of equilibria.
A more specific problem in this category is as follows: Do the maximum and mini-
mum exist among equilibrium price vectors?
We shall answer the above problems with the use of concepts and results in
discrete convex analysis. Our answers are the following.

(1) An equilibrium exists if U_h (h ∈ H) are M♮-concave functions and C_l (l ∈ L) are M♮-convex functions (Theorems 11.13 and 11.14).
(2) An equilibrium exists if D_h(p) (h ∈ H) and S_l(p) (l ∈ L) are M♮-convex sets for each p (Theorem 11.15).
(3) The set P* of the equilibrium prices is an L♮-convex polyhedron (Theorem 11.16). This means, in particular, that p ∨ q, p ∧ q ∈ P* for any p, q ∈ P* and that there exist a maximum and a minimum among equilibrium prices.
As a preliminary consideration, the difficulty arising from indivisible commodities is demonstrated in section 11.2 by a simple example. In section 11.3 we discuss the relevance of M♮-concavity as an essential property of utility functions. The results mentioned above are proved in section 11.4. Finally, in section 11.5, we show that an equilibrium can be computed by solving an M-convex submodular flow problem.

Note 11.1. A special case of our economic model with L = ∅, where no producers are involved, is called the exchange economy. A difficulty of indivisible commodities already arises in this case, as we will see in section 11.2. ■

Note 11.2. Commodities that can be represented by real-valued vectors are called divisible commodities. A framework for the rigorous mathematical treatment of equilibria in economies of divisible commodities was established around 1960 using convexity, compactness, and fixed-point theorems as major mathematical tools. See Debreu [37], [38], Nikaido [168], Arrow-Hahn [4], and McKenzie [128].

Note 11.3. A considerable literature already exists on equilibria in economies with indivisible commodities. We name a few: Henry [88], 1970; Shapley-Scarf [187], 1974; Kaneko [107], 1982; Kelso-Crawford [111], 1982; Gale [72], 1984; Quinzii [173], 1984; Svensson [196], 1984; Wako [209], 1984; Kaneko-Yamamoto [108], 19; Van der Laan-Talman-Yang [204], 1997; Bikhchandani-Mamer [13], 1997; Danilov-Koshevoy-Murota [34], 1998 (also [35], 2001); Bevia-Quinzii-Silva [12], 1999; Gul-Stacchetti [84], 1999; and Yang [219], 2000.

11.2 Difficulty with Indivisibility


The difficulty in the mathematical treatment of indivisible commodities is illustrated by a simple example. We consider an exchange economy consisting of two agents (H = {1, 2}, L = ∅) dealing in two indivisible commodities (K = {1, 2}). Putting S = {(0,0), (0,1), (1,0), (1,1)} we define the utility functions U_h for h = 1, 2 in (11.6) by

where dom U₁ = dom U₂ = S (see Fig. 11.2). The demand correspondences D₁ and D₂, calculated according to (11.8), are also given in Fig. 11.2. For instance, for

Figure 11.2. Exchange economy with no equilibrium for x° = (1,1).

p = (p(1), p(2)) with 0 < p(1) < 1 and 0 < p(2) < 1, we have D₁(p) = {(1,1)}; for p = (1,1), we have D₁(p) = {(1,1), (0,1), (1,0)} and D₂(p) = {(1,1)}.
Given a total initial endowment x°, an equilibrium is a tuple (x₁, x₂, p) ∈ Z² × Z² × R₊² such that

For x° = (1,2), for example, the tuple of x₁ = (0,1), x₂ = (1,1), p = (2,1) satisfies the above conditions and hence is an equilibrium.
Another case, x° = (1,1), is problematic. As we have seen in (11.15), a nonnegative vector p is an equilibrium price if and only if x° ∈ D₁(p) + D₂(p). Superposition of the diagrams for D₁(p) and D₂(p) in Fig. 11.2 yields a similar diagram for the Minkowski sum D₁(p) + D₂(p), shown in Fig. 11.3. We see from this diagram that no p satisfies (1,1) ∈ D₁(p) + D₂(p) and hence no equilibrium exists for x° = (1,1). The diagram consists of eight regions, corresponding to the eight points in [0,2]_Z × [0,2]_Z except (1,1). Hence an equilibrium exists for every

Figure 11.3. Minkowski sum D₁(p) + D₂(p).

x° ∈ ([0,2]_Z × [0,2]_Z) \ {(1,1)} and not for x° = (1,1). Let us have a closer look at the problematic case to better understand the discreteness inherent in the problem and to identify the source of the difficulty.
In view of the established mathematical framework for divisible commodities, we consider an embedding of our discrete problem via a concave extension of the utility functions. Denote by Ū₁ and Ū₂ the concave extensions of U₁ and U₂, respectively. Obviously, dom_R Ū₁ = dom_R Ū₂ = S̄, where S̄ = [0,1]_R × [0,1]_R, and

The demand correspondences are defined by

for h = 1, 2 and an equilibrium is a tuple (x₁, x₂, p) ∈ R² × R² × R₊² such that

In our case of x° = (1,1), the tuple of x₁ = x₂ = (1/2, 1/2) and p = (3/2, 3/2) is an equilibrium in this sense, but it is not qualified as an equilibrium in the original problem of indivisible commodities, in which x₁ and x₂ must be integer vectors. Thus, there is an essential discrepancy between the original discrete problem and the derived continuous problem.
We can identify the reason for this discrepancy as the lack of convexity in Minkowski sums discussed in section 3.3. Since D̄_h(p) coincides with the convex hull of D_h(p), and D̄₁(p) + D̄₂(p) coincides with the convex hull of D₁(p) + D₂(p) by Proposition 3.17 (4), the derived continuous problem has an equilibrium if and only if x° ∈

D̄₁(p) + D̄₂(p). On the other hand, the original discrete problem has an equilibrium if and only if x° ∈ D₁(p) + D₂(p), as noted already. For p = (3/2, 3/2), we have D₁(p) = {(0,1), (1,0)}, D₂(p) = {(0,0), (1,1)}, and

D₁(p) + D₂(p) = {(0,1), (1,0), (1,2), (2,1)},

which has a hole at (1,1) (see Example 3.15 and Fig. 3.4). This hole is the very reason for the nonexistence of an equilibrium for indivisible commodities.
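The whole computation can be replayed numerically. The exact utility values live in Figure 11.2, which is not reproduced here; the values below are one choice of ours consistent with the demand sets stated in the text, and the final loop confirms over a grid of candidate prices that (1,1) never lies in the Minkowski sum:

```python
# The hole at (1,1): values chosen so that D1(3/2,3/2) = {(0,1),(1,0)} and
# D2(3/2,3/2) = {(0,0),(1,1)}, as stated in the text.
S = [(0, 0), (0, 1), (1, 0), (1, 1)]
U1 = {(0, 0): 0, (1, 0): 2, (0, 1): 2, (1, 1): 3}   # goods as substitutes
U2 = {(0, 0): 0, (1, 0): 1, (0, 1): 1, (1, 1): 3}   # goods as complements

def demand(U, p):
    val = {x: U[x] - p[0] * x[0] - p[1] * x[1] for x in S}
    m = max(val.values())
    return {x for x in S if val[x] == m}

p = (1.5, 1.5)
D1, D2 = demand(U1, p), demand(U2, p)
assert D1 == {(0, 1), (1, 0)} and D2 == {(0, 0), (1, 1)}
msum = {(a[0] + b[0], a[1] + b[1]) for a in D1 for b in D2}
assert msum == {(0, 1), (1, 0), (1, 2), (2, 1)}      # the hole at (1,1)

# no price on a grid puts (1,1) into D1(p) + D2(p), matching Fig. 11.3
for q in [(a / 4, b / 4) for a in range(13) for b in range(13)]:
    s = {(a[0] + b[0], a[1] + b[1])
         for a in demand(U1, q) for b in demand(U2, q)}
    assert (1, 1) not in s
```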

11.3 M♮-Concave Utility Functions


We demonstrate the relevance of M♮-concavity to utility functions by indicating its relationship with fundamental properties such as submodularity, the gross substitutes property, and the single improvement property, discussed in the literature of mathematical economics.
First, recall from Theorem 6.2 that we can define an M♮-concave function as a function U : Z^K → R ∪ {−∞} with dom U ≠ ∅ satisfying the following exchange property:

where χ_i is the i-th unit vector and a maximum taken over an empty set is defined to be −∞. A more compact expression of this exchange property is

where χ₀ is the zero vector, ΔU(x; j, i) = U(x − χ_i + χ_j) − U(x) as in (6.2), ΔU(x; 0, i) = U(x − χ_i) − U(x), and ΔU(y; i, 0) = U(y + χ_i) − U(y).
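With a small bounded domain, the exchange property above can be verified exhaustively. The sketch below is ours; the two sample utilities are in the spirit of section 11.2 (one substitutes-type, one complements-type):

```python
from itertools import product

# Brute-force check of the M-natural-concavity exchange property: for all
# x, y in dom U and i with x(i) > y(i),
#   U(x) + U(y) <= max_j { U(x - e_i + e_j) + U(y + e_i - e_j) },
# where j ranges over {0} and { j : x(j) < y(j) }, with e_0 = 0.

def move(x, minus, plus):
    z = list(x)
    if minus is not None:
        z[minus] -= 1
    if plus is not None:
        z[plus] += 1
    return z

def is_mnat_concave(U):
    K = len(next(iter(U)))
    def val(x):
        return U.get(tuple(x), float('-inf'))
    for x, y in product(U, repeat=2):
        for i in range(K):
            if x[i] <= y[i]:
                continue
            cand = [val(move(x, i, None)) + val(move(y, None, i))]  # j = 0
            for j in range(K):
                if x[j] < y[j]:
                    cand.append(val(move(x, i, j)) + val(move(y, j, i)))
            if U[x] + U[y] > max(cand):
                return False
    return True

U_sub = {(0, 0): 0, (1, 0): 2, (0, 1): 2, (1, 1): 3}   # M-natural-concave
U_comp = {(0, 0): 0, (1, 0): 1, (0, 1): 1, (1, 1): 3}  # complements: not
print(is_mnat_concave(U_sub), is_mnat_concave(U_comp))  # prints: True False
```

The complements-type utility fails the exchange inequality already at x = (1,1), y = (0,0), which is the same failure that produced the hole in section 11.2.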
All the results established in the previous chapters for M♮-convex functions can obviously be rephrased for M♮-concave functions. In particular, an M♮-concave function U has the following properties (reformulations of Theorems 6.42, 6.19, 6.26, and 6.24, Propositions 6.33 and 6.35, and Theorem 6.30).
• Concave extensibility: The concave closure Ū of U satisfies

• Submodularity:

Utility functions are usually assumed to have decreasing marginal returns, a property that corresponds to submodularity in the discrete case.

• Local characterization of global maximality: For x ∈ dom U,

This says that special ascent directions work for increasing a utility function.
• (−M♮-GS[Z]): If x ∈ argmax U[−p + p₀1], p ≤ q, p₀ ≤ q₀, and argmax U[−q + q₀1] ≠ ∅, there exists y ∈ argmax U[−q + q₀1] such that
(i) y(i) ≥ x(i) for every i ∈ K with p(i) = q(i) and
(ii) y(K) ≤ x(K) if p₀ = q₀.
This is a version of the gross substitutes property. Recall the brief discussion on the gross substitutes property at the beginning of section 6.8.
• (−M♮-SWGS[Z]): For x ∈ argmax U[−p], p ∈ R^K, and i ∈ K, at least one of (i) and (ii) holds true:
(i) x ∈ argmax U[−p − αχ_i] for any α > 0,
(ii) there exist α > 0 and y ∈ argmax U[−p − αχ_i] such that y(i) = x(i) − 1 and y(j) ≥ x(j) for all j ∈ K \ {i}.
This is another version of the gross substitutes property, called the stepwise gross substitutes property.
• M♮-convexity of maximizers: For every p ∈ R^K, argmax U[−p] is an M♮-convex set.
The properties above are essential features of M♮-concave functions and, in fact, the last four characterize M♮-concavity, as follows (reformulations of Theorems 6.24, 6.34, 6.36, and 6.30). Note that submodularity and concave extensibility (even together) do not imply M♮-concavity.

Theorem 11.4. For a function U : Z^K → R ∪ {−∞} with a nonempty effective domain,

Theorem 11.5. For a concave-extensible function U : Z^K → R ∪ {−∞} with a bounded nonempty effective domain,

Theorem 11.6. For a concave-extensible function U : Z^K → R ∪ {−∞} with a nonempty effective domain,

Theorem 11.7. For a function U : Z^K → R ∪ {−∞} with a bounded nonempty effective domain,

U is M♮-concave ⟺ argmax U[−p] is an M♮-convex set for each p ∈ R^K.

Let us consider the special case of set functions to discuss the relationship of the above theorems to the results of Kelso-Crawford [111] and Gul-Stacchetti [84]. The gross substitutes property was defined originally for a set function U : 2^K → R in [111]. It reads as follows:
(GS) If X ∈ argmax U[−p] and p ≤ q, there exists Y ∈ argmax U[−q] such that {i ∈ X | p(i) = q(i)} ⊆ Y,
where

Two novel properties are introduced in [84]; i.e., the single improvement property:

and the no complementarities property:

It is pointed out in [84] that these three properties are equivalent:⁶⁵

for a set function U. It is also observed that (NC) is implied by the strong no complementarities property:

To derive these results from our previous theorems, we identify a set function U : 2^K → R with a function U : Z^K → R ∪ {−∞}, where dom U = {0,1}^K, by

When translated for U, the properties (GS) and (SI) for U may be written respectively as follows:
⁶⁵ To be precise, this equivalence is shown in [84] for a monotone nondecreasing U, where p and q can be restricted to nonnegative vectors.

(−M♮-GS_w[Z]) If x ∈ argmax U[−p] and p ≤ q, there exists y ∈ argmax U[−q] such that y(i) ≥ x(i) for every i ∈ K with p(i) = q(i).

(−M♮-SI_w[Z]) For p ∈ R^K and x ∈ dom U \ argmax U[−p],

Obviously, we have (−M♮-GS[Z]) ⇒ (−M♮-GS_w[Z]) and (−M♮-SI[Z]) ⇒ (−M♮-SI_w[Z]), and it can be shown (see Murota-Tamura [160]) that the converses are also true if dom U = {0,1}^K. It follows from this and Theorems 11.4 and 11.5 that

On the other hand, we see

from the multiple exchange axiom for a matroid (see Theorem 4.3.1 in Kung [117]) and Theorem 11.7. Thus, the three conditions (GS), (SI), and (NC) for U are all equivalent to the M♮-concavity of U. Obviously, the special case of (SNC) with |I| = 1 coincides with the exchange axiom (−M♮-EXC[Z]) for U and, therefore,

Finally, we mention that the submodularity (11.19) of U is equivalent to the submodularity of U:

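The translated property (−M♮-GS_w[Z]) can also be tested numerically over a grid of price pairs. The two sample set functions below are ours (the substitutes-type one is M♮-concave, the complements-type one is not), so (GS) is expected to hold for the first and fail for the second:

```python
from itertools import product

# Grid test of (GS)/( -M-natural-GS_w[Z] ): raising prices must keep some
# optimal bundle containing every demanded item whose price is unchanged.
U_sub = {(0, 0): 0, (1, 0): 2, (0, 1): 2, (1, 1): 3}
U_comp = {(0, 0): 0, (1, 0): 1, (0, 1): 1, (1, 1): 3}

def argmax_demand(U, p):
    val = {x: U[x] - sum(pi * xi for pi, xi in zip(p, x)) for x in U}
    m = max(val.values())
    return [x for x in U if val[x] == m]

def satisfies_GS(U, prices):
    for p, q in product(prices, repeat=2):
        if not all(pi <= qi for pi, qi in zip(p, q)):
            continue
        for x in argmax_demand(U, p):
            if not any(all(y[i] >= x[i] for i in range(2) if p[i] == q[i])
                       for y in argmax_demand(U, q)):
                return False
    return True

grid = [(a / 2, b / 2) for a in range(9) for b in range(9)]
print(satisfies_GS(U_sub, grid), satisfies_GS(U_comp, grid))  # True False
```

For the complements-type function, raising only p(2) (e.g., from (3/2, 1/2) to (3/2, 2)) makes the consumer drop both goods, violating (GS) for good 1 whose price never changed.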
Note 11.8. Utility functions are to be maximized by consumers. As is discussed in Note 10.16, however, maximizing a general submodular set function is computationally intractable. M♮-concave utility functions form a subclass of submodular functions that can be maximized efficiently. ■

Note 11.9. We mention here some examples of M♮-concave utility functions treated explicitly or implicitly in the literature of indivisible commodities. See also the examples of M♮-convex functions shown in section 6.3.
(1) A separable concave function

defined in terms of a family of univariate discrete concave functions (u_i | i ∈ K) (i.e., −u_i ∈ C[Z → R]) is an M♮-concave function (see (6.31)).

(2) A quasi-separable concave function

defined in terms of a family of univariate discrete concave functions (u_i | i ∈ K ∪ {0}) (i.e., −u_i ∈ C[Z → R]) is an M♮-concave function (see (6.32)).
(3) Given a vector (a_i | i ∈ K) ∈ R^K, we denote by U(X) the maximum value of a_i with index i belonging to X ⊆ K. More formally, choosing a* ∈ R ∪ {−∞} with a* ≤ min_{i∈K} a_i, we define a set function U : 2^K → R ∪ {−∞} by

Identifying X with its characteristic vector χ_X and imposing monotonicity, we obtain a function U : Z^K → R ∪ {−∞} given by

This is an M♮-concave function. The function U represents a unit demand preference. ■
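As a sketch of (3), with item names and values of our own choosing: since only the best item held matters, the consumer's demand at any price concentrates on (at most) a single item.

```python
from itertools import product

# Unit demand preference of Note 11.9 (3): U(x) = max of a_i over held items
# (a* for the empty bundle), made monotone. Item values a_i are our choices.
a = {'car': 5.0, 'bike': 2.0, 'boat': 4.0}
a_star = 0.0                       # any a* <= min_i a_i will do
items = list(a)

def U(x):
    held = [a[i] for i in items if x[i] >= 1]
    return max([a_star] + held)

def demand(p):                     # argmax of U(x) - <p,x> over {0,1}^K
    best, arg = float('-inf'), []
    for bits in product((0, 1), repeat=len(items)):
        x = dict(zip(items, bits))
        v = U(x) - sum(p[i] * x[i] for i in items)
        if v > best:
            best, arg = v, [x]
        elif v == best:
            arg.append(x)
    return arg

# the consumer buys exactly the item with the largest surplus a_i - p(i)
p = {'car': 3.0, 'bike': 1.5, 'boat': 1.0}
best_bundles = demand(p)
assert best_bundles == [{'car': 0, 'bike': 0, 'boat': 1}]
```

Here the surpluses are 2.0, 0.5, and 3.0, so the single-item bundle with the boat is the unique demand.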

11.4 Existence of Equilibria


We prove the existence of equilibria by establishing two statements: (i) an equilibrium price can be characterized as a nonnegative subgradient of the aggregate cost function and (ii) under the assumption of M♮-convexity/concavity of individual cost functions and utility functions, the aggregate cost function is an M♮-convex function for which the subdifferential is nonempty.

11.4.1 General Case


The conditions x_h ∈ D_h(p) and y_l ∈ S_l(p) in (11.9) and (11.10) can be rewritten
by (3.30) as

using the notations for subdifferentials defined in (8.19) and (6.86). Hence, p ∈ R_+^K
is an equilibrium price if and only if it satisfies (11.22) for some x_h ∈ dom U_h
(h ∈ H) and y_l ∈ dom C_l (l ∈ L) meeting the supply-demand balance (11.14) for
the given total initial endowment x°. Furthermore, if ((x_h | h ∈ H), (y_l | l ∈ L), p)
is an equilibrium, the set P*(x°) of all equilibrium prices for x° is expressed as

In particular, the right-hand side of this expression is independent of the choice of
an equilibrium allocation ((x_h | h ∈ H), (y_l | l ∈ L)).
We define the aggregate cost function Ψ : Z^K → R ∪ {+∞} by

This is the integer infimal convolution (6.43) of producers' cost functions and the
negatives of consumers' utility functions. Owing to our assumptions that
• dom U_h is bounded for each h ∈ H, and
• dom C_l is bounded for each l ∈ L,
the infimum in (11.24) is attained for each z in

where the right-hand side is a Minkowski sum in Z^K (see (6.44)).
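On bounded domains, as assumed above, an integer infimal convolution can be computed by brute force. The sketch below computes Ψ = f □ g from two functions given as dictionaries; its domain comes out as the Minkowski sum of the two domains, matching the remark above. The one-commodity toy data are our own, and the signs and multi-agent aggregation of (11.24) are omitted for brevity.

```python
def inf_convolution(f, g):
    """Integer infimal convolution (f [] g)(z) = min { f(x) + g(y) : x + y = z }
    for functions given as dicts over bounded integer domains
    (points outside the dict are +infinity)."""
    psi = {}
    for x, fx in f.items():
        for y, gy in g.items():
            z = tuple(xi + yi for xi, yi in zip(x, y))
            if z not in psi or fx + gy < psi[z]:
                psi[z] = fx + gy
    return psi

# toy single-commodity data: a producer's convex cost C and a consumer's
# concave utility U on {0, 1, 2}
C = {(0,): 0, (1,): 1, (2,): 3}
U = {(0,): 0, (1,): 2, (2,): 3}
psi = inf_convolution(C, {x: -u for x, u in U.items()})
print(sorted(psi))  # dom(psi) = dom C + dom(-U): [(0,), (1,), (2,), (3,), (4,)]
```

The double loop costs the product of the two domain sizes; this is exactly the step that the M-convex submodular flow formulation of section 11.5 performs efficiently.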

Proposition 11.10. For a total initial endowment x°, the following statements
hold true under the boundedness assumption on dom U_h for h ∈ H and dom C_l for
l ∈ L.
(1) There exists an equilibrium if and only if x° ∈ dom Ψ and (−∂_R Ψ(x°)) ∩ R_+^K ≠ ∅.
(2) P*(x°) = (−∂_R Ψ(x°)) ∩ R_+^K.
(3) If ((x_h | h ∈ H), (y_l | l ∈ L)) satisfies (11.9), (11.10), and (11.14) for some
p (not necessarily nonnegative), then

and ((x_h | h ∈ H), (y_l | l ∈ L), p′) is an equilibrium for any p′ ∈ (−∂_R Ψ(x°)) ∩ R_+^K.

Proof. Definition (11.24) can be rewritten as

which shows

Therefore, we have

Figure 11.4. Aggregate cost function Ψ and its convex closure Ψ̄ for an
exchange economy with no equilibrium.

If ((x_h | h ∈ H), (y_l | l ∈ L), p) is an equilibrium, we have x_h ∈ arg max U_h[−p]
(h ∈ H), y_l ∈ arg min C_l[−p] (l ∈ L), and (11.14), from which follows

Since p ∈ R_+^K, we have p ∈ (−∂_R Ψ(x°)) ∩ R_+^K.


Conversely, for p ∈ (−∂_R Ψ(x°)) ∩ R_+^K, we have Ψ[p](x°) = inf Ψ[p]. Let
(x_h | h ∈ H) and (y_l | l ∈ L) be vectors that attain the minimum on the right-hand
side of (11.28) for z = x°. Then we have (11.30) and hence x_h ∈ arg max U_h[−p]
(h ∈ H) and y_l ∈ arg min C_l[−p] (l ∈ L) by (11.29). This means that ((x_h | h ∈
H), (y_l | l ∈ L), p) is an equilibrium. The expression (11.26) follows from the above
argument. □

Proposition 11.10 suggests that if a difficulty in the discreteness of indivisible
commodities exists, it should be reflected in the emptiness of the subdifferential
∂_R Ψ(x°).

Example 11.11. Let us investigate the previous example (section 11.2) of an
exchange economy with no equilibrium. The aggregate cost function Ψ takes the
values shown in Fig. 11.4 (left), where dom Ψ = [0,2]_Z × [0,2]_Z. The convex closure
Ψ̄ of Ψ is given on the right of Fig. 11.4. Since Ψ(1,1) ≠ Ψ̄(1,1), ∂_R Ψ(1,1) is empty
and, by Proposition 11.10, no equilibrium exists for x° = (1,1). For x° = (1,2), on
the other hand, p = (2,1) is an equilibrium price since (2,1) ∈ (−∂_R Ψ(x°)) ∩ R_+^K. •


11.4.2 M♮-Convex Case


The M♮-concavity of utility functions and the M♮-convexity of cost functions are
key to the existence of equilibria, as well as to a desirable structure of equilibrium
prices.
In Proposition 11.10 we saw that the existence of an equilibrium is almost
equivalent to the existence of a subgradient of Ψ. The latter is guaranteed under
M♮-concavity/convexity as follows.

Proposition 11.12. Let U_h (h ∈ H) be M♮-concave functions with dom U_h bounded
and C_l (l ∈ L) be M♮-convex functions with dom C_l bounded. Then Ψ is an M♮-
convex function and ∂_R Ψ(x°) ≠ ∅ for any x° ∈ dom Ψ.

Proof. By the assumption, Ψ is an infimal convolution of M♮-convex functions,
which is M♮-convex by Theorem 6.15. The second claim follows from Theorem
6.61 (2). □

We shall present three theorems on the existence of equilibria. The first is for
an exchange economy (with L = ∅).

Theorem 11.13. Consider an exchange economy with agents indexed by H and
suppose that U_h (h ∈ H) are nondecreasing M♮-concave functions with dom U_h
bounded. Then there exists an equilibrium ((x_h | h ∈ H), p) for every x° ∈
Σ_{h∈H} dom U_h.

Proof. We use Proposition 11.10. Since ∂_R Ψ(x°) ≠ ∅ by Proposition 11.12, it
suffices to show the nonnegativity of (any) p ∈ −∂_R Ψ(x°). The function U(z) =
−Ψ(z) is nondecreasing and

Putting z = x° + αχ_i and letting α → +∞, we see that p(i) ≥ 0. □

We next consider the existence of equilibria in the general case with consumers
and producers. To separate discreteness issues from topological ones, we consider
an embedding of our discrete model in a continuous space, just as we have done for
the simple exchange economy in section 11.2.
Utility functions U_h, being M♮-concave, are extensible to concave functions
Ū_h : R^K → R ∪ {−∞}. Similarly, cost functions C_l, being M♮-convex, are extensible
to convex functions C̄_l : R^K → R ∪ {+∞}. Then demand correspondences D̄_h for
h ∈ H and supply correspondences S̄_l for l ∈ L are defined, respectively, as

Note that D̄_h(p) and S̄_l(p) coincide respectively with the convex hulls of D_h(p)
and S_l(p). We define an equilibrium in the continuous model as a

tuple ((x_h | h ∈ H), (y_l | l ∈ L), p) of x_h ∈ R^K, y_l ∈ R^K, and p ∈ R_+^K satisfying

as well as the supply-demand balance (11.14) and the price nonnegativity (11.12).
The following theorems state that the existence of an equilibrium in the con-
tinuous model implies that of an equilibrium for indivisible commodities under
our assumption of M♮-concavity/convexity. We emphasize here that this is by no
means a general phenomenon, as we have seen in section 11.2, but is peculiar to M♮-
concavity/convexity. For topological issues, the reader is referred to the literature
in economics indicated in Note 11.2.

Theorem 11.14. Suppose that utility functions U_h (h ∈ H) are M♮-concave and
cost functions C_l (l ∈ L) are M♮-convex. If the derived continuous economy has
an equilibrium satisfying (11.33), (11.34), (11.14), and (11.12) for a total initial
endowment x° ∈ Z_+^K, there exists an equilibrium of indivisible commodities for x°
satisfying (11.9), (11.10), (11.14), and (11.12).

Proof. By Theorem 11.7, D_h(p) and S_l(p) are M♮-convex sets if they are not empty.
Hence, the claim follows from Theorem 11.15 below. □

Theorem 11.15. Suppose that, for each p ∈ R_+^K, demand sets D_h(p) (h ∈ H) and
supply sets S_l(p) (l ∈ L) are M♮-convex if they are not empty. If the derived contin-
uous economy has an equilibrium satisfying (11.33), (11.34), (11.14), and (11.12)
for a total initial endowment x° ∈ Z_+^K, there exists an equilibrium of indivisible
commodities for x° satisfying (11.9), (11.10), (11.14), and (11.12).

Proof. For an equilibrium price p ∈ R_+^K in the continuous model, we have

On noting that D̄_h(p) and S̄_l(p) are the convex hulls of D_h(p) and S_l(p), as well
as Theorem 4.12 and Proposition 3.17 (4), we see

By Theorem 4.23 (3), on the other hand, Σ_{h∈H} D_h(p) − Σ_{l∈L} S_l(p) is an M♮-convex
set, which is hole free by Theorem 4.12. Hence follows

which shows (11.15). □



Finally we consider the structure of equilibrium prices.

Theorem 11.16. Suppose that utility functions U_h (h ∈ H) are M♮-concave and
cost functions C_l (l ∈ L) are M♮-convex and that there exists an equilibrium for
a total initial endowment x°. Then the set P*(x°) of all the equilibrium price
vectors is an L♮-convex polyhedron. This means, in particular, that p, q ∈ P*(x°)
⇒ p ∨ q, p ∧ q ∈ P*(x°), which implies the existence of the smallest and the largest
equilibrium price vectors.

Proof. The subdifferentials ∂_R U_h(x_h) (h ∈ H) and ∂_R C_l(y_l) (l ∈ L) in (11.23)
are L♮-convex polyhedra by Theorem 6.61 (2). Since R_+^K is L♮-convex and the
intersection of L♮-convex polyhedra is L♮-convex, P*(x°) is an L♮-convex poly-
hedron. □

As we have seen in the above, we have the correspondence

Note 11.17. Inspection of the proof of Theorem 11.15 shows that it relies solely
on the following two properties of M♮-convex sets: (i) M♮-convex sets are hole free
and (ii) Minkowski sums of M♮-convex sets are M♮-convex sets. This suggests a
generalization of Theorem 11.15 for a class F of sets of integer points such that (i)
S ∈ F, x ∈ Z^K ⇒ S = S̄ ∩ Z^K, x − S ∈ F (with S̄ the convex hull of S), and (ii)
S_1, S_2 ∈ F ⇒ S_1 + S_2 ∈ F (cf. Proposition 3.16). Namely, if D_h(p) ∈ F (h ∈ H)
and S_l(p) ∈ F (l ∈ L) for each p ∈ R_+^K, where we assume ∅ ∈ F, and if the derived
continuous economy has an equilibrium, then there exists an equilibrium for
indivisible commodities.

Note 11.18. In an exchange economy with two agents, an equilibrium exists under
the L♮-concavity of utility functions or the L♮-convexity of demand sets.
(1) If the utility functions U_1 and U_2 are nondecreasing L♮-concave functions,
an equilibrium exists for any x° ∈ dom U_1 + dom U_2 (cf. Theorem 11.13).
(2) Suppose that the utility functions U_1 and U_2 are L♮-concave. If the derived
continuous economy has an equilibrium for x° ∈ Z_+^K, there exists an equilibrium
for indivisible commodities (cf. Theorem 11.14).
(3) Suppose that, for each p ∈ R_+^K, the demand sets D_1(p) and D_2(p) are L♮-
convex if they are not empty. If the derived continuous economy has an equilibrium
for x° ∈ Z_+^K, there exists an equilibrium for indivisible commodities (cf. Theorem
11.15).
Proof: (1) In the proof of Theorem 11.13, Ψ is an L♮_2-convex function and,
therefore, ∂_R Ψ(x°) ≠ ∅ by Theorem 8.45.
(2) In the proof of Theorem 11.14, D_1(p) and D_2(p) are L♮-convex or empty.
Hence the claim follows from (3) by Proposition 7.16.
(3) This follows from the proof of Theorem 11.15 by virtue of the convexity
in Minkowski sum for L♮-convex sets (Theorem 5.8). •

11.5 Computation of Equilibria


In this section we show an algorithmic procedure to compute an equilibrium in the
case where C_l are M♮-convex and U_h are M♮-concave. We use the framework of the
M-convex submodular flow problem MSFP2, introduced in section 9.2. The solution
to this problem yields consumptions and productions satisfying (11.9), (11.10), and
(11.14), as well as a price vector, which, however, may not be nonnegative. This
constitutes the first phase of our algorithm. The second phase finds an equilibrium
price vector by solving a shortest path problem. We can modify the second phase
so that we can find the smallest or the largest equilibrium price vector.
For an M♮-convex cost function C_l : Z^K → R ∪ {+∞}, we define the corre-
sponding M-convex function C̃_l : Z^{{0}∪K} → R ∪ {+∞} by

compatibly with (6.4). We also define the M-concave function Ũ_h : Z^{{0}∪K} →
R ∪ {−∞} associated with U_h : Z^K → R ∪ {−∞} by

In accordance with (11.24), we consider the aggregate cost function

Obviously, we have

for the aggregate cost function Ψ defined in (11.24). For the total initial endowment
x° ∈ Z^K, we put x̃° = (−x°(K), x°) ∈ Z^{{0}∪K}.
The function Ψ̃, being an integer infimal convolution of M-convex functions,
is also M-convex and can be evaluated by solving an MSFP2 (see Note 9.30). A
concrete description is given below.
The instance of MSFP2 for the evaluation of Ψ̃(x̃°) is defined on a directed
bipartite graph G = (V+, V−; A) with vertex partition (V+, V−) and arc set A
given by

Figure 11.5. Graph for computing a competitive equilibrium.

Note that V+, V_l−, and V_h− are copies of {0} ∪ K. Figure 11.5 illustrates the graph G
for H = {α, β}, L = {A, B}, and K = {1, 2}. For each arc a ∈ A, we put c̲(a) = −∞,
c̄(a) = +∞, and γ(a) = 0. Using the indicator function δ_{x̃°} : Z^{V+} → {0, +∞} of
{x̃°}, we define a function f : Z^{V+ ∪ V−} → R ∪ {+∞} by

where w ∈ Z^{V+}, y_l ∈ Z^{V_l−} for l ∈ L, and x_h ∈ Z^{V_h−} for h ∈ H. The function f is
M-convex, since C̃_l (l ∈ L) are M-convex and Ũ_h (h ∈ H) are M-concave. This is
the instance of the MSFP2 that we use for the computation of Ψ̃(x̃°).
The optimal flow and the optimal potential for the MSFP2 above give the
allocation and the price vector in the equilibrium, as is stated later in Theorem
11.20. The following proposition is a lemma for it.

Proposition 11.19. Let ξ ∈ Z^A be an optimal flow and p ∈ R^{V+ ∪ V−} be an optimal
potential for the above instance of MSFP2. Define^66

and regard x*_h, y*_l, w*, and p as vectors on {0} ∪ K.
(1) w* = x̃°, x*_h(0) = −x*_h(K) for h ∈ H, and y*_l(0) = −y*_l(K) for l ∈ L.
(2) x̃° + Σ_{l∈L} y*_l = Σ_{h∈H} x*_h.
(3) p(k) = p(k_h) = p(k_l) for k ∈ {0} ∪ K, h ∈ H, and l ∈ L (the potential
takes a common value on all copies of k).
(4) x*_h ∈ arg max Ũ_h[−p] for h ∈ H and y*_l ∈ arg min C̃_l[−p] for l ∈ L.

Proof. (1) is obvious from the definitions of C̃_l, Ũ_h, and δ_{x̃°}. By the structure
of G, for each k ∈ {0} ∪ K, we have

^66 The notation −∂ξ|_{V_h−} designates the restriction of −∂ξ to V_h−. Hence x*_h = −∂ξ|_{V_h−} means
x*_h(k) = −∂ξ(k) for k ∈ V_h−.

This means (2), since w* = x̃° by (1). Since γ(a) = 0, c̲(a) = −∞, and c̄(a) = +∞
for any a ∈ A, condition (i) of (POT) of Theorem 9.16 implies p(∂+a) − p(∂−a) = 0
for all a ∈ A. This implies (3). It follows from (3) that

We also have

Then (4) follows from (ii) of (POT) in Theorem 9.16, (11.35), and (11.37). □

The following theorem, an immediate consequence of Proposition 11.19, shows


that consumptions and productions satisfying (11.9), (11.10), and (11.14) can be
obtained from a solution to the above instance of MSFP2 and that, if the optimal
potential is nonnegative, then it serves as an equilibrium price vector.

Theorem 11.20. Let ξ ∈ Z^A be an optimal flow and p ∈ R^{V+ ∪ V−} be an optimal
potential with p(0+) = 0 for the above instance of MSFP2. Define

and regard x*_h, y*_l, and p as vectors on K. Then we have

Therefore, ((x*_h | h ∈ H), (y*_l | l ∈ L), p) is an equilibrium if p ≥ 0. Moreover, if
there exists an equilibrium at all, then ((x*_h | h ∈ H), (y*_l | l ∈ L)) gives the consump-
tions and productions of some equilibrium.

Any algorithm for the MSFP2 will find a tuple ((x*_h | h ∈ H), (y*_l | l ∈ L), p),
which gives an equilibrium for the total initial endowment x° if the optimal potential
p happens to be nonnegative. We go on to consider the set of all nonnegative optimal
potentials or, equivalently, the set of all equilibrium price vectors. We already know
from Theorem 11.16 that the set is an L♮-convex polyhedron, and our objective here
is to derive a concrete description in terms of a linear inequality system. With this
knowledge, the existence of an equilibrium price vector can be checked by solving

a linear programming problem, which can be reduced to the dual of a single-source
shortest path problem.
Recall from (11.23) that the set P*(x°) of all equilibrium price vectors for x°
is expressed as

in which we have

by (3.30). By Theorem 6.26 (2) (M-optimality criterion), we have y*_l ∈ arg min C_l[−p]
if and only if

and x*_h ∈ arg max U_h[−p] if and only if

By defining

we obtain a concrete representation of P*(x°) as follows. Note that l(j) < +∞,
u(j) > −∞, and u(i,j) > −∞ for any i, j ∈ K (i ≠ j).

Theorem 11.21. The set P*(x°) of all equilibrium price vectors is an L♮-convex
polyhedron described as

with l(j), u(j), and u(i,j) defined in (11.40), (11.41), and (11.42).

By Theorem 11.21, the nonemptiness of P*(x°) can be checked by linear
programming. In particular, the largest equilibrium price vector, if any, can be

found by solving a linear programming problem:

Similarly, the smallest equilibrium price vector can be found by solving another
linear programming problem:

Both (11.44) and (11.45) can be easily reduced to the dual of a single-source shortest
path problem.

Theorem 11.22. There exists an equilibrium price vector if and only if the problem
(11.44) is feasible. The smallest and the largest equilibrium price vectors, if any,
can be found by solving the shortest path problem.
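The reduction behind Theorem 11.22 can be sketched concretely: the constraints of (11.43) are of the difference type l(j) ≤ p(j) ≤ u(j) and p(j) − p(i) ≤ u(i,j), so the componentwise largest feasible p is a vector of shortest-path distances from a virtual source, computable by the Bellman-Ford method. The two-commodity data below are illustrative, not from the text.

```python
def largest_price(K, l, u, w):
    """Componentwise largest p with l[j] <= p[j] <= u[j] and
    p[j] - p[i] <= w[(i, j)].  Shortest paths from a virtual source
    (Bellman-Ford): source->j arcs of weight u[j], arcs i->j of weight
    w[(i, j)].  Returns None if the system is infeasible."""
    p = dict(u)                       # distances after relaxing the source arcs
    for _ in range(len(K)):
        changed = False
        for (i, j), wij in w.items():
            if p[i] + wij < p[j]:
                p[j] = p[i] + wij
                changed = True
        if not changed:
            break
    if any(p[i] + wij < p[j] for (i, j), wij in w.items()):
        return None                   # negative cycle: no finite largest solution
    if any(p[j] < l[j] for j in K):
        return None                   # incompatible with the lower bounds l
    return p

# illustrative data for two commodities: p2 - p1 <= 1, p1 - p2 <= 2
K = [1, 2]
l = {1: 0, 2: 0}
u = {1: 3, 2: 5}
w = {(1, 2): 1, (2, 1): 2}
print(largest_price(K, l, u, w))  # {1: 3, 2: 4}
```

Running the same procedure on the negated constraint system yields the smallest solution, which is how (11.45) is handled in step S2 of the algorithm below.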

Thus, the existence of a competitive equilibrium in our economic model with
M♮-convex cost functions of producers and M♮-concave utility functions of consumers
can be checked in polynomial time by the following algorithm.
Algorithm for computing an equilibrium
S0: Construct the instance of the MSFP2.
S1: Solve the MSFP2 to obtain ((x*_h | h ∈ H), (y*_l | l ∈ L), p).
(If the MSFP2 is infeasible, no equilibrium exists.)
S2: Solve the problem (11.44) to obtain an equilibrium ((x*_h | h ∈ H),
(y*_l | l ∈ L), p*) with largest p*.
(If (11.44) is infeasible, no equilibrium exists.)
Whereas the above algorithm yields the largest equilibrium price vector, the smallest
price vector can be computed by solving (11.45) instead of (11.44) in step S2.

Bibliographical Notes
The unified framework for indivisible commodities by means of discrete convex
analysis is proposed in Danilov-Koshevoy-Murota [34], [35], to which Theorems
11.14 and 11.15 as well as Notes 11.9 and 11.17 are ascribed. Theorem 11.16 for
the structure of equilibrium prices and Note 11.18 are by Murota [147].
The gross substitutes property was introduced by Kelso-Crawford [111] and
investigated thoroughly by Gul-Stacchetti [84], in which the equivalence of (GS),
(SI), and (NC) is proved. The connection of these conditions to M♮-concavity was
pointed out by Fujishige-Yang [69] for set functions, with subsequent generaliza-
tions by Danilov-Koshevoy-Lang [33] (Theorem 11.6) and Murota-Tamura [160]
(Theorems 11.4 and 11.5). See Roth-Sotomayor [180] for more on (GS).

The computation of an equilibrium via an M-convex submodular flow problem
described in section 11.5 is due to Murota-Tamura [161].
M-convexity is also amenable to the stable marriage problem (stable matching
problem) of Gale-Shapley [74], which is one of the most applicable models in eco-
nomics and game theory. Eguchi-Fujishige [47] formulates a generalization of the
stable marriage problem in terms of M♮-convex functions and presents an extension
of the Gale-Shapley algorithm.
Submodularity plays important roles in economics and game theory. We
mention here the paper of Shapley [186] as an early contribution and Bilbao [14],
Danilov-Koshevoy [31], Milgrom-Shannon [129], and Topkis [203] as recent litera-
ture.
Chapter 12

Application to
Systems Analysis by
Mixed Matrices

This chapter presents an application of discrete convex analysis to systems analysis
by mixed matrices. Motivated by a physical observation to distinguish two kinds
of numbers appearing in descriptions of physical/engineering systems, the concepts
of mixed matrices and mixed polynomial matrices are introduced as mathematical
tools for dealing with two kinds of numbers in systems analysis. Discrete convex
functions arise naturally in this context and the discrete duality theorems are vital
for the analysis of the rank of mixed matrices and the degree of determinants of
mixed polynomial matrices.

12.1 Two Kinds of Numbers


A physical/engineering system can be characterized by a set of relations among var-
ious kinds of numbers representing physical quantities, parameter values, incidence
relations, etc., where it is important to recognize the difference in the nature of
the quantities involved in the problem and to establish a mathematical model that
reflects the difference.
A primitive, yet fruitful, way of classifying numbers is to distinguish nonvan-
ishing elements from zeros. This dichotomy often leads to graph-theoretic methods
for systems analysis, where the existence of nonvanishing numbers is represented by
a set of arcs in a certain graph.
Closer inspection reveals, however, that two different kinds can be distin-
guished among the nonvanishing numbers; some of the nonvanishing numbers are
accurate in value and others are inaccurate in value but independent of one another.
We may alternatively refer to the numbers of the first kind as fixed constants and
to those of the second kind as system parameters.
Accurate numbers (fixed constants): Numbers accounting for various sorts of
conservation laws, such as Kirchhoff's laws, which, stemming from the topo-
logical incidence relation, are precise in value (often ±1).
Inaccurate numbers (system parameters): Numbers representing independent
physical parameters, such as resistances in electrical networks and masses in
mechanical systems, which, being contaminated with noise and other errors,
take values independent of one another.

Figure 12.1. Electrical network with mutual couplings.

It is emphasized that the distinction between accurate and inaccurate numbers
is not a matter of mathematics but of mathematical modeling, i.e., the way in
which we recognize the problem. This means in particular that it is impossible in
principle to give a mathematical definition to the distinction between the two kinds
of numbers.
The objective of this section is to explain, by means of typical examples, what
is meant by accurate and inaccurate numbers and how numbers of different nature
arise in mathematical descriptions of physical/engineering systems. We consider
three examples from different disciplines: an electrical network, a chemical process,
and a mechanical system.

Example 12.1. Consider the electrical network in Fig. 12.1, which consists of
five elements: two resistors of resistances r_i (branch i) (i = 1, 2), a voltage source
(branch 3) controlled by the voltage across branch 1, a current source (branch 4)
controlled by the current in branch 2, and an independent voltage source of voltage
e (branch 5). The element characteristics are represented as

where ξ_i and η_i are the current in and the voltage across branch i (i = 1, ..., 5)
in the directions indicated in Fig. 12.1. We then obtain the following system of
equations:

The upper five equations are the structural equations (Kirchhoff's laws), while the
remaining five are the constitutive equations.
The nonzero coefficients, ±1, appearing in the structural equations represent
the incidence relation in the underlying graph and are certainly accurate in value.
The entries of −1 contained in the constitutive equations are also accurate by defi-
nition. In contrast, the values of the physical parameters r_1, r_2, α, and β are likely
to be inaccurate, being only approximately equal to their nominal values on account
of various kinds of noises and errors.
The unique solvability of this network amounts to the nonsingularity of the
coefficient matrix of (12.1). A direct calculation shows that the determinant of this
matrix is equal to r_2 + (1 − α)(1 + β)r_1, which is distinct from zero with high
probability by the independence of the physical parameters {r_1, r_2, α, β}. Thus,
the electrical network of this example is solvable in general or, more precisely,
solvable generically with respect to the parameter set {r_1, r_2, α, β}. The solvability
of this system will be treated in Example 12.11 by a systematic combinatorial
method (without direct computation of the determinant). •

The second example concerns a chemical process simulation.

Example 12.2. Consider a hypothetical system (Fig. 12.2) for the production
of ethylene dichloride (C2H4Cl2), which is slightly modified from an example used
in the Users' Manual of Generalized Interrelated Flow Simulation of The Service
Bureau Company.
Feeds to the system are 100 mol/h of pure chlorine (Cl2) (stream 1) and 100
mol/h of pure ethylene (C2H4) (stream 2). In the reactor, 90% of the input ethylene
is converted into ethylene dichloride according to the reaction formula

At the purification stage, the product ethylene dichloride is recovered and the unre-
acted chlorine and ethylene are separated for recycling. The degree of purification is
described in terms of the component recovery ratios a_1, a_2, and a_3 of chlorine, ethy-
lene, and ethylene dichloride, respectively, which indicate the ratios of the amounts
recovered in stream 6 of the respective components over those in stream 5. We
consider the following problem:

Figure 12.2. Hypothetical ethylene dichloride production system.

Given the component recovery ratios a_1 and a_2 of chlorine and ethylene,
determine the recovery ratio x = a_3 of ethylene dichloride with which a
specified production rate y mol/h of ethylene dichloride is realized.
Let u_{i1}, u_{i2}, and u_{i3} mol/h be the component flow rates of chlorine, ethylene,
and ethylene dichloride in stream i, respectively. The system of equations to be
solved may be put in the following form, where u is an auxiliary variable in the
reactor and r (= 0.90) is the conversion ratio of ethylene:

This is a system of linear/nonlinear equations in the unknown variables x, u, and u_{ij},
where the equation u_{63} = x · u_{53} in the purification is the only nonlinear equation.
We may regard a_j (j = 1, 2) and r (= 0.90) as inaccurate and independent numbers.
The stoichiometric coefficients in the reaction formula (12.2) are accurate numbers.
The Jacobian matrix of (12.3), shown in Fig. 12.3, contains five inaccurate numbers,
a_1, a_2, r, x, and u_{53}. The solvability of this problem will be treated in Example
12.12. •

Figure 12.3. Jacobian matrix in the chemical process simulation.

Example 12.3. Consider the mechanical system in Fig. 12.4 consisting of two
masses m_1 and m_2, two springs k_1 and k_2, and a damper f; u is the force exerted
from outside. This system may be described by vectors x = (x_1, x_2, x_3, x_4, x_5, x_6)
and u = (u), where x_1 and x_2 are the vertical displacements (downward) of masses
m_1 and m_2, x_3 and x_4 are their velocities, x_5 is the force by the damper f, and x_6
is the relative velocity of the two masses. The governing equation is

We may regard {m_1, m_2, k_1, k_2, f} as independent system parameters and other
nonvanishing entries (i.e., ±1) of F, A, and B as fixed constants. The Laplace
transform^67 of the equation (12.4) gives a frequency domain description

where x(0) = 0 and u(0) = 0 are assumed. The coefficient matrix [A − sF B] is
a polynomial matrix in s with coefficients depending on the system parameters.
^67 See Chen [23] or Zadeh-Desoer [220] for the Laplace transform.

Figure 12.4. Mechanical system.

We have employed a six-dimensional vector x in our description of the system.


It is possible, however, to describe this system using a four-dimensional state vector.
The minimum dimension of the state vector is known to be equal to the degree in
s of the determinant of

Thus the number deg det[A − sF] is an important characteristic, sometimes called
the dynamical degree, of the system (12.4). •
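For small systems, deg det[A − sF] can be computed directly by expanding the determinant with polynomial arithmetic. The sketch below represents matrix entries as coefficient lists in s and uses the Leibniz expansion; the 2 × 2 matrix encodes a hypothetical A − sF with all parameters set to 1, not the actual six-dimensional system of (12.4).

```python
from itertools import permutations

def poly_mul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def perm_sign(perm):
    s, seen = 1, set()
    for i in range(len(perm)):
        j, c = i, 0
        while j not in seen:       # trace the cycle containing i
            seen.add(j)
            j = perm[j]
            c += 1
        if c:
            s *= (-1) ** (c - 1)   # a cycle of length c contributes (-1)^(c-1)
    return s

def deg_det(M):
    """deg_s det M for a matrix whose entries are coefficient lists
    [c0, c1, ...] = c0 + c1*s + ... (Leibniz expansion; fine for small n).
    Returns -1 for the zero determinant."""
    det = [0]
    for perm in permutations(range(len(M))):
        term = [perm_sign(perm)]
        for i, row in enumerate(M):
            term = poly_mul(term, row[perm[i]])
        det = poly_add(det, term)
    while det and det[-1] == 0:
        det.pop()
    return len(det) - 1

# hypothetical A - sF with A = [[0, 1], [-1, 0]] and F the identity:
M = [[[0, -1], [1]],
     [[-1], [0, -1]]]
print(deg_det(M))  # 2, i.e. det = 1 + s^2
```

For mixed polynomial matrices, the point of the chapter is that this degree can instead be obtained combinatorially, via valuated matroids, without symbolic expansion.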

As illustrated by the examples above, accurate numbers often appear in equa-
tions for conservation laws such as Kirchhoff's laws; the law of conservation of
mass, energy, or momentum; and the principle of action and reaction, where the
nonvanishing coefficients are either 1 or −1, representing the underlying topological
incidence relations. Another typical example is integer coefficients (stoichiometric
coefficients) in chemical reactions such as

where nonunit integers such as 2 appear. In dealing with dynamical systems, we
encounter another example of accurate numbers that represent the defining rela-
tions, such as those between velocity v and position x and between current ξ and
charge Q:

Figure 12.5. Accurate numbers.

Typical accurate numbers are illustrated in Fig. 12.5.
The rather intuitive concept of two kinds of numbers will be given a mathe-
matical formalism in the next section.

12.2 Mixed Matrices and Mixed Polynomial Matrices


The distinction of two kinds of numbers can be embodied in the concepts of mixed
matrices and mixed polynomial matrices.
Assume that we are given a pair of fields F and K, where K is a subfield of
F. Typically, K is the field Q of rational numbers and F is a field large enough to
contain all the numbers appearing in the problem in question. In so doing we intend
to model accurate numbers as numbers belonging to K and inaccurate numbers

as numbers in F that are algebraically independent over K, where a family of


numbers ti,..., tm of F is called algebraically independent over K if there exists no
nonzero polynomial p(X\,..., Xm) over K such that p(ti,..., tm) = 0. Informally,
algebraically independent numbers are tantamount to free parameters.
A matrix A = (A_ij) over F, i.e., A_ij ∈ F, is called a mixed matrix with
respect to (K, F) if

A = Q + T,    (12.7)

where
(M-Q) Q = (Q_ij) is a matrix over K and
(M-T) T = (T_ij) is a matrix over F such that the set of its nonzero
entries is algebraically independent over K.
We usually assume

to make the decomposition (12.7) unique.
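A common computational counterpart of this definition is to substitute random rational numbers for the nonzero entries of T: since those entries are modeled as algebraically independent, a random substitution attains the generic rank with high probability (a Schwartz-Zippel-style argument). The sketch below uses exact Fraction arithmetic; the 2 × 3 instance is illustrative, not from the text.

```python
import random
from fractions import Fraction

def rank_frac(M):
    """Rank over Q via Gaussian elimination with exact Fractions."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0]) if A else 0
    r = 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        for i in range(r + 1, rows):
            f = A[i][c] / A[r][c]
            for j in range(c, cols):
                A[i][j] -= f * A[r][j]
        r += 1
    return r

def generic_rank(Q, Tpattern, trials=5):
    """Rank of the mixed matrix A = Q + T, estimated by substituting random
    integers for the independent nonzero entries of T; a random substitution
    attains the generic rank with high probability."""
    best = 0
    for _ in range(trials):
        A = [[Q[i][j] + (random.randint(1, 10 ** 6) if Tpattern[i][j] else 0)
              for j in range(len(Q[0]))] for i in range(len(Q))]
        best = max(best, rank_frac(A))
    return best

# illustrative instance: A = [[1, 1, 0], [t1, t2, 0]] with t1, t2 independent
Q = [[1, 1, 0],
     [0, 0, 0]]
T = [[0, 0, 0],
     [1, 1, 0]]   # 1 marks the position of an independent parameter
print(generic_rank(Q, T))  # 2
```

Section 12.3 replaces this randomized estimate by an exact combinatorial characterization of the rank.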

Example 12.4. In the electrical network of Example 12.1 it is reasonable to regard
{r_1, r_2, α, β} as independent free parameters. Then the coefficient matrix in (12.1) is
a mixed matrix with respect to (K, F) = (Q, Q(r_1, r_2, α, β)), where Q(r_1, r_2, α, β)
means the field of rational functions in r_1, r_2, α, β with coefficients from Q. The
decomposition A = Q + T is given by
Consider now a polynomial matrix in s with coefficients from F:



where A_k (k = 0, 1, ..., N) are matrices over F. We say that A(s) is a mixed
polynomial matrix with respect to (K, F) if it can be represented as

with

where

(MP-Q) Q_k (k = 0, 1, ..., N) are matrices over K and
(MP-T) T_k (k = 0, 1, ..., N) are matrices over F such that the set of
their nonzero entries is algebraically independent over K.

Obviously, the coefficient matrices

are mixed matrices with respect to (K, F). Also note that A(s) is a mixed matrix
with respect to (K(s), F(s)) in spite of the occurrence of the variable s in both Q(s)
and T(s), where K(s) and F(s) denote the fields of rational functions in variable s
with coefficients from K and F, respectively.
Mixed polynomial matrices are useful in dealing with linear time-invariant dy-
namical systems. The variable s here is primarily intended to denote the variable for
the Laplace transform for continuous-time systems, though it could be interpreted
as the variable for the z-transform^68 for discrete-time systems.

Example 12.5. In the mechanical system of Example 12.3 it is reasonable to regard
{m_1, m_2, k_1, k_2, f} as independent free parameters. Then the matrix A(s) in (12.6)
is a mixed polynomial matrix with respect to (K, F) = (Q, Q(m_1, m_2, k_1, k_2, f)).
The decomposition A(s) = Q(s) + T(s) is given by

Our intention in the splitting (12.7) or (12.8) is to extract a meaningful com-
binatorial structure from the matrix A or A(s) by treating the Q-part numerically
and the T-part symbolically. This is based on the following observations.
^68 See Chen [23] or Zadeh-Desoer [220] for the z-transform.

Q-part: As is typical with electrical networks, the Q-part primarily represents
the interconnection of the elements. The Q-matrix, however, is not uniquely
determined, but is subject to our choice in the mathematical description. In
the electrical network of Example 12.1, for instance, the coefficient matrix

for Kirchhoff's voltage law may well be replaced with

Accordingly, the structure of the Q-part should be treated numerically or
linear-algebraically. In fact, this is feasible in practice, since the entries of the
Q-matrix are usually small integers, causing no serious numerical difficulty in
arithmetic operations.
T-part: The T-part primarily represents the element characteristics. The nonzero
pattern of the T-matrix is relatively stable against our choice in the mathe-
matical description of constitutive equations, and therefore it can be regarded
as representing some intrinsic combinatorial structure of the system. It can
be treated properly by graph-theoretic concepts and algorithms.
Combination: The structural information from the Q-part and the T-part can be
combined properly and efficiently by virtue of the fact that each part defines
a combinatorial structure with discrete convexity (matroid or valuated ma-
troid, to be more specific). Mathematical and algorithmic results in discrete
convex analysis (matroid theory and valuated matroid theory) afford effective
methods of systems analysis.
We may summarize the above as follows:
Q-part by linear algebra
T-part by graph theory
Combination by matroid theory

12.3 Rank of Mixed Matrices


The rank of a mixed matrix A = Q + T can be expressed in terms of the L♮-convex
functions (submodular set functions) associated with Q and T. This enables us, for
example, to test efficiently for the solvability of the electrical network in Example
12.1 and of the chemical process simulation problem in Example 12.2.
Let A = Q + T be a mixed matrix with respect to (K, F). The rank of A
is defined with reference to the field F. That is, the rank of A is equal to (i) the
maximum number of linearly independent column vectors of A with coefficients
taken from F, (ii) the maximum number of linearly independent row vectors of A
with coefficients taken from F, and (iii) the maximum size of a submatrix of A for

which the determinant does not vanish in F. The row set and the column set of A
are denoted by R and C, respectively. For I ⊆ R and J ⊆ C, the submatrix of A
with row indices in I and column indices in J is designated by A[I, J].
We start with the nonsingularity of a mixed matrix.

Proposition 12.6. A square mixed matrix A = Q + T is nonsingular if and only
if there exist some I ⊆ R and J ⊆ C such that both Q[I, J] and T[R \ I, C \ J] are
nonsingular.

Proof. It follows from the defining expansion of the determinant that

with ε(I, J) ∈ {1, −1}. If A is nonsingular, we have det A ≠ 0 and hence det Q[I, J] ·
det T[R \ I, C \ J] ≠ 0 for some I and J. The converse is also true, since no
cancellation occurs among nonzero terms on the right-hand side by virtue of the
algebraic independence of the nonzero entries of T. □
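Proposition 12.6 lends itself to a brute-force check on small instances. The following sketch (an assumed toy example, not one from the book, relying on the sympy library) treats the nonzero entries of T as independent symbols and compares det A ≠ 0 against the existence of a complementary pair of nonsingular submatrices:

```python
import itertools
import sympy as sp

def nonsingular_via_split(Q, T):
    # Search for I, J such that Q[I, J] and T[R \ I, C \ J] are both nonsingular.
    n = Q.rows
    for k in range(n + 1):
        for I in itertools.combinations(range(n), k):
            for J in itertools.combinations(range(n), k):
                Ic = [i for i in range(n) if i not in I]
                Jc = [j for j in range(n) if j not in J]
                detQ = Q.extract(list(I), list(J)).det() if k else sp.Integer(1)
                detT = T.extract(Ic, Jc).det() if Ic else sp.Integer(1)
                if detQ != 0 and sp.simplify(detT) != 0:
                    return True
    return False

t1, t2 = sp.symbols('t1 t2')
Q = sp.Matrix([[1, 1], [1, 1]])        # singular numerical part
T = sp.Matrix([[t1, 0], [0, t2]])      # algebraically independent entries
# The split criterion agrees with symbolic nonsingularity of A = Q + T.
assert nonsingular_via_split(Q, T) == (sp.simplify((Q + T).det()) != 0)
assert not nonsingular_via_split(Q, sp.zeros(2, 2))   # Q alone is singular
```

Here the algebraic independence of t1, t2 is what guarantees that no cancellation occurs among the nonzero expansion terms, exactly as in the proof above.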

The following is a basic rank identity for a mixed matrix.

Theorem 12.7. For a mixed matrix A = Q + T,

Proof. Proposition 12.6 applied to submatrices of A establishes (12.9). □

The right-hand side of the identity (12.9) is a maximization over all pairs
(I, J), the number of which is as large as 2^{|R|+|C|}, too large for an exhaustive
search for maximization. Fortunately, however, it is possible to design an efficient
algorithm to compute this maximum on the basis of the following facts:
• The function ρ(I, J) = rank Q[I, J] can be evaluated easily by Gaussian elim-
ination.
• The function τ(I, J) = rank T[I, J] can be evaluated easily by finding a max-
imum matching in a bipartite graph representing the nonzero pattern of T.
• The maximization can be converted, with the aid of Edmonds's intersection
theorem for matroids (a special case of Theorem 4.18), to the minimum of an
L♮-convex function.
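For intuition, the identity (12.9) and the matching characterization of rank T can be confirmed by exhaustive search on a tiny instance. In this sketch (an assumed toy matrix, not from the book; sympy supplies the generic rank of A with a symbolic T-entry), τ is computed as a maximum bipartite matching on the nonzero pattern of T:

```python
import itertools
import sympy as sp

def term_rank(pattern, rows, cols):
    # Maximum matching in the bipartite graph of nonzero positions of T[rows, cols].
    adj = {i: [j for j in cols if (i, j) in pattern] for i in rows}
    match = {}
    def augment(i, seen):
        for j in adj[i]:
            if j not in seen:
                seen.add(j)
                if j not in match or augment(match[j], seen):
                    match[j] = i
                    return True
        return False
    return sum(augment(i, set()) for i in rows)

Q = sp.Matrix([[1, 1, 0], [1, 1, 0], [0, 0, 0]])    # rank 1
pattern = {(2, 2)}                                   # one independent parameter
T = sp.Matrix(3, 3, lambda i, j: sp.Symbol('t') if (i, j) in pattern else 0)

best = 0
for I in (c for k in range(4) for c in itertools.combinations(range(3), k)):
    for J in (c for m in range(4) for c in itertools.combinations(range(3), m)):
        Ic = [i for i in range(3) if i not in I]
        Jc = [j for j in range(3) if j not in J]
        r = Q.extract(list(I), list(J)).rank()       # rank of the Q-part block
        best = max(best, r + term_rank(pattern, Ic, Jc))

assert best == (Q + T).rank() == 2   # identity (12.9) on this instance
```

Note how the two rank oracles differ in kind: the Q-part is handled by exact linear algebra, the T-part purely combinatorially through its nonzero pattern.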
To state the main theorem of this section we need another function γ : 2^R ×
2^C → Z defined by

Note that γ(I, J) represents the number of nonzero rows of the submatrix T[I, J].

Theorem 12.8. For a mixed matrix A = Q + T,

Proof. (12.10) can be proved from (12.9) with the aid of Edmonds's intersection
theorem for matroids (a special case of Theorem 4.18) and (12.11) can be derived
from (12.10) using the formula

which is a version of the fundamental min-max relation between maximum matchings
and minimum covers. (12.12) follows easily from (12.11). For details see the
proofs of Theorem 4.2.11 and Corollary 4.2.12 of Murota [146]. □

We mention the following theorem as an immediate corollary of the third
identity (12.12). Note the duality nature of this theorem.

Theorem 12.9 (König–Egerváry theorem for mixed matrices). For a mixed matrix
A = Q + T, there exist I ⊆ R and J ⊆ C such that
(i) |I| + |J| − rank Q[I, J] = |R| + |C| − rank A, and
(ii) rank T[I, J] = 0.

Proof. Take (I, J) that attains the minimum in (12.12). □
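In small cases the certificate (I, J) of Theorem 12.9 can be found by exhaustive search. A sketch on an assumed 3 × 3 toy instance (not from the book; sympy is used for the generic rank of A):

```python
import itertools
import sympy as sp

Q = sp.Matrix([[1, 1, 0], [1, 1, 0], [0, 0, 0]])
T = sp.Matrix([[0, 0, 0], [0, 0, 0], [0, 0, sp.Symbol('t')]])
A = Q + T
target = 6 - A.rank()        # |R| + |C| - rank A

def is_certificate(I, J):
    # (ii): T[I, J] must be the zero matrix; (i): the deficiency must match.
    if any(e != 0 for e in T.extract(list(I), list(J))):
        return False
    return len(I) + len(J) - Q.extract(list(I), list(J)).rank() == target

certs = [(I, J)
         for k in range(4) for I in itertools.combinations(range(3), k)
         for m in range(4) for J in itertools.combinations(range(3), m)
         if is_certificate(I, J)]
assert ((0, 1, 2), (0, 1)) in certs   # one valid certificate on this instance
```

The pair I = {0, 1, 2}, J = {0, 1} exhibits the duality: T vanishes on the block while the Q-block's rank deficiency accounts exactly for the rank deficiency of A.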

To see the connection of the above rank formulas to L♮-convexity, we define
three functions gρ, gτ, gγ : Z^{R∪C} → Z ∪ {+∞}, with dom gρ = dom gτ = dom gγ =
{0, 1}^{R∪C}, by

Proposition 12.10. gρ, gτ, and gγ are L♮-convex functions.

Proof. It is easy to see that ρ̂(I ∪ J) = ρ(R \ I, J) + |I| is a submodular set function
on R ∪ C (see (2.70)). This is equivalent to the L♮-convexity of gρ by Theorem 7.1,
and similarly for gτ and gγ. □
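The submodularity claimed in this proof is easy to verify exhaustively for a small Q. The sketch below uses an assumed 2 × 3 toy matrix; the tags 'r'/'c' merely distinguish row and column elements of the ground set R ∪ C:

```python
import itertools
import sympy as sp

Q = sp.Matrix([[1, 1, 0], [0, 1, 1]])       # toy 2 x 3 Q-part
ground = [('r', 0), ('r', 1), ('c', 0), ('c', 1), ('c', 2)]

def rho_hat(X):
    # For X ⊆ R ∪ C with I = X ∩ R, J = X ∩ C, return rank Q[R \ I, J] + |I|.
    I = [i for kind, i in X if kind == 'r']
    J = [j for kind, j in X if kind == 'c']
    Ic = [i for i in range(Q.rows) if i not in I]
    return Q.extract(Ic, J).rank() + len(I)

subsets = [frozenset(c) for k in range(6)
           for c in itertools.combinations(ground, k)]
# Submodularity: f(X) + f(Y) >= f(X ∪ Y) + f(X ∩ Y) for all pairs.
for X in subsets:
    for Y in subsets:
        assert rho_hat(X) + rho_hat(Y) >= rho_hat(X | Y) + rho_hat(X & Y)
```

The same brute-force test applied to the analogues of τ and γ would verify the other two cases of the proposition.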

We can rewrite the right-hand sides of (12.10) and (12.11) using these L♮-convex
functions. Namely, we see

where 1 ∈ Z^{R∪C}. Note that both gρ + gτ and gρ + gγ are L♮-convex and therefore
(gρ + gτ)• and (gρ + gγ)• are M♮-convex. As for (12.12) we observe that

is an L♮-convex set and

This shows that the right-hand side of (12.12) is the minimum of an L♮-convex
function over an L♮-convex set. The discrete convexity implicit in Theorem 12.8
is thus exposed. A concrete algorithmic procedure for computing the rank of A =
Q + T is described in section 4.2 of Murota [146].

Example 12.11. The unique solvability of the electrical network in Example
12.1 can be shown by Theorem 12.7 applied to A = Q + T in Example 12.4. In
(12.9) the maximum value of 10 is attained by I = {1, 2, 3, 4, 5, 7, 10} and J =
{3, 4, 5, 7, 8, 9, 10}. In Theorem 12.8 the right-hand sides of (12.10), (12.11), and
(12.12) are equal to 10 with the minima attained by I = R and J = ∅.

Example 12.12. The generic solvability of the chemical process simulation prob-
lem in Example 12.2 is denied by Theorem 12.8 applied to the Jacobian matrix in
Fig. 12.3. In (12.12) the minimum is attained by

Note that γ(I, J) = 0, |I| = |J| = 10, and ρ(I, J) = 3, for which ρ(I, J) − |I| −
|J| + |R| + |C| = 3 − 10 − 10 + 16 + 16 = 15 < |R| = |C| = 16. This shows that the
Jacobian matrix in Fig. 12.3 is singular, and hence the simulation problem is not
solvable in general.

12.4 Degree of Determinant of Mixed Polynomial Matrices
The degree of determinant of a mixed polynomial matrix A(s) = Q(s)+T(s) can be
expressed in terms of the infimal convolution of two M-convex functions associated
with T(s) and Q(s). This enables us, for example, to compute the dynamical degree
of the mechanical system in Example 12.3 in an efficient way by solving an M-convex
submodular flow problem.
Let A(s) = (Aij(s)) be a polynomial matrix with each entry being a poly-
nomial in s with coefficients from a certain field F. We denote by R and C the
row set and the column set of A(s). The degree of minors (subdeterminants) is an
important characteristic of A(s). For example, the sequence δk (k = 1, 2, . . .) of
the highest degree in s of a minor of order k,

determines the Smith–McMillan form at infinity as well as the structural indices of
the Kronecker form (see section 5.1 of Murota [146]). Here the function

to be maximized in (12.13) is essentially M-concave, since ω : 2^{R∪C} → Z ∪ {−∞}
defined by ω(I ∪ J) = δ(R \ I, J) for I ⊆ R and J ⊆ C is a valuated matroid (see
(2.74) and (2.77) as well as Example 5.2.15 of Murota [146]).
The following is the basic identity for the degree of the determinant of a mixed
polynomial matrix.

Theorem 12.13. For a square mixed polynomial matrix A(s) = Q(s) + T(s),
deg det A = max{deg det Q[I, J] + deg det T[R \ I, C \ J] | |I| = |J|, I ⊆ R, J ⊆ C},
(12.14)
where both sides are equal to −∞ if A is singular.

Proof. It follows from the defining expansion of the determinant that

with ε(I, J) ∈ {1, −1}. Since the degree of a sum is bounded by the maximum
degree of a summand, we obtain

The inequality turns into an equality if the highest degree terms do not cancel one
another. The algebraic independence of the nonzero coefficients in T(s) ensures
this. □
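Identity (12.14) can be confirmed by enumeration for a tiny mixed polynomial matrix. The following sketch uses an assumed 2 × 2 toy instance (not from the book), with sympy handling the symbolic coefficients:

```python
import itertools
import sympy as sp

s, t1, t2 = sp.symbols('s t1 t2')
Q = sp.Matrix([[s, 1], [1, 0]])          # Q(s): fixed numerical coefficients
T = sp.Matrix([[0, t1 * s], [0, t2]])    # T(s): independent symbolic coefficients
A = Q + T

def deg(p):
    # Degree in s, with deg 0 = -infinity as in the theorem.
    p = sp.expand(p)
    return sp.degree(p, s) if p != 0 else sp.S.NegativeInfinity

best = sp.S.NegativeInfinity
for k in range(3):                       # |I| = |J| = k as required by (12.14)
    for I in itertools.combinations(range(2), k):
        for J in itertools.combinations(range(2), k):
            Ic = [i for i in range(2) if i not in I]
            Jc = [j for j in range(2) if j not in J]
            dQ = deg(Q.extract(list(I), list(J)).det()) if k else 0
            dT = deg(T.extract(Ic, Jc).det()) if Ic else 0
            best = max(best, dQ + dT)

assert deg(A.det()) == best == 1         # both sides of (12.14) equal 1 here
```

As in the proof, it is the algebraic independence of t1 and t2 that forbids cancellation of the highest-degree terms, so the maximum is actually attained.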

The right-hand side of the identity (12.14) is a maximization over all pairs
(I, J), the number of which is as large as 2^{|R|+|C|}, too large for an exhaustive search
for maximization. Fortunately, however, it is possible to compute this maximum
efficiently by reducing this maximization problem to the M-convex submodular flow
problem.
To see the connection to M-convexity, we define functions fQ, fT : Z^{R∪C} →
Z ∪ {+∞} with dom fQ, dom fT ⊆ {0, 1}^{R∪C} by

Both fQ and fT are M-convex functions. The right-hand side of (12.14) can now
be identified as the negative of an integer infimal convolution of these M-convex
functions. Namely,

where 1 ∈ Z^{R∪C}. This reveals the discrete convexity implicit in Theorem 12.13
and also shows an efficient way to compute the degree of determinant of a mixed
polynomial matrix A(s) = Q(s) + T(s), since the infimal convolution of M-convex
functions can be computed efficiently by solving an M-convex submodular flow
problem, as we have seen in Note 9.30. Such an algorithm is described in detail in
section 6.2 of Murota [146].
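The integer infimal convolution itself is easy to illustrate directly. The values below are arbitrary toy data (not the fQ, fT arising from an actual mixed polynomial matrix); the point is only the operation (f □ g)(x) = min over y of f(y) + g(x − y), with effective domains contained in {0, 1}^n:

```python
fQ = {(0, 0): 0, (1, 0): 2, (0, 1): 1, (1, 1): 2}   # toy values on {0,1}^2
fT = {(0, 0): 0, (1, 0): 1, (0, 1): 3, (1, 1): 5}

def inf_conv(f, g, x):
    # (f □ g)(x) = min { f(y) + g(x - y) : y in dom f, x - y in dom g }
    vals = [f[y] + g[tuple(a - b for a, b in zip(x, y))]
            for y in f if tuple(a - b for a, b in zip(x, y)) in g]
    return min(vals) if vals else float('inf')

assert inf_conv(fQ, fT, (1, 1)) == 2   # attained by y = (0, 1) and y = (1, 1)
assert inf_conv(fQ, fT, (0, 0)) == 0
```

For genuine M-convex fQ and fT this brute force is exponential in n; the point of Note 9.30 is precisely that the M-convex submodular flow formulation evaluates the same quantity in polynomial time.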

Example 12.14. The dynamical degree of the mechanical system in Example 12.3
can be computed by Theorem 12.13 applied to A(s) = Q(s) + T(s) in Example 12.5.
In (12.14) the maximum value of 4 is attained by I = {1,2,5,6} and J = {1,2,5,6}.
Hence the dynamical degree is equal to four. •

Bibliographical Notes
This chapter is largely based on Murota [146]. The observation on two kinds of num-
bers and the concept of mixed matrices are due to Murota-Iri [149], [150], in which
Theorem 12.7 is given. Theorem 12.8 is taken from [146]. The König–Egerváry theorem
for mixed matrices (Theorem 12.9) is due to Bapat [7] and Hartfiel–Loewy [86].
The connection between mixed polynomial matrices and M-convexity explained in
section 12.4 is due to Murota [143] and a related topic can be found in Iwata-Murota
[104].
Applications of matroid theory to electrical networks are fully expounded in
Iri [95] and Recski [175]. When gyrators are involved in electrical networks, a
generalization of mixed matrices to mixed skew-symmetric matrices is useful, as is
explained in section 7.3 of Murota [146]. See Geelen-Iwata [75] and Geelen-Iwata-
Murota [76] for recent results on mixed skew-symmetric matrices.
Matroid theory also finds applications in statics and scene analysis (Graver-
Servatius-Servatius [80], Recski [175], Sugihara [195], Whiteley [215], [216], [217]).
For planar truss structures, in particular, a necessary and sufficient condition for
generic (infinitesimal) rigidity can be expressed in terms of unions of graphic ma-
troids, where a matroid union is a special case of the Minkowski sum of two M-
convex sets. It is noted that the rigidity of a truss structure can be represented by
a rank condition on a matrix associated with the truss, but that this matrix does
not fall into the category of mixed matrices. Recent results on rigidity in nongeneric
cases are surveyed in Radics-Recski [174].
Bibliography

[1] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin: Network Flows—Theory, Algo-


rithms and Applications, Prentice-Hall, Englewood Cliffs, NJ, 1993. (Cited
on pp. 74, 145, 278)
[2] I. Althöfer and W. Wenzel: Two-best solutions under distance constraints:
The model and exemplary results for matroids, Advances in Applied Mathe-
matics, 22 (1999), 155-185. (Cited on p. 75)
[3] D. H. Anderson: Compartmental Modeling and Tracer Kinetics, Lecture Notes
in Biomathematics, 50, Springer-Verlag, Berlin, 1983. (Cited on p. 43)
[4] K. J. Arrow and F. H. Hahn: General Competitive Analysis, Holden-Day, San
Francisco, 1971. (Cited on p. 327)
[5] M. Avriel, W. E. Diewert, S. Schaible, and I. Zang: Generalized Concavity,
Plenum Press, New York, 1988. (Cited on p. 169)
[6] O. Axelsson: Iterative Solution Methods, Cambridge University Press, Cam-
bridge, U.K., 1994. (Cited on p. 42)
[7] R. B. Bapat: König's theorem and bimatroids, Linear Algebra and Its Appli-
cations, 212/213 (1994), 353-365. (Cited on p. 361)
[8] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty: Nonlinear Programming:
Theory and Algorithm, 2nd ed., Wiley, New York, 1993. (Cited on p. 36)
[9] A. Berman and R. J. Plemmons: Nonnegative Matrices in the Mathematical
Sciences, SIAM, Philadelphia, 1994. (Cited on pp. 42, 74)
[10] D. P. Bertsekas: Nonlinear Programming, 2nd ed., Athena Scientific, Belmont,
MA, 1999. (Cited on p. 36)
[11] M. J. Best, N. Chakravarti, and V. A. Ubhaya: Minimizing separable convex
functions subject to simple chain constraints, SIAM Journal on Optimization,
10 (2000), 658-672. (Cited on p. 202)
[12] C. Bevia, M. Quinzii, and J. Silva: Buying several indivisible goods, Mathe-
matical Social Sciences, 37 (1999), 1-23. (Cited on p. 327)
[13] S. Bikhchandani and J, W. Mamer: Competitive equilibrium in an exchange
economy with indivisibilities, Journal of Economic Theory, 74 (1997), 385-
413. (Cited on p. 327)


[14] J. M. Bilbao: Cooperative Games on Combinatorial Structures, Kluwer Aca-


demic, Boston, 2000. (Cited on p. 345)
[15] R. E. Bixby, W. H. Cunningham, and D. M. Topkis: Partial order of a polyma-
troid extreme point, Mathematics of Operations Research, 10 (1985), 367-378.
(Cited on pp. 290, 322)
[16] A. Björner, M. Las Vergnas, B. Sturmfels, N. White, and G. M. Ziegler:
Oriented Matroids, 2nd ed., Cambridge University Press, Cambridge, U.K.,
1999. (Cited on pp. 5, 75)
[17] J. M. Borwein and A. S. Lewis: Convex Analysis and Nonlinear Optimization:
Theory and Examples, Springer-Verlag, Berlin, 2000. (Cited on p. 99)
[18] A. Bouchet and W. H. Cunningham: Delta-matroids, jump systems, and
bisubmodular polyhedra, SIAM Journal on Discrete Mathematics, 8 (1995),
17-32. (Cited on p. 120)
[19] R. K. Brayton and J. K. Moser: A theory of nonlinear networks, I, II, Quar-
terly of Applied Mathematics, 22 (1964), 1-33, 81-104. (Cited on p. 74)
[20] R. A. Brualdi: Comments on bases in dependence structures, Bulletin of the
Australian Mathematical Society, 1 (1969), 161-167. (Cited on p. 75)
[21] R. A. Brualdi: Induced matroids, Proceedings of the American Mathematical
Society, 29 (1971), 213-221. (Cited on p. 279)
[22] P. M. Camerini, M. Conforti, and D. Naddef: Some easily solvable nonlinear
integer programs, Ricerca Operativa, 50 (1989), 11-25. (Cited on p. 175)
[23] Ch.-T. Chen: Linear System Theory and Design, 2nd ed., Holt, Rinehart and
Winston, New York, 1970. (Cited on pp. 351, 355)
[24] V. Chvatal: Linear Programming, W. H. Freeman and Company, New York,
1983. (Cited on pp. 88, 89, 99)
[25] R. Clay: Nonlinear Networks and Systems, John Wiley and Sons, New York,
1971. (Cited on p. 74)
[26] W. J. Cook, W. H. Cunningham, W. R. Pulleyblank, and A. Schrijver: Com-
binatorial Optimization, John Wiley and Sons, New York, 1998. (Cited on
pp. 36, 37, 74, 89, 99, 248, 278)
[27] W. Cui and S. Fujishige: A primal algorithm for the submodular flow prob-
lem with minimum-mean cycle selection, Journal of the Operations Research
Society of Japan, 31 (1988), 431-440. (Cited on p. 313)
[28] W. H. Cunningham: Testing membership in matroid polyhedra, Journal of
Combinatorial Theory (B), 36 (1984), 161-188. (Cited on pp. 290, 322)
[29] W. H. Cunningham: On submodular function minimization, Combinatorica,
5 (1985), 185-192. (Cited on pp. 290, 322)
[30] W. H. Cunningham and A. Frank: A primal-dual algorithm for submodular
flows, Mathematics of Operations Research, 10 (1985), 251-262. (Cited on
p. 318)

[31] V. I. Danilov and G. A. Koshevoy: Cores of cooperative games, superdiffer-


entials of functions, and the Minkowski difference of sets, Journal of Mathe-
matical Analysis and Applications, 247 (2000), 1-14. (Cited on p. 345)
[32] V. I. Danilov and G. A. Koshevoy: Discrete convexity and unimodularity, I,
Preprint. (Cited on pp. 99, 120)
[33] V. Danilov, G. Koshevoy, and C. Lang: Gross substitution, discrete convexity,
and submodularity, Discrete Applied Mathematics (2003), in press. (Cited
on pp. 176, 344)
[34] V. Danilov, G. Koshevoy, and K. Murota: Equilibria in economies with indi-
visible goods and money, RIMS Preprint 1204, Kyoto University, May 1998.
(Cited on pp. 175, 327, 344)
[35] V. Danilov, G. Koshevoy, and K. Murota: Discrete convexity and equilibria
in economies with indivisible goods and money, Mathematical Social Sciences,
41 (2001), 251-273. (Cited on pp. 175, 327, 344)
[36] G. B. Dantzig: Linear Programming and Extensions, Princeton University
Press, Princeton, NJ, 1963. (Cited on pp. 88, 99)
[37] G. Debreu: Theory of Value—An Axiomatic Analysis of Economic Equilib-
rium, John Wiley and Sons, New York, 1959. (Cited on p. 327)
[38] G. Debreu: Existence of competitive equilibrium, in: K. J. Arrow and M.
D. Intriligator, eds., Handbook of Mathematical Economics, Vol. II, North-
Holland, Amsterdam, 1982, Chap. 15, 697-743. (Cited on p. 327)
[39] P. G. Doyle and J. L. Snell: Random Walks and Electric Networks, Mathe-
matical Association of America, Washington, DC, 1984. (Cited on p. 74)
[40] A. W. M. Dress and W. Terhalle: Well-layered maps and the maximum-degree
k x fc-subdeterminant of a matrix of rational functions, Applied Mathematics
Letters, 8 (1995), 19-23. (Cited on p. 176)
[41] A. W. M. Dress and W. Wenzel: Valuated matroid: A new look at the greedy
algorithm, Applied Mathematics Letters, 3 (1990), 33-35. (Cited on pp. 6,
7, 75, 321)
[42] A. W. M. Dress and W. Wenzel: Valuated matroids, Advances in Mathemat-
ics, 93 (1992), 214-250. (Cited on pp. 6, 7, 75)
[43] D.-Z. Du and P. M. Pardalos, eds.: Handbook of Combinatorial Optimization,
Vols. 1-3, A, Kluwer Academic, Boston, 1998, 1999. (Cited on pp. 36, 99,
278)
[44] J. Edmonds: Submodular functions, matroids and certain polyhedra, in: R.
Guy, H. Hanani, N. Sauer, and J. Schönheim, eds., Combinatorial Structures
and Their Applications, Gordon and Breach, New York, 1970, 69-87. (Cited
on pp. 5, 6, 7, 35, 37, 119, 146, 224, 290)
[45] J. Edmonds: Matroid intersection, Annals of Discrete Mathematics, 14
(1979), 39-49. (Cited on pp. 6, 7, 35, 224)

[46] J. Edmonds and R. Giles: A min-max relation for submodular functions


on graphs, Annals of Discrete Mathematics, 1 (1977), 185-204. (Cited on
p. 278)
[47] A. Eguchi and S. Fujishige: An extension of the Gale-Shapley stable matching
algorithm to a pair of M^-concave functions, Discrete Mathematics and Sys-
tems Science Research Report, No. 02-05, Division of Systems Science, Osaka
University, November 2002; Mathematics of Operations Research, submitted.
(Cited on p. 345)
[48] U. Faigle: Matroids in combinatorial optimization, in: N. White, ed., Com-
binatorial Geometries, Cambridge University Press, London, 1987, 161-210.
(Cited on p. 74)
[49] P. Favati and F. Tardella: Convexity in nonlinear integer programming,
Ricerca Operativa, 53 (1990), 3-44. (Cited on pp. 6, 7, 8, 38, 99, 202, 322)
[50] L. Fleischer and S. Iwata: A push-relabel framework for submodular function
minimization and applications to parametric optimization, Discrete Applied
Mathematics (2003), in press. (Cited on p. 322)
[51] L. Fleischer, S. Iwata, and S. T. McCormick: A faster capacity scaling al-
gorithm for minimum cost submodular flow, Mathematical Programming, 92
(2002), 119-139. (Cited on p. 322)
[52] R. Fletcher: Practical Methods of Optimization, 2nd ed., John Wiley and
Sons, New York, 1987. (Cited on p. 36)
[53] L. R. Ford, Jr. and D. R. Fulkerson: Flows in Networks, Princeton University
Press, Princeton, NJ, 1962. (Cited on pp. 74, 278)
[54] A. Frank: A weighted matroid intersection algorithm, Journal of Algorithms,
2 (1981), 328-336. (Cited on pp. 6, 7, 35, 37, 224, 244)
[55] A. Frank: An algorithm for submodular functions on graphs, Annals of Dis-
crete Mathematics, 16 (1982), 97-120. (Cited on pp. 6, 35, 37, 119, 224,
318)
[56] A. Frank: Finding feasible vectors of Edmonds-Giles polyhedra, Journal of
Combinatorial Theory (B), 36 (1984), 221-239. (Cited on pp. 278, 312, 322)
[57] A. Frank: Generalized polymatroids, in: A. Hajnal, L. Lovász, and V. T. Sós,
eds., Finite and Infinite Sets, I, North-Holland, Amsterdam, 1984, 285-294.
(Cited on p. 119)
[58] A. Frank and E. Tardos: Generalized polymatroids and submodular flows,
Mathematical Programming, 42 (1988), 489-563. (Cited on p. 119)
[59] S. Fujishige: Algorithms for solving the independent-flow problems, Journal
of Operations Research Society of Japan, 21 (1978), 189-204. (Cited on
pp. 278, 312, 313, 322)
[60] S. Fujishige: Lexicographically optimal base of a polymatroid with respect
to a weight vector, Mathematics of Operations Research, 5 (1980), 186-196.
(Cited on p. 4)

[61] S. Fujishige: Structure of polyhedra determined by submodular functions on


crossing families, Mathematical Programming, 29 (1984), 125-141. (Cited
on p. 278)
[62] S. Fujishige: Theory of submodular programs: A Fenchel-type min-max the-
orem and subgradients of submodular functions, Mathematical Programming,
29 (1984), 142-155. (Cited on pp. 6, 35, 224, 244)
[63] S. Fujishige: On the subdifferential of a submodular function, Mathematical
Programming, 29 (1984), 348-360. (Cited on pp. 6, 37, 119)
[64] S. Fujishige: A note on Frank's generalized polymatroids, Discrete Applied
Mathematics, 7 (1984), 105-109. (Cited on p. 119)
[65] S. Fujishige: Submodular Functions and Optimization, Annals of Discrete
Mathematics, 47, North-Holland, Amsterdam, 1991. (Cited on pp. 4, 37,
117, 119, 202, 248, 278, 285, 312, 318, 322)
[66] S. Fujishige and S. Iwata: Algorithms for submodular flows, IEICE Trans-
actions on Systems and Information, E83-D (2000), 322-329. (Cited on
pp. 312, 318, 322)
[67] S. Fujishige, K. Makino, T. Takabatake, and K. Kashiwabara: Polybasic poly-
hedra: structure of polyhedra with edge vectors of support size at most 2,
Discrete Mathematics, to appear. (Cited on p. 120)
[68] S. Fujishige and K. Murota: Notes on L-/M-convex functions and the separa-
tion theorems, Mathematical Programming, 88 (2000), 129-146. (Cited on
pp. 6, 8, 38, 131, 202)
[69] S. Fujishige and Z. Yang: A note on Kelso and Crawford's gross substi-
tutes condition, Mathematics of Operations Research, to appear. (Cited
on pp. 120, 176, 344)
[70] S. Fujishige and X. Zhang: New algorithms for the intersection problem of
submodular systems, Japan Journal of Applied Mathematics, 9 (1992), 369-
382. (Cited on p. 312)
[71] M. Fukushima, Y. Oshima, and M. Takeda: Dirichlet Forms and Symmetric
Markov Processes, Walter de Gruyter, Berlin, 1994. (Cited on pp. 45, 74)
[72] D. Gale: Equilibrium in a discrete exchange economy with money, Interna-
tional Journal of Game Theory, 13 (1984), 61-64. (Cited on p. 327)
[73] D. Gale and T. Politof: Substitutes and complements in network flow prob-
lems, Discrete Applied Mathematics, 3 (1981), 175-186. (Cited on p. 74)
[74] D. Gale and L. S. Shapley: College admissions and stability of marriage,
American Mathematical Monthly, 69 (1962), 9-15. (Cited on p. 345)
[75] J. F. Geelen and S. Iwata: Matroid matching via mixed skew-symmetric ma-
trices, METR 2002-03, University of Tokyo, April 2002. (Cited on p. 361)
[76] J. F. Geelen, S. Iwata, and K. Murota: The linear delta-matroid parity prob-
lem, Journal of Combinatorial Theory (B), to appear. (Cited on p. 361)

[77] E. Girlich, M. Kovalev, and A. Zaporozhets: A polynomial algorithm for


resource allocation problems with polymatroid constraints, Optimization, 37
(1996), 73-86. (Cited on p. 4)
[78] E. Girlich and M. M. Kowaljow: Nichtlineare diskrete Optimierung,
Akademie-Verlag, Berlin, 1981. (Cited on p. 4)
[79] F. Granot and A. F. Veinott, Jr.: Substitutes, complements and ripples
in network flows, Mathematics of Operations Research, 10 (1985), 471-497.
(Cited on p. 74)
[80] J. Graver, B. Servatius, and H. Servatius: Combinatorial Rigidity, American
Mathematical Society, Providence, RI, 1993. (Cited on p. 361)
[81] H. Groenevelt: Two algorithms for maximizing a separable concave function
over a polymatroid feasible region, European Journal of Operational Research,
54 (1991), 227-236. (Cited on p. 4)
[82] M. Grötschel, L. Lovász, and A. Schrijver: The ellipsoid method and its
consequences in combinatorial optimization, Combinatorica, 1 (1981), 169-
197 [Corrigendum: Combinatorica, 4 (1984), 291-295]. (Cited on p. 290)
[83] M. Grötschel, L. Lovász, and A. Schrijver: Geometric Algorithms and Combi-
natorial Optimization, 1st ed., 2nd ed., Springer-Verlag, Berlin, 1988, 1993.
(Cited on p. 290)
[84] F. Gul and E. Stacchetti: Walrasian equilibrium with gross substitutes, Jour-
nal of Economic Theory, 87 (1999), 95-124. (Cited on pp. 327, 332, 344)
[85] B. Hajek: Extremal splittings of point processes, Mathematics of Operations
Research, 10 (1985), 543-556. (Cited on p. 202)
[86] D. J. Hartfiel and R. Loewy: A determinantal version of the Frobenius-Konig
theorem, Linear Multilinear Algebra, 16 (1984), 155-165. (Cited on p. 361)
[87] R. Hassin: Minimum cost flow with set-constraints, Networks, 12 (1982),
1-21. (Cited on p. 278)
[88] C. Henry: Indivisibilités dans une économie d'échanges, Econometrica, 38
(1970), 542-558. (Cited on p. 327)
[89] J.-B. Hiriart-Urruty and C. Lemarechal: Convex Analysis and Minimization
Algorithms I, II, Springer-Verlag, Berlin, 1993. (Cited on p. 99)
[90] D. S. Hochbaum: Lower and upper bounds for the allocation problem and
other nonlinear optimization problems, Mathematics of Operations Research,
19 (1994), 390-409. (Cited on pp. 4, 158)
[91] D. S. Hochbaum and S.-P. Hong: About strongly polynomial time algorithms
for quadratic optimization over submodular constraints, Mathematical Pro-
gramming, 69 (1995), 269-309. (Cited on p. 4)
[92] D. S. Hochbaum, R. Shamir, and J. G. Shanthikumar: A polynomial algo-
rithm for an integer quadratic non-separable transportation problem, Mathe-
matical Programming, 55 (1992), 359-371. (Cited on pp. 5, 175)

[93] T. Ibaraki and N. Katoh: Resource Allocation Problems: Algorithmic Ap-


proaches, MIT Press, Boston, 1988. (Cited on pp. 4, 5)
[94] M. Iri: Network Flow, Transportation and Scheduling—Theory and Algo-
rithms, Academic Press, New York, 1969. (Cited on pp. 64, 74, 132, 247,
278)
[95] M. Iri: Applications of matroid theory, in: A. Bachem, M. Grotschel, and
B. Korte, eds., Mathematical Programming—The State of the Art, Springer-
Verlag, Berlin, 1983, 158-201. (Cited on p. 361)
[96] M. Iri and N. Tomizawa: An algorithm for finding an optimal "independent
assignment," Journal of the Operations Research Society of Japan, 19 (1976),
32-57. (Cited on pp. 6, 7, 35, 224)
[97] S. Iwata: A capacity scaling algorithm for convex cost submodular flows,
Mathematical Programming, 76 (1997), 299-308. (Cited on p. 322)
[98] S. Iwata: Submodular flow problems (in Japanese), in: S. Fujishige, ed.,
Discrete Structures and Algorithms, Vol. VI, Kindai-Kagakusha, Tokyo, 1999,
Chapter 4, 127-170. (Cited on pp. 312, 318, 322)
[99] S. Iwata: A fully combinatorial algorithm for submodular function minimiza-
tion, Journal of Combinatorial Theory (B), 84 (2002), 203-212. (Cited on
pp. 290, 305)
[100] S. Iwata: A faster scaling algorithm for minimizing submodular functions, in:
W. J. Cook and A. S. Schulz, eds., Integer Programming and Combinatorial
Optimization, Lecture Notes in Computer Science, 2337, Springer-Verlag,
2002, 1-8. (Cited on p. 322)
[101] S. Iwata, L. Fleischer, and S. Fujishige: A combinatorial, strongly polynomial-
time algorithm for minimizing submodular functions, Proceedings of the 32nd
ACM Symposium on Theory of Computing (2000), 97-106. (Cited on p. 290)
[102] S. Iwata, L. Fleischer, and S. Fujishige: A combinatorial, strongly polynomial-
time algorithm for minimizing submodular functions, Journal of the ACM, 48
(2001), 761-777. (Cited on pp. 290, 322)
[103] S. Iwata, S. T. McCormick, and M. Shigeno: Fast cycle canceling algorithms
for minimum cost submodular flow, Combinatorica, to appear. (Cited on
p. 313)
[104] S. Iwata and K. Murota: Combinatorial relaxation algorithm for mixed poly-
nomial matrices, Mathematical Programming, 90 (2001), 353-371. (Cited
on p. 361)
[105] S. Iwata and M. Shigeno: Conjugate scaling algorithm for Fenchel-type duality
in discrete convex optimization, SIAM Journal on Optimization, 13 (2003),
204-211. (Cited on pp. 202, 278, 322)
[106] P. M. Jensen and B. Korte: Complexity of matroid property algorithms, SIAM
Journal on Computing, 11 (1982), 184-190. (Cited on p. 293)
[107] M. Kaneko: The central assignment game and the assignment markets, Jour-
nal of Mathematical Economics, 10 (1982), 205-232. (Cited on p. 327)

[108] M. Kaneko and Y. Yamamoto: The existence and computation of competitive


equilibria in markets with an indivisible commodity, Journal of Economic
Theory, 38 (1986), 118-136. (Cited on p. 327)
[109] K. Kashiwabara and T. Takabatake: Polyhedra with submodular support
functions and their unbalanced simultaneous exchangeability, Discrete Applied
Mathematics (2003), in press. (Cited on p. 120)
[110] N. Katoh and T. Ibaraki: Resource allocation problems, in: D.-Z. Du and P.
M. Pardalos, eds., Handbook of Combinatorial Optimization, Vol. 2, Kluwer
Academic, Boston, 1998, 159-260. (Cited on p. 176)
[111] A. S. Kelso, Jr., and V. P. Crawford: Job matching, coalition formation, and
gross substitutes, Econometrica, 50 (1982), 1483-1504. (Cited on pp. 327,
332, 344)
[112] J. Kindler: Sandwich theorems for set functions, Journal of Mathematical
Analysis and Applications, 133 (1988), 529-542. (Cited on p. 5)
[113] S. Kodama and N. Suda: Matrix Theory for System Control (in Japanese),
Society of Instrument and Control Engineers, Tokyo, 1978. (Cited on p. 42)
[114] B. Korte, L. Lovász, and R. Schrader: Greedoids, Springer-Verlag, Berlin,
1991. (Cited on p. 5)
[115] B. Korte and J. Vygen: Combinatorial Optimization: Theory and Algorithms,
Springer-Verlag, Berlin, 2000. (Cited on pp. 36, 74, 89, 99, 278)
[116] J. P. S. Kung: A Source Book in Matroid Theory, Birkhäuser, Boston, 1986.
(Cited on p. 74)
[117] J. P. S. Kung: Basis-exchange properties, in: N. White, ed., Theory of Ma-
troids, Cambridge University Press, London, 1986, Chapter 4, 62-75. (Cited
on p. 333)
[118] E. L. Lawler: Matroid intersection algorithms, Mathematical Programming, 9
(1975), 31-56. (Cited on pp. 6, 7)
[119] E. L. Lawler: Combinatorial Optimization: Networks and Matroids, Holt,
Rinehart and Winston, New York, 1976, Dover Publications, New York, 2001.
(Cited on pp. 36, 74, 89, 99, 278)
[120] E. L. Lawler and C. U. Martel: Computing maximal polymatroidal network
flows, Mathematics of Operations Research, 7 (1982), 334-337. (Cited on
p. 278)
[121] E. L. Lawler and C. U. Martel: Network flow formulations of polymatroid
optimization problems, Annals of Discrete Mathematics, 16 (1982), 515-534.
(Cited on p. 278)
[122] L. Lovász: Matroid matching and some applications, Journal of Combinato-
rial Theory (B), 28 (1980), 208-236. (Cited on p. 293)
[123] L. Lovász: Submodular functions and convexity, in: A. Bachem, M. Grötschel,
and B. Korte, eds., Mathematical Programming—The State of the Art,
Springer-Verlag, Berlin, 1983, 235-257. (Cited on pp. 5, 6, 37, 119, 146,
293)

[124] L. Lovász: The membership problem in jump systems, Journal of Combina-


torial Theory (B), 70 (1997), 45-66. (Cited on p. 120)
[125] L. Lovász and M. Plummer: Matching Theory, North-Holland, Amsterdam,
1986. (Cited on p. 99)
[126] O. L. Mangasarian: Nonlinear Programming, SIAM, Philadelphia, 1994.
(Cited on p. 36)
[127] S. T. McCormick: Submodular Function Minimization, in: K. Aardal, G.
Nemhauser, and R. Weismantel, eds., Handbook on Discrete Optimization,
Elsevier Science, Berlin, 2003, to appear. (Cited on p. 322)
[128] L. McKenzie: General equilibrium, in: J. Eatwell, M. Milgate, and P. New-
man, eds., The New Palgrave: General Equilibrium, Macmillan, London, 1989,
Chapter 1. (Cited on p. 327)
[129] P. Milgrom and C. Shannon: Monotone comparative statics, Econometrica,
62 (1994), 157-180. (Cited on pp. 198, 203, 345)
[130] B. L. Miller: On minimizing nonseparable functions defined on the integers
with an inventory application, SIAM Journal on Applied Mathematics, 21
(1971), 166-185. (Cited on pp. 5, 99)
[131] M. Minoux: Solving integer minimum cost flows with separable convex objec-
tive polynomially, Mathematical Programming, 26 (1986), 237-239. (Cited
on p. 4)
[132] S. Moriguchi and K. Murota: Capacity scaling algorithm for scalable M-
convex submodular flow problems, Optimization Methods and Software, to
appear. (Cited on pp. 312, 322)
[133] S. Moriguchi, K. Murota, and A. Shioura: Scaling algorithms for M-convex
function minimization, IEICE Transactions on Fundamentals of Electronics,
Communications and Computer Sciences, E85-A (2002), 922-929. (Cited
on pp. 175, 321)
[134] S. Moriguchi and A. Shioura: On Hochbaum's scaling algorithm for the gen-
eral resource allocation problem, Research Reports on Mathematical and
Computing Sciences, B-377, Tokyo Institute of Technology, January 2002.
(Cited on pp. 158, 176)
[135] K. Murota: Valuated matroid intersection, I: optimality criteria, SIAM Jour-
nal on Discrete Mathematics, 9 (1996), 545-561. (Cited on pp. 6, 7, 35, 224,
244, 278)
[136] K. Murota: Valuated matroid intersection, II: algorithms, SIAM Journal on
Discrete Mathematics, 9 (1996), 562-576. (Cited on pp. 8, 313, 322)
[137] K. Murota: Convexity and Steinitz's exchange property, Advances in Mathe-
matics, 124 (1996), 272-311. (Cited on pp. 6, 8, 37, 175, 176, 244, 278)
[138] K. Murota: Matroid valuation on independent sets, Journal of Combinatorial
Theory (B), 69 (1997), 59-78. (Cited on p. 176)
[139] K. Murota: Fenchel-type duality for matroid valuations, Mathematical Programming, 82 (1998), 357-375. (Cited on pp. 6, 8)
[140] K. Murota: Discrete convex analysis, Mathematical Programming, 83 (1998),
313-371. (Cited on pp. 6, 8, 37, 74, 119, 131, 132, 176, 202, 244, 278)
[141] K. Murota: Discrete convex analysis (in Japanese), in: S. Fujishige, ed.,
Discrete Structures and Algorithms, Vol. V, Kindai-Kagakusha, Tokyo, 1998,
Chapter 2, 51-100. (Cited on pp. 37, 74, 119, 132, 175, 202)
[142] K. Murota: Submodular flow problem with a nonseparable cost function,
Combinatorica, 19 (1999), 87-109. (Cited on pp. 8, 37, 74, 176, 221, 244,
278, 322)
[143] K. Murota: On the degree of mixed polynomial matrices, SIAM Journal on
Matrix Analysis and Applications, 20 (1999), 196-227. (Cited on p. 361)
[144] K. Murota: Discrete convex analysis—Exposition on conjugacy and duality,
in: L. Lovasz, A. Gyarfas, G. O. H. Katona, A. Recski, and L. Szekely, eds.,
Graph Theory and Combinatorial Biology, The Janos Bolyai Mathematical
Society, Budapest, 1999, 253-278. (Cited on pp. 175, 202)
[145] K. Murota: Algorithms in discrete convex analysis, IEICE Transactions on
Systems and Information, E83-D (2000), 344-352. (Cited on pp. 202, 278,
322)
[146] K. Murota: Matrices and Matroids for Systems Analysis, Springer-Verlag,
Berlin, 2000. (Cited on pp. 74, 244, 266, 321, 358, 359, 360, 361)
[147] K. Murota: Discrete Convex Analysis—An Introduction (in Japanese), Ky-
oritsu Publishing Company, Tokyo, 2001. (Cited on pp. xxii, 74, 99, 175,
176, 202, 244, 278, 322, 344)
[148] K. Murota: On steepest descent algorithms for discrete convex functions,
METR 2002-12, University of Tokyo, November 2002. (Cited on pp. 321,
322)
[149] K. Murota and M. Iri: Matroid-theoretic approach to the structural solvability
of a system of equations (in Japanese), Transactions of Information Processing
Society of Japan, 24 (1983), 157-164. (Cited on p. 361)
[150] K. Murota and M. Iri: Structural solvability of systems of equations—A
mathematical formulation for distinguishing accurate and inaccurate num-
bers in structural analysis of systems, Japan Journal of Applied Mathematics,
2 (1985), 247-271. (Cited on p. 361)
[151] K. Murota and A. Shioura: M-convex function on generalized polymatroid,
Mathematics of Operations Research, 24 (1999), 95-105. (Cited on pp. 6,
8, 38, 119, 175)
[152] K. Murota and A. Shioura: Extension of M-convexity and L-convexity to
polyhedral convex functions, Advances in Applied Mathematics, 25 (2000),
352-427. (Cited on pp. 6, 8, 38, 98, 120, 132, 162, 163, 176, 190, 192, 202, 244,
278, 279)
[153] K. Murota and A. Shioura: Relationship of M-/L-convex functions with discrete convex functions by Miller and by Favati-Tardella, Discrete Applied Mathematics, 115 (2001), 151-176. (Cited on pp. 36, 38, 99, 119, 132, 176, 231, 244)
[154] K. Murota and A. Shioura: Quasi M-convex and L-convex functions: Quasi-
convexity in discrete optimization, Discrete Applied Mathematics (2003), in
press. (Cited on pp. 176, 202, 321)
[155] K. Murota and A. Shioura: Quadratic M-convex and L-convex functions,
RIMS Preprint 1326, Kyoto University, July 2001. (Cited on pp. 9, 38, 52,
74, 175)
[156] K. Murota and A. Shioura: M-convex and L-convex functions over the real
space—Two conjugate classes of combinatorial convex functions, METR 2002-
09, University of Tokyo, July 2002. (Cited on pp. 6, 9, 38, 176, 202, 211)
[157] K. Murota and A. Shioura: Conjugacy relationship between M-convex and
L-convex functions in continuous variables, RIMS Preprint 1378, Kyoto Uni-
versity, September 2002; Mathematical Programming, to appear. (Cited on
pp. 6, 9, 38, 176, 202, 211)
[158] K. Murota and A. Shioura: Substitutes and complements in network flows
viewed as discrete convexity, RIMS Preprint 1382, Kyoto University, October
2002. (Cited on p. 74)
[159] K. Murota and A. Tamura: On circuit valuation of matroids, Advances in
Applied Mathematics, 26 (2001), 192-225. (Cited on p. 75)
[160] K. Murota and A. Tamura: New characterizations of M-convex functions
and their applications to economic equilibrium models with indivisibilities,
Discrete Applied Mathematics (2003), in press. (Cited on pp. 176, 333, 344)
[161] K. Murota and A. Tamura: Application of M-convex submodular flow problem
to mathematical economics, in: P. Eades and T. Takaoka, eds., Algorithms
and Computation, Lecture Notes in Computer Science, 2223, Springer-Verlag,
2001, 14-25; Japan Journal of Applied Mathematics, to appear. (Cited on
p. 345)
[162] K. Murota and A. Tamura: Proximity theorems of discrete convex functions,
RIMS Preprint 1358, Kyoto University, June 2002. (Cited on pp. 158, 228,
244)
[163] H. Nagamochi and T. Ibaraki: Computing edge-connectivity in multigraphs
and capacitated graphs, SIAM Journal on Discrete Mathematics, 5 (1992),
54-64. (Cited on p. 290)
[164] T. Nakasawa: Zur Axiomatik der linearen Abhangigkeit, I, II, III, Science
Reports of the Tokyo Bunrika Daigaku, Section A, 2 (1935), 235-255; 3 (1936),
45-69; 3 (1936), 123-136. (Cited on p. 74)
[165] H. Narayanan: Submodular Functions and Electrical Networks, Annals of Dis-
crete Mathematics, 54, North-Holland, Amsterdam, 1997. (Cited on p. 37)
[166] G. L. Nemhauser, A. H. G. Rinnooy Kan, and M. J. Todd, eds.: Optimization, Handbooks in Operations Research and Management Science, Vol. 1, Elsevier Science, Amsterdam, 1989. (Cited on pp. 36, 244)
[167] G. L. Nemhauser and L. A. Wolsey: Integer and Combinatorial Optimization,
John Wiley and Sons, New York, 1988. (Cited on pp. 36, 99, 244, 278)
[168] H. Nikaido: Convex Structures and Economic Theory, Academic Press, New
York, 1968. (Cited on p. 327)
[169] J. Nocedal and S. J. Wright: Numerical Optimization, Springer-Verlag, New
York, 1999. (Cited on p. 36)
[170] J. G. Oxley: Matroid Theory, Oxford University Press, Oxford, U.K., 1992.
(Cited on p. 74)
[171] H. Perfect: Independence spaces and combinatorial problems, Proceedings of
the London Mathematical Society, 19 (1969), 17-30. (Cited on p. 279)
[172] M. Queyranne: Minimizing symmetric submodular functions, Mathematical
Programming, 82 (1998), 3-12. (Cited on p. 290)
[173] M. Quinzii: Core and equilibria with indivisibilities, International Journal of
Game Theory, 13 (1984), 41-61. (Cited on p. 327)
[174] N. Radics and A. Recski: Applications of combinatorics to statics—Rigidity
of grids, Discrete Applied Mathematics, 123 (2002), 473-485. (Cited on
p. 361)
[175] A. Recski: Matroid Theory and Its Applications in Electric Network Theory
and in Statics, Springer-Verlag, Berlin, 1989. (Cited on pp. 74, 361)
[176] R. T. Rockafellar: Convex Analysis, Princeton University Press, Princeton,
NJ, 1970. (Cited on pp. 84, 85, 99)
[177] R. T. Rockafellar: Conjugate Duality and Optimization, SIAM Regional Con-
ference Series in Applied Mathematics 16, SIAM, Philadelphia, 1974. (Cited
on pp. 99, 235, 242)
[178] R. T. Rockafellar: Network Flows and Monotropic Optimization, John Wiley
and Sons, New York, 1984. (Cited on pp. 53, 60, 64, 74, 132, 247, 253, 278)
[179] R. T. Rockafellar and R. J.-B. Wets: Variational Analysis, Springer-Verlag,
Berlin, 1998. (Cited on p. 99)
[180] A. E. Roth and M. A. O. Sotomayor: Two-Sided Matching—A Study in Game-
Theoretic Modelling and Analysis, Cambridge University Press, Cambridge,
U.K., 1990. (Cited on p. 344)
[181] A. Schrijver: Theory of Linear and Integer Programming, John Wiley and
Sons, New York, 1986. (Cited on pp. 88, 89, 99)
[182] A. Schrijver: A combinatorial algorithm minimizing submodular functions in
strongly polynomial time, Journal of Combinatorial Theory (B), 80 (2000),
346-355. (Cited on pp. 290, 322)
[183] A. Schrijver: Combinatorial Optimization—Polyhedra and Efficiency,
Springer-Verlag, Heidelberg, Germany, 2003. (Cited on pp. 74, 119, 279)
[184] L. S. Shapley: On network flow functions, Naval Research Logistics Quarterly, 8 (1961), 151-158. (Cited on p. 74)
[185] L. S. Shapley: Complements and substitutes in the optimal assignment prob-
lem, Naval Research Logistics Quarterly, 9 (1962), 45-48. (Cited on p. 74)
[186] L. S. Shapley: Cores of convex games, International Journal of Game Theory,
1 (1971), 11-26 (errata, 199). (Cited on p. 345)
[187] L. S. Shapley and H. Scarf: On cores and indivisibilities, Journal of Mathe-
matical Economics, 1 (1974), 23-37. (Cited on p. 327)
[188] A. Shioura: An algorithmic proof for the induction of M-convex functions
through networks, Research Reports on Mathematical and Computing Sci-
ences, B-317, Tokyo Institute of Technology, July 1996. (Cited on p. 278)
[189] A. Shioura: A constructive proof for the induction of M-convex func-
tions through networks, Discrete Applied Mathematics, 82 (1998), 271-278.
(Cited on p. 278)
[190] A. Shioura: Minimization of an M-convex function, Discrete Applied Mathe-
matics, 84 (1998), 215-220. (Cited on pp. 176, 321)
[191] A. Shioura: Level set characterization of M-convex functions, IEICE Trans-
actions on Fundamentals of Electronics, Communications and Computer Sci-
ences, E83-A (2000), 586-589. (Cited on p. 176)
[192] A. Shioura: Fast scaling algorithms for M-convex function minimization
with application to the resource allocation problem, IEICE Technical Report
COMP 2002-43, The Institute of Electronics, Information and Communica-
tion Engineers, 2002. Discrete Applied Mathematics, to appear. (Cited on
p. 321)
[193] D. D. Siljak: Large-Scale Dynamic Systems—Stability and Structure, North-
Holland, New York, 1978. (Cited on p. 42)
[194] J. Stoer and C. Witzgall: Convexity and Optimization in Finite Dimensions I,
Springer-Verlag, Berlin, 1970. (Cited on pp. 85, 99)
[195] K. Sugihara: Machine Interpretation of Line Drawings, MIT Press, Cam-
bridge, MA, 1986. (Cited on p. 361)
[196] L.-G. Svensson: Competitive equilibria with indivisible goods, Journal of Eco-
nomics, 44 (1984), 373-386. (Cited on p. 327)
[197] A. Tamura: Coordinatewise domain scaling algorithm for M-convex function
minimization, in: W. J. Cook and A. S. Schulz, eds., Integer Programming
and Combinatorial Optimization, Lecture Notes in Computer Science, 2337,
Springer-Verlag, 2002, 21-35. (Cited on pp. 176, 321)
[198] A. Tamura: On convolution of L-convex functions, Optimization Methods and
Software, to appear. (Cited on p. 244)
[199] E. Tardos, C. A. Tovey, and M. A. Trick: Layered augmenting path algo-
rithms, Mathematics of Operations Research, 11 (1986), 362-370. (Cited
on p. 312)
[200] N. Tomizawa: Theory of hyperspaces (XVI)—On the structure of hedrons
(in Japanese), Papers of the Technical Group on Circuit and System Theory,
Institute of Electronics and Communication Engineers of Japan, CAS82-174,
1983. (Cited on p. 120)
[201] N. Tomizawa and M. Iri: An algorithm for solving the "independent assign-
ment" problem with application to the problem of determining the order
of complexity of a network (in Japanese), Transactions of the Institute of
Electronics and Communication Engineers of Japan, 57A (1974), 627-629.
(Cited on pp. 6, 7)
[202] D. M. Topkis: Minimizing a submodular function on a lattice, Operations
Research, 26 (1978), 305-321. (Cited on pp. 202, 244)
[203] D. M. Topkis: Supermodularity and Complementarity, Princeton University
Press, Princeton, NJ, 1998. (Cited on pp. 37, 244, 345)
[204] G. van der Laan, D. Talman, and Z. Yang: Existence of an equilibrium in a
competitive economy with indivisibilities and money, Journal of Mathematical
Economics, 28 (1997), 101-109. (Cited on p. 327)
[205] B. L. van der Waerden: Algebra, Springer-Verlag, Berlin, 1955. (Cited on
p. 73)
[206] R. J. Vanderbei: Linear Programming: Foundations and Extensions, 2nd ed.,
Kluwer Academic, Boston, 2001. (Cited on pp. 88, 99)
[207] R. S. Varga: Matrix Iterative Analysis, 2nd ed., Springer-Verlag, Berlin, 2000.
(Cited on p. 42)
[208] J. Vygen: A note on Schrijver's submodular function minimization algorithm,
Journal of Combinatorial Theory (B), to appear. (Cited on p. 296)
[209] J. Wako: A note on the strong core of a market with indivisible goods, Journal
of Mathematical Economics, 13 (1984), 189-194. (Cited on p. 327)
[210] C. Wallacher and U. T. Zimmermann: A polynomial cycle canceling algorithm
for submodular flows, Mathematical Programming, 86 (1999), 1-15. (Cited
on p. 313)
[211] D. J. A. Welsh: Matroid Theory, Academic Press, London, 1976. (Cited on
pp. 74, 279)
[212] N. White, ed.: Theory of Matroids, Cambridge University Press, London,
1986. (Cited on p. 74)
[213] N. White, ed.: Combinatorial Geometries, Cambridge University Press, Lon-
don, 1987. (Cited on pp. 74, 279)
[214] N. White, ed.: Matroid Applications, Cambridge University Press, London,
1992. (Cited on p. 74)
[215] W. Whiteley: Matroids and rigid structures, in: N. White, ed., Matroid Appli-
cations, Cambridge University Press, London, 1992, Chapter 1, 1-53. (Cited
on p. 361)
[216] W. Whiteley: Some matroids from discrete applied geometry, in: J. E. Bonin,
J. G. Oxley, and B. Servatius, eds., Matroid Theory, American Mathematical
Society, Providence, RI, 1996, 171-311. (Cited on p. 361)
[217] W. Whiteley: Rigidity and scene analysis, in: J. E. Goodman and J.
O'Rourke, eds., Handbook of Discrete and Computational Geometry, CRC
Press, Boca Raton, FL, 1997, 893-916. (Cited on p. 361)
[218] H. Whitney: On the abstract properties of linear dependence, American Jour-
nal of Mathematics, 57 (1935), 509-533. (Cited on pp. 5, 6, 8, 74)
[219] Z. Yang: Equilibrium in an exchange economy with multiple indivisible com-
modities and money, Journal of Mathematical Economics, 33 (2000), 353-365.
(Cited on p. 327)
[220] L. A. Zadeh and C. A. Desoer: Linear System Theory, McGraw-Hill, New
York, 1963. (Cited on pp. 351, 355)
[221] U. Zimmermann: Minimization of some nonlinear functions over polyma-
troidal network flows, Annals of Discrete Mathematics, 16 (1982), 287-309.
(Cited on p. 176)
[222] U. Zimmermann: Negative circuits for flows and submodular flows, Discrete
Applied Mathematics, 36 (1992), 179-189. (Cited on p. 313)
Index

accurate number, 347 entering, 53
active triple, 297 leaving, 52
acyclic, 107 augmenting path, 60, 273, 274
admissible potential, 122 6-, 297
affine hull, 78 auxiliary network, 252, 263
agent, 324
aggregate cost function, 335 base, 105
aggregation extreme, 105
of function to subset, 143, 162 matrix, 69
by network transformation, 272 matroid, 70
algebraically independent, 354 base family
algorithm matrix, 69
competitive equilibrium, 344 matroid, 70
conjugate scaling, 320 valuated matroid, 72
cycle-canceling, 313 base polyhedron, 18, 105
domain reduction, 284 integral, 18
domain reduction scaling, 287 biconjugate function, 82
fully combinatorial, 290 integer, 212
greedy, 3, 108 bipartite graph, 89
IFF fixing, 300 bipartite matching, 89
IFF scaling, 299 Birkhoff's representation theorem, 292
L-convex function minimization, Boolean lattice, 104
305, 306, 308 boundary, 53
M-convex function minimization, branch, 52
281, 283, 284, 287 budget set, 324
primal-dual, 315
pseudopolynomial, 288 certificate of optimality, 12
Schrijver's, 293 chain, 88
steepest descent, 281, 305, 306 characteristic curve, 54, 251
steepest descent scaling, 283, 308 discrete, 57
strongly polynomial, 288 characteristic vector, 16
submodular function minimization, chemical process, 349
293, 299, 300 Choquet integral, 16, 104
successive shortest path, 312 closed convex function, 79
two-stage, 310 closed convex hull, 78
weakly polynomial, 288 closed interval, 77
arc, 52 closure

concave function, 216 contraction
convex function, 93 normal, 45
convex set, 78 unit, 45
coboundary, 53, 248 convex closure
another convention, 253 function, 93
cocontent, 55 set, 78
combinatorial optimization, 3 convex combination, 78
commodity convex cone, 78
divisible, 327 convex conjugate, 10, 81
indivisible, 323 discrete, 212
compartmental matrix, 43 convex extensible, 93
competitive economy, 324 convex extension, 93
competitive equilibrium, 325 local, 93
complementarity, 88 convex function, 2, 9, 77
complements, 62 closed, 79
concave closure, 216 dual-integral polyhedral, 161
concave conjugate, 11, 81 integral polyhedral, 161
discrete, 212 laminar, 141
concave extensible, 93 polyhedral, 80
concave extension, 93 positively homogeneous, 82
concave function, 9, 78 proper, 77
quasi-separable, 334 quadratic, 40
separable, 333 quasi, 168
conductance, 41 quasi-separable, 140
cone, 78 separable, 10, 95, 140, 182
convex, 78 strictly, 77
L-convex, 131 univariate, 10
M-convex, 119 convex hull, 78
polar, 82 closed, 78
conformal decomposition, 64 convex polyhedron, 78
conjugacy theorem convex program, 2
closed proper M-/L-convex, 210 M-, 235
in convex analysis, 11, 82 convex set, 2, 78
discrete M-/L-convex, 30, 212 convexity
polyhedral M-/L-convex, 209 discrete midpoint, 23, 129, 180
conjugate function function, 77
concave, 81, 212 in intersection, 92
convex, 81, 212 midpoint, 9
conjugate scaling, 319 in Minkowski sum, 92
conjugate scaling algorithm, 320 quasi, 168
conservation law, 54 set, 78
constitutive equation, 54, 349 convolution
constraint, 1 infimal, 80
consumer, 323 integer infimal, 143
consumption, 323 by network transformation, 272
content, 55 cost function
aggregate, 335 domain reduction scaling algorithm,
flow, 53, 246, 255, 256 287
flow boundary, 256 dual-integral polyhedral
producer's, 324 convex function, 161
reduced, 249 L-convex function, 191
tension, 53 M-convex function, 161
current, 41, 53 dual integrality
current potential, 55 intersection theorem, 20, 114
cut capacity function, 247 linear programming, 89
cycle minimum cost flow problem, 252
negative, 122, 252, 263 polyhedral convex function, 161
simple, 62 polyhedral L-convex function, 191
cycle-canceling algorithm, 313 polyhedral M-convex function, 161
submodular flow problem, 261
decreasing marginal return, 330 dual linear program, 87
demand dual problem, 87
correspondence, 325 dual variable, 53
set, 325 duality, 2, 11
descent direction, 147 Edmonds's intersection theorem,
diagonal dominance, 41 20
directed graph, 52, 88 Fenchel, 85
directional derivative, 80 Fenchel-type, 222, 225
Dirichlet form, 45 L-separation, 218
discrete Legendre-Fenchel transforma- linear programming, 87
tion, 13, 212 M-separation, 217
discrete midpoint convexity matroid intersection, 225
function, 23, 180 separation for convex functions,
set, 129 84
discrete separation theorem separation for convex sets, 35, 83
generic form, 13, 216 separation for L-convex functions,
L-convex function, 218 218
L-convex set, 36, 126 separation for M-convex functions,
M-convex function, 217 217
M-convex set, 36, 114 separation for submodular func-
submodular function, 17, 111 tions, 17, 111
submodular function (as special strong, 87
case of L-separation), 33, 224 valuated matroid intersection, 225
discreteness weak, 87
in direction, 10 weight splitting, 34, 225
in value, 13 dynamical degree, 352
distance function, 122
distributive lattice, 292 economy
distributive law, 292 of Arrow-Debreu type, 323
divisible commodity, 327 Edmonds's intersection theorem, 3, 20,
domain reduction algorithm, 284 112
(as special case of Fenchel-type polyhedral M-convex function, 29,
duality), 34, 224 56, 160
effective domain polyhedral M^-convex function, 29,
function over Rn, 9, 21, 77 47, 162
function over Zn, 21 simultaneous, 69
set function, 103 weak, 137
electrical network, 41, 43, 348 exchange capacity, 284, 312
multiterminal, 53 exchange economy, 327
elementary vector, 64 extension
entering arc, 53 concave, 93
epigraph, 79 convex, 93
equilibrium distance function, 165
competitive, 325 local convex, 93
economy, 325 Lovasz, 16, 104, 111
electrical network, 55 partial order, 108
exchange axiom set function, 16, 104
(B-EXC[R]), 118 extreme base, 105
(B-EXC+[R]), 118
Farkas lemma, 50, 87
(B-EXC[Z]), 18, 101
feasible
(B-EXC+[Z]), 102
δ-, 296
(B-EXC_[Z]), 103
dual problem, 236
(B-EXCw[Z]), 103
flow, 247, 258
(B♮-EXC[R]), 118
minimum cost flow problem, 247
(B"-EXC[Z]), 117 potential, 122
local, 135 primal problem, 235
M-convex function, 26, 58, 133 set, 1
M-convex polyhedron, 118 submodular flow problem, 258
M-convex set, 18, 101 Fenchel duality, 12, 85
M^-convex function, 27, 134 Fenchel transformation, 81
M^-convex polyhedron, 118 Fenchel-type duality
M^-convex set, 117 generic form, 13
(M-EXC[R]), 29, 56, 160 L-convex function, 32, 222
(M-EXC'[R]), 160 M-convex function, 32, 222
(M-EXC[Z]), 26, 58, 133 submodular function, 225
(M-EXC'[Z]), 26, 133 fixed constant, 347
(M-EXCioc[Z]), 135 flow, 53
(M-EXC w [Zj), 137 ^-feasible, 296
(M>i-EXC[R]), 29, 47, 162 feasible, 247, 258
(M^-EXC'[R]), 162 Frank's discrete separation theorem,
(M^-EXC+[R]), 48 17, 111
(M^-EXC[Z]), 27, 134 (as special case of L-separation),
(Mb-EXC'[Z]), 134 33, 224
(-M"-EXC[Z]), 330 Frank's weight-splitting theorem, 34,
matroid, 69 225
multiple, 333 fully combinatorial algorithm, 290
fundamental circuit, 149 total, 326
initial vertex, 53
g-polymatroid, 117 inner product, 79
generalized polymatroid, 117 integer biconjugate, 212
generator, 45 integer infimal convolution, 143
global minimizer, 79 integer interval, 92
global optimality, 2 integer subdifferential, 166
global optimum, 9 integral base polyhedron, 18
goods integral L-convex polyhedron, 131
divisible, 327 integral M-convex polyhedron, 118
indivisible, 323 integral neighborhood, 93
gradient, 80 integral polyhedral
graph convex function, 161
acyclic, 107 L-convex function, 191
bipartite, 89 L''-convex function, 192
directed, 52, 88 M-convex function, 161
Grassmann-Pliicker relation, 69 M^-convex function, 162
greedy algorithm, 3, 108 integral polyhedron, 90
gross substitutes property, 153, 331 integrality
stepwise, 155, 331 dual, 161, 252, 261
ground set, 70 linear programming, 89
gyrator, 361 minimum cost flow problem, 252
polyhedral convex function, 161
Hamiltonian path problem, 257 polyhedral L-convex function, 191
hole free, 90 polyhedral M-convex function, 161
polyhedron, 90
ideal, 107 primal, 252, 261
IFF fixing algorithm, 300 submodular flow problem, 261
IFF scaling algorithm, 298, 299 integrally concave function, 94
in kilter, 314 integrally convex function, 7, 94
inaccurate number, 347 submodular, 189
incidence integrally convex set, 96
chain, 88 intersection
graph, 88 convexity in, 92
topological, 347 M-convex, 219
income, 324 matroid, 3
independent set submodular polyhedron, 20
matrix, 68 valuated matroid, 225
matroid, 70 intersection theorem
indicator function, 79, 90 Edmonds's, 3, 20, 112
indivisible commodity, 323 Edmonds's (as special case of Fenchel-
indivisible goods, 323 type duality), 34, 224
infimal convolution, 80 M-convex, 219
integer, 143 valuated matroid, 225
by network transformation, 272 weighted matroid, 225
initial endowment, 324 interval
closed, 77 L^-convex set, 121, 128
integer, 92 Lagrange duality, 234
open, 77 Lagrangian function, 236
dual, 242
jump system, 120 laminar convex function, 141
by network transformation, 273
kilter laminar family, 141
diagram, 53, 251 Laplace transform, 351
in, 314 lattice, 292
out of, 314 distributive, 292
Kirchhoff's law, 349 sub-, 104
current, 353 leading principal minor, 40
voltage, 353 leading principal submatrix, 40
Konig-Egervary theorem for mixed ma- leaving arc, 52
trix, 358 Legendre-Fenchel transform
concave, 11, 81
L-concave function, 22 convex, 10, 81
polyhedral, 190 discrete, 13, 212
L-convex cone, 131 Legendre-Fenchel transformation
L-convex function, 8, 22, 177 concave, 11, 81
dual-integral polyhedral, 191 convex, 10, 81
integral polyhedral, 191 discrete, 13, 212
polyhedral, 190 Legendre transformation, 81
positively homogeneous, 193 level set, 172
quadratic, 52, 182 linear extension
quasi, 199 partial order, 108
semistrictly quasi, 199 set function, 16, 104
L-convex polyhedron, 123, 131 linear order, 108
integral, 131 linear program, 87
L-convex set, 22, 121 dual problem, 87
L-optimality criterion, 185, 193 primal problem, 87
quasi, 201 linear programming, 86
L-proximity theorem, 186 duality, 87
quasi, 201 linearity in direction 1, 177, 190
L-separation theorem, 33, 218 local convex extension, 93
L2-optimality criterion, 232 local optimality, 2
L2-proximity theorem, 232 local optimum, 10
L2-convex function, 229 Lovasz extension, 16, 104, 111
L2-convex set, 128 LP, 87
L^-convex function, 229 duality, 87
L2-convex set, 129
L''-convex function, 8, 23, 178 M-concave function, 8, 26
integral polyhedral, 192 polyhedral, 160
polyhedral, 192 M-convex cone, 119
quadratic, 48, 52, 182 M-convex function, 8, 26, 133
L^-convex polyhedron, 129, 131 dual-integral polyhedral, 161
integral polyhedral, 161 incidence (graph), 88
polyhedral, 160 M-, 42
positively homogeneous, 164 mixed, 354
quadratic, 52, 139 mixed polynomial, 355
quasi, 169 mixed skew-symmetric, 361
semistrictly quasi, 169 node admittance, 42
M-convex intersection polynomial, 71, 354
problem, 219, 264 positive-definite, 39
theorem, 219 positive-semidefinite, 39
M-convex polyhedron, 108, 118 principal sub-, 40
integral, 118 totally unimodular, 88
M-convex program, 235 matroid, 70
M-convex set, 27, 101 induction through a graph, 270
M-convex submodular flow problem, intersection problem, 34, 225
256 valuated, 72
economic equilibrium, 341 max-flow min-cut theorem
M-matrix, 42 for submodular flow, 259
M-minimizer cut, 149 maximum submodular flow problem,
with scaling, 158 259
M-optimality criterion, 148, 163 maximum weight circulation problem,
quasi, 173 61
mechanical system, 350
M-proximity theorem, 156
midpoint convexity, 9
quasi, 174
discrete function, 23, 180
M-separation theorem, 33, 217
discrete set, 129
M2-optimality criterion, 227, 228
Miller's discrete convex function, 98
M2-proximity theorem, 228
min-max relation, 2
M2-convex function, 226
minimizer, 79
Mg-convex set, 116
global, 9, 79
M2-convex function, 226
integrally convex function, 94
M^-convex set, 117 L-convex function, 185, 305
M^-convex function, 8, 27, 134 L2-convex function, 232
integral polyhedral, 162 local, 10
polyhedral, 161 M-convex function, 148, 281
quadratic, 48, 52, 139 M2-convex function, 227, 228
M^-convex polyhedron, 117, 118 maximal, 290, 291, 304, 307
M^-convex set, 102, 117 minimal, 290, 291, 305, 307
Markovian, 45 submodular set function, 288
matching, 89 minimizer cut
bipartite, 89 M-convex function, 149
perfect, 89 M-convex function with scaling,
weighted, 89, 266 158
mathematical programming, 1 quasi M-convex function, 174
matrix quasi M-convex function with scal-
compartmental, 43 ing, 175
incidence (chain), 88 minimum cost flow problem, 53, 245
integer flow, 246 optimal value function, 236
minimum cut, 316 optimality
minimum spanning tree problem, 149 global, 2, 9
Minkowski sum, 80, 90 local, 2, 10
convexity in, 92 optimality criterion
discrete, 90 integrally convex function, 94, 95
integral, 90 L-convex function, 185, 193
minor, 40, 359 L2-convex function, 232
leading principal, 40 M-convex function, 148, 163
principal, 40 M-convex submodular flow, 262-
mixed matrix, 354 264
mixed polynomial matrix, 355 M2-convex function, 219, 227, 228
mixed skew-symmetric matrix, 361 minimum cost flow, 249, 252
money, 323 by negative cycle, 252, 263, 264
monotonicity, 54 by potential, 249, 260, 262
multimodular function, 183 quasi L-convex function, 201
multiple exchange axiom, 333 quasi M-convex function, 173
multiterminal electrical network, 53 submodular flow, 260
submodular set function, 185
negative cycle, 122, 252, 263 sum of M-convex functions, 219
criterion, 252, 263, 264 valuated matroid intersection, 225
negative support, 18 weighted matroid intersection, 225
neighborhood, integral, 93 optimization
network, 53 combinatorial, 3
auxiliary, 252, 263 continuous, 1
electrical, 53 discrete, 3
transformation by, 270 optimum
network flow global, 9
duality, 268 local, 10
electrical network, 41, 43 out of kilter, 314
L-convexity, 24, 31, 56, 58, 270
M-convexity, 28, 31, 56, 58, 270 pairing, 79
maximum weight circulation, 61 parallel, 62
minimum cost flow, 245 partial order
multiterminal, 53 acyclic graph, 107
submodular flow, 255 extreme base, 108
no complementarities property, 332 perfect matching, 89
strong, 332 minimum weight, 89, 266
node, 52 Poisson equation, 41, 43, 47
node admittance matrix, 42 polar cone, 82
normal contraction, 45 polyhedral
convex function, 25, 80
objective function, 1 L-concave function, 190
off-diagonal nonpositivity, 41 L-convex function, 190
open interval, 77 L^-convex function, 192
optimal potential, 89, 251 M-concave function, 160
M-convex function, 160 M2-convex function, 226
M^-convex function, 161 Mg-convex set, 117
method, 8 polyhedral M-convex function, 161
polyhedron proper convex function, 77
base, 105 proximity theorem, 156
convex, 78 L-convex function, 186
integral, 90 L2-convex function, 232
integral L-convex, 131 M-convex function, 156
integral M-convex, 118 M2-convex function, 228
L-convex, 123, 131 quasi L-convex function, 201
L^-convex, 129, 131 quasi M-convex function, 174
M-convex, 108, 118 pseudopolynomial algorithm, 288
M^-convex, 117, 118
rational, 90 quadratic
submodular, 112 form, 39
polynomial matrix, 71, 354 function, 39
mixed, 355 L-convex function, 182
polytope, 90 L^-convex function, 48, 52, 182
positive definite, 39 M-convex function, 139
positive semidefinite, 39 M-convex function, 48, 52, 139
positive support, 18 quasi convex, 168
positively homogeneous semistrictly, 168
function, 7, 82 quasi L-convex function, 199
L-convex function, 193 quasi L-optimality criterion, 201
M-convex function, 164 quasi L-proximity theorem, 201
potential, 41, 53, 89, 248 quasi linear, 324
criterion, 249, 260, 262 quasi M-convex function, 169
optimal, 251 quasi M-minimizer cut, 174
primal-dual algorithm, 315 with scaling, 175
primal integrality, 252, 261 quasi M-optimality criterion, 173
intersection theorem, 20, 114 quasi M-proximity theorem, 174
linear programming, 89 quasi-separable
primal problem, 87 concave function, 334
principal minor, 40 convex function, 140
principal submatrix, 40 quasi submodular, 198
leading, 40 semistrictly, 198
producer, 323
production, 323 rank function
profit, 324 matrix, 69
function, 324 matroid, 70
projection rational polyhedron, 90
base polyhedron, 117 reduced cost, 249, 251
function to subset, 143, 162 relative interior, 78
M-convex function, 134 reservation value function, 325
M-convex polyhedron, 118 resistance, 41
M-convex set, 102 resolvent, 45
388 Index

resource allocation problem, 4, 176 spanning tree, 149


restriction stable marriage problem, 345
function to interval, 92 stable matching problem, 345
function to subset, 143, 162 steepest descent algorithm
L-convex function, 178 L-convex function, 305, 306
L-convex polyhedron, 131 M-convex function, 281
L-convex set, 121 steepest descent scaling algorithm
L2-convex function, 229 L-convex function, 308
L2-convex set, 129 M-convex function, 283
polyhedral L-convex function, 192 stepwise gross substitutes property, 155,
rigidity, 361 331
ring family, 104, 107, 292 stoichiometric coefficient, 350
strictly convex function, 77
saddle-point theorem, 238 strong duality, 87
scaling, 145 strong no complementarities property,
conjugate, 319 332
cost, 318 strongly polynomial algorithm, 288
domain, 145 structural equation, 54, 349
nonlinear, 170, 199 subdeterminant, 40, 359
scaling algorithm subdifferential, 80
L-convex function, 308 concave function, 217
M-convex function, 283, 287 discrete function, 166
M-convex submodular flow, 320 integer, 166
semigroup, 45 subgradient, 80
semistrictly quasi convex, 168 discrete function, 166
semistrictly quasi L-convex, 199 sublattice, 104
semistrictly quasi M-convex, 169 submatrix
semistrictly quasi submodular, 198 leading principal, 40
separable concave function, 333 submodular, 62, 206
quasi, 334 function, 16, 44, 70, 104
separable convex function, 10, 95, 140, function on distributive lattice, 292
182 integrally convex-function, 7, 189
with chain condition, 182 polyhedron, 20, 112
quasi, 140 utility function, 330
separation theorem submodular flow problem, 255
convex function, 2, 11, 84 economic equilibrium, 341
convex set, 35, 83 feasibility theorem, 258
generic discrete, 13, 216 M-convex, 256
L-convex function, 218 maximum, 259
L-convex set, 36, 126 submodular function minimization
M-convex function, 217 IFF fixing algorithm, 300
M-convex set, 36, 114 IFF scaling algorithm, 298, 299
submodular function, 17, 111 Schrijver's algorithm, 293
series, 62 submodularity, 44, 177, 190
simple cycle, 62 inequality, 16, 44, 177
single improvement property, 332 local, 180
Index 389

substitutes, 62 voltage potential, 55


successive shortest path algorithm, 312
sum weak duality, 87
weak
of functions, 80 exchange axiom, 137
of M-convex functions, 226 weakly polynomial algorithm, 288
supermodular, 16, 62, 105, 145, 206 weight splitting
function 105 matroid intersection, 34, 225
supply valuated matroid intersection, 225
correspondence, 324
set, 324 z-transform, 355
support
function, 82
negative, 18
positive, 18
system parameter, 347

tension, 53
another convention, 253
terminal vertex, 52, 53
tight set, 108
transformation by network, 269
of flow type, 269
of potential type, 269
transitive, 107, 119
translation submodularity, 23, 44, 178
triangle inequality, 24, 122
two-stage algorithm, 310

unimodular
totally, 88
unique-min condition, 266
unit contraction, 45
unit demand preference, 334
univariate function, 10
discrete convex, 95
polyhedral convex, 80
utility function, 324

valuated matroid, 7, 72, 225


intersection problem, 225
valuation, 72
variational formulation, 43, 55
vertex, 52
initial, 53
terminal, 53
voltage, 41, 53
