
Lectures on

Network Systems

Francesco Bullo
With contributions by
Jorge Cortés
Florian Dörfler
Sonia Martínez
Lectures on Network Systems
Francesco Bullo
Version v0.85(i) (6 Aug 2016).
With contributions by J. Cortés, F. Dörfler, and S. Martínez

This document is intended for personal use: you are allowed to print this pdf
file and/or photocopy it. All other rights are reserved, e.g., this document (in
whole or in part) may not be posted online or shared in any way without express
consent. Copyright © 2012-16.
Contents

I Linear Systems

1 Motivating Problems and Systems
  1.1 Social influence networks: opinion dynamics
  1.2 Wireless sensor networks: averaging algorithms
  1.3 Compartmental networks: dynamical flows among compartments
  1.4 Appendix: Robotic networks in cyclic pursuit and balancing
  1.5 Appendix: Design problems in wireless sensor networks
  1.6 Exercises

2 Elements of Matrix Theory
  2.1 Linear systems and the Jordan normal form
  2.2 Row-stochastic matrices and their spectral radius
  2.3 Perron-Frobenius theory
  2.4 Exercises

3 Elements of Graph Theory
  3.1 Graphs and digraphs
  3.2 Paths and connectivity in undirected graphs
  3.3 Paths and connectivity in digraphs
  3.4 Weighted digraphs
  3.5 Appendix: Database collections and software libraries
  3.6 Exercises

4 The Adjacency Matrix
  4.1 The adjacency matrix
  4.2 Algebraic graph theory: basic and prototypical results
  4.3 Graph theoretical characterization of irreducible matrices
  4.4 Graph theoretical characterization of primitive matrices
  4.5 Elements of spectral graph theory
  4.6 Exercises

5 Discrete-time Averaging Systems
  5.1 Averaging with primitive row-stochastic matrices
  5.2 Averaging with reducible matrices
  5.3 Averaging with reducible matrices and multiple sinks
  5.4 Appendix: Design of graph weights
  5.5 Appendix: Centrality measures
  5.6 Exercises

6 The Laplacian Matrix
  6.1 The Laplacian matrix
  6.2 The Laplacian in mechanical networks of springs
  6.3 The Laplacian in electrical networks of resistors
  6.4 Properties of the Laplacian matrix
  6.5 Graph connectivity and the rank of the Laplacian
  6.6 Appendix: Community detection via algebraic connectivity
  6.7 Exercises

7 Continuous-time Averaging Systems
  7.1 Example systems
  7.2 Continuous-time linear systems and their convergence properties
  7.3 The Laplacian flow
  7.4 Second-order Laplacian flows
  7.5 Appendix: Design of weight-balanced digraphs
  7.6 Appendix: Distributed optimization using the Laplacian flow
  7.7 Exercises

8 The Incidence Matrix and its Applications
  8.1 The incidence matrix
  8.2 Properties of the incidence matrix
  8.3 Distributed estimation from relative measurements
  8.4 Appendix: Cycle and cutset spaces
  8.5 Exercises

9 Positive and Compartmental Systems
  9.1 Introduction and example systems
  9.2 Positive systems
  9.3 Compartmental systems
  9.4 Table of asymptotic behaviors for averaging and positive systems
  9.5 Exercises

II Topics in Averaging Systems

10 Convergence Rates, Scalability and Optimization
  10.1 Some preliminary calculations and observations
  10.2 Convergence factors for row-stochastic matrices
  10.3 Cumulative quadratic index for symmetric matrices
  10.4 Circulant network examples and scalability analysis
  10.5 Design of fastest distributed averaging
  10.6 Exercises

11 Time-varying Averaging Algorithms
  11.1 Examples and models of time-varying discrete-time algorithms
  11.2 Convergence over time-varying connected graphs
  11.3 Convergence over digraphs connected over time
  11.4 Analysis methods and proofs
  11.5 Time-varying algorithms in continuous-time
  11.6 Exercises

12 Randomized Averaging Algorithms
  12.1 Examples of randomized averaging algorithms
  12.2 A brief review of probability theory
  12.3 Randomized averaging algorithms
  12.4 Table of asymptotic behaviors for averaging systems

III Nonlinear Systems

13 Nonlinear Systems and Robotic Coordination
  13.1 Coordination in relative sensing networks
  13.2 Stability theory for dynamical systems
  13.3 A nonlinear rendezvous problem
  13.4 Flocking and Formation Control
  13.5 Rigidity and stability of the target formation
  13.6 Exercises

14 Coupled Oscillators: Basic Models
  14.1 History
  14.2 Examples
  14.3 Coupled phase oscillator networks
  14.4 Exercises

15 Networks of Coupled Oscillators
  15.1 Synchronization of identical oscillators
  15.2 Synchronization of heterogeneous oscillators
  15.3 Exercises

16 Virus Propagation: Basic Models
  16.1 The SI model
  16.2 The SIR model
  16.3 The SIS model
  16.4 Exercises

17 Virus Propagation in Contact Networks
  17.1 The stochastic network SI model
  17.2 The network SI model
  17.3 The network SIS model
  17.4 The network SIR model
  17.5 Exercises

18 Lotka-Volterra Population Dynamics
  18.1 The Lotka-Volterra population model: setup
  18.2 Two-species model and analysis
  18.3 General results for Lotka-Volterra models
  18.4 Cooperative Lotka-Volterra models
  18.5 Exercises

Bibliography
Preface

Books which try to digest, coordinate, get rid of the duplication, get rid of the less fruitful methods
and present the underlying ideas clearly of what we know now, will be the things the future
generations will value.
    Richard Hamming (1915-1998)

Topics These lecture notes are intended for first-year graduate students interested in network systems,
distributed algorithms, and cooperative control. The objective is to answer basic questions such as: What are
fundamental dynamical models of interconnected systems? What are the essential dynamical properties
of these models and how are they related to network properties? What are basic estimation, control, and
optimization problems for these dynamical models?
The book is organized in three parts: Linear Systems, Topics in Averaging Systems, and Nonlinear
Systems. The Linear Systems part, together with the part on Topics in Averaging Systems, includes

(i) several key motivating example systems drawn from social, sensor, and compartmental networks,
as well as additional ones from robotics,
(ii) basic concepts and results in matrix and graph theory, with an emphasis on Perron-Frobenius theory,
algebraic graph theory and linear dynamical systems,
(iii) averaging systems in discrete and continuous time, described by static, time-varying and random
matrices, and
(iv) positive and compartmental systems, described by Metzler matrices, with examples from ecology,
epidemiology and chemical kinetics.

The Nonlinear Systems part includes

(v) formation control and coordination problems for relative sensing networks,
(vi) networks of phase oscillator systems with an emphasis on the Kuramoto model and models of power
networks, and
(vii) virus propagation models, including lumped and network models as well as stochastic and determin-
istic models, and


(viii) population dynamic models, describing mutualism, competition and cooperation in multi-species
systems.

Teaching instructions These lecture notes are meant to be taught over a quarter-long course with a
total of 35 to 40 hours of contact time. On average, each chapter should require approximately 2 hours of
lecture time. Indeed, these lecture notes are an outgrowth of an introductory graduate course that I taught
at UC Santa Barbara over the last several years.
The intended audience is 1st year graduate students in Engineering, Sciences, and Applied Mathematics
programs. For the first part on Linear Systems, the required background includes competency in linear
algebra and only very basic notions of dynamical systems. For the part on Nonlinear Systems
(including coupled oscillators and virus propagation), the required background includes a calculus course.
The treatment is self-contained and does not require a nonlinear systems course.
For the benefit of instructors, these lecture notes are supplemented by three documents:

• a solution manual, available upon request by instructors at accredited institutions;
• an abbreviated version of these notes in slides/landscape format, especially suited for displaying on a
projector for classroom teaching; and
• an abbreviated version of these notes in classnotes format (with large sans-serif fonts, small margins),
especially suited as markup copy for classroom teaching.

The book, in its three formats, is available for download at: http://motion.me.ucsb.edu/book-lns.

Acknowledgments I am extremely grateful to Florian Dörfler for his contributions to

(i) Chapter 13 Nonlinear Systems and Robotic Coordination,


(ii) Chapter 14 Coupled Oscillators: Basic Models,
(iii) Chapter 15 Networks of Coupled Oscillators,
(iv) Sections 5.5, 6.6, 7.6, 8.4 and 10.3, as well as
(v) a large number of exercises.

I am extremely grateful to Jorge Cortés and Sonia Martínez for their fundamental contribution to my
understanding and our joint work on distributed algorithms and robotic networks; their scientific contribution is
most obviously present in

(i) Chapter 1 Motivating Problems and Systems,


(ii) Chapter 3 Elements of Graph Theory,
(iii) Chapter 4 The Adjacency Matrix.

I am extremely grateful to Alessandro Giua for detailed comments and insightful suggestions; his input
helped shape the early chapters. I am grateful to Noah Friedkin for instructive discussions about social
influence networks that influenced Chapter 5. I wish to thank Sandro Zampieri and Wenjun Mei for their
contribution to Chapters 16 and 17 and to Stacy Patterson for adopting an early version of these notes and

providing me with detailed feedback. I wish to thank Jason Marden and Lucy Pao for their invitation to
visit the University of Colorado at Boulder and deliver an early version of these lecture notes.
I also would like to acknowledge the generous support received from funding agencies. This book
is based on work supported in part by the Army Research Office through grants W911NF-11-1-0092 and
W911NF-15-1-0577, the Air Force Office of Scientific Research through grant FA9550-15-1-0138, and the
National Science Foundation through grants CPS-1035917 and CPS-1135819.
A special thank you goes to all students who took this course and all scientists who read these notes.
Particular thanks go to Alex Olshevsky, Ashish Cherukuri, Bala Kameshwar Poolla, Basilio Gentile, Catalin
Arghir, Deepti Kannapan, Fabio Pasqualetti, Francesca Parise, John W. Simpson-Porco, Pedro Cisneros-
Velarde, Peng Jia, Saber Jafarpour, Sepehr Seifi, Shadi Mohagheghi, Tyler Summers, and Vaibhav Srivastava,
for their contributions to these lecture notes and related homework.
Finally, I wish to thank Gabriella, Marcello, Lily and my whole family for their loving support.

Santa Barbara, California, USA
Francesco Bullo
29 Mar 2012 - 6 Aug 2016

Part I

Linear Systems

Chapter 1
Motivating Problems and Systems

In this introductory chapter, we introduce some example problems and systems from multiple disciplines.
The objective is to motivate our treatment of linear network systems in the following chapters. We look at
the following examples:

(i) In the context of social influence networks, we discuss a classic reference on how opinions evolve
and possibly reach a consensus in groups of individuals. Here, consensus means that the opinions of
the individuals are identical.
(ii) In the context of wireless sensor networks, we discuss distributed simple averaging algorithms and, in
the appendix, two advanced design problems in the context of parameter estimation and hypothesis
testing.
(iii) In the context of compartmental networks, we discuss dynamical flows among compartments, such
as arising in ecosystems.
(iv) Finally, in the context of robotic networks, we discuss simple robotic behaviors for cyclic pursuit and
balancing.

In all cases we are interested in presenting the basic models and motivating interest in understanding their
dynamic behaviors, such as the existence and attractivity of equilibria.
We present additional linear examples in later chapters and nonlinear examples in the second part.
For a similar valuable list of related and instructive examples, we refer to (Hendrickx 2008, Chapter 9)
and (Garin and Schenato 2010, Section 3.3). Other examples of multi-agent systems and applications can be
found in the following texts (Ren and Beard 2008; Bullo et al. 2009; Mesbahi and Egerstedt 2010; Cristiani
et al. 2014; Fuhrmann and Helmke 2015; Francis and Maggiore 2016).


1.1 Social influence networks: opinion dynamics


This example is an illustration of the rich literature on opinion dynamics, starting with the early works
by French (1956), Harary (1959), and DeGroot (1974). Specifically, we adopt the setup quite literally
from (DeGroot 1974).
We consider a group of n individuals who must act together as a team. Each individual has his own
subjective probability distribution F_i for the unknown value of some parameter (or, more simply, an
estimate of the parameter). We assume now that individual i is appraised of the distribution F_j of each
other member j ≠ i of the group. Then the DeGroot model predicts that the individual will revise its
distribution to be:

\[ F_i^+ = \sum_{j=1}^{n} a_{ij} F_j, \]

where a_{ij} denotes the weight that individual i assigns to the distribution of individual j when carrying
out this revision. More precisely, the coefficient a_{ii} describes the attachment of individual i to its own
opinion and a_{ij}, j ≠ i, is an interpersonal influence weight that individual i accords to individual j.

Figure 1.1: Interactions in a social influence network

In the DeGroot model, the coefficients a_{ij} satisfy the following constraints: they are nonnegative, that
is, a_{ij} ≥ 0, and, for each individual, the sum of self-weight and accorded weights equals 1, that is,
\sum_{j=1}^{n} a_{ij} = 1 for all i. In mathematical terms, the matrix

\[ A = \begin{bmatrix} a_{11} & \dots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nn} \end{bmatrix} \]

has nonnegative entries and each of its rows has unit sum. Such matrices are said to be row-stochastic.

Questions of interest are:

(i) Is this model of human opinion dynamics believable at all?

(ii) How does one measure the coefficients a_{ij}?

(iii) Under what conditions do the distributions converge to consensus? What is this value?

(iv) What are more realistic, empirically-motivated models, possibly including stubborn individuals or
antagonistic interactions?


1.2 Wireless sensor networks: averaging algorithms

Figure 1.2: A wireless sensor network composed of a collection of spatially-distributed sensors in a field and a
gateway node to carry information to an operator. The nodes are meant to measure environmental variables, such as
temperature, sound, pressure, and cooperatively filter and transmit the information to an operator.

A wireless sensor network is a collection of spatially-distributed devices capable of measuring physical and
environmental variables (e.g., temperature, vibrations, sound, light, etc), performing local computations,
and transmitting information to neighboring devices and, in turn, throughout the network (including,
possibly, an external operator).
Suppose that each node in a wireless sensor network has measured a scalar environmental quantity,
say x_i. Consider the following simplest distributed algorithm, based on the concepts of linear averaging:
each node repeatedly executes

\[ x_i^+ := \operatorname{average}\big( x_i, \{ x_j, \text{ for all neighbor nodes } j \} \big), \tag{1.1} \]

where x_i^+ denotes the new value of x_i. For example, for the graph in Figure 1.3, one can easily write
x_1^+ := (x_1 + x_2)/2, x_2^+ := (x_1 + x_2 + x_3 + x_4)/4, and so forth. In summary, the algorithm's
behavior is described by

\[ x^+ = \begin{bmatrix} 1/2 & 1/2 & 0 & 0 \\ 1/4 & 1/4 & 1/4 & 1/4 \\ 0 & 1/3 & 1/3 & 1/3 \\ 0 & 1/3 & 1/3 & 1/3 \end{bmatrix} x = A_{\text{wsn}}\, x, \]

Figure 1.3: Example graph

where the matrix A_wsn is again row-stochastic.
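As a quick numerical illustration, consider the following minimal Python sketch (the initial measurements are arbitrary placeholders): iterating the update x^+ = A_wsn x drives all node values to a common number.

import numpy as np

A_wsn = np.array([[1/2, 1/2, 0,   0  ],
                  [1/4, 1/4, 1/4, 1/4],
                  [0,   1/3, 1/3, 1/3],
                  [0,   1/3, 1/3, 1/3]])

x = np.array([1.0, -1.0, 2.0, 0.5])   # placeholder initial measurements
for _ in range(50):                   # repeated local averaging: x^+ = A_wsn x
    x = A_wsn @ x
print(x)  # all entries are (numerically) identical: the nodes reach consensus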
Questions of interest are:

(i) Does each node converge to a value? Is this value the same for all nodes?
(ii) Is this value equal to the average of the initial conditions?
(iii) What properties do the graph and the corresponding matrix need to have in order for the algorithm
to converge?
(iv) How quick is the convergence?


1.3 Compartmental networks: dynamical flows among compartments


Compartmental systems model dynamical processes characterized by conservation laws (e.g., mass, fluid,
energy) and by the flow of material between units known as compartments. The flow of energy and
nutrients (water, nitrates, phosphates, etc) in ecosystems is typically studied using compartmental modelling.
For example, Figure 1.4 illustrates a widely-cited water flow model for a desert ecosystem (Noy-Meir 1973).

Figure 1.4: Water flow model for a desert ecosystem. Precipitation flows into the soil compartment; water then
flows to plants (uptake) and to animals (drinking and herbivory), and exits through soil evaporation, drainage and
runoff, plant transpiration, and animal evaporation. The blue line denotes an inflow from the outside environment.
The red lines denote outflows into the outside environment.

If we let q_i denote the amount of material in compartment i, the mass balance equation for the ith
compartment is written as:

\[ \dot q_i = \sum_{j \neq i} (F_{ji} - F_{ij}) - F_{i0} + u_i, \]

where u_i is the inflow from the environment and F_{i0} is the outflow into the environment. We now assume
linear flows, that is, we assume that the flow F_{ij} from node i to node j (as well as to the environment) is
proportional to the mass quantity at i, that is, F_{ij} = f_{ij} q_i for a positive flow rate constant f_{ij}. Therefore
we can write

\[ \dot q_i = \sum_{j \neq i} (f_{ji} q_j - f_{ij} q_i) - f_{i0} q_i + u_i \]

and so, in vector notation, there exists an appropriate matrix C such that

\[ \dot q = C q + u. \]

For example, let us write down the compartmental matrix C for the water flow model in the figure. We
let q_1, q_2, q_3 denote the water mass in soil, plants and animals, respectively. Moreover, as in the figure, we
let f_{e-d-r}, f_{trnsp}, f_{evap}, f_{drnk}, f_{uptk}, f_{herb} denote respectively the evaporation-drainage-runoff, transpiration,
evaporation, drinking, uptake, and herbivory rates. With this notation, we can write

\[ C = \begin{bmatrix}
 -f_{\text{e-d-r}} - f_{\text{uptk}} - f_{\text{drnk}} & 0 & 0 \\
 f_{\text{uptk}} & -f_{\text{trnsp}} - f_{\text{herb}} & 0 \\
 f_{\text{drnk}} & f_{\text{herb}} & -f_{\text{evap}}
\end{bmatrix}. \]
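As an illustration, the following minimal Python sketch simulates the model; the flow rate constants and the precipitation inflow are hypothetical placeholders, not values from (Noy-Meir 1973).

import numpy as np

# Hypothetical flow rates (placeholders) for soil, plants, animals.
f_edr, f_uptk, f_drnk = 0.30, 0.20, 0.05
f_trnsp, f_herb, f_evap = 0.25, 0.10, 0.15

C = np.array([[-(f_edr + f_uptk + f_drnk), 0.0,                 0.0    ],
              [  f_uptk,                  -(f_trnsp + f_herb),  0.0    ],
              [  f_drnk,                    f_herb,            -f_evap ]])
u = np.array([1.0, 0.0, 0.0])          # constant precipitation inflow into the soil

q, dt = np.zeros(3), 0.01
for _ in range(100_000):               # forward-Euler integration of dq/dt = C q + u
    q = q + dt * (C @ q + u)
print(q)                               # settles at the asymptotic equilibrium
print(-np.linalg.solve(C, u))          # equilibrium q* = -C^{-1} u, for comparison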
Questions of interest are:
(i) for constant inflows u, does the total mass in the system remain bounded?
(ii) is there an asymptotic equilibrium? do all evolutions converge to it?
(iii) which compartments become empty asymptotically?


1.4 Appendix: Robotic networks in cyclic pursuit and balancing


In this section we consider two simple examples of coordinated motion in robotic networks. The standing
assumption is that n robots, amicably referred to as bugs, are placed and restricted to move on a circle of
unit radius. Because of this bio-inspiration and because this language is common in the literature (Klamkin
and Newman 1971; Bruckstein et al. 1991; Marshall et al. 2004), we refer to the following two problems as
n-bugs problems.
On this unit circle the bugs' positions are angles measured counterclockwise from the positive horizontal
axis. We let angles take value in [0, 2π), that is, an arbitrary position θ satisfies 0 ≤ θ < 2π. The bugs are
numbered counterclockwise with identities i ∈ {1, . . . , n} and are at positions θ_1, . . . , θ_n. It is convenient
to identify n + 1 with 1. We assume the bugs move in discrete times k in a counterclockwise direction by a
controllable amount u_i (i.e., a control signal), that is:

\[ \theta_i(k+1) = \operatorname{mod}(\theta_i(k) + u_i(k), 2\pi), \]

where mod(θ, 2π) is the remainder of the division of θ by 2π and its introduction is required to ensure
that θ_i(k+1) remains inside [0, 2π).
The n-bugs problem is related to the study of pursuit curves and asks what paths n bugs, not aligned
initially, trace when they chase one another. We refer to (Watton and Kydon 1969; Bruckstein
et al. 1991; Marshall et al. 2004; Smith et al. 2005) for surveys and recent results.

Objective: optimal patrolling of a perimeter. Approach: Cyclic pursuit


We now suppose that each bug feels an attraction and moves towards the closest counterclockwise
neighbor, as illustrated in Figure 1.5. Recall that the counterclockwise distance from θ_i to θ_{i+1} is the length
of the counterclockwise arc from θ_i to θ_{i+1} and satisfies:

\[ \operatorname{dist}_{cc}(\theta_i, \theta_{i+1}) = \operatorname{mod}(\theta_{i+1} - \theta_i, 2\pi). \]

In short, given a control gain κ ∈ [0, 1], we assume that the ith bug sets its control signal to

\[ u_{\text{pursuit},i}(k) = \kappa \operatorname{dist}_{cc}(\theta_i(k), \theta_{i+1}(k)). \]

Figure 1.5: Cyclic pursuit and balancing: prototypical n-bugs problems

Questions of interest are:


(i) Does this system have any equilibrium?


(ii) Is a rotating equally-spaced configuration a solution? An equally-spaced angle configuration is one
for which mod(θ_{i+1} − θ_i, 2π) = mod(θ_i − θ_{i−1}, 2π) for all i ∈ {1, . . . , n}. Such configurations are
sometimes called splay states.
(iii) For which values of κ do the bugs converge to an equally-spaced configuration and with what
pairwise distance?

Objective: optimal sensor placement. Approach: Cyclic balancing


Next, we suppose that each bug feels an attraction towards both the closest counterclockwise and the
closest clockwise neighbor, as illustrated in Figure 1.5. Given a control gain κ ∈ [0, 1/2] and the natural
notion of clockwise distance, the ith bug sets its control signal to

\[ u_{\text{balancing},i}(k) = \kappa \operatorname{dist}_{cc}(\theta_i(k), \theta_{i+1}(k)) - \kappa \operatorname{dist}_{c}(\theta_i(k), \theta_{i-1}(k)), \]

where dist_c(θ_i(k), θ_{i−1}(k)) = dist_cc(θ_{i−1}(k), θ_i(k)).


Questions of interest are:
(i) Is a static equally-spaced configuration a solution?
(ii) For which values of κ do the bugs converge to a static equally-spaced configuration?
(iii) Is it true that the bugs will approach an equally-spaced configuration and that each of them will
converge to a stationary position on the circle?

A preliminary analysis
It is unrealistic (among other aspects of this setup) to assume that the bugs know the absolute position
of themselves and of their neighbors. Therefore, it is interesting to rewrite the dynamical system in terms
of pairwise distances between nearby bugs.
For i ∈ {1, . . . , n}, we define the relative angular distances (the lengths of the counterclockwise arcs)
d_i = dist_cc(θ_i, θ_{i+1}) ≥ 0. (We also adopt the usual convention that d_{n+1} = d_1 and that d_0 = d_n.) The
change of coordinates from (θ_1, . . . , θ_n) to (d_1, . . . , d_n) leads us to rewrite the cyclic pursuit and the cyclic
balancing laws as:

\[ u_{\text{pursuit},i}(k) = \kappa d_i, \qquad u_{\text{balancing},i}(k) = \kappa d_i - \kappa d_{i-1}. \]

In this new set of coordinates, one can show that the cyclic pursuit and cyclic balancing systems are,
respectively,

\[ d_i(k+1) = (1 - \kappa) d_i(k) + \kappa d_{i+1}(k), \tag{1.2} \]
\[ d_i(k+1) = \kappa d_{i+1}(k) + (1 - 2\kappa) d_i(k) + \kappa d_{i-1}(k). \tag{1.3} \]


These are two linear time-invariant dynamical systems with state d = (d_1, . . . , d_n) and governing equation
described by the two n × n matrices:

\[ A_{\text{pursuit}} = \begin{bmatrix}
1-\kappa & \kappa & 0 & \cdots & 0 \\
0 & 1-\kappa & \kappa & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & 1-\kappa & \kappa \\
\kappa & 0 & \cdots & 0 & 1-\kappa
\end{bmatrix}, \qquad
A_{\text{balancing}} = \begin{bmatrix}
1-2\kappa & \kappa & 0 & \cdots & \kappa \\
\kappa & 1-2\kappa & \kappa & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & 0 \\
0 & \cdots & \kappa & 1-2\kappa & \kappa \\
\kappa & 0 & \cdots & \kappa & 1-2\kappa
\end{bmatrix}. \]

We conclude with the following remarks.


(i) Equations (1.2) and (1.3) are correct if the counterclockwise order of the bugs is never violated. One
can show that this is true for κ < 1 in the pursuit case and κ < 1/2 in the balancing case; we leave
this proof to the reader in Exercise E1.2.
(ii) The matrices A_pursuit and A_balancing, for varying n and κ, are Toeplitz and circulant. Moreover, they
have nonnegative entries for the stated ranges of κ and are row-stochastic.
(iii) If one defines the agreement space, i.e., {(α, α, . . . , α) ∈ R^n | α ∈ R}, then each point in this set is
an equilibrium for both systems.
(iv) It must be true for all times that (d_1, . . . , d_n) ∈ {x ∈ R^n | x_i ≥ 0, \sum_{i=1}^{n} x_i = 2π}. This property is
indeed a consequence of the nonnegative matrices A_pursuit and A_balancing being doubly-stochastic,
i.e., each row-sum and each column-sum is equal to 1.
(v) We will later study for which values of κ the system converges to the agreement space; a numerical
sketch of both systems follows.
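The following minimal Python sketch (with placeholder values for n and κ) constructs the two circulant matrices and confirms numerically that the inter-bug distances approach the equally-spaced value 2π/n while their sum stays equal to 2π.

import numpy as np

n, kappa = 5, 0.4                      # placeholder values, with kappa < 1/2
I = np.eye(n)
P = np.roll(I, 1, axis=1)              # cyclic shift: entry 1 in position (i, i+1 mod n)

A_pursuit = (1 - kappa) * I + kappa * P
A_balancing = (1 - 2 * kappa) * I + kappa * P + kappa * P.T

rng = np.random.default_rng(1)
d = rng.random(n)
d = 2 * np.pi * d / d.sum()            # random initial arc lengths summing to 2*pi

for _ in range(500):
    d = A_balancing @ d                # replace with A_pursuit @ d for cyclic pursuit
print(d)                               # approaches the equally-spaced value 2*pi/n
print(d.sum())                         # remains 2*pi: the matrices are doubly-stochastic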

1.5 Appendix: Design problems in wireless sensor networks


In this appendix we show how averaging algorithms can be used to tackle realistic wireless sensor network
problems.

1.5.1 Wireless sensor networks: distributed parameter estimation


The next two examples are also drawn from the field of wireless sensor networks, but they feature a more
advanced setup and require a basic background in estimation and detection theory, respectively. The key
lesson to be learnt from these examples is that it is useful to have algorithms that compute the average of
distributed quantities.
Following ideas from (Xiao et al. 2005; Garin and Schenato 2010), we aim to estimate an unknown
parameter θ ∈ R^m via the measurements taken by a sensor network. Each node i ∈ {1, . . . , n} measures

\[ y_i = B_i \theta + v_i, \]

where y_i ∈ R^{m_i}, B_i is a known matrix and v_i is random measurement noise. We assume that

(A1) the noise vectors v_1, . . . , v_n are independent jointly-Gaussian variables with zero-mean E[v_i] = 0_{m_i}
and positive-definite covariance E[v_i v_i^⊤] = Σ_i = Σ_i^⊤, for i ∈ {1, . . . , n}; and



(A2) the measurement parameters satisfy the following two properties: \sum_i m_i \ge m and the stacked
matrix \begin{bmatrix} B_1 \\ \vdots \\ B_n \end{bmatrix} is full rank.

Given the measurements y_1, . . . , y_n, it is of interest to compute a least-square estimate of θ, that is, an
estimate of θ that minimizes a least-square error. Specifically, we aim to minimize the following weighted
least-square error:

\[ \min_{\widehat\theta} \sum_{i=1}^{n} \big\| y_i - B_i \widehat\theta \big\|^2_{\Sigma_i^{-1}}
   = \sum_{i=1}^{n} \big( y_i - B_i \widehat\theta \big)^{\!\top} \Sigma_i^{-1} \big( y_i - B_i \widehat\theta \big). \]

In this weighted least-square error, individual errors are weighted by their corresponding inverse covariance
matrices so that an accurate (respectively, inaccurate) measurement corresponds to a high (respectively,
low) error weight. With this particular choice of weights, the least-square estimate coincides with the
so-called maximum-likelihood estimate; see (Poor 1998) for more details. Under assumptions (A1) and (A2),
the optimal solution is

\[ \widehat\theta = \Big( \sum_{i=1}^{n} B_i^{\top} \Sigma_i^{-1} B_i \Big)^{-1} \Big( \sum_{i=1}^{n} B_i^{\top} \Sigma_i^{-1} y_i \Big). \]

This formula is easy to implement by a single processor with all the information about the problem, i.e., the
parameters and the measurements.

To compute \widehat\theta in the sensor (and processor) network, we perform two steps:

Step 1: we run two distributed algorithms in parallel to compute the average of the quantities B_i^⊤ Σ_i^{-1} B_i
and B_i^⊤ Σ_i^{-1} y_i.

Step 2: we compute the optimal estimate via

\[ \widehat\theta = \operatorname{average}\big( B_1^{\top} \Sigma_1^{-1} B_1, \ldots, B_n^{\top} \Sigma_n^{-1} B_n \big)^{-1}
   \operatorname{average}\big( B_1^{\top} \Sigma_1^{-1} y_1, \ldots, B_n^{\top} \Sigma_n^{-1} y_n \big). \]
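The two-step procedure is illustrated in the following Python sketch on synthetic data; the matrices B_i, the covariances Σ_i, and the true parameter are placeholders, and the averages of Step 1 are computed centrally here for brevity (in a sensor network they would be produced by the distributed averaging algorithms studied in later chapters).

import numpy as np

rng = np.random.default_rng(0)
m, n = 2, 10                           # parameter dimension and number of sensors
theta = np.array([1.0, -2.0])          # unknown parameter, used only to generate data

B = [rng.standard_normal((3, m)) for _ in range(n)]    # known matrices B_i
Sigma = [0.1 * np.eye(3) for _ in range(n)]            # noise covariances
y = [Bi @ theta + rng.multivariate_normal(np.zeros(3), Si)
     for Bi, Si in zip(B, Sigma)]

# Step 1: averages of the quantities B_i^T Sigma_i^{-1} B_i and B_i^T Sigma_i^{-1} y_i.
avg_BSB = sum(Bi.T @ np.linalg.inv(Si) @ Bi for Bi, Si in zip(B, Sigma)) / n
avg_BSy = sum(Bi.T @ np.linalg.inv(Si) @ yi for Bi, Si, yi in zip(B, Sigma, y)) / n

# Step 2: the two 1/n factors cancel in the final estimate.
theta_hat = np.linalg.solve(avg_BSB, avg_BSy)
print(theta_hat)                       # close to theta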

Questions of interest are:


(i) How do we design algorithms to compute the average of distributed quantities?
(ii) What properties does the graph need to have in order for such an algorithm to exist?
(iii) How do we design an algorithm with fastest convergence?

1.5.2 Wireless sensor networks: distributed hypothesis testing


We consider a distributed hypothesis testing problem; these ideas appeared in (Rao and Durrant-Whyte
1993; Olfati-Saber et al. 2006). Let h_γ, for γ in a finite set Γ, be a set of two or more hypotheses about
an uncertain event. For example, given a certain area of interest, we could have h_0 = "no target is present,"
h_1 = "one target is present," and h_2 = "two or more targets are present."
Suppose that we know the a priori probabilities p(h_γ) of the hypotheses and that the n nodes of a sensor
network take measurements y_i, for i ∈ {1, . . . , n}, related to the event. Independently of the type of
measurements, assume you can compute

\[ p(y_i \mid h_\gamma) = \text{probability of measuring } y_i \text{ given that } h_\gamma \text{ is the true hypothesis}. \]


Also, assume that each observation is conditionally independent of all other observations, given any
hypothesis.

(i) We wish to compute the maximum a posteriori estimate, that is, we want to identify which one is the
most likely hypothesis, given the measurements. Note that, under the independence assumption,
Bayes' Theorem implies that the a posteriori probabilities satisfy

\[ p(h_\gamma \mid y_1, \ldots, y_n) = \frac{p(h_\gamma)}{p(y_1, \ldots, y_n)} \prod_{i=1}^{n} p(y_i \mid h_\gamma). \]

(ii) Observe that p(h_γ) is known, and p(y_1, . . . , y_n) is a constant normalization factor scaling all posterior
probabilities equally. Therefore, for each hypothesis γ, we need to compute

\[ \prod_{i=1}^{n} p(y_i \mid h_\gamma), \]

or equivalently, we aim to exchange data among the sensors in order to compute:

\[ \exp\Big( \sum_{i=1}^{n} \log p(y_i \mid h_\gamma) \Big)
   = \exp\Big( n \cdot \operatorname{average}\big( \log p(y_1 \mid h_\gamma), \ldots, \log p(y_n \mid h_\gamma) \big) \Big). \]

(iii) In summary, even in this hypothesis testing problem, we need algorithms to compute the average of
the n numbers log p(y_1 | h_γ), . . . , log p(y_n | h_γ), for each hypothesis γ. A short numerical sketch
follows.
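As a small illustration, the sketch below uses a hypothetical two-hypothesis Gaussian measurement model (all model parameters are placeholders) and computes the posterior probabilities from the average of the log-likelihoods, as in item (ii).

import numpy as np

rng = np.random.default_rng(0)
n = 20                                   # number of sensors (placeholder)
means = {0: 0.0, 1: 1.0}                 # under h_gamma, measurements ~ N(means[gamma], 1)
prior = {0: 0.5, 1: 0.5}
y = rng.normal(means[1], 1.0, size=n)    # data generated under hypothesis 1

def log_lik(yi, gamma):                  # log p(y_i | h_gamma), unit-variance Gaussian
    return -0.5 * (yi - means[gamma])**2 - 0.5 * np.log(2 * np.pi)

# Unnormalized posteriors: p(h_gamma) * exp(n * average of the log-likelihoods).
posts = {g: prior[g] * np.exp(n * np.mean([log_lik(yi, g) for yi in y]))
         for g in (0, 1)}
total = sum(posts.values())
print({g: p / total for g, p in posts.items()})   # the posterior favors hypothesis 1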

Questions of interest here are the same as in the previous section.


1.6 Exercises
E1.1 Simulating the averaging dynamics. Simulate in your favorite programming language and software package
the linear averaging algorithm in equation (1.1). Set n = 5, select the initial state equal to (1, −1, 1, −1, 1),
and use the following undirected unweighted graphs, depicted in Figure E1.1:
(i) the complete graph,
(ii) the ring graph, and
(iii) the star graph with node 1 as center.
Which value do all nodes converge to? Is it equal to the average of the initial values? Turn in your code, a few
printouts (as few as possible), and your written responses.

Figure E1.1: Complete graph, ring graph and star graph with 5 nodes

E1.2 Computing the bugs dynamics. Consider the cyclic pursuit and balancing dynamics described in Section 1.4.
Verify
(i) the cyclic pursuit closed-loop equation (1.2),
(ii) the cyclic balancing closed-loop equation (1.3), and
(iii) the counterclockwise order of the bugs is never violated.
Hint: Recall the distributive property of modular addition: mod(a ± b, n) = mod(mod(a, n) ± mod(b, n), n).

Chapter 2
Elements of Matrix Theory

We review here basic concepts from matrix theory. These concepts will be useful when analyzing graphs
and averaging algorithms defined over graphs.
In particular we are interested in understanding the convergence of the linear dynamical systems
discussed in Chapter 1. Some of those systems are described by matrices that have nonnegative entries and
have row-sums equal to 1.

Notation
It is useful to start with some basic notation from matrix theory and linear algebra. We let f : X → Y
denote a function from set X to set Y. We let R, N and Z denote respectively the set of real, natural
and integer numbers; also R_{≥0} and Z_{≥0} are the sets of nonnegative real numbers and nonnegative integer
numbers. For real numbers a < b, we let

\[ [a, b] = \{ x \in \mathbb{R} \mid a \le x \le b \}, \qquad ]a, b] = \{ x \in \mathbb{R} \mid a < x \le b \}, \]
\[ [a, b[ = \{ x \in \mathbb{R} \mid a \le x < b \}, \qquad ]a, b[ = \{ x \in \mathbb{R} \mid a < x < b \}. \]

Given a complex number z ∈ C, its norm (sometimes referred to as complex modulus) is denoted by |z|, its
real part by ℜ(z) and its imaginary part by ℑ(z). We let i denote the imaginary unit √−1.
We let 1_n ∈ R^n (respectively 0_n ∈ R^n) be the column vector with all entries equal to +1 (respectively
0). Let e_1, . . . , e_n be the standard basis vectors of R^n, that is, e_i has all entries equal to zero except for the
ith entry equal to 1.
We let I_n denote the n-dimensional identity matrix and A ∈ R^{n×n} denote a square n × n matrix with
real entries {a_{ij}}, i, j ∈ {1, . . . , n}. The matrix A is symmetric if A^⊤ = A.
For a matrix A, λ ∈ C is an eigenvalue and v ∈ C^n is a right eigenvector, or simply an eigenvector, if
they together satisfy the eigenvalue equation Av = λv. Sometimes it will be convenient to refer to (λ, v)
as an eigenpair. A left eigenvector of the eigenvalue λ is a vector w ∈ C^n satisfying w^⊤ A = λ w^⊤.
A symmetric matrix is positive definite (resp. positive semidefinite) if all its eigenvalues are positive
(resp. nonnegative). The kernel of A is the subspace kernel(A) = {x ∈ R^n | Ax = 0_n}, the image of A is
image(A) = {y ∈ R^n | Ax = y, for some x ∈ R^n}, and the rank of A is the dimension of its image. Given
vectors v_1, . . . , v_j ∈ R^n, their span is span(v_1, . . . , v_j) = {a_1 v_1 + · · · + a_j v_j | a_1, . . . , a_j ∈ R} ⊆ R^n.


2.1 Linear systems and the Jordan normal form


In this section we introduce a prototypical model for dynamical systems and study its stability properties
via the so-called Jordan normal form, a key tool from matrix theory.

2.1.1 Discrete-time linear systems


We start with a basic definition.

Definition 2.1 (Discrete-time linear system). A square matrix A defines a discrete-time linear system
by

\[ x(k+1) = A x(k), \qquad x(0) = x_0, \tag{2.1} \]

or, equivalently, by x(k) = A^k x_0, where the sequence {x(k)}_{k∈Z_{≥0}} is called the solution, trajectory or
evolution of the system.

Sometimes it is convenient to adopt the shorthand x^+ = f(x) to denote the system x(k+1) = f(x(k)).
We are interested in understanding when a solution from an arbitrary initial condition has an asymptotic
limit as time diverges and to what value the solution converges. We formally define this property as follows.

Definition 2.2 (Semi-convergent and convergent matrices). A matrix A ∈ R^{n×n} is

(i) semi-convergent if lim_{k→+∞} A^k exists, and
(ii) convergent if it is semi-convergent and lim_{k→+∞} A^k = 0_{n×n}.

It is immediate to see that, if A is semi-convergent with limiting matrix A_∞ = lim_{k→+∞} A^k, then

\[ \lim_{k\to+\infty} x(k) = A_\infty x_0. \]

In what follows we characterize the sets of semi-convergent and convergent matrices.

Remark 2.3 (Modal decomposition for symmetric matrices). Before treating the general analysis
method, we present the self-contained and instructive case of symmetric matrices. Recall that a symmetric
matrix A has real eigenvalues λ_1 ≥ λ_2 ≥ · · · ≥ λ_n and corresponding orthonormal (i.e., orthogonal and
unit-length) eigenvectors v_1, . . . , v_n. Because the eigenvectors are an orthonormal basis for R^n, we can write
the modal decomposition

\[ x(k) = y_1(k) v_1 + \cdots + y_n(k) v_n, \]

where the ith normal mode is defined by y_i(k) = v_i^⊤ x(k). We then left-multiply the two equalities (2.1) by
v_i^⊤ and exploit A v_i = λ_i v_i to obtain

\[ y_i(k+1) = \lambda_i y_i(k), \quad y_i(0) = v_i^{\top} x_0 \qquad \Longrightarrow \qquad y_i(k) = \lambda_i^k \big( v_i^{\top} x_0 \big). \]

In short, the evolution of the linear system (2.1) is

\[ x(k) = \lambda_1^k \big( v_1^{\top} x_0 \big) v_1 + \cdots + \lambda_n^k \big( v_n^{\top} x_0 \big) v_n. \]

Therefore, each evolution starting from an arbitrary initial condition satisfies

(i) lim_{k→∞} x(k) = 0_n if and only if |λ_i| < 1 for all i ∈ {1, . . . , n}, and
(ii) lim_{k→∞} x(k) = (v_1^⊤ x_0) v_1 + · · · + (v_m^⊤ x_0) v_m if and only if λ_1 = · · · = λ_m = 1 and |λ_i| < 1 for all
i ∈ {m + 1, . . . , n}.
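The modal decomposition can be checked numerically; the following Python sketch (with a placeholder symmetric, doubly-stochastic matrix) reconstructs x(k) from the normal modes and compares it with the direct matrix power.

import numpy as np

A = np.array([[0.50, 0.25, 0.25],        # symmetric, with eigenvalues 1, 1/4, 1/4
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
x0 = np.array([3.0, -1.0, 1.0])

lam, V = np.linalg.eigh(A)               # orthonormal eigenvectors as columns of V
k = 30
x_modal = sum(lam[i]**k * (V[:, i] @ x0) * V[:, i] for i in range(3))
x_power = np.linalg.matrix_power(A, k) @ x0
print(np.allclose(x_modal, x_power))     # True: the two computations agree
print(x_power)                           # near the consensus value average(x0) = 1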

2.1.2 The Jordan normal form


In this section we review a very useful canonical decomposition of a square matrix. Recall that two n × n
matrices A and B are similar if B = T A T^{−1} for some invertible matrix T. Also recall that a similarity
transform does not change the eigenvalues of a matrix.

Theorem 2.4 (Jordan normal form). Each matrix A ∈ C^{n×n} is similar to a block diagonal matrix
J ∈ C^{n×n}, called the Jordan normal form of A, given by

\[ J = \begin{bmatrix} J_1 & 0 & \cdots & 0 \\ 0 & J_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & J_m \end{bmatrix} \in \mathbb{C}^{n\times n}, \]

where each block J_i, called a Jordan block, is a square matrix of size j_i and of the form

\[ J_i = \begin{bmatrix} \lambda_i & 1 & & 0 \\ 0 & \lambda_i & \ddots & \\ \vdots & & \ddots & 1 \\ 0 & \cdots & 0 & \lambda_i \end{bmatrix} \in \mathbb{C}^{j_i \times j_i}. \tag{2.2} \]

Clearly, m ≤ n and j_1 + · · · + j_m = n.

We refer to (Horn and Johnson 1985) for a standard proof of this theorem. In other words, Theorem 2.4
implies there exists an invertible matrix T such that

\[ A = T J T^{-1}, \tag{2.3} \]
\[ A T = T J, \tag{2.4} \]
\[ T^{-1} A = J T^{-1}. \tag{2.5} \]

The matrix J is unique, modulo a re-ordering of the Jordan blocks. The eigenvalues of J, and therefore also
of A, are the (not necessarily distinct) numbers λ_1, . . . , λ_m. Given an eigenvalue λ,

(i) the algebraic multiplicity of λ is the sum of the sizes of all Jordan blocks with eigenvalue λ (or,
equivalently, the multiplicity of λ as a root of the characteristic polynomial of A), and
(ii) the geometric multiplicity of λ is the number of Jordan blocks with eigenvalue λ (or, equivalently, the
number of linearly-independent eigenvectors associated to λ).

An eigenvalue λ is

(i) simple if it has algebraic and geometric multiplicity equal precisely to 1, that is, a single Jordan block
of size 1, and
(ii) semisimple if all its Jordan blocks have size 1, so that its algebraic and geometric multiplicities are
equal.

Let t_1, . . . , t_n and r_1, . . . , r_n denote the columns of T and the rows of T^{−1}, respectively. If all eigenvalues
of A are semisimple, then equations (2.4) and (2.5) imply, for all i ∈ {1, . . . , n},

\[ A t_i = \lambda_i t_i \qquad \text{and} \qquad r_i A = \lambda_i r_i. \]

In other words, the ith column of T is the right eigenvector (or simply eigenvector) of A corresponding to
the eigenvalue λ_i, and the ith row of T^{−1} is the corresponding left eigenvector of A.
Finally, it is possible to have eigenvalues with larger algebraic than geometric multiplicity. In this case,
the columns of the matrix T are the right eigenvectors and the generalized right eigenvectors of A, whereas
the rows of T^{−1} are the left eigenvectors and the generalized left eigenvectors of A. For more details about
generalized eigenvectors, we refer the reader to (Horn and Johnson 1985).

Example 2.5 (Revisiting the wireless sensor network example). Next, as a numerical example, let us
reconsider the wireless sensor network discussed in Section 1.2 and the 4-dimensional row-stochastic matrix
A_wsn, which we report here for convenience:

\[ A_{\text{wsn}} = \begin{bmatrix} 1/2 & 1/2 & 0 & 0 \\ 1/4 & 1/4 & 1/4 & 1/4 \\ 0 & 1/3 & 1/3 & 1/3 \\ 0 & 1/3 & 1/3 & 1/3 \end{bmatrix}. \]

With the aid of a symbolic mathematics program, we compute A_wsn = T J T^{−1}, where

\[ J = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & \tfrac{1}{24}(5 - \sqrt{73}) & 0 \\ 0 & 0 & 0 & \tfrac{1}{24}(5 + \sqrt{73}) \end{bmatrix}, \qquad
T = \begin{bmatrix} 1 & 0 & -2 + 2\sqrt{73} & -2 - 2\sqrt{73} \\ 1 & 0 & -11 - \sqrt{73} & -11 + \sqrt{73} \\ 1 & 1 & 8 & 8 \\ 1 & -1 & 8 & 8 \end{bmatrix}, \quad \text{and} \]

\[ T^{-1} = \begin{bmatrix}
\tfrac{1}{6} & \tfrac{1}{3} & \tfrac{1}{4} & \tfrac{1}{4} \\[2pt]
0 & 0 & \tfrac{1}{2} & -\tfrac{1}{2} \\[2pt]
-\tfrac{1}{96} + \tfrac{19}{96\sqrt{73}} & -\tfrac{1}{48} - \tfrac{5}{48\sqrt{73}} & \tfrac{1}{64} - \tfrac{3}{64\sqrt{73}} & \tfrac{1}{64} - \tfrac{3}{64\sqrt{73}} \\[2pt]
-\tfrac{1}{96} - \tfrac{19}{96\sqrt{73}} & -\tfrac{1}{48} + \tfrac{5}{48\sqrt{73}} & \tfrac{1}{64} + \tfrac{3}{64\sqrt{73}} & \tfrac{1}{64} + \tfrac{3}{64\sqrt{73}}
\end{bmatrix}. \]

Therefore, the eigenvalues of A are 1, 0, \tfrac{1}{24}(5 - \sqrt{73}) ≈ −0.14, and \tfrac{1}{24}(5 + \sqrt{73}) ≈ 0.56. Corresponding to
the eigenvalue 1, the right and left eigenvector equations are:

\[ A_{\text{wsn}} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}
\qquad \text{and} \qquad
\begin{bmatrix} 1/6 \\ 1/3 \\ 1/4 \\ 1/4 \end{bmatrix}^{\!\top} A_{\text{wsn}} = \begin{bmatrix} 1/6 \\ 1/3 \\ 1/4 \\ 1/4 \end{bmatrix}^{\!\top}. \]
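To double-check the decomposition numerically, one may run the following Python sketch (the eigenvalues and the matrix T are copied from the example above).

import numpy as np

A = np.array([[1/2, 1/2, 0, 0],
              [1/4, 1/4, 1/4, 1/4],
              [0, 1/3, 1/3, 1/3],
              [0, 1/3, 1/3, 1/3]])
s = np.sqrt(73)
J = np.diag([1, 0, (5 - s) / 24, (5 + s) / 24])
T = np.array([[1,  0, -2 + 2*s, -2 - 2*s],
              [1,  0, -11 - s,  -11 + s ],
              [1,  1,  8,        8      ],
              [1, -1,  8,        8      ]])

print(np.allclose(A, T @ J @ np.linalg.inv(T)))                # True: A = T J T^{-1}
print(np.allclose(np.linalg.inv(T)[0], [1/6, 1/3, 1/4, 1/4]))  # True: left eigenvector of 1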


2.1.3 Semi-convergence and convergence for discrete-time linear systems


We can now use the Jordan normal form to study the powers of the matrix A. We start by computing

\[ A^k = \underbrace{(T J T^{-1})(T J T^{-1}) \cdots (T J T^{-1})}_{k \text{ times}} = T J^k T^{-1}
= T \begin{bmatrix} J_1^k & 0 & \cdots & 0 \\ 0 & J_2^k & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & J_m^k \end{bmatrix} T^{-1}, \]

so that, for a square matrix A with Jordan blocks J_i, i ∈ {1, . . . , m}, the following statements are equivalent:

(i) A is semi-convergent (resp. convergent),
(ii) J is semi-convergent (resp. convergent), and
(iii) each block J_i is semi-convergent (resp. convergent).
Next, we compute the kth power of the generic Jordan block J_i with eigenvalue λ_i as a function of the
block size 1, 2, 3, . . . , j_i; the powers are, respectively,

\[ \begin{bmatrix} \lambda_i^k \end{bmatrix}, \qquad
\begin{bmatrix} \lambda_i^k & \binom{k}{1} \lambda_i^{k-1} \\ 0 & \lambda_i^k \end{bmatrix}, \qquad
\begin{bmatrix} \lambda_i^k & \binom{k}{1} \lambda_i^{k-1} & \binom{k}{2} \lambda_i^{k-2} \\ 0 & \lambda_i^k & \binom{k}{1} \lambda_i^{k-1} \\ 0 & 0 & \lambda_i^k \end{bmatrix}, \qquad \ldots, \qquad
\begin{bmatrix}
\lambda_i^k & \binom{k}{1} \lambda_i^{k-1} & \binom{k}{2} \lambda_i^{k-2} & \cdots & \binom{k}{j_i-1} \lambda_i^{k-j_i+1} \\
0 & \lambda_i^k & \binom{k}{1} \lambda_i^{k-1} & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & \binom{k}{2} \lambda_i^{k-2} \\
0 & \cdots & 0 & \lambda_i^k & \binom{k}{1} \lambda_i^{k-1} \\
0 & 0 & \cdots & 0 & \lambda_i^k
\end{bmatrix}, \tag{2.6} \]

where the binomial coefficient \binom{k}{m} = k!/(m!(k-m)!) satisfies \binom{k}{m} \le k^m / m!. Note that, independently
of the size of J_i, each entry of the kth power of J_i is upper bounded by a constant times k^h \lambda_i^k for some
nonnegative integer h. Because exponentially-decaying factors dominate polynomially-growing terms, we
know

\[ \lim_{k\to\infty} k^h \lambda^k = \begin{cases}
0, & \text{if } |\lambda| < 1, \\
1, & \text{if } \lambda = 1 \text{ and } h = 0, \\
\text{non-existent or unbounded}, & \text{if } (\lambda = -1) \text{ or } (|\lambda| > 1) \text{ or } (\lambda = 1 \text{ and } h = 1, 2, \ldots).
\end{cases} \tag{2.7} \]
In summary, for each block J_i with eigenvalue λ_i, we can infer that:

(i) a block J_i of size 1 is convergent if and only if |λ_i| < 1,
(ii) a block J_i of size 1 is semi-convergent and not convergent if and only if λ_i = 1, and
(iii) a block J_i of size larger than 1 is semi-convergent and convergent if and only if |λ_i| < 1.
convergence and convergence of an arbitrary square matrix.
We complete this discussion with two useful definitions and the main result of this section.

Lectures on Network Systems, F. Bullo, Version v0.85(i) (6 Aug 2016). Draft not for circulation. Copyright 2012-16.
18 Chapter 2. Elements of Matrix Theory

Figure 2.1: Eigenvalues and convergence properties of discrete-time linear systems. (a) The spectrum of a convergent
matrix. (b) The spectrum of a semi-convergent matrix, provided the eigenvalue 1 is semisimple. (c) The spectrum of a
matrix that is not semi-convergent.

Definition 2.6 (Spectrum and spectral radius of a matrix). Given a square matrix A,

(i) the spectrum of A, denoted spec(A), is the set of eigenvalues of A; and
(ii) the spectral radius of A is the maximum norm of the eigenvalues of A, that is,

\[ \rho(A) = \max\{ |\lambda| \mid \lambda \in \operatorname{spec}(A) \}, \]

or, equivalently, the radius of the smallest disk in C centered at the origin and containing the spectrum
of A.

Theorem 2.7 (Convergence and spectral radius). For a square matrix A, the following statements hold:

(i) A is convergent if and only if ρ(A) < 1, and
(ii) A is semi-convergent if and only if ρ(A) ≤ 1, no eigenvalue has unit norm other than possibly the
number 1, and, if 1 is an eigenvalue, then it is semisimple.

2.2 Row-stochastic matrices and their spectral radius


Motivated by the example systems in Chapter 1, we are now interested in discrete-time linear systems
defined by matrices with special properties. Specifically, we are interested in matrices with nonnegative
entries and whose row-sums are all equal to 1.
The square matrix A ∈ R^{n×n} is

(i) nonnegative (respectively positive) if a_{ij} ≥ 0 (respectively a_{ij} > 0) for all i and j in {1, . . . , n};
(ii) row-stochastic if nonnegative and A 1_n = 1_n;
(iii) column-stochastic if nonnegative and A^⊤ 1_n = 1_n; and
(iv) doubly-stochastic if it is row- and column-stochastic.


In the following, we write A > 0 and v > 0 (respectively A ≥ 0 and v ≥ 0) for a positive (respectively
nonnegative) matrix A and vector v.
Given a finite number of points p_1, p_2, . . . , p_n in R^n, a convex combination of p_1, p_2, . . . , p_n is a point
of the form

\[ \lambda_1 p_1 + \lambda_2 p_2 + \cdots + \lambda_n p_n, \]

where the real numbers λ_1, . . . , λ_n satisfy λ_1 + · · · + λ_n = 1 and λ_i ≥ 0 for all i ∈ {1, . . . , n}. (For example,
on the plane R^2, the set of convex combinations of two distinct points is the segment connecting them and
the set of convex combinations of three distinct points is the triangle (including its interior) defined by
them.) The numbers λ_1, . . . , λ_n are called convex combination coefficients and each row of a row-stochastic
matrix consists of convex combination coefficients.

2.2.1 The spectral radius for row-stochastic matrices


To characterize the spectral radius of a row-stochastic matrix, we introduce a useful general method to
localize the spectrum of a matrix.

Theorem 2.8 (Geršgorin Disks Theorem). For any square matrix A ∈ R^{n×n},

\[ \operatorname{spec}(A) \subseteq \bigcup_{i\in\{1,\dots,n\}} \Big\{ z \in \mathbb{C} \;\Big|\; |z - a_{ii}| \le \sum_{j=1, j\neq i}^{n} |a_{ij}| \Big\}, \]

where each set in the union is a disk in the complex plane centered at a_{ii} with radius \sum_{j=1, j≠i}^{n} |a_{ij}|.

Proof. Consider the eigenvalue equation Ax = λx for the eigenpair (λ, x), where λ and x ≠ 0_n are
in general complex. Choose the index i ∈ {1, . . . , n} so that |x_i| = max_{j∈{1,...,n}} |x_j| > 0. The ith
component of the eigenvalue equation can be rewritten as λ − a_{ii} = \sum_{j=1, j\neq i}^{n} a_{ij} x_j / x_i. Now, take the
complex magnitude of this equality and upper-bound its right-hand side:

\[ |\lambda - a_{ii}| = \Big| \sum_{j=1, j\neq i}^{n} a_{ij} \frac{x_j}{x_i} \Big|
\le \sum_{j=1, j\neq i}^{n} |a_{ij}| \frac{|x_j|}{|x_i|}
\le \sum_{j=1, j\neq i}^{n} |a_{ij}|. \]

This inequality defines a set of possible locations for the arbitrary eigenvalue λ of A. The statement
follows by taking the union of such sets for each eigenvalue of A. ∎

Each disk in the theorem statement is referred to as a Geršgorin disk, or more accurately, as a Geršgorin
row disk; an analogous disk theorem can be stated for Geršgorin column disks. Exercise E2.16 showcases an
instructive application to distributed computing of numerous topics covered so far, including convergence
notions and the Geršgorin Disks Theorem.
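As a sanity check, the following Python sketch verifies numerically that every eigenvalue of a given matrix lies in some Geršgorin row disk, shown here on the matrix A_wsn from Section 1.2.

import numpy as np

def in_some_gershgorin_disk(A):
    # True if every eigenvalue of A lies in at least one Gershgorin row disk.
    eigs = np.linalg.eigvals(A)
    centers = np.diag(A)
    radii = np.abs(A).sum(axis=1) - np.abs(centers)
    return all(any(abs(lam - c) <= r + 1e-12 for c, r in zip(centers, radii))
               for lam in eigs)

A_wsn = np.array([[1/2, 1/2, 0, 0],
                  [1/4, 1/4, 1/4, 1/4],
                  [0, 1/3, 1/3, 1/3],
                  [0, 1/3, 1/3, 1/3]])
print(in_some_gershgorin_disk(A_wsn))   # True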

Lemma 2.9 (Spectral properties of a row-stochastic matrix). For a row-stochastic matrix A,

(i) 1 is an eigenvalue, and
(ii) spec(A) is a subset of the unit disk and ρ(A) = 1.

Proof. First, recall that A being row-stochastic is equivalent to two facts: a_{ij} ≥ 0 for i, j ∈ {1, . . . , n},
and A 1_n = 1_n. The second fact implies that 1_n is an eigenvector with eigenvalue 1. Therefore, by the
definition of spectral radius, ρ(A) ≥ 1. Next, we prove that ρ(A) ≤ 1 by invoking the Geršgorin Disks
Theorem 2.8 to show that spec(A) is contained in the unit disk centered at the origin. The Geršgorin disks
of a row-stochastic matrix are illustrated in Figure 2.2.

Figure 2.2: All Geršgorin disks of a row-stochastic matrix are contained in the unit disk.

Note that A being row-stochastic implies a_{ii} ∈ [0, 1] and a_{ii} + \sum_{j\neq i} a_{ij} = 1. Hence, the center of the
ith Geršgorin disk belongs to the positive real axis between 0 and 1, and the right-most point in the disk is
at 1. ∎

Note: because 1 is an eigenvalue of each row-stochastic matrix A, clearly A is not convergent. But it is
possible for A to be semi-convergent.


2.3 Perron-Frobenius theory


We have seen how row-stochastic matrices are not convergent; we now focus on characterizing those that
are semi-convergent. To establish whether a row-stochastic matrix is semi-convergent, we introduce the
widely-established Perron-Frobenius theory for nonnegative matrices.

2.3.1 Classification of nonnegative matrices


In the previous section we already defined nonnegative and positive matrices. In this section we are
interested in classifying nonnegative matrices in terms of their zero/nonzero pattern and of the asymptotic
behavior of their powers.
We start by introducing simple example nonnegative matrices and related comments:

\[ A_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \operatorname{spec}(A_1) = \{1, 1\}, \text{ the zero/nonzero pattern in } A_1^k \text{ is constant, and } \lim_{k\to\infty} A_1^k = I_2, \]
\[ A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \operatorname{spec}(A_2) = \{1, -1\}, \text{ the zero/nonzero pattern in } A_2^k \text{ is oscillating, and } \lim_{k\to\infty} A_2^k \text{ does not exist}, \]
\[ A_3 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad \operatorname{spec}(A_3) = \{0, 0\}, \; A_3^k = 0 \text{ for all } k \ge 2, \text{ and } \lim_{k\to\infty} A_3^k = 0, \]
\[ A_4 = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ 2 & 0 \end{bmatrix}, \quad \operatorname{spec}(A_4) = \{1, -1/2\}, \; A_4^k > 0 \text{ for all } k \ge 2, \text{ and } \lim_{k\to\infty} A_4^k = \frac{1}{3}\begin{bmatrix} 2 & 1 \\ 2 & 1 \end{bmatrix}, \text{ and} \]
\[ A_5 = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad \operatorname{spec}(A_5) = \{1, 1\}, \text{ the zero/nonzero pattern in } A_5^k \text{ is constant, and } \lim_{k\to\infty} A_5^k \text{ is unbounded}. \]

Based on these preliminary examples, we now introduce two sets of nonnegative matrices with certain
characteristic properties.

Definition 2.10 (Irreducible and primitive matrices). For n ≥ 2, an n × n nonnegative matrix A is

(i) irreducible if \sum_{k=0}^{n-1} A^k is positive, and
(ii) primitive if there exists k ∈ N such that A^k is positive.

A matrix that is not irreducible is said to be reducible.

Note that A_1, A_3 and A_5 are reducible whereas A_2 and A_4 are irreducible. Moreover, note that A_2 is not
primitive whereas A_4 is. Additionally, note that a positive matrix is clearly primitive. Finally, note that, if
there is k ∈ N such that A^k is positive, then (one can show that) all subsequent powers A^{k+1}, A^{k+2}, . . .
are necessarily positive as well; see Exercise E2.5. These definitions are easy to test numerically, as in the
sketch below.
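A minimal Python sketch of such tests follows; it checks exact zero/nonzero patterns and is meant for illustration rather than as numerically robust code (the power bound used for primitivity is Wielandt's (n-1)^2 + 1).

import numpy as np

def is_irreducible(A):
    # A nonnegative n x n matrix is irreducible iff sum_{k=0}^{n-1} A^k > 0.
    n = A.shape[0]
    S, P = np.zeros_like(A), np.eye(n)
    for _ in range(n):
        S, P = S + P, P @ A
    return bool(np.all(S > 0))

def is_primitive(A):
    # A is primitive iff A^k > 0 for some k; k = (n-1)^2 + 1 suffices (Wielandt).
    n = A.shape[0]
    P = np.eye(n)
    for _ in range((n - 1)**2 + 1):
        P = P @ A
        if np.all(P > 0):
            return True
    return False

A2 = np.array([[0.0, 1.0], [1.0, 0.0]])
A4 = np.array([[0.5, 0.5], [1.0, 0.0]])
print(is_irreducible(A2), is_primitive(A2))   # True False
print(is_irreducible(A4), is_primitive(A4))   # True True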
We now state a useful result and postpone its proof to Exercise E4.5.

Lemma 2.11 (A primitive matrix is irreducible). If a nonnegative matrix is primitive, then it is also
irreducible.

As a consequence of this lemma we can draw the set diagram in Figure 2.3 describing the set of
nonnegative square matrices and its subsets of irreducible, primitive and positive matrices. Note that the
inclusions in the diagram are strict in the sense that:


(i) A3 is nonnegative but not irreducible;


(ii) A2 is irreducible but not primitive; and
(iii) A4 is primitive but not positive.

Figure 2.3: The set of nonnegative square matrices (A ≥ 0) and its nested subsets of irreducible matrices
(\sum_{k=0}^{n-1} A^k > 0), primitive matrices (there exists k such that A^k > 0), and positive matrices (A > 0).

2.3.2 Main results


We are now ready to state the main results in Perron-Frobenius theory and characterize the properties
of the spectral radius of a nonnegative matrix as a function of the matrix properties. We state the results in
three related theorems.

Theorem 2.12 (Perron–Frobenius Theorem). Let A ∈ R^{n×n}, n ≥ 2. If A is nonnegative, then

(i) there exists a real eigenvalue λ ≥ |μ| ≥ 0 for all other eigenvalues μ,
(ii) the right and left eigenvectors v and w of λ can be selected nonnegative.

If additionally A is irreducible, then

(iii) the eigenvalue λ is strictly positive and simple,
(iv) the right and left eigenvectors v and w of λ are unique and positive, up to rescaling.

If additionally A is primitive, then

(v) the eigenvalue λ satisfies λ > |μ| for all other eigenvalues μ.

Some remarks and some additional statements are in order.


For nonnegative matrices, the real nonnegative eigenvalue λ is the spectral radius ρ(A) of A. We refer
to λ as the dominant eigenvalue of A; it is also referred to as the Perron root. The dominant eigenvalue is
equivalently defined by

ρ(A) = inf{ λ ∈ R | Au ≤ λu for some u > 0 }.
For irreducible matrices, the right and left eigenvectors v and w (unique up to rescaling) of the dominant
eigenvalue are called the right and left dominant eigenvector. One can show that, up to rescaling, the right
dominant eigenvector is the only positive right eigenvector of a primitive matrix A (a similar statement
holds for the left dominant eigenvector); see also Exercise E2.4.
We refer to Theorem 4.8 and Exercise E4.8 in Section 4.5 for some useful bounds on the dominant
eigenvalue and to Theorem 5.2 in Section 5.2 for a version of the Perron–Frobenius Theorem for reducible
matrices.


Remark 2.13 (Examples and counterexamples). The characterizations in the theorem are sharp in the
following sense:

(i) the matrix A3 = [ 0 1; 0 0 ] is nonnegative and reducible and, indeed, its dominant eigenvalue is 0;

(ii) the matrix A2 = [ 0 1; 1 0 ] is irreducible but not primitive and, indeed, its dominant eigenvalue +1 is not
strictly larger, in magnitude, than the other eigenvalue −1.

2.3.3 Applications to dynamical systems


The Perron–Frobenius Theorem for a primitive matrix A has immediate consequences for the behavior of
A^k as k → ∞ and, therefore, the asymptotic behavior of the dynamical system x(k + 1) = Ax(k).

Proposition 2.14 (Powers of primitive matrices). For a primitive matrix A with dominant eigenvalue λ
and with dominant right and left eigenvectors v and w normalized so that v^T w = 1, we have

lim_{k→∞} A^k / λ^k = v w^T.
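Note: this proposition is easy to illustrate numerically. The following sketch, assuming numpy is available, verifies the limit for the primitive matrix A4 of Section 2.3.1:

import numpy as np

A = 0.5 * np.array([[1., 1.], [2., 0.]])   # the primitive matrix A4
lam_r, V = np.linalg.eig(A)
lam_l, W = np.linalg.eig(A.T)
lam = np.max(lam_r.real)                   # dominant eigenvalue (= 1 here)
v = V[:, np.argmax(lam_r.real)].real       # right dominant eigenvector
w = W[:, np.argmax(lam_l.real)].real       # left dominant eigenvector
w = w / (v @ w)                            # normalize so that v^T w = 1
P = np.linalg.matrix_power(A, 50) / lam**50
print(np.allclose(P, np.outer(v, w)))      # True: A^k / lam^k -> v w^T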

We now apply this result to row-stochastic matrices. Recall that A ≥ 0 is row-stochastic if A 1n = 1n.
Therefore, the right eigenvector of the eigenvalue 1 can be selected as 1n.

Corollary 2.15 (Consensus for primitive row-stochastic matrices). For a primitive row-stochastic matrix A,

(i) the simple eigenvalue ρ(A) = 1 is strictly larger than the magnitude of all other eigenvalues, hence A is
semi-convergent;
(ii) lim_{k→∞} A^k = 1n w^T, where w is the positive left eigenvector of A with eigenvalue 1 satisfying w1 +
··· + wn = 1;
(iii) the solution to x(k + 1) = Ax(k) satisfies

lim_{k→∞} x(k) = (w^T x(0)) 1n;

(iv) if additionally A is doubly-stochastic, then w = (1/n) 1n (because A^T 1n = 1n and (1/n) 1n^T 1n = 1) so that

lim_{k→∞} x(k) = ((1n^T x(0))/n) 1n = average(x(0)) 1n.

In this case we say that the dynamical system achieves average consensus.
Note: 1n w^T is the n × n matrix whose rows are all equal to w^T = [w1 w2 ··· wn], and therefore
(1n w^T) x(0) = (w^T x(0)) 1n, that is, a vector with all entries equal to w^T x(0).
Note: the limiting vector is therefore a weighted average of the initial conditions. The relative weights
of the initial conditions are the convex combination coefficients w1 , . . . , wn . In a social influence network,
the coefficient wi is regarded as the social influence of agent i. An early reference to average consensus
is (Harary 1959).


Example 2.16 (Revisiting the wireless sensor network example). Finally, as a numerical example, let
us reconsider the wireless sensor network discussed in Section 1.2 and the 4-dimensional row-stochastic matrix
Awsn. First, note that Awsn is primitive because Awsn^2 is positive:

Awsn = [ 1/2 1/2 0 0; 1/4 1/4 1/4 1/4; 0 1/3 1/3 1/3; 0 1/3 1/3 1/3 ],

Awsn^2 = [ 3/8 3/8 1/8 1/8; 3/16 17/48 11/48 11/48; 1/12 11/36 11/36 11/36; 1/12 11/36 11/36 11/36 ].

Therefore, the Perron–Frobenius Theorem 2.12 for primitive matrices applies to Awsn. The four pairs of
eigenvalues and right eigenvectors of Awsn (as computed in Example 2.5) are:

( 1, 1_4 ),  ( (5 + √73)/24, (1/8) [−2(1 + √73), −11 + √73, 8, 8]^T ),
( (5 − √73)/24, (1/8) [2(√73 − 1), −(11 + √73), 8, 8]^T ),  ( 0, [0, 0, 1, −1]^T ).

Moreover, we know that Awsn is semi-convergent. To apply the convergence results in Corollary 2.15, we numerically
compute its left dominant eigenvector, normalized to have unit sum, to be w = [1/6, 1/3, 1/4, 1/4]^T, so
that we have:

lim_{k→∞} Awsn^k = 1_4 w^T = [ 1/6 1/3 1/4 1/4; 1/6 1/3 1/4 1/4; 1/6 1/3 1/4 1/4; 1/6 1/3 1/4 1/4 ].

Therefore, each solution to the averaging system x(k + 1) = Awsn x(k) converges to a consensus vector
(w^T x(0)) 1_4, that is, the value at each node of the wireless sensor network converges to w^T x(0) = (1/6)x1(0) +
(1/3)x2(0) + (1/4)x3(0) + (1/4)x4(0). Note that Awsn is not doubly-stochastic, so that the averaging
algorithm does not achieve average consensus, and that node 2 has more influence on the final value than the other nodes.
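Note: these computations are easy to reproduce; the following sketch assumes numpy is available and uses an arbitrary illustrative initial condition:

import numpy as np

A = np.array([[1/2, 1/2, 0, 0],
              [1/4, 1/4, 1/4, 1/4],
              [0, 1/3, 1/3, 1/3],
              [0, 1/3, 1/3, 1/3]])
w = np.array([1/6, 1/3, 1/4, 1/4])
print(np.allclose(w @ A, w))                 # w is a left eigenvector for 1
print(np.allclose(np.linalg.matrix_power(A, 100),
                  np.outer(np.ones(4), w)))  # A^k -> 1_4 w^T

x = np.array([1., 2., 3., 4.])               # arbitrary initial condition
for _ in range(100):                         # iterate x(k+1) = A x(k)
    x = A @ x
print(x)                                     # all entries approach w^T x(0)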

Note: If A is reducible, then clearly it is not primitive. Yet, it is possible for an averaging algorithm
described by a reducible matrix to converge to consensus. In other words, Corollary 2.15 provides only
a sufficient condition for consensus. Here is a simple example of an averaging algorithm described by a
reducible matrix that converges to consensus:

x1 (k + 1) = x1 (k),
x2 (k + 1) = x1 (k).

To fully understand what all phenomena are possible and what properties of A are necessary and sufficient
for convergence to consensus, we will study graph theory in the next two chapters.

2.3.4 Selected proofs


We conclude this section with the proof of some selected statements.


Proof of Theorem 2.12. We start by establishing that a primitive matrix A satisfies ρ(A) > 0. By
contradiction, if spec(A) = {0}, then the Jordan normal form J of A is nilpotent, that is, there is k0 ∈ N
so that J^k = 0, and hence A^k = 0, for all k ≥ k0. But this is a contradiction because A being primitive implies that there
is k1 ∈ N so that A^k > 0 for all k ≥ k1.
Next, we prove that ρ(A) is a real positive eigenvalue with a positive right eigenvector v > 0. We first
focus on the case that A is a positive matrix, and later show how to generalize the proof to the case of
primitive matrices. Without loss of generality, assume ρ(A) = 1. If (λ, x) is an eigenpair for A such that
|λ| = ρ(A) = 1, then

|x| = |λ| |x| = |λx| = |Ax| ≤ |A| |x| = A|x|, that is, |x| ≤ A|x|.   (2.8)

Here, we use the notation |x| = (|xi|)_{i∈{1,...,n}}, |A| = (|aij|)_{i,j∈{1,...,n}}, and vector inequalities are
understood component-wise. In what follows, we show |x| = A|x|. With the shorthands z = A|x| and
y = z − |x|, equation (2.8) reads y ≥ 0 and we aim to show y = 0. By contradiction, assume y has a
non-zero component. Then, since A > 0, we have Ay > 0. Independently, we also know z = A|x| > 0. Thus, there must
exist ε > 0 such that Ay > εz. Eliminating the variable y in the latter inequality (note Ay = Az − z), we obtain A_ε z > z, where
we define A_ε = A/(1 + ε). The inequality A_ε z > z implies A_ε^k z > z for all k > 0. Now, observe that
ρ(A_ε) = 1/(1 + ε) < 1 so that lim_{k→∞} A_ε^k = 0_{n×n} and therefore 0 ≥ z. Since we also knew z > 0, we now have a
contradiction. Therefore, we know y = 0.
So far, we have established that |x| = A|x|, so that (1, |x|) is an eigenpair for A. Also note that A > 0
and x ≠ 0 together imply A|x| > 0. Therefore we have established that 1 is an eigenvalue of A with
eigenvector |x| > 0. Next, observe that the above reasoning is correct also for primitive matrices if one
replaces the first equality in (2.8) by |x| = |λ^k| |x| and carries the exponent k throughout the proof.

In summary, we have established that there exists a real eigenvalue λ > 0 such that λ ≥ |μ| for all
other eigenvalues μ, and that each right (and therefore also left) eigenvector of λ can be selected positive
up to rescaling. It remains to prove that λ is simple and is strictly greater than the magnitude of all other
eigenvalues. For the proof of these two points, we refer to (Meyer 2001, Chapter 8). □

Proof of Proposition 2.14. Because A is primitive, we know λ is simple and so we write the Jordan
normal form of A as

A = T [ λ 0; 0 B ] T^{−1},  with  T = [ v1 v2 v3 ··· vn ]  and  T^{−1} = [ w1^T; w2^T; w3^T; ··· ; wn^T ],

where v1, . . . , vn (respectively, w1, . . . , wn) are the columns of T (respectively, the rows of T^{−1}). Equivalently, we have

A [ v1 v2 v3 ··· vn ] = [ v1 v2 v3 ··· vn ] [ λ 0; 0 B ].


The first column of the above matrix equation is A v1 = λ v1, that is, v1 is the dominant right eigenvector
of A. By analogous arguments, we find that w1 is the dominant left eigenvector of A. Next, we recall
A^k = T [ λ^k 0; 0 B^k ] T^{−1}, so that

lim_{k→∞} A^k / λ^k = T ( lim_{k→∞} [ 1 0; 0 (B/λ)^k ] ) T^{−1} = T [ 1 0; 0 0 ] T^{−1}.

Here we used the fact that Theorem 2.12 implies ρ(B/λ) < 1, which in turn implies lim_{k→∞} B^k / λ^k =
0_{(n−1)×(n−1)} by Theorem 2.7. Moreover,

lim_{k→∞} A^k / λ^k = [ v1 v2 v3 ··· vn ] diag(1, 0, . . . , 0) [ w1^T; w2^T; w3^T; ··· ; wn^T ] = v1 w1^T.

Finally, the (1, 1) entry of the matrix equality T^{−1} T = In gives precisely the normalization w1^T v1 = 1.
This concludes the proof of Proposition 2.14. □


2.4 Exercises
E2.1 Simple properties of stochastic matrices. Let A1, A2, . . . , Ak be n × n matrices, let A1 A2 ··· Ak be their
product and let λ1 A1 + ··· + λk Ak be their convex combination with arbitrary convex combination coefficients.
Show that
(i) if A1 , A2 , . . . , Ak are nonnegative, then their product and all their convex combinations are nonnegative,
(ii) if A1 , A2 , . . . , Ak are row-stochastic, then their product and all their convex combinations are row-
stochastic, and
(iii) if A1 , A2 , . . . , Ak are doubly-stochastic, then their product and all their convex combinations are
doubly-stochastic.
E2.2 Semi-convergence and Jordan block decomposition. Consider a matrix A ∈ C^{n×n}, n ≥ 2, with ρ(A) =
1. Show that the following statements are equivalent:
(i) A is semi-convergent,
(ii) either A = In or there exists a nonsingular matrix T ∈ C^{n×n} and a number m ∈ {1, . . . , n − 1} such
that

A = T [ Im 0_{m×(n−m)}; 0_{(n−m)×m} B ] T^{−1},

where B ∈ C^{(n−m)×(n−m)} is convergent, that is, ρ(B) < 1.
(Note that, if A is real, then it is possible to find real T and B in statement (ii) by using the notion of real
Jordan normal form (Hogben 2013).)
E2.3 Row-stochastic matrices after pairwise-difference similarity transform. For n ≥ 2, let A ∈ R^{n×n} be
row-stochastic. Define T ∈ R^{n×n} by

T = [ −1   1    0   ···   0
       0  −1    1   ···   0
       ⋮         ⋱    ⋱    ⋮
       0   ···  0   −1    1
      1/n  1/n  ···  1/n  1/n ].

Perform the following tasks:

(i) for x = [x1, . . . , xn]^T, write T x in components and show T is invertible,
(ii) show T A T^{−1} = [ Astable 0_{n−1}; c^T 1 ] for some Astable ∈ R^{(n−1)×(n−1)} and c ∈ R^{n−1},
(iii) show that, if A is doubly-stochastic, then c = 0,
(iv) show that A primitive implies ρ(Astable) < 1, and
(v) compute T A T^{−1} for A = [ 0 1; 1 0 ].
E2.4 Uniqueness of the nonnegative eigenvector in irreducible nonnegative matrices. Given a square
matrix A ∈ R^{n×n}, show that:
(i) if v1 is a right eigenvector of A corresponding to the eigenvalue λ1, w2 is a left eigenvector of A relative
to λ2, and λ1 ≠ λ2, then v1 ⊥ w2; and
(ii) if A is nonnegative and irreducible and u ∈ R^n_{≥0} is a right nonnegative eigenvector of A, then u is an
eigenvector corresponding to the eigenvalue ρ(A).
E2.5 Powers of primitive matrices. Let A ∈ R^{n×n} be nonnegative. Show that A^k > 0, for some k ∈ N, implies
A^m > 0 for all m ≥ k.


E2.6 The exponent of a primitive matrix.


(i) Let G be the digraph with nodes {1, 2, 3} and edges {(1, 2), (2, 1), (2, 3), (3, 1)}. Explain if and why
G is strongly connected and aperiodic.
(ii) Recall that a nonnegative matrix A is primitive if there exists k such that A^k > 0; the smallest such k is called
the exponent of the primitive matrix A. Do one of the following:
a) prove that the exponent of any primitive matrix A is less than or equal to n, or
b) provide a counterexample.
E2.7 Sufficient condition for primitivity. Consider a nonnegative matrix A ∈ R^{n×n}. If there exists r ∈
{1, . . . , n} such that Arj > 0 and Air > 0 for all i, j ∈ {1, . . . , n}, that is, if A has the sparsity pattern

A = [ ·  ·  ·  ?  ·  ·  ·
      ·  ·  ·  ?  ·  ·  ·
      ·  ·  ·  ?  ·  ·  ·
      ?  ?  ?  ?  ?  ?  ?
      ·  ·  ·  ?  ·  ·  ·
      ·  ·  ·  ?  ·  ·  ·
      ·  ·  ·  ?  ·  ·  · ]   (illustrated here for n = 7 and r = 4),

then the matrix A is primitive. (Here the symbol ? denotes a strictly positive entry and · denotes a positive
or zero entry.)
E2.8 Reducibility fallacies. Consider the following statement:

Any nonnegative square matrix A ∈ R^{n×n} with a zero entry is reducible, because the zero entry
can be moved to position A_{n,1} via a permutation.

Is the statement true? If yes, explain why; if not, provide a counterexample.
E2.9 Symmetric doubly-stochastic matrix. Let A ∈ R^{n×n} be doubly-stochastic. Show that:
(i) the matrix A^T A is doubly-stochastic and symmetric,
(ii) spec(A^T A) ⊂ [0, 1],
(iii) the eigenvalue 1 of A^T A is not necessarily simple even if A is irreducible.
E2.10 On some nonnegative matrices. How many 2 × 2 matrices exist that are simultaneously doubly-stochastic,
irreducible and not primitive? Justify your claim.
E2.11 Discrete-time affine systems. Given A ∈ R^{n×n} and b ∈ R^n, consider the discrete-time affine system

x(k + 1) = A x(k) + b.

Assume A is convergent and show that

(i) the matrix (In − A) is invertible,
(ii) the only equilibrium point of the system is (In − A)^{−1} b, and
(iii) lim_{k→∞} x(k) = (In − A)^{−1} b for all initial conditions x(0) ∈ R^n; see also the sketch following this exercise.
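The following minimal sketch, assuming numpy is available, illustrates claims (ii) and (iii) on an arbitrary illustrative convergent matrix A (here ρ(A) = 0.5 < 1) and offset b:

import numpy as np

A = np.array([[0.2, 0.3], [0.3, 0.2]])       # eigenvalues 0.5 and -0.1
b = np.array([1., 2.])
x_star = np.linalg.solve(np.eye(2) - A, b)   # equilibrium (I - A)^{-1} b

x = np.array([10., -7.])                     # arbitrary initial condition
for _ in range(200):
    x = A @ x + b
print(np.allclose(x, x_star))                # True: x(k) -> (I - A)^{-1} b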
E2.12 An affine averaging system. Given a primitive doubly-stochastic matrix A and a vector b satisfying
1n^T b = 0, consider the dynamical system

x(k + 1) = A x(k) + b.

Show that

(i) the quantity k ↦ 1n^T x(k) is constant,

(ii) for each α ∈ R, there exists a unique equilibrium point x* satisfying 1n^T x* = α, and
(iii) all solutions with initial condition x(0) satisfying 1n^T x(0) = α converge to x*.

Hint: Use Exercises E2.2 and E2.11.


E2.13 The Neumann series. For A ∈ C^{n×n}, show that the following statements are equivalent:
(i) ρ(A) < 1,
(ii) lim_{k→∞} A^k = 0_{n×n}, and
(iii) the Neumann series Σ_{k=0}^∞ A^k converges.

Additionally show that, if any and hence all of these conditions hold, then the matrix (I − A) is invertible and

Σ_{k=0}^∞ A^k = (I − A)^{−1}.

Hint: This statement, written in the style of (Meyer 2001, Section 7.10), is an extension of Theorem 2.7 and a
generalization of the classic geometric series 1/(1 − x) = Σ_{k=0}^∞ x^k, convergent for all |x| < 1. For the proof, the hint
is to use the Jordan normal form.
E2.14 Orthogonal and permutation matrices. A set G with a binary operation mapping two elements of G into
another element of G, denoted by (a, b) ↦ a ⋆ b, is a group if:
a ⋆ (b ⋆ c) = (a ⋆ b) ⋆ c for all a, b, c ∈ G (associativity property);
there exists e ∈ G such that a ⋆ e = e ⋆ a = a for all a ∈ G (existence of an identity element); and
for all a ∈ G there exists a^{−1} ∈ G such that a ⋆ a^{−1} = a^{−1} ⋆ a = e (existence of inverse elements).
Recall that: an orthogonal matrix R is a square matrix whose columns and rows are orthonormal vectors,
i.e., R R^T = In; an orthogonal matrix acts on a vector like a rotation and/or reflection; let O(n) denote the set
of orthogonal matrices. Similarly, recall that: a permutation matrix is a square binary (i.e., entries equal to 0
and 1) matrix with precisely one entry equal to 1 in every row and every column; a permutation matrix acts
on a vector by permuting its entries; let Pn denote the set of permutation matrices. Prove that
(i) the set of orthogonal matrices O(n) with the operation of matrix multiplication is a group;
(ii) the set of permutation matrices Pn with the operation of matrix multiplication is a group; and
(iii) each permutation matrix is orthogonal.
E2.15 On doubly-stochastic and permutation matrices. The following result is known as the Birkhoff–von
Neumann Theorem. For a matrix A ∈ R^{n×n}, the following statements are equivalent:
(i) A is doubly-stochastic; and
(ii) A is a convex combination of permutation matrices.
Do the following:
show that the set of doubly-stochastic matrices is convex (i.e., given any two doubly-stochastic matrices
A1 and A2, any matrix of the form λA1 + (1 − λ)A2, for λ ∈ [0, 1], is again doubly-stochastic);
show that (ii) ⟹ (i);
find in the literature a proof of (i) ⟹ (ii) and sketch it in one or two paragraphs.
E2.16 The Jacobi relaxation in parallel computation. Consider n distributed processors that aim to collectively
solve the linear equation Ax = b, where b ∈ R^n and A ∈ R^{n×n} is invertible and its diagonal elements aii
are nonzero. Each processor stores a variable xi(k) as the discrete-time variable k evolves and applies the
following iterative strategy termed Jacobi relaxation (illustrated in the sketch following this exercise). At time step k ∈ N each processor performs the local
computation

xi(k + 1) = (1/aii) ( bi − Σ_{j=1, j≠i}^n aij xj(k) ),   i ∈ {1, . . . , n}.

Next, each processor i ∈ {1, . . . , n} sends its value xi(k + 1) to all other processors j ∈ {1, . . . , n} with
aji ≠ 0, and they iteratively repeat the previous computation. The initial values of the processors are arbitrary.
(i) Assume the Jacobi relaxation converges, i.e., assume lim_{k→∞} x(k) = x*. Show that A x* = b.
(ii) Give a necessary and sufficient condition for the Jacobi relaxation to converge.
(iii) Use the Geršgorin Disks Theorem 2.8 to show that the Jacobi relaxation converges if A is strictly row
diagonally dominant, that is, if |aii| > Σ_{j=1, j≠i}^n |aij| for all i ∈ {1, . . . , n}.
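The following is a minimal sketch of the Jacobi relaxation, assuming numpy is available; the matrix A and vector b are illustrative, and A is strictly row diagonally dominant so that the iteration converges by part (iii):

import numpy as np

def jacobi(A, b, iters=200):
    # x_i(k+1) = (b_i - sum_{j != i} a_ij x_j(k)) / a_ii
    R = A - np.diag(np.diag(A))        # off-diagonal part of A
    x = np.zeros_like(b)               # arbitrary initial values
    for _ in range(iters):
        x = (b - R @ x) / np.diag(A)
    return x

A = np.array([[4., 1., 1.], [1., 5., 2.], [0., 1., 3.]])
b = np.array([6., 8., 4.])
print(np.allclose(A @ jacobi(A, b), b))   # True: the iteration solves Ax = b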
E2.17 The Jacobi over-relaxation in parallel computation. We now consider a more sophisticated version of the
Jacobi relaxation presented in Exercise E2.16. Consider again n distributed processors that aim to collectively
solve the linear equation Ax = b, where b ∈ R^n and A ∈ R^{n×n} is invertible and its diagonal elements aii
are nonzero. Each processor stores a variable xi(k) as the discrete-time variable k evolves and applies the
following iterative strategy termed Jacobi over-relaxation. At time step k ∈ N each processor performs the
local computation

xi(k + 1) = (1 − ω) xi(k) + (ω/aii) ( bi − Σ_{j=1, j≠i}^n aij xj(k) ),   i ∈ {1, . . . , n},

where ω ∈ R is an adjustable parameter. Next, each processor i ∈ {1, . . . , n} sends its value xi(k + 1) to all
other processors j ≠ i with aji ≠ 0, and they iteratively repeat the previous computation. The initial values
of the processors are arbitrary.
(i) Assume the Jacobi over-relaxation converges to x* and show that A x* = b if ω ≠ 0.
(ii) Find the expression governing the dynamics of the error variable e(k) := x(k) − x*.
(iii) Suppose that A is strictly row diagonally dominant, that is, |aii| > Σ_{j≠i} |aij|. Use the Geršgorin Disks
Theorem 2.8 to discuss the convergence properties of the algorithm for all possible values of ω ∈ R.
Hint: Consider different thresholds for ω.
E2.18 Robotic coordination and geometric optimization on the real line. Consider n ≥ 3 robots with dynamics
ṗi = ui, where i ∈ {1, . . . , n} is an index labeling each robot, pi ∈ R is the position of robot i, and ui ∈ R
is a steering control input. For simplicity, assume that the robots are indexed according to their initial position:
p1(0) ≤ p2(0) ≤ p3(0) ≤ ··· ≤ pn(0). We consider the following distributed control laws to achieve some
geometric configuration:
(i) Move towards the centroid of your neighbors: The robots i ∈ {2, . . . , n − 1} (each having two neighbors)
move to the centroid of the local subset {p_{i−1}, pi, p_{i+1}}:

ṗi = (1/3)(p_{i−1} + pi + p_{i+1}) − pi,   i ∈ {2, . . . , n − 1}.

The robots {1, n} (each having one neighbor) move to the centroid of the local subsets {p1, p2} and
{p_{n−1}, pn}, respectively:

ṗ1 = (1/2)(p1 + p2) − p1   and   ṗn = (1/2)(p_{n−1} + pn) − pn.
By using these coordination laws, the robots asymptotically rendezvous.


(ii) Move towards the centroid of your neighbors or walls: Consider two walls at the positions p0 ≤ p1
and p_{n+1} ≥ pn so that all robots are contained between the walls. The walls are stationary, that is,
ṗ0 = 0 and ṗ_{n+1} = 0. Again, the robots i ∈ {2, . . . , n − 1} (each having two neighbors) move to
the centroid of the local subset {p_{i−1}, pi, p_{i+1}}. The robots {1, n} (each having one robotic neighbor
and one neighboring wall) move to the centroid of the local subsets {p0, p1, p2} and {p_{n−1}, pn, p_{n+1}},
respectively. Hence, the closed-loop robot dynamics are

ṗi = (1/3)(p_{i−1} + pi + p_{i+1}) − pi,   i ∈ {1, . . . , n}.

By using these coordination laws, the robots become uniformly spaced on the interval [p0, p_{n+1}].

(iii) Move away from the centroid of your neighbors or walls: Again consider two stationary walls at p0 ≤ p1 and
p_{n+1} ≥ pn containing the positions of all robots. We partition the interval [p0, p_{n+1}] into areas of interest,
where each robot gets a territory assigned that is closer to itself than to other robots. Hence, robot i ∈
{2, . . . , n − 1} (having two neighbors) obtains the partition Vi = [(pi + p_{i−1})/2, (p_{i+1} + pi)/2], robot 1
obtains the partition V1 = [p0, (p1 + p2)/2], and robot n obtains the partition Vn = [(p_{n−1} + pn)/2, p_{n+1}].
We want to design a distributed algorithm such that the robots have equally sized partitions. We consider
a simple coordination law, where each robot i heads for the midpoint ci(Vi(p)) of its partition Vi:

ṗi = ci(Vi(p)) − pi.

By using these coordination laws, the robots' partitions asymptotically become equally large.

(iv) Discrete-time update rules: If the robots move in discrete-time according to pi^+ = ui, then the above
coordination laws are easily modified via an Euler discretization as follows: replace ṗi = f(p) by
pi^+ − pi = ε f(p) in each coordination law, where ε > 0 is sufficiently small so that the matrices
involved in the discrete iterations are nonnegative.

Consider n = 3 robots, take your favorite problem from above, and show that both the continuous-time and
discrete-time dynamics asymptotically lead to the desired geometric configurations.
E2.19 Continuous-time cyclic pursuit. Consider four mobile robotic vehicles, indexed by i ∈ {1, 2, 3, 4}. We
model each robot as a fully-actuated kinematic point mass, that is, we write ṗi = ui, where pi ∈ C is the
position of robot i in the plane and ui ∈ C is its velocity command. The robots are equipped with onboard
cameras as sensors. The task of the robots is to rendezvous at a common point (while using only onboard sensors).
A simple strategy to achieve rendezvous is cyclic pursuit: each robot i picks another robot, say i + 1
(with indices evaluated modulo 4), and pursues it. This gives rise to the control ui = p_{i+1} − pi and the closed-loop system

[ ṗ1; ṗ2; ṗ3; ṗ4 ] = [ −1 1 0 0; 0 −1 1 0; 0 0 −1 1; 1 0 0 −1 ] [ p1; p2; p3; p4 ].

A simulation of the cyclic-pursuit dynamics is shown in Figure E2.1.


[Figure E2.1 here: trajectories of the four robots in the (x, y) plane.]

Figure E2.1: Four robots, starting from the marked initial positions, perform a cyclic pursuit and rendezvous at a common point.
Your tasks are as follows.
(i) Prove that the center of mass

average(p(t)) = (1/4) Σ_{i=1}^4 pi(t)

is constant for all t ≥ 0. Notice that this is equivalent to saying (d/dt) average(p(t)) = 0.
(ii) Prove that the robots asymptotically rendezvous at the initial center of mass, that is,

lim_{t→∞} pi(t) = average(p(0))   for i ∈ {1, . . . , 4}.

(iii) Prove that if the robots are initially arranged in a square formation, they remain in a square formation
under cyclic pursuit.
Hint: Recall that for a matrix A with semisimple eigenvalues, the solution to the equation ẋ = Ax is given by
the modal expansion x(t) = Σ_i e^{λi t} vi wi^T x(0), where λi is an eigenvalue, and vi and wi are the associated right
and left eigenvectors pairwise normalized to wi^T vi = 1.
E2.20 Simulation (cont'd). This is a follow-up to Exercise E1.1. Consider the linear averaging algorithm in equation (1.1): set n = 5, select the initial state equal to (1, 1, 1, 1, 1), and use (a) the complete graph, (b) a ring
graph, and (c) a star graph with node 1 as center.
(i) To which value do the nodes converge?
(ii) Compute the dominant left eigenvector of the averaging matrix associated to each of the three graphs
and verify that the result in Corollary 2.15(iii) is correct.
E2.21 Continuous- and discrete-time control of mobile robots. Consider n robots moving on the line
with positions z1, z2, . . . , zn ∈ R. In order to gather at a common location (i.e., reach rendezvous), each robot
heads for the centroid of its neighbors, that is,

żi = (1/(n − 1)) Σ_{j=1, j≠i}^n ( zj − zi ).

(i) Will the robots asymptotically rendezvous at a common location?


(ii) Consider the Euler discretization of the above closed-loop dynamics with sampling rate T > 0:

zi(k + 1) = zi(k) + (T/(n − 1)) Σ_{j=1, j≠i}^n ( zj(k) − zi(k) ).

For which values of the sampling period T will the robots rendezvous?
Hint: Use the modal decomposition in Remark 2.3.

Chapter 3
Elements of Graph Theory

In this chapter we review some basic concepts from graph theory as exposed in standard books, e.g.,
see (Diestel 2000; Bollobás 1998). Graph theory provides key concepts to model, analyze and design network
systems and distributed algorithms; the language of graphs pervades modern science and technology and
is therefore essential.

3.1 Graphs and digraphs

[Graphs] An undirected graph (in short, a graph) consists of a set V of elements called vertices and of a set
E of unordered pairs of vertices, called edges. For u, v ∈ V and u ≠ v, the set {u, v} denotes an unordered
edge. We define and visualize some basic example graphs in Figure 3.1.

Figure 3.1: Example graphs. First row: the ring graph with 6 nodes, a star graph with 7 nodes, a tree (see definition
below), the complete graph with 6 nodes (usually denoted by K(6)). Second row: the complete bipartite graph with
3 + 3 nodes (usually denoted by K(3, 3)), a grid graph, and the Petersen graph.


[Neighbors and degrees in graphs] Two vertices u and v of a given graph are neighbors if {u, v} is an
undirected edge. Given a graph G, we let NG (v) denote the set of neighbors of v.
The degree of v is the number of neighbors of v. A graph is regular if all the nodes have the same degree;
e.g., in Figure 3.1, the ring graph is regular with degree 2 whereas the complete bipartite graph K(3, 3) and
the Petersen graph are regular with degree 3.

[Digraphs and self-loops] A directed graph (in short, a digraph) of order n is a pair G = (V, E), where V is a
set with n elements called vertices (or nodes) and E is a set of ordered pairs of vertices called edges. In other
words, E ⊆ V × V. As for graphs, V and E are the vertex set and edge set, respectively. For u, v ∈ V, the
ordered pair (u, v) denotes an edge from u to v. A digraph is undirected if (v, u) ∈ E anytime (u, v) ∈ E.
In a digraph, a self-loop is an edge from a node to itself. Consistently with a customary convention, self-loops
are not allowed in graphs. We define and visualize some basic example digraphs in Figure 3.2.

Figure 3.2: Example digraphs: the ring digraph with 6 nodes, the complete graph with 6 nodes, and a directed acyclic
graph, i.e., a digraph with no directed cycles.

[Subgraphs] A digraph (V′, E′) is a subgraph of a digraph (V, E) if V′ ⊆ V and E′ ⊆ E. A digraph (V′, E′)
is a spanning subgraph of (V, E) if it is a subgraph and V′ = V. The subgraph of (V, E) induced by V′ ⊆ V
is the digraph (V′, E′), where E′ contains all edges in E between two vertices in V′.

[In- and out-neighbors] In a digraph G with an edge (u, v) ∈ E, u is called an in-neighbor of v, and v is
called an out-neighbor of u. We let N^in(v) (resp., N^out(v)) denote the set of in-neighbors (resp., the set of
out-neighbors) of v. Given a digraph G = (V, E), an in-neighbor of a nonempty set of nodes U is a node
v ∈ V \ U for which there exists an edge (v, u) ∈ E for some u ∈ U.

[In- and out-degree] The in-degree din(v) and out-degree dout(v) of v are the number of in-neighbors and
out-neighbors of v, respectively. Note that a self-loop at a node v makes v both an in-neighbor as well as an
out-neighbor of itself. A digraph is topologically balanced if each vertex has the same in- and out-degrees
(even if distinct vertices have distinct degrees).

3.2 Paths and connectivity in undirected graphs


[Paths] A path in a graph is an ordered sequence of vertices such that any pair of consecutive vertices in
the sequence is an edge of the graph. A path is simple if no vertex appears more than once in it, except
possibly for the initial and final vertex.

[Connectivity and connected components] A graph is connected if there exists a path between any two vertices.
If a graph is not connected, then it is composed of multiple connected components, that is, multiple connected
subgraphs.

[Cycles] A cycle is a simple path that starts and ends at the same vertex and has at least three distinct
vertices. A graph is acyclic if it contains no cycles. A connected acyclic graph is a tree.

Figure 3.3: This graph has two connected components. The leftmost connected component is a tree, while the
rightmost connected component is a cycle.

3.3 Paths and connectivity in digraphs

[Directed paths] A directed path in a digraph is an ordered sequence of vertices such that any pair of
consecutive vertices in the sequence is a directed edge of the digraph. A directed path is simple if no vertex
appears more than once in it, except possibly for the initial and final vertex.

[Cycles in digraphs] A cycle in a digraph is a simple directed path that starts and ends at the same vertex. It
is customary to accept as feasible cycles in digraphs also cycles of length 1 (that is, a self-loop) and cycles
of length 2 (that is, composed of just 2 nodes). The set of cycles of a directed graph is finite. A digraph is
acyclic if it contains no cycles.

[Sources and sinks] In a digraph, every vertex with in-degree 0 is called a source, and every vertex with
out-degree 0 is called a sink. Every acyclic digraph has at least one source and at least one sink; see
Exercise E3.1.

Figure 3.4: Acyclic digraph with one sink and two sources. Figure 3.5: Directed cycle.


[Directed trees] A directed tree (sometimes called a rooted tree) is an acyclic digraph with the following
property: there exists a vertex, called the root, such that any other vertex of the digraph can be reached
by one and only one directed path starting at the root. A directed spanning tree of a digraph is a spanning
subgraph that is a directed tree.

3.3.1 Connectivity properties of digraphs


Next, we present four useful connectivity notions for a digraph G:

(i) G is strongly connected if there exists a directed path from any node to any other node;
(ii) G is weakly connected if the undirected version of the digraph is connected;
(iii) G possesses a globally reachable node if one of its nodes can be reached from any other node by
traversing a directed path; and
(iv) G possesses a directed spanning tree if one of its nodes is the root of directed paths to every other
node.

An example of a strongly connected graph is shown in Figure 3.6, and a weakly connected graph with a
globally reachable node is illustrated in Figure 3.7.


Figure 3.6: A strongly connected digraph Figure 3.7: A weakly connected digraph with a globally
reachable node, node #2.

For a digraph G = (V, E), the reverse digraph G^rev has vertex set V and edge set E^rev composed
of all edges in E with reversed direction. Clearly, a digraph contains a directed spanning tree if and only if
the reverse digraph contains a globally reachable node.

3.3.2 Periodicity of strongly-connected digraphs

[Periodic and aperiodic digraphs] A strongly-connected directed graph is periodic if there exists a k > 1,
called the period, that divides the length of every cycle of the graph. In other words, a digraph is periodic if
the greatest common divisor of the lengths of all its cycles is larger than one. A digraph is aperiodic if it is
not periodic.
Note: the definition of periodic digraph is well-posed because a digraph has only a finite number of
cycles (because of the assumptions that nodes are not repeated in simple paths). The notions of periodicity



Figure 3.8: (a) A periodic digraph with period 2. (b) An aperiodic digraph with cycles of length 1 and 2. (c) An
aperiodic digraph with cycles of length 2 and 3.

and aperiodicity only apply to digraphs and not to undirected graphs (where the notion of a cycle is defined
differently). Any strongly-connected digraph with a self-loop is aperiodic.
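Note: the period is readily computed from the cycle lengths. A minimal sketch, assuming the Python library networkx is available, for a digraph with cycles of length 2 and 3 in the spirit of Figure 3.8(c):

from functools import reduce
from math import gcd
import networkx as nx

# A strongly connected digraph with cycles 1->2->3->1 and 1->3->1.
G = nx.DiGraph([(1, 2), (2, 3), (3, 1), (1, 3)])
lengths = [len(c) for c in nx.simple_cycles(G)]
print(nx.is_strongly_connected(G))   # True
print(sorted(lengths))               # [2, 3]
print(reduce(gcd, lengths))          # 1, hence the digraph is aperiodic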

3.3.3 Condensation digraphs

[Strongly connected components] A subgraph H is a strongly connected component of G if H is strongly


connected and any other subgraph of G strictly containing H is not strongly connected.

[Condensation digraph] The condensation digraph of a digraph G, denoted by C(G), is defined as follows:
the nodes of C(G) are the strongly connected components of G, and there exists a directed edge in C(G)
from node H1 to node H2 if and only if there exists a directed edge in G from a node of H1 to a node of H2 .

Figure 3.9: An example digraph, its strongly connected components and its condensation.


Lemma 3.1 (Properties of the condensation digraph). For a digraph G and its condensation digraph
C(G),
(i) C(G) is acyclic,
(ii) G is weakly connected if and only if C(G) is weakly connected, and
(iii) the following statements are equivalent:
a) G contains a globally reachable node,
b) C(G) contains a globally reachable node, and
c) C(G) contains a unique sink.

Proof. We prove statement (i) by contradiction. If there exists a cycle (H1, H2, . . . , Hm, H1) in C(G), then
the set of vertices H1, . . . , Hm is strongly connected in C(G). But this implies that the subgraph
of G containing all nodes of H1, . . . , Hm is strongly connected in G, which contradicts the
fact that any subgraph of G strictly containing any of the H1, . . . , Hm must not be strongly connected.

Statement (ii) is intuitive and simple to prove; we leave this task to the reader.

Regarding statement (iii), we start by proving that (iii)a ⟹ (iii)b. Let v be a globally reachable node
in G and let H be an arbitrary node of C(G). Let H* denote the node of C(G) containing v and pick a
node u in H. Since v is globally reachable, there exists a directed path from u to v in G. This directed path
naturally induces a directed path in C(G) from H to H*. This shows that H* is a globally reachable node in
C(G).

Regarding (iii)b ⟹ (iii)a, let H be a globally reachable node of C(G) and pick a node v in H. We
claim v is globally reachable in G. Indeed, pick any node u in G belonging to a strongly connected
component U of G. Because H is globally reachable in C(G), there exists a directed path of the form
U = H0, H1, . . . , Hk, Hk+1 = H in C(G). One can now piece together a directed path in G from u to v,
by walking inside each of the strongly connected components Hi and moving to the subsequent strongly
connected component H_{i+1}, for i ∈ {0, . . . , k}.

The final equivalence between statement (iii)b and statement (iii)c is an immediate consequence of
C(G) being acyclic. □
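Note: condensation digraphs are computable with standard software; a minimal sketch, assuming networkx is available (the integer labels assigned to the components are implementation-dependent, hence the "e.g." below):

import networkx as nx

# Two strongly connected components, {1, 2} and {3, 4}, joined by edge (2, 3).
G = nx.DiGraph([(1, 2), (2, 1), (2, 3), (3, 4), (4, 3)])
C = nx.condensation(G)
print(list(C.nodes(data="members")))   # e.g., [(0, {1, 2}), (1, {3, 4})]
print(list(C.edges()))                 # e.g., [(0, 1)]; C is acyclic
sinks = [v for v in C if C.out_degree(v) == 0]
print(len(sinks) == 1)                 # True: unique sink, consistent with
                                       # Lemma 3.1 (nodes 3 and 4 are globally reachable)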

3.4 Weighted digraphs


A weighted digraph is a triplet G = (V, E, {a_e}_{e∈E}), where the pair (V, E) is a digraph with nodes
V = {v1, . . . , vn}, and where {a_e}_{e∈E} is a collection of strictly positive weights for the edges E.
Note: for simplicity we let V = {1, . . . , n}. It is therefore equivalent to write {a_e}_{e∈E} or {aij}_{(i,j)∈E}.

[Figure here: a weighted digraph with nodes {1, . . . , 5}.] The collection of weights for this weighted digraph is

a12 = 3.7, a13 = 3.7, a21 = 8.9, a24 = 1.2, a34 = 3.7, a35 = 2.3, a51 = 4.4, a54 = 2.3, a55 = 4.4.


A digraph G = (V = {v1, . . . , vn}, E) can be regarded as a weighted digraph by defining its set of
weights to be all equal to 1, that is, setting a_e = 1 for all e ∈ E. A weighted digraph is undirected if
aij = aji for all i, j ∈ {1, . . . , n}.
The notions of connectivity and definitions of in- and out-neighbors, introduced for digraphs, remain
equally valid for weighted digraphs. The notions of in- and out-degree are generalized to weighted digraphs
as follows. In a weighted digraph with V = {v1, . . . , vn}, the weighted out-degree and the weighted in-degree
of vertex vi are defined by, respectively,

dout(vi) = Σ_{j=1}^n aij   (i.e., dout(vi) is the sum of the weights of all the out-edges of vi),
din(vi) = Σ_{j=1}^n aji   (i.e., din(vi) is the sum of the weights of all the in-edges of vi).

The weighted digraph G is weight-balanced if dout(vi) = din(vi) for all vi ∈ V.
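Note: collecting the weights aij of the five-node example above into a matrix (formally, the weighted adjacency matrix of Section 4.1), the weighted degrees are simple row- and column-sums; a minimal sketch, assuming numpy is available:

import numpy as np

A = np.array([[0, 3.7, 3.7, 0, 0],
              [8.9, 0, 0, 1.2, 0],
              [0, 0, 0, 3.7, 2.3],
              [0, 0, 0, 0, 0],
              [4.4, 0, 0, 2.3, 4.4]])
d_out = A @ np.ones(5)           # weighted out-degrees (row-sums)
d_in = A.T @ np.ones(5)          # weighted in-degrees (column-sums)
print(d_out)                     # [ 7.4 10.1  6.   0.  11.1]
print(d_in)                      # [13.3  3.7  3.7  7.2  6.7]
print(np.allclose(d_out, d_in))  # False: this digraph is not weight-balanced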

3.5 Appendix: Database collections and software libraries


Useful collections of example networks are freely available online; here are some examples:

(i) The Koblenz Network Collection, available at http://konect.uni-koblenz.de and described
in (Kunegis 2013), contains model graphs in easily accessible Matlab format (as well as a Matlab
toolbox for network analysis and a compact overview of the various computed statistics and plots for
the networks in the collection).
(ii) A broad range of example networks is available online at the Stanford Large Network Dataset
Collection, see http://snap.stanford.edu/data.
(iii) The University of Florida Sparse Matrix Collection, available at http://www.cise.ufl.edu/research/
sparse/matrices and described in (Davis and Hu 2011), contains a large and growing set of sparse
matrices and complex graphs arising in a broad range of applications; e.g., see Figure 3.10.
(iv) The UCI Network Data Repository, available at http://networkdata.ics.uci.edu, is an effort to
facilitate the scientific study of networks; see also (DuBois 2008).

Useful software libraries for network analysis and visualization are freely available online; here are
some examples:

(i) Gephi, available at https://gephi.org, is an interactive visualization and exploration platform for
all kinds of networks and complex systems, dynamic and hierarchical graphs. Datasets are available
at https://wiki.gephi.org/index.php?title=Datasets.
(ii) NetworkX, available at http://networkx.github.io, is a Python library for network analysis. For
example, one feature is the ability to compute condensation digraphs. A second interesting feature
is the ability to generate numerous well-known model graphs; see http://networkx.lanl.gov/
reference/generators.html.


(a) IEEE 118 bus system (b) Klavzar bibliography

(c) Pajek network GD99c

Figure 3.10: Example networks from distinct domains: Figure 3.10a shows the standard IEEE 118 power grid testbed (118
nodes); Figure 3.10b shows the Klavzar bibliography network (86 nodes); Figure 3.10c shows the GD99c Pajek network
(105 nodes). Networks parameters are available at http://www.cise.ufl.edu/research/sparse/matrices, and
their layout is obtained via the graph drawing algorithm proposed by Hu (2005).

(iii) Cytoscape, available at http://www.cytoscape.org, is an open-source software platform for visu-


alizing complex networks and integrating them with attribute data.
(iv) Mathematica provides functionality for modeling, analyzing, synthesizing, and visualizing graphs and
networks beside the ability to simulate dynamical systems; see description at http://reference.
wolfram.com/mathematica/guide/GraphsAndNetworks.html.
(v) Graphviz, available at http://www.graphviz.org/, is an open source graph visualization software
which is also compatible with Matlab: http://www.mathworks.com/matlabcentral/fileexchange/
4518-matlab-graphviz-interface.


3.6 Exercises
E3.1 Acyclic digraphs. Let G be an acyclic digraph with n nodes. Show that:

(i) G contains at least one sink, i.e., a vertex without out-neighbors and at least one source, i.e., a vertex
without in-neighbors;
(ii) the vertices of G can be given labels in the set {1, . . . , n} in such a way that if (u, v) is an edge, then
label(u) > label(v). This labeling is called a topological sort of G. Provide an algorithm to define this
labelling; and
(iii) after topologically sorting its vertices, the adjacency matrix of the digraph is lower-triangular, i.e., all its
entries above the main diagonal are equal to zero.
E3.2 Condensation digraphs. Draw the condensation for each of the following digraphs.

E3.3 Directed spanning trees in the condensation digraph. For a digraph G and its condensation digraph
C(G), show that the following statements are equivalent:

(i) G contains a directed spanning tree, and


(ii) C(G) contains a directed spanning tree.
E3.4 Properties of trees. Consider an undirected graph G with n nodes and m edges (and without self-loops).
Show that the following statements are equivalent:

(i) G is a tree;
(ii) G is connected and m = n 1; and
(iii) G is acyclic and m = n 1.
E3.5 Connectivity in topologically balanced digraphs. Prove the following statement: If a digraph G is
topologically balanced and contains either a globally reachable vertex or a directed spanning tree, then G is
strongly connected.
E3.6 Globally reachable nodes and disjoint closed subsets (Lin et al. 2005; Moreau 2005). Consider a digraph
G = (V, E) with at least two nodes. Prove that the following statements are equivalent:

(i) G has a globally reachable node, and


(ii) for every pair S1 , S2 of non-empty disjoint subsets of V , there exists a node that is an out-neighbor of
S1 or S2 .
E3.7 Swiss railroads. Consider the fictitious railroad map of Switzerland given in Figure E3.1.

(i) Can a passenger go from any station to any other?


(ii) Is the graph acyclic? Is it aperiodic? If not, what is its period?


[Figure E3.1 here: a directed railroad graph on the stations Basel, St. Gallen, Zürich, Bern, Interlaken, Lausanne, Chur, Zermatt, and Lugano.]

Figure E3.1: Fictitious railroad map connections in Switzerland

Chapter 4
The Adjacency Matrix

We review here basic concepts from algebraic graph theory. Standard books on algebraic graph theory
are (Biggs 1994; Godsil and Royle 2001). One objective is to relate matrix properties with graph theoretical
properties. A second objective is to understand when a row-stochastic matrix is primitive.

4.1 The adjacency matrix


Given a weighted digraph G = (V, E, {a_e}_{e∈E}), with V = {1, . . . , n}, the weighted adjacency matrix of G
is the n × n nonnegative matrix A defined as follows: for each edge (i, j) ∈ E, the entry (i, j) of A is equal
to the weight a_(i,j) of the edge (i, j), and all other entries of A are equal to zero. In other words, aij > 0 if
and only if (i, j) is an edge of G, and aij = 0 otherwise.

[Figure here: the same weighted digraph as in Section 3.4.] The adjacency matrix of this weighted digraph is

A = [ 0    3.7  3.7  0    0
      8.9  0    0    1.2  0
      0    0    0    3.7  2.3
      0    0    0    0    0
      4.4  0    0    2.3  4.4 ].

The binary adjacency matrix A ∈ {0, 1}^{n×n} of a digraph G = (V = {1, . . . , n}, E) or of a weighted
digraph is defined by

aij = 1 if (i, j) ∈ E, and aij = 0 otherwise.   (4.1)

Here, a binary matrix is any matrix with entries taking values in {0, 1}.


Finally, in a weighted digraph, the weighted out-degree matrix Dout and the weighted in-degree matrix
Din are the diagonal matrices defined by

Dout = diag(A 1n) = diag(dout(1), . . . , dout(n)), and Din = diag(A^T 1n),

where diag(z1, . . . , zn) is the diagonal matrix with diagonal entries equal to z1, . . . , zn.

4.2 Algebraic graph theory: basic and prototypical results


In this section we review some basic and prototypical results that involve correspondences between graphs
and adjacency matrices.
In what follows we let G denote a weighted digraph and A its weighted adjacency matrix or, equivalently,
we let A be a nonnegative matrix and G be its associated weighted digraph (i.e., the digraph with nodes
{1, . . . , n} and with weighted adjacency matrix A). We start with some straightforward statements:

(i) G is undirected if and only if A is symmetric and its diagonal entries are equal to 0;
(ii) G is weight-balanced if and only if A 1n = A^T 1n, i.e., Dout = Din;
(iii) in a digraph G without self-loops, the node i is a sink in G if and only if the ith row-sum of A is zero;
(iv) in a digraph G without self-loops, the node i is a source in G if and only if the ith column-sum of A is
zero;
(v) A is row-stochastic if and only if each node of G has weighted out-degree equal to 1 (so that
Dout = In); and
(vi) A is doubly-stochastic if and only if each node of G has weighted out-degree and weighted in-degree
equal to 1 (so that Dout = Din = In and, in particular, G is weight-balanced).

Next we relate the powers of the adjacency matrix with the existence of directed paths in the digraph.
We start with some simple observations. First, pick two nodes i and j and note that there exists a directed
path from i to j of length 1 (i.e., an edge) if and only if (A)ij > 0. Next, consider the formula for the matrix
power:

(A^2)ij = (ith row of A) · (jth column of A) = Σ_{h=1}^n Aih Ahj.

A directed path from i to j of length 2 exists if and only if there exists a node k such that (i, k) and (k, j)
are edges of G. In turn, (i, k) and (k, j) are edges if and only if Aik > 0 and Akj > 0 and therefore
(A2 )ij > 0. In short, we know that a directed path from i to j of length 2 exists if and only if (A2 )ij > 0.
These observations lead to the following result, whose proof we leave as Exercise E4.1.

Lemma 4.1 (Directed paths and powers of the adjacency matrix). Let G be a weighted digraph with n
nodes, with weighted adjacency matrix A, with unweighted adjacency matrix A_{0,1} ∈ {0, 1}^{n×n}, and possibly
with self-loops. For all i, j ∈ {1, . . . , n} and k ∈ N,


(i) the (i, j) entry of A_{0,1}^k equals the number of directed paths of length k (including paths with self-loops)
from node i to node j; and
(ii) the (i, j) entry of A^k is positive if and only if there exists a directed path of length k (including paths
with self-loops) from node i to node j.
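Note: statement (i) is easy to verify numerically; a minimal sketch, assuming numpy is available, on an illustrative 3-node digraph (the ring digraph 1 → 2 → 3 → 1 with an additional self-loop at node 1):

import numpy as np

A01 = np.array([[1, 1, 0],      # binary adjacency matrix: self-loop at 1,
                [0, 0, 1],      # edges (1, 2), (2, 3), (3, 1)
                [1, 0, 0]])
A3 = np.linalg.matrix_power(A01, 3)
print(A3)          # entry (i, j) counts directed paths of length 3 from i to j
print(A3[0, 0])    # 2: the paths 1->1->1->1 and 1->2->3->1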

4.3 Graph theoretical characterization of irreducible matrices


In this section we provide three equivalent characterizations of the notion of irreducibility an we can now
characterize certain connectivity properties of digraphs based on the powers of the adjacency matrix.
Before proceeding, we introduce a few useful concepts. First, {I, J} is a partition of the index set
{1, . . . , n} if I ∪ J = {1, . . . , n}, I ≠ ∅, J ≠ ∅, and I ∩ J = ∅. Second, a permutation matrix is a square
binary matrix with precisely one entry equal to 1 in every row and every column. (In other words, the
columns of a permutation matrix are a reordering of the basis vectors e1, . . . , en; a permutation matrix
acts on a vector by permuting its entries.) Finally, an n × n matrix A is block triangular if there exists
r ∈ {1, . . . , n − 1} such that

A = [ B C; 0_{(n−r)×r} D ],

where B ∈ R^{r×r}, C ∈ R^{r×(n−r)} and D ∈ R^{(n−r)×(n−r)} are arbitrary.


We are now ready to state the main result of this section.

Theorem 4.2 (Connectivity properties of the digraph and positive powers of the adjacency matrix). Let G be a weighted digraph with n ≥ 2 nodes and weighted adjacency matrix A. The following
statements are equivalent:

(i) A is irreducible, that is, Σ_{k=0}^{n−1} A^k > 0;
(ii) there exists no permutation matrix P such that P^T A P is block triangular;
(iii) G is strongly connected;
(iv) for all partitions {I, J} of the index set {1, . . . , n}, there exist i ∈ I and j ∈ J such that (i, j) is an
edge in G.

Note: as the theorem establishes, there are four equivalent characterizations of irreducibility. In the
literature, it is common to define irreducibility through property (ii) or (iv). We next see two simple
examples.

[Figure here: a digraph with nodes {1, 2, 3}.] This digraph is strongly connected and, accordingly, its
adjacency matrix is irreducible:

A = [ 0 1 0; 0 0 1; 1 1 0 ].


[Figure here: a digraph with nodes {1, 2, 3}.] This digraph is not strongly connected (vertices 2 and 3
are globally reachable, but 1 is not) and, accordingly, its adjacency matrix is reducible:

A = [ 0 1 1; 0 0 1; 0 1 0 ].
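Note: the equivalence of statements (i) and (iii) is easy to check on the two examples above; a minimal sketch, assuming numpy and a recent networkx are available:

import numpy as np
import networkx as nx

def is_irreducible(A):
    # Theorem 4.2(i): A is irreducible iff sum_{k=0}^{n-1} A^k > 0.
    n = A.shape[0]
    S = sum(np.linalg.matrix_power(A, k) for k in range(n))
    return bool(np.all(S > 0))

for A in (np.array([[0, 1, 0], [0, 0, 1], [1, 1, 0]]),   # first example
          np.array([[0, 1, 1], [0, 0, 1], [0, 1, 0]])):  # second example
    G = nx.from_numpy_array(A, create_using=nx.DiGraph)
    print(is_irreducible(A), nx.is_strongly_connected(G))
# prints True True, then False False: irreducibility matches strong connectivity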

Proof of Theorem 4.2. Regarding (iii) ⟹ (iv), pick a partition {I, J} of the index set {1, . . . , n} and two
nodes i0 ∈ I and j0 ∈ J. By assumption there exists a directed path from i0 to j0. Hence there must exist
an edge from a node in I to a node in J.

Regarding (iv) ⟹ (iii), pick a node i ∈ {1, . . . , n} and let Ri ⊆ {1, . . . , n} be the set of nodes
reachable from i, i.e., the set of nodes that belong to directed paths originating from node i. Denote the
unreachable nodes by Ui = {1, . . . , n} \ Ri. By contradiction, assume Ui is not empty. Then
{Ri, Ui} is a partition of the index set {1, . . . , n} and property (iv) implies the existence of a non-zero entry
ajh with j ∈ Ri and h ∈ Ui. But then the node h is reachable from i. Therefore, Ui = ∅, and all nodes are
reachable from i.
Regarding (iii) ⟹ (i), because G is strongly connected, there exists a directed path of length k ≥ 0
connecting node i to node j, for all i and j. By removing any cycles from such a path (so that no
intermediate node is repeated), one obtains a path from i to j of length k < n. Hence, by Lemma 4.1(ii),
the entry (A^k)ij is strictly positive and, in turn, so is the entire matrix sum Σ_{k=0}^{n−1} A^k.

Regarding (i) ⟹ (iii), pick two nodes i and j. Because Σ_{k=0}^{n−1} A^k > 0, there must exist k such that
(A^k)ij > 0. Lemma 4.1(ii) implies the existence of a path of length k from i to j. Hence, G is strongly
connected.
Regarding (ii) ⟹ (iv), by contradiction, assume there exists a partition {I, J} of {1, . . . , n} such that
aij = 0 for all (i, j) ∈ I × J. Let σ: {1, . . . , n} → {1, . . . , n} be a permutation that maps all entries of
J into the first |J| entries of {1, . . . , n}, where |J| denotes the number of elements of J, and let P be the
corresponding permutation matrix, with columns e_{σ(1)}, . . . , e_{σ(n)}. We now compute P^T A P and block partition it as:

P^T A P = [ A_JJ A_JI; A_IJ A_II ],

where A_JJ ∈ R^{|J|×|J|}, A_JI ∈ R^{|J|×|I|}, A_IJ ∈ R^{|I|×|J|}, and A_II ∈ R^{|I|×|I|}. By construction, A_IJ =
0_{|I|×|J|} so that P^T A P is block triangular, which is in contradiction with the assumed statement (ii).

Regarding (iv) ⟹ (ii), by contradiction, assume there exists a permutation matrix P and a number
r < n such that

P^T A P = [ B C; 0_{(n−r)×r} D ],

where the matrices B ∈ R^{r×r}, C ∈ R^{r×(n−r)}, and D ∈ R^{(n−r)×(n−r)} are arbitrary. The permutation
matrix P defines a unique permutation σ: {1, . . . , n} → {1, . . . , n} with the property that the columns of
P are e_{σ(1)}, . . . , e_{σ(n)}, so that (P^T A P)_{kl} = a_{σ(k)σ(l)}. Let J = {σ(1), . . . , σ(r)} and I = {1, . . . , n} \ J. Then, by construction, for any
pair (i, j) ∈ I × J, we know aij = 0, which is in contradiction with the assumed statement (iv). □


Next we present two results, whose proofs are analogous to those of the previous theorem and left to
the reader as an exercise.

Lemma 4.3 (Global reachability and powers of the adjacency matrix). Let G be a weighted digraph
with n ≥ 2 nodes and weighted adjacency matrix A. For any j ∈ {1, . . . , n}, the following statements are
equivalent:

(i) the jth node of G is globally reachable, and
(ii) the jth column of Σ_{k=0}^{n−1} A^k is positive.

Next, we notice that if node j is reachable from node i via a path of length k and at least one node
along that path has a self-loop, then node j is reachable from node i via paths of length k, k + 1, k + 2, and
so on. This observation and the last lemma lead to the following corollary.

Corollary 4.4 (Connectivity properties of the digraph and positive powers of the adjacency matrix: cont'd). Let G be a weighted digraph with n nodes, weighted adjacency matrix A and a self-loop at
each node. The following statements are equivalent:

(i) G is strongly connected; and
(ii) A^{n−1} is positive, so that A is primitive.

For any j ∈ {1, . . . , n}, the following two statements are equivalent:

(iii) the jth node of G is globally reachable; and
(iv) the jth column of A^{n−1} has positive entries.

Finally, we conclude this section with a clarification.

Remark 4.5 (Similarity transformations defined by permutation matrices). Note that P^T A P is the
similarity transformation of A defined by P because the permutation matrix P satisfies P^{−1} = P^T; see
Exercise E2.14. Moreover, note that P^T A P is simply a reordering of rows and columns. For example, consider

P = [ 0 0 1; 1 0 0; 0 1 0 ],  with  P^T = [ 0 1 0; 0 0 1; 1 0 0 ].

Note P e1 = e2 as well as P^T e1 = e3, and compute

A = [ a11 a12 a13; a21 a22 a23; a31 a32 a33 ]  ⟹  P^T A P = [ a22 a23 a21; a32 a33 a31; a12 a13 a11 ],

so that the entries of the 1st, 2nd and 3rd rows of A are mapped respectively to the 3rd, 1st and 2nd rows of
P^T A P and, at the same time, the entries of the 1st, 2nd and 3rd columns of A are mapped respectively to
the 3rd, 1st and 2nd columns of P^T A P.


4.4 Graph theoretical characterization of primitive matrices


In this section we present the main result of this chapter, an immediate corollary and its proof.

Proposition 4.6 (Strongly connected and aperiodic digraph and primitive adjacency matrix). Let
G be a weighted digraph with weighted adjacency matrix A. The following two statements are equivalent:

(i) G is strongly connected and aperiodic; and


(ii) A is primitive, that is, there exists k ∈ N such that A^k is positive.

Before proving Proposition 4.6, we introduce a useful fact from number theory, whose proof we leave
as Exercise E4.11. First, we recall a useful notion: a set of integers is coprime if its elements share no
common positive factor except 1, that is, if their greatest common divisor is 1. Loosely speaking, the following lemma
states that coprime numbers generate, via linear combinations with nonnegative integer coefficients, all
numbers larger than a given threshold.

Lemma 4.7 (Frobenius number). Given a finite set A = {a1, a2, . . . , an} of positive integers, an integer
M is said to be representable by A if there exist nonnegative integers {α1, α2, . . . , αn} such that M =
α1 a1 + ··· + αn an. The following statements are equivalent:

(i) there exists a finite largest unrepresentable integer, called the Frobenius number of A, and
(ii) the greatest common divisor of A is 1.

Finally, we provide a proof for Proposition 4.6 taken from (Bullo et al. 2009).

Proof of Proposition 4.6. Regarding (i) $\implies$ (ii), pick any ordered pair $(i, j)$. We claim that there exists
a number $k(i, j)$ with the property that, for all $m \geq k(i, j)$, we have $(A^m)_{ij} > 0$, that is, there exists a
directed path from $i$ to $j$ of length $m$ for all $m \geq k(i, j)$. If this claim is correct, then the statement (ii) is
proved with $k = \max\{k(i, j) \mid i, j \in \{1, \dots, n\}\}$. To show this claim, let $\{c_1, \dots, c_N\}$ be the set of the
cycles of $G$ and let $\{k_1, \dots, k_N\}$ be their lengths. Because $G$ is aperiodic, the lengths $\{k_1, \dots, k_N\}$ are
coprime and Lemma 4.7 implies the existence of a number $h(k_1, \dots, k_N)$ such that any number larger than
$h(k_1, \dots, k_N)$ is a linear combination of $k_1, \dots, k_N$ with nonnegative integers as coefficients. Because $G$ is
strongly connected, there exists a path of some length $\ell(i, j)$ that starts at $i$, contains a vertex of each
of the cycles $c_1, \dots, c_N$, and terminates at $j$. Now, we claim that $k(i, j) = \ell(i, j) + h(k_1, \dots, k_N)$ has the
desired property. Indeed, pick any number $m \geq k(i, j)$ and write it as $m = \ell(i, j) + \beta_1 k_1 + \dots + \beta_N k_N$ for
appropriate numbers $\beta_1, \dots, \beta_N \in \mathbb{N}$. A directed path from $i$ to $j$ of length $m$ is constructed by attaching
to the path the following cycles: $\beta_1$ times the cycle $c_1$, $\beta_2$ times the cycle $c_2$, \dots, $\beta_N$ times the cycle $c_N$.
Regarding (ii) $\implies$ (i), from Lemma 4.1 we know that $A^k > 0$ means that there are paths of length
$k$ from every node to every other node. Hence, the digraph $G$ is strongly connected. Next, we prove
aperiodicity. Because $G$ is strongly connected, each node of $G$ has at least one outgoing edge, that is, for
all $i$, there exists at least one index $j$ such that $a_{ij} > 0$. This fact implies that the matrix $A^{k+1} = A A^k$ is
positive via the following simple calculation: $(A^{k+1})_{il} = \sum_{h=1}^n a_{ih} (A^k)_{hl} \geq a_{ij} (A^k)_{jl} > 0$. In summary, if
$A^k$ is positive for some $k$, then $A^m$ is positive for all subsequent $m > k$ (see also Exercise E2.5). Therefore,
there are closed paths in $G$ of any sufficiently large length. This fact implies that $G$ is aperiodic; indeed,


by contradiction, if the cycle lengths were not coprime, then $G$ would not possess such closed paths of
arbitrary sufficiently large length.
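As a computational aside (not part of the original text), Proposition 4.6(ii) suggests a direct primitivity test:
check whether some power $A^k$ is positive. For a primitive $n \times n$ matrix, Wielandt's classical bound
guarantees $A^{(n-1)^2+1} > 0$, so it suffices to test powers up to that exponent. A sketch, with illustrative
matrices:
\begin{verbatim}
import numpy as np

def is_primitive(A: np.ndarray) -> bool:
    n = A.shape[0]
    B = (A > 0).astype(float)        # only the zero pattern matters
    P = B.copy()
    for _ in range((n - 1) ** 2 + 1):
        if np.all(P > 0):
            return True
        P = np.minimum(P @ B, 1.0)   # keep entries bounded; pattern only
    return False

# strongly connected and aperiodic (cycle lengths 2 and 3 are coprime)
A = np.array([[0, 1, 0], [1, 0, 1], [1, 0, 0]], dtype=float)
print(is_primitive(A))                                # True
print(is_primitive(np.array([[0., 1.], [1., 0.]])))   # False: 2-periodic
\end{verbatim}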


4.5 Elements of spectral graph theory


In this section we provide some elementary results on the spectral radius of a nonnegative matrix $A$. (We
provide bounds on the eigenvalues of the Laplacian matrix in Section 6.3 and Exercise E6.16.) Recall that
the $i$th entry of the vector $A \mathbb{1}_n$ contains the $i$th row-sum of the matrix $A$ and the out-degree of the $i$th node of
the digraph associated to $A$. In other words, $d_{\text{out}}(i) = e_i^\top A \mathbb{1}_n$.

Theorem 4.8 (Bounds on the spectral radius of a nonnegative matrix). For a nonnegative $n \times n$
matrix $A$ with associated digraph $G$, the following statements hold:

(i) $\rho(A) \leq \max(A \mathbb{1}_n)$;

(ii) if $\min(A \mathbb{1}_n) = \max(A \mathbb{1}_n)$, then $\rho(A) = \max(A \mathbb{1}_n)$; and

(iii) if $\min(A \mathbb{1}_n) < \max(A \mathbb{1}_n)$, then the following two statements are equivalent:

    a) for each node $i$ with $e_i^\top A \mathbb{1}_n = \max(A \mathbb{1}_n)$, there exists a directed path in $G$ from node $i$ to a
       node $j$ with $e_j^\top A \mathbb{1}_n < \max(A \mathbb{1}_n)$; and
    b) $\rho(A) < \max(A \mathbb{1}_n)$.

Before providing the proof, we introduce a useful notion and establish a corollary.

Definition 4.9 (Row-substochastic matrix). A nonnegative $n \times n$ matrix $A$ is row-substochastic if its
row-sums are at most 1 and at least one row-sum is strictly less than 1, that is,
\[
A \mathbb{1}_n \leq \mathbb{1}_n, \quad\text{and there exists } i \in \{1, \dots, n\} \text{ such that } e_i^\top A \mathbb{1}_n < 1.
\]

Note that a row-substochastic matrix satisfies $\min(A \mathbb{1}_n) < \max(A \mathbb{1}_n)$ and that any irreducible row-substochastic matrix satisfies condition (iii)a because the associated digraph is strongly connected. These
two observations lead immediately to the following corollary.

Corollary 4.10 (Irreducible row-substochastic matrices). A row-substochastic irreducible matrix is
convergent.

We now present the proof of the main theorem in this section.

Proof of Theorem 4.8. Regarding statement (i), the Perron–Frobenius Theorem 2.12 applied to the nonnegative
matrix $A$ implies the existence of a vector $x \geq 0_n$, $x \neq 0_n$, such that
\[
Ax = \rho(A) x \implies \rho(A) x_i = \sum_{j=1}^n a_{ij} x_j.
\]
Let $\ell \in \operatorname{argmax}_{i \in \{1, \dots, n\}} \{x_i\}$ be the index (or one of the indices) satisfying $x_\ell = \max\{x_1, \dots, x_n\} > 0$
and compute
\[
\rho(A) = \sum_{j=1}^n a_{\ell j} \frac{x_j}{x_\ell} \leq \sum_{j=1}^n a_{\ell j} \leq \max(A \mathbb{1}_n).
\]


Regarding statement (ii), note that $\mathbb{1}_n$ is an eigenvector of $A$ with eigenvalue $\max(A \mathbb{1}_n)$ so that we know
$\rho(A) \geq \max(A \mathbb{1}_n)$. But we also know from statement (i) that $\rho(A) \leq \max(A \mathbb{1}_n)$; hence equality holds.
Next, we establish that the condition (iii)a implies the bound (iii)b. It suffices to focus on row-substochastic
matrices (if $\max(A \mathbb{1}_n) \neq 1$, we consider the row-substochastic matrix $A / \max(A \mathbb{1}_n)$). We now claim that:
(1) if $e_i^\top A \mathbb{1}_n < 1$, then $e_i^\top A^2 \mathbb{1}_n < 1$,
(2) if $i$ has an out-neighbor $j$ (that is, $a_{ij} > 0$) with $e_j^\top A \mathbb{1}_n < 1$, then $e_i^\top A^2 \mathbb{1}_n < 1$,
(3) there exists $k$ such that $A^k \mathbb{1}_n < \mathbb{1}_n$, and
(4) $\rho(A) < 1$.
Regarding statement (1), for a node $i$ satisfying $e_i^\top A \mathbb{1}_n < 1$, we compute
\[
e_i^\top A \mathbb{1}_n < 1 \implies e_i^\top A^2 \mathbb{1}_n = e_i^\top A (A \mathbb{1}_n) \leq e_i^\top A \mathbb{1}_n < 1,
\]
where we used the implication: if $0_n \leq v \leq \mathbb{1}_n$ and $w \geq 0_n$, then $w^\top v \leq w^\top \mathbb{1}_n$. Next, for a node $i$
satisfying $a_{ij} > 0$ with $e_j^\top A \mathbb{1}_n < 1$, note that $e_j^\top A \mathbb{1}_n < 1$ and $A \mathbb{1}_n \leq \mathbb{1}_n$ together imply
\[
A \mathbb{1}_n \leq \mathbb{1}_n - \big(1 - e_j^\top A \mathbb{1}_n\big) e_j, \quad\text{where } 1 - e_j^\top A \mathbb{1}_n > 0.
\]
Therefore, we compute
\[
e_i^\top A^2 \mathbb{1}_n = (e_i^\top A)(A \mathbb{1}_n)
\leq (e_i^\top A) \Big( \mathbb{1}_n - \big(1 - e_j^\top A \mathbb{1}_n\big) e_j \Big)
= e_i^\top A \mathbb{1}_n - \big(1 - e_j^\top A \mathbb{1}_n\big) e_i^\top A e_j
\leq 1 - \big(1 - e_j^\top A \mathbb{1}_n\big) a_{ij} < 1.
\]
This concludes the proof of statement (2).


Regarding statement (3), note that, if $A$ is row-substochastic, then $A^k$ is row-substochastic for any
natural $k \geq 1$. Let $S_k$ be the set of indices $i$ such that the $i$th row-sum of $A^k$ is strictly less than 1.
Statement (1) implies $S_k \subseteq S_{k+1}$. Moreover, because of the existence of directed paths from every node
to nodes with row-sum less than 1, we know that there exists $k^\ast$ such that $S_{k^\ast} = \{1, \dots, n\}$. This proves
statement (3).
Next, define the maximum row-sum of $A^{k^\ast}$ by
\[
\sigma = \max_{i \in \{1, \dots, n\}} \sum_{j=1}^n (A^{k^\ast})_{ij} < 1.
\]
Given any natural number $m$, we can write $m = a k^\ast + b$ with $a$ a nonnegative integer and $b \in \{0, \dots, k^\ast - 1\}$.
Note that
\[
A^m \mathbb{1}_n \leq A^{a k^\ast} \mathbb{1}_n \leq \sigma^a \mathbb{1}_n.
\]
The last inequality implies that, as $m \to \infty$ and therefore $a \to \infty$, the sequence $A^m$ converges to 0. This
fact proves statement (4) and, in turn, that the condition (iii)a implies the bound (iii)b.
Finally, we sketch the proof that the bound (iii)b implies the condition (iii)a. By contradiction, if
condition (iii)a does not hold, then the condensation of $G$ contains a sink whose corresponding row-sums
in $A$ are all equal to $\max(A \mathbb{1}_n)$. But to that sink corresponds an eigenvector of $A$ whose eigenvalue is
therefore $\max(A \mathbb{1}_n)$. We refer to Theorem 5.3 for a brief review of the properties of reducible nonnegative
matrices and leave the details of the proof to the reader.
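A quick numerical illustration (not part of the original text; the matrix values are arbitrary) of the bounds in
Theorem 4.8, of the lower bound of Exercise E4.8, and of Corollary 4.10:
\begin{verbatim}
import numpy as np

A = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.3, 0.3, 0.0]])    # irreducible; third row-sum is 0.6 < 1
rho = max(abs(np.linalg.eigvals(A)))
print(min(A.sum(axis=1)) <= rho <= max(A.sum(axis=1)))  # True
print(rho < 1)                                # True (Corollary 4.10)
print(np.linalg.matrix_power(A, 100).round(6))  # ~ zero: A is convergent
\end{verbatim}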


4.6 Exercises
E4.1 Directed paths and powers of the adjacency matrix. Prove Lemma 4.1.
E4.2 Edges and triangles in an undirected graph. Let $A$ be the binary adjacency matrix for an undirected graph
$G$ without self-loops. Recall that the trace of $A$ is $\operatorname{trace}(A) = \sum_{i=1}^n a_{ii}$.
(i) Show $\operatorname{trace}(A) = 0$.
(ii) Show $\operatorname{trace}(A^2) = 2|E|$, where $|E|$ is the number of edges of $G$.
(iii) Show $\operatorname{trace}(A^3) = 6|T|$, where $|T|$ is the number of triangles of $G$. (A triangle is a complete subgraph
with three vertices.)
(iv) Verify results (i)--(iii) on the matrix $A = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}$.
E4.3 A sufficient condition for primitivity. Assume the square matrix A is nonnegative and irreducible. Show
that
(i) if A has a positive diagonal element, then A is primitive,
(ii) if A is primitive, then it is false that A must have a positive diagonal element.
E4.4 Example row-stochastic matrices and associated digraphs. Consider the row-stochastic matrices
\[
A_1 = \frac{1}{2} \begin{bmatrix} 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 \end{bmatrix}, \quad
A_2 = \frac{1}{2} \begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 \end{bmatrix}, \quad\text{and}\quad
A_3 = \frac{1}{2} \begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \end{bmatrix}.
\]
Draw the digraphs $G_1$, $G_2$ and $G_3$ associated with these three matrices. Using only the original definitions
and without relying on the characterizations in Theorem 4.2 and Proposition 4.6, show that:
(i) the matrices $A_1$, $A_2$ and $A_3$ are irreducible and primitive,
(ii) the digraphs $G_1$, $G_2$ and $G_3$ are strongly connected and aperiodic, and
(iii) the averaging algorithm defined by $A_2$ converges in a finite number of steps.
E4.5 Primitive matrices are irreducible. Prove Lemma 2.11, that is, show that a primitive matrix is irreducible.
Hint: You are allowed to use Theorem 4.2.
E4.6 Yet another equivalent definition of irreducibility. Consider a nonnegative matrix $A$ of dimension $n$.
From Theorem 4.2, we know that $A$ is irreducible if and only if
(i) there does not exist a permutation matrix $P \in \{0, 1\}^{n \times n}$ and $1 \leq r \leq n-1$ such that
\[
P^\top A P = \begin{bmatrix} B_{r \times r} & C_{r \times (n-r)} \\ 0_{(n-r) \times r} & D_{(n-r) \times (n-r)} \end{bmatrix}.
\]
Consider now the following property of $A$:
(ii) for any nonnegative vector $y \in \mathbb{R}^n_{\geq 0}$ with $0 < k < n$ strictly positive components, the vector $(I_n + A) y$
has at least $k + 1$ strictly positive components.
Prove that statement (i) implies statement (ii).


E4.7 Irreducibility and permutations. Consider the following binary matrix:
\[
A = \begin{bmatrix}
1 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 \\
1 & 0 & 0 & 1 & 1 \\
1 & 0 & 0 & 1 & 1
\end{bmatrix}.
\]
Prove that $A$ is irreducible or prove that $A$ is reducible by providing a permutation matrix $P$ that transforms
$A$ into an upper block-triangular matrix, i.e., $P^\top A P = \begin{bmatrix} ? & ? \\ 0 & ? \end{bmatrix}$.
E4.8 Bounds on the spectral radius of irreducible nonnegative matrices. For a nonnegative and irreducible
matrix $A$, show
(i) $\min(A \mathbb{1}_n) \leq \rho(A)$ and, therefore, $\min(A \mathbb{1}_n) \leq \rho(A) \leq \max(A \mathbb{1}_n)$;
(ii) if $\min(A \mathbb{1}_n) < \max(A \mathbb{1}_n)$, then $\min(A \mathbb{1}_n) < \rho(A) < \max(A \mathbb{1}_n)$.

E4.9 Eigenvalue shifting for stochastic matrices. Let $A \in \mathbb{R}^{n \times n}$ be an irreducible row-stochastic matrix. Let
$E$ be a diagonal matrix with diagonal elements $E_{ii} \in \{0, 1\}$, with at least one diagonal element equal to zero.
Show that $AE$ is convergent.
E4.10 Normalization of nonnegative irreducible matrices. Consider a strongly connected weighted digraph
$G$ with $n$ nodes and with an irreducible adjacency matrix $A \in \mathbb{R}^{n \times n}$. The matrix $A$ is not necessarily
row-stochastic. Find a positive vector $v \in \mathbb{R}^n$ so that the normalized matrix
\[
A_{\text{normalized}} = \frac{1}{\rho(A)} \big(\operatorname{diag}(v)\big)^{-1} A \operatorname{diag}(v)
\]
is nonnegative, irreducible, and row-stochastic.
E4.11 The Frobenius number. Prove Lemma 4.7.
Hint: Read up on the Frobenius number in (Owens 2003).
E4.12 Leslie population model. The Leslie model is used in population ecology to model the changes in a
population of organisms over a period of time; see the original reference (Leslie 1945) and a comprehensive
text (Caswell 2006). In this model, the population is divided into $n$ groups based on age classes; the indices $i$ are
ordered increasingly with the age, so that $i = 1$ is the class of the newborns. The variable $x_i(k)$, $i \in \{1, \dots, n\}$,
denotes the number of individuals in the age class $i$ at time $k$; at every time step $k$ the $x_i(k)$ individuals
produce a number $\alpha_i x_i(k)$ of offspring (i.e., individuals belonging to the first age class), where $\alpha_i \geq 0$
is a fecundity rate, and progress to the next age class with a survival rate $\sigma_i \in [0, 1]$.
If $x(k)$ denotes the vector of individuals at time $k$, the Leslie population model reads
\[
x(k+1) = A x(k) = \begin{bmatrix}
\alpha_1 & \alpha_2 & \dots & \alpha_{n-1} & \alpha_n \\
\sigma_1 & 0 & \dots & 0 & 0 \\
0 & \sigma_2 & \ddots & \vdots & \vdots \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & \dots & \sigma_{n-1} & 0
\end{bmatrix} x(k), \tag{E4.1}
\]


where $A$ is referred to as the Leslie matrix. Consider the following two independent sets of questions. First,
assume $\alpha_i > 0$ for all $i \in \{1, \dots, n\}$ and $0 < \sigma_i \leq 1$ for all $i \in \{1, \dots, n-1\}$.
(i) Prove that the matrix $A$ is primitive.
(ii) Let $p_i(k) = x_i(k) / \sum_{j=1}^n x_j(k)$ denote the percentage of the total population in class $i$ at time $k$. Call $p(k)$ the
population distribution at time $k$. Compute $\lim_{k \to +\infty} p(k)$ as a function of the spectral radius $\rho(A)$ and
the parameters $(\alpha_i, \sigma_i)$, $i \in \{1, \dots, n\}$.
(iii) Assume $\sigma_i = \sigma > 0$ and $\alpha_i = \alpha > 0$ for $i \in \{1, \dots, n\}$. What percentage of the total population belongs to
the eldest class asymptotically, that is, what is $\lim_{k \to \infty} p_n(k)$?
(iv) Find a sufficient condition on the parameters $(\alpha_i, \sigma_i)$, $i \in \{1, \dots, n\}$, so that the population will
eventually become extinct.
Second, assume $\alpha_i \geq 0$ for $i \in \{1, \dots, n\}$ and $0 \leq \sigma_i \leq 1$ for all $i \in \{1, \dots, n-1\}$.
(v) Find a necessary and sufficient condition on the parameters $(\alpha_i, \sigma_i)$, $i \in \{1, \dots, n\}$, so that the Leslie
matrix $A$ is irreducible.
(vi) For an irreducible Leslie matrix (as in the previous point (v)), find a sufficient condition on the parameters
$(\alpha_i, \sigma_i)$, $i \in \{1, \dots, n\}$, that ensures that the population will not go extinct.
E4.13 Swiss railroads: continued. From Exercise E3.7, consider the fictitious railroad map of Switzerland given in
Figure E3.1. Write the unweighted adjacency matrix $A$ of this transportation network and, relying upon $A$
and its powers, answer the following questions:
(i) what is the number of links of the shortest path connecting St. Gallen to Zermatt?
(ii) is it possible to go from Bern to Chur using 4 links? And 5?
(iii) how many different routes, with strictly fewer than 9 links and possibly visiting the same station more
than once, start from Zürich and end in Lausanne?

Chapter 5
Discrete-time Averaging Systems

After our discussions about matrix and graph theory, we are finally ready to go back to the examples
introduced in Chapter 1. Namely, we recall from Chapter 1 the study of (i) opinion dynamics in social
influence networks (given an arbitrary stochastic matrix, what do its powers converge to?) and (ii) averaging
algorithms in wireless sensor networks (design an algorithm to compute the average of a collection of
numbers located at distinct nodes). Other related examples were given in the appendices of Chapter 1,
including the study of robotic networks in cyclic pursuit and balancing and of more general design problems
in wireless sensor networks.
This chapter discusses two topics. First, we present some analysis results and, specifically, some
convergence results for averaging algorithms defined by stochastic matrices; we discuss primitive matrices
and reducible matrices with a single or multiple sinks. Our treatment is related to the discussion in (Jackson
2010, Chapter 8) and (DeMarzo et al. 2003, Appendix C and, specifically, Theorem 10). Second, we show
some design results and, specifically, how to design optimal matrices; we discuss the equal-neighbor model
and the Metropolis–Hastings model. The computation of optimal averaging algorithms (doubly-stochastic
matrices) is discussed in Boyd et al. (2004).

Figure 5.1: Interactions in a social influence network

5.1 Averaging with primitive row-stochastic matrices


From Chapter 2 on matrix theory, we can now re-state the main convergence result in Corollary 2.15 in a
more explicit way using the main graph-theory result in Proposition 4.6.

Corollary 5.1 (Consensus for row-stochastic matrices with strongly connected and aperiodic
graph). If a row-stochastic matrix $A$ has an associated digraph that is strongly connected and aperiodic
(hence $A$ is primitive), then

(i) $\lim_{k \to \infty} A^k = \mathbb{1}_n w^\top$, where $w > 0$ is the left eigenvector of $A$ with eigenvalue 1 satisfying
$w_1 + \dots + w_n = 1$;

(ii) the solution to $x(k+1) = A x(k)$ satisfies
\[
\lim_{k \to \infty} x(k) = \big( w^\top x(0) \big) \mathbb{1}_n;
\]
(iii) if additionally $A$ is doubly-stochastic, then $w = \frac{1}{n} \mathbb{1}_n$ (because $A^\top \mathbb{1}_n = \mathbb{1}_n$ and $\frac{1}{n} \mathbb{1}_n^\top \mathbb{1}_n = 1$) so that
\[
\lim_{k \to \infty} x(k) = \frac{\mathbb{1}_n^\top x(0)}{n} \mathbb{1}_n = \operatorname{average}\big(x(0)\big) \mathbb{1}_n.
\]
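A simulation sketch (not part of the original text; the matrix values are illustrative) of Corollary 5.1: for a
primitive row-stochastic matrix, the iterates of $x(k+1) = A x(k)$ reach consensus at the value $w^\top x(0)$
predicted by the normalized left dominant eigenvector:
\begin{verbatim}
import numpy as np

A = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.5, 0.5]])            # row-stochastic and primitive
vals, vecs = np.linalg.eig(A.T)
w = np.real(vecs[:, np.argmin(abs(vals - 1))])
w = w / w.sum()                            # left eigenvector, unit sum

x0 = np.array([1.0, 0.0, -2.0])            # arbitrary initial opinions
x = x0.copy()
for _ in range(200):
    x = A @ x
print(x)          # all entries approximately equal
print(w @ x0)     # the predicted consensus value w^T x(0)
\end{verbatim}
Here $w = (1/4, 1/2, 1/4)$ and both printed values agree at $-0.25$.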

5.2 Averaging with reducible matrices


Next, consider a reducible row-stochastic matrix $A$, i.e., a row-stochastic matrix whose associated digraph
$G$ is not strongly connected. We wish to give sufficient conditions for the semi-convergence of $A$.
We first recall a useful property from Lemma 3.1: $G$ has a globally reachable node if and only if its
condensation digraph has a globally reachable node (that is, a single sink). Along these same lines one can
show that the set of globally reachable nodes induces a strongly connected component of $G$. A digraph
with a globally reachable node and its condensation digraph are illustrated in Figure 5.2.

Figure 5.2: First panel: An example digraph with a set of globally reachable nodes. Second panel: its strongly
connected components (in red and blue). Third panel: its condensation digraph with a sink. For this digraph, the
subgraph induced by the globally reachable nodes is aperiodic.

We are now ready to establish the semi-convergence of adjacency matrices of digraphs with globally
reachable nodes. The following result amounts to an extension of the Perron–Frobenius Theorem 2.12 to a
class of reducible matrices.

Theorem 5.2 (Consensus for row-stochastic matrices with a globally-reachable aperiodic strongly-connected component). Let $A$ be a row-stochastic matrix and let $G$ be its associated digraph. Assume
that $G$ has a globally reachable node and the subgraph induced by the set of globally reachable nodes is
aperiodic. Then
(i) the simple eigenvalue $\rho(A) = 1$ is strictly larger than the magnitude of all other eigenvalues, hence $A$ is
semi-convergent;
(ii) $\lim_{k \to \infty} A^k = \mathbb{1}_n w^\top$, where $w \geq 0$ is the left eigenvector of $A$ with eigenvalue 1 satisfying
$w_1 + \dots + w_n = 1$;
(iii) the eigenvector $w \geq 0$ has positive entries corresponding to each globally reachable node and zero
entries for all other nodes;
(iv) the solution to $x(k+1) = A x(k)$ satisfies
\[
\lim_{k \to \infty} x(k) = \big( w^\top x(0) \big) \mathbb{1}_n.
\]


Note that, for all nodes $j$ that are not globally reachable, the initial values $x_j(0)$ have no effect on the
final convergence value.
Note: as we discussed in Section 2.3, the limiting vector is a weighted average of the initial conditions.
The relative weights of the initial conditions are the convex combination coefficients $w_1, \dots, w_n$. In a social
influence network, the coefficient $w_i$ is regarded as the social influence of agent $i$. We illustrate this concept
by computing the social influence coefficients for the famous Krackhardt's advice network (Krackhardt
1987); see Figure 5.3.
Note: adjacency matrices of digraphs with globally reachable nodes are sometimes called indecomposable;
see (Wolfowitz 1963).

Figure 5.3: Krackhardts advice network with 21 nodes. The social influence of each node is illustrated by its gray
level.

Proof of Theorem 5.2. By assumption the condensation digraph of $A$ contains a sink that is globally reachable,
hence it is unique. Assuming $0 < n_1 < n$ nodes are globally reachable, a permutation of rows and columns
(see Exercise E3.1) brings the matrix $A$ into the form
\[
A = \begin{bmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{bmatrix} \quad\text{(lower block-triangular matrix)}, \tag{5.1}
\]
where $A_{11} \in \mathbb{R}^{n_1 \times n_1}$, $A_{22} \in \mathbb{R}^{n_2 \times n_2}$, with $n_1 + n_2 = n$. The state vector $x$ is correspondingly partitioned
into $x_1 \in \mathbb{R}^{n_1}$ and $x_2 \in \mathbb{R}^{n_2}$ so that
\begin{align}
x_1(k+1) &= A_{11} x_1(k), \tag{5.2} \\
x_2(k+1) &= A_{21} x_1(k) + A_{22} x_2(k). \tag{5.3}
\end{align}
In other words, $x_1$ and $A_{11}$ are the variables and the matrix corresponding to the sink. Because the sink,
as a subgraph of $G$, is strongly connected and aperiodic, $A_{11}$ is primitive and row-stochastic and, by
Corollary 5.1,
\[
\lim_{k \to \infty} A_{11}^k = \mathbb{1}_{n_1} w_1^\top,
\]
where $w_1 > 0$ is the left eigenvector with eigenvalue 1 for $A_{11}$ normalized so that $\mathbb{1}_{n_1}^\top w_1 = 1$.
The matrix $A_{22}$ is analyzed as follows. Recall from Corollary 4.10 that an irreducible row-substochastic
matrix has spectral radius less than 1. Now, because $A_{21}$ cannot be zero (otherwise the sink would not


be globally reachable), the matrix $A_{22}$ is row-substochastic. Moreover, (after appropriately permuting
rows and columns of $A_{22}$) it can be observed that $A_{22}$ is a lower block-triangular matrix such that each
diagonal block is row-substochastic and irreducible (corresponding to each node in the condensation digraph).
Therefore, we know $\rho(A_{22}) < 1$ and, in turn, $I_{n_2} - A_{22}$ is invertible. Because $A_{11}$ is primitive and
$\rho(A_{22}) < 1$, $A$ is semi-convergent and $\lim_{k \to \infty} x_2(k)$ exists. Taking the limit as $k \to \infty$ in equation (5.3),
some straightforward algebra shows that
\[
\lim_{k \to \infty} x_2(k) = (I_{n_2} - A_{22})^{-1} A_{21} \lim_{k \to \infty} x_1(k)
= (I_{n_2} - A_{22})^{-1} A_{21} \big( \mathbb{1}_{n_1} w_1^\top \big) x_1(0).
\]
From the row-stochasticity of $A$, we know $A_{21} \mathbb{1}_{n_1} + A_{22} \mathbb{1}_{n_2} = \mathbb{1}_{n_2}$ and hence $(I_{n_2} - A_{22})^{-1} A_{21} \mathbb{1}_{n_1} = \mathbb{1}_{n_2}$.
Collecting these results, we write
\[
\lim_{k \to \infty} \begin{bmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{bmatrix}^k
= \begin{bmatrix} \mathbb{1}_{n_1} w_1^\top & 0 \\ \mathbb{1}_{n_2} w_1^\top & 0 \end{bmatrix}
= \mathbb{1}_n \begin{bmatrix} w_1 \\ 0_{n_2} \end{bmatrix}^\top.
\]
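The following sketch (not part of the original text; all weights are illustrative) exercises Theorem 5.2: the sink
consists of nodes 1 and 2, and the initial value of node 3 has no effect on the limit:
\begin{verbatim}
import numpy as np

A = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.5, 0.0],
              [0.25, 0.25, 0.5]])   # block lower-triangular as in (5.1)
x = np.array([4.0, 0.0, 100.0])     # x3(0) should not affect the limit
for _ in range(200):
    x = A @ x
print(x)   # ~ [2, 2, 2]: consensus on w^T x(0) with w = (1/2, 1/2, 0)
\end{verbatim}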

5.3 Averaging with reducible matrices and multiple sinks


In this section we consider the general case of digraphs that do not contain globally reachable nodes,
that is, digraphs whose condensation digraph has multiple sinks. In the following statement we say that a
node is connected with a sink of a digraph if there exists a directed path from the node to some node in the
sink.

Theorem 5.3 (Convergence for row-stochastic matrices with multiple aperiodic sinks). Let $A$ be
a row-stochastic matrix and let $G$ be its associated digraph. Assume the condensation digraph $C(G)$ contains
$M \geq 2$ sinks and assume all of them are aperiodic. Then

(i) the semi-simple eigenvalue $\rho(A) = 1$ has multiplicity equal to $M$ and is strictly larger than the magnitude
of all other eigenvalues, hence $A$ is semi-convergent,
(ii) there exist $M$ left eigenvectors of $A$, denoted by $w^m \in \mathbb{R}^n$, for $m \in \{1, \dots, M\}$, with the properties
that: $w^m \geq 0$, $w_1^m + \dots + w_n^m = 1$, and $w_i^m$ is positive if and only if node $i$ belongs to the $m$-th sink,
(iii) the solution to $x(k+1) = A x(k)$ with initial condition $x(0)$ satisfies
\[
\lim_{k \to \infty} x_i(k) =
\begin{cases}
(w^m)^\top x(0), & \text{if node } i \text{ belongs to the } m\text{-th sink}, \\
(w^m)^\top x(0), & \text{if node } i \text{ is connected with the } m\text{-th sink and no other sink}, \\
\displaystyle\sum_{m=1}^M z_{i,m} \big( (w^m)^\top x(0) \big), & \text{if node } i \text{ is connected to more than one sink},
\end{cases}
\]
where, for each node $i$ connected to more than one sink, the coefficients $z_{i,m}$, $m \in \{1, \dots, M\}$, are convex
combination coefficients and are strictly positive if and only if there exists a directed path from node $i$ to
the sink $m$.


Proof. Rather than treating the general case with heavy notation, we work out an example and refer the
reader to (DeMarzo et al. 2003, Theorem 10) for the general proof. Assume the condensation digraph of $A$
is composed of three nodes, two of which (those corresponding to the state components $x_1$ and $x_2$) are
sinks, as in the side figure.
Therefore, after a permutation of rows and columns (see Exercise E3.1), $A$ can be written as
\[
A = \begin{bmatrix} A_{11} & 0 & 0 \\ 0 & A_{22} & 0 \\ A_{31} & A_{32} & A_{33} \end{bmatrix}
\]

and the state vector $x$ is correspondingly partitioned into the vectors $x_1$, $x_2$ and $x_3$. The state equations are:
\begin{align}
x_1(k+1) &= A_{11} x_1(k), \tag{5.4} \\
x_2(k+1) &= A_{22} x_2(k), \tag{5.5} \\
x_3(k+1) &= A_{31} x_1(k) + A_{32} x_2(k) + A_{33} x_3(k). \tag{5.6}
\end{align}

By the properties of the condensation digraph and the assumption of aperiodicity of the sinks, the
digraphs associated to the row-stochastic matrices $A_{11}$ and $A_{22}$ are strongly connected and aperiodic.
Therefore, we immediately conclude that
\[
\lim_{k \to \infty} x_1(k) = \big( w_1^\top x_1(0) \big) \mathbb{1}_{n_1}
\quad\text{and}\quad
\lim_{k \to \infty} x_2(k) = \big( w_2^\top x_2(0) \big) \mathbb{1}_{n_2},
\]
where $w_1$ (resp. $w_2$) is the left eigenvector of the eigenvalue 1 for matrix $A_{11}$ (resp. $A_{22}$) with the usual
normalization $\mathbb{1}_{n_1}^\top w_1 = \mathbb{1}_{n_2}^\top w_2 = 1$.
Regarding the matrix $A_{33}$, the same discussion as in the previous proof leads to $\rho(A_{33}) < 1$ and, in
turn, to the statement that $I_{n_3} - A_{33}$ is nonsingular. By taking the limit as $k \to \infty$ in equation (5.6), some
straightforward algebra shows that
\[
\lim_{k \to \infty} x_3(k) = (I_{n_3} - A_{33})^{-1} \Big( A_{31} \lim_{k \to \infty} x_1(k) + A_{32} \lim_{k \to \infty} x_2(k) \Big)
= \big( w_1^\top x_1(0) \big) (I_{n_3} - A_{33})^{-1} A_{31} \mathbb{1}_{n_1}
+ \big( w_2^\top x_2(0) \big) (I_{n_3} - A_{33})^{-1} A_{32} \mathbb{1}_{n_2}.
\]
Moreover, because $A$ is row-stochastic, we know
\[
A_{31} \mathbb{1}_{n_1} + A_{32} \mathbb{1}_{n_2} + A_{33} \mathbb{1}_{n_3} = \mathbb{1}_{n_3},
\]
and, using again the fact that $I_{n_3} - A_{33}$ is nonsingular,
\[
\mathbb{1}_{n_3} = (I_{n_3} - A_{33})^{-1} A_{31} \mathbb{1}_{n_1} + (I_{n_3} - A_{33})^{-1} A_{32} \mathbb{1}_{n_2}.
\]
This concludes our proof of Theorem 5.3 for the simplified case of $C(G)$ having three nodes and two sinks.


Note that convergence does not occur to consensus (not all components of the state are equal) and that
the final value of all nodes is independent of the initial values at nodes which are not in the sinks of the
condensation digraph.
We conclude this section with a figure providing a summary of the asymptotic behavior of discrete-time
averaging systems and its relationships with properties of matrices and graphs; see Figure 5.4.

Figure 5.4: Corresponding properties for the discrete-time averaging dynamical system $x(k+1) = A x(k)$, the
row-stochastic matrix $A$ and the associated weighted digraph. (Summary of the figure: doubly-stochastic
primitive matrices, i.e., strongly connected, aperiodic and weight-balanced digraphs, yield convergence to
consensus on the average; primitive matrices, i.e., strongly connected and aperiodic digraphs, yield convergence
to consensus depending on all nodes; one aperiodic sink component yields convergence to consensus that does
not depend on all the nodes; multiple aperiodic sink components yield convergence, but not to consensus;
irreducible but not primitive matrices, i.e., strongly connected and periodic digraphs, do not converge.)

5.4 Appendix: Design of graph weights

In this section we describe two algorithms to design weights for unweighted graphs.

5.4.1 The equal-neighbor model

Figure 5.5: The equal-neighbor model (a 4-node undirected graph and the same graph with its equal-neighbor
weights).

From Section 1.2, let us consider an undirected graph as in Figure 5.5 and the following simplest
distributed algorithm based on the concept of linear averaging. Each node contains a value $x_i$ and
repeatedly executes:
\[
x_i^+ := \operatorname{average}\big( x_i, \{x_j, \text{ for all neighbor nodes } j\} \big). \tag{5.7}
\]
Let us make a few simple observations. The algorithm (5.7) can be written in matrix form as:
\[
x(k+1) = \begin{bmatrix}
1/2 & 1/2 & 0 & 0 \\
1/4 & 1/4 & 1/4 & 1/4 \\
0 & 1/3 & 1/3 & 1/3 \\
0 & 1/3 & 1/3 & 1/3
\end{bmatrix} x(k) =: A_{\text{wsn}} x(k).
\]

The binary symmetric adjacency matrix and the degree matrix of the undirected graph are
\[
A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix},
\qquad
D = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 2 \end{bmatrix},
\]
and so one can verify that
\[
A_{\text{wsn}} = (D + I_4)^{-1} (A + I_4),
\qquad
x_i(k+1) = \frac{1}{1 + d(i)} \Big( x_i(k) + \sum_{j \in N(i)} x_j(k) \Big).
\]
Recall that $A + I_4$ is the adjacency matrix of a graph that is equal to the graph in the figure with the addition
of a self-loop at each node; this new graph has degree matrix $D + I_4$.
Now, it is also quite easy to verify (see also Exercise E5.3) that
\[
A_{\text{wsn}} \mathbb{1}_4 = \mathbb{1}_4, \quad\text{but unfortunately}\quad \mathbb{1}_4^\top A_{\text{wsn}} \neq \mathbb{1}_4^\top.
\]
We summarize this discussion and state a more general result, in arbitrary dimensions and for arbitrary
graphs.

Lemma 5.4 (The equal-neighbor row-stochastic matrix). Let $G$ be a weighted digraph with $n$ nodes,
weighted adjacency matrix $A$ and weighted out-degree matrix $D_{\text{out}}$. Define
\[
A_{\text{equal-neighbor}} = (I_n + D_{\text{out}})^{-1} (I_n + A).
\]
Note that the weighted digraph associated to $A + I_n$ is $G$ with the addition of a self-loop at each node with
unit weight. Then

(i) $A_{\text{equal-neighbor}}$ is row-stochastic;
(ii) $A_{\text{equal-neighbor}}$ is primitive if and only if $G$ is strongly connected; and
(iii) $A_{\text{equal-neighbor}}$ is doubly-stochastic if $G$ is weight-balanced and the weighted degree is constant for all
nodes (i.e., $D_{\text{out}} = D_{\text{in}} = d I_n$ for some $d \in \mathbb{R}_{>0}$).


Figure 5.6: The Metropolis–Hastings model (the same 4-node graph as in Figure 5.5 with its Metropolis–Hastings
weights).

Proof. First, for any $v \in \mathbb{R}^n$ with non-zero entries, it is easy to see $\operatorname{diag}(v)^{-1} v = \mathbb{1}_n$. Recalling the
definition $D_{\text{out}} + I_n = \operatorname{diag}\big( (A + I_n) \mathbb{1}_n \big)$,
\[
\big( (D_{\text{out}} + I_n)^{-1} (A + I_n) \big) \mathbb{1}_n = \operatorname{diag}\big( (A + I_n) \mathbb{1}_n \big)^{-1} (A + I_n) \mathbb{1}_n = \mathbb{1}_n,
\]
which proves statement (i). To prove statement (ii), note that, beside self-loops, $G$ and the weighted digraph
associated with $A_{\text{equal-neighbor}}$ have the same edges. Also note that the weighted digraph associated with
$A_{\text{equal-neighbor}}$ is aperiodic by design. Finally, if $D_{\text{out}} = D_{\text{in}} = d I_n$ for some $d \in \mathbb{R}_{>0}$, then
\[
\big( (D_{\text{out}} + I_n)^{-1} (A + I_n) \big)^\top \mathbb{1}_n
= \frac{1}{d+1} (A + I_n)^\top \mathbb{1}_n
= (D_{\text{in}} + I_n)^{-1} (A + I_n)^\top \mathbb{1}_n
= \operatorname{diag}\big( (A + I_n)^\top \mathbb{1}_n \big)^{-1} (A + I_n)^\top \mathbb{1}_n = \mathbb{1}_n.
\]
This concludes the proof of statement (iii).
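A construction sketch (not part of the original text) of the equal-neighbor matrix of Lemma 5.4 for the binary
4-node example of Section 5.4.1; the names are illustrative:
\begin{verbatim}
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
Dout = np.diag(A.sum(axis=1))
I = np.eye(4)
A_en = np.linalg.inv(I + Dout) @ (I + A)   # (I + D_out)^{-1} (I + A)
print(A_en)              # recovers the matrix A_wsn displayed above
print(A_en.sum(axis=1))  # row-stochastic: all row-sums equal one
\end{verbatim}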

5.4.2 The Metropolis–Hastings model

Next, we suggest a second way of assigning weights to a graph for the purpose of designing an averaging
algorithm. Given an undirected unweighted graph $G$ with $n$ nodes, edge set $E$ and degrees $d(1), \dots, d(n)$,
define the weighted adjacency matrix $A_{\text{Metropolis--Hastings}}$ by
\[
\big( A_{\text{Metropolis--Hastings}} \big)_{ij} =
\begin{cases}
\dfrac{1}{1 + \max\{d(i), d(j)\}}, & \text{if } \{i, j\} \in E \text{ and } i \neq j, \\[1ex]
1 - \displaystyle\sum_{\{i,h\} \in E} \big( A_{\text{Metropolis--Hastings}} \big)_{ih}, & \text{if } i = j, \\[1ex]
0, & \text{otherwise.}
\end{cases}
\]
In our example,
\[
A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix},
\quad
D = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 2 \end{bmatrix}
\implies
A_{\text{Metropolis--Hastings}} = \begin{bmatrix}
3/4 & 1/4 & 0 & 0 \\
1/4 & 1/4 & 1/4 & 1/4 \\
0 & 1/4 & 5/12 & 1/3 \\
0 & 1/4 & 1/3 & 5/12
\end{bmatrix}.
\]

Lectures on Network Systems, F. Bullo, Version v0.85(i) (6 Aug 2016). Draft not for circulation. Copyright 2012-16.
5.5. Appendix: Centrality measures 65

One can verify that the Metropolis–Hastings weights have the following properties:

(i) $(A_{\text{Metropolis--Hastings}})_{ij} > 0$ if $\{i, j\} \in E$, $(A_{\text{Metropolis--Hastings}})_{ii} > 0$ for all $i \in \{1, \dots, n\}$, and
$(A_{\text{Metropolis--Hastings}})_{ij} = 0$ otherwise;
(ii) $A_{\text{Metropolis--Hastings}}$ is symmetric and doubly-stochastic; and
(iii) $A_{\text{Metropolis--Hastings}}$ is primitive if and only if $G$ is connected.
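A construction sketch (not part of the original text) of the Metropolis–Hastings weights for an undirected
binary graph; applied to the 4-node example above, it reproduces the matrix just displayed:
\begin{verbatim}
import numpy as np

def metropolis_hastings(A01: np.ndarray) -> np.ndarray:
    d = A01.sum(axis=1)                      # node degrees
    n = A01.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and A01[i, j] > 0:
                W[i, j] = 1.0 / (1.0 + max(d[i], d[j]))
        W[i, i] = 1.0 - W[i].sum()           # make the row sum to one
    return W

A01 = np.array([[0, 1, 0, 0],
                [1, 0, 1, 1],
                [0, 1, 0, 1],
                [0, 1, 1, 0]], dtype=float)
W = metropolis_hastings(A01)
print(W)                                     # matches the example above
print(np.allclose(W, W.T), W.sum(axis=0))    # symmetric, doubly-stochastic
\end{verbatim}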

5.5 Appendix: Centrality measures

In network science it is of interest to determine the relative importance of a node in a network. There
are many ways to do so and they are referred to as centrality measures or centrality scores. Part of the
treatment in this section is inspired by (Newman 2010). We refer to (Brandes and Erlebach 2005) for a
comprehensive review of network analysis metrics and related computational algorithms, to (Gleich 2015)
for a comprehensive review of pagerank and its multiple extensions and applications, and to (Bonacich
1972a; Friedkin 1991; Friedkin and Johnsen 2014) for early references and historical reviews in sociology.
We start by presenting six centrality notions based on the adjacency matrix. We treat the general case
of a weighted digraph $G$ with weighted adjacency matrix $A$ (warning: many articles in the literature deal
with undirected graphs only). The matrix $A$ is nonnegative, but not necessarily row-stochastic. From the
Perron–Frobenius theory, recall the following facts:

(i) if $G$ is strongly connected, then the spectral radius $\rho(A)$ is an eigenvalue of maximum magnitude
and its corresponding left eigenvector can be selected to be strictly positive and with unit sum (see
Theorem 2.12); and
(ii) if $G$ contains a globally reachable node, then the spectral radius $\rho(A)$ is an eigenvalue of maximum
magnitude and its corresponding left eigenvector is nonnegative and has positive entries corresponding
to each globally reachable node (see Theorem 5.2).

Degree centrality For an arbitrary weighted digraph $G$, the degree centrality $c_{\text{degree}}(i)$ of node $i$ is its
in-degree:
\[
c_{\text{degree}}(i) = d_{\text{in}}(i) = \sum_{j=1}^n a_{ji}, \tag{5.8}
\]
that is, the number of in-neighbors (if $G$ is unweighted) or the sum of the weights of the incoming edges.
Degree centrality is relevant, for example, in (typically unweighted) citation networks whereby articles are
ranked on the basis of their citation records. (Warning: the notion that a high citation count is an indicator
of quality is clearly a fallacy.)

Eigenvector centrality One problem with degree centrality is that each in-edge has unit count, even
if the in-neighbor has negligible importance. To remedy this potential drawback, one could define the
importance of a node to be proportional to the weighted sum of the importance of its in-neighbors
(see (Bonacich 1972b) for an early reference). This line of reasoning leads to the following definition.


For a weighted digraph $G$ with globally reachable nodes (or for an undirected graph that is connected),
define the eigenvector centrality vector, denoted by $c_{\text{ev}}$, to be the left dominant eigenvector of the adjacency
matrix $A$ associated with the dominant eigenvalue and normalized to satisfy $\mathbb{1}_n^\top c_{\text{ev}} = 1$.
Note that the eigenvector centrality satisfies
\[
A^\top c_{\text{ev}} = \lambda \, c_{\text{ev}} \iff c_{\text{ev}}(i) = \frac{1}{\lambda} \sum_{j=1}^n a_{ji} c_{\text{ev}}(j), \tag{5.9}
\]
where $\lambda = \rho(A)$ is the only possible choice of scalar coefficient in equation (5.9) ensuring that there exists
a unique solution and that the solution, denoted $c_{\text{ev}}$, is strictly positive in a strongly connected digraph
and nonnegative in a digraph with globally reachable nodes. Note that this connectivity property may be
restrictive in some cases. We refer to Exercise E5.13 for a generalization of eigenvector centrality.
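As a computational note (not part of the original text), eigenvector centrality can be approximated by power
iteration on $A^\top$ with a unit-sum normalization at each step; the matrix below is an illustrative strongly
connected example:
\begin{verbatim}
import numpy as np

A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])    # strongly connected, aperiodic digraph
c = np.ones(3) / 3
for _ in range(500):
    c = A.T @ c
    c = c / c.sum()               # normalize so that sum(c) = 1
print(c)                          # left dominant eigenvector of A
\end{verbatim}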

Figure 5.7: Comparing degree centrality versus eigenvector centrality: the node with maximum in-degree has zero
eigenvector centrality in this graph

Katz centrality For a weighted digraph $G$, pick an attenuation factor $\alpha < 1/\rho(A)$ and define the Katz
centrality vector (see (Katz 1953)), denoted by $c_{\text{K}}$, by the following equivalent formulations:
\[
c_{\text{K}}(i) = \alpha \sum_{j=1}^n a_{ji} \big( c_{\text{K}}(j) + 1 \big), \tag{5.10}
\]
or
\[
c_{\text{K}}(i) = \sum_{k=1}^{\infty} \sum_{j=1}^n \alpha^k (A^k)_{ji}. \tag{5.11}
\]
Katz centrality has therefore two interpretations:

(i) the importance of a node is an attenuated sum of the importance and of the number of the in-neighbors
(note indeed how equation (5.10) is a combination of equations (5.8) and (5.9)); and

(ii) the importance of a node $i$ is $\alpha$ times the number of length-1 paths into $i$ (i.e., the in-degree) plus $\alpha^2$ times
the number of length-2 paths into $i$, etc. (From Lemma 4.1, recall that, for an unweighted digraph,
$(A^k)_{ji}$ is equal to the number of directed paths of length $k$ from $j$ to $i$.)


Note how, for $\alpha < 1/\rho(A)$, equation (5.10) is well-posed and equivalent to
\begin{align*}
c_{\text{K}} = \alpha A^\top (c_{\text{K}} + \mathbb{1}_n)
&\iff c_{\text{K}} + \mathbb{1}_n = \alpha A^\top (c_{\text{K}} + \mathbb{1}_n) + \mathbb{1}_n \\
&\iff (I_n - \alpha A^\top)(c_{\text{K}} + \mathbb{1}_n) = \mathbb{1}_n \\
&\iff c_{\text{K}} = (I_n - \alpha A^\top)^{-1} \mathbb{1}_n - \mathbb{1}_n \tag{5.12} \\
&\iff c_{\text{K}} = \sum_{k=1}^{\infty} \alpha^k (A^\top)^k \mathbb{1}_n,
\end{align*}
where we used the identity $(I_n - A)^{-1} = \sum_{k=0}^{\infty} A^k$, valid for any matrix $A$ with $\rho(A) < 1$; see
Exercise E2.13.
There are two simple ways to compute the Katz centrality. According to equation (5.12), for limited-size
problems, one can invert the matrix $I_n - \alpha A^\top$. Alternatively, one can show that the following iteration
converges to the correct value: $c_{\text{K}}^+ := \alpha A^\top (c_{\text{K}} + \mathbb{1}_n)$.
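Both computations can be sketched in a few lines (not part of the original text; the matrix and the attenuation
factor are illustrative):
\begin{verbatim}
import numpy as np

A = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
n = A.shape[0]
rho = max(abs(np.linalg.eigvals(A)))
alpha = 0.5 / rho                       # attenuation factor < 1/rho(A)

# direct computation via equation (5.12)
cK_direct = np.linalg.solve(np.eye(n) - alpha * A.T, np.ones(n)) - np.ones(n)

# fixed-point iteration cK^+ := alpha A^T (cK + 1_n)
cK = np.zeros(n)
for _ in range(200):
    cK = alpha * A.T @ (cK + np.ones(n))
print(cK_direct, cK, sep="\n")          # the two results agree
\end{verbatim}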

Figure 5.8: Image taken temporarily without permission from (Ishii and Tempo 2014). The pattern in the figure
displays the so-called hyperlink matrix, i.e., the transpose of the adjacency matrix, for a collection of websites at
the Lincoln University in New Zealand from the year 2006. Blue points are nonzero entries of the adjacency
matrix; red points are outgoing links toward dangling nodes. Each empty column corresponds to a webpage
without any outgoing link, that is, to a so-called dangling node. This Web has 3756 nodes with 31,718 links. A
fairly large portion of the nodes are dangling nodes: in this example, there are 3255 dangling nodes, which is
over 85% of the total.

Pagerank centrality For a weighted digraph $G$ with row-stochastic adjacency matrix (i.e., unit out-degree
for each node), pick a convex combination coefficient $\alpha \in \, ]0, 1[$ and define the pagerank centrality
vector, denoted by $c_{\text{pr}}$, as the unique positive solution to
\[
c_{\text{pr}}(i) = \alpha \sum_{j=1}^n a_{ji} c_{\text{pr}}(j) + \frac{1 - \alpha}{n}, \tag{5.13}
\]
or, equivalently, to
\[
c_{\text{pr}} = M c_{\text{pr}}, \quad \mathbb{1}_n^\top c_{\text{pr}} = 1, \quad\text{where } M = \alpha A^\top + \frac{1 - \alpha}{n} \mathbb{1}_n \mathbb{1}_n^\top. \tag{5.14}
\]


(To establish the equivalence between these two definitions, the only non-trivial step is to notice that, if $c_{\text{pr}}$
solves equation (5.13), then it must satisfy $\mathbb{1}_n^\top c_{\text{pr}} = 1$.)
Note that, for arbitrary unweighted digraphs with binary adjacency matrices $A_{0,1}$, it is natural to compute
the pagerank vector with $A = D_{\text{out}}^{-1} A_{0,1}$. We refer to (Brin and Page 1998; Page 2001; Ishii and Tempo
2014) for the important interpretation of the pagerank score as the stationary distribution of the so-called
random surfer of a hyperlinked document network; it is under this disguise that the pagerank score was
conceived by the Google co-founders, and a corresponding algorithm led to the establishment of the Google
search engine. In the Google problem it is customary to set $\alpha \approx 0.85$.
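A sketch (not part of the original text) of the fixed-point iteration resulting from (5.13), namely
$c_{\text{pr}}^+ := \alpha A^\top c_{\text{pr}} + \frac{1-\alpha}{n} \mathbb{1}_n$, for an illustrative row-stochastic matrix:
\begin{verbatim}
import numpy as np

A = np.array([[0.0, 0.5, 0.5],
              [1.0, 0.0, 0.0],
              [0.5, 0.5, 0.0]])     # row-stochastic (unit out-degrees)
n, alpha = A.shape[0], 0.85
c = np.ones(n) / n
for _ in range(200):
    c = alpha * A.T @ c + (1 - alpha) / n * np.ones(n)
print(c, c.sum())                   # pagerank scores, summing to one
\end{verbatim}
The iteration converges for any $\alpha \in \, ]0, 1[$ because the linear part has spectral radius $\alpha < 1$.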

Closeness and betweenness centrality (based on shortest paths) Degree, eigenvector, Katz and
pagerank centrality are presented using the adjacency matrix. Next we present two centrality measures
based on the notions of shortest path and geodesic distance; these two notions belong to the class of radial
and medial centrality measures (Borgatti and Everett 2006).
We start by introducing some additional graph theory. For a weighted digraph with $n$ nodes, the length
of a directed path is the sum of the weights of the edges in the directed path. For $i, j \in \{1, \dots, n\}$, a shortest
path from a node $i$ to a node $j$ is a directed path of smallest length. Note: it is easy to construct examples
with multiple shortest paths, so that the shortest path is not unique. The geodesic distance $d_{ij}$ from node $i$
to node $j$ is the length of a shortest path from node $i$ to node $j$; we also stipulate that the geodesic distance
$d_{ij}$ takes the value zero if $i = j$ and is infinite if there is no path from $i$ to $j$. Note: in general $d_{ij} \neq d_{ji}$.
Finally, for $i, j, k \in \{1, \dots, n\}$, we let $g_{ikj}$ denote the number of shortest paths from a node $i$ to a node
$j$ that pass through node $k$.
For a strongly-connected weighted digraph, the closeness of node $i \in \{1, \dots, n\}$ is the inverse of the sum
of the geodesic distances $d_{ij}$ from node $i$ to all other nodes $j \in \{1, \dots, n\}$, that is:
\[
c_{\text{closeness}}(i) = \frac{1}{\sum_{j=1}^n d_{ij}}. \tag{5.15}
\]
For a strongly-connected weighted digraph, the betweenness of node $i \in \{1, \dots, n\}$ is the fraction of
all shortest paths $g_{kij}$ from any node $k$ to any other node $j$ passing through node $i$, that is:
\[
c_{\text{betweenness}}(i) = \frac{\sum_{j,k=1}^n g_{kij}}{\sum_{h=1}^n \sum_{j,k=1}^n g_{khj}}. \tag{5.16}
\]
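A sketch (not part of the original text) of the closeness computation via all-pairs geodesic distances, here with
the Floyd–Warshall algorithm on an illustrative strongly connected weighted digraph; betweenness additionally
requires counting shortest paths (e.g., with Brandes' algorithm) and is omitted:
\begin{verbatim}
import numpy as np

INF = np.inf
D = np.array([[0.0, 1.0, INF],
              [INF, 0.0, 2.0],
              [1.0, INF, 0.0]])    # edge lengths; INF where there is no edge
n = D.shape[0]
for k in range(n):                 # Floyd-Warshall all-pairs shortest paths
    for i in range(n):
        for j in range(n):
            D[i, j] = min(D[i, j], D[i, k] + D[k, j])
print(D)                  # geodesic distances d_ij
print(1.0 / D.sum(axis=1))  # closeness centrality of each node, eq. (5.15)
\end{verbatim}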

Summary To conclude this section, in Table 5.1, we summarize the various centrality definitions for a
weighted directed graph.


Measure                   Definition                                                                  Assumptions

degree centrality         $c_{\text{degree}} = A^\top \mathbb{1}_n$                                   (none)
eigenvector centrality    $c_{\text{ev}} = \frac{1}{\rho(A)} A^\top c_{\text{ev}}$                    $G$ has a globally reachable node
pagerank centrality       $c_{\text{pr}} = \alpha A^\top c_{\text{pr}} + \frac{1-\alpha}{n} \mathbb{1}_n$   $\alpha < 1$, $A \mathbb{1}_n = \mathbb{1}_n$
Katz centrality           $c_{\text{K}} = \alpha A^\top (c_{\text{K}} + \mathbb{1}_n)$                $\alpha < 1/\rho(A)$
closeness centrality      $c_{\text{closeness}}(i) = 1 \big/ \sum_{j=1}^n d_{ij}$                     $G$ strongly connected
betweenness centrality    $c_{\text{betweenness}}(i) = \sum_{j,k=1}^n g_{kij} \big/ \sum_{h=1}^n \sum_{j,k=1}^n g_{khj}$   $G$ strongly connected

Table 5.1: Definitions of centrality measures for a weighted digraph $G$ with adjacency matrix $A$

Figure 5.9 illustrates some centrality notions on a small instructive example due to Brandes (2006).
Note that a different node is the most central one in each metric; this variability is naturally expected and
highlights the need to select a centrality notion relevant to the specific application of interest.

Figure 5.9: Degree (a), eigenvector (b), closeness (c), and betweenness (d) centrality for an undirected unweighted
graph. The dark node is the most central node in the respective metric; a different node is the most central one
in each metric.


5.6 Exercises
E5.1 A sample DeGroot panel. A conversation between 5 panelists is modeled according to the DeGroot model
by an averaging algorithm $x^+ = A_{\text{panel}} x$, where
\[
A_{\text{panel}} = \begin{bmatrix}
0.15 & 0.15 & 0.1 & 0.2 & 0.4 \\
0 & 0.55 & 0 & 0 & 0.45 \\
0.3 & 0.05 & 0.05 & 0 & 0.6 \\
0 & 0.4 & 0.1 & 0.5 & 0 \\
0 & 0.3 & 0 & 0 & 0.7
\end{bmatrix}.
\]

Assuming that the panel has sufficiently long deliberations, answer the following:
(i) Based on the associated digraph, do the panelists finally agree on a common decision?
(ii) In the event of agreement, does the initial opinion of any panelist get rejected? If so, which ones?
(iii) If the panelists' initial opinions are their self-appraisals (i.e., the self-weights $a_{11}, \dots, a_{55}$), what is the
final opinion?
E5.2 Three DeGroot panels. Recall the DeGroot model introduced in Chapter 1. Denote by $x_i(0)$ the initial
opinion of each individual, and by $x_i(k)$ its updated opinion after $k$ communications with its neighbors. Then
the vector of opinions evolves over time according to $x(k+1) = A x(k)$, where the coefficient $a_{ij} \in [0, 1]$ is
the influence of the opinion of individual $j$ on the update of the opinion of agent $i$, subject to the constraint
$\sum_j a_{ij} = 1$. Consider the following three scenarios:

(i) Everybody gives the same weight to the opinion of everybody else.
(ii) There is a distinct agent (suppose the agent with index $i = 1$) that weights equally the opinion of all the
others, and the remaining agents compute the mean between their opinion and the one of the first agent.
(iii) All the agents compute the mean between their opinion and the one of the first agent. Agent 1 does not
change her opinion.
In each case, derive the averaging matrix $A$, show that the opinions converge asymptotically to a final opinion
vector, and characterize this final opinion vector.
E5.3 Left dominant eigenvector for equal-neighbor row-stochastic matrices. Let $A_{01}$ be the binary (i.e.,
each entry is either 0 or 1) adjacency matrix for an unweighted undirected graph. Assume the graph is connected.
Let $D = \operatorname{diag}(d_1, \dots, d_n)$ be the degree matrix, let $|E|$ be the number of edges of the graph, and define
$A = D^{-1} A_{01}$. Show that
(i) the definition of $A$ is well-posed and $A$ is row-stochastic, and
(ii) the left dominant eigenvector of $A$ associated to the eigenvalue 1 and normalized so that $\mathbb{1}_n^\top w = 1$ is
\[
w = \frac{1}{2|E|} \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}.
\]
Next, consider the equal-neighbor averaging algorithm in equation (5.7) with associated row-stochastic matrix
$A_{\text{equal-neighbor}} = (D + I_n)^{-1} (A_{01} + I_n)$.
(iii) Show that
\[
\lim_{k \to \infty} x(k) = \frac{1}{2|E| + n} \Big( \sum_{i=1}^n (1 + d_i) x_i(0) \Big) \mathbb{1}_n.
\]


(iv) Verify that the left dominant eigenvector of the matrix $A_{\text{wsn}} = A_{\text{equal-neighbor}}$ defined in Section 1.2 is
$[1/6, 1/3, 1/4, 1/4]^\top$, as seen in Example 2.5.
E5.4 A stubborn agent. Pick $\alpha \in \, ]0, 1[$ and consider the discrete-time consensus algorithm
\begin{align*}
x_1(k+1) &= x_1(k), \\
x_2(k+1) &= \alpha x_1(k) + (1 - \alpha) x_2(k).
\end{align*}
Perform the following tasks:
(i) compute the matrix $A$ representing this algorithm and verify it is row-stochastic,
(ii) compute the eigenvalues and eigenvectors of $A$,
(iii) draw the directed graph $G$ representing this algorithm and discuss its connectivity properties,
(iv) compute the condensation digraph of $G$,
(v) compute the final value of this algorithm as a function of the initial values in two alternate ways:
invoking and without invoking Theorem 5.2.
E5.5 Agents with self-confidence levels. Consider 2 agents, labeled $+1$ and $-1$, described by the self-confidence
levels $s_{+1}$ and $s_{-1}$. Assume $s_{+1} \geq 0$, $s_{-1} \geq 0$, and $s_{+1} + s_{-1} = 1$. For $i \in \{+1, -1\}$, define
\[
x_i^+ := s_i x_i + (1 - s_i) x_{-i}.
\]
Perform the following tasks:

(i) compute the matrix $A$ representing this algorithm and verify it is row-stochastic,
(ii) compute $A^2$,
(iii) compute the eigenvalues, the right eigenvectors, and the left eigenvectors of $A$,
(iv) compute the final value of this algorithm as a function of the initial values and of the self-confidence
levels. Is it true that an agent with higher self-confidence makes a larger contribution to the final value?
E5.6 Persistent disagreement and the Friedkin–Johnsen model of opinion dynamics (Friedkin and Johnsen
1999). Let $W$ be a row-stochastic matrix whose associated digraph describes an interpersonal influence network.
Let each individual possess an openness level $\lambda_i \in [0, 1]$, $i \in \{1, \dots, n\}$, describing how open the individual is to
changing her initial opinion about a subject; set $\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$. Consider the Friedkin–Johnsen model
of opinion dynamics
\[
x(k+1) = \Lambda W x(k) + (I_n - \Lambda) x(0).
\]
Assume that (1) at least one individual is not completely open to changing her opinion, that is, assume $\lambda_i < 1$
for at least one individual $i$; and (2) the interpersonal influence network contains directed paths from each
individual with openness level equal to 1 to an individual with openness level less than 1. (Note that, if
assumption (1) is not satisfied, we recover the DeGroot opinion dynamics model introduced in Section 1.1.)
Perform the following tasks:
(i) show that the matrix $\Lambda W$ is convergent,
(ii) show that the matrix $V = (I_n - \Lambda W)^{-1} (I_n - \Lambda)$ is well-defined and row-stochastic,
Hint: Review Exercises E2.11 and E2.13.
(iii) show that the limiting opinions are $\lim_{k \to +\infty} x(k) = V x(0)$,
(iv) compute the matrix $V$ and state whether the two agents will achieve consensus or maintain persistent
disagreement for the following pairs of matrices:
\begin{align*}
W_1 &= \begin{bmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{bmatrix}, \quad\text{and}\quad \Lambda_1 = \operatorname{diag}(1/2, 1), \\
W_2 &= \begin{bmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{bmatrix}, \quad\text{and}\quad \Lambda_2 = \operatorname{diag}(1/4, 3/4).
\end{align*}

(Note: Friedkin and Johnsen (1999, 2011) make the additional assumption that i = 1 wii , for
i {1, . . . , n}; this assumption is not needed here. This model is also referred to the opinion dynamics
model with stubborn agents. See (Ravazzi et al. 2015) for an extention of this model.)
E5.7 Necessary and sufficient conditions for consensus. Let $A$ be a row-stochastic matrix. Prove that the
following statements are equivalent:
(i) the eigenvalue 1 is simple and all other eigenvalues have magnitude strictly smaller than 1,
(ii) $\lim_{k \to \infty} A^k = \mathbb{1}_n w^\top$, for some $w \in \mathbb{R}^n$, $w \geq 0$, and $\mathbb{1}_n^\top w = 1$,
(iii) the digraph associated to $A$ contains a globally reachable node and the subgraph of globally reachable
nodes is aperiodic.
Hint: Use the Jordan normal form to show that (i) $\implies$ (ii).
E5.8 Computing centrality. Write, in your favorite programming language, algorithms to compute degree, eigenvector, Katz and pagerank centralities. Compute these four centralities for the following undirected unweighted
graphs (without self-loops):
(i) the ring graph with 5 nodes;
(ii) the star graph with 5 nodes;
(iii) the line graph with 5 nodes; and
(iv) the Zachary karate club network dataset. This dataset can be downloaded for example from: http:
//konect.uni-koblenz.de/networks/ucidata-zachary
To compute the Katz centrality of a matrix $A$, select $\alpha = 1/(2\rho(A))$. For pagerank, use $\alpha = 1/2$.
Hint: Recall that pagerank centrality is well-defined for a row-stochastic matrix.
E5.9 Central nodes in example graph. For the unweighted undirected graph in Figure 5.9, verify (possibly with
the aid of a computational package) that the dark nodes have indeed the largest degree, eigenvector, closeness
and betweenness centrality as stated in the figure caption.
E5.10 Iterative computation of Katz centrality. Given a graph with adjacency matrix $A$, show that the solution
to the iteration $x(k+1) := \alpha A^\top (x(k) + \mathbb{1}_n)$ with $\alpha < 1/\rho(A)$ converges to the Katz centrality vector $c_{\text{K}}$,
for all initial conditions $x(0)$.
E5.11 Move away from your nearest neighbor and reducible averaging. Consider $n \geq 3$ robots with positions
$p_i \in \mathbb{R}$, $i \in \{1, \dots, n\}$, and dynamics $p_i(t+1) = u_i(t)$, where $u_i \in \mathbb{R}$ is a steering control input. For simplicity,
assume that the robots are indexed according to their initial position: $p_1(0) \leq p_2(0) \leq p_3(0) \leq \dots \leq p_n(0)$.
Consider two walls at the positions $p_0 \leq p_1(0)$ and $p_{n+1} \geq p_n(0)$ so that all robots are contained between
the walls. The walls are stationary, that is, $p_0(t+1) = p_0(t) = p_0$ and $p_{n+1}(t+1) = p_{n+1}(t) = p_{n+1}$.
Consider the following coordination law: robots $i \in \{2, \dots, n-1\}$ (each having two neighbors) move to
the centroid of the local subset $\{p_{i-1}, p_i, p_{i+1}\}$. The robots $\{1, n\}$ (each having one robotic neighbor and one
neighboring wall) move to the centroid of the local subsets $\{p_0, p_1, p_2\}$ and $\{p_{n-1}, p_n, p_{n+1}\}$, respectively.
Hence, the closed-loop robot dynamics are
\[
p_i(t+1) = \frac{1}{3} \big( p_{i-1}(t) + p_i(t) + p_{i+1}(t) \big), \quad i \in \{1, \dots, n\}.
\]
Show that the robots become uniformly spaced on the interval $[p_0, p_{n+1}]$ using Theorem 5.3.
(Note: This exercise is a discrete-time version of E2.18(ii) based on averaging with multiple sinks.)
E5.12 The role of the out-degree in averaging systems. Let $G$ be an undirected, connected graph without
self-loops. Let each node represent an agent in a network, with the following system dynamics:
\[
x(k+1) = A x(k), \quad\text{where } A = D_{\text{out}}^{-1} A_{01}, \quad x_i(0) \in [0, 1],
\]


where $D_{\text{out}}$ is the out-degree matrix and $A_{01}$ is the binary adjacency matrix:
\[
(A_{01})_{ij} = \begin{cases} 1, & \text{if } \{i, j\} \in E, \\ 0, & \text{otherwise.} \end{cases}
\]
(i) Under which conditions on the network will the system converge to a final value in $\operatorname{span}\{\mathbb{1}_n\}$? What is this
steady-state value?
(ii) Let $e(k) = x(k) - \lim_{k \to \infty} x(k)$ be the disagreement error at time instant $k$. Show that the error
dynamics evolve as $e(k+1) = B e(k)$ and determine the matrix $B$.
(iii) Find a function $f(k, \lambda_i, d_{\text{out}}(i))$ depending on the time step $k$, the eigenvalues $\lambda_i$ of $A$, and the
out-degrees of the nodes $d_{\text{out}}(i)$ such that
\[
\| e(k) \|_2 \leq f(k, \lambda_i, d_{\text{out}}(i)) \, \| e(0) \|_2,
\]
that is, $f(k, \lambda_i, d_{\text{out}}(i))$ is the per-step convergence factor.


E5.13 Hubs and authorities (Kleinberg 1999). Let G = (V, E) be a digraph with vertex set V = {1, . . . , n} and
edge set E. Assume G has a globally reachable node and the subgraph of globally reachable nodes is aperiodic.
We define two scores for each vertex j {1, . . . , n}: the hub score hj R and the authority score aj R.
Let these scores be initialized by some positive values and be updated simultaneously for all vertices according
to the following mutually reinforcing relation: the hub score of vertex j is set equal to the sum of the authority
scores of all vertices pointed to by j, and, similarly, the authority score of vertex j is set equal to the sum of
the hub scores of all vertices pointing to j. In concise formulas:
( P
hj ai ,
P i: (j,i)E (E5.1)
aj i: (i,j)E i .
h
 >
(i) Let x(k) = h(k)> a(k)> denote the stacked vector of hub and authority scores. Provide an update
equation for the hub and authority scores of the form

x(k + 1) = M x(k),

for some matrix M R2n2n .


(ii) Will the sequence $x(k)$ converge as $k \to \infty$?
In what follows, we consider the modified iteration
\[
y(k+1) = \frac{M y(k)}{\| M y(k) \|_2},
\]
where $M$ is defined as in statement (i) above.

(iii) Will the sequence $y(k)$ converge as $k \to \infty$?
(iv) Show that the two subsequences of even and odd iterates, $k \mapsto y(2k)$ and $k \mapsto y(2k+1)$, converge,
that is,
\[
\lim_{k \to \infty} y(2k) = y_{\text{even}}(y_0), \qquad \lim_{k \to \infty} y(2k+1) = y_{\text{odd}}(y_0),
\]
where $y_0 = x(0)$ is the stacked vector of initial hub and authority scores.
(v) Provide expressions for $y_{\text{even}}(y_0)$ and $y_{\text{odd}}(y_0)$.

Chapter 6
The Laplacian Matrix

So far, we have studied adjacency matrices. In this chapter, we study a second relevant matrix associated
to a digraph, called the Laplacian matrix. More information on adjacency and Laplacian matrices can be
found in standard books on algebraic graph theory such as (Biggs 1994) and (Godsil and Royle 2001). Two
surveys about Laplacian matrices are (Mohar 1991; Merris 1994).

6.1 The Laplacian matrix

Definition 6.1 (Laplacian matrix of a digraph). Given a weighted digraph $G$ with adjacency matrix $A$
and out-degree matrix $D_{\text{out}}$, the Laplacian matrix of $G$ is
\[
L = D_{\text{out}} - A.
\]
In components, $L = (\ell_{ij})_{i,j \in \{1, \dots, n\}}$ is given by
\[
\ell_{ij} = \begin{cases}
-a_{ij}, & \text{if } i \neq j, \\
\displaystyle\sum_{h=1, h \neq i}^n a_{ih}, & \text{if } i = j,
\end{cases}
\]
or, for an unweighted undirected graph,
\[
\ell_{ij} = \begin{cases}
-1, & \text{if } \{i, j\} \text{ is an edge and not a self-loop}, \\
d(i), & \text{if } i = j, \\
0, & \text{otherwise.}
\end{cases}
\]


[Figure: a weighted digraph with 5 nodes and edges of weight 3.7 from 1 to 2, 3.7 from 1 to 3, 8.9 from 2 to 1,
1.2 from 2 to 4, 3.7 from 3 to 4, 2.3 from 3 to 5, 4.4 from 5 to 1, and 2.3 from 5 to 4.] The Laplacian matrix of
this weighted directed graph is
\[
L = \begin{bmatrix}
7.4 & -3.7 & -3.7 & 0 & 0 \\
-8.9 & 10.1 & 0 & -1.2 & 0 \\
0 & 0 & 6.0 & -3.7 & -2.3 \\
0 & 0 & 0 & 0 & 0 \\
-4.4 & 0 & 0 & -2.3 & 6.7
\end{bmatrix}.
\]

Note:
(i) the sign pattern of $L$ is important: diagonal elements are nonnegative (zero or positive) and
off-diagonal elements are nonpositive (zero or negative);
(ii) the Laplacian matrix $L$ of a digraph $G$ does not depend upon the existence and values of self-loops in $G$;
(iii) the graph $G$ is undirected (i.e., has a symmetric adjacency matrix) if and only if $L$ is symmetric. In this
case, $D_{\text{out}} = D_{\text{in}} = D$ and $A = A^\top$;
(iv) in a directed graph, $\ell_{ii} = 0$ (instead of $\ell_{ii} > 0$) if and only if node $i$ has zero out-degree;
(v) $L$ is said to be irreducible if $G$ is strongly connected.
We next define the same concept, but without starting from a digraph.

Definition 6.2 (Laplacian matrix). A matrix $L \in \mathbb{R}^{n \times n}$, $n \geq 2$, is Laplacian if

(i) its row-sums are zero,
(ii) its diagonal entries are nonnegative, and
(iii) its non-diagonal entries are nonpositive.

A Laplacian matrix $L$ induces a weighted digraph $G$ without self-loops in the natural way, that is, by
letting $(i, j)$ be an edge of $G$ with weight $a_{ij} = -\ell_{ij}$ if and only if $\ell_{ij} < 0$. With this definition, $L$ is
the Laplacian matrix of $G$.
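A construction sketch (not part of the original text) of Definition 6.1 applied to the 5-node example above:
\begin{verbatim}
import numpy as np

A = np.zeros((5, 5))               # weighted adjacency matrix of the example
A[0, 1], A[0, 2] = 3.7, 3.7
A[1, 0], A[1, 3] = 8.9, 1.2
A[2, 3], A[2, 4] = 3.7, 2.3
A[4, 0], A[4, 3] = 4.4, 2.3        # node 4 (index 3) has no outgoing edges
L = np.diag(A.sum(axis=1)) - A     # L = D_out - A
print(L)
print(L @ np.ones(5))              # zero row-sums: L 1_5 = 0_5
\end{verbatim}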
We conclude this section with some useful equalities. Recall that, obviously,
\[
(Ax)_i = \sum_{j=1}^n a_{ij} x_j. \tag{6.1}
\]

First, for $x \in \mathbb{R}^n$,
\begin{align}
(Lx)_i = \sum_{j=1}^n \ell_{ij} x_j
&= \ell_{ii} x_i + \sum_{j=1, j \neq i}^n \ell_{ij} x_j
= \Big( \sum_{j=1, j \neq i}^n a_{ij} \Big) x_i + \sum_{j=1, j \neq i}^n (-a_{ij}) x_j \notag \\
&= \sum_{j=1, j \neq i}^n a_{ij} (x_i - x_j) = \sum_{j \in N^{\text{out}}(i)} a_{ij} (x_i - x_j) \tag{6.2} \\
&\overset{\text{for unit weights}}{=} d_{\text{out}}(i) \Big( x_i - \operatorname{average}\big( \{x_j, \text{ for all out-neighbors } j\} \big) \Big). \notag
\end{align}


Second, assume $L = L^\top$ (i.e., $a_{ij} = a_{ji}$) and compute:
\begin{align}
x^\top L x = \sum_{i=1}^n x_i (Lx)_i
&= \sum_{i=1}^n x_i \Big( \sum_{j=1, j \neq i}^n a_{ij} (x_i - x_j) \Big) \notag \\
&= \sum_{i,j=1}^n a_{ij} x_i (x_i - x_j)
= \frac{1}{2} \sum_{i,j=1}^n a_{ij} x_i^2 + \frac{1}{2} \sum_{i,j=1}^n a_{ij} x_i^2 - \sum_{i,j=1}^n a_{ij} x_i x_j \notag \\
&\overset{\text{by symmetry}}{=} \frac{1}{2} \sum_{i,j=1}^n a_{ij} x_i^2 + \frac{1}{2} \sum_{i,j=1}^n a_{ij} x_j^2 - \sum_{i,j=1}^n a_{ij} x_i x_j \notag \\
&= \frac{1}{2} \sum_{i,j=1}^n a_{ij} (x_i - x_j)^2 \tag{6.3} \\
&= \sum_{\{i,j\} \in E} a_{ij} (x_i - x_j)^2. \tag{6.4}
\end{align}

These equalities are useful because it is common to encounter the array of differences $Lx$ and the
quadratic error or disagreement function $x^\top L x$. They provide the correct intuition for the definition of
the Laplacian matrix. In the following, we will refer to $x \mapsto x^\top L x$ as the Laplacian potential function; this
name is justified by the energy and power interpretations we present in the next two examples.
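A quick numerical check (not part of the original text; the symmetric weights are arbitrary) of identity (6.4):
\begin{verbatim}
import numpy as np

A = np.array([[0.0, 2.0, 1.0],
              [2.0, 0.0, 3.0],
              [1.0, 3.0, 0.0]])          # symmetric weighted adjacency
L = np.diag(A.sum(axis=1)) - A
x = np.array([1.0, -2.0, 0.5])

quad = x @ L @ x                         # the Laplacian potential x^T L x
edge_sum = sum(A[i, j] * (x[i] - x[j]) ** 2
               for i in range(3) for j in range(i + 1, 3))
print(quad, edge_sum)                    # identical values (both 37.0)
\end{verbatim}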

6.2 The Laplacian in mechanical networks of springs

Let $x_i \in \mathbb{R}$ denote the displacement of the $i$th rigid body. Assume that each spring is ideal linear-elastic
and let $a_{ij}$ be the spring constant for the spring connecting the $i$th and $j$th bodies.
Define a graph as follows: the nodes are the rigid bodies $\{1, \dots, n\}$ with locations $x_1, \dots, x_n$, and the
edges are the springs with weights $a_{ij}$. Each node $i$ is subject to a force
\[
F_i = \sum_{j \neq i} a_{ij} (x_j - x_i) = -(Lx)_i,
\]
where $L$ is the Laplacian for the network of springs (modeled as an undirected weighted graph). Moreover,
recalling that the spring $\{i, j\}$ stores the quadratic energy $\frac{1}{2} a_{ij} (x_i - x_j)^2$, the total elastic energy is
\[
E_{\text{elastic}} = \frac{1}{2} \sum_{\{i,j\} \in E} a_{ij} (x_i - x_j)^2 = \frac{1}{2} x^\top L x.
\]
In this role, the Laplacian matrix is referred to as the stiffness matrix. Stiffness matrices can be defined
for spring networks in arbitrary dimensions (not only on the line) and with arbitrary topology (not only
a chain graph, or line graph, as in the figure). More complex spring networks can be found, for example, in
finite-element discretizations of flexible bodies and finite-difference discretizations of diffusive media.


6.3 The Laplacian in electrical networks of resistors


+ 3

1 2

Suppose the graph is an electrical network with only pure resistors and ideal voltage sources: (i) each
graph vertex i {1, . . . , n} is possibly connected to an ideal voltage source, (ii) each edge is a resistor, say
with resistance rij between nodes i and j. (This is an undirected weighted graph.)
Ohm's law along each edge {i, j} gives the current flowing from i to j as

    c_ij = (v_i - v_j)/r_ij = a_ij (v_i - v_j),

where a_ij is the inverse resistance, called the conductance. We set a_ij = 0 whenever two nodes are not connected by a resistor. Kirchhoff's current law says that, at each node i,

    c_injected at i = Σ_{j=1,j≠i}^n c_ij = Σ_{j=1,j≠i}^n a_ij (v_i - v_j).

Hence, the vector of injected currents c_injected and the vector of voltages at the nodes v satisfy

    c_injected = L v.

Moreover, the power dissipated by resistor {i, j} is c_ij (v_i - v_j), so that the total dissipated power is

    P_dissipated = Σ_{{i,j}∈E} a_ij (v_i - v_j)^2 = v^T L v.

Historical Note: Kirchhoff (1847) is a founder of graph theory in that he was an early adopter of graph
models to analyze electrical circuits.
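
A brief Matlab sketch of these relations (the resistance values are chosen arbitrarily for illustration): it assembles the conductance Laplacian, maps node voltages to injected currents, and evaluates the dissipated power.

% symmetric resistances on a three-node ring (illustrative values)
R = [Inf 1 2; 1 Inf 4; 2 4 Inf];   % R(i,j) = Inf means no resistor
A = 1 ./ R;                         % conductances a_ij = 1/r_ij; 1/Inf = 0
L = diag(sum(A,2)) - A;

v = [1; 0.5; -0.2];                 % illustrative node voltages
c_injected = L * v;                 % Kirchhoff + Ohm: c_injected = L v
assert(abs(sum(c_injected)) < 1e-12);  % injected currents balance out

Pdiss = v' * L * v;                 % total dissipated power v^T L v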

6.4 Properties of the Laplacian matrix


Lemma 6.3 (Zero row-sums). Let G be a weighted digraph with Laplacian L and n nodes. Then

L1n = 0n .

In equivalent words, 0 is an eigenvalue of L with eigenvector 1n .


Proof. For all rows i, the ith row-sum is zero:

    Σ_{j=1}^n ℓ_ij = ℓ_ii + Σ_{j=1,j≠i}^n ℓ_ij = Σ_{j=1,j≠i}^n a_ij + Σ_{j=1,j≠i}^n (-a_ij) = 0.

Equivalently, in vector format (remembering that the weighted out-degree matrix Dout is diagonal and contains the row-sums of A):

    L 1_n = Dout 1_n - A 1_n = [ d_out(1); . . . ; d_out(n) ] - [ d_out(1); . . . ; d_out(n) ] = 0_n.   ∎

Lemma 6.4 (Zero column-sums). Let G be a weighted digraph with Laplacian L and n nodes. The following statements are equivalent:

(i) G is weight-balanced, i.e., Dout = Din; and
(ii) 1_n^T L = 0_n^T.

Proof. Pick j ∈ {1, . . . , n} and compute

    (1_n^T L)_j = (L^T 1_n)_j = Σ_{i=1}^n ℓ_ij = ℓ_jj + Σ_{i=1,i≠j}^n ℓ_ij = d_out(j) - d_in(j),

where the last equality follows from

    ℓ_jj = d_out(j) - a_jj    and    Σ_{i=1,i≠j}^n ℓ_ij = -(d_in(j) - a_jj).

In summary, 1_n^T L = 0_n^T if and only if Dout = Din.   ∎

Lemma 6.5 (Spectrum of the Laplacian matrix). Given a weighted digraph G with Laplacian L, the eigenvalues of L different from 0 have strictly positive real part.

Proof. Recall that ℓ_ii = Σ_{j=1,j≠i}^n a_ij ≥ 0 and ℓ_ij = -a_ij ≤ 0 for i ≠ j. By the Geršgorin Disks Theorem 2.8, we know that each eigenvalue of L belongs to at least one of the disks

    { z ∈ C | |z - ℓ_ii| ≤ Σ_{j=1,j≠i}^n |ℓ_ij| } = { z ∈ C | |z - ℓ_ii| ≤ ℓ_ii }.


[Figure: Geršgorin disks of the Laplacian matrix, centered at the diagonal entries ℓ_ii on the nonnegative real axis, each with radius equal to its center.]

These disks, with radius equal to their center, contain the origin and otherwise only complex numbers with strictly positive real part.   ∎
For an undirected graph without self-loops and with symmetric adjacency matrix A = A^T, we know that L is symmetric and positive semidefinite, i.e., all eigenvalues of L are real and nonnegative, and that d(i) = ℓ_ii. In this case, by convention, we write these eigenvalues as

    0 = λ_1 ≤ λ_2 ≤ · · · ≤ λ_n.

Note:
(i) the second smallest eigenvalue λ_2 is called the Fiedler eigenvalue or the algebraic connectivity (Fiedler 1973);
(ii) the proof of the lemma also implies λ_n ≤ 2 max{d(1), . . . , d(n)}; and
(iii) we refer the reader to Exercise E6.16 for a lower bound on λ_n based on the maximum degree.
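
These spectral facts are easy to probe numerically. A short Matlab sketch on an arbitrary connected test graph (an unweighted cycle) computes the ordered spectrum, the algebraic connectivity λ_2, and the bound λ_n ≤ 2 d_max:

% unweighted cycle graph on n nodes (an arbitrary connected test case)
n = 8;
A = circshift(eye(n),1) + circshift(eye(n),-1);
L = diag(sum(A,2)) - A;

lambda = sort(eig(L));          % real eigenvalues, since L is symmetric
lambda2 = lambda(2);            % algebraic connectivity (Fiedler eigenvalue)
dmax = max(diag(L));            % for unweighted graphs, degree = diagonal of L

assert(abs(lambda(1)) < 1e-10);           % lambda_1 = 0
assert(lambda2 > 0);                      % the cycle graph is connected
assert(lambda(end) <= 2*dmax + 1e-10);    % lambda_n <= 2 * dmax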

6.5 Graph connectivity and the rank of the Laplacian


Theorem 6.6 (Rank of the Laplacian). Let L be the Laplacian matrix of a weighted digraph G with n nodes. Let d be the number of sinks in the condensation digraph of G. Then

    rank(L) = n - d.

This theorem has the following immediate consequences:
(i) a digraph G contains a globally reachable vertex if and only if rank(L) = n - 1 (also recall the properties of C(G) from Lemma 3.1); and
(ii) for the case of undirected graphs, we have the following two results: the rank of L equals n minus the number of connected components of G, and an undirected graph G is connected if and only if λ_2 > 0.
Early references for this theorem include (Fife 1972; Foster and Jacquez 1975) and (Agaev and Chebotarev 2000), which featured the first necessary and sufficient characterization; see also (Lin et al. 2005; Ren and Beard 2005) for the case rank(L) = n - 1.


Proof. We start by simplifying the problem. Define a new weighted digraph Ḡ by modifying G as follows: at each node, add a self-loop with unit weight if no self-loop is present, or increase the weight of the self-loop by 1 if a self-loop is present. Also, define another weighted digraph Ĝ by modifying Ḡ as follows: at each node, divide the weights of its out-going edges by its out-degree, so that the out-degree of each node becomes 1. In other words, define Ā = A + I_n and L̄ = L, and define Â = D̄_out^{-1} Ā and L̂ = D̄_out^{-1} L̄ = I_n - Â.
Clearly, the rank of L̂ is equal to the rank of L. Therefore, without loss of generality, we consider in what follows only digraphs with row-stochastic adjacency matrices.
Because the condensation digraph C(G) has d sinks, after a renumbering of the nodes, that is, a permutation of rows and columns (see Exercise E3.1), the adjacency matrix A can be written in block lower triangular form as

        [ A_11    0     ···    0       0
           0     A_22   ···    0       0
    A =    ⋮            ⋱              ⋮        ∈ R^{n×n},
           0      0     ···   A_dd     0
          A_1o   A_2o   ···   A_do   A_others ]

where the state vector x is correspondingly partitioned into the vectors x_1, . . . , x_d and x_others of dimensions n_1, . . . , n_d and n - (n_1 + · · · + n_d), respectively, corresponding to the d sinks and to all other nodes.
Each sink of C(G) is a strongly connected and aperiodic digraph. Therefore, the square matrices A_11, . . . , A_dd are nonnegative, irreducible, and primitive. By the Perron-Frobenius Theorem for primitive matrices 2.12, we know that the number 1 is a simple eigenvalue of each of them.
The square matrix A_others is nonnegative and can itself be written in block lower triangular form, whose diagonal blocks, say (A_others)_1, . . . , (A_others)_N, are nonnegative and irreducible. Moreover, each of these diagonal blocks must be row-substochastic because (1) each of its row-sums is at most 1, and (2) at least one of its row-sums must be smaller than 1, for otherwise that block would correspond to a sink of C(G). In summary, because the matrices (A_others)_1, . . . , (A_others)_N are irreducible and row-substochastic, the matrix A_others has spectral radius ρ(A_others) < 1.
We now write the Laplacian matrix L = I_n - A with the same block lower triangular structure:

        [ L_11    0     ···    0       0
           0     L_22   ···    0       0
    L =    ⋮            ⋱              ⋮    ,                                                (6.5)
           0      0     ···   L_dd     0
         -A_1o  -A_2o   ···  -A_do   L_others ]

where, for example, L_11 = I_{n_1} - A_11. Because the number 1 is a simple eigenvalue of A_11, the number 0 is a simple eigenvalue of L_11. Therefore, rank(L_11) = n_1 - 1. The same argument establishes that the rank of L is at most n - d, because each one of the matrices L_11, . . . , L_dd has rank n_1 - 1, . . . , n_d - 1, respectively. Finally, we note that the rank of L_others is maximal, because L_others = I - A_others and ρ(A_others) < 1 together imply that 0 is not an eigenvalue of L_others.   ∎
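
Theorem 6.6 is easy to check numerically. A minimal Matlab sketch for the five-node digraph of Section 6.1, whose condensation digraph has the single sink {4}, so that d = 1 and rank(L) = n - 1 = 4:

% Laplacian of the five-node example digraph of Section 6.1
L = [ 7.4 -3.7 -3.7  0    0;
     -8.9 10.1  0   -1.2  0;
      0    0    6.0 -3.7 -2.3;
      0    0    0    0    0;
     -4.4  0    0   -2.3  6.7];

d = 1;                               % one sink in the condensation digraph
assert(rank(L) == size(L,1) - d);    % Theorem 6.6: rank(L) = n - d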

6.6 Appendix: Community detection via algebraic connectivity


As just presented, the algebraic connectivity λ_2 of an undirected and weighted graph G is positive if and only if G is connected. We build on this insight and show that the algebraic connectivity is not only a binary connectivity measure: it also quantifies the bottleneck of the graph. To develop this intuition, we study the problem of community detection in a large-scale undirected graph. This problem arises, for example, when identifying groups of friends in a social network by means of the interaction graph.
Specifically, we consider the problem of partitioning the vertex set V of an undirected connected graph G into two sets V1 and V2 so that

    V1 ∪ V2 = V,    V1 ∩ V2 = ∅,    and    V1, V2 ≠ ∅.

Of course, there are many such partitions. We measure the quality of a partition by the sum of the weights of all edges that need to be cut to separate the vertices V1 and V2 into two disconnected components. Formally, the size of the cut separating V1 and V2 is

    J = Σ_{i∈V1, j∈V2} a_ij.

We are interested in the cut of minimal size, which identifies the two groups of nodes that are most loosely connected. The problem of minimizing the cut size J is combinatorial and computationally hard, since in principle one needs to consider all possible partitions of the vertex set V. We present here a tractable approach based on a so-called relaxation step. First, define a vector x ∈ {-1, +1}^n with entries x_i = +1 for i ∈ V1 and x_i = -1 for i ∈ V2. Then the cut size J can be rewritten via the Laplacian potential as

    J = (1/4) Σ_{i,j=1}^n a_ij (x_i - x_j)^2 = (1/2) x^T L x,

and the minimum cut size problem is:

    minimize_{x ∈ {-1,1}^n \ {-1_n, 1_n}}  x^T L x.

(Here we exclude the cases x ∈ {-1_n, 1_n} because they correspond to one of the two groups being empty.) Second, since this problem is still computationally hard, we relax the problem from binary decision variables x_i ∈ {-1, +1} to continuous decision variables y_i ∈ [-1, 1] (or ‖y‖_∞ ≤ 1), where we exclude y ∈ span(1_n) (corresponding to one of the two groups being empty). Then the minimization problem becomes

    minimize_{y ∈ R^n, y ⊥ 1_n, ‖y‖_∞ = 1}  y^T L y.


As a third and final step, we adopt a 2-norm constraint ‖y‖_2 = 1 instead of the ∞-norm constraint ‖y‖_∞ = 1 (recall that ‖y‖_∞ ≤ ‖y‖_2 ≤ √n ‖y‖_∞) to obtain the following heuristic:

    minimize_{y ∈ R^n, y ⊥ 1_n, ‖y‖_2 = 1}  y^T L y.

Notice that y^T L y ≥ λ_2 ‖y‖_2^2 for all y ⊥ 1_n, and this inequality holds with equality whenever y = v_2, the normalized eigenvector associated with λ_2. Thus, the minimum of the relaxed optimization problem is λ_2 and the minimizer is y = v_2. We can then use x = sign(v_2) as a heuristic to find the desired partition {V1, V2}. Hence, the algebraic connectivity λ_2 is an estimate of the size of the minimum cut, and the signs of the entries of v_2 identify the associated partition of the graph. For these reasons, λ_2 and v_2 can be interpreted as the size and the location of a bottleneck in a graph.
To illustrate the above concepts, we borrow an example problem, with the corresponding Matlab code, from (Gleich 2006). We construct a randomly generated graph as follows. First, we partition n = 1000 nodes into two groups V1 and V2 of sizes 450 and 550 nodes, respectively. Second, we connect any pair of nodes within the set V1 (respectively V2) with probability 0.3 (respectively 0.2). Third and finally, any two nodes in distinct groups, i ∈ V1 and j ∈ V2, are connected with probability 0.1. The sparsity pattern of the associated adjacency matrix is shown in the left panel of Figure 6.1. No obvious partition is visible at first glance, since the node indices are not necessarily sorted, that is, V1 is not necessarily {1, . . . , 450}. The second panel displays the entries of the eigenvector v_2 sorted according to their magnitude, showing a sharp transition between positive and negative entries. Finally, the third panel displays the correspondingly sorted adjacency matrix Ã, clearly indicating the partition V = V1 ∪ V2.
The Matlab code to generate Figure 6.1 can be found below. For additional analysis of this problem, we refer the reader to (Gleich 2006).

Figure 6.1: The first panel shows a randomly-generated sparse adjacency matrix A for a graph with 1000 nodes. The second panel displays the eigenvector ṽ_2, obtained from the normalized eigenvector v_2 by sorting its entries according to their magnitude, and the third panel displays the correspondingly sorted adjacency matrix Ã.

% choose a graph size
n = 1000;

% randomly assign the nodes to two groups
x = randperm(n);
group_size = 450;
group1 = x(1:group_size);
group2 = x(group_size+1:end);

% assign probabilities of connecting nodes
p_group1 = 0.3;
p_group2 = 0.2;
p_between_groups = 0.1;

% construct adjacency matrix
A(group1, group1) = rand(group_size, group_size) < p_group1;
A(group2, group2) = rand(n-group_size, n-group_size) < p_group2;
A(group1, group2) = rand(group_size, n-group_size) < p_between_groups;
A = triu(A,1); A = A + A';

% can you see the groups?
subplot(1,3,1); spy(A);
xlabel('$A$', 'Interpreter','latex','FontSize',28);

% construct Laplacian and its spectrum
L = diag(sum(A)) - A;
[V, D] = eigs(L, 2, 'SA');

% plot the components of the algebraic connectivity sorted by magnitude
subplot(1,3,2); plot(sort(V(:,2)), '.');
xlabel('$\tilde v_2$', 'Interpreter','latex','FontSize',28);

% partition the matrix accordingly and spot the communities
[~, p] = sort(V(:,2));
subplot(1,3,3); spy(A(p,p));
xlabel('$\tilde A$', 'Interpreter','latex','FontSize',28);


6.7 Exercises
E6.1 The spectra of Laplacian and row-stochastic adjacency matrices. Consider a row-stochastic matrix A ∈ R^{n×n}. Let L be the Laplacian matrix of the digraph associated to A. Compute the spectrum of L as a function of the spectrum spec(A) of A.
E6.2 The adjacency and Laplacian matrices for the complete graph. For any number n ∈ N, the complete graph with n nodes, denoted by K(n), is the undirected and unweighted graph in which any two distinct nodes are connected by an edge. For example, see K(6) in the figure.
Compute, for arbitrary n,
(i) the adjacency matrix of K(n) and its eigenvalues; and
(ii) the Laplacian matrix of K(n) and its eigenvalues.

E6.3 The adjacency and Laplacian matrices for the complete bipartite graph. A bipartite graph is a graph
whose vertices can be divided into two disjoint sets U and V with the property that every edge connects a
vertex in U to one in V . A complete bipartite graph is a bipartite graph in which every vertex of U is connected
with every vertex of V. If U has n vertices and V has m vertices, for arbitrary n, m ∈ N, the resulting complete bipartite graph is denoted by K(n, m). For example, see K(1, 6) and K(3, 3) in the figure.
Compute, for arbitrary n and m,
(i) the adjacency matrix of K(n, m) and its eigenvalues; and
(ii) the Laplacian matrix of K(n, m) and its eigenvalues.
E6.4 The Laplacian matrix of an undirected graph is positive semidefinite. Give an alternative proof, without relying on the Geršgorin Disks Theorem 2.8, that the Laplacian matrix L of an undirected weighted graph is symmetric positive semidefinite. (Note that the proof of Lemma 6.5 relies on the Geršgorin Disks Theorem 2.8.)
E6.5 A lower bound. Let G be a weighted undirected graph with adjacency matrix A and Laplacian matrix L. Assume G is connected, let λ_2 be the smallest non-zero eigenvalue of L, and show that, for any x ∈ R^n,

    x^T L x ≥ λ_2 ‖ x - (1/n)(1_n^T x) 1_n ‖_2^2.
E6.6 The Laplacian matrix plus its transpose. Let G be a weighted digraph with Laplacian matrix L. Prove that the following statements are equivalent:
(i) G is weight-balanced;
(ii) L + L^T is positive semidefinite.
Next, assume G is weight-balanced with adjacency matrix A, and show that
(iii) L + L^T is the Laplacian matrix of the digraph associated with the symmetric adjacency matrix A + A^T, and
(iv) (L + L^T) 1_n = 0_n.
E6.7 Scaled Laplacian matrices. Let L = L^T ∈ R^{n×n} be the Laplacian matrix of a connected, undirected, and symmetrically weighted graph. Consider a diagonal matrix D = diag(d_1, . . . , d_n). Define the matrices A and B by

    A := DL    and    B := LD.

(i) Give necessary and sufficient conditions on {d_1, . . . , d_n} for A to be a Laplacian matrix.
(ii) Give necessary and sufficient conditions on {d_1, . . . , d_n} for B to be a Laplacian matrix.


(iii) Give a sufficient condition on {d1 , . . . , dn } for A and B to be symmetric.


(iv) Assuming d_i ≠ 0, i ∈ {1, . . . , n}, do A and B possess a zero eigenvalue? If so, what are the corresponding right and left eigenvectors for A and B?
E6.8 The disagreement function in a directed graph (Gao et al. 2008). Recall that the quadratic form associated with a symmetric matrix B ∈ R^{n×n} is the function x ↦ x^T B x. Let G be a weighted digraph with n nodes and define the quadratic disagreement function Φ_G : R^n → R by

    Φ_G(x) = (1/2) Σ_{i,j=1}^n a_ij (x_j - x_i)^2.

Show that:
(i) Φ_G is the quadratic form associated with the symmetric positive-semidefinite matrix

    P = (1/2) (Dout + Din - A - A^T),

(ii) P = (1/2) ( L + L_(rev) ), where L_(rev) = Din - A^T is the Laplacian of the reverse digraph.
E6.9 The pseudoinverse Laplacian matrix. The Moore-Penrose pseudoinverse of an n × m matrix M is the unique m × n matrix M† with the following properties:
(i) M M† M = M,
(ii) M† M M† = M†, and
(iii) both M M† and M† M are symmetric.
Assume L is the Laplacian matrix of a weighted connected undirected graph with n nodes. Let U ∈ R^{n×n} be an orthonormal matrix of eigenvectors of L such that

    L = U diag(0, λ_2, . . . , λ_n) U^T.

Show that
(i) L† = U diag(0, 1/λ_2, . . . , 1/λ_n) U^T,
(ii) L L† = L† L = I_n - (1/n) 1_n 1_n^T, and
(iii) L† 1_n = 0_n.
E6.10 The Green matrix of a Laplacian matrix. Assume L is the Laplacian matrix of a weighted connected undirected graph with n nodes. Show that
(i) the matrix L + (1/n) 1_n 1_n^T is positive definite,
(ii) the so-called Green matrix

    X = ( L + (1/n) 1_n 1_n^T )^{-1} - (1/n) 1_n 1_n^T                                     (E6.1)

is the unique solution to the system of equations

    L X = I_n - (1/n) 1_n 1_n^T,    1_n^T X = 0_n^T,

(iii) X = L†, where L† is defined in Exercise E6.9. In other words, the Green matrix formula (E6.1) is an alternative definition of the pseudoinverse Laplacian matrix.
E6.11 Monotonicity of Laplacian eigenvalues. Consider a symmetric Laplacian matrix L ∈ R^{n×n} associated with a weighted undirected graph G = (V, E, A). Assume G is connected and let λ_2(G) > 0 be its algebraic connectivity, i.e., the second-smallest eigenvalue of L. Show that
(i) λ_2(G) is a monotonically non-decreasing function of each weight a_ij, {i, j} ∈ E; and
(ii) λ_2(G) is a monotonically non-decreasing function of the edge set in the following sense: λ_2(G) ≤ λ_2(G′) for any graph G′ = (V, E′, A′) with E ⊆ E′ and a_ij = a′_ij for all {i, j} ∈ E.
Hint: Use the disagreement function.
E6.12 Invertibility of principal minors of the Laplacian matrix. Consider a connected undirected graph and an arbitrary partition of its node set into two nonempty sets, V = V1 ∪ V2. The associated symmetric and irreducible Laplacian matrix L ∈ R^{n×n} is partitioned accordingly as

    L = [ L_11    L_12
          L_12^T  L_22 ].

Show that the submatrices L_11 ∈ R^{|V1|×|V1|} and L_22 ∈ R^{|V2|×|V2|} are nonsingular.
E6.13 Gaussian elimination and Laplacian matrices. Consider an undirected connected graph and its associated Laplacian matrix L ∈ R^{n×n}. Consider the associated linear Laplacian equation y = Lx, where x ∈ R^n is unknown and y ∈ R^n is a given vector. Verify that eliminating x_n by means of the last row of this equation yields the following reduced set of equations:

    [ y_1; . . . ; y_{n-1} ] + [ -L_{1n}/L_{nn}; . . . ; -L_{n-1,n}/L_{nn} ] y_n = L_red [ x_1; . . . ; x_{n-1} ],

where A := ( -L_{in}/L_{nn} )_{i∈{1,...,n-1}} ∈ R^{(n-1)×1} denotes the column vector multiplying y_n, and where the (i, j)-element of L_red ∈ R^{(n-1)×(n-1)} is given by L_ij - L_in L_jn / L_nn. Show that the matrices A and L_red obtained after Gaussian elimination have the following properties:
(i) A is a nonnegative column-stochastic matrix with at least one strictly positive element; and
(ii) L_red is a symmetric and irreducible Laplacian matrix.
Hint: To show the irreducibility of L_red, verify the following property regarding the fill-in of the matrix L_red: the graph associated to the Laplacian L_red has an edge between nodes i and j if and only if (i) either {i, j} was an edge in the original graph associated to L, (ii) or {i, n} and {j, n} were edges in the original graph associated to L.
E6.14 Thomson's principle and energy routing. Consider a connected undirected resistive electrical network with n nodes, with external nodal current injections c ∈ R^n satisfying the balance condition 1_n^T c = 0, and with resistance R_ij > 0 on every undirected edge {i, j} ∈ E. For simplicity, we set R_ij = ∞ if there is no edge connecting i and j. As shown earlier in this chapter, Kirchhoff's and Ohm's laws lead to the network equations

    c_injected at i = Σ_{j∈N(i)} c_ij = Σ_{j∈N(i)} (1/R_ij) (v_i - v_j),

where v_i is the potential at node i and c_ij = (1/R_ij)(v_i - v_j) is the current flow from node i to node j. Consider now a more general set of current flows f_ij (for all i, j ∈ {1, . . . , n}) routing energy through the network and compatible with the following basic assumptions:
(i) Skew-symmetry: f_ij = -f_ji for all i, j ∈ {1, . . . , n};
(ii) Consistency: f_ij = 0 if {i, j} ∉ E;
(iii) Conservation: c_injected at i = Σ_{j∈N(i)} f_ij for all i ∈ {1, . . . , n}.
Show that among all current flows f_ij satisfying these assumptions, the physical current flow f_ij^* = c_ij = (1/R_ij)(v_i - v_j) uniquely minimizes the energy dissipation:

    minimize_{f_ij, i,j∈{1,...,n}}  J = (1/2) Σ_{i,j=1}^n R_ij f_ij^2
    subject to    f_ij = -f_ji for all i, j ∈ {1, . . . , n},
                  f_ij = 0 for all {i, j} ∉ E,
                  c_injected at i = Σ_{j∈N(i)} f_ij for all i ∈ {1, . . . , n}.

Hint: The solution requires knowledge of the Karush-Kuhn-Tucker (KKT) conditions for optimality; this is a classic topic in nonlinear constrained optimization discussed in numerous textbooks, e.g., (Luenberger 1984).
E6.15 Linear spring networks with loads. Consider the two (connected) spring networks with n moving masses shown in the figure; for the right network, assume one of the masses is additionally connected to a single stationary object by a spring. Refer to the left spring network as free and to the right network as grounded. Let F_load ∈ R^n be a load force applied to the n moving masses.

[Figure: two chains of masses and springs with displacements x; in the right network, one end mass is attached to a stationary wall by a spring.]

For the left network, let L_free,n be the n × n Laplacian matrix describing the free spring network among the n moving masses, as defined in Section 6.2. For the right network, let L_free,n+1 be the (n + 1) × (n + 1) Laplacian matrix for the spring network among the n masses and the stationary object. Let L_grounded be the n × n grounded Laplacian of the n masses, constructed by removing from L_free,n+1 the row and column corresponding to the stationary object.
For the free spring network subject to F_load:
(i) do equilibrium displacements exist for arbitrary loads?
(ii) if the load force F_load is balanced in the sense that 1_n^T F_load = 0, is the resulting equilibrium displacement unique?
(iii) compute the equilibrium displacement if unique, or the set of equilibrium displacements otherwise, assuming a balanced force profile is applied.
For the grounded spring network:
(iv) derive an expression relating L_grounded to L_free,n,
(v) show that L_grounded is invertible,
(vi) compute the displacement of the grounded spring network for arbitrary load forces.
E6.16 Degree-dependent bounds on the Laplacian spectrum. In the discussion following Lemma 6.5, we found that the maximal eigenvalue λ_n of a symmetric Laplacian matrix L = L^T ∈ R^{n×n} satisfies λ_n ≤ 2 d_max, where the maximum degree is d_max = max_{i∈{1,...,n}} d_i. Prove the lower bound:

    λ_n ≥ d_max.
E6.17 Distributed averaging-based PI control. Consider a set of n controllable agents governed by the second-order dynamics

    ẋ_i = y_i,                                                   (E6.2a)
    ẏ_i = u_i + δ_i,                                             (E6.2b)

where i ∈ {1, . . . , n} is the agent index, u_i ∈ R is a control input to agent i, and δ_i ∈ R is an unknown disturbance affecting agent i. Given an undirected, connected, and weighted graph G = (V, E, A) with node set V = {1, . . . , n}, edge set E ⊆ V × V, and adjacency matrix A = A^T ∈ R^{n×n}, we assume each agent can measure its velocity y_i ∈ R as well as the relative position x_i - x_j for each neighbor {i, j} ∈ E. Based on these measurements, consider now the distributed averaging-based proportional-integral (PI) controller

    u_i = -Σ_{j=1}^n a_ij (x_i - x_j) - y_i - q_i,               (E6.3a)
    q̇_i = y_i - Σ_{j=1}^n a_ij (q_i - q_j),                      (E6.3b)

where q_i ∈ R is a dynamic control state for each agent i ∈ {1, . . . , n}. Your tasks are as follows:
(i) show that the center of mass (1/n) Σ_{i=1}^n x_i(t) is bounded for all t ≥ 0,
(ii) characterize the set of equilibria (x*, y*, q*) of the closed-loop system (E6.2)-(E6.3), and
(iii) show that all trajectories converge to these closed-loop equilibria.
E6.18 Maximum power dissipation. As in Section 6.3, consider an electrical network composed of three voltage sources (v_1, v_2, v_3) connected by three resistors (each with unit resistance R = 1) in an undirected ring topology. Recall that the total power dissipated by the circuit is

    P_dissipated = v^T L v.

What is the maximum dissipated power if the voltages v are such that ‖v‖_2 = 1?
Hint: Recall the notion of induced 2-norm.

Chapter 7
Continuous-time Averaging Systems

In this chapter we consider averaging algorithms in which the variables evolve in continuous time, instead of
discrete time. Therefore we look at some interesting differential equations. We borrow ideas from (Mesbahi
and Egerstedt 2010; Ren et al. 2007).

7.1 Example systems


We present here two simple examples of continuous-time averaging systems.

7.1.1 Example #1: Flocking behavior for a group of animals


We are interested in a simple alignment rule for each agent to steer towards the average heading of its
neighbors; see Figure 7.1.

Figure 7.1: Alignment rule: the center fish rotates clockwise to align itself with the average heading of its neighbors.

This alignment rule amounts to a spring-like attractive force, described as follows:

    θ̇_i = (θ_j - θ_i),                                           if the ith agent has one neighbor j,
    θ̇_i = (1/2)(θ_j1 - θ_i) + (1/2)(θ_j2 - θ_i),                 if the ith agent has two neighbors j1, j2,
    θ̇_i = (1/m)(θ_j1 - θ_i) + · · · + (1/m)(θ_jm - θ_i),          if the ith agent has m neighbors j1, . . . , jm,

that is,

    θ̇_i = average({θ_j, for all neighbors j}) - θ_i.

This interaction law can be written as

    θ̇ = -L θ,

where L is the Laplacian of an appropriate weighted digraph G: each bird is a node and each directed edge (i, j) has weight 1/d_out(i). Here it is useful to recall the interpretation of -(Lx)_i as the force perceived by node i in a network of springs.
Note: it is weird (i.e., mathematically ill-posed) to compute averages on a circle, but let us not worry about that for now.
Note: this incomplete model does not concern itself with positions. In other words, we do not discuss collision avoidance and formation/cohesion maintenance. Moreover, note that the graph G should really be state-dependent. For example, we may assume that two birds see each other, and interact, if and only if their pairwise Euclidean distance is below a certain threshold.

Figure 7.2: Many animal species exhibit flocking behaviors that arise from decentralized interactions. On the left:
pacific threadfins (Polydactylus sexfilis); public domain image from the U.S. National Oceanic and Atmospheric
Administration. On the right: flock of snow geese (Chen caerulescens); public domain image from the U.S. Fish and
Wildlife Service.

7.1.2 Example #2: A simple RC circuit

Consider an electrical network with only pure resistors and with pure capacitors connecting each node
to ground; this example is taken from (Mesbahi and Egerstedt 2010; Ren et al. 2007).
From the previous chapter, we know the vector of injected currents cinjected and the vector of voltages
at the nodes v satisfy
cinjected = L v,


where L is the Laplacian for the graph with coefficients a_ij = 1/r_ij. Additionally, assuming C_i is the capacitance at node i, and keeping proper track of the current into each capacitor, we have

    C_i (d/dt) v_i = -c_injected at i,

so that, with the shorthand C = diag(C_1, . . . , C_n),

    (d/dt) v = -C^{-1} L v.

Note: C^{-1} L is again a Laplacian matrix (for a directed weighted graph).
Note: it is physically intuitive that after some transient all nodes will have the same potential. This
intuition will be proved later in the chapter.

7.2 Continuous-time linear systems and their convergence properties


In Section 2.1 we presented discrete-time linear systems and their convergence properties; here we present
their continuous-time analog.
A continuous-time linear system is

    ẋ(t) = A x(t).                                               (7.1)

Its solution t ↦ x(t), t ∈ R≥0, from an initial condition x(0) satisfies x(t) = e^{At} x(0), where the matrix exponential of a square matrix A is defined by

    e^A = Σ_{k=0}^∞ (1/k!) A^k.

The matrix exponential is a remarkable operation with numerous properties; we ask the reader to review a few basic ones in Exercise E7.1. A matrix A ∈ R^{n×n} is

(i) continuous-time semi-convergent if lim_{t→+∞} e^{At} exists, and
(ii) continuous-time convergent (Hurwitz) if lim_{t→+∞} e^{At} = 0_{n×n}.

The spectral abscissa of a square matrix A is the maximum of the real parts of the eigenvalues of A, that is,

    α(A) = max{ ℜ(λ) | λ ∈ spec(A) }.

Theorem 7.1 (Convergence and spectral abscissa). For a square matrix A, the following statements hold:

(i) A is continuous-time convergent (Hurwitz) ⟺ α(A) < 0,
(ii) A is continuous-time semi-convergent ⟺ α(A) ≤ 0, no eigenvalue has zero real part other than possibly the number 0, and, if 0 is an eigenvalue, then it is semisimple.

We leave the proof of this theorem to the reader and mention that most of the required steps are similar to the discussion in Section 2.1 and are treated later in this chapter.


7.3 The Laplacian flow


Let G be a weighted directed graph with n nodes and Laplacian matrix L. The Laplacian flow on R^n is the dynamical system

    ẋ = -L x,                                                    (7.2)

or, equivalently, in components,

    ẋ_i = Σ_{j=1}^n a_ij (x_j - x_i) = Σ_{j∈N^out(i)} a_ij (x_j - x_i).
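
Before proceeding with the analysis, here is a minimal Matlab simulation sketch of the Laplacian flow on an illustrative strongly connected digraph; it anticipates Theorem 7.4 below by comparing the limit of x(t) = e^{-Lt} x(0) with (w^T x(0)) 1_n, where w is the left eigenvector discussed in that theorem.

% a strongly connected three-node digraph with illustrative weights
A = [0 1 0; 0 0 2; 3 1 0];
L = diag(sum(A,2)) - A;       % Laplacian L = Dout - A

% left eigenvector w of L for the eigenvalue 0 (w' * L = 0'), sum(w) = 1
[W, D] = eig(L');
[~, k] = min(abs(diag(D)));
w = real(W(:,k));             % the 0 eigenvalue is real, so w can be taken real
w = w / sum(w);

x0 = [1; -2; 4];
xT = expm(-L * 100) * x0;     % solution x(t) = expm(-L t) x(0) at a large time
assert(norm(xT - (w' * x0) * ones(3,1)) < 1e-8);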

7.3.1 Matrix exponential of a Laplacian matrix


Before analyzing the Laplacian flow, we provide some results on the matrix exponential of (minus) a Laplacian matrix. We show that such an exponential matrix is row-stochastic and has properties analogous to those of the adjacency matrices studied in Section 4.2.

Theorem 7.2 (The matrix exponential of a Laplacian matrix). Let L be an n × n Laplacian matrix with associated digraph G and with maximum diagonal entry ℓ_max = max{ℓ_11, . . . , ℓ_nn}. Then

(i) exp(-L) 1_n = 1_n, for any digraph G,
(ii) 1_n^T exp(-L) = 1_n^T, for a weight-balanced digraph G (i.e., 1_n^T L = 0_n^T),
(iii) exp(-L) ≥ e^{-ℓ_max} I_n ≥ 0, for any digraph G,
(iv) exp(-L) e_j > 0, for a digraph G whose jth node is globally reachable, and
(v) exp(-L) > 0, for a strongly connected digraph G (i.e., for an irreducible L).

Proof. From the equality L 1_n = 0_n and the definition of the matrix exponential, we compute

    exp(-L) 1_n = ( I_n + Σ_{k=1}^∞ ((-1)^k / k!) L^k ) 1_n = 1_n.

Similarly, if 1_n^T L = 0_n^T, we compute

    1_n^T exp(-L) = 1_n^T ( I_n + Σ_{k=1}^∞ ((-1)^k / k!) L^k ) = 1_n^T.

These calculations establish statements (i) and (ii).
Next, we define the nonnegative matrix A_L := ℓ_max I_n - L, so that -L = -ℓ_max I_n + A_L. Because A_L I_n = I_n A_L, we know

    exp(-L) = exp(-ℓ_max I_n) exp(A_L) = e^{-ℓ_max} exp(A_L).

Here we used the following properties of the matrix exponential operation: exp(A + B) = exp(A) exp(B) if AB = BA, and exp(a I_n) = e^a I_n. Next, because A_L ≥ 0, we know that exp(A_L) = Σ_{k=0}^∞ A_L^k / k! is lower bounded by the first n - 1 terms of the series, so that

    exp(-L) = e^{-ℓ_max} exp(A_L) ≥ e^{-ℓ_max} Σ_{k=0}^{n-1} (1/k!) A_L^k.                  (7.3)

Next, we derive two useful lower bounds on exp(-L) based on the inequality (7.3). First, by keeping just the first term, we establish statement (iii):

    exp(-L) ≥ e^{-ℓ_max} I_n ≥ 0.

Second, we lower bound the coefficients 1/k! and write:

    exp(-L) ≥ e^{-ℓ_max} Σ_{k=0}^{n-1} (1/k!) A_L^k ≥ ( e^{-ℓ_max} / (n-1)! ) Σ_{k=0}^{n-1} A_L^k.

Notice now that the digraph G associated to L is the same as that associated to A_L (we do not need to worry about self-loops here). Hence, if node j is globally reachable in G, then Lemma 4.3 implies that the jth column of Σ_{k=0}^{n-1} A_L^k is positive and, by inequality (7.3), also the jth column of exp(-L) is positive. This establishes statement (iv). Moreover, if L is irreducible, then A_L is irreducible, that is, A_L satisfies Σ_{k=0}^{n-1} A_L^k > 0, so that also exp(-L) > 0. This establishes statement (v).   ∎
k=0

7.3.2 Equilibria and convergence of the Laplacian flow


We can now focus on the Laplacian flow dynamics.

Lemma 7.3 (Equilibrium points). If G contains a globally reachable node, then the set of equilibrium points of the Laplacian flow (7.2) is {α 1_n | α ∈ R}.

Proof. A point x is an equilibrium of the Laplacian flow if and only if L x = 0_n; hence the set of equilibria is the kernel of L. From Theorem 6.6, if G contains a globally reachable node, then rank(L) = n - 1, so the kernel of L is one-dimensional. The lemma follows by recalling that L 1_n = 0_n.   ∎
In what follows, we are interested in characterizing the evolution of the Laplacian flow (7.2). To build some intuition, let us first consider an undirected graph G and write the modal decomposition of the solution, as we did in Remark 2.3 in Section 2.1 for discrete-time linear systems. We proceed in two steps. First, because G is undirected, the matrix L is symmetric and has real eigenvalues 0 = λ_1 ≤ λ_2 ≤ · · · ≤ λ_n with corresponding orthonormal (i.e., orthogonal and unit-length) eigenvectors v_1, . . . , v_n. Define y_i(t) = v_i^T x(t) and left-multiply ẋ = -L x by v_i^T:

    (d/dt) y_i(t) = -λ_i y_i(t),    y_i(0) = v_i^T x(0).

These n decoupled ordinary differential equations are immediately solved to give

    x(t) = y_1(t) v_1 + y_2(t) v_2 + · · · + y_n(t) v_n
         = e^{-λ_1 t} (v_1^T x(0)) v_1 + e^{-λ_2 t} (v_2^T x(0)) v_2 + · · · + e^{-λ_n t} (v_n^T x(0)) v_n.

Second, recall that λ_1 = 0 and v_1 = 1_n / √n, because L is a symmetric Laplacian matrix (L 1_n = 0_n). Therefore, we compute (v_1^T x(0)) v_1 = average(x(0)) 1_n and substitute:

    x(t) = average(x(0)) 1_n + e^{-λ_2 t} (v_2^T x(0)) v_2 + · · · + e^{-λ_n t} (v_n^T x(0)) v_n.

Now, let us assume that G is connected, so that its second smallest eigenvalue λ_2 is strictly positive. In this case, we can infer that

    lim_{t→∞} x(t) = average(x(0)) 1_n,

or, defining a disagreement vector δ(t) = x(t) - average(x(0)) 1_n, we infer

    δ(t) = e^{-λ_2 t} (v_2^T x(0)) v_2 + · · · + e^{-λ_n t} (v_n^T x(0)) v_n.

In summary, we discovered that, for a connected undirected graph, the disagreement vector converges to zero with exponential rate λ_2. In what follows, we state a more general convergence-to-consensus result for the continuous-time Laplacian flow. This result parallels Theorem 5.2; early references for it include (Lin et al. 2005; Ren and Beard 2005).

Theorem 7.4 (Consensus for Laplacian matrices with globally reachable node). If a Laplacian matrix L has associated digraph G with a globally reachable node, then

(i) the eigenvalue 0 of L is simple and all other eigenvalues of L have strictly positive real part,
(ii) lim_{t→∞} e^{-Lt} = 1_n w^T, where w ≥ 0 is the left eigenvector of L with eigenvalue 0 satisfying w_1 + · · · + w_n = 1,
(iii) w_i > 0 if and only if node i is globally reachable; accordingly, w_i = 0 if and only if node i is not globally reachable,
(iv) the solution to (d/dt) x(t) = -L x(t) satisfies

    lim_{t→∞} x(t) = ( w^T x(0) ) 1_n,

(v) if additionally G is weight-balanced, then G is strongly connected, 1_n^T L = 0_n^T, w = (1/n) 1_n, and

    lim_{t→∞} x(t) = ( (1_n^T x(0)) / n ) 1_n = average(x(0)) 1_n.

Note: as a corollary of statement (iii), the left eigenvector w ∈ R^n associated with the 0 eigenvalue has strictly positive entries if and only if G is strongly connected.

Proof. Because the associated digraph has a globally reachable node, Theorem 6.6 establishes that L has rank n - 1, and Lemma 6.5 establishes that all eigenvalues of L have nonnegative real part. Therefore, also remembering the property L 1_n = 0_n, we conclude that 0 is a simple eigenvalue with right eigenvector 1_n and that all other eigenvalues of L have positive real part. This concludes the proof of (i). In what follows we let w denote the left eigenvector associated with the eigenvalue 0, that is, w^T L = 0_n^T, normalized so that 1_n^T w = 1.
To prove statement (ii), we proceed in three steps. First, we write the Laplacian matrix in its Jordan normal form:

                        [ 0   0   ···  0
    L = P J P^{-1} = P    0  J_2  ···  0    P^{-1},                                          (7.4)
                          ⋮        ⋱   ⋮
                          0   0   ···  J_m ]

where m ≤ n is the number of Jordan blocks, the first block is the scalar 0 (being the only eigenvalue we know), the other Jordan blocks J_2, . . . , J_m (unique up to re-ordering) are associated with eigenvalues with strictly positive real part, and the columns of P are the generalized eigenvectors of L (unique up to rescaling).
Second, using some properties from Exercise E7.1, we compute the limit as t → ∞ of e^{-Lt} = P e^{-Jt} P^{-1} as

                                                  [ 1  0  ···  0
    lim_{t→∞} e^{-Lt} = P ( lim_{t→∞} e^{-Jt} ) P^{-1} = P    0  0  ···  0    P^{-1} = (P e_1)(e_1^T P^{-1}) = c_1 r_1,
                                                    ⋮      ⋱   ⋮
                                                    0  0  ···  0 ]

where c_1 is the first column of P and r_1 is the first row of P^{-1}. The contributions of the Jordan blocks J_2, . . . , J_m vanish because the eigenvalues of -J_2, . . . , -J_m have negative real part; for more details see, e.g., (Hespanha 2009).
Third and final, we characterize c_1 and r_1. By definition, the first column of P (unique up to rescaling) is a right eigenvector of the eigenvalue 0 of L, that is, c_1 = α 1_n for some scalar α, since we know L 1_n = 0_n. It is convenient to pick c_1 = 1_n. Next, equation (7.4) can be rewritten as P^{-1} L = J P^{-1}, whose first row reads r_1 L = 0_n^T. This equality implies r_1 = β w^T for some scalar β. Finally, we note that P^{-1} P = I_n implies r_1 c_1 = 1, that is, β w^T 1_n = 1. Since we know w^T 1_n = 1, we infer that β = 1 and that r_1 = w^T. This concludes the proof of statement (ii).
Next, we prove statement (iii). Pick a positive constant ε < 1/d_max, where the maximum out-degree is d_max = max{d_out(1), . . . , d_out(n)}. Define B = I_n - εL. It is easy to show that B is nonnegative, row-stochastic, and has strictly positive diagonal elements. Moreover, w^T L = 0_n^T implies w^T B = w^T, so that w is the left eigenvector with unit eigenvalue for B. Now, note that the digraph G(L) associated to L (without self-loops) is identical to the digraph G(B) associated to B, except for the fact that B has a self-loop at each node. By assumption G(L) has a globally reachable node and therefore so does G(B), where the subgraph induced by the set of globally reachable nodes is aperiodic (due to the self-loops). Therefore, statement (iii) is an immediate transcription of the same statement for row-stochastic matrices established in Theorem 5.2 (statement (iii)).
Statements (iv) and (v) are straightforward and left as Exercise E7.3.   ∎


7.4 Second-order Laplacian flows


In this section we assume each node of the network obeys a so-called double-integrator (that is, second-order) dynamics:

    ẍ_i = u_i,    or, in first-order equivalent form,    ẋ_i = v_i,   v̇_i = u_i,            (7.5)

where u_i is an appropriate control input signal to be designed.
We assume a weighted digraph, with adjacency matrix A and Laplacian L, describes the sensing and/or communication interactions among the agents. We also introduce constants k_p, k_d ≥ 0 describing so-called spring and damping coefficients, respectively, as well as constants σ_p, σ_d ≥ 0 describing position-averaging and velocity-averaging coefficients. Given the control law

    u_i = -k_p x_i - k_d ẋ_i + Σ_{j=1}^n a_ij ( σ_p (x_j - x_i) + σ_d (ẋ_j - ẋ_i) ),

the corresponding closed-loop system, called the second-order Laplacian flow, is

    ẍ(t) + (k_d I_n + σ_d L) ẋ(t) + (k_p I_n + σ_p L) x(t) = 0_n.                            (7.6)

By introducing the second-order Laplacian matrix 𝓛 ∈ R^{2n×2n}, we write the system in first-order form:

    (d/dt) [ x(t); v(t) ] = 𝓛 [ x(t); v(t) ],    where    𝓛 := [ 0_{n×n}            I_n
                                                                 -k_p I_n - σ_p L   -k_d I_n - σ_d L ].

    Name                                             Dynamics                            References
    ------------------------------------------------------------------------------------------------------------
    Second-order consensus protocol                  ẍ(t) + L ẋ(t) + σ_p L x(t) = 0_n   (Ren and Atkins 2005;
      (k_p = k_d = 0, σ_d = 1, σ_p > 0)                                                  Ren 2008a; Yu et al. 2010)
    Harmonic oscillators coupled via velocity        ẍ(t) + L ẋ(t) + k_p x(t) = 0_n     (Ren 2008b)
      averaging (k_d = σ_p = 0, σ_d = 1, k_p > 0)
    Position-averaging with absolute velocity        ẍ(t) + k_d ẋ(t) + L x(t) = 0_n     Exercise ??
      damping (k_p = σ_d = 0, σ_p = 1, k_d > 0)
    Arbitrary-sign gains and digraphs                equation (7.6)                      (Zhu et al. 2009); see
      (possibly with L ≠ L^T)                                                            (Zhang and Tian 2009) for
                                                                                         the discrete-time setting.

Table 7.1: Classification of second-order Laplacian flows

It turns out that it is possible to compute the eigenvalues of the second-order Laplacian matrix; we refer
to Exercise E7.12 for its eigenvectors.


Theorem 7.5 (Eigenvalues of second-order Laplacian matrices). Given a Laplacian matrix L and coefficients k_p, k_d, σ_p, σ_d ∈ R,

(i) the characteristic polynomial of 𝓛 is

    det( μ I_{2n} - 𝓛 ) = det( μ² I_n + μ (k_d I_n + σ_d L) + (k_p I_n + σ_p L) );

(ii) given the eigenvalues λ_i, i ∈ {1, . . . , n}, of L, the 2n eigenvalues μ_{i,±}, i ∈ {1, . . . , n}, of 𝓛 are the solutions of

    μ² + (k_d + σ_d λ_i) μ + (k_p + σ_p λ_i) = 0,    i ∈ {1, . . . , n}.                     (7.7)

Proof. Regarding statement (i), we recall equality (E7.1b) from Exercise E7.11 and compute the characteristic polynomial of 𝓛 as:

    det( μ I_{2n} - 𝓛 ) = det [ μ I_n              -I_n
                                k_p I_n + σ_p L    (μ + k_d) I_n + σ_d L ]
                        = det( (μ I_n)((μ + k_d) I_n + σ_d L) + (k_p I_n + σ_p L) )
                        = det( μ² I_n + μ (k_d I_n + σ_d L) + (k_p I_n + σ_p L) ).

Regarding statement (ii), let J_L be the Jordan normal form of L, i.e., let L = T J_L T^{-1} for an appropriate invertible T, and note

    det( μ I_{2n} - 𝓛 ) = det( μ² I_n + μ (k_d I_n + σ_d J_L) + (k_p I_n + σ_p J_L) )
                        = Π_{i=1}^n ( μ² + (k_d + σ_d λ_i) μ + (k_p + σ_p λ_i) ).

Therefore, the 2n solutions of the characteristic equation det(μ I_{2n} - 𝓛) = 0 are the n pairs of solutions μ_{i,±}, i ∈ {1, . . . , n}, of the second-order equations (7.7). This concludes the proof of statement (ii).   ∎
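
A quick Matlab sketch cross-checking Theorem 7.5(ii): it assembles the second-order Laplacian matrix for an undirected path graph (the gains are chosen arbitrarily) and verifies that every root of the quadratics (7.7) is an eigenvalue of the 2n × 2n matrix.

% undirected path graph on n = 3 nodes, with arbitrary gains
A = [0 1 0; 1 0 1; 0 1 0];
L = diag(sum(A,2)) - A;
n = 3; kp = 0.5; kd = 1; sp = 2; sd = 1;   % sp, sd stand for sigma_p, sigma_d

calL = [zeros(n), eye(n); -kp*eye(n) - sp*L, -kd*eye(n) - sd*L];

mu = eig(calL);                 % eigenvalues of the 2n x 2n matrix
lam = eig(L);                   % eigenvalues lambda_i of L
roots77 = [];
for i = 1:n                     % roots of (7.7) for each lambda_i
    roots77 = [roots77; roots([1, kd + sd*lam(i), kp + sp*lam(i)])];
end
for r = roots77.'               % each root must appear in spec(calL)
    assert(min(abs(mu - r)) < 1e-6);
end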

Next, we provide a necessary and sufficient characterization of the so-called asymptotic second-order consensus property.

Theorem 7.6 (Asymptotic second-order consensus). Consider the second-order Laplacian flow (7.6). The following statements are equivalent:

(i) the second-order Laplacian flow achieves asymptotic second-order consensus, that is, |x_i - x_j| → 0 and |ẋ_i - ẋ_j| → 0 as t → ∞ for all i, j ∈ {1, . . . , n}; and
(ii) the 2(n - 1) eigenvalues μ_{i,±}, i ∈ {2, . . . , n}, of the second-order Laplacian matrix 𝓛 have strictly negative real part.

The following proof is based on elementary calculations. An equivalent proof can be obtained using
the Jordan normal form, e.g., see (Ren and Atkins 2005; Ren 2008b).


 
Proof of Theorem 7.6. We introduce the change of coordinates T x(t) = [ x_ave(t); δ(t) ], where x_ave(t) = average(x(t)), δ(t) ∈ R^{n-1}, and, from Exercise E2.3,

        [ 1/n  1/n  ···  1/n
    T =   -1    1
           ⋮         ⋱
          -1              1  ].

Correspondingly, we also have T ẋ(t) = [ ẋ_ave(t); δ̇(t) ]. To write the system in the new coordinates, we observe T 1_n = e_1 and compute

    T L T^{-1} e_1 = T L T^{-1} (T 1_n) = T L 1_n = 0_n,

where the last equality follows from L 1_n = 0_n. This implies that the first column of T L T^{-1} is 0_n, that is,

    T L T^{-1} = [ 0        c^T
                   0_{n-1}  L_red ],    for some L_red ∈ R^{(n-1)×(n-1)} and c ∈ R^{n-1},   (7.8)

so that spec(L) = {0} ∪ spec(L_red). Next, we compute

    [ T  0; 0  T ] [ 0_{n×n}  I_n; -k_p I_n - σ_p L  -k_d I_n - σ_d L ] [ T^{-1}  0; 0  T^{-1} ]
        = [ 0_{n×n}  I_n; -k_p I_n - σ_p T L T^{-1}  -k_d I_n - σ_d T L T^{-1} ].           (7.9)

Based on equations (7.8) and (7.9), we write the system in the new coordinates as

    d/dt [ x_ave ]   [ 0        0_{n-1}^T                  1        0_{n-1}^T                ] [ x_ave ]
         [ δ     ] = [ 0_{n-1}  0_{(n-1)×(n-1)}            0_{n-1}  I_{n-1}                  ] [ δ     ]
         [ ẋ_ave ]   [ -k_p     -σ_p c^T                   -k_d     -σ_d c^T                 ] [ ẋ_ave ]
         [ δ̇     ]   [ 0_{n-1}  -k_p I_{n-1} - σ_p L_red   0_{n-1}  -k_d I_{n-1} - σ_d L_red ] [ δ̇     ].

We reorder the variables to obtain a block-triangular matrix, whose eigenvalues are the eigenvalues of its diagonal blocks:

    d/dt [ x_ave ]   [ 0        1        0_{n-1}^T                  0_{n-1}^T                ] [ x_ave ]
         [ ẋ_ave ] = [ -k_p     -k_d     -σ_p c^T                   -σ_d c^T                 ] [ ẋ_ave ]
         [ δ     ]   [ 0_{n-1}  0_{n-1}  0_{(n-1)×(n-1)}            I_{n-1}                  ] [ δ     ]
         [ δ̇     ]   [ 0_{n-1}  0_{n-1}  -k_p I_{n-1} - σ_p L_red   -k_d I_{n-1} - σ_d L_red ] [ δ̇     ].

We are now ready to conclude the proof: asymptotic second-order consensus is achieved if and only if δ → 0_{n-1} and δ̇ → 0_{n-1} as t → ∞, if and only if all eigenvalues of

    [ 0_{(n-1)×(n-1)}           I_{n-1}
      -k_p I_{n-1} - σ_p L_red  -k_d I_{n-1} - σ_d L_red ]

have strictly negative real part. But these eigenvalues are precisely the 2(n - 1) eigenvalues μ_{i,±}, i ∈ {2, . . . , n}, of the second-order Laplacian matrix 𝓛.   ∎


Finally, we present convergence results for undirected graphs and positive gains; we refer to (Zhu et al. 2009) for the general case.

Theorem 7.7 (Asymptotic convergence of second-order Laplacian flows). Consider the second-order Laplacian flow (7.6). Assume L is symmetric and irreducible (i.e., its associated graph is undirected and connected). Define the state average and its time derivative by x_ave(t) = average(x(t)) and ẋ_ave(t) = average(ẋ(t)). Then the state averages satisfy

    d/dt [ x_ave(t); ẋ_ave(t) ] = [ 0  1; -k_p  -k_d ] [ x_ave(t); ẋ_ave(t) ],               (7.10)

and, moreover,

(i) for the second-order consensus protocol (k_p = k_d = 0, σ_d = 1, σ_p > 0), asymptotic consensus on a ramp signal is achieved, that is, as t → ∞,

    x(t) → ( x_ave(0) + ẋ_ave(0) t ) 1_n;

(ii) for the harmonic oscillators coupled via velocity averaging (k_d = σ_p = 0, σ_d = 1, k_p > 0), asymptotic synchronization on a harmonic signal with frequency √k_p is achieved, that is, as t → ∞,

    x(t) → ( x_ave(0) cos(√k_p t) + (1/√k_p) ẋ_ave(0) sin(√k_p t) ) 1_n;

(iii) for position-averaging with absolute velocity damping (k_p = σ_d = 0, σ_p = 1, k_d > 0), asymptotic consensus on a composite average value is achieved, that is, as t → ∞,

    x(t) → ( x_ave(0) + ẋ_ave(0)/k_d ) 1_n.

Proof. First, we show that, in the similarity transformation (7.8), if L is symmetric, then c = 0_{n-1}. To do this, we observe e_1^T T = (1/n) 1_n^T and compute

    e_1^T T L T^{-1} = (1/n) 1_n^T L T^{-1} = 0_n^T,

where the last equality follows from L^T 1_n = 0_n. This implies that the first row of T L T^{-1} is 0_n^T and, in turn, that equations (7.10) are correct. Second, for the index range i ∈ {2, . . . , n}, in all three cases the second-order polynomial (7.7) has strictly positive coefficients, which implies that the 2(n - 1) eigenvalues μ_{i,±}, i ∈ {2, . . . , n}, of the second-order Laplacian matrix 𝓛 have strictly negative real part. Therefore, by Theorem 7.6, the second-order Laplacian flow achieves asymptotic second-order consensus and, more specifically, x_i(t) - x_ave(t) → 0 and ẋ_i(t) - ẋ_ave(t) → 0 for all i ∈ {1, . . . , n}. Third and finally, the specific limits for x_ave(t) follow from explicitly solving the state average dynamics (7.10). We leave the details to the reader.   ∎


7.5 Appendix: Design of weight-balanced digraphs


Problem: Given a directed graph G that is strongly connected, but not weight-balanced, how do we choose the weights in order to obtain a weight-balanced digraph and a Laplacian satisfying 1_n^T L = 0_n^T? (Note that an undirected graph is automatically weight-balanced.)
Answer: As usual, let w > 0 be the left eigenvector of L with eigenvalue 0 satisfying w_1 + · · · + w_n = 1. In other words, w is a vector of convex combination coefficients, and the Laplacian L satisfies

    L 1_n = 0_n    and    w^T L = 0_n^T.

Following (Ren et al. 2007), define a new matrix:

    L_rescaled = diag(w) L.

It is immediate to see that

    L_rescaled 1_n = diag(w) L 1_n = 0_n,    1_n^T L_rescaled = 1_n^T diag(w) L = w^T L = 0_n^T.

Note that:
(i) L_rescaled is again a Laplacian matrix because its row-sums are zero, its diagonal entries are positive, and its off-diagonal entries are nonpositive;
(ii) L_rescaled is the Laplacian matrix of a new digraph G_rescaled with the same nodes and directed edges as G, but whose weights are rescaled as a_ij ↦ w_i a_ij. In other words, the weight of each out-edge of node i is rescaled by w_i.
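
A minimal Matlab sketch of this rescaling procedure on an illustrative strongly connected, non-weight-balanced digraph:

% a strongly connected, non-weight-balanced digraph (illustrative weights)
A = [0 2 0; 0 0 1; 3 0 0];
L = diag(sum(A,2)) - A;
assert(norm(ones(1,3) * L) > 1e-3);     % not weight-balanced: column-sums nonzero

% left eigenvector w > 0 of L for the eigenvalue 0, normalized to sum 1
[W, D] = eig(L');
[~, k] = min(abs(diag(D)));
w = real(W(:,k)); w = w / sum(w);       % 0 is a real eigenvalue, eigenvector real

Lrescaled = diag(w) * L;
assert(norm(Lrescaled * ones(3,1)) < 1e-12);   % row-sums are still zero
assert(norm(ones(1,3) * Lrescaled) < 1e-12);   % column-sums are now zero too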

7.6 Appendix: Distributed optimization using the Laplacian flow


In the following, we present a computational application of the Laplacian flow in distributed optimization. The material in this section is inspired by (Wang and Elia 2010; Gharesifard and Cortes 2014; Droge et al. 2013; Cherukuri and Cortés 2015), and we present it here in a self-contained way. As preliminary notions, we introduce the following two definitions: a function f : R^n → R is said to be convex if f(αx + βy) ≤ α f(x) + β f(y) for all x and y in R^n and for all convex combination coefficients α and β, i.e., coefficients satisfying α, β ≥ 0 and α + β = 1. A function is said to be strictly convex if the previous inequality holds strictly for all x ≠ y and α, β > 0.
Consider a network of n processors that can perform local computation and communicate with one another. The communication architecture is modeled by an undirected, connected, and weighted graph with n nodes and symmetric Laplacian L = L^T ∈ R^{n×n}. The objective of the processor network is to solve the optimization problem

    minimize_{x∈R}  f(x) = Σ_{i=1}^n f_i(x),                                                 (7.11)

where f_i : R → R is a strictly convex and twice continuously differentiable cost function known only to processor i ∈ {1, . . . , n}. In a centralized setup, the decision variable x is globally available, and the minimizers x* ∈ R of the optimization problem (7.11) can be found by solving for the critical points of f:

    0 = (d/dx) f(x) = Σ_{i=1}^n (d/dx) f_i(x).

A centralized continuous-time algorithm converging to the set of critical points is the negative gradient flow

    ẋ = -(d/dx) f(x).
To find a distributed approach to solving the optimization problem (7.11), we associate to every processor a local estimate y_i ∈ R of the global variable x ∈ R and solve the equivalent problem

    minimize_{y∈R^n}  f̃(y) = Σ_{i=1}^n f_i(y_i) + (1/2) y^T L y    subject to    L y = 0_n,   (7.12)

where the consistency constraint L y = 0_n ensures that y_i = y_j for all i, j ∈ {1, . . . , n}, that is, that the local estimates of all processors coincide. We have also augmented the cost function with the term (1/2) y^T L y, which clearly has no effect on the minimizers of (7.12) (due to the consistency constraint), but which provides supplementary damping and favorable convergence properties for our algorithm. The minimizers of the optimization problems (7.11) and (7.12) are then related by y* = x* 1_n.
Without any further motivation, consider the function L : R^n × R^n → R given by

    L(y, z) = f(y) + (1/2) y^T L y + z^T L y,

where we use the shorthand f(y) = Σ_{i=1}^n f_i(y_i). In the literature on convex optimization this function is known as the (augmented) Lagrangian function, and z ∈ R^n is referred to as the Lagrange multiplier. What is important for us is that the augmented Lagrangian function is strictly convex in y and linear (and hence¹ concave) in z. Hence, the augmented Lagrangian function admits a set of saddle points (y*, z*) ∈ R^n × R^n, that is, points satisfying

    L(y*, z) ≤ L(y*, z*) ≤ L(y, z*)    for all (y, z) ∈ R^n × R^n.

Since L(y, z) is differentiable in y and z, the saddle points can be obtained as solutions to the equations

    0_n = (∂/∂y) L(y, z) = ∇f(y) + L y + L z,
    0_n = (∂/∂z) L(y, z) = L y.
Our motivation for introducing the Lagrangian is the following lemma.

Lemma 7.8 (Properties of saddle points). Let L = L^T ∈ R^{n×n} be the symmetric Laplacian associated with an undirected, connected, and weighted graph, and consider the Lagrangian function L, where each f_i, i ∈ {1, . . . , n}, is strictly convex and twice continuously differentiable. Then
¹ A function f : R^n → R is said to be concave (resp. strictly concave) if -f is a convex (resp. strictly convex) function.

Lectures on Network Systems, F. Bullo, Version v0.85(i) (6 Aug 2016). Draft not for circulation. Copyright 2012-16.
104 Chapter 7. Continuous-time Averaging Systems

(i) if (y*, z*) ∈ R^n × R^n is a saddle point of L, then so is (y*, z* + α 1_n) for any α ∈ R;
(ii) if (y*, z*) ∈ R^n × R^n is a saddle point of L, then y* = x* 1_n, where x* ∈ R is a solution of the original optimization problem (7.11); and
(iii) if x* ∈ R is a solution of the original optimization problem (7.11), then there exist z* ∈ R^n and y* = x* 1_n satisfying L z* + ∇f(y*) = 0_n, so that (y*, z*) is a saddle point of L.

We leave the proof to the reader in Exercise E7.15. Since the Lagrangian function is convex in y and concave in z, we can compute its saddle points by following the so-called saddle-point dynamics, consisting of a negative gradient in y and a positive gradient in z:

    ẏ = -(∂/∂y) L(y, z) = -∇f(y) - L y - L z,                    (7.13a)
    ż = +(∂/∂z) L(y, z) = L y.                                   (7.13b)
For processor i ∈ {1, . . . , n}, the saddle-point dynamics (7.13) read component-wise as

    ẏ_i = -(∂/∂y_i) f_i(y_i) - Σ_{j=1}^n a_ij (y_i - y_j) - Σ_{j=1}^n a_ij (z_i - z_j),
    ż_i = (∂/∂z_i) L(y, z) = Σ_{j=1}^n a_ij (y_i - y_j).

Hence, the saddle-point dynamics can be implemented in a distributed processor network using only local knowledge of f_i(y_i), local computation, and nearest-neighbor communication (and, of course, after discretizing the continuous-time dynamics); see Exercise E7.18. As shown in (Wang and Elia 2010; Gharesifard and Cortes 2014; Droge et al. 2013; Cherukuri and Cortés 2015), this distributed optimization setup is very versatile and robust, and it extends to directed graphs and non-differentiable convex objective functions. We will later use a powerful tool, termed the LaSalle Invariance Principle, to show that the saddle-point dynamics (7.13) always converge to the set of saddle points; see Exercise E13.5.
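
To make the discussion concrete, here is a hedged Matlab sketch that implements an Euler discretization of the saddle-point dynamics (7.13) for the quadratic costs f_i(y_i) = P_i (y_i - x̄_i)^2 discussed next; the network, data, step size, and horizon are arbitrary choices for illustration.

% ring graph on n = 4 processors
n = 4;
A = circshift(eye(n),1); A = A + A';
L = diag(sum(A,2)) - A;

P = [1; 2; 0.5; 3];                 % local quadratic costs f_i(x) = P_i*(x - xb_i)^2
xb = [0; 1; -1; 2];                 % local minimizers (illustrative data)
xstar = sum(P .* xb) / sum(P);      % global minimizer of f(x) = sum_i f_i(x)

y = zeros(n,1); z = zeros(n,1);
h = 1e-3;                           % Euler step size (assumed small enough)
for t = 1:2e5
    grad = 2 * P .* (y - xb);       % gradient of f at y, componentwise
    ynew = y + h * (-grad - L*y - L*z);   % saddle-point dynamics (7.13a)
    z = z + h * (L*y);                    % saddle-point dynamics (7.13b)
    y = ynew;
end
assert(norm(y - xstar*ones(n,1)) < 1e-3);  % local estimates reach consensus on x*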
For now we restrict our analysis to the case of quadratic cost functions f_i(x) = P_i (x - x̄_i)^2, where P_i > 0 and x̄_i ∈ R is the minimizer of the cost f_i. Thus, up to a constant, the cost reads

    f(x) = Σ_{i=1}^n P_i (x - x̄_i)^2 = ( Σ_{i=1}^n P_i ) (x - x*)^2 + constant,

where x* is the weighted average x* = ( Σ_{i=1}^n P_i )^{-1} Σ_{i=1}^n P_i x̄_i, which is the global minimizer of f (as obtained by solving df(x)/dx = 0); see Exercise E7.17. In this case, the saddle-point dynamics (7.13) reduce to the linear system

    d/dt [ ỹ; z ] = [ -P - L   -L
                       L        0 ] [ ỹ; z ] =: A [ ỹ; z ],                                  (7.14)

where ỹ = y - x* 1_n and P = diag(P_1, . . . , P_n). The matrix A is a so-called saddle matrix (Benzi et al. 2005). In the following we establish the convergence of the dynamics (7.14) to the set of saddle points.
First, observe that 0 is an eigenvalue of A with multiplicity 1 and that the corresponding eigenvector, given by [0_n^T 1_n^T]^T, corresponds to the set of saddle points:

    [ 0_n; 0_n ] = [ -P - L  -L; L  0 ] [ ỹ; z ]
        ⟺  (P + L) ỹ + L z = 0_n  and  L ỹ = 0_n  ⟹  ỹ ∈ span(1_n)
        ⟹  ỹ^T P ỹ = 0   (obtained by multiplying (P + L) ỹ + L z = 0_n by ỹ^T)
        ⟹  ỹ = 0_n  and  z ∈ span(1_n).
Next, note that

    (1/2)(A + A^T) = [ -P - L   0
                        0       0 ]

is negative semidefinite. It follows from a Lyapunov argument or a standard linear algebra result (Bernstein 2009, Fact 5.10.28) that all eigenvalues of A have real part less than or equal to zero. Since there is a unique zero eigenvalue, associated with the set of saddle points, it remains to show that the matrix A has no purely imaginary eigenvalues. This is established in the following lemma, whose proof is left to the reader in Exercise E7.16:

Lemma 7.9 (Absence of sustained oscillations in saddle matrices). Consider a negative semidefinite matrix B ∈ R^{n×n} and a not necessarily square matrix C ∈ R^{n×m}. If kernel(B) ∩ image(C) = {0_n}, then the composite block-matrix

    A = [ B     C
         -C^T   0 ]

has no eigenvalues on the imaginary axis except for 0.
 >
It follows that the saddle point dynamics (7.14) converge to the set of saddle points $\begin{bmatrix} \tilde{y}^\top & z^\top \end{bmatrix}^\top \in \operatorname{span}\big(\begin{bmatrix} 0_n^\top & 1_n^\top \end{bmatrix}^\top\big)$. Since $1_n^\top \dot{z} = 0$, it follows that $\operatorname{average}(z(t)) = \operatorname{average}(z_0)$, and we can further conclude that the dynamics converge to the unique saddle point satisfying $\lim_{t \to \infty} y(t) = \bar{x} 1_n$ and $\lim_{t \to \infty} z(t) = \operatorname{average}(z_0) 1_n$.
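This convergence is easy to probe numerically. The following Python sketch (not part of the original text; the graph, local gains, and data below are arbitrary choices for illustration) simulates the linear saddle-point dynamics (7.14) on a path graph and checks that $y(t) \to \bar{x} 1_n$ while the average of $z$ is preserved:

```python
import numpy as np
from scipy.linalg import expm

# path graph on n = 4 nodes: Laplacian L, local gains P_i, local minimizers x_i
n = 4
Adj = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
L = np.diag(Adj.sum(axis=1)) - Adj
P = np.diag([1.0, 2.0, 0.5, 1.5])              # P = diag({P_i})
x_loc = np.array([1.0, -2.0, 0.5, 3.0])        # local minimizers x_i
x_bar = (P.diagonal() * x_loc).sum() / P.diagonal().sum()

# saddle matrix from (7.14), state (ytilde, z)
A = np.block([[-P - L, -L], [L, np.zeros((n, n))]])

rng = np.random.default_rng(1)
state0 = rng.standard_normal(2 * n)
stateT = expm(50.0 * A) @ state0               # exact flow at t = 50

y_T = stateT[:n] + x_bar                       # y = ytilde + x_bar * 1_n
print("y(50)  =", np.round(y_T, 6))            # close to x_bar * 1_n
print("x_bar  =", x_bar)
print("avg z preserved:", np.isclose(stateT[n:].mean(), state0[n:].mean()))
```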


7.7 Exercises

E7.1 Properties of the matrix exponential. Recall the definition $e^A = \sum_{k=0}^\infty \frac{1}{k!} A^k$ for any square matrix $A$. Complete the following tasks:
(i) show that $\sum_{k=0}^\infty \frac{1}{k!} A^k$ converges absolutely for all square matrices $A$,
Hint: Recall that a matrix series $\sum_{k=1}^\infty A_k$ is said to converge absolutely if $\sum_{k=1}^\infty \|A_k\|$ converges, where $\|\cdot\|$ is a matrix norm. Introduce a sub-multiplicative matrix norm $\|\cdot\|$ and show $\|e^A\| \le e^{\|A\|}$.
(ii) show that, if $A = \operatorname{diag}(a_1, \dots, a_n)$, then $e^A = \operatorname{diag}(e^{a_1}, \dots, e^{a_n})$,
(iii) show that $e^{T A T^{-1}} = T e^A T^{-1}$ for any invertible $T$,
(iv) show that $AB = BA$ implies $e^{A+B} = e^A e^B$,
(v) give an example of matrices $A$ and $B$ such that $e^{A+B} \ne e^A e^B$, and
(vi) compute the matrix exponential $e^{tJ}$, where $J$ is a Jordan block of arbitrary size and $t \in \mathbb{R}$.
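As a quick numerical companion to E7.1 (my own addition, not part of the original exercise), the following sketch checks property (iv) on a commuting pair and exhibits a non-commuting counterexample for (v):

```python
import numpy as np
from scipy.linalg import expm

# (iv) commuting matrices: polynomials in the same matrix always commute
M = np.array([[0.0, 1.0], [-1.0, 0.0]])
A, B = 2 * M, M @ M                                  # AB = BA by construction
print(np.allclose(expm(A + B), expm(A) @ expm(B)))   # True

# (v) a non-commuting counterexample
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
print(np.allclose(expm(A + B), expm(A) @ expm(B)))   # False
```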
E7.2 Continuous-time affine systems. Given $A \in \mathbb{R}^{n \times n}$ and $b \in \mathbb{R}^n$, consider the continuous-time affine system
$$\dot{x}(t) = A x(t) + b.$$
Assume $A$ is Hurwitz and, similarly to Exercise E2.11, show that
(i) the matrix $A$ is invertible,
(ii) the only equilibrium point of the system is $-A^{-1} b$, and
(iii) $\lim_{t \to \infty} x(t) = -A^{-1} b$ for all initial conditions $x(0) \in \mathbb{R}^n$.
E7.3 Consensus for Laplacian matrices: missing proofs. Complete the proof of Theorem 7.4, that is, prove
statements (iv) and (v).
E7.4 Laplacian average consensus in directed networks. Consider the directed network in Figure E7.1 with arbitrary positive weights and its associated Laplacian flow $\dot{x}(t) = -L x(t)$.

Figure E7.1: A sample digraph

(i) Can the network reach consensus, that is, as $t \to \infty$ does $x(t)$ converge to a limiting point in $\operatorname{span}\{1_n\}$?
(ii) Does $x(t)$ achieve average consensus, that is, $\lim_{t \to \infty} x(t) = \operatorname{average}(x_0) 1_n$?
(iii) Will your answers change if you smartly add one directed edge and adapt the weights?
E7.5 Convergence of discrete-time and continuous-time averaging. Consider the two weighted digraphs in Figure E7.2 and their associated nonnegative adjacency matrices $A$ and Laplacian matrices $L$ of appropriate dimensions. Consider the associated discrete-time iterations $x(t+1) = A x(t)$ and continuous-time Laplacian flows $\dot{x}(t) = -L x(t)$. For each of these two digraphs, argue about whether the discrete and/or continuous-time systems converge as $t \to \infty$. If they converge, what do they converge to? Please justify your answers.
E7.6 Euler discretization of the Laplacian. Given a weighted digraph $G$ with Laplacian matrix $L$ and maximum out-degree $d_{\max} = \max\{d_{\text{out}}(1), \dots, d_{\text{out}}(n)\}$, show that:
(i) if $\varepsilon < 1/d_{\max}$, then the matrix $I_n - \varepsilon L$ is row-stochastic,


Figure E7.2: Two example weighted digraphs

(ii) if $\varepsilon < 1/d_{\max}$ and $G$ is weight-balanced, then the matrix $I_n - \varepsilon L$ is doubly-stochastic, and
(iii) if $\varepsilon < 1/d_{\max}$ and $G$ is strongly connected, then $I_n - \varepsilon L$ is primitive.

Given these results, note that (no additional assignment in what follows)
• $I_n - \varepsilon L$ is the one-step Euler discretization of the continuous-time Laplacian flow and is a discrete-time consensus algorithm; and
• $I_n - \varepsilon L$ is a possible choice of weights for an undirected unweighted graph (which is therefore also weight-balanced) in the design of a doubly-stochastic matrix (as we did in the discussion about Metropolis-Hastings).
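A small numerical sanity check of these claims (my own illustration; the strongly connected digraph below is an arbitrary example):

```python
import numpy as np

# adjacency matrix of a strongly connected weighted digraph (arbitrary example)
A = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [1.0, 0.0, 0.0]])
d_out = A.sum(axis=1)
L = np.diag(d_out) - A
eps = 0.9 / d_out.max()                     # eps < 1/d_max

M = np.eye(3) - eps * L
print("row-stochastic:", np.all(M >= 0) and np.allclose(M.sum(axis=1), 1))
print("primitive:", np.all(np.linalg.matrix_power(M, 4) > 0))  # some power > 0
```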
E7.7 Doubly-stochastic matrices on strongly-connected digraphs. Given a strongly-connected unweighted
digraph G, design weights along the edges of G (and possibly add self-loops) so that the weighted adjacency
matrix is doubly-stochastic.
E7.8 Constants of motion. In the study of mechanics, energy and momentum are two constants of motion, that is, these quantities are constant along each evolution of the mechanical system. Show that:
(i) if $A$ is a row-stochastic matrix with $w^\top A = w^\top$, then $w^\top x(k) = w^\top x(0)$ for all times $k \in \mathbb{Z}_{\ge 0}$, where $x(k+1) = A x(k)$; and
(ii) if $L$ is a Laplacian matrix with $w^\top L = 0_n^\top$, then $w^\top x(t) = w^\top x(0)$ for all times $t \in \mathbb{R}_{\ge 0}$, where $\dot{x}(t) = -L x(t)$.
E7.9 Weight-balanced digraphs with a globally reachable node. Given a weighted directed graph G, show
that if G is weight-balanced and has a globally reachable node, then G is strongly connected.
E7.10 The Lyapunov equation for the Laplacian matrix of a strongly-connected digraph. Let $L$ be the Laplacian matrix of a strongly-connected weighted digraph. Find a positive-definite matrix $P$ such that
(i) $P L + L^\top P$ is positive semidefinite, and
(ii) $(P L + L^\top P) 1_n = 0_n$.
E7.11 Determinants of block matrices (Silvester 2000). Given square matrices $A, B, C, D \in \mathbb{R}^{n \times n}$, $n \ge 1$,


useful identities are
$$\det \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{cases} \det(D) \det(A - B D^{-1} C), & \text{if } D \text{ is invertible}, \quad \text{(E7.1a)} \\ \det(AD - BC), & \text{if } CD = DC, \quad \text{(E7.1b)} \\ \det(DA - BC), & \text{if } BD = DB. \quad \text{(E7.1c)} \end{cases}$$
(i) Prove equality (E7.1a).
(ii) Prove equalities (E7.1b) and (E7.1c) assuming $D$ is invertible.
Hint: Show $\begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} I_n & 0_{n \times n} \\ -D^{-1} C & I_n \end{bmatrix} = \begin{bmatrix} A - B D^{-1} C & B \\ 0_{n \times n} & D \end{bmatrix}$. We refer to (Silvester 2000) for the complete proofs and for the additional identities
$$\det \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{cases} \det(AD - CB), & \text{if } AC = CA, \quad \text{(E7.2a)} \\ \det(DA - CB), & \text{if } AB = BA. \quad \text{(E7.2b)} \end{cases}$$

E7.12 Eigenvectors of the second-order Laplacian matrix. Consider a Laplacian matrix $L$, coefficients $k_p, k_d, \gamma_p, \gamma_d \in \mathbb{R}$, and the induced second-order Laplacian matrix $\boldsymbol{L}$. Let $v_{l,i}$ and $v_{r,i}$ be the left and right eigenvectors of $L$ corresponding to the eigenvalue $\lambda_i$. Show that:
(i) the right eigenvectors of $\boldsymbol{L}$ corresponding to the eigenvalues $\lambda_{i,\pm}$ are
$$\begin{bmatrix} v_{r,i} \\ \lambda_{i,\pm} v_{r,i} \end{bmatrix},$$
(ii) for $k_p > 0$, the left eigenvectors of $\boldsymbol{L}$ corresponding to the eigenvalues $\lambda_{i,\pm}$ are
$$\begin{bmatrix} \dfrac{\lambda_{i,\pm}}{k_p + \gamma_p \lambda_i}\, v_{l,i} \\ v_{l,i} \end{bmatrix}.$$

E7.13 Laplacian oscillators. Given the Laplacian matrix $L = L^\top \in \mathbb{R}^{n \times n}$ of an undirected, weighted, and connected graph with edge weights $a_{ij}$, $i, j \in \{1, \dots, n\}$, define the Laplacian oscillator flow by
$$\ddot{x}(t) + L x(t) = 0_n. \tag{E7.3}$$
This flow is written as a first-order differential equation as
$$\begin{bmatrix} \dot{x}(t) \\ \dot{z}(t) \end{bmatrix} = \begin{bmatrix} 0_{n \times n} & I_n \\ -L & 0_{n \times n} \end{bmatrix} \begin{bmatrix} x(t) \\ z(t) \end{bmatrix} =: \boldsymbol{L} \begin{bmatrix} x(t) \\ z(t) \end{bmatrix}.$$
(i) Write the second-order Laplacian flow in components.
(ii) Write the characteristic polynomial of the matrix $\boldsymbol{L}$ using only the determinant of an $n \times n$ matrix.
(iii) Given the eigenvalues $\lambda_1 = 0, \lambda_2, \dots, \lambda_n$ of $L$, show that the eigenvalues $\mu_1, \dots, \mu_{2n}$ of $\boldsymbol{L}$ satisfy
$$\mu_1 = \mu_2 = 0, \qquad \mu_{2i, 2i-1} = \pm \sqrt{\lambda_i}\, \mathrm{i}, \quad \text{for } i \in \{2, \dots, n\},$$
where $\mathrm{i}$ is the imaginary unit.
(iv) Show that the solution is the superposition of a ramp signal and of $n - 1$ harmonics, that is,
$$x(t) = \big( \operatorname{average}(x(0)) + \operatorname{average}(\dot{x}(0))\, t \big) 1_n + \sum_{i=2}^n a_i \sin\!\big(\sqrt{\lambda_i}\, t + \varphi_i\big) v_i,$$
where $\{1_n/\sqrt{n}, v_2, \dots, v_n\}$ are the orthonormal eigenvectors of $L$ and where the amplitudes $a_i$ and phases $\varphi_i$ are determined by the initial conditions $x(0)$, $\dot{x}(0)$.
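A simulation sketch for E7.13 (my own addition; the complete graph and initial conditions are arbitrary) that checks the ramp behavior of the average, which follows from $1_n^\top \ddot{x} = -1_n^\top L x = 0$:

```python
import numpy as np
from scipy.linalg import expm

# complete graph on n = 3 nodes
n = 3
Adj = np.ones((n, n)) - np.eye(n)
L = np.diag(Adj.sum(axis=1)) - Adj

# first-order form: [x; z]' = [[0, I], [-L, 0]] [x; z], with z = xdot
Lbold = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-L, np.zeros((n, n))]])

x0 = np.array([1.0, 0.0, -1.0])
z0 = np.array([0.5, 0.0, 0.0])
state0 = np.concatenate([x0, z0])

for t in [0.0, 1.0, 2.0, 3.0]:
    x_t = (expm(t * Lbold) @ state0)[:n]
    ramp = x0.mean() + z0.mean() * t        # average evolves as a ramp
    print(f"t={t}: average(x) = {x_t.mean():.6f}, ramp = {ramp:.6f}")
```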


E7.14 Delayed Laplacian flow. Define the delayed Laplacian flow dynamics over a connected, weighted, and undirected graph $G$ by
$$\dot{x}_i(t) = \sum_{j \in \mathcal{N}(i)} a_{ij} \big( x_j(t - \tau) - x_i(t - \tau) \big), \qquad i \in \{1, \dots, n\},$$
where $a_{ij} > 0$ is the weight on the edge $\{i, j\} \in E$, and $\tau > 0$ is a positive scalar delay term. The Laplace domain representation of the system is $X(s) = G(s) x(0)$, where $G(s)$ is the associated transfer function
$$G(s) = (s I_n + e^{-s \tau} L)^{-1},$$
and $L = L^\top \in \mathbb{R}^{n \times n}$ is the network Laplacian matrix. Show that the transfer function $G(s)$ admits poles on the imaginary axis if the following resonance condition is true for an eigenvalue $\lambda_i$, $i \in \{1, \dots, n\}$, of the Laplacian matrix:
$$\tau = \frac{\pi}{2 \lambda_i}.$$
E7.15 Properties of saddle points. Prove Lemma 7.8.
E7.16 Absence of sustained oscillations in saddle matrices. Prove Lemma 7.9.
E7.17 Centralized formulation of sum-of-squares cost. Consider a distributed optimization problem with $n$ agents, where the cost function $f_i(x)$ of each agent $i \in \{1, \dots, n\}$ is described by $f_i(x) = (x - x_i)^\top P_i (x - x_i)$, where $P_i > 0$ and $x_i \in \mathbb{R}$ is the minimizer of the cost $f_i(x)$. Consider the joint sum-of-squares cost function
$$f_{\text{sos}}(x) = \sum_{i=1}^n (x - x_i)^\top P_i (x - x_i).$$
Calculate the global minimizer $\bar{x}$ of $f_{\text{sos}}(x)$, and show that the sum-of-squares cost $f_{\text{sos}}(x)$ is, up to constant terms, equivalent to the centralized cost function
$$f_{\text{centralized}}(x) = \sum_{i=1}^n (x - \bar{x})^\top P_i (x - \bar{x}).$$

E7.18 Discrete saddle-point algorithm for distributed optimization. Consider the centralized optimization problem
$$z^\star := \operatorname{argmin}_{z \in \mathbb{R}} \frac{1}{2} \sum_{i=1}^n p_i (z - r_i)^2, \tag{E7.4}$$
where $p_i > 0$ and $r_i \in \mathbb{R}$ are fixed scalar quantities for each $i \in \{1, \dots, n\}$. Our aim is to solve this optimization problem in a distributed fashion, that is, distributing the computation among a population of $n$ agents. Each agent $i$ has access only to $p_i$ and $r_i$ and can communicate with the other agents via a network defined by the Laplacian matrix $L$. We assume that this network is undirected and connected.

(i) Show that solving Problem (E7.4) is equivalent to solving
$$x^\star := \operatorname{argmin}_{x \in \mathbb{R}^n} \frac{1}{2} \sum_{i=1}^n p_i (x_i - r_i)^2, \quad \text{s.t. } L x = 0_n, \tag{E7.5}$$
where $x = [x_1, \dots, x_n]^\top$. In other words, show that $x^\star = z^\star 1_n$.

(ii) Write the KKT conditions of Problem (E7.5), using the notation $P := \operatorname{diag}(p_i) \in \mathbb{R}^{n \times n}$, $r = [r_1, \dots, r_n]^\top$. Let $(\bar{x}, \bar{\lambda})$ be a solution of such a KKT system. Show that a generic pair $(\hat{x}, \hat{\lambda})$ is a solution of the KKT system if and only if $\hat{x} = x^\star = \bar{x}$ and $\hat{\lambda} = \bar{\lambda} + \alpha 1_n$, for some $\alpha \in \mathbb{R}$.

(iii) Analogous to (7.13), consider the discrete-time distributed saddle point algorithm
 X 
xi (k + 1) = xi (k) pi (xi (k) ri ) + Lji j (k) , (E7.6a)
jNin (i)
 X 
i (k + 1) = i (k) + Lij xj (k) , (E7.6b)
jNout (i)

where > 0 is a sufficiently small step size. Show that, if the algorithm (E7.6) converges, then it
converges to a solution of Problem (E7.5).
(iv) Define the error vector by  
x(k) x

e(k) := .
(k)
Find the error dynamics of the algorithm (E7.6), that is, the matrix G such that e(k + 1) = Ge(k).
(v) Show
 that, for > 0 small enough, if is an eigenvalue of G then either = 1 or || < 1 and that
0n
is the only eigenvector relative to the eigenvalue = 1. Find an expression for . Use these
1n
results to study the convergence properties of the distributed algorithm (E7.6). Will x(k) x? as
k ?
Hint: Use Lemma 7.8.
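A direct implementation sketch of the iteration (E7.6) (my own illustration; the triangle network, data, step size, and iteration count are arbitrary choices):

```python
import numpy as np

# problem data: p_i > 0, r_i, and a connected undirected network (a triangle)
p = np.array([1.0, 2.0, 3.0])
r = np.array([4.0, 0.0, -1.0])
Adj = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
L = np.diag(Adj.sum(axis=1)) - Adj

z_star = (p * r).sum() / p.sum()            # closed-form solution of (E7.4)

eps = 0.01                                   # small step size
x, lam = np.zeros(3), np.zeros(3)
for _ in range(5000):
    x_new = x - eps * (p * (x - r) + L.T @ lam)   # (E7.6a), stacked over i
    lam = lam + eps * (L @ x)                     # (E7.6b)
    x = x_new
print("x(k)   =", np.round(x, 6))           # close to z_star * 1_n
print("z_star =", z_star)
```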

Chapter 8
The Incidence Matrix and its Applications

After studying adjacency and Laplacian matrices, in this chapter we introduce one final matrix associated
with a graph: the incidence matrix. We study the properties of incidence matrices and their application
to a class of estimation problems with relative measurements. For simplicity we restrict our attention to
undirected graphs. We borrow ideas from (Barooah 2007; Barooah and Hespanha 2007; Bolognani et al.
2010; Piovan et al. 2013) and refer to (Foulds 1995; Biggs 1994; Godsil and Royle 2001) for more information.

8.1 The incidence matrix

Let $G$ be an undirected unweighted graph with $n$ nodes and $m$ edges. Number the edges of $G$ with a unique $e \in \{1, \dots, m\}$ and assign an arbitrary direction to each edge. The (oriented) incidence matrix $B \in \mathbb{R}^{n \times m}$ of the graph $G$ is defined component-wise by
$$B_{ie} = \begin{cases} +1, & \text{if node } i \text{ is the source node of edge } e, \\ -1, & \text{if node } i \text{ is the sink node of edge } e, \\ 0, & \text{otherwise.} \end{cases} \tag{8.1}$$
Here, we adopt the convention that an edge $(i, j)$ has source $i$ and sink $j$.
It is useful to consider the example graph depicted in Figure 8.1.

Figure 8.1: How to number and orient the edges of a graph

As depicted on the right of Figure 8.1, we add an orientation to all edges, order them, and label them as follows:


$e_1 = (1, 2)$, $e_2 = (2, 3)$, $e_3 = (4, 2)$, and $e_4 = (3, 4)$. Accordingly, the incidence matrix is
$$B = \begin{bmatrix} +1 & 0 & 0 & 0 \\ -1 & +1 & -1 & 0 \\ 0 & -1 & 0 & +1 \\ 0 & 0 & +1 & -1 \end{bmatrix}.$$

Note: $1_n^\top B = 0_m^\top$ since each column of $B$ contains precisely one element equal to $+1$, one element equal to $-1$, and all other elements zero.
Note: assume the edge $e \in \{1, \dots, m\}$ is oriented from $i$ to $j$; then, for any $x \in \mathbb{R}^n$,
$$(B^\top x)_e = x_i - x_j.$$
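The following sketch (my own addition) builds the incidence matrix of the example above and verifies the two notes numerically:

```python
import numpy as np

n = 4
edges = [(1, 2), (2, 3), (4, 2), (3, 4)]    # e1, ..., e4, source -> sink

B = np.zeros((n, len(edges)))
for e, (i, j) in enumerate(edges):
    B[i - 1, e] = +1.0                      # source node
    B[j - 1, e] = -1.0                      # sink node

print(np.allclose(np.ones(n) @ B, 0))       # 1_n^T B = 0_m^T
x = np.array([10.0, 20.0, 30.0, 40.0])
print(B.T @ x)                              # entry e equals x_i - x_j
```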

8.2 Properties of the incidence matrix

Given an undirected weighted graph $G$ with edge set $\{1, \dots, m\}$ and adjacency matrix $A$, recall that
$$L = D - A, \quad \text{where } D \text{ is the degree matrix.}$$

Lemma 8.1 (From the incidence to the Laplacian matrix). If $\operatorname{diag}(\{a_e\}_{e \in \{1,\dots,m\}})$ is the diagonal matrix of edge weights, then
$$L = B \operatorname{diag}(\{a_e\}_{e \in \{1,\dots,m\}}) B^\top.$$

Note: In the right-hand side, the matrix dimensions are $(n \times m) \cdot (m \times m) \cdot (m \times n) = n \times n$. Also note that, while the incidence matrix $B$ depends upon the selected direction of each edge, the Laplacian matrix is independent of that choice.
Proof. Recall that, for matrices $O$, $P$ and $Q$ of appropriate dimensions, we have $(O P Q)_{ij} = \sum_{k,h} O_{ik} P_{kh} Q_{hj}$. Moreover, if the matrix $P$ is diagonal, then $(O P Q)_{ij} = \sum_k O_{ik} P_{kk} Q_{kj}$.
For $i \ne j$, we compute
$$(B \operatorname{diag}(\{a_e\}) B^\top)_{ij} = \sum_{e=1}^m B_{ie} a_e (B^\top)_{ej} = \sum_{e=1}^m B_{ie} B_{je} a_e \qquad (e\text{-th term} = 0 \text{ unless } e \text{ is oriented } \{i, j\})$$
$$= (+1) \cdot (-1) \cdot a_{ij} = \ell_{ij},$$
where $L = \{\ell_{ij}\}_{i,j \in \{1,\dots,n\}}$, and along the diagonal we compute
$$(B \operatorname{diag}(\{a_e\}) B^\top)_{ii} = \sum_{e=1}^m B_{ie}^2 a_e = \sum_{e=1,\; e=(i,\cdot) \text{ or } e=(\cdot,i)}^m a_e = \sum_{j=1}^n a_{ij}. \qquad \blacksquare$$

Lemma 8.2 (Rank of the incidence matrix). Let $B$ be the incidence matrix of an undirected graph $G$ with $n$ nodes. Let $d$ be the number of connected components of $G$. Then
$$\operatorname{rank}(B) = n - d.$$


Proof. We prove this result for a connected graph with $d = 1$; the proof strategy easily extends to $d > 1$. Recall that the rank of the Laplacian matrix $L$ equals $n - d = n - 1$. Since the Laplacian matrix can be factorized as $L = B \operatorname{diag}(\{a_e\}_{e \in \{1,\dots,m\}}) B^\top$, where $\operatorname{diag}(\{a_e\}_{e \in \{1,\dots,m\}})$ has full rank $m$ (and $m \ge n - 1$ due to connectivity), we have that necessarily $\operatorname{rank}(B) \ge n - 1$. On the other hand, $\operatorname{rank}(B) \le n - 1$ since $B^\top 1_n = 0_m$. It follows that $B$ has rank $n - 1$. $\blacksquare$
The factorization of the Laplacian matrix as $L = B \operatorname{diag}(\{a_e\}_{e \in \{1,\dots,m\}}) B^\top$ plays an important role in relative sensing networks. For example, we can decompose the Laplacian flow $\dot{x} = -L x$ into

open-loop plant: $\dot{x}_i = u_i$, $i \in \{1, \dots, n\}$, or $\dot{x} = u$,
measurements: $y_{ij} = x_i - x_j$, $\{i, j\} \in E$, or $y = B^\top x$,
control gains: $z_{ij} = a_{ij} y_{ij}$, $\{i, j\} \in E$, or $z = \operatorname{diag}(\{a_e\}_{e \in \{1,\dots,m\}}) y$,
control inputs: $u_i = -\sum_{\{i,j\} \in E} z_{ij}$, $i \in \{1, \dots, n\}$, or $u = -B z$.

Indeed, this control structure, illustrated as a block-diagram in Figure 8.2, is required to implement flocking-type behavior as in Example 7.1.1. The control structure in Figure 8.2 has emerged as a canonical control structure in many relative sensing and flow network problems, also for more complicated open-loop dynamics and possibly nonlinear control gains (Bai et al. 2011).

Figure 8.2: Illustration of the canonical control structure for a relative sensing network: the open-loop plant $\dot{x}_i = u_i$ is fed back through the measurement map $B^\top$, the diagonal gains $a_{ij}$, and the input map $-B$.
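As a numerical illustration (my own sketch, on an arbitrary weighted ring graph), the closed loop $u = -B \operatorname{diag}(\{a_e\}) B^\top x$ reproduces the Laplacian flow and drives the states to consensus:

```python
import numpy as np
from scipy.linalg import expm

edges = [(1, 2), (2, 3), (3, 4), (4, 1)]     # ring on 4 nodes
weights = np.array([1.0, 2.0, 1.0, 0.5])
n = 4

B = np.zeros((n, len(edges)))
for e, (i, j) in enumerate(edges):
    B[i - 1, e], B[j - 1, e] = +1.0, -1.0

L = B @ np.diag(weights) @ B.T               # Lemma 8.1
x0 = np.array([3.0, -1.0, 2.0, 0.0])
x_T = expm(-10.0 * L) @ x0                   # flow x' = -L x at t = 10
print(np.round(x_T, 6))                      # close to average(x0) * 1_n
print(x0.mean())
```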

8.3 Distributed estimation from relative measurements

In Chapter 1 we considered estimation problems for wireless sensor networks in which each node measures a scalar absolute quantity (expressing some environmental variable such as temperature, vibrations, etc.). In this section, we consider a second class of examples in which measurements are relative, i.e., pairs of nodes measure the difference between their corresponding variables. Estimation problems involving relative measurements are numerous. For example, imagine a group of robots (or sensors) where no robot can sense its position in an absolute reference frame, but each robot can measure other robots' relative positions by means of on-board sensors. Similar problems arise in the study of clock synchronization in networks of processors.


8.3.1 Problem statement

The problem of optimal estimation based on relative measurements is stated as follows. As illustrated in Figure 8.3, we are given an undirected graph $G = (\{1, \dots, n\}, E)$ with the following properties. First, each node $i \in \{1, \dots, n\}$ of the network is associated with an unknown scalar quantity $x_i$ (the $x$-coordinate of node $i$ in the figure). Second, the $m$ undirected edges are given an orientation and, for each edge $e = (i, j) \in E$, the following scalar measurements are available:
$$y_{(i,j)} = x_i - x_j + v_{(i,j)} = (B^\top x)_e + v_{(i,j)},$$
where $B$ is the graph incidence matrix and the measurement noises $v_{(i,j)}$, $(i, j) \in E$, are independent jointly-Gaussian variables with zero mean $\mathbb{E}[v_{(i,j)}] = 0$ and variance $\mathbb{E}[v_{(i,j)}^2] = \sigma_{(i,j)}^2 > 0$. The joint covariance matrix is the diagonal matrix $\Sigma = \operatorname{diag}(\{\sigma_{(i,j)}^2\}_{(i,j) \in E}) \in \mathbb{R}^{m \times m}$. (For later use, it is convenient to define also $y_{(j,i)} = -y_{(i,j)} = x_j - x_i - v_{(i,j)}$.)

Figure 8.3: A wireless sensor network in which sensors can measure each other's relative distance and bearing. We assume that, for each link between node $i$ and node $j$, the relative distance along the $x$-axis, $x_i - x_j$, is available, where $x_i$ is the $x$-coordinate of node $i$.
The optimal estimate $\hat{x}$ of the unknown vector $x \in \mathbb{R}^n$ via the relative measurements $y \in \mathbb{R}^m$ is the solution to
$$\min_{\hat{x}} \| B^\top \hat{x} - y \|_{\Sigma^{-1}}^2,$$
where $\|w\|_{\Sigma^{-1}}^2 = w^\top \Sigma^{-1} w$. Since no absolute information is available about $x$, we add the additional constraint that the optimal estimate should have zero mean and summarize this discussion as follows.

Definition 8.3 (Optimal estimation based on relative measurements). Given an incidence matrix $B$ and a set of relative measurements $y$ with covariance $\Sigma$, find $\hat{x}$ satisfying
$$\min_{\hat{x} \perp 1_n} \| B^\top \hat{x} - y \|_{\Sigma^{-1}}^2. \tag{8.2}$$

8.3.2 Optimal estimation via centralized computation

From the theory of least squares estimation, the optimal solution to problem (8.2) is obtained by differentiating the quadratic cost function with respect to the unknown variable $\hat{x}$ and setting the derivative to zero.

Specifically:
$$0 = \frac{\partial}{\partial \hat{x}} \| B^\top \hat{x} - y \|_{\Sigma^{-1}}^2 = 2 B \Sigma^{-1} B^\top \hat{x} - 2 B \Sigma^{-1} y.$$
The optimal solution is therefore obtained as the unique vector $\hat{x} \in \mathbb{R}^n$ satisfying
$$B \Sigma^{-1} B^\top \hat{x} = B \Sigma^{-1} y \;\iff\; L \hat{x} = B \Sigma^{-1} y, \qquad 1_n^\top \hat{x} = 0, \tag{8.3}$$
where the Laplacian matrix $L$ is defined by $L = B \Sigma^{-1} B^\top$. This matrix is the Laplacian of the weighted graph whose edge weights are the inverse noise covariances $1/\sigma_{(i,j)}^2$ associated with each relative measurement edge.
Before proceeding, we review the definition and properties of the pseudoinverse Laplacian matrix given in Exercise E6.9. Recall that the Moore-Penrose pseudoinverse of an $n \times m$ matrix $M$ is the unique $m \times n$ matrix $M^\dagger$ with the following properties:
(i) $M M^\dagger M = M$,
(ii) $M^\dagger M M^\dagger = M^\dagger$, and
(iii) $M M^\dagger$ is symmetric and $M^\dagger M$ is symmetric.

For our Laplacian matrix $L$, let $U \in \mathbb{R}^{n \times n}$ be an orthonormal matrix of eigenvectors of $L$. It is known that
$$L = U \begin{bmatrix} 0 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{bmatrix} U^\top \;\implies\; L^\dagger = U \begin{bmatrix} 0 & 0 & \dots & 0 \\ 0 & 1/\lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1/\lambda_n \end{bmatrix} U^\top.$$
Moreover, it is known that $L L^\dagger = L^\dagger L = I_n - \frac{1}{n} 1_n 1_n^\top$ and $L^\dagger 1_n = 0_n$.
n
Lemma 8.4 (Unique optimal estimate). If the undirected graph $G$ is connected, then
(i) there exists a unique solution to equations (8.3), solving the optimization problem in equation (8.2); and
(ii) this unique solution is given by
$$\hat{x} = L^\dagger B \Sigma^{-1} y.$$

Proof. We claim there exists a unique solution to equation (8.3) and prove it as follows. Since $G$ is connected, the rank of $L$ is $n - 1$. Moreover, since $L$ is symmetric and since $L 1_n = 0_n$, the image of $L$ is the $(n-1)$-dimensional vector subspace orthogonal to the subspace spanned by the vector $1_n$. The vector $B \Sigma^{-1} y$ belongs to the image of $L$ because the column-sums of $B$ are zero, that is, $1_n^\top B = 0_m^\top$, so that $1_n^\top B \Sigma^{-1} y = 0$. Finally, the requirement that $1_n^\top \hat{x} = 0$ ensures that $\hat{x}$ is perpendicular to the kernel of $L$.
The expression $\hat{x} = L^\dagger B \Sigma^{-1} y$ follows from left-multiplying both sides of equation (8.3) by the pseudoinverse Laplacian matrix $L^\dagger$ and using the property $L^\dagger L = I_n - \frac{1}{n} 1_n 1_n^\top$. One can also verify that $1_n^\top L^\dagger B \Sigma^{-1} y = 0$, because $L^\dagger 1_n = 0_n$. $\blacksquare$


8.3.3 Optimal estimation via decentralized computation

To compute $\hat{x}$ in a distributed way, we propose the following distributed algorithm; see (Bolognani et al. 2010, Theorem 4). Pick a small $\varepsilon > 0$ and let each node implement the affine averaging algorithm:
$$\hat{x}_i(k+1) = \hat{x}_i(k) - \varepsilon \sum_{j \in \mathcal{N}(i)} \frac{1}{\sigma_{(i,j)}^2} \Big( \big( \hat{x}_i(k) - \hat{x}_j(k) \big) - y_{(i,j)} \Big), \qquad \hat{x}_i(0) = 0. \tag{8.4}$$
There are two interpretations of this algorithm. First, note that the estimate at node $i$ is adjusted at each iteration as a function of edge errors: each edge error (the difference between the estimated and the measured edge difference) contributes a small weighted correction to the node value. Second, note that the affine Laplacian flow
$$\dot{\bar{x}} = -L \bar{x} + B \Sigma^{-1} y \tag{8.5}$$
results in a steady state satisfying $L \bar{x} = B \Sigma^{-1} y$, which readily delivers the optimal estimate $\hat{x} = L^\dagger B \Sigma^{-1} y$ for appropriately chosen initial conditions. The algorithm (8.4) results from an Euler discretization of the affine Laplacian flow (8.5) with step size $\varepsilon$.

Lemma 8.5. Given a graph $G$ describing a relative measurement problem for the unknown variables $x \in \mathbb{R}^n$, with measurements $y \in \mathbb{R}^m$ and measurement covariance matrix $\Sigma = \operatorname{diag}(\{\sigma_{(i,j)}^2\}_{(i,j) \in E}) \in \mathbb{R}^{m \times m}$, the following statements hold:
(i) the affine averaging algorithm can be written as
$$\hat{x}(k+1) = (I_n - \varepsilon L) \hat{x}(k) + \varepsilon B \Sigma^{-1} y, \qquad \hat{x}(0) = 0_n. \tag{8.6}$$
(ii) if $G$ is connected and if $\varepsilon < 1/d_{\max}$, where $d_{\max}$ is the maximum weighted out-degree of $G$, then the solution $k \mapsto \hat{x}(k)$ of the affine averaging algorithm (8.4) converges to the unique solution $\hat{x}$ of the optimization problem (8.2).

Proof. To show fact (i), note that the algorithm can be written in vector form as
$$\hat{x}(k+1) = \hat{x}(k) - \varepsilon B \Sigma^{-1} (B^\top \hat{x}(k) - y),$$
and, using $L = B \Sigma^{-1} B^\top$, as equation (8.6).
To show fact (ii), define the error signal $\eta(k) = \hat{x} - \hat{x}(k)$. Note that $\eta(0) = \hat{x}$ and that $\operatorname{average}(\eta(0)) = 0$ because $1_n^\top \hat{x} = 0$. Compute
$$\eta(k+1) = (I_n - \varepsilon L + \varepsilon L) \hat{x} - (I_n - \varepsilon L) \hat{x}(k) - \varepsilon B \Sigma^{-1} y = (I_n - \varepsilon L) \eta(k) + \varepsilon (L \hat{x} - B \Sigma^{-1} y) = (I_n - \varepsilon L) \eta(k).$$
Now, according to Exercise E7.6, $\varepsilon$ is sufficiently small so that $I_n - \varepsilon L$ is nonnegative. Moreover, $(I_n - \varepsilon L)$ is doubly-stochastic and symmetric, and its corresponding undirected graph is connected and aperiodic, so Corollary 5.1 implies that $\eta(k) \to \operatorname{average}(\eta(0)) 1_n = 0_n$. $\blacksquare$
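The following end-to-end sketch (my own illustration; the graph, noise variances, and step size are arbitrary choices) generates noisy relative measurements, runs the affine averaging algorithm (8.6), and compares the result with the centralized estimate $\hat{x} = L^\dagger B \Sigma^{-1} y$ of Lemma 8.4:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1), (2, 4)]
sigma2 = np.array([0.1, 0.2, 0.1, 0.3, 0.2, 0.1])      # noise variances

B = np.zeros((n, len(edges)))
for e, (i, j) in enumerate(edges):
    B[i - 1, e], B[j - 1, e] = +1.0, -1.0

x_true = rng.standard_normal(n)
x_true -= x_true.mean()                                # zero-mean unknowns
y = B.T @ x_true + rng.standard_normal(len(edges)) * np.sqrt(sigma2)

Sinv = np.diag(1.0 / sigma2)
L = B @ Sinv @ B.T
x_central = np.linalg.pinv(L) @ B @ Sinv @ y           # Lemma 8.4

eps = 0.9 / L.diagonal().max()                         # eps < 1/d_max
x_hat = np.zeros(n)
for _ in range(5000):
    x_hat = (np.eye(n) - eps * L) @ x_hat + eps * B @ Sinv @ y   # (8.6)

print("decentralized:", np.round(x_hat, 4))
print("centralized:  ", np.round(x_central, 4))
```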


8.4 Appendix: Cycle and cutset spaces

As stated in the factorization in Lemma 8.1, we know that the incidence matrix contains at least as much information as the Laplacian matrix. Indeed, we argue via the following example that the incidence matrix contains additional information and subtleties.
Recall the distributed estimation problem of Section 8.3, now defined over an undirected ring graph. Introduce an arbitrary orientation for the edges and, for simplicity, assume all edges are oriented counterclockwise. Then, in the absence of noise, a summation of all measurements $y_{(i,j)}$ in this ring graph yields
$$\sum_{(i,j) \in E} y_{(i,j)} = \sum_{(i,j) \in E} (x_i - x_j) = 0 \quad \text{or} \quad 1_m^\top y = 1_m^\top B^\top x = 0,$$
that is, all relative measurements around the ring cancel out. Equivalently, $1_m \in \operatorname{kernel}(B)$. This consistency check can be used as additional information to process corrupted measurements.
These insights generalize to arbitrary graphs: the nullspace of $B$ and its orthogonal complement, the image of $B^\top$, can be related to cycles and cutsets in the graph. In what follows, we present some of these generalizations; the presentation in this section is inspired by (Biggs 1994; Zelazo 2009). As a running example in this section we use the graph and the incidence matrix illustrated in Figure 8.4.
example in this section we use the graph and the incidence matrix illustrated in Figure 8.4.

3 2 3
1 2 6 6 +1 +1 0 0 0 0 0
6 1 0 +1 0 0 0 07
1 6 7
5 60 1 0 +1 +1 +1 0 7
4 B=6
60
7
6 0 1 1 0 0 +177
5 40 0 0 0 1 0 15
2 3 7
4 0 0 0 0 0 1 0

Figure 8.4: An undirected graph with arbitrary edge orientation and its associated incidence matrix B R67 .

Definition 8.6 (Signed path vector). Given an undirected graph $G$ with an arbitrary orientation of its $m$ edges, let $\gamma$ be a simple path. The signed path vector $v \in \{-1, 0, +1\}^m$ of the simple path $\gamma$ is defined by, for $e \in \{1, \dots, m\}$,
$$v_e = \begin{cases} +1, & \text{if edge } e \text{ is traversed positively by } \gamma, \\ -1, & \text{if edge } e \text{ is traversed negatively by } \gamma, \\ 0, & \text{otherwise.} \end{cases}$$

Proposition 8.7 (Cycle space). Consider an undirected graph G with an arbitrary orientation of its edges
and incidence matrix B Rnm . The kernel of B, called the cycle space of G, is the subspace of Rm spanned
by the signed path vectors corresponding to all the cycles in G.

The proposition follows from the following lemma.

Lemma 8.8. Given an undirected graph $G$, consider an arbitrary orientation of its edges, its incidence matrix $B \in \mathbb{R}^{n \times m}$, and a simple path $\gamma$ with distinct initial and final nodes described by a signed path vector $v \in \mathbb{R}^m$.


The vector $y = B v$ has components
$$y_i = \begin{cases} +1, & \text{if node } i \text{ is the initial node of } \gamma, \\ -1, & \text{if node } i \text{ is the final node of } \gamma, \\ 0, & \text{otherwise.} \end{cases}$$

Proof. We write $y = B \operatorname{diag}(v) 1_m$. The $(i, e)$ element of the matrix $B \operatorname{diag}(v)$ takes the value $-1$ (respectively $+1$) if edge $e$ is used by the path $\gamma$ to enter (respectively leave) node $i$. Now, if node $i$ is not the initial or final node of the path $\gamma$, then the $i$th row-sum of $B \operatorname{diag}(v)$, that is $(B \operatorname{diag}(v) 1_m)_i$, is zero. For the initial node, $(B \operatorname{diag}(v) 1_m)_i = +1$, and for the final node, $(B \operatorname{diag}(v) 1_m)_i = -1$. $\blacksquare$
For the example graph in Figure 8.4, two cycles and their signed path vectors are illustrated in Figure 8.5. Observe that $v_1, v_2 \in \operatorname{kernel}(B)$ and that the cycle traversing the edges $(1, 3, 7, 5, 2)$ in counterclockwise orientation has a signed path vector given by the linear combination $v_1 + v_2$.
$$\operatorname{kernel}(B) = \operatorname{span}\left( \begin{bmatrix} +1 \\ -1 \\ +1 \\ -1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ +1 \\ -1 \\ 0 \\ +1 \end{bmatrix} \right) = \operatorname{span}(v_1, v_2)$$

Figure 8.5: Two cycles and their respective signed path vectors in $\operatorname{kernel}(B)$.
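A quick numerical check (my own addition) that these two signed path vectors indeed span the kernel of the incidence matrix of Figure 8.4:

```python
import numpy as np
from scipy.linalg import null_space

B = np.array([[+1, +1,  0,  0,  0,  0,  0],
              [-1,  0, +1,  0,  0,  0,  0],
              [ 0, -1,  0, +1, +1, +1,  0],
              [ 0,  0, -1, -1,  0,  0, +1],
              [ 0,  0,  0,  0, -1,  0, -1],
              [ 0,  0,  0,  0,  0, -1,  0]], dtype=float)

v1 = np.array([+1, -1, +1, -1, 0, 0, 0], dtype=float)
v2 = np.array([0, 0, 0, +1, -1, 0, +1], dtype=float)

print(np.allclose(B @ v1, 0), np.allclose(B @ v2, 0))  # both in kernel(B)
print(null_space(B).shape[1])   # dim kernel(B) = m - rank(B) = 7 - 5 = 2
```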

Definition 8.9 (Cutset orientation vector). Given an undirected graph $G$, consider an arbitrary orientation of its edges and a partition of its vertices $V$ into two non-empty and disjoint sets $V_1$ and $V_2$. The cutset orientation vector $v \in \{-1, 0, +1\}^m$ corresponding to the partition $V = V_1 \cup V_2$ has components
$$v_e = \begin{cases} +1, & \text{if edge } e \text{ has its source node in } V_1 \text{ and its sink node in } V_2, \\ -1, & \text{if edge } e \text{ has its sink node in } V_1 \text{ and its source node in } V_2, \\ 0, & \text{otherwise.} \end{cases}$$

Proposition 8.10 (Cutset space). Given an undirected graph $G$, consider an arbitrary orientation of its edges and its incidence matrix $B \in \mathbb{R}^{n \times m}$. The image of $B^\top$, called the cutset space of $G$, is the subspace of $\mathbb{R}^m$ spanned by the cutset orientation vectors corresponding to all partitions of $G$.

Proof. For a cutset orientation vector $v \in \mathbb{R}^m$ associated with the partition $V = V_1 \cup V_2$, we have
$$v^\top = \frac{1}{2} \Big( \sum_{i \in V_1} b_i^\top - \sum_{i \in V_2} b_i^\top \Big),$$
where $b_i^\top$ is the $i$th row of the incidence matrix. If $B x = 0_n$ for some $x \in \mathbb{R}^m$, then $b_i^\top x = 0$ for all $i \in \{1, \dots, n\}$. It follows that $v^\top x = 0$, or equivalently, $v$ belongs to the orthogonal complement of

kernel(B) which is the image of B > . Finally, notice that the image of B > can be constructed this way: the
kth column of B > is obtained by choosing the partition V1 = {k} and V2 = V \ {k}. Thus, the cutset
orientation vectors span the image of B > . 

Since $\operatorname{rank}(B) = n - 1$ for a connected graph, any $n - 1$ columns of the matrix $B^\top$ form a basis for the cutset space. For instance, the $i$th column corresponds to the cut isolating node $i$, that is, $V = \{i\} \cup (V \setminus \{i\})$. For the example in Figure 8.4, five cuts and their cutset orientation vectors are illustrated in Figure 8.6. Observe that $v_i \in \operatorname{image}(B^\top)$, for $i \in \{1, \dots, 5\}$, and that the cut isolating node 6 has a cutset orientation vector given by the linear combination $-(v_1 + v_2 + v_3 + v_4 + v_5)$. Likewise, the cut separating nodes $\{1, 2, 3\}$ from $\{4, 5, 6\}$ has the cutset vector $v_1 + v_2 + v_3$, corresponding to the sum of the first three columns of $B^\top$.
$$\operatorname{image}(B^\top) = \operatorname{span}\left( \begin{bmatrix} +1 & -1 & 0 & 0 & 0 \\ +1 & 0 & -1 & 0 & 0 \\ 0 & +1 & 0 & -1 & 0 \\ 0 & 0 & +1 & -1 & 0 \\ 0 & 0 & +1 & 0 & -1 \\ 0 & 0 & +1 & 0 & 0 \\ 0 & 0 & 0 & +1 & -1 \end{bmatrix} \right) = \operatorname{span}(v_1, v_2, v_3, v_4, v_5)$$

Figure 8.6: Five cuts and their cutset orientation vectors in $\operatorname{image}(B^\top)$.

Example 8.11 (Kirchhoff's and Ohm's laws revisited). In the following, we revisit the electrical resistor network from Section 6.3 and re-derive its governing equations via the incidence matrix. Recall that with each node $i \in \{1, \dots, n\}$ of the network we associate an external current injection $c_{\text{injected at } i}$. With each edge $\{i, j\} \in E$ we associate a positive conductance (i.e., the inverse of the resistance) $a_{ij} > 0$ and (after introducing an arbitrary direction for each edge) a current flow $c_{ij}$ and a voltage drop $u_{ij}$.
Kirchhoff's voltage law states that the sum of all voltage drops around each cycle must be zero, that is, for each cycle in the network with signed path vector $c \in \mathbb{R}^m$ we have $c^\top u = 0$. Equivalently, by Proposition 8.7, there is a vector $v \in \mathbb{R}^n$ so that $u = B^\top v$, where $B \in \mathbb{R}^{n \times m}$ is the incidence matrix of the (oriented) network. In Section 6 we referred to $v$ as the vector of nodal voltages or potentials.
Kirchhoff's current law states that the sum of all current injections at every node must be zero, that is, for each node $i \in \{1, \dots, n\}$ in the network we have $c_{\text{injected at } i} = \sum_{j=1}^n c_{ji}$. Consider now the cut isolating node $i$, characterized by the cutset vector corresponding to the $i$th column $b_i$ of $B^\top$; see Figure 8.6. Then we have that $c_{\text{injected at } i} = \sum_{j=1}^n c_{ji} = b_i^\top c$, where $c \in \mathbb{R}^m$ is the vector of edge current flows. Equivalently, we have that $c_{\text{injected}} = B c$.
Finally, Ohm's law states that the current $c_{ij}$ and the voltage drop $u_{ij}$ over a resistor with resistance $1/a_{ij}$ are related as $c_{ij} = a_{ij} u_{ij}$. By combining Kirchhoff's and Ohm's laws, we arrive at
$$c_{\text{injected}} = B c = B \operatorname{diag}(a_{ij}) u = B \operatorname{diag}(a_{ij}) B^\top v = L v,$$
where we used Lemma 8.1 to recover the conductance matrix $L$.
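A numerical sketch of the derivation above (my own addition, with arbitrary conductances and injections): assemble $L = B \operatorname{diag}(a) B^\top$ and recover the nodal potentials from the current injections.

```python
import numpy as np

edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
a = np.array([1.0, 2.0, 1.0, 0.5])           # conductances (1/resistance)
n = 4

B = np.zeros((n, len(edges)))
for e, (i, j) in enumerate(edges):
    B[i - 1, e], B[j - 1, e] = +1.0, -1.0

L = B @ np.diag(a) @ B.T                     # conductance (Laplacian) matrix

c_inj = np.array([1.0, 0.0, 0.0, -1.0])      # balanced current injections
v = np.linalg.pinv(L) @ c_inj                # potentials (up to a constant)
u = B.T @ v                                  # voltage drops (KVL: u = B^T v)
c = a * u                                    # edge currents (Ohm's law)
print(np.allclose(B @ c, c_inj))             # KCL: c_injected = B c
```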


Example 8.12 (Nonlinear network flow problem). Consider a (static) network flow problem where a commodity (e.g., power or water) is transported through a network (e.g., a power grid or a piping system). We model this scenario with an undirected and connected graph with $n$ nodes. With each node we associate an external supply/demand variable $y_i$ (positive for a source and negative for a sink) and assume that the overall network is balanced: $\sum_{i=1}^n y_i = 0$. We also associate a potential variable $x_i$ with every node (e.g., voltage or pressure), and assume the flow of commodity between two connected nodes $i$ and $j$ depends on the potential difference as $f_{ij}(x_i - x_j)$, where $f_{ij}$ is a strictly increasing function satisfying $f_{ij}(0) = 0$. For example, for piping systems and power grids these functions $f_{ij}$ are given by the rational Hazen-Williams flow and the trigonometric power flow, respectively, which are both monotone in the region of interest. By balancing the flow at each node (akin to Kirchhoff's current law), we obtain at node $i$
$$y_i = \sum_{j=1}^n a_{ij} f_{ij}(x_i - x_j), \qquad i \in \{1, \dots, n\},$$
where $a_{ij} \in \{0, 1\}$ is the $(i, j)$ element of the network adjacency matrix. In vector notation, the flow balance is
$$y = B f(B^\top x),$$
where $f$ is the vector-valued function with components $f_{ij}$. Consider also the associated linearized problem $y = B B^\top x = L x$, where $L$ is the network Laplacian matrix and where we implicitly assumed $f_{ij}'(0) = 1$. The flows in the linear problem are obtained as $B^\top x^\star = B^\top L^\dagger y$, where $L^\dagger$ is the Moore-Penrose inverse of $L$; see Exercises E6.9 and E6.10. In the following we restrict ourselves to an acyclic network and show that the nonlinear solution can be obtained from the solution of the linear problem.
We formally replace the flow $f(B^\top x)$ by a new variable $v := f(B^\top x)$ and arrive at
$$y = B v, \tag{8.7a}$$
$$v = f(B^\top x). \tag{8.7b}$$
In the acyclic case, $\operatorname{kernel}(B) = \{0_m\}$ and necessarily $v \in \operatorname{image}(B^\top)$, or $v = B^\top w$ for some $w \in \mathbb{R}^n$. Thus, equation (8.7a) reads as $y = B v = B B^\top w = L w$, and its solution is $w = L^\dagger y$. Equation (8.7b) then reads as $f(B^\top x) = v = B^\top w = B^\top L^\dagger y$, and its unique solution (due to monotonicity) is $B^\top x^\star = f^{-1}(B^\top L^\dagger y)$.
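A sketch of this solution procedure (my own illustration, on an arbitrary tree network with the monotone flow function $f = \sinh$, which satisfies $f(0) = 0$ and $f'(0) = 1$):

```python
import numpy as np

# tree (acyclic) network on 4 nodes
edges = [(1, 2), (2, 3), (2, 4)]
n, m = 4, 3
B = np.zeros((n, m))
for e, (i, j) in enumerate(edges):
    B[i - 1, e], B[j - 1, e] = +1.0, -1.0

# balanced supply/demand and a monotone flow function f = sinh
y = np.array([1.0, 0.0, -0.4, -0.6])
f, f_inv = np.sinh, np.arcsinh

L_pinv = np.linalg.pinv(B @ B.T)
v = B.T @ L_pinv @ y                     # linear-problem flows, v = B^T L^+ y
pot_diff = f_inv(v)                      # nonlinear potential differences B^T x*
print(np.allclose(B @ f(pot_diff), y))   # flow balance y = B f(B^T x*)
```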


8.5 Exercises
E8.1 Continuous distributed estimation from relative measurements. Consider the continuous distributed estimation algorithm given by the affine Laplacian flow (8.5). Show that, for an undirected and connected graph $G$ and appropriate initial conditions $\bar{x}(0) = 0_n$, the affine Laplacian flow (8.5) converges to the unique solution $\hat{x}$ of the estimation problem given in Lemma 8.4.
E8.2 The edge Laplacian matrix (Zelazo and Mesbahi 2011). For an unweighted undirected graph with $n$ nodes and $m$ edges, introduce an arbitrary orientation for the edges. Recall the notions of incidence matrix $B \in \mathbb{R}^{n \times m}$ and Laplacian matrix $L = B B^\top \in \mathbb{R}^{n \times n}$, and define the edge Laplacian matrix by
$$L_{\text{edge}} = B^\top B \in \mathbb{R}^{m \times m}.$$
Select an edge orientation and compute $B$, $L$ and $L_{\text{edge}}$ for
(i) a line graph with three nodes, and
(ii) the graph with four nodes in Figure 8.1.
Show that, for an arbitrary undirected graph,
(iii) $\operatorname{kernel}(L_{\text{edge}}) = \operatorname{kernel}(B)$;
(iv) for an acyclic graph, $L_{\text{edge}}$ is nonsingular;
(v) the non-zero eigenvalues of $L_{\text{edge}}$ are equal to the non-zero eigenvalues of $L$; and
(vi) $\operatorname{rank}(L) = \operatorname{rank}(L_{\text{edge}})$.
E8.3 Evolution of the local disagreement error (Zelazo and Mesbahi 2011). Consider the Laplacian flow $\dot{x} = -L x$, defined over an undirected and connected graph with $n$ nodes and $m$ edges. Besides the absolute disagreement error $\delta(t) = x(t) - \operatorname{average}(x(t)) 1_n \in \mathbb{R}^n$ considered thus far, we can also analyze the relative disagreement errors $e_{ij}(t) = x_i(t) - x_j(t)$, for $\{i, j\} \in E$.
(i) Write a differential equation for the relative disagreement errors $t \mapsto e(t) \in \mathbb{R}^m$.
(ii) Based on Exercise E8.2, show that the relative disagreement errors converge to zero with exponential convergence rate given by the algebraic connectivity $\lambda_2(L)$.
E8.4 Averaging with distributed integral control. Consider a Laplacian flow implemented as a relative sensing network over a connected and undirected graph with incidence matrix $B \in \mathbb{R}^{n \times |E|}$ and weights $a_{ij} > 0$ for $\{i, j\} \in E$, and subject to a constant disturbance term $\eta \in \mathbb{R}^{|E|}$, as shown in Figure E8.1.
(i) Derive the dynamic closed-loop equations describing the model in Figure E8.1.
(ii) Show that asymptotically all states $x(t)$ converge to some constant vector $x_\infty \in \mathbb{R}^n$ depending on the value of the disturbance $\eta$, that is, $x_\infty$ is not necessarily a consensus state.
Consider next the system in Figure E8.1 augmented with a distributed integral controller forcing convergence to consensus, as shown in Figure E8.2. Recall that $1/s$ is the Laplace-domain symbol for the integrator.
(iii) Derive the dynamic closed-loop equations describing the model in Figure E8.2.
(iv) Show that the distributed integral controller in Figure E8.2 asymptotically stabilizes the set of steady states $(x_\infty, p_\infty)$, where $x_\infty \in \operatorname{span}(1_n)$ corresponds to consensus.
Hint: To show stability, use Lemma 7.9.


Figure E8.1: A relative sensing network (plant $\dot{x}_i = u_i$, measurement map $B^\top$, gains $a_{ij}$, input map $-B$) with a constant disturbance input $\eta \in \mathbb{R}^{|E|}$.

Figure E8.2: Relative sensing network with a disturbance $\eta \in \mathbb{R}^{|E|}$ and distributed integral action $p$ (integrator $1/s$) on the edge controller.


E8.5 Sensitivity of Laplacian eigenvalues. Consider an unweighted undirected graph $G = (V, E)$ with incidence matrix $B \in \mathbb{R}^{n \times m}$ and Laplacian matrix $L = B B^\top \in \mathbb{R}^{n \times n}$. Consider now a graph $G'$ obtained by adding one unweighted edge $e \notin E$ to $G$, that is, $G' = (V, E \cup e)$. Show that
$$\lambda_{\max}(L_G) \le \lambda_{\max}(L_{G'}) \le \lambda_{\max}(L_G) + 2.$$
Hint: You may want to take a detour via the edge Laplacian matrix $L_{\text{edge}} = B^\top B \in \mathbb{R}^{m \times m}$ (see Exercise E8.2) and use the following fact (Horn and Johnson 1985, Theorem 4.3.17): if $A$ is a symmetric matrix with eigenvalues ordered as $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$, and $B$ is a principal submatrix of $A$ with eigenvalues ordered as $\mu_1 \le \mu_2 \le \dots \le \mu_{n-1}$, then the eigenvalues of $A$ and $B$ interlace, that is,
$$\lambda_1 \le \mu_1 \le \lambda_2 \le \mu_2 \le \dots \le \mu_{n-1} \le \lambda_n.$$

Chapter 9
Positive and Compartmental Systems

This chapter is inspired by the excellent text (Walter and Contreras 1999) and the tutorial treatment in (Jacquez and Simon 1993); see also the texts (Luenberger 1979; Farina and Rinaldi 2000; Haddad et al. 2010). Additional results on Metzler matrices are available in (Berman and Plemmons 1994; Santesso and Valcher 2007). For nonlinear extensions of the material in this chapter, including recent studies of traffic networks, we refer to Como et al. (2013); Coogan and Arcak (2015).

9.1 Introduction and example systems


In this chapter we study various continuous-time systems with state variables that are nonnegative for all
times. We are particularly interested in compartmental systems, that is, models of dynamical processes
characterized by conservation laws (e.g., mass, fluid, energy) and by the flow of material between units
known as compartments. Example compartmental systems are transportation networks, queueing networks,
communication networks, epidemic propagation models in social contact networks, as well as ecological
and biological networks. We review some examples in what follows.

Ecological and environmental systems The flow of energy and nutrients (water, nitrates, phosphates, etc.) in ecosystems is typically studied using compartmental modelling. For example, Figure 9.1 illustrates a widely-cited water flow model for a desert ecosystem (Noy-Meir 1973). Other classic ecological network systems include models for dissolved oxygen in streams, nutrient flow in forest growth, and biomass flow in fisheries (Walter and Contreras 1999).

Epidemiology of infectious diseases To study the propagation of infectious diseases, the population at risk is typically divided into compartments consisting of individuals who are susceptible (S), infected (I), and, possibly, recovered and no longer susceptible (R). As illustrated in Figure 9.2, the three basic epidemiological models (Hethcote 2000) are called SI, SIS and SIR, depending upon how the disease spreads. A detailed discussion is postponed until Chapter 16.

Drug and chemical kinetics in biomedical systems Compartmental models are also widely adopted to characterize the kinetics of drugs and chemicals in biomedical systems. Here is a classic example (Charkes


Figure 9.1: Water flow model for a desert ecosystem, with compartments for soil, plants and animals, inflow from precipitation, and outflows due to evaporation, drainage, runoff and transpiration. The blue line denotes an inflow from the outside environment; the red lines denote outflows into the outside environment.

Figure 9.2: The three basic models SI, SIS and SIR for the propagation of an infectious disease.

et al. 1978) from nuclear medicine: bone scintigraphy (also called a bone scan) is a medical test in which the patient is injected with a small amount of radioactive material and then scanned with an appropriate radiation camera.

Figure 9.3: The kinetics of a radioactive isotope through the human body, with compartments for blood, kidneys, urine, bone ECF, bone, and the rest of the body (ECF = extra-cellular fluid).

9.2 Positive systems

Motivated by the examples in the previous sections, we start our study by characterizing the class of positive systems.

Definition 9.1 (Positive systems). A dynamical system $\dot{x}(t) = f(x(t), t)$, $x \in \mathbb{R}^n$, is positive if $x(0) \ge 0_n$ implies $x(t) \ge 0_n$ for all $t$.


We are especially interested in linear and affine systems, described by
$$\dot{x}(t) = A x(t), \quad \text{and} \quad \dot{x}(t) = A x(t) + b.$$
Note that the set of affine systems includes the set of linear systems (each linear system is affine with $b = 0_n$).
It is now convenient to introduce a second useful definition.

Definition 9.2 (Metzler matrix). A matrix $A \in \mathbb{R}^{n \times n}$, $n \ge 2$, is Metzler (sometimes also referred to as quasi-positive or essentially nonnegative) if all its off-diagonal elements are nonnegative.

In other words, $A$ is Metzler if and only if there exists a scalar $a > 0$ such that $A + a I_n$ is nonnegative. For example, if $G$ is a weighted digraph with Laplacian matrix $L$, then $-L$ is a Metzler matrix with zero row-sums.
A Metzler matrix $A$ induces a weighted digraph $G$ without self-loops in the natural way, that is, by letting $(i, j)$ be an edge of $G$ if and only if $a_{ij} > 0$. We say a Metzler matrix is irreducible if its induced digraph is strongly connected.
We are now ready to classify which affine systems are positive.

Theorem 9.3 (Positive affine systems and Metzler matrices). For the affine system $\dot{x}(t) = A x(t) + b$, the following statements are equivalent:
(i) the system is positive, that is, $x(t) \ge 0_n$ for all $t$ and all $x(0) \ge 0_n$,
(ii) $A$ is Metzler and $b \ge 0_n$.

Proof. We start by showing that statement (i) implies statement (ii). If $x(0) = 0_n$, then $\dot{x}$ cannot have any negative components, hence $b \ge 0_n$. If any off-diagonal entry $a_{ij}$, $i \ne j$, of $A$ is strictly negative, then consider an initial condition $x(0)$ with all zero entries except for $x_j(0) > b_i / |a_{ij}|$. It is easy to see that $\dot{x}_i(0) < 0$, which is a contradiction.
Next, we show that statement (ii) implies statement (i). It suffices to note that, anytime there exists $i$ such that $x_i(t) = 0$, the conditions $x(t) \ge 0_n$, $A$ Metzler and $b \ge 0_n$ together imply $\dot{x}_i(t) = \sum_{j \ne i} a_{ij} x_j(t) + b_i \ge 0$. $\blacksquare$

This result motivates the importance of Metzler matrices; we now study their properties in two theorems. We start by writing a version of the Perron-Frobenius Theorem 2.12 for Metzler matrices.

Theorem 9.4 (Perron-Frobenius Theorem for Metzler matrices). If $A \in \mathbb{R}^{n \times n}$, $n \ge 2$, is Metzler, then
(i) there exists a real eigenvalue $\lambda$ such that $\lambda \ge \Re(\mu)$ for all other eigenvalues $\mu$, and
(ii) the right and left eigenvectors of $\lambda$ can be selected nonnegative.
If additionally $A$ is irreducible, then
(iii) there exists a real simple eigenvalue $\lambda$ such that $\lambda > \Re(\mu)$ for all other eigenvalues $\mu$, and
(iv) the right and left eigenvectors of $\lambda$ are unique and positive (up to rescaling).


As for nonnegative matrices, we refer to $\lambda$ as the dominant eigenvalue. We invite the reader to work out the details of the proof in Exercise E9.2. Next, we give necessary and sufficient conditions for the dominant eigenvalue of a Metzler matrix to be strictly negative.

Theorem 9.5 (Properties of Hurwitz Metzler matrices). For a Metzler matrix $A$, the following statements are equivalent:
(i) $A$ is Hurwitz,
(ii) $A$ is invertible and $-A^{-1} \ge 0$, and
(iii) for all $b \ge 0_n$, there exists $x \ge 0_n$ solving $A x + b = 0_n$.
Moreover, if $A$ is Metzler, Hurwitz and irreducible, then $-A^{-1} > 0$.

Proof. We start by showing that (i) implies (ii). Clearly, if $A$ is Hurwitz, then it is also invertible. So it suffices to show that $-A^{-1}$ is nonnegative. Pick $\varepsilon > 0$ and define $A_{\varepsilon,A} = I_n + \varepsilon A$, that is, $\varepsilon A = -(I_n - A_{\varepsilon,A})$. Because $A$ is Metzler, $\varepsilon$ can be selected small enough so that $A_{\varepsilon,A} \ge 0$. Moreover, because the spectrum of $A$ is strictly in the left half plane, one can verify that, for $\varepsilon$ small enough, $\operatorname{spec}(\varepsilon A)$ is inside the disk of unit radius centered at the point $-1$, as illustrated in Figure 9.4. In turn, this last property implies

Figure 9.4: For any $\lambda \in \mathbb{C}$ with strictly negative real part, there exists $\varepsilon > 0$ such that $\varepsilon \lambda$ is inside the disk of unit radius centered at the point $-1$.

that $\operatorname{spec}(I_n + \varepsilon A)$ is strictly inside the disk of unit radius centered at the origin, that is, $\rho(A_{\varepsilon,A}) < 1$. We now adopt the Neumann series as defined in Exercise E2.13: because $\rho(A_{\varepsilon,A}) < 1$, we know that $I_n - A_{\varepsilon,A} = -\varepsilon A$ is invertible and that
$$(-\varepsilon A)^{-1} = (I_n - A_{\varepsilon,A})^{-1} = \sum_{k=0}^{\infty} A_{\varepsilon,A}^k. \tag{9.1}$$
Note now that the right-hand side is nonnegative because it is the sum of nonnegative matrices. In summary, we have shown that $A$ is invertible and that $-A^{-1} \ge 0$. This proves that (i) implies (ii).
Next we show that (ii) implies (i). We know $A$ is Metzler, invertible and satisfies $-A^{-1} \ge 0$. By the Perron-Frobenius Theorem 9.4 for Metzler matrices, we know there exists $v \ge 0_n$, $v \ne 0_n$, satisfying $A v = \lambda_{\text{Metzler}} v$, where $\lambda_{\text{Metzler}} = \max\{\Re(\lambda) \mid \lambda \in \operatorname{spec}(A)\}$. Clearly, $A$ invertible implies $\lambda_{\text{Metzler}} \ne 0$ and, moreover, $v = \lambda_{\text{Metzler}} A^{-1} v$. Now, we know $v$ is nonnegative and $A^{-1} v$ is nonpositive. Hence, $\lambda_{\text{Metzler}}$ must be negative and, in turn, $A$ is Hurwitz. This establishes that (ii) implies (i).
Finally, regarding the equivalence between statement (ii) and statement (iii), note that, if $-A^{-1} \ge 0$ and $b \ge 0_n$, then clearly $x = -A^{-1} b \ge 0_n$ solves $A x + b = 0_n$. This proves that (ii) implies (iii). Vice versa,

if statement (iii) holds, then let $x_i$ be the nonnegative solution of $A x_i = -e_i$ and let $X$ be the nonnegative matrix with columns $x_1, \dots, x_n$. Therefore, we know $A X = -I_n$, so that $A$ is invertible and $-A^{-1} = X$ is nonnegative. This proves that (iii) implies (ii).
Finally, the statement that $-A^{-1} > 0$ for each Metzler, Hurwitz and irreducible matrix $A$ is proved as follows. Because $A$ is irreducible, the matrix $A_{\varepsilon,A} = I_n + \varepsilon A$ is nonnegative (for $\varepsilon$ sufficiently small) and primitive. Therefore, the right-hand side of equation (9.1) is strictly positive. $\blacksquare$
This theorem about Metzler matrices immediately leads to the following corollary about positive affine
systems, which extends the results in Exercise E7.2.

Corollary 9.6 (Existence, positivity and stability of equilibria for positive affine systems). Consider a continuous-time positive affine system $\dot{x} = A x + b$, where $A$ is Metzler and $b$ is nonnegative. If the matrix $A$ is Hurwitz, then
(i) the system has a unique equilibrium point $x^\star \in \mathbb{R}^n$, that is, a unique solution to $A x^\star + b = 0_n$,
(ii) the equilibrium point $x^\star$ is nonnegative, and
(iii) all trajectories converge asymptotically to $x^\star$.
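Theorem 9.5 and Corollary 9.6 are easy to verify numerically. The sketch below (my own illustration, with an arbitrary irreducible Metzler Hurwitz matrix) checks that $-A^{-1} > 0$ and that the equilibrium of $\dot{x} = A x + b$ is nonnegative:

```python
import numpy as np

# an irreducible Metzler matrix with strictly dominant negative diagonal
A = np.array([[-3.0, 1.0, 0.5],
              [0.5, -2.0, 1.0],
              [1.0, 0.5, -2.5]])
print("Hurwitz:", np.all(np.linalg.eigvals(A).real < 0))
print("-A^{-1} > 0:", np.all(-np.linalg.inv(A) > 0))

b = np.array([1.0, 0.0, 2.0])               # nonnegative inflow
x_star = -np.linalg.inv(A) @ b              # unique equilibrium
print("x_star >= 0:", np.all(x_star >= 0), x_star)
```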

Several other properties of positive affine systems and Metzler matrices are reviewed in (Berman and Plemmons 1994), albeit in a slightly different language.
The following properties extend the previous results and show that positive systems enjoy very strong convergence properties.

Theorem 9.7 (Convergence of positive systems). Consider a continuous-time positive system $\dot{x} = A x$, where $A$ is Metzler and Hurwitz. Then,
(i) there is a positive vector $w \in \mathbb{R}^n$ so that the weighted 1-norm (i.e., the weighted mass) $\sum_{i=1}^n w_i |x_i(t)|$ is exponentially convergent;
(ii) there is a positive vector $v \in \mathbb{R}^n$ so that the weighted $\infty$-norm (i.e., the weighted max) $\max_{i \in \{1,\dots,n\}} |x_i(t)|/v_i$ is exponentially convergent; and
(iii) there is a positive vector $p \in \mathbb{R}^n$ so that the weighted 2-norm (i.e., the weighted energy) $\sum_{i=1}^n p_i x_i^2$ is exponentially convergent.

Proof. In the following, let $v$ and $w$ be the dominant right and left eigenvectors corresponding to the dominant real eigenvalue $\lambda$ with largest real part; see Theorem 9.4. Since $A$ is Hurwitz by assumption, it follows that $\lambda < 0$.
To prove claim (i), consider the function
$$V_1(x(t)) = \sum_{i=1}^n w_i |x_i(t)| = \sum_{i=1}^n w_i x_i(t) = w^\top x(t),$$
where we used the positivity of the dynamical system in the second equality. The derivative of $V_1(x(t))$ is then
$$\dot{V}_1(x(t)) = w^\top \dot{x}(t) = w^\top A x(t) = \lambda w^\top x(t) = \lambda V_1(x(t)).$$

It follows that $V_1(x(t)) = e^{\lambda t} V_1(x(0))$ is exponentially convergent since $\lambda < 0$.


To prove claim (ii), we recall Av = v < 0 and consider the function

V2 (x(t)) = max |xi (t)|/vi = max xi (t)/vi


i{1,...,n} i{1,...,n}

where we used the positivity of the dynamical system in the second equality. Pick a time t and assume that
k arg maxi{1,...,n} xi (t)/vi . Then, at time t, the time derivative of V2 (x(t)) is given by

d  
V 2 (x(t)) = diag(v)1 x(t) k = diag(v)1 Ax(t) k
dt
  xk (t) 
= diag(v)1 A diag(v) diag(v)1 x(t) k diag(v)1 A diag(v)1n ,
vk k

where we used the element-wise inequality diag(v)1 x(t) 1n xkvk(t) that follows from the fact that all
elements of the vector diag(v)1 x(t), except for the kth, are multiplied by the off-diagonal elements of the
Metzler matrix diag(v)1 A diag(v). We continue by further analyzing the derivative V 2 (x(t)) as
 xk (t)   xk (t) 
V 2 (x(t)) diag(v)1 A diag(v)1n = diag(v)1 Av
vk k vk k
 xk (t)  xk (t)
= diag(v)1 v = = V2 (x(t)) .
vk k vk

Accordingly, V2 (x(t)) e tV2 (x(0)) is exponentially convergent since < 0.


To prove claim (iii), we let pi = wi /vi and consider now the function

V3 (x(t)) = x(t)T P x(t).

Its derivative is given by



V 3 (x((t)) = x(t)T P A + AT P x(t) .

Next notice that Q = P A + AT P is a symmetric matrix, Q is Metzler since P is diagonal and positive, and
Qv = (P A + AT P )v = P Av + AT w < 0, that is, there is a vector v so that Qv < 0. Thus, according to
claim (ii), the function
V2 (x(t)) = max |xi (t)|/vi
i{1,...,n}

shows that Q is Hurwitz. It follows that

V 3 (x((t)) = x(t)T Qx(t) max (Q)kx(t)k22 max (Q)min (P )V3 (x(t)) ,

and accordingly V3 (x(t)) = emax (Q)min (P )t V3 (x(0)) is exponentially convergent since max (Q) < 0 and
min (P ) > 0. 


Figure 9.5: A single compartment (with inflow $u_i$, outflow $F_{i \to 0}$, and exchange flows $F_{i \to j}$, $F_{j \to i}$) and an example compartmental system with four compartments $q_1, \dots, q_4$.

9.3 Compartmental systems

In this section, motivated by the examples in Section 9.1, we study an important class of positive affine systems.
A compartmental system is a dynamical system in which material is stored at individual locations and is transferred along the edges of a directed graph, called the compartmental digraph; see Figure 9.5. The storage nodes are referred to as compartments; each compartment contains a time-varying quantity $q_i(t)$. Each directed arc $(i, j)$ represents a mass flow (or flux), denoted $F_{i \to j}$, from compartment $i$ to compartment $j$. The compartmental system interacts with its surrounding environment via input and output flows, denoted in the figure by blue and red arcs respectively: the inflow from the environment into compartment $i$ is denoted by $u_i$ and the outflow from compartment $i$ into the environment is denoted by $F_{i \to 0}$.
In summary, a (nonlinear) compartmental system is described by a directed graph $G_F$, by maps $F_{i \to j}$ for all edges $(i, j)$ of $G_F$, and by inflow and outflow maps. (The compartmental digraph has no self-loops.) The dynamic equations of the compartmental system are obtained by the instantaneous flow balance at each compartment. In other words, asking that the rate of accumulation at each compartment equals the net inflow rate, we obtain:
$$\dot{q}_i(t) = \sum_{j=1, j \ne i}^n \big( F_{j \to i} - F_{i \to j} \big) - F_{i \to 0} + u_i. \tag{9.2}$$
In general, the flow along $(i, j)$ is a function of the entire system state $q = (q_1, \dots, q_n)$ and of time $t$, so that $F_{i \to j} = F_{i \to j}(q, t)$.

Remarks 9.8 (Basic properties). (i) The mass in each of the compartments as well as the mass flowing along each of the edges must be nonnegative at all times (recall we assume $u_i \ge 0$). Specifically, we require the mass flow functions to satisfy
$$F_{i \to j}(q, t) \ge 0 \text{ for all } (q, t), \quad \text{and} \quad F_{i \to j}(q, t) = 0 \text{ for all } (q, t) \text{ such that } q_i = 0. \tag{9.3}$$
Under these conditions, if at some time $t_0$ one of the compartments has no mass, that is, $q_i(t_0) = 0$ and $q(t_0) \in \mathbb{R}^n_{\ge 0}$, it follows that $\dot{q}_i(t_0) = \sum_{j=1, j \ne i}^n F_{j \to i}(q(t_0), t_0) + u_i \ge 0$, so that $q_i$ does not become negative. The compartmental system (9.2) is therefore a positive system, as introduced in Definition 9.1.


(ii) If $M(q) = \sum_{i=1}^n q_i = 1_n^\top q$ denotes the total mass in the system, then along the solutions of (9.2)
$$\frac{d}{dt} M(q(t)) = 1_n^\top \dot{q}(t) = \underbrace{- \sum_{i=1}^n F_{i \to 0}(q(t), t)}_{\text{outflow into environment}} + \underbrace{\sum_{i=1}^n u_i}_{\text{inflow from environment}}. \tag{9.4}$$
This equality implies that the total mass $t \mapsto M(q(t))$ is constant in systems without inflows and outflows.

Linear compartmental systems

Loosely speaking, a compartmental system is linear if (i) it has constant nonnegative inflows from the environment and (ii) all other flows depend linearly upon the mass in the originating compartment.

Definition 9.9 (Linear compartmental systems). A linear compartmental system with $n$ compartments is a triplet $(F, f_0, u)$ consisting of
(i) a nonnegative $n \times n$ matrix $F = (f_{ij})_{i,j \in \{1,\dots,n\}}$ with zero diagonal, called the flow rate matrix,
(ii) a vector $f_0 \ge 0_n$, called the outflow rates vector, and
(iii) a vector $u \ge 0_n$, called the inflow vector.
The flow rate matrix $F$ is the adjacency matrix of the compartmental digraph $G_F$ (a weighted digraph without self-loops).

The flow rate matrix $F$ encodes the following information: the nodes are the compartments $\{1, \dots, n\}$, there is an edge $(i, j)$ if there is a flow from compartment $i$ to compartment $j$, and the weight $f_{ij}$ of the $(i, j)$ edge is the corresponding flow rate constant. In a linear compartmental system,
$$F_{i \to j}(q, t) = f_{ij} q_i \text{ for } j \in \{1, \dots, n\}, \qquad F_{i \to 0}(q, t) = f_{0i} q_i, \qquad u_i(q, t) = u_i.$$
Indeed, this model is also referred to as donor-controlled flow. Note that this model satisfies the physically-meaningful constraints (9.3). The affine dynamics describing a linear compartmental system are
$$\dot{q}_i(t) = -\Big( f_{0i} + \sum_{j=1, j \ne i}^n f_{ij} \Big) q_i(t) + \sum_{j=1, j \ne i}^n f_{ji} q_j(t) + u_i. \tag{9.5}$$

Definition 9.10 (Compartmental matrix). The compartmental matrix $C = (c_{ij})_{i,j \in \{1,\dots,n\}}$ of a compartmental system $(F, f_0, u)$ is defined by
$$c_{ij} = \begin{cases} f_{ji}, & \text{if } i \ne j, \\ -f_{0i} - \sum_{h=1, h \ne i}^n f_{ih}, & \text{if } i = j. \end{cases}$$
Equivalently, if $L_F = \operatorname{diag}(F 1_n) - F$ is the Laplacian matrix of the compartmental digraph,
$$C = -L_F^\top - \operatorname{diag}(f_0) = F^\top - \operatorname{diag}(F 1_n + f_0). \tag{9.6}$$

In what follows it is convenient to call compartmental any matrix $C$ with the following properties:
(i) $C$ is Metzler, that is, $c_{ij} \ge 0$ for $i \ne j$,
(ii) $C$ has nonpositive diagonal entries, that is, $c_{ii} \le 0$ for all $i$, and
(iii) $C$ is column diagonally dominant, that is, $|c_{ii}| \ge \sum_{h=1, h \ne i}^n c_{hi}$ for all $i$.
With the notion of compartmental matrix, the dynamics of the linear compartmental system (9.5) can be written as
$$\dot{q}(t) = C q(t) + u. \tag{9.7}$$
Moreover, since $L_F 1_n = 0_n$, we know $1_n^\top C = -f_0^\top$ and, consistently with equation (9.4), we know $\frac{d}{dt} M(q(t)) = -f_0^\top q(t) + 1_n^\top u$.

Algebraic and graphical properties of linear compartmental systems


In this section we present useful properties of compartmental matrices, that are related to those enjoyed by
Laplacian and Metzler matrices.

Lemma 9.11 (Spectral properties of compartmental matrices). For a compartmental system $(F, f_0, u)$ with compartmental matrix $C$,

(i) if $\lambda \in \mathrm{spec}(C)$, then either $\lambda = 0$ or $\Re(\lambda) < 0$, and
(ii) $C$ is invertible if and only if $C$ is Hurwitz (i.e., $\Re(\lambda) < 0$ for all $\lambda \in \mathrm{spec}(C)$).

Proof. Statement (i) is akin to the result in Lemma 6.5 and can be proved by an application of the Geršgorin Disks Theorem 2.8. We invite the reader to fill out the details in Exercise E9.5. Statement (i) immediately implies statement (ii).

Next, we introduce some useful graph-theoretical notions, illustrated in Figure 9.6. In the compartmental digraph, a set of compartments $S$ is

(i) outflow-connected if there exists a directed path from every compartment in $S$ to the environment, that is, to a compartment $j$ with a positive outflow rate constant $f_{0j} > 0$,
(ii) inflow-connected if there exists a directed path from the environment to every compartment in $S$, that is, from a compartment $i$ with a positive inflow $u_i > 0$,
(iii) a trap if there is no directed path from any of the compartments in $S$ to the environment or to any compartment outside $S$, and
(iv) a simple trap if it is a trap that contains no traps strictly inside it.

It is immediate to realize the following equivalence: the system is outflow-connected (i.e., all compartments are outflow-connected) if and only if the system contains no trap.
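As a computational counterpart to this equivalence, one may test outflow-connectedness by a reachability search: every compartment must reach, along the edges of $G_F$, some compartment with a positive outflow rate. The following sketch (with a hypothetical two-compartment example) searches backwards from the outflow compartments.

import numpy as np

def is_outflow_connected(F, f0):
    n = F.shape[0]
    reached = set(j for j in range(n) if f0[j] > 0)   # direct outflow to environment
    stack = list(reached)
    while stack:
        j = stack.pop()
        for i in range(n):
            # edge (i, j) exists when f_ij > 0; then i reaches the environment via j
            if F[i, j] > 0 and i not in reached:
                reached.add(i)
                stack.append(i)
    return len(reached) == n

F = np.array([[0.0, 0.5],
              [0.0, 0.0]])                             # single flow 1 -> 2
print(is_outflow_connected(F, np.array([0.0, 0.3])))   # True: 1 -> 2 -> environment
print(is_outflow_connected(F, np.array([0.4, 0.0])))   # False: compartment 2 is a trap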

Theorem 9.12 (Algebraic graph theory of compartmental systems). Consider the linear compartmental system $(F, f_0, u)$ with dynamics (9.7), compartmental matrix $C$, and compartmental digraph $G_F$. The following statements are equivalent:


Figure 9.6: Outflow-connectivity and traps in compartmental systems. (a) An example compartmental system and its strongly connected components: this system is outflow-connected because its two sinks in the condensation digraph are outflow-connected. (b) This compartmental system is not outflow-connected because one of its sink strongly-connected components is a trap.

(i) the system is outflow-connected,
(ii) each sink of the condensation of $G_F$ is outflow-connected, and
(iii) the compartmental matrix $C$ is Hurwitz.

Moreover, the sinks of the condensation of $G_F$ that are not outflow-connected are precisely the simple traps of the system and their number equals the multiplicity of 0 as a semisimple eigenvalue of $C$.

Proof. The equivalence between statements (i) and (ii) is immediate.


To establish the equivalence between (ii) and (iii), we first consider the case in which $G_F$ is strongly connected and at least one compartment has a strictly positive outflow rate. Therefore, the Laplacian matrix $L_F$ of $G_F$ and the compartmental matrix $C = -L_F^\top - \mathrm{diag}(f_0)$ are irreducible. Pick $0 < \varepsilon < 1/\max_i |c_{ii}|$ and define $A = I_n + \varepsilon C^\top$. Because of the definition of $\varepsilon$, the matrix $A$ is nonnegative and irreducible. We compute its row-sums as follows:
\[
  A \mathbf{1}_n = \mathbf{1}_n + \varepsilon \big( -L_F - \mathrm{diag}(f_0) \big) \mathbf{1}_n = \mathbf{1}_n - \varepsilon f_0.
\]
Therefore, $A$ is row-substochastic, i.e., all its row-sums are at most 1 and at least one row-sum is strictly less than 1. Moreover, because $A$ is irreducible, Corollary 4.10 implies that $\rho(A) < 1$. Now, let $\alpha_1, \dots, \alpha_n$ denote the eigenvalues of $A$. Because $A = I_n + \varepsilon C^\top$, the eigenvalues $\gamma_1, \dots, \gamma_n$ of $C$ satisfy $\alpha_i = 1 + \varepsilon \gamma_i$, so that $\max_i \Re(\alpha_i) = 1 + \varepsilon \max_i \Re(\gamma_i)$. Finally, we note that $\rho(A) < 1$ implies $\max_i \Re(\alpha_i) < 1$, so that
\[
  \max_i \Re(\gamma_i) = \frac{1}{\varepsilon} \Big( \max_i \Re(\alpha_i) - 1 \Big) < 0.
\]

This concludes the proof that, if $G_F$ is strongly connected and at least one outflow rate is positive, then $C$ has eigenvalues with strictly negative real part. The converse is easy to prove by contradiction: if $f_0 = \mathbf{0}_n$, then the matrix $C$ satisfies $\mathbf{1}_n^\top C = \mathbf{0}_n^\top$ and is therefore singular, but this is a contradiction with the assumption that $C$ is invertible.
Next, to prove the equivalence between (ii) and (iii) for a graph $G_F$ whose condensation digraph has an arbitrary number of sinks, we proceed as in the proof of Theorem 6.6: we reorder the compartments as described in Exercise E3.1 so that the Laplacian matrix $L_F$ is block lower-triangular as in equation (6.5). We then define an appropriately small $\varepsilon$ and the matrix $A = I_n + \varepsilon C^\top$ as above. We leave the remaining details to the reader.
An alternative clever proof strategy for the equivalence between (ii) and (iii) is given as follows. Define the matrix
\[
  C_{\text{augmented}} = \begin{bmatrix} C & \mathbf{0}_n \\ f_0^\top & 0 \end{bmatrix} \in \mathbb{R}^{(n+1)\times(n+1)},
\]
and consider the augmented linear system $\dot x = C_{\text{augmented}}\, x$ with $x \in \mathbb{R}^{n+1}$. Note that $L_{\text{augmented}} = -C_{\text{augmented}}^\top$ is the Laplacian matrix of the graph whose nodes $\{1, \dots, n, n+1\}$ are the $n$ compartments and the environment as $(n+1)$st node, and whose edges are the edges of the compartmental digraph $G_F$ as well as the outflow edges to the environment. By Theorem 7.4, the Laplacian flow $\dot x = -L_{\text{augmented}}\, x$ with $x \in \mathbb{R}^{n+1}$ is semi-convergent, $\lim_{t\to\infty} x(t) = (w^\top x(0))\, \mathbf{1}_{n+1}$, and $w_1 = \dots = w_n = 0$ if and only if node $n+1$ (the environment) is globally reachable. Equivalently, the system $\dot x = C_{\text{augmented}}\, x$ satisfies $\lim_{t\to\infty} x(t) = (\mathbf{0}_n, \mathbf{1}_{n+1}^\top x_0)$ if and only if the environment is globally reachable (i.e., the system is outflow-connected). Equivalently, $(x_1, \dots, x_n)$ converges to zero (i.e., $C$ is Hurwitz) if and only if the system is outflow-connected.

Dynamic properties of linear compartmental systems


Consider a linear compartmental system $(F, f_0, u)$ with compartmental matrix $C$ and compartmental digraph $G_F$. Assuming the system has at least one trap, we define the reduced compartmental system $(F_{\text{rd}}, f_{0,\text{rd}}, u_{\text{rd}})$ as follows: remove all traps from $G_F$ and regard the edges into the trapping compartments as outflow edges into the environment, e.g., see Figure 9.7.

Figure 9.7: An example reduced compartmental system: (a) a compartmental system that is not outflow-connected; (b) the corresponding reduced compartmental system.

We now state our main result about the asymptotic behavior of linear compartmental systems.

Theorem 9.13 (Asymptotic behavior of compartmental systems). The linear compartmental system $(F, f_0, u)$ with compartmental matrix $C$ and compartmental digraph $G_F$ has the following possible asymptotic behaviors:


(i) if the system is outflow-connected, then the compartmental matrix $C$ is invertible, every solution tends exponentially to the unique equilibrium $q^* = -C^{-1} u \geq \mathbf{0}_n$, and the $i$th component satisfies $q_i^* > 0$ if and only if the $i$th compartment is inflow-connected to a positive inflow;
(ii) if the system contains one or more simple traps, then:

a) the reduced compartmental system $(F_{\text{rd}}, f_{0,\text{rd}}, u_{\text{rd}})$ is outflow-connected and all its solutions converge exponentially fast to the unique nonnegative equilibrium $-C_{\text{rd}}^{-1} u_{\text{rd}}$, for $C_{\text{rd}} = F_{\text{rd}}^\top - \mathrm{diag}(F_{\text{rd}} \mathbf{1}_n + f_{0,\text{rd}})$;
b) the mass in any simple trap $H$ is non-decreasing in time. If $H$ is inflow-connected to a positive inflow, then the mass inside $H$ goes to infinity. Otherwise, the mass inside $H$ converges to a scalar multiple of the right eigenvector corresponding to the eigenvalue 0 of the compartmental submatrix for $H$.

Proof. Statement (i) is an immediate consequence of Corollary 9.6. We leave the proof of statement (ii) to
the reader. 
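Statement (i) can also be checked numerically. The sketch below (the rates are hypothetical illustrative values) verifies that, for an outflow-connected example, $C$ is Hurwitz and that the closed-form solution $q(t) = e^{Ct} q(0) + C^{-1}(e^{Ct} - I_n)\, u$ approaches $q^* = -C^{-1} u \geq \mathbf{0}_n$.

import numpy as np
from scipy.linalg import expm

F = np.array([[0.0, 0.5, 0.0],
              [0.2, 0.0, 0.3],
              [0.0, 0.1, 0.0]])
f0 = np.array([0.4, 0.0, 0.2])            # compartments 1 and 3 leak to the environment
u = np.array([1.0, 0.0, 0.0])
C = F.T - np.diag(F @ np.ones(3) + f0)    # equation (9.6)

print(np.linalg.eigvals(C).real.max() < 0)    # True: C is Hurwitz
q_star = -np.linalg.solve(C, u)
print(q_star)                                  # the unique nonnegative equilibrium

t, q0 = 100.0, np.zeros(3)
q_t = expm(C * t) @ q0 + np.linalg.solve(C, expm(C * t) - np.eye(3)) @ u
print(np.allclose(q_t, q_star, atol=1e-6))     # True: exponential convergence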


9.4 Table of asymptotic behaviors for averaging and positive systems

Averaging system: $x(k+1) = A x(k)$, with $A$ row-stochastic.
Assumptions and asymptotic behavior: the associated digraph has a globally reachable node $\implies \lim_{k\to\infty} x(k) = (w^\top x(0))\, \mathbf{1}_n$, where $w \geq 0$ is the left eigenvector of $A$ with eigenvalue 1 satisfying $\mathbf{1}_n^\top w = 1$.
References: convergence properties in Theorem 5.2; examples: opinion dynamics and averaging in Chapter 1.

Affine system: $x(k+1) = A x(k) + b$.
Assumptions and asymptotic behavior: $A$ convergent (that is, its spectral radius is less than 1) $\implies \lim_{k\to\infty} x(k) = (I_n - A)^{-1} b$.
References: convergence properties in Exercise E2.11; examples: Friedkin-Johnsen system in Exercise E5.6.

Positive affine system: $x(k+1) = A x(k) + b$, with $A \geq 0$, $b \geq \mathbf{0}_n$.
Assumptions and asymptotic behavior: $x(0) \geq \mathbf{0}_n \implies x(k) \geq \mathbf{0}_n$ for all $k$; and $A$ convergent (that is, $|\lambda| < 1$ for all $\lambda \in \mathrm{spec}(A)$) $\implies \lim_{k\to\infty} x(k) = (I_n - A)^{-1} b \geq \mathbf{0}_n$.
References: positivity properties in Exercise E9.9; examples: Leslie population model in Exercise E4.12.

Table 9.1: Discrete-time systems

Averaging system: $\dot x(t) = -L x(t)$, with $L$ a Laplacian matrix.
Assumptions and asymptotic behavior: the associated digraph has a globally reachable node $\implies \lim_{t\to\infty} x(t) = (w^\top x(0))\, \mathbf{1}_n$, where $w \geq 0$ is the left eigenvector of $L$ with eigenvalue 0 satisfying $\mathbf{1}_n^\top w = 1$.
References: convergence properties in Theorem 7.4; examples: flocking system in Section 7.1.1.

Affine system: $\dot x(t) = A x(t) + b$.
Assumptions and asymptotic behavior: $A$ Hurwitz (that is, its spectral abscissa is negative) $\implies \lim_{t\to\infty} x(t) = -A^{-1} b$.
References: convergence properties in Exercise E7.2.

Positive affine system: $\dot x(t) = A x(t) + b$, with $A$ Metzler, $b \geq \mathbf{0}_n$.
Assumptions and asymptotic behavior: $x(0) \geq \mathbf{0}_n \implies x(t) \geq \mathbf{0}_n$ for all $t$; and $A$ Hurwitz (that is, $\Re(\lambda) < 0$ for all $\lambda \in \mathrm{spec}(A)$) $\implies \lim_{t\to\infty} x(t) = -A^{-1} b \geq \mathbf{0}_n$.
References: positivity properties in Theorem 9.3 and Corollary 9.6; example: compartmental systems in Section 9.1.

Table 9.2: Continuous-time systems


9.5 Exercises
E9.1 The matrix exponential of a Metzler matrix. In this exercise we extend and adapt Theorem 7.2 about the matrix exponential of a Laplacian matrix to the setting of Metzler matrices.
Let $M$ be an $n \times n$ Metzler matrix with minimum diagonal entry $m_{\min} = \min\{m_{11}, \dots, m_{nn}\}$. As usual, associate to $M$ a digraph $G$ without self-loops in the natural way, that is, $(i,j)$ is an edge if and only if $m_{ij} > 0$. Prove that
(i) $\exp(M) \geq e^{m_{\min}} I_n \geq 0$, for any digraph $G$,
(ii) $\exp(M)\, e_j > 0$, for a digraph $G$ whose $j$-th node is globally reachable,
(iii) $\exp(M) > 0$, for a strongly connected digraph $G$ (i.e., for an irreducible $M$).
Moreover, prove that, for any square matrix $A$,
(iv) $\exp(At) \geq 0$ for all $t \geq 0$ if and only if $A$ is Metzler.
E9.2 Proof of the Perron-Frobenius Theorem for Metzler matrices. Prove Theorem 9.4.
E9.3 Metzler invariance under nonnegative change of basis. Consider a positive system with Metzler matrix $A$ and constant input $b \geq 0$:
\[ \dot x = A x + b. \]
Show that, under the change of basis
\[ z = T^{-1} x, \]
with $T \geq 0$ invertible and $T^{-1} \geq 0$, the transformed matrix $T^{-1} A T$ is also Metzler.
E9.4 Equilibrium points for positive systems. Consider two continuous-time positive affine systems
\[
  \dot x = A x + b, \qquad \dot{\hat x} = \hat A \hat x + \hat b.
\]
Assume that $A$ and $\hat A$ are Hurwitz and, by Corollary 9.6, let $x^*$ and $\hat x^*$ denote the equilibrium points of the two systems. Show that the inequalities $A \leq \hat A$ and $b \leq \hat b$ imply $x^* \leq \hat x^*$.
E9.5 Establishing the spectral properties of compartmental matrices. Prove Lemma 9.11 about the spectral
properties of compartmental matrices.
E9.6 Simple traps and strong connectivity. Show that a compartmental system that has no outflows and that is
a simple trap, is strongly connected.
E9.7 On Metzler matrices and compartmental systems with growth and decay. Let $M$ be an $n \times n$ symmetric Metzler matrix. Recall Lemma 9.11 and define $v \in \mathbb{R}^n$ by $M = -L + \mathrm{diag}(v)$, where $L$ is a symmetric Laplacian matrix. Show that:
(i) if $M$ is Hurwitz, then $\mathbf{1}_n^\top v < 0$.

Next, assume $n = 2$ and assume $v$ has both nonnegative and nonpositive entries. (If $v$ is nonnegative, lack of stability can be established from statement (i); if $v$ is nonpositive, stability can be established via Theorem 9.12.) Show that
(ii) there exist nonnegative numbers $f$, $d$ and $g$ such that, modulo a permutation, $M$ can be written in the form:
\[
  M = f \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix} + \begin{bmatrix} g & 0 \\ 0 & -d \end{bmatrix}
    = \begin{bmatrix} -(f - g) & f \\ f & -(d + f) \end{bmatrix},
\]

(iii) $M$ is Hurwitz if and only if
\[
  d > g \qquad\text{and}\qquad f > \frac{gd}{d - g}.
\]
Note: The inequality $d > g$ (for $n = 2$) is equivalent to the inequality $\mathbf{1}_n^\top v < 0$ in statement (i). In the interpretation of compartmental systems with growth and decay rates, $f$ is a flow rate, $d$ is a decay rate and $g$ is a growth rate; statement (iii) is then interpreted as follows: $M$ is Hurwitz if and only if the decay rate is larger than the growth rate and the flow rate is sufficiently large.
E9.8 Sufficient condition for a Metzler matrix to be Hurwitz. Let $A$ be a symmetric irreducible Metzler matrix with zero row sums. Prove that, for any $i \in \{1, \dots, n\}$ and $\varepsilon > 0$, all eigenvalues of the matrix $A - \varepsilon\, e_i e_i^\top$ are negative.
E9.9 Nonnegative inverse. Let $A$ be a nonnegative square matrix and show that the following statements are equivalent:
(i) $\lambda > \rho(A)$, and
(ii) the matrix $(\lambda I_n - A)$ is invertible and its inverse $(\lambda I_n - A)^{-1}$ is nonnegative.
Moreover, show that
(iii) if $A$ is irreducible and $\lambda > \rho(A)$, then $(\lambda I_n - A)^{-1}$ is positive.
(Given a square matrix $A$, the map $\lambda \mapsto (\lambda I_n - A)^{-1}$ is sometimes referred to as the resolvent of $A$.)
E9.10 Mean residence time for a particle in a compartmental system. Consider an outflow-connected compartmental system with irreducible matrix $C$ and spectral abscissa $\alpha(C) < 0$. Let $v$ be the dominant eigenvector of $C$, that is, $Cv = \alpha(C)\, v$, $\mathbf{1}_n^\top v = 1$, and $v > 0$.
Assume a tagged particle is randomly located inside the compartmental system at time 0 with probability mass function $v$. The mean residence time (mrt) of the tagged particle is the expected time that the particle remains inside the compartmental system.
Using the definition of expectation, the mean residence time is
\[
  \text{mrt} = \int_0^\infty t\; \mathbb{P}[\text{particle leaves at time } t]\, dt.
\]
Let us also take for granted that:
\[
  \mathbb{P}[\text{particle leaves at time } t] = -\frac{d}{dt}\, \mathbb{P}[\text{particle inside at time } t].
\]
Show that
\[
  \text{mrt} = -\frac{1}{\alpha(C)}.
\]
E9.11 Resistive circuits as compartmental systems. Consider the RC circuit model presented in Section 7.1.2 and assume that the circuit is connected. Attach to at least one node $j \in \{1, \dots, n\}$ an external current $c_{\text{injected at } j} > 0$, and connect to at least one node $i \in \{1, \dots, n\}$ a positive resistor to ground.
(i) Model the resulting system as a compartmental system, that is, write the compartmental matrix, the inflow vector and the outflow rate vector, and
(ii) show that there exists a unique steady state that is globally-asymptotically stable and nonnegative.
E9.12 Solutions of partial differential equations (Luenberger 1979, Chapter 6). The electric potential $V$ within a two-dimensional enclosure is governed by Laplace's partial differential equation:
\[
  \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} = 0, \tag{E9.1}
\]


combined with the value of V along the boundary of the enclosure; see the left image in Figure E9.1.

Figure E9.1: Laplace's equation over a rectangular enclosure (left) and a regular Cartesian grid with spacing $h$, interior potentials $V_1, \dots, V_8$, and boundary values $b_1, \dots, b_{12}$ (right).
For arbitrary enclosures and boundary conditions, it is impossible to solve Laplace's equation in closed form. An approximate solution is computed by (i) introducing a regular Cartesian grid of points with spacing $h$, e.g., see the right image in Figure E9.1, and (ii) approximating the second-order derivatives by second-order finite differences. Specifically, at node 2 of the grid, we have along the $x$ direction
\[
  \frac{\partial^2 V}{\partial x^2}(V_2) \approx \frac{1}{h^2}(V_3 - V_2) - \frac{1}{h^2}(V_2 - V_1) = \frac{1}{h^2}(V_3 + V_1 - 2V_2),
\]
so that equation (E9.1) is approximated as follows:
\[
  0 = \frac{\partial^2 V}{\partial x^2}(V_2) + \frac{\partial^2 V}{\partial y^2}(V_2) \approx \frac{1}{h^2}(V_1 + V_3 + V_6 + b_2 - 4V_2)
  \quad\implies\quad 4V_2 = V_1 + V_3 + V_6 + b_2.
\]
This approximation translates into the matrix equation:
\[
  4V = A_{\text{grid}}\, V + C_{\text{grid-boundary}}\, b, \tag{E9.2}
\]
where $V \in \mathbb{R}^n$ is the vector of unknown potentials, $b \in \mathbb{R}^m$ is the vector of boundary conditions, $A_{\text{grid}} \in \{0,1\}^{n\times n}$ is the binary adjacency matrix of the (interior) grid graph (that is, $(A_{\text{grid}})_{ij} = 1$ if and only if the interior nodes $i$ and $j$ are connected), and $C_{\text{grid-boundary}} \in \{0,1\}^{n\times m}$ is the connection matrix between interior and boundary nodes (that is, $(C_{\text{grid-boundary}})_{i\ell} = 1$ if and only if grid interior node $i$ is connected with boundary node $\ell$). Show that
(i) $A_{\text{grid}}$ is irreducible but not primitive,
(ii) $\rho(A_{\text{grid}}) < 4$,
Hint: Recall Theorem 4.8.
(iii) there exists a unique solution $V^*$ to equation (E9.2),
(iv) the unique solution $V^*$ satisfies $V^* \geq \mathbf{0}_n$ if $b \geq \mathbf{0}_m$, and
(v) each solution to the following iteration converges to $V^*$:
\[
  4V(k+1) = A_{\text{grid}}\, V(k) + C_{\text{grid-boundary}}\, b,
\]
whereby, at each step, the value of $V$ at each node is updated to be equal to the average of its neighboring nodes.

Part II

Topics in Averaging Systems

Chapter 10
Convergence Rates, Scalability and
Optimization

In this chapter we discuss the convergence rate of averaging algorithms. We borrow ideas from (Xiao and
Boyd 2004; Carli et al. 2009; Garin and Schenato 2010; Fagnani 2014). We focus on discrete-time systems
and their convergence factors. The study of continuous-time systems is analogous.
Before proceeding, we recall a few basic facts. Given a square matrix $A$,
(i) the spectral radius of $A$ is $\rho(A) = \max\{ |\lambda| \mid \lambda \in \mathrm{spec}(A) \}$;
(ii) the $p$-induced norm of $A$, for $p \in \mathbb{N} \cup \{\infty\}$, is
\[
  \|A\|_p = \max\big\{ \|Ax\|_p \mid x \in \mathbb{R}^n \text{ and } \|x\|_p = 1 \big\} = \max_{x \neq \mathbf{0}_n} \frac{\|Ax\|_p}{\|x\|_p},
\]
and, specifically, the induced 2-norm of $A$ is $\|A\|_2 = \max\{ \sqrt{\lambda} \mid \lambda \in \mathrm{spec}(A^\top A) \}$;
(iii) for any $p$, $\rho(A) \leq \|A\|_p$; and
(iv) if $A = A^\top$, then $\|A\|_2 = \rho(A)$.

Definition 10.1 (Essential spectral radius of a row-stochastic matrix). The essential spectral radius of a row-stochastic matrix $A$ is
\[
  \rho_{\text{ess}}(A) = \begin{cases} 0, & \text{if } \mathrm{spec}(A) = \{1, \dots, 1\}, \\ \max\{ |\lambda| \mid \lambda \in \mathrm{spec}(A) \setminus \{1\} \}, & \text{otherwise}. \end{cases}
\]
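A direct numerical evaluation of this definition is straightforward; the following sketch (with an illustrative 3-node averaging matrix) removes the eigenvalues equal to one and reports the largest remaining magnitude.

import numpy as np

def essential_spectral_radius(A, tol=1e-9):
    eigs = np.linalg.eigvals(A)
    rest = [lam for lam in eigs if abs(lam - 1) > tol]   # spec(A) \ {1}
    return max(abs(lam) for lam in rest) if rest else 0.0

A = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.5, 0.5]])
print(essential_spectral_radius(A))   # 0.5 for this example (spectrum {1, 0.5, 0})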

10.1 Some preliminary calculations and observations


The convergence factor for symmetric row-stochastic matrices To build some intuition about the
general case, we start with a weighted undirected graph G with adjacency matrix A that is row-stochastic
and primitive (i.e., the graph G, viewed as a digraph, is strongly connected and aperiodic). We consider the
corresponding discrete-time averaging algorithm
x(k + 1) = Ax(k).


Note that $G$ undirected implies that $A$ is symmetric. Therefore, $A$ has real eigenvalues $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_n$ and corresponding orthonormal eigenvectors $v_1, \dots, v_n$. Because $A$ is row-stochastic, $\lambda_1 = 1$ and $v_1 = \mathbf{1}_n/\sqrt{n}$. Next, along the same lines of the modal decomposition given in Section 2.1, we know that the solution can be decoupled into $n$ independent evolution equations, so that
\[
  x(k) = \text{average}(x(0))\, \mathbf{1}_n + \lambda_2^k (v_2^\top x(0))\, v_2 + \dots + \lambda_n^k (v_n^\top x(0))\, v_n.
\]
Moreover, $A$ being primitive implies that $\max\{|\lambda_2|, \dots, |\lambda_n|\} < 1$. Specifically, for a symmetric and primitive $A$, we have $\rho_{\text{ess}}(A) = \max\{|\lambda_2|, |\lambda_n|\} < 1$. Therefore, as predicted by Corollary 2.15,
\[
  \lim_{k\to\infty} x(k) = \mathbf{1}_n \mathbf{1}_n^\top x(0)/n = \text{average}(x(0))\, \mathbf{1}_n.
\]

To upper bound the error, since the vectors $v_1, \dots, v_n$ are orthonormal, we compute
\begin{align*}
  \big\| x(k) - \text{average}(x(0))\, \mathbf{1}_n \big\|_2
  &= \Big\| \sum_{j=2}^n \lambda_j^k (v_j^\top x(0))\, v_j \Big\|_2
   = \Big( \sum_{j=2}^n |\lambda_j|^{2k} \big\| (v_j^\top x(0))\, v_j \big\|_2^2 \Big)^{1/2} \\
  &\leq \rho_{\text{ess}}(A)^k \Big( \sum_{j=2}^n \big\| (v_j^\top x(0))\, v_j \big\|_2^2 \Big)^{1/2}
   = \rho_{\text{ess}}(A)^k\, \big\| x(0) - \text{average}(x(0))\, \mathbf{1}_n \big\|_2, \tag{10.1}
\end{align*}
where the second and last equalities follow from Pythagoras' Theorem.


In summary, we have learned that, for symmetric matrices, the essential spectral radius ess (A) < 1 is
the convergence factor to average consensus, i.e., the factor determining the exponential convergence of
the error to zero. (The wording convergence factor is for discrete-time systems, whereas the wording
convergence rate is for continuous-time systems.)

A note on convergence factors for asymmetric matrices Consider now the asymmetric matrix
\[
  A_{\text{large-gain}} = \begin{bmatrix} 0.1 & 10^{10} \\ 0 & 0.1 \end{bmatrix}.
\]

Clearly, the two eigenvalues are $0.1$ and so is the spectral radius. This is therefore a convergent matrix. It is however false that the evolution of the system
\[
  x(k+1) = A_{\text{large-gain}}\, x(k),
\]
with an initial condition with non-zero second entry, satisfies a bound of the form in equation (10.1). It is still true, of course, that the solution does eventually converge to zero exponentially fast.
The problem is that the eigenvalues (alone) of a non-symmetric matrix do not fully describe the state amplification that may take place during a transient period of time. (Note that the 2-norm of $A_{\text{large-gain}}$ is of order $10^{10}$.)
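This transient amplification is easy to observe numerically; in the sketch below, the norm of the state first jumps to order $10^{10}$ before the eventual geometric decay with factor $0.1$ takes over.

import numpy as np

A = np.array([[0.1, 1e10],
              [0.0, 0.1]])
x = np.array([0.0, 1.0])          # initial condition with non-zero second entry
for k in range(1, 4):
    x = A @ x
    print(k, np.linalg.norm(x))   # approx 1e10, 2e9, 3e8: huge transient first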


10.2 Convergence factors for row-stochastic matrices


Consider a discrete-time averaging algorithm (distributed linear averaging)
\[
  x(k+1) = A x(k),
\]
where $A$ is doubly-stochastic and not necessarily symmetric. If $A$ is primitive (i.e., the associated digraph is aperiodic and strongly connected), we know
\[
  \lim_{k\to\infty} x(k) = \text{average}(x(0))\, \mathbf{1}_n = (\mathbf{1}_n \mathbf{1}_n^\top / n)\, x(0).
\]

We now define two possible notions of convergence factors. The per-step convergence factor is
\[
  r_{\text{step}}(A) = \sup_{x(k) \neq x_{\text{final}}} \frac{\| x(k+1) - x_{\text{final}} \|_2}{\| x(k) - x_{\text{final}} \|_2},
\]
where $x_{\text{final}} = \text{average}(x(0))\, \mathbf{1}_n = \text{average}(x(k))\, \mathbf{1}_n$ and where the supremum is taken over any possible sequence. Moreover, the asymptotic convergence factor is
\[
  r_{\text{asym}}(A) = \sup_{x(0) \neq x_{\text{final}}} \lim_{k\to\infty} \left( \frac{\| x(k) - x_{\text{final}} \|_2}{\| x(0) - x_{\text{final}} \|_2} \right)^{1/k}.
\]

Given these definitions and the preliminary calculations in the previous Section 10.1, we can now state our main results.

Theorem 10.2 (Convergence factor and solution bounds). Let $A$ be doubly-stochastic and primitive.

(i) The convergence factors of $A$ satisfy
\[
  r_{\text{step}}(A) = \| A - \mathbf{1}_n \mathbf{1}_n^\top / n \|_2, \qquad
  r_{\text{asym}}(A) = \rho_{\text{ess}}(A) = \rho(A - \mathbf{1}_n \mathbf{1}_n^\top / n) < 1. \tag{10.2}
\]
Moreover, $r_{\text{asym}}(A) \leq r_{\text{step}}(A)$, and $r_{\text{step}}(A) = r_{\text{asym}}(A)$ if $A$ is symmetric.
(ii) For any initial condition $x(0)$ with corresponding $x_{\text{final}} = \text{average}(x(0))\, \mathbf{1}_n$,
\[
  \| x(k) - x_{\text{final}} \|_2 \leq r_{\text{step}}(A)^k\, \| x(0) - x_{\text{final}} \|_2, \tag{10.3}
\]
\[
  \| x(k) - x_{\text{final}} \|_2 \leq c\, \big( r_{\text{asym}}(A) + \varepsilon \big)^k\, \| x(0) - x_{\text{final}} \|_2, \tag{10.4}
\]
where $\varepsilon > 0$ is an arbitrarily small constant and $c$ is a sufficiently large constant independent of $x(0)$.

Note: A sufficient condition for rstep (A) < 1 is given in Exercise E10.1.
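Both factors in equation (10.2) are one-liners to evaluate; the sketch below uses an illustrative symmetric doubly-stochastic matrix, for which the two factors coincide, as predicted by statement (i).

import numpy as np

A = np.array([[0.5, 0.25, 0.25],
              [0.25, 0.5, 0.25],
              [0.25, 0.25, 0.5]])
n = A.shape[0]
E = A - np.ones((n, n)) / n

r_step = np.linalg.norm(E, 2)              # induced 2-norm of A - 1_n 1_n^T / n
r_asym = max(abs(np.linalg.eigvals(E)))    # spectral radius of A - 1_n 1_n^T / n
print(r_step, r_asym)                      # both equal 0.25 here (A symmetric)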
Before proving Theorem 10.2, we introduce an interesting intermediate result. For $x_{\text{final}} = \text{average}(x(0))\, \mathbf{1}_n$, the disagreement vector is the error signal
\[
  \delta(k) = x(k) - x_{\text{final}}. \tag{10.5}
\]


Lemma 10.3 (Disagreement or error dynamics). Given a doubly-stochastic matrix $A$, the disagreement vector $\delta(k)$ satisfies

(i) $\delta(k) \perp \mathbf{1}_n$ for all $k$,
(ii) $\delta(k+1) = \big( A - \mathbf{1}_n \mathbf{1}_n^\top / n \big)\, \delta(k)$,
(iii) the following properties are equivalent:

a) $\lim_{k\to\infty} A^k = \mathbf{1}_n \mathbf{1}_n^\top / n$, (that is, the averaging algorithm achieves average consensus)
b) $A$ is primitive, (that is, the digraph is aperiodic and strongly connected)
c) $\rho(A - \mathbf{1}_n \mathbf{1}_n^\top / n) < 1$. (that is, the error dynamics is convergent)

Proof. To study the error dynamics, note that $\mathbf{1}_n^\top x(k+1) = \mathbf{1}_n^\top A x(k)$ and, in turn, that $\mathbf{1}_n^\top x(k) = \mathbf{1}_n^\top x(0)$; see also Exercise E7.8. Therefore, $\text{average}(x(0)) = \text{average}(x(k))$ and $\delta(k) \perp \mathbf{1}_n$ for all $k$. This completes the proof of statement (i). To prove statement (ii), we compute
\[
  \delta(k+1) = A x(k) - x_{\text{final}} = A x(k) - (\mathbf{1}_n \mathbf{1}_n^\top / n)\, x(k) = \big( A - \mathbf{1}_n \mathbf{1}_n^\top / n \big)\, x(k),
\]
and the equation in statement (ii) follows from $\big( A - \mathbf{1}_n \mathbf{1}_n^\top / n \big) \mathbf{1}_n = \mathbf{0}_n$.
Next, let us prove the equivalence among the three properties. From the Perron–Frobenius Theorem 2.12 for primitive matrices in Chapter 2 and from Corollary 2.15, we know that $A$ primitive (statement (iii)b) implies average consensus (statement (iii)a). The converse is true because $\mathbf{1}_n \mathbf{1}_n^\top / n$ is a positive matrix and, by the definition of limit, there must exist $k$ such that each entry of $A^k$ becomes positive.
Finally, we prove the equivalence between statements (iii)a and (iii)c. First, note that $P = I_n - \mathbf{1}_n \mathbf{1}_n^\top / n$ is a projection matrix, that is, $P^2 = P$. This can be easily verified by expanding the matrix power $P^2$. Second, let us prove a useful identity:
\begin{align*}
  A^k - \mathbf{1}_n \mathbf{1}_n^\top / n &= A^k (I_n - \mathbf{1}_n \mathbf{1}_n^\top / n) && \text{(because $A$ is row-stochastic)} \\
  &= A^k (I_n - \mathbf{1}_n \mathbf{1}_n^\top / n)^k && \text{(because $I_n - \mathbf{1}_n \mathbf{1}_n^\top / n$ is a projection)} \\
  &= \big( A (I_n - \mathbf{1}_n \mathbf{1}_n^\top / n) \big)^k = \big( A - \mathbf{1}_n \mathbf{1}_n^\top / n \big)^k.
\end{align*}
The statement follows from taking the limit as $k \to \infty$ in this identity and by recalling that a matrix is convergent if and only if its spectral radius is less than one.

We are now ready to prove the main theorem in this section.

Proof of Theorem 10.2. Regarding the equalities (10.2), the formula for $r_{\text{step}}$ is a consequence of the definition of induced 2-norm:
\[
  r_{\text{step}}(A) = \sup_{x(k) \neq x_{\text{final}}} \frac{\| x(k+1) - x_{\text{final}} \|_2}{\| x(k) - x_{\text{final}} \|_2}
  = \sup_{\delta(k) \perp \mathbf{1}_n} \frac{\| \delta(k+1) \|_2}{\| \delta(k) \|_2}
  = \sup_{\delta(k) \perp \mathbf{1}_n} \frac{\| (A - \mathbf{1}_n \mathbf{1}_n^\top / n)\, \delta(k) \|_2}{\| \delta(k) \|_2}
  = \sup_{y \neq \mathbf{0}_n} \frac{\| (A - \mathbf{1}_n \mathbf{1}_n^\top / n)\, y \|_2}{\| y \|_2},
\]
where the last equality follows from $(A - \mathbf{1}_n \mathbf{1}_n^\top / n)\, \mathbf{1}_n = \mathbf{0}_n$.


The equality $r_{\text{asym}}(A) = \rho(A - \mathbf{1}_n \mathbf{1}_n^\top / n)$ is a consequence of the error dynamics in Lemma 10.3, statement (ii).
Next, note that $\rho(A) = 1$ is a simple eigenvalue and $A$ is semiconvergent. Hence, by Exercise E2.2 on the Jordan normal form of $A$, there exists a nonsingular $T$ such that
\[
  A = T \begin{bmatrix} 1 & \mathbf{0}_{n-1}^\top \\ \mathbf{0}_{n-1} & B \end{bmatrix} T^{-1},
\]
where $B \in \mathbb{R}^{(n-1)\times(n-1)}$ is convergent, that is, $\rho(B) < 1$. Moreover, we know $\rho_{\text{ess}}(A) = \rho(B)$.
Usual properties of similarity transformations imply
\[
  A^k = T \begin{bmatrix} 1 & \mathbf{0}_{n-1}^\top \\ \mathbf{0}_{n-1} & B^k \end{bmatrix} T^{-1}
  \quad\implies\quad
  \lim_{k\to\infty} A^k = T \begin{bmatrix} 1 & \mathbf{0}_{n-1}^\top \\ \mathbf{0}_{n-1} & 0_{(n-1)\times(n-1)} \end{bmatrix} T^{-1}.
\]
Because $A$ is doubly-stochastic and primitive, we know $\lim_{k\to\infty} A^k = \mathbf{1}_n \mathbf{1}_n^\top / n$, so that $A$ can be decomposed as
\[
  A = \mathbf{1}_n \mathbf{1}_n^\top / n + T \begin{bmatrix} 0 & \mathbf{0}_{n-1}^\top \\ \mathbf{0}_{n-1} & B \end{bmatrix} T^{-1},
\]
and we conclude with $\rho_{\text{ess}}(A) = \rho(B) = \rho(A - \mathbf{1}_n \mathbf{1}_n^\top / n)$. This concludes the proof of the equalities (10.2).
The bound (10.3) is an immediate consequence of the definition of induced norm.
Finally, we leave to the reader the proof of the bound (10.4) in Exercise E10.3. Note that the arbitrarily-small positive parameter $\varepsilon$ is required because the eigenvalue corresponding to the essential spectral radius may have an algebraic multiplicity strictly larger than its geometric multiplicity.

10.3 Cumulative quadratic index for symmetric matrices


The previous convergence metrics (per-step convergence factor and asymptotic convergence factor) are worst-case convergence metrics (both are defined with a supremum operation) that are achieved only for particular initial conditions, e.g., the performance predicted by the asymptotic metric $r_{\text{asym}}(A)$ is achieved when $x(0) - x_{\text{final}}$ is aligned with the eigenvector associated to $\rho_{\text{ess}}(A) = \rho(A - \mathbf{1}_n \mathbf{1}_n^\top / n)$. However, the average and transient performance may be much better.
To study an appropriate average performance, we follow the treatment in (Carli et al. 2009). We consider an averaging algorithm
\[
  x(k+1) = A x(k),
\]
defined by a row-stochastic matrix $A$ and subject to random initial conditions $x_0$ satisfying
\[
  \mathbb{E}[x_0] = \mathbf{0}_n, \quad\text{and}\quad \mathbb{E}[x_0 x_0^\top] = I_n.
\]
Recall the disagreement vector $\delta(k)$ defined in (10.5) and the associated disagreement dynamics
\[
  \delta(k+1) = \big( A - \mathbf{1}_n \mathbf{1}_n^\top / n \big)\, \delta(k),
\]
and observe that the initial conditions of the disagreement vector $\delta(0)$ satisfy
\[
  \mathbb{E}[\delta(0)] = \mathbf{0}_n \quad\text{and}\quad \mathbb{E}[\delta(0)\delta(0)^\top] = I_n - \mathbf{1}_n \mathbf{1}_n^\top / n.
\]


To define an average transient and asymptotic performance of this averaging algorithm, we define the cumulative quadratic index of the matrix $A$ by
\[
  J_{\text{cum}}(A) = \lim_{K\to\infty} \frac{1}{n} \sum_{k=0}^K \mathbb{E}\big[ \|\delta(k)\|_2^2 \big]. \tag{10.6}
\]
Theorem 10.4 (Cumulative quadratic index for symmetric matrices). The cumulative quadratic index (10.6) of a row-stochastic, primitive, and symmetric matrix $A$ satisfies
\[
  J_{\text{cum}}(A) = \frac{1}{n} \sum_{\lambda \in \mathrm{spec}(A)\setminus\{1\}} \frac{1}{1 - \lambda^2}.
\]
Proof. Pick a terminal time $K \in \mathbb{N}$ and define $J_K(A) = \frac{1}{n} \sum_{k=0}^K \mathbb{E}\big[ \|\delta(k)\|_2^2 \big]$. From the definition (10.6) and the disagreement dynamics, we compute
\begin{align*}
  J_K(A) &= \frac{1}{n} \sum_{k=0}^K \mathrm{trace}\Big( \mathbb{E}\big[ \delta(k)\delta(k)^\top \big] \Big) \\
  &= \frac{1}{n} \sum_{k=0}^K \mathrm{trace}\Big( \big( A - \mathbf{1}_n \mathbf{1}_n^\top / n \big)^k\; \mathbb{E}\big[ \delta(0)\delta(0)^\top \big]\; \Big( \big( A - \mathbf{1}_n \mathbf{1}_n^\top / n \big)^k \Big)^{\!\top} \Big) \\
  &= \frac{1}{n} \sum_{k=0}^K \mathrm{trace}\Big( \big( A - \mathbf{1}_n \mathbf{1}_n^\top / n \big)^k \Big( \big( A - \mathbf{1}_n \mathbf{1}_n^\top / n \big)^k \Big)^{\!\top} \Big).
\end{align*}
Because $A$ is symmetric, also the matrix $A - \mathbf{1}_n \mathbf{1}_n^\top / n$ is symmetric and can be diagonalized as $A - \mathbf{1}_n \mathbf{1}_n^\top / n = Q \Lambda Q^\top$, where $Q$ is orthogonal and $\Lambda$ is a diagonal matrix whose diagonal entries are the elements of $\mathrm{spec}\big( A - \mathbf{1}_n \mathbf{1}_n^\top / n \big) = \{0\} \cup \mathrm{spec}(A) \setminus \{1\}$. It follows that
\begin{align*}
  J_K(A) &= \frac{1}{n} \sum_{k=0}^K \mathrm{trace}\Big( Q \Lambda^k Q^\top \big( Q \Lambda^k Q^\top \big)^{\!\top} \Big) \\
  &= \frac{1}{n} \sum_{k=0}^K \mathrm{trace}\big( \Lambda^k \Lambda^k \big) && \text{(because trace$(AB)$ = trace$(BA)$)} \\
  &= \frac{1}{n} \sum_{k=0}^K \sum_{\lambda \in \mathrm{spec}(A)\setminus\{1\}} \lambda^{2k} \\
  &= \frac{1}{n} \sum_{\lambda \in \mathrm{spec}(A)\setminus\{1\}} \frac{1 - \lambda^{2(K+1)}}{1 - \lambda^2}. && \text{(because of the geometric series)}
\end{align*}
The formula for $J_{\text{cum}}$ follows from taking the limit as $K \to \infty$ and recalling that $A$ primitive implies $\rho_{\text{ess}}(A) < 1$.
Note: All eigenvalues of A appear in the computation of the cumulative quadratic index (10.6), not
only the dominant eigenvalue as in the asymptotic convergence factor. Similar results can be obtained
for normal matrices, as opposed to symmetric, as illustrated in (Carli et al. 2009); it is not known how to
compute the cumulative quadratic index for arbitrary doubly-stochastic primitive matrices.
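The eigenvalue formula of Theorem 10.4 can be cross-checked against a direct evaluation of the expectation: since $\mathbb{E}[\delta(k)\delta(k)^\top]$ obeys the recursion $M_{k+1} = (A - \mathbf{1}_n\mathbf{1}_n^\top/n)\, M_k\, (A - \mathbf{1}_n\mathbf{1}_n^\top/n)^\top$ with $M_0 = I_n - \mathbf{1}_n\mathbf{1}_n^\top/n$, a truncated sum of traces recovers $J_{\text{cum}}$. The matrix below is an illustrative example.

import numpy as np

A = np.array([[0.5, 0.25, 0.25],
              [0.25, 0.5, 0.25],
              [0.25, 0.25, 0.5]])
n = A.shape[0]

eigs = np.sort(np.linalg.eigvalsh(A))[:-1]      # spec(A) \ {1}
J_formula = np.sum(1.0 / (1.0 - eigs**2)) / n

E = A - np.ones((n, n)) / n
M = np.eye(n) - np.ones((n, n)) / n             # E[delta(0) delta(0)^T]
J_sum = 0.0
for k in range(200):
    J_sum += np.trace(M) / n                    # adds E[||delta(k)||_2^2] / n
    M = E @ M @ E.T
print(J_formula, J_sum)                         # agree up to the truncation at k = 200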


10.4 Circulant network examples and scalability analysis


In general it is difficult to compute explicitly the second largest eigenvalue magnitude for an arbitrary matrix. There are some graphs with constant essential spectral radius, independent of the network size $n$. For example, a complete graph with identical weights and doubly-stochastic adjacency matrix $A = \mathbf{1}_n \mathbf{1}_n^\top / n$ has $\rho_{\text{ess}}(A) = 0$. In this case, the associated averaging algorithm converges in a single step.
Next, we present an interesting family of examples where all eigenvalues are known. Recall the cyclic balancing problem from Section 1.4, where each bug feels an attraction towards the closest counterclockwise and clockwise neighbors. Given the angular distances between bugs $d_i = \theta_{i+1} - \theta_i$, for $i \in \{1,\dots,n\}$ (with the usual convention that $d_{n+1} = d_1$ and $d_0 = d_n$), the closed-loop system is $d(k+1) = A_{n,\gamma}\, d(k)$, where $\gamma \in [0, 1/2[$, and
\[
  A_{n,\gamma} = \begin{bmatrix}
    1-2\gamma & \gamma & 0 & \cdots & 0 & \gamma \\
    \gamma & 1-2\gamma & \gamma & \cdots & 0 & 0 \\
    0 & \gamma & 1-2\gamma & \ddots & \vdots & \vdots \\
    \vdots & \vdots & \ddots & \ddots & \ddots & 0 \\
    0 & 0 & \cdots & \ddots & 1-2\gamma & \gamma \\
    \gamma & 0 & \cdots & 0 & \gamma & 1-2\gamma
  \end{bmatrix}.
\]

This matrix is circulant, that is, each row-vector is equal to the preceding row-vector rotated one element to the right. The associated digraph is illustrated in Figure 10.1.

Figure 10.1: Digraph associated to the circulant matrix $A_{n,\gamma}$, for $n = 6$.

Circulant matrices have remarkable properties (Davis 1994). For example, from Exercise E10.4, the eigenvalues of $A_{n,\gamma}$ can be computed to be
(not ordered in magnitude)
\[
  \lambda_i = 2\gamma \cos\frac{2\pi(i-1)}{n} + (1 - 2\gamma), \quad \text{for } i \in \{1, \dots, n\}. \tag{10.7}
\]
An illustration is given in Figure 10.2. For $n$ even (similar results hold for $n$ odd), plotting the eigenvalues on the segment $[-1, 1]$ shows that
\[
  \rho_{\text{ess}}(A_{n,\gamma}) = \max\{ |\lambda_2|, |\lambda_{n/2+1}| \}, \quad\text{where}\quad \lambda_2 = 2\gamma \cos\frac{2\pi}{n} + (1-2\gamma), \quad\text{and}\quad \lambda_{n/2+1} = 1 - 4\gamma.
\]
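The scaling of the spectral gap is visible already for moderate $n$; the sketch below builds $A_{n,\gamma}$ directly, computes $\rho_{\text{ess}}$, and confirms that $n^2\big(1 - \rho_{\text{ess}}(A_{n,\gamma})\big)$ approaches $4\pi^2\gamma$.

import numpy as np

def A_circ(n, gamma):
    A = (1 - 2 * gamma) * np.eye(n)
    for i in range(n):
        A[i, (i + 1) % n] = gamma        # clockwise neighbor
        A[i, (i - 1) % n] = gamma        # counterclockwise neighbor
    return A

gamma = 0.25
for n in [8, 16, 32, 64]:
    lams = np.linalg.eigvalsh(A_circ(n, gamma))   # real: the matrix is symmetric
    rho_ess = max(abs(lams[:-1]))                 # drop the single eigenvalue 1
    print(n, (1 - rho_ess) * n**2)                # tends to 4 * pi^2 * gamma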


Figure 10.2: The eigenvalues of $A_{n,\gamma}$ as given in equation (10.7): the left panel plots the function $f_\gamma(x) = 2\gamma\cos(2\pi x) + (1-2\gamma)$ for various values of $\gamma$, and the right panel the eigenvalues $\lambda_i = f_\gamma((i-1)/n)$ for $n = 5$. The left figure illustrates also the case $\gamma = .5$, even if that value is strictly outside the allowed range $[0, .5[$.

If we fix $\gamma \in\, ]0, 1/2[$ and consider sufficiently large values of $n$, then $|\lambda_2| > |\lambda_{n/2+1}|$. In the limit of large graphs $n \to \infty$, the Taylor expansion $\cos(x) = 1 - x^2/2 + O(x^4)$ leads to
\[
  \rho_{\text{ess}}(A_{n,\gamma}) = 1 - 4\gamma\pi^2 \frac{1}{n^2} + O\Big( \frac{1}{n^4} \Big).
\]
Note that $\rho_{\text{ess}}(A_{n,\gamma}) < 1$ for any $n$, but the separation from $\rho_{\text{ess}}(A_{n,\gamma})$ to 1, called the spectral gap, shrinks with $1/n^2$.
In summary, this discussion leads to the broad statement that certain large-scale graphs have slow convergence factors. For more results along these lines (specifically, the elegant study of Cayley graphs), we refer to (Carli et al. 2008). These results can also be easily mapped to the eigenvalues of the associated Laplacian matrices; e.g., see Exercise E6.1.
We conclude this section by computing the cumulative quadratic cost introduced in Section 10.3. For the circulant network example, one can compute (Carli et al. 2009)
\[
  C_1\, n \leq J_{\text{cum}}(A_{n,\gamma}) \leq C_2\, n,
\]
where $C_1$ and $C_2$ are positive constants. It is instructive to compare this result with the worst-case asymptotic or per-step convergence factor that scales as $\rho_{\text{ess}}(A_{n,\gamma}) = 1 - 4\gamma\pi^2 \frac{1}{n^2}$.

10.5 Design of fastest distributed averaging


We are interested in optimization problems of the form:
\begin{align*}
  &\text{minimize} && r_{\text{asym}}(A) \text{ or } r_{\text{step}}(A) \\
  &\text{subject to} && A \text{ compatible with a digraph } G, \text{ doubly-stochastic and primitive}
\end{align*}
where $A$ is compatible with $G$ if its only non-zero entries correspond to the edges $E$ of the graph. In other words, if $E_{ij} = e_i e_j^\top$ is the matrix with entry $(i,j)$ equal to one and all other entries equal to zero, then $A = \sum_{(i,j)\in E} a_{ij} E_{ij}$ for arbitrary weights $a_{ij} \in \mathbb{R}$. We refer to such problems as fastest distributed averaging (FDA) problems.


Note: In what follows, we remove the constraint $A \geq 0$ to widen the set of matrices of interest. Accordingly, we remove the constraint of $A$ being primitive. Convergence to average consensus is guaranteed by (1) achieving convergence factors less than 1, (2) subject to row-sums and column-sums equal to 1.
Problem 1: Asymmetric FDA with asymptotic convergence factor
\begin{align*}
  &\text{minimize} && \rho\big( A - \mathbf{1}_n \mathbf{1}_n^\top / n \big) \\
  &\text{subject to} && A = \sum_{(i,j)\in E} a_{ij} E_{ij}, \quad A \mathbf{1}_n = \mathbf{1}_n, \quad \mathbf{1}_n^\top A = \mathbf{1}_n^\top
\end{align*}
The asymmetric FDA is a hard optimization problem. Even though the constraints are linear, the objective function, i.e., the spectral radius of a matrix, is not convex (and, additionally, not even Lipschitz continuous).
Problem 2: Asymmetric FDA with per-step convergence factor
\begin{align*}
  &\text{minimize} && \big\| A - \mathbf{1}_n \mathbf{1}_n^\top / n \big\|_2 \\
  &\text{subject to} && A = \sum_{(i,j)\in E} a_{ij} E_{ij}, \quad A \mathbf{1}_n = \mathbf{1}_n, \quad \mathbf{1}_n^\top A = \mathbf{1}_n^\top
\end{align*}

Problem 3: Symmetric FDA problem (recall that $A = A^\top$ implies $\rho(A) = \|A\|_2$):
\begin{align*}
  &\text{minimize} && \big\| A - \mathbf{1}_n \mathbf{1}_n^\top / n \big\|_2 \\
  &\text{subject to} && A = \sum_{(i,j)\in E} a_{ij} E_{ij}, \quad A = A^\top, \quad A \mathbf{1}_n = \mathbf{1}_n
\end{align*}

Both Problems 2 and 3 are convex and can be rewritten as so-called semi-definite programs (SDPs);
see (Xiao and Boyd 2004). An SDP is an optimization problem where (1) the variable is a positive semidefinite
matrix, (2) the objective function is linear, and (3) the constraints are affine equations. SDPs can be efficiently
solved by software tools such as CVX; see (Grant and Boyd 2014).
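As an illustration of Problem 3 (a sketch of my own, not the CVX code of the cited references), the symmetric FDA over a hypothetical 4-node path graph can be posed in CVXPY, whose norm(·, 2) on a matrix expression is the spectral norm; the self-loop weights are left free.

import cvxpy as cp
import numpy as np

n = 4
edges = {(0, 1), (1, 2), (2, 3)}                 # assumed undirected path graph

A = cp.Variable((n, n), symmetric=True)
constraints = [A @ np.ones(n) == np.ones(n)]     # A 1_n = 1_n (plus symmetry)
for i in range(n):
    for j in range(i + 1, n):
        if (i, j) not in edges:
            constraints.append(A[i, j] == 0)     # zero weights off the edge set

prob = cp.Problem(cp.Minimize(cp.norm(A - np.ones((n, n)) / n, 2)), constraints)
prob.solve()
print(prob.value)                                # optimal convergence factor
print(np.round(A.value, 3))                      # optimal weights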


10.6 Exercises
E10.1 Induced norm of certain doubly stochastic matrices. Assume $A$ is doubly stochastic, primitive and has a strictly-positive diagonal. Show that
\[
  r_{\text{step}}(A) = \| A - \mathbf{1}_n \mathbf{1}_n^\top / n \|_2 < 1.
\]

E10.2 Spectrum of $A - \mathbf{1}_n \mathbf{1}_n^\top / n$. Consider a matrix $A$ doubly stochastic, primitive and symmetric. Assume $1 = \lambda_1 \geq \dots \geq \lambda_n$ are its real eigenvalues with corresponding orthonormal eigenvectors $v_1, \dots, v_n$. Show that the matrix $A - \mathbf{1}_n \mathbf{1}_n^\top / n$ has eigenvalues $0, \lambda_2, \dots, \lambda_n$ with eigenvectors $v_1, \dots, v_n$.

E10.3 Bounds on the norm of a matrix power. Given a matrix $B \in \mathbb{R}^{n\times n}$ and an index $k \in \mathbb{N}$, show that
(i) there exists $c > 0$ such that
\[
  \| B^k \|_2 \leq c\, k^{n-1} \rho(B)^k,
\]
(ii) for all $\varepsilon > 0$, there exists $c_\varepsilon > 0$ such that
\[
  \| B^k \|_2 \leq c_\varepsilon\, (\rho(B) + \varepsilon)^k.
\]
Hint: Adopt the Jordan normal form.
E10.4 Eigenpairs for circulant matrices. Let $C \in \mathbb{C}^{n\times n}$ be circulant, that is, assume there exists a vector $(c_0, \dots, c_{n-1})$ such that
\[
  C = \begin{bmatrix}
    c_0 & c_1 & \cdots & c_{n-1} \\
    c_{n-1} & c_0 & \cdots & c_{n-2} \\
    \vdots & \vdots & \ddots & \vdots \\
    c_1 & c_2 & \cdots & c_0
  \end{bmatrix}.
\]
Show that
(i) the complex eigenvectors and eigenvalues of $C$ are, for $j \in \{0, \dots, n-1\}$,
\[
  v_j = \big( 1, \omega_j, \omega_j^2, \dots, \omega_j^{n-1} \big)^\top, \qquad
  \lambda_j = c_0 + c_1 \omega_j + c_2 \omega_j^2 + \dots + c_{n-1} \omega_j^{n-1},
\]
where $\omega_j = \exp\big( \frac{2\pi j \mathrm{i}}{n} \big)$, $j \in \{0, \dots, n-1\}$, are the $n$th roots of unity, and $\mathrm{i} = \sqrt{-1}$;
(ii) for $n$ even and $(c_0, c_1, \dots, c_{n-1}) = (1-2\gamma, \gamma, 0, \dots, 0, \gamma)$, the eigenvalues are $\lambda_i = 2\gamma \cos\frac{2\pi(i-1)}{n} + (1-2\gamma)$ for $i \in \{1, \dots, n\}$.
E10.5 Spectral gap of regular ring graphs. A $k$-regular ring graph is an undirected ring graph with $n$ nodes, each connected to itself and its $2k$ nearest neighbors with a uniform weight equal to $1/(2k+1)$. The associated doubly-stochastic adjacency matrix $A_{n,k}$ is a circulant matrix with first row given by
\[
  A_{n,k}(1,:) = \big[ \tfrac{1}{2k+1} \, \cdots \, \tfrac{1}{2k+1} \;\; 0 \, \cdots \, 0 \;\; \tfrac{1}{2k+1} \, \cdots \, \tfrac{1}{2k+1} \big].
\]
Using the results in Exercise E10.4, compute
(i) the eigenvalues of $A_{n,k}$ as a function of $n$ and $k$;
(ii) the limit of the spectral gap for fixed $k$ as $n \to \infty$; and
(iii) the limit of the spectral gap for $2k = n-1$ as $n \to \infty$.
E10.6 Properties of the spectral radius. For any $A \in \mathbb{C}^{n\times n}$ and any matrix norm, show


(i) (A) kAk, and


(ii) (A) kAk k1/k for all k,
(iii) (A) = limk kAk k1/k .
Next, for any A Cnn , let |A| denote the matrix with entries |aij |, and for any real matrices B, C, let
B C mean bij cij for each i and j. Show
(iv) if |A| B, then (A) (|A|) (B).
Hint: Peruse (Meyer 2001, Chapter 7).
E10.7 H2 performance of balanced averaging in continuous time. Consider the continuous-time averaging dynamics with disturbance
\[
  \dot x(t) = -L x(t) + w(t),
\]
where $L = L^\top$ is the Laplacian matrix of an undirected and connected graph and $w(t)$ is an exogenous disturbance input signal. Pick a matrix $Q \in \mathbb{R}^{p\times n}$ satisfying $Q \mathbf{1}_n = \mathbf{0}_p$ and define the output signal $y(t) = Q x(t) \in \mathbb{R}^p$ as the solution from zero initial conditions $x(0) = \mathbf{0}_n$. Define the system $H_2$ norm from $w$ to $y$ by
\[
  \|H\|_2^2 = \int_0^\infty y(t)^\top y(t)\, dt = \int_0^\infty x(t)^\top Q^\top Q\, x(t)\, dt = \mathrm{trace}\Big( \int_0^\infty H(t)^\top H(t)\, dt \Big),
\]
where $H(t) = Q\, e^{-Lt}$ is the so-called impulse response matrix.

(i) Show $\|H\|_2 = \sqrt{\mathrm{trace}(P)}$, where $P$ is the solution to the Lyapunov equation
\[
  L P + P L = Q^\top Q. \tag{E10.1}
\]
(ii) Show $\|H\|_2 = \sqrt{\mathrm{trace}\big( L^\dagger Q^\top Q \big)/2}$, where $L^\dagger$ is the pseudoinverse of $L$.
(iii) Define short-range and long-range output matrices $Q_{\text{sr}}$ and $Q_{\text{lr}}$ by $Q_{\text{sr}}^\top Q_{\text{sr}} = L$ and $Q_{\text{lr}}^\top Q_{\text{lr}} = I_n - \frac{1}{n}\mathbf{1}_n\mathbf{1}_n^\top$, respectively. Show:
\[
  \|H\|_2^2 =
  \begin{cases}
    \dfrac{n-1}{2}, & \text{for } Q = Q_{\text{sr}}, \\[1.5ex]
    \displaystyle \sum_{i=2}^n \frac{1}{2\lambda_i(L)}, & \text{for } Q = Q_{\text{lr}}.
  \end{cases}
\]
Hint: The $H_2$ norm has several interesting interpretations, including the total output signal energy in response to a unit impulse input or the root mean square of the output signal in response to a white noise input with identity covariance. You may find useful Theorem 7.4 and Exercise E6.9.
E10.8 Convergence rate for the Laplacian flow. Consider a weight-balanced, strongly connected digraph $G$ with self-loops, degree matrices $D_{\text{out}} = D_{\text{in}} = I_n$, doubly-stochastic adjacency matrix $A$, and Laplacian matrix $L$. Consider the associated Laplacian flow
\[
  \dot x(t) = -L x(t).
\]
For $x_{\text{ave}} := \frac{\mathbf{1}_n^\top x(0)}{n}$, define the disagreement vector by $\delta(t) = x(t) - x_{\text{ave}} \mathbf{1}_n$.
(i) Show that the average $t \mapsto \frac{\mathbf{1}_n^\top x(t)}{n}$ is conserved and that, consequently, $\mathbf{1}_n^\top \delta(t) = 0$ for all $t \geq 0$.
(ii) Derive the matrix $E$ describing the disagreement dynamics
\[
  \dot\delta(t) = E\, \delta(t).
\]

(iii) Describe the spectrum $\mathrm{spec}(E)$ of $E$ as a function of the spectrum $\mathrm{spec}(A)$ of the doubly-stochastic adjacency matrix $A$ associated with $G$. Show that $\mathrm{spec}(E)$ has a simple eigenvalue at $\lambda = 0$ with corresponding normalized eigenvector $v_1 := \mathbf{1}_n/\sqrt{n}$.
(iv) The Jordan form of $E$ can be described as follows:
\[
  E = P \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & J_2 & & \\ \vdots & & \ddots & \\ 0 & & & J_m \end{bmatrix} P^{-1}
    =: \begin{bmatrix} c_1 & C \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 0 & J \end{bmatrix} \begin{bmatrix} r_1 \\ R \end{bmatrix},
\]
where $c_1$ is the first column of $P$ and $r_1$ is the first row of $P^{-1}$. Show that
\[
  \delta(t) = C \exp(Jt)\, R\, \delta(0).
\]
(v) Use statements (iii) and (iv) to show that, for all $\varepsilon > 0$, there exists $C_\varepsilon > 0$ satisfying
\[
  \| \delta(t) \| \leq C_\varepsilon\, (e^{\mu} + \varepsilon)^t\, \| \delta(0) \|,
\]
where $\mu = \max\{ \Re(\lambda) - 1 \mid \lambda \in \mathrm{spec}(A)\setminus\{1\} \} < 0$. Show that, if $A = A^\top$, then $\mu \leq \rho_{\text{ess}}(A) - 1$.
Hint: Use arguments similar to those in Exercise E10.3 and in the proof of Theorem 7.4.
E10.9 Convergence factors in digraphs with equal out-degree. Consider the unweighted digraphs in Figure E10.1 with their associated discrete-time consensus protocols $x(t+1) = A_a x(t)$ and $x(t+1) = A_b x(t)$. For which digraph is the worst-case discrete-time consensus protocol (i.e., the evolution starting from the worst-case initial condition) guaranteed to converge faster? Assign to each edge the same weight equal to $1/3$.

Figure E10.1: Two example digraphs on nodes $\{1, 2, 3, 4\}$.

E10.10 Convergence estimates. Consider a discrete-time consensus network with $n = 4$ agents, state variable $x \in \mathbb{R}^4$, and dynamics $x(k+1) = A x(k)$, where $A = \sum_{i=1}^3 \lambda_i v_i v_i^\top \in \mathbb{R}^{4\times 4}$ is the adjacency matrix of the network and
\[
  \lambda_1 = 1, \quad \lambda_2 = \frac{1}{2}, \quad \lambda_3 = \frac{1}{4}, \qquad
  v_1 = \frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \quad
  v_2 = \frac{1}{\sqrt{2}}\begin{bmatrix} 0 \\ 1 \\ 0 \\ -1 \end{bmatrix}, \quad
  v_3 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 0 \\ -1 \\ 0 \end{bmatrix}.
\]
Suppose $x(0) = [0, 8, 2, 2]^\top$. Is it possible that $x(3) = [4, 3, 2, 3]^\top$?

Chapter 11
Time-varying Averaging Algorithms

In this chapter we discuss time-varying consensus algorithms. We borrow ideas from (Hendrickx 2008;
Bullo et al. 2009). Relevant references include (Tsitsiklis 1984; Tsitsiklis et al. 1986; Cao et al. 2008; Carli
et al. 2008).

11.1 Examples and models of time-varying discrete-time algorithms


In time-varying (or switching) averaging algorithms the row-stochastic matrix is not constant throughout time, but instead changes values and, possibly, switches among a finite number of values. Here are examples of discrete-time averaging algorithms with switching matrices.

11.1.1 Shared Communication Channel


Given a communication digraph $G_{\text{shared-comm}}$, at each communication round, only one node can transmit to all its out-neighbors over a common bus, and every receiving node implements a single averaging step. For example, if agent $j$ receives the message from agent $i$, then agent $j$ implements:
\[
  x_j^+ := \frac{1}{2}(x_i + x_j). \tag{11.1}
\]
Each node is allocated a communication slot in a periodic deterministic fashion, e.g., in a round-robin scheduling, where the $n$ agents are numbered and, for each $i$, agent $i$ talks only at times $i, n+i, 2n+i, \dots, kn+i$ for $k \in \mathbb{Z}_{\geq 0}$. For example, Figure 11.1 illustrates the communication digraph and Figure 11.2 the resulting round-robin communication protocol.

Figure 11.1: Example communication digraph $G_{\text{shared-comm}}$ on nodes $\{1, 2, 3, 4\}$.


Figure 11.2: Round-robin communication protocol: node 1 transmits at times $1, 5, 9, \dots$, node 2 at times $2, 6, 10, \dots$, node 3 at times $3, 7, 11, \dots$, and node 4 at times $4, 8, 12, \dots$.

Formally, let $A_i$ denote the averaging matrix corresponding to the transmission by agent $i$ to its out-neighbors. With round robin scheduling, we have
\[
  x(n+1) = A_n A_{n-1} \cdots A_1\, x(1).
\]
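The following simulation sketch implements this schedule; since the text specifies the digraph only through Figure 11.1, the out-neighbor lists below are an assumed stand-in for a strongly connected 4-node digraph. The iteration reaches consensus, but not average consensus, consistently with the discussion in Section 11.3.1.

import numpy as np

n = 4
out_neighbors = {0: [1], 1: [2, 3], 2: [0], 3: [2]}   # assumed digraph

def A_transmit(i):
    # averaging matrix when node i transmits: each out-neighbor j applies (11.1)
    A = np.eye(n)
    for j in out_neighbors[i]:
        A[j, j] = 0.5
        A[j, i] = 0.5
    return A

x = np.array([1.0, 3.0, 5.0, 7.0])
for k in range(200):
    x = A_transmit(k % n) @ x       # round-robin: agent (k mod n) + 1 talks
print(x)   # consensus is reached, but not on average(x(0)) = 4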

11.1.2 Asynchronous Execution


Imagine each node has a different clock, so that there is no common time schedule. Suppose that messages are safely delivered even if transmitting and receiving agents are not synchronized. Each time an agent wakes up, the available information from its neighbors varies. At an iteration instant for agent $i$, assuming agent $i$ has new messages/information from agents $i_1, \dots, i_m$, agent $i$ implements:
\[
  x_i^+ := \frac{1}{m+1} x_i + \frac{1}{m+1}\big( x_{i_1} + \dots + x_{i_m} \big).
\]
Given arbitrary clocks, one can consider the set of times at which one of the $n$ agents performs an iteration. Then the system is a discrete-time averaging algorithm. It is possible to carefully characterize all possible sequences of events (who transmitted to agent $i$ when it wakes up).

11.1.3 Models of time-varying averaging algorithms


Consider a sequence of row-stochastic matrices $\{A(k)\}_{k\in\mathbb{Z}_{\geq 0}}$, or equivalently a time-varying row-stochastic matrix $k \mapsto A(k)$. The associated time-varying averaging algorithm is the discrete-time dynamical system
\[
  x(k+1) = A(k)\, x(k), \qquad k \in \mathbb{Z}_{\geq 0}. \tag{11.2}
\]
We let $\{G(k)\}_{k\in\mathbb{Z}_{\geq 0}}$ be the sequence of weighted digraphs associated to the matrices $\{A(k)\}_{k\in\mathbb{Z}_{\geq 0}}$.
Note that $(1, \mathbf{1}_n)$ is an eigenpair for each matrix $A(k)$. Hence, all points in the consensus set $\{\alpha \mathbf{1}_n \mid \alpha \in \mathbb{R}\}$ are equilibria for the algorithm. We aim to provide conditions under which each solution converges to consensus.
We start with a useful definition: for two digraphs $G = (V, E)$ and $G' = (V', E')$, the union of $G$ and $G'$ is defined by
\[
  G \cup G' = (V \cup V', E \cup E').
\]
In what follows, we will need to compute only the union of digraphs with the same set of vertices; in
that case, the graph union is essentially defined by the union of the edge sets. Some useful properties of
the product of multiple row-stochastic matrices and of the unions of multiple digraphs are presented in
Exercise E11.1.


11.2 Convergence over time-varying connected graphs


Let us first consider the case when A(k) induces an undirected, connected, and aperiodic graph G(k) at
each time k.

Theorem 11.1 (Convergence under point-wise connectivity). Let $\{A(k)\}_{k\in\mathbb{Z}_{\geq 0}}$ be a sequence of symmetric and doubly-stochastic matrices with associated digraphs $\{G(k)\}_{k\in\mathbb{Z}_{\geq 0}}$ so that

(i) each non-zero edge weight $a_{ij}(k)$, including the self-loop weights $a_{ii}(k)$, is larger than a constant $\varepsilon > 0$; and
(ii) each graph $G(k)$ is connected and aperiodic point-wise in time.

Then the solution to $x(k+1) = A(k)\, x(k)$ converges exponentially fast to $\text{average}(x(0))\, \mathbf{1}_n$.

The first assumption in Theorem 11.1 prevents the weights from becoming arbitrarily close to zero as $k \to \infty$ and assures that $\rho_{\text{ess}}(A(k))$ is upper bounded by a number strictly lower than 1 at every time $k \in \mathbb{Z}_{\geq 0}$. To gain some intuition into this non-degeneracy assumption, consider a sequence of symmetric and doubly-stochastic averaging matrices $\{A(k)\}_{k\in\mathbb{Z}_{\geq 0}}$ with entries given by
\[
  A(k) = \begin{bmatrix} 1 - \exp(-1/(k+1)^\gamma) & \exp(-1/(k+1)^\gamma) \\ \exp(-1/(k+1)^\gamma) & 1 - \exp(-1/(k+1)^\gamma) \end{bmatrix}
\]
for $k \in \mathbb{Z}_{\geq 0}$ and exponent $\gamma \geq 1$. Clearly, for $k \to \infty$ and for any $\gamma \geq 1$ this matrix converges to $A_\infty = \left[\begin{smallmatrix} 0 & 1 \\ 1 & 0 \end{smallmatrix}\right]$ with spectrum $\mathrm{spec}(A_\infty) = \{-1, +1\}$ and essential spectral radius $\rho_{\text{ess}}(A_\infty) = 1$. One can show that, for $\gamma = 1$, the convergence of $A(k)$ to $A_\infty$ is sufficiently slow so that $\{x(k)\}_{k}$ converges to $\text{average}(x(0))\, \mathbf{1}_n$, whereas this property is not satisfied for faster convergence rates $\gamma > 1$, and the iteration oscillates indefinitely.¹

Proof of Theorem 11.1. Under assumptions (i) and (ii), there exists a $c \in [0, 1[$ so that $\rho_{\text{ess}}(A(k)) \leq c < 1$ for all $k \in \mathbb{Z}_{\geq 0}$. Recall the notion of the disagreement vector $\delta(k) = x(k) - \text{average}(x(0))\, \mathbf{1}_n$ and define $V(\delta) = \|\delta\|_2^2$. It is immediate to compute
\[
  V(\delta(k+1)) = V(A(k)\delta(k)) = \| A(k)\delta(k) \|_2^2 \leq \rho_{\text{ess}}(A(k))^2\, \|\delta(k)\|_2^2 \leq c^2\, V(\delta(k)).
\]
It follows that $V(\delta(k)) \leq c^{2k} V(\delta(0))$, or $\|\delta(k)\|_2 \leq c^k \|\delta(0)\|_2$, that is, $\delta(k)$ converges to zero exponentially fast. Equivalently, as $k \to \infty$, $x(k)$ converges exponentially fast to $\text{average}(x(0))\, \mathbf{1}_n$.

The proof idea of Theorem 11.1 is based on the disagreement vector and a so-called common Lyapunov function, that is, a positive function that decreases along the system's evolutions (we postpone the general definition of Lyapunov function to Chapter 13). The quadratic function $V$ proposed above is useful also for sequences of primitive row-stochastic matrices $\{A(k)\}_{k\in\mathbb{Z}_{\geq 0}}$ with a common positive left eigenvector associated to the eigenvalue $\rho(A(k)) = 1$, see Exercise E11.5. If the matrices $\{A(k)\}_{k\in\mathbb{Z}_{\geq 0}}$ do not share a
¹To understand the essence of this example, consider the scalar iteration $x(k+1) = \exp(-1/(k+1)^\gamma)\, x(k)$. In logarithmic coordinates the solution is given by $\log(x(k)) = -\sum_{\tau=0}^{k-1} \frac{1}{(\tau+1)^\gamma} + \log(x_0)$. For $\gamma = 1$, $\lim_{k\to\infty} \log(x(k))$ diverges to $-\infty$, and $\lim_{k\to\infty} x(k)$ converges to zero. Likewise, for $\gamma > 1$, $\lim_{k\to\infty} \log(x(k))$ exists finite, and thus $\lim_{k\to\infty} x(k)$ does not converge to zero.


common left eigenvector associated to the eigenvalue (A(k)) = 1, then there exists generally no common
quadratic Lyapunov function of the form V () = > P with P being a positive definite matrix; e.g.,
see (Olshevsky and Tsitsiklis 2008). Likewise, if a sequence of symmetric matrices {A(k)}kZ0 does not
induce a connected and aperiodic graph point-wise in time, then the above analysis fails, and we need to
search for non-quadratic common Lyapunov functions.

11.3 Convergence over digraphs connected over time


We are now ready to state the main result in this chapter, originally due to Moreau (2005).

Theorem 11.2 (Consensus for time-varying algorithms). Let $\{A(k)\}_{k\in\mathbb{Z}_{\geq 0}}$ be a sequence of row-stochastic matrices with associated digraphs $\{G(k)\}_{k\in\mathbb{Z}_{\geq 0}}$. Assume that

(A1) each digraph $G(k)$ has a self-loop at each node;
(A2) each non-zero edge weight $a_{ij}(k)$, including the self-loop weights $a_{ii}(k)$, is larger than a constant $\varepsilon > 0$; and
(A3) there exists a duration $\delta \in \mathbb{N}$ such that, for all times $k \in \mathbb{Z}_{\geq 0}$, the digraph $G(k) \cup G(k+1) \cup \dots \cup G(k+\delta-1)$ contains a globally reachable node.

Then

(i) there exists a nonnegative $w \in \mathbb{R}^n$ normalized to $w_1 + \dots + w_n = 1$ such that $\lim_{k\to\infty} A(k) A(k-1) \cdots A(0) = \mathbf{1}_n w^\top$;
(ii) the solution to $x(k+1) = A(k)\, x(k)$ converges exponentially fast to $(w^\top x(0))\, \mathbf{1}_n$;
(iii) if additionally each matrix in the sequence is doubly-stochastic, then $w = \frac{1}{n}\mathbf{1}_n$ so that
\[
  \lim_{k\to\infty} x(k) = \text{average}(x(0))\, \mathbf{1}_n.
\]

Note: In a sequence with property (A2), edges can appear and disappear, but the weight of each edge (that appears an infinite number of times) does not go to zero as $k \to \infty$.
Note: This result is analogous to the time-invariant result that we saw in Chapter 5. The existence of a globally reachable node is the connectivity requirement in both cases.
Note: Assumption (A3) is a uniform connectivity requirement, that is, any interval of length $\delta$ must have the connectivity property. In equivalent words, the connectivity property holds for any contiguous interval of duration $\delta$.
Note: the theorem provides only a sufficient condition. For results on necessary and sufficient conditions we refer the reader to the recent works (Blondel and Olshevsky 2014; Xia and Cao 2014) and references therein.

11.3.1 Shared communication channel with round robin scheduling


Consider the shared communication channel model with round-robin scheduling. Assume the algorithm is implemented over a communication graph $G_{\text{shared-comm}}$ that is strongly connected.


Consider now the assumptions in Theorem 11.2. Assumption (A1) is satisfied because in equation (11.1) the self-loop weight is equal to $1/2$. Similarly, Assumption (A2) is satisfied because the edge weight is equal to $1/2$. Finally, Assumption (A3) is satisfied with duration $\delta$ selected equal to $n$, because after $n$ rounds each node has transmitted precisely once and so all edges of the communication graph $G_{\text{shared-comm}}$ are present in the union graph. Therefore, the algorithm converges to consensus. However, the algorithm does not converge to average consensus, since the averaging matrices are not doubly-stochastic.
Note: round robin is not necessarily the only scheduling protocol with convergence guarantees. Indeed,
consensus is achieved so long as each node is guaranteed a transmission slot once every bounded period of
time.

11.3.2 Convergence theorems for symmetric time-varying algorithms


Theorem 11.3 (Consensus for symmetric time-varying algorithms). Let $\{A(k)\}_{k\in\mathbb{Z}_{\geq 0}}$ be a sequence of symmetric row-stochastic matrices with associated undirected graphs $\{G(k)\}_{k\in\mathbb{Z}_{\geq 0}}$. Let the matrix sequence $\{A(k)\}_{k\in\mathbb{Z}_{\geq 0}}$ satisfy Assumptions (A1) and (A2) in Theorem 11.2 as well as

(A4) for all $k \in \mathbb{Z}_{\geq 0}$, the graph $\cup_{\tau \geq k}\, G(\tau)$ is connected.

Then

(i) $\lim_{k\to\infty} A(k) A(k-1) \cdots A(0) = \frac{1}{n} \mathbf{1}_n \mathbf{1}_n^\top$;
(ii) each solution to $x(k+1) = A(k)\, x(k)$ converges exponentially fast to $\text{average}(x(0))\, \mathbf{1}_n$.

Note: this result is analogous to the time-invariant result that we saw in Chapter 5. For symmetric row-stochastic matrices and undirected graphs, the connectivity of an appropriate graph is the requirement in both cases.
Note: Assumption (A3) in Theorem 11.2 requires the existence of a finite time-interval of duration $\delta$ so that the union graph $\cup_{\tau=k}^{k+\delta-1} G(\tau)$ contains a globally reachable node for all times $k \geq 0$. This assumption is weakened in the symmetric case in Theorem 11.3 to Assumption (A4), requiring that the union graph $\cup_{\tau \geq k}\, G(\tau)$ is connected for all times $k \geq 0$.

11.3.3 Uniform connectivity is required for non-symmetric matrices


We have learned that, for asymmetric matrices, a uniform connectivity property (A3) is required, whereas for symmetric matrices, uniform connectivity is not required (see (A4)). Here is a counter-example from (Hendrickx 2008) showing that Assumption (A3) cannot be relaxed for asymmetric graphs. Initialize a group of $n = 3$ agents to
\[
  x_1 < -1, \quad x_2 < -1, \quad x_3 > +1.
\]
Step 1: Perform $x_1^+ := (x_1 + x_3)/2$, $x_2^+ := x_2$, $x_3^+ := x_3$ a number of times $\tau_1$ until
\[
  x_1 > +1, \quad x_2 < -1, \quad x_3 > +1.
\]
Step 2: Perform $x_1^+ := x_1$, $x_2^+ := x_2$, $x_3^+ := (x_2 + x_3)/2$ a number of times $\tau_2$ until
\[
  x_1 > +1, \quad x_2 < -1, \quad x_3 < -1.
\]


Step 3: Perform $x_1^+ := x_1$, $x_2^+ := (x_1 + x_2)/2$, $x_3^+ := x_3$ a number of times $\tau_3$ until
\[
  x_1 > +1, \quad x_2 > +1, \quad x_3 < -1.
\]
And repeat this process (note that after Step 3 the sign pattern of the three agents is flipped with respect to the initial condition, so the thresholds $+1$ and $-1$ exchange roles at each repetition).

(Figure: the communication digraphs on nodes $\{1, 2, 3\}$ active during Step 1, Step 2, and Step 3, and their union.)

Observe that on steps $1, 7, 15, \dots$, the variable $x_1$ is made to become larger than $+1$ by computing averages with $x_3 > +1$. Note that every time this happens the variable $x_3 > +1$ is increasingly smaller and closer to $+1$. Hence, $\tau_1 < \tau_7 < \tau_{15} < \cdots$, that is, it takes more and more steps for $x_1$ to become larger than $+1$. Indeed, one can formally show the following:

(i) The agents do not converge to consensus.
(ii) Hence, one of the assumptions of Theorem 11.2 must be violated.
(iii) It is easy to see that (A1) and (A2) are satisfied.
(iv) Regarding connectivity, note that, for all $k \in \mathbb{Z}_{\geq 0}$, the digraph $\cup_{\tau \geq k}\, G(\tau)$ contains a globally reachable node. However, this property is not quite equivalent to Assumption (A3).
(v) Assumption (A3) in Theorem 11.2 must be violated: there does not exist a duration $\delta \in \mathbb{N}$ such that, for all $k \in \mathbb{Z}_{\geq 0}$, the digraph $G(k) \cup G(k+1) \cup \dots \cup G(k+\delta-1)$ contains a globally reachable node.
(vi) Indeed, one can show that $\lim_{k\to\infty} \tau_k = \infty$ so that, as we keep iterating Steps 1+2+3, their duration grows unbounded.
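A short simulation makes items (i) and (vi) concrete. The sketch below uses the assumed initial condition $(-2, -2, 2)$ and implements the three stages with their stopping rules; since after one full cycle the sign pattern of the agents is flipped, the thresholds alternate sign from one cycle to the next. The recorded stage durations grow without bound and the agents never agree.

import numpy as np

x = np.array([-2.0, -2.0, 2.0])
sign, durations = 1.0, []
for cycle in range(8):
    tau = 0
    while sign * x[0] <= 1.0:       # Step 1: x1 <- (x1 + x3)/2 until it crosses
        x[0] = (x[0] + x[2]) / 2
        tau += 1
    durations.append(tau)
    while sign * x[2] >= -1.0:      # Step 2: x3 <- (x2 + x3)/2
        x[2] = (x[1] + x[2]) / 2
    while sign * x[1] <= 1.0:       # Step 3: x2 <- (x1 + x2)/2
        x[1] = (x[0] + x[1]) / 2
    sign = -sign                    # thresholds flip after each full cycle
print(durations)   # e.g. [3, 3, 7, ...]: growing stage durations
print(x)           # spread remains larger than 2: no consensus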

11.4 Analysis methods and proofs


It is well known that, for time-varying systems, the analysis of eigenvalues is not sufficient anymore! In the following example, two matrices with spectral radius equal to $1/2$ are multiplied to obtain a spectral radius larger than 1:
\[
  \begin{bmatrix} \frac{1}{2} & 1 \\ 0 & 0 \end{bmatrix}
  \begin{bmatrix} \frac{1}{2} & 0 \\ 1 & 0 \end{bmatrix}
  =
  \begin{bmatrix} \frac{5}{4} & 0 \\ 0 & 0 \end{bmatrix}.
\]
Hence, it is not possible to predict the convergence of arbitrary products of matrices just based on their spectral radii, and we need to work harder and with sharper tools.


11.4.1 Bounded solutions and non-increasing max-min function


In what follows, we propose a so-called contraction analysis based on a common Lyapunov function (which is not quadratic). We start by defining the max-min function $V_{\text{max-min}} : \mathbb{R}^n \to \mathbb{R}_{\geq 0}$ by
\[
  V_{\text{max-min}}(x) = \max(x_1, \dots, x_n) - \min(x_1, \dots, x_n)
  = \max_{i \in \{1,\dots,n\}} x_i - \min_{i \in \{1,\dots,n\}} x_i.
\]
Note that:

(i) $V_{\text{max-min}}(x) \geq 0$, and
(ii) $V_{\text{max-min}}(x) = 0$ if and only if $x = \alpha \mathbf{1}_n$ for some $\alpha \in \mathbb{R}$.

Lemma 11.4 (Monotonicity and bounded evolutions). If $A$ is row-stochastic, then for all $x \in \mathbb{R}^n$
\[
  V_{\text{max-min}}(Ax) \leq V_{\text{max-min}}(x).
\]
For any sequence of row-stochastic matrices, the solution $x(k)$ of the corresponding time-varying averaging algorithm satisfies, from any initial condition $x(0)$ and at any time $k$,
\[
  V_{\text{max-min}}(x(k)) \leq V_{\text{max-min}}(x(0)), \quad\text{and}
\]
\[
  \min x(0) \leq \min x(k) \leq \min x(k+1) \leq \max x(k+1) \leq \max x(k) \leq \max x(0).
\]

Proof. For the maximum, let us compute:
\[
  \max_i (Ax)_i = \max_i \sum_{j=1}^n a_{ij} x_j
  \leq \max_i \sum_{j=1}^n a_{ij} \Big( \max_h x_h \Big)
  = \Big( \max_i \sum_{j=1}^n a_{ij} \Big) \Big( \max_h x_h \Big)
  = 1 \cdot \max_i x_i.
\]
Similarly, for the minimum,
\[
  \min_i (Ax)_i = \min_i \sum_{j=1}^n a_{ij} x_j
  \geq \min_i \sum_{j=1}^n a_{ij} \Big( \min_h x_h \Big)
  = \Big( \min_i \sum_{j=1}^n a_{ij} \Big) \Big( \min_h x_h \Big)
  = 1 \cdot \min_i x_i.
\]
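A simple randomized check of Lemma 11.4 (a sketch of my own, with arbitrary randomly generated matrices): normalize the rows of random nonnegative matrices to obtain row-stochastic matrices, and confirm that the max-min function never increases along the product.

import numpy as np

rng = np.random.default_rng(1)
V = lambda x: x.max() - x.min()            # the max-min function

x = rng.standard_normal(5)
for _ in range(10):
    A = rng.random((5, 5))
    A /= A.sum(axis=1, keepdims=True)      # normalize rows: row-stochastic
    assert V(A @ x) <= V(x) + 1e-12        # Lemma 11.4: V never increases
    x = A @ x
print(V(x))   # here V shrinks quickly: these random digraphs are complete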

Connectivity over time


Lemma 11.5 (Global reachability over time). Given a sequence of digraphs $\{G(k)\}_{k\in\mathbb{Z}_{\geq 0}}$ such that each digraph $G(k)$ has a self-loop at each node, the following two properties are equivalent:

(i) there exists a duration $\delta \in \mathbb{N}$ such that, for all times $k \in \mathbb{Z}_{\geq 0}$, the digraph $G(k) \cup G(k+1) \cup \dots \cup G(k+\delta-1)$ contains a directed spanning tree;
(ii) there exists a duration $\Delta \in \mathbb{N}$ such that, for all times $k \in \mathbb{Z}_{\geq 0}$, there exists a node $j = j(k)$ that reaches all nodes $i \in \{1,\dots,n\}$ over the interval $\{k, \dots, k+\Delta-1\}$ in the following sense: there exists a sequence of nodes $\{j, h_1, \dots, h_{\Delta-1}, i\}$ such that $(j, h_1)$ is an edge at time $k$, $(h_1, h_2)$ is an edge at time $k+1$, \dots, $(h_{\Delta-2}, h_{\Delta-1})$ is an edge at time $k+\Delta-2$, and $(h_{\Delta-1}, i)$ is an edge at time $k+\Delta-1$;


or, equivalently, for the reverse digraphs,

(iii) there exists a duration δ ∈ ℕ such that, for all times k ∈ ℤ≥0, the digraph G(k) ∪ · · · ∪ G(k + δ)
contains a globally reachable node;
(iv) there exists a duration ∆ ∈ ℕ such that, for all times k ∈ ℤ≥0, there exists a node j reachable from all
nodes i ∈ {1, . . . , n} over the interval {k, . . . , k + ∆ − 1} in the following sense: there exists a sequence of
nodes {j, h1, . . . , h_{∆−1}, i} such that (h1, j) is an edge at time k, (h2, h1) is an edge at time k + 1, . . . ,
(h_{∆−1}, h_{∆−2}) is an edge at time k + ∆ − 2, and (i, h_{∆−1}) is an edge at time k + ∆ − 1.

Note: It is sometimes easy to check whether a sequence of digraphs satisfies properties (i) and (iii). Property (iv)
is directly useful in the analysis later in this chapter. Regarding the proof of the lemma, it is easy to check
that (ii) implies (i) and that (iv) implies (iii) with δ = ∆ − 1. The converse implications are left as Exercise E11.3.

11.4.2 Proof of Theorem 11.2: the max-min function is exponentially decreasing


This proof is inspired by the presentation in (Hendrickx 2008, Theorem 9.2). We start by noting that
Assumptions (A1) and (A3) imply property Lemma 11.5(iv) about the existence of a duration ∆ with certain
properties. Next, without loss of generality, we assume that at some time ∆h, for some h ∈ ℕ, the solution
x(∆h) is not equal to a multiple of 1_n and, therefore, satisfies V_max-min(x(∆h)) > 0. Clearly,

    x(∆(h+1)) = A(∆(h+1) − 1) · · · A(∆h + 1) A(∆h) x(∆h) =: A x(∆h).

By Assumption (A3), we know that there exists a node j reachable from all nodes i over the interval
{∆h, . . . , ∆(h+1) − 1} in the following sense: there exists a sequence of nodes {j, h1, . . . , h_{∆−1}, i} such
that all the following edges exist in the sequence of digraphs: (h1, j) at time ∆h, (h2, h1) at time ∆h + 1, . . . ,
(i, h_{∆−1}) at time ∆(h+1) − 1. Therefore, Assumption (A2) implies

    a_{h1,j}(∆h) ≥ ε,  a_{h2,h1}(∆h + 1) ≥ ε,  . . . ,  a_{i,h_{∆−1}}(∆(h+1) − 1) ≥ ε,

and therefore their product satisfies

    a_{i,h_{∆−1}}(∆(h+1) − 1) a_{h_{∆−1},h_{∆−2}}(∆(h+1) − 2) · · · a_{h2,h1}(∆h + 1) a_{h1,j}(∆h) ≥ ε^∆.

Remarkably, this product is one term in the (i, j) entry of the row-stochastic matrix A = A(∆(h+1) −
1) · · · A(∆h). In other words, Assumption (A3) implies A_ij ≥ ε^∆.
Hence, for all nodes i, given the globally reachable node j during the interval {∆h, . . . , ∆(h+1) − 1}, we compute

    x_i(∆(h+1)) = A_ij x_j(∆h) + ∑_{p=1, p≠j}^n A_ip x_p(∆h)                   (by definition)
                ≤ A_ij x_j(∆h) + (1 − A_ij) max x(∆h)         (because x_p(∆h) ≤ max x(∆h))
                ≤ max_{A_ij ≥ ε^∆} ( A_ij x_j(∆h) + (1 − A_ij) max x(∆h) )
                ≤ ε^∆ x_j(∆h) + (1 − ε^∆) max x(∆h).          (because x_j(∆h) ≤ max x(∆h))


A similar argument leads to

    x_i(∆(h+1)) ≥ ε^∆ x_j(∆h) + (1 − ε^∆) min x(∆h),

so that

    V_max-min(x(∆(h+1))) = max_i x_i(∆(h+1)) − min_i x_i(∆(h+1))
                         ≤ (ε^∆ x_j(∆h) + (1 − ε^∆) max x(∆h)) − (ε^∆ x_j(∆h) + (1 − ε^∆) min x(∆h))
                         = (1 − ε^∆) V_max-min(x(∆h)).

This final inequality, together with Lemma 11.4, proves exponential convergence of the cost function
k ↦ V_max-min(x(k)) to zero and convergence of x(k) to a multiple of 1_n. We leave the other statements in
Theorem 11.2 to the reader and refer to (Moreau 2005; Hendrickx 2008) for further details. ∎
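The contraction above is easy to observe numerically. The following sketch (assuming Python with numpy; the two specific matrices are illustrative choices, not from the text) alternates between two averaging steps, neither of whose digraphs is connected on its own, while every window of duration ∆ = 2 satisfies Assumption (A3) with ε = 1/2; the max-min function decays geometrically:

    import numpy as np

    # Agents {1,2} average at even steps, agents {2,3} at odd steps; each
    # matrix is row-stochastic with positive diagonal entries >= 1/2.
    A_even = np.array([[0.5, 0.5, 0.0],
                       [0.5, 0.5, 0.0],
                       [0.0, 0.0, 1.0]])
    A_odd  = np.array([[1.0, 0.0, 0.0],
                       [0.0, 0.5, 0.5],
                       [0.0, 0.5, 0.5]])

    x = np.array([0.0, 0.5, 1.0])
    V = lambda z: z.max() - z.min()
    values = [V(x)]
    for k in range(20):
        x = (A_even if k % 2 == 0 else A_odd) @ x
        values.append(V(x))
    print(values[:8])    # geometrically decaying max-min function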

11.5 Time-varying algorithms in continuous-time


We now consider the continuous-time linear time-varying system

    ẋ(t) = −L(t)x(t).

We associate a time-varying digraph G(t) (without self-loops) to the time-varying Laplacian L(t) in the usual
manner.
For example, in Chapter 7, we discussed how the heading in some flocking models is described by the
continuous-time Laplacian flow:

    θ̇ = −Lθ,

where θ_i is the heading of bird i, and where L is the Laplacian of an appropriate weighted digraph
G: each bird is a node and each directed edge (i, j) has weight 1/d_out(i). We also discussed the need to
consider time-varying graphs: birds average their heading only with other birds within sensing range, but
this sensing relationship may change with time.
Recall that the solution to a continuous-time time-varying system can be given in terms of the state
transition matrix Φ:

    x(t) = Φ(t, 0) x(0).

We refer to (Hespanha 2009) for the proper definition and study of the state transition matrix.

11.5.1 Undirected graphs


We first consider the case when L(t) induces an undirected and connected graph G(t) for all t ∈ ℝ≥0.

Theorem 11.6 (Convergence under point-wise connectivity). Let t ↦ L(t) = L(t)^T be a time-varying
symmetric Laplacian matrix with associated time-varying graph t ↦ G(t), t ∈ ℝ≥0. Assume

(A1) each non-zero edge weight a_ij(t) is larger than a constant ε > 0,
(A2) for all t ∈ ℝ≥0, the graph associated to the symmetric Laplacian matrix L(t) is undirected and
connected.

Then

(i) the state transition matrix Φ(t, 0) associated to −L(t) satisfies lim_{t→∞} Φ(t, 0) = 1_n 1_n^T / n,
(ii) the solution to ẋ(t) = −L(t)x(t) converges exponentially fast to

    lim_{t→∞} x(t) = average(x(0)) 1_n.

The first assumption in Theorem 11.6 prevents the weights from becoming arbitrarily close to zero as
t → ∞, and it assures that λ₂(L(t)) is strictly positive for all t ∈ ℝ≥0. To see the necessity of this non-
degeneracy assumption, consider the time-varying Laplacian

    L(t) = a(t) L,    (11.3)

where a : ℝ≥0 → ℝ≥0 is piece-wise continuous and L = L^T is a symmetric time-invariant Laplacian
matrix. It can be verified that the solution to ẋ(t) = −L(t)x(t) is given by

    x(t) = exp(−∫_0^t L(τ)dτ) x₀ = exp(−L ∫_0^t a(τ)dτ) x₀ = exp(−L (A(t) − A(0))) x₀,

where (d/dt)A(t) = a(t). If a(t) is integrable on [0, ∞[, then exp(−L(A(t) − A(0))) converges to the finite
limit exp(−L(A(∞) − A(0))), and consensus is not reached from generic initial conditions.
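Since for L(t) = a(t)L the state transition matrix is a matrix exponential, this degenerate behavior can be checked directly. A small sketch (assuming Python with scipy; the path-graph Laplacian and the two weight functions are illustrative choices):

    import numpy as np
    from scipy.linalg import expm

    # Laplacian of the undirected, connected path graph on three nodes.
    L = np.array([[ 1., -1.,  0.],
                  [-1.,  2., -1.],
                  [ 0., -1.,  1.]])
    x0 = np.array([1., 0., -1.])           # average(x0) = 0

    # x(t) = expm(-L * I(t)) x0 with I(t) = integral of a over [0, t].
    # a(t) = 1       gives I(t) = t           -> consensus as t grows;
    # a(t) = e^{-t}  gives I(t) = 1 - e^{-t}  -> bounded integral, no consensus.
    print(expm(-L * 50.0) @ x0)            # approximately [0, 0, 0]
    print(expm(-L * 1.0) @ x0)             # far from consensus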
In analogy with Theorem 11.1, Theorem 11.6 can be proved by considering the squared norm of the disagreement
vector V(δ) = ‖δ‖₂² as a common Lyapunov function. As in the discrete-time case, this quadratic
Lyapunov function has some fundamental limitations, pointed out by Moreau (2004). We review these
limitations in the following theorem, an extension of Lemma 6.4.

Theorem 11.7 (Limitations of quadratic Lyapunov functions). Let L be a Laplacian matrix associated
with a weighted digraph G. The following statements are equivalent:

(i) L + L^T is positive semi-definite;
(ii) L has zero column sums, that is, G is weight-balanced;
(iii) the sum-of-squares function V(x) = ‖x‖₂² is non-increasing along trajectories of the Laplacian flow
ẋ = −Lx; and
(iv) every convex function V(x) invariant under coordinate permutations is non-increasing along the trajec-
tories of ẋ = −Lx.

Proof sketch. The equivalence of statements (i) and (ii) has been shown in Lemma 6.4. The equivalence
of (i) and (iii) can be proved with a Lyapunov argument similar to the discrete-time case; see Theorem 11.1.
The implication (iv) ⟹ (iii) is trivial. To complete the proof, we show that (ii) ⟹ (iv). Recall that
the matrix exponential exp(−Lt) of the Laplacian matrix of a weight-balanced digraph is a nonnegative
doubly stochastic matrix (see Theorem 7.2) that can be decomposed into a convex combination of finitely
many permutation matrices by the Birkhoff–von Neumann theorem (see Exercise E2.15). In particular,
exp(−Lt) = ∑_i λ_i(t) P_i, where the P_i are permutation matrices and the λ_i(t) are convex coefficients for
every t ≥ 0. By convexity of V(x) and invariance under coordinate permutations, we have, for any initial
condition x₀ ∈ ℝ^n and for any t ≥ 0,

    V(exp(−Lt) x₀) = V(∑_i λ_i(t) P_i x₀) ≤ ∑_i λ_i(t) V(P_i x₀) = ∑_i λ_i(t) V(x₀) = V(x₀).  ∎

It follows that V(δ) = ‖δ‖₂² serves as a common Lyapunov function for the time-varying Laplacian
flow ẋ(t) = −L(t)x(t) only if L(t) is weight-balanced and connected point-wise in time. To partially
remedy these strong assumptions, consider now the case when L(t) induces an undirected graph at each
point in time t ≥ 0 and an integral connectivity condition holds, similar to the discrete-time case. To
motivate the general case, recall the example in (11.3) with a single time-varying parameter a(t). In this
simple example, a necessary and sufficient condition for convergence to consensus is that the integral
∫_0^∞ a(τ)dτ is divergent. The following result from (Hendrickx and Tsitsiklis 2013) generalizes this case.

Theorem 11.8 (Convergence under integral connectivity). Let t ↦ A(t) = A(t)^T be a time-varying
symmetric adjacency matrix. Consider the associated undirected graph G = (V, E) that has an edge
{i, j} ∈ E if ∫_0^∞ a_ij(τ)dτ is divergent. Assume

(A1) each non-zero edge weight a_ij(t) is larger than a constant ε > 0,
(A2) the graph G is connected.

Then

(i) the state transition matrix Φ(t, 0) associated to −L(t) satisfies lim_{t→∞} Φ(t, 0) = 1_n 1_n^T / n,
(ii) the solution to ẋ(t) = −L(t)x(t) converges exponentially fast to

    lim_{t→∞} x(t) = average(x(0)) 1_n.

Theorem 11.8 is the continuous-time analog of Theorem 11.3. We remark that the original statement
in (Hendrickx and Tsitsiklis 2013) does not require Assumption (A1), thus allowing for weights such as
a_ij(t) = 1/t, which lead to non-uniform convergence, i.e., the convergence rate depends on the time t₀
at which the system is initialized. The proof method of Theorem 11.8 is based on the fact that the minimal
(respectively maximal) element of x(t), the sum of the two smallest (respectively two largest) elements, the
sum of the three smallest (respectively three largest) elements, etc., are all bounded and non-decreasing
(respectively non-increasing). A continuity argument can then be used to show average consensus.

11.5.2 Directed graphs


The proof method of Theorem 11.8 does not extend to general non-symmetric Laplacian matrices. If we use
the max-min function V_max-min(x) = max_{i∈{1,...,n}} x_i − min_{i∈{1,...,n}} x_i as a common Lyapunov function
candidate, then we arrive at the following general result (Moreau 2004; Lin et al. 2007).

Theorem 11.9 (Consensus for time-varying algorithms in continuous time). Let t ↦ A(t) be a
time-varying adjacency matrix with associated time-varying digraph t ↦ G(t), t ∈ ℝ≥0. Assume

(A1) each non-zero edge weight a_ij(t) is larger than a constant ε > 0,
(A2) there exists a duration T > 0 such that, for all t ∈ ℝ≥0, the digraph associated to the adjacency matrix

    ∫_t^{t+T} A(τ)dτ

contains a globally reachable node.

Then

(i) there exists a nonnegative w ∈ ℝ^n normalized to w₁ + · · · + w_n = 1 such that the state transition
matrix Φ(t, 0) associated to −L(t) satisfies lim_{t→∞} Φ(t, 0) = 1_n w^T,
(ii) the solution to ẋ(t) = −L(t)x(t) converges exponentially fast to (w^T x(0)) 1_n,
(iii) if, additionally, 1_n^T L(t) = 0_n^T for almost all times t (that is, the digraph is weight-balanced at all
times, except on a set of measure zero), then w = (1/n) 1_n so that

    lim_{t→∞} x(t) = average(x(0)) 1_n.


11.6 Exercises
E11.1 On the product of stochastic matrices (Jadbabaie et al. 2003). Let k ≥ 2 and let A1, A2, . . . , Ak be non-
negative n × n matrices with positive diagonal entries. Let a_min (resp. a_max) be the smallest (resp. largest)
diagonal entry of A1, A2, . . . , Ak, and let G1, . . . , Gk be the digraphs associated with A1, . . . , Ak.
Show that

    (i) A1 A2 · · · Ak ≥ (a_min² / (2 a_max))^{k−1} (A1 + A2 + · · · + Ak), and
    (ii) if the digraph G1 ∪ · · · ∪ Gk is strongly connected, then the matrix A1 · · · Ak is irreducible.

    Hint: Set Ai = a_min I_n + Bi for a nonnegative Bi, and show statement (i) by induction on k.
E11.2 Products of primitive matrices with positive diagonal. Let A1, A2, . . . , A_{n−1} be primitive n × n ma-
trices with positive diagonal entries. Show that A1 A2 · · · A_{n−1} > 0.
E11.3 A simple proof. Prove Lemma 11.5.
Hint: You will want to use Exercise E3.6.
E11.4 Alternative sufficient condition. As in Theorem 11.2, let {A(k)}_{k∈ℤ≥0} be a sequence of row-stochastic
matrices with associated digraphs {G(k)}_{k∈ℤ≥0}. Prove that the same asymptotic properties in Theorem 11.2
hold true under the following Assumption (A5), instead of Assumptions (A1), (A2), and (A3):

    (A5) there exists a node j such that, for all times k ∈ ℤ≥0, each edge weight a_ij(k), i ∈ {1, . . . , n}, is larger
    than a constant ε > 0.

    In other words, Assumption (A5) requires that all digraphs G(k) contain all the edges (i, j), i ∈ {1, . . . , n},
    and that all these edges have weights larger than a strictly positive constant.
    Hint: Modify the proof of Theorem 11.2.
E11.5 Convergence for strongly-connected graphs point-wise in time: discrete time. Consider a sequence
{A(k)}_{k∈ℤ≥0} of row-stochastic matrices with associated graphs {G(k)}_{k∈ℤ≥0} so that

    (A1) each non-zero edge weight a_ij(k), including the self-loop weights a_ii(k), is larger than a constant
    ε > 0;
    (A2) each graph G(k) is strongly connected and aperiodic point-wise in time; and
    (A3) there is a positive vector w ∈ ℝ^n satisfying w^T 1_n = 1 and w^T A(k) = w^T for all k ∈ ℤ≥0.

    Without relying on Theorem 11.2, show that the solution to x(k+1) = A(k)x(k) converges to
    lim_{k→∞} x(k) = (w^T x(0)) 1_n.
    Hint: Search for a common quadratic Lyapunov function.
E11.6 Convergence for strongly-connected graphs point-wise in time: continuous time. Let t ↦ L(t) be
a time-varying Laplacian matrix with associated time-varying digraph t ↦ G(t), t ∈ ℝ≥0, so that

    (A1) each non-zero edge weight a_ij(t) is larger than a constant ε > 0;
    (A2) each graph G(t) is strongly connected point-wise in time; and
    (A3) there is a positive vector w ∈ ℝ^n satisfying 1_n^T w = 1 and w^T L(t) = 0_n^T for all t ∈ ℝ≥0.

    Without relying on Theorem 11.9, show that the solution to ẋ(t) = −L(t)x(t) satisfies
    lim_{t→∞} x(t) = (w^T x(0)) 1_n.
    Hint: Search for a common quadratic Lyapunov function.

Chapter 12
Randomized Averaging Algorithms

In this chapter we discuss averaging algorithms defined by sequences of random stochastic matrices. In
other words, we imagine that at each discrete instant, the averaging matrix is selected randomly according
to some stochastic model. We refer to such algorithms as randomized averaging algorithms.
Randomized averaging algorithms are well behaved and easy to study in the sense that much information
can be learned simply from the expectation of the averaging matrix. Also, as compared with time-varying
algorithms, it is possible to study convergence rates for randomized algorithms. In this chapter we present
results from (Fagnani and Zampieri 2008; Tahbaz-Salehi and Jadbabaie 2008; Garin and Schenato 2010;
Frasca 2012). Relevant references include (Chatterjee and Seneta 1977; Cogburn 1984; Hatano and Mesbahi
2005; Touri and Nedić 2014).
In this book we will not discuss averaging algorithms in the presence of quantization effects; we refer
the reader instead to (Kashyap et al. 2007; Nedić et al. 2009; Frasca et al. 2009). Similarly, regarding averaging
in the presence of noise, we refer to (Xiao et al. 2007; Bamieh et al. 2012; Lovisari et al. 2013; Jadbabaie and
Olshevsky 2015).

12.1 Examples of randomized averaging algorithms


Consider the following models of randomized averaging algorithms.

Uniform Symmetric Gossip. Given an undirected graph G, at each iteration, select one of the graph
edges uniformly at random, say agents i and j talk, and they both perform the (1/2, 1/2) averaging, that is:

    x_i(k+1) = x_j(k+1) := ½ (x_i(k) + x_j(k)).

A detailed analysis of this model is given by Boyd et al. (2006); a simulation sketch is provided at the
end of this section.

Packet Loss in Communication Network. Given a strongly connected and aperiodic digraph, at each
communication round, packets travel over directed edges and, with some likelihood, each edge may
drop the packet. (If information is not received, then the receiving node can either do no update
whatsoever, or adjust its averaging weights to compensate for the packet loss).


Broadcast Wireless Communication. Given a digraph, at each communication round, a randomly-
selected node transmits to all its out-neighbors. (Here we imagine that simultaneous transmissions
are prohibited by wireless interference.)

Opinion Dynamics with Stochastic Interactions and Prominent Agents. (Somewhat similar to uniform
gossip.) Given an undirected graph and a probability 0 < p < 1, at each iteration, select one of the
graph edges uniformly at random and perform the following: with probability p both agents perform the
(1/2, 1/2) update, and with probability (1 − p) only one agent performs the update and the prominent
agent does not. A detailed analysis of this model is given by (Acemoglu and Ozdaglar 2011);
see also (Acemoglu et al. 2013).

Note that, in the second, third, and fourth example models, the row-stochastic matrices at each iteration
are not symmetric in general, even if the original graph is undirected.
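The following is a minimal simulation sketch of the uniform symmetric gossip model (assuming Python with numpy; the ring graph, the number of iterations, and the random seed are arbitrary illustrative choices):

    import numpy as np

    rng = np.random.default_rng(0)

    # Uniform symmetric gossip on a ring with n agents: at each iteration one
    # edge {i, j} is selected uniformly at random and both endpoints replace
    # their values by the pairwise average.
    n = 8
    edges = [(i, (i + 1) % n) for i in range(n)]
    x = rng.normal(size=n)
    target = x.mean()                        # preserved by every update

    for _ in range(2000):
        i, j = edges[rng.integers(len(edges))]
        x[i] = x[j] = 0.5 * (x[i] + x[j])

    print(np.allclose(x, target))            # True: average consensus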

12.2 A brief review of probability theory


We briefly review a few basic concepts from probability theory and refer the reader for example to (Breiman
1992).

Loosely speaking, a random variable X : Ω → E is a measurable function from the set Ω of possible
outcomes to some set E, which is typically a subset of ℝ.
The probability of an event (i.e., a subset of possible outcomes) is the measure of the likelihood that
the event will occur. An event occurs almost surely if it occurs with probability equal to 1.
The random variable X is called discrete if its image is finite or countably infinite. In this case, X is
described by a probability mass function assigning a probability to each value in the image of X.

Specifically, if X takes values in {x1, . . . , xM} ⊂ ℝ, then the probability mass function
p_X : {x1, . . . , xM} → [0, 1] satisfies p_X(x_i) ≥ 0 and ∑_{i=1}^M p_X(x_i) = 1, and determines the probability
of X being equal to x_i by P[X = x_i] = p_X(x_i).
The random variable X is called continuous if its image is uncountably infinite. If X is absolutely
continuous, then X is described by a probability density function assigning a probability to
intervals in the image of X.
Specifically, if X takes values in ℝ, then the probability density function f_X : ℝ → ℝ≥0 satisfies
f_X(x) ≥ 0 and ∫_ℝ f_X(x)dx = 1, and determines the probability of X taking values in the interval [a, b]
by P[a ≤ X ≤ b] = ∫_a^b f_X(x)dx.
The expected value of a discrete random variable is E[X] = ∑_{i=1}^M x_i p_X(x_i).
The expected value of a continuous random variable is E[X] = ∫_ℝ x f_X(x) dx.
A (finite or infinite) sequence of random variables is independent and identically distributed (i.i.d.) if
each random variable has the same probability mass/distribution as the others and all are mutually
independent.


12.3 Randomized averaging algorithms


In this section we consider random sequences of row-stochastic matrices. Accordingly, let A(k) be the
row-stochastic averaging matrix occurring randomly at time k and let G(k) be its associated digraph. We then
consider the stochastic linear system

    x(k+1) = A(k)x(k).

We now present the main result of this chapter; for its proof we refer to (Tahbaz-Salehi and Jadbabaie
2008), see also (Fagnani and Zampieri 2008).

Theorem 12.1 (Consensus for randomized algorithms). Let {A(k)}_{k∈ℤ≥0} be a sequence of random
row-stochastic matrices with associated digraphs {G(k)}_{k∈ℤ≥0}. Assume

(A1) the sequence of random matrices {A(k)}_{k∈ℤ≥0} is i.i.d.,
(A2) at each time k, the random matrix A(k) has a strictly positive diagonal, so that each digraph in the
sequence {G(k)}_{k∈ℤ≥0} has a self-loop at each node, and
(A3) the digraph associated to the expected matrix E[A(k)], for any k, has a globally reachable node.

Then the following statements hold almost surely:

(i) there exists a random nonnegative vector w ∈ ℝ^n with w₁ + · · · + w_n = 1 such that

    lim_{k→∞} A(k)A(k−1) · · · A(0) = 1_n w^T almost surely,

(ii) as k → ∞, each solution x(k) of x(k+1) = A(k)x(k) satisfies

    lim_{k→∞} x(k) = (w^T x(0)) 1_n almost surely,

(iii) if additionally each random matrix is doubly-stochastic, then w = (1/n) 1_n so that

    lim_{k→∞} x(k) = average(x(0)) 1_n.

Note: if each random matrix is doubly-stochastic, then E[A(k)] is doubly-stochastic. The converse is
easily seen to be false.
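For instance, let n = 2 and let A(k) take, each with probability 1/2, the two values

    [1  0]        [0  1]
    [1  0]  and   [0  1]:

neither matrix is doubly-stochastic (their column sums are 2 and 0), yet E[A(k)] has all entries equal to 1/2 and is therefore doubly-stochastic.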

12.3.1 Additional results on uniform symmetric gossip algorithms


Recall: given an undirected graph G with edge set E, at each iteration, select one of the graph edges
uniformly at random, say agents i and j talk, and they both perform the (1/2, 1/2) averaging, that is:

    x_i(k+1) = x_j(k+1) := ½ (x_i(k) + x_j(k)).
Corollary 12.2 (Convergence for uniform symmetric gossip). If the graph G is connected, then each
solution to the uniform symmetric gossip converges to average consensus with probability 1.


Proof based on Theorem 12.1. The corollary can be established by verifying that Assumptions (A1)–(A3) in
Theorem 12.1 are satisfied. Regarding (A3), note that the digraph associated to the expected averaging matrix
is G. ∎

Here we provide a simple and interesting alternative proof by (Frasca 2012).

Proof based on Theorem 11.3. For any time k₀ ≥ 0 and any edge {i, j} ∈ E, consider the event "the edge
{i, j} is not selected for update at any time larger than k₀." Since the probability that {i, j} is not selected at
any given time is 1 − 1/|E|, the probability that {i, j} is not selected at any of the times k₀ + 1, . . . , k satisfies

    lim_{k→∞} (1 − 1/|E|)^{k−k₀} = 0.

With this fact one can verify that all assumptions in Theorem 11.3 are satisfied by the random sequence
of matrices almost surely. Hence, almost sure convergence follows. Finally, since each matrix is doubly
stochastic, average(x(k)) is preserved, and the solution converges to average(x(0)) 1_n. ∎

12.3.2 Additional results on the mean-square convergence factor


Given a sequence of stochastic averaging matrices {A(k)}_{k∈ℤ≥0} and corresponding solutions x(k) to
x(k+1) = A(k)x(k), we define the mean-square convergence factor by

    r_mean-square({A(k)}_{k∈ℤ≥0}) = sup_{x(0) ≠ x_final} ( lim sup_{k→∞} E[ ‖x(k) − average(x(k)) 1_n‖₂² ]^{1/k} ).

We now present upper and lower bounds for the mean-square convergence factor; for a comprehensive
proof we refer to (Fagnani and Zampieri 2008, Proposition 4.4).

Theorem 12.3 (Upper and lower bounds on the mean-square convergence factor). Under the same
assumptions as in Theorem 12.1, the mean-square convergence factor satisfies

    ρ_ess(E[A(k)])² ≤ r_mean-square ≤ ρ( E[ A(k)^T (I_n − 1_n 1_n^T / n) A(k) ] ),

where ρ_ess denotes the essential spectral radius and ρ the spectral radius.


12.4 Table of asymptotic behaviors for averaging systems


Discrete-time averaging. Dynamics: x(k+1) = Ax(k), where A is the row-stochastic adjacency matrix of a digraph G.
Assumptions: G has a globally reachable node.
Asymptotic behavior: lim_{k→∞} x(k) = (w^T x(0)) 1_n, where w ≥ 0, w^T A = w^T, and 1_n^T w = 1.
Reference: Theorem 5.2.

Continuous-time averaging. Dynamics: ẋ(t) = −Lx(t), where L is the Laplacian matrix of a digraph G.
Assumptions: G has a globally reachable node.
Asymptotic behavior: lim_{t→∞} x(t) = (w^T x(0)) 1_n, where w ≥ 0, w^T L = 0_n^T, and 1_n^T w = 1.
Reference: Theorem 7.4.

Time-varying discrete-time averaging. Dynamics: x(k+1) = A(k)x(k), where A(k) is the row-stochastic adjacency matrix of a digraph G(k), k ∈ ℤ≥0.
Assumptions: (i) at each time k, G(k) has a self-loop at each node; (ii) each non-zero a_ij(k) is larger than ε > 0; (iii) there exists a duration δ such that, for all times k, G(k) ∪ · · · ∪ G(k + δ) has a globally reachable node.
Asymptotic behavior: lim_{k→∞} x(k) = (w^T x(0)) 1_n, where w ≥ 0 and 1_n^T w = 1.
Reference: Theorem 11.2.

Time-varying symmetric discrete-time averaging. Dynamics: x(k+1) = A(k)x(k), where A(k) is the symmetric stochastic adjacency matrix of a graph G(k), k ∈ ℤ≥0.
Assumptions: (i) at each time k, G(k) has a self-loop at each node; (ii) each non-zero a_ij(k) is larger than ε > 0; (iii) for all times k, ∪_{τ≥k} G(τ) is connected.
Asymptotic behavior: lim_{k→∞} x(k) = average(x(0)) 1_n.
Reference: Theorem 11.3.

Time-varying continuous-time averaging. Dynamics: ẋ(t) = −L(t)x(t), where L(t) is the Laplacian matrix of a digraph G(t), t ∈ ℝ≥0.
Assumptions: (i) each non-zero a_ij(t) is larger than ε > 0; (ii) there exists a duration T such that, for all times t, the digraph associated to ∫_t^{t+T} A(τ)dτ has a globally reachable node.
Asymptotic behavior: lim_{t→∞} x(t) = (w^T x(0)) 1_n, where w ≥ 0 and 1_n^T w = 1.
Reference: Theorem 11.9.

Randomized discrete-time averaging. Dynamics: x(k+1) = A(k)x(k), where A(k) is a random row-stochastic adjacency matrix of a digraph G(k), k ∈ ℤ≥0.
Assumptions: (i) {A(k)}_{k∈ℤ≥0} is i.i.d.; (ii) each matrix has a strictly positive diagonal; (iii) the digraph associated to E[A(k)] has a globally reachable node.
Asymptotic behavior: lim_{k→∞} x(k) = (w^T x(0)) 1_n almost surely, where w ≥ 0 is a random vector with 1_n^T w = 1.
Reference: Theorem 12.1.

Table 12.1: Averaging systems: definitions, assumptions, asymptotic behavior, and references

Part III

Nonlinear Systems

Chapter 13
Nonlinear Systems and Robotic
Coordination

Coordination in relative sensing networks: rendezvous, flocking, and formations. The material
in this section is self-contained. Further information on flocking can be found in (Tanner et al. 2007;
Olfati-Saber 2006), and further material on formation control and graph rigidity can be found in (Dörfler and
Francis 2010; Krick et al. 2009; Anderson et al. 2008; Oh et al. 2015).

13.1 Coordination in relative sensing networks


We consider the following setup for the coordination of n autonomous mobile robots (referred to as agents)
in a planar environment:

(i) Agent dynamics: We consider a simple and fully actuated agent model: ṗ_i = u_i, where p_i ∈ ℝ²
and u_i ∈ ℝ² are the position and steering control input of agent i.
(ii) Relative sensing model: We consider the following sensing model.

    • Each agent is equipped with onboard sensors only and has no communication devices.
    • The sensing topology is encoded by an undirected and connected graph G = (V, E).
    • Each agent i can measure the relative position of neighboring agents: p_i − p_j for {i, j} ∈ E.

To formalize the relative sensing model, we introduce an arbitrary orientation and a labeling k ∈
{1, . . . , |E|} for each undirected edge {i, j} ∈ E. Recall the incidence matrix B ∈ ℝ^{n×|E|} of the
associated oriented graph, and define the 2n × 2|E| matrix B̂ = B ⊗ I₂ via the Kronecker product.
The Kronecker product A ⊗ B is the matrix obtained by replacing each scalar entry A_ij of
A with the block-entry A_ij B. For example, if B is given by

    B = [ +1   0   0   0                              [ +I₂    0     0     0
          −1  +1  −1   0                                −I₂   +I₂   −I₂    0
           0  −1   0  +1    then B̂ = B ⊗ I₂ =           0    −I₂    0    +I₂
           0   0  +1  −1 ],                              0     0    +I₂   −I₂ ].


Figure 13.1: A ring graph with three agents. The first panel shows the agents embedded in the plane ℝ² with positions
p_i and relative positions e_i. The second panel shows the artificial potentials as springs connecting the robots, and the
third panel shows the resulting forces.

With this notation, the vector of relative positions is given by e = B̂^T p.

(iii) Geometric objective: The objective is to achieve a desired geometric configuration, which can be
expressed as a function of the relative distances ‖p_i − p_j‖ for each {i, j} ∈ E. Examples include
rendezvous (‖p_i − p_j‖ = 0), collision avoidance (‖p_i − p_j‖ > 0), and desired relative spacings
(‖p_i − p_j‖ = d_ij > 0).
(iv) Potential-based control: We specify the geometric objective for each edge {i, j} ∈ E as the
minimum of an artificial potential function V_ij : D_ij ⊂ ℝ → ℝ≥0. We require the potential functions
to be twice continuously differentiable on their domain D_ij.

It is instructive to think of V_ij(‖p_i − p_j‖) as a spring coupling neighboring agents {i, j} ∈ E. The
resulting spring forces acting on agents i and j are f_ij(p_i − p_j) = −(∂/∂p_i) V_ij(‖p_i − p_j‖) and
f_ji(p_i − p_j) = −f_ij(p_i − p_j) = −(∂/∂p_j) V_ij(‖p_i − p_j‖); see Figure 13.1 for an illustration. The
overall network potential function is then

    V(p) = ∑_{{i,j}∈E} V_ij(‖p_i − p_j‖).

We design the associated gradient descent control law as

    ṗ_i = u_i = −(∂/∂p_i) V(p) = −∑_{{i,j}∈E} (∂/∂p_i) V_ij(‖p_i − p_j‖) = ∑_{{i,j}∈E} f_ij(p_i − p_j),   i ∈ {1, . . . , n}.

In vector form, the control reads as the gradient flow

    ṗ = u = −(∂V(p)/∂p)^T = B̂ diag({f_ij}_{{i,j}∈E}) B̂^T p.    (13.1)
The closed-loop relative sensing network (13.1) is illustrated in Figure 13.2.
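As an illustration of these constructions, the following sketch (assuming Python with numpy; the ring graph of Figure 13.1, the step size, and the seed are illustrative choices) builds B̂ = B ⊗ I₂ with a Kronecker product and integrates the gradient flow (13.1) for quadratic edge potentials with unit weights, i.e., the linear Laplacian flow of Example 13.2 below:

    import numpy as np

    # Incidence matrix of the oriented ring graph on three nodes
    # (edges 1->2, 2->3, 3->1) and its Kronecker lift B_hat = kron(B, I2).
    B = np.array([[ 1.,  0., -1.],
                  [-1.,  1.,  0.],
                  [ 0., -1.,  1.]])
    B_hat = np.kron(B, np.eye(2))            # 2n x 2|E|

    rng = np.random.default_rng(1)
    p = rng.normal(size=6)                   # stacked planar positions

    # Explicit Euler integration of the Laplacian flow p_dot = -B_hat B_hat^T p
    # (rendezvous under quadratic potentials with unit weights).
    dt = 0.05
    for _ in range(400):
        e = B_hat.T @ p                      # stacked relative positions
        p = p - dt * (B_hat @ e)

    print(p.reshape(3, 2))                   # three nearly identical rows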
Controllers based on artificial potential functions induce a lot of structure in the closed-loop system.
Recall the set of 2-dimensional orthogonal matrices O(2) = {R ∈ ℝ^{2×2} | R R^T = I₂}, introduced in
Exercise E2.14, as the set of 2-dimensional rotations and reflections.

Figure 13.2: Closed-loop diagram of the relative sensing network (13.1): the integrator dynamics ẋ_i = u_i produce
the states x, the matrix B̂^T maps x to the relative positions y, the edge force maps f_ij(·) produce z, and the
matrix B̂ maps z back to the control input u.

Lemma 13.1 (Symmetries of relative sensing networks). Consider the closed-loop relative sensing net-
work (13.1) with an undirected and connected graph G = (V, E). For every initial condition p₀ ∈ ℝ^{2n}, we
have that

(i) the center of mass is stationary: average(p(t)) = average(p₀) for all t ≥ 0; and
(ii) the closed loop ṗ = −(∂V(p)/∂p)^T is invariant under rigid body transformations: if π_i = R p_i + q,
where R ∈ O(2) and q ∈ ℝ² is a translation vector, then π̇ = −(∂V(π)/∂π)^T.
P P P
Proof. Regarding statement (i), since the pairwise spring forces cancel, f_ij + f_ji = 0, we have
∑_{i=1}^n ṗ_i = 0₂, and it follows that ∑_{i=1}^n p_i(t) = ∑_{i=1}^n p_i(0).
Regarding statement (ii), first, notice that the potential function is invariant under translations, since
V(p) = V(p + 1_n ⊗ q) for any translation q ∈ ℝ². Second, notice that the potential function is invariant
under rotations and reflections, since V_ij(‖R(p_i − p_j)‖) = V_ij(‖p_i − p_j‖) and thus V(R̂p) = V(p), where
R̂ = I_n ⊗ R. From the chain rule we obtain (∂/∂p)V(R̂p) = (∂V/∂x)(R̂p) R̂ and, by the invariance
V(R̂p) = V(p), also (∂V/∂x)(R̂p) = (∂V/∂p)(p) R̂^T. By combining these insights when changing
coordinates via π_i = R p_i + q (or π = R̂p + 1_n ⊗ q), we find that

    π̇ = R̂ ṗ = −R̂ ((∂V/∂p)(p))^T = −R̂ R̂^T ((∂V/∂x)(R̂p))^T = −((∂V/∂π)(π))^T.  ∎


Example 13.2 (The linear-quadratic rendezvous problem). An undirected consensus system is a relative
sensing network coordination problem where the objective is rendezvous: p_i = p_j for all {i, j} ∈ E. For
each edge {i, j} ∈ E, consider an artificial potential V_ij which has a minimum at the desired
objective, for example, the quadratic potential function

    V_ij(p_i − p_j) = ½ a_ij ‖p_i − p_j‖₂²,

where the overall potential function is obtained as the Laplacian potential V(p) = ½ p^T L̂ p, with L̂ = L ⊗ I₂. The
resulting gradient descent control law gives rise to the linear Laplacian flow

    ṗ_i = u_i = −(∂/∂p_i) V(p) = −∑_{{i,j}∈E} a_ij (p_i − p_j).    (13.2)

So far, we analyzed the consensus problem (13.2) using matrix theory and exploiting the linearity of the problem.
In the following, we introduce numerous tools that will allow us to analyze nonlinear consensus-type interactions
and more general nonlinear dynamical systems.


13.2 Stability theory for dynamical systems


Dynamical systems and equilibrium points. A (continuous-time) dynamical system is a pair (X, f),
where X, called the state space, is a subset of ℝ^n, and f, called the vector field, is a map from X to ℝ^n. Given
an initial state x₀ ∈ X, the solution (also called trajectory or evolution) of the dynamical system is a curve
t ↦ x(t) satisfying the differential equation

    ẋ(t) = f(x(t)),   x(0) = x₀.

A dynamical system (X, f) is linear if x ↦ f(x) = Ax for some square matrix A. Typically, the map f
is assumed to have some continuity properties so that the solution exists and is unique for at least small
times; we do not discuss this topic here and refer, for example, to (Khalil 2002).
Examples of continuous-time dynamical systems include the (linear) Laplacian flow ẋ = −Lx (see equa-
tion (7.2) in Section 7.3) and the (nonlinear) Kuramoto coupled oscillator model
θ̇_i = ω_i − (K/n) ∑_{j=1}^n sin(θ_i − θ_j) (which we discuss in Chapter 14).
An equilibrium point for the dynamical system (X, f) is a point x* ∈ X such that f(x*) = 0_n. If the
initial state is x(0) = x*, then the solution exists, is unique for all times, and is constant: x(t) = x* for all
t ∈ ℝ≥0.

Convergence and invariant sets. A curve t ↦ x(t) approaches a set S ⊂ ℝ^n as t → +∞ if the distance
from x(t) to the set S converges to 0 as t → +∞. If the set S consists of a single point s, then x(t) converges
to s in the usual sense: lim_{t→+∞} x(t) = s.
Given a dynamical system (X, f), a set W ⊂ X is invariant if each solution starting in W remains in
W, that is, if x(0) ∈ W implies x(t) ∈ W for all t ≥ 0. We also need the following general properties: a
set W ⊂ ℝ^n is

(i) bounded if there exists a constant K such that each w ∈ W satisfies ‖w‖ ≤ K,
(ii) closed if it contains its boundary (or, equivalently, if it contains all its limit points), and
(iii) compact if it is bounded and closed.

Stability. An equilibrium point x* for the system (X, f) is said to be

(i) stable (or Lyapunov stable) if, for each ε > 0, there exists δ = δ(ε) > 0 so that if ‖x(0) − x*‖ < δ,
then ‖x(t) − x*‖ < ε for all t ≥ 0,
(ii) unstable if it is not stable,
(iii) locally asymptotically stable if it is stable and if there exists δ > 0 such that lim_{t→∞} x(t) = x* for all
trajectories satisfying ‖x(0) − x*‖ < δ.

Moreover, given a locally asymptotically stable equilibrium point x*,

(i) the set of initial conditions x₀ ∈ X whose corresponding solution x(t) converges to x* is a closed
set termed the region of attraction of x*,
(ii) x* is said to be globally asymptotically stable if its region of attraction is the whole space X,
(iii) x* is said to be globally (respectively, locally) exponentially stable if it is globally (respectively, locally)
asymptotically stable and all trajectories starting in the region of attraction satisfy

    ‖x(t) − x*‖ ≤ c₁ ‖x(0) − x*‖ e^{−c₂ t},

for some positive constants c₁, c₂ > 0.

Some of these concepts are illustrated in Figure 13.3.
Some of these concepts are illustrated in Figure 13.3.

Figure 13.3: Illustrations of (a) a stable equilibrium, (b) an unstable equilibrium, and (c) an asymptotically stable
equilibrium.

Energy functions: non-increasing functions, sublevel sets and critical points. In order to establish
the stability and convergence properties of a dynamical system, we will use the concept of an energy
function that is non-increasing along the system's solutions.
The Lie derivative (also called directional derivative) of a function V : ℝ^n → ℝ with respect to a vector
field f : ℝ^n → ℝ^n is the function L_f V : ℝ^n → ℝ defined by

    L_f V(x) = (∂V/∂x)(x) f(x).    (13.3)

A differentiable function V : ℝ^n → ℝ is said to be non-increasing along every trajectory of the system if
each solution x : ℝ≥0 → X satisfies

    (d/dt) V(x(t)) = L_f V(x(t)) ≤ 0,

or, equivalently, if each point x ∈ X satisfies L_f V(x) ≤ 0.
A critical point for a differentiable function V : ℝ^n → ℝ is a point x̄ ∈ X satisfying

    (∂V/∂x)(x̄) = 0_n.

Every critical point of a differentiable function is either a local minimum, a local maximum, or a saddle point.
Given a function V : ℝ^n → ℝ and a constant ℓ ∈ ℝ, the ℓ-level set of V is {y ∈ ℝ^n | V(y) = ℓ}, and the
ℓ-sublevel set of V is {y ∈ ℝ^n | V(y) ≤ ℓ}. These concepts are illustrated in Figure 13.4.


Figure 13.4: A differentiable function, its sublevel sets, and its critical points. The sublevel set {x | V(x) ≤ ℓ₁} is
unbounded. The sublevel set {x | V(x) ≤ ℓ₂} = [x₁, x₅] is compact and contains three critical points (x₂ and x₄ are
local minima and x₃ is a local maximum). Finally, the sublevel set {x | V(x) ≤ ℓ₃} is compact and contains a single
critical point x₄.

13.2.1 First main convergence tool: the Lyapunov Theorem


Given a point x₀ ∈ ℝ^n, a function V : ℝ^n → ℝ is locally positive-definite about x₀ (respectively, locally
positive-semidefinite about x₀) if V(x₀) = 0 and if there exists a neighborhood U of x₀ such that V(x) > 0
(respectively, V(x) ≥ 0) for all x ∈ U \ {x₀}. A function V : ℝ^n → ℝ is globally positive-definite about
x₀ if V(x₀) = 0 and V(x) > 0 for all x ∈ ℝ^n \ {x₀}. We define locally and globally negative-definite
and negative-semidefinite functions in the obvious way. A function V : ℝ^n → ℝ is radially unbounded if
V(x) → ∞ along any path such that ‖x‖ → ∞.

Theorem 13.3 (Lyapunov Theorem). Consider a dynamical system (ℝ^n, f) with differentiable vector field
f and with an equilibrium point x* ∈ ℝ^n. The equilibrium point x* is

(i) stable if there exists a continuously-differentiable function V : ℝ^n → ℝ, called a weak Lyapunov
function, such that

    • V is locally positive-definite about x*,
    • L_f V is locally negative-semidefinite about x*;

(ii) locally asymptotically stable if there exists a continuously-differentiable function V : ℝ^n → ℝ, called a
local Lyapunov function, such that

    • V is locally positive-definite about x*,
    • L_f V is locally negative-definite about x*;

(iii) globally asymptotically stable if there exists a continuously-differentiable function V : ℝ^n → ℝ, called
a global Lyapunov function, such that

    • V is globally positive-definite about x*,
    • L_f V is globally negative-definite about x*,
    • V is radially unbounded.


Note: the Lyapunov Theorem assumes the existence of a Lyapunov function with certain properties, but
does not provide any constructive method to design or compute one. In what follows, we will see that
Lyapunov functions can be designed easily for certain classes of systems. But, in general, the computation
of a Lyapunov function is a challenging task.

13.2.2 Second main convergence tool: the LaSalle Invariance Principle


We now present a powerful analysis tool for the convergence analysis of nonlinear systems, namely the
LaSalle Invariance Principle. We refer to (Khalil 2002, Theorem 4.4) for a complete proof, many examples
and much related material. Also, we refer to (Mesbahi and Egerstedt 2010; Bullo et al. 2009) for various
extensions and applications to robotic coordination.

Theorem 13.4 (LaSalle Invariance Principle). Consider a dynamical system (X, f) with differentiable
f. Assume there exist

(i) a compact set W ⊂ X that is invariant for (X, f), and
(ii) a continuously-differentiable function V : X → ℝ satisfying L_f V(x) ≤ 0 for all x ∈ X.

Then each solution t ↦ x(t) starting in W, that is, x(0) ∈ W, converges to the largest invariant set contained
in

    {x ∈ W | L_f V(x) = 0}.

Note: If the set S is composed of multiple disconnected components and t ↦ x(t) approaches S, then it
must approach one of its disconnected components. Specifically, if the set S is composed of a finite number
of points, then t ↦ x(t) must converge to one of the points.

13.2.3 Application #1: Linear and linearized systems


It is interesting to study the convergence properties of a linear system. Recall that a symmetric matrix is
positive definite if all its eigenvalues are strictly positive.

Theorem 13.5 (Convergence of linear systems). For a matrix A ∈ ℝ^{n×n}, the following properties are
equivalent:

(i) each solution to the differential equation ẋ = Ax satisfies lim_{t→+∞} x(t) = 0_n,
(ii) A is Hurwitz, i.e., all the eigenvalues of A have strictly-negative real parts, and
(iii) for every positive-definite matrix Q, there exists a unique positive-definite matrix P solving the
so-called Lyapunov equation:

    A^T P + P A = −Q.

One can show that statement (iii) implies statement (i) using the LaSalle Invariance Principle with the
function V(x) = x^T P x, whose derivative along the system's solutions is V̇ = x^T (A^T P + P A) x =
−x^T Q x ≤ 0.
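As an illustration of Theorem 13.5(iii), the following sketch (assuming Python with scipy; the matrix A is an arbitrary Hurwitz example) solves the Lyapunov equation numerically and checks that the solution P is positive definite:

    import numpy as np
    from scipy.linalg import solve_lyapunov

    A = np.array([[ 0.,  1.],
                  [-2., -3.]])               # eigenvalues -1 and -2: Hurwitz
    Q = np.eye(2)

    # solve_lyapunov(M, R) solves M X + X M^T = R; with M = A^T this is the
    # Lyapunov equation A^T P + P A = -Q of Theorem 13.5(iii).
    P = solve_lyapunov(A.T, -Q)

    print(np.linalg.eigvals(A).real)         # strictly negative real parts
    print(np.linalg.eigvalsh(P))             # strictly positive: P > 0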


The linearization at the equilibrium point x* of the dynamical system (X, f) is the linear dynamical
system defined by the differential equation ẋ = Ax, where

    A = (∂f/∂x)(x*).
Theorem 13.6 (Convergence of nonlinear systems via linearization). Consider a dynamical system
(X, f) with an equilibrium point x*, with twice differentiable vector field f, and with linearization A at x*.
The following statements hold:

(i) the equilibrium point x* is locally exponentially stable if all the eigenvalues of A have strictly-negative
real parts; and
(ii) the equilibrium point x* is unstable if at least one eigenvalue of A has strictly-positive real part.

Theorem 13.6 can often be invoked to analyze the local stability of a nonlinear system. For example, for
ω ∈ ℝ, consider the dynamical system

    θ̇ = f(θ) = ω − sin(θ),

which we will study extensively in Chapters 14 and 15. If ω ∈ [0, 1[, then two equilibrium points are
θ₁ = arcsin(ω) ∈ [0, π/2[ and θ₂ = π − arcsin(ω) ∈ ]π/2, π]. Moreover, the 2π-periodic sets of equilibria
are given by {θ₁ + 2kπ | k ∈ ℤ} and {θ₂ + 2kπ | k ∈ ℤ}. The linearization matrix A(θ_i) = (∂f/∂θ)(θ_i) =
−cos(θ_i) for i ∈ {1, 2} shows that θ₁ is locally exponentially stable and θ₂ is unstable.
On the other hand, pick a scalar c and, for x ∈ ℝ, consider the dynamical system

    ẋ = f(x) = c x³.

The linearization at the equilibrium x* = 0 is indefinite: A(x*) = 0. Thus, Theorem 13.6 offers no
conclusions other than that the equilibrium cannot be exponentially stable. On the other hand, the LaSalle
Invariance Principle shows that for c < 0 every trajectory converges to x* = 0. Here, a non-increasing and
differentiable function is given by V(x) = x² with Lie derivative L_f V(x) = 2c x⁴ ≤ 0. Since V(x(t)) is
non-increasing along the solutions of the dynamical system, a compact invariant set is then readily given by
any sublevel set {x | V(x) ≤ ℓ} for ℓ ≥ 0.

13.2.4 Application #2: Negative gradient systems


Given a twice differentiable function U : ℝ^n → ℝ, the negative gradient flow defined by U is the dynamical
system

    ẋ(t) = −(∂U/∂x)(x(t)).    (13.4)

For x ∈ ℝ^n, the Hessian matrix Hess U(x) is the symmetric matrix of second-order partial derivatives:
(Hess U)_{ij}(x) = ∂²U / ∂x_i ∂x_j.

Theorem 13.7 (Convergence of negative gradient flows). Let U : ℝ^n → ℝ be twice differentiable and
assume its sublevel set {x | U(x) ≤ ℓ} is compact for some ℓ ∈ ℝ. Then the negative gradient flow (13.4) has
the following properties:

(i) the sublevel set {x | U(x) ≤ ℓ} is invariant,
(ii) each solution t ↦ x(t) with U(x(0)) ≤ ℓ satisfies lim_{t→+∞} U(x(t)) = c ≤ ℓ and approaches the set
of critical points of U:

    {x ∈ ℝ^n | (∂U/∂x)(x) = 0_n},

(iii) each local minimum point x* is locally asymptotically stable, and it is locally exponentially stable if and
only if Hess U(x*) is positive definite,
(iv) a critical point x* is unstable if at least one eigenvalue of Hess U(x*) is strictly negative.

Proof. To show statements (i) and (ii), we verify that the assumptions of the LaSalle Invariance Principle
are satisfied as follows. First, as the set W we adopt the sublevel set {x | U(x) ≤ ℓ}, which is compact by
assumption and is invariant because, as we show next, the value of t ↦ U(x(t)) is non-increasing. Second,
the derivative of the function U along its negative gradient flow is

    (d/dt) U(x) = −‖(∂U/∂x)(x)‖₂² ≤ 0.

The first two facts are now an immediate consequence of the LaSalle Invariance Principle. Statements (iii)
and (iv) follow from observing that the linearization of the negative gradient system at the equilibrium x*
is the negative Hessian matrix −Hess U(x*) and from applying Theorem 13.6. ∎

Note: If the function U has isolated critical points, then the negative gradient flow evolving in a compact
set must converge to a single critical point. In such circumstances, it is also true that from almost all initial
conditions the solution will converge to a local minimum rather than to a local maximum or a saddle point.
Note: Given a critical point x*, a positive definite Hessian matrix Hess U(x*) is a sufficient but not a
necessary condition for x* to be a local minimum. As a counterexample, consider the function U(x) = x⁴
and the critical point x* = 0.
Note: If the function U is radially unbounded, that is, lim_{‖x‖→∞} U(x) = ∞ (where the limit is taken
along any path resulting in ‖x‖ → ∞), then all its sublevel sets are compact.
Note from (Łojasiewicz 1984): if the function U is analytic, then every solution starting in a compact
sublevel set has finite length and converges to a single equilibrium point.

Example 13.8 (Dissipative mechanical system). Consider a dissipative mechanical system of the form

    ṗ = v,
    m v̇ = −d v − (∂U/∂p)(p),

where (p, v) ∈ ℝ² are the position and velocity coordinates, m and d are the positive inertia and damping
coefficients, and U : ℝ → ℝ is a twice differentiable potential energy function. We assume that U is strictly
convex with a unique global minimum at p*. Consider the mechanical energy E : ℝ × ℝ → ℝ≥0 given by the
sum of kinetic and potential energy:

    E(p, v) = ½ m v² + U(p).

We compute its derivative along trajectories of the mechanical system as follows:

    Ė(p, v) = m v v̇ + (∂U/∂p)(p) ṗ = −d v² ≤ 0.

Notice that the assumptions of the LaSalle Invariance Principle in Theorem 13.4 are satisfied: the function E and
the vector field (the right-hand side of the mechanical system) are continuously differentiable; the derivative Ė is
nonpositive; and for any initial condition (p₀, v₀) ∈ ℝ², the sublevel set {(p, v) ∈ ℝ² | E(p, v) ≤ E(p₀, v₀)}
is compact due to the strict convexity of U. It follows that (p(t), v(t)) converges to the largest invariant set
contained in {(p, v) ∈ ℝ² | E(p, v) ≤ E(p₀, v₀), v = 0}, that is, {(p, v) ∈ ℝ² | E(p, v) ≤ E(p₀, v₀), v =
0, (∂U/∂p)(p) = 0}. Because U is strictly convex and twice differentiable, (∂U/∂p)(p) = 0 if and only if p = p*.
Therefore, we conclude

    lim_{t→+∞} (p(t), v(t)) = (p*, 0).
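A quick numerical sketch of this example (assuming Python with numpy; the potential U(p) = p⁴/4 + p²/2 and the coefficients are illustrative choices satisfying the assumptions) integrates the system with the explicit Euler method and records the mechanical energy:

    import numpy as np

    m, d = 1.0, 0.5                          # inertia and damping
    U  = lambda p: 0.25 * p**4 + 0.5 * p**2  # strictly convex, minimum at p* = 0
    dU = lambda p: p**3 + p
    E  = lambda p, v: 0.5 * m * v**2 + U(p)

    p, v, dt = 2.0, 0.0, 1e-3
    energies = []
    for k in range(20000):
        if k % 5000 == 0:
            energies.append(E(p, v))
        p, v = p + dt * v, v + dt * (-d * v - dU(p)) / m

    print(energies)                          # decreasing mechanical energy
    print(p, v)                              # approaches (p*, 0) = (0, 0)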

13.3 A nonlinear rendezvous problem


Consider the nonlinear rendezvous system

    ṗ_i = f_i(p) = −∑_{{i,j}∈E} g_ij(p_i − p_j),    (13.5)

where (for each {i, j} ∈ E) g_ij = g_ji is a continuously differentiable, strictly increasing, and anti-symmetric
function satisfying e g_ij(e) ≥ 0 and g_ij(e) = 0 if and only if e = 0. Notice that the linearization of the
system around the consensus subspace may be zero and thus not very informative, for example, when
g_ij(e) = ‖e‖² e. The nonlinear rendezvous system (13.5) can be written as a gradient flow:

    ṗ_i = −(∂/∂p_i) V(p) = −∑_{j=1}^n (∂/∂p_i) V_ij(‖p_i − p_j‖),

with the associated edge potential functions V_ij(‖p_i − p_j‖) = ∫_0^{‖p_i−p_j‖} g_ij(ξ) dξ.

Theorem 13.9 (Nonlinear rendezvous). Consider the nonlinear rendezvous system (13.5) with an undi-
rected and connected graph G = (V, E). Assume that the associated edge potential functions V_ij(‖p_i − p_j‖) =
∫_0^{‖p_i−p_j‖} g_ij(ξ) dξ are radially unbounded. For every initial condition p₀ ∈ ℝ^{2n}, we have that

• the center of mass is stationary: average(p(t)) = average(p₀) for all t ≥ 0; and
• lim_{t→∞} p(t) = 1_n ⊗ average(p₀).

Proof. Note that the nonlinear rendezvous system (13.5) is the negative gradient system defined by the
network potential function

    V(p) = ∑_{{i,j}∈E} V_ij(‖p_i − p_j‖).

Recall from Lemma 13.1 that the center of mass is stationary, and observe that the function V(p) is radially
unbounded with the exception of the directions 1_n ⊗ q, q ∈ ℝ², associated with a translation of the stationary
center of mass. Thus, for every initial condition p₀ ∈ ℝ^{2n}, the set of points (with fixed center of mass)

    {p ∈ ℝ^{2n} | average(p) = average(p₀), V(p) ≤ V(p₀)}

defines a compact set. By the LaSalle Invariance Principle in Theorem 13.4, each solution converges to the
largest invariant set contained in

    {p ∈ ℝ^{2n} | average(p) = average(p₀), V(p) ≤ V(p₀), (∂V(p)/∂p) = 0_{2n}^T}.

It follows that the only positive limit set is the set of equilibria with fixed center of mass:
lim_{t→∞} p(t) = 1_n ⊗ average(p₀). ∎


13.4 Flocking and Formation Control


In flocking control, the objective is that the robots mimic the behavior of fish schools and bird flocks
and attain a prescribed formation defined by a set of distance constraints. Given an undirected graph
G = (V, E) and a distance constraint d_ij for every edge {i, j} ∈ E, a formation is defined by the set

    F = {p ∈ ℝ^{2n} | ‖p_i − p_j‖₂ = d_ij for all {i, j} ∈ E}.

We embed the graph G into the plane ℝ² by assigning to each node i a location p_i ∈ ℝ². We refer to the
pair (G, p) as a framework, and we denote the set of frameworks (G, F) as the target formation. A target
formation is a realization of F in the configuration space ℝ^{2n}. A triangular example is shown in Figure 13.5.
Figure 13.5: A triangular formation specified by the distance constraints d₁₂, d₁₃, and d₂₃. The left subfigure shows
one possible target formation, the middle subfigure shows a rotation of this target formation, and the right subfigure
shows a flip of the left target formation. All of these triangles satisfy the specified distance constraints and are
elements of F.

We make the following three observations on the geometry of the target formation:

• To be non-empty, the formation F has to be realizable in the plane. For example, for the triangular
formation in Figure 13.5, the distance constraints d_ij need to satisfy the triangle inequalities:

    d₁₂ ≤ d₁₃ + d₂₃,  d₂₃ ≤ d₁₂ + d₁₃,  d₁₃ ≤ d₁₂ + d₂₃.

• A framework (G, p) with p ∈ F is invariant under rigid body transformations, that is, rotations and
translations, as seen in Figure 13.5. Hence, the formation F is a set of dimension at least 3.
• The formation F may consist of multiple disconnected components. For instance, for the triangular
example in Figure 13.5, there is no continuous deformation from the left framework to the right
flipped framework, even though both are target formations. In the state space ℝ⁶, this absence of a
continuous deformation corresponds to two disconnected components of the set F.

To steer the agents towards the target formation, consider an artificial potential function for each edge
{i, j} ∈ E which mimics the Hookean potential of a spring with rest length d_ij:

    V_ij(‖p_i − p_j‖) = ½ (‖p_i − p_j‖₂ − d_ij)².

Since this potential function is not differentiable (at p_i = p_j), we choose the modified potential function

    V_ij(‖p_i − p_j‖) = ¼ (‖p_i − p_j‖₂² − d_ij²)².    (13.6)

The resulting closed loop under the gradient control law u = −(∂V(p)/∂p)^T is given by

    ṗ_i = u_i = −(∂/∂p_i) V(p) = −∑_{{i,j}∈E} (‖p_i − p_j‖₂² − d_ij²)(p_i − p_j).    (13.7)

Observe that the set of equilibria of the closed loop (13.7) is the set of critical points of V(p), which is a
strict super-set of the target formation F. For example, it includes the set of points where two neighbors
are collocated: p_i = p_j for {i, j} ∈ E. In the following, we show convergence to the equilibrium set.

Theorem 13.10 (Flocking). Consider the nonlinear flocking system (13.7) with an undirected and connected
graph G = (V, E) and a realizable formation F. For every initial condition p₀ ∈ ℝ^{2n}, we have that

• the center of mass is stationary: average(p(t)) = average(p₀) for all t ≥ 0; and
• the agents asymptotically converge to the set of critical points of the potential function.

Proof. As in the proof of Theorem 13.9, the center of mass is stationary and the potential is non-increasing:

    V̇(p) = −‖(∂V(p)/∂p)^T‖₂² ≤ 0.

Observe further that, for a fixed initial center of mass, the sublevel sets of V(p) form compact sets. By the
LaSalle Invariance Principle in Theorem 13.4, p(t) converges to the largest invariant set contained in

    {p ∈ ℝ^{2n} | average(p) = average(p₀), V(p) ≤ V(p₀), (∂V(p)/∂p) = 0_{2n}^T}.

It follows that the positive limit set is the set of critical points of the potential function. ∎
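To illustrate Theorem 13.10, the following sketch (assuming Python with numpy; the unit-distance triangle, step size, and random seed are illustrative choices) integrates the flocking system (13.7) from a generic initial condition; the realized inter-agent distances approach the specified d_ij:

    import numpy as np

    # Target formation: equilateral triangle with d12 = d13 = d23 = 1
    # on the complete graph over three agents.
    edges = [(0, 1), (0, 2), (1, 2)]
    d = 1.0

    rng = np.random.default_rng(2)
    p = rng.normal(size=(3, 2))              # generic initial positions

    dt = 0.01
    for _ in range(5000):
        u = np.zeros_like(p)
        for (i, j) in edges:
            e = p[i] - p[j]
            w = e @ e - d**2                 # ||p_i - p_j||^2 - d_ij^2
            u[i] -= w * e                    # gradient control law (13.7)
            u[j] += w * e
        p = p + dt * u

    print([round(float(np.linalg.norm(p[i] - p[j])), 3) for i, j in edges])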


Observe that Theorem 13.10 guarantees at most convergence to the critical points of the potential
function. Depending on the problem scenario of interest, we still have to investigate which of these critical
points are locally asymptotically stable or unstable on a case-by-case basis; see Exercise E13.8 for an
application to a linear formation and Section 13.5 for a more general analysis.
The above Theorem 13.10 also holds true for non-smooth potential functions V_ij : ]0, ∞[ → ℝ that
satisfy

(P1) regularity: V_ij(ξ) is defined and twice continuously-differentiable on ]0, ∞[;
(P2) distance specification: f_ij(ξ) = (∂/∂ξ) V_ij(ξ) = 0 if and only if ξ = d_ij;
(P3) mutual attractivity: f_ij(ξ) = (∂/∂ξ) V_ij(ξ) is strictly monotonically increasing; and
(P4) collision avoidance: lim_{ξ→0} V_ij(ξ) = ∞.

An illustration of possible potential functions can be found in Figure 13.6. These potential functions can
also be easily modified to include input constraints; see Exercise E13.4.

Figure 13.6: Illustration of the quadratic potential function (13.6) (blue solid plot) and of a logarithmic barrier potential
function (red dashed plot) that approaches ∞ as two neighboring agents become collocated. (a) Artificial potential
functions; (b) induced artificial spring forces.

Theorem 13.11 (Flocking with collision avoidance). Consider the gradient flow (13.1) with an undirected
and connected graph G = (V, E), a realizable formation F, and artificial potential functions satisfying (P1)
through (P4). For every initial condition p₀ ∈ ℝ^{2n} satisfying p_i(0) ≠ p_j(0) for all {i, j} ∈ E, we have that

• the solution to the non-smooth dynamical system exists for all times t ≥ 0;
• the center of mass average(p(t)) = average(p(0)) is stationary for all t ≥ 0;
• neighboring robots will not collide, that is, p_i(t) ≠ p_j(t) for all {i, j} ∈ E and for all t ≥ 0; and
• the agents asymptotically converge to the set of critical points of the potential function.

Proof. The proof of Theorem 13.11 is identical to that of Theorem 13.10 after realizing that, for initial
conditions satisfying p_i(0) ≠ p_j(0) for all {i, j} ∈ E, the dynamics are confined to the compact and
forward invariant set

    {p ∈ ℝ^{2n} | average(p) = average(p₀), V(p) ≤ V(p₀)}.

Within this set, the dynamics (13.7) are twice continuously differentiable and collisions are avoided. ∎

At this point we should ask ourselves the following three questions:

(i) Do the agents actually stop, that is, does there exist a p∞ ∈ ℝ^{2n} so that lim_{t→∞} p(t) = p∞?
(ii) The formation F is a subset of the set of critical points of the potential function. How can we render
this particular subset stable (amongst possibly other critical points)? What are the other critical
points?
(iii) Does our specification of the target formation make sense? For example, in Figure 13.7 the target
formation can be infinitesimally deformed, such that the resulting geometric configurations are not
congruent.

Figure 13.7: A rectangular target formation among four robots, which is specified by four distance constraints.
The initial geometric configuration (solid circles) can be continuously deformed such that the resulting geometric
configuration is no longer congruent. All of the displayed configurations are part of the target formation set and
satisfy the distance constraints, even the case when the agents are collinear.

The answer to all of these questions is tied to a graph-theoretic concept called rigidity.

13.5 Rigidity and stability of the target formation


To introduce the notion of graph rigidity, we view the undirected graph G = (V, E) as a framework (G, p)
embedded in the plane ℝ². Given a framework (G, p), we define the rigidity function r_G(p) as

    r_G : ℝ^{2n} → ℝ^{|E|},   r_G(p) = ½ (. . . , ‖p_i − p_j‖₂², . . .)^T,

where each component of r_G(p) corresponds to the squared length of the relative position p_i − p_j for {i, j} ∈ E.

Definition 13.12 (Rigidity). Given an undirected graph G = (V, E) and p ∈ ℝ^{2n}, the framework (G, p) is
said to be rigid if there is an open neighbourhood U of p such that if q ∈ U and r_G(p) = r_G(q), then (G, p) is
congruent to (G, q).


Figure 13.8: (a) a flexible framework; (b) a rigid framework. The framework in Figure 13.8a is not rigid, since a slight perturbation of the upper two points of the framework results in a framework that is not congruent to the original one although their rigidity functions coincide. If an additional cross link is added to the framework as in Figure 13.8b, small perturbations that do not change the rigidity function result in a congruent framework. Thus, the framework in Figure 13.8b is rigid.

Examples of a rigid and of a non-rigid framework are shown in Figure 13.8.


Although rigidity is a very intuitive concept, its definition does not provide an easily verifiable condition, especially if one is interested in finding the exact neighbourhood U where the framework is rigid. The following linearized rigidity concept offers an easily checkable algebraic condition. The idea is to allow an infinitesimally small perturbation $\delta p$ of the framework (G, p) while keeping the rigidity function constant up to first order. The first-order Taylor approximation of the rigidity function $r_G$ about p is

$r_G(p + \delta p) = r_G(p) + \frac{\partial r_G(p)}{\partial p}\, \delta p + O(\|\delta p\|^2).$

The rigidity function then remains constant up to first order if $\delta p \in \mathrm{kernel}\big(\frac{\partial r_G(p)}{\partial p}\big)$. The matrix $\frac{\partial r_G(p)}{\partial p} \in \mathbb{R}^{|E| \times 2n}$ is called the rigidity matrix of the graph G. If the perturbation $\delta p$ is a rigid body motion, that is, a translation or a rotation of the framework, then, by Definition 13.12, the framework remains rigid. Thus, the dimension of the kernel of the rigidity matrix is at least 3. The idea that rigidity is preserved under infinitesimal perturbations motivates the following definition of infinitesimal rigidity.

Definition 13.13 (Infinitesimal rigidity). Given an undirected graph G = (V, E) and $p \in \mathbb{R}^{2n}$, the framework (G, p) is said to be infinitesimally rigid if $\dim \mathrm{kernel}\big(\frac{\partial r_G(p)}{\partial p}\big) = 3$ or, equivalently, if $\mathrm{rank}\big(\frac{\partial r_G(p)}{\partial p}\big) = 2n - 3$.

If a framework is infinitesimally rigid, then it is also rigid, but the converse is not necessarily true (Asimow and Roth 1979). Also note that an infinitesimally rigid framework must have at least $2n - 3$ edges. If it has exactly $2n - 3$ edges, then we call it a minimally rigid framework. Finally, if (G, p) is infinitesimally rigid at p, so is (G, p′) for every p′ in an open neighborhood of p. Thus, infinitesimal rigidity is a generic property that depends essentially only on the graph G and not on the specific point $p \in \mathbb{R}^{2n}$. Throughout the literature, (infinitesimally, minimally) rigid frameworks are often referred to as (infinitesimally, minimally) rigid graphs.

Example 13.14 (Rigidity and infinitesimal rigidity of a triangular formation). Consider the triangular framework in Figure 13.9a and the collapsed triangular framework in Figure 13.9b, which are both embeddings of the same triangular graph. The rigidity function for both frameworks is given by

$r_G(p) = \frac{1}{2} \begin{bmatrix} \|p_2 - p_1\|^2 \\ \|p_3 - p_2\|^2 \\ \|p_1 - p_3\|^2 \end{bmatrix}.$

Both frameworks are rigid, but only the left framework is infinitesimally rigid. To see this, consider the rigidity matrix

$\frac{\partial r_G(p)}{\partial p} = \begin{bmatrix} p_1^\top - p_2^\top & p_2^\top - p_1^\top & 0_2^\top \\ 0_2^\top & p_2^\top - p_3^\top & p_3^\top - p_2^\top \\ p_1^\top - p_3^\top & 0_2^\top & p_3^\top - p_1^\top \end{bmatrix}.$

The rank of the rigidity matrix at a collinear point is $2 < 2n - 3 = 3$. Hence, the collapsed triangle in Figure 13.9b is not infinitesimally rigid. All non-collinear realizations are infinitesimally and minimally rigid. Hence, the triangular framework in Figure 13.9a is generically minimally rigid (for almost every $p \in \mathbb{R}^6$).
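As a quick numerical sanity check of Example 13.14, the following Python sketch (assuming numpy is available; the coordinates are illustrative choices, not data from the text) builds the rigidity matrix of the triangle and compares its rank to $2n - 3$:

```python
import numpy as np

def rigidity_matrix(p, edges):
    """Rigidity matrix dr_G/dp in R^{|E| x 2n} for a planar framework.
    The row for edge {i,j} carries (p_i - p_j)^T in the columns of node i
    and (p_j - p_i)^T in the columns of node j."""
    n = p.shape[0]
    R = np.zeros((len(edges), 2 * n))
    for k, (i, j) in enumerate(edges):
        R[k, 2*i:2*i+2] = p[i] - p[j]
        R[k, 2*j:2*j+2] = p[j] - p[i]
    return R

edges = [(0, 1), (1, 2), (2, 0)]                       # triangle graph, n = 3
p_generic = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
p_collinear = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])

for p in (p_generic, p_collinear):
    rank = np.linalg.matrix_rank(rigidity_matrix(p, edges))
    print(rank, rank == 2 * p.shape[0] - 3)            # prints: 3 True, then 2 False
```

The generic (non-collinear) triangle attains rank $3 = 2n - 3$ and is thus infinitesimally rigid, while the collinear realization drops to rank 2, exactly as in the example.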

Minimally rigid graphs can be constructed by adding a new node with two undirected edges to an existing minimally rigid graph; see Figure 13.10. This construction is known as a Henneberg sequence.
The flocking result in Theorem 13.10 identifies the critical points of the potential function as the positive limit set. For minimally rigid graphs, we can perform a more insightful stability analysis. To do so, we first reformulate the formation control problem in the coordinates of the relative positions $e = \bar B^\top p$ (here $\bar B = B \otimes I_2$, where B is the incidence matrix of G, acts on the stacked planar positions). The rigidity function can be conveniently rewritten in terms of the relative positions $e_{ij} = p_i - p_j$ for every edge $\{i,j\} \in E$:

$r_G : \bar B^\top \mathbb{R}^{2n} \to \mathbb{R}^{|E|}, \qquad r_G(e) = \frac{1}{2}\big(\dots, \|e_{ij}\|_2^2, \dots\big)^\top.$

Figure 13.9: Infinitesimal rigidity properties of a framework with three points $p_1, p_2, p_3$: (a) a rigid and infinitesimally rigid framework (triangle inequalities are strict); (b) a rigid but not infinitesimally rigid framework (triangle inequalities are equalities).

Figure 13.10: Construction of a minimally rigid graph by means of a Henneberg sequence


The rigidity matrix is then obtained in terms of the relative positions as

$R(e) \triangleq \frac{\partial r_G(e)}{\partial p} = \frac{\partial r_G(e)}{\partial e} \frac{\partial e}{\partial p} = \mathrm{diag}\big(\{e_{ij}^\top\}_{\{i,j\} \in E}\big)\, \bar B^\top.$

Consider the shorthand $v(e) - d = \big(\dots, \|p_i - p_j\|_2^2 - d_{ij}^2, \dots\big)^\top$. Then the closed-loop formation control equations (13.7) can be reformulated in terms of relative positions as

$\dot e = \bar B^\top \dot p = \bar B^\top u = -\bar B^\top \bar B\, \mathrm{diag}(e)\, (v(e) - d) = -\bar B^\top R(e)^\top (v(e) - d). \qquad (13.8)$

The associated initial condition $e_0 = \bar B^\top p_0$ is a vector in $\mathrm{image}(\bar B^\top)$.

Theorem 13.15 (Stability of minimally rigid formations (Dörfler and Francis 2009)). Consider the nonlinear flocking system (13.7) with an undirected and connected graph G = (V, E) and a realizable and minimally rigid formation F. For every initial condition $p_0 \in \mathbb{R}^{2n}$, we have that

• the center of mass is stationary: $\mathrm{average}(p(t)) = \mathrm{average}(p_0)$ for all $t \geq 0$;
• the agents asymptotically converge to the set

$W_{p_0} = \{\, p \in \mathbb{R}^{2n} \mid \mathrm{average}(p) = \mathrm{average}(p_0),\; V(p) \leq V(p_0),\; \|R(e)^\top [v(e) - d]\|_2 = 0 \,\}.$

In particular, the limit set $W_{p_0}$ is a union of realizations of the target formation (G, p) with $p \in W_{p_0} \cap F$ and the set of points $p \in W_{p_0}$ where the framework (G, p) is not infinitesimally rigid; and
• for every $p_0 \in \mathbb{R}^{2n}$ such that the framework (G, p) is minimally rigid for all p in the set

$\{\, p \in \mathbb{R}^{2n} \mid \mathrm{average}(p) = \mathrm{average}(p_0),\; V(p) \leq V(p_0) \,\},$

the agents converge exponentially fast to a stationary target formation $(G, p^*)$ with $p^* \in W_{p_0} \cap F$.

Proof. Consider the potential function (13.6), which reads in e-coordinates as

$V(e) = \frac{1}{4} \|v(e) - d\|^2. \qquad (13.9)$

In the space of relative positions, the target formation set $\bar B^\top F$ is compact since the translational invariance is removed. Also the sublevel sets of V(e) are compact, and the derivative along the trajectories of (13.8) is

$\dot V(e) = [v(e) - d]^\top \mathrm{diag}\big(\{e_{ij}^\top\}\big)\, \dot e = -[v(e) - d]^\top R(e) R(e)^\top [v(e) - d] \leq 0.$

Notice that V(e(t)) is non-increasing, and for every $c \geq 0$ the sublevel set

$\Omega(c) := \{\, e \in \mathrm{image}(\bar B^\top) \mid V(e) \leq c \,\}$

is forward invariant. By the LaSalle Invariance Principle, for every initial condition $e_0 \in \mathrm{image}(\bar B^\top)$ the associated solution of (13.8) converges to the largest invariant set in

$W_{e_0} = \{\, e \in \mathrm{image}(\bar B^\top) \mid V(e) \leq V(e_0),\; \|R(e)^\top [v(e) - d]\|_2 = 0 \,\}.$


In particular, the limit set $W_{e_0}$ includes (i) realizations of the target formation (G, p) with $p \in W_{p_0} \cap F$, $e = \bar B^\top p$, and $v(e) - d = 0_{|E|}$, and (ii) the set of points $e \in W_{e_0}$ where the rigidity matrix $R(e)^\top \in \mathbb{R}^{2n \times |E|}$ loses rank, corresponding to points $p \in W_{p_0}$ where the framework (G, p) is not infinitesimally rigid.

Due to the minimal rigidity of the target formation, the matrix $R(e)^\top \in \mathbb{R}^{2n \times |E|}$ has full rank $|E| = 2n - 3$ for all $e \in \bar B^\top F$, or, said differently, $R(e) R(e)^\top$ has no zero eigenvalues for all $e \in \bar B^\top F$. The minimal eigenvalue of $R(e) R(e)^\top$ is positive for all $e \in \bar B^\top F$ and thus (due to continuity of eigenvalues with respect to the matrix elements) also in an open neighborhood of $\bar B^\top F$. In particular, for any strictly positive $\varepsilon > 0$, we can find $\gamma = \gamma(\varepsilon)$ so that everywhere in the sublevel set $\Omega(\gamma)$ the matrix $R(e) R(e)^\top$ is positive definite with eigenvalues lower-bounded by $\varepsilon$. Formally, $\gamma$ is obtained as

$\gamma = \arg\max_{\bar\gamma}\; \bar\gamma \quad \text{subject to} \quad \min_{e \in \Omega(\bar\gamma)} \mathrm{eig}\big(R(e) R(e)^\top\big) \geq \varepsilon.$

Then, for all $e \in \Omega(\gamma)$, we can upper-bound the derivative of V(e) along trajectories as

$\dot V(e) \leq -\varepsilon \|v(e) - d\|^2 = -4 \varepsilon\, V(e). \qquad (13.10)$

By the Grönwall-Bellman Comparison Lemma in Exercise E13.1, we have that, for every $e_0 \in \Omega(\gamma)$, $V(e(t)) \leq V(e_0)\, e^{-4 \varepsilon t}$. It follows that the target formation set (parameterized in terms of relative positions) $\bar B^\top F$ is exponentially stable with $\Omega(\gamma)$ as guaranteed region of attraction.

Although the e-dynamics (13.8) and the p-dynamics (13.7) both have the formation F as a limit set, convergence of the e-dynamics does not automatically imply convergence to a stationary target formation (but only convergence of the point-to-set distance to F). To establish stationarity, we rewrite the p-dynamics (13.7) as

$p(t) = p_0 + \int_0^t f(\tau)\, d\tau, \qquad (13.11)$

where $f(t) = -\bar B\, \mathrm{diag}(e(t))\, [v(e(t)) - d]$. Due to the exponential convergence rate of the e-dynamics in $W_{e_0}$, the function f(t) is exponentially decaying in time and thus an integrable ($L^1$) function. It follows that the integral on the right-hand side of (13.11) exists even in the limit as $t \to \infty$, and thus a solution of the p-dynamics converges to a finite point in F, that is, the agents converge to a stationary target formation. In conclusion, for every $p_0 \in \mathbb{R}^{2n}$ so that $e_0 = \bar B^\top p_0 \in \Omega(\gamma)$, the agents converge exponentially fast to a stationary target formation.

Theorem 13.15, formulated for minimally rigid formations, can also be extended to more redundant infinitesimally rigid formations; see (Oh et al. 2015).
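To make the closed loop (13.7)/(13.8) concrete, here is a minimal simulation sketch in Python. The square-with-diagonal target formation, the forward-Euler discretization, and all numerical values are illustrative assumptions of this sketch, not prescriptions from the text:

```python
import numpy as np

# Minimally rigid target formation: unit square with one diagonal,
# n = 4 nodes and |E| = 2n - 3 = 5 edges (assumed example data).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
d = np.array([1.0, 1.0, 1.0, 1.0, np.sqrt(2.0)])   # target edge lengths d_ij

def control(p):
    """u_i = -sum_j (||p_i - p_j||^2 - d_ij^2)(p_i - p_j), i.e., law (13.7)."""
    u = np.zeros_like(p)
    for (i, j), dij in zip(edges, d):
        eij = p[i] - p[j]
        w = eij @ eij - dij**2
        u[i] -= w * eij
        u[j] += w * eij
    return u

rng = np.random.default_rng(1)
p = rng.normal(scale=1.0, size=(4, 2))              # random initial positions
dt = 1e-3
for _ in range(200_000):                            # forward-Euler integration
    p = p + dt * control(p)

print([float(np.linalg.norm(p[i] - p[j])) for i, j in edges])  # typically ~ d
print(p.mean(axis=0))                               # center of mass is invariant
```

Consistent with Theorem 13.15, generic initial conditions typically converge to a stationary realization of the target formation (possibly a reflected copy, cf. Figure 13.7), while the center of mass stays at its initial value throughout.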


13.6 Exercises
E13.1 Grönwall-Bellman Comparison Lemma. Given a continuous function of time $t \mapsto a(t) \in \mathbb{R}$, suppose the signal $t \mapsto x(t)$ satisfies

$\dot x(t) \leq a(t)\, x(t).$

Define a new signal $t \mapsto y(t)$ satisfying $\dot y(t) = a(t)\, y(t)$ with $y(0) = x(0)$. Show that

(i) $y(t) = y(0) \exp\big(\int_0^t a(\tau)\, d\tau\big)$, and
(ii) $x(t) \leq y(t)$.
E13.2 The Lotka-Volterra predator/prey dynamics. In mathematical ecology (Takeuchi 1996), the Lotka-Volterra equations are frequently used to describe the dynamics of biological systems in which two animal species interact, a predator and a prey. According to this model, the animal populations change through time according to

$\dot x(t) = \alpha x(t) - \beta x(t) y(t),$
$\dot y(t) = -\gamma y(t) + \delta x(t) y(t), \qquad (E13.1)$

where x is the nonnegative number of prey individuals, y is the nonnegative number of predator individuals, and $\alpha$, $\beta$, $\gamma$, and $\delta$ are fixed positive system parameters.
(i) Compute the unique non-zero equilibrium point $(x^*, y^*)$ of the system.
(ii) Determine, if possible, the stability properties of the equilibrium points (0, 0) and $(x^*, y^*)$ via linearization (Theorem 13.6).
(iii) Define the function $V(x, y) = \delta x - \gamma \ln(x) + \beta y - \alpha \ln(y)$ and note its level sets as illustrated in Figure E13.1.
a) Compute the Lie derivative of V(x, y) with respect to the Lotka-Volterra vector field.
b) What can you say about the stability properties of $(x^*, y^*)$?
c) Sketch the trajectories of the system for some initial conditions in the x-y positive orthant.

Figure E13.1: Level sets of the function V(x, y) around $(x^*, y^*)$ for unit parameter values

E13.3 On the gradient flow of a strictly convex function. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a strictly convex and twice differentiable function. Show convergence of the associated negative gradient flow, $\dot x = -\nabla f(x)$, to the global minimizer $x^*$ of f using the Lyapunov function $V(x) = (x - x^*)^\top (x - x^*)$ and the LaSalle Invariance Principle in Theorem 13.4.
Hint: Use the global underestimate property of a strictly convex function stated as follows: $f(x') - f(x) > \nabla f(x)^\top (x' - x)$ for all distinct x and x′ in the domain of f.


E13.4 Consensus with input constraints. Consider a set of n agents, each with first-order dynamics $\dot x_i = u_i$.
(i) Design a consensus protocol that respects the input constraints $u_i(t) \in [-1, 1]$ for all $t \geq 0$, and prove that your protocol achieves consensus.
Hint: Adopt the hyperbolic tangent function (or the arctangent function) and Theorem 13.9.
(ii) Extend the protocol and the proof to the case of second-order dynamics $\ddot x_i = u_i$ to achieve consensus of the position states and convergence of the velocity states to zero.
Hint: Recall Example 13.8.
E13.5 Distributed optimization using the Laplacian flow. Consider the saddle point dynamics (7.13) that solve the optimization problem (7.12) in a distributed fashion. Assume that the objective functions are strictly convex and twice differentiable and that the underlying communication graph among the distributed processors is connected and undirected. By using the LaSalle Invariance Principle, show that all solutions of the saddle point dynamics converge to the set of saddle points.
Hint: Use the following global underestimate property of a strictly convex function: $f(x') - f(x) > \nabla f(x)^\top (x' - x)$ for all distinct x and x′ in the domain of f; and the following global overestimate property of a concave function: $g(x') - g(x) \leq \nabla g(x)^\top (x' - x)$ for all distinct x and x′ in the domain of g. Finally, note that the overestimate property holds with equality, $g(x') - g(x) = \nabla g(x)^\top (x' - x)$, if g(x) is affine.
E13.6 Region of attraction of nonlinear systems. Consider the nonlinear system

$\dot x_1 = -2 x_1 - 2 x_2 - 4 x_1^3 x_2^2, \qquad (E13.2)$
$\dot x_2 = 2 x_1 - 2 x_2 - 2 x_1^4 x_2. \qquad (E13.3)$

Is the origin locally asymptotically stable? Can you comment on the region of attraction?
E13.7 Pentagon formation. Consider n = 5 agents that should form a pentagon with unit side lengths ac-
cording to the formation control protocol (13.7). Design a graph so that the pentagon formation is locally
asymptotically stable.
E13.8 Global analysis of a linear formation. Consider two agents with positions $p_i = (x_i, y_i) \in \mathbb{R}^2$, $i \in \{1, 2\}$, with controllable integrator dynamics $\dot p_i = u_i$, where $u_i \in \mathbb{R}^2$ is the steering command that serves as control input. The two agents have access to only relative position measurements $p_1 - p_2$. Your tasks are as follows:
(i) propose a control law for $u_1$ and $u_2$ as a function of the relative position $p_1 - p_2$ and a design parameter $d_{12} > 0$ so that the agents achieve a desired distance $\|p_1 - p_2\| = d_{12}$ in steady state (possibly next to other undesired equilibria);
(ii) study the convergence properties of the closed loop under your proposed control law;
(iii) show that your proposed control law (or a modification thereof) achieves that almost all trajectories converge to the desired formation. Possibly you need to modify your controller accordingly.

Chapter 14
Coupled Oscillators: Basic Models

In this chapter we discuss networks of coupled oscillators. We borrow ideas from (Dörfler and Bullo 2011, 2014). This chapter focuses on phase-coupled oscillators and does not discuss models of impulse-coupled oscillators. Further information on coupled oscillator models can be found in Mauroy et al. (2012); Acebrón et al. (2005); Strogatz (2000); Arenas et al. (2008).

14.1 History

The scientific interest in synchronization of coupled oscillators can be traced back to the work by Christiaan Huygens on an "odd kind of sympathy" between coupled pendulum clocks (Huygens 1673). The model of coupled oscillators which we study was originally proposed by Arthur Winfree (Winfree 1967). For complete interaction graphs, this model is nowadays known as the Kuramoto model due to the work by Yoshiki Kuramoto (Kuramoto 1975, 1984). Stephen Strogatz provides an excellent historical account in (Strogatz 2000).

The Kuramoto model and its variations appear in the study of biological synchronization phenomena
such as pacemaker cells in the heart (Michaels et al. 1987), circadian rhythms (Liu et al. 1997), neuroscience
(Varela et al. 2001; Brown et al. 2003; Crook et al. 1997), metabolic synchrony in yeast cell populations
(Ghosh et al. 1971), flashing fireflies (Buck 1988), chirping crickets (Walker 1969), and rhythmic applause (Néda et al. 2000), among others. The Kuramoto model also appears in physics and chemistry in modeling
and analysis of spin glass models (Daido 1992; Jongen et al. 2001), flavor evolutions of neutrinos (Pantaleone
1998), and in the analysis of chemical oscillations (Kiss et al. 2002). Some technological applications include
deep brain stimulation (Tass 2003), vehicle coordination (Paley et al. 2007; Sepulchre et al. 2007; Klein
et al. 2008), semiconductor lasers (Kozyreff et al. 2000; Hoppensteadt and Izhikevich 2000), microwave
oscillators (York and Compton 2002), clock synchronization in wireless networks (Simeone et al. 2008), and
droop-controlled inverters in microgrids (Simpson-Porco et al. 2013).


Figure 14.1: Mechanical analog of a coupled oscillator network: particles on a ring coupled through elastic springs with stiffnesses $k_{ij}$.

14.2 Examples
14.2.1 Example #1: A spring network on a ring
This coupled-oscillator network consists of particles rotating around a unit-radius circle, assumed to possibly overlap without colliding. Each particle is subject to (1) a non-conservative torque $\tau_i$, (2) a linear damping torque, and (3) a total elastic torque.
Pairs of interacting particles i and j are coupled through elastic springs with stiffness $k_{ij} > 0$. The elastic energy stored by the spring between particles at angles $\theta_i$ and $\theta_j$ is

$E_{ij}(\theta_i, \theta_j) = \frac{k_{ij}}{2}\, \mathrm{distance}^2 = \frac{k_{ij}}{2} \big( (\cos\theta_i - \cos\theta_j)^2 + (\sin\theta_i - \sin\theta_j)^2 \big)$
$\qquad = k_{ij} \big( 1 - \cos\theta_i \cos\theta_j - \sin\theta_i \sin\theta_j \big) = k_{ij} \big( 1 - \cos(\theta_i - \theta_j) \big),$

so that the elastic torque on particle i is

$-\frac{\partial}{\partial \theta_i} E_{ij}(\theta_i, \theta_j) = -k_{ij} \sin(\theta_i - \theta_j).$

In summary, Newton's law applied to this rotating system implies that the network of spring-interconnected particles obeys the dynamics

$M_i \ddot\theta_i + D_i \dot\theta_i = \tau_i - \sum_{j=1}^n k_{ij} \sin(\theta_i - \theta_j),$

where $M_i$ and $D_i$ are inertia and damping coefficients. In the limit of small masses $M_i$ and uniformly-high viscous damping $D = D_i$, that is, $M_i / D \approx 0$, the model simplifies to

$\dot\theta_i = \omega_i - \sum_{j=1}^n a_{ij} \sin(\theta_i - \theta_j), \qquad i \in \{1, \dots, n\},$

with natural rotation frequencies $\omega_i = \tau_i / D$ and coupling strengths $a_{ij} = k_{ij} / D$.
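As a minimal numerical sketch of this first-order model, the following Python fragment integrates the dynamics on a ring of spring-coupled particles with forward Euler; the ring topology, coupling strength, step size, and frequency spread are assumptions of the sketch, not data from the text:

```python
import numpy as np

n = 10
a = np.zeros((n, n))                        # ring coupling a_ij = k_ij / D
for i in range(n):
    a[i, (i + 1) % n] = a[(i + 1) % n, i] = 2.0

omega = np.linspace(-0.5, 0.5, n)           # natural rotation frequencies
rng = np.random.default_rng(0)
theta = rng.uniform(-np.pi, np.pi, n)

dt = 1e-2
for _ in range(50_000):                     # forward-Euler integration
    coupling = (a * np.sin(theta[:, None] - theta[None, :])).sum(axis=1)
    theta = theta + dt * (omega - coupling)

freq = omega - (a * np.sin(theta[:, None] - theta[None, :])).sum(axis=1)
print(freq)  # for sufficiently strong coupling, entries become (nearly) equal
```

For sufficiently strong coupling, the instantaneous frequencies typically equalize (frequency synchronization), a behavior studied formally in Chapter 15.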

Figure 14.2: Line diagram and graph representation for a simplified model of the New England Power Grid, which includes 10 synchronous generators and 39 buses.
14.2.2 Example #2: The structure-preserving power network model

We consider an AC power network, visualized in Figure 14.2, with n buses including generators and load buses. We present two simplified models for this network, a static power-balance model and a dynamic continuous-time model.

The transmission network is described by an admittance matrix $Y \in \mathbb{C}^{n \times n}$ that is symmetric and sparse with line impedances $Z_{ij} = Z_{ji}$ for each branch $\{i,j\} \in E$. The network admittance matrix is a sparse matrix with nonzero off-diagonal entries $Y_{ij} = -1/Z_{ij}$ for each branch $\{i,j\} \in E$; the diagonal elements $Y_{ii} = -\sum_{j=1, j \neq i}^n Y_{ij}$ assure zero row-sums.

The static model is described by the following two concepts. Firstly, according to Kirchhoff's current law, the current injection at node i is balanced by the current flows from adjacent nodes:

$I_i = \sum_{j=1}^n \frac{V_i - V_j}{Z_{ij}} = \sum_{j=1}^n Y_{ij} V_j.$

Here, $I_i$ and $V_i$ are the phasor representations of the nodal current injections and nodal voltages, e.g., $V_i = |V_i| e^{\mathrm{i} \theta_i}$ corresponds to the signal $|V_i| \cos(\omega_0 t + \theta_i)$. (Recall $\mathrm{i} = \sqrt{-1}$.)

The complex power injection $S_i = V_i \bar I_i$ (where $\bar z$ denotes the complex conjugate of $z \in \mathbb{C}$) then satisfies the power balance equation

$S_i = V_i \sum_{j=1}^n \bar Y_{ij} \bar V_j = \sum_{j=1}^n \bar Y_{ij} |V_i| |V_j| e^{\mathrm{i} (\theta_i - \theta_j)}.$

Secondly, for a lossless network the real part of the power balance equations at each node is

$P_i = \sum_{j=1}^n a_{ij} \sin(\theta_i - \theta_j), \qquad i \in \{1, \dots, n\}, \qquad (14.1)$

where the left-hand side is the power injection and each summand $a_{ij} \sin(\theta_i - \theta_j)$ is the active power flow from j to i. Here $a_{ij} = |V_i| |V_j| |Y_{ij}|$ denotes the maximum power transfer over the transmission line $\{i,j\}$, and $P_i = \Re(S_i)$ is the active power injection into the network at node i, which is positive for generators and negative for loads. The systems of equations (14.1) are the so-called (balanced) active power flow equations.
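As a hedged numerical illustration of these two identities, the following Python sketch builds a small lossless admittance matrix and checks that the active power injections $\Re(S_i)$ coincide with the right-hand side of (14.1); the three-bus data are made up for illustration only:

```python
import numpy as np

# A small 3-bus lossless network: line impedances Z_ij = j X_ij (assumed data).
X = {(0, 1): 0.5, (1, 2): 0.4, (0, 2): 0.8}      # line reactances
n = 3
Y = np.zeros((n, n), dtype=complex)
for (i, j), x in X.items():
    Y[i, j] = Y[j, i] = -1.0 / (1j * x)          # off-diagonal entries -1/Z_ij
for i in range(n):
    Y[i, i] = -np.sum(Y[i, np.arange(n) != i])   # zero row-sums

V = np.exp(1j * np.array([0.1, 0.0, -0.1]))     # unit-magnitude voltage phasors
S = V * np.conj(Y @ V)                           # complex power S_i = V_i * conj(I_i)
P = S.real                                       # active power injections

# compare with the active power flow equations (14.1)
theta = np.angle(V)
a = lambda i, j: abs(V[i]) * abs(V[j]) * abs(Y[i, j])
P_flow = [sum(a(i, j) * np.sin(theta[i] - theta[j])
              for j in range(n) if j != i) for i in range(n)]
print(np.allclose(P, P_flow))                    # True for a lossless network
```

The agreement holds exactly because, for purely reactive lines, the diagonal term of the power balance is purely imaginary and drops out of the real part.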

Next, we discuss a simplified dynamic model. Many appropriate dynamic models have been proposed for each network node: zeroth-order models (for so-called constant power loads), first-order models (for so-called frequency-dependent loads and inverter-based generators), and second- and higher-order models for generators; see (Bergen and Hill 1981). For extreme simplicity, here we assume that every node is described by a first-order integrator with the following intuition: node i speeds up (i.e., $\dot\theta_i$ increases) when the power balance at node i is positive, and slows down (i.e., $\dot\theta_i$ decreases) when the power balance at node i is negative. In other words, we assume

$\dot\theta_i = P_i - \sum_{j=1}^n a_{ij} \sin(\theta_i - \theta_j). \qquad (14.2)$

The systems of equations (14.2) are a first-order simplified version of the so-called coupled swing equations.

Note that, when every node is connected to every other node with identical connections of strength $K > 0$, our simplified model of a power network is identical to the so-called Kuramoto oscillators model:

$\dot\theta_i = \omega_i - \frac{K}{n} \sum_{j=1}^n \sin(\theta_i - \theta_j). \qquad (14.3)$

(Here $\omega_i = P_i$ and $a_{ij} = K/n$ for all i, j.)


Let us remark that a more realistic model of a power network would necessarily include higher-order dynamics for the generators, uncertain load models, mixed resistive-inductive lines, and the modelling of reactive power.

14.2.3 Example #3: Flocking, schooling, and vehicle coordination


Consider a set of n particles in the plane $\mathbb{R}^2$, which we identify with the complex plane $\mathbb{C}$. Each particle $i \in \{1, \dots, n\}$ is characterized by its position $r_i \in \mathbb{C}$, its heading angle $\theta_i \in \mathbb{S}^1$, and a steering control law $u_i(r, \theta)$ depending on the positions and headings of itself and the other vehicles; see Figure 14.3(a). For simplicity, we assume that all particles have unit speed. The particle kinematics are then given by

$\dot r_i = e^{\mathrm{i} \theta_i}, \qquad \dot\theta_i = u_i(r, \theta), \qquad (14.4)$

for $i \in \{1, \dots, n\}$. If no control is applied, then particle i travels in a straight line with orientation $\theta_i(0)$, and if $u_i = \omega_0 \in \mathbb{R}$ is a nonzero constant, then particle i traverses a circle with radius $1/|\omega_0|$.

The interaction among the particles is modeled by an interaction graph $G = (\{1, \dots, n\}, E, A)$ determined by communication and sensing patterns. As shown by Vicsek et al. (1995), interesting motion patterns emerge if the controllers use only relative phase information between neighboring particles. As discussed in the previous chapter, we may adopt potential-function-based gradient control strategies (i.e., negative gradient flows) to coordinate the relative heading angles $\theta_i(t) - \theta_j(t)$. As shown in Example #1, an intuitive extension of the quadratic Hookean spring potential to the circle is the function $U_{ij} : \mathbb{S}^1 \times \mathbb{S}^1 \to \mathbb{R}$ defined by

$U_{ij}(\theta_i, \theta_j) = a_{ij} \big( 1 - \cos(\theta_i - \theta_j) \big),$

for each edge $\{i,j\} \in E$. Notice that the potential $U_{ij}(\theta_i, \theta_j)$ achieves its unique minimum if the heading angles $\theta_i$ and $\theta_j$ are synchronized, and it achieves its maximum when $\theta_i$ and $\theta_j$ are out of phase by an angle $\pi$.

Lectures on Network Systems, F. Bullo, Version v0.85(i) (6 Aug 2016). Draft not for circulation. Copyright 2012-16.
14.3. Coupled phase oscillator networks 201

These considerations motivate the gradient-based control strategy

$\dot\theta_i = \omega_0 - K \sum_{\{i,j\} \in E} \frac{\partial}{\partial \theta_i} U_{ij}(\theta_i, \theta_j) = \omega_0 - K \sum_{j=1}^n a_{ij} \sin(\theta_i - \theta_j), \qquad i \in \{1, \dots, n\}, \qquad (14.5)$

to synchronize the heading angles of the particles for $K > 0$ (gradient descent), respectively to disperse the heading angles for $K < 0$ (gradient ascent). The term $\omega_0$ can induce additional rotations (for $\omega_0 \neq 0$) or translations (for $\omega_0 = 0$). A few representative trajectories are illustrated in Figure 14.3.

The controlled phase dynamics (14.5) give rise to elegant and useful coordination patterns that mimic animal flocking behavior and fish schools. Inspired by these biological phenomena, scientists have studied the controlled phase dynamics (14.5) and their variations in the context of tracking and formation controllers in swarms of autonomous vehicles (Paley et al. 2007).
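The following Python sketch integrates the particle kinematics (14.4) under the steering law (14.5) for a complete graph; the parameters, forward-Euler discretization, and random initialization are illustrative assumptions of the sketch:

```python
import numpy as np

n, K, w0 = 6, 1.0, 0.0          # K > 0: heading synchronization; w0 = 0: straight motion
A = (np.ones((n, n)) - np.eye(n)) / n            # complete interaction graph, a_ij = 1/n
rng = np.random.default_rng(2)
theta = rng.uniform(-np.pi, np.pi, n)
r = np.zeros(n, dtype=complex)                   # positions in the complex plane

dt = 1e-2
for _ in range(5_000):
    r = r + dt * np.exp(1j * theta)              # unit-speed kinematics (14.4)
    coupling = (A * np.sin(theta[:, None] - theta[None, :])).sum(axis=1)
    theta = theta + dt * (w0 - K * coupling)     # steering law (14.5)

print(np.round(theta, 3))       # headings typically (nearly) synchronized for K > 0
```

Flipping the sign of K disperses the headings, and setting $\omega_0 \neq 0$ superimposes a common rotation, reproducing qualitatively the behaviors shown in Figure 14.3.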

Figure 14.3: Panel (a) illustrates the particle kinematics (14.4). Panels (b)-(e) illustrate the controlled dynamics (14.4)-(14.5) with n = 6 particles, a complete interaction graph, and identical and constant natural frequencies: $\omega_0(t) = 0$ in panels (b) and (c) and $\omega_0(t) = 1$ in panels (d) and (e). The values of K are K = +1 in panels (b) and (d) and K = −1 in panels (c) and (e). The arrows depict the orientations, the dashed curves show the long-term position dynamics, and the solid curves show the initial transient position dynamics. As illustrated, the resulting motion displays synchronized or dispersed heading angles for K = ±1, and translational motion for $\omega_0 = 0$, respectively circular motion for $\omega_0 = 1$.

14.3 Coupled phase oscillator networks


Given a connected, weighted, and undirected graph $G = (\{1, \dots, n\}, E, A)$, consider the coupled oscillator model

$\dot\theta_i = \omega_i - \sum_{j=1}^n a_{ij} \sin(\theta_i - \theta_j), \qquad i \in \{1, \dots, n\}. \qquad (14.6)$

A special case of the coupled oscillator model (14.6) is the so-called Kuramoto model (Kuramoto 1975) with a complete homogeneous network (i.e., with identical edge weights $a_{ij} = K/n$):

$\dot\theta_i = \omega_i - \frac{K}{n} \sum_{j=1}^n \sin(\theta_i - \theta_j), \qquad i \in \{1, \dots, n\}. \qquad (14.7)$

14.3.1 The geometry of the circle and the torus


Parametrization The unit circle is $\mathbb{S}^1$. The torus $\mathbb{T}^n$ is the set consisting of n copies of the circle. We parametrize the circle $\mathbb{S}^1$ by assuming that (i) angles are measured counterclockwise, (ii) the 0 angle is the intersection of the unit circle with the positive horizontal axis, and (iii) angles take value in $[-\pi, \pi[$.


Geodesic distance The clockwise arc-length from $\theta_i$ to $\theta_j$ is the length of the clockwise arc from $\theta_i$ to $\theta_j$. The counterclockwise arc-length is defined analogously. The geodesic distance between $\theta_i$ and $\theta_j$ is the minimum between clockwise and counterclockwise arc-lengths and is denoted by $|\theta_i - \theta_j|$. In the parametrization:

$\mathrm{dist}_{\mathrm{cc}}(\theta_1, \theta_2) = \mathrm{mod}(\theta_2 - \theta_1,\, 2\pi), \qquad \mathrm{dist}_{\mathrm{c}}(\theta_1, \theta_2) = \mathrm{mod}(\theta_1 - \theta_2,\, 2\pi),$

$|\theta_1 - \theta_2| = \min\{\mathrm{dist}_{\mathrm{c}}(\theta_1, \theta_2),\, \mathrm{dist}_{\mathrm{cc}}(\theta_1, \theta_2)\}.$
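In code, these three quantities follow directly from the parametrization; the following Python helpers are a minimal sketch:

```python
import numpy as np

def dist_cc(t1, t2):
    """Counterclockwise arc-length from t1 to t2."""
    return np.mod(t2 - t1, 2 * np.pi)

def dist_c(t1, t2):
    """Clockwise arc-length from t1 to t2."""
    return np.mod(t1 - t2, 2 * np.pi)

def geodesic(t1, t2):
    """Geodesic distance |t1 - t2| on the circle."""
    return min(dist_c(t1, t2), dist_cc(t1, t2))

print(geodesic(3.0, -3.0))    # ~0.283, not 6.0: the short way around the circle
```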

Rotations Given an angle $s \in [-\pi, \pi[$, the rotation of the n-tuple $\theta = (\theta_1, \dots, \theta_n) \in \mathbb{T}^n$ by s, denoted by $\mathrm{rot}_s(\theta)$, is the counterclockwise rotation of each entry of $(\theta_1, \dots, \theta_n)$ by s. For $\theta \in \mathbb{T}^n$, we also define its rotation set to be

$[\theta] = \{\mathrm{rot}_s(\theta) \in \mathbb{T}^n \mid s \in [-\pi, \pi[\}.$

The coupled oscillator model (14.6) is invariant under rotations, that is, given a solution $\theta : \mathbb{R}_{\geq 0} \to \mathbb{T}^n$ to the coupled oscillator model, the rotation $\mathrm{rot}_s(\theta(t))$ by any angle s is again a solution.

Arc subsets of the n-torus Given a length $\gamma \in [0, 2\pi[$, the arc subset $\bar\Gamma_{\mathrm{arc}}(\gamma) \subset \mathbb{T}^n$ is the set of n-tuples $(\theta_1, \dots, \theta_n)$ such that there exists an arc of length $\gamma$ containing all $\theta_1, \dots, \theta_n$. The set $\Gamma_{\mathrm{arc}}(\gamma)$ is the interior of $\bar\Gamma_{\mathrm{arc}}(\gamma)$. For example, $\theta \in \bar\Gamma_{\mathrm{arc}}(\pi)$ implies that all angles $\theta_1, \dots, \theta_n$ belong to a closed half circle. Note:

(i) If $(\theta_1, \dots, \theta_n) \in \bar\Gamma_{\mathrm{arc}}(\gamma)$, then $|\theta_i - \theta_j| \leq \gamma$ for all i and j. The converse is not true in general. For example, $\{\theta \in \mathbb{T}^n \mid |\theta_i - \theta_j| \leq \pi \text{ for all } i, j\}$ is equal to the entire $\mathbb{T}^n$. However, the converse statement is true in the following form (see also Exercise E14.2): if $|\theta_i - \theta_j| \leq \gamma$ for all i and j and $\gamma < 2\pi/3$, then $(\theta_1, \dots, \theta_n) \in \bar\Gamma_{\mathrm{arc}}(\gamma)$; a numerical sketch for checking arc membership follows below.
(ii) If $\theta = (\theta_1, \dots, \theta_n) \in \Gamma_{\mathrm{arc}}(\pi)$, then $\mathrm{average}(\theta)$ is well posed. (The average of n angles is ill-posed in general. For example, there is no reasonable definition of the average of two diametrically-opposed points.)

14.3.2 Synchronization notions


Consider the following notions of synchronization for a solution $\theta : \mathbb{R}_{\geq 0} \to \mathbb{T}^n$:

Frequency synchrony: A solution $\theta : \mathbb{R}_{\geq 0} \to \mathbb{T}^n$ is frequency synchronized if $\dot\theta_i(t) = \dot\theta_j(t)$ for all time t and for all i and j.

Phase synchrony: A solution $\theta : \mathbb{R}_{\geq 0} \to \mathbb{T}^n$ is phase synchronized if $\theta_i(t) = \theta_j(t)$ for all time t and for all i and j.

Phase cohesiveness: A solution $\theta : \mathbb{R}_{\geq 0} \to \mathbb{T}^n$ is phase cohesive with respect to $\gamma > 0$ if one of the following conditions holds for all time t:

(i) $\theta(t) \in \bar\Gamma_{\mathrm{arc}}(\gamma)$;
(ii) $|\theta_i(t) - \theta_j(t)| \leq \gamma$ for all edges $\{i,j\}$ of a graph of interest; or
(iii) $\sqrt{\sum_{i,j=1}^n |\theta_i(t) - \theta_j(t)|^2 / 2} < \gamma$.

Asymptotic notions: We will also talk about solutions that asymptotically achieve certain synchronization properties. For example, a solution $\theta : \mathbb{R}_{\geq 0} \to \mathbb{T}^n$ achieves phase synchronization if $\lim_{t \to \infty} |\theta_i(t) - \theta_j(t)| = 0$ for all i and j. Analogous definitions can be given for asymptotic frequency synchronization and asymptotic phase cohesiveness.

Finally, notice that phase synchrony is the extreme case of all phase cohesiveness notions with $\gamma = 0$.

14.3.3 Preliminary results


We have the following result on the synchronization frequency.

Lemma 14.1 (Synchronization frequency). If a solution of the coupled oscillator model (14.6) achieves frequency synchronization, then it does so with a constant synchronization frequency equal to

$\omega_{\mathrm{sync}} \triangleq \frac{1}{n} \sum_{i=1}^n \omega_i = \mathrm{average}(\omega).$

Proof. This fact is obtained by summing all equations (14.6) for $i \in \{1, \dots, n\}$: the antisymmetric coupling terms cancel pairwise, so that $\sum_i \dot\theta_i = \sum_i \omega_i$.

Lemma 14.1 implies that, by expressing each angle with respect to a rotating frame with frequency $\omega_{\mathrm{sync}}$ and by replacing $\omega_i$ by $\omega_i - \omega_{\mathrm{sync}}$, we obtain $\omega_{\mathrm{sync}} = 0$ or, equivalently, $\omega \perp 1_n$. In this rotating frame a frequency-synchronized solution is an equilibrium. Due to the rotational invariance of the coupled oscillator model (14.6), it follows that if $\theta^* \in \mathbb{T}^n$ is an equilibrium point, then every point in the rotation set

$[\theta^*] = \{\bar\theta \in \mathbb{T}^n \mid \bar\theta = \mathrm{rot}_s(\theta^*),\; s \in [-\pi, \pi[\}$

is also an equilibrium. Notice that the set $[\theta^*]$ is a connected circle in $\mathbb{T}^n$, and we refer to it as an equilibrium set. See Figure 14.4 for an illustration of the two-dimensional case.

Figure 14.4: Illustration of the state space $\mathbb{T}^2$: the equilibrium set $[\theta^*]$ associated to a phase-synchronized equilibrium (dotted blue line), the (meshed red) phase cohesive set $\{|\theta_1 - \theta_2| < \pi/2\}$, and the tangent space at $\theta^*$ with translation vector $1_2$ arising from the rotational symmetry.

We have the following important result on local stability properties of equilibria.

Lemma 14.2 (Linearization). Assume the frequencies satisfy $\omega \perp 1_n$ and G is connected with incidence matrix B. The following statements hold:

(i) Jacobian: the Jacobian of the coupled oscillator model (14.6) at $\theta \in \mathbb{T}^n$ is

$J(\theta) = -B\, \mathrm{diag}\big(\{a_{ij} \cos(\theta_i - \theta_j)\}_{\{i,j\} \in E}\big)\, B^\top;$

(ii) Local stability: if there exists an equilibrium $\theta^*$ such that $|\theta_i^* - \theta_j^*| < \pi/2$ for all $\{i,j\} \in E$, then
a) $-J(\theta^*)$ is a Laplacian matrix; and
b) the equilibrium set $[\theta^*]$ is locally exponentially stable.

Proof. We start with statements (i) and (ii)a. Given $\theta \in \mathbb{T}^n$, we define the undirected graph $G_{\mathrm{cosine}}(\theta)$ with the same nodes and edges as G and with edge weights $a_{ij} \cos(\theta_i - \theta_j)$. Next, we compute

$\frac{\partial}{\partial \theta_i}\Big( \omega_i - \sum_{j=1}^n a_{ij} \sin(\theta_i - \theta_j) \Big) = -\sum_{j=1}^n a_{ij} \cos(\theta_i - \theta_j),$

$\frac{\partial}{\partial \theta_j}\Big( \omega_i - \sum_{k=1}^n a_{ik} \sin(\theta_i - \theta_k) \Big) = a_{ij} \cos(\theta_i - \theta_j).$

Therefore, the Jacobian is equal to minus the Laplacian matrix of the (possibly negatively weighted) graph $G_{\mathrm{cosine}}(\theta)$, and statement (i) follows from Lemma 8.1. Regarding statement (ii)a, if $|\theta_i^* - \theta_j^*| < \pi/2$ for all $\{i,j\} \in E$, then $\cos(\theta_i^* - \theta_j^*) > 0$ for all $\{i,j\} \in E$, so that $G_{\mathrm{cosine}}(\theta^*)$ has strictly positive weights and all usual properties of Laplacian matrices hold.

To prove statement (ii)b, notice that $J(\theta^*)$ is negative semidefinite with nullspace $\mathrm{span}(1_n)$ arising from the rotational symmetry; see Figure 14.4. All other eigenvectors are orthogonal to $1_n$ and correspond to negative eigenvalues. We now restrict our analysis to the orthogonal complement of $1_n$: we define a coordinate transformation matrix $Q \in \mathbb{R}^{(n-1) \times n}$ with orthonormal rows orthogonal to $1_n$,

$Q 1_n = 0_{n-1} \quad \text{and} \quad Q Q^\top = I_{n-1},$

and we note that $Q J(\theta^*) Q^\top$ has negative eigenvalues. Therefore, in the original coordinates, all directions transversal to the zero eigenspace $\mathrm{span}(1_n)$ are exponentially stable. By Theorem 13.6, the corresponding equilibrium set $[\theta^*]$ is locally exponentially stable.

Corollary 14.3 (Frequency synchronization). If a solution of the coupled oscillator model (14.6) satisfies the phase cohesiveness property $|\theta_i(t) - \theta_j(t)| \leq \gamma$ for some $\gamma \in [0, \pi/2[$ and for all $t \geq 0$, then the coupled oscillator model (14.6) achieves exponential frequency synchronization.

Proof. Let $x_i(t) = \dot\theta_i(t)$ be the frequency of oscillator i. Then $\dot x(t) = J(\theta(t))\, x(t)$ is a time-varying averaging system. The associated undirected graph has time-varying yet strictly positive weights $a_{ij} \cos(\theta_i(t) - \theta_j(t)) \geq a_{ij} \cos(\gamma) > 0$ for each $\{i,j\} \in E$. Hence, the weighted graph is connected for each $t \geq 0$. From the analysis of time-varying averaging systems in Theorem 11.6, the exponential convergence of x(t) to $\mathrm{average}(x(0))\, 1_n$ follows. Equivalently, the frequencies synchronize.
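A small numerical sketch may help illustrate Lemma 14.2(i): when $|\theta_i - \theta_j| < \pi/2$ across every edge, the Jacobian built from the incidence matrix is minus a Laplacian, so $1_n$ lies in its kernel and all other eigenvalues are negative. The ring graph and angles below are illustrative assumptions:

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]            # ring graph (example data)
a = np.ones(len(edges))                              # unit edge weights
n = 4
B = np.zeros((n, len(edges)))                        # incidence matrix
for k, (i, j) in enumerate(edges):
    B[i, k], B[j, k] = 1.0, -1.0

theta = np.array([0.1, -0.2, 0.3, 0.0])
w = a * np.cos(B.T @ theta)                          # weights a_ij cos(theta_i - theta_j)
J = -B @ np.diag(w) @ B.T                            # Jacobian of (14.6)

print(np.allclose(J @ np.ones(n), 0))                # True: 1_n is in the kernel
print(np.linalg.eigvalsh(J))                         # one zero eigenvalue, rest negative
```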

14.3.4 The order parameter and the mean field model


An alternative synchronization measure (besides phase cohesiveness) is the magnitude of the order parameter

$r e^{\mathrm{i} \psi} = \frac{1}{n} \sum_{j=1}^n e^{\mathrm{i} \theta_j}. \qquad (14.8)$

The order parameter (14.8) is the centroid of all oscillators represented as points on the unit circle in $\mathbb{C}$. The magnitude r of the order parameter is a synchronization measure:

• if the oscillators are phase-synchronized, then r = 1;
• if the oscillators are spaced equally on the unit circle, then r = 0; and
• for $r \in\, ]0, 1[$ and oscillators contained in a semi-circle, the associated configuration of oscillators satisfies a certain level of phase cohesiveness; see Exercise E14.3.

By means of the order parameter $r e^{\mathrm{i} \psi}$, the all-to-all Kuramoto model (14.7) can be rewritten in the insightful form

$\dot\theta_i = \omega_i - K r \sin(\theta_i - \psi), \qquad i \in \{1, \dots, n\}. \qquad (14.9)$

(We ask the reader to establish this identity in Exercise E14.4.) Equation (14.9) gives the intuition that the oscillators synchronize because of their coupling to a mean field represented by the order parameter $r e^{\mathrm{i} \psi}$, which itself is a function of $\theta(t)$. Intuitively, for small coupling strength K each oscillator rotates with its distinct natural frequency $\omega_i$, whereas for large coupling strength K all angles $\theta_i(t)$ will entrain to the mean field $r e^{\mathrm{i} \psi}$, and the oscillators synchronize. The transition from incoherence to synchrony occurs at a critical threshold value of the coupling strength, denoted by $K_{\mathrm{critical}}$.
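The identity behind (14.9) is easy to verify numerically. The following Python sketch (random data for illustration) computes the order parameter (14.8) and checks that the right-hand sides of (14.7) and (14.9) agree:

```python
import numpy as np

rng = np.random.default_rng(3)
n, K = 8, 2.0
theta = rng.uniform(-np.pi, np.pi, n)
omega = rng.uniform(-1.0, 1.0, n)

z = np.mean(np.exp(1j * theta))            # order parameter r e^{i psi}, eq. (14.8)
r, psi = np.abs(z), np.angle(z)

kuramoto = omega - (K / n) * np.sin(theta[:, None] - theta[None, :]).sum(axis=1)
mean_field = omega - K * r * np.sin(theta - psi)
print(np.allclose(kuramoto, mean_field))   # True: (14.7) coincides with (14.9)
```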


14.4 Exercises
E14.1 Simulating coupled oscillators. Simulate in your favorite programming language and software package the coupled Kuramoto oscillators in equation (14.3). Set n = 10 and define a vector $\omega \in \mathbb{R}^{10}$ with entries deterministically uniformly-spaced between −1 and 1. Select random initial phases.
(i) Simulate the resulting differential equations for K = 10 and K = 0.1.
(ii) Find the approximate value of K at which the qualitative behavior of the system changes from asynchrony to synchrony.
Turn in your code, a few printouts (as few as possible), and your written responses.
E14.2 Phase cohesiveness and arc length. Pick $\gamma < 2\pi/3$ and $n \geq 3$. Show the following statement: if $\theta \in \mathbb{T}^n$ satisfies $|\theta_i - \theta_j| \leq \gamma$ for all $i, j \in \{1, \dots, n\}$, then there exists an arc of length $\gamma$ containing all angles, that is, $\theta \in \bar\Gamma_{\mathrm{arc}}(\gamma)$.
E14.3 Order parameter and arc length. Given $n \geq 2$ and $\theta \in \mathbb{T}^n$, the shortest arc length $\gamma(\theta)$ is the length of the shortest arc containing all angles, i.e., the smallest $\gamma(\theta)$ such that $\theta \in \bar\Gamma_{\mathrm{arc}}(\gamma(\theta))$. Given $\theta \in \mathbb{T}^n$, the order parameter is the centroid of $(\theta_1, \dots, \theta_n)$ understood as points on the unit circle in the complex plane $\mathbb{C}$:

$r(\theta)\, e^{\mathrm{i} \psi(\theta)} := \frac{1}{n} \sum_{j=1}^n e^{\mathrm{i} \theta_j},$

where recall $\mathrm{i} = \sqrt{-1}$. Prove the following statements:
(i) if $\gamma(\theta) \in [0, \pi]$, then $r(\theta) \in [\cos(\gamma(\theta)/2), 1]$; and
(ii) if $\theta \in \bar\Gamma_{\mathrm{arc}}(\pi)$, then $\gamma(\theta) \in [2 \arccos(r(\theta)), \pi]$.
The order parameter magnitude r is known to measure synchronization. Show the following statements:
(iii) if all oscillators are phase-synchronized, then r = 1; and
(iv) if all oscillators are spaced equally on the unit circle (the so-called splay state), then r = 0.
E14.4 Order parameter and mean-field dynamics. Show that the Kuramoto model (14.7) is equivalent to the
so-called mean-field model (14.9) with the order parameter r defined in (14.8).
E14.5 Multiplicity of equilibria in the Kuramoto model. A common misconception in the literature is that the Kuramoto model has a unique equilibrium set in the phase cohesive set $\{\theta \in \mathbb{T}^n \mid |\theta_i - \theta_j| < \pi/2 \text{ for all } \{i,j\} \in E\}$. Consider now the example of a Kuramoto oscillator network defined over a symmetric ring graph with identical unit weights and zero natural frequencies. The equilibria are determined by

$0 = \sin(\theta_i - \theta_{i-1}) + \sin(\theta_i - \theta_{i+1}),$

where $i \in \{1, \dots, n\}$ and all indices are evaluated modulo n. Show that for n > 4 there are at least two disjoint equilibrium sets in the phase cohesive set $\{\theta \in \mathbb{T}^n \mid |\theta_i - \theta_j| < \pi/2 \text{ for all } \{i,j\} \in E\}$.

Chapter 15
Networks of Coupled Oscillators

15.1 Synchronization of identical oscillators


We start our discussion with the following insightful lemma.

Lemma 15.1. Consider the coupled oscillator model (14.6). If $\omega_i \neq \omega_j$ for some distinct $i, j \in \{1, \dots, n\}$, then the oscillators cannot achieve phase synchronization.

Proof. We prove the lemma by contraposition. Assume that all oscillators are in phase synchrony, $\theta_i(t) = \theta_j(t)$ for all $t \geq 0$ and all $i, j \in \{1, \dots, n\}$. Then, by equating the dynamics, $\dot\theta_i(t) = \dot\theta_j(t)$, it follows necessarily that $\omega_i = \omega_j$.

Motivated by Lemma 15.1, we consider oscillators with identical natural frequencies, $\omega_i = \omega \in \mathbb{R}$ for all $i \in \{1, \dots, n\}$. By working in a rotating frame with frequency $\omega$, we have $\omega = 0$. Thus, we consider the model

$\dot\theta_i = -\sum_{j=1}^n a_{ij} \sin(\theta_i - \theta_j), \qquad i \in \{1, \dots, n\}. \qquad (15.1)$

Notice that phase synchronization is an equilibrium of this model. Conversely, phase synchronization cannot be an equilibrium of the original coupled oscillator model (14.6) if $\omega_i \neq \omega_j$ for some i and j.

15.1.1 An averaging-based approach


Let us first analyze the coupled oscillator model (15.1) with initial conditions restricted to an open semi-circle, $\theta(0) \in \Gamma_{\mathrm{arc}}(\gamma)$ for some $\gamma \in [0, \pi[$. In this case, the oscillators remain in a semi-circle at least for small times $t > 0$, and (after rotating coordinates so that the arc is centered at the origin) the two coordinate transformations

$x_i(t) = \tan(\theta_i(t)) \;(\text{with } x_i \in \mathbb{R}), \quad \text{and} \quad y_i(t) = \theta_i(t) \;(\text{with } y_i \in \mathbb{R})$

are well-defined and bijective (at least for small times).

In the $x_i$-coordinates, the coupled oscillator model reads as the time-varying continuous-time averaging system

$\dot x_i(t) = -\sum_{j=1}^n b_{ij}(t)\, \big( x_i(t) - x_j(t) \big), \qquad (15.2)$

where $b_{ij}(t) = a_{ij} \sqrt{(1 + x_i(t)^2)/(1 + x_j(t)^2)}$ and $b_{ij}(t) \geq a_{ij} \cos(\gamma/2)$; see Exercise E15.3 for a derivation. Similarly, in the $y_i$-coordinates, the coupled oscillator model reads as

$\dot y_i(t) = -\sum_{j=1}^n c_{ij}(t)\, \big( y_i(t) - y_j(t) \big), \qquad (15.3)$

where $c_{ij}(t) = a_{ij}\, \mathrm{sinc}(y_i(t) - y_j(t))$ and $c_{ij}(t) \geq a_{ij}\, \mathrm{sinc}(\gamma)$. Notice that both averaging formulations (15.2) and (15.3) are well-defined as long as the oscillators remain in a semi-circle $\Gamma_{\mathrm{arc}}(\gamma)$ for some $\gamma \in [0, \pi[$.

Theorem 15.2 (Phase cohesiveness and synchronization in an open semicircle). Consider the coupled oscillator model (15.1) with a connected, undirected, and weighted graph $G = (\{1, \dots, n\}, E, A)$. The following statements hold:

(i) phase cohesiveness: for each $\gamma \in [0, \pi[$, each solution originating in $\Gamma_{\mathrm{arc}}(\gamma)$ remains in $\Gamma_{\mathrm{arc}}(\gamma)$ for all times;
(ii) asymptotic phase synchronization: each trajectory originating in $\Gamma_{\mathrm{arc}}(\gamma)$ for $\gamma \in [0, \pi[$ achieves exponential phase synchronization, that is,

$\|\theta(t) - \mathrm{average}(\theta(0))\, 1_n\|_2 \leq \|\theta(0) - \mathrm{average}(\theta(0))\, 1_n\|_2\; e^{-\lambda_{\mathrm{ps}} t}, \qquad (15.4)$

where $\lambda_{\mathrm{ps}} = \lambda_2(L) \cos(\gamma/2)$.

Proof. Consider the averaging formulations (15.2) and (15.3) with initial conditions $\theta(0) \in \Gamma_{\mathrm{arc}}(\gamma)$ for some $\gamma \in [0, \pi[$. By continuity, for small positive times $t > 0$, the oscillators remain in a semi-circle, the time-varying weights $b_{ij}(t) \geq a_{ij} \cos(\gamma/2)$ and $c_{ij}(t) \geq a_{ij}\, \mathrm{sinc}(\gamma)$ are strictly positive for each $\{i,j\} \in E$, and the associated time-dependent graph is connected. As one establishes in the proof of Theorem 11.9, the max-min functions

$V_{\text{max-min}}(x) = \max_{i \in \{1, \dots, n\}} x_i - \min_{i \in \{1, \dots, n\}} x_i, \qquad V_{\text{max-min}}(y) = \max_{i \in \{1, \dots, n\}} y_i - \min_{i \in \{1, \dots, n\}} y_i,$

are strictly decreasing for the time-varying consensus systems (15.2) and (15.3) until consensus is reached. Thus, the oscillators remain in $\Gamma_{\mathrm{arc}}(\gamma)$ and achieve phase synchronization exponentially fast. Since the graph is undirected, we can also conclude convergence to the average phase. Finally, the explicit convergence estimate (15.4) follows, for example, by analyzing (15.2) with the disagreement Lyapunov function and using $b_{ij}(t) \geq a_{ij} \cos(\gamma/2)$.

15.1.2 The potential landscape, convergence and phase synchronization


The consensus analysis in Theorem 15.2 leads to a powerful result but is inherently restricted to a semi-circle. To overcome this limitation, we use potential functions as an analysis tool. Inspired by Examples #1 and #3, define the potential function $U : \mathbb{T}^n \to \mathbb{R}$ by

$U(\theta) = \sum_{\{i,j\} \in E} a_{ij} \big( 1 - \cos(\theta_i - \theta_j) \big). \qquad (15.5)$

Then the coupled oscillator model (14.6) (with all $\omega_i = 0$) can be formulated as the gradient flow

$\dot\theta = -\Big( \frac{\partial U(\theta)}{\partial \theta} \Big)^\top. \qquad (15.6)$

Among the many critical points of the potential function (15.5), the set of phase-synchronized angles is the global minimum of the potential function (15.5). This can be easily seen since each summand in (15.5) is bounded in $[0, 2 a_{ij}]$ and the lower bound is reached only if neighboring oscillators are phase-synchronized. This global minimum is locally exponentially stable.

Theorem 15.3 (Phase synchronization). Consider the coupled oscillator model (15.1) with a connected, undirected, and weighted graph $G = (\{1, \dots, n\}, E, A)$. Then

(i) Global convergence: for all initial conditions $\theta(0) \in \mathbb{T}^n$, the phases $\theta_i(t)$ converge to the set of critical points $\{\theta \in \mathbb{T}^n \mid \partial U(\theta)/\partial \theta = 0_n^\top\}$; and
(ii) Local stability: phase synchronization is a locally exponentially stable equilibrium set.

Proof. The derivative of the potential function $U(\theta)$ along trajectories of (15.6) is

$\dot U(\theta) = -\Big\| \frac{\partial U(\theta)}{\partial \theta} \Big\|_2^2.$

Since the potential function and its derivative are smooth and the dynamics evolve in a compact forward invariant set (the torus $\mathbb{T}^n$), we can apply the Invariance Principle in Theorem 13.4 to arrive at statement (i). Statement (ii) follows from the Jacobian result in Lemma 14.2 and Theorem 13.6.

Theorem 15.3 together with Theorem 15.2 gives a fairly complete picture of the convergence and phase synchronization properties of the coupled oscillator model (15.1).

According to Theorem 15.3, phase synchronization is only locally stable. A stronger result can be obtained in the case of an all-to-all homogeneous coupling graph, that is, for the Kuramoto model (14.7).

Corollary 15.4 (Almost global phase synchronization for the Kuramoto model). Consider the Kuramoto model (14.7) with identical natural frequencies $\omega_i = \omega_j$ for all $i, j \in \{1, \dots, n\}$. Then, for almost all initial conditions in $\mathbb{T}^n$, the oscillators achieve phase synchronization.

Proof. For identical natural frequencies, the Kuramoto model (14.7) can be put in rotating coordinates so that $\omega_i = 0$ for all $i \in \{1, \dots, n\}$; see Section 15.1. The Kuramoto model reads in the order-parameter formulation (14.9) as

$\dot\theta_i = -K r \sin(\theta_i - \psi), \qquad i \in \{1, \dots, n\}. \qquad (15.7)$

The associated potential function reads as (see Exercise E15.1)

$U(\theta) = \sum_{\{i,j\} \in E} a_{ij} \big( 1 - \cos(\theta_i - \theta_j) \big) = \frac{Kn}{2}\, (1 - r^2), \qquad (15.8)$

and its unique global minimum is obtained for r = 1, that is, in the phase-synchronized state. By Theorem 15.3, all angles converge to the set of equilibria, which are, from (15.7), either (i) r = 0, or (ii) r > 0 with all oscillators in-phase with the order parameter, $\theta_i = \psi$, or (iii) r > 0 with oscillators out-of-phase with the order parameter, $\theta_i = \psi + k\pi$ for $k \in \mathbb{Z} \setminus \{0\}$. In the latter case, any infinitesimal deviation from an out-of-phase equilibrium causes the potential (15.8) to decrease, that is, the out-of-phase equilibria are unstable. Likewise, the equilibria with r = 0 correspond to the global maxima of the potential (15.8), and any infinitesimal deviation from these equilibria causes the potential (15.8) to decrease. It follows that, from almost all initial conditions¹, the oscillators converge to the phase-synchronized equilibria $\theta_i = \psi$ for all $i \in \{1, \dots, n\}$.

15.1.3 Phase balancing


Applications in neuroscience, vehicle coordination, and central pattern generators for robotic locomotion motivate the study of coherent behaviors with synchronized frequencies where the phases are not synchronized, but rather dispersed in appropriate patterns. While the phase-synchronized state can be characterized by the order parameter r achieving its maximal (unit) magnitude, we say that a solution $\theta : \mathbb{R}_{\geq 0} \to \mathbb{T}^n$ to the coupled oscillator model (14.6) achieves phase balancing if all phases $\theta_i$ asymptotically converge to the set

$\Big\{\, \theta \in \mathbb{T}^n \;\Big|\; r(\theta) = \Big| \frac{1}{n} \sum_{j=1}^n e^{\mathrm{i} \theta_j} \Big| = 0 \,\Big\},$

that is, asymptotically the oscillators are distributed over the unit circle $\mathbb{S}^1$ in such a way that their centroid converges to the origin.

For a complete homogeneous graph with coupling strength $a_{ij} = K/n$, i.e., for the Kuramoto model (14.7), we have a remarkable identity between the magnitude of the order parameter r and the potential function $U(\theta)$:

$U(\theta) = \frac{Kn}{2}\, (1 - r^2). \qquad (15.9)$

(We ask the reader to establish this identity in Exercise E15.1.) For the complete graph, the correspondence (15.9) shows that the global minimum of the potential function, $U(\theta) = 0$ (for r = 1), corresponds to phase synchronization and the global maximum, $U(\theta) = Kn/2$ (for r = 0), corresponds to phase balancing. This motivates the following gradient ascent dynamics to reach phase balancing:

$\dot\theta = +\Big( \frac{\partial U(\theta)}{\partial \theta} \Big)^\top, \quad \text{that is,} \quad \dot\theta_i = \sum_{j=1}^n a_{ij} \sin(\theta_i - \theta_j). \qquad (15.10)$

¹To be precise, further analysis is needed. A linearization of the Kuramoto model (15.7) at the unstable out-of-phase equilibria yields that these are exponentially unstable. The region of attraction (the so-called stable manifold) of such exponentially unstable equilibria is known to be a zero measure set (Potrie and Monzón 2009, Proposition 4.1).

Theorem 15.5 (Phase balancing). Consider the coupled oscillator model (15.10) with a connected, undirected, and weighted graph $G = (\{1, \dots, n\}, E, A)$. Then

(i) Global convergence: for all initial conditions $\theta(0) \in \mathbb{T}^n$, the phases $\theta_i(t)$ converge to the set of critical points $\{\theta \in \mathbb{T}^n \mid \partial U(\theta)/\partial \theta = 0_n^\top\}$; and
(ii) Local stability: for a complete graph with uniform weights $a_{ij} = K/n$, phase balancing is the global maximizer of the potential function (15.9) and is a locally asymptotically stable equilibrium set.

Proof. The proof of statement (i) is analogous to the proof of statement (i) in Theorem 15.3.

To prove statement (ii), notice that, for a complete graph, the phase balanced set characterized by r = 0 achieves the global maximum of the potential $U(\theta) = \frac{Kn}{2}(1 - r^2)$. By Theorem 13.7, local maxima of the potential are locally asymptotically stable for the gradient ascent dynamics (15.10).
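As a minimal numerical sketch of Theorem 15.5(ii) (complete graph, illustrative parameters, forward-Euler discretization assumed), the gradient ascent dynamics (15.10) typically drive the order parameter magnitude toward zero:

```python
import numpy as np

n, K = 5, 1.0
rng = np.random.default_rng(4)
theta = rng.uniform(-np.pi, np.pi, n)

dt = 1e-2
for _ in range(20_000):        # gradient ascent dynamics (15.10) with a_ij = K/n
    theta = theta + dt * (K / n) * np.sin(theta[:, None] - theta[None, :]).sum(axis=1)

r = np.abs(np.mean(np.exp(1j * theta)))
print(r)                        # typically close to 0: a phase-balanced (splay) state
```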

15.2 Synchronization of heterogeneous oscillators


In this section we analyze non-identical oscillators with i 6= j . As shown in Lemma 15.1, these oscillator
networks cannot achieve phase synchronization. On the other hand frequency synchronization with a
certain degree of phase cohesiveness can be achieved provided that the natural frequencies satisfy certain
bounds relative to the network coupling. We start off with the following necessary conditions.

Lemma 15.6. Necessary synchronization condition Consider the coupled Pn oscillator model (14.6) with graph
G = ({1, . . . , n}, E, A), frequencies 1n , and nodal degree deg i = j=1 aij for each node i {1, . . . , n}.
If there exists a frequency-synchronized solution satisfying the phase cohesiveness |i j | for all {i, j} E
and for some [0, /2], then the following conditions hold:

(i) Absolute bound: For each node i {1, . . . , n},

degi sin() |i | . (15.11)

(ii) Incremental bound: For distinct i, j {1, . . . , n},

(degi + degj ) sin() |i j | . (15.12)

Proof. Statement (i) follows directly from the fact that synchronized solutions must satisfy the equilibrium
equation i = 0.
PSince the sinusoidal interaction terms in equation (14.6) are upper bounded by the nodal
degree degi = nj=1 aij , condition (15.11) is necessary for the existence of an equilibrium.
Statement (ii) follows from the fact that frequency-synchronized solutions must satisfy i j = 0. By
analogous arguments, we arrive at the necessary condition (15.12). 

15.2.1 Synchronization of heterogeneous oscillators over complete homogeneous graphs


Consider the Kuramoto model over a complete homogeneous graph:

$\dot\theta_i = \omega_i - \frac{K}{n} \sum_{j=1}^n \sin(\theta_i - \theta_j), \qquad i \in \{1, \dots, n\}. \qquad (15.13)$

As discussed in Subsection 14.3.4, the Kuramoto model synchronizes provided that the coupling gain K is larger than some critical value $K_{\mathrm{critical}}$. The necessary condition (15.12) delivers a lower bound for $K_{\mathrm{critical}}$ given by

$K \geq \frac{n}{2(n-1)} \Big( \max_i \omega_i - \min_i \omega_i \Big).$

Here we evaluated the left-hand side of (15.12) for $a_{ij} = K/n$, for the maximum $\gamma = \pi/2$, and for all distinct $i, j \in \{1, \dots, n\}$. Perhaps surprisingly, this necessary lower bound is only a factor 1/2 away from the sufficient bound in the next theorem.

Theorem 15.7 (Synchronization test for the all-to-all Kuramoto model). Consider the Kuramoto model (15.13) with natural frequencies $\omega \perp 1_n$ and coupling strength K. Assume

$K > K_{\mathrm{critical}} \triangleq \max_i \omega_i - \min_i \omega_i, \qquad (15.14)$

and define the arc lengths $\gamma_{\min} \in [0, \pi/2[$ and $\gamma_{\max} \in\, ]\pi/2, \pi]$ as the unique solutions to $\sin(\gamma_{\min}) = \sin(\gamma_{\max}) = K_{\mathrm{critical}}/K$.

[Figure: the two solutions $\gamma_{\min}$ and $\gamma_{\max}$ of $\sin(\gamma) = K_{\mathrm{critical}}/K$ on the graph of the sine function.]

The following statements hold:

(i) phase cohesiveness: each solution starting in $\bar\Gamma_{\mathrm{arc}}(\gamma)$, for $\gamma \in [\gamma_{\min}, \gamma_{\max}]$, remains in $\bar\Gamma_{\mathrm{arc}}(\gamma)$ for all times;
(ii) asymptotic phase cohesiveness: each solution starting in $\bar\Gamma_{\mathrm{arc}}(\gamma_{\max})$ asymptotically reaches the set $\bar\Gamma_{\mathrm{arc}}(\gamma_{\min})$; and
(iii) asymptotic frequency synchronization: each solution starting in $\bar\Gamma_{\mathrm{arc}}(\gamma_{\max})$ achieves frequency synchronization.

Moreover, the following converse statement is true: given an interval $[\omega_{\min}, \omega_{\max}]$, the coupling strength K satisfies $K > \omega_{\max} - \omega_{\min}$ if, for all frequencies $\omega$ supported on $[\omega_{\min}, \omega_{\max}]$ and for the arc length $\gamma_{\max}$ computed as above, the set $\bar\Gamma_{\mathrm{arc}}(\gamma_{\max})$ is positively invariant.

Proof. We start with statement (i). Define the function $W : \mathrm{arc}_n(\pi) \to [0, \pi[$ by
$$W(\theta) = \max\{ |\theta_i - \theta_j| \mid i, j \in \{1,\dots,n\} \}.$$
The arc containing all angles has two boundary points: a counterclockwise maximum and a counterclockwise minimum. If $U_{\max}(\theta)$ (resp. $U_{\min}(\theta)$) denotes the set of indices of the angles $\theta_1, \dots, \theta_n$ that are equal to the counterclockwise maximum (resp. the counterclockwise minimum), then
$$W(\theta) = |\theta_{m'} - \theta_{k'}|, \qquad \text{for all } m' \in U_{\max}(\theta) \text{ and } k' \in U_{\min}(\theta).$$
We now assume $\theta(0) \in \mathrm{arc}_n(\gamma)$, for $\gamma \in [\gamma_{\min}, \gamma_{\max}]$, and aim to show that $\theta(t) \in \mathrm{arc}_n(\gamma)$ for all times $t > 0$. By continuity, $\mathrm{arc}_n(\gamma)$ is positively invariant if and only if $W(\theta(t))$ does not increase at any time $t$ such that $W(\theta(t)) = \gamma$.


In the next equation we compute the maximum possible amount of infinitesimal increase of $W(\theta(t))$ along system (15.13). We do this in a loose way here and refer to (Lin et al. 2007, Lemma 2.2) for a rigorous treatment. The statement is:
$$D^+ W(\theta(t)) := \limsup_{\Delta t \to 0^+} \frac{W(\theta(t+\Delta t)) - W(\theta(t))}{\Delta t} = \dot\theta_m(t) - \dot\theta_k(t),$$
where $m \in U_{\max}(\theta(t))$ and $k \in U_{\min}(\theta(t))$ have the property that $\dot\theta_m(t) = \max\{\dot\theta_{m'}(t) \mid m' \in U_{\max}(\theta(t))\}$ and $\dot\theta_k(t) = \min\{\dot\theta_{k'}(t) \mid k' \in U_{\min}(\theta(t))\}$. In components,
$$D^+ W(\theta(t)) = \omega_m - \omega_k - \frac{K}{n} \sum_{j=1}^n \big( \sin(\theta_m(t) - \theta_j(t)) + \sin(\theta_j(t) - \theta_k(t)) \big).$$
The trigonometric identity $\sin(x) + \sin(y) = 2 \sin\big(\frac{x+y}{2}\big) \cos\big(\frac{x-y}{2}\big)$ leads to
$$D^+ W(\theta(t)) = \omega_m - \omega_k - \frac{K}{n} \sum_{i=1}^n 2 \sin\Big( \frac{\theta_m(t) - \theta_k(t)}{2} \Big) \cos\Big( \frac{\theta_m(t) - \theta_i(t)}{2} - \frac{\theta_i(t) - \theta_k(t)}{2} \Big).$$
Measuring angles counterclockwise and modulo $2\pi$, the equality $W(\theta(t)) = \gamma$ implies $\theta_m(t) - \theta_k(t) = \gamma$, $\theta_m(t) - \theta_i(t) \in [0, \gamma]$, and $\theta_i(t) - \theta_k(t) \in [0, \gamma]$. Moreover,
$$\min_i \cos\Big( \frac{\theta_m - \theta_i}{2} - \frac{\theta_i - \theta_k}{2} \Big) = \cos\Big( \max_i \Big( \frac{\theta_m - \theta_i}{2} - \frac{\theta_i - \theta_k}{2} \Big) \Big) = \cos(\gamma/2),$$
so that
$$D^+ W(\theta(t)) \le \omega_m - \omega_k - \frac{K}{n} \sum_{i=1}^n 2 \sin\Big( \frac{\gamma}{2} \Big) \cos\Big( \frac{\gamma}{2} \Big).$$
Applying the reverse identity $2\sin(x)\cos(y) = \sin(x-y) + \sin(x+y)$, we obtain
$$D^+ W(\theta(t)) \le \omega_m - \omega_k - \frac{K}{n} \sum_{i=1}^n \sin(\gamma) \le \big( \max_i \omega_i - \min_i \omega_i \big) - K \sin(\gamma).$$
Hence, $W(\theta(t))$ does not increase at any $t$ such that $W(\theta(t)) = \gamma$ if $K \sin(\gamma) \ge K_{\mathrm{critical}} = \max_i \omega_i - \min_i \omega_i$.
Given the structure of the level sets of $\gamma \mapsto K \sin(\gamma)$, there exists a non-empty interval of arc lengths $\gamma \in [0, \pi]$ satisfying $K \sin(\gamma) \ge \max_i \omega_i - \min_i \omega_i$ if and only if this inequality holds strictly at $\gamma = \pi/2$, that is, if $K > K_{\mathrm{critical}}$. Additionally, if $K > K_{\mathrm{critical}}$, there exist a unique $\gamma_{\min} \in [0, \pi/2[$ and a unique $\gamma_{\max} \in \,]\pi/2, \pi]$ that satisfy $K \sin(\gamma) = K_{\mathrm{critical}}$ with the equality sign. In summary, for every $\gamma \in [\gamma_{\min}, \gamma_{\max}]$, if $W(\theta(t)) = \gamma$, then the arc-length $W(\theta(t))$ is non-increasing. This concludes the proof of statement (i).
Moreover, pick $\varepsilon \le \gamma_{\max} - \gamma_{\min}$. For all $\gamma \in [\gamma_{\min} + \varepsilon, \gamma_{\max}]$, there exists a positive $\delta(\varepsilon)$ with the property that, if $W(\theta(t)) = \gamma$, then $D^+ W(\theta(t)) \le -\delta(\varepsilon)$. Hence, each solution $\theta : \mathbb{R}_{\ge 0} \to \mathbb{T}^n$ starting in $\mathrm{arc}_n(\gamma_{\max})$ must satisfy $W(\theta(t)) \le \gamma_{\min} + \varepsilon$ after time at most $(\gamma_{\max} - \gamma_{\min})/\delta(\varepsilon)$. This proves statement (ii).


Regarding statement (iii), we just proved that for every $\theta(0) \in \mathrm{arc}_n(\gamma_{\max})$ there exists a finite time $T \ge 0$ such that $\theta(t) \in \mathrm{arc}_n(\gamma)$ for all $t \ge T$ and for some $\gamma < \pi/2$. It follows that $|\theta_i(t) - \theta_j(t)| < \pi/2$ for all $\{i,j\} \in E$ and for all $t \ge T$. We now invoke Corollary 14.3 to conclude the proof of statement (iii).
The converse statement can be established by noticing that all of the above inequalities and estimates are exact for a bipolar distribution of natural frequencies $\omega_i \in \{\omega_{\min}, \omega_{\max}\}$ for all $i \in \{1,\dots,n\}$. The full proof is in (Dörfler and Bullo 2011). ∎
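Theorem 15.7 is easy to explore numerically. The following minimal forward-Euler sketch (assuming NumPy; the values of $n$, $K$, the frequency distribution, and the step size are illustrative choices) checks frequency synchronization and the asymptotic cohesiveness bound $\gamma_{\min}$:

```python
import numpy as np

# Sketch: simulate the all-to-all Kuramoto model (15.13) above the critical
# coupling and verify the conclusions of Theorem 15.7.
rng = np.random.default_rng(0)
n, K, dt, steps = 10, 2.0, 1e-3, 200_000
omega = rng.uniform(-0.5, 0.5, n)
omega -= omega.mean()                        # omega orthogonal to 1_n
K_critical = omega.max() - omega.min()
assert K > K_critical                        # condition (15.14)

theta = rng.uniform(-0.2, 0.2, n)            # initial condition in a small arc
for _ in range(steps):
    coupling = np.sin(theta[:, None] - theta[None, :]).sum(axis=1)
    theta = theta + dt * (omega - (K / n) * coupling)

freq = omega - (K / n) * np.sin(theta[:, None] - theta[None, :]).sum(axis=1)
gamma_min = np.arcsin(K_critical / K)
print(np.ptp(theta) <= gamma_min + 1e-6)     # cohesiveness: inside arc(gamma_min)
print(np.ptp(freq) < 1e-6)                   # frequency synchronization
```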

15.2.2 Synchronization of heterogeneous oscillators over weighted undirected graphs


Consider the coupled oscillator model over a weighted undirected graph:
$$\dot\theta_i = \omega_i - \sum_{j=1}^n a_{ij} \sin(\theta_i - \theta_j), \qquad i \in \{1,\dots,n\}. \tag{15.15}$$

Adopt the following shorthands:
$$\|\theta\|_{2,\mathrm{pairs}} = \sqrt{ \frac{1}{2} \sum_{i,j=1}^n (\theta_i - \theta_j)^2 }, \quad \text{and} \quad \|\omega\|_{2,\mathrm{pairs}} = \sqrt{ \frac{1}{2} \sum_{i,j=1}^n |\omega_i - \omega_j|^2 }.$$

Theorem 15.8 (Synchronization test I). Consider the coupled oscillator model (15.15) with frequencies $\omega \perp \mathbb{1}_n$ defined over a weighted undirected graph with Laplacian matrix $L$. Assume
$$\lambda_2(L) > \lambda_{\mathrm{critical}} := \|\omega\|_{2,\mathrm{pairs}}, \tag{15.16}$$
and define $\gamma_{\max} \in \,]\pi/2, \pi]$ and $\gamma_{\min} \in [0, \pi/2[$ as the solutions to $(\pi/2)\,\mathrm{sinc}(\gamma_{\max}) = \sin(\gamma_{\min}) = \lambda_{\mathrm{critical}}/\lambda_2(L)$. The following statements hold:

(i) phase cohesiveness: each solution starting in $\{\theta \mid \|\theta\|_{2,\mathrm{pairs}} \le \gamma\}$, for $\gamma \in [\gamma_{\min}, \gamma_{\max}]$, remains in $\{\theta \mid \|\theta\|_{2,\mathrm{pairs}} \le \gamma\}$ for all times;
(ii) asymptotic phase cohesiveness: each solution starting in $\{\theta \mid \|\theta\|_{2,\mathrm{pairs}} < \gamma_{\max}\}$ asymptotically reaches the set $\{\theta \mid \|\theta\|_{2,\mathrm{pairs}} \le \gamma_{\min}\}$; and
(iii) asymptotic frequency synchronization: each solution starting in $\{\theta \mid \|\theta\|_{2,\mathrm{pairs}} < \gamma_{\max}\}$ achieves frequency synchronization.

The proof of Theorem 15.8 follows the reasoning of the proof of Theorem 15.7 using the quadratic Lyapunov function $\|\theta\|_{2,\mathrm{pairs}}^2$. The full proof is in (Dörfler and Bullo 2012, Appendix B).

15.2.3 Appendix: alternative theorem


Notice that the parametric condition (15.16) of the above theorem is very conservative, since the left-hand side is at most $n$ (for a complete graph) and the right-hand side is a sum of $n^2$ terms. In the following we partially improve upon this conservativeness. Adopt the following shorthands:
$$\|\theta\|_{2,\mathrm{edges}} = \sqrt{ \sum_{\{i,j\} \in E} (\theta_i - \theta_j)^2 }, \quad \text{and} \quad \|\omega\|_{2,\mathrm{edges}} = \sqrt{ \sum_{\{i,j\} \in E} |\omega_i - \omega_j|^2 }.$$


Theorem 15.9 (Synchronization test II). Consider the coupled oscillator model (15.15) with frequencies $\omega \perp \mathbb{1}_n$ defined over a weighted undirected graph with Laplacian matrix $L$. Assume
$$\lambda_2(L) > \lambda_{\mathrm{critical}} := \|\omega\|_{2,\mathrm{edges}}, \tag{15.17}$$
and define $\gamma_{\min} \in [0, \pi/2[$ as the solution to $\sin(\gamma_{\min}) = \lambda_{\mathrm{critical}}/\lambda_2(L)$. Then there exists a locally exponentially stable equilibrium set $[\theta^*]$ satisfying $|\theta^*_i - \theta^*_j| \le \gamma_{\min}$ for all $\{i,j\} \in E$.

Proof. Lemma 14.2 guarantees local exponential stability of an equilibrium set $[\theta^*]$ satisfying $|\theta^*_i - \theta^*_j| \le \gamma$ for all $\{i,j\} \in E$ and for some $\gamma \in [0, \pi/2[$. In the following we establish conditions for the existence of equilibria in this particular set $\Delta(\gamma) = \{\theta \in \mathbb{T}^n \mid |\theta_i - \theta_j| \le \gamma \text{ for all } \{i,j\} \in E\}$. The equilibrium equations can be written as
$$\omega = L(B^\top \theta)\, \theta, \tag{15.18}$$
where $L(B^\top \theta) = B \,\mathrm{diag}\big(\{a_{ij}\,\mathrm{sinc}(\theta_i - \theta_j)\}_{\{i,j\} \in E}\big)\, B^\top$ is the Laplacian matrix associated with the graph $\tilde G = (\{1,\dots,n\}, E, \tilde A)$ with nonnegative edge weights $\tilde a_{ij} = a_{ij}\,\mathrm{sinc}(\theta_i - \theta_j) \ge a_{ij}\,\mathrm{sinc}(\gamma) > 0$ for $\{i,j\} \in E$ and $\theta \in \Delta(\gamma)$. Since for any weighted Laplacian matrix $L$ we have $L^\dagger L = L L^\dagger = I_n - (1/n)\mathbb{1}_n \mathbb{1}_n^\top$, a multiplication of equation (15.18) from the left by $B^\top L(B^\top \theta)^\dagger$ yields
$$B^\top L(B^\top \theta)^\dagger \omega = B^\top \theta. \tag{15.19}$$
Note that the left-hand side of equation (15.19) is a continuous² function of $\theta$ for $\theta \in \Delta(\gamma)$. Consider the formal substitution $x = B^\top \theta$, the compact and convex set $S_\infty(\gamma) = \{x \in \mathrm{Img}(B^\top) \mid \|x\|_\infty \le \gamma\}$ (corresponding to $\Delta(\gamma)$), and the continuous map $f : S_\infty(\gamma) \to \mathrm{Img}(B^\top)$ given by $f(x) = B^\top L(x)^\dagger \omega$. Then equation (15.19) is equivalent to the fixed-point equation
$$f(x) = x.$$

We invoke Brouwer's Fixed Point Theorem, which states that every continuous map from a compact and convex set to itself has a fixed point; see for instance (Spanier 1994, Section 7, Corollary 8).
Since the analysis of the map $f$ in the $\infty$-norm is very hard in the general case, we resort to a 2-norm analysis and restrict ourselves to the set $S_2(\gamma) = \{x \in \mathrm{Img}(B^\top) \mid \|x\|_2 \le \gamma\} \subseteq S_\infty(\gamma)$. The set $S_2(\gamma)$ corresponds to the set $\{\theta \in \mathbb{T}^n \mid \|\theta\|_{2,\mathrm{edges}} \le \gamma\}$ in $\theta$-coordinates. By Brouwer's Fixed Point Theorem, there exists a solution $x \in S_2(\gamma)$ to the equation $x = f(x)$ if $\|f(x)\|_2 \le \gamma$ for all $x \in S_2(\gamma)$, or equivalently, if
$$\max_{x \in S_2(\gamma)} \big\| B^\top L(x)^\dagger \omega \big\|_2 \le \gamma. \tag{15.20}$$
After some bounding (see (Dörfler and Bullo 2012, Appendix C) for details), we arrive at
$$\max_{x \in S_2(\gamma)} \big\| B^\top L(x)^\dagger \omega \big\|_2 \le \|\omega\|_{2,\mathrm{edges}} \,/\, \big( \lambda_2(L)\,\mathrm{sinc}(\gamma) \big).$$

² The continuity can be established by re-writing equations (15.18) and (15.19) in the quotient space $\mathbb{1}_n^\perp$, where $L(B^\top \theta)$ is nonsingular, and using the fact that the inverse of a nonsingular matrix is a continuous function of its elements. See also (Rakočević 1997, Theorem 4.2) for necessary and sufficient conditions for continuity of the Moore-Penrose inverse, requiring that $L(B^\top \theta)$ has constant rank for $\theta \in \Delta(\gamma)$.


The term on the right-hand side of the above inequality has to be less than or equal to $\gamma$. In summary, we conclude that there is a locally exponentially stable synchronization set $[\theta^*] \subseteq \{\theta \in \mathbb{T}^n \mid \|\theta\|_{2,\mathrm{edges}} \le \gamma\} \cap \Delta(\gamma)$ if
$$\lambda_2(L) \sin(\gamma) \ge \|\omega\|_{2,\mathrm{edges}}. \tag{15.21}$$
Since the left-hand side of (15.21) is a concave function of $\gamma \in [0, \pi/2[$, there exists a non-empty open set of $\gamma \in [0, \pi/2[$ satisfying equation (15.21) if and only if equation (15.21) is true with the strict inequality sign at $\gamma = \pi/2$, which corresponds to condition (15.17). Additionally, if these two equivalent statements are true, then there exists a unique $\gamma_{\min} \in [0, \pi/2[$ that satisfies equation (15.21) with the equality sign, namely $\sin(\gamma_{\min}) = \|\omega\|_{2,\mathrm{edges}}/\lambda_2(L)$. This concludes the proof. ∎
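Both synchronization tests are easy to compare numerically. A minimal sketch (assuming NumPy; the ring graph and the frequency distribution are illustrative choices) computes $\lambda_2(L)$ and the two norms of $\omega$:

```python
import numpy as np

# Sketch: evaluate synchronization tests I (15.16) and II (15.17) on a ring.
n = 8
edges = [(i, (i + 1) % n) for i in range(n)]
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A
lambda2 = np.sort(np.linalg.eigvalsh(L))[1]        # algebraic connectivity

rng = np.random.default_rng(1)
omega = rng.normal(0.0, 0.05, n)
omega -= omega.mean()                              # omega orthogonal to 1_n

norm_pairs = np.sqrt(0.5 * np.sum((omega[:, None] - omega[None, :]) ** 2))
norm_edges = np.sqrt(sum((omega[i] - omega[j]) ** 2 for i, j in edges))
print(lambda2 > norm_pairs)   # test I  (15.16): sums over all n^2 pairs
print(lambda2 > norm_edges)   # test II (15.17): sums only over the |E| edges
```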


15.3 Exercises
E15.1 Potential and order parameter. Recall $U(\theta) = \sum_{\{i,j\} \in E} a_{ij} \big( 1 - \cos(\theta_i - \theta_j) \big)$. Prove that $U(\theta) = \frac{Kn}{2}(1 - r^2)$ for a complete homogeneous graph with coupling strength $a_{ij} = K/n$.
E15.2 Analysis of the two-node case. Present a complete analysis of a system of two coupled oscillators:
$$\dot\theta_1 = \omega_1 - a_{12} \sin(\theta_1 - \theta_2),$$
$$\dot\theta_2 = \omega_2 - a_{21} \sin(\theta_2 - \theta_1),$$
where $a_{12} = a_{21}$ and $\omega_1 + \omega_2 = 0$. When do equilibria exist? What are their stability properties and their basins of attraction?
E15.3 Averaging analysis of coupled oscillators in a semi-circle. Consider the coupled oscillator model (15.1) with $\theta \in \mathrm{arc}_n(\gamma)$ for some $\gamma < \pi$. Show that the coordinate transformation $x_i = \tan(\theta_i)$, with $x_i \in \mathbb{R}$, gives the averaging system (15.2) with $b_{ij} \ge a_{ij} \cos(\gamma/2)$.
E15.4 Phase synchronization in a spring network. Consider the spring network from Example #1 in Subsection 14.2.1 with identical oscillators, no external torques, and a connected, undirected, and weighted graph:
$$M_i \ddot\theta_i + D_i \dot\theta_i + \sum_{j=1}^n a_{ij} \sin(\theta_i - \theta_j) = 0, \qquad i \in \{1,\dots,n\}.$$
Prove the phase synchronization result (in Theorem 15.3) for this spring network.
E15.5 Synchronization on acyclic graphs. For frequencies satisfying $\sum_{i=1}^n \omega_i = 0$, consider the coupled oscillator model
$$\dot\theta_i = \omega_i - \sum_{j=1}^n a_{ij} \sin(\theta_i - \theta_j).$$
Assume the adjacency matrix $A$ with elements $a_{ij} = a_{ji} \in \{0, 1\}$ is associated to an undirected, connected, and acyclic graph. Show that the following statements are equivalent:
(i) there exists a locally stable frequency-synchronized solution in the set $\{\theta \in \mathbb{T}^n \mid |\theta_i - \theta_j| < \pi/2 \text{ for all } \{i,j\} \in E\}$;
(ii) $\|B^\top L^\dagger \omega\|_\infty < 1$, where $B$ and $L$ are the network incidence and Laplacian matrices.
Hint: Follow the derivation in Example 8.12.
E15.6 Distributed averaging-based PI control for coupled oscillators. Consider a set of $n$ controllable coupled oscillators governed by the second-order dynamics
$$\dot\theta_i = \omega_i, \tag{E15.1a}$$
$$M_i \dot\omega_i = -D_i \omega_i - \sum_{j=1}^n a_{ij} \sin(\theta_i - \theta_j) + u_i, \tag{E15.1b}$$
where $i \in \{1,\dots,n\}$ is the index set, each oscillator has the state $(\theta_i, \omega_i) \in \mathbb{T}^1 \times \mathbb{R}$, $u_i \in \mathbb{R}$ is a control input to oscillator $i$, and $M_i > 0$ and $D_i > 0$ are the inertia and damping coefficients. The oscillators are coupled through an undirected, connected, and weighted graph $G = (V, E, A)$ with node set $V = \{1,\dots,n\}$, edge set $E \subseteq V \times V$, and adjacency matrix $A = A^\top \in \mathbb{R}^{n \times n}$. To reject disturbances affecting the oscillators, consider the distributed averaging-based integral controller (see Exercise E6.17)
$$u_i = -q_i, \tag{E15.2a}$$
$$\dot q_i = \omega_i - \sum_{j=1}^n b_{ij} (q_i - q_j), \tag{E15.2b}$$


where qi R is a controller state for each agent i {1, . . . , n}, and the matrix B with elements bij is the
adjacency matrix of an undirected and connected graph. Your tasks are as follows:
(i) characterize the set of equilibria (? , ? , q ? ) of the closed-loop system (E15.1)-(E15.2),
(ii) show that all trajectories converge to the set of equilibria, and
(iii) show that the phase synchronization set { Tn | i = j for all i, j {1, . . . , n}} together with
= q = 0n is an equilibrium and that it is locally asymptotically stable.

Chapter 16
Virus Propagation: Basic Models

In this chapter and the next we present simple models for the diffusion and propagation of infectious
diseases. The proposed models may be relevant also in the context of propagation of information/signals in a
communication network and diffusion of innovations in competitive economic networks. Other interesting
propagation phenomena include failures in power networks and wildfires in forests.
In this chapter and the next, we are interested in (1) models (lumped vs network, deterministic vs
stochastic), (2) asymptotic behaviors (vanishing infection, steady-state epidemic, full contagion), and (3) the
transient propagation of epidemics starting from small initial fractions of infected nodes (possible epidemic
outbreak as opposed to monotonically vanishing infection). In the interest of clarity, we begin with lumped
variables, i.e., variables which represent an entire well-mixed population of nodes. The next chapter will
discuss distributed variable models, i.e., network models. We study three low-dimensional deterministic
models in which nodes may be in one of two or three states; see Figure 16.1.

[Figure omitted: state-transition diagrams of the SI, SIS and SIR models]

Figure 16.1: The three basic models SI, SIS and SIR for the propagation of an infectious disease

We say that an epidemic outbreak takes place if a small initial fraction of infected individuals leads to the contagion of a significant fraction of the population. We say the system displays an epidemic threshold if epidemic outbreaks occur precisely when some combination of parameters and initial conditions exceeds a critical value.

16.1 The SI model


Given a population, let $x(t)$ denote the fraction of infected individuals at time $t \in \mathbb{R}_{\ge 0}$. Similarly, let $s(t)$ denote the fraction of susceptible individuals. Clearly, $x(t) + s(t) = 1$ at all times. We model propagation


via the following first-order differential equation, called the susceptible-infected (SI) model:
$$\dot x(t) = \beta s(t) x(t) = \beta (1 - x(t)) x(t), \tag{16.1}$$
where $\beta > 0$ is the infection rate. We will see distributed and stochastic versions of this model later in the chapter. A simple qualitative analysis of this equation can be performed by plotting $\dot x$ over $x$; see Figure 16.2.

[Figure omitted: plot of $\dot x = \beta(1-x)x$ versus $x$]
Figure 16.2: Phase portrait of the (lumped deterministic) SI model ($\beta = 1$)

Remark 16.1 (Heuristic modeling assumptions and derivation). Over the interval $(t, t + \Delta t)$, pairwise meetings between individuals in the population take place in the following fashion: assume the population has $n$ individuals, pick a meeting rate $\beta_{\mathrm{m}} > 0$, and assume that $n \beta_{\mathrm{m}} \Delta t$ individuals will meet other $n \beta_{\mathrm{m}} \Delta t$ individuals. Assuming meetings involve uniformly-selected individuals, over the interval $(t, t + \Delta t)$ there are $s(t)^2 n \beta_{\mathrm{m}} \Delta t$ meetings between a susceptible and another susceptible individual; these meetings, as well as meetings between infected individuals, result in no epidemic propagation. However, there will also be $s(t) x(t) n \beta_{\mathrm{m}} \Delta t + x(t) s(t) n \beta_{\mathrm{m}} \Delta t$ meetings between a susceptible and an infected individual. We assume a fraction $\beta_{\mathrm{i}} \in [0,1]$, called the transmission rate, of these meetings results in the successful transmission of the infection:
$$\beta_{\mathrm{i}} \big( s(t) x(t) n \beta_{\mathrm{m}} \Delta t + x(t) s(t) n \beta_{\mathrm{m}} \Delta t \big) = 2 \beta_{\mathrm{i}} \beta_{\mathrm{m}} x(t) s(t) n \Delta t.$$
In summary, the fraction of infected individuals satisfies
$$x(t + \Delta t) = x(t) + 2 \beta_{\mathrm{i}} \beta_{\mathrm{m}} x(t) s(t) \Delta t,$$
and the SI model (16.1) is the limit as $\Delta t \to 0^+$, where the infection parameter $\beta$ is twice the product of the meeting rate $\beta_{\mathrm{m}}$ and the infection transmission fraction $\beta_{\mathrm{i}}$.

Lemma 16.2 (Dynamical behavior of the SI model). Consider the SI model (16.1). The solution from initial condition $x(0) = x_0 \in [0,1]$ is
$$x(t) = \frac{x_0 e^{\beta t}}{1 - x_0 + x_0 e^{\beta t}}. \tag{16.2}$$
From all positive initial conditions $0 < x_0 < 1$, the solution $x(t)$ is monotonically increasing and converges to the unique equilibrium $1$ as $t \to \infty$.

It is easy to see that the SI model (16.1) results in an evolution akin to a logistic curve; see Figure 16.3.


[Figure omitted: logistic-like curves $x(t)$ versus $t$]
Figure 16.3: Evolution of the fraction of infected individuals in the (lumped deterministic) SI model ($\beta = 1$), from initial conditions in the range $[.001, .5]$.
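A quick numerical check of Lemma 16.2 (a minimal sketch assuming NumPy; the values of $\beta$, $x_0$, and the step size are illustrative):

```python
import numpy as np

# Sketch: forward-Euler integration of the SI model (16.1) versus the
# closed-form solution (16.2).
beta, x0, dt, T = 1.0, 0.01, 1e-4, 10.0
x = x0
for _ in range(int(T / dt)):
    x += dt * beta * (1 - x) * x
closed_form = x0 * np.exp(beta * T) / (1 - x0 + x0 * np.exp(beta * T))
print(abs(x - closed_form) < 1e-3)   # True: Euler agrees with (16.2)
```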

16.2 The SIR model


Next, we study a model in which individuals recover from the infection and are not susceptible to the epidemics after one round of infection. In other words, we assume the population is divided into three distinct groups: $s(t)$ denotes the fraction of susceptible individuals, $x(t)$ denotes the fraction of infected individuals, and $r(t)$ denotes the fraction of recovered individuals. Clearly, $s(t) + x(t) + r(t) = 1$. We model the recovery process via a constant recovery rate $\gamma$ and write our susceptible-infected-recovered (SIR) model as
$$\dot s(t) = -\beta s(t) x(t),$$
$$\dot x(t) = \beta s(t) x(t) - \gamma x(t), \tag{16.3}$$
$$\dot r(t) = \gamma x(t).$$

Remark 16.3 (Heuristic modeling assumptions and derivation). One can show that the constant recovery rate assumption corresponds to assuming a so-called Poisson recovery process in the stochastic version of this model. This is arguably not a very realistic assumption.

Lemma 16.4 (Dynamical behavior of the SIR model). Consider the SIR model (16.3). From each initial condition $s(0) + x(0) + r(0) = 1$ with $s(0) > 0$, $x(0) > 0$ and $r(0) \ge 0$, the resulting trajectory $t \mapsto (s(t), x(t), r(t))$ has the following properties:

(i) $s(t) > 0$, $x(t) > 0$, $r(t) \ge 0$, and $s(t) + x(t) + r(t) = 1$ for all $t \ge 0$;
(ii) $t \mapsto s(t)$ is monotonically decreasing and $t \mapsto r(t)$ is monotonically increasing;
(iii) $\lim_{t\to\infty} (s(t), x(t), r(t)) = (s_\infty, 0, r_\infty)$, where $r_\infty$ is the unique solution to the equality
$$1 - r_\infty = s(0) \exp\Big( -\frac{\beta}{\gamma} (r_\infty - r(0)) \Big); \tag{16.4}$$
(iv) if $\beta s(0)/\gamma < 1$, then $t \mapsto x(t)$ monotonically and exponentially decreases to zero as $t \to \infty$;
(v) if $\beta s(0)/\gamma > 1$, then $t \mapsto x(t)$ first monotonically increases to a maximum value and then monotonically decreases to zero as $t \to \infty$ (we describe this case as an epidemic outbreak, that is, an exponential growth of $t \mapsto x(t)$ for small times).

[Figure omitted: left, time evolution of $s(t)$, $x(t)$, $r(t)$; right, graphical solution of equation (16.4)]
Figure 16.4: Left figure: evolution of the (lumped deterministic) SIR model from a small initial fraction of infected individuals (and zero recovered); parameters $\beta = 2$, $\gamma = 1/4$ (case (v) in Lemma 16.4). Right figure: intersection between the two curves in equation (16.4) with $s(0) = 0.95$, $r(0) = 0$ and $\beta/\gamma \in \{1/4, 4\}$. If $\beta/\gamma = 1/4$, then $.05 < r_\infty < .1$. If $\beta/\gamma = 4$, then $.95 < r_\infty$.

Proof. We first prove statement (i). We compute $\frac{d}{dt}(s(t) + x(t) + r(t)) = 0$ so that $s(t) + x(t) + r(t) = 1$ for all time $t$. Regarding $s(t)$, because $s(t)$ satisfies a linear differential equation with time-varying coefficients, we know $s(t) = s(0) \exp\big( -\beta \int_0^t x(\tau)\, d\tau \big)$; for more details see the Grönwall-Bellman Comparison Lemma in Exercise E13.1. Therefore, $s(0) > 0$ implies $s(t) > 0$ for all $t$. Similarly, regarding $x(t)$, because $\dot x(t) = (\beta s(t) - \gamma) x(t)$ and $x(0) > 0$, we know $x(t) = x(0) \exp\big( \int_0^t (\beta s(\tau) - \gamma)\, d\tau \big)$ so that $x(t) > 0$ for all $t$. Finally, $\dot r(t) \ge 0$ implies $r$ cannot become negative. This proves statement (i).
Regarding statement (ii), statement (i) implies that the signal $s(t)$ is monotonically decreasing and the signal $r(t)$ is monotonically increasing, because $\dot s(t) = -\beta s(t) x(t) < 0$ and $\dot r(t) = \gamma x(t) > 0$ for all $t$.
We next focus on statement (iii). Because the signal $s(t)$ is monotonically decreasing and lower bounded and the signal $r(t)$ is monotonically increasing and upper bounded, the two limits exist: $\lim_{t\to\infty} s(t) = s_\infty$ and $\lim_{t\to\infty} r(t) = r_\infty$. Moreover, the equality $s(t) + x(t) + r(t) = 1$ implies that the third limit also must exist, that is, $\lim_{t\to\infty} x(t) = x_\infty$. We now claim that $x_\infty = 0$. By contradiction, assume $x_\infty > 0$. But, then, also $\lim_{t\to\infty} \dot r(t) = \gamma x_\infty > 0$ and this contradicts the fact that $r(t) \le 1$.
Next, we prove equation (16.4). If $s(0) = 0$, then clearly $r_\infty = 1$. If instead $s(0) > 0$, then $s(t)$ remains strictly positive for all time $t$, as we showed in statement (i). We now note a useful equality and integrate it from $0$ to $t$:
$$\frac{\dot s(t)}{s(t)} = -\beta x(t) = -\frac{\beta}{\gamma} \dot r(t) \;\Longrightarrow\; \ln\frac{s(t)}{s(0)} = -\frac{\beta}{\gamma}\big( r(t) - r(0) \big) \;\Longrightarrow\; s(t) = s(0) \exp\Big( -\frac{\beta}{\gamma}\big( r(t) - r(0) \big) \Big).$$
Equation (16.4) follows by taking the limit as $t \to \infty$ and noting that for all time $1 = s(t) + x(t) + r(t)$; in particular, $1 = s_\infty + r_\infty$. The uniqueness of the solution $r_\infty$ to equation (16.4) follows from showing there exists a unique intersection between the left- and right-hand sides, as illustrated in Figure 16.4. This concludes the proof of statement (iii).
Regarding statement (iv), note that $s(t)$ being monotonically decreasing and $\beta s(0)/\gamma < 1$ together imply $\beta s(t)/\gamma \le \beta s(0)/\gamma < 1$ for all time $t$. This implies $\dot x(t) = \beta s(t) x(t) - \gamma x(t) \le \gamma\big( \beta s(0)/\gamma - 1 \big) x(t) < 0$. By the Grönwall-Bellman Comparison Lemma in Exercise E13.1, we now know that $x(t) \le y(t)$, where $\dot y = \gamma\big( \beta s(0)/\gamma - 1 \big) y$, so that both $y$ and $x$ decrease exponentially fast to zero. This concludes the proof of statement (iv).
Regarding statement (v), because $\dot x(t) = (\beta s(t) - \gamma) x(t)$ and $x \in [0,1]$, we know the sign of $\dot x(t)$ is equal to the sign of $\beta s(t) - \gamma$. By assumption, we start with $\beta s(0) - \gamma > 0$. From statement (ii), we know $t \mapsto \beta s(t) - \gamma$ is monotonically decreasing. It remains to show that $\beta s(t) - \gamma$ crosses the zero value in finite time. By contradiction, assume $\beta s(t) - \gamma \ge 0$ for all time $t$. Then $\dot x(t) \ge 0$ and hence $x(t) \ge x(0)$ for all time $t$. In turn, this implies that $\dot s(t) \le -\beta x(0) s(t)$ and, via the Grönwall-Bellman Comparison Lemma in Exercise E13.1, that $s(t)$ decreases to zero exponentially fast as time diverges. This is a contradiction and concludes the proof of statement (v). ∎

16.3 The SIS model


As a third and final lumped deterministic model, we study the setting in which individuals recover from the infection but are susceptible to being re-infected. As in the SI model, the population is divided into two fractions with $s(t) + x(t) = 1$. We model infection, recovery and possible re-infection with the SIS model:
$$\dot x = \beta s x - \gamma x = (\beta - \gamma - \beta x) x, \tag{16.5}$$
where $\beta$ is the infection rate and $\gamma$ is the recovery rate. Note that the first term is the same infection term as in the SI model and the second term is the same recovery term as in the SIR model.
A simple qualitative analysis of this equation can be performed by plotting $\dot x$ over $x$ for $\beta < \gamma$, $\beta = \gamma$, and $\beta > \gamma$; see Figure 16.5.

[Figure omitted: plots of $\dot x = (\beta - \gamma - \beta x)x$ versus $x$]
Figure 16.5: Phase portrait of the (lumped deterministic) SIS model for $\beta = 1 < \gamma = 3/2$ and for $\beta = 1 > \gamma = 1/2$.

Lemma 16.5 (Dynamical behavior of the SIS model). For the SIS model (16.5):
(i) the closed-form solution to equation (16.5) from initial condition $x(0) = x_0 \in [0,1]$, for $\beta \ne \gamma$, is
$$x(t) = \frac{(\beta - \gamma) x_0}{\beta x_0 - e^{-(\beta - \gamma)t} \big( \gamma - \beta(1 - x_0) \big)}; \tag{16.6}$$
(ii) if $\beta \le \gamma$, all trajectories converge to the unique equilibrium $x = 0$ (i.e., the epidemic disappears); and
(iii) if $\beta > \gamma$, then, from all positive initial conditions $x(0) > 0$, all trajectories converge to the unique exponentially stable equilibrium $x^* = (\beta - \gamma)/\beta < 1$ (epidemic outbreak and steady-state epidemic contagion).

We illustrate these results in Figure 16.6.

[Figure omitted: time evolution of $x(t)$ approaching $x^* = 1/2$]
Figure 16.6: Evolution of the (lumped deterministic) SIS model from a small initial fraction of infected individuals; $\beta = 1 > \gamma = .5$.
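A minimal numerical check of statement (iii) (plain Python, no libraries needed; the parameter values match Figure 16.6 and are illustrative):

```python
# Sketch: forward-Euler integration of the SIS model (16.5) above the
# threshold; the trajectory approaches x* = (beta - gamma)/beta.
beta, gamma, dt = 1.0, 0.5, 1e-3
x = 0.01
for _ in range(40_000):                       # integrate up to t = 40
    x += dt * (beta - gamma - beta * x) * x
print(abs(x - (beta - gamma) / beta) < 1e-4)  # True: x(t) -> x* = 0.5
```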


16.4 Exercises
E16.1 Closed-form solutions for SI and SIS models. Verify the correctness of the closed-form solutions for SI
and SIS models given in equations (16.2) and (16.6).
E16.2 Dynamical behavior of the SIS model. Prove Lemma 16.5.

Chapter 17
Virus Propagation in Contact Networks

In this chapter we continue our discussion of the diffusion and propagation of infectious diseases. Starting from the basic lumped models discussed in Chapter 16, we now focus on network models and discuss some stochastic modeling aspects.
We borrow ideas from the lecture notes by Zampieri (2013) and from Bullo et al. (2016). A detailed
survey about infectious diseases is (Hethcote 2000); a more recent survey is (Nowzari et al. 2016). A very
early work on epidemic models over networks, the spectral radius of the adjacency matrix and the epidemic
threshold is Lajmanovich and Yorke (1976). Later works on similar models include (Wang et al. 2003)
and Van Mieghem et al. (2009); Van Mieghem (2011). Our stochastic analysis is based on the approach
in (Mei and Bullo 2014). Recent extensions and general proofs for the deterministic SIS network model are
given by Khanafer et al. (2014). A related book chapter is (Newman 2010, Chapter 17). The network SIR
model is discussed by Youssef and Scoglio (2011).

17.1 The stochastic network SI model


In this section we consider epidemic models that are richer, more general, and more complex than the lumped deterministic models considered before. We extend our treatment in two ways: we consider a stochastic model of the propagation phenomenon and we imagine the population distributed over a network.

The stochastic model The stochastic network SI model, illustrated in Figure 17.1, is defined as follows:

(i) We consider a group of n individuals. The state of each individual is either S for susceptible or I for
infected.
(ii) The n individuals are in pairwise contact, as specified by an undirected graph G with adjacency
matrix A (without self-loops). The edge weights represent the frequency of contact among two
individuals.
(iii) Each individual in susceptible status can transition to infected as follows: given an infection rate $\beta > 0$, if a susceptible individual $i$ is in contact with an infected individual $j$ for time $\Delta t$, the probability of infection is $\beta a_{ij} \Delta t$. Each individual can be infected by any neighboring individual: these random events are independent.


[Figure omitted: schematic of the susceptible-to-infected transition with infection rate $\beta$]
Figure 17.1: In the stochastic network SI model, each susceptible individual (blue) becomes infected by contact with infected individuals (red) in its neighborhood, according to an infection rate $\beta$.

An approximate deterministic model. We define the infection variable at time $t$ for individual $i$ by
$$Y_i(t) = \begin{cases} 1, & \text{if node } i \text{ is in state I at time } t, \\ 0, & \text{if node } i \text{ is in state S at time } t, \end{cases}$$
and the expected infection, which turns out to be equal to the probability of infection, of individual $i$ by
$$x_i(t) = \mathbb{E}[Y_i(t)] = 1 \cdot \mathbb{P}[Y_i(t) = 1] + 0 \cdot \mathbb{P}[Y_i(t) = 0] = \mathbb{P}[Y_i(t) = 1].$$

In what follows it will be useful to approximate $\mathbb{P}[Y_i(t) = 0 \mid Y_j(t) = 1]$ with $\mathbb{P}[Y_i(t) = 0]$, that is, to require $Y_i$ and $Y_j$ to be independent for arbitrary $i$ and $j$. We claim this approximation is acceptable over certain graphs with large numbers $n$ of individuals. The final model, which we obtain below based on the Independence Approximation, is an upper bound on the true model because $\mathbb{P}[Y_i(t) = 0] \ge \mathbb{P}[Y_i(t) = 0 \mid Y_j(t) = 1]$.

Definition 17.1 (Independence Approximation). For any two individuals i and j, the infection variables
Yi and Yj are independent.

Theorem 17.2 (From the stochastic to the deterministic network SI model). Consider the stochastic network SI model with infection rate $\beta$ over a contact graph with adjacency matrix $A$. The probabilities of infection satisfy
$$\frac{d}{dt} \mathbb{P}[Y_i(t) = 1] = \beta \sum_{j=1}^n a_{ij} \, \mathbb{P}[Y_i(t) = 0, Y_j(t) = 1].$$
Moreover, under the Independence Approximation 17.1, the probabilities of infection $x_i(t) = \mathbb{P}[Y_i(t) = 1]$, $i \in \{1,\dots,n\}$, satisfy the (deterministic) network SI model defined by
$$\dot x_i(t) = \beta (1 - x_i(t)) \sum_{j=1}^n a_{ij} x_j(t).$$


We study the deterministic network SI model in the next section.


Proof. In what follows, we define the random variables
$$Y_{-i}(t) = \big( Y_1(t), \dots, Y_{i-1}(t), Y_{i+1}(t), \dots, Y_n(t) \big),$$
and, similarly, $Y_{-ij}(t)$, for $i, j \in \{1,\dots,n\}$. We are interested in the events that a susceptible individual remains susceptible or becomes infected over the interval of time $[t, t + \Delta t]$, for small $\Delta t$. We start by computing the probability of non-infection for time $\Delta t$, conditioned upon $Y_{-i}(t)$:
$$\mathbb{P}[Y_i(t + \Delta t) = 0 \mid Y_i(t) = 0, Y_{-i}(t)] = \prod_{j=1}^n \big( 1 - \beta a_{ij} Y_j(t) \Delta t \big) = 1 - \sum_{j=1}^n \beta a_{ij} Y_j(t) \Delta t + O(\Delta t^2),$$
where $O(\Delta t^2)$ is a function upper bounded by a constant times $\Delta t^2$. The complementary probability, i.e., the probability of infection in time $\Delta t$, is:
$$\mathbb{P}[Y_i(t + \Delta t) = 1 \mid Y_i(t) = 0, Y_{-i}(t)] = \sum_{j=1}^n \beta a_{ij} Y_j(t) \Delta t + O(\Delta t^2).$$
We are now ready to study the random variable $Y_i(t + \Delta t) - Y_i(t)$, given $Y_{-i}(t)$:
$$\begin{aligned}
\mathbb{E}[Y_i(t+\Delta t) - Y_i(t) \mid Y_{-i}(t)]
&= 1 \cdot \mathbb{P}[Y_i(t+\Delta t) = 1, Y_i(t) = 0 \mid Y_{-i}(t)] \\
&\quad + 0 \cdot \mathbb{P}\big[ (Y_i(t+\Delta t) = Y_i(t) = 0) \text{ or } (Y_i(t+\Delta t) = Y_i(t) = 1) \mid Y_{-i}(t) \big] && \text{(by def. expectation)} \\
&= \mathbb{P}[Y_i(t+\Delta t) = 1 \mid Y_i(t) = 0, Y_{-i}(t)] \cdot \mathbb{P}[Y_i(t) = 0 \mid Y_{-i}(t)] && \text{(by conditional prob.)} \\
&= \Big( \sum_{j=1}^n \beta a_{ij} Y_j(t) \Delta t + O(\Delta t^2) \Big) \mathbb{P}[Y_i(t) = 0 \mid Y_{-i}(t)].
\end{aligned}$$
We now remove the conditioning upon $Y_{-i}(t)$ and study:
$$\mathbb{E}[Y_i(t + \Delta t) - Y_i(t)] = \mathbb{E}\big[ \mathbb{E}[Y_i(t+\Delta t) - Y_i(t) \mid Y_{-i}(t)] \big] = \sum_{j=1}^n \beta a_{ij} \Delta t \, \mathbb{E}\big[ Y_j(t)\, \mathbb{P}[Y_i(t) = 0 \mid Y_{-i}(t)] \big] + O(\Delta t^2),$$
and therefore we compute (where $y$ is an arbitrary realization of the random variable $Y$):
$$\begin{aligned}
\mathbb{E}\big[ Y_j(t)\, \mathbb{P}[Y_i(t) = 0 \mid Y_{-i}(t)] \big]
&= \sum_{y_{-i}} y_j \, \mathbb{P}[Y_i(t) = 0 \mid Y_{-i}(t) = y_{-i}] \, \mathbb{P}[Y_{-i}(t) = y_{-i}] && \text{(by def. expectation)} \\
&= \sum_{y_{-ij}} 1 \cdot \mathbb{P}[Y_i(t) = 0 \mid Y_{-ij}(t) = y_{-ij}, Y_j(t) = 1] \, \mathbb{P}[Y_{-ij}(t) = y_{-ij}, Y_j(t) = 1] && \text{(because } y_j \in \{0,1\}\text{)} \\
&= \sum_{y_{-ij}} \mathbb{P}[Y_i(t) = 0, Y_{-ij}(t) = y_{-ij}, Y_j(t) = 1] && \text{(by conditional prob.)} \\
&= \mathbb{P}[Y_i(t) = 0, Y_j(t) = 1],
\end{aligned}$$
where, for example, the first summation is taken over all possible values $y_{-i}$ that the variable $Y_{-i}(t)$ takes. In summary, we know
$$\mathbb{E}[Y_i(t + \Delta t) - Y_i(t)] = \sum_{j=1}^n \beta a_{ij} \Delta t \, \mathbb{P}[Y_i(t) = 0, Y_j(t) = 1] + O(\Delta t^2),$$
so that, also recalling $\mathbb{P}[Y_i(t) = 1] = \mathbb{E}[Y_i(t)]$,
$$\frac{d}{dt} \mathbb{P}[Y_i(t) = 1] = \lim_{\Delta t \to 0^+} \frac{\mathbb{E}[Y_i(t + \Delta t) - Y_i(t)]}{\Delta t} = \sum_{j=1}^n \beta a_{ij} \, \mathbb{P}[Y_i(t) = 0, Y_j(t) = 1].$$
The final step is an immediate consequence of the Independence Approximation: $\mathbb{P}[Y_i(t) = 0, Y_j(t) = 1] = \mathbb{P}[Y_i(t) = 0 \mid Y_j(t) = 1] \, \mathbb{P}[Y_j(t) = 1] \approx \big( 1 - \mathbb{P}[Y_i(t) = 1] \big) \mathbb{P}[Y_j(t) = 1]$. ∎
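The relation between the stochastic model and its deterministic approximation can be probed by Monte Carlo simulation. A minimal sketch (assuming NumPy; the random contact graph, $\beta$, the horizon, and the number of trials are illustrative choices):

```python
import numpy as np

# Sketch: Monte Carlo simulation of the stochastic network SI model versus
# the deterministic approximation of Theorem 17.2.
rng = np.random.default_rng(2)
n, beta, dt, steps, trials = 20, 0.3, 1e-2, 300, 1000
A = np.triu((rng.random((n, n)) < 0.2).astype(float), 1)
A = A + A.T                                   # undirected, no self-loops

freq = np.zeros(n)
for _ in range(trials):
    Y = np.zeros(n); Y[0] = 1.0               # node 0 initially infected
    for _ in range(steps):
        p = beta * dt * (A @ Y)               # per-node infection probability
        Y = np.maximum(Y, (rng.random(n) < p).astype(float))
    freq += Y
freq /= trials                                # empirical P[Y_i(t) = 1]

x = np.zeros(n); x[0] = 1.0
for _ in range(steps):                        # deterministic model (17.1)
    x += dt * beta * (1 - x) * (A @ x)
print(np.max(np.abs(freq - x)))               # small; x slightly overestimates
```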

17.2 The network SI model


In this and the following sections we consider deterministic network models for the propagation of
epidemics. Two interpretations of the provided models are possible: if node i is a population of in-
dividuals at location i, then xi can be interpreted as the infected fraction of that population. If node
i is a single individual, then xi can be interpreted as the probability that the individual is infected:
xi (t) = P[individual i is infected at time t].

[Figure omitted: network schematic of nodes with infection probabilities between 0 and 1]
Figure 17.2: In the (deterministic) network SI model, each node is described by a probability of infection taking value between 0 (blue) and 1 (red). The rate at which individuals become increasingly infected is parametrized by the infection rate $\beta$.

Consider an undirected weighted graph $G = (V, E)$ of order $n$ with adjacency matrix $A$ and degree matrix $D = \mathrm{diag}(A \mathbb{1}_n)$. Let $x_i(t) \in [0,1]$ denote the fraction of infected individuals at node $i \in V$ at time $t \in \mathbb{R}_{\ge 0}$. The network SI model is
$$\dot x_i(t) = \beta (1 - x_i(t)) \sum_{j=1}^n a_{ij} x_j(t), \tag{17.1}$$
or, in equivalent vector form,
$$\dot x = \beta \big( I_n - \mathrm{diag}(x) \big) A x.$$
Alternatively, in terms of the fractions of susceptible individuals $s = \mathbb{1}_n - x$, the network SI model reads
$$\dot s = -\beta \,\mathrm{diag}(s) A (\mathbb{1}_n - s). \tag{17.2}$$

Theorem 17.3 (Dynamical behavior of the network SI model). Consider the network SI model (17.1). Assume $G$ is connected so that $A$ is irreducible; let $D$ denote the degree matrix. The following statements hold:
(i) if $x(0), s(0) \in [0,1]^n$, then $x(t), s(t) \in [0,1]^n$ for all $t \ge 0$;
(ii) there are two equilibrium points: $0_n$ (no epidemics) and $\mathbb{1}_n$ (full contagion);
(iii) the linearization of model (17.1) about the equilibrium point $0_n$ is $\dot x = \beta A x$ and it is exponentially unstable;
(iv) the linearization of model (17.2) about the equilibrium $0_n$ is $\dot s = -\beta D s$ and it is exponentially stable;
(v) each trajectory with initial condition $x(0) \ne 0_n$ converges asymptotically to $\mathbb{1}_n$, that is, the epidemics spreads to the entire network.

Proof. Statement (i) can be proved by evaluating the vector field (17.1) at the boundaries of the admissible state space, that is, for $x \in [0,1]^n$ such that at least one entry $i$ satisfies $x_i \in \{0, 1\}$. We leave the detailed proof of statement (i) to the reader.
We now prove statement (ii). The point $x$ is an equilibrium point if and only if:
$$\big( I_n - \mathrm{diag}(x) \big) A x = 0_n \iff A x = \mathrm{diag}(x) A x.$$
Clearly, $0_n$ and $\mathbb{1}_n$ are equilibrium points. Hence we just need to show that no other points can be equilibria. First, suppose that there exists an equilibrium point $x$ with $0_n \le x < \mathbb{1}_n$. But then $I_n - \mathrm{diag}(x)$ has strictly positive diagonal and therefore $x$ must satisfy $A x = 0_n$. Note that $A x = 0_n$ implies also $\sum_{k=1}^{n-1} A^k x = 0_n$. If $A$ is irreducible, then $\sum_{k=1}^{n-1} A^k$ has all off-diagonal terms strictly positive. Because $x_i \in [0,1[$, the only possible solution to $A x = 0_n$ is therefore $x = 0_n$. This is a contradiction.
Next, suppose there exists an equilibrium point $x = (x_1, x_2)$ with $0_{n_1} \le x_1 < \mathbb{1}_{n_1}$, $x_2 = \mathbb{1}_{n_2}$, and $n_1 + n_2 = n$. The equality $A x = \mathrm{diag}(x) A x$ implies $A x = \mathrm{diag}(x)^k A x$ for all $k \in \mathbb{N}$ and, in turn,
$$A x = \lim_{k\to\infty} \mathrm{diag}(x)^k A x = \begin{bmatrix} 0_{n_1 \times n_1} & 0_{n_1 \times n_2} \\ 0_{n_2 \times n_1} & I_{n_2} \end{bmatrix} A x.$$
By partitioning $A$ in corresponding blocks, the previous equality implies $A_{11} x_1 + A_{12} x_2 = 0_{n_1}$. Because $x_2 = \mathbb{1}_{n_2}$, we know that $A_{12} = 0_{n_1 \times n_2}$ and, therefore, that $A$ is reducible. This contradiction concludes the proof of statement (ii).
Statements (iii) and (iv) are straightforward computations:
$$\dot x = \beta \big( I_n - \mathrm{diag}(x) \big) A x = \beta A x - \beta \,\mathrm{diag}(x) A x \approx \beta A x,$$
$$\dot s = -\beta \,\mathrm{diag}(s) A (\mathbb{1}_n - s) = -\beta \,\mathrm{diag}(s) A \mathbb{1}_n + \beta \,\mathrm{diag}(s) A s = -\beta D s + \beta \,\mathrm{diag}(s) A s \approx -\beta D s,$$
where we used the equality $\mathrm{diag}(y) z = \mathrm{diag}(z) y$ for $y, z \in \mathbb{R}^n$. Exponential stability of the linearization $\dot s = -\beta D s$ is obvious, and the Perron-Frobenius Theorem 2.12 for irreducible matrices implies the existence of the unstable positive eigenvalue $\beta \rho(A) > 0$ for the linearization $\dot x = \beta A x$.
To show statement (v), consider the function $V(x) = \mathbb{1}_n^\top (\mathbb{1}_n - x)$; this is a smooth function defined over the compact and forward invariant set $[0,1]^n$ (see (i)). We compute $\dot V = -\beta \mathbb{1}_n^\top \big( I_n - \mathrm{diag}(x) \big) A x$ and note that $\dot V \le 0$ for all $x$, and $\dot V(x) = 0$ if and only if $x \in \{0_n, \mathbb{1}_n\}$. Because of these facts, the LaSalle Invariance Principle in Theorem 13.4 implies that all trajectories converge asymptotically to either $\mathbb{1}_n$ or $0_n$. Additionally, note that $0 \le V(x) \le n$ for all $x \in [0,1]^n$, that $V(x) = 0$ if and only if $x = \mathbb{1}_n$, and that $V(x) = n$ if and only if $x = 0_n$. Therefore, all trajectories with $x(0) \ne 0_n$ converge asymptotically to $\mathbb{1}_n$. ∎

Before proceeding, we review the notion of dominant eigenvector and introduce some notation. Let $\lambda_{\max} = \rho(A)$ be the dominant eigenvalue of the adjacency matrix $A$ and let $v_{\max}$ be the corresponding positive eigenvector normalized to satisfy $\mathbb{1}_n^\top v_{\max} = 1$. (Recall that these definitions are well posed because of the Perron-Frobenius Theorem 2.12 for irreducible matrices.) Additionally, let $v_{\max}, v_2, \dots, v_n$ denote an orthonormal set of eigenvectors with corresponding eigenvalues $\lambda_{\max} > \lambda_2 \ge \dots \ge \lambda_n$ for the symmetric adjacency matrix $A$.
Consider now the onset of an epidemics in a large population characterized by a small initial infection $x(0) = x_0 \ll \mathbb{1}_n$. So long as $x(t) \ll \mathbb{1}_n$, the system evolution is approximated by $\dot x = \beta A x$. This initial-times linear evolution satisfies
$$x(t) = \big( v_{\max}^\top x_0 \big) e^{\beta\lambda_{\max} t} v_{\max} + \sum_{i=2}^n \big( v_i^\top x_0 \big) e^{\beta\lambda_i t} v_i = e^{\beta\lambda_{\max} t} \Big( \big( v_{\max}^\top x_0 \big) v_{\max} + o(t) \Big), \tag{17.3}$$
where $o(t)$ is a function exponentially vanishing as $t \to \infty$. In other words, the epidemics initially experiences exponential growth with rate $\beta\lambda_{\max}$ and with distribution among the nodes given by the eigenvector $v_{\max}$.
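The alignment predicted by (17.3) can be observed numerically. A minimal sketch (assuming NumPy; the star graph and parameter values are illustrative choices):

```python
import numpy as np

# Sketch: at the onset of a network SI epidemic, the infection profile
# aligns with the dominant eigenvector v_max, as in (17.3).
n, beta, dt, steps = 6, 1.0, 1e-3, 3000
A = np.zeros((n, n))
A[0, 1:] = A[1:, 0] = 1.0                  # star graph with hub node 0
vals, vecs = np.linalg.eigh(A)
vmax = np.abs(vecs[:, -1])                 # dominant (Perron) eigenvector

x = np.full(n, 1e-4)                       # small uniform initial infection
for _ in range(steps):
    x += dt * beta * (1 - x) * (A @ x)
print(abs((x / np.linalg.norm(x)) @ vmax)) # close to 1: aligned with v_max
```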

17.3 The network SIS model


As previously, consider an undirected weighted graph $G = (V, E)$ of order $n$ with adjacency matrix $A$. Let $x_i(t) \in [0,1]$ denote the fraction of infected individuals at node $i \in V$ at time $t \in \mathbb{R}_{\ge 0}$. Given an infection rate $\beta$ and a recovery rate $\gamma$, the network SIS model is
$$\dot x_i(t) = \beta (1 - x_i(t)) \sum_{j=1}^n a_{ij} x_j(t) - \gamma x_i(t), \tag{17.4}$$
or, in equivalent vector form,
$$\dot x = \beta \big( I_n - \mathrm{diag}(x) \big) A x - \gamma x. \tag{17.5}$$

We start our analysis with useful preliminary notions. We define the monotonically-increasing functions
$$f_+(y) = y/(1+y), \quad \text{and} \quad f_-(z) = z/(1-z)$$
for $y \in \mathbb{R}_{\ge 0}$ and $z \in [0,1[$. One can easily verify that $f_+(f_-(z)) = z$ for all $z \in [0,1[$. For vector variables $y \in \mathbb{R}^n_{\ge 0}$ and $z \in [0,1[^n$, we write $F_+(y) = (f_+(y_1), \dots, f_+(y_n))$ and $F_-(z) = (f_-(z_1), \dots, f_-(z_n))$.
Denoting $\tilde A = (\beta/\gamma) A$ and assuming $x < \mathbb{1}_n$, the model (17.5) is rewritten as:
$$\dot x = F(x) := \gamma \,\mathrm{diag}(\mathbb{1}_n - x) \big( \tilde A x - F_-(x) \big),$$
so that
$$F(x) \ge 0_n \iff \tilde A x \ge F_-(x) \iff F_+(\tilde A x) \ge x.$$
Moreover, $x^*$ is an equilibrium point ($F(x^*) = 0_n$) if and only if $\tilde A x^* = F_-(x^*)$ or, equivalently, if and only if $F_+(\tilde A x^*) = x^*$. We are now ready to present our results in two theorems.
only if F+ (Ax

Theorem 17.4 (Dynamical behavior of the network SIS model: below the threshold). Consider the network SIS model (17.4) over an undirected graph $G$ with infection rate $\beta$ and recovery rate $\gamma$. Assume $G$ is connected; let $A$ be its adjacency matrix with dominant eigenvalue $\lambda_{\max}$. If $\beta\lambda_{\max}/\gamma < 1$, then
(i) $0_n$ is the unique equilibrium point;
(ii) the linearization of model (17.4) about the equilibrium $0_n$ is $\dot x = (\beta A - \gamma I_n) x$ and it is exponentially stable; and
(iii) from any initial condition $x(0) \ne 0_n$, the weighted average $t \mapsto v_{\max}^\top x(t)$ is monotonically and exponentially decreasing, so that all trajectories converge to $0_n$.


Proof. Regarding statement (i), for $x \in [0,1]^n \setminus \{0_n\}$, note $F_+(\tilde A x) \le \tilde A x$ because $f_+(z) \le z$. Compute
$$\| F_+(\tilde A x) \|_2 \le \| \tilde A x \|_2 \le \| \tilde A \|_2 \| x \|_2 < \| x \|_2,$$
where the last inequality follows from $\| \tilde A \|_2 = \rho(\tilde A)$, because $\tilde A$ is symmetric, and from $\rho(\tilde A) = \beta\lambda_{\max}/\gamma < 1$. Therefore, no $x \ne 0_n$ can satisfy $F_+(\tilde A x) = x$.
Regarding statement (ii), the linearization of equation (17.5) is verified by dropping the second-order terms. The eigenvalues of $\beta A - \gamma I_n$ are $\beta\lambda_i - \gamma$, where $\lambda_1 = \lambda_{\max} > \lambda_2 \ge \dots \ge \lambda_n$ are the eigenvalues of $A$. The linearized system is exponentially stable at $0_n$ because $\beta\lambda_{\max} - \gamma < 0$.
Finally, regarding statement (iii), define $y(t) = v_{\max}^\top x(t)$, note $\big( I_n - \mathrm{diag}(z) \big) v_{\max} \le v_{\max}$ for any $z \in [0,1]^n$, and compute
$$\dot y(t) = \beta v_{\max}^\top \big( I_n - \mathrm{diag}(x(t)) \big) A x(t) - \gamma v_{\max}^\top x(t) \le \beta v_{\max}^\top A x(t) - \gamma v_{\max}^\top x(t) = (\beta\lambda_{\max} - \gamma) y(t).$$
By Grönwall's Lemma, this inequality implies that $t \mapsto y(t)$ is monotonically decreasing and satisfies $y(t) \le y(0) e^{(\beta\lambda_{\max} - \gamma)t}$ from all initial conditions $y(0)$. This concludes our proof of statement (iii), since $v_{\max} > 0_n$. ∎

Theorem 17.5 (Dynamical behavior of the network SIS model: above the threshold). Consider the network SIS model (17.4) over an undirected graph $G$ with infection rate $\beta$ and recovery rate $\gamma$. Assume $G$ is connected; let $A$ be its adjacency matrix with dominant eigenpair $(\lambda_{\max}, v_{\max})$ and with degree vector $d = A \mathbb{1}_n$. If $\beta\lambda_{\max}/\gamma > 1$, then

(i) $0_n$ is an equilibrium point, the linearization of system (17.5) at $0_n$ is unstable with dominant unstable eigenvalue $\beta\lambda_{\max} - \gamma$ and with dominant eigenvector $v_{\max}$, i.e., there will be an epidemic outbreak;
(ii) besides the equilibrium $0_n$, there exists a unique other equilibrium point $x^*$ such that
a) $x^* > 0_n$,
b) $x^* = \varepsilon v_{\max} + O(\varepsilon^2)$, for $\varepsilon := \beta\lambda_{\max}/\gamma - 1$, as $\varepsilon \to 0^+$,
c) $x^* = \mathbb{1}_n - (\gamma/\beta)\,\mathrm{diag}(d)^{-1} \mathbb{1}_n + O(\gamma^2/\beta^2)$, at fixed $A$ as $\gamma/\beta \to 0^+$,
d) $x^* = \lim_{k\to\infty} y(k)$, where the monotonically-increasing sequence $\{y(k)\}_{k \in \mathbb{Z}_{\ge 0}} \subset [0,1]^n$ is defined by
$$y_i(k+1) := f_+\Big( \frac{\beta}{\gamma} \sum_{j=1}^n a_{ij} y_j(k) \Big), \qquad y(0) := \frac{\varepsilon}{(1+\varepsilon)^2} v_{\max};$$
(iii) if $x(0) \ne 0_n$, then $x(t) \to x^*$ as $t \to \infty$. Moreover, if $x(0) < x^*$ (resp. $x(0) > x^*$), then $t \mapsto x(t)$ is monotonically increasing (resp. decreasing).

Note: statement (i) means that, near the onset of an epidemic outbreak, the exponential growth rate is $\beta\lambda_{\max} - \gamma$ and the outbreak tends to align with the dominant eigenvector $v_{\max}$, as in the discussion leading up to the approximate evolution (17.3).
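The fixed-point characterization in statement (ii)d doubles as a numerical algorithm for the endemic state. A minimal sketch (assuming NumPy; the complete graph $K_5$ and the rates are illustrative choices with $\beta\lambda_{\max}/\gamma > 1$):

```python
import numpy as np

# Sketch: compute x* via the monotone iteration of Theorem 17.5(ii)d.
n, beta, gamma = 5, 1.0, 0.8
A = np.ones((n, n)) - np.eye(n)             # complete graph K_5
lam_max = np.max(np.linalg.eigvalsh(A))     # = n - 1 = 4
eps = beta * lam_max / gamma - 1
assert eps > 0                              # above the epidemic threshold

vmax = np.ones(n) / n                       # dominant eigenvector, 1^T v = 1
y = (eps / (1 + eps) ** 2) * vmax           # y(0)
for _ in range(200):
    z = (beta / gamma) * (A @ y)            # tilde{A} y(k)
    y = z / (1 + z)                         # y(k+1) = F_+(tilde{A} y(k))
print(y)                                    # approx x*, uniform by symmetry
print(1 - gamma / (beta * (n - 1)))         # known value for K_n: 0.8
```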

Proof of selected statements in Theorem 17.5. Statement (i) follows from the same analysis of the linearized system as in the proof of Theorem 17.4(ii).
We next focus on statement (ii). We begin by establishing two properties of the map $x \mapsto F_+(\tilde A x)$.
First, we claim that $y > z \ge 0_n$ implies $F_+(\tilde A y) > F_+(\tilde A z)$. Indeed, note that $G$ being connected implies that the adjacency matrix $A$ has at least one strictly positive entry in each row. Hence, $y - z > 0_n$ implies $\tilde A(y - z) > 0_n$ and, since $f_+$ is monotonically increasing, $\tilde A y > \tilde A z$ implies $F_+(\tilde A y) > F_+(\tilde A z)$.
Second, we claim that there exists an $x \in [0,1]^n$ satisfying $F_+(\tilde A x) > x$. Indeed, let $\tilde\lambda_{\max} = \lambda_{\max}(\tilde A) = \beta\lambda_{\max}(A)/\gamma > 1$ and compute, for any $\delta > 0$,
$$F_+\big( \tilde A (\delta v_{\max}) \big)_i = f_+(\delta \tilde\lambda_{\max} v_{\max,i}) > \delta v_{\max,i} \big( \tilde\lambda_{\max} - \delta \tilde\lambda_{\max}^2 v_{\max,i} \big),$$
where we used the scalar inequality $y/(1+y) > y(1-y)$, for $y > 0$. For $\delta = (\tilde\lambda_{\max} - 1)/\tilde\lambda_{\max}^2$ and recalling $v_{\max,i} < 1$ for each $i$, compute
$$\tilde\lambda_{\max} - \delta \tilde\lambda_{\max}^2 v_{\max,i} = \tilde\lambda_{\max} - (\tilde\lambda_{\max} - 1) v_{\max,i} > \tilde\lambda_{\max} - (\tilde\lambda_{\max} - 1) = 1.$$
This concludes our proof that $F_+(\tilde A \,\delta v_{\max}) > \delta v_{\max}$, for $\delta = (\tilde\lambda_{\max} - 1)/\tilde\lambda_{\max}^2$. Simple calculations show that $\delta = \varepsilon/(1+\varepsilon)^2$ so that $\delta v_{\max} = y(0)$.
These two properties allow us to analyze the iteration defined in the theorem statement. We just proved that $y(1) = F_+(\tilde A y(0)) > y(0) = (\varepsilon/(1+\varepsilon)^2) v_{\max}$. This inequality implies $F_+(\tilde A y(1)) > F_+(\tilde A y(0))$ and, by induction, $F_+(\tilde A y(k+1)) > y(k+1) = F_+(\tilde A y(k))$. Each sequence $\{y_i(k)\}_{k \in \mathbb{N}}$, $i \in \{1,\dots,n\}$, is monotonically increasing and upper bounded by $1$. Hence, the sequence $\{y(k)\}_{k \in \mathbb{N}}$ converges, and it converges to a point $x^* > 0_n$ such that $F_+(\tilde A x^*) = x^*$. This proves the existence of an equilibrium point $x^* = \lim_{k\to\infty} y(k) > 0_n$, as claimed in statements (ii)d and (ii)a.


Regarding statement (ii)b, we claim there exists a bounded sequence $\{w(k,\varepsilon)\}_{k \in \mathbb{Z}_{\ge 0}} \subset \mathbb{R}^n$ such that the sequence $\{y(k)\}_{k \in \mathbb{Z}_{\ge 0}}$ satisfies $y(k) = \varepsilon v_{\max} + \varepsilon^2 w(k,\varepsilon)$. The statement $x^* = \varepsilon v_{\max} + O(\varepsilon^2)$ is then an immediate consequence of this claim and of the limit $\lim_{k\to\infty} y(k) = x^*$. We prove the claim by induction. Because $\varepsilon/(1+\varepsilon)^2 = \varepsilon - 2\varepsilon^2 + O(\varepsilon^3)$, the claim is true for $k = 0$ with $w(0,\varepsilon) = -2 v_{\max} + O(\varepsilon)$. We now assume the claim is true at $k$ and show it is true at $k+1$:
$$\begin{aligned}
y(k+1) &= F_+\big( \tilde A (\varepsilon v_{\max} + \varepsilon^2 w(k,\varepsilon)) \big) \\
&= F_+\big( \varepsilon (1+\varepsilon) v_{\max} + \varepsilon^2 \tilde A w(k,\varepsilon) \big) \\
&= F_+\big( \varepsilon v_{\max} + \varepsilon^2 (\tilde A w(k,\varepsilon) + v_{\max}) \big) \\
&= \varepsilon v_{\max} + \varepsilon^2 \big( \tilde A w(k,\varepsilon) + v_{\max} \big) - \mathrm{diag}\big( \varepsilon v_{\max} + \varepsilon^2 (\tilde A w(k,\varepsilon) + v_{\max}) \big)\big( \varepsilon v_{\max} + \varepsilon^2 (\tilde A w(k,\varepsilon) + v_{\max}) \big) + O(\varepsilon^3) \\
&= \varepsilon v_{\max} + \varepsilon^2 \big( \tilde A w(k,\varepsilon) + v_{\max} - \mathrm{diag}(v_{\max}) v_{\max} + O(\varepsilon) \big),
\end{aligned}$$
where we used the Taylor expansion $F_+(y) = y - \mathrm{diag}(y) y + O(\|y\|^3)$. Hence, the claim is true if the sequence $\{w(k,\varepsilon)\}_{k \in \mathbb{Z}_{\ge 0}}$ defined by
$$w(k+1,\varepsilon) = \tilde A w(k,\varepsilon) + v_{\max} - \mathrm{diag}(v_{\max}) v_{\max} + O(\varepsilon)$$
is bounded; boundedness follows from a spectral analysis of this linear iteration. This concludes the proof of statement (ii)b. The proof of statement (ii)c is analogous: it suffices to show the existence of a bounded sequence $\{w(k)\}$ such that $y(k) = \mathbb{1}_n - (\gamma/\beta)\,\mathrm{diag}(d)^{-1} \mathbb{1}_n + (\gamma/\beta)^2 w(k)$.
To complete the proof of statement (ii) we establish the uniqueness of the equilibrium $x^* \in [0,1]^n \setminus \{0_n\}$. First, we claim that an equilibrium point with an entry equal to $0$ must be $0_n$. Indeed, assume $y$ is an equilibrium point and assume $y_i = 0$ for some $i \in \{1,\dots,n\}$. The equality $y_i = f_+\big( (\beta/\gamma) \sum_{j=1}^n a_{ij} y_j \big)$ implies that also any node $j$ with $a_{ij} > 0$ must satisfy $y_j = 0$. Because $G$ is connected, all entries of $y$ must be zero. Second, by contradiction, we assume there exists another equilibrium point $y > 0_n$ distinct from $x^*$. Without loss of generality, assume there exists $i$ such that $y_i < x_i^*$. Let $\eta \in (0,1)$ satisfy $y - \eta x^* \ge 0_n$ and $y_i = \eta x_i^*$. Note:
$$\big( F_+(\tilde A y) - y \big)_i = f_+\big( (\tilde A y)_i \big) - \eta x_i^* \ge f_+\big( \eta (\tilde A x^*)_i \big) - \eta x_i^* \qquad \text{(because } \tilde A \ge 0\text{)}$$
$$> \eta f_+\big( (\tilde A x^*)_i \big) - \eta x_i^* \qquad \text{(because } f_+(\eta y) > \eta f_+(y) \text{ for } \eta < 1\text{)}$$
$$= \eta \big( F_+(\tilde A x^*) - x^* \big)_i = 0 \qquad \text{(because } x^* \text{ is an equilibrium)}.$$
Therefore $\big( F_+(\tilde A y) - y \big)_i > 0$ and this is a contradiction.
Regarding statement (iii), we refer to (Lajmanovich and Yorke 1976; Fall et al. 2007; Khanafer et al. 2014) in the interest of brevity. ∎

17.4 The network SIR model


As previously, consider an undirected weighted graph $G = (V, E)$ of order $n$ with adjacency matrix $A$. Let $s_i(t), x_i(t), r_i(t) \in [0,1]$ denote the fractions of susceptible, infected and recovered individuals at node $i \in V$ at time $t \in \mathbb{R}_{\ge 0}$. The network SIR model is
$$\dot s_i(t) = -\beta s_i(t) \sum_{j=1}^n a_{ij} x_j(t),$$
$$\dot x_i(t) = \beta s_i(t) \sum_{j=1}^n a_{ij} x_j(t) - \gamma x_i(t),$$
$$\dot r_i(t) = \gamma x_i(t),$$
where $\beta > 0$ is the infection rate and $\gamma > 0$ is the recovery rate. Note that the third equation is redundant because of the constraint $s_i(t) + x_i(t) + r_i(t) = 1$; therefore, we regard the dynamical system as described by the first two equations and write it in vector form as
$$\dot s = -\beta \,\mathrm{diag}(s) A x,$$
$$\dot x = \beta \,\mathrm{diag}(s) A x - \gamma x. \tag{17.6}$$
We now state the main result of this section.

Theorem 17.6 (Dynamical behavior of the network SIR model). Consider the network SIR model (17.6) over an undirected graph $G$ with infection rate $\beta$ and recovery rate $\gamma$. Assume $G$ is connected and let $A$ be its adjacency matrix. Let $(\lambda_{\max}(t), v_{\max}(t))$ be the dominant eigenpair for the nonnegative matrix $\mathrm{diag}(s(t)) A$. The following statements hold:

(i) if $x(0) \ge 0_n$, $x(0) \ne 0_n$, and $s(0) > 0_n$, then
a) $t \mapsto s(t)$ and $t \mapsto x(t)$ are strictly positive for all $t > 0$,
b) $t \mapsto s(t)$ is monotonically decreasing, and
c) $t \mapsto \lambda_{\max}(t)$ is monotonically decreasing;

(ii) the set of equilibrium points is the set of pairs $(s^*, 0_n)$, for any $s^* \in [0,1]^n$, and the linearization of model (17.6) about an equilibrium point $(s^*, 0_n)$ is
$$\dot s = -\beta \,\mathrm{diag}(s^*) A x,$$
$$\dot x = \beta \,\mathrm{diag}(s^*) A x - \gamma x; \tag{17.7}$$

(iii) (behavior above the threshold) if $\beta\lambda_{\max}(0) > \gamma$ and $x(0) \ne 0_n$, then
a) for small times, the weighted average $t \mapsto v_{\max}(0)^\top x(t)$ grows exponentially fast with rate $\beta\lambda_{\max}(0) - \gamma$, i.e., an epidemic outbreak will develop, and
b) there exists a time $\tau \ge 0$ such that $\beta\lambda_{\max}(\tau) < \gamma$;

(iv) (behavior below the threshold) for all $\tau \ge 0$ such that $\beta\lambda_{\max}(\tau) < \gamma$, the weighted average $t \mapsto v_{\max}(\tau)^\top x(t)$, for $t \ge \tau$, is monotonically and exponentially decreasing to zero; and
(v) each trajectory converges asymptotically to an equilibrium point, that is, $\lim_{t\to\infty} x(t) = 0_n$, so that the epidemics asymptotically disappears.
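A minimal simulation sketch illustrating statement (i)c (assuming NumPy; the random graph and rates are illustrative choices):

```python
import numpy as np

# Sketch: simulate the network SIR model (17.6) and monitor the dominant
# eigenvalue of diag(s(t))A, which statement (i)c asserts is decreasing.
rng = np.random.default_rng(3)
n, beta, gamma, dt, steps = 10, 0.5, 0.4, 1e-2, 2000
A = np.triu((rng.random((n, n)) < 0.4).astype(float), 1)
A = A + A.T

s, x = np.full(n, 0.99), np.full(n, 0.01)
lam = []
for _ in range(steps):
    lam.append(np.max(np.linalg.eigvals(np.diag(s) @ A).real))
    infection = beta * s * (A @ x)
    s, x = s - dt * infection, x + dt * (infection - gamma * x)
print(np.all(np.diff(lam) <= 1e-9))   # lambda_max(t) is non-increasing
print(x.max())                        # the infection eventually dies out
```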


17.5 Exercises
E17.1 Network SI model in digraphs. Generalize Theorem 17.3 to the setting of strongly-connected directed
graphs:
(i) what are the equilibrium points?
(ii) what are their convergence properties?
E17.2 Initial evolution of the network SIS model. Consider the network SIS model with initial condition $x(0) = \varepsilon x_0$, where we take $x_0 \le \mathbb{1}_n$ and $\varepsilon \ll 1$. Show that, in the time scale $t(\varepsilon) = \ln(1/\varepsilon)/(\beta\lambda_{\max} - \gamma)$, the linearized evolution satisfies
$$\lim_{\varepsilon \to 0^+} x\big( t(\varepsilon) \big) = \big( v_{\max}^\top x_0 \big) v_{\max}.$$

Chapter 18
Lotka-Volterra Population Dynamics

The Lotka-Volterra model is one of the simplest frameworks for modeling the dynamics of interacting populations in mathematical ecology. These equations were originally developed in (Lotka 1920; Volterra 1928); our treatment is based on (Goh 1980; Takeuchi 1996; Baigent 2010). We refer to (Baigent 2010) for additional results on conservative Lotka-Volterra models (Hamiltonian structure and existence of periodic orbits) and on competitive and monotone models. We refer to (Hofbauer and Sigmund 1998; Sandholm 2010) for comprehensive discussions about the connection with evolutionary game dynamics.

[Figure omitted: three photographs]
(a) Common Clownfish (Amphiprion ocellaris) near Magnificent Sea Anemones (Heteractis magnifica) on the Great Barrier Reef, Australia. Clownfish and anemones provide an example of ecological mutualism in that each species benefits from the activity of the other. (b) The Canadian Lynx (Lynx canadensis) is a major predator of the Snowshoe Hare (Lepus americanus). Historical records of animal captures indicate that the lynx and hare numbers rise and fall periodically; see Odum (1959). Photo source: Rudolfo's Usenet Animal Pictures Gallery (no longer in existence). (c) Subadult male lion (Panthera leo) and spotted hyena (Crocuta crocuta) compete for the same resources in the Maasai Mara National Reserve in Narok County, Kenya. (Picture "Hyänen und Löwe im Morgenlicht" by lubye134 is licensed under CC BY 2.0.)

Figure 18.1: Mutualism, predation and competition in population dynamics

18.1 The Lotka-Volterra population model: setup


In this section we introduce various single-species and multi-species model of population dynamics. We
start with single-species models. We let x(t) denote the population numer or its density at time t. The ratio
x/x
is the average contribution of an individual to the growth of the pupulation.


Single-species constant growth model. In the simplest model, one may assume $\dot x/x$ is equal to a constant growth rate $r$. This assumption, however, leads to exponential growth or decay $x(t) = x(0) e^{rt}$, depending upon whether $r$ is positive or negative. Of course, exponential growth may be reasonable only for short periods of time and violates the reasonable assumption of bounded resources over large times.

Single-species logistic growth model. In large populations it is natural to assume that resources diminish with the growing size of the population. In the simplest such model, one may assume $\dot x/x = r(1 - x/K)$, where $r > 0$ is the intrinsic growth rate and $K > 0$ is called the carrying capacity. This assumption leads to the so-called logistic equation
$$\dot x = r x (1 - x/K). \tag{18.1}$$
This dynamical system has the following behavior:

(i) there are two equilibrium points, $0$ and $K$;
(ii) the solution is
$$x(t) = \frac{K x(0) e^{rt}}{K + x(0)(e^{rt} - 1)};$$
(iii) all solutions with $0 < x(0) < K$ are monotonically increasing and converge asymptotically to $K$; and
(iv) all solutions with $K < x(0)$ are monotonically decreasing and converge asymptotically to $K$.

The reader is invited to establish these facts and related ones in Exercise E18.1. The evolution of the logistic equation from multiple initial values is illustrated in Figure 18.2.

[Figure omitted: solutions $x(t) = K x(0) e^{rt} / \big( K + x(0)(e^{rt} - 1) \big)$ from multiple initial values]
Figure 18.2: Solutions to the logistic equation
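A quick numerical check of item (ii) (a minimal sketch; the values of $r$, $K$, $x(0)$, and the step size are illustrative):

```python
import math

# Sketch: forward-Euler integration of the logistic equation (18.1) versus
# its closed-form solution.
r, K, x0, dt, T = 1.0, 2.0, 0.1, 1e-4, 6.0
x = x0
for _ in range(int(T / dt)):
    x += dt * r * x * (1 - x / K)
closed = K * x0 * math.exp(r * T) / (K + x0 * (math.exp(r * T) - 1))
print(abs(x - closed) < 1e-3)   # True: Euler agrees with the formula
print(abs(x - K) < 0.15)        # and the solution approaches K
```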

Multi-species Lotka-Volterra model with signed interactions. Finally, we consider the case of $n \ge 2$ interacting species. We assume a logistic growth model for each species with an additional term due to the interaction with the other species. Specifically, we write the growth rate for species $i \in \{1,\dots,n\}$ as
$$\frac{\dot x_i}{x_i} = r_i + a_{ii} x_i + \sum_{j=1, j \ne i}^n a_{ij} x_j, \tag{18.2}$$


where the first two terms are the logistic equation (so that aii is typically negative because of bounded
resources and the carrying capacity is Ki = ri /aii ), and the third term is the combined effect of the
pairwise interactions with all other species.
The vector r is called the intrisic growth rate, the matrix A = [aij ] is called the interaction matrix,
and the the ordinary differential equations (18.2) are called the Lotka-Volterra model for n 2 interacting
species. This model is written in vector form as

x = diag(x) Ax + r =: fLK (x). (18.3)

For any two species $i$ and $j$, the signs of $a_{ij}$ and $a_{ji}$ in the interaction matrix $A$ are determined by which of the following three possible types of interaction is being modeled:

(+,+) = mutualism: for $a_{ij} > 0$ and $a_{ji} > 0$, the two species are in symbiosis and cooperation. The presence of species $i$ has a positive effect on the growth of species $j$ and vice versa.

(+,−) = predation: for $a_{ij} > 0$ and $a_{ji} < 0$, the species are in a predator-prey or host-parasite relationship. In other words, the presence of a prey (or host) species $j$ favors the growth of the predator (or parasite) species $i$, whereas the presence of the predator species has a negative effect on the growth of the prey.

(−,−) = competition: for $a_{ij} < 0$ and $a_{ji} < 0$, the two species compete for common resources of sorts and therefore have a negative effect on each other.

Note: the typical availability of bounded resources suggests it is ecologically meaningful to assume that the interaction matrix $A$ is Hurwitz and that, to model the setting in which species live in isolation, the diagonal entries $a_{ii}$ are negative.

Lectures on Network Systems, F. Bullo, Version v0.85(i) (6 Aug 2016). Draft not for circulation. Copyright 2012-16.
242 Chapter 18. Lotka-Volterra Population Dynamics

18.2 Two-species model and analysis


In this section we consider the two-species Lotka-Volterra system
$$\dot x_1 = x_1 (r_1 + a_{11} x_1 + a_{12} x_2),$$
$$\dot x_2 = x_2 (r_2 + a_{21} x_1 + a_{22} x_2), \tag{18.4}$$
with scalar parameters $(r_1, r_2)$ and $(a_{11}, a_{12}, a_{21}, a_{22})$. It is possible to fully characterize the dynamical behavior of this system as a function of the six scalar parameters. Our standing assumptions are:
$$r_i > 0, \quad a_{ii} < 0, \quad \text{for } i \in \{1, 2\}.$$
We study various cases depending upon the signs of $a_{12}$ and $a_{21}$; the results are taken from (Goh 1976). To study the phase portrait of this two-dimensional system, it is useful to establish the following facts:
(i) along the axis $x_2 = 0$, there exists a unique non-trivial equilibrium point $x_1^* = -r_1/a_{11}$; similarly, along the axis $x_1 = 0$, there exists a unique non-trivial equilibrium point $x_2^* = -r_2/a_{22}$;
(ii) the $x_1$-null-line is the set of points $(x_1, x_2)$ where $\dot x_1 = 0$, that is, the line in the $(x_1, x_2)$ plane defined by $r_1 + a_{11} x_1 + a_{12} x_2 = 0$; similarly, the $x_2$-null-line is the line in the $(x_1, x_2)$ plane defined by $r_2 + a_{21} x_1 + a_{22} x_2 = 0$.
Clearly, the $x_1$-null-line (respectively the $x_2$-null-line) passes through the equilibrium point $x_1^*$ (respectively $x_2^*$).

18.2.1 Mutualism
Here we assume inter-species mutualism, that is, we assume both inter-species coefficients a12 and a21
are positive. We identify two distinct parameter ranges corresponding to distinct dynamic behavior and
illustrate them in Figure 18.3.

[Figure omitted: two phase portraits with null-lines]
Case I: $a_{12} > 0$, $a_{21} > 0$, $a_{12} a_{21} < a_{11} a_{22}$. There exists a unique positive equilibrium point. All trajectories starting in $\mathbb{R}^2_{>0}$ converge to the equilibrium point. Case II: $a_{12} > 0$, $a_{21} > 0$, $a_{12} a_{21} > a_{11} a_{22}$. There exists no positive equilibrium point. All trajectories starting in $\mathbb{R}^2_{>0}$ diverge.

Figure 18.3: Two possible cases of mutualism in the two-species Lotka-Volterra system


18.2.2 Competition
Here we assume inter-species competition, that is, we assume both inter-species coefficients a12 and a21
are negative. We identify four distinct parameter ranges corresponding to distinct dynamic behavior and
illustrate them in Figure 18.4.

[Figure omitted: four phase portraits with null-lines]
Case III: $a_{12} < 0$, $a_{21} < 0$, $r_2/|a_{22}| < r_1/|a_{12}|$, and $r_1/|a_{11}| < r_2/|a_{21}|$. There exists a unique positive equilibrium, which attracts all trajectories starting in $\mathbb{R}^2_{>0}$. Case IV: $a_{12} < 0$, $a_{21} < 0$, $r_1/|a_{12}| < r_2/|a_{22}|$, and $r_2/|a_{21}| < r_1/|a_{11}|$. The equilibrium in $\mathbb{R}^2_{>0}$ is unstable; all trajectories (except the equilibrium solution) converge either to the equilibrium $(-r_1/a_{11}, 0)$ or to the equilibrium $(0, -r_2/a_{22})$. Case V: $a_{12} < 0$, $a_{21} < 0$, $r_2/|a_{22}| < r_1/|a_{12}|$, and $r_2/|a_{21}| < r_1/|a_{11}|$. There exists no equilibrium in $\mathbb{R}^2_{>0}$. All trajectories starting in $\mathbb{R}^2_{>0}$ converge to the equilibrium $(-r_1/a_{11}, 0)$. Case VI: $a_{12} < 0$, $a_{21} < 0$, $r_1/|a_{12}| < r_2/|a_{22}|$, and $r_1/|a_{11}| < r_2/|a_{21}|$. There exists no equilibrium in $\mathbb{R}^2_{>0}$. All trajectories starting in $\mathbb{R}^2_{>0}$ converge to the equilibrium $(0, -r_2/a_{22})$.

Figure 18.4: Four possible cases of competition in the two-species Lotka-Volterra system
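The case conditions are easy to check and confirm in simulation. A minimal sketch for Case III (assuming NumPy; all parameter values are illustrative choices):

```python
import numpy as np

# Sketch: verify the Case III conditions for an illustrative competitive
# system (18.4) and confirm convergence to the positive equilibrium.
r1, r2 = 1.0, 1.0
a11, a12, a21, a22 = -1.0, -0.5, -0.5, -1.0   # competition: a12, a21 < 0

# Case III: r2/|a22| < r1/|a12| and r1/|a11| < r2/|a21|
print(r2 / abs(a22) < r1 / abs(a12), r1 / abs(a11) < r2 / abs(a21))

x = np.array([0.1, 1.5])
for _ in range(100_000):                      # forward Euler up to t = 100
    x1, x2 = x
    x = x + 1e-3 * np.array([x1 * (r1 + a11 * x1 + a12 * x2),
                             x2 * (r2 + a21 * x1 + a22 * x2)])
A = np.array([[a11, a12], [a21, a22]])
print(x, -np.linalg.solve(A, [r1, r2]))       # both close to (2/3, 2/3)
```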


18.3 General results for Lotka-Volterra models


We have seen some variety of behavior in the two-species Lotka-Volterra model (18.4). Much richer dynamical behavior is possible in the Lotka-Volterra model (18.3), including persistence, extinction, equilibria, periodic orbits, and chaotic evolution. We present some analysis results in this and the next section and refer the interested reader to (Takeuchi 1996; Baigent 2010) for more details.

Lemma 18.1 (Basic properties). For $n \ge 2$, the Lotka-Volterra system (18.3) is a positive system, i.e., $x(0) \ge 0$ implies $x(t) \ge 0$ for all subsequent times $t$. Moreover, if $x_i(0) = 0$, then $x_i(t) = 0$ for all subsequent times $t$.

Therefore, without loss of generality we can assume that all initial conditions are positive vectors, that is, in $\mathbb{R}^n_{>0}$. In other words, the best we can hope for is to establish that an equilibrium point is globally asymptotically stable on $\mathbb{R}^n_{>0}$. We are now ready to state the main result of this section, due to (Goh 1979).

Theorem 18.2 (Sufficient conditions). For the Lotka-Volterra system (18.3) with interaction matrix $A$ and intrinsic growth rate $r$, assume

(A1) $A$ is diagonally stable, i.e., there exists a positive vector $d$ such that $\operatorname{diag}(d)A + A^\top \operatorname{diag}(d)$ is negative definite, and
(A2) the unique equilibrium point $x^* = -A^{-1} r$ is positive.

Then $x^*$ is globally asymptotically stable on $\mathbb{R}^n_{>0}$.

Proof. Note that $A$ diagonally stable implies $A$ Hurwitz and invertible. For $K > 0$, define the linear-minus-logarithmic function $V_{\text{lin-log},K} : \mathbb{R}_{>0} \to \mathbb{R}$ by
$$V_{\text{lin-log},K}(x) = x - K - K \log\frac{x}{K}.$$

Define $V : \mathbb{R}^n_{>0} \to \mathbb{R}_{\ge 0}$ by
$$V(x) = \sum_{i=1}^n d_i V_{\text{lin-log},x_i^*}(x_i) = \sum_{i=1}^n d_i \bigl( x_i - x_i^* - x_i^* \log(x_i / x_i^*) \bigr).$$

The reader is invited to show in Exercise E18.2 that the function $V_{\text{lin-log},K}$ is continuously differentiable, takes nonnegative values, and satisfies $V_{\text{lin-log},K}(x_i) = 0$ if and only if $x_i = K$. Moreover, this function is unbounded in the limits as $x_i \to \infty$ and $x_i \to 0^+$. Therefore, $V$ is globally positive-definite about $x^*$ and is radially unbounded.
Next, we compute the Lie derivative of $V$ along the flow of the Lotka-Volterra vector field $f_{\text{LK}}(x) = \operatorname{diag}(x)(Ax + r)$. First, compute $\frac{d}{dx_i} V_{\text{lin-log},x_i^*}(x_i) = (x_i - x_i^*)/x_i$, so that
$$\mathcal{L}_{f_{\text{LK}}} V(x) = \sum_{i=1}^n d_i \frac{x_i - x_i^*}{x_i} \, (f_{\text{LK}}(x))_i.$$


[Plot omitted.] Figure 18.5: The function $V_{\text{lin-log},K}(x) = x - K - K \log(x/K)$ studied in Exercise E18.2

Because $A$ is invertible and $x^* = -A^{-1} r$, we write $Ax + r = A(x - x^*)$ and obtain
$$\mathcal{L}_{f_{\text{LK}}} V(x) = \sum_{i=1}^n d_i (x_i - x_i^*)\bigl(A(x - x^*)\bigr)_i = (x - x^*)^\top A^\top \operatorname{diag}(d)(x - x^*) = \frac{1}{2} (x - x^*)^\top \bigl(A^\top \operatorname{diag}(d) + \operatorname{diag}(d) A\bigr)(x - x^*).$$

This implies that $\mathcal{L}_{f_{\text{LK}}} V(x) \le 0$ with equality if and only if $x = x^*$. Therefore, $\mathcal{L}_{f_{\text{LK}}} V$ is globally negative-definite about $x^*$. According to the Lyapunov Theorem 13.3, $x^*$ is globally asymptotically stable on $\mathbb{R}^n_{>0}$. ∎

Note: Assumption (A2) is not critical and, via a more complex treatment, a more general theorem can be obtained. Under Assumption (A1) that $A$ is diagonally stable, (Takeuchi 1996, Theorem 3.2.1) shows the existence of a unique nonnegative and globally stable equilibrium point $x^*$ for each $r \in \mathbb{R}^n$; the existence and uniqueness of $x^*$ is established via a linear complementarity problem.
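
A numerical sanity check of Theorem 18.2 can be run by integrating the system and monitoring $V$ along the trajectory. In the sketch below (Python with NumPy/SciPy), the matrix $A$, rate vector $r$, weight vector $d$, and initial condition are illustrative choices of this sketch, not values from the text:

```python
# Sanity check of Theorem 18.2: for a diagonally stable A with positive
# equilibrium x* = -inv(A) r, the function V(x) = sum_i d_i V_{lin-log,x*_i}(x_i)
# is non-increasing along solutions of x' = diag(x)(A x + r).
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[-2.0, 1.0, 0.0],
              [0.5, -2.0, 0.5],
              [0.0, 1.0, -2.0]])
r = np.array([1.0, 1.0, 1.0])
d = np.ones(3)                       # certifies (A1): diag(d)A + A^T diag(d) < 0 here
xstar = np.linalg.solve(A, -r)       # equilibrium x* = -inv(A) r, here (1, 1, 1)
assert np.all(xstar > 0)             # assumption (A2)
assert np.all(np.linalg.eigvalsh(np.diag(d) @ A + A.T @ np.diag(d)) < 0)  # (A1)

def V(x):
    return float(np.sum(d * (x - xstar - xstar * np.log(x / xstar))))

sol = solve_ivp(lambda t, x: x * (r + A @ x), (0.0, 30.0), [0.1, 3.0, 0.5],
                dense_output=True, rtol=1e-9, atol=1e-12)
vals = [V(sol.sol(t)) for t in np.linspace(0.0, 30.0, 200)]
assert all(b <= a + 1e-8 for a, b in zip(vals, vals[1:]))   # V never increases
print("x(30) =", sol.y[:, -1], " vs  x* =", xstar)
```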

18.4 Cooperative Lotka-Volterra models


In this section we focus on the case of Lotka-Volterra systems with only mutualistic interactions. In other words, we consider systems whose interaction terms satisfy $a_{ij} \ge 0$ for all $i \ne j$. For such systems, whenever $i \ne j$,
$$\frac{\partial}{\partial x_j} (f_{\text{LK}})_i(x) = a_{ij} x_i \ge 0,$$
so that their Jacobian matrix is Metzler everywhere in $\mathbb{R}^n_{\ge 0}$. Such systems are called cooperative.
We recall from Section 9.2 the properties of Metzler matrices. For example, the Perron-Frobenius Theorem 9.4 for Metzler matrices establishes the existence of a dominant eigenvalue.

Lemma 18.3 (Unbounded evolutions for unstable Metzler matrices). If $A$ is Metzler and has a positive dominant eigenvalue, then the Lotka-Volterra system (18.3) has unbounded solutions in $\mathbb{R}^n_{>0}$.


If the dominant eigenvalue is negative, then the Metzler matrix is Hurwitz; this case was studied in
Theorem 9.5. We here provide a useful extension of that theorem.

Theorem 18.4 (Properties of Hurwitz Metzler matrices: continued). For a Metzler matrix $A$, the following statements are equivalent:

(i) $A$ is Hurwitz,
(ii) $A$ is invertible and $-A^{-1} \ge 0$,
(iii) for all $b \ge 0_n$, there exists $x \ge 0_n$ solving $Ax + b = 0_n$,
(iv) $A$ is negative diagonally dominant, i.e., there exists $\xi > 0_n$ such that $A\xi < 0_n$, and
(v) $A$ is diagonally stable, i.e., there exists a positive-definite diagonal matrix $P$ such that $A^\top P + P A < 0$.

Proof. The equivalence between statements (i), (ii), and (iii) is established in Theorem 9.5.

(ii) $\implies$ (iv): Set $\xi = -A^{-1} \mathbb{1}_n$. Because $-A^{-1} \ge 0$ is invertible, it can have no rows identically equal to zero. Hence $\xi = -A^{-1} \mathbb{1}_n > 0_n$. Moreover $A\xi = -\mathbb{1}_n < 0_n$.

(iv) $\implies$ (i): We follow the steps in (Baigent 2010, Lemma 6). Let $\lambda$ be an eigenvalue of $A$ with eigenvector $v$. Define $w \in \mathbb{R}^n$ by $w_i = v_i / \xi_i$, for $i \in \{1, \dots, n\}$, where $\xi$ is as in statement (iv). We therefore have $\lambda \xi_i w_i = \sum_{j=1}^n a_{ij} \xi_j w_j$. If $\ell$ is the index satisfying $|w_\ell| = \max_i |w_i| > 0$, then
$$\lambda \xi_\ell = a_{\ell\ell} \xi_\ell + \sum_{j=1, j \ne \ell}^n a_{\ell j} \xi_j \frac{w_j}{w_\ell},$$
which, in turn, implies
$$|\lambda \xi_\ell - a_{\ell\ell} \xi_\ell| \le \sum_{j=1, j \ne \ell}^n a_{\ell j} \xi_j \left|\frac{w_j}{w_\ell}\right| \le \sum_{j=1, j \ne \ell}^n a_{\ell j} \xi_j < -a_{\ell\ell} \xi_\ell,$$
where the last inequality follows from the $\ell$-th row of the inequality $A\xi < 0_n$. Therefore, $|\lambda - a_{\ell\ell}| < -a_{\ell\ell}$. This inequality implies that the eigenvalue $\lambda$ must belong to an open disc in the complex plane with center $a_{\ell\ell} < 0$ and radius $|a_{\ell\ell}|$. Hence $\lambda$, together with all other eigenvalues of $A$, must have negative real part.

(iv) $\implies$ (v): From statement (iv) applied to $A$ and $A^\top$, let $\xi > 0_n$ satisfy $A\xi < 0_n$ and $\eta > 0_n$ satisfy $A^\top \eta < 0_n$. Define $P = \operatorname{diag}(\eta_1/\xi_1, \dots, \eta_n/\xi_n)$ and consider the symmetric matrix $A^\top P + P A$. This matrix is Metzler and satisfies $(A^\top P + P A)\xi = A^\top \eta + P A \xi < 0_n$. Hence, $A^\top P + P A$ is negative diagonally dominant and, because (iv) $\implies$ (i), Hurwitz. In summary, $A^\top P + P A$ is symmetric and Hurwitz, hence it is negative definite.

(v) $\implies$ (i): This is established in Theorem 13.5. ∎
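
The implication (iv) $\implies$ (v) in the proof above is constructive and can be implemented directly. A minimal sketch (Python; the Metzler matrix below is an illustrative example, chosen negative diagonally dominant so that it is Hurwitz):

```python
# Constructive step (iv) => (v) for a Hurwitz Metzler matrix A:
# xi = -inv(A) 1 > 0 and eta = -inv(A^T) 1 > 0, then P = diag(eta_i / xi_i)
# makes A^T P + P A negative definite.
import numpy as np

A = np.array([[-3.0, 1.0, 0.5],
              [0.5, -2.0, 1.0],
              [1.0, 0.5, -3.0]])          # Metzler: off-diagonal entries nonnegative
assert np.all(np.linalg.eigvals(A).real < 0), "A must be Hurwitz"

xi = -np.linalg.solve(A, np.ones(3))      # A xi = -1_n < 0_n, hence xi > 0_n
eta = -np.linalg.solve(A.T, np.ones(3))   # A^T eta = -1_n < 0_n, hence eta > 0_n
P = np.diag(eta / xi)

Q = A.T @ P + P @ A                       # symmetric by construction
print("eigenvalues of A^T P + P A:", np.linalg.eigvalsh(Q))  # all negative
```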


Theorem 18.5 (Global convergence for cooperative Lotka-Volterra). For the Lotka-Volterra system (18.3), assume

(A3) the interaction matrix A is Metzler and Hurwitz, and
(A4) the intrinsic growth parameter r is positive.

Then there exists a unique interior equilibrium point $x^*$, and $x^*$ is globally attractive on $\mathbb{R}^n_{>0}$.

Proof. We leave it to the reader to verify that Assumptions (A1) and (A2) of Theorem 18.2 are satisfied, so that its consequences hold. ∎
Note: In (Baigent 2010), Theorem 18.5 is established via the Lyapunov function $V(x) = \max_{i \in \{1, \dots, n\}} |x_i - x_i^*| / \xi_i$, where $x^*$ is the equilibrium point and $\operatorname{diag}(\xi_1, \dots, \xi_n)$ is the diagonal Lyapunov matrix for $A$.
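
A brief numerical illustration of Theorem 18.5 (Python; the matrix, growth rates, and initial condition are illustrative choices of this sketch):

```python
# Theorem 18.5 in action: A Metzler and Hurwitz, r > 0; the interior
# equilibrium x* = -inv(A) r attracts every trajectory starting in R^n_{>0}.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[-3.0, 1.0, 0.5],
              [0.5, -2.0, 1.0],
              [1.0, 0.5, -3.0]])
r = np.array([1.0, 0.5, 1.0])
assert np.all(A[~np.eye(3, dtype=bool)] >= 0)   # Metzler: off-diagonal >= 0
assert np.all(np.linalg.eigvals(A).real < 0)    # Hurwitz
xstar = -np.linalg.solve(A, r)                  # interior equilibrium, positive

sol = solve_ivp(lambda t, x: x * (r + A @ x), (0.0, 40.0), [5.0, 0.01, 2.0],
                rtol=1e-9)
print("x* =", xstar, " x(40) =", sol.y[:, -1])  # the two should agree
```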


18.5 Exercises
E18.1 Logistic ordinary differential equation. For $r > 0$ and $K > 0$, consider the logistic equation (18.1) defined by
$$\dot x = r x (1 - x/K),$$
for $x \in \mathbb{R}_{\ge 0}$. Show that
(i) there are two equilibrium points $0$ and $K$,
(ii) the solution is
$$x(t) = \frac{K x(0) e^{rt}}{K + x(0)(e^{rt} - 1)},$$
(iii) all solutions with $0 < x(0) < K$ are monotonically increasing and converge asymptotically to $K$,
(iv) all solutions with $K < x(0)$ are monotonically decreasing and converge asymptotically to $K$, and
(v) if $x(0) < K/2$, then the solution $x(t)$ has an inflection point when $x(t) = K/2$.
(A numerical check of the closed-form solution in item (ii) is sketched below.)
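
The following sketch (Python with SciPy; $r$, $K$, and $x(0)$ are arbitrary test values) compares the closed-form solution in item (ii) against direct numerical integration:

```python
# Check the closed-form logistic solution of E18.1 against numerical integration.
import numpy as np
from scipy.integrate import solve_ivp

r, K, x0 = 0.8, 10.0, 1.5

def exact(t):
    return K * x0 * np.exp(r * t) / (K + x0 * (np.exp(r * t) - 1.0))

sol = solve_ivp(lambda t, x: r * x * (1.0 - x / K), (0.0, 10.0), [x0],
                t_eval=np.linspace(0.0, 10.0, 50), rtol=1e-10, atol=1e-12)
print("max deviation:", np.max(np.abs(sol.y[0] - exact(sol.t))))  # ~1e-8 or smaller
```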
E18.2 The linear-minus-logarithmic function. For $K > 0$, define the function $V_{\text{lin-log},K} : \mathbb{R}_{>0} \to \mathbb{R}$ by
$$V_{\text{lin-log},K}(x) = x - K - K \log\frac{x}{K}.$$
Show that
(i) $V_{\text{lin-log},K}$ is continuously differentiable and $\frac{d}{dx} V_{\text{lin-log},K}(x) = (x - K)/x$,
(ii) $V_{\text{lin-log},K}(x) = 0$ if and only if $x = K$,
(iii) $V_{\text{lin-log},K}(x) > 0$ for all $x > 0$, $x \ne K$, and
(iv) $\lim_{x \to 0^+} V_{\text{lin-log},K}(x) = \lim_{x \to \infty} V_{\text{lin-log},K}(x) = +\infty$.
(A numeric spot-check of these properties is sketched below.)
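
A numeric spot-check of these properties (Python; $K = 2$ is an arbitrary choice of this sketch):

```python
# Spot-check the properties of V_{lin-log,K} claimed in E18.2.
import numpy as np

K = 2.0

def V(x):
    return x - K - K * np.log(x / K)

def dV(x):
    return (x - K) / x

xs = np.linspace(0.01, 20.0, 2001)
assert np.all(V(xs) >= -1e-12) and np.isclose(V(K), 0.0)          # items (ii), (iii)
assert np.all(dV(xs[xs < K]) < 0) and np.all(dV(xs[xs > K]) > 0)  # V decreases, then increases
print("V near 0+:", V(1e-8), "  V at large x:", V(1e8))           # item (iv): both diverge
```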

Bibliography

J. A. Acebrón, L. L. Bonilla, C. J. P. Vicente, F. Ritort, and R. Spigler. The Kuramoto model: A simple paradigm for synchronization phenomena. Reviews of Modern Physics, 77(1):137–185, 2005.

D. Acemoglu and A. Ozdaglar. Opinion dynamics and learning in social networks. Dynamic Games and Applications, 1(1):3–49, 2011.

D. Acemoglu, G. Como, F. Fagnani, and A. Ozdaglar. Opinion fluctuations and disagreement in social networks. Mathematics of Operations Research, 38(1):1–27, 2013.

R. P. Agaev and P. Y. Chebotarev. The matrix of maximum out forests and its applications. Automation and Remote Control, 61(9):1424–1450, 2000.

B. D. O. Anderson, C. Yu, B. Fidan, and J. M. Hendrickx. Rigid graph control architectures for autonomous formations. IEEE Control Systems Magazine, 28(6):48–63, 2008.

M. Arcak. Passivity as a design tool for group coordination. IEEE Transactions on Automatic Control, 52(8):1380–1390, 2007.

A. Arenas, A. Díaz-Guilera, J. Kurths, Y. Moreno, and C. Zhou. Synchronization in complex networks. Physics Reports, 469(3):93–153, 2008.

L. Asimow and B. Roth. The rigidity of graphs, II. Journal of Mathematical Analysis and Applications, 68(1):171–190, 1979.

H. Bai, M. Arcak, and J. Wen. Cooperative Control Design, volume 89. Springer, 2011. ISBN 1461429072.

S. Baigent. Lotka-Volterra Dynamics: An Introduction. Preprint, Mar. 2010. University College London.

B. Bamieh, M. R. Jovanovic, P. Mitra, and S. Patterson. Coherence in large-scale networks: Dimension-dependent limitations of local feedback. IEEE Transactions on Automatic Control, 57(9):2235–2249, 2012.

P. Barooah. Estimation and Control with Relative Measurements: Algorithms and Scaling Laws. PhD thesis, University of California at Santa Barbara, July 2007.


P. Barooah and J. P. Hespanha. Estimation from relative measurements: Algorithms and scaling laws. IEEE Control Systems Magazine, 27(4):57–74, 2007.

D. Bauso and G. Notarstefano. Distributed n-player approachability and consensus in coalitional games. IEEE Transactions on Automatic Control, 60(11):3107–3112, 2015.

M. Benzi, G. H. Golub, and J. Liesen. Numerical solution of saddle point problems. Acta Numerica, 14:1–137, 2005.

A. R. Bergen and D. J. Hill. A structure preserving model for power system stability analysis. IEEE Transactions on Power Apparatus and Systems, 100(1):25–35, 1981.

A. Berman and R. J. Plemmons. Nonnegative Matrices in the Mathematical Sciences. SIAM, 1994. ISBN 978-0-89871-321-3.

D. S. Bernstein. Matrix Mathematics. Princeton University Press, 2 edition, 2009. ISBN 0691140391.

N. Biggs. Algebraic Graph Theory. Cambridge University Press, 2 edition, 1994. ISBN 0521458978.

V. D. Blondel and A. Olshevsky. How to decide consensus? A combinatorial necessary and sufficient condition and a proof that consensus is decidable but NP-hard. SIAM Journal on Control and Optimization, 52(5):2707–2726, 2014.

B. Bollobás. Modern Graph Theory. Springer, 1998. ISBN 0387984887.

S. Bolognani, S. Del Favero, L. Schenato, and D. Varagnolo. Consensus-based distributed sensor calibration and least-square parameter identification in WSNs. International Journal of Robust and Nonlinear Control, 20(2):176–193, 2010.

P. Bonacich. Technique for analyzing overlapping memberships. Sociological Methodology, 4:176–185, 1972a.

P. Bonacich. Factoring and weighting approaches to status scores and clique identification. Journal of Mathematical Sociology, 2(1):113–120, 1972b.

S. P. Borgatti and M. G. Everett. A graph-theoretic perspective on centrality. Social Networks, 28(4):466–484, 2006.

S. Boyd, P. Diaconis, and L. Xiao. Fastest mixing Markov chain on a graph. SIAM Review, 46(4):667–689, 2004.

S. Boyd, A. Ghosh, B. Prabhakar, and D. Shah. Randomized gossip algorithms. IEEE Transactions on Information Theory, 52(6):2508–2530, 2006.

U. Brandes. Centrality: concepts and methods. Slides, May 2006. The International Workshop/School and Conference on Network Science.

U. Brandes and T. Erlebach. Network Analysis: Methodological Foundations. Springer, 2005. ISBN 3540249796.

L. Breiman. Probability, volume 7 of Classics in Applied Mathematics. SIAM, 1992. ISBN 0-89871-296-3. Corrected reprint of the 1968 original.


S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search engine. Computer Networks, 30:107–117, 1998.

E. Brown, P. Holmes, and J. Moehlis. Globally coupled oscillator networks. In E. Kaplan, J. E. Marsden, and K. R. Sreenivasan, editors, Perspectives and Problems in Nonlinear Science: A Celebratory Volume in Honor of Larry Sirovich, pages 183–215. Springer, 2003.

A. M. Bruckstein, N. Cohen, and A. Efrat. Ants, crickets, and frogs in cyclic pursuit. Technical Report CIS 9105, Center for Intelligent Systems, Technion, Haifa, Israel, July 1991. Available at http://www.cs.technion.ac.il/tech-reports.

J. Buck. Synchronous rhythmic flashing of fireflies. II. Quarterly Review of Biology, 63(3):265–289, 1988.

F. Bullo, J. Cortés, and S. Martínez. Distributed Control of Robotic Networks. Princeton University Press, 2009. ISBN 978-0-691-14195-4. URL http://www.coordinationbook.info.

F. Bullo, W. Mei, S. Mohagheghib, and S. Zampieri. Nonlinear propagation models in contact networks. To be submitted, 2016.

M. Cao, A. S. Morse, and B. D. O. Anderson. Agreeing asynchronously. IEEE Transactions on Automatic Control, 53(8):1826–1838, 2008.

R. Carli, F. Fagnani, A. Speranzon, and S. Zampieri. Communication constraints in the average consensus problem. Automatica, 44(3):671–684, 2008.

R. Carli, F. Garin, and S. Zampieri. Quadratic indices for the analysis of consensus algorithms. In Information Theory and Applications Workshop, pages 96–104, San Diego, CA, USA, Feb. 2009.

H. Caswell. Matrix Population Models. Sinauer Associates, 2 edition, 2006. ISBN 087893121X.

N. D. Charkes, P. T. Makler Jr, and C. Philips. Studies of skeletal tracer kinetics. I. Digital-computer solution of a five-compartment model of [18F] fluoride kinetics in humans. Journal of Nuclear Medicine, 19(12):1301–1309, 1978.

S. Chatterjee and E. Seneta. Towards consensus: Some convergence theorems on repeated averaging. Journal of Applied Probability, 14(1):89–97, 1977.

A. Cherukuri and J. Cortés. Asymptotic stability of saddle points under the saddle-point dynamics. In American Control Conference, Chicago, IL, USA, July 2015. To appear.

N. Chopra and M. W. Spong. On exponential synchronization of Kuramoto oscillators. IEEE Transactions on Automatic Control, 54(2):353–357, 2009.

R. Cogburn. The ergodic theory of Markov chains in random environments. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 66(1):109–128, 1984.

G. Como, K. Savla, D. Acemoglu, M. A. Dahleh, and E. Frazzoli. Robust distributed routing in dynamical networks – Part I: locally responsive policies and weak resilience. IEEE Transactions on Automatic Control, 58(2):317–332, 2013.


S. Coogan and M. Arcak. A compartmental model for traffic networks and its dynamical behavior. IEEE Transactions on Automatic Control, 60(10):2698–2703, 2015.

E. Cristiani, B. Piccoli, and A. Tosin. Multiscale Modeling of Pedestrian Dynamics. Springer, 2014. ISBN 978-3-319-06619-6.

S. M. Crook, G. B. Ermentrout, M. C. Vanier, and J. M. Bower. The role of axonal delay in the synchronization of networks of coupled cortical oscillators. Journal of Computational Neuroscience, 4(2):161–172, 1997.

H. Daido. Quasientrainment and slow relaxation in a population of oscillators with random and frustrated interactions. Physical Review Letters, 68(7):1073–1076, 1992.

P. J. Davis. Circulant Matrices. American Mathematical Society, 2 edition, 1994. ISBN 0828403384.

T. A. Davis and Y. Hu. The University of Florida sparse matrix collection. ACM Transactions on Mathematical Software, 38(1):1–25, 2011.

M. H. DeGroot. Reaching a consensus. Journal of the American Statistical Association, 69(345):118–121, 1974.

P. M. DeMarzo, D. Vayanos, and J. Zwiebel. Persuasion bias, social influence, and unidimensional opinions. The Quarterly Journal of Economics, 118(3):909–968, 2003.

R. Diestel. Graph Theory, volume 173 of Graduate Texts in Mathematics. Springer, 2 edition, 2000. ISBN 3642142788.

F. Dörfler and F. Bullo. On the critical coupling for Kuramoto oscillators. SIAM Journal on Applied Dynamical Systems, 10(3):1070–1099, 2011. doi: 10.1137/10081530X.

F. Dörfler and F. Bullo. Exploring synchronization in complex oscillator networks, Sept. 2012. Extended version including proofs. Available at http://arxiv.org/abs/1209.1335.

F. Dörfler and F. Bullo. Synchronization in complex networks of phase oscillators: A survey. Automatica, 50(6):1539–1564, 2014. doi: 10.1016/j.automatica.2014.04.012.

F. Dörfler and B. Francis. Formation control of autonomous robots based on cooperative behavior. In European Control Conference, pages 2432–2437, Budapest, Hungary, Aug. 2009.

F. Dörfler and B. Francis. Geometric analysis of the formation problem for autonomous robots. IEEE Transactions on Automatic Control, 55(10):2379–2384, 2010.

G. Droge, H. Kawashima, and M. Egerstedt. Proportional-integral distributed optimization for networked systems. arXiv preprint arXiv:1309.6613, 2013.

C. L. DuBois. UCI Network Data Repository, 2008. URL http://networkdata.ics.uci.edu.

F. Fagnani. Consensus dynamics over networks. Winter School on Complex Networks, INRIA, Jan. 2014.

F. Fagnani and S. Zampieri. Randomized consensus algorithms over large scale networks. IEEE Journal on Selected Areas in Communications, 26(4):634–649, 2008.


A. Fall, A. Iggidr, G. Sallet, and J.-J. Tewa. Epidemiological models and Lyapunov functions. Mathematical Modelling of Natural Phenomena, 2(1):62–68, 2007.

L. Farina and S. Rinaldi. Positive Linear Systems: Theory and Applications. John Wiley & Sons, 2000. ISBN 0471384569.

M. Fiedler. Algebraic connectivity of graphs. Czechoslovak Mathematical Journal, 23(2):298–305, 1973.

D. Fife. Which linear compartmental systems contain traps? Mathematical Biosciences, 14(3):311–315, 1972.

D. M. Foster and J. A. Jacquez. Multiple zeros for eigenvalues and the multiplicity of traps of a linear compartmental system. Mathematical Biosciences, 26(1):89–97, 1975.

L. R. Foulds. Graph Theory Applications. Universitext. Springer, 1995. ISBN 0387975993.

B. A. Francis and M. Maggiore. Flocking and Rendezvous in Distributed Robotics. Springer, 2016. ISBN 978-3-319-24727-4.

P. Frasca. Quick convergence proof for gossip consensus. Personal communication, 2012.

P. Frasca, R. Carli, F. Fagnani, and S. Zampieri. Average consensus on networks with quantized communication. International Journal of Robust and Nonlinear Control, 19(16):1787–1816, 2009.

J. R. P. French. A formal theory of social power. Psychological Review, 63(3):181–194, 1956.

N. E. Friedkin. Theoretical foundations for centrality measures. American Journal of Sociology, 96(6):1478–1504, 1991.

N. E. Friedkin and E. C. Johnsen. Social influence networks and opinion change. In E. J. Lawler and M. W. Macy, editors, Advances in Group Processes, volume 16, pages 1–29. JAI Press, 1999.

N. E. Friedkin and E. C. Johnsen. Social Influence Network Theory: A Sociological Examination of Small Group Dynamics. Cambridge University Press, 2011. ISBN 9781107002463.

N. E. Friedkin and E. C. Johnsen. Two steps to obfuscation. Social Networks, 39:12–13, 2014.

P. A. Fuhrmann and U. Helmke. The Mathematics of Networks of Linear Systems. Springer, 2015. ISBN 3319166468.

C. Gao, J. Cortés, and F. Bullo. Notes on averaging over acyclic digraphs and discrete coverage control. Automatica, 44(8):2120–2127, 2008. doi: 10.1016/j.automatica.2007.12.017.

F. Garin and L. Schenato. A survey on distributed estimation and control applications using linear consensus algorithms. In A. Bemporad, M. Heemels, and M. Johansson, editors, Networked Control Systems, LNCIS, pages 75–107. Springer, 2010.

B. Gharesifard and J. Cortés. Distributed continuous-time convex optimization on weight-balanced digraphs. IEEE Transactions on Automatic Control, 59(3):781–786, 2014.


A. K. Ghosh, B. Chance, and E. K. Pye. Metabolic coupling and synchronization of NADH oscillations in yeast cell populations. Archives of Biochemistry and Biophysics, 145(1):319–331, 1971.

D. Gleich. Spectral Graph Partitioning and the Laplacian with Matlab, Jan. 2006. URL https://www.cs.purdue.edu/homes/dgleich/demos/matlab/spectral/spectral.html. (Last retrieved on May 30, 2016.)

D. F. Gleich. PageRank beyond the Web. SIAM Review, 57(3):321–363, 2015.

C. D. Godsil and G. F. Royle. Algebraic Graph Theory, volume 207 of Graduate Texts in Mathematics. Springer, 2001. ISBN 0387952411.

B. S. Goh. Global stability in two species interactions. Journal of Mathematical Biology, 3(3-4):313–318, 1976.

B. S. Goh. Stability in models of mutualism. American Naturalist, pages 261–275, 1979.

B.-S. Goh. Management and Analysis of Biological Populations. Elsevier, 1980. ISBN 978-0-444-41793-0.

M. Grant and S. Boyd. CVX: Matlab software for disciplined convex programming, version 2.1. http://cvxr.com/cvx, Oct. 2014.

W. H. Haddad, V. Chellaboina, and Q. Hui. Nonnegative and Compartmental Dynamical Systems. Princeton University Press, 2010. ISBN 0691144117.

F. Harary. A criterion for unanimity in French's theory of social power. In D. Cartwright, editor, Studies in Social Power, pages 168–182. University of Michigan, 1959.

T. Hatanaka, Y. Igarashi, M. Fujita, and M. W. Spong. Passivity-based pose synchronization in three dimensions. IEEE Transactions on Automatic Control, 57(2):360–375, 2012.

Y. Hatano and M. Mesbahi. Agreement over random networks. IEEE Transactions on Automatic Control, 50(11):1867–1872, 2005.

J. M. Hendrickx. Graphs and Networks for the Analysis of Autonomous Agent Systems. PhD thesis, Université Catholique de Louvain, Belgium, Feb. 2008.

J. M. Hendrickx and J. N. Tsitsiklis. Convergence of type-symmetric and cut-balanced consensus seeking systems. IEEE Transactions on Automatic Control, 58(1):214–218, 2013.

J. P. Hespanha. Linear Systems Theory. Princeton University Press, 2009. ISBN 0691140219.

H. W. Hethcote. The mathematics of infectious diseases. SIAM Review, 42(4):599–653, 2000.

J. Hofbauer and K. Sigmund. Evolutionary Games and Population Dynamics. Cambridge University Press, 1998. ISBN 052162570X.

L. Hogben, editor. Handbook of Linear Algebra. Chapman and Hall/CRC, 2 edition, 2013. ISBN 1466507284.


F. C. Hoppensteadt and E. M. Izhikevich. Synchronization of laser oscillators, associative memory, and optical neurocomputing. Physical Review E, 62(3):4010–4013, 2000.

R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, 1985. ISBN 0521386322.

Y. Hu. Efficient, high-quality force-directed graph drawing. Mathematica Journal, 10(1):37–71, 2005.

C. Huygens. Horologium Oscillatorium. Paris, France, 1673.

H. Ishii and R. Tempo. The PageRank problem, multiagent consensus, and web aggregation: A systems and control viewpoint. IEEE Control Systems Magazine, 34(3):34–53, 2014.

M. O. Jackson. Social and Economic Networks. Princeton University Press, 2010. ISBN 0691148201.

J. A. Jacquez and C. P. Simon. Qualitative theory of compartmental systems. SIAM Review, 35(1):43–79, 1993.

A. Jadbabaie and A. Olshevsky. On performance of consensus protocols subject to noise: role of hitting times and network structure. arXiv preprint arXiv:1508.00036, 2015.

A. Jadbabaie, J. Lin, and A. S. Morse. Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Transactions on Automatic Control, 48(6):988–1001, 2003.

G. Jongen, J. Anemüller, D. Bollé, A. C. C. Coolen, and C. Perez-Vicente. Coupled dynamics of fast spins and slow exchange interactions in the XY spin glass. Journal of Physics A: Mathematical and General, 34(19):3957–3984, 2001.

A. Kashyap, T. Başar, and R. Srikant. Quantized consensus. Automatica, 43(7):1192–1203, 2007.

L. Katz. A new status index derived from sociometric analysis. Psychometrika, 18(1):39–43, 1953.

H. K. Khalil. Nonlinear Systems. Prentice Hall, 3 edition, 2002. ISBN 0130673897.

A. Khanafer, T. Başar, and B. Gharesifard. Stability properties of infected networks with low curing rates. In American Control Conference, pages 3579–3584, Portland, OR, USA, June 2014.

G. Kirchhoff. Über die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Verteilung galvanischer Ströme geführt wird. Annalen der Physik und Chemie, 148(12):497–508, 1847.

I. Z. Kiss, Y. Zhai, and J. L. Hudson. Emerging coherence in a population of chemical oscillators. Science, 296(5573):1676–1678, 2002.

M. S. Klamkin and D. J. Newman. Cyclic pursuit or "the three bugs problem". American Mathematical Monthly, 78(6):631–639, 1971.

D. J. Klein, P. Lee, K. A. Morgansen, and T. Javidi. Integration of communication and control using discrete time Kuramoto models for multivehicle coordination over broadcast networks. IEEE Journal on Selected Areas in Communications, 26(4):695–705, 2008.


J. M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5):604–632, 1999. doi: 10.1145/324133.324140.

G. Kozyreff, A. G. Vladimirov, and P. Mandel. Global coupling with time delay in an array of semiconductor lasers. Physical Review Letters, 85(18):3809–3812, 2000.

D. Krackhardt. Cognitive social structures. Social Networks, 9(2):109–134, 1987.

L. Krick, M. E. Broucke, and B. Francis. Stabilization of infinitesimally rigid formations of multi-robot networks. International Journal of Control, 82(3):423–439, 2009.

J. Kunegis. KONECT: the Koblenz network collection. In International Conference on World Wide Web Companion, pages 1343–1350, 2013.

Y. Kuramoto. Self-entrainment of a population of coupled non-linear oscillators. In H. Araki, editor, Int. Symposium on Mathematical Problems in Theoretical Physics, volume 39 of Lecture Notes in Physics, pages 420–422. Springer, 1975. ISBN 978-3-540-07174-7.

Y. Kuramoto. Chemical Oscillations, Waves, and Turbulence. Springer, 1984. ISBN 0387133224.

A. Lajmanovich and J. A. Yorke. A deterministic model for gonorrhea in a nonhomogeneous population. Mathematical Biosciences, 28(3):221–236, 1976.

P. H. Leslie. On the use of matrices in certain population mathematics. Biometrika, 33(3):183–212, 1945.

Z. Lin, B. Francis, and M. Maggiore. Necessary and sufficient graphical conditions for formation control of unicycles. IEEE Transactions on Automatic Control, 50(1):121–127, 2005.

Z. Lin, B. Francis, and M. Maggiore. State agreement for continuous-time coupled nonlinear systems. SIAM Journal on Control and Optimization, 46(1):288–307, 2007.

C. Liu, D. R. Weaver, S. H. Strogatz, and S. M. Reppert. Cellular construction of a circadian clock: period determination in the suprachiasmatic nuclei. Cell, 91(6):855–860, 1997.

S. Łojasiewicz. Sur les trajectoires du gradient d'une fonction analytique. Seminari di Geometria 1982-1983, pages 115–117, 1984. Istituto di Geometria, Dipartimento di Matematica, Università di Bologna, Italy.

A. J. Lotka. Analytical note on certain rhythmic relations in organic systems. Proceedings of the National Academy of Sciences, 6(7):410–415, 1920.

E. Lovisari, F. Garin, and S. Zampieri. Resistance-based performance analysis of the consensus algorithm over geometric graphs. SIAM Journal on Control and Optimization, 51(5):3918–3945, 2013.

D. G. Luenberger. Introduction to Dynamic Systems: Theory, Models, and Applications. John Wiley & Sons, 1979. ISBN 0471025941.

D. G. Luenberger. Linear and Nonlinear Programming. Addison-Wesley, 2 edition, 1984.

E. Mallada, R. A. Freeman, and A. K. Tang. Distributed synchronization of heterogeneous oscillators on networks with arbitrary topology. IEEE Transactions on Control of Network Systems, 3(1):1–12, 2016.


J. R. Marden, G. Arslan, and J. S. Shamma. Joint strategy fictitious play with inertia for potential games. IEEE Transactions on Automatic Control, 54(2):208–220, 2009.

J. A. Marshall, M. E. Broucke, and B. A. Francis. Formations of vehicles in cyclic pursuit. IEEE Transactions on Automatic Control, 49(11):1963–1974, 2004.

A. Mauroy, P. Sacré, and R. J. Sepulchre. Kick synchronization versus diffusive synchronization. In IEEE Conf. on Decision and Control, pages 7171–7183, Maui, HI, USA, Dec. 2012.

W. Mei and F. Bullo. Modeling and analysis of competitive propagation with social conversion. In IEEE Conf. on Decision and Control, pages 6203–6208, Los Angeles, CA, USA, Dec. 2014.

R. Merris. Laplacian matrices of a graph: A survey. Linear Algebra and its Applications, 197:143–176, 1994.

M. Mesbahi and M. Egerstedt. Graph Theoretic Methods in Multiagent Networks. Princeton University Press, 2010. ISBN 9781400835355.

C. D. Meyer. Matrix Analysis and Applied Linear Algebra. SIAM, 2001. ISBN 0898714540.

D. C. Michaels, E. P. Matyas, and J. Jalife. Mechanisms of sinoatrial pacemaker synchronization: a new hypothesis. Circulation Research, 61(5):704–714, 1987.

B. Mohar. The Laplacian spectrum of graphs. In Y. Alavi, G. Chartrand, O. R. Oellermann, and A. J. Schwenk, editors, Graph Theory, Combinatorics, and Applications, volume 2, pages 871–898. John Wiley & Sons, 1991. ISBN 0471532452.

L. Moreau. Stability of continuous-time distributed consensus algorithms. In IEEE Conf. on Decision and Control, pages 3998–4003, Nassau, Bahamas, 2004.

L. Moreau. Stability of multiagent systems with time-dependent communication links. IEEE Transactions on Automatic Control, 50(2):169–182, 2005.

Z. Néda, E. Ravasz, T. Vicsek, Y. Brechet, and A.-L. Barabási. Physics of the rhythmic applause. Physical Review E, 61(6):6987–6992, 2000.

A. Nedić, A. Olshevsky, A. Ozdaglar, and J. N. Tsitsiklis. On distributed averaging algorithms and quantization effects. IEEE Transactions on Automatic Control, 54(11):2506–2517, 2009.

M. E. J. Newman. Networks: An Introduction. Oxford University Press, 2010. ISBN 0199206651.

C. Nowzari, V. M. Preciado, and G. J. Pappas. Analysis and control of epidemics: A survey of spreading processes on complex networks. IEEE Control Systems Magazine, 36(1):26–46, 2016.

I. Noy-Meir. Desert ecosystems. I. Environment and producers. Annual Review of Ecology and Systematics, pages 25–51, 1973.

E. P. Odum. Fundamentals of Ecology. Saunders Company, 1959.

K.-K. Oh, M.-C. Park, and H.-S. Ahn. A survey of multi-agent formation control. Automatica, 53:424–440, 2015.


R. Olfati-Saber. Flocking for multi-agent dynamic systems: Algorithms and theory. IEEE Transactions on Automatic Control, 51(3):401–420, 2006.

R. Olfati-Saber, E. Franco, E. Frazzoli, and J. S. Shamma. Belief consensus and distributed hypothesis testing in sensor networks. In P. J. Antsaklis and P. Tabuada, editors, Network Embedded Sensing and Control (Proceedings of NESC'05 Workshop), Lecture Notes in Control and Information Sciences, pages 169–182. Springer, 2006. ISBN 3540327940.

A. Olshevsky and J. N. Tsitsiklis. On the nonexistence of quadratic Lyapunov functions for consensus algorithms. IEEE Transactions on Automatic Control, 53(11):2642–2645, 2008.

R. W. Owens. An algorithm to solve the Frobenius problem. Mathematics Magazine, 76(4):264–275, 2003.

L. Page. Method for node ranking in a linked database, Sept. 2001. US Patent 6,285,999.

D. A. Paley, N. E. Leonard, R. Sepulchre, D. Grunbaum, and J. K. Parrish. Oscillator models and collective motion. IEEE Control Systems Magazine, 27(4):89–105, 2007.

J. Pantaleone. Stability of incoherence in an isotropic gas of oscillating neutrinos. Physical Review D, 58(7):073002, 1998.

G. Piovan, I. Shames, B. Fidan, F. Bullo, and B. D. O. Anderson. On frame and orientation localization for relative sensing networks. Automatica, 49(1):206–213, 2013. doi: 10.1016/j.automatica.2012.09.014.

V. H. Poor. An Introduction to Signal Detection and Estimation. Springer, 2 edition, 1998. ISBN 0387941738.

R. Potrie and P. Monzón. Local implications of almost global stability. Dynamical Systems, 24(1):109–115, 2009.

V. Rakočević. On continuity of the Moore-Penrose and Drazin inverses. Matematički Vesnik, 49(3-4):163–172, 1997.

B. S. Y. Rao and H. F. Durrant-Whyte. A decentralized Bayesian algorithm for identification of tracked targets. IEEE Transactions on Systems, Man & Cybernetics, 23(6):1683–1698, 1993.

C. Ravazzi, P. Frasca, R. Tempo, and H. Ishii. Ergodic randomized algorithms and dynamics over networks. IEEE Transactions on Control of Network Systems, 2(1):78–87, 2015.

W. Ren. On consensus algorithms for double-integrator dynamics. IEEE Transactions on Automatic Control, 53(6):1503–1509, 2008a.

W. Ren. Synchronization of coupled harmonic oscillators with local interaction. Automatica, 44:3196–3200, 2008b.

W. Ren and W. Atkins. Second-order consensus protocols in multiple vehicle systems with local interactions. In AIAA Guidance, Navigation, and Control Conference and Exhibit, pages 15–18, San Francisco, CA, USA, Aug. 2005.


W. Ren and R. W. Beard. Consensus seeking in multi-agent systems under dynamically changing interaction topologies. IEEE Transactions on Automatic Control, 50(5):655–661, 2005.

W. Ren and R. W. Beard. Distributed Consensus in Multi-vehicle Cooperative Control. Communications and Control Engineering. Springer, 2008. ISBN 978-1-84800-014-8.

W. Ren, R. W. Beard, and E. M. Atkins. Information consensus in multivehicle cooperative control: Collective group behavior through local interaction. IEEE Control Systems Magazine, 27(2):71–82, 2007.

W. H. Sandholm. Population Games and Evolutionary Dynamics. MIT Press, 2010. ISBN 0262195879.

P. Santesso and M. E. Valcher. On the zero pattern properties and asymptotic behavior of continuous-time positive system trajectories. Linear Algebra and its Applications, 425(2):283–302, 2007.

L. Schenato and F. Fiorentin. Average TimeSynch: A consensus-based protocol for clock synchronization in wireless sensor networks. Automatica, 47(9):1878–1886, 2011.

R. Sepulchre, D. A. Paley, and N. E. Leonard. Stabilization of planar collective motion: All-to-all communication. IEEE Transactions on Automatic Control, 52(5):811–824, 2007.

J. R. Silvester. Determinants of block matrices. The Mathematical Gazette, 84(501):460–467, 2000.

O. Simeone, U. Spagnolini, Y. Bar-Ness, and S. H. Strogatz. Distributed synchronization in wireless networks. IEEE Signal Processing Magazine, 25(5):81–97, 2008.

J. W. Simpson-Porco, F. Dörfler, and F. Bullo. Synchronization and power sharing for droop-controlled inverters in islanded microgrids. Automatica, 49(9):2603–2611, 2013. doi: 10.1016/j.automatica.2013.05.018.

S. L. Smith, M. E. Broucke, and B. A. Francis. A hierarchical cyclic pursuit scheme for vehicle networks. Automatica, 41(6):1045–1053, 2005.

E. H. Spanier. Algebraic Topology. Springer, 1994.

M. W. Spong and N. Chopra. Synchronization of networked Lagrangian systems. In Lagrangian and Hamiltonian Methods for Nonlinear Control 2006, volume 366 of Lecture Notes in Control and Information Sciences, pages 47–59. Springer, 2007. ISBN 978-3-540-73889-3.

S. H. Strogatz. From Kuramoto to Crawford: Exploring the onset of synchronization in populations of coupled oscillators. Physica D: Nonlinear Phenomena, 143(1):1–20, 2000.

A. Tahbaz-Salehi and A. Jadbabaie. A necessary and sufficient condition for consensus over random networks. IEEE Transactions on Automatic Control, 53(3):791–795, 2008.

Y. Takeuchi. Global Dynamical Properties of Lotka-Volterra Systems. World Scientific Publishing, 1996. ISBN 9810224710.

H. G. Tanner, A. Jadbabaie, and G. J. Pappas. Flocking in fixed and switching networks. IEEE Transactions on Automatic Control, 52(5):863–868, 2007.


P. A. Tass. A model of desynchronizing deep brain stimulation with a demand-controlled coordinated reset of neural subpopulations. Biological Cybernetics, 89(2):81–88, 2003.

B. Touri and A. Nedić. Product of random stochastic matrices. IEEE Transactions on Automatic Control, 59(2):437–448, 2014.

J. N. Tsitsiklis. Problems in Decentralized Decision Making and Computation. PhD thesis, Massachusetts Institute of Technology, Nov. 1984. Available at http://web.mit.edu/jnt/www/Papers/PhD-84-jnt.pdf.

J. N. Tsitsiklis, D. P. Bertsekas, and M. Athans. Distributed asynchronous deterministic and stochastic gradient optimization algorithms. IEEE Transactions on Automatic Control, 31(9):803–812, 1986.

P. Van Mieghem. The N-intertwined SIS epidemic network model. Computing, 93(2-4):147–169, 2011.

P. Van Mieghem, J. Omic, and R. Kooij. Virus spread in networks. IEEE/ACM Transactions on Networking, 17(1):1–14, 2009.

F. Varela, J. P. Lachaux, E. Rodriguez, and J. Martinerie. The brainweb: Phase synchronization and large-scale integration. Nature Reviews Neuroscience, 2(4):229–239, 2001.

T. Vicsek, A. Czirók, E. Ben-Jacob, I. Cohen, and O. Shochet. Novel type of phase transition in a system of self-driven particles. Physical Review Letters, 75(6-7):1226–1229, 1995.

V. Volterra. Variations and fluctuations of the number of individuals in animal species living together. ICES Journal of Marine Science, 3(1):3–51, 1928.

T. J. Walker. Acoustic synchrony: two mechanisms in the snowy tree cricket. Science, 166(3907):891–894, 1969.

G. G. Walter and M. Contreras. Compartmental Modeling with Networks. Birkhäuser, 1999. ISBN 0817640193.

J. Wang and N. Elia. Control approach to distributed optimization. In Allerton Conf. on Communications, Control and Computing, pages 557–561, Monticello, IL, USA, 2010.

Y. Wang, D. Chakrabarti, C. Wang, and C. Faloutsos. Epidemic spreading in real networks: An eigenvalue viewpoint. In IEEE Int. Symposium on Reliable Distributed Systems, pages 25–34, Oct. 2003.

A. Watton and D. W. Kydon. Analytical aspects of the N-bug problem. American Journal of Physics, 37(2):220–221, 1969.

J. T. Wen and M. Arcak. A unifying passivity framework for network flow control. IEEE Transactions on Automatic Control, 49(2):162–174, 2004.

A. T. Winfree. Biological rhythms and the behavior of populations of coupled oscillators. Journal of Theoretical Biology, 16(1):15–42, 1967.

J. Wolfowitz. Products of indecomposable, aperiodic, stochastic matrices. Proceedings of the American Mathematical Society, 14(5):733–737, 1963.


W. Xia and M. Cao. Sarymsakov matrices and asynchronous implementation of distributed coordination algorithms. IEEE Transactions on Automatic Control, 59(8):2228–2233, 2014.

L. Xiao and S. Boyd. Fast linear iterations for distributed averaging. Systems & Control Letters, 53:65–78, 2004.

L. Xiao, S. Boyd, and S. Lall. A scheme for robust distributed sensor fusion based on average consensus. In Symposium on Information Processing of Sensor Networks, pages 63–70, Los Angeles, CA, USA, Apr. 2005.

L. Xiao, S. Boyd, and S.-J. Kim. Distributed average consensus with least-mean-square deviation. Journal of Parallel and Distributed Computing, 67(1):33–46, 2007.

R. A. York and R. C. Compton. Quasi-optical power combining using mutually synchronized oscillator arrays. IEEE Transactions on Microwave Theory and Techniques, 39(6):1000–1009, 2002.

M. Youssef and C. Scoglio. An individual-based approach to SIR epidemics in contact networks. Journal of Theoretical Biology, 283(1):136–144, 2011.

W. Yu, G. Chen, and M. Cao. Some necessary and sufficient conditions for second-order consensus in multi-agent dynamical systems. Automatica, 46(6):1089–1095, 2010.

S. Zampieri. Lecture Notes on Dynamics over Networks. Minicourse at UC Santa Barbara, Apr. 2013.

D. Zelazo. Graph-Theoretic Methods for the Analysis and Synthesis of Networked Dynamic Systems. PhD thesis, University of Washington, 2009.

D. Zelazo and M. Mesbahi. Edge agreement: Graph-theoretic performance bounds and passivity analysis. IEEE Transactions on Automatic Control, 56(3):544–555, 2011.

Y. Zhang and Y. P. Tian. Consentability and protocol design of multi-agent systems with stochastic switching topology. Automatica, 45:1195–1201, 2009.

J. Zhu, Y. Tian, and J. Kuang. On the general consensus protocol of multi-agent systems with double-integrator dynamics. Linear Algebra and its Applications, 431(5-7):701–715, 2009.
