
The Numerical Solution

of Systems of Polynomials
Arising in Engineering and Science

Andrew J. Sommese

University of Notre Dame du Lac, USA


Charles W. Wampler, II
General Motors Research & Development, USA

World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

THE NUMERICAL SOLUTION OF SYSTEMS OF POLYNOMIALS ARISING


IN ENGINEERING AND SCIENCE

Copyright © 2005 by World Scientific Publishing Co. Pte. Ltd.


All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, including photocopying, recording or any information storage and retrieval
system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance
Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy
is not required from the publisher.

ISBN 981-256-184-6

Printed in Singapore.
To Rebecca, Rachel, and Ruth
To Vani, Megan, and Anne
Preface

This book started with the goal of explaining, to engineers and scientists, the ad-
vances made in the numerical computation of the isolated solutions of systems of
nonlinear multivariate complex polynomials since the book of A. Morgan (Morgan,
1987). The writing of this book was delayed because of a number of surprising devel-
opments, which made possible numerically describing not just the isolated solutions,
but also positive-dimensional solution sets of polynomial systems. The most recent
advances allow one to work with individual solution components, which opens up
new ways of solving a large system of polynomials by intersecting the solution sets
of subsets of the equations. This collection of ideas, methods, and problems makes
up the new area of Numerical Algebraic Geometry.
The heavy dependence of the new developments since (Morgan, 1987) on alge-
braic geometric ideas poses a serious challenge for an exposition aimed at engineers,
scientists, and numerical analysts — most of whom have had little or no exposure to
algebraic geometry. Furthermore, most of the introductory books on algebraic geom-
etry are oriented towards computational algebra, and give short shrift at best to the
geometric results which underlie the numerical analysis of polynomial systems. Even
worse, from the standpoint of an engineer or scientist, such books typically aim to
resolve algebraic questions and so do not directly address the numerical/geometric
questions coming from applications.
Our approach throughout this book is to assume that we are trying to explain
each topic to an engineer or scientist. We want to be accurate: we do not cut
corners on giving precise definitions and statements. We give illustrative examples
exhibiting all the phenomena involved, but we only give proofs to the extent that
they further understanding.
The set of common zeros of a system of polynomials is not a manifold, but it is
close to being one in the sense that exceptional points are rare. This vague statement
can be made mathematically precise, and indeed, the theoretical underpinnings
of our methods imply that we avoid such trouble spots "with probability one."
The usual algebraic approaches to the subject do not show how familiar geometric
notions from calculus relate to these solution sets. The geometric approach is harder,
since to link concepts like prime ideals to algebraic sets with certain very nice
geometric properties, you must use not only algebra, but topology, several complex
variables, and partial differential equations. Doing this with full proofs would rule
the book out for all but a very small audience. Yet the theory basically says that,
in any number of dimensions, solution sets are as nice as a few well chosen and
simple examples would naively lead an engineer or scientist to expect.
There remains a tension that we see no way to completely resolve. Dealing with
polynomials and algebraic subsets of Euclidean space is basic, but this is not general
enough to cover the applications common in engineering and science. For example,
the use of products of projective spaces and multihomogeneous polynomials which
live on them is extraordinarily useful, but these polynomials are not "functions" on
the products of projective spaces. Working in an appropriate generality to cover
everything needed would cast a pall over the whole book. Moreover, the early
parts of the book need only advanced calculus and a few concepts from algebraic
geometry. For this reason, we often restate results in different levels of generality
in different parts of the book. We have also included an appendix with detailed
statements of useful, more technical results from algebraic geometry.
Part One of the book is introductory.
Chapter 1 gives examples of polynomial systems as they arise in practice and
gives an introduction to homotopy continuation, the numerical solution tool under-
lying our work.
Chapter 2 gives a more detailed discussion of homotopy continuation and what
it means to be a complex or real solution of a system of polynomials.
Chapter 3 introduces some algebraic geometry and shows some of the ways it
naturally presents itself, e.g., dealing with solutions at infinity and continuation
paths going to infinity.
Chapter 4 gives a first discussion of generic points and probability-one algo-
rithms. The powerful ability to choose "generic points" in Euclidean space increases
the efficiency and stability of numerical algorithms and eliminates some problems
that are endemic in exact symbolic procedures.
In Chapter 5, there is some detailed discussion of polynomials in just one vari-
able. For example, we discuss the fundamental limitations that the number of digits
available to us imposes on our recognizing a zero of a polynomial.
Chapter 6 gives a brief discussion, with some pointers to the literature, of other
approaches to solving systems of polynomials.
Part Two is devoted to the theory and practice of finding isolated solutions of
polynomial systems. Here we consider the many special features of a polynomial
system that make it amenable to efficient solution.
Chapter 7 explains the coefficient-parameter framework for systems arising in
engineering and science. It is a compelling fact that almost all systems that arise
in practice depend on parameters, and need to be solved many times for different
values of the parameters. Thus it becomes worthwhile to spend extra computation
solving such a system if that extra time, amortized over all the times we solve the
system, leads to a more efficient and quicker average solution time. We include a
case study of this approach applied to Stewart-Gough platform robots.
Polynomial systems arising in engineering and science tend to be sparse and
highly structured. In Chapter 8, we give an extended discussion of such special
structures. These features cause systems to have fewer solutions than would be
naively expected. Taking advantage of this structure leads to more efficient homo-
topies and much faster solution times.
Chapter 9 gives case studies for systems arising from a number of different
engineering and scientific applications. We have found that these systems present
challenging problems and excellent trial grounds for improving our algorithms.
Chapter 10 covers endgame methods. These methods exploit continuation to
improve the numerical accuracy of singular solutions, such as double or triple roots.
Chapter 11 deals with how to recognize and deal with problems that may occur.
The probability-one methods we use are based on choosing generic points. If only
we had computers with infinite precision, these methods would eliminate all manner
of unpleasant difficulties, e.g., path crossing. Since real computers have only finite
precision, the probability of "probability zero" events is very small, but positive.
This chapter discusses how to detect the occurrence of such events, in the large
problems occurring in engineering and science, and how to deal with them.
Part Three of the book shows how the ability to compute isolated solutions by
homotopy continuation can be exploited to manipulate higher-dimensional solution
sets of polynomial systems. To do so, we introduce "witness sets" to represent
curves, surfaces and other algebraic-geometric sets as numerical objects. Witness
sets and the underlying theory should be looked at as a new subject, Numerical
Algebraic Geometry, whose relation to Algebraic Geometry is similar to the relation
of Numerical Linear Algebra to Linear Algebra.
Chapter 12 introduces some needed material from algebraic geometry, such as
the Zariski topology, its relation to the complex topology, the irreducible decompo-
sition, constructible algebraic sets, and multiplicity.
Chapter 13 introduces the basic concepts of numerical algebraic geometry. Primary
among these are witness points, which form the natural numerical data structure
to encode irreducible algebraic sets. We also give an extensive discussion of the
reduction to systems with the same number of equations as unknowns. Based on
(Sommese & Wampler, 1996), the article where Numerical Algebraic Geometry
started, this chapter explains the numerical irreducible decomposition and how to
compute "witness point supersets," a first approximation to the witness point sets
occurring in the numerical irreducible decomposition.
Chapter 14 presents an alternative procedure to compute the "witness point
supersets" of Chapter 13. We follow (Sommese & Verschelde, 2000), with some
of the later improvements from (Sommese, Verschelde, & Wampler, 2004b). One
novelty is the complete removal of slack variables.
Chapter 15 explains the algorithms to compute the numerical irreducible decomposition.
This is primarily based on (Sommese, Verschelde, & Wampler, 2001a,
2001c, 2002b). The chapter ends with a section on singular path-tracking. We
give some applications, mainly from the theory of mechanisms, which was a major
motivation for our studying the numerical solution of polynomial systems.
Chapter 16 discusses briefly the recent algorithms of (Sommese et al., 2004b,
2004c) to find the numerical irreducible decomposition of the intersection of irre-
ducible algebraic sets. This gives a new method which shows promise for solving
large polynomial systems.
Appendix A collects in one place many useful results from algebraic geometry,
including some structure theorems relating solution sets of parameterized polyno-
mial systems at generic points and particular points of the parameter space.
Appendix B lists some software packages available for solving polynomial sys-
tems by continuation.
Appendix C contains a user's guide to HOMLAB, a suite of MATLAB¹ routines pro-
vided by the authors for experimenting with polynomial continuation and working
the numerous exercises in this book.
The bibliography is not meant to be exhaustive. At the present time, when a few
keystrokes bring a deluge of references, the inclusion of everything of relevance on a
topic as broad as polynomial systems would diminish the value of the bibliography
as a tool for learning. Given this, we have followed the policy of only including
references of such direct relevance to the topics we cover that they are referred to
in the text.
Given the frequency with which web addresses change, we do not list explicit
addresses of webpages in this book. We do mention numerous websites: it is easy
to find their current coordinates by using a search engine.
We would like to express our thanks to the National Science Foundation for their
support (under Grant No. 0105653 and Grant No. 0410047 for the first author and
under Grant No. 0410047 for the second author). The first author thanks the Uni-
versity of Notre Dame and the Duncan Chair for their support. The second author
thanks General Motors Research and Development for their support, especially his
long-time supervisor, Samuel Marin, and current supervisor, Roland Menassa.
The second author wishes to acknowledge his mathematical colleagues at GM
R&D who have aided his continuing education in the field, particularly Daniel Baker
and the late W. Weston Meyer. Both authors are indebted to Alec Morgan for early
collaborations, which introduced us to the area and had the additional benefit of
introducing us to each other.
We would like to thank Tien-Yien "T.-Y." Li for his helpful comments on this
book and on many of our numerical algebraic geometry articles.
We would like to express our thanks to all the many people who have made help-
ful comments and suggested improvements. Our close collaborator, Jan Verschelde,
deserves special recognition. We also thank Wesley Calvert, Ye Lu, and Yumiko
Watanabe. We give special thanks to Daniel Bates for his many helpful suggestions
and remarks.

¹ "MATLAB" is a registered trademark of The MathWorks, Inc.
Most of all, we thank our families for their strong encouragement and patience
during the writing of this book.

Andrew J. Sommese Charles W. Wampler


sommese@nd.edu charles.w.wampler@gm.com
Notre Dame, Indiana, U.S.A. Warren, Michigan, U.S.A.
Contents

Preface vii

Conventions xxi

I Background 1

1. Polynomial Systems 3
1.1 Polynomials in One Variable 3
1.2 Multivariate Polynomial Systems 5
1.3 Trigonometric Equations as Polynomials 7
1.4 Solution Sets 8
1.5 Solution by Continuation 9
1.6 Overview 10
1.7 Exercises 11

2. Homotopy Continuation 15
2.1 Continuation for Polynomials in One Variable 15
2.2 Complex Versus Real Solutions 18
2.3 Path Tracking 20
2.4 Exercises 24

3. Projective Spaces 27
3.1 Motivation: Quadratic Equations 27
3.2 Definition of Projective Space 29
3.3 The Projective Line P^1 30
3.4 The Projective Plane P^2 32
3.5 Projective Algebraic Sets 34
3.6 Multiprojective Space 35


3.7 Tracking Solutions to Infinity 36


3.8 Exercises 39

4. Genericity and Probability One 43


4.1 Generic Points 44
4.2 Example: Generic Lines 46
4.3 Probabilistic Null Test 48
4.4 Algebraic Probability One 50
4.5 Numerical Certainty 51
4.6 Other Approaches to Genericity 52
4.7 Final Remarks 53
4.8 Exercises 53

5. Polynomials of One Variable 55


5.1 Algebraic Facts for One Variable Polynomials 55
5.2 Analytic Facts for One Variable Polynomials 58
5.3 Some Numerical Aspects of Polynomials of One Variable 61
5.4 Exercises 65

6. Other Methods 67
6.1 Exclusion Methods 68
6.2 Elimination Methods 72
6.2.1 Resultants 73
6.2.1.1 Hidden Variable Resultants 73
6.2.1.2 u-Resultants 76
6.2.2 Numerically Confirmed Eliminants 76
6.2.3 Dixon Determinants 77
6.2.4 Heuristic Eliminants 79
6.3 Gröbner Methods 81
6.3.1 Definitions 81
6.3.2 From Gröbner Bases to Eigenvalues 83
6.4 More Methods 84
6.5 Floating Point vs. Exact Arithmetic 84
6.6 Discussion 85
6.7 Exercises 86

II Isolated Solutions 89
7. Coefficient-Parameter Homotopy 91
7.1 Coefficient-Parameter Theory 92
7.2 Parameter Homotopy in Application 98
7.3 An Illustrative Example: Triangles 99


7.4 Nested Parameter Homotopies 101
7.5 Side Conditions 102
7.6 Homotopies that Respect Symmetry Groups 103
7.7 Case Study: Stewart-Gough Platforms 104
7.7.1 General Case 106
7.7.2 Platforms with Coincident Joints 108
7.7.3 Planar Platforms 110
7.7.4 Summary of Case Study 110
7.8 Historical Note: The Cheater's Homotopy 111
7.9 Exercises 112

8. Polynomial Structures 117


8.1 A Hierarchy of Structures 118
8.2 Notation 120
8.3 Homotopy Paths for Linearly Parameterized Families 120
8.4 Product Homotopies 122
8.4.1 Total Degree Homotopies 122
8.4.2 Multihomogeneous Homotopies 126
8.4.3 Linear Product Homotopies 130
8.4.4 Monomial Product Homotopies 133
8.4.5 Polynomial Product Homotopies 134
8.5 Polytope Structures 138
8.5.1 Newton Polytopes and Mixed Volume 138
8.5.2 Bernstein's Theorem 139
8.5.3 Computing Mixed Volumes 140
8.5.4 Polyhedral Homotopies 143
8.5.5 Example 144
8.6 A Summarizing Example 146
8.7 Exercises 147

9. Case Studies 149


9.1 Nash Equilibria 149
9.2 Chemical Equilibrium 152
9.3 Stewart-Gough Forward Kinematics 154
9.4 Six-Revolute Serial-Link Robots 156
9.5 Planar Seven-Bar Structures 159
9.5.1 Isotropic Coordinates 160
9.5.2 Seven-Bar Equations 160
9.6 Four-Bar Linkage Design 161
9.6.1 Four-Bar Synthesis 162
9.6.2 Four-Bar Equations 163
9.6.3 Four-Bar Analysis 164


9.6.4 Function Generation 164
9.6.5 Body Guidance 165
9.6.6 Five-Point Path Synthesis 166
9.6.7 Nine-Point Path Synthesis 167
9.6.8 Four-Bar Summary 169
9.7 Exercises 170

10. Endpoint Estimation 177


10.1 Nonsingular Endpoints 178
10.2 Singular Endpoints 179
10.2.1 Basic Setup 179
10.2.2 Fractional Power Series and Winding Numbers 180
10.3 Singular Endgames 181
10.3.1 Endgame Operating Zone 182
10.3.2 Simple Prediction 183
10.3.3 Power-Series Method 183
10.3.4 Cauchy Integral Method 186
10.3.5 The Clustering or Trace Method 187
10.4 Losing the Endgame 188
10.5 Deflation of Isolated Singularities 190
10.5.1 Polynomials in One Variable 191
10.5.2 More than One Variable 192
10.6 Exercises 194

11. Checking Results and Other Implementation Tips 197


11.1 Checks 197
11.1.1 Endpoint Quality Measures 197
11.1.2 Global Checks 199
11.2 Corrective Actions 200
11.2.1 Adaptive Re-Runs 201
11.2.2 Verified Path Tracking 201
11.2.3 Multiple Precision 201
11.3 Exercises 202

III Positive Dimensional Solutions 205

12. Basic Algebraic Geometry 207


12.1 Affine Algebraic Sets 209
12.1.1 The Zariski Topology and the Complex Topology 211
12.1.2 Proper Maps 212
12.1.3 Linear Projections 212


12.2 The Irreducible Decomposition for Affine Algebraic Sets 215
12.2.1 The Dimension of an Algebraic Set 216
12.3 Further Remarks on Projective Algebraic Sets 217
12.4 Quasiprojective Algebraic Sets 219
12.5 Constructible Algebraic Sets 220
12.6 Multiplicity 223
12.7 Exercises 225

13. Basic Numerical Algebraic Geometry 227


13.1 Introduction to Witness Sets 229
13.2 Linear Slicing 231
13.2.1 Extrinsic and Intrinsic Slicing 234
13.3 Witness Sets 235
13.3.1 Witness Sets for Reduced Components 236
13.3.2 Witness Sets for Deflated Components 237
13.3.3 Witness Sets for Nonreduced Components 238
13.4 Rank of a Polynomial System 239
13.5 Randomization and Nonsquare Systems 241
13.6 Witness Supersets 244
13.6.1 Examples 247
13.7 Probabilistic Algorithms About Algebraic Sets 249
13.7.1 An Algorithm for the Dimension of an Algebraic Set 250
13.7.2 An Algorithm for the Dimension of an Algebraic Set at a Point 250
13.7.3 An Algorithm for Deciding Inclusion and Equality of Reduced
Algebraic Sets 252
13.8 Summary 253
13.9 Exercises 253

14. A Cascade Algorithm for Witness Supersets 255


14.1 The Cascade Algorithm 256
14.2 Examples 261
14.3 Exercises 262

15. The Numerical Irreducible Decomposition 265


15.1 Membership Tests and the Numerical Irreducible Decomposition 267
15.2 Sampling a Component 272
15.2.1 Sampling a Reduced Component 272
15.2.2 Sampling a Deflated Component 273
15.2.3 Witness Sets in the Nonreduced Case 273
15.3 Numerical Elimination Theory 274
15.4 Homotopy Membership and Monodromy 275
15.4.1 Monodromy 276


15.4.2 Completeness of Monodromy 277
15.5 The Trace Test 279
15.5.1 Traces of Functions 280
15.5.2 The Simplest Traces 280
15.5.3 Traces in the Parameterized Situation 281
15.5.4 Writing Down Defining Equations: An Example 282
15.5.5 Linear Traces 283
15.6 Singular Path Tracking 284
15.7 Exercises 288

16. The Intersection Of Algebraic Sets 289


16.1 Intersection of Irreducible Algebraic Sets 290
16.2 Equation-by-Equation Solution of Polynomial Systems 292
16.2.1 An Example 293
16.3 Exercises 294

Appendices 297
Appendix A Algebraic Geometry 299
A.1 Holomorphic Functions and Complex Analytic Spaces 300
A.2 Some Further Results on Holomorphic Functions 302
A.2.1 Manifold Points and Singular Points 306
A.2.2 Normal Spaces 308
A.3 Germs of Complex Analytic Sets 308
A.4 Useful Results About Algebraic and Complex Analytic Sets 310
A.4.1 Generic Factorization 316
A.5 Rational Mappings 317
A.6 The Rank and the Projective Rank of an Algebraic System 318
A.7 Universal Functions and Systems 320
A.7.1 One Variable Polynomials 321
A.7.2 Polynomials of Several Variables 322
A.7.3 A More General Case 322
A.7.4 Universal Systems 323
A.8 Linear Projections 324
A.8.1 Grassmannians 325
A.8.2 Linear Projections on P^N 327
A.8.3 Further Results on System Ranks 329
A.8.4 Some Genericity Properties 330
A.9 Bertini's Theorem and Some Consequences 331
A.10 Some Useful Embeddings 334
A.10.1 Veronese Embeddings 334


A.10.2 The Segre Embedding 334
A.10.3 The Secant Variety 335
A.10.4 Some Genericity Results 336
A.11 The Dual Variety 337
A.12 A Monodromy Result 339
A.13 Line Bundles and Vector Bundles 341
A.13.1 Bihomogeneity and Multihomogeneity 341
A.13.2 Line Bundles and Their Sections 341
A.13.3 Some Remarks on Vector Bundles 343
A.13.4 Detecting Positive-Dimensional Components 343
A.14 Generic Behavior of Solutions of Polynomial Systems 344
A.14.1 Generic Behavior of Solutions 347
A.14.2 Analytic Parameter Spaces 349

Appendix B Software for Polynomial Continuation 353

Appendix C HomLab User's Guide 355


C.1 Preliminaries 356
C.1.1 "As is" Clause 356
C.1.2 License Fee 356
C.1.3 Citation and Attribution 356
C.1.4 Compatibility and Modifications 356
C.1.5 Installation 357
C.1.6 About Scripts 358
C.2 Overview of HOMLAB 358
C.3 Defining the System to Solve 360
C.3.1 Fully-Expanded Polynomials 360
C.3.2 Straight-Line Functions 362
C.3.3 Homogenization 364
C.3.4 Function Utilities and Checking 365
C.4 Linear Product Homotopies 366
C.5 Parameter Homotopies 369
C.5.1 Initializing Parameter Homotopies 370
C.6 Defining a Homotopy Function 371
C.6.1 Defining a Parameter Path 371
C.6.2 Homotopy Checking 372
C.7 The Workhorse: Endgamer 372
C.7.1 Control Settings 373
C.7.2 Verbose Mode 375
C.7.3 Path Statistics 375
C.8 Solutions at Infinity and Dehomogenization 376
Bibliography 379

Index 397
Conventions

The following notational conventions are used in this book.


• Often when using indices, we refer to the objects being discussed as a se-
  quence with first and last elements. For example, we might write a Jacobian
  matrix as

  \[ \begin{bmatrix}
     \frac{\partial f_1}{\partial z_1} & \cdots & \frac{\partial f_1}{\partial z_m} \\
     \vdots & & \vdots \\
     \frac{\partial f_N}{\partial z_1} & \cdots & \frac{\partial f_N}{\partial z_m}
     \end{bmatrix} \]

  This is an abuse of notation in the case N = 1 or m = 1. Rather than
  avoiding the abuse and obscuring things we usually leave the reader to fill
  in the special cases, e.g., in the example just given with N = m = 1 we
  mean simply [df_1/dz_1].
• When clear from context, we let 0 denote the origin of a vector space.
• When we have a map f : X → Z between sets, and Y ⊂ X, we usually
  denote the restriction of f to Y by f_Y. Similarly, for a point z ∈ Z, we
  denote the fiber f^{-1}(z) by X_z.
• We often use := when we are making a definition, e.g., the disk of radius r
  in the complex plane C around a point x is defined

  \[ \Delta_r(x) := \{ z \in \mathbb{C} \mid |z - x| < r \}. \]

  In pseudocode statements of algorithms, we use the same symbol for copying
  the right-hand result to the left, e.g., k := k + 1 increments k by one.
• We use multidegree notation. For example, if z_1, ..., z_N are indeterminates,
  and I = (i_1, ..., i_N) is an N-tuple of nonnegative integers, then z^I denotes
  z_1^{i_1} ··· z_N^{i_N}. We let |I| := i_1 + ··· + i_N.


• C[x] is the set of polynomials in x with coefficients in C. Similarly,
  C[x_1, ..., x_n] is the set of multivariate polynomials with complex
  coefficients.
• A polynomial

  \[ p(z_1, \ldots, z_N) := \sum_{|I| \le d} c_I z^I \in \mathbb{C}[z_1, \ldots, z_N] \]

  is said to be of total degree d (or of degree d for short) if there is at least
  one coefficient with c_I ≠ 0 and |I| = d. For one variable polynomials we
  often follow the reverse convention on the ordering, i.e., we write p(z) :=
  a_0 z^d + ··· + a_d ∈ C[z], with a_0 ≠ 0.
• The symbol \ is the "setminus" operator, that is

  \[ A \setminus B := \{ x \in A \mid x \notin B \}. \]
• For a set A, #A is the cardinality of A.
• For a subset A of a topological space, Ā denotes the closure of A.
• We use C* := C \ 0, the complex line minus the origin.
• We use p^{(j)}(x) := d^j p(x)/dx^j, the j-th derivative of p with respect to x.
• The N-dimensional complex Euclidean space is

  \[ \mathbb{C}^N := \underbrace{\mathbb{C} \times \cdots \times \mathbb{C}}_{N \text{ times}} = \{ (x_1, \ldots, x_N) \mid x_i \in \mathbb{C} \}. \]
• For real numbers a, b, we denote open and closed intervals of the real line
  as

  \[ [a, b] = \{ x \in \mathbb{R} \mid a \le x \le b \}, \qquad (a, b] = \{ x \in \mathbb{R} \mid a < x \le b \}, \]
  \[ [a, b) = \{ x \in \mathbb{R} \mid a \le x < b \}, \qquad (a, b) = \{ x \in \mathbb{R} \mid a < x < b \}. \]
• A point x in N-dimensional complex projective space, P^N, is often
  written via homogeneous coordinates enclosed in square brackets, viz.
  [x_0, x_1, ..., x_N] ∈ P^N. (Projective space is explained in Chapter 3 and
  §12.4.) Unfortunately, this means that a real interval [u, v] ⊂ R and a point
  on the projective line [u, v] ∈ P^1 have the same notation. The distinction
  between the two will be clear from context.
• We use Hardy's big O notation. The expression f(x) = g(x) + O(h(x)) as
  x → a means that there exists a constant C > 0 such that |f(x) − g(x)| ≤
  C h(x) for all x sufficiently near a, but not necessarily equal to a. The most
  typical choices for a are 0 and ∞, and the most typical choice for h(x) is
  x^s for some nonnegative integer s.
PART I
Background
Chapter 1

Polynomial Systems

The goal of this book is to describe numerical methods for computing the solutions
of systems of polynomial equations. It is appropriate, therefore, to begin by defining
"polynomials," discussing how they may arise in science and engineering, describing
in nontechnical terms what the "solutions of polynomial systems" look like and
how we might represent these numerically. The last section of this chapter gives
an overview of the rest of the book, to help the reader understand it in a larger
perspective.

1.1 Polynomials in One Variable

As will be our habit throughout the book, we start with simple scenarios before
proceeding to more general ones. A polynomial of degree d in one variable, say x,
is a function of the form
\[ f(x) = a_0 x^d + a_1 x^{d-1} + \cdots + a_{d-1} x + a_d, \qquad (1.1.1) \]

where a_0, ..., a_d are the coefficients and the integer powers of x, namely 1, x,
x^2, ..., x^d, are monomials. In science and engineering, such functions usually
have coefficients that are real numbers although sometimes they may be complex.
Accordingly, we will consider f(x) as a function that maps complex numbers to
complex numbers, f : C → C. The notation C[x] is often used to denote the set of
all polynomials over the complex numbers in the variable x, so that we may write
f(x) ∈ C[x]. When we say that f(x) in Equation 1.1.1 is degree d, this implies that
a_0 ≠ 0; otherwise, we say that f is at most degree d.
The "solution set" of the equation f(x) = 0 is the set of all values of x ∈ C such
that f(x) evaluates to zero. We may write this as

\[ f^{-1}(0) = \{ x \in \mathbb{C} \mid f(x) = 0 \}. \]

One of the great advantages of working over complex numbers is that, by the fun-
damental theorem of algebra (see Theorem 5.1.1), we know that as long as a_0 ≠ 0,
f^{-1}(0) will consist of exactly d points, counting multiplicities. Thus, a data struc-
ture convenient for representing the solution of the equation is just a list of d complex
numbers, say x_1*, ..., x_d*, not all necessarily distinct. These are also called the roots
of the polynomial. If some of the roots are repeated, then the reduced solution set
is just the list of distinct roots. We know the solution set is complete and correct if

\[ a_0 \prod_{i=1}^{d} (x - x_i^*) = a_0 x^d + a_1 x^{d-1} + \cdots + a_{d-1} x + a_d. \]

That is, we expand the left-hand side and check that all the coefficients match.
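This completeness check is easy to carry out numerically. The sketch below is our own illustration, not from the book: it uses NumPy's `roots` as a stand-in rootfinder and `poly` to re-expand a_0 ∏(x − x_i*) back into coefficient form.

```python
import numpy as np

# f(x) = 2x^3 - 12x^2 + 22x - 12 = 2(x - 1)(x - 2)(x - 3),
# with coefficients listed a_0, a_1, ..., a_d as in Equation 1.1.1.
coeffs = [2.0, -12.0, 22.0, -12.0]

roots = np.roots(coeffs)               # the d roots x_1*, ..., x_d*

# Expand a_0 * prod_i (x - x_i*) and check that every coefficient
# matches the original polynomial.
expanded = coeffs[0] * np.poly(roots)
assert np.allclose(expanded, coeffs), "root list incomplete or inaccurate"
print(np.sort_complex(roots))          # approximately 1, 2, 3
```

Floating point makes the comparison approximate, which is why `allclose` is used rather than exact equality; this is the sensitivity issue discussed below.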
It is possible to study polynomials over other rings, for example: the reals, R[x];
rational numbers, Q[x]; the integers, Z[x]; any finite field¹, F[x]; or sometimes, in
statements of theory, an unspecified field, usually denoted K[x]. In one sense, there
is no loss of generality in restricting our attention to C[x], for if we find all complex
solutions of f(x) = 0, all real solutions will be contained therein, and similarly
rational and integer solutions. However, the situation may be turned on its head
if we ask other questions. As an example, suppose we seek the conditions for a
sixth degree polynomial to be factorable over a field other than C. Since the funda-
mental theorem of algebra tells us that all polynomials of degree greater than one
in one variable factor over the complexes, we would have to consider the specific
field in question to get an answer. Computer algebra systems deal extensively with
polynomials over the rational numbers or over finite fields, since these permit exact
calculation. And in the area of encryption, essential to secure digital communica-
tions, polynomials on finite fields are crucial. However, in engineering and science,
real or complex numbers are of greatest concern, and it is in this arena that we
focus our effort.
At this point, it is worth noting that our approach will be numerical, so in fact,
all of the coefficients and the solutions we compute will be represented in floating
point arithmetic. Typically, both will be only approximate, so that in reality we
compute approximate solutions to a polynomial that is already an approximation of
the original problem. This is the nature of almost all scientific computation. What
is critical is that we have some estimate of the sensitivity of the problem so that we
have assurance that the solutions are near the correct ones, or, as some would have
it, that the problem that our solutions satisfy is near the one we want to solve.
There is an extensive literature on the numerical solution of polynomials in one
variable. We will not delve into it here, as our focus is on multivariate cases. For
low-degree polynomials in one variable, one approach is to reformulate the problem
as finding the eigenvalues of the companion matrix,
/ 0 1 ••• 0 \
: : :
A= '•• , (1.1.2)
0 0 ••• 1
^ oo a0 ao'
a
Most commonly, the integers modulo a prime number.
Polynomial Systems 5

having ones on the superdiagonal and the coefficients of / in the last row. Since
the characteristic polynomial of A is det(xl — A) = f(x)/a0, its eigenvalues are the
roots of / . This formulation is convenient due to the wide availability of high-
quality software for solving eigenvalue problems, and as documented in (Goedecker,
1994), it is a highly effective numerical approach. For polynomials with high degree,
divide-and-conquer techniques may be better (Pan, 1997).
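As a concrete illustration of Equation 1.1.2, the sketch below (our own code, not part of HOMLAB; the function name is ours) builds the companion matrix for coefficients given in the order a_0, a_1, ..., a_d and recovers the roots as its eigenvalues.

```python
import numpy as np

def companion_roots(a):
    """Roots of f(x) = a[0]*x^d + a[1]*x^(d-1) + ... + a[d], found as
    eigenvalues of the companion matrix of Equation 1.1.2: ones on the
    superdiagonal, -a_d/a_0, ..., -a_1/a_0 across the last row."""
    a = np.asarray(a, dtype=complex)
    d = len(a) - 1
    A = np.zeros((d, d), dtype=complex)
    A[:-1, 1:] = np.eye(d - 1)        # ones on the superdiagonal
    A[-1, :] = -a[:0:-1] / a[0]       # last row: -a_d/a_0, ..., -a_1/a_0
    return np.linalg.eigvals(A)       # eigenvalues of A = roots of f

# f(x) = x^2 - 3x + 2 = (x - 1)(x - 2)
print(np.sort_complex(companion_roots([1.0, -3.0, 2.0])))
```

For d = 2 and a = (1, −3, 2) the matrix is [[0, 1], [−2, 3]], whose characteristic polynomial x(x − 3) + 2 = x² − 3x + 2 confirms the construction.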

1.2 Multivariate Polynomial Systems

We may generalize the single-variable case in two ways: we may seek the simulta-
neous solution of several polynomials, and each of these may involve more than one
variable. The formal definition of a polynomial, which includes the single variable
case, is as follows.

Definition 1.2.1 (Polynomial) A function f(x) : C^n → C in n variables x =
(x_1, ..., x_n) is a polynomial if it can be expressed as a sum of terms, where each
term is the product of a coefficient and a monomial, each coefficient is a complex
number, and each monomial is a product of variables raised to nonnegative integer
powers. Restating this in multidegree notation, let a = (a_1, ..., a_n) with each a_i
a nonnegative integer, and write monomials in the form x^a = x_1^{a_1} ··· x_n^{a_n}. Then, a
polynomial f is a function that can be written as

\[ f(x) = \sum_{a \in T} a_a x^a, \qquad (1.2.3) \]

where T is a finite index set and a_a ∈ C. The notation f ∈ C[x_1, ..., x_n] = C[x]
means f is a polynomial in the variables x with coefficients in C. The total degree
of a monomial x^a is |a| := a_1 + ··· + a_n, and that of the polynomial f(x) is the
maximum of |a| over terms with a_a ≠ 0.
When no confusion can result, we abbreviate total degree to degree.

In practice, polynomials often arise in unexpanded form, so that although in principle they can be expanded to the form of Equation 1.2.3, it is neither convenient nor numerically expedient to do so. Consequently, it is useful to make the following simple observations.

Proposition 1.2.2 If f ∈ C[x] and g ∈ C[x] are polynomials, then

• −f ∈ C[x],
• f + g ∈ C[x],
• f − g ∈ C[x],
• f·g ∈ C[x], and
• f^k ∈ C[x], for any nonnegative integer k.
6 Numerical Solution of Systems of Polynomials Arising in Engineering and Science

Proof. Apply the distributive law to expand each expression into a sum of terms.

Note that constants are polynomials too, so we may add, subtract or multiply by
them as well.
Notice that the operation of division is missing. Although with suitable care,
many of the techniques in this book can be extended to algebraic functions, which
allow division, we will for the most part concentrate on polynomials. Mainly, this
just means that before commencing to solve algebraic equations, we must clear
denominators. 2
The facts listed in Proposition 1.2.2 allow us to consider systems of polynomial
functions given in "straight-line" form, convenient for both the analyst and for
evaluation by computer.

Definition 1.2.3 (Straight-Line Function) Beginning with a list of known quantities, consisting of internal constants, c, and a set of variables, x, a straight-line function specifies a finite sequence of elementary operations whose operands are among the known quantities and whose output is added to the list of known quantities. At termination, a subset of the known quantities are the function values, f(x).
Definition 1.2.4 A polynomial straight-line function is one whose elementary
operations are limited to those listed in Proposition 1.2.2.
When coding a function for numerical work, the analyst typically writes the function
in a high-level description using the standard rules for precedence of operations
and parentheses, as necessary. A compiler program parses this into a low-level
sequence of unary and binary operations, producing a succession of intermediate
results until the function values are reached. The following is a direct consequence
of Proposition 1.2.2 and Definition 1.2.4.

Corollary 1.2.5 A polynomial straight-line function is a polynomial in x with coefficients that are polynomial in the internal constants c.
One of our goals will be to solve such polynomial systems with a minimum of symbolic processing. In particular, we do not wish to expand the polynomials into the form of Equation 1.2.3. For example, if f = 1 + x_1 + x_2 + ··· + x_k, then the efficient way to evaluate f^n given values of x_1, ..., x_k is to evaluate f and raise it to the nth power, as we would do in a straight-line program. Fully expanded, f^n has (n+k choose k) terms, which can become rather large even for moderate n and k.
Example 1.2.6 Consider the polynomial f(x, y) = (1 + 2.2x - 0.3y)^3. In fully expanded form, this has ten terms, namely

    f(x, y) = 1 + 6.6 * x + 14.52 * x^2 + 10.648 * x^3 - 0.9 * y - 3.96 * x * y
        - 4.356 * x^2 * y + 0.27 * y^2 + 0.594 * x * y^2 - 0.027 * y^3.
² Laurent polynomials, which allow negative exponents, are treated briefly in § 8.5.

An un-optimized evaluation of the function proceeding left to right and accumulating results term-by-term will not be very efficient. Compiling this into a sequence
of elementary operations would give 27 operations in total. A computer code that
only accepts fully expanded polynomials may optimize the evaluation procedure in
some fashion. For example, the function is easily rearranged into nested Horner
form as
    f(x, y) = 1 + x * (6.6 + x * (14.52 + 10.648 * x))
        + y * (-0.9 + x * (-3.96 - 4.356 * x) + y * (0.27 + 0.594 * x - 0.027 * y)),

which reduces the operation count to 18. This is still far from the most efficient
straight-line form, in which we first evaluate the quantity inside the parentheses
and then cube the result. Compiled into a sequence of elementary operations, the
evaluation proceeds as follows, using two temporary variables a and b and only five
operations

    a ← (2.2 * x)
    b ← (1 + a)
    a ← (-0.3 * y)
    b ← (b + a)
    f ← (b^3)
Here, the "←" symbol indicates that the right-hand expression should be evaluated and loaded into the variable at the left.
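The savings are easy to check numerically. The Python sketch below (function names are ours) evaluates f(x, y) = (1 + 2.2x - 0.3y)^3 both by the five-operation straight-line program above and from the ten-term expansion, and confirms that the two agree to rounding error.

```python
def f_straight_line(x, y):
    # the five elementary operations from the text
    a = 2.2 * x
    b = 1 + a
    a = -0.3 * y
    b = b + a
    return b ** 3

def f_expanded(x, y):
    # the fully expanded ten-term form
    return (1 + 6.6*x + 14.52*x**2 + 10.648*x**3
            - 0.9*y - 3.96*x*y - 4.356*x**2*y
            + 0.27*y**2 + 0.594*x*y**2 - 0.027*y**3)

x, y = 0.7, -1.3
print(f_straight_line(x, y), f_expanded(x, y))  # the two values agree
```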

1.3 Trigonometric Equations as Polynomials

Problems in geometry and kinematics are often formulated using trigonometric functions. Very often these can be converted to polynomials. For example, equations involving sin θ and cos θ can be treated by replacing these with new indeterminates, say s_θ and c_θ, respectively, and then adding the polynomial relation s_θ^2 + c_θ^2 = 1. Once solution values for s_θ and c_θ have been found, the value of θ is easily determined.³ The sine or cosine of a multiple angle can always be reduced to a polynomial in the sine and cosine of the angle, e.g., sin 2θ = 2 sin θ cos θ, and the sine or cosine of sums and differences of angles can also be expanded into polynomials in the sines and cosines of the angles.
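As a small worked illustration of the substitution (our own example, not the book's): to solve sin θ + 2 cos θ = 1, put s = sin θ, c = cos θ, and impose s^2 + c^2 = 1. Eliminating s = 1 - 2c by hand leaves the quadratic 5c^2 - 4c = 0, and θ is recovered from (s, c) with atan2.

```python
import math
import numpy as np

# solve sin(theta) + 2*cos(theta) = 1: substitute s = sin(theta),
# c = cos(theta), impose s^2 + c^2 = 1, eliminate s = 1 - 2c,
# and solve the remaining quadratic 5c^2 - 4c = 0
solutions = []
for c in np.roots([5.0, -4.0, 0.0]):
    c = c.real                 # both roots of the quadratic are real here
    s = 1.0 - 2.0 * c
    solutions.append(math.atan2(s, c))

for theta in solutions:
    # check the original trigonometric equation; residual should be ~ 1
    print(theta, math.sin(theta) + 2.0 * math.cos(theta))
```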
There are limits, of course: not all trigonometric expressions can be converted to polynomials. Examples include x + sin x and sin x + sin xy.
The reason that trigonometric expressions arising in practice are so often convertible to polynomials is that they usually have to do with angular rotations,
³ A different maneuver is to use a new variable t and the substitutions sin θ = 2t/(1 + t²) and cos θ = (1 − t²)/(1 + t²). This avoids introducing a new equation at the cost of making the substitution quadratic.

Table 1.1 Solution Sets of Polynomial Systems

    Univariate                            Multivariate System
    1 Equation, 1 Variable                n Equations, N Variables
    solution points                       sol'n points, curves, surfaces, etc.
    double roots, etc.                    sets with multiplicity
    Factorization, ∏_i (x − a_i)^{μ_i}    Irreducible decomposition
                    Numerical Representation
    list of points                        list of witness sets

whose main property is the preservation of length. Length relations are inherently
polynomial, due to the Pythagorean Theorem.

1.4 Solution Sets

We have already described above the nature of the solution set of a single polynomial
in one variable, which can be represented numerically by a list of approximate
solution points. As summarized in Table 1.1, the situation is more complicated for
multivariate systems of polynomials. Such a system may have solution sets of several
different dimensions: that is, a system could have isolated solution points (dimension
0), curves (dimension 1), surfaces (dimension 2), etc., all simultaneously. Moreover,
just as a univariate polynomial may have repeated roots, a multivariate system may
have solution sets that appear with multiplicity greater than one. Corresponding
to the factorization of a univariate polynomial into linear factors, the solution set of
a multivariate system can be broken down into its irreducible components. Isolated
points are always irreducible, but higher dimensional sets may factor. For example,
the quadratic x² + y² factors into two lines, (x + iy)(x − iy), whereas the quadratic x² + y² − 1 is an irreducible circle. The computation of a numerical representation
of the irreducible decomposition of a multivariate polynomial system is the major
topic in Part III of this book. This requires witness sets, a special numerical data
structure. We postpone any further discussion of this until that point.
If f(x) : C^N → C^n is a system of multivariate polynomials, we use the notations f^{-1}(0) and V(f) interchangeably to mean the solution set of f(x) = 0, i.e.,

    V(f) = f^{-1}(0) = { x ∈ C^N | f(x) = 0 }.

The set V(f) contains no multiplicity information. When multiplicity is at issue, we will explicitly say so.
V(f) is read as the algebraic set associated to f or the algebraic set of f. The letter V in V(f) stands for variety, and indeed V(f) is sometimes referred to as the variety associated to f. As we will see at the start of § 12.2, the word variety often stands for an irreducible algebraic set. Because of the possible confusion that results, we have avoided using the word variety in this book.
Let us state now one caveat regarding real solutions. Higher dimensional solution

sets retain the property that the complex solution sets must contain the real solution sets. However, the containment can now be looser, because the real solution set may be of lower dimension than the complex component that contains it. For example, the complex line x + iy = 0 in C² contains only one real point, (x, y) = (0, 0). Also, an irreducible complex component can contain more than one real component; for example, the solution set of y² − x(x² − 1) = 0 is one complex curve that has two disconnected real components, one in the range x ≥ 1 and one in −1 ≤ x ≤ 0. Regrettably, the extraction of real components from complex ones is not developed enough for treatment in this book. We refer the reader to (Lu, Sommese, & Wampler, 2005).
This caveat notwithstanding, the complex solutions often give all the information
that an analyst desires. In fact, although systems can, and often do, have solution
sets at several dimensions, a scientist or engineer may often only care about isolated
solution points. When circumstances dictate this, higher dimensional solutions may
be justifiably labeled "degenerate" or "mathematical figments of the formulation."
Consequently, methods that are guaranteed to find the isolated solutions, without
systematically finding the higher dimensional solution sets, are of significant value,
and we will spend a large portion of this book discussing how to do this efficiently.
Moreover, the numerical treatment of higher dimensional solutions will rest upon
the ability to reformulate the problem so that at each dimension we are seeking a
set of isolated solution points.

1.5 Solution by Continuation

The earliest forms of continuation tracked just one root as parameters of a problem
were moved from a solved problem to a new problem. A notable example is the
"bootstrap method" of (Roth, 1962; Freudenstein & Roth, 1963), which happened
to be applied to problems involving polynomials but made no essential use of their
properties. Beginning in the 1970's, an approach to solving multivariate polynomial
systems, called "polynomial continuation," was developed. To just list a few of
the early articles, there are (Drexler, 1977, 1978; Chow, Mallet-Paret, & Yorke,
1979; Garcia & Zangwill, 1979, 1980; Keller, 1981; Li, 1983; Morgan, 1983). A
more detailed history of the first period of the subject may be found in (Morgan,
1987). That period had relatively sparse use of algebraic geometry and centered on
numerically computing all isolated solutions by means of total degree homotopies.
A more recent survey of developments in finding all isolated solutions, taking into
account which monomials appear in the equations, may be found in (Li, 2003).
Methods for finding higher-dimensional solution sets are new; for these, we refer
you to Part III of this book. In (Allgower & Georg, 2003, 1993, 1997), a broader
perspective on continuation, including non-polynomial systems, is available.
By using algebraic geometry and specializing "homotopy continuation" to take
advantage of the properties of polynomials, the algorithms can be designed to be

theoretically complete and practically very robust. Besides being general, polynomial continuation has the advantage that very little symbolic information needs to be extracted from a polynomial system to proceed. It often suffices, for example, just to know the degree of each polynomial, which is easily obtained without
a full expansion into terms. For small systems, other approaches may be faster,
and we will mention some of these. But these alternatives are quickly overwhelmed
by systems of even moderate size, whereas continuation pushes out the boundary
to include a much larger set of practical applications. For this reason, we highly
recommend continuation and we devote nearly all of this book to that approach.

1.6 Overview

The main text of this book is divided into three main parts:
Part I an introduction to polynomial systems and continuation, along with material to familiarize the reader with one-variable polynomials and a chapter summarizing alternatives to continuation,
Part II a detailed study of continuation methods for finding the isolated solutions
of multivariate polynomials systems, and
Part III in which continuation methods dealing with higher dimensional solution
sets are presented.
As such, Part I is a combination of classical material and warm-ups for a serious
look at the continuation method. Although we give brief looks at some alternative solution methods, beyond Part I, we concentrate exclusively on polynomial
continuation. Part II is our attempt to put a common perspective on the major
developments in that method from the 1980's and 1990's. Part III brings the reader
to the cutting edge of developments.
The book also contains two substantial appendices. The first, Appendix A, provides extra material on some of the results we use from algebraic geometry. The style of the main text is intended to be understood without these extra details, but some readers will wish to dig deeper. Unfortunately, most of the existing mathematical texts take a more abstract point of view, necessitated by the mathematicians'
drive to be general by encompassing polynomials over number fields other than the
complexes. By collecting the basics of algebraic geometry over complex numbers,
we hope to make this theory more accessible. Even mathematicians from outside
the specialty of algebraic geometry might find the material useful in developing a
better intuition for the field.
Appendix C is important for the serious student who wishes to work the exercises
in the book. We give a user's guide to HomLab, a collection of Matlab routines for
polynomial continuation. In addition to the basic HomLab distribution, there is a
collection of routines associated with individual examples and exercises. These are
documented in the exercises themselves.

1.7 Exercises

As the focus of this book is on numerical work, most of the exercises will involve
the use of a computer and a software package with numerical facilities, such as
Matlab. A free package called SciLab is also available. While most exercises require
a modicum of programming in the way of writing scripts or at least interactive
sessions with the packages, there are a few that require extensive programming.
Unless stated otherwise, statements such as >>x=eig(A) refer to Matlab commands, where ">>" is the Matlab prompt. Similar commands are available in the other packages mentioned above.

Exercise 1.1 (Companion Matrices) See Equation 1.1.2 for the definition of a
companion matrix. In the following, poly() is a function that returns the coeffi-
cients of a polynomial given its roots, whereas roots () returns the roots given the
coefficients.

(1) Form the companion matrix for

        f(x) = x^5 - 1.500x^4 - 0.320x^3 - 0.096x^2 + 0.760x + 0.156

    and find its roots using an eigenvalue solver (in Matlab: eig).
(2) Repeat the example using >>f=poly([1, 1.5, -.4+.6i, -.4-.6i, -.2]) to form the polynomial and >>roots(f) to find its roots. (Note that in Matlab, roots() works by forming the companion matrix and finding its eigenvalues.)
(3) Wilkinson polynomials. Use >>roots(poly(1:n)) to solve the Wilkinson polynomial (Wilkinson, 1984) of order n,

        ∏_{i=1}^{n} (x − i).

    Explore how the accuracy behaves as n increases from 1 to 20. Why does it degrade? (Examine the coefficients of the polynomials.)
(4) Roots of unity. Use roots() to solve x^n − 1 = 0 for n = 1, ..., 20. Compare the answers to the roots of unity, e^{2πik/n}, where i = √−1.
(5) Repeated roots. Solve x^6 − 12x^5 + 56x^4 − 130x^3 + 159x^2 − 98x + 24 using >>roots(poly([1, 1, 1, 2, 3, 4])). What is the accuracy of the triple root? What is the centroid (average) of the roots clustered around x = 1?

Exercise 1.2 (Straight-Line Polynomials: Efficiency) Consider the determinantal polynomials p_n(x_1, ..., x_{n²}), where p_n is the determinant of the n × n matrix having elements x_1, ..., x_{n²} listed row-wise. For example,

$$
p_2(x_1, x_2, x_3, x_4) = \begin{vmatrix} x_1 & x_2 \\ x_3 & x_4 \end{vmatrix} = x_1 x_4 - x_2 x_3.
$$

(1) What is the degree of p_n?



(2) How many terms are there in the fully expanded form of p_n? Using the sequence of operations implied by the fully expanded expression, how many arithmetic operations are required to numerically evaluate p_n, given numeric values of x_1, ..., x_{n²}?
(3) Using expansion by minors, how many operations are required to numerically evaluate p_n?
(4) What method does Matlab use to efficiently evaluate the determinant of an
n x n matrix? How many operations are required?
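For experimenting with parts (3) and (4) — the sketch below is an illustration, not an answer key — here is a recursive expansion-by-minors evaluator checked against numpy.linalg.det, which (like Matlab's det) works through LU factorization in O(n³) operations rather than the factorial growth of cofactor expansion.

```python
import numpy as np

def det_minors(M):
    """Determinant by recursive expansion along the first row.

    The operation count grows like n!, versus O(n^3) for the LU
    factorization used by numpy.linalg.det and Matlab's det.
    """
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        # minor: delete row 0 and column j
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det_minors(minor)
    return total

M = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
print(det_minors(M), np.linalg.det(np.array(M)))  # both about -3.0
```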

Exercise 1.3 (Straight-Line Polynomials: Degree) As mentioned in Definition 1.2.1, the degree of a monomial is the sum of the exponents appearing in it, e.g., the degree of xy²z = x¹y²z¹ is 4, and the degree of a polynomial is the maximal degree of any of its terms. The purpose of this exercise is to find the degree of a straight-line polynomial without expanding it.

(1) Given the degrees of / and g, what can you say about the degree of the result
for each of the operations listed in Proposition 1.2.2?
(2) Suppose each step of a straight-line program is given as an operator followed by
a list of the addresses of one or two operands (as appropriate) and an address
for the result. Design an algorithm to compute an upper bound on the degree
of a straight-line polynomial. The complexity of the algorithm should be linear
in the number of steps in the straight-line program.
(3) Implement your algorithm in a language of your choice.
(4) Can you think of a polynomial for which your algorithm computes a degree that
is too high?
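As a hint toward parts (1) and (2) — this is one possible design, not the book's solution — the rules deg(−f) = deg f, deg(f ± g) ≤ max(deg f, deg g), deg(f·g) ≤ deg f + deg g, and deg(f^k) = k·deg f propagate an upper bound in a single pass over the program. The step encoding below is our own convention.

```python
def degree_bound(program, nvars):
    """Upper bound on the degree of a straight-line polynomial.

    `program` is a list of steps (op, operand indices...) acting on a value
    list that starts with the variables; each step appends one result.
    Variables have degree 1 and constants degree 0.  Cost is linear in the
    number of steps.
    """
    deg = [1] * nvars
    for step in program:
        op, args = step[0], step[1:]
        if op == 'const':
            deg.append(0)
        elif op in ('add', 'sub'):
            deg.append(max(deg[args[0]], deg[args[1]]))
        elif op == 'mul':
            deg.append(deg[args[0]] + deg[args[1]])
        elif op == 'neg':
            deg.append(deg[args[0]])
        elif op == 'pow':              # args = (operand index, exponent)
            deg.append(deg[args[0]] * args[1])
    return deg[-1]

# (x0*x1 + x0)^3 has degree 6; the bound is tight here
prog = [('mul', 0, 1), ('add', 2, 0), ('pow', 3, 3)]
print(degree_bound(prog, 2))  # 6
```

For part (4), note that cancellations such as x·y − x·y make the bound overshoot the true degree.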

Exercise 1.4 (A Trigonometric Problem) Figure 1.1 shows a planar two-link robot arm, with upper arm length a and forearm length b. The end of the arm is at point (x, y) in the plane. Simple trigonometry gives the relations

    x = a cos θ + b cos φ,    y = a sin θ + b sin φ.    (1.7.4)

(1) Given a, b, x, y, use trigonometry to find θ and φ.
(2) Reformulate Equations 1.7.4 as polynomials using the method suggested in § 1.3.
(3) An alternative formulation is to let the coordinates of the "elbow" point be (u, v) and write equations for the squared distance from (u, v) to (0, 0) and from (u, v) to (x, y). Do so.
(4) Reduce the pair of equations in (u, v) to a single quadratic in u. What does this tell you about the number of solutions of the two-link arm?
(5) What region of the plane can the endpoint of the arm reach? What happens to the solutions of the polynomial outside this range?

Fig. 1.1 A planar two-link robot arm. The triangle with hash marks indicates a grounded link,
meaning that it cannot move. Open circles indicate hinge joints that allow relative rotation of the
adjacent links.
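A numerical companion to parts (3) and (4) of Exercise 1.4 (the code and its test values are ours, not the book's): subtracting the two squared-distance circle equations leaves an equation linear in (u, v), and substituting back yields a quadratic in u, so the arm has at most two elbow configurations.

```python
import math

def elbow_solutions(a, b, x, y):
    """Elbow points (u, v) with u^2 + v^2 = a^2 and (u-x)^2 + (v-y)^2 = b^2.

    Subtracting the circle equations gives the line 2xu + 2yv = k with
    k = a^2 - b^2 + x^2 + y^2; substituting v = (k - 2xu)/(2y) back into
    the first circle yields (4x^2 + 4y^2)u^2 - 4xk u + (k^2 - 4y^2 a^2) = 0.
    """
    k = a*a - b*b + x*x + y*y
    A = 4.0 * (x*x + y*y)
    B = -4.0 * x * k
    C = k*k - 4.0 * y*y * a*a
    disc = B*B - 4.0*A*C
    if disc < 0:
        return []                 # target point out of reach
    sols = []
    for sgn in (1.0, -1.0):
        u = (-B + sgn * math.sqrt(disc)) / (2.0 * A)
        v = (k - 2.0 * x * u) / (2.0 * y)   # assumes y != 0
        sols.append((u, v))
    return sols

# sample data: a = 2, b = 1.5, target point (x, y) = (2, 1)
for u, v in elbow_solutions(2.0, 1.5, 2.0, 1.0):
    print(u, v, u*u + v*v, (u - 2.0)**2 + (v - 1.0)**2)  # last two: ~4.0, ~2.25
```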

Exercise 1.5 (Solution Sets) Create a system of three polynomials in three variables such that the solution set includes a surface, a curve, and several isolated points. (Hint: it is easier to do if the equations are written as products of factors, some of which appear in more than one equation.)
Chapter 2

Homotopy Continuation

In this chapter we present the basic theory underlying the homotopy continuation
method. This flexible method works well in many situations where there is no other
numerical method. The underlying approach of homotopy continuation is to
(1) put the problem we are solving into a family of problems depending on parameters;
(2) solve the problem for some appropriate point in the parameter space; and
(3) track the solutions of the problem as the point representing it in the parameter space passes from the point where we have the solutions to the point representing the original problem that we wish to solve.
This approach is useful on a wide variety of problems, not necessarily polynomial,
which exhibit a continuous dependence of the solutions on the parameters. Of
course, in this generality, many things can go wrong, even to the extent that the
approach completely fails. The major theme of this book is that for polynomial
problems arising in applications, this approach works wonderfully well. An added
advantage of homotopy continuation is that it may easily be parallelized: if the
starting problem has several solutions, the corresponding solution paths may be
tracked on different processors.
In this chapter we start with simple examples and gradually build up to more
general ones. For the first examples, there are other methods, but even for these
examples the continuation method's many robust properties recommend its use.

2.1 Continuation for Polynomials in One Variable

Let us consider how to find the roots of the polynomial p(z) := z^d + a_1 z^{d−1} + ··· + a_d, where d is a positive integer and the a_i are constants. In Chapter 1, we saw that finding the eigenvalues of the companion matrix is an effective approach.
Let's see how continuation might be used to solve this same problem. We know how to solve z^d − 1 = 0: the roots are

    z_k^* = e^{2πik/d}  for k = 1, ..., d.


Consequently, let's define our family of problems by

    H(z, t) := t(z^d − 1) + (1 − t) p(z).    (2.1.1)

When t = 1 we have the system H(z, 1) = z^d − 1, with known roots, and when t = 0, we have the system H(z, 0) = p(z), which we want to solve. We propose to track the solution paths as t goes from 1 to 0.
track the solution paths as t goes from 1 to 0.
For example, applying Equation 2.1.1 to the very simple case p(z) = z² − 5 = 0, we have

    H(z, t) = t(z² − 1) + (1 − t)(z² − 5) = z² − (5 − 4t).

Thus for t ∈ [0, 1] we have two solutions of H(z, t) = 0, namely z_1^*(t) = √(5 − 4t) and z_2^*(t) = −√(5 − 4t). As t goes from 1 to 0, the roots go from ±1 to ±√5, the roots of the equation z² − 5 = 0. Pretending that we don't know formulae for the solution paths, our continuation method consists of numerically tracking the solutions of H(z, t) = 0 as t goes from 1 to 0. Of course, no one would bother to solve this trivial case in such a complicated way, but the point is that, with a few tweaks, the same approach works for any polynomial.
So how can we numerically follow the solution paths? One approach is to observe that the solution paths z^*(t) satisfy the Davidenko differential equation; see, e.g., (Davidenko, 1953a, 1953b; Allgower & Georg, 2003). This equation is obtained by noting that H(z^*(t), t) = 0 for all t. Consequently, letting H_z(z, t) and H_t(z, t) denote the partial derivatives of H(z, t) with respect to z and t respectively, we have

    dz^*(t)/dt = −H_t(z^*(t), t)/H_z(z^*(t), t).    (2.1.2)

For the general case of Equation 2.1.1, we have

    dz^*(t)/dt = −H_t(z^*(t), t)/H_z(z^*(t), t) = −(z^*(t)^d − 1 − p(z^*(t))) / (t d z^*(t)^{d−1} + (1 − t) p'(z^*(t))).

This is an ordinary differential equation for z^*(t), with initial values given for z^*(1). The roots we seek are the values z^*(0).
In the particular case of p(z) = z² − 5, the Davidenko equation simplifies to dz^*(t)/dt = −2/z^*(t). At this point we could numerically solve the two independent initial value problems,

    dz_1/dt = −2/z_1 with z_1(1) = 1,    and    dz_2/dt = −2/z_2 with z_2(1) = −1.

This does work, though it opens us up to all the issues and numerical errors facing the use of the numerical theory of ordinary differential equations.
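These initial value problems can be handed to any ODE integrator. As a crude illustration (the step count is our choice, and the closed form z_1(t) = √(5 − 4t) is used only to check the answer), even forward Euler lands near the target root √5:

```python
import math

# integrate dz/dt = -2/z from t = 1 (z = 1) down to t = 0 by Euler's method
N = 100000                 # number of steps (our choice)
z, t = 1.0, 1.0
dt = -1.0 / N
for _ in range(N):
    z += dt * (-2.0 / z)   # Euler step along the Davidenko equation
    t += dt

print(z, math.sqrt(5.0))   # z approximates z_1(0) = sqrt(5) = 2.2360...
```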

A more numerically stable approach takes full advantage of the fact that the solution paths satisfy the equation H(z, t) = 0 for each t. Thus we might use the following algorithm to track the paths starting at z_1(1) = 1 and z_2(1) = −1.

Simple Path Tracker
Begin

(1) Set up a grid t_0, ..., t_M with M some large number, h = 1/M, and t_j = (M − j)h;
(2) For each i from 1 to 2, do

    (a) set w_0 = z_i(1);
    (b) for each j from 0 to M − 1 do

        i. use one step of Euler's method to define w̃ = w_j + h H_t(w_j, t_j)/H_z(w_j, t_j), i.e., a step of size t_{j+1} − t_j = −h along the Davidenko equation;
        ii. find the solution w_{j+1} of H(z, t_{j+1}) = 0 using Newton's method¹ with start value w̃.

End
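The algorithm above translates almost line for line into code. Below is a Python sketch (the fixed grid size M and Newton iteration count are our simplifications of the book's procedure) for the homotopy H(z, t) = t(z^d − 1) + (1 − t)p(z); applied with p(z) = z² − 5, it carries the start points ±1 to ±√5.

```python
def track(p, pprime, d, z0, M=100, newton_steps=3):
    """Track one path of H(z, t) = t*(z^d - 1) + (1 - t)*p(z) from t=1 to t=0.

    Euler predictor plus Newton corrector on a fixed grid of M steps;
    a serious tracker would adapt the step size.
    """
    h = 1.0 / M
    z, t = complex(z0), 1.0
    for _ in range(M):
        # Euler predictor: dz/dt = -Ht/Hz, taken with step dt = -h
        Ht = (z ** d - 1) - p(z)
        Hz = t * d * z ** (d - 1) + (1 - t) * pprime(z)
        z += (-Ht / Hz) * (-h)
        t -= h
        # Newton corrector on H(z, t) = 0 at the new t
        for _ in range(newton_steps):
            H = t * (z ** d - 1) + (1 - t) * p(z)
            Hz = t * d * z ** (d - 1) + (1 - t) * pprime(z)
            z -= H / Hz
    return z

# p(z) = z^2 - 5; the start points are the d-th roots of unity, here +1 and -1
p = lambda z: z ** 2 - 5
pp = lambda z: 2 * z
print(track(p, pp, 2, 1.0), track(p, pp, 2, -1.0))  # near +-sqrt(5)
```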
The reader probably has many worries about this simple algorithm. Some obvious ones are:

Q1. How should one choose M?;
Q2. Euler's method is pretty terrible;
Q3. Newton's method could fail; and
Q4. If you had a multiple root, e.g., your original system was z² = 0, Newton's method does not work so well.

To these we also add the following observation:

Q5. If one wants to solve the equivalent equation p(z) = 5 − z² = 0, the homotopy, Equation 2.1.1, becomes H(z, t) = t(z² − 1) + (1 − t)(5 − z²) = 0. This gives trouble at t = 5/6 (because H(z, 5/6) = (2/3)z² has a double root) and at t = 1/2 (because H(z, 1/2) = 2 has no solution).
Some quick responses to these concerns are:

A1. In fact, we do not pick an M but choose the t_j by an adaptive procedure. Of course this raises more questions, e.g., "How do we control the step size?" Section 2.3 below addresses the main points.
A2. Because we use Newton's method to correct solutions as we move along, Euler's method gives the same accuracy as using a more sophisticated solver for ordinary differential equations. Higher-order predictors can be used in place of Euler's method to increase efficiency.
¹ The method known as "Newton-Raphson's method" in engineering circles is commonly called just "Newton's method" in the numerical analysis community. We adopt the briefer appellation.

A3. The adaptive procedure is designed to keep the application of Newton's method within its zone of convergence. There are good ways of dealing with special situations where Newton's method still fails (see next item).
A4. Yes, singular solutions pose particular difficulties, but there are a number of effective "endgame" procedures to refine such singular solutions. See Chapter 10.
A5. Certain simple procedures guarantee that bad situations such as these happen with "probability zero." In the next paragraph, we apply a quick fix of wide applicability to the particular example above.
All of these answers and answers to other questions, e.g., how to construct good homotopies H(z, t) when we have many equations with special structure, will be dealt with in this book. For now, let us satisfy ourselves that we can eliminate the troubles arising in the example in item Q5, above, using the following "quick fix." This is a special case of the gamma trick first introduced in (Morgan & Sommese, 1987a, page 108). Let's introduce a random angle θ ∈ [−π, π] and modify the homotopy of Q5 to

    H(z, t) = t e^{iθ}(z² − 1) + (1 − t)(5 − z²) = 0,    (2.1.3)

where i = √−1 is the imaginary element. Note that at t = 1, we have the same start points z = ±1 as before. But now, due to the complex factor e^{iθ}, the paths are well-behaved for all t ∈ [0, 1]; the coefficient of z² does not vanish, nor does the constant. Figure 2.1 shows the solution paths for several values of θ in (0, π]. For values of θ in [−π, 0), the paths are the reflection through the real line of those shown in the figure. We see that trouble is brewing for θ near zero. For θ = 0.1 the paths are mildly behaved, but the trend of what will happen for small θ is apparent: as θ → 0, the paths start at ±1, meet at a double point at the origin, then follow the positive and negative branches of the imaginary axis to infinity, then re-enter the scene along the real axis, coming in from infinity to arrive at the final roots ±√5. Numerically, we can stand a very small value of θ, although the length of the path becomes longer and longer. Thankfully, if we were to pick θ at random, there would be a very small chance of picking θ close enough to zero to cause any trouble. This kind of random complexification of a homotopy is a very useful tool for avoiding singularities. We will justify the gamma trick in a more general context in Chapter 7.
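The claim that the coefficients of Equation 2.1.3 stay away from zero is easy to check numerically. The sketch below uses the sample angle θ = 1 (any θ well away from 0 behaves similarly; the grid resolution is arbitrary) and scans the z² coefficient t e^{iθ} − (1 − t) and the constant term 5(1 − t) − t e^{iθ} over t ∈ [0, 1].

```python
import cmath

theta = 1.0                          # a sample angle, well away from 0
g = cmath.exp(1j * theta)            # the complex factor e^{i*theta}

ts = [j / 1000.0 for j in range(1001)]
min_lead = min(abs(t * g - (1 - t)) for t in ts)       # z^2 coefficient
min_const = min(abs(5 * (1 - t) - t * g) for t in ts)  # constant term
print(min_lead, min_const)           # both stay well away from zero
```

At θ = 0 the leading coefficient would vanish at t = 1/2, reproducing the trouble noted in Q5.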

2.2 Complex Versus Real Solutions

In applications, it is quite common that only real solutions have physical meaning,
yet we find all solutions, including the complex ones. Isn't this a waste of computing
time? Why bother? One might think, "Surely it must be simpler to just find the
real solutions."
The answer has different aspects. First, there is currently no good general

Fig. 2.1 Solution paths of Equation 2.1.3 for θ ∈ {0.1, 0.3, 1, 2}.

method for finding all real roots directly. A good choice in low dimensions is to
use exclusion methods, also known as interval or box-bisection methods, to fence in
isolated roots, but in high dimensions, the rate of convergence tends to be slow. We
summarize these methods in more detail in § 6.1. These methods have a place, faring
best in comparison to continuation if dimensions are low, degrees are high (where
there is the possibility of large numbers of complex roots), and if one only desires real
roots in a limited region. These methods often perform poorly if the problem has
any nonisolated solution components, as they bog down computing a large number
of boxes covering the solution curve, surface, etc. Research in methods for real
roots is an active area, so one shouldn't count them out. Meanwhile, continuation
offers the option of finding all roots, real and complex, and then casting out the
complex ones.
The second answer is that there is useful information to be gained from the
whole solution list. One example is a complex root with small imaginary parts,
an "almost real" solution. Such roots suggest that a small perturbation of the
problem might introduce a new real root. Indeed, a mechanical system modeled as
a collection of rigid bodies always has a bit of elasticity, so "almost real" solutions
of the mathematical model might indicate an extra assembly configuration for the
actual device.
An even more compelling reason to find all roots is that it can reveal structural
information about other problems in the same family as the one at hand. The total

number of nondegenerate isolated roots for a general problem from the family is an
upper bound on the number of such roots for any other problem in the family. The
number of real roots does not respect such a relationship. The complete set of roots
of the general problem can be used as start points for a homotopy to solve other
problems in the family. This sometimes can make a large difference in the amount
of computation used for those subsequent problems. Chapter 7 deals with this in
some detail.
One might hope to use continuation to follow just the real roots from the start
system to the target system. As a general approach, this is doomed to fail, because
the number of real roots is usually not constant. Even if the number of real roots
is the same for the start and target systems, surprising things can happen. Figure 2.2 shows two examples where real solutions become nonreal while nonreal ones become real.
Example 2.2.1 Suppose we set up a homotopy between the polynomials

    f(y) = y⁴ − 2y² − y + 1  and  g(y) = y⁴ − 2y² + y + 1,

which both have two real and two nonreal solutions. The linear homotopy

    h(y, t) = y⁴ − 2y² − ty + 1 = 0

has two real roots for all t ∈ [−1, 1], except at t = 0, where there are two real double roots. But the two positive real roots for 1 ≥ t > 0 do not connect to the two negative real roots for −1 ≤ t < 0.
Example 2.2.2 Consider the homotopy
h(y, t) = y^4 − t + 0.25 = 0,

which at t = 1 has two real and two imaginary roots. Let t travel around the unit circle in the complex plane; that is, let t = e^(iθ) as θ goes from 0 to 2π. At the end of the circuit, we end up with the same polynomial and hence the same roots as at the start. But the paths starting at the two real roots lead to the imaginary ones, and vice versa.
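The interchange in Example 2.2.2 is easy to verify numerically. The sketch below (illustrative Python, not a HomLab routine) follows the four roots around the unit circle by recomputing them at closely spaced values of t and matching nearest neighbors, a crude form of continuation:

```python
import numpy as np

# Follow the four roots of h(y,t) = y^4 - t + 0.25 as t = exp(i*theta)
# traverses the unit circle, matching each new set of roots to the
# previous one by nearest distance.
thetas = np.linspace(0.0, 2.0 * np.pi, 2001)
start = np.roots([1, 0, 0, 0, 0.25 - np.exp(1j * thetas[0])])
paths = start.copy()
for theta in thetas[1:]:
    new_roots = list(np.roots([1, 0, 0, 0, 0.25 - np.exp(1j * theta)]))
    matched = []
    for p in paths:
        j = int(np.argmin([abs(p - r) for r in new_roots]))
        matched.append(new_roots.pop(j))
    paths = np.array(matched)

# Same polynomial at the end of the circuit, but the paths that started
# at the two real roots now sit at the two imaginary roots, and vice versa.
for s, e in zip(start, paths):
    print(f"{s:.4f}  ->  {e:.4f}")
```

With 2000 steps the roots move far less per step than their mutual separation, so the nearest-neighbor matching reliably identifies the continuous paths.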
As a final word on the utility of finding nonreal solution points, we note that in
Part III of this book, we give algorithms for finding all solutions to a polynomial
system, including positive-dimensional solution components. These algorithms rely
heavily on the ability to reliably find all isolated solutions, both real and complex,
to certain polynomial systems related to the initial problem.

2.3 Path Tracking

The heart of any numerical continuation method is its path-tracking algorithm. We


already presented a simple path tracking algorithm on page 17, where we noted
Homotopy Continuation 21

Fig. 2.2 Interchange of real and imaginary roots for two homotopies.

some deficiencies, especially regarding selection of the step size. Much has been
written about path tracking in general (Allgower & Georg, 2003) and path trackers
for polynomial continuation (Morgan, 1987) in particular, so we only sketch the bare
necessities here. Surprisingly, perhaps, the basic algorithm presented below is sufficient for most of our needs without further improvements. The main improvement
over our earlier simple algorithm is the use of an adaptive step size.
For solving algebraic problems, we often place a higher priority on finding all
solutions reliably than on finding one or a few solutions quickly. Therefore, when
faced with a choice between speed and reliability, we choose the more cautious route.
This has the added benefit that the cautious choice is usually simpler as well.
General path trackers must deal with all sorts of difficult issues, for example, a path that bifurcates into several paths, or a path that reverses direction. Fortunately, with proper care in forming a homotopy, one can assure that the paths
tunately, with proper care in forming a homotopy, one can assure that the paths
for solving polynomial systems have none of these troubles: they advance steadily
as the homotopy parameter t advances and never intersect except possibly at the
end target. (More precisely, the probability of a singularity occurring on a path
is zero. This is an issue that will be discussed at greater length when we discuss
homotopies.) The numerical treatment of singularities at the end of the homotopy
is addressed in Chapter 10 on endgames.
The nonsingular path-tracking task may be summarized as follows. Here, as throughout this book, we arrange the homotopy to begin at t = 1 and end at t = 0.
Given the following:

• a continuous homotopy function H(z, t) : C^n × R → C^n; and


• a start point x_1, such that H(x_1, 1) = 0, where x_1 lies on a nonsingular path. That is, there exists a path z(t), continuous over t ∈ (0, 1], such that z(1) = x_1, H(z(t), t) = 0, and the Jacobian matrix ∂H/∂z(z(t), t) is nonsingular for all t ∈ (0, 1].
Again, the existence of the nonsingular homotopy path, z(t), is one of the primary
topics of Part II; for the moment, we just assume that it exists.
Our goal is:
• to move along the path, from t = 1 to as close as possible to t = 0, in order
to produce a close approximation to the endpoint z(0) = lim_(t→0) z(t) or else, in the case of a diverging path, to conclude that the limit does not exist.
Section 3.7 outlines an improved treatment for the case of diverging endpoints.
In the context of the introductory example of this chapter, we already touched
on using Davidenko's equation to turn the path-tracking problem into an initial-
value problem for an ordinary differential equation. We also saw that we may use
a predictor/corrector method based on having an explicit homotopy H(z,t). Such
a predictor/corrector method is highly preferred, because the corrector step avoids
the build-up of error which often accumulates in a numerical o.d.e. solver.

Fig. 2.3 Schematic of path tracking, showing prediction (Euler) and correction (Newton) steps.
In practice, the step size would not be so big.

Basic prediction and correction, schematically illustrated in Figure 2.3, are both
accomplished by considering a local model of the homotopy function via its Taylor
series:
H(z + Δz, t + Δt) = H(z, t) + H_z(z, t)Δz + H_t(z, t)Δt + Higher-Order Terms,   (2.3.4)

where H_z = ∂H/∂z is the n × n Jacobian matrix and H_t = ∂H/∂t is size n × 1. If we have a point (z_1, t_1) near the path, that is, H(z_1, t_1) ≈ 0, one may predict to a new approximate solution at t_1 + Δt by setting H(z + Δz, t_1 + Δt) = 0 and solving the first-order terms to get

Δz = −H_z^(-1)(z_1, t_1) H_t(z_1, t_1) Δt.   (2.3.5)


On the other hand, when H(z_1, t_1) is not as small as one would like, one may hold t constant by setting Δt = 0 and solving the equation to get

Δz = −H_z^(-1)(z_1, t_1) H(z_1, t_1).   (2.3.6)

These are precisely Euler prediction and Newton correction. The main concern of a numerical path-tracking algorithm is deciding which of these to do next and how big a step Δt to use in the predictor.
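In a minimal Python sketch (illustrative only; applied here to the quadratic homotopy H(z, t) = t(z^2 − 1) + (1 − t)(z^2 − 5) used earlier in this chapter), Equations (2.3.5) and (2.3.6) become:

```python
import numpy as np

# One Euler prediction step (2.3.5) and a Newton correction (2.3.6) for a
# homotopy given by callables H, Hz (Jacobian dH/dz), and Ht (dH/dt).
def euler_predict(Hz, Ht, z, t, dt):
    dz = -np.linalg.solve(Hz(z, t), Ht(z, t)) * dt     # Equation (2.3.5)
    return z + dz, t + dt

def newton_correct(H, Hz, z, t, tol=1e-10, max_iters=3):
    for _ in range(max_iters):
        dz = -np.linalg.solve(Hz(z, t), H(z, t))       # Equation (2.3.6)
        z = z + dz
        if np.linalg.norm(dz) < tol:
            return z, True
    return z, False

# Track H(z,t) = t(z^2 - 1) + (1-t)(z^2 - 5) from z(1) = 1 toward t = 0.
H  = lambda z, t: t * (z**2 - 1) + (1 - t) * (z**2 - 5)
Hz = lambda z, t: np.array([[2 * z[0]]])
Ht = lambda z, t: (z**2 - 1) - (z**2 - 5)              # constant 4
z, t = np.array([1.0]), 1.0
while t > 0:
    dt = -min(0.1, t)
    z, t = euler_predict(Hz, Ht, z, t, dt)
    z, ok = newton_correct(H, Hz, z, t)
print(z)   # close to sqrt(5) = 2.2360...
```

Because each prediction is followed by a correction back onto the path, the error does not accumulate from step to step as it would in a pure o.d.e. integration.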
A generic path-tracking algorithm proceeds as follows, adapted from (Allgower & Georg, 1997) (see also (Allgower & Georg, 2003; Morgan, 1987)). In our homotopies, we may assume that the path parameter, s, is strictly monotonic, that is, the path has no turning points. This is a consequence of the assumption above that the Jacobian matrix is nonsingular along the path.

• Given: System of full-rank equations, g(v, s) = 0, initial point v_0 at s_0 = 1 such that g(v_0, s_0) ≈ 0, and initial step length h.
• Find: Sequence of points (v_i, s_i), i = 1, 2, ..., along the path such that g(v_i, s_i) ≈ 0, s_{i+1} < s_i, terminating with s_n = 0. Return a high-accuracy estimate of v_n.
• Procedure:
— Loop: For i = 1, 2, ...
(1) Predict: Predict solution (u, s′) such that ||(u, s′) − (v_{i−1}, s_{i−1})|| ≈ h with s′ < s_{i−1}.
(2) Correct: In the vicinity of (u, s′), attempt to find a corrected solution (w, s″) such that g(w, s″) ≈ 0.
(3) Update: If correction step was successful, update (v_i, s_i) = (w, s″). Increment i.
(4) Adjust: Adjust the step length h.
(5) Terminate Loop: Terminate when s_i = 0 or when nonconvergence of the path has been detected.
— Refine endpoint: At s_n = 0, correct v_n to high accuracy.

There are many possible choices for the implementation of each step. Some
useful choices are as follows.

Predictor The simplest predictor is just u = v_{i−1}, but it is much better to use a linear prediction along the path. Higher-order predictions can also be used, such as matching a quadratic to two points and a tangent. There are two sensible linear predictors:
Secant Predictor Use the last two points on the path to linearly extrapolate to the next. That is,

(u, s′) = (v_{i−1}, s_{i−1}) + h Δ_i/||Δ_i||,   Δ_i = (v_{i−1} − v_{i−2}, s_{i−1} − s_{i−2}).

Tangent Predictor Step along the tangent direction

u = v_{i−1} − α (∂g/∂v)^(-1) (∂g/∂s),

where α is calculated to give the desired step length. This is Euler's method.
Step Length The step length can be measured by any preferred norm of (u − v_{i−1}, s′ − s_{i−1}). A simple choice is ||(u − v_{i−1}, s′ − s_{i−1})|| := |s′ − s_{i−1}|.
Corrector A common corrector strategy is to hold s constant, that is, s″ = s′,
and compute w by Newton's method, allowing a fixed number of iterations.
The correction is deemed successful if Newton's method converges within a
pre-specified path-tracking tolerance within the allowed number of iterations.
Step Length Adjustment A good strategy is to cut the step length in half on
failure of the corrector and to double it if m successive corrections at the current
step size have been successful. A choice of m in the range two to five works well.
Final Step Near the end of the path-tracking interval, the step length is adjusted
to land exactly on s = 0.
Terminate Eventually, we must either arrive at s = 0 or else |s′ − s_{i−1}| must become progressively smaller. It is useful to set a minimum threshold for progress
in s, below which we declare that the path is either diverging or approaching
a singularity. One can also terminate if the magnitude of the solution grows
excessively large.
Refine Newton's method will work well for nonsingular endpoints.
By keeping the number of iterations in the corrector small (no larger than three, conservatively just one) and the path-tracking tolerance tight, all intermediate points
are kept close to the exact path, minimizing any chance that a solution will jump
tracks. However, to save computation time, the path-tracking tolerance is generally
looser at the beginning (for say, 1 > s > 0.1), then made tighter near the end
(0.1 > s > 0), and finally set very tight for the final refinement at s = 0.
This path-tracking algorithm incorporates an adaptive step size. One can also
employ adaptivity at a higher level. Specifically, if a path-tracking failure occurs,
the whole path can be recomputed with more conservative choices in the control
parameters. Most useful is adjustment of the tracking tolerance.
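The whole loop can be sketched compactly. The following Python sketch is illustrative only, with control parameters (three Newton iterations, m = 3, halving and doubling of the step) chosen per the discussion above; it is not HomLab's goodtrack.m:

```python
import numpy as np

# Adaptive-step tracker: Euler (tangent) prediction, Newton correction at
# fixed t, halve the step on a failed correction, double it after m
# consecutive successes.  Illustrative sketch, not HomLab's goodtrack.m.
def track(H, Hz, Ht, z1, h=0.1, h_min=1e-12, m=3, tol=1e-9):
    z, t = np.array(z1, dtype=complex), 1.0
    successes = 0
    while t > 0:
        dt = -min(h, t)                            # final step lands on t = 0
        zp = z - np.linalg.solve(Hz(z, t), Ht(z, t)) * dt   # Euler predictor
        tp = t + dt
        ok = False
        for _ in range(3):                         # Newton corrector
            dz = -np.linalg.solve(Hz(zp, tp), H(zp, tp))
            zp = zp + dz
            if np.linalg.norm(dz) < tol:
                ok = True
                break
        if ok:
            z, t = zp, tp
            successes += 1
            if successes >= m:
                h, successes = 2.0 * h, 0          # m successes: double step
        else:
            h, successes = h / 2.0, 0              # failure: halve and retry
            if h < h_min:
                raise RuntimeError("path appears singular or divergent")
    for _ in range(10):                            # refine endpoint at t = 0
        z = z - np.linalg.solve(Hz(z, 0.0), H(z, 0.0))
    return z

# Solve f(z) = z^3 - 2 from the start system g(z) = z^3 - 1 using the
# homotopy H = gamma*t*g + (1-t)*f with a fixed complex gamma.
gamma = 0.6 + 0.8j
H  = lambda z, t: gamma * t * (z**3 - 1) + (1 - t) * (z**3 - 2)
Hz = lambda z, t: np.array([[3 * z[0]**2 * (gamma * t + 1 - t)]])
Ht = lambda z, t: gamma * (z**3 - 1) - (z**3 - 2)
starts = [np.exp(2j * np.pi * k / 3) for k in range(3)]  # roots of g
ends = [track(H, Hz, Ht, [s])[0] for s in starts]
print(ends)
```

Note the role of the complex constant gamma: for this choice the factor gamma*t + 1 − t never vanishes for real t in (0, 1], so the Jacobian stays nonsingular along all three paths, and the three start points are carried to the three cube roots of 2.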

2.4 Exercises

This set of exercises begins with the simple o.d.e. and fixed-step path trackers
applied to single variable homotopies and carries through to the application of
HomLab's variable-step path tracker to multivariate systems.

Exercise 2.1 (Davidenko Equation) Derive the Davidenko differential equation for the homotopy equation

H(z, t) = γt(z^2 − 1) + σ(1 − t)(z^2 − 5),   (2.4.7)

and check that for γ = σ = 1 you get Equation 2.1.2. In HomLab\exercise, there is
a short m-file, davidenko .m that defines this function in a form suitable for solution
by an o.d.e. solver. To use it, you must declare global variables

>> global gamma sigma

and assign them values. The solution path beginning at z(1) = a may be obtained with the command >> [t,z]=ode45(@davidenko,[1 0],a); Use this approach to
do the following.

(1) Verify that for γ = 1, σ = 1, the solution paths starting at z(1) = ±1 terminate near ±√5. What accuracy is achieved?
(2) Try the same for γ = 1, σ = −1. (Tip: <Ctrl>C interrupts a nonterminating process.)
(3) Reproduce Figure 2.1 by setting σ = −1 and using γ = e^(iθ) for θ = {0.1, 0.3, 1, 2}.
(4) Compute the path from z(1) = 1 for σ = −1 and γ = e^(iθ) for θ = 10^(−k), k = 0, 2, 4, 6, 8, 10. Monitor the number of time steps, the computational time used, and the final accuracy |z(0) − √5| versus k.
(5) Using the results of the previous item, examine the history of t returned by ode45. What values of t require small time steps? (Hint: plot(t,'.') may be insightful.) Save the array of t values for use in the next exercise.
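Readers without Matlab can reproduce the first item with a short Python analogue of davidenko.m. This illustrative sketch (function names are hypothetical, not part of HomLab) integrates the Davidenko o.d.e. dz/dt = −(∂H/∂t)/(∂H/∂z) for the homotopy (2.4.7), with a hand-rolled fixed-step RK4 scheme standing in for ode45:

```python
# Davidenko o.d.e. for H(z,t) = gamma*t*(z^2-1) + sigma*(1-t)*(z^2-5),
# integrated from t = 1 down to t = 0 by classical Runge-Kutta (RK4).
def zdot(t, z, gamma, sigma):
    Hz = 2 * z * (gamma * t + sigma * (1 - t))   # dH/dz
    Ht = gamma * (z**2 - 1) - sigma * (z**2 - 5) # dH/dt
    return -Ht / Hz

def rk4_path(z1, gamma, sigma, steps=200):
    z, t = complex(z1), 1.0
    h = -1.0 / steps
    for _ in range(steps):
        k1 = zdot(t,         z,              gamma, sigma)
        k2 = zdot(t + h / 2, z + h * k1 / 2, gamma, sigma)
        k3 = zdot(t + h / 2, z + h * k2 / 2, gamma, sigma)
        k4 = zdot(t + h,     z + h * k3,     gamma, sigma)
        z, t = z + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6, t + h
    return z

# gamma = sigma = 1: the path from z(1) = 1 should end near sqrt(5).
print(rk4_path(1.0, 1.0, 1.0))
```

Unlike the predictor-corrector trackers of § 2.3, this pure o.d.e. integration has no correction step, so its error accumulates with the number of steps; that contrast is the point of Exercise 2.2, item 3.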

Exercise 2.2 (Crude Path Tracker)


File HomLab\exercise\crudetrack.m implements the simple path-tracking algorithm given in § 2.1, generalized to the homotopy of Equation 2.4.7. The calling sequence is >> [z] = crudetrack(z0,t,gamma,sigma); where z0 is an array of starting values for z(t), and t is an array of values of t.

(1) Use >> [z]=crudetrack([1 -1],1-[0:M]/M,gamma,sigma) to track the paths using M steps of equal length. Do the same experiments as in Items 1-3 of the previous exercise using M = 100. How does the speed and endpoint accuracy compare?
(2) With M = 100, how close to zero can θ be so that the paths for γ = e^(iθ), σ = −1, still have the correct shape? What happens for smaller θ?
(3) Try a small value of θ but use the nonuniform t values returned by ode45 from Exercise 2.1. What does this show about the value of step size control? How does the final accuracy compare to the o.d.e. approach? Can you explain the difference?

Exercise 2.3 (Crude Tracker Generalized) Another m-file, crudetrack2.m,


generalizes the simple path tracker to the homotopy

H(z, t) = γt g(z) + (1 − t) f(z),   (2.4.8)

where g(z) and f(z) are any polynomials in one variable. The calling sequence is
>> [z] = crudetrack2(zO,t, gamma, g,f);
where g and f are given as coefficient arrays in the usual Matlab convention. For g=[1 0 -1], f=[-1 0 5], this is exactly the same as crudetrack with σ = −1.
(1) Compare the speed of crudetrack and crudetrack2. Can you explain the
difference? What does this say about the importance of efficient function eval-
uation?
(2) Use crudetrack2 with g and f as in Example 2.2.1. Choose γ complex in the vicinity of 1 to avoid trouble with double roots. Try other values of γ around the unit circle. Do the start points always end up at the same endpoints?
(3) Try crudetrack2 to solve a polynomial f(z) of degree 7 having random real coefficients chosen in the range [−2, 2]. Use the start system g(z) = z^7 − 1. Compare the success rate using γ = 1 versus using γ = e^(iθ) for a random θ ∈ [0, 2π].
Exercise 2.4 (Multivariate Davidenko O.D.E.) The Davidenko differential
equation generalizes for multivariate homotopies.
(1) Derive the Davidenko equation for a homotopy H(z, t) = 0, where H(z, t) : C^n × R → C^n. (Hint: see Equations (2.3.4) and (2.3.5).)
(2) Use this approach and Matlab's ode45 to solve the system

(2) Use this approach and Matlab's ode45 to solve a two-variable system of the form

H(x, y, t) = γt g(x, y) + (1 − t) f(x, y) = 0,   (2.4.9)

with g and f each a pair of polynomials in x and y. (Routine davidtwo.m, which defines the particular system, may be used.) What are the starting solutions at t = 1? How many of the endpoints at t = 0 are real? What is the final value of H(x, y, 0)?
Exercise 2.5 (Variable-Step Tracker) An implementation of the variable step-
size tracker as described in § 2.3 is provided as routine goodtrack.m. It makes some
choices of control parameters that are generally acceptable for tracking well-scaled
homotopies in double precision. This is similar to the main path-tracking engine
inside HomLab. The routines in the first two items below use goodtrack.m to solve some of our earlier examples.
(1) Use goodtrack2vble.m to track the homotopy of Equation 2.4.9.
(2) Using these examples for guidance, write your own m-file to reproduce Fig-
ure 2.1. Do the same study as in Exercise 2.1, items 4 and 5, to see the
effectiveness of the variable step size.
Chapter 3

Projective Spaces

Projective spaces are a fundamental construct, very useful both in algebraic geometry and in the practical implementation of polynomial continuation. They simplify theorems by sewing up infinity, compactifying Euclidean space so that points at infinity become just like ordinary points. This allows us to more conveniently make accurate statements about the number of roots of polynomial systems. Furthermore, in the numerical context, this has the benefit of allowing "solutions at infinity" to be computed as easily as finite ones. The concept of solutions at infinity and why one would wish to compute them will also be covered in this chapter.

3.1 Motivation: Quadratic Equations

To motivate the introduction of projective space, let's begin with the very familiar
quadratic equation in one variable, x,
ax^2 + bx + c = 0,   (3.1.1)
which has two solutions given by the quadratic formula:

x = (−b ± √(b^2 − 4ac)) / (2a).   (3.1.2)
Of course, this is not quite the whole story, for if we wish to be precise, we must
add the caveats:
• if b^2 − 4ac = 0, then there is just one (double) root, x = −b/(2a);
• if a = 0, b ≠ 0, there is just one root, x = −c/b;
• if a = b = 0, c ≠ 0, there is no solution; and
• if a = b = c = 0, the solution is all x ∈ C.
There are two ways to simplify the situation. One way is to exclude all but one
of the special cases by observing that if a = 0, we don't really have a quadratic
equation, but something of lower degree. Accordingly, we may say that a quadratic
equation with nonzero coefficient on x^2 has two roots, possibly equal, given by


Equation 3.1.2. This is correct and simple, but it merely sidesteps the exceptions.
Moreover, we will have a completely separate statement for linear equations that
says nothing about the connection between linear and quadratic equations, even
though it is clear that the set of quadratic equations parameterized by the coeffi-
cients (a, b, c) includes linear equations.
There is an associated concern in numerical work: what should we do if a is
very small? Careful analysis of the quadratic formula shows that as a → 0, one root approaches −c/b while the other root diverges to infinity. Is there a well-behaved
numerical representation of the large root?
A second way to simplify the situation by formulating the solution of the
quadratic equation in terms of projective space addresses these concerns. We replace
x by the ratio u/v and clear denominators to obtain the homogeneous polynomial

au^2 + buv + cv^2 = 0.   (3.1.3)

Because of the homogeneity, if (u, v) satisfies Equation 3.1.3, then so does (λu, λv) for any λ ∈ C, and as long as v ≠ 0, these give the same value of x = u/v. We use the notation [u, v] ≠ [0, 0] to denote all pairs (u′, v′) ≠ (0, 0) such that (u′, v′) = (λu, λv) for some nonzero λ ∈ C. We call the space of all nonzero [u, v] the one-dimensional complex projective space, denoted P^1, and we call [u, v] the homogeneous coordinates of P^1. Points [u, v] with v ≠ 0 are said to be "finite," whereas the point with v = 0 is said to be "at infinity." (There is only one point, [u, v] = [1, 0], at infinity in P^1.)
With this notation, we see that for a = 0, b ≠ 0, Equation 3.1.3 factors as (bu + cv)v = 0, so there are two roots: [u, v] = {[−c, b], [1, 0]}. The first gives the same x = −c/b as we had before, while the second is a root "at infinity." Similarly, for a = b = 0, c ≠ 0, we have cv^2 = 0, which implies a double root at infinity [u, v] = [1, 0]. Note that b^2 − 4ac = 0 for this case. Accordingly, we may eliminate two of our former caveats to say that in projective space, the homogeneous quadratic equation, au^2 + buv + cv^2 = 0, has two roots for general a, b, c, one double root for b^2 − 4ac = 0, and all [u, v] ∈ P^1 when a = b = c = 0. This is certainly more succinct than our first statement in the opening paragraph of this section, while still covering all the cases. This is because roots at infinity have become just like any other roots.
In homogeneous coordinates, the quadratic formula can be written in many
equivalent ways, since only the ratio of u to v matters. The following formulae
agree everywhere that they are well defined:

[u, v] = [−b ± √(b^2 − 4ac), 2a],   if a ≠ 0;   (3.1.4)

[u, v] = [−b − √(b^2 − 4ac), 2a] or [2c, −b − √(b^2 − 4ac)],   if b ≠ 0;   (3.1.5)

[u, v] = [2c, −b ± √(b^2 − 4ac)],   if c ≠ 0.   (3.1.6)

For every (a, b, c) ≠ (0, 0, 0), at least one of these formulae is well defined. These are
also useful for accurately computing numerical values of roots in the neighborhood
of infinity.
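As a sketch of how these formulas are used in practice, the following Python function (a hypothetical helper, not from HomLab) returns both roots as homogeneous pairs. It combines (3.1.4)-(3.1.6), choosing the sign of the square root so that the combination −b − d does not suffer cancellation, the standard trick for the quadratic formula; it assumes (a, b) ≠ (0, 0):

```python
import cmath

# Roots of a*u^2 + b*u*v + c*v^2 = 0 in homogeneous coordinates [u, v].
# The first root uses formula (3.1.4)-style coordinates [-b-d, 2a], the
# second the (3.1.6)-style pair [2c, -b-d]; together they stay accurate
# even when a is tiny.  Assumes (a, b) != (0, 0).
def proj_quadratic_roots(a, b, c):
    d = cmath.sqrt(b * b - 4 * a * c)
    if abs(b - d) > abs(b + d):   # pick the non-cancelling sign
        d = -d
    return [(-b - d, 2 * a), (2 * c, -b - d)]

# A nearly linear quadratic: the finite root stays accurate, and the
# large root is a well-scaled homogeneous pair instead of an overflow.
r_inf, r_fin = proj_quadratic_roots(1e-14, 1.0, -1.0)
print(r_fin[0] / r_fin[1])   # close to 1
print(r_inf)                 # [u, v] with tiny v: a root near infinity
```

As a → 0 the pair r_inf tends continuously to the point at infinity [1, 0], which is exactly the well-behaved numerical representation of the diverging root asked for in § 3.1.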
Projective Spaces 29

Of even greater importance for our larger goal of treating general polynomial
systems is the fact that homogeneous coordinates allow the continuation method to
track solution paths to infinity without any numerical difficulty. We will return to
this in § 3.7 after first discussing projective spaces more thoroughly.

3.2 Definition of Projective Space

We have already defined P^1 in the foregoing example. The concept generalizes


straightforwardly to any dimension as follows.
Definition 3.2.1 (Projective Space) N-dimensional complex projective space, denoted P^N, is the space of complex lines through the origin in C^(N+1). Points in P^N are given by (N + 1)-tuples of complex numbers [z_0, ..., z_N], not all zero, with the equivalence relation given by [z_0, ..., z_N] ~ [z′_0, ..., z′_N] if and only if there is a nonzero complex number λ such that z′_j = λz_j for j = 0, ..., N.

The definition makes sense, because a line through the origin in C^(N+1) is a set of the form

{(λz_0, ..., λz_N) ∈ C^(N+1) | λ ∈ C}

with not all the z_i zero. The z_i occurring within the brackets [z_0, ..., z_N] are called homogeneous coordinates, even though they are not coordinates on P^N, but rather coordinates on C^(N+1).
To put the structure of a complex manifold and hence also the structure of a topological space on P^N, we specify coordinate charts. We define the sets

U_i := {[z_0, ..., z_N] ∈ P^N | z_i ≠ 0}.

On U_i the ratios z_j/z_i of the homogeneous coordinates z_i, z_j are well-defined functions that can be used to identify U_i with C^N. Indeed, we identify C^N with U_0 by the map (z_{0,1}, ..., z_{0,N}) → [1, z_{0,1}, ..., z_{0,N}], and for other i, we identify C^N with U_i by the map (z_{i,0}, ..., z_{i,i−1}, z_{i,i+1}, ..., z_{i,N}) → [z_{i,0}, ..., z_{i,i−1}, 1, z_{i,i+1}, ..., z_{i,N}], where we make the obvious modification for i = N. The transition functions between U_i \ {z_{i,j} = 0} and U_j \ {z_{j,i} = 0} for i ≠ j are given by z_{j,k} = z_{i,k}/z_{i,j} for all k. Here we follow the convention z_{i,i} = 1 for any i.
One way to think of projective space P^N is as C^N with infinity a slit filled by P^(N−1). In other words, we have the following.
• P^0 consists of a single point, C^0 = [1].
• P^1 has the chart U_0 given by (w) → [1, w] and the chart U_1 given by the map (z) → [z, 1], and the transition function z = 1/w. U_0 is thus identified with C^1 and covers all of P^1 except the point [0, 1]. So we have that P^1 = C^1 ∪ C^0 = U_0 ∪ (U_1 \ U_0) = {w ∈ C} ∪ {z = 0}.

• P^N is the disjoint union P^N = C^N ∪ C^(N−1) ∪ ··· ∪ C^0 given by

P^N = U_0 ∪ (U_1 \ U_0) ∪ ··· ∪ (U_N \ (U_0 ∪ ··· ∪ U_{N−1})).

Thus, we make C^1 compact by adding a single point at infinity to form P^1. Similarly, we make C^2 compact by adding a line at infinity to form P^2, the line at infinity being itself compactified as P^1.

In the next two sections, we solidify concepts by discussing P^1 and P^2 in greater detail.
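Numerically, the decomposition above also tells us how to choose a chart for a point: if i is the first index with z_i ≠ 0, the point lies in U_i \ (U_0 ∪ ··· ∪ U_{i−1}) and can be dehomogenized there. A small illustrative Python sketch (the function name is hypothetical, not a HomLab routine):

```python
# Dehomogenize a point of P^N onto the chart U_i, where i is the first
# index with z_i != 0: divide through by z_i and drop the resulting 1.
def dehomogenize(z, tol=0.0):
    i = next(k for k, zk in enumerate(z) if abs(zk) > tol)
    scaled = [zk / z[i] for zk in z]
    return i, scaled[:i] + scaled[i + 1:]

# [0, 2, 4] has z_0 = 0, so it lies on the P^(N-1) at infinity and is
# dehomogenized on the chart U_1 instead.
print(dehomogenize([0.0, 2.0, 4.0]))
```

In floating point one would normally pick i to maximize |z_i| rather than take the first nonzero entry, but the stratification logic is the same.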

3.3 The Projective Line P^1

First, let's clarify a bit of terminology. Historically, C^1 has often been called the "complex plane," because we may identify each point x ∈ C^1 with a point in the real plane R^2 by sending x to (Re(x), Im(x)). To avoid confusion, we prefer the terms "Argand plane" or "Argand diagram" for this construction. We say that C^1 is a line, having one complex dimension, in analogy to the real line R^1, which has one real dimension. We will always have to keep in mind that the n-dimensional complex Euclidean space C^n is isomorphic to R^(2n), the 2n-dimensional real Euclidean space.
We have seen above that P^1 is the union of the complex line C^1 with a single point, H_∞ := [0, 1], which we call the point at infinity. We may visualize the real part of P^1, as in Figure 3.1, by plotting real [u, v] as a line through the origin in R^2 through the point (u, v). Several such lines are shown, with [1, 0] as the horizontal line and [0, 1] as the vertical line.
Figure 3.1 also shows the chart U_0 represented by the dashed vertical line (1, w) as w varies. We see that there is one point of intersection of this line with each line through the origin except for the line [0, 1], which is the point at infinity that we must add to complete P^1. It is just as valid, however, to consider P^1 as the union of U_1, shown here as the horizontal line (z, 1) as z varies, with the single point [1, 0]. In fact, any inhomogeneous line αu + βv = 1 cuts each line through the origin exactly once, except for the line αu + βv = 0. That is, we may view P^1 as the line αu + βv = 1 union the single point [β, −α]. Such a line, labeled U′, is shown in the figure. Each such line is a "Euclidean patch" that covers all of P^1 except one point. When performing a calculation in P^1, we are free to choose any patch that is convenient as long as the answer we seek is not the point missing from that patch. For example, the line L, which happens to represent the point [cos π/8, sin π/8], intersects each of the patches U_0, U_1, and U′ in one point, whose coordinates will be the numerical representation of this projective point on that patch. This will turn out to be very useful in § 3.7 below.
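A patch computation is just a rescaling of the homogeneous pair. The sketch below (illustrative Python; the patch weights are arbitrary choices, not from HomLab) represents the point L on three patches:

```python
import math

# Representing a projective point [u, v] on the Euclidean patch
# alpha*u + beta*v = 1: rescale the homogeneous pair onto the patch.
# The one point missing from the patch has alpha*u + beta*v = 0.
def on_patch(u, v, alpha, beta):
    s = alpha * u + beta * v
    if abs(s) < 1e-14:
        raise ValueError("point is the patch's missing point")
    return (u / s, v / s)

u, v = math.cos(math.pi / 8), math.sin(math.pi / 8)  # the point L of Figure 3.1
p0 = on_patch(u, v, 1.0, 0.0)   # patch U0: coordinates (1, w)
p1 = on_patch(u, v, 0.0, 1.0)   # patch U1: coordinates (z, 1)
pp = on_patch(u, v, 0.5, 0.5)   # a patch U' with alpha*u + beta*v = 1
# All three represent the same projective point: the ratio u/v agrees.
print(p0, p1, pp)
```

Whatever patch is used, only the ratio u : v carries projective meaning, which is why the answer can be moved freely between patches during a computation.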
The Riemann sphere uses stereographic projection to visualize P^1 more fully, showing all of P^1, not just its real part. It allows us to draw P^1, which is a manifold having two real dimensions, as a surface in real three-dimensional space. This

Fig. 3.1 Real part of P^1, showing three patches and the real part of the Riemann sphere.

construction does not readily generalize to higher-dimensional projective spaces,


and it plays no essential role in our numerical work. However, it is useful for
gaining a better understanding of P 1 , so we include it here.
The construction of the Riemann sphere is illustrated in Figure 3.2. The part of the sphere that corresponds to the real line is shown as the dotted circle in Figure 3.1, tangent to [0, 1] at the origin. Each real line through the origin, except the line [0, 1], intersects the circle at the origin and in one additional point. We set up a correspondence between the nonzero point and the line. The line [0, 1] meets the circle in a double point at the origin, so we say that the origin corresponds to that line. This takes care of the real line inside of P^1. To get the full picture, we extend into three dimensions. Suppose w = a + bi ∈ C^1, with a, b real. Figure 3.2 illustrates the chart U_0, which maps (w) to [1, w], by plotting it in three (real) dimensional space as the point (1, a, b). That is, we send each point in P^1 \ [0, 1] to the plane x = 1 in R^3. Figure 3.2(a) shows the real slice from Figure 3.1 laid on its side, Figure 3.2(b) shows a similar imaginary slice standing up, and Figure 3.2(c) combines the two. We see that the plane x = 1 intersects almost every real line through the origin in R^3 in one point, just missing those passing through points of the form (0, a, b). But all of those missed lines are just the same point in projective space, [0, 1]. The dashed circles are great circles of the Riemann sphere, lying tangent at the origin to the plane x = 0. The correspondence between a point (1, a, b) and a point on the sphere is analogous to the real slice but rotated into the plane containing the origin, (1, 0, 0) and (1, a, b). Moreover, all the lines through the origin and a point of the form (0, a, b) meet the sphere at the origin. That is, the points of the sphere and the points of P^1 are in one-to-one correspondence, and in fact P^1 is topologically equivalent to the real two-sphere S^2.

Fig. 3.2 Riemann sphere construction: (a) real slice; (b) imaginary slice; (c) real and imaginary slices combined.

3.4 The Projective Plane P^2

The projective plane, P^2, is a compactification of the complex plane C^2. There are several natural ways of adding points at infinity to compactify C^2, but P^2 stands out as the "simplest" and arguably the most useful in general. One sees that C^2 is equivalent to R^4 just by taking real and imaginary parts. In contrast, P^2 is a manifold of four real dimensions for which there is no easy visualization. This limitation of our three-dimensional minds notwithstanding, by the definitions we have already given, it is easy to see that P^2 is C^2 with a projective line, P^1, added at infinity.
Just to clarify, let's restate the general construction for the specific case of P^2. It is defined as the set of all triples [z_0, z_1, z_2] of complex numbers except [0, 0, 0], subject to the equivalence of [z_0, z_1, z_2] and [z′_0, z′_1, z′_2] if there is a nonzero complex number λ such that z′_i = λz_i for i = 0, 1, 2. There is a one-to-one correspondence of points [z_0, z_1, z_2] ∈ P^2 with lines in C^3 that contain the origin. Indeed, simply make the association

[z_0, z_1, z_2] ↔ {(λz_0, λz_1, λz_2) ∈ C^3 | λ ∈ C}.

We can identify C^2 with a subset U_0 of P^2 by sending (x_1, x_2) ∈ C^2 to [1, x_1, x_2] ∈ P^2. The remaining portion of the space, P^2 \ U_0, is the set of triples [0, z_1, z_2] satisfying

(1) not both of z_1 and z_2 are zero; and
(2) [0, z_1, z_2] and [0, z′_1, z′_2] are equivalent if there is a nonzero complex number λ such that z′_i = λz_i for i = 1, 2.

We can identify P^2 \ U_0 with P^1 via the association [0, z_1, z_2] ↔ [z_1, z_2]. We call H_∞ := P^2 \ U_0 the line at infinity.
The importance of P^2 and its homogeneous coordinates [z_0, z_1, z_2] lies in the fact that if p(z_0, z_1, z_2) is a homogeneous polynomial, i.e., a polynomial where each monomial term has the same total degree, then the zero set of p(z_0, z_1, z_2) is well defined on P^2. For example, if p(z_0, z_1, z_2) = z_0^2 + z_1 z_2, then p(λz_0, λz_1, λz_2) = λ^2 p(z_0, z_1, z_2). Moreover, given the zero set C ⊂ C^2 =: U_0 ⊂ P^2 of a polynomial p(x_1, x_2) of degree d, the closure of C in P^2 is the zero set of the homogenized polynomial

P(z_0, z_1, z_2) = z_0^d p(z_1/z_0, z_2/z_0),

which by abuse of notation we typically write p(z_0, z_1, z_2). When we say a polynomial has degree d, we assume that there is at least one term of the polynomial with degree d.
Homogeneous coordinates are very well adapted for computations, as we shall see in the next section. As a simple illustration, let's consider the intersection of two parallel lines, given as

p(x, y) = (ax + by + c, ax + by + d) = 0.   (3.4.7)

Two general lines in C^2 intersect in a single point, but these parallel lines either coincide, if c = d, or they do not meet, if c ≠ d. Homogenizing with x = z_1/z_0, y = z_2/z_0, one has

p(z_0, z_1, z_2) = (a z_1 + b z_2 + c z_0, a z_1 + b z_2 + d z_0) = 0.   (3.4.8)

Assuming that at least one of a, b is nonzero, we now find the solution [0, −b, a]; that is, the lines meet at infinity. The line at infinity, H_∞, meets each finite line in a point, and any two lines passing through a given point on H_∞ are all parallel. Accordingly, H_∞ is the set of slopes for the finite lines, and in the breakup we gave above for H_∞ = C^1 ∪ C^0, C^1 represents the slopes of lines that have a finite slope while the final point, C^0 = [0, 0, 1], represents the slope of a vertical line. We have the nice result that every pair of noncoincident lines in P^2 meets in a single point. For any pair of lines, parallel or not, we may write the homogenized equations in the form Az = 0, where A is a 2 × 3 matrix. Then, the solution can be computed using any of the standard methods from linear algebra, such as Gaussian elimination with column pivoting, or more robustly, the singular value decomposition.
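Concretely, a pair of parallel lines can be intersected by taking the right null vector of A from the SVD. An illustrative Python sketch (the example lines are hypothetical):

```python
import numpy as np

# Intersecting two lines in P^2: write the homogenized equations as
# A z = 0 with A a 2x3 matrix acting on z = (z0, z1, z2), and take the
# right singular vector for the smallest singular value as the null vector.
def intersect_lines(line1, line2):
    """Each line is (a, b, c) for a*x + b*y + c = 0, i.e. a*z1 + b*z2 + c*z0 = 0."""
    A = np.array([[line1[2], line1[0], line1[1]],
                  [line2[2], line2[0], line2[1]]], dtype=float)
    _, _, Vh = np.linalg.svd(A)
    return Vh[-1]              # homogeneous coordinates [z0, z1, z2]

# Parallel lines x + y + 1 = 0 and x + y + 2 = 0 meet at infinity, at
# [z0, z1, z2] proportional to [0, -b, a] = [0, -1, 1].
z = intersect_lines((1, 1, 1), (1, 1, 2))
print(z)
```

The same call handles non-parallel pairs with no special casing, which is precisely the advantage of working projectively: a point at infinity is computed exactly like a finite point.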

3.5 Projective Algebraic Sets

To speak of projective algebraic sets, we first need to define homogeneous polynomials.

Definition 3.5.1 (Homogeneous Polynomial) A polynomial f on C^(N+1) is said to be homogeneous of degree d if f(λz) = λ^d f(z) for all (λ, z) ∈ C × C^(N+1).

This is equivalent to f being expandable to a sum Σ_{|I|=d} c_I z^I, where I denotes an (N + 1)-tuple (i_0, ..., i_N) ∈ Z_{≥0}^(N+1) of nonnegative integers with |I| := i_0 + ··· + i_N, z^I := z_0^{i_0} ··· z_N^{i_N}, and c_I ∈ C. Though a homogeneous polynomial f does not have a well-defined value at most points of P^N, the set where it is zero is well defined. This follows because a point [z_0, ..., z_N] ∈ P^N is also represented by its multiples, [λz_0, ..., λz_N] with λ ∈ C*. If f(z) = 0 we have also f(λz_0, ..., λz_N) = λ^d f(z) = 0.
Given a system of n homogeneous polynomials on C^(N+1),

f(z) := [ f_1(z_1, ..., z_{N+1}) ; ... ; f_n(z_1, ..., z_{N+1}) ],   (3.5.9)

we denote the set of common solutions on P^N by

V(f_1, ..., f_n) := { z ∈ C^(N+1) | f_1(z) = 0; ... ; f_n(z) = 0 },

or more briefly V(f).
We give a name to such sets as follows.
Definition 3.5.2 (Projective Algebraic Set) A projective algebraic set is any subset of P^N whose homogeneous coordinates are the set of common solutions to a system of homogeneous polynomials on C^(N+1).
The simplest projective algebraic set is P^N itself, because it is the solution of an empty set of polynomials. Next simplest would be a hypersurface defined by a single homogeneous polynomial.
It can happen that a polynomial system comes directly from an application in
homogeneous form. More commonly, we start with an inhomogeneous polynomial
Projective Spaces 35

system and, as we did in this chapter for the quadratic equation, Equation 3.1.1, and for lines in the plane, Equation 3.4.7, we homogenize the equations in order to facilitate computing their solutions. Similar to what we did in § 3.1 for equations on C^1 and in § 3.4 for equations on C^2, we homogenize a polynomial p(x_1, ..., x_n) of degree d on C^n as

    P(z_0, z_1, ..., z_n) = z_0^d p(z_1/z_0, ..., z_n/z_0).        (3.5.10)

Indeed, if V(p) ⊂ C^n is the solution set of p = 0, then its closure in P^n is the zero set of P. When we say a polynomial has degree d, we assume that there is at least one term of the polynomial with degree d. A polynomial of multidegree (d_1, ..., d_m) is multihomogenized by applying this same procedure for each of m groups of variables.
The upshot of this is that a root of p having a large magnitude for one or more variables among x_1, ..., x_n can be represented numerically as a zero of P having moderate magnitudes for z_1, ..., z_n but a small value of z_0. For example, we may map (x_1, ..., x_n) ↦ [1/y, x_1/y, ..., x_n/y], where y = max_i |x_i|. In this manner, a solution at infinity, or in the neighborhood of infinity, becomes numerically tractable.
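The rescaling map just described takes only a few lines to implement. A small illustrative sketch (Python; the function name is ours, following the map above with y = max |x_i|):

```python
def to_homogeneous(x):
    """Map an affine point (x1,...,xn) to homogeneous coordinates
    [z0, z1, ..., zn] = [1/y, x1/y, ..., xn/y] with y = max|xi|,
    so no coordinate exceeds 1 in magnitude when y >= 1."""
    y = max(abs(xi) for xi in x)
    return [1.0 / y] + [xi / y for xi in x]

# A root far out toward infinity becomes numerically tame:
z = to_homogeneous([2.0, 1.0e12])   # z0 = 1e-12 is small but representable
```

Dehomogenizing, z_i/z_0 recovers the original affine coordinates whenever z_0 is nonzero.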

3.6 Multiprojective Space

A mild generalization of projective space will prove useful in later chapters. We wish to consider spaces that are built using projective spaces as the building blocks.
Definition 3.6.1 A multiprojective space is a cross product of projective spaces, P^(n_1) × ··· × P^(n_m). This includes the case m = 1, which is just a projective space P^(n_1). The homogeneous coordinates for such a space are the cross product of homogeneous coordinates for each projective factor, hence forming a space

    (C^(n_1+1) \ 0) × ··· × (C^(n_m+1) \ 0).

Definition 3.6.2 A multihomogeneous polynomial

    f(z_1, ..., z_m) : C^(n_1+1) × ··· × C^(n_m+1) → C

of multidegree (d_1, ..., d_m) is a polynomial such that

    f(λ_1 z_1, ..., λ_m z_m) = λ_1^(d_1) ··· λ_m^(d_m) f(z_1, ..., z_m)

for all

    ((λ_1, ..., λ_m), z_1, ..., z_m) ∈ C^m × C^(n_1+1) × ··· × C^(n_m+1).

We may also say that such a function is m-homogeneous, and the 1-homogeneous case is understood to be included. We say that a multihomogeneous polynomial f is compatible with multiprojective space X if the dimensions n_1, ..., n_m match.
36 Numerical Solution of Systems of Polynomials Arising in Engineering and Science

A multihomogeneous polynomial is just a sum of terms whose monomials all have the same multidegree (d_1, ..., d_m); that is, monomials of the form z_1^(a_1) ··· z_m^(a_m) with |a_i| = d_i. The procedure described in § 3.5 for 1-homogenizing a polynomial can be applied separately to each of the variable groups z_i to multihomogenize a polynomial.

Example 3.6.3 The polynomial x_1^2 y_1 + x_1 x_2 y_1^2 y_2^2 + y_3^4 + 1 can be looked at as having degree (2,4) in the variables x_1, x_2; y_1, y_2, y_3. Multihomogenizing with respect to this grouping gives x_1^2 y_1 y_0^3 + x_1 x_2 y_1^2 y_2^2 + x_0^2 y_3^4 + x_0^2 y_0^4.
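The group-by-group procedure can be sketched in code. In the illustration below (Python; the dictionary representation and names are our own), a polynomial is stored as a map from per-group exponent tuples to coefficients, and a homogenizing variable is prepended to each group; for concreteness we use a polynomial of degree (2,4) in two variable groups.

```python
def multihomogenize(terms, degrees):
    """terms: dict mapping tuples of per-group exponent tuples to
    coefficients, e.g. ((2, 0), (1, 0, 0)) -> 1 encodes x1^2 * y1.
    Prepend to each group the exponent of a new homogenizing variable
    so that every term reaches multidegree `degrees`."""
    out = {}
    for expts, c in terms.items():
        new = []
        for d, e in zip(degrees, expts):
            slack = d - sum(e)          # power of the homogenizing variable
            new.append((slack,) + tuple(e))
        out[tuple(new)] = c
    return out

# x1^2*y1 + x1*x2*y1^2*y2^2 + y3^4 + 1, degree (2,4) in (x1,x2; y1,y2,y3):
p = {((2, 0), (1, 0, 0)): 1,
     ((1, 1), (2, 2, 0)): 1,
     ((0, 0), (0, 0, 4)): 1,
     ((0, 0), (0, 0, 0)): 1}
P = multihomogenize(p, (2, 4))   # every term now has multidegree (2,4)
```

After the call, every key of P has group sums exactly (2,4); for instance, x1^2*y1 becomes x1^2 * y0^3 * y1.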

It would be natural to define a "multiprojective algebraic set" A as any subset of a multiprojective space X such that the multihomogeneous coordinates of A are the set of common solutions to a system of multihomogeneous polynomials compatible with X. There is no need to do this: in § A.10.2, we will see that any multiprojective space may be regarded in a natural way as a projective algebraic set, so every multiprojective algebraic set is a projective algebraic set.
Use of multiprojective space often leads to simple descriptions of important sets.
An example is the generalized eigenvalue problem

    (λA + μB)v = 0,

in which A and B, each an n × n square matrix, are known, and (λ, μ, v = (v_1, ..., v_n)) ∈ C^2 × C^n are to be found. This is a set of n homogeneous quadratics. The equations are homogeneous of bidegree (1,1), that is, of degree 1 in (λ, μ) and degree 1 in v separately, so the solution sets have a natural interpretation as sets in P^1 × P^(n-1).
could be said about eigenvalue problems, but for now we just show this as an exam-
ple, one common enough that most packages for linear algebra include a solution
method for it. To avoid confusion later, we point out that unlike in this example,
in a more general case of a multihomogeneous polynomial system, the individual
equations can have different multihomogeneous degrees.
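As a numerical illustration of the eigenvalue example above, here is a hedged sketch (Python with numpy; it assumes B is invertible, and a production code would instead call a QZ-type routine from a linear algebra package). It reduces the problem to a standard eigenvalue computation and reports solutions as points of P^1 × P^(n-1).

```python
import numpy as np

def generalized_eig(A, B):
    """Return solutions ([lam, mu], v) of (lam*A + mu*B) v = 0 as points
    of P^1 x P^(n-1), assuming B is invertible."""
    # inv(B) A v = alpha v  implies  (A - alpha*B) v = 0,
    # i.e., homogeneous eigenvalue [lam : mu] = [1 : -alpha].
    alphas, V = np.linalg.eig(np.linalg.solve(B, A))
    return [((1.0, -alpha), V[:, j]) for j, alpha in enumerate(alphas)]

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
B = np.eye(2)
solutions = generalized_eig(A, B)   # here: [lam:mu] = [1:-2] and [1:-3]
```

Each returned pair satisfies (lam*A + mu*B) v = 0, and rescaling (lam, mu) or v by nonzero constants gives the same projective solution.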

3.7 Tracking Solutions to Infinity

Let us return now to the subject of Chapter 2, that is, solving polynomial systems
by tracking the solution paths of a suitably defined homotopy h(x, t) = 0, where
h(x,l) is a starting polynomial system whose solutions are known and h(x,0) is
the target polynomial system we wish to solve. Often, this will be a linear interpo-
lation between a start system g(x) and a target system f(x), both consisting of n
polynomials in n variables, as

    h(x, t) = t g(x) + (1 - t) f(x).        (3.7.11)



In particular, we might choose g(x) as the system

    g(x) = γ [ x_1^(d_1) - 1 ; ... ; x_n^(d_n) - 1 ],        (3.7.12)

where γ is a randomly chosen complex number and d_i is the degree of the ith polynomial in f(x). The art of choosing a good homotopy is studied extensively in Part II of this book, so let's just take it on faith for now that, with probability one, the homotopy paths starting at the ∏_{i=1}^n d_i solutions to g(x) = 0 are nonsingular for t ∈ (0, 1] and the endpoints of the paths as t → 0 include all the nonsingular solution points of f(x) = 0.
The matching of the degrees in the polynomials of g(x) to those of f(x) is
an attempt to match the number of roots of the two systems, so that there are no
wasted paths. This works some of the time, but not always, and when the difference
is too great, we will make use of more sophisticated homotopies. But despite our
best efforts to match the homotopy to the problem at hand, it is very common
for the start system to have more solutions than the target system. In such cases,
the extra solutions must diverge. This causes two problems for the path tracker.
First, a diverging path has infinite arclength, which can cause the path tracker to
spend an inordinate amount of time on a futile quest. Second, as the magnitude of
the solution grows, the polynomials can no longer be accurately evaluated and all
numerical accuracy is lost.
One simple remedy mentioned in § 2.3 is to simply truncate any path whose
solution components grow too large in magnitude. This introduces an uncertainty
about setting the limit, because one never knows if the path may be heading to a
large, but finite, solution, or even if the path might reverse course and converge to
a small magnitude. Indeed, in the example from Q5 in § 2.1, we encountered a path
that approached infinity at t = 1/2 and then returned to the finite realm.
A robust way to eliminate the trouble is to homogenize the polynomials, as in Equation 3.5.10, and track the paths in P^n. Our homotopy becomes

    H(z, t) = t G(z) + (1 - t) F(z),        (3.7.13)

with G(z) as the system

    G(z) = γ [ z_1^(d_1) - z_0^(d_1) ; ... ; z_n^(d_n) - z_0^(d_n) ].        (3.7.14)

Along any path, at any value of t, we can rescale [z_0, ..., z_n] to keep the magnitudes of the homogeneous coordinates in range.
In numerical work, we want to restrict the representation of a root to just n variables at any particular moment. One way is to pick one of the variables and

set it to one. Typically we do this initially with z_0 = 1. If at any later time we find that some variable, say z_i, is growing large, we may rescale to make z_i = 1 and let z_j, j ≠ i, vary (including z_0). In other words, we can pick any of the Euclidean patches, U_0, ..., U_n, to do the computation and we can transition from one to another whenever it is advantageous to do so.
More generally, we may pick any Euclidean patch of P^n for our computations, such as the patch V illustrated in Figure 3.1. That is, we may choose coefficients a = (a_0, ..., a_n) and append a linear equation

    a_0 z_0 + a_1 z_1 + ··· + a_n z_n = 1.        (3.7.15)

Whenever any variable grows too large, we may switch patches by picking a new set of coefficients, a. We call the application of Equation 3.7.15 a projective transformation, introduced as a numerical technique in polynomial continuation by Morgan (Morgan, 1987).
By the homogeneity of H(z, t), we have that if H(z, t) = 0 then H(λz, t) = 0 for any λ. Using Equation 3.7.15, the numerical representation of the root [z] is (λz_0, ..., λz_n), where

    λ = 1/(a_0 z_0 + ··· + a_n z_n).

This representation breaks down if we happen to have chosen coefficients a such that a_0 z_0 + ··· + a_n z_n = 0. If we choose random, complex values for a, there is probability zero that we will encounter such a point in the homotopy. Thus, in
practice, it is usually sufficient to make such a random choice once at the beginning
of the continuation run, with no further monitoring to check for the need to switch
patches. However, there is little overhead involved in the checking, so for the utmost
in reliability, it is worthwhile to implement patch switching.
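The bookkeeping for Equation 3.7.15 is only a few lines. The sketch below (Python with numpy; function names and the tolerance are illustrative) normalizes a representative onto the patch a·z = 1, and draws a fresh random complex patch if the current one is nearly degenerate at [z]:

```python
import numpy as np

def normalize_on_patch(z, a):
    """Rescale homogeneous coordinates z so that a . z = 1."""
    return z / np.dot(a, z)

def check_and_switch_patch(z, a, rng, tol=1e-8):
    """If |a . z| is tiny relative to |z|, the patch is nearly degenerate
    at [z]; replace it by a new random complex patch, then normalize."""
    if abs(np.dot(a, z)) < tol * np.linalg.norm(z):
        a = rng.standard_normal(len(z)) + 1j * rng.standard_normal(len(z))
    return a, normalize_on_patch(z, a)

rng = np.random.default_rng(7)
a = rng.standard_normal(3) + 1j * rng.standard_normal(3)  # random patch
z = np.array([1.0 + 0j, -2.0, 0.5])
a, z = check_and_switch_patch(z, a, rng)                  # now a . z = 1
```

Rescaling changes only the representative, not the projective point [z], so the check can be run at every step with negligible overhead.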
This takes care of numerically tracking the path to infinity. Once we have endpoints of all the homotopy paths, we usually wish to sort the finite solutions from the solutions at infinity. In numerical work, this comes down to setting a tolerance ε_∞ and declaring any solution point with |z_0| / max_i |z_i| < ε_∞ to be at infinity. Obviously, the proper setting of this tolerance depends on the precision of the arithmetic we are using and the conditioning of the solution point. We cannot know with certainty whether the point is actually at infinity or just so close to infinity that the difference cannot be discerned at whatever level of numerical precision is in place. Increasing the precision can raise one's confidence in the judgement, but certainty can never be attained. In this respect, the numerical result gives a strong indication of the truth, but it never amounts to rigorous proof.
Let's look at an example in one variable that has roots diverging to infinity. We
can arrange this by using a starting polynomial of higher degree than the target.
In practice, we would normally use a start system of equal degree to the target,
so there would be no diverging paths. But in multivariate systems, such an exact
matching is often not possible, so that the phenomenon illustrated here is very

common. Single variable examples have the advantage that the solution path of
the homotopy can be visualized by plotting it in an Argand diagram. Examples of
multivariate homotopies with diverging paths are given in the exercises that follow.
Example 3.7.1 Choose a start system g(x) = x^3 - 1 and a target system f(x) = x + 1.5. Form the homotopy

    h(x, t) = t g + (1 - t) f = 0,

and follow the three solution paths from x = 1, x = (-1 ± i√3)/2 at t = 1 as t goes from 1 to 0. As shown at the left in Figure 3.3, two roots diverge to infinity as t → 0. Homogenizing h by substituting x = z_1/z_0 and clearing denominators, one obtains

    H(z_0, z_1, t) = t (z_1^3 - z_0^3) + (1 - t)(z_1 + 1.5 z_0) z_0^2 = 0.        (3.7.16)

On the patch z_0 = 1, the solution paths for z_1 are the same as the paths for x in the inhomogeneous homotopy. In contrast, on the patch z_1 = 1, we get the picture at the right in Figure 3.3. The roots that diverge on the left patch are now seen to approach the origin. In addition to being represented numerically by finite numbers, the paths to infinity (i.e., to z_0 = 0) also have finite arclength, so one can successfully track the entire path. Neither patch is suitable for all the roots, as the real root on the patch z_1 = 1 now goes to infinity at t = 2/3. Accordingly, let's pick a "random" complex patch:

    (0.2 + 0.8i) z_0 + (0.4 - 0.5i) z_1 = 1.        (3.7.17)

(In practice, we would use a random number generator for the coefficients of this equation, but for illustrative purposes, we keep the numbers simple here.) In this patch, the paths of both z_0 and z_1 stay finite on all of t ∈ [0,1], as shown in Figure 3.4. At t = 2/3, z_1 passes through zero on the path labeled "1," and at t = 0, z_0 reaches zero for the paths labeled "2, 3."
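To make the example concrete, here is a bare-bones numerical tracker for the homogenized homotopy (3.7.16) on the patch (3.7.17), written in Python with numpy. It steps t from 1 toward 0 with plain Newton corrections and no predictor, so it is only a toy stand-in for a real path tracker, and the step schedule and tolerances are our own choices.

```python
import numpy as np

a = np.array([0.2 + 0.8j, 0.4 - 0.5j])          # the patch of Eq. (3.7.17)

def H(z, t):
    """Homotopy (3.7.16) together with the patch equation a . z = 1."""
    z0, z1 = z
    h = t * (z1**3 - z0**3) + (1 - t) * (z1 + 1.5 * z0) * z0**2
    return np.array([h, a @ z - 1.0])

def J(z, t):
    """Jacobian of H with respect to (z0, z1)."""
    z0, z1 = z
    dh0 = -3 * t * z0**2 + (1 - t) * (2 * z0 * z1 + 4.5 * z0**2)
    dh1 = 3 * t * z1**2 + (1 - t) * z0**2
    return np.array([[dh0, dh1], a])

# Step t down to (almost) 0, correcting with a few Newton iterations.
ts = np.concatenate([np.linspace(1.0, 0.01, 500),
                     np.geomspace(0.01, 1e-6, 200)])
ends = []
for x in [np.exp(2j * np.pi * k / 3) for k in range(3)]:   # cube roots of 1
    z = np.array([1.0, x]) / (a @ np.array([1.0, x]))      # start on the patch
    for t in ts:
        for _ in range(6):                                  # Newton correction
            z = z - np.linalg.solve(J(z, t), H(z, t))
    ends.append(z)

# Classify endpoints with the tolerance test |z0| / max|zi| < eps_inf:
finite = [z for z in ends if abs(z[0]) / max(abs(z[0]), abs(z[1])) > 1e-2]
```

The run should flag two endpoints as lying at infinity and one as finite; dehomogenizing the finite one, z_1/z_0, recovers the root x = -1.5 of f(x) = x + 1.5.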

3.8 Exercises

Exercise 3.1 (Projective Transformation 1)

(1) Use the multivariate Davidenko equation (see § 2.4) to reproduce Figure 3.4 by appending the projective transformation, Equation 3.7.17, as the second equation to the homotopy, Equation 3.7.16.
(2) Use goodtrackInfty.m to reproduce Figure 3.4. What is the final value of t for the paths going to infinity? What criterion caused the path tracker to stop? What is the underlying cause?
(3) Instead of tracking the homotopy as a two variable system, one can solve the projective transformation, Equation 3.7.17, for z_0 as a function of z_1. Substitute
this into the homotopy, Equation 3.7.16, to get a homotopy in z_1 alone. What are the start points? Adapt goodtrackInfty.m to solve this homotopy. How do you recover the value of [z_0, z_1] for the endpoints?

Fig. 3.3 Solution paths of Equation 3.7.16 as t goes from 1 to 0 (real), shown on two different patches.

Fig. 3.4 Solution paths of Equation 3.7.16 as t goes from 1 to 0 (real), using a general projective transformation.

Exercise 3.2 (Projective Transformation 2) Any homotopy between polynomials in one variable having a start system with higher degree than the final one must have solution paths diverging to infinity. Multivariate systems may exhibit this phenomenon even if the degrees match. Use goodtrack.m and the projective transformation to treat the following systems. You must homogenize and then append the projective transformation. Use a random, complex γ to avoid singularities. Dehomogenize the finite roots to obtain the final solutions of the original homotopy.

• A one-variable system

    h(x, t) = γ t (x + 1)(x - 1)(x - 2) + (1 - t)(x + 2) = 0.

  Plot the three solution paths of x in an Argand diagram. Try several values of γ. How do the paths to infinity respond? Do they seem to be going to the same or different endpoints? Now, plot the paths of z_0 that resulted from the projective transformation (presuming you used x = z_1/z_0 in the homogenization step). Does this help you explain the paths of x?
• A two-variable system

    h(x, y, t) = γ t [ x^2 - 1 ; y^2 - 1 ] + (1 - t) [ xy - 1 ; x^2 - x - 2 ] = 0.

  Plot the curves xy = 1 and x^2 - x - 2 = 0 in the real x, y plane. Do the finite roots agree with your computation? How many roots at infinity do you get? Can you interpret their meaning in the context of the plot of the curves?
Exercise 3.3 (Circles) Try intersecting two circles in the x, y plane using a homotopy similar to Exercise 3.2, utilizing homogenization and the projective transformation.
(1) For two general circles, how many finite roots and how many roots at infinity
are there? Can you confirm from the homogenized equations what the roots at
infinity should be? Does the computation agree?
(2) What if the circles are concentric? Predict the outcome by studying the ho-
mogenized equations. Do the endpoints found by continuation agree with your
analysis?
Exercise 3.4 (Projective Cross-Product Spaces) Consider the cross-product space P^1 × P^1, that is, the set {(u, v) | u ∈ P^1, v ∈ P^1}. The finite portion of P^1 × P^1 is equivalent to C^2. Answer the following:

(1) Describe (P^1 × P^1) \ C^2, i.e., what is added at infinity to C^2 to form P^1 × P^1?
(2) The line ax + by + c = 0 in (x, y) ∈ C^2 is the finite portion of the line a z_1 + b z_2 + c z_0 = 0 in [z_0, z_1, z_2] ∈ P^2 under the mapping [z_0, z_1, z_2] → (z_1/z_0, z_2/z_0). It is also the finite portion of the line a u_1 v_0 + b u_0 v_1 + c u_0 v_0 = 0 in ([u_0, u_1], [v_0, v_1]) ∈ P^1 × P^1 under the mapping ([u_0, u_1], [v_0, v_1]) → (u_1/u_0, v_1/v_0). Investigate the intersection of two parallel lines under both homogenizations. Be sure to consider horizontal lines, vertical lines and lines with arbitrary slope. What do you conclude about the relationship of points at infinity of one space to the other?
Chapter 4

Genericity and Probability One

This chapter explores how one of the fundamental concepts of algebraic geometry,
genericity, is also the foundation of polynomial continuation. In an idealized model
where paths are tracked exactly and where random numbers can be generated to in-
finite precision, our homotopies can be proven to succeed "with probability one." In
the non-ideal world of floating point arithmetic and pseudo-random number genera-
tors, probability one cannot be achieved, but experience shows that high reliability
is obtained when reasonable precautions are taken. Moreover, that reliability can
be raised asymptotically close to one by increasing the precision of the calculations
and taking other steps to bring the actual numerical behavior closer to the ideal.
It is impossible to talk about generic points without introducing a few notions
from algebraic geometry.
We have various types of sets, which it is natural to refer to as algebraic sets.

Affine algebraic sets An affine algebraic set on C^N (see § 12.1 for more details) is a set defined by the vanishing of a finite number, say n, of polynomials p_1, ..., p_n ∈ C[x_1, ..., x_N]. That is, a set X ⊂ C^N defined by

    X = { (x_1, ..., x_N) ∈ C^N | p_i(x_1, ..., x_N) = 0, i = 1, ..., n }.

Projective algebraic sets Recall from § 3.2 that the set of lines through the origin in C^(N+1) is equivalent to the projective space P^N and that the zero set of any homogeneous polynomial f(x_0, x_1, ..., x_N) is a subset of P^N with homogeneous coordinates [x_0, x_1, ..., x_N], see § 3.5. Accordingly, a projective algebraic set on P^N (see Chapter 3, § 3.5, and § 12.3 for more details) is a set defined by the vanishing of a finite set of homogeneous polynomials, say p_1(x_0, x_1, ..., x_N), ..., p_n(x_0, x_1, ..., x_N). That is, a set X ⊂ P^N defined by

    X = { [x_0, x_1, ..., x_N] ∈ P^N | p_i(x_0, x_1, ..., x_N) = 0, i = 1, ..., n }.
Quasiprojective algebraic sets Sets of the form X \ (X ∩ Y), where X ⊂ P^N and Y ⊂ P^N are both projective algebraic sets, are called quasiprojective algebraic

sets. These sets include both affine algebraic sets and projective algebraic sets
(see § 12.4 for more details).

For this book, quasiprojective algebraic set and algebraic set are synonyms.
In differential geometry and topology, there is the basic notion of a manifold. This is defined precisely in § A.2.1, but for now we can use the loose definition that an n-dimensional complex manifold is a space that is locally like C^n. Not every algebraic set is a manifold, e.g., V(xy) is locally like C except at the point (0,0). A point of a quasiprojective algebraic set X with a neighborhood like C^n for some n is called a smooth point or a manifold point of X. Here the word "like" must be made precise: this will be done in § A.2.1. For now the important point to note is that the subset of manifold points X_reg of a quasiprojective algebraic set X is dense and open, and the set of singular points Sing(X) := X \ X_reg is a quasiprojective subset of X.
The most basic building block of any of the above three types of algebraic sets is an irreducible algebraic set. We say that a quasiprojective (or affine algebraic or projective algebraic) set Z is irreducible if Z_reg, the set of manifold points of Z, is connected. The dimension of an irreducible quasiprojective set Z is defined to be dim Z_reg as a complex manifold, which is half the dimension of Z_reg as a real manifold. Note that in all three cases Z_reg is quasiprojective, but if Z is projective (respectively affine) then Z_reg is not necessarily projective (respectively affine). Indeed, if Z is projective and has singularities, then Z_reg is noncompact and thus not projective. Moreover, if Z is affine, then Z_reg is affine if and only if the singularity set Z_sing of Z contains no manifold point x with the dimension of Z_sing at x less than dim Z - 2. These sorts of algebraic sets, the singular subset of a quasiprojective set, irreducibility, the natural breakup of a quasiprojective set into irreducible quasiprojective sets, and dimension are discussed in detail in Chapter 12.

4.1 Generic Points

The concept of a general point or a generic point is classical. The desire is to have something like a "random" point on a quasiprojective set which has no special properties not true for all points of the quasiprojective set. As stated, this is asking too much, but we can make the notion of generic points precise just by being a bit more careful in our language.
The crucial refinement is to restrict our attention to individual irreducible components of quasiprojective sets. Indeed, to see the necessity of this, consider V(z_1 z_2) ⊂ C^2, which is the union of two lines: z_1 = 0 and z_2 = 0. We can easily distinguish between these components, and a random point on V(z_1 z_2) must be on one or the other. A property that holds generally on one component cannot be expected to hold on the other one. An obvious example is the property that z_1 = 0, which holds at every point of the component V(z_1), but holds only for (z_1, z_2) = (0, 0) on the component V(z_2).


With this restriction, we may define the meaning of generic as follows.

Definition 4.1.1 (Generic) Let X be an irreducible quasiprojective set. Property P holds generically on X if the set of points in X that do not satisfy P is contained in a proper algebraic subset Y of X. The points in Y are called nongeneric points and their complement, the points in X \ Y, are called generic points.
As discussed in § 4.6, there are other ways to define generic, but the definition above
suits our needs.
From this definition, one sees that the term generic is only meaningful in the context of the property P in question. In many instances, the property in question is a compound one. For instance, if properties P_1 and P_2 both hold generically on X, then the compound property P = (P_1 and P_2) also holds generically on X. This is because P_1 holds on X \ Y_1 and P_2 holds on X \ Y_2, where Y_1 and Y_2 are both proper algebraic subsets of the irreducible quasiprojective set X; so P holds on X \ (Y_1 ∪ Y_2). But the union of two proper algebraic subsets is also a proper algebraic subset.
We state the following claim without proof.
Claim 4.1.2 Let f : C^n → C^N be a set of polynomial functions and let X ⊂ C^n be an irreducible quasiprojective set. Suppose property P is equivalent to the condition f(x) ≠ 0. Then, P holds generically on X if and only if it holds for at least one point of X.
The concept of generic properties is very useful. The set Y in the definition has complex codimension at least 1, equivalent to a real codimension of at least 2. It also has measure zero. Thus it is "small" from a number of perspectives. Often this captures what we want. For example, we might consider X = C and for a ∈ C ask whether the property that V(z^2 + a) has two distinct roots is true or not. It is easy to see that V(a) is the set of a ∈ C where the property fails, and so, although the property is not always true, it does hold generically. If we were to pick a at random from C, the probability that z^2 + a = 0 has two distinct roots is one. In general it may be difficult or impossible to completely describe the set where a certain property fails. Often we do not care that a property is sometimes false, and really only need to know that the set where a property holds is "large." According to Claim 4.1.2, we just need to know that the conditions for failure of the property are algebraic and that there exists a point in X for which the property holds.
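A quick numerical experiment (Python with numpy; purely illustrative) bears out the z^2 + a example: a randomly drawn complex a misses the exceptional set V(a) = {0}, so the two roots come out distinct.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal() + 1j * rng.standard_normal()  # random complex a
r1, r2 = np.roots([1.0, 0.0, a])                        # roots of z^2 + a
# With probability one, a != 0, so the two roots are distinct.
```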
Note that the definition of generic carries over naturally to real affine sets X ⊂ R^N. The main difference is that the set Y ⊂ X of exceptional points has at least real codimension 1, whereas in the complex case, it has at least real codimension 2. This difference is essential and is a major reason that we construct homotopies in complex Euclidean space instead of in real Euclidean space. A crucial advantage of complex space is reflected in the following special case of Theorem 12.4.2.

Theorem 4.1.3 The complement of a proper algebraic subset Y in an irreducible affine set X ⊂ C^N is connected. If an affine algebraic set X ⊂ C^N is connected, then it is path connected.

This proposition implies that we can always move from one generic point to another along a continuous path consisting entirely of generic points.

4.2 Example: Generic Lines

Throughout this book, we often apply the adjective "generic" to various geometric
objects, as in a "generic point" or a "generic line." The precise meaning of the
adjective always depends on the context, which we illustrate here by considering in
detail the meaning of the following statement:
The degree of a homogeneous polynomial p(x_0, x_1, x_2) is the same as the degree of the homogeneous polynomial obtained by restricting to a generic line in P^2.
In the notation of the previous section, this statement without the word generic is
our property P.
In saying a line is generic, we are implicitly referring to the set of all lines in P^2 and assuming it has some sort of algebraic structure. Then, a "generic line" is any line that is not a special exception to the statement at hand; said another way, the statement is true for all lines except those in a proper algebraic subset of the set of all lines. In the notation of the previous section, we need to show that:

• there is an irreducible algebraic set X, each point of which represents a line in P^2,
• the failure of proposition P is described by a set of algebraic equations, and
• there exists a line for which the proposition holds.
In the next few paragraphs, we show this in some detail.
Typically we can represent objects in different ways. The simplest way of representing lines on P^2 is as the solution set of a linear equation b_0 x_0 + b_1 x_1 + b_2 x_2 = 0. Lines correspond to three-tuples (b_0, b_1, b_2) ∈ C^3 with not all three coordinates 0. Since (b_0, b_1, b_2) and (b'_0, b'_1, b'_2) give the same line if and only if there is a λ ∈ C* := C \ {0} with b'_i = λ b_i for i = 0, 1, 2, we see that lines in P^2 are parameterized by points [b_0, b_1, b_2] ∈ P^2.
Since the proposition concerns the degree of the restriction of p(x_0, x_1, x_2), it is more convenient to parameterize the line by its solution points, rather than representing it by the coefficients of its equation. Suppose that two distinct points [a_10, a_11, a_12] and [a_20, a_21, a_22] are on the line. Then, the entire line in P^2 is given in parametric form as

    [z_0, z_1] → [x_0, x_1, x_2] = [z_0, z_1] · A,

where

    A = [ a_10  a_11  a_12 ;
          a_20  a_21  a_22 ].

In this manner, every line in P^2 has a representation as a 2 × 3 matrix of complex numbers.
At first sight the parameter space for the lines is C^6. This is not quite true. For a 2 × 3 matrix A to give a map from P^1 to P^2, the nullspace of the map (z_0, z_1) → (z_0, z_1) · A must be the single point (0, 0). (Otherwise, there would exist [z_0, z_1] ∈ P^1 that give [x_0, x_1, x_2] = [0, 0, 0], which is not allowed.) Thus, letting U denote the set of matrices of rank two, we have A ∈ U. Note that U is a dense open set of C^6. It is the complement of the set

    D := V(a_10 a_21 - a_11 a_20, a_10 a_22 - a_12 a_20, a_11 a_22 - a_12 a_21),

i.e., the set of common solutions of the three 2 × 2 minors of A. The set D is a typical example of an affine algebraic set, i.e., the set of solutions of a finite set of polynomials on complex Euclidean space (see § 12.1). As such, it follows from D ≠ C^6 that D is "thin" in a precise sense: e.g., its complement U := C^6 \ D is dense and open, and D is of measure zero in C^6. Moreover, D is of complex dimension at most five, which is equal to real dimension at most ten. Since U is open and dense, a generic point of C^6 lies in U. In practice this means that a six-tuple generated by a random number generator will lie in U.
But this space is six dimensional, and we have already identified the space of lines as P^2. Why are the dimensions different? Notice that given any B ∈ GL(2, C), i.e., any invertible 2 × 2 matrix B, the matrices A and B · A give maps with the same line as image in P^2. This accounts for the four extra dimensions. For genericity questions it suffices to work on U, and indeed, more often than not, we will work on larger spaces that map onto the true parameter spaces.
Now, the restriction of p(x_0, x_1, x_2) to a line is just

    g(z_0, z_1) := p((z_0, z_1) · A),

and we are trying to show that g(z_0, z_1) has degree d on a generic line. Without carrying out the algebra it is easy to see that

(1) each term in the expansion of p((z_0, z_1) · A) has degree d in (z_0, z_1);
(2) the coefficients of these terms are polynomials in the entries of A;
(3) the condition that g has degree d is equivalent to at least one coefficient being nonzero.

Let B ⊂ C[A] be the set of coefficient polynomials. The only thing that remains to check is that not all of the polynomials in B are identically zero, that is, V(B) ≠ U. It suffices to check that there is at least one line on which g has degree d. To do this, take any point [α, β, γ] with p(α, β, γ) = c ≠ 0, and choose the line [x_0, x_1, x_2] = [α z_1, β z_1, γ z_1 + z_0]. Then g(z_0, z_1) = c z_1^d + z_0 q(z_0, z_1), where q(z_0, z_1) is either zero or has degree d - 1.
The reader may confirm that an analog of the proposition holds on C^2; that is,

    The degree of a polynomial p(x_1, x_2) is the same as the degree of the polynomial obtained by restricting to a generic line in C^2.

Notice that the modifier "homogeneous" has been dropped. The demonstration follows analogously to the discussion above, replacing z_0 and x_0 by 1. The main difference is that the set of polynomials B which must vanish for nongeneric lines is no longer the set of all coefficients, but only the set of coefficients of terms having degree d in z_1.
The purpose of all this is to illustrate how the intuitive notion that "generic" means "nothing special" can be concretely reduced, in this specific case, to saying that "generic lines" are those represented by a matrix A whose entries do not satisfy the finite set of polynomials B. Usually, we will not go to such lengths to work out the precise definition of "generic" in other contexts. It is enough to know that in principle, nontrivial algebraic conditions exist whose zero sets contain the nongeneric points, and so the generic points, containing the complement of such a zero set, include a dense, open set of the ambient space. Ultimately, this comes down to knowing that all the conditions of the context are algebraic and that there is at least one point that is not special.
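The discussion can also be tested numerically. The sketch below (Python with numpy; the cubic and all names are our own) restricts a homogeneous polynomial of degree d = 3 to a random complex line. Since every term of g(z_0, z_1) = p((z_0, z_1) · A) has degree exactly d, g has degree d precisely when it is not identically zero, which we check at a random point in the spirit of the probabilistic test of § 4.3.

```python
import numpy as np

def p(x):
    """A homogeneous polynomial of degree d = 3 on C^3."""
    x0, x1, x2 = x
    return x0 * x1 * x2 - x2**3

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))  # random line

def g(z):
    """Restriction of p to the line [z0, z1] -> [z0, z1] . A."""
    return p(np.asarray(z) @ A)

z = rng.standard_normal(2) + 1j * rng.standard_normal(2)
val = g(z)    # nonzero with probability one, so deg g = 3 on this line
```

As a sanity check, g inherits the homogeneity of p: g(λz) = λ^3 g(z) for every λ.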

4.3 Probabilistic Null Test

The concept of generic points leads quite naturally to the notion of "probability-one" algorithms. Before making a general definition, let's motivate it by considering the question of whether a polynomial p(z) on C^N is zero or not.

Of course, for any p(z) of reasonable complexity, we could expand it into terms and check if any of the coefficients is nonzero. In this sense, the question may appear to be a toy problem, but it has many aspects of serious questions we face about whether a given polynomial system has some property or another. For example, given a polynomial f(z) on C^N, how can we check whether it is identically zero on an affine algebraic set X ⊂ C^N? But even the question posed on C^N is not so trivial as it may seem at first, for p(z) might be defined in straight-line fashion in a form not so easily expanded into terms; for example, it could be the determinant of a matrix whose elements are all polynomials.
To settle whether p{z) is zero, we propose choosing a random point of z* G C^,
and checking whether p(z*) = 0 or not. We wish to conclude that if p(z*) = 0, then
p is the zero polynomial and if p(z*) =£ 0, then p is not the zero polynomial. The
important observation is that if p(z) is not the zero polynomial the set V(p) is an
affine algebraic subset of C^ of codimension at least one, and in particular of real
Genericity and Probability One 49

codimension at least two. The volume of V(p) as a subset of C^N is zero relative to
the usual 2N-dimensional real Euclidean volume. So if we choose a random number
z* ∈ C^N, then except for a set of measure zero, i.e., a probability-zero event, we
have that p(z) is identically zero if and only if p(z*) = 0. Thus, we say that testing
the value p(z*) for a random z* ∈ C^N is a "probability-one" algorithm for deciding
if p(z) is the zero polynomial. This is very fast, but raises practical questions.
The worry is that a random point might be close enough to the set of nongeneric
points that numerical analysis difficulties ensue. In floating point arithmetic, p(z*)
will almost never evaluate exactly to zero even if p(z) is the zero polynomial. So in
practical work, we must replace the test "Is p(z*) = 0?" with the test "Is |p(z*)| < ε
for some small positive real ε?" Upon doing so, we face the trouble that if p(z) is
not the zero polynomial, then the region {z ∈ C^N | |p(z)| < ε} is not measure zero.
There are two ways in which the probabilistic null test can give an erroneous
answer:

False Positive: |p(z*)| < ε even though p(z) is not the zero polynomial; and
False Negative: |p(z*)| ≥ ε even though p(z) is zero.

False negatives are the result of numerical error only, because if p(z) is identically
zero, the random pick of z* cannot land on a mathematical exception. This is not
true for false positives, where by chance we might pick a z* close to a solution of
the equation p(z) = 0 even though p(z) is not identically zero.
The chance of a false positive can be reduced by testing more than one random
point. Suppose that for a given ε, there is a false positive rate, neglecting numerical
error, of r. This rate depends only on the set {z ∈ C^N | |p(z)| < ε} and the
distribution from which we draw the random test point z*. Suppose we test twice
with independent random test points, and declare p to be zero only if both tests
indicate so. Then, the false positive rate neglecting numerical error declines to r².
Consider a polynomial given as the determinant of a matrix with polynomial
entries, say, p(z) = det M(z), z ∈ C^N. Instead of expanding the determinant, the
probabilistic null test is to simply evaluate the elements of M at a random point
z* ∈ C^N and check if M(z*) is a singular matrix. It is well-known that instead
of simply evaluating det M(z*), a safer test for singularity is to use the singular value
decomposition. Suppose that M(z) does represent a singular matrix for all z, so
p(z) is the zero polynomial. Typically, neither numerical evaluation of det M(z*)
nor numerical determination of the smallest singular value of M(z*) will return an
exact value of zero; instead we will get a value which is at best a small multiple
of machine precision. We must make a judgement of how small the result must be
before we declare that M(z) is singular. This gets to the heart of the matter: we
cannot know with certainty using floating point arithmetic that p(z*) = 0, but by
raising the number of digits used in the computation, we can make the uncertainty
in the conclusion arbitrarily small.
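A hedged Python/NumPy sketch of this determinant test (our own illustration; the two matrices, the random point, and the 1e-8 tolerance are invented for the example): M(z) = v(z)v(z)^T is rank one, hence singular, for every z, while the second matrix is generically nonsingular.

```python
import numpy as np

rng = np.random.default_rng(0)

def M_singular(z):
    # v(z) v(z)^T has rank 1 for every z, so det M(z) is the zero polynomial
    v = np.array([1 + z, z ** 2, 3 - 2 * z])
    return np.outer(v, v)

def M_regular(z):
    # det = (1+z)(3-z) - 2z^2, which is not identically zero
    return np.array([[1 + z, 2], [z ** 2, 3 - z]])

def probabilistic_null_test(M, z_star, tol=1e-8):
    # the smallest singular value is a safer singularity indicator than det
    return np.linalg.svd(M(z_star), compute_uv=False)[-1] < tol

z_star = complex(rng.standard_normal(), rng.standard_normal())
print(probabilistic_null_test(M_singular, z_star))  # True: identically singular
print(probabilistic_null_test(M_regular, z_star))   # False, with probability one
```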
In short, under the assumption of exact arithmetic and a random number generator
of infinite precision, the probabilistic null test will give correct answers with
probability one. In floating point arithmetic, this ideal is not achieved, but the
probability of false answers can be made arbitrarily close to zero by increasing pre-
cision.
For example, fix a positive integer d and positive real numbers M and R, and
consider the monic polynomials p(z) of degree d on the region {z ∈ C | |z| ≤ R}
with all coefficients bounded in absolute value by M. Then the probability of
false positives and false negatives in the probabilistic null test goes to zero as the
number of digits increases. This follows by combining the fact that we can choose ε
smaller and smaller in such a way that the absolute error of evaluating p(z) is less
than ε with the bound on the area of {z ∈ C | |p(z)| < ε} given in Lemma 5.3.2.
Analogously, on C^N the probability of false positives and false negatives in the
probabilistic null test goes to zero as the number of digits increases. As in the case
of one variable we need to have some limits on our data for this to hold, e.g., it
suffices to fix a positive integer d and positive numbers M, R and restrict to

• z = (z_1, ..., z_N) ∈ C^N with max{|z_1|, ..., |z_N|} ≤ R;
• those polynomials p(z) of degree d on C^N having all coefficients bounded in
  absolute value by M and at least one term of the form z_i^d for some i.

4.4 Algebraic Probability One

From the discussion of the probabilistic null test, one sees that the idea of "generic"
translates directly to randomized algorithms that succeed "with probability one."
While this is exactly true in a mathematical sense, in floating point arithmetic,
probability one is an ideal that is only attained in the limit as the arithmetic is
extended to an infinite number of digits, consuming infinite computer time and
memory! The success of such an approach in practice depends on careful consid-
eration of numerical processes and benefits greatly if the mathematical functions
under consideration are mildly behaved. In this respect, algebraic questions have
properties not generally enjoyed in other mathematical domains. For this reason,
we declare the following equivalence.

Definition 4.4.1 (Algebraic Probability One) Suppose property P holds generically
on an irreducible algebraic set X. Then we say that P holds with algebraic
probability one for a random point of X.

In this manner, we will also speak of "algebraic probability-one algorithms,"
meaning algorithms whose correctness, ignoring numerical error, depends only on
some choice of parameter being generic. Since the scope of this book is limited to
algebraic systems, we often drop the modifier, using "probability one" in place of
"algebraic probability one."

Even though we often drop the adjective "algebraic," the distinction is mean-
ingful. Consider a proposition P that holds for irrational real numbers, but fails
for rational ones. It is known that although the rational numbers are dense in the
real line (there is a rational number between any two given real numbers), they
are also countable and hence measure zero. In this sense, a random number drawn
uniformly from the real interval [0,1] has a zero probability of being rational. One
could then imagine a test for the truth of P based on testing it at a random point.
But this becomes utter nonsense in floating point computations, where every num-
ber represented on the computer is rational! We can only draw test points from
the rational numbers, so we cannot test P on any irrational number, let alone a
random one.
We are in a much stronger position when treating algebraic systems, as illus-
trated in the following simple theorem.

Theorem 4.4.2 If proposition P holds generically on C, the exceptions to P are
finite in number.

Proof. This follows from the definition of generic and the fact that a polynomial
in one variable has a finite number of roots. □

4.5 Numerical Certainty

In the probabilistic null test for polynomial p(z), two sources of uncertainty come
into play: the random selection of a test point z* and numerical error in evaluating
p(z*). Intuitively, if p(z*) is far from zero, we feel very secure in concluding that
p(z) is not identically zero. It is only when p(z*) is small that doubts enter in. But
how small is small? That is, if our test is "Is |p(z*)| < ε?", how do we pick ε? And
can we ever have certainty in our conclusion?
One can attain certainty in many instances. If we can establish bounds on the
round-off errors in the calculations and find a z* such that |p(z*)| is bounded away
from zero, then we know with certainty that p(z) is not identically zero. It would be
onerous to derive bounds for every situation that arises, but fortunately, methods
exist for automating the process. In particular, interval arithmetic can be used for
this purpose. The idea is that each number in a sequence of arithmetic operations is
replaced by an interval guaranteed to contain the exact result. To ensure this, each
arithmetic operation rounds down the lower limit and rounds up the upper limit
according to strict rules. In a complex version of this, numbers become rectangular
regions in the complex plane (i.e., a cross product of a real interval and an imaginary
interval). If the region computed for p(z*) does not include 0, then one knows with
certainty that p(z) is not zero.
This eliminates the question of deciding a value for e, by changing the question
to "Does the interval value of p(z*) include zero?" If it does not, we have a certain
conclusion. False negatives are therefore eliminated. However, if it does include
zero, we still do not know whether it might be a false positive due to overconservative
estimates of the error interval or due to an unlucky random choice. Increasing
precision and checking independent random points may turn a false positive into a
certain negative. But if p(z) is identically zero, we can never determine this with
certainty. We can, however, make the probability of false positives vanishingly small.
In practice, we do not usually employ the rigorous methodology of interval
arithmetic. If the computations are lengthy, the final error bounds can be very
pessimistic, accumulating the worst case for every intermediate stage. The extra
computation can be a burden as well. With a little good judgement in picking
ε, the uncertain approach yields good results. This approach values getting the
correct answer with high probability quickly over rigor in distinguishing between
certain and uncertain results. Mathematical proof, when required, is usually best
sought in other ways. We can obtain very strong conjectures to guide the search
for such proof.

4.6 Other Approaches to Genericity

There is a classical approach to "generic points," which is espoused rigorously by
(Weil, 1962) and in a simplified form in (page 2 Mumford, 1995). It forms the
framework for Weil's approach to algebraic geometry. In characteristic zero, which
is where complex and real algebraic geometry mainly sit, the idea is this. In a given
discussion, a large but at most finite number of polynomial equations p_1, ..., p_m
arise. Take all the coefficients of these polynomials and adjoin them to the rational
numbers to produce a field K of finite transcendence degree over Q. For example,
add √2 to Q to get all the numbers of the form a + b√2 with a, b ∈ Q. Let Ω be
a field extension of infinite transcendence degree over the algebraic closure of K,
e.g., if we started with Q, we could make the classical choice Ω := C. Now given
a set of polynomials f_1, ..., f_n ∈ K[z_1, ..., z_N] generating an ideal I whose radical
√I ⊂ Ω[z_1, ..., z_N] is prime, a generic point for V(f_1, ..., f_n) ⊂ Ω^N is a point
τ ∈ V(f_1, ..., f_n) with the property that if q(z) ∈ K[z_1, ..., z_N] is zero on τ, then
q(z) belongs to the radical √I.
Even though this classical approach (with its careful attention to fields of defin-
ition and a "universal field") seems somewhat far from our notion of generic point,
the use of this approach is very close to the use we make of generic points in this
book. For example, in (Chapter 16.3 van der Waerden, 1949), the criterion is given
that for an algebraic function to vanish on an irreducible affine algebraic set, it is
necessary and sufficient that it vanish at a generic point. If a property holds "gener-
ically" in the sense of § 4.1 for points of an irreducible quasiprojective algebraic set,
then it holds "generically" also for this classical approach.
Another variant of the concept of generic is to replace Y with countable unions
of proper algebraic subsets. You give up the openness of U in the Zariski topology,
but the theory is basically the same. We do not ever need this generality.
We refer to (Sommese & Wampler, 1996) where generic points were introduced
numerically and some different approaches are contrasted in more detail.

4.7 Final Remarks

Though our experience with solving systems of polynomials using probabilistic al-
gorithms has been very good, more research needs to be done on quantifying how
secure we are in using probability-one algorithms. In such an endeavor, more quan-
titative measures of the size of numerically bad sets are needed. The remarks in
§ 5.3 discuss some of the numerical issues involved in deciding whether a point
x ∈ C^N is a zero of p(z). We know that the model we are using is good for a range
of degrees and dimensions dependent on the number of digits we use. As use of
these algorithms spreads and applications are made well outside of the ranges so
far considered, it will be useful to have more than rules of thumb for the behavior
of this dependence.

4.8 Exercises

In the following exercises, "random normal" means a Gaussian distribution with
zero mean and unit variance. "Complex random normal" means that the real and
imaginary parts are each independent random normals. In Matlab, one can produce
an n × m matrix of such variables using the command

randn(n,m) + 1i*randn(n,m).

Exercise 4.1 (Generic Circles) Interpret the statement: two generic circles in
the plane meet in exactly two distinct finite points. Prove it.

Exercise 4.2 (Nonsingular Matrices)

(1) Prove the statement: n × n matrices are generically nonsingular.


(2) The expression p(M) = det M is a polynomial in the entries of matrix M. The
    probabilistic null test on p consists of choosing a random M* and checking "Is
    |p(M*)| < ε?" Use Matlab to generate a large number of trials, perhaps 10,000,
    and plot a histogram of log10(abs(det(M))), where M is a 2 × 2 matrix whose
    entries are complex random normals. What does your result imply about how
    the probability of false positives depends on ε?
(3) Repeat (2) using the condition number instead of the determinant.
(4) Do similar experiments for the condition number of larger n × n matrices. How
    does the variance in the result relate to n?
(5) Try different distributions for the elements of M, such as uniform on [0,1],
    uniform on [−1,1] × [−i,i], and uniform on the unit-magnitude circle in the
complex plane. What effect, if any, do these have on the probability of false
positives?
Exercise 4.3 (Singular Matrices) The expression det(AA^T), where A has more
rows than columns, is an identically zero polynomial in the elements of A. The
following experiments explore the effectiveness of the probabilistic null test on such
expressions.
(1) Form a singular 2 × 2 matrix M by generating a random 2 × 1 vector a and
    setting M = aa^T. Let the elements of a be complex random normal. Plot a
    histogram of log10(1e-20 + abs(det(M))). What is the largest observed value?
    How does this relate to false negatives in the probabilistic null test?
(2) Perform a similar experiment for n × n matrices M = AA^T, where A is n × (n−1)
    and complex random normal.
(3) Compare these results to those of Exercise 4.2.2. Does there exist an ε so that
    the null test "Is |det M| < ε?" gives a correct answer in all your tests? Does the
    size of the matrix matter? Why?
Exercise 4.4 (Null Tests on Random Polynomials) Experiment with the
probabilistic null test on randomly generated polynomials of degree d, d = 1, 2, 3, 4.
Pick d roots r_i, i = 1, ..., d and a test point x, all complex random normal, and let
p = ∏_{i=1}^d (x − r_i). Notice that considering p as a polynomial in x, it is never the
zero polynomial, because it has leading term x^d.
(1) For d = 1, show that Prob(|p| < a) = 1 − e^{−a²/4}. (Hint: the sum of two
    normal distributions is normal, and the sum of two squared unit normals is a
    chi-squared distribution.)
(2) Estimate Prob(|p| < a) for d = 1, 2, 3, 4 by numerical experiment.
(3) Plot the experimental data and overlay the theoretical result for d = 1 for
    comparison.
(4) What is the behavior of Prob(|p| < a) for small a? How does this relate to the
    probability of false positives in the probabilistic null test?
Chapter 5

Polynomials of One Variable

This chapter presents three interrelated but distinctly different perspectives on poly-
nomials in one variable: their algebraic properties, the analytic behavior of their
roots, and their numerical behavior when evaluated in floating point arithmetic.
The algebraic picture is important as a precursor to more general results for multi-
variate systems. Each algebraic result for one variable polynomials may be viewed
as a special case of the more complicated set of possibilities that arise in the mul-
tivariate situation. The analytic and numerical pictures do not generalize quite so
readily, although, as demonstrated in the short discussion of growth estimates, one
may sometimes gain insight into the multivariate case by considering a multivariate
polynomial as a polynomial in one variable with coefficients that are polynomials
in the remaining variables. Let us begin with the algebraic point of view.

5.1 Some Algebraic Facts about Polynomials of One Complex Variable

We have already gained considerable experience with polynomials of one variable in
earlier chapters, and we have even seen how to solve them with eigenvalue methods
and continuation. However, these earlier presentations took for granted certain
algebraic facts that have waited until now for a definitive statement.
For a polynomial of one variable p(z) ∈ C[z], the structure of the solution set of
p(z) = 0 is a simple consequence of the Fundamental Theorem of Algebra, which
states that
Theorem 5.1.1 Any polynomial p(z) = a_0 z^d + ··· + a_d ∈ C[z], where d is a
positive integer, the a_i are complex numbers, and a_0 ≠ 0, factors; that is,

p(z) = a_0 ∏_{i=1}^k (z − x_i)^{d_i},

where the x_i are distinct complex numbers and the d_i are positive integers satisfying
d = d_1 + ··· + d_k.


In this case, the set V(p) defined by {z ∈ C | p(z) = 0} consists of the k points
x_1, ..., x_k, which are the k irreducible components of V(p). This set is the simplest
example of an affine algebraic set, i.e., a closed algebraic subset of complex Euclidean
space (see § 12.1 for a precise definition). The description V(p) = {x_1} ∪ ··· ∪ {x_k}
is a special case of the irreducible decomposition, see § 12.2. The multiplicity of a
root x_i of p(z) is the integer d_i > 0 occurring in the factorization of p(z). It is easy
to check the following theorem, which we state without proof. We use the notation
p^{(j)}(z) to mean the jth derivative of p(z), i.e., p^{(j)}(z) = (d^j/dz^j) p(z).

Theorem 5.1.2 Point x_i is a root of p(z) with multiplicity d_i if and only if

d_i = min{ j ≥ 0 | p^{(j)}(x_i) ≠ 0 }.
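Theorem 5.1.2 turns into a simple numerical recipe: differentiate until the value at x is no longer (numerically) zero. Here is a Python sketch of our own (the tolerance is an arbitrary choice, and the loop assumes p is not the zero polynomial):

```python
def poly_derivative(c):
    # c: coefficients, highest degree first; returns coefficients of c'
    n = len(c) - 1
    return [c[i] * (n - i) for i in range(n)] if n > 0 else [0.0]

def poly_eval(c, x):
    # Horner evaluation
    acc = 0.0
    for a in c:
        acc = acc * x + a
    return acc

def multiplicity(c, x, tol=1e-8):
    # smallest j >= 0 with p^(j)(x) != 0, as in Theorem 5.1.2;
    # terminates because the d-th derivative is the nonzero constant d! * a_0
    j = 0
    while abs(poly_eval(c, x)) < tol:
        c = poly_derivative(c)
        j += 1
    return j

# p(z) = (z - 2)^3 (z - 5) = z^4 - 11 z^3 + 42 z^2 - 68 z + 40
p = [1.0, -11.0, 42.0, -68.0, 40.0]
print(multiplicity(p, 2.0), multiplicity(p, 5.0), multiplicity(p, 1.0))  # 3 1 0
```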

Considering the common zeros of more than one polynomial leads to no new
sets, since it is an easy consequence of the Fundamental Theorem of Algebra that
V(p_1, ..., p_n), the common zeros of n polynomials, equals V(p) for p the greatest
common divisor of the p_i. Another way of approaching this is to take the set of
zeros of one of the p_i and keep only those for which all the remaining p_i are zero.
Let p_1(z) = a_0 z^{d_1} + ··· + a_{d_1} and p_2(z) = b_0 z^{d_2} + ··· + b_{d_2} denote polynomials of
degrees d_1 and d_2. The polynomials p_1(z) and p_2(z) have a root in common if and
only if the Sylvester determinant (defined below in Equation 5.1.1), a polynomial
of degree d_1 + d_2 in the coefficients of p_1 and p_2, is zero. A quick proof of this,
in sufficient generality to be used as a tool to symbolically investigate multivariate
polynomials, may be found in (Walker, 1962). A more extensive development of
resultants may be found in (Cox, Little, & O'Shea, 1997). In the case of polynomials
of one complex variable, the proof in (Walker, 1962) comes down to simple linear
algebra. Since we will have occasion to contrast the numerical methods we use
with purely algebraic methods, we prove the underlying lemma about the Sylvester
determinant in this case.

Lemma 5.1.3 Let p_1(z) = a_0 z^{d_1} + ··· + a_{d_1} and p_2(z) = b_0 z^{d_2} + ··· + b_{d_2} denote
polynomials of degrees d_1 and d_2. If there is an x ∈ C such that p_1(x) = 0 and
p_2(x) = 0, then there exist polynomials f(z), g(z) ∈ C[z] with p_2(z)f(z) = p_1(z)g(z),
deg f(z) < deg p_1(z), and deg g(z) < deg p_2(z).

Proof. Since x is a root of both p_1(z) and p_2(z), we may factor out (z − x) to write
p_1(z) = (z − x)f(z) and p_2(z) = (z − x)g(z). Accordingly,

p_2(z)f(z) = (z − x)g(z)f(z) = (z − x)f(z)g(z) = p_1(z)g(z).  □
Lemma 5.1.3 leads directly to the following theorem.

Theorem 5.1.4 (Sylvester) The polynomials p\{z) = aozdl + h adl, a0 =£ 0,


andp2(z) — boz 2 + - • - + bd2, bo ^ 0, have a common root if and only if the Sylvester
Polynomials of One Variable 57

an
resultant Res(pi,p 2 ) — 0, where Res(pi,p 2 ) := det(Syl(pi,p 2 )) d

'ao ... adl 0 . . . 0


0 a0 ... adl 0 . . .

0 ... 0 an ... ad,


dl
Syl( P l , P 2 ) := ° . (5.1.1)
b0 -..bd2 0 ... 0
0 b0 ... bd2 0 . . .

. 0 ... 0 b0 ...bd2.
x
The matrix in this expression is size (di +^2) (di + d 2 ) and has d2 rows involving
the ai 's and d\ rows involving the bi 's. The columns above and below the dividing
line do not necessarily line up.

Proof. The condition given in Lemma 5.1.3 is the existence of f(z) and g(z) such
that p2(z)f(z) = Pl (z)g{z), where f(z) = fozdl-1 + ... + fdl-1 andg(z) = gozd'-1 +
• • • + 9d2 -1 • This may be written in matrix form as

0 = [go,- • • ,9d2-i,-fo,- • • ,-fd!-i} -Syl(pi,p 2 ), (5.1.2)


where the matrix on the right is the same one as appears in the Sylvester resultant.
The condition is met if and only if the above linear system of equations has a
solution, which happens if and only if the determinant of the matrix is zero. •

The reader should write out a few low degree cases for himself or herself. For
example, the special case when d_1 = d_2 = 1 is

Res(p_1, p_2) = det [ a_0  a_1 ]
                    [ b_0  b_1 ]  = a_0 b_1 − b_0 a_1,

and Res(p_1, p_2) = 0 if and only if the vectors (a_0, a_1) and (b_0, b_1) are
linearly dependent. This agrees with what we know: if two linear equations in one
variable have a common solution, then one is a multiple of the other.
Remark 5.1.5 Treating the a_i and b_j as indeterminates, we see (looking ahead
to A.13.1) that Res(p_1, p_2) is a bihomogeneous polynomial of bidegree (d_2, d_1).
Theorems 5.1.2 and 5.1.4 may be combined to conclude the following.
Theorem 5.1.6 A polynomial p(z) = a_0 z^d + ··· + a_d, a_0 ≠ 0, has a multiple root
if and only if its discriminant Dis(p) is zero, where Dis(p) := Res(p, dp/dz).
Note that the discriminant condition, Dis(p) = 0, is a polynomial equation on
C[a_0, ..., a_d], so we are justified in saying that a generic polynomial of degree d has
d distinct roots.
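Both the Sylvester matrix (5.1.1) and the discriminant are easy to experiment with numerically. A Python/NumPy sketch of our own, with small hand-picked polynomials:

```python
import numpy as np

def sylvester(p, q):
    # p, q: coefficient lists, highest degree first, as in Equation (5.1.1)
    d1, d2 = len(p) - 1, len(q) - 1
    S = np.zeros((d1 + d2, d1 + d2))
    for i in range(d2):                 # d2 rows carrying the a's
        S[i, i:i + d1 + 1] = p
    for i in range(d1):                 # d1 rows carrying the b's
        S[d2 + i, i:i + d2 + 1] = q
    return S

def resultant(p, q):
    return np.linalg.det(sylvester(p, q))

def discriminant(p):
    # Dis(p) = Res(p, dp/dz), as in Theorem 5.1.6
    dp = [c * (len(p) - 1 - i) for i, c in enumerate(p[:-1])]
    return resultant(p, dp)

# (z-1)(z-2) and (z-2)(z-3) share the root z = 2:
print(resultant([1, -3, 2], [1, -5, 6]))    # numerically ~0
# (z-1)(z-2) and (z-3)(z-4) share no root:
print(resultant([1, -3, 2], [1, -7, 12]))   # nonzero (magnitude 12)
# (z-1)^2 has a double root, so its discriminant vanishes:
print(discriminant([1, -2, 1]))             # numerically ~0
```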

As a final remark, we note that the number of real zeros of a degree d polynomial
is less than or equal to d, for to find the real zeros of a polynomial p(z) ∈ R[z], we
can find the complex zeros of the polynomial and pick out those that are real. Even
though the real solutions of p(z) are easy to understand, we can see that the algebra
and the geometry are not as closely connected as in the case of complex solutions. For
example, for integer d ≥ 1, the polynomial z^{2d} + 1 has 2d complex zeros, but no
real zeros.

5.2 Some Analytic Facts about Polynomials of One Complex Variable (Optional)

We now collect some of the classical relations between the coefficients of a polynomial
p(z) = a_0 z^d + ··· + a_d ∈ C[z] and its roots, i.e., the solutions of p(z) = 0. We
follow Marden's beautiful book (Marden, 1966), which contains many more results
than we present here. This section is marked "optional," because it is not essential
to an understanding of the continuation method. Indeed, it is quite difficult to
find similar relations that apply to systems of multivariate polynomials, our main
subject of concern. We include this material as background, because it at least
gives a hint of what we might expect in the more general situation. Moreover, in
Remark 5.2.4, we show the one variable growth estimates given here give growth es-
timates for general affine algebraic sets. These estimates combined with the Noether
Normalization Theorem A.10.5 and the use of trace functions as in § 15.5.4 may be
developed into a geometric proof of the existence of the irreducible decomposition.
We start by getting numerical bounds on the roots of p(z) in terms of the
coefficients a_i. The basic trick here is an observation of Cauchy. For any complex
number a ∈ C and any real number r > 0, we let

Δ_r(a) := {z ∈ C | |z − a| ≤ r}

denote the disk of radius r around a.


Lemma 5.2.1 Let p(z) = a_0 z^d + ··· + a_d ∈ C[z], with a_0 ≠ 0 and with a_j ≠ 0
for at least one j > 0. Then the polynomial

q(z) := |a_0| z^d − Σ_{i=1}^d |a_i| z^{d−i}

has a unique positive root R, and all the roots of p(z) are contained in the disk
Δ_R(0).

Proof. Without loss of generality we can assume that a_d ≠ 0, since otherwise we
could factor a power z^i with i > 0 out of the polynomials p(z) and q(z) and have
the condition that p(z) has a nonzero constant term.

Consider the function h(x) := q(x)/x^d on x ∈ (0, ∞). Note that the derivative
h'(x) is positive for all x ∈ (0, ∞). This shows that h(x) is an increasing function
with at most one x ∈ (0, ∞) with h(x) = 0. Since

lim_{x→0+} h(x) = −∞

and

lim_{x→∞} h(x) = |a_0| > 0,

we conclude from the intermediate value theorem that h(x) = 0 has at least one
solution. Thus q(z) has a unique root R on (0, ∞), and q(x) > 0 for real x > R.
Now we will assume that there is a root z* of p(z) which satisfies |z*| > R and
show we get a contradiction. We have p(z*) = 0, which gives

|a_0| |z*|^d = | Σ_{i=1}^d a_i z*^{d−i} | ≤ Σ_{i=1}^d |a_i| |z*|^{d−i}.

Thus we conclude the absurdity that q(|z*|) ≤ 0. □
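Since q has a unique positive root, the Cauchy radius R is easy to compute by doubling and bisection. A Python sketch of our own (the sample polynomial and the tolerance are arbitrary), checked against the roots returned by numpy.roots:

```python
import numpy as np

def cauchy_radius(coeffs, tol=1e-12):
    # unique positive root R of q(x) = |a0| x^d - sum_i |ai| x^(d-i), Lemma 5.2.1
    a = np.abs(np.asarray(coeffs, dtype=complex))
    d = len(a) - 1
    q = lambda x: a[0] * x ** d - sum(a[i] * x ** (d - i) for i in range(1, d + 1))
    lo, hi = 0.0, 1.0
    while q(hi) <= 0:          # double until the root is bracketed
        hi *= 2
    while hi - lo > tol:       # bisection
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if q(mid) <= 0 else (lo, mid)
    return hi

p = [1, -4, 2, 7, -3]          # an arbitrary sample polynomial
R = cauchy_radius(p)
roots = np.roots(p)
print(R, max(abs(roots)))      # every root lies in the closed disk of radius R
```

The bound (5.2.3) below says R is also at most 1 plus the largest coefficient ratio, which for this sample gives R ≤ 8.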

The first observation is that this radius R satisfies

R ≤ 1 + max{ |a_1/a_0|, ..., |a_d/a_0| }.   (5.2.3)

To see this, assume that for R with q(R) = 0 we have the contrary,

R > 1 + max{ |a_1/a_0|, ..., |a_d/a_0| }.

Dividing q(R) = 0 by |a_0| R^d and setting M := max{ |a_1/a_0|, ..., |a_d/a_0| }, we have

1 = Σ_{i=1}^d |a_i/a_0| R^{−i} ≤ M Σ_{i=1}^d R^{−i},   (5.2.4)

and since R > 1 gives Σ_{i=1}^d R^{−i} < 1/(R − 1), we conclude that R < M + 1, a
contradiction.


Theorem 5.2.2 Let r denote the maximum absolute value of any of the roots of
p(z) = a_0 z^d + ··· + a_d ∈ C[z], with a_0 ≠ 0, and let R denote the unique nonnegative
real root of the polynomial q(z) := |a_0| z^d − Σ_{i=1}^d |a_i| z^{d−i}. Then, denoting
a = max_{1≤i≤d} ( |a_i/a_0| / C(d,i) )^{1/i}, where C(d,i) is the binomial coefficient,

a ≤ r ≤ R ≤ a / (2^{1/d} − 1).

Proof. We can assume that r and R are nonzero, since otherwise the result is trivial.
We have already shown that r ≤ R. The left inequality follows from the observation
that if we denote the roots of p(z) = 0 by z_1, ..., z_d, we have

± a_i/a_0 = Σ_{j_1 < ··· < j_i} z_{j_1} ··· z_{j_i},

so that |a_i/a_0| ≤ C(d,i) r^i for each i. For the right hand inequality, using a as
defined in the theorem, note that

R^d ≤ Σ_{i=1}^d |a_i/a_0| R^{d−i} ≤ Σ_{i=1}^d C(d,i) a^i R^{d−i} = (R + a)^d − R^d.

Hence 2R^d ≤ (R + a)^d, so 2^{1/d} R ≤ R + a, which gives R ≤ a/(2^{1/d} − 1). □
Remark 5.2.3 Given a polynomial p(z) = a_0 z^d + ··· + a_d ∈ C[z], with a_0 ≠
0 ≠ a_d and roots z_1, ..., z_d, the numbers 1/z_i are the roots of the polynomial
a_0 + a_1 z + ··· + a_d z^d. Using this we get another set of bounds for the absolute
values |z_i| by applying Theorem 5.2.2 to a_0 + a_1 z + ··· + a_d z^d.
Remark 5.2.4 (Growth Estimates) For a polynomial p(z) ∈ C[z_1, ..., z_N], Theorem
5.2.2 gives some quantitative feeling for the behavior of the solution set

V(p) := { z ∈ C^N | p(z) = 0 }.

To understand this, assume that deg p(z) = d and single out one of the variables,
e.g., z_N, and consider p(z) as an element of C[z_1, ..., z_{N−1}][z_N] and write

p(z) = Σ_{i=0}^d a_i(z_1, ..., z_{N−1}) z_N^{d−i}.

For simplicity, assume that a_0(z_1, ..., z_{N−1}) = 1:
this can always be achieved by a linear change of coordinates. We have that
a_i(z_1, ..., z_{N−1}) ∈ C[z_1, ..., z_{N−1}] has degree at most i, and on

B_r := { (z_1, ..., z_{N−1}) ∈ C^{N−1} | √(Σ_{j=1}^{N−1} |z_j|²) ≤ r },

for all sufficiently large r, we have that |a_i(z_1, ..., z_{N−1})| ≤ C_i r^i, where C_i is
a positive constant independent of r. Thus Theorem 5.2.2 implies that for all
(z_1, ..., z_{N−1}) ∈ C^{N−1} with √(Σ_{j=1}^{N−1} |z_j|²) sufficiently large, we have that any solution
(z_1, ..., z_N) of p(z_1, ..., z_N) = 0 satisfies

|z_N| ≤ C √(Σ_{j=1}^{N−1} |z_j|²)

with C a positive constant independent of (z_1, ..., z_{N−1}).



5.3 Some Numerical Aspects of Polynomials of One Variable

It is a numerical fact of life that constants are only known to (and computations
are only carried out with) limited numbers of digits. It is worth spending a little
time thinking through what this means for polynomials of a single variable, i.e., to
consider how closely numerical calculations match the algebraic-geometric picture.
If we were considering polynomials with coefficients in a finite field, it may well
happen that a polynomial is nonzero even though it evaluates to zero at all points of
the field, e.g., z(z − 1) over Z_2 = {0, 1}. One happy consequence of the Fundamental
Theorem of Algebra is that this does not happen over the complex numbers, i.e., a
given polynomial p(z) is only zero at a finite set x_1, ..., x_d. But what about the
situation when we use the floating point numbers on the computer?
At first sight there is nothing to worry about. Assuming 15 digits on our
computer, we have on the order of 10^15 distinct numbers, and for a polynomial to be zero
at all of them, it would need to be of degree at least 10^15, which is absurdly large
for any application we know of. But there is a snag here. If, for a polynomial to
numerically be zero, we mean it is less than some small constant, e.g., 10^−15, then
the Fundamental Theorem of Algebra is certainly false. Consider the normalized
Chebychev polynomial of order n, which is given by

T_n(z) = ∏_{i=1}^n (z − cos((i − 1/2)π/n)).

As Hamming eloquently points out (§28.5 Hamming, 1986), since the normalized
Chebychev polynomial of degree n is a real polynomial that oscillates between ±2^{1−n}
on the interval [−1,1], the 51st of these,

T_51(z) = z^51 + lower order terms,

is ≤ 10^−15 in absolute value on [−1,1]. Thus, although the exact polynomial has just 51 roots in this
interval, the numerical approximation of it in standard double precision arithmetic
is zero to within round-off error on the entire interval.
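This is easy to reproduce: evaluating the normalized degree-51 Chebychev polynomial from its product form on a fine grid of [−1,1] gives values below 10^−15 everywhere (our own Python check, not from the text; the grid density is an arbitrary choice):

```python
import numpy as np

n = 51
# the 51 roots cos((i - 1/2) pi / n), i = 1..n
roots = np.cos((np.arange(1, n + 1) - 0.5) * np.pi / n)

def T_normalized(x):
    # monic (normalized) Chebychev polynomial, evaluated from its product form
    return np.prod(x - roots)

xs = np.linspace(-1.0, 1.0, 10001)
vals = np.array([abs(T_normalized(x)) for x in xs])
print(vals.max())   # about 2^-50 ~ 8.9e-16: "numerically zero" at 15 digits
```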
Indeed, it is worth thinking about what we mean when we say we find a zero
of a polynomial p(z). We mean a floating point number x of some prescribed
number of digits (say 15 for simplicity of discussion) whose distance from one of
the k zeros of p(z) is less than some prescribed number, e.g., 10^−15. In light of
Hamming's observation, we might want to say, "all right, if p(x) is very small we
cannot conclude that x is close to a zero, but certainly, if x is close to a zero of p(z),
then p(x) is close to zero." Unfortunately, even this is false. Consider

p(z) := z^10 − 28 z^9 + 1.   (5.3.6)

To 15 digits of accuracy, one of the roots of this polynomial is x = 27.9999999999999.
Evaluating p(z) at this number and rounding to 15 digits, we find that it is rather
far from 0, i.e., p(x) = - 2 . Even with 17 digit accuracy, the approximate root is
x = 27.999999999999905 and we still only have p(x) = -0.01.
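Exact rational arithmetic makes the point independently of any round-off (our own check, not from the text): even at the correctly rounded 15-digit root, the true residual is already about −0.058, which is enormous compared to 10^−15; finite-precision evaluation, as in the 15-digit computation quoted above, can stray even further.

```python
from fractions import Fraction

# p(z) = z^10 - 28 z^9 + 1, evaluated EXACTLY at the 15-digit approximate root
x = Fraction(279999999999999, 10 ** 13)   # 27.9999999999999, as an exact rational
p = x ** 10 - 28 * x ** 9 + 1
print(float(p))   # about -0.0578: far from zero at any reasonable tolerance
```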
To go a bit further in this direction let p(z) = z^d + a_1 z^{d−1} + ... + a_d with
positive coefficients all of modest sizes, e.g., ≤ 10. Let the degree be d = 15, and
consider the implications of Theorem 5.2.2. It implies that the roots of p(z) are
all within the disk of radius 24.66. Suppose that p̄(z) is the same as p(z) with the
four lowest degree terms dropped, that is, p̄(z) = z^15 + a_1 z^14 + ... + a_11 z^4. Then,
also by Theorem 5.2.2, the polynomial p(z) − p̄(z) has all its roots within the disk
of radius 12.83. Then, for |z| > 24.66, the relative error, |p(z) − p̄(z)|/|p(z)|, of
approximating p(z) by p̄(z) is bounded by

|p(z) − p̄(z)|/|p(z)| ≤ (|z| + 12.83)^3/(|z| − 24.66)^15.

This implies that for |z| > 48, the relative error is < 10^{−15}. Hence, if we are using
15 digits of accuracy, we can drop the four lowest-degree terms without observable
change in the numerical values on |z| > 48.
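This truncation claim is easy to probe numerically (a sketch; the coefficients are random positive values below 10, and the dropped tail is compared directly against p on the circle |z| = 48):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.uniform(0.0, 10.0, 15)           # positive coefficients a_1, ..., a_15
p = np.concatenate(([1.0], a))           # monic degree-15 p(z), highest first
tail = p.copy()
tail[:-4] = 0.0                          # the dropped terms a_12 z^3 + ... + a_15

# On the circle |z| = 48, the dropped tail is invisible at 15 digits.
z = 48.0 * np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 100))
rel_err = np.abs(np.polyval(tail, z)) / np.abs(np.polyval(p, z))
print(rel_err.max())
assert rel_err.max() < 1e-15
```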
The moral is that with only a certain limited number of digits, we can only look
at algebraic objects of a limited size before the numerical limitations imposed by
the allowed number of digits overwhelm the model coming from algebraic geometry.
It is convenient to have a rough rule-of-thumb for how the number of digits we
have available is connected to the degrees of the polynomials we can safely compute
with. To achieve this, let us look a little more closely at the phenomenon raised by
the Hamming example. The first observation is that the situation is not as serious
for the unit disk as it is for the interval [−1,1].
Lemma 5.3.1 Let p(z) = z^d + a_1 z^{d−1} + ... + a_d. Then

max_{|z|=1} |p(z)| ≥ 1.

Proof. Assume that max_{|z|=1} |p(z)| = c < 1. Then by Rouché's theorem (Hille, 1959), it
follows that z^d and z^d − p(z) have the same number of zeros within the unit disk.
Since these numbers are d and at most d − 1, they cannot be equal and we have shown the
lemma. □
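The lemma is easy to illustrate numerically (a sketch using NumPy's Chebyshev utilities; sampling the circle gives only a lower bound on the true maximum, which is all the lemma requires):

```python
import numpy as np
from numpy.polynomial import chebyshev as C
from numpy.polynomial import polynomial as P

n = 51
Tn = C.cheb2poly([0.0] * n + [1.0])      # T_51 in the monomial basis (low first)
Tn_hat = Tn / Tn[-1]                     # normalize: leading coeff 2^(n-1) -> 1

# Sample |p| on the unit circle; the sampled maximum is a lower bound on the
# true maximum, which the lemma says is at least 1.
z = np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 2000))
vals = np.abs(P.polyval(z, Tn_hat))
print(vals.max())
assert vals.max() >= 1.0
```

In fact the sampled maximum is far above 1, since T̂_51 grows rapidly off the real interval [−1,1].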

This shows that the normalized Chebychev polynomial of any order, and in
fact, any polynomial with leading coefficient 1, is distinguishable from zero on the
unit disk. Though comforting, the real problem is that T̂_51(z) and its relatives are
very close to zero on a significant set within the unit disk. We start with a crude
order of magnitude result.

Lemma 5.3.2 Given a polynomial p(z) = z^d + a_1 z^{d−1} + ... + a_d ∈ C[z] and a
positive number ε, the area of the set of z ∈ C such that |p(z)| < ε is at most dπε^{2/d}.

Proof. Let z_1, ..., z_d denote the roots of p(z). Let w denote a point such that
|p(w)| < ε. We claim that w is contained in

∪_{i=1}^d Δ_{ε^{1/d}}(z_i).

If not, then we would have that |w − z_i| > ε^{1/d} for each i. Thus we get the absurdity that
ε > |p(w)| = |w − z_1| ··· |w − z_d| > ε. Since the union of d disks of radius ε^{1/d} has
area at most dπε^{2/d}, the lemma follows. □
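The containment in the proof can be checked by sampling (a rough sketch; the polynomial, box, and ε are arbitrary choices, and a tiny slack absorbs floating-point rounding):

```python
import numpy as np

rng = np.random.default_rng(1)
d, eps = 8, 1e-2
roots = rng.uniform(-1, 1, d) + 1j * rng.uniform(-1, 1, d)
p = np.poly(roots)                       # monic degree-8 polynomial

# Sample a box around the roots and keep the points where |p| < eps.
pts = rng.uniform(-1.5, 1.5, (100000, 2))
z = pts[:, 0] + 1j * pts[:, 1]
small = z[np.abs(np.polyval(p, z)) < eps]

# Every such point must lie within eps^(1/d) of some root, as in the proof,
# so the region has area at most d * pi * eps^(2/d).
dist = np.abs(small[:, None] - roots[None, :]).min(axis=1)
assert (dist <= eps ** (1.0 / d) + 1e-9).all()
```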

If the roots are sufficiently separated, the bound for the area is actually of the
form ≤ C_d πε^{2/d}, where C_d is a universal constant bounded by 2. We see this as
follows.

Theorem 5.3.3 Let p(z) = z^d + a_1 z^{d−1} + ... + a_d ∈ C[z] be a complex polynomial
with distinct roots z_1, ..., z_d, and let ε be a positive number. Assume that for all
1 ≤ i < j ≤ d,

|z_i − z_j| ≥ (d/(d−1)^{(d−1)/d}) ε^{1/d}.

Then the area of the set of z ∈ C such that |p(z)| < ε is

≤ (d/(d−1)^{2(d−1)/d}) πε^{2/d}.

Proof. It suffices to show that the region |p(z)| < ε is contained in the union of
disks

∪_{i=1}^d Δ_{ε^{1/d}/(d−1)^{(d−1)/d}}(z_i).

We first show that |p(z)| ≥ ε on the set

{z ∈ C | |z − z_i| = ε^{1/d}/(d−1)^{(d−1)/d} for some i}.    (5.3.7)

To see this note that for j ≠ i,

|z − z_j| ≥ |z_i − z_j| − |z − z_i| ≥ (d/(d−1)^{(d−1)/d}) ε^{1/d} − ε^{1/d}/(d−1)^{(d−1)/d} = (d−1)^{1/d} ε^{1/d}.

Thus if we have z as in Equation 5.3.7, we have that

|p(z)| = ∏_{j=1}^d |z − z_j| ≥ (ε^{1/d}/(d−1)^{(d−1)/d}) · ((d−1)^{1/d} ε^{1/d})^{d−1} = ε.

The set of z such that |p(z)| < ε has at most d connected components. This can
be seen by noting that z → p(z) is a d-sheeted branched cover.
Since the roots z_i are in distinct disks {z | |z − z_i| < ε^{1/d}/(d−1)^{(d−1)/d}}, we are done. □
Conjecture 5.3.4 (Zero Region Bound) Let p(z) = z^d + a_1 z^{d−1} + ... + a_d ∈ C[z]
be a complex polynomial and let ε be a positive number. Then the area of the set of
z ∈ C such that |p(z)| < ε is

≤ C_d πε^{2/d},

where C_d is a constant only dependent on d and bounded by 2 for all sufficiently
large d.

Remark 5.3.5 We suspect that C_d < 2 for d ≥ 3.


Lemma 5.3.2 and Theorem 5.3.3 suggest a rule of thumb for the tradeoff be-
tween the number of digits used and the degree of polynomial that can be handled.
Suppose we can tolerate at most an area of 10^{−a} for the set where |p(z)| < ε, i.e.,
where p(z) looks like it is zero numerically. If we are computing with B digits of
accuracy, then we take ε = 10^{−B}. By Theorem 5.3.3, we need

2πε^{2/d} ≤ 10^{−a},

which implies

−2B/d ≤ −a − log_10(2π) ≈ −a − 1,

or, approximately,

B ≥ (a + 1)d/2.

In particular, for an area of 10^{−3}, we should use around 2d digits of precision in
calculations.
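The rule of thumb is easy to tabulate (a sketch; `digits_needed` is a hypothetical helper implementing the inequality above with the exact constant log_10(2π)):

```python
import math

def digits_needed(d, a):
    """Smallest B with 2*pi*(10**-B)**(2/d) <= 10**-a (hypothetical helper)."""
    return (a + math.log10(2 * math.pi)) * d / 2

for d in (10, 25, 51):                   # tolerated zero-area 10^-3 -> about 2d
    print(d, math.ceil(digits_needed(d, 3)))   # -> 19, 48, 97
```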
If the degree is high, the assumption on the spread of the roots looks like
C(d)ε^{1/d}, where C(d) = d/((d − 1)^{(d−1)/d}) slowly approaches 1 from above as d
grows. If the closest pair of roots is separated by a distance r, we may turn this
around and say that Theorem 5.3.3 only applies for ε < (r/C)^d ≈ r^d/(de), where
e = 2.718.... For a difficult case such as the Chebychev polynomial T_51(z), which
has d = 51 and two roots within r = 3.8 · 10^{−3}, one must have ε < 2.7 · 10^{−126}.
This ensures that the sets with |p(z)| < ε are in distinct disks centered on the roots.
Sharper bounds may be possible, but the message is that high degree polynomials
require high precision arithmetic, and any roots close to each other exacerbate the
difficulty.

5.4 Exercises

Exercise 5.1 (Resultants) We call the matrix appearing in (5.1.1) and (5.1.2)
the Sylvester matrix for the resultant.
(1) Write out the Sylvester matrix for the resultant of a general cubic and a general
quadratic.
(2) Let the cubic and quadratic have random coefficients. Use a numerical test of
the rank of the Sylvester matrix (singular value decomposition is best) to show
that it is nonsingular.
(3) Form the Sylvester matrix for

p_1 = z^3 + 2z − 3,    p_2 = 2z^3 − z^2 − 3z + 2.

Numerically evaluate the determinant and find the rank of the Sylvester matrix.
Do p_1 and p_2 have a common factor? If so, use linear algebra to compute the
polynomials f(z), g(z) as in Lemma 5.1.3.
(4) Repeat the above for

p_1 = z^3 − 2z^2 − z + 2,    p_2 = z^3 − (5/2)z^2 + (1/2)z + 1.

(5) Use the results of the last two items to form a conjecture about how the rank
of the Sylvester matrix relates to the number of common solutions. Prove it.
Be sure to account for the possibility of multiple roots.
(6) Pick any one of the polynomials in the preceding items and use Theorem 5.1.6
to show that it does not have a repeated root. What does the same test say
about z^3 + z^2 − z − 1?

Exercise 5.2 (Chebychev polynomials) Form normalized Chebychev polyno-
mials T̂_n(z).
(1) For n = 1, 2, 3, 4, 5, 10, 15, plot T̂_n(z) for z ∈ [−1.1, 1.1]. Verify the predicted
limits of oscillation.
(2) Make a contour plot of |T̂_10(z)| for z ∈ C in the unit disk, Δ_1(0).
(3) Zoom in around z = 1 until the contour lines separate the roots.
(4) Try this for larger n and see how far you can go before the roots near z = 1 can
no longer be separated.
(5) Plot the contour |T̂_n(z)| = 0.001 for n = 5, 10, 15, 20.
(6) How much of a problem is this for the probabilistic null test? Consider degree
and the precision of arithmetic in your answer.
Chapter 6

Other Methods

While the focus of this book is on homotopy methods, this chapter highlights some
of the most useful alternatives: exclusion methods, eliminants, and Gröbner bases.
We already indicated in § 1.1 that the eigenvalue approach is one of the most
effective means of solving a polynomial in one variable, but its extension to systems
in more than one variable requires significant symbolic preprocessing. In contrast,
we have seen that homotopy methods for one variable extend rather naturally to
multivariate systems, a matter that we take up in detail in Part 2. Exclusion
methods have this same property: the multivariate algorithm looks almost exactly
like the one-variable method. Numerical applications of eliminants and Gröbner
bases work the other way around: they reduce multivariate problems back to just
one variable so that an eigenvalue routine or other method for one variable can be
employed.
As our interest is in numerical methods, some readers may be surprised to
see resultants and Gröbner bases mentioned here. These are usually regarded as
symbolic approaches, applicable to systems with rational coefficients and computed
in exact arithmetic. But, in fact, even if we use a symbolic method for most of
the computation, we will generally have to rely on numerics to estimate the zeros
of the system. As a very simple example, consider the equation x^2 − 2 = 0, which
has no roots over the rational numbers. To proceed further symbolically, we must
add the symbol √2 to the number field, whereupon the roots can be expressed as
x = ±√2. This may be perfectly suitable for some purposes, but a scientist or
engineer will usually want to know that √2 ≈ 1.41421.... The situation is even
more dicey in general, because according to Galois theory, there is no symbolic
formula for the roots of a general polynomial of degree five or greater. In short, for
most practical purposes, it is not a question of whether to proceed symbolically
or numerically; rather, it is a question of how far to proceed symbolically before
turning to numerics. With this in mind, one may craft symbolic approaches that
lead naturally into numeric methods. It is from this viewpoint that we discuss how
eliminants and Gröbner basis methods can lead us to eigenvalue formulations for
computing solutions numerically.
There are a host of considerations relevant to choosing a solution method, including

• Does it find all solutions? What happens if there are isolated singular solutions?
How about higher-dimensional solutions?
• Does it provide error estimates and/or error bounds?
• Under what conditions is it efficient?
• Is it easy to implement? Are software packages readily available?
Since each method has an extensive literature on its own, full answers to such ques-
tions are beyond the scope of this book. We will not attempt detailed comparisons
here, nor in fact, will we even give in-depth descriptions of practical algorithms. Our
aim is only to introduce these alternatives to give the interested reader a starting
point for further investigation.

6.1 Exclusion Methods

Exclusion methods, also known as subdivision methods or generalized bisection


methods (and related to branch and bound algorithms for optimization), operate
by subdividing a region into pieces, excluding those pieces which cannot contain
a solution, saving pieces that can be seen to contain a single solution point, and
subdividing again the remaining pieces, stopping the process when all pieces are
smaller than some predetermined size. Let's rephrase this in a precise, yet general,
way and then specialize to a common practical form.
Assume we wish to solve a system of equations f(x) = 0, f : X → X, where
X = C^n or X = R^n. Suppose there is a kind of subset of X that we will call a box,
such that for every box B we have a test T(f, B), a splitting algorithm S(B), and
a real-valued size measure |B|. The test T(f, B) returns one of three values:
• T(f, B) = −1 means that there is no x ∈ B such that f(x) = 0;
• T(f, B) = 1 means that there is a unique x ∈ B such that f(x) = 0; and
• T(f, B) = 0 means that neither of the other two conditions could be verified.
For some real 0 < p < 1, the splitting algorithm returns boxes B_1, ..., B_k ⊂ X,
k ≥ 2, such that B ⊂ ∪_{i=1}^k B_i and vol(B_i) ≤ p vol(B).
An exclusion method for finding solutions of f(x) = 0 in an initial box B_0 is as
follows.
Given An initial box B_0, a real tolerance ε > 0, and a function f(x) with test
T(f, B) and splitting algorithm S(B) as above.
Find Sets of boxes B* and B_ε such that every solution point in B_0 is in at least
one of the boxes in B* or B_ε, each box in B* contains a unique solution point,
and each box B ∈ B_ε has size |B| < ε.
Begin
• Initialize B = {B_0}.
• Initialize B_ε = {} and B* = {} (empty sets).
• While B ≠ {},
  − Set B′ = {}.
  − For each B_i ∈ B do:
    * Remove B_i from B.
    * If T(f, B_i) = −1, discard B_i.
    * Else if T(f, B_i) = 1, append B_i to B*.
    * Else if |B_i| < ε, append B_i to B_ε.
    * Else, append S(B_i) to B′.
  − Set B = B′.
• End While.
End
After m passes through the main loop, the largest boxes in B have volume no greater
than p^m vol(B_0), so with mild conditions on the way that splitting is done, we can
be assured that the algorithm terminates. At the conclusion of the algorithm, each
box in B* contains a unique solution, each box in B_ε is no bigger than ε, and every
solution of f(x) = 0 in the initial box B_0 is in one of the boxes in B* or B_ε.
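The loop above can be sketched generically (a sketch only: `test`, `split`, and `size` stand in for T(f,B), S(B), and |B| and must be supplied for a concrete box type; the toy sign-change test at the end is not a rigorous interval test, though it is valid for the monotone example used):

```python
def exclusion_search(box0, test, split, size, eps):
    """Generic exclusion loop: returns (confirmed, undecided) box lists."""
    work, confirmed, undecided = [box0], [], []
    while work:
        next_work = []
        for box in work:
            t = test(box)
            if t == -1:                  # provably no solution: discard box
                continue
            if t == 1:                   # provably a unique solution in box
                confirmed.append(box)
            elif size(box) < eps:        # undecided but small enough to stop
                undecided.append(box)
            else:
                next_work.extend(split(box))
        work = next_work
    return confirmed, undecided

# Toy 1-D run on f(x) = x^2 - 2 over [0, 2]; boxes are intervals (a, b).
f = lambda x: x * x - 2.0
def sign_test(box):
    a, b = box
    return -1 if (f(a) > 0 and f(b) > 0) or (f(a) < 0 and f(b) < 0) else 0
bisect = lambda box: [(box[0], 0.5 * (box[0] + box[1])),
                      (0.5 * (box[0] + box[1]), box[1])]
confirmed, undecided = exclusion_search((0.0, 2.0), sign_test, bisect,
                                        lambda box: box[1] - box[0], 1e-6)
```

At termination, the undecided boxes bracket the root √2 to width below the tolerance.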
At this level of generality, the algorithm always succeeds, but possibly in a
completely useless way: there is nothing to keep it from returning an extremely
long list B_ε whose boxes cover all of B_0. What is needed to make the approach
useful is a test T(f, B) that is reasonably sharp; that is, it classifies boxes as ±1
while they are still relatively large, eliminating them from further subdivision.
Usually, a box in R^n is taken to be a rectangular parallelepiped defined by lower
and upper limits on each coordinate:

x = {(x_1, ..., x_n) | x_i ∈ [x̲_i, x̄_i], i = 1, ..., n}.

Then, |x| = max_i (x̄_i − x̲_i), and S(x) bisects x along the mid-plane of this maximally
wide coordinate direction,¹ so k = 2 and p = 1/2. A nice consequence of this choice
is that the subdivisions of a box exactly cover the original box with no overlap
except for sharing a boundary face. This means that a solution point can be in
the interior of only one box, so we get duplicate copies of a solution only in the
rare instance that a bisection face passes exactly through it. A box in C^n can be
considered a box in R^{2n} having independent coordinates for the real and imaginary
parts of each complex coordinate.
The obvious question at this point is how to construct good exclusion tests.
The most common approach, popular because of its wide generality, is interval
arithmetic. An interval extension of a function f : R^n → R is a function 𝐟 : 𝕀R^n →
𝕀R, where 𝕀R ⊂ R^2 is the half-plane of intervals [a, b], a ≤ b, and f(x) ∈ 𝐟(x) for any
x ∈ x. That is, the interval extension function evaluated on an interval box gives an
interval that contains all possible values of the function evaluated on points in the
box. Clearly, if there is a solution of f(x) = 0 in box x, that is, if there is an x* ∈ x
such that f(x*) = 0, then 𝐟(x) must contain 0. Consequently, if 0 ∉ 𝐟(x), then the
box can be excluded, or in the notation above, T(f, x) = −1. The interval extension
does not have to be sharp, that is, it may give loose bounds on the actual image
of f(x), x ∈ x, and in practice, this is almost always the case, as sharp bounds are
prohibitively expensive to compute.

¹ It can be advantageous to bisect along a smaller edge of the box, using derivative information
to inform the decision; see (Kearfott, 1997).
An interval extension of a polynomial in straight-line form can be computed
by concatenating interval extensions of each of the basic operations of negation,
addition, subtraction, multiplication, and integer powers (see § 1.2). For these, we
have the sharp bounds

−[a_0, a_1] ⊆ [−a_1, −a_0]
[a_0, a_1] + [b_0, b_1] ⊆ [a_0 + b_0, a_1 + b_1]
[a_0, a_1] − [b_0, b_1] ⊆ [a_0 − b_1, a_1 − b_0]
[a_0, a_1] · [b_0, b_1] ⊆ [min(a_0b_0, a_0b_1, a_1b_0, a_1b_1), max(a_0b_0, a_0b_1, a_1b_0, a_1b_1)]
[a_0, a_1]^k ⊆ [(0, if a_0a_1 < 0; else min(|a_0|, |a_1|)^k), max(|a_0|, |a_1|)^k],  k even
[a_0, a_1]^k ⊆ [a_0^k, a_1^k],  k odd
                                                                        (6.1.1)
When the operations are carried out in floating point, one must be careful to round
the upper limit of the output interval up and the lower limit down to be sure that
it properly contains all possible results. To evaluate a general polynomial function,
one may simply apply these interval operations at every stage of a straight-line
implementation of the function. Sharper bounds can be determined by considering
the special properties of a polynomial, as illustrated by the exponentiation formula
above: in principle we only need the multiplication formula to evaluate x^2 as x · x,
but the formula invokes the fact that x^2 is always nonnegative.
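The basic operations are straightforward to code (a naive sketch that ignores directed rounding, which a rigorous implementation must add):

```python
class Interval:
    """Naive interval arithmetic; a rigorous version rounds outward."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __neg__(self):
        return Interval(-self.hi, -self.lo)

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(p), max(p))

    def contains(self, v):
        return self.lo <= v <= self.hi

# Evaluate p(x) = x^2 - 2 on the box x = [1, 2] as a straight-line program.
x = Interval(1.0, 2.0)
two = Interval(2.0, 2.0)
px = x * x - two
print(px.lo, px.hi)          # encloses the true range [-1, 2]
assert px.contains(0.0)      # 0 in p(x): this box cannot be excluded
```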
With only an exclusion test, we have a bisection method for narrowing potential
solution boxes down to size |x| < ε. But bisection becomes very expensive as the
dimension n grows, because we may generate as many as 2^n sub-boxes in the course
of bisecting each of the coordinates. The process is greatly expedited if an inclusion
test returns T(f, x) = 1 while |x| is still relatively large. An approach that can
provide this is the interval Newton test. Although the method can be refined in
various ways, the basic idea is to compute a Newton step using interval arithmetic
and test the overlap of the resulting box with the initial box. To be precise, the
interval Newton step is computed as

N(f, x) = x̂ − 𝐟′(x)^{−1} f(x̂),

where x̂ is any point in x (typically the midpoint), 𝐟′ is an interval extension of the
Jacobian matrix of f, and the inversion is computed by Gaussian elimination using
interval arithmetic. If 𝐟′(x) includes singular matrices, the inversion will fail, and
the test is inconclusive. Otherwise, we have the following facts.

• Any solution in x is also in N(f, x).
• If N(f, x) ⊂ int(x), where int(x) is the interior of x, then there is a unique
solution in x. Hence, the test T(f, x) = 1.

In any case, we can restrict further search for a solution to the box N(f, x) ∩ x, and
the general algorithm given above can easily be refined to take advantage of this.
If the intersection is empty, we may declare the test T(f, x) = −1.
Once the Newton test confirms that a box contains a unique solution, the box
can be constricted by repeated iterations of the interval Newton step. As in the
usual Newton method, convergence is quadratic, under certain assumptions on dif-
ferentiability and on the tightness of the interval extension, which are satisfied by
polynomial functions evaluated with interval arithmetic.
This brief overview just gives a glimpse of the approach; for more information,
see the books (Alefeld & Herzberger, 1983; Kearfott, 1996; Moore, 1979; Neumaier,
1990). References (Allgower, Georg, & Miranda, 1992; Dian & Kearfott, 2003;
Xu, Zhang, & Wang, 1996) are also useful. Substantial effort has been expended
on methods to sharpen the interval tests or to reduce the computation required
(Georg, 2001, 2003; Kearfott, 1997), and software packages are available, including
IntBis (Kearfott & Novoa, 1990), ALIAS (Merlet, 2001), and IntLab (Rump, 1999).
A major strength of the approach is that the search can be conducted entirely
in the reals and limited to a finite region of space, so if one is only interested in such
solutions, effort is not expended elsewhere. The approach also easily generalizes to
non-polynomial functions, just by including interval extensions of other elementary
functions. (In fact, almost all the literature on the subject is for general nonlinear,
continuous functions.) Importantly, even though we are using floating point arith-
metic, we obtain not just an approximate answer, but also mathematically reliable
bounds and a guarantee that all solutions in the initial box are somewhere in the
final set of solution boxes.
The method has several weaknesses. First, the Newton test is inconclusive in the
neighborhood of singular solutions, even isolated ones, in which case the method
behaves like bisection and converges slowly. Worse, in the presence of a higher-
dimensional solution set that intersects the initial box, the method returns a set
of boxes covering that whole set. The number of such boxes grows exponentially
with the dimension of the solution set, so this sea of boxes can easily founder the
computation. Finally, interval arithmetic does not return sharp results, and with
every arithmetic operation, the looseness may accumulate. For functions with many
operations, the interval extensions may grossly overestimate the true bounds. This
also applies to the linear solving step in the interval Newton test, so that for large
dimensions n, loose bounds inevitably accumulate.

6.2 Elimination Methods

Instead of numerically attacking all variables at once, as in the exclusion methods,


one can eliminate some variables and then numerically solve for the remaining
ones. The extreme case is to eliminate all but one variable, so that the remaining
polynomial can be numerically solved readily. Then, a backsolving procedure must
reconstruct the other variables.
For a system of n polynomials in n unknowns, say f(x_1, ..., x_n) = 0, we call
an "eliminant" any system of m < n equations in m unknowns, g(x_1, ..., x_m) = 0,
such that if x* is an isolated solution of f = 0, then π(x*) is also an isolated solution
of g = 0, where π : (x_1, ..., x_n) ↦ (x_1, ..., x_m) is the projection onto the first m
variables. Note that the vanishing of g is only a necessary condition for f to vanish;
exact eliminants that are also sufficient are found in some approaches.
The solution set of g = 0 includes the projection of the solution set of f = 0.
If the projection were in a general direction, then distinct isolated solutions of
f = 0 would project to distinct isolated solutions of g = 0. But, if we merely
project onto the first m variables, the projection may not be general, and so several
solutions may project to the same point. The backsolving procedure must then be
able to find all of the pre-images of that point. To avoid such complications, one
may introduce a random, linear change of variables before computing the eliminant,
effectively randomizing the subsequent projection direction. While simplifying the
backsolve, this may make the calculation of the eliminant more difficult.
With only the necessary condition in place, some isolated solutions of g = 0 may
not have pre-images that are isolated solutions to f = 0. These are called extraneous
solutions. The conceptualization of elimination as a projection explains how such
solutions can appear in the eliminant. One way is that a positive dimensional
solution set might project to a point, as in a vertical line under the projection from
the xy-plane to the x-axis. The other is that a solution at infinity of f = 0 might
project to a finite solution of g = 0.

A worse situation for elimination is when f = 0 has a positive dimensional
solution set, for then, the projection of such a set may contain the projection of some
isolated solution. In fact, if we eliminate to a single variable and if that variable is
not constant on the positive dimensional solution set, then the projection covers the
entirety of C and no isolated solutions can be found. This means that elimination
to a single variable produces just the null polynomial.

Under the assumption that f = 0 has only isolated solutions, we may find
all of them by computing an eliminant, finding all of its isolated solutions, and
backsolving these, checking for extraneous solutions. There are several approaches
for computing eliminants along with a backsolving procedure. One of the most
popular approaches is to use resultants, which we discuss next.
6.2.1 Resultants

Recall from Theorem 5.1.4 that the condition for two polynomials p_1(z) and p_2(z)
in one variable to have a common root is the vanishing of their Sylvester resultant
Res(p_1, p_2), a determinant in the coefficients of the two polynomials. Similarly, for
degrees d_1, ..., d_n, let p_i(x) be the polynomial in x = (x_1, ..., x_{n−1}) composed of all
monomials x^α with |α| ≤ d_i and with coefficient c_{i,α} on monomial x^α. This is called
the "universal polynomial system" of degree d_1, ..., d_n. There is a polynomial in the
coefficients c_{i,α} called the resultant, unique up to scale, such that the n polynomials
p_i in n − 1 variables x have a common root if and only if the resultant is zero
(Cox, Little, & O'Shea, 1998, Ch. 3, Thm. 2.3).² We may denote the resultant as
Res_{d_1,...,d_n} to indicate its relation to the degrees of the polynomials. An exposition
on how to find the resultant for n > 2 is beyond our scope; see (Canny & Manocha,
1993; Cox et al., 1998; Manocha, 1993) for details. More generally, following the
notation introduced in Equation 1.2.3, for index sets I_i, i = 1, ..., n, we suppose
that polynomial p_i(x) is of the form p_i(x) = Σ_{α∈I_i} c_{i,α} x^α. Then, the condition
that the polynomials have a common root is again a resultant polynomial in the
coefficients, called the sparse resultant (Cox et al., 1998; Emiris, 1994, 1995; Gelfand,
Kapranov, & Zelevinsky, 1994), which we may denote as Res_{I_1,...,I_n}.
While Res_{d_1,d_2} is given in Equation 5.1.1 as the determinant of a matrix having
a single coefficient or zero in each entry, this is not true in the general case. For
universal polynomial systems, the resultant is a ratio of two such determinants, e.g.,
(Cox et al., 1998, Ch. 3, Thm. 4.9) and (Macaulay, 1902). For nongeneric coefficients,
such as when a system has specific integer coefficients or when a system is sparse,
the determinant in the denominator of such an expression may vanish, so that more
complicated formulae may have to be employed. Some conditions that guarantee
that the resultant has an expression as a single determinant, sometimes referred to
as a resultant of "Sylvester type," are given in (Sturmfels & Zelevinsky, 1994).
Although resultants apply to n polynomials in n — 1 variables, several techniques
exist for applying them to compute solutions to n polynomials in n variables. We
briefly touch on two of them here.

6.2.1.1 Hidden Variable Resultants

The hidden variable technique picks out one variable, say x_n, and rewrites each
polynomial p_i(x_1, ..., x_n) as a polynomial in just y = (x_1, ..., x_{n−1}) with coeffi-
cients that depend on x_n. That is,

p_i(x) = Σ_{α∈I_i} c_{i,α} x^α = Σ_{α∈J_i} c_{i,α}(x_n) y^α,

where J_i is a new index set and c_{i,α}(x_n) are the corresponding coefficient polyno-
mials, these being derived from the c_{i,α} after hiding x_n and collecting like terms. Then,
a necessary condition that p_1(x) = 0, ..., p_n(x) = 0 have a common solution is

Res_{J_1,...,J_n}(c_{i,α}(x_n)) = 0,    (6.2.2)

where we mean to indicate that the resultant depends on all the coefficient poly-
nomials that appear in the system of equations. Since this is a polynomial in the
single variable x_n, we may solve it numerically via the eigenvalues of the companion
matrix or any other suitable numerical method.

² Officially, the scale is made unique by adding an extra condition, as in (Cox et al., 1998), but
that is not of interest to us here.
Equation 6.2.2 does not tell us how to find the corresponding values of the
remaining variables. We will not address this in a general way, but will content
ourselves to show how it can be done for systems of two equations in two variables.
We have p_1(x, y) = a_0(y)x^{d_1} + ... + a_{d_1}(y) and p_2(x, y) = b_0(y)x^{d_2} + ... + b_{d_2}(y),
where y is "hidden" in the coefficients. Looking back to the proof of Theorem 5.1.4,
we note that each column in Equation 5.1.2 corresponds to a power of x, that is,
we have the matrix equation

0 = [g, −f] ·
    [ a_0 a_1 ... a_{d_1}      0       0 ... ]   [ x^{d_1+d_2−1} ]
    [  0  a_0 ... a_{d_1−1} a_{d_1}    0 ... ]   [       :       ]
    [  :                                     ] · [       :       ]    (6.2.3)
    [ b_0 b_1 ... b_{d_2}      0       0 ... ]   [       x       ]
    [  0  b_0 ... b_{d_2−1} b_{d_2}    0 ... ]   [       1       ]
    [  :                                     ]

where, to save space, we have written g and f in place of the row vector for their
coefficients. We may rename the matrices appearing in this equation as

[g, −f] S(y) 𝐱 = 0,

so that the resultant condition is just det S(y) = 0. Key to the proof of Theo-
rem 5.1.4 was that the vanishing of the resultant is necessary for the existence of
left null vectors [g, −f] satisfying [g, −f] S(y) = 0, but this also implies the exis-
tence of right null vectors 𝐱 satisfying S(y) 𝐱 = 0. So for each value of y satisfying
det S(y) = 0, we solve the linear homogeneous system S(y) 𝐱 = 0 for 𝐱, and since
this is determined only up to scale, we recover x as the ratio of the last two entries
in 𝐱. This approach assumes the co-rank of S(y) is one at each solution for y,
otherwise 𝐱 is not uniquely determined. Also, the final entry in the solution for 𝐱
must be nonzero for x to be well defined. We cannot go into the details of what to
do when these conditions fail.

Example 6.2.1 Using y as the hidden variable, the resultant formulation for the
system

2x^2 − xy − y − 2 = 0
x^2 − y^2 − 2x + 2y = 0

is

[ 2  −y    −y−2        0      ] [ x^3 ]
[ 0   2     −y       −y−2     ] [ x^2 ]  =  0.    (6.2.4)
[ 1  −2  −y^2+2y       0      ] [  x  ]
[ 0   1     −2      −y^2+2y   ] [  1  ]

The determinant of the matrix gives the resultant −12 + 16y + 11y^2 − 14y^3 + 3y^4,
whose roots are y = −1, 2/3, 2, 3. Substituting each of these in turn back into
Equation 6.2.4 and solving the homogeneous linear system, one obtains column
vectors whose last two entries are in the ratio x = −1, 4/3, 2, −1, respectively.
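The backsolving step of this example can be carried out numerically (a sketch using NumPy: the roots of the determinant are taken from the text, and the null vector of S(y) is read off from the SVD):

```python
import numpy as np

def S(y):
    """Sylvester matrix of the two quadratics with y hidden (Equation 6.2.4)."""
    return np.array([
        [2.0, -y, -y - 2.0, 0.0],
        [0.0, 2.0, -y, -y - 2.0],
        [1.0, -2.0, -y**2 + 2.0 * y, 0.0],
        [0.0, 1.0, -2.0, -y**2 + 2.0 * y],
    ])

for y in (-1.0, 2.0 / 3.0, 2.0, 3.0):    # the roots of det S(y) = 0
    sing = np.linalg.svd(S(y), compute_uv=False)
    assert sing[-1] < 1e-8 * sing[0]     # S(y) is numerically rank deficient
    v = np.linalg.svd(S(y))[2][-1]       # right null vector ~ [x^3, x^2, x, 1]
    print(y, v[2] / v[3])                # recover x as the ratio of last entries
```

The printed ratios reproduce the values x = −1, 4/3, 2, −1 quoted above.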
For nongeneric coefficients c_{i,α}, the hidden resultant formula Equation 6.2.2 can
fail to yield solutions of the system. The problem is that the system may have
a positive-dimensional solution set so that there is a solution x for every value of
x_n. This implies that the hidden-variable resultant must be identically zero. The
system may have isolated solution points in addition to the positive dimensional
solution set, but the resultant formula does not find them. An approach for dealing
with this situation can be found in (Canny, 1990).
Example 6.2.2 Consider a system of two quadratics of the form

x^2 + (3y + 4)x + (2y^2 + 5y + 3) = 0
x^2 + 7x + (−y^2 + 5y + 6) = 0.

Using y as the hidden variable, the resultant condition is

det [ 1  3y+4  2y^2+5y+3       0       ]
    [ 0   1      3y+4      2y^2+5y+3   ]  =  0.
    [ 1   7   −y^2+5y+6        0       ]
    [ 0   1       7       −y^2+5y+6    ]

A bit of algebra shows that this polynomial is identically zero, even though the
system has a nonsingular root, (x, y) = (−5, 1). The trouble is that the system also
has a singular solution set: x + y + 1 = 0.
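This identically-zero determinant is exactly what a probabilistic null test detects numerically: at a random test value of y, the matrix is rank deficient. A sketch:

```python
import numpy as np

def S(y):
    """Sylvester matrix of Example 6.2.2 with y hidden."""
    return np.array([
        [1.0, 3*y + 4, 2*y**2 + 5*y + 3, 0.0],
        [0.0, 1.0, 3*y + 4, 2*y**2 + 5*y + 3],
        [1.0, 7.0, -y**2 + 5*y + 6, 0.0],
        [0.0, 1.0, 7.0, -y**2 + 5*y + 6],
    ])

rng = np.random.default_rng(2)
y = rng.uniform(-10, 10)                 # a random test value y*
sing = np.linalg.svd(S(y), compute_uv=False)
print(sing)

# A zero singular value at a random y* flags det S(y) as identically zero,
# reflecting the common factor x + y + 1 shared by the two quadratics.
assert sing[-1] < 1e-8 * sing[0]
```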
This failure of the hidden-variable resultant formula on nongeneric systems is
one of the major drawbacks of the approach. Also, the symbolic derivation of
resultant formulae can be an onerous task, even if done using computer algebra.
For example, a result due to B. Sturmfels, reported in (Cox et al., 1998), is that the
resultant for three general quadratics in two variables, Res_{2,2,2}, when fully expanded
as a degree 12 polynomial in the 18 coefficients of the system, has 21,894 terms.
This exaggerates the problem though, for when we apply the method to a system of
three quadratics in three variables having numerical coefficients, the hidden variable
resultant formula gives at most a degree 8 polynomial in the hidden variable. The
trick, then, is to use resultant theory to set up Sylvester-type matrix formulae and
operate on these, without expanding the associated determinants; see (Manocha,
1994). Such approaches can be very fast, especially for small or sparse systems,
which may outweigh the drawbacks. For large systems, the resultant formulae tend
to be unwieldy, and the method is no longer useful.

6.2.1.2 u-Resultants

Instead of hiding a variable to get n equations in n − 1 variables, one can add an
extra linear equation

f(x) = u_0 + u_1x_1 + ... + u_nx_n = 0

to get n + 1 equations in n variables. This is the first step in the so-called u-
resultant method. If the coefficients u_i were to be specified as given constants, then
in general the whole system would not have any solutions, but the idea is to treat
them as unknown. The resultant for the system f, p_1, ..., p_n will depend on the
coefficients of all the p_i and on u_0, ..., u_n. In fact, it factors as a constant multiple
of a polynomial of the form ∏_k (u_0 + a_{1k}u_1 + ... + a_{nk}u_n), whereupon the kth
solution can be read off as x = (a_{1k}, ..., a_{nk}). The u-resultant usually becomes
unmanageable, because even though the coefficients of the p_i polynomials will in
application be numerical, the u's must be carried through symbolically in what can
be rather large determinant formulae. Afterwards, the large polynomial must be
factored.

In a maneuver akin to the hidden variable approach, one can reduce computation
by singling out one variable, say x_n, and appending the simpler equation f(x) =
u_0 − x_n. Then, substituting numerical values for the coefficients of the p_i, the
resultant just depends on u_0, and its roots are the values of x_n.
This describes just the gist of resultant methods, since we have evaded the
rather difficult technical issue of deriving resultant formulae. The following
subsections discuss techniques that, instead of working with the resultant itself, work
with a polynomial multiple of it, which is often all that is required to compute
solutions numerically.

6.2.2 Numerically Confirmed Eliminants


There are a number of ways to eliminate variables other than resultant formulae.
Typically, these come down to a final expression of the form

    A(x_n) m = 0,    (6.2.6)

where A is a matrix whose entries are polynomials in x_n and m is a column vector
of monomials in x_1, ..., x_{n−1}. An example is the Sylvester formula given in
Equation 6.2.3. When A(x_n) is square, the existence of a nontrivial solution requires

    det A(x_n) = 0,    (6.2.7)

which is a polynomial in x_n. This general approach is sometimes called "Sylvester
dialytic elimination" (Raghavan & Roth, 1995). Often the procedure leading to such
a formula guarantees only that it is a necessary condition, which may be equivalent
to the hidden variable resultant, a polynomial multiple of the resultant, or a null
polynomial.
The case of a null polynomial can be detected using the probabilistic null test
(§ 4.3). Instead of evaluating the determinant directly, it is numerically more stable
and reliable to test, for a random test value x*, whether A(x*) is full rank using a
singular value decomposition. If this shows that A(x_n) is generically nonsingular,
the solutions for x_n in Equation 6.2.7 and the corresponding monomial vectors from
Equation 6.2.6 must include all the solutions of the original problem. These can
be tested in the original equations to see if any extraneous solutions are included.
If so, this means that det A(x_n) includes an extraneous polynomial factor whose
degree is the number of extraneous roots.
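This rank test can be sketched in a few lines. The function name, tolerance, and the 2 × 2 matrix examples below are illustrative placeholders, not from the text:

```python
import numpy as np

def is_generically_nonsingular(A_of_x, tol=1e-8, trials=3):
    """Probabilistic null test: with probability one, A(x) is generically
    full rank if and only if it is full rank at a random complex point."""
    rng = np.random.default_rng(0)
    for _ in range(trials):
        xstar = complex(rng.standard_normal(), rng.standard_normal())
        s = np.linalg.svd(A_of_x(xstar), compute_uv=False)
        if s[-1] > tol * s[0]:   # smallest singular value well above noise
            return True
    return False

# Hypothetical examples: det [[x,1],[1,x]] = x^2 - 1 is generically nonzero,
# while [[x,x],[1,1]] is singular for every x.
A = lambda x: np.array([[x, 1.0], [1.0, x]])
print(is_generically_nonsingular(A))                                          # True
print(is_generically_nonsingular(lambda x: np.array([[x, x], [1.0, 1.0]])))   # False
```

Testing at a few random points guards against the (probability-zero) chance of landing on an exceptional value.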
Generally, it is numerically disadvantageous to solve the determinantal polynomial
of Equation 6.2.7 en route to solving Equation 6.2.6. It is better to convert
Equation 6.2.6 to an equivalent eigenvalue problem, whereby the monomials m are
recovered from the eigenvectors. We will return to this briefly below.
This description has been intentionally sketchy and is meant only to convey the
general sense of the approach. Again, the approach gives necessary, not sufficient,
conditions. One approach described below, the Dixon determinant, is at least al-
gorithmic, but we also include a short description of some other more heuristic
methods. Either way, probabilistic numerical tests can be used to determine if the
formulae are nontrivial and if extraneous roots are present.

6.2.3 Dixon Determinants


One of the earliest solution methods for three polynomials in three variables is due
to Dixon (Dixon, 1909), which in modern notation is generalized to n polynomials
as follows. Given n polynomials f_1, ..., f_n in n − 1 variables x_1, ..., x_{n−1}, one
introduces new variables α_1, ..., α_{n−1} and forms the determinant

    Δ = det [ f_1(x_1, x_2, ..., x_{n−1})   · · ·   f_n(x_1, x_2, ..., x_{n−1}) ]
            [ f_1(α_1, x_2, ..., x_{n−1})   · · ·   f_n(α_1, x_2, ..., x_{n−1}) ]    (6.2.8)
            [            ⋮                                      ⋮               ]
            [ f_1(α_1, α_2, ..., α_{n−1})   · · ·   f_n(α_1, α_2, ..., α_{n−1}) ]

In the ith row of this determinant, variables x_1, ..., x_{i−1} are replaced by
α_1, ..., α_{i−1}. If for any i we let x_i = α_i, then row i and row i + 1 will be
identical, and so the determinant is zero. Cancelling out such factors, one obtains
the Dixon polynomial

    δ(x_1, ..., x_{n−1}, α_1, ..., α_{n−1}) = Δ / ∏_{i=1}^{n−1} (x_i − α_i).    (6.2.9)

When this determinant is expanded and like terms collected, it can be put into the
form δ = m_α W m_x, where m_α is a row vector of monomials in the α_i variables,
m_x is a column vector of monomials in the variables x_i, and W is a function of
the coefficients of f_1, ..., f_n. It is clear that for a common solution of the original
equations, the first row of the determinant is zero, so δ must also be zero. Moreover,
this will be true for arbitrary values of the auxiliary variables α_i. Consequently,
solutions must satisfy the matrix equation

    W m_x = 0.    (6.2.10)

It happens that W is square, so a necessary condition that f_1, ..., f_n have a common
root is det W = 0.
Notice that the procedure as just described has one more equation f_i than
unknown x_i. To use this as an elimination method, one may apply the same trick
as in the hidden-variable resultant of § 6.2.1.1; that is, consider the f_i as polynomials
in x_1, ..., x_{n−1} with coefficients that depend on x_n. Then the eliminant matrix W
in Equation 6.2.10 depends only on x_n, and det W = 0 is a polynomial equation in
one variable, and we have the situation described at Equation 6.2.6. The Dixon
determinant can, of course, be used in symbolic work; see, for example, (Mourrain,
1993). Some examples of its use in formulating numerical algorithms in kinematics
are in (Nielsen & Roth, 1999; Wampler, 2001).

Example 6.2.3 (Three quadratics) To apply Dixon's method to three quadratics,
rewrite them in the form, for i = 1, 2, 3,

    f_i(x, y, z) = (c_{0i} + c_{1i} x + c_{2i} x^2) + (c_{3i} + c_{4i} x) y + (c_{5i} + c_{6i} x) z
                   + c_{7i} y^2 + c_{8i} y z + c_{9i} z^2    (6.2.11)
                := c_{00i} y^0 z^0 + c_{10i} y^1 z^0 + c_{01i} y^0 z^1
                   + c_{20i} y^2 z^0 + c_{11i} y^1 z^1 + c_{02i} y^0 z^2,    (6.2.12)

where c_{mni} is a polynomial in x of degree 2 − m − n. At Equation (6.2.8), we
will have a 3 × 3 matrix, where y, z play the role of x_1, x_2 and x is hidden in the
coefficients c_{mni}. Subtracting row 2 from row 1 and row 3 from row 2, and then
cancelling a factor of (y − α_1) from the new row 1 and a factor (z − α_2) from the
new row 2, δ in Equation 6.2.9 is a 3 × 3 determinant whose ith column is

    c_{10i} + c_{20i}(y + α_1) + c_{11i} z
    c_{01i} + c_{11i} α_1 + c_{02i}(z + α_2)    (6.2.13)
    c_{00i} + c_{10i} α_1 + c_{01i} α_2 + c_{20i} α_1^2 + c_{11i} α_1 α_2 + c_{02i} α_2^2

The determinant is linear in y, quadratic in z, and quadratic in y, z together, so
it gives terms only in the monomial set m_x = {1, y, z, yz, z^2}. Expanding and
collecting like terms, one obtains a matrix W of size 5 × 5, each entry of which is
a polynomial in x. The degrees of these entries are as follows, where "0" indicates
an entry that is identically zero:

    deg W = [ 4 3 3 2 2 ]
            [ 3 2 2 1 1 ]
            [ 3 2 2 1 1 ]    (6.2.14)
            [ 2 1 1 0 0 ]
            [ 2 1 1 0 0 ]

From this, one sees that det W is a polynomial of degree 8 in x, as one would expect
for the intersection of three quadratics.
It remains to show that det W is nontrivial, which may be done by checking the
rank of W for a random test value of x. It turns out that this is so for general
coefficients c_{mni}, and we have the equivalent of the hidden-variable resultant. Of
course, the method will fail on examples like Example 6.2.2, for the simple reason
that elimination can never work in the presence of positive dimensional solutions.
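Although the text treats three quadratics, the mechanics of Dixon's construction are easiest to check in the degenerate case n = 2, i.e., two polynomials in one variable, where the construction reduces to the classical Bezout matrix. In the sketch below, the polynomials are arbitrary, chosen to share the root x = 2 so that det W vanishes:

```python
import sympy as sp

x, a = sp.symbols('x alpha')
f1 = x**2 - 3*x + 2      # roots 1, 2
f2 = x**2 - 5*x + 6      # roots 2, 3 (shares the root x = 2 with f1)
# Dixon's construction for n = 2: form the 2x2 determinant and cancel (x - alpha)
delta = sp.cancel((f1 * f2.subs(x, a) - f2 * f1.subs(x, a)) / (x - a))
# Collect delta = m_alpha W m_x with m_alpha = (1, alpha), m_x = (1, x)
p = sp.Poly(delta, a, x)
W = sp.Matrix(2, 2, lambda i, j: p.coeff_monomial(a**i * x**j))
# A common root forces det W = 0
print(sp.det(W))  # 0, since f1 and f2 share the root x = 2
```

Replacing f2 by a polynomial with no root in common with f1 makes det W nonzero.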

6.2.4 Heuristic Eliminants


Historically, a very popular approach among engineers has been to heuristically
search out an eliminant of the form of Equation 6.2.6. The basic idea is that
if f(x) = 0, then necessarily x^α f(x) = 0, where x^α is any monomial (written
in multidegree notation). Augmenting the original system of polynomials
f_1(x) = 0, ..., f_n(x) = 0 with a number of such auxiliary equations, one may with
some cunning or luck arrive at a system that can be written in the desired form,
with an eliminant matrix that is square and generically nonsingular. Often, a "hidden
variable" approach is used, meaning that at the outset one of the variables is
"hidden" in the coefficients and the analyst tries to construct an augmented system
of N equations in N monomials that depend only on the other n − 1 variables.
We can be more precise and less restrictive in stating the requirements for a
successful eliminant formulation. First, it is useful to have the notion of ideals.

Definition 6.2.4 (Ideal) The ideal I(F) of a system of polynomials F =
{f_1, f_2, ..., f_n} is the set of all polynomials that can be formed by multiplying each
f_i by a polynomial g_i and summing them up:

    I(F) = { h | h = g_1 f_1 + g_2 f_2 + · · · + g_n f_n },    (6.2.15)

where the g_i can be any polynomials in the same variables as F.

Notice that if x is a solution to F, that is, f_i(x) = 0 for i = 1, ..., n, then
h(x) = 0 for any h ∈ I(F). Thus, any subset of polynomials in I(F) is potentially a
set that could be rewritten in the form of Equation 6.2.6 and used as an eliminant.
If the requirements on the number of equations and monomials can be met, and if
the consequent eliminant matrix is nonsingular, then a viable eliminant (possibly
including extraneous roots) has been found.
We call this a "heuristic method," because it is not based on an algorithm guar-
anteed to deliver a set of polynomials in the ideal having the necessary properties
to form an eliminant. See Stetter's book (Stetter, 2004) for more on finding such
formulations without resorting to Grobner methods. (Grobner bases are sketched
in the next section.)
A variant of this approach was presented in (Wampler, 2004) and also used in
(Su, Wampler, & McCarthy, 2004). Instead of hiding a variable, a set of equations
from I(F) is written as a constant matrix A, depending only on the coefficients of the
original equations, times a set of monomials m, as Am = 0. To these, one appends
identity relations that are linear in one variable. For example, if, in multidegree
notation, x^α and x^β are both in the set of monomials m, with α − β = [1, 0, ..., 0],
then the identity x^α − x_1 x^β = 0 is an allowed identity. Such identities can be
appended to the equations from the ideal to form a system

    [B + x_1 C] m = 0,    (6.2.16)

where the lower block is the collection of monomial identities. We are again in
the situation of Equation 6.2.6. An important characteristic of Equation 6.2.16
is that x_1 appears linearly in the elimination matrix, so the numerical solution of
the problem falls within the purview of linear algebra and is, in fact, a sparse,
generalized eigenvalue problem.
To illustrate this last approach, let's return again to the example of three
quadratic equations.
Example 6.2.5 (Three quadratics revisited) Consider again three general quadratics
as in Equation 6.2.11. We may multiply each of the three original polynomials
by each of nine monomials {1, x, y, z, x^2, xy, xz, y^2, yz} to generate a set of
27 polynomials in the ideal. These polynomials contain 34 monomials, being all the
monomials of degree 4 or less, except for z^4. Thus, A in Equation 6.2.16 is a 27 × 34
matrix, but numerical testing shows that its rank is only 26. Keeping 26 independent
rows of A, we need 8 identities to produce a square system. These can be formed
using x as the eigenvariable and the 8 monomials {1, x, x^2, x^3, y, xy, z, xz}, that is,
the 8 identities are

    x · {1, x, x^2, x^3, y, xy, z, xz} = {x, x^2, x^3, x^4, xy, x^2 y, xz, x^2 z}.

The net result is a 34 × 34 generalized eigenvalue problem in which x appears only
in the last 8 rows. A numeric test shows that for generic coefficients and a random
value of x, the matrix is nonsingular, so this is indeed an eliminant, and in fact,
generically there are no extraneous roots. The problem is sparse, as there are 10
nonzero entries in each of the first 26 rows and just two nonzero entries in each of
the last 8 rows (these being one appearance each of x and −1). Standard linear
algebra can be used to reduce the problem to size 8 before applying an eigenvalue
routine to solve for x.
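To see how a formulation like Equation 6.2.16 turns into an eigenvalue computation, here is a deliberately tiny sketch: a single quadratic x^2 − 3x + 2 = 0 written as [B + xC]m = 0 with m = (1, x), one monomial identity, and one polynomial row. The matrices are built by hand for illustration:

```python
import numpy as np

# Toy eliminant in the form (B + x*C) m = 0 with m = (1, x):
#   row 1: the monomial identity  x*1 - x = 0
#   row 2: the polynomial x^2 - 3x + 2 = 0, i.e., 2*1 - 3*x + x*(x) = 0
B = np.array([[0.0, -1.0],
              [2.0, -3.0]])
C = np.eye(2)
# (B + x C) m = 0  <=>  (-C^{-1} B) m = x m, an ordinary eigenproblem;
# for singular C one would call a generalized eigensolver instead.
vals, vecs = np.linalg.eig(-np.linalg.solve(C, B))
print(sorted(vals.real))  # the roots, approximately [1.0, 2.0]
```

Each eigenvector, rescaled so that its first entry is 1, reproduces the monomial vector m = (1, x), which is how the monomials are recovered from eigenvectors in the larger problems discussed above.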

6.3 Grobner Methods

Grobner bases have wide applicability in computational algebraic geometry. Out


of the vast literature on the subject, we can recommend the textbook (Cox et al.,
1997) for a good introduction to the concept and to the Buchberger algorithm for
computing them, while the sequel (Cox et al., 1998) illustrates the many uses of the
approach. In the more narrow objective of finding isolated solutions to systems of
polynomial equations, the computation of a Grobner basis can be used as the key
step in turning the heuristic approach of the previous section into an algorithm.
That is, it can be viewed as an organized way to generate new polynomials in
the ideal, I(F), selecting a subset that retains exactly the same solution set as
the original polynomials, and determining a valid set of monomial identities that
complete the definition of an eigenvalue problem. Routines for computing Grobner
bases can be found in most general symbolic processing software packages, such
as Maple and Mathematica; the Singular package (Greuel & Pfister, 2002), which
is specialized to computer algebra, is one of the most efficient implementations
available. We give only a brief glimpse of the approach here.

6.3.1 Definitions
First, some terminology. We already introduced ideals in Definition 6.2.4 above. A
basis of an ideal is defined as follows.

Definition 6.3.1 (Basis of an Ideal) Let I be an ideal. Any set of polynomials
H = {h_1, ..., h_m}, h_j ∈ I, j = 1, ..., m, that generates I, that is, I(H) = I, is
called a basis for I.

Two bases for the same ideal, say F and H with I(F) = I(H), have the same set
of solutions, because each of them is in the ideal of the other. So, for the purpose
of equation solving, we may exchange one basis for another at our convenience.
Beginning with a system F that we wish to solve, one may, of course, append any
auxiliary polynomials in the ideal, these being algebraic combinations of the original
polynomials, without changing the ideal. We may also discard any polynomials that
can be generated from others remaining in the basis. In this way, we can manipulate
the polynomials into helpful forms without changing the solution set.
The key to organizing the process of changing bases is to establish a monomial
ordering that tells which of any two monomials is "greater."

Definition 6.3.2 A monomial ordering is a set of rules for comparing monomials


that satisfies the following three statements.
(1) The ordering always tells which of two distinct monomials is greater.
(2) The relative order of two monomials does not change when they are each mul-
tiplied by the same monomial.
(3) Every strictly decreasing sequence of monomials eventually terminates.
Once a monomial ordering has been established, a polynomial, being a sum of mono-
mials with coefficients, has a leading term whose monomial, the leading monomial,
is greater than any other appearing in the polynomial.
There are many possible monomial orderings. Among the most useful are the
graded orderings, which compare monomials by their degrees, that is, if |α| > |β|
then x^α > x^β, with one or more secondary rules for ordering monomials of equal
degree. "Graded lexicographic" and "graded reverse lexicographic" orderings are
frequently used (Cox et al., 1997).
This finally brings us to the definition of a Grobner basis.
Definition 6.3.3 (Grobner Basis) A Grobner basis, G, for an ideal I with respect
to a given monomial ordering is a basis of I such that the leading monomial
of every polynomial in I is a multiple of at least one of the leading monomials of G.
The main technique for computing a Grobner basis is called Buchberger's algorithm.
We start with a set of polynomials, say F. Initialize G = F. Then, pick any
two polynomials in G and combine them so as to cancel out their leading terms (via
the leading terms' least common multiple). This is called their "s-polynomial." If
the result has a leading term that is a multiple of the leading monomial of any of
the members of G, reduce it by again forming the s-polynomial, proceeding until
it can no longer be reduced. If at that point it is nonzero, append it to G. Pick
another pair and do the same series of operations. Keep repeating this until there
are no two members of G that generate a new nonzero member. The final G is a
Grobner basis for F. We can further reduce G to a minimal Grobner basis by dividing
out the leading coefficients and eliminating any members of the basis that are
in the ideal of the other members. This is just a basic description: efficient versions
of the algorithm carefully prune the list along the way, use sophisticated tests to
avoid forming s-polynomials that cannot be fruitful, and use informed heuristics for
deciding which pair to combine next so as to speed up termination of the algorithm.
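As a concrete experiment with these ideas, sympy's groebner routine is one readily available implementation (the example system below is our own choice, not from the text). Every polynomial of the input system reduces to remainder zero against its Grobner basis, since each lies in the ideal the basis generates:

```python
import sympy as sp

x, y = sp.symbols('x y')
# A system with four isolated solutions: (1,2), (2,1), (-1,-2), (-2,-1)
F = [x**2 + y**2 - 5, x*y - 2]
G = sp.groebner(F, x, y, order='grevlex')
# Each original polynomial lies in the ideal, so its remainder is zero
for f in F:
    quotients, remainder = G.reduce(f)
    assert remainder == 0
print(list(G.exprs))
```

By default sympy returns a reduced Grobner basis, the analogue of the minimal basis described above.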
The following describes a useful property of Grobner bases. For any set of
monomials m, we can define the normal set of monomials as those monomials
which are not multiples of any monomial in m. We can extend this to say that the
normal set for an ideal is the normal set of all the leading monomials of the ideal.
From the definition of a Grobner basis, stated above, one may conclude that if G is
a Grobner basis of I, then the normal set of I is just the normal set of the leading
monomials of G.

We need one last property of a Grobner basis. Any polynomial p has a unique
remainder r with respect to a Grobner basis G such that p = g + r with g ∈ I(G)
and no term of r divisible by a leading monomial of G. In other words, all of the
monomials in r are in the normal set of G. The remainder can be computed by
initializing r = p, and if any term in r is divisible by the leading monomial of any
g ∈ G, we just add the appropriate multiple of g to r to cancel that term. Repeat
this until no term in r is so divisible. Let us denote the remainder as rem_G(p).
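A computer algebra system can carry out this division directly. The system below is an arbitrary two-variable example; its grevlex Grobner basis has leading monomials {x^2, xy, y^3}, so the remainder of x^3 comes out supported on the normal set {1, x, y, y^2}:

```python
import sympy as sp

x, y = sp.symbols('x y')
G = sp.groebner([x**2 + y**2 - 5, x*y - 2], x, y, order='grevlex')
# rem_G(p): the unique remainder of p on division by the Groebner basis;
# every monomial of the remainder lies in the normal set
_, r = G.reduce(x**3)
print(r)  # 5*x - 2*y
```

As a sanity check, 5x − 2y agrees with x^3 at every solution of the system, as it must, since they differ by an element of the ideal.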

6.3.2 From Grobner Bases to Eigenvalues


We now sketch how to use a Grobner basis to derive an eigenvalue problem to solve
the system. For details, see (Moller, 1998; Moller & Stetter, 1995; Stetter, 2004).
Normal sets, as defined above, are key. Suppose that a set of polynomials has a finite
number of solutions. Then, the number of solutions, counted with multiplicities, is
equal to the number of monomials in the normal set. (See Proposition 3 of (Moller &
Stetter, 1995).) Furthermore, we can use the normal set as the eigenvector in an
eigenvalue problem that determines the solutions. A Grobner basis allows us to
easily find the normal set and formulate the associated eigenvalue problem.
Suppose we are solving a system of polynomials F in x = (x_1, ..., x_n) which has
a Grobner basis G. Let λ be any linear combination,

    λ = c_0 + c_1 x_1 + · · · + c_n x_n    (6.3.17)

for given constants c_0, ..., c_n. Let n = {x^{α_1}, ..., x^{α_k}} be the normal set of F.
Assuming F has a finite number of solutions, we know that the number of solutions
must be k, and we wish to formulate a size k eigenvalue problem to find them.

Consider the polynomial p_i(x) = λ x^{α_i} for some i, and suppose that x* is a
solution of F. Since f(x*) = 0 for any f ∈ F, we can add any multiple of a
polynomial in the ideal of F to p_i without changing the value of p_i(x*). This
implies that if r_i = rem_G(p_i), then r_i(x*) = p_i(x*) = λ n_i. But r_i(x) is a sum of
terms in the normal set, so we can write it as r_i = [a_{i1} · · · a_{ik}] n. The entries a_{ij}
are just the constant coefficients in the formulae for the remainders r_i, i = 1, ..., k.
Assembling all these into matrix notation, we have

    A n = λ n.    (6.3.18)
Hence, by computing remainders using the Grobner basis, we have derived an eigenvalue
problem. For each eigenvector n, we can get a unique solution x, because
either x_i ∈ n or else x_i is a leading monomial of G. In the latter case, we just
evaluate x_i using the Grobner basis element that has it as the leading monomial.

By picking the constants c_0, ..., c_n at random, the procedure is made more
robust than if one were to make a special choice, such as λ = x_n. With the
randomization, distinct solution points give distinct values of λ with probability one.
Still, a root with multiplicity greater than one can lead to a repeated eigenvalue, a
situation that requires extra care, as addressed in (Moller & Stetter, 1995).
We should note that the eigenvector in Equation 6.3.18 is defined only up to
scale. But the correct scale is easily discerned, because one of the members of the
normal set is 1. If the monomial 1 is not in the normal set, then the constant
polynomial p = 1 must be in the Grobner basis, which means that the system has
no solution.
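The whole construction can be sketched end to end on a small example. The system, the normal set, and the functional λ = 3x + 7y below are all illustrative assumptions (λ would normally be chosen at random):

```python
import numpy as np
import sympy as sp

x, y = sp.symbols('x y')
# Four nonsingular solutions: (1,2), (2,1), (-1,-2), (-2,-1)
G = sp.groebner([x**2 + y**2 - 5, x*y - 2], x, y, order='grevlex')
n = [sp.Integer(1), x, y, y**2]   # normal set, read off the leading monomials of G
lam = 3*x + 7*y                   # stand-in for a random linear functional
# Row i of A holds the normal-set coefficients of rem_G(lam * n[i])
A = np.zeros((4, 4))
for i, mon in enumerate(n):
    _, r = G.reduce(sp.expand(lam * mon))
    poly = sp.Poly(r, x, y)
    for j, mj in enumerate(n):
        A[i, j] = float(poly.coeff_monomial(mj))
# The eigenvalues are lam evaluated at the four solutions
print(np.sort(np.linalg.eigvals(A).real))  # approximately [-17, -13, 13, 17]
```

The eigenvalues 3a + 7b for the four solutions (a, b) are ±13 and ±17, and each eigenvector, scaled so its first entry is 1, reads off (1, a, b, b^2).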

6.4 More Methods

The construction of the eigenvalue problem at Equation 6.3.18 has at its heart the
so-called multiplication map for the polynomial system. Eigenvalue problems can
be formed by devising other algorithms for constructing this map without using
Grobner bases and the Buchberger algorithm (Auzinger & Stetter, 1988; Mourrain,
1998; Stetter, 2004). Methods that take advantage of the sparse structure of the
polynomial system are described in (Emiris, 2003) and extensions of the approach
allowing it to treat systems with higher-dimensional components are in (D'Andrea &
Emiris, 2003). (For background on sparse structures, see § 8.5.) Related methods
and more can be found in the book (Dickenstein & Emiris, preprint).

6.5 Floating Point vs. Exact Arithmetic

If the system to be solved has integer or rational coefficients, the elimination and
Grobner methods can be carried forward in exact arithmetic through the stage of
forming an eigenvalue problem. After that point, floating point algorithms must be
employed. In that way, one may be sure that the calculations are rigorous up to the
eigenvalue routine, at which point at least one knows the number of solutions to
expect. For the elimination methods, exact arithmetic guarantees the determination
of the rank of the eliminant matrix, and for Grobner methods, it guarantees that
leading terms are determined properly. In floating point, either of these may require
judgements of whether small numbers should be declared zero, because they may
just be the figments of limited precision.
Unfortunately, exact arithmetic over the integers is often not feasible, because for
a series of computations, the number of digits usually grows ponderously large. To
avoid this, one may do the calculations over a finite field (i.e., over integers modulo
a moderately large prime number). For the purposes of determining the rank of an
eliminant matrix or finding the correct leading term of an s-polynomial, this will
almost certainly work correctly. A polynomial that is found to be nontrivial over
a finite field is certainly nontrivial over the integers, while the opposite direction is
not necessarily true, but holds with a high probability.
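As a sketch of the finite-field tactic for rank determination, Gaussian elimination modulo p needs only exact integer arithmetic, so no "is this small number really zero?" judgement ever arises. The matrix, which has rank 2 over the rationals, and the prime below are arbitrary choices:

```python
def rank_mod_p(M, p):
    """Row-reduce an integer matrix over GF(p) to find its rank exactly."""
    M = [[a % p for a in row] for row in M]
    rank, col = 0, 0
    rows, cols = len(M), len(M[0])
    while rank < rows and col < cols:
        pivot = next((r for r in range(rank, rows) if M[r][col] % p), None)
        if pivot is None:
            col += 1
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        inv = pow(M[rank][col], -1, p)   # modular inverse (Python 3.8+)
        M[rank] = [(v * inv) % p for v in M[rank]]
        for r in range(rows):
            if r != rank and M[r][col]:
                f = M[r][col]
                M[r] = [(M[r][c] - f * M[rank][c]) % p for c in range(cols)]
        rank += 1
        col += 1
    return rank

# Third row equals the sum of the first two, so the rank is 2
M = [[1, 2, 3], [4, 5, 6], [5, 7, 9]]
print(rank_mod_p(M, 10007))  # 2
```

As the text notes, the rank mod p can only underestimate the rank over the integers, and it agrees with it for all but finitely many primes.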
In engineering problems, more often than not, the polynomials have real coeffi-
cients. What shall we do then? One option is simply to proceed in floating point
and make decisions about zero quantities taking round off into account. Another
way is to use a finite field calculation in parallel with the floating point, using the
exact arithmetic results to determine when quantities are zero and using the float-
ing point results as the value of the nonzero quantities. See (Losch, 1995) for a
discussion of this approach to Grobner bases as applied to kinematics problems.
If all one wants is to count the generic number of solutions for a family of
problems, one can sometimes choose a candidate system having integer coefficients
and proceed in exact arithmetic over a finite field. In general, this can only be
employed when the coefficients for the family are a Euclidean space, for if they are
defined by algebraic conditions, integer examples may not exist. See (Faugere &
Lazard, 1995) for examples of this technique applied to problems in kinematics
and also for a discussion of the validity of such demonstrations. (They are not
mathematical proof, but the authors argue that facts discovered in this way may be
more reliable than proofs constructed by fallible humans.) As we argued at the top
of the chapter, however, if one wants solutions, or more properly speaking, solution
estimates, then floating point must be invoked at some point.

6.6 Discussion

All of the methods mentioned above have their place. Exclusion methods work best
in low dimensions and when only the real solutions in a finite region are desired.
Positive dimensional solution sets outside the region of interest have no effect, but
inside the region, they can be devastating. Isolated roots with multiplicity greater
than one cause extra work, as at best, the box containing such a root must be
whittled down to size ε for the method to terminate.
Algorithmic resultant methods and Grobner methods work well on small sys-
tems, and in the case that all solutions are nonsingular and isolated, these can be
very fast as well. However, these methods can get very expensive for high degrees
and many variables. Clever heuristic eliminants can occasionally fill in where algo-
rithmic methods fail. The basic elimination methods described here can break down
completely when positive dimensional solutions exist and multiple roots can also
cause difficulties, although in both cases more advanced techniques can be brought
into play. Also, note that resultants and Grobner methods require the polynomials
to be expanded into sums of monomials; they do not work directly with straight-line
programs for evaluating polynomials. This expansion can significantly increase the
complexity of the calculations (see § 1.2). Work on using straight-line programs in
symbolic computations is relatively new (Krick, 2004).
The continuation methods that we propound have their own set of strengths
and weaknesses. In contrast to exclusion methods, continuation cannot be limited
to a pre-defined region and the only way to find all real solutions is to first find
all complex solutions and then pick out the real ones. For small systems where
elimination is still cheap, continuation can be orders of magnitude more expensive.

On the other hand, homotopy continuation is very robust in the face of multiple
roots and positive dimensional solutions. Homotopies can easily be set up to find all
isolated solutions and, as we shall see in Part III, can be adapted to catalog all the
positive dimensional solutions as well. Continuation can use straight-line programs
and the cost per solution point tends to grow mildly with the number of variables.
And unlike the other methods, continuation very naturally applies to families of
polynomial systems that are parameterized by real-valued physical quantities, such
as typically arise in engineering and science. We take up this last point in some
detail in the next chapter, as we concentrate on the continuation method exclusively
for the remainder of the book.

6.7 Exercises

Exercise 6.1 (Exclusion) Download one of the interval arithmetic packages


mentioned in § 6.1 and try the following.

(1) Solve Example 6.2.1. How large are the boxes when the interval Newton test
terminates?
(2) Solve Example 6.2.2. Try different termination settings for the size ε for which
boxes are put in the list B_ε. How does the number of boxes in the list depend
on ε?
(3) Try to solve the six-revolute serial-link inverse kinematic problem as formulated
in Equations (9.4.30), (9.4.31), and (9.4.32). For parameters, use the Manseur-
Doty example from Exercise 9.5. Beware of long running times. Why?

Exercise 6.2 (Hidden Variable Resultants)

(1) Repeat Examples 6.2.1 and 6.2.2 by hand or using a symbolic manipulation
program.
(2) Convert Equation 6.2.4 to a generalized eigenvalue problem for y by adding
xy and y to the column of monomials and appending related identities. The
resultant matrix should no longer have any quadratic entries. Solve the problem
numerically with an eigenvalue routine (in Matlab, see qz). How do the results
for this 6 x 6 problem reconcile with the fact that we expect just four roots?

Exercise 6.3 (Dixon Determinants) Implement the Dixon determinant for


three general quadratics in three variables, as discussed in § 6.2.3.

(1) Test the method on a system of three randomly generated quadratics.


(2) Try your routine on the following system representing two parallel cylinders of
radius 1 and a sphere of radius r:


    x^2 + y^2 − 1 = 0
    (x − 1)^2 + y^2 − 1 = 0
    x^2 + y^2 + z^2 − r^2 = 0
For r = 2, see what your Dixon algorithm returns. If the algorithm fails, can you
make it work? (Hint: consider using a projective transformation.) Determine
the solutions by hand and see how they compare. Try this again for the case
r = 1.
(3) Append the equation z^2 − 1 = 0 to the system of Example 6.2.2 and try your
Dixon routine. What happens?
(4) Use HOMLAB to solve these same problems, using the total-degree tableau-style
script totdtab. How does it perform?
Exercise 6.4 (Heuristic Eliminants) Repeat Exercise 6.3, but use the heuristic
elimination algorithm described in Example 6.2.5.
Exercise 6.5 (Grobner Bases) This exercise requires access to a computer al-
gebra package that can compute a Grobner basis.
(1) Repeat the examples of Exercise 6.3 once more using an algorithm for computing
Grobner bases. Use a graded ordering and determine the normal set. If possible,
derive an eigenvalue problem using the method described in § 6.3.2. (What can
go wrong here?)
(2) Try to use this approach to solve the six-revolute serial-link inverse kinematic
problem as formulated in Equations (9.4.30), (9.4.31), and (9.4.32).
PART II
Isolated Solutions
Chapter 7

Coefficient-Parameter Homotopy

Equations arising in science and engineering express relationships between various


physical quantities: the length of a bar, a chemical or physical property of a sub-
stance, an angle, a velocity and so on. Some of these quantities are the variables,
whose unknown values are to be found, and the others are known parameters. In
this way, we may consider any one problem to be a member of a whole family of
problems, defined by letting the parameters range over all admissible values.
It is natural to consider how the unknowns change in response to changes in the
parameters. In most cases, but not all, this response is expected to be continuous.
The essence of any continuation method is to track one or more solutions known
for one set of parameter values to get solutions for some new set of parameters.
While parameterized problems arise in many forms, this book is concerned with
polynomial problems: simultaneous equations that are a sum of terms, each term a
product of a coefficient with a monomial, itself the product of nonnegative integer
powers of the variables (Definition 1.2.1). In this context, continuous parameters
enter only through the coefficients, that is, the coefficients are functions of the
parameters. When defining a parameterized family of systems, we often use the
physical parameters of the problem as it originates in engineering or science, while
at other times, we may define artificial parameters, such as the coefficients of the
polynomial system. In this way, even a single polynomial system, one that has
specific coefficients and no parameters, can be cast as a member of a parameterized
family of systems having the same monomials but coefficients ranging over a complex
Euclidean space. This sort of maneuver lies at the heart of classical algebraic
geometry, where the complex number field is king. (Modern algebraic geometry
commonly considers arbitrary number fields, abandoning continuation arguments.)
Assuming that the coefficients are continuous functions of the parameters, a
continuous path through parameter space determines a continuous evolution of the
coefficients and, generally, continuous paths for the solutions as well. We call this
a coefficient-parameter homotopy. Throughout this book, we often abbreviate the
terminology to simply "parameter homotopy," meaning exactly the same thing. In
a sense, every homotopy considered in this book is a parameter homotopy: the key
is in recognizing what parameterization is most useful in a given context.


The beauty of using parameter continuation for polynomial systems is that if we


can find all solutions to a general member of a family, then we can find all solutions
to any other member of that same family. All it takes is a little care to ensure
that continuity is preserved as we move from one parameter set to another. We
begin with the theory that describes this situation and then examine how it can be
applied in solving systems. In this chapter, we consider the situation when all of
the solutions for some general system of the family are known at the outset. The
sequel, Chapter 8, takes up the question of how to get started when no such prior
solution is available.

7.1 Coefficient-Parameter Theory

In the following paragraphs, we will state several versions of the basic coefficient-
parameter theory. The most concise approach would be to give the most general
version first and then state the rest as corollaries, but for the sake of understanding,
let's work the other way, from simple to general.
Before stating the theorem, we need the concept of a Zariski open set. A Zariski
open set of an algebraic set A is any set derived by removing from A an algebraic
subset of A. If A is smooth and connected, e.g., C^m, then except for the trivial case
of the empty set, a Zariski open set of A is dense in A. It is almost all of A with all
the missing pieces equal to a lower-dimensional algebraic set. See § 12.1.1 for more
details.

Theorem 7.1.1 (Basic Parameter Continuation) Let F(z; q) be a system of polynomials in n variables and m parameters,

F(z; q) : C^n × C^m → C^n,

that is, F(z; q) = {f₁(z; q), …, f_n(z; q)} and each fᵢ(z; q) is polynomial in both z and q (see Definition 1.2.1). Furthermore, let N(q) denote the number of nonsingular solutions as a function of q:

N(q) := #{ z ∈ C^n | F(z; q) = 0, det(∂F/∂z(z; q)) ≠ 0 }.


Then,

(1) N(q) is finite, and it is the same, say N, for almost all q ∈ C^m;
(2) For all q ∈ C^m, N(q) ≤ N;
(3) The subset of C^m where N(q) = N is a Zariski open set. That is, the exceptional set Q* := {q ∈ C^m | N(q) < N} is an affine algebraic set contained within an algebraic set of dimension m − 1;
(4) The homotopy F(z; φ(t)) = 0 with φ(t) : [0,1] → C^m \ Q* has N continuous, nonsingular solution paths z(t) ∈ C^n;

(5) As t → 0, the limits of the solution paths of the homotopy F(z; φ(t)) = 0 with φ(t) : [0,1] → C^m and φ(t) ∉ Q* for t ∈ (0,1] include all the nonsingular roots of F(z; φ(0)) = 0.

Items 1 and 3 are implied by Corollary A.14.2; the quantity d₁ in that theorem is the generic number of nonsingular roots, which we denote as N here. In the terminology established in Chapter 4, the property N(q) = N holds generically on q ∈ C^m. Item 2 holds because by Theorem A.14.1, a nonsingular solution at a parameter point q* ∈ C^m must extend to a nonsingular solution in the neighborhood of q*. Hence, it would be a contradiction for the open neighborhood around an exceptional parameter point q* to have fewer nonsingular roots than at q*. On the other hand, in the reverse direction as we approach a point q*, it is possible for a solution path to become singular or to diverge. Items 4 and 5 follow from similar reasoning, because if the N nonsingular solution paths coming from q₁ did not arrive at the nonsingular solutions of q₀, there would be more than N nonsingular solutions in the neighborhood of q₀.

If we have all nonsingular solutions for one set of generic parameters q₁, items 4 and 5 allow us to find all nonsingular solutions to any system in the family by continuation. We simply track all the solution paths along a path, φ(t), through the parameter space that avoids the exceptional set, Q*, for t ∈ (0,1]. All that is lacking is a method for constructing φ(t) to have the required property. But this is easy, as the following lemma shows.

Lemma 7.1.2 Fix a point q₀ ∈ C^m and a proper algebraic set A ⊂ C^m. For almost all q₁ ∈ C^m, the one-real-dimensional open line segment

φ(t) := t q₁ + (1 − t) q₀,   t ∈ (0,1],

is contained in C^m \ A.

Proof. Set A has complex dimension at most m − 1, so it has real dimension at most 2m − 2. Let B be the union of all one-real-dimensional lines through q₀ and any point of A. B has real dimension at most 2m − 1, and so its complement in C^m has real dimension 2m. The set of points q₁ ∈ C^m that give a line segment satisfying the condition of the lemma includes all of C^m \ B. □
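Numerically, the recipe of Lemma 7.1.2 amounts to drawing q₁ with independent random real and imaginary parts and forming the segment φ(t). A minimal Python sketch of this construction (all function names here are our own illustration, not from the text):

```python
import numpy as np

def random_complex(m, seed=0):
    # Draw a generic point of C^m; with probability one it avoids
    # the measure-zero set B excluded in Lemma 7.1.2.
    rng = np.random.default_rng(seed)
    return rng.standard_normal(m) + 1j * rng.standard_normal(m)

def segment(q1, q0):
    # The open line segment phi(t) = t*q1 + (1 - t)*q0 of Lemma 7.1.2.
    return lambda t: t * q1 + (1.0 - t) * q0

q0 = np.array([1.0, 2.0, 3.0])   # real target parameters
q1 = random_complex(3)           # generic complex start parameters
phi = segment(q1, q0)
assert np.allclose(phi(1.0), q1) and np.allclose(phi(0.0), q0)
```

With probability one such a random complex q₁ lies off the bad set B of the lemma, so the segment stays in C^m \ A for t ∈ (0,1].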

Item 5 of Theorem 7.1.1 together with Lemma 7.1.2 implies that for a given target set of parameters q₀, almost any starting set of parameters q₁ will give a homotopy

F(z; t q₁ + (1 − t) q₀) = 0    (7.1.1)

whose solution paths include all the nonsingular solutions of F(z; q₀) = 0 at their endpoints as t goes from 1 to 0 on the real line. If somehow we can arrange to solve F(z; q₁) = 0 for a random, complex set of parameters q₁, we are ready to solve the target system, because the one-real-dimensional open line segment of the homotopy is contained in C^m \ Q* with probability one.
Suppose we have all the nonsingular solutions for only the particular system F(z; q₁) = 0, with N(q₁) = N. Even though q₁ is generic, it could happen that we wish to solve the system for a target q₀ for which the homotopy of Equation 7.1.1 fails. This means there is some relation between q₁ and q₀; for example, they might both be real with a degenerate point on the real line segment joining them. Referring to the proof of Lemma 7.1.2, we have that q₁ is not in the degenerate set, Q*, but it is in the set of points lying on a real straight line from q₀ to a point of Q*.

When q₁ is generic, in the sense that N(q₁) = N, but not random complex independent of q₀, can we still formulate a homotopy to find all nonsingular solutions of F(z; q₀) = 0 with probability one? Yes: the answer is to follow a different continuation path, one that is not the real straight-line segment from q₁ to q₀ and that includes some extra parameter or parameters that can be chosen generically to avoid degeneracies. Here are three, among many, possibilities:

• Pick a third random, complex parameter point p ∈ C^m and follow the broken-line homotopy path from q₁ to p to q₀. Each of the two real-straight-line segments will succeed with probability one, and so the concatenation of the two will succeed also.
• Pick p as in the previous item, and employ a curved-path homotopy such as

F(z; t q₁ + t(1 − t) p + (1 − t) q₀) = 0.    (7.1.2)

By similar reasoning to Lemma 7.1.2, the endpoints at t = 0 of N paths from the nonsingular solutions of F(z; q₁) = 0 will include all the nonsingular solutions of F(z; q₀) = 0 for almost all choices of p ∈ C^m.
• Use the same homotopy as in Equation 7.1.1, but follow a more general path in the complex line defined by t, instead of just following the real segment [0,1]. A convenient way of doing so is to reparameterize the homotopy by τ ∈ [0,1], setting

t = γτ / (1 + (γ − 1)τ)

for generic γ ∈ C. This maneuver is justified in the following lemma.

Lemma 7.1.3 ( " G a m m a Trick") Fix a point q0 £ Cm, a proper algebraic set
A C Cm, and a point gi £ Cm, q\ £ A . For all 7 £ C except for a finite number of
one-real-dimensional rays from the origin, the one-real-dimensional arc

<f>(t):=tqi + {l-t)q0, t= 1 + ^ 1 ) T > T £ (0,1],

is contained in C m \ A. Furthermore, if we let 7 = e%e, the foregoing statement still


holds for all but a finite number points 9 £ [—TT, IT] .

Proof. Since the set

T := { t ∈ C | (t q₁ + (1 − t) q₀) ∈ A }

is algebraic, it must either be all of C or a finite number of points in C. But by assumption, t = 1 is not in T, so T must be finite. The bilinear transform from τ to t maps [0,1] to a circular arc in the Argand plane for t, leaving t = 0 with angle equal to the angle of γ. Hence, any two choices of γ ≠ 0 having different angles give distinct circular arcs that meet only in the two points t = 0 and t = 1. This implies that there is only one such arc through each t ∈ T, and each such arc is produced by values of γ on a one-real-dimensional ray from the origin. For all other values of γ ∈ C, the path φ(τ) for τ ∈ (0,1] is contained in C^m \ A.

The final statement follows because each ray from the origin hits the unit circle, |γ| = 1, in a single point. □
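The reparameterization of Lemma 7.1.3 is one line of code. The sketch below (illustrative names, not the authors') checks that the γ-arc shares its endpoints with the straight-line segment but bends into the complex t-plane in between:

```python
import numpy as np

def gamma_arc(q1, q0, gamma):
    # phi(tau) = t*q1 + (1 - t)*q0 with t = gamma*tau / (1 + (gamma - 1)*tau).
    # For all but finitely many angles of gamma, the arc misses any fixed
    # proper algebraic subset of parameter space (Lemma 7.1.3).
    def phi(tau):
        t = gamma * tau / (1.0 + (gamma - 1.0) * tau)
        return t * q1 + (1.0 - t) * q0
    return phi

gamma = np.exp(1j * 0.37)        # gamma = e^{i*theta}, theta "random"
q0 = np.array([1.0, -2.0])
q1 = np.array([0.5 + 1.0j, 3.0])
phi = gamma_arc(q1, q0, gamma)
# The arc agrees with the straight segment at tau = 0 and tau = 1 ...
assert np.allclose(phi(0.0), q0) and np.allclose(phi(1.0), q1)
# ... but bends away from the real-segment midpoint in between.
assert not np.allclose(phi(0.5), 0.5 * q1 + 0.5 * q0)
```

At τ = 1 we get t = γ/γ = 1, and at τ = 0 we get t = 0, so the endpoints of the homotopy are unchanged.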

There are many alternative ways one could set up paths with the desired gener-
icity, but these simple approaches suffice. We have already seen the usefulness of a
variant of the "gamma trick" in the example of Figure 2.1, and we will return to it
in § 8.3.
Theorem 7.1.1 covers many of the cases that arise in practice, but situations
arise when more refined versions are useful. Some useful variants are:
(1) the variables z live on projective space or on a cross product of projective spaces
instead of on Euclidean space;
(2) we count solutions on a Zariski open subset of the variable space instead of on
the whole space, that is, solutions that satisfy prespecified algebraic conditions
are to be ignored;
(3) the parameters q live on an irreducible algebraic set in Euclidean space or in
projective space or in a cross product of projective spaces.
In the case that the variable space or the parameter space involves a projective factor, the system of equations must be multihomogeneous in a way that is compatible with those spaces. Recall from § 3.6 the definition of a multiprojective space as a product of projective spaces, for which we have the associated concept of multihomogeneous polynomials.

Theorem 7.1.4 (Generalized Parameter Continuation) Let X be a multiprojective space of dimension n, that is, X = P^{n₁} × ⋯ × P^{n_k} with n₁ + ⋯ + n_k = n. Let U ⊂ X be a Zariski open subset of X. Let Q ⊂ Y be an irreducible multiprojective algebraic set in a multiprojective space Y. Let F(z; q) be a system of n multihomogeneous polynomials compatible with X × Y such that z and q are homogeneous coordinates for X and Y, respectively. Furthermore, let N(q, U, Q) denote the number of nonsingular solutions in U as a function of q ∈ Q:

N(q, U, Q) := #{ z ∈ U | F(z; q) = 0, rank(∂F/∂z(z; q)) = n }.

Then,
(1) N(q, U, Q) is finite and it is the same, say N(U, Q), for almost all q ∈ Q;
(2) For all q ∈ Q, N(q, U, Q) ≤ N(U, Q);
(3) The subset of Q where N(q, U, Q) = N(U, Q) is a Zariski open set; we denote the exceptional set where N(q, U, Q) < N(U, Q) as Q*;
(4) The homotopy F(z; γ(t)) = 0 with γ(t) : [0,1] → Q \ Q* has N(U, Q) continuous, nonsingular solution paths z(t) ∈ U;
(5) As t → 0, the limits of the solution paths of the homotopy F(z; γ(t)) = 0 with γ(t) : [0,1] → Q and γ(t) ∉ Q* for t ∈ (0,1] include all the nonsingular roots in U of F(z; γ(0)) = 0.
Note that computations will be done in z ∈ C^{n₁+1} × ⋯ × C^{n_k+1} but interpreted as points in X. For each projective factor, we typically add an inhomogeneous hyperplane equation to make the scaling factor unique. This is the projective transformation technique described in Chapter 3. The constancy of the number of solutions for the algebraic case still follows from Corollary A.14.2, which allows an even more general situation than we use here. We require Q to be irreducible so that it is path connected, which implies the constancy of the root count; if Q were not irreducible, the root count could be different on different components of Q. Since C^n is a Zariski open subset of P^n, Theorem 7.1.4 clearly includes Theorem 7.1.1, by letting k = 1, U = C^n, and Q = C^m, an irreducible algebraic set.
Notice that in the generalized version of the theorem, we denote the generic number of nonsingular solutions as N(U, Q), because the count may change if we consider a different Zariski open set U for the variables or if we restrict the parameters to a different algebraic set Q. We will consider both of these possibilities in the succeeding sections.
We can generalize the theorem further. It sometimes happens that the parame-
ters appear via analytic expressions instead of polynomial ones. That is, the coeffi-
cients of F(z; q) as a polynomial system in z may be trigonometric or other analytic
functions of q. All the same conclusions follow. This is discussed in § A.14.2, so we
omit further discussion here and simply state the analytic version of the theorem
in the following abbreviated form.

Theorem 7.1.5 (Analytic Parameter Continuation) Consider the same situation as in Theorem 7.1.4 except that Q = C^m and each of the n functions in F(z; q) is a multihomogeneous polynomial in z with coefficients that are holomorphic functions of q ∈ Q. Then, we have the same conclusions as Theorem 7.1.4 for items 1, 2, 4, and 5, with item 3 modified as

(3) The subset of Q where N(q, U, Q) = N(U, Q) is an analytic Zariski open set.

Elsewhere, without the qualifier analytic, we use the term Zariski open set to mean the algebraic case. The inclusion of analytic in item 3 of Theorem 7.1.5 implies a weaker condition than the algebraic case, as is to be expected since the set of holomorphic functions is larger than the set of polynomial functions. The difference is illustrated by the algebraic case f(z; q) = z² − q, which has N(q) = 2 everywhere in C except q = 0, as compared to the analytic case of f(z; q) = z² − sin(q), which has exceptions for q = kπ, k any integer. An algebraic equation can never have an infinite number of isolated roots, but an analytic one can. Even so, an analytic Zariski open set of C^m is path connected, so continuation will succeed.
A final generalization of the theorem is to consider not just nonsingular roots,
but isolated roots of any multiplicity. Theorem A.14.1 and Corollary A.14.2 are
general enough to justify a restatement of Theorem 7.1.4 for isolated roots. Care
must be taken in the restatement of items 2 and 5, as the limit behavior of multiple
roots as a parameter path approaches the exceptional set is more complicated than
for nonsingular roots. The fact is that in this limit only three things can happen:
a solution path can leave U by landing on X \ U (this may include paths going
to infinity); a solution path can land on a higher-dimensional solution component
and thus cease being an isolated point; and two or more solution paths may merge
to form an isolated solution whose multiplicity is the sum of those for the incom-
ing paths. The number of isolated roots of a given multiplicity can increase, but
only at the expense of a corresponding decrease in the number of roots having a
lower multiplicity.

Theorem 7.1.6 (Parameter Continuation of Isolated Roots) Let X, Y, U, Q, and F(z; q) be as in Theorem 7.1.4. Furthermore, let Nᵢ(q, U, Q) denote the number of multiplicity-i isolated solutions in U as a function of q ∈ Q.

(1) Nᵢ(q, U, Q) is finite and it is the same, say Nᵢ(U, Q), for almost all q ∈ Q, and there is some finite number μ such that for all i > μ, Nᵢ(U, Q) = 0;
(2) For all q ∈ Q and any m, Σ_{i=1}^{m} i Nᵢ(q, U, Q) ≤ Σ_{i=1}^{m} i Nᵢ(U, Q);
(3) The subset of Q where Nᵢ(q, U, Q) = Nᵢ(U, Q) for all i ≤ m is a Zariski open set; we denote the exceptional set where any of these equalities fails as Q*_m;
(4) For each i, the homotopy F(z; γ(t)) = 0 with γ(t) : [0,1] → Q \ Q*_μ has Nᵢ(U, Q) continuous, isolated solution paths z(t) ∈ U of multiplicity i;
(5) As t → 0, the limits of the set of multiplicity-i solution paths such that i ≤ m of the homotopy F(z; γ(t)) = 0 with γ(t) : [0,1] → Q and γ(t) ∉ Q*_m for t ∈ (0,1] include all the isolated roots of F(z; γ(0)) = 0 in U of multiplicity less than m′, where m′ is such that Nᵢ(U, Q) = 0 for m < i < m′.

In numerical work, the paths traced by roots of multiplicity greater than one
are hard to track, but in principle, singular path tracking is possible, see § 15.6. If
we track only nonsingular paths, item (5) tells us that we are assured of obtaining

all nonsingular roots of the target system, which is what was claimed in the earlier theorems. To be assured of finding all isolated roots of the target system, we must track all the generically isolated roots, as indicated when m in item (5) is equal to μ in item (1). A special case of particular interest is when all the isolated roots of a generic system in the family are nonsingular, that is, when μ = 1 in item (1) of the theorem. Then, we can easily track all the isolated solution paths, and we are assured that the endpoints of these include all isolated solutions, even those with multiplicity greater than one.
It is important to note that where Theorems 7.1.1, 7.1.4, and 7.1.5 refer to a
polynomial system F(z; q), it is acceptable for F to be given in straight line form
(see Definition 1.2.4).

7.2 Parameter Homotopy in Application

The foregoing describes the essence of the polynomial continuation method. To find nonsingular solutions of the polynomial system p(z) = 0 in a Zariski open set U, we do the following, a restatement in mathematical terms of the steps enumerated in the introduction to Part II.

Ab Initio Procedure:
To find all solutions in a Zariski open set U of p(z) = 0.

(1) Embed p(z) : C^n → C^n as a member of a parameterized family F(z; q) : C^n × Q → C^n of polynomial systems. Denote by q₀ ∈ Q the particular parameter values that correspond to p(z), that is, F(z; q₀) = p(z).
(2) Arrange the embedding such that we have starting parameters q₁ ∈ Q, q₁ ∉ Q*, for which we either have or can compute all N(U, Q) nonsingular solutions to F(z; q₁) = 0. Call these the "start points."
(3) Construct a continuous path γ(t) : C → Q such that γ(1) = q₁, γ(0) = q₀, and γ(t) ∉ Q* for t in the real interval (0,1]. That is, γ(t) for t ∈ [0,1] connects the start parameters to the target parameters without intersecting the exceptional set, except possibly at t = 0.
(4) Follow the N(U, Q) solution paths of F(z; γ(t)) = 0 from t = 1 along the real axis to the vicinity of t = 0. These paths begin at the start points, and we propagate them towards t = 0 using a numerical path-tracking algorithm.
(5) In the neighborhood of t = 0, determine which paths are converging to nonsingular solutions. Refine these to numerically approximate the solutions to the desired accuracy.
(6) Keep only those roots which are in U, that is, eliminate those that lie on the algebraic set C^n \ U.
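Steps 4 and 5 can be realized with a bare-bones path tracker. The sketch below uses the simplest possible scheme, a zero-order predictor (step t, then re-converge with Newton's method) with a fixed number of steps; production codes use higher-order predictors, adaptive step control, and endgames near t = 0. The one-parameter family z² + q₁z + q₂ and all function names are our own illustration, not from the text:

```python
import numpy as np

def newton_correct(F, J, z, t, tol=1e-12, maxit=10):
    # Newton's method: drive F(z, t) back to zero after a step in t.
    for _ in range(maxit):
        dz = np.linalg.solve(J(z, t), -F(z, t))
        z = z + dz
        if np.linalg.norm(dz) < tol:
            break
    return z

def track(F, J, start_points, nsteps=200):
    # Follow each path as t moves from 1 (start system) to 0 (target).
    paths = [np.asarray(z, dtype=complex) for z in start_points]
    for t in np.linspace(1.0, 0.0, nsteps + 1)[1:]:
        paths = [newton_correct(F, J, z, t) for z in paths]
    return paths

# Family F(z; q) = z^2 + q1*z + q2, with parameters moved along a line
# from a random complex start (where the roots are known by the quadratic
# formula) to the real target z^2 + 3z + 2, whose roots are -1 and -2.
rng = np.random.default_rng(1)
q_start = rng.standard_normal(2) + 1j * rng.standard_normal(2)
q_target = np.array([3.0, 2.0])
phi = lambda t: t * q_start + (1.0 - t) * q_target

F = lambda z, t: np.array([z[0] ** 2 + phi(t)[0] * z[0] + phi(t)[1]])
J = lambda z, t: np.array([[2.0 * z[0] + phi(t)[0]]])

d = np.sqrt(q_start[0] ** 2 - 4.0 * q_start[1])   # complex discriminant
starts = [[(-q_start[0] + d) / 2.0], [(-q_start[0] - d) / 2.0]]
ends = sorted((p[0] for p in track(F, J, starts)), key=lambda w: w.real)
```

With a generic complex start, the two endpoints in `ends` approximate the target roots −2 and −1.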

Suppose that p(z) is not just a single system of interest, but rather it is a member of a family of systems G(z; q) : X × Q′ → C^n of the sort we have been discussing: p(z) = G(z; q) for some q ∈ Q′. For the sake of item 2 above, we may have had to cast p(z) in a larger family of systems than G. That is, G(z; q) is F(z; q) restricted to Q′ ⊂ Q. This is often necessary when we have no generic member of G for which we have (or can easily generate) all nonsingular solutions. The larger family F is chosen in a way that provides such a start system. However, once we have solved an initial generic member of G, we can then solve any other member of G by parameter continuation along paths in Q′. This can be advantageous because the generic root count on G can be smaller (perhaps much smaller) than the generic root count for F. To capture this advantage, one may apply a two-phase procedure as follows.

Two-Phase Procedure:
To find all solutions of G(z; q) = 0 in Zariski open set U for several parameter points, say q₁, …, q_k ∈ Q′.

(1) Phase 1: solve G(z; q₀) = 0

(a) Choose q₀ random, complex in Q′.
(b) Solve G(z; q₀) = 0 using an ab initio technique as above.
(c) Let Z be the set of nonsingular solutions in U so obtained.

(2) Phase 2: for each qᵢ, i = 1, …, k, solve G(z; qᵢ) = 0 by a continuation on a straight-line homotopy.

(a) Form the homotopy G(z; t q₀ + (1 − t) qᵢ) = 0.
(b) Track each root in Z from t = 1 to near t = 0.
(c) In the neighborhood of t = 0, determine which paths are converging to nonsingular solutions and compute their endpoints to the desired accuracy.
(d) Keep only those roots which are in U, that is, eliminate those that lie on the algebraic set C^n \ U.
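A sketch of this bookkeeping, using the artificial family G(z; q) = z³ + q₁z + q₂. Here Phase 1 is stubbed out with numpy's `roots` (standing in for the ab initio homotopies of Chapter 8), and Phase 2 reuses the one solved generic system for every target; the family and all names are our own illustration:

```python
import numpy as np

def G(z, q):
    return z ** 3 + q[0] * z + q[1]

def dG(z, q):
    return 3.0 * z ** 2 + q[0]

def solve_target(q_start, roots_start, q_target, nsteps=400):
    # Phase 2: straight-line homotopy q(t) = t*q_start + (1-t)*q_target,
    # zero-order prediction in t plus a few Newton corrections per step.
    roots = list(roots_start)
    for t in np.linspace(1.0, 0.0, nsteps + 1)[1:]:
        q = t * q_start + (1.0 - t) * q_target
        for i, z in enumerate(roots):
            for _ in range(8):
                z = z - G(z, q) / dG(z, q)
            roots[i] = z
    return roots

# Phase 1: solve one generic (random complex) member of the family once.
rng = np.random.default_rng(2)
q0 = rng.standard_normal(2) + 1j * rng.standard_normal(2)
Z = np.roots([1.0, 0.0, q0[0], q0[1]])   # stand-in for an ab initio solve

# Phase 2: reuse Z for as many targets as we like.
targets = [np.array([-1.0, 0.0]),        # z^3 - z:      roots 0, 1, -1
           np.array([-7.0, 6.0])]        # z^3 - 7z + 6: roots 1, 2, -3
answers = [sorted(solve_target(q0, Z, qt), key=lambda w: w.real)
           for qt in targets]
```

The cost of Phase 1 is paid once; each new target reuses the same three start points.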

In the remainder of this chapter, we will concentrate on Phase 2 of this procedure; that is, we will assume that we have the solution set for some initial generic system. Phase 1, the ab initio procedure, is the subject of Chapter 8.

7.3 An Illustrative Example: Triangles

Before embarking on a more complete examination of parameter continuation, let's look at a simple example where the parameterization and the start system are rather easily obtained. One would not use continuation to solve this problem, but it may help illustrate what goes on in more challenging problems.
To make a concrete example, we consider the classic problem of solving for the angles of a triangle given the lengths of its sides, a, b, c. Let θ be the angle opposite side c. We will write a system of polynomials in two variables c_θ = cos θ and s_θ = sin θ. As shown in Figure 7.1, we have the three vertices of the triangle as (0,0), (b, 0) and (a c_θ, a s_θ), and the system to solve is

f₁ := c_θ² + s_θ² − 1 = 0,    (7.3.3)
f₂ := (a c_θ − b)² + (a s_θ)² − c² = 0.    (7.3.4)

The first of these is the basic trigonometric identity for sine and cosine, and the second says that point (a c_θ, a s_θ) is distance c from point (b, 0). Our parameters are the physical parameters q = (a, b, c), and the variables are z = (c_θ, s_θ). The coefficients in f₁ are constants and, when expanded out, the coefficients in f₂ are quadratic polynomials in (a, b, c).
Fig. 7.1 Triangle with side lengths a, b, c.

The system is easily solved without using continuation by forming f₂ − a² f₁ to get

a² + b² − 2ab c_θ − c² = 0,    s_θ = ±√(1 − c_θ²).    (7.3.5)

The first of these is the familiar Law of Cosines for planar triangles. For almost all (a, b, c), there is a unique value of c_θ, the exceptions being if a = 0 or b = 0. In these cases, the angle θ is not well defined, because one of the sides of the triangle is nonexistent. Away from these sets, the second of Equations 7.3.5 gives two distinct values of s_θ unless c_θ = ±1, in which case there is a double root. Substituting this in the law of cosines equation, one sees that there will be double roots for (a, b, c) on any of the four planes a ± b ± c = 0. These are the boundaries of the triangle inequality conditions. For real (a, b, c) that violate the triangle inequality, one has |c_θ| > 1, and s_θ is a pair of complex conjugate roots.
Now, let's pretend that we do not know the solution via Equation 7.3.5 and that we seek a solution by parameter continuation using (a, b, c) ∈ C³ as our parameter space. The first hurdle is to obtain a start system. For a more complicated system, we would normally pick (a, b, c)₁ at random and rely on one of the special homotopies discussed in Chapter 8, such as the total degree homotopy, to solve it. We will discuss this type of maneuver more below. However, for this simple system, we can pick out a known solution easily: let (a₁, b₁, c₁) = (5, 4, 3), a Pythagorean triple. Then, we have two solution points (c_θ, s_θ) = (4/5, ±3/5). Note that f₂ in Equation 7.3.4 is homogeneous in the parameters; in particular, all the coefficients are homogeneous quadratics in (a, b, c). This means that the solution does not change under scaling, and so for (a, b, c) = (5α, 4α, 3α) we have the same solution points (c_θ, s_θ) = (4/5, ±3/5) for any nonzero, complex α.

One may wonder if there are any other solutions. The total degree of the system is four, and its one-homogenization has two roots at infinity of the form [z₀, c_θ, s_θ] = [0, 1, ±i], so there are only two finite roots. Here, the one-homogenization is obtained via the substitutions c_θ → c_θ/z₀, s_θ → s_θ/z₀, followed by clearing denominators.
Next, we need a homotopy path from our starting system (a₁, b₁, c₁) = (5α, 4α, 3α) to the target (a, b, c)₀. The straight-line path

γ(t) = t(5α, 4α, 3α) + (1 − t)(a₀, b₀, c₀)    (7.3.6)

will suffice for almost all targets. It is not difficult to check that when α is complex and the target is real, the values of t for which the path intersects the singularity conditions are complex, unless the target itself is singular. So we will not encounter any singularities for t on the real interval (0,1]. For a fixed complex-valued α, there will exist complex targets for which the homotopy path hits a singularity, but if we choose α at random, independent of the target, then there is a zero probability of this failure.
It may be instructive¹ to consider what would happen if we were to choose a homotopy path in the reals, say α = 1. The homotopy is still fine for any real target that is inside the triangle inequalities, since these bound a convex region of the real parameter space. However, a line segment connecting a real target outside the triangle inequality region to a real start system inside must cross the singularity. These real targets form a set of measure zero in C³, so considering all targets in C³, the homotopy is still valid with probability one. But in practice, we usually want to solve systems for real-valued parameters. This illustrates that it is important to use some sort of complex randomizing factor in the homotopy so that real systems are solved with probability one.
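The whole recipe of this section fits in a few lines of code. The sketch below (variable and function names are ours) tracks the two start points (4/5, ±3/5) from the scaled Pythagorean start α·(5, 4, 3), with α random complex, to the real target (a, b, c) = (2, 3, 4), and should recover the Law-of-Cosines answer c_θ = (a² + b² − c²)/(2ab) = −1/4:

```python
import numpy as np

def F(z, p):
    c, s = z
    a, b, L = p                       # L plays the role of side length c
    return np.array([c * c + s * s - 1.0,
                     (a * c - b) ** 2 + (a * s) ** 2 - L ** 2])

def J(z, p):
    c, s = z
    a, b, _ = p
    return np.array([[2.0 * c, 2.0 * s],
                     [2.0 * a * (a * c - b), 2.0 * a * a * s]])

def track(z, p_start, p_target, nsteps=300):
    # Zero-order predictor plus Newton corrector along the straight line
    # gamma(t) = t*p_start + (1 - t)*p_target, t from 1 down to 0.
    z = np.asarray(z, dtype=complex)
    for t in np.linspace(1.0, 0.0, nsteps + 1)[1:]:
        p = t * p_start + (1.0 - t) * p_target
        for _ in range(8):
            z = z + np.linalg.solve(J(z, p), -F(z, p))
    return z

rng = np.random.default_rng(3)
alpha = rng.standard_normal() + 1j * rng.standard_normal()  # random complex
p1 = alpha * np.array([5.0, 4.0, 3.0])    # start: scaled Pythagorean triple
p0 = np.array([2.0, 3.0, 4.0])            # target triangle a, b, c
ends = [track([0.8, sgn * 0.6], p1, p0) for sgn in (+1.0, -1.0)]
```

Both endpoints should have c_θ ≈ −0.25 and s_θ ≈ ±√15/4 ≈ ±0.9682, matching Equation 7.3.5; with α real instead, the same code can be used to watch the path fail for targets outside the triangle inequalities.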

7.4 Nested Parameter Homotopies

In practice, it is quite common for a parameterized family of problems to have special cases that are themselves of significant interest. In fact, we often see an elaborate network of special cases, each one inheriting the special structure of the solution sets of the more general cases of which it is a member. The forward kinematics problem for Stewart-Gough robot manipulators discussed below (§ 7.7) illustrates this.
Let us be a bit more precise about this situation.
Corollary 7.4.1 For a family of polynomial systems F(z; q) : C^n × Q₀ → C^n, a chain of parameter spaces

Q₀ ⊃ Q₁ ⊃ Q₂ ⊃ ⋯,

each of which is an irreducible quasiprojective algebraic set, and a Zariski open set U ⊂ C^n, the generic nonsingular root counts N(U, Qᵢ) obey the inequalities

N(U, Q₀) ≥ N(U, Q₁) ≥ N(U, Q₂) ≥ ⋯.

Proof. This is just the repeated application of item 2 of Theorem 7.1.4. □

¹Exercise 7.1 at the end of the chapter is a good way to get a feel for the numerical behavior of this simple homotopy.
We know that we can use parameter homotopy within any one of these spaces to compute the nonsingular roots of the associated polynomial systems, assuming we have all nonsingular solutions at an initial generic point in the family. Suppose we wish to use parameter continuation within the space Q₁, but we do not yet have a solution for any point in that space. Suppose that instead we have all nonsingular solutions for the system F(z; q₀) = 0, for a generic point q₀ ∈ Q₀. Let q₁ ∈ Q₁ be a generic point of Q₁. But Q₁ ⊂ Q₀ implies q₁ ∈ Q₀, so we may find all nonsingular solutions to F(z; q₁) = 0 by parameter continuation in Q₀, starting at q₀. If Q₁ ⊂ Q₀*, the exceptional set in Q₀, then there are fewer solutions at q₁ than at q₀. Now, we may proceed to solve the system for any other parameters in Q₁ using this smaller number of paths, by continuation inside Q₁ starting at q₁. Obviously, the same approach can be applied to solve a start system in any Qᵢ once we have a solution for a system in one of its ancestors, Qⱼ, j < i.
Unlike the simple triangle problem discussed above, when solving problems in engineering or science, we rarely have all the solutions for any generic point in the natural parameter space of the problem. So how do we get started? A very useful trick is to solve the first naturally-parameterized problem by embedding the whole family within a larger, artificially-parameterized family of problems, within which we do have a solved general case. This is the Ab Initio procedure of § 7.2. Suppose, for example, that an engineering problem is a system of two quadratics in two variables. There are a total of 12 coefficients in two bivariate quadratics, but for our problem these may depend on just a few physical parameters. We may solve the initial problem given by generic physical parameters using a homotopy in Q₀ = C¹², the parameter space of all coefficients of two bivariate quadratics. Then, Q₁ ⊂ Q₀ consists of the sets of coefficients that are generated by ranging over the physical parameters.

7.5 Side Conditions

In the statement of coefficient-parameter homotopy above, the generic number of nonsingular roots, N(U, Q), is counted on a Zariski open subset U in complex space. The result is stated in that way to justify the application of "side conditions" for eliminating uninteresting solution paths from a parameter homotopy.

Suppose the zeros of a system of analytic functions s(z) : C^n → C^k are not of interest as solutions of F(z; q) = 0. We call s(z) = 0 "side conditions," and U = C^n \ s⁻¹(0). Typically, the side conditions identify degenerate solution sets that are known by other means, but they may also be certain pro forma conditions that have been noticed to arise often. A common choice of the latter type, especially when using monomial product homotopies, is the side condition s(z) = ∏ᵢ₌₁ⁿ zᵢ = 0, which simply means that we are not interested in solutions that have any coordinate equal to zero. This is equivalent to saying that we are working on the open set U = (C*)^n, where C* = C \ {0}. We will see below the use of side conditions specific to a particular application, such as two variables being equal: s(z) = z₁ − z₂ = 0. In essence, even when we work on U = C^n, we are invoking a side condition on P^n: we are ignoring solutions at infinity.

Side conditions work hand-in-hand with nested parameter homotopies. Whenever we solve the first generic example in a parameter space, we check the solutions against the side conditions. Then, when solving other problems in the same parameter space using the first example as the start system, we drop the solutions that satisfy the side conditions from the list of start points for the continuation. In some cases, the degenerate solutions specified by the side conditions vastly outnumber the interesting ones, and the number of paths in the parameter continuation is dramatically reduced.
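The filtering step is a one-liner in practice: a computed root is kept only if every side condition fails to hold (up to a numerical tolerance). A minimal sketch, with illustrative names and the two example conditions from the text:

```python
import numpy as np

def passes_side_conditions(z, conds, tol=1e-8):
    # Keep z only if every side condition s(z) = 0 FAILS to hold,
    # i.e. |s(z)| > tol for each s; roots with |s(z)| <= tol are the
    # degenerate solutions we agreed to ignore.
    return all(abs(s(z)) > tol for s in conds)

conds = [lambda z: np.prod(z),        # some coordinate equal to zero
         lambda z: z[0] - z[1]]       # the two variables equal
roots = [np.array([1.0, 2.0]),
         np.array([0.0, 3.0]),        # violates the first condition
         np.array([1.5, 1.5])]        # violates the second condition
kept = [z for z in roots if passes_side_conditions(z, conds)]
assert len(kept) == 1 and np.allclose(kept[0], [1.0, 2.0])
```

In a two-phase run, `kept` would become the start-point list for all subsequent parameter homotopies in the same space.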

7.6 Homotopies that Respect Symmetry Groups

Some systems respect symmetry groups, and we can reduce the number of paths to follow accordingly. Suppose we have a mapping S : C^n → C^n such that for any q ∈ Q, if F(z; q) = 0, then F(S(z); q) = 0. Furthermore, suppose that if z is a nonsingular solution, then so is S(z). Often, F(S(z); q) is either exactly F(z; q) or a rearrangement of the polynomials of F(z; q). For example, under the mapping S : (x, y) ↦ (y, x), the polynomial system {xy − q₁, x² + y² + q₂} is invariant, whereas the polynomials in the system {xy³ − a, x³y − a} interchange. In such cases, it is clear that nonsingular roots map to nonsingular roots.

Using the notation S²(z) = S(S(z)), S³(z) = S(S(S(z))), etc., suppose k is the smallest integer such that z = S^k(z). We say that F respects S as a symmetry group of order k. The symmetry implies that for the homotopy F(z; q(t)) = 0, a solution path z₀(t) is matched by the paths zᵢ(t) = Sⁱ(z₀(t)), i = 1, …, k − 1. So we only need to compute one of the k paths: we use S to compute the endpoint of the matching paths without knowing their intermediate points. It can happen that for the same symmetry mapping, roots appear in symmetry groups of different orders. For example, for the system {xy³ − 1, x³y − 1} = 0 and the mapping (x, y) ↦ (y, x), the root (x, y) = (1, 1) maps to itself, while the root (√2(1 + i)/2, −√2(1 + i)/2) is in a group of order two. This must be taken into account when using symmetry to reduce the number of solution paths.
When we solve the first generic example in a parameter space, we usually must resort to an ab initio procedure (§ 7.2), embedding the target system into a larger family of systems. Since the members of this larger family generally do not respect the symmetry, we must follow all the paths in the first run. The symmetry can still be useful as a check on the computation: do all roots appear in the requisite symmetries? If so, we have some assurance that the numerical process was carried out successfully. Then, in subsequent runs using Phase 2 of the two-phase parameter homotopy procedure, the symmetry is used to reduce the number of paths in the computation.

7.7 Case Study: Stewart-Gough Platforms

For the first significant example of this book, we examine an important family of
problems from mechanical engineering: the forward kinematics of Stewart-Gough
platform robots. As we will see shortly, there are a number of different options
for the design of such robots, and these can be organized into nested families of
robot types. These parameterized families are ideal for illustrating the concept of
parameter continuation.
A Stewart-Gough platform, shown schematically in Figure 7.2, is a type of
parallel-link robot, having a stationary base platform upon which a moving platform
is supported by six "legs." Each of these legs has a spherical (ball-and-socket) joint
at each end,2 with a prismatic joint (linearly telescoping) in between. The prismatic
joint is actuated, usually by a ball screw and electric motor, so that the distance
between the center of its adjacent universal and spherical joints can be controlled by
computer. That is, leg i, i = 1, ..., 6, connects point A_i of the stationary platform
to point B_i of the moving platform, and we control the lengths L_i = |B_i − A_i|.
By proper coordination of the six leg lengths, the moving plate can be placed in
any position and orientation within a working volume (actually a six-dimensional
workspace, a subset of R^3 × SO(3)), whose boundaries are determined by the limits
of travel of the prismatic joints. Collisions between the legs can also limit the range
of motion.
These robots are best known as the mechanism beneath motion platforms for
aircraft flight simulators, but they are applicable to tasks as varied as aiming tele-
scopes or welding automotive bodies. The kinematics of these robots has been the
subject of extensive academic research, which we cannot begin to address here. We
refer the interested reader to (Merlet, 2000; Tsai, 1999) as a starting point.
Although many interesting algebraic problems arise in the study of these mech-
anisms, for the moment, we will consider only the so-called "forward kinematics"
problem, which is as follows:

Given: the geometry of the stationary and moving platforms and the six leg
lengths,
2
One ball joint on each leg can be replaced by a universal joint to eliminate rotation of the leg
around its axis, but this does not alter the motion of the moving platform, our present object of
study.
Coefficient-Parameter Homotopy 105

Find: the position and orientation of the moving platform with respect to the
stationary one.

As usual, in what follows, we embed the real problem into complex space, so even
though only real values of the leg lengths are physically meaningful, we consider
complex Li e C. Similarly, we treat the robot workspace as C3 x 50(3, C), where
SO(3,C) = {AG C 3x3 |,4 T yl = I,detA = 1}.

Fig. 7.2 General Stewart-Gough platform robot.

To write a system of polynomial equations, we need to precisely define the
problem data. Choose reference frames in the stationary and moving platforms.
Let the position of point A_i be given by vector a_i ∈ C^3 in the stationary frame, and
let B_i be given by vector b_i ∈ C^3 in the moving frame. Rather than use a direct
coordinatization of C^3 × SO(3, C), it is more convenient for the problem at hand
to use Study coordinates, also known as "soma coordinates" (Bottema & Roth,
1979, pp. 150–152). These consist of all points [e, g] = [e_0, e_1, e_2, e_3, g_0, g_1, g_2, g_3] ∈ P^7 that
lie on the Study quadric

f_0(e, g) = e_0 g_0 + e_1 g_1 + e_2 g_2 + e_3 g_3 = 0.     (7.7.7)

This is an isomorphism of C^3 × SO(3, C), wherein the elements e are a quaternion
that represents the orientation of the moving platform with respect to the stationary
one and g is a quaternion that encodes translation as p = ge'/(ee'). Accordingly,
the position of point B_i in the reference frame of the stationary platform is written

(ge' + eb_i e')/(ee'),

where multiplication follows the rules for quaternions and g' = (g_0, −g_1, −g_2, −g_3)
and e' = (e_0, −e_1, −e_2, −e_3) are quaternion conjugates of g and e. Clearly, we must
exclude the points that satisfy

s(e, g) = ee' = 0.     (7.7.8)

The Study quadric is exactly the condition that the translation ge' be a pure
vector, and since b_i is a pure vector, so is eb_i e'. These facts, and the fact that the
squared length of a pure vector v, considered as a quaternion, is just vv', allow us
to write the basic kinematic equations for the Stewart-Gough platform as

L_i^2 = ((ge' + eb_i e')/(ee') − a_i) ((ge' + eb_i e')/(ee') − a_i)',   i = 1, ..., 6.     (7.7.9)

Note that this system of equations immediately solves the "inverse" kinematic problem:
given the position and orientation of the moving platform as [e, g], we can
calculate the leg lengths L_i. We are looking to solve the opposite problem: given
L_i, find [e, g].
To proceed, we expand Equation 7.7.9 and multiply through by ee' to get, for
i = 1, ..., 6,

f_i(e, g) = gg' + (b_i b_i' + a_i a_i' − L_i^2) ee' + (g b_i' e' + e b_i g')
            − (g e' a_i' + a_i e g') − (e b_i e' a_i' + a_i e b_i' e') = 0.     (7.7.10)

In summary, Equations (7.7.7, 7.7.10) form the forward kinematic problem for
Stewart-Gough platforms as

F(e, g) = {f_0, f_1, ..., f_6} = 0,     (7.7.11)

subject to the side condition s(e, g) ≠ 0 from Equation 7.7.8. System F(e, g) = 0 is
a set of seven homogeneous quadratic equations in [e, g] ∈ P^7.
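Since Equation 7.7.10 is just Equation 7.7.9 expanded and cleared of its denominator, the two can be checked against each other numerically. The following Python sketch (our own throwaway code, not the book's; quaternions are represented as plain 4-tuples) generates a random pose on the Study quadric, computes L_i^2 from Equation 7.7.9, and verifies that Equation 7.7.10 then vanishes:

```python
import random

# Quaternions as 4-tuples (w, x, y, z); points of 3-space embed as pure
# quaternions (0, x, y, z).

def qmul(p, q):
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def qadd(*qs):
    return tuple(map(sum, zip(*qs)))

def qscale(s, q):
    return tuple(s * c for c in q)

random.seed(1)
e = tuple(random.uniform(-1, 1) for _ in range(4))
g = tuple(random.uniform(-1, 1) for _ in range(4))
# Project g onto the Study quadric e0*g0 + e1*g1 + e2*g2 + e3*g3 = 0.
dot = sum(ei * gi for ei, gi in zip(e, g))
ee = sum(ei * ei for ei in e)
g = qadd(g, qscale(-dot / ee, e))

a = (0.0,) + tuple(random.uniform(-1, 1) for _ in range(3))  # pure quaternion a_i
b = (0.0,) + tuple(random.uniform(-1, 1) for _ in range(3))  # pure quaternion b_i

# L_i^2 from Equation 7.7.9: the leg vector is (ge' + e b e')/(ee') - a.
P = qscale(1.0 / ee, qadd(qmul(g, qconj(e)), qmul(qmul(e, b), qconj(e))))
v = qadd(P, qscale(-1.0, a))
L2 = qmul(v, qconj(v))[0]        # vv' is the real squared length

# Equation 7.7.10, term by term.
bb = qmul(b, qconj(b))[0]
aa = qmul(a, qconj(a))[0]
f = qadd(qmul(g, qconj(g)),
         qscale(bb + aa - L2, qmul(e, qconj(e))),
         qmul(qmul(g, qconj(b)), qconj(e)),
         qmul(qmul(e, b), qconj(g)),
         qscale(-1.0, qmul(qmul(g, qconj(e)), qconj(a))),
         qscale(-1.0, qmul(qmul(a, e), qconj(g))),
         qscale(-1.0, qmul(qmul(qmul(e, b), qconj(e)), qconj(a))),
         qscale(-1.0, qmul(qmul(qmul(a, e), qconj(b)), qconj(e))))

print(max(abs(c) for c in f))  # roundoff level (about 1e-15)
```

The residual is at roundoff level, confirming that the expansion is consistent with the unexpanded leg-length equation.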

7.7.1 General Case


The complete family of Stewart-Gough forward kinematic problems is parameterized
by the joint center points and the leg lengths, {(a_i, b_i, L_i), i = 1, ..., 6} ∈ (C^3 ×
C^3 × C)^6, a 42-dimensional space. Hence, in the preceding section, we should
have written the equations as F((e, g); p) = 0, where p ∈ C^42. It is of historical
interest to note that the number of solutions to the forward kinematics of general
Stewart-Gough platforms was found to be 40 by several different researchers at
about the same time³ using entirely different approaches: continuation (Raghavan,
³ Historical note: preprints of (Ronga & Vust, 1995) circulated widely in 1992 and were
referenced in (Lazard, 1993; Mourrain, 1993). The conference paper (Raghavan, 1991) was the first
report of the count of 40, and this numerical result may have helped motivate the proofs.
1993), vector bundles and Chern classes from algebraic geometry (Ronga & Vust,
1995), computer algebra using Gröbner bases (Lazard, 1993), and computation of
a resultant using computer algebra (Mourrain, 1993). See also (Mourrain, 1996).
The formulation of the problem we use here follows (Wampler, 1996a), wherein a
simple proof of 40 roots is given. The same formulation was derived independently
by Husty (Husty, 1996), who gave a procedure that uses computer algebra to derive
a degree-40 equation in one variable. This is but a small indication of the level of
interest this problem has attracted.
If we could solve the forward kinematics problem for just one general member
of C^42, we could solve any other member by parameter continuation. The question
of how to get that first solution set is addressed in the next chapter. For the moment,
let us just say that the trick is to cast the Stewart-Gough forward kinematics
problems as members of a much larger family, the family of all systems of seven
quadrics on [e, g] ∈ P^7. General members of this family have 2^7 = 128 isolated
solution points, so we can find all isolated solutions for an initial Stewart-Gough
problem by tracking 128 solution paths for a homotopy defined in this larger space.
Doing so reveals that a generic Stewart-Gough platform, p_0 ∈ C^42 (chosen using a
random number generator), has 40 nonsingular solutions and 88 singular ones. The
singular solutions are on the degenerate set of Equation 7.7.8, so we can safely ignore
them as they are not of physical significance. In short, we have N(P^7, C^42) = 40
and only these roots are of interest.
Having the 40 isolated solutions x_0 ∈ F_{p_0}^{−1}(0) of a generic Stewart-Gough platform,
p_0 ∈ C^42, we are ready to apply parameter continuation within the family.
By Lemma 7.1.2, a straight line path from p_0 to almost any other p_1 ∈ C^42 stays
generic, and so by Theorem 7.1.4, the 40 solution paths starting at x_0 for t = 1 of
the homotopy

H_SG((e, g), t) := F((e, g); t p_0 + (1 − t) p_1) = 0     (7.7.12)

will lead to a set of endpoints that contains all isolated solutions of F((e, g); p_1) = 0.
(We invoke the generalized Theorem 7.1.4 instead of the basic version, Theorem 7.1.1,
because we are working on projective space P^7.) There exist points
p* for which the line segment between p_0 and p*, parameterized by t ∈ [1, 0) in the
homotopy above, strikes a singular point. Such points are a set of measure zero in
C^42, but they do exist. If one happens to encounter such a problem, where some
homotopy paths founder before t approaches zero, all that is necessary is to first
continue from p_0 to another random point in C^42 before proceeding to the final
target. Or, to accomplish the same thing, we may choose a random γ ∈ C and follow
the homotopy H_SG((e, g), t(s)) = 0 along a nonlinear path t(s) = s + γs(1 − s) on
the real segment s ∈ [1, 0]. In practice, unless one is solving a large number of such
problems, the exceptions to the linear homotopy path will almost certainly not be
encountered, so Equation 7.7.12 is sufficient.
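The effect of the nonlinear path t(s) = s + γs(1 − s) is easy to see in a toy setting. The following Python sketch (plain Python with a crude tracker of our own, not HOMLAB; the step counts are arbitrary) uses the one-parameter family f(z; q) = z^2 − q, where the straight segment from q_0 = 1 to q_1 = −1 passes through the singular parameter q = 0 at t = 1/2, exactly where the two roots would collide; a complex γ detours the parameter path around that point:

```python
# Toy family f(z; q) = z^2 - q, tracked along the nonlinear parameter path
# t(s) = s + gamma*s*(1 - s).  With gamma = 0 the straight segment from
# q0 = 1 to q1 = -1 hits the singular parameter q = 0 at t = 1/2.

def track(z, q_of_s, steps=2000):
    """Follow one root of z**2 - q(s) = 0 as s goes from 1 down to 0."""
    for k in range(1, steps + 1):
        q = q_of_s(1 - k / steps)
        for _ in range(5):                 # Newton corrector at the new q
            z = z - (z * z - q) / (2 * z)
    return z

q0, q1 = 1.0, -1.0
gamma = complex(0.3, 0.8)                  # plays the role of a random gamma
t_of_s = lambda s: s + gamma * s * (1 - s)
q_of_s = lambda s: t_of_s(s) * q0 + (1 - t_of_s(s)) * q1

ends = sorted([track(1.0, q_of_s), track(-1.0, q_of_s)], key=lambda z: z.imag)
print(ends)  # the two square roots of -1, approximately -1j and +1j
```

With γ = 0 the two paths would collide at s = 1/2 and the tracker would fail; with the complex detour, each start root is carried smoothly to a distinct square root of −1.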

7.7.2 Platforms with Coincident Joints


Various special families of Stewart-Gough platform robots may be defined by re-
quiring some joint centers to coincide. For example, suppose legs 1 and 2 both
connect to the same point on the moving platform; in other words, points B_1 and
B_2 coincide. This is an example of a so-called 6-5 platform, where 6 and 5 are the
numbers of distinct joint centers on the stationary and moving platforms, respec-
tively. Such special platform robots can have advantageous kinematic properties, so
they are of practical interest. In fact, the limiting case of a 3-3 platform, discussed
below, is one of the most popular designs in practice. A 6-6 platform is the most
general type, treated in the preceding paragraphs.
The number of joint centers on a platform can take on any value from 3 to 6.
(If there were only 2 joint centers, rotation around the line through them could not
be resisted by the mechanism, making it useless.) Moreover, these two integers
are not enough in general to fully specify the mechanism type, since it matters,
for example, if one of the legs connects two double joints. We can schematically
represent the topological type of a platform with coincident joints by two rows
of dots representing joint centers, with lines between them representing connecting
legs. There are always six legs, but the number of dots is reduced by the presence of
coincident joints. We will assume that the top row of dots represents the joint centers
on the moving platform and the bottom row represents those of the stationary
platform. The connection patterns

[connection-pattern diagrams for types 4-4a and 4-4b]
are both 4-4 patterns, but they are topologically distinct. We will only address
a few of the possibilities in the next few paragraphs. A more complete catalog
of coincident-joint geometries and their root counts can be found in (Faugère &
Lazard, 1995).
Consider first the 4-4 connection pattern illustrated on the left above, which we
label 4-4a. It is given as a quasiprojective algebraic subset of C^42 by the equations
{a_1 = a_2, a_5 = a_6, b_2 = b_3, b_4 = b_5}. We may solve such an example by making it
the target system of either a total degree homotopy or the general Stewart-Gough
homotopy H_SG, because it is a member of both. Usually, it is more efficient to use
the 40-path option than the 128 paths of the total degree homotopy. But either
way, one finds only 16 solutions, with the rest of the paths having endpoints on the
degenerate condition, Equation 7.7.8. With 16 solutions for a generic example in
family 4-4a in hand, we can solve any other problem in that subfamily using H_SG
and only 16 paths.
This is just the tip of the iceberg in terms of the possible subfamilies of the
Stewart-Gough platform. Figure 7.3 shows a family tree of six sub-families, with
arrows indicating inclusions (lower families in the figure are sub-families of higher
ones). At the top, "quad7" is the family of all systems of 7 quadrics, which contains
all of the Stewart-Gough platform systems. Table 7.1 lists these same families:
each is given a name, such as 4-4a, and the pattern of coincident joints is indicated
graphically. Ignore for the moment the families whose names end in "P;" these are
discussed in the next subsection. The number of nonsingular roots is indicated as
N. This will be the number of homotopy paths for a parameter homotopy starting
from a generic point in the family and ending at any other point in the family,
including any point in a family that is a subset of that family. For each family,
the dots in the table in its row indicate which families it belongs to. For example,
the first column is the family of all systems of 7 quadrics, which contains all of the
other families, so there is a dot in every cell of the first column. We can solve any
Stewart-Gough platform by a 128-path homotopy through the parameter space of
7 quadrics or by a 40-path homotopy through the space of general 6-6 platforms.
Of course, if the target system is a member of some other subfamily, it is more
efficient to work within that family after a first generic member of the family has
been solved by continuation in a family above it. This is why, for example, we need
the seven-quadric system to get the process started.

Fig. 7.3 Stewart-Gough coincident joint family tree

Table 7.1 is not an exhaustive list of special Stewart-Gough sub-families. Among
the coincident-joint families, any type K-L with 3 ≤ K, L ≤ 6 is possible, including
cases where 3 joints are coincident. Four coincident joints will be degenerate—either
no solutions or a positive-dimensional solution set—so these can be ignored. Further
exploration of the coincident-joint families is an exercise at the end of this chapter.
Besides these families, there exist special cases where no joints are coincident, but
rather, there is some other geometric relationship, such as joints in a straight line.

Table 7.1 Stewart-Gough Sub-Families

    Pattern    Name     N
    N/A        quad7    128
    [diagram]  6-6      40
    [diagram]  6-6P     20 + 20
    [diagram]  6-4      32
    [diagram]  6-4P     16 + 16
    [diagram]  4-4a     16
    [diagram]  4-4aP    8 + 8
    [diagram]  4-4b     24
    [diagram]  4-4bP    12 + 12
    [diagram]  3-3      8 + 8
We will have reason to study such a case later, in Part III.

7.7.3 Planar Platforms


Every family in Table 7.1 has a planar version, indicated by the suffix "P" in its
name. These have the six points of the stationary platform in a plane and similarly
for the moving platform. In the interest of simplicity, these have not been added
to Figure 7.3, but we may summarize the membership relationship as follows. If
A and B are non-planar families, AP and BP their planar sub-families, and B is a
sub-family of A, then we have the following inclusions.
A ⊃ AP
∪    ∪
B ⊃ BP
The planarity condition results in a symmetry, because the moving platform and its
mirror image reflected through the plane of the stationary platform are congruent
and all the leg lengths are preserved by the reflection. Hence, solutions appear
in symmetric pairs. If we perform continuation in a planar family, this symmetry
applies at every step, and hence all solutions can be obtained by tracking only one of
each pair. This is the reason that N in Table 7.1 is written in the form N/2 + N/2:
only half the paths must be tracked to solve a member of that family.

7.7.4 Summary of Case Study


The main point to remember is that if we have a list of N(U, Q) nonsingular solutions
for one generic member of a parameterized family F of polynomial systems,
we can find the nonsingular solutions of any other member of the family using these
as the start points of N(U, Q) homotopy solution paths. In the case of the forward
kinematics of Stewart-Gough platforms, N(P^7, C^42) = 40, so any problem can
be solved using a 40-path homotopy. We have identified a number of sub-families
that have a reduced number of nonsingular solutions, and a homotopy that stays
within such a parameter subspace solves other members of the sub-family using
the reduced number of solution paths. Sub-families with planar platforms admit
a two-way symmetry which can be used to reduce the number of solution paths
by half.
We see that parameter continuation can be an effective way to explore such
nested families and discover the generic number of nonsingular roots for each. In
the exercises in the next section, we encourage the reader to experience this directly,
by running Matlab routines supplied for this purpose.
It should be mentioned that there are many other approaches to such a study.
In addition to studies of the general 6-6 case already mentioned (Husty, 1996; Mour-
rain, 1996; Raghavan, 1993; Ronga & Vust, 1995; Wampler, 1996a), for several of
the subfamilies, kinematicians have found elimination procedures reducing the prob-
lem to a single polynomial (Chen & Song, 1994; Nanua, Waldron, & Murthy, 1991;
Sreenivasan, Waldron, & Nanua, 1994; Zhang & Song, 1994) or have applied their
own variants of continuation (Sreenivasan & Nanua, 1992; Dhingra, Kohli, & Xu,
1992). An extensive study of coincident-joint sub-families using Gröbner bases can
be found in (Faugère & Lazard, 1995).

7.8 Historical Note: The Cheater's Homotopy

Among those who have some passing knowledge of developments in polynomial con-
tinuation, there has sometimes been confusion between parameter homotopy and a
similar approach called the "cheater's homotopy" by its inventors (Li, Sauer, &
Yorke, 1989). Appearing in print before the article establishing "coefficient-
parameter homotopy" (Morgan & Sommese, 1989), the cheater's homotopy pre-
saged much of the flavor of the full parameter theory. Consequently, the cheater's
homotopy holds an important place in the development of the subject, even though
it was soon eclipsed by the more general parameter homotopy theory.
Rather than working in the natural parameter space Q associated to a system
f(z; q) = 0, the cheater's homotopy expands the parameter space by generic constants
b ∈ C^n. The method starts by solving the initial system f(z; q_1) + b = 0 for
generic q_1 ∈ Q and b ∈ C^n. Then, the finite, nonsingular solutions of this system
are used as start points in a homotopy to find all the finite, nonsingular solutions
to some other example in the family, say f(z; q_0) = 0, q_0 ∈ Q. This is done by
following the solution paths from t = 1 to t = 0 in the homotopy f(z; q(t)) + tb = 0,
where q(t) ∈ Q is a continuous path in Q with q(1) = q_1 and q(0) = q_0.
We can see immediately from the parameter homotopy theory that this approach
works: we have a generic start system (q_1, b) in an expanded parameter space Q × C^n
and the target system is given by (q_0, 0) ∈ Q × C^n. However, the addition of
the generic constants to each equation often destroys crucial structure, causing an
increase in the number of paths to track, often substantially. A simple example
that shows a big difference is

For general q, this has one nonsingular solution (x,y) = (q,q), so a parameter
homotopy will have just one path to track. But the start system for the cheater's
homotopy

f((x, y); q) + (b_1, b_2) = 0     (7.8.14)
has six nonsingular solutions. Computing solutions of Equation 7.8.13 for several
different values of q by the cheater's homotopy requires six paths each time. The
added constants b_1 and b_2 destroy all the structure of the original system. This kind
of difference arises in meaningful problems as well; for the nine-point path synthesis
problem discussed in § 9.6.7, a parameter homotopy requires only 1442 solution
paths, whereas the cheater's homotopy would require at least 90,000 continuation
paths (see (Wampler, Morgan, & Sommese, 1992, 1997)). The difference is due to
the presence of positive dimensional solution components. Parameter homotopy
preserves these components and so the associated paths can be safely ignored. But
the cheater's homotopy perturbs these components, replacing them with thousands
of nonsingular paths that must be tracked.
The same property that makes the cheater's homotopy undesirable in the gen-
eral situation can make it the method of choice in certain specialized situations: the
addition of the random constants makes all finite roots nonsingular. For example,
Equation 7.8.13 has a quintuple root at the origin, (x,y) = (0,0). Adding the con-
stants as in Equation 7.8.14 perturbs this into five distinct roots. If we wish to have
the origin appear as the endpoint of nonsingular homotopy paths, the cheater's ho-
motopy will accomplish this. Usually though, our aims are in the opposite direction:
we would like to avoid computing degenerate solutions whenever possible.
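A univariate analogue makes the path-count penalty concrete. In the family f(x; q) = x^3 − qx^2 (our own example, not the book's Equation 7.8.13), a generic member has one nonsingular root x = q and a double root at x = 0, so a parameter homotopy in q tracks one path per target; the cheater's start system f(x; q) + b has three nonsingular roots, so the cheater's homotopy tracks three. The Python sketch below counts the distinct root locations of both systems, using a basic Durand-Kerner iteration as a stand-in root finder:

```python
import random

def cubic_roots(c2, c1, c0, iters=300):
    """Durand-Kerner iteration for the roots of x^3 + c2*x^2 + c1*x + c0."""
    p = lambda x: x**3 + c2 * x**2 + c1 * x + c0
    z = [complex(0.4, 0.9)**k for k in range(3)]   # standard distinct starts
    for _ in range(iters):
        new = []
        for i, zi in enumerate(z):
            d = 1.0
            for j, zj in enumerate(z):
                if j != i:
                    d *= zi - zj
            new.append(zi - p(zi) / d)
        z = new
    return z

def count_clusters(roots, tol=1e-4):
    """Number of distinct root locations (a multiple root counts once)."""
    reps = []
    for r in roots:
        if all(abs(r - s) > tol for s in reps):
            reps.append(r)
    return len(reps)

random.seed(2)
q = complex(random.uniform(0.5, 1.5), random.uniform(-1.5, -0.5))
b = complex(random.uniform(0.5, 1.5), random.uniform(0.5, 1.5))

plain_roots = [0.0, 0.0, q]              # x^3 - q x^2 = x^2 (x - q), by hand
cheater_roots = cubic_roots(-q, 0.0, b)  # x^3 - q x^2 + b, found numerically

# Self-check: the numerical roots really solve the cheater's start system.
assert max(abs(r**3 - q * r**2 + b) for r in cheater_roots) < 1e-8

print(count_clusters(plain_roots), count_clusters(cheater_roots))
```

The family member has two distinct root locations (one of them the degenerate double root), while the cheater's start system has three, all nonsingular: the added constant trades structure for extra paths.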

7.9 Exercises

The following exercises are intended to help the reader understand the principles
of parameter continuation and also to experience the numerical behavior of the
continuation method. They assume that the user has access to Matlab, and that
the package HOMLAB, available on the authors' websites, has been installed on the
Matlab search path. A user's guide to HOMLAB appears in Appendix C.
Demonstration codes are provided for most of the exercises, so they can be run
with minimal knowledge of Matlab commands. A few exercises require the user to
write or modify an m-file. Even those with minimal prior experience with Matlab
should be able to handle these after a little experimentation.
A few words about HOMLAB. The main output of the demonstration programs
is always stored in two arrays: xsoln and stats. Each column of xsoln contains a
solution of the system in homogeneous coordinates, and column i of stats compiles
some statistics on the numerics of the ith solution.
HOMLAB treats all problems as formulated in a multiprojective space to take
advantage of the ability of the projective transformation to handle paths leading
to solutions at infinity. For the Stewart-Gough platform problems, this is natural,
since we have formulated them on IP7. The code requires that problems naturally
formulated in C n , such as the initial triangle example, be homogenized for solution in
P". Typically, the homogeneous coordinate that is added in this process is appended
as the last row in xsoln. (See the user's guide for information on the full range of
options.) Function y=dehomog(xsoln,epsO) de-homogenizes solutions by dividing
through by the homogeneous coordinate for any solution for which the homogeneous
coordinate is nonzero as judged by the test abs (xsoln(n+1, :) )>epsO. Part of the
learning process of the exercises will be to see how to set the tolerances such as
epsO.
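HOMLAB itself is Matlab code, but the behavior of dehomog is easy to mirror. Here is a small Python sketch of the same idea (hypothetical names and data layout: a list of rows, one column per solution, homogeneous coordinate in the last row):

```python
def dehomog(xsoln, eps0):
    """Divide each column through by its homogeneous coordinate (the last
    row), skipping columns whose homogeneous coordinate has absolute value
    below eps0 (candidates for solutions at infinity)."""
    *rows, hom = xsoln                      # last row = homogeneous coordinate
    keep = [j for j, h in enumerate(hom) if abs(h) > eps0]
    return [[row[j] / hom[j] for j in keep] for row in rows]

# Two solutions of a homogenized system: the first column, with homogeneous
# coordinate 2.0, is the affine point (1, 2); the second, with homogeneous
# coordinate 1e-12, is judged to be at infinity and dropped.
xsoln = [[2.0, 3.0],
         [4.0, 7.0],
         [2.0, 1e-12]]
y = dehomog(xsoln, 1e-8)
print(y)  # -> [[1.0], [2.0]]
```

The tolerance eps0 plays the same role as in HOMLAB: too tight and genuine finite solutions with small homogeneous coordinate are discarded; too loose and near-infinite solutions get divided by a tiny number and blow up.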
The second output, stats, compiles some statistics for the run. Each column
of stats corresponds to the matching column in xsoln. Full information is given
in the user's guide. For the exercises to follow, we are mainly concerned with rows
2, 3, and 5, having the following meanings:

Row 2 This is a convergence test on the solution. It is a two-norm estimate of how
accurately the solution has been computed.
Row 3 This is the maximum of the absolute values of the polynomials evaluated
at the solution point. If this is not small, an error has occurred.
Row 5 Condition number of the Jacobian matrix of the polynomial system eval-
uated at the solution point. A large condition number implies the solution is
singular.

Exercise 7.1 (Triangle) This exercise experiments with file triangle.m, which
solves Ex. 7.3 using the parameter homotopy path given in Equation 7.3.6. It
uses a path tracker without an endgame to handle singular roots so that one can
see what happens in such cases. The routine allows the option of accepting a
randomly-generated, complex value for the path constant a in Equation 7.3.6. Try
the following experiments:

(1) Solve several triangles of your own choice, accepting the option to use a random,
complex value for a. Does the routine reliably return accurate solutions?
(2) Try again, but choose a = 1. Can you find examples for which the routine fails?
Succeeds? Can you determine a condition on (a, b, c) that predicts success versus
failure?

(3) Now choose a = 1 + 1i. Can you find an (a, b, c) for which the algorithm now
fails? What happens if you add a small random perturbation to the values?
(4) Enter an (a, b, c) that is on the boundary of the triangle inequality, for example,
(2, 1, 1). Let the routine pick a random value for a. What happens? How about
for (a, b, c) = (2, 1, 1 + 1e-8)?

Exercise 7.2 (Symmetry) Consider the family of systems F(x, y; a) = {xy^3 −
a, x^3y − a} parameterized by a ∈ C. Solve F(x, y; a) = 0 symbolically by hand.
Find a mapping that gives symmetry groups of order 4. How many roots are there
in (x, y) ∈ C^2? How many paths would you need to track if symmetry is used to
its fullest extent?

Exercise 7.3 (Cheater's Homotopy) This exercise addresses the system in
Equation 7.8.13.
(1) Prove the claim that Equation 7.8.13 has just one nonsingular solution for a
generic value of q.
(2) Use the script cheatrun.m provided with HOMLAB to numerically determine
the number of nonsingular roots for Equation 7.8.13 and for the Cheater's start
system, Equation 7.8.14, assuming generic b_1 and b_2.
(3) How many solution paths would a parameter continuation have when solving
for different settings of the parameter q? How about the cheater's homotopy?
(4) What are the singular solutions of Equation 7.8.13?

Exercise 7.4 (Stewart-Gough Platforms) The goal of this exercise is to reconfirm
the results presented in the case study of § 7.7. You will use Matlab routine
Stewart/sgparhom.m.
(1) In Matlab, type >> sgparhom to begin solving Stewart-Gough forward kinematics
problems. A file, strt66.mat, containing random parameters and the 40
corresponding solutions for a generic 6-6 problem is provided to bootstrap the
process.
(2) Plan a strategy for reconfirming the solution counts for all the subfamilies shown
in Table 7.1. Try to minimize the total number of paths that are tracked. The
program provides a facility for saving solutions to re-use as start points in
subsequent runs. Run at least one of each topological type, some planar and
some not.
(3) Pick a subfamily and write an m-file that defines a specific example in that sub-
family. Then, compute solutions to that example twice: once using a homotopy
in the subfamily and once as a special case of a larger subfamily that contains
it. For example, you might solve a specific 3-3 case with a 16-path homotopy
in that family and also with a 32-path homotopy in the 6-4 family. Compare
computation times and check that the same (nondegenerate) solutions are ob-
tained both ways. Remember that the points are computed using homogeneous
coordinates in P^7, so you will need to devise a scheme for judging that two such
points are equal. How closely do the points match?
(4) Run a real case and check that any complex solutions appear in conjugate pairs.
Change the parameters and see if the number of real roots changes.
(5) Solve a problem with real parameters, p ∈ R^42. Then, use 3-D graphics com-
mands to draw simple (stick-figure) models of the Stewart-Gough platform in
all its real poses.

Exercise 7.5 (Secant Homotopy) Let f(x; p) : C^n × Q → C^n be a system
of parameterized polynomials. Then the secant system derived from f(x; p) is
g(x; λ, μ, p_1, p_2) = λ f(x; p_1) + μ f(x; p_2).

(1) What is the parameter space for g(x; λ, μ, p_1, p_2) = 0? (Note, we may consider
[λ, μ] ∈ P^1. Why?) Denote the parameter space as Q' in the following items.
(2) What is the relationship between the nonsingular root count for g, N_g(U, Q'),
and the one for f, N_f(U, Q), where U is any Zariski open subset⁴ of C^n?
(3) Suppose we know all N_f(U, Q) nonsingular roots of f(x; p_1) = 0 for some general
p_1 ∈ Q. We would like to use these as start points for a secant homotopy

h(x, t) = γt f(x; p_1) + (1 − t) f(x; p_2)     (7.9.15)

to find all nonsingular solutions in U of f(x; p_2) = 0 by tracking solution paths
as t goes from 1 to 0. Why do we need N_f(U, Q) = N_g(U, Q') for this to be
justified? If this equality does not hold, can you think of a way the homotopy
might fail?
(4) If the conditions of the previous item are satisfied and the constant γ in Equation
7.9.15 is chosen randomly in C, the secant homotopy will be successful with
probability 1. Choosing γ randomly on the unit circle (|γ| = 1) also works. Can
you see why the random γ is necessary? (Try to think of a counter-example if
γ = 1.)
(5) Prove the claims of the previous item. (Hint: see § 8.3.)

Exercise 7.6 (Secant Homotopy for Stewart-Gough Platforms) The conditions
laid out in the previous exercise for success of the secant homotopy hold
for general Stewart-Gough forward kinematics problems (type 6-6) as defined by
Equation 7.7.11. A set of 40 solutions for a general 6-6 platform are provided in file
stewart/strt66.mat. These were used in Exercise 7.4 as start points for a para-
meter homotopy, but they can also be used for secant homotopy as implemented in
Stewart/sgsecant.

(1) In (Wampler, 1996a), it is shown that the root count of 40 for general 6-6
platforms follows from the fact that B_i is antisymmetric when we re-write
Equation 7.7.10 for leg i as a quadratic form,

e^T A_i e + 2 e^T B_i g + g^T g = 0,

where e and g are interpreted as 4 × 1 column matrices. Use this fact to prove
that the secant homotopy is valid for 6-6 platforms.
⁴ See § 12.1.1 for the definition.
(2) Use sgsecant to solve a random 6-6 example. How does the running time
compare to the parameter homotopy? Can you explain why?
(3) Use the secant homotopy to solve examples of the other coincident-point sub-
families, using the 40 start points from strt66.mat. Why is this justified? In
particular, solve a 3-3 problem. How does the computation time compare to
item 2? Can you explain?
(4) We would like to solve problems in a coincident-joint subfamily using a start
problem from the same subfamily so that the number of solution paths is equal
to the generic number of solution points. For example, we would like to solve
4-4b problems using just 24 paths. What check must be performed to see that
this is justified?
(5) (Challenging) Write a program to do the check for subfamily 4-4b. (Tip: modify
a copy of sgsecant.m.) What is your conclusion? Try the same for other
families.
(6) What needs to be checked to conclude that a secant homotopy between two
6-6P platforms can be done using just 20 paths? Modify sgsecant.m so that
you can do this check. What is your conclusion?
(7) Use the results of the last two items to determine the minimum number of paths
required to solve a 3-3 problem by secant homotopy.
Exercise 7.7 (Numerics of Tracking) File htopyset.m sets constants that control
the behavior of the path tracker. The two most important ones are maxit and
epsbig. Small values require the numerical solution point to stay close to the true
path defined by the equations. Large values allow more deviation. Adjust the settings
by putting a copy of htopyset.m in your local directory and editing it. Run
sgparhom and observe the effect on computation time and reliability by recording
changes in runtime, the number of function evaluations (last row in stats), and by
noting any path failures. Also, type >> pathcros(xsoln) to check if any solutions
have "jumped paths," causing some root to be reported more than once and leaving
out the root at the end of the solution path that was left behind in the jump. Can
you make sgparhom run faster?
Chapter 8

Polynomial Structures

In the previous chapter, we introduced the basic concept of a coefficient-parameter
homotopy. This is the underlying principle for all of the homotopies discussed in
this book; each system that we solve has a parameter space, and a homotopy is
just a continuous path between two points in this parameter space. Whenever we
approach a new polynomial system, the first question we face is how to parameterize
it. Problems from engineering or science generally come with a natural set of
parameters built in: the dimensions of the links in a mechanical system or the
rate coefficients for chemical reactions, for example. But rarely do we know all the
solutions for a general choice of such parameters. We need to cast the naturally-
parameterized problems in some larger family of problems in which a start system
is more easily found. We called this the Ab Initio procedure in § 7.2, but postponed
detailed discussion for later. We now return to this important question.
At the opposite end of the spectrum from the natural, physical parameterization
of a system are total degree homotopies. These can in principle solve any system,
because as we shall see, every system is a member of a total-degree family
parameterized by the coefficients. Moreover, in each such family, there is a start system
whose solutions are immediately apparent. The downside is that, depending on the
target system, the total-degree homotopy may have many paths that go to solutions
at infinity or other degeneracies. These waste computer time, and the process of
carefully distinguishing between degenerate and nondegenerate solutions can also
cause extra work. Even so, if the extra work is not so excessive as to make the
computation infeasible, we only have to do it once to get the solutions for a general
member of the naturally parameterized family. Then, we can use a homotopy in
the natural parameter space to solve any other system of that parameterized family.
But what if the extra work is excessive?
Over the years, a number of useful classes of homotopies have been invented
to populate the territory between total degree and naturally-parameterized homo-
topies. We choose among these with the objective of best matching the target
system without overly complicating the solution of the start system. The purpose
of this chapter is to discuss the most important signposts in this territory.


8.1 A Hierarchy of Structures

Fig. 8.1 Classes of Product Structures. Below line A, start systems can be solved using only
routines for solving linear systems of equations. Above line B, special methods must be designed
case-by-case.

Figure 8.1 shows a hierarchy of classes of special structures that are useful in
constructing homotopies. Each structure in the diagram is a member of the class
above it; for example, a total degree structure is a particular kind of multihomo-
geneous structure. (In particular, as we will shortly see, it is a one-homogeneous
structure.) As we ascend the hierarchy, each class of structures presents more and
more possibilities for matching a particular target system that we wish to solve. As
indicated on the right of the diagram, this means that we can select a more special
structure, usually with the aim of reducing the number of solution paths to track
in the homotopy. The trade-off we face in this ascent is indicated by the downward
pointing arrow on the left of the diagram: the lower structures allow us to select
start systems that are easier to solve. For some problems, the ascent up the dia-
gram pays handsomely in path reduction and may turn an intractable problem into
a solvable one. On the other hand, it can happen that solving a start system for a
higher structure can consume more computer time than is saved in path reduction.
Unfortunately, even just counting the number of roots of the start system can be
expensive, so it is a matter of experience to decide the most advantageous spot in
this hierarchy to solve a particular problem.
Two dashed lines appear in Figure 8.1 to demarcate significant differences in the
start systems of homotopies respecting the various special structures. Below Line A,
Polynomial Structures 119

the start systems can be chosen in a factored form which permits all solutions to
be computed using simple combinatorics and routines for solving linear systems.
Thus, for these structures, the time spent solving the start system is insignificant
compared with tracking the solution paths to the target system. Above Line A,
some path tracking is usually required just to solve the start system. Furthermore,
above Line B, solving the start system usually requires the use of a homotopy based
on one of the structures below it in the hierarchy. Typically, these are not optimal
in the sense that some paths lead to degenerate points. Between the two lines lie the
monomial-product and Newton-polytope homotopies. These require path tracking
to solve the start system, but the homotopies involved can be specially designed
to produce all solutions of the start system without any extra paths leading to
degenerate solutions. In addition to the cost of the path tracking, the combinatoric
calculations can be significant.
In addition to differences in computation times, the position in the hierarchy
also has an effect on the complexity of the computer code that implements it. In
this regard, the two extremes are the simplest. All homotopies require routines for
path tracking. To this, a total degree homotopy adds a simple start system that
is almost trivially solved. Consequently, the corresponding computer code is as
simple as possible. At the other extreme, we may formulate a coefficient-parameter
homotopy in terms of the physical parameters of the engineering or science problem
at hand, a step which we must do in any case. The start system simply amounts
to choosing random, complex values for these parameters. The difficulty comes
in solving the start system. A simple way to proceed is to solve the start system
with a total degree homotopy. This may be expensive, but it only has to be done
once. After that, we may solve any target system in the same parameterized family
using only the paths from the nondegenerate solutions of that first start system.
So, once we have implemented a general-purpose solver for total degree homotopies,
coefficient-parameter homotopies require only a bit of data management to solve a
start system and store its nondegenerate solution list.
The other intermediate structures introduce intermediate levels of complexity
to a computer code. Multihomogeneous and linear product homotopies introduce
simple combinatorics into the enumeration of the start solutions. In contrast, the
combinatorics introduced by monomial homotopies have been the subject of signif-
icant mathematical study, of which we give only a hint in § 8.5.
A final important consideration in the choice of homotopy is numerical stability
and robustness. For the paths leading to nonsingular solutions, there is not much
difference to be expected in this regard no matter which homotopy is chosen. How-
ever, it can happen that if one uses a homotopy near the bottom of Figure 8.1, the
singular solutions may vastly outnumber the nonsingular ones. In some practical
situations, we may be satisfied to casually discard all badly-conditioned solutions
without wasting much computer time on them. This runs the risk of dropping out
some generically nonsingular solutions that happen to have marginal conditioning.

When we wish to be more careful about finding all nonsingular solutions, a great
deal of effort may be necessary to resolve all the badly-conditioned solutions. Mov-
ing up the hierarchy to a more special structure may eliminate these solutions from
the homotopy and avoid the cost and uncertainty of computing singular solutions.
In some cases, singular solutions remain but have reduced multiplicity, making them
easier to compute accurately using "singular endgames" (see Chapter 10).
With this general picture in mind, we will proceed to examine each of the special
structures in some detail. Before starting this journey, we present a discussion of
homotopy paths that is relevant to all the special structures. Then, we start at the
bottom of the diagram of Figure 8.1 and work our way up to structures of increasing
specificity. We give only simple examples in this chapter, postponing case studies
of more significant examples to the next chapter.

8.2 Notation

Throughout the remainder of this chapter, it will be convenient to use the following
notations.

(1) Let (e_1, ..., e_n) be the n-dimensional vector space having basis elements
e_1, ..., e_n and coefficients from C. Any point in this space may be written
in the form Σ_{i=1}^n c_i e_i with c_i ∈ C for all i. Note that we have not specified
anything about the basis elements: in the structures we discuss below these will
be variously individual variables, monomials, or polynomials.
(2) Let {p_1, ..., p_n} ⊗ {q_1, ..., q_m} be the product of two sets, that is, the set
{p_i ⊗ q_j, 1 ≤ i ≤ n, 1 ≤ j ≤ m} having nm elements. Throughout this
chapter, we take this product as the image inside the polynomial ring; that is,
x ⊗ y = y ⊗ x = xy is just the product of two polynomials.
(3) Define P × Q = {pq | p ∈ P, q ∈ Q}. Accordingly, (P ⊗ Q) is the space whose
members are sums of members of (P) × (Q). Since this includes a sum of one
item, we have (P) × (Q) ⊂ (P ⊗ Q).
(4) For repeated products, we use the shorthand notations P^(2) = P ⊗ P and
(P)^(2) = (P) × (P), and similarly for three or more products.
(5) We write an element of complex projective space P^n using square brackets, for
example, x = [x_0, x_1, ..., x_n] ∈ P^n; see § 3.2.

8.3 Homotopy Paths for Linearly Parameterized Families

As we shall soon see, in our hierarchy of special structures, Figure 8.1, all but
the top case (general coefficient-parameter structures) have parameters that appear
linearly. This means that the family of systems F(z; q) : C^n × C^m → C^n has the
property that for any α, β ∈ C and q_1, q_2 ∈ C^m,

    F(z; α q_1 + β q_2) = α F(z; q_1) + β F(z; q_2).

The special structures of this chapter all obey this linearity condition because they
are parameterized by coefficients which multiply a basis set of monomials or poly-
nomials.
Since the parameter space, C^m, is linear, we can easily construct an homotopy
that stays in the parameter space while continuing from a start system, F(z; q_1), at
t = 1, to a target system, F(z; q_2), at t = 0, as

    H(z, t) := F(z; t q_1 + (1 − t) q_2) = 0.

By Lemma 7.1.2, to solve the system for a given target q_2, we just need the solutions
at almost any starting q_1 ∈ C^m, from which we can follow the real straight-line path
t ∈ (0, 1]. However, in the case of an Ab Initio homotopy, where we have chosen
q_1 specially so that F(z; q_1) = 0 is easy to solve and where q_2 is a target that may
not be general, we cannot use the lemma: neither endpoint is chosen with complete
freedom in the parameter space.
Since we are going to choose q_1 in a way that guarantees it has the generic
number of nonsingular roots, we can use any of the alternatives mentioned in § 7.1.
Among these, the "gamma trick" of Lemma 7.1.3 leads to the homotopy

    H(z, τ) := F(z; (γτ q_1 + (1 − τ) q_2) / (1 + (γ − 1)τ)) = 0,

where γ ∈ C is chosen randomly and τ ∈ (0, 1]. For nonzero γ not on the negative
real axis and τ ∈ [0, 1], the denominator 1 + (γ − 1)τ ≠ 0. By the linearity of F(z; q)
with respect to q, we can clear the denominator to get

    H(z, τ) := F(z; γτ q_1 + (1 − τ) q_2) = 0,

without changing the solution paths. It can save computation to further rewrite
this as

    H(z, τ) := γτ F(z; q_1) + (1 − τ) F(z; q_2) = 0.

This is sufficiently convenient that we state it formally below. The upshot is that
in the succeeding sections, we may concentrate on finding start systems for each of
the special structures. Any start system in the family will do, as long as it has the
generic number of roots. Recall from the previous chapter the notation N(q, U, Q),
the number of nonsingular roots in U of F(z; q) = 0 at parameter point q ∈ Q.

Theorem 8.3.1 Suppose F(z; q) : C^n × C^m → C^n is polynomial in z and linear
in q, and let f(z) = F(z; q_0) for some given q_0 ∈ C^m. If g(z) = F(z; q*) with
N(q*, U, C^m) = N(U, C^m) for some Zariski open set U ⊂ C^n, then

(1) for almost all γ ∈ S^1, i.e., for all but finitely many complex numbers γ of
absolute value one, the homotopy

    h(z, t) := γt g(z) + (1 − t) f(z) = 0

has N(U, C^m) nonsingular solution paths on t ∈ (0, 1] whose endpoints as t → 0
include all of the nonsingular roots of f(z) = 0 in U;
(2) if g(z) = 0 has no isolated roots of multiplicity greater than 1, the endpoints of
the nonsingular solution paths include all isolated solutions of f(z) = 0 in U;
and
(3) if we let γ = e^{iθ}, the foregoing statements still hold for all but a finite number
of points θ ∈ [−π, π].

Proof. This is a consequence of Theorem 7.1.4, Theorem 7.1.6, and Lemma 7.1.3
with rearrangements described above for linearly parameterized families. •

Remark 8.3.2 In cases where g already incorporates a generic complex scaling
factor, γ is superfluous; it can be dropped from the homotopy. (This is equivalent
to choosing γ = 1.)

Through use of Theorem 8.3.1, as long as the parameters of the family of systems
appear linearly, all that we need to form a good homotopy is to find one start system
in the family having the generic number of nonsingular roots. Then, by picking γ
at random in C, the homotopy leads to all nonsingular solutions of a target system,
with probability one.
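To make this concrete, the homotopy of Theorem 8.3.1 can be sketched numerically. The following is a toy illustration only (an arbitrary univariate example with made-up names and step counts, not code from the book's accompanying software): it tracks the two roots of a start polynomial g to the two roots of a target polynomial f along h(z, t) = γt g(z) + (1 − t) f(z) with a simple predictor-corrector.

```python
# Illustrative sketch (not the book's MATLAB code): track the two roots of a
# target polynomial f from the start system g via the linear homotopy
#   h(z, t) = gamma * t * g(z) + (1 - t) * f(z),
# as in Theorem 8.3.1.  Step counts and gamma are arbitrary choices.
import cmath

gamma = cmath.exp(1j * 0.7)          # a generic point on the unit circle

f  = lambda z: z**2 - 2*z + 5        # target: roots 1 + 2j and 1 - 2j
df = lambda z: 2*z - 2
g  = lambda z: z**2 - 1              # start: roots +1 and -1
dg = lambda z: 2*z

def h(z, t):  return gamma*t*g(z) + (1 - t)*f(z)
def dh(z, t): return gamma*t*dg(z) + (1 - t)*df(z)

def track(z, steps=200):
    """Follow one path from t=1 to t=0 with a naive predictor-corrector."""
    for k in range(steps):
        t = 1 - (k + 1)/steps        # step t toward 0 (prediction: keep z)
        for _ in range(5):           # Newton corrections at the new t value
            z = z - h(z, t)/dh(z, t)
    return z

roots = sorted((track(z0) for z0 in (1, -1)), key=lambda z: z.imag)
print(roots)
```

The two start points ±1 flow to the two target roots 1 ∓ 2i; which start point reaches which root is not known in advance, so the endpoints are sorted before printing.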

8.4 Product Homotopies

Let us now jump to the bottom of the hierarchy of Figure 8.1 and work our way
up. Although the lower structures can be justified as special cases of the higher
ones, it is better for building understanding and intuition to start with the simpler
cases. Not surprisingly, for the most part, this follows the historical development
of the subject.

8.4.1 Total Degree Homotopies


At the bottom of the hierarchy, the total degree homotopy uses the least detail of
the structure of the target system to be solved. The structure is completely
characterized by the number of variables n and a list of degrees d_i, i = 1, ..., n. (Here,
the d_i are all positive integers.) Let F(z, q) : C^n × Q → C^n be the family consisting
of n polynomials in n variables with d_i being the degree of the ith polynomial. The
parameter space Q consists of the coefficients of all monomials that respect the
specified degree structure. In other words, we have

    f_i(z; q) = Σ_{|a| ≤ d_i} q_{i,a} z^a,   i = 1, ..., n,   (8.4.1)

where a = (a_1, ..., a_n) ∈ Z^n_{≥0}, |a| := a_1 + ⋯ + a_n, and z^a := z_1^{a_1} z_2^{a_2} ⋯ z_n^{a_n}.
The number of monomials in n variables having degree less than or equal to d is
(n+d choose n),¹ so denoting m_i = (n+d_i choose d_i), the parameter space for the total degree homotopy
is Q = C^{m_1} × ⋯ × C^{m_n}. Using the notation of § 8.2, we may write a description of
F in the alternative form

    f_i(z) ∈ ({1, z_1, ..., z_n}^(d_i)),

where the parameter space is the set of coefficients multiplying the elements of the
vector space.
Since the parameters of F appear linearly, we can apply Theorem 8.3.1, if only
we can find a start system g ∈ F that has the generic number of nonsingular roots
and is easy to solve. We know from the classical Bezout Theorem for systems that
the number of finite, nonsingular solutions to a generic member of the total degree
family is N = d_1 ⋯ d_n. A simple system that achieves this bound is

    g(z) = [ z_1^{d_1} − 1,  z_2^{d_2} − 1,  ...,  z_n^{d_n} − 1 ]^T = 0.   (8.4.2)

We can solve the individual equations independently, obtaining d_i roots for z_i; the
solutions of the system g(z) = 0 are the d_1 ⋯ d_n combinations of these. It is easy
to see that all of these roots are nonsingular. So, even though it is very sparse,
g(z) has as many roots as the most general member of the total degree family. We
summarize the net result in the following theorem.
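The combinatorial solution of this start system is easy to sketch in code. The following Python fragment (illustrative only; the function name is our own, not from the book's software) enumerates the d_1 ⋯ d_n start points of Equation 8.4.2 as all combinations of roots of unity:

```python
# A minimal sketch (assumed names): enumerate all d_1 * ... * d_n solutions of
# the total-degree start system of Equation 8.4.2, g_i(z) = z_i^{d_i} - 1 = 0,
# as combinations of one d_i-th root of unity per variable.
import cmath
from itertools import product

def start_points(degrees):
    """All solutions of z_i^{d_i} = 1, one root of unity chosen per variable."""
    axis_roots = [[cmath.exp(2j * cmath.pi * k / d) for k in range(d)]
                  for d in degrees]
    return list(product(*axis_roots))

pts = start_points([2, 3])        # total degree 2 * 3 = 6
print(len(pts))                   # 6 start points
# every point satisfies both equations:
print(all(abs(z1**2 - 1) < 1e-9 and abs(z2**3 - 1) < 1e-9
          for z1, z2 in pts))     # True
```

Each of these points is a nonsingular root of g(z) = 0 and serves as the start of one homotopy path.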

Theorem 8.4.1 (Total Degree Homotopy) Given a system of polynomials
f(z) = {f_1(z), ..., f_n(z)} : C^n → C^n with the degree of f_i equal to d_i, let g(z)
be any system of polynomials of matching degrees such that g(z) = 0 has d = ∏_{i=1}^n d_i
nonsingular solutions. Then, the d solution paths of the homotopy

    h(z, t) := γt g(z) + (1 − t) f(z) = 0

starting at the solutions of g(z) = 0 are nonsingular for t ∈ (0, 1] and their endpoints
as t → 0 include all of the nonsingular solutions of f(z) = 0 for almost all γ ∈
C, excepting a finite number of real-one-dimensional rays through the origin. In

¹ Simple demonstration: a monomial z^a, z = {z_1, ..., z_n}, |a| ≤ d, can be written as a string of
d + n symbols as z^a = 1 ⋯ 1 × z_1 ⋯ z_1 × ⋯ × z_n ⋯ z_n, where the positions of the n occurrences
of the "×" symbol uniquely specify the monomial. Hence, the choices of n items in a list of n + d
things enumerate the monomials.

particular, restricting γ to the unit circle, γ = e^{iθ}, the exceptions are a finite number
of points θ ∈ [0, 2π].

Proof. Because the family of all polynomial systems with the specified degrees
is a vector space over the coefficients of its monomials, this follows directly from
Theorem 8.3.1 under the condition that g(z) = 0 has the generic nonsingular root
count. The classical Bezout Theorem says that d = ∏_{i=1}^n d_i is the generic root
count for this family, so we are done. •

Remark 8.4.2 The system g(z) from Equation 8.4.2 satisfies the conditions of
the theorem, and so it can be used as the start system of a homotopy to solve
f(z) = 0. There are, however, many viable alternatives. One that is occasionally
useful has g_i(z) a product of d_i generic linear factors. Using the notation of § 8.2,
we may write

    g_i(z) ∈ (z_1, ..., z_n, 1)^(d_i).

The roots of this start system are found by choosing one factor from each equation
and solving the resulting linear system of equations. If we choose the coefficients
of all the linear factors at random, these linear systems will all be nonsingular with
probability one. Equation 8.4.2 is a special case in which

    g_i(z) ∈ (z_i, 1)^(d_i).

Instead of taking the classical Bezout Theorem as given, we can prove it with the
tools at hand. It is instructive to do so, because a slight generalization of the same
argument will apply for multihomogeneous structures in the next section. First, we
rephrase Bezout's Theorem in the current notation.
Theorem 8.4.3 (Projective Bezout) Given positive integers d_1, ..., d_n, let
F(z̄, q) : C^{n+1} × Q → C^n be the family of homogeneous polynomial systems whose ith
function is a member of the vector space ({z̄_0, z̄_1, ..., z̄_n}^(d_i)) and whose parameters
Q are the coefficients of this space. Then,

    N(P^n, Q) = ∏_{i=1}^n d_i.

Corollary 8.4.4 (Affine Bezout) Given positive integers d_1, ..., d_n, let F(z, q) :
C^n × Q → C^n be the family of polynomial systems whose ith function is a member
of the vector space ({1, z_1, ..., z_n}^(d_i)) and whose parameters Q are the coefficients
of this space. Then,

    N(C^n, Q) = ∏_{i=1}^n d_i.

Proof. Let q* ∈ Q be the set of coefficients for the system ḡ(z̄) = F(z̄; q*) as

    ḡ(z̄) = [ z̄_1^{d_1} − z̄_0^{d_1},  z̄_2^{d_2} − z̄_0^{d_2},  ...,  z̄_n^{d_n} − z̄_0^{d_n} ]^T = 0.   (8.4.3)

We see that ḡ has no solutions at infinity, because if z̄_0 = 0, then all of the z̄_i =
0, but [0, ..., 0] is not a point in projective space. Away from infinity, we may
dehomogenize by setting z̄_0 = 1, and find the remaining z̄_i as the d_i-th roots of
unity. Clearly, there are d = ∏_{i=1}^n d_i distinct solutions, and they are all nonsingular.
Theorem 7.1.4 says that since q* ∈ Q, the generic root count N(P^n, Q) ≥ d. Suppose
q′ ∈ Q in the neighborhood of q* has N > d nonsingular solutions. Theorem A.14.1
implies that nonsingular roots continue in an open neighborhood, so since P^n is
compact, the nonsingular solutions along a path from q′ to q* must have a limit in
P^n as the path approaches q*. Accordingly, some solution of ḡ(z̄) = 0 must have
at least two solution paths approaching it. But this contradicts Theorem A.14.1,
leaving N = d as the only possible conclusion. The corollary follows immediately
from the observation that since ḡ(z̄) = 0 has no roots at infinity, this is the case
generically on the whole family F(z̄; q) = 0, and therefore the affine root count is
the same as the root count on P^n. •

Remark 8.4.5 We call d = ∏_{i=1}^n d_i the total degree of the system. Thus, we may
say that the number of finite, nonsingular roots of a system of n polynomials on C^n
is less than or equal to its total degree.

Remark 8.4.6 The system of Equation 8.4.3 can be used as the start system in
an homogeneous homotopy to solve n homogeneous polynomials in n + 1 variables
using the homogeneous analogue of Theorem 8.4.1. In fact, it is very useful to
homogenize a target system and solve it on P^n, so that solution paths that would
diverge to infinity in C^n can be followed to their endpoints at infinity in P^n. See
Chapter 3 for more on this.

The total degree homotopy is easy to implement and very effective for systems
of dense polynomials. However, systems arising in practice often display patterns
of sparsity that result in fewer than the total degree number of roots. The next few
sections move up the hierarchy of Figure 8.1 to capture more of the structure of the
target system.

8.4.2 Multihomogeneous Homotopies


The quickest way to understand multihomogeneous structures is to start with an
example. Suppose we have the system

    f(x, y) = [ xy − 1,  x^2 − 1 ]^T = 0.   (8.4.4)

The total degree of this system is four, but it has only two finite roots, (x, y) =
±(1, 1). When we use a total degree homotopy on C^2, we are in essence solving
a one-homogenization of the system on a patch of P^2. In this case, the one-
homogenization of f(x, y) is obtained by substituting x = X/W and y = Y/W and
clearing denominators to get

    F_1(W, X, Y) = [ XY − W^2,  X^2 − W^2 ]^T = 0.   (8.4.5)

Now the finite roots are [W, X, Y] = [1, 1, 1] and [1, −1, −1], and there is an
additional double root at infinity: [W, X, Y] = [0, 0, 1]. The total degree homotopy
not only wastes computation by following four solution paths, but the two unwanted
paths lead to a singular root. If this root is not handled properly, the procedure may
spend much more time on it than is spent on the meaningful finite roots.

It would be better to use a different treatment of infinity, so that the undesired
roots no longer exist. In this case, this can be done by introducing a separate
homogeneous coordinate for each variable; that is, set x = X/U and y = Y/V and
clear denominators to get

    F_2(X, Y, U, V) = [ XY − UV,  X^2 − U^2 ]^T = 0.   (8.4.6)


We now seek solutions ([U, X], [V, Y]) ∈ P^1 × P^1 and find that there are only the
two finite solutions ([1, 1], [1, 1]) and ([−1, 1], [−1, 1]). There are no solutions at
infinity, because setting U = 0 implies (U, X) = (0, 0), which is not allowed, since
[0, 0] ∉ P^1, and setting V = 0 has similar consequences. An homotopy that respects
the two-homogeneous structure of the system will have only two paths.

This can be understood in another way using the vector space notation of § 8.2.
Recall from the previous section that the total degree homotopy treats f(x, y) as
follows:

    (xy − 1) ∈ ({x, y, 1} ⊗ {x, y, 1}) = (x^2, xy, y^2, x, y, 1),
    (x^2 − 1) ∈ ({x, y, 1} ⊗ {x, y, 1}) = (x^2, xy, y^2, x, y, 1).   (8.4.7)

In contrast, the two-homogeneous treatment places f(x, y) as a member of the family
as follows:

    (xy − 1) ∈ ({x, 1} ⊗ {y, 1}) = (xy, x, y, 1),
    (x^2 − 1) ∈ ({x, 1} ⊗ {x, 1}) = (x^2, x, 1).   (8.4.8)

Clearly, for this system, the two-homogeneous treatment is more restrictive than
the one-homogeneous treatment. The corresponding start system is

9i(x,y)£(x,l)x(y,l),
g2(x,y) e (x,l) x (x,l). ^ ^
A particular instance that is sufficient is

which has two solutions (x, y) = (±1,1). When solving this system, we cannot
choose the first factor x = 0 in the first equation as it is incompatible with either
factor in the second equation. This hints at the general phenomenon that we make
use of in multihomogeneous homotopies.
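The factor-selection process just described can be sketched in a few lines (an illustrative Python fragment, not from the book's software), using the instance g_1 = x(y − 1), g_2 = (x − 1)(x + 1): incompatible choices of factors show up as singular 2 × 2 linear systems and are simply discarded.

```python
# A small sketch (illustrative): solve a linear-product start system by trying
# every combination of one linear factor per equation and keeping the
# combinations whose 2x2 linear system is nonsingular.
# A factor (a, b, c) stands for a*x + b*y + c = 0.
from itertools import product

g1 = [(1, 0, 0), (0, 1, -1)]    # g1 = x * (y - 1)
g2 = [(1, 0, -1), (1, 0, 1)]    # g2 = (x - 1) * (x + 1)

solutions = []
for (a1, b1, c1), (a2, b2, c2) in product(g1, g2):
    det = a1*b2 - a2*b1         # determinant of the 2x2 coefficient matrix
    if abs(det) < 1e-12:
        continue                # incompatible factors, e.g. x = 0 with x = 1
    # Cramer's rule for a1*x + b1*y = -c1, a2*x + b2*y = -c2
    x = (-c1*b2 + c2*b1) / det
    y = (-a1*c2 + a2*c1) / det
    solutions.append((x, y))

print(sorted(solutions))        # [(-1.0, 1.0), (1.0, 1.0)]
```

Of the four factor combinations, the two that pair x = 0 with a factor of g_2 are singular and dropped, leaving exactly the two roots (±1, 1).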
More formally, the structure used in a multihomogeneous treatment of a system
can be summarized as follows. We have n variables that are partitioned into m
disjoint subsets of sizes k_1, ..., k_m, (k_1 + ⋯ + k_m = n); that is, we have z ∈ C^n
written as

    z = {z_1, ..., z_m} with z_j = {z_{j1}, ..., z_{jk_j}}.

Furthermore, in the target system f(z), the degree of the ith polynomial f_i(z) with
respect to the jth set of variables z_j is d_{ij}. This can be written for i = 1, ..., n as

    f_i = Σ_{α_1, ..., α_m} c_{α_1, ..., α_m} z_1^{α_1} ⋯ z_m^{α_m},   (8.4.11)

where each α_j is a multidegree with |α_j| ≤ d_{ij}. Equivalently,

    f_i ∈ ({1, z_1}^(d_{i1}) ⊗ ⋯ ⊗ {1, z_m}^(d_{im})).   (8.4.12)

We consider the family of all such systems, parameterized by the coefficients of all
the monomials that appear in this vector space. In the remainder of this section,
f(z) is a particular member of this family and N(C^n, Q) is the root count for
the family, where the parameters, forming space Q, are the coefficients of all the
monomials of the vector space specified by Equation 8.4.12.

As we will justify below, a start system that corresponds to F given by Equation 8.4.12 is g with

    g_i ∈ (z_1, 1)^(d_{i1}) × ⋯ × (z_m, 1)^(d_{im}).   (8.4.13)

That is, g_i is the product of linear factors, with d_{ij} factors of the variables z_j. Let G
be the family of all such systems, having a parameter space Q′ consisting of the cross
product of the parameter spaces for the vector spaces of the factors. Clearly, after
expanding the product and collecting terms, each such g is in the family defined by
Equation 8.4.12, which defines a map φ : Q′ → Q. Let Q_g ⊂ Q denote the image

of Q′ under the map φ. We know that Q_g is irreducible, because Q′ is, so we may
speak of N(U, Q_g), the generic nonsingular root count of the start system family G
as a subfamily of F.

To find a solution of g(z) = 0, choose one factor from each equation and solve
these n linear equations simultaneously. One finds all of the solutions by ranging
over all possible choices of the factors. As we saw in the example of Equation 8.4.10,
some combinations of factors will be incompatible; in fact, we must choose exactly
k_j factors for each group of variables z_j.
There are several ways to count the number of solutions of the start system
g(z) = 0. Let D be the n × m matrix of nonnegative integers with entries d_{ij} and
let K = {k_1, ..., k_m}. For generic coefficients in the linear factors, we have a generic
root count that depends only on D and K. We'll call this function Bez(D, K) =
N(C^n, Q_g). Let s(K) be a list of length n containing k_j copies of j for j = 1, ..., m.
From this, let π(K) be all the distinct permutations of the list s(K), of which there
will be n!/(k_1! ⋯ k_m!). Then, a direct formulation of the combinatoric process
described in the previous paragraph is

    Bez(D, K) = Σ_{p ∈ π(K)} ∏_{i=1}^n d_{i p_i}.   (8.4.14)

An equivalent definition is

    Bez(D, K) = coeff( α_1^{k_1} ⋯ α_m^{k_m},  ∏_{i=1}^n Σ_{j=1}^m d_{ij} α_j ),   (8.4.15)

where coeff(x, p(x)) reads as "the coefficient of monomial x in the polynomial p(x)."

A special case of this formula occurs when m = n, which implies k_j = 1,
j = 1, ..., m. Then, D is a square matrix and Bez(D, K) = permanent(D), where
the permanent of a matrix is just the determinant except all terms are added without
introducing negative signs on the odd permutations. If D has all nonzero entries,
then there are n! terms in the sum. The other extreme is the one-homogeneous
case m = 1, k_1 = n, for which we get one term, the total degree Bez(D, {n}) =
d_1 ⋯ d_n.
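As a cross-check on these formulas, Equation 8.4.14 can be computed directly by brute force. The following is a short illustrative Python fragment (the function name is our own, and this enumeration is practical only for small n):

```python
# A sketch of the multihomogeneous root count of Equation 8.4.14: sum, over
# the distinct permutations p of the list s(K), of the products
# d_{1,p_1} * ... * d_{n,p_n}.  Groups are indexed 0..m-1 here.
from itertools import permutations
from math import prod

def bez(D, K):
    """D: n x m degree matrix; K: group sizes (k_1, ..., k_m)."""
    n = len(D)
    sK = [j for j, k in enumerate(K) for _ in range(k)]   # k_j copies of j
    total = 0
    for p in set(permutations(sK)):                       # distinct permutations
        total += prod(D[i][p[i]] for i in range(n))
    return total

# The example of Equation 8.4.4 with groups {x}, {y}: D rows are the degrees
# of (xy - 1) and (x^2 - 1) in each group.
print(bez([[1, 1], [2, 0]], (1, 1)))   # 2 paths, as found above
print(bez([[2], [2]], (2,)))           # one-homogeneous case: total degree 4
print(bez([[1, 1]] * 3, (2, 1)))       # 3 bilinear equations: count is 3
```

The last call previews the eigenvalue example below: n bilinear equations in groups of sizes (n − 1, 1) give a count of n rather than the total degree 2^n.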
Now, let's justify the use of this start system by proving the following theorem.
Theorem 8.4.7 (Multihomogeneous Bezout Theorem) Let F : C^n × Q →
C^n and G : C^n × Q_g → C^n be the families of systems specified by Equation 8.4.12
and Equation 8.4.13, respectively. Then

    N(C^n, Q) = N(C^n, Q_g) = Bez(D, K),

where a formula for Bez(D, K) is given in Equation 8.4.14.

Proof. The proof is essentially the same as the proof of Theorem 8.4.4, except
we use multihomogenizations of F and G to compactify the solution domain. See

§ 3.6 for the definition of multiprojective spaces and multihomogeneous polynomial


systems compatible with them. The multihomogenizations of F and G are functions
on C fcl+1 x • • • x Ckm+1 compatible with the multiprojective space X = ¥kl x • • • x
pfem j n particular, the multihomogenization G of G, using the homogenization
substitutions Zj£ = z"j(/u>j, has an ith function of the form

A solution to 5 = 0 must have at least one factor in each equation equal to zero. For
a generic J E G , a choice of kj factors in the group of variables {WJ, 2ji,..., "z^ }
from kj different equations, determines a unique point in the corresponding ¥kj,
and a collective choice of one factor from each equation that has kj factors in each
group of variables for j — 1,..., m gives one nonsingular solution of 5 = 0 in X.
These are the only possible choices, since any other choice must have more than
kj factors in some group j and so has only the trivial solution {0, ...,0} ^ Fkj.
These are the same combinatorics that define Bez(D,K), so we have Af(X,Qg) =
Hez(D,K). Moreover, generically none of the roots are at infinity, and no other
solutions exist. Since the multiprojective space X is compact, by the same argument
used in Theorem 8.4.4, we have for the multihomogenized family F that Af(X, Q) =
Af(X, Qg). Since generically none of the roots is at infinity, and since the affine roots
of the original inhomogeneous systems F and G are in one-to-one correspondence
with those of their multihomogenizations, the affine root counts are the same as the
root counts on X. •

Remark 8.4.8 Although we have not stated it as a separate theorem, it is
clear from the proof that Bez(D, K) is also the generic nonsingular root count
for a multihomogeneous polynomial system with degree matrix D and group sizes
K = {k_1, ..., k_m} compatible with the multiprojective space X = P^{k_1} × ⋯ × P^{k_m}.

The final step is to connect our start system g with a target system f of the
same multidegree structure. Since the parameters of F(z; q) appear linearly, we
may use the homotopy given in Theorem 8.3.1. For the record, we state this as the
following theorem.
following theorem.
Theorem 8.4.9 (Multihomogeneous Homotopy) Given a system of polyno-
mials f(z) = {fi(z),..., fn(z)} '• C n —> Cn having a degree matrix D for variables
partitioned into subsets of sizes K, as above, let g(z) be any system of polynomials
of matching degrees such that g(z) = 0 has Bez(D,K) nonsingular solutions. Then,
the Bez(D, K) solution paths of the homotopy
h(z,t):=jtg(z) + (l-t)f(z)=O
starting at the solutions ofg(z) = 0 are nonsingular for t £ (0,1] and their endpoints
as t —* 0 include all of the nonsingular solutions of f(z) = 0 for almost all 7 €
C, excepting a finite number of real-one-dimensional rays through the origin. In
130 Numerical Solution of Systems of Polynomials Arising in Engineering and Science

particular, restricting γ to the unit circle, γ = e^{iθ}, the exceptions are a finite number
of points θ ∈ [0, 2π].

Proof. Because the family of all polynomial systems with the specified degrees is
linear with respect to the coefficients of its monomials, this follows directly from
Theorem 8.3.1 under the condition that g(z) = 0 has the generic nonsingular root
count. Theorem 8.4.7 establishes that Bez(D,K) is this count. •

Remark 8.4.10 A similar homotopy works on a multiprojective space X =
P^{k_1} × ⋯ × P^{k_m} for compatible multihomogeneous functions and start systems. This
is the preferred formulation when the target system might have solutions at infinity,
for the reasons cited in Chapter 3.

Example 8.4.11 (Matrix Eigenvalues) In the realm of numerical linear algebra,
efficient and robust methods already exist for solving matrix eigenvalue problems,
but for purposes of illustration, let's consider the problem of finding eigenvectors
and eigenvalues by multihomogeneous homotopy. Given two n × n matrices A and
B, the generalized eigenvalue problem is to find (v, λ) ∈ P^{n−1} × P^1 such that

f = (λ_1 A + λ_2 B) v = 0.
This becomes a conventional eigenvalue problem for A if we set λ_1 = 1 and B = −I.
The problem consists of n quadratic equations, thus the total degree is 2^n.
Partitioning the variables in the natural way as z_1 = v and z_2 = λ, we have
d_{i1} = d_{i2} = 1; that is, the equations are bilinear. The root count is the coefficient
of α_1^{n−1} α_2 in the polynomial (α_1 + α_2)^n, which is simply n. This agrees with the
well-known result from linear algebra.
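This coefficient extraction is easy to mechanize. The sketch below (our illustration, not code from the text) expands the product of the row forms d_{i1} α_1 + ⋯ + d_{im} α_m one equation at a time, as a polynomial in the α_j, and reads off the coefficient of α_1^{k_1} ⋯ α_m^{k_m}:

```python
def multihomogeneous_bezout(D, K):
    """Coefficient of alpha_1^K[0] * ... * alpha_m^K[m-1] in
    prod_i (D[i][0]*alpha_1 + ... + D[i][m-1]*alpha_m)."""
    # Represent polynomials in the alpha_j as {exponent tuple: coefficient}.
    poly = {(0,) * len(K): 1}
    for row in D:
        new = {}
        for expo, coef in poly.items():
            for j, d in enumerate(row):
                if d == 0:
                    continue
                e = list(expo)
                e[j] += 1
                if e[j] > K[j]:      # cannot contribute to the target term
                    continue
                t = tuple(e)
                new[t] = new.get(t, 0) + coef * d
        poly = new
    return poly.get(tuple(K), 0)

# Eigenvalue example: n bilinear equations, group sizes (n-1, 1).
n = 5
D = [[1, 1]] * n
print(multihomogeneous_bezout(D, (n - 1, 1)))   # -> 5
```

With a single variable group, the same routine recovers the total degree, e.g., `multihomogeneous_bezout([[2], [3]], (2,))` gives 6.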
A suitable start system has

g_i(v, λ) := (a_i^T v)(b_i^T λ) = 0,  i = 1, ..., n,

where a_i ∈ C^n and b_i ∈ C^2 are chosen randomly. For k = 1, ..., n, we choose the
second factor in the k-th equation to solve for λ and solve the linear system formed
by the first factors from the remaining (n − 1) equations to get v. This gives n start
points.
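In outline, this construction is mechanical: for each k, a 2-vector orthogonal to b_k kills the second factor of the k-th equation, and a null vector of the remaining first factors kills the others. A sketch in pure Python (ours; the small Gauss-Jordan helper stands in for a linear-algebra library, and the random a_i, b_i are merely illustrative):

```python
import random

def null_vector(A, n):
    """A nonzero solution v of the (n-1) x n homogeneous system A v = 0,
    found by Gauss-Jordan elimination with partial pivoting."""
    A = [row[:] for row in A]
    m = len(A)
    piv = []                                  # (row, column) of each pivot
    r = 0
    for c in range(n):
        if r >= m:
            break
        p = max(range(r, m), key=lambda i: abs(A[i][c]))
        if abs(A[p][c]) < 1e-12:
            continue                          # (near-)free column
        A[r], A[p] = A[p], A[r]
        for i in range(m):
            if i != r:
                f = A[i][c] / A[r][c]
                A[i] = [u - f * w for u, w in zip(A[i], A[r])]
        piv.append((r, c))
        r += 1
    free = [c for c in range(n) if c not in {pc for _, pc in piv}]
    v = [0j] * n
    v[free[0]] = 1.0 + 0j                     # set one free variable; back-solve pivots
    for pr, pc in piv:
        v[pc] = -sum(A[pr][c] * v[c] for c in free) / A[pr][pc]
    return v

random.seed(1)
n = 4
crand = lambda: complex(random.gauss(0, 1), random.gauss(0, 1))
a = [[crand() for _ in range(n)] for _ in range(n)]   # a_i in C^n
b = [[crand(), crand()] for _ in range(n)]            # b_i in C^2

starts = []
for k in range(n):
    lam = [b[k][1], -b[k][0]]                 # b_k . lam = 0
    v = null_vector([a[i] for i in range(n) if i != k], n)
    starts.append((v, lam))

# Every start point satisfies all n equations g_i = (a_i . v)(b_i . lam) = 0.
for v, lam in starts:
    g = [sum(u * w for u, w in zip(a[i], v)) * (b[i][0] * lam[0] + b[i][1] * lam[1])
         for i in range(n)]
    assert max(abs(gi) for gi in g) < 1e-8
```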
Notice that the equations are all two-homogenized from the outset. To treat
these numerically, we may dehomogenize by appending a random, inhomogeneous
linear equation for v and one for λ. This amounts to choosing a random patch
C^{n−1} × C^1 on P^{n−1} × P^1.

8.4.3 Linear Product Homotopies


Multihomogeneous homotopies are linear product homotopies that respect a given
partitioning of the variables. They are ideal for problems that have a natural
Polynomial Structures 131

partitioning, such as the eigenvector-eigenvalue problem, but some problems benefit
from a less restrictive partitioning, introduced in (Verschelde & Cools, 1993).
We call a linear set any subset of {1, z_1, ..., z_n}. A linear product structure
is specified by a list of linear sets for each equation. Assume the variables are
z = {z_1, ..., z_n}. Let m_i be the number of linear sets for equation i, and let them
be denoted s_{ij} ⊂ {1, z_1, ..., z_n}. Then, a linear product family is given by

f_i ∈ ⟨s_{i1} ⊗ ⋯ ⊗ s_{i m_i}⟩,  (8.4.16)

with, as usual, the parameters being the coefficients of the vector space.
For such a family, a sufficient family of start systems G has for the i-th equation

g_i(z) ∈ ⟨s_{i1}⟩ × ⋯ × ⟨s_{i m_i}⟩.  (8.4.17)

As discussed in the previous section on multihomogeneous systems, we may consider
the family of G as a subfamily of F, having an irreducible parameter space Q_g ⊂ Q,
where Q is the parameter space of F. The sufficiency of G as a start system for
F just means that it has the proper root count, which is stated formally as the
following theorem.

Theorem 8.4.12 (Linear-Product Root Count) Let F and G be the families
of systems specified by Equation 8.4.16 and Equation 8.4.17, respectively, and let Q
and Q_g ⊂ Q be their parameter spaces. Then, for a Zariski open set U ⊂ C^n,

N(U, Q) = N(U, Q_g).

This is an easy consequence of the general product decomposition theorem,
Theorem 8.4.14, below, so we postpone proof to that point.
The combinatorics of finding all nonsingular roots to g(z) = 0 is slightly more
complicated than in the multihomogeneous case, because the variable groupings
are not necessarily the same across all the factors. However, it is just a matter of
determining, for each collective choice of one factor from each of the polynomials in
g(z), whether the resulting linear system is compatible. We return to this below,
but first, let us state the corollary that justifies using a linear-product homotopy.

Corollary 8.4.13 For any f(z) in the family defined by Equation 8.4.16 and a
generic g(z) from the family defined by Equation 8.4.17, the solution paths of

h(z, t) := γ t g(z) + (1 − t) f(z) = 0

starting at the nonsingular roots of g(z) = 0 are nonsingular for t ∈ (0,1] and their
endpoints at t = 0 include all the nonsingular roots of f(z) = 0, for all γ ∈ C
excepting a finite number of one-real-dimensional rays through the origin.

Proof. This is the usual application of Theorem 8.3.1 in light of Theorem 8.4.12. •


The theorem and its corollary are quite simple to apply. Consider a system with
generic coefficients of the form

f(x,y) = ( a_1 x² + a_2 xy + a_3 y² + a_4 x + a_5 y , b_1 x²y + b_2 xy² + b_3 y³ + b_4 x² + b_5 xy + b_6 y² ) = 0.  (8.4.18)

We see that f_1 ∈ ⟨{x,y} ⊗ {1,x,y}⟩ and f_2 ∈ ⟨{x,y}² ⊗ {1,y}⟩, so we pick g_1 ∈
⟨x,y⟩ × ⟨1,x,y⟩ and g_2 ∈ ⟨x,y⟩² × ⟨1,y⟩. A particular example is

g(x,y) = ( (x + y)(1 + x + y) , (x − y)(x + 2y)(1 + y) ) = 0.  (8.4.19)

Although the total degree of g is 6, it has only 4 nonsingular roots, since (0, 0) is a
double root. Although we chose very simple coefficients, it is easy to see that this
is true for generic coefficients. Hence f has at most 4 nonsingular roots on C². We
give a more substantial example in the case studies below (see § 9.3).
It is easy to build a computer program that takes advantage of linear-product
homotopies, if we rely on the user to identify the product structure. Then, the
program forms a start system consisting of linear factors with coefficients picked by
a random number generator. This gives a system that is generic with probability
one. The program cycles through the various combinations of choosing one factor
from each equation and, if the resulting linear subsystem is full rank, its solution
is determined. This potential start point is a solution of the start system, but it
is a true start point of the homotopy only if it is nonsingular and it is in the set
U. We can check for singularity by numerically evaluating the condition number
of the Jacobian matrix of partial derivatives at the point. Assume U is defined
explicitly as the complement of the solution set of a given polynomial system, say
U = C^n \ s^{−1}(0), where s : C^n → C^m are "side conditions" as in § 7.5. We need
only evaluate s at the potential start point to determine membership in U.
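A minimal sketch of such a program for the structure g_1 ∈ ⟨x,y⟩ × ⟨1,x,y⟩, g_2 ∈ ⟨x,y⟩² × ⟨1,y⟩ of the example above (ours, not HOMLAB code; the factor coefficients are hard-coded simple illustrations, where a real implementation would draw them at random):

```python
from itertools import product

# Each linear factor a + b*x + c*y is stored as (a, b, c).
G = [
    [(0, 1, 1), (1, 1, 1)],                 # g1 = (x + y)(1 + x + y)
    [(0, 1, -1), (0, 1, 2), (1, 0, 1)],     # g2 = (x - y)(x + 2y)(1 + y)
]

def solve_pair(f1, f2):
    """Solve the 2 x 2 linear system {f1 = 0, f2 = 0}; None if singular."""
    a1, b1, c1 = f1
    a2, b2, c2 = f2
    det = b1 * c2 - b2 * c1
    if abs(det) < 1e-12:
        return None
    return ((-a1 * c2 + a2 * c1) / det, (-b1 * a2 + b2 * a1) / det)

def n_vanishing(factors, pt):
    """How many of the given factors vanish at the point pt."""
    x, y = pt
    return sum(abs(a + b * x + c * y) < 1e-9 for a, b, c in factors)

start_points = []
for f1, f2 in product(G[0], G[1]):
    pt = solve_pair(f1, f2)
    if pt is None:
        continue
    # A true (nonsingular) start point vanishes on exactly one factor of each g_i.
    if n_vanishing(G[0], pt) == 1 and n_vanishing(G[1], pt) == 1:
        start_points.append(pt)

print(len(start_points))   # -> 4; the double root (0, 0) is discarded
```

In place of the exact-one-factor test, production code would check the condition number of the Jacobian of g at each candidate point, as described above.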
It is worth considering how a potential start point may fail to be a true start
point. It is singular if and only if it is a solution of two of the linear subsystems
formed by choosing one factor from each equation. Because the coefficients of
the linear factors are chosen randomly, this can only happen when a subset of the
variables is zero due to the lack of an inhomogeneous term in a corresponding subset
of factors. This is what happens, for example, in Equation 8.4.19, where (0,0)
appears twice. In the absence of this phenomenon, there is a zero probability that
the solution satisfies more than one linear subsystem, and it is therefore nonsingular.
Similarly, the only way a potential start point can fail to be in U is if it has a pattern
of zero entries that makes it satisfy s independent of the values of the nonzero entries.
In particular, one may choose to work on (C*)^n, in which case any solutions to g that
have one or more variables equal to zero can be cast aside. To understand these
statements better, see the proof of Theorem 8.4.14 below. For an inhomogeneous
linear factor, the base locus of ⟨1, z_1, ..., z_k⟩ is empty, while for a homogeneous linear
factor, the base locus of ⟨z_1, ..., z_k⟩ is the linear subspace z_1 = z_2 = ⋯ = z_k = 0.
Clearly, linear product structures include multihomogeneous structures which
include total degree structures. In HOMLAB, the Matlab code distributed for use
with this text (see Appendix C), the general-purpose code uses linear products.
The drivers for multihomogeneous and total degree homotopies construct equivalent
linear-product structures and then proceed as in the general linear-product case.

8.4.4 Monomial Product Homotopies


Next up the hierarchy of Figure 8.1 are monomial product structures. Truth be
told, these are not usually used directly, but we introduce them as a conceptual
bridge to the next level of polynomial products and polytope structures. All we
note here is that the entire theory of linear-product structures carries over to the
more general case where the sets s_{ij} are collections of monomials. In the case of
linear products, we restricted these monomials to just {1, z_1, ..., z_n}.
Let's consider a simple example to fix ideas. Suppose we have two equations
involving only the monomials {x, y, x²y, xy²}, that is,

f_1, f_2 ∈ ⟨x, y, x²y, xy²⟩.

These are cubics, so the total degree is 9. The two-homogeneous Bezout number is
the coefficient of αβ in (2α + 2β)², which is 8. The best linear-product structure
that contains the given monomials is ⟨{x,y} ⊗ {1,x} ⊗ {1,y}⟩. If we work on (C*)²,
this structure has 6 roots. But the equations obey the following monomial product
structure

f_1, f_2 ∈ ⟨{1, xy} ⊗ {x, y}⟩.

This structure gives the same root count as the factored system

g_1, g_2 ∈ ⟨1, xy⟩ × ⟨x, y⟩.

One sees that two generic factors from ⟨1, xy⟩ have no finite roots and two generic
factors from ⟨x, y⟩ have only the origin in common, so working on (C*)², we have

N({f_1, f_2}, (C*)²) = N({g_1, g_2}, (C*)²) = 4.
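For this small structure the start points can even be written down in closed form: a root in (C*)² must pair the ⟨1, xy⟩ factor of one equation with the ⟨x, y⟩ factor of the other, and each such pairing contributes two roots. A quick numerical check (our sketch; the coefficient arrays A, B, C are random illustrations):

```python
import cmath
import random

random.seed(2)
crand = lambda: complex(random.gauss(0, 1), random.gauss(0, 1))

# g_i(x, y) = (1 + A[i]*x*y) * (B[i]*x + C[i]*y),  i = 0, 1
A = [crand(), crand()]
B = [crand(), crand()]
C = [crand(), crand()]

def g(i, x, y):
    return (1 + A[i] * x * y) * (B[i] * x + C[i] * y)

roots = []
# A common root in (C*)^2 must use the (1, xy)-factor of one equation
# and the (x, y)-factor of the other; other pairings have no such roots.
for i, j in [(0, 1), (1, 0)]:
    # 1 + A[i]*x*y = 0 and B[j]*x + C[j]*y = 0  =>  x^2 = C[j]/(A[i]*B[j])
    for s in (1, -1):
        x = s * cmath.sqrt(C[j] / (A[i] * B[j]))
        y = -B[j] * x / C[j]
        roots.append((x, y))

assert len(roots) == 4
for x, y in roots:
    assert abs(x) > 1e-8 and abs(y) > 1e-8        # roots lie in (C*)^2
    assert abs(g(0, x, y)) < 1e-9 and abs(g(1, x, y)) < 1e-9
```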

The drawback of monomial products, in contrast to linear products, is that it
is no longer easy to solve the start system. Fortunately, as covered in the next
section, that problem has been solved in a quite general way via the use of convex
polytopes. Another advantage of the advanced methods is that they do not require
the analyst to find decompositions by hand; it is all automatic. In fact, although
convex polytopes can be used to justify the theory of monomial product structures,
it is more powerful, applying also to monomial vector spaces that do not reduce to
products. If our simple example above is modified to

f_1, f_2 ∈ ⟨x, y, xy, x²y, xy²⟩ ⊃ ⟨{1, xy} ⊗ {x, y}⟩,



the monomial product theory does not apply, but the convex polytope approach
gives the same root count of 4, because xy is inside the "convex hull" of the other
monomials. Still, despite its limitations, it may occasionally be useful to analyze a
small system by monomial products. It also serves as a stepping stone to our final
product structure: polynomial products.

8.4.5 Polynomial Product Homotopies


As throughout this chapter, let's consider a family of polynomial systems F(z; q) :
C^n × Q → C^n. More specifically, let F = {f_1, ..., f_n}, where each polynomial
f_i : C^n × Q_{V_i} → C has as its parameter space the coefficients of a vector space V_i
defined by a polynomial product structure as follows. Each V_i is specified by m_i sets
of polynomials s_{ij}, j = 1, ..., m_i, which, letting k_{ij} be the number of polynomials
in the set s_{ij}, can be denoted as s_{ij} = {p_{ij1}, ..., p_{ij k_{ij}}}. All the polynomials p_{ijk}
are given. The vector space V_i is constructed from these as the polynomial product

V_i := ⟨s_{i1} ⊗ ⋯ ⊗ s_{i m_i}⟩,  i = 1, ..., n.  (8.4.20)

The basis elements of V_i are all the polynomials obtained by choosing one element
from each s_{ij}, j = 1, ..., m_i, and multiplying them together. If two or more of these
choices give an identical element, the duplicates can be dropped, but in any case,
there are at most ∏_{j=1}^{m_i} k_{ij} basis elements for V_i. The parameter space for V_i, which
we call Q_{V_i}, is the set of coefficients multiplying these elements. The parameter
space for the family of systems F is just Q_{V_1} × ⋯ × Q_{V_n}.
Alternatively said, if a polynomial w_i can be written in the form

w_i = Σ_{l=1}^{r_i} ∏_{j=1}^{m_i} w_{ilj},  (8.4.21)

where w_{ilj} ∈ ⟨s_{ij}⟩, then w_i ∈ V_i. A particular system in the family F consists of
an n-tuple of polynomials {w_1, ..., w_n}, w_i ∈ V_i.
Now, consider a special member of F wherein each w_i is formed from a single
product, that is, r_i = 1 in Equation 8.4.21. We will argue that a generic system
of this type is sufficient as a start system for a homotopy to find all nonsingular
solutions of any system in F. Accordingly, we will choose a generic start system
g(z) = {g_1(z), ..., g_n(z)} with

g_i(z) = ∏_{j=1}^{m_i} g_{ij}(z),  g_{ij} ∈ ⟨s_{ij}⟩,  (8.4.22)

or what is equivalent,

g_i(z) ∈ ⟨s_{i1}⟩ × ⋯ × ⟨s_{i m_i}⟩.  (8.4.23)

Each vector space ⟨s_{ij}⟩ is parameterized by its k_{ij} coefficients, that is, by C^{k_{ij}},
so the entire family of start systems
G of the form of Equation 8.4.22 has a parameterization as the cross product of all
of these Euclidean spaces, which is therefore just a big Euclidean space. But since
every g(z) ∈ G is also in F, we can cast G as a subfamily of F having parameter
space Q_g ⊂ Q. Clearly, Q_g is connected, because it is the image of a Euclidean
space, where the map is defined by expanding the product and collecting terms.
Accordingly, G(z; q) is just F(z; q) restricted to C^n × Q_g, where Q_g is the set of
systems in F that factor as Equation 8.4.22.
The sufficiency of g(z) ∈ G as a start system for any f(z) ∈ F is established by
the following theorem.

Theorem 8.4.14 (Polynomial-Product Root Count) Let F and G be the
families of systems specified by Equation 8.4.20 and Equation 8.4.23, respectively,
having parameter spaces Q and Q_g ⊂ Q, as described above. Then, for any U that
is a Zariski open subset of C^n,

N(U, Q) = N(U, Q_g).

In other words, the number of nonsingular roots in U for a generic start system,
one that factors in the specified way, is the same as the generic nonsingular root
count of the whole family. Such a start system g(z) is much easier to solve than
a general system in the family, because g_i(z) = 0 implies that at least one of
g_{ij}(z) = 0, j = 1, ..., m_i.
Our earlier proofs of Theorems 8.4.4 and 8.4.7 hinged on showing that the start
system had no singular solutions and no solutions at infinity. The question of
excluding roots that satisfy some side conditions, that is, the limiting of the root
count to some Zariski open subset U, did not arise, because those start systems
will not generically have roots on any given quasiprojective set. In the case of
polynomial-product structures, a generic system g ∈ G may have singular solutions,
solutions at infinity (in some multihomogenization of C^n), or solutions on some
quasiprojective set. The inclusion of U in the theorem strengthens the result (as
compared to using just C^n in its place), because it will allow us to drop solutions
that generically lie on some quasiprojective set that we wish to ignore. So while
these possibilities give the formulation extra power to eliminate solution paths in
the homotopy, we must pay for them with a more difficult proof. In particular, we
must argue in more detail that in a continuation from a generic member of F to
a generic member of G, none of the nonsingular, finite solution paths end at such
degeneracies. The proof is a little long by the standards of this chapter, but we
attempt to keep the arguments elementary. This sacrifices some rigor and elegance,
but hopefully it grants the reader an easier grasp of the essential facts.
In the linear-product example of Equation 8.4.18 with start systems like Equation 8.4.19,
we already saw an example of a singular solution to the start system
which also happens to lie on the affine algebraic set x = 0. These conditions persist
generically for the entire family of start systems for that example.
We pause here a moment to emphasize that the theorem can be readily applied
without understanding its proof. In fact, we will give only a sketch of a proof here,
as a rigorous one requires the language of line bundles and sheaves. The reader
who is versed in these technicalities may wish to consult (Morgan, Sommese, &
Wampler, 1995) for a better proof. The proof sketch below may be useful as a
guide to understanding the rigorous proof. On that note, some readers may wish
to skip to the end of the proof now.

Proof. (sketch) We consider the one-homogenizations of F and G with solutions
that live on P^n, but to keep notation simpler, let us retain the same names.
After homogenization, the variables z are replaced by homogeneous coordinates
x = [x_0, x_1, ..., x_n] ∈ P^n and the basis elements of the sets s_{ij} are replaced by
their homogenizations. We count the nonsingular solutions on a Zariski open subset
U ⊂ P^n. This includes the special case of counting finite solutions, since C^n = P^n \ A,
where A = {x ∈ P^n | x_0 = 0}. The finite solutions of the homogenized systems, i.e.,
the solutions with x_0 ≠ 0, are in one-to-one correspondence with the solutions of
the original systems via the mappings [x_0, x_1, ..., x_n] ↦ (x_1/x_0, ..., x_n/x_0) and
(z_1, ..., z_n) ↦ [1, z_1, ..., z_n], so counting the finite solutions of the homogenized
systems is the same as counting the solutions to the original systems.
Let H_k = {g_1, ..., g_k, f_{k+1}, ..., f_n} be the system obtained by replacing the first
k functions in F by the corresponding functions in G. Accordingly, H_0 = F, H_n = G,
and we have a corresponding sequence of parameter spaces Q = Q_0 ⊃ Q_1 ⊃ ⋯ ⊃
Q_n = Q_g. Suppose we can show that N(U, Q_1) = N(U, Q_0). Then since the order
of the functions doesn't affect the solution set, when stepping from H_k to H_{k+1} by
replacing f_{k+1} in H_k by g_{k+1}, we may reorder to place f_{k+1} as the first function
in the set and conclude that N(U, Q_{k+1}) = N(U, Q_k). Chaining these equalities
together, we get N(U, Q) = N(U, Q_g), thus establishing the result we seek. Thus,
the proof of the theorem hangs only on the lemma N(U, Q_1) = N(U, Q_0).
For the lemma, we fix {f_2, ..., f_n}, and consider what happens for generic f_1 and
g_1. Abusing notation, we will still call the parameter spaces Q and Q_1, respectively,
from here on.
To prove the lemma, we begin by considering that g_1 is the product of m_1 factors,
say g_1 = g_{11} g_{12} ⋯ g_{1 m_1}, where g_{1j} ∈ ⟨s_{1j}⟩, j = 1, ..., m_1. For each factor, there is
a generic nonsingular root count d_j in U for the system {⟨s_{1j}⟩, f_2, ..., f_n}. By the
elementary rules of differentiation of a product, it is easily seen that if a point x*
is a zero of more than one factor in the product, then all the first derivatives of g_1
at x* are zero. On the other hand, if x* is a nonsingular root of one and only one
of the systems {g_{1j}, f_2, ..., f_n} = 0, j = 1, ..., m_1, then it is a nonsingular solution
of {g_1, f_2, ..., f_n} = 0. Consequently, N(U, Q_1) is the sum d_1 + ⋯ + d_{m_1} minus
the number of roots that are generically at the intersection of two or more of the
factors ⟨s_{1j}⟩. We will be more precise in a moment.
Consider W = {f_2, ..., f_n}^{−1}(0), the solution set, with multiplicities, of the
last n − 1 equations in F. This set can be decomposed into its irreducible
components, which may be of any dimension from n down to 1. The intersection of a


k-dimensional component with a hypersurface produces components of dimension
k or k − 1, and the multiplicity of the intersection is at least as great as that of the
component. Accordingly, to count the nonsingular solutions of F = H_0 or H_1, we
only need to retain from W the irreducible components having both dimension 1
and multiplicity 1. Call this collection of components the curve K. The root count
for H_0 concerns the intersection K ∩ f_1^{−1}(0), whereas the count for H_1 concerns
K ∩ g_1^{−1}(0) = ∪_{j=1}^{m_1} (K ∩ g_{1j}^{−1}(0)).
In a continuation path through the parameter space for f_1 as we approach
g_1, we must consider whether nonsingular roots might become singular so that
N(U, Q) > N(U, Q_1). Recall that the base locus Bs(V) of a vector space
V = ⟨e_1, ..., e_m⟩ is the set of common zeros of all the basis elements of the space:
Bs(V) = {e_1, ..., e_m}^{−1}(0). The key observation is that generically the singular
intersections with g_1^{−1}(0) can only occur where K meets the base loci of ⟨s_{1j}⟩. Any
other singular intersections disappear under generic perturbations of the parameters
of g_1. The completion of the proof depends on technical arguments about these
base loci. Basically, since f_1 is a sum of polynomials each of the same form as g_1,
the base loci are preserved under the sum, and moreover, so is their effect on the
root count. We leave the details to (Morgan et al., 1995). •

There is one phenomenon mentioned in the proof sketch that is relevant to
practical implementation of polynomial product homotopies. This is that a point
that is a nonsingular solution to one subset of factors (i.e., to a choice of one factor
from each g_i) might be a singular solution of the whole system g = 0. Such points
must be dropped from the list of start points.
Except in the special case of linear products, treated in § 8.4.3, polynomial
product structures require special methods to solve the start system. After breaking
the start system into its various subsystems, one could apply a simpler structure,
say multihomogeneous, to each subsystem. However, this is the same amount of
work as solving the entire start system, and therefore the target system, with the
simpler embedding. We only come out ahead if we use something in the structure of
the subsystems to solve them more efficiently. A common occurrence is that many
of the starting subsystems have the same structure. After one such subsystem
is solved, it can be used as a start system for the other similar subsystems in a
parameter continuation.
A second major inhibitor to the use of polynomial products is that there is
no automatic way to identify a useful breakdown of a given system into a product.
Usually, the product is suggested by the method of derivation of the equations. The
dual difficulties of finding a useful breakdown into a product and solving the start
system mean that polynomial products are only appealing for very large problems
where the potential payoff is worth the analyst's time. Otherwise, one may as well
employ a more automated method and let the computer do the work.

This completes our tour of the product structures in Figure 8.1. In the next
section, we consider a different generalization of monomial products, using monomial
polytopes, which respect product structures but also take advantage of monomial
sparsity that is not captured by any breakdown into products.

8.5 Polytope Structures

A natural way to specify a family of polynomial systems is just to list the mono-
mials that may appear in each polynomial. The family is parameterized by the
coefficients. By the general coefficient-parameter theory, it is clear that there is a
root count associated to such a family. A remarkable theorem, repeated below, due
to Bernstein (Bernstein, 1975) tells how the root count depends on the pattern of
the monomials. Since the family is linear in its coefficients, we can use the homo-
topy of Theorem 8.3.1 to solve problems in the family, if only we can solve a start
system having the generic number of roots. Several methods for formulating and
solving such systems have been invented. We describe here only the basics, so that
the reader can appreciate the methodology, but due to the highly technical nature
of efficient combinatorial formulations, we defer to references for the details. After
reading this section, one might next wish to consult the review article (Li, 2003).
Before we can state Bernstein's theorem, we need a few definitions.

8.5.1 Newton Polytopes and Mixed Volume


Let C* = C \ {0}, the complex numbers excepting the origin. A Laurent polynomial
f_i : (C*)^n × C^{m_i} → C is given in multidegree notation as

f_i(x; c_i) = Σ_{a ∈ S_i} c_{i,a} x^a,

where S_i ⊂ Z^n is the set of exponent vectors appearing in the monomials,
#(S_i) = m_i is the number of monomials, and c_{i,a} ∈ C is the coefficient for the
Laurent monomial x^a. The qualifier "Laurent" acknowledges that we allow negative
exponents, which are disallowed in our usual definition of polynomials. The
set S_i is called the "support" of f_i, and its convex hull Q_i = conv(S_i) in R^n is its
"Newton polytope."² The polynomial family

f(x; c) = f(x; c_1, ..., c_n) = {f_1(x; c_1), ..., f_n(x; c_n)}

is parameterized by m_1 + m_2 + ⋯ + m_n coefficients for the support S = {S_1, ..., S_n}.
When working on (C*)^n, multiplication of any equation by a monomial does
not change the root count, as the zero set of x^a p(x) = 0 is just the union of the
zero set of p(x) = 0 and the zero set of x^a = 0, the latter having no points that
are in (C*)^n. So given a Laurent polynomial, we can always multiply through by
some monomial with large enough exponent to clear any negative exponents that
appear. Said another way, we can translate the support into the nonnegative orthant
without changing the zero set on (C*)^n. Thus, it is clear that the parameter theory
of Chapter 7 for polynomials with nonnegative exponents applies also to Laurent
polynomials.

² A convex polytope is a bounded region of n-dimensional real space enclosed by hyperplanes.
"Polytope" is to n dimensions as "polyhedron" is to three dimensions.
There are several operations on convex polytopes that are of interest to us. One
is the Minkowski sum of two polytopes:

Q_1 + Q_2 = {q_1 + q_2 | q_1 ∈ Q_1, q_2 ∈ Q_2}.


Second, defining the n-dimensional volume, denoted Vol_n, of a unit hypercube to be
1, we may speak of the n-volume Vol_n(Q) for any polytope Q ⊂ R^n. In fact, the
volume of the simplex having vertices v_0, v_1, ..., v_n is

Vol_n(conv(v_0, ..., v_n)) = (1/n!) |det[v_1 − v_0, ..., v_n − v_0]|.

From these definitions, it can be shown that Vol_n(λ_1 Q_1 + λ_2 Q_2 + ⋯ + λ_n Q_n), where
0 ≤ λ_i ∈ R, is a homogeneous polynomial of degree n in the scalars λ_i.
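The simplex volume formula above can be checked with exact rational arithmetic (a sketch of ours, pure Python):

```python
from fractions import Fraction
from math import factorial

def simplex_volume(verts):
    """Vol_n of the simplex conv(v_0, ..., v_n) in R^n:
    |det[v_1 - v_0, ..., v_n - v_0]| / n!, computed exactly with fractions."""
    n = len(verts) - 1
    M = [[Fraction(verts[i + 1][j] - verts[0][j]) for j in range(n)]
         for i in range(n)]
    # Gaussian elimination to evaluate the determinant exactly.
    det = Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)               # degenerate simplex
        if p != c:
            M[c], M[p] = M[p], M[c]
            det = -det
        det *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return abs(det) / factorial(n)

print(simplex_volume([(0, 0), (1, 0), (0, 1)]))                       # -> 1/2
print(simplex_volume([(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]))   # -> 1/6
```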

Definition 8.5.1 (Mixed Volume) The mixed volume of convex polytopes
Q_1, ..., Q_n is defined as

M(Q_1, ..., Q_n) = coeff(λ_1 ⋯ λ_n, Vol_n(λ_1 Q_1 + λ_2 Q_2 + ⋯ + λ_n Q_n)).

We say that the mixed volume, M(S_1, ..., S_n), of supports S_1, ..., S_n is the mixed
volume of their convex hulls.

8.5.2 Bernstein's Theorem


We have argued above that a (Laurent) polynomial family has a well-defined root
count on (C*)^n. The following theorem tells us how to determine it from the
geometry of the supports.
Theorem 8.5.2 (Bernstein, 1975) The root count on (C*)^n of a Laurent polynomial
family specified by supports S_1, ..., S_n and parameterized by the coefficients
of the corresponding monomials is the mixed volume M(S_1, ..., S_n).

This result is variously called the "Bernstein count," the "BKK bound" (a term
coined in (Canny & Rojas, 1991) in recognition of the contributions of (Kushnirenko,
1976) and (Khovanski, 1978)), the "polyhedral root count," or the "polytope root
count." We adopt the last convention as the most descriptive and precise.
If all the exponents are nonnegative, that is, if the system is polynomial in the
usual sense, there is a well-defined root count on C^n, which may be higher than the
polytope root count on (C*)^n. The count in C^n can be determined by the procedure
in (Li & Wang, 1996) with further refinements in (Huber & Sturmfels, 1997).

8.5.3 Computing Mixed Volumes


The computation of the mixed volume is a combinatorial problem. As mentioned at
the outset, efficient methods for this computation are highly technical and we will
not delve into them here. Instead, we will describe a very basic approach that is of
practical use only for two variables, three variables at the most. This will be enough
to show the nature of the beast. With this level of understanding, the reader can
knowledgeably use software provided by experts, but further study of the references
will be necessary to understand the internal workings of such software.
Let's begin with the direct application of Definition 8.5.1 for two polynomials
in two variables. We know that Vol_2(λ_1 Q_1 + λ_2 Q_2) is a homogeneous quadratic in
λ_1, λ_2, that is, it is of the form

p(λ_1, λ_2) = c_{20} λ_1² + c_{11} λ_1 λ_2 + c_{02} λ_2².

The mixed volume is the coefficient of λ_1 λ_2, which is c_{11}. But notice that

c_{11} = p(1, 1) − p(1, 0) − p(0, 1),

or in other words,

M(Q_1, Q_2) = Vol_2(Q_1 + Q_2) − Vol_2(Q_1) − Vol_2(Q_2).  (8.5.24)


Since Vol_2(Q) is just the area of polytope Q, it is easy to see how to apply this
using familiar area calculations. Following exactly the same line of reasoning, one
may see that

M(Q_1, Q_2, Q_3) = Vol_3(Q_1 + Q_2 + Q_3)
  − Vol_3(Q_1 + Q_2) − Vol_3(Q_2 + Q_3) − Vol_3(Q_3 + Q_1)
  + Vol_3(Q_1) + Vol_3(Q_2) + Vol_3(Q_3),  (8.5.25)

and generally,

M(Q_1, ..., Q_n) = Σ_{i=1}^{n} (−1)^{n−i} Σ_{J ∈ C_i^n} Vol_n( Σ_{j ∈ J} Q_j ),  (8.5.26)

where the inner sum is a Minkowski sum of polytopes and C_i^n are the combinations
of n things taken i at a time.
It is instructive to see how the mixed volume relates to Bezout's theorem. Suppose
f_1(x,y) and f_2(x,y) are general polynomials of degree d_1 and d_2, respectively.
This implies that their support polytopes Q_1 and Q_2 are isosceles right triangles of
size d_1 and d_2, shown in Figure 8.2, and the Minkowski sum Q_1 + Q_2 is another
such triangle of size d_1 + d_2. Accordingly, by Equation 8.5.24, the root count is

M(Q_1, Q_2) = (1/2)(d_1 + d_2)² − (1/2)d_1² − (1/2)d_2² = d_1 d_2,

which is, of course, the same result as given by Bezout's Theorem. The subtraction
of areas is shown graphically in the drawing of Q_1 + Q_2. Alternatively, we
can visualize the definition of the mixed volume directly by drawing a picture of
λ_1 Q_1 + λ_2 Q_2, as shown at the right side of Figure 8.2. Only the area of the shaded
parallelogram scales as λ_1 λ_2, whereas the triangles scale as λ_1² and λ_2².

Fig. 8.2 Mixed volume for two polynomials of degree d_1 and d_2.

In a similar fashion, one may easily see that the mixed volume for two equations
having bidegrees (d_{1x}, d_{1y}) and (d_{2x}, d_{2y}) is d_{1x} d_{2y} + d_{1y} d_{2x}, in agreement with the
two-homogeneous Bezout count. Figure 8.3 shows this in a self-explanatory way.

Fig. 8.3 Mixed volume for polynomials with bidegrees (d_{1x}, d_{1y}) and (d_{2x}, d_{2y}).

Although the preceding examples only examine the two-variable case, the mixed
volume does in fact generalize the multihomogeneous Bezout count in any dimen-
sion. This relationship is pursued further in one of the exercises at the end of
the chapter.
Any linear product structure is exactly captured by the polytope root count,
as a linear product is just another way of saying that the monomials appear in a
certain pattern. There are, however, more general patterns that are captured by the
polytope formulation but not by any linear product formulation. Systems having
such patterns are said to be "sparse," because some of the monomials which could
appear in a total degree formulation are missing. Many of the problems that arise
in applications have such sparseness.

Consider, for example, the system

f_1(x, y) = 1 + a x + b x²y² = 0,  (8.5.27)
f_2(x, y) = 1 + c x + d y + e x y² = 0.  (8.5.28)

This system has a total degree root count of 4 · 3 = 12 and a two-homogeneous root
count of 2 · 2 + 2 · 1 = 6, but as illustrated in Figure 8.4, the mixed volume is only
four.

Fig. 8.4 Mixed volume for polynomials in Equations (8.5.27, 8.5.28).
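In two variables, Equation 8.5.24 makes such counts easy to check numerically: compute areas with a convex hull and the shoelace formula, and form the Minkowski sum as all pairwise sums of support points. A sketch of ours in pure Python, using Andrew's monotone chain for the hull:

```python
from itertools import product

def hull_area(points):
    """Area of conv(points) via Andrew's monotone chain plus the shoelace formula."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return 0.0
    def half(seq):
        h = []
        for p in seq:
            while len(h) >= 2 and \
                  (h[-1][0] - h[-2][0]) * (p[1] - h[-2][1]) - \
                  (h[-1][1] - h[-2][1]) * (p[0] - h[-2][0]) <= 0:
                h.pop()
            h.append(p)
        return h
    lower, upper = half(pts), half(reversed(pts))
    hull = lower[:-1] + upper[:-1]
    a = sum(x1 * y2 - x2 * y1
            for (x1, y1), (x2, y2) in zip(hull, hull[1:] + hull[:1]))
    return abs(a) / 2

def mixed_volume_2d(S1, S2):
    """M(S1, S2) = Area(Q1 + Q2) - Area(Q1) - Area(Q2), per Equation (8.5.24)."""
    mink = [(a1 + b1, a2 + b2) for (a1, a2), (b1, b2) in product(S1, S2)]
    return hull_area(mink) - hull_area(S1) - hull_area(S2)

# Dense polynomials of degrees 2 and 3: triangles, so M = d1 * d2 = 6.
T = lambda d: [(i, j) for i in range(d + 1) for j in range(d + 1 - i)]
print(mixed_volume_2d(T(2), T(3)))   # -> 6.0

# The sparse system (8.5.27)-(8.5.28): supports of f_1 and f_2.
S1 = [(0, 0), (1, 0), (2, 2)]
S2 = [(0, 0), (1, 0), (0, 1), (1, 2)]
print(mixed_volume_2d(S1, S2))       # -> 4.0
```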

These diagrams hint at the main idea that underlies efficient algorithms for
computing the mixed volume. In each of the drawings of Q_1 + Q_2, notice that the
gray cells, whose areas sum to the mixed volume, are parallelograms having one
edge in common with Q_1 and one edge in common with Q_2. These are known as
the "mixed cells" in a "mixed subdivision" of Q_1 + Q_2. Mixed subdivisions are not
unique, as we show in Figure 8.5. It is only required to find one.

Fig. 8.5 Alternative subdivisions for each example.

One approach to finding subdivisions for the mixed volume calculation is based
on "liftings." A lifting algorithm augments each polytope by adding an (n + 1)th
coordinate axis and assigning a value using a lifting function. That is, point a ∈ Q_i,
corresponding to monomial x^a in f_i(x), is lifted to (a, ω_i(a)), where the lifting
function, ω_i : Z^n → R, for the ith polytope assigns a lift value to each exponent
vector. If these assignments are chosen at random, the following procedure gives a
valid subdivision with probability one. Let Q_i′ be the (n + 1)-dimensional polytope
derived from Q_i using ω_i. Then, one forms the Minkowski sum Q_1′ + ⋯ + Q_n′ and
finds the lower convex hull. The projection of the cells of this lower hull onto
the original n coordinates gives a mixed subdivision, from which the mixed cells
Polynomial Structures 143

can be readily identified and their volumes computed. In fact, for efficiency, one
avoids forming the convex hull of the Minkowski sum and instead searches for the
mixed cells directly. See (Gao & Li, 2000, 2003; Li, 2003; Li & Li, 2001; Huber &
Sturmfels, 1995; Verschelde, Gatermann, & Cools, 1996). In (Huber & Sturmfels,
1995), it is also shown how to take advantage of several of the equations having the
same support.

8.5.4 Polyhedral Homotopies


The mixed volume root count by itself does not enable us to solve the system by
continuation. We need a start system that we can solve ab initio. This can be done
using information gleaned from the mixed volume calculation to identify monomial
combinations that contribute to the mixed volume. This was accomplished in (Ver-
schelde, Verlinden, & Cools, 1994), using a recursive formula for the mixed volume,
following in that way Bernstein's proof. The same objective was attained by using
mixed subdivisions in (Huber & Sturmfels, 1995). In fact, the homotopy defined
in (Huber & Sturmfels, 1995) can be used to establish an independent proof of
Bernstein's theorem. A good review of subsequent developments is (Li, 2003).
To form a homotopy, one usually chooses the lifting values not from the reals,
but from the small nonnegative integers. Such a choice is not necessarily sufficiently
generic, but this can be discovered by testing and correcting. In fact, we require the
subdivision induced by the lifting to be "fine mixed," a technical condition which
is best left for study in the references. In the end, one has a lifting function ω_i for
each equation. We select a generic member G(x) = {g_1(x), ..., g_n(x)} of the family
of polynomials by picking random complex coefficients c_{i,a} for monomials at the
vertices of the convex hull Q_i, to get

    g_i(x) = Σ_{a ∈ Q_i} c_{i,a} x^a,    i = 1, ..., n,

and form homotopy functions H(x, t) = {h_1(x, t), ..., h_n(x, t)} as

    h_i(x, t) = Σ_{a ∈ Q_i} c_{i,a} x^a t^{ω_i(a)},    i = 1, ..., n.    (8.5.29)

At t = 1, we have H(x, 1) = G(x). We solve G(x) = 0 by first solving H(x, 0) = 0
and then tracking solution paths from t = 0 to t = 1. Subsequently, we solve the
original, possibly nongeneric, target system F(x) = 0, using the homotopy

H(x, t) = tG(x) + (1 - t)F(x)


tracking paths from t = 1 to t = 0 starting at the solutions of G(x) = 0.
At first glance, H(x, 0) as defined in Equation 8.5.29 does not look so easy to
solve. However, if we consider the limit as t approaches zero, the solutions x(t) are
algebraic, having a number of branches each with its own Puiseux series (fractional
power series). Each branch corresponds to a mixed cell in the subdivision, and it
has a number of solutions equal to the volume of that cell. These solutions can be
found by elementary means. Altogether, the paths emanating from the mixed cells
give the full set of solutions to G(x) = 0, whose number totals to the mixed volume
of the system.
In principle the homotopy could go directly to the target system F(x) = 0,
using the coefficients and monomials of F in Equation 8.5.29 instead of those of
G. In practice it is advisable to use the two-stage procedure, solving G and then
progressing to F. This is because target systems are often not generic in the family
defined by their support (that is, the coefficients may satisfy a degeneracy condition)
and this may cause the standard algorithm for solving H(x, 0) to fail.

8.5.5 Example
Rather than delve any deeper into the technicalities, let us simply show the workings
on the example of Equations (8.5.27) and (8.5.28). A choice of lifting functions as
ω_1 = 0 and ω_2(a) = (1,1) · a yields the subdivision shown in Figure 8.4. To see
this, note that the Newton polytopes of the supports of the polynomials are

Q_1 = [(0,0), (1,0), (2,2)],    Q_2 = [(0,0), (1,0), (0,1), (1,2)],

which are convex already. These lift to

Q_1′ = [(0,0,0), (1,0,0), (2,2,0)],    Q_2′ = [(0,0,0), (1,0,1), (0,1,1), (1,2,3)].

The lower hull of Q_1′ + Q_2′ has the faces shown in the figure with vertices

[(0,0,0), (1,0,0), (2,2,0), (2,0,1), (0,1,1), (3,2,1), (2,3,1), (3,4,3)].
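A quick check (plain Python, written for this illustration; the lifted vertices are those just listed) confirms which vertices of each lifted polytope attain the minimal inner product against each of the two inner normals of the worked example, and hence which monomials generate each mixed cell:

```python
# For a lifted Newton polytope, find which exponent vectors attain the
# minimal inner product with a candidate inner normal (g1, g2, 1).
# Illustrative sketch; the lifted vertices are copied from the example.

Q1p = [(0, 0, 0), (1, 0, 0), (2, 2, 0)]             # lift of 1, x, x^2y^2 (w1 = 0)
Q2p = [(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 2, 3)]  # lift of 1, x, y, xy^2

def support(poly, normal):
    """Vertices of poly on which the inner product with normal is minimal."""
    vals = [sum(p * n for p, n in zip(v, normal)) for v in poly]
    m = min(vals)
    return [v[:2] for v, val in zip(poly, vals) if val == m]

# Cell A: inner normal (1, -1, 1) picks 1, x^2y^2 from f1 and 1, y from f2.
print(support(Q1p, (1, -1, 1)))    # [(0, 0), (2, 2)]
print(support(Q2p, (1, -1, 1)))    # [(0, 0), (0, 1)]

# Cell B: inner normal (-1, 1/2, 1) picks x, x^2y^2 from f1 and 1, x from f2.
print(support(Q1p, (-1, 0.5, 1)))  # [(1, 0), (2, 2)]
print(support(Q2p, (-1, 0.5, 1)))  # [(0, 0), (1, 0)]
```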

Using these liftings, the homotopy of Equation 8.5.29 applied to this example
becomes

    H(x, y, t) = (h_1(x, y, t), h_2(x, y, t)) = (1 + ax + bx^2 y^2, 1 + cxt + dyt + exy^2 t^3).    (8.5.30)
The solution paths of H(x, y, t) = 0 are intimately related to the two mixed cells,
labeled A and B in the figure. It can be shown^3 (Lemma 3.1, Huber & Sturmfels,
1995) that H(x, y, t) = 0 only has branches of the form

    (x(t), y(t)) = (x_0 t^{γ_1}, y_0 t^{γ_2}) + higher-order terms

when (γ_1, γ_2, 1) is an inner normal of a mixed cell of the lower convex hull of
Q_1′ + Q_2′. As t → 0, the lowest-order terms dominate and we solve them to obtain
the leading coefficients x_0, y_0 of the fractional power series.
Let us start by examining cell A, which is generated by monomials 1, x^2 y^2 from
f_1 and 1, y from f_2. The inner normal for that cell is (γ_1, γ_2, 1) = (1, −1, 1). One may
^3 The result stated here generalizes to any number of variables.
check that the inner product of (γ_1, γ_2, 1) with the lifted vertices takes a minimal
value of 0 on the cell. In the case at hand, this means that

    (x(t), y(t)) = (x_0 t, y_0 t^{−1}) + higher-order terms.    (8.5.31)


Substituting into H(x, y, t) = 0 gives

    h_1(x, y, t) = 1 + a x_0 t + b x_0^2 y_0^2 + higher-order terms,
    h_2(x, y, t) = 1 + c x_0 t^2 + d y_0 + e x_0 y_0^2 t^2 + higher-order terms.

Keeping just the lowest-order terms in t, we have equations for the initial coefficients
x_0, y_0 as

    0 = 1 + b x_0^2 y_0^2,
    0 = 1 + d y_0.

These give two solutions

    (x_0, y_0) = (±i d/√b, −1/d).
For each of these, we may use Equation 8.5.31 to predict the values of x(t),y(t)
for small t and then commence path tracking on the homotopy Equation 8.5.30 to
t = 1.
In similar fashion, the mixed cell B in Figure 8.4 is generated by monomials
x, x^2 y^2 from f_1 and 1, x from f_2. This time the inner normal is (γ_1, γ_2, 1) =
(−1, 1/2, 1), so we get

    (x(t), y(t)) = (x_0 t^{−1}, y_0 t^{1/2}) + higher-order terms,    (8.5.32)

which gives

    h_1(x, y, t) = 1 + a x_0 t^{−1} + b x_0^2 y_0^2 t^{−1} + higher-order terms,
    h_2(x, y, t) = 1 + c x_0 + d y_0 t^{3/2} + e x_0 y_0^2 t^3 + higher-order terms.

This time, the lowest-order terms in t give

    0 = a x_0 t^{−1} + b x_0^2 y_0^2 t^{−1},
    0 = 1 + c x_0,

giving two solutions

    (x_0, y_0) = (−1/c, ±√(ac/b)).
As before, these allow us to predict (x(t), y(t)) for small t, now using Equation 8.5.32,
and then track the homotopy Equation 8.5.30 to t = 1.
Together, these give four paths to the four solutions of Equations (8.5.27) and
(8.5.28). Any other choice of (γ_1, γ_2) fails to give any nonzero solutions (x_0, y_0) in
the initial fractional power series, as there is only one leading term in one or both of
Table 8.1 Various root counts for the toy example, Equation 8.6.33

    Structure            Embedding                               U         Count
    Total Degree         ({1, z1, z2, z3, z4})^(4)               C^4       256
    Two-Homogeneous      ({1, z1, z2}^(2) ⊗ {1, z3, z4}^(2))     C^4       96
    Linear Product       ({z1, z2} ⊗ {1, z1, z2} ⊗               (C*)^4    54
                          {z3, z4} ⊗ {1, z3, z4})
    Monomial Product     ({z1z4, z2z3, z1, z2} ⊗                 (C*)^4    26
      or Polytopes        {z1z4, z2z3, z3, z4})
    Polynomial           ({z1z4 − z2z3, z1, z2} ⊗                (C*)^4    6
      Product             {z1z4 − z2z3, z3, z4})

the homotopy equations. Both x_0 and y_0 must be nonzero, because by assumption,
they are the leading coefficients of the series. This kind of argument is the key to
the general result for any number of equations.
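As a sanity check on the example, one can substitute the computed leading coefficients back into the truncated equations numerically. The sketch below (plain Python; the coefficient values a, ..., e are arbitrary nonzero test choices of our own, not from the text) confirms that the cell A and cell B starting points satisfy their lowest-order equations:

```python
# Verify the leading Puiseux coefficients for the example
#   f1 = 1 + a x + b x^2 y^2,  f2 = 1 + c x + d y + e x y^2.
# The coefficient values here are arbitrary nonzero test choices.
import cmath

a, b, c, d, e = 1.3, 2.1, -0.7, 0.9, 1.8

# Cell A: 0 = 1 + b x0^2 y0^2,  0 = 1 + d y0
for sign in (1, -1):
    x0 = sign * 1j * d / cmath.sqrt(b)
    y0 = -1 / d
    assert abs(1 + b * x0**2 * y0**2) < 1e-12
    assert abs(1 + d * y0) < 1e-12

# Cell B: 0 = a x0 + b x0^2 y0^2,  0 = 1 + c x0
for sign in (1, -1):
    x0 = -1 / c
    y0 = sign * cmath.sqrt(a * c / b)
    assert abs(a * x0 + b * x0**2 * y0**2) < 1e-12
    assert abs(1 + c * x0) < 1e-12
```

Two choices of sign in each cell give the four start points, matching the mixed volume of four.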

8.6 A Summarizing Example

Let us review by studying a "toy" example for which each product structure gives
a different root count.
Consider a system of four equations, each of the form

    f_i = (q_{i1}(z_1 z_4 − z_2 z_3) + q_{i2} z_1 + q_{i3} z_2)(q_{i4}(z_1 z_4 − z_2 z_3) + q_{i5} z_3 + q_{i6} z_4)
          + q_{i7} z_1 z_3 + q_{i8} z_1 z_4 + q_{i9} z_2 z_3 + q_{i10} z_2 z_4.    (8.6.33)

We have four variables z = {z_1, z_2, z_3, z_4} and forty parameters q_{ij}, i = 1, ..., 4,
j = 1, ..., 10. Table 8.1 gives the root counts for various embeddings of the system.

Here is a quick summary of how each of these is calculated:


• The total degree is 4^4 = 256.
• With the variables split into two groups {z_1, z_2} and {z_3, z_4}, the two-
homogeneous Bezout count is the coefficient of α^2 β^2 in (2α + 2β)^4, which
is 96. More explicitly, each polynomial in the start system has the form
(1, z_1, z_2)^(2) × (1, z_3, z_4)^(2). There are C(4,2) = 6 ways to choose the factor
(1, z_1, z_2)^(2) from two start polynomials and (1, z_3, z_4)^(2) from the remaining
two, and then there are 2^4 solutions for each such choice, yielding 6 · 2^4 solu-
tions in all.
• Notice that the equations have no constant or linear terms, so the start systems
can be chosen of the form

    (z_1, z_2) × (1, z_1, z_2) × (z_3, z_4) × (1, z_3, z_4).


The combinatorics for the linear-product embedding follows those for the two-
homogeneous case, but the simultaneous choice of two factors (z_1, z_2) gives a
solution with z_1 = z_2 = 0, and two choices of the form (z_3, z_4) yield a similar
result. Thus, we get a smaller root count when working on (C*)^4 of 3 · 3 · C(4,2) = 54.
• The monomial product root count and the polytope root count are the same for
this system. Evaluation of the mixed volume by computer yields the count of 26.
• The polytope root count does not account for the fact that z_1 z_4 and z_2 z_3 do not
appear independently in the factors. The polynomial product structure cap-
tures this fact, and as a result the root count decreases to 6. To determine this
count, one must consider the 2^4 ways to choose one factor from each equation
in the corresponding start system:

    g_i ∈ (z_1 z_4 − z_2 z_3, z_1, z_2) × (z_1 z_4 − z_2 z_3, z_3, z_4).

It turns out that only choices with two of each kind of factor give roots in (C*)^4.
Each of these C(4,2) = 6 combinations gives a single root.
Although for this example polynomial products give a lower count than the
polytope root count, that is not necessarily true in general. It depends on whether
the equations admit a favorable polynomial product. Often the polytope root count
is lowest. Other than that, the ordering in the table is fixed, as the structures lower
in the table are generalizations of those higher in the table, as indicated at the
outset of the chapter in Figure 8.1.

8.7 Exercises

The next chapter on case studies contains more challenging exercises connected to
applications. For now, the exercises are simpler, illustrative problems.
Exercise 8.1 (Warm Up) Use HOMLAB to solve the system 8.4.4 using
• a total-degree homotopy (see routine totdtab), and
• a two-homogeneous homotopy (see routine mhomtab).
Exercise 8.2 (Linear Products) Consider the system 8.4.18. What is its total
degree, its two-homogeneous Bezout count, and its best linear-product root count?
Solve it all three ways using the HOMLAB routines totdtab, mhomtab, and lpdtab.
Exercise 8.3 (Generalized Eigenvalues) Create a straight-line program for the
generalized eigenvalue problem
    (λ_1 A + λ_2 B)v = 0,
where A and B are n × n matrices, [λ_1, λ_2] ∈ P^1, and v ∈ P^{n−1}. Solve a randomly
generated example using a two-homogeneous homotopy with n paths. (Use routine
lpdsolve.) Compare your result to the qz algorithm in Matlab.

Exercise 8.4 (Multihomogeneous and Polytopes) For a general system having
a given multihomogeneous degree structure (D, K), show that the polytope
root count and the multihomogeneous Bezout count Bez(D, K) are the same. Use
Equation 8.4.15.
Exercise 8.5 (Toy Problem) Use HOMLAB to confirm all the root counts re-
ported in Table 8.1. How can you confirm the polytope root count even though
HOMLAB does not implement a mixed volume calculation?

Exercise 8.6 (Circle Tangents) A circle of radius r and center (a, b) has the
equation f(x, y) := (x − a)^2 + (y − b)^2 − r^2 = 0. The condition for a line through
(x, y) and point (c, d) to be tangent to the circle is g(x, y) := (x − a)(x − c) +
(y − b)(y − d) = 0.
(1) Assume r, a, b, c, d are given. Find the points of the circle where it is touched by
the tangents through (c, d). Do so by solving the system {f = 0, g = 0}, then try
again by solving {f = 0, f − g = 0}. Is there a difference in the number of paths
for a total degree homotopy? How about for a two-homogeneous homotopy?
(2) Assume two circles are given. Find the point pairs where a line simultaneously
touches both circles in a tangency. Use the same trick as in item 1 to reduce
the number of homotopy paths.
(3) Show that with the change of variables (z, z̄) := (x + iy, x − iy) and judicious
linear combinations of the equations, the simultaneous tangents to two circles
can be found with a system having total degree 8, linear-product root count 6,
and polytope root count 4.
(4) What happens if the two circles are tangent to each other?
Chapter 9

Case Studies

As a means of reviewing the computation of isolated solutions by continuation, we


present a collection of application problems in this chapter. Reflecting our own expe-
riences, these are weighted heavily towards problems in kinematics, with chemistry
and game theory also represented. Readers who have no interest in these application
areas are encouraged nonetheless to study this chapter to solidify concepts.
We order these roughly by the complexity of the analysis of the polynomial
structure. The first case concerning Nash equilibria is naturally formulated as
a multihomogeneous system, while succeeding cases offer a range of options to
consider. The final case study on the design of four-bar linkages is actually a
collection of problems ranging from the very easy, four-bar motion analysis, to rather
hard, nine-point path synthesis. In these examples, one may notice that there is
an art to choosing a clean formulation and simple manipulations of the equations
can sometimes lead to homotopy formulations having fewer paths. Although such
manipulations are sometimes not really necessary, as a few extra solution paths are
not of practical consequence, our objective is to give some sense of the full range
of possibilities.

9.1 Nash Equilibria

An important problem in game theory, with application to economics, is the de-


termination of Nash equilibria. A description of the problem and results of using
several different solution methods, including Gröbner methods and continuation,
can be found in (Datta, 2003), and related information is in (Sturmfels, 2002).
The problem concerns N players, and the ith player has s_i + 1 possible choices of
play, called "pure strategies." For every combination of strategies, there is a payoff
for each player. There are ∏_{i=1}^N (s_i + 1) possible combinations and N players, so
the game is defined by N ∏_{i=1}^N (s_i + 1) numbers, tabulated in utility matrices as
follows. Let's say there are 3 players, Alice, Bob, and Chuck, abbreviated as A, B,
or C, and they make, respectively, the plays a, b, c. Denote by U^A_{abc}, U^B_{abc}, U^C_{abc} the
respective payoffs to the players. More generally, the utilities are U^i_{j_1⋯j_N}, where i
ranges 1 to N, and each j_k runs 0 to s_k.


The game is played multiple times. Suppose Alice observes that in the last
round a change in her strategy would have earned her a higher payoff. Then, she is
likely to change her play in the next round. Bob and Chuck will act similarly. An
equilibrium occurs if every player finds that there is no unilateral change of strategy
that would have increased his or her payoff.
Suppose the players can split their bets between the possible strategies. This is
called a "mixed strategy," which models either the situation of putting a fraction
of one's money on each strategy or of putting all one's money on a single strategy
chosen probabilistically according to the mixed strategy. Let x_i = (x_{i0}, ..., x_{is_i}) be
the ith player's mixed strategy. Then, the total payoff P_i(x_1, ..., x_N) to player i is
obtained by summing his/her utility over all the mixed strategies as

    P_i(x_1, ..., x_N) = Σ_{j_1=0}^{s_1} ⋯ Σ_{j_N=0}^{s_N} U^i_{j_1⋯j_N} x_{1j_1} x_{2j_2} ⋯ x_{Nj_N}.    (9.1.1)

Notice that this is multilinear in the players' mixed strategies. Equilibrium occurs
for player A if, while holding B's and C's mixed strategies fixed, every pure strategy
for A returns the same payoff. Otherwise, A would be motivated to bet more heavily
on the higher paying pure strategy. Let e_k be a pure bet on the kth strategy:
e_0 = (1, 0, ..., 0), e_1 = (0, 1, 0, ..., 0), etc. Then, a Nash equilibrium occurs when
for i = 1, ..., N and k = 1, ..., s_i

    P_i(x_1, ..., x_{i−1}, e_k, x_{i+1}, ..., x_N) = P_i(x_1, ..., x_{i−1}, e_0, x_{i+1}, ..., x_N).    (9.1.2)

This comprises a total of Σ_{i=1}^N s_i homogeneous equations on P^{s_1} × ⋯ × P^{s_N}, a
multilinear system of polynomial equations.
Since the entries in the mixed strategy x_i are the percentages that player i bets
on each pure strategy, these should all be in the real interval [0,1], and they should
sum to one. Each player's strategy x_i ∈ P^{s_i} has a unique scaling factor that makes
the sum of its homogeneous coordinates equal to unity. These can then be filtered
against the [0,1] condition to find the meaningful solutions. Those for which all bets
are in the interior of the interval (0,1) are called "totally mixed Nash equilibria."
A given game can also have partially mixed equilibria, where some players adopt
pure strategies, due to unequal payoffs, while others adopt mixed strategies. We
consider only the totally mixed Nash equilibria.
The system given in Equation 9.1.2 has two essential structural characteristics:
the equations are all multilinear, and the group of variables Xi does not appear
in the ith block of equations (those that involve Pi). This structure is perfectly
captured by a multihomogeneous formulation. In (Datta, 2003; Sturmfels, 2002),
the solutions are counted using the polyhedral mixed volume and computed via
the associated polyhedral homotopy. This is of course valid, since the polyhedral
formulation sharply bounds any multihomogeneous formulation, but it is a bit of
overkill when the multihomogeneous formulation is already sharp. If the payoffs
Case Studies 151

were such that more monomials vanish from the equations, such as may happen
when payoffs for two pure strategies are equal, then the polyhedral method could
provide a lower root count. For small systems, a multihomogeneous root count
can be done by hand while a general multihomogeneous routine for larger systems
remains a simple and efficient alternative to polyhedral approaches.
Let us take, for example, the case of N = 3 players, with players 1 and 2 having
s_1 + 1 = s_2 + 1 = 3 pure strategies each, and player 3 having just s_3 + 1 = 2 pure
strategies, so (s_1, s_2, s_3) = (2, 2, 1). By Equation 8.4.15, the multihomogeneous root
count is

    B = coeff(a^2 b^2 c^1, (b + c)^2 (a + c)^2 (a + b)^1)
      = coeff(a^2 b^2 c, b^2 (a + c)^2 (a + b)) + coeff(a^2 b^2 c, 2bc (a + c)^2 (a + b))
      = coeff(a^2 c, (a + c)^2 (a + b)) + coeff(a^2 b, 2 (a + c)^2 (a + b)) = 2 + 2 = 4.

The explanation of the first line is that the exponents in a^2 b^2 c^1 match the dimensions
of the space, P^{s_1} × P^{s_2} × P^{s_3}, on which we work, while those in the polynomial
(b + c)^2 (a + c)^2 (a + b)^1 match the number of equations of each type, which are
also s_1, s_2, s_3 by Equation 9.1.2. The factor (b + c)^2 says that the two equilibrium
equations for player 1 do not involve player 1's bets while those of players 2 and 3
appear linearly, and similar factors come from the other two players' equilibrium
conditions. It is clear, we hope, from this example how to generalize to other N
and s_i.
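The coefficient extraction in Equation 8.4.15 is easy to automate. The sketch below (plain Python, not HOMLAB; the routine name nash_bezout is ours) expands the product of linear forms over exponent-tuple dictionaries and reads off the required coefficient; it reproduces the count of 4 for the (2, 2, 1) game and the count of 13,833 quoted for the four-player, four-strategy game:

```python
# Multihomogeneous Bezout count for Nash games via Equation 8.4.15:
# the coefficient of prod_i x_i^{s_i} in prod_i (sum_{j != i} x_j)^{s_i}.
# Plain-Python polynomial arithmetic over exponent-tuple dictionaries.
from collections import defaultdict

def poly_mul(p, q, nvars):
    r = defaultdict(int)
    for ea, ca in p.items():
        for eb, cb in q.items():
            r[tuple(ea[i] + eb[i] for i in range(nvars))] += ca * cb
    return dict(r)

def nash_bezout(s):
    """s[i] = s_i, the number of pure strategies of player i minus one."""
    n = len(s)
    result = {tuple([0] * n): 1}
    for i in range(n):
        # linear form: sum of all variables except x_i
        lin = {tuple(int(j == k) for k in range(n)): 1
               for j in range(n) if j != i}
        for _ in range(s[i]):                 # s_i equations of this type
            result = poly_mul(result, lin, n)
    return result.get(tuple(s), 0)

print(nash_bezout((2, 2, 1)))    # 4
print(nash_bezout((3, 3, 3, 3)))  # 13833 per the text
```

On small cases this agrees with known counts: a generic two-player bimatrix game with two strategies each gives 1, and a three-player, two-strategy game gives 2.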
Another way to arrive at the same result is to examine the linear product start
system. For the (s_1, s_2, s_3) = (2, 2, 1) game, the 3-homogeneous start system is

    ((x_2) × (x_3))^(2)  } player 1 equilibrium
    ((x_1) × (x_3))^(2)  } player 2 equilibrium    (9.1.3)
    (x_1) × (x_2)        } player 3 equilibrium.

Among the 2^5 ways to choose one factor from each equation, we are limited to 2
choices each of x_1 and x_2 and only one choice of x_3. Making the choice for x_3 first,
which can only be done 4 ways, one may see that all the other choices are forced.
The valid choices, listing the chosen factor for each of the five equations in order,
are

    {(x_3), (x_2), (x_1), (x_1), (x_2)},  {(x_2), (x_3), (x_1), (x_1), (x_2)},
    {(x_2), (x_2), (x_3), (x_1), (x_1)},  {(x_2), (x_2), (x_1), (x_3), (x_1)}.    (9.1.4)
The disparity between the multihomogeneous root count and the total degree, here
4 and 32, respectively, grows rapidly with the size of the problem. For example, for
N = 4 players having 4 pure strategies each, the 4-homogeneous Bezout count is
13,833, while the total degree is 3^12 = 531,441.

9.2 Chemical Equilibrium

Imagine a reaction vessel, or an automobile engine, in which a mixture of chemical


compounds are reacting. The compounds may break up and recombine into a myr-
iad of intermediate species, settling down eventually to a final equilibrium mixture.
While the transient behavior of the reaction is governed by differential equations,
the final equilibrium conditions are well-modeled by a system of polynomial equa-
tions. The system typically has at least one real root with positive values for the
concentrations of all the chemical species. It is possible for there to be more than
one positive root, in which case the transient behavior determines which of several
possible equilibria is reached. A basic presentation of modeling chemical reactions
can be found in (Morgan, 1987), from which the following discussion is derived;
more sophisticated treatments are given in (Feinberg, 1980).
The variables in the system represent the molar concentrations of the species.
The concentrations at a state of equilibrium are governed by two types of equa-
tions: conservation equations state that the total number of atoms of each element
must stay constant (we assume a closed system), and reaction equations model the
propensity of certain combinations of species to transform into each other. In such
a model, a chemical reaction equation of the familiar form, such as

    H_2O ⇌ 2H + O,
gives rise to an equilibrium reaction equation governing the balance between the
constituents on the two sides, in this case

    k X_{H2O} = X_H^2 X_O,
where k is an equilibrium constant that depends on temperature. (Equilibrium
constants for many reactions are available in standard tables, typically derived
from laboratory experiments.) To go with this reaction equation, the conservation
equations would be

    2X_{H2O} + X_H = T_H,
    X_{H2O} + X_O = T_O,

where T_H and T_O stand for the total amounts of hydrogen and oxygen in the vessel.
Notice that the coefficient of 2 on X_{H2O} in the conservation equation for hydrogen
comes from the fact that each water molecule has two hydrogen atoms. The conser-
vation equations are always linear, and the reaction equations are polynomial. The
three equations just given determine the equilibrium balance between water, hydro-
gen and oxygen in a simple model that ignores molecular hydrogen and oxygen, H_2
and O_2.
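To make the toy model concrete, here is a small numerical solve (plain Python; the values of k, T_H, and T_O are arbitrary illustrative choices, not tabulated constants). Eliminating X_{H2O} and X_O via the two conservation equations reduces the model to a single equation in X_H, which bisection solves for the physically meaningful positive root:

```python
# Solve the toy water dissociation model  k*X_H2O = X_H^2 * X_O  with
#   2*X_H2O + X_H = T_H   and   X_H2O + X_O = T_O.
# k, T_H, T_O below are arbitrary illustrative values, not tabulated data.

k, T_H, T_O = 1.0, 2.0, 1.0

def residual(X_H):
    X_H2O = (T_H - X_H) / 2.0          # hydrogen conservation
    X_O = T_O - X_H2O                  # oxygen conservation
    return X_H**2 * X_O - k * X_H2O    # reaction equation

# Bisection for the root with 0 < X_H < T_H.
lo, hi = 1e-12, T_H
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if residual(lo) * residual(mid) <= 0:
        hi = mid
    else:
        lo = mid

X_H = 0.5 * (lo + hi)
X_H2O = (T_H - X_H) / 2.0
X_O = T_O - X_H2O
print(X_H, X_H2O, X_O)  # approximately 1.0 0.5 0.5 for these values
```

For these particular values the root is X_H = 1 exactly, so the residual of the reaction equation vanishes at the computed equilibrium.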
Morgan presents a model (Model B in Chapter 9 of (Morgan, 1987)) involving
eleven species formed from oxygen, hydrogen, carbon and nitrogen. The reaction
equations, given in standard chemical notation at left and in polynomial form at
right, are:

    O_2  ⇌ 2O        k_1 X_{O2} = X_O^2         (9.2.5)
    H_2  ⇌ 2H        k_2 X_{H2} = X_H^2         (9.2.6)
    N_2  ⇌ 2N        k_3 X_{N2} = X_N^2         (9.2.7)
    CO_2 ⇌ O + CO    k_4 X_{CO2} = X_O X_{CO}   (9.2.8)
    OH   ⇌ O + H     k_5 X_{OH} = X_O X_H       (9.2.9)
    H_2O ⇌ O + 2H    k_6 X_{H2O} = X_O X_H^2    (9.2.10)
    NO   ⇌ O + N     k_7 X_{NO} = X_O X_N.      (9.2.11)

There are four conservation equations:

    T_H = X_H + 2X_{H2} + X_{OH} + 2X_{H2O}                               (9.2.12)
    T_C = X_{CO} + X_{CO2}                                                (9.2.13)
    T_O = X_O + X_{CO} + 2X_{O2} + 2X_{CO2} + X_{OH} + X_{H2O} + X_{NO}   (9.2.14)
    T_N = X_N + 2X_{N2} + X_{NO}                                          (9.2.15)
These are eleven equations in eleven variables, with total degree 2^6 · 3 = 192. We
could readily solve the system as given, but it is easy to reduce. The obvious
move is to substitute from the reaction equations into the conservation equations to
eliminate all variables except X_H, X_O, X_{CO}, and X_N. This gives four equations of
total degree 3 · 2 · 3 · 2 = 36. Note, however, that there is only one cubic monomial
in the equations, which comes when we eliminate X_{H2O} using Equation 9.2.10. So
it is a simple maneuver to replace Equation 9.2.14 with

    2T_O − T_H = 2(X_O + X_{CO} + 2X_{O2} + 2X_{CO2} + X_{OH} + X_{NO}) − (X_H + 2X_{H2} + X_{OH}).
    (9.2.16)
After substituting from the reaction equations, the system of Equations (9.2.12,
9.2.13, 9.2.16, 9.2.15) has total degree 3 · 2^3 = 24.
Now, let's see if any of the product structures can further reduce the number
of homotopy paths. First, for convenience, we list the monomial structure of the
equations:

    (1, X_H, X_H^2, X_O X_H, X_O X_H^2)
    (1, X_{CO}, X_O X_{CO})
    (1, X_O, X_H, X_{CO}, X_O^2, X_H^2, X_O X_H, X_O X_{CO}, X_O X_N)    (9.2.17)
    (1, X_N, X_N^2, X_O X_N).

A four-homogeneous formulation gives a root count of 18, which is the lowest possi-
ble multihomogeneous count. We can improve on that slightly with a linear product
homotopy having the start system

    (1, X_O) × (1, X_H)^(2)
    (1, X_O) × (1, X_{CO})                          (9.2.18)
    (1, X_O, X_H) × (1, X_O, X_H, X_{CO}, X_N)
    (1, X_N) × (1, X_O, X_N),
which gives a root count of just 16. As always, a sparse monomial homotopy would
do just as well as the best linear product homotopy.
In chemical equilibrium problems a significant numerical issue arises: equilib-
rium constants often have wide ranges of magnitude. For a temperature of 1000°,
Morgan gives reciprocal equilibrium constants R_i = 1/k_i that range from 10^{2.2120} to
10^{47.970}. It is essential to rescale the variables and the equations to work in double
precision arithmetic. We will not discuss this issue here. The interested reader may
refer to Morgan's treatment in (Morgan, 1987) or (Meintjes & Morgan, 1987), or
study the implementation in the function scalepol distributed as part of HOMLAB.
This problem is treated in the exercises of this chapter.

9.3 Stewart-Gough Forward Kinematics

A detailed description of Stewart-Gough platform robots and the associated forward


kinematics problem has already been given as a case study in parameter continu-
ation, § 7.7. However, the discussion there assumed that we had the solutions for
some general member of the problem family which could then be used as the start
system for parameter continuation. Here, we return to the problem to examine our
options for solving the first example.
The family of Stewart-Gough platform problems is a sub-family of the family of
all systems of seven quadrics on [e, g] ∈ P^7. Any member of this family has at most
2^7 = 128 isolated solution points, and it is easy to write down an example with that
many roots, a simple one being

    G(e, g) = {e_0^2 − e_1^2, e_0^2 − e_2^2, e_0^2 − e_3^2, e_0^2 − g_0^2, e_0^2 − g_1^2, e_0^2 − g_2^2, e_0^2 − g_3^2} = 0.    (9.3.19)

We immediately see that this system has exactly 128 solutions, all of the form
e_1 = ±e_0, ..., g_3 = ±e_0. The theory presented in §§ 8.3 and 8.4.1 shows that with
probability one, the solution paths of the homotopy

    H((e, g), t) = γ t G(e, g) + (1 − t) F((e, g); p_0) = 0,    (9.3.20)

for any p_0 and random γ ∈ C, will lead from the 128 solutions of G = 0 to a set of
endpoints that contains all isolated solutions of F = 0 as t goes from 1 to 0 along
the real line.
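The 2^7 = 128 start points of Equation 9.3.19 can be written down directly. A sketch (plain Python, written for this illustration) enumerates them in the affine chart e_0 = 1 and verifies each against G:

```python
# Enumerate the 2^7 = 128 solutions of the start system
#   G = {e0^2 - e1^2, ..., e0^2 - g3^2} = 0
# in the affine chart e0 = 1, and check each residual.
from itertools import product

solutions = [(1,) + signs for signs in product((1, -1), repeat=7)]

def G(v):
    e0 = v[0]
    return [e0**2 - vi**2 for vi in v[1:]]

assert len(solutions) == 128
assert all(all(r == 0 for r in G(s)) for s in solutions)
print(len(solutions))  # 128
```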
The Stewart-Gough forward kinematic equations can be reduced to a form in
which a linear-product decomposition yields a lower root count than the total de-
gree. The reduction is based on the observation that the quadratic terms in g in all
six leg equations, Equation 7.7.10, are the same, namely gg′. Hence, if we subtract
the equation for leg 1 from all the others, this term is eliminated from five of the
equations. That is, the system becomes

    f_0(e, g) = ge′ = 0,    (9.3.21)
    f_1(e, g) = (b_1 b_1′ + a_1 a_1′ − L_1^2) ee′ + (g b_1′ e′ + e b_1 g′) − (g e′ a_1′ + a_1 e g′)
                − (e b_1 e′ a_1′ + a_1 e b_1′ e′) + gg′ = 0,    (9.3.22)
    f_i(e, g) = (b_i b_i′ + a_i a_i′ − L_i^2) ee′ + (g b_i′ e′ + e b_i g′) − (g e′ a_i′ + a_i e g′)
                − (e b_i e′ a_i′ + a_i e b_i′ e′) = 0,    i = 2, ..., 6.    (9.3.23)

This system admits the linear product decomposition

    f_0 ∈ (g) ⊗ (e)    (9.3.24)
    f_1 ∈ (e, g) ⊗ (e, g)    (9.3.25)
    f_i ∈ (e, g) ⊗ (e),    i = 2, ..., 6.    (9.3.26)

Consequently, we may use a start system of the form

    g_0 ∈ (g) × (e)    (9.3.27)
    g_1 ∈ (e, g)^(2)    (9.3.28)
    g_i ∈ (e, g) × (e),    i = 2, ..., 6.    (9.3.29)

The linear-product root count may be tallied up by noting that in picking one factor
from each equation, we must never choose more than three of the form (e), because
choosing four or more forces e = 0, and we wish to ignore any solutions on that
degenerate set. Accordingly, if we pick the factor (g) in g_0, we may pick either
of two factors in g_1 and among the remaining five equations, we may choose (e)
from zero to three times. If instead we choose (e) in g_0, we must limit the last five
equations to choose (e) at most twice. These observations give a root count of

    2[C(5,0) + C(5,1) + C(5,2) + C(5,3)] + 2[C(5,0) + C(5,1) + C(5,2)] = 52 + 32 = 84.
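This tally is easy to cross-check by brute force (plain Python sketch; the enumeration mirrors the counting argument just given, with one bit per factor choice):

```python
# Brute-force tally of the linear-product root count for the reduced
# Stewart-Gough system: one factor per equation, with at most three
# factors of the form (e) in total (four or more would force e = 0).
from itertools import product

count = 0
for g0, _g1, *rest in product(range(2), repeat=7):
    # g0: 0 -> (g), 1 -> (e); _g1: either of the two (e,g) factors of g_1;
    # rest (five equations g_2..g_6): 0 -> (e,g), 1 -> (e)
    n_e = g0 + sum(rest)
    if n_e <= 3:
        count += 1   # each valid selection contributes one root
print(count)  # 84
```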
It is shown in (Wampler, 1996a) that the count of 40 for general Stewart-Gough
platforms is due to the antisymmetry of the mixed quadratic terms. That is, if we
write Equation 7.7.10 for leg i in the form

    e^T A_i e + 2 e^T B_i g + g^T g = 0,

where e and g are interpreted as 4 × 1 column matrices, then the 4 × 4 matrix
B_i is antisymmetric, B_i^T = −B_i. [This can be seen in the quaternion formulation
by noting that (g b′ e′ + e b g′) = −(e b′ g′ + g b e′), and similarly for the other mixed
terms.] Accordingly, any further reduction of the problem must take advantage of
this property. A monomial product or sparse monomial homotopy does not account
for any relationships between the coefficients of the monomials, so these give 84
roots when applied to Equation 9.3.21.

9.4 Six-Revolute Serial-Link Robots

The solution of the inverse kinematic problem of general six-revolute, serial-link


robots, once called the "Mount Everest of Kinematics" by renowned kinematician
F. Freudenstein,1 is a milestone in the development of polynomial continuation.
The problem is, given a stationary ground link and six subsequent moving links
connected in series by rotational joints, find all sets of joint angles to place the final
link in a given position, p, and orientation, {x_7, y_7, z_7}, as schematically shown
in Figure 9.1. The links are assumed to be rigid bodies, a good approximation
for most industrial robots. The space of rigid-body displacements, R^3 × SO(3), is
six-dimensional, which matches the dimensionality of the joint space, so we expect
in general a finite number of isolated solutions to the problem. The stature of the
problem justifies a historical synopsis, which may help to place the development of
the continuation method in context with other approaches.

Fig. 9.1 Schematic six-revolute serial-link robot

The high points in the history of the problem begin in 1968 with (Pieper, 1968),
who gave a formulation of the general problem having total degree 64,000. This
^1 Ferdinand Freudenstein, Higgins Professor Emeritus of Mechanical Engineering, Columbia
University
upper bound was substantially sharpened in 1973 to only 32 (Roth, Rastegar, &
Scheinman, 1974), but it was not until 1980 that (Duffy & Crane, 1980) derived
a reduction of the problem to a single polynomial of degree 32. This essentially
solved the problem in the sense that good numerical methods exist for factoring
a polynomial in one variable and also in the sense that one could solve a generic
example and find the true root count. The count is only 16, since 16 of the 32 roots
were extraneous ones introduced by the reduction process. However, this was not
fully appreciated at the time, and the prevailing attitude was that the problem
could not be considered fully solved until a reduction to a single univariate
polynomial of degree 16 was found. Besides, a numerical demonstration does not
carry the full weight of mathematical proof.
It was into this scene that, in 1985, (Tsai & Morgan, 1985) introduced the
method of polynomial continuation to the kinematics community. They cast the
problem as eight quadratics (total degree 256) and found that only 16 endpoints
of the ensuing homotopy were valid solutions. Perhaps the most important con-
tribution of that work was not the confirmation of the count of 16, but rather the
demonstration that systems of polynomial equations could be solved reliably by
numerical means.
Work continued after that on two fronts: elimination methods and continuation.
(Primrose, 1986) gave the first real proof of the root count of 16, by showing that the
other 16 roots of the Crane-Duffy polynomial correspond to solutions at infinity for
the intermediate joints. Morgan and Sommese (Morgan & Sommese, 1987a) showed
that the Tsai-Morgan system had a two-homogeneous Bezout number of only 96,
the first application of multihomogeneous continuation. Finally, in 1988, Lee and
Liang (Lee & Liang, 1988) produced the long sought-after reduction to a univariate
polynomial of minimal degree, although it was a complicated procedure. A simpler
one was later given by (Raghavan & Roth, 1993), and a numerical treatment of
this reduction as an eigenvalue problem was given by (Manocha & Canny, 1994).
Complementing all of these works, (Manseur & Doty, 1989) found an example with
all 16 solutions being real.
The reduction of a problem to a univariate polynomial of minimal degree has
two payoffs: it proves an upper bound on the root count and it leads to a numerical
solution. But it is not the only route to either of these. A system of equations
that admits a sharp root count via a multihomogeneous formulation or a monomial
polytope analysis suffices for proof, and continuation can provide the numerical
method. We should not fail to mention the extensive work in computer algebra
to compute Gröbner bases as a means of proof; see (Cox et al., 1997) and (Cox
et al., 1998) as a beginning point to the extensive literature on this. Any reduction
of a problem to a Gröbner basis can be converted for numerical solution to an
equivalent eigenvalue problem (Auzinger & Stetter, 1988; Möller & Stetter, 1995).
But even as late as Raghavan and Roth's paper, algorithms for computing Gröbner
bases were not capable of handling a problem as difficult as the six-revolute inverse
position problem.
If we are willing to give up rigorous proof of the true root count, it is often
convenient to find a "good enough" formulation of a multivariate system with a
root count low enough that continuation can be reasonably applied. In this sense,
with the tremendous increase in computer power of late, even Pieper's original
formulation of total degree 64,000 might be considered within range. But we will
proceed below to give a much more amenable formulation than that.
The approach we give here, first published in (Wampler & Morgan, 1993), is
of a different cloth than all the others we have mentioned. Those others begin
with a formulation of the kinematics as a product of homogeneous transformation
matrices (Denavit & Hartenberg, 1955), (Hartenberg & Denavit, 1964, Chapter 12).
Reductions starting from that point lead to rather long algebraic expressions, as
one can see from the cited references and (Morgan, 1987, Chap. 10). Instead, we
write down a system mirroring closely the geometry of the problem, and proceed
to solve it in its unmodified form.
Let z_i ∈ R^3, i = 1,...,6, be unit vectors along the joint axes of a six-revolute
serial-link chain; see Figure 9.1. The kinematic chain is completely described by
finding the common normal between each pair of successive joint axes and listing
three values: the "twist angle" α_i between joints i and i+1, the distance a_i between
these joint axes (a.k.a. the "link length"), and the distance d_i (a.k.a. the "joint
offset") between successive common normals. If none of the successive joints are
parallel,² the common normal directions are

x_i = z_i × z_{i+1} / sin α_i,    i = 1,...,5,

where "×" means the vector cross-product in 3-space. Then, the six-revolute inverse
position problem can be written as the system

z_i · z_i = 1,    i = 2,3,4,5,    (9.4.30)

z_i · z_{i+1} = cos α_i,    i = 1,2,3,4,5,    (9.4.31)

(a_1/sin α_1) z_1 × z_2 + Σ_{i=2}^{5} ( d_i z_i + (a_i/sin α_i) z_i × z_{i+1} ) = p̃,    (9.4.32)

where p̃ is a known vector from where the first common normal intersects joint 1
to where the last common normal intersects joint 6. The vectors z_0, x_0, and z_1
are known, being fixed in the ground, as are z_6 and x_6, being fixed in the last
link whose position and orientation is given. From these, and the known lengths
and offsets of the links, p̃ is readily computed from p, and we take it as given. So
we have 12 equations (vector Equation 9.4.32 is equivalent to 3 scalar ones) in 12
variables, which are the 3 elements each of z_2, z_3, z_4, z_5. Although these vectors
naturally live in R^3, we will treat them as if they live in C^3, by the usual embedding.
² See (Wampler & Morgan, 1993) for how to handle parallel links.
Among the equations, two are linear and the rest are quadratic, for a total degree
of 2^10 = 1024. Using the two-homogeneous groupings {1, z_2, z_4} and {1, z_3, z_5},
we get a lower root count of 2^4 (6 choose 3) = 320. Although this is quite a bit
inflated over the true root count of 16, it is low enough that we have no trouble
tracking all paths by continuation. Then, we can solve subsequent examples with
only a sixteen-path parameter homotopy.
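The multihomogeneous root count quoted here is easy to check mechanically. The following sketch (our own illustration in Python, not code from any software discussed in this book) computes a two-homogeneous Bézout number as the coefficient of α^6 β^6 in the product of the linear degree forms d_{i1}α + d_{i2}β, one factor per equation:

```python
def two_hom_bezout(bidegrees, group_sizes):
    """Coefficient of alpha^k1 * beta^k2 in prod_i (d_i1*alpha + d_i2*beta).
    Assumes a square system: len(bidegrees) == k1 + k2."""
    k1, k2 = group_sizes
    poly = [1]  # poly[i] holds the coefficient of alpha^i
    for d1, d2 in bidegrees:
        new = [0] * (len(poly) + 1)
        for i, c in enumerate(poly):
            new[i] += c * d2       # take the beta term of this factor
            new[i + 1] += c * d1   # take the alpha term of this factor
        poly = new
    return poly[k1]

# Bidegrees of the 12 equations in the groups {1,z2,z4} and {1,z3,z5}:
degs = ([(2, 0), (2, 0), (0, 2), (0, 2)]            # unit-length, (9.4.30)
        + [(1, 0), (1, 1), (1, 1), (1, 1), (0, 1)]  # twist angles, (9.4.31)
        + [(1, 1)] * 3)                             # position, (9.4.32)
print(two_hom_bezout(degs, (6, 6)))  # 320
```

The same routine reproduces the total degree 2^10 = 1024 if all twelve variables are lumped into a single group.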

9.5 Planar Seven-Bar Structures

One of the most prevalent classes of mechanical systems consists of planar links
joined by rotational joints, also known as "pin joints," or simply "hinges." The
axes of all the joints in the mechanism are perpendicular to the plane of motion. In
reality, the links occupy three-dimensional volumes, and they can move in separate
parallel planes, but for the purpose of analyzing their motion, only their projection
onto one of these planes needs to be considered.
Consider the seven-bar assembly shown in Figure 9.2, consisting of four triangles
and three simple bars. (We call this the "type a" seven-bar, as there are two other
topological arrangements of interest; see Exercise 9.6.) For general dimensions of
the links, such an assembly is a structure, meaning that it will be rigid. However,
it is quite possible that if we disconnect a joint and reposition the pieces, we can
reconnect that same joint with the links in new relative positions. The question of
finding all such assembly configurations comes up in the study of related six-bar
and eight-bar linkages which have internal motion.

Fig. 9.2 Seven-bar linkage, type a.


9.5.1 Isotropic Coordinates


Before presenting equations for the problem, we take a brief aside to explain
"isotropic coordinates." Suppose we have a point in the real plane (a, b) ∈ R^2.
We can naturally associate to this point the complex number z = a + ib ∈ C. This
is quite convenient because vector addition in R^2 becomes just the usual addition of
complex numbers, and the approach is known in the kinematics community as the
"complex vector formulation." Moreover, a rotation around the origin through an
angle θ moves point z ∈ C to a new point e^{iθ}z. For brevity, we use the convention
θ := e^{iθ}. In this manner, any rotation in the plane corresponds to a θ ∈ C of unit
magnitude, |θ| = 1.

Now, suppose we extend (a, b) into C^2 by letting a and b take on complex values.
Then, to preserve the convenient modeling of rotations by complex multiplication,
we associate to (a, b) ∈ C^2 the point (z, z̄) := (a + ib, a − ib) ∈ C^2. For reasons
beyond the current discussion, the pair (z, z̄) are known as "isotropic coordinates."
Note that z and z̄ are complex conjugates if, and only if, a and b are real. Rotation
through an angle θ now gives the point (θz, θ^{-1}z̄). Any vector loop equation written
in terms of z and θ has a corresponding equation in which z is replaced by z̄ and θ is
replaced by θ^{-1}. Alternatively, we may let θ̄ := θ^{-1}, so that rotation is represented
by the isotropic pair (θ, θ̄), with the extra equation θθ̄ = 1.
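As a small self-contained illustration of these conventions (our own sketch; the function names are ours), the following converts a planar point to isotropic coordinates and applies a rotation:

```python
import cmath

def to_isotropic(a, b):
    """Map the planar point (a, b) to isotropic coordinates (z, zbar).
    For real a and b, the two coordinates are complex conjugates."""
    return a + 1j * b, a - 1j * b

def rotate(z, zbar, angle):
    """Rotate by 'angle' radians: (z, zbar) -> (theta*z, theta^(-1)*zbar),
    where theta = e^(i*angle) has unit magnitude."""
    theta = cmath.exp(1j * angle)
    return theta * z, zbar / theta

# Rotating the point (1, 0) by 90 degrees gives (0, 1):
z, zbar = rotate(*to_isotropic(1.0, 0.0), cmath.pi / 2)
a = (z + zbar) / 2    # recover the planar point from (z, zbar)
b = (z - zbar) / 2j
```

Because the two coordinates rotate by reciprocal factors, the conjugacy relation zbar = conj(z) is preserved whenever the starting point is real, which is exactly the "real point" test used later for linkage solutions.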

9.5.2 Seven-Bar Equations


Without loss of generality, let us take the position of link 0 to be fixed; that is,
assume θ_0 = 1. Then the squared lengths of the three simple bars can be written
as

ℓ_1^2 = (a_0 + a_1θ_1 + b_2θ_2)(ā_0 + ā_1θ̄_1 + b̄_2θ̄_2),    (9.5.33)

ℓ_2^2 = (c_0 + a_2θ_2 + b_3θ_3)(c̄_0 + ā_2θ̄_2 + b̄_3θ̄_3),    (9.5.34)

ℓ_3^2 = (b_0 + a_3θ_3 + b_1θ_1)(b̄_0 + ā_3θ̄_3 + b̄_1θ̄_1),    (9.5.35)

θ_1θ̄_1 = 1,    θ_2θ̄_2 = 1,    θ_3θ̄_3 = 1.    (9.5.36)

This is a system of six quadratics, for a total degree of 2^6 = 64.


The system is bilinear when treated with the two-homogeneous partition

{1, θ_1, θ_2, θ_3} × {1, θ̄_1, θ̄_2, θ̄_3}.

In the corresponding linear-product start system, only choices of factors having
three of each type of factor give finite roots, so the two-homogeneous root count is
(6 choose 3) = 20.
A sharp root count is obtained by matching the sparsity of the equations using
a linear product decomposition as follows:

{1, θ_1, θ_2} × {1, θ̄_1, θ̄_2}
{1, θ_2, θ_3} × {1, θ̄_2, θ̄_3}
{1, θ_3, θ_1} × {1, θ̄_3, θ̄_1}    (9.5.37)
{1, θ_1} × {1, θ̄_1}
{1, θ_2} × {1, θ̄_2}
{1, θ_3} × {1, θ̄_3}

Of the same 20 combinations of factors that gave start points in the two-
homogeneous formulation, six now do not give solutions. For example, we cannot
simultaneously choose the initial factor from the first, fourth and fifth equations, as
we would then have three equations in only two variables: θ_1, θ_2. From this, we see
that the linear product homotopy based on Equation 9.5.37 has a root count of 14.
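This count of 14 can be verified by brute force. Each start point of the linear-product homotopy comes from choosing one factor per equation and solving six linear equations; the choice yields a start point exactly when the coefficient matrix on the chosen variable supports is nonsingular. A sketch of that enumeration (our own, with random numbers standing in for generic coefficients):

```python
import itertools
import random

random.seed(0)

def det(M):
    """Determinant by permutation expansion (adequate for a 6x6 matrix)."""
    n = len(M)
    total = 0.0
    for perm in itertools.permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        prod = float(sign)
        for i in range(n):
            prod *= M[i][perm[i]]
        total += prod
    return total

# Variable order: theta1, theta2, theta3, theta1bar, theta2bar, theta3bar.
# The two factors' variable supports for each equation of (9.5.37):
factors = [
    ({0, 1}, {3, 4}),  # {1,th1,th2} x {1,th1bar,th2bar}
    ({1, 2}, {4, 5}),  # {1,th2,th3} x {1,th2bar,th3bar}
    ({2, 0}, {5, 3}),  # {1,th3,th1} x {1,th3bar,th1bar}
    ({0},    {3}),     # {1,th1}     x {1,th1bar}
    ({1},    {4}),     # {1,th2}     x {1,th2bar}
    ({2},    {5}),     # {1,th3}     x {1,th3bar}
]

count = 0
for choice in itertools.product((0, 1), repeat=6):
    A = [[0.0] * 6 for _ in range(6)]
    for row, pick in enumerate(choice):
        for col in factors[row][pick]:
            A[row][col] = random.uniform(1.0, 2.0)  # generic coefficient
    if abs(det(A)) > 1e-9:
        count += 1
print(count)  # 14, matching the linear-product root count
```

The six discarded choices are exactly the structurally singular ones described in the text, where three chosen factors involve only two of the variables.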
Readers with a particular interest in planar linkages may wish to look at
(Wampler, 2001) to see an alternative solution approach which, when applied to
the seven-bar problems, converts them to eigenvalue problems of size 14, 16, or 18.

9.6 Four-Bar Linkage Design

We have already studied several systems concerning the kinematics of mechanisms


and robots, namely, Stewart-Gough platforms, six-revolute serial-link robots, and
planar seven-bar linkages. In each of these, the objective was analysis: given the
mechanical structure of the links, we sought all assembly configurations. In this sec-
tion, we study a simpler linkage, the planar four-bar, but we ask synthesis questions,
that is, we seek structural dimensions of the links so that the four-bar produces a
specified motion. Depending on the requirements set forth for the motion, we may
face a system of polynomial equations ranging from easy to hard. The easy exam-
ples have been solved long ago by a variety of methods, but the most difficult, the
nine-point path synthesis problem, stood for almost 70 years before being solved by
modern continuation methods. The use of an efficient product structure was critical
to that success.
In all these examples, we begin with basic loop-closure equations and manipulate
them into a form amenable to efficient solution by continuation. In earlier times,
kinematicians usually declared a problem "solved" when an elimination procedure
had been found for reducing it to a single polynomial in one variable, especially
if that polynomial was of minimal degree (having no extraneous factors). Such a
polynomial could then be solved by a variety of numerical methods. With the advent
of continuation, this is no longer necessary, for we can reliably find all solutions to
a system of multivariate polynomials. Part of the art in applying continuation is
to make informed decisions about how much symbolic pre-processing to do before
turning the problem over to numerical solution. With this in mind, let us take a
look at some four-bar design problems.
Fig. 9.3 Four-bar linkage. Heavy lines are rigid links, whereas thin lines are vectors. Open circles
mark hinge joints, and hash marks indicate a stationary link.

9.6.1 Four-Bar Synthesis


Except for the simple lever, perhaps the most ubiquitous linkage mechanism is the
planar four-bar, Figure 9.3. It consists of four rigid planar bodies connected in a
loop, with one link, the "ground link," held in fixed position. A set of three links
connected in such a loop would form a rigid triangle, a fundamental structural com-
ponent in bridges and the like. In contrast, a hinged quadrilateral deforms, making
it useless for structures but leading to a multitude of applications in machines that
perform useful motions. In particular, points such as A and B on the two links
adjacent to the ground link trace out circles centered on the fixed hinge points A_0
and B_0, respectively, while points such as C on the "coupler link" opposite the
ground link generally trace out sixth-degree curves. Linkages where one or more of
the hinge joints are replaced by linear (slider) joints are also four-bars, but we will
not discuss them here.
The motion characteristics of four-bars can be used in several ways. Most ap-
plications fall into one of the following categories:
Function Generation In this case, the purpose of the four-bar is to transfer an
input rotation at one ground pivot to the other. If the four-bar is a parallel-
ogram, this does nothing more than duplicate the input motion at the output
side (transferring power in the process), but if the linkage is a general quadri-
lateral, some reshaping of the motion takes place. That is, a uniform rotation
speed at the input gives a nonuniform speed at the output, which can be very
useful. Quite often, a steady rotation at the input is converted to an oscillatory
output. A windshield wiper operates on this principle, for instance.
Path Generation In this case, there is a designated point on the coupler link,
where we might place the tip of a tool for the machine to do its work, and so
the path traced out by this tool is of top concern. The designated point is called
the coupler point and its path is called the coupler curve. The motion of the
foot of a simple walking machine might be generated in this way (assuming a
ball-shaped foot so that only its center position matters, not its orientation).
Body Guidance In this case, the entire motion of the coupler is at stake, both
position and orientation. Such a machine might scoop up material in one location,
carry it without spillage, and deposit the contents in a second location. A
four-bar might guide the motion of the scoop.

Four-bar synthesis means that we specify at the outset the desired motion, and
seek to find a four-bar that will produce it. Synthesis is the inverse process of
analysis, which seeks to describe the motion characteristics of a given mechanism.
We will proceed to write out the basic equations of four-bar motion, which can
be employed for analysis and for various kinds of synthesis, depending on which
quantities are given and which are treated as unknowns. We will then describe
several synthesis problems. Among these, the most challenging are path-synthesis
problems, and as we shall discuss in some detail, the most challenging of all is the
synthesis of a coupler curve to pass through nine given points.

9.6.2 Four-Bar Equations


The kind of synthesis problems we treat here are called precision-point methods,
because we give a certain number of points through which the coupler curve must
pass precisely or a number of locations through which the coupler must guide a
body. So in the following equations, we use an index j to denote the configuration
of the four-bar at the jth precision point or precision position.
Referring to Figure 9.3, vectors a and b describe the locations of the fixed pivots
with respect to the origin O, vectors u and v are the links connected to ground
at these pivots having rotations φ_j and ψ_j, respectively, p_j is the vector from the
coupler point C to the origin, x and y tell the location of the rotational joints in the
coupler link, while θ_j is the rotation of the coupler link. Quantities a, b, u, v, x, y
do not change as the four-bar moves, while φ, ψ, θ, p do change and hence have
a subscript j in our formulae. Without loss of generality, we may assume θ_0 =
φ_0 = ψ_0 = 1, because the initial orientation of the links can be absorbed into the
orientations of the vectors u, v, x, y. The four-bar can be viewed as consisting of
a left "dyad," a, u, x, and a right dyad, b, v, y, that are rigidly connected at the
coupler point.
In the following equations, we will use isotropic coordinates to represent vectors
in the plane. Recall from § 9.5.1, where we give more details, that a vector from
(0,0) to (a_0, a_1) in the plane is represented by isotropic coordinates as (a, ā) :=
(a_0 + ia_1, a_0 − ia_1).
Summing vectors around the left and right dyads, we have loop equations for
the jth position, as

p_j + xθ_j + uφ_j + a = 0,    p̄_j + x̄θ_j^{-1} + ūφ_j^{-1} + ā = 0,    (9.6.38)

p_j + yθ_j + vψ_j + b = 0,    p̄_j + ȳθ_j^{-1} + v̄ψ_j^{-1} + b̄ = 0.    (9.6.39)

From these basic equations, we can define a wide variety of synthesis problems,
varying in how many positions are prescribed and which of the symbols in the
above equations are known quantities versus variables.

9.6.3 Four-Bar Analysis


Before proceeding to the synthesis questions, let's look at the analysis question of
determining the motion of a given four-bar. This will come down to nothing more
than solving a quadratic equation, but we include it for background in case the
reader wishes to animate any of the linkages synthesized in the subsequent sections.
We assume that in Equations (9.6.38) and (9.6.39) we know the shapes of the links as
given by x, y, u, v, a, b and x̄, ȳ, ū, v̄, ā, b̄. This leaves five unknowns, p_j, p̄_j, θ_j, φ_j, ψ_j,
and since we have just four equations, we expect a solution curve. One way to plot
the curve is to rotate the left input link through a sequence of closely spaced angles,
say φ_j = 0°, 1°, 2°, ..., 360°, and solve for the other four variables. First, eliminate
p_j and p̄_j by subtracting one equation from another to get

(x − y)θ_j + uφ_j − vψ_j + a − b = 0,    (x̄ − ȳ)θ_j^{-1} + ūφ_j^{-1} − v̄ψ_j^{-1} + ā − b̄ = 0.    (9.6.40)

Next, we eliminate θ_j to get

(x − y)(x̄ − ȳ) = (uφ_j − vψ_j + a − b)(ūφ_j^{-1} − v̄ψ_j^{-1} + ā − b̄).    (9.6.41)

Since φ_j is given, this is just a quadratic equation in ψ_j. After solving it, we can
backtrack through Equation 9.6.40 to get θ_j and then Equation 9.6.38 for (p_j, p̄_j).
Remember that "real" points on the motion curve are those points for which |ψ_j| =
|θ_j| = 1.
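To make the procedure concrete, here is a sketch in Python; the link dimensions below are invented for the illustration, and the barred quantities are ordinary complex conjugates since the linkage is real. It clears the negative exponent from Equation 9.6.41 to get a quadratic in ψ, keeps the unit-magnitude branches, and backtracks for θ and the coupler point (which equals xθ + uφ + a, since p points from C to the origin):

```python
import cmath

# Illustrative four-bar in isotropic coordinates (dimensions invented here).
a, b = 0 + 0j, 3 + 0j          # fixed pivots A0 and B0
u, v = 1j, 1j                  # ground-connected links at the reference pose
x, y = 1.5 + 1j, -1.5 + 1j     # coupler-link vectors
ab, bb, ub, vb, xb, yb = (t.conjugate() for t in (a, b, u, v, x, y))

def positions(phi):
    """Given the input rotation phi = e^{i*angle}, solve Equation 9.6.41
    for psi, keep the unit-magnitude branches, and backtrack for theta
    and the coupler point C = x*theta + u*phi + a."""
    c = u * phi + a - b
    cb = ub / phi + ab - bb
    k = (x - y) * (xb - yb)
    # (c - v*psi)(cb*psi - vb) = k*psi  =>  q2*psi^2 + q1*psi + q0 = 0
    q2, q1, q0 = -v * cb, c * cb + v * vb - k, -c * vb
    disc = cmath.sqrt(q1 * q1 - 4 * q2 * q0)
    sols = []
    for psi in ((-q1 + disc) / (2 * q2), (-q1 - disc) / (2 * q2)):
        if abs(abs(psi) - 1) < 1e-9:            # "real" motion branches
            theta = (v * psi - c) / (x - y)     # from Equation 9.6.40
            sols.append((psi, theta, x * theta + u * phi + a))
    return sols
```

For this particular linkage, the reference input φ = 1 (0°) recovers the starting assembly with θ = 1 and coupler point (1.5, 2), plus a second assembly branch; sweeping φ around the circle traces the coupler curve.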
We should mention that the engineering analysis of a four-bar under considera-
tion for a real machine would encompass much more than just plotting its motion
curve. One would need to consider, for example, the forces transmitted through the
links. This and other considerations are beyond the scope of the present discussion.

9.6.4 Function Generation


For function generation, we prescribe pairs (φ_j, ψ_j), j = 0,...,n, relating the output
angle to the input angle, and we find four-bars that meet these conditions. The
coupler point is of no consequence in this application, so we may assume y = ȳ =
0. Also, the whole linkage can be translated in the plane without changing the
functional relation between the angles, so we may assume a = ā = 0. Likewise for
rescaling the size of the links or rotating the whole linkage in the plane, so we may
assume b = b̄ = 1. It is convenient to eliminate p_j and p̄_j in Equations (9.6.38) and

(9.6.39) to get, for j = 0,...,n,

xθ_j + uφ_j = vψ_j + 1,    (9.6.42)

x̄θ_j^{-1} + ūφ_j^{-1} = v̄ψ_j^{-1} + 1.    (9.6.43)

This leaves as variables u, ū, v, v̄, x, x̄ and θ_j, j = 1,...,n, since we assume θ_0 = 1.


Equations (9.6.42) and (9.6.43) for j = 0,...,n are 2(n + 1) equations in 6 + n
variables, so n ≤ 4. This implies that we can specify up to five pairs of angles,
(φ_j, ψ_j), j = 0,...,4, and still expect to find four-bars that exactly interpolate
them.
The system of Equations (9.6.42) and (9.6.43), j = 0,...,4, after clearing the
negative exponent on θ_j, consists of 8 quadratics and two linear equations for a
total degree of 2^8 = 256. We leave it as an exercise to show that the system has a multi-
homogeneous formulation with a root count of only six. We could solve this using
continuation, preferably using a sparse linear solver in the path tracker since only
a few variables appear in each equation. An alternative is to reduce the number of
variables. Eliminating θ_j between Equations (9.6.42) and (9.6.43) and then using
the equation for j = 0 to eliminate xx̄ from each of the others, one obtains, for
j = 1,...,4, the single equation

(−uφ_j + vψ_j + 1)(−ūφ_j^{-1} + v̄ψ_j^{-1} + 1) = (−uφ_0 + vψ_0 + 1)(−ūφ_0^{-1} + v̄ψ_0^{-1} + 1).    (9.6.44)

It is now easy to see that the total degree is 2^4 = 16, whereas a two-homogeneous
structure {u, v, 1} ⊗ {ū, v̄, 1} still has a root count of six.
The two-homogeneous root count is not sharp. Expanding Equation 9.6.44, we
see that the monomials uū, vv̄ and 1 appear on both sides and cancel, leaving only
the monomials {u, ū, v, v̄, uv̄, vū}. This means that the origin (u, v, ū, v̄) = (0, 0, 0, 0)
is always a solution, which is of no use as a four-bar. Also, after two-homogenizing
via (u, v, ū, v̄) = (U/W, V/W, Ū/W̄, V̄/W̄), we see that there are two solutions at
infinity ([W, U, V], [W̄, Ū, V̄]) = ([0,1,0], [0,1,0]) and ([0,0,1], [0,0,1]). Since all
three of these extraneous roots appear even for arbitrary coefficients, the root count
on (C*)^4 is three, and accordingly, the polyhedral approach gives a mixed volume of three.
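These structural claims can be probed numerically. The sketch below (our own check, in Python) evaluates the difference of the two sides of Equation 9.6.44 at random angle data, confirming that the origin always solves the system and that the highest-bidegree part vanishes along the directions where u and ū, or v and v̄, blow up together:

```python
import cmath
import random

random.seed(1)
# Random prescribed angle pairs (phi_j, psi_j), j = 0..4, as unit complexes:
phi = [cmath.exp(1j * random.uniform(0, 2 * cmath.pi)) for _ in range(5)]
psi = [cmath.exp(1j * random.uniform(0, 2 * cmath.pi)) for _ in range(5)]

def g(j, u, v, ub, vb):
    """Left side minus right side of Equation 9.6.44."""
    def side(k):
        return (-u * phi[k] + v * psi[k] + 1) * (-ub / phi[k] + vb / psi[k] + 1)
    return side(j) - side(0)

# The origin solves every equation exactly:
origin_ok = all(abs(g(j, 0, 0, 0, 0)) < 1e-12 for j in range(1, 5))

# The leading bidegree-(1,1) part vanishes along (u, ub) -> infinity together
# and along (v, vb) -> infinity together, so g grows slower than t^2 there:
t = 1e8
infinity_ok = all(
    abs(g(j, *(t * d for d in direction))) / t**2 < 1e-6
    for j in range(1, 5)
    for direction in ((1, 0, 1, 0), (0, 1, 0, 1))
)
```

Along a generic direction the same ratio tends to a nonzero constant, which is why only these two points at infinity survive for arbitrary coefficients.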

9.6.5 Body Guidance


This time, we are given positions (p_j, p̄_j) and orientations θ_j of a body, j = 0,...,n.
We want to find four-bars which carry this body through these locations while it
is rigidly attached to the coupler link. The Equations 9.6.38 for the left dyad are
decoupled from those for the right dyad, Equations 9.6.39. In fact, they are exactly
the same form, so if we find multiple solutions to Equations 9.6.38, we can choose
one of them for the left dyad and one for the right dyad to form a four-bar that
guides the body through the specified locations. For the left dyad, we have 2(n + 1)
equations in the 6 + n variables x, x̄, u, ū, a, ā and φ_j, j = 1,...,n, φ_0 = 1, so
we expect a finite number of solutions when n = 4. Points (x, x̄) are known as
Burmester points, and points (−a, −ā) are the Burmester centers, named after the
man who first solved the problem (Burmester, 1888).
Eliminating φ_j from Equations 9.6.38, the equations for the left dyad are, for
j = 0,...,4,

uū = (p_j + xθ_j + a)(p̄_j + x̄θ_j^{-1} + ā).    (9.6.45)

Using case j = 0 to eliminate uū from the others, we have, for j = 1,...,4,

(p_j + xθ_j + a)(p̄_j + x̄θ_j^{-1} + ā) = (p_0 + xθ_0 + a)(p̄_0 + x̄θ_0^{-1} + ā).    (9.6.46)

This is almost identical in form to Equation 9.6.44, except this time the constant
term does not cancel out, since p_jp̄_j ≠ p_0p̄_0. Thus, the system has total degree
2^4 = 16, two-homogeneous degree (4 choose 2) = 6, and only 4 finite roots. (The same
two roots at infinity exist as in the function generator problem.) In fact, one
classical approach to solving the function generator problem is to use the principle
of kinematic inversion to convert it to the Burmester body guidance problem, but
we will not go into that here.

9.6.6 Five-Point Path Synthesis


Many different path synthesis problems can be formulated, depending on what
additional information is given besides the path points (p_j, p̄_j). One version is to
give the ground pivots {a, ā, b, b̄}. The simplification of the equations is exactly as
for body guidance, except we must simultaneously consider both the left and right
dyads, since θ_j is unknown. Thus, the system to be solved is, for j = 1,...,n,

(p_j + xθ_j + a)(p̄_j + x̄θ_j^{-1} + ā) = (p_0 + x + a)(p̄_0 + x̄ + ā),    (9.6.47)

(p_j + yθ_j + b)(p̄_j + ȳθ_j^{-1} + b̄) = (p_0 + y + b)(p̄_0 + ȳ + b̄),    (9.6.48)

where we have used θ_0 = 1. For five precision points, i.e., n = 4, we have eight
equations in the unknowns x, x̄, y, ȳ and θ_j, j = 1,...,4.
Expanding the products and cancelling terms, we have the system

θ_j( x[θ_j(p̄_j + ā) − (p̄_0 + ā)] + x̄[θ_j^{-1}(p_j + a) − (p_0 + a)]
     + (p_j + a)(p̄_j + ā) − (p_0 + a)(p̄_0 + ā) ) = 0,    (9.6.49)

θ_j( y[θ_j(p̄_j + b̄) − (p̄_0 + b̄)] + ȳ[θ_j^{-1}(p_j + b) − (p_0 + b)]
     + (p_j + b)(p̄_j + b̄) − (p_0 + b)(p̄_0 + b̄) ) = 0,    (9.6.50)

where the θ_j multiplying each equation clears negative exponents. This gives eight
cubic equations, for a total degree of 3^8 = 6561. This obviously misses the simple
monomial structure of the system. We can do much better by noting that
Equation 9.6.49 has the monomial structure {x, x̄, 1} ⊗ {1, θ_j, θ_j^2} and similarly,
Equation 9.6.50 has the monomial structure {y, ȳ, 1} ⊗ {1, θ_j, θ_j^2}. The reader may
verify that this gives a six-homogeneous root count of 2^4 (4 choose 2) = 96.
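As a sanity check on this expansion, one can verify numerically that Equation 9.6.49 agrees with θ_j times the difference of the two sides of Equation 9.6.47, even for random, non-conjugate complex data. The sketch below is our own check, not code from the cited references:

```python
import random

random.seed(2)

def rc():
    """A random complex number (the values need not be conjugate pairs)."""
    return complex(random.uniform(-1, 1), random.uniform(-1, 1))

th = 0.6 + 0.8j                         # a sample value of theta_j
x, xb, a, ab = rc(), rc(), rc(), rc()
p0, pb0, pj, pbj = rc(), rc(), rc(), rc()

# Two sides of Equation 9.6.47:
lhs = (pj + x * th + a) * (pbj + xb / th + ab)
rhs = (p0 + x + a) * (pb0 + xb + ab)

# Expanded form, Equation 9.6.49 (the leading theta_j clears the exponent):
eq49 = th * (x * (th * (pbj + ab) - (pb0 + ab))
             + xb * ((pj + a) / th - (p0 + a))
             + (pj + a) * (pbj + ab) - (p0 + a) * (pb0 + ab))
```

The residual eq49 − θ_j(lhs − rhs) is zero up to rounding, confirming that the xx̄ and aā terms cancel in the expansion.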

The monomial structure is in truth sparser than the product structure just given
would imply. Only the monomials {xθ_j^2, xθ_j, x̄θ_j, x̄, θ_j} appear in Equation 9.6.49,
and Equation 9.6.50 has a similar pattern. This allows solutions of the form x̄ =
ȳ = θ_j = 0, so it is clear that the root count is lower than 96. In fact, the polyhedral
mixed volume yields a root count of 36, which is sharp.

9.6.7 Nine-Point Path Synthesis

In 1923, Alt (Alt, 1923) noted that the extreme path-synthesis problem for four-bars
is to specify nine points on the coupler curve. Compared to the six-revolute serial-
link problem, this one has a longer chronology, but a shorter historical account. The
problem has so far proven to be invulnerable to reduction by hand, and it seems no
one as yet has made a serious attempt at it using computer algebra. To date, the
problem has only been solved by polynomial continuation.
After Alt, the main advance came in 1962, when Roth (Roth, 1962), (Roth &
Freudenstein, 1963) abandoned analytical methods and invented an early form of the
continuation method, which he called the "bootstrap method." The work was done
using real variables, so Roth invented heuristics to work around difficulties which
we now recognize to be solution paths that meet and branch out into complex space.
Most bootstrap paths never found a solution, but nevertheless, the approach did
produce for the first time linkages to interpolate nine specified points. After the
invention of the cheater's homotopy (see § 7.8), Tsai and Lu (Tsai & Lu, 1989) used
a heuristic version of it to improve the yield of solutions, but a complete solution
was not found until 1992, by Wampler, Morgan, and Sommese (Wampler et al.,
1992). A follow-up discussion of this article (Wampler, Morgan, & Sommese, 1997)
showed how the approach could be specialized to design symmetric four-bar coupler
curves with a maximal specification of precision points (five points plus the line of
symmetry).
The system of equations is exactly the same as Equations (9.6.49) and (9.6.50),
except now a, ā, b, b̄ are unknown and the index ranges over j = 1,...,8. Accordingly,
the system has the product structure, for j = 1,...,8,

{1, x, x̄, a, ā, ax̄, āx} ⊗ {1, θ_j, θ_j^2},    (9.6.51)

{1, y, ȳ, b, b̄, bȳ, b̄y} ⊗ {1, θ_j, θ_j^2}.    (9.6.52)

Using the fact that four general equations in the monomials {1, x, x̄, a, ā, ax̄, āx}
have just 4 solutions (hint: introduce new variables r = ax̄ and r̄ = āx), one sees
that this system has a root count of 2^12 (8 choose 4) = 286,720. This is the root count of the
formulation used to solve the problem in (Wampler et al., 1992), which at the time
was probably the largest polynomial system ever solved.
This is a case where symmetry can play a helpful role. It is easy to see that
swapping (x, x̄, a, ā) with (y, ȳ, b, b̄) leaves the equations reordered but otherwise
unchanged. If we can arrange our start system to have this same two-way symmetry,
we can track just half the number of paths. This can be done by using the same
random coefficients for the factors in Equation 9.6.51 as in Equation 9.6.52. Thus,
the problem can be solved using only 143,360 paths.
This is far from the end of the story. The system has numerous solutions at
infinity. Moreover, if (x, x̄, a, ā) = (y, ȳ, b, b̄), Equation 9.6.49 and Equation 9.6.50
are identical, so there is a positive dimensional solution component obeying this
relation. Many continuation paths terminate on this singular set.
of the problem showed that there were only 8652 nonsingular solutions, appearing
in 4326 pairs due to the two-way symmetry. Since the two-way symmetry amounts
to nothing more than swapping the labels between the left and right dyads of the
mechanism, we may say that there are 4326 distinct four-bars that interpolate nine
general points. Moreover, these appear in triplets, called Roberts cognates, which
not only go through the nine points but have exactly the same coupler curve. This
means there are just 1442 distinct four-bar coupler curves that pass through the
points. By using parameter continuation, we can solve subsequent examples using
only 1442 paths, about a 100-fold reduction from the 143,360 used to solve the
first example.
When dealing with a very sparse system like Equations (9.6.49) and (9.6.50), it
is often advantageous to eliminate some variables. This is because one of the main
costs of the continuation method is solving the linear systems for Euler prediction
and Newton correction. The cost of linear solves grows as 0(n 3 ) with the number
of variables, unless sparse solving methods can be applied. In the problem at hand,
we can eliminate all the 8j variables without increasing the root count, thereby
increasing efficiency when using a linear solver for full systems.
The elimination is accomplished by applying Cramer's rule for linear systems.
The system

α_1θ + α_2θ^{-1} + α_3 = 0,
β_1θ + β_2θ^{-1} + β_3 = 0,    (9.6.53)

has solutions only if

δ_1δ_2 + δ_3^2 = 0,    (9.6.54)

where

δ_1 = | α_2  α_3 |,    δ_2 = | α_1  α_3 |,    δ_3 = | α_1  α_2 |.    (9.6.55)
      | β_2  β_3 |           | β_1  β_3 |           | β_1  β_2 |
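A quick numerical check of this resultant condition (our own sketch): build two such equations sharing a common root t_0 and verify that δ_1δ_2 + δ_3^2 vanishes, with the δ's taken as the 2×2 minors of the 2×3 coefficient array:

```python
import random

random.seed(3)

def minor(r, s, i, j):
    """2x2 minor from columns i, j of the two coefficient rows r and s."""
    return r[i] * s[j] - r[j] * s[i]

t0 = 0.8 + 0.6j  # a chosen common root (any nonzero value works)

def coeffs_with_root(t):
    """Random (c1, c2, c3) satisfying c1*t + c2/t + c3 = 0."""
    c1 = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    c2 = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    return c1, c2, -(c1 * t + c2 / t)

alpha = coeffs_with_root(t0)
beta = coeffs_with_root(t0)

d1 = minor(alpha, beta, 1, 2)   # delete column 1
d2 = minor(alpha, beta, 0, 2)   # delete column 2
d3 = minor(alpha, beta, 0, 1)   # delete column 3
residual = d1 * d2 + d3 ** 2    # vanishes when a common root exists
```

The identity follows because (θ, θ^{-1}, 1) is proportional to the cross product of the two coefficient rows, and θ · θ^{-1} = 1 forces the minors to satisfy Equation 9.6.54.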
Applying this to Equations (9.6.49) and (9.6.50) gives a system of 8 equations with
the monomial product structure

{xā, x̄a, x, x̄, a, ā, 1}⟨2⟩ ⊗ {yb̄, ȳb, y, ȳ, b, b̄, 1}⟨2⟩.    (9.6.56)

This reduced system has been the subject of further study. The mixed volume
of the reduced version of the system, computed by Verschelde (Verschelde, 1996),
(Verschelde et al., 1996), was found to be 83,977. The best root count known was
found by applying polynomial products (Morgan et al., 1995). The approach is to
observe that Equation 9.6.54 admits the product decomposition {δ_1, δ_3} ⊗ {δ_2, δ_3}.
A homotopy based on this decomposition has 18,700 paths appearing with two-
way symmetry so that only 9,350 paths must be tracked. However, the start system
itself must be solved by continuation since the subsystems obtained by choosing one
factor from each equation are not linear. The whole computation requires 24,300
paths. Although this is a substantial reduction in the number of paths, it requires a
specialized computer program, so one may prefer to use a general purpose algorithm
with more paths.
No matter which method is used to solve the first random example, considerable
efficiency is to be gained in subsequent examples by applying parameter continua-
tion to track only 1442 paths.

9.6.8 Four-Bar Summary


The purpose of this discussion of four-bar linkages is to show a spectrum of prob-
lems, having multihomogeneous root counts ranging from six to 286,720. Each
geometric problem can be formulated in several ways as an algebraic system to be
solved, and each algebraic system can be placed in any one of several homotopies for
numerical solution. Generally, a well-chosen multihomogeneous formulation yields
a root count considerably lower than the total degree, while the mixed volume of
the Newton polytopes gives a somewhat lower root count. General linear prod-
ucts in which different equations have different linear decompositions are not useful
for these synthesis problems, because they have the same monomial structure at
each precision point. For the hardest problem in the set, the nine-point problem,
symmetry can cut the number of paths in half, while polynomial products give the
smallest root count at the expense of a more complicated computer program. Even
with that approach, the number of continuation paths is more than ten times as
large as the actual number of isolated roots. Only parameter homotopy can solve
the problem using only 1442 paths, but we need to bootstrap the process by solving
one example with one of the other homotopies.
In several examples, we see that there is more at stake than just the number
of homotopy paths. We can choose between two homotopies having the same root
count, one having sparse equations in many variables and the other having some
variables eliminated but less sparsity. Which is more efficient depends on the details
of how function evaluation and linear solving are computed. To be efficient, the
large, sparse formulation of the equations requires sparse linear algebra routines in
the path-tracking code. On the other hand, elimination of variables tends to raise
the degrees of the equations that remain, which can adversely affect the numerical
stability of the equations.
170 Numerical Solution of Systems of Polynomials Arising in Engineering and Science

Table 9.1 Constants for the chemical equilibrium model, Exercise 9.2

Equilibrium                                              Total
Constants        T = 1000°   T = 3000°   T = 6000°   Concentrations
log10(1/k1)         24.528       7.289       3.108    T_O   5.e-5
log10(1/k2)         22.206       6.997       3.270    T_H   3.e-5
log10(1/k3)         47.970      15.107       6.942    T_C   1.e-5
log10(1/k4)         24.942       6.825       2.559    T_N   1.e-5
log10(1/k5)         22.120       7.208       3.541
log10(1/k6)         46.989      14.680       6.791
log10(1/k7)         32.187      10.285       4.878

9.7 Exercises

Exercise 9.1 (Nash Equilibria)


(1) Compute the generic number of Nash equilibria for the following cases:
(a) 3 players, 3 pure strategies each;
(b) 4 players, 2 pure strategies each;
(c) 3 players with (4,3,3) pure strategies, respectively.
(2) Let Nash(N, S) be the generic number of Nash equilibria for N players having S
strategies each. Derive a recursive formula for Nash(N, 2) in terms of Nash(N − 1, 2)
and Nash(N − 2, 2). Use it to find Nash(7, 2).
(3) Write a code to compute Nash(N, {S₁, ..., S_N}), where player i has Sᵢ pure
strategies. Compute Nash(5, {4, 3, 3, 2, 2}).
(4) Use HOMLAB's routine bezno to find Nash(3, {4, 3, 3}). In general, routine
bezno is not an efficient way to perform such a count, because it works by
forming and solving a linear-product start system. Demonstrate this by using
it to count Nash(6, 2). What goes awry for a larger number of players?

Exercise 9.2 (Chemical Equilibrium) This exercise concerns the chemical sys-
tem of § 9.2. Data for this problem is given in Table 9.1. (A typographical error in
Morgan's Table 9-2, corrected here, reverses the constants T_C and T_H.)

(1) Carefully verify the 4-homogeneous and the linear-product root counts given in
§ 9.2.
(2) Find a 3-homogeneous formulation that also has a root count of 18.
(3) Follow the steps outlined in § 9.2 to derive expressions for the coefficients of the
monomials listed in Equation 9.2.17 in terms of the mass conservation parameters
T_H, T_O, T_C, T_N and the equilibrium constants k1, ..., k7.
(4) Use routine chemsys in HOMLAB to compute solutions to the system. First,
choose random coefficients. Try the different start systems. Do you get the
same number of finite roots each way? What do the roots at infinity look like?
(Hint: stats(4,:) indicates the multiplicity of roots as determined by the
endgame. See Chapter 10.)

Table 9.2 Concentrations for T = 1000°, the only physically
meaningful answer

Components   Concentration     Compound   Concentration

X_O      1.491155e-015     X_O2     7.499733e-006
X_H      3.212064e-019     X_H2     1.657938e-015
X_CO     7.664381e-016     X_N2     4.999735e-006
X_N      2.314587e-027     X_CO2    1.000000e-005
X_OH     6.314036e-012
X_H2O    1.500000e-005
X_NO     5.308800e-010

(5) Compute the solutions for random parameters. Is the result the same as for
random coefficients?
(6) Compute the solutions for T = 6000°, 3000°, and 1000°. How many physically
meaningful roots are there (concentration values must be real and nonnegative)?
(7) The test in chemsys for real solutions checks if the imaginary part of the con-
centrations is less than 10⁻⁶. Why is this not an adequate test for this problem?
Can you devise a better one? Can you spot complex conjugate pairs in the list
of "real" solutions?
(8) Try turning off scaling for T = 6000° and see what happens. What do you
think will happen for T = 3000°? Try it and see.
(9) (Open ended.) Why is T = 1000° so difficult? Can you devise a strategy to
treat this problem more easily?

The sole physically meaningful answer for T = 1000° is given in Table 9.2.

Exercise 9.3 (Stewart-Gough by total degree) Try running the Matlab file
stewart/sgtotdeg.m to solve the forward kinematics of a general 6-6 Stewart-
Gough platform.
(1) Confirm that among the 128 endpoints of the total-degree homotopy, 88 lie on
the affine algebraic set {(e, g) : e = 0, gg′ = 0}.
(2) The degenerate points are all singular. Why?
(3) Save the 40 nonsingular roots and use them as start points for parameter ho-
motopy, as directed in Exercise 7.4.
Exercise 9.4 (Stewart-Gough by LPD) HOMLAB provides a routine, called
lpdsolve, that creates a linear-product start system for a given product struc-
ture and tracks the resulting homotopy paths. The user must provide an m-file
function that computes the function value f(x) and its Jacobian matrix df/dx.
The script file stewart/sglpdhom.m does all of this for Stewart-Gough forward
kinematics problems.
(1) Run sglpdhom and check that it tracks 84 paths and obtains 40 nondegenerate
solution points for a general 6-6 platform.

(2) The routine warns that the start system has 30 singular solutions of the form
e = 0. Can you see why these are present and why there are 30 of them?
(Hint: they are nonsingular roots for some choice of factors in the start system
G = {g₀, ..., g₆}, but singular as solutions of G.)

Exercise 9.5 (Six-Revolute Inverse Position) The following exercises pertain


to the system Equations 9.4.30-9.4.32.
(1) Confirm the two-homogeneous root count of 320.
(2) Run routine sixrevl in HOMLAB and check that there are 16 finite roots.
(3) If your computer is fast enough, modify sixrevl to solve the system with a
total-degree homotopy (1024 paths) and reconfirm the root count.
(4) Run routine sixrev2, which uses a 16-path parameter homotopy, on the fol-
lowing and observe the number of finite roots and the number of real roots.
(a) a random, complex example,
(b) a random, real example,
(c) the Manseur-Doty example,
(d) a real example with intersecting "wrist" axes: a4 = d5 = a5 = 0.

Fig. 9.4 Seven-bar linkage, type b.

Exercise 9.6 (Seven-Bar Structures) The structure in Figure 9.2 is one of just
three topological arrangements of seven links in a structure that cannot be solved by
analyzing a five-bar or three-bar substructure. The other two are shown in Figures
9.4 and 9.5.

(1) Derive equations for each of the seven-bar structures in Figures 9.4 and 9.5 and
find linear product decompositions having root counts of 16 and 18, respectively.

Fig. 9.5 Seven-bar linkage, type c.

(2) Create a single program using HOMLAB to solve any of the seven-bar structures
with a 20-path two-homogeneous homotopy. Solve a random example of each
type and verify the root counts of 14, 16, and 18.
(3) Create individual programs for the three cases using linear-product decompo-
sitions having the minimal number of paths. Run the same examples as you
used in the previous item and verify that the same solutions are found.

Exercise 9.7 (Four-bar Function Generation)


(1) Clear the negative exponent from Equation 9.6.43 and show that the system of
Equations (9.6.42) and (9.6.43), j = 0, ..., 4, has a six-homogeneous root
count of four. (Hint: use the groupings (x, u, v, 1), (x̄, ū, v̄, 1), and (θⱼ, 1),
j = 1, 2, 3, 4.)
(2) Confirm the root count of four for the system of Equation 9.6.44, j = 1, 2,3, 4.
(3) Use routine fcngen in HOMLAB to synthesize some function generators. Re-
member that a real linkage has u* = ū and v* = v̄, where "*" is complex
conjugation.
(a) Let Ψⱼ = Φⱼ² for Φⱼ = {0.0, 0.1, 0.2, 0.3, 0.4}. Set (φⱼ, ψⱼ) = (e^{iΦⱼ}, e^{iΨⱼ}).
(b) Do the same except Ψⱼ = sin(Φⱼ) for Φⱼ = {−1.0, −0.5, 0.0, 0.5, 1.0}.
(c) Construct an original example.
How many real solutions are there in each case?
(4) For real linkages synthesized in the previous item, plot angle Ψ versus Φ on a
fine grid and animate the motion of the linkage.
(5) Write a program to use H O M L A B to solve the six-homogeneous version of the
problem from item (1) above.

Exercise 9.8 (Four-bar Body Guidance)


(1) Use routine burmest in HOMLAB to synthesize some dyads for body guidance.
(a) Let θⱼ = {e^{−i}, 0, 0, 0, e^{i}}, pⱼ = {−2, −1 − 0.5i, 0, 1, 2 + i}, and p̄ⱼ = pⱼ*
(complex conjugate).
(b) Construct an original example.
(2) A four-bar is obtained by combining a left with a right dyad. For the problems
above, pick two real solutions and use one as the left dyad and one as the right
dyad. Sketch the four-bar linkage at each of the five given positions.
(3) What is the maximum number of distinct four-bars that can guide a body
through five general positions?
(4) Confirm that the only monomials appearing in Equation 9.6.46 are
{x̄a, xā, x, x̄, a, ā, 1}. Show that by introducing new variables s = x̄a and
s̄ = xā, one can reformulate the problem as six equations of total degree four.
(This trick is similar to one due to Bottema (ch. 8, § 5 Bottema & Roth, 1979).)
Exercise 9.9 (Five-Point Path Synthesis)
• Use HOMLAB to solve the five-point problem using a six-homogeneous formu-
lation with 96 paths. You may wish to write a script to form the equations in
"tableau" form and then apply mhomtab. Determine the number of endpoints
that are
(1) at infinity,
(2) singular,
(3) finite and singular,
(4) contained in (C*)^8.
• Using the results of the previous run, construct a parameter homotopy to solve
subsequent problems in this family using as start points only the solutions
in (C*)^8.
• Explore the symmetric five-point problem in which the fixed pivots and the pre-
cision points are placed with mirror symmetry about the vertical axis. Instead
of writing equations specialized to the symmetric case, just use the general for-
mulation with symmetric data. (Hints: Let the zero-th precision point be on
the vertical axis. Also, note that in isotropic coordinates, (a, ā) being mirror-
symmetric to (b, b̄) means (b, b̄) = (−ā, −a).) What is the generic number of
symmetric solutions?
• How can you set up a parameter homotopy that preserves symmetry? How
many paths must be tracked?
• Find a formulation of the symmetric problem that uses just half as many vari-
ables and equations. Program it in HOMLAB and verify that you get the same
results as using the general formulation with symmetric data.
• Solve the case of a = 0, b = 1, p = (0.765 + 0.735i, 0.935 + 0.595i, 1.335 +
0.595i, 1.685 + 0.945i, 1.08 + 1.05i), with "real" data, meaning ā = a*, etc.

Verify that one of the solutions has x ≈ 0.71477 + 1.3365i. How many "real"
solutions are there?
• For some real solutions, plot the coupler curve and verify that it passes through
the specified points. A "circuit defect" is said to occur if the real coupler curve
has two circuits and some precision points fall on each. Find examples with and
without circuit defects. Can you find an example having multiple real solutions
without circuit defects?
• Download one of the publicly available packages that implements polyhedral
homotopy and use it to solve the five-point problem.
Chapter 10

Endpoint Estimation

In earlier chapters we studied polynomial homotopies H(z, t) : C^N × C → C^N with


t going from 1 to 0. In this chapter we investigate the last part of the continuation
procedure as t goes to 0. This is called the endgame in the continuation algorithm.
In § 10.1, we look at nonsingular solutions of H(z,0) = 0, the system we want
to solve. For these solutions Newton's method¹ is excellent.
In § 10.2, we look at the situation of singular roots of H(z, 0) = 0. For these
solutions, we follow (Morgan, Sommese, & Wampler, 1991) and take advantage of
the uniformization theorem for small neighborhoods of points of one-dimensional
analytic sets, which gives a local representation of the path we are tracking. This
powerful analytic handle on the roots yields a number of useful algorithms for root
estimation (Morgan et al., 1991, 1992a, 1992b), which we present in § 10.3. In a
nutshell, no matter what kind of singularity a path leads to, the end of the path in
the vicinity of t = 0 has a local fractional power series and so the endpoint can be
estimated by series-fitting techniques.
For sufficiently singular solutions of H(z, 0) = 0 we may not be able to compute
the solutions to desired accuracy if we have insufficient digits. Some situations
where this happens are discussed in §10.4.
As is usual practice in numerical work, we need to make judgements about such
things as whether two points are equal or whether a point is nonsingular, considering
that all points are known only approximately. We do not want to belabor these
points nor break the flow of the presentation of the material in the chapter. Instead
we discuss a number of the issues at the end of the individual sections.
The different methods in §10.2 and §10.3 were introduced by Morgan, Sommese,
and Wampler (Morgan et al., 1991, 1992a, 1992b). We present the methods guided
by the numerical experience we have had using them over the last decade. An
exception to this is deflation, which we discuss in §10.5. This is a promising method
presented in (Leykin, Verschelde, & Zhao, 2004), and based on (Ojika, Watanabe, &
Mitsui, 1983; Ojika, 1987). We do not attempt to include here all the material in
these articles. Full technical details are contained in the cited articles.
¹As noted before, this is the same as the Newton-Raphson method. Both names are in common
use, so we opt for the shorter one.


10.1 Nonsingular Endpoints

Assume that x is an isolated nonsingular solution of H(z, 0) = 0, i.e., that H(x, 0) =


0 and that the Jacobian dH/dz is an invertible matrix at (x, 0). Then we know that
applying Newton's method to H(z,0) = 0 starting at any point (x',0) sufficiently
near x will converge quadratically to (x,0). Given a homotopy continuation path
z(t) with limt^o z(t) = x, the usual prediction-correction methods described in
§ 2.3 work well. The final prediction to t = 0 provides the initial guess (x′, 0) for
Newton's method.
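As a small illustration (a toy two-equation system, not from HOMLAB), the quadratic convergence that makes Newton's method so effective at a nonsingular endpoint can be seen directly: starting from a rough prediction, each correction roughly doubles the number of correct digits.

```python
# Newton's method on a toy square system with a nonsingular root at (1, 1):
#   F1(x, y) = x^2 + y - 2,  F2(x, y) = x + y^2 - 2.
# The Jacobian at the root, [[2, 1], [1, 2]], is invertible.
def newton_step(x, y):
    f1, f2 = x**2 + y - 2.0, x + y**2 - 2.0
    j11, j12, j21, j22 = 2.0 * x, 1.0, 1.0, 2.0 * y   # Jacobian entries
    det = j11 * j22 - j12 * j21
    dx = (-f1 * j22 + f2 * j12) / det                 # solve J*[dx,dy] = -F
    dy = (-f2 * j11 + f1 * j21) / det
    return x + dx, y + dy

x, y = 1.2, 0.9                      # rough prediction of the endpoint
errors = []
for _ in range(5):
    x, y = newton_step(x, y)
    errors.append(max(abs(x - 1.0), abs(y - 1.0)))

assert errors[0] < 0.05              # one correction already helps
assert errors[-1] < 1e-12            # a few corrections reach full precision
```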
Usually, it is not difficult to decide that the limit (x, 0) is a nonsingular solution.
Convergence itself is a good indicator that the solution is nonsingular, but a surer
test is to examine the condition number of the Jacobian at the endpoint. If the
solution converges, as indicated by a small step in the final Newton iteration, and the
condition number is mild, then the solution can be confidently declared nonsingular.
Because convergence behavior and the condition number can be affected by poor
scaling and high degrees, the definition of "mild" is problem dependent. Histograms
of condition numbers for all the solutions of a problem can be very useful in such
judgements, as described further in Chapter 11. If the solution does not converge
well, the condition number computed at the solution estimate might not accurately
reflect the condition number at the true solution. Because of this, one cannot
confidently tell the difference between a cluster of nonsingular roots, each having a
rather high condition number, and an inaccurate estimate of a true multiple solution.
One way to clarify the situation is to increase the digits of accuracy of the com-
putation. If the solution is truly nonsingular, then a sufficient level of accuracy
will eventually reveal this. One can even apply interval arithmetic (see § 6.1) to
obtain proof that a suspected nonsingular solution really is nonsingular. However,
one cannot prove a solution is singular in this way; that is, higher accuracy arith-
metic applied to a truly singular solution will increase the condition number at the
estimated solution, but it likely will never show exact singularity. The interval eval-
uation of the Jacobian matrix will show that a singular matrix is within the bounds
of the computation, but that does not prove singularity. One must finally stop at
some level of accuracy and accept the judgement that the solution is singular to
that level of approximation. This moves us into the realm of singular endgames,
discussed next.
As a practical matter, it is not necessary to determine if a solution is singular
or not to estimate it well. This is because the singular endgames that follow work
equally well on nonsingular endpoints. Thus, to keep a computer code simpler, one
may apply the singular endgame to all paths and judge singular vs. nonsingular
afterwards, according to the results.
Endpoint Estimation 179

10.2 Singular Endpoints

When the endpoint of a solution path is singular, there are several approaches that
can improve the accuracy of its estimate. All the singular endgames are based
on the fact that the homotopy continuation path z(t) approaching a solution of
H(z, 0) = 0 as t → 0 lies on a complex algebraic curve containing (x, 0). In this
section we collect the facts that follow from this and underpin the methods. In
particular, we will see that the methods become valid only after the path z(t) has
been tracked into an "endgame operating zone" around t = 0. For very singular
endpoints, this operating zone may only be reached by increasing the number of
digits used. In § 10.4, we discuss in fuller detail what happens if one computes an
estimate while still outside of this operating zone.
Since this chapter is about local behavior of holomorphic functions, our homo-
topies H(z,t) will usually only need to be assumed holomorphic and not algebraic.
In § 10.2.1, we collect all the assumptions we use in one place.

10.2.1 Basic Setup


Assume that H(z, t) : U × Δ → C^N is a holomorphic function with U ⊂ C^N an open
neighborhood of a point x ∈ C^N and Δ ⊂ C an open neighborhood of 0 ∈ C. We
assume that there is a one-dimensional complex analytic set X ⊂ U × Δ such that

(1) (x, 0) ∈ X;
(2) X ⊄ U × {0};
(3) X is an irreducible component of

    {(z, t) ∈ U × Δ | H(z, t) = 0},

    i.e., X is the closure of a connected component of the set of points of X with
    neighborhoods biholomorphic to an open set of C; and
(4) the projection π : U × Δ → Δ restricts to a proper holomorphic surjection
    π_X : X → Δ with π_X⁻¹(0) = (x, 0).
At first sight this seems like a large number of assumptions that might be difficult
to check! The crucial observation is that all the polynomial homotopies H(z, t) = 0
considered in this book fall into this setup. Indeed, if we are tracking a path z(t)
starting at a nonsingular root z(1) of H(z, 1) = 0 and are trying to estimate the
root of H(z, 0) = 0 as z(t) → x := z(0), then the path is part of a one-dimensional
irreducible component X̂ of

    {(z, t) ∈ U × Δ | H(z, t) = 0}.

By choosing small enough neighborhoods U and Δ, restricting H(z, t) to U × Δ,
and taking X to be the irreducible component of X̂ ∩ (U × Δ) containing the path,
all the hypotheses of the basic setup are satisfied.

In simpler terms, we know that the paths in our polynomial homotopies remain
nonsingular for t ∈ (0, 1], so each path is one-dimensional and makes a steady
advancement as t goes to zero. The defining equations for the homotopy are all poly-
nomial, so the path is a complex analytic set. This is the essence of the conditions
stated above as applied to polynomial continuation.

10.2.2 Fractional Power Series and Winding Numbers


We have the following consequence of Corollary A.3.3. Recall that Δ_r(a) ⊂ C means
the disk of radius r centered on a.

Lemma 10.2.1 Assume that we are in the basic setup above. There is a neigh-
borhood V ⊂ X of (x, 0) ∈ X, a positive number r > 0, and a holomorphic mapping
φ : Δ_r(0) → V with φ(0) = (x, 0) and the composition t = π_X(φ(s)) = s^c.

Proof. Apply Corollary A.3.3 with the function g of that result set equal to t. □

We call c the winding number of X at (x, 0). Given an isolated solution (x, 0) of
H(z, 0) = 0, there is a positive ε ∈ R such that for 0 < t < ε, H(z, t) = 0, considered
as a system in z, has only nonsingular solutions in the vicinity of (x, 0). From this, it
follows that the multiplicity of the solution as a solution of H(z, 0) = 0 is the sum of
the winding numbers of the one-dimensional irreducible components of the solution
set of H(z, t) at (x, 0). The nonsingularity condition is satisfied automatically for
many algebraic systems.
Note that since the components zᵢ(φ(s)) are holomorphic functions of s, they
can be expressed as convergent power series in s. We can consider these as fractional
power series in t^{1/c}.
For the above representation of the components of z(t) to hold we must be within
a disk

    Δ_{r_c} := {t ∈ C | |t| < r_c},

such that X ∩ π⁻¹(Δ_{r_c}) has either no branch point (in which case c = 1) or a branch
point at (x, 0). We refer to r_c as the endgame convergence radius.
A good way to visualize the situation is to consider what happens when we track
a solution path as t circles the origin in the complex (Argand) plane at a real radius
r, say as t = re^{√−1 θ} as θ goes from 0 to 2π. We start at z₀ satisfying H(z₀, r) = 0
and follow the path implicitly defined by H(z, re^{√−1 θ}) = 0. For example, the reader
may think about H(z, t) = z² − t(η − t) with η a small positive number. For almost
all r, paths satisfying the basic setup above will remain nonsingular as we continue
around such a circle, returning at the end of the loop either to z₀ again or to a
distinct nonsingular solution z₁. For the example H(z, t) = z² − t(η − t), paths
will remain nonsingular except for r = η, and we will go from z₀ = √(t(η − t)) to
z₁ = −√(t(η − t)) or to z₁ = √(t(η − t)) depending on whether r < η or r > η. We

may then proceed around the circle again and again to return to solutions z₂, z₃, ...
Since there are only a finite number of nonsingular solutions, after some number of
such loops, the solution path must return to the original point; that is, for some k,
we have z_k = z₀. In the example H(z, t) = z² − t(η − t), k = 2 or k = 1 depending
on whether r < η or r > η. Considering this whole process again at a slightly
smaller radius r′, we generally expect the same picture again, meaning that we get
a sequence of return points z′₀, z′₁, ..., z′_k with z′_k = z′₀ and z′ᵢ, i = 1, ..., k, being
the continuation of zᵢ as t goes from r to r′ on the real line. However, there may
be exceptional values r* of r where at least one of the loops hits a singularity, thus
breaking continuity. In the example, this value is η. Stepping across this value,
the return sequence may change such that the ith return values zᵢ and z′ᵢ for r and
r′ with r > r* > r′, i.e., on opposite sides of the exceptional value, are no longer
joined by continuation of t from r to r′ in the reals. The value of k that closes the
sequence may change as well. The endgame convergence radius r_c is the smallest
such exceptional value of r*: for all smaller radii, the return map remains stable
and the winding number c of the path is the value of k in this range.
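The looping experiment can be sketched in a few lines (a minimal sketch, with H(z, t) = z² − t standing in for the homotopy, so the only branch point is t = 0 and the winding number is c = 2): track z while t circles the origin, correcting with Newton's method at each small step.

```python
import cmath

# Track z(t) satisfying z^2 - t = 0 as t = R*e^{i*theta} loops the origin.
def track_loop(z, R, loops, steps=400):
    for k in range(1, loops * steps + 1):
        t = R * cmath.exp(2j * cmath.pi * k / steps)
        for _ in range(3):            # Newton correction: z <- z - H/H_z
            z = z - (z * z - t) / (2 * z)
    return z

R = 0.01
z0 = cmath.sqrt(R)                    # start on one branch, z0 = 0.1
z1 = track_loop(z0, R, loops=1)       # one loop lands on the other branch
z2 = track_loop(z0, R, loops=2)       # two loops close the path: z2 = z0
assert abs(z1 + z0) < 1e-8            # z1 = -z0
assert abs(z2 - z0) < 1e-8            # winding number c = 2
```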

Remark 10.2.2 For simplicity we have slid over questions about whether you
can indeed choose small enough open sets so that we can decompose the solution
set of H(z, t) = 0 in a neighborhood of a solution (x, 0) into components so that for
the one-dimensional components, we have the desired uniformization result. The
language of germs is the way to gently deal with these issues in a rigorous manner.
We have included a short introduction to germs in § A.3.

10.3 Singular Endgames

For a singular endpoint, Newton's method applied to solve H(z, 0) = 0 is no longer


satisfactory for several reasons. First, Newton's method loses its quadratic con-
vergence at a singularity, and in some circumstances, it may even diverge. Sec-
ond, the prediction along the incoming path may give a poor initial guess, which
exacerbates the problem of slow convergence. Finally, while the endpoint of the
continuation path is well defined in the limit, the path might very well end on a
positive-dimensional solution set of H(z, 0) = 0, so unconstrained Newton iterations
may wander along this set rather than give us the endpoint we desire. All of this is
to say that to deal with singular solutions, we need a strategy different than the one
we described above for nonsingular endpoints. We call such a strategy a singular
endgame.
All singular endgames estimate the endpoint at t = 0 by building a local model
of the path inside the endgame convergence radius. The overwhelming problem is
that the paths approaching singular solutions of a system approach their limit very
slowly. To deal with this, we wish to sample the path as close as possible to t = 0,
but numerical ill-conditioning precludes accurate computation too near t = 0. This

leads to the idea of an endgame operating zone, described next.

10.3.1 Endgame Operating Zone


For a fixed precision of arithmetic, there is typically some small zone around t = 0
inside which a path with a singular endpoint cannot be numerically tracked within
a prescribed accuracy of the true path. Since the endgame can only work inside the
convergence radius, this leaves an annular endgame operating zone, as illustrated
in Figure 10.1.

Fig. 10.1 Endgame operating zone

The endgame operating zone can be empty in the case that the ill-conditioned
zone is larger than the convergence radius. However, whereas the convergence radius
is completely defined by the homotopy, the size of the ill-conditioned zone is not.
It depends on the precision of the arithmetic, so it can be made smaller by using
more digits. Roughly speaking, if we wish to estimate the endpoint with k digits
of accuracy, then we need to sample the path with k digits of accuracy also. Let
10^C denote the condition number of the Jacobian J(z, t) of H(z, t) with respect to
the z-variables and some fixed norm. When we do a correction step of Newton's
method we solve the equation J(z, t)δz = −H(z, t). Here we lose roughly C digits
of accuracy. Computing with d digits of precision, we need d − C > k for success.²
By increasing d, one may effectively shrink the ill-conditioned zone. With enough
²This analysis of Newton's method is very rough, as the iterative nature of the method can
correct some errors. It would be closer to the truth to say that Newton's method converges
quadratically only to k < d − C digits, but even that is a rough generalization. Our comments are
meant to give a correct general picture without a complicated analysis.

digits, one can ensure that the endgame operating zone is not empty.
Once inside the endgame operating zone, we can sample the path just for real
t or we can sample for complex t in the zone. For a given precision of arithmetic,
better accuracy in the estimate is achieved by sampling for complex t.
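The rule of thumb d − C > k can be illustrated with a small experiment (a sketch; a deliberately ill-conditioned 2 × 2 matrix stands in for the Jacobian J(z, t), and the values are arbitrary):

```python
# Solving the Newton correction J*dz = r loses roughly C digits when
# cond(J) ~ 10^C.  Here J is nearly singular, with condition number on
# the order of 4/eps ~ 4e8, i.e., C ~ 8 of the ~16 double-precision digits.
eps = 1e-8
J = [[1.0, 1.0],
     [1.0, 1.0 + eps]]
dz_true = [1.0, 1.0]
r = [J[0][0] * dz_true[0] + J[0][1] * dz_true[1],     # r = J * dz_true
     J[1][0] * dz_true[0] + J[1][1] * dz_true[1]]

det = J[0][0] * J[1][1] - J[0][1] * J[1][0]           # Cramer's rule, 2x2
dz = [(r[0] * J[1][1] - r[1] * J[0][1]) / det,
      (J[0][0] * r[1] - J[1][0] * r[0]) / det]

err = max(abs(dz[0] - dz_true[0]), abs(dz[1] - dz_true[1]))
# err is typically around 1e-8: only about d - C = 16 - 8 digits survive
assert err < 1e-4
```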

10.3.2 Simple Prediction


The simplest approach is to track the path as close to t = 0 as possible using
extended precision to get the same accuracy as a nonsingular root. Let us analyze
a simple example to see what happens.
Assume we were trying to solve z^c = 0 for some integer c > 1 using the homotopy
H(z, t) = z^c − t = 0. Note in this special case of solving a one variable complex
polynomial, the condition number of J(z, t) is 1. So we can track with precision
on the same order as the number of digits, i.e., k = d. If we follow the path z(t)
with z(1) = 1, our path is then z(t) := t^{1/c}, but in practice we do not know the
path explicitly, but must track it. Assume we have tracked the path t^{1/c} + e(t), where
e(t) is a random error of size O(10^{−k}). Once t is of the same order as e(t), path
crossing will likely happen. So we cannot track for t beyond R ≈ 10^{−k}. In this case
we have an estimate 10^{−k/c} for the solution. This is not very good. For example,
with c = 5 and 15 digits of precision, we get 10^{−3} as an estimate for the solution 0.
If we wanted 10 digits of accuracy, we could achieve this by using this method
and 50 digits of precision.
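A back-of-the-envelope version of this example, assuming double precision (roughly 15 digits):

```python
# For H(z, t) = z^5 - t the path is z(t) = t^(1/5).  With k = 15 digits,
# tracking stops near t ~ 1e-15, so the simple-prediction estimate of the
# root z = 0 carries only about k/c = 3 correct digits, as in the text.
c, k = 5, 15
R = 10.0 ** (-k)              # roughly the smallest reachable t
estimate = R ** (1.0 / c)     # value of the path at t = R
assert abs(estimate - 1e-3) < 1e-6
```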

10.3.3 Power-Series Method


The simple prediction approach of § 10.3.2 can easily be improved. The idea here
is to estimate the winding number c and then approximate the map φ : Δ_r(0) →
C^N × C of Lemma 10.2.1.
There are different schemes to achieve this. We begin by tracking a solution
path z(t) from t = 1 down to t = R for some R ∈ (0, 1). We then collect samples
of z(t) by continuation from t = R to use in fitting a power series to φ(s), where
t = s^c. There is a separate power series for each component of z.

Assume for the moment that we know c and that t = R is inside the endgame
operating zone. We choose some number of points s₁, ..., s_K in the s-disk, such that
each sᵢ is inside the endgame operating zone, and find the values z(sᵢ) by
continuation. At each such point, we can compute derivatives. If we compute the
first kᵢ derivatives at a particular sᵢ, then sᵢ is equivalent to kᵢ + 1 points without
derivatives when determining the order of the power series we can compute. That
is, for each j = 1, ..., N, we have a polynomial p_j(s) of degree (Σᵢ₌₁ᴷ (kᵢ + 1)) − 1
approximating φⱼ(s) and satisfying p_j^{(ν)}(sᵢ) = φⱼ^{(ν)}(sᵢ) for i from 1 to K and for ν
from 0 to kᵢ. The standard error estimate (Theorem 3.6.1 Davis, 1975) tells us the
error of the approximation of p_j(0) to φⱼ(0) is O(∏ᵢ₌₁ᴷ |sᵢ|^{kᵢ+1}). For brevity below,
we shall say this is an Mth order fit, where M = (Σᵢ₌₁ᴷ (kᵢ + 1)) − 1.

This leaves open two questions that must be answered in order to deploy the
method:
• How do we find the endgame operating zone so that we can sample within it?
• How do we determine the winding number c?

The only practical approach to finding the endgame operating zone seems to
be adaptive trial-and-error. Suppose we fix a pattern of the sample points,
{as₁, ..., as_K}, where a is a scaling factor for shrinking the sample pattern around
the origin. Typically, as₁ is real and we arrive at it by tracking t in (0, 1). The
remaining sample points may be real or complex, but either way, we evaluate z at
them by continuation. We may execute the endgame repeatedly for a geometrically
decreasing set of scalings aⱼ = λʲ for some fixed real number λ ∈ (0, 1), say λ = 0.3.
When successive estimates of the endpoint agree to some pre-specified tolerance,
we declare the method a success and stop. If this tolerance is never satisfied, we
stop when the scaling gets so small that we can no longer accurately track paths
due to the ill-conditioning near t = 0. If this happens, we must report that the tol-
erance was not met and return as our best estimate the one for which the smallest
successive difference was found.
There are several good ways to determine $c$. One is to directly measure the winding number by tracking a circular path, $t = Re^{\sqrt{-1}\theta}$, until the path closes up at $\theta = 2\pi c$ with $c$ a positive integer, i.e., with $z(Re^{2\pi c\sqrt{-1}}) = z(R)$. If $R$ is inside the endgame operating zone, then $c$, the number of loops around the origin necessary to close the path, is the winding number. As always, there is the numerical problem of deciding when two approximate numbers, $z(Re^{2\pi c\sqrt{-1}})$ and $z(R)$, are equal. This is the same as the problem of needing to keep the allowed error in our tracking small enough that we do not have path crossing.
A less computationally-expensive method for small c is to note that since c is
an integer, we can quickly test small values of c, say, from 1 to 4, for consistency
with a power-series fit to an oversampled data set. Such a data set can be obtained
with less path-tracking than would be required to find the winding number by path
closure. A method for determining c and estimating z(0) is as follows.
(1) Use continuation to collect sample values of $z(t)$ for $t = t_1, \ldots, t_K, t_{K+1}$.
(2) For $c = 1, \ldots, c_{\max}$, do the following.
(a) Transform the sample points into the $s$-plane, using $s_i = t_i^{1/c}$. The continuation path in $t$ determines the proper matching angle of each $s_i$, that is, if $t_i = Re^{\sqrt{-1}\theta}$ for $R \in (0,1)$, then $s_i = R^{1/c} e^{\sqrt{-1}\theta/c}$, taking $R^{1/c}$ in the reals.
(b) Derivatives with respect to $t$ at the sample points must also be converted to derivatives with respect to $s$ using the value of $c$, e.g., $dz/ds = (dz/dt)(dt/ds) = (dz/dt)\,c s^{c-1}$.
(c) Fit an $M$th-order power series, $\phi_c(s)$, to the samples at $s_1, \ldots, s_K$, as described above.
(d) Calculate the prediction error at the extra sample point as $e_c = \|\phi_c(s_{K+1}) - z(t_{K+1})\|$.
(3) Use the $c$ that gives the smallest prediction error $e_c$ as the estimate of the winding number, so $\phi(s) = \phi_c(s)$. Estimate the path endpoint as $z(0) = \phi(0)$.
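A simplified sketch of this procedure in Python (values only, no derivative matching; $z(t)$ below is a hypothetical stand-in for continuation samples, with true winding number $c = 3$):

```python
import numpy as np

# Sketch of steps (1)-(3): for each candidate c, map samples to the
# s-plane, fit a polynomial, and keep the c that best predicts an
# extra held-out sample.
z = lambda t: 1.0 + t**(1.0/3.0) + t**(2.0/3.0)   # stand-in path function

K, cmax = 5, 4
t = 0.1 * 0.25**np.arange(K + 1)      # t_1, ..., t_K and an extra t_{K+1}
errs = {}
for c in range(1, cmax + 1):
    s = t**(1.0/c)                    # real samples: take R^(1/c) real
    V = np.vander(s[:K], K, increasing=True)
    a, *_ = np.linalg.lstsq(V, z(t[:K]), rcond=None)
    errs[c] = abs(np.polyval(a[::-1], s[K]) - z(t[K]))
c_best = min(errs, key=errs.get)
print(c_best)   # 3
```

For the correct $c = 3$, the function becomes an exact polynomial in $s$, so its prediction error drops to roundoff while the other candidates retain fractional-power error.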
When used in conjunction with the adaptive method of determining the endgame operating zone, one often observes that $c = 1$ gives the best prediction when the path is far outside the convergence radius. As the path is tracked into the operating zone, $c$ settles into the correct value. This is because the order of the prediction error for an incorrect value $c'$ of the winding number is $O(t^{1/c})$, whereas for the correct value it is $O(t^{M/c})$.
One way to collect samples is in a geometric sequence along the reals: $(t_0, t_1, t_2, \ldots) = (R, \lambda R, \lambda^2 R, \ldots)$ for some $\lambda \in (0,1)$. Using $z$ and $dz/dt$ at two successive values $t_i$ and $t_{i+1}$, one may make a cubic prediction of the next value at $t_{i+2}$. A nice feature of this sampling pattern is that it advances by adding just one sample point to the sequence, reusing the last two points of the previous sample. That is, at one iteration we use samples at $(t_0, t_1, t_2)$ and at the next $(t_1, t_2, t_3)$. Such a geometric sequence can be used to determine the winding number without trial and error. The value $z(t)$ is approximately
$$z(t) = z(0) + a t^{1/c} + \text{higher-order terms},$$
where $a$ is the first coefficient in the fractional power series. Thus, $z(R) - z(\lambda R) \approx a(1 - \lambda^{1/c}) R^{1/c}$ and $z(\lambda R) - z(\lambda^2 R) \approx a(1 - \lambda^{1/c}) \lambda^{1/c} R^{1/c}$, and so
$$\frac{z(\lambda R) - z(\lambda^2 R)}{z(R) - z(\lambda R)} \approx \lambda^{1/c}.$$
Since we know $\lambda$, this can be used to estimate $c$, keeping in mind that $c$ is a positive integer. This method can fail when $a$ is zero or small, so that the first nonconstant term in the power series is order $t^{2/c}$ or higher. A method that attempts to deal with such subtleties is described in (Huber & Verschelde, 1998) (see also (Verschelde, 2000)). We shall not pursue this further here.
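In code, the ratio estimate looks like this (with a hypothetical path function in place of continuation samples):

```python
import numpy as np

# Estimate the winding number c from three geometric samples using
# (z(lam*R) - z(lam^2*R)) / (z(R) - z(lam*R)) ~ lam^(1/c).
# z is a stand-in for path samples: z(t) = z(0) + a*t^(1/c) with c = 3.
z = lambda t: 2.0 + 1.5 * t**(1.0/3.0)

R, lam = 1e-3, 0.3
ratio = (z(lam*R) - z(lam**2 * R)) / (z(R) - z(lam*R))
c_est = int(round(np.log(lam) / np.log(ratio)))
print(c_est)   # 3
```

Because $c$ must be a positive integer, rounding the estimate makes the method robust to the higher-order terms omitted here.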
As we approach t = 0, we can expect the predictions of the power series to be
quite accurate. Accordingly, we may use it in place of the linear predictor in the
predictor-corrector path tracker when collecting new samples. Of course, one should
use the current best estimate of c at each stage, which may change as the endgame
proceeds. Even when c is not correct, because the path has not fully entered the
endgame operating zone, the best estimate for c obtained by the above method will
generally be better than just assuming c = 1.
A final variation on the power-series method is worth mentioning. Once the endgame operating zone is entered, it is valuable to quickly gather more samples to raise the order of the prediction. This allows the process to converge to full accuracy at larger values of $t$, before the ill-conditioned zone is encountered. Suppose we have sampled along real $t \in (0, R)$ and the prediction of $\phi_c(s)$ gives an accurate estimate of $z(t_{K+1})$ in step 2d. Then we may try to predict across the origin in $s$ and use Newton's method to refine samples there. It is particularly convenient to gather a symmetric sample set $-s_1, -s_2, \ldots, -s_K$, because the odd-powered terms in the power series for $\psi(s) = (\phi(s) + \phi(-s))/2$ drop out, while $\psi(0) = \phi(0) = z(0)$. Consequently, with a change of variables to $w = s^2$, we can estimate $z(0)$ with an $M$th order power series for $\psi(w)$ that is the same as a $(2M+1)$th order power series for $\phi(s)$.
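A small numerical check of the symmetric-sample trick, with an illustrative $\phi$:

```python
import numpy as np

# Averaging phi(s) and phi(-s) cancels the odd-powered terms, so the
# even part psi can be fit with a short power series in w = s^2.
phi = lambda s: 1.0 + s + s**2 + s**3          # illustrative path function
s = 0.2 * 0.6**np.arange(4)                    # symmetric pattern uses +/- s
psi = 0.5 * (phi(s) + phi(-s))                 # = 1 + s^2, odd terms gone
w = s**2
coef = np.polyfit(w, psi, 1)                   # a linear fit in w suffices
estimate = np.polyval(coef, 0.0)
print(estimate)                                # phi(0) = 1
```

A degree-1 fit in $w$ here captures what would require a degree-3 fit in $s$, which is the point of the change of variables.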
For double precision arithmetic and samples on the real line in t, experience has
shown that there is little profit in attempting the use of winding numbers greater
than four or five. For higher precision arithmetic, this limit can be extended. The
problem is that an $M$th-order power series in $s$ corresponds to a power series in $t$ of order only $M/c$. To get a good estimate, we will need a large value of $M$ and a numerically stable method of computing the estimate $\phi(0)$ without finding all $M + 1$ coefficients of the power series. The Cauchy integral method of the next section provides this.

10.3.4 Cauchy Integral Method


The Cauchy integral method is based on the use of the Cauchy Integral Theorem to estimate the solution of $H(z,t) = 0$ by $\phi(0)$, where $\phi : \Delta_r(0) \to \mathbb{C}^N \times \mathbb{C}$ is as in Lemma 10.2.1.
As in the power-series method, we first track $z(t)$ until $t = R$. We then track $z(Re^{\sqrt{-1}\theta})$ as $\theta$ varies, to both determine the winding number $c$ and to collect samples around this circular path.
Letting $s$ denote the coordinate of $\Delta_r(0)$, we have $t = s^c$, and $z_i = \phi_i(s)$ for $i = 1, \ldots, N$ with the sought-after solution given by $(z_1, \ldots, z_N) = (\phi_1(0), \ldots, \phi_N(0))$. The Cauchy Integral Theorem gives

$$\phi_i(0) = \frac{1}{2\pi\sqrt{-1}} \int_{\{s \in \mathbb{C}\,\mid\,|s| = R^{1/c}\}} \frac{\phi_i(s)}{s}\, ds. \qquad (10.3.1)$$

In terms of $\theta$ and $z(Re^{\sqrt{-1}\theta})$ we get the vector integral

$$z(0) = \frac{1}{2\pi c} \int_0^{2\pi c} z(Re^{\sqrt{-1}\theta})\, d\theta.$$
Because of periodicity, an excellent method to evaluate this integral is the trapezoid method, e.g., (Eq. (3.3.4) Stoer & Bulirsch, 2002). This method yields an estimate of $z(0)$ with error of the same magnitude as the error with which we know the sample values $z(Re^{\sqrt{-1}\theta})$.
As in the power-series method, we can benefit from choosing a special sample set. If $M + 1$ points around the circle are sampled at equal angles, $s_k = R^{1/c} e^{2\pi\sqrt{-1}\,k/(M+1)}$, then the trapezoid method gives exactly the average of the sample points:
$$z(0) \approx \frac{1}{M+1} \sum_{k=0}^{M} \phi(s_k).$$
Moreover, it is easily shown that this is the same result as would be obtained from a power series fit to the same points.
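This averaging identity is easy to check numerically: for any polynomial of degree at most $M$, the mean of $M + 1$ equally spaced circle samples is exactly the constant term (the coefficients below are illustrative):

```python
import numpy as np

# With M+1 samples equally spaced on a circle in the s-plane, the
# trapezoid rule for the Cauchy integral is just the average of the
# samples, and it returns the constant term of a degree-M polynomial.
a = np.array([1.7, -0.3, 2.0, 0.5, -1.1])      # phi(s) = a0 + a1 s + ... + a4 s^4
M = len(a) - 1
s = 0.1 * np.exp(2j * np.pi * np.arange(M + 1) / (M + 1))
samples = np.polyval(a[::-1], s)
estimate = samples.mean()                      # trapezoid rule on the circle
print(estimate)                                # a0 = 1.7 (up to rounding)
```

The identity holds because the $k$th powers of the $(M+1)$st roots of unity sum to zero for $1 \leq k \leq M$.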
The success of the Cauchy integral method depends on finding an appropriate
radius for the circular sample. As in the power-series method, we do not know a
priori the convergence radius. The most practical recourse is to discover it adap-
tively, by trying the method at geometrically decreasing radii. Convergence may
then be judged by agreement in winding number and endpoint estimate between
successive trials.

10.3.5 The Clustering or Trace Method


This last approach is based on the trace, see § 15.5. Assume that we have a number of paths $z_i(t)$ converging to what appears to be the same solution $z^*$ of the system $H(z,t) = 0$ that we want to solve. Denote the paths as $w_1(t), \ldots, w_m(t)$. We have a finite number of one-dimensional irreducible analytic sets $X_1, \ldots, X_k$ passing through a small neighborhood of $(z^*, 0)$. We assume that the projections to the $t$-axis $\pi_i : X_i \to \mathbb{C}$ are proper for all $i$ when restricted to $\pi^{-1}(\Delta_r(0))$. This will be true for some $r > 0$. Each map $\pi_i|_{\pi^{-1}(\Delta_r(0))}$ has a well-defined degree, $d_i$. We have for any of the algebraic homotopies we construct in this book that $d_1 + \cdots + d_k = m$. For $0 < t < r$, we have that $w_1(t) + \cdots + w_m(t)$ is a sum of the traces of the coordinate vector $z$ with respect to the different maps $\pi_i$. Since these traces are holomorphic functions of $t$ bounded in absolute value for $0 < |t| < \epsilon$ and all sufficiently small $\epsilon > 0$, e.g., see § 15.5 or (Appendix Morgan, Sommese, & Wampler, 1992a), this sum extends to a holomorphic function $\operatorname{tr}(t)$ for $t \in \Delta_r(0)$. We are in a situation similar to the situation with the power series method of § 10.3.3, but simpler since we do not worry about $c$. This method predicts the value $\operatorname{tr}(0)/m$ for $z^*$. This prediction is a prediction for the average of the limit points
$$\frac{w_1(0) + \cdots + w_m(0)}{m}.$$
Each of the $w_i(t)$ has a fractional power series, but their sum is holomorphic, that is, it has a power series with integer exponents. Thus, we may conveniently estimate $z(0)$ by fitting an integer-exponent power series to the average of the $w_i(t)$.
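A small sketch of the idea on the illustrative homotopy $z^2 - 2tz - t = 0$ (not one of the book's examples), whose two paths $w(t) = t \pm \sqrt{t^2 + t}$ share the endpoint $z^* = 0$ with winding number 2:

```python
import numpy as np

# The two paths each have a fractional power series in t, but their
# average equals t exactly, which is holomorphic, so an integer-exponent
# fit to the average predicts z* = 0 accurately.
t = 1e-2 * 0.5**np.arange(6)
w_plus = t + np.sqrt(t**2 + t)
w_minus = t - np.sqrt(t**2 + t)
avg = 0.5 * (w_plus + w_minus)                           # equals t
trace_pred = np.polyval(np.polyfit(t, avg, 2), 0.0)
naive_pred = np.polyval(np.polyfit(t, w_plus, 2), 0.0)   # ignores c = 2
print(trace_pred, naive_pred)
```

The integer-power fit of the average lands on the endpoint to roundoff, while the same fit applied to a single path, which behaves like $\sqrt{t}$, misses badly.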
The main difficulty with this method is determining which solutions are converging to the same endpoints. The difficulty arises because the estimate of the endpoints of the individual paths is inaccurate unless the winding number is employed in the estimation. Only the average endpoint is well-behaved (holomorphic), not the individual paths.

A lesser disadvantage of the trace method is that to sample the average solution path, one averages the values of the individual paths at each sample point. This means that the individual paths must be sampled at the same points. Hence, the processing of the paths becomes coupled, whereas the power series or the Cauchy integral can be applied to each endpoint independently.

10.4 Losing the Endgame

It may happen that the endgame is applied outside the endgame convergence radius,
either because there are insufficient digits to track within that radius or because
the endgame zone is not identified correctly. It is natural to ask what happens in
such a circumstance.
When there is a tight cluster of distinct solutions, the precision of the arithmetic
must be high enough not only to distinguish between them, but also to track paths
accurately near them. If the cluster is too tight in comparison to the precision of
the arithmetic, the end of path tracking, and hence the application of the endgame,
will occur outside the radius of convergence. There is the stability question of
whether the methods will compute some sort of average of some of the solutions of
the cluster. The methods do, in fact, have good stability properties, which hold in
a larger range than the endgame operating range.
The setup is that we have a holomorphic function, $H(z, t) : \mathbb{C}^N \times U \to \mathbb{C}^N$, where $0 \in U \subset \mathbb{C}$. Let $\pi : \mathbb{C}^N \times U \to U$ be the product projection. Of course, in practice this is our homotopy. We are trying to solve $H(z, 0) = 0$.
We have introduced three interrelated methods. The Cauchy integral method
and the power series method are the most accurate. The clustering method of
§ 10.3.5 is less accurate but clearly fails gracefully: it gives the weighted average of
roots of the cluster.
The full gamut of possible behaviors of the methods when we are not in the
endgame operating region is not clear, but we can get some idea of the behavior
from the following examinations.
Consider the simple example on $\mathbb{C}^2$
$$H(z,t) = z^2 - t^2 - \epsilon^2 = 0.$$
If we track down to $t = R$ and $R < \epsilon$, then we are in the endgame operating region. If $R > \epsilon$, we are not. Let's see what we end up computing. The solution set $\mathcal{R}$ of $H(z, t) = 0$ over $\Delta_R(0)$ is a Riemann surface that can be shown to be biholomorphic to some annulus. The important point is that $\pi^{-1}(\{t \in \mathbb{C} \mid |t| = R\})$ is the union of two disjoint circles $C_1$ and $C_2$.
Applying the Cauchy integral method we end up evaluating
$$\frac{1}{2\pi} \int_0^{2\pi} \sqrt{R^2 e^{2\sqrt{-1}\theta} + \epsilon^2}\; d\theta,$$
with a choice of one of the two branches of the square root. If $R < \epsilon$, the Cauchy integral method yields the roots $\pm\epsilon$ depending on the choice of the branch. If $R > \epsilon$ we get a function dependent on $R$. This integral is an elliptic integral, but for explicit values of $R$ and $\epsilon$ it is easy to evaluate numerically.
Fixing $\epsilon = 10^{-7}$ and $R = 10^{-5}$ we get $0.64 \cdot 10^{-5} - 0.50 \cdot 10^{-7}\sqrt{-1}$, which does not compare favorably with the actual roots $\pm 10^{-7}$. Indeed, the error $0.63 \cdot 10^{-5}$ is two orders of magnitude larger than the root. Since the Cauchy integral method applied to an approximating polynomial gives the value at the origin of the approximating polynomial, we see that choosing interpolation points on the circle $C_1$ or $C_2$, the power series method will yield answers identical to the Cauchy integral method.
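A quick numerical evaluation of this integral is sketched below, taking numpy's fixed principal branch of the square root as the branch choice (an assumption; a different branch convention would change the result). It reproduces the real part of roughly $0.64 \cdot 10^{-5}$, far from the true roots $\pm 10^{-7}$:

```python
import numpy as np

# Trapezoid evaluation of (1/(2 pi)) * integral of
# sqrt(R^2 e^(2 i theta) + eps^2) d theta over [0, 2 pi], using numpy's
# principal branch of the complex square root. With R = 1e-5 well outside
# the endgame zone (eps = 1e-7), the estimate is ~0.64e-5, two orders of
# magnitude larger than the roots +/- 1e-7.
eps, R, n = 1e-7, 1e-5, 20000
theta = 2 * np.pi * np.arange(n) / n
val = np.mean(np.sqrt(R**2 * np.exp(2j * theta) + eps**2))
print(val)
```

The estimate is dominated by $R$, not $\epsilon$, illustrating how badly the endgame can fail when applied outside its operating zone.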
It is important to realize that the trace method is not better than the power-series or Cauchy integral method. Indeed, if we chose the paths $w_1(t), \ldots, w_m(t)$ apparently converging to a common root as in the trace method, and applied the power series or Cauchy integral method to all the points and summed, we would get the same sort of answer as in the trace method. Let's see this precisely for the Cauchy integral method, realizing, as noted above, that this implies the analogous statement for the power series method using interpolation points on the curves over the circle $|t| = R$.
We assume that over some small disk, $\Delta_R := \{t \in \mathbb{C} \mid |t| < R\}$, of radius $R$ around 0, with $\bar\Delta_R \subset U$, the set $H^{-1}(0) \cap \pi^{-1}(\Delta_R - 0)$ is a one-dimensional analytic set with closure $X$ in $\Delta_R \times \mathbb{C}^N$ such that $\pi_X : X \to \Delta_R$ is proper. This is phrased this way to allow the possibility that there is a positive dimensional analytic solution set in the fiber over 0. By definition, proper means that the inverse image of any compact set is compact. One significance of properness for a holomorphic map is that the map has a well-defined sheet number on each irreducible component of $X$, e.g., see Corollary A.4.15. As mentioned previously, these conditions are satisfied for all of our homotopies. We are not assuming that we are in the endgame convergence radius. Theoretically this means that we do not necessarily have a map $\phi$ as in § 10.2.1. We still have the normalization mapping $\nu : \mathcal{R} \to X$, which for curves is the most classical special case of Theorem A.4.1. Here $\mathcal{R}$ is a smooth curve (a Riemann surface in the terminology of complex analysis), $\nu$ is proper, and for a finite set of points $B \subset \Delta_R$, the map $\nu$ restricted to $\mathcal{R}$ minus the finite set $\pi^{-1}(B)$ is a biholomorphism onto its image. When we are in the endgame convergence radius, $\mathcal{R}$ is a disk and $\nu$ is $\phi$.
Since $\pi_X$ extends to a neighborhood of $X$, $\nu$ extends to $\bar\nu : \bar{\mathcal{R}} \to X$, where $\bar{\mathcal{R}}$ is a Riemann surface with boundary a union of circles, i.e., $\partial\mathcal{R} := \bar{\mathcal{R}} - \mathcal{R}$ is a union of disjoint smooth connected curves $C_1, \ldots, C_L$ for some integer $L \geq 1$.
Now the Cauchy integral method (Morgan et al., 1991) that we are using starts with a point $p_0 \in \partial\mathcal{R}$ and follows its continuation $p$ as $\pi(p)$ goes around the circle, $\{t \in \mathbb{C} \mid |t| = R\}$, $c$ times, where $c$ is the minimum positive number of times it is necessary to go around the circle, $\{t \in \mathbb{C} \mid |t| = R\}$, until $p$ returns to $p_0$. Note $p$ traces out a connected component, $C_i$, of $\partial\mathcal{R} = \cup_j C_j$ containing $p$. We let $c_i$ denote the cycle number associated to the curve $C_i$. In analogy with the cluster method we compute the integral
$$\frac{1}{2\pi\sqrt{-1}\,c} \int_{C_i} z\,\frac{dt}{t}, \qquad (10.4.2)$$
where, abusing notation, we let $z : \bar{\mathcal{R}} \to \mathbb{C}^N$ be the vector of coordinate functions on $\mathbb{C}^N$ pulled back to $\bar{\mathcal{R}}$.
$\mathcal{R}$ is a noncompact Riemann surface and $\bar{\mathcal{R}}$ is a compact Riemann surface with boundary a finite number of circles, such that $\partial\mathcal{R} = \bar{\mathcal{R}} - \mathcal{R}$, i.e., $\mathcal{R}$ is the set of interior points of $\bar{\mathcal{R}}$. We assume that $\pi : \mathcal{R} \to \Delta_R$ is a proper holomorphic map from $\mathcal{R}$ to the disk, $\Delta_R := \{t \in \mathbb{C} \mid |t| < R\}$, of radius $R$ around 0, and that $\pi$ extends to a differentiable, finite to one map, $\pi_{\bar{\mathcal{R}}} : \bar{\mathcal{R}} \to \bar\Delta_R$. We let $p_i$ for $i$ in a finite set $I$ denote the distinct points in $\pi^{-1}(0)$. We let $n_i$ denote the multiplicity of the $p_i$ as a zero of the holomorphic function $\pi$.
The following consequence of Stokes Theorem will let us work out estimates for
the effect of branch points on the Cauchy integral method.

Lemma 10.4.1 Let $\pi$, $\mathcal{R}$, $\partial\mathcal{R}$, $n_i$ be as above. Let $z : \bar{\mathcal{R}} \to \mathbb{C}^N$ be a holomorphic map. Let $\{p_i \mid i \in I\}$ be the set of points, $p_i$, in the set, $\pi^{-1}(0)$, with multiplicities, $n_i$. Then letting $c = c_1 + \cdots + c_L$:
$$\frac{1}{2\pi\sqrt{-1}\,c} \int_{\partial\mathcal{R}} z\left(\frac{dt}{t}\right) = \sum_{i \in I} \frac{n_i}{c}\, z(p_i).$$
Note $c = \sum_{i \in I} n_i$, but the $n_i$ and the $c_j$ can be different (though each $n_i$ is a sum of a subset of the $c_j$).
This consequence of Cauchy's integral theorem is left to the reader as an exercise.

Corollary 10.4.2 If $C_i = \partial\mathcal{R}$, then the equation (10.4.2) computes the average of $c$ (counting multiplicities) solutions of $H(z, 0) = 0$.

10.5 Deflation of Isolated Singularities

Endpoints of homotopy solution paths can be divided into two types: isolated solutions and points on positive dimensional solution sets. We say that $z^* \in \mathbb{C}^N$ is an isolated root of $f(z) = 0$, $f(z) : \mathbb{C}^N \to \mathbb{C}^N$, if for a small enough positive $\epsilon \in \mathbb{R}$, the ball $B_\epsilon(z^*) \subset \mathbb{C}^N$ defined by $B_\epsilon(z^*) = \{z \in \mathbb{C}^N \mid |z - z^*| < \epsilon\}$ contains no other root of $f(z) = 0$ besides $z^*$. Isolated singular roots can be computed accurately without resorting to the kinds of singular endgames we have discussed above. This is brought about by a symbolic reformulation of the equations so that $z^*$ becomes a nonsingular root of the new system.
Before describing the method, let us review some facts about the behavior of
Newton's method near an isolated root. If z* is a nonsingular root, that is, if the
Jacobian matrix df/dz(z*) is nonsingular, then it is well-known both that z* is
isolated and that Newton's method converges quadratically to z* when initialized
from any point close enough to it. In most cases, but not all, Newton's method
will also converge for isolated singular roots, but convergence will be slower and the
final accuracy lower than for nonsingular roots.
An illustration of a system for which Newton's method fails near an isolated root is (Griewank & Osborne, 1983)
$$(29/16)x^3 - 2xy = 0,$$
$$x^2 - y = 0. \qquad (10.5.3)$$

No matter how close one starts to the multiplicity-three isolated root at the origin,
(x,y) = (0,0), Newton's method diverges. See (Griewank & Osborne, 1983) for
more on how Newton's method behaves near such irregular singular roots. The
system of Equation 10.5.3 is very special in the sense that if the coefficient (29/16)
is changed to a generic value, Newton's method converges even though the origin
remains a root of multiplicity three. However, we do not wish to depend on this
kind of genericity, as we may indeed be given a system with an irregular singularity.
Moreover, even when Newton's method converges, its behavior may not be satisfactory. For a root of multiplicity $\mu > 1$, its rate of convergence is only linear and the function must be evaluated with precision $\mu$ times greater than the accuracy desired in the estimated root. To be precise, consider a single polynomial $f(z) : \mathbb{C} \to \mathbb{C}$ with a root $z^*$ of multiplicity $\mu > 1$. Denoting the $k$th iterate of Newton's method as $z_k$, we have the iteration formulae
$$\Delta z_k = -f(z_{k-1})/f'(z_{k-1}), \qquad z_k = z_{k-1} + \Delta z_k.$$
Let $\epsilon_k := z_k - z^*$ be the error between the $k$th iterate and the true value $z^*$. If the sequence of iterates converges to $z^*$, then it obeys the following relation in the limit
$$\epsilon_{k+1} = \frac{\mu - 1}{\mu}\,\epsilon_k + o(\epsilon_k).$$
(A simple demonstration of this result can be found in (Ojika et al., 1983).) So for $\mu > 1$, the convergence rate is linear with geometric ratio $(\mu - 1)/\mu$. For $\mu = 1$, convergence is quadratic, a much faster process.
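The linear rate $(\mu - 1)/\mu$ is easy to see for the model polynomial $f(z) = z^\mu$, where a Newton step is exactly $z \mapsto z(1 - 1/\mu)$:

```python
# Newton's method on f(z) = z^mu, a root of multiplicity mu at the
# origin: the error shrinks by exactly (mu - 1)/mu per step, i.e.,
# linear convergence with that geometric ratio.
mu = 4
z = 0.5
ratios = []
for _ in range(10):
    z_new = z - (z**mu) / (mu * z**(mu - 1))   # equals z * (1 - 1/mu)
    ratios.append(z_new / z)
    z = z_new
print(ratios[-1])   # (mu - 1)/mu = 0.75
```

For $\mu = 4$ the error drops by only a factor of 3/4 per step, so about 80 iterations are needed per factor of $10^{10}$, versus a handful for a quadratically convergent iteration.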

10.5.1 Polynomials in One Variable


How can we restore quadratic convergence for roots with multiplicity greater than one? For a polynomial in one variable, this is rather simple. By Theorem 5.1.2, we know that a multiplicity $\mu$ root of $f(z)$ is a multiplicity one root of $f^{(\mu-1)}(z)$. Suppose we begin by solving $f(z)$ by a homotopy method, and we observe that $\mu$ roots are approaching a common endpoint. Then, we may switch to solving $f^{(\mu-1)}(z)$, initializing Newton's method using the estimated singular endpoint of the first stage.
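A sketch of the maneuver on an illustrative polynomial (not from HOMLAB): $f(z) = (z-1)^3(z+2)$ has a multiplicity-3 root at $z = 1$, which is a simple root of $f''$, so Newton's method on $f''$ converges quadratically from a rough estimate.

```python
import numpy as np

# Deflation for one variable: differentiate mu - 1 = 2 times and run
# Newton on the result, where the clustered root becomes simple.
f = np.array([1.0, -1.0, -3.0, 5.0, -2.0])     # z^4 - z^3 - 3z^2 + 5z - 2
f2 = np.polyder(np.polyder(f))                  # f'' = 12z^2 - 6z - 6
z = 1.3                                         # rough endpoint estimate
for _ in range(20):
    z = z - np.polyval(f2, z) / np.polyval(np.polyder(f2), z)
print(z)   # 1.0 to full precision
```

Running Newton directly on $f$ from the same start would instead creep toward $z = 1$ with geometric ratio $2/3$.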
While it is clear that in theory this deflation maneuver is valid, one might wonder if it is numerically stable. The polynomial that we solve in floating point arithmetic is only an approximation to an exact polynomial and so the multiplicity $\mu$ root of the exact polynomial will appear to have a cluster of $\mu$ roots for the numerical polynomial. How does this cluster behave under differentiation? Can we be sure that $f^{(\mu-1)}(z)$ has a root in the vicinity of the cluster of roots of $f(z)$? As detailed in (Sommese, Verschelde, & Wampler, 2004d), the answer depends on the degree $d$ of $f(z)$ and the distribution of its roots. Let $z^*$ be the centroid of a cluster of $\mu$ roots inside a disk $\Delta_\rho(z^*)$ of radius $\rho$ centered on $z^*$, and let $R$ denote the distance from $z^*$ to the nearest root outside that disk. Then, a condition of the form
$$\frac{R}{\rho} \geq \text{(an explicit factor depending on $d$ and $\mu$)} \qquad (10.5.4)$$
is a conservative estimate that guarantees that $f^{(k)}(z)$, for all $k \leq \mu - 1$, has exactly $\mu - k$ zeros in $\Delta_\rho(z^*)$. Even if the root is truly a multiple root due to the structure of the equations, at any finite level of precision in floating point, it will likely become a cluster of roots. However, the higher the precision, the tighter the cluster, and so beyond some precision, the cluster radius $\rho$ will become small enough that condition 10.5.4 will be satisfied and the deflation maneuver will succeed.
This does not resolve the question of deciding whether a given polynomial has an
exact multiple root or it has a cluster of closely-spaced roots. As we have indicated
before, this is not a question that can be resolved in favor of a multiple root using
floating point arithmetic. If it is a cluster, a high enough level of precision will
reveal it, but if it is a true multiple root, only exact arithmetic can prove it.

10.5.2 More than One Variable


It is natural to consider how to generalize the approach for one variable to systems of equations in several variables. The following formulation is based on (Leykin et al., 2004), which in turn was motivated by (Ojika et al., 1983; Ojika, 1987) (see also (Lecerf, 2002)). Assume that $f(z) : \mathbb{C}^N \to \mathbb{C}^N$ is a polynomial system with an isolated singular root $z^*$. Denote its $N \times N$ Jacobian matrix as $J(z) := \partial f/\partial z$. At the singular root, $J(z^*)$ will have rank $r < N$. This implies that the matrix equation $J(z^*)v = 0$ has a linear solution set for $v \in \mathbb{P}^{N-1}$ of dimension $N - r - 1$. We can pick out a unique point of this null set in $\mathbb{P}^{N-1}$ by appending $N - r - 1$ homogeneous linear equations and dehomogenize by appending one more inhomogeneous equation. Equivalently, we can pick a random $r$-dimensional linear space to intersect the null space in a point in $\mathbb{C}^N$, that is, pick $v_0, \ldots, v_r \in \mathbb{C}^N$ at random and set $v = v_0 + \sum_{i=1}^r \lambda_i v_i$, with unknowns $\lambda_1, \ldots, \lambda_r \in \mathbb{C}$. Combining this condition with the system $f(z)$, we have $2N$ equations in $N + r$ unknowns
$$g(z, \lambda) = \begin{pmatrix} f(z) \\ J(z)\left(v_0 + \sum_{i=1}^r \lambda_i v_i\right) \end{pmatrix} = 0, \qquad (10.5.5)$$
where $\lambda = (\lambda_1, \ldots, \lambda_r)$. An initial guess for $\lambda$ can be found by standard linear algebra applied to $J(z)$ at the estimated value of $z^*$ coming from the solution of system $f(z) = 0$.
The system of Equation 10.5.5 has more equations than unknowns. It can be
reduced to square using a randomization procedure (see § 13.5), but this is not
necessary. We are only seeking a local solution, not forming a global homotopy, so it
suffices to use Gauss-Newton iteration. This is identical to Newton's method except
that the overdetermined iteration step is solved by least-squares (pseudoinversion).
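As a toy illustration (not one of the book's examples), take $f(z) = (x^2, y^2)$, whose Jacobian has rank $r = 0$ at the multiplicity-four root $(0,0)$. Then $v = v_0$ with no $\lambda$ unknowns, the deflated system has $2N = 4$ equations in $N + r = 2$ unknowns, and Gauss-Newton converges rapidly:

```python
import numpy as np

# One deflation step for f = (x^2, y^2): append J(z) v0 for a random v0
# (fixed here for reproducibility) and solve the overdetermined system
# g(z) = 0 by Gauss-Newton, i.e., Newton with least-squares steps.
v0 = np.array([0.7, -1.2])

def g(z):
    x, y = z
    return np.array([x**2, y**2, 2*x*v0[0], 2*y*v0[1]])

def Jg(z):
    x, y = z
    return np.array([[2*x, 0.0], [0.0, 2*y], [2*v0[0], 0.0], [0.0, 2*v0[1]]])

z = np.array([0.3, -0.2])          # rough endpoint estimate
for _ in range(10):
    step, *_ = np.linalg.lstsq(Jg(z), -g(z), rcond=None)
    z = z + step
print(z)   # ~ (0, 0)
```

Plain Newton on $f$ alone would only halve the error per step here; the appended Jacobian rows restore fast convergence to the singular root.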
Let $(z^*, \lambda^*)$ denote the solution of $g(z, \lambda) = 0$ that uniquely projects to the solution $z^*$ of $f(z) = 0$. It is not immediately clear that the multiplicity of $(z^*, \lambda^*)$ as a solution of $g(z, \lambda) = 0$ will be lower than that of $z^*$ as a solution of $f(z) = 0$, but a proof of this is given in (Leykin et al., 2004), subject to the assumption that $z^*$ is an isolated solution of $f(z) = 0$. To desingularize an isolated root of multiplicity $\mu \geq 2$, deflation may need to be applied multiple times. Indeed, in the case of $N = 1$, a single polynomial, the foregoing is exactly the same as the differentiation approach discussed in Subsection 10.5.1, where we saw that $\mu - 1$ deflation steps are required. In the general case, the statement is that at most $\mu - 1$ deflation steps are required. The fewer deflations required, the better, as each one adds more variables.
The deflation process is local in the sense that different singular points of the
same system may have different deflations. The singularities may differ not only in
their multiplicities, but also in the rank of the Jacobian at each stage of deflation.
An analysis of the numerical properties of deflation is not yet developed for the
multivariate situation: there are no known formulae analogous to Equation 10.5.4
for the univariate case. Experiments reported in (Leykin et al., 2004) indicate that
the approach is effective for a number of test cases having isolated singularities.
In several variables, there is an additional concern that does not arise for just
one variable. This is the possibility of positive dimensional solution sets. Deflation
is only valid for isolated roots. This is a big drawback, because we have no clear way
of deciding which singular endpoints in a homotopy are isolated and which ones are
landing on positive dimensional sets. This issue will be treated further in Part III,
where we consider the treatment of positive dimensional solutions. The frequent
appearance of positive dimensional solution sets, especially at infinity, means that
we cannot depend on deflation alone: the general purpose singular endgames remain
necessary if we wish to find all path endpoints accurately.
10.6 Exercises

Exercise 10.1 (Power-Series Method) The power-series endgame is implemented in HOMLAB using samples for real $t$ only. The control variables are set in htopyset.m and are described in § C.7.1. Systems of the form
$$xy - 1 = 0, \qquad (x+1)^k = 0,$$
have a multiplicity $k$ root at $(x, y) = (-1, -1)$.
• A total-degree homotopy has 2k paths. We have identified (—1,-1) as a mul-
tiplicity k root. Analytically determine the endpoints of the other k paths.
• For k=3, solve the problem with HOMLAB by writing the system in tableau
form and using the script totdtab to solve it with a total-degree homotopy.
Does it give the result you expect?
• Try similar problems for k = 2,3,4,... How high can you go and get good
endpoint estimates? Pay attention to the setting of CycleMax.
• The default setting is allowjump=l, which causes the endgame to also collect
sample points for negative values of s by predicting across the ill-conditioned
zone at the origin. Compare the performance of the endgame for allowjump=0
versus allowjump=l. You may set global verbose=l to get intermediate results
from the endgame, see § C.7.2.

Exercise 10.2 (Power Series Error Analysis) There are two sources of numer-
ical error in the estimate produced by the power series method: truncation error
due to the order of the fit and amplification in the fitting process of errors in the
sample points. Formulate the fitting process as the solution of a linear system whose
unknowns are the coefficients of the power series:
$$\phi(s_i) = [\,1 \;\; s_i \;\; s_i^2 \;\cdots\; s_i^M\,]\,[\,a_0 \;\; a_1 \;\; a_2 \;\cdots\; a_M\,]^T$$
for $i = 1, \ldots, M + 1$. We may write this in matrix form as
$$\Phi = S a, \qquad (10.6.6)$$
where $\Phi$ is the column of sample values, $S$ is the Vandermonde matrix whose $(i,j)$th element is $s_i^{j-1}$, and $a$ is the column of power series coefficients. The final estimate will be $\phi(0) = a_0$. The condition number of the Vandermonde matrix affects how errors in the samples $\phi(s_i)$ are transmitted to the estimate $a_0$.
• For the same order fit, M, compare the condition number for the following
sample patterns:
(1) a geometric sequence $s_i = R, \lambda R, \lambda^2 R, \ldots$ for various $\lambda$,
(2) a symmetric, two-sided geometric sequence $s_i = \pm R, \pm\lambda R, \pm\lambda^2 R, \ldots$,
(3) the transformation of the two-sided sample set to fit a power series in $w = s^2$,
(4) a circular sample, $s_i = Re^{\sqrt{-1}\,2\pi i/(M+1)}$.


• How are the numerics affected by rescaling the fitting as
$$\phi(s_i) = [\,1 \;\; \rho s_i \;\; (\rho s_i)^2 \;\cdots\; (\rho s_i)^M\,]\,[\,a_0 \;\; a_1/\rho \;\; a_2/\rho^2 \;\cdots\; a_M/\rho^M\,]^T$$
with $\rho = 1/R$?
• What sample pattern is best for a thin endgame operating zone, characterized
by having an ill-conditioned region almost as big as the convergence radius?
• Give two reasons why the Cauchy integral method is a good approach for end-
points with large winding numbers.

Exercise 10.3 (Circular Sample Sets)


• For an evenly-spaced circular sample set, $s_i = Re^{\sqrt{-1}\,2\pi i/(M+1)}$, find the sums $\sum_{i=1}^{M+1} s_i^k$ for $k = 0, 1, 2, \ldots, \infty$.
• Show how this implies equivalence between the trapezoid rule for the Cauchy
integral on evenly spaced circular samples and the power series fit to those
points.
• Show how this also implies that the average of all paths approaching the same
endpoint is a holomorphic function (given by a power series with nonnegative
integer exponents) of the path parameter t. Consider that there can be several
subgroups of paths approaching the same endpoint, each subgroup having its
own winding number.
• Let $S$ be the Vandermonde matrix, as in Equation 10.6.6, formed for the evenly-spaced circular sample. What is $S^{-1}$?

Exercise 10.4 (Multiprecision) (open research topic: see (Bates, Sommese, &
Wampler, 2005b)) The control settings for the endgame in HOMLAB reflect the
fact that Matlab computes in double precision. How should these be changed if
multiprecision arithmetic were available? If the precision of the arithmetic could be
changed at will during the endgame, how should the endgame algorithm best use
this capability?

Exercise 10.5 (Deflation 1) The system


$$x^2 + y^2 = 0, \qquad x^2 - y^2 = 0$$
has a multiplicity four isolated root at $(x, y) = (0, 0)$. Show that one stage of deflation gives a nonsingular system defining the root.

Exercise 10.6 (Deflation 2) Do the following for Griewank and Osborne's sys-
tem of Equation 10.5.3.
• Formulate Newton's method and experimentally observe that initial guesses
near (0,0) diverge.
• Use HOMLAB to solve the system with the power-series endgame and observe
the winding number of the origin (suggestion: use totdtab.m).
• Use deflation to obtain a new system for which the origin is a nonsingular
solution.
• How many stages of deflation are required? How many variables does the final
system have?
Chapter 11

Checking Results and Other Implementation Tips

This is a very short chapter to help those who might try to create their own con-
tinuation codes. These tips can also be useful in getting more secure results when
using an existing code.
Since continuation is a floating point numerical process, there is the possibility
of several kinds of failure. The first step in correcting a failure is recognizing that
it has happened. Sophisticated codes detect some failures automatically and take
corrective action. Whether done automatically or manually, the basic techniques
are similar.

11.1 Checks

There are two kinds of checks: local checks examine an endpoint in isolation using
numerical analysis of the iterative method used in the endgame, whereas global
checks use knowledge of the polynomial nature of the problem, primarily the fact
that we expect to find all isolated solutions.
If the path tracker fails mid-course, that fact should be flagged and a corrective
action taken. See § 11.2 below.

11.1.1 Endpoint Quality Measures


Any numerical solution method should provide some measures of the quality of the
solutions it produces. Let us assume we are solving the square system f(x) = 0
and x* is an estimate. An entire treatise could be written on how to analyze the
accuracy of x*, but we will be very brief and simply list some useful indicators:
Function Residual The size of the function value, |f(x*)|. This measure is af-
fected by the scaling of the function, that is, if g(x) = 100 f(x), then |g(x*)|
gives a 100 times worse function residual than |f(x*)|, even though the error in
the solution is the same. Even so, this gives a first look at whether the solution
has been successfully computed.
Newton Residual If we are using Newton's method to refine the endpoint, the
magnitude of the last step, |f_x(x*)^{-1} f(x*)|, is a good estimate of the distance
between x* and an actual zero of f(x), providing that the Jacobian matrix is
nonsingular.
Endgame Residual If the endpoint is singular, the methods of Chapter 10 are
preferred to Newton's method. Typically, the method is performed several
times for successively smaller values of t as t → 0. The distance between
successive endpoint approximations, |x_i* - x_{i-1}*|, replaces the Newton residual
as an accuracy estimate.
Condition number The condition number of the Jacobian matrix, κ(f_x(x*)), is
a good measure of how singular the solution is. Using the 2-norm, it is the ratio
of the largest to smallest singular value of the matrix. A large condition number
indicates singularity. However, the value can be near one for a near-rank-zero
matrix, having all singular values small, although these appear only rarely in
practice. To signal these, the largest singular value, or any other matrix norm,
|f_x(x*)|, can be useful as an auxiliary measure. If one finds the complete list
of singular values, say σ_1 ≥ σ_2 ≥ ··· ≥ σ_n, a sequence with a precipitous drop
in magnitude between two successive singular values is a clearer indication of
singularity than a sequence that declines gradually, even if the condition number
σ_1/σ_n is the same.
Homogeneous coordinate If using the projective transformation, as we highly
recommend, a small magnitude of the homogeneous coordinate defining the
hyperplane at infinity indicates a solution at or near infinity. If we are using
multihomogenization, the smallest magnitude of any of the homogeneous co-
ordinates defining the hyperplanes at infinity is the one to consider. Solutions
at infinity are often singular, so a rather small homogeneous coordinate along
with a rather large condition number is a good indicator that both conditions
may hold, even when either indicator by itself is not extreme enough to be
thoroughly convincing.
For easy human understanding, it is usually best just to report all of these
measures by their decimal exponents, e.g., as log10(|f(x*)|). Just two or three
decimal places usually suffice.
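A few norms and one singular value decomposition give all of these measures. The sketch below is our own minimal Python illustration, not a HOMLAB routine; the example system and its root are hypothetical stand-ins.

```python
import numpy as np

def endpoint_quality(f, jac, x):
    """Return log10 of the basic endpoint quality measures at an estimate x."""
    fx = f(x)
    J = jac(x)
    func_res = np.linalg.norm(fx)                        # function residual |f(x*)|
    newton_res = np.linalg.norm(np.linalg.solve(J, fx))  # Newton step |f_x(x*)^-1 f(x*)|
    s = np.linalg.svd(J, compute_uv=False)               # singular values, descending
    cond = s[0] / s[-1]                                  # 2-norm condition number
    return {k: float(np.log10(max(v, 1e-300)))           # clip so log10(0) is finite
            for k, v in [("func", func_res), ("newton", newton_res), ("cond", cond)]}

# Hypothetical example: f = (x^2 + y^2 - 2, x - y) has a nonsingular root at (1, 1).
f = lambda v: np.array([v[0]**2 + v[1]**2 - 2.0, v[0] - v[1]])
jac = lambda v: np.array([[2.0 * v[0], 2.0 * v[1]], [1.0, -1.0]])
q = endpoint_quality(f, jac, np.array([1.0, 1.0]))
```

At the exact nonsingular root, the function and Newton residuals are tiny while the condition number stays modest, which is the "clean" pattern described above.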
In a "clean" run, all roots have a small endpoint residual, and there are no roots
in the gray zone between singular vs. nonsingular or finite vs. at infinity. It is not so
safe a practice to set fixed tolerances for the classification of roots, as, for example,
ill-conditioning is exacerbated by high-degree equations. It is more instructive to
view histograms of the various measures, in which case a gap of several orders of
magnitude between singular vs. nonsingular or finite vs. at infinity gives a rather
secure picture, whilst a smear of values gives no clear indication where to draw
boundaries. It is essential that these histograms be compiled for the logarithms
of the magnitude of the measures. The exercises explore the use of histograms,
including two-dimensional ones that categorize finiteness and singularity in the same
chart.
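As a toy illustration of reading such a histogram programmatically, the following sketch (our own, not a routine from the book's software) places a classification boundary at the largest gap in the sorted log10 values:

```python
import numpy as np

def gap_threshold(log_measures):
    """Boundary at the largest gap in sorted log10 measures.

    Returns (threshold, gap_width); a wide gap supports a secure
    classification, while a smear of values gives only a narrow gap."""
    v = np.sort(np.asarray(log_measures, dtype=float))
    gaps = np.diff(v)
    i = int(np.argmax(gaps))
    return 0.5 * (v[i] + v[i + 1]), float(gaps[i])

# Hypothetical log10 condition numbers from a run: a nonsingular cluster
# near 2 and a singular cluster near 12, separated by many orders of magnitude.
logs = [1.8, 2.1, 2.3, 2.0, 11.7, 12.2, 12.0]
threshold, gap = gap_threshold(logs)
```

Here the gap spans roughly nine orders of magnitude, so drawing the singular/nonsingular boundary near its midpoint is quite safe.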

It is typical that nonsingular solutions will attain very small Newton residuals,
while the accuracy of singular ones will depend on the multiplicity of the root.
Without a singular endgame, a double root usually attains only about half the
accuracy of a nonsingular one. If the condition number is high enough (and we have
taken care that the bad conditioning is not due to poor scaling of the equations),
we can be relatively secure in classifying the root as singular and, if we are only
looking for the nonsingular roots, it can be discarded. It is more satisfying, of
course, to invoke a singular endgame and clean up the solution, if possible. Also,
higher-precision arithmetic can be invoked to clarify the situation.

11.1.2 Global Checks


In addition to the measures above, which are computed for each endpoint separately,
there are some checks that depend on the patterns of roots in the computed solution
set. These are tied to the polynomial character of the problem.

Path Crossing Check By using random complex numbers in our formulations,


we ensure that, with probability one, the solution paths do not cross in the
middle of the homotopy; only at the end might they merge together in a singu-
larity. However, if two paths become sufficiently close, it is possible for the path
tracking algorithm to jump from one to the other while still staying within the
tracking tolerances. Thus, it is a good idea to stop at some small t and check
if all the solutions are still distinct. That is, we pick some small t_e ∈ (0, 1) and
do the tracking in two phases: first from t = 1 to t = t_e, then from t_e to 0.
(A value of t_e = 0.1 is typical.) If two solution estimates at t_e are very close,
this indicates that the tracker jumped paths. Re-running just those paths with
tighter tracking tolerance usually corrects the error.
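A minimal version of such a check is a pairwise comparison of the path estimates at t_e. The sketch below is our own illustration, not the pathcros routine itself:

```python
import numpy as np

def crossed_pairs(points, tol=1e-8):
    """Pairs (i, j) of solution estimates that coincide to within tol."""
    pts = [np.asarray(p) for p in points]
    return [(i, j)
            for i in range(len(pts))
            for j in range(i + 1, len(pts))
            if np.linalg.norm(pts[i] - pts[j]) < tol]

# Hypothetical estimates at t = t_e for three paths; paths 0 and 2
# have jumped onto the same track and should be re-run.
estimates = [np.array([1.0 + 2.0j, -1.0 + 0.0j]),
             np.array([0.5 + 0.5j, 3.0 + 0.0j]),
             np.array([1.0 + 2.0j, -1.0 + 1e-12j])]
suspects = crossed_pairs(estimates)
```

A production code would use a spatial data structure to avoid the quadratic cost, but for typical path counts the direct comparison is adequate.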
Multiplicity Check If one uses the power-series endgame of § 10.3.3, an estimate
of the winding number, c, is obtained for each endpoint, and this implies that in
the neighborhood of t = 0, this path is part of a cluster of c paths approaching
the same endpoint. Since we are tracking a complete set of solution paths, all c
of them should be found. It is possible for more than one cluster to approach the
same endpoint, so the check is to see if the total number of solutions approaching
the same endpoint are compatible with the winding numbers assigned to them.
Examples of valid clusters of winding numbers are {2, 2} (one cluster with
winding number 2); {2, 2, 2, 2} (two clusters, each with c = 2); and {2, 2, 3, 3, 3}
(two clusters, with winding numbers 2 and 3). One could go further
to extract not just the endpoint of a path, i.e., the constant term of the power
series, but also the next term in the power series to match endpoints into
clusters. The Cauchy integral method of § 10.3.4 gives an even stronger check
for matching up paths: each time the path tracker circles around the origin
without returning to the original point generates another point in the cluster.
We can check for the existence of such points in the incoming solutions.
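The counting part of this check is simple: each cluster of winding number c contributes exactly c paths reporting that c, so among the paths sharing an endpoint, the number reporting winding number c must be a multiple of c. A sketch of our own:

```python
from collections import Counter

def winding_consistent(cycles):
    """cycles: winding numbers reported by the paths sharing one endpoint.

    Each cluster of winding number c supplies exactly c such paths, so the
    count of paths reporting c must be a multiple of c."""
    return all(count % c == 0 for c, count in Counter(cycles).items())

ok_pair = winding_consistent([2, 2])            # one cluster of two paths
ok_mixed = winding_consistent([2, 2, 3, 3, 3])  # clusters with c = 2 and c = 3
bad = winding_consistent([2, 3])                # a c = 2 path missing its partner
```

A failure here means either a path was lost or a winding number was misestimated, and the offending paths should be re-run.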

Multiple Run Comparisons If one runs the same problem two or more times
with different choices for the random constants, the same results should be
obtained. This principle can be invoked at several levels.
• In a homotopy of the form h(z, t) = γ t g(z) + (1 - t) f(z) (see Theorem 8.3.1
for details), this means using a different value for γ. Then, one should
obtain the exact same list of path endpoints, because although the tracking
path has changed, it is a real-one-dimensional curve inside the same complex
curve and its destination point is the same. The association of start points
to endpoints likely will be permuted, however. If the endpoints from two
such runs cannot be sorted to match up, then one or both are in error, and
one can concentrate path re-runs on those paths whose endpoints have no
match in the other set.
• A stronger test than the above is to change the start system to another
in the same class. The start systems described in Chapter 8 all contain
random constants which can be reset to new values. Two such runs should
have the same set of nonsingular endpoints, which can be compared. The
singular endpoints will typically move, but usually these are not of primary
interest.
• For a parameterized family of systems, F(z; q) = 0, using the notation
of Chapter 7, one may solve two instances for different, randomly chosen,
values of the parameters q. The number of nonsingular roots should be
constant, but, of course, their values will change. To cross check them, one
can track paths from one to the other in a parameter homotopy F(z; t q_1 +
(1 - t) q_2) = 0.
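Comparing the endpoint lists of two runs means matching two unordered point sets up to a permutation. The sketch below is our own illustration of such a comparison:

```python
import numpy as np

def unmatched_endpoints(run_a, run_b, tol=1e-6):
    """Greedily pair endpoints of run_a with unused endpoints of run_b;
    returns indices of run_a endpoints with no partner within tol."""
    unused = list(range(len(run_b)))
    leftovers = []
    for i, p in enumerate(run_a):
        hit = next((j for j in unused
                    if np.linalg.norm(np.asarray(p) - np.asarray(run_b[j])) < tol),
                   None)
        if hit is None:
            leftovers.append(i)
        else:
            unused.remove(hit)
    return leftovers

# Two hypothetical runs with different gamma: same endpoints, permuted order.
run1 = [np.array([1.0, 2.0j]), np.array([-1.0, 0.5])]
run2 = [np.array([-1.0, 0.5]), np.array([1.0, 2.0j])]
missing = unmatched_endpoints(run1, run2)
```

Any index returned points at an endpoint with no counterpart in the other run, and those are the paths on which to concentrate re-runs.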

11.2 Corrective Actions

Points with good quality measures at t = t_e and which pass the path-crossing test
are ready for the endgame. Those which fail on either count should be re-run from
the beginning, t = 1 to t = t_e, with different path-tracking parameters.
Paths that fail in one endgame might benefit from another. For example, the
power-series endgame in double precision is only effective up to c = 4, while the
Cauchy integral endgame has no such limit. But ultimately, the only way to compute
some difficult endpoints is to increase the precision of the arithmetic. We briefly
address these two issues next.
How much extra effort should be devoted to corrective actions depends on one's
aims. In an engineering problem, one might not care much about lost solution
paths. This is especially true if the trouble is due to a nearly singular endpoint,
as it may likely be useless for practical purposes anyway. However, if one is doing
an initial run to solve a random-parameter example in preparation for repeated
parameter continuations, then one wants to ensure that a full solution set has been
found. This is because there is no way to predict which of these starting solutions
will lead to the desired answers in a subsequent application.

11.2.1 Adaptive Re-Runs


We saw in Chapter 2 that path tracking benefits greatly from using an adaptive
step size in place of a fixed one. In a similar way, the remaining heuristic con-
trol parameters, such as the path tracking tolerance, can be made adaptive. Too
small a path-tracking tolerance makes progress slow, while too large allows path
crossing. This works hand in hand with the number of iterations allowed in each
corrector step.
For concreteness, let's say that the path-tracking tolerance is 10^-4 and we allow
up to three iterations in the corrector. Then, a path-crossing incident is often
cleared up by decreasing the tracking tolerance to 10^-6, and if not, try decreasing
the iterations allowed to just two. (We have found such settings effective when
using double precision on systems of low-degree equations.) These kinds of re-run
strategies are easily automated so that human intervention is not necessary.
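Such an automated re-run schedule might be sketched as follows; track_path is a hypothetical stand-in for the path tracker, and the settings mirror the double-precision, low-degree values suggested above:

```python
def rerun_until_clean(path_id, track_path):
    """Re-run a flagged path with progressively stricter settings.

    track_path is a hypothetical stand-in for the tracker; it returns
    True when the path is tracked successfully."""
    schedule = [
        {"tol": 1e-4, "corrector_iters": 3},   # default settings
        {"tol": 1e-6, "corrector_iters": 3},   # tighten the tolerance first
        {"tol": 1e-6, "corrector_iters": 2},   # then allow fewer iterations
    ]
    for settings in schedule:
        if track_path(path_id, **settings):
            return settings
    return None  # escalate: higher precision, smaller minimum step, etc.

# Toy tracker that succeeds only once the tolerance reaches 1e-6.
winning = rerun_until_clean(7, lambda pid, tol, corrector_iters: tol <= 1e-6)
```

Returning None signals that the schedule is exhausted and a stronger remedy (Section 11.2.3) is needed.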
As tighter tolerances are set, it may be necessary to decrease the minimum step
size allowed and increase the number of steps allowed, if such constraints are in
place to cut off expensive paths. This presumes, of course, that one is willing to
pay the extra computational cost to get the answer.
If one is planning a large run with a path count on the order of 100,000 or more,
it can be worthwhile to collect run statistics on perhaps 1% of the paths and make
adjustments in the tracking parameters. Once the initial 1% runs well, the entire
run can be launched with confidence, although automatic adaptive re-runs should
be left in place.

11.2.2 Verified Path Tracking


Instead of controlling tracking by a tracking tolerance, one can instead use interval
arithmetic (see § 6.1) to guarantee that the solution estimate stays in a unique
convergence zone throughout, thereby having absolute assurance that path crossing
cannot occur (Kearfott & Xing, 1994). This tends to give conservative step sizes,
so it can be very expensive.

11.2.3 Multiple Precision


There are several difficult situations that may arise that are most simply resolved
with multiple-precision arithmetic. One is the case of a generally ill-conditioned
target system, which can be due to high degree equations or coefficients with widely
different scales. Sometimes, high degree is the result of applying elimination to an
initial system having many equations of lower degree, in which case it might be
better to solve the initial system rather than the reduced one. For systems with
wide-ranging coefficients, such as the chemical systems presented in § 9.2, a scaling
algorithm can help. But for some systems, there is no practical recourse except
raising the precision of the arithmetic.
A common situation is the existence of singular endpoints. As illustrated in
Figure 10.1, the endgame operating zone is a disk minus an ill-conditioned region
near t = 0. It can happen, especially for endpoints of high multiplicity, that the
ill-conditioned region takes up a large portion (or all) of the convergence disk, thus
preventing the endgame from succeeding. If the desired accuracy is held constant,
higher-precision arithmetic shrinks the ill-conditioned zone and allows the endgame
to succeed. (If one deploys multiple precision and makes the accuracy requirement
more stringent simultaneously, the latter may cancel the former so that there is no
net gain.)
A final possibility is the phenomenon of path crossing. Although in theory there
is a zero probability of two paths crossing, they can approach each other close enough
to require higher precision to negotiate past the near collision. For small systems,
it is acceptable to just pick new random constants and re-run the whole procedure,
but for a large problem, one wouldn't want to throw away a significant investment
of computation if a near collision should happen on some path late in the process.
It would be better to detect ill conditioning in the middle of a path and increase
precision on the fly, or lacking that capability, rerun the paths in question with
higher precision and tighter path-tracking tolerance. In a sense, singular endpoints
are a case of this same difficulty, except we are not trying to slip by the collision, but
instead we are aiming directly at it. As multiple paths approach the same endpoint,
we need to keep from jumping from one to another so that the endgame attributes
the correct angle in the s-plane to the samples, where s^c = t. Extra precision may
be needed to maintain accuracy.

11.3 Exercises

Exercise 11.1 (Checking) Revisit any problem from the exercises of previous
chapters; the six-revolute inverse position problem of Exercise 9.5 might be a good
choice. Do the following.

• Run the problem using standard settings in HOMLAB and make histograms of
condition number, function residual, and the homogeneous coordinate. Note
that for any of these quantities, a histogram of the exponents of the values in
scientific notation is more useful than a histogram of the values themselves. Use
routine pathcros to check for path crossings among the points in xendgame,
which is a list of the solutions for t = t_endgame > 0. Use pathcros again
for the list of solution points, xsoln, at t = 0. For any occurrence of multiple
paths having the same endpoint, check that the incoming paths have winding
numbers consistent with the multiplicity check described above.

• Loosen the path tracking tolerance so that pathcros discovers path crossing
errors.
• Return the path tracking tolerance to its default value, but this time cripple
the endgame by setting CycleMax=l. See what difference this makes in the
histograms.
Exercise 11.2 (Multiple-Run Checking) For any parameterized problem of
your choice, do a multiple-run global check that shows that the nonsingular solutions
for two independent total-degree runs match up under parameter homotopy.
PART III
Positive Dimensional Solutions
Chapter 12

Basic Algebraic Geometry

In this chapter we discuss the basic properties of the different sorts of algebraic sets
that arise in the numerical solution of polynomial systems. The flexible "probability-
one" methods underlying the numerical approach to polynomial systems, developed
in Chapter 13, are based on the fact that given any system of polynomials, the set
of solutions breaks up into a finite number of irreducible components.
Recall that we say that an affine, projective, or quasiprojective algebraic set Z
is irreducible if Z_reg is connected. The dimension of an irreducible algebraic set Z
is defined to be dim Z_reg as a complex manifold, which is half the dimension of Z_reg
as a real manifold. Irreducible components, discussed in § 12.2, are nice sets that
are almost manifolds. For example, the system

          [ x(y^2 - x^3)(x - 1)          ]
f(x, y) = [                              ] = 0                    (12.1.1)
          [ x(y^2 - x^3)(y - 2)(3x + y)  ]

vanishes on the union of four irreducible components

{x = 0} ∪ {y^2 - x^3 = 0} ∪ {(1, 2)} ∪ {(1, -3)}.

It is a striking and powerful fundamental fact that the most general solution set is
not much worse than this simple example.
To even state this result, which is called the irreducible decomposition, we need
to make precise what is meant by an algebraic set. The aim of this chapter is to
familiarize the reader with the basic types of algebraic sets and their properties.
Four types of algebraic sets are useful to us: affine algebraic sets, projective
algebraic sets, quasiprojective algebraic sets, and constructible algebraic sets. The
first three of these were introduced briefly in the introductions of Chapters 3 and 4.
We consider them in more detail in the succeeding sections.
In § 12.1, we revisit affine algebraic sets, i.e., the solution sets of systems of
polynomials on C^N, to discuss the topologies and the maps defined on them. In
§ 12.2, we discuss the irreducible decomposition for affine algebraic sets.
Often polynomials are homogeneous, e.g., f(x, y) = x^2 + y^2, and in this case
acknowledging that their solution set is naturally defined on P^N simplifies matters,
both conceptually and numerically. For this reason we introduced projective al-
gebraic sets, i.e., solution sets on P^N, in Chapter 3 and consider them further in
§ 12.3.
Often we need to consider all points in a projective algebraic set X except for
some that are in a second projective algebraic set Y, i.e., sets of the form X \ (X ∩ Y),
such as C^2 \ {(0, 0)}. These sets, which include affine algebraic sets and projective
algebraic sets, are called quasiprojective algebraic sets. They are discussed in § 12.4.
A map f : X → Y between quasiprojective algebraic sets X and Y is said to be an
algebraic map if the graph of f is a quasiprojective subset of X × Y; see § 12.4 and
§ A.4 for more details.
Finally, we discuss constructible algebraic sets in § 12.5. These sets, which
include all quasiprojective algebraic sets, may be defined as follows.

Constructible algebraic sets A constructible algebraic set, or constructible set


for short, is any set constructed from projective algebraic sets by a finite number
of the Boolean operations of union, intersection, and complementation.

Constructible algebraic sets prove useful for two reasons. First, many natural sets,
e.g., images of algebraic sets or the set of points of the image of an algebraic map
where the fiber is a given dimension, are not quasiprojective, but are constructible
(see Theorem 12.5.6 and Lemma 12.5.9). Second, a constructible set A contained
in a quasiprojective set X is quite close to being an algebraic set, e.g., the closure
Ā of A in the complex topology is a quasiprojective algebraic subset of X (see
Lemma 12.5.3), and there is a dense Zariski open set U of Ā contained in A (see
Lemma 12.5.2).
We end with § 12.6, a brief discussion of multiplicity of algebraic sets. Roughly
speaking, this notion allows us to relate the algebraic degree of a system of equations
to the degrees of the irreducible components of the system's solution set. For a
single polynomial in several variables, this is a straightforward generalization of the
phenomenon of multiple roots (double roots, triple roots, etc.) that may appear
when factoring a polynomial in one variable. For systems of more than one equation,
the situation becomes a bit more delicate, as we shall discuss.
All four basic kinds of algebraic sets arise quite naturally in discussing the so-
lutions of polynomials on C^N, as we show by examples.
only the rudimentary facts about these different classes of sets, with further useful
facts collected in Appendix A. As this book is focussed entirely on polynomial sys-
tems, we may sometimes drop the modifier "algebraic" and speak simply of "affine
sets," "projective sets," etc., but meaning these in the algebraic sense.
Before diving in, let's clarify briefly how quasiprojective sets include both pro-
jective and affine algebraic sets, and how constructible sets include them all. Since
quasiprojective sets are of the form X \ (X ∩ Y), where X and Y are both projec-
tive, they include projective sets as the special case where Y is empty. As for affine
sets, recall that C^N is equal to P^N minus its hyperplane at infinity, H_∞, which is
a projective algebraic set equivalent to P^(N-1) given by the homogeneous equation
x_0 = 0. So if A is an affine algebraic set defined as the solution of a polynomial
system F(x), and B is the projective algebraic set defined by the homogenization
of F(x), then A = B \ (B ∩ H_∞) is seen to be quasiprojective. Finally, the defining
form, X \ (X ∩ Y), of a quasiprojective set is just a Boolean construction: we could
rewrite it as X ∩ (not Y). So quasiprojective sets are a kind of constructible set.
We now examine each type of algebraic set in more detail.

12.1 Affine Algebraic Sets

Naively, an algebraic set is nothing more than the common zeros of a set of poly-
nomials. Making this precise and convenient to use takes some work.
We start with a polynomial system

         [ f_1(x_1, ..., x_N) ]
f(x) :=  [        ⋮           ]                                   (12.1.2)
         [ f_n(x_1, ..., x_N) ]

consisting of n polynomials f_i(x_1, ..., x_N) on C^N contained in the ring
C[x_1, ..., x_N] of polynomials in the variables x_1, ..., x_N with complex coefficients.
We denote the set of common zeros on C^N by

V(f_1, ..., f_n) := { x ∈ C^N | f_1(x) = 0, ..., f_n(x) = 0 }.

Such a set of common zeros is called an affine algebraic set. The word affine in
"affine algebraic set" signifies that the set is a closed subset of Euclidean space,
which is sometimes called affine space.
For a system f as above in Equation 12.1.2, we usually abbreviate V(f_1, ..., f_n)
by V(f).
Example 12.1.1 The simplest polynomial system is p(x) = 0 where p(x) is a
monic polynomial of degree d in one variable with complex coefficients, i.e.,

p(x) := x^d + a_1 x^(d-1) + ··· + a_d,

with a_i ∈ C constants. As discussed in § 5.3, p(x) factors as ∏_{i=1}^{k} (x - x_i)^{μ_i}. Thus
V(p) consists of the k complex numbers x_i. The multiplicity of x_i equals μ_i (see
§ 12.6 for further discussion of multiplicity). Thus p(x) = x^3 - x^2 = x^2(x - 1) = 0
has a zero set consisting of 0 and 1.
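Numerically, one can see both the zero set and the loss of multiplicity information for this example. A quick sketch of our own, using numpy's companion-matrix root finder:

```python
import numpy as np

# p(x) = x^3 - x^2 = x^2 (x - 1): the root 0 has multiplicity 2.
roots = np.roots([1, -1, 0, 0])      # three roots, counted with multiplicity

# V(p) is just the set of distinct zeros; the multiplicities are forgotten.
# (Rounding absorbs the tiny splitting of the numerical double root.)
V_p = set(np.round(roots.real, 6))
```

The double root at 0 appears twice in the root list but only once in V(p), which is exactly the information discarded when passing from p to V(p).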
Unions of affine sets are affine, e.g., if A := V(f) for polynomials f :=
(f_1, ..., f_r) and B := V(g) for polynomials g := (g_1, ..., g_s), then A ∪ B is defined by

V({f_i g_j | i = 1, ..., r; j = 1, ..., s}).

Since any point is an affine set, i.e., (x_1*, ..., x_N*) is defined by (x_1 - x_1*, ..., x_N - x_N*),
we have that any finite set is an affine algebraic set. Lemma 12.4.3 will
show that these are the only compact affine sets.
For a single polynomial p(x_1, x_2) ∈ C[x_1, x_2] not equal to a constant, the solution
set is a nonempty one-dimensional affine algebraic set.
Example 12.1.2 A simple polynomial system on C^2 is given by x_1 = 0. Here
the solution set is the x_2-axis.
It is worth emphasizing that passing from a system f of polynomials to V(f)
throws away all multiplicity information. For example, on C, x^5 and x define
the same affine algebraic set V(x). Also note that C^N is the affine algebraic set
corresponding to the identically zero polynomial, and the empty set is the affine
algebraic set defined by a constant polynomial.
Here is a less trivial one-dimensional example of an affine algebraic set.
Example 12.1.3 Consider the polynomial w - z^2. The set

V(w - z^2) := { (z, w) ∈ C^2 | w - z^2 = 0 }

is a smooth connected two-real-dimensional manifold. Indeed, the mappings
(z, w) → z and z → (z, z^2) show that there is a one-to-one correspondence be-
tween points (z, w) ∈ V(w - z^2) and z ∈ C. Note that an m-dimensional complex
manifold is a 2m-real-dimensional manifold, since C has real and imaginary parts.
In this book, "dimension" always means complex dimension; otherwise, we explicitly
say "real dimension."
A map f : X → Y from one affine algebraic set X ⊂ C^N to a second affine
algebraic set Y ⊂ C^M is said to be an algebraic map if there is a map F : C^N → C^M
such that
(1) F = (F_1, ..., F_M) with all the F_i ∈ C[x_1, ..., x_N]; and
(2) f = F|_X, the restriction of F to X.
When it is clear from the context, we sometimes refer to an algebraic map as a map.
We define an algebraic function on an affine algebraic set X to be an algebraic
map from X to C.
We say that two affine algebraic sets X ⊂ C^N and Y ⊂ C^M are isomorphic if
there exist algebraic maps F : X → Y and G : Y → X such that F ∘ G is the
identity on Y and G ∘ F is the identity on X.
Example 12.1.4 Let Y := V(w - z^2) be as in Example 12.1.3 and let X := C.
We have the map G : Y → X given by G(z, w) = z and F : X → Y given by
F(z) = (z, z^2), which shows Y and X are isomorphic.
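These maps are easy to test numerically. The following sketch is our own illustration (using a random point, not anything from the text), checking that F and G compose to the identity on each side:

```python
import numpy as np

# Check Example 12.1.4: F and G compose to the identity on each side.
F = lambda z: (z, z**2)      # X = C  ->  Y = V(w - z^2)
G = lambda zw: zw[0]         # Y -> X, forgetting the second coordinate

rng = np.random.default_rng(0)
z = complex(rng.standard_normal(), rng.standard_normal())
zw = F(z)                    # a point of Y, since w = z^2 by construction

gof = G(F(z))                # G o F, should return z itself
fog = F(G(zw))               # F o G, should return zw itself
```

Of course this only probes single points; the isomorphism itself is the algebraic statement proved in the example.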

12.1.1 The Zariski Topology and the Complex Topology


Noting that given two systems f = {f_1, ..., f_n} and g = {g_1, ..., g_m} of polynomials

V(f) ∪ V(g) = V({f_i g_j | 1 ≤ i ≤ n, 1 ≤ j ≤ m})

and

V(f) ∩ V(g) = V(f, g),
we conclude that affine algebraic sets in C^N are closed under finite unions and
intersections.
Given an arbitrary, possibly infinite, set of polynomials on C^N, the Noetherian
property for ideals in C[x_1, ..., x_N] (see, e.g., (page 74 Cox et al., 1997)) guarantees
that there is always a finite subset of the polynomials with the same common zeros
on C^N. This guarantees that an arbitrary intersection of affine algebraic subsets of
C^N is an affine algebraic set. This implies that the set of affine algebraic subsets
of C^N that lie on a given affine algebraic set X ⊂ C^N satisfy the axioms to be the
closed sets of a topology on X, which is called the Zariski topology. Here the open
sets U ⊂ X are the sets X \ Y, where Y ⊂ C^N is an affine algebraic set contained
in X. Open sets in this topology are called Zariski open sets. Similarly the affine
algebraic subsets of C^N that lie on the given affine algebraic set X ⊂ C^N are the
Zariski closed sets of X.
Besides the Zariski topology, there is the complex topology, which is also called
the classical topology. Given an affine algebraic set X ⊂ C^N, the complex topology
on X is the topology that X inherits from the usual Euclidean topology on C^N,
i.e., a basis of open sets on X at a point x* ∈ X is given by the intersection of X
with the balls

{ x ∈ C^N | ‖x - x*‖ < ε }

for 0 < ε ∈ ℝ and ‖·‖ the Euclidean norm.
Both topologies are useful. Since every closed set Y in the Zariski topology on
an affine algebraic set X ⊂ C^N is the zero set of a finite number of polynomials, it
follows that Y is also closed in the complex topology. Thus the complex topology
has at least as many open sets as the Zariski topology. Except for the case of X a
finite set, the Zariski topology has many fewer open sets than the complex topology.
For example, if X is one-dimensional, then the open sets of the Zariski topology are
the complements of finite subsets of X, that is, X minus a finite number of points.
For X = C, this follows immediately from the fundamental theorem of algebra,
Theorem 5.1.1. The point to understand is that a statement about Zariski open sets
is much stronger than one about open sets in the complex topology. In particular,
a nonempty Zariski open set of an irreducible affine algebraic set X is dense, and
therefore a property that holds on a nonempty Zariski open set of X holds with
probability one on random points of X, as was discussed in Chapter 4. For example,
the nonempty open sets of C in the Zariski topology are the complements of finite
sets, but for the complex topology, the interior of the unit disk is a possible open
set. For more on the material in this section, (Red Book: Chapter 1.10 Mumford,
1999) is a good reference.
In § 12.4, we discuss the quasiprojective algebraic sets, a very broad class of
algebraic sets that includes both affine algebraic sets and Zariski open sets of affine
algebraic sets. For now, we would like to point out that certain Zariski open sets of
affine algebraic sets may be identified with affine algebraic sets in different Euclidean
spaces. Given a Zariski open set U of an affine algebraic set X ⊂ C^N, we define the
algebraic functions on U to be all functions of the form p/q where p, q ∈ C[x_1, ..., x_N]
and V(q) ∩ U = ∅. Given a Zariski open set U on an affine algebraic set X ⊂ C^N and
a Zariski open set V on an affine algebraic set Y ⊂ C^M, a map f : U → V is said to
be an algebraic map if f := F|_U where F : U → C^M is given by F := (F_1, ..., F_M),
with all of the F_i being algebraic functions. In line with the earlier definition of
isomorphism in the case of affine algebraic sets, we say that U and V are isomorphic
if there are algebraic maps F : U → V and G : V → U with F ∘ G the identity on
V and G ∘ F the identity on U.
If g is an algebraic function on an affine algebraic set X ⊂ C^N, then X \ V(g) is
isomorphic to an affine algebraic set. See Lemma A.2.4 for a proof of this useful fact.
The Zariski open sets U of the form X \ V(g) are a basis for the Zariski open
sets on X. To see this let Y := V(h_1, ..., h_r) be an affine algebraic set on X. Then

X \ Y = ∪_{i=1}^{r} (X \ V(h_i)).

Not every Zariski open set of an affine algebraic set X is of the form X \ V(g), e.g.,
in Example A.2.3, we show that {0} ⊂ C^N for N ≥ 2 is not of the form V(g) for a
polynomial g.

12.1.2 Proper Maps


A continuous map f : X → Y between topological spaces is called proper if for each
y ∈ Y, there is an open set U ⊂ Y containing y such that the closures Ū and f^{-1}(Ū)
are compact. An algebraic map f : X → Y between quasiprojective algebraic sets is
called a proper algebraic map if / is proper as a continuous map in the complex
topology. Proper maps are very nice, e.g., see § A.4. They also arise naturally when
working in a probability-one framework.

12.1.3 Linear Projections


In this subsection, we give a brief introduction to linear projections: see § A.8 for
more details.

A linear projection π : C^N → C^k, N ≥ k, is a surjective affine map

π(x_1, ..., x_N) = (L_1(x), ..., L_k(x)),                         (12.1.3)

where

                     N
L_i(x) := a_i0 +     Σ   a_ij x_j,      a_ij ∈ C.
                    j=1

We say that π is a generic linear projection if the coefficients a_ij are chosen "ran-
domly." Precisely speaking, this only has meaning in the context of some property
we are interested in. For example, in Theorem 12.1.5 below, we say that a generic
linear projection restricted to X is proper, which means that there is a Zariski open
dense subset of the a_ij ∈ C^{k×(N+1)} with the property that the restriction to X of
the linear projection, constructed from the a_ij, is proper. Choosing a generic linear
change of coordinates, i.e., choosing N generic linear maps to C, any projection
along the coordinate axes is generic.
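Numerically, "generic" just means drawing the coefficients a_ij at random, after which the desired property holds with probability one. A small sketch of our own, showing that a random complex projection separates two distinct points even when the image is one-dimensional:

```python
import numpy as np

def generic_projection(N, k, rng):
    """pi : C^N -> C^k, pi(x) = a_0 + A x, with random complex coefficients."""
    A = rng.standard_normal((k, N)) + 1j * rng.standard_normal((k, N))
    a0 = rng.standard_normal(k) + 1j * rng.standard_normal(k)
    return lambda x: a0 + A @ x

rng = np.random.default_rng(1)
pi = generic_projection(4, 1, rng)

# Two distinct points in C^4; with probability one a generic projection
# keeps their images distinct, even down to C^1.
p = np.array([1.0, 0.0, 2.0, -1.0])
q = np.array([1.0, 0.0, 2.0, -1.0 + 1e-3])
separation = abs(pi(p)[0] - pi(q)[0])
```

The bad coefficient choices, those for which the two images collide, form a proper algebraic subset of the coefficient space, which is why a random draw avoids them with probability one.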
The simplest example of a nontrivial linear projection π : C^2 → C is given by
sending (x_1, x_2) to x_1. To see what this corresponds to in projective space, fix the
embeddings

• C^2 into P^2 given by sending (x_1, x_2) → [1, x_1, x_2]; and
• C into P^1 given by sending x_1 → [1, x_1].

We now have a commutative diagram

  C^2 ↪ P^2 \ {[0,0,1]}
  π ↓        ↓ π'
  C  ↪ P^1

where the map π' : P^2 \ {[0,0,1]} → P^1 is given by sending [x_0, x_1, x_2] → [x_0, x_1].
Given two distinct points a, b ∈ P^N, let (a, b) denote the unique line through them.
The map π' is often referred to as the projection from {[0,0,1]} because we can
think of the map as sending each point x ∈ P^2 \ {[0,0,1]} to

(x, [0,0,1]) ∩ {x_2 = 0}.

Intuitively we have a source of light at {[0,0,1]} and we send each point to the
shadow it casts on {x_2 = 0}. With projections, we are perfectly happy to change
the image by a linear transformation, and with this notion of equivalence, the
projection is uniquely determined by the point {[0,0,1]}. The point {[0,0,1]} is
called the center of the projection. Projections from points at infinity, i.e., points of
the form [0, a, b], correspond to linear projections C^2 → C given by sending (x_1, x_2)
to x_1 − (a/b)x_2 ∈ C, as illustrated in Figure 12.1.
214 Numerical Solution of Systems of Polynomials Arising in Engineering and Science

Fig. 12.1 Projection from point at infinity {[0, a, b]}

From the point of view of projective space, there is nothing special about the
points at infinity, and indeed on occasion, e.g., (Sommese, Verschelde, & Wampler,
2001b) and (Calabri & Ciliberto, 2001), it is useful to project from points not at
infinity. The case of a projection C^2 → C with a finite center is illustrated in
Figure 12.2, where point c is the center of the projection. (We only draw the real
part.) The set of all lines through c is equivalent to a projective space P^1,
and the projection of a point x is the line P(x) through x and c. To perform
calculations, we will often select a line, such as line L, and set π(x) := L ∩ P(x). No
matter which line we choose in place of L, the essential fact is that all points along
P(x) \ {c} have the same projection as point x. From this observation, it follows
that the projection is determined uniquely by the center c.

Fig. 12.2 Projection with finite center c

We need the following important result, which is proven in § A.10.4.


Theorem 12.1.5 (Noether Normalization Theorem) Let X ⊂ C^N denote
an affine algebraic set. Let π : C^N → C^k denote a generic linear projection. Then
if dim X ≤ k, the map π_X (the restriction of π to X) is a proper algebraic map
with all fibers π_X^{-1}(y) finite for all y ∈ Y := π(X).
If dim X < k, then there is a Zariski dense subset U ⊂ X such that π|_U : U →
π(U) is an isomorphism. If X is of pure dimension k, then π_X is a branched
covering of degree deg X.

12.2 The Irreducible Decomposition for Affine Algebraic Sets

Given an affine algebraic set Z, we let Z_reg denote the set of smooth points of Z.
The set Z_reg is an open set, dense in Z, with Z \ Z_reg equal to a union of affine
algebraic sets, which is why smooth points are also referred to as regular points.
We say that Z is irreducible if Z_reg is connected. We would like to follow the
traditional, and very common, usage, e.g., (Mumford, 1995), and call an irreducible
affine algebraic set an affine variety. It is unfortunate that affine variety has been
used as a synonym for affine algebraic set by some authors. At this point it is safe
to say that anyone picking up a book on algebraic or complex geometry must check
whether varieties are irreducible or not (also reduced or nonreduced if that applies).
For example, in (Mumford, 1995) affine variety means irreducible affine algebraic
set, but in (Gunning & Rossi, 1965), a variety is a not necessarily irreducible reduced
analytic set. The word variety is easier to say than irreducible algebraic set, but,
to avoid confusion, we have reluctantly avoided use of this ancient word.
The irreducible decomposition of an affine algebraic set Z ⊂ C^N is the decom-
position Z := ∪_{a∈I} Z_a obtained by first decomposing Z_reg into the disjoint union of
connected components U_a and letting Z_a denote the closure of U_a. Here I is just
an index set assigning subscript numbers to the irreducible components. For many
of our algorithms, it will be useful to group the irreducible components according to
their dimensions, in which case we have an index set I_i for dimension i, and we write

Z := ∪_{i=0}^{N} Z_i,   Z_i = ∪_{j∈I_i} Z_{ij},   (12.2.4)

where Z_i is the union of all i-dimensional irreducible components of Z, and where
Z_{ij} for j ∈ I_i are the finite number of distinct irreducible components of Z_i. Some
of the Z_i may be empty, that is, Z might not have components at every dimension.
Indeed, the only possible component at dimension N is the whole of C^N, which
precludes any lower-dimensional pieces, so the decomposition is only interesting
when Z_N = ∅.
A simple example is given by Z := V(x_1 x_2). This affine algebraic set is the union
of the x_1- and x_2-axes, and since this set is clearly singular only at the origin, the
irreducible decomposition of Z is Z = V(x_1) ∪ V(x_2).
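For a hypersurface V(p), the irreducible components correspond exactly to the distinct irreducible factors of p, so a computer algebra system can produce the decomposition directly. A small illustration of this (our own sketch, not part of the book's software) using sympy:

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2")

def hypersurface_components(p):
    """The irreducible components of the hypersurface V(p) are V(q)
    for the distinct irreducible factors q of p."""
    _, factors = sp.factor_list(p)
    return [q for q, _mult in factors]

# V(x1*x2) decomposes into the two coordinate axes V(x1) and V(x2):
print(hypersurface_components(x1 * x2))
```

For sets cut out by more than one polynomial, no such simple factorization is available, which is one motivation for the numerical methods of the following chapters.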
The irreducible decomposition is a fundamental tool in understanding solution
sets of polynomial systems. The primary aim of the remainder of this book is to
show how to numerically find and manipulate this decomposition. (D'Andrea &
Emiris, 2003) is a good place for obtaining an overview of symbolic algorithms for
finding the irreducible decomposition.

Remark 12.2.1 (The algebraic situation) Though we will use the geometric ap-
proach to solution sets, there is a natural approach based on the underlying algebra
of the polynomial system. Let I(f) ⊂ C[x_1, ..., x_N] denote the ideal generated
by the polynomials f_1(x_1, ..., x_N), ..., f_n(x_1, ..., x_N) making up the polynomial
system f. Note that V(f) = V(I(f)). Given an affine algebraic set S ⊂ C^N, let
I(S) ⊂ C[x_1, ..., x_N] denote the ideal of polynomials vanishing on S.
An affine algebraic set Z is irreducible if and only if I(Z) is a prime ideal.
Given any ideal I ⊂ C[x_1, ..., x_N], √I, the radical of I, is the ideal consisting of
all f ∈ C[x_1, ..., x_N] such that f^k ∈ I for some k. The irreducible decomposition
is equivalent to the fact that for any ideal I ⊂ C[x_1, ..., x_N], we can write √I =
∩_a P_a, where the P_a are the finite number of minimal prime ideals containing I. For
example, √I(x_1^2 x_2) = I(x_1 x_2) = I(x_1) ∩ I(x_2).

One weakness of the exact irreducible decomposition is that it assumes the
polynomials are exact: an algebraic set will be said to be irreducible even
though it is for all practical purposes reducible. For example, let p(x, y) = xy − ε.
For ε = 0, V(p) has two components, but for ε ≠ 0, V(p) is irreducible, even if
ε is so small that, in a problem arising in engineering or science, it is just noise.
This sort of discontinuous behavior is not realistic for problems where data is never
completely exact. For small ε, numerical-geometrical methods will rather gracefully
give different answers depending on the precision used.
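The discontinuity is easy to exhibit with exact factorization: xy splits into two factors, while xy − ε stays irreducible no matter how small the (exactly nonzero) rational ε is. A quick check of this (our own sketch) using sympy:

```python
import sympy as sp

x, y = sp.symbols("x y")
eps = sp.Rational(1, 10**12)   # tiny, but exactly nonzero

# Number of irreducible factors = number of components of the hypersurface.
_, factors0 = sp.factor_list(x * y)        # eps = 0: two components
_, factors1 = sp.factor_list(x * y - eps)  # eps != 0: one irreducible piece
print(len(factors0), len(factors1))
```

An exact algorithm jumps from two components to one at ε = 0; a numerical method working at finite precision cannot, and should not, resolve that jump for ε below its noise level.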

12.2.1 The Dimension of an Algebraic Set


Using the irreducible decomposition, we can finish the definition of dimension. We
define the dimension of an irreducible affine algebraic set X to be the dimension of
its smooth points, X_reg. Since the smooth points of an irreducible component are
connected and dense, this is very natural. We say that an affine algebraic set X is
pure-dimensional if all the irreducible components of X have the same dimension.
We define the dimension dim_x Z of an affine algebraic set Z at a point x ∈ Z to be
the maximum of the dimensions of the irreducible components of Z that contain x.
We define the dimension dim Z to be

max_{x∈Z} dim_x Z.
Here is a basic fact about dimension, which follows from the general result (Theorem
III.C.14 Gunning & Rossi, 1965).

Theorem 12.2.2 Let Z ⊂ C^N be an irreducible affine algebraic set of dimen-
sion k. Then given a polynomial f on C^N which is not identically zero on Z, it
follows that the dimension of every component of Z ∩ V(f) is k − 1.

Here are some points to be aware of.



(1) Since the smooth points of an irreducible affine algebraic set Z are connected,
it follows that given any point z ∈ Z, every Zariski open neighborhood of z is
irreducible. This can fail in the complex topology, as shown in the following
example.
Consider the curve Z := V(x_2^2 − x_1^2(x_1 + 1)) in the neighborhood of the
point z = (0,0). (The figure in the book shows the real part of this curve.)
Near the origin, the curve resembles the two lines x_2 = ±x_1, so in a small
complex neighborhood of the origin it is not irreducible, even though globally
the curve is one irreducible piece. The solution set over the complexes is
topologically a real two-plane stretched and bent such that two points touch
each other. Local to the point of contact, it looks like two disks touching
transversely, but globally it is all one surface. This is discussed in more
detail in Example A.4.18.
(2) Real points of irreducible algebraic sets do not have to be connected, nor do
the components have to have the same dimensions. V(x_2^2 − x_1(x_1 − 1)(x_1 − 2))
is an example of the former and V(x_2^2 − x_1^2(x_1 − 2)) is an example of the latter.
Nor does there have to be much relation between degrees and the number of real
isolated zeros. For example, following (Example 13.6 Fulton, 1998), let

p(x, y) := Π_{i=1}^{m} (x − i)^2 + Π_{j=1}^{m} (y − j)^2.

We have m^2 zeros in R^2 despite deg p(x, y) = 2m. Over C, we have a curve
with these m^2 points all singular.
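Fulton's count is easy to verify for a small m. The sketch below (ours, using sympy) checks that for m = 3 the degree-6 polynomial vanishes on the 3^2 = 9 real grid points (i, j); over the reals, p = 0 forces both squared products to vanish, so these are all the real zeros:

```python
import sympy as sp

x, y = sp.symbols("x y")
m = 3
p = sp.prod([(x - i)**2 for i in range(1, m + 1)]) \
    + sp.prod([(y - j)**2 for j in range(1, m + 1)])

# Real zeros are exactly the grid points (i, j) with 1 <= i, j <= m,
# since a sum of two real squares vanishes only when both do.
grid_zeros = [(i, j) for i in range(1, m + 1) for j in range(1, m + 1)
              if p.subs({x: i, y: j}) == 0]
print(len(grid_zeros), sp.total_degree(p))
```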

12.3 Further Remarks on Projective Algebraic Sets

Though, for applications, affine algebraic sets are the main interest, we must also
define projective algebraic sets. We need them to be able to discuss what happens
at infinity for a given polynomial system, and in particular to be able to carry
out accurate counts of solutions of polynomial systems. Also the behavior of pro-
jective algebraic sets is often easy to understand, e.g., see the Proper Mapping
Theorem A.4.3, and they can be used to understand the behavior of affine algebraic
sets. In this section we continue the discussion of projective sets started in § 3.5.
P^N is a compact manifold containing C^N as a dense open set. The natural
approach to the definition of algebraic sets on P^N is to define them as the solution
sets of finite numbers of whatever are the analogue for P^N of polynomials on C^N.
At first glance this does not look hopeful, since we cannot expect any nontrivial
global algebraic functions.
To see this consequence of the compactness of P^N, consider the representative
case of P^1. Polynomials on C are holomorphic functions, and so under any reason-
able definition, an algebraic function f on P^1 should be a holomorphic function. The
snag is that since P^1 is compact, it follows from continuity that |f(x)| has a maxi-
mum at some point x* ∈ P^1. Then, by the Maximum Principle, Lemma A.2.7,
f(x) must be constant on an open neighborhood of x*, and therefore on all of P^1.
At first sight this is discouraging, but the key insight is that although there is
no reasonable class of algebraic functions on P^N, there are some "almost functions"
lying around, i.e., the homogeneous polynomials. It is important to realize that,
even though homogeneous polynomials are not functions on projective space, they
behave as "extensions" to P^N of polynomials on C^N. Later we will return to ho-
mogeneous polynomials in § A.13 and see that they are the prototypical nontrivial
example of "sections of line bundles."
Before we give definitions, let's work out a simple representative example. Let
p(x_1, x_2) = x_1^2 − x_2 + 1 be a function on C^2. Regarding C^2 as the coordinate patch
U_0 ⊂ P^2 as above, we have in terms of the homogeneous coordinates [z_0, z_1, z_2] on
P^2 that x_1 = z_1/z_0 and x_2 = z_2/z_0. Thus the function x_1^2 − x_2 + 1 is represented by

(z_1/z_0)^2 − z_2/z_0 + 1 = (1/z_0^2)(z_1^2 − z_0 z_2 + z_0^2).

Under the identification of U_0 with C^2, it is easy to check that the closure in P^2
of the zero set V(p) is the zero set V(f) of the homogeneous polynomial f(z) :=
z_1^2 − z_0 z_2 + z_0^2.
The following two examples indicate that counting solutions in C^2, even when
we just have points, is not so clear cut as on C.

Example 12.3.1 Consider the system

f(x, y) := [ x − y^2 ; ax + by + c ] = 0.   (12.3.5)

The reader can check that if a ≠ 0, then there are two solutions to f(x, y) = 0
(counting multiplicities in the obvious way when b^2 − 4ac = 0). But what about
the case a = 0, b ≠ 0, where we only have one solution?

Example 12.3.2 Consider the system of two polynomials on C^2

f(x, y) := [ y ; y − 1 ] = 0.   (12.3.6)

We expect two lines to meet in a point, but these two parallel lines do not.
We already met similar systems in Chapter 3, so we know that the key to sim-
plifying solution counts is to homogenize the systems. In this way, Example 12.3.1
becomes

g(w, x, y) := [ wx − y^2 ; ax + by + cw ] = 0,   (12.3.7)

which now has for a = 0 a second solution point at infinity, [w, x, y] = [0,1,0] ∈ P^2,
formerly "missing" from the affine version.
Similarly, Example 12.3.2 becomes

g(w, x, y) := [ y ; y − w ] = 0,   (12.3.8)

which now has the solution point at infinity along the x-axis, [w, x, y] = [0,1,0] ∈ P^2.
Note that Example 12.3.2 shows that if we have a system f on C^N, then V(f)‾,
the closure in P^N of the set of solutions of f, may be smaller than the set of
solutions V(f̂) of the associated system f̂ of homogeneous polynomials on P^N. In
that example, V(f) is empty, so V(f)‾ is too, whereas V(f̂) is the point {[0,1,0]}.
It is easily checked that V(f̂) ∩ C^N = V(f).
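Reading Example 12.3.2 as the parallel lines y = 0 and y = 1 (our reconstruction of the garbled display), the solution at infinity can be found by dehomogenizing the system on the two standard patches. A sketch (ours, using sympy):

```python
import sympy as sp

w, x, y = sp.symbols("w x y")
# Parallel lines y = 0 and y = 1 homogenize (with w) to:
g = [y, y - w]

# On the affine patch w = 1 there are no solutions:
affine = sp.solve([e.subs(w, 1) for e in g], [x, y], dict=True)
# At infinity, set w = 0 and dehomogenize on the patch x = 1:
at_infinity = sp.solve([e.subs({w: 0, x: 1}) for e in g], [y], dict=True)
print(affine, at_infinity)
```

The affine solve returns no solutions, while the patch at infinity returns y = 0, i.e., the single projective point [w, x, y] = [0, 1, 0].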

12.4 Quasiprojective Algebraic Sets

Sets of the form X \ (X ∩ Y), where X, Y ⊂ P^N are projective algebraic sets, are
called quasiprojective algebraic sets. These include sets of the form X \ (X ∩ Y),
where X, Y ⊂ C^N are affine algebraic sets. The simplest nontrivial example of a
quasiprojective algebraic set which is neither projective nor affine is C^2 \ {0}.
As with affine algebraic sets, we can with no changes define the Zariski and
complex topologies and the notion of irreducibility.
The following is a basic fact.

Theorem 12.4.1 Let U be a Zariski open dense subset of a quasiprojective alge-
braic set X. Then the closure of U in X in the complex topology is X.

Proof. This follows immediately from (Theorem 2.33 Mumford, 1995). □

Finally we note that all the basic results such as the irreducible decomposition of
§ 12.2 hold for quasiprojective algebraic sets (respectively projective algebraic sets)
and not just for affine algebraic sets. The only difference is that the irreducible
components in this generality are not affine algebraic sets, but are only quasipro-
jective algebraic sets (respectively projective algebraic sets). Using this we carry
over all the definitions of dimension. For example, a pure-dimensional quasiprojective
set is a quasiprojective set with all irreducible components having the same dimension.
Let X and Y be quasiprojective algebraic sets. We define an algebraic map
f : X → Y between X and Y to be a map such that for each x ∈ X and y := f(x),
there are affine open sets U ⊂ X containing x and V ⊂ Y containing y such that
f(U) ⊂ V and f : U → V is algebraic. The set X × Y is a quasiprojective set,
which may be shown by elaborating on § A.10.2. The graph of a map f : X → Y is
the set

Graph(f) := {(x, f(x)) ∈ X × Y | x ∈ X}.



It may be shown that an equivalent definition for an algebraic map f : X → Y
between quasiprojective algebraic sets is that f is a map from X to Y such that
the graph Graph(f) ⊂ X × Y of f is a quasiprojective algebraic subset of X × Y.
The following fact is useful.
Theorem 12.4.2 The complement of a proper quasiprojective algebraic subset Y
in an irreducible quasiprojective set X is connected. If a quasiprojective set X is
connected, then X is path connected.

Proof. The first assertion follows immediately from (Chapter 4, Corollary (4.16)
Mumford, 1995).
The second assertion would follow if we knew it for irreducible quasiprojective al-
gebraic sets. Given any irreducible quasiprojective set, there is a connected smooth
manifold mapping onto it by Hironaka's Desingularization Theorem A.4.1. Since
connected manifolds are path connected, we are done. □

Few algebraic sets are both affine and projective.


Lemma 12.4.3 Let X ⊂ C^N denote a compact affine algebraic set. Then X is finite.

Proof. To see this assume otherwise. By the irreducible decomposition from § 12.2,
we know that if X is compact and not finite, then X contains a compact irreducible
infinite affine algebraic set. We can assume without loss of generality that X is
this set. The absolute value of each coordinate function z_i restricted to X has a
maximum on X. By Lemma A.4.2, the restrictions of all the coordinate functions
are constants, and hence X is a single point. □

12.5 Constructible Algebraic Sets

Let us start with an example leading to a constructible set.


Example 12.5.1 Suppose we were interested in the family of systems of polyno-
mials in C[x, y]

F_{(t,u)}(x, y) := [ tx − u ; ty − u ] = 0,   (12.5.9)

parameterized by (t, u) ∈ C^2. The set of (t, u) ∈ C^2 where F_{(t,u)}(x, y) = 0 has a
nonempty solution set is

{(0,0)} ∪ {t ≠ 0}.

This set is not quasiprojective, but it is constructible.
Let X be a quasiprojective algebraic set. Let A(X) denote the set of closed
algebraic subsets of X. A(X) is closed under finite unions and arbitrary intersec-
tions. The set T(X) of complements of the elements of A(X) comprises the open
sets of the Zariski topology of X. The set C(X) of constructible sets of X is the
smallest set of subsets of X that

• contains A(X), and
• is closed under a finite number of Boolean operations,

where the Boolean operations are union, intersection, and sending a subset of X
to its complement in X. Otherwise said, C(X) is the Boolean algebra of subsets
of X generated by A(X) (or equivalently T(X)). Constructible sets are the outer
limits of the type of sets that need to be considered in the numerical analysis of
polynomial systems. We will see that they arise naturally when working with affine
algebraic sets.
We present here a few key facts about constructible sets. A fuller discussion
may be found in (Chaps. AG.1 and AG.10 Borel, 1969).

Lemma 12.5.2 Let X be a quasiprojective algebraic set. Assume that A ⊂ X is
a constructible set such that A̅ = X, where the closure is in the Zariski topology.
Then there exists a Zariski open and dense set U ⊂ X such that U ⊂ A.

Proof. See (Proposition in Chap. AG.2 Borel, 1969). □

Lemma 12.5.3 Let A be a constructible subset of a quasiprojective algebraic set
X. Then the closure of A in the complex topology and in the Zariski topology are
the same.

Proof. Use Lemma 12.5.2 and Theorem 12.4.1. □

When we take closures of constructible sets (and almost every set that comes
up in this book is at worst constructible) this lemma tells us it does not matter
whether we use the complex or Zariski topology: in either case we get the same
algebraic set. For this reason, we often do not specify which topology we are taking
the closure in.
It is useful to record the trivial case when a constructible set is automatically
algebraic, a corollary to Lemma 12.5.3.

Lemma 12.5.4 Let A be a constructible subset of an affine (respectively projec-
tive, respectively quasiprojective) set. If A = A̅, e.g., if A is closed in the complex
topology, then A is an affine (respectively projective, respectively quasiprojective)
algebraic set.
Example 12.5.1 is a simple and fairly typical example of a constructible set.
Here it is said a slightly different way.

Example 12.5.5 Consider the map F : C^2 → C^2 which sends (z, w) → (z, zw).
This is a nice algebraic map, but the image is (C^2 \ {z = 0}) ∪ {(0,0)}, which is
neither the set of zeros of a set of polynomials nor the complement of such a set.
This is about the worst the image of an algebraic map gets.
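A quick sanity check of this image (our own sketch): (a, b) = (z, zw) is solvable for (z, w) exactly when a ≠ 0 (take z = a, w = b/a) or (a, b) = (0, 0) (take z = 0 and any w).

```python
def in_image(a, b):
    """Membership test for the image of F(z, w) = (z, z*w)."""
    # a != 0: solve with z = a, w = b / a.
    # a == 0: z must be 0, which forces b = z*w = 0.
    return a != 0 or b == 0

assert in_image(2.0, 5.0)        # generic point: z = 2, w = 2.5
assert in_image(0.0, 0.0)        # the origin: F(0, w) = (0, 0) for any w
assert not in_image(0.0, 1.0)    # in the closure C^2, but not in the image
```

The failure of `in_image` on the punctured line {z = 0} \ {(0,0)} is exactly why the image is constructible but neither open, closed, nor quasiprojective.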

Theorem 12.5.6 (Chevalley's Theorem) Let F : X → Y be an algebraic map
between quasiprojective algebraic sets. If Z ∈ C(X), then F(Z) ∈ C(Y).

Proof. See (Corollary AG.10.2 Borel, 1969). □

Chevalley's Theorem is one of the features distinguishing algebraic geometry


from complex analytic geometry, e.g., holomorphic maps are too wild to admit any
such result.
Corollary 12.5.7 Let f : X → Y be an algebraic map between quasiprojective
algebraic sets. Then given any irreducible component B' of f(X)‾, there is an irre-
ducible component A of X with f(A)‾ = B'. In particular, if X is irreducible, then
f(X)‾ is irreducible.

Proof. By Chevalley's Theorem 12.5.6 we know that B := f(X)‾ is algebraic.
We first show the special case when X is irreducible. We have the irreducible
decomposition B = ∪_{j=1}^{r} B_j for some positive integer r. Since X is irreducible
and contained in ∪_{j=1}^{r} f^{-1}(B_j), we conclude that X = f^{-1}(B_j) for some j. Thus
f(X) ⊂ B_j and B = B_j.
For the general case assume that X has an irreducible decomposition ∪_{i=1}^{s} X_i
for some finite s. We have the irreducible decomposition B = ∪_{j=1}^{r} B_j for some
positive integer r. By the last paragraph, f(X_i)‾ is irreducible. Since for any j we
have that B_j ⊂ B = ∪_{i=1}^{s} f(X_i)‾, we conclude that B_j ⊂ f(X_i)‾ for some i. Since any
component of the irreducible decomposition of an algebraic set B is not contained
in any larger irreducible algebraic subset of B, we conclude that B_j = f(X_i)‾. □

Maps of algebraic sets that "should" be surjective often fail to be because the
domain lacks some points at infinity. For example, the map from V(zw − 1) ⊂ C^2 to
C obtained by sending (z, w) → z misses z = 0. Example 12.5.5 is also of this sort.
For this reason, it is often more useful to use the notion of a dominant map. A map
f : X → Y between quasiprojective algebraic sets is called dominant if f(X)‾ = Y.

Lemma 12.5.8 Let f : X → Y be a dominant algebraic map from a quasiprojec-
tive set X to an irreducible quasiprojective set Y. There exists a Zariski open dense
set V ⊂ Y contained in f(X).

Proof. This is an immediate consequence of Theorem 12.5.6 and Lemma 12.5.2. □

Thus using the Upper-Semicontinuity of Dimension Theorem A.4.5 and Cheval-
ley's Theorem 12.5.6, we have the following result.

Lemma 12.5.9 Let f : X → Y be an algebraic map of quasiprojective algebraic
sets. Then for any integer k, the set {y ∈ Y | dim f^{-1}(y) ≥ k} is constructible.

12.6 Multiplicity

Multiplicity appears in numerous places in algebraic geometry.


In its simplest form, it is very easy to understand, e.g., given a not identically
zero polynomial of one variable p(x) ∈ C[x], the multiplicity of a point x* ∈ V(p)
is the integer μ > 0 such that p(x) = (x − x*)^μ q(x) for a polynomial q(x) with
q(x*) ≠ 0.
In several variables, the story for a single polynomial is the same. Let
p(x_1, ..., x_N) ∈ C[x_1, ..., x_N] be a not identically zero polynomial on C^N. The
irreducible decomposition of V(p(x)), the solution set of p(x) = 0, is a decomposi-
tion

V(p) = ∪_{i=1}^{r} Z_{N−1,i},

where the Z_{N−1,i} are distinct irreducible affine algebraic sets. Moreover,
dim Z_{N−1,i} = N − 1 for all i and there are polynomials q_i(x) and positive integers
μ_i such that

(1) Z_{N−1,i} = V(q_i);
(2) the multiplicities of the solutions of the one-variable polynomial obtained by
restriction of q_i(x) to a generic line are all one; and
(3) p(x) = q_1(x)^{μ_1} ⋯ q_r(x)^{μ_r}.

This is a satisfying description of the multiplicity of a component, although already
the situation is not so easy to prove as in one variable.
What about the multiplicity of an isolated solution x* of a polynomial system

f(x) := [ f_1(x_1, ..., x_N) ; ... ; f_n(x_1, ..., x_N) ] = 0?   (12.6.10)

The difficulties with multiplicities begin when we have a set defined by more than
one polynomial. Theorem 12.2.2 implies that for this system to have an isolated
solution, n must be ≥ N.
Perhaps the simplest example is given by the system

z_1 = 0,
z_2^2 = 0.   (12.6.11)

It is completely reasonable to say that the origin (0,0) is a multiplicity 2 solution
of this system. Indeed, since z_1 = 0 defines the z_2-axis, and since the restriction of
z_2^2 = 0 to the z_2-axis has 0 as a multiplicity 2 root, we must either have that (0,0)
is a multiplicity 2 solution of the system (12.6.11) or give up any sort of reasonable
compatibility with the already defined notions.
We define the multiplicity of x* as a solution of the system to be the dimension
μ of the finite-dimensional vector space

O_{C^N,x*}/(f_1, ..., f_n),

where

(1) O_{C^N,x*} is the ring of convergent power series centered at x*; and
(2) (f_1, ..., f_n) is the ideal of O_{C^N,x*} generated by the polynomials f_i.

It is straightforward to see that when n = N = 1 this agrees with the notion of
multiplicity that we are used to, but it is certainly not clear what this means when
N > 1. Also, why convergent power series? It turns out this is just a convenience
for us. One could use instead formal power series, or the ring of rational functions
p(x)/q(x) with q(x*) ≠ 0. But the equivalence of the multiplicities obtained in these
different ways is not obvious!
In the special case n = N, μ has a simple geometrical interpretation. If x* is
a multiplicity μ isolated solution of f(x) = 0 and you choose a generic vector v ∈
C^N sufficiently near 0, then f(x) = v has exactly μ nonsingular isolated solutions
x_1*, ..., x_μ* near x*. By nonsingular we mean that the Jacobian matrix J, with
elements

J_{ij} = ∂f_i(x)/∂x_j,

is invertible at each of x_1*, ..., x_μ*. This, in fact, implies that μ = 1 is equivalent
to x* being a nonsingular isolated solution. Another consequence of this, in the case
n = N, is that with appropriate homotopies of the sort we construct, the number
of paths ending at x* equals the multiplicity.
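This interpretation is easy to check numerically for the system (12.6.11): perturbing to z_1 = v_1, z_2^2 = v_2 with a generic small v splits the multiplicity-2 solution at the origin into two nonsingular solutions. A small sketch (ours; a "generic" v is fixed here for reproducibility):

```python
import cmath

# Small generic right-hand side v:
v1 = 1e-6 * complex(0.3, 0.7)
v2 = 1e-6 * complex(-0.4, 0.2)

r = cmath.sqrt(v2)
solutions = [(v1, r), (v1, -r)]   # the mu = 2 solutions of z1 = v1, z2^2 = v2

for z1, z2 in solutions:
    assert z1 == v1 and abs(z2**2 - v2) < 1e-16
    # Jacobian of (z1, z2^2) is diag(1, 2*z2), invertible since z2 != 0:
    assert z2 != 0
print(len(solutions), "nonsingular solutions near the origin")
```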
Unfortunately, when n ≠ N, the meaning of multiplicity becomes a bit more
obscure, and not so closely connected to geometric intuition. This is a reflection
of the complexity of the nonreduced structures on points in higher dimensions,
i.e., the zero-dimensional nonreduced schemes. Since we do not make much use
of multiplicity, we do not pursue this. If you do, you need to put multiplicity
into the broader context of Hilbert functions, e.g., see the discussion of multiplicity
in (Hartshorne, 1977), and in particular (Exercise V.3.4c Hartshorne, 1977). The
books (Eisenbud, 1995; Fulton, 1998) are good algebraic references. See also (Bates,
Peterson, & Sommese, 2005a) for a numerical-symbolic algorithm for computing
multiplicity.
Multiplicity for us arises in another way. Consider C := V(x_2^2 − x_1^3) ⊂ C^2. The
multiplicity of C as a component of the solution set of x_2^2 − x_1^3 = 0 is 1. In this
case, it is useful to attach a multiplicity to each point of C. We define the multiplicity
of a point x* ∈ C as a point of C to be the multiplicity of x* as an isolated solution
of the system

[ x_2^2 − x_1^3 ; a_0 + a_1 x_1 + a_2 x_2 ] = 0,

where V(a_0 + a_1 x_1 + a_2 x_2) is a generic line vanishing at x*. An excellent and very
readable reference for this sort of multiplicity is (Chapter 8 Fischer, 2001).
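This point multiplicity can be computed by slicing with a line through x* and reading off the order of vanishing of the restricted polynomial. A sketch (ours, using sympy), taking the curve here to be the cusp V(x_2^2 − x_1^3) and fixing one particular slicing direction to stand in for a "generic" choice:

```python
import sympy as sp

x1, x2, t = sp.symbols("x1 x2 t")
cusp = x2**2 - x1**3   # C = V(x2^2 - x1^3)

def point_multiplicity(p, pt, direction=(sp.Rational(3, 7), sp.Rational(2, 5))):
    """Order of vanishing at t = 0 of p restricted to the line
    through pt with the given (generically chosen) direction."""
    a1, a2 = direction
    q = sp.Poly(sp.expand(p.subs({x1: pt[0] + a1*t, x2: pt[1] + a2*t})), t)
    mult = 0
    while q.coeff_monomial(t**mult) == 0:
        mult += 1
    return mult

print(point_multiplicity(cusp, (0, 0)))   # the singular point of the cusp
print(point_multiplicity(cusp, (1, 1)))   # a smooth point of the curve
```

At the origin the restricted polynomial vanishes to order 2, while at a smooth point such as (1, 1) it vanishes to order 1.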

12.7 Exercises

Exercise 12.1 (Solution Components) Solve the system on page 207 using a
total-degree homotopy. Do you get points on every component? How many?

Exercise 12.2 (Projection from a Point) Write out the formula for a projec-
tion C^2 → C from center c onto the line {x_2 = 0}.

Exercise 12.3 (Composition of Projections) Write the projection C^N →
C^{N−1} from center c ∈ C^N onto the hyperplane {x_N = 0}. For points c_1, c_2 ∈ C^N,
consider the projection C^N → C^{N−2} given by the composition of the projection
π_1 : C^N → C^{N−1} from center c_1 followed by the projection π_2 : C^{N−1} → C^{N−2}
having π_1(c_2) as center. Is the result the same or different if we reverse the order
of c_1 and c_2?

Exercise 12.4 (Dimension of an Affine Algebraic Set) Let Z be the solution
set of the initial example on page 207. What is the dimension of Z? What is
dim_{(1,1,2)} Z?

Exercise 12.5 (Classifying Sets) Classify each of the following sets as affine,
projective, quasiprojective, or constructible. Remember that the classifications are
not mutually exclusive.

(1) V(xy − y − 1).
(2) The image of V(xy − y − 1) under the projection (x, y) → x.
(3) V(x^2 + y^2 + yz, xz − 2z^2).
(4) The set of quadratic equations in one variable that have two distinct roots.
(5) The nonsingular solution points of y^2 − x^2(x − 1) = 0.
(6) Points in C^2 that are not nonsingular solutions of y^2 − x^2(x − 1) = 0.
(7) Pairs of points in C^2 such that there is a unique line containing them.
(8) Pairs of points as in the previous item such that the line contains the origin.

Exercise 12.6 (Real Solution Points) Verify the statements in item 2 on
page 217 concerning the real points of the two algebraic sets mentioned there.

Exercise 12.7 (Multiplicity of an Isolated Solution) Prove that the multi-
plicity of (0,0) as a solution of

z_1 = 0,
z_2^2 = 0

is 2. (Hint: n = N.) Demonstrate this numerically using HOMLAB.
Exercise 12.8 (Multiplicity of an Irreducible Affine Set at a Point)
Show that the multiplicity of x* ∈ V(x_2^2 − x_1^3) is 1 for all points but x* = (0,0), for
which it is 2. Demonstrate these facts numerically using HOMLAB.
Chapter 13

Basic Numerical Algebraic Geometry

Our overarching goal is to numerically encode an algebraic set Z in a form that
allows us to answer such basic questions as:

Membership Is point x in Z?
Dimension What is the dimension of Z?
Degree What is the degree of the pure i-dimensional component of Z?
Decomposition What are the irreducible components of Z?

This is just a beginning, however, for suppose we have a similar encoding for a
second algebraic set Y. Then, we would like to answer:

Inclusion Is Y a subset of Z?
Equality Is Y equal to Z?

Finally, we would like to propagate the encoding through Boolean binary operations,
that is, if we have encodings for algebraic sets Y and Z, we would like to:

Union Find an encoding for X = Y ∪ Z; and
Intersection Find an encoding for X = Y ∩ Z.

(We regard the third Boolean operation of complementation as just the negation
of the membership test.) Numerical algorithms to answer these questions form the
foundation of numerical algebraic geometry.
Typically, we begin not with an algebraic set, but rather, with a system of
polynomials f(x) : C^N → C^n,

f(x) = [ f_1(x) ; ... ; f_n(x) ] = 0.   (13.0.1)

Then, our object of study is the solution set of f = 0, which we often write as
Z = V(f).
As discussed in § 12.2, we know that any affine algebraic set decomposes as

Z := ∪_{i=0}^{dim Z} Z_i,   Z_i := ∪_{j∈I_i} Z_{ij},   (13.0.2)

where Z_i is the union of all i-dimensional irreducible components of Z, and where Z_{ij}
for j ∈ I_i are the finite number of distinct irreducible components of Z_i. Geomet-
rically, the Z_{ij} are the closures of the connected components of the set of manifold
points of Z. The algebraic set Z might be the entire solution set V(f) or it might
be the union of several of its irreducible pieces. In the latter case, once we have
built our encoding for Z, we wish to answer all our questions about Z alone, as if
all the components excluded from Z did not exist.
The purpose of this chapter is to motivate and describe an encoding of algebraic
sets that we call witness sets. A first look at these is given in an introductory section,
§ 13.1, without worrying about how we can compute them or even justifying that
they are well defined. In § 13.2, we present basic theory concerning the intersection
of irreducible components with linear spaces, which is the underpinning for our
formulation of witness sets, given precisely in § 13.3. Next, in § 13.4, we define the
rank of a polynomial system and present a fast algorithm to compute it. Then, in § 13.5,
we show how the solution set of a system of polynomial equations relates to the
solution set of a system of random linear combinations of those same polynomials.
This prepares us for an algorithm in § 13.6 to compute a loose inclusion of witness
sets, called witness supersets. The final section, § 13.7, uses these concepts and
procedures to obtain numerical methods to answer several of our basic questions
itemized above.
Much of this chapter is based on the article (Sommese & Wampler, 1996), where
the subject of Numerical Algebraic Geometry was started and its name coined. The
name was chosen to indicate that this subject would be to algebraic geometry what
numerical linear algebra is to linear algebra.
After this chapter, one major problem remains before we can compute the nu-
merical irreducible decomposition, Equation 13.1.3, namely, the witness point su-
persets are only a crude approximation to the numerical irreducible decomposition.
A lesser problem is that the procedures given in this chapter for finding the witness
point supersets are not as efficient as we would like.
The two chapters following this chapter show how to solve these problems. Chap-
ter 14 gives the efficient algorithm of (Sommese & Verschelde, 2000) to find the wit-
ness point supersets. Chapter 15 gives efficient algorithms (Sommese & Wampler,
1996; Sommese, Verschelde, & Wampler, 2001c, 2002b) to process the numerical
irreducible decomposition out of the witness point supersets.
The notion of witness set has developed over time. Originally in (Sommese &
Wampler, 1996) and continuing through (Sommese & Verschelde, 2000), the cen-
tral notion was that of generic point of a component, though all the information
contained in what we now call witness sets was being computed and used. In the suc-
cessive articles (Sommese et al., 2001c, 2002b), the notion of irreducible witness sets
was distilled out as the essential numerical output of our algorithms. The enriched
version of the witness sets for nonreduced components, presented in this chapter
for the first time, is based on the experience gained from (Sommese, Verschelde,
& Wampler, 2002a).

Basic Numerical Algebraic Geometry

13.1 Introduction to Witness Sets

What should we adopt as our numerical encoding of algebraic sets? Let's begin
by considering the simplest case, a zero-dimensional algebraic set Z. This is just a
finite set of points, so we can use as our encoding a list of the points. When we are
given a system of N polynomial equations in N unknowns, the methods of Part II
allow us to find a numerical approximation to all nonsingular solution points, and
in fact, those methods give us a list of homotopy path endpoints that includes all
isolated singular solution points as well, although we cannot readily sort these out
from singular endpoints on higher dimensional components. Nevertheless, we have
some confidence that the encoding of Z as a list of solution points is computable.
Moreover, up to the approximation of numerical roundoff, we can easily answer all
our questions about membership, union, intersection, etc. The subtlety regarding
isolated singular solutions is a concern that we can resolve by considering the larger
picture that includes higher dimensional components.
But what shall we do when Z is positive dimensional? Looking at natural
examples, e.g., the set V(x_1) ⊂ C^2, there are two obvious ways of encoding the
points of these algebraic sets.
The first approach is to use a parametric representation, e.g., representing
V(x_1) ⊂ C^2 as {(x_1, x_2) = (0, t) | t ∈ C}. Unfortunately, while parametric
representations are very useful, they are also rare. For example, in
Remark A.2.10, we sketch an argument showing that a curve as simple as
V(x_2^2 − x_1(x_1 − 1)(x_1 − 2)) has no parametric representation. A nice
discussion of which curves have a parametric
representation may be found in (Abhyankar, 1990).
A second approach is to use defining equations. Since by definition, algebraic sets
are solution sets of polynomial equations, we know that this approach has to work.
Indeed, this is the approach taken in computational algebra. Low degree equations
vanishing on an algebraic set are nothing to scoff at: they can be very useful.
Unfortunately, computing defining equations is numerically expensive. Furthermore,
such equations can be numerically unstable.
Numerical Algebraic Geometry rests on a third approach, using the notion of
witness sets. This natural data structure to encode algebraic sets is based on the
concept of generic points and the classical notion of a linear space section.
Since we are going to talk often about linear subspaces of C^N, it is convenient
to introduce a shorthand notation for them. We use the following conventions:
• L^{[i]} ⊂ C^N denotes an affine linear subspace of dimension i; and
• L^{c[i]} ⊂ C^N denotes an affine linear subspace of codimension i, or
  equivalently, of dimension N − i.
Depending on context, it is sometimes easier to use the notation of codimension
instead of dimension, which is why we introduce both. Consider two generic linear
subspaces, L_1 and L_2. Their dimensions add under the operation of union, while
their codimensions add under the operation of intersection. If their dimensions
are complementary, their intersection is zero-dimensional; i.e., L^{[i]} ∩ L^{c[i]}
is a point.
The following fact, demonstrated in § 13.2, is the foundation of the notion of a
witness set. Let A ⊂ C^N be a pure i-dimensional algebraic set. Given a generic
affine linear subspace L^{c[i]} ⊂ C^N, the set 𝒜 = L^{c[i]} ∩ A consists of a
well-defined number d of points lying in A_reg. The number d is called the degree
of A and denoted deg A. We refer to 𝒜 as a set of witness points for A, and we
call L^{c[i]} the associated slicing (N − i)-plane, or just slicing plane for
short.
The number of witness points 𝒜 tells us the degree of the set A, and if we
determine the codimension of the slicing plane that cuts A in isolated points, we
have determined the dimension of A. However, to answer most any other question
about A, such as to test whether a given point x is in A, we need the ability to
track the paths of the witness points as the slicing plane is moved. When A is a
pure i-dimensional reduced component of a polynomial system f, then the witness
points 𝒜 are nonsingular roots of f restricted to L^{c[i]} (more on this in a
moment), so the data structure W := (𝒜, L^{c[i]}, f) is everything a nonsingular
path tracker needs to track solution paths starting at 𝒜 as L^{c[i]} evolves
continuously. Accordingly, we call W a witness set for A. When A is not reduced
we need a slightly richer structure, which will be discussed in § 13.3.2 and
§ 13.3.3. We will see later, in § 15.2, how to generate from a witness set as
large a number as we wish of widely spaced points on A.
The witness set data structure, more fully described in § 13.3, has many advan-
tages:

(1) it is stable and much cheaper numerically than finding defining equations;
(2) it is sparing of memory;
(3) it can be used to compute quantities of interest, e.g., if you really want defining
equations they may be computed from this encoding; and
(4) it is a special case of the notion of a linear space section, for which there
    is an extensive theory (Beltrametti & Sommese, 1995).

Using witness sets, we can make numerical sense out of what it means to find the
solution set of a system of polynomials f(x) = 0 in Equation 13.0.1. We wish to
find a numerical irreducible decomposition that mirrors the irreducible
decomposition of Equation 13.0.2, by which we mean to find a collection of
witness sets W_i for the i-dimensional components Z_i, which are themselves
decomposed into irreducible witness sets W_{ij} for the irreducible components
Z_{ij}, i.e.,

    W := ∪_{i=0}^{dim V(f)} W_i,    W_i := ∪_{j∈I_i} W_{ij}.     (13.1.3)

In Equation 13.1.3, we should be a little careful to define what we mean by the


union of witness sets. Let us use the notation that W_A means the witness set for
an algebraic set A. When two algebraic sets, say A and B, have no components of
the same dimension, the witness set for their union is just a formal union of their
witness sets, that is,

    W_{A∪B} = W_A ∪ W_B = {W_A, W_B},    dim A ≠ dim B.

However, when A and B have some irreducible components of the same dimension,
we require the witness sets of the components with the same dimension to have the
same slicing planes L. So, in the reduced case with A a pure-dimensional union
of components of V(f) for a system of polynomials f on C^N and B a
pure-dimensional union of components of V(g) for a possibly different system of
polynomials g of the same dimension as A, the formal union resolves as

    W_{A∪B} = W_A ∪ W_B = {W_A, W_B} = {(𝒜, L, f), (ℬ, L, g)},

where 𝒜 and ℬ are witness point sets for A and B, respectively. The resolution of
unions in this fashion is not necessary, but it is convenient, and if two witness sets
have different slicing planes, they can always be brought to a common slicing plane
by homotopy continuation.
In computing a numerical irreducible decomposition, we are faced with the
opposite problem of computing a union. Our procedures will first find the
witness set W_i for the i-dimensional component Z_i and, subsequently, its
witness point set will be partitioned into irreducible witness point sets, one
for each irreducible component Z_{ij}.
In the above overview, we have claimed the existence of witness sets and asserted
some of their basic properties. This chapter aims to justify these assertions and
to describe some rudimentary algorithms based on them. Subsequent chapters will
discuss refinements and extensions. We begin in the next section with the basic facts
about intersecting irreducible components with linear spaces, thereby establishing
that witness sets do indeed exist and have the main properties that we asserted
above.

13.2 Linear Slicing

We use the terms slicing or linear slicing to mean intersecting algebraic sets with
linear spaces. The answer to the following question supports the use of linear slicing
and will give witness sets much of their power.

How does the irreducible decomposition, Equation 13.0.2, behave under slic-
ing by general hyperplanes?

The crucial value of linear slices is that they have good preservation properties, i.e.,
given a general hyperplane L ⊂ C^N, Z and Z ∩ L share several important properties.

An affine hyperplane (or hyperplane for short) L^{c[1]} ⊂ C^N is the zero set of
a linear equation, which we denote

    ℒ(x; a) = a_0 + a_1 x_1 + ⋯ + a_N x_N,

with the a_i ∈ C not all zero for i ≥ 1. ℒ(x; a) and ℒ(x; b) have the same zero
set if and only if a = λ b for some complex number λ ≠ 0. Thus affine
hyperplanes are parameterized by the subset of points [a_0, a_1, …, a_N] ∈ P^N
with a_i ∈ C not all zero for i ≥ 1. The single point not in this set,
[1, 0, …, 0] ∈ P^N, corresponds to the hyperplane at infinity. Similarly, we
regard affine linear spaces L^{c[i]} ⊂ C^N as parameterized by i-tuples
(A_1, …, A_i) ∈ (P^N)^i, where A_j := [a_{j,0}, …, a_{j,N}] and the rank of the
matrix

    A = [ a_{1,0} ⋯ a_{1,N} ]
        [    ⋮    ⋱    ⋮    ]
        [ a_{i,0} ⋯ a_{i,N} ]

is i. The linear space is the zero set of the linear equations, so we may write

    C^N ⊃ L^{c[i]} = V(ℒ(x; A)),    A ∈ (P^N)^i.

Though we use this representation below, it is not optimal for i ≥ 2. For
example, given an invertible i × i matrix F, the linear equations associated to
the matrix (F · A) define the same affine linear space as the linear equations
associated to A. A much crisper parameterization is given by the use of the
Grassmannian, as discussed in § A.8.1.
We are interested in the relation between the solution set of the polynomial
system f(x) = 0 of Equation 13.0.1 and the augmented polynomial system

    [ f_1(x)  ]
    [    ⋮    ]
    [ f_n(x)  ] = 0                                              (13.2.4)
    [ ℒ(x; a) ]

on C^N, where ℒ(x; a) is a general linear equation. The basic facts are as follows.
Theorem 13.2.1 (Slicing Theorem)  Let X ⊂ C^N denote a pure i-dimensional
affine algebraic set. There is a Zariski open dense subset U ⊂ P^N such that for
a ∈ U and L = V(ℒ(x; a)):

(1) if i = 0, then L ∩ X is empty;
(2) if i > 0, then L ∩ X is nonempty and (i − 1)-dimensional, and
    deg(L ∩ X) = deg X; and
(3) if i > 1 and X is irreducible, then L ∩ X is irreducible.

Items 1 and 2 of the theorem are rather elementary consequences of Bertini's
theorem A.7.1, but item 3 is deeper. A quick proof of this fact follows from the
Hironaka Desingularization Theorem A.4.1 and a vanishing theorem of Kodaira
type. See (Theorem 3.42 Shiffman & Sommese, 1985) for a proof in the projective
case. The affine case follows from this since

(1) the closure X̄ of X in P^N under the natural embedding C^N ⊂ P^N is
    projective;
(2) X is irreducible if and only if X̄ is irreducible; and
(3) X ∩ L is irreducible if X̄ ∩ L̄ is irreducible.
Theorem 13.2.1 is not quite strong enough to be conveniently used. We say that
a set of linear equations

    [ L_1(x) ]        [  1  ]
    [   ⋮    ] := A · [ x_1 ]
    [ L_K(x) ]        [  ⋮  ]
                      [ x_N ]

is generic with respect to an irreducible affine algebraic set X if, given any
subset L_{i_1}, …, L_{i_r} of r distinct L_j, it follows that

(1) either X ∩ V(L_{i_1}, …, L_{i_r}) is empty or
    dim X ∩ V(L_{i_1}, …, L_{i_r}) = dim X − r ≥ 0; and
(2) Sing(X ∩ V(L_{i_1}, …, L_{i_r})) ⊂ Sing(X).

We say that a set of linear equations L_1, …, L_K is generic with respect to the
irreducible components of an algebraic set X if the set of equations is generic
with respect to all irreducible components of X, plus all irreducible components
of intersections of any number of the irreducible components of X.
Theorem 13.2.2  Let X ⊂ C^N be an irreducible affine algebraic set. There is a
Zariski open dense subset U ⊂ C^{K×(N+1)} of K × (N + 1) matrices such that for
A ∈ U, the linear equations

    [ L_1(x) ]        [  1  ]
    [   ⋮    ] := A · [ x_1 ]
    [ L_K(x) ]        [  ⋮  ]
                      [ x_N ]

are generic with respect to the irreducible components of X.

This is a special case of the more general result Theorem A.9.2.
There is a further consequence that we do not make much of because we do not
keep track of multiplicities.

Lemma 13.2.3  Let f(x) = 0 be as in Equation 13.0.1. Assume that X is a
positive dimensional solution component of f(x) = 0 with multiplicity μ. Then
there is a Zariski open dense subset U ⊂ P^N such that for a ∈ U and
L = V(ℒ(x; a)), every component of X ∩ L is a component of the solution set of
the augmented system in Equation 13.2.4 with multiplicity μ.

Proof.  This follows from the stronger result (Lemma 1.7.2 Fulton, 1998). □

13.2.1 Extrinsic and Intrinsic Slicing


In putting Theorem 13.2.1 to use, we will often simultaneously slice an algebraic
set by several hyperplanes. The theorem implies that slicing an algebraic set by
i generic hyperplanes will cut the i-dimensional components of the set down to
isolated points. As we will see subsequently, this is a standard maneuver in many of
the algorithms of numerical algebraic geometry. Accordingly, it is useful to consider
how different formulations of slicing might affect computational efficiency.
The formulation we have used so far in this chapter, which we call an extrinsic
formulation, represents a linear space by a set of equations, as
L^{c[i]} = V(ℒ(x; A)), where A is an i × (N + 1) matrix. To find
V(f) ∩ L^{c[i]}, where f(x) : C^N → C^n, we simply concatenate the two systems
of equations to obtain the augmented system f̂(x) : C^N → C^{n+i} as

    f̂(x) := [  f(x)   ] = 0.                                    (13.2.5)
            [ ℒ(x; A) ]

Clearly, a general A with i ≤ N has full rank i. Using standard techniques from
linear algebra, we can write the solution set of ℒ(x; A) = 0 in the form

    L^{c[i]} = { x ∈ C^N | x = p + B·u for some u ∈ C^{N−i} },

where p ∈ C^N is a particular solution of ℒ(x; A) = 0 and where the columns of
the N × (N − i) matrix B are a basis for the null space of the last N columns
of A. Accordingly, an intrinsic formulation of slicing is the system

    f_L(u) := f(p + B·u) = 0.                                    (13.2.6)

The solutions of the extrinsic and intrinsic systems are isomorphic under the
mappings u ↦ p + B·u and x ↦ B†(x − p), where B† is the pseudoinverse of B.
Since f_L : C^{N−i} → C^n has fewer equations and variables than f̂(x), the
intrinsic formulation can save computation compared to the extrinsic one.
From a geometric point of view, the extrinsic and intrinsic formulations are
identical: they both describe the intersection of V(f) with L^{c[i]}.
Furthermore, in a situation where we wish to choose the slicing plane
generically among the set of all affine (N − i)-planes, it does not matter if we
do so by choosing random coefficients A as a point in (P^N)^i or if we choose
random (p, B) for the intrinsic formulation. Either way, we are choosing a
random slicing (N − i)-plane from the Grassmannian of all such planes in C^N.
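The extrinsic-to-intrinsic conversion is easy to carry out numerically. The following NumPy sketch (our illustration, not code from the book) builds p and B from a randomly chosen coefficient matrix A and checks that every point of the form x = p + B·u satisfies the extrinsic equations ℒ(x; A) = 0; the dimensions N and i are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N, i = 4, 2        # ambient dimension and slice codimension (arbitrary choices)

# extrinsic data: random i x (N+1) matrix A, so L(x; A) = a0 + A1 @ x
A = rng.standard_normal((i, N + 1)) + 1j * rng.standard_normal((i, N + 1))
a0, A1 = A[:, 0], A[:, 1:]

# particular solution p of A1 @ x = -a0, and columns of B spanning null(A1)
p = np.linalg.lstsq(A1, -a0, rcond=None)[0]
B = np.linalg.svd(A1)[2].conj().T[:, i:]      # N x (N - i) null-space basis

# every u in C^{N-i} gives a point x = p + B @ u on the slicing plane
u = rng.standard_normal(N - i) + 1j * rng.standard_normal(N - i)
x = p + B @ u
print(np.linalg.norm(a0 + A1 @ x) < 1e-12)    # True: L(x; A) = 0
```

Tracking paths in the intrinsic coordinates u ∈ C^{N−i} then works with N − i variables instead of N.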

13.3 Witness Sets

The strong version of the slicing theorem, Theorem 13.2.2, gives us everything
we need to justify our definition of a witness set. It tells us that for an
affine algebraic set X, a generic L^{c[i]} ⊂ C^N meets the irreducible
components of X as follows.

• It misses any irreducible components of dimension less than i.
• It meets each irreducible component X_{ij} of dimension i in deg X_{ij}
  isolated points, and these points do not lie on any other component.
• It intersects irreducible components of dimension k > i in an irreducible
  algebraic set of dimension k − i.

Moreover, Theorem 13.2.2 implies that L^{c[i]} will be generic with probability
one if we choose the coefficients of its defining linear equations at random
from C^{i×(N+1)}, so it is easy to numerically generate generic slicing spaces.
Accordingly, we adopt the following definition.

Definition 13.3.1 (Witness Set)  Let Z ⊂ C^N be an affine algebraic set, and
let X be a pure i-dimensional component of Z. Then a witness point set for X as
a component of Z is the set X ∩ L, where L ⊂ C^N is a linear space of
codimension i that is generic with respect to all the irreducible components of
Z. A witness point set for Z is just a collection of one witness point set at
each dimension i, i = 0, …, dim Z. Finally, when X is an irreducible algebraic
set, we say that X ∩ L is an irreducible witness point set.

The witness point set for a pure i-dimensional set X_i whose irreducible
components are X_{ij}, j ∈ I_i, is just

    X_i ∩ L^{c[i]} = (∪_{j∈I_i} X_{ij}) ∩ L^{c[i]} = ∪_{j∈I_i} (X_{ij} ∩ L^{c[i]}).

If X_i has more than one irreducible component, then its witness point set can
be decomposed into a collection of irreducible witness point sets for its
components.
The witness point sets are the main theoretical construct of interest in our
approach. The key to answering all of the questions at the top of this chapter
comes down to computing witness point sets. However, for a particular slicing
plane L^{c[i]}, i > 0, a finite set of points in L^{c[i]} does not uniquely
identify any particular algebraic set: many different algebraic sets pass
through those same points. Accordingly, we see that a witness point set alone is
not a complete encoding of an algebraic set; for that, we need to carry along
additional symbolic information describing the set.
We wish to define a witness set for an affine algebraic set X to consist of a
witness point set plus such additional information required to uniquely define
X. Suppose that X is an i-dimensional irreducible solution component of a
system of polynomial equations f(x) = 0. If we have a symbolic formulation of
f(x) on hand, the set {X ∩ L^{c[i]}, L^{c[i]}, f} defines X uniquely: the
witness points tell us which component of V(f) is of interest.¹ We hasten to
add that situations may arise where we define an affine algebraic set in some
indirect manner such that, although there must exist a set of polynomial
equations that define the set, we do not necessarily have such a set at hand.
As such situations arise, our definition of a witness set will be adapted to
accommodate them.
While a witness set of the form {X ∩ L^{c[i]}, L^{c[i]}, f} is everything we
need for theoretical purposes, it is not always sufficient from the numerical
point of view. As a data structure in a computer program, we want to treat f as
a pointer to a black-box routine that, given a point x*, returns just the
function value f(x*) and the Jacobian matrix ∂f/∂x(x*), as floating point
numbers. We would prefer not to perform any symbolic manipulations to extract
information from f. Even in the case of a zero-dimensional algebraic set, which
is just a finite set of points, a numerical witness point set is just a list of
approximations to those points. A witness set should carry along enough
additional information to allow us to numerically refine these approximations
to higher precision. Exactly what additional information we carry along to
numerically encode an algebraic set X will depend on the properties of X and
also on the initial symbolic information we have been given to uniquely
describe X. Accordingly, we will have several different flavors of witness set,
but each will include a witness point set and enough additional information to
allow us to use the witness set in our numerical algorithms. In an
implementation of these algorithms in computer code, the witness set would be a
data structure that includes a field identifying its flavor, and basic
operations on witness sets need to be able to handle all such flavors.
In the next few paragraphs, we define three flavors of witness sets that are
useful in numerical work. We will not yet give numerical algorithms for
computing such sets; these come later in the chapter.
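As a concrete illustration of such a data structure, here is one possible Python layout (the field and type names are our own, not the book's); the flavor field selects which extra data accompany the witness point set.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional
import numpy as np

class Flavor(Enum):
    REDUCED = 1     # points are nonsingular roots of the augmented system {f, L}
    DEFLATED = 2    # points lift to nonsingular roots of g(y), projected by pi
    NONREDUCED = 3  # points are singular endpoints of a homotopy h(x, t)

@dataclass
class WitnessSet:
    flavor: Flavor
    points: np.ndarray         # witness point set, one point of C^N per row
    slice_coeffs: np.ndarray   # i x (N+1) coefficients of the slicing plane
    system: Callable           # black box returning f(x) and its Jacobian
    deflation: Optional[Callable] = None         # g and pi, DEFLATED only
    homotopy: Optional[Callable] = None          # h(x, t), NONREDUCED only
    endgame_points: Optional[np.ndarray] = None  # W(eps), NONREDUCED only
    eps: Optional[float] = None

    @property
    def dim(self) -> int:      # component dimension = codimension of the slice
        return self.slice_coeffs.shape[0]

    @property
    def degree(self) -> int:   # number of witness points
        return self.points.shape[0]
```

With this layout, the dimension and degree of the encoded component are read directly off the stored slice and point list.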

13.3.1 Witness Sets for Reduced Components


Let us remind ourselves of the meaning of "reduced." The notions of reduced
versus nonreduced are not to be confused with reducible versus irreducible. The
line {(x, y) ∈ C^2 | x = 0} is an irreducible algebraic set that is a reduced solution
component of the equation xy = 0, but it is a nonreduced solution component of
the equation x²y = 0. Thus, we see that reduced and nonreduced are not intrinsic
properties of the set as a geometric entity, but relate to algebraic properties of the
system of polynomials that we use to define the set. Reduced is synonymous with
multiplicity-one, while nonreduced implies a multiplicity greater than one.
The salient point is that if X_i ⊂ C^N is an i-dimensional reduced solution
component of the system of equations f = 0, it meets a generic codimension i
slicing plane in witness points having multiplicity one. Such a point is
numerically tame: the Jacobian matrix of the system of equations defining the
point is full rank N. Letting L(x) = 0 denote a system of i independent linear
equations for the slicing plane, the witness points are solutions of the
augmented system {f(x), L(x)} = 0. If the number of equations, n, in this
system is equal to the number of unknowns, N, then Newton's method converges
quadratically in the neighborhood of the witness point. If n > N, the
Gauss-Newton method, that is, Newton's method modified to use a least-squares
iterative step, converges quadratically. This is quite satisfactory for our
numerical work, so when X_i is a reduced component, we use

    {X_i ∩ L^{c[i]}, L^{c[i]}, f}

as its witness set.

¹ Strictly speaking, the slicing plane L^{c[i]} is not necessary, because it is
either uniquely determined by the witness points or, if not, we can pick a
slicing plane at random from among all the (N − i)-dimensional linear spaces
that interpolate the witness points. Nevertheless, it is convenient in our
algorithms to have it on hand rather than to regenerate it when needed.
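The Gauss-Newton refinement just described is simple to implement. The sketch below (our own illustration and naming, not the book's code) applies least-squares Newton steps to the augmented system {f(x), ℒ(x; A)} = 0, and tests it by refining a rough witness point of a circle sliced by one generic line.

```python
import numpy as np

def refine_witness_point(f_and_jac, x0, slice_A, tol=1e-12, maxit=20):
    """Gauss-Newton refinement of a witness point of a reduced component:
    least-squares Newton steps on the augmented system {f(x), L(x; A)} = 0.
    f_and_jac(x) returns (f(x), df/dx(x)); slice_A is i x (N+1)."""
    a0, A1 = slice_A[:, 0], slice_A[:, 1:]
    x = np.array(x0, dtype=complex)
    for _ in range(maxit):
        fx, Jf = f_and_jac(x)
        r = np.concatenate([np.atleast_1d(fx), a0 + A1 @ x])  # residual
        J = np.vstack([np.atleast_2d(Jf), A1])                # Jacobian
        dx = np.linalg.lstsq(J, -r, rcond=None)[0]            # LSQ Newton step
        x = x + dx
        if np.linalg.norm(dx) <= tol * (1 + np.linalg.norm(x)):
            break
    return x

# usage sketch: the circle f = x1^2 + x2^2 - 1 in C^2, sliced by one generic
# line, has deg = 2 witness points; refine a rough starting guess to one.
def circle(x):
    return np.array([x[0]**2 + x[1]**2 - 1]), np.array([[2*x[0], 2*x[1]]])

A = np.array([[0.3 + 0.1j, 1.0, 0.7]])          # [a0 | a1 a2]: the slice line
x = refine_witness_point(circle, np.array([1.0 + 0j, -0.5 + 0j]), A)
print(abs(x[0]**2 + x[1]**2 - 1) < 1e-10)       # True: refined onto the circle
```

Because the witness point is a nonsingular root of the augmented system, the iteration converges quadratically from any reasonable starting approximation.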

13.3.2 Witness Sets for Deflated Components


In the case that an irreducible algebraic set X is a nonreduced solution
component of a polynomial system f(x) = 0, its witness points are isolated
roots of multiplicity greater than one for the augmented system
f̂(x) := {f(x), L(x)} = 0. This means that the Jacobian matrix evaluated at
such a root is not full rank: the witness points are singular. As discussed in
§ 10.5, the behavior of Newton's method near singular roots is greatly degraded
and it may even diverge. However, since the witness points are isolated, one or
more stages of deflation, as described in Equation 10.5.5, may allow us to
compute the witness points in a nonsingular manner. That is, for a witness
point x*, deflation produces a new system of equations, say g(y) = 0, with a
projection operator, say π : y ↦ x, such that y* is a nonsingular solution of
g(y) = 0 and x* = π(y*). In fact, since the slicing plane V(L) is generic and X
is irreducible, the same deflation system that works on one witness point
x* ∈ X ∩ V(L) must work for all other witness points in the set.
Let us restate this in a slightly more general way, independent of the specific
deflation technique described in § 10.5. Suppose that we have an i-dimensional
irreducible algebraic set X ⊂ C^N, and a linear projection π : C^M → C^N, with
M > N. Let ℒ(x; A) = 0 be a system of linear equations parameterized by matrix
A, as in § 13.2, such that a generic A defines a codimension i linear space.
Suppose that for generic A, each witness point x* ∈ X ∩ V(ℒ(x; A)) has above it
a point y* ∈ C^M that is a nonsingular solution of a system of polynomial
equations g(y; A) = 0. For a particular generic slicing plane L^{c[i]}, suppose
W_y ⊂ C^M is a collection of such nonsingular solution points, one for each
point in X ∩ L^{c[i]}, so that π(W_y) = X ∩ L^{c[i]}. Then, we may use as our
numerical witness set for X the data

    {W_y, L^{c[i]}, g, π}.

Of course, in our numerical work, W_y will be a numerical approximation to the
ideal points. To refine the witness point set π(W_y), we use Newton's method to
refine W_y as solution points of g(y) = 0, and then project.
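The effect of deflation on convergence is easy to see in one variable. The following toy illustration of the § 10.5 mechanism (our own example, not a witness-point computation): at the double root x* = 0 of f(x) = x², plain Newton converges only linearly, while Gauss-Newton on the deflated system g(x) = (f(x), f′(x)) regains fast convergence because x* is a nonsingular root of g.

```python
# plain Newton on f(x) = x^2 stalls (linear convergence) at the double root
# x* = 0, while Gauss-Newton on the deflated system g(x) = (f(x), f'(x))
# converges fast, since x* is a nonsingular root of g.
x_newton = x_deflated = 1.0
for _ in range(8):
    x_newton -= x_newton**2 / (2 * x_newton)           # Newton step: halves x
    g = (x_deflated**2, 2 * x_deflated)                # deflated residuals
    J = (2 * x_deflated, 2.0)                          # their derivatives
    x_deflated -= (J[0]*g[0] + J[1]*g[1]) / (J[0]**2 + J[1]**2)  # LSQ step
print(abs(x_newton) > 1e-3)      # True: only ~2^-8 after 8 halvings
print(abs(x_deflated) < 1e-12)   # True: essentially converged
```

Here the deflated step is the one-variable Gauss-Newton update −(JᵀJ)⁻¹Jᵀg written out by hand.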

13.3.3 Witness Sets for Nonreduced Components


It might not always be possible or convenient to find a deflation formulation
for a nonreduced solution component. For example, it may be that several stages
of deflation would be necessary, giving an unreasonably large dimension M in
which the numerical work is carried out. Alternatively, it may happen that the
algebraic set in question is not given to us as a solution component of a
system of polynomial equations. For example, it could be defined as the
intersection of two solution components of a system. Such a set is certainly
described by some system of equations, but it might take considerable symbolic
computation to construct them from the data at hand. In this section, we
present a third flavor of witness set that can handle many such situations.
Suppose that we use a homotopy method to solve for the witness points, that is,
each point w ∈ W := X ∩ L^{c[i]} is the endpoint of some solution path x_w(t)
of a homotopy function h(x, t) = 0, i.e., h(x_w(t), t) = 0,
lim_{t→0} x_w(t) = w. We will construct explicit examples of such homotopies
below, but for now, we simply posit the existence of one. When X has
multiplicity greater than one as an i-dimensional solution component of a given
polynomial system f, the homotopy we construct will have w as a singular
endpoint.
Recall from Chapter 10 that we have several methods for computing singular
endpoints of homotopy paths. In the power-series endgame or the closely related
Cauchy integral endgame, we estimate the endpoint by building a local model of
the end of the path, sampling the path for small t inside the endgame
convergence radius but outside the ill-conditioned zone at t = 0. Taking this
route, we define the set

    W(ε) := { x_w(ε) | w ∈ W },

consisting of the solutions of h(x, ε) = 0 that lead to the witness points W as
ε → 0, with ε ∈ (0, 1].
Our third flavor of witness set for an i-dimensional algebraic set X is,
accordingly, the data

    {W, L^{c[i]}, h(x, t), W(ε), ε},

where, in addition to the conditions that L^{c[i]} is a generic
(N − i)-dimensional linear subspace of C^N with W := L^{c[i]} ∩ X, we have
h(x, t) and W(ε) satisfying

(1) for each point w ∈ W, we have a positive ε > 0 and a nonsingular path
    x_w(t) : (0, ε] → C^N with h(x_w(t), t) = 0 and lim_{t→0} x_w(t) = w, and
(2) W(ε) = { x_w(ε) | w ∈ W }.
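The role of W(ε) can be seen in a toy Cauchy-style endgame (our own example, not code from the book). The path x(t) = w + t^{1/2} has a singular endpoint w at t = 0 with winding number 2; averaging samples of x around a double loop of radius ε recovers w without ever evaluating near the ill-conditioned zone at t = 0.

```python
import numpy as np

# toy Cauchy-style endgame: the path x(t) = w + t**0.5 has singular endpoint
# w at t = 0 with winding number c = 2. Averaging m samples per loop of x
# over the double loop t = eps*exp(i*theta), theta in [0, 4*pi), recovers w.
w, eps, c, m = 2.0, 0.01, 2, 64
theta = 2 * np.pi * np.arange(c * m) / m               # theta in [0, 4*pi)
x_samples = w + np.sqrt(eps) * np.exp(0.5j * theta)    # x along the loop
print(abs(x_samples.mean() - w) < 1e-12)               # True: mean -> w
```

Re-running such an endgame with the stored W(ε) and ε as starting data is exactly how the third-flavor witness set is refined in higher precision.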

Whenever we wish to refine a numerical approximation of the witness set, we can
do so by re-playing the singular endgame in higher precision, using W(ε) and ε
to initialize the solution paths of the homotopy.
Whichever treatment of nonreduced components we choose, we still refer to W as
a witness point set for X. For simplicity of statement, we often suppress the
reference to h(x, t) or g(y) and refer to the witness set (W, L^{c[i]}, f) in
both the reduced and nonreduced cases.
We will soon turn to the task of computing witness point sets, but first we
prepare by discussing the rank of a polynomial system and randomizations of
polynomial systems.

13.4 Rank of a Polynomial System

Let f(x) = 0 denote a system of n polynomials on C^N. We define the rank of the
polynomial system f(x) = 0 to be the dimension of the closure of the image
f(C^N) ⊂ C^n. By Corollary 12.5.7, the closure of f(C^N) is an irreducible
affine algebraic set. We denote the rank of f by rank f. We define the corank
of the polynomial system f(x) = 0 to be N − rank f. In the classical case of a
linear system

    A · x := A · [ x_1 ]
                 [  ⋮  ]
                 [ x_N ],

where A is an n × N matrix, the rank of the system is the classical rank A of
the matrix A, and the corank is the dimension of the null space of A · x = 0.
Note that given a system f as above, neither adding polynomials in the equations
of f to the system nor replacing f with F · f, where F is an invertible n × n
matrix, changes the rank of the system.

Lemma 13.4.1  Let f(x) = 0 denote a system of n polynomials on C^N. Then there
is a Zariski open set Y ⊂ f(C^N) such that for y ∈ Y, V(f(x) − y) is smooth of
dimension equal to the corank of f. Moreover, the Jacobian matrix of f is of
rank equal to rank f at all points of V(f(x) − y).

Proof.  This is a corollary of Theorem A.6.1 with X taken as C^N. □

An important consequence of the above is that for the dense Zariski open set
U := f^{−1}(Y), we have for all points x* ∈ U, the rank of the Jacobian of f
evaluated at x* equals rank f. This gives us a fast probability-one algorithm
for the rank of a system f. Explicitly, given a system f(x) of n polynomials on
C^N, the rank of f equals the rank of the Jacobian at a random point of C^N. To
emphasize its importance, we restate the algorithm below.

Rank of a System: r = rank(f)

• Input: Polynomial system f(x) : C^N → C^n.
• Output: r := rank f.
• Procedure:
  — Choose a random x* ∈ C^N.
  — r := rank( ∂f/∂x (x*) ).
  — return(r).

The numerical determination of the rank of the Jacobian matrix is best done using
the singular value decomposition.
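The boxed algorithm is a few lines of NumPy. In this sketch (our illustration; the finite-difference Jacobian and the tolerances are our own choices, not the book's), the rank is read off from the singular values of a forward-difference Jacobian at a random complex point.

```python
import numpy as np

def system_rank(f, N, h=1e-6, tol=1e-5, seed=3):
    """Probability-one rank of f : C^N -> C^n: the rank of the Jacobian,
    estimated by forward differences and the SVD, at a random complex point."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
    f0 = np.asarray(f(x), dtype=complex)
    J = np.empty((f0.size, N), dtype=complex)
    for k in range(N):
        step = np.zeros(N, dtype=complex)
        step[k] = h
        J[:, k] = (np.asarray(f(x + step), dtype=complex) - f0) / h
    s = np.linalg.svd(J, compute_uv=False)
    return int(np.sum(s > tol * max(s[0], 1.0)))

# f(x) = (x1*x2, x1*x3) on C^3 has rank 2, hence corank 1; consistent with
# Theorem 13.4.2, every component of V(f) has dimension at least 1.
print(system_rank(lambda x: [x[0] * x[1], x[0] * x[2]], 3))   # 2
```

In serious use one would supply an analytic Jacobian, but the structure of the algorithm, random point, Jacobian, SVD, threshold, is the same.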
The rank intervenes in the study of systems in the following way, which will play
an important role in subsequent developments.

Theorem 13.4.2  Given a system f(x) of n polynomials on C^N, all irreducible
components of V(f) have dimension at least equal to the corank of f.

Proof.  As noted in Lemma 13.4.1, the corank of f is dim_x f^{−1}(f(x)) for x
in a dense Zariski open set of C^N. Thus, by Theorem A.4.5, the set

    { x ∈ C^N | dim_x f^{−1}(f(x)) ≥ corank f }

must equal C^N. □

Remark 13.4.3  Surprisingly, prior to this book, the rank of a system had not
been defined explicitly in numerical algebraic geometry.

Example 13.4.4 Consider the space of 3 × 3 orthogonal matrices, usually denoted
SO(3). For any matrix A ∈ SO(3), we have the defining equations A^T A = I,
det A = 1, where I is the identity. Because A^T A is symmetric, the first matrix
condition amounts to just 6 scalar equations, so we have 7 polynomial equations
depending on the 9 entries in A. However, the rank of the system is only 6. Thus,
SO(3) can have no components of dimension smaller than 9 − 6 = 3. Of course, it
is well-known that the dimension of SO(3) is three, so the rank condition is sharp
in this case.
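To illustrate the rank computation numerically, the algorithm above can be run on the SO(3) system. The sketch below is an illustration of ours, not HOMLAB code; the forward-difference step, the tolerance, and the random seed are arbitrary choices.

```python
import numpy as np

def f(v):
    # The 7 polynomials cutting out SO(3): the 6 upper-triangular entries
    # of A^T A - I together with det(A) - 1, where A is the 3x3 matrix
    # formed from the 9 coordinates of v.
    A = v.reshape(3, 3)
    G = A.T @ A - np.eye(3)
    return np.concatenate([G[np.triu_indices(3)], [np.linalg.det(A) - 1]])

def jacobian(f, v, h=1e-6):
    # Forward-difference Jacobian; accurate enough here, since we only need
    # to separate O(1) singular values from O(h) ones.
    f0 = f(v)
    J = np.empty((f0.size, v.size), dtype=complex)
    for j in range(v.size):
        vp = v.copy()
        vp[j] += h
        J[:, j] = (f(vp) - f0) / h
    return J

rng = np.random.default_rng(0)
v = rng.standard_normal(9) + 1j * rng.standard_normal(9)  # random point of C^9
s = np.linalg.svd(jacobian(f, v), compute_uv=False)
rank = int(np.sum(s > 1e-3 * s[0]))
print(rank)  # 6, so no component of SO(3) has dimension below 9 - 6 = 3
```

Only six of the seven singular values are of order one; the seventh is at the level of the finite-difference error, which is how the singular value decomposition reveals the rank in practice.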

The definitions of rank and corank make sense for systems of algebraic func-
tions, e.g., rational functions, defined on an irreducible quasiprojective algebraic
set X. Lemma 13.4.1, Theorem 13.4.2, and the algorithm for the rank carry over
immediately with the same proofs to this situation. This generalization, which will
be needed in Chapter 16, is presented in § A.6.
Basic Numerical Algebraic Geometry 241

13.5 Randomization and Nonsquare Systems

We define a square system to be a system

    f(x) = [f_1(x), ..., f_n(x)]^T = 0        (13.5.7)

of polynomials on C^N with n = N. When we numerically solve a system of equa-
tions, it is usually convenient, and sometimes necessary, to have the same number
of equations as unknowns.
The systems we wish to study might not be square. If n < N, we call the system
underdetermined, and if n > N, we call it overdetermined. However, if it is under-
determined, its rank is at most n, so by Theorem 13.4.2, its irreducible solution
components must have dimension at least N − n. We will work with such components
by slicing them with at least N — n hyperplanes, resulting in an augmented sys-
tem having at least as many equations as unknowns. Of course, when augmented
by slicing planes, square systems become overdetermined, and overdetermined sys-
tems stay overdetermined. Therefore, we see that the overdetermined case needs
attention.
To find the isolated solutions of an overdetermined system, n > N, the naive
approach is to pick out N equations, solve them, and check the solution points
against the remaining equations. This approach is fraught with peril. For example,
consider the system:

    x y = 0
    x(x − y) = 0        (13.5.8)
    y(x − y) = 0.

Any two of the 3 equations have a 1-dimensional solution set, but all three together
have the origin (with multiplicity 3) as the solution set.
There is a natural procedure for obtaining a square system from the above
system, f(x) = 0. Given an N × n matrix of complex numbers Λ ∈ C^{N×n}, we can
form a square system

    Λ · f = [λ_{1,1} f_1 + ⋯ + λ_{1,n} f_n , ... , λ_{N,1} f_1 + ⋯ + λ_{N,n} f_n]^T.
As we will show below, this square system has all the properties we need to compute
an irreducible decomposition of V(f), and in our first article (Sommese & Wampler,
1996) on Numerical Algebraic Geometry, this was our approach.
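As a small numerical illustration of this squaring-up, the following sketch (an illustration of ours, with three made-up quadratics in two unknowns) forms Λ · f for a generic complex 2 × 3 matrix and checks the identity, used in the discussion that follows, that Λ · f agrees with [I Λ'] · f up to an invertible factor.

```python
import numpy as np

rng = np.random.default_rng(2)

def f(x, y):
    # three made-up polynomials in two unknowns: n = 3 equations, N = 2
    return np.array([x * y, x * (x - y), y * (x - y)])

# generic complex 2x3 randomizing matrix Lam (standing in for Lambda)
Lam = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
Lam1, Lam2 = Lam[:, :2], Lam[:, 2:]      # column split; Lam1 is 2x2
Lp = np.linalg.solve(Lam1, Lam2)         # Lam' = Lam1^{-1} Lam2

# Lam . f and [I Lam'] . f differ by the invertible factor Lam1, so they
# vanish at exactly the same points; check the identity at random points.
for _ in range(5):
    x, y = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    assert np.allclose(Lam @ f(x, y),
                       Lam1 @ (np.hstack([np.eye(2), Lp]) @ f(x, y)))
print("Lam.f and [I Lam'].f agree up to the invertible factor Lam1")
```

Because Λ_1 is invertible with probability one, either form may be used; the reduced form [I Λ'] is often preferable for homotopy continuation, as discussed below.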
In the following paragraphs, we present a somewhat more general view of ran-
domization, which is essential in dealing with intersections of irreducible algebraic

sets (Sommese et al., 2004b). In particular, let us consider the system Λ · f(x),
where the k × n matrix Λ ∈ C^{k×n} is chosen generically. If k = N, this is a square
system, but we may consider k ≠ N as well. Let us discuss the principal facts about
this construction.

First note that this construction is only of interest if k < n. To see this, note
that if k = n, then such a Λ is invertible. Consequently, the systems Λ · f(x) = 0
and f(x) = Λ^{−1} · Λ · f(x) = 0 are equivalent. If k > n, then we may break Λ ∈ C^{k×n}
into two submatrices by rows; the matrix Λ_1 formed from the first n rows is an
invertible matrix for a nonempty Zariski open set of the k × n matrices Λ ∈ C^{k×n}.
Let Λ_2 be the remaining (k − n) × n matrix formed from the last k − n rows of Λ
and let

    Γ := [ Λ_1^{−1}, 0_{n×(k−n)} ; −Λ_2 · Λ_1^{−1}, I_{k−n} ]

with 0_{n×(k−n)} the n × (k − n) matrix with all zero entries and I_{k−n} the (k − n) ×
(k − n) identity matrix. Then Γ is invertible and Λ · f(x) = 0 is equivalent to

    Γ · Λ · f(x) = [ f_1(x), ..., f_n(x), 0_{(k−n)×1} ]^T = 0.

Thus, only if k < n is this construction interesting.


If k < n, then we may break Λ into two submatrices as Λ = [Λ_1 Λ_2], where Λ_1 is
k × k. Submatrix Λ_1 is an invertible matrix for a nonempty Zariski open set of the
k × n matrices Λ ∈ C^{k×n}. Thus, Λ · f(x) is equivalent to Λ_1^{−1} · Λ · f(x) = [I Λ'] · f(x),
where Λ' = Λ_1^{−1} Λ_2. In other words, the system is of the form

    [f_1, ..., f_k]^T + Λ' · [f_{k+1}, ..., f_n]^T

for a nonempty Zariski open set of k × (n − k) matrices Λ' ∈ C^{k×(n−k)}.


It is important to note that though mathematically Λ · f(x) and [I Λ'] · f(x)
are equivalent, the latter may be better than the former for homotopy continu-
ation. Moreover, the ordering of the equations can matter. For example, if the
equations were

    [ x_1^2 + x_2^2 − 1 , x_2 , x_1 − 1 ]^T = 0
Basic Numerical Algebraic Geometry 243

the randomization

    [ x_1^2 + x_2^2 − 1 + λ_1(x_1 − 1) , x_2 + λ_2(x_1 − 1) ]^T = 0

would be better than the randomization

    [ x_2 + λ_1(x_1^2 + x_2^2 − 1) , x_1 − 1 + λ_2(x_1^2 + x_2^2 − 1) ]^T = 0

since there would be only two paths to follow using a total degree homotopy on the
former as opposed to four paths on the latter.
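This path count is easy to verify symbolically. In the sketch below (an illustration of ours, with symbols standing in for the random constants λ1 and λ2), the total-degree count of each randomization is just the product of the degrees of its equations:

```python
import sympy as sp
from math import prod

x1, x2, l1, l2 = sp.symbols('x1 x2 lambda1 lambda2')
f = [x1**2 + x2**2 - 1, x2, x1 - 1]

def total_degree_paths(system):
    # a total-degree homotopy tracks prod(deg g_i) paths
    return prod(sp.Poly(g, x1, x2).total_degree() for g in system)

fold_linear = [f[0] + l1 * f[2], f[1] + l2 * f[2]]   # fold x1 - 1 into the others
fold_quadric = [f[1] + l1 * f[0], f[2] + l2 * f[0]]  # fold the quadric instead
print(total_degree_paths(fold_linear), total_degree_paths(fold_quadric))  # 2 4
```

Folding the linear equation into the others leaves degrees (2, 1), while folding the quadric in produces degrees (2, 2), hence twice as many paths.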
The key properties of randomization are given by the following simple theorem
of Bertini type.
Theorem 13.5.1 Let

    f(x) = [f_1(x), ..., f_n(x)]^T = 0

be a system of polynomials on C^N. Assume that A ⊂ C^N is an irreducible affine
algebraic set. Then there is a nonempty Zariski open set U of k × n matrices
Λ ∈ C^{k×n} such that for Λ ∈ U

(1) if dim A > N − k, then A is an irreducible component of V(f) if and only if it
    is an irreducible component of V(Λ · f);
(2) if dim A = N − k, then A is an irreducible component of V(f) implies that A
    is also an irreducible component of V(Λ · f); and
(3) if A is an irreducible component of V(f), its multiplicity as a solution compo-
    nent of Λ · f(x) = 0 is greater than or equal to its multiplicity as a solution
    component of f(x) = 0, with equality if either multiplicity is 1.
It is important to emphasize that although an irreducible component of V(f) is
an irreducible component of the randomized system, V(Λ · f), its multiplicity as an
irreducible solution component of Λ · f = 0 (if not 1) might be larger than as an
irreducible solution component of f = 0. The following system, which is equivalent
to the system Equation 13.5.8, illustrates this:

    x y = 0
    x^2 = 0        (13.5.9)
    y^2 = 0.

The origin is an isolated solution of multiplicity 3. The randomized square system

    x(y + μ_1 x) = 0        (13.5.10)
    y(x + μ_2 y) = 0

has the origin as an isolated solution of multiplicity 4.
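Both multiplicity claims can be checked symbolically. Since the origin is the only affine solution in each case, the multiplicity equals dim C[x,y]/I, which the following sketch (a check of ours, using rational stand-ins for the generic constants μ1 and μ2) counts as the number of standard monomials of a Gröbner basis:

```python
import sympy as sp

x, y = sp.symbols('x y')

def multiplicity_at_origin(polys):
    # dim C[x,y]/I, valid as a multiplicity count when the origin is the
    # only affine solution: count the monomials not divisible by any
    # leading monomial of a Groebner basis (the "standard" monomials).
    G = sp.groebner(polys, x, y, order='grevlex')
    lead = [sp.Poly(g, x, y).monoms(order='grevlex')[0] for g in G.exprs]
    count, d = 0, 0
    while True:
        std = [(i, d - i) for i in range(d + 1)
               if not any(i >= a and d - i >= b for a, b in lead)]
        if not std:   # once a whole degree is excluded, so are all higher ones
            return count
        count += len(std)
        d += 1

m1, m2 = sp.Rational(2, 3), sp.Rational(3, 5)  # stand-ins for generic mu1, mu2
print(multiplicity_at_origin([x*y, x**2, y**2]))             # 3
print(multiplicity_at_origin([x*(y + m1*x), y*(x + m2*y)]))  # 4
```

The standard monomials are {1, x, y} in the first case and {1, x, y, y^2} in the second, giving multiplicities 3 and 4 as claimed.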


The randomization of a system will be used often enough that we introduce a
new notation for it. We let ℜ(f(x); k) denote a randomization Λ · f(x) with Λ a
k × n matrix. We also may write ℜ(f_1(x), ..., f_n(x); k) to mean the same sort of
randomization acting on the system obtained by stacking up all the functions f_i(x).

When we use the randomization method in probability-one algorithms, Λ must
be chosen from a Zariski open dense set that is defined by the problem at hand, and
it may depend on other choices we make in the algorithm. Logically, the open set
from which we must choose Λ is not defined until all such choices are made, and so
we should choose Λ last. Operationally, we usually do not have a computationally
useful description of the set nor do we need one, since a random choice of Λ will be
in the set with probability one. Accordingly, it does not matter when we choose Λ
in the course of the procedure, as long as the choice is made independently of the
conditions that define the invalid set.

13.6 Witness Supersets

Suppose that we wish to compute the numerical irreducible decomposition, Equa-
tion 13.1.3, of V(f) for some polynomial system f : C^N → C^n. The logical first
step is to find a witness point set, W_i := V_i ∩ L^{c(i)}, for each pure-dimensional so-
lution component, V_i. A second step would then decompose these into irreducible
components. Unfortunately, we do not have an algorithm for directly computing
the W_i, but we can readily compute a looser set Ŵ_i that contains W_i. We will call
such a set a witness point superset, defined as follows.

Definition 13.6.1 (Witness Point Superset) Let Z ⊂ C^N be an affine alge-
braic set, and let X be a pure i-dimensional component of Z. Then Ŵ ⊂ C^N is a
witness point superset for X as a component of Z if it meets the requirements:

(1) Ŵ is a finite set of points;
(2) Ŵ ⊂ Z ∩ L^{c(i)}; and
(3) (X ∩ L^{c(i)}) ⊂ Ŵ,

where L^{c(i)} ⊂ C^N is a generic linear space of codimension i. A witness superset for
Z is just a collection of one witness point superset at each dimension along with the
corresponding linear slicing space, L^{c(i)}, at each dimension.

Remark 13.6.2 Since for generic L^{c(i)}, Z ∩ L^{c(i)} is empty for i > dim Z, we see
that the witness point supersets for all dimensions greater than dim Z are empty.

Let V_i be the union of all the i-dimensional irreducible components of V(f), and
let Ŵ_i be a witness superset for V_i. If V_i is not the maximal dimensional component
of V(f), then a linear space L^{c(i)} will meet the higher dimensional components, and

Ŵ_i will likely contain some points on those components. That is

    Ŵ_i = W_i + J_i        (13.6.11)

where J_i ⊂ ∪_{k>i} V_k. We call J_i the "junk points" in Ŵ_i. Even when i = 0, i.e.,
the classical case of finding isolated solutions of f(x) = 0, the homotopy methods
of Part II return Ŵ_0 and give no ready method to distinguish isolated singular
solutions in W_0 from the junk points J_0. In Chapter 15, we will present algorithms
that discard the junk points J_i to get W_i and then further decompose the W_i into
the W_{ij} of Equation 13.1.3. This will give the numerical irreducible decomposition.
For the present, we will concentrate on finding the witness point supersets.
We can compute witness point supersets using homotopy continuation. Theo-
rem 8.3.1 gives conditions for a homotopy to find all isolated solutions of a square
system of polynomial equations. Total degree homotopies and multihomogeneous
homotopies as given in § 8.4.1 and § 8.4.2 have start systems with only nonsingular
roots, so they satisfy the required conditions for finding all isolated roots in C^N.
The linear product homotopies of § 8.4.3 and the polyhedral homotopies of § 8.5
do the same on (C*)^N. It is crucial, though, to note that the theory only holds for
square systems. In the pseudocode below, we assume the availability of a homotopy
procedure S = HomSolve(g) that returns a list of points S ⊂ V(g) that contains
all isolated points of V(g) for any square polynomial system g : C^N → C^N.
A quick look at our situation reveals that for all but the lowest dimension, simply
appending a set of linear slicing equations to f(x) will give us an overdetermined
system. V(f) may have components at dimensions i = N, N − 1, ..., N − rank f,
and rank f ≤ min(N, n), so only dimensions i ≥ max(N − n, 0) are of interest. To
get witness points at dimension i, we slice with i linear equations, say L(x), and so
{f(x), L(x)} has i + n ≥ N equations. How can we solve such systems? The answer
is to employ randomization to square up the system, as in the following algorithm.

Witness Superset for Dimension i: [Ŵ, L] = WitnessSupi(f, i)

• Input: Polynomial system f : C^N → C^n, and an integer 0 ≤ i ≤ N.
• Output: A generic (N − i)-dimensional linear space L ⊂ C^N and a set of points
  Ŵ that includes the witness points W := X ∩ L, where X is the i-dimensional
  component of V(f).
• Procedure:
  — If i = N, perform the probabilistic null test as follows:
    * Choose a random point x* ∈ C^N.
    * If f(x*) = 0, then return(Ŵ := x*, L := x*).
    * Else, return(Ŵ := ∅, L := x*).
  — Else, choose a random point a ∈ C^i and a random matrix A ∈ C^{i×N}.
  — Let L be V(ℓ), where ℓ(x) := a + A · x.

  — If i ≥ N − rank f,
    * Compute S := HomSolve({ℜ(f; N − i), ℓ}).
    * Let Ŵ := {s ∈ S | f(s) = 0}.
    * return(Ŵ, L).
  — Else return(Ŵ := ∅, L).

Theorem 13.6.3 For 0 ≤ i ≤ N, there is a Zariski open dense set U ⊂ C^{i×(N+1)}
such that for (a, A) ∈ U, algorithm WitnessSupi returns a witness point superset
for the i-dimensional component of V(f).

Proof. By Theorem 13.5.1, if A is an i-dimensional irreducible solution component of
f(x) = 0, then it is also an irreducible solution component of F(x) := ℜ(f; N − i).
Therefore, the witness points A ∩ L^{c(i)} are isolated points in both V(f) ∩ L^{c(i)} and
V(F) ∩ L^{c(i)}, for a generic linear space L^{c(i)} ⊂ C^N of codimension i. By assumption,
the set of points S returned by S := HomSolve(g) is finite and includes all isolated
solutions of V(g), so Ŵ must include A ∩ L^{c(i)}. This holds for every i-dimensional
irreducible component of V(f), so Ŵ includes the witness points for the entire
i-dimensional component of V(f). Thus, items 1 and 3 of Definition 13.6.1 are
satisfied. To see that item 2 of that definition is satisfied, we argue as follows. By
our assumptions on HomSolve, we have S ⊂ (V(F) ∩ L^{c(i)}). By Theorem 13.2.1,
for generic L^{c(i)}, V(F) ∩ L^{c(i)} includes only points of V(F) of dimension i or
greater. By Theorem 13.5.1, any components of V(F) of dimension k > i must also
be k-dimensional components of V(f). Thus, any points in S that lie on components
of V(F) of dimension k > i must also be in V(f). Consequently, any s ∈ S such
that s ∉ V(f) ∩ L^{c(i)} must lie in a component of V(F) \ V(f) of dimension i. Such
points do not satisfy f(s) = 0, and so they are not copied from S into Ŵ. □

For i = N, the algorithm uses the probabilistic null test to see if all the functions
in f are the zero polynomial. For all other i ≥ N − rank f, we solve a square system
of size N. The statement of the algorithm above uses an extrinsic formulation of
slicing. To work intrinsically, we just change a few lines, and in so doing, decrease
the size of the square system we solve to only N − i.

Witness Superset for Dimension i (intrinsic):

  — Choose a random point b ∈ C^N and a random matrix B ∈ C^{N×(N−i)}.
  — Let L be the space defined intrinsically as L(u) := b + B · u, u ∈ C^{N−i}.
  — If i ≥ N − rank f,
    * Let F(x) := ℜ(f; N − i).
    * Compute S := HomSolve(F(b + B · u)) ⊂ C^{N−i}.

    * Let Ŵ := {w ∈ C^N | w = b + B · s, s ∈ S and f(w) = 0}.
    * return(Ŵ, L).
  — Else return(Ŵ := ∅, L).

When i = N − 1, the system to be solved has only one variable, so the call to
HomSolve could be replaced by any other method for solving polynomials in one
variable.
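To make the slicing concrete, here is a hedged sketch of ours (not HOMLAB code) computing a dimension 1 witness superset for the twisted cubic V(y − x², z − x³) in C³. We cheat by substituting the curve's rational parameterization, so that the stand-in for HomSolve is just a univariate root-finder, exactly the one-variable situation described above:

```python
import numpy as np

# Witness points for the twisted cubic V(y - x^2, z - x^3) in C^3, a curve
# of dimension 1 and degree 3. Slice with one random complex hyperplane
# a0 + a1*x + a2*y + a3*z = 0; substituting the parameterization
# (x, y, z) = (t, t^2, t^3) reduces the sliced system to a single cubic
# in t, which stands in for the call to HomSolve.
rng = np.random.default_rng(1)
a = rng.standard_normal(4) + 1j * rng.standard_normal(4)

t_roots = np.roots(a[::-1])           # np.roots wants highest degree first
witness = [(t, t**2, t**3) for t in t_roots]

print(len(witness))                   # 3 points: the degree of the curve
for x, y, z in witness:
    assert abs(y - x**2) < 1e-8 and abs(z - x**3) < 1e-8  # on the curve
    assert abs(a[0] + a[1]*x + a[2]*y + a[3]*z) < 1e-8    # on the slice
```

The three witness points recovered this way reflect the degree of the curve, which is exactly what a witness point set encodes.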
With WitnessSupi available to find a witness superset for the i-dimensional
component of V(f), it is a simple matter to assemble a collection of such sets for
every possible dimension. To be explicit, we display the full algorithm, Witness-
Super, below.

Witness Superset: [W̄] = WitnessSuper(f)

• Input: Polynomial system f : C^N → C^n.
• Output: A witness superset W̄ = {(Ŵ_0, L_0), ..., (Ŵ_N, L_N)} for V(f), where
  (Ŵ_i, L_i) is a witness superset for the dimension i component. Empty dimensions
  may be omitted.
• Procedure:
  — Initialize W̄ = {}.
  — Append (Ŵ_N, L_N) = WitnessSupi(f, N) to W̄.
  — If Ŵ_N ≠ ∅, return(W̄).
  — Loop: For i = N − 1, ..., N − rank f
    * Append (Ŵ_i, L_i) = WitnessSupi(f, i) to W̄.
  — End loop.
  — return(W̄).

Recall that in the case of nonreduced components, we wish to include in our nu-
merical witness sets additional information to allow robust numerical computation
of the witness points, either a deflation formulation as in § 13.3.2 or a homotopy
formulation as in § 13.3.3. Clearly, a homotopy is available inside algorithm Wit-
nessSupi; we merely have to return the information. A deflation formulation can
be returned if one is used in the endgame of HomSolve. Notice that deflation
can only work on the true witness points in the witness superset, because these
are isolated solutions, whereas the junk points are not. So before trying to deploy
deflation, we need to separate the junk from the witness points, which requires the
methods of Chapter 15.

13.6.1 Examples
In the following examples, the tables summarizing runs of algorithm WitnessSuper
have columns labeled as follows:

Dim      Dimension of component under investigation.
Paths    Number of paths in the homotopy.
#Ŵ       Number of endpoints in witness superset.
#Ŵsing   Number of those points that are singular.
#N       Number of "nonsolutions," i.e., endpoints x ∉ V(f).
#∞       Number of endpoints at infinity.
nfe      Average number of function evaluations per path.
Σnfe     Total number of function evaluations for this dimension.

The number of function evaluations depends on details of the path tracker and
the endgame, including the various control settings they use. The figures reported
here are for the default settings in HOMLAB. These numbers will change slightly
in repeat runs, because the paths depend on random choices of the slices and of
the randomization used to square up the system. They are included to give a sense
of where the algorithm spends most of its effort.

Example 13.6.4 Consider the system given in Equation 12.0.1, which for con-
venience we repeat here:

    f(x, y) = [ x(y^2 − x^3)(x − 1) , x(y^2 − x^3)(y^2 − 2) ]^T = 0.

The polynomials are degree 5 and 6, so using total-degree homotopies, algorithm
WitnessSuper tracks 6 paths to find a dimension 1 witness superset and 30 paths
to find a dimension 0 witness superset. The results are summarized in the following
table.

Dim   Paths   #Ŵ   #Ŵsing   #N   #∞   nfe    Σnfe
 1      6      4      0      2    0    120     721
 0     30     30     28      0    0    165    4948

This is consistent with the fact that V(f) has a degree 4 component at dimension 1,
decomposable into V(x) and V(y^2 − x^3). The superset at dimension 1 has four
points and no junk. At dimension 0, all paths must end on V(f) as there is no
slice involved. (This would not necessarily be true if f had more equations than
variables.) Of the 30 path endpoints, 28 are singular. The true witness points are
the two nonsingular points in the witness superset; the other 28 are junk. Junk
points are always singular, but it would be erroneous to conclude that singular
points in the superset are necessarily junk. In fact, if the factor (x − 1) in the
first equation were changed to (x − 1)^2, the zero dimensional solution points would
become double points and therefore would be singular.

Example 13.6.5 The system

    f(x, y) = [ x^2 y − x y − 2y , x y^3 − y ]^T = 0
leads to the following results from WitnessSuper:

Dim   Paths   #Ŵ   #Ŵsing   #N   #∞   nfe    Σnfe
 1      4      1      0      3    0    158     631
 0     12     10      6      0    2    166    1997

At dimension 1, we have one witness point for the set V(y). At dimension 0, two
of twelve paths go to infinity, leaving ten points in the witness superset. The six
singular points in the zero-dimensional witness superset are in fact junk: they all
have y = 0. The remaining four points are the finite isolated roots in V(f).
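Taking the system as f = (x²y − xy − 2y, xy³ − y), the four isolated roots can be confirmed symbolically (a check of ours, separate from the homotopy algorithm) by removing the common factor y, which carries the one-dimensional component V(y):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = [x**2*y - x*y - 2*y, x*y**3 - y]

# Dividing out the common factor y (the 1-dimensional component V(y))
# leaves a square system whose solutions are the isolated roots off y = 0.
p, q = [sp.cancel(g / y) for g in f]
roots = sp.solve([p, q], [x, y])
print(len(roots))   # 4 finite isolated roots, matching the table
for r in roots:
    assert all(g.subs(dict(zip((x, y), r))).equals(0) for g in f)
```

Two of the four roots are real (x = 2, y = ±1/√2) and two are complex (x = −1, y = ±i), which is why a purely real treatment would miss half of them.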

Example 13.6.6 Running WitnessSuper on the equations for SO(3), see Ex-
ample 13.4.4, one obtains the following table. Note that the rank test saves us from
trying to compute witness points for dimension 2, which would have required 192
paths.

Dim   Paths   #Ŵ   #Ŵsing   #N   #∞   nfe     Σnfe
 8      3      0      0      3    0     58      173
 7      6      0      0      6    0    107      639
 6     12      0      0     12    0    151     1815
 5     24      0      0     24    0    150     3590
 4     48      0      0     48    0    229    10986
 3     96      8      0     40   48    254    24337

In all the examples, we may observe that the number of function calls grows
as we descend dimensions. This is due both to an increase in the number of paths
(which grows geometrically) and also a general tendency for the number of calls
per path to increase. Not reflected in the table is the additional fact that the
number of variables climbs as we descend, so the linear solving routine used in
prediction/correction iterations will be more expensive. So, by every measure, the
bottom run is by far the most expensive. This underscores the importance of using
the rank of the system to eliminate low-dimensional runs.

13.7 Probabilistic Algorithms About Algebraic Sets

In this section we follow (Sommese & Wampler, 1996) and show how the witness
supersets immediately give some numerical algorithms. Subsequent chapters will
present more efficient algorithms, so the main point here is to recognize the capa-
bilities that witness supersets make possible.

13.7.1 An Algorithm for the Dimension of an Algebraic Set


One consequence of Remark 13.6.2 is a simple algorithm for finding the dimension
of V(f), i.e., the maximum of the dimensions of the components of V(f).
Top Dimension: d = TopDimen(f)

• Input: Polynomial system f : C^N → C^n.
• Output: The dimension of V(f), i.e., d := dim V(f). If V(f) = ∅, then d := −1.
• Procedure:
  — Loop: For i = N, N − 1, ..., N − rank f
    * Let (Ŵ, L) := WitnessSupi(f, i).
    * If Ŵ ≠ ∅, then return(d := i).
  — End loop.
  — return(d := −1).

13.7.2 An Algorithm for the Dimension of an Algebraic Set at a Point
Let Z be an algebraic subset of C^N defined by a system of polynomial equations
f = 0. Let p ∈ Z, i.e., p ∈ C^N and f(p) = 0. Recall from § 12.2.1 that if Z = ∪_i Z_i
is the decomposition of Z into pure-dimensional algebraic sets, then the dimension
of Z at p ∈ Z is max{i | p ∈ Z_i}.

In this section we give an algorithm to compute the dimension of Z at p. In
particular:

(1) if p is a generic point of an irreducible component Z_i of Z, then this algorithm
    computes dim Z_i;
(2) this algorithm lets us decide whether a solution p of a system f = 0 is isolated.

This is the local variant of the dimension algorithm of § 13.6.
The algorithm proceeds as follows. If Z_i is an irreducible component of Z
containing p, then any affine C^{N − dim Z_i} near a generic affine C^{N − dim Z_i} containing p
meets Z_i in at least one point near p. Moreover, if dim Z_i is the maximum dimension
of any irreducible component of Z containing p, then for k > dim Z_i it follows that
generically an affine C^{N−k} near a generic affine C^{N−k} containing p does not meet
Z in any points near p. A generic affine C^{N−k} containing p := (p_1, ..., p_N) is
specified parametrically by {x ∈ C^N | x = p + B · u, u ∈ C^{N−k}}, where B is
a generic N × (N − k) matrix. An affine C^{N−k} nearby is one parameterized by
(p', B') ∈ C^{N + N(N−k)} in the neighborhood of (p, B), using the complex topology.
Let us first lay this out as a conceptual algorithm in which many implementation
details are left for later. In particular, the algorithm depends on a procedure [S] =
LocalSlice(Z, p, L) that returns a list of points S ⊂ Z ∩ L that contains all isolated

points of Z ∩ L near p, where L ⊂ C^N is an affine linear space. We do not specify
an implementation for LocalSlice here, but one possibility is given in a numerical
version of the algorithm later in this section.

Local Dimension: (conceptual algorithm) [d] = LocalDimen(Z, p)

• Input: A numerical description of an algebraic set Z ⊂ C^N, and a point p ∈ Z.
• Output: The dimension of Z at p, i.e., d := dim_p Z.
• Procedure:
  — Loop: For i = 0, 1, ..., rank f
    * Choose L ⊂ C^N generically from the set of all affine i-dimensional
      spaces that contain p.
    * Let L' be a generic affine space near L.
    * Let S := LocalSlice(Z, p, L').
    * If S contains a point near p, then return(d := N − i).
  — End loop.

Note that the return condition must be true for i = N − dim_p Z and it cannot be
true for smaller i. For i = 0, procedure LocalSlice amounts to just the probabilistic
null test on a point p' near p.
This algorithm is conceptual in the sense that we have not specified a description
of Z or an implementation of LocalSlice. A numerical implementation of the
algorithm must deal with all of these. To that end, let us assume Z = V(f) for a
system of polynomials f : C^N → C^n. For a square polynomial system g : C^m →
C^m, the methods of Part II provide us with homotopy methods that give a list of
solution points in V(g) that includes all isolated points, so we may adapt this to
implement LocalSlice.

With these considerations in mind, we may construct a numerically viable
method for finding local dimension, as follows. Where overdetermined systems
appear in the algorithm, we use the randomization procedure of § 13.5 to reduce
to the same number of equations as unknowns. This version of the algorithm re-
places LocalSlice with a call to homotopy procedure S = HomSolve(g) for solving
square systems.

Local Dimension: [d] := LocalDimen(f, p)

• Input: Polynomial system f : C^N → C^n, and a point p ∈ V(f).
• Output: The dimension of V(f) at p, i.e., d := dim_p V(f).
• Procedure:
  — If f(q) = 0 for a random q ∈ C^N, then return(d := N).
  — Loop: For i = 1, 2, ..., rank f
    * Choose a random matrix B ∈ C^{N×i}.
    * Set L(u) := p + B · u, a generic affine C^i containing p.

    * Let L'(u) be a generic affine C^i near L.
    * Let f_{L'}(u) := f(L'(u)), a system of n polynomials in i variables.
    * Let g(u) := ℜ(f_{L'}; i), a square system of size i.
    * Compute S := HomSolve(g).
    * If S contains a point u* such that p' := L'(u*) is near p, then return(d :=
      N − i).
  — End loop.

The definition of nearness in this algorithm is a bit problematic. One prescrip-
tion would be to repeat the test for a sequence of linear spaces closer and closer to L
and see if this produces a sequence of solution points closer and closer to p. Ideally,
this would be done using continuation from a generic L' to L. The problem is that
the set S can contain singular solutions. Since we do not know the local dimension
at these, we do not know the dimension of the solution paths as L' varies and so it
is not possible to numerically track them. The methods of subsequent chapters will
refine the situation so that the nearness test can be implemented as testing equality
between p and the endpoint of a well-defined one-dimensional homotopy path.

The use of HomSolve in the algorithm above is overkill, because it finds all
isolated solutions of g = 0, when all we need is to find one near p, if it exists. A better
alternative might be to use an exclusion method (see § 6.1) initialized on a small
box containing p. An interesting purely local heuristic for checking the dimension
at a point p is given in (Kuo, Li, & Wu, 2004), based on the methods of (Li &
Zheng, 2004). It is a variant of the conceptual algorithm above, with a heuristic for
LocalSlice. If this could be strengthened to a probability-one algorithm, it might
be substantially more efficient than the numerical procedure above.

13.7.3 An Algorithm for Deciding Inclusion and Equality of Reduced Algebraic Sets
At this point, we can succinctly formulate an algorithm for deciding inclusion of the
solution sets of two systems of polynomial equations, which will immediately yield
an algorithm for deciding equality of such solution sets.

Inclusion Test: [t] = Inclusion(f, g)

• Input: Polynomial systems f : C^N → C^n, g : C^N → C^m.
• Output: Logical t := true if V(f) ⊂ V(g); otherwise, t := false.
• Procedure:
  — Loop: For i = N, N − 1, ..., N − rank f,
    * Let [Ŵ, L] := WitnessSupi(f, i).
    * If g(x) ≠ 0 for any x ∈ Ŵ, then return(t := false).
  — End loop.

  — return(t := true).

The inclusion test leads immediately to an equality testing algorithm, as follows.

Equality Test: [t] = Equal(f, g)

• Input: Polynomial systems f : C^N → C^n, g : C^N → C^m.
• Output: Logical t := true if V(f) = V(g); otherwise, t := false.
• Procedure:
  — t_1 := Inclusion(f, g).
  — t_2 := Inclusion(g, f).
  — If both t_1 and t_2 are true, then return(t := true).
  — Else, return(t := false).

We have not dealt with multiplicities in this algorithm. Thus this algorithm
gives a way of deciding if the reduced algebraic set defined by f = 0 is an algebraic
subset of the reduced algebraic set defined by g = 0.

This algorithm is a translation of the algorithm from van der Waerden's classic
(§93 to §98 van der Waerden, 1950). It is a strength of our numerical version of
generic points that they model the classical generic points closely enough that
such results of classical algebraic geometry translate without difficulty.

13.8 Summary

Given a polynomial system f(x) = 0 on C^N as in Equation 13.0.1, we have found
a witness superset for V(f). Further processing of these sets, the subject of Chap-
ter 15, will eliminate junk points and decompose the witness set for each
dimension into the numerical irreducible decomposition. Before we get to this, we
present an alternative procedure for generating a witness superset in the following
chapter.

13.9 Exercises

Several of these exercises refer to routines from HOMLAB (see Appendix C). Routine
witsup.m implements algorithm WitnessSuper. If the system to be analyzed is
provided in tableau form, script wsuptab.m will sort the equations by descending
degree and then call witsup.m.

Exercise 13.1 (Multiplicity and Randomization) Show that the system


Equation 13.5.9 has the origin as an isolated zero of multiplicity 3. Show that
the system Equation 13.5.10 has the origin as an isolated zero of multiplicity 4.

Exercise 13.2 (Inclusion Test) Use witsup.m to find witness points for the
twisted cubic, V(y — x2,z — x3), and also for V(xy — z2,xz — y2). Apply the
inclusion test to see if either of these contains the other.

Exercise 13.3 (Seven-Bar Linkage) Refer to Figure 9.5 and derive a set of six
equations similar to the ones in Equations 9.5.33–9.5.36, consisting of three loop
equations and three unit magnitude conditions. Compute a witness superset for
general link parameters (a_0, b_0, c_0, a_1, a_2, b_2, a_3, b_3, ℓ_4, ℓ_5, ℓ_6) ∈ C^11. Then repeat
the exercise arbitrarily choosing a_0 = −0.3 − 1i, c_0 = −1, a_1 = 0.28, b_2 = 0.37, ℓ_6 =
0.55, and setting the remaining parameters with the formulae b_0 = 0, a_2 = a_0 b_2 / c_0,
a_3 = a_1, b_3 = a_1(a_0 − c_0)/a_0, ℓ_4 = ℓ_6 |a_0/c_0|, and ℓ_5 = |b_2|. Make a table like those
shown in § 13.6.1.
Chapter 14

A Cascade Algorithm for Witness Supersets

This chapter revisits the construction of a witness superset for the solution set of a
system f(x) = 0 of n polynomials on C^N, a topic addressed earlier in § 13.6. The
algorithm, WitnessSuper, from that section leaves room for improvement both
from a theoretical and practical point of view. To understand why this might be so,
let us assume that we use total degree homotopies to solve the systems arising in
the algorithm. Without loss of generality, we may assume that we have squared-up
the system, so n = N, and we have sorted the polynomials f_i(x) from the system
f(x) by descending degree, so that letting d_i = deg f_i, we have d_1 ≥ ⋯ ≥ d_N.
Under these conditions, WitnessSuper tracks

    Σ_{i=1}^{N} d_1 d_2 ⋯ d_i

paths. In comparison, it is a classical fact, e.g., (12.3.1 Fulton, 1998), that given
the irreducible components Z_{ij} of V(f) with Z_{ij} occurring with multiplicity μ_{ij}, it
follows that

    Σ_{i,j} μ_{ij} deg Z_{ij} ≤ Π_{i=1}^{N} d_i.

At first sight this does not look so terrible. In the case when all the d_i = 2,
there are 2^{N+1} − 2 paths to be tracked in the algorithm to find at most 2^N solutions. Since,
all other things being equal, computational work is proportional to the number of
paths followed, this amounts to only about twice as much work as is theoretically
needed. But all other things are not equal! Paths that do not lead to witness points
often end up going to singular solutions. This can be expensive.
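For concreteness, the two counts can be compared with a few lines of Python (an illustration of ours, with ten quadrics as the sample degrees):

```python
from math import prod

def witness_super_paths(degrees):
    # one squared, sliced total-degree run per dimension: sum of d1*...*di
    return sum(prod(degrees[:i]) for i in range(1, len(degrees) + 1))

def cascade_start_paths(degrees):
    # the cascade algorithm of this chapter starts d1*...*dN paths once
    return prod(degrees)

d = [2] * 10  # ten quadrics
print(witness_super_paths(d), cascade_start_paths(d))  # 2046 1024
```

With degrees sorted in descending order, the dimension-by-dimension approach roughly doubles the path count of a single product-sized start, before even accounting for the expense of paths ending at singular solutions.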
In § 14.1, we explain an algorithm that follows only Π_{i=1}^{N} d_i paths in the total
degree case. These paths are tracked in N stages, yielding at each stage a witness
superset for each successively smaller dimension, and hence, we call this the cascade
algorithm. In the worst case, all the paths survive to the end of the cascade, requir-
ing the equivalent of N · Π_{i=1}^{N} d_i paths to track, but in the typical case, many paths
terminate early in the process, making the algorithm relatively efficient. Moreover,


to survive to the next stage, a path must remain nonsingular, which helps keep
computational cost down and reliability high.
The version we present here differs slightly from its first appearance in
(Sommese & Verschelde, 2000). The most notable difference is the removal of slack
variables, which were never used in actual implementations. The new presenta-
tion draws on Theorem 13.2.2 to establish the genericity of the slicing hyperplanes,
removing any dependence on the order in which they are used.
For ease of reading, in this chapter we act as if components are reduced, e.g., we
talk about witness supersets $(W_i, L_i, f)$ instead of $(W_i, L_i, f, h_i(x,t), W_i(\epsilon), \epsilon)$. All
our arguments and algorithms hold for nonreduced components also. For example,
the cascade algorithm for witness supersets produces the $h_i(x,t)$ whether the com-
ponents are reduced or nonreduced, and to obtain the $W_i(\epsilon)$ we would only need
to have the $t$ variable in the homotopies in the cascade algorithm take on the value
$t = \epsilon$ for an appropriate small value of $\epsilon$.
This short chapter has only two sections: the description of the cascade algo-
rithm in § 14.1, and presentation of some examples of its use in § 14.2.

14.1 The Cascade Algorithm

The form of this algorithm is:

Input a system $f(x) = 0$ of $n$ polynomials on $\mathbb{C}^N$.

Output a witness superset for $V(f)$ (see Definition 13.6.1).

For simplicity in forming the algorithm, we begin by squaring up the system so
that we have the same number of equations as unknowns. Let $r = \operatorname{rank} f$. By The-
orem 13.4.2, the lowest possible dimension of any irreducible solution component is
$N - r$, and by Theorem 13.5.1, all such components are also irreducible components
of $[I_r\ \Lambda] \cdot f$, where $I_r$ is the $r \times r$ identity matrix and $\Lambda$ is a generic $r \times (n - r)$
matrix in $\mathbb{C}^{r \times (n-r)}$. To get a witness point set for dimension $i$, we slice simultane-
ously with $i$ generic hyperplanes, so we see that we use at least $N - r$ such planes
no matter which dimension is being investigated. By Theorem 13.2.2, with proba-
bility one, we can pick a set of $N - r$ such hyperplanes, generic with respect to all
solution components, by choosing random, complex coefficients for their equations.
This is equivalent to choosing an $r$-dimensional linear space $L \subset \mathbb{C}^N$ intrinsically as
$L = b + B \cdot u$ with random $b \in \mathbb{C}^N$ and $B \in \mathbb{C}^{N \times r}$. Combining these maneuvers,
we have the square system of size $r$
$$g(u) = [I_r\ \Lambda] \cdot f(b + B \cdot u).$$
Any solution $u^*$ of $g(u) = 0$ maps to a point $x^* = b + B \cdot u^* \in L$, and such points that
also satisfy $f(x^*) = 0$ are the witness points that we seek. Whatever the values
of $n$ and $N$ may be, we use this approach to convert the problem of analyzing
A Cascade Algorithm for Witness Supersets 257

$f : \mathbb{C}^N \to \mathbb{C}^n$ to treating a square system $g : \mathbb{C}^r \to \mathbb{C}^r$. Accordingly, without loss
of generality, from this point on we assume $f$ is square of size $n = N = \operatorname{rank} f$.
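Both reductions—randomizing down to $r = \operatorname{rank} f$ equations with $[I_r\ \Lambda]$, and restricting to the generic affine space $L = b + B \cdot u$—are mechanical to set up. A minimal numerical sketch follows (all names are ours; $f$ is a stand-in example of rank 2, not a system from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
N = n = 3
r = 2   # rank of the stand-in system below (third equation is dependent)

def f(x):
    f1 = x[0]**2 + x[1]**2 + x[2]**2 - 1
    f2 = x[0] - x[1]
    return np.array([f1, f2, f1 + 2.0 * f2])

# Randomization [I_r  Lambda] . f down to r equations.
Lam = rng.standard_normal((r, n - r)) + 1j * rng.standard_normal((r, n - r))
R = np.hstack([np.eye(r), Lam])

# Generic r-dimensional affine space L = b + B u, written intrinsically.
b = rng.standard_normal(N) + 1j * rng.standard_normal(N)
B = rng.standard_normal((N, r)) + 1j * rng.standard_normal((N, r))

def g(u):
    return R @ f(b + B @ u)

# g is square: r equations in the r intrinsic variables u.
print(g(np.zeros(r, dtype=complex)).shape)   # (2,)
```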
Recall that for i = 0 to N, algorithm WitnessSuper obtains a witness superset
for the i-dimensional component of V(f) by intersecting V(f) with i generic linear
equations. Instead of treating each of these as an independent problem, the cascade
approach embeds all of them into a common formulation. For this purpose, we
introduce an $N$-tuple of parameters $t = (t_1, \ldots, t_N) \in \mathbb{C}^N$, the diagonal matrix
$$T(t) := \begin{bmatrix} t_1 & & \\ & \ddots & \\ & & t_N \end{bmatrix}, \tag{14.1.1}$$
and the notational device
$$t^{(i)} = (t_1, \ldots, t_i, 0, \ldots, 0).$$

By Theorem 13.2.2, there is a Zariski open dense set $U \subset \mathbb{C}^{N \times (N+1)}$ such that for
$A := [a_0\ A_1] \in U$ all subsets of the $N$ linear equations in the system
$$L(A, x) := a_0 + A_1 \cdot x \tag{14.1.2}$$
are generic with respect to all the irreducible components of $V(f)$, where $a_0$ is the
first column of the $N \times (N+1)$ matrix $A$ and $A_1$ is the remaining columns. The witness
point superset for dimension $i$ is a finite set of points containing all isolated solutions
of the system
$$F(A, x, t^{(i)}) := \begin{bmatrix} f(x) \\ T(t^{(i)}) \cdot L(A, x) \end{bmatrix} = 0 \tag{14.1.3}$$
for nonzero values of $t_1, \ldots, t_i$. The zeros on the diagonal of $T(t^{(i)})$ knock out $N - i$
of the linear equations in $L(A, x)$, leaving us with a system of $N + i$ equations in
$x \in \mathbb{C}^N$.
To obtain all isolated solutions of $F(A, x, t) = 0$ by homotopy methods, we
need a square system. Theorem 13.5.1 tells us that there is a Zariski open dense
set $U \subset \mathbb{C}^{N \times N}$ such that for $\Lambda \in U$ the isolated solutions of $F(A, x, t) = 0$ are
contained in those of
$$\mathcal{E}(\Lambda, A, x, t) := f(x) + \Lambda \cdot T(t) \cdot L(A, x). \tag{14.1.4}$$
Accordingly, a witness point superset for the $i$-dimensional component of $V(f)$ can
be found by computing all isolated solutions to
$$\mathcal{E}_i(\Lambda, A, x) := \mathcal{E}(\Lambda, A, x, t^{(i)}) = 0. \tag{14.1.5}$$



For clarity, let's denote the $k$th row of $L(A, x)$ as $L_k(x)$ and the $k$th entry in $f(x)$
as $f_k(x)$, so that we write $\mathcal{E}(\Lambda, A, x, t)$ expanded as
$$\mathcal{E}(\Lambda, A, x, t) := \begin{bmatrix} f_1(x) \\ \vdots \\ f_N(x) \end{bmatrix} + \Lambda \cdot \begin{bmatrix} t_1 L_1(x) \\ \vdots \\ t_N L_N(x) \end{bmatrix}. \tag{14.1.6}$$
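In code, the embedding of Equation 14.1.6 is a one-liner once $f$, $\Lambda$, and the slicing functions are in hand. A sketch with our own notation and a stand-in $f$ (not a system from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 2

def f(x):   # stand-in square system f : C^2 -> C^2
    return np.array([x[0]**2 + x[1]**2 - 1, x[0] * x[1]])

A = rng.standard_normal((N, N + 1)) + 1j * rng.standard_normal((N, N + 1))
a0, A1 = A[:, 0], A[:, 1:]          # L(A, x) = a0 + A1 x
Lam = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))

def L(x):
    return a0 + A1 @ x

def E(x, t):                         # f(x) + Lambda . T(t) . L(A, x)
    return f(x) + Lam @ (np.asarray(t) * L(x))

# Setting t = t^(0) = (0, 0) recovers f itself, as in Eq. (14.1.5) with i = 0.
x = np.array([0.3 + 0.1j, -0.7 + 0.0j])
print(np.allclose(E(x, [0.0, 0.0]), f(x)))   # True
```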
We summarize what we have done, with some additional useful conclusions, in
the following theorem, carrying over the notation of the preceding paragraphs.
Theorem 14.1.1 For a given polynomial system $f : \mathbb{C}^N \to \mathbb{C}^N$, there is a Zariski
open dense set $U \subset \mathbb{C}^{N \times N} \times \mathbb{C}^{N \times (N+1)}$ such that for $(\Lambda, A) \in U$ and any integer $i$
satisfying $0 \le i \le N$, it follows that
(1) a witness point superset for all $i$-dimensional components of $V(f)$ is a subset
of the isolated solutions of $\mathcal{E}_i(\Lambda, A, x) = 0$; and
(2) if $x'$ is a solution of $\mathcal{E}_i(\Lambda, A, x) = 0$ then either:
(a) $x'$ is in a component of $V(f)$ of dimension at least $i$, and $L_k(x') = 0$ for all
$1 \le k \le i$; or
(b) $x'$ is an isolated nonsingular solution of $\mathcal{E}_i(\Lambda, A, x) = 0$ and $L_k(x') \ne 0$ for
any $1 \le k \le i$, and the number of such $x'$ is the same for all $(\Lambda, A) \in U$.

Proof. If we show that for each specific choice of $i$, a Zariski open dense set $U_i \subset
\mathbb{C}^{N \times N} \times \mathbb{C}^{N \times (N+1)}$ exists with the above properties, then the intersection of the
$U_i$ may be taken as $U$. Therefore we can assume without loss of generality that we
have a specific $i$.
The first assertion of the theorem and item (2a) both follow from the generic
slicing result in Theorem 13.2.2, the randomization technique of Theorem 13.5.1,
and the logic given in the paragraphs leading up to the theorem. Item (2b) is a
direct application of Bertini's Theorem A.8.7. □

Having embedded all the systems of interest into $\mathcal{E}(\Lambda, A, x, t)$, we now turn to the
cascade for solving the $\mathcal{E}_i(\Lambda, A, x) = 0$ as $i$ descends from $N$ to $0$. With probability
one, a random choice of $(\Lambda, A)$ satisfies the genericity conditions of Theorem 14.1.1.
Choosing them so, we consider them fixed and suppress them from our notation,
hence writing $\mathcal{E}(x, t)$ for the embedding and $\mathcal{E}_i(x)$ for the $i$th embedded system.
Define the level $i$ nonsolutions as the set of solutions $x'$ of $\mathcal{E}_i(x) = 0$ with
$L_i(x') \ne 0$. Denote these by $\mathcal{N}_i$. They depend on the choice of $\Lambda$ and $A$, but by
Theorem 14.1.1, the number of them, which we denote $\nu_i$, is independent of $(\Lambda, A) \in U$.
Each $\mathcal{E}_i(x)$ is in the family of systems $\mathcal{E}(x, t)$ for a particular value of $t^{(i)} \in \mathbb{C}^N$.
Moreover, holding $t^{(i-1)}$ fixed but letting $t_i$ vary, we can view $\mathcal{E}_i(x; t_i) = 0$ as a
parameterized family of systems which includes as a special case $\mathcal{E}_{i-1}(x) = \mathcal{E}_i(x; 0)$.
By the principles of parameter continuation, see § 7.4, if we can solve $\mathcal{E}_i(x; t_i) = 0$
for a generic $t_i \in \mathbb{C}$, then we can use those solutions as start points in a homotopy

$\mathcal{E}_i(x; s) = 0$ as $s$ goes from $t_i$ to $0$. By similar reasoning, we can descend from
$\mathcal{E}_i(x) = 0$ to any $\mathcal{E}_j(x) = 0$, $j < i$, using the homotopy
$$H_{ji}(x, s) := \mathcal{E}(x, (t_1, \ldots, t_j, s t_{j+1}, \ldots, s t_i, 0, \ldots, 0)) = 0 \tag{14.1.7}$$
for $s$ going from $1$ to $0$. We refer to this as a cascade of homotopies.
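The bookkeeping in Equation 14.1.7 is easy to check numerically: at $s = 1$ the homotopy is the level $i$ system $\mathcal{E}_i$, and at $s = 0$ it is the level $j$ system $\mathcal{E}_j$. A sketch with a stand-in embedding (taking $\Lambda = I$ for brevity; the system, slicing functions, and all names are ours, not from the text):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 3
tfull = rng.standard_normal(N)          # generic t_1, t_2, t_3

def f(x):
    return np.array([x[0]**2 - 1, x[1]**2 - x[0], x[2] - x[0] * x[1]])

def Lhyp(x):                            # stand-in slicing functions L_k(x)
    return np.array([1.0 + x.sum() + k for k in range(N)])

def E(x, tvals):                        # embedding with Lambda = I
    return f(x) + np.asarray(tvals) * Lhyp(x)

def t_vec(i):                           # t^(i) = (t_1, ..., t_i, 0, ..., 0)
    return np.array([tfull[k] if k < i else 0.0 for k in range(N)])

def H(x, s, j, i):                      # the cascade homotopy, Eq. (14.1.7)
    tv = np.array([tfull[k] if k < j else (s * tfull[k] if k < i else 0.0)
                   for k in range(N)])
    return E(x, tv)

x = rng.standard_normal(N)
print(np.allclose(H(x, 1.0, 1, 2), E(x, t_vec(2))))   # True: start system is E_i
print(np.allclose(H(x, 0.0, 1, 2), E(x, t_vec(1))))   # True: target system is E_j
```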


We can be more precise about which solutions of the higher system lead to
solutions for the lower one, as follows.
Theorem 14.1.2 Let $H_{ji}(x, s)$ be defined as above. There is a Zariski open dense
set $U \subset \mathbb{C}^{N \times N} \times \mathbb{C}^{N \times (N+1)} \times \mathbb{C}^N$ such that for $(\Lambda, A, t^{(i)}) \in U$, there are nonsingular
paths $(\phi_k(s), s) : \mathbb{C} \to \mathbb{C}^N \times \mathbb{C}$ with $1 \le k \le \nu_i$ such that:
(1) the set of $\phi_k(1)$ is equal to the set of nonsolutions at level $i$; and
(2) $H_{ji}(\phi_k(s), s) = 0$; and
(3) the limits $\lim_{s \to 0} \phi_k(s)$ with $L_j(\lim_{s \to 0} \phi_k(s)) \ne 0$ are the level $j$ nonsolutions;
and
(4) the limits $\lim_{s \to 0} \phi_k(s)$ with $L_j(\lim_{s \to 0} \phi_k(s)) = 0$ contain the witness point
superset for the $j$-dimensional components of $V(f)$.

Proof. By construction and Theorem 14.1.1, on a Zariski open dense set of
$(\Lambda, A, t^{(i)})$, the witness points for level $j$ are among the isolated solutions of
$H_{ji}(x, 0) = 0$. Moreover, by Theorem 14.1.1, the nonsolutions at each level are non-
singular and therefore isolated. By Theorem 7.1.6, the limits of the isolated solution
paths of $H_{ji}(x, s) = 0$ include all the isolated solutions of $H_{ji}(x, 0) = 0$. But the so-
lutions of $H_{ji}(x, 1) = 0$ that are solutions of $L_k(x) = 0$, $1 \le k \le i$, remain solutions
of those linear equations as $s$ varies, so they remain on an $i$-dimensional or higher
component of $V(f)$ and therefore cannot give witness points on a $j$-dimensional
component. Thus, the endpoints of the paths beginning at the nonsolutions at level
$i$ include all the witness points and all the nonsolutions at level $j$. When cate-
gorizing an endpoint of a solution path, it suffices to check only $L_j(\lim_{s \to 0} \phi_k(s))$,
because by Theorem 14.1.1, either all $L_k(\lim_{s \to 0} \phi_k(s))$, $1 \le k \le j$, will be zero or
none of them will. □

The Cascade Algorithm simply asserts that tracking the level $i$ nonsolutions
using the homotopy $H_{ji}(x, s) = 0$ of Equation 14.1.7, we get the level $j$ nonsolutions
plus a witness point superset for the dimension $j$ components of $V(f)$.
The randomness of $t^{(i)}$ in the homotopy of Equation 14.1.7 simplifies the proof
of Theorem 14.1.2, but in practice all the $t_i$ can be $1$. They are just scaling factors
on the linear equations (see Equation 14.1.6), and since the linear coefficients are
already chosen generically in the matrix $A$, the generic choice of $t^{(i)}$ is redundant.
For the same reason, when we track paths from $s = 1$ to $s = 0$, we may do so on
the real interval $(0, 1]$ with a probability of success equal to one (see Lemma 7.1.2).
Assume that we know by some other means that $\dim V(f) \le K < N$. Then,

we can start the cascade by solving $\mathcal{E}_K(x) = 0$ using any homotopy that will find
all isolated solutions, for example, a total degree homotopy. We can check for the
trivial case when $V(f) = \mathbb{C}^N$ using the probabilistic null test, and so we usually
start at level $K = N - 1$. Alternatively, one might use the algorithm TopDimen
from § 13.7.1 to determine a lower starting dimension for the cascade.
A final important note: at the top of the section, we began by squaring up the
system $f$ to size $r = \operatorname{rank} f$. Theorem 14.1.1 applies to the square system; call it $f'$
and its witness superset $\widehat{W}$. If the original system $f$ has more than $r$ equations,
then $\widehat{W}$ may include points which do not satisfy $f$. We simply discard these.

Cascade: $[\mathcal{W}]$ = Cascade($f$)

• Input: Polynomial system $f : \mathbb{C}^N \to \mathbb{C}^n$.
• Output: Witness point supersets $\mathcal{W} = \{W_0, \ldots, W_N\}$ for $V(f)$, where $W_i$ is
a witness point superset for the codimension $i$ component. Empty dimensions
may be omitted.
• Procedure:
- Initialize $\mathcal{W} = \{\}$.
- If $f$ is null, return the appropriate result. Otherwise, continue.
- Comment: square up $f(x)$ to form $g(u)$ of size $\operatorname{rank} f$.
- Let $r = \operatorname{rank} f$.
- Let $f' = \mathcal{R}(f; r)$.
- Choose random $b \in \mathbb{C}^N$ and $B \in \mathbb{C}^{N \times r}$.
- Define $g(u) = f'(b + B \cdot u)$.
- Comment: form embedding and solve for codimension 1.
- Choose random $\Lambda \in \mathbb{C}^{r \times r}$ and $A \in \mathbb{C}^{r \times (r+1)}$.
- Form $\mathcal{E}(u, t) = g(u) + \Lambda \cdot T(t) \cdot L(A, u)$, where $T(t)$ is diagonal of size $r \times r$.
- Let $S$ := HomSolve($\mathcal{E}(u, t^{(r-1)})$), discarding any solutions at infinity.
- Partition $S$ as $W := \{u \in S : g(u) = 0\}$, $\mathcal{N} := S \setminus W$.
- Loop: For $i = 1, \ldots, r - 1$
* Comment: $i$ is the codimension.
* Append $W_i := b + B \cdot W$ to $\mathcal{W}$.
* Let $d = r - i$.
* Track solution paths of $\mathcal{E}(u, s t^{(d)} + (1 - s) t^{(d-1)}) = 0$ as $s$ goes from $1$ to
$0$, starting at each of the points in $\mathcal{N}$. Discard any endpoints at infinity
and call the remaining ones set $S$.
* Partition $S$ as $W := \{u \in S : g(u) = 0\}$, $\mathcal{N} := S \setminus W$.
- End loop.
- Comment: the lowest dimension might have extraneous points.
- Let $W_r := b + B \cdot W$, and expunge any points $x \in W_r$ such that $f(x) \ne 0$.
- Append $W_r$ to $\mathcal{W}$.
- return($\mathcal{W}$).

For simplicity, we state the algorithm concentrating on the witness point sets.
The linear slicing equations are easily constructed from b, B, and A.

14.2 Examples

For direct comparison with algorithm WitnessSuper of the previous chapter, we


repeat the same examples as in § 13.6.1, this time using Cascade. Please refer to
the earlier section for the meanings of the table entries.
Example 14.2.1 For the first example system of § 13.6.1 (its first polynomial is
$x(y^2 - x^3)(x - 1)$), the cascade results are as follows. There is a new column called
"fail" to record that some paths did not converge well.

Dim  Paths  #W  #Wsing  #N  #∞  fail  nfe/path  nfe
 1      30   4       0   9    4    13       223  6711
 0       9   9       7   0    0     0        64   643

As we know the answer beforehand, we can verify that the witness supersets contain
a valid witness set. The 13 failed paths are worrisome, but it appears that they
are highly singular points at infinity. This example is rather degenerate with high
degrees; it calls for higher precision arithmetic for a secure treatment.
Example 14.2.2 The system
$$f(x, y) := \begin{bmatrix} x^2 y - x y - 2 y \\ 3 x y^3 - y \end{bmatrix} = 0$$
leads to the following results from the cascade:

Dim  Paths  #W  #Wsing  #N  #∞  nfe/path  nfe
 1      12   1       0   6    5       117  1403
 0       6   6       2   0    2        37   222

Example 14.2.3 Running the cascade on the equations for SO(3), we obtain

Dim  Paths  #W  #Wsing  #N  #∞  nfe/path  nfe
 8      96   0       0  72   24       267  25620
 7      72   0       0  72    0        46   3290
 6      72   0       0  72    0        44   3195
 5      72   0       0  72    0        47   3357
 4      72   0       0  72    0        48   3465
 3      72   8       0  40   24       107   7674

For each stage after the first, the nonsolutions of the previous stage become the
start points of the next, so the number of paths can only decrease at each stage.
Examples like the SO(3) problem are the worst case for the cascade as far as the
total number of paths is concerned, because all the paths survive to the last stage.
A saving grace is that the number of function evaluations per path falls dramatically
after the top dimension. We can only surmise that the initial homotopy between a
generic start system and our sliced target has longer, perhaps more twisted, paths,
while the cascade homotopies connect highly related systems, so that the paths are
short and relatively straight. The rise in nfe for the final dimension of the SO(3)
problem is due to the solutions at infinity being singular, thus requiring a more
expensive endgame to compute them accurately.
Comparing these tables to the ones in § 13.6.1, we see that Cascade consis-
tently tracks more paths than WitnessSuper, but the total number of function
evaluations is almost the same. We experience some numerical difficulty on the first
cascade example, but it still returned a correct witness superset. There is one clear
difference in performance: Cascade returns a smaller superset than WitnessSuper
on each of these examples. This means the supersets contain fewer junk points.
This is particularly notable in the zero-dimensional sets for Example 14.2.1, for
which WitnessSuper gave a set of 30 points containing 28 junk points, while
Cascade gave a set of only 9 points containing 7 junk points. When we move on
to computing a numerical irreducible decomposition, the first step is to remove the
junk points. It is quite advantageous to have fewer of them at the outset.

14.3 Exercises

Exercise 14.1 (Comparisons) Run Example 14.2.2 and Example 14.2.3 using
HomLab. Do so both using witsup.m, an implementation of WitnessSuper of the
previous chapter, and using cascade.m, an implementation of Cascade. Compare
run times for the two methods. Use the profiler tool in Matlab to track which routine
is using the most computation. If you are using the "tableau" format, supplied for
both examples, see how much you can improve performance by writing an efficient
straight-line program.

Exercise 14.2 (Slicing Equations) Find an expression, in terms of b, B, and A,


for the slicing equations for dimension i in algorithm Cascade.

Exercise 14.3 (Spherical Pentad) A pentad mechanism is topologically two tri-
angles, A and B, with three line segments, each one joining corresponding vertices
of the triangles. The segments and triangles represent rigid links, but relative
motion is allowed where they meet. In the spherical version, the joints are all revo-
lute (one-degree-of-freedom hinges), and their centerlines all intersect in a common
point. This means that the possible relative positions of one triangle with respect
to the other are constrained to rotations in $\mathbb{R}^3$. Let $a_1, a_2, a_3 \in \mathbb{R}^3$ be unit vectors at
the joints of triangle A and $b_1, b_2, b_3 \in \mathbb{R}^3$ the same for triangle B. Let $c_i \in \mathbb{R}$ be the
cosine of the arc subtended by the segment from $a_i$ to $b_i$. Let $X \in SO(3)$ be the
rotation of triangle B with respect to A. Then, we have the three equations
$$a_i^T X b_i = c_i, \quad i = 1, 2, 3,$$
to describe all possible placements of B with respect to A such that the pentad can
be assembled. Explain how results presented in this chapter allow you to conclude
that for general parameters $a_i$, $b_i$, $c_i$, $i = 1, 2, 3$, the spherical pentad has at most 8
assembly configurations.
Exercise 14.4 (Griffis-Duffy Platform) A special case of the Stewart-Gough
platform we studied in § 7.7 and § 9.3, Griffis-Duffy platforms (Griffis & Duffy, 1993)
have triangular upper and lower platforms, with the vertices of each connected to a
point on the edge of the other. An even more special case, which we call the Griffis-
Duffy Type I platform, is when the triangles are equilateral (not necessarily the
same size) and the joints on the edges are at the midpoints (Husty & Karger, 2000;
Sommese, Verschelde, & Wampler, 2004a). That is, connecting point $a_i$ on the base
to $b_i$ on the upper plate, $a_1$, $a_3$ and $a_5$ are vertices of an equilateral triangle, and
$a_2 = (a_1 + a_3)/2$ and so on cyclically. Meanwhile, $b_2$, $b_4$ and $b_6$ are vertices, and
$b_1 = (b_6 + b_2)/2$ and so on cyclically. The leg lengths, $L_i$, are arbitrary. What is the
dimension and degree of the top-dimensional component? Use Equations (7.7.7)
and (7.7.10), and ignore any points on the degenerate set of Equation (7.7.8).

Exercise 14.5 (Seven-Bar Revisited) Repeat Exercise 13.3 using Cascade.


Chapter 15

The Numerical Irreducible Decomposition

Let $Z$ be an affine algebraic set in $\mathbb{C}^N$. This means that $Z$ is the solution set of some
system of polynomials $f : \mathbb{C}^N \to \mathbb{C}^n$, i.e., $Z = V(f)$. In a typical situation, we start
with $f$ as given, and we seek to find a description of its solution set. In other cases,
such as we address in the next chapter, $Z$ may be only a portion of the full solution
set of the polynomials on hand. But for the moment, it does no harm to think of $Z$
as the full solution set $V(f)$. No matter its origins, $Z$ has a decomposition into its
pure-dimensional parts $Z_i$, i.e., $Z = \cup_{i=0}^{\dim Z} Z_i$ with $\dim Z_i = i$. Furthermore, each
$Z_i$ can be decomposed into irreducible pieces $Z_{ij}$, i.e., $Z_i = \cup_{j \in \mathcal{I}_i} Z_{ij}$, where each
$Z_{ij}$ is a distinct irreducible component and the index sets $\mathcal{I}_i$ are finite.
Our goal is to find a numerical irreducible decomposition, that is, we wish to
find witness point sets, $W_{ij} := Z_{ij} \cap L^{c(i)}$, for each irreducible component $Z_{ij}$ of $Z$,
where $L^{c(i)}$ is a generic linear space of codimension $i$. From Chapters 13 and 14,
we have algorithms WitnessSuper and Cascade that, given a polynomial system
$f$, find a witness superset for $V(f)$. That is, for each dimension $i$, they give a set
$\widehat{W}_i \supseteq W_i := Z_i \cap L^{c(i)}$, which contains all the witness points for all the irreducible
components of dimension $i$ along with some possible junk points. Accordingly, our
goal becomes to find the breakups
$$\widehat{W}_i = W_i \cup J_i = \left( \cup_{j \in \mathcal{I}_i} W_{ij} \right) \cup J_i \tag{15.0.1}$$
where $J_i \subset (\cup_{j > i} Z_j) \cap L^{c(i)}$. To achieve this, we show
• how to trim the junk points out of the witness supersets, $\widehat{W}_i$, to obtain the
witness sets, $W_i$, $i = 0, \ldots, \dim Z$, and
• how to decompose a witness set, $W_i$, into its irreducible components, $W_{ij}$.
In the sequel, Chapter 16, we present methods for finding witness supersets for the
intersection of algebraic sets. The methods of the current chapter will apply equally
well to those witness supersets.
One way to approach the processing task is to employ a membership test. Junk
points in a witness superset at dimension $i$ are members of a component of dimension
greater than i, so one way of detecting them is to start at the highest dimension
and work down, eliminating any points found to be members of higher-dimensional


components. Then, at a fixed dimension, we need a test of whether two or more


witness points belong to the same irreducible component. By such tests, we can
group the points to form the numerical irreducible decomposition. Hence, much of
this chapter is devoted to membership tests, and in § 15.1, we begin the chapter
by discussing how different types of membership tests, defined abstractly by their
inputs and outputs, can be used to process witness supersets into the irreducible
decomposition. The remaining sections present concrete approaches to providing
the necessary membership tests.
All the algorithms of this chapter rely on a basic maneuver we call sampling,
which generates new points on a component by tracking witness points as the slicing
plane is moved continuously. Thus, in § 15.2 we discuss sampling for each of the
three variants of witness set put forward in § 13.3: reduced, deflated, and nonre-
duced. The general nonreduced case requires a method for tracking singular paths,
which we outline in § 15.6.
A sampling routine enables three kinds of algorithms that are useful in comput-
ing a numerical irreducible decomposition. Numerical elimination theory, § 15.3,
interpolates sample points to find equations that vanish on a component thereby
providing a membership test. This approach provides a complete solution to both
the junk elimination and the decomposition stages of processing, but it becomes
prohibitively expensive and numerically unstable for all but the lowest degrees and
dimensions. As a more practical alternative, in § 15.4, we discuss a homotopy
membership test based on the fact that regular points of an irreducible algebraic
set are path connected. This approach provides a complete method for junk elim-
ination, but its use in monodromy loops to heuristically find connection paths be-
tween witness points at the same dimension provides only a partial solution to the
decomposition phase. To complement this, the trace test discussed in § 15.5 de-
termines whether a given subset of witness points forms a complete component. It
can be used to quickly certify a putative decomposition found by monodromy or
to complete a partial one by exhaustive testing. It can even be used by itself to
combinatorially test subsets of points until the entire decomposition is determined.
Our presentation follows the order in which the methods were originally developed:
numerical elimination theory in (Sommese et al., 2001a), monodromy in (Sommese
et al., 2001c), and traces in (Sommese, Verschelde, & Wampler, 2002b)—inspired
by ideas in (Rupprecht, 2004).
The different approaches each have their own niches. For a pure $i$-dimensional
component of moderate degree, meaning not much more than degree 10, traces prove
to be the fastest decomposition method, but the worst-case cost grows exponentially
with degree. For this reason, monodromy certified by traces eventually becomes
more effective. Numerical elimination is not generally competitive for determining
a decomposition, but it could still be useful if one seeks equations vanishing on a
component.

15.1 Membership Tests and the Numerical Irreducible Decomposition

Our task is:

• Given: A witness superset, $\widehat{W}$, (see Definition 13.6.1) for an affine algebraic
set $Z$.
• Find: The decomposition of $\widehat{W}$ into a numerical irreducible decomposition for
$Z$, that is, find the breakup of $\widehat{W}$ as in Equation 15.0.1.

We will outline three variations on a procedure to complete this task, each based
on a different type of membership test. The details of how to implement the tests
follow in subsequent sections.
In this and the following sections, we denote the witness superset for dimension
$i$ as $\widehat{W}_i$, which is composed of a witness set $W_i$ for dimension $i$ plus, possibly, some
junk points, $J_i$. In addition to witness points, witness sets and witness supersets
carry along linear slicing planes and some description of $Z$ in a form that allows
witness points to be refined numerically. When we speak of a point $w \in \widehat{W}_i$, it is
implied that $w$ is in the witness point set for $\widehat{W}_i$.
Before employing membership tests, we reduce the amount of work by partially
categorizing the points in the witness superset. The first observation is that all
points in the top-dimensional witness superset are true witness points: there is no
junk in the top dimension. This is because, by definition, the junk points in a witness
superset for an $i$-dimensional component of $Z$ must lie in some higher-dimensional
component of $Z$.
A second observation is that any nonsingular points in $\widehat{W}$ must be true witness
points. Assume that $Z = V(f)$, where $f$ is a system of polynomials. A point $w \in \widehat{W}_i$
lies in $Z \cap L^{c(i)}$, so letting $f_{L^{c(i)}}(x)$ denote the restriction of $f$ to the linear space
$L^{c(i)}$, we have $f_{L^{c(i)}}(w) = 0$. Then, $w$ is nonsingular if the Jacobian matrix of
partial derivatives for $f_{L^{c(i)}}$ has full column rank.¹ For this purpose, it does not
matter whether the linear slice is represented extrinsically or intrinsically. (See
§ 13.2.1 for explanation of these terms.) Nonsingularity implies that the point is an
isolated point of $Z \cap L^{c(i)}$. In contrast, junk points in $\widehat{W}_i$ lie in a component of $Z$
of dimension greater than $i$ and hence in a component of $Z \cap L^{c(i)}$ of dimension at
least one.
The final observation builds on the second. A point $w \in \widehat{W}_i$ is a true witness
point if, and only if, it is an isolated solution to $f_{L^{c(i)}}(x) = 0$. Any test of local
dimension can serve to distinguish between junk points and witness points. If point
$w \in Z \cap L^{c(i)} \subset \mathbb{C}^N$ is not isolated, then the slice $Z \cap L^{c(i)}$ must intersect a closed
hypersurface surrounding $w$. Interval arithmetic might be used to find that the point
is isolated by showing that none of the $2N$ faces of a rectangular box enclosing $w$
¹We use the usual convention that rows of the Jacobian correspond to functions in $f$ and
columns correspond to variables.

contains a solution. Alternatively, a heuristic like the one in (Kuo et al., 2004) could
be used to find nearby points on $Z \cap L^{c(i)}$, thereby showing that the point is not
isolated and must be junk.
Taking such observations into account, we mark some points in $\widehat{W}$ as true witness
points, we discard any points known to be junk, and we mark the remaining ones
as needing further investigation. If local dimension testing of the sort mentioned
in the previous paragraph is reliably complete, then no questionable points remain,
but we do not count on that outcome in what follows.
We can complete the decomposition with one of several types of membership
test. The first of these has the following inputs and outputs.

Irreducible Membership: $[Y_1, Y_2]$ := Member1($Y, w$)

• Input: A finite set of test points $Y \subset \mathbb{C}^N$ and an isolated point $w \in Z \cap L^{c(i)}$,
where $Z$ is an algebraic set and $L^{c(i)}$ is a linear space of codimension $i$ generic
with respect to $Z$.
• Output: Set $Y_1$ consisting of the points in $Y$ that are on the same irreducible
component of $Z$ as $w$, and set $Y_2 := Y \setminus Y_1$ being the rest of $Y$.
• Procedure: See § 15.3.

This membership test yields a complete algorithm for the numerical irreducible
decomposition of a witness superset as follows.

Irreducible Decomposition: $[\mathcal{W}]$ := IrrDecomp1($\widehat{W}$)

• Input: A witness superset $\widehat{W}$ for an algebraic set $Z$.
• Output: The witness set $W$ contained in $\widehat{W}$ decomposed into its irreducible
pieces as in Equation 15.0.1.
• Procedure:
- Initialize $\mathcal{W} = \emptyset$.
- While: $\widehat{W} \ne \emptyset$,
* Let $k$ be the top dimension of $\widehat{W}$.
* Pick any $w \in \widehat{W}_k$.
* Let $[Y_1, Y_2]$ := Member1($\widehat{W}, w$).
* Points in $Y_1$ from $\widehat{W}_k$ form an irreducible witness set. Append this set to
$\mathcal{W}$.
* Points in $Y_1$ from $\widehat{W}_i$, $i < k$, are junk. Discard them.
* Remove $Y_1$ from $\widehat{W}$, i.e., $\widehat{W} := \widehat{W} \setminus Y_1$.
- End while.
- return($\mathcal{W}$).

On each pass through the main loop, at least one point $w$ is removed from $\widehat{W}_k$,
so eventually it is emptied out and the algorithm descends to the next dimension.
Eventually, $\widehat{W}$ is completely empty and the algorithm terminates. IrrDecomp1
does both jobs of removing junk and decomposing the witness sets. The only
trouble with this approach is that Member1 turns out to be expensive. For this
reason, we develop more efficient alternatives.
These alternatives proceed by eliminating junk as a process independent of
decomposing the witness set. The key to junk removal is the following algorithm.

Membership: $[t]$ := Member2($y, W$)

• Input: A single point $y \in \mathbb{C}^N$ and a witness set $W$ for a pure-dimensional
algebraic set $X$.
• Output: If $y \in X$, return $t$ := true, else return $t$ := false.
• Procedure: See the homotopy membership test of § 15.4.
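To preview the idea behind Member2 (the details are in § 15.4): move the slicing plane continuously to one passing through the test point while tracking the witness points; the point is on the component exactly when some path ends there. A toy sketch for the parabola $V(y - x^2)$ in $\mathbb{C}^2$, using a naive step-and-correct tracker (the example curve, tolerances, and all names are ours, not from the text):

```python
import numpy as np

def F(z):                                   # the component: y - x^2 = 0
    return z[1] - z[0]**2

def track(z, L0, L1, steps=200):
    """Track a solution of [F, (1-s)L0 + sL1] = 0 from s = 0 to s = 1
    by stepping s and correcting with Newton's method."""
    for s in np.linspace(0.0, 1.0, steps)[1:]:
        c = (1 - s) * L0 + s * L1           # line coefficients [c0, cx, cy]
        for _ in range(10):
            r = np.array([F(z), c[0] + c[1] * z[0] + c[2] * z[1]])
            J = np.array([[-2 * z[0], 1.0], [c[1], c[2]]])
            z = z - np.linalg.solve(J, r)
    return z

rng = np.random.default_rng(3)
L0 = rng.standard_normal(3) + 1j * rng.standard_normal(3)   # generic start slice
# Witness points: substitute y = x^2 into c0 + cx*x + cy*y = 0.
witness = [np.array([x, x**2]) for x in np.roots([L0[2], L0[1], L0[0]])]

def member2(p):
    L1 = np.array([-(p[0] + p[1]), 1.0, 1.0])   # line x + y = p_x + p_y through p
    ends = [track(w, L0, L1) for w in witness]
    return any(np.linalg.norm(e - p) < 1e-6 for e in ends)

print(member2(np.array([2.0, 4.0])))   # True:  (2, 4) lies on y = x^2
print(member2(np.array([2.0, 5.0])))   # False: (2, 5) does not
```

Note that the start slice is chosen with random complex coefficients, mirroring the genericity arguments of Chapter 13, so the tracked paths avoid singularities with probability one.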

With this test available, one can remove all junk points as follows. Remember
that the top dimensional component of a witness superset contains no junk points.

Junk Removal: $[W]$ := JunkRemove($\widehat{W}$)

• Input: A witness superset $\widehat{W}$ for an algebraic set $Z$.
• Output: The witness set $W$ obtained by removing all junk points from $\widehat{W}$.
• Procedure:
- Let $k$ be the top dimension of $\widehat{W}$ and set $i := k - 1$.
- Let $W_k := \widehat{W}_k$.
- While: $i \ge 0$,
* For each $w \in \widehat{W}_i$, if Member2($w, W_j$) for any $j > i$, then discard $w$.
Otherwise, copy it into $W_i$.
* Let $i := i - 1$.
- End while.
- return($W := [W_0, \ldots, W_k]$).

With the junk removed, it remains to partition the witness sets at each dimension
into irreducible witness sets. The monodromy method, though not complete on its
own, is useful for this task.

Monodromy: $[W']$ := Monodromy($W$)

• Input: A witness set $W$ for a pure-dimensional algebraic set $Z$.
• Output: A witness set $W'$ having the same points as $W$ in some permuta-
tion such that corresponding points in the lists are known to be in the same
irreducible component of $Z$.
• Procedure: See the monodromy algorithm of § 15.4.
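The simplest picture of monodromy is a one-variable toy example (ours, not from the text): the two solutions of $x^2 = \gamma$ trade places when $\gamma$ traverses a loop around the origin, which certifies that both witness points lie on one irreducible component, since $x^2 - \gamma$ is irreducible:

```python
import numpy as np

def track_loop(x, steps=400):
    """Track a root of x^2 - gamma = 0 while gamma = e^{i theta}
    runs once around the unit circle."""
    for theta in np.linspace(0.0, 2 * np.pi, steps)[1:]:
        gamma = np.exp(1j * theta)
        for _ in range(8):                 # Newton correction
            x = x - (x**2 - gamma) / (2 * x)
    return x

start = 1.0 + 0.0j                         # a root for gamma = 1
end = track_loop(start)
print(np.allclose(end, -1.0))              # True: the loop permuted the two roots
```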

To make the monodromy approach complete, we may employ a trace test. This
test can also be used on its own, without monodromy, to find irreducible decompo-
sitions. Its format is as follows.

Trace Test: $[t]$ := Trace($Y$)

• Input: A set of points $Y \subseteq W$, where $W$ is a witness set for a pure-dimensional
algebraic set $Z$.
• Output: An array $t$ containing linear traces of the points in $Y$. The traces
have the property that if the sum of traces for a set of points is zero, then that
set is the union of one or more irreducible witness sets.
• Procedure: See § 15.5.

As we will discuss in § 15.5, if $w \in W$ is on an irreducible component of degree
$d$, there is one and only one subset of $W$ of size $d$ that contains $w$ and has trace of
zero. (The trace of a set of points is just the sum of the traces of its members.)
Moreover, any zero-trace set of size greater than $d$ that contains $w$ is the union of
the irreducible one of size $d$ and one or more other irreducible witness sets.
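A concrete picture of the linear trace (the example is ours; the full story is in § 15.5): slice the parabola $y = x^2$ with the translated family of lines $x + y = t$. The sum of the witness points' $y$-coordinates is affine-linear in $t$, but the $y$-coordinate of any single witness point is not, so a vanishing second difference flags a complete union of irreducible witness sets:

```python
import numpy as np

def witness_pts(t):
    """Witness points of y = x^2 on the moving slice x + y = t."""
    xs = np.roots([1.0, 1.0, -t])          # x^2 + x - t = 0
    return np.array([[x, x**2] for x in xs])

ts = [0.0, 1.0, 2.0]
trace = [witness_pts(t)[:, 1].sum() for t in ts]         # y summed over all points
single = [witness_pts(t)[:, 1].real.max() for t in ts]   # y of one point only

second_diff = lambda v: v[0] - 2 * v[1] + v[2]
print(np.isclose(second_diff(trace), 0.0))    # True:  complete set, linear trace
print(np.isclose(second_diff(single), 0.0))   # False: proper subset, not linear
```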
Combining monodromy and the trace test, we have a complete algorithm for
irreducible decomposition of a pure-dimensional algebraic set as follows.

Irreducible Decomposition: [W] := IrrDecompPure(W, M, K)


• Input: A witness set W for a pure-dimensional algebraic set Z. Also, integers
M, K that control when to switch to exhaustive enumeration.
• Output: The a list W of the irreducible components Wj of W.
• Procedure:
— For j = 1, ..., #W, initialize Y_j as a set containing the jth point of W.
Let Y be the list of all Y_j.
— Associate to each Y_j a trace value t_j := Trace(Y_j).
— For each t_j that is zero, move Y_j from Y to W′.
— Comment: try heuristic monodromy loops first. Integer k counts the number
of attempts without making progress.
— Initialize k = 0.
— While: m := #Y > M and k < K,
* Let {Y′_1, ..., Y′_m} := Monodromy({Y_1, ..., Y_m}).
* If there is any Y′_i ≠ Y_i, we have found a path connecting a point in Y_i to
a point of some Y_j, i ≠ j.
* Regroup the Y_i, merging all sets that have a monodromy connection and
updating the corresponding trace as the sum of those for the merged sets.
* For each new trace that is zero, move the merged set from Y to W′.
* If there were no mergers, increment k, else set k = 0.
— End while.

- Comment: Switch to the exhaustive tests either because the number of
groups is low enough or because we give up on the monodromy heuristic.
- While: Y ≠ ∅,
* Among all combinations of one or more Y_j ∈ Y, find the smallest combi-
nation that both contains Y_1 and has a summed trace of zero. "Smallest"
means having the fewest witness points.
* Merge this combination into one set and move it to W′.
- End While.
- return(W′).

With care in programming, the exhaustive phase requires at most 2^{m−1} − 1
combinations to be examined. There are 2^m possible combinations in all, but if one
combination passes the trace test, so does its complement in Y, so we never have to
test both. Also, we know that the trace for the whole of Y must be zero, because
the initial set W is a complete witness set. A further refinement of the algorithm
recognizes that some witness points appear with multiplicities greater than one,
and all witness points on the same component must have the same multiplicity.
Therefore, if we keep track of how many times each witness point appears in the
output of WitnessSuper or Cascade, we can limit the combinations to be tested
in the exhaustive phase to only those combining points of the same multiplicity.
Here, multiplicity means the multiplicity of the component as a solution to the
squared-up system used in witness superset generation, which may be greater than
its multiplicity as a solution of the original system.
If we do not wish to use monodromy, a negative value of K causes
IrrDecompPure to skip directly to exhaustive trace testing.
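To make the exhaustive phase concrete, the following sketch (not from the text; the group sizes, traces, and tolerance are hypothetical) searches every combination of groups that contains the first group for the one with the fewest witness points whose summed trace vanishes:

```python
# Illustrative sketch of the exhaustive phase of IrrDecompPure (hypothetical
# data and tolerance): among all combinations of groups that contain group 0,
# find the one with the fewest witness points whose summed trace vanishes.
from itertools import combinations

def smallest_zero_trace_combo(sizes, traces, tol=1e-8):
    m = len(traces)
    best, best_size = None, None
    for r in range(0, m):
        for rest in combinations(range(1, m), r):
            combo = (0,) + rest                      # must contain Y_1
            if abs(sum(traces[j] for j in combo)) < tol:
                size = sum(sizes[j] for j in combo)  # "smallest" = fewest points
                if best is None or size < best_size:
                    best, best_size = combo, size
    return best

# groups 0 and 2 together have zero trace, so they merge into one witness set
print(smallest_zero_trace_combo([2, 1, 3], [1+2j, 5+0j, -1-2j]))  # → (0, 2)
```

Because a combination and its complement pass the trace test together, a more careful implementation would also enumerate only half of the subsets, as noted above.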
The numerical irreducible decomposition is obtained from a witness superset
by removing junk and then decomposing the witness sets for each dimension one by one.
For completeness, we list the algorithm as follows.

Irreducible Decomposition: [W] := IrrDecomp2(Ŵ, M, K)


• Input: A witness superset Ŵ for an algebraic set Z. Integers M and K are
control parameters for IrrDecompPure.
• Output: The witness set W contained in Ŵ, decomposed into its irreducible
pieces as in Equation 15.0.1.
• Procedure:
- Let W := JunkRemove(Ŵ).
- For i = 1, ..., dim W, let W_i := IrrDecompPure(W_i, M, K).
- return(W).

This completes the top-down description of numerical irreducible decomposition.


The rest of the chapter builds the required membership tests from the bottom up

and shows that they have the properties that we rely on here.

15.2 Sampling a Component

The fundamental capability upon which all the membership tests depend is the
ability to sample a component given a witness point on it. Recall from § 13.3
that as a theoretical construct, witness points are the isolated points of intersection
between an affine algebraic set and a generic linear space. To sample a component,
we simply move the linear space in continuous fashion, i.e., move it along a real-
one-dimensional path through the Grassmannian of linear spaces. As long as the
prescribed path avoids a proper algebraic subset of nongeneric linear spaces, the
intersection with the component remains isolated and defines a real-one-dimensional
path of points on the component. Sampling is just the process of setting up such
paths and following the points of intersection. Suppose the algebraic set under study
is Z, and let L(s) denote a continuous path of linear spaces that are generic with
respect to Z for s ∈ (0,1]. Then, x(s) := Z ∩ L(s) is a path of isolated points, with
a well-defined endpoint x(0) = lim_{s→0} x(s). When we choose L(0) to be generic
also, then x(0) is a new sample point lying on the same component as x(1) and on
no other component of Z.
As a numerical construct, witness sets carry along extra information that allows
a numerical approximation to a witness point to be refined to higher precision. This
same information allows us to update the witness point when the slicing plane is
moved slightly, hence we can numerically follow the path x(s). The details vary
according to whether the component is reduced, deflated or nonreduced.
A linear space can be represented extrinsically as a set of linear equations L(x) =
a + A·x = 0 or intrinsically as x(u) = b + B·u. In the extrinsic form, a linear
interpolation between two such spaces of the same dimension, say L_1(x) and L_0(x),
can be written as

L(x, s) = sL_1(x) + (1 − s)L_0(x).   (15.2.2)

If the coefficients of L_1(x) and L_0(x) are chosen at random, then L(x, s) = 0 defines
a linear space of that same dimension for all s ∈ [0,1], with probability one.
Intrinsically defined paths work in an analogous way, so we don't write out the
details. In the rest of this section, we write only extrinsic formulations, but it
should be understood that intrinsic ones can be used instead, usually with some
increase in efficiency for implementations.

15.2.1 Sampling a Reduced Component


A numerical witness point on a reduced component in V(f) is a nonsingular solu-
tion, say x_1, to the augmented system {f(x), L_1(x)} = 0, for some known slicing
equations L_1(x). To sample, we simply replace L_1(x) with the path L(x, s) of

Equation 15.2.2 to get the homotopy

h(x, s) := [ f(x) ; L(x, s) ] = 0.   (15.2.3)

We wish to track the path beginning at x_1 for s = 1 to find the endpoint as s → 0.
For x ∈ C^N, the homotopy h(x, s) has at least N equations. When it is not
square, we can use randomization to square it up as h′(x, s) := ℛ(h(x, s); N) and
then apply the usual nonsingular path tracker of § 2.3. An alternative is to use a
Gauss-Newton predictor-corrector, meaning that we use least-squares pseudoinver-
sion in place of Gaussian elimination to solve the overdetermined linear systems in
the predictor and corrector steps (see Equations 2.3.5, 2.3.6).
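As a much simplified illustration of this sampling procedure, the sketch below tracks a witness point of the unit circle x_1² + x_2² = 1 in C², sliced first by x_1 + x_2 = 1 (witness point (1,0)) and moved to the target slice x_1 − x_2 = 0. The curve, the two slices, the step schedule, and the plain least-squares corrector are all illustrative assumptions, not the book's implementation:

```python
# Sketch: sample a reduced component (the unit circle) by sliding the slice
# L(x,s) = s*L1(x) + (1-s)*L0(x) from L1 to L0, correcting with least-squares
# (Gauss-Newton) steps at each value of s.
import numpy as np

f   = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0])   # the component V(f)
df  = lambda x: np.array([[2*x[0], 2*x[1]]])
L1  = lambda x: np.array([x[0] + x[1] - 1.0])         # start slice
L0  = lambda x: np.array([x[0] - x[1]])               # target slice
dL1 = np.array([[1.0, 1.0]])
dL0 = np.array([[1.0, -1.0]])

x = np.array([1.0, 0.0], dtype=complex)               # witness point on L1
for s in np.linspace(1.0, 0.0, 201):
    for _ in range(5):                                # Gauss-Newton corrector
        h = np.concatenate([f(x), s * L1(x) + (1 - s) * L0(x)])
        J = np.vstack([df(x), s * dL1 + (1 - s) * dL0])
        x = x + np.linalg.lstsq(J, -h, rcond=None)[0]
print(x.real)   # a new sample point, on the circle and on the target slice L0
```

Here the previous point serves as the predictor at each step; a production tracker would add a genuine predictor and adaptive step control.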

15.2.2 Sampling a Deflated Component


Recall from § 13.3.2 that for a nonreduced component in V(f), we have the option
of constructing a deflation such that the component in question is the projection
of a reduced solution component of a related system of polynomials g. That is,
the witness set has the form (W′, L, g, π) such that the points W′ are nonsingular
points in V(g) ∩ π^{−1}(L) and the witness points are W = π(W′). When L is given
by equations L(x) = 0, the pullback π^{−1}(L) is given by the same equations, so
the path L(x, s) is still just as in Equation 15.2.2. We proceed as in the case of a
reduced component but with g replacing f, obtaining a solution path y(s) in some
larger dimension. The path we seek is just x(s) = π(y(s)).

15.2.3 Witness Sets in the Nonreduced Case


The nonreduced case without deflation is the most difficult. In this case, witness
points are singular endpoints of solution paths in a homotopy h(x, t) = 0. This
homotopy is constructed in the course of computing a witness superset either by
WitnessSup_i called from WitnessSuper of § 13.6 or by algorithm Cascade of
§ 14.1. Either way, the homotopy depends on the coefficients of the linear slicing
equations, which we may explicitly show by the notation h(x, t; A) = 0. Conse-
quently, if A_1 is the matrix of coefficients for the slicing plane on which our witness
point lies and A_0 is the same for the target slice, the sampling homotopy becomes
doubly parameterized by t and s as

H(x, t, s) := h(x, t; sA_1 + (1 − s)A_0) = 0.   (15.2.4)

We have in our witness set the start points W_t that satisfy H(x, ε, 1) = 0 and which
lead to the witness point as t → 0 for s = 1. We wish to track the solution path
as s moves along (0,1]. We know that the solution path exists, but it consists of
singular isolated points for each value of s. This is a case of singular path tracking.
So as not to unduly interrupt the flow of the chapter, we postpone discussion of
singular path tracking to the last section, § 15.6.

15.3 Numerical Elimination Theory

The first approach to the numerical irreducible decomposition, reported in


(Sommese et al., 2001a), uses a membership test based on a numerical version
of elimination theory. This is the test we called Memberl in § 15.1.
Let X denote an irreducible k dimensional component of the solution set V(f)
and let x* be a known generic point on it, that is, x* = X n L c ^ l for a generic linear
space Lcrfel of codimension k. We need to give a criterion for y G C^ to belong to
X. We can assume that N > k > 0 since otherwise nothing needs to be done.
Assume first that k = N − 1. Using the sampling techniques of § 15.2, we vary
the slicing plane and collect as many widely separated general points on X as we
wish. For each positive integer d, there are m(d) := (N+d choose d) monomials of degree less
than or equal to d; the coefficients of these monomials are homogeneous coordinates
in P^{m(d)−1}, thereby forming the complex linear space P_d(C^N) of polynomial equations
of degree ≤ d. Each point on X gives a linear condition on P_d(C^N). Choosing
m(d) − 1 general points, we get a polynomial p_d(x) vanishing on the points, unique
up to multiplication by a nonvanishing complex number. Choosing one additional
general point e on X we have either

(1) p_d(e) ≠ 0, in which case there are no elements of P_d(C^N) vanishing on X and
deg X must be greater than d; or
(2) p_d(e) = 0, which by genericity implies that p_d(x) vanishes identically on X. In this
case, deg X ≤ d and if this is the smallest d for which such a polynomial exists,
we know that deg X = d, and in fact, X = V(p_d). Consequently, we have a
membership test: y ∈ X if, and only if, p_d(y) = 0.

Thus, we may proceed progressively d = 1, 2, ... until we find a d for which there is
a polynomial p_d(x) vanishing on X. We know that deg X is at most the cardinality
of the witness set for dimension N − 1, which limits the complexity of the method.
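This progressive degree search can be mimicked numerically by a rank test on a matrix of monomial values. The sketch below (an illustration, not the book's implementation; the sample count and rank tolerance are assumptions) samples the circle x² + y² = 1, for which the search should stop at d = 2:

```python
# Sketch: find the minimal d such that some polynomial of degree <= d
# vanishes on samples of a plane curve, via rank deficiency of the matrix
# of monomial values.  The samples lie on the circle x^2 + y^2 = 1.
import numpy as np

def monomials(pt, d):
    x, y = pt
    return [x**i * y**j for i in range(d + 1) for j in range(d + 1 - i)]

rng = np.random.default_rng(1)
theta = rng.uniform(0.0, 2*np.pi, 40)
pts = np.column_stack([np.cos(theta), np.sin(theta)])    # samples of X

found_d = None
for d in range(1, 6):
    M = np.array([monomials(p, d) for p in pts])
    if np.linalg.matrix_rank(M, tol=1e-8) < M.shape[1]:  # a vanishing poly exists
        found_d = d
        break
print("deg X =", found_d)   # → deg X = 2
```

A null vector of the rank-deficient matrix gives the coefficients of the fitted polynomial p_d, here a multiple of x² + y² − 1.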
Now assume that 0 < k < N − 1. Take a generic linear projection π : C^N →
C^{k+1}. We know by Theorem A.10.5 that π is generically one-to-one and proper on
X, and in particular, that π(X) is an algebraic hypersurface with deg π(X) = deg X.
We sample X as usual and project each sample point x to y = π(x) ∈ C^{k+1}. Just
as above, we now find a polynomial q_d(y) of minimal degree that vanishes on the
projected samples and we conclude that π(X) = V(q_d).
Any point x′ ∈ π^{−1}(π(X)) satisfies q_d(π(x′)) = 0, so at first blush, q_d does
not seem adequate for testing membership in X. However, it is sufficient for testing
membership for a finite set F ⊂ C^N, because a general projection such as π has the
property that for all x* ∈ F, π(x*) ∈ π(X) if, and only if, x* ∈ X. So choosing the
projection at random, we have a probability-one membership test for points x* ∈ F:
x* ∈ X if, and only if, q_d(π(x*)) = 0. This is all we need for algorithm Member1.
The main problem with this approach is that (k+1+d choose d) grows rapidly with the
dimension k and degree d of the component. Also, fitting polynomials of high degree

to numerical data is often numerically ill-conditioned. The dimensionality of the


problem can sometimes be reduced by detecting that the linear span of a component
is smaller than N and the degree can be lowered mildly by projecting from points on
the component; see (Sommese et al., 2001b) for more on these. Still, the approach
is often too inefficient for practical use.

15.4 Homotopy Membership and Monodromy

We can avoid the computational cost of numerical elimination by switching to a


weaker membership test, called Member2 in § 15.1. It has the more stringent con-
dition that the input is a witness set for a pure-dimensional component, whereas
Member1 only requires a single generic point. However, since our methods of
generating witness supersets always give a top-dimensional witness set free of junk
points, we have the necessary input to start the junk removal process for lower di-
mensions using Member2. The same theoretical underpinning that justifies Mem-
ber2 gives us routine Monodromy: both rely on a homotopy membership test.
The main principle is that if X ⊂ C^N is an irreducible algebraic set, X and
X_reg are path connected. Assume X is i-dimensional, i < N, and let G be the
Grassmannian consisting of all codimension i linear spaces in C^N. A general point
in G is a generic slicing plane with respect to X, while there is some proper algebraic
subset, say G* ⊂ G, of nongeneric slicing planes. A generic slicing plane, say
L_1 ∈ G \ G*, cuts X in a witness point set W_1 := X ∩ L_1. For any L_0 ∈ G,
let L(s) ⊂ G be a one-real-dimensional path with L(1) = L_1 and L(0) = L_0 and
L(s) ∈ G \ G* for all s ∈ (0,1]. By Theorems A.14.1 and A.14.2, since W_1 is the
entire solution set of X ∩ L(1), the solution paths X ∩ L(s) start at W_1 for s = 1 and
the limits of their endpoints as s → 0 include all isolated solutions of X ∩ L(0). For
convenience throughout this section, we abuse notation by using the same symbol L
for both the linear space and the linear functions which define it, i.e., L = V(L(x)).
Suppose we wish to test if point y ∈ C^N is in X, where X is as in the previous
paragraph. If y ∈ X, then among all the linear spaces in G that pass through
y, generic ones meet X at y transversely, that is, letting L_0 be a generic element
of G through y, we have that y is an isolated point of X ∩ L_0. Accordingly, the
endpoints of the homotopy paths X ∩ L(s) starting from W_1 must include y as
s → 0. The only remaining question is how to construct L(s) so that it misses
G*. This is easily accomplished because L_1 is generic, so by Lemma 7.1.2, the path
L(s) = sL_1 + (1 − s)L_0 avoids G* for s ∈ (0,1] with probability one. Here, the
interpolation formula for L(s) assumes L_1 and L_0 are represented extrinsically as a
set of i linear functions.
Now, suppose that X is the union of several irreducible pieces, all of dimension i.
We have y ∈ X if, and only if, it is in one of the pieces. We just conduct the
homotopy membership test for each piece. Notice that if we have a witness set for
X, it includes witness points for all the irreducible pieces even though we may not

know which points match which pieces. It doesn't matter as far as the membership
test is concerned; if we track all the witness points, we still get all the endpoints,
without knowing which ones are on which piece.
According to the above, we may now write pseudocode for Member2 as follows.

Membership: [t] := Member2(y, W)


• Input: A single point y ∈ C^N and a witness set W for a pure-dimensional
algebraic set X of dimension i.
• Output: If y ∈ X, return t := true, else return t := false.
• Procedure:
— Comment: W includes a linear space L_1 and the witness points W = X ∩ L_1.
— Choose a random complex i × N matrix A.
— Let L_0(x) := A·(x − y). This is a generic linear space passing through y.
— Track the paths X ∩ (sL_1 + (1 − s)L_0) from W at s = 1 to get endpoints
Y at s = 0.
— If y ∈ Y, then return(true), otherwise return(false).

Obviously, we can save some computation by tracking the paths one at a time
and returning a positive result as soon as one ends on y. The worst case is when
y ∉ X, because then we always have to track all the paths to find this out.
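To make the homotopy membership test concrete, here is a toy numerical sketch in which every choice — the curve, the start slice, the fixed complex slice matrix A, and the tracking parameters — is an illustrative assumption. It tests whether a point y lies on the unit circle by tracking the two witness points of the slice x_1 + x_2 = 1 to a slice through y:

```python
# Sketch of Member2 on the unit circle X = V(x1^2 + x2^2 - 1): track the
# witness points of X on the slice x1 + x2 = 1 to the slice A(x - y) = 0
# through y, then check whether y appears among the endpoints.
import numpy as np

def member(y, A, steps=400):
    f   = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0])
    df  = lambda x: np.array([[2*x[0], 2*x[1]]])
    L1  = lambda x: np.array([x[0] + x[1] - 1.0]); dL1 = np.array([[1.0, 1.0]])
    L0  = lambda x: A @ (x - y);                   dL0 = A
    ends = []
    for x in (np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)):
        for s in np.linspace(1.0, 0.0, steps):
            for _ in range(6):                     # Newton corrector
                h = np.concatenate([f(x), s * L1(x) + (1 - s) * L0(x)])
                J = np.vstack([df(x), s * dL1 + (1 - s) * dL0])
                x = x + np.linalg.solve(J, -h)
        ends.append(x)
    return any(np.linalg.norm(e - y) < 1e-6 for e in ends)

A = np.array([[1.0 + 2.0j, 0.3 - 1.0j]])   # stand-in for a random complex matrix
print(member(np.array([0.6, 0.8]), A))     # True: (0.6, 0.8) lies on the circle
print(member(np.array([0.5, 0.5]), A))     # False: (0.5, 0.5) does not
```

In the non-member case both endpoints are the (complex) intersections of the slice through y with the circle, neither of which equals y — the worst case mentioned above, since every path must be tracked.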

15.4.1 Monodromy
The same principle underlying the homotopy membership test leads directly to the
concept of monodromy. In our context, the basic idea is that if L(s) ⊂ G \ G* is
a one-real-dimensional closed loop, that is, L(0) = L(1), then the set of witness
points at s = 1 is equal to the set at s = 0, i.e., W = X ∩ L(1) = X ∩ L(0). This
is true both when X is irreducible and when it is the union of irreducibles. What
makes this useful to us is that, although the set of points is the same, the paths
leaving at s = 1 may arrive back at s = 0 in permuted order. A path beginning at
point u ∈ W and arriving at point v ∈ W with u ≠ v demonstrates that u and v
are in the same irreducible component. This is just the homotopy membership test
applied on a closed loop.
When we begin with a witness set W for a pure-dimensional component X,
such as would be generated by successive application of algorithms Cascade and
JunkRemove, we do not know how many irreducible components X contains.
Any partition of the points is possible, from every witness point lying in its own
linear component to all witness points on the same component of degree #W.
Each connection between distinct witness points found by monodromy restricts
the possible break up. This is how algorithm Monodromy is used in algorithm
IrrDecompPure of § 15.1. Pseudocode for the monodromy algorithm follows.

Monodromy: [W′] := Monodromy(W)


• Input: A witness set W for a pure-dimensional algebraic set Z.
• Output: A witness set W′ having the same points as W in some permuta-
tion such that corresponding points in the lists are known to be in the same
irreducible component of Z.
• Procedure:
— Comment: W includes a linear space L_1 and the witness points W = X ∩ L_1.
— Choose a random linear space L_0(x) = 0 of the same dimension as L_1(x).
— Let L(s) = sL_1 + (1 − s)L_0.
— Track the paths X ∩ L(s) starting at W for s = 1 to get new endpoints V
at s = 0.
— Choose a random complex γ ∈ C.
— Let L(s) = sγL_0 + (1 − s)L_1.
— Track the paths beginning at V for s = 1 to get endpoints W′ at s = 0.
— return(W′).

In the lists of points W, V, and W′, we maintain the path ordering throughout,
so that the kth point in W is path connected to the kth point of W′, for all k. Note
that we have used the fact that γL(x) = 0 defines the same linear space as L(x) = 0,
so X ∩ L_0 = X ∩ γL_0. This means the start points of the second homotopy are
the endpoints of the first. The γ causes the return path to be different than the
outbound path. See the figure following Lemma 7.1.3 for illustration.
Note that Monodromy as written above uses two stages of path-tracking to
produce one monodromy loop. In the process, it generates a witness set at a sec-
ond slice, but this information is thrown away. For efficiency, one could save this
intermediate witness set and use it to close monodromy loops with less work on
subsequent executions of the algorithm. For example, if we go from L_1 to L_0 to L_1
to L_0, we have closed two loops, L_1 → L_0 → L_1 and L_0 → L_1 → L_0, using only
three rounds of path tracking instead of four. See (Sommese et al., 2001c) for more
on such practical issues.
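The effect of a monodromy loop is easy to see on a small reducible example. In the sketch below (an illustration with assumed choices; the slice is simply z = const and the loop is the unit circle in the z-plane), the curve V((w² − z)(w − 3)) ⊂ C² has witness points w = ±√z and w = 3 over z = 1; looping the slice once around the origin swaps the first two points and fixes the third, proving that the first two lie on one irreducible component:

```python
# Sketch: monodromy loop for f(z,w) = (w^2 - z)(w - 3), slicing by z = const.
# Tracking the witness points over z = e^{i*theta}, theta: 0 -> 2*pi, permutes
# the two points of the component w^2 = z and fixes the point of w = 3.
import numpy as np

def loop(w):
    p  = lambda w, z: (w*w - z) * (w - 3)
    dp = lambda w, z: 2*w*(w - 3) + (w*w - z)     # derivative of p in w
    for theta in np.linspace(0.0, 2*np.pi, 800):
        z = np.exp(1j * theta)
        for _ in range(5):                        # Newton corrector in w
            w = w - p(w, z) / dp(w, z)
    return w

ends = [loop(w) for w in (1.0+0j, -1.0+0j, 3.0+0j)]
print(np.round(ends, 6))   # points 1 and 2 swap; point 3 returns to itself
```

Grouping the connected points as in IrrDecompPure yields the partition {±√z} ∪ {3}, matching the factorization of f.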

15.4.2 Completeness of Monodromy


The monodromy procedure above is clearly valid, but it could be vacuous in the
sense that the points might always come back in the same order. This is not usually
observed in practice, and in fact, theory tells us that there exist monodromy loops
sufficient to generate all possible permutations of the witness points on an irreducible
set. This section presents an even stronger result that is key to the next topic of
traces. But before we show that result, we give an extended discussion of a simple
case. Historically, this was probably the first example of monodromy ever studied.
Assume we have a polynomial p(z, w) = w² − z on C². Assume that p(z, w) = 0

and we wish to express w as a function of z, i.e., we wish to make sense out of the
expression √z with z ∈ C. We would like this to be a global function, but this
is not possible in any continuous way. Let us assume it is possible and see what
goes wrong. At z = 1, we need √1 to be set to either 1 or −1. Let's assume that
√z is set to 1; the case of −1 is identical. For z = e^{iθ} we have either √z = e^{iθ/2}
or √z = −e^{iθ/2}. Since √1 = √(e⁰) we conclude by continuity that √z = e^{iθ/2}.
The trouble comes when we go full circle and reach √(e^{i2π}). By continuity we have
√(e^{i2π}) = e^{iπ} = −1.
The easiest classical solution of the problem of defining √z (or ln(z) for that
matter) is to slit the plane from 0 to −∞, e.g., remove the real numbers from 0 to
−∞ from C. On the slit plane there are two "branches" of √z. One has √1 set to
1 and the other has √1 set to −1. Similarly, setting a more complicated polynomial
p(z, w) = 0 with the w degree equal to d, we will have functions w = q_i(z) for
i = 1, ..., d solving p(z, w) = 0. Each branch of the solution is defined on
an appropriately slit region of the plane. Analytic continuation is the classical
name for the process of extending the function, e.g., extending √z defined in a
small neighborhood of 1 to a function on a larger region. Hille has a nice detailed
discussion of analytic continuation (Chapter 10, Hille, 1962).
Notice that trying to define √z and tracking √(e^{iθ}) as z goes around the unit
circle leads to a permutation of the set {1, −1} of square roots of 1. Looking at this
a bit more abstractly, we have that w² − z = 0 defines an algebraic curve X in C².
Projection to the z variable gives a two-sheeted branched cover π : X → C. Over
C* := C \ {0}, we have that π : X \ {(0,0)} → C* is a two-sheeted unramified cover
with the fiber over a point z being (z, w), with w running over the "two square
roots" of z. The fundamental group of C* is the additive group Z, and we have the
monodromy action of Z on the fiber of π over a fixed basepoint, e.g., 1. The even
elements of Z leave {1, −1} fixed and odd integers send 1 to −1 and −1 to 1.
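This analytic continuation of √z is easy to carry out numerically; the sketch below (step counts are assumed) tracks the root w of w² − z = 0 by Newton's method as z = e^{iθ} sweeps from θ = 0 to 2π:

```python
# Numerically continuing sqrt(z) around the unit circle: starting on the
# branch with sqrt(1) = 1, one full loop ends on the other square root.
import numpy as np

w = 1.0 + 0j
for theta in np.linspace(0.0, 2*np.pi, 400):
    z = np.exp(1j * theta)
    for _ in range(5):               # Newton step for w^2 - z = 0
        w = w - (w*w - z) / (2*w)
print(w)   # ≈ -1: the loop lands on the other branch
```

A second loop around the origin returns w to 1, matching the action of the even and odd elements of Z described above.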
How does this apply to decomposing an algebraic set into its irreducible com-
ponents? Let's assume we have a pure k-dimensional affine algebraic set X ⊂ C^N.
Let π : X → C^k denote the restriction to X of a generic linear projection from C^N
to C^k. Note that by genericity we conclude from the Noether Normalization The-
orem 12.1.5 that π is a proper d := deg X branched covering of C^k. The union of
the sets where π is not a covering and X is not a manifold forms a proper algebraic
subset X′ ⊂ X with dim X′ < dim X. Since π is proper, we know that π(X′) is an alge-
braic subset of C^k by the proper mapping theorem A.4.3. Moreover, since the fibers
of the map π are finite, we know that dim π(X′) = dim X′ < dim X = k. Thus, let-
ting X = X_1 ∪ ··· ∪ X_r denote the decomposition of X into irreducible components,
we have that Y := C^k \ π(X′) and X̂_i := X_i \ π^{−1}(π(X′)) are all irreducible and
connected. Moreover, letting X̂ equal the manifold ∪_{i=1}^{r} X̂_i, the map π : X̂ → Y is
a d-sheeted unramified covering map.
Fix a basepoint y* ∈ Y and consider the monodromy action of π_1(Y, y*), the
fundamental group of Y with basepoint y*, on F := π^{−1}(y*). Note we have a

decomposition

F = F_1 ∪ ··· ∪ F_r   (15.4.5)

given by setting F_i := F ∩ X̂_i. For our purposes, it is enough to take smooth
embeddings g : S¹ → Y of the unit circle S¹ into Y with 1 going to y*, and for
points in F track them as θ ∈ [0, 2π] goes from 0 to 2π over the path g. We get
different permutations of F as we carry out this tracking with different embeddings
of S¹. By using the permutations, we break F into disjoint sets

F = F′_1 ∪ ··· ∪ F′_{r′}.   (15.4.6)

Since the X̂_i are connected, we see that the decomposition given in Equation 15.4.5
is compatible with F = ∪_{j=1}^{r′} F′_j in the sense that each F′_j is a subset of one of the
F_i. The immediate question that raises itself is:

Question 15.4.1 Do we have r = r′ and is each F′_j equal to one of the F_i?

If we take sufficiently many smooth immersions g : S¹ → Y with g(1) = y*,
the answer to this is yes, as we will see in § A.12. By a smooth immersion g :
S¹ → Y, we mean a smooth map with the differential of rank one at all points of
S¹. This suggests the method of using monodromy along paths to decompose X
into irreducible components. The problem is that the set X′ can be expensive to
compute. Therefore, although it is easy to find random paths in Y and consequently
permutations of F, we have no cheap way in general to find generators of π_1(Y, y*),
and so we have no way to know whether the breakup of F into the F′_j equals the
breakup of F into the F_i. This raises the second question:

Question 15.4.2 Is there a cheap way of checking that the breakup of F into the
F′_j equals the breakup of F into the F_i?

The answer to this is yes. Based on Theorem A.12.2, the trace test to certify the
breakup is explained in § 15.5.

Remark 15.4.3 (Monodromy over general bases) Everything we said above works
equally well for F := p^{−1}(y) where p : X → Y is a proper finite-to-one covering
map from a pure-dimensional quasiprojective manifold X onto a connected quasi-
projective manifold Y and y is a point we treat as a basepoint.

15.5 The Trace Test

The trace test is based on an explicit geometric description of a defining equation


of a hypersurface built out of traces.

15.5.1 Traces of Functions


The trace of a function is an old concept arising in a number of different situations.
In this section we summarize the main results about the trace and some related
constructions that go under the same general name.
We follow the approach to these concepts as they arise in the Weierstrass Prepa-
ration Theorem (Gunning, 1990; Gunning & Rossi, 1965). We also follow (Morgan
et al., 1992a; Sommese et al., 2002b) where we have used these concepts in a nu-
merical context. We refer the reader to these places for more details.

15.5.2 The Simplest Traces


Let us explain what a trace is in the simplest case. We have a finite set F ⊂
C consisting of d not necessarily distinct elements λ_1, ..., λ_d. Keeping track of
multiplicities, or assuming that the λ_i are distinct, we have a polynomial p(x) of
degree d, unique up to a multiple by a nonzero complex number, with the property
that V(p) = F. It is easy to write down:

p(x) = ∏_{i=1}^{d} (x − λ_i).   (15.5.7)

Multiplying this out we get

p(x) = Σ_{i=0}^{d} (−1)^i t_i x^{d−i},   (15.5.8)

where the t_i are elementary symmetric functions of the roots, i.e., t_0 := 1, and for
i > 0,

t_i := Σ_{1 ≤ j_1 < j_2 < ··· < j_i ≤ d} λ_{j_1} ··· λ_{j_i}.

The parameterized version of these t_i are the traces we are interested in. Before
we turn to the parameterized situation, let us note an interpretation of the above
t_i as traces of matrices. Recall from linear algebra that the trace of a matrix is the
sum of its diagonal elements. Let

A := diag(λ_1, ..., λ_d).

The trace of A is clearly t_1. The matrix A induces linear transformations Λ^i A of
the exterior products Λ^i C^d. Using the basis

{e_{j_1} ∧ ··· ∧ e_{j_i} | 1 ≤ j_1 < j_2 < ··· < j_i ≤ d},

where e_k is the d-tuple with zero entries in all places but the k-th place, where there
is a 1, we see that the trace of Λ^i A is t_i.
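In floating point, Equation 15.5.8 is exactly what numpy's poly computes from a list of roots: its output coefficients are (−1)^i t_i. A small check, with sample roots chosen for illustration:

```python
# The coefficients of the monic polynomial with roots 2, -1, 3 are (-1)^i t_i
# for the elementary symmetric functions t_i of the roots:
# t_1 = 4, t_2 = 1, t_3 = -6, so p(x) = x^3 - 4x^2 + x + 6.
import numpy as np

roots = np.array([2.0, -1.0, 3.0])
coeffs = np.poly(roots)               # → [1, -4, 1, 6]
print(coeffs)
print(np.trace(np.diag(roots)))       # t_1 = 4.0, the trace of diag(roots)
```

The matrix interpretation above is visible in the last line: t_1 is literally the trace of diag(λ_1, ..., λ_d).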

15.5.3 Traces in the Parameterized Situation


Now we want to deal with the trace in the parameterized situation. Assume we have
a finite-to-one proper degree d algebraic map π : X → Y from one pure-dimensional
quasiprojective algebraic set onto a connected smooth quasiprojective algebraic
set, or more generally onto an irreducible normal quasiprojective algebraic set. In
practice, X is usually an affine algebraic set in C^N and Y is Euclidean space. From
Corollary A.4.14, we know that properness implies there is a Zariski open dense set
U ⊂ Y and a positive integer d such that the restriction π : π^{−1}(U) → U is an unbranched
d-sheeted cover. The integer d is called the degree of π and denoted deg π.

Assume we have an algebraic function λ(x) defined on X. If Y was a point,
we would be in the case of § 15.5.2. Over the dense Zariski open subset U ⊂ Y,
where π is an unramified covering, each fiber consists of exactly d inverse images,
and we can do the construction of the t_i pointwise over each y ∈ U to get functions
tr_{π,i}(λ)(y) defined on U.

More explicitly, fix a point y ∈ U. The set π^{−1}(y) consists of d := deg π points
x_1, ..., x_d. We can form the degree d polynomial zero at the numbers λ(x_i) counted
with multiplicity

(w − λ(x_1)) ··· (w − λ(x_d)).

Expanded we have

Σ_{i=0}^{d} (−1)^i t_i w^{d−i},

where t_0 = 1 and t_i for i > 0 denotes the elementary symmetric function

Σ_{1 ≤ j_1 < ··· < j_i ≤ d} λ(x_{j_1}) ··· λ(x_{j_i})

of the roots λ(x_i).

This unramified assumption is too restrictive for us. The wonderful fact is that
under the modest assumption that Y is normal, e.g., a manifold such as C^N, these
functions t_i, which depend only on y ∈ U, have unique extensions to Y as holomorphic
functions. We call the function t_i that extends to Y the i-th trace of λ with respect
to π, and we denote it tr_{π,i}(λ). If λ and π are algebraic, then so are the traces
tr_{π,i}(λ). We have

Σ_{i=0}^{d} (−1)^i tr_{π,i}(λ) λ^{d−i} = 0.   (15.5.9)

These extensions exist because the properness of π implies that given any
y ∈ Y, there exists an open set V ⊂ Y containing y such that the closures of V and
π^{−1}(V) are compact. Thus tr_{π,i}(λ)(y) is bounded on U ∩ V. By using Theorem A.2.5
when Y is smooth, and Remark A.2.6 when Y is merely normal, we conclude that tr_{π,i}(λ)(y)
extends to V. This gives a holomorphic extension of tr_{π,i}(λ)(y) to all of Y, which
we also denote tr_{π,i}(λ). The functions tr_{π,i}(λ)(y) are algebraic functions. This is a
consequence of the characterization (discussed briefly in § A.1) of algebraic functions
by their growth.
The equation corresponding to the relation between the t_i and the λ_i in § 15.5.2
is the key equation

Σ_{i=0}^{d} (−1)^i tr_{π,i}(λ)(y) λ(x)^{d−i} = 0   (15.5.10)

for (x, y) ∈ X × Y with π(x) = y. For y ∈ Y where π^{−1}(y) consists of d distinct
points, this is nothing more than the fact that the roots of Equation 15.5.7 satisfy
Equation 15.5.8.

15.5.4 Writing Down Defining Equations: An Example


Consider the cuspidal cubic, defined as the solution set of z_2² − z_1³ = 0 in C². Let
X = V(z_2² − z_1³). The projection (z_1, z_2) ↦ z_1 restricted to X gives a proper
degree two map π : X → C.
Then given λ(z_1, z_2), a polynomial on C², we have

tr_{π,1}(λ_X(z_1, z_2))(z_1) = λ(z_1, √(z_1³)) + λ(z_1, −√(z_1³)).   (15.5.11)

Though √(z_1³) is not well-defined, the unordered pair {√(z_1³), −√(z_1³)} is well
defined, and thus tr_{π,1}(λ_X(z_1, z_2))(z_1) is well-defined.
Consider the function λ_X(z_1, z_2) := z_2. Substituting into Equation 15.5.11, the
first trace of the function z_2 is found to be

tr_{π,1}(z_2)(z_1) = √(z_1³) + (−√(z_1³)) = 0.

Note that √(z_1³) is only well defined if we choose a branch of the square root, but
whichever branch we choose, we have 0. Similarly,

tr_{π,2}(z_2)(z_1) = √(z_1³) · (−√(z_1³)) = −z_1³.

Recalling that t_0 = 1 by convention, Equation 15.5.10 gives

0 = (1)z_2² − (0)z_2 + (−z_1³)z_2⁰ = z_2² − z_1³.

It is no surprise that we get z_2² − z_1³ back again, since we know that up to a nonzero
constant multiple, z_2² − z_1³ is the lowest degree polynomial vanishing on X.

Note the linear projection given above is far from generic; e.g., if the linear
projection was generic, we know that the degree of the projection restricted to X
would equal the degree of X, i.e., deg V(z_2² − z_1³) = 3.
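These two traces are easy to confirm numerically at a sample base point. In the sketch below (the sample value of z_1 is an arbitrary choice), the two sheets over z_1 are the roots of z_2² − z_1³ = 0, whose sum and product give the first and second traces:

```python
# Check the traces of the cuspidal cubic z2^2 - z1^3 = 0 at a sample z1:
# the sheets are z2 = ±sqrt(z1^3), so tr_1 = sum = 0, tr_2 = product = -z1^3.
import numpy as np

z1 = 0.7 + 0.3j
sheets = np.roots([1, 0, -z1**3])     # the two values of z2 over z1
t1 = sheets.sum()                     # first trace: ~0
t2 = sheets.prod()                    # second trace: ~ -z1^3
print(abs(t1), abs(t2 + z1**3))       # both ~0
```

The branch ambiguity of √(z_1³) disappears exactly as in the text: sum and product are symmetric in the two sheets.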

15.5.5 Linear Traces


Let X be a pure (N−1)-dimensional algebraic subset of C^N. Choose a generic
projection of C^N to C^{N−1}. Then we know by Theorem 12.1.5 that the restriction
π of the projection to X is finite and proper of degree equal to d := deg X. Choose
as coordinates x_1, ..., x_N of C^N, the composition of coordinates x̄_1, ..., x̄_{N−1} of
C^{N−1} with π; and x_N equal to a general linear function on C^N that is nonconstant
on a fiber of π. Then Equation 15.5.10 gives the polynomial

p(x) = Σ_{i=0}^{d} (−1)^i tr_{π,i}(x_N)(x_1, ..., x_{N−1}) x_N^{d−i}   (15.5.12)

of the x_i that vanishes on X. We know that p(x) is a defining equation of X.
Since deg p(x) = d, we conclude from Equation 15.5.12 that

tr_{π,1}(x_N)(x_1, ..., x_{N−1}) is a linear function.   (15.5.13)

Indeed, if it is not, then the coefficient of x_N^{d−1} would be of degree at least two,
contradicting deg p(x) = d.

If X ⊂ C^N is a pure k-dimensional affine algebraic set of degree d, then by the
Noether Normalization Theorem 12.1.5, a generic linear projection of X to C^k is
proper and finite-to-one, and taking the trace tr_{π,1}(λ) of the restriction to X of any
generic linear function λ on C^N, we also obtain a linear function. This can be seen
by noting that the map (π, λ) : X → C^{k+1} is an embedding on a Zariski open dense
set V of X. Thus, we fall into the case covered by Equation 15.5.13 with N = k + 1.
We are now in a position to give an answer to Question 15.4.2. Given X ⊂ C^N
of dimension k, choose a generic linear space Lo := C^{N−k} and apply the monodromy
method to get a breakup F = F'_1 ∪ ⋯ ∪ F'_{r'} of F = X ∩ Lo. We can continue Lo by
Lo + sv, where v is a random vector in C^N and s ∈ C. As s varies we, at least for s
in some neighborhood of 0, get continuations q(s) of any point q ∈ F'_i for any i.
Theorem 15.5.1 Choose a general linear function λ on C^N. A set F'_i of Equa-
tion 15.4.6 is a set Fj of Equation 15.4.5 if and only if the function Σ_{q∈F'_i} λ(q(s)) is
linear in s.

Proof. By genericity of λ, it can be assumed that λ separates the points of the
fiber Lo ∩ X.
Assume first that F'_i is equal to one of the Fj = Z_{k,j} ∩ Lo for some k-dimensional
irreducible component Z_{k,j} of the solution set of f. It would follow from 15.5.13
that Σ_{q∈F'_i} λ(q(s)) is the linear trace tr_{π,1}(λ) for Z_{k,j}, and thus it is linear in s relative to the family
cut out by Lo + sv.
284 Numerical Solution of Systems of Polynomials Arising in Engineering and Science

Now assume that F'_i is not equal to any of the sets Fj. It must be properly
contained in one of the Fj, say F'_i ⊊ Fj. Let q1, ..., qb be the points of F'_i and
let q0 be a point of Fj \ F'_i. It follows from Theorem A.12.2 that there is a path
c : S¹ → C, where S¹ is the set of complex numbers of absolute value 1, with c(1) = 0,
such that monodromy under Lo + c(t)v takes q1, q2, ..., qb to q0, q2, ..., qb. Since
Σ_{q∈F'_i} λ(q(s)) is linear in s, we conclude that

λ(q1) + λ(q2) + ⋯ + λ(qb) = λ(q0) + λ(q2) + ⋯ + λ(qb).

From this we conclude that λ(q0) = λ(q1), contradicting the fact that λ separates
the points of the fiber Lo ∩ X. □

Remark 15.5.2 The usefulness of linear traces has been well recognized, e.g.,
(Sasaki, 2001).
In (Rupprecht, 2004) it was asserted, in the codimension-one case, that the con-
verse fact holds, i.e., that Σ_{q∈F'_i} λ(q(s)) being linear in s implies that F'_i is one of the Fj. That
proof, as explained in (Sommese et al., 2002b), has two serious gaps.
Theorem 15.5.1 justifies our use of the linear trace test in the numerical irre-
ducible decomposition algorithm IrrDecompPure of § 15.1. We are ready to state
the procedure for the algorithm Trace, as follows.

Trace Test: [t] := Trace(Y)


• Input: A set of points Y ⊂ W, where W is a witness set for a pure-dimensional
algebraic set Z ⊂ C^N.
• Output: An array t containing linear traces of the points in Y.
• Procedure:
- Let Lo be the linear space that cuts out Y ⊂ Z ∩ Lo.
- Choose a random, complex v ∈ C^N.
- Choose two distinct, nonzero, real numbers s1 and s2.
- Track the paths of Z ∩ (Lo + sv) from Y at s = 0 to get Y1 at s = s1 and
Y2 at s = s2.
- Choose a random, complex 1 × N matrix Λ, and define λ(y) := Λ · y.
- Evaluate q0 = λ(Y), q1 = λ(Y1), and q2 = λ(Y2).
- return( t := (q1 − q0)/s1 − (q2 − q0)/s2 )
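To make the trace test concrete, here is a small numerical sketch. All of the data is our own hypothetical choice: X is the union of a line and a circle in C², the moving slice is the line y = mx + c + s, and full path tracking is replaced by re-solving the slice and matching nearest points, which suffices for the small slice motions used here. The returned quantity t vanishes (up to roundoff) exactly when the λ-sums are linear in s, i.e., when the chosen subset is a complete witness set of a union of components.

```python
import numpy as np

# Hypothetical example: X = V((y - 2x)(x^2 + y^2 - 1)) in C^2, a line and a
# circle.  The moving slice is y = m*x + c + s; its witness points at s = 0
# are one line point (index 0) and two circle points (indices 1, 2).
m, c = 0.37 + 0.21j, 0.15 - 0.33j        # generic complex slice coefficients

def slice_points(s):
    """All witness points of X on the slice y = m*x + c + s."""
    x = (c + s) / (2 - m)                # line: 2x = m*x + c + s
    pts = [(x, 2 * x)]
    for x in np.roots([1 + m**2, 2 * m * (c + s), (c + s)**2 - 1]):
        pts.append((x, m * x + c + s))   # circle: x^2 + (m*x + c + s)^2 = 1
    return pts

def track(pts0, s):
    """Stand-in for path tracking: re-solve the slice at s, match nearest."""
    pts1 = slice_points(s)
    matched = []
    for p in pts0:
        j = min(range(len(pts1)), key=lambda k: abs(pts1[k][0] - p[0]))
        matched.append(pts1.pop(j))
    return matched

lam = lambda pts: sum(0.91 * x - 0.44j * y for x, y in pts)  # random linear

def trace_test(subset, s1=0.05, s2=0.11):
    """Linear trace t = (q1 - q0)/s1 - (q2 - q0)/s2; ~0 iff linear in s."""
    sub0 = [slice_points(0)[i] for i in subset]
    q0, q1, q2 = lam(sub0), lam(track(sub0, s1)), lam(track(sub0, s2))
    return (q1 - q0) / s1 - (q2 - q0) / s2

print(abs(trace_test([0])) < 1e-8)     # True: the line is a complete component
print(abs(trace_test([1, 2])) < 1e-8)  # True: both circle points together
print(abs(trace_test([1])) < 1e-8)     # False: one circle point is incomplete
```

Note the design choice of three sample values of s (0, s1, s2): a linear function is determined by two samples, so agreement of the two difference quotients certifies linearity, exactly as in the Trace procedure above.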

15.6 Singular Path Tracking

In § 15.2.3, we saw that sampling a nonreduced solution component can lead to a


singular path-tracking problem. This can be viewed as a special case of the following
situation. Suppose a parameterized family of polynomial systems (see Chapter 7),
f(z; q) : C^n × C^m → C^n, has an isolated singular solution (z*, q*) at a generic

parameter point q*, where singular means that the Jacobian matrix ∂f/∂z(z*; q*)
has rank less than n. This solution will continue to other isolated singular solutions
on an open set in C^m (see § A.14.1), and as described in Theorem 7.1.6, we may wish
to track such a solution along a continuous path in parameter space, say q(s) ⊂ C^m,
where q(0) = q*. In general, this would be a nearly intractable numerical problem,
but we have a little extra leverage if we have obtained the solution point (z*,q*) as
the endpoint of a nonsingular solution path to a homotopy h(x,t;q*) = 0. Then,
we may define the doubly parameterized homotopy
H(x, t, s) := h(x, t; q(s)) = 0. (15.6.14)
At its root, singular path tracking is based on a singular endgame. For each
value of s, the point on the singular path is the limit as t → 0 of a nonsingular
path. In Chapter 10, we discussed how to estimate such endpoints with the power-
series endgame or the related Cauchy integral endgame. Both of these work by
building a local model of the solution path for small t. The gist of singular path
tracking is to update this local model as we advance s and in essence, replay the
endgame at every s.
The power-series endgame and the Cauchy integral endgame both collect sample
data on the incoming paths of the homotopy to determine the winding number c
and to build a local model of the holomorphic function φ(η) from Lemma 10.2.1,
where t = η^c. The singular path tracker uses prediction/correction techniques to
update the local model as we step along the path. Recall from Chapter 10, that
a cluster of μ paths approaching the same endpoint may break into cycles, each
cycle having a winding number, say c, such that the solution path closes up as t
circles the origin c times. Although we will not argue the issue carefully here, it is
clear that these cycles also continue in the local neighborhood. In a nutshell, the
closing up of the solution path in c loops is an algebraic condition that holds
at the generic parameter q*, so it continues on an open subset in the neighborhood
of q*. The endgame convergence radius within which the local model holds varies
as q(s) varies with s. It may become zero within a proper algebraic subset of the
parameter space, but by Lemma 7.1.2, a one-real-dimensional path between two
generic parameter points will miss the degenerate set with probability one.
Therefore, at each value of s, we have a nonzero endgame operating zone as
in Figure 10.1, with a convergence radius and an ill-conditioned zone. If we use
sufficiently high precision, the ill-conditioned zone stays inside the convergence ra-
dius for all s ∈ [0, 1], and our task is to track the local model along this endgame
operating zone. As we have several ways of formulating an endgame based on the
local model, the details of tracking the model must be adjusted accordingly. In
essence though, all the methods are similar. For conciseness, it is helpful to adopt
the notation that for a cluster of points C = {w1, ..., wc}, we let H(C, t, s) = 0
mean H(wi, t, s) = 0 for i = 1, ..., c. Also, the following definitions are convenient.
Definition 15.6.1 A convergent cluster (C, t0, s) = ({w1, ..., wc}, t0, s) with

H(C, t0, s) = 0 is such that t0 is inside the endgame convergence radius for fixed s,
and for i = 1, ..., c, the solution path of H(w, t, s) = 0 beginning at (wi, t0, s) con-
tinues to (w_{i+1}, t0, s) as t travels once around the circle |t| = t0. For this definition,
wc continues to w1.
By requiring that t0 is inside the convergence radius, we implicitly require that
all the points in the cluster approach the same endpoint w* as t → 0 and that the
same cyclic mapping from wi to w_{i+1} holds under continuation around a circle
for every t ≠ 0 in the disk Δ_{t0}(0). In other words, the projection (w, t, s) → t
gives a proper c-sheeted finite mapping from the solution set of H(w, t, s) = 0 in a
neighborhood of (w*, 0, s) to a neighborhood of 0 ∈ C. We call w* the convergence
point of the cluster.
Definition 15.6.2 For fixed s, the convergence point of a convergent cluster is
the common endpoint as t → 0 of the solution paths of H(w, t, s) = 0 emanating
from each cluster point (wi, t0, s).
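As a concrete, hypothetical illustration of the endgame that the singular path tracker replays at each s, the sketch below runs a Cauchy-integral endgame on the one-variable homotopy h(w, t) = (w − 1 − t)² − t, whose solution path w(t) = 1 + t + √t has winding number c = 2 and singular endpoint w* = 1. The endpoint is recovered as the average of path samples while t circles the origin c times, which is a discretization of Cauchy's integral formula; the homotopy and all numerical parameters are our own choices, not the book's implementation.

```python
import numpy as np

# Hypothetical homotopy with a singular endpoint of winding number c = 2:
# h(w, t) = (w - 1 - t)^2 - t, with path w(t) = 1 + t + sqrt(t) and w* = 1.
h  = lambda w, t: (w - 1 - t)**2 - t
dh = lambda w, t: 2 * (w - 1 - t)

def newton(w, t, iters=10):
    """Correct w back onto the path h(., t) = 0 by Newton's method."""
    for _ in range(iters):
        w = w - h(w, t) / dh(w, t)
    return w

def cauchy_endgame(w0, t0, c, n=64):
    """Estimate the convergence point as the mean of path samples while t
    circles the origin c times at radius t0 (Cauchy's integral formula)."""
    samples, w = [], w0
    for k in range(c * n):
        t = t0 * np.exp(2j * np.pi * (k + 1) / n)   # small angular step in t
        w = newton(w, t)                            # stay on the same branch
        samples.append(w)
    return sum(samples) / len(samples)

t0 = 0.01
w0 = 1 + t0 + np.sqrt(t0)          # a cluster point on the path at t = t0
print(abs(cauchy_endgame(w0, t0, c=2) - 1) < 1e-8)   # True: recovers w* = 1
```

The same loop also reveals the winding number in practice: after n steps (one circuit of t) the tracked value lands on the other branch 1 + t − √t, and only after c·n steps does the path close up.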
The nonsingular path tracking algorithm of § 2.3 can be adapted to our current
situation to arrive at the following singular path tracking algorithm.

• Given: A system of equations, H(w, t, s) = 0, and an initial convergent cluster
C0, such that H(C0, t0, 0) ≈ 0. Also, an initial step length h and a tracking
tolerance ε.
• Find: A sequence of convergent clusters (Ci, ti, si), i = 1, 2, ..., along the path,
with s_{i+1} > s_i, terminating with sn = 1. Return the final cluster at
s = 1 and a high-accuracy estimate of its convergence point.
• Procedure:
- Loop: For i = 1, 2, ...
(1) Predict: Predict cluster (U, t′, s′) with s′ = min(s_{i−1} + h, 1) and t′ =
t_{i−1}.
(2) Correct: In the vicinity of U, attempt to find a corrected cluster W such
that H(W, t′, s′) ≈ 0.
(3) Recondition: If correction is successful, play a singular endgame in t′
to compute the convergence point of the cluster at s′. If the conver-
gence point is computed to accuracy better than ε, declare the endgame
successful and do the following.
∗ Adjust t: Pick a new ti in the endgame operating zone.
∗ Update: Set si = s′ and generate the corresponding cluster Ci. In-
crement i.
(4) Adjust h: Adjust the step length h.
- Terminate: Terminate when si = 1.
- Refine endpoint: Play the endgame at s = 1 to compute the final con-
vergence point to high accuracy.

In the context of witness points generated by the cascade algorithm, the paths
of the cluster points are nonsingular away from t = 0. Accordingly, the usual
prediction/correction techniques for nonsingular paths apply.
The adjustment step for reconditioning must select a new value of t, which will
be held constant in the next prediction step. One sensible way to select it is to use
the largest value for which the singular endgame meets the convergence tolerance ε.
If the endgame meets the tolerance on the first try at the current value t', it may
be useful to try increasing it. If it fails, we try decreasing t, unless the condition
of the Jacobian matrix indicates that failure may be due to having entered the
ill-conditioned zone around t = 0. With such rules in place, the value of t can
adaptively decrease and increase as s proceeds.
Similar to the nonsingular path tracker, we adaptively adjust the step length h
by halving it when the correction step or the reconditioning step fail. On the other
hand, if these steps both succeed several times in a row, we try doubling h.
A variant of the procedure is to save some computation by applying recondi-
tioning only occasionally to verify that the cluster is convergent. One criterion for
deciding when to recondition is to monitor the condition number of the Jacobian
matrix dH/dw along the paths. Even more computation might be saved by tracking
only one path in the cluster along s holding t constant, and when the condition of
the Jacobian matrix indicates reconditioning is necessary, to regenerate the other
points in the cluster by looping t around the origin. This risks path crossing, be-
cause it is not clear how to set the reconditioning criterion to ensure that t has
remained within the convergence radius as s progresses. There is very little expe-
rience at this point to judge whether such variants can be made both reliable and
efficient. By reconditioning at every step, we have greater assurance that the local
model remains valid for the whole extent of s ∈ [0, 1].
The techniques we have discussed show that in principle singular path track-
ing is feasible, although in practice a fully satisfactory approach is still a matter
of research. The approach was first presented in (Sommese et al., 2002a), which
also reports on some initial experiments with the technique of using the condition
number to decide when to recondition.
It may seem that we could completely avoid singular path tracking by using
deflation to convert problems into nonsingular path tracking problems. This is
true in the context of witness points generated by the cascade algorithm, because
such points are isolated solutions cut out by the slicing procedure. However, in
Chapter 16, we will see how to find witness points for a set defined as the intersection
of two given algebraic sets, say A and B. If A and B are both components of the
same system of equations f(x) = 0, then although a slice of appropriate dimension
cuts out a unique point on the intersection set, such a point is not an isolated
solution of the system obtained by appending the linear slicing equations to f(x) =
Consequently, witness points for A ∩ B are defined only as singular endpoints of
solution paths in a new kind of homotopy, called the diagonal homotopy, and such

points can be moved along A ∩ B only by singular path tracking. Of course, it could
be that a more elaborate form of deflation could desingularize these points as well;
such a procedure could be subject matter for a new line of inquiry.
To raise the bar even higher, consider intersecting two algebraic sets whose
witness points are only known as singular endpoints of a diagonal homotopy. Then,
we could have a very difficult singular path tracking problem in which each point
in the convergent cluster is itself only known as the convergence point of a prior
homotopy. We have yet to face such a nasty calculation, but it is quite within the
scope of numerical algebraic geometry to consider it.

15.7 Exercises

Exercise 15.1 (Degree of p(x)) Conclude that deg p(x) = d for the polynomial
in Equation 15.5.12 by showing that
(1) the highest degree with which xN occurs is d; and
(2) by genericity of xN, we know that there is at least one fiber of π on which xN
is nowhere zero, and therefore that tr_{π,d}(xN)(x1, ..., x_{N−1}) is not identically
zero.
Exercise 15.2 (Spherical Parallelogram Mechanism) Pick two unit vectors
a1, a2 ∈ R³ and a random value of a ∈ R. Let b1, b2, b3 ∈ R³. Consider the system
of polynomial equations

a1ᵀb1 = a,  a2ᵀb2 = a,  b1ᵀb2 = a1ᵀa2,  b1ᵀb1 = 1,  b2ᵀb2 = 1,  b3 = (b1 + b2)/2.

These eight equations describe a curve in (b1, b2, b3) ∈ C⁹. Find a numerical irre-
ducible decomposition of that curve. Report the number of irreducible components
and their degrees.
Exercise 15.3 (Griffis-Duffy Decomposition) Revisit Exercise 14.4 and find
the irreducible decomposition. Do it again for the special case when bi = ai and
ci = 1, i = 1, ..., 6. Report the number of irreducible components and their degrees.
Exercise 15.4 (Seven-Bar Problem) Use exhaustive trace testing to show that
the one-dimensional component of the seven-bar system presented in Exercise 13.3
is irreducible.
Chapter 16

The Intersection Of Algebraic Sets

God keep me from ever completing anything. This whole


book is but a draught—nay, but the draught of a draught.
Oh, Time, Strength, Cash, and Patience!
—Herman Melville

Up to this point, we have concentrated on describing the numerical solution of a


given system of polynomial equations. That is, given a polynomial system / , we
have numerically described V(f). In Part II, we sought just the isolated points in
V(f), while in Part III, we have sought the numerical irreducible decomposition of
V(f). In this final chapter, we discuss operations on irreducible components. In
particular, we present algorithms from (Sommese et al., 2004b, 2004c) to compute
the numerical irreducible decomposition of A n B, where A and B are irreducible
components of V(f) and V(g), respectively. The capability to work with individual
pieces of the solution sets and to intersect pieces from different sets of equations gives
a new level of refinement, allowing resources to be concentrated on just the objects
of interest, especially when the solution sets of the systems on hand include extra
components that are not of interest, as happens frequently. When one wants the
intersection of reducible algebraic sets, it is just a matter of bookkeeping to intersect
all of their irreducible pieces. For reasons which will become apparent later, we call
the workhorse of the new approach the diagonal intersection algorithm.
The diagonal intersection technique even allows one to examine certain algebraic
sets that are proper subsets of the irreducible components of the equations on hand.
A case in point is where A and B are both irreducible components of V(f). In this
case A ∩ B is certainly an affine algebraic set, but we do not have on hand a set
of polynomials for which it is an irreducible component. One could derive such
a polynomial system with appropriate symbolic operations on / , but we can find
witness points for the set with only numerical operations on / .
Somewhat surprisingly, the ability to work with individual components gives
us new leverage in finding just the isolated solution points. We find that much of
the special structure of a system, which we worked so hard to exploit in Part II,
can be captured starting with total-degree homotopies to decompose individual
equations and then applying the diagonal intersection algorithm to find intersections
equation-by-equation. While the approach is at present too new to have much
practical experience, early experiments (Sommese, Verschelde, & Wampler, 2004e)


show promise. It is hoped that the approach might solve some problems that were
previously too large to solve in one blow by the traditional approaches of Part II.

16.1 Intersection of Irreducible Algebraic Sets

A good idea of the way the diagonal intersection algorithm proceeds can be gleaned
by studying a special case. Assume that we have two polynomials f, g on C². Let
A be an irreducible component of V(f) and let B be an irreducible component of
V(g). We would like to find A ∩ B. Assume that A has degree d1 and B has
degree d2, and let α1, ..., α_{d1} and β1, ..., β_{d2} be witness point sets of A and B,
respectively. That is, for generic linear equations LA(x) = 0 and LB(x) = 0, we
assume that we have already computed the intersections V(LA) ∩ A = {α1, ..., α_{d1}}
and V(LB) ∩ B = {β1, ..., β_{d2}}.
Note that A ∩ B can be interpreted as solutions to a system on C⁴ by a procedure
from algebraic geometry called reduction to the diagonal (Ex. 13.15 Eisenbud, 1995).
The procedure is to form the system

    F(x1, x2, y1, y2) = [ f(x1, x2) ]
                        [ g(y1, y2) ]  = 0.
                        [ x1 − y1   ]
                        [ x2 − y2   ]
The solutions of the system consist of points (x1*, x2*, x1*, x2*) ∈ C⁴ with (x1*, x2*)
a point of V(f, g). This identification respects components and all multiplicity
structure. In particular, all the irreducible components of A ∩ B have corresponding
irreducible components in V(F).
Ignoring for a moment the two diagonal linears, let's consider the set
V(f(x1, x2), g(y1, y2)).
Clearly, A × B is an irreducible component of this set. To see this, remember that
an algebraic set being irreducible means by definition that its set of smooth points
is connected. To see that the smooth points of A × B, (A × B)_reg, form a connected set, note
that (A × B)_reg = A_reg × B_reg, and that the product of connected sets is connected.
Moreover, we know a set of witness points of A × B, i.e., the set of points
{(αi, βj), i = 1, ..., d1, j = 1, ..., d2} is the intersection of A × B with the linear
space V(LA(x1, x2), LB(y1, y2)).
Consider the homotopy

    H(x, y, t) = [ f(x1, x2)                    ]
                 [ g(y1, y2)                    ]  = 0,
                 [ (1 − t)(x1 − y1) + γ t LA(x) ]
                 [ (1 − t)(x2 − y2) + γ t LB(y) ]

with γ a general point of S¹, the complex numbers of absolute value one. In this
special case, the diagonal intersection theory of (Sommese et al., 2004b) implies
Intersection of Algebraic Sets 291

that the endpoints as t → 0 of the solution paths of H(x, y, t) starting at the points
(αi, βj) at t = 1 include (using the identification given by reduction to the diagonal)
all the isolated points of A ∩ B.
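The special case above can be sketched numerically. Everything in the sketch is our own toy data, not the book's software: A is the parabola V(x2 − x1²), B is the line V(y2 + y1 − 2), LA and LB are hand-picked "generic" slicing lines, and a crude tracker (stepping t from 1 to 0 with Newton correction at each step, with no adaptive step control or endgame) stands in for a production path tracker.

```python
import numpy as np

# Toy diagonal homotopy on C^2 (hypothetical data): A = V(f) is a parabola,
# B = V(g) a line; A ∩ B consists of the two points (1, 1) and (-2, 4).
f = lambda x1, x2: x2 - x1**2
g = lambda y1, y2: y2 + y1 - 2
a, b, cc = 0.3 + 0.1j, 1.1 - 0.2j, 0.7 + 0.4j      # slicing-line constants
LA = lambda x1, x2: x2 - a * x1 - b                # generic slice of A
LB = lambda y1, y2: y1 - cc                        # generic slice of B
gamma = np.exp(1.3j)                               # random complex unit

def H(v, t):
    x1, x2, y1, y2 = v
    return np.array([f(x1, x2), g(y1, y2),
                     (1 - t) * (x1 - y1) + gamma * t * LA(x1, x2),
                     (1 - t) * (x2 - y2) + gamma * t * LB(y1, y2)])

def jac(v, t, eps=1e-7):
    """Forward-difference Jacobian (adequate for holomorphic H)."""
    J = np.empty((4, 4), dtype=complex)
    for j in range(4):
        dv = np.zeros(4, dtype=complex); dv[j] = eps
        J[:, j] = (H(v + dv, t) - H(v, t)) / eps
    return J

def track(v, steps=400):
    """Crude tracker: step t from 1 to 0, Newton-correcting at each step."""
    for t in np.linspace(1, 0, steps + 1)[1:]:
        for _ in range(8):
            v = v - np.linalg.solve(jac(v, t), H(v, t))
    return v

# Witness points: alphas on the parabola cut by LA, beta on the line cut by LB.
alphas = [np.array([x1, x1**2]) for x1 in np.roots([1, -a, -b])]
beta = np.array([cc, 2 - cc])
for alpha in alphas:
    end = track(np.concatenate([alpha, beta]))
    print(np.round(end[:2], 6))   # the two paths end at (1, 1) and (-2, 4)
```

At t = 1 the start points (α, β) satisfy the system exactly because α lies on A ∩ V(LA) and β on B ∩ V(LB); at t = 0 the last two equations force x = y, so the endpoints are the isolated points of A ∩ B, as the theory predicts.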
The general case is conceptually not much harder, although the procedural details
get a bit technical. We sketch only the main idea here. We use notation similar
to that above, but now work in higher dimensions. That is, let A ⊂ V(f) ⊂ C^N and
B ⊂ V(g) ⊂ C^N be irreducible algebraic sets, with f and g as polynomial systems.
Let dim A = a and dim B = b. The main idea is that, letting x ∈ C^N be the vari-
ables for A and y ∈ C^N those for B, we wish to find the irreducible decomposition
of the diagonal polynomial system, namely x − y, restricted to A × B. The cascade
homotopies of Chapter 14 carry over with A × B in place of Euclidean space. In
short, we have an embedding like Equation 14.1.4 that includes all of the systems
for slicing witness sets at every dimension. As in the cascade method on Euclidean
space, we need to square up systems as necessary. Omitting detailed argumenta-
tion, this just amounts to choosing random, complex matrices Mf, Mg, Mxy, S, U, v
with dimensions as follows:

    Matrix  | Mf    | Mg    | Mxy   | S     | U  | v
    rows    | N − a | N − b | a + b | a + b | N  | N
    columns | #(f)  | #(g)  | N     | N     | 2N | 1
The result is a system of 2N polynomials:

    E(x, y, t) = [ Mf · f(x)                                   ]
                 [ Mg · g(y)                                   ]   (16.1.1)
                 [ Mxy · (x − y) + S · T(t) · (U · [x; y] + v) ]

where T(t) is an N × N diagonal matrix with entries t1, ..., tN. Just as in the regular
cascade method, we choose t1, ..., tN randomly, and a witness set for dimension i
is found by solving the equations E(x, y, t^(i)) = 0, where t^(i) = (t1, ..., ti, 0, ..., 0).
To get started, note that we have at the outset the solutions (αi, βj) ∈ C^N × C^N,
i = 1, ..., deg A, j = 1, ..., deg B, of the system F(x, y) = 0, where

    F(x, y) = [ Mf · f(x) ]
              [ Mg · g(y) ]   (16.1.2)
              [ LA(x)     ]
              [ LB(y)     ]
Now, the dimension of the top-dimensional component of A ∩ B is at most k1 := min(a, b) and the
lowest is at least k0 := max(0, a + b − N). We solve for dimension k1 by tracking
the solution paths of

    s F(x, y) + (1 − s) E(x, y, t^(k1)) = 0,   (16.1.3)

from each of the start points (αi, βj) at s = 1 to get at s = 0 three kinds of points:
witness points on the diagonal x − y = 0, points at infinity, and "nonsolutions." The

nonsolutions at dimension i are the start points for the homotopy to dimension i − 1,

    E(x, y, (t1, ..., t_{i−1}, s ti, 0, ..., 0)) = 0,   (16.1.4)

whose solution paths we follow from s = 1 to 0. This is a brief, but procedurally
complete, description of the diagonal intersection method.
As outlined above, each homotopy is a system of 2N equations in 2N unknowns.
In (Sommese, Verschelde, & Wampler, 2004c), it is shown how to consistently reduce
the size of the homotopy by using intrinsic formulations of the linear equations.
Finally, it is important to note that the output of the diagonal homotopy method
is a witness superset. We still need to remove junk points and, if desired, break
the witness sets into irreducible witness sets. The algorithms of Chapter 15 are
directly applicable.

16.2 Equation-by-Equation Solution of Polynomial Systems

With the diagonal intersection algorithm in hand, we have much more flexibility in
how we solve systems of polynomials. For example, we can subdivide a system into
two sets of polynomials, compute the irreducible decomposition of each, and then use
the diagonal method to intersect each irreducible component of the first subsystem
with each one of the second. With a little bookkeeping, for eliminating duplications
and so on, we get a numerical irreducible decomposition for the whole system.
Taking this approach to the extreme, we may first find witness sets for each
polynomial individually, and then intersect these one-by-one. We call this solving
the system equation-by-equation (Sommese et al., 2004e). The approach is most
easily described in terms of a flowchart, shown in Figure 16.1. The post-processing of
points coming out of the diagonal homotopy discards duplicates and checks whether
singular points are junk. In the junk removal box, we have used the shorthand V(W)
to mean the algebraic set witnessed by W. We also allow an affine algebraic set
Q to be pre-specified for discarding points on known degenerate sets or sets not
of interest. For example, should we wish to work on (C*)^N, Q is the union of the
coordinate planes, xi = 0, any i.
The flowchart also includes two tests that eliminate some witness points of the
subsystems before they get to the diagonal homotopy routine. The one on the
left, "/fc+i = 0?," recognizes that if a witness point satisfies the new equation,
then the set it represents does too, and it passes to the output without change
of dimension. The points eliminated by the similar test on the right, ufi(x) =
0 any i < fc?," discards points on components we have already found. Such tests are
cheap compared to running the diagonal homotopy, so it is useful to employ them.
The pruning of points in the flowchart can be made more stringent if all we wish
to find are the nonsingular isolated points of the system. Supposing that the original
system is square, f : C^N → C^N, we can keep in the output for V(f1, ..., fi) just the
nonsingular witness points for dimension N − i. There are not enough polynomials

remaining to cut any higher-dimensional components down to isolated points.


To understand why the equation-by-equation approach might be valuable, con-
sider that systems of 50 or more low degree polynomial equations occur naturally
in the study of polynomial systems. It can happen that such a system has only a
few thousand isolated solutions, and we might wish to find them.
Straightforward use of traditional homotopy continuation, such as we described
in Part II, may have little chance of succeeding. For example, assume that we had
a system of 60 polynomials of order two. A total degree homotopy continuation
would have 260 « 1018 paths. Assuming we had a thousand node computer, each
node of which could compute 20 paths a second, it would take a few million years.
Of course, if the system has many fewer than 260 solutions, we should not be using
a total degree homotopy, but instead use a start system to take some advantage
of the special structure. However, the computation of a special start system also
suffers from a curse of dimensionality, so we may not find a good one in a reasonable
amount of time.
Consider the following simple case: an eigenvalue problem. We have A, a given
60 × 60 matrix of constants. We have the polynomial system

    Ax − λx = 0.

Regarding it as a system on P^59 × C and homogenizing, we get the system

    μAx − λx = 0

on P^59 × P^1. Embedding P^59 × P^1 into P^119 using the Segre embedding described


in § A.10.2, we see that the total degree of the solution components of the first
k equations is k. This means the number of solution paths working equation-by-
equation never gets large. A reasonable observation is that using the bihomogeneous
structure we just wrote down, the usual homotopy continuation will work well. The
point is that in this case we could see a special structure. If we hadn't, the total
degree homotopy would be useless, but the equation-by-equation approach would
automatically utilize the special structure.

16.2.1 An Example
Consider once more the system given in Equation 12.0.1, which we treated with
WitnessSuper in Example 13.6.4 and with Cascade in Example 14.2.1. The
equations are

    [ f1(x, y) ]   [ x(y² − x³)(x − 1)         ]
    [ f2(x, y) ] = [ x(y² − x³)(y − 2)(3x + y) ] = 0.
It is easy to confirm by hand that the equation-by-equation algorithm flows as
follows. The numbers next to the flow lines indicate how many points flow that
direction. Counting the computation of witness points for the individual equations,

there are a total of 5 + 6 + 2 = 13 homotopy paths in the procedure for this problem.
This compares to 36 paths for WitnessSuper and 39 paths for Cascade.

    Witness V(f1)        Witness V(f2)
      #W_1^1 = 5           #X_2 = 6
            5 ↘           ↙ 6
           Witness V(f1, f2)

16.3 Exercises

Exercise 16.1 (Flowchart) Draw a diagram showing how witness points flow
when the equation-by-equation method is applied to the system of Example 13.6.5.
(Hint: some points coming from the diagonal homotopy go to infinity.)

Exercise 16.2 (Eigenvalues) Chart the flow of witness points for an equation-
by-equation treatment of the eigenvalue problem described in § 16.2. Assume the
size of the matrix is n × n. The output of the diagonal homotopy at each stage
consists of only nonsingular points and points at infinity. How many paths are
tracked in total?

Exercise 16.3 (Diagonal Intersection of Reducible Components) The de-
scription of the diagonal approach is for intersecting irreducible sets. Despite this,
the equation-by-equation flowchart does not require the witness sets to be decom-
posed into irreducibles. Explain why this is valid.

[Figure 16.1: flowchart of stage k of the equation-by-equation method. The witness
sets W_1^k, ..., W_k^k for V(f1, ..., fk) and the witness set X_{k+1} for V(f_{k+1})
feed two pre-tests: points w with f_{k+1}(w) = 0 pass directly to the output, and
points x with fi(x) = 0 for some i ≤ k are discarded. The surviving pairs enter the
diagonal homotopy; an output point y is discarded if y is at infinity, y ∈ Q, y
duplicates a point already found, or y is singular and lies on V(W_i^{k+1}) for a
previously found component. The remaining points form the witness sets
W_1^{k+1}, ..., W_{k+1}^{k+1} for V(f1, ..., f_{k+1}).]

Fig. 16.1 Stage k of equation-by-equation generation of witness sets for V(f1, ..., fn) ⊂ C^N \ Q.
The witness sets are subscripted by codimension and superscripted by stage. Q is some pre-
specified algebraic set on which we wish to ignore solutions.
Appendices
Appendix A

Algebraic Geometry

A basic goal underlying algebraic geometry is to translate between algebra and


geometry, and take advantage of people's strong visual intuition and the tools de-
veloped in mathematics to support this intuition. Over the complex numbers, the
relationship between algebra and geometry is remarkably strong, and sadly over the
real numbers this relationship is very weak.
In this appendix we present useful results about these concepts, but we have left
many facts to be introduced as needed throughout the book. What we have tried to
do is give adequate definitions and examples so that the reader can understand the
techniques in the book. Towards this goal we add to the basic concepts introduced
earlier in this book.
There are a plethora of introductory books on algebraic geometry. Unfortu-
nately, many of these, based on a computational algebra approach, are not centered
on the basic geometric facts we need, e.g., the equivalence of an algebraic set be-
ing irreducible with the connectedness of its smooth points. (Kendig, 1977) is a
good geometric introduction. Though restricted to plane curves, (Fischer, 2001) is
a gentle introduction that covers a surprising amount of important material. (Fis-
cher, 1976) is a wonderful book for getting a detailed understanding with precise
statements of the analytic geometry that is useful in the study of polynomial sys-
tems. No one book will cover everything, but for further study we suggest the fine
books (Griffiths & Harris, 1994; Harris, 1995; Mumford, 1995), which discuss many
geometrical issues that arise. (Eisenbud, 1995) is a useful book covering the alge-
bra underlying the symbolic methods with attention to the background geometry.
(Decker & Schreyer, 2001) is a good survey of computational algebraic geometry.
(Cox et al., 1997, 1998; Decker & Schreyer, 2005; Greuel & Pfister, 2002; Schenck,
2003) are good introductions to computational algebra and computational alge-
braic geometry.
Except when explicitly stated, algebraic sets are reduced, i.e., we ignore multi-
plicity information.
Systems of polynomials on C^N are not sufficient. For example, if we have a
system of polynomials on C^N, it might well happen that there is some algebraic
subset B of C^N, known in advance of solving the system, such that solutions in B

299

may be ignored. Working directly with a system of polynomials on CN \ B leads


to conceptual clarity. A more serious situation occurred in Chapter 16 where the
natural space is not a Zariski open set of C^N, but rather a pure-dimensional affine
algebraic set. The compromise we make here is to deal with algebraic functions on
pure-dimensional quasiprojective algebraic sets. Systems of homogeneous polyno-
mials on projective algebraic sets may be reduced to this situation by the moves
discussed throughout the book, i.e., for projective space, we pass to the Euclidean
space of one dimension higher with the addition of a random linear equation or
equivalently passing to a "general" Euclidean patch inside the projective space. We
make several remarks in the rest of this appendix about more general situations,
e.g., working with line bundles and vector bundles.
A significant part of this appendix is devoted to Bertini Theorems, which are
crucial for applications of numerical analysis to polynomial systems. Many of these
results assert that certain sets are smooth with appropriate dimensions or they are
empty. These statements, which do not assert any existence, are usually simple to
prove, and reduce to Theorem A.4.10 combined with some form of the constructions
of § A.7 or § A.8.1. There are also statements asserting certain sets are nonempty
or irreducible. These results are more difficult and rapidly lead beyond the scope of
the book. For this reason, we have multiple statements of Bertini's Theorem with
different levels of generality.

A.1 Holomorphic Functions and Complex Analytic Spaces

The complex neighborhoods introduced in § 12.1.1 are convenient because they may
be chosen small enough to discard global information. Loosely speaking, they let us
put local properties of a space "under the microscope." When using complex neigh-
borhoods it is often useful to choose local coordinates which are not polynomials.
Here is a typical example.
Example A.1.1 Consider the affine algebraic set Z := V(w^2 - z). We have a
map π : Z → C given by π(z, w) = z. There are two points in the fiber π^{-1}(1)
over 1, i.e., (1, 1) and (1, -1). As we will see, Z is a manifold, and a natural
parameterization of Z at (1, 1) ∈ Z is given by (z, √z) where we choose the branch
of √z with √1 = 1, and stay in a neighborhood of 1, e.g., {z ∈ C | z ∉ (-∞, 0]},
where the branch gives a well-defined function.
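The branch choice in Example A.1.1 can be checked numerically. The following sketch is not from the book (the helper name `branch_point` is ours); it uses Python's `cmath.sqrt`, which is the principal branch satisfying √1 = 1 and holomorphic off the cut (-∞, 0]:

```python
import cmath

# The affine algebraic set Z = V(w^2 - z) with projection pi(z, w) = z.
# Near the point (1, 1), Z is parameterized by z -> (z, sqrt(z)) using the
# principal branch of the square root.

def branch_point(z):
    """Return the point of Z = V(w^2 - z) on the branch through (1, 1)."""
    return (z, cmath.sqrt(z))  # principal branch: sqrt(1) = 1

# Sample z values in a small neighborhood of 1.
for z in [1.0, 1.1 + 0.2j, 0.9 - 0.1j]:
    z0, w0 = branch_point(z)
    assert abs(w0**2 - z0) < 1e-12   # the point lies on Z
assert branch_point(1.0)[1] == 1.0    # the branch passes through (1, 1)
```

The other point of the fiber, (1, -1), lies on the branch z → (z, -√z), which is the other local parameterization over the same neighborhood of 1.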
For doing algebraic geometry over the complex numbers, it has been standard
for over a century to use holomorphic functions such as the function √z in Exam-
ple A.1.1 and holomorphic functions such as e^z.
When talking about holomorphic functions, we use the complex topology unless
we explicitly say otherwise, e.g., that a set is a Zariski open set.
A function f defined on an open set V ⊂ C^N is said to be a holomorphic function
on V if given any x = (x_1, ..., x_N) ∈ V, there exists a neighborhood U ⊂ V of x
on which there is an absolutely convergent power series expansion

    f(z_1, ..., z_N) = Σ_{i=0}^∞ Σ_{|J|=i} a_J (z - x)^J,

where all of the a_J ∈ C. Here we use multidegree notation:

(1) J denotes an N-tuple of nonnegative integers (j_1, ..., j_N);
(2) |J| := j_1 + ··· + j_N; and
(3) (z - x)^J := (z_1 - x_1)^{j_1} ··· (z_N - x_N)^{j_N}.

Just as in one complex variable there are many equivalent ways of defining holo-
morphic functions, e.g., in terms of the Cauchy-Riemann equations. We refer the
reader to (Fritzsche & Grauert, 2002; Gunning, 1990; Gunning & Rossi, 1965) for
more on holomorphic functions. We need only a few facts about them. The first
is the obvious fact that polynomials are holomorphic. Locally, polynomials and
holomorphic functions look and behave the same, but when looked at globally,
holomorphic functions can be much more wild than polynomials, e.g., e^z - 1 has
infinitely many complex zeros. On the other hand, there are many results that assert
that a holomorphic function with growth as moderate as an "algebraic function" is
an "algebraic function." For example, any holomorphic function f on C^N with the
property that there is a constant C > 0 and an integer K > 0 such that

    |f(z)| ≤ C(1 + √(|z_1|^2 + ··· + |z_N|^2))^K

is a polynomial of degree ≤ K. This follows immediately from the Cauchy Inequal-
ities (page 21 Fritzsche & Grauert, 2002).
In analogy to affine algebraic sets, we define a complex analytic set X ⊂ U on an
open subset U ⊂ C^N as the set of common zeros of a finite number of holomorphic
functions f_1, ..., f_k on U.
Given a complex analytic set X := V(f_1, ..., f_k) ⊂ U on an open subset U ⊂ C^N
and a complex analytic set Y := V(g_1, ..., g_l) ⊂ V on an open subset V ⊂ C^M,
a holomorphic mapping φ : X → Y is a function from X to Y such that φ is the
restriction to X of a holomorphic map A : U' → V' from an open subset U' ⊂ U
containing X to an open subset V' ⊂ V containing Y, i.e., the restriction of a
mapping of the form

    (z_1, ..., z_N) → (A_1(z_1, ..., z_N), ..., A_M(z_1, ..., z_N))

with A_1(z_1, ..., z_N), ..., A_M(z_1, ..., z_N) holomorphic functions. A holomorphic
mapping φ : X → Y is called a biholomorphic mapping if there exists a holo-
morphic mapping ψ : Y → X such that φ ∘ ψ is the identity mapping on Y and
ψ ∘ φ is the identity mapping on X. In this situation we say that X is biholomorphic
to Y. Recall, e.g., (Milnor, 1965), that an n-dimensional differentiable manifold X
is a metric space that locally looks like Euclidean space. The definition of differ-
entiable manifold requires some technicalities because manifolds can have different
degrees of smoothness. You need a set {U_α | α ∈ I} of open sets which covers the
manifold, i.e., X = ∪_{α∈I} U_α, and for each U_α, a map φ_α : U_α → R^n that gives a
homeomorphism of U_α to an open set of R^n. Moreover:

(1) given any compact set K ⊂ X, only finitely many U_α meet K;
(2) whenever U_α ∩ U_β ≠ ∅, φ_β ∘ φ_α^{-1} : φ_α(U_α ∩ U_β) → φ_β(U_α ∩ U_β) is C^∞,
i.e., has infinitely many continuous derivatives;
(3) there is a countable basis of open sets, i.e., there is a countable set B of open
sets such that every open set on X is a union of open sets from B.

If we replace R^n by C^n and C^∞ by holomorphic, we have the definition of n-
dimensional complex manifold.
Before we go any further, we point out that all manifolds connected to algebraic
geometry are quite nice. For algebraic sets, we rarely need worse than complex
manifolds, which are much "nicer" than even C^∞-manifolds, i.e., infinitely smooth
manifolds. We never stray below infinitely differentiable manifolds.
The complex analytic sets we have defined so far are analogous to affine al-
gebraic sets. There exists a very natural more global notion of complex analytic
spaces defined analogously to complex manifolds using the complex analytic sets as
local models, e.g., (Fischer, 1976; Gunning & Rossi, 1965). Complex analytic sets,
quasiprojective algebraic sets, and complex manifolds are complex analytic spaces.
In what follows we will state some results for complex analytic spaces.

A.2 Some Further Results on Holomorphic Functions

Holomorphic functions satisfy very strong constraints that are often considerably
stronger when the domain of the functions is at least two-dimensional. For example,
there are several convenient extension theorems.

Theorem A.2.1 (Hartogs' Theorem) Let U ⊂ C^N be an open set with N ≥ 2
and let Y = V(g_1, ..., g_l) be a complex analytic subset of C^M. If K ⊂ U is a
compact set with U \ K connected, then any holomorphic mapping A : U \ K → Y
has a unique extension to a holomorphic mapping U → Y.

Proof. The map A is given by functions A_1, ..., A_M, and has a unique extension
to U, since the A_i extend uniquely to U by the single function version of Hartogs'
Theorem, e.g., (page 307 Fritzsche & Grauert, 2002). Since the holomorphic func-
tions g_i(A_1(z), ..., A_M(z)) are identically zero on U \ K, the extensions to U are
identically zero. Thus A(U) ⊂ Y. □
Remark A.2.2 Theorem A.2.1 is not true with Y merely a complex analytic
subset of an open set U ⊂ C^M. For example, if G is the open unit ball in C^2, K is
the closed ball in C^2 of radius 1/2, and Y = U := G \ K, the result is false. It is true
whenever U is a holomorphically convex open set of C^M, see (page 75 Fritzsche &
Grauert, 2002). Such sets include C^N and open balls.
Here is a typical use of Hartogs' Theorem.
Example A.2.3 Let X := C^N \ {0} with N ≥ 2. Then X is not isomorphic to
an affine algebraic set. To see this, assume otherwise that it is isomorphic via
F : X → X' to an affine algebraic set X' ⊂ C^M for some positive integer M.
Then, since X' is closed, any sequence x_n ∈ X converging to 0 ∈ C^N cannot have
its images F(x_n) converge in C^M. But such a sequence does converge, since by
Hartogs' Theorem A.2.1, the mapping F has a holomorphic and hence continuous
extension to C^N.
The following simple result puts Example A.2.3 in perspective.
Lemma A.2.4 Let g be an algebraic function on an affine algebraic set X ⊂ C^N.
Then X \ V(g) is isomorphic to an affine algebraic set.

Proof. By definition we have a polynomial p ∈ C[z_1, ..., z_N] such that p|_X = g, and
hence V(g) = V(p|_X) = V(p, f_1, ..., f_k) where X := V(f_1, ..., f_k). We let z denote
the N-tuple (z_1, ..., z_N). We define the map F : X \ V(g) → V(f_1, ..., f_k, wp - 1) ⊂
C^{N+1} by F(z) = (z, 1/p(z)). Define G : V(f_1, ..., f_k, wp - 1) → X \ V(g) by
G(z, w) = z. Note that G ∘ F is the identity on X \ V(g) and F ∘ G is the identity
on V(f_1, ..., f_k, wp - 1). □
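In the simplest instance of Lemma A.2.4, X = C and g = p = z, the construction identifies C* = C \ V(z) with the hyperbola V(zw - 1) ⊂ C^2. Here is a numeric sketch (our illustration, not from the book; the names F and G follow the proof above):

```python
# Numeric sketch of Lemma A.2.4 for X = C, p(z) = z, so that
# X \ V(p) = C* is identified with V(wp - 1) = V(zw - 1) in C^2.

def F(z):               # F : C* -> V(zw - 1),  z -> (z, 1/p(z))
    return (z, 1.0 / z)

def G(zw):              # G : V(zw - 1) -> C*,  (z, w) -> z
    return zw[0]

for z in [2.0, -0.5, 1.0 + 3.0j]:
    zw = F(z)
    assert abs(zw[0] * zw[1] - 1.0) < 1e-12   # F(z) lies on V(zw - 1)
    assert G(zw) == z                          # G o F is the identity on C*
```

This is exactly why C* is affine even though it is an open subset of C, in contrast to C^N \ {0} for N ≥ 2 in Example A.2.3.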

Another very useful result is the Riemann Extension Theorem (page 38
Fritzsche & Grauert, 2002).

Theorem A.2.5 (Riemann Bounded Extension Theorem) Let U be a com-
plex manifold. If Y ⊂ U is a complex analytic subset of U with Y ≠ U, then
any bounded holomorphic function on U \ Y has a unique extension to a bounded
holomorphic function on U.

Remark A.2.6 Analogous to the generalization mentioned in Remark A.2.2,
Theorem A.2.5 remains true if the bounded holomorphic function on U \ Y is re-
placed by a holomorphic mapping from U \ Y to an analytic subset X of a bounded
holomorphically convex open set G ⊂ C^M. The condition that U is smooth may
be relaxed to the condition that U is normal, which will be briefly touched on in
§ A.2.2.
For holomorphic functions, there is the maximum principle, e.g., (Theorem I.A.7
Gunning & Rossi, 1965).

Lemma A.2.7 (Maximum Principle) Let f(x) be a holomorphic function on a
connected open set U ⊂ C^N. If |f(x)| has a maximum on U, then f(x) is constant.

For holomorphic functions, partial derivatives with respect to the coordinates
can be shown to be well defined, e.g., by differentiating the power series term by
term. The analogues of many differential calculus results hold with no change of
statement. For example, there is the important Implicit Function Theorem. A set
of holomorphic functions f_1, ..., f_N defined on an open neighborhood U of a point
x ∈ C^N is called a system of coordinates on U centered at x ∈ C^N if

(1) f_i(x) = 0 for all i; and
(2) the mapping (f_1, ..., f_N) : U → C^N is a biholomorphic mapping from U to an
open set V of C^N.

By Theorem A.2.8, the second condition, with U possibly replaced by a smaller
open set, is equivalent to the condition that the Jacobian

    [ ∂f_1/∂z_1  ···  ∂f_1/∂z_N ]
    [     ⋮       ⋱       ⋮     ]
    [ ∂f_N/∂z_1  ···  ∂f_N/∂z_N ]

is invertible at x.

Theorem A.2.8 (Implicit Function Theorem) Let f_1, ..., f_k be holomorphic
functions defined in a neighborhood of a point x ∈ C^N with f_i(x) = 0 for all i.
Assume that the Jacobian

    [ ∂f_1/∂z_1  ···  ∂f_1/∂z_N ]
    [     ⋮       ⋱       ⋮     ]
    [ ∂f_k/∂z_1  ···  ∂f_k/∂z_N ]

has rank k at x. Then on some possibly smaller neighborhood U of x, there exist
holomorphic functions f_{k+1}, ..., f_N such that f_1, ..., f_N form a system of coordi-
nates on U centered at x.

The analogues of the many consequences of the differentiable implicit function the-
orem hold with no change. For example, we have a corollary that we will use below.

Corollary A.2.9 Let Z ⊂ U be a complex analytic subset of an open set U ⊂ C^N,
e.g., let U = C^N and Z ⊂ C^N be an affine algebraic set. Let φ : B → C^N be
a holomorphic map from the open ball B in a complex Euclidean space C^m with
φ(0) = x ∈ Z and φ(B) equal to a neighborhood of x in Z. Assume further
that the complex Jacobian

    dφ = [ ∂φ_1/∂z_1  ···  ∂φ_1/∂z_m ]
         [     ⋮       ⋱       ⋮     ]
         [ ∂φ_N/∂z_1  ···  ∂φ_N/∂z_m ]

has rank m at 0. Then there exist holomorphic coordinates f_1, ..., f_N in an open set
U' ⊂ U of x such that Z ∩ U' = φ(B) ∩ U' = {x ∈ U' | f_{m+1}(x) = 0, ..., f_N(x) = 0}.

Proof. By renaming if necessary we can assume without loss of generality that

    [ ∂φ_1/∂z_1  ···  ∂φ_1/∂z_m ]
    [     ⋮       ⋱       ⋮     ]
    [ ∂φ_m/∂z_1  ···  ∂φ_m/∂z_m ]

has rank m at 0. In addition to the coordinates z_1, ..., z_m on C^m, let z_{m+1}, ..., z_N
be coordinates on C^{N-m}. Define the map f : B × C^{N-m} → C^N by f_i(z_1, ..., z_N) =
φ_i(z_1, ..., z_m) for i from 1 to m, and f_i(z_1, ..., z_N) = -φ_i(z_1, ..., z_m) + z_i for i
from m + 1 to N.
The Jacobian of f at 0 is

    [  ∂φ_1/∂z_1      ···   ∂φ_1/∂z_m      0  ···  0 ]
    [      ⋮                    ⋮           ⋮       ⋮ ]
    [  ∂φ_m/∂z_1      ···   ∂φ_m/∂z_m      0  ···  0 ]
    [ -∂φ_{m+1}/∂z_1  ···  -∂φ_{m+1}/∂z_m  1  ···  0 ]
    [      ⋮                    ⋮           ⋮   ⋱   ⋮ ]
    [ -∂φ_N/∂z_1      ···  -∂φ_N/∂z_m      0  ···  1 ]

By the implicit function theorem with k = N, the f_i form a system of coordinates at
0. Knowing that f is one-to-one in a neighborhood of 0, it follows by construction
that φ(B) ∩ U' = {x ∈ U' | f_{m+1}(x) = 0, ..., f_N(x) = 0}. □

The reader might observe that in Examples 12.1.3 and 12.1.4, there is a one-to-
one and onto mapping C → V(w^2 - z) given by sending w ∈ C to (w^2, w). Note that
the differential of the mapping everywhere has rank one, and the map (z, w) → w
gives an inverse. Given this, it is natural to hope that given a smooth point x of
an affine algebraic set Z, there is a Zariski open dense neighborhood U ⊂ Z of x
which can be identified with a Zariski open dense subset of some Euclidean space.
It is a fact of life that this is false.
Example A.2.10 Let X ⊂ C^2 denote the affine algebraic set defined by p(z, w) =
w^2 - z(z - 1)(z - 2) = 0. Since

    V(p, ∂p/∂z, ∂p/∂w)

is the empty set, it follows from Corollary A.2.9 that V(p) is a manifold. It can be
shown that V(p) as a differentiable manifold is homeomorphic to a torus minus one
point, i.e., homeomorphic to S^1 × S^1 minus a point, where S^1 denotes the circle
S^1 := {z ∈ C | |z| = 1}. Any Zariski open set U ⊂ V(p) is the complement of a
finite set on V(p). Thus, there will be two differentiable embeddings of the circle
S^1 to V(p) that meet transversely in only one point. But, there can be no such
maps of S^1 into C. One of the beauties of Zariski open sets is that they are very
big. The problem here, though, is that Zariski open sets are too big.

A.2.1 Manifold Points and Singular Points


How bad a set can an affine algebraic set be? How far is it from being smooth, i.e.,
from being a manifold? As we will see, the answers are "quite nice" and "not very
far" respectively. In fact, given any affine algebraic set Z, the set of smooth points
of Z will be a Zariski open dense set of Z. Let us introduce definitions and concepts
to make this precise.
Given an affine algebraic set Z ⊂ C^N, we define a point x ∈ Z to be a smooth
point (also called a manifold point or a regular point) if there is a holomorphic map
φ : B → C^N from the open ball B in a complex Euclidean space C^m with φ(0) = x;
φ(B) is equal to a neighborhood on Z of x; and the complex Jacobian

    dφ = [ ∂φ_1/∂z_1  ···  ∂φ_1/∂z_m ]
         [     ⋮       ⋱       ⋮     ]
         [ ∂φ_N/∂z_1  ···  ∂φ_N/∂z_m ]

has rank m at 0, where

    φ(z_1, ..., z_m) = (φ_1(z_1, ..., z_m), ..., φ_N(z_1, ..., z_m)).
For example, at (1, 1) on Example A.1.1, φ can be taken to be z → (z, √z).
Note that by Corollary A.2.9, it follows that given a smooth point x ∈ Z, there
are holomorphic coordinates z_1, ..., z_N defined on a complex open set U ⊂ C^N con-
taining x and such that z_i(x) = 0 for all i, and such that U ∩ Z = V(z_{m+1}, ..., z_N).
This integer m is defined to be the complex dimension of Z at a regular point x ∈ Z.
The complex dimension of Z at x is half the usual dimension of Z considered as
a topological manifold at x. We typically use the word dimension for complex di-
mension and refer to the usual dimension as the real dimension. For example, the
complex dimension of C is one and the real dimension is two. It is traditional to
denote the smooth points of a quasiprojective set Z by Z_reg. The points in Z \ Z_reg
are called singular points. The singular points of Z are denoted Sing(Z). The di-
mension of Z at a smooth point is well defined. A nice argument for this follows
by adapting the very short argument for differentiable manifolds (page 7 Milnor,
1965). We gave a general definition of dimension in § 12.2 based on the irreducible
decomposition.
One difficulty with deciding at which points an algebraic set is smooth is that
the defining equations for the set might have too much information packed in them.
Here is an example where the defining functions will not suffice.
Example A.2.11 Let Z := V(z^2) ⊂ C. In this case, Z = V(z) also, and using
the defining equation z, we see that Z is a manifold. The problem with the defining
equation z^2 is that it also includes multiplicity information about Z.
Remark A.2.12 There is no easy computational solution to the problem posed
by the last example. The set of smooth points of an affine algebraic set V(f) is
Zariski open and dense, but the prescription for the singular set is nontrivial.
Given an affine algebraic set Z ⊂ C^N, Z = V(I(Z)) where I(Z) denotes the
ideal of polynomials in C[z_1, ..., z_N] that vanish on Z. One version of Hilbert's
Nullstellensatz, e.g., (Cox et al., 1997), says that given an ideal J ⊂ C[z_1, ..., z_N],
then I(V(J)) = √J, where √J, the radical of J, consists of all polynomials g such
that g^k ∈ J for some positive integer k. For example, on C, √(z^3) = (z). The
passing from an ideal to its radical throws away all multiplicity information.
The radical intervenes in the algebraic characterization of the set of smooth
points of an affine algebraic set. Let g_1, ..., g_M be a basis of the radical of the
ideal generated by f.
It follows from (Chapter 1A Mumford, 1995) that the singular set, Sing(V(f)), of
V(f) is the set of points of V(f) at which the Jacobian matrix

    [ ∂g_1/∂z_1  ···  ∂g_1/∂z_N ]
    [     ⋮       ⋱       ⋮     ]
    [ ∂g_M/∂z_1  ···  ∂g_M/∂z_N ]

has rank less than N - dim V(f).
It must be noted that for a fixed N, M can be arbitrarily large.
A special case of the above, mainly useful for making illustrative examples, is the
following.
Lemma A.2.13 Let p ∈ C[z_1, ..., z_N]. The singular set of V(p) is contained in

    V(p, ∂p/∂z_1, ..., ∂p/∂z_N).
Here is an example using Lemma A.2.13.


Example A.2.14 Let Z = V(zw) ⊂ C^2. In this case the potential singular
points, V(zw, ∂(zw)/∂z, ∂(zw)/∂w) = V(zw, w, z), the common zeros of zw and its
partial derivatives, equal the origin (0, 0) ∈ C^2, which is clearly a singular point
of Z.
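A quick plain-Python check of the criterion of Lemma A.2.13 on the two hypersurfaces discussed above; this sketch is ours (the partial derivatives are hand-coded, no computer algebra is needed):

```python
# (1) p = z*w on C^2: dp/dz = w, dp/dw = z, so V(p, w, z) = {(0, 0)}.
p = lambda z, w: z * w
dp_dz = lambda z, w: w
dp_dw = lambda z, w: z
assert p(0, 0) == dp_dz(0, 0) == dp_dw(0, 0) == 0   # origin is a candidate

# (2) p = w^2 - z(z-1)(z-2) from Example A.2.10: dp/dw = 2w forces w = 0,
# and then p = 0 forces z in {0, 1, 2}; dp/dz is nonzero at each of these,
# so the candidate singular set is empty and V(p) is a manifold.
q = lambda z: -z * (z - 1) * (z - 2)                          # p at w = 0
dq = lambda z: -((z - 1) * (z - 2) + z * (z - 2) + z * (z - 1))
for z in (0, 1, 2):
    assert q(z) == 0 and dq(z) != 0
```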
Remark A.2.15 The inclusion in Lemma A.2.13 is an equality if the dimension
of the set

    V(∂p/∂z_1, ..., ∂p/∂z_N)

is ≤ N - 2. For explicit examples, such a criterion is useful, but in our numerical
work such criteria have not yet proven useful.
A.2.2 Normal Spaces

There is an extensive literature on singularities. The simplest general class of com-
plex analytic spaces after complex manifolds are normal complex analytic spaces,
e.g., see (Fischer, 1976). A complex analytic space Y is normal if given any y ∈ Y
and any complex neighborhood U of y and any bounded holomorphic function f
on U \ Sing U, it follows that f extends holomorphically to U. A quasiprojective
algebraic set is said to be normal if it is a normal complex analytic set. Given any
complex analytic space (respectively, any quasiprojective algebraic set) X, there
exists a unique normal complex analytic space (respectively, a unique normal qua-
siprojective algebraic set) X' with a finite proper holomorphic (respectively, alge-
braic) map π : X' → X with π : X' \ π^{-1}(Sing X) → X \ Sing(X) an isomorphism.
Normal spaces include the affine algebraic subsets X ⊂ C^N with the properties:

(1) X is a reduced complete intersection, i.e., all irreducible components of X have
the same dimension; there are k := N - dim X polynomials p_1, ..., p_k with
X = V(p_1, ..., p_k); and all components of X occur with multiplicity one; and
(2) Sing(X) is codimension at least two in X.

These special sets are irreducible and naturally occur as parameter spaces.

A.3 Germs of Complex Analytic Sets

There are situations, e.g., in the study of endgames in Chapter 10, when we want to
look carefully at behavior in a neighborhood of a point. In such situations specifying
a fixed neighborhood of the point is inconvenient, and the notion of a germ of a
complex analytic set improves clarity. (Chap. II, Sec. E Gunning & Rossi, 1965) is
an excellent place for becoming comfortable with germs of complex analytic sets.
Since we will not be talking about germs of other types of sets, e.g., germs of affine
algebraic sets, which are defined analogously using the Zariski topology in place of
the complex topology, we will often refer to the germ of a complex analytic set as
a germ of an analytic set or a germ.
Given a point x ∈ C^N we define an equivalence relation on complex analytic
sets containing x: if X ⊂ U and X' ⊂ U' are two complex analytic sets defined on
open neighborhoods of x in C^N, then we say that X and X' have the same germ
at x if there is an open neighborhood V ⊂ U ∩ U' with X ∩ V = X' ∩ V. Thus the
complex analytic sets V(z) and V(zw) define the same germ of a complex analytic
set at (0, 4) but not at (0, 0).
Given a point w = (w_1, ..., w_N) ∈ C^N, we let ||w|| = √(|w_1|^2 + ··· + |w_N|^2)
denote the Euclidean norm and we denote the ball of radius r about a point x by
B_r(x), i.e.,

    B_r(x) := {z ∈ C^N | ||z - x|| < r}.
We say a germ X at a point x ∈ C^N is irreducible if there is a positive number ε'
such that for all positive ε that are less than ε', there is an irreducible representative
of X in B_ε(x). This is equivalent to the usual definition that a germ X at x ∈ C^N
is irreducible if there is no way to write X as a union X_1 ∪ X_2 of germs X_1, X_2 at
x unless either X = X_1 or X = X_2.
The dimension of an irreducible germ X at a point x ∈ C^N is defined as the
dimension of any one of the germ's irreducible representatives.
An important fact about germs is that the irreducible decomposition holds, e.g.,
see (Theorem II.E.13 Gunning & Rossi, 1965).

Theorem A.3.1 Let X be a germ of a complex analytic set at a point x ∈ C^N.
There are a finite number of irreducible germs X_1, ..., X_k at x ∈ C^N such that
X = X_1 ∪ ··· ∪ X_k, and for all j = 1, ..., k we have that X_j ⊄ ∪_{i≠j} X_i.

We emphasize this is equivalent to saying there is a positive number ε' such that
for all positive ε that are less than ε', there are representatives of X and the X_i in
B_ε(x) satisfying the conclusions of the theorem.
Note if we have an irreducible affine algebraic set X ⊂ C^N, it may well happen
that the germ of a complex analytic set defined by X at a point x ∈ X is not
irreducible, e.g., X = V(z_2^2 - z_1^2 - z_1^3) at (0, 0), which is discussed in detail in
Example A.4.18.
We say an algebraic set or a complex analytic space X is irreducible at a point
x ∈ X if the complex analytic germ defined by X at x is irreducible. X is often
said to be locally irreducible at x if X is irreducible at the point x ∈ X. We say
that X is locally irreducible if it is irreducible at all points. In the literature, being
irreducible at a point x is sometimes referred to as being topologically unibranch at
x (page 43 Mumford, 1995).
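The local reducibility of the nodal cubic V(z_2^2 - z_1^2 - z_1^3) can be seen numerically: near the origin the curve is the union of the two analytic branches z_2 = ±z_1·√(1 + z_1), using the principal branch of the square root, which is holomorphic near 1. A small sketch (our own illustration, not from the book; `on_curve` is our helper):

```python
import cmath

# The nodal cubic V(z2^2 - z1^2 - z1^3) is irreducible as an algebraic set,
# but near the origin it splits into the two analytic branches
#   z2 = +z1*sqrt(1 + z1)   and   z2 = -z1*sqrt(1 + z1).

def on_curve(z1, z2, tol=1e-12):
    return abs(z2**2 - z1**2 - z1**3) < tol

for z1 in [0.1, -0.05 + 0.02j, 0.03j]:
    s = cmath.sqrt(1 + z1)       # principal branch, holomorphic near 1
    assert on_curve(z1, z1 * s)  # first local branch through (0, 0)
    assert on_curve(z1, -z1 * s) # second local branch through (0, 0)
```

Both branches pass through the origin, so the germ of the curve at (0, 0) is the union of two irreducible germs, even though the curve itself is irreducible.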
We define the dimension of a germ X at a point x as the maximum of the
dimensions of the irreducible germs occurring in the irreducible decomposition of X.
The following result is important in Chapter 10.
Theorem A.3.2 Let X be a germ of a one-dimensional complex analytic set at a
point x of C^N. If X is irreducible at the point x, then there exists a representative
of X on a neighborhood of x, which by abuse of notation we also denote by X, and
a holomorphic map φ : Δ_1(0) → X from the open unit disk Δ_1(0) ⊂ C to X such
that φ(0) = x and φ gives a biholomorphism of Δ_1(0) \ 0 and φ(Δ_1(0)) \ x.

Proof. This is sometimes referred to as the local uniformization theorem for one-
dimensional analytic sets. It is the one-dimensional complex analytic version of
Theorem A.4.1.
A proof for it can be based on the local parameterization theorem (Gunning,
1970), which, in its simplest form for a pure one-dimensional complex analytic set X,
says that given a point x ∈ X, there exists a finite proper surjection π : U → Δ_1(0)
where U is an open neighborhood of x and where 0 = π(x). Since X is one-
dimensional, the set of singular points of X is 0-dimensional and therefore meets a
neighborhood of x with compact closure in a finite set. We can therefore (by taking
a smaller U if needed) assume that the only possible singular point in U is x. If
x is a smooth point there is nothing to prove. Similarly π : U \ {x} → Δ_1(0) \ 0
is an unramified covering with a well-defined sheet number d. Basic topology tells
us that the restriction f : Δ_1(0) \ 0 → Δ_1(0) \ 0 of the mapping z → z^d factors
through π : U \ {x} → Δ_1(0) \ 0 as f(z) = π(g(z)) with g : Δ_1(0) \ 0 → U \ {x}.
Using the Riemann Extension Theorem A.2.5, we conclude we have an extension φ
of g satisfying the conclusions we are trying to show. □

From Theorem A.3.2 follow results classically referred to as Puiseux's theorem,
e.g., (Chapter 7 Fischer, 2001). The following corollary is a version of this result.

Corollary A.3.3 Let X be a one-dimensional complex analytic subset of an open
set U ⊂ C^N. Assume that X is irreducible at a point x ∈ X. Let φ : Δ_1(0) →
V, for some open neighborhood V ⊂ X of x, be the local uniformization map of
Theorem A.3.2. Given any holomorphic function g(z_1, ..., z_N) on U which is not
constant on X, e.g., a coordinate function z_j, it follows that there exists a positive
r < 1 such that a coordinate function s on Δ_r(0) may be chosen with the property
that the composition g(φ(s)) equals g(x) + s^c.

Proof. Let w denote any coordinate on Δ_1(0), which is 0 at 0. The power series
expansion of g(φ(w)) is given by

    g(x) + Σ_{k=c}^∞ a_k w^k,

where a_c ≠ 0. Since Σ_{k=0}^∞ a_{k+c} w^k is nonzero at 0, we may express it on Δ_r(0)
for some positive r < 1 as h(w)^c, with h(w) holomorphic on Δ_r(0). Choosing r
positive, but possibly smaller, s = wh(w) is the desired coordinate. □
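For a concrete instance of Corollary A.3.3, take the cusp X = V(z_2^2 - z_1^3), which is irreducible at the origin, with the standard uniformization φ(t) = (t^2, t^3). For g = z_1 the series of g(φ(t)) is exactly t^2, so c = 2 and s = t already works. A numeric check (our own illustration, not from the book):

```python
# Puiseux-type parameterization of the cusp X = V(z2^2 - z1^3):
# phi(t) = (t^2, t^3) is a local uniformization with phi(0) = (0, 0).

def phi(t):
    return (t**2, t**3)

for t in [0.3, -0.2 + 0.1j, 0.05j]:
    z1, z2 = phi(t)
    assert abs(z2**2 - z1**3) < 1e-12   # phi(t) lies on X
    assert abs(z1 - t**2) < 1e-15       # g(phi(t)) = g(0,0) + t^2, so c = 2
```

For a more general g the exponent c is the order of vanishing of g(φ(w)) - g(x) at 0, and the proof above shows how to absorb the higher-order terms into the new coordinate s.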

A.4 Useful Results About Algebraic and Complex Analytic Sets

The Hironaka Desingularization Theorem, which holds for both complex algebraic
sets and complex analytic spaces, is highly nontrivial, but extremely useful. Given
a quasiprojective algebraic set X (respectively, a complex analytic space X), a
desingularization f : X̄ → X of X is a quasiprojective manifold X̄ (respectively, a
complex manifold X̄) and a proper surjective algebraic (respectively, holomorphic)
map f : X̄ → X such that f|_{f^{-1}(X_reg)} : f^{-1}(X_reg) → X_reg is an isomorphism
(respectively, a biholomorphism) with f^{-1}(X_reg) Zariski open and dense in X̄. X_reg
is always Zariski open and dense in X.
Theorem A.4.1 (Hironaka Desingularization Theorem) Let X be a quasi-
projective algebraic set or a complex analytic space. Then there is a desingulariza-
tion f : X̄ → X of X.

More refined versions of the result tell us that we may choose the desingulariza-
tion map so that the inverse image of the singular set under the desingularization
map is a union of smooth codimension one algebraic sets which meet transversely.
See (Lipman, 1975) for a nice exposition of this result.
In the case when all components of X are of dimension one, Theorem A.4.1 is
simply the normalization of X, e.g., see (Fischer, 1976).
The Hironaka Desingularization Theorem makes many facts that are easy for
manifolds carry over immediately to general algebraic sets.
Here is one simple example often referred to as the maximum principle, e.g.,
(Theorem III.B.16 Gunning & Rossi, 1965).

Theorem A.4.2 Let X be an irreducible complex analytic space with infinitely
many points. Let f be a holomorphic function on X. If |f| has a maximum on X,
then f is a constant function.

Proof. Let π : X̄ → X be a desingularization. Let f̄ denote the composition of f
with π. If |f| has a maximum, then so does |f̄|. By Lemma A.2.7, it follows that
f̄ is constant, and hence f is constant. □

Theorem A.4.20 will give another illustration of the clarity brought by using
Hironaka's theorem.
The proper mapping theorem of Grauert assures us that many operations with
algebraic sets yield algebraic sets.

Theorem A.4.3 Let f : X → Y be a proper holomorphic mapping (respectively
proper algebraic mapping) of complex analytic spaces (respectively projective, re-
spectively affine, respectively quasiprojective algebraic sets). Then f(X) is a closed
complex analytic (respectively projective, respectively affine, respectively quasipro-
jective algebraic) subset of Y.

Proof. The analytic statement may be found in (Fischer, 1976). This, or the
simple fact that in the complex topology proper maps take closed sets to closed
sets, automatically implies the algebraic statements. To see this note that if X is
projective, affine, or quasiprojective algebraic, we know that f(X) is constructible
by Theorem 12.5.6. Since f(X) is closed, we have the conclusion from Lemma 12.5.4. □

Recall that an algebraic map from a quasiprojective set to an irreducible quasi-
projective set Y is dominant if the image of the map is dense in Y.
Lemma A.4.4 Let f : X → Y be a dominant proper algebraic map from a
quasiprojective algebraic set to a quasiprojective algebraic set. Then f(X) = Y.

Proof. Let y be a point not in the image of X. By dominance we can find a sequence
of points y_j ∈ f(X) with y_j converging to y. Choose x_j ∈ X with f(x_j) = y_j. By
the definition of properness, there is a compact neighborhood U containing y such
that f^{-1}(U) is compact. Thus there is a subsequence of the x_j that converges to
a point x ∈ X. By continuity of f, we have the contradiction that f(x) = y. □

For algebraic sets there is a strong result on upper semicontinuity of dimen-
sion (Corollary 3.16 Mumford, 1995) for the algebraic case or (Theorem, page 137
Fischer, 1976) for the more difficult complex analytic version.
Theorem A.4.5 Let f : X → Y be an algebraic map (respectively a holomorphic
map) between quasiprojective algebraic sets (respectively complex analytic spaces).
Then for each positive integer k,

    {x ∈ X | dim_x f^{-1}(f(x)) ≥ k}

is a quasiprojective algebraic set (respectively a complex analytic space).
Remark A.4.6 Let f : X → Y be an algebraic map between algebraic sets. As
we see from Example 12.5.5, the sets

    {y ∈ Y | dim f^{-1}(y) ≥ k}

do not have to be algebraic sets, though by Theorem A.4.5 and by Theorem 12.5.6,
they are constructible.
Using Theorem A.4.5 and Theorem A.4.3 we have the following result.
Corollary A.4.7 Let f : X → Y be a proper algebraic mapping of quasiprojective
algebraic sets. For each integer k ≥ 0, the set {y ∈ Y | dim f^{-1}(y) ≥ k} is a closed
quasiprojective subset of Y.
Finally we have the very useful Factorization Theorem of Remmert and Stein
(III Corollary 11.5 Hartshorne, 1977). Note that finite-to-one proper maps are called
finite maps by algebraic geometers, e.g., (Hartshorne, 1977).

Theorem A.4.8 (Stein Factorization Theorem) Let f : X → Y be a proper
algebraic mapping of quasiprojective algebraic sets. Then f factors as s ∘ r, where
r : X → Z is a proper algebraic map from X onto a quasiprojective algebraic set Z
with all fibers connected, and s : Z → Y is a finite-to-one proper map.

The following general lemma, which is a special case of (III Proposition 10.6
Hartshorne, 1977), is often useful.
Lemma A.4.9 Let f : X → Y be an algebraic map from a quasiprojective al-
gebraic set X to a quasiprojective algebraic set Y. Let X_r denote the closure of
those points x ∈ X_reg such that f(x) ∈ f(X) and rank df_x ≤ r. Then
dim f(X_r) ≤ r.
The algebraic analogue of Sard's Theorem, e.g., ((3.7) Mumford, 1995), is much
crisper than the usual Sard's theorem for differentiable maps. It is responsible,
through the Bertini theorems of § A.9, for many of the strong probability-one
statements in this book.

Theorem A.4.10 (Sard's Theorem) Let π : X → Y be a dominant algebraic
map between irreducible quasiprojective algebraic sets X and Y. Then there exists a
Zariski open dense subset V ⊂ Y such that, letting U denote the Zariski open dense
set X_reg ∩ π⁻¹(V), π_U : U → V is surjective and of maximal rank, i.e., dπ_x has
rank dim Y at all points x ∈ U. In particular, for all v ∈ V, π⁻¹(v) ∩ X_reg is
smooth of dimension dim X − dim Y.

Proof. In a nutshell the proof goes as follows. By replacing Y by a Zariski open
dense set Y′ of Y with Y′ ⊂ π(X) \ Sing(Y), and X by X_reg ∩ π⁻¹(Y′), we can
assume without loss of generality that X and Y are smooth and π is surjective. Let
X′ denote the closed algebraic subset of X consisting of points at which dπ_x has
rank < dim Y. By Lemma A.4.9, π(X′) is a proper algebraic subset of Y. □

Remark A.4.11 The differentiable form of Theorem A.4.10 is quite weak. For
example, consider the infinitely differentiable map f : R² → R defined by

f(x, y) := exp(1/(x² + y² − 1)) if x² + y² < 1, and f(x, y) := 0 if x² + y² ≥ 1.

The image of this map is [0, e⁻¹]. Over the dense set U := (0, e⁻¹) of the image
[0, e⁻¹], f is of maximal rank, but f⁻¹(U) is far from dense in R².
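A quick numeric check of this example (our own throwaway sketch, not from the book) confirms that every sampled value of f lies in [0, 1/e], that the supremum 1/e is attained at the origin, and that f vanishes outside the unit disk:

```python
import math, random

def f(x, y):
    r2 = x * x + y * y
    # exp(1/(x^2 + y^2 - 1)) inside the open unit disk, 0 outside
    return math.exp(1.0 / (r2 - 1.0)) if r2 < 1.0 else 0.0

# Random samples all land in the image [0, 1/e] ...
samples = [f(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(10000)]
assert all(0.0 <= v <= math.exp(-1) for v in samples)
# ... the supremum 1/e is attained at the origin, and f is 0 off the disk.
assert f(0.0, 0.0) == math.exp(-1)
assert f(1.5, 0.0) == 0.0
```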
Another useful fact is that generically dimensions add.

Corollary A.4.12 (Additivity of Dimensions) Let f : X → Y be a dominant
map between irreducible quasiprojective algebraic sets. There is a dense Zariski open
set U ⊂ Y such that for all y ∈ U, f⁻¹(y) is of pure dimension dim X − dim Y.

Proof. By Theorem A.4.5, there is a dense Zariski open set V ⊂ X such that
dim_x f⁻¹(f(x)) is a constant k for all x ∈ V, and for all x ∈ Z := X \ V,
dim_x f⁻¹(f(x)) > k. Using Theorem A.4.10, we see that k = dim X − dim Y.
We will be done if we show that f(Z) is not Zariski dense. Assume it was.
Then we would have an irreducible component Z′ of Z mapped dominantly to
Y. Using Theorem A.4.10 we conclude that a dense set of points x ∈ Z′ satisfy
dim_x f_{Z′}⁻¹(f_{Z′}(x)) = dim Z′ − dim Y. But this gives the contradiction that
dim X − dim Y = k < dim_x f_{Z′}⁻¹(f_{Z′}(x)) = dim Z′ − dim Y ≤ dim X − dim Y.
□
314 Numerical Solution of Systems of Polynomials Arising in Engineering and Science

The following result (Corollary, page 138 Fischer, 1976) is useful for analyzing
not necessarily proper maps. The algebraic case with the Zariski topology follows
from it by using Theorem 12.5.6.

Theorem A.4.13 Let f : X → Y be a holomorphic mapping of complex analytic
spaces. Assume for a point x ∈ X that there is an open complex neighborhood
O ⊂ X of x such that for all x′ ∈ O, dim_{x′} f⁻¹(f(x′)) is a constant k. Then there
are arbitrarily small open complex neighborhoods U ⊂ O of x and V ⊂ Y of f(x)
such that

(1) f(U) is a complex analytic subset of V;
(2) f_U : U → f(U) is an open map, i.e., all open subsets of U in the complex
topology are mapped to open subsets; and
(3) dim_x U = k + dim_{f(x)} f(U).

By a covering map g : A → B between differentiable manifolds of the same
dimension is meant a differentiable map such that each point y ∈ B has a neighborhood
U such that g⁻¹(U) is a union of disjoint open sets, each mapped isomorphically
onto U by g.
We have the important consequence of Theorem A.4.3 and Theorem A.4.10.

Corollary A.4.14 Let f : X → Y be a surjective proper algebraic map between
quasiprojective algebraic sets. Assume that X and Y are pure dimensional with
dim X = dim Y. Then there is a Zariski dense open set U ⊂ Y which is smooth
and such that f : f⁻¹(U) → U is a covering map.

If Y is irreducible, the map f in Corollary A.4.14 has a well-defined degree.

Corollary A.4.15 Let f : X → Y be a proper map from a pure-dimensional
quasiprojective algebraic set X to an irreducible quasiprojective algebraic set of the
same dimension. Assume that f is surjective on every component of X, e.g., assume
that f is finite-to-one. Then there is a Zariski dense open set U ⊂ Y such that
f : f⁻¹(U) → U is a covering map of degree deg f, which is equal to the number of
points in any fiber of f_{f⁻¹(U)}.

The most important case of Corollary A.4.15 is when f : X → Y is finite-to-one.
We say that an algebraic map from a pure-dimensional quasiprojective algebraic set
X to an irreducible quasiprojective algebraic set of the same dimension is a branched
covering if f is proper and finite. We define the degree of f to be the number deg f
in Corollary A.4.15.
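As a concrete illustration (ours, not from the book, and purely numerical), the map z ↦ z³ is a degree-3 branched covering of C onto C: every fiber over a generic (nonzero) target has exactly three points.

```python
import numpy as np

rng = np.random.default_rng(2)
for _ in range(5):
    # a random, hence generic and nonzero, target point w
    w = complex(rng.standard_normal(), rng.standard_normal())
    fiber = np.roots([1, 0, 0, -w])   # solutions of z^3 - w = 0
    # the fiber of the degree-3 branched covering has 3 distinct points
    assert len(set(np.round(fiber, 8))) == 3
```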
The following Lemma gives an easy local condition for properness.

Lemma A.4.16 (Stein) Let f : X → Y be a holomorphic map between complex
analytic spaces. Assume that y ∈ Y and A is a connected component (not necessarily

irreducible) of f⁻¹(y). If A is compact, then there are open complex neighborhoods
U ⊂ X of A and V ⊂ Y of y such that f_U : U → V is proper.

Proof. See (Lemma 1, page 56, Fischer, 1976). □

Using this and Grauert's Proper Mapping Theorem A.4.3, we have an extremely
important existence result.
Theorem A.4.17 Let f : X → Y be a holomorphic map between complex analytic
spaces. Assume that Y is irreducible and that all irreducible components of X have
dimension at least equal to dim Y. Assume that y ∈ Y and x is an isolated point
of f⁻¹(y). If Y is locally irreducible at y, then there are arbitrarily small complex
neighborhoods U ⊂ X of x and V ⊂ Y of y such that f_U : U → V is a proper
surjective map with finite fibers.

Proof. Choose a neighborhood X′ of x. By Lemma A.4.16, there are open complex
neighborhoods U ⊂ X′ of x and V ⊂ Y of y, such that f_U : U → V is proper. By
Theorem A.4.3, f(U) is a complex analytic subspace of V. Since x is isolated, we
would be done in the algebraic case by Corollary A.4.12 and the irreducibility of Y
at y. In the complex analytic case, we instead use Theorem A.4.13, which implies
dim f(U) = dim Y. From this and the irreducibility of Y at y, we conclude that
f(U) contains a complex open neighborhood V′ of y. The rest of the result follows
by replacing V by V′, and U by U ∩ f⁻¹(V′). □

Example A.4.18 The local irreducibility is needed for the above result. Let
X := C and let Y := V(g) ⊂ C² be defined by g(x, y) = y² − x²(x + 1). Consider
the algebraic map f from X to Y given by f(t) = (−(1 + t²), √−1(t + t³)).
The reader can check that f is surjective and one-to-one everywhere except at
±√−1, which are mapped to (0, 0). A small complex neighborhood U of (0, 0)
on Y is biholomorphic to a small complex neighborhood of 0 on V(xy). A small
neighborhood of t = √−1 does not map onto a neighborhood of (0, 0) in Y, but
only onto a complex neighborhood of (0, 0) on one of the irreducible components of
U at (0, 0).
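The claims in this example are easy to check numerically. The following sketch (ours, not the book's) verifies that f(t) always lands on V(g) and that both t = ±√−1 are sent to the node (0, 0):

```python
def f(t):
    # parametrization of the nodal cubic y^2 = x^2 (x + 1) from Example A.4.18
    i = complex(0, 1)                  # sqrt(-1)
    return (-(1 + t**2), i * (t + t**3))

def g(x, y):
    return y**2 - x**2 * (x + 1)

# f(t) lies on V(g) for any sample parameter t ...
for t in [0.3, 1 + 2j, -0.7j, 2.5 - 1j]:
    assert abs(g(*f(t))) < 1e-9

# ... and both t = i and t = -i are mapped to the node (0, 0).
for t in [1j, -1j]:
    x, y = f(t)
    assert abs(x) < 1e-12 and abs(y) < 1e-12
```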
The following result underpins most constructions of homotopies. It asserts that
under minimal conditions, isolated solutions of a system in a family of systems are
limits of isolated solutions of nearby systems.
Corollary A.4.19 Let X and Y be irreducible quasiprojective algebraic sets. Let
f(x; y) be a system of N := dim X algebraic functions on X × Y. Let π : X × Y → Y
denote the product projection. Assume that x* is an isolated solution of f(x; y*) = 0
for some point y* such that Y is locally irreducible at y*. Then each irreducible
component Z of V(f) containing (x*, y*) satisfies the following properties:
(1) dim Z = dim Y; and

(2) there exist arbitrarily small open neighborhoods U ⊂ Z of (x*, y*) and V ⊂ Y of y*
such that π_U is a proper finite map of U onto V.

A number of stronger versions of Corollary A.4.19 are contained in § A.14.
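This is exactly the picture behind parameter homotopies. As a toy illustration (ours; the path trackers discussed in this book are far more sophisticated), we can follow the isolated root x* = 1 of the family f(x; y) = x² − y from y = 1 to y = 2 by stepping the parameter and correcting with Newton's method at each step:

```python
def newton(f, df, x, y, iters=10):
    # Newton correction toward the root of f(. ; y) near x
    for _ in range(iters):
        x = x - f(x, y) / df(x, y)
    return x

f  = lambda x, y: x**2 - y          # the family f(x; y)
df = lambda x, y: 2 * x

x, steps = 1.0, 100                 # start at the isolated root of f(x; 1)
for k in range(1, steps + 1):
    x = newton(f, df, x, 1 + k / steps)

assert abs(x - 2**0.5) < 1e-9       # the path ends at the root sqrt(2) of f(x; 2)
```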

A.4.1 Generic Factorization


The next theorem tells us that if we are satisfied with generic results, as we often are
in numerical work, properness is unnecessary. The proof requires some constructions
involving the graph of a map.

Theorem A.4.20 Let π : X → Y be a dominant algebraic map from an irre-
ducible quasiprojective algebraic set X to an irreducible quasiprojective algebraic set
Y. There exists a smooth Zariski dense open set U ⊂ Y and a smooth Zariski dense
open set W of π⁻¹(U) such that π_W factors as π_W = s ∘ r, where r : W → V is a
surjective maximal rank algebraic map with connected fibers and s : V → U is an
algebraic map and a finite covering map. In particular each fiber of π_W : W → U
has the same number of irreducible components, i.e., deg s irreducible components.

Proof. In the following argument, we will repeatedly replace algebraic sets by dense
Zariski open sets. For the most part, we call these shrunk sets by the same names.
Replacing X by X_reg, we may assume that X is smooth. By shrinking Y and
replacing X by the inverse image under π of the shrunk Y, we may assume that
Y is smooth. Similarly using Chevalley's Theorem 12.5.6, we may assume that
π(X) = Y. By using the algebraic Sard's Theorem A.4.10 and shrinking X and Y
further we may further assume that π is of maximal rank.
Let X̄ denote an irreducible projective algebraic set in which X is Zariski open.
Let X̂ denote the closure of Graph(π) ⊂ X × Y in X̄ × Y. The induced map
π̄ : X̂ → Y extending π is proper. By Hironaka's Theorem A.4.1, there exists a
desingularization f : X̃ → X̂. Thus following Theorem A.4.8, we can factor π̄ ∘ f
as σ ∘ ρ, where ρ : X̃ → Z is an algebraic map with connected fibers; where Z is an
irreducible quasiprojective algebraic set; and σ : Z → Y is a finite-to-one proper
algebraic map.
By Corollary A.4.15, there is a Zariski open dense set U′ ⊂ Y such that U′ and
σ⁻¹(U′) are smooth and σ restricted to σ⁻¹(U′) : σ⁻¹(U′) → U′ is a covering. Thus by shrinking
we may assume without loss of generality that σ : Z → Y is a finite-to-one covering;
Z is smooth; and that ρ(X̃) = Z.
As we shrink Z we may automatically shrink Y so as not to lose the properties
already obtained. To see this let V′ be a Zariski open set of Z. Since the image
under σ of the proper algebraic subset Z \ V′ is a proper algebraic subset, we may
replace Y by U := Y \ σ(Z \ V′) and Z by V := σ⁻¹(Y \ σ(Z \ V′)) to still have an
algebraic finite-to-one covering map σ : V → U between manifolds.

By using the algebraic Sard Theorem A.4.10 on ρ we may, after shrinking,
assume without loss of generality that ρ is of maximal rank.
Since X is smooth and f is an isomorphism on the inverse image of the regular
points, we may regard X as a subset of X̃. By using Chevalley's Theorem 12.5.6,
we may assume that ρ_X : X → Z surjects onto Z.
Since all fibers of ρ are smooth and connected, they are irreducible. We conclude
that all fibers of ρ_X are smooth and connected. Indeed, if this failed, we would
have a fiber of ρ which is irreducible, and which after removing a proper algebraic
subset is disconnected.
Taking U as the final shrunk Y, V := σ⁻¹(U), and W as the final shrunk X, we
have finished the proof of the theorem. □

Corollary A.4.21 Let π : X → Y be a dominant algebraic map from an ir-
reducible quasiprojective algebraic set X to an irreducible quasiprojective algebraic
set Y. Assume that π(Sing(X)) is a proper algebraic subset of Y. There exists a
Zariski dense open set U ⊂ Y_reg such that W := π⁻¹(U) is smooth; and π_W fac-
tors as π_W = s ∘ r, where r : W → V is a surjective maximal rank algebraic map
with connected fibers and s : V → U is an algebraic map and a finite covering
map. In particular each fiber of π_W : W → U has the same number of irreducible
components, i.e., deg s irreducible components.

Proof. Replacing Y by Y′ := Y \ (Sing(Y) ∪ π(Sing(X))) and X by π⁻¹(Y′), it may
be assumed without loss of generality that X and Y are smooth. The rest of the
proof follows by carefully going through the proof of Theorem A.4.20. □

A.5 Rational Mappings

Besides algebraic mappings, there is a more general notion of mapping that is often
very useful.
On C, the assignment f : x ↦ 1/x is a well-defined function on C \ {0}. Using
the identification of x ∈ C with [1, x] ∈ P¹, it is natural to think of f as extending
to a function on all of C that takes 0 to the value ∞ equal to [0, 1]. In this case,
the algebraic set

{(x, [x, 1]) ∈ C × P¹ | x ∈ C}

is the graph of the map x → 1/x regarded as a map from C → P¹. This is the
simplest example of a rational mapping.
A rational mapping f : X → Y between quasiprojective algebraic sets is defined
to be an algebraic set Γ ⊂ X × Y such that there is a Zariski open dense set U ⊂ X
for which Γ ∩ (U × Y) is the graph of an algebraic map and the closure of Γ ∩ (U × Y) is Γ.
Every algebraic mapping is a rational mapping, but rational mappings are much
more general. A rational mapping often has a set of indeterminacy, where it cannot be

assigned any value.


Consider the rational mapping given by the assignment f : (z₁, z₂) ↦ z₂/z₁ on
C². The corresponding set Γ ⊂ C² × P¹ is

{(z₁, z₂; [z₁, z₂]) ∈ C² × P¹}.

Γ does not define an algebraic map at (0, 0). Indeed, there is no way to assign a
value there since z₂/z₁ has different constant values on different lines through the
origin. The origin is the set of indeterminacy of f.
Rational mappings may fail to be algebraic mappings even though they are
continuous maps. For example, consider the algebraic map g sending t ∈ C to
(z₁, z₂) := (t², t³) ∈ C². The image of g is the affine set C defined by z₁³ − z₂² = 0.
The map g is one-to-one and onto. The inverse from C to C may be checked to be
a rational mapping: it is the restriction of the rational function z₂/z₁. However, it
is not an algebraic mapping, since t as a function on C is not the restriction of a
polynomial p ∈ C[z₁, z₂].
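A quick numeric sanity check (ours, not the book's) of these claims: g(t) = (t², t³) lands on the cuspidal cubic z₁³ − z₂² = 0, and away from the cusp at the origin the rational function z₂/z₁ recovers the parameter t.

```python
# g(t) = (t^2, t^3) lies on the curve z1^3 - z2^2 = 0; away from the
# cusp (0, 0), the rational function z2/z1 inverts g.
for t in [0.5, -2.0, 1 + 1j, 3 - 0.25j]:
    z1, z2 = t**2, t**3
    assert abs(z1**3 - z2**2) < 1e-6   # the image point satisfies the equation
    assert abs(z2 / z1 - t) < 1e-9     # z2/z1 recovers t (undefined at (0, 0))
```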

A.6 The Rank and the Projective Rank of an Algebraic System

We define an algebraic system on a pure N-dimensional quasiprojective algebraic
set X to be a set of n algebraic functions f(x) := {f₁, ..., fₙ} on X. Typically we
assume irreducibility in theorems about systems and leave it to the reader to make
the trivial adjustments in statements for the reducible case. We define the rank
of the algebraic system f(x) = 0 on an N-dimensional irreducible quasiprojective
algebraic set X to be the dimension of the closure of the image f(X) ⊂ Cⁿ. By
Corollary 12.5.7, f(X) is an irreducible algebraic set. We denote the rank of f by
rank f. We define the corank of the algebraic system f(x) = 0 to be N − rank f.
Neither adjoining polynomial functions of the equations of f to create a larger
system nor replacing f with g · f, where g is an invertible n × n matrix, changes the
rank of a system.

Theorem A.6.1 Let f(x) = 0 denote a system of n algebraic functions on an
irreducible quasiprojective set X. Then there is a Zariski open set U ⊂ f(X) ⊂ Cⁿ
such that for y ∈ U, V(f(x) − y) ∩ X_reg is smooth of dimension equal to the corank
of f. Moreover, the Jacobian matrix of f is of rank equal to rank f at all points of
V(f(x) − y) ∩ X_reg.

Proof. By Theorem A.4.10, we know there is a Zariski open set U ⊂ f(X) such that
V(f(x) − y) is smooth and such that the Jacobian matrix of f has rank equal to
dim f(X) at all points of V(f(x) − y). Corollary A.4.12 gives that dim V(f(x) − y) =
N − dim f(X). □

Let U and f be as in Theorem A.6.1. For the dense Zariski open set U′ :=

f⁻¹(U) ∩ X_reg, we have for all points x* ∈ U′ that the rank of the Jacobian of f evaluated
at x* equals rank f. This gives us a quick probability-one algorithm for the rank
of a system f. Given an algebraic system f(x) of n algebraic functions on an
irreducible N-dimensional quasiprojective set X, the rank of f equals the rank of
the Jacobian at a random point of X.
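A small illustration of this probability-one rank computation (our own sketch, with a made-up system; not an algorithm from the book):

```python
import numpy as np

def jac(x, y, z):
    # Jacobian of f1 = x^2 + y^2 + z^2 - 1, f2 = x + y, f3 = f1 + 2*f2.
    # Since f3 is a function of f1 and f2, the system has rank 2, not 3.
    return np.array([
        [2 * x,     2 * y,     2 * z],
        [1.0,       1.0,       0.0  ],
        [2 * x + 2, 2 * y + 2, 2 * z],
    ])

rng = np.random.default_rng(0)
# evaluate at a random point; with probability one this reveals the rank
rank = np.linalg.matrix_rank(jac(*rng.standard_normal(3)))
assert rank == 2
```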
The following is useful.
Corollary A.6.2 Let f(x) = 0 denote a system of n algebraic functions on an
irreducible quasiprojective set X. If the rank of the Jacobian of f at some point
x ∈ X_reg is k, then rank f ≥ k.

Proof. The set on X_reg where the Jacobian has rank less than or equal to k is a
quasiprojective subset of X_reg. If it is dense then rank f = k. If it is not dense,
then the rank of the Jacobian is greater than k on a Zariski dense set, which would
imply rank f > k. □

Theorem A.6.3 Given a system f(x) of n algebraic functions on an irreducible


quasiprojective set X, all irreducible components of V(f) have dimension at least
equal to the corank of f.

Proof. Use the proof of Theorem 13.4.2. □

The rank of a system is a useful invariant, but from the viewpoint of the Bertini
Theorems, a closely related invariant, the projective rank of a system, plays a more
central role. The first-time reader may safely ignore the rest of this section and any
mention of the projective rank of a system. The importance of projective rank stems
from Theorem A.8.6, which states that projective rank controls the nonemptiness
of the zero sets in Bertini Theorems.
The projective rank of a system f is the dimension of the closure of the image
of the rational mapping given by sending x ∈ X \ V(f) to [f₁(x), ..., fₙ(x)]. We
denote the projective rank of f by rank_P f. A system f on C^N having projective
rank N is called a big system.
Remark A.6.4 For an algebraic line bundle and a system of algebraic sections
f₁, ..., fₙ, rank does not make good sense but projective rank does. It is closely
related to the notion of Kodaira dimension, e.g., (Iitaka, 1982).
Lemma A.6.5 Let f be a system of n algebraic functions on an irreducible qua-
siprojective algebraic set X. Then rank f − 1 ≤ rank_P f ≤ rank f.

Proof. The rational mapping used in the definition of projective rank factors as
the composition of the map in the definition of rank followed by the map Cⁿ \
{0} → Pⁿ⁻¹, which sends (z₀, ..., z_{n−1}) → [z₀, ..., z_{n−1}]. Since the fiber of the
map Cⁿ \ {0} → Pⁿ⁻¹ is dimension one, we are done. □

The above proof makes clear how the two ranks may fail to be equal. Before we
make this precise in Lemma A.6.7, let us give a definition and an example.
Let X ⊂ Pⁿ be an irreducible projective set. X is said to be a cone with
vertex x ∈ X if for some hyperplane H of Pⁿ not containing x, the projection
π_x : Pⁿ \ {x} → H maps X \ {x} to a set whose closure has dimension less than
dim X. An irreducible affine algebraic set X ⊂ Cⁿ is said to be a cone with vertex
x ∈ X if, when we regard Cⁿ as a subset of Pⁿ, the closure of X in Pⁿ is a cone
with vertex x.

Example A.6.6 (Cones) Let Y denote an irreducible (N − 1)-dimensional smooth
complete intersection in Pⁿ⁻¹ defined by homogeneous equations f₁, ..., f_{n−N} in
the variables z₀, ..., z_{n−1}. The N-dimensional cone X := V(f₁, ..., f_{n−N}) ⊂ Cⁿ
intersects a general (n − N)-dimensional linear subspace through the origin in the
origin only. To see this, regard Cⁿ as contained in Pⁿ with the hyperplane at infinity
H∞ given by V(zₙ). For positive integers m satisfying (m − 1) + (N − 1) < n − 1,
there is a dense Zariski open set of m-dimensional linear subspaces L ⊂ Cⁿ with

L̄ ∩ X̄ ∩ H∞ = (L̄ ∩ H∞) ∩ (X̄ ∩ H∞) = ∅.

Thus X ∩ L = {0}, and this point is singular if deg X ≥ 2.

Lemma A.6.7 Let f be a system of n algebraic functions on an N-dimensional
irreducible quasiprojective set X. Then rank f = rank_P f except when f(X) is a cone
with vertex 0.

Proof. The proof of this is immediate from the definitions and left to the reader. □

To compute the projective rank is easy.

Theorem A.6.8 Let f be a system of n algebraic functions f₁, ..., fₙ on an
irreducible quasiprojective algebraic set X with one of the functions, fᵢ, which we
relabel f₁, not identically equal to the zero function. Then

rank_P f = rank (f₂/f₁, ..., fₙ/f₁),

where the system of quotients is defined on X \ V(f₁).

Proof. The proof of this, and the independence of the choice of the not identically
zero fᵢ, follows immediately from definitions and is left to the reader. □

A.7 Universal Functions and Systems

In this section we will give a detailed discussion of certain special families of poly-
nomials, which are useful in the study of polynomial systems.

A.7.1 One Variable Polynomials


We start with the case of the most general degree d polynomial. Fix an integer
d ≥ 1. We have the family

p(z, c) := c₀ + c₁z + ⋯ + c_d z^d = 0.    (A.7.1)

We have some related issues we must face already.

(1) Should we insist that c_d ≠ 0?
(2) Would it be better to include "solutions at infinity" by using the homogenized
system

p(z, c) := c₀z₀^d + c₁z₀^{d−1}z₁ + ⋯ + c_d z₁^d = 0

with z := [z₀, z₁] ∈ P¹?
(3) Since multiplying an equation by a nonzero complex number does not change
the solution set of the polynomial, should we make the convention that c is
not the point (c₀, ..., c_d) ∈ C^{d+1}, but instead [c₀, ..., c_d] ∈ P^d? Doing this, of
course, implicitly throws away the identically zero polynomial, which corresponds
to c = 0.

For simplicity, we look only at polynomials with no restrictions on the cᵢ, i.e.,
we assume that (z, c) = (z, c₀, ..., c_d) ∈ C^{d+2}, corresponding to polynomials of
degree ≤ d. The different choices raised by the issues listed above are treated in a
similar way.
Let's introduce some notation. We let Z_d ⊂ C^{d+2} denote the solution set of
p(z, c) = 0. We let π : Z_d → C^{d+1} denote the map induced by the projection
(z, c) → c, and we let ρ : Z_d → C denote the map induced by the projection
(z, c) → z.
Note that for any given c ∈ C^{d+1}, π⁻¹(c) consists of the points (z, c) satisfying
p(z, c) = c₀ + c₁z + ⋯ + c_d z^d = 0. It is important that the zero set Z_d ⊂ C^{d+2}
of p(z, c) = 0 is a connected (d + 1)-dimensional complex manifold. Indeed, Z_d has
dimension d + 1 since it is defined by a single algebraic function on an irreducible
(d + 2)-dimensional algebraic set. Moreover, since ∂p(z, c)/∂c₀ = 1, it is a consequence
of Theorem A.2.8 that Z_d is smooth.


To show the connectedness of Z_d, we apply the criterion that a space (in our
case Z_d) is connected if a continuous map (in our case ρ : Z_d → C) has connected
image and fibers. To see this, note that the fiber ρ⁻¹(z*) of ρ over an arbitrary
point z* ∈ C is the set of (z*, c) such that p(z*, c) = 0. Since this is the linear
equation c₀ + c₁z* + ⋯ + c_d z*^d = 0 in the variables c ∈ C^{d+1}, we see that ρ⁻¹(z*)
is identified with a hyperplane of C^{d+1} by π.
Since Z_d is connected and smooth, it is irreducible. Over all points except
0 ∈ C^{d+1}, π has finite fibers.

Let us consider the points c for which p(z, c) has multiple roots, i.e., fewer than d
distinct roots. This corresponds to points c for which the equations

p(z, c) = 0,
∂p(z, c)/∂z = 0    (A.7.2)

have a common root. The classical prescription, e.g., (Chapter 1 Walker, 1962)
and (Cox et al., 1997), for how to eliminate z from these equations constructs the
discriminant of p(z, c), a polynomial of degree 2d − 1 in c, which is the resultant
of the two polynomials in the system (A.7.2). We only discuss resultants briefly in
§ 6.2.1 of this book. For us it will be enough to note that

(1) for some c*, e.g., the c* corresponding to the polynomial z^d − 1 = 0, the roots
are distinct;
(2) for c in a complex neighborhood of c* the roots remain distinct; and
(3) the set S in C^{d+1} defined by the system (A.7.2) is an affine algebraic set.

We know from item (3) and Chevalley's Theorem 12.5.6 that π(S) is a constructible
set. We know that the closure D of π(S) in the complex topology is an affine
algebraic set by Lemma 12.5.3. By item (2) we conclude that D ≠ C^{d+1} and thus
that the Zariski open set C^{d+1} \ D is nonempty, and hence dense.
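As a concrete version of this prescription (our own sketch; the book itself only touches resultants in § 6.2.1), one can build the Sylvester matrix of p and ∂p/∂z and test whether its determinant, the resultant, vanishes:

```python
import numpy as np

def sylvester(a, b):
    # Sylvester matrix of two polynomials given by ascending coefficient
    # lists a = [a_0, ..., a_m], b = [b_0, ..., b_n]; det(S) = Res(a, b).
    m, n = len(a) - 1, len(b) - 1
    S = np.zeros((m + n, m + n))
    for i in range(n):                       # n shifted copies of a
        S[i, i:i + m + 1] = a[::-1]
    for i in range(m):                       # m shifted copies of b
        S[n + i, i:i + n + 1] = b[::-1]
    return S

def has_multiple_root(c, tol=1e-8):
    # p(z) = c_0 + c_1 z + ... + c_d z^d has a multiple root exactly
    # when Res(p, p') = 0, i.e., when the discriminant vanishes.
    dp = [k * c[k] for k in range(1, len(c))]
    return abs(np.linalg.det(sylvester(c, dp))) < tol

assert has_multiple_root([1.0, -2.0, 1.0])        # (z - 1)^2
assert not has_multiple_root([-1.0, 0.0, 1.0])    # z^2 - 1, distinct roots
```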

A.7.2 Polynomials of Several Variables


The construction of § A.7.1 carries over to several variables. We summarize the
construction for polynomials of degree ≤ d on C^N. Such a polynomial

p(z, c) = Σ_{|j| ≤ d} c_j z^j

depends on 𝒩 := (N+d choose d) coefficients c_j, where we use the multidegree notation. We
regard p(z, c) as a polynomial on C^N × C^𝒩. By the same reasoning as in § A.7.1, we
see that Z_d := V(p(z, c)) is smooth, connected, and of codimension one. Moreover,
by Theorem A.4.10, there is a Zariski open dense set U ⊂ C^𝒩, such that the
restriction of the projection map π : C^N × C^𝒩 → C^𝒩 to Z_d ∩ π⁻¹(U) is a maximal
rank map.

A.7.3 A More General Case


Let f₁(x), ..., fₙ(x) be a set of algebraic functions on an irreducible quasiprojective
algebraic set X. For example, these might be a set of rational functions

p₁(x)/q₁(x), ..., pₙ(x)/qₙ(x),

where the pᵢ(x) and qᵢ(x) are polynomials on C^N and X := C^N \ (∪ᵢ₌₁ⁿ V(qᵢ(x))).

We define the universal function F(λ, x) := Σᵢ₌₁ⁿ λᵢ fᵢ(x) on Cⁿ × X.
It is traditional in this context to refer to the solution set V(f) of the set of
algebraic functions f₁(x), ..., fₙ(x) as the base locus of the set of functions. We
will not use this language, but the reader should be aware of it.
Z_F := V(F) is a quasiprojective algebraic set with Z_F ∩ [Cⁿ × (X_reg \ V(f))]
smooth. Moreover there is a Zariski open dense set U ⊂ Cⁿ, such that the restriction
of the projection map π : Cⁿ × (X_reg \ V(f)) → Cⁿ to Z_F ∩ π⁻¹(U) is either empty
or a maximal rank map. This is important enough to state as a Theorem.
Theorem A.7.1 (Simple Bertini's Theorem) Let f(x) := {f₁, ..., fₙ} be a
system of algebraic functions on an irreducible quasiprojective algebraic set X.
There is a Zariski open dense set U ⊂ Cⁿ, such that for (λ₁, ..., λₙ) ∈ U,
it follows that g := Σᵢ₌₁ⁿ λᵢfᵢ has a possibly empty quasiprojective zero set Z
such that Z ∩ (X_reg \ V(f)) is smooth with the differential dg nowhere zero on
Z ∩ (X_reg \ V(f)).
Proof. First note that we can assume that X is smooth and V(f) is empty, by
simply replacing X with (X_reg \ V(f)) and renaming. Note that if rank f = 0, then
each g is constant and the theorem is vacuously true. Therefore we can assume that
rank f > 0.
We have the "universal function" F(λ, x) := Σᵢ₌₁ⁿ λᵢfᵢ(x) defined for (λ, x) ∈
Cⁿ × X. Z_F := V(F) ⊂ Cⁿ × X is smooth by the same reasoning as used in § A.7.1.
Consider the maps π₁ : Z_F → Cⁿ and π₂ : Z_F → X induced by the projections
Cⁿ × X → Cⁿ and Cⁿ × X → X respectively. The fiber π₂⁻¹(x) for any x ∈ X is an
affine hyperplane of Cⁿ. It can be further checked that given any x ∈ X, there is a
neighborhood O of x in the complex topology such that π₂⁻¹(O) is biholomorphic
to Cⁿ⁻¹ × O. Thus Z_F is a bundle over X and therefore irreducible of dimension
dim X + n − 1.
We are in the situation of Theorem A.4.10, and would be done, if we knew that
π₁ is dominant. Assume it is not. Then, there is a Zariski open dense set U ⊂ Cⁿ
such that for λ ∈ U we would have that V(Σᵢ₌₁ⁿ λᵢfᵢ) = ∅. □

A.7.4 Universal Systems


Let f₁(x), ..., fₙ(x) be a set of algebraic functions on an irreducible quasiprojective
algebraic set X. For any positive integer s, we define the universal system

F(Λ, x) := [F₁(Λ, x), ..., F_s(Λ, x)]ᵀ = Λ · f(x),

where Λ is the s × n matrix with entries λ_{i,j} and f(x) := [f₁(x), ..., fₙ(x)]ᵀ,
on C^{s×n} × X.

Define Z_F := V(F(Λ, x)) ⊂ C^{s×n} × X. Let π₁ : Z_F → C^{s×n} and π₂ : Z_F → X
denote the projections induced from C^{s×n} × X → C^{s×n} and C^{s×n} × X → X
respectively.
Lemma A.7.2 If V(f) = ∅ then Z_F is irreducible of dimension dim X + s(n − 1).

Proof. The set π₂⁻¹(x) for x ∈ X may be identified with the linear space of Λ ∈ C^{s×n}
satisfying Λ · f(x) = 0. Since f(x) ≠ 0, this space has codimension s. It can be
further checked that given any x ∈ X there is a neighborhood O of x in the complex
topology such that π₂⁻¹(O) is biholomorphic to C^{s(n−1)} × O. From this it follows that
the set Z_F ⊂ C^{s×n} × X consisting of (Λ, x) such that F(Λ, x) = 0 is an irreducible
set. □

Lemma A.7.3 If V(f) is empty, then Z_F ∩ (C^{s×n} × X_reg) is smooth.

Proof. Since at any point (Λ, x) ∈ V(F) ⊂ C^{s×n} × X at least one of the fᵢ is nonzero,
and ∂Fⱼ(Λ, x)/∂λ_{j,i} = fᵢ(x), the matrix of partial derivatives of F with respect
to the λ_{j,i} has rank s. From this we see that V(F) ∩ (C^{s×n} × X_reg) is smooth. □

After we develop a few results on linear projections and subspaces, we will prove
Theorem A.8.7, the analogue for systems of Theorem A.7.1.

A.8 Linear Projections

"Generic" projections have been used since classical times to reduce questions about
general algebraic sets to questions about hypersurfaces. Here we present the basic
facts that we need. We follow the presentation in (Sommese et al., 2001c) closely.
A linear projection π : C^N → C^m is a surjective affine map

π(x) = a + A x,    (A.8.3)

where

a = [a_{1,0}, ..., a_{m,0}]ᵀ,   A = (a_{i,j}), 1 ≤ i ≤ m, 1 ≤ j ≤ N,   and   x = [x₁, ..., x_N]ᵀ.    (A.8.4)

We work with equivalence classes of projections, considering two projections π₁, π₂
from C^N onto C^m equivalent if there is an affine linear isomorphism T : C^m → C^m
with T(π₁(x)) = π₂(x). Thus, for us two linear projections are the same if their
fibers through the origin are parallel (N − m)-dimensional linear subspaces of C^N,

i.e., π₁⁻¹(π₁(0)) is parallel to π₂⁻¹(π₂(0)). So in the special case of linear projections
from C^N → C^{N−1} with N ≥ 2, we can consider the projections to be parameterized
by the lines through the origin, or equivalently by the hyperplane at infinity H∞ :=
V(z₀), where we regard C^N as embedded in P^N by

(x₁, ..., x_N) → [z₀, ..., z_N] = [1, x₁, ..., x_N].

This observation will play an important role in § A.10.3.


Though we can use the set of m × N matrices A ∈ C^{m×N} with rank A = m to
parameterize the linear projections, it helps to keep the geometrical correspondence
between projections and the nullspaces of the matrices A in mind. As noted in
the last paragraph, when m = N − 1, we are dealing with lines and the natural
parameter space is the projective space parameterizing lines through the origin in
C^N. For linear subspaces of other dimensions this leads us to Grassmannians.

A.8.1 Grassmannians
We defined the N-dimensional projective space P^N in § 3.2 as the set of lines through
the origin in C^{N+1}. Replacing lines by (m+1)-dimensional linear subspaces through
the origin leads to the notion of a Grassmannian. We define the Grassmannian of
(m + 1)-planes in (N + 1)-space to be the set of all (m + 1)-dimensional linear
subspaces of C^{N+1} through the origin. Equivalently this is the space of linear P^m's
in P^N. We denote this space Gr(m, N). The reader should be aware that there is a
second convention in the literature where the focus is on C^{N+1}, and the space we
denote Gr(m, N) is denoted Gr(m + 1, N + 1).
An (m + 1)-dimensional subspace of C^{N+1} through the origin is determined by
m + 1 elements of C^{N+1}. In analogy with homogeneous coordinates on projective
space, we may represent an element of Gr(m, N) by an (m + 1) × (N + 1) matrix
A. Conversely, we would like an (m + 1) × (N + 1) matrix A to represent an
element of Gr(m, N). For A to represent an (m + 1)-dimensional linear subspace, A
must have rank m + 1, e.g., for projective space P^N = Gr(0, N), the (N + 1)-tuples
[z₀, ..., z_N] ∈ P^N are not allowed to have all entries 0. In analogy with homogeneous
coordinates on P^N, if G is an (m + 1) × (m + 1) invertible matrix, then the rank
m + 1 matrices A and G · A represent the same linear subspace.
As with P^N, we can define embeddings of C^{(m+1)×(N−m)} into Gr(m, N). Indeed,
given an (m+1) × (N − m) matrix B, if we send it to [I_{m+1} B], we have a one-to-one
mapping. We take this as giving a neighborhood of any of the elements of the image
of this map. We can construct other embeddings of C^{(m+1)×(N−m)} whose unions
cover Gr(m, N), but do not do so since we do not need this.
Grassmannians are connected projective manifolds. As we saw above, the di-
mension of Gr(m, N) is (m + 1) × (N − m). There is a natural embedding

Gr(m, N) → P^((N+1 choose m+1) − 1)

called the Plücker embedding, obtained by sending an (m+1) × (N+1) matrix A
representing a point of Gr(m, N) to the point in P^((N+1 choose m+1) - 1) with the (N+1 choose m+1) determinants of (m+1) × (m+1) submatrices of A as homogeneous coordinates. Since
using G·A in place of A multiplies all these homogeneous coordinates by det G,
this mapping is well defined.
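A minimal numerical sketch of the Plücker embedding for Gr(1, 3) (our illustration, with hypothetical names): the six 2 × 2 minors of a 2 × 4 matrix A give homogeneous coordinates in P^5, replacing A by G·A rescales them all by det G, and the coordinates satisfy the classical Plücker relation p_01 p_23 − p_02 p_13 + p_03 p_12 = 0.

```python
# Plucker coordinates of a point of Gr(1, 3): the 2 x 2 minors of a 2 x 4
# matrix, listed in the order (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
from itertools import combinations

def plucker(A):
    return [A[0][i] * A[1][j] - A[0][j] * A[1][i]
            for i, j in combinations(range(4), 2)]

A = [[1.0, 2.0, 3.0, 4.0],
     [0.0, 1.0, 5.0, 6.0]]
G = [[2.0, 1.0], [1.0, 2.0]]    # det G = 3
GA = [[sum(G[i][t] * A[t][j] for t in range(2)) for j in range(4)]
      for i in range(2)]
p, q = plucker(A), plucker(GA)
detG = G[0][0] * G[1][1] - G[0][1] * G[1][0]
print(p)                                     # [1.0, 5.0, 6.0, 7.0, 8.0, -2.0]
print([x / detG for x in q])                 # same homogeneous coordinates
print(p[0] * p[5] - p[1] * p[4] + p[2] * p[3])   # Plucker relation: 0.0
```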
There is a large literature on Grassmannians. The analogues for Grassmannians of the linear projections from P^N to lower dimensional projective spaces, which
we consider in § A.8.2, are similar mappings from Gr(m, N) to lower dimensional
Grassmannians. The isomorphism of a vector space and its dual leads to the isomorphism between Gr(m, N) and Gr(N-m-1, N). A good place to read more on these
useful homogeneous manifolds is (Griffiths & Harris, 1994). For detailed information, (Hodge & Pedoe, 1994b) is particularly helpful.
In the same vein as § A.7.4, it may be shown that there is a connected projective
manifold H ⊂ Gr(m, N) × P^N consisting of all points (L, x) where L ∈ Gr(m, N) is
an m-dimensional linear subspace of P^N and x ∈ P^N is a point in the subspace of
P^N represented by L. By standard abuse of notation, this is denoted by x ∈ L. Let
π_1 : H → Gr(m, N) and π_2 : H → P^N denote the algebraic mappings induced by the
projections Gr(m, N) × P^N → Gr(m, N) and Gr(m, N) × P^N → P^N respectively. The
mapping π_1 is of maximal rank with the fiber π_1^(-1)(L) over L ∈ Gr(m, N) mapped
isomorphically to L by π_2. Thus

dim H = (m+1)(N-m) + m.

The mapping π_2 is of maximal rank with the fiber π_2^(-1)(x) for x ∈ P^N mapped
isomorphically by π_1 onto the Gr(m-1, N-1) of m-dimensional linear spaces of
P^N that contain x. Thus dim π_2^(-1)(x) = m(N-m).
Regarding C^N as P^N minus a hyperplane H, there is a one-to-one correspondence
of m-dimensional affine linear subspaces of C^N with the dense Zariski open set
U ⊂ Gr(m, N) of m-dimensional linear subspaces L of P^N not completely contained
in H. The algebraic set Gr(m, N) \ U is thus identified with Gr(m, N-1).
Theorem A.8.1  Let X be an n-dimensional affine algebraic subset of C^N (respectively projective algebraic subset of P^N). If n + m < N, there is a dense Zariski
open set U ⊂ Gr(m, N) of affine linear subspaces L ⊂ C^N (respectively of linear
spaces L ⊂ P^N) of dimension m not meeting X.

Proof.  Using the identification of affine linear spaces of C^N with linear spaces in
P^N, we only need to show this result in the case of P^N. Note that dim π_2^(-1)(X) =
n + m(N-m). Thus π_1(π_2^(-1)(X)) cannot be dense, because if it was,

(m+1)(N-m) = dim Gr(m, N) = dim π_1(π_2^(-1)(X)) ≤ dim π_2^(-1)(X) = n + m(N-m).

This implies the contradiction N - m ≤ n.  •
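The smallest interesting case of Theorem A.8.1 can be checked numerically: with n = m = 1 and N = 3, a generic line misses a fixed line in 3-space, since 1 + 1 < 3. The sketch below (our illustration; the data and helper are hypothetical) uses the standard criterion that the lines p + t·d and q + s·e meet exactly when det[d | −e | q − p] = 0, and shows that this determinant is nonzero for random data.

```python
# Two generic lines in 3-space do not meet: the 3 x 3 determinant test below
# is nonzero for a random second line.  Illustrative data, not from the book.
import random

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

random.seed(7)
p, d = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]        # the fixed line X
q = [random.uniform(1, 2) for _ in range(3)]   # a random second line ...
e = [random.uniform(1, 2) for _ in range(3)]   # ... with random direction
M = [[d[i], -e[i], q[i] - p[i]] for i in range(3)]
print(abs(det3(M)) > 1e-8)   # True: the two lines miss each other
```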

Theorem A.8.2  Let X be an n-dimensional algebraic subset of C^N (respectively
Algebraic Geometry 327

projective algebraic subset of P^N). Assume n + m < N. For a given x ∈ X,
there is a dense Zariski open set U of m-dimensional affine linear spaces L ⊂ C^N
containing x (respectively m-dimensional linear spaces L ⊂ P^N containing x) such
that L ∩ X = {x}. Moreover if x ∈ X_reg, then U can be chosen so that in addition
T_(L,x) ∩ T_(X,x) = {x} ⊂ T_(C^N,x) (respectively T_(L,x) ∩ T_(X,x) = {x} ⊂ T_(P^N,x)), where T_(L,x), T_(X,x),
T_(P^N,x), T_(C^N,x) are the tangent spaces of L, X, P^N, C^N respectively at x.

Proof.  This theorem is proved by reasoning similar to that for Theorem A.8.1.
We only prove the case when X is projective algebraic (the quasiprojective case
requires the projective case plus an application of Lemma 12.5.2). Fix a point
x ∈ X ⊂ P^N. The set of L ∈ Gr(m, N) that contain x, i.e., G_1 := π_1(π_2^(-1)(x)), is isomorphic
to Gr(m-1, N-1) and thus irreducible and m(N-m) dimensional. The set of L
containing x and a fixed point y ≠ x is isomorphic to G_2 := Gr(m-2, N-2) and
thus (m-1)(N-m) dimensional. The set W of L that contain x and some other
point y of X is thus of dimension at most (m-1)(N-m) + n. Here we are using
the fact that W is projective. To see this, let q_2 : π_1^(-1)(G_1) → P^N denote the
algebraic mapping induced by π_2; W is the set π_1(q_2^(-1)(X)).
Since dim G_1 - dim W ≥ m(N-m) - ((m-1)(N-m) + n) = N - m - n ≥ 1,
we conclude W is a proper algebraic subset of the irreducible projective algebraic
set G_1. Thus there is a Zariski dense open set G_1 \ W of m-dimensional projective
linear spaces L containing x and no other point of X.
The tangent space assertions follow by a dimension count showing that the space
of L ∈ G_1 such that T_(L,x) ∩ T_(X,x) ≠ {x} in T_(P^N,x) is of dimension less than dim G_1. The
details are left to the reader.  •

Theorem A.8.3  Let X be an n-dimensional affine algebraic subset of C^N. Assume n + m ≥ N. There is a dense Zariski open set U of m-dimensional linear
subspaces L ⊂ C^N such that L ∩ X is of dimension n + m - N. If n + m > N, U
may be chosen so that if L ∈ U then L ∩ X_reg is nonempty.

Proof.  Using the same sort of reasoning used in Theorem A.8.2 or a repeated use
of Theorem A.7.1 gives this result.  •

A.8.2  Linear Projections on P^N

We need to consider the extension of projections to projective space. Such projections have traditionally been a major tool in algebraic geometry and are a perennial
focus for research, e.g., (Beltrametti, Howard, Schneider, & Sommese, 2000). Let
[z_0, ..., z_N] denote linear coordinates on P^N. As above, we regard C^N ⊂ P^N using
the inclusion (x_1, ..., x_N) → [z_0, ..., z_N] = [1, x_1, ..., x_N]. Thus C^N = P^N \ H_∞,
where H_∞ := V(z_0) is the hyperplane at infinity.

A projection from P^N to P^m is a surjective map π_L : P^N \ L → P^m with

π_L(z) = A z,                                                      (A.8.5)

where

A = [ a_(0,0)  a_(0,1)  ...  a_(0,N) ]            [ z_0 ]
    [    :        :     ...     :    ]   and  z = [  :  ]
    [ a_(m,0)  a_(m,1)  ...  a_(m,N) ]            [ z_N ]

and where L is the linear projective space P^(N-m-1) ⊂ P^N defined by the vanishing
of the linear equations A z. L is the center of the projection. Theoretically, we
work with equivalence classes of projections, considering two projections π_1, π_2 from
P^N onto P^m equivalent if they have a common center L and there is a projective
linear isomorphism T : P^m → P^m with T(π_1(x)) = π_2(x) on P^N \ L. Note that
two projections from P^N to P^m are equivalent if and only if they have the same
center L. Thus the linear projections P^N → P^m are naturally parameterized by the
Grassmannian Gr(N-m-1, N) of (N-m-1)-dimensional linear spaces L ⊂ P^N.
Geometrically, the projection π_L has a simple description. Let L be the center
of the projection. Choose any P^m ⊂ P^N with the property that L ∩ P^m = ∅. Given
a point x ∈ P^N \ L and letting ⟨x, L⟩ denote the linear subspace P^(N-m) ⊂ P^N
generated by x and L, the projection from P^N to P^m with center L sends x to
⟨x, L⟩ ∩ P^m.
The projections π_L from P^N to P^m that are extensions of projections from C^N
to C^m are precisely the projections with centers L ⊂ H_∞. Indeed, let y_1, ..., y_m be
coordinates on C^m and let the usual embedding of C^m into P^m be given by

(y_1, ..., y_m) → [w_0, ..., w_m] = [1, y_1, ..., y_m].

Since we must have a linear equation in x_1, ..., x_N when we dehomogenize with
respect to w_0, we conclude that A is of the form

A = [ a_(0,0)    0     ...    0     ]
    [ a_(1,0)  a_(1,1) ... a_(1,N)  ]
    [    :        :    ...    :     ]
    [ a_(m,0)  a_(m,1) ... a_(m,N)  ]

with a_(0,0) ≠ 0. Using the invertible linear transformation on P^m

T := [  1   0   ]
     [ -u   I_m ]

(in block form), where u := (1/a_(0,0)) (a_(1,0), ..., a_(m,0))^T, we see that an
equivalent form for A is

[ a_(0,0)  0 ]
[    0     Ã ]

where the m × N block Ã = (a_(i,j)), 1 ≤ i ≤ m, 1 ≤ j ≤ N, is as in Equation A.8.4. For example, the projection (x_1, ..., x_N) →
(x_1, ..., x_(N-1)) extends to the projection [x_0, ..., x_N] → [x_0, x_1, ..., x_(N-1)] with
center L := {[0, ..., 0, 1]}.
To recapitulate a main point: an equivalence class of linear projections is naturally identified with the center of the projection in the projective case and with the
center of the projective extension of the linear projection in the affine case.
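The matrix form (A.8.5) of the example above is easy to verify numerically. This short sketch (our own, with hypothetical names) applies the 3 × 4 matrix of the projection [x_0, x_1, x_2, x_3] → [x_0, x_1, x_2], checks that it extends the affine projection (x_1, x_2, x_3) → (x_1, x_2), and checks that the center [0, 0, 0, 1] is exactly where A z vanishes.

```python
# The projective extension of (x1, x2, x3) -> (x1, x2): pi_L(z) = A z with
# center L = {[0, 0, 0, 1]}, the unique point where A z = 0.
A = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0]]

def apply_A(z):
    return [sum(A[i][j] * z[j] for j in range(4)) for i in range(3)]

x = (2.0, 3.0, 5.0)                  # an affine point of C^3
z = [1.0, x[0], x[1], x[2]]          # its embedding [1, x1, x2, x3] in P^3
w = apply_A(z)
affine_image = (w[1] / w[0], w[2] / w[0])
print(affine_image)                  # (2.0, 3.0), agreeing with (x1, x2)

center = [0.0, 0.0, 0.0, 1.0]
center_image = apply_A(center)
print(center_image)                  # [0.0, 0.0, 0.0]: A z vanishes on L
```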

A.8.3  Further Results on System Ranks

We have a few more properties on the behavior of the rank of a system under
randomization.

Lemma A.8.4  Let X ⊂ C^N denote an irreducible affine algebraic set. Then
for any nonnegative integer s, there is a dense Zariski open set of linear projections
C^N → C^s such that the dimension of the closure of the image of X is min{dim X, s}.

Proof.  We regard C^N as the complement in P^N of a hyperplane H_∞. We first do
the case of s ≥ dim X.
As we saw above, the linear projections are parameterized by the Grassmannian
G := Gr(N-s-1, N-1) of linear P^(N-s-1)'s contained in H_∞. We have

dim X̄ ∩ H_∞ ≤ dim X - 1.

Since (dim X - 1) + (N-s-1) = N - (s - dim X) - 2 < N - 1, we conclude from
Theorem A.8.1 that there is a Zariski open dense set U of G corresponding to linear
P^(N-s-1)'s missing X̄ ∩ H_∞. Given one of these, say L, and the associated linear map
π_L, the fiber π_L^(-1)(π_L(x)) through x ∈ X is ⟨x, L⟩ ∩ X. Since L ∩ (X̄ \ X) = ∅,
⟨x, L⟩ ∩ X is compact and hence finite by Lemma 12.4.3. Thus by Corollary A.4.12,
the closure of the image of X has dimension the same as X.
The case of s < dim X follows from the case s = dim X and the observation
that if s < dim X, then a dense open set of linear projections C^(dim X) → C^s
are onto.  •

Theorem A.8.5  Let f(x) = 0 denote a system of n algebraic functions on an
irreducible quasiprojective set X. For any positive integer s, there is a dense Zariski
open set of matrices U ⊂ C^(s×n) such that if A ∈ U, then rank A·f = min{s, rank f}.

Proof.  Applying Lemma A.8.4 to the closure of f(X), there is a dense Zariski open set U of
A ∈ C^(s×n) such that dim (A·f)(X) = min{dim f(X), s}.  •

The following useful result adds an existence component to Bertini's Theorem.

Theorem A.8.6  Let f be a system of n algebraic functions f_1, ..., f_n on an
irreducible N-dimensional quasiprojective algebraic set X. Assume that rank f = K.
Then there is a Zariski open dense subset U ⊂ C^(K×n) of K × n matrices such that
for A ∈ U, any subset of i ≤ K distinct functions from the K functions A·f has a
nonempty solution set on X_reg, which is smooth and of dimension N - i.

Proof.  This follows from application of Theorem A.8.2 to the closure of the image
of X in P^(n-1).  •

A.8.4  Some Genericity Properties

We have the important generalization of Theorem A.7.1.

Theorem A.8.7 (Simple Bertini Theorem for Systems)  Let f_1, ..., f_n be a
system of algebraic functions on an irreducible quasiprojective algebraic set X with
solution set V(f). For each s × n matrix A ∈ C^(s×n), let

           [ F_1(A, x) ]
F(A, x) := [     :     ] = A·f(x)
           [ F_s(A, x) ]

on C^(s×n) × X. There is a Zariski open dense set U ⊂ C^(s×n) of s × n matrices such
that for A ∈ U, it follows that V(A·f) ⊂ X is a quasiprojective set such that if
Z_A := V(A·f) \ V(f) is nonempty, then dim Z_A = dim X - s, and Z_A ∩ X_reg is
smooth with the differentials dF_i spanning the normal bundle of Z_A ∩ X_reg. Moreover
the number of components of Z_A ∩ X_reg is independent of A ∈ U.

Proof.  The set U' of A with rank equal to min{s, n} is dense and Zariski open.
Therefore, by replacing any dense Zariski open set U ⊂ C^(s×n) that is constructed
below with its intersection with U', we may assume that all A ∈ U have rank equal
to min{s, n}.
By replacing X with X \ V(f), we can assume that the f_i have no common zeros.
Denote V(F(A, x)) ⊂ C^(s×n) × X by Z_F. By Lemma A.7.2, Z_F is irreducible of
dimension dim X + s(n-1). By Lemma A.7.3, Z_F ∩ (C^(s×n) × X_reg) is smooth.
Let π_1 : Z_F → C^(s×n) denote the algebraic map induced by the product projection
C^(s×n) × X → C^(s×n) and let π_2 : Z_F → X denote the algebraic map induced by the
product projection C^(s×n) × X → X.
Either π_1 restricted to Z_F is dominant or it is not dominant. If it is not dominant,
then we are done since the Z_A are empty for A in a Zariski open dense set of C^(s×n).
Therefore we may assume without loss of generality that π_1 restricted to Z_F is
dominant.
By Corollary A.4.12, there is a Zariski open set U ⊂ C^(s×n) such that for y ∈ U,
all components of π_1^(-1)(y) have dimension

dim Z_F - dim C^(s×n) = s(n-1) + dim X - sn = dim X - s.

From Theorem A.4.10, we conclude there is a Zariski open dense set U ⊂ C^(s×n) such
that for A ∈ U and Z_A := π_2(π_1^(-1)(A)),

Z_A ∩ X_reg

is smooth with the differentials dF_i spanning the normal bundle of Z_A ∩ X_reg.  •
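A concrete instance makes the extra component Z_A tangible. The system below is an illustrative choice of ours, not an example from the book: for f = (x·y, x·z) on X = C^3, the solution set V(f) is the plane x = 0 together with the line y = z = 0. Randomizing with a 1 × 2 matrix A = [a_1, a_2] gives A·f = x·(a_1 y + a_2 z), so Z_A = V(A·f) \ V(f) is the extra hyperplane a_1 y + a_2 z = 0, which has exactly the predicted dimension dim X − s = 3 − 1 = 2.

```python
# Randomizing f = (x*y, x*z) with a 1 x 2 matrix: A.f = x*(a1*y + a2*z),
# and the extra component Z_A is the plane a1*y + a2*z = 0.
import random

random.seed(2)
a1, a2 = random.random(), random.random()

def Af(x, y, z):
    return a1 * (x * y) + a2 * (x * z)

# A random point of the extra component Z_A: pick x, y freely, solve for z.
x, y = random.random(), random.random()
z = -a1 * y / a2
print(abs(Af(x, y, z)) < 1e-12)   # True: the point satisfies A.f = 0 ...
print(abs(x * y) > 1e-12)         # True: ... but it does not lie on V(f)
```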

A.9  Bertini's Theorem and Some Consequences

In this section we present a general Bertini Theorem about the solution sets of
systems.
Since the intersection of any finite number of dense Zariski open sets is Zariski
open and dense, we can (and typically do) apply Bertini's Theorem to conclude that
a generic choice of some parameters leads to a long list of generic properties. To state
such a result succinctly, let us define the constellation of algebraic sets associated to
a finite number of quasiprojective subsets X_1, ..., X_r of a quasiprojective set X to
be the collection of sets obtained by repeatedly doing, in any order, the operations
of

(1) taking irreducible components;
(2) taking intersections;
(3) taking the singular set of a quasiprojective algebraic set;
(4) taking finite unions; and
(5) given two sets A, B, taking the set A \ A ∩ B.
Lemma A.9.1  The constellation of algebraic sets, C, associated to a finite number of quasiprojective subsets X_1, ..., X_r of an algebraic set X is a finite set of
quasiprojective sets.

Proof.  All these operations start with quasiprojective algebraic sets and produce
quasiprojective algebraic sets.
To prove that C is finite, it suffices to show that the set of all the irreducible
components of the quasiprojective sets obtained by these operations is finite.
Since an irreducible quasiprojective algebraic set A minus a proper algebraic
subset remains irreducible, the last operation leads only to the finite number of
quasiprojective algebraic sets A \ A ∩ B for the collection of quasiprojective sets
A, B generated by the first four operations. Thus it suffices to prove the finiteness of
the collection of sets generated from X_1, ..., X_r by a repeated use of the operations
(1), (2), (3), and (4).
Since any intersection is a finite union of intersections of irreducible quasiprojective algebraic sets, it suffices to prove the finiteness of the collection of sets obtained
by starting with X_1, ..., X_r and repeatedly doing only the operations (1), (2), and
(3). Note that the operations of taking intersections of irreducible sets and taking
singular sets decrease dimension if they lead to anything new. Thus, by the fact
that dimension is finite, the operations (1), (2), and (3) lead to only a finite number
of quasiprojective sets.  •

Let g_1, ..., g_s be a set of algebraic functions defined on a quasiprojective algebraic set X. Denote the solution set of all the functions, i.e., V(g_1, ..., g_s), by V(g).
We say that g_1, ..., g_s are simply generic with respect to a k-dimensional irreducible
algebraic set Z ⊂ X if given any integers 1 ≤ i_1 < ... < i_r ≤ s, it follows that

(1) if r > k then V(g_(i_1), ..., g_(i_r)) ∩ Z ⊂ V(g); and
(2) if r ≤ k then either V(g_(i_1), ..., g_(i_r)) ∩ (Z \ V(g)) is empty or

dim V(g_(i_1), ..., g_(i_r)) ∩ (Z \ V(g)) = k - r

and V(g_(i_1), ..., g_(i_r)) ∩ (Z_reg \ V(g)) is smooth with the differentials dg_(i_1), ..., dg_(i_r)
having rank r in the tangent space T_(Z,x) for any x ∈ V(g_(i_1), ..., g_(i_r)) ∩ (Z_reg \ V(g)).
Given an s × n complex matrix

A := [ λ_(1,1) ... λ_(1,n) ]
     [    :    ...    :    ]
     [ λ_(s,1) ... λ_(s,n) ]

the s × b submatrix A(j_1, ..., j_b) ∈ C^(s×b) of A ∈ C^(s×n) associated to the list of
integers 1 ≤ j_1 < ... < j_b ≤ n is defined to be

A(j_1, ..., j_b) := [ λ_(1,j_1) ... λ_(1,j_b) ]
                    [     :     ...     :     ]
                    [ λ_(s,j_1) ... λ_(s,j_b) ]
The following Bertini theorem expands on the conclusions reached in § A.7.3.

Theorem A.9.2 (Bertini Theorem for Constellations)  Let f_1, ..., f_n be a
set of algebraic functions on a quasiprojective set X. Given any finite number
A_1, ..., A_m of quasiprojective subsets of X, let C denote the constellation of quasiprojective sets associated to

(1) the sets A_1, ..., A_m;
(2) all irreducible components of X; and
(3) all sets of the form V(f_(j_1), ..., f_(j_b)) for the lists of integers 1 ≤ j_1 < ... < j_b ≤ n.

Then there is a Zariski dense open set U ⊂ C^(s×n), such that for A ∈ U and any list
1 ≤ j_1 < ... < j_b ≤ n, the functions

[ g_1 ]                       [ f_(j_1) ]
[  :  ] := A(j_1, ..., j_b) · [    :    ]
[ g_s ]                       [ f_(j_b) ]

are simply generic with respect to every irreducible set in C.

Proof.  Since the intersection of dense Zariski open subsets of C^(s×n) is dense and
Zariski open, it suffices to prove the result for a single irreducible set Z ∈ C of some
dimension k.
Further, if we showed that for a given list 1 ≤ i_1 < ... < i_r ≤ s, the result is
true for g_(i_1), ..., g_(i_r) where

[ g_1 ]                       [ f_(j_1) ]
[  :  ] := A(j_1, ..., j_b) · [    :    ]
[ g_s ]                       [ f_(j_b) ]

with A in a dense Zariski open set U(i_1, ..., i_r; j_1, ..., j_b) ⊂ C^(s×n), we will be done
by taking the intersection of these open sets indexed by the finite number of lists
of integers 1 ≤ i_1 < ... < i_r ≤ s and 1 ≤ j_1 < ... < j_b ≤ n.
Therefore, by renaming, it suffices to prove that there is a dense Zariski open set
U ⊂ C^(s×n) for any s ≤ n such that setting

[ g_1 ]      [ f_1 ]
[  :  ] := A·[  :  ],
[ g_s ]      [ f_n ]

it follows that

(1) if s > k then V(g_1, ..., g_s) ∩ Z ⊂ V(f); and
(2) if s ≤ k then dim V(g_1, ..., g_s) ∩ (Z \ V(f)) = k - s and V(g_1, ..., g_s) ∩
(Z_reg \ V(f)) is smooth with the differentials dg_1, ..., dg_s having rank s in the
tangent space T_(Z,x) for any x ∈ V(g_1, ..., g_s) ∩ (Z_reg \ V(f)).

These assertions follow immediately from Theorem A.8.7.  •

There are many versions of Bertini's Theorem in the literature, e.g., (Example
12.1.11 Fulton, 1998). For a further discussion of Bertini theorems, see also (§ 1.7
Beltrametti & Sommese, 1995).

A.10  Some Useful Embeddings

There are some natural embeddings of algebraic sets that are useful. For simplicity, we give versions for projective algebraic sets, though similar constructions are
equally useful for affine algebraic sets.

A.10.1  Veronese Embeddings

Let N and d be positive integers. The Veronese embedding is the natural embedding

P^N → P^((N+d choose d) - 1)

obtained by sending the point [z_0, ..., z_N] to the point with homogeneous coordinates made out of all the monomials {z^J : |J| = d}, where we use multidegree
notation. The restrictions of the linear equations of P^((N+d choose d) - 1) to the image of the
Veronese embedding give all the degree d equations of P^N.
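Listing the degree-d monomials is a one-liner with itertools, which also confirms the count (N+d choose d) of homogeneous coordinates of the target space. The sketch below is our own illustration (the function name is hypothetical).

```python
# The Veronese embedding: all monomials of degree d in the N + 1 homogeneous
# coordinates.  For N = 1, d = 2 the image point is (z0^2, z0*z1, z1^2).
from itertools import combinations_with_replacement
from math import comb

def veronese(z, d):
    out = []
    for idx in combinations_with_replacement(range(len(z)), d):
        m = 1
        for i in idx:
            m *= z[i]
        out.append(m)
    return out

z = [2.0, 3.0]                 # a point [z0, z1] of P^1
print(veronese(z, 2))          # [4.0, 6.0, 9.0] = [z0^2, z0*z1, z1^2]
N, d = 3, 2
print(len(veronese([1.0] * (N + 1), d)) == comb(N + d, d))   # True
```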

A.10.2  The Segre Embedding

Let N_1, ..., N_r be r positive integers. The Segre embedding is the natural embedding

P^(N_1) × ... × P^(N_r) → P^((N_1+1)···(N_r+1) - 1)

given by sending the point [z_(1,0), ..., z_(1,N_1); ...; z_(r,0), ..., z_(r,N_r)] to the point with homogeneous coordinates made out of all the monomials z_(1,i_1) ··· z_(r,i_r).

Remark A.10.1  The degree of the image of the Segre embedding of the multiprojective space P^(N_1) × ... × P^(N_r) in P^((N_1+1)···(N_r+1) - 1) is the multihomogeneous Bézout
number for the system with N_1 + ... + N_r equations all of type (1, ..., 1). This may be
checked, e.g., using Equation 8.4.15, to be

( N_1 + ... + N_r )     (N_1 + ... + N_r)!
(                 )  =  ------------------ .
(  N_1, ..., N_r  )       N_1! ··· N_r!
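The multinomial formula of Remark A.10.1 is easy to compute directly; the sketch below (with a hypothetical function name) evaluates it for two small cases, including the quadric surface of Example A.10.2.

```python
# The degree of the Segre image: the multinomial (N1+...+Nr)! / (N1!...Nr!).
from math import factorial

def segre_degree(dims):
    deg = factorial(sum(dims))
    for n in dims:
        deg //= factorial(n)
    return deg

print(segre_degree([1, 1]))      # 2: the quadric surface P^1 x P^1 in P^3
print(segre_degree([1, 2]))      # 3: the Segre threefold P^1 x P^2 in P^5
```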
On a theoretical level, the Segre embedding shows that subsets of multiprojective spaces defined by multihomogeneous equations may be regarded as projective
algebraic sets.
One case is of special interest.

Example A.10.2 (The Quadric Surface)  Let S := P^1 × P^1 with bihomogeneous
coordinates [z_(1,0), z_(1,1); z_(2,0), z_(2,1)]. Let [w_0, w_1, w_2, w_3] denote the homogeneous coordinates on P^3. The Segre embedding of P^1 × P^1 into P^3 given by

[w_0, w_1, w_2, w_3] := [z_(1,0) z_(2,0), z_(1,1) z_(2,0), z_(1,0) z_(2,1), z_(1,1) z_(2,1)]

has as image the smooth quadric V(w_0 w_3 - w_1 w_2).
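The defining equation of the image in Example A.10.2 can be spot-checked at random points, since w_0 w_3 − w_1 w_2 = (z_(1,0) z_(2,0))(z_(1,1) z_(2,1)) − (z_(1,1) z_(2,0))(z_(1,0) z_(2,1)) vanishes identically:

```python
# Random-point verification that the Segre image of P^1 x P^1 lies on the
# quadric w0*w3 - w1*w2 = 0.
import random

random.seed(3)
for _ in range(5):
    z10, z11, z20, z21 = (random.uniform(-1, 1) for _ in range(4))
    w0, w1, w2, w3 = z10 * z20, z11 * z20, z10 * z21, z11 * z21
    assert abs(w0 * w3 - w1 * w2) < 1e-12
print("all sample points lie on the quadric")
```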


Algebraic Geometry 335

The Segre embedding is useful because it gives a consistent way of measuring
the degrees of pure-dimensional algebraic sets on P^(N_1) × ... × P^(N_r).

Remark A.10.3 (Measuring Degrees)  Measuring degree by using the Segre embedding gives the smallest possible values of all the consistent ways of measuring the
degrees of pure-dimensional algebraic sets. Other consistent ways may be obtained
by using the Veronese embedding on the different projective spaces followed by a
Segre embedding. In the language of line bundles mentioned briefly in § A.13, such
a choice is equivalent to choosing an ample line bundle L on M := P^(N_1) × ... × P^(N_r)
and then defining the L-degree deg_L(X) of a pure k-dimensional X ⊂ M to be
c_1(L)^k · X, where c_1(L) is the first Chern class of L.

A.10.3  The Secant Variety

To derive properties of generic projections, we need to define the secant variety of
an affine algebraic set X. Let X be an irreducible affine algebraic subset of C^N.
Given two distinct points x, y ∈ X, we have a unique line between them, parameterized by u ∈ C as (1-u)x + uy. Let Δ denote the diagonal

Δ := {(z, w) ∈ X × X | z = w}.

Then the image of the map f : (X × X \ Δ) × C → C^N, defined by f(x, y, u) = (1-u)x + uy, is a constructible set by Theorem 12.5.6. The secant variety of X, denoted
Sec(X), is the closure in C^N of the image of this map. By Corollary 12.5.7, Sec(X)
is an irreducible affine algebraic set. By Lemma 12.5.2, dim Sec(X) ≤ 2 dim X + 1.
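The bound dim Sec(X) ≤ 2 dim X + 1 is attained, for instance, by the twisted cubic curve (an illustrative choice of ours, not an example from the book). The sketch below checks numerically that the secant map f(s, t, u) = (1−u)γ(s) + uγ(t), with γ(s) = (s, s^2, s^3), has a 3 × 3 Jacobian of full rank at a sample point, so Sec(X) there has the maximal dimension 3 = 2·1 + 1.

```python
# Jacobian of the secant map of the twisted cubic at (s, t, u) = (0.5, 1.5, 0.3):
# columns (1-u)*g'(s), u*g'(t), g(t)-g(s); a nonzero determinant means the
# secant variety is 3-dimensional near this point.
def g(s):  return (s, s * s, s ** 3)
def dg(s): return (1.0, 2 * s, 3 * s * s)

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

s, t, u = 0.5, 1.5, 0.3
cols = ([(1 - u) * v for v in dg(s)],           # df/ds
        [u * v for v in dg(t)],                 # df/dt
        [a - b for a, b in zip(g(t), g(s))])    # df/du
M = [[cols[j][i] for j in range(3)] for i in range(3)]
det = det3(M)
print(det)    # -0.21: full rank, so the secant map is locally dominant
```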

Lemma A.10.4  Let X be an irreducible affine algebraic subset of C^N. If N >
2 dim X + 1, then a generic linear projection π : C^N → C^(N-1) applied to X is
one-to-one.

Proof.  Embed C^N into P^N by the map (z_1, ..., z_N) → [x_0, ..., x_N] = [1, z_1, ..., z_N].
Let H_0 := V(x_0) denote the hyperplane at infinity. Then the closure S̄ of
Sec(X) in P^N meets H_0 in a proper algebraic subset of S̄. Thus dim H_0 ∩ S̄ <
dim S̄. So we conclude that

dim H_0 ∩ S̄ ≤ dim Sec(X) - 1 ≤ 2 dim X < N - 1 = dim H_0.

Thus H_0 ⊄ S̄. Therefore, we can choose a point p ∈ H_0 not in S̄. Let
π : C^N → C^(N-1) be a linear projection with fibers being lines having direction p.
No fiber can go through two distinct points of X, since if it did, p would be in
S̄ ∩ H_0. Since this is true for a Zariski open set of p ∈ H_0, it follows from § A.8
that it is true for a Zariski open set of projections.  •
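An empirical look at Lemma A.10.4 (with illustrative data of our own, not from the book): for the curve X = {(t, t^2, t^3, t^4)} in C^4 we have N = 4 > 2 dim X + 1 = 3, and a fixed random projection C^4 → C^3 remains one-to-one on a grid of sample points.

```python
# A random 3 x 4 projection applied to sample points of the curve
# (t, t^2, t^3, t^4): distinct samples stay distinct in the image.
import random

random.seed(5)
P = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]

def proj(v):
    return tuple(sum(P[i][j] * v[j] for j in range(4)) for i in range(3))

pts = [(t, t**2, t**3, t**4) for t in [k / 10 for k in range(-20, 21)]]
imgs = [proj(v) for v in pts]
sep = min(max(abs(a - b) for a, b in zip(p, q))
          for i, p in enumerate(imgs) for q in imgs[i + 1:])
print(sep > 1e-8)    # True: the projection separates all sampled points
```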

A.10.4  Some Genericity Results

Let X ⊂ C^N be an irreducible affine algebraic set. Let π : C^N → C^m be a linear
projection. The restriction π_X of π to X does not have to be proper. For example,
the map from the hyperbola V(x_1 x_2 - 1) to the x_1 axis has as image the complement
of the origin and is therefore not proper. It is a part of the Noether Normalization
Theorem 12.1.5 that if π is general, then the restriction π_X of π to X is proper.
We now go over the geometric proof behind a form of Theorem 12.1.5, which
includes some degree information.

Theorem A.10.5 (Noether Normalization Theorem)  Let X ⊂ C^N denote
an affine algebraic set. Let π : C^N → C^k denote a generic linear projection. Then
if dim X ≤ k, the map π_X is a proper algebraic map with all fibers π_X^(-1)(y) finite for
all y ∈ Y := π(X).
If dim X < k, then there is a Zariski dense subset U ⊂ X such that π_U : U →
π(U) is an isomorphism. If X is of pure dimension k, then π_X is a branched
covering of degree deg X.

Proof.  Let X̄ denote the closure of X in P^N. Here we embed C^N by sending
(x_1, ..., x_N) ∈ C^N to [z_0, ..., z_N] = [1, x_1, ..., x_N] ∈ P^N. As above, we let H_∞
denote the hyperplane at infinity, i.e., the P^(N-1) ⊂ P^N equal to V(z_0). Linear
projections C^N → C^k correspond to (N-k-1)-dimensional linear subspaces
L_(N-k-1) ⊂ H_∞. Fixing a general k-dimensional linear subspace S_k ⊂ P^N, the
map π_L : P^N → P^k associated to L, an L_(N-k-1) ⊂ H_∞, sends x ∈ P^N \ L_(N-k-1) to
π_L(x) = S_k ∩ ⟨x, L_(N-k-1)⟩. If L does not meet the projective algebraic set X̄ \ X,
then π_L is proper when restricted to X. Since dim X̄ \ X < dim X ≤ k, we conclude that the set of L ⊂ H_∞ that meet X̄ \ X is a proper algebraic subset A of
the Grassmannian Gr(N-k-1, N-1) of linear P^(N-k-1)'s in H_∞ ≅ P^(N-1). This implies properness
of the restrictions to X of projections π_L with L in the complement of A.
If a fiber of π_L on X were not finite, then since the restriction of π_L is proper,
we would have a positive-dimensional compact projective subset of X. This is absurd
by Lemma 12.4.3.
If dim X < k, then it is sufficient to show that given a general point x of an
irreducible component of X, a general L' ≅ P^(N-k) containing x meets X in no
other points and the map associated to H_∞ ∩ L' has maximal rank at x. This
makes sense since the general point of an irreducible quasiprojective algebraic set
is smooth. This follows from Theorem A.8.2.
If X is of pure dimension k, then a general L' ≅ P^(N-k) meets X in deg X points.
The general map associated to L = L' ∩ H_∞ has degree deg X.  •
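The branched-covering statement can be seen numerically on a small example of our own choosing (not from the book): X = V(y^2 − x^3 − 1) ⊂ C^2 is pure 1-dimensional of degree 3, and a generic linear projection such as (x, y) → x + 2y has fibers of exactly deg X = 3 points. Substituting y = (c − x)/2 into the curve equation gives the cubic 4x^3 − x^2 + 2cx + (4 − c^2) = 0 for the fiber over c.

```python
# Generic fibers of the projection (x, y) -> x + 2y restricted to the cubic
# curve y^2 = x^3 + 1: each fiber has deg X = 3 points.
import numpy as np

def fiber(c):
    # roots of 4x^3 - x^2 + 2cx + (4 - c^2) = 0, then recover y = (c - x)/2
    xs = np.roots([4.0, -1.0, 2.0 * c, 4.0 - c * c])
    return [(x, (c - x) / 2.0) for x in xs]

for c in (0.37 + 0.2j, 1.5, -2.1):
    pts = fiber(c)
    assert len(pts) == 3
    for x, y in pts:                      # each point really lies on the curve
        assert abs(y * y - x ** 3 - 1) < 1e-8
print("each generic fiber has 3 = deg X points")
```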

N + 1 different projections may be used to separate points.

Lemma A.10.6  Let X be an affine algebraic subset of C^N, all of whose irreducible
components are of dimension ≤ k. Fix a finite set S ⊂ C^N. For a general linear
projection π : C^N → C^m with m ≥ k + 1, π(x) = π(y) for x ∈ S and y ∈ X ∪ S
implies that x = y.

Proof.  Since the lemma is vacuous if m = N, we can assume that m ≤ N - 1 and
thus that k ≤ N - 2. We can reduce by induction to the case when m = N - 1.
Let H_∞ ≅ P^(N-1) denote the hyperplane at infinity in P^N. Let y be a point
of S. If y ∉ X, consider the map φ_y : X → H_∞ given by sending x ∈ X
to the point φ_y(x) equal to the intersection of H_∞ with the line spanned by
x and y. If y ∈ X ∩ S, let φ_y : X \ {y} → H_∞ be the analogous map.
The union T of the closures of the images of these maps as y runs over the
set S has dimension at most dim X. Since dim X ≤ k ≤ N - 2 < dim H_∞, we conclude
that the projection corresponding to a general point of H_∞ \ T has the desired
properties.  •

The following result is classical (p. 7 Mumford, 1995). A proof follows, e.g., from
the construction given in § 15.5.4.

Theorem A.10.7  Given a pure (N-1)-dimensional affine set A ⊂ C^N (respectively, projective set A ⊂ P^N), there is a polynomial p(z) on C^N (respectively, a
homogeneous polynomial p(z) on P^N) of degree deg A with V(p) = A. A is irreducible if and only if p(z) is irreducible in the sense that p(z) does not factor as a
product of two polynomials both of strictly lower degree.

Given a reduced affine algebraic set X, the following classical lemma lets us
construct polynomials whose set of common zeros is the underlying set of X.

Lemma A.10.8  Let X be an affine subset of C^N, all of whose irreducible components are of dimension k < N. Given N + 1 generic projections π_i : C^N → C^(k+1)
with q_i the defining degree deg X polynomial of the closure of π_i(X) for i = 0, ..., N, the set of common
zeros of the polynomials q_0(π_0(x)), ..., q_N(π_N(x)) is X.

Proof.  Choose a generic projection π_N : C^N → C^(k+1) and let q_N be the defining
degree deg X polynomial of the closure of π_N(X). Then q_N(π_N(x)) vanishes on an (N-1)-dimensional
set X_N containing X. Let S be a finite set consisting of one point from each
irreducible component of X_N \ X. Choose a generic projection π_(N-1) : C^N → C^(k+1).
By Lemma A.10.6, π_(N-1)(S) ∩ π_(N-1)(X) = ∅, and thus the set of common zeros
X_(N-1) of q_N(π_N(x)), q_(N-1)(π_(N-1)(x)) minus X is of dimension at most N - 2. This
step can be repeated, in an induction, to give the conclusion of the lemma.  •

A.11  The Dual Variety

In classical projective geometry there is a simple but basic duality between points
and hyperplanes. To make this precise, let P^N denote the N-dimensional projective
space. A point is represented in homogeneous coordinates by an (N+1)-tuple
[z_0, ..., z_N]. A hyperplane is represented by a linear equation a_0 z_0 + ... + a_N z_N = 0
with not all coefficients zero. Since multiplying a linear defining equation of
a hyperplane by a nonzero constant does not change the hyperplane, we see that there is a one-to-one
correspondence of hyperplanes with points in a projective space represented in the
homogeneous coordinates [a_0, ..., a_N]. This second projective space is referred to
as the dual projective space P^N*. Note the relationship is completely symmetric,
i.e., P^N**, the dual of the dual of P^N, is just P^N.
The family of hyperplanes containing a P^(N-2) ⊂ P^N corresponds to a line in P^N*.
With such a duality, it is natural to try to extend it to subsets of projective
space besides points and linear spaces. To see how this might be done, let C be an
irreducible curve in P^2. If C is smooth we can send x ∈ C to the point in P^2*
representing the line in P^2 tangent to C at x. This curve C' is called the dual curve
of C, despite the fact that if C is a line then C' is a point. If C is a singular curve we
can define C' as the closure of the image of the smooth points. For such a singular
curve, the map C → C' is a rational mapping but not necessarily a function.
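As a hedged, concrete example (our own choice, not from the book): for the parabola C = V(y − x^2), the tangent line at (t, t^2) is −t^2 z_0 + 2t z_1 − z_2 = 0 in the coordinates [z_0, z_1, z_2] with (x, y) = [1, x, y], so the dual point is [a_0, a_1, a_2] = [−t^2, 2t, −1]. These points sweep out the conic a_1^2 = 4 a_0 a_2, so C' is again a conic.

```python
# The dual curve of the parabola y = x^2: tangent lines at sample points give
# dual points [-t^2, 2t, -1], all lying on the conic a1^2 - 4*a0*a2 = 0.
for t in (-1.5, -0.3, 0.0, 0.7, 2.0):
    a0, a1, a2 = -t * t, 2 * t, -1.0
    # the tangent line passes through its point of tangency [1, t, t^2] ...
    assert abs(a0 * 1 + a1 * t + a2 * (t * t)) < 1e-12
    # ... and the dual point lies on the conic a1^2 = 4*a0*a2
    assert abs(a1 * a1 - 4 * a0 * a2) < 1e-12
print("the dual of the parabola is the conic a1^2 = 4*a0*a2")
```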
This duality makes sense in general. Given an irreducible projective algebraic
set X ⊂ P^N, define the dual variety X' as the closure in P^N* of the set consisting
of hyperplanes which contain at least one tangent space of some smooth point of
X. We can similarly define the dual of an algebraic set.
There is a strong result about dual algebraic sets in complex projective space.

Theorem A.11.1  Let X be an irreducible subset of P^N. Then (X')' = X.

Proof.  (Kleiman, 1986) is a good reference for this result and related material.  •
Note this result says that in the case when X is a curve in P^2 and not a line,
the rational map X → X' gives an isomorphism from a Zariski open set of X to a
Zariski open set of X'. To see this, note that the image X' of X is either a point or a
curve. If it is a point, then X = (X')' is a line. So we have that if X is not a line,
its image is a curve. The rational mapping X' → X is a well-defined map on the
smooth points of X'. From this we conclude that X → X' could not be r to one
for an r > 1. We need a special consequence of this result.
Corollary A.11.2  Let C be a pure dimension-one, not necessarily irreducible,
algebraic subset of P^2. Assume that C has no irreducible components of degree one.
Then C'' = C. Further, let x be a general point of any one of the components
of C with the tangent line ℓ to C at x. Then the defining equation of C given by
Theorem A.10.7 restricted to ℓ has x as a zero of multiplicity two with all other
zeros of multiplicity one.

Proof.  Since D'' = D for each irreducible component D of C, and the components of
C are all of degree greater than one, we have from Theorem A.11.1 that the images
of the components of C are distinct irreducible curves. Choosing a general point x
of a component D, we get a general point of a component D' of C'. This implies
that any line ℓ tangent to a general point of a component D of C corresponds to
a point of P^2* not on any component of C' other than D'. In particular, ℓ must be
transverse to C away from x. The condition that a neighborhood of x on C'' goes
isomorphically to a neighborhood of x on C is equivalent to the fact that x is a
multiplicity-two zero of the restriction to ℓ of the defining equation of C.  •

A.12  A Monodromy Result

Let X be a pure k-dimensional affine algebraic subset of C^N and let Gr(m, N)
denote the Grassmannian of P^m's in P^N. We close X up to get a pure k-dimensional
projective algebraic set X̄ ⊂ P^N. We consider the family of intersections L_(N-k) ∩ X̄
for (N-k)-dimensional linear spaces L_(N-k) ⊂ P^N. The set of pairs

F := {(L_(N-k), x) ∈ Gr(N-k, N) × X̄ | x ∈ L_(N-k) ∩ X̄}

is a projective algebraic set. This is completely analogous to the simpler construction in § A.7.
We have the maps p : F → Gr(N-k, N) and q : F → X̄ induced by the product
projections on Gr(N-k, N) × X̄. Since a generic L_(N-k) meets X̄ transversely in a
set of deg X distinct points of X̄_reg, we conclude from Corollary A.4.14 that there
is a Zariski open set U ⊂ Gr(N-k, N) such that the restriction p : p^(-1)(U) → U is a finite
covering.
Fix a general point y € U, we have the monodromy action of the fundamental
group ni(U,y) on the set p~l(y)- Statements for monodromy using slices of X
follow immediately from the statements for monodromy using slices of X. Indeed,
by shrinking U further it may be assumed that q(p~1(U)) C X, and so the lemmas
and theorems we state hold equally for X and its closure X. Reflecting the bias in
this book to regard polynomial systems as being defined on Euclidean space rather
than projective space, we state the results for affine algebraic sets X in the rest of
this subsection.
Lemma A.12.1 If Xi is an irreducible component of X, then the above mon-
odromy action acts transitively on the set Xi np~1(y).

Proof. Note that q : J- —> X is a fiber bundle with the fibers isomorphic to the
Grassmannian Gr(N — k, N). Thus the set q"1{Xi) is irreducible, and therefore the
Zariski dense open subset p~l(U) D q^1(Xi) C q~l(Xi) is also irreducible. Since
y is general, p~l{y) consists of smooth points of the irreducible and hence path-
connected manifold (p^1(f7) n q~1(Xi))reg. The monodromy action under a path
connecting two distinct points of p~1(y) gives the transitivity. •
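The transitivity in Lemma A.12.1 can be watched numerically in the simplest case: slicing the irreducible curve x² = s by the vertical lines s = const and carrying the two slice points around a loop encircling the branch point s = 0 exchanges them. The following Python sketch (our illustration; the curve, step count, and Newton corrector are our choices, not part of the book's software) does the tracking:

```python
import cmath

def track_loop(steps=200):
    """Track the two roots of x^2 = s as s traverses the unit circle once.

    Start at s = 1 with roots +1 and -1; after the loop the two paths
    should have exchanged endpoints, exhibiting the monodromy transposition."""
    roots = [1.0 + 0j, -1.0 + 0j]
    for k in range(1, steps + 1):
        s = cmath.exp(2j * cmath.pi * k / steps)
        # Newton correction for f(x) = x^2 - s, using the previous
        # point on each path as the predictor.
        for i, x in enumerate(roots):
            for _ in range(5):
                x = x - (x * x - s) / (2 * x)
            roots[i] = x
    return roots

end = track_loop()
# The path that started at +1 ends near -1 and vice versa.
print(end)
```

The same swapping of endpoints is what the monodromy-based decomposition algorithms of Part III exploit to group witness points into irreducible components.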

We need a much stronger result. Choose a general affine linear subspace B :=
C^{N-k+1} ⊂ C^N containing the (N-k)-dimensional affine linear space corresponding
to the basepoint y ∈ U.

340 Numerical Solution of Systems of Polynomials Arising in Engineering and Science

Let B_0 denote the (N-k)-dimensional linear subspace of
C^N parallel to that affine linear space. Though in practice it does not matter, we should theoretically
choose the general B first and the general space corresponding to y afterwards. We
let π : C^{N-k+1} → C := C^{N-k+1}/B_0 be the induced linear map. Let L_s := π^{-1}(s).
Let V ⊂ U be the linear curve through y corresponding to the L_s. We have the
following result from (§3 Sommese et al., 2002b).

Theorem A.12.2 Let X = ∪_{i=1}^{r} X_i denote the decomposition of a pure k-dimensional
affine algebraic set X ⊂ C^N into irreducible components. We assume
that k ≥ 1. Let π^{-1}(y) = ∪_{i=1}^{r} F_i, where F_i = π^{-1}(y) ∩ X_i with π, y, and V as
above. The image of π_1(V, y) into the automorphism group of the set π^{-1}(y) induced
by the monodromy action of slices of X by the (N-k)-dimensional linear subspaces
L_s ⊂ C^N is

⊕_{i=1}^{r} Sym(F_i),

where Sym(F_i) is the symmetric group of F_i.

Proof. First we may discard all degree one components since they do not affect the
veracity of the theorem. Next we reduce to the case when k = 1. Since B is general,
we know from part 3) of Theorem 13.2.1 that each X_i ∩ B is irreducible. The map π
may now be regarded as the linear map from C^N to C with L_s of the form L_0 + sv
for a fixed vector v ∈ C^N with v ∉ L_0. By renaming if necessary, we may assume
that X is one-dimensional.

Next we take a general projection π' : C^N → C. The generic linear map Π :=
(π, π') : C^N → C^2 maps X generically one-to-one to its image by Theorem A.10.5.
Let π_i denote the projection of C^2 onto its ith factor. There is a Zariski open dense
set V' of C such that π_1^{-1}(V') ∩ Π(X) is smooth and π_1 : π_1^{-1}(V') ∩ Π(X) → V' is a
d := deg X sheeted covering map. Since π = π_1 ∘ Π, we may regard V' as an open
subset of V. Since every immersion g : S^1 → V' gives an immersion g : S^1 → V, it
suffices to prove the result for V'. This reduces us to the case of a curve in C^2 with
V a family of lines parameterized by an open Zariski dense set of a line in the dual
P^2* to the P^2 containing C^2.

This case follows in two steps. First we prove the statement for the family U
of all affine lines in C^2. This follows using Corollary A.11.2 and a modification of
the proof of the classical statement when X is an irreducible curve, e.g., (page 111
Arbarello, Cornalba, Griffiths, & Harris, 1985).

The proof for V ⊂ U follows from a theorem (Theorem, §5.2, Part II Goresky &
MacPherson, 1988) of Lefschetz type asserting that the homomorphism π_1(V, y) →
π_1(U, y) induced by the inclusion V ⊂ U is a surjection.

We refer the reader to (Sommese et al., 2002b) for a more detailed proof. □

A.13 Line Bundles and Vector Bundles

We have mentioned earlier that homogeneous functions are not functions on projective space, though they are functions on a related Euclidean space. One difficulty
posed by this is that the usual statements for algebraic functions on affine algebraic
sets are not literally true for homogeneous functions on projective space. If
homogeneous functions on projective space were the only issue, we could state the
results for polynomials with slight rewording for homogeneous functions. But, faced
with a number of very useful generalizations of homogeneous functions, e.g., bihomogeneous
and more generally multihomogeneous functions, this is not a viable
approach. In this section we first introduce bihomogeneous and multihomogeneous
polynomials, and then define line bundles and their sections.

A.13.1 Bihomogeneity and Multihomogeneity


Let X denote the product of two projective spaces, P^m × P^n. We can denote a point
in this space by an (m + n + 2)-tuple [z_0, ..., z_m; w_0, ..., w_n] with neither
z_i = 0 for all i nor w_j = 0 for all j, and with the equivalence relation

[z_0, ..., z_m; w_0, ..., w_n] ~ [λz_0, ..., λz_m; μw_0, ..., μw_n]

for all 0 ≠ λ ∈ C and 0 ≠ μ ∈ C. A polynomial p(z, w) in the variables
z_0, ..., z_m, w_0, ..., w_n is said to be bihomogeneous of degree (a, b) if it is of the
form Σ_{|I|=a, |J|=b} c_{IJ} z^I w^J. Note that since p(λz, μw) = λ^a μ^b p(z, w), it follows that
the set where p(z, w) = 0 is a well-defined subset of P^m × P^n. Similarly, we can
define multihomogeneous polynomials on P^{n_1} × ⋯ × P^{n_k}.
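For instance, p(z, w) = z_0^2 w_1 + z_0 z_1 w_0 is bihomogeneous of degree (2, 1) on P^1 × P^1, and the scaling identity p(λz, μw) = λ^a μ^b p(z, w) can be verified symbolically. A quick sympy check (our example polynomial, purely for illustration):

```python
import sympy as sp

z0, z1, w0, w1, lam, mu = sp.symbols('z0 z1 w0 w1 lam mu')

# A bihomogeneous polynomial of degree (a, b) = (2, 1) in (z; w).
p = z0**2 * w1 + z0 * z1 * w0

# Scale the z-variables by lam and the w-variables by mu simultaneously.
scaled = p.xreplace({z0: lam * z0, z1: lam * z1, w0: mu * w0, w1: mu * w1})

# p(lam*z, mu*w) should equal lam^2 * mu * p(z, w).
identity = sp.simplify(scaled - lam**2 * mu * p)
print(identity)  # 0
```

Because only the ratio classes [z] and [w] matter, this identity is exactly what makes V(p) well defined on the product of projective spaces.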

A.13.2 Line Bundles and Their Sections


First, let's consider the case of C^N. If we have a polynomial p(z) on C^N, we can
think of p(z) in terms of its graph σ_p := {(z, λ) ∈ C^N × C | λ = p(z)}. We say
that C^N × C is the trivial line bundle on C^N and σ_p is a section. In loose terms, a
line bundle over X is a quasiprojective algebraic set which maps onto X with fibers
identified with C in such a way that the vector space structure on C is preserved.

Precisely, we can define line bundles on any quasiprojective algebraic set X. A
line bundle L on X consists of the data

(1) U_0, ..., U_ℓ, a covering of X by affine Zariski open sets U_i dense in X;
(2) for each 0 ≤ i ≤ ℓ, 0 ≤ j ≤ ℓ, an algebraic function ρ_{ij} defined and nowhere
zero on U_{ij} := U_i ∩ U_j with ρ_{ij}ρ_{ji} = 1 on U_{ij} and ρ_{ii} = 1 on U_i; and
(3) ρ_{ij}ρ_{jk} = ρ_{ik} on U_i ∩ U_j ∩ U_k for all 0 ≤ i ≤ ℓ, 0 ≤ j ≤ ℓ, 0 ≤ k ≤ ℓ.

Associated to a line bundle is a space generalizing the trivial bundle. The space,
also called L by abuse of notation, is covered by open sets U_i × C where for x ∈ U_{ij}
we identify (x, λ_i) ∈ U_i × C with (x, λ_j) ∈ U_j × C if λ_j = ρ_{ij}(x)λ_i. The cocycle

condition ρ_{ij}ρ_{jk} = ρ_{ik} guarantees the identifications are well defined. There is a
further natural, but involved, definition of when different covers and choices of ρ_{ij}
lead to the "same" line bundle. Sections, which are basically like graphs of functions,
are defined as a choice of algebraic functions σ_i : U_i → C with the property that
σ_j(x) = σ_i(x)ρ_{ij}(x) for x ∈ U_{ij} and all 0 ≤ i ≤ ℓ, 0 ≤ j ≤ ℓ.

An algebraic line bundle L on a quasiprojective algebraic set X is spanned by a
vector space V of global sections of L if for each point x ∈ X, there is at least one
section s ∈ V such that s(x) ≠ 0.

For example, letting [z_0, z_1] denote homogeneous coordinates on P^1, we may
cover P^1 with U_0 := P^1 \ V(z_1) and U_1 := P^1 \ V(z_0). We have the coordinate
z := z_0/z_1 on U_0 and w := 1/z = z_1/z_0 on U_1. We may form the line bundle O_{P^1}(d)
by taking the data consisting of the function ρ_{01} = 1/z^d on U_0 ∩ U_1. We may
regard a homogeneous polynomial p(z_0, z_1) of degree d ≥ 0 as a section of O_{P^1}(d)
by assigning σ_0(z) := p(z, 1) to U_0 and σ_1(z) := p(1, 1/z) to U_1; for convenience we
are working with the z coordinate. Note, as required,

σ_1(z) = p(1, 1/z) = z^{-d} p(z, 1) = ρ_{01}(z)σ_0(z).

Note that O_{P^1}(0) is the trivial bundle, and for d > 0 the bundles O_{P^1}(d) are spanned.
There are no other sections of O_{P^1}(d) besides the ones just constructed using the
homogeneous polynomials.
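The transition rule just displayed is easy to verify symbolically for a concrete polynomial. For d = 2 and p(z_0, z_1) = z_0^2 + 2z_0z_1 + 3z_1^2 (our choice of example), a sympy sketch checks σ_1 = ρ_{01}σ_0 on the overlap U_0 ∩ U_1:

```python
import sympy as sp

z0, z1, z = sp.symbols('z0 z1 z')
d = 2
p = z0**2 + 2 * z0 * z1 + 3 * z1**2  # homogeneous of degree d

sigma0 = p.subs([(z0, z), (z1, 1)])       # local representative on U0
sigma1 = p.subs([(z0, 1), (z1, 1 / z)])   # local representative on U1
rho01 = 1 / z**d                          # transition function on U0 ∩ U1

# The two local representatives must agree under the transition rule.
check = sp.simplify(sigma1 - rho01 * sigma0)
print(check)  # 0
```

The check succeeds precisely because p is homogeneous of degree d; an inhomogeneous p would leave a nonzero remainder.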
On P^N, the line bundles are not much more complicated than the ones just
constructed for P^1. They are in one-to-one correspondence with the integers d,
with the line bundle corresponding to d being denoted O_{P^N}(d). For d < 0 the only
algebraic section of O_{P^N}(d) is the 0-section, i.e., the choice of a cover U_i of P^N and
σ_i = 0 for all i. For d = 0 we have the trivial bundle, whose only sections are
the constant functions, and for d > 0 the algebraic sections are again in one-to-one
correspondence with the homogeneous polynomials of degree d.

It turns out that up to equivalence the only algebraic line bundle on C^N is
the trivial line bundle.

Any algebraic line bundle L on an irreducible projective algebraic set X gives rise
to a well-defined element c_1(L) in the second integral cohomology group H^2(X, Z) of
X. This element c_1(L) is called the first Chern class of L. If L has a not identically
zero section s, then c_1(L) is Poincaré dual to the zero set Z of s.

Let us assume we have line bundles L_1, ..., L_N on an irreducible projective
algebraic set X of dimension N. If the line bundles are spanned by global sections,
then given general sections s_i of L_i for i = 1, ..., N, it follows that the system

s_1(z) = 0
    ⋮          (A.13.6)
s_N(z) = 0

has exactly (c_1(L_1) ⋯ c_1(L_N))[X] isolated solutions and they are all nonsingular.

For example, if X = P^N and L_i = O_{P^N}(d_i), then the s_i are homogeneous polynomials
of degree d_i, and we have the classical Bezout Theorem.

A.13.3 Some Remarks on Vector Bundles


Replacing C in the definition of line bundles by C^r, and letting the ρ_{ij} be invertible
r × r matrix-valued holomorphic functions, we end up with the definition of a vector
bundle of rank r. In terms of this definition, given line bundles L_1, ..., L_N on
an irreducible projective algebraic set X of dimension N, and sections s_i of L_i
for i = 1, ..., N, it follows that the system given by Equation A.13.6 is equivalent
to s = 0, where s = s_1 ⊕ ⋯ ⊕ s_N is the section of the rank N vector bundle
E := L_1 ⊕ ⋯ ⊕ L_N obtained by taking the direct sum of the N line bundles
L_i. The cohomology class c_1(L_1) ⋯ c_1(L_N) ∈ H^{2N}(X, Z) is just the Nth Chern class
c_N(E) of E, and the Bezout number for the system is just c_N(E)[X]. Such numbers
are very often easy to compute.

As a concrete example, we give the simplest nontrivial system on C^N arising
as a section of a rank N bundle on P^N restricted to C^N. For the bundle we take
the tangent bundle T_{P^N} of P^N. The Bezout number for the system associated to a
general section s of T_{P^N} is N + 1. Written in terms of coordinates x_1, ..., x_N on
C^N the system becomes

| ℓ_1(x) - x_1 ℓ_0(x) |
|          ⋮          | = 0
| ℓ_N(x) - x_N ℓ_0(x) |

where ℓ_i(x) = a_{i,0} + a_{i,1} x_1 + ⋯ + a_{i,N} x_N for generic choices of all the a_{i,j}. By the
theory of vector bundles it may be checked that this system has exactly N + 1
nonsingular isolated solutions.
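The count N + 1 can be checked directly for N = 2. With generic-looking integer coefficients (our arbitrary choice of the a_{i,j}), sympy finds exactly three affine solutions, one fewer than the total-degree Bezout number 2 · 2 = 4 of two quadrics:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')

# Affine-linear forms l_i(x) = a_{i,0} + a_{i,1} x1 + a_{i,2} x2
# with generic-looking integer coefficients (our choice).
l0 = 1 + 2 * x1 + 3 * x2
l1 = 5 + 7 * x1 + 11 * x2
l2 = 13 + 17 * x1 + 19 * x2

# The tangent-bundle system on C^2: two quadrics in two unknowns.
system = [l1 - x1 * l0, l2 - x2 * l0]
sols = sp.solve(system, [x1, x2])

# Tangent-bundle count N + 1 = 3, not the Bezout number 4.
print(len(sols))
```

The "missing" fourth root sits at infinity along the direction annihilating the linear part of ℓ_0, which is why every system of this family has it and the generic root count drops to N + 1.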

A.13.4 Detecting Positive-Dimensional Components


The algebraic geometric structure that best captures what is meant by a polynomial
system is that consisting of a vector bundle and one of its sections. For the sake of
simplicity we have avoided line bundles and vector bundles in this book, but they
are in the background and they lead to useful results, e.g., (Morgan et al., 1995) and
(Morgan & Sommese, 1989). Here is one (Theorem 7 Morgan & Sommese, 1989).

Theorem A.13.1 (Morgan and Sommese) Let E be a spanned rank N holomorphic
vector bundle on an N-dimensional irreducible compact complex analytic
space X. Assume that c_N(E)[X] ≠ 0. Let σ_0 and σ_1 be two holomorphic sections
of E. Then letting [z_0, z_1] be homogeneous coordinates on P^1, the solution set of
z_0σ_0 + z_1σ_1 on P^1 × X is connected.

Before we start the proof of this theorem we would like to show what it says
about down-to-earth polynomial systems.

Let X := P^{N_1} × ⋯ × P^{N_r} be a product of projective spaces. Consider "systems
of polynomials" σ consisting of N := Σ_{i=1}^{r} N_i equations where the ith equation has
the nonnegative multidegrees d_{i,1}, ..., d_{i,r} with respect to the multihomogeneous
structure. Letting π_j be the product projection of X onto the jth factor P^{N_j}, σ is
a section of the bundle

E := ⊕_{i=1}^{N} ( ⊗_{j=1}^{r} π_j^* O_{P^{N_j}}(d_{i,j}) ).     (A.13.7)

When the solutions of a system σ are isolated and nonsingular, then the Nth Chern
class of E evaluated on X, i.e., c_N(E)[X], equals the number of points in V(σ). In
the case of the E of Equation A.13.7, c_N(E)[X] equals the coefficient of t_1^{N_1} ⋯ t_r^{N_r}
in ∏_{i=1}^{N} (d_{i,1} t_1 + ⋯ + d_{i,r} t_r).
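This coefficient extraction is easy to script. For example, on X = P^1 × P^1 (so N_1 = N_2 = 1, N = 2) with two equations of multidegree (1, 1) (our example degrees), the multihomogeneous Bezout number is the coefficient of t_1t_2 in (t_1 + t_2)^2, namely 2:

```python
import sympy as sp

t1, t2 = sp.symbols('t1 t2')

# Multidegrees (d_{i,1}, d_{i,2}) of the two equations on P^1 x P^1.
degrees = [(1, 1), (1, 1)]

expr = sp.Integer(1)
for d1, d2 in degrees:
    expr *= d1 * t1 + d2 * t2
prod = sp.expand(expr)

# Coefficient of t1^{N_1} * t2^{N_2} = t1 * t2.
bezout = prod.coeff(t1).coeff(t2)
print(bezout)  # 2
```

Compare with the total-degree Bezout number 2 · 2 = 4 for the same pair of equations viewed as bilinear polynomials on P^2; the 2-homogeneous structure cuts the root count in half, which is the point of multihomogeneous homotopies in Chapter 8.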
Next let σ_1 be a section of the E in Equation A.13.7 with isolated nonsingular
solutions. Assume that we know the solutions of σ_1. Now consider the "homotopy"
σ_t := (1 - t)σ_0 + γtσ_1, where σ_0 is a second section, whose solution set we are
computing. We know that for all but a finite number of γ ∈ S^1, the solution set
of σ_t = 0 is isolated and nonsingular for t ∈ (0, 1]. By Theorem A.13.1, we know
that the limits of V(σ_t) as t → 0 include points from every connected component
of V(σ_0).
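In one variable this kind of homotopy is easy to simulate. The sketch below (plain Python, our illustration with a γ factor on the start system in the spirit of the γ-trick; the two quadratics are our choices) tracks the roots of the start system x² − 1 to the roots of the target x² + x − 6 as t goes from 1 to 0:

```python
import cmath

def sigma(x, t, gamma):
    """Homotopy sigma_t = (1 - t)*f0 + gamma*t*f1 for univariate quadratics.

    f1(x) = x^2 - 1 is the start system (roots +1, -1);
    f0(x) = x^2 + x - 6 is the target system (roots 2, -3)."""
    f0 = x * x + x - 6
    f1 = x * x - 1
    return (1 - t) * f0 + gamma * t * f1

def dsigma(x, t, gamma):
    # x-derivative of sigma_t, used in the Newton corrector.
    return (1 - t) * (2 * x + 1) + gamma * t * 2 * x

def track(x, gamma, steps=500):
    # Step t down from 1 to 0, correcting with Newton at each step.
    for k in range(steps, 0, -1):
        t = (k - 1) / steps
        for _ in range(5):
            x = x - sigma(x, t, gamma) / dsigma(x, t, gamma)
    return x

gamma = cmath.exp(1j * 0.7)   # a fixed "random" unit complex number
ends = sorted((track(1.0 + 0j, gamma), track(-1.0 + 0j, gamma)),
              key=lambda z: z.real)
print([round(z.real, 6) for z in ends])
```

With the complex γ in place, the discriminant of σ_t never vanishes for real t ∈ (0, 1], so the two paths stay apart and land on the two target roots −3 and 2.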

Proof. (Sketch of the proof of Theorem A.13.1.) Letting X̃ be a desingularization of
X by Theorem A.4.1 and noting that sections from a Zariski open dense space of
sections of E are nowhere zero on a given proper analytic subset of X, we conclude that
we can assume that X is smooth without any loss of generality.

Analogously to the arguments in § A.7, the universal space of solutions of sections
of E is a smooth connected projective bundle over X. Using the proof of item (3)
of Theorem 13.2.1, we have the connectedness. □

A.14 Generic Behavior of Solutions of Polynomial Systems

Systems of polynomials that arise in engineering and science often depend on parameters.
In this section, we take a general approach to polynomial systems with
parameters, and discuss what we can say about the dependence of solution sets on
the parameters. There are two questions we are interested in:

(1) what properties hold for general values of the parameters, e.g., a well-defined
number of isolated solutions; and
(2) given some property for a system with a special value of the parameters, e.g.,
having an isolated solution, what can we conclude for general values of the
parameters?

Since the proofs require material beyond the scope of this book, we refer to references
for essential points. Our approach is the same as (Morgan & Sommese, 1989),
though the focus there was mainly on isolated solutions of systems.

Let

f(x; q) := | f_1(x_1, ..., x_N; q_1, ..., q_M) |
           |               ⋮                   |     (A.14.8)
           | f_n(x_1, ..., x_N; q_1, ..., q_M) |

be a system of polynomials of (x; q) ∈ C^N × C^M. We regard this as a family of
polynomial systems in the x variables with the q-variables as parameters.
Though the algebraic system given in Equation A.14.8 is quite general, it is not
general enough. We also need to allow the possibility that systems in the family are
defined on any algebraic subset of C^N, even Zariski open sets such as (C*)^N, where
C* is C \ {0}. We need to allow the situation when systems in the family are defined
on projective space or products of projective spaces, but this means we need to deal
with the fact that homogeneous functions are not functions on projective space. So
a better description of the systems including all cases of the above would be to let

f(x; q),     (A.14.9)

be the restriction to X × Y of an algebraic section of an algebraic rank n vector
bundle E on X̄ × Y, where X is a Zariski open and dense subset of an N-dimensional
connected projective manifold X̄, and Y is an irreducible smooth quasiprojective
algebraic set of dimension M. A special case of this would be the situation that
X × Y is a smooth Zariski open set of an irreducible projective algebraic subset
of some projective space and f(x; q) consists of the restriction of n homogeneous
polynomials f_i(x; q) to X × Y. Though we briefly discussed vector bundles in § A.13,
we strongly suggest that the first-time reader proceed with X := C^N, Y := C^M, and
f(x; q) = 0 in Equation A.14.8 satisfying the extra property that it is a set of n
polynomials on C^{N+M}.

Let 𝒳 denote the nonreduced solution set of f(x; q) = 0, and let Z := V(f(x; q))
denote the reduction of 𝒳. Let π : 𝒳 → Y be the map induced from the product
projection X × Y → Y (C^N × C^M → C^M if you are following the simpler setup).

Let 𝒳_0 denote the union of the irreducible components Z' of Z such that π restricted
to Z' is dominant and such that dim Z' = M.

Theorem A.14.1 If n = N and if there is an isolated solution (x*; q*) of
f(x; q*) = 0, then (x*; q*) ∈ 𝒳_0. Moreover there are arbitrarily small complex
open sets U ⊂ X × Y that contain (x*; q*) and such that

(1) (x*; q*) is the only solution of f(x; q*) = 0 in U ∩ (X × {q*});
(2) f(x; q') = 0 has only isolated solutions for q' ∈ π(U) and x ∈ U ∩ (X × {q'});
and
and

(3) the multiplicity of (x*; q*) as a solution of f(x; q*) = 0 equals the sum of the
multiplicities of the isolated solutions of f(x; q') = 0 for q' ∈ π(U) and x ∈
U ∩ (X × {q'}).

Proof. The first two statements are proven in Theorem A.4.17.

Since the codimension of 𝒳_0 is N and 𝒳_0 is defined near (x*; q*) by N functions,
we know that 𝒳_0 is a local complete intersection near (x*; q*). Thus 𝒳_0 has at worst
Cohen-Macaulay singularities. Since Y is smooth, this implies that π_{𝒳_0} : 𝒳_0 → C^M
is flat in a neighborhood of (x*; q*). Here we are using the nonreduced structure
of 𝒳_0 in the neighborhood of (x*; q*). Flatness yields this result, e.g., (Prop. 3.13
Fischer, 1976) and the corollary following that proposition. □

Corollary A.14.2 Let f(x; q) be as in Equation A.14.9. Then there is a Zariski
open set U ⊂ Y such that for q ∈ U the system f(x; q) = 0 has d_i isolated solutions
(not counting multiplicity) of multiplicity i, where d_i is an integer independent of
q ∈ U.

Remark A.14.3 (Generic Bezout Number) The generic number of isolated solutions
counting multiplicity is d := Σ_{i=1}^{∞} i·d_i, where the sum is always finite and
bounded by the product of the N largest degrees (with respect to the x variables)
of the equations making up f(x; q) = 0. We call d the generic Bezout number or
the generic root count of the system f(x; q) = 0.

Theorem A.14.1 is one large reason why we use square systems. The following
example is typical of the case n > N.
Example A.14.4 For a system of polynomials in (x; q_1, q_2) ∈ C × C^2, take

f(x; q_1, q_2) := | x - q_1 |
                  | x - q_2 |.

For q_1 = q_2, the system has isolated solutions, but for q_1 ≠ q_2, there are no solutions.
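This collapse of the solution set off a proper subset of parameter space is easy to watch with a computer algebra system; the sketch below uses the overdetermined family x − q_1 = x − q_2 = 0 (our choice of illustration) in sympy:

```python
import sympy as sp

x, q1, q2 = sp.symbols('x q1 q2')

def solve_family(q1_val, q2_val):
    # Two equations, one unknown: solvable only on the diagonal q1 = q2.
    return sp.solve([x - q1_val, x - q2_val], x)

sols_on_diagonal = solve_family(3, 3)
sols_off_diagonal = solve_family(3, 5)
print(bool(sols_on_diagonal), sols_off_diagonal)  # True []
```

The generic parameter value here has no solutions at all, so the special behavior on the diagonal q_1 = q_2 is invisible to a probability-one choice of parameters; this is the phenomenon that squaring up a system (randomization) is designed to avoid.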
Theorem A.14.5 Assume that M + N ≥ n and that there is an isolated solution
(x*; q*) of f(x; q*) = 0, where f(x; q) is as in Equation A.14.9. There is a germ of
an irreducible complex analytic set Q containing q* with dim Q ≥ M - (n - N) such
that for all points q' in arbitrarily small open sets U ⊂ Q containing q*, f(x; q') = 0
has isolated solutions near (x*; q*).

Proof. First randomize to get a square system. Using Theorem 13.5.1, it follows
that the point (x*; q*) is an isolated solution of the randomized system. By Theorem
A.14.1, (x*; q*) is a point on an irreducible analytic space 𝒳'_0 with π_{𝒳'_0} dominant
and a finite-to-one branched cover in the neighborhood of (x*; q*). We know that
appending any n - N of the equations of f(x; q) to the randomized system, we

obtain a system equivalent to f(x; q) = 0. Using Theorem 12.2.2 successively, we
cut 𝒳'_0 down to an affine algebraic set with dimension ≥ M - (n - N). Take a component
Z of this set at (x*; q*). By Lemma A.4.16, there are arbitrarily small open
sets V containing q* on this component (in the complex topology) on which the restriction
π_{Z∩π^{-1}(V)} : Z ∩ π^{-1}(V) → V is proper (in addition to being finite by construction),
e.g., see (§3, Theorem 8(b) Gunning, 1970) for a discussion. By Theorem A.4.3
applied to the map π_{Z∩π^{-1}(V)}, we are done. □

Remark A.14.6 A similar statement to Theorem A.14.5 can be proved when
we are talking about k-dimensional components in place of an isolated x*. In this
case, when M + N ≥ n + k, we get a Q of dimension at least M + N - n - k.

A.14.1 Generic Behavior of Solutions


As at the start of the section, let f(x;q) be as in Equation A.14.9 (or simply as
in Equation A. 14.8). We let X denote the solution set of f(x;q) = 0 with the
induced nonreduced structure, and TT : X —> Y the induced morphism. The easiest
route to generic statements is to exploit the fact that the morphism IT : X —> Y is
"generically flat." Before we do this, let us show some generic properties, just to
give the flavor of how the arguments go. We will continually choose smaller Zariski
open dense sets U CY, and by abuse of notation call them U.

Lemma A.14.7 There is a Zariski open dense set U cY such that either TT"1 ([/)
is empty or n^-i^ : 7r^1(t/) —> U maps every irreducible component of X surjec-
tively onto Y.

Proof. To see this note that there are finitely many irreducible components Z of
X. The set TT(Z) is constructible by Theorem 12.5.6, and so either n(Z) is Y or a
proper algebraic subset of Y. Setting U equal to the complement of the union of the
proper algebraic sets arising in this way, we can assume TT(Z) is dense in U for every
component of Xu, the solution set of f(x;q) over U. We know, by Lemma 12.5.8,
that for such a Z there is a Zariski open dense set of Y contained in ir(Z). By
taking the intersection of these sets, we get a Zariski open dense set U with the
desired property. •

Lemma A.14.8 There is a Zariski open dense set U C Y such that given any
irreducible component Z of n~1(U), TT^-I^) : TT~1([7) —> U maps Z surjectively
onto Y with every fiber of nz having dimension exactly dim Z — M.

Proof The argument follows from Corollary A.4.7 combined with the same reason-
ing as Lemma A.14.7. •

The same sort of arguments yield the following result.



Lemma A.14.9 There is a Zariski open dense set U ⊂ Y such that given any
distinct irreducible components Z_1 and Z_2 of π^{-1}(U), either Z_1 ∩ Z_2 = ∅ or π_{π^{-1}(U)} :
π^{-1}(U) → U maps every irreducible component W of Z_1 ∩ Z_2 surjectively onto U
with every fiber of π_W having dimension exactly dim W - M.

Many results such as this are immediate consequences of the generic flatness
theorem. The generic flatness theorem is a useful algebraic result of Grothendieck,
e.g., (pg. 57 Mumford, 1966), which Frisch (Frisch, 1967) showed holds for holomorphic
maps between complex analytic spaces. We are not going to define flatness,
but geometrically it says "fibers change without discontinuity." Good places to
read about flatness (and some of the results that justify such a statement) are (pg.
146-161 Fischer, 1976) and (Chapter III.10 Mumford, 1999).

The generic flatness theorem mentioned above says that there is a Zariski open
set U ⊂ Y such that either π^{-1}(U) is empty or π_{π^{-1}(U)} : π^{-1}(U) → U is a flat
surjection. From here on we will assume that π^{-1}(U) is not empty, since the statements
we show are all trivially true in that case.

The generic irreducible decomposition

Is there a "generic irreducible decomposition"? The answer is a strong yes, but first
we must understand what we mean by this.

For a point y ∈ Y, let 𝒳_y denote the solution set of f(x; y) = 0. Forgetting
about multiplicity information, we have the irreducible decomposition

Z_y := V(f(x; y)) = ∪_{i=0}^{dim Z_y} ( ∪_{j∈J_i} Z_{y,i,j} ).     (A.14.10)

We would like there to be a Zariski open dense set U ⊂ Y such that:

(1) For any y_1, y_2 ∈ U, both Z_{y_1} and Z_{y_2} have the same number of irreducible
components of a given dimension, degree, and multiplicity in their respective
fibers 𝒳_{y_1}, 𝒳_{y_2}; and
(2) their components so matched up vary continuously as y_1 moves to y_2.

The first assertion is true, and the second is true, but different paths from y_1 to
y_2 may match up the decompositions in different ways, i.e., there may be nontrivial
monodromy.

One way of approaching this is to take the irreducible decomposition of Z_U :=
π^{-1}(U), i.e.,

Z_U = ∪_{i=0}^{dim Z_y} ( ∪_{k∈K_i} Z_{U,i,k} ).     (A.14.11)

Note we are using Lemma A.14.8, which tells us that given any irreducible component
Z_{U,i,k} of Z_U, dim Z_{U,i,k} = M + dim Z_{U,i,k,y} = M + i, where Z_{U,i,k,y} =
Z_{U,i,k} ∩ (X × {y}).

Theorem A.14.10 Let f(x; q) be as in Equation A.14.9. Then there is a Zariski
open dense set U ⊂ Y such that for any y ∈ U and each Z_{U,i,k} occurring in Equation
A.14.11, it follows that Z_{U,i,k} ∩ π^{-1}(y) is a union of the irreducible components
Z_{y,i,j} of Z_y occurring in Equation A.14.10. Moreover, for each of the i, k, all fibers
of Z_{U,i,k} under π have the same number of components.

Proof. Assume that it is not true, for the U selected in Lemmas A.14.7, A.14.8,
and A.14.9, that Z_{U,i,k} ∩ π^{-1}(y) is a union of the irreducible components Z_{y,i,j} of
Z_y. Then one of the components Z_{y,i',j} of Z_y must contain one of the components
of Z_{U,i,k} ∩ π^{-1}(y). Moreover one of the components Z_{U,i',k'} of Z_U must contain
Z_{y,i',j}. Thus we get that Z_{U,i',k'} ∩ Z_{U,i,k} contains a component W with fiber under
π of dimension i. But this means W is dense in Z_{U,i,k}, which gives the absurdity
that Z_{U,i,k} ⊂ Z_{U,i',k'}.

By Theorem A.4.20, we may shrink U to a smaller dense Zariski open set U,
so that each Z_{U,i,k} contains a smooth Zariski open set W such that for all y ∈ U,
W ∩ π^{-1}(y) is dense in Z_{U,i,k} ∩ π^{-1}(y); and π : W → U is of maximal rank with all fibers
having the same number of irreducible components. □

A.14.2 Analytic Parameter Spaces

It is a natural question to ask whether the results in this section are true when
the parameters do not vary algebraically but only holomorphically. The short
answer is "yes, with certain minor modifications." Because it is useful to allow
complex analytic parameters, we explain what we mean by this and moreover state
the generalization of the above results with the changes needed to prove them. In
this one subsection, Zariski topology refers to the Zariski topology using zero sets
of sets of holomorphic functions.

The simplest case is a system

f(x; q) := | f_1(x_1, ..., x_N; q_1, ..., q_M) |
           |               ⋮                   |     (A.14.12)
           | f_n(x_1, ..., x_N; q_1, ..., q_M) |

of holomorphic functions of (x; q) ∈ C^N × C^M that are polynomial in the x variables.
We regard this as a family of polynomial systems with the q-variables as parameters,
i.e., for each i = 1, ..., n, there is a positive integer d_i such that

f_i(x; q) = Σ_{|I| ≤ d_i} a_I(q) x^I,

where each a_I(q) is holomorphic on all of C^M.

The situation analogous to Equation A.14.9 is a system

f(x; q),     (A.14.13)

which is the restriction to X × Y of a holomorphic section of a holomorphic rank
n vector bundle E on X̄ × Y, where X is a smooth and dense Zariski open subset
of an irreducible projective algebraic set X̄ of dimension N, and Y is a connected
complex manifold of dimension M. We allow the possibility that X = X̄. In the
case of Equation A.14.12, X̄ = P^N, Y = C^M, and

E = O_{P^N}(d_1) ⊕ ⋯ ⊕ O_{P^N}(d_n),

and f(x; q) = f_1(x; q) ⊕ ⋯ ⊕ f_n(x; q). As before, the first-time reader should assume
that X := C^N, Y := C^M, and f(x; q) = 0 is as in Equation A.14.12.

As above we let π : V(f) → Y denote the holomorphic mapping induced by the
product projection X × Y → Y. We let π̄ : V̄(f) → Y denote the holomorphic
mapping induced by the product projection X̄ × Y → Y. If Z is an irreducible
component of V(f), then by Theorem A.4.3, π̄(Z̄) is a complex analytic subspace
of Y. Since π̄(Z̄ \ Z) is a proper complex analytic subspace of π̄(Z̄), we conclude
that U := π̄(Z̄) \ π̄(Z̄ \ Z) ⊂ π(Z) is a Zariski open dense subset of π̄(Z̄). This
plays the role of Lemma 12.5.8.

We will continue to replace U by Zariski open dense subsets of U as needed,
and call them by the name U. This implies that each irreducible component of
π^{-1}(U) maps surjectively onto U. We state only the analogues of Theorem A.14.1
and Corollary A.14.2.
and Corollary A.14.2.
Theorem A.14.11 If n = N and if there is an isolated solution (x*; q*) of
f(x; q*) = 0, then (x*; q*) ∈ 𝒳_0. Moreover there are arbitrarily small open sets
U ⊂ X × Y that contain (x*; q*) and such that

(1) (x*; q*) is the only solution of f(x; q*) = 0 in U ∩ (X × {q*});
(2) f(x; q') = 0 has only isolated solutions for q' ∈ π(U) and x ∈ U ∩ (X × {q'});
and
(3) the multiplicity of (x*; q*) as a solution of f(x; q*) = 0 equals the sum of the
multiplicities of the isolated solutions of f(x; q') = 0 for q' ∈ π(U) and x ∈
U ∩ (X × {q'}).

Proof. The argument is the same as that for Theorem A.14.1. □

When working with complex analytic spaces it is useful to define an analytic
Zariski open set to be a subset U ⊂ X of an irreducible complex analytic space X of
the form X \ Y, where Y is a complex analytic subspace of X. All the usual notions,
e.g., probability-one and generic point, carry over with no change. We would call
the Zariski open sets we have dealt with up to now algebraic Zariski open sets if
we needed to deal in any significant way with both sorts of Zariski open sets.

Corollary A.14.12 Let f(x; q) be as in Equation A.14.13. Then there is an
analytic Zariski open set U ⊂ C^M such that for q ∈ U the system f(x; q) = 0 has d_i

isolated solutions (not counting multiplicity) of multiplicity i, where d_i is an integer
independent of q ∈ U.

Remark A.14.13 Thus, as in the purely algebraic case, the generic number of
isolated solutions counting multiplicity is d := Σ_{i=1}^{∞} i·d_i, where the sum is always
finite and bounded by the product of the N largest degrees (with respect to the x
variables) of the equations making up f(x; q) = 0. We, as in the purely algebraic
case, call d the generic Bezout number or the generic root count of the system
f(x; q) = 0.

Corollary A.14.12 holds with X singular.
Appendix B

Software for Polynomial Continuation

There is much to be said for the motto "learn by doing," and in our case, this
means solving polynomial systems with numerical continuation. Even though this
book offers substantially all the information one would need to write a solver from
scratch, that is rather far beyond the level of commitment most readers will muster.
To provide an easy entry to the area, we provide a suite of m-file routines called
HOMLAB for performing polynomial continuation in the Matlab environment. Af-
ter gaining experience with HOMLAB, one may wish to download one of several
freely available software packages for polynomial continuation. These may offer
speed advantages and advanced options, such as polytope methods, not available
in HOMLAB. Some of these have been adapted to run on multi-processor machines
for large computations.
A partial listing of packages available as of the writing of this book is as follows.

• HOMLAB runs in the Matlab environment and implements general linear prod-
uct homotopy and parameter homotopy. See Appendix C.
• HOMPACK, HOMPACK90, POLSYS_PLP are a sequence of increasingly sophisticated
continuation algorithms, written in Fortran. The "PLP" in POLSYS_PLP stands
for Partitioned Linear Products, a special case of the general linear products
discussed in § 8.4.3. This code finds only isolated solutions for square systems
(same number of equations as variables).
• PHoM is a C++ code that implements polyhedral homotopies (see § 8.5). This
package finds isolated solutions for square systems.
• PHCpack implements a variety of homotopies in a menu-driven interface that
includes all the structures discussed in Chapter 8, except polynomial products.
In addition to isolated roots, the algorithms from Part III of this book for han-
dling positive dimensional solutions, nonsquare systems, etc., are implemented.
This package is written by our collaborator, J. Verschelde, and it has been the
experimental platform for validation of these algorithms. Both executables and
Ada source code are available.
• Algorithms for mixed volume computations can be found on T.Y. Li's webpage.
This is the most difficult phase of a polyhedral homotopy, (§ 8.5).

Appendix C

HomLab User's Guide

HOMLAB, a suite of scripts and functions for the Matlab environment, is designed as
an easy entry into the use of polynomial continuation and, for the experienced user,
as a platform for experimental development of new methods. Many of the exercises
of this book assume the availability of HOMLAB, and special routines using HOMLAB
functions are provided for some exercises. The use of a routine for a particular
exercise is described in the exercise statement itself, while the general structure and
use of HOMLAB is documented below. The best way to learn HOMLAB is simply to
work the exercises in the order they appear in this book. These progress from the
simple application of the core path-tracking routine to successively more sophisticated
homotopies that use it. Help describing the usage of individual routines, say,
endgamer.m, is available by typing "help endgamer" at the Matlab prompt. The
main text of this book is the reference for the methodologies used, and the help
facility just mentioned is the reference for individual routines. However, to help the
user get started quickly, we provide this user's guide.
We assume the user has at least a minimal acquaintance with Matlab; in par-
ticular, the user must know how to write and execute simple scripts and functions.
A script is a sequence of Matlab commands recorded in a file, say "myscript.m,"
which is executed by typing >> myscript at the Matlab prompt, here indicated
as ">>". (Scripts can also be called within other scripts or functions.) A function
is a file, say "myfunc.m," which starts with a declaration line something like

function [out1,out2]=myfunc(in1,in2,in3)

followed by lines of Matlab code that compute the two outputs, out1,out2, from
the three inputs in1,in2,in3. This function might be called as

[a,b]=myfunc(0.1,[1 3],x)

where x is an existing variable in the workspace. For more on using Matlab, please
see the Matlab documentation.


C.1 Preliminaries

C.1.1 "As is" Clause


HOMLAB is distributed free of charge on an "as is" basis. Its intended usage is edu-
cational, so that the user may gain a greater understanding of the use of numerical
homotopy continuation for solving systems of polynomial equations. Any other use
is strictly the user's responsibility.

C.1.2 License Fee


There is no license fee for HOMLAB. In lieu of this, we hereby request each user to
buy a copy of this book.

C.1.3 Citation and Attribution


The use of HOMLAB for research purposes, either in its original form or as modified
by the user, is highly encouraged, subject only to professional ethical conduct, as
follows.
In publications based on results obtained using HOMLAB or its successors, the
use of HOMLAB should be acknowledged and this book should be cited. The author
of the code is Charles Wampler.
Any redistribution of HOMLAB in unaltered form must retain the same name and
acknowledge the author. Any distribution of derived codes that extend or modify
HOMLAB should acknowledge the original source and authorship. In addition, the
differences from the original should be clearly documented and attributed to the
new author. These conditions extend to users of the derived codes.

C.1.4 Compatibility and Modifications


HOMLAB is a suite of Matlab routines. Version HOMLAB1.0 has been restricted
to the conventions of Matlab v.4.0 to provide compatibility with both old and new
Matlab installations. (Even the file names have been restricted to eight characters
for compatibility with old operating systems.) The exception to this rule is that
routines based on Part III of this book for generating witness point supersets use
cell arrays to store sets for different dimensions. Users who advance to that level
will need a more recent version of Matlab, or else they must modify the code. All
routines have been verified to run under Matlab v.6.5.
By avoiding advanced features of newer versions of Matlab (except as just noted),
we hope the package will be easier to translate to run in other environments, in
case some readers lack access to Matlab. In particular, Octave and SciLab are
both freely available packages that implement a large subset of Matlab functions,
so they are good candidates for substitute environments. Anyone who successfully
ports HOMLAB to one of these, or similar, environments is requested to notify the


authors and to make the ported version freely available. Citation of HOMLAB and
this book are required, and any differences in functionality must be documented.
The authors are not bound to fix bugs in the current version or to upgrade
HOMLAB for compatibility of any future release of the Matlab product. However,
user comments and bug reports are welcome, so that, at our discretion, we can
maintain and possibly improve the educational value of the package. Please see the
HOMLAB webpage for instructions on how to submit a comment or bug report.
The exercises for this book have been written under Matlab v.6.5. Some of these
use features not available in previous releases, namely function pointers and function
files that include subfunctions in the same file. This should be more convenient for
those with an up-to-date release of Matlab; those with old versions will, we hope,
have little trouble revising the source code to run in their environment.

C.1.5 Installation
As a suite of m-files, HOMLAB becomes functional by simply adding the folder
containing the routines to Matlab's search path. The folder for the current release,
HOMLAB1.0, is HomLab10. Let's say that you have copied this folder onto your
machine with the full path name of c:\mypath\HomLab10, where "mypath" could
be any path in the file structure of your machine. There are three basic options for
adding HOMLAB to the Matlab path:

• In Matlab (v.6.5 and above), use "File -> Set Path" on the Matlab
menu bar to launch a dialog box for setting the path and use it to add
c:\mypath\HomLab10 and its subfolders to the top of the search path. The
change becomes effective immediately in the current session, while the "Save"
button in the dialog box records it for future sessions.
• At the Matlab prompt, use the command

>> addpath c:\mypath\HomLab10

HOMLAB will then be available for the current session only. Similarly, add the
subfolders of HomLab10 to the path.
• Create a file called startup.m in a directory already on Matlab's search path
and put the appropriate addpath commands there. HOMLAB will then be
available for all future sessions.

Any one of these three options is sufficient. See the Matlab help facility to obtain
more detailed instructions on modifying the search path.
To test if the installation is successful, type >> simpltst at the Matlab prompt.
If all is well, the session should look something like:
358 Numerical Solution of Systems of Polynomials Arising in Engineering and Science

>> simpltst
Number of start points = 2
elapsed_time =
     0
Path 1
elapsed_time =
  1.4100e-001
Path 2
elapsed_time =
  3.1300e-001
The solutions are:
  1.0000e+000 -2.2204e-016i  -1.0000e+000 -1.1102e-016i
  1.0000e+000 -1.6653e-016i  -1.0000e+000 -1.6653e-016i
  1.0000e+000                 1.0000e+000 +5.5511e-017i
>>
The times will vary according to your machine and the tiny values of the imaginary
parts of the answers will typically change with each run. This test solves the simple
system

x^2 - 1 = 0,   xy - 1 = 0,

in the homogenized form

x^2 - w^2 = 0,   xy - w^2 = 0,

using a two-path homotopy based on the linear-product formulation f1 ∈ (x,1) × (x,1),
f2 ∈ (x,1) × (y,1). Accordingly, the answers should be (x,y,w) = (1,1,1) and
(-1,-1,1), as above. More information on interpreting the results is given below.

C.1.6 About Scripts


In HOMLAB, the high-level functions are written not as true functions, which hide
their internal variables from the workspace, but as scripts, which are a sequence
of commands that run directly in the top-level workspace. One advantage of this
is that a Matlab save command can save all of the data necessary to execute an
exact re-run, including all random constants used in defining a start system, and so
on. A negative consequence is that all such data is in the workspace until the user
clears it. If one wishes to avoid this, one can write a function to call the HOMLAB
script and pass out only the desired results.
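As a sketch of this approach, one might write a hypothetical wrapper like the following (not part of HOMLAB; the globals and the simplfcn example follow the conventions of § C.4):

```matlab
function [xsoln,stats]=runlpd()
% Hypothetical wrapper: call the lpdsolve script inside a function so
% that its intermediate variables never reach the top-level workspace.
global nvar degrees FFUN
nvar=2; degrees=[2 2];                  % problem setup as in Sec. C.4
FFUN='simplfcn';
LPDstruct=ones(sum(degrees),nvar+1);    % total-degree structure
lpdsolve                                % script runs in this workspace
% only the declared outputs xsoln and stats are passed back
```

Calling [xsoln,stats]=runlpd() then leaves only the two outputs in the caller's workspace, at the cost of losing the intermediate data needed for an exact re-run.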

C.2 Overview of HOMLAB

HOMLAB is a collection of compatible routines for defining and executing homotopy


algorithms. The workhorse routine is endgamer.m which tracks solution paths for
a homotopy h(x, t) = 0 from a list of startpoint solutions of h(x, 1) = 0 to their
endpoints satisfying h(x, 0) = 0. Specifically, endgamer has the usage


[xsoln,stats,xendgame]=endgamer(startpoint,hfun)
which is more completely documented in § C.7 below. Briefly, the inputs are
startpoint, a matrix with one startpoint solution of the homotopy in each column,
and hfun, a string name of the homotopy function. The matrix xsoln lists
the endpoints of the solution paths in columnwise fashion. As its name suggests,
endgamer applies an endgame to get better estimates of the endpoints for paths
which approach singularities as t → 0. Specifically, it uses the power-series endgame
described in § 10.3.3.
Usage of HOMLAB mainly comes down to specifying a homotopy and finding its
start points. This can be done by writing one's own m-files or by making use of
utilities and drivers in HOMLAB. The main alternatives are as follows.
Linear Products This option includes total degree homotopies (§ 8.4.1), mul-
tihomogeneous homotopies (§ 8.4.2), and general linear-product homotopies
(§ 8.4.3). The user must specify a target function f(x), its derivative fx(x), and
the linear product structure. Automatic differentiation is available if the func-
tion is specified in fully-expanded form (see § C.3.1). Driver routine lpdsolve
does everything else to construct and solve a homotopy of the form

h(x,t) = γ t g(x) + (1-t) f(x) = 0.


That is, lpdsolve constructs a compatible start system g(x), solves it, and
calls endgamer to get the final answers. See § C.4 for details.
Parameter Homotopy This option handles general homotopies of the type described
in Chapter 7. The user gives a parameterized function f(x,q), its
derivatives fx(x,q) and fq(x,q), starting and ending parameter values q1 and
q0, and startpoint solutions for f(x,q1) = 0. (Usually, the start points are
found with a single linear-product run, then parameter homotopy is used for
all subsequent runs for various target values of q0.) A means is provided for
selecting a linear path from q1 to q0; otherwise, the user must write an m-file to
implement a nonlinear path. When the linear path is selected, the homotopy
is of the form

h(x,t) = f(x, t q1 + (1-t) q0) = 0.
Secant Homotopy This option solves homotopies of the form

h(x,t) = γ t f(x, q1) + (1-t) f(x, q0) = 0.

The user supplies the function f(x,q), the derivative fx(x,q), and startpoint
solutions to f(x,q1) = 0. (Again, as in the parameter continuation case, one
usually solves f(x,q1) = 0 with a single linear-product homotopy, reusing the
same q1 for subsequent homotopies to various target values of q0.) It is the
user's responsibility to verify that the homotopy is valid in the sense that the
linear combination of two functions from the family is still in the same family.
This is the least used option, but as shown in Exercise 7.6, it is sometimes
handy.
The usual process involves creating two m-files:
• a function defining the system to be solved
• a script that sets up the required data structures before calling endgamer to
get the solutions.
The exception is if one chooses to specify the function in "tableau" form (§ C.3.1),
in which case the function evaluation routine is already provided. Facilities are
available to make the whole process easy in the most common formulations, while
the more advanced user can directly access the basic routines to implement special-
ized homotopies. In the next few sections, we illustrate each of the main options by
examining example scripts and functions.

C.3 Defining the System to Solve

HOMLAB allows a target system to be defined in one of two ways: as a fully expanded
sum of terms or as a black box function. The fully expanded form is convenient for
simple, sparse polynomials, while user-written functions are more flexible and often
more efficient. Parameterized families of systems must always be written as a user-
defined function, but the underlying functions that HOMLAB uses for evaluating
fully expanded functions can be employed in a user-defined function as well.

C.3.1 Fully-Expanded Polynomials


The simplest option for specifying a target polynomial is to list out its monomials
and coefficients. As discussed in § 1.2, this is not generally an efficient formulation:
for complicated problems, straight-line programs can require much less computa-
tion. However, for simple systems, the fully expanded form is quite reasonable.
HOMLAB supports a "tableau" style definition for systems, wherein the entire
polynomial system is laid out in a single numerical matrix with n + 1 columns for
an n-variable problem. The convention is that each row is a term of a polynomial,
with the coefficient in the first column and the exponents d1, ..., dn for the monomial
x1^d1 · · · xn^dn in the remaining columns. The end of a polynomial is marked by a row
with a negative exponent for x1.
A complete script for solving the system

x^2 - x - 2 = 0,   xy - 1 = 0,

using a tableau definition of the system and a total-degree homotopy is as follows.



% Define the target system in tableau form
eop = [0 -1 0];  % marker for end of polynomial
tableau = [
   1  2  0
  -1  1  0
  -2  0  0
  eop
   1  1  1
  -1  0  0
  eop
];
% decode tableau and solve with total-degree homotopy
totdtab
% display the dehomogenized solutions
disp('The solutions are:'); disp(dehomog(xsoln,1e-8))

The total degree is 2 · 2 = 4, and there are two finite solutions, [x, y, w] = [2, 0.5, 1]
and [-1, -1, 1], and a double root at infinity of [0, 1, 0].
More information on the solution script totdtab is given in § C.4. A related
script, lpdtab, can be used to solve tableau-style systems using multihomogeneous
or general linear-product homotopies. Using this capability, a two-path version of
the above would be as follows.
% Define the target system in tableau form
eop = [0 -1 0];  % marker for end of polynomial
tableau = [
   1  2  0
  -1  1  0
  -2  0  0
  eop
   1  1  1
  -1  0  0
  eop
];
% define a linear-product decomposition
xw=[1 0 1]; yw=[0 1 1];
LPDstruct=[
  xw; xw;
  xw; yw;
];
HomStruct=[];  % default to 1-homogeneous
% decode tableau and solve with linear-product homotopy
lpdtab
% display the dehomogenized solutions
disp('The solutions are:'); disp(dehomog(xsoln,1e-8))
See § C.4 for details on specifying linear-product structures and see the help for
lpdtab for details on how this script automatically homogenizes the tableau-style
polynomial using the information in HomStruct.
These scripts parse the tableau matrix into a more basic form and pass the results
to a built-in function, ftabcall, which in turn calls function ftabsys. The latter
can be used directly if one wishes to write a straight-line function (see next section)
while specifying some subset of the polynomials in tableau form. For details, use
the help facility or look at the source code for ftabsys.

C.3.2 Straight-Line Functions


For more complicated functions, it is usually more efficient to express them in
straight-line form. There are two contexts in HOMLAB where a user might define
such a function:

• to define a target polynomial for solution in a linear-product homotopy, or


• to define a parameterized family of functions for a coefficient-parameter homo-
topy.

In the first case, the function must have the form

function [f,fx]=function_name(x)

where x is the input variable list, f is the output function value, and fx is the output
Jacobian matrix of partial derivatives df/dx. The function must be homogeneous,
possibly multihomogeneous. The careful reader might raise an objection that a
homogeneous polynomial on P^n is not truly a function (see § 12.3), but for our
purposes we consider it as a function on C^(n+1), which it certainly is. The script
which defines the linear-product homotopy appends random linear equations to
effect the projective transformation of § 3.7, one such equation for each projective
subspace when working multihomogeneously.
To repeat the example above of the system

x^2 - x - 2 = 0,   xy - 1 = 0,

in straight-line form, one could define the function


function [f,fz]=simplfcn(z)
% Straight-line function for
%   x^2-x-2=0, xy-1=0
x=z(1); y=z(2); w=z(3);
f = [
  x^2-x*w-2*w^2
  x*y-w^2
];
fz = [
  2*x-w, 0, -x-4*w
  y, x, -2*w
];

This is not really useful for such a simple example, but it can be significant for more
complicated systems. Notice the use of the homogeneous coordinate w.
Similarly, parameterized functions must also be homogeneous in the unknowns,
but not necessarily in the parameters. The Matlab format for a parameterized
family of systems is simply

function [f,fx,fp]=function_name(x,p)

where the third output, fp, is the matrix of derivatives df/dp. Here is a complete
specification for the intersection of two circles, where a subfunction for a single
circle is used twice.
function [f,fx,fp]=twocircle(x,p)
% Straight-line function for intersection of two circles
f=zeros(2,1); fx=zeros(2,3); fp=zeros(2,6);
[f(1),fx(1,:),fp(1,1:3)]=onecircle(x,p(1:3));
[f(2),fx(2,:),fp(2,4:6)]=onecircle(x,p(4:6));
%
function [f,fx,fp]=onecircle(z,p)
% straight-line function for one circle
% parameters are [cx;cy;r^2] where (cx,cy)=center, r=radius
x=z(1); y=z(2); w=z(3);
cx=p(1); cy=p(2); rsq=p(3);
a=x-cx*w;
b=y-cy*w;
f = 0.5*( a^2 + b^2 - rsq*w^2 );
fx = [ a, b, -cx*a-cy*b-rsq*w ];
fp = [ -w*a, -w*b, -0.5*w^2 ];
The most error-prone part of writing a straight-line program is in generating
the derivatives. To aid in debugging, utilities are provided to numerically check the
coding of the function. See § C.3.4.
C.3.3 Homogenization
It is highly recommended that all systems presented to HOMLAB be defined in
homogeneous form. This is the user's responsibility, with the exception that
tableau-style systems will be homogenized automatically. Homogenization is recommended
because path endpoints approaching infinity are very common, and the projective
transformation available after homogenization keeps both the magnitudes of the
coordinates and the arclengths of the homotopy paths finite. If one wishes to com-
pute homotopy paths without homogenization, the path-tracker routines endgame
and tracker will still work, but they do not include any special stopping conditions
for diverging solutions, which may therefore take up inordinate computation time.
(Such paths can never make it to t = 0, so eventually they must fail on a too-small
step size condition or on the limit on the number of steps.)
The choices of a linear product structure and a multihomogenization are inde-
pendent. For example, if a system is bilinear, one can reflect this in the linear-
product structure while one-homogenizing the system. The one-homogeneous start
system will have solutions at infinity, but HOMLAB will ignore these. If the system
is two-homogenized instead, respecting the bilinear structure, the linear-product
start system has no solutions at infinity. This is a bit cleaner mathematically, but
in practical terms, both formulations have the same number of solution paths to
follow.
To be clear, consider again the example
x^2 - x - 2 = 0,   xy - 1 = 0.
In a one-homogeneous treatment using coordinates [x, y, w] ∈ P^2, we have the
equations

x^2 - xw - 2w^2 = 0,   xy - w^2 = 0,

and we must specify a compatible homogeneous structure:

HomStruct=[1 1 1];

which directs HOMLAB to append an inhomogeneous linear equation ax + by + cw = 1
for some random, complex {a, b, c}, thereby choosing a random patch on P^2. (For
a discussion of projective spaces, see Chapter 3.) To get a two-path homotopy, we
specify the linear-product structure

LPDstruct=[1 0 1; 1 0 1; 1 0 1; 0 1 1];

that is, f1 ∈ (x,w) × (x,w) and f2 ∈ (x,w) × (y,w). This start system has a
double root at infinity of [x,y,w] = [0,1,0], but HOMLAB will ignore it. The
two-homogeneous treatment of the same system using coordinates {[x,u], [y,v]} ∈
P^1 × P^1 is

x^2 - xu - 2u^2 = 0,   xy - uv = 0.
The compatible HomStruct is, assuming the coordinates are ordered as (x,y,u,v),

HomStruct=[1 0 1 0; 0 1 0 1];

which directs HOMLAB to append two linear equations

ax + 0y + bu + 0v = 1,   0x + cy + 0u + dv = 1,

for random, complex values of {a, b, c, d}. This picks a random patch on each of the
two P 1 subspaces. Now, the two-path linear-product decomposition is

LPDstruct=[1 0 1 0; 1 0 1 0; 1 0 1 0; 0 1 0 1];

See § C.8 for a description of the dehomog function to dehomogenize a solution


point.
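For the 1-homogeneous case, the effect of dehomogenization can be sketched as follows. This is a simplified stand-in, not the actual dehomog routine (which also handles multihomogeneous structures); here the homogeneous coordinate is assumed to be the last row, and the tolerance flags points at infinity:

```matlab
function xd=dehomog1(xsoln,tol)
% Simplified 1-homogeneous dehomogenization: divide each solution
% column by its homogeneous coordinate (assumed stored in the last
% row) and discard columns with |w| <= tol as points at infinity.
w = xsoln(end,:);                      % homogeneous coordinates
finite = abs(w) > tol;                 % columns not at infinity
n = size(xsoln,1)-1;
xd = xsoln(1:n,finite) ./ (ones(n,1)*w(finite));
```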

C.3.4 Function Utilities and Checking


With a few sample scripts in hand, it is easy to set up and run any of the various
kinds of homotopies once the function and its derivatives are available. To make
the definition of these easier, some utilities are available.

• Function ftableau accepts a list of coefficients and a matrix of exponents to
define the terms of a polynomial. It then provides both the function and derivative
evaluations. It works only for a single function f : C^n → C, so a wrapper
function ftabsys is provided to call ftableau multiple times for a system of
such functions.
• Utility scalepol is available for scaling a system for ftabsys, as is sometimes
necessary. For example, see the chemical system of § 9.2.

Since Matlab is a numerical package, there has been no attempt to automate
differentiation and homogenization except for the simple case of fully expanded
polynomials via the ftableau function. Otherwise, this onerous task falls to the user.
Symbolic packages can be employed to preprocess functions in this way and then
copy the results into an m-file function.
The most error-prone step in defining a straight-line program for a function
is in giving formulae for the partial derivatives. A helpful way of checking these
is to compare the computed derivatives with a computation based on numerical
differentiation. The function must also be homogenized, which can also be checked
numerically. The following checking utilities are provided for these purposes.
function [fxerr]=chekffun(fname,nx,eps0)
  -> checks derivatives of target functions
     [f,fx]=myfunc(x)
function [fxerr,fperr]=chekpfun(fname,nx,np,eps0)
  -> checks derivatives for parameterized functions
     [f,fx,fp]=myfunc(x,p)
function [homerr]=chekhmog(fname,HomStruct,mdeg)
  -> checks multihomogenization of function [f]=myfunc(x);
     user provides homogeneous structure and multidegree matrix
function [homerr]=chekhmog(fname,HomStruct,deg,lpd)
  -> checks multihomogenization of function [f]=myfunc(x);
     user provides homogeneous structure, total degrees, and
     linear product structure; code computes the multidegree
     matrix from these.
In each case, the checking is done at a random point x ∈ C^nx. The functions provide
a numerical comparison and also use the Matlab spy function to graphically show
which elements are suspicious, having an error greater than eps0. If eps0 is omitted
from the call, it defaults to 10^-6. Note that high-level scripts define a global FFUN,
which can be used for fname.
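For example, to check the hand-coded Jacobian of the three-variable homogenized function simplfcn of § C.3.2, one might run:

```matlab
% compare coded derivatives against numerical differentiation at a
% random point; eps0 is omitted, so the tolerance defaults to 1e-6
fxerr = chekffun('simplfcn',3);
% entries of fxerr above the tolerance (also marked graphically by
% spy) indicate suspect derivative formulas in the function file
```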

C.4 Linear Product Homotopies

One of the two main options in HOMLAB is the linear-product homotopy, imple-
mented in the script lpdsolve. With appropriate settings, this script performs the
equivalent of a total-degree homotopy, a multihomogeneous homotopy, or a general
linear-product homotopy. For total-degree and multihomogeneous homotopies and
a tableau-style function definition, the higher-level scripts totdtab and mhomtab
automatically perform some preliminary processing steps for you before initiating
lpdsolve.
Let's first see all the set-up information required by lpdsolve by studying a
script to solve a simple system specified in straight-line form. Such a function
is treated as a "black box," so the user must supply all the structural informa-
tion necessary to specify the linear-product formulation. To this end, consider the
straight-line function called simplfcn in § C.3 above, that implements the system

x^2 - x - 2 = 0,   xy - 1 = 0.

It has two variables (before homogenization), and each equation is quadratic. A


complete script to solve this with a total-degree homotopy, with four paths, is as
follows.
% script to solve "simplfcn" by total degree using lpdsolve
global nvar degrees FFUN
nvar=2; degrees=[2 2];
FFUN='simplfcn';
LPDstruct=ones(sum(degrees),nvar+1);  % total-degree structure
lpdsolve
disp('The solutions are:'); disp(dehomog(xsoln,1e-8))
The meaning of the global variables is self-explanatory. The degrees of the polyno-
mials as listed in degrees must be in the same order as they appear in the evaluation
function, although in this example they are the same. The item that needs expla-
nation is LPDstruct, which defines the linear-product structure to be used. Each
row in LPDstruct represents one linear factor, and there must be degrees(i) factors
for the ith equation, for a total of sum(degrees) rows in all. The columns
of LPDstruct correspond to the variables in x as it is passed into simplfcn(x).
Typically, the final column is the homogeneous coordinate, but this is at the dis-
cretion of the user when writing the function. A nonzero entry in element (i,j) of
LPDstruct indicates that variable j appears in the ith linear factor, and factors are
assigned to equations in accordance with the entries in degrees. For a total-degree
homotopy, LPDstruct is just a full matrix of ones.
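To make the role of LPDstruct concrete, here is a sketch of how one start polynomial could be built from its rows. This is a hypothetical helper, not the actual code inside lpdsolve; crand is HOMLAB's random complex generator:

```matlab
function g=startpoly(LPDrows,z)
% Build one linear-product polynomial: each row of LPDrows selects
% which entries of z receive a random complex coefficient, and the
% resulting linear factors are multiplied together.
g=1;
for k=1:size(LPDrows,1)
  c = crand(1,length(z)).*LPDrows(k,:);   % zeros where variable absent
  g = g*(c*z(:));                         % multiply in this factor
end
```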
We can run the same problem using only two paths just by changing LPDstruct.
In this case, a two-path homotopy is obtained with the following script.
% script to solve "simplfcn" with two paths using lpdsolve
global nvar degrees FFUN
nvar=2; degrees=[2 2];
FFUN='simplfcn';
xw=[1 0 1];
yw=[0 1 1];
LPDstruct=[
  xw; xw;
  xw; yw;
];
HomStruct=[];  % default to 1-homogeneous
lpdsolve
disp('The solutions are:'); disp(dehomog(xsoln,1e-8))
This is exactly the script simpltst that is suggested as an installation check in
§ C.1.5.
The script above uses only two paths even though simplfcn is only one-homogenized.
That is, the homotopy runs in the projective space P^2. The choice of
linear product structure guarantees that the start system, the target system, and
consequently the whole homotopy, have a double root at infinity that we choose
not to track. If we two-homogenize the equations instead, then this double root
does not exist at all. It makes no difference in this case, but in cases where some
endpoints arrive at infinity only at the target (as t → 0), multihomogenization can
change their representation, and sometimes this can make them numerically more
tame. For instance, a singular double root at infinity might break into two distinct
nonsingular roots at infinity.
To show how HomStruct is used to set up a homotopy in a cross product of pro-
jective spaces, let's rework the running example. First, we need a two-homogenized
version of the equations.
function [f,fz]=simplfcn2(z)
% Straight-line function for
%   x^2-x-2=0, xy-1=0
% Two-homogeneous version on [x,u] \times [y,v]
x=z(1); y=z(2);
u=z(3); v=z(4);
f = [
  x^2-x*u-2*u^2
  x*y-u*v
];
fz = [
  2*x-u, 0, -x-4*u, 0
  y, x, -v, -u
];

The script to solve this as a two-homogeneous system follows.


% script to solve "simplfcn2" two-homogeneously
global nvar degrees FFUN
nvar=2; degrees=[2 2];
FFUN='simplfcn2';
xu=[1 0 1 0];
yv=[0 1 0 1];
LPDstruct=[
  xu; xu;
  xu; yv;
];
HomStruct=[
  xu;
  yv
];
lpdsolve
disp('The solutions are:'); disp(dehomog(xsoln,1e-8))
In general, the groupings in the linear factors specified in LPDstruct do not have
to be copies of those in HomStruct; indeed, they are different in the two-path,
one-homogeneous example above.


The given LPDstruct and HomStruct must be compatible with the target function.
For tableau-style functions, the automated scripts will ensure compatibility,
but straight-line functions are treated as black boxes, so HOMLAB has no way of
checking compatibility. It is the user's responsibility to ensure compatibility. In
the case of errors, the resulting behavior will be erratic, sometimes signalled by
path-tracking failures, but not necessarily.
The solution script, lpdsolve, does the following:
• generates a start system g(x) according to the linear-product structure of
LPDstruct,
• appends random hyperplane slices to implement the projective transformation
compatible with HomStruct,
• solves g(x) = 0 to get all the start points,
• forms the homotopy
h(x,t) = γ t g(x) + (1-t) f(x),
• calls endgamer to track the solution paths, invoking a power-series endgame.
The results are in matrices xsoln, stats, and xendgame, as described in § C.7.3.

C.5 Parameter Homotopies

Suppose we have written a coefficient-parameter target function, f(x;p), implemented
as an m-file function, say myfunc, having the calling sequence

[f,fx,fp]=myfunc(x,p)

as described in § C.3. An example is function twocircle above. How can we form
a parameter homotopy function to solve it for some target value of p0?
Let's assume we have a solution list for random, complex parameter values p1.
(We will see in a moment how to get this using lpdsolve.) What we need is a
homotopy function

h(x, t; p1, p0) = f(x, p(t; p1, p0))

where p : C × Q × Q → Q with Q the parameter space, and p(1; p1, p0) = p1,
p(0; p1, p0) = p0. The path function p must give a continuous path, with continuous
first derivative, starting at p1, ending at p0, and always staying in the parameter
space. HOMLAB does not offer a general solution for arbitrary parameter spaces,
but in the special case that Q = C^m, a Euclidean space, a linear path suffices:

p(t; p1, p0) = t p1 + (1-t) p0.
Our path tracker and endgame function, endgamer, expects a homotopy function
with the calling sequence [h,hx,ht]=h(x,t). Therefore, the parameters and the
path function must be passed from the top level script to the homotopy evaluation
function via global variables. Moreover, we write myfunc in homogeneous form, so
projective transformation equations must be appended. The script parsolve takes
care of all the formatting once the minimal set of information has been established.
Let's assume that p1 and p0 are in memory along with the start points, as matrix
startpoint, listed columnwise and satisfying f(x,p1) = 0. Then, a script for
solving the system f(x,p0) = 0 is as follows, assuming myfunc implements f(x,p).

global FFUN PATHFUN
FFUN = 'myfunc';
PATHFUN = 'lin_path';
global ParStart ParGoal
ParStart = p1; ParGoal = p0;
HomStruct=[];  % defaulting to 1-homogeneous
parsolve
disp('The solutions are:'); disp(dehomog(xsoln,1e-8))

Here, lin_path is a pre-defined function for a linear path. Clearly, HomStruct must
be set to agree with the homogenization that has been applied to the user-defined
myfunc.

C.5.1 Initializing Parameter Homotopies


For parameter homotopy to be useful, we must have some way to solve the first example, f(x;p1) = 0. This can be done with a linear-product homotopy. Once the parameterized family of systems has been defined, in the form f(x;p) = 0, HOMLAB can treat it like any other black-box target system. This requires one to provide the linear-product structure to be used in the homotopy, as described in § C.4. One additional wrinkle is that the initial set of parameters p1 must be chosen at random and then passed behind the scenes through a global variable. Script lpd2par takes care of all of this. An example usage of this to solve the example of the intersection of two circles, function twocircle above, is as follows.

global nvar degrees FFUN
nvar=2; degrees=[2 2];
FFUN='twocircle';
LPDstruct=ones(4,3);  % total degree structure
HomStruct=[1 1 1];    % 1-homogeneous
p1 = crand(6,1);      % 6 random, complex parameters
global ParGoal
ParGoal = p1;
lpd2par
disp('The solutions are:'); disp(dehomog(xsoln,1e-8))
This sets up and solves a homotopy of the form
h(x,t) = t γ g(x) + (1-t) f(x;p1),
where g(x) is the linear-product start system and γ is a random complex constant.
Here, we have chosen random, complex parameters, using the function crand, as this
is the desired first step in establishing a parameter homotopy. One can use the same script with nonrandom values of ParGoal to solve other problems in the family using a linear-product homotopy, but each such run uses the full linear-product root
count number of paths.
If any of the endpoints are degenerate in the run of lpd2par for random target parameters, then correspondingly fewer paths can be used in solving subsequent members of the family by parameter homotopy. Just copy p1 to ParStart and copy the nondegenerate endpoints into startpoint, and you are ready to apply the
parameter homotopy of the previous section. Here, "degenerate" can mean singular
solutions, solutions at infinity, or solutions on any pre-specified (i.e., independent
of the random choice of parameters) irreducible quasiprojective algebraic set. See
Chapter 7 for details.

C.6 Defining a Homotopy Function

In all the above usages, HOMLAB automatically constructs a homotopy in accord


with the instructions provided by the user. Alternatively, one can define a complete homotopy from scratch and then call up HOMLAB's path tracker to solve it. The homotopy function must be defined with the following interface:
function [h,hx,ht]=myhomotopy(x,t)
where myhomotopy can be any name of the user's choosing. The user must also pro-
vide a list of start points, whereupon the corresponding endpoints can be obtained
with the command
[xsoln,stats,xendgame]=endgamer(startpoint,'myhomotopy');
See § C.7 for details.

C.6.1 Defining a Parameter Path


In linear-product decompositions, the homotopy path is automatically chosen by
HOMLAB as a straight line through the corresponding coefficient space, as justified
by Theorem 8.3.1. In parameter homotopies, however, one must ensure that the
homotopy path stays in the desired parameter space for all t, not just for the start
and target systems at t = 1 and t = 0. If the parameter space is Euclidean, then a
linear path is acceptable. As in the example usage of parsolve above, this is easily obtained by the declaration PATHFUN='lin_path';, which makes use of the pre-defined function lin_path for linearly interpolating between points in parameter space. If
the parameter space is non-Euclidean, a more general type of path is needed. The
user must provide the definition in the form
function [p,dpdt]=mypath(pl,pO,t)
where mypath.m is a user-written m-file function. Then, set PATHFUN='mypath'; before calling parsolve to execute the homotopy. See the source code for lin_path.m
for an example to follow.

C.6.2 Homotopy Checking


Most errors in coding a homotopy function can be revealed by checking if computed
derivatives agree with a computation based on numerical differentiation. The fol-
lowing routine is provided for this purpose.
function [hxerr,hterr]=chekhfun(fname,nx,eps0)
-> checks homotopy functions with the interface [f,fx,ft]=fname(x,t)
The checking is done at a random point (x,t) in C^nx x C. The function provides a numerical comparison and also uses the Matlab spy function to graphically show which elements are suspicious, having an error greater than eps0. If eps0 is omitted from the call, it defaults to 10^-6. Note that high-level scripts define a global HFUN, which can be used for fname.
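The check described above amounts to comparing analytic derivatives against finite differences at a random complex point. A minimal Python sketch of the same idea (illustrative only; chekhfun itself is a MATLAB m-file, and the names below are assumptions):

```python
import numpy as np

def check_derivs(hfun, nx, eps0=1e-6, delta=1e-7):
    """Compare analytic derivatives [h, hx, ht] = hfun(x, t) against
    central finite differences at a random complex point; return boolean
    masks marking entries whose discrepancy exceeds eps0."""
    x = np.random.randn(nx) + 1j * np.random.randn(nx)
    t = complex(np.random.randn(), np.random.randn())
    _, hx, ht = hfun(x, t)
    hx_num = np.zeros_like(hx)
    for j in range(nx):
        e = np.zeros(nx, complex); e[j] = delta
        # central difference in the j-th variable
        hx_num[:, j] = (hfun(x + e, t)[0] - hfun(x - e, t)[0]) / (2 * delta)
    ht_num = (hfun(x, t + delta)[0] - hfun(x, t - delta)[0]) / (2 * delta)
    return np.abs(hx - hx_num) > eps0, np.abs(ht - ht_num) > eps0
```

Entries flagged True in either mask point at likely coding errors in the analytic derivatives, which is exactly the diagnostic that the spy plots of chekhfun display graphically.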

C.7 The Workhorse: Endgamer

The workhorse routine is endgamer.m, which tracks solution paths for a homotopy h(x,t) = 0 from a list of startpoint solutions of h(x,1) = 0 to their endpoints satisfying h(x,0) = 0. Specifically, endgamer has the usage
[xsoln,stats,xendgame]=endgamer(startpoint,hfun)
with the following inputs and outputs.


Inputs
startpoint An n x N matrix of N start points, listed columnwise.
hfun A string name of the homotopy function, h(x,t) : C^n x C -> C^n. The function routine must provide derivatives (see § C.6). It is recommended that the homotopy be homogenized.
Outputs
xsoln An n x N matrix of the endpoints of the homotopy paths.
stats A 6 x N matrix of statistics regarding the paths and their endpoints.
xendgame An n x N matrix recording the solutions for t at the start of the endgame.
There are a number of control settings regarding path-tracking tolerances and


the like which must be set prior to calling endgamer. These global variables can be
set by calling htopyset, as is done automatically by the high-level solving scripts
totdtab, mhomtab, lpdsolve, and parsolve. To change the default settings, one
just puts a copy of htopyset in the current working directory and edits the values.
Matlab will find and use the copy in the current directory, overriding the copy in the HOMLAB installation directory, which is best left in its original condition. Comments in the original copy of htopyset.m tell the default settings in case the user needs them for reference.
Routine endgamer loops through the start points and for each one does the
following:

(1) tracks the path to the beginning of the endgame, t=t_endgame, a global control
variable;
(2) records the solution at t=t_endgame as a column in xendgame;
(3) executes the power-series endgame (§ 10.3.3), monitoring the convergence cri-
terion and stopping when either convergence is reached or when one of several
protective stopping conditions is satisfied;
(4) records the best solution estimate, as judged by the convergence criterion, as a column in xsoln, and records certain statistics concerning the solution as a column in stats.

The details of all the required control settings are given next, followed by a
detailed description of the outputs.

C.7.1 Control Settings


As mentioned above, the control settings are established in htopyset.m, which is
called automatically by the high-level scripts. Here, we give a detailed list and
describe what each control means, as well as give the default value. We group these
into three general categories. These are all global variables.

LPD Start System (used by lpdstart)


• epsstart = 1e-12; Each solution of the linear-product start system is
found by choosing one linear factor from each equation and solving the
resulting linear system. Choices that give a singular linear system are ig-
nored. The solver builds the linear system one equation at a time, using
Gaussian elimination to triangularize as it proceeds. If at any stage the
magnitude of the largest available pivot is less than epsstart, that combi-
nation of linear factors is declared invalid, and the solver moves on to the
next combination.
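The pivot-threshold logic governed by epsstart can be sketched as follows (Python, illustrative only; HOMLAB's lpdstart is a MATLAB m-file): Gaussian elimination with partial pivoting that rejects a combination of linear factors when the largest available pivot is too small.

```python
import numpy as np

def solve_or_reject(A, b, epsstart=1e-12):
    """Solve Ax = b by Gaussian elimination with partial pivoting,
    returning None if the magnitude of the largest available pivot
    falls below epsstart (the combination of linear factors would
    then be declared invalid and skipped)."""
    A = A.astype(complex).copy(); b = b.astype(complex).copy()
    n = len(b)
    for k in range(n):
        i = k + np.argmax(np.abs(A[k:, k]))      # largest available pivot
        if np.abs(A[i, k]) < epsstart:
            return None                          # effectively singular
        A[[k, i]], b[[k, i]] = A[[i, k]], b[[i, k]]  # row swap
        m = A[k+1:, k] / A[k, k]
        A[k+1:] -= np.outer(m, A[k])             # eliminate below pivot
        b[k+1:] -= m * b[k]
    x = np.zeros(n, complex)
    for k in range(n - 1, -1, -1):               # back substitution
        x[k] = (b[k] - A[k, k+1:] @ x[k+1:]) / A[k, k]
    return x
```

Building the system one equation at a time, as lpdstart does, lets invalid combinations be discarded as early as possible rather than after assembling the full linear system.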
Path Tracking (used by tracker). This variable step-size tracker is a predictor-corrector type, as described in § 2.3.
• stepmin = 1e-6; The minimum step size in t, below which a path is declared as having failed.
• maxit = 3; The maximum number of Newton iterations allowed in the
corrector. If the designated convergence criterion is not met within this
number of iterations, the step is a failure and the step size will be halved.
• maxnfe = 1500; The maximum number of function evaluations allowed per
path. This limits the amount of computing time that a diverging path may
consume. For well-scaled, homogenized homotopies, this criterion should
rarely come into play.
• epstiny = 1e-12; In rare instances, a path may fail due to vanishing of the tangent vector (dh/dx)^-1 dh/dt. This is detected using the tolerance
epstiny. Typically, when this occurs, it is a signal that the homotopy is not
properly formed, possibly an error in a user-written function for evaluation
of the derivatives.
End Game (used by endgamer) This routine calls tracker to get to the start of
the endgame, then runs the power-series endgame.
• stepstart = 0.1; The initial step size for t in the tracker.
• epsbig = 1e-4; The tracking accuracy to be maintained in the initial
phase of tracking. This is the convergence tolerance for the corrector. If a
path does not successfully reach the endgame, it is tried once more from
the beginning with a tighter tolerance of epsbig/100.
• epssmall = 1e-6; The tracking accuracy to be maintained in the
endgame.
• t_endgame = 0.1; The value of t where the endgame starts.
• tstop = 1e-10; The value of t where the endgame gives up.
• tratio = 0.3; During the endgame, samples are taken for t in a geometric series where t_k = tratio * t_{k-1}. The value 0.3 is a compromise between the need to spread the samples out for a well-conditioned fit (tratio smaller) and the need to stay away from t = 0, where the path may be singular.
• eps_end = 1e-10; The criterion for deciding when the endpoint estimate
has converged. When two successive estimates agree to this tolerance, suc-
cess is declared.
• CycleMax = 4; This is the maximum winding number tested by the power-
series endgame. In double precision, the endgame is rarely successful above
winding number c = 4.
• maxerrup = 10; The endgame keeps a record of the smallest change in the
endpoint estimate in successive iterations. (This is compared to eps_end
for declaring success.) Usually, this measure improves with each succes-
sive iteration, unless the path gets too close to t = 0 before converging.
However, in the early stages of the endgame, the convergence measure can
sometimes increase briefly before entering the endgame operating zone. If
there are more than maxerrup successive iterations without improving on
the best iteration, the path is stopped.


• allowjump = 1; When nonzero, this flag allows the endgame to predict-correct across the origin in s, where s = t^{1/c} is the un-wound path variable. This allows the endgame to sample on both sides of s = 0 to estimate the value of the endpoint by seventh-order interpolation. If allowjump=0, samples are only taken for s > 0, and the endpoint is estimated using cubic extrapolation to s = 0.
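The interaction of t_endgame, tratio, and tstop fixes the endgame's geometric sampling schedule. A minimal sketch of that schedule (Python, illustrative only; the control names mirror the globals above):

```python
def endgame_samples(t_endgame=0.1, tratio=0.3, tstop=1e-10):
    """Geometric schedule of endgame sample points, t_k = tratio * t_{k-1},
    starting at t_endgame and stopping once t falls below tstop."""
    ts, t = [], t_endgame
    while t >= tstop:
        ts.append(t)
        t *= tratio
    return ts
```

With the default settings this yields on the order of eighteen candidate sample points between t = 0.1 and t = 1e-10, though the endgame normally stops much sooner, as soon as two successive endpoint estimates agree to eps_end.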

C.7.2 Verbose Mode


By declaring global verbose and setting verbose=l;, the user will cause
endgamer to print out its progress during the endgame for each path. This al-
lows one to see how well the endgame is performing. Usually this is not of great
interest, but if one is running a huge problem, it may be worth monitoring a small
sample of paths and tuning the control settings for greater efficiency. It is also
a useful way of confirming that all is working well: if superlinear convergence is
obtained in the endgame, it is a strong indicator that everything is in good order.
The five columns of information printed in verbose mode are:
• the t value of the current endgame sample,
• the difference between the last two endpoint estimates (maximum absolute value
of the difference in any variable),
• the current best guess for the winding number c,
• the status of the endgame, which is the number of samples involved in estimating
the endpoint. Each sample includes derivative information. "1" means there
is only one sample, so the estimate will be done by linear extrapolation. "2"
means two samples, so cubic extrapolation is available. "3" means an additional
sample has been acquired on the other side of s = 0, but cubic extrapolation
is still used. "4" means there are two samples on each side, so seventh-order
interpolation is used for the estimate.
• the fifth column is the number of successive iterations that have not improved
on the best estimate.

C.7.3 Path Statistics


The main tool for interpreting the results is to examine the stats matrix. It has
one column per path and six rows. The rows are as follows.
• stats(1,:) The value of t at which the endgame gave its best estimate.
• stats(2,:) The convergence estimate at the endpoint, which is the maximum absolute value of the difference in any variable between two successive endpoint estimates.
• stats(3,:) The function residual, that is, the maximum absolute value of any entry in h(x*,0) for the endpoint estimate x*.
• stats(4,:) The estimated winding number. If the true winding number is higher than CycleMax, this will typically result in a best guess of c=CycleMax in this place.
• stats(5,:) Condition number of the Jacobian matrix dh/dx (x*,0) at the endpoint estimate x*.
• stats(6,:) The total number of function evaluations used in computing this path.
For large runs, it is tedious to examine stats by looking at the raw numbers. It is much easier to look at histograms and other types of summary statistics. Any endpoint that does not have a small function residual, stats(3,:), has failed in a serious way; quite likely the path tracker stopped with a large value of t in stats(1,:). A histogram plot of log10(stats(3,:)) gives a quick check whether all the paths have ended in at least an approximate solution. If the endpoint is singular, its function residual can be small while it is still relatively far from the true endpoint. Check log10(stats(2,:)) to see the endpoint convergence measure. One hopes that all endpoints are computed to the desired accuracy, eps_end, but some may not, especially if they are singularities with cycle number of 4 or greater. If the accuracy is only moderate, say better than 10^-6 but not at the 10^-10 one desires, check if the condition number is at least moderately high, say 10^8 or greater. This would indicate that the root really is singular and failed to be computed accurately for that reason. Depending on one's purpose, that may be enough. When the singular endgame works well, the endpoint accuracy will be better than 10^-10, and the condition number of a singular point will be greater than 10^10, often as high as 10^16 or more.
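The triage described above, residual first, then convergence estimate, then condition number, can be sketched as follows (Python, illustrative only; the thresholds are example values, not HOMLAB defaults):

```python
import numpy as np

def triage(stats, res_tol=1e-8, acc_tol=1e-10, cond_tol=1e8):
    """Classify endpoints from a stats-like 6 x N array:
    row 2 = convergence estimate, row 3 = function residual,
    row 5 = Jacobian condition number (rows in MATLAB's 1-based numbering)."""
    conv, resid, cond = stats[1], stats[2], stats[4]
    failed = resid > res_tol                 # did not reach an approximate solution
    accurate = ~failed & (conv <= acc_tol)   # converged to the desired accuracy
    singular = ~failed & ~accurate & (cond >= cond_tol)  # likely a singular root
    return failed, accurate, singular
```

Paths falling in none of the three classes, small residual but mediocre accuracy at a well-conditioned root, deserve a closer look, e.g., by rerunning with tighter tracking tolerances.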

C.8 Solutions at Infinity and Dehomogenization

When the solutions are computed in homogeneous or multihomogeneous coordinates, they can be scaled in each projective factor. Usually the original formulation is in C^n and it has been recast in P^n, or in a cross product of projective spaces, by introducing one or more homogenizing coordinates.
Solutions at infinity are indicated by a small homogenizing coordinate or, if multihomogenized, by at least one homogenizing coordinate being near zero. Here, being near zero means, typically, being of the same magnitude as the convergence estimate in stats(2,:). If the homogenizing coordinate is in row k of xsoln, then a histogram of abs(log10(xsoln(k,:))) can be very revealing.
For a finite solution, we wish to rescale to make the homogenizing coordinate(s)
equal to one. Subroutine dehomog does this. The short form is

x=dehomog(xsoln,eps0);

where eps0 is the magnitude of the homogenizing coordinate below which a solution is declared to be at infinity. This form assumes that the solutions are one-homogenized and that the homogenizing coordinate is the last entry. Any solution
determined to be at infinity is rescaled by its largest element, while a finite one is
rescaled by the homogenizing coordinate. This is usually what one wants, but the
result can be a bit surprising if eps0 is made too small, so that a poorly computed solution at infinity gets erroneously rescaled as if it were finite.
A more elaborate form must be used for multihomogenized solutions:
x=dehomog(xsoln,eps0,HomStruct,homvar);
where HomStruct identifies the membership in the various homogeneous groupings,
and homvar is a list of the row number for each homogenizing variable. If homvar is
missing, the last variable of each group is assumed by default to be the homogenizing
variable for that group.
In either the short form or the long form, dehomog sets one variable of each
homogeneous group to one.
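The short-form behavior of dehomog described above can be sketched as follows (Python, illustrative only; the actual routine is a MATLAB m-file and handles multihomogeneous groupings as well):

```python
import numpy as np

def dehomog(xsoln, eps0=1e-8):
    """One-homogeneous dehomogenization: for each column, if the last
    (homogenizing) coordinate exceeds eps0 in magnitude, divide through
    by it; otherwise the solution is at infinity and is rescaled by its
    largest-magnitude entry instead."""
    x = np.array(xsoln, dtype=complex, copy=True)
    for j in range(x.shape[1]):
        h = x[-1, j]
        if abs(h) > eps0:
            x[:, j] /= h                      # finite: homogenizing coordinate -> 1
        else:
            x[:, j] /= x[np.argmax(np.abs(x[:, j])), j]  # at infinity
    return x
```

Rescaling a point at infinity by its largest entry keeps all of its coordinates of moderate size, which is why the small-eps0 pitfall mentioned above matters: a point at infinity wrongly treated as finite gets divided by a nearly vanishing coordinate, blowing up all of its entries.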
Bibliography

Abhyankar, S. S. (1990). Algebraic geometry for scientists and engineers, Vol. 35 of


Mathematical Surveys and Monographs. Providence, RI: American Mathematical
Society.
Alefeld, G., & Herzberger, J. (1983). Introduction to interval computations. Computer Science and Applied Mathematics. New York: Academic Press Inc. [Harcourt Brace Jovanovich Publishers]. Translated from the German by Jon Rokne.
Allgower, E. L., Erdmann, M., & Georg, K. (2002). On the complexity of exclusion
algorithms for optimization. J. Complexity, 18(2), 573-588. Algorithms and
complexity for continuous problems/Algorithms, computational complexity, and
models of computation for nonlinear and multivariate problems (Dagstuhl/South
Hadley, MA, 2000).
Allgower, E. L., & Georg, K. (1993). Continuation and path following. In Acta numerica, Vol. 2 (pp. 1-64). Cambridge: Cambridge Univ. Press.
Allgower, E. L., & Georg, K. (1997). Numerical path following. In Handbook of
numerical analysis, Vol. V (pp. 3-207). Amsterdam: North-Holland.
Allgower, E. L., & Georg, K. (2003). Introduction to numerical continuation methods, Vol. 45 of Classics in Applied Mathematics. Philadelphia, PA: Society for In-
dustrial and Applied Mathematics (SIAM). Reprint of the 1990 edition [Springer-
Verlag, Berlin].
Allgower, E. L., Georg, K., & Miranda, R. (1992). The method of resultants for
computing real solutions of polynomial systems. SIAM J. Numer. Anal., 29(3),
831-844.
Allgower, E. L., & Sommese, A. J. (2002). Piecewise linear approximation of smooth
compact fibers. J. Complexity, 18(2), 547-556. Algorithms and complexity for
continuous problems/Algorithms, computational complexity, and models of com-
putation for nonlinear and multivariate problems (Dagstuhl/South Hadley, MA,
2000).
Alt, H. (1923). Uber die Erzeugung gegebener ebener Kurven mit Hilfe des Ge-
lenkvierecks. Zeitschrift fur Angewandte Mathematik und Mechanik, 3(1), 13-19.
Arbarello, E., Cornalba, M., Griffiths, P. A., & Harris, J. (1985). Geometry of algebraic curves. Vol. I, Vol. 267 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. New York: Springer-Verlag.


Auzinger, W., & Stetter, H. J. (1988). An elimination algorithm for the computa-
tion of all zeros of a system of multivariate polynomial equations. In Numerical
mathematics, Singapore 1988, Vol. 86 of Internat. Schriftenreihe Numer. Math.
(pp. 11-30). Basel: Birkhauser.
Bates, D., Peterson, C., & Sommese, A. J. (2005a). A numerical-symbolic algorithm for computing the multiplicity of a component of an algebraic set. In preparation.
Bates, D., Sommese, A. J., & Wampler, C. W. (2005b). Multiprecision endgames
for homotopy continuation, in preparation.
Beltrametti, M. C., Howard, A., Schneider, M., & Sommese, A. J. (2000). Projections from subvarieties. In Complex analysis and algebraic geometry (pp. 71-107).
Berlin: de Gruyter.
Beltrametti, M. C., & Sommese, A. J. (1995). The adjunction theory of complex projective varieties, Vol. 16 of de Gruyter Expositions in Mathematics. Berlin:
Walter de Gruyter & Co.
Bernstein, D. N. (1975). The number of roots of a system of equations. Functional Anal. Appl., 9(3), 183-185. Translated from Funktsional. Anal. i Prilozhen. 9(3):1-4, 1975.
Borel, A. (1969). Linear algebraic groups. Notes taken by H. Bass. W. A. Benjamin,
Inc., New York-Amsterdam.
Bottema, O., & Roth, B. (1979). Theoretical kinematics, Vol. 24 of North-Holland
Series in Applied Mathematics and Mechanics. Amsterdam: North-Holland Pub-
lishing Co.
Burmester, L. E. H. (1888). Lehrbuch der Kinematik. Leipzig: A. Felix.
Calabri, A., & Ciliberto, C. (2001). On special projections of varieties: epitome to
a theorem of Beniamino Segre. Adv. Geom., 1(1), 97-106.
Canny, J. (1990). Generalised characteristic polynomials. J. Symbolic Comput., 9,
241-250.
Canny, J., & Manocha, D. (1993). Multipolynomial resultant algorithms. J. Sym-
bolic Comput., 15, 99-122.
Canny, J., & Rojas, J. M. (1991). An optimal condition for determining the exact
number of roots of a polynomial system. Proceedings of the 1991 International
Symposium on Symbolic and Algebraic Computation (pp. 96-101). ACM, New
York.
Chablat, D., Wenger, P., Majou, R., & Merlet, J.-P. (2004). An interval based study
for the design and the comparison of three-degrees-of-freedom parallel kinematic
machines. Int. J. Robotics Research, 23(6), 615-624.
Chen, N. X., & Song, S.-M. (1994). Direct position analysis of the 4-6 Stewart
platform. ASME J. Mech. Design, 116(1), 61-66.
Chow, S. N., Mallet-Paret, J., & Yorke, J. A. (1979). A homotopy method for locating all zeros of a system of polynomials. In Functional differential equations and approximation of fixed points (Proc. Summer School and Conf., Univ. Bonn, Bonn, 1978), Vol. 730 of Lecture Notes in Math. (pp. 77-88). Berlin: Springer.
Chu, M. T., Li, T.-Y., & Sauer, T. (1988). Homotopy method for general λ-matrix problems. SIAM J. Matrix Anal. Appl., 9(4), 528-536.
Cox, D., Little, J., & O'Shea, D. (1997). Ideals, varieties, and algorithms. Under-
graduate Texts in Mathematics. New York: Springer-Verlag, second edition. An
introduction to computational algebraic geometry and commutative algebra.
Cox, D., Little, J., & O'Shea, D. (1998). Using algebraic geometry, Vol. 185 of
Graduate Texts in Mathematics. New York: Springer-Verlag.
D'Andrea, C., & Emiris, I. Z. (2003). Sparse resultant perturbations. In Algebra,
geometry, and software systems (pp. 93-107). Berlin: Springer.
Datta, R. S. (2003). Using computer algebra to find Nash equilibria. Proceedings of
the 2003 International Symposium on Symbolic and Algebraic Computation (pp.
74-79). New York: ACM.
Davidenko, D. F. (1953a). On a new method of numerical solution of systems of
nonlinear equations. Doklady Akad. Nauk SSSR (N.S.), 88, 601-602.
Davidenko, D. F. (1953b). On approximate solution of systems of nonlinear equa-
tions. Ukrain. Mat. Zurnal, 5, 196-206.
Davis, P. J. (1975). Interpolation and approximation. New York: Dover Publications
Inc. Republication, with minor corrections, of the 1963 original, with a new
preface and bibliography.
Decker, W., Greuel, G.-M., & Pfister, G. (1999). Primary decomposition: algo-
rithms and comparisons. In Algorithmic algebra and number theory (Heidelberg,
1997) (pp. 187-220). Berlin: Springer.
Decker, W., & Schreyer, F.-O. (2001). Computational algebraic geometry today.
In Applications of algebraic geometry to coding theory, physics and computation
(Eilat, 2001), Vol. 36 of NATO Sci. Ser. II Math. Phys. Chem. (pp. 65-119).
Dordrecht: Kluwer Acad. Publ.
Decker, W., & Schreyer, F.-O. (2005). Solving polynomial equations: Foundations,
algorithms, and applications, to appear.
Denavit, J., & Hartenberg, R. S. (1955). A kinematic notation for lower pair mech-
anisms based on matrices. J. Appl. Mechanics, 22, 215-221. Trans. ASME, vol.
77.
Dhingra, A., Kohli, D., & Xu, Y. X. (1992). Direct kinematic of general Stewart
platforms. DE-Vol. 45, Robotics, Spatial Mechanisms, and Mechanical Systems
(pp. 107-112). ASME.
Dian, J., & Kearfott, R. B. (2003). Existence verification for singular and nonsmooth
zeros of real nonlinear systems. Math. Comp., 72(242), 757-766.
Dickenstein, A., & Emiris, I. Z. (Eds.), (preprint). Solving polynomial equa-
tions: Foundations, algorithms, and applications. Berlin Heidelberg New York:
Springer-Verlag.
Dietmaier, P. (1998). The Stewart-Gough platform of general geometry can have 40 real postures. In J. Lenarcic, & M. L. Husty (Eds.), Advances in robot kinematics: Analysis and control (pp. 1-10). Dordrecht: Kluwer Academic Publishers.


Dixon, A. L. (1909). The eliminant of three quantics in two independent variables.
Proc. London Math. Soc., 2(7), 49-69.
Drexler, F. J. (1977). Eine Methode zur Berechnung samtlicher Losungen von
Polynomgleichungssystemen. Numer. Math., 29(1), 45-58.
Drexler, F. J. (1978). A homotopy method for the calculation of all zeros of zero-
dimensional polynomial ideals. In Developments in statistics, vol. 1 (pp. 69-93).
New York: Academic Press.
Duffy, J., & Crane, C. (1980). A displacement analysis of the general spatial 7-link,
7R mechanism. Mechanism Machine Theory, 15(3-A), 153-169.
Eisenbud, D. (1995). Commutative Algebra with a view toward algebraic geometry,
Vol. 150 of Graduate Texts in Mathematics. New York: Springer-Verlag.
Emiris, I. Z. (1994). Sparse elimination and applications in kinematics. PhD the-
sis, Computer Science Division, Dept. of Electrical Engineering and Computer
Science, University of California, Berkeley.
Emiris, I. Z. (1995). A general solver based on sparse resultants. Proc. PoSSo
(Polynomial System Solving) Workshop on Software (pp. 35-54). Paris.
Emiris, I. Z. (2003). Discrete geometry for algebraic elimination. In Algebra, geom-
etry, and software systems (pp. 77-91). Berlin: Springer.
Faugere, J. C., & Lazard, D. (1995). The combinatorial classes of parallel manipu-
lators. Mechanism Machine Theory, 30(6), 765-776.
Feinberg, M. (1980). Chemical oscillations, multiple equilibria, and reaction network
structure. In W. E. Stewart (Ed.), Dynamics and modelling of reactive systems
(pp. 59-130). Academic Press, Inc.
Fischer, G. (1976). Complex analytic geometry. Berlin: Springer-Verlag. Lecture
Notes in Mathematics, Vol. 538.
Fischer, G. (2001). Plane algebraic curves, Vol. 15 of Student Mathematical Li-
brary. Providence, RI: American Mathematical Society. Translated from the
1994 German original by Leslie Kay.
Freudenstein, F., & Roth, B. (1963). Numerical solution of systems of nonlinear
equations. J. ACM, 10(4), 550-556.
Frisch, J. (1967). Points de platitude d'un morphisme d'espaces analytiques com-
plexes. Invent. Math., 4, 118-138.
Fritzsche, K., & Grauert, H. (2002). From holomorphic functions to complex man-
ifolds, Vol. 213 of Graduate Texts in Mathematics. New York: Springer-Verlag.
Fulton, W. (1998). Intersection theory, Vol. 2 of Ergebnisse der Mathematik und
ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics [Results
in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in
Mathematics]. Berlin: Springer-Verlag, second edition.
Gao, T., & Li, T.-Y. (2000). Mixed volume computation via linear programming.
Taiwanese J. Math., 4(4), 599-619.
Gao, T., & Li, T.-Y. (2003). Mixed volume computation for semi-mixed systems. Discrete Comput. Geom., 29(2), 257-277.


Gao, T., Li, T.-Y., Verschelde, J., & Wu, M. (2000). Balancing the lifting values to
improve the numerical stability of polyhedral homotopy continuation methods.
Appl. Math. Comput, 114(2-3), 233-247.
Gao, T., Li, T.-Y., & Wang, X. (1999). Finding all isolated zeros of polynomial systems in C^n via stable mixed volumes. J. Symbolic Comput., 28(1-2), 187-211.
Polynomial elimination—algorithms and applications.
Garcia, C. B., & Zangwill, W. I. (1979). Finding all solutions to polynomial systems
and other systems of equations. Math. Programming, 16(2), 159-176.
Garcia, C. B., & Zangwill, W. I. (1980). Global continuation methods for finding
all solutions to polynomial systems of equations in n variables. In Extremal
methods and systems analysis (Internat. Sympos., Univ. Texas, Austin, Tex.,
1977), Vol. 174 of Lecture Notes in Econom. and Math. Systems (pp. 481-497).
Berlin: Springer.
Gelfand, I., Kapranov, M., & Zelevinsky, A. (1994). Discriminants, resultants and
multidimensional determinants. Boston: Birkhauser.
Georg, K. (2001). Improving the efficiency of exclusion algorithms. Adv. Geom.,
1(2), 193-210.
Georg, K. (2003). A new exclusion test. J. Comput. Appl. Math., 152(1-2), 147-
160. Proceedings of the International Conference on Recent Advances in Com-
putational Mathematics (ICRACM 2001) (Matsuyama).
Giusti, M., Hagele, K., Lecerf, G., Marchand, J., & Salvy, B. (2000). The projective
Noether Maple package: computing the dimension of a projective variety. J.
Symbolic Comput, 30(3), 291-307.
Goedecker, S. (1994). Remark on algorithms to find roots of polynomials. SIAM J. Sci. Comput., 15(5), 1059-1063.
Goresky, M., & MacPherson, R. (1988). Stratified Morse theory, Vol. 14 of Ergeb-
nisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and
Related Areas (3)]. Berlin: Springer-Verlag.
Greuel, G.-M. (2000). Computer algebra and algebraic geometry—achievements
and perspectives. J. Symbolic Comput, 30(3), 253-289.
Greuel, G.-M., & Pfister, G. (2002). A singular introduction to commutative al-
gebra. Berlin: Springer-Verlag. With contributions by O. Bachmann, C. Lossen
and H. Schonemann, With 1 CD-ROM (Windows, Macintosh, and UNIX).
Griewank, A., & Osborne, M. R. (1983). Analysis of Newton's method at irregular singularities. SIAM J. Numer. Anal., 20(4), 747-773.
Griffis, M., & Duffy, J. (1993). Method and apparatus for controlling geometrically
simple parallel mechanisms with distinctive connections. US Patent 5,179,525.
Griffiths, P. A., & Harris, J. (1994). Principles of algebraic geometry. Wiley Classics
Library. New York: John Wiley & Sons Inc. Reprint of the 1978 original.
Gunning, R. C. (1970). Lectures on complex analytic varieties: The local parametrization theorem. Mathematical Notes. Princeton, N.J.: Princeton University Press.
Gunning, R. C. (1990). Introduction to holomorphic functions of several variables.
Vol. II. The Wadsworth & Brooks/Cole Mathematics Series. Monterey, CA:
Wadsworth & Brooks/Cole Advanced Books & Software. Local theory.
Gunning, R. C., & Rossi, H. (1965). Analytic functions of several complex variables.
Englewood Cliffs, N.J.: Prentice-Hall Inc.
Hamming, R. W. (1986). Numerical methods for scientists and engineers. New
York: Dover Publications Inc., second edition.
Harris, J. (1995). Algebraic geometry, Vol. 133 of Graduate Texts in Mathematics.
New York: Springer-Verlag. A first course, Corrected reprint of the 1992 original.
Hartenberg, R. S., & Denavit, J. (1964). Kinematic synthesis of linkages. McGraw-
Hill, N.Y.
Hartshorne, R. (1977). Algebraic geometry. New York: Springer-Verlag. Graduate
Texts in Mathematics, No. 52.
Hille, E. (1959). Analytic function theory. Vol. 1. Introduction to Higher Mathe-
matics. Ginn and Company, Boston.
Hille, E. (1962). Analytic function theory. Vol. II. Introductions to Higher Mathe-
matics. Ginn and Co., Boston, Mass.-New York-Toronto, Ont.
Hodge, W. V. D., & Pedoe, D. (1994a). Methods of algebraic geometry. Vol. I. Cambridge Mathematical Library. Cambridge: Cambridge University Press. Book I:
Algebraic preliminaries, Book II: Projective space, Reprint of the 1947 original.
Hodge, W. V. D., & Pedoe, D. (1994b). Methods of algebraic geometry. Vol. II. Cambridge Mathematical Library. Cambridge: Cambridge University Press. Book III:
General theory of algebraic varieties in projective space, Book IV: Quadrics and
Grassmann varieties, Reprint of the 1952 original.
Hodge, W. V. D., k Pedoe, D. (1994c). Methods of algebraic geometry. Vol.
III. Cambridge Mathematical Library. Cambridge: Cambridge University Press.
Book V: Birational geometry, Reprint of the 1954 original.
Ho§ten, S., k Shapiro, J. (2000). Primary decomposition of lattice basis ideals. J.
Symbolic Comput., 29(4-5), 625-639. Symbolic computation in algebra, analysis,
and geometry (Berkeley, CA, 1998).
Huang, Y., Wu, W., Stetter, H. J., k Zhi, L. (2000). Pseudofactors of multivariate
polynomials. Proceedings of the 2000 International Symposium on Symbolic and
Algebraic Computation (St. Andrews) (pp. 161-168). New York: ACM.
Huber, B., Sottile, F., & Sturmfels, B. (1998). Numerical Schubert calculus. J.
Symbolic Comput., 26(6), 767-788. Symbolic numeric algebra for polynomials.
Huber, B., & Sturmfels, B. (1995). A polyhedral method for solving sparse polyno-
mial systems. Math. Comp., 64(212), 1541-1555.
Huber, B., & Sturmfels, B. (1997). Bernstein's theorem in affine space. Discrete
Comput. Geom., 17(2), 137-141.
Huber, B., & Verschelde, J. (1998). Polyhedral end games for polynomial continu-
ation. Numer. Algorithms, 18(1), 91-108.
Huber, B., & Verschelde, J. (2000). Pieri homotopies for problems in enumerative
geometry applied to pole placement in linear systems control. SIAM J. Control
Optim., 38(4), 1265-1287.
Husty, M. L. (1996). An algorithm for solving the direct kinematics of general
Stewart-Gough platforms. Mechanism Machine Theory, 31(4), 365-380.
Husty, M. L., & Karger, A. (2000). Self-motions of Griffis-Duffy type parallel
manipulators. Proceedings of the 2000 IEEE Int. Conf. Robotics and Automation,
CDROM, San Francisco, CA, April 24-28, 2000. IEEE.
Iitaka, S. (1982). Algebraic geometry, Vol. 76 of Graduate Texts in Mathematics.
New York: Springer-Verlag. An introduction to birational geometry of algebraic
varieties, North-Holland Mathematical Library, 24.
Innocenti, C. (1995). Polynomial solution to the position analysis of the 7-link Assur
kinematic chain with one quaternary link. Mechanism Machine Theory, 30(8),
1295-1303.
Isaacson, E., & Keller, H. B. (1994). Analysis of numerical methods. New York:
Dover Publications Inc. Corrected reprint of the 1966 original [Wiley, New York].
Kearfott, R. B. (1996). Rigorous global search: continuous problems, Vol. 13 of Non-
convex Optimization and its Applications. Dordrecht: Kluwer Academic Publish-
ers.
Kearfott, R. B. (1997). Empirical evaluation of innovations in interval branch and
bound algorithms for nonlinear systems. SIAM J. Sci. Comp., 18(2), 574-594.
Kearfott, R. B., & Novoa, M. (1990). Algorithm 681: INTBIS, a portable interval
Newton/bisection package. ACM Trans. Math. Softw., 16(2), 152-157.
Kearfott, R. B., & Xing, Z. (1994). An interval step control for continuation meth-
ods. SIAM J. Numer. Anal., 31(3), 892-914.
Keller, H. B. (1981). Geometrically isolated nonisolated solutions and their approx-
imation. SIAM J. Numer. Anal., 18(5), 822-838.
Kendig, K. (1977). Elementary algebraic geometry. New York: Springer-Verlag.
Graduate Texts in Mathematics, No. 44.
Khovanskii, A. G. (1978). Newton polyhedra, and the genus of complete intersec-
tions. Funktsional. Anal. i Prilozhen., 12(1), 51-61.
Kleiman, S. L. (1986). Tangency and duality. Proceedings of the 1984 Vancou-
ver conference in algebraic geometry, Vol. 6 of CMS Conf. Proc. (pp. 163-225).
Providence, RI: Amer. Math. Soc.
Knuth, D. E. (1981). The art of computer programming. Vol. 2. Addison-Wesley
Publishing Co., Reading, Mass., second edition. Seminumerical algorithms,
Addison-Wesley Series in Computer Science and Information Processing.
Krick, T. (2004). Straight-line programs in polynomial equation solving. In F.
Cucker, R. DeVore, P. Olver, & E. Süli (Eds.), Foundations of computational
mathematics, Minneapolis 2002. Cambridge University Press.
Kuo, Y.-C., Li, T.-Y., & Wu, D. (2004). Determining whether a numerical solution
of a polynomial system is isolated, preprint.
Kushnirenko, A. G. (1976). Newton polytopes and the Bezout theorem. Funktsional.
Anal. i Prilozhen., 10(3), 82-83.
Lazard, D. (1993). On the representation of rigid-body motions and its application
to generalized platform manipulators. In J. Angeles, P. Kovacs, & G. Hommel
(Eds.), Computational kinematics (pp. 175-182). Kluwer.
Lecerf, G. (2001). Une alternative aux méthodes de réécriture pour la résolution
des systèmes algébriques. PhD thesis, École Polytechnique.
Lecerf, G. (2002). Quadratic Newton iteration for systems with multiplicity. Found.
Comput. Math., 2(3), 247-293.
Lee, H.-Y., & Liang, C.-G. (1988). Displacement analysis of the general spatial
7-link 7R mechanism. Mechanism Machine Theory, 23(3), 219-226.
Leykin, A., Verschelde, J., & Zhao, A. (2004). Newton's method with deflation for
isolated singularities of polynomial systems, preprint.
Li, T.-Y. (1983). On Chow, Mallet-Paret and Yorke homotopy for solving systems
of polynomials. Bull. Inst. Math. Acad. Sinica, 11(3), 433-437.
Li, T.-Y. (1993). Solving polynomial systems by homotopy continuation methods.
In Computer mathematics (Tianjin, 1991), Vol. 5 of Nankai Ser. Pure Appl.
Math. Theoret. Phys. (pp. 18-35). River Edge, NJ: World Sci. Publishing.
Li, T.-Y. (1997). Numerical solution of multivariate polynomial systems by homo-
topy continuation methods. In Acta numerica, Vol. 6 (pp. 399-436). Cambridge:
Cambridge Univ. Press.
Li, T.-Y. (1999). Solving polynomial systems by polyhedral homotopies. Taiwanese
J. Math., 3(3), 251-279.
Li, T.-Y. (2003). Numerical solution of polynomial systems by homotopy con-
tinuation methods. In Handbook of numerical analysis, Vol. XI (pp. 209-304).
Amsterdam: North-Holland.
Li, T.-Y., & Li, X. (2001). Finding mixed cells in the mixed volume computation.
Found. Comput. Math., 1(2), 161-181.
Li, T.-Y., & Sauer, T. (1987a). Homotopy method for generalized eigenvalue prob-
lems Ax = λBx. Linear Algebra Appl., 91, 65-74.
Li, T.-Y., & Sauer, T. (1987b). Regularity results for solving systems of polynomials
by homotopy method. Numer. Math., 50(3), 283-289.
Li, T.-Y., & Sauer, T. (1989). A simple homotopy for solving deficient polynomial
systems. Japan J. Appl. Math., 6(3), 409-419.
Li, T.-Y., Sauer, T., & Yorke, J. A. (1987a). Numerical solution of a class of deficient
polynomial systems. SIAM J. Numer. Anal., 24(2), 435-451.
Li, T.-Y., Sauer, T., & Yorke, J. A. (1987b). The random product homotopy and
deficient polynomial systems. Numer. Math., 51(5), 481-500.
Li, T.-Y., Sauer, T., & Yorke, J. A. (1988). Numerically determining solutions of
systems of polynomial equations. Bull. Amer. Math. Soc. (N.S.), 18(2), 173-177.
Li, T.-Y., Sauer, T., & Yorke, J. A. (1989). The cheater's homotopy: an efficient
procedure for solving systems of polynomial equations. SIAM J. Numer. Anal.,
26(5), 1241-1251.
Li, T. Y., Wang, T., & Wang, X. (1996). Random product homotopy with minimal
BKK bound. In The mathematics of numerical analysis (Park City, UT, 1995),
Vol. 32 of Lectures in Appl. Math. (pp. 503-512). Providence, RI: Amer. Math.
Soc.
Li, T.-Y., & Wang, X. (1991). Solving deficient polynomial systems with homotopies
which keep the subschemes at infinity invariant. Math. Comp., 56(194), 693-710.
Li, T.-Y., & Wang, X. (1992). Nonlinear homotopies for solving deficient polynomial
systems with parameters. SIAM J. Numer. Anal., 29(4), 1104-1118.
Li, T.-Y., & Wang, X. (1996). The BKK root count in C^n. Math. Comp., 65(216),
1477-1484.
Li, T.-Y., & Zheng, Z. (2004). A rank-revealing method and its applications.
preprint.
Lipman, J. (1975). Introduction to resolution of singularities. In Algebraic geometry
(Proc. Sympos. Pure Math., Vol. 29, Humboldt State Univ., Arcata, Calif., 1974)
(pp. 187-230). Providence, R.I.: Amer. Math. Soc.
Lo Cascio, M. L., Pasquini, L., & Trigiante, D. (1989). Simultaneous determina-
tion of polynomial roots and multiplicities: an algorithm and related problems.
Ricerche Mat., 38(2), 283-305.
Losch, S. (1995). Parallel redundant manipulators based on open and closed normal
Assur chains. In J.-P. Merlet, & B. Ravani (Eds.), Computational kinematics '95,
Proceedings of the Second Workshop held in Sophia Antipolis, September 4-6,
1995, Vol. 40 of Solid Mechanics and its Applications (pp. x+310). Dordrecht:
Kluwer Academic Publishers Group.
Lu, Y., Sommese, A. J., & Wampler, C. W. (2005). Finding all real solutions of
polynomial systems: I the curve case, in preparation.
Macaulay, F. (1902). On some formulas in elimination. Proc. London Math. Soc.,
3, 3-27.
Manocha, D. (1993). Efficient algorithms for multipolynomial resultant. The Com-
puter Journal, 36, 485-496.
Manocha, D. (1994). Solving systems of polynomial equations. IEEE Comput.
Graph. Appl, 36, 46-55.
Manocha, D., & Canny, J. F. (1994). Efficient inverse kinematics for general 6R
manipulators. IEEE Trans. Rob. Auto., 10(5), 648-657.
Manseur, R., & Doty, K. (1989). A robot manipulator with 16 real inverse kinematic
solution set. Int. J. Robotics Res., 8(5), 75-79.
Marden, M. (1966). Geometry of polynomials. Second edition. Mathematical Sur-
veys, No. 3. Providence, R.I.: American Mathematical Society.
Mavroidis, C, & Roth, B. (1995a). Analysis of overconstrained mechanisms. ASME
J. Mech. Design, 117, 69-74.
Mavroidis, C, & Roth, B. (1995b). New and revised overconstrained mechanisms.
ASME J. Mech. Design, 117, 75-82.
Mayer St-Onge, B., & Gosselin, C. M. (2000). Singularity analysis and represen-
tation of the general Gough-Stewart platform. Int. J. Robotics Research, 19,
271-288.
Meintjes, K., & Morgan, A. P. (1987). A methodology for solving chemical equilib-
rium systems. Appl. Math. Comput., 22, 333-361.
Merlet, J.-P. (1989). Singular configurations of parallel manipulators and Grass-
mann geometry. Int. J. Robotics Research, 8, 45-56.
Merlet, J.-P. (2000). Parallel robots. Kluwer Academic Publishers, Dordrecht, The
Netherlands.
Merlet, J.-P. (2001). A parser for the interval evaluation of analytical functions and
its applications to engineering problems. J. Symbolic Computation, 31, 475-486.
Mignotte, M., & Stefanescu, D. (1999). Polynomials. Springer Series in Discrete
Mathematics and Theoretical Computer Science. Springer-Verlag, Singapore. An
algorithmic approach.
Milnor, J. W. (1965). Topology from the differentiable viewpoint. Based on notes
by David W. Weaver. The University Press of Virginia, Charlottesville, Va.
Möller, H. M. (1998). Gröbner bases and numerical analysis. In Gröbner bases and
applications (Linz, 1998), Vol. 251 of London Math. Soc. Lecture Note Ser. (pp.
159-178). Cambridge: Cambridge Univ. Press.
Möller, H. M., & Stetter, H. J. (1995). Multivariate polynomial equations with
multiple zeros solved by matrix eigenproblems. Num. Math., 70, 311-329.
Moore, R. E. (1979). Methods and applications of interval analysis, Vol. 2 of SIAM
Studies in Applied Mathematics. Philadelphia, Pa.: Society for Industrial and
Applied Mathematics (SIAM).
Morgan, A. P. (1983). A method for computing all solutions to systems of polyno-
mial equations. ACM Trans. Math. Software, 9(1), 1-17.
Morgan, A. P. (1986a). A homotopy for solving polynomial systems. Appl. Math.
Comput., 18(1), 87-92.
Morgan, A. P. (1986b). A transformation to avoid solutions at infinity for polyno-
mial systems. Appl. Math. Comput., 18(1), 77-86.
Morgan, A. P. (1987). Solving polynomial systems using continuation for engineering
and scientific problems. Prentice-Hall, Englewood Cliffs, N.J.
Morgan, A. P., & Sommese, A. J. (1987a). A homotopy for solving general poly-
nomial systems that respects m-homogeneous structures. Appl. Math. Comput.,
24(2), 101-113.
Morgan, A. P., & Sommese, A. J. (1987b). Computing all solutions to polynomial
systems using homotopy continuation. Appl. Math. Comput., 24(2), 115-138. Errata:
Appl. Math. Comput. 51 (1992), p. 209.
Morgan, A. P., & Sommese, A. J. (1989). Coefficient-parameter polynomial con-
tinuation. Appl. Math. Comput., 29(2), 123-160. Errata: Appl. Math. Comput.
51:207(1992).
Morgan, A. P., & Sommese, A. J. (1990). Generically nonsingular polynomial
continuation. In Computational solution of nonlinear systems of equations (Fort
Collins, CO, 1988), Vol. 26 of Lectures in Appl. Math. (pp. 467-493). Providence,
RI: Amer. Math. Soc.
Morgan, A. P., Sommese, A. J., & Wampler, C. W. (1990). Polynomial continuation
for mechanism design problems. In Computational solution of nonlinear systems
of equations (Fort Collins, CO, 1988), Vol. 26 of Lectures in Appl. Math. (pp.
495-517). Providence, RI: Amer. Math. Soc.
Morgan, A. P., Sommese, A. J., & Wampler, C. W. (1991). Computing singular
solutions to nonlinear analytic systems. Numer. Math., 58(7), 669-684.
Morgan, A. P., Sommese, A. J., & Wampler, C. W. (1992a). Computing singular
solutions to polynomial systems. Adv. in Appl. Math., 13(3), 305-327.
Morgan, A. P., Sommese, A. J., & Wampler, C W. (1992b). A power series method
for computing singular solutions to nonlinear analytic systems. Numer. Math.,
63(3), 391-409.
Morgan, A. P., Sommese, A. J., & Wampler, C. W. (1995). A product-
decomposition bound for Bezout numbers. SIAM J. Numer. Anal., 32(4), 1308-
1325.
Morgan, A. P., Sommese, A. J., & Watson, L. T. (1989). Finding all isolated
solutions to polynomial systems using HOMPACK. ACM Trans. Math. Software,
15(2), 93-122.
Morgan, A. P., & Wampler, C. W. (1990). Solving a planar four-bar design problem
using continuation. ASME J. Mech. Design, 112, 544-550.
Morgan, A. P., & Watson, L. T. (1987). Solving polynomial systems of equations
on a hypercube. In Hypercube multiprocessors 1987 (Knoxville, TN, 1986) (pp.
501-511). Philadelphia, PA: SIAM.
Morgan, A. P., & Watson, L. T. (1989). A globally convergent parallel algorithm
for zeros of polynomial systems. Nonlinear Anal., 13(11), 1339-1350.
Mourrain, B. (1993, July). The 40 generic positions of a parallel robot. In M.
Bronstein (Ed.), Proc. ISSAC'93 (Kiev) (pp. 173-182). ACM Press.
Mourrain, B. (1996). Enumeration problems in geometry, robotics and vision. In
Algorithms in algebraic geometry and applications (Santander, 1994), Vol. 143 of
Progr. Math. (pp. 285-306). Basel: Birkhauser.
Mourrain, B. (1998). Computing the isolated roots by matrix methods. J. Symbolic
Comput., 26(6), 715-738. Symbolic numeric algebra for polynomials.
Mumford, D. (1966). Lectures on curves on an algebraic surface. With a section
by G. M. Bergman. Annals of Mathematics Studies, No. 59. Princeton, N.J.:
Princeton University Press.
Mumford, D. (1970). Varieties defined by quadratic equations. In E. Marchionna
(Ed.), Questions on algebraic varieties (C.I.M.E., III Ciclo, Varenna, 1969) (pp.
29-100). Rome: Edizioni Cremonese.
Mumford, D. (1995). Algebraic geometry. I. Classics in Mathematics. Berlin:
Springer-Verlag. Complex projective varieties, Reprint of the 1976 edition.
Mumford, D. (1999). The red book of varieties and schemes, Vol. 1358 of Lecture
Notes in Mathematics. Berlin: Springer-Verlag, expanded edition. Includes the
Michigan lectures (1974) on curves and their Jacobians, With contributions by
E. Arbarello.
Nanua, P., Waldron, K. J., & Murthy, V. (1991). Direct kinematic solution of a
Stewart platform. IEEE Trans. on Robotics and Automation, 6(4), 438-444.
Neumaier, A. (1990). Interval methods for systems of equations, Vol. 37 of Ency-
clopedia of Mathematics and its Applications. Cambridge: Cambridge University
Press.
Nielsen, J., & Roth, B. (1999). Solving the input/output problem for planar mech-
anisms. ASME J. Mech. Design, 121(2), 206-211.
Ojika, T. (1987). Modified deflation algorithm for the solution of singular problems.
I. A system of nonlinear algebraic equations. J. Math. Anal. Appl., 123, 199-221.
Ojika, T., Watanabe, S., & Mitsui, T. (1983). Deflation algorithm for the multiple
roots of a system of nonlinear equations. J. Math. Anal. Appl., 96, 463-479.
Pan, V. Y. (1997). Solving a polynomial equation: some history and recent progress.
SIAM Rev., 39(2), 187-220.
Pasquini, L., & Trigiante, D. (1985). A globally convergent method for simultane-
ously finding polynomial roots. Math. Comp., 44(169), 135-149.
Pernkopf, F., & Husty, M. L. (2002). Singularity analysis of spatial Stewart-Gough
platforms with planar base and platform. Proc. ASME Design Eng. Tech. Conf.,
Montreal, Canada, Sept. 30-Oct. 2, 2002.
Pieper, D. L. (1968). The kinematics of manipulators under computer control. PhD
thesis, Computer Science Dept., Stanford University.
Primrose, E. J. F. (1986). On the input-output equation of the general 7R-
mechanism. Mechanism Machine Theory, 21(6), 509-510.
Raghavan, M. (1991). The Stewart platform of general geometry has 40 config-
urations. Proc. ASME Design and Automation Conf., vol. 32-2 (pp. 397-402).
ASME.
Raghavan, M. (1993). The Stewart platform of general geometry has 40 configura-
tions. ASME J. Mech. Design, 115, 277-282.
Raghavan, M., & Roth, B. (1993). Inverse kinematics of the general 6R manipulator
and related linkages. ASME J. Mech. Design, 115, 502-508.
Raghavan, M., & Roth, B. (1995). Solving polynomial systems for the kinematic
analysis and synthesis of mechanisms and robot manipulators. ASME J. Mech.
Design, 117, 71-79.
Roberts, S. (1875). On three-bar motion in plane space. Proc. London Math. Soc.,
VII, 14-23.
Rojas, J. M. (1994). A convex geometric approach to counting the roots of a
polynomial system. Theoret. Comput. Sci., 133(1), 105-140. Selected papers of
the Workshop on Continuous Algorithms and Complexity (Barcelona, 1993).
Rojas, J. M. (1999). Toric intersection theory for affine root counting. J. Pure Appl.
Algebra, 136(1), 67-100.
Rojas, J. M., & Wang, X. (1996). Counting affine roots of polynomial systems via
pointed Newton polytopes. J. Complexity, 12(2), 116-133.
Ronga, F., & Vust, T. (1995). Stewart platforms without computer? In Real analytic
and algebraic geometry (Trento, 1992) (pp. 197-212). Berlin: de Gruyter.
Roth, B. (1962). A generalization of Burmester theory: Nine-point path generation
of geared five-bar mechanisms with gear ratio plus and minus one. PhD thesis,
Columbia University.
Roth, B., & Freudenstein, F. (1963). Synthesis of path-generating mechanisms by
numerical means. J. Eng. Industry, 298-306. Trans. ASME, vol. 85, Series B.
Roth, B., Rastegar, J., & Scheinman, V. (1974). On the design of computer con-
trolled manipulators. On the Theory and Practice of Robots and Manipulators:
First CISM-IFToMM Symposium (pp. 93-113). Springer-Verlag.
Rump, S. M. (1999). INTLAB - INTerval LABoratory. In T. Csendes (Ed.), Devel-
opments in reliable computing, Proc. of (SCAN-98), Budapest, September 22-25,
1998 (pp. 77-104). Dordrecht: Kluwer Academic Publishers.
Rupprecht, D. (2004). Semi-numerical absolute factorization of polynomials with
integer coefficients. J. Symbolic Comput., 37(5), 557-574.
Sasaki, T. (2001). Approximate multivariate polynomial factorization based on
zero-sum relations. In B. Mourrain (Ed.), Proceedings of the 2001 international
symposium on symbolic and algebraic computation (ISSAC 2001) (pp. 284-291).
ACM.
Schenck, H. (2003). Computational algebraic geometry, Vol. 58 of London Mathe-
matical Society Student Texts. Cambridge: Cambridge University Press.
Shiffman, B., & Sommese, A. J. (1985). Vanishing theorems on complex manifolds,
Vol. 56 of Progress in Mathematics. Boston, MA: Birkhauser Boston Inc.
Sommese, A. J., & Verschelde, J. (2000). Numerical homotopies to compute generic
points on positive dimensional algebraic sets. J. Complexity, 16(3), 572-602.
Complexity theory, real machines, and homotopy (Oxford, 1999).
Sommese, A. J., Verschelde, J., & Wampler, C. W. (2001a). Numerical decom-
position of the solution sets of polynomial systems into irreducible components.
SIAM J. Numer. Anal., 38(6), 2022-2046.
Sommese, A. J., Verschelde, J., & Wampler, C. W. (2001b). Numerical irreducible
decomposition using projections from points on the components. In Symbolic com-
putation: solving equations in algebra, geometry, and engineering (South Hadley,
MA, 2000), Vol. 286 of Contemp. Math. (pp. 37-51). Providence, RI: Amer. Math.
Soc.
Sommese, A. J., Verschelde, J., & Wampler, C. W. (2001c). Using monodromy
to decompose solution sets of polynomial systems into irreducible components.
In Applications of algebraic geometry to coding theory, physics and computation
(Eilat, 2001), Vol. 36 of NATO Sci. Ser. II Math. Phys. Chem. (pp. 297-315).
Dordrecht: Kluwer Acad. Publ.
Sommese, A. J., Verschelde, J., & Wampler, C. W. (2002a). A method for tracking
singular paths with application to the numerical irreducible decomposition. In
Algebraic geometry (pp. 329-345). Berlin: de Gruyter.
Sommese, A. J., Verschelde, J., & Wampler, C. W. (2002b). Symmetric functions
applied to decomposing solution sets of polynomial systems. SIAM J. Numer.
Anal, 40(6), 2026-2046.
Sommese, A. J., Verschelde, J., & Wampler, C. W. (2003). Numerical irreducible
decomposition using PHCpack. In Algebra, geometry, and software systems (pp.
109-129). Berlin: Springer.
Sommese, A. J., Verschelde, J., & Wampler, C. W. (2004a). Advances in polynomial
continuation for solving problems in kinematics. ASME J. Mech. Design, 126(2),
262-268.
Sommese, A. J., Verschelde, J., & Wampler, C. W. (2004b). Homotopies for in-
tersecting solution components of polynomial systems. SIAM J. Numer. Anal.,
42(4), 1552-1571.
Sommese, A. J., Verschelde, J., & Wampler, C. W. (2004c). An intrinsic homotopy
for intersecting algebraic varieties. J. Complexity, to appear.
Sommese, A. J., Verschelde, J., & Wampler, C. W. (2004d). Numerical factorization
of multivariate complex polynomials. Theoretical Computer Science, 315, 651-669.
Sommese, A. J., Verschelde, J., & Wampler, C. W. (2004e). Solving polynomial
systems equation by equation, in preparation.
Sommese, A. J., & Wampler, C. W. (1996). Numerical algebraic geometry. In The
mathematics of numerical analysis (Park City, UT, 1995), Vol. 32 of Lectures in
Appl. Math. (pp. 749-763). Providence, RI: Amer. Math. Soc.
Sosonkina, M., Watson, L. T., & Stewart, D. E. (1996). Note on the end game in
homotopy zero curve tracking. ACM Trans. Math. Software, 22(3), 281-287.
Sreenivasan, S. V., & Nanua, P. (1992). Solution of the direct position kinematics
problem of the general Stewart platform using advanced polynomial continuation.
DE-Vol. 45, Robotics, Spatial Mechanisms, and Mechanical Systems (pp. 99-106).
ASME.
Sreenivasan, S. V., Waldron, K. J., & Nanua, P. (1994). Closed-form direct dis-
placement analysis of a 6-6 Stewart platform. Mechanism Machine Theory, 29(6),
855-864.
Stetter, H. J. (2004). Numerical polynomial algebra. Philadelphia, PA: Society for
Industrial and Applied Mathematics (SIAM).
Stoer, J., & Bulirsch, R. (2002). Introduction to numerical analysis, Vol. 12 of Texts
in Applied Mathematics. New York: Springer-Verlag, third edition. Translated
from the German by R. Bartels, W. Gautschi and C. Witzgall.
Sturmfels, B. (2002). Solving systems of polynomial equations, Vol. 97 of CBMS
Regional Conference Series in Mathematics. Published for the Conference Board
of the Mathematical Sciences, Washington, DC.
Sturmfels, B., & Zelevinsky, A. (1994). Multigraded resultants of Sylvester type. J.
of Algebra, 163(1), 115-127.
Su, H.-J., Wampler, C. W., & McCarthy, J. M. (2004). Geometric design of cylindric
PRS serial chains. ASME J. Mech. Design, 126(2), 269-277.
Tsai, L. W. (1999). Robot analysis: the mechanics of serial and parallel manipula-
tors. New York: John Wiley & Sons Inc.
Tsai, L. W., & Lu, J.-J. (1989). Coupler-point curve synthesis using homotopy
methods. In B. Ravani (Ed.), Advances in Design Automation-1989: Mechani-
cal Systems Analysis, Design and Simulation, Vol. DE-Vol. 19-3 (pp. 417-424).
ASME.
Tsai, L. W., & Morgan, A. P. (1985). Solving the kinematics of the most general
six- and five-degree-of-freedom manipulators by continuation methods. ASME J.
Mech., Trans., Auto. Design, 107, 48-57.
van der Waerden, B. L. (1949). Modern Algebra. Vol. I. New York, N. Y.: Frederick
Ungar Publishing Co. Translated from the second revised German edition by Fred
Blum, With revisions and additions by the author.
van der Waerden, B. L. (1950). Modern Algebra. Vol. II. New York, N. Y.: Frederick
Ungar Publishing Co. Translated from the first German edition by Theodore
Benac.
Verschelde, J. (1996). Homotopy continuation methods for solving polynomial sys-
tems. PhD thesis, Katholieke Universiteit Leuven.
Verschelde, J. (1999). Algorithm 795: PHCpack: A general-purpose solver for
polynomial systems by homotopy continuation. ACM Trans. on Math. Software,
25(2), 251-276.
Verschelde, J. (2000). Toric Newton method for polynomial homotopies. J. Sym-
bolic Comput., 29(4-5), 777-793. Symbolic computation in algebra, analysis, and
geometry (Berkeley, CA, 1998).
Verschelde, J., & Cools, R. (1993). Symbolic homotopy construction. Appl. Algebra
Engrg. Comm. Comput., 4(3), 169-183.
Verschelde, J., Gatermann, K., & Cools, R. (1996). Mixed-volume computation by
dynamic lifting applied to polynomial system solving. Discrete Comput. Geom.,
16(1), 69-112.
Verschelde, J., Verlinden, P., & Cools, R. (1994). Homotopies exploiting Newton
polytopes for solving sparse polynomial systems. SIAM J. Numer. Anal., 31(3),
915-930.
Verschelde, J., & Wang, Y. (2004). Computing feedback laws for linear systems with
a parallel Pieri homotopy. In Y. Yang (Ed.), Proceedings of 2004 International
Conference on Parallel Processing Workshops, August 15-18, 2004 (pp. 222-229).
IEEE.
Walker, R. J. (1962). Algebraic curves. Dover, New York.
Wampler, C. W. (1992). Bezout number calculations for multi-homogeneous poly-
nomial systems. Appl. Math. Comput., 51(2-3), 143-157.
Wampler, C. W. (1994). An efficient start system for multihomogeneous polynomial
continuation. Numer. Math., 66(4), 517-523.
Wampler, C. W. (1996a). Forward displacement analysis of general six-in-parallel
SPS (Stewart) platform manipulators using soma coordinates. Mechanism Ma-
chine Theory, 31, 331-337.
Wampler, C. W. (1996b). Isotropic coordinates, circularity and Bezout numbers:
planar kinematics from a new perspective. In J. M. McCarthy (Ed.), Proceedings
of the 1996 ASME Design Engineering Technical Conference, Irvine, California
August 18-22, 1996. American Society of Mechanical Engineers, CD-ROM. Also
available as GM Technical Report, Publication R&D-8188, 1996.
Wampler, C. W. (1999). Solving the kinematics of planar mechanisms. ASME J.
Mech. Design, 121, 387-391.
Wampler, C. W. (2001). Solving the kinematics of planar mechanisms by Dixon
determinant and a complex-plane formulation. ASME J. Mech. Design, 123(3),
382-387.
Wampler, C. W. (2004). Displacement analysis of spherical mechanisms having
three or fewer loops. ASME J. Mech. Design, 126(1), 93-100.
Wampler, C. W., & Morgan, A. P. (1993). Solving the kinematics of general 6R
manipulators using polynomial continuation. In Robotics: applied mathematics
and computational aspects (Loughborough, 1989), Vol. 41 of Inst. Math. Appl.
Conf. Ser. New Ser. (pp. 57-69). New York: Oxford Univ. Press.
Wampler, C. W., Morgan, A. P., & Sommese, A. J. (1990). Numerical continuation
methods for solving polynomial systems arising in kinematics. ASME J. Mech.
Design, 112, 59-68.
Wampler, C. W., Morgan, A. P., & Sommese, A. J. (1992). Complete solution of the
nine-point path synthesis problem for four-bar linkages. ASME J. Mech. Design,
114, 153-159.
Wampler, C. W., Morgan, A. P., & Sommese, A. J. (1997). Complete solution of
the nine-point path synthesis problem for four-bar linkages - closure. ASME J.
Mech. Design, 119, 150-152.
Watson, L. T., Billups, S. C., & Morgan, A. P. (1987). Algorithm 652. HOMPACK:
a suite of codes for globally convergent homotopy algorithms. ACM Trans. Math.
Software, 13(3), 281-310.
Watson, L. T., Sosonkina, M., Melville, R. C., Morgan, A. P., & Walker, H. F.
(1997). Algorithm 777: HOMPACK90: a suite of Fortran 90 codes for globally
convergent homotopy algorithms. ACM Trans. Math. Software, 23(4), 514-549.
Weil, A. (1962). Foundations of algebraic geometry. Providence, R.I.: American
Mathematical Society.
Wilkinson, J. H. (1984). The perfidious polynomial. In Studies in numerical analy-
sis, Vol. 24 of MAA Stud. Math. (pp. 1-28). Washington, DC: Math. Assoc.
America.
Wilkinson, J. H. (1994). Rounding errors in algebraic processes. New York: Dover
Publications Inc. Reprint of the 1963 original [Prentice-Hall, Englewood Cliffs,
NJ].
Xu, Z.-B., Zhang, J.-S., & Wang, W. (1996). A cell exclusion algorithm for deter-
mining all the solutions of a nonlinear system of equations. Appl. Math. Comput.,
80(2-3), 181-208.
Zhang, C.-D., & Song, S.-M. (1994). Forward position analysis of nearly general
Stewart platform. ASME J. Mech. Design, 116(1), 54-60.
Index

Z_reg, 44, 215 Local Dimension, 251
#, xxii LocalDimen, 251
Sing(X), 44 Member1, 268, 275
Sing(Z), 306 Member2, 269, 276
C*, xxii Monodromy, 269, 277
(x,L), 328 Rank, 240
Δ, xxii TopDimen, 250
P^N, 29 Trace, 270, 284
Gr(m,N), 325 WitnessSuper, 247
V(f), 8 WitnessSupi, 245
O_{P^1}(d), 342 WitnessSupi(intrinsic), 246
\, i.e., setminus, xxii algorithm for the rank of a system, 319
analysis, 163
affine algebraic set, 43, 47, 56, 207, 209 analytic continuation, 278
affine hyperplane, 232 analytic parameter spaces, 349
affine space, 209 analytic Zariski open set, 350
affine variety, 215
algebraic function, 210, 212 base locus, 323
algebraic map, 208, 210, 212, 219, 220 Bertini Theorems, 313, 323, 330-333
algebraic probability one, 50 big system, 319
algebraic set, 43, 44, 207, 209 biholomorphic mapping, 301
affine, see affine algebraic set biholomorphic to, 301
constructible, see constructible algebraic set BKK bound, 139
projective, see projective algebraic set body guidance, 163, 165
quasiprojective, see quasiprojective set branched covering, 314
algebraic set associated to f, 8 Buchberger's algorithm, 82
algebraic set of f Burmester centers, 166
see algebraic set associated to f, 8 Burmester points, 166
algorithm
Inclusion, 252 cascade algorithm, 255, 259
LocalDimen, 251 Cauchy integral, 199
Equal, 253 Cauchy integral endgame, 285
IrrDecomp1, 268 Cauchy integral method, 186, 187, 189
IrrDecomp2, 271 Cauchy's Lemma, 58
IrrDecompPure, 270, 284 center of a projection, 213, 328
JunkRemove, 269 Chebychev polynomials, 65

chemical equilibria, 152-154, 170 Cauchy integral method, see Cauchy
Chern class, 343 integral method
Chevalley's Theorem, 222 cluster method, see trace method
classical topology, see complex topology, power-series method, see power-series
211 method
cluster method, 187 trace method, see trace method
coefficient, 5 endgame convergence radius, 180
coefficient-parameter homotopy, 91 endgame operating zone, 179, 182, 183,
coefficient-parameter theory, 92 185
compact affine set, 220 equation-by-equation, 292
companion matrix, 4 Exclusion method, 68
complex analytic set, 301 extension theorems, 302
complex analytic space, 302 extrinsic slicing, 234
complex dimension, see dimension, 306
complex manifolds, 302 finite affine set, 210
complex projective space, see projective finite map, 312
space first Chern class, 342
complex topology, 211, 300 five-point path synthesis, 166, 174
condition number, 198 four-bar analysis, 164
cone with vertex x, 320 four-bar equations, 163
constellation of algebraic sets, 331, 332 four-bar function generation, 173
constructible algebraic set, 207, 208 four-bar linkages, 169
constructible set, 208, 209, 221 four-bar synthesis, 162, 163
convex polytope, 138 four-body guidance, 174
corank of a polynomial system, 239 fractional power series, 180
corank of an algebraic system, 318 function generation, 162, 164
coupler curve, 162 Fundamental Theorem of Algebra, 55
covering map, 314
cuspidal cubic, 282 gamma trick, 18, 94, 95
general point, 44
deflation, 190-193, 195 generic, 45, 46
degree, 230 simply, 332
degree of a polynomial, 5 generic Bezout number, 346, 351
desingularization, 310 generic factorization, 316, 317
diagonal intersection, 289 generic line, 46
differentiable manifold, 301 generic linear change of coordinates, 213
dimension, 44, 207, 216, 306 generic linear projection, 213, 324
upper semicontinuity, see upper generic point, 44
semicontinuity of dimension generic projection, see generic linear
dimension of a germ, 309 projection
dimensional complex manifold, 302 generic root count, 346, 351
discriminant, 57 generic with respect to an algebraic set,
disk, 58 233
Dixon determinant, 77 generically, 45
dominant map, 222, 311 genericity, 43
dual curve, 338 germ, 308
dimension of a, 309
elementary symmetric functions, 280 irreducible, 309
elimination methods, 72 germ of a complex analytic set, 308
endgame, 177 germ of an affine algebraic set, 308
Index 399

germ of an analytic set, 308
germs, 181
Gröbner bases, 81
Gröbner basis, 82
graph of a map, 219
Grassmannian, 325, 326
Grauert's Proper Mapping Theorem, 311
ground link, 162
grounded link, 13
growth estimates, 60

Hartogs' Theorem, 302, 303
heuristic eliminant, 79
hidden variable resultant, 73
Hironaka Desingularization Theorem, 310
holomorphic function, 300
holomorphic mapping, 301
homogeneous coordinates, xxii, 29
homogeneous polynomial, 33, 34
homogeneous polynomials, 218
homotopy continuation method, 15
homotopy membership test, 275
hyperplane, 232
hyperplane at infinity, 320, 327, 335-337

image of an irreducible set, 222
Implicit Function Theorem, 304
interval arithmetic, 201
intrinsic slicing, 234
irreducible algebraic set, 44, 207, 215
irreducible at a point, 309
irreducible component, 56, 207
irreducible decomposition, 56, 207, 215, 219
irreducible germ, 309
  dimension of an, 309
irreducible witness sets, 230
isomorphic, 210, 212
isotropic coordinates, 160

joint offset, 158
junk points, 245, 249

Laurent monomial, 138
Laurent polynomial, 139
level i nonsolutions, 258
line at infinity, 33
line bundle, 341, 342
linear projection, 213, 324
  generic, see generic linear projection
linear projections, 212
linear slicing, 231
link length, 158
locally irreducible, 309
losing the endgame, 188

manifold, 301
manifold point, 44, 306
map
  finite, 312
  proper, 212
maximum principle, 304, 311
membership test, 266
Minkowski sum, 139
mixed strategy, 150
mixed volume, 138, 140
monodromy, 275-277, 339, 348
monodromy action, 278
monomial, 5
Mount Everest of Kinematics, see six-revolute serial-link robots
multidegree notation, xxi, 5, 301, 322
multihomogeneous polynomial, 35
multiplicity, 8, 209, 223, 224, 236
multiprojective space, 35

Nash equilibria, 149-151, 170
nested parameter homotopy, 101
Newton polytope, 138
Newton's method, 17, 18, 24, 71, 177, 182
Newton-Raphson method, 17
nine-point path synthesis problem, 112, 161, 167
Noether Normalization Theorem, 214, 336
nonreduced, 236
nonsolutions, 258
normal, 281, 303, 308
normal complex analytic space, 308
normalization, 189, 311
Nullstellensatz, 307
numerical algebraic geometry, vii, 227-229, 241
numerical elimination theory, 266
numerical irreducible decomposition, 228, 230, 231, 253, 265

overdetermined, 241

parameter homotopy, see coefficient-parameter homotopy
patch switching, 38
path generation, 162
path synthesis problems, 166
Plücker embedding, 326
point at infinity, 30
polyhedron, 138
polynomial system, 209
polytope, 138
polytope root count, 139
power-series endgame, 199, 285
power-series method, 183, 185, 186, 189, 194
precision-point methods, 163
primary decomposition, 216
probabilistic algorithm, 249
probability one, 43, 50, 313
probability-one methods, 207
projective transformation, 39
projective algebraic set, 34, 207, 217
projective line, 30
projective plane, 32
projective rank of an algebraic system, 319, 320, 330
projective set, 43
projective space, 27-30
projective transformation, 38, 40, 198
proper, 189
proper algebraic map, 212
proper map, 212
proper mapping theorem, 311
Puiseux's Theorem, 310
pure-dimensional, 216, 219

quadric surface, 334
quasiprojective algebraic set, 44, 207, 208
quasiprojective set, see quasiprojective algebraic set, 219

radical, 216
rank of a polynomial system, 228, 239, 240
rank of an algebraic system, 318, 319, 329
rational mapping, 317, 338
real dimension, 210, 306
reduced, 236
reduction to the diagonal, 290
regular point, 215, 306
Remmert-Stein Factorization Theorem, see Stein Factorization Theorem, 312
resultant, 57, 73
Riemann Bounded Extension Theorem, 303

sampling, 272, 273
Sard's Theorem, 313
secant variety, 335
section of a line bundle, 218
Segre embedding, 293, 334, 335
set of indeterminacy, 317
seven-bar structures, 172
simply generic, 332
singular path tracking, 273, 284
singular point, 44, 306
singular set, 307
six-revolute inverse position, 172
six-revolute serial-link robots, 156
slicing, 231
smooth immersion, 279
smooth point, 44, 306
solution sets, 8
spanned, 342
square system, 241
Stein Factorization Theorem, 312
Stewart-Gough forward kinematics, 154
Stewart-Gough platform robots, ix, 101, 104-106, 108, 109, 111, 113-115, 154, 171
straight-line function, 6, 11, 12, 48, 70, 85, 362
submatrix, 332
Sylvester determinant, 56
Sylvester matrix, 65
Sylvester Resultant, 57
symmetric group, 340
synthesis, 163
synthesis problems, 164, 169
system of coordinates, 304

topologically unibranch, 309
topology, 211
  classical, see complex topology
  complex, see complex topology
  Zariski, see Zariski topology
total degree of a polynomial, 5
trace, 187, 279-281
trace method, 187, 189
trace test, 279
trigonometric equations, 7
twist angle, 158
underdetermined, 241
universal field, 52
universal function, 323
universal system, 323
upper semicontinuity of dimension, 312

variety, 8
vector bundle, 341, 343
Veronese embedding, 334

Wilkinson polynomials, 11
winding number, 180, 182, 183
witness point superset, 244, 245, 255
witness set, see witness point set, 8, 229, 235
witness superset, 253, 256

Zariski closed set, 211
Zariski open set, 92, 211
Zariski topology, 211, 221
