
ADVANCED SET THEORY

J. Donald Monk

Graduate set theory


January 19, 2011
TABLE OF CONTENTS
1. The logic of set theory
2. ZFC axioms
3. Elementary set theory
4. Ordinals
5. The axiom of choice
6. Cardinals
7. Linearly ordered sets
8. Trees
9. Clubs and stationary sets
10. Infinite combinatorics
11. Martin's axiom
12. Large cardinals
13. Well-founded relations
14. Models of set theory
15. Constructible sets
16. Boolean algebras and quasi-orders
17. Generic extensions and forcing
18. Powers of regular cardinals
19. Relative constructibility
20. Isomorphisms and ¬AC
21. Embeddings, iterated forcing, and Martin's axiom
22. Various forcing partial orders
23. Proper forcing
24. More examples of iterated forcing
25. Cofinality of posets
26. Basic properties of PCF
27. Main cofinality theorems

Set theory
1. First-order logic
Here we describe the rigorous logical framework for set theory, indicating some central
notions and facts, without proofs.
Except for proofs, this treatment is self-contained, but a more thorough study of first-order logic, going into the proofs of the statements in this chapter, is a good idea for anyone studying set theory carefully. See the references at the end of the chapter.
We assume some elementary mathematics: notions and facts concerning finite sequences, the intuitive notion of a function, elementary properties of natural numbers, and induction. The exact development of these things within set theory comes later. A finite sequence of objects $a_0, \ldots, a_{n-1}$ is denoted by $\langle a_0, \ldots, a_{n-1}\rangle$. We always assume that a finite sequence has positive length $n$. The case $n = 1$ is important here. The finite sequence $\langle a\rangle$ of a single object $a$ is considered as different from $a$ itself. The concatenation of two finite sequences $\langle a_0, \ldots, a_{n-1}\rangle$ and $\langle b_0, \ldots, b_{m-1}\rangle$ is the sequence $\langle a_0, \ldots, a_{n-1}, b_0, \ldots, b_{m-1}\rangle$; we write it $\langle a_0, \ldots, a_{n-1}\rangle^\frown\langle b_0, \ldots, b_{m-1}\rangle$.
First-order languages all have the following symbols in common:
n, a, e, x. (Symbols for negation, conjunction, equality, and there exists.)
$v_0, v_1, \ldots$ (Variables, intuitively ranging over some nonempty domain; see below for a more exact description. In practice we use many other symbols in place of these.)
Our set-theoretic language has one more symbol m (membership).
An expression is a finite sequence of symbols. Expr is the set of all expressions. Now we define functions $\neg$, $\wedge$, $=$, $\in$, $\exists$:
$\neg$ maps Expr into Expr, and $\neg\varphi \overset{\mathrm{def}}{=} \langle \mathrm{n}\rangle^\frown\varphi$.
$\wedge$ assigns to each pair $\varphi, \psi$ of expressions the expression $\varphi \wedge \psi = \langle \mathrm{a}\rangle^\frown\varphi^\frown\psi$.
$=$ assigns to each pair $v_i, v_j$ of variables the expression $\langle \mathrm{e}, v_i, v_j\rangle$, denoted by $v_i = v_j$. We depend on the context to distinguish this use of $=$ from ordinary equality.
$\in$ assigns to each pair $v_i, v_j$ of variables the expression $\langle \mathrm{m}, v_i, v_j\rangle$, denoted by $v_i \in v_j$. We depend on the context to distinguish this use of $\in$ from the ordinary membership relation.
$\exists$ assigns to each pair $v_i, \varphi$, where $\varphi$ is an expression, the expression $\langle \mathrm{x}, v_i\rangle^\frown\varphi$, denoted by $\exists v_i\varphi$.
We use the usual mathematical conventions about functions. For example:
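$v_5 = v_6$ is the sequence $\langle \mathrm{e}, v_5, v_6\rangle$.
$\neg\exists v_0(v_0 = v_0 \wedge v_1 = v_1)$ is the sequence $\langle \mathrm{n}, \mathrm{x}, v_0, \mathrm{a}, \mathrm{e}, v_0, v_0, \mathrm{e}, v_1, v_1\rangle$.
$\exists v_0(v_0 \in v_1 \wedge v_0 = v_2)$ is the sequence $\langle \mathrm{x}, v_0, \mathrm{a}, \mathrm{m}, v_0, v_1, \mathrm{e}, v_0, v_2\rangle$.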
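The three example sequences above can be reproduced mechanically. The following Python sketch is only an illustration, not part of the formal development: it models an expression as a tuple of strings, writes the variable $v_i$ as the string "vi", and uses the hypothetical function names `neg`, `conj`, `eq`, `mem`, and `exists` for the five functions just defined.

```python
# Symbols are plain strings; an expression is a tuple of symbols.
# The names below are illustrative choices, not notation from the notes.

def var(i):
    """The i-th variable symbol, written 'v0', 'v1', ..."""
    return f"v{i}"

def neg(phi):
    """Negation: the one-term sequence <n> followed by phi."""
    return ("n",) + phi

def conj(phi, psi):
    """Conjunction: <a> followed by phi, then psi."""
    return ("a",) + phi + psi

def eq(i, j):
    """Atomic equality formula v_i = v_j."""
    return ("e", var(i), var(j))

def mem(i, j):
    """Atomic membership formula v_i in v_j."""
    return ("m", var(i), var(j))

def exists(i, phi):
    """Existential quantification: <x, v_i> followed by phi."""
    return ("x", var(i)) + phi

example_1 = eq(5, 6)
example_2 = neg(exists(0, conj(eq(0, 0), eq(1, 1))))
example_3 = exists(0, conj(mem(0, 1), eq(0, 2)))

for example in (example_1, example_2, example_3):
    print(example)
```

Because no parentheses are stored, the grouping of a formula is carried entirely by the prefix order of the symbols.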
An atomic formula is an expression of one of the forms
$v_i = v_j$ (an atomic equality formula)
$v_i \in v_j$ (an atomic membership formula)
A formula construction is a finite sequence $\langle \varphi_0, \ldots, \varphi_m\rangle$ of expressions such that for each $i = 0, \ldots, m$ one of the following conditions holds:
(1) $\varphi_i$ is an atomic formula.
(2) There is a $j < i$ such that $\varphi_i$ is $\neg\varphi_j$.
(3) There exist $j, k < i$ such that $\varphi_i$ is $\varphi_j \wedge \varphi_k$.
(4) There exist $j < i$ and a natural number $k$ such that $\varphi_i$ is $\exists v_k\varphi_j$.
A formula is an expression which is the last entry of some formula construction. Clearly for every formula $\varphi$ there is a formula construction with no repeated entries which has last entry $\varphi$.
Here are some examples of formula constructions:
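$\langle v_3 = v_8\rangle$.
$\langle v_0 = v_0,\ \exists v_0(v_0 = v_0)\rangle$.
$\langle v_0 = v_1,\ \exists v_0(v_0 = v_1),\ v_0 \in v_1,\ \exists v_1(v_0 \in v_1),\ \exists v_0(v_0 = v_1) \wedge \exists v_1(v_0 \in v_1)\rangle$.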
The following theorem is easy to prove. It provides the justification for some of the
definitions which follow.
Theorem 1.1. (Unique readability)
(i) For every formula $\varphi$, exactly one of the following conditions holds:
(a) $\varphi$ is atomic.
(b) There is a formula $\psi$ such that $\varphi$ is $\neg\psi$.
(c) There are formulas $\psi, \chi$ such that $\varphi$ is $\psi \wedge \chi$.
(d) There exist a formula $\psi$ and a natural number $i$ such that $\varphi$ is $\exists v_i\psi$.
(ii) Every formula is a nonempty expression.
(iii) Every formula has length at least 3.
(iv) Every formula begins with one of n, a, e, x, m.
(v) If $\varphi = \langle \varphi_0, \ldots, \varphi_m\rangle$ is a formula and $0 \le i < m$, then $\langle \varphi_0, \ldots, \varphi_i\rangle$ is not a formula.
(vi) For every formula $\varphi$ exactly one of the following conditions holds:
(a) $\varphi$ begins with e, and there is exactly one pair $(i, j)$ of natural numbers such that $\varphi$ is $v_i = v_j$.
(b) $\varphi$ begins with m, and there is exactly one pair $(i, j)$ of natural numbers such that $\varphi$ is $v_i \in v_j$.
(c) $\varphi$ begins with n, and there is exactly one formula $\psi$ such that $\varphi$ is $\neg\psi$.
(d) $\varphi$ begins with a, and there is exactly one pair $(\psi, \chi)$ of formulas such that $\varphi$ is $\psi \wedge \chi$.
(e) $\varphi$ begins with x, and there is exactly one pair $(i, \psi)$ such that $i$ is a natural number, $\psi$ is a formula, and $\varphi$ is $\exists v_i\psi$.
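We introduce some more functions.
$\vee$ assigns to each pair $\varphi, \psi$ of expressions the expression $\neg(\neg\varphi \wedge \neg\psi)$, denoted by $\varphi \vee \psi$.
$\to$ assigns to each pair $\varphi, \psi$ of expressions the expression $\neg(\varphi \wedge \neg\psi)$, denoted by $\varphi \to \psi$.
$\leftrightarrow$ assigns to each pair $\varphi, \psi$ of expressions the expression $(\varphi \to \psi) \wedge (\psi \to \varphi)$, denoted by $\varphi \leftrightarrow \psi$.
$\forall$ assigns to every variable $v_i$ and formula $\varphi$ the formula $\neg\exists v_i\neg\varphi$, denoted by $\forall v_i\varphi$.
Another example of a formula:
$\exists v_0\forall v_1(v_1 \in v_0) \to \forall v_1\exists v_0(v_1 \in v_0)$.
With fewer abbreviations, this is
$\neg(\exists v_0\neg\exists v_1\neg(v_1 \in v_0) \wedge \neg\neg\exists v_1\neg\exists v_0(v_1 \in v_0))$;
and as a sequence it is
$\langle \mathrm{n}, \mathrm{a}, \mathrm{x}, v_0, \mathrm{n}, \mathrm{x}, v_1, \mathrm{n}, \mathrm{m}, v_1, v_0, \mathrm{n}, \mathrm{n}, \mathrm{x}, v_1, \mathrm{n}, \mathrm{x}, v_0, \mathrm{m}, v_1, v_0\rangle$.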
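The 21-term sequence above can be checked by unfolding the abbreviations in code. This Python sketch is illustrative only: expressions are tuples of strings, the function names are hypothetical, and only the two abbreviations the example needs (implication as "not (phi and not psi)", universal quantification as "not exists not") are implemented.

```python
def var(i):
    return f"v{i}"

# Primitive formula-building functions (prefix notation, no parentheses).
def neg(phi):
    return ("n",) + phi

def conj(phi, psi):
    return ("a",) + phi + psi

def mem(i, j):
    return ("m", var(i), var(j))

def exists(i, phi):
    return ("x", var(i)) + phi

# Abbreviations, unfolded into the primitives above.
def implies(phi, psi):
    """phi -> psi abbreviates not(phi and not psi)."""
    return neg(conj(phi, neg(psi)))

def forall(i, phi):
    """(for all v_i) phi abbreviates not (exists v_i) not phi."""
    return neg(exists(i, neg(phi)))

# exists v0 forall v1 (v1 in v0)  ->  forall v1 exists v0 (v1 in v0)
example = implies(
    exists(0, forall(1, mem(1, 0))),
    forall(1, exists(0, mem(1, 0))),
)
print(len(example), example)
```

The double occurrence of n in the middle of the sequence comes from negating a formula that is itself an unfolded universal quantifier.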
A structure is an ordered pair $(A, R)$ consisting of a nonempty set $A$ and a binary relation $R$ on $A$. Given such a structure, a formula $\varphi$, and a sequence $a_0, \ldots$ of elements of $A$ (an infinite sequence), we define what it means for the sequence to satisfy the formula in $(A, R)$; we denote this by $(A, R) \models \varphi[a_0, \ldots]$.
$(A, R) \models (v_i = v_j)[a_0, \ldots]$ iff $a_i = a_j$.
$(A, R) \models (v_i \in v_j)[a_0, \ldots]$ iff $a_i R a_j$.
$(A, R) \models (\neg\varphi)[a_0, \ldots]$ iff it is not the case that $(A, R) \models \varphi[a_0, \ldots]$.
$(A, R) \models (\varphi \wedge \psi)[a_0, \ldots]$ iff both $(A, R) \models \varphi[a_0, \ldots]$ and $(A, R) \models \psi[a_0, \ldots]$.
$(A, R) \models (\exists v_i\varphi)[a_0, \ldots]$ iff there is a sequence $b_0, \ldots$ of elements of $A$ such that $a_j = b_j$ for all $j \ne i$, and such that $(A, R) \models \varphi[b_0, \ldots]$.
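For a finite structure the satisfaction clauses above can be evaluated directly. The Python sketch below is an illustration under stated assumptions, not part of the notes: formulas are tuples of strings in the prefix form described earlier, a finite dictionary stands in for the infinite sequence $a_0, \ldots$ (only the variables occurring in the formula matter), and the names `parse` and `satisfies` are hypothetical.

```python
def parse(expr, pos=0):
    """Read one formula starting at expr[pos]; return (tree, next position).
    That this reading is unambiguous is the content of unique readability."""
    sym = expr[pos]
    if sym in ("e", "m"):                 # atomic: symbol plus two variables
        return (sym, expr[pos + 1], expr[pos + 2]), pos + 3
    if sym == "n":                        # negation of one subformula
        sub, nxt = parse(expr, pos + 1)
        return ("n", sub), nxt
    if sym == "a":                        # conjunction of two subformulas
        left, mid = parse(expr, pos + 1)
        right, nxt = parse(expr, mid)
        return ("a", left, right), nxt
    if sym == "x":                        # quantifier: variable, then body
        sub, nxt = parse(expr, pos + 2)
        return ("x", expr[pos + 1], sub), nxt
    raise ValueError(f"not a formula: unexpected symbol {sym!r}")

def satisfies(A, R, tree, assignment):
    """Decide (A, R) |= phi[assignment] for a finite set A."""
    op = tree[0]
    if op == "e":
        return assignment[tree[1]] == assignment[tree[2]]
    if op == "m":
        return (assignment[tree[1]], assignment[tree[2]]) in R
    if op == "n":
        return not satisfies(A, R, tree[1], assignment)
    if op == "a":
        return (satisfies(A, R, tree[1], assignment)
                and satisfies(A, R, tree[2], assignment))
    if op == "x":
        # Change the assignment at the quantified variable only.
        return any(satisfies(A, R, tree[2], {**assignment, tree[1]: b})
                   for b in A)
    raise ValueError(op)

# The structure ({0, 1, 2}, <) and the formula  exists v0 (v0 in v1 and v0 = v2).
A = {0, 1, 2}
R = {(0, 1), (0, 2), (1, 2)}
phi = ("x", "v0", "a", "m", "v0", "v1", "e", "v0", "v2")
tree, end = parse(phi)

print(satisfies(A, R, tree, {"v1": 2, "v2": 1}))   # True: take v0 = 1
print(satisfies(A, R, tree, {"v1": 0, "v2": 0}))   # False: 0 R 0 fails
```

Exhaustive search over $A$ is only possible because $A$ is finite; the definition in the text applies to arbitrary nonempty sets.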

A formula $\varphi$ is universally valid iff for every structure $(A, R)$ and every selection $a_0, \ldots$ of elements of $A$ we have $(A, R) \models \varphi[a_0, \ldots]$.
We now give the logical axioms. Here $\varphi, \psi, \ldots$ are arbitrary formulas unless otherwise restricted, and $i, j, \ldots$ are arbitrary members of $\omega$.
( ) ( ).
( ) ( ).
$\varphi \to (\psi \to \varphi)$.
$[\varphi \to (\psi \to \chi)] \to [(\varphi \to \psi) \to (\varphi \to \chi)]$.
( ) ( ).
vi vi .
vi vi .
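$\forall v_i(\varphi \to \psi) \to (\forall v_i\varphi \to \forall v_i\psi)$.
$\varphi \to \forall v_i\varphi$, if $v_i$ does not occur in $\varphi$.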
vi = vi .
vi = vj vk (vk = vi vk = vj ) for k 6= i, j.
$v_i = v_j \to (v_i = v_k \to v_j = v_k)$.
$v_i = v_j \to (v_k = v_i \to v_k = v_j)$.
$v_i = v_j \to (v_i \in v_k \to v_j \in v_k)$.
$v_i = v_j \to (v_k \in v_i \to v_k \in v_j)$.
It is easy to check that every logical axiom is universally valid. Now let $\Gamma$ be a set of formulas. A $\Gamma$-proof is a finite sequence $\langle \varphi_0, \ldots, \varphi_m\rangle$ of formulas such that for each $i \le m$ one of the following conditions holds:
$\varphi_i$ is a logical axiom.
$\varphi_i$ is a member of $\Gamma$.
There exist $j, k < i$ such that $\varphi_k$ is $\varphi_j \to \varphi_i$.
There is a $j < i$ and a natural number $k$ such that $\varphi_i$ is $\forall v_k\varphi_j$.
A $\Gamma$-theorem is a formula $\varphi$ which appears in some $\Gamma$-proof. We write $\Gamma \vdash \varphi$ to abbreviate this.
The completeness theorem says that $\Gamma \vdash \varphi$ iff for every structure $(A, R)$ and every sequence $a_0, a_1, \ldots$ of members of $A$, if $(A, R) \models \psi[a_0, a_1, \ldots]$ for every $\psi$ in $\Gamma$, then also $(A, R) \models \varphi[a_0, a_1, \ldots]$. This theorem is rather hard to prove, and will not be used in these notes.
Now much of these notes consists in demonstrating that ZFC $\vdash \varphi$ for various formulas $\varphi$, where ZFC is the set of axioms for set theory introduced in the next chapter.
We need a few more logical notions.
An occurrence of a variable $v_i$ in a formula $\varphi$ is bound iff it occurs in a segment of $\varphi$ of the form $\exists v_i\psi$ for some formula $\psi$; otherwise the occurrence is free. A sentence is a formula in which no variable occurs free. For any formula $\varphi$, the universal closure of $\varphi$ is the formula $\forall v_{i_0} \ldots \forall v_{i_{m-1}}\varphi$, where $v_{i_0}, \ldots, v_{i_{m-1}}$ are the variables occurring free in $\varphi$, with $i_0 < \cdots < i_{m-1}$.
Here are some examples illustrating these notions:
The first occurrence of $v_0$ in the formula $v_0 = v_1 \wedge \exists v_0(v_0 = v_0)$ is free; the other occurrences are bound.
$v_0$ occurs free in the formula $v_0 = v_2$, but not in the bigger formula $\exists v_0(v_0 = v_2)$.
$v_0 = v_0$ is not a sentence, but $\exists v_0(v_0 = v_0)$ is.
There are some important universally valid formulas involving the notions of free and bound variables. These can be derived from our logical axioms; i.e., for each of the following formulas $\varphi$, one can show that $\vdash \varphi$.
$\varphi \to \forall v_i\varphi$, if $v_i$ does not occur free in $\varphi$.
$\forall v_i\varphi \to \psi$, where $\psi$ is obtained from $\varphi$ by replacing each free occurrence of $v_i$ by $v_j$, provided that no free occurrence of $v_i$ in $\varphi$ becomes a bound occurrence of $v_j$.
If $\psi$ is obtained from $\varphi$ by replacing each bound occurrence of $v_i$ by $v_j$, where $v_j$ does not occur in $\varphi$, then $\varphi \leftrightarrow \psi$ is universally valid.
EXERCISES
E1.1. Prove Theorem 1.1.
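E1.2. We change the basic logical set-up as follows. Two new symbols are added, ( and ). An atomic formula has one of the following forms: $\langle v_i, \mathrm{e}, v_j\rangle$, $\langle v_i, \mathrm{m}, v_j\rangle$. If $\varphi$ and $\psi$ are formulas, so are $\langle (\rangle^\frown\neg\varphi^\frown\langle )\rangle$ and $\langle (\rangle^\frown\varphi^\frown\langle \mathrm{a}\rangle^\frown\psi^\frown\langle )\rangle$. If $\varphi$ is a formula and $i$ a natural number, then $\langle \mathrm{x}, v_i\rangle^\frown\varphi$ is a formula.
Formulate and prove a theorem analogous to Theorem 1.1 for this new set-up.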
E1.3. Prove that the following formulas are universally valid, for any formulas , :
(i) ( ). (Hence we could have taken and as primitives instead
of , .)
(ii) ( ). (Hence we could have taken and as primitives instead
of , .)
E1.4. Define | to be . Prove that the following are universally valid:
(i) |.
(ii) ()|(). (Hence we could have taken just | as a primitive instead of
both and .)
E1.5. Show that $\exists v_i\varphi \leftrightarrow \neg\forall v_i\neg\varphi$ is universally valid. (Hence we could have taken $\forall$ as primitive instead of $\exists$.)
E1.6. Prove the following:
$(A, R) \models (\varphi \vee \psi)[a_0, \ldots]$ iff $(A, R) \models \varphi[a_0, \ldots]$ or $(A, R) \models \psi[a_0, \ldots]$ (or both).
$(A, R) \models (\varphi \to \psi)[a_0, \ldots]$ iff from $(A, R) \models \varphi[a_0, \ldots]$ it follows that $(A, R) \models \psi[a_0, \ldots]$.
$(A, R) \models (\varphi \leftrightarrow \psi)[a_0, \ldots]$ iff ($(A, R) \models \varphi[a_0, \ldots]$ iff $(A, R) \models \psi[a_0, \ldots]$).
$(A, R) \models (\forall v_i\varphi)[a_0, \ldots]$ iff for every sequence $b_0, \ldots$ of elements of $A$ such that $a_j = b_j$ for all $j \ne i$ we have $(A, R) \models \varphi[b_0, \ldots]$.
E1.7. Check that every logical axiom is universally valid.
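E1.8. Show that for any formula $\varphi$ we have $\vdash \varphi \to \varphi$. Hint: start with
$\{\varphi \to [(\varphi \to \varphi) \to \varphi]\} \to \{[\varphi \to (\varphi \to \varphi)] \to (\varphi \to \varphi)\}$.
E1.9. A sentential proof is a proof in which only the first five axioms are used, and the rule concerning $\forall$ is not used. We write $\Gamma \vdash_s \varphi$ iff there is a sentential proof of $\varphi$ from $\Gamma$. Show that if $\Gamma \cup \{\varphi\} \vdash_s \psi$, then $\Gamma \vdash_s \varphi \to \psi$.
E1.10. Show that $\vdash_s (\varphi \to \psi) \to [(\psi \to \chi) \to (\varphi \to \chi)]$.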
E1.11. Show that s .
E1.12. Show that vi = vi .
E1.13. Suppose that v0 = v1 ( ). Show that v0 = v1 v2 v2 ).
E1.14. Show that the following formula is universally valid, if $v_i$ does not occur free in $\varphi$: $\varphi \to \forall v_i\varphi$.
E1.15. Show that $\forall v_i\varphi \to \psi$ is universally valid, where $\psi$ is obtained from $\varphi$ by replacing each free occurrence of $v_i$ by $v_j$, provided that no free occurrence of $v_i$ in $\varphi$ becomes a bound occurrence of $v_j$.
E1.16. Show that if $\psi$ is obtained from $\varphi$ by replacing each bound occurrence of $v_i$ by $v_j$, where $v_j$ does not occur in $\varphi$, then $\varphi \leftrightarrow \psi$ is universally valid.

2. ZFC axioms
Before introducing any set-theoretic axioms at all, we can introduce some important abbreviations.
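$x \subseteq y$ abbreviates $\forall z(z \in x \to z \in y)$.
$x \subset y$ abbreviates $x \subseteq y \wedge x \ne y$.
$x \notin y$ abbreviates $\neg(x \in y)$.
$\forall x \in y\,\varphi$ abbreviates $\forall x(x \in y \to \varphi)$.
$\exists x \in y\,\varphi$ abbreviates $\exists x(x \in y \wedge \varphi)$.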
In principle, any statement involving these abbreviations could be reformulated as a statement not using them; we take this as intuitively obvious.
Now we introduce the axioms of ZFC set theory. We give both a formal and informal
description of them. The informal versions will suffice for much of these notes.
Axiom 1. (Extensionality) If two sets have the same members, then they are equal. Formally:
$\forall x\forall y[\forall z(z \in x \leftrightarrow z \in y) \to x = y]$.
Note that the other implication here holds on the basis of logic.
Axiom 2. (Comprehension) Given any set $z$ and any property $\varphi$, there is a subset of $z$ consisting of those elements of $z$ with the property $\varphi$.
Formally, for any formula $\varphi$ with free variables among $x, z, w_1, \ldots, w_n$ we have an axiom
$\forall z\forall w_1 \ldots \forall w_n\exists y\forall x(x \in y \leftrightarrow x \in z \wedge \varphi)$.
Note that the variable $y$ is not free in $\varphi$.
From these first two axioms the existence of a set with no members, the empty set $\emptyset$, follows:
Proposition 2.1. There is a unique set with no members.
Proof. On the basis of logic, there is at least one set $z$. By the comprehension axiom, let $y$ be a set such that $\forall x(x \in y \leftrightarrow x \in z \wedge x \ne x)$. Thus $y$ does not have any elements. By the extensionality axiom, such a set $y$ is unique.
In general, the set asserted to exist in the comprehension axiom is unique; we denote it by $\{x \in z : \varphi\}$. We sometimes write $\{x : \varphi\}$ if a suitable $z$ is evident.
Axiom 3. (Pairing) For any sets $x, y$ there is a set which has them as members (possibly along with other sets). Formally:
$\forall x\forall y\exists z(x \in z \wedge y \in z)$.
The unordered pair $\{x, y\}$ is by definition the set $\{u \in z : u = x \text{ or } u = y\}$, where $z$ is as in the pairing axiom. The definition does not depend on the particular such $z$ that is chosen. This same remark can be made for several other definitions below. We define the singleton $\{x\}$ to be $\{x, x\}$.
Axiom 4. (Union) For any family $\mathscr{A}$ of sets, we can form a new set $A$ which has as elements all elements which are in at least one member of $\mathscr{A}$ (maybe $A$ has even more elements). Formally:
$\forall\mathscr{A}\exists A\forall Y\forall x(x \in Y \wedge Y \in \mathscr{A} \to x \in A)$.
With $A$ as in this axiom, we define $\bigcup\mathscr{A} = \{x \in A : \exists Y \in \mathscr{A}\,(x \in Y)\}$. Also, let $x \cup y = \bigcup\{x, y\}$.

Axiom 5. (Power set) For any set $x$, there is a set which has as elements all subsets of $x$, and again possibly has more elements. Formally:
$\forall x\exists y\forall z(z \subseteq x \to z \in y)$.
Axiom 6. (Infinity) There is a set which intuitively has infinitely many elements:
$\exists x[\emptyset \in x \wedge \forall y \in x(y \cup \{y\} \in x)]$.
If we take the smallest set $x$ with these properties we get the natural numbers, as we will see later.
Axiom 7. (Replacement) If a function has domain a set, then its range is also a set. Here we use the intuitive notion of a function. Later we define the rigorous notion of a function. The present intuitive notion is more general, however; it is expressed rigorously as a formula with a function-like property. The rigorous version of this axiom runs as follows.
For each formula $\varphi$ with free variables among $x, y, A, w_1, \ldots, w_n$, the following is an axiom.
$\forall A\forall w_1 \ldots \forall w_n[\forall x \in A\,\exists!y\,\varphi \to \exists Y\forall x \in A\,\exists y \in Y\,\varphi]$.
For the next axiom, we need another definition. For any sets $x, y$, let $x \cap y = \{z \in x : z \in y\}$.
Axiom 8. (Foundation) Every nonempty set $x$ has a member $y$ which has no elements in common with $x$. This is a somewhat mysterious axiom which rules out such anti-intuitive situations as $a \in a$ or $a \in b \in a$.
$\forall x[x \ne \emptyset \to \exists y \in x(x \cap y = \emptyset)]$.
Axiom 9. (Choice) This axiom will be discussed carefully later; it allows one to pick out elements from each of an infinite family of sets. A convenient form of the axiom to start with is as follows:
$\forall\mathscr{A}[\forall x \in \mathscr{A}(x \ne \emptyset) \wedge \forall x \in \mathscr{A}\,\forall y \in \mathscr{A}(x \ne y \to x \cap y = \emptyset) \to \exists B\forall x \in \mathscr{A}\,\exists!y(y \in x \wedge y \in B)]$.

3. Elementary set theory


May 24, 2011
Here we will see how the axioms are used to develop very elementary set theory. The axiom
of choice is not used in this chapter. To some extent the main purpose of this chapter is
to establish common notation.
The proof of the following theorem shows what happens to Russell's paradox in our axiomatic development. Russell's paradox runs as follows, working in naive, non-axiomatic set theory. Let $x = \{y : y \notin y\}$. If $x \in x$, then $x \notin x$; but also if $x \notin x$, then $x \in x$. Contradiction.
Theorem 3.1. $\neg\exists z\forall x(x \in z)$.
Proof. Suppose to the contrary that $\forall x(x \in z)$. Let $y = \{x \in z : x \notin x\}$. Then $(y \in y \leftrightarrow y \notin y)$, contradiction.
Lemma 3.2. If $\{x, y\} = \{u, v\}$, then one of the following conditions holds:
(i) $x = u$ and $y = v$;
(ii) $x = v$ and $y = u$.
Proof. Since $x \in \{x, y\} = \{u, v\}$, we have $x = u$ or $x = v$.
Case 1. $x = u$. Since $y \in \{x, y\} = \{u, v\}$, we have $y = u$ or $y = v$. If $y = v$, that is as desired. If $y = u$, then $x = y$ too, and $v \in \{u, v\} = \{x, y\}$, so $v = x = y$. In any case, $y = v$.
Case 2. $x = v$. By symmetry to case 1, $y = u$.
Now we can define the notion of an ordered pair: $\langle x, y\rangle = \{\{x\}, \{x, y\}\}$.
Lemma 3.3. If $\langle x, y\rangle = \langle u, v\rangle$, then $x = u$ and $y = v$.
Proof. Assume that $\langle x, y\rangle = \langle u, v\rangle$. Thus $\{\{x\}, \{x, y\}\} = \{\{u\}, \{u, v\}\}$. By Lemma 3.2, this gives two cases.
Case 1. $\{x\} = \{u\}$ and $\{x, y\} = \{u, v\}$. Then $x \in \{x\} = \{u\}$, so $x = u$. By Lemma 3.2 again, $\{x, y\} = \{u, v\}$ implies that either $y = v$, or else $x = v$ and $y = u$; in the latter case, $y = u = x = v$. So $y = v$ in any case.
Case 2. $\{x\} = \{u, v\}$ and $\{x, y\} = \{u\}$. Then $u \in \{u, v\} = \{x\}$, so $u = x$. Similarly $v = x$, $u = x$, and $u = y$. So $x = u$ and $y = v$.
This lemma justifies the following definition:
$1^{\mathrm{st}}(a, b) = a$ and $2^{\mathrm{nd}}(a, b) = b$.

These are the first and second coordinates of the ordered pair.
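The ordered pair and its two coordinate functions can be tried out on small examples. The Python sketch below is an illustration only: it models sets by `frozenset`, and the names `pair`, `first`, and `second` are hypothetical. The way the coordinates are extracted (intersection and union of the members of the pair) is one convenient choice, not a definition taken from these notes.

```python
def pair(x, y):
    """Ordered pair <x, y> = {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def first(p):
    """First coordinate: the unique element lying in every member of p."""
    (x,) = frozenset.intersection(*p)
    return x

def second(p):
    """Second coordinate: the element of the union of p other than the
    first coordinate, or the first coordinate itself when p = {{x}}."""
    x = first(p)
    rest = frozenset.union(*p) - {x}
    if rest:
        (y,) = rest
        return y
    return x

p = pair(1, 2)
print(first(p), second(p))          # 1 2
print(pair(1, 2) == pair(2, 1))     # False: order matters
print(pair(3, 3) == frozenset({frozenset({3})}))  # True: <x, x> = {{x}}
```

The degenerate case $\langle x, x\rangle = \{\{x\}\}$ is exactly the situation handled by Case 2 in the proof of the lemma.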
The notion of intersection is similar to that of union, but there is a minor problem concerning what to define the intersection of the empty set to be. We have decided to let it be the empty set.
Theorem 3.4. For any set $\mathscr{F}$ there is a set $y$ such that if $\mathscr{F} \ne \emptyset$ then $\forall x[x \in y \leftrightarrow \forall z \in \mathscr{F}[x \in z]]$, while $y = \emptyset$ if $\mathscr{F} = \emptyset$.
Proof. Let $\mathscr{F}$ be given. If $\mathscr{F} = \emptyset$, let $y = \emptyset$. Otherwise, choose $w \in \mathscr{F}$ and let
$y = \{x \in w : \forall z \in \mathscr{F}[x \in z]\}$.
The set $y$ in Theorem 3.4 is clearly unique, and we denote it by $\bigcap\mathscr{F}$. This is the intersection of $\mathscr{F}$. We already introduced in Chapter 2 the notations $\bigcup$, $\cup$, and $\cap$. To round out the simple Boolean operations we define
$A \setminus B = \{x \in A : x \notin B\}$.
This is the relative complement of $B$ in $A$.
Sets $a, b$ are disjoint iff $a \cap b = \emptyset$.
The replacement schema will almost always be used in connection with the comprehension schema. Namely, under the assumption $\forall x \in A\,\exists!y\,\varphi(x, y)$, we choose $Y$ by the replacement axiom, so that $\forall x \in A\,\exists y \in Y\,\varphi(x, y)$; then we form
$\{y \in Y : \exists x \in A\,\varphi(x, y)\}$.
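Lemma 3.5. $\forall A\forall B\exists Z\forall z(z \in Z \leftrightarrow \exists x \in A\,\exists y \in B(z = \langle x, y\rangle))$.
Proof. For any $y \in B$ we have
$\forall x \in A\,\exists!z(z = \langle x, y\rangle)$.
Hence by replacement and comprehension, we can define
$\mathrm{prod}(A, y) = \{z : \exists x \in A(z = \langle x, y\rangle)\}$.
Thus $\forall y \in B\,\exists!z(z = \mathrm{prod}(A, y))$, so by replacement and comprehension again we can define
$Z = \{w : \exists y \in B(w = \mathrm{prod}(A, y))\}$.
We claim that $\bigcup Z$ is as desired. First suppose that $x \in A$ and $y \in B$. Then $\langle x, y\rangle \in \mathrm{prod}(A, y) \in Z$, so $\langle x, y\rangle \in \bigcup Z$. Conversely, suppose that $z \in \bigcup Z$. Choose $w \in Z$ such that $z \in w$, and choose $y \in B$ such that $w = \mathrm{prod}(A, y)$. Then choose $x \in A$ such that $z = \langle x, y\rangle$; this is as desired.
We now define $A \times B$ to be the unique $Z$ of Lemma 3.5; this is the cartesian product of $A$ and $B$. Normally we would define $A \times B$ as follows:
$A \times B = \{\langle x, y\rangle : x \in A \wedge y \in B\}$.
This notation means
$\{u : \exists x, y(u = \langle x, y\rangle \wedge x \in A \wedge y \in B)\}$,
which is justified by the lemma.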
An important informal notation is
$(\ast)$ $\{\tau(x, y) \in A : \varphi(x, y)\}$,
where $\tau(x, y)$ is some set determined by $x$ and $y$. That is, there is a formula $\psi(w, x, y)$ in our set-theoretic language such that $\forall x, y\,\exists!w\,\psi(w, x, y)$; $\tau(x, y)$ is this $w$. For example $\tau(x, y)$ might be $x \cup y$, or $\langle x, y\rangle$. Then $(\ast)$ really means
$\{w \in A : \exists x, y[\psi(w, x, y) \wedge \varphi(x, y)]\}$.

A relation is a set of ordered pairs.


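Lemma 3.6. If $\langle x, y\rangle \in R$ then $x, y \in \bigcup\bigcup R$.
Proof. $x \in \{x\} \in \{\{x\}, \{x, y\}\} = \langle x, y\rangle \in R$, so $x \in \bigcup\bigcup R$. Similarly $y \in \bigcup\bigcup R$.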

This lemma justifies the following definitions of the domain and range of a set R (we think
of R as a relation, but the definitions apply to any set):
$\mathrm{dmn}(R) = \{x : \exists y(\langle x, y\rangle \in R)\}$;
$\mathrm{rng}(R) = \{y : \exists x(\langle x, y\rangle \in R)\}$.
Now we define, using the notation above,
$R^{-1} = \{\langle x, y\rangle \in \mathrm{rng}(R) \times \mathrm{dmn}(R) : \langle y, x\rangle \in R\}$.
This is the inverse or converse of $R$. Note that $R^{-1}$ is a relation, even if $R$ is not. Clearly $\langle x, y\rangle \in R^{-1}$ iff $\langle y, x\rangle \in R$, for any $x, y, R$. Usually we use this notation only when $R$ is a function (defined shortly as a special kind of relation), and even then it is more general than one might expect, since the function in question does not have to be 1-1 (another notion defined shortly).
A function is a relation $f$ such that
$\forall x \in \mathrm{dmn}(f)\,\exists!y \in \mathrm{rng}(f)[\langle x, y\rangle \in f]$.
Some common notation and terminology is as follows. $f : A \to B$ means that $f$ is a function, $\mathrm{dmn}(f) = A$, and $\mathrm{rng}(f) \subseteq B$. We say then that $f$ maps $A$ into $B$. If $f : A \to B$ and $x \in A$, then $f(x)$ is the unique $y$ such that $\langle x, y\rangle \in f$. This is the value of $f$ with the argument $x$. We may write other things like $f_x$, $f^x$ in place of $f(x)$. Note that if $f, g : A \to B$, then $f = g$ iff $\forall a \in A[f(a) = g(a)]$. If $f : A \to B$ and $C \subseteq A$, the restriction of $f$ to $C$ is $f \cap (C \times B)$; it is denoted by $f \upharpoonright C$. The image of a subset $C$ of $A$ is $f[C] \overset{\mathrm{def}}{=} \mathrm{rng}(f \upharpoonright C)$. Note that $f[C] = \{f(c) : c \in C\}$. If $D \subseteq B$ then the preimage of $D$ under $f$ is $f^{-1}[D] \overset{\mathrm{def}}{=} \{x \in A : f(x) \in D\}$.


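For any sets $f, g$ we define
$f \circ g = \{\langle a, b\rangle : \exists c[\langle a, c\rangle \in g \text{ and } \langle c, b\rangle \in f]\}$.
This is the composition of $f$ and $g$. We usually apply this notation when there are sets $A, B, C$ such that $g : A \to B$ and $f : B \to C$.
Lemma 3.7. (i) If $g : A \to B$ and $f : B \to C$, then $(f \circ g) : A \to C$ and
$\forall a \in A[(f \circ g)(a) = f(g(a))]$.
(ii) If $g : A \to B$, $f : B \to C$, and $h : C \to D$, then $h \circ (f \circ g) = (h \circ f) \circ g$.
Proof. (i): First we show that $f \circ g$ is a function. Suppose that $\langle a, b\rangle, \langle a, b'\rangle \in (f \circ g)$. Accordingly choose $c, c'$ so that $\langle a, c\rangle \in g$, $\langle c, b\rangle \in f$, $\langle a, c'\rangle \in g$, and $\langle c', b'\rangle \in f$. Then $g(a) = c$, $f(c) = b$, $g(a) = c'$, and $f(c') = b'$. So $c = c'$ and hence $b = b'$. This shows that $f \circ g$ is a function. Clearly $\mathrm{dmn}(f \circ g) = A$ and $\mathrm{rng}(f \circ g) \subseteq C$. For any $a \in A$ we have $\langle a, g(a)\rangle \in g$ and $\langle g(a), f(g(a))\rangle \in f$, and hence $\langle a, f(g(a))\rangle \in (f \circ g)$, so that $(f \circ g)(a) = f(g(a))$.
(ii): By (i), both functions map $A$ into $D$. For any $a \in A$ we have
$(h \circ (f \circ g))(a) = h((f \circ g)(a)) = h(f(g(a))) = (h \circ f)(g(a)) = ((h \circ f) \circ g)(a)$.
Hence the equality holds.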
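The definition of composition as a set of ordered pairs, and both parts of the lemma, can be checked on a small example. This Python sketch is illustrative only: Python tuples stand in for ordered pairs, Python sets stand in for relations, and the names `compose` and `is_function` as well as the sample functions are hypothetical.

```python
def compose(f, g):
    """f o g = {(a, b) : there is c with (a, c) in g and (c, b) in f}."""
    return {(a, b) for (a, c) in g for (c2, b) in f if c == c2}

def is_function(r):
    """A relation is a function iff no first coordinate occurs twice."""
    return len({a for (a, _) in r}) == len(r)

g = {(0, "p"), (1, "q"), (2, "p")}    # g : {0, 1, 2} -> {"p", "q"}
f = {("p", 10), ("q", 20)}            # f : {"p", "q"} -> {10, 20}
h = {(10, "x"), (20, "y")}            # h : {10, 20} -> {"x", "y"}

fg = compose(f, g)
print(sorted(fg))                     # [(0, 10), (1, 20), (2, 10)]
print(is_function(fg))                # True
print(compose(h, compose(f, g)) == compose(compose(h, f), g))  # True
```

Note that `compose` is defined for arbitrary sets of pairs, matching the remark that the notation applies to any sets $f, g$ and not only to functions.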
Given $f : A \to B$, we call $f$ injective, or 1-1, if $f^{-1}$ is a function; we call $f$ surjective, or onto, if $\mathrm{rng}(f) = B$; and we call $f$ bijective if it is both injective and surjective.
A function $f$ will sometimes be written in the form $\langle f(i) : i \in I\rangle$, where $I = \mathrm{dmn}(f)$. As an informal usage, we will even define functions in the form $\langle \ldots x \ldots : x \in I\rangle$, meaning the function $f$ with domain $I$ such that $f(x) = \ldots x \ldots$ for all $x \in I$.
If $A$ is a function with domain $I$, we define
$\bigcup_{i \in I} A_i = \bigcup \mathrm{rng}(A)$ and $\bigcap_{i \in I} A_i = \bigcap \mathrm{rng}(A)$.
EXERCISES
In the exercises that ask for counterexamples, it is reasonable to use any prior knowledge rather than restricting to the material in these notes.
E3.1. Prove that if $f : A \to B$ and $\langle C_i : i \in I\rangle$ is a system of subsets of $A$, then $f\left[\bigcup_{i \in I} C_i\right] = \bigcup_{i \in I} f[C_i]$.
E3.2. Prove that if $f : A \to B$ and $C, D \subseteq A$, then $f[C \cap D] \subseteq f[C] \cap f[D]$. Give an example showing that equality does not hold in general.
E3.3. Given $f : A \to B$ and $C, D \subseteq A$, compare $f[C \setminus D]$ and $f[C] \setminus f[D]$: prove the inclusions (if any) which hold, and give counterexamples for the inclusions that fail to hold.
E3.4. Prove that if $f : A \to B$ and $\langle C_i : i \in I\rangle$ is a system of subsets of $B$, then $f^{-1}\left[\bigcup_{i \in I} C_i\right] = \bigcup_{i \in I} f^{-1}[C_i]$.
E3.5. Prove that if $f : A \to B$ and $\langle C_i : i \in I\rangle$ is a system of subsets of $B$, then $f^{-1}\left[\bigcap_{i \in I} C_i\right] = \bigcap_{i \in I} f^{-1}[C_i]$.
E3.6. Prove that if $f : A \to B$ and $C, D \subseteq B$, then $f^{-1}[C \setminus D] = f^{-1}[C] \setminus f^{-1}[D]$.
E3.7. Prove that if $f : A \to B$ and $C \subseteq A$, then
$\{b \in B : f^{-1}[\{b\}] \subseteq C\} = B \setminus f[A \setminus C]$.
E3.8. For any sets $A, B$ define $A \triangle B = (A \setminus B) \cup (B \setminus A)$; this is called the symmetric difference of $A$ and $B$. Prove that if $A, B, C$ are given sets, then $A \triangle (B \triangle C) = (A \triangle B) \triangle C$.
E3.9. For any set $A$ let
$\mathrm{Id}_A = \{\langle x, x\rangle : x \in A\}$.
This is the identity function on the set $A$. Justify this definition on the basis of the axioms.
E3.10. Suppose that $f : A \to B$. Prove that $f$ is surjective iff there is a $g : B \to A$ such that $f \circ g = \mathrm{Id}_B$. Note: the axiom of choice might be needed.
E3.11. Let $A$ be a nonempty set. Suppose that $f : A \to B$. Prove that $f$ is injective iff there is a $g : B \to A$ such that $g \circ f = \mathrm{Id}_A$.
E3.12. Suppose that $f : A \to B$. Prove that $f$ is a bijection iff there is a $g : B \to A$ such that $f \circ g = \mathrm{Id}_B$ and $g \circ f = \mathrm{Id}_A$. Prove this without using the axiom of choice.
E3.13. For any sets $R, S$ define
$R|S = \{\langle x, z\rangle : \exists y(\langle x, y\rangle \in R \wedge \langle y, z\rangle \in S)\}$.
This is the relative product of $R$ and $S$. Justify this definition on the basis of the axioms.
E3.14. Suppose that $f, g : A \to A$. Prove that
$(A \times A) \setminus [((A \times A) \setminus f)|((A \times A) \setminus g)]$
is a function.
E3.15. Suppose that $f : A \to B$ is a surjection, $g : A \to C$, and $\forall x, y \in A[f(x) = f(y) \to g(x) = g(y)]$. Prove that there is a function $h : B \to C$ such that $h \circ f = g$. Define $h$ as a set of ordered pairs.
E3.16. The statement
A A B B(A B) implies that

is slightly wrong. Fix it, and prove the result.


E3.17. Suppose that A A B B(A B). Prove that
E3.18. The statement
A A B B(B A) implies that

is slightly wrong. Fix it, and prove the result.

A.

B.

4. Ordinals
July 26, 2011
In this chapter we introduce the ordinals, prove a general recursion theorem, and develop
some elementary ordinal arithmetic.
A set $A$ is transitive iff $\forall x \in A\,\forall y \in x(y \in A)$; in other words, iff every element of $A$ is a subset of $A$. This is a very important notion in the foundations of set theory, and it is essential in our definition of ordinals. An ordinal number, or simply an ordinal, is a transitive set of transitive sets. We use the first few Greek letters to denote ordinals. Note that for any set $x$, $x \notin x$ by the foundation axiom. If $\alpha, \beta, \gamma$ are ordinals and $\alpha \in \beta \in \gamma$, then $\alpha \in \gamma$ since $\gamma$ is transitive. These two facts justify writing $\alpha < \beta$ instead of $\alpha \in \beta$ when $\alpha$ and $\beta$ are ordinals. This helps the intuition in thinking of the ordinals as kinds of numbers. We also define $\alpha \le \beta$ iff $\alpha < \beta$ or $\alpha = \beta$.
It is useful in what follows to introduce some of the terminology of ordered sets and prove a lemma before proceeding with our development of ordinals. A poset or partially ordered set is an ordered pair $(A, <)$ such that $<$ is a binary relation contained in $A \times A$, it is irreflexive (there is no $x$ such that $x < x$), and it is transitive ($x < y$ and $y < z$ imply that $x < z$). Frequently we will simply write $A$ for the poset, with the relation $<$ understood. Then we define $x \le y$ iff $x < y$ or $x = y$. Given a subset $X$ of $A$, a minimal element of $X$ is an element $x \in X$ such that there is no $y \in X$ such that $y < x$. Elements $x, y$ of $A$ are comparable iff $x < y$, $y < x$, or $x = y$. A poset is a simply ordered set or linearly ordered set iff any two of its elements are comparable.
Lemma 4.1. Suppose that $(A, <)$ is a poset satisfying the following conditions:
(i) Any nonempty subset of $A$ has a minimal element.
(ii) If $x, y \in A$, $x \ne y$, and for all $z \in A$, $z < y$ implies that $z < x$, then there is a $z < x$ such that $z \not< y$.
Then $(A, <)$ is linearly ordered.
Proof. Suppose not; so there are incomparable elements in $A$. Let
$X = \{x \in A : \text{there is a } b \in A \text{ which is not comparable with } x\}$.
Thus $X$ is nonempty by supposition. By (i), let $x$ be a minimal element of $X$. Then let
$Y = \{y \in A : x \text{ and } y \text{ are not comparable}\}$.
Hence $Y$ is nonempty by the fact that $x \in X$. By (i), let $y$ be a minimal element of $Y$.
Suppose that $z < y$. Then $z \notin Y$, and so $x$ and $z$ are comparable. If $x \le z$, then $x < y$, contradicting the fact that $y \in Y$. It follows that $z < x$.
Thus we have shown that $z < y$ implies that $z < x$. Hence by (ii) there is a $z < x$ such that $z \not< y$. By the choice of $x$ we thus have that $z \notin X$, and so in particular $z$ is comparable with $y$. So $y \le z$, hence $y < x$, contradiction.
This lemma will be used shortly. Now we return to the discussion of ordinals. The following properties of ordinals are easy to prove:
$\emptyset$ is an ordinal.
Because of this fact, the empty set is a number; it will turn out to be the first nonnegative integer, and the first cardinal number. For this reason, we will use $0$ and $\emptyset$ interchangeably, trying to use $0$ when numbers are involved, and $\emptyset$ when they are not.
If $\alpha$ is an ordinal, then so is $\alpha \cup \{\alpha\}$.
We denote $\alpha \cup \{\alpha\}$ by $\alpha +' 1$. After introducing addition of ordinals, it will turn out that $\alpha +' 1 = \alpha + 1$ for every ordinal $\alpha$, so that the prime can be dropped. This ordinal $\alpha +' 1$ is the successor of $\alpha$. We define $1 = 0 +' 1$, $2 = 1 +' 1$, etc. (up through 9; no further since we do not want to try to justify decimal notation).
If $A$ is a set of ordinals, then $\bigcup A$ is an ordinal.
We sometimes write $\sup(A)$ for $\bigcup A$. In fact, $\bigcup A$ is the least ordinal $\ge$ each member of $A$. We prove this shortly.
Every member of an ordinal is an ordinal.
There does not exist a set which has every ordinal as a member. In fact, suppose to the contrary that $A$ is such a set. Let $B = \{x \in A : x \text{ is an ordinal}\}$. Then $B$ is a set of transitive sets and $B$ itself is transitive. Hence $B$ is an ordinal. So $B \in A$, and even $B \in B$, contradicting the foundation axiom.
This fact is what happens in our axiomatic framework to the Burali-Forti paradox.
Now we can prove that the ordinals are simply ordered. More precisely, any set of ordinals
is linearly ordered.
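Theorem 4.2. If $\alpha$ and $\beta$ are ordinals, then $\alpha = \beta$, $\alpha \in \beta$, or $\beta \in \alpha$.
Proof. Let $\alpha$ and $\beta$ be ordinals. Let $A = (\alpha +' 1) \cup (\beta +' 1)$. So $(A, <)$ is a poset. We are going to verify the conditions of Lemma 4.1. Suppose that $X$ is a nonempty subset of $A$. By the foundation (regularity) axiom, choose $\gamma \in X$ such that $\gamma \cap X = \emptyset$. Thus for any $\delta < \gamma$ we have $\delta \notin X$, since $<$ is the same as $\in$ among ordinals. So $\gamma$ is a minimal element of $X$, as desired in 4.1(i). For 4.1(ii), suppose that $\gamma, \delta \in A$, $\gamma \ne \delta$, and $\varepsilon < \delta$ implies that $\varepsilon < \gamma$. Then $\delta \subseteq \gamma$ since $<$ is $\in$. Since $\gamma \ne \delta$, choose $\varepsilon \in \gamma \setminus \delta$. Thus $\varepsilon < \gamma$ but $\varepsilon \not< \delta$, as desired in 4.1(ii).
So we apply 4.1 and infer that $(A, <)$ is a linear order. Since $\alpha, \beta \in A$, they are comparable.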
We give some more simple properties of ordinals. For some of them, it is convenient to use
Theorem 4.2 to check them.
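$\alpha \le \beta$ iff $\alpha \subseteq \beta$.
$\alpha < \beta$ iff $\alpha \subset \beta$.
$\alpha < \beta$ iff $\alpha +' 1 \le \beta$.

These facts can be used to check the implicit statement above that $\bigcup A$ is the least upper
bound of $A$, for any set $A$ of ordinals. In fact, if $\alpha \in A$, then $\alpha \subseteq \bigcup A$, so $\alpha \le \bigcup A$. If
$\alpha \le \beta$ for all $\alpha \in A$, then $\alpha \subseteq \beta$ for each $\alpha \in A$, hence $\bigcup A \subseteq \beta$, hence $\bigcup A \le \beta$.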

The following theorem is also fundamental.

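Theorem 4.3. If $\Gamma$ is a nonempty set of ordinals, then $\bigcap\Gamma$ is an ordinal, $\bigcap\Gamma \in \Gamma$,
and $\bigcap\Gamma \le \alpha$ for every $\alpha \in \Gamma$.
Proof. The members of $\bigcap\Gamma$ are clearly ordinals, so for the first statement it suffices
to show that $\bigcap\Gamma$ is transitive. Suppose that $\alpha \in \beta \in \bigcap\Gamma$; and suppose that $\gamma \in \Gamma$. Then
$\beta \in \gamma$, and hence $\alpha \in \gamma$ since $\gamma$ is transitive. This argument shows that $\alpha \in \bigcap\Gamma$. Since
$\alpha$ is arbitrary, it follows that $\bigcap\Gamma$ is transitive, and hence is an ordinal.
For every $\alpha \in \Gamma$ we have $\bigcap\Gamma \subseteq \alpha$, and hence $\bigcap\Gamma \le \alpha$ by a previous
fact.
Suppose that $\bigcap\Gamma \notin \Gamma$. For any $\alpha \in \Gamma$ we have $\bigcap\Gamma \subseteq \alpha$, hence $\bigcap\Gamma \le \alpha$, hence
$\bigcap\Gamma < \alpha$ since $\alpha \in \Gamma$ but we are assuming that $\bigcap\Gamma \notin \Gamma$. But this means that
$\forall \alpha \in \Gamma\,[\bigcap\Gamma \in \alpha]$. So $\bigcap\Gamma \in \bigcap\Gamma$, contradiction.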
A linearly ordered set $(A, <)$ is a well-ordered set iff every nonempty subset $X$ of $A$ has
a least element, i.e., an element $x \in X$ such that $x \le y$ for all $y \in X$. So 4.3 says that
for any ordinal $\alpha$, $(\alpha, <)$ is a well-ordered set. Later in this chapter we will show that
every well-ordered set is isomorphic to an ordinal; this is the fundamental theorem about
ordinals.
Ordinals are divided into three classes as follows. First there is $0$, the empty set. An
ordinal $\alpha$ is a successor ordinal if $\alpha = \beta +' 1$ for some $\beta$. Finally, $\alpha$ is a limit ordinal if it
is nonzero and is not a successor ordinal. Thus 1, 2, etc. are successor ordinals.
To prove the existence of limit ordinals, we need the infinity axiom. Let $x$ be as in
the statement of the infinity axiom. Thus $0 \in x$, and $y \cup \{y\} \in x$ for all $y \in x$. We define
$$\omega = \bigcap \{z \subseteq x : 0 \in z \text{ and } y \cup \{y\} \in z \text{ for all } y \in z\}.$$
This definition does not depend on the choice of $x$. The members of $\omega$ are natural numbers,
and the usual induction principle is built into the definition. It is easy to check that $\omega$ is
an ordinal, and in fact is the least limit ordinal.
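Proposition 4.4. The following conditions are equivalent:
(i) $\alpha$ is a limit ordinal;
(ii) $\alpha \ne 0$, and for every $\beta < \alpha$ there is a $\gamma$ such that $\beta < \gamma < \alpha$.
(iii) $\alpha = \bigcup\alpha \ne 0$.
Proof. (i)$\Rightarrow$(ii): suppose that $\alpha$ is a limit ordinal. So $\alpha \ne 0$, by definition. Suppose
that $\beta < \alpha$. Then $\beta +' 1 < \alpha$, since $\alpha$ is not a successor ordinal. Thus $\gamma = \beta +' 1$ works
as indicated.
(ii)$\Rightarrow$(iii): if $\beta \in \bigcup\alpha$, choose $\gamma \in \alpha$ such that $\beta \in \gamma$. Then $\beta \in \alpha$ since $\alpha$ is an ordinal.
If $\beta \in \alpha$, choose $\gamma$ with $\beta < \gamma < \alpha$. Thus $\beta \in \bigcup\alpha$. This proves that $\alpha = \bigcup\alpha$, and $\alpha \ne 0$
is given.
(iii)$\Rightarrow$(i): suppose that $\alpha = \beta +' 1$. Then $\beta \in \alpha = \bigcup\alpha$, so choose $\gamma \in \alpha$ such that
$\beta \in \gamma$. Thus $\beta < \gamma \le \beta$, so $\beta < \beta$, contradiction.
Also note that if $\alpha = \beta +' 1$, then $\bigcup\alpha = \beta$.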

Classes and sets


Although expressions like $\{x : x = x\}$ and $\{\alpha : \alpha \text{ is an ordinal}\}$ are natural, they cannot
be put into the framework of our logic for set theory. These collections are too big.
It is intuitively indispensable to continue using such expressions. One should understand
that when this is done, there is a rigorous way of reformulating what is said. These big
collections are called classes; their rigorous counterparts are simply formulas of our set
theoretic language. We can also talk about class functions, class relations, the domain of
class functions, etc. Most of the notions that we have introduced so far in our sketch of a
development from the axioms have class counterparts. In particular, we have the important
classes $\mathbf{V}$, the class of all sets, and $\mathbf{On}$, the class of all ordinals. They correspond to the
formulas "$x = x$" and "$\alpha$ is an ordinal". We attempt to use bold face letters for classes;
in some cases the classes in question are actually sets. A class which is not a set is called
a proper class.
Sequences of ordinals
We say that $\mathbf F$ is an ordinal class function iff $\mathbf F$ is a class function whose domain is an
ordinal, or the whole class $\mathbf{On}$, and whose range is contained in $\mathbf{On}$. If its domain is an
ordinal, then of course it is a set. We consider three properties of an ordinal class function
$\mathbf F$ with domain $A$:
$\mathbf F$ is strictly increasing iff for any ordinals $\beta, \gamma \in A$, if $\beta < \gamma$ then $\mathbf F(\beta) < \mathbf F(\gamma)$.
$\mathbf F$ is continuous iff for every limit ordinal $\alpha \in A$, $\mathbf F(\alpha) = \bigcup_{\beta<\alpha} \mathbf F(\beta)$.
$\mathbf F$ is normal iff it is continuous and strictly increasing.
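Proposition 4.5. If $\mathbf F$ is a strictly increasing ordinal class function with domain $A$,
then $\alpha \le \mathbf F(\alpha)$ for every ordinal $\alpha \in A$.
Proof. Suppose not, and let $\alpha$ be the least member of $A$ such that $\mathbf F(\alpha) < \alpha$.
Then $\mathbf F(\mathbf F(\alpha)) < \mathbf F(\alpha)$, so that $\beta = \mathbf F(\alpha)$ is an ordinal smaller than $\alpha$ such that $\mathbf F(\beta) < \beta$,
contradicting the minimality of $\alpha$.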
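Proposition 4.6. If $\mathbf F$ is a continuous ordinal class function with domain $A$, and
$\mathbf F(\alpha) < \mathbf F(\alpha +' 1)$ for every ordinal $\alpha$ such that $\alpha +' 1 \in A$, then $\mathbf F$ is strictly increasing.
Proof. Fix an ordinal $\gamma \in A$, and suppose that there is an ordinal $\delta \in A$ with $\gamma < \delta$
and $\mathbf F(\delta) \le \mathbf F(\gamma)$; we want to get a contradiction. Take the least such $\delta$.
Case 1. $\delta = \theta +' 1$ for some $\theta$. Thus $\gamma \le \theta$. If $\gamma = \theta$, then $\mathbf F(\gamma) < \mathbf F(\delta)$ by
the hypothesis of the proposition, contradicting our supposition. Hence $\gamma < \theta$. Hence
$\mathbf F(\gamma) < \mathbf F(\theta)$ by the minimality of $\delta$, and $\mathbf F(\theta) < \mathbf F(\delta)$ by the assumption of the proposition,
so $\mathbf F(\gamma) < \mathbf F(\delta)$, contradiction.
Case 2. $\delta$ is a limit ordinal. Then there is a $\theta < \delta$ with $\gamma < \theta$, and so by the minimality
of $\delta$ we have
$$\mathbf F(\gamma) < \mathbf F(\theta) \le \bigcup_{\varepsilon<\delta} \mathbf F(\varepsilon) = \mathbf F(\delta),$$
contradiction.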

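Proposition 4.7. Suppose that $\mathbf F$ is a normal ordinal class function with domain $A$,
and $\alpha \in A$ is a limit ordinal. Then $\mathbf F(\alpha)$ is a limit ordinal too.
Proof. First, $\mathbf F(\alpha) \ne 0$, since $\mathbf F(0) < \mathbf F(\alpha)$. Suppose that $\gamma < \mathbf F(\alpha)$. Thus
$\gamma \in \bigcup_{\beta<\alpha} \mathbf F(\beta)$, so there is a $\beta < \alpha$ such that
$\gamma < \mathbf F(\beta)$. Now $\mathbf F(\beta) < \mathbf F(\alpha)$. Hence $\mathbf F(\alpha)$ is a limit ordinal by 4.4(ii).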
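Proposition 4.8. Suppose that $\mathbf F$ and $\mathbf G$ are normal ordinal class functions, with
domains $A$, $B$ respectively, and the range of $\mathbf F$ is contained in $B$. Then $\mathbf G \circ \mathbf F$ is also
normal.
Proof. Clearly $\mathbf G \circ \mathbf F$ is strictly increasing. Now suppose that $\alpha \in A$ is a limit ordinal.
Then $\mathbf F(\alpha)$ is a limit ordinal by 4.7.
Suppose that $\beta < \alpha$. Then $\mathbf F(\beta) < \mathbf F(\alpha)$, so $\mathbf G(\mathbf F(\beta)) \le \bigcup_{\gamma<\mathbf F(\alpha)} \mathbf G(\gamma) = \mathbf G(\mathbf F(\alpha))$.
Thus
$$(\ast)\qquad \bigcup_{\beta<\alpha} \mathbf G(\mathbf F(\beta)) \le \mathbf G(\mathbf F(\alpha)).$$
Now if $\gamma < \mathbf F(\alpha)$, then by the continuity of $\mathbf F$, $\gamma < \bigcup_{\beta<\alpha} \mathbf F(\beta)$, and hence there is a
$\beta < \alpha$ such that $\gamma < \mathbf F(\beta)$; so $\mathbf G(\gamma) < \mathbf G(\mathbf F(\beta))$. So for any $\gamma < \mathbf F(\alpha)$ we have
$\mathbf G(\gamma) \le \bigcup_{\beta<\alpha} \mathbf G(\mathbf F(\beta))$. Hence
$$\mathbf G(\mathbf F(\alpha)) = \bigcup_{\gamma<\mathbf F(\alpha)} \mathbf G(\gamma) \le \bigcup_{\beta<\alpha} \mathbf G(\mathbf F(\beta));$$
together with $(\ast)$ this gives the continuity of $\mathbf G \circ \mathbf F$.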


Transfinite induction and recursion
The following theorem generalizes the familiar principle of complete induction for natural
numbers.
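Theorem 4.9. (Complete transfinite induction) Suppose that $A$ is an ordinal, or is
$\mathbf{On}$, $B \subseteq A$, and
$(\ast)$ For all $\alpha \in A$, if $\beta \in B$ for all $\beta < \alpha$, then $\alpha \in B$.
Then $B = A$.
Proof. Suppose not. Then $A \setminus B$ is nonempty, and we let $\alpha$ be the least element of
it. Thus $\beta \in B$ for all $\beta < \alpha$, so by the assumption $(\ast)$, also $\alpha \in B$, contradiction.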
There is also an ordinary principle of transfinite induction, in which the argument goes
step-by-step, except for limit ordinals, where we have to do complete induction again. This
generalizes the usual induction principle for $\omega$, which as we stated is really built in to the
definition of $\omega$.
Theorem 4.10. (Ordinary transfinite induction) Suppose that $A$ is an ordinal or is
$\mathbf{On}$, $B \subseteq A$, and the following three conditions hold:
(i) If $0 \in A$, then $0 \in B$.
(ii) If $\alpha +' 1 \in A$ and $\alpha \in B$, then $\alpha +' 1 \in B$.
(iii) If $\alpha \in A$ is a limit ordinal, and if $\beta \in B$ for all $\beta < \alpha$, then $\alpha \in B$.
Under these assumptions, $B = A$.
Proof. Again, suppose not, and let $\alpha$ be the least element of $A \setminus B$. Then $\alpha \ne 0$ by
(i). Suppose that $\alpha$ is a successor ordinal $\beta +' 1$. Then $\beta \in B$, and (ii) is contradicted.
Finally, if $\alpha$ is a limit ordinal, then (iii) is contradicted.
We need the following lemma before formulating and proving the transfinite recursion
principle.
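Lemma 4.11. Suppose that $\mathbf F$ is an ordinal class function with domain $\mathbf{On}$, and $\alpha$ is a
particular ordinal. Then there is a unique function $f$ with domain $\alpha$ such that $f(\beta) = \mathbf F(\beta)$
for every $\beta < \alpha$.
Note that $f$ here is a set. More formally, Lemma 4.11 looks like this:
For any formula $\varphi(x, y)$, we can prove the following in ZFC:
$$[\forall x[\exists y\,\varphi(x, y) \rightarrow x \text{ is an ordinal}] \text{ and } \forall x[x \text{ is an ordinal} \rightarrow \exists! y\,\varphi(x, y)]]$$
$$\rightarrow \forall x[x \text{ is an ordinal} \rightarrow \exists! f[f \text{ is a function and } \mathrm{dmn}(f) = x \text{ and } \forall y \forall z[\langle y, z\rangle \in f \rightarrow \varphi(y, z)]]].$$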
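Proof. By the axiom of replacement, the set
$$A \overset{\text{def}}{=} \{x : \mathbf F(\beta) = x \text{ for some } \beta < \alpha\}$$
exists. Let
$$f = \{(\beta, \gamma) \in \alpha \times A : \mathbf F(\beta) = \gamma\}.$$
Clearly $f$ is as desired, and it is unique.
The function $f$ asserted to exist in this lemma will be denoted by $\mathbf F \upharpoonright \alpha$.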
Theorem 4.12. (Class version of the transfinite recursion principle) Suppose that $\mathbf G$
is a class function with domain the class of all (ordinary) functions. Then there is a unique
class function $\mathbf F$ with domain $\mathbf{On}$ such that for every ordinal $\alpha$ we have $\mathbf F(\alpha) = \mathbf G(\mathbf F \upharpoonright \alpha)$.
Before beginning the proof it may clarify things to formulate the theorem more precisely.
The existence part of the theorem says that for any formula $\varphi(x, y)$ there is another
formula $\psi(x, y)$ such that it is provable that if $\varphi(x, y)$ represents a function $\mathbf G$ defined for all
(ordinary) functions, then $\psi(x, y)$ represents a function $\mathbf F$ as indicated. The uniqueness
part says that if $\chi(x, y)$ is any other formula that works as indicated, then one can prove
$\forall x \forall y[\psi(x, y) \leftrightarrow \chi(x, y)]$.
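Proof of 4.12. We consider the following condition:
$(\ast)$ $f$ is a function with domain $\alpha$, and for every $\beta < \alpha$ we have $f(\beta) = \mathbf G(f \upharpoonright \beta)$.
First we show
(1) If $f, \alpha$ satisfy $(\ast)$ and $g, \beta$ satisfy $(\ast)$ and $\alpha \le \beta$, then $f = g \upharpoonright \alpha$.
To prove this, we prove by transfinite induction on $\gamma$ that if $\gamma < \alpha$ then $f(\gamma) = g(\gamma)$.
Suppose that this is true for all $\delta < \gamma$, where $\gamma < \alpha$. Then $f \upharpoonright \gamma = g \upharpoonright \gamma$, so
$f(\gamma) = \mathbf G(f \upharpoonright \gamma) = \mathbf G(g \upharpoonright \gamma) = g(\gamma)$, finishing the inductive proof.
(2) For every ordinal $\alpha$ there is a function $f$ such that $(\ast)$ holds.
We prove this by transfinite induction. It trivially holds for $\alpha = 0$. Assume that it holds
for $\alpha$, and $f$ is a function such that $(\ast)$ holds for $\alpha$. Let $h = f \cup \{(\alpha, \mathbf G(f))\}$. Clearly $(\ast)$
holds for $h$ and $\alpha +' 1$. Finally, suppose that $\alpha$ is a limit ordinal, and $(\ast)$ holds for every
$\beta < \alpha$. By (1), for each $\beta < \alpha$ there is a unique $f$ satisfying $(\ast)$; we denote it by $f_\beta$ (the
replacement axiom is being used). Let $g = \bigcup_{\beta<\alpha} f_\beta$. Then $g$ is a function by (1), and its
domain is clearly $\alpha$.
(3) For any $\beta < \alpha$ we have $f_\beta = g \upharpoonright \beta$, and $g(\beta) = \mathbf G(g \upharpoonright \beta)$.
In fact, the first condition is clear. For the second,
$$g(\beta) = f_{\beta +' 1}(\beta) = \mathbf G(f_{\beta +' 1} \upharpoonright \beta) = \mathbf G(g \upharpoonright \beta).$$
So, (3) holds; hence $(\ast)$ holds for $\alpha$.
This finishes the inductive proof of (2).
Now for any ordinal $\alpha$ we let $\mathbf F(\alpha) = f(\alpha)$, where $f$ is chosen so that $(\ast)$ holds for
$\alpha +' 1$ and $f$. This definition is unambiguous by (1). Also by (1), we have $\mathbf F \upharpoonright \alpha = f \upharpoonright \alpha$.
Hence $\mathbf F(\alpha) = f(\alpha) = \mathbf G(f \upharpoonright \alpha) = \mathbf G(\mathbf F \upharpoonright \alpha)$.
This finishes the proof of existence.
For uniqueness, suppose that $\mathbf H$ also satisfies the conditions of the theorem. We prove
that $\mathbf F(\alpha) = \mathbf H(\alpha)$ for every ordinal $\alpha$ by induction. Suppose that this is true for all $\beta < \alpha$.
Then $\mathbf F \upharpoonright \alpha = \mathbf H \upharpoonright \alpha$, and hence $\mathbf F(\alpha) = \mathbf G(\mathbf F \upharpoonright \alpha) = \mathbf G(\mathbf H \upharpoonright \alpha) = \mathbf H(\alpha)$. This finishes the
inductive proof.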
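The recursion principle can be illustrated on finite ordinals, where the restriction of F to k is just the finite sequence of earlier values. The following sketch assumes this finite setting only; `recurse` and `one_plus_sum` are hypothetical names chosen for the illustration.

```python
def recurse(G, n):
    """Compute [F(0), ..., F(n-1)] where F(k) = G(F restricted to k)."""
    values = []  # values[k] = F(k); tuple(values) plays the role of F restricted to k
    for _ in range(n):
        values.append(G(tuple(values)))
    return values

def one_plus_sum(h):
    """An example G: map a finite sequence of numbers to 1 plus its sum."""
    return 1 + sum(h)

print(recurse(one_plus_sum, 6))  # [1, 2, 4, 8, 16, 32]
```

Each value depends on all earlier values at once, which is the pattern of condition (*) in the proof; the transfinite case additionally needs the limit step handled by the union of the approximations.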
More on well-ordered sets
Now we prove some more facts about well-ordered sets, leading up to the fundamental fact
that each such set is similar to an ordinal, in a certain well-defined sense.
Let $(A, <)$ be a well-ordered set. An initial segment of $A$ under $<$ is a subset $B$
of $A$ such that $\forall a \in A\,\forall b \in B\,(a < b \rightarrow a \in B)$. An initial segment is proper iff it is
different from $A$ itself. If $a \in A$, then the initial segment determined by $a$ under $<$ is
the set $\{b \in A : b < a\}$. Clearly this really is an initial segment of $A$ under $<$. Then
$(\{b \in A : b < a\}, <)$ is a well-ordered set which we denote by $(A \upharpoonright a, <)$. Note here the
slight sloppiness: we really should talk about $\{(b, c) : b, c < a \text{ and } b < c\}$ instead of $<$ here.
Sometimes we omit the phrase "under $<$" when talking about initial segments, if the $<$
involved is clear.

Proposition 4.13. A proper initial segment of a well-ordered set is determined by
one of its elements.
Proof. Suppose that $B$ is a proper initial segment of the well-ordered set $(A, <)$. Let
$a$ be the least element of $A \setminus B$. We claim that $B$ is determined by $a$. For, if $b \in B$, then
it cannot happen that $a \le b$, since this would imply that $a \in B$; so $b < a$. And if $b < a$,
then $b \in B$ by the minimality of $a$.
Suppose that $(A, <)$ and $(B, \prec)$ are linearly ordered sets. A function $f : A \to B$ is
strictly increasing iff $\forall a_0, a_1 \in A\,[a_0 < a_1 \rightarrow f(a_0) \prec f(a_1)]$. Note that such a function is
necessarily one-one.
The following easy fact will be useful.
Proposition 4.14. If $(A, <)$ and $(B, \prec)$ are linearly ordered sets and $f : A \to B$ is
strictly increasing, then $\forall a_0, a_1 \in A\,[a_0 < a_1 \leftrightarrow f(a_0) \prec f(a_1)]$.
Proof. The direction $\Rightarrow$ is given by the definition. Now suppose that it is not true
that $a_0 < a_1$. Then $a_1 \le a_0$, so $f(a_1) \preceq f(a_0)$. So $f(a_0) \prec f(a_1)$ is not true.
We already know the following fact for ordinals, but now we need it for arbitrary well-ordered sets.
Proposition 4.15. If $(A, <)$ is a well-ordered set and $f : A \to A$ is strictly increasing,
then $x \le f(x)$ for all $x \in A$.
Proof. Suppose not. Then the set $B \overset{\text{def}}{=} \{x \in A : f(x) < x\}$ is nonempty. Let $b$ be
the least element of $B$. Thus $f(b) < b$. Hence by the choice of $b$, we have $f(b) \le f(f(b))$.
Hence by 4.14, $b \le f(b)$, contradiction.
Proposition 4.16. If $(A, <)$ is a well-ordered set, then there does not exist a strictly
increasing function from $A$ onto a proper initial segment of $A$.
Proof. Suppose that $f$ is such a function. By 4.13, the proper initial segment is
determined by some element $x$. By 4.15, $x \le f(x)$, contradicting the assumption that $f$
maps into the given proper initial segment.
Let $(A, <)$ and $(B, \prec)$ be linearly ordered sets. We say that they are similar if there is
a strictly increasing function $f$ mapping $A$ onto $B$. We call $f$ a similarity mapping, or an
order-isomorphism.
Proposition 4.17. If $(A, <)$ and $(B, \prec)$ are similar well-ordered sets, then there is
a unique strictly increasing function $f$ mapping $A$ onto $B$.
Proof. The existence of $f$ follows from the definition. Suppose that both $f$ and $g$
are strictly increasing functions mapping $A$ onto $B$. Then $f^{-1} \circ g$ is a strictly increasing
function from $A$ into $A$, so by 4.15 we get $x \le (f^{-1} \circ g)(x)$ for every $x \in A$; so $f(x) \preceq g(x)$
for every $x \in A$. Similarly, $g(x) \preceq f(x)$ for every $x \in A$, so $f = g$.
Corollary 4.18. If $\alpha \ne \beta$, then $\alpha$ and $\beta$ are not similar.
Proof. Suppose to the contrary that $f$ is a similarity mapping from $\alpha$ onto $\beta$, with
$\beta < \alpha$. Since $\beta$ is then a proper initial segment of $\alpha$, this contradicts 4.16.

Proposition 4.19. Suppose that $f$ is a one-one function mapping an ordinal $\alpha$ onto
a set $X$. Define $R = \{(f(\beta), f(\gamma)) : \beta < \gamma < \alpha\}$. Then $(X, R)$ is a well-ordered set.
Proof. Completely straightforward.
The following theorem is fundamental. The proof is also of general interest; it can be
followed in outline form in many other situations.
Theorem 4.20. Every well-ordered set is similar to an ordinal.
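Proof. Let $(A, \prec)$ be a well-ordered set. We may assume that $A \ne \emptyset$. We define a
class function $\mathbf G$ whose domain is the class of all functions. For any function $h$,
$$\mathbf G(h) = \begin{cases} \text{the } \prec\text{-least element of } A \setminus \mathrm{rng}(h) & \text{if this set is nonempty,} \\ A & \text{otherwise.} \end{cases}$$
Now by the recursion theorem let $\mathbf F$ be a class function with domain $\mathbf{On}$ such that
$\mathbf F(\beta) = \mathbf G(\mathbf F \upharpoonright \beta)$ for each ordinal $\beta$.
(1) If $\beta < \gamma$ and $\mathbf F(\beta) = A$, then $\mathbf F(\gamma) = A$.
For, $A \setminus \mathrm{rng}(\mathbf F \upharpoonright \gamma) \subseteq A \setminus \mathrm{rng}(\mathbf F \upharpoonright \beta)$, so (1) is clear.
(2) If $\beta < \gamma$ and $\mathbf F(\gamma) \ne A$, then $\mathbf F(\beta) \ne A$ and $\mathbf F(\beta) \prec \mathbf F(\gamma)$.
The first assertion follows from (1). For the second assertion, note that $A \setminus \mathrm{rng}(\mathbf F \upharpoonright \gamma) \subseteq
A \setminus \mathrm{rng}(\mathbf F \upharpoonright \beta)$, hence $\mathbf F(\gamma) \in A \setminus \mathrm{rng}(\mathbf F \upharpoonright \beta)$, so $\mathbf F(\beta) \preceq \mathbf F(\gamma)$ by definition. Also $\mathbf F(\beta) \in
\mathrm{rng}(\mathbf F \upharpoonright \gamma)$, and $\mathbf F(\gamma) \notin \mathrm{rng}(\mathbf F \upharpoonright \gamma)$, so $\mathbf F(\beta) \prec \mathbf F(\gamma)$, as desired in (2).
By (2), $\mathbf F(\beta)$ cannot be $\ne A$ for all $\beta$, because then $\mathbf F$ would be a one-one function
mapping $\mathbf{On}$ into $A$, contradicting the replacement axiom. Choose $\gamma$ minimum such that
$\mathbf F(\gamma) = A$. (Note that $\mathbf F(0) \ne A$, since $A$ is nonempty and so has a least element.) By (2),
$\mathbf F \upharpoonright \gamma$ is one-one and strictly increasing, and it maps $\gamma$ onto $A$ since $\mathbf F(\gamma) = A$ means that
$A \setminus \mathrm{rng}(\mathbf F \upharpoonright \gamma)$ is empty. So $\gamma$ is similar to $A$.
By 4.18 and 4.20, every well-ordered set $(A, \prec)$ is similar to a unique ordinal. This ordinal
is called the order type of $(A, \prec)$, and is denoted by $\mathrm{o.t.}(A, \prec)$.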
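For a finite well-ordered set, the construction in the proof of Theorem 4.20 amounts to repeatedly taking the least element not yet enumerated. The sketch below assumes a finite set and an order given as a Python predicate; `enumerate_well_order` and `shorter_first` are hypothetical names used only for this illustration.

```python
def enumerate_well_order(A, less):
    """Return [F(0), F(1), ...], where F(k) is the least element of A
    not among F(0), ..., F(k-1), mirroring the recursion in the proof."""
    remaining = set(A)
    F = []
    while remaining:
        # the least element of the set A minus rng(F restricted to k)
        least = next(a for a in remaining
                     if all(a == b or less(a, b) for b in remaining))
        F.append(least)
        remaining.remove(least)
    return F

def shorter_first(a, b):
    """A linear order on strings: by length, then alphabetically."""
    return (len(a), a) < (len(b), b)

F = enumerate_well_order({"pear", "fig", "apple"}, shorter_first)
print(F)       # ['fig', 'pear', 'apple']
print(len(F))  # 3, the order type of this finite well-order
```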
Ordinal addition
We define ordinal addition by transfinite recursion:
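$$\alpha + 0 = \alpha;$$
$$\alpha + (\beta +' 1) = (\alpha + \beta) +' 1;$$
$$\alpha + \beta = \bigcup_{\gamma<\beta} (\alpha + \gamma) \quad \text{for } \beta \text{ limit}.$$
Proposition 4.21. $\alpha + 1 = \alpha +' 1$ for any ordinal $\alpha$.
Proof. $\alpha + 1 = \alpha + (0 +' 1) = (\alpha + 0) +' 1 = \alpha +' 1$.
Now we can stop using the notation $\alpha +' 1$.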

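We state the simplest properties of ordinal addition in the following theorem, but only
prove a couple of representative parts of it. Weakly increasing means that $a < b$ implies
that $\mathbf F(a) \le \mathbf F(b)$. An ordinal $\alpha$ is infinite iff $\omega \le \alpha$.
Theorem 4.22. (i) If $m, n \in \omega$, then $m + n \in \omega$.
(ii) For any ordinal $\alpha$, the class function $\mathbf F$ which takes each ordinal $\beta$ to $\alpha + \beta$ is a
normal function.
(iii) For any ordinal $\beta$, the class function $\mathbf F$ which takes each ordinal $\alpha$ to $\alpha + \beta$ is
weakly increasing. (It need not be continuous: $\bigcup_{n<\omega}(n + 1) = \omega$, while $\omega + 1 \ne \omega$.)
(iv) $\alpha + (\beta + \gamma) = (\alpha + \beta) + \gamma$.
(v) $\beta \le \alpha + \beta$.
(vi) $0 + \alpha = \alpha$.
(vii) $\alpha \le \beta$ iff there is a $\delta$ such that $\alpha + \delta = \beta$.
(viii) $\alpha < \beta$ iff there is a $\delta > 0$ such that $\alpha + \delta = \beta$.
(ix) $\alpha$ is infinite iff $1 + \alpha = \alpha$.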
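Proof. (i) is easily proved by ordinary induction on $n$; this is the induction given by
the very definition of $\omega$.
We prove (iv) by fixing $\alpha$ and $\beta$ and proceeding by induction on $\gamma$. The case $\gamma = 0$ is
obvious. Assume that $\alpha + (\beta + \gamma) = (\alpha + \beta) + \gamma$. Then
$$\begin{aligned}
\alpha + (\beta + (\gamma + 1)) &= \alpha + ((\beta + \gamma) + 1) \\
&= (\alpha + (\beta + \gamma)) + 1 \\
&= ((\alpha + \beta) + \gamma) + 1 \\
&= (\alpha + \beta) + (\gamma + 1).
\end{aligned}$$
Finally, suppose that $\gamma$ is a limit ordinal and we know our result for all $\delta < \gamma$. Let $\mathbf F, \mathbf G, \mathbf H$
be the ordinal class functions such that, for any ordinal $\delta$,
$$\mathbf F(\delta) = \alpha + \delta; \qquad \mathbf G(\delta) = (\alpha + \beta) + \delta; \qquad \mathbf H(\delta) = \beta + \delta.$$
Thus according to (ii), all three of these functions are normal. Hence, using 4.8,
$$\begin{aligned}
\alpha + (\beta + \gamma) &= \mathbf F(\mathbf H(\gamma)) \\
&= \bigcup_{\delta<\gamma} \mathbf F(\mathbf H(\delta)) \\
&= \bigcup_{\delta<\gamma} (\alpha + (\beta + \delta)) \\
&= \bigcup_{\delta<\gamma} ((\alpha + \beta) + \delta) \\
&= \bigcup_{\delta<\gamma} \mathbf G(\delta) \\
&= \mathbf G(\gamma) \\
&= (\alpha + \beta) + \gamma.
\end{aligned}$$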

Note, with reference to 4.22(i), that addition, restricted to natural numbers, is ordinary
addition familiar to the reader. Also note that ordinal addition is not commutative in
general. For example, $\omega = 1 + \omega \ne \omega + 1$.
There is an equivalent definition of ordinal addition which is more intuitive and direct:
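Theorem 4.23. For any ordinals $\alpha, \beta$ let
$$\alpha \oplus \beta = (\alpha \times \{0\}) \cup (\beta \times \{1\}).$$
We define a relation $\prec$ on $\alpha \oplus \beta$ as follows. For any $x, y \in \alpha \oplus \beta$, $x \prec y$ iff one of the
following three conditions holds:
(i) There are $\xi, \eta < \alpha$ such that $x = (\xi, 0)$, $y = (\eta, 0)$, and $\xi < \eta$.
(ii) There are $\xi, \eta < \beta$ such that $x = (\xi, 1)$, $y = (\eta, 1)$, and $\xi < \eta$.
(iii) There are $\xi < \alpha$ and $\eta < \beta$ such that $x = (\xi, 0)$ and $y = (\eta, 1)$.
Then $(\alpha \oplus \beta, \prec)$ is a well-order which is order-isomorphic to $\alpha + \beta$.
A simple picture helps to explain the construction in this theorem:

[Figure: a copy of $\alpha$ followed, to its right, by a copy of $\beta$.]

Thus a copy of $\beta$ is put to the right of a copy of $\alpha$. The purpose of the definition of $\alpha \oplus \beta$
is to make the copies of $\alpha$ and $\beta$ disjoint.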
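Proof. Clearly $\prec$ is a well-order. We show by transfinite induction on $\beta$, with $\alpha$
fixed, that $(\alpha \oplus \beta, \prec)$ is order isomorphic to $\alpha + \beta$. For $\beta = 0$ we have $\alpha + \beta = \alpha + 0 = \alpha$,
while $\alpha \oplus \beta = \alpha \oplus 0 = \alpha \times \{0\}$. Clearly $\xi \mapsto (\xi, 0)$ defines an order-isomorphism from $\alpha$
onto $(\alpha \times \{0\}, \prec)$. So our result holds for $\beta = 0$. Assume it for $\beta$, and suppose that $f$ is
an order-isomorphism from $\alpha + \beta$ onto $(\alpha \oplus \beta, \prec)$. Now the last element of $\alpha \oplus (\beta + 1)$ is
$(\beta, 1)$, and the last element of $\alpha + (\beta + 1)$ is $\alpha + \beta$, so the function
$$f \cup \{(\alpha + \beta, (\beta, 1))\}$$
is an order-isomorphism from $\alpha + (\beta + 1)$ onto $\alpha \oplus (\beta + 1)$.
Now assume that $\beta$ is a limit ordinal, and for each $\gamma < \beta$, the ordinal $\alpha + \gamma$ is
isomorphic to $\alpha \oplus \gamma$. For each such $\gamma$ let $f_\gamma$ be the unique isomorphism from $\alpha + \gamma$ onto
$\alpha \oplus \gamma$. Note that if $\gamma < \delta < \beta$, then $f_\delta \upharpoonright (\alpha + \gamma)$ is an isomorphism from $\alpha + \gamma$ onto $\alpha \oplus \gamma$; hence
$f_\delta \upharpoonright (\alpha + \gamma) = f_\gamma$. It follows that
$$\bigcup_{\gamma<\beta} f_\gamma$$
is an isomorphism from $\alpha + \beta$ onto $\alpha \oplus \beta$, finishing the inductive proof.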

Ordinal multiplication
We define ordinal multiplication by recursion:
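$$\alpha \cdot 0 = 0;$$
$$\alpha \cdot (\beta + 1) = \alpha \cdot \beta + \alpha;$$
$$\alpha \cdot \beta = \bigcup_{\gamma<\beta} (\alpha \cdot \gamma) \quad \text{for } \beta \text{ limit}.$$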

Here are some basic properties of ordinal multiplication:


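Theorem 4.24. (i) If $m, n \in \omega$, then $m \cdot n \in \omega$.
(ii) If $\alpha \ne 0$, then $\alpha \cdot \beta < \alpha \cdot (\beta + 1)$;
(iii) If $\alpha \ne 0$, then the class function assigning to each ordinal $\beta$ the product $\alpha \cdot \beta$ is
normal.
(iv) $0 \cdot \alpha = 0$;
(v) If $\alpha, \beta \ne 0$, then $\alpha \cdot \beta \ne 0$;
(vi) $\alpha \cdot (\beta + \gamma) = (\alpha \cdot \beta) + (\alpha \cdot \gamma)$;
(vii) $\alpha \cdot (\beta \cdot \gamma) = (\alpha \cdot \beta) \cdot \gamma$;
(viii) If $\alpha \ne 0$, then $\beta \le \alpha \cdot \beta$;
(ix) If $\alpha < \beta$ then $\alpha \cdot \gamma \le \beta \cdot \gamma$;
(x) $\alpha \cdot 1 = 1 \cdot \alpha = \alpha$.
(xi) $\alpha \cdot 2 = \alpha + \alpha$.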
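Proof. We only give the proof of (vi). Fix $\alpha$ and $\beta$. By (iv) we may assume that
$\alpha \ne 0$; we then proceed by induction on $\gamma$. We define some ordinal class functions $\mathbf F, \mathbf F', \mathbf G$:
for any $\gamma$, $\mathbf F(\gamma) = \beta + \gamma$; $\mathbf F'(\gamma) = (\alpha \cdot \beta) + \gamma$; $\mathbf G(\gamma) = \alpha \cdot \gamma$. These are normal functions.
First of all,
$$\alpha \cdot (\beta + 0) = \alpha \cdot \beta = (\alpha \cdot \beta) + 0 = (\alpha \cdot \beta) + (\alpha \cdot 0),$$
so (vi) holds for $\gamma = 0$. Now assume that (vi) holds for $\gamma$. Then
$$\begin{aligned}
\alpha \cdot (\beta + (\gamma + 1)) &= \alpha \cdot ((\beta + \gamma) + 1) \\
&= \alpha \cdot (\beta + \gamma) + \alpha \\
&= (\alpha \cdot \beta) + (\alpha \cdot \gamma) + \alpha \\
&= (\alpha \cdot \beta) + (\alpha \cdot (\gamma + 1)),
\end{aligned}$$
as desired.
Finally, suppose that $\gamma$ is a limit ordinal and we know (vi) for all $\delta < \gamma$. Then
$$\begin{aligned}
\alpha \cdot (\beta + \gamma) &= \mathbf G(\mathbf F(\gamma)) \\
&= (\mathbf G \circ \mathbf F)(\gamma) \\
&= \bigcup_{\delta<\gamma} (\mathbf G \circ \mathbf F)(\delta) \\
&= \bigcup_{\delta<\gamma} (\alpha \cdot (\beta + \delta)) \\
&= \bigcup_{\delta<\gamma} ((\alpha \cdot \beta) + (\alpha \cdot \delta)) \\
&= \bigcup_{\delta<\gamma} \mathbf F'(\mathbf G(\delta)) \\
&= \bigcup_{\delta<\gamma} (\mathbf F' \circ \mathbf G)(\delta) \\
&= (\mathbf F' \circ \mathbf G)(\gamma) \\
&= (\alpha \cdot \beta) + (\alpha \cdot \gamma),
\end{aligned}$$
as desired. This completes the proof of (vi).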
The following is a generalization of the division algorithm for natural numbers. That
algorithm is very useful for the arithmetic of natural numbers, and the ordinal version is
a basic result for more advanced arithmetic of ordinals.
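Theorem 4.25. (Division algorithm) Suppose that $\alpha$ and $\beta$ are ordinals, with $\beta \ne 0$.
Then there are unique ordinals $\xi, \eta$ such that $\alpha = \beta \cdot \xi + \eta$ with $\eta < \beta$.
Proof. First we prove the existence. Note that $\alpha < \alpha + 1 \le \beta \cdot (\alpha + 1)$. Thus there
is an ordinal number $\rho$ such that $\alpha < \beta \cdot \rho$; take the least such $\rho$. Obviously $\rho \ne 0$. If $\rho$ is
a limit ordinal, then because $\beta \cdot \rho = \bigcup_{\sigma<\rho} (\beta \cdot \sigma)$, it follows that there is a $\sigma < \rho$ such that
$\alpha < \beta \cdot \sigma$, contradicting the minimality of $\rho$. Thus $\rho$ is a successor ordinal $\xi + 1$. By the
definition of $\rho$ we have $\beta \cdot \xi \le \alpha$. Hence there is an ordinal $\eta$ such that $\beta \cdot \xi + \eta = \alpha$. We
claim that $\eta < \beta$. Otherwise, $\alpha = \beta \cdot \xi + \eta \ge \beta \cdot \xi + \beta = \beta \cdot (\xi + 1) = \beta \cdot \rho$, contradicting
the definition of $\rho$. This finishes the proof of existence.
For uniqueness, suppose that also $\alpha = \beta \cdot \xi' + \eta'$ with $\eta' < \beta$. Suppose that $\xi \ne \xi'$.
By symmetry, say $\xi < \xi'$. Then
$$\alpha = \beta \cdot \xi + \eta < \beta \cdot \xi + \beta = \beta \cdot (\xi + 1) \le \beta \cdot \xi' \le \beta \cdot \xi' + \eta' = \alpha,$$
contradiction. Hence $\xi = \xi'$. Hence also $\eta = \eta'$.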
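As a worked illustration of the division algorithm (our own example, not taken from the text), take the dividend $\omega^2 + \omega\cdot 3 + 5$ and divide it first by $\omega$ and then by $2$. The computation uses left distributivity 4.24(vi), associativity 4.24(vii), and the fact that $2 \cdot \omega = \omega$.

```latex
% Dividing \omega^2 + \omega\cdot 3 + 5 by \omega: quotient \omega + 3, remainder 5 < \omega.
\omega \cdot (\omega + 3) + 5 = \omega\cdot\omega + \omega\cdot 3 + 5 = \omega^2 + \omega\cdot 3 + 5.

% Dividing the same ordinal by 2: quotient \omega^2 + \omega\cdot 3 + 2, remainder 1 < 2.
% Here 2\cdot\omega^2 = (2\cdot\omega)\cdot\omega = \omega\cdot\omega = \omega^2
% and 2\cdot(\omega\cdot 3) = (2\cdot\omega)\cdot 3 = \omega\cdot 3.
2 \cdot (\omega^2 + \omega\cdot 3 + 2) + 1
  = 2\cdot\omega^2 + 2\cdot(\omega\cdot 3) + 2\cdot 2 + 1
  = \omega^2 + \omega\cdot 3 + 4 + 1
  = \omega^2 + \omega\cdot 3 + 5.
```

The second computation shows that the quotient can be infinite even when the divisor is finite, because the divisor multiplies on the left.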
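We now give, similarly to the case of ordinal addition, an equivalent definition of ordinal
multiplication which is somewhat more intuitive than the definition above. Given ordinals
$\alpha, \beta$, we define the following relation $\prec$ on $\alpha \times \beta$:
$$(\xi, \eta) \prec (\xi', \eta') \quad\text{iff}\quad (\xi, \eta) \text{ and } (\xi', \eta') \text{ are in } \alpha \times \beta \text{ and: } \eta < \eta', \text{ or } (\eta = \eta' \text{ and } \xi < \xi').$$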

We may say that this is the anti-dictionary or anti-lexicographic order.


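Theorem 4.26. For any two ordinals $\alpha, \beta$, the set $\alpha \times \beta$ under the anti-lexicographic
order is a well-ordering which is isomorphic to $\alpha \cdot \beta$.
Proof. We may assume that $\alpha \ne 0$. It is straightforward to check that $\prec$ is a
well-order.
Now we define, for any $(\xi, \eta) \in \alpha \times \beta$,
$$f(\xi, \eta) = \alpha \cdot \eta + \xi.$$
We claim that $f$ is the desired order-isomorphism from $\alpha \times \beta$ onto $\alpha \cdot \beta$. If $(\xi, \eta) \in \alpha \times \beta$,
then
$$f(\xi, \eta) = \alpha \cdot \eta + \xi < \alpha \cdot \eta + \alpha = \alpha \cdot (\eta + 1) \le \alpha \cdot \beta.$$
Thus $f$ maps into $\alpha \cdot \beta$.
To show that $f$ is one-one, suppose that $(\xi, \eta), (\xi', \eta') \in \alpha \times \beta$ and $f(\xi, \eta) = f(\xi', \eta')$.
Then by Theorem 4.25, $(\xi, \eta) = (\xi', \eta')$. So $f$ is one-one.
To show that $f$ maps onto $\alpha \cdot \beta$, let $\gamma < \alpha \cdot \beta$. Choose $\xi$ and $\eta$ so that $\gamma = \alpha \cdot \eta + \xi$
with $\xi < \alpha$. Now $\eta < \beta$, as otherwise
$$\gamma = \alpha \cdot \eta + \xi \ge \alpha \cdot \eta \ge \alpha \cdot \beta.$$
It follows that $f(\xi, \eta) = (\alpha \cdot \eta) + \xi = \gamma$. So $f$ is onto.
Finally, we show that the order is preserved. Suppose that $(\xi, \eta) \prec (\xi', \eta')$. Then one
of these cases holds:
Case 1. $\eta < \eta'$. Then
$$f(\xi, \eta) = \alpha \cdot \eta + \xi < \alpha \cdot \eta + \alpha = \alpha \cdot (\eta + 1) \le \alpha \cdot \eta' \le \alpha \cdot \eta' + \xi' = f(\xi', \eta'),$$
as desired.
Case 2. $\eta = \eta'$ and $\xi < \xi'$. Then $f(\xi, \eta) < f(\xi', \eta')$.
Now it follows that $f$ is the desired isomorphism.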
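For finite ordinals the map $f(\xi, \eta) = \alpha \cdot \eta + \xi$ can be checked exhaustively. This is only a finite sanity check under the assumption that $\alpha$ and $\beta$ are natural numbers; the helper names are ours.

```python
from itertools import product

def anti_lex_less(p, q):
    """The anti-lexicographic order: compare second coordinates first."""
    (xi1, eta1), (xi2, eta2) = p, q
    return eta1 < eta2 or (eta1 == eta2 and xi1 < xi2)

def f(alpha, pair):
    """The map (xi, eta) -> alpha * eta + xi from the proof of Theorem 4.26."""
    xi, eta = pair
    return alpha * eta + xi

alpha, beta = 3, 4
pairs = list(product(range(alpha), range(beta)))

# f is a bijection from alpha x beta onto alpha * beta
onto = sorted(f(alpha, p) for p in pairs) == list(range(alpha * beta))
# f preserves and reflects the order
preserved = all(anti_lex_less(p, q) == (f(alpha, p) < f(alpha, q))
                for p in pairs for q in pairs)
print(onto, preserved)  # True True
```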
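Ordinal exponentiation
We define exponentiation of ordinals by recursion:
$$\alpha^0 = 1;$$
$$\alpha^{\beta+1} = \alpha^\beta \cdot \alpha;$$
$$\alpha^\beta = \bigcup_{\gamma<\beta} \alpha^\gamma \quad \text{for } \beta \text{ limit}.$$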

Now we give the simplest properties of exponentiation.


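Theorem 4.27. (i) If $m, n \in \omega$, then $m^n \in \omega$.
(ii) $0^0 = 1$;
(iii) $0^{\beta+1} = 0$;
(iv) $0^\beta = 1$ for $\beta$ a limit ordinal;
(v) $1^\beta = 1$;
(vi) If $\alpha \ne 0$, then $\alpha^\beta \ne 0$;
(vii) If $\alpha > 1$ then $\alpha^\beta < \alpha^{\beta+1}$;
(viii) If $\alpha > 1$, then the ordinal class function assigning to each ordinal $\beta$ the value
$\alpha^\beta$ is normal;
(ix) If $\alpha > 1$, then $\beta \le \alpha^\beta$;
(x) If $0 < \alpha < \beta$, then $\alpha^\gamma \le \beta^\gamma$;
(xi) For $\alpha \ne 0$, $\alpha^{\beta+\gamma} = \alpha^\beta \cdot \alpha^\gamma$;
(xii) For $\alpha \ne 0$, $(\alpha^\beta)^\gamma = \alpha^{\beta\cdot\gamma}$.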
The following is another kind of division algorithm for ordinals.
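Theorem 4.28. (Extended division algorithm) Let $\alpha$ and $\beta$ be ordinals, with $\alpha \ne 0$
and $1 < \beta$. Then there exist unique ordinals $\gamma, \delta, \varepsilon$ such that the following conditions hold:
(i) $\alpha = \beta^\gamma \cdot \delta + \varepsilon$.
(ii) $\gamma \le \alpha$.
(iii) $0 < \delta < \beta$,
(iv) $\varepsilon < \beta^\gamma$.
Proof. We have $\alpha < \alpha + 1 \le \beta^{\alpha+1}$; so there is an ordinal $\rho$ such that $\alpha < \beta^\rho$. We
take the least such $\rho$. Clearly $\rho$ is a successor ordinal $\gamma + 1$. So we have $\beta^\gamma \le \alpha < \beta^{\gamma+1}$.
Now $\beta^\gamma \ne 0$, since $\beta > 1$. Hence by the division algorithm there are ordinals $\delta, \varepsilon$ such that
$\alpha = \beta^\gamma \cdot \delta + \varepsilon$, with $\varepsilon < \beta^\gamma$. Now $\delta < \beta$; for if $\beta \le \delta$, then
$$\alpha = \beta^\gamma \cdot \delta + \varepsilon \ge \beta^\gamma \cdot \beta = \beta^{\gamma+1} > \alpha,$$
contradiction. We have $\delta \ne 0$, as otherwise $\alpha = \varepsilon < \beta^\gamma$, contradiction. Also, $\gamma \le \alpha$, since
$\alpha = \beta^\gamma \cdot \delta + \varepsilon \ge \beta^\gamma \ge \gamma$.
This proves the existence of $\gamma, \delta, \varepsilon$ as called for in the theorem.
Suppose that $\gamma', \delta', \varepsilon'$ also satisfy the indicated conditions; thus
(1) $\alpha = \beta^{\gamma'} \cdot \delta' + \varepsilon'$,
(2) $\gamma' \le \alpha$,
(3) $0 < \delta' < \beta$,
(4) $\varepsilon' < \beta^{\gamma'}$.
Suppose that $\gamma \ne \gamma'$; by symmetry, say that $\gamma < \gamma'$. Then
$$\alpha = \beta^\gamma \cdot \delta + \varepsilon < \beta^\gamma \cdot \delta + \beta^\gamma = \beta^\gamma \cdot (\delta + 1) \le \beta^\gamma \cdot \beta = \beta^{\gamma+1} \le \beta^{\gamma'} \le \alpha,$$
contradiction. Hence $\gamma = \gamma'$. Hence by the ordinary division algorithm we also have $\delta = \delta'$
and $\varepsilon = \varepsilon'$.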

We can obtain an interesting normal form for ordinals by re-applying 4.28 to the remainder $\varepsilon$ over and over again. That is the purpose of the following definitions and results.
This generalizes the ordinary decimal and binary systems of notation, by taking $\beta = 10$ or
$\beta = 2$ and restricting to natural numbers. For infinite ordinals it is useful to take $\beta = \omega$;
this gives the Cantor normal form.
To abbreviate some long expressions, we let $N(\beta, m, \gamma, \delta)$ stand for the following statement:
$\beta$ is an ordinal $> 1$, $m$ is a positive integer, $\gamma$ and $\delta$ are sequences of ordinals each of
length $m$, and:
(1) $\gamma(0) > \gamma(1) > \cdots > \gamma(m - 1)$;
(2) $0 < \delta(i) < \beta$ for each $i < m$.
If $N(\beta, m, \gamma, \delta)$, then we define
$$k(\beta, m, \gamma, \delta) = \beta^{\gamma(0)} \cdot \delta(0) + \beta^{\gamma(1)} \cdot \delta(1) + \cdots + \beta^{\gamma(m-1)} \cdot \delta(m - 1).$$
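Lemma 4.29. Assume that $N(\beta, m, \gamma, \delta)$ and $N(\beta, n, \gamma', \delta')$. Then
(i) $k(\beta, m, \gamma, \delta) \ge \beta^{\gamma(0)}$.
(ii) $k(\beta, m, \gamma, \delta) < \beta^{\gamma(0)} \cdot (\delta(0) + 1) \le \beta^{\gamma(0)+1}$.
(iii) If $\gamma(0) \ne \gamma'(0)$, then $k(\beta, m, \gamma, \delta) < k(\beta, n, \gamma', \delta')$ iff $\gamma(0) < \gamma'(0)$.
(iv) If $\gamma(0) = \gamma'(0)$ and $\delta(0) \ne \delta'(0)$, then $k(\beta, m, \gamma, \delta) < k(\beta, n, \gamma', \delta')$ iff $\delta(0) < \delta'(0)$.
(v) If $\gamma(j) = \gamma'(j)$ and $\delta(j) = \delta'(j)$ for all $j < i$, while $\gamma(i) \ne \gamma'(i)$, then $k(\beta, m, \gamma, \delta) <
k(\beta, n, \gamma', \delta')$ iff $\gamma(i) < \gamma'(i)$.
(vi) If $\gamma(j) = \gamma'(j)$ and $\delta(j) = \delta'(j)$ for all $j < i$, while $\gamma(i) = \gamma'(i)$ and $\delta(i) \ne \delta'(i)$,
then $k(\beta, m, \gamma, \delta) < k(\beta, n, \gamma', \delta')$ iff $\delta(i) < \delta'(i)$.
(vii) If $\gamma \subseteq \gamma'$, $\delta \subseteq \delta'$, and $m < n$, then $k(\beta, m, \gamma, \delta) < k(\beta, n, \gamma', \delta')$.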
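Proof. (i): $k(\beta, m, \gamma, \delta) \ge \beta^{\gamma(0)} \cdot \delta(0) \ge \beta^{\gamma(0)}$.
(ii): We prove this by induction on $m$. It is clear for $m = 1$. Now assume that it holds
for $m - 1$, where $m > 1$. Then
$$\begin{aligned}
\beta^{\gamma(0)} \cdot \delta(0) + \beta^{\gamma(1)} \cdot \delta(1) + \cdots + \beta^{\gamma(m-1)} \cdot \delta(m - 1) &< \beta^{\gamma(0)} \cdot \delta(0) + \beta^{\gamma(1)+1} \\
&\le \beta^{\gamma(0)} \cdot \delta(0) + \beta^{\gamma(0)} \\
&= \beta^{\gamma(0)} \cdot (\delta(0) + 1) \\
&\le \beta^{\gamma(0)} \cdot \beta \\
&= \beta^{\gamma(0)+1},
\end{aligned}$$
finishing the inductive proof.
For (iii), assume the hypothesis, and suppose that $\gamma(0) < \gamma'(0)$. Then
$$\begin{aligned}
k(\beta, m, \gamma, \delta) &< \beta^{\gamma(0)} \cdot (\delta(0) + 1) \le \beta^{\gamma(0)+1} && \text{by (ii)} \\
&\le \beta^{\gamma'(0)} \\
&\le k(\beta, n, \gamma', \delta') && \text{by (i).}
\end{aligned}$$
By symmetry (iii) now follows.
For (iv), assume the hypothesis, and suppose that $\delta(0) < \delta'(0)$. Then
$$\begin{aligned}
k(\beta, m, \gamma, \delta) &< \beta^{\gamma(0)} \cdot (\delta(0) + 1) = \beta^{\gamma'(0)} \cdot (\delta(0) + 1) && \text{by (ii)} \\
&\le \beta^{\gamma'(0)} \cdot \delta'(0) \\
&\le k(\beta, n, \gamma', \delta').
\end{aligned}$$
By symmetry (iv) now follows.
(v) is clear from (iii), by deleting the first $i$ summands of the sums.
(vi) is clear from (iv), by deleting the first $i$ summands of the sums.
(vii) is clear.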
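Theorem 4.30. Let $\alpha$ and $\beta$ be ordinals, with $\alpha \ne 0$ and $1 < \beta$. Then there exist a
unique $m \in \omega$ and finite sequences $\langle \gamma(i) : i < m \rangle$ and $\langle \delta(i) : i < m \rangle$ of ordinals such that
the following conditions hold:
(i) $\alpha = \beta^{\gamma(0)} \cdot \delta(0) + \beta^{\gamma(1)} \cdot \delta(1) + \cdots + \beta^{\gamma(m-1)} \cdot \delta(m - 1)$.
(ii) $\gamma(0) > \gamma(1) > \cdots > \gamma(m - 1)$.
(iii) $0 < \delta(i) < \beta$ for each $i < m$.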
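Proof. For the existence, with $\beta > 1$ fixed we proceed by induction on $\alpha$. Assume
that the theorem holds for every $\alpha' < \alpha$ such that $\alpha' \ne 0$, and suppose that $\alpha \ne 0$. By
Theorem 4.28, let $\gamma, \delta, \varepsilon$ be such that
(1) $\alpha = \beta^\gamma \cdot \delta + \varepsilon$,
(2) $\gamma \le \alpha$,
(3) $0 < \delta < \beta$,
(4) $\varepsilon < \beta^\gamma$.
If $\varepsilon = 0$, then we can take our sequences to be $\langle \gamma(0) \rangle$ and $\langle \delta(0) \rangle$, with $\gamma(0) = \gamma$ and
$\delta(0) = \delta$. Now assume that $\varepsilon > 0$. Then
$$\varepsilon < \beta^\gamma \le \beta^\gamma \cdot \delta + \varepsilon = \alpha;$$
so $\varepsilon < \alpha$. Hence by the inductive assumption we can write
$$\varepsilon = \beta^{\gamma(0)} \cdot \delta(0) + \beta^{\gamma(1)} \cdot \delta(1) + \cdots + \beta^{\gamma(m-1)} \cdot \delta(m - 1)$$
with
(5) $\gamma(0) > \gamma(1) > \cdots > \gamma(m - 1)$.
(6) $0 < \delta(i) < \beta$ for each $i < m$.
Then our desired sequences for $\alpha$ are
$$\langle \gamma, \gamma(0), \gamma(1), \ldots, \gamma(m - 1) \rangle \quad\text{and}\quad \langle \delta, \delta(0), \delta(1), \ldots, \delta(m - 1) \rangle.$$
To prove this, we just need to show that $\gamma > \gamma(0)$. If $\gamma \le \gamma(0)$, then
$$\varepsilon \ge \beta^{\gamma(0)} \ge \beta^\gamma,$$
contradiction.
This finishes the existence part of the proof.
For the uniqueness, we use the notation introduced above, and proceed by induction
on $\alpha$. Suppose the uniqueness statement holds for all nonzero $\alpha' < \alpha$, and now we have
$N(\beta, m, \gamma, \delta)$, $N(\beta, n, \gamma', \delta')$, and
$$\alpha = k(\beta, m, \gamma, \delta) = k(\beta, n, \gamma', \delta').$$
We suppose that the uniqueness fails. Say $m \le n$. Then either $\gamma \subseteq \gamma'$, $\delta \subseteq \delta'$ and $m < n$, or
there is an $i < m$ such that $\gamma(i) \ne \gamma'(i)$ or $\delta(i) \ne \delta'(i)$; in the latter case we take the least
such $i$. In either case we have a contradiction of Lemma 4.29.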
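Restricted to natural numbers, the existence proof is the familiar computation of base-$\beta$ digits: apply Theorem 4.28 to get the leading term, then repeat on the remainder. The sketch below assumes natural-number inputs only; `normal_form` is a hypothetical helper name.

```python
def normal_form(alpha, beta):
    """Return [(gamma_0, delta_0), (gamma_1, delta_1), ...] with
    alpha = sum of beta**gamma_i * delta_i, the exponents strictly decreasing
    and 0 < delta_i < beta. Natural numbers only, alpha > 0, beta > 1."""
    if alpha <= 0 or beta <= 1:
        raise ValueError("need alpha > 0 and beta > 1")
    terms = []
    while alpha > 0:
        # largest gamma with beta**gamma <= alpha, as in the proof of Theorem 4.28
        gamma = 0
        while beta ** (gamma + 1) <= alpha:
            gamma += 1
        # ordinary division by beta**gamma gives the coefficient and the remainder
        delta, alpha = divmod(alpha, beta ** gamma)
        terms.append((gamma, delta))
    return terms

print(normal_form(2011, 10))  # [(3, 2), (1, 1), (0, 1)]
print(normal_form(6, 2))      # [(2, 1), (1, 1)]
```

Zero digits are simply omitted, which corresponds to the requirement $0 < \delta(i)$ in the theorem.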
Now we give an equivalent definition of exponentiation similar to those given above for
addition and multiplication. A set $X$ is finite iff there is a bijection from some natural
number $m$ onto $X$.
Theorem 4.31. Suppose that $\alpha$ and $\beta$ are ordinals, with $\beta \ne 0$. We define
$${}^{\alpha}\beta^{w} = \{f \in {}^{\alpha}\beta : \{\xi < \alpha : f(\xi) \ne 0\} \text{ is finite}\}.$$
For $f, g \in {}^{\alpha}\beta^{w}$ we write $f \prec g$ iff $f \ne g$ and $f(\xi) < g(\xi)$ for the greatest $\xi < \alpha$ for which
$f(\xi) \ne g(\xi)$.
Then $({}^{\alpha}\beta^{w}, \prec)$ is a well-order which is order-isomorphic to the ordinal exponent $\beta^\alpha$.
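Proof. If $\alpha = 0$, then $\beta^\alpha = 1$, and ${}^{\alpha}\beta^{w}$ also has only one element, the empty function
(= the empty set). So, assume that $\alpha \ne 0$. If $\beta = 1$, then ${}^{\alpha}\beta^{w}$ has only one member, namely
the function with domain $\alpha$ whose value is always $0$. This is clearly order-isomorphic to $1$,
as desired. So, suppose that $\beta > 1$.
Now we define a function $f$ mapping $\beta^\alpha$ into ${}^{\alpha}\beta^{w}$. Let $f(0)$ be the member of ${}^{\alpha}\beta^{w}$
which takes only the value $0$. Now suppose that $0 < \varepsilon < \beta^\alpha$. By 4.30 write
$$\varepsilon = \beta^{\gamma(0)} \cdot \delta(0) + \beta^{\gamma(1)} \cdot \delta(1) + \cdots + \beta^{\gamma(m-1)} \cdot \delta(m - 1),$$
where $\gamma(0) > \gamma(1) > \cdots > \gamma(m - 1)$ and $0 < \delta(i) < \beta$ for each $i < m$. Note that
$\beta^{\gamma(0)} \le \varepsilon < \beta^\alpha$, so $\gamma(0) < \alpha$. Then we define, for any $\zeta < \alpha$,
$$(f(\varepsilon))(\zeta) = \begin{cases} 0 & \text{if } \zeta \notin \{\gamma(0), \ldots, \gamma(m - 1)\}, \\ \delta(i) & \text{if } \zeta = \gamma(i) \text{ with } i < m. \end{cases}$$
Clearly $f(\varepsilon) \in {}^{\alpha}\beta^{w}$. To see that $f$ maps onto ${}^{\alpha}\beta^{w}$, suppose that $x \in {}^{\alpha}\beta^{w}$. If $x$ takes only
the value $0$, then $f(0) = x$. Suppose that $x$ takes on some nonzero value. Let
$$\{\zeta < \alpha : x(\zeta) \ne 0\} = \{\gamma(0), \gamma(1), \ldots, \gamma(m - 1)\},$$
where $\gamma(0) > \gamma(1) > \cdots > \gamma(m - 1)$. Let $\delta(i) = x(\gamma(i))$ for each $i < m$, and let
$$\varepsilon = \beta^{\gamma(0)} \cdot \delta(0) + \beta^{\gamma(1)} \cdot \delta(1) + \cdots + \beta^{\gamma(m-1)} \cdot \delta(m - 1).$$
Then $\varepsilon < \beta^{\gamma(0)+1} \le \beta^\alpha$ by 4.29(ii), and clearly $f(\varepsilon) = x$.
Now we complete the proof by showing that for any $\varepsilon, \theta < \beta^\alpha$, $\varepsilon < \theta$ iff $f(\varepsilon) \prec f(\theta)$.
This equivalence is clear if one of $\varepsilon, \theta$ is $0$, so suppose that both are nonzero. Write
$$\varepsilon = \beta^{\gamma(0)} \cdot \delta(0) + \beta^{\gamma(1)} \cdot \delta(1) + \cdots + \beta^{\gamma(m-1)} \cdot \delta(m - 1),$$
where $\gamma(0) > \gamma(1) > \cdots > \gamma(m - 1)$ and $0 < \delta(i) < \beta$ for each $i < m$, and
$$\theta = \beta^{\gamma'(0)} \cdot \delta'(0) + \beta^{\gamma'(1)} \cdot \delta'(1) + \cdots + \beta^{\gamma'(n-1)} \cdot \delta'(n - 1),$$
where $\gamma'(0) > \gamma'(1) > \cdots > \gamma'(n - 1)$ and $0 < \delta'(i) < \beta$ for each $i < n$.
By symmetry we may suppose that $m \le n$. Note that $N(\beta, m, \gamma, \delta)$, $k(\beta, m, \gamma, \delta) = \varepsilon$,
$N(\beta, n, \gamma', \delta')$, and $k(\beta, n, \gamma', \delta') = \theta$. We now consider several possibilities.
Case 1. $\varepsilon = \theta$. Then clearly $f(\varepsilon) = f(\theta)$.
Case 2. $\gamma \subseteq \gamma'$, $\delta \subseteq \delta'$, and $m < n$. Thus $\varepsilon < \theta$. Also, $\gamma'(m)$ is the largest $\zeta < \alpha$ such
that $(f(\varepsilon))(\zeta) \ne (f(\theta))(\zeta)$, and $(f(\varepsilon))(\gamma'(m)) = 0 < \delta'(m) = (f(\theta))(\gamma'(m))$, so $f(\varepsilon) \prec f(\theta)$.
Case 3. There is an $i < m$ such that $\gamma(j) = \gamma'(j)$ and $\delta(j) = \delta'(j)$ for all $j < i$, while
$\gamma(i) \ne \gamma'(i)$. By symmetry, say that $\gamma(i) < \gamma'(i)$. Then we have $\varepsilon < \theta$. Since $\gamma'(i)$ is the
largest $\zeta < \alpha$ such that $(f(\varepsilon))(\zeta) \ne (f(\theta))(\zeta)$, and $(f(\varepsilon))(\gamma'(i)) = 0 < \delta'(i) = (f(\theta))(\gamma'(i))$,
we also have $f(\varepsilon) \prec f(\theta)$.
Case 4. There is an $i < m$ such that $\gamma(j) = \gamma'(j)$ and $\delta(j) = \delta'(j)$ for all $j < i$, while
$\gamma(i) = \gamma'(i)$ and $\delta(i) \ne \delta'(i)$. By symmetry, say that $\delta(i) < \delta'(i)$. Then we have $\varepsilon < \theta$.
Since $\gamma(i)$ is the largest $\zeta < \alpha$ such that $(f(\varepsilon))(\zeta) \ne (f(\theta))(\zeta)$, and $(f(\varepsilon))(\gamma(i)) = \delta(i) <
\delta'(i) = (f(\theta))(\gamma(i))$, we also have $f(\varepsilon) \prec f(\theta)$.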
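For finite $\alpha$ and $\beta$ every function from $\alpha$ to $\beta$ has finite support, and the order of Theorem 4.31 is the comparison of base-$\beta$ numerals by their most significant differing digit. The following sketch checks this exhaustively in one small case; it assumes finite ordinals only, and `less` and `value` are hypothetical names (`value` is the inverse of the map $f$ used in the proof).

```python
from itertools import product

def less(f, g):
    """f precedes g iff they differ and f is smaller at the greatest argument where they differ."""
    differing = [i for i in range(len(f)) if f[i] != g[i]]
    return bool(differing) and f[max(differing)] < g[max(differing)]

def value(f, beta):
    """The number beta**0 * f(0) + beta**1 * f(1) + ... coded by the function f."""
    return sum(digit * beta ** i for i, digit in enumerate(f))

alpha, beta = 3, 2
functions = list(product(range(beta), repeat=alpha))  # all maps from alpha to beta, as tuples

bijective = sorted(value(f, beta) for f in functions) == list(range(beta ** alpha))
order_matches = all(less(f, g) == (value(f, beta) < value(g, beta))
                    for f in functions for g in functions)
print(bijective, order_matches)  # True True
```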
We finish this chapter with two important characterizations of absorption properties of
ordinals.
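Lemma 4.32. If $\alpha < \omega^\beta$, then $\alpha + \omega^\beta = \omega^\beta$.
Proof. First we prove
(1) If $\gamma < \beta$, then $\omega^\gamma + \omega^\beta = \omega^\beta$.
In fact, suppose that $\gamma < \beta$. Then there is a nonzero $\delta$ such that $\gamma + \delta = \beta$. Then
$$\omega^\gamma + \omega^\beta = \omega^\gamma + \omega^{\gamma+\delta} = \omega^\gamma + \omega^\gamma \cdot \omega^\delta = \omega^\gamma \cdot (1 + \omega^\delta) = \omega^\gamma \cdot \omega^\delta = \omega^\beta.$$
By an easy ordinary induction, we obtain from (1)
(2) If $\gamma < \beta$ and $m \in \omega$, then $\omega^\gamma \cdot m + \omega^\beta = \omega^\beta$.
Now we turn to the general case. If $\beta = 0$ or $\alpha < \omega$, the desired conclusion is clear. So
suppose that $\omega \le \alpha$ and $\beta > 0$. Then we can write $\alpha = \omega^\gamma \cdot m + \delta$ with $m \in \omega$ and $\delta < \omega^\gamma$;
note that $\gamma < \beta$, since $\omega^\gamma \le \alpha < \omega^\beta$. Then, using 4.22(v) and (2),
$$\omega^\beta \le \alpha + \omega^\beta = \omega^\gamma \cdot m + \delta + \omega^\beta \le \omega^\gamma \cdot (m + 1) + \omega^\beta = \omega^\beta.$$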
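Lemma 4.32 is what makes addition of ordinals written in Cantor normal form mechanical: the leading term of the right summand absorbs every smaller term of the left summand. The sketch below is restricted, as an assumption of ours, to ordinals below $\omega^\omega$, represented as lists of (exponent, coefficient) pairs with strictly decreasing natural-number exponents and positive coefficients; the representation and the name `add` are not from the text.

```python
def add(a, b):
    """Sum of two ordinals below omega**omega given in Cantor normal form."""
    if not b:                 # b = 0
        return list(a)
    e, c = b[0]               # leading term omega**e * c of b
    # terms of a with exponent > e survive; smaller ones are absorbed (Lemma 4.32)
    kept = [(x, k) for (x, k) in a if x > e]
    # a term of a with the same exponent merges its coefficient with c
    same = sum(k for (x, k) in a if x == e)
    return kept + [(e, same + c)] + list(b[1:])

omega = [(1, 1)]
one = [(0, 1)]
print(add(one, omega))   # [(1, 1)]          1 + omega = omega
print(add(omega, one))   # [(1, 1), (0, 1)]  omega + 1
print(add([(2, 1), (1, 3), (0, 5)], [(1, 2)]))  # [(2, 1), (1, 5)]
```

The first two lines of output illustrate the failure of commutativity noted after Theorem 4.22.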

Now we have the following characterization of absorption under addition:


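Theorem 4.33. The following conditions are equivalent, for any ordinal $\alpha$:
(i) $\beta + \alpha = \alpha$ for all $\beta < \alpha$. (Absorption under addition)
(ii) For all $\beta, \gamma < \alpha$, also $\beta + \gamma < \alpha$.
(iii) $\alpha = 0$, or $\alpha = \omega^\beta$ for some $\beta$.
Proof. (i)$\Rightarrow$(ii): Assuming (i), if $\beta, \gamma < \alpha$, then $\beta + \gamma < \beta + \alpha = \alpha$.
(ii)$\Rightarrow$(iii): Assume (ii). If $\alpha = 0$ or $\alpha = 1$, condition (iii) holds, so suppose that $2 \le \alpha$.
Then clearly (ii) implies that $\alpha \ge \omega$. Choose $\beta, m, \gamma$ such that $m \in \omega$, $\alpha = \omega^\beta \cdot m + \gamma$,
and $\gamma < \omega^\beta$. If $\gamma \ne 0$, then $\omega^\beta \cdot m < \omega^\beta \cdot m + \gamma = \alpha$, and also $\gamma < \omega^\beta \le \alpha$, so that (ii) is
contradicted. So $\gamma = 0$. If $m > 1$, write $m = n + 1$ with $n \ne 0$. Then
$$\alpha = \omega^\beta \cdot m = \omega^\beta \cdot (n + 1) = \omega^\beta \cdot n + \omega^\beta,$$
and $\omega^\beta \cdot n, \omega^\beta < \alpha$, again contradicting (ii). Hence $m = 1$, as desired in (iii).
Finally, (iii)$\Rightarrow$(i) by Lemma 4.32.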
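Theorem 4.34. For any ordinal $\alpha$ the following conditions are equivalent:
(i) For all $\beta$, if $0 < \beta < \alpha$ then $\beta \cdot \alpha = \alpha$.
(ii) For all $\beta, \gamma < \alpha$, also $\beta \cdot \gamma < \alpha$.
(iii) $\alpha \in \{0, 1, 2\}$ or there is a $\beta$ such that $\alpha = \omega^{(\omega^\beta)}$.
Proof. (i)$\Rightarrow$(ii): Assume (i), and suppose that $\beta, \gamma < \alpha$. If $\beta = 0$, then $\beta \cdot \gamma = 0 < \alpha$.
If $\beta \ne 0$, then $\beta \cdot \gamma < \beta \cdot \alpha = \alpha$.
(ii)$\Rightarrow$(iii): Assume (ii), and suppose that $\alpha \notin \{0, 1, 2\}$. Clearly then $\omega \le \alpha$. Now if
$\beta, \gamma < \alpha$, then $\beta + \gamma < \alpha$. In fact, if $\beta \le \gamma$, then $\beta + \gamma \le \gamma + \gamma = \gamma \cdot 2 < \alpha$ by (ii); and if
$\gamma < \beta$ then $\beta + \gamma < \beta + \beta = \beta \cdot 2 < \alpha$. Hence by 4.33 there is a $\gamma$ such that $\alpha = \omega^\gamma$. Now
if $\delta, \varepsilon < \gamma$, then $\omega^\delta, \omega^\varepsilon < \omega^\gamma = \alpha$, and hence $\omega^{\delta+\varepsilon} = \omega^\delta \cdot \omega^\varepsilon < \alpha = \omega^\gamma$, so that $\delta + \varepsilon < \gamma$.
Hence by 4.33, $\gamma = \omega^\beta$ for some $\beta$.
(iii)$\Rightarrow$(i): Assume (iii). Clearly 0, 1, 2 satisfy (i), so assume that $\alpha = \omega^{(\omega^\beta)}$. Take any
$\gamma < \alpha$ with $\gamma \ne 0$. If $\gamma < \omega$, clearly $\gamma \cdot \alpha = \alpha$. So assume that $\omega \le \gamma$. Write $\gamma = \omega^\delta \cdot m + \varepsilon$
with $m \in \omega$ and $\varepsilon < \omega^\delta$. Then $\delta < \omega^\beta$, and so
$$\begin{aligned}
\alpha = \omega^{(\omega^\beta)} &\le \gamma \cdot \omega^{(\omega^\beta)} \\
&= (\omega^\delta \cdot m + \varepsilon) \cdot \omega^{(\omega^\beta)} \\
&\le (\omega^\delta \cdot m + \omega^\delta) \cdot \omega^{(\omega^\beta)} \\
&= \omega^\delta \cdot (m + 1) \cdot \omega^{(\omega^\beta)} \\
&\le \omega^{\delta+1} \cdot \omega^{(\omega^\beta)} \\
&= \omega^{\delta+1+\omega^\beta} \\
&= \omega^{(\omega^\beta)} \\
&= \alpha.
\end{aligned}$$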

EXERCISES
E4.1. Prove that the following conditions are equivalent:
(i) $x$ is an ordinal;
(ii) $x$ is transitive, and for all $y, z \in x$, either $y \in z$, $y = z$, or $z \in y$;
(iii) $x$ is transitive, and for all $y$, if $y \subset x$ and $y$ is transitive, then $y \in x$;
(iv) for all $y \in x$, either $y \cup \{y\} = x$ or $y \cup \{y\} \in x$; and for all $y \subseteq x$, either $\bigcup y = x$ or $\bigcup y \in x$;
(v) $x$ is transitive and $\{(y, z) \in x \times x : y \in z\}$ well-orders $x$.
Hint: prove this in the following order: (i)$\Rightarrow$(v) [easy]; (v)$\Rightarrow$(ii) [obvious]; (ii)$\Rightarrow$(iii) [Assume that $y \subset x$, $y$ transitive; apply the foundation axiom to $x \setminus y$]; (iii)$\Rightarrow$(i) [Let $y = \{z \in x : z \text{ is an ordinal}\}$, and get a contradiction from assuming that $y \subset x$]; (i)$\Rightarrow$(iv) [easy]; (iv)$\Rightarrow$(i) [Let $\alpha$ be an ordinal not in $x$, and take the least ordinal $\beta \in \alpha \cup \{\alpha\}$ such that $\beta \notin x$. Work with $\beta$ to show that $x$ is an ordinal.]
E4.2. Show that $(\omega \cdot 2)^2 \ne \omega^2 \cdot 2^2$.
E4.3. Use the transfinite recursion principle to justify the definition of ordinal addition.
E4.4. Use the transfinite recursion principle to justify the definition of ordinal multiplication.
E4.5. Use the transfinite recursion principle to justify the definition of ordinal exponentiation.
E4.6. Formulate and prove a set version of the transfinite recursion principle.
E4.7. Suppose that , , are ordinal numbers with < . Prove that + + =
+ .
E4.8. Show that for every nonzero ordinal there are only finitely many ordinals such
that = for some .
E4.9. Prove that n(

= (

for every natural number n > 1.

E4.10. Assume that 6= 0, 0 < m < , and < . Prove that + m = m.


E4.11. Suppose that 0 < k and 6= 0. Then k = .
E4.12. Suppose that 0 < k , 6= 0, and 0 < m . Then k m = m.
E4.13. Suppose that = k + with 6= 0, k , and < , < , and m \1.
Show that ( + ) m = m + .
E4.14. Suppose that = k + with 6= 0, k , and < , and < . Show that
( + ) = .
E4.15. Show that ( + ) + for any ordinals , , .Hint: write and
using Theorem 4.28 and using Theorem 4.25.
E4.16. Show that the following conditions are equivalent for any ordinals , :
(i) + = + .

(ii) There exist an ordinal and natural numbers k, l such that = k and = l.
References
Sierpiński, W. Cardinal and ordinal numbers. Pań. Wyd. Naukowe, 1958, 487 pp.

5. The axiom of choice


We give a small number of equivalent forms of the axiom of choice; these forms should be sufficient for most mathematical purposes. The axiom of choice has been studied extensively, and we give some references for this after proving the main theorem of this chapter.
The set of axioms of ZFC with the axiom of choice removed is denoted by ZF; so we work in ZF in this chapter.
The two main equivalents to the axiom of choice are as follows.
Zorn's Lemma. If $(A, <)$ is a partial ordering such that $A \ne \emptyset$ and every subset of $A$ simply ordered by $<$ has an upper bound, then $A$ has a maximal element under $<$, i.e., an element $a$ such that there is no element $b \in A$ such that $a < b$.
Well-ordering principle. For every set $A$ there is a well-ordering of $A$, i.e., there is a relation $<$ such that $(A, <)$ is a well-ordering.
In addition, the following principle, usually called the axiom of choice, is equivalent to the actual form that we have chosen:
Choice-function principle. If $A$ is a family of nonempty sets, then there is a function $f$ with domain $A$ such that $f(a) \in a$ for every $a \in A$. Such a function $f$ is called a choice function for $A$.
Theorem 5.1. In ZF the following four statements are equivalent:
(i) the axiom of choice;
(ii) the choice-function principle;
(iii) Zorn's lemma;
(iv) the well-ordering principle.
Proof. Axiom of choice $\Rightarrow$ choice-function principle. Assume the axiom of choice, and let $A$ be a family of nonempty sets. Let
$$A' = \{X : \exists a \in A\,[X = \{(a, x) : x \in a\}]\}.$$
Since each member of $A$ is nonempty, also each member of $A'$ is nonempty. Given $X, Y \in A'$ with $X \ne Y$, choose $a, b \in A$ such that $X = \{(a, x) : x \in a\}$ and $Y = \{(b, x) : x \in b\}$. Thus $a \ne b$ since $X \ne Y$. The basic property of ordered pairs then implies that $X \cap Y = \emptyset$.
So, by the axiom of choice, let $B$ have exactly one element in common with each element of $A'$. Define $f = \{b \in B : \text{there exist } a \in A \text{ and } x \text{ such that } b = (a, x)\}$. Clearly $f$ is the desired choice function for $A$.
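Choice-function principle $\Rightarrow$ Zorn's lemma. Assume the choice-function principle, and also assume the hypotheses of Zorn's lemma. Let $f$ be a choice function for $\mathcal{P}(A) \setminus \{\emptyset\}$. We now define a function $F : \mathrm{Ord} \to A \cup \{A\}$ by recursion. Namely, for each ordinal $\alpha$ let
$$F(\alpha) = \begin{cases} f(\{a \in A : F(\beta) < a \text{ for all } \beta < \alpha\}) & \text{if this set is nonempty,} \\ A & \text{otherwise.} \end{cases}$$
(1) If $\alpha < \beta \in \mathrm{Ord}$ and $F(\beta) \ne A$, then $F(\alpha) \ne A$, and $F(\alpha) < F(\beta)$.
In fact, $A \not< a$ for every $a \in A$, so (1) is true by definition.
(2) There is an ordinal $\alpha$ such that $F(\alpha) = A$.
Otherwise, by (1), $F$ is a one-one function from $\mathrm{Ord}$ into $A$. So by the replacement axiom, $\mathrm{Ord} = F^{-1}[\mathrm{rng}(F)]$ is a set, contradiction.
Let $\alpha$ be minimum such that $F(\alpha) = A$. By the hypothesis of Zorn's lemma, $\alpha$ is a successor ordinal $\beta + 1$: we have $\alpha \ne 0$ since $A \ne \emptyset$, and if $\alpha$ were a limit ordinal, then an upper bound of the simply ordered set $\{F(\gamma) : \gamma < \alpha\}$ would lie strictly above every $F(\gamma)$, giving $F(\alpha) \ne A$. Thus $F(\beta)$ is a $<$-maximal element of $A$.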
Zorn's lemma $\Rightarrow$ well-ordering principle. Assume Zorn's lemma, and let $A$ be any set. We may assume that $A$ is nonempty. Let
$$P = \{(B, <) : B \subseteq A \text{ and } (B, <) \text{ is a well-ordering structure}\}.$$
We partially order $P$ as follows: $(B, <) \prec (C, \ll)$ iff $B \subset C$, $\forall a, b \in B\,[a < b \text{ iff } a \ll b]$, and $\forall b \in B\, \forall c \in C \setminus B\,[b \ll c]$. Clearly this does partially order $P$. $P \ne \emptyset$, since $(\{a\}, \emptyset) \in P$ for any $a \in A$. Now suppose that $Q$ is a nonempty subset of $P$ simply ordered by $\prec$. Let
$$D = \bigcup_{(B, <) \in Q} B, \qquad {<_D} = \bigcup_{(B, <) \in Q} {<}.$$
Clearly $(D, <_D)$ is a linear order with $D \subseteq A$. Suppose that $X$ is a nonempty subset of $D$. Fix $z \in X$, and choose $(B, <) \in Q$ such that $z \in B$. Then $X \cap B$ is a nonempty subset of $B$; let $x$ be its least element under $<$. Suppose that $y \in X$ and $y <_D x$. Choose $(C, \ll) \in Q$ such that $x, y \in C$ and $y \ll x$. Since $Q$ is simply ordered by $\prec$, we have two cases.
Case 1. $(C, \ll) \preceq (B, <)$. Then $y \in C \subseteq B$ and $y \in X$, so $y < x$, contradicting the choice of $x$.
Case 2. $(B, <) \prec (C, \ll)$. If $y \in B$, then $y < x$, contradicting the choice of $x$. So $y \in C \setminus B$. But then $x \ll y$, contradiction.
Thus we have shown that $x$ is the $<_D$-least element of $X$. So $(D, <_D)$ is the desired upper bound for $Q$.
Having verified the hypotheses of Zorn's lemma, we get a maximal element $(B, <)$ of $P$. Suppose that $B \ne A$. Choose any $a \in A \setminus B$, and let
$$C = B \cup \{a\}, \qquad {<_C} = {<} \cup \{(b, a) : b \in B\}.$$
Clearly this gives an element $(C, <_C)$ of $P$ such that $(B, <) \prec (C, <_C)$, contradiction.
Well-ordering principle $\Rightarrow$ Axiom of choice. Assume the well-ordering principle, and let $A$ be a family of pairwise disjoint nonempty sets. Let $C = \bigcup A$, and let $\prec$ be a well-order of $C$. Define $B = \{c \in C : c \text{ is the } \prec\text{-least element of the } P \in A \text{ for which } c \in P\}$. Clearly $B$ has exactly one element in common with each member of $A$.
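For finite families no choice principle is needed, but the two constructions used in the proof (tagging to make the sets pairwise disjoint, and picking least elements under a well-order) can still be traced concretely. The following is a minimal sketch under that assumption; `selector_from_well_order` and `choice_function` are hypothetical names introduced for this illustration.

```python
def selector_from_well_order(family, order):
    """Pick the order-least element of each member of `family`.

    family: pairwise disjoint nonempty sets.
    order:  a list enumerating their union without repetition; for a finite
            set, list position is a well-order.
    """
    rank = {c: i for i, c in enumerate(order)}
    return {min(member, key=rank.__getitem__) for member in family}


def choice_function(family):
    """Choice function for a finite family of nonempty frozensets.

    Each set a is replaced by {(a, x) : x in a}; these tagged sets are
    pairwise disjoint even when the original sets overlap.
    """
    tagged = [frozenset((a, x) for x in a) for a in family]
    order = [pair for t in tagged for pair in t]
    picked = selector_from_well_order(tagged, order)
    # The selector is a set of pairs (a, x) with x in a: a function on family.
    return {a: x for (a, x) in picked}


family = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3})]
f = choice_function(family)
```

The overlapping sets in `family` show why the tagging step is needed before a selector set can be read as a function.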
There are many statements which are equivalent to the axiom of choice on the basis of ZF. We list some striking ones. A fairly complete list is in
Rubin, H.; Rubin, J. Equivalents of the axiom of choice. North-Holland (1963), 134pp.
(About 100 forms are listed, with proofs of equivalence.)
1. For every relation $R$ there is a function $f \subseteq R$ such that $\mathrm{dmn}(f) = \mathrm{dmn}(R)$.
2. For any sets $A, B$, either there is an injection of $A$ into $B$ or one of $B$ into $A$.
3. For any transitive relation $R$ there is a maximal $S \subseteq R$ which is a linear ordering.
4. Every product of compact spaces is compact.
5. Every formula having a model of size $\omega$ also has a model of any infinite size.
6. If $A$ can be well-ordered, then so can $\mathcal{P}(A)$.
There are also statements which follow from the axiom of choice but do not imply it on the basis of ZF. A fairly complete list of such statements is in
Howard, P.; Rubin, J. Consequences of the axiom of choice. Amer. Math. Soc. (1998), 432pp.
(383 forms are listed.)
Again we list some striking ones:
1. Every Boolean algebra has a maximal ideal.
2. Any product of compact Hausdorff spaces is compact.
3. The compactness theorem of first-order logic.
4. Every commutative ring has a prime ideal.
5. Every set can be linearly ordered.
6. Every linear ordering has a cofinal well-ordered subset.
7. The Hahn-Banach theorem.
8. Every field has an algebraic closure.
9. Every family of unordered pairs has a choice function.
10. Every linearly ordered set can be well-ordered.
Now for the rest of this chapter we give one of the most startling consequences of the axiom
of choice, the Banach-Tarski paradox. (This is optional material.)
The Banach-Tarski paradox is that a unit ball in Euclidean 3-space can be decomposed
into finitely many parts which can then be reassembled to form two unit balls in Euclidean
3-space (maybe some parts are not used in these reassemblings). Reassembling is done
using distance-preserving transformations. This is one of the most striking consequences
of the axiom of choice, and is good background for the study of measure theory (of course
the parts of the decomposition are not measurable). We give a proof of the theorem here
without going into any side issues. We follow Wagon, The Banach-Tarski paradox,
where variations and connections to measure theory are explained in full. The proof
involves very little set theory, only the axiom of choice. Some third semester calculus and
some linear algebra and simple group theory are used. Altogether the proof should be
accessible to a first-year graduate student who has seen some applications of the axiom of
choice.
We start with some preliminaries on geometry and linear algebra. The reassembling
mentioned in the Banach-Tarski paradox is entirely done by rotations and translations.
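Given a line in 3-space and an angle $\theta$, we imagine the rotation about the given line through the angle $\theta$. Mainly we will be interested in rotations about lines that go through the origin. We indicate how to obtain the matrix representations of such rotations. First suppose that $\varphi$ is the rotation about the $z$-axis counterclockwise through the angle $\theta$. Then, using polar coordinates,
$$\begin{aligned}
\varphi \begin{pmatrix} x \\ y \\ z \end{pmatrix}
&= \varphi \begin{pmatrix} r\cos\alpha \\ r\sin\alpha \\ z \end{pmatrix}
= \begin{pmatrix} r\cos(\alpha + \theta) \\ r\sin(\alpha + \theta) \\ z \end{pmatrix} \\
&= \begin{pmatrix} r\cos\alpha\cos\theta - r\sin\alpha\sin\theta \\ r\cos\alpha\sin\theta + r\sin\alpha\cos\theta \\ z \end{pmatrix}
= \begin{pmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \\ z \end{pmatrix},
\end{aligned}$$
which gives the matrix representation of $\varphi$:
$$\begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$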

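Similarly, the matrix representations of rotations counterclockwise through the angle $\theta$ about the $x$- and $y$-axes are, respectively,
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}.$$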

Next, note that any rotation with respect to a line through the origin can be obtained as a composition of rotations about the three axes. This is easy to see using spherical coordinates. If $l$ is a line through the origin and a point $P$ different from the origin with spherical coordinates $\rho, \theta, \eta$ (azimuth $\theta$, polar angle $\eta$), a rotation about $l$ through an angle $\alpha$ can be obtained as follows: rotate about the $z$-axis through the angle $-\theta$, then about the $y$-axis through the angle $-\eta$ (thereby transforming $l$ into the $z$-axis), then about the $z$-axis through the angle $\alpha$, then back through $\eta$ about the $y$-axis and through $\theta$ about the $z$-axis.
We want to connect this to linear algebra. Recall that a $3 \times 3$ matrix $A$ is orthogonal provided that it is invertible and $A^T = A^{-1}$. Thus the matrices above are orthogonal. A matrix is orthogonal iff its columns form a basis for ${}^3\mathbb{R}$ consisting of mutually orthogonal unit vectors; this is easy to see. It is easy to check that a product of orthogonal matrices is orthogonal. Hence all of the rotations about lines through the origin are represented by orthogonal matrices.
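These facts are easy to check numerically. The sketch below builds the three axis rotations and tests orthogonality and determinant 1 for a product of them; the helper names are hypothetical, and the sign convention used for the $y$-axis rotation is the usual right-handed one, which is an assumption here.

```python
import math


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


def det3(A):
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_orthogonal(A, tol=1e-12):
    """Check A^T A = I entrywise, up to rounding error."""
    P = matmul(transpose(A), A)
    return all(abs(P[i][j] - (1 if i == j else 0)) < tol
               for i in range(3) for j in range(3))


def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


# A composition of rotations about the three axes.
A = matmul(rot_z(0.7), matmul(rot_y(-1.1), rot_x(2.3)))
quarter_turn = apply(rot_z(math.pi / 2), [1, 0, 0])
```

The angles 0.7, -1.1 and 2.3 are arbitrary; any other choice should pass the same checks.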
Lemma 5.2. If $A$ is an orthogonal $3 \times 3$ real matrix and $X$ and $Y$ are $3 \times 1$ column vectors, then $(AX) \cdot (AY) = X \cdot Y$, where $\cdot$ is scalar multiplication.
Proof. This is a simple computation:
$$(AX) \cdot (AY) = (AX)^T (AY) = X^T A^T A Y = X^T A^{-1} A Y = X^T Y = X \cdot Y.$$
It follows that any rotation about a line through the origin preserves distance, because $|P - Q| = \sqrt{(P - Q) \cdot (P - Q)}$ for any vectors $P$ and $Q$. Such rotations have an additional property: their matrix representations have determinant 1. This is clear from the discussion above. It turns out that this additional property characterizes the rotations about lines through the origin (see M. Artin, Algebra), but we do not need to prove that. The following property of such matrices is very useful, however.
Lemma 5.3. Suppose that $A$ is an orthogonal $3 \times 3$ real matrix with determinant 1, $A$ not the identity. Then there is a non-zero $3 \times 1$ matrix $X$ such that for any $3 \times 1$ matrix $Y$,
$$AY = Y \quad\text{iff}\quad \exists a \in \mathbb{R}\,[Y = aX].$$
Proof. Note that $A^T (A - I) = I - A^T = (I - A)^T$. Hence
$$|A - I| = -|I - A| = -|(I - A)^T| = -|A^T (A - I)| = -|A^T| \cdot |A - I| = -|A - I|.$$
It follows that $|A - I| = 0$. Hence the system of equations $(A - I)X = 0$ has a nontrivial solution, which gives the $X$ we want. Namely, we then have $AX = X$, of course. Then $A(aX) = aAX = aX$. This proves $\Leftarrow$ in the equivalence of the lemma. It remains to do the converse. We may assume that $X$ has length 1. Now we apply the Gram-Schmidt process to get a basis for ${}^3\mathbb{R}$ consisting of mutually orthogonal unit vectors, the first of which is $X$. We put them into a matrix $B$ as column vectors, $X$ the first column. Note that the first column of $AB$ is $X$, since $AX = X$, and hence the first column of $B^{-1}AB$ is $(1\ 0\ 0)^T$. Since $B^{-1}AB$ is an orthogonal matrix, it follows because its columns are mutually orthogonal that it has the form
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & a & b \\ 0 & c & d \end{pmatrix}.$$

Now suppose that $AY = Y$. Let $B^{-1}Y = (u\ e\ f)^T$. Then
(1) $e = f = 0$.
For, suppose that (1) fails. Now $B^{-1}ABB^{-1}Y = B^{-1}AY = B^{-1}Y$, while a direct computation using the above form of $B^{-1}AB$ yields $B^{-1}ABB^{-1}Y = (u\ \ ae + bf\ \ ce + df)^T$. So we get the two equations
$$\begin{aligned} ae + bf &= e \\ ce + df &= f \end{aligned}
\qquad\text{or}\qquad
\begin{aligned} (a - 1)e + bf &= 0 \\ ce + (d - 1)f &= 0. \end{aligned}$$
Since (1) fails, it follows that the determinant $\begin{vmatrix} a - 1 & b \\ c & d - 1 \end{vmatrix}$ is 0. Thus $ad - a - d + 1 - bc = 0$. Now $B^{-1}AB$ has determinant 1, and its determinant is $ad - bc$, so we infer that $a + d = 2$. But $a^2 + c^2 = 1$ and $b^2 + d^2 = 1$ since the columns of $B^{-1}AB$ are unit vectors, so $|a| \le 1$ and $|d| \le 1$. Hence $a = d = 1$ and $b = c = 0$. So $B^{-1}AB$ is the identity matrix, so $A$ is also, contradiction. Hence (1) holds after all.
From (1) we get $Y = B(u\ 0\ 0)^T = uX$, as desired.
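The fixed vector of the lemma can be computed directly: every row of $A - I$ is orthogonal to it, so the cross product of two independent rows gives a multiple of it. The following is a small numerical sketch of that idea, not the method of the proof above; `rotation_axis` is a hypothetical helper, and the sample matrix is the product of the rotations through $\cos^{-1}(\frac13)$ about the $x$-axis and the $z$-axis.

```python
import math


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def rotation_axis(A):
    """Unit vector X with AX = X, for a rotation matrix A other than I.

    A - I has rank 2, so at least one pair of its rows is independent;
    the largest cross product is used for numerical stability.
    """
    M = [[A[i][j] - (1.0 if i == j else 0.0) for j in range(3)]
         for i in range(3)]
    candidates = [cross(M[0], M[1]), cross(M[0], M[2]), cross(M[1], M[2])]
    X = max(candidates, key=lambda v: sum(c * c for c in v))
    norm = math.sqrt(sum(c * c for c in X))
    return [c / norm for c in X]


C, S = 1 / 3, 2 * math.sqrt(2) / 3      # cosine and sine of arccos(1/3)
ROT_Z = [[C, -S, 0.0], [S, C, 0.0], [0.0, 0.0, 1.0]]
ROT_X = [[1.0, 0.0, 0.0], [0.0, C, -S], [0.0, S, C]]
A = matmul(ROT_X, ROT_Z)
X = rotation_axis(A)
AX = apply(A, X)
```

By the lemma, the vectors fixed by `A` are exactly the scalar multiples of `X`.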
One more remark on geometry: any rotation preserves distance. We already said this for rotations about lines through the origin. If $l$ does not go through the origin, one can use a translation to transform it into a line through the origin, do the rotation then, and then translate back. Since translations clearly preserve distance, arbitrary rotations preserve distance as well.
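The first concrete step in the proof of the Banach-Tarski theorem is to describe a very special group of permutations of ${}^3\mathbb{R}$. Let $\varphi$ be the counterclockwise rotation about the $z$-axis through the angle $\cos^{-1}(\frac13)$, and let $\psi$ be the similar rotation about the $x$-axis. The matrix representation of these rotations and their inverses is, by the above,
$$\varphi^{\pm 1} = \begin{pmatrix} \frac13 & \mp\frac{2\sqrt2}{3} & 0 \\ \pm\frac{2\sqrt2}{3} & \frac13 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
\psi^{\pm 1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac13 & \mp\frac{2\sqrt2}{3} \\ 0 & \pm\frac{2\sqrt2}{3} & \frac13 \end{pmatrix}.$$
Let $G_0$ be the group of permutations of ${}^3\mathbb{R}$ generated by $\varphi$ and $\psi$. By a word in $\varphi$ and $\psi$ we mean a finite sequence with elements in $\{\varphi, \varphi^{-1}, \psi, \psi^{-1}\}$. Given such a word $w = \langle \sigma_0, \ldots, \sigma_{m-1} \rangle$, we let $\bar{w}$ be the composition $\sigma_0 \circ \cdots \circ \sigma_{m-1}$. Further, we call $w$ reduced if no two successive terms of $w$ have any of the four forms $\langle \varphi, \varphi^{-1} \rangle$, $\langle \varphi^{-1}, \varphi \rangle$, $\langle \psi, \psi^{-1} \rangle$, or $\langle \psi^{-1}, \psi \rangle$.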
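Lemma 5.4. If $w$ is a reduced word of positive length, then $\bar{w}$ is not the identity.
Proof. Suppose the contrary. Since $\varphi \circ \bar{w} \circ \varphi^{-1}$ is also the identity, we may assume that $w$ ends with $\varphi^{\pm 1}$ (on the right). [If $w$ already ends with $\varphi^{\pm 1}$, we do nothing. If it ends with $\psi^{\pm 1}$, let $w' = \varphi w \varphi^{-1}$. Then $w'$ is reduced, unless $w$ has the form $\varphi^{-1} w''$, in which case $w'' \varphi^{-1}$ is reduced, and still $\overline{w'' \varphi^{-1}} = \bar{w'} =$ the identity.]
Since obviously $\varphi^{\pm 1}$ is not the identity, $w$ must have length at least 2. Now we claim
(1) For every terminal segment $w'$ of $w$ of nonzero even length the vector $\bar{w'} (1\ 0\ 0)^T$ has the form $(1/3^k)(a\ \ b\sqrt2\ \ c)^T$, with $a$ divisible by 3 and $b$ not divisible by 3.
We prove this by induction on the length of $w'$. First note that, by computation,
$$\psi\varphi = \frac19 \begin{pmatrix} 3 & -6\sqrt2 & 0 \\ 2\sqrt2 & 1 & -6\sqrt2 \\ 8 & 2\sqrt2 & 3 \end{pmatrix}; \qquad
\psi\varphi^{-1} = \frac19 \begin{pmatrix} 3 & 6\sqrt2 & 0 \\ -2\sqrt2 & 1 & -6\sqrt2 \\ -8 & 2\sqrt2 & 3 \end{pmatrix};$$
$$\psi^{-1}\varphi = \frac19 \begin{pmatrix} 3 & -6\sqrt2 & 0 \\ 2\sqrt2 & 1 & 6\sqrt2 \\ -8 & -2\sqrt2 & 3 \end{pmatrix}; \qquad
\psi^{-1}\varphi^{-1} = \frac19 \begin{pmatrix} 3 & 6\sqrt2 & 0 \\ -2\sqrt2 & 1 & 6\sqrt2 \\ 8 & -2\sqrt2 & 3 \end{pmatrix}.$$
Now we proceed by induction. For $w'$ of length 2 we have
$$\psi\varphi \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \frac19 \begin{pmatrix} 3 \\ 2\sqrt2 \\ 8 \end{pmatrix}; \qquad
\psi\varphi^{-1} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \frac19 \begin{pmatrix} 3 \\ -2\sqrt2 \\ -8 \end{pmatrix};$$
$$\psi^{-1}\varphi \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \frac19 \begin{pmatrix} 3 \\ 2\sqrt2 \\ -8 \end{pmatrix}; \qquad
\psi^{-1}\varphi^{-1} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \frac19 \begin{pmatrix} 3 \\ -2\sqrt2 \\ 8 \end{pmatrix};$$
hence (1) holds in this case. The induction step:
$$\psi\varphi\, \frac{1}{3^k} \begin{pmatrix} a \\ b\sqrt2 \\ c \end{pmatrix} = \frac{1}{3^{k+2}} \begin{pmatrix} 3a - 12b \\ 2\sqrt2 a + b\sqrt2 - 6\sqrt2 c \\ 8a + 4b + 3c \end{pmatrix};$$
$$\psi\varphi^{-1}\, \frac{1}{3^k} \begin{pmatrix} a \\ b\sqrt2 \\ c \end{pmatrix} = \frac{1}{3^{k+2}} \begin{pmatrix} 3a + 12b \\ -2\sqrt2 a + b\sqrt2 - 6\sqrt2 c \\ -8a + 4b + 3c \end{pmatrix};$$
$$\psi^{-1}\varphi\, \frac{1}{3^k} \begin{pmatrix} a \\ b\sqrt2 \\ c \end{pmatrix} = \frac{1}{3^{k+2}} \begin{pmatrix} 3a - 12b \\ 2\sqrt2 a + b\sqrt2 + 6\sqrt2 c \\ -8a - 4b + 3c \end{pmatrix};$$
$$\psi^{-1}\varphi^{-1}\, \frac{1}{3^k} \begin{pmatrix} a \\ b\sqrt2 \\ c \end{pmatrix} = \frac{1}{3^{k+2}} \begin{pmatrix} 3a + 12b \\ -2\sqrt2 a + b\sqrt2 + 6\sqrt2 c \\ 8a - 4b + 3c \end{pmatrix}.$$
So, our assertion (1) is true. If $w$ itself is of even length, then a contradiction has been reached, since $b$ is not divisible by 3. If $w$ is of odd length, then the following shows that the second entry of $\bar{w} (1\ 0\ 0)^T$ is nonzero, still a contradiction:
$$\varphi\, \frac{1}{3^k} \begin{pmatrix} a \\ b\sqrt2 \\ c \end{pmatrix} = \frac{1}{3^{k+1}} \begin{pmatrix} a - 4b \\ 2\sqrt2 a + b\sqrt2 \\ 3c \end{pmatrix}; \qquad
\varphi^{-1}\, \frac{1}{3^k} \begin{pmatrix} a \\ b\sqrt2 \\ c \end{pmatrix} = \frac{1}{3^{k+1}} \begin{pmatrix} a + 4b \\ -2\sqrt2 a + b\sqrt2 \\ 3c \end{pmatrix}.$$
This finishes the proof of Lemma 5.4.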
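The statement of Lemma 5.4 can be checked by brute force for short words. The sketch below is not the proof above: it multiplies out every reduced word up to a given length in exact arithmetic and compares with the identity. It assumes the two generators are the rotations through $\cos^{-1}(\frac13)$ about the $z$-axis and the $x$-axis, encoded by the hypothetical letters `a` and `b` (capitals for inverses), and it stores 3 times each matrix so that all entries have the form $p + q\sqrt2$ with integers $p, q$.

```python
from itertools import product


def zmul(x, y):
    """Product in Z[sqrt 2]; the pair (p, q) stands for p + q*sqrt(2)."""
    return (x[0] * y[0] + 2 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def matmul(A, B):
    out = []
    for i in range(3):
        row = []
        for j in range(3):
            p = q = 0
            for k in range(3):
                t = zmul(A[i][k], B[k][j])
                p += t[0]
                q += t[1]
            row.append((p, q))
        out.append(tuple(row))
    return tuple(out)


def transpose(A):
    return tuple(tuple(A[j][i] for j in range(3)) for i in range(3))


Z, ONE, THREE = (0, 0), (1, 0), (3, 0)
R, NR = (0, 2), (0, -2)                     # 2*sqrt(2) and -2*sqrt(2)
A3 = ((ONE, NR, Z), (R, ONE, Z), (Z, Z, THREE))    # 3 * (z-axis rotation)
B3 = ((THREE, Z, Z), (Z, ONE, NR), (Z, R, ONE))    # 3 * (x-axis rotation)
# The inverse of an orthogonal matrix is its transpose.
GENS = {'a': A3, 'A': transpose(A3), 'b': B3, 'B': transpose(B3)}
INVERSE = {'a': 'A', 'A': 'a', 'b': 'B', 'B': 'b'}


def reduced_words(n):
    """All reduced words of length exactly n, as strings."""
    for letters in product('aAbB', repeat=n):
        if all(INVERSE[x] != y for x, y in zip(letters, letters[1:])):
            yield ''.join(letters)


def is_identity(word):
    """Does the composition named by `word` equal the identity?"""
    M = GENS[word[0]]
    for ch in word[1:]:
        M = matmul(M, GENS[ch])
    s = 3 ** len(word)                      # M is 3^len times the true matrix
    I = tuple(tuple((s, 0) if i == j else (0, 0) for j in range(3))
              for i in range(3))
    return M == I


def no_identity_up_to(n):
    return not any(is_identity(w)
                   for k in range(1, n + 1) for w in reduced_words(k))


result = no_identity_up_to(6)
```

Integer arithmetic is used on purpose: with floating point, a product that is merely close to the identity could not be told apart from the identity itself.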


This lemma really says that G0 is (isomorphic to) the free group on two generators. But
we do not need to go into that. We do need the following corollary, though.
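Corollary 5.5. For every $g \in G_0$ there is a unique reduced word $w$ such that $g = \bar{w}$.
Proof. Suppose that $w$ and $w'$ both work, and $w \ne w'$. Say $w = \langle \sigma_0, \ldots, \sigma_{m-1} \rangle$ and $w' = \langle \tau_0, \ldots, \tau_{n-1} \rangle$. If one is a proper initial segment of the other, say by symmetry $w$ is a proper initial segment of $w'$, then
$$g = \bar{w} = \sigma_0 \circ \cdots \circ \sigma_{m-1} = \bar{w'} = \tau_0 \circ \cdots \circ \tau_{n-1};$$
since $\sigma_i = \tau_i$ for all $i < m$, we obtain $I = \tau_m \circ \cdots \circ \tau_{n-1}$, $I$ the identity. But $\langle \tau_m, \ldots, \tau_{n-1} \rangle$ is reduced, contradicting 5.4.
Thus neither $w$ nor $w'$ is a proper initial segment of the other. Hence there is an $i < \min(m, n)$ such that $\sigma_i \ne \tau_i$, while $\sigma_j = \tau_j$ for all $j < i$ (maybe $i = 0$). But then we have by cancellation $\sigma_i \circ \cdots \circ \sigma_{m-1} = \tau_i \circ \cdots \circ \tau_{n-1}$, so $\tau_{n-1}^{-1} \circ \cdots \circ \tau_i^{-1} \circ \sigma_i \circ \cdots \circ \sigma_{m-1} = I$. But since $\sigma_i \ne \tau_i$, the word $\langle \tau_{n-1}^{-1}, \ldots, \tau_i^{-1}, \sigma_i, \ldots, \sigma_{m-1} \rangle$ is reduced, again contradicting 5.4.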
If $G$ is a group and $X$ is a set, we say that $G$ acts on $X$ if there is a homomorphism from $G$ into the group of all permutations of $X$. Usually this homomorphism will be denoted by $\hat{\ }$, so that $\hat{g}$ is the permutation of $X$ corresponding to $g \in G$. (Most mathematicians don't even use $\hat{\ }$, using the same symbol for elements of the group and for the image under the homomorphism.) An important example is: any group $G$ acts on itself by left multiplication. Thus for any $g \in G$, $\hat{g} : G \to G$ is defined by $\hat{g}(h) = g \cdot h$, for all $h \in G$.
Let $G$ act on a set $X$, and let $E \subseteq X$. Then we say that $E$ is $G$-paradoxical if there are positive integers $m, n$ and pairwise disjoint subsets $A_0, \ldots, A_{m-1}, B_0, \ldots, B_{n-1}$ of $E$, and elements $\langle g_i : i < m \rangle$ and $\langle h_i : i < n \rangle$ of $G$ such that $E = \bigcup_{i<m} \hat{g}_i[A_i]$ and $E = \bigcup_{j<n} \hat{h}_j[B_j]$. Note that this comes close to the Banach-Tarski formulation, except that the sets $X$ and $E$ are unspecified.
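Lemma 5.6. $G_0$, acting on itself by left multiplication, is $G_0$-paradoxical.
Proof. If $\varepsilon$ is one of $\varphi^{\pm 1}, \psi^{\pm 1}$, we denote by $W(\varepsilon)$ the set of all reduced words beginning on the left with $\varepsilon$, and $\overline{W}(\varepsilon) = \{\bar{w} : w \in W(\varepsilon)\}$. Thus, obviously,
$$G_0 = \{I\} \cup \overline{W}(\varphi) \cup \overline{W}(\varphi^{-1}) \cup \overline{W}(\psi) \cup \overline{W}(\psi^{-1}),$$
where $I$ is the identity element of $G_0$. These five sets are pairwise disjoint by 5.5. Thus the lemma will be proved, with $m = n = 2$, by proving the following two statements:
(1) $G_0 = \overline{W}(\varphi) \cup \hat{\varphi}[\overline{W}(\varphi^{-1})]$.
To see this, suppose that $g \in G_0$ and $g \notin \overline{W}(\varphi)$. Write $g = \bar{w}$, $w$ a reduced word. Then $w$ does not start with $\varphi$. Hence $\varphi^{-1} w$ is still a reduced word, and $g = \varphi \circ \overline{\varphi^{-1} w} \in \hat{\varphi}[\overline{W}(\varphi^{-1})]$, as desired.
(2) $G_0 = \overline{W}(\psi) \cup \hat{\psi}[\overline{W}(\psi^{-1})]$.
The proof is just like for (1).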
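The word manipulation in this proof can be traced on strings. The sketch below models group elements as reduced words over four letters (a hypothetical encoding: `a`, `b` for the two generators and `A`, `B` for their inverses, justified by Corollary 5.5) and checks statements (1) and (2) for all words up to a given length.

```python
from itertools import product

INVERSE = {'a': 'A', 'A': 'a', 'b': 'B', 'B': 'b'}


def reduced_words(max_len):
    """All reduced words of length at most max_len; '' is the identity."""
    for n in range(max_len + 1):
        for letters in product('aAbB', repeat=n):
            if all(INVERSE[x] != y for x, y in zip(letters, letters[1:])):
                yield ''.join(letters)


def left_multiply(letter, word):
    """Reduced word for letter * word, where word is already reduced."""
    if word and word[0] == INVERSE[letter]:
        return word[1:]
    return letter + word


def check_decomposition(letter, max_len):
    """Every g lies in W(letter) or in letter * W(letter^-1)."""
    inv = INVERSE[letter]
    for g in reduced_words(max_len):
        if g.startswith(letter):
            continue
        h = left_multiply(inv, g)            # h = letter^-1 * g
        # h must begin with letter^-1, and letter * h must give back g.
        if not (h.startswith(inv) and left_multiply(letter, h) == g):
            return False
    return True


ok_first = check_decomposition('a', 5)
ok_second = check_decomposition('b', 5)
```

Note that the identity (the empty word) is covered too: it is the generator times the one-letter word consisting of its inverse.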
We need two more definitions, given that $G$ acts on a set $X$. For each $x \in X$, the $G$-orbit of $x$ is $\{\hat{g}(x) : g \in G\}$. The set of $G$-orbits forms a partition of $X$. We say that $G$ is without non-trivial fixed points if for every non-identity $g \in G$ and every $x \in X$, $\hat{g}(x) \ne x$.
The following lemma is the place in the proof of the Banach-Tarski paradox where the axiom of choice is used. Don't jump to the conclusion that the proof is almost over; our group $G_0$ above has non-trivial fixed points, and so does not satisfy the hypothesis of the lemma. Some trickery remains to be done even after this lemma. [For example, all points on the $z$-axis are fixed by $\varphi$.]
Lemma 5.7. Suppose that $G$ is $G$-paradoxical and acts on a set $X$ without non-trivial fixed points. Then $X$ is $G$-paradoxical.
Proof. Let
$$A_0, \ldots, A_{m-1}, B_0, \ldots, B_{n-1}, g_0, \ldots, g_{m-1}, h_0, \ldots, h_{n-1}$$
be as in the definition of paradoxical. By AC, let $M$ be a subset of $X$ having exactly one element in common with each $G$-orbit. Then we claim:
(1) $\langle \hat{g}[M] : g \in G \rangle$ is a partition of $X$.
First of all, obviously each set $\hat{g}[M]$ is nonempty. Next, their union is $X$, since for any $x \in X$ there is a $y \in M$ which is in the same $G$-orbit as $x$, and this yields a $g \in G$ such that $x = \hat{g}(y)$ and hence $x \in \hat{g}[M]$. Finally, if $g$ and $h$ are distinct elements of $G$, then $\hat{g}[M]$ and $\hat{h}[M]$ are disjoint. In fact, otherwise let $y$ be a common element. Say $\hat{g}(x) = y$, $x \in M$, and $\hat{h}(z) = y$, $z \in M$. Then clearly $x$ and $z$ are in the same $G$-orbit, so $x = z$ since they are both in $M$. Then $(\hat{g}^{-1} \circ \hat{h})(z) = z$ and $g^{-1} \cdot h$ is not the identity, contradicting the no non-trivial fixed point assumption. So, (1) holds.
Now let $A_i^* = \bigcup_{g \in A_i} \hat{g}[M]$ and $B_j^* = \bigcup_{g \in B_j} \hat{g}[M]$, for all $i < m$ and $j < n$.
(2) $A_i^* \cap A_k^* = \emptyset$ if $i < k < m$.
In fact, suppose that $x \in A_i^* \cap A_k^*$. Then we can choose $g \in A_i$ and $h \in A_k$ such that $x \in \hat{g}[M] \cap \hat{h}[M]$. But $g \ne h$ since $A_i \cap A_k = \emptyset$, so this contradicts (1). Similarly the following two conditions hold:
(3) $B_i^* \cap B_k^* = \emptyset$ if $i < k < n$.
(4) $A_i^* \cap B_j^* = \emptyset$ if $i < m$ and $j < n$.
(5) $\bigcup_{i<m} \hat{g}_i[A_i^*] = X$.
For, let $x \in X$. Say by (1) that $x \in \hat{g}[M]$. Then by the choice of the $A_i$'s there is an $i < m$ such that $g \in g_i \cdot A_i$. Say $h \in A_i$ and $g = g_i \cdot h$. Since $x \in \hat{g}[M]$, say $x = \hat{g}(p)$ with $p \in M$. Then $x = \widehat{g_i \cdot h}(p) = \hat{g}_i(\hat{h}(p))$. Now $\hat{h}(p) \in \hat{h}[M] \subseteq A_i^*$, so $x \in \hat{g}_i[A_i^*]$, as desired in (5).
(6) $\bigcup_{i<n} \hat{h}_i[B_i^*] = X$.
This is proved similarly.


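Let $S^2 = \{x \in {}^3\mathbb{R} : |x| = 1\}$ be the usual unit sphere. Now we can prove the first paradoxical result leading to the Banach-Tarski paradox:
Theorem 5.8. (Hausdorff) There is a countable $D \subseteq S^2$ such that $S^2 \setminus D$ is $G_0$-paradoxical.
Proof. Let $D$ be the set of all points of $S^2$ fixed by some non-identity element of $G_0$. By 5.3, each non-identity element of $G_0$ fixes exactly two points of $S^2$; since $G_0$ is countable, $D$ is countable. Now we claim that if $\sigma \in G_0$, then $\sigma[S^2 \setminus D] = S^2 \setminus D$. For, assume that $x \in S^2 \setminus D$ and $\sigma(x) \in D$. Say $\tau \in G_0$, not the identity, and $\tau(\sigma(x)) = \sigma(x)$. Then $\sigma^{-1}\tau\sigma(x) = x$. Now $\sigma^{-1}\tau\sigma$ is not the identity, since $\tau$ isn't, so $x \in D$, contradiction. This proves that $\sigma[S^2 \setminus D] \subseteq S^2 \setminus D$. This holds for any $\sigma \in G_0$, in particular for $\sigma^{-1}$, and applying $\sigma$ to that inclusion yields $S^2 \setminus D \subseteq \sigma[S^2 \setminus D]$, so the desired equality holds.
Thus $G_0$ acts on $S^2 \setminus D$ without non-trivial fixed points. So by 5.6 and 5.7, $S^2 \setminus D$ is $G_0$-paradoxical.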
Let us see how far we have to go now. This theorem only looks at the sphere, not the ball.
A countable subset is ignored. Since the sphere is uncountable, this makes the result close
to what we want. But actually there is a countable subset of the sphere which is dense on
it. [Take points whose spherical coordinates are rational.]
For the next step we need a new notion. Suppose that $G$ is a group acting on a set $X$, and $A, B \subseteq X$. We say that $A$ and $B$ are finitely $G$-equidecomposable if $A$ and $B$ can be decomposed into the same number of parts, each part of $A$ being carried into the corresponding part of $B$ by an element of $G$. In symbols, there is a positive integer $n$ such that there are partitions $A = \bigcup_{i<n} A_i$ and $B = \bigcup_{i<n} B_i$ and members $g_i \in G$ for $i < n$ such that $\hat{g}_i[A_i] = B_i$ for all $i < n$. We then write $A \sim_G B$.
Lemma 5.9. If $G$ acts on a set $X$, then $\sim_G$ is an equivalence relation on $\mathcal{P}(X)$.
Proof. Obviously $\sim_G$ is reflexive on $\mathcal{P}(X)$ and is symmetric. Now suppose that $A \sim_G B \sim_G C$. Then we get partitions $A = \bigcup_{i<m} A_i$ and $B = \bigcup_{i<m} B_i$ with elements $g_i \in G$ such that $\hat{g}_i[A_i] = B_i$ for all $i < m$; and partitions $B = \bigcup_{j<n} B'_j$ and $C = \bigcup_{j<n} C_j$ with elements $h_j \in G$ such that $\hat{h}_j[B'_j] = C_j$ for all $j < n$. Now for all $i < m$ and $j < n$ let $B_{ij} = B_i \cap B'_j$, $A_{ij} = \hat{g}_i^{-1}[B_{ij}]$, and $C_{ij} = \hat{h}_j[B_{ij}]$. Then $A = \bigcup_{i<m, j<n} A_{ij}$ is a partition of $A$, $C = \bigcup_{i<m, j<n} C_{ij}$ is a partition of $C$, and $(\hat{h}_j \circ \hat{g}_i)[A_{ij}] = C_{ij}$. Some of the $B_{ij}$ may be empty; eliminating the empty ones yields the desired nonemptiness of members of the partitions.
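Lemma 5.10. Suppose that $G$ acts on $X$, $E$ and $E'$ are finitely $G$-equidecomposable subsets of $X$, and $E$ is $G$-paradoxical. Then also $E'$ is $G$-paradoxical.
Proof. Because $E$ is $G$-paradoxical, we can find pairwise disjoint subsets
$$A_0, \ldots, A_{m-1}, B_0, \ldots, B_{n-1}$$
of $E$ and corresponding elements $g_0, \ldots, g_{m-1}, h_0, \ldots, h_{n-1}$ of $G$ such that
$$E = \bigcup_{i<m} \hat{g}_i[A_i] = \bigcup_{j<n} \hat{h}_j[B_j].$$
And because $E$ and $E'$ are finitely $G$-equidecomposable we can find partitions $E = \bigcup_{k<p} C_k$ and $E' = \bigcup_{k<p} D_k$ with elements $f_k \in G$ such that $\hat{f}_k[C_k] = D_k$ for all $k < p$. Then the following sets are pairwise disjoint: $A_i \cap \hat{g}_i^{-1}[C_k]$ for $i < m$ and $k < p$, and $B_j \cap \hat{h}_j^{-1}[C_k]$ for $j < n$ and $k < p$. And
$$\begin{aligned}
E' = \bigcup_{k<p} D_k &= \bigcup_{k<p} \hat{f}_k[C_k] \\
&= \bigcup_{k<p} \hat{f}_k\Big[C_k \cap \bigcup_{i<m} \hat{g}_i[A_i]\Big] \\
&= \bigcup_{k<p,\, i<m} \hat{f}_k[C_k \cap \hat{g}_i[A_i]] \\
&= \bigcup_{k<p,\, i<m} (\hat{f}_k \circ \hat{g}_i)[A_i \cap \hat{g}_i^{-1}[C_k]],
\end{aligned}$$
and similarly
$$E' = \bigcup_{k<p,\, j<n} (\hat{f}_k \circ \hat{h}_j)[B_j \cap \hat{h}_j^{-1}[C_k]].$$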

Lemma 5.11. Let $D$ be a countable subset of $S^2$. Then there is a rotation $\rho$ with respect to a line through the origin such that if $G_1$ is the group of transformations of ${}^3\mathbb{R}$ generated by $\rho$, then $S^2$ and $S^2 \setminus D$ are $G_1$-equidecomposable.
Proof. For each $d \in D$ let $f(d)$ be the line through the origin and $d$. Then $f$ maps $D$ into the set $L$ of all lines through the origin, and the range of $f$ is countable. But $L$ itself is uncountable: for example, for each $\theta \in [0, \pi]$ one can take the line through the origin and $(\cos\theta, \sin\theta, 0)$. Hence there is a line $l \in L$ not in the range of $f$. This means that $l$ does not pass through any point of $D$. Fix a direction in which to take rotations about $l$.
Note that if $P$ and $Q$ are distinct points of $D$, then there is at most one rotation about $l$ which takes $P$ to $Q$ and whose angle is between 0 and $2\pi$; this angle will be denoted by $\theta_{PQ}$, if it exists. Now let $A$ consist of all $\theta \in (0, 2\pi)$ such that there is a positive integer $n$ and a $P \in D$ such that if $\sigma$ is the rotation about $l$ through the angle $n\theta$, then $\sigma(P) \in D$. We claim that $A$ is countable. For, if $P, Q \in D$, $\theta_{PQ}$ is defined, $n \in \omega \setminus \{0\}$, $k \in \omega$, and $0 < n^{-1}(\theta_{PQ} + 2\pi k) < 2\pi$, then $n^{-1}(\theta_{PQ} + 2\pi k) \in A$; and every member of $A$ can be obtained this way. [Given $\theta \in A$, we have $n\theta = \theta_{PQ} + 2\pi k$ for some $P, Q \in D$ and $n, k \in \omega$.] This really defines a function from $D \times D \times (\omega \setminus \{0\}) \times \omega$ onto $A$, so $A$ is, indeed, countable. We choose $\theta \in (0, 2\pi) \setminus A$, and take the rotation $\rho$ about $l$ through the angle $\theta$. Let $\bar{D} = \bigcup_{n \in \omega} \rho^n[D]$. The choice of $\theta$ says that $\rho^n[D] \cap D = \emptyset$ for every positive integer $n$. Hence if $n < m < \omega$ we have $\rho^n[D] \cap \rho^m[D] = \emptyset$, since
$$\rho^n[D] \cap \rho^m[D] = \rho^n[D \cap \rho^{m-n}[D]] = \rho^n[\emptyset] = \emptyset.$$
Note that $\rho[\bar{D}] = \bar{D} \setminus D$. Hence
$$S^2 = \bar{D} \cup (S^2 \setminus \bar{D}) \sim_{G_1} \rho[\bar{D}] \cup (S^2 \setminus \bar{D}) = S^2 \setminus D.$$
Now let $G_2$ be the group of permutations of ${}^3\mathbb{R}$ generated by $\{\varphi, \psi, \rho\}$. We now have the first form of the Banach-Tarski paradox:
Theorem 5.12. (Banach, Tarski) $S^2$ is $G_2$-paradoxical.
Proof. This follows from Theorem 5.8, Lemma 5.11, and Lemma 5.10, since $G_0$ and $G_1$ are subgroups of $G_2$.
One can loosely state this theorem as follows: one can decompose $S^2$ into a finite number of pieces, rotate some of these pieces finitely many times with respect to certain lines through the origin to reassemble $S^2$, and then similarly transform some of the remaining pieces to also reassemble $S^2$. The rotations are of three kinds: the very specific rotations $\varphi$ and $\psi$ defined at the beginning of this section, and the rotation $\rho$ in the preceding proof, for which we do not have an explicit description. One can apply the inverses of these rotations as well. After doing the second reassembling, one can apply a translation to make that copy of $S^2$ disjoint from the first copy.
Finally, let $B = \{x \in {}^3\mathbb{R} : |x| \le 1\}$ be the unit ball in 3-space. Let $G_3$ be the group generated by $\varphi$, $\psi$, $\rho$, and the rotation $\chi$ about the line determined by $(0, 0, \frac12)$ and $(1, 0, \frac12)$, through the angle $\pi/\sqrt2$. Note that by the considerations at the beginning of this section, $\chi$ consists of the translation $(x\ y\ z)^T \mapsto (x\ y\ z - \frac12)^T$, followed by the rotation through $\pi/\sqrt2$ about the $x$-axis, followed by the translation $(x\ y\ z)^T \mapsto (x\ y\ z + \frac12)^T$.
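Lemma 5.13. For any positive integer $k$,
$$\chi^k \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ \frac12 \sin\frac{k\pi}{\sqrt2} \\ \frac12 - \frac12 \cos\frac{k\pi}{\sqrt2} \end{pmatrix}.$$
Hence $\chi^p (0\ 0\ 0)^T \ne (0\ 0\ 0)^T$ for every positive integer $p$.
Proof. The first equation is easily proved by induction on $k$. Then the inequality follows since for any positive integer $p$, the argument $\frac{p\pi}{\sqrt2}$ is never equal to $2m\pi$ for any integer $m$, since $\sqrt2$ is irrational.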
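The orbit of the origin can be computed numerically. The sketch below assumes the rotation is through the angle $\pi/\sqrt2$ about the line through $(0, 0, \frac12)$ parallel to the $x$-axis, applied as translate, rotate, translate back; the function names are hypothetical, and the closed form in `closed_form` is the one obtained from that assumption.

```python
import math

THETA = math.pi / math.sqrt(2)


def rotate_about_shifted_axis(p):
    """Translate down by 1/2, rotate about the x-axis, translate back up."""
    x, y, z = p
    z -= 0.5
    c, s = math.cos(THETA), math.sin(THETA)
    y, z = y * c - z * s, y * s + z * c
    return (x, y, z + 0.5)


def closed_form(k):
    """Image of the origin after k applications of the rotation."""
    return (0.0, 0.5 * math.sin(k * THETA), 0.5 - 0.5 * math.cos(k * THETA))


def orbit_of_origin(n):
    """The first n iterates of the origin (the origin itself excluded)."""
    points, p = [], (0.0, 0.0, 0.0)
    for _ in range(n):
        p = rotate_about_shifted_axis(p)
        points.append(p)
    return points


def norm(p):
    return math.sqrt(sum(c * c for c in p))


orbit = orbit_of_origin(50)
```

All iterates lie on the circle of radius $\frac12$ centered at $(0, 0, \frac12)$, hence inside the unit ball, and for these first 50 steps none of them returns to the origin. A floating-point check of this kind only illustrates the lemma; the irrationality argument is what covers all $k$.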


Theorem 5.14. (Banach, Tarski) $B$ is $G_3$-paradoxical.
Proof. By 5.12 there are pairwise disjoint subsets $A_i$ and $B_j$ of $S^2$ and members $g_i$, $h_j$ of $G_2$ for $i < m$ and $j < n$ such that $S^2 = \bigcup_{i<m} g_i[A_i] = \bigcup_{j<n} h_j[B_j]$. For each $i < m$ and $j < n$ let $A'_i = \{\alpha x : x \in A_i, 0 < \alpha \le 1\}$ and $B'_j = \{\alpha x : x \in B_j, 0 < \alpha \le 1\}$. Then
(1) The $A'_i$'s and $B'_j$'s are pairwise disjoint.
For example, suppose that $y \in A'_i \cap B'_j$. Then we can write $y = \alpha x$ with $x \in A_i$, $0 < \alpha \le 1$, and also $y = \beta z$ with $z \in B_j$, and $0 < \beta \le 1$. Hence $|y| = \alpha = \beta$. Hence $x = z$, contradiction.
(2) $B \setminus \{0\} = \bigcup_{i<m} g_i[A'_i] = \bigcup_{j<n} h_j[B'_j]$.
In fact, let $y \in B \setminus \{0\}$. Let $x = y/|y|$. Then $x \in S^2$, so there is an $i < m$ such that $x \in g_i[A_i]$. Say that $x = g_i(z)$ with $z \in A_i$. Then $|y|z \in A'_i$, and $g_i(|y|z) = |y|g_i(z) = |y|x = y$. So $y \in g_i[A'_i]$. This proves the first equality in (2), and the second equality is proved similarly.
So far, we have shown that $B \setminus \{0\}$ is $G_2$-paradoxical. Now we show that $B$ and $B \setminus \{0\}$ are finitely $G_3$-equidecomposable, which by Lemma 5.10 will finish the proof. Let $D = \{\chi^k(0) : k \in \omega\}$, where $0$ is the origin. Then $D \subseteq B$, and by Lemma 5.13 the points $\chi^k(0)$ are pairwise distinct, so that $\chi[D] = D \setminus \{0\}$. Hence
$$B = D \cup (B \setminus D) \sim_{G_3} \chi[D] \cup (B \setminus D) = B \setminus \{0\}.$$
This proves the desired equidecomposability.
As in the case of $S^2$, a translation can be made if one wants one copy of $B$ to be disjoint from the other.
EXERCISES
In the first four exercises, we assume elementary background and ask for the proofs of
some standard mathematical facts that require the axiom of choice.
E5.1. Show that any vector space over a field has a basis (possibly infinite).
E5.2. A subset $C$ of $\mathbb{R}$ is closed iff the following condition holds:
For every sequence $f \in {}^\omega C$, if $f$ converges to a real number $x$, then $x \in C$.
Here to say that $f$ converges to $x$ means that
$$\forall \varepsilon > 0\, \exists M\, \forall m \ge M\,[|f_m - x| < \varepsilon].$$
Prove that if $\langle C_m : m \in \omega \rangle$ is a sequence of nonempty closed subsets of $\mathbb{R}$, $\forall m \in \omega\, \forall x, y \in C_m\,[|x - y| < 1/(m + 1)]$, and $C_m \supseteq C_n$ for $m < n$, then $\bigcap_{m \in \omega} C_m$ is nonempty. Hint: use the Cauchy convergence criterion.
E5.3. Prove that every nontrivial commutative ring with identity has a maximal ideal. Nontrivial means that $0 \ne 1$. Only very elementary definitions and facts are needed here; they can be found in most abstract algebra books. Hint: use Zorn's lemma.
E5.4. A function $g : \mathbb{R} \to \mathbb{R}$ is continuous at $a \in \mathbb{R}$ iff for every sequence $f \in {}^\omega\mathbb{R}$ which converges to $a$, the sequence $g \circ f$ converges to $g(a)$. (See Exercise E5.2.) Show that $g$ is continuous at $a$ iff the following condition holds:
$$\forall \varepsilon > 0\, \exists \delta > 0\, \forall x \in \mathbb{R}\,[|x - a| < \delta \to |g(x) - g(a)| < \varepsilon].$$
Hint: for $\Rightarrow$, argue by contradiction.
E5.5. Show by induction on $m$, without using the axiom of choice, that if $m \in \omega$ and $\langle A_i : i \in m \rangle$ is a system of nonempty sets, then there is a function $f$ with domain $m$ such that $f(i) \in A_i$ for all $i \in m$.
E5.6. Using AC, prove the following, which is called the Principle of Dependent Choice (which is weaker than the axiom of choice, but cannot be proved in ZF). If $A$ is a nonempty set, $R$ is a relation, $R \subseteq A \times A$, and for every $a \in A$ there is a $b \in A$ such that $aRb$, then there is a function $f : \omega \to A$ such that $f(i) R f(i + 1)$ for all $i \in \omega$.
The remaining exercises outline proofs of some equivalents to the axiom of choice; so each exercise states something provable in ZF. We are interested in the following statements.
(1) If $<$ is a partial ordering and $\prec$ is a simple ordering which is a subset of $<$, then there is a maximal (under $\subseteq$) simple ordering $\ll$ such that $\prec$ is a subset of $\ll$, which in turn is a subset of $<$.
(2) For any two sets $A$ and $B$, either there is a one-one function mapping $A$ into $B$ or there is a one-one function mapping $B$ into $A$.
(3) For any two nonempty sets $A$ and $B$, either there is a function mapping $A$ onto $B$ or there is a function mapping $B$ onto $A$.
(4) A family $F$ of subsets of a set $A$ has finite character if for all $X \subseteq A$, $X \in F$ iff every finite subset of $X$ is in $F$. Principle (4) says that every family of finite character has a maximal element under $\subseteq$.
(5) For any relation $R$ there is a function $f \subseteq R$ such that $\mathrm{dmn}\, R = \mathrm{dmn}\, f$.
E5.7. Show that the axiom of choice implies (1). [Use Zorn's lemma.]
E5.8. Prove that (1) implies (2). [Given sets $A$ and $B$, define $f < g$ iff $f$ and $g$ are one-one functions which are subsets of $A \times B$, and $f \subset g$. Apply (1) to $<$ and the empty simple ordering.]
E5.9. Prove that (2) implies (3). [Easy]
E5.10. Show in ZF that for any set $A$ there is an ordinal $\alpha$ such that there is no one-one function mapping $\alpha$ into $A$. Hint: consider all well-orderings contained in $A \times A$.
E5.11. Prove that (3) implies the axiom of choice. [Show that any set $A$ can be well-ordered, as follows. Use Exercise E5.10 to find an ordinal $\alpha$ which cannot be mapped one-one into $\mathcal{P}(A)$. Show that if $f : A \to \alpha$ maps onto $\alpha$, then $\langle f^{-1}[\{\beta\}] : \beta < \alpha \rangle$ is a one-one function from $\alpha$ into $\mathcal{P}(A)$.]
E5.12. Show that the axiom of choice implies (4). [Use Zorn's lemma.]
E5.13. Show that (4) implies (5). [Given a relation $R$, let $F$ consist of all functions contained in $R$.]
E5.14. Show that (5) implies the axiom of choice. [Given a family $\langle A_i : i \in I \rangle$ of nonempty sets, let $R = \{(i, x) : i \in I \text{ and } x \in A_i\}$.]


6. Cardinals
This chapter is concerned with the basics of cardinal arithmetic.
Definition and basic properties
A cardinal, or cardinal number, is an ordinal $\kappa$ such that there is no smaller ordinal $\alpha$
which can be put in one-one correspondence with $\kappa$. We generally use Greek letters $\kappa$, $\lambda$,
$\mu$ for cardinals. Obviously if $\kappa$ and $\lambda$ are distinct cardinals, then they cannot be put in
one-one correspondence with each other.
Proposition 6.1. For any set $X$ there is an ordinal $\alpha$ which can be put in one-one
correspondence with $X$.
Proof. By the well-ordering principle, let $<$ be a well-ordering of $X$. Then $X$ under
$<$ is isomorphic to an ordinal.
By this proposition, any set can be put in one-one correspondence with a cardinal, namely
the least ordinal that is in one-one correspondence with the set. This justifies the following
definition. For any set $X$, the cardinality, or size, or magnitude, etc. of $X$ is the unique
cardinal $|X|$ which can be put in one-one correspondence with $X$. The basic property of
this definition is given in the following theorem.
Theorem 6.2. For any sets X and Y , the following conditions are equivalent:
(i) |X| = |Y |.
(ii) There is a one-one function mapping X onto Y .
The following proposition gives obvious facts about the particular way that we have defined
the notion of cardinality.
Proposition 6.3. For any ordinal $\alpha$: (i) $|\alpha| \le \alpha$.
(ii) $|\alpha| = \alpha$ iff $\alpha$ is a cardinal.
Proposition 6.4. Every natural number is a cardinal.
Proof. We prove by ordinary induction on $n$ that for every natural number $n$ and
for every natural number $m$, if $m < n$ then there is no bijection from $n$ to $m$. This is
vacuously true for $n = 0$. Now assume it for $n$, but suppose that $m$ is a natural number
less than $n + 1$ and $f$ is a bijection from $n + 1$ onto $m$. Since $n + 1 \ne 0$, obviously also
$m \ne 0$. So $m = m' + 1$ for some natural number $m'$. [An easy induction shows that every
nonzero natural number is a successor ordinal.] Let $g$ be the bijection from $m$ onto $m$
which interchanges $m'$ and $f(n)$ and leaves fixed all other elements of $m$. Then $g \circ f$ is a
bijection from $n + 1$ onto $m$ which takes $n$ to $m'$. Hence $(g \circ f) \upharpoonright n$ is a bijection from $n$
onto $m'$, and $m' < n$, contradicting the inductive hypothesis.
Thus the natural numbers are the first cardinals, in the ordering of cardinals determined
by the fact that they are special kinds of ordinals. A set is finite iff it can be put in one-one
correspondence with some natural number; otherwise it is infinite. The following general
lemma helps to prove that $\omega$ is the next cardinal. It is easily proved by induction.
Lemma 6.5. If $(A, <)$ is a simple ordering, then every finite nonempty subset of $A$
has a greatest element.
Theorem 6.6. $\omega$ is a cardinal.
It is harder to find larger cardinals, but they exist; in fact the collection of cardinals is so
big that, like the collection of ordinals, it does not exist as a set. We will see this a little
bit later.
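Note that a cardinal is infinite iff it is greater than or equal to $\omega$. The following fact will be
useful later.
Proposition 6.7. Every infinite cardinal is a limit ordinal.
Proof. Suppose not: $\kappa$ is an infinite cardinal, and $\kappa = \alpha + 1$. We define $f : \alpha \to \kappa$ as
follows: $f(0) = \alpha$, $f(m + 1) = m$ for all $m \in \omega$, and $f(\beta) = \beta$ for all $\beta \in \alpha \setminus \omega$. Clearly $f$
is one-one and maps onto $\kappa$. So the smaller ordinal $\alpha$ is in one-one correspondence with $\kappa$, contradiction.
Lemma 6.8. If $\kappa$ and $\lambda$ are cardinals and $f : \kappa \to \lambda$ is one-one, then $\kappa \le \lambda$.
Proof. We define $\xi \prec \eta$ iff $\xi, \eta \in \kappa$ and $f(\xi) < f(\eta)$. Clearly $\prec$ well-orders $\kappa$. Let $g$
be an isomorphism from $(\kappa, \prec)$ onto an ordinal $\alpha$. Thus $\kappa \le \alpha$ by the definition of cardinals.
If $\beta < \gamma < \alpha$, then $g^{-1}(\beta) \prec g^{-1}(\gamma)$, hence by definition of $\prec$, $f(g^{-1}(\beta)) < f(g^{-1}(\gamma))$.
Thus $f \circ g^{-1} : \alpha \to \lambda$ is strictly increasing. Hence $\alpha \le \lambda$. We already know that $\kappa \le \alpha$, so
$\kappa \le \lambda$.
The purpose of this lemma is to prove the following basic theorem.
Theorem 6.9. If $A \subseteq B$, then $|A| \le |B|$.
Proof. Let $\kappa = |A|$, $\lambda = |B|$, and let $f$ and $g$ be one-one functions from $\kappa$ onto $A$ and
from $\lambda$ onto $B$, respectively. Then $g^{-1} \circ f$ is a one-one function from $\kappa$ into $\lambda$, so $\kappa \le \lambda$.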
Corollary 6.10. For any sets $A$ and $B$ the following conditions are equivalent:
(i) $|A| \le |B|$.
(ii) There is a one-one function mapping $A$ into $B$.
(iii) $A = 0$, or there is a function mapping $B$ onto $A$.
Corollary 6.11. If there is a one-one function from $A$ into $B$ and a one-one function
from $B$ into $A$, then there is a one-one function from $A$ onto $B$.
This corollary is called the Cantor-Bernstein, or Schröder-Bernstein theorem. Our proof,
if traced back, involves the axiom of choice. It can be proved without the axiom of choice,
and this is sometimes desirable when describing a small portion of set theory to students.
Some exercises outline such a proof.
The following simple theorem is very important and basic for the theory of cardinals.
It embodies in perhaps its simplest form the Cantor diagonal argument.
Theorem 6.12. For any set $A$ we have $|A| < |\mathscr{P}(A)|$.
Proof. The function given by $a \mapsto \{a\}$ is a one-one function from $A$ into $\mathscr{P}(A)$, and
so $|A| \le |\mathscr{P}(A)|$. [The notation $a \mapsto \{a\}$ gives the value of the function at the argument
$a$.] Suppose equality holds. Then there is a one-one function $f$ mapping $A$ onto $\mathscr{P}(A)$. Let
$X = \{a \in A : a \notin f(a)\}$. Since $f$ maps onto $\mathscr{P}(A)$, choose $a_0 \in A$ such that $f(a_0) = X$.
Then $a_0 \in X$ iff $a_0 \notin X$, contradiction.
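The diagonal construction in this proof can be checked exhaustively on small finite sets. The following is only an illustrative sketch: the names `diagonal_set` and `all_maps_miss_diagonal` are ours, and a finite check is of course no substitute for the proof.

```python
from itertools import product

def diagonal_set(A, f):
    """Return X = {a in A : a not in f(a)} for a map f from A to subsets of A."""
    return frozenset(a for a in A if a not in f[a])

def all_maps_miss_diagonal(A):
    """Check, for every map f : A -> P(A), that the diagonal set is not a value of f."""
    A = list(A)
    subsets = [frozenset(a for a, bit in zip(A, bits) if bit)
               for bits in product([0, 1], repeat=len(A))]
    for values in product(subsets, repeat=len(A)):
        f = dict(zip(A, values))
        if diagonal_set(A, f) in values:
            return False
    return True
```

For a three-element set this runs through all 512 maps into the power set and finds that none of them hits its own diagonal set, so none is onto.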
By this theorem, for every ordinal $\alpha$ there is a larger cardinal, namely $|\mathscr{P}(\alpha)|$. Hence we
can define $\alpha^+$ to be the least cardinal $> \alpha$. Cardinals of the form $\kappa^+$ are called successor
cardinals; other infinite cardinals are called limit cardinals. Is $\kappa^+ = |\mathscr{P}(\kappa)|$? The statement
that this is true for every infinite cardinal $\kappa$ is the famous generalized continuum hypothesis
(GCH). The weaker statement that $\omega^+ = |\mathscr{P}(\omega)|$ is the continuum hypothesis (CH).
It can be shown that the generalized continuum hypothesis is consistent with our
axioms. But also its negation is consistent; in fact, the negation of the weaker continuum
hypothesis is consistent. All of this under the assumption that our axioms are consistent.
(It is not possible to prove this consistency.)
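Theorem 6.13. If $\Gamma$ is a set of cardinals, then $\bigcup\Gamma$ is also a cardinal.
Proof. We know already that $\bigcup\Gamma$ is an ordinal. Suppose that $\kappa \overset{\mathrm{def}}{=} |\bigcup\Gamma| < \bigcup\Gamma$. By
definition of $\bigcup$, there is a $\lambda \in \Gamma$ such that $\kappa < \lambda$. (Membership is the same as $<$.) Now
$\lambda \subseteq \bigcup\Gamma$. So $\lambda = |\lambda| \le |\bigcup\Gamma| = \kappa$, contradiction.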
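We can now define the standard sequence of infinite cardinal numbers, by transfinite recursion:
\[
\aleph_0 = \omega; \qquad
\aleph_{\alpha+1} = \aleph_\alpha^+; \qquad
\aleph_\gamma = \bigcup_{\beta<\gamma} \aleph_\beta \quad \text{for } \gamma \text{ a limit ordinal.}
\]
For historical reasons, one sometimes writes $\omega_\alpha$ in place of $\aleph_\alpha$. Now we get the following
two results.
Lemma 6.14. If $\alpha < \beta$, then $\aleph_\alpha < \aleph_\beta$.
Lemma 6.15. $\alpha \le \aleph_\alpha$ for every ordinal $\alpha$.
Theorem 6.16. For every infinite cardinal $\kappa$ there is an ordinal $\alpha$ such that $\kappa = \aleph_\alpha$.
Proof. Let $\kappa$ be any infinite cardinal. Then $\kappa \le \aleph_\kappa < \aleph_{\kappa+1}$. Here $\kappa + 1$ refers to
ordinal addition. This shows that there is an ordinal $\alpha$ such that $\kappa < \aleph_\alpha$; choose the least
such $\alpha$. Clearly $\alpha \ne 0$ and $\alpha$ is not a limit ordinal. Say $\alpha = \beta + 1$. Then $\aleph_\beta \le \kappa < \aleph_{\beta+1}$,
so $\kappa = \aleph_\beta$.
We can now say a little more about the continuum hypothesis. Not only is it consistent that
it fails, but it is even consistent that $|\mathscr{P}(\omega)| = \aleph_2$, or $|\mathscr{P}(\omega)| = \aleph_{17}$, or $|\mathscr{P}(\omega)| = \aleph_{\omega+1}$;
the possibilities have been spelled out in great detail. Some impossible situations are
$|\mathscr{P}(\omega)| = \aleph_\omega$ and $|\mathscr{P}(\omega)| = \aleph_{\omega+\omega}$; we will establish this later in this chapter.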
Addition of cardinals
Let $\kappa$ and $\lambda$ be cardinals. We define
\[
\kappa + \lambda = |\{(\alpha, 0) : \alpha \in \kappa\} \cup \{(\alpha, 1) : \alpha \in \lambda\}|.
\]
The idea is to take disjoint copies $\kappa \times \{0\}$ and $\lambda \times \{1\}$ of $\kappa$ and $\lambda$ and count the number
of elements in their union.
Two immediate remarks should be made about this definition. First of all, this is not,
in general, the same as the ordinal sum $\kappa + \lambda$. We depend on the context to distinguish
the two notions of addition. For example, $\omega + 1 = \omega$ in the cardinal sense, but not in
the ordinal sense. In fact, we know that $\omega < \omega + 1$ in the ordinal sense. To show that
$\omega + 1 = \omega$ in the cardinal sense, it suffices to define a one-one function from $\omega$ onto the set
$\{(m, 0) : m \in \omega\} \cup \{(0, 1)\}$.
Let $f(0) = (0, 1)$ and $f(m + 1) = (m, 0)$ for any $m \in \omega$.
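The map just described can be written down directly. The sketch below (with a hypothetical cutoff `N`, since only finitely many values can be tested) checks on an initial segment that it is one-one and that the obvious inverse undoes it.

```python
def f(n):
    """The map from the text: f(0) = (0, 1) and f(m + 1) = (m, 0)."""
    return (0, 1) if n == 0 else (n - 1, 0)

def f_inverse(pair):
    """Inverse map, defined on the pairs (m, 0) for natural m together with (0, 1)."""
    value, tag = pair
    return 0 if tag == 1 else value + 1

N = 1000  # arbitrary cutoff for the finite check
images = [f(n) for n in range(N)]
```

Every pair `(m, 0)` is reached from `m + 1` and the extra point `(0, 1)` is reached from `0`, which is why the disjoint union has no more elements than the natural numbers themselves.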
Secondly, the definition is consistent with our definition of addition for natural numbers (as a special case of ordinal addition), and thus it does coincide with ordinal addition
when restricted to $\omega$:
Proposition 6.17. If $m$ and $n$ are natural numbers, then addition in the sense of
chapter 2 and in the cardinal number sense are the same.
Proof. By induction on $n$.
Aside from simple facts about addition, there is the remarkable fact that $\kappa + \kappa = \kappa$ for
every infinite cardinal $\kappa$. We shall prove this as a consequence of the similar result for
multiplication.
The definition of cardinal addition can be extended to infinite sums, and very elementary properties of the binary sum are then special cases of more general results; so we
proceed with the general definition. Let $\langle \kappa_i : i \in I \rangle$ be a system of cardinals (this just
means that $\kappa$ is a function with domain $I$ whose values are always cardinals). Then we
define
\[
\sum_{i \in I} \kappa_i = \Bigl| \bigcup_{i \in I} (\kappa_i \times \{i\}) \Bigr|.
\]
This is a generalization of summing two cardinals, as is immediate from the definitions:
Proposition 6.18. If $\langle \kappa_i : i \in 2 \rangle$ is a system of cardinals (meaning that $\kappa$ is a function
with domain 2 such that both $\kappa_0$ and $\kappa_1$ are cardinals), then $\sum_{i \in 2} \kappa_i = \kappa_0 + \kappa_1$.
The following is easily proved by induction on $|I|$:
Proposition 6.19. If $\langle m_i : i \in I \rangle$ is a system of natural numbers, with $I$ finite, then
$\sum_{i \in I} m_i$ is a natural number.
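For finite systems the definition of the sum can be computed literally. The sketch below (the function name `cardinal_sum` and the dict encoding of a system are our own choices) tags each element with its index before taking the union, which is exactly what keeps the copies disjoint.

```python
def cardinal_sum(kappas):
    """Size of the union of the sets kappa_i x {i}, for a finite system given as a dict i -> kappa_i."""
    return len({(a, i) for i, k in kappas.items() for a in range(k)})

tagged = cardinal_sum({0: 3, 1: 3})
untagged = len(set(range(3)) | set(range(3)))
```

Without the tags the two copies of 3 collapse into one, so `untagged` undercounts; with the tags the result agrees with ordinary addition of natural numbers.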
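We mention some important but easy facts concerning the cardinalities of unions:
Proposition 6.20. If $\langle A_i : i \in I \rangle$ is a system of pairwise disjoint sets, then
$\bigl| \bigcup_{i \in I} A_i \bigr| = \sum_{i \in I} |A_i|$.
Proposition 6.21. If $\langle A_i : i \in I \rangle$ is any system of sets, then $\bigl| \bigcup_{i \in I} A_i \bigr| \le \sum_{i \in I} |A_i|$.
Corollary 6.22. If $\langle \kappa_i : i \in I \rangle$ is a system of cardinals, then
\[
\bigcup_{i \in I} \kappa_i \le \sum_{i \in I} \kappa_i.
\]
Finally, we gather together some simple arithmetic of infinite sums:
Proposition 6.23. (i) $\sum_{i \in I} 0 = 0$.
(ii) $\sum_{i \in 0} \kappa_i = 0$.
(iii) $\sum_{i \in I} \kappa_i = \sum_{i \in I,\ \kappa_i \ne 0} \kappa_i$.
(iv) If $I \subseteq J$, then $\sum_{i \in I} \kappa_i \le \sum_{i \in J} \kappa_i$.
(v) If $\kappa_i \le \lambda_i$ for all $i \in I$, then $\sum_{i \in I} \kappa_i \le \sum_{i \in I} \lambda_i$.
(vi) $\sum_{i \in I} 1 = |I|$.
(vii) If $\kappa$ is infinite, then $\kappa + 1 = \kappa$.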
Multiplication of cardinals
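By definition,
\[
\kappa \cdot \lambda = |\kappa \times \lambda|.
\]
The following simple result can be used in verifying many simple facts concerning products.
Two sets are equipotent iff there is a bijection between them.
Proposition 6.24. If $A$ is equipotent with $C$ and $B$ is equipotent with $D$, then $A \times B$
is equipotent with $C \times D$.
Proposition 6.25. (i) $\kappa \cdot \lambda = \lambda \cdot \kappa$;
(ii) $\kappa \cdot (\lambda \cdot \mu) = (\kappa \cdot \lambda) \cdot \mu$;
(iii) $\kappa \cdot (\lambda + \mu) = \kappa \cdot \lambda + \kappa \cdot \mu$;
(iv) $\kappa \cdot 0 = 0$;
(v) $\kappa \cdot 1 = \kappa$;
(vi) $\kappa \cdot 2 = \kappa + \kappa$;
(vii) $\sum_{i \in I} \kappa = \kappa \cdot |I|$;
(viii) If $\kappa \le \mu$ and $\lambda \le \nu$, then $\kappa \cdot \lambda \le \mu \cdot \nu$.
Proposition 6.26. Multiplication of natural numbers means the same in the cardinal
number sense as in the ordinal sense.
The basic theorem about multiplication of infinite cardinals is as follows.
Theorem 6.27. $\kappa \cdot \kappa = \kappa$ for every infinite cardinal $\kappa$.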
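Proof. Suppose not, and let $\kappa$ be the least infinite cardinal such that $\kappa \cdot \kappa \ne \kappa$. Then
$\kappa = \kappa \cdot 1 \le \kappa \cdot \kappa$, and so $\kappa < \kappa \cdot \kappa$. We now define a relation $\prec$ on $\kappa \times \kappa$. For all $\alpha, \beta, \gamma, \delta \in \kappa$,
\[
(\alpha, \beta) \prec (\gamma, \delta) \quad\text{iff}\quad
\begin{cases}
\max(\alpha, \beta) < \max(\gamma, \delta), \\
\text{or } \max(\alpha, \beta) = \max(\gamma, \delta) \text{ and } \alpha < \gamma, \\
\text{or } \max(\alpha, \beta) = \max(\gamma, \delta) \text{ and } \alpha = \gamma \text{ and } \beta < \delta.
\end{cases}
\]
Clearly this is a well-order. It follows that $(\kappa \times \kappa, \prec)$ is isomorphic to an ordinal $\varphi$; let $f$
be the isomorphism. We have $|\varphi| = |\kappa \times \kappa| = \kappa \cdot \kappa > \kappa$ by the remark at the beginning of
this proof. So $\kappa < \varphi$. Therefore there exist $\alpha, \beta \in \kappa$ such that $f(\alpha, \beta) = \kappa$. Now
\[
f[\{(\gamma, \delta) \in \kappa \times \kappa : (\gamma, \delta) \prec (\alpha, \beta)\}] = \kappa,
\]
so, with $\varepsilon = \max(\alpha, \beta) + 1$,
\[
\kappa = |\{(\gamma, \delta) : (\gamma, \delta) \prec (\alpha, \beta)\}| \le |\varepsilon \times \varepsilon| = |\varepsilon| \cdot |\varepsilon|.
\]
But $\varepsilon < \kappa$, since $\kappa$ is a limit ordinal by Proposition 6.7; so either $\varepsilon$ is finite, and $|\varepsilon| \cdot |\varepsilon|$ is then also finite, or else $\varepsilon$ is infinite, and
$|\varepsilon| \cdot |\varepsilon| = |\varepsilon|$ by the minimality of $\kappa$. In any case, $|\varepsilon| \cdot |\varepsilon| < \kappa$, contradiction.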
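The well-order used in this proof compares pairs first by their maximum, then by first coordinate, then by second. A small sketch on pairs of natural numbers (the helper names `key` and `predecessors` are ours) makes the key step visible: everything below a given pair lies inside a square whose side is one more than the larger coordinate.

```python
def key(pair):
    """Sort key realizing the order from the proof: max first, then first coordinate, then second."""
    a, b = pair
    return (max(a, b), a, b)

def predecessors(pair, bound):
    """All pairs of naturals below `pair` in that order, searched inside bound x bound."""
    return [(c, d) for c in range(bound) for d in range(bound)
            if key((c, d)) < key(pair)]

first_pairs = sorted(((a, b) for a in range(3) for b in range(3)), key=key)
```

Here `first_pairs` lists the nine pairs over {0, 1, 2} in the order of the proof; the pairs with maximum 1 all come before any pair with maximum 2.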
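Corollary 6.28. If $\kappa$ and $\lambda$ are nonzero cardinals and at least one of them is infinite,
then $\kappa + \lambda = \kappa \cdot \lambda = \max(\kappa, \lambda)$.
Corollary 6.29. If $\langle A_i : i \in I \rangle$ is any system of sets, then
\[
\Bigl| \bigcup_{i \in I} A_i \Bigr| \le |I| \cdot \bigcup_{i \in I} |A_i|.
\]
A set $A$ is countable if $|A| \le \omega$. So another corollary is
Corollary 6.30. A countable union of countable sets is countable.
Proposition 6.31. If $\langle \kappa_i : i \in I \rangle$ is a system of nonzero cardinals, and either $I$ is
infinite or some $\kappa_i$ is infinite, then $\sum_{i \in I} \kappa_i = |I| \cdot \bigcup_{i \in I} \kappa_i$.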
By the above results, the binary operations of addition and multiplication of cardinals are
trivial when applied to infinite cardinals; and the infinite sum is also easy to calculate.
We now introduce infinite products which, as we shall see, are not so trivial. We need the
following standard elementary notion: for $\langle A_i : i \in I \rangle$ a family of sets, we define
\[
\prod_{i \in I} A_i = \{f : f \text{ is a function, } \mathrm{dmn}(f) = I, \text{ and } \forall i \in I\,[f(i) \in A_i]\}.
\]
This is the cartesian product of the sets $A_i$. Now if $\langle \kappa_i : i \in I \rangle$ is a system of cardinals, we
define
\[
\prod_{i \in I} \kappa_i = \Bigl| \prod_{i \in I} \kappa_i \Bigr|.
\]
Here on the right $\prod_{i \in I} \kappa_i$ is the cartesian product of the cardinals $\kappa_i$, and on the left is the
defined product of them, a certain new cardinal. We depend on the context to distinguish
these two uses of $\prod$.
Some elementary properties of this notion are summarized in the following proposition.
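Proposition 6.32. (i) $\bigl| \prod_{i \in I} A_i \bigr| = \prod_{i \in I} |A_i|$.
(ii) If $\kappa_i = 0$ for some $i \in I$, then $\prod_{i \in I} \kappa_i = 0$.
(iii) $\prod_{i \in 0} \kappa_i = 1$.
(iv) $\prod_{i \in I} \kappa_i = \prod_{i \in I,\ \kappa_i \ne 1} \kappa_i$.
(v) $\prod_{i \in I} 1 = 1$.
(vi) If $\kappa_i \le \lambda_i$ for all $i \in I$, then $\prod_{i \in I} \kappa_i \le \prod_{i \in I} \lambda_i$.
(vii) $\prod_{i \in 2} \kappa_i = \kappa_0 \cdot \kappa_1$.
Proof. Most of these facts are very easy. We give the proof for (i). According to
the definition of product, it suffices to find a one-one function $g$ mapping $\prod_{i \in I} A_i$ onto
$\prod_{i \in I} |A_i|$. For each $i \in I$, let $f_i$ be a one-one function mapping $A_i$ onto $|A_i|$. (We are using
the axiom of choice here.) Then for each $x \in \prod_{i \in I} A_i$ and each $i \in I$ let $g(x)_i = f_i(x_i)$. It
is easily checked that $g$ is as desired.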
General commutative, associative, and distributive laws hold also:
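Proposition 6.33. (Commutative law) If $\langle \kappa_i : i \in I \rangle$ is a system of cardinals and
$f : I \to I$ is one-one and onto, then
\[
\prod_{i \in I} \kappa_i = \prod_{i \in I} \kappa_{f(i)}.
\]
Proposition 6.34. (Associative law) If $\langle \kappa_{ij} : (i, j) \in I \times J \rangle$ is a system of cardinals,
then
\[
\prod_{i \in I} \prod_{j \in J} \kappa_{ij} = \prod_{(i,j) \in I \times J} \kappa_{ij}.
\]
Proposition 6.35. (Distributive law) If $\langle \kappa_i : i \in I \rangle$ is a system of cardinals and $\lambda$ is a cardinal, then
\[
\lambda \cdot \sum_{i \in I} \kappa_i = \sum_{i \in I} (\lambda \cdot \kappa_i).
\]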
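Theorem 6.36. (König) Suppose that $\langle \kappa_i : i \in I \rangle$ and $\langle \lambda_i : i \in I \rangle$ are systems of
cardinals such that $\kappa_i < \lambda_i$ for all $i \in I$. Then
\[
\sum_{i \in I} \kappa_i < \prod_{i \in I} \lambda_i.
\]
Proof. The proof is another instance of Cantor's diagonal argument. Suppose that
this is not true; thus $\prod_{i \in I} \lambda_i \le \sum_{i \in I} \kappa_i$. It follows that there is a one-one function $f$
mapping $\prod_{i \in I} \lambda_i$ into $\{(\alpha, i) : i \in I, \alpha < \kappa_i\}$. For each $i \in I$ let
\[
K_i = \{(f^{-1}(\alpha, i))_i : \alpha < \kappa_i, (\alpha, i) \in \mathrm{rng}(f)\}.
\]
Clearly $K_i \subseteq \lambda_i$. Now $|K_i| \le \kappa_i < \lambda_i$, so we can choose $x_i \in \lambda_i \setminus K_i$ (using the axiom of
choice). Say $f(x) = (\alpha, i)$. Then $x_i = (f^{-1}(\alpha, i))_i \in K_i$, contradiction.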
Exponentiation of cardinals
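We define
\[
\kappa^\lambda = |{}^{\lambda}\kappa|.
\]
The elementary arithmetic of exponentiation is summarized in the following proposition:
Proposition 6.37. (i) $\kappa^0 = 1$.
(ii) If $\kappa \ne 0$, then $0^\kappa = 0$.
(iii) $\kappa^1 = \kappa$.
(iv) $1^\kappa = 1$.
(v) $\kappa^2 = \kappa \cdot \kappa$.
(vi) $\kappa^\lambda \cdot \kappa^\mu = \kappa^{\lambda + \mu}$.
(vii) $(\kappa \cdot \lambda)^\mu = \kappa^\mu \cdot \lambda^\mu$.
(viii) $(\kappa^\lambda)^\mu = \kappa^{\lambda \cdot \mu}$.
(ix) If $\kappa \le \lambda \ne 0$ and $\mu \le \nu$, then $\kappa^\mu \le \lambda^\nu$.
(x) $\prod_{i \in I} \kappa = \kappa^{|I|}$.
(xi) $\kappa^{\sum_{i \in I} \lambda_i} = \prod_{i \in I} \kappa^{\lambda_i}$.
(xii) $\bigl( \prod_{i \in I} \kappa_i \bigr)^\lambda = \prod_{i \in I} \kappa_i^\lambda$.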
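Proof. The proofs are straightforward. We give the proof of (xi) as an illustration.
By the definitions of arbitrary sums and products it suffices to find a bijection $f$ from the
set
\[
{}^{\bigcup\{\lambda_i \times \{i\} : i \in I\}}\kappa \tag{$*$}
\]
to the set $\prod_{i \in I} ({}^{\lambda_i}\kappa)$, where the product here is just the product of sets. So take any
member $x$ of ($*$). We define $f(x)$ in $\prod_{i \in I} ({}^{\lambda_i}\kappa)$ by defining its value $(f(x))_i$ for each $i \in I$;
and we define $(f(x))_i$ in ${}^{\lambda_i}\kappa$ by defining its value $((f(x))_i)(\alpha)$ for each $\alpha \in \lambda_i$. Since
$(\alpha, i) \in \lambda_i \times \{i\}$, the pair $(\alpha, i)$ is in the domain of $x$. We define $((f(x))_i)(\alpha) = x(\alpha, i)$.
$f$ is one-one: suppose that $x, y \in (*)$ and $f(x) = f(y)$. Take any $u \in \mathrm{dmn}(x)$; say
$u = (\alpha, i)$ with $i \in I$ and $\alpha \in \lambda_i$. Then $x(u) = x(\alpha, i) = ((f(x))_i)(\alpha) = ((f(y))_i)(\alpha) =
y(\alpha, i) = y(u)$.
$f$ maps onto $\prod_{i \in I} ({}^{\lambda_i}\kappa)$: take any $g \in \prod_{i \in I} ({}^{\lambda_i}\kappa)$. Define $x \in (*)$ by setting, for any $i \in I$
and $\alpha \in \lambda_i$, $x(\alpha, i) = (g(i))(\alpha)$. Then $((f(x))_i)(\alpha) = x(\alpha, i) = (g(i))(\alpha)$, and since $i$ and
$\alpha$ are arbitrary we get $f(x) = g$.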
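The bijection in this proof is a form of currying, and for small finite values it can be tested by brute force. In the sketch below the names `curry` and `uncurry` and the sample system `lam` are illustrative assumptions; functions are encoded as dicts and tuples.

```python
from itertools import product

def curry(x, lam):
    """Send x, defined on pairs (alpha, i) with alpha < lam[i], to the family i -> (alpha -> x(alpha, i))."""
    return {i: tuple(x[(a, i)] for a in range(lam[i])) for i in lam}

def uncurry(g, lam):
    """Inverse direction: recover x from the family g by x(alpha, i) = g(i)(alpha)."""
    return {(a, i): g[i][a] for i in lam for a in range(lam[i])}

kappa, lam = 3, {0: 2, 1: 1}
domain = [(a, i) for i in lam for a in range(lam[i])]
all_x = [dict(zip(domain, values))
         for values in product(range(kappa), repeat=len(domain))]
images = {tuple(sorted(curry(x, lam).items())) for x in all_x}
```

With these sample values there are 27 functions on the tagged union, and their images are 27 distinct families, matching the count 9 times 3 on the product side.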
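Proposition 6.38. If $m, n \in \omega$, then $m^n \in \omega$.
Proposition 6.39. $|\mathscr{P}(A)| = 2^{|A|}$.
Proof. For each $X \subseteq A$ define $\chi_X \in {}^{A}2$ by setting
\[
\chi_X(a) =
\begin{cases}
1 & \text{if } a \in X, \\
0 & \text{otherwise.}
\end{cases}
\]
[This is the characteristic function of $X$.] It is easy to check that $\chi$ is a bijection from
$\mathscr{P}(A)$ onto ${}^{A}2$.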
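For a finite set the correspondence between subsets and characteristic functions can be listed in full. This is a small sketch with names of our choosing (`chi`, `subsets`), encoding a function from a three-element set into 2 as a tuple of bits.

```python
from itertools import combinations, product

def chi(X, A):
    """Characteristic function of a subset X of A, stored as a tuple indexed like A."""
    return tuple(1 if a in X else 0 for a in A)

def subsets(A):
    """All subsets of the finite sequence A, as frozensets."""
    return [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]

A = ("a", "b", "c")
images = {X: chi(X, A) for X in subsets(A)}
```

Distinct subsets get distinct bit tuples, and every bit tuple arises, which is the finite case of the bijection.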
The calculation of exponentiation is not as simple as that for addition and multiplication.
The following result gives one of the most useful facts about exponentiation, however.
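Theorem 6.40. If $2 \le \kappa \le \lambda$ and $\lambda$ is infinite, then $\kappa^\lambda = 2^\lambda$.
Proof. Note that each function $f : \lambda \to \kappa$ is a subset of $\lambda \times \kappa$. Hence ${}^{\lambda}\kappa \subseteq \mathscr{P}(\lambda \times \kappa)$,
and so $\kappa^\lambda \le |\mathscr{P}(\lambda \times \kappa)|$. Therefore,
\[
2^\lambda \le \kappa^\lambda \le |\mathscr{P}(\lambda \times \kappa)| = 2^{\lambda \cdot \kappa} = 2^\lambda;
\]
so all the entries in this string of inequalities are equal, and this gives $\kappa^\lambda = 2^\lambda$.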
Cofinality, and regular and singular cardinals
Further cardinal arithmetic depends on the notion of cofinality. For later purposes we
define a rather general version of this notion. Let $(P, <)$ be a partially ordered set. A
subset $X$ of $P$ is dominating iff for every $p \in P$ there is an $x \in X$ such that $p \le x$. The
cofinality of $P$ is the smallest cardinality of a dominating subset of $P$. We denote this
cardinal by $\mathrm{cf}(P)$.
A subset $X$ of $P$ is unbounded iff there does not exist a $p \in P$ such that $x \le p$ for all
$x \in X$. If $P$ is simply ordered without largest element, then these notions, dominating
and unbounded, coincide. In fact, suppose that $X$ is dominating but not unbounded.
Since $X$ is not unbounded, choose $p \in P$ such that $x \le p$ for all $x \in X$. Since $P$ does
not have a largest element, choose $q \in P$ such that $p < q$. Then because $X$ is dominating,
choose $x \in X$ such that $q \le x$. Then $q \le x \le p < q$, contradiction. Thus $X$ dominating
implies that $X$ is unbounded. Now suppose that $Y$ is unbounded but not dominating.
Since $Y$ is not dominating, there is a $p \in P$ such that $p \not\le x$ for all $x \in Y$. Since $P$ is a
simple order, it follows that $x < p$ for all $x \in Y$. This contradicts $Y$ being unbounded.
A cardinal $\kappa$ is regular iff $\kappa$ is infinite and $\mathrm{cf}(\kappa) = \kappa$. An infinite cardinal that is not
regular is called singular.
Theorem 6.41. For every infinite cardinal $\kappa$, the cardinal $\kappa^+$ is regular.
Proof. Suppose that $\Gamma \subseteq \kappa^+$, $\Gamma$ is unbounded in $\kappa^+$, and $|\Gamma| < \kappa^+$. Hence
\[
\kappa^+ = \Bigl| \bigcup_{\gamma \in \Gamma} \gamma \Bigr| \le \sum_{\gamma \in \Gamma} |\gamma| \le \sum_{\gamma \in \Gamma} \kappa = |\Gamma| \cdot \kappa = \kappa,
\]
contradiction. The first equality here holds because $\Gamma$ is unbounded in $\kappa^+$ and $\kappa^+$ is a
limit ordinal.
This theorem almost tells the full story about when a cardinal is regular. Examples of
singular cardinals are $\aleph_{\omega+\omega}$ and $\aleph_{\omega_1}$. But it is conceivable that there are regular cardinals
not covered by Theorem 6.41. A regular limit cardinal is said to be weakly inaccessible. A
cardinal $\kappa$ is said to be inaccessible if it is regular and has the property that for any cardinal
$\lambda < \kappa$, also $2^\lambda < \kappa$. Clearly every inaccessible cardinal is also weakly inaccessible. Under
GCH, the two notions coincide. It is consistent with ZFC that $2^\omega$ is weakly inaccessible;
but of course it definitely is not inaccessible. It is consistent with ZFC that there are no
uncountable weak inaccessibles at all. But it is reasonable to postulate their existence,
and they are useful in some situations. In fact, the subject of large cardinals is one of the
most studied in contemporary set theory, with many spectacular results.
Theorem 6.42. Suppose that $(A, <)$ is a simple ordering with no largest element.
Then there is a strictly increasing function $f : \mathrm{cf}(A) \to A$ such that $\mathrm{rng}(f)$ is unbounded
in $A$.
Proof. Let $X$ be a dominating subset of $A$ of size $\mathrm{cf}(A)$, and let $g$ be a bijection from
$\mathrm{cf}(A)$ onto $X$. We define a function $f : \mathrm{cf}(A) \to X$ by recursion, as follows. If $f(\beta) \in X$
has been defined for all $\beta < \alpha$, where $\alpha < \mathrm{cf}(A)$, then $\{f(\beta) : \beta < \alpha\}$ has size less than
$\mathrm{cf}(A)$, and hence it is not dominating. Hence there is an $a \in A$ such that $f(\beta) < a$ for all
$\beta < \alpha$. We let $f(\alpha)$ be an element of $X$ such that $a, g(\alpha) \le f(\alpha)$.
Clearly $f$ is strictly increasing. If $a \in A$, choose $\alpha < \mathrm{cf}(A)$ such that $a \le g(\alpha)$. Then
$a \le f(\alpha)$.
Proposition 6.43. Suppose that $(A, <)$ is a simple ordering with no largest element.
Then $\mathrm{cf}(\mathrm{cf}(A)) = \mathrm{cf}(A)$.
Proof. Clearly $\mathrm{cf}(\alpha) \le \alpha$ for any ordinal $\alpha$; in particular, $\mathrm{cf}(\mathrm{cf}(A)) \le \mathrm{cf}(A)$. Now by
Theorem 6.42, let $f : \mathrm{cf}(A) \to A$ be strictly increasing with $\mathrm{rng}(f)$ unbounded in $A$, and
let $g : \mathrm{cf}(\mathrm{cf}(A)) \to \mathrm{cf}(A)$ be strictly increasing with $\mathrm{rng}(g)$ unbounded in $\mathrm{cf}(A)$. Clearly
$f \circ g : \mathrm{cf}(\mathrm{cf}(A)) \to A$ is strictly increasing. We claim that $\mathrm{rng}(f \circ g)$ is unbounded in $A$.
For, given $a \in A$, choose $\alpha < \mathrm{cf}(A)$ such that $a \le f(\alpha)$, and then choose $\beta < \mathrm{cf}(\mathrm{cf}(A))$
such that $\alpha \le g(\beta)$. Then $a \le f(\alpha) \le f(g(\beta))$, proving the claim. It follows that
$\mathrm{cf}(A) \le \mathrm{cf}(\mathrm{cf}(A))$.
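Proposition 6.44. If $\kappa$ is a regular cardinal, $\Gamma \subseteq \kappa$, and $|\Gamma| < \kappa$, then $\bigcup\Gamma < \kappa$.
Proof. Since $\mathrm{cf}(\kappa) = \kappa$, from the definition of cf it follows that $\Gamma$ is bounded in $\kappa$.
Hence there is an $\alpha < \kappa$ such that $\gamma \le \alpha$ for all $\gamma \in \Gamma$. So $\bigcup\Gamma \le \alpha < \kappa$.
Proposition 6.45. If $A$ is a linearly ordered set with no greatest element, $\kappa$ is a
regular cardinal, and $f : \kappa \to A$ is strictly increasing with $\mathrm{rng}(f)$ unbounded in $A$, then
$\kappa = \mathrm{cf}(A)$.
Proof. By the definition of cf we have $\mathrm{cf}(A) \le \kappa$. Suppose that $\mathrm{cf}(A) < \kappa$. By
Theorem 6.42 let $g : \mathrm{cf}(A) \to A$ be strictly increasing with $\mathrm{rng}(g)$ unbounded in $A$. For
each $\alpha < \mathrm{cf}(A)$ choose $\beta_\alpha < \kappa$ such that $g(\alpha) \le f(\beta_\alpha)$. Then $\{\beta_\alpha : \alpha < \mathrm{cf}(A)\} \subseteq \kappa$ and
$|\{\beta_\alpha : \alpha < \mathrm{cf}(A)\}| < \kappa$, so by 6.44, $\bigcup_{\alpha < \mathrm{cf}(A)} \beta_\alpha < \kappa$. Let $\gamma < \kappa$ be such that $\beta_\alpha < \gamma$ for
all $\alpha < \mathrm{cf}(A)$. Then $f(\gamma)$ is a bound for $\mathrm{rng}(g)$, contradiction.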
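Proposition 6.46. An infinite cardinal $\kappa$ is regular iff for every system $\langle \lambda_i : i \in I \rangle$ of cardinals
less than $\kappa$, with $|I| < \kappa$, one also has $\sum_{i \in I} \lambda_i < \kappa$.
Proof. $\Rightarrow$: $\sum_{i \in I} \lambda_i \le |I| \cdot \bigcup_{i \in I} \lambda_i < \kappa$. $\Leftarrow$: if $\Gamma \subseteq \kappa$ and $|\Gamma| < \kappa$, then
$|\bigcup\Gamma| \le \sum_{\gamma \in \Gamma} |\gamma| < \kappa$, so also $\bigcup\Gamma < \kappa$. Thus $\kappa$ is regular.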
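Proposition 6.47. If $\kappa$ is an infinite singular cardinal, then there is a
strictly increasing sequence $\langle \lambda_\alpha : \alpha < \mathrm{cf}(\kappa) \rangle$ of infinite successor cardinals such that $\kappa = \sum_{\alpha < \mathrm{cf}(\kappa)} \lambda_\alpha$.
Proof. Choose $\Gamma \subseteq \kappa$ unbounded and of order type $\mathrm{cf}(\kappa)$; let $\langle \gamma_\alpha : \alpha < \mathrm{cf}(\kappa) \rangle$
enumerate $\Gamma$ in strictly increasing order. Since each $\gamma_\alpha$ is merely an ordinal, this does not
complete the proof. We define the desired sequence by recursion. Suppose that $\lambda_\beta < \kappa$
has been defined for all $\beta < \alpha$, with $\alpha < \mathrm{cf}(\kappa)$. Then $\bigcup_{\beta < \alpha} \lambda_\beta < \kappa$ by the definition of
cofinality. So also
\[
\Bigl( \max\Bigl( \gamma_\alpha, \bigcup_{\beta < \alpha} \lambda_\beta \Bigr) \Bigr)^+ < \kappa,
\]
and we define $\lambda_\alpha$ to be this cardinal.
Now $\gamma_\alpha \le \lambda_\alpha \le \sum_{\beta < \mathrm{cf}(\kappa)} \lambda_\beta$ for each $\alpha < \mathrm{cf}(\kappa)$, so
\[
\kappa = \bigcup_{\alpha < \mathrm{cf}(\kappa)} \gamma_\alpha \le \sum_{\alpha < \mathrm{cf}(\kappa)} \lambda_\alpha \le \kappa \cdot \mathrm{cf}(\kappa) = \kappa.
\]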
The main theorem of cardinal arithmetic


Now we return to the general treatment of cardinal arithmetic.
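Theorem 6.48. (König) If $\kappa$ is infinite and $\mathrm{cf}(\kappa) \le \lambda$, then $\kappa^\lambda > \kappa$.
Proof. If $\kappa$ is regular, then $\kappa^\lambda \ge \kappa^\kappa = 2^\kappa > \kappa$. So, assume that $\kappa$ is singular. Then
there is a system $\langle \mu_\alpha : \alpha < \mathrm{cf}(\kappa) \rangle$ of nonzero cardinals such that each $\mu_\alpha$ is less than $\kappa$,
and $\sum_{\alpha < \mathrm{cf}(\kappa)} \mu_\alpha = \kappa$. Hence, by Theorem 6.36,
\[
\kappa = \sum_{\alpha < \mathrm{cf}(\kappa)} \mu_\alpha < \prod_{\alpha < \mathrm{cf}(\kappa)} \kappa = \kappa^{\mathrm{cf}(\kappa)} \le \kappa^\lambda.
\]
Corollary 6.49. For infinite $\kappa$ we have $\mathrm{cf}(2^\kappa) > \kappa$.
Proof. Suppose that $\mathrm{cf}(2^\kappa) \le \kappa$. Then $(2^\kappa)^\kappa > 2^\kappa$. But $(2^\kappa)^\kappa = 2^{\kappa \cdot \kappa} = 2^\kappa$,
contradiction.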
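We can now verify a statement made earlier about possibilities for $|\mathscr{P}(\omega)|$. Since $|\mathscr{P}(\omega)| =
2^\omega$, the corollary says that $\mathrm{cf}(2^\omega) > \omega$. So this implies that $|\mathscr{P}(\omega)|$ cannot be $\aleph_\omega$ or $\aleph_{\omega+\omega}$.
Here $\omega + \omega$ is the ordinal sum of $\omega$ with $\omega$. It rules out many other possibilities of this
sort.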
We now prove a lemma needed for the last major theorem of this subsection, which
says how to compute exponents (in a way).
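Lemma 6.50. If $\kappa$ is a limit cardinal and $\lambda \ge \mathrm{cf}(\kappa)$, then
\[
\kappa^\lambda = \Bigl( \bigcup_{\substack{\mu < \kappa \\ \mu \text{ a cardinal}}} \mu^\lambda \Bigr)^{\mathrm{cf}(\kappa)}.
\]
Proof. Let $\gamma : \mathrm{cf}(\kappa) \to \kappa$ be strictly increasing with $\mathrm{rng}(\gamma)$ unbounded in $\kappa$. We
define $F : {}^{\lambda}\kappa \to \prod_{\alpha < \mathrm{cf}(\kappa)} {}^{\lambda}\gamma_\alpha$ as follows. If $f \in {}^{\lambda}\kappa$, $\alpha < \mathrm{cf}(\kappa)$, and $\beta < \lambda$, then
\[
((F(f))_\alpha)(\beta) =
\begin{cases}
f(\beta) & \text{if } f(\beta) < \gamma_\alpha, \\
0 & \text{otherwise.}
\end{cases}
\]
Now $F$ is a one-one function. For, if $f, g \in {}^{\lambda}\kappa$ and $f \ne g$, say $\beta < \lambda$ and $f(\beta) \ne g(\beta)$.
Choose $\alpha < \mathrm{cf}(\kappa)$ such that $f(\beta)$ and $g(\beta)$ are both less than $\gamma_\alpha$. Then $((F(f))_\alpha)(\beta) = f(\beta) \ne
g(\beta) = ((F(g))_\alpha)(\beta)$, from which it follows that $F(f) \ne F(g)$. Since $F$ is one-one,
\[
\kappa^\lambda = |{}^{\lambda}\kappa| \le \Bigl| \prod_{\alpha < \mathrm{cf}(\kappa)} {}^{\lambda}\gamma_\alpha \Bigr|
\le \prod_{\alpha < \mathrm{cf}(\kappa)} \bigcup_{\substack{\mu < \kappa \\ \mu \text{ a cardinal}}} \mu^\lambda
= \Bigl( \bigcup_{\substack{\mu < \kappa \\ \mu \text{ a cardinal}}} \mu^\lambda \Bigr)^{\mathrm{cf}(\kappa)}
\le (\kappa^\lambda)^{\mathrm{cf}(\kappa)} = \kappa^{\lambda \cdot \mathrm{cf}(\kappa)} = \kappa^\lambda,
\]
and the lemma follows.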
The following theorem is not needed for the main result, but it is a classical result about
exponentiation.
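Theorem 6.51. (Hausdorff) If $\kappa$ and $\lambda$ are infinite cardinals, then $(\kappa^+)^\lambda = \kappa^\lambda \cdot \kappa^+$.
Proof. If $\kappa^+ \le \lambda$, then both sides are equal to $2^\lambda$. Suppose that $\lambda < \kappa^+$. Then
\[
(\kappa^+)^\lambda = |{}^{\lambda}(\kappa^+)| = \Bigl| \bigcup_{\alpha < \kappa^+} {}^{\lambda}\alpha \Bigr|
\le \sum_{\alpha < \kappa^+} |\alpha|^\lambda \le \kappa^\lambda \cdot \kappa^+ \le (\kappa^+)^\lambda,
\]
as desired.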
Here is the promised theorem giving computation rules for exponentiation. It essentially
reduces the computation of $\kappa^\lambda$ to two special cases: $2^\lambda$, and $\kappa^{\mathrm{cf}(\kappa)}$. Generalizations of the
results mentioned about the continuum hypothesis give a pretty good picture of what can
happen to $2^\lambda$. The case of $\kappa^{\mathrm{cf}(\kappa)}$ is more complicated, and there is still work being done on
what the possibilities here are. Recent deep work of Shelah on pcf theory has shed some
light on this. For example, he showed that $\aleph_\omega^{\aleph_0} \le 2^{\aleph_0} + \aleph_{\omega_4}$. The role of 4 here is still
unclear.
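Theorem 6.52. (main theorem of cardinal arithmetic) Let $\kappa$ and $\lambda$ be cardinals with
$2 \le \kappa$ and $\lambda \ge \omega$. Then
(i) If $\kappa \le \lambda$, then $\kappa^\lambda = 2^\lambda$.
(ii) If $\kappa$ is infinite and there is a $\mu < \kappa$ such that $\mu^\lambda \ge \kappa$, then $\kappa^\lambda = \mu^\lambda$.
(iii) Assume that $\kappa$ is infinite and $\mu^\lambda < \kappa$ for all $\mu < \kappa$. Then $\lambda < \kappa$, and:
(a) if $\mathrm{cf}(\kappa) > \lambda$, then $\kappa^\lambda = \kappa$;
(b) if $\mathrm{cf}(\kappa) \le \lambda$, then $\kappa^\lambda = \kappa^{\mathrm{cf}(\kappa)}$.
Proof. (i) has already been noted. Under the hypothesis of (ii),
\[
\kappa^\lambda \le (\mu^\lambda)^\lambda = \mu^\lambda \le \kappa^\lambda,
\]
as desired.
Now assume the hypothesis of (iii). In particular, $2^\lambda < \kappa$, so of course $\lambda < \kappa$. Next,
assume the hypothesis of (iii)(a): $\mathrm{cf}(\kappa) > \lambda$. Then
\[
\kappa^\lambda = |{}^{\lambda}\kappa| = \Bigl| \bigcup_{\alpha < \kappa} {}^{\lambda}\alpha \Bigr| \quad (\text{since } \lambda < \mathrm{cf}(\kappa))
\quad \le \sum_{\alpha < \kappa} |\alpha|^\lambda \le \kappa \cdot \kappa = \kappa,
\]
giving the desired result.
Finally, assume the hypothesis of (iii)(b): $\mathrm{cf}(\kappa) \le \lambda$. Since $\lambda < \kappa$, it follows that $\kappa$ is
singular, so in particular it is a limit cardinal. Then by Lemma 6.50,
\[
\kappa^\lambda = \Bigl( \bigcup_{\substack{\mu < \kappa \\ \mu \text{ a cardinal}}} \mu^\lambda \Bigr)^{\mathrm{cf}(\kappa)} \le \kappa^{\mathrm{cf}(\kappa)} \le \kappa^\lambda,
\]
finishing the proof.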


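In theory one can now compute $\kappa^\lambda$ for infinite $\lambda$, as follows. If $\kappa \le \lambda$, then $\kappa^\lambda = 2^\lambda$.
Suppose that $\kappa > \lambda$. Let $\kappa'$ be minimum such that $(\kappa')^\lambda = \kappa^\lambda$. Then $\forall \mu < \kappa'\,[\mu^\lambda < \kappa']$. In
fact, if $\mu < \kappa'$ and $\mu^\lambda \ge \kappa'$, then $(\kappa')^\lambda \le (\mu^\lambda)^\lambda = \mu^{\lambda \cdot \lambda} = \mu^\lambda < \kappa^\lambda = (\kappa')^\lambda$, contradiction.
Now $(\kappa')^\lambda$ is computed by 6.52(iii).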
Under the generalized continuum hypothesis the computation of exponents is very
simple:
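Corollary 6.53. Assume GCH, and suppose that $\kappa$ and $\lambda$ are cardinals with $2 \le \kappa$
and $\lambda$ infinite. Then:
(i) If $\kappa \le \lambda$, then $\kappa^\lambda = \lambda^+$.
(ii) If $\mathrm{cf}(\kappa) \le \lambda < \kappa$, then $\kappa^\lambda = \kappa^+$.
(iii) If $\lambda < \mathrm{cf}(\kappa)$, then $\kappa^\lambda = \kappa$.
Proof. (i) is immediate from 6.52(i). For (ii), assume that $\mathrm{cf}(\kappa) \le \lambda < \kappa$. Then
$\kappa$ is a limit cardinal, and so for each $\mu < \kappa$ we have $\mu^\lambda \le (\max(\mu, \lambda))^+ < \kappa$; hence by
6.52(iii)(b) we have $\kappa^\lambda = \kappa^{\mathrm{cf}(\kappa)} = \kappa^+$. For (iii), assume that $\lambda < \mathrm{cf}(\kappa)$. If there is a $\mu < \kappa$
such that $\mu^\lambda \ge \kappa$, then by 6.52(ii), $\kappa^\lambda = \mu^\lambda \le (\max(\lambda, \mu))^+ \le \kappa$, as desired. If $\mu^\lambda < \kappa$ for
all $\mu < \kappa$, then $\kappa^\lambda = \kappa$ by 6.52(iii)(a).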


EXERCISES
The first four exercises outline a proof of the Cantor-Schröder-Bernstein theorem without
using the axiom of choice. This theorem says that if there is an injection from $A$ into $B$
and one from $B$ into $A$, then there is a bijection from $A$ to $B$. In the development in the
text, using the axiom of choice, the hypothesis implies that $|A| \le |B| \le |A|$, and hence
$|A| = |B|$. But it is of some interest that it can be proved in an elementary way, without
using the axiom of choice or anything about ordinals and cardinals. Of course, the axiom
of choice should not be used in these four exercises.
E6.1. Let $F : \mathscr{P}(A) \to \mathscr{P}(A)$, and assume that for all $X, Y \subseteq A$, if $X \subseteq Y$, then
$F(X) \subseteq F(Y)$. Let $\mathscr{A} = \{X : X \subseteq A \text{ and } X \subseteq F(X)\}$, and set $X_0 = \bigcup_{X \in \mathscr{A}} X$. Show that
$X_0 \subseteq F(X_0)$.
E6.2. Show that under the assumptions of exercise E6.1 we actually have $X_0 = F(X_0)$.
E6.3. Suppose that $f : A \to B$ is one-one and $g : B \to A$ is also one-one. For every $X \subseteq A$
let $F(X) = A \setminus g[B \setminus f[X]]$. Show that for all $X, Y \subseteq A$, if $X \subseteq Y$ then $F(X) \subseteq F(Y)$.
E6.4. Prove the Cantor-Schröder-Bernstein theorem as follows. Assume that $f$ and $g$ are
as in exercise E6.3, and choose $F$ as in that exercise. Let $X_0$ be as in exercise E6.1. Show
that $A \setminus X_0 \subseteq \mathrm{rng}(g)$. Then define $h : A \to B$ by setting, for any $a \in A$,
\[
h(a) =
\begin{cases}
f(a) & \text{if } a \in X_0, \\
g^{-1}(a) & \text{if } a \in A \setminus X_0.
\end{cases}
\]
Show that $h$ is one-one and maps onto $B$.


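E6.5. Show that if $\alpha$ and $\beta$ are ordinals, then $|\alpha \mathbin{\dot{+}} \beta| = |\alpha| + |\beta|$, where $\dot{+}$ is ordinal
addition and $+$ is cardinal addition.
E6.6. Show that if $\alpha$ and $\beta$ are ordinals, then $|\alpha \odot \beta| = |\alpha| \cdot |\beta|$, where $\odot$ is ordinal
multiplication and $\cdot$ is cardinal multiplication.
E6.7. Show that if $\alpha$ and $\beta$ are ordinals, $2 \le \alpha$, and $\omega \le \beta$, then $|\alpha^{\cdot\beta}| = |\alpha| \cdot |\beta|$. Here
the dot to the left of the first exponent indicates that ordinal exponentiation is involved.
[This is a good exercise to keep in mind. For example, $2^{\cdot\omega}$ is a countable set,
but $2^\omega$ is not.]
E6.8. Prove that if $|A| \le |B|$ then $|\mathscr{P}(A)| \le |\mathscr{P}(B)|$.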
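E6.9. Prove the following general distributive law:
\[
\prod_{i \in I} \sum_{j \in J_i} \kappa_{ij} = \sum_{f \in P} \prod_{i \in I} \kappa_{i, f(i)},
\]
where $P = \prod_{i \in I} J_i$.
E6.10. Show that for any cardinal $\kappa$ we have $\kappa^+ = \{\alpha : \alpha \text{ is an ordinal and } |\alpha| \le \kappa\}$.
E6.11. For every infinite cardinal $\lambda$ there is a cardinal $\kappa > \lambda$ such that $\kappa^\lambda = \kappa$.
E6.12. For every infinite cardinal $\lambda$ there is a cardinal $\kappa > \lambda$ such that $\kappa^\lambda > \kappa$.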
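E6.13. Prove that for every $n \in \omega$, and every infinite cardinal $\kappa$, $\aleph_n^\kappa = 2^\kappa \cdot \aleph_n$.
E6.14. Prove that $\aleph_\omega^{\aleph_1} = 2^{\aleph_1} \cdot \aleph_\omega^{\aleph_0}$.
E6.15. Prove that $\aleph_\omega^{\aleph_0} = \prod_{n \in \omega} \aleph_n$.
E6.16. Prove that for any infinite cardinal $\kappa$, $(\kappa^+)^\kappa = 2^\kappa$.
E6.17. Show that if $\kappa$ is an infinite cardinal and $C$ is the collection of all cardinals less
than $\kappa$, then $|C| \le \kappa$.
E6.18. Show that if $\kappa$ is an infinite cardinal and $C$ is the collection of all cardinals less
than $\kappa$, then
\[
2^\kappa = \Bigl( \sum_{\lambda \in C} 2^\lambda \Bigr)^{\mathrm{cf}(\kappa)}.
\]
E6.19. Prove that for any limit ordinal $\tau$,
\[
\prod_{\xi < \tau} 2^{\aleph_\xi} = 2^{\aleph_\tau}.
\]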
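E6.20. Show that if $\kappa$ is an infinite cardinal and $C$ is the set of all cardinals $\le \kappa$, then
$\sum_{\lambda \in C} \kappa^\lambda = 2^\kappa$.
E6.21. Show that if $\lambda$ is an infinite cardinal, $\kappa = \lambda^+$, and $C$ is the set of all cardinals $< \kappa$,
then $\sum_{\mu \in C} \kappa^\mu = 2^\lambda$.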
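E6.22. Assume that $\kappa$ is an infinite cardinal, and $2^\lambda < \kappa$ for every cardinal $\lambda < \kappa$. Show
that $2^\kappa = \kappa^{\mathrm{cf}(\kappa)}$.
E6.23. Suppose that $\kappa$ is a singular cardinal, $\mathrm{cf}(\kappa) = \omega$, and $2^\lambda < \kappa$ for every $\lambda < \kappa$. Prove
that $2^\kappa = \kappa^\omega$.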
References
Abraham, U.; Magidor, M. Cardinal arithmetic. Chapter in Handbook of Set Theory.
Springer 2010, 2197pp.
Holz, M.; Steffens, K.; Weitz, E. Introduction to Cardinal Arithmetic. Birkhäuser
1999, 304pp.


7. Linearly ordered sets


In this chapter we prove some results about linearly ordered sets which form a useful
background in much of set theory. Among these facts are: any two denumerable densely
ordered sets are isomorphic, the existence of $\eta_\alpha$ sets, the existence of completions, and a
discussion of Suslin lines.
A linearly ordered set $(A, <)$ is densely ordered iff $|A| > 1$, and for any $a < b$ in $A$
there is a $c \in A$ such that $a < c < b$. A subset $X$ of a linearly ordered set $L$ is dense in $L$
iff for any two elements $a < b$ in $L$ there is an $x \in X$ such that $a < x < b$. Note that if $X$
is dense in $L$ and $L$ has at least two elements, then $L$ itself is dense.
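Theorem 7.1. If $L$ is a dense linear order, then $L$ is the disjoint union of two dense
subsets.
Proof. Let $\langle a_\alpha : \alpha < \kappa \rangle$ be a well-order of $L$, with $\kappa = |L|$. We put each $a_\alpha$ in $A$
or $B$ by recursion, as follows. Suppose that we have already done this for all $\beta < \alpha$. Let
$C = \{a_\beta : \beta < \alpha \text{ and } a_\beta < a_\alpha\}$, and let $D = \{a_\beta : \beta < \alpha \text{ and } a_\beta > a_\alpha\}$. We take two
possibilities.
Case 1. $C$ has a largest element $a_\beta$, $D$ has a smallest element $a_\gamma$, and $a_\beta, a_\gamma \in A$.
Then we put $a_\alpha$ in $B$.
Case 2. Otherwise, we put $a_\alpha$ in $A$.
Now we want to see that this works. So, suppose that elements $a_\xi < a_\eta$ of $L$ are given.
Let $a_\beta < a_\gamma$ be the elements of $L$ with smallest indices which are in the interval $(a_\xi, a_\eta)$.
If one of these is in $A$ and the other in $B$, this gives elements of $A$ and $B$ in $(a_\xi, a_\eta)$. So,
suppose that they are both in $A$, or both in $B$. Let $a_\alpha$ be the member of $L$ with smallest
index that is in $(a_\beta, a_\gamma)$. Thus $a_\xi < a_\beta < a_\alpha < a_\gamma < a_\eta$, so by the minimality of $\beta$ and $\gamma$
we have $\beta, \gamma < \alpha$. Thus $\beta < \alpha$ and $a_\beta < a_\alpha$.
(1) $a_\beta$ is the largest element of $\{a_\delta : \delta < \alpha, a_\delta < a_\alpha\}$.
In fact, $a_\beta$ is in this set, as just observed. If $a_\beta < a_\delta$, $\delta < \alpha$, and $a_\delta < a_\alpha$, then also
$a_\delta < a_\gamma$ since $a_\alpha < a_\gamma$, so the definition of $\alpha$ is contradicted. Hence (1) holds.
(2) $a_\gamma$ is the smallest element of $\{a_\delta : \delta < \alpha, a_\delta > a_\alpha\}$.
In fact, $\gamma < \alpha$ as observed just before (1), and $a_\gamma > a_\alpha$ by the definition of $a_\alpha$. If $a_\delta < a_\gamma$,
$\delta < \alpha$, and $a_\delta > a_\alpha$, then also $a_\delta > a_\beta$ since $a_\alpha > a_\beta$, so the definition of $\alpha$ is contradicted.
Hence (2) holds.
So by construction, if $a_\beta, a_\gamma \in A$ then $a_\alpha \in B$, while if $a_\beta, a_\gamma \in B$, then $a_\alpha \in A$. So
again we have found elements of both $A$ and $B$ which are in $(a_\xi, a_\eta)$.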
The proof of the following result uses the important back-and-forth argument.
Theorem 7.2. Any two denumerable densely ordered sets without first and last elements are order-isomorphic.
Proof. Let $(A, <)$ and $(B, \prec)$ be denumerable densely ordered sets without first and
last elements. Write $A = \{a_i : i \in \omega\}$ and $B = \{b_i : i \in \omega\}$. We now define by recursion
sequences $\langle c_i : i \in \omega \rangle$ of elements of $A$ and $\langle d_i : i \in \omega \rangle$ of elements of $B$. Let $c_0 = a_0$ and
$d_0 = b_0$.
Now suppose that $c_{2m}$ and $d_{2m}$ have been defined so that the following condition holds:
(*) For all $i, j \le 2m$, $c_i < c_j$ iff $d_i \prec d_j$.
(Note that then a similar equivalence holds for $=$ and for $>$.) We let $c_{2m+1} = a_{m+1}$. Now
we consider several cases.
Case 1. $a_{m+1} = c_i$ for some $i \le 2m$. Take the least such $i$, and let $d_{2m+1} = d_i$.
Case 2. $a_{m+1} < c_i$ for all $i \le 2m$. Let $d_{2m+1}$ be any element of $B$ less than each $d_i$,
$i \le 2m$.
Case 3. $c_i < a_{m+1}$ for all $i \le 2m$. Let $d_{2m+1}$ be any element of $B$ greater than each
$d_i$, $i \le 2m$.
Case 4. Case 1 fails, and there exist $i, j \le 2m$ such that $c_i < a_{m+1} < c_j$. Let $d_{2m+1}$
be any element $b$ of $B$ such that $d_i \prec b \prec d_j$ whenever $c_i < a_{m+1} < c_j$; such an element $b$
exists by (*).
This finishes the definition of $d_{2m+1}$. $d_{2m+2}$ and $c_{2m+2}$ are defined similarly. Namely,
we let $d_{2m+2} = b_{m+1}$ and then define $c_{2m+2}$ similarly to the above, with $a, b$ interchanged
and $c, d$ interchanged.
Note that each $a_i$ appears in the sequence of $c_i$'s, namely $c_0 = a_0$ and $c_{2i+1} = a_{i+1}$,
and similarly each $b_i$ appears in the sequence of $d_i$'s. Hence it is clear that $\{(c_i, d_i) : i \in \omega\}$
is the desired order-isomorphism.
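A finite portion of the back-and-forth construction can be carried out mechanically. The sketch below is a variant under stated assumptions, not the exact indexing of the proof: it takes the rationals and the dyadic rationals in the open interval (0, 1) as the two orders, uses particular enumerations of our own choosing, and always picks the first suitable partner in the enumeration where the proof says "any element". All names (`rationals`, `dyadics`, `Enum`, `partner`, `back_and_forth`) are hypothetical.

```python
from fractions import Fraction
from itertools import count

def rationals():
    """Enumerate every rational exactly once (one arbitrary enumeration)."""
    seen = set()
    for s in count(1):
        for p in range(-s, s + 1):
            q = s - abs(p)
            if q > 0 and Fraction(p, q) not in seen:
                seen.add(Fraction(p, q))
                yield Fraction(p, q)

def dyadics():
    """Enumerate the dyadic rationals strictly between 0 and 1, each once."""
    for n in count(1):
        for k in range(1, 2 ** n, 2):
            yield Fraction(k, 2 ** n)

class Enum:
    """Indexable, lazily extended view of an infinite enumeration."""
    def __init__(self, gen):
        self.gen, self.items = gen, []
    def __getitem__(self, i):
        while len(self.items) <= i:
            self.items.append(next(self.gen))
        return self.items[i]

def partner(x, pairs, side, other):
    """First y in `other` placed relative to the matched elements exactly as x is."""
    for k in count():
        y = other[k]
        if all((p[side] < x) == (p[1 - side] < y) and
               (p[side] == x) == (p[1 - side] == y) for p in pairs):
            return y

def back_and_forth(A, B, steps):
    """Build a finite order-preserving matching covering A[0..steps) and B[0..steps)."""
    pairs = []
    for m in range(steps):
        for side, x in ((0, A[m]), (1, B[m])):   # forth step, then back step
            y = partner(x, pairs, side, B if side == 0 else A)
            pair = (x, y) if side == 0 else (y, x)
            if pair not in pairs:
                pairs.append(pair)
    return pairs

pairs = back_and_forth(Enum(rationals()), Enum(dyadics()), 10)
```

The search in `partner` terminates because both orders are dense without endpoints, which is the same fact that makes Cases 2 to 4 of the proof possible. Alternating the two directions is what guarantees that the limit map is both total and onto.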
Theorem 7.3. If $L$ is an infinite linear order, then there is a subset $M$ of $L$ which is order isomorphic to $(\omega, <)$ or to $(\omega, >)$.
Proof. Suppose that $L$ does not have a subset order isomorphic to $(\omega, >)$. We claim then that $L$ is well-ordered, and therefore is isomorphic to an infinite ordinal and hence has a subset isomorphic to $(\omega, <)$. To prove this claim, suppose it is not true. So $L$ has some nonempty subset $P$ with no least element. We now define a sequence $\langle a_i : i \in \omega\rangle$ of elements of $P$ by recursion. Let $a_0$ be any element of $P$. If $a_i \in P$ has been defined, then it is not the least element of $P$ and so there is an $a_{i+1} \in P$ with $a_{i+1} < a_i$. This finishes the construction. Thus $\{a_i : i \in \omega\}$ is a subset of $L$ order isomorphic to $(\omega, >)$, contradiction.
It would be natural to conjecture that Theorem 7.3 generalizes in the following way: for any infinite cardinal $\kappa$ and any linear order $L$ of size $\kappa$, there is a subset $M$ of $L$ order isomorphic to $(\kappa, <)$ or to $(\kappa, >)$. This is clearly false, as the real numbers under their usual order form a counterexample. (Given a set of real numbers order isomorphic to $2^\omega$, one could choose rationals between successive members of the set, and produce $2^\omega$ rationals, contradiction.) We want to give an example that works for many cardinals. The construction we use is very important for later purposes too.
The following definitions apply to any infinite ordinal $\alpha$.
If $f$ and $g$ are distinct elements of ${}^{\alpha}2$, we define
\[ \chi(f, g) = \min\{\beta < \alpha : f(\beta) \ne g(\beta)\}. \]
Let $f < g$ iff $f$ and $g$ are distinct elements of ${}^{\alpha}2$ and $f(\chi(f, g)) < g(\chi(f, g))$. (Thus $f(\chi(f, g)) = 0$ and $g(\chi(f, g)) = 1$.) Clearly $({}^{\alpha}2, <)$ is a linear order; this is called the lexicographic order.
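For finite sequences the lexicographic order can be computed directly from the first point of difference. The sketch below assumes sequences of equal finite length represented as Python tuples of 0s and 1s; the names `chi` and `lex_less` are hypothetical labels for the first-difference function and the order defined above.

```python
def chi(f, g):
    """Least index at which the distinct 0/1 sequences f and g differ."""
    return next(i for i, (a, b) in enumerate(zip(f, g)) if a != b)


def lex_less(f, g):
    """f < g in the lexicographic order on sequences of equal length."""
    if f == g:
        return False
    i = chi(f, g)
    return f[i] < g[i]          # forces f[i] == 0 and g[i] == 1
```

For example, `lex_less((0, 1, 1), (1, 0, 0))` holds because the sequences first differ at index 0, where the first sequence has the value 0.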
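We also need some general set-theoretic notation. If $A$ is any set and $\kappa$ any cardinal, then
\[ [A]^{\kappa} = \{X \subseteq A : |X| = \kappa\}; \]
\[ [A]^{<\kappa} = \{X \subseteq A : |X| < \kappa\}; \]
\[ [A]^{\le\kappa} = \{X \subseteq A : |X| \le \kappa\}. \]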
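Theorem 7.4. For any infinite cardinal $\kappa$, the linear order ${}^{\kappa}2$ does not contain a subset order isomorphic to $\kappa^+$ or to $(\kappa^+, >)$.
Proof. The two assertions are proved in a very similar way, so we give details only for the first assertion. In fact, we assume that $\langle f_\alpha : \alpha < \kappa^+\rangle$ is a strictly increasing sequence of members of ${}^{\kappa}2$, and try to get a contradiction. The contradiction will follow rather easily from the following statement:
(1) If $\gamma \le \kappa$, $\Gamma \in [\kappa^+]^{\kappa^+}$, and $f_\alpha \upharpoonright \gamma < f_\beta \upharpoonright \gamma$ for any $\alpha, \beta \in \Gamma$ such that $\alpha < \beta$, then there exist $\delta < \gamma$ and $\Delta \in [\Gamma]^{\kappa^+}$ such that $f_\alpha \upharpoonright \delta < f_\beta \upharpoonright \delta$ for any $\alpha, \beta \in \Delta$ such that $\alpha < \beta$.
To prove this, assume the hypothesis. For each $\alpha \in \Gamma$ let $f'_\alpha = f_\alpha \upharpoonright \gamma$. Clearly $\Gamma$ does not have a largest element. For each $\alpha \in \Gamma$ let $\alpha'$ be the least member of $\Gamma$ which is greater than $\alpha$. Then
\[ \Gamma = \bigcup_{\xi < \gamma} \{\alpha \in \Gamma : \chi(f'_\alpha, f'_{\alpha'}) = \xi\}. \]
Since $|\Gamma| = \kappa^+$, it follows that there are $\delta < \gamma$ and $\Delta \in [\Gamma]^{\kappa^+}$ such that $\chi(f'_\alpha, f'_{\alpha'}) = \delta$ for all $\alpha \in \Delta$. We claim now that $f_\alpha \upharpoonright \delta < f_\beta \upharpoonright \delta$ for any two $\alpha, \beta \in \Delta$ such that $\alpha < \beta$, as desired in (1). For, take any such $\alpha, \beta$. Suppose that $f_\alpha \upharpoonright \delta = f_\beta \upharpoonright \delta$. (Note that we must have $f_\alpha \upharpoonright \delta \le f_\beta \upharpoonright \delta$.) Now from $\chi(f'_\alpha, f'_{\alpha'}) = \delta$ we get $f_{\alpha'}(\delta) = 1$, and from $\chi(f'_\beta, f'_{\beta'}) = \delta$ we get $f_\beta(\delta) = 0$. Now $f_{\alpha'} \upharpoonright \delta = f_\alpha \upharpoonright \delta = f_\beta \upharpoonright \delta$, so we get $f'_\beta < f'_{\alpha'} \le f'_\beta$, contradiction. This proves (1).
Clearly from (1) we can construct an infinite decreasing sequence $\kappa > \delta_1 > \delta_2 > \cdots$ of ordinals, contradiction.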
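Now we give some more definitions, leading to a kind of generalization of Theorem 7.2.
If $(L, <)$ is a linear order and $A, B \subseteq L$, we write $A < B$ iff $\forall x \in A\, \forall y \in B\, [x < y]$. If $A = \{a\}$ here, we write $a < B$; similarly for $A < b$.
Intervals in linear orders are defined in the usual way. For example, $[a, b) = \{c : a \le c < b\}$.
An $\eta_\alpha$-set is a linear order $(L, <)$ such that if $A, B \subseteq L$, $A < B$, and $|A|, |B| < \aleph_\alpha$, then there is a $c \in L$ such that $A < c < B$. Taking $A = \emptyset$ and $B = \{a\}$ for some $a \in L$, we see that $\eta_\alpha$-sets do not have first elements; similarly they do not have last elements. Note that an $\eta_0$-set is just a densely ordered set without first or last elements.
For any ordinal $\alpha$, we define
\[ H_\alpha = \{f \in {}^{\omega_\alpha}2 : \text{there is a } \xi < \omega_\alpha \text{ such that } f(\xi) = 1 \text{ and } f(\eta) = 0 \text{ for all } \eta \in (\xi, \omega_\alpha)\}. \]
We take the order on $H_\alpha$ induced by that on ${}^{\omega_\alpha}2$: $f < g$ in $H_\alpha$ iff $f < g$ as members of ${}^{\omega_\alpha}2$.

Theorem 7.5. Let $\alpha$ be an ordinal, and let $\mathrm{cf}(\omega_\alpha) = \omega_\beta$. Then the following conditions hold:
(i) $H_\alpha$ is an $\eta_\beta$-set.
(ii) $\mathrm{cf}(H_\alpha, <) = \omega_\beta$.
(iii) $\mathrm{cf}(H_\alpha, >) = \omega_\beta$.
(iv) $|H_0| = \aleph_0$, and for $\alpha > 0$, $|H_\alpha| = \sum_{\beta < \alpha} 2^{\aleph_\beta}$.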
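Proof. For each $f \in H_\alpha$ let $\zeta_f < \omega_\alpha$ be such that $f(\zeta_f) = 1$ and $f(\xi) = 0$ for all $\xi \in (\zeta_f, \omega_\alpha)$.
For (i), suppose that $A, B \subseteq H_\alpha$ with $A < B$ and $|A|, |B| < \aleph_\beta$. Obviously we may assume that one of $A, B$ is nonempty. Then there are three possibilities:
Case 1. $A \ne \emptyset \ne B$. Let
\[ \gamma = \sup\{\zeta_f : f \in A\}; \]
\[ \delta = \max(\gamma + 1, \sup\{\zeta_f : f \in B\}). \]
Thus $\gamma, \delta < \omega_\alpha$ since $|A|, |B| < \aleph_\beta = \mathrm{cf}(\omega_\alpha)$. We now define $g \in {}^{\omega_\alpha}2$ by setting, for each $\xi < \omega_\alpha$,
\[ g(\xi) = \begin{cases}
1 & \text{if } \xi \le \gamma \text{ and } \exists f \in A\,(f \upharpoonright \xi = g \upharpoonright \xi \text{ and } f(\xi) = 1); \\
0 & \text{if } \xi \le \gamma \text{ and there is no such } f; \\
0 & \text{if } \gamma < \xi \le \delta; \\
1 & \text{if } \xi = \delta + 1; \\
0 & \text{if } \delta + 1 < \xi < \omega_\alpha.
\end{cases} \]
Clearly $g \in H_\alpha$. We claim that $A < g < B$. Note that $g \notin A \cup B$ since $g(\delta + 1) = 1$ while $f(\delta + 1) = 0$ for any $f \in A \cup B$.
To prove the claim first suppose that $f \in A$. Assume that $g < f$; we will get a contradiction. Let $\xi = \chi(g, f)$. Then $g(\xi) = 0$ and $f(\xi) = 1$. It follows that $\xi \le \gamma$ and $g \upharpoonright \xi = f \upharpoonright \xi$, contradicting the definition of $g(\xi)$.
Second, suppose that $f \in B$. Assume that $f < g$; we will get a contradiction. Let $\xi = \chi(f, g)$. Thus $f(\xi) = 0$ and $g(\xi) = 1$. We claim that $\xi = \delta + 1$. For, otherwise since $g(\xi) = 1$ we must have $\xi \le \gamma$, and then there is an $h \in A$ such that $h \upharpoonright \xi = g \upharpoonright \xi$ and $h(\xi) = 1$. So $f \upharpoonright \xi = g \upharpoonright \xi = h \upharpoonright \xi$, $f(\xi) = 0$, and $h(\xi) = 1$, so $f < h$. But $f \in B$ and $h \in A$, contradiction. This proves our claim that $\xi = \delta + 1$.
Now clearly $\zeta_f \le \delta$. Since
\[ g \upharpoonright (\delta + 1) = g \upharpoonright \xi = f \upharpoonright \xi = f \upharpoonright (\delta + 1), \]
it follows that $g(\zeta_f) = 1$. So from $\zeta_f \le \delta$ we infer that $\zeta_f \le \gamma$. Thus since $g(\zeta_f) = 1$, it follows that there is a $k \in A$ such that $k \upharpoonright \zeta_f = g \upharpoonright \zeta_f$ and $k(\zeta_f) = 1$. But now we have $k \upharpoonright (\zeta_f + 1) = g \upharpoonright (\zeta_f + 1) = f \upharpoonright (\zeta_f + 1)$ and $f(\xi) = 0$ for all $\xi \in (\zeta_f, \omega_\alpha)$. Hence $f \le k$, which contradicts $f \in B$ and $k \in A$.
This finishes the proof of (i) in Case 1.
Case 2. $A = \emptyset \ne B$. Let
\[ \gamma = \sup\{\zeta_f : f \in B\}. \]
Define $g \in H_\alpha$ by setting, for each $\xi < \omega_\alpha$,
\[ g(\xi) = \begin{cases}
0 & \text{if } \xi \le \gamma, \\
1 & \text{if } \xi = \gamma + 1, \\
0 & \text{if } \gamma + 1 < \xi.
\end{cases} \]
Clearly $g < B$, as desired.
Case 3. $A \ne \emptyset = B$. Let $\gamma$ be as in Case 1. Define $g \in H_\alpha$ by setting, for each $\xi < \omega_\alpha$,
\[ g(\xi) = \begin{cases}
1 & \text{if } \xi \le \gamma + 1, \\
0 & \text{if } \gamma + 1 < \xi.
\end{cases} \]
Clearly $A < g$, as desired.
This finishes the proof of (i).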
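For (ii), let $\langle \gamma_\xi : \xi < \omega_\beta\rangle$ be a strictly increasing sequence of ordinals with supremum $\omega_\alpha$. For each $\xi < \omega_\beta$ define $f_\xi \in H_\alpha$ by setting, for each $\eta < \omega_\alpha$,
\[ f_\xi(\eta) = \begin{cases}
1 & \text{if } \eta \le \gamma_\xi, \\
0 & \text{if } \gamma_\xi < \eta.
\end{cases} \]
Clearly $\langle f_\xi : \xi < \omega_\beta\rangle$ is a strictly increasing sequence of members of $H_\alpha$ and $\{f_\xi : \xi < \omega_\beta\}$ is cofinal in $H_\alpha$. So (ii) holds.
For (iii), take $\langle \gamma_\xi : \xi < \omega_\beta\rangle$ as in the proof of (ii). For each $\xi < \omega_\beta$ define $f_\xi \in H_\alpha$ by setting, for each $\eta < \omega_\alpha$,
\[ f_\xi(\eta) = \begin{cases}
0 & \text{if } \eta < \gamma_\xi, \\
1 & \text{if } \eta = \gamma_\xi, \\
0 & \text{if } \gamma_\xi < \eta.
\end{cases} \]
Clearly $\langle f_\xi : \xi < \omega_\beta\rangle$ is a strictly decreasing sequence of members of $H_\alpha$ and $\{f_\xi : \xi < \omega_\beta\}$ is cofinal in $(H_\alpha, >)$. So (iii) holds.
Finally, for (iv), for each $\xi < \omega_\alpha$ let
\[ L_\xi = \{f \in H_\alpha : f(\xi) = 1 \text{ and } f(\eta) = 0 \text{ for all } \eta \in (\xi, \omega_\alpha)\}. \]
Clearly these sets are pairwise disjoint, and their union is $H_\alpha$. For $\alpha = 0$,
\[ |H_\alpha| = \sum_{\xi < \omega} |L_\xi| = \sum_{\xi < \omega} 2^{|\xi|} = \aleph_0. \]
For $\alpha > 0$,
\begin{align*}
|H_\alpha| &= \sum_{\xi < \omega_\alpha} |L_\xi| \\
&= \sum_{\xi < \omega} |L_\xi| + \sum_{\omega \le \xi < \omega_\alpha} |L_\xi| \\
&= \aleph_0 + \sum_{\omega \le \xi < \omega_\alpha} 2^{|\xi|} \\
&= \aleph_0 + \sum_{\beta < \alpha} 2^{\aleph_\beta} \cdot |\{\xi < \omega_\alpha : |\xi| = \aleph_\beta\}| \\
&= \aleph_0 + \sum_{\beta < \alpha} 2^{\aleph_\beta} \cdot \aleph_{\beta+1} \\
&= \sum_{\beta < \alpha} 2^{\aleph_\beta}.
\end{align*}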

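Corollary 7.6. If $\aleph_\alpha$ is regular, then
(i) $H_\alpha$ is an $\eta_\alpha$-set.
(ii) $\mathrm{cf}(H_\alpha, <) = \omega_\alpha$.
(iii) $\mathrm{cf}(H_\alpha, >) = \omega_\alpha$.
(iv) $|H_0| = \aleph_0$, and for $\alpha > 0$, $|H_\alpha| = \sum_{\beta < \alpha} 2^{\aleph_\beta}$.
Corollary 7.7. For each regular cardinal $\aleph_\alpha$ there is an $\eta_\alpha$-set.
Corollary 7.8. For each ordinal $\alpha$ there is an $\eta_{\alpha+1}$-set of size $2^{\aleph_\alpha}$.
Corollary 7.9. (GCH) For each regular cardinal $\aleph_\alpha$ there is an $\eta_\alpha$-set of size $\aleph_\alpha$.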
One of the most useful facts about $\eta_\alpha$-sets is their universality, expressed in the following theorem.
Theorem 7.10. Suppose that $\aleph_\alpha$ is regular. If $K$ is an $\eta_\alpha$-set, then any linearly ordered set of size $\le \aleph_\alpha$ can be isomorphically embedded in $K$.
Proof. Let $L$ be a linearly ordered set of size at most $\aleph_\alpha$, and write $L = \{a_\xi : \xi < \omega_\alpha\}$. We define a sequence $\langle f_\xi : \xi < \omega_\alpha\rangle$ of functions by recursion. Suppose that $f_\eta$ has been defined for all $\eta < \xi$ so that it is a strictly increasing function mapping a subset of $L$ of size less than $\aleph_\alpha$ into $K$, and such that $f_\eta \subseteq f_\rho$ whenever $\eta < \rho < \xi$. Let $g = \bigcup_{\eta < \xi} f_\eta$. Then $g$ is still a strictly increasing function mapping a subset of $L$ of size less than $\aleph_\alpha$ into $K$. If $a_\xi \in \mathrm{dmn}(g)$, we set $f_\xi = g$. Suppose that $a_\xi \notin \mathrm{dmn}(g)$. Let
\[ A = \{g(b) : b \in \mathrm{dmn}(g) \text{ and } b < a_\xi\}; \]
\[ B = \{g(b) : b \in \mathrm{dmn}(g) \text{ and } a_\xi < b\}. \]
Then $A < B$, and $|A|, |B| < \aleph_\alpha$. So by the $\eta_\alpha$-property, there is an element $c$ of $K$ such that $A < c < B$. We let $f_\xi = g \cup \{(a_\xi, c)\}$ for such an element $c$. (AC is used.)
This finishes the construction, and clearly $\bigcup_{\xi < \omega_\alpha} f_\xi$ is as desired.
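The one-point extension step of this proof can be illustrated in the simplest case, where the target is the rationals (a dense order without endpoints) and the source is a finite linearly ordered set. This is only a sketch under those assumptions; the function name `embed_in_rationals` and the choice of the midpoint as the element $c$ are hypothetical and made for illustration.

```python
from fractions import Fraction


def embed_in_rationals(elements, less):
    """Embed the listed elements, linearly ordered by `less`, into the
    rationals one element at a time, as in the recursion of the proof."""
    g = {}                                  # the partial embedding built so far
    for a in elements:
        if a in g:                          # a already lies in the domain of g
            continue
        below = [g[b] for b in g if less(b, a)]    # the set A of the proof
        above = [g[b] for b in g if less(a, b)]    # the set B of the proof
        if below and above:
            c = (max(below) + min(above)) / 2      # A < c < B by denseness
        elif below:
            c = max(below) + 1                     # no last element
        elif above:
            c = min(above) - 1                     # no first element
        else:
            c = Fraction(0)
        g[a] = c
    return g
```

Each step needs only that some element lies strictly between the finite sets `below` and `above`, which is the role played by the $\eta_\alpha$-property in the theorem.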
Given a linearly ordered set $L$, a subset $X$ of $L$, and an element $a$ of $L$, we call $a$ an upper bound for $X$ iff $x \le a$ for all $x \in X$. Thus every element of $L$ is an upper bound of the empty set. We say that $a$ is a least upper bound for $X$ iff $a$ is an upper bound for $X$ and $a \le a'$ whenever $a'$ is any upper bound for $X$. Clearly a least upper bound for $X$ is unique if it exists. If $a$ is the least upper bound of the empty set, then $a$ is the smallest element of $L$. We use lub or sup to abbreviate "least upper bound". Also, "supremum" is synonymous with "least upper bound".
Similarly one defines lower bound and greatest lower bound. Any element is a lower bound of the empty set, and if $a$ is the greatest lower bound of the empty set, then $a$ is the largest element of $L$. We use glb or inf to abbreviate "greatest lower bound"; "infimum" is synonymous with "greatest lower bound".
A linear order $L$ is complete iff every subset of $L$ has a greatest lower bound and a least upper bound.
Proposition 7.11. For any linear order $L$ the following conditions are equivalent:
(i) $L$ is complete.
(ii) Every subset of $L$ has a least upper bound.
(iii) Every subset of $L$ has a greatest lower bound.
Proof. (i)$\Rightarrow$(ii): obvious. (ii)$\Rightarrow$(iii): Assume that every subset of $L$ has a least upper bound, and let $X \subseteq L$; we want to show that $X$ has a greatest lower bound. Let $Y$ be the set of all lower bounds of $X$. Then let $a$ be a least upper bound for $Y$. Take any $x \in X$. Then $\forall y \in Y\, [y \le x]$, so $a \le x$ since $a$ is the lub of $Y$. This shows that $a$ is a lower bound for $X$. Suppose that $y$ is any lower bound for $X$. Then $y \in Y$, and hence $y \le a$ since $a$ is an upper bound for $Y$.
(iii)$\Rightarrow$(i) is treated similarly.
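The argument for (ii)$\Rightarrow$(iii) computes a greatest lower bound as the least upper bound of the set of lower bounds. A small sketch of this for a finite chain follows; the representation of the chain as a Python list in increasing order and the names `lub` and `glb_via_lub` are assumptions made for illustration.

```python
def lub(order, xs):
    """Least upper bound of xs in the finite chain `order`
    (a list in increasing order), or None if there is none."""
    pos = {a: i for i, a in enumerate(order)}
    uppers = [a for a in order if all(pos[x] <= pos[a] for x in xs)]
    return uppers[0] if uppers else None


def glb_via_lub(order, xs):
    """Greatest lower bound of xs, obtained as the lub of the set Y
    of all lower bounds of xs, as in the proof."""
    pos = {a: i for i, a in enumerate(order)}
    lowers = [a for a in order if all(pos[a] <= pos[x] for x in xs)]
    return lub(order, lowers)
```

In the chain `a < b < c < d < e`, the lower bounds of `{c, e}` are `a, b, c`, whose lub is `c`; for the empty set every element is a lower bound, so the glb is the largest element `e`.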
Let $(L, <)$ be a linear order. We say that a linear order $(M, \prec)$ is a completion of $L$ iff the following conditions hold:
(C1) $L \subseteq M$, and for any $a, b \in L$, $a < b$ iff $a \prec b$.
(C2) $M$ is complete.
(C3) Every element of $M$ is the lub of a set of elements of $L$.
(C4) If $a \in L$ is the lub in $L$ of a subset $X$ of $L$, then $a$ is the lub of $X$ in $M$.
Theorem 7.12. Any linear order has a completion.
Proof. Let $(L, <)$ be a linear order. We let $\mathcal{M}$ be the collection of all $X \subseteq L$ such that the following conditions hold:
(1) For all $a, b \in L$, if $a < b \in X$ then $a \in X$.
(2) If $X$ has a lub $a$ in $L$, then $a \in X$.
We consider the structure $(\mathcal{M}, \subset)$. It is clearly a partial order; we claim that it is a linear order. (Up to isomorphism it is the completion that we are after.) Suppose that $X, Y \in \mathcal{M}$ and $X \ne Y$; we want to show that $X \subset Y$ or $Y \subset X$. By symmetry take $a \in X \setminus Y$. Then we claim that $Y \subseteq X$ (hence $Y \subset X$). For, take any $b \in Y$. If $a < b$, then $a \in Y$ by (1), contradiction. Hence $b \le a$, and so $b \in X$ by (1), as desired. This proves the claim.
Next we claim that $(\mathcal{M}, \subset)$ is complete. For, suppose that $\mathcal{X} \subseteq \mathcal{M}$. Then $\bigcup\mathcal{X}$ satisfies (1). In fact, suppose that $c < d \in \bigcup\mathcal{X}$. Choose $X \in \mathcal{X}$ such that $d \in X$. Then $c \in X$ by (1) for $X$, and so $c \in \bigcup\mathcal{X}$. Now we consider two cases.
Case 1. $\bigcup\mathcal{X}$ does not have a lub in $L$. Then $\bigcup\mathcal{X} \in \mathcal{M}$, and it is clearly the lub of $\mathcal{X}$.
Case 2. $\bigcup\mathcal{X}$ has a lub in $L$; say $a$ is its lub. Then
(3) $\bigcup\mathcal{X} \cup \{a\} = (-\infty, a]$.
In fact, $\subseteq$ is clear. Suppose that $b < a$. Then $b$ is not an upper bound for $\bigcup\mathcal{X}$, so we can choose $c \in \bigcup\mathcal{X}$ such that $b < c$. Then $b \in \bigcup\mathcal{X}$ since $\bigcup\mathcal{X}$ satisfies (1). This proves (3).
Clearly $(-\infty, a] \in \mathcal{M}$. We claim that it is the lub of $\mathcal{X}$. Clearly it is an upper bound. Now suppose that $Z$ is any upper bound. Then $\bigcup\mathcal{X} \subseteq Z$. If $a \notin Z$, then $\bigcup\mathcal{X} = Z$, contradicting (2) for $Z$. So $a \in Z$ and hence $(-\infty, a] \subseteq Z$.
Hence we have shown that $(\mathcal{M}, \subset)$ is complete.
Now for each $a \in L$ let $f(a) = \{b \in L : b \le a\}$. Clearly $f(a) \in \mathcal{M}$.
(4) For any $a, b \in L$ we have $a < b$ iff $f(a) \subset f(b)$.
For, suppose that $a, b \in L$. If $a < b$, clearly $f(a) \subseteq f(b)$, and even $f(a) \subset f(b)$ since $b \in f(b) \setminus f(a)$. The other implication in (4) follows easily from this implication by assuming that $b \le a$.
(5) Every element of $\mathcal{M}$ is a lub of elements of $f[L]$.
For, suppose that $X \in \mathcal{M}$, and let $\mathcal{X} = \{f(a) : a \in X\}$; we claim that $X$ is the lub of $\mathcal{X}$. Clearly $f(a) \subseteq X$ for all $a \in X$, so $X$ is an upper bound of $\mathcal{X}$. Suppose that $Y \in \mathcal{M}$ is any upper bound for $\mathcal{X}$. If $a \in X$, then $a \in f(a) \subseteq Y$, so $a \in Y$. Thus $X \subseteq Y$, as desired. So (5) holds.
(6) If $a \in L$ is the lub in $L$ of $X \subseteq L$, then $f(a)$ is the lub in $\mathcal{M}$ of $f[X]$.
For, assume that $a \in L$ is the lub in $L$ of $X \subseteq L$. If $x \in X$, then $x \le a$, so $f(x) \subseteq f(a)$. Thus $f(a)$ is an upper bound for $f[X]$ in $\mathcal{M}$. Now suppose that $Y \in \mathcal{M}$ and $Y$ is an upper bound for $f[X]$. If $b \in L$ and $b < a$, then since $a$ is the lub of $X$, there is a $d \in X$ such that $b < d \le a$. So $f(d) \subseteq Y$, and hence $d \in Y$. Since $b < d$, we also have $b \in Y$. This shows that $f(a) \setminus \{a\} \subseteq Y$. If $a \in X$, then $f(a) \in f[X]$ and so $f(a) \subseteq Y$, as desired. Assume that $a \notin X$. Since $a$ is the lub of $X$ in $L$, there is no largest member of $L$ which is less than $a$. Now suppose that $a \notin Y$. If $u \in Y$, then $u < a$, as otherwise $a \le u$ and so $a \in Y$, contradiction. It follows that $Y = \{u \in L : u < a\}$. Clearly then $a$ is the lub of $Y$. This contradicts (2). So (6) holds.
Thus $\mathcal{M}$ is as desired, up to isomorphism.
Finally, we need to take care of the "up to isomorphism" business. Non-rigorously, we just identify $a$ with $f(a)$ for each $a \in L$. This is the way things are done in similar contexts in mathematics. Rigorously we proceed as follows; and a similar method can be used in other contexts. Let $A$ be a set disjoint from $L$ such that $|A| = |\mathcal{M} \setminus f[L]|$. For example, we could take $A = \{(L, X) : X \in \mathcal{M} \setminus f[L]\}$; this set is clearly of the same size as $\mathcal{M} \setminus f[L]$, and it is disjoint from $L$ by the foundation axiom. Let $g$ be a bijection from $A$ onto $\mathcal{M} \setminus f[L]$. Now let $N = L \cup A$, and define $h : N \to \mathcal{M}$ by setting, for any $x \in N$,
\[ h(x) = \begin{cases}
f(x) & \text{if } x \in L, \\
g(x) & \text{if } x \in A.
\end{cases} \]
Thus $h$ is a bijection from $N$ to $\mathcal{M}$, and it extends $f$. We now define $x \prec y$ iff $x, y \in N$ and $h(x) \subset h(y)$. We claim that $(N, \prec)$ really is a completion of $L$. (Not just up to isomorphism.) We check the conditions for this. Obviously $L \subseteq N$. Suppose that $a, b \in L$. Then $a < b$ iff $f(a) \subset f(b)$ iff $h(a) \subset h(b)$ iff $a \prec b$. Now $h$ is obviously an order-isomorphism from $(N, \prec)$ onto $(\mathcal{M}, \subset)$, so $N$ is complete. Now take any element $a$ of $N$. Then by (5), $h(a)$ is the lub of a set $f[X]$ with $X \subseteq L$. By the isomorphism property, $a$ is the lub of $X$. Finally, suppose that $a \in L$ is the lub of $X \subseteq L$. Then by (6), $f(a)$ is the lub of $f[X]$ in $\mathcal{M}$, i.e., $h(a)$ is the lub of $h[X]$ in $\mathcal{M}$. By the isomorphism property, $a$ is the lub of $X$ in $N$.
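For a finite chain the family of subsets satisfying conditions (1) and (2) can be listed by brute force, which gives a quick sanity check on the definitions. The sketch below assumes the chain is given as a Python list in increasing order; the name `completion_cuts` is hypothetical. Since a finite nonempty chain is already complete, the family should consist exactly of the sets $\{b : b \le a\}$.

```python
from itertools import combinations


def completion_cuts(order):
    """All subsets X of the finite chain `order` (a list in increasing
    order) that satisfy conditions (1) and (2), sorted by inclusion."""
    pos = {a: i for i, a in enumerate(order)}

    def lub(xs):
        uppers = [a for a in order if all(pos[x] <= pos[a] for x in xs)]
        return uppers[0] if uppers else None

    cuts = []
    for r in range(len(order) + 1):
        for xs in combinations(order, r):
            X = set(xs)
            # (1): X is closed downward
            downward = all(a in X for b in X for a in order if pos[a] < pos[b])
            # (2): if X has a lub in the chain, it belongs to X
            a = lub(X)
            if downward and (a is None or a in X):
                cuts.append(frozenset(X))
    return sorted(cuts, key=len)
```

Note that the empty set is excluded whenever the chain has a smallest element, because that element is the lub of the empty set and so condition (2) fails.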
Theorem 7.13. If $L$ is a linear order and $M, N$ are completions of $L$, then there is an isomorphism $f$ of $M$ onto $N$ such that $f \upharpoonright L$ is the identity.
Proof. It suffices to show that if $N$ is a completion of $L$ and $\mathcal{M}$, $f$ are as in the proof of Theorem 7.12, then there is an isomorphism $g$ from $N$ onto $\mathcal{M}$ such that $g \upharpoonright L = f$.
For any $x \in N$ let $g(x) = \{a \in L : a \le_N x\}$. We claim that $g(x) \in \mathcal{M}$. Clearly condition (1) holds. Now suppose that $g(x)$ has a lub $b$ in $L$. By (C4) for $N$, $b$ is the lub of $g(x)$ in $N$. But obviously $x$ is the lub of $g(x)$ in $N$, so $b = x \in g(x)$. So (2) holds for $g(x)$, and so $g(x) \in \mathcal{M}$.
If $x <_N y$, clearly $g(x) \subseteq g(y)$. By (C3) for $N$ and $y$, there is an $a \in L$ such that $x <_N a \le_N y$. So $a \in g(y) \setminus g(x)$. Hence $g(x) \subset g(y)$. Hence by Proposition 4.14, $x <_N y$ iff $g(x) \subset g(y)$, for any $x, y \in N$.
It remains only to show that $g$ is a surjection. Let $X \in \mathcal{M}$. Set $x = \sup_N X$. If $a \in X$, then $a \le_N x$ and so $a \in g(x)$. Thus $X \subseteq g(x)$. Now suppose that $a \in g(x)$. So $a \le_N x$. If $a <_N x$, then there is a $y \in X$ such that $a <_N y \le_N x$. It follows that $a \in X$. If $a = x$, then $a \in X$ by (2). So $g(x) \subseteq X$, showing that $g(x) = X$.
Corollary 7.14. Suppose that $L$ is a dense linear order, and $M$ is a linear order. Then the following conditions are equivalent:
(i) $M$ is the completion of $L$.
(ii) (a) $L \subseteq M$.
(b) $M$ is complete.
(c) For any $a, b \in L$, $a <_L b$ iff $a <_M b$.
(d) For any $x, y \in M$, if $x <_M y$ then there is an $a \in L$ such that $x <_M a <_M y$.
Proof. (i)$\Rightarrow$(ii): Assume that $M$ is the completion of $L$. Then (a)–(c) are clear. Suppose that $x, y \in M$ and $x <_M y$. By (C3), choose $b \in L$ such that $x <_M b \le_M y$. If $x \in L$, then choose $a \in L$ such that $x <_L a <_L b$; so $x <_M a <_M y$, as desired. Assume that $x \notin L$. Then by (C4), $b$ is not the lub in $L$ of $\{u \in L : u <_M x\}$, so there is some $a \in L$ such that $a <_L b$ and $a$ is an upper bound of $\{u \in L : u <_M x\}$. Since by (C3) $x$ is the lub of $\{u \in L : u <_M x\}$, it follows that $x <_M a <_M b \le_M y$, as desired.
(ii)$\Rightarrow$(i): Assume (ii). Then (C1) and (C2) are clear. For (C3), let $x \in M$, and let $X = \{a \in L : a < x\}$. Then $x$ is an upper bound for $X$, and (ii)(d) clearly implies that it is the lub of $X$. For (C4), suppose that $a \in L$ is the lub in $L$ of a set $X$ of elements of $L$. Suppose that $x \in M$ is an upper bound for $X$ and $x < a$. Then by (ii)(d) there is an element $b \in L$ such that $x < b < a$. Then there is an element $c \in X$ such that $b < c \le a$. It follows that $c \le x$, contradiction.
Note from this corollary that the completion of a dense linear order is also dense.
We now take up a special topic, Suslin lines.
A subset $U$ of a linear order $L$ is open iff $U$ is a union of open intervals $(a, b)$ or $(-\infty, a)$ or $(a, \infty)$. Here $(-\infty, a) = \{b \in L : b < a\}$ and $(a, \infty) = \{b \in L : a < b\}$. $L$ itself is also counted as open. (If $L$ has at least two elements, this follows from the other parts of this definition.) Note that if $L$ has a largest element $a$, then $(a, \infty) = \emptyset$; similarly for smallest elements.
An antichain in a linear order L is a collection of pairwise disjoint open sets.
A linear order L has the countable chain condition, abbreviated ccc, iff every antichain
in L is countable.
A subset $D$ of a linear order $L$ is topologically dense in $L$ iff $D \cap U \ne \emptyset$ for every nonempty open subset $U$ of $L$. Then dense in the sense at the beginning of the chapter implies topologically dense. In fact, if $D$ is dense in the original sense and $U$ is a nonempty open set, take some nonempty open interval $(a, b)$ contained in $U$. There is a $d \in D$ with $a < d < b$, so $D \cap U \ne \emptyset$. If $\emptyset \ne (a, \infty) \subseteq U$ for some $a$, choose $b \in (a, \infty)$, and then choose $d \in D$ such that $a < d < b$. Then again $D \cap U \ne \emptyset$. Similarly if $(-\infty, a) \subseteq U$ for some $a$.
Conversely, if $L$ itself is dense, then topological denseness implies denseness in the order sense; this is clear. On the other hand, take for example the ordered set $\omega$; $\omega$ itself is topologically dense in $\omega$, but is not dense in $\omega$ in the order sense.
A linear order $L$ is separable iff there is a countable subset $C$ of $L$ which is topologically dense in $L$. Note that if $L$ is separable and $(a, b)$ is a nonempty open interval of $L$, then $(a, b)$, with the order induced by $L$ ($x < y$ for $x, y \in (a, b)$ iff $x < y$ in $L$), is separable. In fact, if $C$ is countable and topologically dense in $L$, clearly $C \cap (a, b)$ is countable and topologically dense in $(a, b)$. Similarly, $[a, b]$ is separable, taking $(C \cap [a, b]) \cup \{a, b\}$. This remark will be used shortly.
A Suslin line is a linearly ordered set $(S, <)$ satisfying the following conditions:
(i) $S$ has ccc.
(ii) $S$ is not separable.
Suslin's Hypothesis (SH) is the statement that there do not exist Suslin lines.
Later in these notes we will prove that MA + $\neg$CH implies SH. Here MA is Martin's axiom, which we will define and discuss later. The consistency of MA + $\neg$CH requires iterated forcing, and will be proven much later in these notes. Also later in these notes we will prove that $\diamondsuit$ implies $\neg$SH, and still later we will prove that $\diamondsuit$ is consistent with ZFC, namely it follows from $V = L$. Both $\diamondsuit$ and $L$ are defined later.
For now we want to connect our notion of Suslin line with more familiar mathematics,
and with the original conjecture of Suslin. The following is a theorem of elementary set
theory.
Theorem 7.15. For any linear order $(L, \prec)$ the following conditions are equivalent:
(i) $(L, \prec)$ is isomorphic to $(\mathbb{R}, <)$.
(ii) The following conditions hold:
(a) $L$ has no first or last elements.
(b) $L$ is dense.
(c) Every nonempty subset of $L$ which is bounded above has a least upper bound.
(d) $L$ is separable.
Proof. (i)$\Rightarrow$(ii): standard facts about real numbers.
(ii)$\Rightarrow$(i): By (d), let $C$ be a countable subset of $L$ such that $(a, b) \cap C \ne \emptyset$ whenever $a < b$ in $L$. Clearly $C$ is infinite, is dense, and has no first or last element. By Theorem 7.2, let $f$ be an isomorphism from $(C, <)$ onto $(\mathbb{Q}, <)$. We now apply the procedure used at the end of the proof of Theorem 7.12. Let $P$ be a set disjoint from $\mathbb{Q}$ such that $|L \setminus C| = |P|$, and let $R = \mathbb{Q} \cup P$. Let $g$ be a bijection from $L \setminus C$ onto $P$, and define $h = f \cup g$. Define $x \prec_R y$ iff $x, y \in R$ and $h^{-1}(x) <_L h^{-1}(y)$. This makes $R$ into a linearly ordered set with $h$ an isomorphism from $L$ onto $R$. Now we adjoin first and last elements $a_R, b_R$ to $R$ and similarly $a_{\mathbb{R}}, b_{\mathbb{R}}$ for $\mathbb{R}$; call the resulting linearly ordered sets $R'$ and $\mathbb{R}'$. Then $R'$ and $\mathbb{R}'$ are both completions of $\mathbb{Q}$ according to Corollary 7.14. Hence (i) holds by Theorem 7.13.
Originally, Suslin made the conjecture that separability in Theorem 7.15 can be replaced by the condition that every family of pairwise disjoint open intervals is countable. The following theorem shows that this conjecture and our statement of Suslin's hypothesis are equivalent.
Theorem 7.16. The following conditions are equivalent:
(i) There is a Suslin line.
(ii) There is a linearly ordered set (L, <) satisfying the following conditions:
(a) L has no first or last elements.
(b) L is dense.
(c) Every nonempty subset of L which is bounded above has a least upper bound.
(d) No nonempty open subset of L is separable.
(e) L is ccc.
Proof. Obviously (ii) implies (i). Now suppose that (i) holds, and let $S$ be a Suslin line. We obtain (ii) in two steps: first taking care of denseness, and then taking the completion to finish up.
We define a relation $\sim$ on $S$ as follows: for any $a, b \in S$, $a \sim b$ iff one of the following holds:
$a = b$,
or $a < b$ and $[a, b]$ is separable,
or $b < a$ and $[b, a]$ is separable.
Clearly $\sim$ is an equivalence relation on $S$. Let $L$ be the collection of all equivalence classes under $\sim$.
(1) If $I \in L$, then $I$ is convex, i.e., if $a < c < b$ with $a, b \in I$, then also $c \in I$.
For, $[a, b]$ is separable, so $[a, c]$ is separable too, and hence $a \sim c$; so $c \in I$.
(2) If $I \in L$, then $I$ is separable.
For, this is clear if $I$ has only one or two elements. Suppose that $I$ has at least three elements. Then there exist $a, b \in I$ with $a < b$ and $(a, b) \ne \emptyset$. Let $\mathcal{M}$ be a maximal pairwise disjoint set of such intervals. Then $\mathcal{M}$ is countable. Say $\mathcal{M} = \{(x_n, y_n) : n \in \omega\}$. Since $x_n \sim y_n$, the interval $[x_n, y_n]$ is separable, so we can let $D_n$ be a countable dense subset of it. We claim that the following countable set $E$ is dense in $I$:
\[ E = \bigcup_{n \in \omega} D_n \cup \{e : e \text{ is the largest element of } I\} \cup \{a : a \text{ is the smallest element of } I\}. \]
Thus $e$ and $a$ are added only if they exist. To show that $E$ is dense in $I$, first suppose that $a, b \in I$, $a < b$, and $(a, b) \ne \emptyset$. Then by the maximality of $\mathcal{M}$, there is an $n \in \omega$ such that $(a, b) \cap (x_n, y_n) \ne \emptyset$. Choose $c \in (a, b) \cap (x_n, y_n)$. Then $\max(a, x_n) < c < \min(b, y_n)$, so there is a $d \in D_n \cap (\max(a, x_n), \min(b, y_n)) \subseteq (a, b)$, as desired. Second, suppose that $a \in I$ and $(a, \infty) \ne \emptyset$; here $(a, \infty) = \{x \in I : a < x\}$. We want to find $d \in E$ with $a < d$. If $I$ has a largest element $e$, then $e$ is as desired. Otherwise, there are $b, c \in I$ with $a < b < c$, and then an element of $(a, c) \cap E$, already shown to exist, is as desired. Similarly one deals with $(-\infty, a)$. Thus we have proved (2).
Now we define a relation $<$ on $L$ by setting $I < J$ iff $I \ne J$ and $a < b$ for some $a \in I$ and $b \in J$. By (1) this is equivalent to saying that $I < J$ iff $I \ne J$ and $a < b$ for all $a \in I$ and $b \in J$. In fact, suppose that $a \in I$ and $b \in J$ and $a < b$, and also $c \in I$ and $d \in J$, while $d \le c$. If $d \le a$, then $d \le a < b$ with $d, b \in J$ implies that $a \in J$, contradiction. Hence $a < d$. Since also $d \le c$ this gives $d \in I$, contradiction.
Clearly $<$ makes $L$ into a simply ordered set. Except for not being complete in the sense of (c), $L$ is close to the linear order we want.
To see that $L$ is dense, suppose that $I < J$ but $(I, J) = \emptyset$. Take any $a \in I$ and $b \in J$. Then $(a, b) \subseteq I \cup J$, and $I \cup J$ is separable by (2), so $a \sim b$, contradiction.
For (d), by a remark in the definition of separable it suffices to show that no open interval $(I, J)$ is separable. Suppose to the contrary that $(I, J)$ is separable. Let $\mathcal{A}$ be a countable dense subset of $(I, J)$. Also, let $\mathcal{B} = \{K \in L : I < K < J \text{ and } |K| > 2\}$. Any two distinct members of $\mathcal{B}$ are disjoint, and hence by ccc $\mathcal{B}$ is countable. In fact, each $K \in \mathcal{B}$ has the form $(a, b)$, $[a, b)$, $(a, b]$, or $[a, b]$. Since $|K| > 2$, in each case the open interval $(a, b)$ is nonempty. So ccc applies.
Define $\mathcal{C} = \mathcal{A} \cup \mathcal{B} \cup \{I, J\}$. By (2), each member of $\mathcal{C}$ is separable, so for each $K \in \mathcal{C}$ we can let $D_K$ be a countable dense subset of $K$. Let $E = \bigcup_{K \in \mathcal{C}} D_K$. So $E$ is a countable set. Fix $a \in I$ and $b \in J$. We claim that $E \cap (a, b)$ is dense in $(a, b)$. (Hence $a \sim b$ and so $I = J$, contradiction.) For, suppose that $a \le c < d \le b$ with $(c, d) \ne \emptyset$.
Case 1. $[c]_\sim = [d]_\sim = I$. Then $D_I \cap (c, d) \ne \emptyset$, so $E \cap (c, d) \ne \emptyset$, as desired.
Case 2. $[c]_\sim = [d]_\sim = J$. Similarly.
Case 3. $I < [c]_\sim = [d]_\sim < J$. Then $[c]_\sim \in \mathcal{B} \subseteq \mathcal{C}$, so the desired result follows again.
Case 4. $[c]_\sim < [d]_\sim$. Choose $K \in \mathcal{A}$ such that $[c]_\sim < K < [d]_\sim$. Hence $c < e < d$ for any $e \in D_K$, as desired.
Thus we have obtained a contradiction, which proves that $(I, J)$ is not separable.
Next, we claim that $L$ has ccc. In fact, suppose that $\mathcal{A}$ is an uncountable family of pairwise disjoint open intervals. Let $\mathcal{B}$ be the collection of all endpoints of members of $\mathcal{A}$, and for each $I \in \mathcal{B}$ choose $a_I \in I$. Then
\[ \{(a_I, a_J) : (I, J) \in \mathcal{A}\} \]
is an uncountable collection of pairwise disjoint nonempty open intervals in $S$, contradiction. In fact, given $(I, J) \in \mathcal{A}$, choose $K$ with $I < K < J$. Then $a_K \in (a_I, a_J)$. So $(a_I, a_J) \ne \emptyset$. Suppose that $(I, J), (I', J')$ are distinct members of $\mathcal{A}$. Wlog $J \le I'$. Then $a_J \le a_{I'}$, and it follows that $(a_I, a_J) \cap (a_{I'}, a_{J'}) = \emptyset$.
This finishes the first part of the proof. We have verified that $L$ satisfies (b), (d), and (e). Now let $M$ be the completion of $L$, and let $N$ be $M$ without its first and last elements. We claim that $N$ finally satisfies all of the conditions in (ii). Clearly $N$ is dense, it has no first or last elements, and every nonempty subset of it bounded above has a least upper bound. Next, suppose that $a < b$ in $N$ and $C$ is a countable subset of $(a, b)$ which is dense in $(a, b)$. Choose $c, d \in L$ such that $a < c < d < b$. For any $u, v \in C$ with $c < u < v < d$ choose $e_{uv} \in L$ such that $u < e_{uv} < v$; such an element exists by Corollary 7.14. We claim that $\{e_{uv} : u, v \in C, u < v\}$ is dense in $(c, d)$ in $L$, which is a contradiction. For, given $x, y$ such that $c < x < y < d$ in $L$, by the definition of denseness we can find $u, v \in C$ such that $x < u < v < y$; and then $x < e_{uv} < y$, as desired.
It remains only to prove that $N$ has ccc. Suppose that $\mathcal{A}$ is an uncountable collection of pairwise disjoint nonempty open intervals of $N$. By Corollary 7.14, for each $(a, b) \in \mathcal{A}$ we can find $c, d \in L$ such that $a < c < d < b$. So this gives an uncountable collection of pairwise disjoint nonempty open intervals in $L$, contradiction.
Characters of points and gaps
The rest of this chapter is optional; we prove a very useful theorem on characters of points
and gaps in linearly ordered sets due to Hausdorff.
For any cardinal , the order type which is the reverse of is denoted by . Reg is
the class of all regular cardinals. We define regular so that every regular cardinal is infinite.
If < are cardinals, then [, ]reg is the collection of all regular cardinals in the interval
[, ]; similarly for half-open and open intervals.
Let R Reg Reg. We define
left (R) = the least cardinal greater than each member of dmn(R);
right (R) = the least cardinal greater than each member of rng(R).
Let $L$ be a linear order, and let $x \in L$. If $x$ is the first element of $L$, then its left character is 0. If $x$ has an immediate predecessor, then its left character is 1. Finally, suppose that $x$ is not the first element of $L$ and does not have an immediate predecessor. Then the left character of $x$ is the smallest cardinal $\kappa$ such that there is a strictly increasing sequence of elements of $L$ of length $\kappa$ with supremum $x$. This cardinal is then regular. A symmetric definition for right character is clear. The character of $x$ is the pair $(\kappa, \lambda)$ where $\kappa$ is the left character and $\lambda$ is the right character. The point-character set of $L$ is the collection of all characters of points of $L$; we denote it by $\mathrm{Pchar}(L)$. Note that $\mathrm{Pchar}(L) \ne \emptyset$.
A gap of $L$ is an ordered pair $(M, N)$ such that $M \ne \emptyset \ne N$, $L = M \cup N$, $M$ has no largest element, $N$ has no smallest element, and $\forall x \in M\, \forall y \in N\, (x < y)$. The definitions of left and right characters of a gap are similar to the above definitions for points; but they are always infinite regular cardinals. Again, the character of $(M, N)$ is the pair $(\kappa, \lambda)$ where $\kappa$ is the left character and $\lambda$ is the right character. The gap-character set of $L$ is the collection of all characters of gaps of $L$; we denote it by $\mathrm{Gchar}(L)$. If $\mathrm{Gchar}(L) = \emptyset$ then we say that $L$ is inner-complete; $L$ may or may not have first and last elements.
The full character set of L is the pair (Pchar(L), Gchar(L)).
If $L$ does not have a first element, then the coinitiality of $L$ is the least cardinal $\kappa$ such that there is a strictly decreasing sequence $\langle a_\xi : \xi < \kappa\rangle$ of elements of $L$ such that $\forall x \in L\, \exists \xi < \kappa\, [a_\xi < x]$; we denote this cardinal by $\mathrm{ci}(L)$. Similarly for the right end, if $L$ does not have a greatest element then we define the cofinality of $L$, denoted by $\mathrm{cf}(L)$.
$L$ is irreducible iff it has no first or last elements, and the full character set of $(x, y)$ is the same as the full character set of $L$ for any two elements $x, y \in L$ with $x < y$.
Now a complete character system is a set R RegReg with the following properties:
(C1) dmn(R) = [, left (R))reg .
(C2) rng(R) = [, right (R))reg .
(C3) There is a such that (, ) R.
(C4) There is a linear order L of size |R| 1 with no strictly increasing sequence of regular
order type left (R) and no strictly decreasing sequence of regular order type right (R).
Note that conditions (1)(3) do not mention orderings. There are many sets which satisfy
the condition (4) and whose descriptions do not mention orderings. For example, any finite
set satisfies (4). If neither left (R) nor right (R) is regular limit, then (4) holds (this is the
condition of Hausdorff).
Proposition 7.17. If L is an irreducible infinite inner-complete dense linear order,
then Pchar(L) is a complete character system. Moreover, ci(L) right (Pchar(L)) and
cf(L) left (Pchar(L)).
Proof. Let R = Pchar(L). (C1): the inclusion is obvious. Now suppose that
[, left(R))reg . Then there is a point x of L with character (, ) such that .
Let ha : < i be a strictly increasing sequence of elements of L with supremum x. Let
y = sup< a . Clearly y has left character , as desired.
(C2): symmetric to (C1).
(C3): By a straightforward transfinite construction one gets (for some ordinal ) a
strictly increasing sequence hx : < i and a strictly decreasing sequence hy : < i such
that x < y for all , < , and such that there is exactly one point z with x < z < y
for all , < . Then is a limit ordinal, and z has character (cf(), cf()), as desired.
(C4): Suppose that cf(L) > left (R). Then by the argument for (C1), L has a point
with left character left (R), contradiction. A similar argument works for ci.
We shall use the sum construction for linear orders. If $\langle L_i : i \in I\rangle$ is a system of linear orders, and $I$ itself is an ordered set, then by $\sum_{i \in I} L_i$ we mean the set
\[ \{(i, a) : i \in I, a \in L_i\} \]
ordered lexicographically.
The following lemma is probably well-known.
Lemma 7.18. If $\langle L_i : i \in I\rangle$ is a system of complete linear orders, and $I$ is a complete linear order, then $\sum_{i \in I} L_i$ is also complete.
Proof. Suppose that $C$ is a nonempty subset of $\sum_{i \in I} L_i$. Let $i_0 = \sup\{i \in I : (i, a) \in C \text{ for some } a \in L_i\}$. We consider two cases.
Case 1. There is an $a \in L_{i_0}$ such that $(i_0, a) \in C$. Then we let $a_0 = \sup\{a \in L_{i_0} : (i_0, a) \in C\}$. Clearly $(i_0, a_0)$ is the supremum of $C$.
Case 2. There is no $a \in L_{i_0}$ such that $(i_0, a) \in C$. Then the supremum of $C$ is $(i_0, a)$, where $a$ is the first element of $L_{i_0}$.
Another construction we shall use is the infinite product. Suppose that $I$ is a well-ordered set and $\langle L_i : i \in I\rangle$ is a system of linear orders. Then we make $\prod_{i \in I} L_i$ into a linear order by defining, for $f, g \in \prod_{i \in I} L_i$,
\[ f < g \quad\text{iff}\quad f \ne g \text{ and } f(i) < g(i), \]
where $i = \mathrm{f.d.}(f, g)$, and $\mathrm{f.d.}(f, g)$ is the first $i \in I$ such that $f(i) \ne g(i)$.
Given such an infinite product, and given a strictly increasing sequence $x = \langle x_\xi : \xi < \alpha\rangle$ of members of it, with $\alpha$ a limit ordinal, we call $x$ of argument type if the following two conditions hold:
(A1) $\langle \mathrm{f.d.}(x_\xi, x_{\xi+1}) : \xi < \alpha\rangle$ is strictly increasing.
(A2) For each $\xi < \alpha$, the sequence $\langle \mathrm{f.d.}(x_\xi, x_\eta) : \xi < \eta < \alpha\rangle$ is a constant sequence.
On the other hand, $x$ is of basis type iff there is an $i \in I$ such that $\mathrm{f.d.}(x_\xi, x_\eta) = i$ for all distinct $\xi, \eta < \alpha$.
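Lemma 7.19. Let $\langle M_i : i \in I\rangle$ be a system of ordered sets, with $I$ well-ordered. If $x < y < z$ in $\prod_{i \in I} M_i$, then $\mathrm{f.d.}(x, z) = \min\{\mathrm{f.d.}(x, y), \mathrm{f.d.}(y, z)\}$.
Proof. Let $i = \min\{\mathrm{f.d.}(x, y), \mathrm{f.d.}(y, z)\}$.
Case 1. $i = \mathrm{f.d.}(x, y) = \mathrm{f.d.}(y, z)$. Then $x \upharpoonright i = y \upharpoonright i = z \upharpoonright i$ and $x(i) < y(i) < z(i)$, so $\mathrm{f.d.}(x, z) = i$.
Case 2. $i = \mathrm{f.d.}(x, y) < \mathrm{f.d.}(y, z)$. Then $x \upharpoonright i = y \upharpoonright i = z \upharpoonright i$ and $x(i) < y(i) = z(i)$, so $\mathrm{f.d.}(x, z) = i$.
Case 3. $i = \mathrm{f.d.}(y, z) < \mathrm{f.d.}(x, y)$. Then $x \upharpoonright i = y \upharpoonright i = z \upharpoonright i$ and $x(i) = y(i) < z(i)$, so $\mathrm{f.d.}(x, z) = i$.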
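As a sanity check, Lemma 7.19 can be verified by brute force on a small finite product. This is a sketch under stated assumptions, not part of the proof: the factors are taken to be initial segments of the natural numbers, points of the product are Python tuples (which Python compares lexicographically), and the names `first_difference` and `check_lemma` are chosen for this example.

```python
from itertools import product


def first_difference(f, g):
    """Return the first index at which the tuples f and g differ, or None if f == g."""
    for i, (a, b) in enumerate(zip(f, g)):
        if a != b:
            return i
    return None


def check_lemma(factors):
    """Check f.d.(x, z) == min(f.d.(x, y), f.d.(y, z)) for all x < y < z in the product."""
    points = sorted(product(*factors))  # tuple comparison is lexicographic
    for a in range(len(points)):
        for b in range(a + 1, len(points)):
            for c in range(b + 1, len(points)):
                x, y, z = points[a], points[b], points[c]
                expected = min(first_difference(x, y), first_difference(y, z))
                if first_difference(x, z) != expected:
                    return False
    return True


# Hypothetical factors M_0 = {0, 1}, M_1 = {0, 1, 2}, M_2 = {0, 1}.
factors = [range(2), range(3), range(2)]
lemma_holds = check_lemma(factors)
```

The check runs over all 220 triples $x < y < z$ in this 12-element product.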
The following is Satz XIV in Hausdorff [1908].


Theorem 7.20. Let $\langle M_i : i \in I\rangle$ be a system of ordered sets, with $I$ well-ordered. Suppose that $\kappa$ is regular and $\langle x_\xi : \xi < \kappa\rangle$ is a strictly increasing sequence of elements of $\prod_{i \in I} M_i$. Then this sequence has a subsequence of length $\kappa$ which is either of argument type or of basis type.
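Proof. First we claim
(1) For every $\xi < \kappa$ there is an $\eta > \xi$ and an $i \in I$ such that $\mathrm{f.d.}(x_\xi, x_\rho) = i$ for all $\rho \ge \eta$.
This is true because, by Lemma 7.19, if $\xi < \eta < \rho < \kappa$, then $\mathrm{f.d.}(x_\xi, x_\eta) \ge \mathrm{f.d.}(x_\xi, x_\rho)$; hence
$$\mathrm{f.d.}(x_\xi, x_{\xi+1}) \ge \mathrm{f.d.}(x_\xi, x_{\xi+2}) \ge \cdots \ge \mathrm{f.d.}(x_\xi, x_{\xi+\eta}) \ge \cdots$$
for all $\eta < \kappa$; so this sequence of elements of $I$ has a minimum, and (1) holds.
Now for each $\xi < \kappa$, let $\varphi(\xi)$ be the least $\eta > \xi$ so that an $i$ as in (1) exists, and let $i(\xi)$ be such an $i$. Thus
(2) For each $\xi < \kappa$ we have $\xi < \varphi(\xi)$, and for all $\rho \ge \varphi(\xi)$ we have $\mathrm{f.d.}(x_\xi, x_\rho) = i(\xi)$.
Now we define a function $\psi : \kappa \to \kappa$ by setting
$$\psi(0) = 0; \qquad \psi(\xi + 1) = \varphi(\psi(\xi)); \qquad \psi(\xi) = \sup_{\eta < \xi} \psi(\eta) \text{ for } \xi \text{ limit}.$$
Then we clearly have
(3) $\psi$ is a strictly increasing function, and $\mathrm{f.d.}(x_{\psi(\xi)}, x_{\psi(\eta)}) = i(\psi(\xi))$ for all $\xi, \eta < \kappa$ with $\xi < \eta$.
Moreover,
(4) If $\xi < \eta < \rho < \kappa$, then $i(\psi(\xi)) = \mathrm{f.d.}(x_{\psi(\xi)}, x_{\psi(\rho)}) \le \mathrm{f.d.}(x_{\psi(\eta)}, x_{\psi(\rho)}) = i(\psi(\eta))$.
In fact, this is clear by Lemma 7.19.
Now we consider two cases.
Case 1. $\forall \xi < \kappa\ \exists \eta < \kappa\ [\xi < \eta$ and $i(\psi(\xi)) < i(\psi(\eta))]$. Then there is a strictly increasing $\chi \in {}^{\kappa}\kappa$ such that for all $\xi, \eta < \kappa$, if $\xi < \eta$ then $i(\psi(\chi(\xi))) < i(\psi(\chi(\eta)))$. Hence for $\xi < \eta < \kappa$ we have
$$\mathrm{f.d.}(x_{\psi(\chi(\xi))}, x_{\psi(\chi(\xi+1))}) = i(\psi(\chi(\xi))) < i(\psi(\chi(\eta))) = \mathrm{f.d.}(x_{\psi(\chi(\eta))}, x_{\psi(\chi(\eta+1))});$$
moreover, if $\xi < \eta < \kappa$, then $\mathrm{f.d.}(x_{\psi(\chi(\xi))}, x_{\psi(\chi(\eta))}) = i(\psi(\chi(\xi)))$; so $\langle x_{\psi(\chi(\xi))} : \xi < \kappa\rangle$ is of argument type.
Case 2. $\exists \xi < \kappa\ \forall \eta < \kappa\ [\xi < \eta$ implies that $i(\psi(\xi)) = i(\psi(\eta))]$. Hence $\langle x_{\psi(\xi + \eta)} : \eta < \kappa\rangle$ is of basis type.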
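A variant of the product construction will be useful. Let $\kappa$ be an infinite regular cardinal. A V-system is a pair $(T, M)$ with the following properties:
(V1) For each $\alpha \le \kappa$, $T_\alpha$ is a collection of functions with domain $\alpha$.
(V2) For each $\alpha < \kappa$ and each $x \in T_\alpha$, $M_x$ is a linear order.
(V3) For each $\alpha < \kappa$ we have
$$T_{\alpha+1} = \{x \cup \{(\alpha, a)\} : x \in T_\alpha,\ a \in M_x\}.$$
(V4) If $\alpha \le \kappa$ is a limit ordinal and $x$ is a sequence with domain $\alpha$ such that $x \upharpoonright \beta \in T_\beta$ for every $\beta < \alpha$, then $x \in T_\alpha$.
(V5) If $\beta < \alpha \le \kappa$ and $x \in T_\alpha$, then $x \upharpoonright \beta \in T_\beta$.
We define a linear order on $T_\kappa$ by setting, for any $x, y \in T_\kappa$, $x < y$ iff $x \ne y$, and $x(\alpha) < y(\alpha)$, where $\alpha$ is minimum such that $x(\alpha) \ne y(\alpha)$. Here the second $<$ relation is that of $M_{x \upharpoonright \alpha}$.
The idea is that this is a variable product: not all functions in a cartesian product are allowed. If $x \in T_\kappa$, then for each $\alpha < \kappa$ the value $x(\alpha)$ lies in an ordered set $M_{x \upharpoonright \alpha}$ which depends on $x \upharpoonright \alpha$. Thus the linear order has a tree-like property.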
Theorem 7.21. Assume the above notation. For each < let M = {(x, y) : x
T , y Mx }. Let be any linear ordering of T , for each <Q, and let M have the
lexicographic ordering. Then there is an isomorphism of T into < M .
Namely, for each x T define f (x) by (f (x)) = (x , x()) for any < .
Then f is the indicated isomorphism. Moreover, for all x, y T we have f.d.(x, y) =
f.d.(f (x), f (y)).
Q
Proof. Clearly f maps T into < M . Suppose that x, y T and x < y.
Choose minimum such that x() 6= y(); so x() < y(). Hence (x , x()) < (x
, y()) = (y , y()). If < , then (x , x()) = (y , y()). Hence f (x) < f (y)
and f.d.(x, y) = f.d.(f (x), f (y)). On the other hand, suppose that f (x) < f (y). Let
= f.d.(f (x), f (y)). If < , then (f (x)) = (f (y)) , i.e., (x , x()) = (y , y()).
Hence x = y . Since (f (x)) < (f (y)) , we have (x , x()) < (y , y()).
Hence x() < y(). It follows that x < y.
Theorem 7.22. If (T, M ) is a V -system on a regular cardinal and each linear order
Mx is complete, then T is complete.
Proof. It suffices to take any regular cardinal , suppose that x = hx : < i
is a strictly increasing sequence in T , and show that it has a supremum. By Theorems
7.20 and 7.21 we may assume that x is of argument type or of basis type. Now if x is of
basis type, then it is a sequence in some Mx , and so it has a supremum by assumption.
Suppose that it is of argument type. Thus . Let = sup{f.d.(x , x+1 ) : < }.
Thus . Let y = x f.d.(x , x+1 ) for each < . Hence y Tf.d.(x ,x+1 ) by (V5).
Now
(1) y y if < < .
In fact, suppose that this is not true; say that < < but y 6 y . So there is an
< f.d.(x , x+1 ) such that y () 6= y (). Thus < f.d.(x , x +1 ), and x () 6= x ().
So f.d.(x , x ) , contradiction (see the definition of argument type, (A2)).
From (1), clearly
(2) y y if < < .
def S
Now consider the function z = < y . We consider two cases.
Case 1. = . Then z T and = . We claim that z is the supremum of
x in this case. If < , let = f.d.(x , x+1 ). Then z = y = x . Now
= f.d.(x , x+1 ) < f.d.(x+1 , x+2 ) by (A1). So x () < x+1 () = y+1 () z().
Thus x < z. Now suppose that w < z. Let = f.d.(w, z). Since = , choose <
such that < f.d.(x , x+1 ). Then w = z = y , and w() < z() = y (). So
w < y , as desired.
Case 2. < . Then also < . Now z T by (V4). We define an extension
v T of z by recursion. Let w0 = z. If w has been defined as a member of T+ , with
+ < , let a() be the least member of Mw , and set w+1 = w {(, a())}. So
w+1 T++1 by (V3). If is limitSand w has been defined as
S a member of T+ for
all < , and if + < , let w = < w . Finally, let v = < w . So v T and
it is an extension of z. We claim that it is the l.u.b. of hx : < i. First suppose that
< . Then x f.d.(x , x+1 ) = y = x+1 f.d.(x , x+1 ), and

x (f.d.(x , x+1 )) < x+1 (f.d.(x , x+1 )) = y+1 (f.d.(x , x+1 )) = z(f.d.(x , x+1 )).
def

Thus x < v. Now suppose that t < v. Then = f.d.(t, v) is less than by construction,
so t = z and t() < z(). By the definition of z this gives a < such that
t = x and t() < x (). So t < x , as desired.
Our main theorem is as follows. It generalizes Satz XVII of Hausdorff [1908], and its proof
is just a modification of his proof.
Theorem 7.23. Suppose that R is a complete character system, and , are regular
cardinals with right (R) and left (R). Then there is an irreducible inner-complete
dense order L such that Pchar(L) = R, with ci(L) = and cf(L) = .
Proof. By assumption, there is a linear order U of size |R| 1 with no increasing
sequence of regular order type left (R) and no decreasing sequence of regular order type
right (R). Let U be the completion of U . Then |U | |R|, and also U has no increasing
sequence of regular order type left (R) and no decreasing sequence of regular order type
right (R). Let g be a function mapping U onto R.
Now we define some important orders which are components of the final order L. Let
and be regular cardinals.
= + 1 + ;
X
g(a) ;
=
aU

(, ) = 1 + + + + 1.
The symmetry of this definition will enable us to shorten several proofs below. Since we are
using the standard notation for sums of order types, and some order types are repeated,
it is good to have an exact notation for the indicated orders. m, f, l are new elements
standing for middle, first, and last respectively. We suppose that with each ordinal
we associate a new element , used in forming things like . Thus more precisely,
= {m} { : < };
the ordering here is: has its natural order; for , < , we define < iff < . The
ordering between the indicated parts of are as suggested in the left to right order.
is as in the discussion of sums above.
(, ) = {f } { : < } {l};
the ordering should be obvious on the basis of the above remarks. We implicitly assume
the distinctness of the various objects making up (, ).
(1) (, ) is a complete linear order, for any regular cardinals , .
This is clear on the basis of Lemma 7.18.
(2) If right (R) is regular and left (R) is regular, then:
(a) the right character of the left end point of (, ) is ;
(b) the left character of the right end point of (, ) is ;
(c) if a (, ) is not an end point, and its character is (, ), then < left (R)
and < right (R).
In fact, (a) and (b) are clear. Now suppose that a (, ) is not an end point, and its
character is (, ). If a is in the or portion, the conclusion of (c) is clear. So, suppose
that a is in the portion. Thus it is within some g(b) with b U . Thus g(b) = (, ) for
some (, ) R, so the conclusion of (c) is clear. Hence (2) holds.
Let p be a new element, not appearing in any of the above orders. Let be the least
regular cardinal such that (, ) R; it exists by condition (C3) in the definition of a
complete character system. For each regular < left (R), let be the least cardinal such
that (, ) R; it exists by (C1) in the definition of complete character set. Similarly,
for each regular right (R) let be the least cardinal such that ( , ) R.
Now we define by recursion a V -system (T, M ) associated with . We do this so that
the following condition holds:
(*) If y Mx is not an endpoint of Mx , then the left character of y is less than left (R)
and the right character of y is less than right (R). Moreover, except for = 0, these
conditions on y also hold for the endpoints of Mx .
To start with, we let T0 = {} and M0 = (, ). Note by (2) that the first element of
M0 has right character , its last element has left character , all other right characters
are less than right (R), and all other left characters are less than left (R). So (*) holds for
= 0.
Now suppose that is a limit ordinal. We let T be the set of all x with domain
such that x T for all < . Now suppose that < , still with a limit ordinal.
Now if x T and |M(x) | > 1 for all < , we set
Mx = (cf() , cf() ).
On the other hand, if |M(x) | = 1 for some < , we set Mx = {p}. Clearly (*) holds
for .
Now suppose that = + 1. Then we set
T = {x hbi : x T and b Mx }.
Now we define Mx for each x T .
(3) If x() = p, then Mx = {p}.
(4) If x() is an endpoint of M(x) or has no immediate neighbors, then Mx = {p}.
(5) If x() has a right neighbor but no left neighbor, and the left character of x() in
M(x) is , then Mx = ( , ). Note that < left (R) by (*).
(6) If x() has a left neighbor but no right neighbor, and the right character of x() in
M(x) is , then Mx = (, ). Note that < right (R) by (*).
(7) If x() has both a left and a right neighbor, then Mx = (, ).
Clearly (*) holds for . This finishes the definition of (T, M ). The linear order T is close
to the order we are after.
The following two facts are clear from the construction:
(8) If x T , < < , and x() = p, then x() = p.
(9) If x T and < , then either M(x) = {p} or M(x) = (, ) for some , ;
except for M0 we have < left (R) and < right (R).
From Theorem 7.22 we know that T is complete. Now we find the characters of the
elements of T .
(10) The smallest element of T has character (0, ).
def

To prove this, note that the smallest element of T is x = hf, p, p, . . .i, where f is the first
element of M = (, ). For each < let y = h( + 1) , f, p, p, . . .i. Clearly x < y for
each < , and y < y if < < . Now suppose that x < w. Then there is a <
such that x(0) < w( ); hence y+1 < w, as desired. This proves (10).
By symmetry we get
(11) The largest element of T has character (, 0).
Let a be the first element of T , and b the last element.
(12) If a < x < b and |Mx | > 1 for every < , then x has character (, ).
For, by symmetry it suffices to show that x has left character . For each < let
y = (x ) hf, p, p, . . .i.
Clearly y < x for all < , and hy : < i is strictly increasing. Now suppose that
z N and z < x. Let = f.d.(z, x). Then clearly z < y+1 , as desired. So (12) holds.
Now suppose that a < x < b and x() = p for some < , and let be minimum
with this property. Then by construction, is a successor ordinal + 1, and x() is an
endpoint of M(x) or is an element of M(x) with no neighbor.
Case 1. x() is an element of M(x) with no neighbor. Then by definition, there is a
c U such that x() = (c, m), i.e., x() is the middle element of g(c) . Write g(c) = (, ).
So (, ) R. We claim that x has character (, ). To see this, for each < let
y = (x ) h(c, + 1), f, p, p, . . .i.
Then y < x, the sequence hy < < i is strictly increasing, and x is its supremum. So
the left character of x is , and similarly the right character of x is .
Case 2. x() is an endpoint of M(x) ; by symmetry, say that x() is f , the first
element. We now consider three subcases.
Subcase 2.1. = 0. This would imply that x = a, contradiction.
Subcase 2.2. is a limit ordinal. Then cf() < . So by construction, M(x) is
(cf() , cf() ). Clearly the character of x is (cf(), cf() ) R.
Subcase 2.3. = + 1 for some . Then
x = (x ) hx(), x(), p, p, . . .i.
Clearly then one of (5)(7) holds for x().
Subsubcase 2.3.1. (5) holds for x(). So x() has a right neighbor, but no left
neighbor. Say the left character of x() is . Then by (5), Mx is ( , ). We claim that
x has character (, ). To see this, let h : < i be strictly increasing with supremum
x(). Then for each < let y be any element of N such that x = y and
y () = . So clearly y < x and hy : < i is strictly increasing. Now suppose that
z N and z < x. If f.d.(z, x) < , then z < y0 . Suppose that f.d.(z, x) = . Then
z() < x(), so z() < for some < , and hence z < y . Clearly by the form of x,
one of these possibilities for z must hold. Hence the left character of x is . Clearly its
right character is .
Subsubcase 2.3.2. (6) holds for x(). So x() has a left neighbor , but no right
neighbor. Hence has a right neighbor, and hence (5) or (7) holds for (x ) hi in place
of x and in place of . Hence M(x) hi is ( , ) for some , or (, ). Now
def

(13) y = (x ) h, l, p, p, . . .i is the immediate predecessor of x.


In fact, clearly y < x. Suppose that z < x. Clearly f.d.(z, x) . If f.d.(z, x) < ,
obviously z < y. If f.d.(z, x) = , then z() . If z() < , then z < y. If z() = , then
z() l, and so z y. So (13) holds.
From the form of M(x) hi it is clear that the left character of y is . Now Mx is
(, ) for some , and so it is clear that the right character of x is also .
Subsubcase 2.3.3. (7) holds for x(). So x has a left neighbor and a right
neighbor. This case is actually almost the same as Subsubcase 2.3.2, except that Mx is
(, ).
Summarizing our investigation of characters of elements of T , we have:
(14) If a < x < b, then one of the following holds:
(a) x has no neighbors, and its character is in R.
(b) x has an immediate predecessor y, and the characters of x, y are (1, ) and
(, 1) respectively.
(c) x has an immediate successor y, and the characters of x, y are (, 1) and (1, )
respectively.
Now let L be obtained from T by deleting the second element of any pair (x, y) of elements
of T such that y is the immediate successor of x. Then L is a complete linear order which
has left endpoint of character (0, ), right endpoint of character (, 0), and the characters
of each of its members different from a and b are in R.
It remains to show that every member of R is the character of some element of L, and
in fact this is true in any interval. So, suppose that x and y are elements of T , x < y, y
not the immediate successor of x. Let = f.d.(x, y).
Case 1. y() is not the immediate successor of x(). Then
(15) There is an element u such that x() < u < y() and u has an immediate predecessor
or an immediate successor.
To see this, note that x() 6= p 6= y(a), and Mx = My has the form (, ) for some
(, ) R. Then (15) follows by considering the position of x(a) in (, ).
By (15), the interval (x, y) contains all members of T that begin with (x ) hui.
Now M(x) hui is a set (, ). So, given (, ) W , let v be the element of (, ), in
the part , which is the middle point of . Then
(x ) hu, v, p, p, . . .i
is a member of N with character (, ).
Case 2. y() is the immediate successor of x(). Let z = (x ( + 1)) hl, p, p, . . .i
and w = (y ( + 1)) hf, p, p, . . .i. Thus x z < w y. So w is the immediate successor
of z. Since y is not the immediate successor of x, we have x 6= z or y 6= w. Say x < z.
Now
x = (x ) hx(), x( + 1), . . .i
z = (x ) hx(), l, p, p, . . .i.

and

Now l is not the immediate successor of x( + 1), so we can find our desired element
between x and z as in Case 1.
Theorem 7.24. Suppose that R is a complete character system and and are
regular cardinals with right (R) and left (R). Suppose also that S T = R,
with S nonempty. Then there is an irreducible inner-complete dense order L such that
Pchar(L) = S, Gchar(L) = T , with ci(L) = and cf(L) = .
Proof. Take $L$ as in Theorem 7.23. Let
$$M = \{a \in L : \text{the character of } a \text{ is } (\mu, \nu) \text{ for some } (\mu, \nu) \in S \cap T\}.$$
Possibly $S \cap T$ is empty, so that $M$ is also empty. But if $S \cap T \ne \emptyset$, then $M$ is nonempty and dense in $L$. So by Theorem 7.1, in this case we can write $M = M_0 \cup M_1$ with $M_0, M_1$ disjoint and dense in $M$, hence also dense in $L$. Let
$$N = \{a \in L : \text{the character of } a \text{ is } (\mu, \nu) \text{ for some } (\mu, \nu) \in S \setminus T\}.$$
Then let $P = N \cup M_0$, or simply $P = N$ if $M = \emptyset$. Then $P$ is as desired in the theorem.

EXERCISES
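E7.1. Show that Theorem 7.2 does not extend to $\omega_1$. Hint: consider $\omega_1 \times \mathbb{Q}$ and $\omega_1^* \times \mathbb{Q}$, both with the lexicographic order, where $\omega_1^*$ is $\omega_1$ under the reverse order ($\alpha <^* \beta$ iff $\beta < \alpha$).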
E7.2. For any infinite cardinal , consider 2 under the lexicographic order, as for H .
Show that it is a complete linear order.
E7.3. Suppose that and are cardinals, with . Let be minimum such that
< . Take the lexicographic order on , as for H . Show that this gives a dense linear
order of size with a dense subset of size .
E7.4. Show that $\mathscr{P}(\omega)$ under $\subseteq$ contains a chain of size $2^\omega$. Hint: remember that $|\omega| = |\mathbb{Q}|$.
E7.5. A subset $S$ of a linear order $L$ is weakly dense iff for all $a, b \in L$, if $a < b$ then there is an $s \in S$ such that $a \le s \le b$. Show that the following conditions are equivalent for any
cardinals , such that :
(i) There is a linear order of size with a weakly dense subset of size .
(ii) P() has a chain of size .
E7.6. Suppose that $L_i$ is a linear order with at least two elements, for each $i \in \omega$. Let $\prod_{i \in \omega} L_i$ have the lexicographic order. Show that it is not a well-order.
E7.7. Suppose that $L$ is a ccc dense linear order. Show that $L$ has a dense subset of size $\le \omega_1$. Hint: let $\prec$ be a well-order of $L$, and let
$$N = \{p \in L : \text{there is an open set } U \text{ in } L \text{ such that } p \text{ is the } \prec\text{-first element of } U\},$$
and show that $N$ is dense in $L$ and has size at most $\omega_1$.
E7.8. Let $\langle L_i : i \in I\rangle$ be a system of linear orders, with $I$ itself an ordered set. Show that if each $L_i$ is dense without first or last elements, then also $\sum_{i \in I} L_i$ is dense without first or last elements.
E7.9. Let $\kappa$ be any infinite cardinal number. Let $L_0$ be a linear order similar to $\omega^* + \omega + 1$; specifically, let it consist of a copy of $\mathbb{Z}$ followed by one element $a$ greater than every integer, and let $L_1$ be a linear order similar to $\omega^* + \omega + 2$; say it consists of a copy of $\mathbb{Z}$ followed by two elements $a < b$ greater than every integer. For any $f \in {}^{\kappa}2$ let
$$M_f = \sum_{\alpha < \kappa} L_{f(\alpha)}.$$
Show that if $f, g \in {}^{\kappa}2$ and $f \ne g$, then $M_f$ and $M_g$ are not isomorphic.
Conclude that there are exactly $2^\kappa$ linear orders of size $\kappa$ up to isomorphism.
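E7.10. Let $\kappa$ be an uncountable cardinal. Let $L_0$ be a linear order similar to $\eta + 1 + \eta \cdot \omega_1^*$; specifically consisting of a copy of the rational numbers in the interval $(0, 1]$ followed by $\mathbb{Q} \times \omega_1$, where $\mathbb{Q} \times \omega_1$ is ordered as follows: $(r, \alpha) < (s, \beta)$ iff $\alpha > \beta$, or $\alpha = \beta$ and $r < s$.
Let $L_1$ be a linear order similar to $\eta \cdot \omega_1 + 1 + \eta \cdot \omega_1^*$; specifically, we take $L_1$ to be the set
$$\{(q, \alpha, 0) : q \in \mathbb{Q},\ \alpha < \omega_1\} \cup \{(0, 0, 1)\} \cup \{(q, \alpha, 2) : q \in \mathbb{Q},\ \alpha < \omega_1\},$$
with the following ordering:
$(q, \alpha, 0) < (r, \beta, 0)$ iff $\alpha < \beta$, or $\alpha = \beta$ and $q < r$;
$(q, \alpha, 0) < (0, 0, 1) < (r, \beta, 2)$ for all relevant $q, r, \alpha, \beta$;
$(q, \alpha, 2) < (r, \beta, 2)$ iff $\alpha > \beta$, or $\alpha = \beta$ and $q < r$.
For each $f \in {}^{\kappa}2$ let
$$M_f = \sum_{\alpha < \kappa} L_{f(\alpha)}.$$
Show that each $M_f$ is a dense linear order without first or last elements, and if $f, g \in {}^{\kappa}2$ and $f \ne g$, then $M_f$ and $M_g$ are not isomorphic.
Conclude that for $\kappa$ uncountable there are exactly $2^\kappa$ dense linear orders without first or last elements, of size $\kappa$, up to isomorphism.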
References
Harzheim, E. Ordered sets. Springer 2005, 386pp.
Hausdorff, F. Grundzüge einer Theorie der geordneten Mengen. Math. Ann. 65 (1908), 435-507. (Contains interesting theorems, relatively modern terminology.)
Rosenstein, J. Linear orderings. Academic Press 1982, 487pp.

8. Trees
In this chapter we study infinite trees. The main things we look at are König's tree theorem, Aronszajn trees, and Suslin trees.
A tree is a partially ordered set $(T, <)$ such that for each $t \in T$, the set $\{s \in T : s < t\}$ is well-ordered by the relation $<$. Thus every ordinal is a tree, but that is not so interesting in the present context. We introduce some standard terminology concerning trees.
For each $t \in T$, the order type of $\{s \in T : s < t\}$ is called the height of $t$, and is denoted by $\mathrm{ht}(t, T)$ or simply $\mathrm{ht}(t)$ if $T$ is understood.
A root of a tree $T$ is an element of $T$ of height 0, i.e., it is an element of $T$ with no elements of $T$ below it. Frequently we will assume that there is only one root.
For each ordinal $\alpha$, the $\alpha$-th level of $T$, denoted by $\mathrm{Lev}_\alpha(T)$, is the set of all elements of $T$ of height $\alpha$.
The height of $T$ itself is the least ordinal greater than the height of each element of $T$; it is denoted by $\mathrm{ht}(T)$.
A chain in $T$ is a subset of $T$ linearly ordered by $<$.
A branch of $T$ is a maximal chain of $T$.
Note that chains and branches of $T$ are actually well-ordered, and so we may talk about their lengths.
Some further terminology concerning trees will be introduced later. A typical tree is ${}^{<\omega}2$, which is by definition the set of all finite sequences of 0s and 1s, with $\subset$ as the partial order. More generally, one can consider ${}^{<\alpha}2$ for any ordinal $\alpha$.
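The terminology can be tried out on a finite piece of the full binary tree. The sketch below assumes that sequences are represented as Python tuples and that the tree order is "proper initial segment"; the function names are chosen for this example only. For a finite tree the order type of the set of predecessors of an element is just their number.

```python
from itertools import product


def binary_tree(n):
    """All 0/1 sequences of length < n, as tuples: a finite piece of the full binary tree."""
    return [s for k in range(n) for s in product((0, 1), repeat=k)]


def below(s, t):
    """Tree order: s is a proper initial segment of t."""
    return len(s) < len(t) and t[:len(s)] == s


def height(t, tree):
    """Height of t: the number of elements of the tree strictly below t."""
    return sum(1 for s in tree if below(s, t))


def level(k, tree):
    """The k-th level: all elements of height k."""
    return [t for t in tree if height(t, tree) == k]


tree = binary_tree(4)
# Height of the tree: least number greater than the height of every element.
tree_height = max(height(t, tree) for t in tree) + 1
```

Here the empty tuple is the unique root, level `k` consists of the `2**k` sequences of length `k`, and the tree has height 4.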
Theorem 8.1. (König) Every tree of height $\omega$ in which every level is finite has an infinite branch.
Proof. Let $T$ be a tree of height $\omega$ in which every level is finite. We define a sequence $\langle t_m : m \in \omega\rangle$ of elements of $T$ by recursion. Clearly $T = \bigcup_{r \text{ a root}} \{s \in T : r \le s\}$, and the index set is finite, so we can choose a root $t_0$ such that $\{s \in T : t_0 \le s\}$ is infinite. Suppose now that we have defined an element $t_m$ of height $m$ such that $\{s \in T : t_m \le s\}$ is infinite. Let $S = \{u \in T : t_m < u \text{ and } u \text{ has height } \mathrm{ht}(t_m) + 1\}$. Clearly
$$\{s \in T : t_m \le s\} = \{t_m\} \cup \bigcup_{u \in S} \{s \in T : u \le s\}$$
and the index set of the big union is finite, so we can choose $t_{m+1}$ of height $\mathrm{ht}(t_m) + 1$ such that $\{s \in T : t_{m+1} \le s\}$ is infinite.
This finishes the construction. Clearly $\{t_m : m \in \omega\}$ is an infinite branch of $T$.
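The recursion in this proof can be traced on a concrete example, with one important caveat: the proof is not constructive, since in general there is no way to decide whether the set of elements above a given node is infinite. The sketch below therefore assumes a hypothetical example tree for which this question has an obvious answer. The tree consists of all finite 0/1 tuples $s$ such that, if the first 1 of $s$ occurs at position $k$, then $s$ has length at most $2k + 1$; the names `in_tree`, `cone_is_infinite`, and `konig_branch` are chosen for this illustration.

```python
def in_tree(s):
    """Membership in the example tree: after a first 1 at position k, length is at most 2k + 1."""
    if 1 not in s:
        return True
    k = s.index(1)
    return len(s) <= 2 * k + 1


def cone_is_infinite(s):
    """Oracle for this example only: the set of nodes above s is infinite iff s contains no 1."""
    return in_tree(s) and 1 not in s


def konig_branch(steps):
    """Follow the recursion of the proof for the given number of steps, starting at the root."""
    t = ()  # the unique root; the whole tree is infinite
    branch = [t]
    for _ in range(steps):
        # Immediate successors of t in the tree (finitely many).
        successors = [t + (b,) for b in (0, 1) if in_tree(t + (b,))]
        # Choose a successor above which the tree is still infinite.
        t = next(u for u in successors if cone_is_infinite(u))
        branch.append(t)
    return branch


branch = konig_branch(5)
```

In this example every node containing a 1 has only finitely many nodes above it, so the recursion is forced along the all-zero sequences, which form the only infinite branch.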
In attempting to generalize König's theorem, one is naturally led to Aronszajn trees and Suslin trees. For the following definitions, let $\kappa$ be any infinite cardinal.
A tree $(T, <)$ is a $\kappa$-tree iff it has height $\kappa$ and every level has size less than $\kappa$.
A $\kappa$-Aronszajn tree is a $\kappa$-tree which has no chain of size $\kappa$.
A subset $X$ of a tree $T$ is an antichain iff any two distinct members of $X$ are incomparable. Note that each set $\mathrm{Lev}_\alpha(T)$ is an antichain.
A $\kappa$-Suslin tree is a tree of height $\kappa$ which has no chains or antichains of size $\kappa$.
An Aronszajn tree is an $\omega_1$-Aronszajn tree, and a Suslin tree is an $\omega_1$-Suslin tree.
It is natural to guess that Aronszajn trees and Suslin trees are the same thing, since the definition of $\kappa$-tree implies that all levels have size less than $\kappa$, and a guess is that this implies that all antichains are of size less than $\kappa$. This guess is not right though. Even our simplest example of a tree, ${}^{<\omega}2$, forms a counterexample. This tree has all levels finite, but it has infinite antichains, for example
$$\{\langle 0\rangle, \langle 1, 0\rangle, \langle 1, 1, 0\rangle, \langle 1, 1, 1, 0\rangle, \ldots\}.$$
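The incomparability of the members of this antichain is easy to check mechanically for any finite initial part of it. In the sketch below, sequences are represented as Python tuples, and the names `comparable` and `antichain_prefix` are assumptions of the example.

```python
def comparable(s, t):
    """True iff one of the tuples s, t is an initial segment of the other."""
    m = min(len(s), len(t))
    return s[:m] == t[:m]


def antichain_prefix(n):
    """The first n elements <0>, <1,0>, <1,1,0>, ... of the antichain from the text."""
    return [(1,) * k + (0,) for k in range(n)]


elements = antichain_prefix(6)
is_antichain = all(
    not comparable(s, t)
    for i, s in enumerate(elements)
    for t in elements[i + 1:]
)
```

The $k$-th sequence has a 0 in position $k$, where every later sequence has a 1, so no two of them are comparable; note also that the antichain meets each level of the tree in at most one point.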
In the rest of this chapter we investigate these notions, and state some consistency results,
some of which will be proved later. There is also one difficult natural open problem which
we will formulate.
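First we consider Aronszajn trees. Note that Theorem 8.1 can be rephrased as saying that there does not exist an $\omega$-Aronszajn tree. As far as existence of Aronszajn trees is concerned, the following theorem takes care of the case of singular $\kappa$:
Theorem 8.2. If $\kappa$ is singular, then there is a $\kappa$-Aronszajn tree.
Proof. Let $\langle \lambda_\alpha : \alpha < \mathrm{cf}(\kappa)\rangle$ be a strictly increasing sequence of infinite cardinals with supremum $\kappa$. Consider the tree which has a single root, and above the root has disjoint chains which are copies of the $\lambda_\alpha$'s. Clearly this tree is a $\kappa$-Aronszajn tree. We picture this tree here:
[Figure: a single root with disjoint chains of lengths $\lambda_0, \lambda_1, \ldots, \lambda_\alpha, \lambda_{\alpha+1}, \ldots$ above it.]
Very rigorously, we could define $T$ to be the set $\{0\} \cup \{(\alpha, \beta) : \alpha < \mathrm{cf}(\kappa) \text{ and } \beta < \lambda_\alpha\}$, with the ordering $0 < (\alpha, \beta)$ for all $\alpha < \mathrm{cf}(\kappa)$ and $\beta < \lambda_\alpha$, and $(\alpha, \beta) < (\alpha', \beta')$ iff $\alpha = \alpha'$ and $\beta < \beta'$.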
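Turning to regular $\kappa$, we first prove
Theorem 8.3. There is an Aronszajn tree.
Proof. We start with the tree
$$T = \{s \in {}^{<\omega_1}\omega : s \text{ is one-one}\}$$
under $\subset$. This tree clearly does not have a chain of size $\omega_1$. But all of its infinite levels are uncountable, so it is not an $\omega_1$-Aronszajn tree. We will define a subset of it that is the desired tree. We define a system $\langle S_\alpha : \alpha < \omega_1\rangle$ of subsets of $T$ by recursion; these will be the levels in the new tree.
Let $S_0 = \{\emptyset\}$. Now suppose that $\alpha > 0$ and $S_\beta$ has been constructed for all $\beta < \alpha$ so that the following conditions hold for all $\beta < \alpha$:
$(1_\beta)$ $S_\beta \subseteq {}^{\beta}\omega \cap T$.
$(2_\beta)$ $\omega \setminus \mathrm{rng}(s)$ is infinite, for every $s \in S_\beta$.
$(3_\beta)$ For all $\gamma < \beta$, if $s \in S_\gamma$, then there is a $t \in S_\beta$ such that $s \subset t$.
$(4_\beta)$ $|S_\beta| \le \omega$.
$(5_\beta)$ If $s \in S_\beta$, $t \in {}^{\beta}\omega \cap T$, and $\{\gamma < \beta : s(\gamma) \ne t(\gamma)\}$ is finite, then $t \in S_\beta$.
$(6_\beta)$ If $s \in S_\beta$ and $\gamma < \beta$, then $s \upharpoonright \gamma \in S_\gamma$.
(Vacuously these conditions hold for all $\beta < 0$.) If $\alpha$ is a successor ordinal $\beta + 1$, we simply take
$$S_\alpha = \{s \cup \{(\beta, n)\} : s \in S_\beta \text{ and } n \notin \mathrm{rng}(s)\}.$$
Clearly $(1_\gamma)$-$(6_\gamma)$ hold for all $\gamma < \alpha + 1$.
Now suppose that $\alpha$ is a limit ordinal less than $\omega_1$ and $(1_\beta)$-$(6_\beta)$ hold for all $\beta < \alpha$. Since $\alpha$ is a countable limit ordinal, it follows that $\mathrm{cf}(\alpha) = \omega$. Let $\langle \delta_n : n \in \omega\rangle$ be a strictly increasing sequence of ordinals with supremum $\alpha$. Now let $U = \bigcup_{\beta < \alpha} S_\beta$. Take any $s \in U$; we want to define an element $t_s \in {}^{\alpha}\omega \cap T$ which extends $s$. Let $\beta = \mathrm{dmn}(s)$.
Choose $n$ minimum such that $\beta \le \delta_n$. Now we define a sequence $\langle u_i : i \in \omega\rangle$ of members of $U$; $u_i$ will be a member of $S_{\delta_{n+i}}$. By $(3_{\delta_n})$, let $u_0$ be a member of $S_{\delta_n}$ such that $s \subseteq u_0$. Having defined a member $u_i$ of $S_{\delta_{n+i}}$, use $(3_{\delta_{n+i+1}})$ to get a member $u_{i+1}$ of $S_{\delta_{n+i+1}}$ such that $u_i \subset u_{i+1}$. This finishes the construction. Let $v = \bigcup_{i \in \omega} u_i$. Thus $s \subseteq v \in {}^{\alpha}\omega \cap T$. Unfortunately, condition (2) may not hold for $v$, so this is not quite the element $t_s$ that we are after. We define $t_s$ as follows. Let $\gamma < \alpha$. Then
$$t_s(\gamma) = \begin{cases} v(\delta_{2n+2i}) & \text{if } \gamma = \delta_{n+i} \text{ for some } i \in \omega, \\ v(\gamma) & \text{if } \gamma \notin \{\delta_{n+i} : i \in \omega\}. \end{cases}$$
Clearly $t_s \in {}^{\alpha}\omega \cap T$. Since $v(\delta_{2n+2i+1}) \notin \mathrm{rng}(t_s)$ for all $i \in \omega$, it follows that $\omega \setminus \mathrm{rng}(t_s)$ is infinite.
We now define
$$S_\alpha = \bigcup_{s \in U} \{w \in {}^{\alpha}\omega \cap T : \{\gamma < \alpha : w(\gamma) \ne t_s(\gamma)\} \text{ is finite}\}.$$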
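Now we want to check that $(1_\alpha)$-$(6_\alpha)$ hold. Conditions $(1_\alpha)$ and $(3_\alpha)$ are very clear. For $(2_\alpha)$, suppose that $w \in S_\alpha$. Then $w \in {}^{\alpha}\omega \cap T$ and there is an $s \in U$ such that $\{\gamma < \alpha : w(\gamma) \ne t_s(\gamma)\}$ is finite. Since $\omega \setminus \mathrm{rng}(t_s)$ is infinite, clearly $\omega \setminus \mathrm{rng}(w)$ is infinite. For $(4_\alpha)$, note that $U$ is countable by the assumption that $(4_\beta)$ holds for every $\beta < \alpha$, while for each $s \in U$ the set
$$\{w \in {}^{\alpha}\omega \cap T : \{\gamma < \alpha : w(\gamma) \ne t_s(\gamma)\} \text{ is finite}\}$$
is also countable. So $(4_\alpha)$ holds. For $(5_\alpha)$, suppose that $w \in S_\alpha$, $x \in {}^{\alpha}\omega \cap T$, and $\{\gamma < \alpha : w(\gamma) \ne x(\gamma)\}$ is finite. Choose $s \in U$ such that $\{\gamma < \alpha : w(\gamma) \ne t_s(\gamma)\}$ is finite. Then of course also $\{\gamma < \alpha : x(\gamma) \ne t_s(\gamma)\}$ is finite. So $x \in S_\alpha$, and $(5_\alpha)$ holds. Finally, for $(6_\alpha)$, suppose that $w \in S_\alpha$ and $\beta < \alpha$; we want to show that $w \upharpoonright \beta \in S_\beta$. Choose $s \in U$ such that $\{\gamma < \alpha : w(\gamma) \ne t_s(\gamma)\}$ is finite. Assume the notation introduced above when defining $t_s$ ($n$, $\delta$, $u$, $v$). Choose $i \in \omega$ such that $\beta \le \delta_{n+i}$. Then
$$\{\gamma < \delta_{n+i} : w(\gamma) \ne u_i(\gamma)\} = \{\gamma < \delta_{n+i} : w(\gamma) \ne v(\gamma)\} \subseteq \{\gamma < \delta_{n+i} : w(\gamma) \ne t_s(\gamma)\} \cup \{\delta_{n+j} : j < i\},$$
and the last union is clearly finite. It follows from $(5_{\delta_{n+i}})$ that $w \upharpoonright \delta_{n+i} \in S_{\delta_{n+i}}$, and hence $w \upharpoonright \beta \in S_\beta$ by $(6_{\delta_{n+i}})$. So $(6_\alpha)$ holds.
This finishes the construction. Clearly $\bigcup_{\alpha < \omega_1} S_\alpha$ is the desired Aronszajn tree.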
We defer a discussion of possible generalizations of Theorem 8.3 until we discuss the closely
related notion of a Suslin tree.
The proof of Theorem 8.2 gives
Theorem 8.4. If $\kappa$ is singular, then there is a $\kappa$-Suslin tree.
Note also that Theorem 8.1 implies that there are no $\omega$-Suslin trees. There do not exist ZFC results about existence or non-existence of $\kappa$-Suslin trees for $\kappa$ uncountable and regular. We limit ourselves at this point to some simple facts about Suslin trees.
Proposition 8.5. If $T$ is a $\kappa$-Suslin tree with $\kappa$ uncountable and regular, then $T$ is a $\kappa$-tree.
Proposition 8.6. For any infinite cardinal $\kappa$, every $\kappa$-Suslin tree is a $\kappa$-Aronszajn tree.
This is a good place to notice that the construction of an $\omega_1$-Aronszajn tree given in the proof of Theorem 8.3 does not give an $\omega_1$-Suslin tree. In fact, assume the notation of that proof, and for each $n \in \omega$ let
$$A_n = \bigcup_{\alpha < \omega_1} \{s \in S_{\alpha+1} : s(\alpha) = n\}.$$
Clearly $A_n$ is an antichain in $\bigcup_{\alpha < \omega_1} S_\alpha$, and $\bigcup_{n \in \omega} A_n = \bigcup_{\alpha < \omega_1} S_{\alpha+1}$. Hence $\left|\bigcup_{n \in \omega} A_n\right| = \omega_1$. It follows that some $A_n$ is uncountable, so that $\bigcup_{\alpha < \omega_1} S_\alpha$ is not a Suslin tree.
We now introduce some notions that are useful in talking about $\kappa$-trees; these conditions were implicit in part of the proof of Theorem 8.3.
A well-pruned $\kappa$-tree is a $\kappa$-tree $T$ with exactly one root such that for all $\alpha < \beta < \mathrm{ht}(T)$ and for all $x \in \mathrm{Lev}_\alpha(T)$ there is a $y \in \mathrm{Lev}_\beta(T)$ such that $x < y$.
A normal subtree of a tree $(T, <)$ is a tree $(S, \prec)$ satisfying the following conditions:
(i) $S \subseteq T$.
(ii) For any $s_1, s_2 \in S$, $s_1 \prec s_2$ iff $s_1 < s_2$.
(iii) For any $s, t \in T$, if $s < t$ and $t \in S$, then $s \in S$.
Note that each level of a normal subtree is a subset of the corresponding level of $T$. Clearly a normal subtree of height $\kappa$ of a $\kappa$-Aronszajn tree is a $\kappa$-Aronszajn tree; similarly for $\kappa$-Suslin trees.
A tree $T$ is eventually branching iff for all $t \in T$, the set $\{s \in T : t \le s\}$ is not a chain. Clearly a well-pruned $\kappa$-Aronszajn tree is eventually branching; similarly for $\kappa$-Suslin trees.
Theorem 8.7. If $\kappa$ is regular, then any $\kappa$-tree $T$ has a normal subtree $T'$ which is a well-pruned $\kappa$-tree. Moreover, if $x \in T$ and $|\{y \in T : x \le y\}| = \kappa$ then we may assume that $x \in T'$.
Proof. Let $\kappa$ be regular, and let $T$ be a $\kappa$-tree. We define
$$S = \{t \in T : |\{s \in T : t \le s\}| = \kappa\}.$$
Clearly $S$ is a normal subtree of $T$, although it may contain more than one root of $T$. Now we claim
(1) Some root of $T$ is in $S$.
In fact, $\mathrm{Lev}_0(T)$ has size less than $\kappa$, and
$$T = \bigcup_{s \in \mathrm{Lev}_0(T)} \{t \in T : s \le t\},$$
so there is some $s \in \mathrm{Lev}_0(T)$ such that $|\{t \in T : s \le t\}| = \kappa$. This element $s$ is in $S$, as desired in (1).
We now take an $s$ as indicated. To satisfy the second condition in the Theorem, we can take $s$ below the element $x$ of that condition.
Now we let $S' = \{t \in S : s \le t\}$. We claim that $S'$ is as desired. Clearly it is a normal subtree of $T$, and it has exactly one root, namely $s$. To show that it has height $\kappa$ and is well-pruned, it suffices now to prove
(2) If $u \in S'$, $\alpha < \beta < \kappa$, and $\mathrm{ht}(u, S') = \alpha$, then there is a $v \in S' \cap \mathrm{Lev}_\beta(T)$ such that $u < v$.
In fact,
$$\{t \in T : u \le t\} = \bigcup_{\alpha \le \gamma < \beta} \{t \in \mathrm{Lev}_\gamma(T) : u \le t\} \cup \bigcup_{\substack{v \in \mathrm{Lev}_\beta(T) \\ u < v}} \{t \in T : v \le t\},$$
and the first big union here is the union of fewer than $\kappa$ sets, each of size less than $\kappa$. Hence there is a $v \in \mathrm{Lev}_\beta(T)$ such that $u < v$ and $|\{t \in T : v \le t\}| = \kappa$. So $v \in S'$ and $u < v$, as desired.
Proposition 8.8. Let $\kappa$ be an uncountable regular cardinal. If $T$ is an eventually branching $\kappa$-tree in which every antichain has size less than $\kappa$, then $T$ is a $\kappa$-Suslin tree.
Proof. Suppose to the contrary that $C$ is a chain of length $\kappa$. We may assume that $C$ is maximal, so that it has elements of each level less than $\kappa$. For each $t \in T$ choose $f(t) \in T$ such that $t < f(t) \notin C$; this is possible by the eventually branching hypothesis. Now we define $\langle s_\alpha : \alpha < \kappa\rangle$ by recursion, choosing
$$s_\alpha \in \left\{t \in C : \sup_{\beta < \alpha} \mathrm{ht}(f(s_\beta), T) < \mathrm{ht}(t, T)\right\};$$
this is possible since $\kappa$ is regular. Now $\langle f(s_\alpha) : \alpha < \kappa\rangle$ is an antichain. In fact, if $\beta < \alpha$ and $f(s_\beta)$ and $f(s_\alpha)$ are comparable, then by construction $\mathrm{ht}(f(s_\beta), T) < \mathrm{ht}(s_\alpha, T) < \mathrm{ht}(f(s_\alpha), T)$, and so $f(s_\beta) < f(s_\alpha)$. But then the tree property yields that $f(s_\beta) < s_\alpha$ and so $f(s_\beta) \in C$, contradiction.
Thus we have an antichain of size $\kappa$, contradiction.
One of the main motivations for the notion of a Suslin tree comes from a correspondence
between linear orders and trees. Under this correspondence, Suslin trees correspond to
Suslin lines, and the existence of Suslin trees is equivalent to the existence of Suslin lines.
First we show how to go from a tree to a line, in a fairly general setting. Suppose that $T$ is a well-pruned $\kappa$-tree, and let $\prec$ be a linear order of $T$. Here $\prec$ may have nothing to do with the order of the tree. Note that every branch of $T$ has limit ordinal length. For each branch $B$ of $T$, let $\mathrm{len}(B)$ be its length, and let $\langle b^B(\alpha) : \alpha < \mathrm{len}(B)\rangle$ be an enumeration of $B$ in increasing order. For distinct branches $B_1, B_2$, neither is included in the other, and so we can let $d(B_1, B_2)$ be the smallest ordinal $\alpha < \min(\mathrm{len}(B_1), \mathrm{len}(B_2))$ such that $b^{B_1}(\alpha) \ne b^{B_2}(\alpha)$. We define the $\prec$-branch linear order of $T$, denoted by $B(T, \prec)$, to be the collection of all branches of $T$, where the order $<$ on $B(T, \prec)$ is defined as follows: for any two distinct branches $B_1, B_2$,
$$B_1 < B_2 \quad\text{iff}\quad b^{B_1}(d(B_1, B_2)) \prec b^{B_2}(d(B_1, B_2)).$$
This is a kind of lexicographic ordering of the branches. Clearly this is an irreflexive relation, and clearly any two branches are comparable. The following lemma gives that it is transitive.
Lemma 8.9. Assume that B1 < B2 < B3 . Then exactly one of the following holds:
(i) d(B1 , B3 ) = d(B1 , B2 ) < d(B2 , B3 ).
(ii) d(B1 , B3 ) = d(B1 , B2 ) = d(B2 , B3 ).
(iii) d(B1 , B3 ) = d(B2 , B3 ) < d(B1 , B2 ).
In any case B1 < B3 .
Clearly at most one of (i)-(iii) holds. These three conditions are illustrated as follows:
[Figure: three diagrams showing how the branches B1, B2, B3 split from one another in cases (i), (ii), and (iii).]

Case 1. $d(B_1, B_2) < d(B_2, B_3)$. Then, we claim, $d(B_1, B_3) = d(B_1, B_2)$. In fact, if $\alpha < d(B_1, B_2)$, then
$$b^{B_1}(\alpha) = b^{B_2}(\alpha) = b^{B_3}(\alpha),$$
while
$$b^{B_1}(d(B_1, B_2)) \prec b^{B_2}(d(B_1, B_2)) = b^{B_3}(d(B_1, B_2)).$$
Hence the claim holds, and $B_1 < B_3$.
Case 2. $d(B_1, B_2) = d(B_2, B_3)$. Then, we claim, $d(B_1, B_3) = d(B_1, B_2)$. In fact, if $\alpha < d(B_1, B_2)$, then
$$b^{B_1}(\alpha) = b^{B_2}(\alpha) = b^{B_3}(\alpha),$$
while
$$b^{B_1}(d(B_1, B_2)) \prec b^{B_2}(d(B_1, B_2)) \prec b^{B_3}(d(B_1, B_2)).$$
This proves the claim, and $B_1 < B_3$.
Case 3. $d(B_1, B_2) > d(B_2, B_3)$. Then, we claim, $d(B_1, B_3) = d(B_2, B_3)$. In fact, if $\alpha < d(B_2, B_3)$, then
$$b^{B_1}(\alpha) = b^{B_2}(\alpha) = b^{B_3}(\alpha),$$
while
$$b^{B_1}(d(B_2, B_3)) = b^{B_2}(d(B_2, B_3)) \prec b^{B_3}(d(B_2, B_3)).$$
This proves the claim, and $B_1 < B_3$.
Thus the construction gives a linear order.
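The branch order can be tried out on a small finite tree. The following is only a sketch under simplifying assumptions: the tree, the node names, and the ranking RANK (standing in for the auxiliary linear order of the nodes) are hypothetical, and a finite tree is of course not a well-pruned tree of infinite height. It computes d(B1, B2) as the first index where two branches differ, compares branches at that index, and checks transitivity by brute force.

```python
from itertools import permutations


def first_difference(b1, b2):
    """Return d(B1, B2): the least index at which two branches differ."""
    for i, (x, y) in enumerate(zip(b1, b2)):
        if x != y:
            return i
    raise ValueError("branches must be distinct and of equal length")


def branch_less(b1, b2, rank):
    """B1 < B2 iff B1's node at d(B1, B2) precedes B2's node there under rank."""
    d = first_difference(b1, b2)
    return rank[b1[d]] < rank[b2[d]]


def is_transitive(branches, rank):
    """Brute-force check of the conclusion of Lemma 8.9 on all triples."""
    return all(
        branch_less(x, z, rank)
        for x, y, z in permutations(branches, 3)
        if branch_less(x, y, rank) and branch_less(y, z, rank)
    )


# Hypothetical toy tree: each branch is a root-to-leaf path of node names.
BRANCHES = [("r", "a", "c"), ("r", "a", "d"), ("r", "b", "e"), ("r", "b", "f")]
# An arbitrary linear order on the nodes, unrelated to the tree order.
RANK = {"f": 0, "a": 1, "r": 2, "d": 3, "b": 4, "c": 5, "e": 6}

# Sorting by the sequence of ranks performs the same lexicographic comparison.
ORDERED = sorted(BRANCHES, key=lambda b: [RANK[n] for n in b])
```

Because the ranking is injective, comparing the rank sequences lexicographically agrees with comparing the two branches at their first point of difference, which is why the sort key reproduces the branch order.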
Theorem 8.10. If there is a Suslin tree then there is a Suslin line.
Proof. By Theorem 8.7 we may assume that $T$ is well-pruned. Take any linear order $\prec$ of $T$. To show that $\mathcal{B}(T, \prec)$ is ccc, suppose that $\mathcal{A}$ is an uncountable collection of nonempty pairwise disjoint open intervals in $\mathcal{B}(T, \prec)$. For each $(B, C) \in \mathcal{A}$ choose $E_{(B,C)} \in (B, C)$. Remembering that each branch has limit length, we can also select an ordinal $\alpha_{(B,C)}$ such that
$$d(B, E_{(B,C)}),\ d(E_{(B,C)}, C) < \alpha_{(B,C)} < \mathrm{len}(E_{(B,C)}).$$
We claim that $\langle b^{E_{(B,C)}}(\alpha_{(B,C)}) : (B, C) \in \mathcal{A} \rangle$ is a system of pairwise incomparable elements of $T$, which contradicts the definition of a Suslin tree. In fact, suppose that $(B, C)$ and $(B', C')$ are distinct elements of $\mathcal{A}$ and $b^{E_{(B,C)}}(\alpha_{(B,C)}) \le b^{E_{(B',C')}}(\alpha_{(B',C')})$. It follows that
(1) $b^{E_{(B,C)}}(\beta) = b^{E_{(B',C')}}(\beta)$ for all $\beta \le \alpha_{(B,C)}$.
Hence
(2) If $\beta < d(B, E_{(B,C)})$, then $\beta < \alpha_{(B,C)}$, and so $b^B(\beta) = b^{E_{(B,C)}}(\beta) = b^{E_{(B',C')}}(\beta)$.
Now recall that $d(B, E_{(B,C)}) < \alpha_{(B,C)}$. Hence
$$b^B(d(B, E_{(B,C)})) \prec b^{E_{(B,C)}}(d(B, E_{(B,C)})) = b^{E_{(B',C')}}(d(B, E_{(B,C)})),$$
and so $B < E_{(B',C')}$. Similarly, $E_{(B',C')} < C$, as follows:
(3) If $\beta < d(C, E_{(B,C)})$, then $\beta < \alpha_{(B,C)}$, and so $b^C(\beta) = b^{E_{(B,C)}}(\beta) = b^{E_{(B',C')}}(\beta)$.
Now recall that $d(C, E_{(B,C)}) < \alpha_{(B,C)}$. Hence
$$b^C(d(C, E_{(B,C)})) \succ b^{E_{(B,C)}}(d(C, E_{(B,C)})) = b^{E_{(B',C')}}(d(C, E_{(B,C)})),$$
and so $C > E_{(B',C')}$. Hence $E_{(B',C')} \in (B, C)$. But also $E_{(B',C')} \in (B', C')$, contradiction.
To show that $\mathcal{B}(T, \prec)$ is not separable, it suffices to show that for each $\delta < \omega_1$ the set $\{B \in \mathcal{B}(T, \prec) : \mathrm{len}(B) < \delta\}$ is not dense in $\mathcal{B}(T, \prec)$. Take any $x \in T$ of height $\delta$. Since $\{y : y > x\}$ has elements of every level greater than $\delta$, it cannot be a chain, as this would give a chain of size $\omega_1$. So there exist incomparable $y, z > x$. Similarly, there exist incomparable $u, v > y$. Let $B, C, D$ be branches containing $u, v, z$ respectively. By symmetry say $B < C$.
(4) $\mathrm{ht}(y) < d(B, C)$.
This holds since $y \in B \cap C$.
(5) $d(B, D) \le \mathrm{ht}(y)$ and $d(C, D) \le \mathrm{ht}(y)$; hence $d(B, D) < d(B, C)$ and $d(C, D) < d(B, C)$.
In fact, $y \in B \setminus D$, so $d(B, D) \le \mathrm{ht}(y)$ follows. Similarly $d(C, D) \le \mathrm{ht}(y)$. Now the rest follows by (4).
(6) $d(B, D) = d(C, D)$.
For, if $d(B, D) < d(C, D)$, then $b^C(d(B, D)) = b^D(d(B, D)) \ne b^B(d(B, D))$, contradicting $d(B, D) < d(B, C)$, part of (5). If $d(C, D) < d(B, D)$, then $b^B(d(C, D)) = b^D(d(C, D)) \ne b^C(d(C, D))$, contradicting $d(C, D) < d(B, C)$, part of (5).
(7) $B < C < D$ or $D < B < C$.
In fact, otherwise we have $B < D < C$, and Lemma 8.9 gives the possibilities $d(B, C) = d(B, D) < d(D, C)$, $d(B, C) = d(B, D) = d(D, C)$, or $d(B, C) = d(D, C) < d(B, D)$; each of these contradicts (5) or (6).
Case 1. $B < C < D$. Thus $(B, D)$ is a nonempty open interval. Suppose that there is some branch $E$ with $\mathrm{len}(E) < \delta$ and $B < E < D$. Then $d(B, E), d(E, D) < \delta$. By Lemma 8.9 one of the following holds: $d(B, D) = d(B, E) < d(E, D)$; $d(B, D) = d(B, E) = d(E, D)$; $d(B, D) = d(E, D) < d(B, E)$. Hence $d(B, D) < \delta$. Since $x \in B \cap D$ has height $\delta$, we have $b^B(\beta) = b^D(\beta)$ for all $\beta \le \delta$, so $d(B, D) > \delta$, contradiction.
Case 2. $D < B < C$. Thus $(D, C)$ is a nonempty open interval. Suppose that there is some branch $E$ with $\mathrm{len}(E) < \delta$ and $D < E < C$. Then $d(D, E), d(E, C) < \delta$. By Lemma 8.9 one of the following holds: $d(D, C) = d(D, E) < d(E, C)$; $d(D, C) = d(D, E) = d(E, C)$; $d(D, C) = d(E, C) < d(D, E)$. Hence $d(D, C) < \delta$. Since $x \in C \cap D$ has height $\delta$, we have $b^C(\beta) = b^D(\beta)$ for all $\beta \le \delta$, so $d(D, C) > \delta$, contradiction.
In the other direction, we prove:
Theorem 8.11. If there is a Suslin line, then there is a Suslin tree.
Proof. Assume that there is a Suslin line. Then by Theorem 7.16 we may assume
that we have a linear order L satisfying the following conditions:
(1) L is dense, with no first or last elements.
(2) No nonempty open subset of L is separable.
(3) L is ccc.
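(We do not need the other condition given in Theorem 7.16.) Let $\mathcal{I}$ be the collection of all open intervals $(a, b)$ with $a < b$ in $L$. So by denseness, each such interval is nonempty. We are now going to define a sequence $\langle \mathcal{J}_\alpha : \alpha < \omega_1 \rangle$ of subsets of $\mathcal{I}$. Let $\mathcal{J}_0$ be a maximal disjoint subset of $\mathcal{I}$. Now suppose that $0 < \beta < \omega_1$ and we have defined $\mathcal{J}_\alpha$ for all $\alpha < \beta$ so that the following conditions hold:
($4_\alpha$) The elements of $\mathcal{J}_\alpha$ are pairwise disjoint.
($5_\alpha$) $\bigcup \mathcal{J}_\alpha$ is dense in $L$.
($6_\alpha$) If $\gamma < \alpha$, $I \in \mathcal{J}_\gamma$, and $J \in \mathcal{J}_\alpha$, then either $I \cap J = \emptyset$, or else $J \subseteq I$.
($7_\alpha$) If $\gamma < \alpha$ and $I \in \mathcal{J}_\gamma$, then there are at least two $J \in \mathcal{J}_\alpha$ such that $J \subseteq I$.
Note that ($4_0$)-($7_0$) hold: ($6_0$) and ($7_0$) trivially hold, ($4_0$) holds by definition, and ($5_0$) holds by the maximality of $\mathcal{J}_0$.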
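First suppose that $\beta$ is a successor ordinal $\gamma + 1$. For each $M \in \mathcal{J}_\gamma$, choose disjoint members $I_1, I_2$ of $\mathcal{I}$ such that $I_1 \cup I_2 \subseteq M$, and let $\mathcal{K}_M$ be a maximal disjoint subset of
$$\{K \in \mathcal{I} : K \subseteq M\}$$
such that $I_1, I_2 \in \mathcal{K}_M$. The existence of $I_1$ and $I_2$ is clear by denseness. Then let $\mathcal{J}_\beta = \bigcup_{M \in \mathcal{J}_\gamma} \mathcal{K}_M$. Clearly ($4_\beta$) holds.
For ($5_\beta$), suppose that $a, b \in L$ and $a < b$. By ($5_\gamma$), choose $c \in \bigcup \mathcal{J}_\gamma$ such that $a < c < b$. Say $c \in (d, e) \in \mathcal{J}_\gamma$. Thus $\max(a, d) < c < \min(b, e)$. We claim:
(8) There is a $K \in \mathcal{K}_{(d,e)}$ such that $(\max(a, d), \min(b, e)) \cap K \ne \emptyset$.
For, suppose that (8) fails. Choose $u, v$ with $\max(a, d) < u < v < \min(b, e)$. Then $(u, v) \cap K = \emptyset$ for all $K \in \mathcal{K}_{(d,e)}$ and $(u, v) \subseteq (d, e)$. This contradicts the maximality of $\mathcal{K}_{(d,e)}$. So (8) holds.
Choose $K$ as in (8). It follows that $\bigcup \mathcal{J}_\beta \cap (a, b) \ne \emptyset$, as desired for ($5_\beta$).
For ($6_\beta$), suppose that $\alpha \le \gamma$, $I \in \mathcal{J}_\alpha$, and $J \in \mathcal{J}_\beta$. Choose $M \in \mathcal{J}_\gamma$ such that $J \in \mathcal{K}_M$. Now we consider two cases.
Case 1. $\alpha = \gamma$. Then by ($4_\gamma$), either $I \cap M = \emptyset$ or $I = M$. If $I \cap M = \emptyset$, then $I \cap J = \emptyset$ since $J \subseteq M$, as desired in ($6_\beta$). If $I = M$, then $J \subseteq I$ by definition.
Case 2. $\alpha < \gamma$. In this case, by ($6_\gamma$) we have two further possibilities. If $I \cap M = \emptyset$, then also $I \cap J = \emptyset$, as desired in ($6_\beta$). Otherwise we have $M \subseteq I$ and clearly also $J \subseteq I$.
($7_\beta$) is clear by construction.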
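Second, suppose that $\beta$ is a limit ordinal. Let
$$\mathcal{K} = \{K \in \mathcal{I} : \text{for all } \alpha < \beta \text{ and all } I \in \mathcal{J}_\alpha\ [I \cap K = \emptyset \text{ or } K \subseteq I]\}.$$
Before defining $\mathcal{J}_\beta$, we need to know that $\mathcal{K}$ is nonempty. This follows from the following stronger statement.
(9) If $a, b \in L$ and $a < b$, then there is a $K \in \mathcal{K}$ such that $K \subseteq (a, b)$.
To prove this, suppose that $a, b \in L$ and $a < b$. Let $E$ be the collection of all endpoints of the intervals in $\bigcup_{\alpha < \beta} \mathcal{J}_\alpha$. Since $\beta$ is countable and each $\mathcal{J}_\alpha$ is countable by virtue of ($4_\alpha$) and the ccc for $L$, it follows that $E$ is countable. Since $(a, b)$ is not separable, there are $c, d \in L$ such that $a < c < d < b$ and $E \cap (c, d) = \emptyset$. For every $I \in \bigcup_{\alpha < \beta} \mathcal{J}_\alpha$, the interval $(c, d)$ does not contain either of the endpoints of $I$, so it follows that $I \cap (c, d) = \emptyset$ or $(c, d) \subseteq I$. Hence $(c, d) \in \mathcal{K}$ and $(c, d) \subseteq (a, b)$, as desired in (9).
Now let $\mathcal{J}_\beta$ be a maximal pairwise disjoint subset of $\mathcal{K}$. So ($4_\beta$) holds, and also ($6_\beta$) is clear.
Now to prove ($5_\beta$), take any $a, b \in L$ with $a < b$, and choose $K \in \mathcal{K}$ such that $K \subseteq (a, b)$, as given in (9). By the maximality of $\mathcal{J}_\beta$, there is an $M \in \mathcal{J}_\beta$ such that $K \cap M \ne \emptyset$. Hence $(a, b) \cap \bigcup \mathcal{J}_\beta \ne \emptyset$, as desired in ($5_\beta$).
Finally, for ($7_\beta$), let $I \in \mathcal{J}_\alpha$ where $\alpha < \beta$. By ($7_{\alpha+1}$), there are two distinct $J, K \in \mathcal{J}_{\alpha+1}$ such that $J, K \subseteq I$, and then the construction gives $J' \subseteq J$ and $K' \subseteq K$ with $J', K' \in \mathcal{J}_\beta$, as desired.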
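This finishes the construction. Let $T = \bigcup_{\alpha < \omega_1} \mathcal{J}_\alpha$, with the ordering $\supseteq$. So this gives a partial order.
(10) If $I \in T$, then there is a unique $\alpha < \omega_1$ such that $I \in \mathcal{J}_\alpha$.
For, by definition there is some $\alpha < \omega_1$ such that $I \in \mathcal{J}_\alpha$. Suppose that also $I \in \mathcal{J}_\beta$, with $\alpha \ne \beta$. By symmetry, say that $\alpha < \beta$. This contradicts ($7_\beta$).
We denote the $\alpha$ given by (10) by $\alpha_I$.
(11) If $I \in T$ and $\beta < \alpha_I$, then there is a unique $J \in \mathcal{J}_\beta$ such that $I \subseteq J$.
In fact, by ($5_\beta$) there is a $J \in \mathcal{J}_\beta$ such that $I \cap J \ne \emptyset$. Then by ($6_{\alpha_I}$), $I \subseteq J$. Hence by ($4_\beta$), also $J$ is unique.
Let $I \in T$. For each $\beta < \alpha_I$, let $f(\beta)$ be the unique $J$ given by (11). Then $f$ is an order-isomorphism from $\alpha_I$ onto $\{J \in T : I \subset J\}$ under $\supset$. In fact, if $\beta < \gamma < \alpha_I$, then $I \subseteq f(\beta) \cap f(\gamma)$, and so ($6_\gamma$) and ($7_\gamma$) imply that $f(\gamma) \subset f(\beta)$. The function $f$ maps onto, since if $I \subset J$ with $J \in \mathcal{J}_\beta$, then $\beta < \alpha_I$ by ($6_{\alpha_I}$), and so $f(\beta) = J$.
It follows that $T$ is a tree, and each $I \in T$ has level $\alpha_I$. So $T$ is a tree of height $\omega_1$. If $\mathcal{A}$ is an antichain in $T$, then by ($6_\alpha$) it is also an antichain in $L$ in the ordered set sense, that is, a collection of pairwise disjoint open intervals, and so it is countable.
By Proposition 8.8, $T$ is a Suslin tree.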
We mention without proof a result for higher cardinals. Assuming $V = L$, for each uncountable regular cardinal $\kappa$, there is a $\kappa$-Suslin tree iff $\kappa$ is not weakly compact. (Weakly compact cardinals will be discussed later; they are inaccessible.) It is a probably difficult open problem to show that it is consistent (relative to ZFC or even ZFC plus some large cardinals) that for each uncountable cardinal $\kappa$ there is no $\kappa^+$-Aronszajn tree.
EXERCISES
E8.1. Let $\kappa$ be an uncountable regular cardinal, and suppose that there is a $\kappa$-Aronszajn tree. Show that there is one which is a normal subtree of ${}^{<\kappa}2$. Hint: for each $\alpha < \kappa$ let $g_\alpha$ be an injection of $\mathrm{Lev}_\alpha(T)$ into ${}^{|\mathrm{Lev}_\alpha(T)|}2$ and glue these maps together.
E8.2. Do exercise E8.1 for $\kappa$-Suslin trees.
E8.3. Suppose that $T$ and $T'$ are $\kappa$-Aronszajn trees. Define an order $<$ on $T \times T'$ by $(s, s') < (t, t')$ iff $s < t$ and $s' < t'$. Show that $T \times T'$ is not a tree.
E8.4. Suppose that $T$ and $T'$ are $\kappa$-Aronszajn trees. Let
$$T \otimes T' = \bigcup_{\alpha < \kappa} \mathrm{Lev}_\alpha(T) \times \mathrm{Lev}_\alpha(T');$$
$$(s, s') < (t, t') \quad\text{iff}\quad (s, s'), (t, t') \in T \otimes T',\ s < t, \text{ and } s' < t'.$$
Show that $(T \otimes T', <)$ is a $\kappa$-Aronszajn tree.
E8.5. Assume that $\kappa$ is regular and uncountable. Suppose that $T$ is a $\kappa$-Suslin tree. With the order on $T \otimes T$ given in exercise E8.4, show that $T \otimes T$ is not a $\kappa$-Suslin tree. Hint: first show that for every $\alpha < \kappa$ there is an element $s$ of $T$ at level $\alpha$ such that there are incomparable $t, u > s$.
E8.6. A tree $T$ is everywhere branching iff every $t \in T$ has at least two immediate successors. Show that every everywhere branching tree has at least $2^\omega$ branches.
E8.7. Show that the hypothesis that all levels are finite is necessary in König's theorem.
E8.8. Show that if $\kappa$ is singular with $\mathrm{cf}(\kappa) = \omega$, then there is no $\kappa$-Aronszajn tree with all levels finite.
E8.9. Prove that if $\kappa$ is singular and there is a $\mathrm{cf}(\kappa)$-Aronszajn tree, then there is a $\kappa$-Aronszajn tree with all levels of power less than $\mathrm{cf}(\kappa)$.
E8.10. Show that for every infinite cardinal there is an eventually branching tree T of
height such that for every subset S of T , if S is a tree under the order induced by T and
every element of S has at least two immediate successors, then S has height .
E8.11. Show that if $\kappa$ is an uncountable regular cardinal and $T$ is a $\kappa$-Aronszajn tree, then $T$ has a subset $S$ such that under the order induced by $T$, $S$ is a well-pruned $\kappa$-Aronszajn tree in which every element has at least two immediate successors.
E8.12.Let (T, <) be a -Aronszajn tree, and let be any linear order of T . For incomparable elements s, t of T let (s, t), (s, t) be the elements of T such that (s, t) s, (s, t) t,
(s, t) and (s, t) are different elements of the same height, and w < (s, t)[w < t].
We define s t iff s, t T and one of the following conditions holds:
(i) s < t.
(ii) s 6< t and t 6< s and (s, t) (s, t).
Show that (T, ) is a linear order in which every element has character (, ) with , < .
Reference
Todorcevic, S. Trees and linearly ordered sets. In Handbook of set-theoretic topology.
North-Holland 1984, 235-293.

9. Clubs and stationary sets


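Here we introduce the important notions of clubs and stationary sets. A basic result here is Fodor's theorem. We also give a combinatorial principle $\diamondsuit$, later proved consistent with ZFC, and use $\diamondsuit$ to construct a Suslin tree.
A subset $\Gamma$ of an ordinal $\alpha$ is unbounded iff for every $\beta < \alpha$ there is a $\gamma \in \Gamma$ such that $\beta \le \gamma$. A subset $C$ of $\alpha$ is closed in $\alpha$ provided that for every limit ordinal $\beta < \alpha$, if $C \cap \beta$ is unbounded in $\beta$ then $\beta \in C$. Closed and unbounded subsets of $\alpha$ are called clubs of $\alpha$.
The following simple fact about ordinals will be used below.
Lemma 9.1. If $\alpha$ is an ordinal and $\Gamma \subseteq \alpha$, then $\mathrm{o.t.}(\Gamma) \le \alpha$.
Proof. Let $\beta = \mathrm{o.t.}(\Gamma)$, and let $f$ be the isomorphism of $\beta$ onto $\Gamma$. For all $\gamma < \beta$ we have $\gamma \le f(\gamma) < \alpha$, so $\beta \subseteq \alpha$ and hence $\beta \le \alpha$.
Note that $\emptyset$ is club in $0$. If $\alpha = \beta + 1$, then $\{\beta\}$ is club in $\alpha$. We are mainly interested in limit ordinals $\alpha$. Then an equivalent way of looking at clubs is as follows.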
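Theorem 9.2. Let $\alpha$ be a limit ordinal.
(i) If $C$ is club in $\alpha$, then there exist an ordinal $\beta$ and a normal function $f : \beta \to \alpha$ such that $\mathrm{rng}(f) = C$.
(ii) If $\beta$ is an ordinal and $f : \beta \to \alpha$ is a normal function such that $\mathrm{rng}(f)$ is unbounded in $\alpha$, then $\mathrm{rng}(f)$ is club in $\alpha$.
Proof. (i): Let $\beta$ be the order type of $C$, and let $f : \beta \to C$ be the isomorphism of $\beta$ onto $C$. Thus $f : \beta \to \alpha$, and $f$ is strictly increasing. To show that $f$ is continuous, suppose that $\gamma < \beta$ is a limit ordinal; we want to show that $f(\gamma) = \bigcup_{\delta < \gamma} f(\delta)$. Let $\varepsilon = \bigcup_{\delta < \gamma} f(\delta)$. Clearly $\varepsilon$ is a limit ordinal. Now $C \cap \varepsilon$ is unbounded in $\varepsilon$. For, suppose that $\zeta < \varepsilon$. Then there is a $\delta < \gamma$ such that $\zeta < f(\delta)$. Since $\delta + 1 < \gamma$ and $f(\delta) < f(\delta + 1)$, we thus have $f(\delta) \in C \cap \varepsilon$. So, as claimed, $C \cap \varepsilon$ is unbounded in $\varepsilon$. Hence $\varepsilon \in C$. Since $\varepsilon$ is the lub of $f[\gamma]$, it follows that $f(\gamma) = \varepsilon$, as desired. This proves (i).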
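(ii): Let $C = \mathrm{rng}(f)$. We just need to show that $C$ is closed in $\alpha$. Suppose that $\gamma < \alpha$ is a limit ordinal, and $C \cap \gamma$ is unbounded in $\gamma$. We are going to show that $\delta \stackrel{\mathrm{def}}{=} \bigcup f^{-1}[\gamma]$ is a limit ordinal less than $\beta$ and $f(\delta) = \gamma$, thereby proving that $\gamma \in C$.
Choose $\varepsilon \in C$ such that $\gamma < \varepsilon$. Say $f(\varphi) = \varepsilon$. Then $f^{-1}[\gamma] \subseteq \varphi$, since for every ordinal $\xi$, if $\xi \in f^{-1}[\gamma]$ then $f(\xi) < \varepsilon = f(\varphi)$ and so $\xi < \varphi$. It follows that also $\bigcup f^{-1}[\gamma] \le \varphi < \beta$.
Next, $\bigcup f^{-1}[\gamma]$ is a limit ordinal. It is nonzero because $C \cap \gamma$ is unbounded in the limit ordinal $\gamma$. Now if $\xi < \bigcup f^{-1}[\gamma]$, choose $\eta \in f^{-1}[\gamma]$ such that $\xi \in \eta$. Thus $f(\eta) < \gamma$. Since $\gamma$ is a limit ordinal and $C \cap \gamma$ is unbounded in $\gamma$, there is a $\theta$ such that $f(\eta) < f(\theta) < \gamma$. Hence $\eta < \theta \in f^{-1}[\gamma]$, so $\eta \in \bigcup f^{-1}[\gamma]$, and therefore $\xi + 1 \le \eta < \bigcup f^{-1}[\gamma]$. This shows that $\bigcup f^{-1}[\gamma]$ is a limit ordinal.
We have $f(\delta) = \bigcup_{\xi < \delta} f(\xi)$ by continuity. If $\xi < \delta$, choose $\eta \in f^{-1}[\gamma]$ such that $\xi < \eta$. Then $f(\xi) < f(\eta) < \gamma$. This shows that $f(\delta) \le \gamma$.
Finally, suppose that $\varepsilon < \gamma$. Since $C \cap \gamma$ is unbounded in $\gamma$, choose $\xi$ such that $\varepsilon \le f(\xi) < \gamma$. Then $\xi \in f^{-1}[\gamma]$, so $\xi \in \bigcup f^{-1}[\gamma]$ by the argument of the preceding paragraph, i.e., $\xi < \delta$. Then $\varepsilon \le f(\xi) < f(\delta)$. This shows that $\gamma \le f(\delta)$, hence $f(\delta) = \gamma$.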
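Corollary 9.3. If $\kappa$ is a regular cardinal and $C \subseteq \kappa$, then the following conditions are equivalent:
(i) $C$ is club in $\kappa$.
(ii) There is a normal function $f : \kappa \to \kappa$ such that $\mathrm{rng}(f) = C$.
Proof. (i)$\Rightarrow$(ii): Suppose that $C$ is club in $\kappa$. By Theorem 9.2(i) let $\beta$ be an ordinal and $f : \beta \to \kappa$ a normal function with $\mathrm{rng}(f) = C$. Thus $\beta$ is the order type of $C$, and so by Lemma 9.1, $\beta \le \kappa$. The regularity of $\kappa$ together with $C$ being unbounded in $\kappa$ imply that $\beta = \kappa$. Thus (ii) holds.
(ii)$\Rightarrow$(i): Suppose that $f : \kappa \to \kappa$ is a normal function such that $\mathrm{rng}(f) = C$. Since $f$ is strictly increasing, $\xi \le f(\xi)$ for all $\xi < \kappa$, so $\mathrm{rng}(f)$ is unbounded in $\kappa$. Then by Theorem 9.2(ii), $C$ is club in $\kappa$.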
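Corollary 9.4. If $\alpha$ is a limit ordinal, then there is a club of $\alpha$ with order type $\mathrm{cf}(\alpha)$.
Proof. By Theorem 6.42, let $f : \mathrm{cf}(\alpha) \to \alpha$ be a strictly increasing function with $\mathrm{rng}(f)$ unbounded in $\alpha$. Define $g : \mathrm{cf}(\alpha) \to \alpha$ by recursion, as follows:
$$g(\beta) = \begin{cases} 0 & \text{if } \beta = 0, \\ \max(f(\gamma), g(\gamma) + 1) & \text{if } \beta = \gamma + 1 \text{ for some } \gamma, \\ \sup_{\gamma < \beta} g(\gamma) & \text{if } \beta \text{ is a limit ordinal.} \end{cases}$$
Clearly then $g$ is a normal function from $\mathrm{cf}(\alpha)$ into $\alpha$, with $\mathrm{rng}(g)$ unbounded in $\alpha$. By Theorem 9.2(ii), the existence of the desired set $C$ follows.
If $\mathrm{cf}(\alpha) = \omega$, then Corollary 9.4 yields a strictly increasing function $f : \omega \to \alpha$ with $\mathrm{rng}(f)$ unbounded in $\alpha$. Then $\mathrm{rng}(f)$ is club in $\alpha$. The condition on limit ordinals in the definition of club is trivial in this case. Most of our results concern limit ordinals of uncountable cofinality.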
If $\alpha$ is any limit ordinal and $\beta < \alpha$, then the interval $[\beta, \alpha)$ is a club of $\alpha$. Another simple fact about clubs is that if $C$ is club in a limit ordinal $\alpha$ of uncountable cofinality, then the set $D$ of all limit ordinals which are in $C$ is also club in $\alpha$. (We need $\alpha$ of uncountable cofinality in order to have $D$ unbounded.) Also, if $C$ is club in $\alpha$ with $\mathrm{cf}(\alpha) > \omega$, then the set $E$ of all limit points of members of $C$ is also club in $\alpha$. This set $E$ is defined to be $\{\beta < \alpha : \beta$ is a limit ordinal and $C \cap \beta$ is unbounded in $\beta\}$; clearly $E \subseteq C$.
Now we give the first major fact about clubs.
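Theorem 9.5. If $\alpha$ is a limit ordinal with $\mathrm{cf}(\alpha) > \omega$, then the intersection of fewer than $\mathrm{cf}(\alpha)$ clubs of $\alpha$ is again a club.
Proof. Suppose that $\beta < \mathrm{cf}(\alpha)$ and $\langle C_\xi : \xi < \beta \rangle$ is a system of clubs of $\alpha$. Let $D = \bigcap_{\xi < \beta} C_\xi$. First we show that $D$ is closed. To this end, suppose that $\gamma < \alpha$ is a limit ordinal, and $D \cap \gamma$ is unbounded in $\gamma$. Then for each $\xi < \beta$, the set $C_\xi \cap \gamma$ is unbounded in $\gamma$, and hence $\gamma \in C_\xi$ since $C_\xi$ is closed in $\alpha$. Therefore $\gamma \in D$.
To show that $D$ is unbounded in $\alpha$, take any $\gamma < \alpha$; we want to find $\delta > \gamma$ such that $\delta \in D$. We make a simple recursive construction of a sequence $\langle \varepsilon_n : n \in \omega \rangle$ of ordinals less than $\alpha$. Let $\varepsilon_0 = \gamma$. Suppose that $\varepsilon_n$ has been defined. Using the fact that each $C_\xi$ is unbounded in $\alpha$, for each $\xi < \beta$ choose $\theta_{n,\xi} \in C_\xi$ such that $\varepsilon_n < \theta_{n,\xi}$. Then let
$$\varepsilon_{n+1} = \sup_{\xi < \beta} \theta_{n,\xi};$$
we have $\varepsilon_{n+1} < \alpha$ since $\beta < \mathrm{cf}(\alpha)$. This finishes the recursive construction. Let $\delta = \sup_{n \in \omega} \varepsilon_n$. Then $\delta < \alpha$ since $\mathrm{cf}(\alpha) > \omega$. Clearly $C_\xi \cap \delta$ is unbounded in $\delta$ for each $\xi < \beta$, and hence $\delta \in C_\xi$. So $\delta \in D$, as desired.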
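The recursive step in the unboundedness argument, where each stage jumps past a fresh member of every club, can be imitated with natural numbers. This is only an illustrative sketch under loud assumptions: the "clubs" are replaced by two hypothetical unbounded sets of naturals given as predicates, and the function names are invented for this example. It shows why every set acquires a member between consecutive terms of the sequence, which is what makes each set unbounded below the supremum.

```python
def next_member(pred, n):
    """Least m > n with pred(m) true; assumes the set defined by pred is unbounded."""
    m = n + 1
    while not pred(m):
        m += 1
    return m


def catch_up_sequence(preds, start, steps):
    """Iterate: the next term is the max over all sets of a member above the current term."""
    seq = [start]
    for _ in range(steps):
        seq.append(max(next_member(p, seq[-1]) for p in preds))
    return seq


def multiple_of_3(m):
    return m % 3 == 0


def multiple_of_5(m):
    return m % 5 == 0


PREDS = [multiple_of_3, multiple_of_5]
SEQ = catch_up_sequence(PREDS, 1, 4)
```

In the theorem itself the supremum of the sequence stays below the ordinal because its cofinality is uncountable; the finite run above only illustrates the interleaving.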
Again let $\alpha$ be any limit ordinal, and suppose that $\langle C_\xi : \xi < \alpha \rangle$ is a system of subsets of $\alpha$. We define the diagonal intersection of this system:
$$\triangle_{\xi < \alpha} C_\xi = \{\beta \in \alpha : \forall \xi < \beta\,(\beta \in C_\xi)\}.$$
This construction is used often in discussion of clubs, in particular in the definition of some of the large cardinals.
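A finite analogue may help to see how the diagonal intersection differs from the ordinary intersection. The sketch below is an assumption-laden toy: a natural number bound stands in for the limit ordinal, and the sets are the tails [x+1, bound), which play the role of the clubs obtained by removing an initial segment. The ordinary intersection of all the tails is empty, while the diagonal intersection is everything.

```python
def diagonal_intersection(sets, bound):
    """Return {b < bound : b belongs to sets[x] for every x < b}."""
    return {b for b in range(bound) if all(b in sets[x] for x in range(b))}


BOUND = 12
# TAILS[x] = [x + 1, BOUND): a finite stand-in for a tail club.
TAILS = [set(range(x + 1, BOUND)) for x in range(BOUND)]

PLAIN = set.intersection(*TAILS)
DIAGONAL = diagonal_intersection(TAILS, BOUND)
```

The point of the design is that membership of b is tested only against the sets indexed below b, so a long system of sets can still have a large diagonal intersection.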
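Theorem 9.6. Suppose that $\mathrm{cf}(\alpha) > \omega$. Assume that $\langle C_\xi : \xi < \alpha \rangle$ is a system of clubs of $\alpha$.
(i) If $\bigcap_{\xi < \beta} C_\xi$ is unbounded in $\alpha$ for each $\beta < \alpha$, then $\triangle_{\xi < \alpha} C_\xi$ is club in $\alpha$.
(ii) If $\alpha$ is regular, then $\triangle_{\xi < \alpha} C_\xi$ is club in $\alpha$.
Proof. Clearly (ii) follows from (i) (using Theorem 9.5 to verify the hypothesis of (i)), so it suffices to prove (i). Assume the hypothesis of (i).
For brevity set $D = \triangle_{\xi < \alpha} C_\xi$. First we show that $D$ is closed in $\alpha$. So, assume that $\beta$ is a limit ordinal less than $\alpha$, and $D \cap \beta$ is unbounded in $\beta$. To show that $\beta \in D$, take any $\xi < \beta$; we show that $\beta \in C_\xi$. Let $E = \{\gamma \in D \cap \beta : \xi < \gamma\}$. Then $E$ is unbounded in $\beta$, and for each $\gamma \in E$ we have $\gamma \in C_\xi$, by the definition of $D$. So $\beta \in C_\xi$ since $C_\xi$ is closed.
Second we show that $D$ is unbounded in $\alpha$. So, take any $\beta < \alpha$. We define a sequence $\langle \gamma_i : i < \omega \rangle$ of ordinals less than $\alpha$ by recursion. Let $\gamma_0 = \beta$. If $\gamma_i$ has been defined, by the hypothesis of (i) let $\gamma_{i+1}$ be a member of $\bigcap_{\xi < \gamma_i} C_\xi$ which is greater than $\gamma_i$. Finally, let $\delta = \sup_{i \in \omega} \gamma_i$. So $\delta < \alpha$ since $\mathrm{cf}(\alpha) > \omega$. We claim that $\delta \in D$. To see this, take any $\xi < \delta$. Choose $i \in \omega$ such that $\xi < \gamma_i$. Then $\gamma_j \in C_\xi$ for all $j > i$, and hence $C_\xi \cap \delta$ is unbounded in $\delta$, so $\delta \in C_\xi$. This argument shows that $\delta \in D$.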
We give one more general fact about closed and unbounded sets; this one is frequently
useful in showing that specific sets are closed and unbounded.
A finitary partial operation on a set $A$ is a nonempty function whose domain is a subset of ${}^m A$ for some positive integer $m$ and whose range is a subset of $A$. We say that a subset $B$ of $A$ is closed under such an operation $f$ iff for every $a \in ({}^m B) \cap \mathrm{dmn}(f)$ we have $f(a) \in B$.
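Theorem 9.7. Suppose that $\kappa$ is an uncountable regular cardinal, $X \in [\kappa]^{<\kappa}$, and $\mathcal{F}$ is a collection of finitary partial operations on $\kappa$, with $|\mathcal{F}| < \kappa$. Then $\{\alpha < \kappa : X \subseteq \alpha$ and $\alpha$ is closed under each $f \in \mathcal{F}\}$ is club in $\kappa$.
Proof. Denote the indicated set by $C$. To show that it is closed, suppose that $\alpha$ is a limit ordinal less than $\kappa$, and $C \cap \alpha$ is unbounded in $\alpha$. Clearly $X \subseteq \alpha$. To show that $\alpha$ is closed under any partial operation $f \in \mathcal{F}$, suppose that $\mathrm{dmn}(f) \subseteq {}^m \kappa$ and $a \in ({}^m \alpha) \cap \mathrm{dmn}(f)$. For each $i < m$ choose $\beta_i < \alpha$ such that $a_i \in \beta_i$. Since $\alpha$ is a limit ordinal, the ordinal $\gamma \stackrel{\mathrm{def}}{=} \bigcup_{i < m} \beta_i$ is still less than $\alpha$. Since $C \cap \alpha$ is unbounded in $\alpha$, choose $\delta \in C \cap \alpha$ such that $\gamma < \delta$. Then $a \in {}^m \delta$ so, since $\delta \in C$, we have $f(a) \in \delta \subseteq \alpha$. Thus $\alpha$ is closed under $f$. Hence $\alpha \in C$; so $C$ is closed in $\kappa$.
To show that $C$ is unbounded in $\kappa$, take any $\alpha < \kappa$. We now define a sequence $\langle \beta_n : n \in \omega \rangle$ by recursion. Let $\beta_0$ be an ordinal less than $\kappa$ such that $\alpha < \beta_0$ and $X \subseteq \beta_0$; this is possible since $|X| < \kappa$ and $\kappa$ is regular. Having defined $\beta_i < \kappa$, consider the set
$$\{f(a) : f \in \mathcal{F},\ a \in \mathrm{dmn}(f), \text{ and each } a_j \text{ is in } \beta_i\}.$$
This set clearly has fewer than $\kappa$ members. Hence we can take $\beta_{i+1}$ to be some ordinal less than $\kappa$ which is greater than $\beta_i$ and greater than each member of this set. This finishes the construction.
Let $\gamma = \bigcup_{i \in \omega} \beta_i$. Then $\gamma < \kappa$ since $\kappa$ is regular and uncountable. We claim that $\gamma \in C$, as desired. For, suppose that $f \in \mathcal{F}$, $f$ has domain $\subseteq {}^n \kappa$, and $a \in ({}^n \gamma) \cap \mathrm{dmn}(f)$. Then for each $i < n$ choose $m_i \in \omega$ such that $a_i \in \beta_{m_i}$. Let $p$ be the maximum of all the $m_i$'s. Then $a \in ({}^n \beta_p) \cap \mathrm{dmn}(f)$, so by construction $f(a) \in \beta_{p+1} \subseteq \gamma$.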
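The closing-off iteration in the unboundedness half of this proof has a direct finite analogue. The following sketch makes several assumptions that are not in the text: natural numbers stand in for ordinals, the operations are two hypothetical binary partial operations (returning None outside their domain), and the search gives up above a caller-supplied limit. Starting from a bound, it repeatedly enlarges the bound past every value the operations produce from arguments below it, stopping when nothing new appears.

```python
def least_closed_bound(start, operations, limit):
    """Least b >= start such that {0, ..., b-1} is closed under all operations.

    Each operation is binary and returns None where it is undefined.
    Returns None if no closed bound is found at or below limit.
    """
    b = start
    while b <= limit:
        values = [op(x, y) for op in operations for x in range(b) for y in range(b)]
        defined = [v for v in values if v is not None]
        # Move past every value produced from arguments below b.
        nxt = max([b] + [v + 1 for v in defined])
        if nxt == b:
            return b
        b = nxt
    return None


def bounded_sum(x, y):
    """Hypothetical partial operation: addition, defined only when the sum is below 10."""
    return x + y if x + y < 10 else None


def bounded_product(x, y):
    """Hypothetical partial operation: multiplication, defined only when the product is below 10."""
    return x * y if x * y < 10 else None


OPS = [bounded_sum, bounded_product]
```

In the finite setting the iteration stabilizes after finitely many steps; in the theorem one instead takes the supremum of the countable sequence, and regularity of the uncountable cardinal keeps that supremum below it.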
Let $\alpha$ be a limit ordinal. A subset $S$ of $\alpha$ is stationary iff $S$ intersects every club of $\alpha$. There are some obvious but useful facts about this notion. Assume that $\mathrm{cf}(\alpha) > \omega$. Then any club in $\alpha$ is stationary. An intersection of a stationary set with a club is again stationary. Any superset of a stationary set is again stationary. The union of fewer than $\mathrm{cf}(\alpha)$ nonstationary sets is again nonstationary. Every stationary set is unbounded in $\alpha$.
The following important fact is not quite so obvious:
Proposition 9.8. If $\alpha$ is a limit ordinal and $\kappa$ is a regular cardinal less than $\mathrm{cf}(\alpha)$, then the set
$$S \stackrel{\mathrm{def}}{=} \{\beta < \alpha : \mathrm{cf}(\beta) = \kappa\}$$
is stationary in $\alpha$.
Proof. Let $C$ be club in $\alpha$. Let $f : \mathrm{cf}(\alpha) \to \alpha$ be strictly increasing, continuous, and with range cofinal in $\alpha$. We define $g : \mathrm{cf}(\alpha) \to C$ by recursion. Let $g(0)$ be any member of $C$. For $\beta$ a limit ordinal less than $\mathrm{cf}(\alpha)$, let $g(\beta) = \bigcup_{\gamma < \beta} g(\gamma)$. If $\beta < \mathrm{cf}(\alpha)$ and $g(\beta)$ has been defined, let $g(\beta + 1)$ be a member of $C$ greater than both $g(\beta)$ and $f(\beta)$. Clearly $g$ is a strictly increasing continuous function mapping $\mathrm{cf}(\alpha)$ into $C$, and the range of $g$ is cofinal in $\alpha$. Thus $\mathrm{rng}(g)$ is club in $\alpha$. Now $g(\kappa) \in C \cap S$, as desired.
Let $S$ be a set of ordinals. A function $f \in {}^S \mathrm{Ord}$ is regressive iff $f(\gamma) < \gamma$ for every $\gamma \in S \setminus \{0\}$. This is a natural notion, and leads to an important fact which is used in many of the deeper applications of stationary sets.
Theorem 9.9. (Fodor; also called the pressing down lemma) Suppose that $\alpha$ is a limit ordinal of uncountable cofinality, $S$ is a stationary subset of $\alpha$, and $f : S \to \alpha$ is regressive. Then there is a $\beta < \alpha$ such that $f^{-1}[\beta]$ is stationary in $\alpha$.
In case $\alpha$ is regular, there is a $\gamma < \alpha$ such that $f^{-1}[\{\gamma\}]$ is stationary.
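Proof. Assume the hypothesis of the first part of the theorem, but suppose that there is no $\beta$ of the type indicated. So for every $\beta < \alpha$ we can choose a club $C_\beta$ in $\alpha$ such that $C_\beta \cap f^{-1}[\beta] = \emptyset$. Let $D$ be a club in $\alpha$ of order type $\mathrm{cf}(\alpha)$. Now for each $\beta < \alpha$ let $\tau(\beta)$ be the least member of $D$ greater than $\beta$. For each $\beta < \alpha$ we define
$$E_\beta = \bigcap_{\gamma \in D \cap (\tau(\beta) + 1)} C_\gamma.$$
We claim then that for every $\beta < \alpha$,
(1) $E_\beta \cap f^{-1}[\beta] = \emptyset$.
In fact, $\beta < \tau(\beta) \in D \cap (\tau(\beta) + 1)$, so $E_\beta \cap f^{-1}[\beta] \subseteq C_{\tau(\beta)} \cap f^{-1}[\tau(\beta)] = \emptyset$. So (1) holds.
Now by Theorem 9.5, each set $E_\beta$ is club in $\alpha$. Moreover, clearly $E_\gamma \subseteq E_\beta$ if $\beta < \gamma < \alpha$. Hence we can apply Theorem 9.6(i) to infer that $F \stackrel{\mathrm{def}}{=} \triangle_{\beta < \alpha} E_\beta$ is club in $\alpha$. Hence also the set $G$ of all limit ordinals which are in $F$ is club in $\alpha$. Choose $\delta \in G \cap S$. Now $f(\delta) < \delta$; since $\delta$ is a limit ordinal, choose $\beta < \delta$ such that $f(\delta) < \beta$. But $\delta \in G \subseteq F$, so it follows by the definition of diagonal intersection that $\delta \in E_\beta$. From (1) we then see that $\delta \notin f^{-1}[\beta]$. This contradicts $f(\delta) < \beta$.
For the second part of the theorem, assume that $\alpha$ is regular. Note that, with $\beta$ as in the first part, $f^{-1}[\beta] = \bigcup_{\gamma < \beta} f^{-1}[\{\gamma\}]$. Hence the second part follows from the fact mentioned above that a union of fewer than $\alpha$ nonstationary sets is nonstationary.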
To illustrate the use of Fodors theorem we give the following result about Aronszajn trees
which answers a natural question.
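Theorem 9.10. Suppose that $\kappa$ is an uncountable regular cardinal, $T$ is a $\kappa$-Aronszajn tree, and $\lambda$ is an infinite cardinal less than $\kappa$. Further, suppose that $x \in T$ and $|\{y \in T : x < y\}| = \kappa$. Then there is an $\alpha > \mathrm{ht}(x)$ such that
$$|\{y \in \mathrm{Lev}_\alpha(T) : x < y\}| \ge \lambda.$$
Proof. By Theorem 8.7 we may assume that $T$ is well-pruned, and by taking $\{y \in T : x \le y\}$ we may assume that $x$ is the root of $T$. So now we want to find a level $\alpha$ such that $|\mathrm{Lev}_\alpha(T)| \ge \lambda$. We assume that this is not the case. So $|\mathrm{Lev}_\alpha(T)| < \lambda$ for all $\alpha < \kappa$.
Suppose that $\lambda$ is singular. Then
$$\kappa = \bigcup_{\substack{\mu < \lambda \\ \mu \text{ a cardinal}}} \{\alpha < \kappa : |\mathrm{Lev}_\alpha(T)| < \mu^+\},$$
so there is a $\mu < \lambda$ such that $\Gamma \stackrel{\mathrm{def}}{=} \{\alpha < \kappa : |\mathrm{Lev}_\alpha(T)| < \mu^+\}$ has power $\kappa$. Because $T$ is well-pruned, we have $|\mathrm{Lev}_\alpha(T)| \le |\mathrm{Lev}_\beta(T)|$ whenever $\alpha < \beta < \kappa$. It follows that $|\mathrm{Lev}_\alpha(T)| < \mu^+$ for all $\alpha < \kappa$, since $\Gamma$ is clearly unbounded in $\kappa$. Thus we may assume that $\lambda$ is regular.
For each $s \in T$ and each $\alpha < \mathrm{ht}(s)$ let $s^\alpha$ be the unique element of height $\alpha$ less than $s$.
Let $\Gamma = \{\alpha < \kappa : \mathrm{cf}(\alpha) = \lambda\}$. So $\Gamma$ is stationary in $\kappa$. Now we claim
(1) For every $\alpha \in \Gamma$ and every $s \in \mathrm{Lev}_\alpha(T)$ there is a $\beta < \alpha$ such that the set $\{t \in T : s^\beta \le t,\ \mathrm{ht}(t) < \alpha\}$ is a chain.
To prove this, suppose not. Thus we can choose $\alpha \in \Gamma$ and $s \in \mathrm{Lev}_\alpha(T)$ such that
(2) For all $\beta < \alpha$ there is a $\gamma \in [\beta, \alpha)$ and a $t \in \mathrm{Lev}_\alpha(T)$ such that $s^\gamma < t \ne s$ and $s^{\gamma+1} \not\le t$.
Now we use (2) to construct by recursion two sequences $\langle \gamma_\xi : \xi < \lambda \rangle$ and $\langle t_\xi : \xi < \lambda \rangle$. Suppose that these have been defined for all $\eta < \xi$, where $\xi < \lambda$, so that each $\gamma_\eta < \alpha$. Let $\delta = \bigcup_{\eta < \xi} \gamma_\eta$. So $\delta < \alpha$ since $\mathrm{cf}(\alpha) = \lambda$. By (2), choose $\gamma_\xi \in [\delta + 1, \alpha)$ and $t_\xi \in \mathrm{Lev}_\alpha(T)$ such that $s^{\gamma_\xi} < t_\xi \ne s$ and $s^{\gamma_\xi + 1} \not\le t_\xi$. Since $\mathrm{Lev}_\alpha(T)$ has size less than $\lambda$, there exist $\xi, \eta$ with $\xi < \eta$ and $t_\xi = t_\eta$. Then $s^{\gamma_\xi + 1} \le s^{\gamma_\eta} < t_\eta = t_\xi$, contradiction. Hence (1) holds.
(3) For every $\alpha \in \Gamma$ there is a $\beta < \alpha$ such that for each $s \in \mathrm{Lev}_\alpha(T)$ the set $\{t \in T : s^\beta \le t,\ \mathrm{ht}(t) < \alpha\}$ is a chain.
To prove this, let $\alpha \in \Gamma$. By (1), for each $s \in \mathrm{Lev}_\alpha(T)$ choose $\beta_s < \alpha$ such that the set $\{t \in T : s^{\beta_s} \le t,\ \beta_s \le \mathrm{ht}(t) < \alpha\}$ is a chain. Let $\gamma = \sup_{\mathrm{ht}(s) = \alpha} \beta_s$. Clearly $\gamma$ is as desired in (3).
Now for each $\alpha \in \Gamma$ choose $f(\alpha)$ to be a $\beta$ as in (3). So $f$ is a regressive function defined on the stationary set $\Gamma$. Hence there is a $\beta < \kappa$ such that $f^{-1}[\{\beta\}]$ is stationary, and hence of size $\kappa$. So $T$ does not branch beyond $\beta$, and hence has a branch of size $\kappa$ because it is well-pruned, contradiction.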
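For the next result we need another important construction. Suppose that $\kappa$ is an infinite cardinal, $f = \langle f_\rho : \rho < \kappa^+ \rangle$ is a family of injections $f_\rho : \rho \to \kappa$, and $S$ is a cofinal subset of $\kappa^+$. The $(\kappa, f, S)$-Ulam matrix is the function $A : \kappa \times \kappa^+ \to \mathcal{P}(\kappa^+)$ defined for any $\xi < \kappa$ and $\alpha < \kappa^+$ by
$$A^\xi_\alpha = \{\rho \in S \setminus (\alpha + 1) : f_\rho(\alpha) = \xi\}.$$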
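Theorem 9.11. (Ulam) Let $\kappa$ be an infinite cardinal, $S$ a stationary subset of $\kappa^+$, and $I$ a collection of subsets of $\kappa^+$ having the following properties:
(i) $\emptyset \in I$.
(ii) If $X \in [I]^{\le \kappa}$, then $\bigcup X \in I$.
(iii) If $Y \subseteq X \in I$, then $Y \in I$.
(iv) If $\alpha < \kappa^+$, then $\{\alpha\} \in I$.
(v) $S \notin I$.
Then there is a system $\langle X_\alpha : \alpha < \kappa^+ \rangle$ of subsets of $S$ such that $X_\alpha \cap X_\beta = \emptyset$ for distinct $\alpha, \beta < \kappa^+$, and $X_\alpha \notin I$ for all $\alpha < \kappa^+$.
Proof. Let $f = \langle f_\rho : \rho < \kappa^+ \rangle$ be a family of injections $f_\rho : \rho \to \kappa$, and let $A$ be the $(\kappa, f, S)$-Ulam matrix. If $\xi < \kappa$, then for distinct $\alpha, \beta < \kappa^+$ we have $A^\xi_\alpha \cap A^\xi_\beta = \emptyset$, since the functions $f_\rho$ are one-one. Moreover, for any $\alpha < \kappa^+$ we have
$$S \setminus \bigcup_{\xi < \kappa} A^\xi_\alpha \subseteq S \cap (\alpha + 1) \in I$$
by (ii)-(iv). By conditions (ii) and (v) it then follows that for each $\alpha < \kappa^+$ there is an $h(\alpha) < \kappa$ such that $A^{h(\alpha)}_\alpha \notin I$. Thus $h : \kappa^+ \to \kappa$, so there is a $\xi < \kappa$ such that $|h^{-1}[\{\xi\}]| = \kappa^+$. Hence $\langle A^\xi_\alpha : \alpha < \kappa^+,\ h(\alpha) = \xi \rangle$ is as desired in the theorem.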
Theorem 9.12. (i) If $\kappa$ is an infinite cardinal and $S$ is a stationary subset of $\kappa^+$, then we can partition $S$ into $\kappa^+$-many stationary subsets.
(ii) If $\kappa$ is weakly inaccessible, then $\kappa$ can be partitioned into $\kappa$ many stationary subsets.
Proof. (i): Let $I$ be the collection of all nonstationary subsets of $\kappa^+$. The conditions of Theorem 9.11 are all clear, and so by it we get a system $\langle X_\alpha : \alpha < \kappa^+ \rangle$ of subsets of $S$ such that $X_\alpha \cap X_\beta = \emptyset$ for distinct $\alpha, \beta < \kappa^+$, and $X_\alpha \notin I$ for all $\alpha < \kappa^+$. We can union $S \setminus \bigcup_{\alpha < \kappa^+} X_\alpha$ with $X_0$ to get the desired partition of $S$.
(ii) For each regular cardinal $\lambda < \kappa$, let $S_\lambda = \{\alpha < \kappa : \mathrm{cf}(\alpha) = \lambda\}$. Thus $S_\lambda$ is stationary by Proposition 9.8. By induction it is clear that if $\alpha < \kappa$, then $\aleph_{\alpha+1} < \kappa$. Hence there are $\kappa$ regular cardinals less than $\kappa$. Thus we have $\kappa$ many pairwise disjoint stationary subsets of $\kappa$, and these can be extended to a partition of $\kappa$ as in the proof of (i).
The first part of Theorem 9.12 can actually be extended to weak inaccessibles too, but the
proof is longer.
While the above results are classical and have found wide use, the following related result
has mainly been used in PCF theory, although it should find a much wider use in the
future.
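Theorem 9.13. (Club guessing) Suppose that $\kappa$ is a regular cardinal, $\lambda$ is a cardinal such that $\mathrm{cf}(\lambda) \ge \kappa^{++}$, and $S^\lambda_\kappa = \{\delta \in \lambda : \mathrm{cf}(\delta) = \kappa\}$. Then there is a sequence $\langle C_\delta : \delta \in S^\lambda_\kappa \rangle$ such that:
(i) For every $\delta \in S^\lambda_\kappa$ the set $C_\delta \subseteq \delta$ is club, of order type $\kappa$.
(ii) For every club $D \subseteq \lambda$ there is a $\delta \in D \cap S^\lambda_\kappa$ such that $C_\delta \subseteq D$.
The sequence $\langle C_\delta : \delta \in S^\lambda_\kappa \rangle$ is called a club guessing sequence for $S^\lambda_\kappa$.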
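Proof. First we take the case of uncountable $\kappa$. Write $S = S^\lambda_\kappa$. Fix a sequence $C = \langle C_\delta : \delta \in S \rangle$ such that $C_\delta$ is club in $\delta$ of order type $\kappa$, for every $\delta \in S$. For any club $E$ of $\lambda$, let
$$C \upharpoonright E = \langle C_\delta \cap E : \delta \in S \cap E' \rangle,$$
where $E' = \{\delta \in E : E \cap \delta$ is unbounded in $\delta\}$. Clearly $E'$ is also club in $\lambda$. Also note that $C_\delta \cap E$ is club in $\delta$ for each $\delta \in S \cap E'$. We claim:
(1) There is a club $E$ of $\lambda$ such that for every club $D$ of $\lambda$ there is a $\delta \in D \cap E' \cap S$ such that $C_\delta \cap E \subseteq D$.
Note that if we prove (1), then the theorem follows by defining $C'_\delta = C_\delta \cap E$ for all $\delta \in E' \cap S$, and $C'_\delta = C_\delta$ for $\delta \in S \setminus E'$.
Assume that (1) is false. Hence for every club $E \subseteq \lambda$ there is a club $D_E \subseteq \lambda$ such that for every $\delta \in D_E \cap E' \cap S$ we have
$$C_\delta \cap E \not\subseteq D_E.$$
We now define a sequence $\langle E^\xi : \xi < \kappa^+ \rangle$ of clubs of $\lambda$ decreasing under inclusion, by induction on $\xi$:
(2) $E^0 = \lambda$.
(3) If $\zeta < \kappa^+$ is a limit ordinal and $E^\xi$ has been defined for all $\xi < \zeta$, we set $E^\zeta = \bigcap_{\xi < \zeta} E^\xi$. Since $\zeta < \kappa^+ < \mathrm{cf}(\lambda)$, $E^\zeta$ is club in $\lambda$.
(4) If $E^\xi$ has been defined, let $E^{\xi+1}$ be the set of all limit points of $E^\xi \cap D_{E^\xi}$, i.e., the set of all $\delta < \lambda$ such that $E^\xi \cap D_{E^\xi} \cap \delta$ is unbounded in $\delta$.
This defines the sequence. Let $E = \bigcap_{\xi < \kappa^+} E^\xi$. Then $E$ is club in $\lambda$. Take any $\delta \in S \cap E'$. Since $|C_\delta| = \kappa$ and the sequence $\langle E^\xi : \xi < \kappa^+ \rangle$ is decreasing, there is a $\xi < \kappa^+$ such that $C_\delta \cap E = C_\delta \cap E^\xi$. So $C_\delta \cap E^\xi = C_\delta \cap E^{\xi+1}$. Hence $C_\delta \cap E^\xi \subseteq D_{E^\xi}$, contradiction.
Thus the case $\kappa$ uncountable has been finished.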
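Now we take the case $\kappa = \omega$. For $S = S^\lambda_{\aleph_0}$ fix $C = \langle C_\delta : \delta \in S \rangle$ so that $C_\delta$ is club in $\delta$ with order type $\omega$. We denote the $n$-th element of $C_\delta$ by $C_\delta(n)$. For any club $E \subseteq \lambda$ and any $\delta \in S \cap E'$ we define
$$C^E_\delta = \{\max(E \cap (C_\delta(n) + 1)) : n \in \omega\},$$
where again $E'$ is the set of limit points of members of $E$. This set is cofinal in $\delta$. In fact, given $\alpha < \delta$, there is a $\beta \in E \cap \delta$ such that $\alpha < \beta$ since $\delta \in E'$, and there is an $n \in \omega$ such that $\beta < C_\delta(n)$. Then $\alpha < \max(E \cap (C_\delta(n) + 1))$, as desired. There may be repetitions in the description of $C^E_\delta$, but $\max(E \cap (C_\delta(n) + 1)) \le \max(E \cap (C_\delta(m) + 1))$ if $n < m$, so $C^E_\delta$ has order type $\omega$. We claim
(5) There is a closed unbounded $E \subseteq \lambda$ such that for every club $D \subseteq \lambda$ there is a $\delta \in D \cap S \cap E'$ such that $C^E_\delta \subseteq D$. [This proves the club guessing property.]
Suppose that (5) fails. Thus for every closed unbounded $E \subseteq \lambda$ there exists a club $D_E \subseteq \lambda$ such that for every $\delta \in D_E \cap S \cap E'$ we have $C^E_\delta \not\subseteq D_E$. Then we construct a descending sequence $E^\alpha$ of clubs in $\lambda$ as in the case $\kappa > \omega$, for $\alpha < \omega_1$. Thus for each $\alpha < \omega_1$ and each $\delta \in D_{E^\alpha} \cap S \cap (E^\alpha)'$ we have $C^{E^\alpha}_\delta \not\subseteq D_{E^\alpha}$. Let $E = \bigcap_{\alpha < \omega_1} E^\alpha$. Take any $\delta \in S \cap E'$. For $n \in \omega$ and $\alpha < \beta < \omega_1$ we have
$$E^\beta \cap (C_\delta(n) + 1) \subseteq E^\alpha \cap (C_\delta(n) + 1),$$
and so $\max(E^\beta \cap (C_\delta(n) + 1)) \le \max(E^\alpha \cap (C_\delta(n) + 1))$; it follows that there is an $\alpha_n < \omega_1$ such that $\max(E^\beta \cap (C_\delta(n) + 1)) = \max(E^{\alpha_n} \cap (C_\delta(n) + 1))$ for all $\beta > \alpha_n$. Choose $\gamma$ greater than all $\alpha_n$. Thus
(6) For all $\varepsilon > \gamma$ and all $n \in \omega$ we have $\max(E^\varepsilon \cap (C_\delta(n) + 1)) = \max(E^\gamma \cap (C_\delta(n) + 1))$.
But there is a $\rho \in C^{E^\gamma}_\delta \setminus D_{E^\gamma}$; say that $\rho = \max(E^\gamma \cap (C_\delta(n) + 1))$. Then $\rho = \max(E^{\gamma+1} \cap (C_\delta(n) + 1)) \in E^{\gamma+1} = (E^\gamma \cap D_{E^\gamma})' \subseteq D_{E^\gamma}$, contradiction.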
Next we introduce an important combinatorial principle and show that it implies the existence of Suslin trees. $\diamondsuit$ is the following statement:
There exists a sequence $\langle A_\alpha : \alpha < \omega_1 \rangle$ of sets with the following properties:
(i) $A_\alpha \subseteq \alpha$ for each $\alpha < \omega_1$.
(ii) For every subset $A$ of $\omega_1$, the set $\{\alpha < \omega_1 : A \cap \alpha = A_\alpha\}$ is stationary in $\omega_1$.
A sequence as in $\diamondsuit$ is called a $\diamondsuit$-sequence. Such a sequence in a sense captures all subsets of $\omega_1$ in a sequence of length $\omega_1$. Later in these notes we will show that $\diamondsuit$ follows from $V = L$.
Theorem 9.14. $\diamondsuit \Rightarrow$ CH.
Proof. Let $\langle A_\alpha : \alpha < \omega_1 \rangle$ be a $\diamondsuit$-sequence. Then for every $A \subseteq \omega$ the set $\{\alpha < \omega_1 : A \cap \alpha = A_\alpha\}$ is stationary in $\omega_1$, and hence it has an infinite member; for such a member $\alpha$ we have $A = A_\alpha$. So we can let $f(A)$ be the least $\alpha < \omega_1$ such that $A = A_\alpha$, and we thus define an injection of $\mathcal{P}(\omega)$ into $\omega_1$.
Since $\diamondsuit$ is formulated in terms of subsets of $\omega_1$, to construct a Suslin tree using $\diamondsuit$ it is natural to let the tree be $\omega_1$ with some tree-order. The following lemma will be useful in doing the construction.
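Lemma 9.15. Suppose that $T = (\omega_1, \prec)$ is an $\omega_1$-tree and $A$ is a maximal antichain in $T$. Then
$$\{\alpha < \omega_1 : (T \upharpoonright \alpha) = \alpha \text{ and } A \cap \alpha \text{ is a maximal antichain in } T \upharpoonright \alpha\}$$
is club in $\omega_1$.
Proof. Let $C$ be the indicated set. For each $\alpha < \omega_1$ let $T \upharpoonright \alpha = \{t \in T : \mathrm{ht}(t, T) < \alpha\}$. Suppose that $A \subseteq \omega_1$ is a maximal antichain in $T$. To see that $C$ is closed in $\omega_1$, let $\alpha < \omega_1$ be a limit ordinal, and suppose that $C \cap \alpha$ is unbounded in $\alpha$. If $\beta \in (T \upharpoonright \alpha)$, then there is a $\gamma < \alpha$ such that $\beta \in (T \upharpoonright \gamma)$. Choose $\delta \in (C \cap \alpha)$ such that $\gamma < \delta$. Then $\beta \in (T \upharpoonright \delta) = \delta$, so also $\beta \in \alpha$. This shows that $(T \upharpoonright \alpha) \subseteq \alpha$. Conversely, suppose that $\beta \in \alpha$. Choose $\delta \in C \cap \alpha$ such that $\beta < \delta$. Then $\beta \in \delta = T \upharpoonright \delta \subseteq T \upharpoonright \alpha$. Thus $(T \upharpoonright \alpha) = \alpha$.
To show that $A \cap \alpha$ is a maximal antichain in $T \upharpoonright \alpha$, note first that at least it is an antichain. Now take any $\beta \in (T \upharpoonright \alpha)$; we show that $\beta$ is comparable under $\prec$ to some member of $A \cap \alpha$, which will show that $A \cap \alpha$ is a maximal antichain in $T \upharpoonright \alpha$. Choose $\gamma < \alpha$ such that $\beta \in (T \upharpoonright \gamma)$, and then choose $\delta \in (C \cap \alpha)$ such that $\gamma < \delta$. Thus $\beta \in (T \upharpoonright \delta)$. Now $A \cap \delta$ is a maximal antichain in $T \upharpoonright \delta$ since $\delta \in C$, so $\beta$ is comparable with some $\varepsilon \in (A \cap \delta) \subseteq (A \cap \alpha)$, as desired.
To show that $C$ is unbounded in $\omega_1$ we will apply Theorem 9.7 to the following three functions $f, g, h : \omega_1 \to \omega_1$:
$$f(\beta) = \mathrm{ht}(\beta, T);$$
$$g(\beta) = \sup(\mathrm{Lev}_\beta(T));$$
$$h(\beta) = \text{some member of } A \text{ comparable with } \beta \text{ under } \prec.$$
By Theorem 9.7, the set $D$ of all $\alpha < \omega_1$ which are closed under each of $f, g, h$ is club in $\omega_1$. We now show that $D \subseteq C$, which will prove that $C$ is unbounded in $\omega_1$. So, suppose that $\alpha \in D$. If $\beta \in (T \upharpoonright \alpha)$, let $\gamma = \mathrm{ht}(\beta, T)$. Then $\gamma < \alpha$ and $\beta \in \mathrm{Lev}_\gamma(T)$, and so $\beta \le g(\gamma) < \alpha$. Thus $(T \upharpoonright \alpha) \subseteq \alpha$. Conversely, suppose that $\beta < \alpha$. Then $f(\beta) < \alpha$, i.e., $\mathrm{ht}(\beta, T) < \alpha$, so $\beta \in (T \upharpoonright \alpha)$. Therefore $(T \upharpoonright \alpha) = \alpha$. Now suppose that $\beta \in (T \upharpoonright \alpha)$; we want to show that $\beta$ is comparable with some member of $A \cap \alpha$, as this will prove that $A \cap \alpha$ is a maximal antichain in $T \upharpoonright \alpha$. Since $\beta \in \alpha$ by what has already been shown, we have $h(\beta) < \alpha$, and so the element $h(\beta)$ is as desired.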
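Another crucial lemma for the construction is as follows.

Lemma 9.16. Let $T = (\omega_1, \prec)$ be an eventually branching $\omega_1$-tree and let $\langle A_\alpha : \alpha < \omega_1\rangle$ be a $\diamondsuit$-sequence. Assume that for every limit $\alpha < \omega_1$, if $T \upharpoonright \alpha = \alpha$ and $A_\alpha$ is a maximal antichain in $T \upharpoonright \alpha$, then for every $x \in \mathrm{Lev}_\alpha(T)$ there is a $y \in A_\alpha$ such that $y \prec x$.
Then $T$ is a Suslin tree.
Proof. By Proposition 8.8 it suffices to show that every maximal antichain $A$ of $T$ is countable. By Lemma 9.15, the set
$$C \overset{\mathrm{def}}{=} \{\alpha < \omega_1 : T \upharpoonright \alpha = \alpha \text{ and } A \cap \alpha \text{ is a maximal antichain in } T \upharpoonright \alpha\}$$
is club in $\omega_1$. Now by the definition of the $\diamondsuit$-sequence, the set $\{\alpha < \omega_1 : A \cap \alpha = A_\alpha\}$ is stationary, so we can choose a limit ordinal $\alpha \in C$ such that $A \cap \alpha = A_\alpha$. Now if $\beta \in T$ and $\mathrm{ht}(\beta, T) \ge \alpha$, then there is a $\gamma \in \mathrm{Lev}_\alpha(T)$ such that $\gamma \preceq \beta$, and the hypothesis of the lemma further yields a $\delta \in A_\alpha$ such that $\delta \prec \gamma$. Since $\delta \prec \beta$ and $\delta \in A$, it follows that $\beta \notin A$. So we have shown that for all $\beta \in T$, if $\mathrm{ht}(\beta, T) \ge \alpha$ then $\beta \notin A$. Hence for any $\beta \in T$, if $\beta \in A$ then $\beta \in T \upharpoonright \alpha = \alpha$. So $A \subseteq \alpha$ and hence $A = A_\alpha$, so that $A$ is countable.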
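Theorem 9.17. $\diamondsuit$ implies that there is a Suslin tree.
Proof. Assume $\diamondsuit$, and let $\langle A_\alpha : \alpha < \omega_1\rangle$ be a $\diamondsuit$-sequence. We are going to construct a Suslin tree of the form $(\omega_1, \prec)$ in which for each $\alpha < \omega_1$ the $\alpha$-th level is the set $\{\omega \cdot \alpha + m : m \in \omega\}$. We will do the construction by completely defining the tree up to heights $\alpha < \omega_1$ by recursion. Thus we define by recursion trees $(\omega \cdot \alpha, \prec_\alpha)$, so that really we are just defining the partial orders $\prec_\alpha$ by recursion.
We let $\prec_0 = \prec_1 = \emptyset$. Now suppose that $\beta > 1$ and $\prec_\alpha$ has been defined for all $\alpha < \beta$ so that the following conditions hold whenever $0 < \alpha < \beta$:
(1) $(\omega \cdot \alpha, \prec_\alpha)$ is a tree, denoted by $T_\alpha$ for brevity.
(2) If $\gamma < \alpha$ and $\xi, \eta \in T_\gamma$, then $\xi \prec_\gamma \eta$ iff $\xi \prec_\alpha \eta$.
(3) For each $\gamma < \alpha$, $\mathrm{Lev}_\gamma(T_\alpha) = \{\omega \cdot \gamma + m : m \in \omega\}$.
(4) If $\gamma < \delta < \alpha$ and $m \in \omega$, then there is an $n \in \omega$ such that $\omega \cdot \gamma + m \prec_\alpha \omega \cdot \delta + n$.
(5) If $\delta < \alpha$, $\delta$ is a limit ordinal, $\omega \cdot \delta = \delta$, and $A_\delta$ is a maximal antichain in $T_\delta$, then for every $x \in \mathrm{Lev}_\delta(T_\alpha)$ there is a $y \in A_\delta$ such that $y \prec_\alpha x$.
Note that conditions (1)-(3) just say that the trees constructed have the special form indicated at the beginning, and are an increasing chain of trees. Condition (4) is to assure that the final tree is well-pruned. Condition (5) is connected to Lemma 9.16, which will be applied after the construction to verify that our tree is Suslin. Conditions (1)-(5) imply that if $x \in T_\alpha$, then it has the form $\omega \cdot \gamma + m$ for some $\gamma < \alpha$, and then $x \in \mathrm{Lev}_\gamma(T_\alpha)$ and for each $\delta < \gamma$ there is a unique element $\omega \cdot \delta + n$ in $\mathrm{Lev}_\delta(T_\alpha)$ such that $\omega \cdot \delta + n \prec_\alpha x$.
If $\beta$ is a limit ordinal, let $\prec_\beta = \bigcup_{\alpha < \beta} \prec_\alpha$. Conditions (1)-(5) are then clear for any $\alpha \le \beta$.
Next suppose that $\beta = \gamma + 2$ for some ordinal $\gamma$. Then we define
$$\prec_\beta = \prec_{\gamma+1} \cup \{(\xi, \omega \cdot (\gamma + 1) + 2m) : \xi \preceq_{\gamma+1} \omega \cdot \gamma + m,\ m \in \omega\} \cup \{(\xi, \omega \cdot (\gamma + 1) + 2m + 1) : \xi \preceq_{\gamma+1} \omega \cdot \gamma + m,\ m \in \omega\}.$$
Clearly (1)-(5) hold for all $\alpha \le \beta$.
The most important case is $\beta = \gamma + 1$ for some limit ordinal $\gamma$. To treat this case, we first associate with each $x \in T_\gamma$ a chain $B(x)$ in $T_\gamma$, and to do this we define by recursion a sequence $\langle y^x_n : n \in \omega\rangle$ of elements of $T_\gamma$. To define $y^x_0$ we consider two cases.
Case 1. $\omega \cdot \gamma = \gamma$ and $A_\gamma$ is a maximal antichain in $T_\gamma$. Then $x$ is comparable with some member $z$ of $A_\gamma$, and we let $y^x_0$ be some element of $T_\gamma$ such that $x, z \preceq_\gamma y^x_0$.
Case 2. Otherwise, we just let $y^x_0 = x$.
Now let $\langle \xi_m : m \in \omega\rangle$ be a strictly increasing sequence of ordinals less than $\gamma$ such that $\xi_0 = \mathrm{ht}(y^x_0, T_\gamma)$ and $\sup_{m \in \omega} \xi_m = \gamma$. Now if $y^x_i$ has been defined of height $\xi_i$, by (4) let $y^x_{i+1}$ be an element of height $\xi_{i+1}$ such that $y^x_i \prec_\gamma y^x_{i+1}$. Then we define
$$B(x) = \{z \in T_\gamma : z \preceq_\gamma y^x_i \text{ for some } i \in \omega\}.$$
Finally, let $\langle x(n) : n \in \omega\rangle$ be a one-one enumeration of $T_\gamma$, and set
$$\prec_\beta = \prec_\gamma \cup \{(z, \omega \cdot \gamma + n) : n \in \omega,\ z \in B(x(n))\}.$$
Clearly (1)-(3) hold with $\beta$ in place of $\alpha$. For (4), suppose that $\delta < \gamma$ and $m \in \omega$. Let $z = \omega \cdot \delta + m$. Thus $z \in T_\gamma$, and hence there is an $n \in \omega$ such that $z = x(n)$. Hence $z \in B(x(n))$ and $z \prec_\beta \omega \cdot \gamma + n$, as desired.
For (5), suppose that $\omega \cdot \gamma = \gamma$, and $A_\gamma$ is a maximal antichain in $T_\gamma$. Suppose that $w \in \mathrm{Lev}_\gamma(T_\beta)$. Choose $n$ so that $w = \omega \cdot \gamma + n$. Then there is an $s \in A_\gamma$ such that $s \preceq_\gamma y^{x(n)}_0$. So $s \in B(x(n))$ and $s \prec_\beta \omega \cdot \gamma + n = w$, as desired.
Thus the construction is finished. Now we let $\prec = \bigcup_{\alpha < \omega_1} \prec_\alpha$. Clearly $T \overset{\mathrm{def}}{=} (\omega_1, \prec)$ is an $\omega_1$-tree. It is eventually branching by (4) and the $\beta = \gamma + 2$ step in the construction. The hypothesis of Lemma 9.16 holds by the step $\beta = \gamma + 1$, $\gamma$ limit, in the construction. Therefore $T$ is a Suslin tree by Lemma 9.16.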
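EXERCISES
E9.1. Assume that $\kappa$ is an uncountable regular cardinal and $\langle A_\alpha : \alpha < \kappa\rangle$ is a sequence of subsets of $\kappa$. Let $D = \triangle_{\alpha < \kappa} A_\alpha$. Prove the following:
(i) For all $\alpha < \kappa$, the set $D \setminus A_\alpha$ is nonstationary.
(ii) Suppose that $E \subseteq \kappa$ and for every $\alpha < \kappa$, the set $E \setminus A_\alpha$ is nonstationary. Show that $E \setminus D$ is nonstationary.
E9.2. Let $\kappa > \omega$ be regular. Show that there is a sequence $\langle S_\alpha : \alpha < \kappa\rangle$ of stationary subsets of $\kappa$ such that $S_\beta \subseteq S_\alpha$ whenever $\alpha < \beta < \kappa$, and $\triangle_{\alpha < \kappa} S_\alpha = \{0\}$. Hint: use Theorem 9.12.
E9.3. Suppose that $\kappa$ is uncountable and regular, and for each limit ordinal $\alpha < \kappa$ we are given a function $f_\alpha \in {}^\omega\alpha$. Suppose that $S$ is a stationary subset of $\kappa$. Let $n \in \omega$. Show that there exist a $t \in {}^n\kappa$ and a stationary $S' \subseteq S$ such that for all $\alpha \in S'$, $f_\alpha \upharpoonright n = t$.
E9.4. Suppose that $\mathrm{cf}(\kappa) > \omega$, $C \subseteq \kappa$ is club of order type $\mathrm{cf}(\kappa)$, and $\langle c_\alpha : \alpha < \mathrm{cf}(\kappa)\rangle$ is the strictly increasing enumeration of $C$. Let $X \subseteq \kappa$. Show that $X$ is stationary in $\kappa$ iff $\{\alpha < \mathrm{cf}(\kappa) : c_\alpha \in X\}$ is stationary in $\mathrm{cf}(\kappa)$.
E9.5. Suppose that $\kappa$ is regular and uncountable, and $S \subseteq \kappa$ is stationary. Also, suppose that every $\alpha \in S$ is an uncountable regular cardinal. Show that
$$T \overset{\mathrm{def}}{=} \{\alpha \in S : S \cap \alpha \text{ is non-stationary in } \alpha\}$$
is stationary in $\kappa$. Hint: given a club $C$ in $\kappa$, let $C'$ be the set of all limit points of $C$ and let $\alpha$ be the least element of $C' \cap S$; show that $\alpha \in T \cap C$.
For the next few exercises we use the following definition. Suppose that $\kappa$ is an uncountable regular cardinal and $A$ is a set such that $|A| \ge \kappa$. Then a subset $X$ of $[A]^{<\kappa}$ is closed iff for every system $\langle a_\xi : \xi < \alpha\rangle$ of elements of $X$, with $\alpha < \kappa$ and with $a_\xi \subseteq a_\eta$ for all $\xi < \eta < \alpha$, also the union $\bigcup_{\xi < \alpha} a_\xi$ is in $X$. And we say that $X$ is unbounded in $[A]^{<\kappa}$ iff for every $x \in [A]^{<\kappa}$ there is a $y \in X$ such that $x \subseteq y$.
E9.6. Suppose that $\kappa$ is an uncountable regular cardinal, $|A| \ge \kappa$, and $a \in [A]^{<\kappa}$. Show that $\{x \in [A]^{<\kappa} : a \subseteq x\}$ is club in $[A]^{<\kappa}$.
E9.7. Suppose that $\kappa$ is a regular cardinal $> \omega_1$ and $|A| \ge \kappa$. Show that $\{x \in [A]^{<\kappa} : |x| \ge \omega_1\}$ is club in $[A]^{<\kappa}$.
E9.8. Suppose that $\kappa$ is an uncountable regular cardinal and $\lambda$ is a cardinal $> \kappa$. Show that $\{x \in [\lambda]^{<\kappa} : x \cap \kappa \in \kappa\}$ is club in $[\lambda]^{<\kappa}$.
E9.9. Suppose that $\kappa$ is an uncountable regular cardinal and $|A| \ge \kappa$. Show that the intersection of two clubs of $[A]^{<\kappa}$ is a club.
E9.10. Suppose that $\kappa$ is an uncountable regular cardinal and $|A| \ge \kappa$. Show that the intersection of fewer than $\kappa$ clubs of $[A]^{<\kappa}$ is a club.
If $\kappa$ is an uncountable regular cardinal, $|A| \ge \kappa$, and $\langle X_a : a \in A\rangle$ is a system of subsets of $[A]^{<\kappa}$, then the diagonal intersection of this system is the set
$$\triangle_{a \in A} X_a \overset{\mathrm{def}}{=} \Bigl\{x \in [A]^{<\kappa} : x \in \bigcap_{a \in x} X_a\Bigr\}.$$
E9.11. Suppose that $\kappa$ is an uncountable regular cardinal, $|A| \ge \kappa$, and $\langle X_a : a \in A\rangle$ is a system of clubs of $[A]^{<\kappa}$. Show that $\triangle_{a \in A} X_a$ is club in $[A]^{<\kappa}$.
Given an uncountable regular cardinal $\kappa$ and a set $A$ with $|A| \ge \kappa$, we say that a subset $X$ of $[A]^{<\kappa}$ is stationary iff it intersects every club of $[A]^{<\kappa}$.
E9.12. Suppose that $\kappa$ is an uncountable regular cardinal, $|A| \ge \kappa$, $S$ is a stationary subset of $[A]^{<\kappa}$, and $f$ is a function with domain $S$ such that $f(x) \in x$ for every nonempty $x \in S$. Show that there exist a stationary subset $T$ of $S$ and an element $a \in A$ such that $f(x) = a$ for all $x \in T$.
E9.13. Let $\kappa$ be an uncountable regular cardinal. Thus $\kappa \subseteq [\kappa]^{<\kappa}$. Show that if $C \subseteq [\kappa]^{<\kappa}$ is club, then $C \cap \kappa$ is club in the usual sense.
E9.14. Let $\kappa$ be an uncountable regular cardinal, and let $C \subseteq \kappa$ be club in the old sense. Show that $\{X \in [\kappa]^{<\kappa} : \bigcup X \in C\}$ is club in the new sense.
E9.15. Let $\kappa$ be an uncountable regular cardinal, and let $S \subseteq [\kappa]^{<\kappa}$ be stationary in the new sense. Show that $\{\bigcup X : X \in S\}$ is stationary in the old sense.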
Reference
Jech, T. Stationary sets. Chapter in Handbook of set theory.


10. Infinite combinatorics


In this chapter we survey the most useful theorems of infinite combinatorics; the best
known of them is the infinite Ramsey theorem. We derive from it the finite Ramsey
theorem.
Two sets $A, B$ are almost disjoint iff $|A| = |B|$ while $|A \cap B| < |A|$. Of course we are mainly interested in this notion if $A$ and $B$ are infinite.
Theorem 10.1. There is a family of $2^\omega$ pairwise almost disjoint infinite sets of natural numbers.
Proof. Let $X = \bigcup_{n \in \omega} {}^n 2$. Then $|X| = \omega$, since $X$ is clearly infinite, while
$$|X| \le \sum_{n \in \omega} 2^n = \omega.$$
Let $f$ be a bijection from $\omega$ onto $X$. Then for each $g \in {}^\omega 2$ let $x_g = \{g \upharpoonright n : n \in \omega\}$. So $x_g$ is an infinite subset of $X$. If $g, h \in {}^\omega 2$ and $g \ne h$, choose $n$ so that $g(n) \ne h(n)$. Then clearly $x_g \cap x_h \subseteq \{g \upharpoonright i : i \le n\}$, and so this intersection is finite. Thus we have produced $2^\omega$ pairwise almost disjoint infinite subsets of $X$. That carries over to $\omega$. Namely, $\{f^{-1}[x_g] : g \in {}^\omega 2\}$ is a family of $2^\omega$ pairwise almost disjoint infinite subsets of $\omega$, as is easily checked.
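The construction in this proof can be explored on finite pieces. The following Python sketch is only an illustration under stated assumptions: the function `code` is one particular choice for the inverse of the bijection $f$ (the proof takes an arbitrary $f$), and `branch` computes only the restrictions $g \upharpoonright n$ for $n$ up to a chosen depth. Both names are hypothetical.

```python
def code(prefix):
    """One possible bijection from finite 0/1 sequences onto the natural numbers.

    Assumption: this plays the role of f^{-1} in the proof; any bijection would do.
    """
    return int("1" + "".join(str(b) for b in prefix), 2) - 1


def branch(g, depth):
    """Codes of the restrictions g|n for n <= depth: a finite piece of f^{-1}[x_g]."""
    return {code(tuple(g(i) for i in range(n))) for n in range(depth + 1)}


g = lambda i: 0                      # the constant sequence 0, 0, 0, ...
h = lambda i: 1 if i == 3 else 0     # agrees with g except at n = 3

# Only the restrictions of length 0, 1, 2, 3 are shared, however deep we look.
common = branch(g, 60) & branch(h, 60)
print(sorted(common))
```

Increasing the depth does not enlarge `common`, which matches the bound $x_g \cap x_h \subseteq \{g \upharpoonright i : i \le n\}$ with $n = 3$.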
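Let $X$ be an infinite set. A collection $\mathscr{A}$ of subsets of $X$ is independent iff for any two finite disjoint subsets $\mathscr{B}, \mathscr{C}$ of $\mathscr{A}$ we have
$$\Bigl(\bigcap_{Y \in \mathscr{B}} Y\Bigr) \cap \Bigl(\bigcap_{Z \in \mathscr{C}} (X \setminus Z)\Bigr) \ne \emptyset.$$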

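Theorem 10.2. (Fichtenholz, Kantorovitch, Hausdorff) For any infinite cardinal $\kappa$ there is an independent family $\mathscr{A}$ of subsets of $\kappa$ such that each member of $\mathscr{A}$ has size $\kappa$ and $|\mathscr{A}| = 2^\kappa$; moreover, each of the above intersections has size $\kappa$.
Proof. Let $F$ be the family of all finite subsets of $\kappa$; thus $|F| = \kappa$. Let $\Phi$ be the set of all finite subsets of $F$; thus also $|\Phi| = \kappa$. It suffices now to work with $F \times \Phi$ rather than $\kappa$ itself.
For each $\Gamma \subseteq \kappa$ let
$$b_\Gamma = \{(\Delta, \varphi) \in F \times \Phi : \Delta \cap \Gamma \in \varphi\}.$$
Note that each $b_\Gamma$ has size $\kappa$; for example, $(\emptyset, \{\emptyset, \{\alpha\}\}) \in b_\Gamma$ for every $\alpha < \kappa$. So to finish the proof it suffices to take any two finite disjoint subsets $H$ and $K$ of $\mathscr{P}(\kappa)$ and show that
$$(\ast)\qquad \Bigl(\bigcap_{A \in H} b_A\Bigr) \cap \Bigl(\bigcap_{B \in K} ((F \times \Phi) \setminus b_B)\Bigr)$$
has size $\kappa$. For distinct $A, B \in H \cup K$ pick $\alpha_{AB} \in A \triangle B$, and let $\Delta = \{\alpha_{AB} : A, B \in H \cup K,\ A \ne B\}$. Now it suffices to show that if $\beta \in \kappa \setminus \Delta$ and $\varphi = \{A \cap \Delta : A \in H\} \cup \{\{\beta\}\}$, then $(\Delta, \varphi)$ is a member of $(\ast)$. If $A \in H$, then $\Delta \cap A \in \varphi$, and so $(\Delta, \varphi) \in b_A$. Now suppose, to get a contradiction, that $B \in K$ and $(\Delta, \varphi) \in b_B$. Then $\Delta \cap B \in \varphi$. Since $\beta \notin \Delta$, it follows that there is an $A \in H$ such that $\Delta \cap B = \Delta \cap A$. Since $A \ne B$, we have $\alpha_{AB} \in A \triangle B$ and $\alpha_{AB} \in \Delta$, contradiction.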
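The definition of the sets $b_\Gamma$ also makes sense when the base set is finite, and then independence can be checked by brute force. The following Python sketch does this for the two-element base set $\{0, 1\}$; it is a finite analogue offered as an illustration, not part of the proof, and the names `subsets`, `b` and `is_independent` are hypothetical.

```python
from itertools import chain, combinations


def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(xs, k) for k in range(len(xs) + 1))]


base = frozenset({0, 1})
F = subsets(base)          # all (finite) subsets of the base set: 4 of them
Phi = subsets(F)           # all sets of such subsets: 16 of them
universe = [(delta, phi) for delta in F for phi in Phi]   # stands in for F x Phi


def b(gamma):
    """The set b_Gamma = {(Delta, phi) : Delta & Gamma is a member of phi}."""
    return frozenset((d, phi) for (d, phi) in universe if (d & gamma) in phi)


family = {gamma: b(gamma) for gamma in F}


def is_independent(family, universe):
    """Check that every cell (meet of some members and complements of others) is nonempty."""
    keys = list(family)
    full = frozenset(universe)
    for H in subsets(keys):
        rest = [k for k in keys if k not in H]
        for K in subsets(rest):
            cell = full
            for a in H:
                cell &= family[a]
            for c in K:
                cell -= family[c]
            if not cell:
                return False
    return True


print(is_independent(family, universe))
```

In the finite case the witness for a given pair $H, K$ can be taken to be $(\Delta, \varphi)$ with $\Delta$ the whole base set and $\varphi = H$, which is the same idea as in the proof with the separating set $\Delta$ made as large as possible.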
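A collection $\mathscr{A}$ of sets forms a $\Delta$-system iff there is a set $r$ (called the root or kernel of the $\Delta$-system) such that $A \cap B = r$ for any two distinct $A, B \in \mathscr{A}$. This is illustrated as follows:
[Figure: the members of a $\Delta$-system, overlapping pairwise in a common part labeled "root".]
The existence theorem for $\Delta$-systems that is most often used is as follows.
Theorem 10.3. If $\kappa$ is an uncountable regular cardinal and $\mathscr{A}$ is a collection of finite sets with $|\mathscr{A}| \ge \kappa$, then there is a $\mathscr{B} \in [\mathscr{A}]^\kappa$ such that $\mathscr{B}$ is a $\Delta$-system.
Proof. First we prove the following special case of the theorem.
$(\ast)$ If $\mathscr{A}$ is a collection of finite sets each of size $m \in \omega$, with $|\mathscr{A}| = \kappa$, then there is a $\mathscr{B} \in [\mathscr{A}]^\kappa$ such that $\mathscr{B}$ is a $\Delta$-system.
We prove this by induction on $m$. The hypothesis implies that $m > 0$. If $m = 1$, then each member of $\mathscr{A}$ is a singleton, and so $\mathscr{A}$ is a collection of pairwise disjoint sets; hence it is a $\Delta$-system with root $\emptyset$. Now assume that $(\ast)$ holds for $m$, and suppose that $\mathscr{A}$ is a collection of finite sets each of size $m + 1$, with $|\mathscr{A}| = \kappa$, and with $m > 0$. We consider two cases.
Case 1. There is an element $x$ such that $\mathscr{C} \overset{\mathrm{def}}{=} \{A \in \mathscr{A} : x \in A\}$ has size $\kappa$. Let $\mathscr{D} = \{A \setminus \{x\} : A \in \mathscr{C}\}$. Then $\mathscr{D}$ is a collection of finite sets each of size $m$, and $|\mathscr{D}| = \kappa$. Hence by the inductive assumption there is an $\mathscr{E} \in [\mathscr{D}]^\kappa$ which is a $\Delta$-system, say with kernel $r$. Then $\{A \cup \{x\} : A \in \mathscr{E}\} \in [\mathscr{A}]^\kappa$ and it is a $\Delta$-system with kernel $r \cup \{x\}$.
Case 2. Case 1 does not hold. Let $\langle A_\alpha : \alpha < \kappa\rangle$ be a one-one enumeration of $\mathscr{A}$. Then from the assumption that Case 1 does not hold we get:
$(\ast\ast)$ For every $x$, the set $\{\alpha < \kappa : x \in A_\alpha\}$ has size less than $\kappa$.
We now define a sequence $\langle \alpha(\beta) : \beta < \kappa\rangle$ of ordinals less than $\kappa$ by recursion. Suppose that $\alpha(\beta)$ has been defined for all $\beta < \gamma$, where $\gamma < \kappa$. Then $\Gamma \overset{\mathrm{def}}{=} \bigcup_{\beta < \gamma} A_{\alpha(\beta)}$ has size less than $\kappa$, and so by $(\ast\ast)$, so does the set
$$\bigcup_{x \in \Gamma} \{\alpha < \kappa : x \in A_\alpha\}.$$
Thus we can choose $\alpha(\gamma) < \kappa$ such that for all $x \in \Gamma$ we have $x \notin A_{\alpha(\gamma)}$. This implies that $A_{\alpha(\gamma)} \cap A_{\alpha(\beta)} = \emptyset$ for all $\beta < \gamma$. Thus we have produced a pairwise disjoint system $\langle A_{\alpha(\beta)} : \beta < \kappa\rangle$, as desired. (The root is $\emptyset$ again.)
This finishes the inductive proof of $(\ast)$.
Now the theorem itself is proved as follows. Let $\mathscr{A}'$ be a subset of $\mathscr{A}$ of size $\kappa$. Then
$$\mathscr{A}' = \bigcup_{m \in \omega} \{A \in \mathscr{A}' : |A| = m\}.$$
Hence there is an $m \in \omega$ such that $\{A \in \mathscr{A}' : |A| = m\}$ has size $\kappa$. So $(\ast)$ applies to give the desired conclusion.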
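The two cases of the inductive proof have a natural finite counterpart: either many pairwise disjoint sets can be found directly, or some element lies in many sets and can be removed before recursing. The Python sketch below follows that pattern for finite families. It is a heuristic search written for illustration (the names `delta_root` and `find_delta_system` are hypothetical), and it may return `None` even when a $\Delta$-system of the requested size exists.

```python
from collections import Counter


def delta_root(sets):
    """Return the common root if the given distinct sets form a Delta-system, else None."""
    sets = list(sets)
    if len(sets) < 2:
        return None
    root = sets[0] & sets[1]
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            if sets[i] & sets[j] != root:
                return None
    return root


def find_delta_system(family, k):
    """Heuristic search for k members of `family` forming a Delta-system."""
    family = list(dict.fromkeys(frozenset(s) for s in family))  # distinct, order kept
    # Analogue of Case 2: greedily collect pairwise disjoint members (root is empty).
    disjoint = []
    for s in family:
        if all(s.isdisjoint(t) for t in disjoint):
            disjoint.append(s)
    if len(disjoint) >= k:
        return disjoint[:k]
    # Analogue of Case 1: some element x lies in many sets; remove it and recurse.
    counts = Counter(x for s in family for x in s)
    for x, count in counts.most_common():
        if count < k:
            break
        reduced = [s - {x} for s in family if x in s]
        found = find_delta_system(reduced, k)
        if found is not None:
            return [s | {x} for s in found]
    return None


family = [{0, 1}, {0, 2}, {0, 3}, {4, 5}]
system = find_delta_system(family, 3)
print(system, delta_root(system))
```

Here the greedy step finds only two disjoint sets, so the search removes the element 0, finds three disjoint singletons, and puts 0 back, giving a $\Delta$-system with root $\{0\}$.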
There is a slightly different version of a $\Delta$-system that is sometimes useful. A system $\langle A_i : i \in I\rangle$ of sets is an indexed $\Delta$-system iff there is a set $r$ (again called the root) such that $A_i \cap A_j = r$ for all distinct $i, j \in I$. Note that in an indexed system $\langle A_i : i \in I\rangle$ it is possible to have distinct $i, j \in I$ such that $A_i = A_j$; in fact, all of the $A_i$'s could be equal, in which case the system is already an indexed $\Delta$-system.
Theorem 10.4. If $\kappa$ is an uncountable regular cardinal and $\langle A_i : i \in I\rangle$ is a system of finite sets with $|I| \ge \kappa$, then there is a $J \in [I]^\kappa$ such that $\langle A_i : i \in J\rangle$ is an indexed $\Delta$-system.
Proof. We may assume that $|I| = \kappa$. Define $i \equiv j$ iff $i, j \in I$ and $A_i = A_j$. Thus $\equiv$ is an equivalence relation on $I$. If some equivalence class $K$ has $\kappa$ elements, then $\langle A_i : i \in K\rangle$ is an indexed $\Delta$-system with $|K| = \kappa$, as desired; the kernel is $A_i$ for any $i \in K$. Suppose that every equivalence class has size less than $\kappa$. Then there are $\kappa$ equivalence classes. Let $J \subseteq I$ have one element from each equivalence class, and let $\mathscr{A} = \{A_j : j \in J\}$. Then $\mathscr{A}$ is a collection of finite sets, and $|\mathscr{A}| = \kappa$. Hence by Theorem 10.3 let $\mathscr{B} \in [\mathscr{A}]^\kappa$ be a $\Delta$-system. For each $B \in \mathscr{B}$ there is an $i_B \in J$ such that $A_{i_B} = B$. Let $K = \{i_B : B \in \mathscr{B}\}$. Then $\langle A_j : j \in K\rangle$ is an indexed $\Delta$-system, and $|K| = \kappa$ since the function $i$ is clearly one-one.
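Now we give the general form of the $\Delta$-system theorem.

Theorem 10.5. Suppose that $\kappa$ and $\lambda$ are cardinals, $\omega \le \kappa < \lambda$, $\lambda$ is regular, and for all $\alpha < \lambda$, $|[\alpha]^{<\kappa}| < \lambda$. Suppose that $\mathscr{A}$ is a collection of sets, with each $A \in \mathscr{A}$ of size less than $\kappa$, and with $|\mathscr{A}| \ge \lambda$. Then there is a $\mathscr{B} \in [\mathscr{A}]^\lambda$ which is a $\Delta$-system.
Proof.
(1) There is a regular cardinal $\mu$ such that $\kappa \le \mu < \lambda$.
In fact, if $\kappa$ is regular, we may take $\mu = \kappa$. If $\kappa$ is singular, then $\kappa^+ \le |[\kappa]^{<\kappa}| < \lambda$, so we may take $\mu = \kappa^+$.
We take $\mu$ as in (1). Let $S = \{\alpha < \lambda : \alpha \text{ is a limit ordinal and } \mathrm{cf}(\alpha) = \mu\}$. Then $S$ is a stationary subset of $\lambda$.
Let $\mathscr{A}_0$ be a subset of $\mathscr{A}$ of size $\lambda$. Now $\bigl|\bigcup_{A \in \mathscr{A}_0} A\bigr| \le \lambda$ since $\kappa < \lambda$. Let $a$ be an injection of $\bigcup_{A \in \mathscr{A}_0} A$ into $\lambda$, and let $A$ be a bijection of $\lambda$ onto $\mathscr{A}_0$. Set $b_\alpha = a[A_\alpha]$ for each $\alpha < \lambda$. Now if $\alpha \in S$, then $|b_\alpha \cap \alpha| \le |b_\alpha| = |A_\alpha| < \kappa \le \mu = \mathrm{cf}(\alpha)$, so there is an ordinal $g(\alpha)$ such that $\sup(b_\alpha \cap \alpha) < g(\alpha) < \alpha$. Thus $g$ is a regressive function on $S$. By Fodor's theorem, there exist a stationary $S' \subseteq S$ and a $\beta < \lambda$ such that $g[S'] = \{\beta\}$. For each $\alpha \in S'$ let $F(\alpha) = b_\alpha \cap \alpha$. Thus $F(\alpha) \in [\beta]^{<\kappa}$, and $|[\beta]^{<\kappa}| < \lambda$, so there exist an $S'' \in [S']^\lambda$ and a $B \in [\beta]^{<\kappa}$ such that $b_\alpha \cap \alpha = B$ for all $\alpha \in S''$.
Now we define $\langle \alpha_\xi : \xi < \lambda\rangle$ by recursion. For any $\xi < \lambda$, $\alpha_\xi$ is a member of $S''$ such that
(2) $\alpha_\eta < \alpha_\xi$ for all $\eta < \xi$, and
(3) $\delta < \alpha_\xi$ for all $\delta \in \bigcup_{\eta < \xi} b_{\alpha_\eta}$.
Since $\bigl|\bigcup_{\eta < \xi} b_{\alpha_\eta}\bigr| < \lambda$, this is possible by the regularity of $\lambda$.
Now let $\mathscr{A}_1 = A[\{\alpha_\xi : \xi < \lambda\}]$ and $r = a^{-1}[B]$. We claim that $C \cap D = r$ for distinct $C, D \in \mathscr{A}_1$. For, write $C = A_{\alpha_\xi}$ and $D = A_{\alpha_\eta}$. Without loss of generality, $\eta < \xi$. Suppose that $x \in r$. Thus $a(x) \in B \subseteq b_{\alpha_\xi}$, so by the definition of $b_{\alpha_\xi}$ we have $x \in A_{\alpha_\xi} = C$. Similarly $x \in D$. Conversely, suppose that $x \in C \cap D$. Thus $x \in A_{\alpha_\xi} \cap A_{\alpha_\eta}$, and hence $a(x) \in b_{\alpha_\xi} \cap b_{\alpha_\eta}$. By the definition of $\alpha_\xi$, since $a(x) \in b_{\alpha_\eta}$ we have $a(x) < \alpha_\xi$. So $a(x) \in b_{\alpha_\xi} \cap \alpha_\xi = B$, and hence $x \in r$.
Clearly $|\mathscr{A}_1| = \lambda$.
By the proof of Theorem 10.4 we also get
Theorem 10.6. Suppose that $\kappa$ and $\lambda$ are cardinals, $\omega \le \kappa < \lambda$, $\lambda$ is regular, and for all $\alpha < \lambda$, $|[\alpha]^{<\kappa}| < \lambda$. Suppose that $\langle A_i : i \in I\rangle$ is a system of sets, with each $A_i$ of size less than $\kappa$, and with $|I| \ge \lambda$. Then there is a $J \in [I]^\lambda$ such that $\langle A_i : i \in J\rangle$ is an indexed $\Delta$-system.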
Now we give some important results of the partition calculus, which is infinitary Ramsey theory. The basic definition is as follows:
Suppose that $\rho$ is a nonzero cardinal number, $\langle \lambda_\alpha : \alpha < \rho\rangle$ is a sequence of cardinals, and $\sigma, \kappa$ are cardinals. We also assume that $1 \le \sigma \le \lambda_\alpha \le \kappa$ for all $\alpha < \rho$. Then we write
$$\kappa \to (\langle \lambda_\alpha : \alpha < \rho\rangle)^\sigma$$
provided that the following holds:
For every $f : [\kappa]^\sigma \to \rho$ there exist $\alpha < \rho$ and $\Gamma \in [\kappa]^{\lambda_\alpha}$ such that $f[[\Gamma]^\sigma] \subseteq \{\alpha\}$.
In this case we say that $\Gamma$ is homogeneous for $f$.
The following colorful terminology is standard. We imagine that $\alpha$ is a color for each $\alpha < \rho$, and we color all of the $\sigma$-element subsets of $\kappa$. To say that $\Gamma$ is homogeneous for $f$ is to say that all of the $\sigma$-element subsets of $\Gamma$ get the same color. Usually we will take $\sigma$ and $\rho$ to be positive integers. If $\rho = 2$, we have only two colors, which are conventionally taken to be red (for 0) and blue (for 1). If $\sigma = 2$ we are dealing with ordinary graphs.
Note that if $\rho = 1$ then we are using only one color, and so the arrow relation obviously holds by taking $\Gamma = \lambda_0$. If $\kappa$ is infinite and $\sigma = 1$ and $\rho$ is a positive integer, then the relation holds no matter what $\langle \lambda_\alpha : \alpha < \rho\rangle$ is, since
$$\kappa = \bigcup_{i < \rho} \{\alpha < \kappa : f(\{\alpha\}) = i\},$$
and so there is some $i < \rho$ such that $|\{\alpha < \kappa : f(\{\alpha\}) = i\}| = \kappa \ge \lambda_i$, as desired.
For the first few theorems we assume that $\rho$ is finite, and use the letter $r$ instead of $\rho$.
The general infinite Ramsey theorem is as follows.
Theorem 10.7. (Ramsey) If $n$ and $r$ are positive integers, then
$$\omega \to (\underbrace{\omega, \ldots, \omega}_{r \text{ times}})^n.$$
Proof. We proceed by induction on $n$. The case $n = 1$ is trivial, as observed above. So assume that the theorem holds for $n \ge 1$, and now suppose that $f : [\omega]^{n+1} \to r$. For each $m \in \omega$ define $g_m : [\omega \setminus \{m\}]^n \to r$ by:
$$g_m(X) = f(X \cup \{m\}).$$
Then by the inductive hypothesis, for each $m \in \omega$ and each infinite $S \subseteq \omega$ there is an infinite $H^S_m \subseteq S \setminus \{m\}$ such that $g_m$ is constant on $[H^S_m]^n$. We now construct by recursion two sequences $\langle S_i : i \in \omega\rangle$ and $\langle m_i : i \in \omega\rangle$. Each $m_i$ will be in $\omega$, and we will have $S_0 \supseteq S_1 \supseteq \cdots$. Let $S_0 = \omega$ and $m_0 = 0$. Suppose that $S_i$ and $m_i$ have been defined, with $S_i$ an infinite subset of $\omega$. We define
$$S_{i+1} = H^{S_i}_{m_i} \quad\text{and}\quad m_{i+1} = \text{the least element of } S_{i+1} \text{ greater than } m_i.$$
Clearly $S_0 \supseteq S_1 \supseteq \cdots$ and $m_0 < m_1 < \cdots$. Moreover, $m_i \in S_i$ for all $i \in \omega$.
(1) For each $i \in \omega$, the function $g_{m_i}$ is constant on $[\{m_j : j > i\}]^n$.
In fact, $\{m_j : j > i\} \subseteq S_{i+1}$ by the above, and so (1) is clear by the definition.
Let $p_i < r$ be the constant value of $g_{m_i} \upharpoonright [\{m_j : j > i\}]^n$, for each $i \in \omega$. Hence
$$\omega = \bigcup_{j < r} \{i \in \omega : p_i = j\};$$
so there is a $j < r$ such that $K \overset{\mathrm{def}}{=} \{i \in \omega : p_i = j\}$ is infinite. Let $L = \{m_i : i \in K\}$. We claim that $f[[L]^{n+1}] \subseteq \{j\}$, completing the inductive proof. For, take any $X \in [L]^{n+1}$; say $X = \{m_{i_0}, \ldots, m_{i_n}\}$ with $i_0 < \cdots < i_n$. Then
$$f(X) = g_{m_{i_0}}(\{m_{i_1}, \ldots, m_{i_n}\}) = p_{i_0} = j.$$
As a digression, we also prove the finite version of Ramsey's theorem:
Theorem 10.8. (Ramsey) Suppose that $n, r, l_0, \ldots, l_{r-1}$ are positive integers, with $n \le l_i$ for each $i < r$. Then there is a $k$ with $k \ge l_i$ for each $i < r$ and $k \ge n$ such that
$$k \to (l_0, \ldots, l_{r-1})^n.$$
Proof. Assume the hypothesis, but suppose that the conclusion fails. Thus for every $k$ such that $k \ge l_i$ for each $i < r$ with $k \ge n$ also, we have $k \not\to (l_0, \ldots, l_{r-1})^n$, which means that there is a function $f_k : [k]^n \to r$ such that for each $i < r$, there is no set $S \in [k]^{l_i}$ such that $f_k[[S]^n] \subseteq \{i\}$. We use these functions to define a certain $g : [\omega]^n \to r$ which will contradict the infinite version of Ramsey's theorem. Let $M = \{k \in \omega : k \ge l_i \text{ for each } i < r \text{ and } k \ge n\}$.
To define $g$, we define functions $h_i : [i]^n \to r$ by recursion. $h_0$ has to be the empty function. Now suppose that we have defined $h_i$ so that $S_i \overset{\mathrm{def}}{=} \{s \in M : f_s \upharpoonright [i]^n = h_i\}$ is infinite. This is obviously true for $i = 0$. Then
$$S_i = \bigcup_{s : [i+1]^n \to r} \{k \in S_i : f_k \upharpoonright [i+1]^n = s\},$$
and so there is an $h_{i+1} : [i+1]^n \to r$ such that $S_{i+1} \overset{\mathrm{def}}{=} \{k \in S_i : f_k \upharpoonright [i+1]^n = h_{i+1}\}$ is infinite, finishing the construction.
Clearly $h_i \subseteq h_{i+1}$ for all $i \in \omega$. Hence $g = \bigcup_{i \in \omega} h_i$ is a function mapping $[\omega]^n$ into $r$. By the infinite version of Ramsey's theorem choose $v < r$ and $Y \in [\omega]^\omega$ such that $g[[Y]^n] \subseteq \{v\}$. Take any $Z \in [Y]^{l_v}$. Choose $i$ so that $Z \subseteq i$, and choose $k \in S_i$. Then for any $X \in [Z]^n$ we have
$$f_k(X) = h_i(X) = g(X) = v,$$
so $Z$ is homogeneous for $f_k$, contradiction.
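For very small parameters the arrow relation of Theorem 10.8 can be tested directly by enumerating all colorings. The Python sketch below does this for the symmetric case $k \to (l, \ldots, l)^n$ with $r$ colors; the function name `arrows` and its interface are hypothetical choices made for this illustration, and the search is exponential, so it is only usable for tiny $k$.

```python
from itertools import combinations, product


def arrows(k, l, n=2, r=2):
    """Brute-force test of k -> (l, ..., l)^n with r colors (requires n <= l <= k)."""
    subsets = list(combinations(range(k), n))
    index = {s: i for i, s in enumerate(subsets)}
    # For each candidate homogeneous set S of size l, the positions of its n-subsets.
    candidates = [[index[t] for t in combinations(S, n)]
                  for S in combinations(range(k), l)]
    for coloring in product(range(r), repeat=len(subsets)):
        homogeneous = any(len({coloring[i] for i in cand}) == 1
                          for cand in candidates)
        if not homogeneous:
            return False          # this coloring has no homogeneous set of size l
    return True


print(arrows(5, 3), arrows(6, 3))
```

With two colors and $n = 2$ this checks the classical fact that 6 is the least $k$ with $k \to (3, 3)^2$: the call for $k = 5$ finds a coloring with no monochromatic triangle, while every coloring for $k = 6$ has one.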
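According to the following theorem, the most obvious generalization of Ramsey's theorem does not hold.

Theorem 10.9. For any infinite cardinal $\kappa$ we have $2^\kappa \not\to (\kappa^+, \kappa^+)^2$.
Proof. We consider ${}^\kappa 2$ under the lexicographic order; see the beginning of Chapter 7. Let $\langle f_\alpha : \alpha < 2^\kappa\rangle$ be a one-one enumeration of ${}^\kappa 2$. Define $F : [2^\kappa]^2 \to 2$ by setting, for any $\alpha < \beta < 2^\kappa$,
$$F(\{\alpha, \beta\}) = \begin{cases} 0 & \text{if } f_\alpha < f_\beta, \\ 1 & \text{if } f_\beta < f_\alpha. \end{cases}$$
If $2^\kappa \to (\kappa^+, \kappa^+)^2$ holds, then there is a set $\Gamma \in [2^\kappa]^{\kappa^+}$ which is homogeneous for $F$. If $F(\{\alpha, \beta\}) = 0$ for all distinct $\alpha < \beta$ in $\Gamma$, then $\langle f_\alpha : \alpha \in \Gamma\rangle$ is a strictly increasing sequence of length $\mathrm{o.t.}(\Gamma)$, contradicting Theorem 5.4. A similar contradiction is reached if $F(\{\alpha, \beta\}) = 1$ for all distinct $\alpha < \beta$ in $\Gamma$.
Corollary 10.10. $\kappa^+ \not\to (\kappa^+, \kappa^+)^2$ for every infinite cardinal $\kappa$.
Proof. Suppose that $\kappa^+ \to (\kappa^+, \kappa^+)^2$. Given $G : [2^\kappa]^2 \to 2$, let $F$ be the restriction of $G$ to $[\kappa^+]^2$. A homogeneous set for $F$ is also a homogeneous set for $G$, so $2^\kappa \to (\kappa^+, \kappa^+)^2$, contradicting Theorem 10.9.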
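We can, however, generalize Ramsey's theorem in less obvious ways.
Theorem 10.11. (Dushnik-Miller) If $\kappa$ is an infinite regular cardinal, then
$$\kappa \to (\kappa, \omega)^2.$$
(This is also true for singular $\kappa$, but the proof is more complicated.)
Proof. Let $f : [\kappa]^2 \to 2$. Assume that
(1) For all $\Gamma \subseteq \kappa$, if $f[[\Gamma]^2] \subseteq \{0\}$, then $|\Gamma| < \kappa$.
Thus we want to find an infinite $\Delta \subseteq \kappa$ such that $f[[\Delta]^2] \subseteq \{1\}$. In order to do this, we will define by recursion subsets $\Gamma_n, \Delta_n$ of $\kappa$ and elements $\alpha_n$ of $\kappa$, for all $n \in \omega$.
Let $\Gamma_0 = \kappa$. Now suppose that $\Gamma_n$ has been defined so that $|\Gamma_n| = \kappa$; we define $\Gamma_{n+1}$, $\Delta_n$, and $\alpha_n$. Let $\Delta_n$ be a maximal subset of $\Gamma_n$ such that $f[[\Delta_n]^2] \subseteq \{0\}$. Thus $|\Delta_n| < \kappa$ by (1). By this maximality,
$$\Gamma_n \setminus \Delta_n = \bigcup_{\alpha \in \Delta_n} \{\beta \in \Gamma_n \setminus \Delta_n : f(\{\alpha, \beta\}) = 1\}.$$
Hence since $|\Gamma_n \setminus \Delta_n| = \kappa$, $|\Delta_n| < \kappa$, and $\kappa$ is regular, there is an $\alpha_n \in \Delta_n$ such that $\Gamma_{n+1} \overset{\mathrm{def}}{=} \{\beta \in \Gamma_n \setminus \Delta_n : f(\{\alpha_n, \beta\}) = 1\}$ has $\kappa$ elements. This finishes the construction.
The following facts about this construction are clear:
(2) $\Gamma_{n+1} \subseteq \Gamma_n \setminus \Delta_n$;
(3) $\alpha_n \in \Delta_n \subseteq \Gamma_n$;
(4) For all $\beta \in \Gamma_{n+1}$ we have $f(\{\alpha_n, \beta\}) = 1$.
In addition:
(5) The $\alpha_n$'s are all distinct.
In fact, suppose that $i < n$. Then $\alpha_i \in \Delta_i$ and $\Delta_i \cap \Gamma_{i+1} = \emptyset$. Since $\Gamma_n \subseteq \Gamma_{i+1}$, it follows that $\alpha_i \notin \Gamma_n$ and so $\alpha_i \ne \alpha_n$. So (5) holds.
Now with $i < n$ we have $\alpha_n \in \Gamma_n \subseteq \Gamma_{i+1}$, and hence $f(\{\alpha_i, \alpha_n\}) = 1$. Thus $f[[\{\alpha_n : n \in \omega\}]^2] \subseteq \{1\}$, as desired.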
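To formulate another generalization of Ramsey's theorem it is convenient to introduce a notation for a special form of the arrow notation. We write
$$\kappa \to (\lambda)^\nu_\mu \quad\text{iff}\quad \kappa \to (\langle \lambda : \alpha < \mu\rangle)^\nu.$$
In direct terms, then, $\kappa \to (\lambda)^\nu_\mu$ means that for every $f : [\kappa]^\nu \to \mu$ there is a $\Gamma \in [\kappa]^\lambda$ such that $|f[[\Gamma]^\nu]| = 1$.
The following cardinal notation is also needed for our next result: for any infinite cardinal $\kappa$ we define
$$2^\kappa_0 = \kappa; \qquad 2^\kappa_{n+1} = 2^{(2^\kappa_n)} \quad\text{for all } n \in \omega.$$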
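Theorem 10.12. (Erdős-Rado) For every infinite cardinal $\kappa$ and every positive integer $n$, $(2^\kappa_{n-1})^+ \to (\kappa^+)^n_\kappa$.
Proof. Induction on $n$. For $n = 1$ we want to show that $\kappa^+ \to (\kappa^+)^1_\kappa$, and this is obvious. Now assume the statement for $n \ge 1$, and suppose that $f : [(2^\kappa_n)^+]^{n+1} \to \kappa$. For each $\alpha \in (2^\kappa_n)^+$ define $F_\alpha : [(2^\kappa_n)^+ \setminus \{\alpha\}]^n \to \kappa$ by setting $F_\alpha(x) = f(x \cup \{\alpha\})$.
(1) There is an $A \in [(2^\kappa_n)^+]^{2^\kappa_n}$ such that for all $C \in [A]^{\le 2^\kappa_{n-1}}$ and all $u \in (2^\kappa_n)^+ \setminus C$ there is a $v \in A \setminus C$ such that $F_u \upharpoonright [C]^n = F_v \upharpoonright [C]^n$.
To prove this, we define a sequence $\langle A_\alpha : \alpha < (2^\kappa_{n-1})^+\rangle$ of subsets of $(2^\kappa_n)^+$, each of size $2^\kappa_n$. Let $A_0 = 2^\kappa_n$, and for $\alpha$ limit let $A_\alpha = \bigcup_{\beta < \alpha} A_\beta$. Now suppose that $A_\alpha$ has been defined, and $C \in [A_\alpha]^{\le 2^\kappa_{n-1}}$. Define $u \equiv v$ iff $u, v \in (2^\kappa_n)^+ \setminus C$ and $F_u \upharpoonright [C]^n = F_v \upharpoonright [C]^n$. Now $|{}^{[C]^n}\kappa| \le 2^\kappa_n$, so there are at most $2^\kappa_n$ equivalence classes. Let $K_C$ have exactly one element in common with each equivalence class. Let $A_{\alpha+1} = A_\alpha \cup \bigcup\{K_C : C \in [A_\alpha]^{\le 2^\kappa_{n-1}}\}$. Since $(2^\kappa_n)^{2^\kappa_{n-1}} = 2^\kappa_n$, we still have $|A_{\alpha+1}| = 2^\kappa_n$. This finishes the construction. Clearly $A \overset{\mathrm{def}}{=} \bigcup_{\alpha < (2^\kappa_{n-1})^+} A_\alpha$ is as desired in (1).
Take $A$ as in (1), and fix $a \in (2^\kappa_n)^+ \setminus A$. We now define a sequence $\langle x_\alpha : \alpha < (2^\kappa_{n-1})^+\rangle$ of elements of $A$. Given $C \overset{\mathrm{def}}{=} \{x_\beta : \beta < \alpha\}$, by (1) let $x_\alpha \in A \setminus C$ be such that $F_{x_\alpha} \upharpoonright [C]^n = F_a \upharpoonright [C]^n$. This defines our sequence. Let $X = \{x_\alpha : \alpha < (2^\kappa_{n-1})^+\}$.
Now define $G : [X]^n \to \kappa$ by $G(x) = F_a(x)$. Suppose that $\alpha_0 < \cdots < \alpha_n < (2^\kappa_{n-1})^+$. Then
$$f(\{x_{\alpha_0}, \ldots, x_{\alpha_n}\}) = F_{x_{\alpha_n}}(\{x_{\alpha_0}, \ldots, x_{\alpha_{n-1}}\}) = F_a(\{x_{\alpha_0}, \ldots, x_{\alpha_{n-1}}\}) = G(\{x_{\alpha_0}, \ldots, x_{\alpha_{n-1}}\}).$$
Now by the inductive hypothesis there is an $H \in [X]^{\kappa^+}$ such that $G$ is constant on $[H]^n$. By the above, $f$ is constant on $[H]^{n+1}$.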
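Corollary 10.13. $(2^\kappa)^+ \to (\kappa^+)^2_\kappa$ for any infinite cardinal $\kappa$.
We close this chapter with some more non-arrow results.
Theorem 10.14. For any infinite cardinal $\kappa$ we have $2^\kappa \not\to (3)^2_\kappa$.
Proof. Define $F : [{}^\kappa 2]^2 \to \kappa$ by setting $F(\{f, g\}) = \chi(f, g)$, the least $\alpha < \kappa$ such that $f(\alpha) \ne g(\alpha)$, for any two distinct $f, g \in {}^\kappa 2$. If $\{f, g, h\}$ is homogeneous for $F$ with $f, g, h$ distinct, let $\alpha = \chi(f, g)$. Then $f(\alpha), g(\alpha), h(\alpha)$ are distinct members of 2, contradiction.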
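The coloring in this proof has an exact finite analogue: color each pair of distinct 0/1 strings of length $n$ by the first position where they differ. The Python sketch below enumerates all triples of such strings and collects those that are homogeneous; by the argument above there should be none. The names `chi` and `homogeneous_triples` are hypothetical, chosen for this illustration.

```python
from itertools import combinations, product


def chi(f, g):
    """First position at which the distinct sequences f and g differ."""
    return next(i for i, (a, b) in enumerate(zip(f, g)) if a != b)


def homogeneous_triples(n):
    """All triples of distinct 0/1 strings of length n whose three pairs get one color."""
    strings = list(product((0, 1), repeat=n))
    return [t for t in combinations(strings, 3)
            if len({chi(f, g) for f, g in combinations(t, 2)}) == 1]


print(homogeneous_triples(4))
```

A homogeneous triple would need three strings taking pairwise different values at one position, which is impossible with only the values 0 and 1; this is why the list comes out empty.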
Corollary 10.15. For any infinite cardinal $\kappa$ we have $2^\kappa \not\to (\kappa^+)^2_\kappa$.
Our final result in the partition calculus indicates that infinite exponents are in general hopeless.
Theorem 10.16. $\omega \not\to (\omega, \omega)^\omega$.
Proof. Let $<$ well-order $[\omega]^\omega$. We define for any $X \in [\omega]^\omega$
$$F(X) = \begin{cases} 0 & \text{if } Y < X \text{ for some } Y \in [X]^\omega, \\ 1 & \text{otherwise.} \end{cases}$$
Suppose that $H \in [\omega]^\omega$ is homogeneous for $F$. Let $X$ be the $<$-least element of $[H]^\omega$. Thus $F(X) = 1$. So we must have $F(Y) = 1$ for all $Y \in [H]^\omega$. Write $H = \{m_i : i \in \omega\}$ without repetitions. For each $k \in \omega$ let
$$I_k = \{m_0, m_2, \ldots, m_{2k}\} \cup \{m_{2i+1} : i \in \omega\}.$$
Thus these are infinite subsets of $H$. Choose $k_0$ so that $I_{k_0}$ is minimum among all of the $I_k$'s. Then $I_{k_0} \subset I_{k_0+1}$ and $I_{k_0} < I_{k_0+1}$, so $F(I_{k_0+1}) = 0$, contradiction.
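We close this chapter with the following theorem of Comfort and Negrepontis.
Theorem 10.17. Suppose that $\lambda = \lambda^{<\kappa}$. Then there is a system $\langle f_\alpha : \alpha < 2^\lambda\rangle$ of members of ${}^\lambda\lambda$ such that
$$\forall M \in [2^\lambda]^{<\kappa}\ \forall g \in {}^M\lambda\ \exists \beta < \lambda\ \forall \alpha \in M\ [f_\alpha(\beta) = g(\alpha)].$$
Proof. Let
$$\mathscr{F} = \{(F, G, s) : F \in [\lambda]^{<\kappa},\ G \in [\mathscr{P}(F)]^{<\kappa},\ \text{and } s \in {}^G\lambda\}.$$
Now if $F \in [\lambda]^{<\kappa}$, say $|F| = \mu$, then
$$|[\mathscr{P}(F)]^{<\kappa}| \le (2^\mu)^{<\kappa} \le (\lambda^\mu)^{<\kappa} \le \lambda^{<\kappa} = \lambda,$$
and if $G \in [\mathscr{P}(F)]^{<\kappa}$ then $|{}^G\lambda| \le \lambda^{<\kappa} = \lambda$. It follows that $|\mathscr{F}| = \lambda$. Let $h$ be a bijection from $\lambda$ onto $\mathscr{F}$, and let $k$ be a bijection from $2^\lambda$ onto $\mathscr{P}(\lambda)$. Now for each $\alpha < 2^\lambda$ we define $f_\alpha \in {}^\lambda\lambda$ by setting, for each $\beta < \lambda$, with $h(\beta) = (F, G, s)$,
$$f_\alpha(\beta) = \begin{cases} s(k(\alpha) \cap F) & \text{if } k(\alpha) \cap F \in G, \\ 0 & \text{otherwise.} \end{cases}$$
Now to prove that $\langle f_\alpha : \alpha < 2^\lambda\rangle$ is as desired, suppose that $M \in [2^\lambda]^{<\kappa}$ and $g \in {}^M\lambda$. For distinct members $\alpha, \beta$ of $M$ choose $\gamma(\alpha, \beta) \in k(\alpha) \triangle k(\beta)$. Then let
$$F = \{\gamma(\alpha, \beta) : \alpha, \beta \in M,\ \alpha \ne \beta\} \quad\text{and}\quad G = \{k(\alpha) \cap F : \alpha \in M\}.$$
Moreover, define $s : G \to \lambda$ by setting $s(k(\alpha) \cap F) = g(\alpha)$ for any $\alpha \in M$. This is possible since $k(\alpha) \cap F \ne k(\beta) \cap F$ for distinct $\alpha, \beta \in M$. Finally, let $\beta = h^{-1}(F, G, s)$. Then for any $\alpha \in M$ we have
$$f_\alpha(\beta) = s(k(\alpha) \cap F) = g(\alpha).$$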
EXERCISES
E10.1. Suppose that $\kappa^\omega > \kappa$. Show that there is a family $\mathscr{A}$ of subsets of $\kappa$, each of size $\omega$, with $|\mathscr{A}| = \kappa^+$ and the intersection of any two members of $\mathscr{A}$ is finite.
E10.2. Suppose that $\kappa$ is any infinite cardinal, and $\lambda$ is minimum such that $\kappa^\lambda > \kappa$. Show that there is a family $\mathscr{A}$ of subsets of $\kappa$, each of size $\lambda$, with the intersection of any two members of $\mathscr{A}$ being of size less than $\lambda$, and with $|\mathscr{A}| = \kappa^+$.
E10.3. Suppose that $\kappa$ is uncountable and regular. Show that there is a family $\mathscr{A}$ of subsets of $\kappa$, each of size $\kappa$, with the intersection of any two members of $\mathscr{A}$ of size less than $\kappa$, and with $|\mathscr{A}| = \kappa^+$. Hint: (1) show that there is a partition of $\kappa$ into $\kappa$ subsets, each of size $\kappa$; (2) use Zorn's lemma to start from (1) and produce a maximal almost disjoint set; (3) use a diagonal construction to show that the resulting family must have size $> \kappa$.
E10.4. Prove that if $\mathscr{F}$ is an uncountable family of finite functions each with range $\subseteq \omega$, then there are distinct $f, g \in \mathscr{F}$ such that $f \cup g$ is a function.
E10.5. Give an example of an infinite collection of finite sets which does not contain an infinite $\Delta$-system.
E10.6. Show that if $\kappa$ is singular, then there is a family $\mathscr{A}$ of finite sets with $|\mathscr{A}| = \kappa$ such that $\mathscr{A}$ does not contain a $\Delta$-system of size $\kappa$.
E10.7. (Double $\Delta$-system theorem) Suppose that $\kappa$ is a singular cardinal with $\mathrm{cf}(\kappa) > \omega$. Let $\langle \lambda_\alpha : \alpha < \mathrm{cf}(\kappa)\rangle$ be a strictly increasing sequence of successor cardinals with supremum $\kappa$, with $\mathrm{cf}(\kappa) < \lambda_0$, and such that for each $\alpha < \mathrm{cf}(\kappa)$ we have $\bigl(\sum_{\beta < \alpha} \lambda_\beta\bigr)^+ \le \lambda_\alpha$. Suppose that $\langle A_\xi : \xi < \kappa\rangle$ is a system of finite sets. Then there exist a set $\Gamma \in [\mathrm{cf}(\kappa)]^{\mathrm{cf}(\kappa)}$, a sequence $\langle \Xi_\alpha : \alpha \in \Gamma\rangle$ of subsets of $\kappa$, a sequence $\langle F_\alpha : \alpha \in \Gamma\rangle$ of finite sets, and a finite set $G$, such that the following conditions hold:
(i) $\langle \Xi_\alpha : \alpha \in \Gamma\rangle$ is a pairwise disjoint system, and $|\Xi_\alpha| = \lambda_\alpha$ for every $\alpha \in \Gamma$.
(ii) $\langle A_\xi : \xi \in \Xi_\alpha\rangle$ is a $\Delta$-system with root $F_\alpha$ for every $\alpha \in \Gamma$.
(iii) $\langle F_\alpha : \alpha \in \Gamma\rangle$ is a $\Delta$-system with root $G$.
(iv) If $\xi \in \Xi_\alpha$, $\eta \in \Xi_\beta$, and $\alpha \ne \beta$, then $A_\xi \cap A_\eta = G$.
E10.8. Suppose that $\mathscr{F}$ is a collection of countable functions, each with range $\subseteq 2^\omega$, and with $|\mathscr{F}| = (2^\omega)^+$. Show that there are distinct $f, g \in \mathscr{F}$ such that $f \cup g$ is a function.
E10.9. For any infinite cardinal $\kappa$, any linear order of size at least $(2^\kappa)^+$ has a subset of order type $\kappa^+$ or one similar to $(\kappa^+, >)$.
E10.10. For any infinite cardinal $\kappa$, any tree of size at least $(2^\kappa)^+$ has a branch or an antichain of size at least $\kappa^+$.
E10.11. Any uncountable tree either has an uncountable branch or an infinite antichain.
E10.12. Suppose that $m$ is a positive integer. Show that any infinite set $X$ of positive integers contains an infinite subset $Y$ such that one of the following conditions holds:
(i) The members of $Y$ are pairwise relatively prime.
(ii) There is a prime $p < m$ such that for any two $a, b \in Y$, $p$ divides $a - b$.
(iii) If $a, b$ are distinct members of $Y$, then $a, b$ are not relatively prime, but the smallest prime divisor of $a - b$ is at least equal to $m$.
E10.13. Suppose that $X$ is an infinite set, and $(X, <)$ and $(X, \prec)$ are two well-orderings of $X$. Show that there is an infinite subset $Y$ of $X$ such that for all $y, z \in Y$, $y < z$ iff $y \prec z$.
E10.14. Let $S$ be an infinite set of points in the plane. Show that $S$ has an infinite subset $T$ such that all members of $T$ are on the same line, or else no three distinct points of $T$ are collinear.
E10.15. We consider the following variation of the arrow relation. For cardinals $\kappa, \lambda, \mu, \nu$, we define
$$\kappa \to [\lambda]^\mu_\nu$$
to mean that for every function $f : [\kappa]^\mu \to \nu$ there exist an $\alpha < \nu$ and a $\Gamma \in [\kappa]^\lambda$ such that $f[[\Gamma]^\mu] \subseteq \nu \setminus \{\alpha\}$. In coloring terminology, we color the $\mu$-element subsets of $\kappa$ with $\nu$ colors, and then there is a set which is anti-homogeneous for $f$, in the sense that there is a color $\alpha$ and a subset $\Gamma$ of size $\lambda$ all of whose $\mu$-element subsets do not get the color $\alpha$.
Prove that for any infinite cardinal $\kappa$,
$$\kappa \not\to [\kappa]^\kappa_{2^\kappa}.$$
Hint: (1) Show that there is an enumeration $\langle X_\alpha : \alpha < 2^\kappa\rangle$ of $[\kappa]^\kappa$ in which every member of $[\kappa]^\kappa$ is repeated $2^\kappa$ times. (2) Show that $|[\kappa]^\kappa| = 2^\kappa$. (3) Show that there is a one-one $\langle Y_\alpha : \alpha < 2^\kappa\rangle$ such that $Y_\alpha \in [X_\alpha]^\kappa$ for all $\alpha < 2^\kappa$. (4) Define $f : [\kappa]^\kappa \to 2^\kappa$ so that for all $\alpha < 2^\kappa$ one has
$$f(Y_\alpha) = \mathrm{o.t.}(\{\beta < \alpha : X_\beta = X_\alpha\}).$$
Reference
Erdos, P.; Hajnal, A.; Mate, A.; Rado, R. Combinatorial set theory: partition relations for cardinals. Akad. Kiado, Budapest 1984, 347pp.

11. Martins axiom


Martin's axiom is not an axiom of ZFC, but it can be added to those axioms. It has many
important consequences. Actually, the continuum hypothesis implies Martin's axiom, so
it is of most interest when combined with the negation of the continuum hypothesis. The
consistency of MA + $\neg$CH involves iterated forcing, and is proved much later in these notes.
The formulation of the axiom involves some new notions about partially ordered sets.
These notions are basic for the method of forcing also. Let $\mathbb{P} = (P, \le)$ be a partial order.
A subset $D$ of $P$ is dense in $\mathbb{P}$ iff for every $p \in P$ there is a $d \in D$ such that $d \le p$.
A subset $G$ of $P$ is a filter on $\mathbb{P}$ iff the following two conditions hold:
(i) For all $p, q \in G$ there is an $r \in G$ such that $r \le p, q$.
(ii) For all $p \in G$ and $q \in P$, if $p \le q$ then $q \in G$.
Elements $p, q \in P$ are compatible iff there is an $r \in P$ such that $r \le p, q$. Otherwise they
are incompatible, and we write $p \perp q$.
A subset $A$ of $P$ is an antichain in $\mathbb{P}$ iff any two distinct elements of $A$ are incompatible.
Note that this is different from our previous notion of antichain for trees; we depend on
context to distinguish the two notions.
We say that $\mathbb{P}$ satisfies the countable chain condition, ccc, iff every antichain in $\mathbb{P}$ is
countable.
For any infinite cardinal $\kappa$, the notation MA($\kappa$) abbreviates the statement that for any
ccc partial order $\mathbb{P}$ and any family $\mathcal{D}$ of dense sets in $\mathbb{P}$, with $|\mathcal{D}| \le \kappa$, there is a filter $G$
on $\mathbb{P}$ such that $G \cap D \ne \emptyset$ for every $D \in \mathcal{D}$.
Martin's axiom, abbreviated MA, is the statement that MA($\kappa$) holds for every infinite
$\kappa < 2^{\omega}$.
To become familiar with these notions we start with some simple examples and facts. If $\mathbb{P}$
is a linear order, then a dense subset of $\mathbb{P}$ is just a set which is cofinal in $(P, \ge)$. Actually,
linear orders are not very interesting as partial orders to which to apply MA, as we will
see. At the other extreme, if $\mathbb{P} = (P, =)$, then the only dense subset of $\mathbb{P}$ is $P$ itself. This
is also not an interesting thing to apply MA to. A more interesting partial order is the
following:
$P = \{f : f \text{ is a function}, \mathrm{dmn}(f) \subseteq \omega_1, \mathrm{rng}(f) \subseteq 2, |f| < \omega\}$;
$f \le g$ iff $f, g \in P$ and $f \supseteq g$.
Here a subset $D$ is dense iff every $f \in P$ can be extended to some $g \in D$. A filter
on $\mathbb{P}$ is a collection of pairwise compatible functions closed under taking subsets. Two
elements $f, g \in P$ are compatible iff $f \cup g$ is a function. An antichain consists of pairwise
incompatible functions. Somewhat deeper is the fact that $\mathbb{P}$ does have ccc. In fact, suppose
that $F$ is an uncountable subset of $P$. Then $\langle \mathrm{dmn}(f) : f \in F \rangle$ is a system of finite sets.
By the $\Delta$-system lemma, there is an uncountable $F' \subseteq F$ such that $\langle \mathrm{dmn}(f) : f \in F' \rangle$ is
a $\Delta$-system, say with root $r$. Then
$$F' = \bigcup_{g \in {}^{r}2} \{f \in F' : g \subseteq f\},$$
and the index set is finite, so there is a $g \in {}^{r}2$ such that $F'' \overset{\mathrm{def}}{=} \{f \in F' : g \subseteq f\}$ is
uncountable. Clearly any two members of $F''$ are compatible. Thus the original set $F$
could not be an antichain. So, as claimed, $\mathbb{P}$ has ccc. This argument is rather typical of
arguments showing ccc.
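The notions in this example can be tried out on a small scale. The following Python sketch is only an illustration under stated assumptions: conditions are modelled as dictionaries (finite partial functions into 2), and the helper names `compatible`, `common_extension` and `restrict` are hypothetical, chosen for this sketch. It checks that two conditions are compatible exactly when they agree on their common domain, and it imitates the pigeonhole step of the ccc argument on a tiny family whose domains form a Delta-system.

```python
from itertools import combinations

def compatible(f, g):
    """Two finite partial functions are compatible iff their union is a
    function, i.e. they agree on the common part of their domains."""
    return all(f[k] == g[k] for k in f.keys() & g.keys())

def common_extension(f, g):
    """Return the union of f and g, a condition below both of them."""
    assert compatible(f, g)
    return {**f, **g}

def restrict(f, root):
    """Restriction of f to the root, as a hashable tuple of pairs."""
    return tuple(sorted((k, f[k]) for k in root))

# A family whose domains form a Delta-system with root {0, 1}:
# the domain of the i-th function is {0, 1, 10 + i}.
root = {0, 1}
family = [{0: i % 2, 1: (i // 2) % 2, 10 + i: 1} for i in range(12)]

# Pigeonhole: group the family by its restriction to the root.
groups = {}
for f in family:
    groups.setdefault(restrict(f, root), []).append(f)

# Inside one group any two members agree on the root, hence are compatible.
largest = max(groups.values(), key=len)
pairwise_ok = all(compatible(f, g) for f, g in combinations(largest, 2))
```

In the real argument the family is uncountable and the finitely many groups force one of them to be uncountable; here the same grouping merely produces a pairwise compatible subfamily.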
Clearly if $\kappa < \lambda$ and MA($\lambda$), then also MA($\kappa$).

Theorem 11.1. MA($\omega$) holds.


Proof. Let $\mathbb{P}$ be a ccc partial order and $\mathcal{D}$ a countable collection of dense sets in $\mathbb{P}$.
If $\mathcal{D}$ is empty, we can fix any $p \in P$ and let $G = \{q \in P : p \le q\}$. Then $G$ is a filter on $\mathbb{P}$,
which is all that is required in this case.
Now suppose that $\mathcal{D}$ is nonempty, and let $\langle D_n : n \in \omega \rangle$ enumerate all the members of
$\mathcal{D}$; repetitions are needed if $\mathcal{D}$ is finite. We now define a sequence $\langle p_n : n \in \omega \rangle$ of elements
of $P$ by recursion. Let $p_0$ be any element of $P$. If $p_n$ has been defined, by the denseness of
$D_n$ let $p_{n+1}$ be such that $p_{n+1} \le p_n$ and $p_{n+1} \in D_n$. This finishes the construction. Let
$G = \{q \in P : p_n \le q \text{ for some } n \in \omega\}$. Clearly $G$ is as desired.
Note that ccc was not used in this proof.
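The recursion in this proof can be run for finitely many steps. The sketch below is a hedged illustration, not part of the notes: it assumes the partial order of finite partial functions from the natural numbers into 2 (ordered by reverse inclusion) and the dense sets consisting of the functions with a given number n in their domain. The names `witness_below`, `descending_chain` and `extends` are hypothetical.

```python
def witness_below(p, n):
    """Pick an element of D_n = {f : n in dmn(f)} that extends p.
    Extending a finite function corresponds to going down in the order."""
    if n in p:
        return dict(p)
    q = dict(p)
    q[n] = 0  # any value in {0, 1} would do
    return q

def descending_chain(p0, steps):
    """Build p_0 >= p_1 >= ... with p_{n+1} in D_n, as in the recursion."""
    chain = [dict(p0)]
    for n in range(steps):
        chain.append(witness_below(chain[-1], n))
    return chain

def extends(q, p):
    """q <= p in the order of the poset: q contains p as a subfunction."""
    return all(k in q and q[k] == v for k, v in p.items())

chain = descending_chain({3: 1}, 6)
```

The filter of the proof is the upward closure of the chain; in this model that is the set of all subfunctions of members of `chain`.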
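Corollary 11.2. CH implies MA.
Theorem 11.3. MA($2^{\omega}$) does not hold.
Proof. Suppose that it does hold. Let
$P = \{f : f \text{ is a finite function with } \mathrm{dmn}(f) \subseteq \omega \text{ and } \mathrm{rng}(f) \subseteq 2\}$;
$f \le g$ iff $f, g \in P$ and $f \supseteq g$;
$\mathbb{P} = (P, \le)$.
Then $\mathbb{P}$ has ccc, since $P$ itself is countable. Now for each $n \in \omega$ let
$D_n = \{f \in P : n \in \mathrm{dmn}(f)\}$.
Each such set is dense in $\mathbb{P}$. For, if $g \in P$, either $g$ is already in $D_n$, or $n \notin \mathrm{dmn}(g)$, and
then $g \cup \{(n, 0)\}$ is in $D_n$ and it is $\le g$.
For each $h \in {}^{\omega}2$ let
$E_h = \{f \in P : \text{there is an } n \in \mathrm{dmn}(f) \text{ such that } f(n) \ne h(n)\}$.
Again, each such set $E_h$ is dense in $\mathbb{P}$. For, let $f \in P$. If $f \not\subseteq h$, then already $f \in E_h$,
so suppose that $f \subseteq h$. Take any $n \in \omega \setminus \mathrm{dmn}(f)$, and let $g = f \cup \{(n, 1 - h(n))\}$. Then
$g \in E_h$ and $g \le f$, as desired.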

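So, by MA($2^{\omega}$) let $G$ be a filter on $\mathbb{P}$ which intersects each of the sets $D_n$ and $E_h$. Let
$k = \bigcup G$.
(*) $k : \omega \to 2$.
In fact, $k$ is obviously a relation. Suppose that $(m, \varepsilon), (m, \delta) \in k$. Choose $f, g \in G$ such
that $(m, \varepsilon) \in f$ and $(m, \delta) \in g$. Then choose $s \in G$ such that $s \le f, g$. So $f, g \subseteq s$, and $s$
is a function. It follows that $\varepsilon = \delta$. Thus $k$ is a function.
If $n \in \omega$, choose $f \in G \cap D_n$. So $n \in \mathrm{dmn}(f)$, and so $n \in \mathrm{dmn}(k)$. So we have proved
(*).
Now take any $f \in G \cap E_k$. Choose $n \in \mathrm{dmn}(f)$ such that $f(n) \ne k(n)$. But $f \subseteq k$,
contradiction.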
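The density argument for the sets of finite functions that disagree somewhere with a fixed total function h is constructive, and a short Python sketch may make it concrete. It is an illustration only: h is passed as a callable on the natural numbers, conditions are dictionaries, and the name `extend_into_E_h` is hypothetical.

```python
def extend_into_E_h(f, h):
    """Given a finite partial function f (a dict) and a total h: N -> {0, 1}
    (a callable), return an extension of f that disagrees with h somewhere."""
    if any(f[n] != h(n) for n in f):
        return dict(f)          # f already disagrees with h
    n = 0
    while n in f:               # least n outside dmn(f)
        n += 1
    g = dict(f)
    g[n] = 1 - h(n)             # force a disagreement with h at n
    return g

h = lambda n: n % 2
f = {0: 0, 1: 1, 3: 1}          # f is a subfunction of h
g = extend_into_E_h(f, h)
```

Here `g` extends `f` and differs from `h` at the argument 2, which is the least number outside the domain of `f`.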
There is one more fact concerning the definition of MA which should be mentioned.
Namely, for $\kappa > \omega$ the assumption of ccc is essential in the statement of MA($\kappa$). (Recall our comment above that ccc is not needed in order to prove that MA($\omega$) holds.) To
see this, define
$P = \{f : f \text{ is a finite function}, \mathrm{dmn}(f) \subseteq \omega, \text{ and } \mathrm{rng}(f) \subseteq \omega_1\}$;
$f \le g$ iff $f, g \in P$ and $f \supseteq g$;
$\mathbb{P} = (P, \le)$.
This example is similar to two of the partial orders above. Note that $\mathbb{P}$ does not have ccc,
since for example $\{\{(0, \alpha)\} : \alpha < \omega_1\}$ is an uncountable antichain. Defining $D_n$ as in the
proof of Theorem 11.3, we clearly get dense subsets of $\mathbb{P}$. Also, for each $\alpha < \omega_1$ let
$F_\alpha = \{f \in P : \alpha \in \mathrm{rng}(f)\}$.
Then $F_\alpha$ is dense in $\mathbb{P}$. For, suppose that $g \in P$. If $\alpha \in \mathrm{rng}(g)$, then $g$ itself is in $F_\alpha$, so
suppose that $\alpha \notin \mathrm{rng}(g)$. Choose $n \in \omega \setminus \mathrm{dmn}(g)$. Let $f = g \cup \{(n, \alpha)\}$. Then $f \in F_\alpha$ and
$f \le g$, as desired. Now if MA($\omega_1$) holds without the assumption of ccc, then we can apply
it to our present partial order. Suppose that $G$ is a filter on $\mathbb{P}$ which intersects each of
these sets $D_n$ and $F_\alpha$. As in the proof of Theorem 11.3, $k \overset{\mathrm{def}}{=} \bigcup G$ is a function mapping
$\omega$ into $\omega_1$. For any $\alpha < \omega_1$ choose $f \in G \cap F_\alpha$. Thus $\alpha \in \mathrm{rng}(f)$, and so $\alpha \in \mathrm{rng}(k)$. Thus
$k$ has range $\omega_1$. This is impossible.
Now we proceed beyond the discussion of the definition of MA in order to give several
typical applications of it. First we consider again almost disjoint sets of natural numbers.
Our result here will be used to derive some important implications of MA for cardinal
arithmetic. We proved in Theorem 8.1 that there is a family of size $2^{\omega}$ of almost disjoint
sets of natural numbers. Considering this further, we may ask what the size of maximal
almost disjoint families can be; and we may consider the least such size. This is one
of many min-max questions concerning the natural numbers which have been considered
recently. There are many consistency results saying that numbers of this sort can be less
than $2^{\omega}$; in particular, it is consistent that there is a maximal family of almost disjoint
subsets of $\omega$ which has size less than $2^{\omega}$. MA, however, implies that this size, and most of
the similarly defined min-max functions, is $2^{\omega}$.

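Let $\mathcal{A} \subseteq \mathscr{P}(\omega)$. The almost disjoint partial order for $\mathcal{A}$ is defined as follows:
$P_{\mathcal{A}} = \{(s, F) : s \in [\omega]^{<\omega} \text{ and } F \in [\mathcal{A}]^{<\omega}\}$;
$(s', F') \le (s, F)$ iff $s \subseteq s'$, $F \subseteq F'$, and $x \cap s' \subseteq s$ for all $x \in F$;
$\mathbb{P}_{\mathcal{A}} = (P_{\mathcal{A}}, \le)$.
We give some useful properties of this construction.
Lemma 11.4. Let $\mathcal{A} \subseteq \mathscr{P}(\omega)$.
(i) $\mathbb{P}_{\mathcal{A}}$ is a partial order.
(ii) Let $(s, F), (s', F') \in P_{\mathcal{A}}$. Then the following conditions are equivalent:
(a) $(s, F)$ and $(s', F')$ are compatible.
(b) $\forall x \in F\,(x \cap s' \subseteq s)$ and $\forall x \in F'\,(x \cap s \subseteq s')$.
(c) $(s \cup s', F \cup F') \le (s, F), (s', F')$.
(iii) Suppose that $x \in \mathcal{A}$, and let $D_x = \{(s, F) \in P_{\mathcal{A}} : x \in F\}$. Then $D_x$ is dense in
$\mathbb{P}_{\mathcal{A}}$.
(iv) $\mathbb{P}_{\mathcal{A}}$ has ccc.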
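Proof. (i): Clearly $\le$ is reflexive on $P_{\mathcal{A}}$ and it is antisymmetric, i.e. $(s, F) \le
(s', F') \le (s, F)$ implies that $(s, F) = (s', F')$. Now suppose that $(s'', F'') \le (s', F') \le
(s, F)$. Thus $s \subseteq s' \subseteq s''$, so $s \subseteq s''$. Similarly, $F \subseteq F''$. Now take any $x \in F$. Then
$x \in F'$, so $x \cap s'' \subseteq s'$ because $(s'', F'') \le (s', F')$. Hence $x \cap s'' \subseteq x \cap s'$. And $x \cap s' \subseteq s$
because $(s', F') \le (s, F)$. So $x \cap s'' \subseteq s$, as desired.
(ii): For (a)$\Rightarrow$(b), assume (a). Choose $(s'', F'') \le (s, F), (s', F')$. Now take any $x \in F$.
Then $x \cap s' \subseteq x \cap s''$ since $s' \subseteq s''$, and $x \cap s'' \subseteq s$ since $(s'', F'') \le (s, F)$; so $x \cap s' \subseteq s$.
The other part of (b) follows by symmetry.
(b)$\Rightarrow$(c): By symmetry it suffices to show that $(s \cup s', F \cup F') \le (s, F)$, and for this
we only need to check the last condition in the definition of $\le$. So, suppose that $x \in F$.
Then $x \cap (s \cup s') = (x \cap s) \cup (x \cap s') \subseteq s$ by (b).
(c)$\Rightarrow$(a): Obvious.
(iii): For any $(s, F) \in P_{\mathcal{A}}$, clearly $(s, F \cup \{x\}) \le (s, F)$.
(iv): Suppose that $\langle (s_\xi, F_\xi) : \xi < \omega_1 \rangle$ is a pairwise incompatible system of elements of
$P_{\mathcal{A}}$. By (ii)(b), two conditions with the same first coordinate are compatible, so $s_\xi \ne s_\eta$ for
distinct $\xi, \eta < \omega_1$. This contradicts the countability of $[\omega]^{<\omega}$.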
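The equivalence in Lemma 11.4(ii) can be checked by brute force in a finite model. The following sketch is an assumption-laden illustration: a three-element set stands in for the natural numbers, the family consists of two finite sets, and conditions are pairs of frozensets. The function names are hypothetical.

```python
from itertools import combinations, product

def le(c1, c2):
    """(s1, F1) <= (s2, F2): s2 is a subset of s1, F2 is a subset of F1,
    and x & s1 is a subset of s2 for every x in F2."""
    (s1, F1), (s2, F2) = c1, c2
    return s2 <= s1 and F2 <= F1 and all((x & s1) <= s2 for x in F2)

def condition_b(c1, c2):
    (s1, F1), (s2, F2) = c1, c2
    return all((x & s2) <= s1 for x in F1) and all((x & s1) <= s2 for x in F2)

def condition_c(c1, c2):
    (s1, F1), (s2, F2) = c1, c2
    join = (s1 | s2, F1 | F2)
    return le(join, c1) and le(join, c2)

def subsets(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# A tiny universe standing in for omega, and a two-element family A.
universe = range(3)
A = [frozenset({0, 1}), frozenset({1, 2})]
conditions = [(s, F) for s in subsets(universe) for F in subsets(A)]

def compatible(c1, c2):
    """Brute-force search for a common extension among all conditions."""
    return any(le(r, c1) and le(r, c2) for r in conditions)

agree = all(compatible(c1, c2) == condition_b(c1, c2) == condition_c(c1, c2)
            for c1, c2 in product(conditions, repeat=2))
```

The search over all 32 conditions confirms that (a), (b) and (c) coincide in this model, which is what the proof predicts since it never uses that the ground set is infinite.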

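Theorem 11.5. Let $\kappa$ be an infinite cardinal, and assume MA($\kappa$). Suppose that
$\mathcal{A}, \mathcal{B} \subseteq \mathscr{P}(\omega)$, and $|\mathcal{A}|, |\mathcal{B}| \le \kappa$. Also assume that
(i) For all $y \in \mathcal{B}$ and all $F \in [\mathcal{A}]^{<\omega}$ we have $|y \setminus \bigcup F| = \omega$.
Then there is a $d \subseteq \omega$ such that $|d \cap x| < \omega$ for all $x \in \mathcal{A}$ and $|d \cap y| = \omega$ for all $y \in \mathcal{B}$.
Proof. For each $y \in \mathcal{B}$ and each $n \in \omega$ let
$E^y_n = \{(s, F) \in P_{\mathcal{A}} : s \cap y \not\subseteq n\}$.
We claim that each such set is dense. For, suppose that $(s, F) \in P_{\mathcal{A}}$. Then by assumption,
$|y \setminus \bigcup F| = \omega$, so we can pick $m \in y \setminus \bigcup F$ such that $m > n$. Then $(s \cup \{m\}, F) \le (s, F)$,
since for each $z \in F$ we have $z \cap (s \cup \{m\}) \subseteq s$ because $m \notin z$. Also, $m \in y \setminus n$, so
$(s \cup \{m\}, F) \in E^y_n$. This proves our claim.
There are clearly at most $\kappa$ sets $E^y_n$; and also there are at most $\kappa$ sets $D_x$ with $x \in \mathcal{A}$,
with $D_x$ as in Lemma 11.4(iii). Hence by MA($\kappa$) we can let $G$ be a filter on $\mathbb{P}_{\mathcal{A}}$ intersecting
all of these dense sets. Let $d = \bigcup_{(s,F) \in G} s$.
(1) For all $x \in \mathcal{A}$, the set $d \cap x$ is finite.
For, by the denseness of $D_x$, choose $(s, F) \in G \cap D_x$. Thus $x \in F$. We claim that $d \cap x \subseteq s$.
To prove this, suppose that $n \in d \cap x$. Choose $(s', F') \in G$ such that $n \in s'$. Now $(s, F)$ and
$(s', F')$ are compatible. By Lemma 11.4(ii), $\forall y \in F\,(y \cap s' \subseteq s)$; in particular, $x \cap s' \subseteq s$.
Since $n \in x \cap s'$, we get $n \in s$. This proves our claim, and so (1) holds.
The proof will be finished by proving
(2) For all $y \in \mathcal{B}$, the set $d \cap y$ is infinite.
To prove (2), given $n \in \omega$ choose $(s, F) \in E^y_n \cap G$. Thus $s \cap y \not\subseteq n$, so we can choose
$m \in (s \cap y) \setminus n$. Hence $m \in (d \cap y) \setminus n$, proving (2).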
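Corollary 11.6. Let $\kappa$ be an infinite cardinal and assume MA($\kappa$). Suppose that
$\mathcal{A} \subseteq \mathscr{P}(\omega)$ is an almost disjoint set of infinite subsets of $\omega$ of size $\kappa$. Then $\mathcal{A}$ is not
maximal.
Proof. If $F$ is a finite subset of $\mathcal{A}$, then we can choose $a \in \mathcal{A} \setminus F$; then
$a \cap \bigcup F = \bigcup_{b \in F} (a \cap b)$ is finite. Thus $\omega \setminus \bigcup F$ is infinite. Hence we can apply
Theorem 11.5 to $\mathcal{A}$ and $\mathcal{B} \overset{\mathrm{def}}{=} \{\omega\}$ to obtain the desired result.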


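Corollary 11.7. Assuming MA, every maximal almost disjoint set of infinite sets of
natural numbers has size $2^{\omega}$.
Lemma 11.8. Suppose that $\mathcal{B} \subseteq \mathscr{P}(\omega)$ is an almost disjoint family of infinite sets,
and $|\mathcal{B}| = \kappa$, where $\omega \le \kappa < 2^{\omega}$. Also suppose that $\mathcal{A} \subseteq \mathcal{B}$. Assume MA($\kappa$).
Then there is a $d \subseteq \omega$ such that $|d \cap x| < \omega$ for all $x \in \mathcal{A}$ and $|d \cap x| = \omega$ for all
$x \in \mathcal{B} \setminus \mathcal{A}$.
Proof. We apply 11.5 with $\mathcal{B} \setminus \mathcal{A}$ in place of $\mathcal{B}$. If $y \in \mathcal{B} \setminus \mathcal{A}$ and $F \in [\mathcal{A}]^{<\omega}$, then
$F \subseteq \mathcal{B}$ and $y \notin F$, and hence $y \cap z$ is finite for all $z \in F$. Hence also $y \cap \bigcup F$ is finite. Since $y$
itself is infinite, it follows that $y \setminus \bigcup F$ is infinite.
Thus the hypotheses of 11.5 hold, and it then gives the desired result.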
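We now come to two of the most striking consequences of Martin's axiom.
Theorem 11.9. If $\kappa$ is an infinite cardinal and MA($\kappa$) holds, then $2^{\kappa} = 2^{\omega}$.
Proof. By Theorem 10.1 let $\mathcal{B}$ be an almost disjoint family of infinite subsets of $\omega$
such that $|\mathcal{B}| = \kappa$. For each $d \subseteq \omega$ let $F(d) = \{b \in \mathcal{B} : |b \cap d| < \omega\}$. We claim that $F$ maps
$\mathscr{P}(\omega)$ onto $\mathscr{P}(\mathcal{B})$; from this it follows that $2^{\kappa} \le 2^{\omega}$, hence $2^{\kappa} = 2^{\omega}$. To prove the claim,
suppose that $\mathcal{A} \subseteq \mathcal{B}$. A suitable $d$ with $F(d) = \mathcal{A}$ is then given by Lemma 11.8.
Corollary 11.10. MA implies that $2^{\omega}$ is regular.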

Proof. Assume MA, and suppose that $\omega \le \kappa < 2^{\omega}$. Then $2^{\kappa} = 2^{\omega}$ by Theorem 11.9,
and so $\mathrm{cf}(2^{\omega}) = \mathrm{cf}(2^{\kappa}) > \kappa$ by Corollary 4.411.
Another important application of Martin's axiom is to the existence of Suslin trees; in fact,
Martin's axiom arose out of the proof of this theorem:
Theorem 11.11. MA($\omega_1$) implies that there are no Suslin trees.
Proof. Suppose that $(T, \le)$ is a Suslin tree. By 8.7 and the remarks before it, we may
assume that $T$ is well-pruned. We are going to apply MA($\omega_1$) to the partial order $(T, \ge)$,
i.e., to $T$ turned upside down. Because $T$ has no uncountable antichains in the tree sense,
$(T, \ge)$ has no uncountable antichains in the incompatibility sense. Now for each $\alpha < \omega_1$
let
$D_\alpha = \{t \in T : \mathrm{ht}(t, T) > \alpha\}$.
Then each $D_\alpha$ is dense in $(T, \ge)$. For, suppose that $s \in T$. By well-prunedness, choose
$t \in T$ such that $s < t$ and $\mathrm{ht}(t, T) > \alpha$. Thus $t \in D_\alpha$ and $t > s$, as desired.
Now we let $G$ be a filter on $(T, \ge)$ which intersects each $D_\alpha$. Any two elements of
$G$ are compatible in $(T, \ge)$, so they are comparable in $(T, \le)$. Since $G \cap D_\alpha \ne \emptyset$ for all
$\alpha < \omega_1$, $G$ has a member of $T$ of height greater than $\alpha$, for each $\alpha < \omega_1$. Hence $G$ is an
uncountable chain, contradiction.
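Our last application of Martin's axiom involves Lebesgue measure. In order not to assume
too much about measures, we give some results of measure theory that will be used in our
application but may have been omitted in your standard study of measure theory.
Lemma 11.12. Suppose that $\mu$ is a measure and $E, F, G$ are $\mu$-measurable. Then
$\mu(E \triangle F) \le \mu(E \triangle G) + \mu(G \triangle F)$.
Proof.
$$\begin{aligned}
\mu(E \triangle F) &= \mu(E \setminus F) + \mu(F \setminus E) \\
&= \mu((E \setminus F) \cap G) + \mu((E \setminus F) \setminus G) + \mu((F \setminus E) \cap G) + \mu((F \setminus E) \setminus G) \\
&\le \mu(G \setminus F) + \mu(E \setminus G) + \mu(G \setminus E) + \mu(F \setminus G) \\
&= \mu(E \triangle G) + \mu(G \triangle F).
\end{aligned}$$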
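As a quick sanity check of Lemma 11.12, the inequality can be verified exhaustively for one concrete measure. The sketch below assumes counting measure on the subsets of a four-element set; this is only an illustrative special case, and the names `mu`, `ground` and `all_sets` are hypothetical.

```python
from itertools import combinations, product

def mu(a):
    """Counting measure on finite sets (an illustrative choice of measure)."""
    return len(a)

ground = range(4)
all_sets = [frozenset(c) for r in range(5) for c in combinations(ground, r)]

# a ^ b is the symmetric difference of two frozensets.
triangle_holds = all(mu(e ^ f) <= mu(e ^ g) + mu(g ^ f)
                     for e, f, g in product(all_sets, repeat=3))
```

The check runs over all 4096 triples of subsets, so in this model the symmetric difference behaves like a pseudometric, as the lemma states in general.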
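Lemma 11.13. If $E$ is Lebesgue measurable with finite measure, then for any $\varepsilon > 0$
there is an open set $U \supseteq E$ such that $\mu(E) \le \mu(U) \le \mu(E) + \varepsilon$. Moreover, there is a system
$\langle K_j : j < \omega \rangle$ of open intervals such that $U = \bigcup_{j<\omega} K_j$ and $\mu(U) \le \sum_{j<\omega} \mu(K_j) \le \mu(E) + \varepsilon$.
Proof. By the basic definition of Lebesgue measure,
$$\mu(E) = \inf \left\{ \sum_{j \in \omega} \mu(I_j) : \langle I_j : j \in \omega \rangle \text{ is a sequence of half-open intervals such that } E \subseteq \bigcup_{j \in \omega} I_j \right\}.$$
Hence we can choose a sequence $\langle I_j : j \in \omega \rangle$ of half-open intervals such that $E \subseteq \bigcup_{j \in \omega} I_j$
and
$$\sum_{j \in \omega} \mu(I_j) \le \mu(E) + \frac{\varepsilon}{2}.$$
Write $I_j = [a_j, b_j)$ with $a_j < b_j$. Define
$$K_j = \left( a_j - \frac{\varepsilon}{2^{j+2}},\ b_j \right);$$
then $E \subseteq \bigcup_{j \in \omega} K_j$, and with $U = \bigcup_{j \in \omega} K_j$ we have
$$\begin{aligned}
\mu\Bigl(\bigcup_{j \in \omega} K_j\Bigr) &\le \sum_{j \in \omega} \mu(K_j) \\
&= \sum_{j \in \omega} \left( \frac{\varepsilon}{2^{j+2}} + \mu(I_j) \right) \\
&= \sum_{j \in \omega} \frac{\varepsilon}{2^{j+2}} + \sum_{j \in \omega} \mu(I_j) \\
&\le \frac{\varepsilon}{2} + \mu(E) + \frac{\varepsilon}{2} = \mu(E) + \varepsilon.
\end{aligned}$$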
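The bookkeeping in this proof, where the j-th half-open interval is enlarged on the left by the j-th term of a geometric series, can be checked with exact rational arithmetic. The following sketch is illustrative only: it uses three sample intervals and a sample tolerance, and the names `enlarged` and `total_length` are hypothetical.

```python
from fractions import Fraction

def enlarged(intervals, eps):
    """Replace each half-open [a_j, b_j) by the open (a_j - eps/2^(j+2), b_j)."""
    return [(a - eps / 2 ** (j + 2), b) for j, (a, b) in enumerate(intervals)]

def total_length(intervals):
    return sum(b - a for a, b in intervals)

eps = Fraction(1, 10)
I = [(Fraction(0), Fraction(1)),
     (Fraction(2), Fraction(5, 2)),
     (Fraction(3), Fraction(4))]
K = enlarged(I, eps)

# The extra length is eps * (1/4 + 1/8 + 1/16), which stays below eps / 2.
extra = total_length(K) - total_length(I)
```

For infinitely many intervals the extra length sums to exactly half of the tolerance, which is where the second half in the final estimate of the proof comes from.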

The following is an elementary lemma concerning the topology of the reals.

Lemma 11.14. Suppose that $U$ is a bounded open set.
(i) There is a collection $\mathcal{A}$ of pairwise disjoint open intervals such that $U = \bigcup \mathcal{A}$.
(ii) There exist a countable subset $C$ of $\mathbb{R}$ and a collection $\mathcal{B}$ of pairwise disjoint open
intervals with rational endpoints such that $U = C \cup \bigcup \mathcal{B}$ and $C \cap \bigcup \mathcal{B} = \emptyset$.
Proof. (i): For $x, y \in \mathbb{R}$, define $x \equiv y$ iff one of the following conditions holds: (1)
$x = y$; (2) $x < y$ and $[x, y] \subseteq U$; (3) $y < x$ and $[y, x] \subseteq U$. Clearly $\equiv$ is an equivalence
relation on $\mathbb{R}$. If $x < z < y$ and $x \equiv y$, then obviously $x \equiv z$. Thus each equivalence class
is convex. If $C$ is an equivalence class with more than one element, then it must be an open
interval $(a, b)$, since if for example the left endpoint $a$ is in $C$ then some real to the left
of $a$ must be in $C$, contradiction. It follows now that the collection $\mathcal{A}$ of all equivalence
classes with more than one element is as desired in (i).
(ii): First note that the set $\mathcal{A}$ of (i) must be countable. Now take any $(a, b) \in \mathcal{A}$,
$a < b$. Let $c_0 < c_1 < \cdots < c_m < \cdots$ be rational numbers in $(a, b)$ which converge to
$b$, and $c_0 = d_0 > d_1 > \cdots > d_m > \cdots$ rational numbers which converge to $a$. Then let
$L^{ab}_{2i} = (c_i, c_{i+1})$ and $L^{ab}_{2i+1} = (d_{i+1}, d_i)$ for all $i \in \omega$. Let $D^{ab} = \{c_i : i < \omega\} \cup \{d_i : i < \omega\}$.
Define $\mathcal{B} = \{L^{ab}_i : (a, b) \in \mathcal{A}, i < \omega\}$ and $C = \bigcup_{(a,b) \in \mathcal{A}} D^{ab}$. Clearly this works for
(ii).
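Lemma 11.15. If $E$ is Lebesgue measurable and $\varepsilon > 0$, then there is an $m \in \omega$ and a
sequence $\langle I_i : i < m \rangle$ of open intervals with rational endpoints such that
$\mu\bigl(E \triangle \bigcup_{i<m} I_i\bigr) \le \varepsilon$.
Proof. By Lemma 11.13 let $U \supseteq E$ be open such that $\mu(E) \le \mu(U) \le \mu(E) + \frac{\varepsilon}{2}$.
Then choose $C$ and $\mathcal{B}$ as in Lemma 11.14(ii). Let $W = \bigcup \mathcal{B}$. So $\mu(W) = \sum_{I \in \mathcal{B}} \mu(I)$.
Then choose $m \in \omega$ and $\langle I_i : i < m \rangle$ elements of $\mathcal{B}$ such that
$\sum_{I \in \mathcal{B}} \mu(I) - \sum_{i<m} \mu(I_i) \le \frac{\varepsilon}{2}$. Now $\mu(W) = \sum_{I \in \mathcal{B}} \mu(I)$ and
$\mu\bigl(\bigcup_{i<m} I_i\bigr) = \sum_{i<m} \mu(I_i)$. Let $V = \bigcup_{i<m} I_i$. Thus
$\mu(W) - \mu(V) \le \frac{\varepsilon}{2}$. Also $V \subseteq W \subseteq U$, and
$$\begin{aligned}
\mu(E \triangle V) &\le \mu(E \triangle U) + \mu(U \triangle W) + \mu(W \triangle V) \\
&= \mu(U \setminus E) + \mu(C) + \mu(W \setminus V) \\
&= \mu(U) - \mu(E) + \mu(W) - \mu(V) \\
&\le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{aligned}$$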
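Now we are ready for an application of Martin's axiom to Lebesgue measure.
Theorem 11.16. Suppose that $\kappa$ is an infinite cardinal and MA($\kappa$) holds.
If $\langle M_\alpha : \alpha < \kappa \rangle$ is a system of subsets of $\mathbb{R}$ each of Lebesgue measure 0, then also
$\bigcup_{\alpha<\kappa} M_\alpha$ has Lebesgue measure 0.
Proof. Let $\varepsilon > 0$. We are going to find an open set $U$ such that $\bigcup_{\alpha<\kappa} M_\alpha \subseteq U$ and
$\mu(U) \le \varepsilon$; this will prove our result. Let
$P = \{p \subseteq \mathbb{R} : p \text{ is open and } \mu(p) < \varepsilon\}$.
The ordering, as usual, is $\supseteq$; that is, $p \le q$ iff $p \supseteq q$.
(1) Elements $p, q \in P$ are compatible iff $\mu(p \cup q) < \varepsilon$.
In fact, the direction $\Leftarrow$ is clear, while if $p$ and $q$ are compatible, then there is an $r \in P$
with $r \le p, q$, hence $p \cup q \subseteq r$ and $\mu(r) < \varepsilon$, hence $\mu(p \cup q) < \varepsilon$.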
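Next we check that $\mathbb{P}$ has ccc. Suppose that $\langle p_\alpha : \alpha < \omega_1 \rangle$ is a system of pairwise
incompatible elements of $P$. Now
$$\omega_1 = \bigcup_{n \in \omega} \left\{ \alpha < \omega_1 : \mu(p_\alpha) \le \varepsilon - \frac{1}{n+1} \right\},$$
so there exist an uncountable $\Gamma \subseteq \omega_1$ and a positive integer $m$ such that $\mu(p_\alpha) \le \varepsilon - \frac{1}{m}$
for all $\alpha \in \Gamma$. Let $\mathcal{C}$ be the collection of all finite unions of open intervals with rational
endpoints. Note that $\mathcal{C}$ is countable. By Lemma 11.15, for each $\alpha \in \Gamma$ let $C_\alpha$ be a
member of $\mathcal{C}$ such that $\mu(p_\alpha \triangle C_\alpha) \le \frac{1}{3m}$. Now take any two distinct members $\alpha, \beta \in \Gamma$.
Then, since $p_\alpha$ and $p_\beta$ are incompatible, (1) gives
$$\varepsilon \le \mu(p_\alpha \cup p_\beta) = \mu(p_\alpha \cap p_\beta) + \mu(p_\alpha \triangle p_\beta) \le \varepsilon - \frac{1}{m} + \mu(p_\alpha \triangle p_\beta),$$
and hence $\mu(p_\alpha \triangle p_\beta) \ge \frac{1}{m}$. Thus, using Lemma 11.12,
$$\frac{1}{m} \le \mu(p_\alpha \triangle p_\beta) \le \mu(p_\alpha \triangle C_\alpha) + \mu(C_\alpha \triangle C_\beta) + \mu(C_\beta \triangle p_\beta) \le \frac{1}{3m} + \mu(C_\alpha \triangle C_\beta) + \frac{1}{3m};$$
hence $\mu(C_\alpha \triangle C_\beta) \ge \frac{1}{3m}$. It follows that $C_\alpha \ne C_\beta$. But this means that $\langle C_\alpha : \alpha \in \Gamma \rangle$ is a
one-one system of members of $\mathcal{C}$, contradiction. So $\mathbb{P}$ has ccc.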
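Now for each $\alpha < \kappa$ let
$D_\alpha = \{p \in P : M_\alpha \subseteq p\}$.

To show that $D_\alpha$ is dense, take any $p \in P$. Thus $\mu(p) < \varepsilon$. By Lemma 11.13, let $U$ be an
open set such that $M_\alpha \subseteq U$ and $\mu(U) < \varepsilon - \mu(p)$. Then $\mu(p \cup U) \le \mu(p) + \mu(U) < \varepsilon$; so
$p \cup U \in D_\alpha$ and $p \cup U \supseteq p$, as desired.
Now let $G$ be a filter on $\mathbb{P}$ which intersects each $D_\alpha$. Set $V = \bigcup G$. So $V$ is an open
set. For each $\alpha < \kappa$, choose $p_\alpha \in G \cap D_\alpha$. Then $M_\alpha \subseteq p_\alpha \subseteq V$. It remains only to show
that $\mu(V) \le \varepsilon$. Let $\mathcal{B}$ be the set of all open intervals with rational endpoints. We claim
that $V = \bigcup (G \cap \mathcal{B})$. In fact, $\supseteq$ is clear, so suppose that $x \in V$. Then $x \in p$ for some
$p \in G$, hence there is a $U \in \mathcal{B}$ such that $x \in U \subseteq p$, since $p$ is open. Then $U \in G$ since
$G$ is a filter and the partial order is $\supseteq$. So we found a $U \in G \cap \mathcal{B}$ such that $x \in U$; hence
$x \in \bigcup (G \cap \mathcal{B})$. This proves our claim. Now if $F$ is a finite subset of $G$, then $\bigcup F \in G$ since
$G$ is a filter. In particular, $\bigcup F \in P$, so its measure is less than $\varepsilon$. Now $G \cap \mathcal{B}$ is countable;
let $\langle p_i : i \in \omega \rangle$ enumerate it. Define $q_i = p_i \setminus \bigcup_{j<i} p_j$ for all $i \in \omega$. Then by induction one
sees that $\bigcup_{i<m} p_i = \bigcup_{i<m} q_i$, and hence $\bigcup (G \cap \mathcal{B}) = \bigcup_{i<\omega} q_i$. So
$$\mu(V) = \mu\Bigl(\bigcup (G \cap \mathcal{B})\Bigr) = \mu\Bigl(\bigcup_{i<\omega} q_i\Bigr) = \sum_{i<\omega} \mu(q_i) = \lim_{m \to \infty} \sum_{i<m} \mu(q_i) = \lim_{m \to \infty} \mu\Bigl(\bigcup_{i<m} q_i\Bigr) = \lim_{m \to \infty} \mu\Bigl(\bigcup_{i<m} p_i\Bigr) \le \varepsilon.$$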
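The disjointification step at the end of this proof, which passes from the sets p_i to pairwise disjoint sets q_i with the same partial unions, is easy to illustrate on finite sets. This is a minimal sketch with made-up data; the function name `disjointify` is hypothetical.

```python
def disjointify(sets):
    """Return q_i = p_i minus the union of all p_j with j < i."""
    seen = set()
    result = []
    for p in sets:
        result.append(set(p) - seen)
        seen |= set(p)
    return result

p = [{1, 2, 3}, {2, 3, 4}, {1, 5}, {4, 5}]
q = disjointify(p)

# The q_i are pairwise disjoint and have the same union as the p_i,
# so the sizes of the q_i add up to the size of the union.
same_union = set().union(*q) == set().union(*p)
additive = sum(len(x) for x in q) == len(set().union(*p))
```

With counting measure in place of Lebesgue measure, `additive` is the finite form of the countable additivity used in the displayed computation.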

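EXERCISES
11.1. Assume MA($\kappa$). Suppose that $X$ is a compact Hausdorff space, and any pairwise
disjoint collection of open sets in $X$ is countable. Suppose that $U_\alpha$ is dense open in $X$ for
each $\alpha < \kappa$. Show that $\bigcap_{\alpha<\kappa} U_\alpha \ne \emptyset$.
11.2. A partial order $\mathbb{P}$ is said to have $\omega_1$ as a precaliber iff for every system $\langle p_\alpha : \alpha < \omega_1 \rangle$
of elements of $P$ there is an $X \in [\omega_1]^{\omega_1}$ such that for every finite subset $F$ of $X$ there is a
$q \in P$ such that $q \le p_\alpha$ for all $\alpha \in F$.
Show that MA($\omega_1$) implies that every ccc partial order $\mathbb{P}$ has $\omega_1$ as a precaliber.
Hint: for each $\alpha < \omega_1$ let
$W_\alpha = \{q \in P : \exists \beta > \alpha\,(q \text{ and } p_\beta \text{ are compatible})\}$.
Show that there is an $\alpha < \omega_1$ such that $W_\alpha = W_\beta$ for all $\beta > \alpha$, and apply MA($\omega_1$) to
$W_\alpha$.
11.3. Call a topological space $X$ ccc iff every collection of pairwise disjoint open sets in
$X$ is countable. Show that $\prod_{i \in I} X_i$ is ccc iff $\forall F \in [I]^{<\omega}\,\bigl[\prod_{i \in F} X_i \text{ is ccc}\bigr]$. Hint: use the
$\Delta$-system theorem.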

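11.4. Assuming MA($\omega_1$), show that any product of ccc spaces is ccc.
11.5. Assume MA($\omega_1$). Suppose that $P$ and $Q$ are ccc partially ordered sets. Define $\le$ on
$P \times Q$ by setting $(a, b) \le (c, d)$ iff $a \le c$ and $b \le d$. Show that $\le$ is a ccc partial order on
$P \times Q$. Hint: use exercise 11.1.
11.6. We define $<^*$ on ${}^{\omega}\omega$ by setting $f <^* g$ iff $f, g \in {}^{\omega}\omega$ and $\exists n \forall m > n\,(f(m) < g(m))$.
Suppose that MA($\kappa$) holds and $\mathcal{F} \in [{}^{\omega}\omega]^{\kappa}$. Show that there is a $g \in {}^{\omega}\omega$ such that $f <^* g$
for all $f \in \mathcal{F}$. Hint: let $P$ be the set of all pairs $(p, F)$ such that $p$ is a finite function
mapping a subset of $\omega$ into $\omega$ and $F$ is a finite subset of $\mathcal{F}$. Define $(p, F) \le (q, G)$ iff
$q \subseteq p$, $G \subseteq F$, and
$\forall f \in G\,\forall n \in \mathrm{dmn}(p) \setminus \mathrm{dmn}(q)\,[p(n) > f(n)]$.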
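11.7. Let $\mathcal{B} \subseteq [\omega]^{\omega}$ be almost disjoint of size $\kappa$, with $\omega \le \kappa < 2^{\omega}$. Let $\mathcal{A} \subseteq \mathcal{B}$ with $\mathcal{A}$
countable. Assume MA($\kappa$). Show that there is a $d \subseteq \omega$ such that $|d \cap x| < \omega$ for all $x \in \mathcal{A}$,
and $|x \setminus d| < \omega$ for all $x \in \mathcal{B} \setminus \mathcal{A}$. Hint: Let $\langle a_i : i \in \omega \rangle$ enumerate $\mathcal{A}$. Let
$P = \{(s, F, m) : s \in [\omega]^{<\omega}, F \in [\mathcal{B} \setminus \mathcal{A}]^{<\omega}, \text{ and } m \in \omega\}$;
$(s', F', m') \le (s, F, m)$ iff $s \subseteq s'$, $F \subseteq F'$, $m \le m'$, and
$$\forall x \in F \left[ \Bigl( x \setminus \bigcup_{i \le m} a_i \Bigr) \cap s' \subseteq s \right].$$
Show that $\mathbb{P}$ satisfies ccc. To apply MA($\kappa$), one needs various dense sets. The most
complicated is defined as follows. Let $D = \{(s, F, m, i, n) : (s, F, m) \in P, i < m, \text{ and }
n \in a_i \setminus s\}$. Clearly $|D| = \kappa$. For each $(s, F, m, i, n) \in D$ let
$E_{(s,F,m,i,n)} = \{(s', F', m') \in P : (s, F, m) \text{ and } (s', F', m') \text{ are incompatible,}$
$\text{or } (s', F', m') \le (s, F, m) \text{ and } n \in s'\}$.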
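11.8. [The condition that $\mathcal{A}$ is countable is needed in exercise 11.7.] Show that there exist
$\mathcal{A}, \mathcal{B}$ such that $\mathcal{B}$ is an almost disjoint family of infinite subsets of $\omega$, $\mathcal{A} \subseteq \mathcal{B}$, $|\mathcal{A}| =
|\mathcal{B} \setminus \mathcal{A}| = \omega_1$, and there does not exist a $d \subseteq \omega$ such that $|x \setminus d| < \omega$ for all $x \in \mathcal{A}$, and
$|x \cap d| < \omega$ for all $x \in \mathcal{B} \setminus \mathcal{A}$. Hint: construct $\mathcal{A} = \{a_\alpha : \alpha < \omega_1\}$ and $\mathcal{B} \setminus \mathcal{A} = \{b_\alpha :
\alpha < \omega_1\}$ by constructing $a_\alpha, b_\alpha$ inductively, making sure that the elements are infinite and
pairwise almost disjoint, and also $a_\alpha \cap b_\alpha = \emptyset$, while for $\alpha \ne \beta$ we have $a_\alpha \cap b_\beta \ne \emptyset$.
11.9. Suppose that $\mathcal{A}$ is a family of infinite subsets of $\omega$ such that $\bigcap F$ is infinite for every
finite subset $F$ of $\mathcal{A}$. Suppose that $|\mathcal{A}| \le \kappa$. Assuming MA($\kappa$), show that there is an
infinite $X \subseteq \omega$ such that $X \setminus A$ is finite for every $A \in \mathcal{A}$. Hint: use Theorem 11.5.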
11.10. Show that MA($\kappa$) is equivalent to MA($\kappa$) restricted to ccc partial orders of cardinality $\le \kappa$. Hint: Assume the indicated special form of MA($\kappa$), and assume given a ccc
partially ordered set $P$ and a family $\mathcal{D}$ of at most $\kappa$ dense sets in $P$; we want to find a
filter on $P$ intersecting each member of $\mathcal{D}$. We introduce some operations on $P$. For each
$D \in \mathcal{D}$ define $f_D : P \to P$ by setting, for each $p \in P$, $f_D(p)$ to be some element of $D$
which is $\le p$. Also we define $g : P \times P \to P$ by setting, for all $p, q \in P$,
$$g(p, q) = \begin{cases} p & \text{if } p \text{ and } q \text{ are incompatible,} \\ r \text{ with } r \le p, q & \text{if there is such an } r. \end{cases}$$
Here, as in the definition of $f_D$, we are implicitly using the axiom of choice; for $g$, we
choose any $r$ of the indicated form.
We may assume that $\mathcal{D} \ne \emptyset$. Choose $D \in \mathcal{D}$, and choose $s \in D$. Now let $Q$ be the
intersection of all subsets of $P$ which have $s$ as a member and are closed under all of the
operations $f_D$ and $g$. We take the order on $Q$ to be the order induced from $P$. Apply the
special form to $Q$.
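11.11. Define $x \subset^* y$ iff $x, y \subseteq \omega$, $x \setminus y$ is finite, and $y \setminus x$ is infinite. Assume MA($\kappa$), and
suppose that $(L, <)$ is a linear ordering of size at most $\kappa$. Show that there is a system
$\langle a_x : x \in L \rangle$ of infinite subsets of $\omega$ such that for all $x, y \in L$, $x < y$ iff $a_x \subset^* a_y$. Hint: let
$P$ consist of all pairs $(p, n)$ such that $n \in \omega$, $p$ is a function whose domain is a finite subset
of $L$, and $\forall x \in \mathrm{dmn}(p)\,[p(x) \subseteq n]$. Define $(p, n) \le (q, m)$ iff $m \le n$, $\mathrm{dmn}(q) \subseteq \mathrm{dmn}(p)$,
$\forall x \in \mathrm{dmn}(q)\,[p(x) \cap m = q(x)]$, and $\forall x, y \in \mathrm{dmn}(q)\,[x < y \to p(x) \setminus p(y) \subseteq m]$.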
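For the remaining exercises we use the following definitions.
$a \subseteq^* b$ iff $a \setminus b$ is finite;
$a \subset^* b$ iff $a \subseteq^* b$ and $b \setminus a$ is infinite.

11.12. If $\mathcal{A}, \mathcal{B}$ are nonempty countable subsets of $[\omega]^{\omega}$ and $a \subseteq^* b$ whenever $a \in \mathcal{A}$ and
$b \in \mathcal{B}$, then there is a $c \in [\omega]^{\omega}$ such that $a \subseteq^* c \subseteq^* b$ whenever $a \in \mathcal{A}$ and $b \in \mathcal{B}$.
11.13. Suppose that $\mathcal{A}$ is a nonempty countable family of members of $[\omega]^{\omega}$, and $\forall a, b \in
\mathcal{A}\,[a \subseteq^* b \text{ or } b \subseteq^* a]$. Also suppose that $\forall a \in \mathcal{A}\,[a \subset^* d]$, where $d \in [\omega]^{\omega}$. Then there is a
$c \in [\omega]^{\omega}$ such that $\forall a \in \mathcal{A}\,[a \subseteq^* c \subset^* d]$.
11.14. If $a, b \in [\omega]^{\omega}$ and $a \subset^* b$, then there is a $c \in [\omega]^{\omega}$ such that $a \subset^* c \subset^* b$.
11.15. Suppose that $\mathcal{A}$ and $\mathcal{B}$ are nonempty countable subsets of $[\omega]^{\omega}$, $\forall x, y \in \mathcal{A}\,[x \subseteq^* y
\text{ or } y \subseteq^* x]$, $\forall x, y \in \mathcal{B}\,[x \subseteq^* y \text{ or } y \subseteq^* x]$, and $\forall a \in \mathcal{A}\,\forall b \in \mathcal{B}\,[a \subset^* b]$. Then there is a
$c \in [\omega]^{\omega}$ such that $a \subset^* c \subset^* b$ for all $a \in \mathcal{A}$ and $b \in \mathcal{B}$.
Now we need some more terminology. Let $\mathcal{A} \subseteq [\omega]^{\omega}$, $b \in [\omega]^{\omega}$, and $\forall a \in \mathcal{A}\,[a \subseteq^* b]$. We
say that $b$ is near to $\mathcal{A}$ iff for all $m \in \omega$ the set $\{a \in \mathcal{A} : a \setminus b \subseteq m\}$ is finite.
11.16. Suppose that $a_m \in [\omega]^{\omega}$ for all $m \in \omega$, $a_m \subset^* a_n$ whenever $m < n \in \omega$, $b \in [\omega]^{\omega}$,
and $a_m \subset^* b$ for all $m \in \omega$. Then there is a $c \in [\omega]^{\omega}$ such that $\forall m \in \omega\,[a_m \subset^* c \subset^* b]$ and
$c$ is near to $\{a_n : n \in \omega\}$.
11.17. Suppose that $\mathcal{A} \subseteq [\omega]^{\omega}$, $\forall x, y \in \mathcal{A}\,[x \subseteq^* y \text{ or } y \subseteq^* x]$, $b \in [\omega]^{\omega}$, $\forall x \in \mathcal{A}\,[x \subset^* b]$,
and $\forall a \in \mathcal{A}\,[b \text{ is near to } \{d \in \mathcal{A} : d \subset^* a\}]$.
Then there is a $c \in [\omega]^{\omega}$ such that $\forall a \in \mathcal{A}\,[a \subset^* c \subset^* b]$ and $c$ is near to $\mathcal{A}$.
11.18. (The Hausdorff gap) There exist sequences $\langle a_\alpha : \alpha < \omega_1 \rangle$ and $\langle b_\alpha : \alpha < \omega_1 \rangle$ of
members of $[\omega]^{\omega}$ such that $\forall \alpha, \beta < \omega_1\,[\alpha < \beta \to a_\alpha \subset^* a_\beta \text{ and } b_\beta \subset^* b_\alpha]$, $\forall \alpha, \beta < \omega_1\,[a_\alpha \subset^*
b_\beta]$, and there does not exist a $c \subseteq \omega$ such that $\forall \alpha < \omega_1\,[a_\alpha \subset^* c \text{ and } c \subset^* b_\alpha]$.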
Reference
Fremlin, D. Consequences of Martin's axiom. Cambridge Univ. Press, 325pp.

12. Large cardinals


The study, or use, of large cardinals is one of the most active areas of research in set theory
currently. There are many provably different kinds of large cardinals whose descriptions
are different from one another. We restrict ourselves in this chapter to three important
kinds: Mahlo cardinals, weakly compact cardinals, and measurable cardinals. All of these
large cardinals are regular limit cardinals (which are frequently called weakly inaccessible
cardinals), and most of them are strongly inaccessible cardinals.
Mahlo cardinals
As we mentioned in the elementary part of these notes, one cannot prove in ZFC that
uncountable weakly inaccessible cardinals exist (if ZFC itself is consistent). But now
we assume that even the somewhat stronger inaccessible cardinals exist, and we want
to explore, roughly speaking, how many such there can be. We begin with some easy
propositions.
Proposition 12.1. Assume that uncountable inaccessible cardinals exist, and suppose
that $\kappa$ is the least such. Then every uncountable strong limit cardinal less than $\kappa$ is singular.
The inaccessibles are a class of ordinals, hence form a well-ordered class, and they can be
enumerated in a strictly increasing sequence $\langle \iota_\alpha : \alpha \in O \rangle$. Here $O$ is an ordinal, or On,
the class of all ordinals. The definition of Mahlo cardinal is motivated by the following
simple proposition.
Proposition 12.2. If $\kappa = \iota_\alpha$ with $\alpha < \kappa$, then the set $\{\lambda < \kappa : \lambda \text{ is regular}\}$ is a
nonstationary subset of $\kappa$.
Proof. Since $\kappa$ is regular and $\alpha < \kappa$, we must have $\sup_{\beta<\alpha} \iota_\beta < \kappa$. Let $C = \{\gamma :
\sup_{\beta<\alpha} \iota_\beta < \gamma < \kappa \text{ and } \gamma \text{ is a strong limit cardinal}\}$. Then $C$ is club in $\kappa$ with empty
intersection with the given set.
$\kappa$ is Mahlo iff $\kappa$ is an uncountable inaccessible cardinal and $\{\lambda < \kappa : \lambda \text{ is regular}\}$ is
stationary in $\kappa$.
$\kappa$ is weakly Mahlo iff $\kappa$ is an uncountable weakly inaccessible cardinal and $\{\lambda < \kappa : \lambda \text{ is
regular}\}$ is stationary in $\kappa$.
Since the function $\iota$ is strictly increasing, we have $\alpha \le \iota_\alpha$ for all $\alpha$. Hence the following is
a corollary of Proposition 12.2.
Corollary 12.3. If $\kappa$ is a Mahlo cardinal, then $\kappa = \iota_\kappa$.
Thus a Mahlo cardinal $\kappa$ is not only inaccessible, but also has $\kappa$ inaccessibles below it.
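Proposition 12.4. For any uncountable cardinal $\kappa$ the following conditions are equivalent:
(i) $\kappa$ is Mahlo.
(ii) $\{\lambda < \kappa : \lambda \text{ is inaccessible}\}$ is stationary in $\kappa$.
Proof. (i)$\Rightarrow$(ii): Let $S = \{\lambda < \kappa : \lambda \text{ is regular}\}$, and $S' = \{\lambda < \kappa : \lambda \text{ is inaccessible}\}$.
Assume that $\kappa$ is Mahlo. In particular, $\kappa$ is uncountable and inaccessible. Suppose that
$C$ is club in $\kappa$. The set $D = \{\lambda < \kappa : \lambda \text{ is strong limit}\}$ is clearly club in $\kappa$ too. If
$\lambda \in S \cap C \cap D$, then $\lambda$ is inaccessible, as desired.
(ii)$\Rightarrow$(i): obvious.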
The following proposition answers a natural question one may ask after seeing Corollary
12.3.
Proposition 12.5. Suppose that $\kappa$ is minimum such that $\iota_\kappa = \kappa$. Then $\kappa$ is not
Mahlo.
Proof. Suppose to the contrary that $\kappa$ is Mahlo, and let $S = \{\lambda < \kappa : \lambda \text{ is
inaccessible}\}$. For each $\lambda \in S$, let $f(\lambda)$ be the $\alpha < \kappa$ such that $\lambda = \iota_\alpha$. Then $\alpha = f(\lambda) < \lambda$
by the minimality of $\kappa$. So $f$ is regressive on the stationary set $S$, and hence there is an
$\alpha < \kappa$ and a stationary subset $S'$ of $S$ such that $f(\lambda) = \alpha$ for all $\lambda \in S'$. But actually $f$ is
clearly a one-one function, contradiction.
Mahlo cardinals are in a sense larger than ordinary inaccessibles. Namely, below every
Mahlo cardinal $\kappa$ there are $\kappa$ inaccessibles. But now in principle one could enumerate all
the Mahlo cardinals, and then apply the same idea used in going from regular cardinals to
Mahlo cardinals in order to go from Mahlo cardinals to higher Mahlo cardinals. Thus we
can make the definitions
$\kappa$ is hyper-Mahlo iff $\kappa$ is inaccessible and the set $\{\lambda < \kappa : \lambda \text{ is Mahlo}\}$ is stationary in $\kappa$.
$\kappa$ is hyper-hyper-Mahlo iff $\kappa$ is inaccessible and the set $\{\lambda < \kappa : \lambda \text{ is hyper-Mahlo}\}$ is
stationary in $\kappa$.
Of course one can continue in this vein.
Weakly compact cardinals
A cardinal $\kappa$ is weakly compact iff $\kappa > \omega$ and $\kappa \to (\kappa, \kappa)^2$. There are several equivalent
definitions of weak compactness. The one which justifies the name "compact" involves
infinitary logic, and it will be discussed later. Right now we consider equivalent conditions
involving trees and linear orderings.
A cardinal $\kappa$ has the tree property iff every $\kappa$-tree has a chain of size $\kappa$.
Equivalently, $\kappa$ has the tree property iff there is no $\kappa$-Aronszajn tree.
A cardinal $\kappa$ has the linear order property iff every linear order $(L, <)$ of size $\kappa$ has a
subset with order type $\kappa$ or $\kappa^*$ under $<$.
Lemma 12.6. For any regular cardinal $\kappa$, the linear order property implies the tree
property.
Proof. We are going to go from a tree to a linear order in a different way from the
branch method of Chapter 8.
Assume the linear order property, and let $(T, <)$ be a $\kappa$-tree. For each $x \in T$ and
each $\alpha \le \mathrm{ht}(x, T)$ let $x^\alpha$ be the element of height $\alpha$ below or equal to $x$. Thus $x^0$ is the root which
is below $x$, and $x^{\mathrm{ht}(x)} = x$. For each $x \in T$, let $T \downarrow x = \{y \in T : y < x\}$. If $x, y$ are
incomparable elements of $T$, then let $\chi(x, y)$ be the smallest ordinal $\alpha \le \min(\mathrm{ht}(x), \mathrm{ht}(y))$
such that $x^\alpha \ne y^\alpha$. Let $<'$ be a well-order of $T$. Then we define, for any distinct $x, y \in T$,
$x <'' y$ iff $x < y$, or $x$ and $y$ are incomparable and $x^{\chi(x,y)} <' y^{\chi(x,y)}$.

We claim that this gives a linear order of $T$. To prove transitivity, suppose that $x <'' y <''
z$. Then there are several possibilities. These are illustrated in diagrams below.
Case 1. $x < y < z$. Then $x < z$, so $x <'' z$.
Case 2. $x < y$, while $y$ and $z$ are incomparable, with $y^{\chi(y,z)} <' z^{\chi(y,z)}$.
Subcase 2.1. $\mathrm{ht}(x) < \chi(y, z)$. Then $x = x^{\mathrm{ht}(x)} = y^{\mathrm{ht}(x)} = z^{\mathrm{ht}(x)}$ so that $x < z$,
hence $x <'' z$.
Subcase 2.2. $\chi(y, z) \le \mathrm{ht}(x)$. Then $x$ and $z$ are incomparable. In fact, if $z < x$
then $z < y$, contradicting the assumption that $y$ and $z$ are incomparable; if $x \le z$, then
$y^{\mathrm{ht}(x)} = x = x^{\mathrm{ht}(x)} = z^{\mathrm{ht}(x)}$, contradiction. Now if $\alpha < \chi(x, z)$ then $y^\alpha = x^\alpha = z^\alpha$; it
follows that $\chi(x, z) \le \chi(y, z)$. If $\alpha < \chi(y, z)$ then $\alpha \le \mathrm{ht}(x)$, and hence $x^\alpha = y^\alpha = z^\alpha$; this
shows that $\chi(y, z) \le \chi(x, z)$. So $\chi(y, z) = \chi(x, z)$. Hence $x^{\chi(x,z)} = y^{\chi(x,z)} = y^{\chi(y,z)} <'
z^{\chi(y,z)} = z^{\chi(x,z)}$, and hence $x <'' z$.
Case 3. $x$ and $y$ are incomparable, and $y < z$. Then $x$ and $z$ are incomparable. Now
if $\alpha < \chi(x, y)$, then $x^\alpha = y^\alpha = z^\alpha$; this shows that $\chi(x, y) \le \chi(x, z)$. Also, $x^{\chi(x,y)} <'
y^{\chi(x,y)} = z^{\chi(x,y)}$, and this implies that $\chi(x, z) \le \chi(x, y)$. So $\chi(x, y) = \chi(x, z)$. It follows
that $x^{\chi(x,z)} = x^{\chi(x,y)} <' y^{\chi(x,y)} = z^{\chi(x,z)}$, and hence $x <'' z$.
Case 4. $x$ and $y$ are incomparable, and also $y$ and $z$ are incomparable. We consider
subcases.
Subcase 4.1. $\chi(y, z) < \chi(x, y)$. Now if $\alpha < \chi(y, z)$, then $x^\alpha = y^\alpha = z^\alpha$; so
$\chi(y, z) \le \chi(x, z)$. Also, $x^{\chi(y,z)} = y^{\chi(y,z)} <' z^{\chi(y,z)}$, so that $\chi(x, z) \le \chi(y, z)$. Hence
$\chi(x, z) = \chi(y, z)$, and $x^{\chi(x,z)} = y^{\chi(y,z)} <' z^{\chi(y,z)}$, and hence $x <'' z$.
Subcase 4.2. $\chi(y, z) = \chi(x, y)$. Now $x^{\chi(x,y)} <' y^{\chi(x,y)} = y^{\chi(y,z)} <' z^{\chi(y,z)} =
z^{\chi(x,y)}$. It follows that $\chi(x, z) \le \chi(x, y)$. For any $\alpha < \chi(x, y)$ we have $x^\alpha = y^\alpha = z^\alpha$ since
$\chi(y, z) = \chi(x, y)$. So $\chi(x, y) = \chi(x, z)$. Hence $x^{\chi(x,z)} = x^{\chi(x,y)} <' y^{\chi(x,y)} = y^{\chi(y,z)} <'
z^{\chi(y,z)} = z^{\chi(x,z)}$, so $x <'' z$.
Subcase 4.3. $\chi(x, y) < \chi(y, z)$. Then $x^{\chi(x,y)} <' y^{\chi(x,y)} = z^{\chi(x,y)}$, and if $\alpha < \chi(x, y)$
then $x^\alpha = y^\alpha = z^\alpha$. It follows that $x <'' z$.

Clearly any two elements of $T$ are comparable under $<''$, so we have a linear order. The
following property is also needed.
(*) If $t < x, y$ and $x <'' a <'' y$, then $t < a$.
In fact, suppose not. If $a \le t$, then $a < x$, hence $a <'' x$, contradiction. So $a$ and $t$ are
incomparable. Then $\chi(a, t) \le \mathrm{ht}(t)$, and hence $x <'' y <'' a$ or $a <'' x <'' y$, contradiction.
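The order built from a tree and a well-order can be tested exhaustively on a small tree. The sketch below is an illustration under explicit assumptions: the tree is the set of binary sequences of length at most 3 under the initial-segment order, the well-order is an arbitrary ranking chosen for this sketch, and all function names are hypothetical. It checks trichotomy, transitivity, and property (*).

```python
from itertools import product

def below(x, y):
    """Tree order: x < y iff x is a proper initial segment of y."""
    return len(x) < len(y) and y[:len(x)] == x

def comparable(x, y):
    return x == y or below(x, y) or below(y, x)

def split_height(x, y):
    """Least height at which the predecessors of incomparable x, y differ."""
    alpha = 0
    while x[:alpha] == y[:alpha]:
        alpha += 1
    return alpha

# All binary sequences of length at most 3; the node x has height len(x),
# and x[:alpha] is the element of height alpha below or equal to x.
T = [t for n in range(4) for t in product((0, 1), repeat=n)]

# An arbitrary well-order of T, given by a rank function (hypothetical choice).
rank = {x: i for i, x in enumerate(sorted(T, key=lambda x: (x[::-1], len(x))))}

def less(x, y):
    """The order defined from the tree order and the well-order."""
    if x == y:
        return False
    if comparable(x, y):
        return below(x, y)
    alpha = split_height(x, y)
    return rank[x[:alpha]] < rank[y[:alpha]]

trichotomy = all((less(x, y) + less(y, x) + (x == y)) == 1 for x in T for y in T)
transitive = all(less(x, z) for x in T for y in T for z in T
                 if less(x, y) and less(y, z))
star = all(below(t, a) for t in T for x in T for y in T for a in T
           if below(t, x) and below(t, y) and less(x, a) and less(a, y))
```

Changing the key used for `rank` gives a different well-order and hence a different linear order, but the three checks should still succeed, since the case analysis in the proof does not depend on the choice of well-order.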
Now by the linear order property, $(T, <'')$ has a subset $L$ of order type $\kappa$ or $\kappa^*$. First
suppose that $L$ is of order type $\kappa$. Define
$B = \{t \in T : \exists x \in L\,\forall a \in L\,[x \le'' a \to t \le a]\}$.

[Figure: diagrams illustrating Case 1, Subcase 2.1, Subcase 2.2, Case 3, Subcase 4.1, Subcase 4.2, and Subcase 4.3 of the transitivity argument.]
We claim that $B$ is a chain in $T$ of size $\kappa$. Suppose that $t_0, t_1 \in B$ with $t_0 \ne t_1$, and choose
$x_0, x_1 \in L$ correspondingly. Say wlog $x_0 <'' x_1$. Now $t_0 \in B$ and $x_0 \le'' x_1$, so $t_0 \le x_1$.
And $t_1 \in B$ and $x_1 \le'' x_1$, so $t_1 \le x_1$. So $t_0$ and $t_1$ are comparable.
Now let $\alpha < \kappa$; we show that $B$ has an element of height $\alpha$. For each $t$ of height $\alpha$ let
$V_t = \{x \in L : t \le x\}$. Then
$$\{x \in L : \mathrm{ht}(x) \ge \alpha\} = \bigcup_{\mathrm{ht}(t) = \alpha} V_t;$$
since there are fewer than $\kappa$ elements of height less than $\alpha$, this set has size $\kappa$, and so there
is a $t$ such that $\mathrm{ht}(t) = \alpha$ and $|V_t| = \kappa$. We claim that $t \in B$. To prove this, take any
$x \in V_t$ such that $t < x$. Suppose that $a \in L$ and $x \le'' a$. Choose $y \in V_t$ with $a <'' y$ and
$t < y$. Then $t < x$, $t < y$, and $x \le'' a <'' y$. If $x = a$, then $t \le a$, as desired. If $x <'' a$,
then $t < a$ by (*).
This finishes the case in which $L$ has a subset of order type $\kappa$. The case of order type
$\kappa^*$ is similar, but we give it. So, suppose that $L$ has order type $\kappa^*$. Define
$B = \{t \in T : \exists x \in L\,\forall a \in L\,[a \le'' x \to t \le a]\}$.
We claim that $B$ is a chain in $T$ of size $\kappa$. Suppose that $t_0, t_1 \in B$ with $t_0 \ne t_1$, and choose
$x_0, x_1 \in L$ correspondingly. Say wlog $x_0 <'' x_1$. Now $t_0 \in B$ and $x_0 \le'' x_0$, so $t_0 \le x_0$. And
$t_1 \in B$ and $x_0 \le'' x_1$, so $t_1 \le x_0$. So $t_0$ and $t_1$ are comparable.

Now let $\alpha < \kappa$; we show that $B$ has an element of height $\alpha$. For each $t$ of height $\alpha$ let
$V_t = \{x \in L : t \le x\}$. Then
$$\{x \in L : \mathrm{ht}(x) \ge \alpha\} = \bigcup_{\mathrm{ht}(t) = \alpha} V_t;$$
since there are fewer than $\kappa$ elements of height less than $\alpha$, this set has size $\kappa$, and so there
is a $t$ such that $\mathrm{ht}(t) = \alpha$ and $|V_t| = \kappa$. We claim that $t \in B$. To prove this, take any
$x \in V_t$ such that $t < x$. Suppose that $a \in L$ and $a \le'' x$. Choose $y \in V_t$ with $y <'' a$ and
$t < y$. Then $t < x$, $t < y$, and $y <'' a \le'' x$. If $a = x$, then $t < a$, as desired. If $a <'' x$,
then $t < a$ by (*).
Theorem 12.7. For any uncountable cardinal the following conditions are equivalent:
(i) is weakly compact.
(ii) is inaccessible, and it has the linear order property.
(iii) is inaccessible, and it has the tree property.
(iv) For any cardinal such that 1 < < we have ()2 .
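Proof. (i)$\Rightarrow$(ii): Assume that $\kappa$ is weakly compact. First we need to show that $\kappa$ is inaccessible.
To show that $\kappa$ is regular, suppose to the contrary that $\kappa = \sum_{\xi<\lambda} \mu_\xi$, where $\lambda < \kappa$ and $\mu_\xi < \kappa$ for each $\xi < \lambda$. By the definition of infinite sum of cardinals, it follows that we can write $\kappa = \bigcup_{\xi<\lambda} M_\xi$, where $|M_\xi| = \mu_\xi$ for each $\xi < \lambda$ and the $M_\xi$'s are pairwise disjoint. Define $f : [\kappa]^2 \to 2$ by setting, for any distinct $\alpha, \beta < \kappa$,
$$f(\{\alpha,\beta\}) = \begin{cases} 0 & \text{if } \alpha, \beta \in M_\xi \text{ for some } \xi < \lambda,\\ 1 & \text{otherwise.} \end{cases}$$
Let $H$ be homogeneous for $f$ of size $\kappa$. First suppose that $f[[H]^2] = \{0\}$. Fix $\alpha_0 \in H$, and say $\alpha_0 \in M_\xi$. For any $\beta \in H$ we then have $\beta \in M_\xi$ also, by the homogeneity of $H$. So $H \subseteq M_\xi$, which is impossible since $|M_\xi| < \kappa$. Second, suppose that $f[[H]^2] = \{1\}$. Then any two distinct members of $H$ lie in distinct $M_\xi$'s. Hence if we define $g(\alpha)$ to be the $\xi < \lambda$ such that $\alpha \in M_\xi$ for each $\alpha \in H$, we get a one-one function from $H$ into $\lambda$, which is impossible since $\lambda < \kappa$.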
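To show that $\kappa$ is strong limit, suppose that $\lambda < \kappa$ but $2^\lambda \ge \kappa$. Now by Theorem 10.9 we have $2^\lambda \not\to (\lambda^+, \lambda^+)^2$. So choose $f : [2^\lambda]^2 \to 2$ such that there does not exist an $X \in [2^\lambda]^{\lambda^+}$ with $f \restriction [X]^2$ constant. Define $g : [\kappa]^2 \to 2$ by setting $g(A) = f(A)$ for any $A \in [\kappa]^2$. Choose $Y \in [\kappa]^\kappa$ such that $g \restriction [Y]^2$ is constant. Take any $Z \in [Y]^{\lambda^+}$. Then $f \restriction [Z]^2$ is constant, contradiction.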
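So, $\kappa$ is inaccessible. Now let $(L, <)$ be a linear order of size $\kappa$. Let $\prec$ be a well order of $L$. Now we define $f : [L]^2 \to 2$; suppose that $a, b \in L$ with $a \prec b$. Then
$$f(\{a,b\}) = \begin{cases} 0 & \text{if } a < b,\\ 1 & \text{if } b < a. \end{cases}$$
Let $H$ be homogeneous for $f$ and of size $\kappa$. If $f[[H]^2] = \{0\}$, then $<$ agrees with $\prec$ on $H$, so $H$ is well-ordered by $<$. If $f[[H]^2] = \{1\}$, then $H$ is well-ordered by $>$. In either case $L$ has a subset of order type $\kappa$ or $\kappa^*$.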
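The colouring in the preceding paragraph can be tried out on a small finite example. The following is only a sketch under stated assumptions: list position plays the role of the well-order $\prec$, numerical order plays the role of $<$, and the helper names `colour` and `largest_homogeneous` are hypothetical, introduced here for illustration. The brute-force search is meant for very small inputs only, and of course it says nothing about sets of size $\kappa$.

```python
from itertools import combinations

def colour(i, j, values):
    """f({a, b}) for positions i < j: 0 if the two orders agree, 1 otherwise."""
    return 0 if values[i] < values[j] else 1

def largest_homogeneous(values):
    """Return (H, colours) for a largest set H of positions on which f is constant."""
    n = len(values)
    for size in range(n, 0, -1):
        for H in combinations(range(n), size):
            colours = {colour(i, j, values) for i, j in combinations(H, 2)}
            if len(colours) <= 1:
                return H, colours
    return (), set()

values = [3, 1, 4, 1.5, 5, 9, 2, 6]      # distinct entries, so f is well defined
H, colours = largest_homogeneous(values)
picked = [values[i] for i in H]
# Colour 0 means `picked` is increasing; colour 1 means it is decreasing.
print(H, colours, picked)
```

A homogeneous set of colour 0 is a subsequence on which both orders agree, and one of colour 1 is a subsequence on which they are opposite, mirroring the two cases in the proof.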
(ii)$\Rightarrow$(iii): By Lemma 12.6.


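(iii)$\Rightarrow$(iv): Assume (iii). Suppose that $F : [\kappa]^2 \to \lambda$, where $1 < \lambda < \kappa$; we want to find a homogeneous set for $F$ of size $\kappa$. We construct by recursion a sequence $\langle t_\alpha : \alpha < \kappa \rangle$ of members of ${}^{<\kappa}\lambda$; these will be the members of a tree $T$. Let $t_0 = \emptyset$. Now suppose that $0 < \alpha < \kappa$ and $t_\beta \in {}^{<\kappa}\lambda$ has been constructed for all $\beta < \alpha$. We now define $t_\alpha$ by recursion; its domain will also be determined by the recursive definition, and for this purpose it is convenient to actually define an auxiliary function $s : \kappa \to \lambda + 1$ by recursion. If $s(\eta)$ has been defined for all $\eta < \xi$, we define
$$s(\xi) = \begin{cases} F(\{\beta, \alpha\}) & \text{where } \beta < \alpha \text{ is minimum such that } s \restriction \xi = t_\beta, \text{ if there is such a } \beta,\\ \lambda & \text{if there is no such } \beta. \end{cases}$$
Now eventually the second condition here must hold, as otherwise $\langle s \restriction \xi : \xi < \kappa \rangle$ would be a one-one function from $\kappa$ into $\{t_\beta : \beta < \alpha\}$, which is impossible. Take the least $\xi$ such that $s(\xi) = \lambda$, and let $t_\alpha = s \restriction \xi$. This finishes the construction of the $t_\alpha$'s. Let $T = \{t_\alpha : \alpha < \kappa\}$, with the partial order $\subset$. Clearly this gives a tree.
By construction, if $\alpha < \kappa$ and $\xi < \mathrm{dmn}(t_\alpha)$, then $t_\alpha \restriction \xi \in T$. Thus the height of an element $t_\alpha$ is $\mathrm{dmn}(t_\alpha)$.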
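(2) The sequence $\langle t_\alpha : \alpha < \kappa \rangle$ is one-one.
In fact, suppose that $\beta < \alpha$ and $t_\alpha = t_\beta$. Say that $\mathrm{dmn}(t_\alpha) = \xi$. Then $s \restriction \xi = t_\alpha = t_\beta$ at stage $\xi$ of the construction of $t_\alpha$, so the first clause of the definition applies there and the construction of $t_\alpha$ gives something with domain greater than $\xi$, contradiction. Thus (2) holds, and hence $|T| = \kappa$.
(3) The set of all elements of $T$ of level $\xi < \kappa$ has size less than $\kappa$.
In fact, let $U$ be this set. Then
$$|U| \le \prod_{\eta < \xi} \lambda = \lambda^{|\xi|} < \kappa$$
since $\kappa$ is inaccessible. So (3) holds, and hence, since $|T| = \kappa$, $T$ has height $\kappa$ and is a $\kappa$-tree.
(4) If $t_\beta \subset t_\alpha$, then $\beta < \alpha$ and $F(\{\beta, \alpha\}) = t_\alpha(\mathrm{dmn}(t_\beta))$.
This is clear from the definition.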
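Now by the tree property, there is a branch $B$ of size $\kappa$. For each $\xi < \lambda$ let
$$H_\xi = \{\alpha < \kappa : t_\alpha \in B \text{ and } t_\alpha{}^\frown\langle \xi \rangle \in B\}.$$
We claim that each $H_\xi$ is homogeneous for $F$. In fact, take any distinct $\alpha, \beta \in H_\xi$. Then $t_\alpha, t_\beta \in B$. Say $t_\beta \subset t_\alpha$. Then $\beta < \alpha$, and by construction $t_\alpha(\mathrm{dmn}(t_\beta)) = F(\{\alpha, \beta\})$. So $F(\{\alpha, \beta\}) = \xi$ by the definition of $H_\xi$, as desired. Now
$$\{\alpha < \kappa : t_\alpha \in B\} = \bigcup_{\xi < \lambda} H_\xi,$$
so since $|B| = \kappa$ and $\lambda < \kappa$ with $\kappa$ regular, it follows that $|H_\xi| = \kappa$ for some $\xi < \lambda$, as desired.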
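The recursion defining the $t_\alpha$'s can be run on a finite initial segment. The following is a minimal sketch, assuming a colouring of pairs from $\{0, \dots, n-1\}$ given as a Python function; the names `build_tree` and `homogeneous_sets` and the particular 3-colouring are hypothetical choices made for illustration. Sequences are represented as tuples, and a maximal chain is obtained from the prefixes of a longest $t_\alpha$.

```python
def build_tree(n, F):
    """Finite analogue of the recursion: t[alpha] is a tuple of colours."""
    t = []
    for alpha in range(n):
        s = ()
        # While s equals some earlier t[beta], extend s by F(beta, alpha).
        while s in t:
            beta = t.index(s)            # the minimum beta with s == t[beta]
            s = s + (F(beta, alpha),)
        t.append(s)                      # first stage with no such beta
    return t

def homogeneous_sets(t, branch_top, num_colours):
    """H[xi] = [alpha : t[alpha] and t[alpha] + (xi,) are both prefixes of branch_top]."""
    B = {branch_top[:k] for k in range(len(branch_top) + 1)}
    return {xi: [a for a, ta in enumerate(t) if ta in B and ta + (xi,) in B]
            for xi in range(num_colours)}

F = lambda beta, alpha: (alpha * beta + alpha + beta) % 3   # an arbitrary 3-colouring
t = build_tree(30, F)
top = max(t, key=len)                    # top of a longest chain in the finite tree
H = homogeneous_sets(t, top, 3)
print(H)
```

Each list `H[xi]` is homogeneous for `F` with colour `xi`, for the same reason as in the proof: the colour of a pair is read off from the longer of the two sequences at the length of the shorter one.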


(iv)$\Rightarrow$(i): obvious.
Now we go into the connection of weakly compact cardinals with logic, thereby justifying the name "weakly compact". This is optional material.
Let $\kappa$ and $\lambda$ be infinite cardinals. The language $L_{\kappa\lambda}$ is an extension of ordinary first order logic as follows. The notion of a model is unchanged. In the logic, we have a sequence of $\lambda$ distinct individual variables, and we allow quantification over any one-one sequence of fewer than $\lambda$ variables. We also allow conjunctions and disjunctions of fewer than $\kappa$ formulas. It should be clear what it means for an assignment of values to the variables to satisfy a formula in this extended language. We say that an infinite cardinal $\kappa$ is logically weakly compact iff the following condition holds:
(*) For any language $L_{\kappa\kappa}$ with at most $\kappa$ basic symbols, if $\Gamma$ is a set of sentences of the language and if every subset of $\Gamma$ of size less than $\kappa$ has a model, then $\Gamma$ also has a model.
Notice here the somewhat unnatural restriction that there are at most $\kappa$ basic symbols. If we drop this restriction, we obtain the notion of a strongly compact cardinal. These cardinals are much larger than even the measurable cardinals discussed later. We will not go into the theory of such cardinals.
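Theorem 12.8. An infinite cardinal is logically weakly compact iff it is weakly compact.
Proof. Suppose that $\kappa$ is logically weakly compact.
(1) $\kappa$ is regular.
Suppose not; say $X \subseteq \kappa$ is unbounded but $|X| < \kappa$. Take the language with individual constants $c_\alpha$ for $\alpha < \kappa$ and also one more individual constant $d$. Consider the following set $\Gamma$ of sentences in this language:
$$\{d \ne c_\alpha : \alpha < \kappa\} \cup \Big\{ \bigvee_{\beta \in X} \bigvee_{\alpha < \beta} (d = c_\alpha) \Big\}.$$
If $\Delta \in [\Gamma]^{<\kappa}$, let $A$ be the set of all $\alpha < \kappa$ such that $d \ne c_\alpha$ is in $\Delta$. So $|A| < \kappa$. Take any $\alpha \in \kappa \setminus A$, and consider the structure $M = (\kappa, \gamma, \alpha)_{\gamma < \kappa}$, in which each $c_\gamma$ denotes $\gamma$ and $d$ denotes $\alpha$. There is a $\beta \in X$ with $\alpha < \beta$, and this shows that $M$ is a model of $\Delta$.
Thus every subset of $\Gamma$ of size less than $\kappa$ has a model, so $\Gamma$ has a model; but this is clearly impossible.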
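(2) $\kappa$ is strong limit.
In fact, suppose not; let $\lambda < \kappa$ with $2^\lambda \ge \kappa$. We consider the language with distinct individual constants $c_\alpha$, $d^i_\alpha$ for all $\alpha < \lambda$ and $i < 2$. Let $\Gamma$ be the following set of sentences in this language:
$$\Big\{ \bigwedge_{\alpha < \lambda} \big[(c_\alpha = d^0_\alpha \vee c_\alpha = d^1_\alpha) \wedge d^0_\alpha \ne d^1_\alpha\big] \Big\} \cup \Big\{ \bigvee_{\alpha < \lambda} \big(c_\alpha \ne d^{f(\alpha)}_\alpha\big) : f \in {}^\lambda 2 \Big\}.$$
Suppose that $\Delta \in [\Gamma]^{<\kappa}$. We may assume that $\Delta$ has the form
$$\Big\{ \bigwedge_{\alpha < \lambda} \big[(c_\alpha = d^0_\alpha \vee c_\alpha = d^1_\alpha) \wedge d^0_\alpha \ne d^1_\alpha\big] \Big\} \cup \Big\{ \bigvee_{\alpha < \lambda} \big(c_\alpha \ne d^{f(\alpha)}_\alpha\big) : f \in M \Big\},$$
where $M \in [{}^\lambda 2]^{<\kappa}$. Fix $g \in {}^\lambda 2 \setminus M$. Let $d^0_\alpha = \alpha$, $d^1_\alpha = \alpha + 1$, and $c_\alpha = d^{g(\alpha)}_\alpha$, for all $\alpha < \lambda$. Clearly $(\kappa, c_\alpha, d^i_\alpha)_{\alpha < \lambda, i < 2}$ is a model of $\Delta$.
Thus every subset of $\Gamma$ of size less than $\kappa$ has a model, so $\Gamma$ has a model, say $(M, u_\alpha, v^i_\alpha)_{\alpha < \lambda, i < 2}$. By the first part of $\Gamma$ there is a function $f \in {}^\lambda 2$ such that $u_\alpha = v^{f(\alpha)}_\alpha$ for every $\alpha < \lambda$. This contradicts the second part of $\Gamma$.
Hence we have shown that $\kappa$ is inaccessible.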
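Finally, we prove that the tree property holds. Suppose that $(T, \le)$ is a $\kappa$-tree. Let $L$ be the language with a binary relation symbol $\prec$, unary relation symbols $P_\alpha$ for each $\alpha < \kappa$, individual constants $c_t$ for each $t \in T$, and one more individual constant $d$. Let $\Gamma$ be the following set of sentences:
all $L_{\kappa\kappa}$-sentences holding in the structure $M \stackrel{\mathrm{def}}{=} (T, <, \mathrm{Lev}_\alpha(T), t)_{\alpha < \kappa, t \in T}$;
$\exists x[P_\alpha x \wedge x \prec d]$ for each $\alpha < \kappa$.
Clearly every subset of $\Gamma$ of size less than $\kappa$ has a model. Hence $\Gamma$ has a model $N \stackrel{\mathrm{def}}{=} (A, <', S'_\alpha, a_t, b)_{\alpha < \kappa, t \in T}$. For each $\alpha < \kappa$ choose $e_\alpha \in S'_\alpha$ with $e_\alpha <' b$. Now the following sentence holds in $M$ and hence in $N$:
$$\forall x \Big[ P_\alpha x \to \bigvee_{s \in \mathrm{Lev}_\alpha(T)} (x = c_s) \Big].$$
Hence for each $\alpha < \kappa$ we can choose $t(\alpha) \in T$ such that $e_\alpha = a_{t(\alpha)}$. Now the sentence
$$\forall x, y, z[x \prec z \wedge y \prec z \to x \text{ and } y \text{ are comparable}]$$
holds in $M$, and hence in $N$. Now fix $\alpha < \beta < \kappa$. Now $e_\alpha, e_\beta <' b$, so it follows that $e_\alpha$ and $e_\beta$ are comparable under $<'$. Hence $a_{t(\alpha)}$ and $a_{t(\beta)}$ are comparable under $<'$. It follows that $t(\alpha)$ and $t(\beta)$ are comparable under $<$. So $t(\alpha) < t(\beta)$. Thus we have a branch of size $\kappa$.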
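Now suppose that $\kappa$ is weakly compact. Let $L$ be an $L_{\kappa\kappa}$-language with at most $\kappa$ symbols, and suppose that $\Gamma$ is a set of sentences in $L$ such that every subset $\Delta$ of $\Gamma$ of size less than $\kappa$ has a model $M_\Delta$. We will construct a model of $\Gamma$ by modifying Henkin's proof of the completeness theorem for first-order logic.
First we note that there are at most $\kappa$ formulas of $L$. This is easily seen by the following recursive construction of all formulas:
$$F_0 = \text{all atomic formulas};$$
$$F_{\alpha+1} = F_\alpha \cup \{\neg\varphi : \varphi \in F_\alpha\} \cup \Big\{ \bigvee \Phi : \Phi \in [F_\alpha]^{<\kappa} \Big\} \cup \{\exists \bar{x}\, \varphi : \varphi \in F_\alpha,\ \bar{x} \text{ of length} < \kappa\};$$
$$F_\alpha = \bigcup_{\beta < \alpha} F_\beta \quad \text{for } \alpha \text{ limit}.$$
By induction, $|F_\alpha| \le \kappa$ for all $\alpha \le \kappa$, and $F_\kappa$ is the set of all formulas. (One uses that $\kappa$ is inaccessible.)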
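Expand $L$ to $L'$ by adjoining a set $C$ of new individual constants, with $|C| = \kappa$. Let $\Theta$ be the set of all subformulas of the sentences in $\Gamma$. Let $\langle \varphi_\alpha : \alpha < \kappa \rangle$ list all sentences of $L'$ which are of the form $\exists \bar{x}_\alpha \psi_\alpha(\bar{x}_\alpha)$ and are obtained from a member of $\Theta$ by replacing variables by members of $C$. Here $\bar{x}_\alpha$ is a one-one sequence of variables of length less than $\kappa$; say that $\bar{x}_\alpha$ has length $\beta_\alpha$. Now we define a sequence $\langle d_\alpha : \alpha < \kappa \rangle$; each $d_\alpha$ will be a sequence of members of $C$ of length less than $\kappa$. If $d_\beta$ has been defined for all $\beta < \alpha$, then
$$\bigcup_{\beta < \alpha} \mathrm{rng}(d_\beta) \cup \{c \in C : c \text{ occurs in } \varphi_\beta \text{ for some } \beta \le \alpha\}$$
has size less than $\kappa$. We then let $d_\alpha$ be a one-one sequence of members of $C$ not in this set; $d_\alpha$ should have length $\beta_\alpha$. Now for each $\alpha \le \kappa$ let
$$\Omega_\alpha = \{\exists \bar{x}_\beta \psi_\beta(\bar{x}_\beta) \to \psi_\beta(d_\beta) : \beta < \alpha\}.$$
Note that $\Omega_\alpha \subseteq \Omega_\beta$ if $\alpha < \beta$. Now we define for each $\Delta \in [\Gamma]^{<\kappa}$ and each $\alpha \le \kappa$ a model $N^\Delta_\alpha$ of $\Delta \cup \Omega_\alpha$. Since $\Omega_0 = \emptyset$, we can let $N^\Delta_0 = M_\Delta$. Having defined $N^\Delta_\alpha$, since the range of $d_\alpha$ consists of new constants, we can choose denotations of those constants, expanding $N^\Delta_\alpha$ to $N^\Delta_{\alpha+1}$, so that the sentence
$$\exists \bar{x}_\alpha \psi_\alpha(\bar{x}_\alpha) \to \psi_\alpha(d_\alpha)$$
holds in $N^\Delta_{\alpha+1}$. For $\alpha$ limit we let $N^\Delta_\alpha = \bigcup_{\beta < \alpha} N^\Delta_\beta$.
It follows that $N^\Delta_\kappa$ is a model of $\Delta \cup \Omega_\kappa$. So each subset of $\Gamma \cup \Omega_\kappa$ of size less than $\kappa$ has a model.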
It suffices now to find a model of $\Gamma \cup \Omega_\kappa$ in the language $L'$. Let $\langle \chi_\alpha : \alpha < \kappa \rangle$ be an enumeration of all sentences obtained from members of $\Theta$ by replacing variables by members of $C$, each such sentence appearing $\kappa$ times. Let $T$ consist of all $f$ satisfying the following conditions:
(3) f is a function with domain < .
(4) < [( f () = ) and
/ f () = )].
(5) rng(f ) has a model.
Thus T forms a tree .
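(6) $T$ has an element of height $\alpha$, for each $\alpha < \kappa$.
In fact,
$$\Delta \stackrel{\mathrm{def}}{=} \{\chi_\beta : \beta < \alpha,\ \chi_\beta \in \Gamma \cup \Omega_\kappa\} \cup \{\neg\chi_\beta : \beta < \alpha,\ \neg\chi_\beta \in \Gamma \cup \Omega_\kappa\}$$
is a subset of $\Gamma \cup \Omega_\kappa$ of size less than $\kappa$, so it has a model $P$. For each $\beta < \alpha$ let
$$f(\beta) = \begin{cases} \chi_\beta & \text{if } P \models \chi_\beta,\\ \neg\chi_\beta & \text{if } P \models \neg\chi_\beta. \end{cases}$$
Clearly $f$ is an element of $T$ with height $\alpha$. So (6) holds.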


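Thus $T$ is clearly a $\kappa$-tree, so by the tree property we can let $B$ be a branch in $T$ of size $\kappa$. Let $\Phi = \{f(\alpha) : \alpha < \kappa,\ f \in B,\ f \text{ has height } \alpha + 1\}$. Clearly $\Gamma \cup \Omega_\kappa \subseteq \Phi$ and for every $\alpha < \kappa$, $\chi_\alpha \in \Phi$ or $\neg\chi_\alpha \in \Phi$.
(7) If $\varphi$, $\varphi \to \psi \in \Phi$, then $\psi \in \Phi$.
In fact, say $\varphi = f(\alpha)$ and $\varphi \to \psi = f(\beta)$. Choose $\gamma > \alpha, \beta$ so that $\chi_\gamma$ is $\psi$. We may assume that $\mathrm{dmn}(f) \ge \gamma + 1$. Since $\mathrm{rng}(f)$ has a model, it follows that $f(\gamma) = \psi$. So (7) holds.
Let $S$ be the set of all terms with no variables in them. We define $\sigma \equiv \tau$ iff $\sigma, \tau \in S$ and $(\sigma = \tau) \in \Phi$. Then $\equiv$ is an equivalence relation on $S$. In fact, let $\sigma \in S$. Say that $\sigma = \sigma$ is $\chi_\alpha$. Since $\chi_\alpha$ holds in every model, it holds in any model of $\{f(\beta) : \beta \le \alpha\}$, and hence $f(\alpha) = (\sigma = \sigma)$. So $(\sigma = \sigma) \in \Phi$ and so $\sigma \equiv \sigma$. Symmetry and transitivity follow by (7).
Let $M$ be the collection of all equivalence classes. Using (7) it is easy to see that the function and relation symbols can be defined on $M$ so that the following conditions hold:
(8) If $F$ is an $m$-ary function symbol, then
$$F^M(\sigma_0/{\equiv}, \ldots, \sigma_{m-1}/{\equiv}) = F(\sigma_0, \ldots, \sigma_{m-1})/{\equiv}.$$
(9) If $R$ is an $m$-ary relation symbol, then
$$\langle \sigma_0/{\equiv}, \ldots, \sigma_{m-1}/{\equiv} \rangle \in R^M \quad\text{iff}\quad R(\sigma_0, \ldots, \sigma_{m-1}) \in \Phi.$$
Now the final claim is as follows:
(10) If $\varphi$ is a sentence of $L'$, then $M \models \varphi$ iff $\varphi \in \Phi$.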
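Clearly this will finish the proof. We prove (10) by induction on $\varphi$. It is clear for atomic sentences by (8) and (9). If it holds for $\varphi$, it clearly holds for $\neg\varphi$. Now suppose that $Q$ is a set of sentences of size less than $\kappa$, and (10) holds for each member of $Q$. Suppose that $M \models \bigwedge Q$. Then $M \models \varphi$ for each $\varphi \in Q$, and so $Q \subseteq \Phi$. Hence there is a $\Delta \in [\kappa]^{<\kappa}$ such that $Q = f[\Delta]$, with $f \in B$. Choose $\beta$ greater than each member of $\Delta$ such that $\chi_\beta$ is the formula $\bigwedge Q$. We may assume that $\beta \in \mathrm{dmn}(f)$. Since $\mathrm{rng}(f)$ has a model, it follows that $f(\beta) = \bigwedge Q$. Hence $\bigwedge Q \in \Phi$.
Conversely, suppose that $\bigwedge Q \in \Phi$. From (7) it easily follows that $\varphi \in \Phi$ for every $\varphi \in Q$, so by the inductive hypothesis $M \models \varphi$ for each $\varphi \in Q$, so $M \models \bigwedge Q$.
Finally, suppose that $\varphi$ is $\exists \bar{x}\, \psi$, where (10) holds for shorter formulas. Suppose that $M \models \exists \bar{x}\, \psi$. Then there are members of $S$ such that when they are substituted in $\psi$ for $\bar{x}$, obtaining a sentence $\psi'$, we have $M \models \psi'$. Hence by the inductive hypothesis, $\psi' \in \Phi$. (7) then yields $\exists \bar{x}\, \psi \in \Phi$.
Conversely, suppose that $\exists \bar{x}\, \psi(\bar{x}) \in \Phi$. Now there is a sequence $d$ of members of $C$ such that $\exists \bar{x}\, \psi(\bar{x}) \to \psi(d)$ is also in $\Phi$, and so by (7) we get $\psi(d) \in \Phi$. By the inductive hypothesis, $M \models \psi(d)$, so $M \models \exists \bar{x}\, \psi(\bar{x})$.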
Next we want to show that every weakly compact cardinal is a Mahlo cardinal. To do this
we need two lemmas.
Lemma 12.9. Let $A$ be a set of infinite cardinals such that for every regular cardinal $\kappa$, the set $A \cap \kappa$ is non-stationary in $\kappa$. Then there is a one-one regressive function with domain $A$.
Proof. We proceed by induction on $\gamma \stackrel{\mathrm{def}}{=} \bigcup A$. Note that $\gamma$ is a cardinal; it is $0$ if $A = \emptyset$. The cases $\gamma = 0$ and $\gamma = \omega$ are trivial, since then $A = \emptyset$ or $A = \{\omega\}$ respectively. Next, suppose that $\gamma$ is a successor cardinal $\kappa^+$. Then $A = A' \cup \{\kappa^+\}$ for some set $A'$ of infinite cardinals less than $\kappa^+$. Then $\bigcup A' < \kappa^+$, so by the inductive hypothesis there is a one-one regressive function $f$ on $A'$. We can extend $f$ to $A$ by setting $f(\kappa^+) = \kappa$, and so we get a one-one regressive function defined on $A$.
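Suppose that $\gamma$ is singular. Let $\langle \mu_\xi : \xi < \mathrm{cf}(\gamma) \rangle$ be a strictly increasing continuous sequence of infinite cardinals with supremum $\gamma$, with $\mathrm{cf}(\gamma) < \mu_0$. Note then that for every cardinal $\lambda < \gamma$, either $\lambda < \mu_0$ or else there is a unique $\xi < \mathrm{cf}(\gamma)$ such that $\mu_\xi \le \lambda < \mu_{\xi+1}$. For every $\xi < \mathrm{cf}(\gamma)$ we can apply the inductive hypothesis to $A \cap \mu_\xi$ to get a one-one regressive function $g_\xi$ with domain $A \cap \mu_\xi$. We now define $f$ with domain $A$. In case $\mathrm{cf}(\gamma) = \omega$ we define, for each $\lambda \in A$,
$$f(\lambda) = \begin{cases} g_0(\lambda) + 2 & \text{if } \lambda < \mu_0,\\ \mu_\xi + g_{\xi+1}(\lambda) + 1 & \text{if } \mu_\xi < \lambda < \mu_{\xi+1},\\ \mu_\xi & \text{if } \lambda = \mu_{\xi+1},\\ 1 & \text{if } \lambda = \mu_0,\\ 0 & \text{if } \lambda = \gamma \in A. \end{cases}$$
Here the addition is ordinal addition. Clearly $f$ is as desired in this case. If $\mathrm{cf}(\gamma) > \omega$, let $\langle \nu_\xi : \xi < \mathrm{cf}(\gamma) \rangle$ be a strictly increasing sequence of limit ordinals with supremum $\mathrm{cf}(\gamma)$. Then we define, for each $\lambda \in A$,
$$f(\lambda) = \begin{cases} g_0(\lambda) + 1 & \text{if } \lambda < \mu_0,\\ \mu_\xi + g_{\xi+1}(\lambda) + 1 & \text{if } \mu_\xi < \lambda < \mu_{\xi+1},\\ \nu_\xi & \text{if } \lambda = \mu_\xi,\\ 0 & \text{if } \lambda = \gamma \in A. \end{cases}$$
Clearly $f$ works in this case too.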
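Finally, suppose that $\gamma$ is a regular limit cardinal. By assumption, there is a club $C$ in $\gamma$ such that $C \cap A = \emptyset$. We may assume that $C \cap \omega = \emptyset$. Let $\langle \mu_\xi : \xi < \gamma \rangle$ be the strictly increasing enumeration of $C$, and for each $\xi < \gamma$ let $g_\xi$ be a one-one regressive function with domain $A \cap \mu_\xi$, given by the inductive hypothesis. Then we define, for each $\lambda \in A$,
$$f(\lambda) = \begin{cases} g_0(\lambda) + 1 & \text{if } \lambda < \mu_0,\\ \mu_\xi + g_{\xi+1}(\lambda) + 1 & \text{if } \mu_\xi < \lambda < \mu_{\xi+1},\\ 0 & \text{if } \lambda = \gamma \in A. \end{cases}$$
Clearly $f$ works in this case too.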
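Lemma 12.10. Suppose that $\kappa$ is weakly compact, and $S$ is a stationary subset of $\kappa$. Then there is a regular $\lambda < \kappa$ such that $S \cap \lambda$ is stationary in $\lambda$.
Proof. Suppose not. Thus for all regular $\lambda < \kappa$, the set $S \cap \lambda$ is non-stationary in $\lambda$. Let $C$ be the collection of all infinite cardinals less than $\kappa$. Clearly $C$ is club in $\kappa$, so $S \cap C$ is stationary in $\kappa$. Clearly still $S \cap C \cap \lambda$ is non-stationary in $\lambda$ for every regular $\lambda < \kappa$. So we may assume from the beginning that $S$ is a set of infinite cardinals. Let $\langle \lambda_\xi : \xi < \kappa \rangle$ be the strictly increasing enumeration of $S$. Let
$$T = \Big\{ s : \exists \xi < \kappa \Big[ s \in \prod_{\eta < \xi} \lambda_\eta \text{ and } s \text{ is one-one} \Big] \Big\}.$$
For every $\xi < \kappa$ the set $S \cap \lambda_\xi$ is non-stationary in every regular cardinal, and hence by Lemma 12.9 there is a one-one regressive function $s$ with domain $S \cap \lambda_\xi$. Now $S \cap \lambda_\xi = \{\lambda_\eta : \eta < \xi\}$. Hence, identifying $\lambda_\eta$ with its index $\eta$, we have $s \in T$.
Clearly $T$ forms a tree of height $\kappa$ under $\subset$. Now for any $\xi < \kappa$,
$$\prod_{\eta < \xi} \lambda_\eta \le \Big( \sup_{\eta < \xi} \lambda_\eta \Big)^{|\xi|} < \kappa.$$
Hence by the tree property there is a branch $B$ in $T$ of size $\kappa$. Thus $\bigcup B$ is (under the same identification) a one-one regressive function with domain $S$, contradicting Fodor's theorem.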
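Theorem 12.11. Every weakly compact cardinal is Mahlo, hyper-Mahlo, hyper-hyper-Mahlo, etc.
Proof. Let $\kappa$ be weakly compact. Let $S = \{\lambda < \kappa : \lambda \text{ is regular}\}$. Suppose that $C$ is club in $\kappa$. Then $C$ is stationary in $\kappa$, so by Lemma 12.10 there is a regular $\lambda < \kappa$ such that $C \cap \lambda$ is stationary in $\lambda$; in particular, $C \cap \lambda$ is unbounded in $\lambda$, so $\lambda \in C$ since $C$ is closed in $\kappa$. Thus we have shown that $S \cap C \ne \emptyset$. So $\kappa$ is Mahlo.
Let $S' = \{\lambda < \kappa : \lambda \text{ is a Mahlo cardinal}\}$. Suppose that $C$ is club in $\kappa$. Let $S'' = \{\lambda < \kappa : \lambda \text{ is regular}\}$. Since $\kappa$ is Mahlo, $S''$ is stationary in $\kappa$. Then $C \cap S''$ is stationary in $\kappa$, so by Lemma 12.10 there is a regular $\lambda < \kappa$ such that $C \cap S'' \cap \lambda$ is stationary in $\lambda$. Hence $\lambda$ is Mahlo, and also $C \cap \lambda$ is unbounded in $\lambda$, so $\lambda \in C$ since $C$ is closed in $\kappa$. Thus we have shown that $S' \cap C \ne \emptyset$. So $\kappa$ is hyper-Mahlo.
Let $S''' = \{\lambda < \kappa : \lambda \text{ is a hyper-Mahlo cardinal}\}$. Suppose that $C$ is club in $\kappa$. Let $S^{iv} = \{\lambda < \kappa : \lambda \text{ is Mahlo}\}$. Since $\kappa$ is hyper-Mahlo, $S^{iv}$ is stationary in $\kappa$. Then $C \cap S^{iv}$ is stationary in $\kappa$, so by Lemma 12.10 there is a regular $\lambda < \kappa$ such that $C \cap S^{iv} \cap \lambda$ is stationary in $\lambda$. Hence $\lambda$ is hyper-Mahlo, and also $C \cap \lambda$ is unbounded in $\lambda$, so $\lambda \in C$ since $C$ is closed in $\kappa$. Thus we have shown that $S''' \cap C \ne \emptyset$. So $\kappa$ is hyper-hyper-Mahlo.
Etc.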
Measurable cardinals
Our third kind of large cardinal is the class of measurable cardinals. Although, as the
name suggests, this notion comes from measure theory, the definition and results we give
are purely set-theoretical. Moreover, similarly to weakly compact cardinals, it is not
obvious from the definition that we are dealing with large cardinals.
The definition is given in terms of the notion of an ultrafilter on a set.
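Let $X$ be a nonempty set. A filter on $X$ is a family $\mathscr{F}$ of subsets of $X$ satisfying the following conditions:
(i) $X \in \mathscr{F}$.
(ii) If $Y, Z \in \mathscr{F}$, then $Y \cap Z \in \mathscr{F}$.
(iii) If $Y \in \mathscr{F}$ and $Y \subseteq Z \subseteq X$, then $Z \in \mathscr{F}$.
A filter $\mathscr{F}$ on a set $X$ is proper or nontrivial iff $\emptyset \notin \mathscr{F}$.
An ultrafilter on a set $X$ is a nontrivial filter $\mathscr{F}$ on $X$ such that for every $Y \subseteq X$, either $Y \in \mathscr{F}$ or $X \setminus Y \in \mathscr{F}$.
A family $\mathscr{A}$ of subsets of $X$ has the finite intersection property, fip, iff for every finite subset $\mathscr{B}$ of $\mathscr{A}$ we have $\bigcap \mathscr{B} \ne \emptyset$.
If $\mathscr{A}$ is a family of subsets of $X$, then the filter generated by $\mathscr{A}$ is the set
$$\Big\{ Y \subseteq X : \bigcap \mathscr{B} \subseteq Y \text{ for some finite } \mathscr{B} \subseteq \mathscr{A} \Big\}.$$
[Clearly this is a filter on $X$, and it contains $\mathscr{A}$.]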
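For a finite set $X$ the filter generated by a family can be computed directly from the definition. The following is a minimal sketch; the function names `powerset`, `has_fip` and `generated_filter` are hypothetical helpers introduced here, and the intersection of the empty subfamily is taken to be $X$ (an assumption matching the usual convention).

```python
from itertools import combinations

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def has_fip(family, X):
    """True iff every nonempty finite subfamily has nonempty intersection."""
    X, fam = frozenset(X), list(family)
    return all(X.intersection(*sub)
               for r in range(1, len(fam) + 1)
               for sub in combinations(fam, r))

def generated_filter(family, X):
    """{Y subset of X : the intersection of some finite subfamily is contained in Y}."""
    X, fam = frozenset(X), list(family)
    meets = {X}                                   # intersection of the empty subfamily
    for r in range(1, len(fam) + 1):
        for sub in combinations(fam, r):
            meets.add(X.intersection(*sub))
    return {Y for Y in powerset(X) if any(m <= Y for m in meets)}

X = {0, 1, 2, 3}
A = [frozenset({0, 1, 2}), frozenset({1, 2, 3})]
F = generated_filter(A, X)
print(sorted(sorted(Y) for Y in F))
```

In this example the generated filter consists of the supersets of $\{1, 2\}$, and it is proper exactly because the family has fip.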


Proposition 12.12. If $x \in X$, then $\{Y \subseteq X : x \in Y\}$ is an ultrafilter on $X$.
An ultrafilter of the kind given in this proposition is called a principal ultrafilter. There
are nonprincipal ultrafilters on any infinite set, as we will see shortly.
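On a small finite set one can enumerate every family of subsets and test the ultrafilter conditions by brute force. The sketch below does this for a three-element set; the helper names are hypothetical, and the exhaustive search is only feasible because there are just $2^8$ families to inspect. It illustrates the standard fact that on a finite set every ultrafilter is principal, which is why nonprincipal ultrafilters only appear on infinite sets.

```python
from itertools import combinations

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def is_filter(F, X):
    """Conditions (i)-(iii): contains X, closed under pairwise meets and supersets."""
    X = frozenset(X)
    return (X in F
            and all(Y & Z in F for Y in F for Z in F)
            and all(Z in F for Y in F for Z in powerset(X) if Y <= Z))

def is_ultrafilter(F, X):
    """Proper filter containing Y or X \\ Y for every subset Y of X."""
    X = frozenset(X)
    return (is_filter(F, X)
            and frozenset() not in F
            and all(Y in F or X - Y in F for Y in powerset(X)))

def principal(x, X):
    """The principal ultrafilter {Y subset of X : x in Y}."""
    return {Y for Y in powerset(X) if x in Y}

X = {0, 1, 2}
families = powerset(powerset(X))          # all 256 families of subsets of X
ultrafilters = [set(fam) for fam in families if is_ultrafilter(set(fam), X)]
print(len(ultrafilters))
```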
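Proposition 12.13. Let $\mathscr{F}$ be a proper filter on a set $X$. Then the following are equivalent:
(i) $\mathscr{F}$ is an ultrafilter.
(ii) $\mathscr{F}$ is maximal in the partially ordered set of all proper filters (under $\subseteq$).
Proof. (i)$\Rightarrow$(ii): Assume (i), and suppose that $\mathscr{G}$ is a filter with $\mathscr{F} \subset \mathscr{G}$. Choose $Y \in \mathscr{G} \setminus \mathscr{F}$. Since $Y \notin \mathscr{F}$, we must have $X \setminus Y \in \mathscr{F} \subseteq \mathscr{G}$. So $Y, X \setminus Y \in \mathscr{G}$, hence $\emptyset = Y \cap (X \setminus Y) \in \mathscr{G}$, and so $\mathscr{G}$ is not proper.
(ii)$\Rightarrow$(i): Assume (ii), and suppose that $Y \subseteq X$, with $Y \notin \mathscr{F}$; we want to show that $X \setminus Y \in \mathscr{F}$. Let
$$\mathscr{G} = \{Z \subseteq X : Y \cap W \subseteq Z \text{ for some } W \in \mathscr{F}\}.$$
Clearly $\mathscr{G}$ is a filter on $X$, and $\mathscr{F} \subseteq \mathscr{G}$. Moreover, $Y \in \mathscr{G} \setminus \mathscr{F}$. It follows that $\mathscr{G}$ is not proper, and so $\emptyset \in \mathscr{G}$. Thus there is a $W \in \mathscr{F}$ such that $Y \cap W = \emptyset$. Hence $W \subseteq X \setminus Y$, and hence $X \setminus Y \in \mathscr{F}$, as desired.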
Theorem 12.14. For any infinite set $X$ there is a nonprincipal ultrafilter on $X$. Moreover, if $\mathscr{A}$ is any collection of subsets of $X$ with fip, then $\mathscr{A}$ can be extended to an ultrafilter.
Proof. First we show that the first assertion follows from the second. Let $\mathscr{A}$ be the collection of all cofinite subsets of $X$ (the subsets whose complements are finite). $\mathscr{A}$ has fip, since if $\mathscr{B}$ is a finite subset of $\mathscr{A}$, then $X \setminus \bigcap \mathscr{B} = \bigcup_{Y \in \mathscr{B}} (X \setminus Y)$ is finite. By the second assertion, $\mathscr{A}$ can be extended to an ultrafilter $\mathscr{F}$. Clearly $\mathscr{F}$ is nonprincipal.
To prove the second assertion, let $\mathscr{A}$ be a collection of subsets of $X$ with fip, and let $\mathscr{C}$ be the collection of all proper filters on $X$ which contain $\mathscr{A}$. Clearly the filter generated by $\mathscr{A}$ is proper, so $\mathscr{C} \ne \emptyset$. We consider $\mathscr{C}$ as a partially ordered set under inclusion. Any nonempty subset $\mathscr{D}$ of $\mathscr{C}$ which is a chain has an upper bound in $\mathscr{C}$, namely $\bigcup \mathscr{D}$, as is easily checked. So by Zorn's lemma $\mathscr{C}$ has a maximal member $\mathscr{F}$. By Proposition 12.13, $\mathscr{F}$ is an ultrafilter.
Let $X$ be an infinite set, and let $\kappa$ be an infinite cardinal. An ultrafilter $\mathscr{F}$ on $X$ is $\kappa$-complete iff for any $\mathscr{A} \in [\mathscr{F}]^{<\kappa}$ we have $\bigcap \mathscr{A} \in \mathscr{F}$. We also say $\sigma$-complete synonymously with $\aleph_1$-complete.
This notion is clearly a generalization of one of the properties of ultrafilters. In fact, every ultrafilter is $\omega$-complete, and every principal ultrafilter is $\kappa$-complete for every infinite cardinal $\kappa$.
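Lemma 12.15. Suppose that $X$ is an infinite set, $\mathscr{F}$ is an ultrafilter on $X$, and $\kappa$ is the least infinite cardinal such that there is an $\mathscr{A} \in [\mathscr{F}]^\kappa$ such that $\bigcap \mathscr{A} \notin \mathscr{F}$. Then there is a partition $\mathscr{P}$ of $X$ such that $|\mathscr{P}| = \kappa$ and $X \setminus Y \in \mathscr{F}$ for all $Y \in \mathscr{P}$.
Proof. Let $\langle Y_\alpha : \alpha < \kappa \rangle$ enumerate $\mathscr{A}$. Let $Z_0 = X \setminus Y_0$, and for $\alpha > 0$ let $Z_\alpha = \big( \bigcap_{\beta < \alpha} Y_\beta \big) \setminus Y_\alpha$. Note that $Y_\alpha \subseteq X \setminus Z_\alpha$, and so $X \setminus Z_\alpha \in \mathscr{F}$. Clearly $Z_\alpha \cap Z_\beta = \emptyset$ for $\alpha \ne \beta$. Let $W = \bigcap_{\alpha < \kappa} Y_\alpha$. Clearly $W \cap Z_\alpha = \emptyset$ for all $\alpha < \kappa$, and $X \setminus W \in \mathscr{F}$ since $W = \bigcap \mathscr{A} \notin \mathscr{F}$. Let
$$\mathscr{P} = (\{Z_\alpha : \alpha < \kappa\} \cup \{W\}) \setminus \{\emptyset\}.$$
So $\mathscr{P}$ is a partition of $X$ and $X \setminus Z \in \mathscr{F}$ for all $Z \in \mathscr{P}$. Clearly $|\mathscr{P}| \le \kappa$. If $|\mathscr{P}| < \kappa$, then
$$\emptyset = \bigcap_{Z \in \mathscr{P}} (X \setminus Z) \in \mathscr{F},$$
contradiction. So $|\mathscr{P}| = \kappa$.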
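Theorem 12.16. Suppose that $\kappa$ is the least infinite cardinal such that there is a nonprincipal $\sigma$-complete ultrafilter $F$ on $\kappa$. Then $F$ is $\kappa$-complete.
Proof. Assume the hypothesis, but suppose that $F$ is not $\kappa$-complete. So there is an $\mathscr{A} \in [F]^{<\kappa}$ such that $\bigcap \mathscr{A} \notin F$. Hence by Lemma 12.15 there is a partition $\mathscr{P}$ of $\kappa$ such that $|\mathscr{P}| < \kappa$ and $\kappa \setminus P \in F$ for every $P \in \mathscr{P}$. Let $\langle P_\alpha : \alpha < \lambda \rangle$ be a one-one enumeration of $\mathscr{P}$, $\lambda$ an infinite cardinal. We are now going to construct a nonprincipal $\sigma$-complete ultrafilter $G$ on $\lambda$, which will contradict the minimality of $\kappa$.
Define $f : \kappa \to \lambda$ by letting $f(\beta)$ be the unique $\alpha < \lambda$ such that $\beta \in P_\alpha$. Then we define
$$G = \{D \subseteq \lambda : f^{-1}[D] \in F\}.$$
We check the desired conditions for $G$. $\emptyset \notin G$, since $f^{-1}[\emptyset] = \emptyset \notin F$. If $D \in G$ and $D \subseteq E$, then $f^{-1}[D] \in F$ and $f^{-1}[D] \subseteq f^{-1}[E]$, so $f^{-1}[E] \in F$ and hence $E \in G$. Similarly, $G$ is closed under $\cap$. Given $D \subseteq \lambda$, either $f^{-1}[D] \in F$ or $f^{-1}[\lambda \setminus D] = \kappa \setminus f^{-1}[D] \in F$, hence $D \in G$ or $\lambda \setminus D \in G$. So $G$ is an ultrafilter on $\lambda$. It is nonprincipal, since for any $\alpha < \lambda$ we have $f^{-1}[\{\alpha\}] = P_\alpha \notin F$ and hence $\{\alpha\} \notin G$. Finally, $G$ is $\sigma$-complete, since if $\mathscr{D}$ is a countable subset of $G$, then
$$f^{-1}\Big[ \bigcap \mathscr{D} \Big] = \bigcap_{P \in \mathscr{D}} f^{-1}[P] \in F,$$
and hence $\bigcap \mathscr{D} \in G$.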
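The passage from $F$ to $G = \{D : f^{-1}[D] \in F\}$ can be traced on finite sets. The sketch below is only an illustration of the formula for $G$: the sets standing in for $\kappa$ and $\lambda$, the map `f`, and the helper names are assumptions made here. Since every ultrafilter on a finite set is principal, the example cannot reproduce the nonprincipality argument; it only shows that the image of the principal ultrafilter at a point $x$ is the principal ultrafilter at $f(x)$.

```python
from itertools import combinations

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def preimage(f, D, domain):
    """f^{-1}[D] computed inside the given domain."""
    return frozenset(x for x in domain if f(x) in D)

def pushforward(F, f, domain, codomain):
    """G = {D subset of codomain : f^{-1}[D] in F}."""
    return {D for D in powerset(codomain) if preimage(f, D, domain) in F}

kappa = range(6)                 # finite stand-in for the underlying set of F
lam = range(3)                   # finite stand-in for the index set of the partition
f = lambda b: b % 3              # f(b) = index of the partition piece containing b
F = {Y for Y in powerset(kappa) if 4 in Y}    # principal ultrafilter at 4
G = pushforward(F, f, kappa, lam)
print(sorted(sorted(D) for D in G))
```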

We say that an uncountable cardinal $\kappa$ is measurable iff there is a $\kappa$-complete nonprincipal ultrafilter on $\kappa$.
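Theorem 12.17. Every measurable cardinal is weakly compact.
Proof. Let $\kappa$ be a measurable cardinal, and let $U$ be a nonprincipal $\kappa$-complete ultrafilter on $\kappa$.
Since $U$ is nonprincipal, $\kappa \setminus \{\alpha\} \in U$ for every $\alpha < \kappa$. Then $\kappa$-completeness implies that $\kappa \setminus F \in U$ for every $F \in [\kappa]^{<\kappa}$.
Now we show that $\kappa$ is regular. For, suppose it is singular. Then we can write $\kappa = \bigcup_{\alpha < \lambda} \Gamma_\alpha$, where $\lambda < \kappa$ and each $\Gamma_\alpha$ has size less than $\kappa$. So by the previous paragraph, $\kappa \setminus \Gamma_\alpha \in U$ for every $\alpha < \lambda$, and hence
$$\emptyset = \bigcap_{\alpha < \lambda} (\kappa \setminus \Gamma_\alpha) \in U,$$
contradiction.
Next, $\kappa$ is strong limit. For, suppose that $\lambda < \kappa$ and $2^\lambda \ge \kappa$. Let $S \in [{}^\lambda 2]^\kappa$. Let $\langle f_\alpha : \alpha < \kappa \rangle$ be a one-one enumeration of $S$. Now for each $\beta < \lambda$, one of the sets $\{\alpha < \kappa : f_\alpha(\beta) = 0\}$ and $\{\alpha < \kappa : f_\alpha(\beta) = 1\}$ is in $U$, so we can let $\varepsilon(\beta) \in 2$ be such that $\{\alpha < \kappa : f_\alpha(\beta) = \varepsilon(\beta)\} \in U$. Then
$$\bigcap_{\beta < \lambda} \{\alpha < \kappa : f_\alpha(\beta) = \varepsilon(\beta)\} \in U;$$
this set clearly has at most one element, contradiction.
Thus we now know that $\kappa$ is inaccessible. Finally, we check the tree property. Let $(T, \prec)$ be a tree of height $\kappa$ such that every level has size less than $\kappa$. Then $|T| = \kappa$, and we may assume that actually $T = \kappa$. Let $B = \{\alpha < \kappa : \{t \in T : \alpha \preceq t\} \in U\}$. Clearly any two elements of $B$ are comparable under $\prec$. Now take any $\alpha < \kappa$; we claim that $\mathrm{Lev}_\alpha(T) \cap B \ne \emptyset$. In fact,
$$(1) \qquad \kappa = \{t \in T : \mathrm{ht}(t, T) < \alpha\} \cup \bigcup_{t \in \mathrm{Lev}_\alpha(T)} \{s \in T : t \preceq s\}.$$
Now by regularity of $\kappa$ we have $|\{t \in T : \mathrm{ht}(t, T) < \alpha\}| < \kappa$, and so the complement of this set is in $U$, and then (1) yields
$$(2) \qquad \bigcup_{t \in \mathrm{Lev}_\alpha(T)} \{s \in T : t \preceq s\} \in U.$$
Now $|\mathrm{Lev}_\alpha(T)| < \kappa$, so from (2) our claim easily follows.
Thus $B$ is a branch of size $\kappa$, as desired.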
EXERCISES
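E12.1. Let $\kappa$ be an uncountable regular cardinal. We define $S < T$ iff $S$ and $T$ are stationary subsets of $\kappa$ and the following two conditions hold:
(1) $\{\alpha \in T : \mathrm{cf}(\alpha) \le \omega\}$ is nonstationary in $\kappa$.
(2) $\{\alpha \in T : S \cap \alpha \text{ is nonstationary in } \alpha\}$ is nonstationary in $\kappa$.
Prove that if $\omega < \lambda < \mu < \kappa$, all these cardinals regular, then $E^\kappa_\lambda < E^\kappa_\mu$, where
$$E^\kappa_\lambda = \{\alpha < \kappa : \mathrm{cf}(\alpha) = \lambda\},$$
and similarly for $E^\kappa_\mu$.
E12.2. Continuing exercise E12.1: Assume that $\kappa$ is uncountable and regular. Show that the relation $<$ is transitive.
E12.3. If $\kappa$ is an uncountable regular cardinal and $S$ is a stationary subset of $\kappa$, we define
$$\mathrm{Tr}(S) = \{\alpha < \kappa : \mathrm{cf}(\alpha) > \omega \text{ and } S \cap \alpha \text{ is stationary in } \alpha\}.$$
Suppose that $A, B$ are stationary subsets of an uncountable regular cardinal $\kappa$ and $A < B$. Show that $\mathrm{Tr}(A)$ is stationary.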
E12.4. (Real-valued measurable cardinals) We describe a special kind of measure. A measure on a set $S$ is a function $\mu : \mathscr{P}(S) \to [0, \infty)$ satisfying the following conditions:
(1) $\mu(\emptyset) = 0$ and $\mu(S) = 1$.
(2) $\mu(\{s\}) = 0$ for all $s \in S$.
(3) If $\langle X_i : i \in \omega \rangle$ is a system of pairwise disjoint subsets of $S$, then $\mu\big( \bigcup_{i \in \omega} X_i \big) = \sum_{i \in \omega} \mu(X_i)$. (The $X_i$'s are not necessarily nonempty.)
Let $\kappa$ be an infinite cardinal. Then $\mu$ is $\kappa$-additive iff for every system $\langle X_\alpha : \alpha < \gamma \rangle$ of nonempty pairwise disjoint sets, with $\gamma < \kappa$, we have
$$\mu\Big( \bigcup_{\alpha < \gamma} X_\alpha \Big) = \sum_{\alpha < \gamma} \mu(X_\alpha).$$
Here this sum (where the index set $\gamma$ might be uncountable) is understood to be
$$\sup_{F \subseteq \gamma,\ F \text{ finite}}\ \sum_{\alpha \in F} \mu(X_\alpha).$$
We say that an uncountable cardinal $\kappa$ is real-valued measurable iff there is a $\kappa$-additive measure on $\kappa$. Show that every measurable cardinal is real-valued measurable. Hint: let $\mu$ take on only the values 0 and 1.
E12.5. Suppose that $\mu$ is a measure on a set $S$. A subset $A$ of $S$ is a $\mu$-atom iff $\mu(A) > 0$ and for every $X \subseteq A$, either $\mu(X) = 0$ or $\mu(X) = \mu(A)$.
Show that if $\kappa$ is a real-valued measurable cardinal, $\mu$ is a $\kappa$-additive measure on $\kappa$, and $A$ is a $\mu$-atom, then $\{X \subseteq A : \mu(X) = \mu(A)\}$ is a $\kappa$-complete nonprincipal ultrafilter on $A$. Conclude that $\kappa$ is a measurable cardinal if there exist such $\mu$ and $A$.
E12.6. Prove that if $\kappa$ is real-valued measurable then either $\kappa$ is measurable or $\kappa \le 2^\omega$. Hint: if there do not exist any $\mu$-atoms, construct a binary tree of height at most $\omega_1$.
E12.7. Let $\kappa$ be a regular uncountable cardinal. Show that the diagonal intersection of the system $\langle (\alpha + 1, \kappa) : \alpha < \kappa \rangle$ is the set of all limit ordinals less than $\kappa$.
E12.8. Let $F$ be a filter on a regular uncountable cardinal $\kappa$. We say that $F$ is normal iff it is closed under diagonal intersections. Suppose that $F$ is normal, and $(\alpha, \kappa) \in F$ for every $\alpha < \kappa$. Show that every club of $\kappa$ is in $F$. Hint: use exercise E12.7.
E12.9. Let $F$ be a proper filter on a regular uncountable cardinal $\kappa$. Show that the following conditions are equivalent.
(i) $F$ is normal.
(ii) For any $S_0 \subseteq \kappa$, if $\kappa \setminus S_0 \notin F$ and $f$ is a regressive function defined on $S_0$, then there is an $S \subseteq S_0$ with $\kappa \setminus S \notin F$ and $f$ is constant on $S$.
E12.10. A probability measure on a set $S$ is a real-valued function $\mu$ with domain $\mathscr{P}(S)$ having the following properties:
(i) $\mu(\emptyset) = 0$ and $\mu(S) = 1$.
(ii) If $X \subseteq Y$, then $\mu(X) \le \mu(Y)$.
(iii) $\mu(\{a\}) = 0$ for all $a \in S$.
(iv) If $\langle X_n : n \in \omega \rangle$ is a system of pairwise disjoint sets, then $\mu\big( \bigcup_{n \in \omega} X_n \big) = \sum_{n \in \omega} \mu(X_n)$. (Some of the sets $X_n$ might be empty.)
Prove that there does not exist a probability measure on $\omega_1$. Hint: consider an Ulam matrix.
E12.11. Show that if $\kappa$ is a measurable cardinal, then there is a normal $\kappa$-complete nonprincipal ultrafilter on $\kappa$. Hint: Let $D$ be a $\kappa$-complete nonprincipal ultrafilter on $\kappa$. Define $f \equiv g$ iff $f, g \in {}^\kappa\kappa$ and $\{\alpha < \kappa : f(\alpha) = g(\alpha)\} \in D$. Show that $\equiv$ is an equivalence relation on ${}^\kappa\kappa$. Show that there is a relation $\prec$ on the collection of all $\equiv$-classes such that for all $f, g \in {}^\kappa\kappa$, $[f] \prec [g]$ iff $\{\alpha < \kappa : f(\alpha) < g(\alpha)\} \in D$. Here for any function $h \in {}^\kappa\kappa$ we use $[h]$ for the equivalence class of $h$ under $\equiv$. Show that $\prec$ makes the collection of all equivalence classes into a well-order. Show that there is a $\prec$-smallest equivalence class $x$ such that $\forall f \in x\, \forall \gamma < \kappa\, [\{\alpha < \kappa : \gamma < f(\alpha)\} \in D]$. Fix $f \in x$ and let $E = \{X \subseteq \kappa : f^{-1}[X] \in D\}$. Show that $E$ satisfies the requirements of the exercise.
Reference
Kanamori, A. The higher infinite. Springer 2005, 536pp.

13. Well-founded relations


Here we introduce the usual hierarchy of sets, give a final generalized recursion theorem, and prove Mostowski's collapsing theorem.
The hierarchy of sets
The hierarchy of sets is defined recursively as follows:
$$V_0 = \emptyset;$$
$$V_{\alpha+1} = \mathscr{P}(V_\alpha);$$
$$V_\gamma = \bigcup_{\alpha < \gamma} V_\alpha \quad \text{for } \gamma \text{ limit}.$$

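Theorem 13.1. For every ordinal $\alpha$ the following hold:
(i) $V_\alpha$ is transitive.
(ii) $V_\beta \subseteq V_\alpha$ for all $\beta < \alpha$.
Proof. We prove these statements simultaneously by induction on $\alpha$. They are clear for $\alpha = 0$. Assume that both statements hold for $\alpha$; we prove them for $\alpha + 1$. First we prove
(1) $V_\alpha \subseteq V_{\alpha+1}$.
In fact, suppose that $x \in V_\alpha$. By (i) for $\alpha$, the set $V_\alpha$ is transitive. Hence $x \subseteq V_\alpha$, so $x \in \mathscr{P}(V_\alpha) = V_{\alpha+1}$. So (1) holds.
Now (ii) follows. For, suppose that $\beta < \alpha + 1$. Then $\beta \le \alpha$, so $V_\beta \subseteq V_\alpha$ by (ii) for $\alpha$ (or trivially if $\beta = \alpha$). Hence by (1), $V_\beta \subseteq V_{\alpha+1}$.
To prove (i) for $\alpha + 1$, suppose that $x \in y \in V_{\alpha+1}$. Then $y \in \mathscr{P}(V_\alpha)$, so $y \subseteq V_\alpha$, hence $x \in V_\alpha$. By (1), $x \in V_{\alpha+1}$, as desired.
For the final inductive step, suppose that $\gamma$ is a limit ordinal and (i) and (ii) hold for all $\alpha < \gamma$. To prove (i) for $\gamma$, suppose that $x \in y \in V_\gamma$. Then by definition of $V_\gamma$, there is an $\alpha < \gamma$ such that $y \in V_\alpha$. By (i) for $\alpha$ we get $x \in V_\alpha$. So $x \in V_\gamma$ by the definition of $V_\gamma$.
Condition (ii) for $\gamma$ is obvious.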
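The finite levels of the hierarchy can be computed explicitly, which gives a quick sanity check of both parts of the theorem for small $n$. The following is a minimal sketch, assuming sets are modelled as Python `frozenset`s; the names `powerset`, `V` and `is_transitive` are introduced here for illustration. Only very small $n$ are feasible, since the sizes grow as iterated exponentials.

```python
from itertools import combinations

def powerset(s):
    """The power set of a finite set, as a frozenset of frozensets."""
    items = list(s)
    return frozenset(frozenset(c)
                     for r in range(len(items) + 1)
                     for c in combinations(items, r))

def V(n):
    """The n-th level of the hierarchy, for a natural number n."""
    level = frozenset()              # V_0 is empty
    for _ in range(n):
        level = powerset(level)      # V_{k+1} = P(V_k)
    return level

def is_transitive(s):
    """Every element of s is a subset of s."""
    return all(x <= s for x in s)

sizes = [len(V(n)) for n in range(5)]
print(sizes)
```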
A very important fact about this hierarchy is that every set is a member of some $V_\alpha$. To prove this, we need to introduce the notion of transitive closure, which is itself important.
Theorem 13.2. For any set $a$ there is a transitive set $b$ with the following properties:
(i) $a \subseteq b$.
(ii) For every transitive set $c$ such that $a \subseteq c$ we have $b \subseteq c$.
Proof. By recursion we define $d_0 = a$ and $d_{m+1} = d_m \cup \bigcup d_m$ for every $m \in \omega$. Let $b = \bigcup_{m \in \omega} d_m$. Then $a = d_0 \subseteq b$. Suppose that $x \in y \in b$. Choose $m \in \omega$ such that $y \in d_m$. Then $x \in \bigcup d_m \subseteq d_{m+1} \subseteq b$. Thus $b$ is transitive. Now suppose that $c$ is a transitive set such that $a \subseteq c$. We show by induction that $d_m \subseteq c$ for every $m \in \omega$. First, $d_0 = a \subseteq c$, so this is true for $m = 0$. Now assume that it is true for $m$. Then $d_{m+1} = d_m \cup \bigcup d_m \subseteq c \cup \bigcup c = c$, completing the inductive proof. Hence $b = \bigcup_{m \in \omega} d_m \subseteq c$.
The set shown to exist in Theorem 13.2 is called the transitive closure of $a$, and is denoted by $\mathrm{trcl}(a)$.
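For hereditarily finite sets the recursion $d_0 = a$, $d_{m+1} = d_m \cup \bigcup d_m$ stabilizes after finitely many steps, so the transitive closure can be computed directly. The sketch below assumes sets are modelled as nested `frozenset`s; the function name `trcl` follows the notation of the text, while the sample sets are chosen here for illustration.

```python
def trcl(a):
    """Transitive closure of a hereditarily finite set given as nested frozensets."""
    d = frozenset(a)                                   # d_0 = a
    while True:
        nxt = d | frozenset(y for x in d for y in x)   # d_{m+1} = d_m together with the union of d_m
        if nxt == d:                                   # the sequence has stabilized
            return d
        d = nxt

zero = frozenset()
one = frozenset({zero})
a = frozenset({frozenset({one})})      # the set {{1}}
b = trcl(a)
print(len(b))
```

Here `b` contains $\{1\}$, $1$ and $0$: the element of $a$, then its element, then the element of that.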
Theorem 13.3. Every set is a member of some $V_\alpha$.
Proof. Suppose that this is not true, and let $a$ be a set which is not a member of any $V_\alpha$. Let $A = \{x \in \mathrm{trcl}(a \cup \{a\}) : x \text{ is not in any of the sets } V_\alpha\}$. Then $a \in A$, so $A$ is nonempty. By the foundation axiom, choose $x \in A$ such that $x \cap A = \emptyset$. Suppose that $y \in x$. Then $y \in \mathrm{trcl}(a \cup \{a\})$, so $y$ is a member of some $V_\alpha$. Let $\alpha_y$ be the least such $\alpha$. Let $\beta = \bigcup_{y \in x} \alpha_y$. Then by 13.1(ii), $x \subseteq V_\beta$. So $x \in V_{\beta+1}$, contradiction.
Thus by Theorem 13.3 we have $V = \bigcup_{\alpha \in \mathrm{On}} V_\alpha$. An important technical consequence of Theorem 13.3 is the following definition, known as Scott's trick:
Let $R$ be a class equivalence relation on a class $A$. For each $a \in A$, let $\alpha$ be the smallest ordinal such that there is a $b \in V_\alpha$ with $(a, b) \in R$, and define
$$\mathrm{type}_R(a) = \{b \in V_\alpha : (a, b) \in R\}.$$
This is the "reduced" equivalence class of $a$. It could be that the collection of $b$ such that $(a, b) \in R$ is a proper class, but $\mathrm{type}_R(a)$ is always a set. An important use of this trick is in defining order types, where we modify this procedure slightly. Define the class $R$ to be the collection of all ordered pairs $(L, M)$ such that $L$ and $M$ are order-isomorphic linear orders. For any linear order $L$, we define its order type to be
$$\mathrm{o.t.}(L) = \begin{cases} \alpha & \text{if } L \text{ is a well-order which is order-isomorphic to the ordinal } \alpha,\\ \mathrm{type}_R(L) & \text{if } L \text{ is not well-ordered.} \end{cases}$$
On the basis of this definition one can introduce the usual terminology for order types, e.g., $\eta$ for the order type of the rationals, $\omega^*$ for the order type of $(\omega, >)$, etc.
On the basis of our hierarchy we can define the important notion of rank of sets:
For any set $x$, the rank of $x$, denoted by $\mathrm{rank}(x)$, is the smallest ordinal $\alpha$ such that $x \in V_{\alpha+1}$.
We take $\alpha + 1$ here instead of just $\alpha$ for technical reasons. Some of the most important properties of ranks are given in the following theorem.
Theorem 13.4. Let $x$ be a set and $\alpha$ an ordinal. Then
(i) $V_\alpha = \{y : \mathrm{rank}(y) < \alpha\}$.
(ii) For all $y \in x$ we have $\mathrm{rank}(y) < \mathrm{rank}(x)$.
(iii) $\mathrm{rank}(y) \le \mathrm{rank}(x)$ for every $y \subseteq x$.
(iv) $\mathrm{rank}(x) = \sup_{y \in x} (\mathrm{rank}(y) + 1)$.
(v) $\mathrm{rank}(\alpha) = \alpha$.
(vi) $V_\alpha \cap \mathrm{On} = \alpha$.
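Proof. (i): Suppose that $y \in V_\alpha$. Then $\alpha \ne 0$. If $\alpha$ is a successor ordinal $\beta + 1$, then $\mathrm{rank}(y) \le \beta < \alpha$. If $\alpha$ is a limit ordinal, then $y \in V_\beta$ for some $\beta < \alpha$, hence $y \in V_{\beta+1}$ also, so $\mathrm{rank}(y) \le \beta < \alpha$. This proves $\subseteq$.
For $\supseteq$, suppose that $\beta \stackrel{\mathrm{def}}{=} \mathrm{rank}(y) < \alpha$. Then $y \in V_{\beta+1} \subseteq V_\alpha$, as desired.
(ii): Suppose that $x \in y$. Let $\mathrm{rank}(y) = \alpha$. Thus $y \in V_{\alpha+1} = \mathscr{P}(V_\alpha)$, so $y \subseteq V_\alpha$ and hence $x \in V_\alpha$. Then by (i), $\mathrm{rank}(x) < \alpha$.
(iii): Let $\mathrm{rank}(x) = \alpha$. Then $x \in V_{\alpha+1}$, so $x \subseteq V_\alpha$. Let $y \subseteq x$. Then $y \subseteq V_\alpha$, and so $y \in V_{\alpha+1}$. Thus $\mathrm{rank}(y) \le \alpha$.
(iv): Let $\alpha$ be the indicated sup. Then $\ge$ holds by (ii). Now if $y \in x$, then $\mathrm{rank}(y) < \alpha$, and hence $y \in V_{\mathrm{rank}(y)+1} \subseteq V_\alpha$. This shows that $x \subseteq V_\alpha$, hence $x \in V_{\alpha+1}$, hence $\mathrm{rank}(x) \le \alpha$, finishing the proof of (iv).
(v): We prove this by transfinite induction. Suppose that it is true for all $\beta < \alpha$. Then by (iv),
$$\mathrm{rank}(\alpha) = \sup_{\beta < \alpha} (\mathrm{rank}(\beta) + 1) = \sup_{\beta < \alpha} (\beta + 1) = \alpha.$$
Finally, for (vi), using (i) and (v),
$$V_\alpha \cap \mathrm{On} = \{\beta \in \mathrm{On} : \beta \in V_\alpha\} = \{\beta \in \mathrm{On} : \mathrm{rank}(\beta) < \alpha\} = \{\beta \in \mathrm{On} : \beta < \alpha\} = \alpha.$$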
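For hereditarily finite sets, part (iv) of the theorem gives a direct recursive way to compute ranks, and parts (i), (v) and (vi) can then be checked on the first few levels. The following is a sketch under the assumption that sets are nested `frozenset`s and ordinals are finite von Neumann ordinals; the helper names `V`, `rank` and `ordinal` are introduced here for illustration.

```python
from itertools import combinations

def powerset(s):
    """The power set of a finite set, as a frozenset of frozensets."""
    items = list(s)
    return frozenset(frozenset(c)
                     for r in range(len(items) + 1)
                     for c in combinations(items, r))

def V(n):
    """The n-th level of the hierarchy, for a natural number n."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

def rank(x):
    """rank(x) = sup over y in x of rank(y) + 1, which is 0 for the empty set."""
    return max((rank(y) + 1 for y in x), default=0)

def ordinal(n):
    """The von Neumann ordinal n = {0, 1, ..., n-1}."""
    o = frozenset()
    for _ in range(n):
        o = o | {o}                  # successor: o together with {o}
    return o

print([rank(ordinal(n)) for n in range(6)])
```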
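We now define a sequence of cardinals by recursion:
$$\beth_0 = \omega;$$
$$\beth_{\alpha+1} = 2^{\beth_\alpha};$$
$$\beth_\gamma = \bigcup_{\alpha < \gamma} \beth_\alpha \quad \text{for } \gamma \text{ limit}.$$
Thus under GCH, $\aleph_\alpha = \beth_\alpha$ for every ordinal $\alpha$; in fact, this is just a reformulation of GCH.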
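Theorem 13.5. (i) $n \le |V_n| \in \omega$ for any $n \in \omega$.
(ii) For any ordinal $\alpha$, $|V_{\omega+\alpha}| = \beth_\alpha$.
Proof. (i) is clear by ordinary induction on $n$. We prove (ii) by the three-step transfinite induction (where $\gamma$ is a limit ordinal below):
$$|V_\omega| = \Big| \bigcup_{n \in \omega} V_n \Big| = \omega = \beth_0 \quad \text{by (i)};$$
$$|V_{\omega+\alpha+1}| = |\mathscr{P}(V_{\omega+\alpha})| = 2^{|V_{\omega+\alpha}|} = 2^{\beth_\alpha} \ \text{(inductive hypothesis)} = \beth_{\alpha+1};$$
$$|V_{\omega+\gamma}| = \Big| \bigcup_{\alpha < \gamma} V_{\omega+\alpha} \Big| \le \sum_{\alpha < \gamma} |V_{\omega+\alpha}| = \sum_{\alpha < \gamma} \beth_\alpha \ \text{(inductive hypothesis)} \le \sum_{\alpha < \gamma} \beth_\gamma = |\gamma| \cdot \beth_\gamma = \beth_\gamma \quad \text{by a normal function theorem};$$
to finish this last inductive step, note that for each $\alpha < \gamma$ we have $\beth_\alpha = |V_{\omega+\alpha}| \le |V_{\omega+\gamma}|$, and hence $\beth_\gamma \le |V_{\omega+\gamma}|$.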
Well-founded class relations
We now introduce a kind of generalization of set membership.
If $\mathbf{A}$ is a class, a class relation $\mathbf{R}$ is well-founded on $\mathbf{A}$ iff for every nonempty subset $X$ of $\mathbf{A}$ there is an $x \in X$ such that for all $y \in X$ it is not the case that $(y, x) \in \mathbf{R}$. Such a set $x$ is called $\mathbf{R}$-minimal.
This notion is important even if $\mathbf{A}$ and $\mathbf{R}$ are mere sets. The prototypical example of a well-founded relation is $\in$ itself; it is well-founded on $V$ as one sees by the foundation axiom.
Recall that our intuitive notion of class is made rigorous by using formulas instead. Thus we would talk about a formula $\varphi(x, y)$ being well-founded on another formula $\psi(x)$. In the case of $\in$, we are really looking at the formula $x \in y$ being well-founded on the formula $x = x$. So, rigorously we are associating with two formulas $\varphi(x, y)$ and $\psi(x)$ another formula "$\varphi(x, y)$ is well-founded on $\psi(x)$", namely the following formula:
$$\forall X[\forall x \in X\, \psi(x) \wedge X \ne \emptyset \to \exists x \in X\, \forall y \in X\, \neg\varphi(y, x)].$$
We are going to formulate and prove a recursion principle for well-founded relations. But
first we need some technical preparation.
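A class relation $R$ is set-like on a class $A$ iff for every $a \in A$ the class $\{b \in A : (b, a) \in R\}$ is a set.
Again we give a rigorous version. Given two formulas $\varphi(x, y)$ and $\psi(x)$, the formula "$\varphi(x, y)$ is set-like on $\psi(x)$" is the following formula:
$$\forall x[\psi(x) \to \exists z\,\forall y(y \in z \leftrightarrow \psi(y) \wedge \varphi(y, x))].$$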
Let the class relation $R$ be set-like on the class $A$, and suppose that $x \in A$. We define $\mathrm{pred}_n(A, x, R)$ for every natural number $n$ by recursion.
$$\mathrm{pred}_0(A, x, R) = \{y \in A : (y, x) \in R\};$$
$$\mathrm{pred}_{n+1}(A, x, R) = \bigcup\{\mathrm{pred}_0(A, y, R) : y \in \mathrm{pred}_n(A, x, R)\}.$$
Then we define
$$\mathrm{cl}(A, x, R) = \bigcup_{n\in\omega} \mathrm{pred}_n(A, x, R).$$
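For a finite set A the sets pred_n and cl can be computed directly. The sketch below assumes that A is a finite Python set and R a set of pairs; the function names pred0 and cl mirror the notation above but are otherwise our own.

```python
def pred0(A, x, R):
    """pred_0(A, x, R): the elements y of A with (y, x) in R."""
    return {y for y in A if (y, x) in R}

def cl(A, x, R):
    """cl(A, x, R): the union of pred_n(A, x, R) over all n.

    The loop stops as soon as a layer adds nothing new, which for finite A
    happens after finitely many steps.
    """
    closure = set()
    layer = pred0(A, x, R)
    while not layer <= closure:
        closure |= layer
        # pred_{n+1} is the union of pred_0(A, y, R) for y in pred_n.
        layer = set().union(*(pred0(A, y, R) for y in layer))
    return closure

A = {0, 1, 2, 3, 4}
R = {(0, 1), (1, 2), (2, 3), (0, 4)}
print(pred0(A, 3, R), cl(A, 3, R))
```

Here pred_0 of 3 contains only the immediate predecessor 2, while cl collects everything below 3 along chains of R.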

Proposition 13.6. If $R$ is well-founded and set-like on $A$, $x \in A$, and $y$ is an element of $\mathrm{cl}(A, x, R)$, then $\mathrm{pred}_0(A, y, R) \subseteq \mathrm{cl}(A, x, R)$.
Proof. Choose $n$ such that $y \in \mathrm{pred}_n(A, x, R)$. Clearly then $\mathrm{pred}_0(A, y, R) \subseteq \mathrm{pred}_{n+1}(A, x, R) \subseteq \mathrm{cl}(A, x, R)$.
Proposition 13.7. If $R$ is well-founded and set-like on $A$, then every nonempty subclass of $A$ has an R-minimal element.
Proof. Suppose that $X$ is a nonempty subclass of $A$ and $x \in X$, but $x$ is not an R-minimal element of $X$. So there is a $y \in X$ such that $(y, x) \in R$, and so $y \in \mathrm{pred}_0(A, x, R) \cap X \subseteq \mathrm{cl}(A, x, R) \cap X$. Since thus $\mathrm{cl}(A, x, R) \cap X$ is a nonempty subset of $A$, we can take an R-minimal element $y$ of it. Suppose that $z \in A$ and $(z, y) \in R$. Then by Proposition 13.6, $z \in \mathrm{pred}_0(A, y, R) \subseteq \mathrm{cl}(A, x, R)$, and so $z \notin X$ by the minimality of $y$. Hence $y$ is an R-minimal element of $X$.
Now we are ready for the general recursion theorem.
Theorem 13.8. Suppose that $R$ is a class relation which is well-founded and set-like on $A$. Also suppose that $F : A \times V \to V$. Then there is a unique $G : A \to V$ such that for every $x \in A$,
$$G(x) = F(x, G \upharpoonright \mathrm{pred}(A, x, R)).$$
Before beginning the proof of this theorem, we make some comments about it. The fact that $R$ is set-like on $A$ implies that $\mathrm{pred}(A, x, R)$ is a set, and hence so is $G \upharpoonright \mathrm{pred}(A, x, R)$, using the replacement axiom. Recalling that classes are really a shorthand for formulas, we can formulate the theorem more rigorously, first introducing the following definitions.
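"$\chi(x, y, z)$ is a function from $\psi(x) \times V$ into $V$" abbreviates
$$\forall x, y, z, w[\chi(x, y, z) \wedge \chi(x, y, w) \to z = w] \wedge \forall x, y[\exists z\,\chi(x, y, z) \leftrightarrow \psi(x)].$$
"$\theta(x, y)$ is a function from $\psi(x)$ into $V$" abbreviates
$$\forall x, y, z[\theta(x, y) \wedge \theta(x, z) \to y = z] \wedge \forall x[\exists y\,\theta(x, y) \leftrightarrow \psi(x)].$$
"$a$ is the set of predecessors of $x$ under $\varphi(x, y)$" abbreviates
$$\forall y[y \in a \leftrightarrow \varphi(y, x)].$$
"$f$ is the restriction of $\theta(x, y)$ to the set of predecessors of $x$ under $\varphi(x, y)$" abbreviates
$$\forall z[z \in f \leftrightarrow \exists u, y, a[a \text{ is the set of predecessors of } x \text{ under } \varphi(x, y) \wedge z = (u, y) \wedge u \in a \wedge \theta(u, y)]].$$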

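Now the rigorous version of the theorem is as follows:
Suppose that $\varphi(x, y)$, $\psi(x)$, $\chi(x, y, z)$, $\theta(x, y)$ are formulas. Then there is a formula $\xi(x, y)$ (given explicitly in the proof below) such that
ZFC $\vdash$ [$\varphi(x, y)$ is well-founded on $\psi(x)$ $\wedge$ $\varphi(x, y)$ is set-like on $\psi(x)$ $\wedge$ $\chi(x, y, z)$ is a function from $\psi(x) \times V$ into $V$]
$\to$ $\xi(x, y)$ is a function from $\psi(x)$ into $V$ $\wedge$ $\forall x\,\forall f, z$[$f$ is the restriction of $\xi(x, y)$ to the set of predecessors of $x$ under $\varphi(x, y)$ $\to$ ($\xi(x, z) \leftrightarrow \chi(x, f, z)$)]
$\wedge$ {[$\theta(x, y)$ is a function from $\psi(x)$ into $V$ $\wedge$ $\forall x\,\forall f, z$[$f$ is the restriction of $\theta(x, y)$ to the set of predecessors of $x$ under $\varphi(x, y)$ $\to$ ($\theta(x, z) \leftrightarrow \chi(x, f, z)$)]]
$\to$ $\forall x, z[\theta(x, z) \leftrightarrow \xi(x, z)]$}.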
Proof of Theorem 13.8. This proof is very similar to that of Theorem 2.12. Consider the following condition:
(*) $f$ is a function with domain $d \subseteq A$, $\forall x \in d\,(\mathrm{pred}(A, x, R) \subseteq d)$, and for every $x \in d$ we have $f(x) = F(x, f \upharpoonright \mathrm{pred}(A, x, R))$.
First we show
(1) If $f, d$ and $g, e$ satisfy (*), then $f \upharpoonright (d \cap e) = g \upharpoonright (d \cap e)$.
To prove (1) suppose that $x \in d \cap e$ and $f(x) \ne g(x)$. Thus the set $X \overset{\mathrm{def}}{=} \{y \in d \cap e : f(y) \ne g(y)\}$ is a nonempty subset of $A$, so by well-foundedness we can take an R-minimal element $z$ of $X$. Then $f(w) = g(w)$ for all $w \in d \cap e$ such that $wRz$. Now clearly $\mathrm{pred}(A, z, R) \subseteq d \cap e$, so $f \upharpoonright \mathrm{pred}(A, z, R) = g \upharpoonright \mathrm{pred}(A, z, R)$. Hence
$$f(z) = F(z, f \upharpoonright \mathrm{pred}(A, z, R)) = F(z, g \upharpoonright \mathrm{pred}(A, z, R)) = g(z),$$
contradiction.
(2) For every $x \in A$, let $d(x) = \{x\} \cup \mathrm{cl}(A, x, R)$. Then for every $x \in A$ there is an $f$ such that $f$ and $d(x)$ satisfy (*).
Suppose that (2) is not true, and let $x \in A$ be R-minimal such that there is no such $f$. For each $y \in A$ such that $(y, x) \in R$ there is a $g$ so that $g$ and $d(y)$ satisfy (*). By (1) this $g$ is unique, and so by the replacement axiom we can associate with each such $y$ the corresponding function $g_y$. Let
$$h = \bigcup_{(y,x) \in R} g_y \quad \text{and} \quad f = h \cup \{(x, F(x, h \upharpoonright \mathrm{pred}(A, x, R)))\}.$$
It is straightforward to check that $f$ and $d(x)$ satisfy (*), contradiction. So (2) holds.
Now for any $x \in A$, let $G(x) = f(x)$, where $f$ and $d(x)$ satisfy (*). This definition is unambiguous by (1) and (2). It is easy to see that $G$ is what is needed in the theorem. The uniqueness of $G$ follows by an easy argument by contradiction.
As a first application of this theorem we can define rank for well-founded relations.
If $R$ is well-founded and set-like on $A$, then for any $x \in A$,
$$\mathrm{rank}(x, A, R) = \sup\{\mathrm{rank}(y, A, R) + 1 : y \in A \text{ and } (y, x) \in R\}.$$
By induction, this is always an ordinal.
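On a finite set the recursion theorem amounts to memoized recursion along R. The sketch below assumes that A is a finite set and that R is well-founded on it (otherwise the recursion does not terminate); recursion and rank_step are illustrative names of our own, and rank is recovered by choosing F to be "sup of the values plus one".

```python
def recursion(A, R, F):
    """Return G as a dict with G[x] = F(x, G restricted to pred(A, x, R)).

    Assumes A is finite and R is well-founded on A.
    """
    G = {}

    def value(x):
        if x not in G:
            # The restriction of G to the R-predecessors of x, as a dict.
            restricted = {y: value(y) for y in A if (y, x) in R}
            G[x] = F(x, restricted)
        return G[x]

    for x in A:
        value(x)
    return G

def rank_step(x, g):
    """F(x, g) = sup{g(y) + 1 : y in dom(g)}, with sup of the empty set equal to 0."""
    return max((v + 1 for v in g.values()), default=0)

A = {"a", "b", "c", "d"}
R = {("a", "b"), ("b", "c"), ("a", "c")}
rank = recursion(A, R, rank_step)
print(rank)
```

The element d has no predecessors and so gets rank 0, just like the R-minimal element a.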
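Proposition 13.9. If $A$ is a transitive class, then for any $x \in A$, $\mathrm{rank}(x, A, \in) = \mathrm{rank}(x)$. In particular, $\mathrm{rank}(x, V, \in) = \mathrm{rank}(x)$ for any set $x$.
Proof. Otherwise, let $x$ be $\in$-minimal such that $x \in A$ and $\mathrm{rank}(x, A, \in) \ne \mathrm{rank}(x)$. Then
$$\mathrm{rank}(x, A, \in) = \sup\{\mathrm{rank}(y, A, \in) + 1 : y \in A \text{ and } y \in x\}$$
$$= \sup\{\mathrm{rank}(y) + 1 : y \in A \text{ and } y \in x\} \quad (\in\text{-minimality of } x)$$
$$= \sup\{\mathrm{rank}(y) + 1 : y \in x\} \quad (\text{transitivity of } A)$$
$$= \mathrm{rank}(x) \quad (\text{by Theorem 13.4(iii)}).$$
This is a contradiction.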
The Mostowski collapse
We give here an important technical result about well-founded relations.
Suppose that $R$ is well-founded and set-like on $A$. The Mostowski collapsing function is a class function $G : A \to V$ defined by recursion as follows: for any $x \in A$,
$$G(x) = \{G(y) : y \in A \text{ and } (y, x) \in R\}.$$
The Mostowski collapse of $A, R$ is defined as the range of this function $G$.
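For a finite well-founded relation the collapsing function can be computed literally from its defining equation. This is a sketch under the assumption that A is finite and R is well-founded on A; sets are modelled as frozensets and the name mostowski is ours.

```python
def mostowski(A, R):
    """Return the collapsing function G as a dict: G[x] = {G[y] : (y, x) in R}."""
    G = {}

    def collapse(x):
        if x not in G:
            G[x] = frozenset(collapse(y) for y in A if (y, x) in R)
        return G[x]

    for x in A:
        collapse(x)
    return G

A = {"a", "b", "c"}
R = {("a", "b"), ("a", "c"), ("b", "c")}   # a three-element linear order
G = mostowski(A, R)
M = set(G.values())                         # the Mostowski collapse

empty = frozenset()
print(G["c"] == frozenset({empty, frozenset({empty})}))
```

In this example the three-element linear order collapses onto the ordinal 3 = {0, 1, 2}, and the collapse M is a transitive set.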
Proposition 13.10. Suppose that $R$ is well-founded and set-like on $A$, $G$ is the Mostowski collapsing function for $A, R$, and $M$ is the Mostowski collapse. Then
(i) For all $x, y \in A$, if $(x, y) \in R$ then $G(x) \in G(y)$.
(ii) $M$ is transitive.
(iii) For any $x \in A$ we have $\mathrm{rank}(x, A, R) = \mathrm{rank}(G(x))$.
Proof. (i) is obvious from the definition. If $a \in b \in M$, choose $y \in A$ such that $b = G(y)$. Since $a \in b$, by the definition we have $a \in \mathrm{rng}(G) = M$. So (ii) holds. For (iii), suppose it is not true, and let $x$ be R-minimal such that $x \in A$ and $\mathrm{rank}(x, A, R) \ne \mathrm{rank}(G(x))$. Then
$$\mathrm{rank}(x, A, R) = \sup\{\mathrm{rank}(y, A, R) + 1 : y \in A \text{ and } (y, x) \in R\}$$
$$= \sup\{\mathrm{rank}(G(y)) + 1 : y \in A \text{ and } (y, x) \in R\} \quad (\text{minimality of } x)$$
$$= \sup\{\mathrm{rank}(u) + 1 : u \in G(x)\} \quad (\text{definition of } G)$$
$$= \mathrm{rank}(G(x)) \quad (\text{by 13.4(iii)}).$$
This is a contradiction, proving (iii).


The Mostowski collapse is especially important for extensional relations, defined as follows.
Let $R$ be a class relation and $A$ a class. We say that $R$ is extensional on $A$ iff the following generalization of the extensionality axiom holds:
$$\forall x, y \in A[\forall z \in A[(z, x) \in R \text{ iff } (z, y) \in R] \to x = y].$$
Proposition 13.11. Suppose that $R$ is well-founded and set-like on $A$. Let $G$ and $M$ be the Mostowski collapsing function and Mostowski collapse, respectively. Then the following conditions are equivalent:
(i) $R$ is extensional on $A$.
(ii) $G$ is one-one, and for all $x, y \in A$ we have $(x, y) \in R$ iff $G(x) \in G(y)$.
Proof. (i)$\Rightarrow$(ii): Assume (i). Suppose that $G$ is not one-one. Then the set
$$(\ast) \qquad \{x \in A : \text{there is a } y \in A \text{ such that } x \ne y \text{ and } G(x) = G(y)\}$$
is nonempty, and we take an R-minimal element $x$ of this set. Also, let $y \in A$ with $x \ne y$ and $G(x) = G(y)$. Since both $x$ and $y$ are in $A$, and $x \ne y$, the extensionality condition gives two cases.
Case 1. There is a $z \in A$ such that $(z, x) \in R$ and $(z, y) \notin R$. Since $(z, x) \in R$, it follows from the minimality of $x$ that $z$ is not in the set $(\ast)$. Now $G(z) \in G(x)$ by Proposition 13.10(i), so the fact that $G(x) = G(y)$ implies that $G(z) \in G(y)$. Hence by definition of $G$ we can choose $w \in A$ such that $(w, y) \in R$ and $G(z) = G(w)$. Then from $z$ not in $(\ast)$ we infer that $z = w$, hence $(z, y) \in R$, contradiction.
Case 2. There is a $z \in A$ such that $(z, y) \in R$ and $(z, x) \notin R$. Since $(z, y) \in R$, by Proposition 13.10(i) we get $G(z) \in G(y) = G(x)$, and so there is a $v \in A$ such that $(v, x) \in R$ and $G(z) = G(v)$. Now $v$ is not in $(\ast)$ by the minimality of $x$, so $z = v$ and $(z, x) \in R$, contradiction.
Therefore, $G$ is one-one. Now the implication $\Rightarrow$ in the second part of (ii) holds by Proposition 13.10(i). Suppose now that $G(x) \in G(y)$. Choose $w \in A$ such that $(w, y) \in R$ and $G(x) = G(w)$. Then $x = w$ since $G$ is one-one, so $(x, y) \in R$, as desired.
(ii)$\Rightarrow$(i): Assume (ii), and suppose that $x, y \in A$, and $\forall z \in A[(z, x) \in R \text{ iff } (z, y) \in R]$. Take any $u \in G(x)$. By the definition of $G$, choose $z \in A$ such that $(z, x) \in R$ and $u = G(z)$. Then also $(z, y) \in R$, so $u = G(z) \in G(y)$. This shows that $G(x) \subseteq G(y)$. Similarly $G(y) \subseteq G(x)$, so $G(x) = G(y)$. Since $G$ is one-one, it follows that $x = y$.
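The equivalence can be tested on small finite relations. The following sketch assumes finite A and well-founded R; the helpers collapse, is_extensional and satisfies_ii are hypothetical names introduced only for this check.

```python
def collapse(A, R):
    """Mostowski collapsing function of a finite well-founded relation, as a dict."""
    G = {}

    def go(x):
        if x not in G:
            G[x] = frozenset(go(y) for y in A if (y, x) in R)
        return G[x]

    for x in A:
        go(x)
    return G

def is_extensional(A, R):
    """Condition (i): distinct elements of A have distinct sets of R-predecessors."""
    def preds(x):
        return frozenset(z for z in A if (z, x) in R)
    return all(x == y or preds(x) != preds(y) for x in A for y in A)

def satisfies_ii(A, R):
    """Condition (ii): G is one-one, and (x, y) in R iff G(x) in G(y)."""
    G = collapse(A, R)
    one_one = len(set(G.values())) == len(A)
    reflects = all(((x, y) in R) == (G[x] in G[y]) for x in A for y in A)
    return one_one and reflects

A = {"a", "b", "c"}
chain = {("a", "b"), ("b", "c")}   # extensional
fork = {("a", "b"), ("a", "c")}    # b and c have the same predecessors

for R in (chain, fork):
    print(is_extensional(A, R), satisfies_ii(A, R))
```

For the fork, b and c both collapse to {0}, so G fails to be one-one exactly where extensionality fails.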
Theorem 13.12. Suppose that $R$ is a well-founded class relation that is set-like and extensional on a class $A$. Then there are unique $G, M$ such that $M$ is a transitive class and $G$ is an isomorphism from $(A, R)$ onto $(M, \in)$.
Note that we have formulated this in the usual fashion for isomorphism of structures, but of course we cannot form the ordered pairs $(A, R)$ and $(M, \in)$ if $A, R, M$ are proper classes. So we understand the above as an abbreviation for a longer statement, that $G$ is a bijection from $A$ onto $M$, etc.
Proof of 13.12. The existence of $G$ and $M$ is immediate from Propositions 13.10 and 13.11. Now suppose that $G'$ and $M'$ also work. Since $M$ is the range of $G$ and $M'$ is the range of $G'$, it suffices to show that $G = G'$. Suppose not. Let $a$ be R-minimal such that $G(a) \ne G'(a)$. Take any $x \in G(a)$. Then since $M$ is transitive and $G(a) \in M$, it follows that $x \in M$. And since $G$ maps onto $M$, it then follows that there is a $b \in A$ such that $G(b) = x$. So $G(b) \in G(a)$ and hence, by the isomorphism property, $(b, a) \in R$. Then the minimality of $a$ yields that $G'(b) = G(b) \in G(a)$. But also $(b, a) \in R$ implies that $G'(b) \in G'(a)$ by the isomorphism property, so $x = G(b) \in G'(a)$. Thus we have proved that $G(a) \subseteq G'(a)$. By symmetry $G'(a) \subseteq G(a)$, so $G(a) = G'(a)$, contradiction.
EXERCISES
E13.1. Write out all the elements of $V_n$ for $n = 0, 1, 2, 3, 4$.
E13.2. Define by recursion
$$S(\alpha) = \bigcup_{\beta<\alpha} \mathscr{P}(S(\beta))$$
for every ordinal $\alpha$. Prove that $V_\alpha = S(\alpha)$ for every ordinal $\alpha$.


E13.3. Determine exactly the ranks of the following sets in terms of the ranks of the sets entering into their definitions. In some cases the rank is not completely determined by the ranks of the constituents; in such cases, describe all possibilities.
(i) $\{x\}$
(ii) $\{x, y\}$
(iii) $(x, y)$
(iv) $x \cup y$
(v) $x \cap y$
(vi) $x \backslash y$
(vii) $\bigcup x$
(viii) $\mathrm{dmn}(R)$
(ix) $\mathscr{P}(x)$
(x) $R^{-1}$
E13.4. Let $R$ be a relation $\subseteq A \times A$ for some set $A$. Show that $R$ is well-founded on $A$ iff there does not exist a sequence $\langle a_i : i \in \omega\rangle$ of elements of $A$ such that $a_{i+1} R a_i$ for all $i \in \omega$.
E13.5. Define $xRy$ iff $(x, 1) \in y$. Show that $R$ is well-founded and set-like on $V$.
E13.6. (Continuing E13.5) By recursion let $\hat{y} = \{(\hat{x}, 1) : x \in y\}$ for any set $y$. Let $G$ be the Mostowski collapsing function for $R, V$ in exercise E13.5. Prove that $G(\hat{y}) = y$ for every set $y$.
E13.7. Prove that in the replacement axioms (page 8) the part $\exists!y$ can be replaced by $\exists y$. Hint: apply the replacement axiom to a formula that says "there is a member $y$ of $V_\alpha$ such that $\varphi$ holds, and $\alpha$ is minimum with this property".
E13.8. Prove that if $a$ is transitive, then $\{\mathrm{rank}(b) : b \in a\}$ is an ordinal.
E13.9. Show that for any set $a$ we have $\mathrm{rank}(\mathrm{trcl}(a)) = \mathrm{rank}(a)$.
E13.10. Define $xRy$ iff $x \in \mathrm{trcl}(y)$. Show that $R$ is well-founded and set-like on $V$.
E13.11. (Continuing exercise E13.10) Let $G$ be the Mostowski collapsing function for $R, V$. Show that $G(x) = \mathrm{rank}(x)$ for every set $x$.
E13.12. Define A and R such that R is well-founded on A but not set-like on A.

E13.13. Define A and R such that R is neither well-founded nor set-like on A.


E13.14. Let $R = \{(m, n) \in \omega \times \omega : n < m\}$. Note that $R$ is not well-founded on $\omega$, but it is set-like on $\omega$. Define a function $F : \omega \times V \to V$ such that there is no function $G$ as in the recursion theorem, using $\omega$ and $R$.
References
Jech, T. Set Theory, 769pp.
Kunen, K. Set Theory, 313pp.


14. Models of set theory


In this chapter we introduce the basic notions used for models of set theory: relativization, absoluteness, and reflection. Our first consistency result is in this chapter too: if ZFC is consistent, then so is ZFC + "there is no inaccessible cardinal".
Relativization
Although models of set theory are special cases of models in any first-order language, it
is convenient to introduce the basic notions only for our standard language for set theory.
Another aspect of our situation is that we want to consider models which are proper classes.
For this reason, it is better to talk about relativization instead of the more standard notions
of model theory.
The basic definition of relativization is as follows.
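Suppose that $M$ is a class. We associate with each formula $\varphi$ its relativization to $M$, denoted by $\varphi^M$. The definition goes by recursion on formulas:
$(x = y)^M$ is $x = y$.
$(x \in y)^M$ is $x \in y$.
$(\varphi \wedge \psi)^M$ is $\varphi^M \wedge \psi^M$.
$(\neg\varphi)^M$ is $\neg\varphi^M$.
$(\exists x\,\varphi)^M$ is $\exists x[x \in M \wedge \varphi^M]$.
The more rigorous version of this definition associates with each pair $\varphi, \mu$ of formulas a third formula $\varphi^\mu$ which is called the relativization of $\varphi$ to $\mu$.
We say that $\varphi$ holds in $M$, or is true in $M$, iff $\varphi^M$ holds.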
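The recursion on formulas can be made concrete with a small syntax-tree sketch. The encoding below is an assumption of ours, not the text's: formulas are nested tuples with tags "eq", "in", "not", "and" and "exists", and the class M is represented by a function that produces the formula saying that a variable belongs to M.

```python
def relativize(phi, member):
    """Return the relativization of phi, bounding each quantifier by `member`.

    member(v) must return a formula (in the same tuple encoding) expressing
    that the variable v belongs to the class.
    """
    tag = phi[0]
    if tag in ("eq", "in"):
        return phi                      # atomic formulas are unchanged
    if tag == "not":
        return ("not", relativize(phi[1], member))
    if tag == "and":
        return ("and", relativize(phi[1], member), relativize(phi[2], member))
    if tag == "exists":
        _, var, body = phi
        return ("exists", var, ("and", member(var), relativize(body, member)))
    raise ValueError(f"unknown connective: {tag!r}")

# "There is an empty set": exists x, not exists y, y in x.
empty_set_exists = ("exists", "x", ("not", ("exists", "y", ("in", "y", "x"))))

# Treat M as a set parameter, so that "v in M" is itself an atomic formula.
rel = relativize(empty_set_exists, lambda v: ("in", v, "M"))
print(rel)
```

The result says that some x in M has no elements that lie in M, which is in general weaker than x being empty; the gap disappears when M is transitive.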
The basic fact about relativization, which could be proved in an elementary logic course, is the following theorem; we supply a proof, since the theorem may not be familiar to the reader. Note that in this theorem and proof we use the rigorous version of the definition of relativization. We write $\Gamma \models \varphi$ to abbreviate the statement that every structure which is a model of each member of $\Gamma$ is also a model of $\varphi$.
Theorem 14.1. Let $\Gamma$ be a set of sentences, $\varphi$ a sentence, and $\mu(x, \bar y)$ a formula, where we indicate one distinguished variable $x$, and a string $\bar y$ of other free variables. Let $\Gamma^\mu = \{\psi^\mu : \psi \in \Gamma\}$. Suppose that $\Gamma \models \varphi$. Then
$$\Gamma^\mu \models \exists x\,\mu(x, \bar y) \to \varphi^\mu.$$
Proof. Assume the hypothesis of the theorem, let $(A, E)$ be any structure for our set-theoretic language, and let $\bar a$ be a sequence of elements of $A$ of the length of $\bar y$. Assume that
$$(1) \qquad (A, E) \models \Gamma^\mu(\bar a) \wedge \exists x\,\mu(x, \bar a);$$
we want to show that $(A, E) \models \varphi^\mu(\bar a)$. To do this, we define another structure $(B, F)$ for our language. Let
$$B = \{b \in A : (A, E) \models \mu(b, \bar a)\} \quad \text{and} \quad F = E \cap (B \times B).$$
Note that $B \ne \emptyset$ since $(A, E) \models \exists x\,\mu(x, \bar a)$ by (1). Now we claim:

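(2) For any formula $\chi(\bar z)$ and any $\bar c$ in $B$, $(A, E) \models \chi^\mu(\bar c, \bar a)$ iff $(B, F) \models \chi(\bar c)$.
We prove (2) by induction on $\chi$:
$(A, E) \models (c_0 = c_1)^\mu$ iff $c_0 = c_1$ iff $(B, F) \models c_0 = c_1$;
$(A, E) \models (c_0 \in c_1)^\mu$ iff $(c_0, c_1) \in E$ iff $(c_0, c_1) \in F$ iff $(B, F) \models c_0 \in c_1$;
$(A, E) \models (\neg\chi)^\mu(\bar c, \bar a)$ iff not[$(A, E) \models \chi^\mu(\bar c, \bar a)$] iff not[$(B, F) \models \chi(\bar c)$] (induction hypothesis) iff $(B, F) \models \neg\chi(\bar c)$;
$(A, E) \models (\chi \wedge \theta)^\mu(\bar c, \bar a)$ iff [$(A, E) \models \chi^\mu(\bar c, \bar a)$] and $(A, E) \models \theta^\mu(\bar c, \bar a)$ iff [$(B, F) \models \chi(\bar c)$] and $(B, F) \models \theta(\bar c)$ (induction hypothesis) iff $(B, F) \models (\chi \wedge \theta)(\bar c)$.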

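We do the quantifier step in each direction separately. First suppose that $(A, E) \models (\exists x\,\chi)^\mu(\bar c, \bar a)$. Thus
$$(3) \qquad (A, E) \models \exists x[\mu(x, \bar a) \wedge \chi^\mu(x, \bar c, \bar a)].$$
Choose $b \in A$ such that $(A, E) \models [\mu(b, \bar a) \wedge \chi^\mu(b, \bar c, \bar a)]$. Now $(A, E) \models \mu(b, \bar a)$ implies that $b \in B$. Hence $(A, E) \models \chi^\mu(b, \bar c, \bar a)$ implies by the inductive hypothesis that $(B, F) \models \chi(b, \bar c)$. Hence $(B, F) \models \exists x\,\chi(x, \bar c)$.
Conversely, suppose that $(B, F) \models \exists x\,\chi(x, \bar c)$; we want to prove (3). Choose $b \in B$ such that $(B, F) \models \chi(b, \bar c)$. Hence $(A, E) \models \mu(b, \bar a)$, and by the inductive hypothesis, $(A, E) \models \chi^\mu(b, \bar c, \bar a)$. So $(A, E) \models \mu(b, \bar a) \wedge \chi^\mu(b, \bar c, \bar a)$. Now (3) follows.
This finishes the proof of (2).
Now by (1) and (2), $(B, F) \models \Gamma$. By an assumption of the theorem, $\Gamma \models \varphi$, so $(B, F) \models \varphi$. Now an application of (2) yields $(A, E) \models \varphi^\mu$.
This shows that $\Gamma^\mu \models \exists x\,\mu(x, \bar y) \to \varphi^\mu$.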
This theorem gives the basic idea of consistency proofs in set theory; we express this as follows. By "$\Gamma$ consistent" we mean "$\Gamma$ has a model".
Corollary 14.2. Suppose that $\Gamma$ and $\Delta$ are collections of sentences in our language of set theory. Suppose that $M$ is a class, and $\Gamma \models [M \ne \emptyset$ and $M$ is a model of $\Delta]$. Then $\Gamma$ consistent implies that $\Delta$ is consistent.
Although this is the form we use in practice, for the proof we give the more rigorous version:
Suppose that $\Gamma$ and $\Delta$ are collections of sentences in our language of set theory. Suppose that $\mu(x, \bar y)$ is a formula, and for every sentence $\varphi \in \Delta$, $\Gamma \models \exists x\,\mu(x, \bar y) \wedge \varphi^\mu$. Also assume that $\Gamma$ has a model. Then $\Delta$ has a model.
Proof. Suppose to the contrary that $\Delta$ does not have a model. Then trivially $\Delta \models \neg(x = x)$. By Theorem 14.1, $\Delta^\mu \models \exists x\,\mu(x, \bar y) \to \neg(x = x)$. Hence by hypothesis we get $\Gamma \models \neg(x = x)$, contradiction.
Our main examples of the use of this corollary are with $\Gamma$ = ZFC and $\Delta$ = ZFC + "there are no inaccessibles"; and with $\Gamma$ = ZF and $\Delta$ = ZFC + AC + GCH. The class $M$ is more complicated to describe, and we defer that until actually ready to give the applications of the corollary.
Now we give some simple facts which will be useful in checking the axioms of ZFC in
the transitive classes which we will define. So we need to refer back to the beginning of
the notes, where the ZFC axioms were given.
Theorem 14.3. The extensionality axiom holds in any nonempty transitive class.
Proof. The relativized version of the extensionality axiom is
$$\forall x \in M\,\forall y \in M[\forall z \in M(z \in x \leftrightarrow z \in y) \to x = y].$$
To prove this, assume that $x, y \in M$, and suppose that for all $z \in M$, $z \in x$ iff $z \in y$. Take any $z \in x$. Because $M$ is transitive, we get $z \in M$. Hence $z \in y$. Thus $z \in x$ implies that $z \in y$. The converse is similar. So $x = y$.
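The role of transitivity can be seen on tiny sets of hereditarily finite sets. The sketch below models sets as frozensets; is_transitive and extensionality_holds are illustrative names of ours, and the second function simply evaluates the relativized axiom by brute force.

```python
E0 = frozenset()          # the empty set
E1 = frozenset({E0})      # {0}
E2 = frozenset({E1})      # {{0}}

def is_transitive(M):
    """Every element of an element of M is an element of M."""
    return all(y in M for x in M for y in x)

def extensionality_holds(M):
    """Relativized extensionality: members of M with the same elements in M are equal."""
    return all(x == y or (x & M) != (y & M) for x in M for y in M)

M1 = frozenset({E0, E1})  # transitive
M2 = frozenset({E0, E2})  # not transitive: E1 is in E2 but not in M2

for M in (M1, M2):
    print(is_transitive(M), extensionality_holds(M))
```

In M2 the sets E0 and E2 have no elements that lie in M2, so M2 cannot tell them apart and the relativized axiom fails.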
The following theorem reduces checking the comprehension axioms to checking a closure
property.
Theorem 14.4. Suppose that $M$ is a nonempty class, and for each formula $\varphi$ with free variables among $x, z, w_1, \ldots, w_n$,
$$\forall z, w_1, \ldots, w_n \in M[\{x \in z : \varphi^M(x, z, w_1, \ldots, w_n)\} \in M].$$
Then the comprehension axioms hold in $M$.
Proof. The straightforward relativization of an instance of the comprehension axioms is
$$\forall z \in M\,\forall w_1 \in M \ldots \forall w_n \in M\,\exists y \in M\,\forall x \in M(x \in y \leftrightarrow x \in z \wedge \varphi^M).$$
So, we take $z, w_1, \ldots, w_n \in M$. Let
$$y = \{x \in z : \varphi^M(x, z, w_1, \ldots, w_n)\};$$
by hypothesis, we have $y \in M$. Then for any $x \in M$,
$$x \in y \quad \text{iff} \quad x \in z \text{ and } \varphi^M(x, z, w_1, \ldots, w_n).$$
The following theorems are obvious from the forms of the pairing axiom and union axioms:
Theorem 14.5. Suppose that $M$ is a nonempty class and
$$\forall x, y \in M\,\exists z \in M(x \in z \text{ and } y \in z).$$
Then the pairing axiom holds in $M$.
Theorem 14.6. Suppose that $M$ is a nonempty class and
$$\forall x \in M\,\exists z \in M\Bigl(\bigcup x \subseteq z\Bigr).$$
Then the union axiom holds in $M$.
For the next result, recall that $z \subseteq x$ is an abbreviation for $\forall w(w \in z \to w \in x)$.
Theorem 14.7. Suppose that $M$ is a nonempty transitive class. Then the following are equivalent:
(i) The power set axiom holds in $M$.
(ii) For every $x \in M$ there is a $y \in M$ such that $\mathscr{P}(x) \cap M \subseteq y$.
Proof. (i)$\Rightarrow$(ii): Assume (i). Thus
$$(1) \qquad \forall x \in M\,\exists y \in M\,\forall z \in M[\forall w \in M(w \in z \to w \in x) \to z \in y].$$
To prove (ii), take any $x \in M$. Choose $y \in M$ as in (1). Suppose that $z \in \mathscr{P}(x) \cap M$. Clearly then $\forall w \in M(w \in z \to w \in x)$, so by (1), $z \in y$, as desired in (ii).
(ii)$\Rightarrow$(i): Assume (ii). This time we want to prove (1). So, suppose that $x \in M$. Choose $y \in M$ as in (ii). Now suppose that $z \in M$ and $\forall w \in M(w \in z \to w \in x)$. Then the transitivity of $M$ implies that $\forall w(w \in z \to w \in x)$, i.e., $z \subseteq x$. So by (ii), $z \in y$, as desired.
We defer the discussion of the infinity axiom until we talk about absoluteness.
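Theorem 14.8. Suppose that $M$ is a transitive class, and for every formula $\varphi$ with free variables among $x, y, A, w_1, \ldots, w_n$ and for any $A, w_1, \ldots, w_n \in M$ the following implication holds:
$$\forall x \in A\,\exists!y[y \in M \wedge \varphi^M(x, y, A, w_1, \ldots, w_n)] \text{ implies that } \exists Y \in M[\{y \in M : \exists x \in A\,\varphi^M(x, y, A, w_1, \ldots, w_n)\} \subseteq Y].$$
Then the replacement axioms hold in $M$.
Proof. Assume the hypothesis of the theorem. We write out the relativized version of an instance of the replacement axiom in full, remembering to replace the quantifier $\exists!$ by its definition:
$$\forall A \in M\,\forall w_1 \in M \ldots \forall w_n \in M\,\bigl[\forall x \in M[x \in A \to \exists y \in M[\varphi^M(x, y, A, w_1, \ldots, w_n) \wedge \forall u \in M[\varphi^M(x, u, A, w_1, \ldots, w_n) \to y = u]]]$$
$$\to \exists Y \in M\,\forall x \in M[x \in A \to \exists y \in M[y \in Y \wedge \varphi^M(x, y, A, w_1, \ldots, w_n)]]\bigr].$$
To prove this, assume that $A, w_1, \ldots, w_n \in M$ and
$$\forall x \in M[x \in A \to \exists y \in M[\varphi^M(x, y, A, w_1, \ldots, w_n) \wedge \forall u \in M[\varphi^M(x, u, A, w_1, \ldots, w_n) \to y = u]]].$$
Since $M$ is transitive, we get
$$\forall x \in A\,\exists y \in M[\varphi^M(x, y, A, w_1, \ldots, w_n) \wedge \forall u \in M[\varphi^M(x, u, A, w_1, \ldots, w_n) \to y = u]],$$
so that
$$(1) \qquad \forall x \in A\,\exists!y[y \in M \wedge \varphi^M(x, y, A, w_1, \ldots, w_n)].$$
Hence by the hypothesis of the theorem we get $Y \in M$ such that
$$(2) \qquad \{y \in M : \exists x \in A\,\varphi^M(x, y, A, w_1, \ldots, w_n)\} \subseteq Y.$$
Suppose that $x \in M$ and $x \in A$. By (1) we get $y \in M$ such that $\varphi^M(x, y, A, w_1, \ldots, w_n)$. Hence by (2) we get $y \in Y$, as desired.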
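Theorem 14.9. If $M$ is a transitive class, then the foundation axiom holds in $M$.
Proof. The foundation axiom, with the defined notion $\emptyset$ eliminated, is
$$\forall x[\exists y(y \in x) \to \exists y[y \in x \wedge \forall z \in y(z \notin x)]].$$
Hence the relativized version is
$$\forall x \in M[\exists y \in M(y \in x) \to \exists y \in M[y \in x \wedge \forall z \in M[z \in y \to z \notin x]]].$$
So, take any $x \in M$, and suppose that there is a $y \in M$ such that $y \in x$. Choose $y \in x$ so that $y \cap x = \emptyset$. Then $y \in M$ by transitivity. If $z \in M$ and $z \in y$, then $z \notin x$, as desired.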
Absoluteness
To treat the infinity axiom and more complicated statements, we need to go into the
important notion of absoluteness. Roughly speaking, a formula is absolute provided that
its meaning does not change in going from one set to a bigger one, or vice versa. The exact
definition is as follows.
Suppose that $M \subseteq N$ are classes and $\varphi(x_1, \ldots, x_n)$ is a formula of our set-theoretical language. We say that $\varphi$ is absolute for $M, N$ iff
$$\forall x_1, \ldots, x_n \in M[\varphi^M(x_1, \ldots, x_n) \text{ iff } \varphi^N(x_1, \ldots, x_n)].$$
An important special case of this notion occurs when $N = V$. Then we just say that $\varphi$ is absolute for $M$.
More formally, we associate with three formulas $\mu(y, w_1, \ldots, w_m)$, $\nu(y, w_1, \ldots, w_m)$, $\varphi(x_1, \ldots, x_n)$ another formula "$\varphi$ is absolute for $\mu, \nu$", namely the following formula:
$$\forall x_1, \ldots, x_n\Bigl[\bigwedge_{1 \le i \le n} \mu(x_i) \to [\varphi^\mu(x_1, \ldots, x_n) \leftrightarrow \varphi^\nu(x_1, \ldots, x_n)]\Bigr].$$

In full generality, very few formulas are absolute; for example, see exercise E14.5. Usually we need to assume that the sets are transitive. Then there is an important set of formulas all of which are absolute; this set is defined as follows.
The set of $\Delta_0$-formulas is the smallest set $\Gamma$ of formulas satisfying the following conditions:
(a) Each atomic formula is in $\Gamma$.
(b) If $\varphi$ and $\psi$ are in $\Gamma$, then so are $\neg\varphi$ and $\varphi \wedge \psi$.
(c) If $\varphi$ is in $\Gamma$, then so are $\exists x \in y\,\varphi$ and $\forall x \in y\,\varphi$.
Recall here that $\exists x \in y\,\varphi$ and $\forall x \in y\,\varphi$ are abbreviations for $\exists x(x \in y \wedge \varphi)$ and $\forall x(x \in y \to \varphi)$ respectively.
Theorem 14.10. If $M$ is transitive and $\varphi$ is $\Delta_0$, then $\varphi$ is absolute for $M$.
Proof. We show that the collection of formulas absolute for $M$ satisfies the conditions defining the set of $\Delta_0$-formulas. Absoluteness is clear for atomic formulas. It is also clear that if $\varphi$ and $\psi$ are absolute for $M$, then so are $\neg\varphi$ and $\varphi \wedge \psi$. Now suppose that $\varphi$ is absolute for $M$; we show that $\exists x \in y\,\varphi$ is absolute for $M$. Implicitly, $\varphi$ can involve additional parameters $w_1, \ldots, w_n$. Assume that $y, w_1, \ldots, w_n \in M$. First suppose that $\exists x \in y\,\varphi(x, y, w_1, \ldots, w_n)$. Choose $x \in y$ so that $\varphi(x, y, w_1, \ldots, w_n)$. Since $M$ is transitive, $x \in M$. Hence by the inductive assumption, $\varphi^M(x, y, w_1, \ldots, w_n)$ holds. This shows that $(\exists x \in y\,\varphi(x, y, w_1, \ldots, w_n))^M$. Conversely suppose that $(\exists x \in y\,\varphi(x, y, w_1, \ldots, w_n))^M$. Thus $\exists x \in M[x \in y \wedge \varphi^M(x, y, w_1, \ldots, w_n)]$. By the inductive assumption, $\varphi(x, y, w_1, \ldots, w_n)$ holds for such an $x$. So this shows that $\exists x \in y\,\varphi(x, y, w_1, \ldots, w_n)$. The case $\forall x \in y\,\varphi$ is treated similarly.
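The need for transitivity can be illustrated with the $\Delta_0$ formula "x is transitive", that is, for all y in x and all z in y, z is in x. The sketch below evaluates it on hereditarily finite sets modelled as frozensets, either with unrestricted quantifiers or with both bounded quantifiers further restricted to a set M; transitive_in is a name of our own.

```python
E0 = frozenset()
E1 = frozenset({E0})
E2 = frozenset({E1})
X = frozenset({E2})       # {{{0}}}, which is not a transitive set

def transitive_in(x, M=None):
    """Evaluate 'for all y in x, for all z in y, z in x'.

    With M given, y and z additionally range only over elements of M,
    which is what relativization to M does to the bounded quantifiers.
    """
    inside = (lambda s: True) if M is None else (lambda s: s in M)
    return all(z in x for y in x if inside(y) for z in y if inside(z))

N = frozenset({E0, E1, E2, X})   # a transitive set containing X
M = frozenset({E0, E2, X})       # not transitive: E1 is missing

print(transitive_in(X), transitive_in(X, N), transitive_in(X, M))
```

Relativizing to the transitive set N gives the same answer as evaluating in the universe, while the non-transitive M loses the witness E1 and wrongly judges X to be transitive.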
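Ordinals and special kinds of ordinals are absolute since they could have been defined using $\Delta_0$ formulas:
Theorem 14.11. The following are absolute for any transitive class:
(i) $x$ is an ordinal
(ii) $x$ is a limit ordinal
(iii) $x$ is a successor ordinal
(iv) $x$ is a finite ordinal
(v) $x$ is $\omega$
(vi) $x$ is $i$ (each $i < 10$)
Proof.
$x$ is an ordinal $\leftrightarrow$ $\forall y \in x\,\forall z \in y[z \in x] \wedge \forall y \in x\,\forall z \in y\,\forall w \in z[w \in y]$;
$x$ is a limit ordinal $\leftrightarrow$ $\exists y \in x[y = y] \wedge x$ is an ordinal $\wedge\ \forall y \in x\,\exists z \in x(y \in z)$;
$x$ is a successor ordinal $\leftrightarrow$ $x$ is an ordinal $\wedge\ x \ne \emptyset \wedge x$ is not a limit ordinal;
$x$ is a finite ordinal $\leftrightarrow$ $\forall y[y \notin x] \vee (x$ is a successor ordinal $\wedge\ \forall y \in x(\forall z[z \notin y] \vee y$ is a successor ordinal$))$;
$x = \omega$ $\leftrightarrow$ $x$ is a limit ordinal $\wedge\ \forall y \in x(y$ is a finite ordinal$)$;
finally, we do (vi) by induction on $i$. The case $i = 0$ is clear. Then
$$y = i + 1 \leftrightarrow \exists x \in y(x = i \wedge \forall z \in y[z \in x \vee z = x] \wedge \forall z \in x[z \in y] \wedge x \in y).$$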
The following theorem, while obvious, will be very useful in what follows.
Theorem 14.12. Suppose that $S$ is a set of sentences in our set-theoretic language, and $M$ and $N$ are classes which are models of $S$. Suppose that
$$S \models \forall x_1, \ldots, x_n[\varphi(x_1, \ldots, x_n) \leftrightarrow \psi(x_1, \ldots, x_n)].$$
Then $\varphi$ is absolute for $M, N$ iff $\psi$ is.
Of course we will usually apply this when $S$ is a subset of ZFC.
Recall that all of the many definitions that we have made in our development of set theory are supposed to be eliminable in favor of our original language. To apply Theorem 14.12, we should note that the development of the very elementary set theory in Chapter 3 did not use the axiom of choice or the axiom of infinity. We let ZF be our axioms without the axiom of choice, and ZF $-$ Inf the axioms ZF without the axiom of infinity.
The status of the functions that we have defined requires some explanation. Whenever we defined a function $F$ of $n$ arguments, we have implicitly assumed that there is an associated formula $\varphi$ whose free variables are among the first $n + 1$ variables, so that the following is derivable from the axioms assumed at the time of defining the function:
$$\forall v_0, \ldots, v_{n-1}\,\exists!v_n\,\varphi(v_0, \ldots, v_n).$$
Recall that "$\exists!v_n$" means "there is exactly one $v_n$". Now if we have a class model $M$ in which this sentence holds, then we can define $F^M$ by setting, for any $x_0, \ldots, x_{n-1} \in M$,
$$F^M(x_0, \ldots, x_{n-1}) = \text{the unique } y \in M \text{ such that } \varphi^M(x_0, \ldots, x_{n-1}, y).$$
In case $M$ satisfies the indicated sentence, we say that $F$ is defined in $M$. Given two class models $M \subseteq N$ in which $F$ is defined, we say that $F$ is absolute for $M, N$ provided that $\varphi$ is. Note that for $F$ to be absolute for $M, N$ it must be defined in both of them.
Proposition 14.13. Suppose that $M \subseteq N$ are models in which $F$ is defined. Then the following are equivalent:
(i) $F$ is absolute for $M, N$.
(ii) For all $x_0, \ldots, x_{n-1} \in M$ we have $F^M(x_0, \ldots, x_{n-1}) = F^N(x_0, \ldots, x_{n-1})$.
Proof. Let $\varphi$ be as above.
Assume (i), and suppose that $x_0, \ldots, x_{n-1} \in M$. Let $y = F^M(x_0, \ldots, x_{n-1})$. Then $y \in M$, and $\varphi^M(x_0, \ldots, x_{n-1}, y)$, so by (i), $\varphi^N(x_0, \ldots, x_{n-1}, y)$. Hence $F^N(x_0, \ldots, x_{n-1}) = y$.
Assume (ii), and suppose that $x_0, \ldots, x_{n-1}, y \in M$. Then
$\varphi^M(x_0, \ldots, x_{n-1}, y)$ iff $F^M(x_0, \ldots, x_{n-1}) = y$ (definition of $F^M$)
iff $F^N(x_0, \ldots, x_{n-1}) = y$ (by (ii))
iff $\varphi^N(x_0, \ldots, x_{n-1}, y)$ (definition of $F^N$).

The following theorem gives many explicit absoluteness results, and will be used frequently
along with some similar results below. Note that we do not need to be explicit about how
the relations and functions were really defined in Chapter 1; in fact, we were not very
explicit about that in the first place.
Theorem 14.14. The following relations and functions were defined by formulas equivalent to $\Delta_0$-formulas on the basis of ZF $-$ Inf, and hence are absolute for all transitive class models of ZF $-$ Inf:
(i) $x \in y$
(ii) $x = y$
(iii) $x \subseteq y$
(iv) $\{x, y\}$
(v) $\{x\}$
(vi) $(x, y)$
(vii) $\emptyset$
(viii) $x \cup y$
(ix) $x \cap y$
(x) $x \backslash y$
(xi) $x \cup \{x\}$
(xii) $x$ is transitive
(xiii) $\bigcup x$
(xiv) $\bigcap x$ (with $\bigcap \emptyset = \emptyset$)
Note here, for example, that in (iv) we really mean the 2-place function assigning to sets $x, y$ the unordered pair $\{x, y\}$.
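Proof. (i) and (ii) are already $\Delta_0$ formulas. (iii):
$$x \subseteq y \leftrightarrow \forall z \in x(z \in y).$$
(iv):
$$z = \{x, y\} \leftrightarrow \forall w \in z(w = x \vee w = y) \wedge x \in z \wedge y \in z.$$
(v): Similarly. (vi):
$$z = (x, y) \leftrightarrow \forall w \in z[w = \{x, y\} \vee w = \{x\}] \wedge \exists w \in z[w = \{x, y\}] \wedge \exists w \in z[w = \{x\}].$$
(vii):
$$x = \emptyset \leftrightarrow \forall y \in x(y \ne y).$$
(viii):
$$z = x \cup y \leftrightarrow \forall w \in z(w \in x \vee w \in y) \wedge \forall w \in x(w \in z) \wedge \forall w \in y(w \in z).$$
(ix):
$$z = x \cap y \leftrightarrow \forall w \in z(w \in x \wedge w \in y) \wedge \forall w \in x(w \in y \to w \in z).$$
(x):
$$z = x \backslash y \leftrightarrow \forall w \in z(w \in x \wedge w \notin y) \wedge \forall w \in x(w \notin y \to w \in z).$$
(xi):
$$y = x \cup \{x\} \leftrightarrow \forall w \in y(w \in x \vee w = x) \wedge \forall w \in x(w \in y) \wedge x \in y.$$
(xii):
$$x \text{ is transitive} \leftrightarrow \forall y \in x(y \subseteq x).$$
(xiii):
$$y = \bigcup x \leftrightarrow \forall w \in y\,\exists z \in x(w \in z) \wedge \forall w \in x(w \subseteq y).$$
(xiv):
$$y = \bigcap x \leftrightarrow [x \ne \emptyset \wedge \forall w \in y\,\forall z \in x(w \in z) \wedge \forall w \in x\,\forall t \in w[\forall z \in x(t \in z) \to t \in y]] \vee [x = \emptyset \wedge y = \emptyset].$$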
A stronger form of Theorem 14.14. For each of the indicated relations and functions, we do not need $M$ to be a model of all of ZF $-$ Inf. In fact, we need only finitely many of the axioms of ZF $-$ Inf: enough to prove the uniqueness condition for any functions involved, and enough to prove the equivalence of the formula with a $\Delta_0$-formula, since $\Delta_0$ formulas are absolute for any transitive class model. To be absolutely rigorous here, one would need an explicit definition for each relation and function symbol involved, and then an explicit proof of equivalence to a $\Delta_0$ formula; given these, a finite set of axioms becomes clear. And since any of the relations and functions of Theorem 14.14 require only finitely many basic relations and functions, this can always be done. For Theorem 14.14 it is easy enough to work this all out in detail. We will be interested, however, in using this fact for more complicated absoluteness results to come.
As an illustration, however, we do some details for the function $\{x, y\}$. The definition involved is naturally taken to be the following:
$$\forall x, y, z[z = \{x, y\} \leftrightarrow \forall w[w \in z \leftrightarrow w = x \vee w = y]].$$
The axioms involved are the pairing axiom and one instance of the comprehension axiom:
$$\forall x, y\,\exists w[x \in w \wedge y \in w];$$
$$\forall x, y, w\,\exists z\,\forall u(u \in z \leftrightarrow u \in w \wedge (u = x \vee u = y)).$$
$\{x, y\}$ is then absolute for any transitive class model of these three sentences, by the proof of (iv) in Theorem 14.14, for which they are sufficient.
For further absoluteness results we will not reduce to $\Delta_0$ formulas. We need the following extensions of the absoluteness notion.
Suppose that $M \subseteq N$ are classes, and $\varphi(w_1, \ldots, w_n)$ is a formula. Then we say that $\varphi$ is absolute upwards for $M, N$ iff for all $w_1, \ldots, w_n \in M$, if $\varphi^M(w_1, \ldots, w_n)$, then $\varphi^N(w_1, \ldots, w_n)$. It is absolute downwards for $M, N$ iff for all $w_1, \ldots, w_n \in M$, if $\varphi^N(w_1, \ldots, w_n)$, then $\varphi^M(w_1, \ldots, w_n)$.
Theorem 14.15. Suppose that $\varphi(x_1, \ldots, x_n, w_1, \ldots, w_m)$ is absolute for $M, N$. Then
(i) $\exists x_1, \ldots, x_n\,\varphi(x_1, \ldots, x_n, w_1, \ldots, w_m)$ is absolute upwards for $M, N$.
(ii) $\forall x_1, \ldots, x_n\,\varphi(x_1, \ldots, x_n, w_1, \ldots, w_m)$ is absolute downwards for $M, N$.
Theorem 14.16. Absoluteness is preserved under composition. In detail: suppose that $M \subseteq N$ are classes, and the following are absolute:
$\varphi(x_1, \ldots, x_n)$;
$F$, an $n$-ary function;
for each $i = 1, \ldots, n$, an $m$-ary function $G_i$.
Then the following are absolute:
(i) $\varphi(G_1(x_1, \ldots, x_m), \ldots, G_n(x_1, \ldots, x_m))$.
(ii) The $m$-ary function assigning to $x_1, \ldots, x_m$ the value $F(G_1(x_1, \ldots, x_m), \ldots, G_n(x_1, \ldots, x_m))$.
Proof. We use Theorem 14.15:
$$\varphi(G_1(x_1, \ldots, x_m), \ldots, G_n(x_1, \ldots, x_m)) \leftrightarrow \exists z_1, \ldots, z_n\Bigl[\varphi(z_1, \ldots, z_n) \wedge \bigwedge_{i=1}^n (z_i = G_i(x_1, \ldots, x_m))\Bigr];$$
$$\varphi(G_1(x_1, \ldots, x_m), \ldots, G_n(x_1, \ldots, x_m)) \leftrightarrow \forall z_1, \ldots, z_n\Bigl[\bigwedge_{i=1}^n (z_i = G_i(x_1, \ldots, x_m)) \to \varphi(z_1, \ldots, z_n)\Bigr];$$
$$y = F(G_1(x_1, \ldots, x_m), \ldots, G_n(x_1, \ldots, x_m)) \leftrightarrow \exists z_1, \ldots, z_n\Bigl[(y = F(z_1, \ldots, z_n)) \wedge \bigwedge_{i=1}^n (z_i = G_i(x_1, \ldots, x_m))\Bigr];$$
$$y = F(G_1(x_1, \ldots, x_m), \ldots, G_n(x_1, \ldots, x_m)) \leftrightarrow \forall z_1, \ldots, z_n\Bigl[\bigwedge_{i=1}^n (z_i = G_i(x_1, \ldots, x_m)) \to (y = F(z_1, \ldots, z_n))\Bigr].$$
Theorem 14.17. Suppose that $M \subseteq N$ are classes, $\varphi(y, x_1, \ldots, x_m, w_1, \ldots, w_n)$ is absolute for $M, N$, and $F$ and $G$ are $m$-ary functions absolute for $M, N$. Then the following are also absolute for $M, N$:
(i) $z \in F(x_1, \ldots, x_m)$.
(ii) $F(x_1, \ldots, x_m) \in z$.
(iii) $\exists y \in F(x_1, \ldots, x_m)\,\varphi(y, x_1, \ldots, x_m, w_1, \ldots, w_n)$.
(iv) $\forall y \in F(x_1, \ldots, x_m)\,\varphi(y, x_1, \ldots, x_m, w_1, \ldots, w_n)$.
(v) $F(x_1, \ldots, x_m) = G(x_1, \ldots, x_m)$.
(vi) $F(x_1, \ldots, x_m) \in G(x_1, \ldots, x_m)$.
Proof.
$z \in F(x_1, \ldots, x_m) \leftrightarrow \exists w[z \in w \wedge w = F(x_1, \ldots, x_m)]$
$\leftrightarrow \forall w[w = F(x_1, \ldots, x_m) \to z \in w]$;
$F(x_1, \ldots, x_m) \in z \leftrightarrow \exists w \in z[w = F(x_1, \ldots, x_m)]$;
$\exists y \in F(x_1, \ldots, x_m)\,\varphi(y, x_1, \ldots, x_m, w_1, \ldots, w_n)$
$\leftrightarrow \exists w\,\exists y \in w[w = F(x_1, \ldots, x_m) \wedge \varphi(y, x_1, \ldots, x_m, w_1, \ldots, w_n)]$
$\leftrightarrow \forall w[w = F(x_1, \ldots, x_m) \to \exists y \in w\,\varphi(y, x_1, \ldots, x_m, w_1, \ldots, w_n)]$;
(iv)–(vi) are proved similarly.
We now give some more specific absoluteness results.
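Theorem 14.18. The following relations and functions are absolute for all transitive class models of ZF $-$ Inf:
(i) $x$ is an ordered pair
(ii) $A \times B$
(iii) $R$ is a relation
(iv) $\mathrm{dmn}(R)$
(v) $\mathrm{rng}(R)$
(vi) $R$ is a function
(vii) $R(x)$
(viii) $R$ is a one-one function
(ix) $x$ is an ordinal
Note concerning (vii): This is supposed to have its natural meaning if $R$ is a function and $x$ is in its domain; otherwise, $R(x) = \emptyset$.
Proof.
$x$ is an ordered pair $\leftrightarrow$ $(\exists y \in \bigcup x)(\exists z \in \bigcup x)[x = (y, z)]$;
$y = A \times B$ $\leftrightarrow$ $(\forall a \in A)(\forall b \in B)[(a, b) \in y] \wedge (\forall z \in y)(\exists a \in A)(\exists b \in B)[z = (a, b)]$;
$R$ is a relation $\leftrightarrow$ $\forall x \in R[x$ is an ordered pair$]$;
$x = \mathrm{dmn}(R)$ $\leftrightarrow$ $(\forall y \in x)(\exists z \in \bigcup\bigcup R)[(y, z) \in R] \wedge (\forall y \in \bigcup\bigcup R)(\forall z \in \bigcup\bigcup R)[(y, z) \in R \to y \in x]$;
$x = \mathrm{rng}(R)$ $\leftrightarrow$ $(\forall y \in x)(\exists z \in \bigcup\bigcup R)[(z, y) \in R] \wedge (\forall y \in \bigcup\bigcup R)(\forall z \in \bigcup\bigcup R)[(y, z) \in R \to z \in x]$;
$R$ is a function $\leftrightarrow$ $R$ is a relation $\wedge\ (\forall x \in \bigcup\bigcup R)(\forall y \in \bigcup\bigcup R)(\forall z \in \bigcup\bigcup R)[(x, y) \in R \wedge (x, z) \in R \to y = z]$;
$y = R(x)$ $\leftrightarrow$ $[R$ is a function $\wedge\ (x, y) \in R] \vee [R$ is not a function $\wedge\ (\forall z \in y)(z \ne z)] \vee [x \notin \mathrm{dmn}(R) \wedge (\forall z \in y)(z \ne z)]$;
$R$ is a one-one function $\leftrightarrow$ $R$ is a function $\wedge\ \forall x \in \mathrm{dmn}(R)\,\forall y \in \mathrm{dmn}(R)[R(x) = R(y) \to x = y]$;
$x$ is an ordinal $\leftrightarrow$ $x$ is transitive $\wedge\ (\forall y \in x)(y$ is transitive$)$.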
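Theorem 14.19. Suppose that $M$ is a transitive class model of ZF $-$ Inf and $\omega \in M$. Then the infinity axiom holds in $M$.
Proof. We have
\[ \exists x \in M[\emptyset \in x \wedge \forall y \in x(y \cup \{y\} \in x)], \]
so by the absoluteness of the notions here we get
\[ [\exists x[\emptyset \in x \wedge \forall y \in x(y \cup \{y\} \in x)]]^M, \]
which means that the infinity axiom holds in $M$.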
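Theorem 14.20. If $M$ is a transitive class model of ZF, then $\emptyset, \omega \in M$ and $M$ is closed under the following set-theoretic operations:
(i) $\cup$
(ii) $\cap$
(iii) $(a, b) \mapsto a \setminus b$
(iv) $(a, b) \mapsto \{a, b\}$
(v) $(a, b) \mapsto (a, b)$
(vi) $x \mapsto x \cup \{x\}$
(vii) $\bigcup$
(viii) $\bigcap$
Moreover, $\omega + 1 \subseteq M$ and $[M]^{<\omega} \subseteq M$.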
Proof. $M$ has elements $x, y$ such that $(x = \emptyset)^M$ and $(y = \omega)^M$. So $x = \emptyset$ and $y = \omega$ by absoluteness. (See Theorem 14.13(vii) and Theorem 14.19(iv).) (i)–(viii) are all very similar, so we only treat (i). Let $a, b \in M$. Then because $M \models$ ZF, there is a $c \in M$ such that $(c = a \cup b)^M$. By absoluteness, $c = a \cup b$.
Each $i \in \omega$ is in $M$ by transitivity. Hence $\omega + 1 \subseteq M$. Finally, we show by induction on $n$ that if $x \subseteq M$ and $|x| = n$ then $x \in M$. This is clear for $n = 0$. Now suppose inductively that $x \subseteq M$ and $|x| = n + 1$. Let $a \in x$ and set $y = x \setminus \{a\}$. So $|y| = n$. Hence $y \in M$ by the inductive hypothesis. Hence $x = y \cup \{a\} \in M$ by previous parts of this theorem.
Our final abstract absoluteness result concerns recursive definitions.
Theorem 14.21. Suppose that $R$ is a class relation which is well-founded and set-like on $A$, and $F : A \times V \to V$. Let $G$ be given by Theorem 11.8: for all $x \in A$,
\[ G(x) = F(x, G \restriction \mathrm{pred}(A, x, R)). \]
Let $M$ be a transitive class model of ZF, and assume the following additional conditions hold:
(i) $F$, $R$, and $A$ are absolute for $M$.
(ii) ($R$ is set-like on $A)^M$.
(iii) $\forall x \in M \cap A[\mathrm{pred}(A, x, R) \subseteq M]$.
Conclusion: $G$ is absolute for $M$.
Proof. First we claim
(1) ($R$ is well-founded on $A)^M$.
In fact, by absoluteness $R^M = R \cap (M \times M)$ and $A^M = A \cap M$, so it follows that in $M$ every nonempty subset of $A^M$ has an $R^M$-minimal element. Hence we can apply the recursion theorem within $M$ to define a function $H : A^M \to M$ such that for all $x \in A^M$,
\[ H(x) = F^M(x, H \restriction \mathrm{pred}^M(A^M, x, R^M)). \]
We claim that $H = G \restriction A^M$, which will prove the theorem. In fact, suppose that $x$ is $R$-minimal such that $x \in A^M$ and $G(x) \neq H(x)$. Then using absoluteness again,
\[ H(x) = F^M(x, H \restriction \mathrm{pred}^M(A^M, x, R^M)) = F(x, H \restriction \mathrm{pred}(A, x, R)) = G(x), \]
contradiction.
Theorem 14.21 is needed for many deeper applications of absoluteness. We illustrate its
use by the following result.
Theorem 14.22. The following are absolute for transitive class models of ZF.
(i) $\alpha + \beta$ (ordinal addition)
(ii) $\alpha \cdot \beta$ (ordinal multiplication)
(iii) $\alpha^\beta$ (ordinal exponentiation)
(iv) $\mathrm{rank}(x)$
(v) $\mathrm{trcl}(x)$
Proof. In each case it is mainly a matter of identifying $A$, $R$, $F$ to which to apply Theorem 14.21; checking the conditions of that theorem is straightforward.
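(i): For a fixed ordinal $\alpha$, take $A = \mathrm{On}$, $R = \{(\gamma, \delta) : \gamma, \delta \in \mathrm{On}, \text{ and } \gamma \in \delta\}$, and define $F : \mathrm{On} \times V \to V$ as follows:
\[ F(\beta, f) = \begin{cases}
\alpha & \text{if } f = \emptyset, \\
f(\bigcup \beta) \cup \{f(\bigcup \beta)\} & \text{if } f \text{ is a function with domain a successor ordinal } \beta, \\
\bigcup_{\gamma < \beta} f(\gamma) & \text{if } f \text{ is a function with domain a limit ordinal } \beta, \\
\emptyset & \text{otherwise.}
\end{cases} \]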
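(ii) and (iii) are treated similarly. For (iv), take $R = \{(x, y) : x \in y\}$, $A = V$, and define $F : V \times V \to V$ by setting, for all $x, f \in V$,
\[ F(x, f) = \begin{cases}
\bigcup_{y \in x} (f(y) \cup \{f(y)\}) & \text{if } f \text{ is a function with domain } x, \\
\emptyset & \text{otherwise.}
\end{cases} \]
For (v), let $R = \{(i, j) : i, j \in \omega \text{ and } i < j\}$, $A = \omega$, and define $F : \omega \times V \to V$ by setting, for all $m \in \omega$ and $f \in V$,
\[ F(m, f) = \begin{cases}
x & \text{if } m = 0, \\
f(\bigcup m) \cup \bigcup f(\bigcup m) & \text{if } m > 0 \text{ and } f \text{ is a function with domain } m, \\
\emptyset & \text{otherwise.}
\end{cases} \]
Then the function $G$ obtained from Theorem 14.21 is absolute for transitive class models of ZF, and $\mathrm{trcl}(x) = \bigcup_{m \in \omega} G(m)$.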
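The recursions for rank and transitive closure can be tried out on hereditarily finite sets. The following is only a minimal sketch, assuming sets are modelled as nested Python `frozenset` values; the helper names `ordinal`, `rank` and `trcl` are illustrative and not part of the text.

```python
def ordinal(n):
    """Von Neumann ordinal n as a nested frozenset (illustrative helper)."""
    result = frozenset()
    for _ in range(n):
        result = result | {result}   # successor step: x union {x}
    return result

def rank(x):
    """rank(x) = sup of rank(y) + 1 over y in x (0 for the empty set)."""
    return max((rank(y) + 1 for y in x), default=0)

def trcl(x):
    """Union of the stages G(0) = x, G(m+1) = G(m) union (union of G(m))."""
    stage = x
    while True:
        nxt = stage | frozenset(z for y in stage for z in y)
        if nxt == stage:             # stages have stabilised on a finite set
            return stage
        stage = nxt

x = frozenset({ordinal(2)})          # the set {2}
print(rank(x))                       # rank of {2}
print(trcl(x) == ordinal(3))         # trcl({2}) = {0, 1, 2} = 3
```

On finite inputs the stages stabilise after finitely many steps, so the loop in `trcl` terminates; for infinite sets the union over all $m \in \omega$ is genuinely needed.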
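Theorem 14.23. If $M$ is a transitive model of ZF, then:
(i) $\mathcal{P}^M(x) = \mathcal{P}(x) \cap M$ for any $x \in M$;
(ii) $V_\alpha^M = V_\alpha \cap M$ for any $\alpha \in M$.
Proof. (i): Assume that $x \in M$. Then for any set $y$,
\[ \begin{aligned}
y \in \mathcal{P}^M(x) &\quad\text{iff}\quad y \in M \text{ and } (y \subseteq x)^M \\
&\quad\text{iff}\quad y \in M \text{ and } y \subseteq x \\
&\quad\text{iff}\quad y \in \mathcal{P}(x) \cap M.
\end{aligned} \]
(ii): Assume that $\alpha \in M$. Then for any set $x$,
\[ \begin{aligned}
x \in V_\alpha^M &\quad\text{iff}\quad x \in M \text{ and } \mathrm{rank}^M(x) < \alpha \\
&\quad\text{iff}\quad x \in M \text{ and } \mathrm{rank}(x) < \alpha \\
&\quad\text{iff}\quad x \in V_\alpha \cap M.
\end{aligned} \]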
Proposition 14.24. "$R$ well-orders $A$" is absolute for models of ZF.
Proof. Let $M$ be a model of ZF. Suppose that $R, A \in M$. Clearly
\[ (R \text{ well-orders } A) \text{ iff } \exists x \exists f[x \text{ is an ordinal} \wedge f : x \to A \text{ is a bijection} \wedge \forall \beta, \gamma \in x[\beta < \gamma \text{ iff } (f(\beta), f(\gamma)) \in R]]. \]
From this and elementary absoluteness results it is clear that ($R$ well-orders $A)^M$ implies that ($R$ well-orders $A$). Now suppose that ($R$ well-orders $A$). Let $x$ and $f$ be such that $x$ is an ordinal, $f : x \to A$ is a bijection, and $\forall \beta, \gamma \in x[\beta < \gamma \text{ iff } (f(\beta), f(\gamma)) \in R]$. Since $M$ is a model of ZF, let $y, g \in M$ be such that in $M$ we have: $y$ is an ordinal, $g : y \to A$ is a bijection, and $\forall \beta, \gamma \in y[\beta < \gamma \text{ iff } (g(\beta), g(\gamma)) \in R]$. By simple absoluteness results, this is really true. Then $x = y$ and $f = g$ by the uniqueness conditions in 4.17–4.19.
For this and the other absoluteness results above, remember the remark which
followed Theorem 14.13: we do not have to assume that M is a model of all of
ZF, but is only a model of some finite portion of it.
Consistency of no inaccessibles
Theorem 14.25. If ZFC is consistent, then so is ZFC + "there do not exist uncountable inaccessible cardinals".
Proof. For brevity we interpret "inaccessible" to mean "uncountable and inaccessible". Let
\[ M = \{x : \forall \alpha[\alpha \text{ inaccessible} \rightarrow x \in V_\alpha]\}. \]
Thus $M$ is a class. We claim that $M$ is a model of ZFC + "there do not exist uncountable inaccessible cardinals". To prove this, we consider two possibilities.
Case 1. $M = V$. Then of course $M$ is a model of ZFC. Suppose that $\kappa$ is inaccessible. Then since $M = V$ we have $V = V_\kappa$, which is not possible, since $V_\kappa$ is a set. Thus $M$ is a model of ZFC + "there do not exist uncountable inaccessible cardinals".
Case 2. $M \neq V$. Let $x$ be a set which is not in $M$. Then there is an ordinal $\alpha$ such that $\alpha$ is inaccessible and $x \notin V_\alpha$. In particular, there is an inaccessible $\alpha$, and we let $\kappa$ be the least such.
(1) $M = V_\kappa$.
In fact, if $x \in M$, then $x \in V_\alpha$ for every inaccessible $\alpha$, so in particular $x \in V_\kappa$. On the other hand, if $x \in V_\kappa$, then $x \in V_\alpha$ for every $\alpha \geq \kappa$, so $x \in V_\alpha$ for every inaccessible $\alpha$, and so $x \in M$. So (1) holds.

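Now we show that $V_\kappa$ is as desired. First, we need to check all the ZFC axioms. Here we use Theorems 14.3–14.9. Now $V_\kappa$ is transitive, so by Theorem 14.3, the extensionality axiom holds in $V_\kappa$.
For the comprehension axioms, we are going to apply Theorem 14.4. Suppose that $\varphi$ is a formula with free variables among $x, z, w_1, \ldots, w_n$, and we are given $z, w_1, \ldots, w_n \in V_\kappa$. Let $A = \{x \in z : \varphi^{V_\kappa}(x, z, w_1, \ldots, w_n)\}$. Then $A \subseteq z$. Say $z \in V_\beta$ with $\beta < \kappa$. Then $A \subseteq z \subseteq V_\beta$, so $A \in \mathcal{P}(V_\beta) = V_{\beta+1} \subseteq V_\kappa$. It follows from Theorem 14.4 that the comprehension axioms hold in $V_\kappa$.
Suppose that $x, y \in V_\kappa$. Say $x, y \in V_\beta$ with $\beta < \kappa$. Then $\{x, y\} \subseteq V_\beta$, so $\{x, y\} \in V_{\beta+1} \subseteq V_\kappa$. By Theorem 14.5, the pairing axiom holds in $V_\kappa$.
Suppose that $x \in V_\kappa$. Say $x \in V_\beta$ with $\beta < \kappa$. Then $\bigcup x \subseteq V_\beta$, so $\bigcup x \in V_{\beta+1} \subseteq V_\kappa$. By Theorem 14.6, the union axiom holds in $V_\kappa$.
Suppose that $x \in V_\kappa$. Say $x \in V_\beta$ with $\beta < \kappa$. Then $x \subseteq V_\beta$. Hence $y \subseteq V_\beta$ for each $y \subseteq x$, so $y \in \mathcal{P}(V_\beta) = V_{\beta+1}$ for each $y \in \mathcal{P}(x)$. This means that $\mathcal{P}(x) \subseteq V_{\beta+1}$, so $\mathcal{P}(x) \in V_{\beta+2}$. By Theorem 14.7, the power set axiom holds in $V_\kappa$.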
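For the replacement axioms, we will apply Theorem 14.8. So, suppose that $\varphi$ is a formula with free variables among $x, y, A, w_1, \ldots, w_n$, that $A, w_1, \ldots, w_n \in V_\kappa$, and that
\[ \forall x \in A\, \exists! y[y \in V_\kappa \wedge \varphi^{V_\kappa}(x, y, A, w_1, \ldots, w_n)]. \]
For each $x \in A$, let $y_x$ be the unique set such that $y_x \in V_\kappa$ and $\varphi^{V_\kappa}(x, y_x, A, w_1, \ldots, w_n)$, and let $\beta_x < \kappa$ be such that $y_x \in V_{\beta_x}$. Choose $\gamma < \kappa$ such that $A \in V_\gamma$. Then $A \subseteq V_\gamma$, and hence $|A| \leq |V_\gamma| < \kappa$ by Theorem 11.5(ii). It follows that $\delta \overset{\mathrm{def}}{=} \bigcup \{\beta_x : x \in A\} < \kappa$. Let
\[ Y = \{z \in V_\delta : \exists x \in A\, \varphi^{V_\kappa}(x, z, A, w_1, \ldots, w_n)\}. \]
Thus $Y \subseteq V_\delta$, so $Y \in V_{\delta+1} \subseteq V_\kappa$. Suppose that $x \in A$ and $z \in V_\kappa$ is such that $\varphi^{V_\kappa}(x, z, A, w_1, \ldots, w_n)$. Then $z = y_x$ by the above, and so $z \in Y$, as desired.
By Theorem 14.9, the foundation axiom holds in $V_\kappa$.
We have now shown that $V_\kappa$ is a model of ZF $-$ Inf.
For the infinity axiom, note that by Theorem 11.4(v) we have $\omega \in V_{\omega+1} \subseteq V_\kappa$. Hence the infinity axiom holds in $V_\kappa$ by Theorem 14.19.
For the axiom of choice, suppose that $\mathscr{A} \in V_\kappa$ is a family of pairwise disjoint nonempty sets, and let $B \subseteq \bigcup \mathscr{A}$ have exactly one element in common with each member of $\mathscr{A}$. Say $\mathscr{A} \in V_\beta$ with $\beta < \kappa$. Then $B \subseteq \bigcup \mathscr{A} \subseteq V_\beta$, so $B \in V_{\beta+1} \subseteq V_\kappa$, and the axiom of choice thus holds in $V_\kappa$.
So $V_\kappa$ is a model of ZFC.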
Finally, suppose that $x \in V_\kappa$ and ($x$ is an inaccessible cardinal$)^{V_\kappa}$; we want to get a contradiction. In particular, ($x$ is an ordinal$)^{V_\kappa}$, so by absoluteness, $x$ is an ordinal. Absoluteness clearly implies that $x$ is infinite. We claim that $x$ is a cardinal. For, if $f : y \to x$ is a bijection with $y < x$, then clearly $f \in V_\kappa$, and hence by absoluteness ($f : y \to x$ is a bijection and $y < x)^{V_\kappa}$, contradiction. Similarly, $x$ is regular; otherwise there is an injection $f : y \to x$ with $y < x$ and $\mathrm{rng}(f)$ unbounded in $x$, so clearly $f \in V_\kappa$, and absoluteness again yields a contradiction. Thus $x$ is a regular cardinal. Hence, since $\kappa$ is the smallest inaccessible, there is a $y \in x$ such that there is a one-one function $g$ from $x$ into $\mathcal{P}(y)$. Again, $g \in V_\kappa$, and easy absoluteness results contradict ($x$ is an inaccessible cardinal$)^{V_\kappa}$.
Reflection theorems
We now want to consider to what extent sentences can reflect to proper subclasses of V;
this is a natural extension of our considerations for absoluteness.
Lemma 14.26. Suppose that $M$ and $N$ are classes with $M \subseteq N$. Let $\varphi_0, \ldots, \varphi_n$ be a list of formulas such that if $i \leq n$ and $\psi$ is a subformula of $\varphi_i$, then there is a $j \leq n$ such that $\varphi_j$ is $\psi$. Then the following conditions are equivalent:
(i) Each $\varphi_i$ is absolute for $M, N$.
(ii) If $i \leq n$ and $\varphi_i$ has the form $\exists x \varphi_j(x, y_1, \ldots, y_t)$ with $x, y_1, \ldots, y_t$ exactly all the free variables of $\varphi_j$, then
\[ \forall y_1, \ldots, y_t \in M[\exists x \in N\, \varphi_j^N(x, y_1, \ldots, y_t) \rightarrow \exists x \in M\, \varphi_j^N(x, y_1, \ldots, y_t)]. \]
Note the seemingly minor respects in which (ii) differs from the definition of absoluteness: the implication goes only one direction, and on the right side of the implication we relativize $\varphi_j$ to $N$ rather than $M$.
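Proof. (i)$\Rightarrow$(ii): Assume (i) and the hypothesis of (ii). Suppose that $y_1, \ldots, y_t \in M$ and $\exists x \in N\, \varphi_j^N(x, y_1, \ldots, y_t)$. Thus by absoluteness $\exists x \in M\, \varphi_j^M(x, y_1, \ldots, y_t)$; choose $x \in M$ such that $\varphi_j^M(x, y_1, \ldots, y_t)$. Hence by absoluteness again, $\varphi_j^N(x, y_1, \ldots, y_t)$. Hence $\exists x \in M\, \varphi_j^N(x, y_1, \ldots, y_t)$, as desired.
(ii)$\Rightarrow$(i): Assume (ii). We prove that $\varphi_i$ is absolute for $M, N$ by induction on the length of $\varphi_i$. This is clear if $\varphi_i$ is atomic, and it easily follows inductively if $\varphi_i$ has the form $\neg\varphi_j$ or $\varphi_j \wedge \varphi_k$. Now suppose that $\varphi_i$ is $\exists x \varphi_j(x, y_1, \ldots, y_t)$, and $y_1, \ldots, y_t \in M$. Then
\[ \begin{aligned}
\varphi_i^M(y_1, \ldots, y_t) &\leftrightarrow \exists x \in M\, \varphi_j^M(x, y_1, \ldots, y_t) && \text{(definition of relativization)} \\
&\leftrightarrow \exists x \in M\, \varphi_j^N(x, y_1, \ldots, y_t) && \text{(induction hypothesis)} \\
&\leftrightarrow \exists x \in N\, \varphi_j^N(x, y_1, \ldots, y_t) && \text{(by (ii))} \\
&\leftrightarrow \varphi_i^N(y_1, \ldots, y_t) && \text{(definition of relativization)}
\end{aligned} \]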
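Theorem 14.27. Suppose that $Z(\alpha)$ is a set for every ordinal $\alpha$, and the following conditions hold:
(i) If $\alpha < \beta$, then $Z(\alpha) \subseteq Z(\beta)$.
(ii) If $\gamma$ is a limit ordinal, then $Z(\gamma) = \bigcup_{\alpha < \gamma} Z(\alpha)$.
Let $Z = \bigcup_{\alpha \in \mathrm{On}} Z(\alpha)$. Then for any formulas $\varphi_0, \ldots, \varphi_{n-1}$,
\[ \forall \alpha\, \exists \beta > \alpha[\varphi_0, \ldots, \varphi_{n-1} \text{ are absolute for } Z(\beta), Z]. \]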
Proof. Assume the hypothesis, and let an ordinal $\alpha$ be given. We are going to apply Lemma 14.26 with $N = Z$, and we need to find an appropriate $\beta > \alpha$ so that we can take $M = Z(\beta)$ in 14.26.
We may assume that $\varphi_0, \ldots, \varphi_{n-1}$ is subformula-closed; i.e., if $i < n$, then every subformula of $\varphi_i$ is in the list. Let $A$ be the set of all $i < n$ such that $\varphi_i$ begins with an existential quantifier. Suppose that $i \in A$ and $\varphi_i$ is the formula $\exists x \varphi_j(x, y_1, \ldots, y_t)$, where $x, y_1, \ldots, y_t$ are exactly all the free variables of $\varphi_j$. We now define a class function $G_i$ as follows. For any sets $y_1, \ldots, y_t$,
\[ G_i(y_1, \ldots, y_t) = \begin{cases}
\text{the least } \eta \text{ such that } \exists x \in Z(\eta)\, \varphi_j^Z(x, y_1, \ldots, y_t) & \text{if there is such,} \\
0 & \text{otherwise.}
\end{cases} \]
Then for each ordinal $\xi$ we define
\[ F_i(\xi) = \sup\{G_i(y_1, \ldots, y_t) : y_1, \ldots, y_t \in Z(\xi)\}; \]
note that this supremum exists by the replacement axiom.
Now we define a sequence $\gamma_0, \ldots, \gamma_p, \ldots$ of ordinals by recursion on $p \in \omega$. Let $\gamma_0 = \alpha + 1$. Having defined $\gamma_p$, let
\[ \gamma_{p+1} = \max(\gamma_p + 1, \sup\{F_i(\xi) : i \in A, \xi \leq \gamma_p\} + 1). \]
Finally, let $\beta = \sup_{p \in \omega} \gamma_p$. Clearly $\alpha < \beta$ and $\beta$ is a limit ordinal.
(1) If $i \in A$, $y_1, \ldots, y_t \in Z(\beta)$, and $\exists x \in Z\, \varphi_j^Z(x, y_1, \ldots, y_t)$, then there is an $x \in Z(\beta)$ such that $\varphi_j^Z(x, y_1, \ldots, y_t)$.
In fact, choose $p$ such that $y_1, \ldots, y_t \in Z(\gamma_p)$; this is possible by (ii), since $\beta$ is a limit ordinal. Then $G_i(y_1, \ldots, y_t) \leq F_i(\gamma_p) < \gamma_{p+1}$. Hence an $x$ as in (1) exists, with $x \in Z(\gamma_{p+1})$.
Now the theorem follows from Lemma 14.26.
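Corollary 14.28. (The reflection theorem) For any formulas $\varphi_1, \ldots, \varphi_n$,
\[ \mathrm{ZF} \models \forall \alpha\, \exists \beta > \alpha[\varphi_1, \ldots, \varphi_n \text{ are absolute for } V_\beta]. \]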
Theorem 14.29. Suppose that $Z$ is a class and $\varphi_1, \ldots, \varphi_n$ are formulas. Then
\[ \forall X \subseteq Z\, \exists A[X \subseteq A \subseteq Z \text{ and } \varphi_1, \ldots, \varphi_n \text{ are absolute for } A, Z \text{ and } |A| \leq \max(\omega, |X|)]. \]
Proof. We may assume that $\varphi_1, \ldots, \varphi_n$ is subformula closed. For each ordinal $\alpha$ let $Z(\alpha) = Z \cap V_\alpha$. Clearly there is an ordinal $\alpha$ such that $X \subseteq V_\alpha$, and hence $X \subseteq Z(\alpha)$. Now we apply Theorem 14.27 to obtain an ordinal $\beta > \alpha$ such that
(1) $\varphi_1, \ldots, \varphi_n$ are absolute for $Z(\beta), Z$.
Let $\prec$ be a well-order of $Z(\beta)$. Let $B$ be the set of all indices $i$ such that $\varphi_i$ begins with an existential quantifier. Suppose that $i \in B$ and $\varphi_i$ is the formula $\exists x \varphi_j(x, y_1, \ldots, y_t)$, where $x, y_1, \ldots, y_t$ are exactly all the free variables of $\varphi_j$. We now define a function $H_i$ for each $i \in B$ as follows. For any sets $y_1, \ldots, y_t \in Z(\beta)$,
\[ H_i(y_1, \ldots, y_t) = \begin{cases}
\text{the } \prec\text{-least } x \in Z(\beta) \text{ such that } \varphi_j^{Z(\beta)}(x, y_1, \ldots, y_t) & \text{if there is such,} \\
\text{the } \prec\text{-least element of } Z(\beta) & \text{otherwise.}
\end{cases} \]
Let $A \subseteq Z(\beta)$ be closed under each function $H_i$, with $X \subseteq A$. We claim that $A$ is as desired. To prove the absoluteness, it suffices by Lemma 14.26 to take any formula $\varphi_i$ with $i \in B$, with notation as above, assume that $y_1, \ldots, y_t \in A$ and $\exists x \in Z\, \varphi_j^Z(x, y_1, \ldots, y_t)$, and find $x \in A$ such that $\varphi_j^Z(x, y_1, \ldots, y_t)$. By (1), there is an $x \in Z(\beta)$ such that $\varphi_j^{Z(\beta)}(x, y_1, \ldots, y_t)$. Hence $x = H_i(y_1, \ldots, y_t)$ is an element of $A$ such that $\varphi_j^{Z(\beta)}(x, y_1, \ldots, y_t)$, and so $\varphi_j^Z(x, y_1, \ldots, y_t)$ by (1) again, as desired.
It remains only to check the cardinality estimate. This is elementary.
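The closure step in this proof can be illustrated on a finite toy universe. This is only a sketch under stated assumptions: `UNIVERSE` (the numbers below 100 with their usual order) stands in for a well-ordered $Z(\beta)$, and `least_prime_above` is a hypothetical witness-picking function playing the role of one $H_i$; neither comes from the text.

```python
from itertools import product

UNIVERSE = range(100)   # toy stand-in for Z(beta), well-ordered by <

def is_prime(x):
    return x >= 2 and all(x % d for d in range(2, int(x ** 0.5) + 1))

def least_prime_above(y):
    """Least x in UNIVERSE with x prime and x > y; else the least element."""
    for x in UNIVERSE:
        if x > y and is_prime(x):
            return x
    return UNIVERSE[0]

def closure(X, functions):
    """Least superset of X closed under the given (function, arity) pairs."""
    A = set(X)
    while True:
        new = {f(*args)
               for f, arity in functions
               for args in product(A, repeat=arity)} - A
        if not new:
            return A
        A |= new

A = closure({10}, [(least_prime_above, 1)])
print(sorted(A))
```

Each round adds at most finitely many values per tuple of arguments, which is the finite analogue of the counting behind the bound $|A| \leq \max(\omega, |X|)$.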
Lemma 14.30. Suppose that $G$ is a bijection from $A$ onto $M$, and for any $a, b \in A$ we have $a \in b$ iff $G(a) \in G(b)$. Then for any formula $\varphi(x_1, \ldots, x_n)$ and any $x_1, \ldots, x_n \in A$,
\[ \varphi^A(x_1, \ldots, x_n) \leftrightarrow \varphi^M(G(x_1), \ldots, G(x_n)). \]
Proof. An easy induction on $\varphi$.
Theorem 14.31. Suppose that $Z$ is a transitive class and $\varphi_0, \ldots, \varphi_{m-1}$ are sentences. Suppose that $X$ is a transitive subset of $Z$. Then there is a transitive set $M$ such that $X \subseteq M$, $|M| \leq \max(\omega, |X|)$, and for every $i < m$, $\varphi_i^M \leftrightarrow \varphi_i^Z$.
Proof. We may assume that the extensionality axiom is one of the $\varphi_i$'s. Now we apply Theorem 14.29 to get a set $A$ as indicated there. By Proposition 13.11, there is a transitive set $M$ and a bijection $G$ from $A$ onto $M$ such that for any $a, b \in A$, $a \in b$ iff $G(a) \in G(b)$. Hence all of the desired conditions are clear, except possibly $X \subseteq M$. We show that $G[X] = X$ by proving that $G(x) = x$ for all $x \in X$. In fact, suppose that $G(x) \neq x$ for some $x \in X$, and by the foundation axiom choose $y \in X$ such that $G(y) \neq y$ while $G(z) = z$ for all $z \in y$. Then if $z \in y$ we have $z, y \in X \subseteq A$, and hence $z = G(z) \in G(y)$. So $y \subseteq G(y)$. If $w \in G(y)$, then $w \in M = \mathrm{rng}(G)$, so we can choose $z \in A$ such that $w = G(z)$. Then $G(z) \in G(y)$, so $z \in y$. Hence $w = G(z) = z$ and so $w \in y$. This gives $G(y) \subseteq y$, and finishes the proof.
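Corollary 14.32. Suppose that $S$ is a set of sentences containing ZFC. Suppose also that $\varphi_0, \ldots, \varphi_{n-1} \in S$. Then
\[ S \models \exists M \Big[ M \text{ is transitive, } |M| = \omega, \text{ and } \bigwedge_{i<n} \varphi_i^M \Big]. \]
Proof. Take $Z = V$ and $X = \omega$ in Theorem 14.31.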


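The following theorem can be taken as a basis for working with countable transitive models of ZFC.
Theorem 14.33. Suppose that $S$ is a consistent set of sentences containing ZFC. Expand the basic set-theoretic language by adding an individual constant $\mathbf{M}$. Then the following set of sentences is consistent:
\[ S \cup \{\mathbf{M} \text{ is transitive}\} \cup \{|\mathbf{M}| = \omega\} \cup \{\varphi^{\mathbf{M}} : \varphi \in S\}. \]
Proof. Suppose that the indicated set is not consistent. Then there are $\varphi_0, \ldots, \varphi_{m-1}$ in $S$ such that
\[ S \models \mathbf{M} \text{ is transitive and } |\mathbf{M}| = \omega \rightarrow \neg \bigwedge_{i<m} \varphi_i^{\mathbf{M}}; \]
since the constant $\mathbf{M}$ does not occur in $S$, it follows that
\[ S \models \neg\exists M \Big[ M \text{ is transitive, } |M| = \omega, \text{ and } \bigwedge_{i<m} \varphi_i^M \Big], \]
contradicting Corollary 14.32.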


Indescribability
We give an important equivalent of weak compactness. This is optional material.
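An infinite cardinal $\kappa$ is first-order describable iff there is a $U \subseteq V_\kappa$ and a sentence $\varphi$ in the language for $(V_\kappa, \in, U)$ such that $(V_\kappa, \in, U) \models \varphi$, while there is no $\alpha < \kappa$ such that $(V_\alpha, \in, U \cap V_\alpha) \models \varphi$.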
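Theorem 14.34. If $\kappa$ is infinite but not inaccessible, then it is first-order describable.
Proof. $\omega$ is describable by the sentence that says that $\kappa$ is the first limit ordinal; absoluteness is used. The subset $U$ is not needed for this. Now suppose that $\kappa$ is singular. Let $\lambda = \mathrm{cf}(\kappa)$, and let $f$ be a function whose domain is some ordinal $\gamma < \kappa$ with $\mathrm{rng}(f)$ cofinal in $\kappa$. Let $U = \{(\gamma, \beta, f(\beta)) : \beta < \gamma\}$. Let $\varphi$ be the sentence expressing the following:
For every ordinal $\alpha$ there is an ordinal $\beta$ with $\alpha < \beta$, $U$ is nonempty, and there is an ordinal $\gamma$ and a function $g$ with domain $\gamma$ such that $U$ consists of all triples $(\gamma, \beta, g(\beta))$ with $\beta < \gamma$.
Clearly $(V_\kappa, \in, U) \models \varphi$. Suppose that $\alpha < \kappa$ and $(V_\alpha, \in, V_\alpha \cap U) \models \varphi$. Then $\alpha$ is a limit ordinal, and there is an ordinal $\delta < \alpha$ and a function $g$ with domain $\delta$ such that $V_\alpha \cap U$ consists of all triples $(\delta, \beta, g(\beta))$ with $\beta < \delta$. (Some absoluteness is used.) Now $V_\alpha \cap U$ is nonempty; choose $(\delta, \beta, g(\beta))$ in it. Then $\delta = \gamma$ since it is in $U$. It follows that $g = f$. Choose $\beta < \gamma$ such that $\alpha < f(\beta)$. Then $(\gamma, \beta, f(\beta)) \in U \cap V_\alpha$. Since $\alpha < f(\beta)$, it follows that $\alpha$ has rank less than $\alpha$, contradiction.
Now suppose that $\lambda < \kappa \leq 2^\lambda$. A contradiction is reached similarly, as follows. Let $f$ be a function whose domain is $\mathcal{P}(\lambda)$ with range $\kappa$. Let $U = \{(\lambda, B, f(B)) : B \subseteq \lambda\}$. Let $\varphi$ be the sentence expressing the following:
For every ordinal $\alpha$ there is an ordinal $\beta$ with $\alpha < \beta$, $U$ is nonempty, and there is an ordinal $\gamma$ and a function $g$ with domain $\mathcal{P}(\gamma)$ such that $U$ consists of all triples $(\gamma, B, g(B))$ with $B \subseteq \gamma$.
Clearly $(V_\kappa, \in, U) \models \varphi$. Suppose that $\alpha < \kappa$ and $(V_\alpha, \in, V_\alpha \cap U) \models \varphi$. Then $\alpha$ is a limit ordinal, and there is an ordinal $\gamma < \alpha$ and a function $g$ with domain $\mathcal{P}(\gamma)$ such that $V_\alpha \cap U$ consists of all triples $(\gamma, B, g(B))$ with $B \subseteq \gamma$. (Some absoluteness is used.) Clearly $\gamma = \lambda$; otherwise $U \cap V_\alpha$ would be empty. Note that $g = f$. Choose $B \subseteq \lambda$ such that $\alpha = f(B)$. Then $(\lambda, B, f(B)) \in U \cap V_\alpha$. Again this implies that $\alpha$ has rank less than $\alpha$, contradiction.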
We need the following little fact about the Mostowski collapse.
Theorem 14.35. Suppose that $R$ is a well-founded class relation on a class $A$, and it is set-like and extensional. Also suppose that $B \subseteq A$, $B$ is transitive, $\forall a, b \in A[aRb \in B \rightarrow a \in B]$, and $\forall a, b \in B[aRb \leftrightarrow a \in b]$. Let $G, M$ be the Mostowski collapse of $(A, R)$. Then $G \restriction B$ is the identity.
Proof. Suppose not, and let $X = \{b \in B : G(b) \neq b\}$. Since we are assuming that $X$ is a nonempty subclass of $A$, by Proposition 13.7 choose $b \in X$ such that $y \in A$ and $yRb$ imply that $y \notin X$. Then
\[ \begin{aligned}
G(b) &= \{G(y) : y \in A \text{ and } yRb\} \\
&= \{G(y) : y \in B \text{ and } yRb\} \\
&= \{y : y \in B \text{ and } yRb\} \\
&= \{y : y \in B \text{ and } y \in b\} \\
&= \{y : y \in b\} \\
&= b,
\end{aligned} \]
contradiction.
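For a finite well-founded relation the collapse can be computed directly from the recursion $G(b) = \{G(y) : yRb\}$. The following is a minimal sketch, assuming $R$ is given as a set of pairs `(a, b)` meaning $aRb$ and that elements are hashable; the function name `mostowski_collapse` and the label `"c"` are illustrative only.

```python
def mostowski_collapse(A, R):
    """Return the map G with G(b) = {G(a) : a R b} as a dict.

    Assumes R is well-founded on A (no cycles), so the recursion stops.
    """
    preds = {b: [a for (a, c) in R if c == b] for b in A}
    G = {}

    def collapse(b):
        if b not in G:
            G[b] = frozenset(collapse(a) for a in preds[b])
        return G[b]

    for b in A:
        collapse(b)
    return G

# B = {e0, e1} is transitive and R agrees with membership on B;
# "c" is an extra point lying R-above both elements of B.
e0 = frozenset()
e1 = frozenset({e0})
A = {e0, e1, "c"}
R = {(e0, e1), (e0, "c"), (e1, "c")}
G = mostowski_collapse(A, R)
print(G[e0] == e0, G[e1] == e1)   # G restricted to B is the identity
print(G["c"] == frozenset({e0, e1}))
```

In this toy example the hypotheses of Theorem 14.35 hold for $B = \{e_0, e_1\}$, and the collapse fixes both of its elements while sending the extra point to the ordinal 2.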
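Lemma 14.36. Let $\kappa$ be weakly compact. Then for every $U \subseteq V_\kappa$, the structure $(V_\kappa, \in, U)$ has a transitive elementary extension $(M, \in, U')$ such that $\kappa \in M$.
Proof. Let $\Gamma$ be the set of all $L_{\kappa\kappa}$-sentences true in the structure $(V_\kappa, \in, U, x)_{x \in V_\kappa}$, together with the sentences
$c$ is an ordinal,
$\alpha < c$ (for all $\alpha < \kappa$),
where $c$ is a new individual constant. The language here clearly has $\kappa$ many symbols. Every subset $\Gamma'$ of $\Gamma$ of size less than $\kappa$ has a model; namely we can take $(V_\kappa, \in, U, x, \beta)_{x \in V_\kappa}$, choosing $\beta < \kappa$ greater than each $\alpha$ appearing in the sentences of $\Gamma'$. Hence by weak compactness, $\Gamma$ has a model $(M, E, W, k_x, y)_{x \in V_\kappa}$. This model is well-founded, since the sentence
\[ \neg\exists v_0 v_1 \ldots \Big[ \bigwedge_{n \in \omega} (v_{n+1} \in v_n) \Big] \]
holds in $(V_\kappa, \in, U, x)_{x \in V_\kappa}$, and hence in $(M, E, W, k_x, y)_{x \in V_\kappa}$.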
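Note that $k$ is an injection of $V_\kappa$ into $M$. Let $F$ be a bijection from $M \setminus \mathrm{rng}(k)$ onto $\{(V_\kappa, u) : u \in M \setminus \mathrm{rng}(k)\}$. Then $G \overset{\mathrm{def}}{=} k^{-1} \cup F$ is one-one, mapping $M$ onto some set $N$ such that $V_\kappa \subseteq N$. We define, for $x, z \in N$, $xE'z$ iff $G^{-1}(x) E G^{-1}(z)$. Then $G$ is an isomorphism from $(M, E, W, k_x, y)_{x \in V_\kappa}$ onto $\overline{N} \overset{\mathrm{def}}{=} (N, E', G[W], x, G(y))_{x \in V_\kappa}$. Of course $\overline{N}$ is still well-founded. It is also extensional, since the extensionality axiom holds in $(V_\kappa, \in)$ and hence in $(M, E)$ and $(N, E')$. Let $H, P$ be the Mostowski collapse of $(N, E')$. Thus $P$ is a transitive set, and
(1) $H$ is an isomorphism from $(N, E')$ onto $(P, \in)$.
(2) $\forall a, b \in N[aE'b \in V_\kappa \rightarrow a \in b]$.
In fact, suppose that $a, b \in N$ and $aE'b \in V_\kappa$. Let the individual constants used in the expansion of $(V_\kappa, \in, U)$ to $(V_\kappa, \in, U, x)_{x \in V_\kappa}$ be $\langle k_x : x \in V_\kappa \rangle$. Then
\[ (V_\kappa, \in, U, x)_{x \in V_\kappa} \models \forall z \Big[ z \in k_b \rightarrow \bigvee_{w \in b} (z = k_w) \Big], \]
and hence this sentence holds in $(N, E', G[W], x, G(y))_{x \in V_\kappa}$ as well, and so there is a $w \in b$ such that $a = w$, i.e., $a \in b$. So (2) holds.
(3) $\forall a, b \in V_\kappa[a \in b \rightarrow aE'b]$.
In fact, suppose that $a, b \in V_\kappa$ and $a \in b$. Then the sentence $k_a \in k_b$ holds in $(V_\kappa, \in, U, x)_{x \in V_\kappa}$, so it also holds in $(N, E', G[W], x, G(y))_{x \in V_\kappa}$, so that $aE'b$.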
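We have now verified the hypotheses of Theorem 14.35. It follows that $H \restriction V_\kappa$ is the identity. In particular, $V_\kappa \subseteq P$. Now take any sentence $\sigma$ in the language of $(V_\kappa, \in, U, x)_{x \in V_\kappa}$. Then
\[ \begin{aligned}
(V_\kappa, \in, U, x)_{x \in V_\kappa} \models \sigma &\quad\text{iff}\quad (M, E, W, k_x)_{x \in V_\kappa} \models \sigma \\
&\quad\text{iff}\quad (N, E', G[W], x)_{x \in V_\kappa} \models \sigma \\
&\quad\text{iff}\quad (P, \in, H[G[W]], x)_{x \in V_\kappa} \models \sigma.
\end{aligned} \]
Thus $(P, \in, H[G[W]])$ is an elementary extension of $(V_\kappa, \in, U)$.
Now for $\alpha < \kappa$ we have
$(M, E, W, k_x, y)_{x \in V_\kappa} \models [y$ is an ordinal and $k_\alpha E y]$, hence
$(N, E', G[W], x, G(y))_{x \in V_\kappa} \models [G(y)$ is an ordinal and $\alpha E' G(y)]$, hence
$(P, \in, H[G[W]], x, H(G(y)))_{x \in V_\kappa} \models [H(G(y))$ is an ordinal and $\alpha \in H(G(y))]$.
Thus $H(G(y))$ is an ordinal in $P$ greater than each $\alpha < \kappa$, so since $P$ is transitive, $\kappa \in P$.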
The new equivalent of weak compactness involves second-order logic. We augment first-order logic by adding a new variable $S$ ranging over subsets rather than elements. There is one new kind of atomic formula: $Sv$ with $v$ a first-order variable. This is interpreted as saying that $v$ is a member of $S$.
Now an infinite cardinal $\kappa$ is $\Pi^1_1$-indescribable iff for every $U \subseteq V_\kappa$ and every second-order sentence $\sigma$ of the form $\forall S \varphi$, with no quantifiers on $S$ within $\varphi$, if $(V_\kappa, \in, U) \models \sigma$, then there is an $\alpha < \kappa$ such that $(V_\alpha, \in, U \cap V_\alpha) \models \sigma$.
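Theorem 14.37. An infinite cardinal $\kappa$ is weakly compact iff it is $\Pi^1_1$-indescribable.
Proof. First suppose that $\kappa$ is $\Pi^1_1$-indescribable. By Theorem 14.34 it is inaccessible. So it suffices to show that it has the tree property. By the proof of Theorem 12.7(iii)$\Rightarrow$(iv) it suffices to check the tree property for a tree $T \subseteq {}^{<\kappa}\kappa$. Note that ${}^{<\kappa}\kappa \subseteq V_\kappa$. Let $\sigma$ be the following sentence in the second-order language of $(V_\kappa, \in, T)$:
\[ \exists S[T \text{ is a tree under } \subset, \text{ and } S \subseteq T \text{ and } S \text{ is a branch of } T \text{ of unbounded length}]. \]
Thus for each $\alpha < \kappa$ the sentence $\sigma$ holds in $(V_\alpha, \in, T \cap V_\alpha)$. Hence it holds in $(V_\kappa, \in, T)$, as desired.
Now suppose that $\kappa$ is weakly compact. Let $U \subseteq V_\kappa$, and let $\sigma$ be a $\Pi^1_1$-sentence holding in $(V_\kappa, \in, U)$. By Lemma 14.36, let $(M, \in, U')$ be a transitive elementary extension of $(V_\kappa, \in, U)$ such that $\kappa \in M$. Say that $\sigma$ is $\forall S \varphi$, with $\varphi$ having no quantifiers on $S$. Now
(1) $\forall X \subseteq V_\kappa[(V_\kappa, \in, U) \models \varphi(X)]$.
Now since $\kappa \in M$ and $(M, \in)$ is a model of ZFC, $V_\kappa^M$ exists, and by absoluteness it is equal to $V_\kappa$. Hence by (1) we get
\[ (M, \in, U') \models \forall X \subseteq V_\kappa\, \varphi^{V_\kappa}(U' \cap V_\kappa). \]
Hence
\[ (M, \in, U') \models \exists \alpha\, \forall X \subseteq V_\alpha\, \varphi^{V_\alpha}(U' \cap V_\alpha), \]
so by the elementary extension property we get
\[ (V_\kappa, \in, U) \models \exists \alpha\, \forall X \subseteq V_\alpha\, \varphi^{V_\alpha}(U \cap V_\alpha). \]
We choose such an $\alpha$. Since $V_\kappa \cap \mathrm{On} = \kappa$, it follows that $\alpha < \kappa$. Hence $(V_\alpha, \in, U \cap V_\alpha) \models \sigma$, as desired.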
A diagram of large cardinals
We define some more large cardinals, and then indicate relationships between them by a
diagram.
All cardinals are assumed to be uncountable.
1. regular limit cardinals.
2. inaccessible.
3. Mahlo.
4. weakly compact.
5. indescribable. The $\omega$-order language is an extension of first-order logic in which one has variables of each type $n \in \omega$. For $n$ positive, a variable of type $n$ ranges over $\mathcal{P}^n(A)$ for a given structure $A$. In addition to first-order atomic formulas, one has formulas $P \in Q$ with $P$ $n$-th order and $Q$ $(n+1)$-th order. Quantification is allowed over the higher order variables.
$\kappa$ is indescribable iff for all $U \subseteq V_\kappa$ and every higher order sentence $\sigma$, if $(V_\kappa, \in, U) \models \sigma$ then there is an $\alpha < \kappa$ such that $(V_\alpha, \in, U \cap V_\alpha) \models \sigma$.
6. $\kappa \to (\omega)^{<\omega}_2$. Here in general $\kappa \to (\alpha)^{<\omega}_m$ means that for every function $f : \bigcup_{n \in \omega} [\kappa]^n \to m$ there is a subset $H \subseteq \kappa$ of order type $\alpha$ such that for each $n \in \omega$, $f \restriction [H]^n$ is constant.
7. $0^\sharp$ exists. This means that there is a non-identity elementary embedding of $L$ into $L$. Thus no actual cardinal is referred to. But $0^\sharp$ implies the existence of some large cardinals, and the existence of some large cardinals implies that $0^\sharp$ exists.
8. Jónsson. $\kappa$ is a Jónsson cardinal iff every model of size $\kappa$ has a proper elementary substructure of size $\kappa$.
9. Rowbottom. $\kappa$ is a Rowbottom cardinal iff for every uncountable $\lambda < \kappa$, every model of type $(\kappa, \lambda)$ has an elementary submodel of type $(\kappa, \omega)$.
10. Ramsey. $\kappa \to (\kappa)^{<\omega}_2$.
11. measurable
12. strong. $\kappa$ is a strong cardinal iff for every set $X$ there exists a nontrivial elementary embedding from $V$ to $M$ with $\kappa$ the first ordinal moved and with $X \in M$.
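13. Woodin. $\kappa$ is a Woodin cardinal iff
\[ \forall A \subseteq V_\kappa\, \forall \lambda < \kappa\, \exists \mu \in (\lambda, \kappa)\, \forall \nu < \kappa\, \exists j[j \text{ is a nontrivial elementary embedding of } V \text{ into some set } M, \text{ with } \mu \text{ the first ordinal moved, such that } j(\mu) > \nu,\ V_\nu \subseteq M,\ A \cap V_\nu = j(A) \cap V_\nu]. \]
14. superstrong. $\kappa$ is superstrong iff there is a nontrivial elementary embedding $j : V \to M$ with $\kappa$ the first ordinal moved, such that $V_{j(\kappa)} \subseteq M$.
15. strongly compact. $\kappa$ is strongly compact iff for any $L_{\kappa\kappa}$-language, if $\Gamma$ is a set of sentences and every subset of $\Gamma$ of size less than $\kappa$ has a model, then $\Gamma$ itself has a model.
16. supercompact. $\kappa$ is supercompact iff for every $A$ with $|A| \geq \kappa$ there is a normal measure on $\mathcal{P}_\kappa(A)$.
17. extendible. For an ordinal $\eta$, we say that $\kappa$ is $\eta$-extendible iff there exist $\zeta$ and a nontrivial elementary embedding $j : V_{\kappa+\eta} \to V_\zeta$ with $\kappa$ the first ordinal moved, with $\eta < j(\kappa)$. $\kappa$ is extendible iff it is $\eta$-extendible for every $\eta > 0$.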
18. Vopěnka's principle. If $C$ is a proper class of models in a given first-order language, then there exist two distinct members $A, B \in C$ such that $A$ can be elementarily embedded in $B$.
19. huge. A cardinal $\kappa$ is huge iff there is a nontrivial elementary embedding $j : V \to M$ with $\kappa$ the first ordinal moved, such that $M^{j(\kappa)} \subseteq M$.
20. I0. There is an ordinal $\delta$ and a proper elementary embedding $j$ of $L(V_{\delta+1})$ into $L(V_{\delta+1})$ such that the first ordinal moved is less than $\delta$.
In the diagram on the next page, a line indicates that (the consistency of the) existence of the cardinal above implies (the consistency of the) existence of the one below.
EXERCISES
E14.1. For any infinite cardinal $\kappa$, let $H(\kappa)$ be the set of all $x$ such that $|\mathrm{trcl}(x)| < \kappa$. Prove that $V_\omega = H(\omega)$. ($H(\omega)$ is the collection of all hereditarily finite sets.) Hint: $V_\omega \subseteq H(\omega)$ is easy. For the other direction, suppose that $x \in H(\omega)$, let $t = \mathrm{trcl}(x)$, and let $S = \{\mathrm{rank}(y) : y \in t\}$. Show that $S$ is an ordinal.
E14.2. Which axioms of ZFC are true in On?
E14.3. Show that the power set operation is absolute for $V_\alpha$ for $\alpha$ limit.
E14.4. Let $M$ be a countable transitive model of ZFC. Show that the power set operation is not absolute for $M$.
E14.5. Show that $V_\omega$ is a model of ZFC $-$ Inf.
E14.6. Show that the formula $\exists x(x \in y)$ is not absolute for all nonempty sets, but it is absolute for all nonempty transitive sets.
E14.7. Show that the formula $\exists z(x \in z)$ is not absolute for every nonempty transitive set.
E14.8. A formula is $\Sigma_1$ iff it has the form $\exists x \varphi$ with $\varphi$ a $\Delta_0$ formula; it is $\Pi_1$ iff it has the form $\forall x \varphi$ with $\varphi$ a $\Delta_0$ formula.
(i) Show that "$X$ is countable" is equivalent on the basis of ZF to a $\Sigma_1$ formula.
(ii) Show that "$\alpha$ is a cardinal" is equivalent on the basis of ZF to a $\Pi_1$ formula.
E14.9. Prove that if $\kappa$ is an infinite cardinal, then $H(\kappa) \subseteq V_\kappa$.
E14.10. Prove that for $\kappa$ regular, $H(\kappa) = V_\kappa$ iff $\kappa = \omega$ or $\kappa$ is inaccessible.

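[Diagram of large cardinals. Labels, listed from the top of the page to the bottom: I0; huge; Vopěnka; extendible; supercompact; superstrong; strongly compact; Woodin; strong; measurable; Ramsey; Rowbottom; $\kappa \to (\omega_1)^{<\omega}_2$; Jónsson; $0^\sharp$ exists; $\kappa \to (\omega)^{<\omega}_2$; indescribable; weakly compact; Mahlo; strongly inaccessible; regular limit. The connecting lines of the diagram are not recoverable from this copy.]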

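E14.11. Assume that $\kappa$ is an infinite cardinal. Prove the following:
(a) $H(\kappa)$ is transitive.
(b) $H(\kappa) \cap \mathrm{On} = \kappa$.
(c) If $x \in H(\kappa)$, then $\bigcup x \in H(\kappa)$.
(d) If $x, y \in H(\kappa)$, then $\{x, y\} \in H(\kappa)$.
(e) If $y \subseteq x \in H(\kappa)$, then $y \in H(\kappa)$.
(f) If $\kappa$ is regular and $x$ is any set, then $x \in H(\kappa)$ iff $x \subseteq H(\kappa)$ and $|x| < \kappa$.
E14.12. Show that if $\kappa$ is regular and uncountable, then $H(\kappa)$ is a model of all of the ZFC axioms except possibly the power set axiom.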
References
Jech, T. Set Theory, 769pp.
Kanamori, A. The higher infinite,
Kunen, K. Set Theory, 313pp.


15. Constructible sets


This chapter is devoted to the exposition of Gödel's constructible sets. We will define a proper class $L$, the class of all constructible sets. The development culminates in the proof of consistency of AC and GCH relative to the consistency of ZF. We also prove the relative consistency of $\diamondsuit$.
Sets are called constructible iff they are built up from the empty set using easily defined procedures. This essentially amounts to replacing the power set operation in the definition of the $V_\alpha$'s by an operation which produces only definable subsets. So first we have to indicate what we mean by definable subsets. The following three functions incorporate the hard part of definability; they deal with membership, equality, and existential quantification. For any $n \in \omega$ and any sets $A, R$ let
\[ \mathrm{Proj}(A, R, n) = \{s \in {}^nA : \exists t \in R[t \text{ is a function, } n \subseteq \mathrm{dmn}(t) \text{ and } t \restriction n = s]\}. \]
For any $i, j, n \in \omega$ with $i, j < n$ and any set $A$ let
\[ \mathrm{Diag}_\in(A, n, i, j) = \{s \in {}^nA : s(i) \in s(j)\}, \]
\[ \mathrm{Diag}_=(A, n, i, j) = \{s \in {}^nA : s(i) = s(j)\}. \]
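These basic functions are recursively applied by the following definition, where $n \in \omega$. Note that $\mathrm{Df}'(k, A, n)$ is defined for fixed $A$ and all $n$ simultaneously by recursion on $k$. Each set $\mathrm{Df}'(k, A, n)$ is a collection of subsets of ${}^nA$, and so is $\mathrm{Df}(A, n)$.
\[ \begin{aligned}
\mathrm{Df}'(0, A, n) &= \{\mathrm{Diag}_\in(A, n, i, j) : i, j < n\} \cup \{\mathrm{Diag}_=(A, n, i, j) : i, j < n\}; \\
\mathrm{Df}'(k + 1, A, n) &= \mathrm{Df}'(k, A, n) \cup \{{}^nA \setminus R : R \in \mathrm{Df}'(k, A, n)\} \\
&\quad \cup \{R \cap S : R, S \in \mathrm{Df}'(k, A, n)\} \\
&\quad \cup \{\mathrm{Proj}(A, R, n) : R \in \mathrm{Df}'(k, A, n + 1)\}; \\
\mathrm{Df}(A, n) &= \bigcup_{k \in \omega} \mathrm{Df}'(k, A, n).
\end{aligned} \]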
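For a small finite set $A$ the stages $\mathrm{Df}'(k, A, n)$ can be enumerated by brute force. The following is only a sketch under stated assumptions: elements of $A$ are nested `frozenset` values, a sequence $s \in {}^nA$ is modelled as a Python tuple of length $n$, and the function names (`proj`, `diag_in`, `diag_eq`, `df_stage`) are illustrative.

```python
from itertools import product

def tuples(A, n):
    """All n-tuples over A, standing in for the set of functions n -> A."""
    return set(product(A, repeat=n))

def proj(A, R, n):
    """Restrictions to n of the tuples in R that have length at least n."""
    return {t[:n] for t in R if len(t) >= n and all(a in A for a in t[:n])}

def diag_in(A, n, i, j):
    return {s for s in tuples(A, n) if s[i] in s[j]}

def diag_eq(A, n, i, j):
    return {s for s in tuples(A, n) if s[i] == s[j]}

def df_stage(k, A, n):
    """Df'(k, A, n) as a set of frozensets of n-tuples (exponential in k)."""
    if k == 0:
        pairs = [(i, j) for i in range(n) for j in range(n)]
        return ({frozenset(diag_in(A, n, i, j)) for i, j in pairs}
                | {frozenset(diag_eq(A, n, i, j)) for i, j in pairs})
    prev = df_stage(k - 1, A, n)
    full = frozenset(tuples(A, n))
    out = set(prev)
    out |= {full - R for R in prev}                    # complements
    out |= {R & S for R in prev for S in prev}         # intersections
    out |= {frozenset(proj(A, R, n))                   # projections
            for R in df_stage(k - 1, A, n + 1)}
    return out

e0 = frozenset()
e1 = frozenset({e0})
A = [e0, e1]                      # the ordinal 2 = {0, 1}
print(len(df_stage(0, A, 1)))     # number of sets at stage 0, arity 1
print(len(df_stage(1, A, 1)))     # number of sets at stage 1, arity 1
```

The recursion recomputes lower stages and is meant only for very small $k$, $n$ and $A$.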

The following rather trivial fact will be technically useful in what follows.
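Lemma 15.1. $\emptyset \in \mathrm{Df}(A, n)$ for any set $A$ and any natural number $n$.
Proof. Let $A$ be arbitrary. Note that $\emptyset = \mathrm{Diag}_\in(A, 1, 0, 0) \in \mathrm{Df}'(0, A, 1)$, and hence $\emptyset = \mathrm{Proj}(A, \emptyset, 0) \in \mathrm{Df}'(1, A, 0)$, so that $\emptyset \in \mathrm{Df}(A, 0)$. For $n > 0$ we have $\emptyset = \mathrm{Diag}_\in(A, n, 0, 0) \in \mathrm{Df}'(0, A, n)$, so that $\emptyset \in \mathrm{Df}(A, n)$.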
The following proposition is really just an elementary consequence of the definition.
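Proposition 15.2. Let $A$ be a set and $n$ a natural number. Then
(i) $\{\mathrm{Diag}_\in(A, n, i, j) : i, j < n\} \subseteq \mathrm{Df}(A, n)$.
(ii) $\{\mathrm{Diag}_=(A, n, i, j) : i, j < n\} \subseteq \mathrm{Df}(A, n)$.
(iii) For all $R$, if $R \in \mathrm{Df}(A, n)$, then ${}^nA \setminus R \in \mathrm{Df}(A, n)$.
(iv) For all $R, S$, if $R, S \in \mathrm{Df}(A, n)$, then $R \cap S \in \mathrm{Df}(A, n)$.
(v) For all $R$, if $R \in \mathrm{Df}(A, n + 1)$, then $\mathrm{Proj}(A, R, n) \in \mathrm{Df}(A, n)$.
Furthermore, suppose that $\langle E_n : n \in \omega \rangle$ is a system of sets, each $E_n$ being a collection of subsets of ${}^nA$, and (i)–(v) hold with $\mathrm{Df}(A, n)$ replaced by $E_n$ for each $n$. Then $\mathrm{Df}(A, n) \subseteq E_n$ for all $n \in \omega$.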
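We need to get a finer description of the definable sets. That is done by the following recursive definition of a function En of three variables $m, A, n$ with $m, n \in \omega$ and $A$ any set. The definition is by recursion on $m$, with $A$ fixed. We assume that $\mathrm{En}(m', A, n')$ is defined for all $m' < m$ and for all $n' \in \omega$. Write $m = 2^i \cdot 3^j \cdot 5^k \cdot r$ with $r$ not divisible by 2, 3, or 5.
\[ \mathrm{En}(m, A, n) = \begin{cases}
\mathrm{Diag}_\in(A, n, i, j) & \text{if } r = 1,\ k = 0, \text{ and } i, j < n, \\
\mathrm{Diag}_=(A, n, i, j) & \text{if } r = 1,\ k = 1, \text{ and } i, j < n, \\
{}^nA \setminus \mathrm{En}(i, A, n) & \text{if } r = 1,\ k = 2, \\
\mathrm{En}(i, A, n) \cap \mathrm{En}(j, A, n) & \text{if } r = 1,\ k = 3, \\
\mathrm{Proj}(A, \mathrm{En}(i, A, n + 1), n) & \text{if } r = 1,\ k = 4, \\
\emptyset & \text{otherwise.}
\end{cases} \]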
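The coding by prime exponents can be made concrete on a small finite $A$. The following is a minimal sketch, assuming elements of $A$ are nested `frozenset` values and sequences are tuples; the names `decode` and `en` are illustrative, and treating $m = 0$ as the "otherwise" case is an assumption made here, since $0$ has no factorization of the stated form.

```python
from itertools import product

def decode(m):
    """Write m = 2^i * 3^j * 5^k * r with r prime to 30; requires m >= 1."""
    exps = []
    for p in (2, 3, 5):
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        exps.append(e)
    return exps[0], exps[1], exps[2], m

def en(m, A, n):
    """En(m, A, n) as a frozenset of n-tuples over the finite set A."""
    full = frozenset(product(A, repeat=n))
    if m == 0:                       # assumption: 0 falls under 'otherwise'
        return frozenset()
    i, j, k, r = decode(m)
    if r != 1:
        return frozenset()
    if k == 0 and i < n and j < n:   # Diag for membership
        return frozenset(s for s in full if s[i] in s[j])
    if k == 1 and i < n and j < n:   # Diag for equality
        return frozenset(s for s in full if s[i] == s[j])
    if k == 2:                       # complement
        return full - en(i, A, n)
    if k == 3:                       # intersection
        return en(i, A, n) & en(j, A, n)
    if k == 4:                       # projection from arity n + 1
        return frozenset(t[:n] for t in en(i, A, n + 1))
    return frozenset()

e0 = frozenset()
e1 = frozenset({e0})
A = [e0, e1]
print(decode(5000))                  # 5000 = 2^3 * 5^4
print(en(3, A, 2))                   # pairs s with s(0) in s(1)
print(en(5000, A, 1))                # those x0 with some x1 such that x0 in x1
```

The recursion terminates because the exponents $i$ and $j$ are always smaller than $m$ when $m \geq 1$.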
Lemma 15.3. For any $n \in \omega$ and any set $A$, $\mathrm{Df}(A, n) = \{\mathrm{En}(m, A, n) : m \in \omega\}$.
Proof. By an easy induction using Proposition 15.2 we have $\mathrm{En}(m, A, n) \in \mathrm{Df}(A, n)$ for all $m$; Lemma 15.1 is needed too. This gives the inclusion $\supseteq$. If we apply Proposition 15.2 to the system $\langle E_n : n \in \omega \rangle$ given by $E_n = \{\mathrm{En}(m, A, n) : m \in \omega\}$ for each $n \in \omega$, we get the other inclusion.
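Lemma 15.4. If $\varphi(x_0, \ldots, x_{n-1})$ is a formula with free variables among $x_0, \ldots, x_{n-1}$, then there is an $m \in \omega$ such that for every set $A$,
\[ \{s \in {}^nA : \varphi^A(s(0), \ldots, s(n-1))\} = \mathrm{En}(m, A, n). \]
Proof. We proceed by induction on the number of quantifiers in $\varphi$, and within that, by induction on formulas in the usual sense. For brevity let $S(\varphi)$ be the set $\{s \in {}^nA : \varphi^A(s(0), \ldots, s(n-1))\}$. Then
\[ S(x_i \in x_j) = \mathrm{Diag}_\in(A, n, i, j) = \mathrm{En}(2^i \cdot 3^j, A, n); \]
\[ S(x_i = x_j) = \mathrm{Diag}_=(A, n, i, j) = \mathrm{En}(2^i \cdot 3^j \cdot 5, A, n); \]
if $S(\varphi) = \mathrm{En}(i, A, n)$, then
\[ S(\neg\varphi) = {}^nA \setminus S(\varphi) = \mathrm{En}(2^i \cdot 5^2, A, n); \]
if $S(\varphi) = \mathrm{En}(i, A, n)$ and $S(\psi) = \mathrm{En}(j, A, n)$, then
\[ S(\varphi \wedge \psi) = S(\varphi) \cap S(\psi) = \mathrm{En}(2^i \cdot 3^j \cdot 5^3, A, n). \]
The inductive step from $\varphi$ to $\exists y \varphi$ requires more care. By a change of bound variable we obtain a formula $\psi$ such that $\exists y \varphi$ is logically equivalent to $\exists x_n \psi$, with the free variables of $\psi$ among $x_0, \ldots, x_n$. Hence with $S(\psi) = \mathrm{En}(i, A, n + 1)$,
\[ \begin{aligned}
S(\exists y \varphi) &= \{s \in {}^nA : \exists x_n \in A\, \psi^A(s(0), \ldots, s(n-1), x_n)\} \\
&= \mathrm{Proj}(A, S(\psi), n) = \mathrm{En}(2^i \cdot 5^4, A, n).
\end{aligned} \]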

Corollary 15.5. Let $\varphi(x_0, \ldots, x_{n-1})$ be a formula in our set-theoretic language with
free variables among $x_0, \ldots, x_{n-1}$. Then the following is provable in ZF:
$$\forall A\,[\{s \in {}^{n}A : \varphi^A(s(0), \ldots, s(n-1))\} \in \mathrm{Df}(A, n)].$$
Proof. By Lemma 15.4, choose $m \in \omega$ such that
$$\{s \in {}^{n}A : \varphi^A(s(0), \ldots, s(n-1))\} = \mathrm{En}(m, A, n).$$
Thus our corollary follows by Lemma 15.3.
For any nonempty sets $A, B$, we say that $A$ is elementarily contained in $B$, written $A \preceq B$,
iff $A \subseteq B$ and for all $m, n \in \omega$,
$$\mathrm{En}(m, A, n) = {}^{n}A \cap \mathrm{En}(m, B, n).$$
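Lemma 15.6. Suppose that $\varphi(x_0, \ldots, x_{n-1})$ is a formula whose free variables are
among $x_0, \ldots, x_{n-1}$. If $A \preceq B$ then $\varphi$ is absolute for $A, B$.
Proof. Choose $m$ by Lemma 15.4. Suppose that $a \in {}^{n}A$. Then
$$
\begin{aligned}
\varphi^A(a_0, \ldots, a_{n-1}) \quad &\text{iff} \quad a \in \mathrm{En}(m, A, n) && \text{by Lemma 15.4}\\
&\text{iff} \quad a \in \mathrm{En}(m, B, n) && \text{since } A \preceq B \text{ and } a \in {}^{n}A\\
&\text{iff} \quad \varphi^B(a_0, \ldots, a_{n-1}) && \text{by Lemma 15.4.}
\end{aligned}
$$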

The following is a special case of the downward Löwenheim-Skolem theorem.

Theorem 15.7. Let $B$ be any nonempty set, and let $X \subseteq B$. Then there is a
nonempty set $A$ with the following properties:
(i) $X \subseteq A \subseteq B$.
(ii) $A \preceq B$.
(iii) $|A| \le \max(\omega, |X|)$.
Remark: we are going to use the axiom of choice in the proof, so this theorem will not
be available in our development of properties of V = L until we have derived AC from
V = L.
Proof. Fix a well-ordering $\prec$ on $B$. We may assume that $X \ne \emptyset$. For any $m, n \in \omega$ we
define a function $f_{mn} : {}^{n}B \to B$ as follows. For any $s \in {}^{n}B$, we consider two possibilities.
Case 1. There are $i, j \in \omega$ such that $m = 2^i 3^j 5^4$, $s \in \mathrm{En}(m, B, n)$, and
there is an $x \in B$ such that $s^\frown\langle x\rangle \in \mathrm{En}(i, B, n + 1)$. Then we let $f_{mn}(s)$ be the $\prec$-least
such element $x$.
Case 2. If Case 1 does not hold, we just let $f_{mn}(s)$ be the $\prec$-least element of $B$.
Let $A$ be the closure of $X$ under all of these functions $f_{mn}$. Clearly (i) and (iii) hold.
For (ii), we prove that
$$(\ast) \qquad \forall n \in \omega\,[\mathrm{En}(m, A, n) = {}^{n}A \cap \mathrm{En}(m, B, n)]$$



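for all $m$ by induction on $m$. For each step we assume that $s \in {}^{n}A$ is arbitrary. If $m$ does
not have the special forms given in the first parts of the definition of $\mathrm{En}(m, B, n)$, then
both sides of $(\ast)$ are empty; so we assume that it does have the special forms appropriate
for the several parts of the definition. So with obvious assumptions on $i, j$ we have
$$
\begin{aligned}
s \in \mathrm{En}(2^i 3^j, A, n) \quad &\text{iff} \quad s \in \mathrm{Diag}_{\in}(A, n, i, j)\\
&\text{iff} \quad s(i) \in s(j)\\
&\text{iff} \quad s \in \mathrm{Diag}_{\in}(B, n, i, j)\\
&\text{iff} \quad s \in \mathrm{En}(2^i 3^j, B, n);\\
s \in \mathrm{En}(2^i 3^j 5, A, n) \quad &\text{iff} \quad s \in \mathrm{Diag}_{=}(A, n, i, j)\\
&\text{iff} \quad s(i) = s(j)\\
&\text{iff} \quad s \in \mathrm{Diag}_{=}(B, n, i, j)\\
&\text{iff} \quad s \in \mathrm{En}(2^i 3^j 5, B, n);\\
s \in \mathrm{En}(2^i 3^j 5^2, A, n) \quad &\text{iff} \quad s \in {}^{n}A \setminus \mathrm{En}(i, A, n)\\
&\text{iff} \quad s \in {}^{n}A \setminus \mathrm{En}(i, B, n)\\
&\text{iff} \quad s \in \mathrm{En}(2^i 3^j 5^2, B, n);\\
s \in \mathrm{En}(2^i 3^j 5^3, A, n) \quad &\text{iff} \quad s \in \mathrm{En}(i, A, n) \cap \mathrm{En}(j, A, n)\\
&\text{iff} \quad s \in \mathrm{En}(i, B, n) \cap \mathrm{En}(j, B, n)\\
&\text{iff} \quad s \in \mathrm{En}(2^i 3^j 5^3, B, n).
\end{aligned}
$$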

For the remaining case, first suppose that $s \in \mathrm{En}(2^i 3^j 5^4, A, n)$. It follows that $s \in \mathrm{Proj}(A, \mathrm{En}(i, A, n + 1), n)$. Hence we can choose $t \in \mathrm{En}(i, A, n + 1)$ such that $s = t \upharpoonright n$. By
the inductive hypothesis, $t \in \mathrm{En}(i, B, n + 1)$, so $s \in \mathrm{Proj}(B, \mathrm{En}(i, B, n + 1), n)$, and hence
$s \in \mathrm{En}(2^i 3^j 5^4, B, n)$.
Conversely, suppose that $s \in \mathrm{En}(2^i 3^j 5^4, B, n)$. Let $m = 2^i 3^j 5^4$. Now $s \in \mathrm{Proj}(B, \mathrm{En}(i, B, n + 1), n)$. Hence we can choose $t \in \mathrm{En}(i, B, n + 1)$ such that $s = t \upharpoonright n$.
Thus there is an $x$, namely $x = t(n)$, such that $s^\frown\langle x\rangle \in \mathrm{En}(i, B, n + 1)$ (since $s^\frown\langle x\rangle = t$).
This means that Case 1 in the definition of $f_{mn}(s)$ holds, and so, since $A$ is closed under
$f_{mn}$, it follows that $f_{mn}(s) \in A$. Let $u = s^\frown\langle f_{mn}(s)\rangle$; note that $u \in {}^{n+1}A$. Then $u \in \mathrm{En}(i, B, n + 1)$, so by the
inductive hypothesis, $u \in \mathrm{En}(i, A, n + 1)$. It follows that $s \in \mathrm{Proj}(A, \mathrm{En}(i, A, n + 1), n)$, and
hence $s \in \mathrm{En}(2^i 3^j 5^4, A, n)$.
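The closure step in the proof of Theorem 15.7 ("let $A$ be the closure of $X$ under all of these functions") is an instance of a general fixed-point construction. The Python sketch below illustrates it on a toy finite universe; the functions `double` and `least` are hypothetical stand-ins for the Skolem functions $f_{mn}$, not the functions of the proof, and the loop terminates only because the toy universe is finite.

```python
from itertools import product

def closure(X, functions):
    """Smallest superset of X closed under the given (function, arity) pairs."""
    A = set(X)
    changed = True
    while changed:
        changed = False
        for f, arity in functions:
            # Iterate over a snapshot of A, since A may grow inside the loop.
            for args in product(list(A), repeat=arity):
                y = f(*args)
                if y not in A:
                    A.add(y)
                    changed = True
    return A

def double(x):
    """A unary function on the toy universe {0, ..., 11}."""
    return (2 * x) % 12

def least():
    """A 0-ary function returning the least element, as in Case 2 of the proof."""
    return 0

hull = closure({1}, [(double, 1), (least, 0)])
```

Here `hull` contains the starting point 1 together with everything reachable by repeatedly applying the functions; countably many finitary functions applied to an infinite $X$ produce at most $\max(\omega, |X|)$ elements in the same way.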
Lemma 15.8. For any set $A$ and any $n \in \omega$, $|\mathrm{Df}(A, n)| \le \omega$.
Proof. Immediate from Lemma 15.3.
Now we prove a sequence of lemmas leading up to the fact that Df is absolute for transitive
models of ZF. To do this, we have to extend the definitions of our functions above so that
they are defined for all sets, since absoluteness was developed only for such functions. We
do this by just letting the values be 0 for arguments not in the domain of the original
functions.
Lemma 15.9. The function Proj is absolute for transitive models of ZF.

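Proof. $x = \mathrm{Proj}(A, R, n)$ iff $n \notin \omega$ and $x = 0$, or $n \in \omega$ and the following condition
holds:
$$\forall s \in x\,[s \in {}^{n}A \wedge \exists t \in R\,(t \upharpoonright n = s)] \wedge \forall s \in {}^{n}A\,[\exists t \in R\,(t \upharpoonright n = s) \rightarrow s \in x];$$
from this the conclusion clearly follows.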
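Lemma 15.10. The function $\mathrm{Diag}_{\in}$ is absolute for transitive models of ZF.
Proof. $x = \mathrm{Diag}_{\in}(A, n, i, j)$ iff not($n \in \omega$ and $i, j < n$) and $x = 0$, or $n \in \omega$, $i, j < n$,
and the following condition holds:
$$\forall s \in x\,[s \in {}^{n}A \wedge s(i) \in s(j)] \wedge \forall s \in {}^{n}A\,[s(i) \in s(j) \rightarrow s \in x].$$
Lemma 15.11. The function $\mathrm{Diag}_{=}$ is absolute for transitive models of ZF.
Proof. Similar to that of Lemma 15.10.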
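Lemma 15.12. For any set $A$ and any natural number $n$ let
$$T_1(A, n) = \{\mathrm{Diag}_{\in}(A, n, i, j) : i, j < n\} \cup \{\mathrm{Diag}_{=}(A, n, i, j) : i, j < n\}.$$
Then $T_1$ is absolute for transitive models of ZF.
Proof. $x = T_1(A, n)$ iff not($n \in \omega$) and $x = 0$, or $n \in \omega$ and the following condition
holds:
$$\forall y \in x\,\exists i, j < n\,[y = \mathrm{Diag}_{\in}(A, n, i, j) \vee y = \mathrm{Diag}_{=}(A, n, i, j)]$$
$$\wedge\ \forall i, j < n\,[\mathrm{Diag}_{\in}(A, n, i, j) \in x] \wedge \forall i, j < n\,[\mathrm{Diag}_{=}(A, n, i, j) \in x].$$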
Lemma 15.13. For any sets $A, L$ and any natural number $n$ let
$$T_2(A, n, L) = \{{}^{n}A \setminus R : R \in L\}.$$
Then $T_2$ is absolute for transitive models of ZF.
Proof. $x = T_2(A, n, L)$ iff not($n \in \omega$) and $x = 0$, or $n \in \omega$ and the following condition
holds:
$$\forall y \in x\,\exists R \in L\,[y = {}^{n}A \setminus R] \wedge \forall z\,[\exists R \in L\,(z = {}^{n}A \setminus R) \rightarrow z \in x].$$
Here we need a little argument. Let $M$ be a transitive model of ZF. Suppose that $A, n, L \in M$, $n \in \omega$, $z$ is a set, $R \in L$, and $z = {}^{n}A \setminus R$; we would like to show that $z \in M$. There is a
$w \in M$ such that $w = ({}^{n}A \setminus R)^M$, since $M$ is a model of ZF. By absoluteness, $z = w \in M$,
as desired.
Lemma 15.14. For any set $X$, let $T_3(X) = \{R \cap S : R, S \in X\}$. Then $T_3$ is absolute
for transitive models of ZF.
Proof.
$$y = T_3(X) \leftrightarrow \forall z \in y\,\exists R, S \in X\,[z = R \cap S] \wedge \forall R, S \in X\,[R \cap S \in y].$$
Lemma 15.15. For any sets $A, X$ and any natural number $n$, let $T_4(A, n, X) = \{\mathrm{Proj}(A, R, n) : R \in X\}$. Then $T_4$ is absolute for transitive models of ZF.
Proof. $x = T_4(A, n, X)$ iff not($n \in \omega$) and $x = 0$, or $n \in \omega$ and the following condition
holds:
$$\forall z \in x\,\exists R \in X\,[z = \mathrm{Proj}(A, R, n)] \wedge \forall z\,[\exists R \in X\,(z = \mathrm{Proj}(A, R, n)) \rightarrow z \in x].$$
This is shown to be absolute as in the proof of Lemma 15.13.
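Lemma 15.16. Let $B = \omega \times V \times \omega$, and define $S \subseteq B \times B$ as follows:
$$(k, A, n)\,S\,(k', A', n') \quad \text{iff} \quad k, k', n, n' \in \omega \text{ and } A = A' \text{ and } k < k'.$$
Then $S$ is well-founded and set-like on $B$.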


Lemma 15.17. Let $B$ and $S$ be as in Lemma 15.16. Define $F : B \times V \to V$
as follows. Let $k, n \in \omega$ and let $A, f$ be any sets. If $f$ is not a function with domain
$\mathrm{pred}(B, (k, A, n), S)$, let $F((k, A, n), f) = 0$. If $f$ is such a function, let
$$
F((k, A, n), f) =
\begin{cases}
T_1(A, n) & \text{if } k = 0,\\
f(k', A, n) \cup T_2(A, n, f(k', A, n)) \cup T_3(f(k', A, n)) \cup T_4(A, n, f(k', A, n + 1)) & \text{if } k = k' + 1.
\end{cases}
$$
Then $F$ is absolute for transitive models of ZF.
Lemma 15.18. $\mathrm{Df}'$ is absolute for transitive models of ZF.
Proof. It suffices to check that $\mathrm{Df}'$ is obtained from the function $F$ of Lemma 15.17
by the recursion theorem applied to $B$ and $S$. Let $G$ be the function so obtained. Then
for any set $A$ and any natural number $n$,
$$
\begin{aligned}
G(0, A, n) &= F((0, A, n), G \upharpoonright \mathrm{pred}(B, (0, A, n), S)) = F((0, A, n), 0) = T_1(A, n)\\
&= \{\mathrm{Diag}_{\in}(A, n, i, j) : i, j < n\} \cup \{\mathrm{Diag}_{=}(A, n, i, j) : i, j < n\},
\end{aligned}
$$
as desired. Now take any $k \in \omega$. Then, with $f = G \upharpoonright \mathrm{pred}(B, (k + 1, A, n), S)$,
$$
\begin{aligned}
G(k + 1, A, n) &= F((k + 1, A, n), G \upharpoonright \mathrm{pred}(B, (k + 1, A, n), S))\\
&= F((k + 1, A, n), f)\\
&= f(k, A, n) \cup T_2(A, n, f(k, A, n)) \cup T_3(f(k, A, n)) \cup T_4(A, n, f(k, A, n + 1))\\
&= G(k, A, n) \cup \{{}^{n}A \setminus R : R \in G(k, A, n)\}\\
&\qquad \cup \{R \cap S : R, S \in G(k, A, n)\} \cup \{\mathrm{Proj}(A, R, n) : R \in G(k, A, n + 1)\}.
\end{aligned}
$$
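For a finite set $A$ the stages of this recursion can be computed directly. The Python sketch below is an illustration under the same modelling assumptions as before (elements of $A$ are nested frozensets, members of ${}^{n}A$ are tuples); `stage(k, A, n)` is a name chosen here for the $k$-th stage described by the last display.

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def stage(k, A, n):
    """k-th stage of the recursion of Lemma 15.18, for a finite frozenset A."""
    full = frozenset(product(A, repeat=n))
    if k == 0:
        # T1(A, n): the diagonal relations for membership and equality.
        diag_in = {frozenset(s for s in full if s[i] in s[j])
                   for i in range(n) for j in range(n)}
        diag_eq = {frozenset(s for s in full if s[i] == s[j])
                   for i in range(n) for j in range(n)}
        return frozenset(diag_in | diag_eq)
    prev = stage(k - 1, A, n)
    complements = {full - R for R in prev}                   # T2
    intersections = {R & S for R in prev for S in prev}      # T3
    projections = {frozenset(t[:n] for t in R)               # T4, from arity n + 1
                   for R in stage(k - 1, A, n + 1)}
    return frozenset(prev | complements | intersections | projections)

e = frozenset()
one = frozenset({e})
A = frozenset({e, one})
```

With $A = \{0, \{0\}\}$ and $n = 1$, stage 0 contains only the empty relation and the full relation, while stage 1 already contains all four subsets of ${}^{1}A$; the two singletons arise as projections of the binary membership relations.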

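Lemma 15.19. $\mathrm{Df}$ is absolute for all transitive models of ZF.

Proof.
$$x = \mathrm{Df}(A, n) \quad \text{iff} \quad \forall y \in x\,\exists k \in \omega\,[y \in \mathrm{Df}'(k, A, n)] \wedge \forall k \in \omega\,\forall y \in \mathrm{Df}'(k, A, n)\,[y \in x].$$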

It should be reasonably clear that Df has been defined in a very explicit fashion. We need
to express this rigorously; it will lead, eventually, to the conclusion that the axiom of choice
holds in the constructible universe. Just as for absoluteness, we give a sequence of lemmas
leading up to a result saying that any member of the range of Df has a natural well-order.
We will deal frequently with some obvious lexicographic orders, which we uniformly denote
by <lex , leaving to the reader exactly which lexicographic order is referred to.
Let $A$ be any set, and $n$ any natural number. For each $R \in \{\mathrm{Diag}_{\in}(A, n, i, j) : i, j < n\}$, let $(\mathrm{Ch}(0, A, n, R), \mathrm{Ch}(1, A, n, R))$ be the smallest pair $(i, j)$, in the lexicographic
order of $\omega \times \omega$, such that $i, j < n$ and $R = \mathrm{Diag}_{\in}(A, n, i, j)$. Note, for example, that
$\mathrm{Diag}_{\in}(A, 2, 0, 0) = \mathrm{Diag}_{\in}(A, 2, 1, 1) = \emptyset$. Now we define
$$R <_{0An} S \quad \text{iff} \quad (\mathrm{Ch}(0, A, n, R), \mathrm{Ch}(1, A, n, R)) <_{\mathrm{lex}} (\mathrm{Ch}(0, A, n, S), \mathrm{Ch}(1, A, n, S)).$$
Clearly this is a well-order of $\{\mathrm{Diag}_{\in}(A, n, i, j) : i, j < n\}$.

In a very analogous way we can define a well-order $<_{1An}$ of $\{\mathrm{Diag}_{=}(A, n, i, j) : i, j < n\}$.
Now we can define a well-order $<_{2An}$ of
$$\{\mathrm{Diag}_{\in}(A, n, i, j) : i, j < n\} \cup \{\mathrm{Diag}_{=}(A, n, i, j) : i, j < n\}$$
as follows. For any $R, S$ in this union, $R <_{2An} S$ iff one of the following holds:
$R, S \in \{\mathrm{Diag}_{\in}(A, n, i, j) : i, j < n\}$ and $R <_{0An} S$;
or $R \in \{\mathrm{Diag}_{\in}(A, n, i, j) : i, j < n\}$ and $S \notin \{\mathrm{Diag}_{\in}(A, n, i, j) : i, j < n\}$;
or $R, S \notin \{\mathrm{Diag}_{\in}(A, n, i, j) : i, j < n\}$ and $R <_{1An} S$.

For the next few constructions, suppose that $X$ and $A$ are sets, $n \in \omega$, and we are given a
well-ordering $<$ of $X$. Then we well-order $\{{}^{n}A \setminus R : R \in X\}$ by setting
$$S \prec_{0,A,n,<,X} T \quad \text{iff} \quad \exists S', T' \in X\,[S' < T' \text{ and } S = {}^{n}A \setminus S',\ T = {}^{n}A \setminus T'].$$
We well-order $\{R \cap S : R, S \in X\}$ as follows. Suppose that $U, V \in \{R \cap S : R, S \in X\}$.
Let $(R, S)$ be lexicographically smallest in $X \times X$ (using $<$) such that $U = R \cap S$, and let
$(R', S')$ be lexicographically smallest in $X \times X$ (using $<$) such that $V = R' \cap S'$. Then
$U \prec_{1,A,n,<,X} V$ iff $(R, S) <_{\mathrm{lex}} (R', S')$.
We well-order $\{\mathrm{Proj}(A, R, n) : R \in X\}$ as follows. Suppose that $U, V \in \{\mathrm{Proj}(A, R, n) : R \in X\}$. Let $R$ be $<$-minimum in $X$ such that $U = \mathrm{Proj}(A, R, n)$, and let $S$ be $<$-minimum
in $X$ such that $V = \mathrm{Proj}(A, S, n)$. Then $U \prec_{2,A,n,<,X} V$ iff $R < S$.

Next, for any set $A$ and any $k, n \in \omega$ we define a well-order $<_{3kAn}$ of $\mathrm{Df}'(k, A, n)$ by
induction on $k$. Let $<_{30An}$ be $<_{2An}$. Assume that $<_{3kAn}$ has been defined for all $n \in \omega$,
and let $R, S \in \mathrm{Df}'(k + 1, A, n)$. Write $D_k = \mathrm{Df}'(k, A, n)$, $C_k = \{{}^{n}A \setminus T : T \in D_k\}$, and $I_k = \{T \cap U : T, U \in D_k\}$. Then we define $R <_{3(k+1)An} S$ iff one of the following
conditions holds:
(1) $R, S \in D_k$ and $R <_{3kAn} S$.
(2) $R \in D_k$ and $S \notin D_k$.
(3) $R, S \notin D_k$, $R, S \in C_k$, and $R \prec_{0,A,n,<_{3kAn},D_k} S$.
(4) $R, S \notin D_k$, $R \in C_k$, and $S \notin C_k$.
(5) $R, S \notin D_k$, $R, S \notin C_k$, $R, S \in I_k$, and $R \prec_{1,A,n,<_{3kAn},D_k} S$.
(6) $R, S \notin D_k$, $R, S \notin C_k$, $R \in I_k$, and $S \notin I_k$.
(7) $R, S \notin D_k$, $R, S \notin C_k$, $R, S \notin I_k$, and $R \prec_{2,A,n,<_{3kA(n+1)},\mathrm{Df}'(k,A,n+1)} S$.
Finally, for any set $A$ and any natural number $n$, we well-order $\mathrm{Df}(A, n)$ as follows. Let
$R, S \in \mathrm{Df}(A, n)$. Let $k$ be minimum such that $R \in \mathrm{Df}'(k, A, n)$, and let $l$ be minimum
such that $S \in \mathrm{Df}'(l, A, n)$. Then we define
$$R <_{4An} S \quad \text{iff} \quad k < l, \text{ or } k = l \text{ and } R <_{3kAn} S.$$

Constructible sets
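The following is the definable power set operation: For any set $A$,
$$D(A) = \{X \subseteq A : \exists n \in \omega\,\exists s \in {}^{n}A\,\exists R \in \mathrm{Df}(A, n + 1)\,[X = \{x \in A : s^\frown\langle x\rangle \in R\}]\}.$$
Here $s^\frown\langle x\rangle$ is the member $t$ of ${}^{n+1}A$ such that $s \subseteq t$ and $t(n) = x$.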

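Lemma 15.20. Let $\varphi(v_0, \ldots, v_{n-1}, x)$ be a formula with the indicated free variables.
Then
$$\forall A\,\forall v_0, \ldots, v_{n-1} \in A\,[\{x \in A : \varphi^A(v_0, \ldots, v_{n-1}, x)\} \in D(A)].$$
Proof. Let $v_0, \ldots, v_{n-1} \in A$ and $R = \{s \in {}^{n+1}A : \varphi^A(s(0), \ldots, s(n))\}$. Then by
Corollary 15.5, $R \in \mathrm{Df}(A, n + 1)$. Clearly, with $v = \langle v_0, \ldots, v_{n-1}\rangle$,
$$\{x \in A : \varphi^A(v_0, \ldots, v_{n-1}, x)\} = \{x \in A : v^\frown\langle x\rangle \in R\},$$
and hence $\{x \in A : \varphi^A(v_0, \ldots, v_{n-1}, x)\} \in D(A)$.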


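Lemma 15.21. Let $A$ be any set. Then:
(i) $D(A) \subseteq \mathcal{P}(A)$.
(ii) If $A$ is transitive, then $A \subseteq D(A)$.
(iii) If $X \in [A]^{<\omega}$, then $X \in D(A)$.
(iv) If $A$ is infinite, then $|D(A)| = |A|$.
Proof. (i) is obvious. For (ii), let $\varphi(v, x)$ be the formula $x \in v$. Then for any $v \in A$
we have $v = \{x \in A : x \in v\}$ by the transitivity of $A$, and so $v \in D(A)$ by Lemma 15.20.
For (iii), suppose that $X \in [A]^{<\omega}$. Then there exist an $n \in \omega$ and an $s : n \to A$ with
$\mathrm{rng}(s) = X$. For each $i < n$ we have
$$\{t \in {}^{n+1}A : t(n) = t(i)\} = \mathrm{Diag}_{=}(A, n + 1, i, n) \in \mathrm{Df}(A, n + 1)$$
by Lemma 15.2(ii). Hence
$$R \stackrel{\mathrm{def}}{=} \{t \in {}^{n+1}A : t(n) \in \mathrm{rng}(t \upharpoonright n)\} = \bigcup_{i<n} \{t \in {}^{n+1}A : t(n) = t(i)\} \in \mathrm{Df}(A, n + 1)$$
by Lemma 15.2(iv). Hence
$$X = \{x \in A : s^\frown\langle x\rangle \in R\} \in D(A),$$
as desired.
Finally, for (iv), note that
$$D(A) = \bigcup_{n \in \omega} \bigcup_{s \in {}^{n}A} \{\{x \in A : s^\frown\langle x\rangle \in R\} : R \in \mathrm{Df}(A, n + 1)\}.$$
Hence, using Lemma 15.8,
$$|D(A)| \le \sum_{n \in \omega} \bigl(|{}^{n}A| \cdot |\mathrm{Df}(A, n + 1)|\bigr) \le \omega \cdot |A| \cdot \omega = |A|.$$
On the other hand, $\{a\} \in D(A)$ for each $a \in A$ by (iii), so $|A| \le |D(A)|$. So (iv) holds.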
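Now we define the hierarchy of constructible sets:
$$
\begin{aligned}
L_0 &= \emptyset;\\
L_{\alpha+1} &= D(L_\alpha);\\
L_\gamma &= \bigcup_{\alpha < \gamma} L_\alpha \quad \text{for } \gamma \text{ limit};\\
L &= \bigcup_{\alpha \in \mathrm{On}} L_\alpha.
\end{aligned}
$$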


Lemma 15.22. For every ordinal $\alpha$ the following hold:
(i) $L_\alpha$ is transitive.
(ii) $L_\beta \subseteq L_\alpha$ for all $\beta < \alpha$.
Proof. The proof is just slightly different from the proof of Theorem 13.1.
We prove both statements simultaneously by induction on $\alpha$. Both statements are
clear for $\alpha = 0$. Now assume them for $\alpha$. By (i) for $\alpha$ and 15.21(ii), it follows that
$L_\alpha \subseteq D(L_\alpha) = L_{\alpha+1}$, and this easily gives (ii) for $\alpha + 1$. If $x \in y \in L_{\alpha+1}$, then $y \in D(L_\alpha) \subseteq \mathcal{P}(L_\alpha)$, so $x \in L_\alpha \subseteq L_{\alpha+1}$. So $L_{\alpha+1}$ is transitive.
If $\alpha$ is a limit ordinal and (i) and (ii) hold for all $\beta < \alpha$, clearly they hold for $\alpha$
too.
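Now we have a notion of rank for constructible sets too: For each $x \in L$, its $L$-rank is the
least ordinal $\rho(x) = \beta$ such that $x \in L_{\beta+1}$. The simple properties of this rank function
are much like those of ordinary rank given in Theorem 13.4:
Theorem 15.23. Let $x \in L$ and let $\alpha$ be an ordinal. Then
(i) $L_\alpha = \{y \in L : \rho(y) < \alpha\}$.
(ii) For all $y \in x$ we have $y \in L$, and $\rho(y) < \rho(x)$.
(iii) $\alpha \in L$, and $\rho(\alpha) = \alpha$.
(iv) $L_\alpha \cap \mathrm{On} = \alpha$.
(v) $L_\alpha \in L_{\alpha+1}$.
(vi) $L_\alpha \subseteq V_\alpha$ for all $\alpha$.
(vii) $[L_\alpha]^{<\omega} \subseteq L_{\alpha+1}$ for every $\alpha$.
(viii) $L_n = V_n$ for every $n \in \omega$.
(ix) $L_\omega = V_\omega$.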
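Proof. For part of this proof we repeat with minor changes parts of the proof of
Theorem 13.4.
(i): Suppose that $y \in L_\alpha$. Then $\alpha \ne 0$. If $\alpha$ is a successor ordinal $\beta + 1$, then
$\rho(y) \le \beta < \alpha$. If $\alpha$ is a limit ordinal, then $y \in L_\beta$ for some $\beta < \alpha$, hence $y \in L_{\beta+1}$ also,
so $\rho(y) \le \beta < \alpha$. This proves $\subseteq$.
For $\supseteq$, suppose that $\beta \stackrel{\mathrm{def}}{=} \rho(y) < \alpha$. Then $y \in L_{\beta+1} \subseteq L_\alpha$, as desired.
(ii): Assume that $y \in x$. Let $\rho(x) = \beta$. Then $x \in L_{\beta+1} = D(L_\beta) \subseteq \mathcal{P}(L_\beta)$, and so
$y \in L_\beta$. Hence $\rho(y) < \beta = \rho(x)$ by (i).
(iv): We prove this by induction on $\alpha$. It is obvious for $\alpha = 0$, and the inductive step
when $\alpha$ is limit is clear. So, suppose that we know that $L_\beta \cap \mathrm{On} = \beta$, and $\alpha = \beta + 1$. If
$\gamma \in L_\alpha \cap \mathrm{On}$, then $\gamma \in D(L_\beta) \subseteq \mathcal{P}(L_\beta)$, so $\gamma \subseteq L_\beta \cap \mathrm{On} = \beta$; hence $\gamma \le \beta$. This shows
that $L_\alpha \cap \mathrm{On} \subseteq \alpha$. If $\gamma < \beta$, then $\gamma \in L_\beta \cap \mathrm{On} \subseteq L_\alpha \cap \mathrm{On}$. Thus it remains only to show
that $\beta \in L_\alpha$. Now there is a natural $\Delta_0$ formula $\varphi(x)$ which expresses that $x$ is an ordinal:
$$\forall y \in x\,\forall z \in y\,(z \in x) \wedge \forall y \in x\,\forall z \in y\,\forall w \in z\,(w \in y);$$
this just says that $x$ is transitive and every member of $x$ is transitive. Since $L_\beta$ is transitive,
$\varphi(x)$ is absolute for it. Hence
$$\beta = L_\beta \cap \mathrm{On} = \{x \in L_\beta : \varphi^{L_\beta}(x)\}.$$
Hence by Lemma 15.20, $\beta \in D(L_\beta) = L_\alpha$, as desired.
(iii): By (iv) we have $L_{\alpha+1} \cap \mathrm{On} = \alpha + 1$, and hence $\alpha \in L_{\alpha+1}$, so that $\alpha \in L$
and $\rho(\alpha) \le \alpha$. By (iv) again, we cannot have $\alpha \in L_\alpha$, so $\rho(\alpha) = \alpha$.
(v): $L_\alpha = \{x \in L_\alpha : (x = x)^{L_\alpha}\} \in D(L_\alpha) = L_{\alpha+1}$, using Lemma 15.20.
(vi): An easy induction on $\alpha$.
(vii): By Lemma 15.21(iii), $[L_\alpha]^{<\omega} \subseteq D(L_\alpha) = L_{\alpha+1}$.
(viii): By induction on $n$. It is clear for $n = 0$. Assume that $L_n = V_n$. Thus $L_n$ is
finite. Hence by (vii) and (vi),
$$V_{n+1} = \mathcal{P}(V_n) = \mathcal{P}(L_n) = [L_n]^{<\omega} \subseteq L_{n+1} \subseteq V_{n+1},$$
as desired.
(ix): Immediate from (viii).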
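Since $L_n = V_n$ for finite $n$ by (viii), the first few levels of the constructible hierarchy can be listed by iterating the ordinary power set. The Python sketch below is only an illustration; it models hereditarily finite sets as nested frozensets, and the names `powerset` and `V` are chosen here.

```python
from itertools import combinations

def powerset(S):
    """All subsets of the finite set S, each as a frozenset."""
    items = list(S)
    return frozenset(frozenset(c)
                     for r in range(len(items) + 1)
                     for c in combinations(items, r))

def V(n):
    """The level V_n of the cumulative hierarchy (equal to L_n for finite n)."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

def is_transitive(S):
    """Every element of S is a subset of S."""
    return all(x <= S for x in S)

sizes = [len(V(n)) for n in range(5)]
```

The list `sizes` records $|V_0|, \ldots, |V_4|$; each level is transitive and contains the previous one, as in Lemma 15.22.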
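Lemma 15.24. If $\alpha \ge \omega$, then $|L_\alpha| = |\alpha|$.
Proof. First note by Theorem 15.23(iv) that $\alpha \subseteq L_\alpha$. Hence $|\alpha| \le |L_\alpha|$ for every
ordinal $\alpha$. So we just need to prove that $|L_\alpha| \le |\alpha|$ for infinite $\alpha$.
Now we prove the lemma by induction on $\alpha$. We assume that for every infinite $\beta < \alpha$
we have $|L_\beta| = |\beta|$. Since $L_n$ is finite for $n \in \omega$ by Theorems 15.23(viii) and 13.5(i), it
follows that $|L_\beta| \le |\alpha|$ for every $\beta < \alpha$. If $\alpha$ is a limit ordinal, then
$$|L_\alpha| \le \sum_{\beta < \alpha} |L_\beta| \le |\alpha| \cdot |\alpha| = |\alpha|.$$
If $\alpha = \beta + 1$, then by 15.21(iv), $|L_\alpha| = |D(L_\beta)| = |L_\beta| = |\beta| = |\alpha|$.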


Lemma 15.24 exhibits an important difference between the hierarchy of sets and the hierarchy of constructible sets. Although the two hierarchies agree up through stage $\omega$, we
have $|V_{\omega+1}| = 2^\omega$ and $|L_{\omega+1}| = \omega$, by Theorem 13.5 and Lemma 15.24. The hierarchy of
sets continues to create many new sets at each stage, but the hierarchy of constructible sets
builds new sets much more slowly. But since, as we will see, it is consistent that $V = L$,
eventually the same sets could be created.
Theorem 15.25. $L$ is a model of ZF.
Proof. We take the axioms one by one. Extensionality holds since $L$ is transitive (by
Lemma 15.22); see Theorem 14.3.
According to Theorem 14.4, to verify that the comprehension axioms hold in $L$ it
suffices to take any formula $\varphi$ with free variables among $x, z, w_1, \ldots, w_n$, assume that
$z, w_1, \ldots, w_n \in L$, and prove that
$$(1) \qquad \{x \in z : \varphi^L(x, z, w_1, \ldots, w_n)\} \in L.$$
Clearly there is an ordinal $\alpha$ such that $z, w_1, \ldots, w_n \in L_\alpha$. By Theorem 14.25, choose an
ordinal $\beta > \alpha$ such that the formula $x \in z \wedge \varphi$ is absolute for $L_\beta, L$. Then
$$
\begin{aligned}
\{x \in z : \varphi^L(x, z, w_1, \ldots, w_n)\} &= \{x \in L_\beta : (x \in z \wedge \varphi(x, z, w_1, \ldots, w_n))^L\}\\
&= \{x \in L_\beta : (x \in z \wedge \varphi(x, z, w_1, \ldots, w_n))^{L_\beta}\} && \text{(absoluteness)}\\
&\in D(L_\beta) && \text{(by Lemma 15.20)}\\
&= L_{\beta+1},
\end{aligned}
$$
and (1) holds.


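Pairing: See Theorem 14.5. Suppose that $x, y \in L$. Choose $\alpha$ so that $x, y \in L_\alpha$. Then
by Lemma 15.20,
$$\{x, y\} = \{z \in L_\alpha : (z = x \vee z = y)^{L_\alpha}\} \in D(L_\alpha) = L_{\alpha+1},$$
as desired.
Union: See Theorem 14.6. Suppose that $x \in L$. Choose $\alpha$ so that $x \in L_\alpha$. Then
$$
\begin{aligned}
\bigcup x &= \{z : \exists u\,(z \in u \wedge u \in x)\}\\
&= \{z \in L_\alpha : (\exists u\,(z \in u \wedge u \in x))^{L_\alpha}\} && \text{(since } L_\alpha \text{ is transitive)}\\
&\in D(L_\alpha) = L_{\alpha+1},
\end{aligned}
$$
as desired.
Power set: See Theorem 14.7. Suppose that $x \in L$. For each $z \in L$ such that $z \subseteq x$
choose $\beta_z$ such that $z \in L_{\beta_z}$. Let $\gamma = \sup\{\beta_z : z \in L,\ z \subseteq x\}$. Then, using Theorem 15.23(v),
$$\mathcal{P}(x) \cap L = \{z \in L : z \subseteq x\} \subseteq L_\gamma \in L_{\gamma+1} \subseteq L,$$
as desired.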
Replacement: See Theorem 14.8. Suppose that $\varphi$ is a formula with free variables
among $x, y, A, w_1, \ldots, w_n$, we are given $A, w_1, \ldots, w_n \in L$, and
$$(2) \qquad \forall x \in A\,\exists! y\,[y \in L \wedge \varphi^L(x, y, A, w_1, \ldots, w_n)].$$
For each $x \in A$ let $z_x$ be such that $z_x \in L$ and $\varphi^L(x, z_x, A, w_1, \ldots, w_n)$; we are using
the replacement axiom here. Then for each $x \in A$ choose $\beta_x$ so that $z_x \in L_{\beta_x}$. Let
$\gamma = \sup_{x \in A} \beta_x$. Suppose now that $y \in L$ and $\varphi^L(x, y, A, w_1, \ldots, w_n)$ for some $x \in A$.
Then by (2), $y = z_x$, and hence $y \in L_{\beta_x} \subseteq L_\gamma$. This proves that
$$\{y \in L : \exists x \in A\,\varphi^L(x, y, A, w_1, \ldots, w_n)\} \subseteq L_\gamma.$$
Since $L_\gamma \in L_{\gamma+1} \subseteq L$, this is as desired.
Foundation: holds by Theorem 14.9.
Infinity: Since $\omega \in L_{\omega+1} \subseteq L$ by Theorem 15.23(iv), the infinity axiom holds by
Theorem 14.18.
The axiom of constructibility is the statement $V = L$. Slightly more rigorously, it is the
statement $\forall x\,\exists\alpha\,(x \in L_\alpha)$.
Lemma 15.26. The function $L = \langle L_\alpha : \alpha \in \mathrm{On}\rangle$ is absolute for transitive models of
ZF.
Proof. This follows from the absoluteness of $\mathrm{Df}$ (Lemma 15.19) and the theorem
about absoluteness for recursive definitions (Theorem 14.21).
Theorem 15.27. $L$ is a model of ZF + $V = L$.
Proof. This is just an extension of Theorem 15.25. We want to prove that $\forall x \in L\,\exists\alpha \in L\,(x \in L_\alpha^L)$. So, let $x \in L$. Choose $\alpha$ such that $x \in L_\alpha$. Now $\alpha \in L$ by 15.23(iv),
and $x \in L_\alpha^L$ by Lemma 15.26.

Corollary 15.28. If ZF is consistent, then so is ZF + V = L.


The following theorem expresses the minimality of L.
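Theorem 15.29. Suppose that $M$ is a proper class model of ZF. Then $L = L^M \subseteq M$.
Proof. Take any ordinal $\alpha$. Then $M \not\subseteq V_\alpha$, since $M$ is a proper class; so choose $x \in M \setminus V_\alpha$. Then $\mathrm{rank}(x) \ge \alpha$. Now $\mathrm{rank}(x) = \mathrm{rank}^M(x)$ by absoluteness, so $\mathrm{rank}(x) \in M$,
and hence $\alpha \in M$. This proves that $\mathrm{On} \subseteq M$.
It follows by absoluteness of $L$ that
$$L^M = \{x \in M : (\exists\alpha\,(x \in L_\alpha))^M\} = \bigcup_{\alpha \in \mathrm{On}} L_\alpha^M = \bigcup_{\alpha \in \mathrm{On}} L_\alpha = L.$$
Hence $L = L^M \subseteq M$.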
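Theorem 15.30. $V = L$ implies AC. In fact, under $V = L$ there is a class well-ordering of the universe.
Proof. We define a well-ordering $<_{5\alpha}$ of $L_\alpha$ by recursion (continuing our numbering
of well-orders from earlier in this chapter). First of all, $<_{50} = \emptyset$. If $\alpha$ is a limit ordinal,
then for any $x, y \in L_\alpha$ we define
$$x <_{5\alpha} y \quad \text{iff} \quad \rho(x) < \rho(y) \vee [\rho(x) = \rho(y) \text{ and } x <_{5(\rho(x)+1)} y].$$
Clearly this is a well-order of $L_\alpha$.
Now suppose that a well-order $<_{5\alpha}$ of $L_\alpha$ has been defined. Then for each $n \in \omega$ we
define the lexicographic order $<_{6\alpha n}$ on ${}^{n}L_\alpha$: for any $x, y \in {}^{n}L_\alpha$,
$$x <_{6\alpha n} y \quad \text{iff} \quad \exists k < n\,[x \upharpoonright k = y \upharpoonright k \text{ and } x(k) <_{5\alpha} y(k)].$$
Clearly this is a well-order of ${}^{n}L_\alpha$. Now for any $X \in L_{\alpha+1} = D(L_\alpha)$, let $n(X)$ be the least
natural number $n$ such that
$$\exists s \in {}^{n}L_\alpha\,\exists R \in \mathrm{Df}(L_\alpha, n + 1)\,[X = \{x \in L_\alpha : s^\frown\langle x\rangle \in R\}].$$
Then let $s(X)$ be the least member of ${}^{n(X)}L_\alpha$ (under the well-order $<_{6\alpha n(X)}$) such that
$$\exists R \in \mathrm{Df}(L_\alpha, n(X) + 1)\,[X = \{x \in L_\alpha : s(X)^\frown\langle x\rangle \in R\}].$$
Then let $R(X)$ be the least member of $\mathrm{Df}(L_\alpha, n(X) + 1)$ (under the well-order $<_{4L_\alpha(n(X)+1)}$) such
that
$$X = \{x \in L_\alpha : s(X)^\frown\langle x\rangle \in R(X)\}.$$
Finally, for any $X, Y \in L_{\alpha+1}$ we define $X <_{5(\alpha+1)} Y$ iff one of the following conditions
holds:
(i) $X, Y \in L_\alpha$ and $X <_{5\alpha} Y$.
(ii) $X \in L_\alpha$ and $Y \notin L_\alpha$.
(iii) $X, Y \notin L_\alpha$ and one of the following conditions holds:
(a) $n(X) < n(Y)$.
(b) $n(X) = n(Y)$ and $s(X) <_{6\alpha n(X)} s(Y)$.
(c) $n(X) = n(Y)$ and $s(X) = s(Y)$ and $R(X) <_{4L_\alpha(n(X)+1)} R(Y)$.
Clearly this gives a well-order of $L_{\alpha+1}$.
We denote the union of all the well-orders $<_{5\alpha}$ for $\alpha \in \mathrm{On}$ by $<_L$. Under $V = L$ it is a
well-ordering of the universe.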
Now we work up to proving GCH from $V = L$. For any set $M$, let $o(M) = M \cap \mathrm{On}$.
Lemma 15.31. If $M$ is a transitive set, then $o(M)$ is an ordinal, and is in fact the
first ordinal not in $M$.
Proof. Since $o(M)$ is a set of ordinals, for the first statement it suffices to show that
$o(M)$ is transitive. Suppose that $\alpha < \beta \in o(M)$. Then $\alpha \in M$ since $M$ is transitive, as
desired.
For the second statement, first of all, $o(M) \notin M$, as otherwise $o(M) \in o(M)$. Now
suppose that $\alpha \notin M$. If $\alpha < o(M)$, then $\alpha \in M$, contradiction. Hence $o(M) \le \alpha$.
Theorem 15.32. There is a sentence $\sigma$ which is a finite conjunction of members of
ZF + $V = L$ such that
$$\mathrm{ZFC} \vdash \forall M\,[M \text{ transitive} \wedge \sigma^M \rightarrow M = L_{o(M)}].$$
Proof. Let $\sigma$ be a conjunction of $V = L$ together with enough of ZF to prove that
$\langle L_\alpha : \alpha \in \mathrm{On}\rangle$ is absolute, and also enough to prove that there is no largest ordinal. Then
for any transitive set $M$, if $\sigma^M$, then $o(M)$ is a limit ordinal, $(\forall x\,(x \in L))^M$ and hence
$M = L^M$, and
$$M = L^M = \{x \in M : (\exists\alpha\,(x \in L_\alpha))^M\} = \bigcup_{\alpha \in M} L_\alpha = L_{o(M)}.$$

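Theorem 15.33. If $V = L$, then for every infinite ordinal $\alpha$ we have $\mathcal{P}(L_\alpha) \subseteq L_{\alpha^+}$.

Proof. Let $\sigma$ be as in Theorem 15.32. Assume that $V = L$ and $\alpha$ is an infinite
ordinal. Take any $A \in \mathcal{P}(L_\alpha)$. Let $X = L_\alpha \cup \{A\}$. Clearly $X$ is transitive. By Lemma
15.24, $|X| = |\alpha|$. Now by Theorem 14.29 with $Z = V$, let $M$ be a transitive set such
that $X \subseteq M$, $|M| = |\alpha|$, and $\sigma^M \leftrightarrow \sigma^V$. But $\sigma^V$ actually holds, so $\sigma^M$ holds. Hence
$M = L_{o(M)}$ by Theorem 15.32. Now $o(M) = M \cap \mathrm{On}$, and $|M| = |\alpha|$, so $o(M) < \alpha^+$.
Hence $A \in X \subseteq M = L_{o(M)} \subseteq L_{\alpha^+}$.
Theorem 15.34. $V = L$ implies AC + GCH.
Proof. Assume $V = L$. Then AC holds by 15.30. By Theorem 15.33 we have, for
any infinite cardinal $\kappa$, $\mathcal{P}(\kappa) \subseteq \mathcal{P}(L_\kappa) \subseteq L_{\kappa^+}$. Since $|L_{\kappa^+}| = \kappa^+$ by Lemma 15.24, it
follows that $2^\kappa = \kappa^+$.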

Corollary 15.35. If ZF is consistent, then so is ZFC + GCH.


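Theorem 15.36. $V = L$ implies $\Diamond$.
Proof. Assume $V = L$. By recursion, for each $\alpha < \omega_1$ we define $(A_\alpha, C_\alpha)$ to be the
$<_L$-first pair of subsets of $\alpha$ such that $C_\alpha$ is club in $\alpha$ and there is no $\xi \in C_\alpha$ such that
$A_\alpha \cap \xi = A_\xi$, or $(A_\alpha, C_\alpha) = (0, 0)$ if this is not possible. Thus $f \stackrel{\mathrm{def}}{=} \langle (A_\alpha, C_\alpha) : \alpha < \omega_1\rangle$ is
defined by recursion, and is absolute for models of ZF (or certain finite fragments of it).
We claim that $\langle A_\alpha : \alpha < \omega_1\rangle$ is a $\Diamond$-sequence.
To prove this, we suppose that it is not a $\Diamond$-sequence. Then there is a subset $A$ of $\omega_1$
such that $\{\alpha < \omega_1 : A \cap \alpha = A_\alpha\}$ is not stationary, and hence there is a club $D$ in $\omega_1$ such
that $A \cap \alpha \ne A_\alpha$ for all $\alpha \in D$. We take the $<_L$-first such pair $(A, D)$.
(1) $A, D \in L_{\omega_2}$.
This holds since $\omega_1 \in L_{\omega_1+1}$ by Theorem 15.23(iv), hence $\omega_1 \subseteq L_{\omega_1+1}$, and then (1) follows
by Theorem 15.33.
Now we need the following elementary fact:
(2) If $x, y \in L_\alpha$, then $\{x, y\} \in L_{\alpha+1}$.
This is an application of Lemma 15.20:
$$\{x, y\} = \{z \in L_\alpha : z = x \text{ or } z = y\} \in D(L_\alpha) = L_{\alpha+1}.$$
Recall that $f = \langle (A_\alpha, C_\alpha) : \alpha < \omega_1\rangle$. Then
(3) $f \in L_{\omega_2}$.
In fact, fix $\alpha < \omega_1$. Then $A_\alpha, C_\alpha \in L_{\omega_2}$ by the argument proving (1). Hence by (2), also
$(\alpha, (A_\alpha, C_\alpha)) \in L_{\omega_2}$. Hence unfixing $\alpha$, we see that there is a $\beta < \omega_2$ such that $f \subseteq L_\beta$;
hence (3) follows by Theorem 15.33.
Now we apply Theorem 14.29 to $Z = L$ to obtain a transitive set $P$ such that $L_{\omega_2} \subseteq P$
and certain formulas, relations, and functions in the rest of this proof are absolute for $P$.
Now by Theorem 15.7, let $M$ be a set such that $\{\omega, \omega_1, f, (A, D)\} \subseteq M \subseteq P$, $M \preceq P$,
and $|M| \le \omega$.
(4) $\emptyset \in M$.
In fact, $\emptyset \in P$, so $\forall y\,(y \notin \emptyset)$ holds in $P$ by absoluteness, hence $\exists x\,\forall y\,(y \notin x)$ holds in $P$,
hence in $M$ by Lemma 15.6, so choose $x \in M$ such that $\forall y\,(y \notin x)$ holds in $M$. Hence it
holds in $P$ by Lemma 15.6. Since $P$ is transitive, it follows that $x = \emptyset$, as desired.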
(5) If $\alpha \in M \cap \mathrm{On}$, then $\alpha + 1 \in M$.
The proof of (5) is similar to that of (4).
(6) $M \cap \omega_1$ is a countable limit ordinal.
To prove (6) it suffices to show that $M \cap \omega_1$ is an ordinal. In fact, then (5) implies that it
is a limit ordinal, and hence since $M$ is countable, it is countable. To show that $M \cap \omega_1$
is an ordinal it suffices to take any $\beta \in M \cap \omega_1$ and show that $\beta \subseteq M$, since this will show
that $M \cap \omega_1$ is transitive; so as a transitive set of transitive sets, it is an ordinal. If $\beta < \omega$,
clearly $\beta \subseteq M$ by (4) and (5). Suppose that $\omega \le \beta$. Let $g$ be a bijection from $\omega$ onto $\beta$.
Then "$g$ is a bijection from $\omega$ onto $\beta$" holds in $P$ by absoluteness, so "$\exists g$ ($g$ is a bijection
from $\omega$ onto $\beta$)" holds in $P$, and hence in $M$. Choose $h \in M$ such that "$h$ is a bijection
from $\omega$ onto $\beta$" holds in $M$; then it holds in $P$ by Lemma 15.6, and hence it is really true,
by absoluteness. Now by similar arguments, $h(n) \in M$ for every $n \in \omega$, so $\beta \subseteq M$, as
desired.
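Let $\alpha = M \cap \omega_1$. Now $M$ is extensional since $P$ is. Let $G, N$ be the Mostowski
collapsing function and the Mostowski collapse, respectively.
(7) $G(\beta) = \beta$ for all $\beta < \alpha$.
We prove (7) by induction, using the fact from (6) that $\beta \subseteq M$:
$$G(\beta) = \{G(\gamma) : \gamma \in M \text{ and } \gamma \in \beta\} = \{\gamma : \gamma \in \beta\} = \beta.$$
(8) $G(A) = A \cap \alpha$.
For,
$$
\begin{aligned}
G(A) &= \{G(\beta) : \beta \in M \text{ and } \beta \in A\}\\
&= \{G(\beta) : \beta \in M \cap \omega_1 \text{ and } \beta \in A\}\\
&= \{\beta : \beta \in \alpha \text{ and } \beta \in A\} && \text{using (7)}\\
&= A \cap \alpha.
\end{aligned}
$$
Similarly,
(9) $G(D) = D \cap \alpha$.
(10) $G(\omega_1) = \alpha$.
In fact,
$$G(\omega_1) = \{G(\beta) : \beta \in M \text{ and } \beta < \omega_1\} = \{G(\beta) : \beta < \alpha\} = \alpha$$
by (7).
Now by absoluteness,
$P \models$ "$(A, D)$ is $<_L$-first such that $A, D \subseteq \omega_1$,
$D$ is club in $\omega_1$, and $A \cap \beta \ne A_\beta$ (the first coordinate of $f(\beta)$) for all $\beta \in D$".
It follows that $M$ is a model of this same formula, and hence applying the isomorphism $G$
and using the above facts, we get
$N \models$ "$(A \cap \alpha, D \cap \alpha)$ is $<_L$-first such that $D \cap \alpha$ is a club in $\alpha$
and $A \cap \beta \ne A_\beta$ for all $\beta \in D \cap \alpha$".
By absoluteness, since $N$ is transitive this statement really holds. It follows by the definition
of $f$ that $A \cap \alpha = A_\alpha$. Moreover, $\alpha \in D$ since $D$ is club in $\omega_1$. This is a contradiction.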

EXERCISES
E15.1. Describe En(m, , n) for m 12 and n .
E15.2. Calculate $\mathrm{En}(625, V_3, 1)$ and $\mathrm{En}(20000, A, 2)$.
E15.3. In the ordering <L determine the first four sets and their order. Hint: use Corollary
15.5.
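E15.4. Suppose that $M$ is a nonempty transitive class satisfying the comprehension axioms,
and also $\forall x \subseteq M\,\exists y \in M\,[x \subseteq y]$. Show that $M$ is a model of ZF.
E15.5. Show that if $M$ is a transitive proper class model of ZF, then $\forall x \subseteq M\,\exists y \in M\,[x \subseteq y]$.
E15.6. Show that for every ordinal $\alpha > \omega$, $|L_\alpha| = |V_\alpha|$ iff $\alpha = \beth_\alpha$.
E15.7. Assume $V = L$ and $\alpha > \omega$. Then $L_\alpha = V_\alpha$ iff $\alpha = \beth_\alpha$.
E15.8. Assume $V = L$ and prove that $L_\kappa = H(\kappa)$ for every infinite cardinal $\kappa$.
In the remaining exercises we develop the theory of ordinal definable sets. OD is the class
of all sets $a$ such that:
$$\exists\beta > \mathrm{rank}(a)\,\exists n \in \omega\,\exists s \in {}^{n}\beta\,\exists R \in \mathrm{Df}(V_\beta, n + 1)\,\forall x \in V_\beta\,[s^\frown\langle x\rangle \in R \leftrightarrow x = a].$$
E15.9. Show that if $\varphi(y_1, \ldots, y_n, x)$ is a formula with at most the indicated variables free,
then
$$\forall\alpha_1, \ldots, \alpha_n\,\forall a\,[\forall x\,[\varphi(\alpha_1, \ldots, \alpha_n, x) \leftrightarrow x = a] \rightarrow a \in \mathrm{OD}].$$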
Also show that OD.
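E15.10. We define $s \prec t$ iff $s, t \in {}^{<\omega}\mathrm{ON}$ and one of the following holds:
(i) $s = \emptyset$ and $t \ne \emptyset$;
(ii) $s, t \ne \emptyset$ and $\max(\mathrm{rng}(s)) < \max(\mathrm{rng}(t))$;
(iii) $s, t \ne \emptyset$ and $\max(\mathrm{rng}(s)) = \max(\mathrm{rng}(t))$ and $\mathrm{dmn}(s) < \mathrm{dmn}(t)$;
(iv) $s, t \ne \emptyset$ and $\max(\mathrm{rng}(s)) = \max(\mathrm{rng}(t))$ and $\mathrm{dmn}(s) = \mathrm{dmn}(t)$ and $\exists k \in \mathrm{dmn}(s)\,[s \upharpoonright k = t \upharpoonright k \text{ and } s(k) < t(k)]$.
Prove the following:
(v) $\prec$ well-orders ${}^{<\omega}\mathrm{ON}$.
(vi) $\forall t \in {}^{<\omega}\mathrm{ON}\,[\{s : s \prec t\} \text{ is a set}]$.
(vii) For every infinite ordinal $\alpha$ we have $|{}^{<\omega}(\alpha + 1)| = |\alpha|$.
(viii) For every uncountable cardinal $\kappa$, the set ${}^{<\omega}\kappa$ is well-ordered by $\prec$ in order type
$\kappa$ and is an initial segment of ${}^{<\omega}\mathrm{ON}$.
(ix) ${}^{<\omega}\omega$ is well-ordered by $\prec$ in order type $\omega^2$.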
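To get a feel for the order $\prec$ of E15.10 before proving anything, one can experiment with finite sequences of natural numbers. The Python sketch below is an illustration only; it restricts attention to sequences from ${}^{<\omega}\omega$ modelled as tuples, and the names `key` and `precedes` are chosen here. It relies on the fact that Python compares tuples lexicographically.

```python
def key(s):
    """Sort key realizing clauses (i)-(iv): empty first, then max, then length, then lex."""
    s = tuple(s)
    if not s:
        return (0,)
    return (1, max(s), len(s), s)

def precedes(s, t):
    """s precedes t in the order of E15.10, for finite sequences of naturals."""
    return key(s) < key(t)

sample = [(1,), (), (0, 0), (0,), (1, 0), (0, 1)]
ordered = sorted(sample, key=key)
```

In `ordered` the empty sequence comes first, then the sequences with maximum 0 by increasing length, then those with maximum 1.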
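E15.11. By exercise E15.10, for each uncountable cardinal $\kappa$ there is an isomorphism $f_\kappa$
from $(\kappa, \in)$ onto $({}^{<\omega}\kappa, \prec)$. Then $f_\kappa \subseteq f_\lambda$ for $\kappa < \lambda$. It follows that there is a function
Enon mapping ON onto ${}^{<\omega}\mathrm{ON}$ such that $\alpha < \beta$ iff $\mathrm{Enon}(\alpha) \prec \mathrm{Enon}(\beta)$.
Now we define a class function Enod with domain ON, as follows. For any ordinal $\alpha$,
$$
\mathrm{Enod}(\alpha) =
\begin{cases}
a & \text{if there exist } s, \beta, m, n \text{ such that } \mathrm{Enon}(\alpha) = s^\frown\langle\beta, n, m\rangle\\
  & \text{with } m, n \in \omega,\ \beta \in \mathrm{ON},\ s \in {}^{<\omega}\beta,\ \mathrm{dmn}(s) = n, \text{ and}\\
  & \forall x \in V_\beta\,[s^\frown\langle x\rangle \in \mathrm{En}(m, V_\beta, n + 1) \leftrightarrow x = a],\\
0 & \text{otherwise.}
\end{cases}
$$
Prove that $\mathrm{OD} = \{\mathrm{Enod}(\alpha) : \alpha \in \mathrm{ON}\}$.
E15.12. Now we define $\mathrm{HOD} = \{x \in \mathrm{OD} : \mathrm{trcl}(x) \subseteq \mathrm{OD}\}$.
Prove that $\mathrm{ON} \subseteq \mathrm{HOD}$ and HOD is transitive.
E15.13. Show that $(V_\alpha \cap \mathrm{HOD}) \in \mathrm{HOD}$ for every ordinal $\alpha$.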
E15.14. Prove without using the axiom of choice that HOD is a model of ZFC.


16. Boolean algebras and quasi-orders


A Boolean algebra (BA) is a structure $\langle A, +, \cdot, -, 0, 1\rangle$ with two binary operations $+$ and
$\cdot$, a unary operation $-$, and two distinguished elements 0 and 1 such that the following
axioms hold for all $x, y, z \in A$:
$$
\begin{array}{llll}
(A) & x + (y + z) = (x + y) + z; & (A') & x \cdot (y \cdot z) = (x \cdot y) \cdot z;\\
(C) & x + y = y + x; & (C') & x \cdot y = y \cdot x;\\
(L) & x + (x \cdot y) = x; & (L') & x \cdot (x + y) = x;\\
(D) & x \cdot (y + z) = (x \cdot y) + (x \cdot z); & (D') & x + (y \cdot z) = (x + y) \cdot (x + z);\\
(K) & x + (-x) = 1; & (K') & x \cdot (-x) = 0.
\end{array}
$$

The main example of a Boolean algebra is a field of sets: a set $A$ of subsets of some set $X$,
closed under union, intersection, and complementation with respect to $X$. The associated
Boolean algebra is $\langle A, \cup, \cap, \setminus, 0, X\rangle$. Here $\setminus$ is treated as a one-place operation, producing
$X \setminus a$ for any $a \in A$. This example is really all-encompassing: every BA is isomorphic to
one of these. We will not prove this, or use it.
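As a sanity check on the axioms, one can verify them by brute force in a small field of sets. The Python sketch below is an illustration, not a proof; it assumes $X = \{0, 1, 2\}$ and takes $A$ to be the full power set of $X$, with `plus`, `times` and `neg` as names chosen here for union, intersection and complement.

```python
from itertools import combinations, product

X = frozenset({0, 1, 2})
# Universe of the algebra: all subsets of X (the largest field of sets on X).
A = [frozenset(c) for r in range(len(X) + 1) for c in combinations(sorted(X), r)]

def plus(x, y):
    return x | y

def times(x, y):
    return x & y

def neg(x):
    return X - x

ZERO, ONE = frozenset(), X

def axioms_hold():
    """Check (A), (C), (L), (D), (K) and their duals for all x, y, z in A."""
    for x, y, z in product(A, repeat=3):
        checks = [
            plus(x, plus(y, z)) == plus(plus(x, y), z),
            times(x, times(y, z)) == times(times(x, y), z),
            plus(x, y) == plus(y, x),
            times(x, y) == times(y, x),
            plus(x, times(x, y)) == x,
            times(x, plus(x, y)) == x,
            times(x, plus(y, z)) == plus(times(x, y), times(x, z)),
            plus(x, times(y, z)) == times(plus(x, y), plus(x, z)),
            plus(x, neg(x)) == ONE,
            times(x, neg(x)) == ZERO,
        ]
        if not all(checks):
            return False
    return True
```

The same function can be reused to test any later arithmetic statement in this small algebra before deriving it from the axioms.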
As is usual in algebra, we usually denote a whole algebra hA, +, , , 0, 1i just by
mentioning its universe A, everything else being implicit.
Some notations used in some treatments of Boolean algebras are: $\vee$ or $\cup$ for $+$; $\wedge$ or $\cap$
for $\cdot$; $\neg$ or $'$ for $-$. These notations might be confusing if discussing logic, or elementary set
theory. Our notation might be confusing if discussing ordinary algebra.
Now we give the elementary arithmetic of Boolean algebras. We recommend that the
reader go through them, but then approach any arithmetic statement in the future from
the point of view of seeing if it works in fields of sets; if so, it should be easy to derive from
the axioms.
First we have the duality principle, which we shall not formulate carefully; our particular uses of it will be clear. Namely, notice that the axioms come in pairs, obtained from
each other by interchanging $+$ and $\cdot$, and 0 and 1. This means that if we prove some
arithmetic statement, the dual statement, obtained by this interchanging process, is also
valid.
Proposition 16.1. $x + x = x$ and $x \cdot x = x$.
Proof.
$$
\begin{aligned}
x + x &= x + x \cdot (x + x) && \text{by (L')}\\
&= x && \text{by (L)};
\end{aligned}
$$
the second statement follows by duality.
Proposition 16.2. $x + y = y$ iff $x \cdot y = x$.
Proof. Assume that $x + y = y$. Then, by (L'),
$$x \cdot y = x \cdot (x + y) = x.$$
The converse follows by duality.

In any BA we define $x \le y$ iff $x + y = y$. Note that the dual of $x \le y$ is $y \le x$, by 16.2
and commutativity. (The dual of a defined notion is obtained by dualizing the original
notions.)
Proposition 16.3. On any BA, $\le$ is reflexive, transitive, and antisymmetric; that
is, the following conditions hold:
(i) $x \le x$;
(ii) If $x \le y$ and $y \le z$, then $x \le z$;
(iii) If $x \le y$ and $y \le x$, then $x = y$.
Proof. $x \le x$ means $x + x = x$, which was proved in 16.1. Assume the hypothesis of
(ii). Then
$$
\begin{aligned}
x + z &= x + (y + z)\\
&= (x + y) + z\\
&= y + z\\
&= z,
\end{aligned}
$$
as desired. Finally, under the hypotheses of (iii),
$$x = y + x = x + y = y.$$
A partial ordering on a set $X$ is a binary relation $\le$ on $X$ which is reflexive, transitive, and
antisymmetric; thus Proposition 16.3 says that $\le$ is a partial ordering on the BA $A$. There
are some notions concerning partial orders which we need. An element $z$ is an upper bound
for a set $Y$ of elements of $X$ if $y \le z$ for all $y \in Y$; similarly for lower bounds. And $z$ is a
least upper bound for $Y$ if it is an upper bound for $Y$ and is $\le$ any other upper bound for
$Y$; similarly for greatest lower bounds. By antisymmetry, in any partial order least upper
bounds and greatest lower bounds are unique if they exist.
Proposition 16.4. $x + y$ is the least upper bound of $\{x, y\}$, and $x \cdot y$ is the greatest
lower bound of $\{x, y\}$.
Proof. We have $x + (x + y) = (x + x) + y = x + y$, and similarly $y + (x + y) = y + (y + x) = (y + y) + x = y + x = x + y$; so $x + y$ is an upper bound for $\{x, y\}$. If $z$ is
any upper bound for $\{x, y\}$, then
$$(x + y) + z = x + (y + z) = x + z = z,$$
as desired. The other part follows by duality(!).
Proposition 16.5. (i) $x + 0 = x$ and $x \cdot 1 = x$;
(ii) $x \cdot 0 = 0$ and $x + 1 = 1$;
(iii) $0 \le x \le 1$.
Proof. By (K) and Proposition 16.4, 1 is the least upper bound of $x$ and $-x$; in
particular it is an upper bound, so $x \le 1$. Everything else follows by duality, Proposition
16.2, and the definitions.

Proposition 16.6. For any $x$ and $y$, $y = -x$ iff $x \cdot y = 0$ and $x + y = 1$.
Proof. $\Rightarrow$ holds by (K) and (K'). Now suppose that $x \cdot y = 0$ and $x + y = 1$. Then
$$y = y \cdot 1 = y \cdot (x + -x) = y \cdot x + y \cdot -x = 0 + y \cdot -x = y \cdot -x;$$
$$-x = -x \cdot 1 = -x \cdot (x + y) = -x \cdot x + -x \cdot y = 0 + -x \cdot y = -x \cdot y = y.$$
Proposition 16.7. (i) $--x = x$;
(ii) if $-x = -y$ then $x = y$;
(iii) $-0 = 1$ and $-1 = 0$;
(iv) (De Morgan's laws) $-(x + y) = -x \cdot -y$ and $-(x \cdot y) = -x + -y$.
Proof. If we apply Proposition 16.6 with $x$ and $y$ replaced respectively by $-x$ and $x$,
we get $--x = x$. Next, if $-x = -y$, then $x = --x = --y = y$. For (iii), by 16.5(iii),
$0 \cdot 1 = 0$ and $0 + 1 = 1$, so by 16.6, $-0 = 1$. Then $-1 = 0$ by duality. For the first part of
(iv),
$$(x + y) \cdot -x \cdot -y = x \cdot -x \cdot -y + y \cdot -x \cdot -y = 0 + 0 = 0,$$
and
$$
\begin{aligned}
(x + y) + -x \cdot -y &= x \cdot (y + -y) + y + -x \cdot -y\\
&= x \cdot y + x \cdot -y + y + -x \cdot -y\\
&= y + x \cdot -y + -x \cdot -y\\
&= y + -y = 1,
\end{aligned}
$$
so that $-(x + y) = -x \cdot -y$ by Proposition 16.6. Finally, the second part of (iv) follows
by duality.
Proposition 16.8. $x \le y$ iff $-y \le -x$.
Proof. Assume that $x \le y$. Then $x + y = y$, so $-x \cdot -y = -y$, i.e., $-y \le -x$. For
the converse, use the implication just proved, plus 16.7(i).
Proposition 16.9. If $x \le x'$ and $y \le y'$, then $x + y \le x' + y'$ and $x \cdot y \le x' \cdot y'$.
Proof. Assume the hypothesis. Then
$$(x + y) + (x' + y') = (x + x') + (y + y') = x' + y',$$
and so $x + y \le x' + y'$; the second conclusion follows by duality.
Proposition 16.10. $x \le y$ iff $x \cdot -y = 0$.
Proof. If $x \le y$, then $x = x \cdot y$ and so $x \cdot -y = 0$. Conversely, if $x \cdot -y = 0$, then
$$x = x \cdot (y + -y) = x \cdot y + x \cdot -y = x \cdot y,$$
so that $x \le y$.
Elements $x, y \in A$ are disjoint if $x \cdot y = 0$. For any $x, y$ we define
$$x \triangle y = x \cdot -y + y \cdot -x;$$
this is the symmetric difference of $x$ and $y$.
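Proposition 16.11. (i) $x = y$ iff $x \triangle y = 0$;
(ii) $x \cdot (y \triangle z) = (x \cdot y) \triangle (x \cdot z)$;
(iii) $x \triangle (y \triangle z) = (x \triangle y) \triangle z$.
Proof. For (i), $\Rightarrow$ is trivial. Now assume that $x \triangle y = 0$. Then $x \cdot -y = 0 = y \cdot -x$,
so $x \le y$ and $y \le x$, so $x = y$.
For (ii), we have
$$\begin{aligned}
x \cdot (y \triangle z) &= x \cdot y \cdot -z + x \cdot z \cdot -y \\
&= (x \cdot y) \cdot -(x \cdot z) + (x \cdot z) \cdot -(x \cdot y) \\
&= (x \cdot y) \triangle (x \cdot z),
\end{aligned}$$
as desired.
Finally, for (iii),
$$\begin{aligned}
x \triangle (y \triangle z) &= x \cdot -(y \cdot -z + -y \cdot z) + (y \cdot -z + -y \cdot z) \cdot -x \\
&= x \cdot (-y + z) \cdot (y + -z) + -x \cdot y \cdot -z + -x \cdot -y \cdot z \\
&= x \cdot y \cdot z + x \cdot -y \cdot -z + -x \cdot y \cdot -z + -x \cdot -y \cdot z;
\end{aligned}$$
if we apply the same argument to $z \triangle (y \triangle x)$ we get
$$z \triangle (y \triangle x) = z \cdot y \cdot x + z \cdot -y \cdot -x + -z \cdot y \cdot -x + -z \cdot -y \cdot x,$$
which is the same thing. So the obvious symmetry of $\triangle$ gives the desired result.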
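One further useful result is that axiom (D$'$) is redundant:
Proposition 16.12. (D$'$) is redundant.
Proof.
$$\begin{aligned}
(x + y) \cdot (x + z) &= ((x + y) \cdot x) + ((x + y) \cdot z) \\
&= (x \cdot (x + y)) + (z \cdot (x + y)) \\
&= x + ((z \cdot x) + (z \cdot y)) \\
&= x + ((x \cdot z) + (y \cdot z)) \\
&= (x + (x \cdot z)) + (y \cdot z) \\
&= x + (y \cdot z).
\end{aligned}$$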
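The identities proved so far can be sanity-checked by brute force in a small concrete algebra. The following is a minimal sketch, assuming the power set of $\{0, 1, 2\}$ as the BA (with $+$ as union, $\cdot$ as intersection, $-$ as complement); the function names are illustrative and not part of the text. It checks 16.7(iv), 16.8, 16.10, 16.11(ii), 16.11(iii), and (D$'$) for all triples of elements. Such a check is of course no substitute for the proofs, which apply to every BA.

```python
from itertools import combinations

# Assumption: the power set of {0, 1, 2} stands in for an arbitrary BA,
# with + as union, . as intersection, - as complement, 0 = {} and 1 = U.
U = frozenset({0, 1, 2})
ELEMENTS = [frozenset(c) for n in range(len(U) + 1)
            for c in combinations(sorted(U), n)]
ZERO = frozenset()

def plus(x, y):
    return x | y

def times(x, y):
    return x & y

def neg(x):
    return U - x

def leq(x, y):
    # x <= y is defined by x + y = y
    return plus(x, y) == y

def symdiff(x, y):
    # symmetric difference: x.-y + y.-x
    return plus(times(x, neg(y)), times(y, neg(x)))

def check_identities():
    """Return True if the sampled identities hold for all elements."""
    for x in ELEMENTS:
        for y in ELEMENTS:
            if neg(plus(x, y)) != times(neg(x), neg(y)):    # 16.7(iv)
                return False
            if leq(x, y) != leq(neg(y), neg(x)):            # 16.8
                return False
            if leq(x, y) != (times(x, neg(y)) == ZERO):     # 16.10
                return False
            for z in ELEMENTS:
                # 16.11(ii)
                if times(x, symdiff(y, z)) != symdiff(times(x, y), times(x, z)):
                    return False
                # 16.11(iii)
                if symdiff(x, symdiff(y, z)) != symdiff(symdiff(x, y), z):
                    return False
                # (D'), shown redundant in 16.12
                if plus(x, times(y, z)) != times(plus(x, y), plus(x, z)):
                    return False
    return True
```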

Complete Boolean algebras


If $M$ is a subset of a BA $A$, we denote by $\sum M$ its least upper bound (if it exists), and by
$\prod M$ its greatest lower bound, if it exists. $A$ is complete iff these always exist. Note that
frequently people use $\bigvee M$ and $\bigwedge M$ instead of $\sum M$ and $\prod M$.
Proposition 16.13. Assume that $A$ is a complete BA.
(i) $-\sum_{i \in I} a_i = \prod_{i \in I} -a_i$.
(ii) $-\prod_{i \in I} a_i = \sum_{i \in I} -a_i$.
Proof. For (i), let $a = \sum_{i \in I} a_i$; we show that $-a$ is the greatest lower bound of
$\{-a_i : i \in I\}$. If $i \in I$, then $a_i \le a$, and hence $-a \le -a_i$; thus $-a$ is a lower bound for the
indicated set. Now suppose that $x$ is any lower bound for this set. Then for any $i \in I$ we
have $x \le -a_i$, and so $a_i \le -x$. So $-x$ is an upper bound for $\{a_i : i \in I\}$, and so $a \le -x$.
Hence $x \le -a$, as desired.
(ii) is proved similarly.
The following (possibly infinite) distributive law is frequently useful. One should be aware
of the fact that more general infinite distributive laws do not hold, in general. Since this
will not enter into our treatment, we do not go into a counterexample or further discussion
of really general distributive laws.
Proposition 16.14. If $\sum_{i \in I} a_i$ exists, then $\sum_{i \in I} (b \cdot a_i)$ exists and
$$b \cdot \sum_{i \in I} a_i = \sum_{i \in I} (b \cdot a_i).$$
Proof. Let $s = \sum_{i \in I} a_i$; we shall show that $b \cdot s$ is the least upper bound of $\{b \cdot a_i :
i \in I\}$. If $i \in I$, then $a_i \le s$ and so $b \cdot a_i \le b \cdot s$; so $b \cdot s$ is an upper bound for the indicated
set. Now suppose that $x$ is any upper bound for this set. Then for any $i \in I$ we have
$b \cdot a_i \le x$, hence $b \cdot a_i \cdot -x = 0$ and so $a_i \le -(b \cdot -x) = -b + x$; so $-b + x$ is an upper
bound for $\{a_i : i \in I\}$. It follows that $s \le -b + x$, and hence $s \cdot b \le x$, as desired.
Quasi-orders
A quasi-order is a triple $\mathbb{P} = (P, \le, 1)$ such that $\le$ is a reflexive and transitive relation
on the nonempty set $P$, and $\forall p \in P\,(p \le 1)$. Note that we do not assume that $\le$ is
antisymmetric. Partial orders are special cases of quasi-orders in which this is assumed.
Note that we assume that every quasi-order has a largest element; this is non-standard in
the theory of quasi-orders, but standard in treatments of forcing. Many set-theorists use
"partial order" instead of "quasi-order".
Frequently we use just $P$ for a quasi-order; $\le$ and $1$ are assumed.
We say that elements $p, q \in P$ are compatible iff there is an $r \le p, q$. We write $p \perp q$
to indicate that $p$ and $q$ are incompatible. A set $A$ of elements of $P$ is an antichain iff any
two distinct members of $A$ are incompatible. WARNING: sometimes "antichain" is used
to mean pairwise incomparable, a much different notion. A subset $Q$ of $P$ is dense iff for
every $p \in P$ there is a $q \in Q$ such that $q \le p$.
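To make these definitions concrete, here is a small sketch on a hypothetical finite quasi-order: the nonempty subsets of $\{0, 1, 2\}$ ordered by inclusion, with largest element $\{0, 1, 2\}$. The example and the function names are assumptions chosen for illustration. It also shows the difference flagged in the WARNING: $\{0,1\}$ and $\{1,2\}$ are incomparable but compatible, so they do not form an antichain in the sense used here.

```python
from itertools import combinations

# Hypothetical example: nonempty subsets of {0, 1, 2}, ordered by inclusion.
BASE = (0, 1, 2)
P = [frozenset(c) for n in range(1, len(BASE) + 1)
     for c in combinations(BASE, n)]
ONE = frozenset(BASE)  # the largest element

def leq(p, q):
    # p <= q iff p is a subset of q
    return p <= q

def compatible(p, q):
    # p and q are compatible iff some r in P lies below both
    return any(leq(r, p) and leq(r, q) for r in P)

def is_antichain(A):
    # any two distinct members are incompatible
    return all(not compatible(p, q) for p in A for q in A if p != q)

def is_dense(Q):
    # every p in P has some q in Q with q <= p
    return all(any(leq(q, p) for q in Q) for p in P)
```

In this example the singletons form both a dense set and a maximal antichain.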
Now we are going to describe how to embed a quasi-order into a complete BA. We
take the regular open algebra of a certain topological space. To avoid assuming any
knowledge of topology, we now give a minimalist introduction to the subject.
A topology on a set $X$ is a collection $O$ of subsets of $X$ satisfying the following conditions:
(1) $X, \emptyset \in O$.
(2) $O$ is closed under arbitrary unions.
(3) $O$ is closed under finite intersections.
The members of $O$ are said to be open. The interior of a subset $Y \subseteq X$ is the union of all
open sets contained in $Y$; we denote it by $\mathrm{int}(Y)$.
Proposition 16.15. (i) $\mathrm{int}(\emptyset) = \emptyset$.
(ii) $\mathrm{int}(X) = X$.
(iii) $\mathrm{int}(Y) \subseteq Y$.
(iv) $\mathrm{int}(Y \cap Z) = \mathrm{int}(Y) \cap \mathrm{int}(Z)$.
(v) $\mathrm{int}(\mathrm{int}(Y)) = \mathrm{int}(Y)$.
(vi) $\mathrm{int}(Y) = \{x \in X : x \in U \subseteq Y \text{ for some open set } U\}$.
Proof. (i)-(iii), (v), and (vi) are obvious. For (iv), if $U$ is an open set contained in
$Y \cap Z$, then it is contained in $Y$; so $\mathrm{int}(Y \cap Z) \subseteq \mathrm{int}(Y)$. Similarly for $Z$, so $\subseteq$ holds. For
$\supseteq$, note that the right side is an open set contained in $Y \cap Z$. (v) holds since $\mathrm{int}(Y)$ is
open.
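A subset $C$ of $X$ is closed iff $X \setminus C$ is open.
Proposition 16.16. (i) $\emptyset$ and $X$ are closed.
(ii) The collection of all closed sets is closed under finite unions and intersections of
any nonempty subcollection.
For any $Y \subseteq X$, the closure of $Y$, denoted by $\mathrm{cl}(Y)$, is the intersection of all closed sets
containing $Y$.
Proposition 16.17. (i) $\mathrm{cl}(Y) = X \setminus \mathrm{int}(X \setminus Y)$.
(ii) $\mathrm{int}(Y) = X \setminus \mathrm{cl}(X \setminus Y)$.
(iii) $\mathrm{cl}(\emptyset) = \emptyset$.
(iv) $\mathrm{cl}(X) = X$.
(v) $Y \subseteq \mathrm{cl}(Y)$.
(vi) $\mathrm{cl}(Y \cup Z) = \mathrm{cl}(Y) \cup \mathrm{cl}(Z)$.
(vii) $\mathrm{cl}(\mathrm{cl}(Y)) = \mathrm{cl}(Y)$.
(viii) $\mathrm{cl}(Y) = \{x \in X : \text{for every open set } U, \text{ if } x \in U \text{ then } U \cap Y \ne \emptyset\}$.
Proof. (i): $\mathrm{int}(X \setminus Y)$ is an open set contained in $X \setminus Y$, so $Y$ is a subset of the closed
set $X \setminus \mathrm{int}(X \setminus Y)$. Hence $\mathrm{cl}(Y) \subseteq X \setminus \mathrm{int}(X \setminus Y)$. Also, $\mathrm{cl}(Y)$ is a closed set containing
$Y$, so $X \setminus \mathrm{cl}(Y)$ is an open set contained in $X \setminus Y$. Hence $X \setminus \mathrm{cl}(Y) \subseteq \mathrm{int}(X \setminus Y)$. Hence
$X \setminus \mathrm{int}(X \setminus Y) \subseteq \mathrm{cl}(Y)$. This proves (i).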
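(ii): Using (i),
$$X \setminus \mathrm{cl}(X \setminus Y) = X \setminus (X \setminus \mathrm{int}(X \setminus (X \setminus Y))) = \mathrm{int}(Y).$$
(iii)-(v): clear.
(vi):
$$\begin{aligned}
\mathrm{cl}(Y \cup Z) &= X \setminus \mathrm{int}(X \setminus (Y \cup Z)) && \text{by (i)} \\
&= X \setminus \mathrm{int}((X \setminus Y) \cap (X \setminus Z)) \\
&= X \setminus (\mathrm{int}(X \setminus Y) \cap \mathrm{int}(X \setminus Z)) && \text{by 16.15(iv)} \\
&= [X \setminus \mathrm{int}(X \setminus Y)] \cup [X \setminus \mathrm{int}(X \setminus Z)] \\
&= \mathrm{cl}(Y) \cup \mathrm{cl}(Z).
\end{aligned}$$
(vii):
$$\begin{aligned}
\mathrm{cl}(\mathrm{cl}(Y)) &= \mathrm{cl}(X \setminus \mathrm{int}(X \setminus Y)) \\
&= X \setminus \mathrm{int}(X \setminus (X \setminus \mathrm{int}(X \setminus Y))) \\
&= X \setminus \mathrm{int}(\mathrm{int}(X \setminus Y)) \\
&= X \setminus \mathrm{int}(X \setminus Y) \\
&= \mathrm{cl}(Y).
\end{aligned}$$
(viii): First suppose that $x \in \mathrm{cl}(Y)$, and $x \in U$, $U$ open. By (i) and Proposition
16.15(vi) we have $U \not\subseteq X \setminus Y$, i.e., $U \cap Y \ne \emptyset$, as desired. Second, suppose that $x \notin \mathrm{cl}(Y)$.
Then by (i) and 16.15(vi) there is an open $U$ such that $x \in U \subseteq X \setminus Y$; so $U \cap Y = \emptyset$, as
desired.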
Now we go beyond this minimum amount of topology and work with the notion of a regular
open set, which is not a standard part of topology courses.
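We say that $Y$ is regular open iff $Y = \mathrm{int}(\mathrm{cl}(Y))$.
Proposition 16.18. (i) If $Y$ is open, then $Y \subseteq \mathrm{int}(\mathrm{cl}(Y))$.
(ii) If $U$ and $V$ are regular open, then so is $U \cap V$.
(iii) $\mathrm{int}(\mathrm{cl}(Y))$ is regular open.
(iv) If $U$ is open, then $\mathrm{int}(\mathrm{cl}(U))$ is the smallest regular open set containing $U$.
(v) If $U$ is open then $U \cap \mathrm{cl}(Y) \subseteq \mathrm{cl}(U \cap Y)$.
(vi) If $U$ is open, then $U \cap \mathrm{int}(\mathrm{cl}(Y)) \subseteq \mathrm{int}(\mathrm{cl}(U \cap Y))$.
(vii) If $U$ and $V$ are open and $U \cap V = \emptyset$, then $\mathrm{int}(\mathrm{cl}(U)) \cap V = \emptyset$.
(viii) If $U$ and $V$ are open and $U \cap V = \emptyset$, then $\mathrm{int}(\mathrm{cl}(U)) \cap \mathrm{int}(\mathrm{cl}(V)) = \emptyset$.
(ix) For any set $M$ of regular open sets, $\mathrm{int}(\mathrm{cl}(\bigcup M))$ is the least regular open set
containing each member of $M$.
Proof. (i): $Y \subseteq \mathrm{cl}(Y)$, and hence $Y = \mathrm{int}(Y) \subseteq \mathrm{int}(\mathrm{cl}(Y))$.
(ii): $U \cap V$ is open, and so $U \cap V \subseteq \mathrm{int}(\mathrm{cl}(U \cap V))$. For the other inclusion,
$\mathrm{int}(\mathrm{cl}(U \cap V)) \subseteq \mathrm{int}(\mathrm{cl}(U)) = U$, and similarly for $V$, so the other inclusion holds.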
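(iii): $\mathrm{int}(\mathrm{cl}(Y)) \subseteq \mathrm{cl}(Y)$, so $\mathrm{cl}(\mathrm{int}(\mathrm{cl}(Y))) \subseteq \mathrm{cl}(\mathrm{cl}(Y)) = \mathrm{cl}(Y)$; hence
$$\mathrm{int}(\mathrm{cl}(\mathrm{int}(\mathrm{cl}(Y)))) \subseteq \mathrm{int}(\mathrm{cl}(Y));$$
the other inclusion is clear from (i), since $\mathrm{int}(\mathrm{cl}(Y))$ is open.
(iv): By (i) and (iii), $\mathrm{int}(\mathrm{cl}(U))$ is a regular open set containing $U$. If $V$ is any regular open
set containing $U$, then $\mathrm{int}(\mathrm{cl}(U)) \subseteq \mathrm{int}(\mathrm{cl}(V)) = V$.
(v): We have
$$U \cap (X \setminus (U \cap Y)) \subseteq X \setminus Y,$$
hence
$$\begin{aligned}
U \cap \mathrm{int}(X \setminus (U \cap Y)) &= \mathrm{int}(U) \cap \mathrm{int}(X \setminus (U \cap Y)) \\
&= \mathrm{int}(U \cap (X \setminus (U \cap Y))) \\
&\subseteq \mathrm{int}(X \setminus Y),
\end{aligned}$$
hence
$$\begin{aligned}
X \setminus \mathrm{int}(X \setminus Y) &\subseteq X \setminus (U \cap \mathrm{int}(X \setminus (U \cap Y))) \\
&= (X \setminus U) \cup (X \setminus \mathrm{int}(X \setminus (U \cap Y))),
\end{aligned}$$
hence
$$U \cap (X \setminus \mathrm{int}(X \setminus Y)) \subseteq X \setminus \mathrm{int}(X \setminus (U \cap Y)),$$
and (v) follows by Proposition 16.17(i).
(vi):
$$\begin{aligned}
U \cap \mathrm{int}(\mathrm{cl}(Y)) &= \mathrm{int}(U) \cap \mathrm{int}(\mathrm{cl}(Y)) \\
&= \mathrm{int}(U \cap \mathrm{cl}(Y)) \\
&\subseteq \mathrm{int}(\mathrm{cl}(U \cap Y)) && \text{by (v)}.
\end{aligned}$$
(vii): $U \subseteq X \setminus V$, hence $\mathrm{cl}(U) \subseteq \mathrm{cl}(X \setminus V) = X \setminus V$, hence $\mathrm{cl}(U) \cap V = \emptyset$, and the
conclusion of (vii) follows.
(viii): Apply (vii) twice.
(ix): If $U \in M$, then $U \subseteq \bigcup M \subseteq \mathrm{int}(\mathrm{cl}(\bigcup M))$. Suppose that $V$ is regular open and
$U \subseteq V$ for all $U \in M$. Then $\bigcup M \subseteq V$, and so $\mathrm{int}(\mathrm{cl}(\bigcup M)) \subseteq \mathrm{int}(\mathrm{cl}(V)) = V$.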
We let $\mathrm{RO}(X)$ be the collection of all regular open sets in $X$. We define operations on
$\mathrm{RO}(X)$ which will make it a Boolean algebra. For any $Y, Z \in \mathrm{RO}(X)$, let
$$Y + Z = \mathrm{int}(\mathrm{cl}(Y \cup Z)); \qquad Y \cdot Z = Y \cap Z; \qquad -Y = \mathrm{int}(X \setminus Y).$$
Theorem 16.19. The structure
$$\langle \mathrm{RO}(X), +, \cdot, -, \emptyset, X \rangle$$
is a complete BA. Moreover, the ordering $\le$ coincides with $\subseteq$.
Proof. $\mathrm{RO}(X)$ is closed under $+$ by Proposition 16.18(ix), and is closed under $\cdot$ by
Proposition 16.18(ii). Clearly it is closed under $-$, and $\emptyset, X \in \mathrm{RO}(X)$. Now we check
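the axioms. The following are completely obvious: (A$'$), (C$'$), (C). Now let unexplained
variables range over $\mathrm{RO}(X)$. For (A), note by 16.18(i) that $U \subseteq U + V \subseteq (U + V) + W$;
and similarly $V \subseteq U + V \subseteq (U + V) + W$ and $W \subseteq (U + V) + W$. If $U, V, W \subseteq Z$, then by
16.18(iv), $U + V \subseteq Z$ and hence $(U + V) + W \subseteq Z$. Thus $(U + V) + W$ is the least upper
bound in $\mathrm{RO}(X)$ of $U, V, W$. This is true for all $U, V, W$. So $U + (V + W) = (V + W) + U$
is also the least upper bound of them; so (A) holds. For (L):
$$U + U \cdot V = \mathrm{int}(\mathrm{cl}(U \cup (U \cap V))) = \mathrm{int}(\mathrm{cl}(U)) = U.$$
(L$'$) holds by 16.18(i). For (D), first note that
$$\begin{aligned}
Y \cdot (Z + W) &= Y \cap \mathrm{int}(\mathrm{cl}(Z \cup W)) \\
&\subseteq \mathrm{int}(\mathrm{cl}(Y \cap (Z \cup W))) && \text{by 16.18(vi)} \\
&= \mathrm{int}(\mathrm{cl}((Y \cap Z) \cup (Y \cap W))) \\
&= Y \cdot Z + Y \cdot W.
\end{aligned}$$
On the other hand, $(Y \cap Z) \cup (Y \cap W) = Y \cap (Z \cup W) \subseteq Y, Z \cup W$, and hence easily
$$Y \cdot Z + Y \cdot W = \mathrm{int}(\mathrm{cl}((Y \cap Z) \cup (Y \cap W))) \subseteq \mathrm{int}(\mathrm{cl}(Y)) = Y$$
and
$$Y \cdot Z + Y \cdot W = \mathrm{int}(\mathrm{cl}((Y \cap Z) \cup (Y \cap W))) \subseteq \mathrm{int}(\mathrm{cl}(Z \cup W)) = Z + W;$$
so the other inclusion follows, and (D) holds.
(K): For any regular open $Y$, from Proposition 16.17(ii) we get $-Y = \mathrm{int}(X \setminus Y) =
X \setminus \mathrm{cl}(X \setminus (X \setminus Y)) = X \setminus \mathrm{cl}(Y)$. Hence
$$X = \mathrm{cl}(Y) \cup (X \setminus \mathrm{cl}(Y)) \subseteq \mathrm{cl}(Y) \cup \mathrm{cl}(X \setminus \mathrm{cl}(Y)) = \mathrm{cl}(Y \cup (X \setminus \mathrm{cl}(Y))),$$
and hence $X = Y + -Y$.
(K$'$): Clearly $\emptyset = Y \cap \mathrm{int}(X \setminus Y) = Y \cdot -Y$.
Thus we have now proved that $\langle \mathrm{RO}(X), +, \cdot, -, \emptyset, X \rangle$ is a BA. Since $\cdot$ is the same as $\cap$,
$\le$ is the same as $\subseteq$. Hence by Proposition 16.18(ix), $\langle \mathrm{RO}(X), +, \cdot, -, \emptyset, X \rangle$ is a complete
BA.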
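As an illustration of Theorem 16.19, the regular open algebra of a small finite space can be computed directly from the definitions. The sketch below assumes a hypothetical topology on $\{0, 1, 2\}$ with open sets $\emptyset, \{0\}, \{2\}, \{0,2\}, X$; the space and the function names are illustrative only. Here $\{0,2\}$ is open but not regular open, and the regular open algebra turns out to have four elements.

```python
# Hypothetical finite space: X = {0, 1, 2} with the open sets listed below.
X = frozenset({0, 1, 2})
OPEN = [frozenset(s) for s in ([], [0], [2], [0, 2], [0, 1, 2])]

def interior(Y):
    # union of all open sets contained in Y
    result = frozenset()
    for U in OPEN:
        if U <= Y:
            result |= U
    return result

def closure(Y):
    # Proposition 16.17(i): cl(Y) = X \ int(X \ Y)
    return X - interior(X - Y)

def regular_open_sets():
    # regular open sets are open, so it suffices to filter OPEN
    return [U for U in OPEN if interior(closure(U)) == U]

def plus(Y, Z):
    return interior(closure(Y | Z))

def times(Y, Z):
    return Y & Z

def neg(Y):
    return interior(X - Y)
```

In this space $\{0\} + \{2\} = X$ even though $\{0\} \cup \{2\} \ne X$, which shows why $+$ is not simply union.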
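Now we return to our task of embedding a quasi-order into a complete Boolean algebra.
Let $P$ be a given quasi-order. For each $p \in P$ let $P{\downarrow}p = \{q : q \le p\}$. Now we define
$$O_P = \{X \subseteq P : (P{\downarrow}p) \subseteq X \text{ for every } p \in X\}.$$
We check that this gives a topology on $P$. Clearly $P, \emptyset \in O_P$. To show that $O_P$ is closed
under arbitrary unions, suppose that $\mathscr{X} \subseteq O_P$. Take any $p \in \bigcup \mathscr{X}$. Choose $X \in \mathscr{X}$
such that $p \in X$. Then $(P{\downarrow}p) \subseteq X \subseteq \bigcup \mathscr{X}$, as desired. If $X, Y \in O_P$, suppose that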
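$p \in X \cap Y$. Then $p \in X$, so $(P{\downarrow}p) \subseteq X$. Similarly $(P{\downarrow}p) \subseteq Y$, so $(P{\downarrow}p) \subseteq X \cap Y$.
Thus $X \cap Y \in O_P$, finishing the proof that $O_P$ is a topology on $P$.
We denote the complete BA of regular open sets in this topology by $\mathrm{RO}(P)$.
Now for any $p \in P$ we define
$$e(p) = \mathrm{int}(\mathrm{cl}(P{\downarrow}p)).$$
Thus $e$ maps $P$ into $\mathrm{RO}(P)$.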
This is our desired embedding. Actually it is not really an embedding in general, but
it has several useful properties, and for many quasi-orders it really is an embedding.
The useful properties mentioned are as follows. We say that a subset $X$ of $P$ is dense
below $p$ iff for every $r \le p$ there is a $q \le r$ such that $q \in X$.
Theorem 16.20. Let $P$ be a quasi-order. Suppose that $p, q \in P$, $F$ is a finite subset
of $P$, $a, b \in \mathrm{RO}(P)$, and $N$ is a subset of $\mathrm{RO}(P)$.
(i) $e[P]$ is dense in $\mathrm{RO}(P)$, i.e., for any nonzero $Y \in \mathrm{RO}(P)$ there is a $p \in P$ such
that $e(p) \subseteq Y$.
(ii) If $p \le q$ then $e(p) \subseteq e(q)$.
(iii) $p \perp q$ iff $e(p) \cap e(q) = \emptyset$.
(iv) If $e(p) \le e(q)$, then $p$ and $q$ are compatible.
(v) The following conditions are equivalent:
(a) $e(p) \le e(q)$.
(b) $\{r : r \le p, q\}$ is dense below $p$.
(vi) The following conditions are equivalent, for $F$ nonempty:
(a) $e(p) \le \prod_{q \in F} e(q)$.
(b) $\{r : r \le q \text{ for all } q \in F\}$ is dense below $p$.
(vii) The following conditions are equivalent:
(a) $e(p) \le \left(\prod_{q \in F} e(q)\right) \cdot \sum N$.
(b) $\{r : r \le q \text{ for all } q \in F \text{ and } e(r) \le s \text{ for some } s \in N\}$ is dense below $p$.
(viii) $e(p) \le -a$ iff there is no $q \le p$ such that $e(q) \le a$.
(ix) $e(p) \le -a + b$ iff for all $q \le p$, if $e(q) \le a$ then $e(q) \le b$.
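Proof. (i): Assume the hypothesis. By the definition of the topology and since $Y$ is
nonempty and open, there is a $p \in P$ such that $P{\downarrow}p \subseteq Y$. Hence $e(p) = \mathrm{int}(\mathrm{cl}(P{\downarrow}p)) \subseteq
\mathrm{int}(\mathrm{cl}(Y)) = Y$.
(ii): If $p \le q$, then $P{\downarrow}p \subseteq P{\downarrow}q$, and so $e(p) = \mathrm{int}(\mathrm{cl}(P{\downarrow}p)) \subseteq \mathrm{int}(\mathrm{cl}(P{\downarrow}q)) = e(q)$.
(iii): Assume that $p \perp q$. Then $(P{\downarrow}p) \cap (P{\downarrow}q) = \emptyset$, and hence by Proposition
16.18(viii), $e(p) \cap e(q) = \emptyset$.
Conversely, suppose that $e(p) \cap e(q) = \emptyset$. Then $(P{\downarrow}p) \cap (P{\downarrow}q) \subseteq e(p) \cap e(q) = \emptyset$,
and so $p \perp q$.
(iv): If $e(p) \le e(q)$, then $e(p) \cdot e(q) = e(p) \ne \emptyset$, so $p$ and $q$ are compatible by (iii).
(v): For (a)$\Rightarrow$(b), suppose that $e(p) \le e(q)$ and $s \le p$. Then $e(s) \le e(p) \le e(q)$, so $s$
and $q$ are compatible by (iv); say $r \le s, q$. Then $r \le s \le p$, hence $r \le p, q$, as desired.
For (b)$\Rightarrow$(a), suppose that $e(p) \not\le e(q)$. Thus $e(p) \cdot -e(q) \ne 0$. Hence there is an $s$
such that $e(s) \le e(p) \cdot -e(q)$. Hence $e(s) \cdot e(q) = \emptyset$, so $s \perp q$ by (iii). Now $e(s) \le e(p)$, so $s$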
and $p$ are compatible by (iv); say $t \le s, p$. For any $r \le t$ we have $r \le s$, and hence $r \perp q$.
So (b) fails.
(vi): We proceed by induction on $|F|$. The case $|F| = 1$ is given by (v). Now assume
the result for $F$, and suppose that $t \in P \setminus F$. First suppose that $e(p) \le \prod_{q \in F} e(q) \cdot e(t)$.
Suppose that $s \le p$. Now $e(p) \le \prod_{q \in F} e(q)$, so by the inductive hypothesis there is a
$u \le s$ such that $u \le q$ for all $q \in F$. Thus $e(u) \le e(s) \le e(p) \le e(t)$, so by (iv), $u$ and $t$
are compatible. Take any $v \le u, t$. Then $v \le q$ for any $q \in F \cup \{t\}$, as desired.
Second, suppose that (b) holds for $F \cup \{t\}$. In particular, $\{r : r \le q \text{ for all } q \in F\}$
is dense below $p$, and so $e(p) \le \prod_{q \in F} e(q)$ by the inductive hypothesis. But also clearly
$\{r : r \le t\}$ is dense below $p$, so $e(p) \le e(t)$ too, as desired.
(vii): First assume that $e(p) \le \left(\prod_{q \in F} e(q)\right) \cdot \sum N$, and suppose that $u \le p$. By (vi),
there is a $v \le u$ such that $v \le q$ for each $q \in F$. Now $e(v) \le e(u) \le e(p) \le \sum N$, so
$0 \ne e(v) = e(v) \cdot \sum N = \sum_{s \in N} (e(v) \cdot s)$. Hence there is an $s \in N$ such that $e(v) \cdot s \ne 0$.
By (i) choose $w \in P$ such that $e(w) \le e(v) \cdot s$. By (iv), $w$ and $v$ are compatible; say $r \le w, v$.
Then $e(r) \le e(w) \le s$ and $r \le v \le u$, so $r$ is in the set described in (b).
Second, suppose that (b) holds. Clearly then $\{r : r \le q \text{ for all } q \in F\}$ is dense below $p$,
and so $e(p) \le \prod_{q \in F} e(q)$ by (vi). Now suppose that $e(p) \not\le \sum N$. Then $e(p) \cdot -\sum N \ne 0$,
so there is a $q$ such that $e(q) \le e(p) \cdot -\sum N$. By (iv), $q$ and $p$ are compatible; say $s \le p, q$.
Then by (b) choose $r \le s$ and $t \in N$ such that $e(r) \le t$. Thus $e(r) \le e(s) \cdot t \le e(q) \cdot t \le
(-\sum N) \cdot \sum N = 0$, contradiction.
(viii)$\Rightarrow$: Assume that $e(p) \le -a$. Suppose that $q \le p$ and $e(q) \le a$. Then $e(q) \le
-a \cdot a = 0$, contradiction.
(viii)$\Leftarrow$: Assume that $e(p) \not\le -a$. Then $e(p) \cdot a \ne 0$, so there is a $q$ such that
$e(q) \le e(p) \cdot a$. By (vii) there is an $r \le p, q$ with $e(r) \le a$, as desired.
(ix)$\Rightarrow$: Assume that $e(p) \le -a + b$, $q \le p$, and $e(q) \le a$. Then $e(q) \le a \cdot (-a + b) \le b$,
as desired.
(ix)$\Leftarrow$: Assume the indicated condition, but suppose that $e(p) \not\le -a + b$. Then
$e(p) \cdot a \cdot -b \ne 0$, so there is a $q$ such that $e(q) \le e(p) \cdot a \cdot -b$. By (vii) with $F = \{p\}$ and
$N = \{a \cdot -b\}$ we get $r$ such that $r \le p$ and $e(r) \le a \cdot -b$. So $r \le p$ and $e(r) \le a$, so by our
condition, $e(r) \le b$. But also $e(r) \le -b$, contradiction.
We now expand on the remarks above concerning when $e$ really is an embedding. Note
that if $P$ is a simple ordering, then the closure of $P{\downarrow}p$ is $P$ itself, and hence $P$ has only
two regular open subsets, namely the empty set and $P$ itself. If the ordering on $P$ is trivial,
meaning that no two elements are comparable, then every subset of $P$ is regular open.
An important condition satisfied by many quasi-orders is defined as follows. We say
that $P$ is separative iff it is a partial order (thus is an antisymmetric quasi-order), and for
any $p, q \in P$, if $p \not\le q$ then there is an $r \le p$ such that $r \perp q$.
Proposition 16.21. Let $P$ be a quasi-order.
(i) $\mathrm{cl}(P{\downarrow}p) = \{q : p \text{ and } q \text{ are compatible}\}$.
(ii) $e(p) = \{q : \text{for all } r \le q,\ r \text{ and } p \text{ are compatible}\}$.
(iii) The following conditions are equivalent:
(a) $P$ is separative.
(b) $e$ is one-one, and for all $p, q \in P$, $p \le q$ iff $e(p) \le e(q)$.
Proof. (i) and (ii) are clear. For (iii), (a)$\Rightarrow$(b), assume that $P$ is separative. Take
any $p, q \in P$. If $p \le q$, then $e(p) \le e(q)$ by 16.20(ii). Suppose that $p \not\le q$. Choose $r \le p$
such that $r \perp q$. Then $r \in e(p)$, while $r \notin e(q)$ by (ii). Thus $e(p) \not\le e(q)$.
Now suppose that $e(p) = e(q)$. Then $p \le q \le p$ by what was just shown, so $p = q$
since $P$ is a partial order.
For (iii), (b)$\Rightarrow$(a), suppose that $p \le q \le p$. Then $e(p) \le e(q) \le e(p)$, so $e(p) = e(q)$,
and hence $p = q$. So $P$ is a partial order. Suppose that $p \not\le q$. Then $e(p) \not\le e(q)$. Choose
$s \in e(p) \setminus e(q)$. Since $s \notin e(q)$, by (ii) we can choose $t \le s$ such that $t \perp q$. Since $s \in e(p)$,
it follows that $t$ and $p$ are compatible; choose $r \le t, p$. Clearly $r \perp q$.
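For finite quasi-orders, the map $e$ and the separativity condition can be computed directly. The sketch below is an illustration under stated assumptions: the helper name quasi_order_tools and the two sample orders (a three-element chain, and the nonempty subsets of $\{0,1,2\}$ under inclusion) are hypothetical choices, not taken from the text. It computes $e(p)$ from the topology $O_P$, using the fact that $\mathrm{int}(Y) = \{p : P{\downarrow}p \subseteq Y\}$ there, and compares it with the formula of Proposition 16.21(ii).

```python
from itertools import combinations

def quasi_order_tools(P, leq):
    """Return helper functions for a finite quasi-order (P, leq)."""
    whole = frozenset(P)

    def down(p):
        return frozenset(q for q in P if leq(q, p))

    def interior(Y):
        # in the topology O_P, p is interior to Y iff its down-set lies in Y
        return frozenset(p for p in P if down(p) <= Y)

    def closure(Y):
        return whole - interior(whole - Y)

    def e(p):
        return interior(closure(down(p)))

    def compatible(p, q):
        return any(leq(r, p) and leq(r, q) for r in P)

    def e_formula(p):
        # Proposition 16.21(ii)
        return frozenset(q for q in P
                         if all(compatible(r, p) for r in P if leq(r, q)))

    def separative():
        antisymmetric = all(p == q or not (leq(p, q) and leq(q, p))
                            for p in P for q in P)
        separates = all(leq(p, q)
                        or any(leq(r, p) and not compatible(r, q) for r in P)
                        for p in P for q in P)
        return antisymmetric and separates

    def e_is_order_embedding():
        one_one = len({e(p) for p in P}) == len(P)
        return one_one and all(leq(p, q) == (e(p) <= e(q))
                               for p in P for q in P)

    return {"e": e, "e_formula": e_formula, "separative": separative,
            "embedding": e_is_order_embedding}

# Sample 1: the chain 0 < 1 < 2 (a simple ordering, largest element 2).
CHAIN = [0, 1, 2]
chain_tools = quasi_order_tools(CHAIN, lambda p, q: p <= q)

# Sample 2: nonempty subsets of {0, 1, 2} under inclusion.
SUBSETS = [frozenset(c) for n in (1, 2, 3) for c in combinations((0, 1, 2), n)]
subset_tools = quasi_order_tools(SUBSETS, lambda p, q: p <= q)
```

On the chain every $e(p)$ is the whole set, matching the remark about simple orderings, while on the second sample $e$ is an order embedding, in line with Proposition 16.21(iii).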
Now we prove a theorem which says that the regular open algebra of a quasi-order is unique
up to isomorphism.
Theorem 16.22. Let $P$ be a quasi-order, $A$ a complete BA, and $j$ a function mapping
$P$ into $A$ with the following properties:
(i) $j[P]$ is dense in $A$, i.e., for any nonzero $a \in A$ there is a $p \in P$ such that $j(p) \le a$.
(ii) For all $p, q \in P$, if $p \le q$ then $j(p) \le j(q)$.
(iii) For any $p, q \in P$, $p \perp q$ iff $j(p) \cdot j(q) = 0$.
Then there is a unique isomorphism $f$ from $\mathrm{RO}(P)$ onto $A$ such that $f \circ e = j$. That is,
$f$ is a bijection from $\mathrm{RO}(P)$ onto $A$, and for any $x, y \in \mathrm{RO}(P)$, $x \subseteq y$ iff $f(x) \le f(y)$.
Note that since the Boolean operations are easily expressible in terms of $\le$ (as least upper
bounds, etc.), the condition here implies that $f$ preserves all of the Boolean operations
too; this includes the infinite sums and products.
Proof. Before beginning the proof, we introduce some notation in order to make the
situation more symmetric. Let $B_0 = \mathrm{RO}(P)$, $B_1 = A$, $k_0 = e$, and $k_1 = j$. Then for each
$m < 2$ the following conditions hold:
(1) $k_m[P]$ is dense in $B_m$.
(2) For all $p, q \in P$, if $p \le q$ then $k_m(p) \le k_m(q)$.
(3) For all $p, q \in P$, $p \perp q$ iff $k_m(p) \cdot k_m(q) = 0$.
(4) For all $p, q \in P$, if $k_m(p) \le k_m(q)$, then $p$ and $q$ are compatible.
In fact, (1)-(3) follow from 16.20 and the assumptions of the theorem. Condition (4) for
$m = 0$, so that $k_m = e$, follows from 16.20(iv). For $m = 1$, so that $k_m = j$, it follows easily
from (iii).
Now we begin the proof. For each $m < 2$ we define, for any $x \in B_m$,
$$g_m(x) = \sum \{k_{1-m}(p) : p \in P,\ k_m(p) \le x\}.$$
The proof of the theorem now consists in checking the following, for each $m \in 2$:
(5) If $x, y \in B_m$ and $x \le y$, then $g_m(x) \le g_m(y)$.
(6) $g_{1-m} \circ g_m$ is the identity on $B_m$.
In fact, suppose that (5) and (6) have been proved. If $x, y \in \mathrm{RO}(P)$, then
$x \le y$ implies that $g_0(x) \le g_0(y)$ by (5);
$g_0(x) \le g_0(y)$ implies that $x = g_1(g_0(x)) \le g_1(g_0(y)) = y$ by (5) and (6).
Also, (6) holding for both $m = 0$ and $m = 1$ implies that $g_0$ is a bijection from $\mathrm{RO}(P)$
onto $A$. So $g_0$ is the desired function $f$ of the theorem.
Now (5) is obvious from the definition. To prove (6), assume that $m \in 2$. We first
prove
(7) For any $p \in P$ and any $b \in B_m$, $k_m(p) \le b$ iff $k_{1-m}(p) \le g_m(b)$.
To prove (7), first suppose that $k_m(p) \le b$. Then obviously $k_{1-m}(p) \le g_m(b)$. Second,
suppose that $k_{1-m}(p) \le g_m(b)$ but $k_m(p) \not\le b$. Thus $k_m(p) \cdot -b \ne 0$, so by the denseness
of $k_m[P]$ in $B_m$, choose $q \in P$ such that $k_m(q) \le k_m(p) \cdot -b$. Then $p$ and $q$ are compatible
by (4), so let $r \in P$ be such that $r \le p, q$. Hence
$$k_{1-m}(r) \le k_{1-m}(p) \le g_m(b) = \sum \{k_{1-m}(s) : s \in P,\ k_m(s) \le b\}.$$
Hence $k_{1-m}(r) = \sum \{k_{1-m}(s) \cdot k_{1-m}(r) : s \in P,\ k_m(s) \le b\}$, so there is an $s \in P$ such that
$k_m(s) \le b$ and $k_{1-m}(s) \cdot k_{1-m}(r) \ne 0$. Hence $s$ and $r$ are compatible by (3); say $t \le s, r$. Hence
$k_m(t) \le k_m(r) \le k_m(q) \le -b$, but also $k_m(t) \le k_m(s) \le b$, contradiction. This proves (7).
Now take any $b \in B_m$. Then
$$\begin{aligned}
g_{1-m}(g_m(b)) &= \sum \{k_m(p) : p \in P,\ k_{1-m}(p) \le g_m(b)\} \\
&= \sum \{k_m(p) : p \in P,\ k_m(p) \le b\} \\
&= b.
\end{aligned}$$
Thus (6) holds.
This proves the existence of $f$. Now suppose that $g$ is also an isomorphism from
$\mathrm{RO}(P)$ onto $A$ such that $g \circ e = j$, but suppose that $f \ne g$. Then there is an $X \in \mathrm{RO}(P)$
such that $f(X) \ne g(X)$. By symmetry, say that $f(X) \cdot -g(X) \ne 0$. By (i), choose $p \in P$
such that $j(p) \le f(X) \cdot -g(X)$. So $f(e(p)) = j(p) \le f(X)$, so $e(p) \le X$, and hence
$j(p) = g(e(p)) \le g(X)$. This contradicts $j(p) \le -g(X)$, since $j(p) \ne 0$.
Corollary 16.23. Suppose that $A$ is a complete BA with at least two elements, and
consider the partial order $\mathbb{P} = (A \setminus \{0\}, \le, 1)$. Then $A$ is isomorphic to $\mathrm{RO}(\mathbb{P})$.
Proof. Let $P = A \setminus \{0\}$, and let $j$ be the identity on $P$. Then of course $j[P]$ is dense
in $A$. If $p, q \in P$ and $p \le q$, then $j(p) \le j(q)$. Condition (iii) of 16.22 is also clear. Hence
there is an isomorphism $f$ of $\mathrm{RO}(\mathbb{P})$ onto $A$ such that $f(e(p)) = p$ for every $p \in P$.
EXERCISES
E16.1. Let $(A, +, \cdot, -, 0, 1)$ be a Boolean algebra. Show that $(A, \triangle, \cdot, 0, 1)$ is a ring with
identity in which every element is idempotent. This means that $x \cdot x = x$ for all $x$.
E16.2. Let $(A, +, \cdot, 0, 1)$ be a ring with identity in which every element is idempotent.
Show that $A$ is a commutative ring, and $(A, \oplus, \cdot, -, 0, 1)$ is a Boolean algebra, where for
any $x, y \in A$, $x \oplus y = x + y + xy$ and for any $x \in A$, $-x = 1 + x$. Hint: expand $(x + y)^2$.
E16.3. Show that the processes described in exercises E16.1 and E16.2 are inverses of one
another.
E16.4. A filter in a BA $A$ is a subset $F$ of $A$ with the following properties:
(1) $1 \in F$.
(2) If $a \in F$ and $a \le b$, then $b \in F$.
(3) If $a, b \in F$, then $a \cdot b \in F$.
An ultrafilter in $A$ is a filter $F$ such that $0 \notin F$, and for any $a \in A$, $a \in F$ or $-a \in F$.
Prove that a filter $F$ is an ultrafilter iff $F$ is maximal among the set of all filters $G$
such that $0 \notin G$.
E16.5. (Continuing exercise E16.4) Prove that for any nonzero $a \in A$ there is an ultrafilter
$F$ such that $a \in F$.
E16.6. (Continuing exercise E16.4) Prove that any BA is isomorphic to a field of sets.
(Stone's representation theorem) Hint: given a BA $A$, let $X$ be the set of all ultrafilters
on $A$ and define $f(a) = \{F \in X : a \in F\}$.
E16.7. (Continuing exercise E16.4) Suppose that $F$ is an ultrafilter on a BA $A$. Let $2$ be
the two-element BA. (This is, up to isomorphism, the BA of all subsets of 1.) For any
$a \in A$ let
$$f(a) = \begin{cases} 1 & \text{if } a \in F, \\ 0 & \text{if } a \notin F. \end{cases}$$
Show that $f$ is a homomorphism of $A$ into $2$. This means that for any $a, b \in A$, the
following conditions hold:
$f(a + b) = f(a) + f(b)$;
$f(a \cdot b) = f(a) \cdot f(b)$;
$f(-a) = -f(a)$;
$f(0) = 0$;
$f(1) = 1$.
E16.8. (Lindenbaum-Tarski algebras; a knowledge of logic is assumed.) Suppose that $\mathscr{L}$
is a first-order language and $T$ is a set of sentences of $\mathscr{L}$. Define $\varphi \equiv_T \psi$ iff $\varphi$ and $\psi$ are
sentences of $\mathscr{L}$ and $T \models \varphi \leftrightarrow \psi$. Show that this is an equivalence relation on the set $S$ of
all sentences of $\mathscr{L}$. Let $A$ be the collection of all equivalence classes under this equivalence
relation. Show that there are operations $+, \cdot, -$ on $A$ such that for any sentences $\varphi, \psi$,
$[\varphi] + [\psi] = [\varphi \vee \psi]$;
$[\varphi] \cdot [\psi] = [\varphi \wedge \psi]$;
$-[\varphi] = [\neg\varphi]$.
Finally, show that $(A, +, \cdot, -, [\exists v_0(\neg(v_0 = v_0))], [\exists v_0(v_0 = v_0)])$ is a Boolean algebra.
E16.9. (A knowledge of logic is assumed.) Show that every Boolean algebra is isomorphic
to one obtained as in exercise E16.8. Hint: Let $A$ be a Boolean algebra. Let $\mathscr{L}$ be the
first-order language which has a unary relation symbol $R_a$ for each $a \in A$. Let $T$ be the
following set of sentences of $\mathscr{L}$:
$\forall x \forall y(x = y)$;
$\forall x[R_{-a}(x) \leftrightarrow \neg R_a(x)]$ for each $a \in A$;
$\forall x[R_{a \cdot b}(x) \leftrightarrow R_a(x) \wedge R_b(x)]$ for all $a, b \in A$;
$\forall x R_1(x)$.
E16.10. Let $A$ be the collection of all subsets $X$ of $Y \overset{\mathrm{def}}{=} \{r \in \mathbb{Q} : 0 \le r\}$ such that there
exist an $m \in \omega$ and $a, b \in {}^m(Y \cup \{\infty\})$ such that $a_0 < b_0 < a_1 < b_1 < \cdots < a_{m-1} <
b_{m-1}$ and
$$X = [a_0, b_0) \cup [a_1, b_1) \cup \ldots \cup [a_{m-1}, b_{m-1}).$$
Note that $\emptyset \in A$ by taking $m = 0$, and $Y \in A$ since $Y = [0, \infty)$.
(i) Show that if $X$ is as above, $c, d \in Y \cup \{\infty\}$ with $c < d$, $c \le a_0$, then $X \cup [c, d) \in A$,
and $c$ is the first element of $X \cup [c, d)$.
(ii) Show that if $X$ is as above and $c, d \in Y \cup \{\infty\}$ with $c < d$, then $X \cup [c, d) \in A$.
(iii) Show that $(A, \cup, \cap, \setminus, \emptyset, Y)$ is a Boolean algebra.
E16.11. (Continuing exercise E16.10.) For each $n \in \omega$ let $x_n = [n, n + 1)$, an interval in
$\mathbb{Q}$. Show that $\sum_{n \in \omega} x_{2n}$ does not exist in $A$.
E16.12. Let $A$ be the Boolean algebra of all subsets of some nonempty set $X$, under the
natural set-theoretic operations. Show that if $\langle a_i : i \in I \rangle$ is a system of elements of $A$,
then
$$\prod_{i \in I} (a_i + -a_i) = 1 = \sum_{\varepsilon \in {}^I 2} \prod_{i \in I} a_i^{\varepsilon(i)},$$
where for any $y$, $y^1 = y$ and $y^0 = -y$.
E16.13. Let $M$ be the set of all finite functions $f \subseteq \omega \times 2$. For each $f \in M$ let
$$U_f = \{g \in {}^\omega 2 : f \subseteq g\}.$$
Let $A$ consist of all finite unions of sets $U_f$.
(i) Show that $A$ is a Boolean algebra under the set-theoretic operations.
(ii) For each $i \in \omega$, let $x_i = U_{\{(i,1)\}}$. Show that
$${}^\omega 2 = \prod_{i \in \omega} (x_i + -x_i) \quad \text{while} \quad \sum_{\varepsilon \in {}^\omega 2} \prod_{i \in \omega} x_i^{\varepsilon(i)} = \emptyset,$$
where for any $y$, $y^1 = y$ and $y^0 = -y$.
This is an example of an infinite distributive law that holds in some BAs (by exercise
E16.12), but does not hold in all BAs.
E16.14. Suppose that $(P, \le, 1)$ is a quasi-order. Define
$$p \equiv q \quad \text{iff} \quad p, q \in P,\ p \le q, \text{ and } q \le p.$$
Show that $\equiv$ is an equivalence relation, and if $Q$ is the collection of all $\equiv$-classes, then
there is a relation $\preceq$ on $Q$ such that for all $p, q \in P$, $[p] \preceq [q]$ iff $p \le q$. Finally, show
that $(Q, \preceq, [1])$ is a partial order.
E16.15. We say that $(P, <, 1)$ is a partial order in the second sense iff $<$ is transitive and
irreflexive, and $x < 1$ for all $x \ne 1$. (Irreflexive means that for all $p \in P$, $p \not< p$.) Show that
if $(P, <, 1)$ is a partial order in the second sense and if we define $\preceq$ by $p \preceq q$ iff ($p, q \in P$
and $p < q$ or $p = q$), then $\mathscr{A}(P, <, 1) \overset{\mathrm{def}}{=} (P, \preceq)$ is a partial order, and $p \preceq 1$ for all $p \in P$.
Furthermore, show that if $(P, \le)$ is a partial order in which $p \le 1$ for all $p$, and we define
$p \prec q$ by $p \prec q$ iff ($p, q \in P$, $p \le q$, and $p \ne q$), then $\mathscr{B}(P, \le) \overset{\mathrm{def}}{=} (P, \prec, 1)$ is a partial order
in the second sense.
Also prove that $\mathscr{A}$ and $\mathscr{B}$ are inverses of one another.
E16.16. Show that if $(P, \le, 1)$ is a quasi-order and we define $\prec$ by $p \prec q$ iff ($p, q \in P$, $p \le q$
and $q \not\le p$), then $(P, \prec)$ is a partial order in the second sense, provided that there is no
element $p \ne 1$ such that $1 \le p$. Give an example where this partial order is not isomorphic
to the one derived from $(P, \le, 1)$ by the procedure of exercise E16.14.
E16.17. Prove that if $P$ is a quasi-order such that $e$ is one-one, then $P$ is a partial order.
Give an example of a partial order such that $e$ is not one-one. Give an example of an infinite
quasi-order $Q$ such that $e$ is not one-one, while for any $p, q \in Q$, $p \le q$ iff $e(p) \le e(q)$.
E16.18. (Continuing E16.14.) Let $\mathbb{P} = (P, \le, 1)$, and let $\mathbb{Q} = (Q, \preceq, [1])$. Show that there
is an isomorphism $f$ of $\mathrm{RO}(\mathbb{P})$ onto $\mathrm{RO}(\mathbb{Q})$ such that $f \circ e_{\mathbb{P}} = e_{\mathbb{Q}} \circ \pi$, where $\pi : P \to Q$ is
defined by $\pi(p) = [p]$ for all $p \in P$.
References
Koppelberg, S. General theory of Boolean algebras. Volume 1 of Handbook of
Boolean Algebras, edited by J. D. Monk and R. Bonnet, North-Holland 1989, 312pp.
Kunen, K. Set Theory.

17. Generic extensions and forcing


In this chapter we give the basic definitions and facts about generic extensions and forcing.
Uses of these things will occupy much of the remainder of these notes. We use "c.t.m." for
"countable transitive model"; see Theorem 14.33.
Let $\mathbb{P} = (P, \le, 1)$ be a quasi-order. A filter on $\mathbb{P}$ is a subset $G$ of $P$ such that the
following conditions hold:
(1) For all $p, q \in G$ there is an $r \in G$ such that $r \le p$ and $r \le q$.
(2) For all $p \in G$ and $q \in P$, if $p \le q$ then $q \in G$.
Now let $M$ be a c.t.m. of ZFC and let $\mathbb{P} = (P, \le, 1) \in M$ be a quasi-order. We say that $G$
is $\mathbb{P}$-generic over $M$ provided that the following conditions hold:
(3) $G$ is a filter on $\mathbb{P}$.
(4) For every dense $D \subseteq P$ such that $D \in M$ we have $G \cap D \ne \emptyset$.
The definition of generic filter just given embodies a choice between two intuitive options.
The option chosen corresponds to thinking of stronger conditions (those containing more
information) as smaller in the quasi-order. This may seem counter-intuitive, but it fits
nicely with the embedding of quasi-orders into Boolean algebras, as we will see. Many
authors take the opposite approach, considering stronger conditions as the greater ones.
Of course this requires a corresponding change in the definition of generic filter (and
denseness).
The following is the basic existence lemma for generic filters.
Lemma 17.1. If $M$ is a c.t.m. of ZFC, $\mathbb{P} = (P, \le, 1) \in M$ is a quasi-order, and
$p \in P$, then there is a $G$ which is $\mathbb{P}$-generic over $M$ and $p \in G$.
Proof. Let $\langle D_n : n \in \omega \rangle$ enumerate all of the dense subsets of $P$ which are in $M$;
this is possible since $M$ is countable. We
now define a sequence $\langle q_n : n \in \omega \rangle$ by recursion. Let $q_0 = p$. If $q_n \in P$ has been defined,
choose $q_{n+1} \in D_n$ with $q_{n+1} \le q_n$. Thus $p = q_0 \ge q_1 \ge \cdots$. Now we define
$$G = \{r \in P : q_n \le r \text{ for some } n \in \omega\}.$$
We check that $G$ is as desired. For (1), suppose that $r, s \in G$. Say $m, n \in \omega$ with $q_m \le r$
and $q_n \le s$. By symmetry, say $m \le n$. Then $q_n \le r, s$, and $q_n \in G$, as desired.
Condition (2) is clear. Hence (3) holds.
For (4), let $n \in \omega$. Then $q_{n+1} \in G \cap D_n$, as desired.
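The recursion in this proof can be imitated for finitely many dense sets. The following is only a finite approximation, under stated assumptions: conditions are finite binary strings with $p \le q$ iff $p$ extends $q$, and the three sets in DENSE are hypothetical dense sets, each given by a membership test together with a function producing an element of the set below a given condition. In the actual lemma there are countably many dense sets and the recursion runs through all of $\omega$.

```python
def leq(p, q):
    # p <= q iff p extends q (stronger conditions are smaller)
    return p.startswith(q)

# Hypothetical dense sets D_0, D_1, D_2, each as a pair
# (membership test, witness function returning some element of D_n below q).
DENSE = [
    (lambda s: len(s) >= 3, lambda q: q + "0" * max(0, 3 - len(q))),
    (lambda s: "1" in s, lambda q: q if "1" in q else q + "1"),
    (lambda s: s.endswith("00"), lambda q: q + "00"),
]

def descending_sequence(p):
    """Build q_0 = p >= q_1 >= ... with q_{n+1} in D_n, as in the proof."""
    qs = [p]
    for member, witness in DENSE:
        nxt = witness(qs[-1])
        # sanity check that the witness really lands in D_n below q_n
        assert member(nxt) and leq(nxt, qs[-1])
        qs.append(nxt)
    return qs

def in_G(r, qs):
    # r is in G iff q_n <= r for some n
    return any(leq(q, r) for q in qs)
```

Starting from the condition "0", the sequence meets each of the three sets, and the resulting upward closure contains "0001" but not "01".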
It is important to realize that usually generic filters are not in the ground model M ; this
is expressed in the following lemma.
Lemma 17.2. Suppose that $M$ is a c.t.m. of ZFC and $\mathbb{P} = (P, \le, 1) \in M$ is a
quasi-order. Assume the following:
(1) For every $p \in P$ there are $q, r \in P$ such that $q \le p$, $r \le p$, and $q \perp r$.
Also suppose that $G$ is $\mathbb{P}$-generic over $M$.
Then $G \notin M$.
Proof. Suppose to the contrary that $G \in M$. Then also $P \setminus G \in M$, since $M$ is a
model of ZFC and by absoluteness. We claim that $P \setminus G$ is dense. In fact, given $p \in P$,
choose $q, r$ as in (1). Then $q, r$ cannot both be in $G$, by the definition of filter. So one
at least is in $P \setminus G$, as desired. Since $P \setminus G$ is dense and in $M$, we contradict $G$ being
generic.
Most quasi-orders used in forcing arguments satisfy the condition of Lemma 17.2; for more
details on this lemma, see the exercises.
The following elementary proposition gives six equivalent ways to define generic filters.
Proposition 17.3. Suppose that $M$ is a c.t.m. of ZFC and $\mathbb{P}$ is a quasi-order in $M$.
Suppose that $G \subseteq P$ satisfies condition (2), i.e., if $p \in G$ and $p \le q$, then $q \in G$. Then the
following conditions are equivalent:
(i) $G \cap D \ne \emptyset$ whenever $D \in M$ and $D$ is dense in $\mathbb{P}$.
(ii) $G \cap A \ne \emptyset$ whenever $A \in M$ and $A$ is a maximal antichain of $\mathbb{P}$.
(iii) $G \cap E \ne \emptyset$ whenever $E \in M$ and for every $p \in P$ there is a $q \in E$ such that $p$
and $q$ are compatible.
Moreover, suppose that $G$ satisfies (2) and one, hence all, of the conditions (i)-(iii). Then
$G$ is $\mathbb{P}$-generic over $M$ iff the following condition holds:
(iv) For all $p, q \in G$, $p$ and $q$ are compatible.
Proof. (i)$\Rightarrow$(ii): Assume (i), and suppose that $A \in M$ is a maximal antichain of $\mathbb{P}$.
Let $D = \{p \in P : p \le q \text{ for some } q \in A\}$. We claim that $D$ is dense. Suppose that $r$ is
arbitrary. Choose $q \in A$ such that $r$ and $q$ are compatible. Say $p \le r, q$. Thus $p \in D$.
So, indeed, $D$ is dense. Clearly $D \in M$, since $A \in M$. By (i), choose $p \in D \cap G$. Say
$p \le q \in A$. Then $q \in G \cap A$, as desired.
(ii)$\Rightarrow$(iii): Assume (ii), and suppose that $E$ is as in (iii). By Zorn's lemma, let $A$ be
a maximal member of
(1) $\{B \subseteq P : B \text{ is an antichain, and for every } p \in B \text{ there is a } q \in E \text{ such that } p \le q\}$.
We claim that $A$ is a maximal antichain. For, suppose that $p \perp q$ for all $q \in A$. Choose
$s \in E$ such that $p$ and $s$ are compatible. Say $r \le p, s$. Hence $r \perp q$ for all $q \in A$, so $r \notin A$.
Thus $A \cup \{r\}$ is a member of (1), contradiction.
Clearly $A \in M$, since $E \in M$. So, since $A$ is a maximal antichain, choose $p \in A \cap G$.
Then choose $q \in E$ such that $p \le q$. So $q \in E \cap G$, as desired.
(iii)$\Rightarrow$(i): Obvious.
Now we assume (2) in the definition, and (i)-(iii).
If $G$ is $\mathbb{P}$-generic over $M$, clearly (iv) holds.
Now assume that (i)-(iv) hold, and suppose that $p, q \in G$; we want to find $r \in G$ such
that $r \le p, q$. Let
$$D = \{r : r \perp p \text{ or } r \perp q \text{ or } r \le p, q\}.$$
We claim that $D$ is dense in $\mathbb{P}$. For, let $s \in P$ be arbitrary. If $s \perp p$, then $s \le s$ and $s \in D$,
as desired. So suppose that $s$ and $p$ are compatible; say $t \le s, p$. If $t \perp q$, then $t \le s$ and
$t \in D$, as desired. So suppose that $t$ and $q$ are compatible. Say $r \le t, q$. Then $r \le t \le p$
and $r \le t \le s$, so $r \le s$ and $r \le p, q$, hence $r \in D$, as desired. This proves that $D$ is dense.
Now by (i) choose $r \in D \cap G$. By (iv), $r$ is compatible with $p$ and $r$ is compatible
with $q$. So $r \le p, q$, as desired.
We are going to define the generic extension M [G] by first defining names in M , and then
producing the elements of M [G] by using those names.
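For any set $P$, the following defines the notion of $P$-name by recursion:
($*$) $\tau$ is a $P$-name iff $\tau$ is a relation, and for all $(\sigma, p) \in \tau$, $\sigma$ is a $P$-name and $p \in P$.
To justify this definition, let
$$R = \{(\sigma, \tau) : \exists p \in P[(\sigma, p) \in \tau]\}.$$
Clearly $R$ is well-founded and set-like on $\mathbf{V}$. Now we define $\mathbf{F} : \mathbf{V} \times \mathbf{V} \to \mathbf{V}$ by setting
$$\mathbf{F}(\tau, f) = \begin{cases} 1 & \text{if } f \text{ is a function with domain } \mathrm{pred}(\mathbf{V}, \tau, R) \text{ and } \forall \sigma \in \mathrm{pred}(\mathbf{V}, \tau, R)[f(\sigma) = 1], \\ 0 & \text{otherwise.} \end{cases}$$
Now let $\mathbf{G} : \mathbf{V} \to \mathbf{V}$ be obtained by the recursion theorem. Then for any $\tau$,
$$\begin{aligned}
\mathbf{G}(\tau) &= \mathbf{F}(\tau, \mathbf{G} \upharpoonright \mathrm{pred}(\mathbf{V}, \tau, R)) \\
&= \begin{cases} 1 & \text{if } \forall \sigma \in \mathrm{pred}(\mathbf{V}, \tau, R)[\mathbf{G}(\sigma) = 1], \\ 0 & \text{otherwise} \end{cases} \\
&= \begin{cases} 1 & \text{if } \forall \sigma \forall p \in P[(\sigma, p) \in \tau \to \mathbf{G}(\sigma) = 1], \\ 0 & \text{otherwise.} \end{cases}
\end{aligned}$$
Finally, we say that $\tau$ is a $P$-name iff $\tau$ is a relation and $\mathbf{G}(\tau) = 1$. Then
$$\tau \text{ is a } P\text{-name} \quad \text{iff} \quad \tau \text{ is a relation and } \forall \sigma \forall p[(\sigma, p) \in \tau \to [\sigma \text{ is a } P\text{-name}]].$$
Note that "$\tau$ is a $P$-name" is absolute.
For any set $P$, we denote by $\mathbf{V}^P$ the (proper) class of all $P$-names. If $M$ is a c.t.m.
of ZFC, then we let $M^P = \mathbf{V}^P \cap M$. Note by absoluteness that
$$M^P = \{\tau \in M : (\tau \text{ is a } P\text{-name})^M\}.$$
If $G \subseteq P$, we define $\mathrm{val}(\tau, G)$ by recursion:
$$\mathrm{val}(\tau, G) = \{\mathrm{val}(\sigma, G) : (\sigma, p) \in \tau \text{ for some } p \in G\}.$$
We also write $\tau_G$ in place of $\mathrm{val}(\tau, G)$. Notice that val is absolute for c.t.m. of ZFC.
Finally, if $M$ is a c.t.m. of ZFC and $G \subseteq P \in M$, we define
$$M[G] = \{\tau_G : \tau \in M^P\}.$$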
225

Note that M P M , and hence M P is countable. Hence by the replacement axiom, M [G]
is also countable.
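The recursion defining val can be tried out on hereditarily finite names. The following is only a sketch under stated assumptions: names are encoded as Python frozensets of (name, condition) pairs, conditions are plain strings, and the identifiers `val`, `EMPTY`, `ONE_NAME`, `TAU`, `G1`, `G2` are illustrative and not part of the text.

```python
# Names are modelled as frozensets of (name, condition) pairs.
# Conditions are plain strings; this encoding is an illustrative assumption.

def val(name, G):
    """val(tau, G) = { val(sigma, G) : (sigma, p) in tau for some p in G }."""
    return frozenset(val(sigma, G) for (sigma, p) in name if p in G)

EMPTY = frozenset()                        # the empty name; its value is the empty set
ONE_NAME = frozenset({(EMPTY, "1")})       # evaluates to {emptyset} whenever "1" is in G
TAU = frozenset({(EMPTY, "p"), (ONE_NAME, "q")})

G1 = {"1", "p"}   # a subset of conditions containing p but not q
G2 = {"1", "q"}   # a subset of conditions containing q but not p

# The same name takes different values depending on which conditions lie in G.
print(val(TAU, G1) == frozenset({EMPTY}))
print(val(TAU, G2) == frozenset({frozenset({EMPTY})}))
```

The point of the example is that a name is a recipe: which pairs contribute to the value depends only on which conditions belong to $G$.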
Lemma 17.4. If $M$ is a c.t.m. of ZFC, $\mathbb{P} \in M$ is a quasi-order, and $G$ is a filter on $\mathbb{P}$, then $M[G]$ is transitive.
Proof. Suppose that $x \in y \in M[G]$. Then there is a $\tau \in M^P$ such that $y = \tau_G$. Since $x \in \tau_G$, there is a $\sigma \in M^P$ such that $x = \sigma_G$. So $x \in M[G]$.
The following lemma says that $M[G]$ is the smallest c.t.m. of ZFC which contains $M$ as a subset and $G$ as a member, once we show that it really is a model of ZFC. This lemma will be extremely useful in what follows.
Lemma 17.5. Suppose that $M$ is a c.t.m. of ZFC, $\mathbb{P} \in M$ is a quasi-order, $G$ is a filter on $\mathbb{P}$, $N$ is a c.t.m. of ZFC, $M \subseteq N$, and $G \in N$. Then $M[G] \subseteq N$.
Proof. Take any $x \in M[G]$. Say $x = \mathrm{val}(\sigma, G)$ with $\sigma \in M^P$. Then $\sigma, G \in N$, so by absoluteness, $x = (\mathrm{val}(\sigma, G))^N \in N$.
To show that $M$ is a subset of $M[G]$, we need a function mapping $M$ into the collection of all $P$-names. We now assume that $\mathbb{P} = (P, \le, 1)$ is a quasi-order. Again the definition is by recursion.
\[ \check{x} = \{(\check{y}, 1) : y \in x\}. \]
Note that this depends on $\mathbb{P}$; we could denote it by $\mathrm{check}(\mathbb{P}, x)$ to bring this out, if necessary. Again this function is absolute for transitive models of ZFC.
Lemma 17.6. Suppose that $M$ is a transitive model of ZFC, $\mathbb{P} \in M$ is a quasi-order, and $G$ is a non-empty filter on $P$. Then
(i) For all $x \in M$, $\check{x} \in M^P$ and $\mathrm{val}(\check{x}, G) = x$.
(ii) $M \subseteq M[G]$.
Proof. Absoluteness implies that $\check{x} \in M^P$ for all $x \in M$. We prove $\mathrm{val}(\check{x}, G) = x$ by $\in$-induction. Note that $1 \in G$, since $G$ is non-empty and closed upwards. If the statement is true for all $y \in x$, then
\[ \mathrm{val}(\check{x}, G) = \{\mathrm{val}(\sigma, G) : (\sigma, 1) \in \check{x}\} = \{\mathrm{val}(\check{y}, G) : y \in x\} = \{y : y \in x\} = x. \]
Finally (ii) is immediate from (i).
Next, for any partial order $\mathbb{P}$ we define a $P$-name $\Gamma$. It depends on $\mathbb{P}$ and could be written $\Gamma_{\mathbb{P}}$ to bring this out.
\[ \Gamma = \{(\check{p}, p) : p \in P\}. \]
Lemma 17.7. Suppose that $M$ is a transitive model of ZFC, $\mathbb{P} \in M$ is a quasi-order, and $G$ is a non-empty filter on $P$. Then $\Gamma_G = G$. Hence $G \in M[G]$.
Proof. $\Gamma_G = \{\check{p}_G : p \in G\} = \{p : p \in G\} = G$.
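Lemmas 17.6 and 17.7 can be checked mechanically on hereditarily finite sets. The sketch below assumes a hypothetical two-element quasi-order whose conditions are themselves finite sets (so that the check name of a condition makes sense); the encoding and the names `TOP`, `check`, `GAMMA` are illustrative only.

```python
def val(name, G):
    """Evaluate a name (a frozenset of (name, condition) pairs) at G."""
    return frozenset(val(sigma, G) for (sigma, p) in name if p in G)

ZERO = frozenset()
ONE = frozenset({ZERO})
TWO = frozenset({ZERO, ONE})

# Hypothetical quasi-order with two conditions: TOP plays the role of 1,
# and the other condition (the set ONE) lies below it.
TOP = ZERO
P = frozenset({TOP, ONE})

def check(x):
    """check(x) = { (check(y), 1) : y in x }, with TOP standing for 1."""
    return frozenset((check(y), TOP) for y in x)

# Gamma = { (check(p), p) : p in P }
GAMMA = frozenset((check(p), p) for p in P)

# Both subsets below are non-empty filters on this quasi-order.
for G in (frozenset({TOP}), P):
    print(val(check(TWO), G) == TWO, val(GAMMA, G) == G)
```

Both comparisons rely on the largest element being in $G$, which is exactly where non-emptiness of the filter enters in Lemma 17.6.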
Lemma 17.8. Suppose that $M$ is a transitive model of ZFC, $\mathbb{P} \in M$ is a quasi-order, and $G$ is a non-empty filter on $P$. Then $\mathrm{rank}(\tau_G) \le \mathrm{rank}(\tau)$ for all $\tau \in M^P$.
Proof. We prove this by induction on $\tau$. Suppose that it is true for all $\sigma \in \mathrm{dmn}(\tau)$. If $x \in \tau_G$, then there is a $(\sigma, p) \in \tau$ such that $p \in G$ and $x = \sigma_G$. Hence by the inductive assumption, $\mathrm{rank}(x) \le \mathrm{rank}(\sigma)$. Hence
\[ \mathrm{rank}(\tau_G) = \sup_{x \in \tau_G} (\mathrm{rank}(x) + 1) \le \mathrm{rank}(\tau). \]
Note by absoluteness of the rank function that $\mathrm{rank}(\tau)$ is the same within $M$ or $M[G]$.
Lemma 17.9. Suppose that $M$ is a transitive model of ZFC, $\mathbb{P} \in M$ is a quasi-order, and $G$ is a non-empty filter on $P$. Then $M$ and $M[G]$ have the same ordinals.
Proof. Since $M \subseteq M[G]$, every ordinal of $M$ is an ordinal of $M[G]$. Now suppose that $\alpha$ is any ordinal of $M[G]$. Write $\alpha = \tau_G$, where $\tau \in M^P$. Now $\mathrm{rank}(\tau) = \mathrm{rank}^M(\tau) \in M$. So by Lemma 17.8, $\alpha = \mathrm{rank}(\alpha) = \mathrm{rank}(\tau_G) \le \mathrm{rank}(\tau) \in M$, so $\alpha \in M$.
The following lemma will be used often.
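Lemma 17.10. Suppose that $M$ is a transitive model of ZFC, $\mathbb{P} \in M$ is a quasi-order, $E \subseteq P$, $E \in M$, and $G$ is a $\mathbb{P}$-generic filter over $M$. Then:
(i) Either $G \cap E \ne \emptyset$, or there is a $q \in G$ such that $r \perp q$ for all $r \in E$.
(ii) If $p \in G$ and $E$ is dense below $p$, then $G \cap E \ne \emptyset$.
Proof. Let
\[ D = \{p : p \le r \text{ for some } r \in E\} \cup \{q : q \perp r \text{ for all } r \in E\}. \]
We claim that $D$ is dense. For, suppose that $q \in P$. We may assume that $q \notin D$. So $q$ is not in the second set defining $D$, and so there is an $r \in E$ which is compatible with $q$. Take $p$ with $p \le q, r$. Then $p \in D$ and $p \le q$, as desired.
Since $D$ is dense and $D \in M$, we can choose $s \in G \cap D$. Now to prove (i), suppose that $G \cap E = \emptyset$. Then $s$ is not in the first set defining $D$ (otherwise $s \le r$ for some $r \in E$, and then $r \in G$), so it is in the second set, as desired.
For (ii), suppose that $G \cap E = \emptyset$, and by (i) choose $q \in G$ such that $q \perp r$ for every $r \in E$. By the definition of filter, there is a $t \in G$ with $t \le p, q$. Since $E$ is dense below $p$, there is then a $u \in E$ with $u \le t$. Thus $u \le q$, so it is not the case that $u \perp q$, contradiction.
Proposition 17.11. Suppose that $M$ is a transitive model of ZFC, $\mathbb{P} \in M$ is a quasi-order, and $G$ is a $\mathbb{P}$-generic filter over $M$. Suppose that $p \in P$ and $p$ is compatible with each member of $G$. Then $p \in G$.
Proof. The set $\{q \in P : q \le p \text{ or } q \perp p\}$ is clearly dense in $\mathbb{P}$, and it is in $M$. So $G$ contains some $q$ from this set. Since $p$ is compatible with $q$, we must have $q \le p$, and hence $p \in G$.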
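Lemma 17.10(i) and Proposition 17.11 can be sanity-checked by brute force on a small finite quasi-order. This is a sketch only: the poset (binary strings of length at most 2, ordered by reverse extension) and all function names are assumptions made for illustration. A finite poset has atoms, so it fails the hypothesis of Lemma 17.2 and its generic filters are trivial; here "generic" simply means a non-empty filter meeting every dense subset.

```python
from itertools import combinations

# Hypothetical finite quasi-order: binary strings of length <= 2,
# with p <= q iff p extends q (so the empty string is the largest element).
P = ["", "0", "1", "00", "01", "10", "11"]

def leq(p, q):
    return p.startswith(q)

def compatible(p, q):
    return any(leq(r, p) and leq(r, q) for r in P)

def subsets(xs):
    for k in range(len(xs) + 1):
        for c in combinations(xs, k):
            yield frozenset(c)

def is_dense(D):
    return all(any(leq(d, p) for d in D) for p in P)

def is_filter(G):
    if not G:
        return False
    upward = all(q in G for p in G for q in P if leq(p, q))
    directed = all(any(leq(r, p) and leq(r, q) for r in G) for p in G for q in G)
    return upward and directed

DENSE = [D for D in subsets(P) if is_dense(D)]
GENERIC = [G for G in subsets(P) if is_filter(G) and all(G & D for D in DENSE)]

def lemma_17_10_i(G, E):
    # Either G meets E, or some q in G is incompatible with every member of E.
    return bool(G & E) or any(all(not compatible(r, q) for r in E) for q in G)

def prop_17_11(G):
    # Anything compatible with every member of G already belongs to G.
    return all(p in G for p in P if all(compatible(p, q) for q in G))

print(all(lemma_17_10_i(G, E) for G in GENERIC for E in subsets(P)))
print(all(prop_17_11(G) for G in GENERIC))
```

In this toy case the generic filters are exactly the four branches of the tree, which is why both statements hold for trivial reasons; the interesting case in the text is the one where no such branch lies in $M$.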
Now we introduce the idea of forcing. Recall that the logical primitive notions are $\neg$, $\wedge$, $\exists$, and $=$.
With each formula $\varphi(v_0, \ldots, v_{m-1})$ of the language of set theory we define another formula
\[ p \Vdash_{\mathbb{P},M} \varphi(\sigma_0, \ldots, \sigma_{m-1}), \]
which we read as "$p$ forces $\varphi(\sigma_0, \ldots, \sigma_{m-1})$ with respect to $\mathbb{P}$ and $M$"; it is the statement
$\mathbb{P}$ is a quasi-order, $\mathbb{P} \in M$, $\sigma_0, \ldots, \sigma_{m-1} \in M^P$, $p \in P$, and for every $G$ which is $\mathbb{P}$-generic over $M$, if $p \in G$, then the relativization of $\varphi$ to $M[G]$ holds for the elements $\sigma_{0G}, \ldots, \sigma_{(m-1)G}$.
The main aim of the next part of this chapter is to show that this definition is equivalent to one which is definable in any countable transitive model of ZFC. We do this by defining a notion $\Vdash^*$ in $V$, and then proving the equivalence with $\Vdash$. Thus for the following definition, and for Lemmas 17.12 and 17.13, we work within the usual framework of set theory, not within a model; but of course this can take place within any model of ZFC.
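Suppose that $\mathbb{P}$ is a quasi-order. Let $A = \mathrm{RO}(\mathbb{P})$, and let $e$ be the embedding of $\mathbb{P}$ into $A$ given in Chapter 2. Now we define the Boolean value $[\![\sigma = \tau]\!]$ for any names $\sigma, \tau \in V^P$, by recursion:
\[ [\![\sigma = \tau]\!] = \prod_{(\xi,p)\in\tau} \Bigl(-e(p) + \sum_{(\rho,q)\in\sigma} (e(q) \cdot [\![\rho = \xi]\!])\Bigr) \cdot \prod_{(\rho,q)\in\sigma} \Bigl(-e(q) + \sum_{(\xi,p)\in\tau} (e(p) \cdot [\![\rho = \xi]\!])\Bigr). \]
Now we associate with each formula of the language of set theory and with each sequence of members of $V^P$ a Boolean value, by metalanguage recursion, as follows:
\[ [\![\sigma \in \tau]\!] = \sum_{(\xi,p)\in\tau} (e(p) \cdot [\![\sigma = \xi]\!]); \]
\[ [\![\neg\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!] = -[\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!]; \]
\[ [\![\varphi(\sigma_0, \ldots, \sigma_{m-1}) \wedge \psi(\sigma_0, \ldots, \sigma_{m-1})]\!] = [\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!] \cdot [\![\psi(\sigma_0, \ldots, \sigma_{m-1})]\!]; \]
\[ [\![\exists x\,\varphi(\sigma_0, \ldots, \sigma_{m-1}, x)]\!] = \sum_{\tau \in V^P} [\![\varphi(\sigma_0, \ldots, \sigma_{m-1}, \tau)]\!]. \]
Note that the last big sum has index set which is a proper class in general. But the values are all in the Boolean algebra $A$, so this makes sense.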
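Lemma 17.12.
\[ [\![\varphi(\sigma_0, \ldots, \sigma_{m-1}) \vee \psi(\sigma_0, \ldots, \sigma_{m-1})]\!] = [\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!] + [\![\psi(\sigma_0, \ldots, \sigma_{m-1})]\!]; \]
\[ [\![\varphi(\sigma_0, \ldots, \sigma_{m-1}) \rightarrow \psi(\sigma_0, \ldots, \sigma_{m-1})]\!] = -[\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!] + [\![\psi(\sigma_0, \ldots, \sigma_{m-1})]\!]; \]
\[ [\![\forall x\,\varphi(\sigma_0, \ldots, \sigma_{m-1}, x)]\!] = \prod_{\tau \in V^P} [\![\varphi(\sigma_0, \ldots, \sigma_{m-1}, \tau)]\!]. \]
Note that the proof of Lemma 17.12 depends on the standard way of defining the indicated connectives in terms of the ones we have selected as primitive: $\varphi \vee \psi$ is $\neg(\neg\varphi \wedge \neg\psi)$, $\varphi \rightarrow \psi$ is $\neg\varphi \vee \psi$, and $\forall x\,\varphi$ is $\neg\exists x\,\neg\varphi$.
Now we can give our alternate definition of forcing:
\[ p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1}) \quad\text{iff}\quad e(p) \le [\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!]. \]
It is important that Boolean values and $\Vdash^*$ are definable in a c.t.m. $M$ of ZFC.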
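The recursive definitions of $[\![\sigma = \tau]\!]$, $[\![\sigma \in \tau]\!]$ and of $\Vdash^*$ can be computed directly in a toy case. The sketch below is built on assumptions that are not in the text: the quasi-order has conditions `"1"`, `"a"`, `"b"` with `a` and `b` incompatible and both below `1`; its regular open algebra is identified with the power set of `{"a", "b"}`; and the dictionary `E` plays the role of the embedding $e$. Sum is union, product is intersection, and complement is taken relative to `ATOMS`.

```python
from functools import lru_cache

# Assumed toy Boolean algebra: all subsets of ATOMS.
ATOMS = frozenset({"a", "b"})
# Assumed embedding e of the three conditions into the algebra.
E = {"1": ATOMS, "a": frozenset({"a"}), "b": frozenset({"b"})}

def neg(x):
    return ATOMS - x

@lru_cache(maxsize=None)
def bv_eq(sigma, tau):
    """[[sigma = tau]], following the two big products of the definition."""
    result = ATOMS
    for (xi, p) in tau:
        s = frozenset()
        for (rho, q) in sigma:
            s |= E[q] & bv_eq(rho, xi)
        result &= neg(E[p]) | s
    for (rho, q) in sigma:
        s = frozenset()
        for (xi, p) in tau:
            s |= E[p] & bv_eq(rho, xi)
        result &= neg(E[q]) | s
    return result

def bv_in(sigma, tau):
    """[[sigma in tau]] = sum over (xi, p) in tau of e(p) . [[sigma = xi]]."""
    s = frozenset()
    for (xi, p) in tau:
        s |= E[p] & bv_eq(sigma, xi)
    return s

def forces_eq(p, sigma, tau):
    """p forces* sigma = tau  iff  e(p) <= [[sigma = tau]]."""
    return E[p] <= bv_eq(sigma, tau)

EMPTY = frozenset()
SIGMA = frozenset({(EMPTY, "a")})   # contains the empty set exactly when a is in G
TAU = frozenset({(EMPTY, "1")})     # always evaluates to {emptyset}

print(bv_eq(SIGMA, TAU), forces_eq("a", SIGMA, TAU), forces_eq("b", SIGMA, TAU))
```

The recursion terminates because each recursive call is made on names of strictly smaller rank, mirroring the justification by the recursion theorem.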


The following lemma gives some properties of Boolean values and forcing which will
be frequently used.
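Lemma 17.13. (i) $[\![\sigma = \tau]\!] = [\![\tau = \sigma]\!]$.
(ii)
\[ [\![\sigma = \tau]\!] = \prod_{(\xi,p)\in\tau} (-e(p) + [\![\xi \in \sigma]\!]) \cdot \prod_{(\rho,q)\in\sigma} (-e(q) + [\![\rho \in \tau]\!]). \]
(iii) $[\![\sigma = \sigma]\!] = 1$.
(iv) If $(\rho, r) \in \sigma$, then $e(r) \le [\![\rho \in \sigma]\!]$, and hence $r \Vdash^* \rho \in \sigma$.
(v) $p \Vdash^* \sigma \in \tau$ iff the set
\[ \{q : \exists (\xi, s) \in \tau\,(q \le s \text{ and } q \Vdash^* \sigma = \xi)\} \]
is dense below $p$.
(vi) $p \Vdash^* \sigma = \tau$ iff the following two conditions hold:
(a) For all $(\rho, s) \in \sigma$, the set
\[ \{q \le p : \text{if } q \le s, \text{ then there is a } (\xi, u) \in \tau \text{ such that } q \le u \text{ and } q \Vdash^* \rho = \xi\} \]
is dense below $p$;
(b) For all $(\xi, u) \in \tau$, the set
\[ \{q \le p : \text{if } q \le u, \text{ then there is a } (\rho, s) \in \sigma \text{ such that } q \le s \text{ and } q \Vdash^* \rho = \xi\} \]
is dense below $p$.
(vii) $p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{n-1}) \wedge \psi(\sigma_0, \ldots, \sigma_{n-1})$ iff $p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{n-1})$ and $p \Vdash^* \psi(\sigma_0, \ldots, \sigma_{n-1})$.
(viii) $p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{n-1}) \vee \psi(\sigma_0, \ldots, \sigma_{n-1})$ iff for all $q \le p$ there is an $r \le q$ such that $r \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{n-1})$ or $r \Vdash^* \psi(\sigma_0, \ldots, \sigma_{n-1})$.
(ix) $p \Vdash^* \neg\varphi(\sigma_0, \ldots, \sigma_{n-1})$ iff for all $q \le p$, $q \not\Vdash^* \varphi(\sigma_0, \ldots, \sigma_{n-1})$.
(x) $\{p : p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{n-1}) \text{ or } p \Vdash^* \neg\varphi(\sigma_0, \ldots, \sigma_{n-1})\}$ is dense.
(xi) $p \Vdash^* \exists x\,\varphi(x, \sigma_0, \ldots, \sigma_{n-1})$ iff the set
\[ \{r \le p : \text{there is a } \tau \in V^P \text{ such that } r \Vdash^* \varphi(\tau, \sigma_0, \ldots, \sigma_{n-1})\} \]
is dense below $p$.
(xii) $p \Vdash^* \forall x\,\varphi(x, \sigma_0, \ldots, \sigma_{m-1})$ iff for all $\tau \in V^P$, $p \Vdash^* \varphi(\tau, \sigma_0, \ldots, \sigma_{m-1})$.
(xiii) The following are equivalent:
(a) $p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1})$.
(b) For every $r \le p$, $r \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1})$.
(c) $\{r : r \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1})\}$ is dense below $p$.
(xiv) $p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1}) \rightarrow \psi(\sigma_0, \ldots, \sigma_{m-1})$ iff the set
\[ \{q : q \Vdash^* \neg\varphi(\sigma_0, \ldots, \sigma_{m-1}) \text{ or } q \Vdash^* \psi(\sigma_0, \ldots, \sigma_{m-1})\} \]
is dense below $p$.
(xv) If $p \Vdash^* \exists x\,\varphi(x, \sigma_0, \ldots, \sigma_{m-1})$, then the set
\[ \{q : \text{there is a } \tau \in V^P \text{ such that } q \Vdash^* \varphi(\tau, \sigma_0, \ldots, \sigma_{m-1})\} \]
is dense below $p$.
(xvi) If $p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1})$ and $p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1}) \rightarrow \psi(\sigma_0, \ldots, \sigma_{m-1})$, then $p \Vdash^* \psi(\sigma_0, \ldots, \sigma_{m-1})$.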
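Proof.
(i): By induction:
\begin{align*}
[\![\sigma = \tau]\!] &= \prod_{(\xi,p)\in\tau} \Bigl(-e(p) + \sum_{(\rho,q)\in\sigma} (e(q) \cdot [\![\rho = \xi]\!])\Bigr) \cdot \prod_{(\rho,q)\in\sigma} \Bigl(-e(q) + \sum_{(\xi,p)\in\tau} (e(p) \cdot [\![\rho = \xi]\!])\Bigr) \\
&= \prod_{(\xi,p)\in\tau} \Bigl(-e(p) + \sum_{(\rho,q)\in\sigma} (e(q) \cdot [\![\xi = \rho]\!])\Bigr) \cdot \prod_{(\rho,q)\in\sigma} \Bigl(-e(q) + \sum_{(\xi,p)\in\tau} (e(p) \cdot [\![\xi = \rho]\!])\Bigr) \\
&= \prod_{(\rho,q)\in\sigma} \Bigl(-e(q) + \sum_{(\xi,p)\in\tau} (e(p) \cdot [\![\xi = \rho]\!])\Bigr) \cdot \prod_{(\xi,p)\in\tau} \Bigl(-e(p) + \sum_{(\rho,q)\in\sigma} (e(q) \cdot [\![\xi = \rho]\!])\Bigr) \\
&= [\![\tau = \sigma]\!].
\end{align*}
(ii):
\begin{align*}
[\![\sigma = \tau]\!] &= \prod_{(\xi,p)\in\tau} \Bigl(-e(p) + \sum_{(\rho,q)\in\sigma} (e(q) \cdot [\![\rho = \xi]\!])\Bigr) \cdot \prod_{(\rho,q)\in\sigma} \Bigl(-e(q) + \sum_{(\xi,p)\in\tau} (e(p) \cdot [\![\rho = \xi]\!])\Bigr) \\
&= \prod_{(\xi,p)\in\tau} \Bigl(-e(p) + \sum_{(\rho,q)\in\sigma} (e(q) \cdot [\![\xi = \rho]\!])\Bigr) \cdot \prod_{(\rho,q)\in\sigma} \Bigl(-e(q) + \sum_{(\xi,p)\in\tau} (e(p) \cdot [\![\rho = \xi]\!])\Bigr) \\
&= \prod_{(\xi,p)\in\tau} (-e(p) + [\![\xi \in \sigma]\!]) \cdot \prod_{(\rho,q)\in\sigma} (-e(q) + [\![\rho \in \tau]\!]).
\end{align*}
We prove (iii) and (iv) simultaneously by induction on the rank of $\sigma$; so suppose that they hold for all $\rho$ of rank less than that of $\sigma$. Assume that $(\rho, r) \in \sigma$. Then by the definition of $[\![\rho \in \sigma]\!]$,
\[ [\![\rho \in \sigma]\!] = \sum_{(\mu,s)\in\sigma} (e(s) \cdot [\![\rho = \mu]\!]) \ge e(r) \cdot [\![\rho = \rho]\!] = e(r), \]
as desired in (iv). Using this and (ii),
\[ [\![\sigma = \sigma]\!] = \prod_{(\rho,r)\in\sigma} (-e(r) + [\![\rho \in \sigma]\!]) = 1, \]
as desired in (iii).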
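We now use Theorem 16.20(vii) in several of our arguments.
(v):
\begin{align*}
p \Vdash^* \sigma \in \tau \quad &\text{iff} \quad e(p) \le [\![\sigma \in \tau]\!] \\
&\text{iff} \quad e(p) \le \sum_{(\xi,s)\in\tau} (e(s) \cdot [\![\sigma = \xi]\!]) \\
&\text{iff} \quad \{q : \exists (\xi, s) \in \tau\,[e(q) \le e(s) \cdot [\![\sigma = \xi]\!]]\} \text{ is dense below } p.
\end{align*}
We claim that the last statement here is equivalent to
($*$) $\{q : \exists (\xi, s) \in \tau\,[q \le s \text{ and } e(q) \le [\![\sigma = \xi]\!]]\}$ is dense below $p$.
In fact clearly ($*$) implies the above statement. Now suppose that
\[ \{q : \exists (\xi, s) \in \tau\,[e(q) \le e(s) \cdot [\![\sigma = \xi]\!]]\} \text{ is dense below } p. \]
Take any $r \le p$, and choose $q \le r$ and $(\xi, s) \in \tau$ such that $e(q) \le e(s) \cdot [\![\sigma = \xi]\!]$. Then $q$ and $s$ are compatible; say $t \le q, s$. Then $t \le q \le r$ and $e(t) \le e(q) \le e(s) \cdot [\![\sigma = \xi]\!]$. Thus ($*$) holds.
Now ($*$) is clearly equivalent to
\[ \{q : \exists (\xi, s) \in \tau\,[q \le s \text{ and } q \Vdash^* \sigma = \xi]\} \text{ is dense below } p. \]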
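(vi) $\Rightarrow$: Assume that $p \Vdash^* \sigma = \tau$.
For (a), suppose that $(\rho, s) \in \sigma$ and $r \le p$. If $r \not\le s$, then $r$ itself is in the desired set; so suppose that $r \le s$. Then
\[ e(r) \le e(s) \cdot e(p) \le e(s) \cdot \Bigl(-e(s) + \sum_{(\xi,u)\in\tau} (e(u) \cdot [\![\rho = \xi]\!])\Bigr) = e(s) \cdot \sum_{(\xi,u)\in\tau} (e(u) \cdot [\![\rho = \xi]\!]). \]
Hence there is a $(\xi, u) \in \tau$ such that $e(r) \cdot e(s) \cdot e(u) \cdot [\![\rho = \xi]\!] \ne 0$. Hence there exists a $v \le r, s$ such that $e(v) \le e(u) \cdot [\![\rho = \xi]\!]$. (See the argument for (v).) It follows that there is a $q \le v, u$ with $e(q) \le [\![\rho = \xi]\!]$. So $q \Vdash^* \rho = \xi$, and $q$ is in the desired set.
(b) is treated similarly.
$\Leftarrow$: Now assume that (a) and (b) hold. We want to show that $p \Vdash^* \sigma = \tau$, i.e., that $e(p) \le [\![\sigma = \tau]\!]$. To show that $e(p)$ is below the first big product in the definition of $[\![\sigma = \tau]\!]$, take any $(\xi, q) \in \tau$; we want to show that
\[ e(p) \le -e(q) + \sum_{(\rho,r)\in\sigma} (e(r) \cdot [\![\rho = \xi]\!]), \]
i.e., that
\[ e(p) \cdot e(q) \le \sum_{(\rho,r)\in\sigma} (e(r) \cdot [\![\rho = \xi]\!]). \]
Suppose that this is not the case. Then there is an $s$ such that
\[ e(s) \le e(p) \cdot e(q) \cdot -\sum_{(\rho,r)\in\sigma} (e(r) \cdot [\![\rho = \xi]\!]) = e(p) \cdot e(q) \cdot \prod_{(\rho,r)\in\sigma} (-e(r) + -[\![\rho = \xi]\!]). \]
Hence there is a $u \le s, p, q$. By (b) choose $v \le u$ and $(\rho, r) \in \sigma$ such that $v \le r$ and $v \Vdash^* \rho = \xi$. Then $e(v) \le e(r) \cdot [\![\rho = \xi]\!]$, and also $e(v) \le -e(r) + -[\![\rho = \xi]\!]$, contradiction.
Similarly, $e(p)$ is below the second big product in the definition of $[\![\sigma = \tau]\!]$.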
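(vii): Clear.
(viii): Since
\[ p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{n-1}) \vee \psi(\sigma_0, \ldots, \sigma_{n-1}) \quad\text{iff}\quad e(p) \le [\![\varphi(\sigma_0, \ldots, \sigma_{n-1}) \vee \psi(\sigma_0, \ldots, \sigma_{n-1})]\!], \]
this is immediate from Theorem 16.20(vii).
(ix) $\Rightarrow$: if $p \Vdash^* \neg\varphi(\sigma_0, \ldots, \sigma_{n-1})$ and $q \le p$, then
\[ e(q) \le e(p) \le [\![\neg\varphi(\sigma_0, \ldots, \sigma_{n-1})]\!] = -[\![\varphi(\sigma_0, \ldots, \sigma_{n-1})]\!], \]
and hence $e(q) \not\le [\![\varphi(\sigma_0, \ldots, \sigma_{n-1})]\!]$, since $e(q) \ne 0$. Thus $q \not\Vdash^* \varphi(\sigma_0, \ldots, \sigma_{n-1})$.
$\Leftarrow$: suppose that $p \not\Vdash^* \neg\varphi(\sigma_0, \ldots, \sigma_{n-1})$. Then
\[ e(p) \not\le [\![\neg\varphi(\sigma_0, \ldots, \sigma_{n-1})]\!] = -[\![\varphi(\sigma_0, \ldots, \sigma_{n-1})]\!], \]
and hence
\[ e(p) \cdot [\![\varphi(\sigma_0, \ldots, \sigma_{n-1})]\!] \ne 0, \]
so we can choose $r$ such that
\[ e(r) \le e(p) \cdot [\![\varphi(\sigma_0, \ldots, \sigma_{n-1})]\!], \]
hence there is a $q \le p, r$, and so $q \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{n-1})$.
(x): Let $q$ be given. If $e(q) \cdot [\![\varphi(\sigma_0, \ldots, \sigma_{n-1})]\!] \ne 0$, choose $r$ such that $e(r) \le e(q) \cdot [\![\varphi(\sigma_0, \ldots, \sigma_{n-1})]\!]$, and then choose $p \le q, r$. Thus $p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{n-1})$, as desired. If $e(q) \cdot [\![\varphi(\sigma_0, \ldots, \sigma_{n-1})]\!] = 0$, then $q \Vdash^* \neg\varphi(\sigma_0, \ldots, \sigma_{n-1})$.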

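(xi): Suppose that $p \Vdash^* \exists x\,\varphi(x, \sigma_0, \ldots, \sigma_{n-1})$, and suppose that $q \le p$. Then $e(q) \le \sum_{\tau \in V^P} [\![\varphi(\tau, \sigma_0, \ldots, \sigma_{n-1})]\!]$, and so there is a $\tau \in V^P$ such that $e(q) \cdot [\![\varphi(\tau, \sigma_0, \ldots, \sigma_{n-1})]\!] \ne 0$; hence we easily get $r \le q$ such that $e(r) \le [\![\varphi(\tau, \sigma_0, \ldots, \sigma_{n-1})]\!]$. This implies that $r \Vdash^* \varphi(\tau, \sigma_0, \ldots, \sigma_{n-1})$, as desired.
Conversely, suppose that the set
\[ \{r \le p : \text{there is a } \tau \in V^P \text{ such that } r \Vdash^* \varphi(\tau, \sigma_0, \ldots, \sigma_{n-1})\} \]
is dense below $p$, while $p \not\Vdash^* \exists x\,\varphi(x, \sigma_0, \ldots, \sigma_{n-1})$. Thus $e(p) \not\le [\![\exists x\,\varphi(x, \sigma_0, \ldots, \sigma_{n-1})]\!]$, so
\[ e(p) \cdot \prod_{\tau \in V^P} -[\![\varphi(\tau, \sigma_0, \ldots, \sigma_{n-1})]\!] \ne 0. \]
Then we easily get $q \le p$ such that
\[ (4) \qquad e(q) \le \prod_{\tau \in V^P} -[\![\varphi(\tau, \sigma_0, \ldots, \sigma_{n-1})]\!]. \]
By assumption, choose $r \le q$ and $\tau \in V^P$ such that $r \Vdash^* \varphi(\tau, \sigma_0, \ldots, \sigma_{n-1})$. Thus $e(r) \le [\![\varphi(\tau, \sigma_0, \ldots, \sigma_{n-1})]\!]$. This clearly contradicts (4).
(xii): clear by Lemma 17.12.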
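(xiii): Clearly (a)$\Rightarrow$(b)$\Rightarrow$(c). Now assume (c). Suppose that $p \not\Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1})$. Thus $e(p) \not\le [\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!]$, so we easily get $q \le p$ such that
\[ (5) \qquad e(q) \le -[\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!]. \]
By (c), choose $r \le q$ such that $r \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1})$. Clearly this contradicts (5).
(xiv): First suppose that $p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1}) \rightarrow \psi(\sigma_0, \ldots, \sigma_{m-1})$. Thus by the definition of $\rightarrow$ we have $p \Vdash^* \neg\varphi(\sigma_0, \ldots, \sigma_{m-1}) \vee \psi(\sigma_0, \ldots, \sigma_{m-1})$. Hence the desired conclusion follows by (viii). The converse follows by reversing these steps.
(xv): This is very similar to part of the proof of (xi), but we give it anyway. We have
\[ e(p) \le [\![\exists x\,\varphi(x, \sigma_0, \ldots, \sigma_{m-1})]\!] = \sum_{\tau \in V^P} [\![\varphi(\tau, \sigma_0, \ldots, \sigma_{m-1})]\!]. \]
Now suppose that $q \le p$. Then $e(q)$ is $\le$ the sum here, so we easily get $r \le q$ and $\tau \in V^P$ such that $e(r) \le [\![\varphi(\tau, \sigma_0, \ldots, \sigma_{m-1})]\!]$. Hence $r \Vdash^* \varphi(\tau, \sigma_0, \ldots, \sigma_{m-1})$, as desired.
(xvi): The hypotheses yield
\[ e(p) \le [\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!] \quad\text{and}\quad e(p) \le -[\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!] + [\![\psi(\sigma_0, \ldots, \sigma_{m-1})]\!], \]
so $e(p) \le [\![\psi(\sigma_0, \ldots, \sigma_{m-1})]\!]$ and hence $p \Vdash^* \psi(\sigma_0, \ldots, \sigma_{m-1})$.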
Note that the discussion of Boolean values and of $\Vdash^*$ has taken place in our usual framework of set theory. The complete BA $\mathrm{RO}(P)$ is in general uncountable. Given a c.t.m. $M$ of ZFC, the definitions can take place within $M$, and while $M$ may be a model of "$\mathrm{RO}(P)$ is uncountable", actually $\mathrm{RO}(P)^M$ is countable. Thus even if $\sigma_0, \ldots, \sigma_{m-1}$ are members of $M^P$, the statements $p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1})$ and $(p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1}))^M$ are much different, since the products and sums involved in the definition of the former range over a possibly uncountable complete BA, while those in the latter range only over a countable BA (which is actually incomplete if it is infinite). It can be shown, however, that the statements are equivalent; see exercise 17.20. We do not need this, as we will really be concerned only with the forcing relativized to $M$.
Now we prove the fundamental theorem connecting the notion $\Vdash^*$ in a c.t.m. $M$ with the notion $\Vdash$, whose definition takes place outside $M$.
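Theorem 17.14. (The Forcing Theorem) Suppose that $M$ is a c.t.m. of ZFC, $\mathbb{P} \in M$ is a quasi-order, and $G$ is $\mathbb{P}$-generic over $M$. Then the following conditions are equivalent:
(i) There is a $p \in G$ such that $(p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1}))^M$.
(ii) $\varphi(\sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$.
Proof. First we prove the equivalence for the formula $x = y$, by induction on the pair $(\sigma_0, \sigma_1)$.
For (i)$\Rightarrow$(ii), suppose that $p \in G$ and $(p \Vdash^* \sigma_0 = \sigma_1)^M$. We want to show that $\sigma_{0G} = \sigma_{1G}$. First we prove that $\sigma_{0G} \subseteq \sigma_{1G}$. So, suppose that $a \in \sigma_{0G}$. Then there is a $(\rho, t) \in \sigma_0$ such that $t \in G$ and $a = \rho_G$. Since $p, t \in G$, we can choose $r \in G$ such that $r \le p, t$. Then $(r \Vdash^* \sigma_0 = \sigma_1)^M$ since $r \le p$. Now we work within $M$. We have $e(r) \le [\![\sigma_0 = \sigma_1]\!]$. Hence by definition we have
\[ e(r) \le -e(t) + \sum_{(\xi,u)\in\sigma_1} (e(u) \cdot [\![\rho = \xi]\!]); \]
since $r \le t$, we actually have $e(r) \le \sum_{(\xi,u)\in\sigma_1} (e(u) \cdot [\![\rho = \xi]\!])$. Now
\[ \{s \in P : \exists (\xi, u) \in \sigma_1\,[s \le u \text{ and } e(s) \le [\![\rho = \xi]\!]]\} \]
is dense below $r$. In fact, suppose that $t' \le r$. Then $e(t') \le \sum_{(\xi,u)\in\sigma_1} (e(u) \cdot [\![\rho = \xi]\!])$, and so there is a $(\xi, u) \in \sigma_1$ such that $e(t') \cdot e(u) \cdot [\![\rho = \xi]\!] \ne 0$; hence there is a $v \le t', u$ such that $e(v) \le [\![\rho = \xi]\!]$, as desired.
By this denseness, using Lemma 17.10 we get $s \in G$ and $(\xi, u) \in \sigma_1$ such that $s \le u$ and $e(s) \le [\![\rho = \xi]\!]$. So $(s \Vdash^* \rho = \xi)^M$. By the inductive hypothesis we get $\rho_G = \xi_G$. Also $u \in G$, since $s \le u$. Hence $a = \rho_G = \xi_G \in \sigma_{1G}$, as desired. The proof that $\sigma_{1G} \subseteq \sigma_{0G}$ is similar.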
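For (ii)$\Rightarrow$(i), suppose that $\sigma_{0G} = \sigma_{1G}$ holds in $M[G]$. By Lemma 17.13(x), choose $p \in G$ such that $(p \Vdash^* \sigma_0 = \sigma_1)^M$ or $(p \Vdash^* \neg(\sigma_0 = \sigma_1))^M$. It suffices now to get a contradiction from the assumption that $(p \Vdash^* \neg(\sigma_0 = \sigma_1))^M$. First we claim:
(1) The set
\[ \Bigl\{r : \text{there is a } (\xi, s) \in \sigma_1 \text{ such that } e(r) \le e(s) \cdot \prod_{(\rho,t)\in\sigma_0} (-e(t) + -[\![\rho = \xi]\!]), \text{ or there is a } (\rho, t) \in \sigma_0 \text{ such that } e(r) \le e(t) \cdot \prod_{(\xi,s)\in\sigma_1} (-e(s) + -[\![\rho = \xi]\!])\Bigr\} \]
is dense below $p$.
In fact, suppose that $q \le p$. Then
\[ e(q) \le -[\![\sigma_0 = \sigma_1]\!] = \sum_{(\xi,s)\in\sigma_1} \Bigl(e(s) \cdot \prod_{(\rho,t)\in\sigma_0} (-e(t) + -[\![\rho = \xi]\!])\Bigr) + \sum_{(\rho,t)\in\sigma_0} \Bigl(e(t) \cdot \prod_{(\xi,s)\in\sigma_1} (-e(s) + -[\![\rho = \xi]\!])\Bigr). \]
Now if $e(q)$ times the first big sum here is nonzero, then we easily get $r \le q$ and $(\xi, s) \in \sigma_1$ such that
\[ e(r) \le e(s) \cdot \prod_{(\rho,t)\in\sigma_0} (-e(t) + -[\![\rho = \xi]\!]). \]
Thus $r$ is in the set (1), as desired. If $e(q)$ times the first big sum is zero, then $e(q)$ times the second big sum is nonzero, and we easily get $r \le q$ in (1) again. So (1) is dense below $p$.
Hence there is an $r \in G$ such that $r$ is in the set (1).
Case 1. There is a $(\xi, s) \in \sigma_1$ such that
\[ e(r) \le e(s) \cdot \prod_{(\rho,t)\in\sigma_0} (-e(t) + -[\![\rho = \xi]\!]). \]
Now it is easily seen that the set $\{u : u \le r, s\}$ is dense below $r$, so we can choose $u \in G$ such that $u \le r, s$. It follows that $s \in G$, and hence $\xi_G \in \sigma_{1G}$. Hence $\xi_G \in \sigma_{0G}$, and so there is a $(\rho, t) \in \sigma_0$ such that $t \in G$ and $\xi_G = \rho_G$. By the induction hypothesis, there is a $v \in G$ such that $(v \Vdash^* \rho = \xi)^M$. Now choose $w \in G$ with $w \le u, t, v$. Now $w \le u \le r$, so $e(w) \le e(r)$, and hence $e(w) \le -e(t) + -[\![\rho = \xi]\!]$. Also $w \le t$, so $e(w) \le e(t)$, and hence $e(w) \le -[\![\rho = \xi]\!]$. But also $w \le v$, so $(w \Vdash^* \rho = \xi)^M$, and hence $e(w) \le [\![\rho = \xi]\!]$, contradiction.
Case 2. There is a $(\rho, t) \in \sigma_0$ such that
\[ e(r) \le e(t) \cdot \prod_{(\xi,s)\in\sigma_1} (-e(s) + -[\![\rho = \xi]\!]). \]
This case is treated just as in the first case.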


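Thus we have proved the equivalence for the formula $x = y$. Next we take the formula $x \in y$. For (i)$\Rightarrow$(ii), assume that $p \in G$ and $(p \Vdash^* \sigma \in \tau)^M$. By Lemma 17.13(v), the set
\[ \{q : \exists (\xi, s) \in \tau\,(q \le s \text{ and } (q \Vdash^* \sigma = \xi)^M)\} \]
is dense below $p$. Hence we can choose $q \in G$ with $q \le p$ and $(\xi, s) \in \tau$ such that $q \le s$ and $(q \Vdash^* \sigma = \xi)^M$. Then $s \in G$ and $(\xi, s) \in \tau$, so $\xi_G \in \tau_G$. By the equality case already treated, from $q \in G$ and $(q \Vdash^* \sigma = \xi)^M$ we get $\sigma_G = \xi_G$. Hence $\sigma_G \in \tau_G$, as desired.
For (ii)$\Rightarrow$(i) for the formula $x \in y$, suppose that $\sigma_G \in \tau_G$. Then there is a $(\xi, s) \in \tau$ such that $s \in G$ and $\sigma_G = \xi_G$. By the equality case, there is a $p \in G$ such that $(p \Vdash^* \sigma = \xi)^M$. Choose $r \in G$ with $r \le p, s$. Then $(r \Vdash^* \sigma = \xi)^M$, so by Lemma 17.13(v) we get $(r \Vdash^* \sigma \in \tau)^M$, as desired.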
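Thus now we have proved the equivalence for atomic formulas. We proceed by metalanguage induction on formulas now.
Suppose that the equivalence holds for $\varphi$; we prove it for $\neg\varphi$. For (i)$\Rightarrow$(ii), suppose that $p \in G$ and $(p \Vdash^* \neg\varphi(\sigma_0, \ldots, \sigma_{m-1}))^M$. We want to show that $\neg\varphi(\sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$. Suppose to the contrary that $\varphi(\sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$. Then by the equivalence for $\varphi$, choose $q \in G$ such that $(q \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1}))^M$. Choose $r \in G$ with $r \le p, q$. Then $(r \Vdash^* \neg\varphi(\sigma_0, \ldots, \sigma_{m-1}))^M$ and $(r \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1}))^M$. This is a contradiction. [These statements imply that
\[ e(r) \le [\![\neg\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!] = -[\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!] \quad\text{and}\quad e(r) \le [\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!], \]
so that $e(r) = 0$, contradiction.]
For (ii)$\Rightarrow$(i), suppose that $\neg\varphi(\sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$. By Lemma 17.13(x), the set
\[ D \stackrel{\mathrm{def}}{=} \{p : (p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1}))^M \text{ or } (p \Vdash^* \neg\varphi(\sigma_0, \ldots, \sigma_{m-1}))^M\} \]
is dense, so choose $p \in D \cap G$. If $(p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1}))^M$, then by the equivalence for $\varphi$, the statement $\varphi(\sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$, contradiction. Hence $(p \Vdash^* \neg\varphi(\sigma_0, \ldots, \sigma_{m-1}))^M$.
If the equivalence holds for $\varphi$ and $\psi$ then it clearly also holds for $\varphi \wedge \psi$.
Finally, we deal with the formula $\exists x\,\varphi(x, \sigma_0, \ldots, \sigma_{m-1})$. For (i)$\Rightarrow$(ii), assume that $p \in G$ and $(p \Vdash^* \exists x\,\varphi(x, \sigma_0, \ldots, \sigma_{m-1}))^M$. By Lemma 17.13(xi), choose $q \le p$ with $q \in G$ and a name $\tau$ such that $(q \Vdash^* \varphi(\tau, \sigma_0, \ldots, \sigma_{m-1}))^M$. Then by the inductive hypothesis, $M[G] \models \varphi(\tau_G, \sigma_{0G}, \ldots, \sigma_{(m-1)G})$, as desired.
For (ii)$\Rightarrow$(i), suppose that $(\exists x\,\varphi(x, \sigma_{0G}, \ldots, \sigma_{(m-1)G}))^{M[G]}$. Let $\tau$ be a name such that $\varphi(\tau_G, \sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$. Then by the inductive hypothesis there is a $p \in G$ such that $(p \Vdash^* \varphi(\tau, \sigma_0, \ldots, \sigma_{m-1}))^M$. By Lemma 17.13(xi), $(p \Vdash^* \exists x\,\varphi(x, \sigma_0, \ldots, \sigma_{m-1}))^M$, as desired.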
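The atomic cases of the Forcing Theorem can be verified exhaustively in a toy setting. The sketch below rests on assumptions outside the text: a three-element quasi-order with conditions `"1"`, `"a"`, `"b"` (the last two incompatible), its regular open algebra identified with the power set of `{"a", "b"}`, the dictionary `E` standing in for $e$, and the two sets in `GENERICS` taken as the filters meeting every dense set. All identifiers are hypothetical. For a small family of names it compares the truth of $\sigma_G = \tau_G$ and $\sigma_G \in \tau_G$ with the existence of a $p \in G$ such that $e(p)$ is below the corresponding Boolean value.

```python
from functools import lru_cache
from itertools import combinations

ATOMS = frozenset({"a", "b"})
E = {"1": ATOMS, "a": frozenset({"a"}), "b": frozenset({"b"})}  # assumed embedding e
CONDITIONS = list(E)
GENERICS = [frozenset({"1", "a"}), frozenset({"1", "b"})]

def val(name, G):
    return frozenset(val(sigma, G) for (sigma, p) in name if p in G)

@lru_cache(maxsize=None)
def bv_eq(sigma, tau):
    """[[sigma = tau]] computed from the recursive definition."""
    result = ATOMS
    for (xi, p) in tau:
        s = frozenset()
        for (rho, q) in sigma:
            s |= E[q] & bv_eq(rho, xi)
        result &= (ATOMS - E[p]) | s
    for (rho, q) in sigma:
        s = frozenset()
        for (xi, p) in tau:
            s |= E[p] & bv_eq(rho, xi)
        result &= (ATOMS - E[q]) | s
    return result

def bv_in(sigma, tau):
    s = frozenset()
    for (xi, p) in tau:
        s |= E[p] & bv_eq(sigma, xi)
    return s

def subsets(items):
    for k in range(len(items) + 1):
        for c in combinations(items, k):
            yield frozenset(c)

EMPTY = frozenset()
LEVEL1 = list(subsets([(EMPTY, p) for p in CONDITIONS]))          # names of rank <= 1
LEVEL2 = [frozenset({(n, p)}) for n in LEVEL1 for p in CONDITIONS]
NAMES = LEVEL1 + LEVEL2

def forced(G, boolean_value):
    """Is there a p in G with e(p) <= the given Boolean value?"""
    return any(E[p] <= boolean_value for p in G)

def forcing_theorem_holds():
    for G in GENERICS:
        for s in NAMES:
            for t in NAMES:
                if (val(s, G) == val(t, G)) != forced(G, bv_eq(s, t)):
                    return False
                if (val(s, G) in val(t, G)) != forced(G, bv_in(s, t)):
                    return False
    return True

print(forcing_theorem_holds())
```

This is a consistency check on the definitions, not a substitute for the proof: in the toy case each generic filter is determined by an atom, so a Boolean value is just the set of atoms at which the statement is true.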
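Theorem 17.15. Let $M$ be a c.t.m. of ZFC and $\mathbb{P} \in M$ a quasi-order.
(i) For all $p \in P$, $p \Vdash \varphi(\sigma_0, \ldots, \sigma_{m-1})$ iff $(p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1}))^M$.
(ii) For every $G$ which is $\mathbb{P}$-generic over $M$, $\varphi(\sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$ iff there is a $p \in G$ such that $p \Vdash \varphi(\sigma_0, \ldots, \sigma_{m-1})$.
Proof. (i)$\Rightarrow$: Assume that $p \Vdash \varphi(\sigma_0, \ldots, \sigma_{m-1})$. Working in $M$, suppose that $p \not\Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1})$. Then
\[ e(p) \not\le [\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!], \]
and hence $e(p) \cdot -[\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!] \ne 0$, hence there is a $q$ such that
\[ e(q) \le e(p) \cdot -[\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!], \]
hence we can choose $r \le p, q$.
Now we argue outside of $M$. Let $G$ be $\mathbb{P}$-generic over $M$ with $r \in G$. Since $e(r) \le -[\![\varphi(\sigma_0, \ldots, \sigma_{m-1})]\!]$, it follows from Theorem 17.14 that $\neg\varphi(\sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$. But $r \le p$, so $p \in G$, and hence by assumption $\varphi(\sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$, contradiction.
(i)$\Leftarrow$: by Theorem 17.14.
(ii)$\Rightarrow$: By Theorem 17.14 we get $p \in G$ such that $(p \Vdash^* \varphi(\sigma_0, \ldots, \sigma_{m-1}))^M$, so the desired conclusion follows from (i).
(ii)$\Leftarrow$: Immediate from the definition of $\Vdash$.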
Now we give various useful results concerning forcing and Boolean values.
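Proposition 17.16. If $\varphi$ is a sentence which holds in all models, then $p \Vdash \varphi$ for all $p$.
Proof. Suppose that $p \not\Vdash \varphi$. Then there is a generic $G$ such that $p \in G$ and $\neg\varphi$ holds in $M[G]$, contradiction.
Proposition 17.17.
(i) The following are equivalent:
(a) $p \Vdash \exists x \in \tau\,\varphi(x, \sigma_0, \ldots, \sigma_{m-1})$.
(b) The set
\[ \{q : \text{there is a } (\xi, r) \in \tau \text{ such that } q \le r \text{ and } q \Vdash \varphi(\xi, \sigma_0, \ldots, \sigma_{m-1})\} \]
is dense below $p$.
(ii) If $p \Vdash \exists x \in \tau\,\varphi(x, \sigma_0, \ldots, \sigma_{m-1})$, then there exist a $q \le p$ and a $\xi \in \mathrm{dmn}(\tau)$ such that $q \Vdash \varphi(\xi, \sigma_0, \ldots, \sigma_{m-1})$.
(iii) The following are equivalent:
(a) $p \Vdash \forall x \in \tau\,\varphi(x, \sigma_0, \ldots, \sigma_{m-1})$.
(b) For all $(\xi, r) \in \tau$ and all $q \le p, r$, $q \Vdash \varphi(\xi, \sigma_0, \ldots, \sigma_{m-1})$.
Proof. (i)(a)$\Rightarrow$(b): Assume (a), and suppose that $r \le p$. Let $G$ be generic with $r \in G$. Thus by definition of $\Vdash$, $\exists x \in \tau_G\,\varphi(x, \sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$, so there is a member $a$ of $M[G]$ such that $a \in \tau_G$ and $\varphi(a, \sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$. Then there is a $(\xi, r') \in \tau$ such that $r' \in G$ and $a = \xi_G$. By Theorem 17.15, there is an $s \in G$ such that $s \Vdash \varphi(\xi, \sigma_0, \ldots, \sigma_{m-1})$. Choose $q \in G$ such that $q \le r, r', s$. Then $q \le r$ and $q$ is in the set of (b), as desired.
(i)(b)$\Rightarrow$(a): Assume (b), and suppose that $G$ is generic and $p \in G$. Then by (b) we can choose $q \le p$ with $q \in G$, and $q$ in the set of (b). So, choose $(\xi, r) \in \tau$ such that $q \le r$ and $q \Vdash \varphi(\xi, \sigma_0, \ldots, \sigma_{m-1})$. Then $r \in G$, so $\xi_G \in \tau_G$, and $\varphi(\xi_G, \sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$. Hence (a) holds.
(ii): immediate from (i).
(iii)(a)$\Rightarrow$(b): assume (a) and suppose that $(\xi, r) \in \tau$ and $q \le p, r$. Suppose that $G$ is generic and $q \in G$. Then also $p \in G$, so by (a) and the definition of $\Vdash$, the statement $\forall x \in \tau_G\,\varphi(x, \sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$. Also $r \in G$, so $\xi_G \in \tau_G$. Hence $\varphi(\xi_G, \sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$. This proves (b).
(iii)(b)$\Rightarrow$(a): Assume (b), and suppose that $p \in G$ with $G$ generic. To prove that $\forall x \in \tau_G\,\varphi(x, \sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$, let $a$ be any member of $\tau_G$. Write $a = \xi_G$ with $(\xi, r) \in \tau$ and $r \in G$. Choose $q \in G$ with $q \le p, r$. Then by (b) we see that $\varphi(\xi_G, \sigma_{0G}, \ldots, \sigma_{(m-1)G})$ holds in $M[G]$. This proves (a).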
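Proposition 17.18.
(i)
\[ [\![\exists x \in \tau\,\varphi(\sigma_0, \ldots, \sigma_{m-1}, x)]\!] = \sum_{(\xi,p)\in\tau} (e(p) \cdot [\![\varphi(\sigma_0, \ldots, \sigma_{m-1}, \xi)]\!]). \]
(ii)
\[ [\![\forall x \in \tau\,\varphi(\sigma_0, \ldots, \sigma_{m-1}, x)]\!] = \prod_{(\xi,p)\in\tau} (-e(p) + [\![\varphi(\sigma_0, \ldots, \sigma_{m-1}, \xi)]\!]). \]
Proof. For (i), we first show that $[\![\exists x \in \tau\,\varphi(\sigma_0, \ldots, \sigma_{m-1}, x)]\!]$ is an upper bound for
\[ X \stackrel{\mathrm{def}}{=} \{e(p) \cdot [\![\varphi(\sigma_0, \ldots, \sigma_{m-1}, \xi)]\!] : (\xi, p) \in \tau\}. \]
In fact, if $(\xi, p) \in \tau$, then
\begin{align*}
e(p) \cdot [\![\varphi(\sigma_0, \ldots, \sigma_{m-1}, \xi)]\!] &\le [\![\xi \in \tau]\!] \cdot [\![\varphi(\sigma_0, \ldots, \sigma_{m-1}, \xi)]\!] && \text{by Lemma 17.13(iv)} \\
&= [\![\xi \in \tau \wedge \varphi(\sigma_0, \ldots, \sigma_{m-1}, \xi)]\!] \\
&\le [\![\exists x \in \tau\,\varphi(\sigma_0, \ldots, \sigma_{m-1}, x)]\!] && \text{using Proposition 17.16.}
\end{align*}
Now suppose that $a$ is an upper bound for $X$, but $[\![\exists x \in \tau\,\varphi(\sigma_0, \ldots, \sigma_{m-1}, x)]\!] \cdot -a \ne 0$. Choose $p$ such that $e(p) \le [\![\exists x \in \tau\,\varphi(\sigma_0, \ldots, \sigma_{m-1}, x)]\!] \cdot -a$. Then by definition of $\Vdash^*$, $p \Vdash^* \exists x \in \tau\,\varphi(\sigma_0, \ldots, \sigma_{m-1}, x)$, so by 17.17(i), choose $q$ and $(\xi, r) \in \tau$ such that $q \le p$, $q \le r$, and $q \Vdash \varphi(\sigma_0, \ldots, \sigma_{m-1}, \xi)$. Hence
\[ e(q) \le e(r) \cdot [\![\varphi(\sigma_0, \ldots, \sigma_{m-1}, \xi)]\!] \le a; \]
but also $e(q) \le e(p) \le -a$, contradiction. So (i) holds.
Now (ii) follows from (i) and Proposition 17.16, by means of the following facts:
(1) If $\varphi$ holds in all models, then $[\![\varphi]\!] = 1$.
This is true by Proposition 17.16.
(2) If $\varphi \leftrightarrow \psi$ holds in all models, then $[\![\varphi]\!] = [\![\psi]\!]$.
(3) $\forall x \in \tau\,\varphi \leftrightarrow \neg\exists x \in \tau\,\neg\varphi$ holds in all models.
Using these facts,
\begin{align*}
[\![\forall x \in \tau\,\varphi(\sigma_0, \ldots, \sigma_{m-1}, x)]\!] &= [\![\neg\exists x \in \tau\,\neg\varphi(\sigma_0, \ldots, \sigma_{m-1}, x)]\!] \\
&= -[\![\exists x \in \tau\,\neg\varphi(\sigma_0, \ldots, \sigma_{m-1}, x)]\!] \\
&= -\sum_{(\xi,p)\in\tau} (e(p) \cdot -[\![\varphi(\sigma_0, \ldots, \sigma_{m-1}, \xi)]\!]) \\
&= \prod_{(\xi,p)\in\tau} (-e(p) + [\![\varphi(\sigma_0, \ldots, \sigma_{m-1}, \xi)]\!]).
\end{align*}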
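Now we are going to show that $M[G]$ is always a model of ZFC. To treat the pairing axiom, relations and functions, the following definitions are useful.
Let $P$ be any set, and let $\sigma$ and $\tau$ be $P$-names. Then we define
\[ \mathrm{up}(\sigma, \tau) = \{(\sigma, 1), (\tau, 1)\}; \qquad \mathrm{op}(\sigma, \tau) = \mathrm{up}(\mathrm{up}(\sigma, \sigma), \mathrm{up}(\sigma, \tau)). \]
Lemma 17.19. Suppose that $M$ is a c.t.m. of ZFC, $\mathbb{P} \in M$ is a quasi-order, and $G$ is $\mathbb{P}$-generic over $M$. Suppose that $\sigma, \tau \in M^P$. Then
(i) $\mathrm{up}(\sigma, \tau) \in M^P$ and $\mathrm{up}(\sigma, \tau)_G = \{\sigma_G, \tau_G\}$.
(ii) $\mathrm{op}(\sigma, \tau) \in M^P$ and $\mathrm{op}(\sigma, \tau)_G = (\sigma_G, \tau_G)$.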
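The two evaluations in Lemma 17.19 can be checked on small names. This sketch assumes the frozenset encoding of names used for illustration (it is not the text's formalism), writes the largest condition as the string `"1"`, and uses the Kuratowski pair `kpair` as the ordered pair; `up`, `op`, `kpair` and the sample names are hypothetical identifiers.

```python
def val(name, G):
    return frozenset(val(sigma, G) for (sigma, p) in name if p in G)

ONE = "1"   # stands for the largest element of the quasi-order

def up(sigma, tau):
    """Name for the unordered pair: {(sigma, 1), (tau, 1)}."""
    return frozenset({(sigma, ONE), (tau, ONE)})

def op(sigma, tau):
    """Name for the ordered pair: up(up(sigma, sigma), up(sigma, tau))."""
    return up(up(sigma, sigma), up(sigma, tau))

def kpair(x, y):
    """Kuratowski ordered pair {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

EMPTY = frozenset()
A = frozenset({(EMPTY, "p")})          # value depends on whether p is in G
SAMPLES = [EMPTY, A, frozenset({(A, ONE)})]

def lemma_17_19_holds(G):
    return all(
        val(up(s, t), G) == frozenset({val(s, G), val(t, G)})
        and val(op(s, t), G) == kpair(val(s, G), val(t, G))
        for s in SAMPLES for t in SAMPLES
    )

print(lemma_17_19_holds({"1", "p"}), lemma_17_19_holds({"1"}))
```

Tagging every pair with the largest condition is what makes the value independent of $G$ beyond the values of $\sigma$ and $\tau$ themselves.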
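Theorem 17.20. Let $M$ be a c.t.m. of ZFC, $\mathbb{P} \in M$ a quasi-order, and $G$ $\mathbb{P}$-generic over $M$. Then the relativization of each axiom of ZFC to $M[G]$ holds.
Proof. We are going to use the simplified procedure developed in Chapter 14 to check axioms.
Since $M[G]$ is transitive by Lemma 17.4, extensionality and foundation hold by Theorems 14.3 and 14.9.
Next we take comprehension, using Theorem 14.4. Let $\varphi(x, z, w_1, \ldots, w_n)$ be a formula with free variables among those mentioned, and let $\sigma, \tau_1, \ldots, \tau_n \in M^P$. Then by the comprehension axiom, the set
\[ A = \{a \in \sigma_G : \varphi^{M[G]}(a, \sigma_G, \tau_{1G}, \ldots, \tau_{nG})\} \]
exists, and it clearly suffices to show that this set is in $M[G]$. Let
\[ \theta = \{(\rho, p) \in \mathrm{dmn}(\sigma) \times P : p \Vdash (\rho \in \sigma \wedge \varphi(\rho, \sigma, \tau_1, \ldots, \tau_n))\}. \]
Thus $\theta \in M^P$. We claim that $\theta_G = A$, as desired. First suppose that $a \in \theta_G$. Then there is a $(\rho, p)$ as in the definition of $\theta$ such that $p \in G$ and $\rho_G = a$. Now $p \Vdash (\rho \in \sigma \wedge \varphi(\rho, \sigma, \tau_1, \ldots, \tau_n))$, so by the definition of $\Vdash$, $a = \rho_G \in \sigma_G$ and $\varphi(\rho_G, \sigma_G, \tau_{1G}, \ldots, \tau_{nG})$ holds in $M[G]$. Thus $a \in A$. Conversely, suppose that $a \in A$. Then there is a $(\rho, p) \in \sigma$ such that $p \in G$ and $a = \rho_G$. So $\rho_G \in \sigma_G \wedge \varphi(\rho_G, \sigma_G, \tau_{1G}, \ldots, \tau_{nG})$ holds in $M[G]$. By Theorem 17.15, choose $q \in G$ such that $q \Vdash (\rho \in \sigma \wedge \varphi(\rho, \sigma, \tau_1, \ldots, \tau_n))$. Thus $(\rho, q) \in \theta$, so $a = \rho_G \in \theta_G$, as desired.
For pairing, we apply Theorem 14.5. Suppose that $x, y \in M[G]$. Say $x = \sigma_G$ and $y = \tau_G$. Clearly $\mathrm{up}(\sigma, \tau)_G = \{x, y\}$.
For union, we apply Theorem 14.6. We need to take any element $\sigma_G$ of $M[G]$ and find $\tau_G$ in $M[G]$ such that $\bigcup \sigma_G \subseteq \tau_G$. Let $\tau = \bigcup \mathrm{dmn}(\sigma)$. Clearly $\tau \in M^P$. Suppose that $x \in y \in \sigma_G$. Then we can choose $(\rho, p) \in \sigma$ such that $p \in G$ and $y = \rho_G$; and we can choose $(\xi, q) \in \rho$ such that $q \in G$ and $\xi_G = x$. Thus $\rho \in \mathrm{dmn}(\sigma)$ and $(\xi, q) \in \rho$, so $(\xi, q) \in \tau$. It follows that $x = \xi_G \in \tau_G$, as desired.
Next comes the power set axiom. By Theorem 14.7 it suffices to take any $\sigma_G \in M[G]$ and find a $\tau \in M^P$ such that $\mathscr{P}(\sigma_G) \cap M[G] \subseteq \tau_G$. Let
\[ \tau = \{(\rho, 1) : \rho \in M^P \text{ and } \mathrm{dmn}(\rho) \subseteq \mathrm{dmn}(\sigma)\}. \]
Suppose that $x \in \mathscr{P}(\sigma_G) \cap M[G]$. So we can write $x = \mu_G$. In $M$, let
\[ \rho = \{(\xi, p) : \xi \in \mathrm{dmn}(\sigma) \text{ and } p \Vdash^* \xi \in \mu\}. \]
Since $\mathrm{dmn}(\rho) \subseteq \mathrm{dmn}(\sigma)$, we have $(\rho, 1) \in \tau$ and so $\rho_G \in \tau_G$. Hence it suffices to show that $\mu_G = \rho_G$.
Suppose that $y \in \mu_G$. Thus $y \in x \subseteq \sigma_G$. Say $y = \xi_G$ with $(\xi, q) \in \sigma$ for some $q \in G$. Now $\xi_G \in \mu_G$, so by Theorem 17.15 there is a $p \in G$ such that $(p \Vdash^* \xi \in \mu)^M$. So we have $(\xi, p) \in \rho$, and hence $y = \xi_G \in \rho_G$. This shows that $\mu_G \subseteq \rho_G$.
Conversely, suppose that $y \in \rho_G$. Hence we can write $y = \xi_G$ with $(\xi, p) \in \rho$ for some $p \in G$. By virtue of $(\xi, p) \in \rho$ we have $(p \Vdash^* \xi \in \mu)^M$, and so by Theorem 17.15 we have $y = \xi_G \in \mu_G$. Hence we have now fully shown that $\mu_G = \rho_G$.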
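Next we take care of the replacement axioms, using Theorem 14.8. Thus suppose that $\varphi$ is a formula with free variables among $x, y, A, w_1, \ldots, w_n$, we are given $A, w_1, \ldots, w_n \in M[G]$ such that
\[ (1) \qquad \forall x \in A\, \exists! y\,[y \in M[G] \wedge \varphi^{M[G]}(x, y, A, w_1, \ldots, w_n)], \]
and we want to find $Y \in M[G]$ such that
\[ (2) \qquad \{y \in M[G] : \exists x \in A\,\varphi^{M[G]}(x, y, A, w_1, \ldots, w_n)\} \subseteq Y. \]
Write $A = \tau_G$ and $w_i = \sigma_{iG}$ for each $i = 1, \ldots, n$. Now we work within $M$. Define
\[ B = \{(\rho, p) \in \mathrm{dmn}(\tau) \times P : \exists \mu \in M^P\,[p \Vdash^* \varphi(\rho, \mu, \tau, \sigma_1, \ldots, \sigma_n)]\}. \]
For each $(\rho, p) \in B$ let $\alpha(\rho, p)$ be an ordinal such that there is a $\mu \in V_{\alpha(\rho,p)} \cap M^P$ such that $p \Vdash^* \varphi(\rho, \mu, \tau, \sigma_1, \ldots, \sigma_n)$. Let $\beta = \sup_{(\rho,p)\in B} \alpha(\rho, p)$, and set
\[ \pi = \bigcup_{(\rho,p)\in B} \{(\mu, 1) : \mu \in V_\beta \cap M^P \text{ and } p \Vdash^* \varphi(\rho, \mu, \tau, \sigma_1, \ldots, \sigma_n)\}. \]
This ends our work within $M$. We claim that $Y \stackrel{\mathrm{def}}{=} \pi_G$ is as desired in (2). For, suppose that $y \in M[G]$ and $x \in A$ with $M[G] \models \varphi(x, y, A, w_1, \ldots, w_n)$. Write $y = \nu_G$ and $x = \rho_G$ with $\rho \in \mathrm{dmn}(\tau)$. So $M[G] \models \varphi(\rho_G, \nu_G, \tau_G, \sigma_{1G}, \ldots, \sigma_{nG})$. Hence by Theorem 17.15 there is a $p \in G$ such that $(p \Vdash^* \varphi(\rho, \nu, \tau, \sigma_1, \ldots, \sigma_n))^M$. It follows that $(\rho, p) \in B$. Choose $\mu \in V_\beta \cap M^P$ such that $p \Vdash^* \varphi(\rho, \mu, \tau, \sigma_1, \ldots, \sigma_n)$. Thus $M[G] \models \varphi(\rho_G, \mu_G, \tau_G, \sigma_{1G}, \ldots, \sigma_{nG})$ by Theorem 17.15. So $\mu_G \in \pi_G = Y$. Since we also have $M[G] \models \varphi(\rho_G, \nu_G, \tau_G, \sigma_{1G}, \ldots, \sigma_{nG})$, by (1) we get $y = \nu_G = \mu_G \in Y$, as desired in (2).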
So now we have checked ZF $-$ Inf. By Lemma 17.6, $\omega = \check{\omega}_G$, so $\omega \in M[G]$. So by Theorem 14.19, infinity holds in $M[G]$.
Thus the axioms of ZF hold. To prove that AC holds, we can take any equivalent of it. It is convenient to take the one given in the following statement:
(*) In ZF, AC is equivalent to the statement that for any set $X$ there exist an ordinal $\alpha$ and a function $f$ with domain $\alpha$ such that $X \subseteq \mathrm{rng}(f)$.
In fact, obviously AC implies the indicated statement. Now suppose that the statement holds, and we want to find a choice function for all of the nonempty subsets of some set $A$. Let $\alpha$ be an ordinal and $f$ a function with domain $\alpha$ such that $A \subseteq \mathrm{rng}(f)$. Then for any nonempty subset $B$ of $A$ we define $h(B) = f(\xi)$, where $\xi$ is the smallest ordinal less than $\alpha$ such that $f(\xi) \in B$. Thus (*) holds.
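The construction of the choice function $h$ from the enumeration $f$ can be illustrated in the finite case. In this sketch a Python list plays the role of a function whose domain is a finite ordinal; the name `choice_from_enumeration` and the sample data are hypothetical.

```python
def choice_from_enumeration(f):
    """Given a list f whose range covers A, return h with
    h(B) = f[xi] for the least index xi such that f[xi] is in B."""
    def h(B):
        for value in f:          # indices are visited in increasing order
            if value in B:
                return value
        raise ValueError("B is empty or not contained in the range of f")
    return h

f = [3, 1, 4, 1, 5, 9, 2, 6]     # an enumeration, repetitions allowed
h = choice_from_enumeration(f)

print(h({9, 4}), h({2, 6, 5}))
```

The choice is canonical because the ordinals below $\alpha$ are well-ordered: no appeal to AC is needed once $f$ is given.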
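Now assume that $x \in M[G]$; say $x = \sigma_G$. Then there is a function $f \in M$ with domain some ordinal $\alpha$ such that $\mathrm{dmn}(\sigma) = \mathrm{rng}(f)$. Now define
\[ \tau = \{\mathrm{op}(\check{\xi}, f(\xi)) : \xi < \alpha\} \times \{1\}. \]
Then $\tau \in M^P$, and $\tau_G = \{(\xi, (f(\xi))_G) : \xi < \alpha\}$. Hence $\tau_G$ is a function with domain $\alpha$. If $y \in x$, then there is a $(f(\xi), p) \in \sigma$ such that $p \in G$ and $y = (f(\xi))_G$. Hence $y = \tau_G(\xi)$. Thus $x \subseteq \mathrm{rng}(\tau_G)$.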
Theorem 17.21. Let $M$ be a c.t.m. of ZFC, $\mathbb{P} \in M$ a quasi-order, and $G$ $\mathbb{P}$-generic over $M$. Suppose that $G \notin M$. (See Lemma 17.2.) Then $M[G]$ models $\neg(V = L)$.
Proof. By Lemma 17.9, $M$ and $M[G]$ have the same ordinals. By absoluteness then, $L^{M[G]} = L^M \subseteq M$, so $M[G]$ has elements not in $L^{M[G]}$.
This gives our first application of forcing:
Theorem 17.22. Con(ZFC) implies Con(ZFC + $V \ne L$).
The following lemma is frequently useful.
Lemma 17.23. If $M$ is a c.t.m. of ZFC, $\mathbb{P} \in M$ is a quasi-order, and $\varphi(v_0, \ldots, v_{n-1})$ is a formula which is absolute for c.t.m. of ZFC, and $x_0, \ldots, x_{n-1} \in M$, then the following are equivalent:
(i) $1 \Vdash \varphi(\check{x}_0, \ldots, \check{x}_{n-1})$.
(ii) $p \Vdash \varphi(\check{x}_0, \ldots, \check{x}_{n-1})$ for some $p \in P$.
(iii) $\varphi(x_0, \ldots, x_{n-1})$ holds in $M$.
Proof. (i)$\Rightarrow$(ii): trivial. (ii)$\Rightarrow$(iii): let $G$ be $\mathbb{P}$-generic over $M$, with $p \in G$. Then $\varphi(x_0, \ldots, x_{n-1})$ holds in $M[G]$, and hence in $M$ by absoluteness. (iii)$\Rightarrow$(i): Let $G$ be $\mathbb{P}$-generic over $M$. Then $\varphi(x_0, \ldots, x_{n-1})$ holds in $M[G]$ by absoluteness. Hence (i) holds, since $G$ is arbitrary.
Exercises
E17.1. Let $I$ and $J$ be sets with $I$ infinite and $|J| > 1$, and let $\mathbb{P} = (P, \supseteq, \emptyset)$, where $P$ is the collection of all finite functions contained in $I \times J$ and $\supseteq$ is restricted to $P$. Show that $\mathbb{P}$ satisfies the condition of Lemma 17.2. [Instances of this partial order are used in following chapters.]
E17.2. Show that if the condition in the hypothesis of Lemma 17.2 fails, then there is a $\mathbb{P}$-generic filter $G$ over $M$ such that $G \in M$, and $G$ intersects every dense subset of $P$ (not only those in $M$). [Cf. Lemma 17.1.]
E17.3. Assume the hypothesis of Lemma 17.2. Show that there does not exist a $\mathbb{P}$-generic filter over $M$ which intersects every dense subset of $P$ (not only those which are in $M$). Hint: Take $G$ generic, and show that $\{p \in P : p \notin G\}$ is dense. Thus in the definition of generic filter, the condition on dense sets being in $M$ is necessary.
E17.4. Show that if $\mathbb{P}$ satisfies the condition of Lemma 17.2, then it has uncountably many dense subsets.
E17.5. Assume the hypothesis of Lemma 17.2. Show that there are $2^{\omega}$ filters which are $\mathbb{P}$-generic over $M$.
E17.6. Let $P$ be a one-element set. Prove that the collection of all $P$-names is a proper class.
E17.7. Prove in detail that the definition of $\check{x}$ is valid.
E17.8. Show how the recursion theorem can be used to justify the definition of [[ = ]].
E17.9. Prove that the following conditions are equivalent:
[[(0 , . . . , m1 ) (0 , . . . , m1 )]] = 1
[[(0 , . . . , m1 )]] = [[(0 , . . . , m1 )]].
E17.10. Prove that [[ = ]] [[ = ]] [[ = ]]. Hint: Let
R = {((, , ), ( , , )) : rank() < rank( ), rank( ) < rank( ), rank() < rank( )}.
242

Prove that [[ = ]] [[ = ]] [[ = ]] = 0 by induction on R.


E17.11. Show that p = iff the following two conditions hold.
(i) For every (, q) and every r p, q one has r .
(ii) For every (, q) and every r p, q one has r .
E17.12. Prove directly, without using 17.14 or 17.15, that the equivalence of exercise
E17.12 holds with replaced by .
E17.13. Assume that P M , p, q P , and p q. Show that { M P : p = } is a
proper class in M .
E17.14. Assume that P M is separative and p, q, r P . Prove that the following two
conditions are equivalent:
(i) p {({(, q)}, r)} = 1.
(ii) p r and p q.
E17.15. Suppose that $f : A \to M$ with $f \in M[G]$. Show that there is a $B \in M$ such that
$f : A \to B$. Hint: let $f = \tau_G$ and $B = \{b : \exists p \in P\,[p \Vdash \check{b} \in \mathrm{rng}(\tau)]\}$.
E17.16. Assume that $P \in M$ and $\kappa$ is a cardinal of $M$. Then for any $P$-generic $G$ over $M$
the following conditions are equivalent:
(1) For all $B \in M$, ${}^{\kappa}B \cap M = {}^{\kappa}B \cap M[G]$.
(2) ${}^{\kappa}M \cap M = {}^{\kappa}M \cap M[G]$.
Moreover, if (1) (and hence (2)) holds for every $G$, and $P$ is separative, then
(3) In $M$, the intersection of $\kappa$ dense open subsets of $P$ is dense.
Finally, if (3) holds, then (1) (and hence (2)) holds for every generic $G$.
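E17.17. Suppose that $P \in M$ is a quasi-order satisfying the condition of Lemma 17.2.
Assume that
$M = M_0 \subseteq M_1 \subseteq M_2 \subseteq \cdots \subseteq M_n \subseteq \cdots$ $(n \in \omega)$,
where $M_{n+1} = M_n[G_n]$ for some $G_n$ which is $P$-generic over $M_n$, for each $n \in \omega$. Show
that the power set axiom fails in $\bigcup_{n \in \omega} M_n$.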
E17.18. (i) Let $d$ be the two-place class function whose domain is the set of all pairs $(P, p)$
such that $P$ is a quasi-order and $p \in P$, with $d(P, p) = (P \downarrow p)$. Show that $d$ is absolute
for c.t.m. of ZFC.
(ii) Let cl be the two-place class function whose domain is the set of all pairs $(P, X)$
such that $P$ is a quasi-order and $X \subseteq P$, with $\mathrm{cl}(P, X) = \{q \in P : (P \downarrow q) \cap X \neq \emptyset\}$.
Thus this is a version of the closure operator on the space $P$ under the topology given in
Chapter 2. Show that cl is absolute for c.t.m. of ZFC.
(iii) Let int be the two-place class function whose domain is the set of all pairs $(P, X)$
such that $P$ is a quasi-order and $X \subseteq P$, with $\mathrm{int}(P, X) = \{q \in P : (P \downarrow q) \subseteq X\}$. Thus
this is a version of the interior operator on the space $P$ under the topology given in Chapter
2. Show that int is absolute for c.t.m. of ZFC.
(iv) Let $e$ be the two-place class function whose domain is the set of all pairs $(P, p)$
such that $P$ is a quasi-order and $p \in P$, with $e(P, p) = \mathrm{int}(\mathrm{cl}(P \downarrow p))$. Show that $e$ is
absolute for c.t.m. of ZFC.
(v) Let $M$ be a c.t.m. of ZFC, and suppose that $P \in M$ is a quasi-order. Show that
$(\mathrm{RO}(P))^M$ is a subalgebra of $\mathrm{RO}(P)$.
(vi) Let $M$ be a c.t.m. of ZFC, and suppose that $P \in M$ is a quasi-order. Suppose
that $X \subseteq (\mathrm{RO}(P))^M$ with $X \in M$. Then
\[ \sum\nolimits^{(\mathrm{RO}(P))^M} X = \sum\nolimits^{\mathrm{RO}(P)} X, \]
and similarly for products.


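E17.19. Suppose that $M$ is a c.t.m. of ZFC. Let $P \in M$ be a quasi-order. Show that for
any formula $\varphi(x_0, \ldots, x_{m-1})$ and any $\sigma_0, \ldots, \sigma_{m-1} \in M^P$,
$[[\varphi(\sigma_0, \ldots, \sigma_{m-1})]]^M = [[\varphi(\sigma_0, \ldots, \sigma_{m-1})]]$.
Hence $(p \Vdash \varphi(\sigma_0, \ldots, \sigma_{m-1}))^M$ iff $p \Vdash \varphi(\sigma_0, \ldots, \sigma_{m-1})$.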

18. Powers of regular cardinals


In this section we give the first main result using forcing: consistency of the negation of
the continuum hypothesis, in a general form where one can specify what the power of a
regular cardinal is in advance. It is rather easy to devise a partial order which adds many
subsets of $\omega$, or of other regular cardinals. More subtle is to make sure in doing this that
cardinals themselves do not change.
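In passing from $M$ to $M[G]$ it is possible that a cardinal in $M$ is no longer a cardinal
in $M[G]$. For example, let $P$ be the partially ordered set consisting of all finite functions
mapping a subset of $\omega$ into $\omega_1$, ordered by $\supseteq$. Suppose that $G$ is $P$-generic over $M$. Now
the following sets are dense:
\[ A_m \stackrel{\mathrm{def}}{=} \{f \in P : m \in \mathrm{dmn}(f)\} \quad\text{for each } m \in \omega, \]
\[ B_\alpha \stackrel{\mathrm{def}}{=} \{f \in P : \alpha \in \mathrm{rng}(f)\} \quad\text{for each } \alpha \in \omega_1^M. \]
In fact, given any $g \in P$, if $m \in \omega \setminus \mathrm{dmn}(g)$, then $g \cup \{(m, 0)\}$ is a member of $P$ which
is in $A_m$ and contains $g$; and given any $g \in P$ and $\alpha < \omega_1^M$, choose $m \in \omega \setminus \mathrm{dmn}(g)$; then
$g \cup \{(m, \alpha)\}$ is a member of $P$ which is in $B_\alpha$ and contains $g$. Now if $G$ is $P$-generic over $M$
and intersects all of the sets $A_m$ and $B_\alpha$, then clearly $\bigcup G$, which is a member of $M[G]$, is
a function mapping $\omega$ onto $\omega_1^M$. So $\omega_1^M$ gets collapsed to a countable ordinal in $M[G]$.
Note that $\omega^M = \omega^{M[G]}$ by absoluteness.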
Now we want to formulate precisely when cardinals are preserved, and give an important condition assuring that they are preserved.
Suppose that $M$ is a c.t.m. of ZFC, and $P \in M$ is a quasi-order. Suppose that $\kappa$ is a
cardinal of $M$. Then we give some definitions.
$P$ preserves cardinals $\geq \kappa$ iff for every $G$ which is $P$-generic over $M$ and every ordinal
$\alpha \geq \kappa$ in $M$, $\alpha$ is a cardinal in $M$ iff $\alpha$ is a cardinal in $M[G]$.
$P$ preserves cofinalities $\geq \kappa$ iff for every $G$ which is $P$-generic over $M$ and every limit
ordinal $\alpha$ in $M$ such that $(\mathrm{cf}(\alpha))^M \geq \kappa$, $(\mathrm{cf}(\alpha))^M = (\mathrm{cf}(\alpha))^{M[G]}$.
$P$ preserves regular cardinals $\geq \kappa$ iff for every $G$ which is $P$-generic over $M$ and every
ordinal $\alpha \geq \kappa$ which is a regular cardinal of $M$, $\alpha$ is also a regular cardinal of $M[G]$.
If $\kappa = \omega$, we say simply that $P$ preserves cardinals, cofinalities, or regular cardinals.
In these definitions, if we replace $\geq$ by $\leq$ we obtain new definitions which will be
used below also.
The relationship between these notions that we want to give uses the following fact
about cofinalities which may not be familiar.
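Lemma 18.1. Suppose that $\alpha$ is a limit ordinal, $\kappa$ and $\lambda$ are regular cardinals,
$f : \kappa \to \alpha$ is strictly increasing with $\mathrm{rng}(f)$ cofinal in $\alpha$, and $g : \lambda \to \alpha$ is strictly
increasing with $\mathrm{rng}(g)$ cofinal in $\alpha$. Then $\kappa = \lambda$.
Proof. Suppose not; say by symmetry $\kappa < \lambda$. For each $\xi < \kappa$ choose $\eta_\xi < \lambda$ such
that $f(\xi) < g(\eta_\xi)$. Let $\rho = \sup_{\xi < \kappa} \eta_\xi$. Thus $\rho < \lambda$ by the regularity of $\lambda$. But then
$f(\xi) < g(\eta_\xi) \leq g(\rho) < \alpha$ for all $\xi < \kappa$, contradicting the cofinality of $\mathrm{rng}(f)$ in $\alpha$.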
Proposition 18.2. Let $M$ be a c.t.m. of ZFC, $P \in M$ be a quasi-order, and $\kappa$ be a
cardinal of $M$.
(i) If $P$ preserves regular cardinals $\geq \kappa$, then it preserves cofinalities $\geq \kappa$.
(ii) If $P$ preserves cofinalities $\geq \kappa$, and $\kappa$ is regular, then $P$ preserves cardinals $\geq \kappa$.
Proof. (i): Let $\alpha$ be a limit ordinal of $M$ with $(\mathrm{cf}(\alpha))^M \geq \kappa$. Then $(\mathrm{cf}(\alpha))^M$ is a
regular cardinal of $M$ which is $\geq \kappa$ and hence is also a regular cardinal of $M[G]$. Now we
can apply Lemma 18.1 within $M[G]$ to $\kappa = (\mathrm{cf}(\alpha))^M$ and $\lambda = (\mathrm{cf}(\alpha))^{M[G]}$ to infer that
$(\mathrm{cf}(\alpha))^M = (\mathrm{cf}(\alpha))^{M[G]}$.
(ii): Suppose that cardinals $\geq \kappa$ are not preserved, and let $\lambda$ be the least cardinal of
$M$ which is $\geq \kappa$ but which is not a cardinal of $M[G]$. If $\lambda$ is regular in $M$, then
$\lambda = (\mathrm{cf}(\lambda))^M = (\mathrm{cf}(\lambda))^{M[G]}$,
and so $\lambda$ is a regular cardinal in $M[G]$, contradiction. If $\lambda$ is singular in $M$, then it is
greater than $\kappa$ since $\kappa$ is regular in $M$, and so by the minimality of $\lambda$ it is the supremum
of cardinals in $M[G]$, and so it is a cardinal in $M[G]$, contradiction.
We can replace $\geq$ by $\leq$ in this proposition and its proof; call this new statement
Proposition 18.2$'$. The very last part of the proof of 18.2 can be simplified for $\leq$, and
actually one does not need to assume that $\kappa$ is regular in this case.
A quasi-order $P$ satisfies the $\kappa$-chain condition, abbreviated $\kappa$-c.c., iff every antichain in $P$
has size less than $\kappa$.
The following theorem is very useful in forcing arguments.
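Theorem 18.3. Let $M$ be a c.t.m. of ZFC, $P \in M$ be a quasi-order, $\kappa$ be a cardinal of
$M$, $G$ be $P$-generic over $M$, and suppose that $P$ satisfies the $\kappa$-c.c. Suppose that $f \in M[G]$,
$A, B \in M$, and $f : A \to B$. Then there is an $F : A \to \mathscr{P}(B)$ with $F \in M$ such that:
(i) $f(a) \in F(a)$ for all $a \in A$.
(ii) $(|F(a)| < \kappa)^M$ for all $a \in A$.
Proof. Let $\tau \in M^P$ be such that $\tau_G = f$. Thus the statement $\tau_G : A \to B$ holds
in $M[G]$. Hence by Theorem 3.15 there is a $p \in G$ such that
\[ p \Vdash \tau : \check{A} \to \check{B}. \]
Now for each $a \in A$ let
\[ F(a) = \{b \in B : \text{there is a } q \leq p \text{ such that } q \Vdash \mathrm{op}(\check{a}, \check{b}) \in \tau\}. \]
To prove (i), suppose that $a \in A$. Let $b = f(a)$. Thus $(a, b) \in f$, so by Theorem 3.15
there is an $r \in G$ such that $r \Vdash \mathrm{op}(\check{a}, \check{b}) \in \tau$. Let $q \in G$ with $q \leq p, r$. Then $q$ shows that
$b \in F(a)$.
To prove (ii), again suppose that $a \in A$. By the axiom of choice in $M$, there is a
function $Q : F(a) \to P$ such that for any $b \in F(a)$, $Q(b) \leq p$ and $Q(b) \Vdash \mathrm{op}(\check{a}, \check{b}) \in \tau$.
(1) If $b, b' \in F(a)$ and $b \neq b'$, then $Q(b) \perp Q(b')$.
In fact, suppose that $r \leq Q(b), Q(b')$. Then
(2) $r \Vdash \mathrm{op}(\check{a}, \check{b}) \in \tau \wedge \mathrm{op}(\check{a}, \check{b}') \in \tau$;
but also $r \leq Q(b) \leq p$, so $r \Vdash \tau : \check{A} \to \check{B}$, hence
\[ r \Vdash \forall x, y, z\,[\mathrm{op}(x, y) \in \tau \wedge \mathrm{op}(x, z) \in \tau \to y = z] \]
and hence
\[ r \Vdash \mathrm{op}(\check{a}, \check{b}) \in \tau \wedge \mathrm{op}(\check{a}, \check{b}') \in \tau \to \check{b} = \check{b}'. \]
It follows from (2) and Theorem 3.13(xvi) that $r \Vdash \check{b} = \check{b}'$. Thus if $H$ is $P$-generic over $M$
and $r \in H$, then $b = \check{b}_H = \check{b}'_H = b'$ by Lemma 3.6, contradiction. Thus (1) holds.
By (1), $\langle Q(b) : b \in F(a) \rangle$ is a one-one function onto an antichain of $P$. Hence
$(|F(a)| < \kappa)^M$ by the $\kappa$-cc.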
Proposition 18.4. If $M$ is a c.t.m. of ZFC, $\kappa$ is a cardinal of $M$, and $P \in M$
satisfies $\kappa$-cc in $M$, then $P$ preserves regular cardinals $\geq \kappa$, and also preserves cofinalities
$\geq \kappa$. If also $\kappa$ is regular in $M$, then $P$ preserves cardinals $\geq \kappa$.
Proof. First we want to show that if $\lambda \geq \kappa$ is regular in $M$ then also $\lambda$ is regular in
$M[G]$ (and hence is a cardinal of $M[G]$). Suppose that this is not the case. Hence in $M[G]$
there is an $\alpha < \lambda$ and a function $f : \alpha \to \lambda$ such that the range of $f$ is cofinal in $\lambda$. Recall
from Theorem 3.9 that $M$ and $M[G]$ have the same ordinals. Thus $\alpha \in M$. By Theorem
18.3, let $F : \alpha \to \mathscr{P}(\lambda)$ with $F \in M$ be such that $f(\xi) \in F(\xi)$ and $(|F(\xi)| < \kappa)^M$ for all $\xi < \alpha$. Let
$S = \bigcup_{\xi < \alpha} F(\xi)$. Then $S \in M$ is a subset of $\lambda$ which is cofinal in $\lambda$ and has size less than $\lambda$ in $M$,
contradicting the regularity of $\lambda$ in $M$.
The rest of the proposition follows from Proposition 18.2.
Next, for any sets $I, J$ let
\[ \mathrm{fin}(I, J) = \{f : f \text{ is a finite function and } f \subseteq I \times J\}. \]
We consider this as a partial order under $\supseteq$; it has a largest element $\emptyset$.
By the countable chain condition, abbreviated ccc, we mean the $\omega_1$-chain condition.
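The order on fin(I, J) can be sketched concretely. The following is a minimal Python illustration, not part of the original notes: conditions are modeled as dicts, and the helper names `extends`, `compatible` and `common_extension` are hypothetical. It encodes that p is below q iff p contains q, and that two conditions are compatible exactly when they agree on their common domain, in which case their union is a common extension.

```python
def extends(p: dict, q: dict) -> bool:
    """p <= q in fin(I, J): p contains q as a set of ordered pairs."""
    return all(k in p and p[k] == v for k, v in q.items())


def compatible(p: dict, q: dict) -> bool:
    """p and q are compatible iff they agree on dmn(p) & dmn(q)."""
    return all(p[k] == q[k] for k in p.keys() & q.keys())


def common_extension(p: dict, q: dict) -> dict:
    """The union of p and q, a condition below both when they are compatible."""
    if not compatible(p, q):
        raise ValueError("incompatible conditions")
    return {**p, **q}
```

The empty dict plays the role of the largest element: every condition extends it.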
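Lemma 18.5. If $I$ is arbitrary and $J$ is countable, then $\mathrm{fin}(I, J)$ has the ccc.
Proof. Suppose that $\mathscr{F} \subseteq \mathrm{fin}(I, J)$ is uncountable. Since for each finite $F \subseteq I$ there
are only countably many members of $\mathscr{F}$ with domain $F$, it is clear that $\{\mathrm{dmn}(f) : f \in \mathscr{F}\}$
is an uncountable collection of finite sets. By the $\Delta$-system lemma, there is an uncountable
subset of this collection which forms a $\Delta$-system, say with root $R$; let $\mathscr{G}$ be an uncountable
subset of $\mathscr{F}$ whose members have pairwise distinct domains lying in this $\Delta$-system. Then
\[ \mathscr{G} = \bigcup_{k \in {}^{R}J} \{f \in \mathscr{G} : f \restriction R = k\}; \]
since ${}^{R}J$ is countable, there is a $k \in {}^{R}J$ such that
\[ \mathscr{H} \stackrel{\mathrm{def}}{=} \{f \in \mathscr{G} : f \restriction R = k\} \]
is uncountable. Clearly $f$ and $g$ are compatible for any $f, g \in \mathscr{H}$: if $f \neq g$ then
$\mathrm{dmn}(f) \cap \mathrm{dmn}(g) = R$ and $f \restriction R = g \restriction R$, so $f \cup g$ is a member of $\mathrm{fin}(I, J)$ extending both.
Hence $\mathscr{F}$ is not an antichain.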
Theorem 18.6. (Cohen) Let $M$ be a c.t.m. of ZFC. Suppose that $\kappa$ is any infinite cardinal
of $M$. Let $P$ be the partial order $\mathrm{fin}(\kappa, 2)$ ordered by $\supseteq$, and let $G$ be $P$-generic over $M$.
Then $M[G]$ has the same cofinalities and cardinals as $M$, and $2^{\omega} \geq \kappa$ in $M[G]$.
Proof. By Lemma 18.5 and Proposition 18.4, $P$ preserves cofinalities and cardinals, so $M[G]$ has the same cofinalities and cardinals as $M$. Thus we just need to exhibit
$\kappa$ different members of $\mathscr{P}(\omega)$ in $M[G]$.
Let $g = \bigcup G$. Since any two members of $G$ are compatible, $g$ is a function.
(1) For each $\alpha \in \kappa$, the set $\{f \in \mathrm{fin}(\kappa, 2) : \alpha \in \mathrm{dmn}(f)\}$ is dense in $P$ (and it is a member
of $M$).
In fact, given $f \in \mathrm{fin}(\kappa, 2)$, either $f$ is already in the above set, or else $\alpha \notin \mathrm{dmn}(f)$ and
then $f \cup \{(\alpha, 0)\}$ is an extension of $f$ which is in that set. So (1) holds.
Since $G$ intersects each set (1), it follows that $g$ maps $\kappa$ into 2. Let (in $M$) $h : \kappa \times \omega \to \kappa$
be a bijection. For each $\alpha < \kappa$ let $a_\alpha = \{m \in \omega : g(h(\alpha, m)) = 1\}$. We claim that $a_\alpha \neq a_\beta$
for distinct $\alpha, \beta$; this will give our result. The set
\[ \{f \in \mathrm{fin}(\kappa, 2) : \text{there is an } m \in \omega \text{ such that } h(\alpha, m), h(\beta, m) \in \mathrm{dmn}(f) \text{ and } f(h(\alpha, m)) \neq f(h(\beta, m))\} \]
is dense in $P$ (and it is in $M$). In fact, let distinct $\alpha$ and $\beta$ be given, and suppose that
$f \in \mathrm{fin}(\kappa, 2)$. Now $\{m : h(\alpha, m) \in \mathrm{dmn}(f) \text{ or } h(\beta, m) \in \mathrm{dmn}(f)\}$ is finite, so choose $m \in \omega$ not in this
set. Thus $h(\alpha, m), h(\beta, m) \notin \mathrm{dmn}(f)$. Let $f' = f \cup \{(h(\alpha, m), 0), (h(\beta, m), 1)\}$. Then $f'$ extends
$f$ and is in the above set, as desired.
It follows that $G$ contains a member of this set. Hence $a_\alpha \neq a_\beta$.
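The density argument at the end of this proof can be traced in a small Python sketch. It is an illustration under assumptions not made in the notes: kappa is taken to be omega, the bijection h is replaced by the Cantor pairing function, and the names `pair` and `separate` are hypothetical. Given a finite condition f and distinct indices alpha, beta, it finds an unused m and extends f so that the extension decides that a_alpha and a_beta differ at m.

```python
def pair(a: int, m: int) -> int:
    """Cantor pairing, a bijection from omega x omega onto omega (stands in for h)."""
    return (a + m) * (a + m + 1) // 2 + m


def separate(f: dict, alpha: int, beta: int) -> dict:
    """Extend the finite condition f so that it forces a_alpha != a_beta."""
    if alpha == beta:
        raise ValueError("alpha and beta must be distinct")
    m = 0
    # Only finitely many m are blocked, because f is finite and pair is one-to-one.
    while pair(alpha, m) in f or pair(beta, m) in f:
        m += 1
    g = dict(f)
    g[pair(alpha, m)] = 0
    g[pair(beta, m)] = 1
    return g
```

The returned dict contains f, so it is an extension in the sense of the order by reverse inclusion.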
The method of proof of Theorem 18.6 is called Cohen forcing.
Theorem 18.7. (Cohen) If ZFC is consistent, then so is ZFC + $\neg$CH.
Proof. Apply Theorem 18.6 with $\kappa$ a cardinal of $M$ greater than $\omega_1^M$.
The rest of this chapter is concerned with obtaining more exact versions of Theorems 18.6
and 18.7, and generalizing to powers of any regular cardinal. To obtain upper bounds on
the size of powers the following concept will be used.
Suppose that $P$ is a quasi-order and $\sigma \in V^P$. A nice name for a subset of $\sigma$ is a
member of $V^P$ of the form
\[ \bigcup_{\pi \in \mathrm{dmn}(\sigma)} (\{\pi\} \times A_\pi), \]
where each $A_\pi$ is an antichain in $P$.
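Proposition 18.8. Let $M$ be a c.t.m. of ZFC, $P \in M$ a quasi-order, and $\sigma \in M^P$.
(i) For any $\mu \in M^P$ there is a nice name $\tau \in M^P$ for a subset of $\sigma$ such that
($\star$) $1 \Vdash \tau = \mu \cap \sigma$.
(ii) If $G$ is $P$-generic over $M$ and $a \subseteq \sigma_G$ in $M[G]$, then $a = \tau_G$ for some nice name
$\tau$ for a subset of $\sigma$.
Proof. Assume the hypotheses of the proposition.
(i): Assume also that $\mu \in M^P$. For each $\pi \in \mathrm{dmn}(\sigma)$ let $A_\pi \subseteq P$ be such that
(1) $p \Vdash (\pi \in \mu \wedge \pi \in \sigma)$ for all $p \in A_\pi$.
(2) $A_\pi$ is an antichain of $P$.
(3) $A_\pi$ is maximal with respect to (1) and (2).
Moreover, we do this definition inside $M$, so that $\langle A_\pi : \pi \in \mathrm{dmn}(\sigma) \rangle \in M$. Now let
\[ \tau = \bigcup_{\pi \in \mathrm{dmn}(\sigma)} (\{\pi\} \times A_\pi). \]
To prove ($\star$), suppose that $G$ is $P$-generic over $M$; we want to show that $\tau_G = \mu_G \cap \sigma_G$.
First suppose that $a \in \mu_G \cap \sigma_G$. Choose $(\pi, p) \in \sigma$ such that $p \in G$ and $a = \pi_G$. By
Lemma 3.13(iv), $p \Vdash \pi \in \sigma$.
(4) $A_\pi \cap G \neq \emptyset$.
For, suppose that $A_\pi \cap G = \emptyset$. By Lemma 3.10(i), there is a $q \in G$ such that $q \perp r$ for all
$r \in A_\pi$. Now since $\pi_G \in \mu_G$, by Theorem 3.15 there is a $q' \in G$ such that $q' \Vdash \pi \in \mu$. Let
$r \in G$ with $r \leq p, q, q'$. Then $r \Vdash (\pi \in \mu \wedge \pi \in \sigma)$. It follows that $A_\pi \cup \{r\}$ is an antichain,
contradicting (3). Thus (4) holds.
By (4), take $q \in A_\pi \cap G$. Then $(\pi, q) \in \tau$ and $q \in G$, so $a = \pi_G \in \tau_G$. Thus we have
shown that $\mu_G \cap \sigma_G \subseteq \tau_G$.
Now suppose that $a \in \tau_G$. Choose $(\pi, p) \in \tau$ such that $p \in G$ and $a = \pi_G$. Thus
$p \in A_\pi$, so by (1), $p \Vdash (\pi \in \mu \wedge \pi \in \sigma)$. By the definition of forcing, $a = \pi_G \in \mu_G \cap \sigma_G$.
This shows that $\tau_G \subseteq \mu_G \cap \sigma_G$. Hence $\tau_G = \mu_G \cap \sigma_G$.
(ii): Assume the hypotheses of (ii). Write $a = \mu_G$. Taking $\tau$ as in (i), we have
$a = \mu_G = \mu_G \cap \sigma_G = \tau_G$, as desired.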
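Proposition 18.9. Suppose that $M$ is a c.t.m. of ZFC, and in $M$, $P$ is a quasi-order,
$|P| = \kappa$, $P$ has the $\lambda$-cc, and $\mu$ is an infinite cardinal. Suppose that $G$ is $P$-generic
over $M$. Then there is a function in $M[G]$ mapping $((\kappa^{<\lambda})^{\mu})^M$ onto a set containing
$\mathscr{P}(\mu)^{M[G]}$.
Proof. We do some calculations in $M$. Each antichain in $P$ has size less than $\lambda$, so there
are at most $\kappa^{<\lambda}$ antichains in $P$. Since $\mathrm{dmn}(\check{\mu})$ has size $\mu$, we thus have at most
$\nu \stackrel{\mathrm{def}}{=} (\kappa^{<\lambda})^{\mu}$ nice names for subsets of $\check{\mu}$.
Let $\langle \tau_\alpha : \alpha < \nu \rangle$ enumerate all of these names. Define
\[ \pi = \{(\mathrm{op}(\check{\alpha}, \tau_\alpha), 1) : \alpha < \nu\}. \]
Now $\pi_G$ is a function. For, if $x \in \pi_G$, then there is an $\alpha < \nu$ such that $x = (\alpha, (\tau_\alpha)_G)$,
by Lemma 3.19. Thus $\pi_G$ is a relation. Now suppose that $(x, y), (x, z) \in \pi_G$. Then there
exist $\alpha, \beta < \nu$ such that $(x, y) = (\alpha, (\tau_\alpha)_G)$ and $(x, z) = (\beta, (\tau_\beta)_G)$. Hence $\alpha = \beta$ and
$y = z$. Clearly the domain of $\pi_G$ is $\nu$. By Proposition 18.8, $\mathscr{P}(\mu) \subseteq \mathrm{rng}(\pi_G)$ in $M[G]$, as
desired.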
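Now we can prove a more precise version of Theorem 18.6.
Theorem 18.10. (Solovay) Let $M$ be a c.t.m. of ZFC. Suppose that $\kappa$ is a cardinal
of $M$ such that $\kappa^{\omega} = \kappa$. Let $P$ be the partial order $\mathrm{fin}(\kappa, 2)$ ordered by $\supseteq$, and let $G$ be
$P$-generic over $M$. Then $M[G]$ has the same cofinalities and cardinals as $M$, and $2^{\omega} = \kappa$
in $M[G]$.
Moreover, if $\lambda$ is any infinite cardinal in $M$, then $(2^{\lambda})^{M[G]} \leq (\kappa^{\lambda})^M$.
Proof. By Theorem 18.6, $P$ preserves cofinalities and cardinals, so
$M[G]$ has the same cofinalities and cardinals as $M$, and $2^{\omega} \geq \kappa$ in $M[G]$.
Note that $|\mathrm{fin}(\kappa, 2)| = \kappa$ in $M$, and $\mathrm{fin}(\kappa, 2)$ has the $\omega_1$-cc by Lemma 18.5. Hence by Proposition 18.9, for any infinite cardinal
$\lambda$ of $M$ we have
\[ (2^{\lambda})^{M[G]} \leq ((\kappa^{<\omega_1})^{\lambda})^M = (\kappa^{\lambda})^M. \]
Applying this to $\lambda = \omega$ we get $2^{\omega} = \kappa$ in $M[G]$.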
By assuming that the ground model satisfies GCH, which is consistent by the theory of
constructible sets, we can obtain a sharper result.
Corollary 18.11. Suppose that $M$ is a c.t.m. of ZFC + GCH. Suppose that $\kappa$ is an
uncountable regular cardinal of $M$. Let $P$ be the partial order $\mathrm{fin}(\kappa, 2)$ ordered by $\supseteq$, and
let $G$ be $P$-generic over $M$. Then $M[G]$ has the same cofinalities and cardinals as $M$, and
$2^{\omega} = \kappa$ in $M[G]$.
Moreover, for any infinite cardinal $\lambda$ of $M$ we have
\[ (2^{\lambda})^{M[G]} = \begin{cases} \kappa & \text{if } \lambda < \kappa, \\ \lambda^{+} & \text{if } \kappa \leq \lambda. \end{cases} \]
Proof. By GCH we have $\kappa^{\omega} = \kappa$. Hence the hypothesis of Theorem 18.10 holds, and
the conclusion follows using GCH in $M$.
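As a worked instance of this corollary (an illustration, not taken from the notes), assume $M$ satisfies GCH and take $\kappa = \omega_2^M$. The two cases of the displayed formula then give, in $M[G]$:

```latex
\[
(2^{\aleph_0})^{M[G]} = (2^{\aleph_1})^{M[G]} = \aleph_2,
\qquad
(2^{\aleph_n})^{M[G]} = \aleph_{n+1} \quad (2 \le n < \omega).
\]
```

So CH fails in $M[G]$, while the GCH pattern is untouched from $\aleph_2$ on.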
We give several more specific corollaries.
Corollary 18.12. If ZFC is consistent, then so is each of the following:
(i) ZFC + $2^{\aleph_0} = \aleph_2$.
(ii) ZFC + $2^{\aleph_0} = \aleph_{203}$.
(iii) ZFC + $2^{\aleph_0} = \aleph_{\omega_1}$.
(iv) ZFC + $2^{\aleph_0} = \aleph_{\omega_4}$.
Corollary 18.13. If ZFC + "there is an uncountable regular limit cardinal" is consistent, so is ZFC + "$2^{\omega}$ is a regular limit cardinal".
Corollary 18.14. Suppose that $M$ is a c.t.m. of ZFC. Then there is a generic
extension $M[G]$ such that in it, $2^{\omega} = ((2^{\omega})^{+})^M$.
Since clearly $((2^{\omega})^{+})^{\omega} = (2^{\omega})^{+}$ in $M$, this is immediate from Theorem 18.10.
Now we turn to powers of regular uncountable cardinals, where similar results hold. But
to preserve cardinals, we need a new idea, given in the following definition.
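Let $\lambda$ be an infinite cardinal. A quasi-order $P = (P, \leq, 1)$ is $\lambda$-closed iff for all $\gamma < \lambda$
and every system $\langle p_\xi : \xi < \gamma \rangle$ of elements of $P$ such that $p_\eta \leq p_\xi$ whenever $\xi < \eta < \gamma$,
there is a $q \in P$ such that $q \leq p_\xi$ for all $\xi < \gamma$.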
The importance of this notion for generic extensions comes about because of the
following theorem, which is similar to Theorem 18.3.
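Theorem 18.15. Suppose that $M$ is a c.t.m. of ZFC, $P \in M$ is a quasi-order, $\lambda$ is
a cardinal of $M$, $P$ is $\lambda$-closed, $A, B \in M$, and $|A| < \lambda$. Suppose that $G$ is $P$-generic over
$M$ and $f \in M[G]$ with $f : A \to B$. Then $f \in M$.
Proof. It suffices to prove this when $A$ is an ordinal. For, suppose that this special
case has been shown, and now suppose that $A$ is arbitrary. In $M$, let $j$ be a bijection from
$\alpha \stackrel{\mathrm{def}}{=} |A|^M$ onto $A$. Then $f \circ j : \alpha \to B$, so $f \circ j \in M$ by the special case. Hence $f \in M$.
So now we assume that $A = \alpha$, an ordinal less than $\lambda$. Let $K = ({}^{\alpha}B)^M$. Let $f = \tau_G$.
We want to show that $f \in K$, for then $f \in M$. Suppose not. Thus $\tau_G : \alpha \to B$ and $\tau_G \notin K = \check{K}_G$, so by
Theorem 3.15 there is a $p \in G$ such that
(1) $p \Vdash \tau : \check{\alpha} \to \check{B} \wedge \tau \notin \check{K}$.
For a while we work entirely in $M$. We will define sequences $\langle p_\eta : \eta \leq \alpha \rangle$ of elements of $P$
and $\langle z_\eta : \eta < \alpha \rangle$ of elements of $B$ by recursion, so that the following conditions hold:
(2) $p_0 = p$.
(3) $p_\eta \leq p_\xi$ if $\xi < \eta$.
(4) $p_{\eta+1} \Vdash \tau(\check{\eta}) = \check{z}_\eta$.
Of course we start out by defining $p_0 = p$, so that (2) holds. Now suppose that $p_\eta$ has
been defined so that (2)-(4) hold; we define $p_{\eta+1}$. Since $p_\eta \leq p_0 = p$, by (1) we have
\[ p_\eta \Vdash \tau : \check{\alpha} \to \check{B}, \]
hence
\[ p_\eta \Vdash \exists x \in \check{B}\,[\tau(\check{\eta}) = x], \]
so by Proposition 3.17(iii) we get
\[ p_\eta \Vdash \exists x\,(x \in \check{B} \wedge \tau(\check{\eta}) = x). \]
Hence by Proposition 3.17(ii) there is a $p_{\eta+1} \leq p_\eta$ and a $z_\eta \in B$ such that $p_{\eta+1} \Vdash \tau(\check{\eta}) = \check{z}_\eta$.
This takes care of (3) and (4) in the successor case. For $\eta$ limit, $p_\eta$ is given by the definition
of $\lambda$-closed.
Note that the function $z$ defined in this way is in $K$.
This finishes our argument within $M$. Now let $H$ be $P$-generic over $M$ with $p_\alpha \in H$.
Then $\tau_H(\eta) = z_\eta$ for each $\eta < \alpha$ by (4), so that $\tau_H = z \in K$. This contradicts (1), since
$p_\alpha \leq p$.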
Proposition 18.16. Suppose that $M$ is a c.t.m. of ZFC, $P \in M$ is a quasi-order,
$\lambda$ is a cardinal of $M$, and $P$ is $\lambda$-closed. Then $P$ preserves cofinalities and cardinals $\leq \lambda$.
Proof. Otherwise, by Proposition 18.2$'$ there is a regular cardinal $\kappa \leq \lambda$ of $M$ which
is not regular in $M[G]$. Thus there exist in $M[G]$ an ordinal $\alpha < \kappa$ and a function $f : \alpha \to \kappa$
such that $\mathrm{rng}(f)$ is cofinal in $\kappa$. By Theorem 18.15, $f \in M$, contradiction.
To proceed we need some elementary facts about cardinals. For cardinals $\kappa, \lambda$ we define
\[ \kappa^{<\lambda} = \sup_{\alpha < \lambda} |{}^{\alpha}\kappa|. \]
Note here that the supremum is over all ordinals less than $\lambda$, not only cardinals.
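Proposition 18.17. Let $\kappa$ and $\lambda$ be cardinals with $\kappa \geq 2$ and $\lambda$ infinite and regular.
Then $(\kappa^{<\lambda})^{<\lambda} = \kappa^{<\lambda}$.
Proof. Clearly $\geq$ holds. For $\leq$, by the fact that $\lambda \cdot \lambda = \lambda$ it suffices to find an injection
from
(1) $\bigcup_{\alpha < \lambda} {}^{\alpha}\Bigl(\bigcup_{\beta < \lambda} {}^{\beta}\kappa\Bigr)$
into
(2) $\bigcup_{\alpha, \beta < \lambda} {}^{\alpha \times \beta}(\kappa + 1)$.
Let $x$ be a member of (1), and choose $\alpha < \lambda$ accordingly. Then for each $\xi < \alpha$ there is a
$\beta_{x,\xi} < \lambda$ such that $x(\xi) \in {}^{\beta_{x,\xi}}\kappa$. Let $\gamma_x = \sup_{\xi < \alpha} \beta_{x,\xi}$. Then $\gamma_x < \lambda$ by the regularity of
$\lambda$. We now define $f(x)$ with domain $\alpha \times \gamma_x$ by setting, for any $\xi < \alpha$ and $\eta < \gamma_x$,
\[ (f(x))(\xi, \eta) = \begin{cases} (x(\xi))(\eta) & \text{if } \eta < \beta_{x,\xi}, \\ \kappa & \text{otherwise.} \end{cases} \]
Then $f$ is one-one. In fact, suppose that $f(x) = f(y)$. Let the domain of $f(x)$ be $\alpha \times \gamma_x$
as above. Suppose that $\xi < \alpha$. If $\beta_{x,\xi} \neq \beta_{y,\xi}$, say $\beta_{x,\xi} < \beta_{y,\xi}$. Then $\gamma_x = \gamma_y \geq \beta_{y,\xi} > \beta_{x,\xi}$,
and $(f(x))(\xi, \beta_{x,\xi}) = \kappa$ while $(f(y))(\xi, \beta_{x,\xi}) < \kappa$, contradiction. Hence $\beta_{x,\xi} = \beta_{y,\xi}$. Finally,
take any $\eta < \beta_{x,\xi}$. Then
\[ (x(\xi))(\eta) = (f(x))(\xi, \eta) = (f(y))(\xi, \eta) = (y(\xi))(\eta); \]
it follows that $x = y$.
Now the $\leq$ direction follows.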
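Proposition 18.18. For any cardinals $\kappa, \lambda$, $|[\kappa]^{<\lambda}| \leq \kappa^{<\lambda}$.
Proof. For each cardinal $\mu < \lambda$ define $f : {}^{\mu}\kappa \to [\kappa]^{\leq \mu} \setminus \{\emptyset\}$ by setting $f(x) = \mathrm{rng}(x)$
for any $x \in {}^{\mu}\kappa$. Clearly $f$ is an onto map. It follows that $|[\kappa]^{\leq \mu}| \leq |{}^{\mu}\kappa| \leq \kappa^{<\lambda}$. Hence
\[ |[\kappa]^{<\lambda}| = \Bigl|\bigcup_{\substack{\mu < \lambda, \\ \mu \text{ a cardinal}}} [\kappa]^{\leq \mu}\Bigr| \leq \sum_{\substack{\mu < \lambda, \\ \mu \text{ a cardinal}}} |[\kappa]^{\leq \mu}| \leq \sum_{\substack{\mu < \lambda, \\ \mu \text{ a cardinal}}} \kappa^{<\lambda} = \kappa^{<\lambda}. \]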
We now define a new partial order for the remaining forcing results of this section. For
any sets $I, J$ and infinite cardinal $\lambda$,
\[ \mathrm{Fn}(I, J, \lambda) = \{f : f \text{ is a function, } f \subseteq I \times J, \text{ and } |f| < \lambda\}. \]
We consider this as a partial order under $\supseteq$; the greatest element is again $\emptyset$. We claim
that this partial order has the $(|J|^{<\lambda})^{+}$-chain condition.
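Lemma 18.19. If $I, J$ are sets and $\lambda$ is an infinite cardinal, then $\mathrm{Fn}(I, J, \lambda)$ has the
$(|J|^{<\lambda})^{+}$-cc.
Proof. Let $\theta = (|J|^{<\lambda})^{+}$, and suppose that $\{p_\xi : \xi < \theta\}$ is a collection of elements
of $\mathrm{Fn}(I, J, \lambda)$; we want to show that there are distinct $\xi, \eta < \theta$ such that $p_\xi$ and $p_\eta$ are
compatible.
First suppose that $\lambda$ is regular. We want to apply the $\Delta$-system lemma to the set
$\{\mathrm{dmn}(p_\xi) : \xi < \theta\}$, with $\kappa$ and $\lambda$ replaced by $\lambda$ and $\theta$ respectively. If this set has fewer
than $\theta$ members then there is a set $D$ such that $E \stackrel{\mathrm{def}}{=} \{\xi < \theta : \mathrm{dmn}(p_\xi) = D\}$ has size $\theta$.
For any $\xi \in E$ we have $p_\xi \in {}^{D}J$, and $|{}^{D}J| \leq |J|^{<\lambda} < \theta$, so there exist distinct
$\xi, \eta \in E$ such that $p_\xi = p_\eta$, and the desired conclusion follows. Hence we may assume that
$\{\mathrm{dmn}(p_\xi) : \xi < \theta\}$ has size $\theta$.
For any $\alpha < \theta$ we have $|\alpha| \leq |J|^{<\lambda}$ and hence, using Propositions 18.17 and 18.18,
\[ |[\alpha]^{<\lambda}| \leq |\alpha|^{<\lambda} \leq (|J|^{<\lambda})^{<\lambda} = |J|^{<\lambda} < \theta. \]
Thus we can apply the $\Delta$-system lemma, and we get a $\Gamma \in [\theta]^{\theta}$ such that $\{\mathrm{dmn}(p_\xi) : \xi \in \Gamma\}$
is a $\Delta$-system, say with root $r$. Now $|{}^{r}J| \leq |J|^{<\lambda} < \theta$, so there exist a $\Delta \in [\Gamma]^{\theta}$
and an $f \in {}^{r}J$ such that $p_\xi \restriction r = f$ for all $\xi \in \Delta$. Clearly $p_\xi$ and $p_\eta$ are compatible for
any two $\xi, \eta \in \Delta$.
Second, suppose that $\lambda$ is singular. Let $\langle \mu_\xi : \xi < \mathrm{cf}(\lambda) \rangle$ be a strictly increasing
sequence of regular cardinals with supremum $\lambda$. Then
\[ \theta = \bigcup_{\xi < \mathrm{cf}(\lambda)} \{\eta < \theta : |p_\eta| < \mu_\xi\}, \]
so there is a $\xi < \mathrm{cf}(\lambda)$ such that $\Gamma \stackrel{\mathrm{def}}{=} \{\eta < \theta : |p_\eta| < \mu_\xi\}$ has size $\theta$. Let $\langle \nu_\eta : \eta < \theta \rangle$ be a
one-one enumeration of $\Gamma$. Then we can apply the first case to the set $\{p_{\nu_\eta} : \eta < \theta\}$ with $\lambda$
replaced by $\mu_\xi$ to obtain the desired result.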
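Lemma 18.20. If $I, J$ are sets and $\lambda$ is a regular cardinal, then $\mathrm{Fn}(I, J, \lambda)$ is $\lambda$-closed.
Proof. Suppose that $\gamma < \lambda$ and $\langle p_\xi : \xi < \gamma \rangle$ is a system of elements of $\mathrm{Fn}(I, J, \lambda)$
such that $p_\eta \leq p_\xi$ whenever $\xi < \eta < \gamma$. Let $q = \bigcup_{\xi < \gamma} p_\xi$. Clearly $q \in \mathrm{Fn}(I, J, \lambda)$ and
$q \leq p_\xi$ for each $\xi < \gamma$.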
We now need another little fact about cardinal arithmetic.
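Lemma 18.21. If $\lambda$ is regular, then $\lambda^{<\lambda} = 2^{<\lambda}$.
Proof. Note that if $\alpha < \lambda$, then by the regularity of $\lambda$,
\[ |{}^{\alpha}\lambda| = \Bigl|\bigcup_{\beta < \lambda} {}^{\alpha}\beta\Bigr| \leq \sum_{\beta < \lambda} |\beta|^{|\alpha|} \leq \sum_{\beta < \lambda} |\max(\alpha, \beta)|^{|\max(\alpha, \beta)|} \leq \sum_{\beta < \lambda} 2^{|\max(\alpha, \beta)|} \leq 2^{<\lambda} \leq \lambda^{<\lambda}; \]
hence the lemma follows.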
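Lemma 18.22. Suppose that $M$ is a c.t.m. of ZFC, $I, J, \lambda \in M$, and in $M$, $\lambda$ is
a regular cardinal, $2^{<\lambda} = \lambda$ and $|J| \leq \lambda$. Then $\mathrm{Fn}(I, J, \lambda)^M$ preserves cofinalities and
cardinalities.
Proof. By Lemma 18.20, the set $\mathrm{Fn}(I, J, \lambda)$ is $\lambda$-closed, and so by Proposition 18.16,
$\mathrm{Fn}(I, J, \lambda)$ preserves cofinalities and cardinalities $\leq \lambda$. Now $|J|^{<\lambda} \leq \lambda^{<\lambda} = 2^{<\lambda} = \lambda$ by
Lemma 18.21. Hence by Lemma 18.19, $\mathrm{Fn}(I, J, \lambda)$ has the $\lambda^{+}$-cc. By Proposition 18.4,
$\mathrm{Fn}(I, J, \lambda)$ preserves cofinalities and cardinals $\geq \lambda^{+}$.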
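Now we can give our main theorem concerning making $2^{\lambda}$ as large as we want, for any
regular $\lambda$ given in advance.
Theorem 18.23. Suppose that $M$ is a c.t.m. of ZFC and in $M$ we have cardinals
$\kappa, \lambda$ such that $\lambda < \kappa$, $\lambda$ is regular, $2^{<\lambda} = \lambda$, and $\kappa^{\lambda} = \kappa$. Let $P = \mathrm{Fn}(\kappa, 2, \lambda)$ ordered by
$\supseteq$. Then $P$ preserves cofinalities and cardinalities. Let $G$ be $P$-generic over $M$. Then
(i) $(2^{\lambda} = \kappa)^{M[G]}$.
(ii) If $\mu$ and $\nu$ are cardinals of $M$ and $\omega \leq \mu < \lambda$, then $(\nu^{\mu})^M = (\nu^{\mu})^{M[G]}$.
(iii) For any cardinal $\mu$ of $M$, if $\mu \geq \lambda$ then $(2^{\mu})^{M[G]} = (\kappa^{\mu})^M$.
Proof. Preservation of cofinalities and cardinalities follows from Lemma 18.22. Now
we turn to (i). To show that $(2^{\lambda})^{M[G]} \geq \kappa$, we proceed as in the proof of Theorem 18.6.
Let $g = \bigcup G$. So $g$ is a function mapping a subset of $\kappa$ into 2.
(1) For each $\alpha \in \kappa$, the set $\{f \in \mathrm{Fn}(\kappa, 2, \lambda) : \alpha \in \mathrm{dmn}(f)\}$ is dense in $P$ (and it is a
member of $M$).
In fact, given $f \in \mathrm{Fn}(\kappa, 2, \lambda)$, either $f$ is already in the above set, or else $\alpha \notin \mathrm{dmn}(f)$ and
then $f \cup \{(\alpha, 0)\}$ is an extension of $f$ which is in that set. So (1) holds.
Since $G$ intersects each set (1), it follows that $g$ maps $\kappa$ into 2. Let (in $M$) $h : \kappa \times \lambda \to \kappa$
be a bijection. For each $\alpha < \kappa$ let $a_\alpha = \{\xi \in \lambda : g(h(\alpha, \xi)) = 1\}$. We claim that $a_\alpha \neq a_\beta$
for distinct $\alpha, \beta$; this will give $(2^{\lambda})^{M[G]} \geq \kappa$. The set
\[ \{f \in \mathrm{Fn}(\kappa, 2, \lambda) : \text{there is a } \xi \in \lambda \text{ such that } h(\alpha, \xi), h(\beta, \xi) \in \mathrm{dmn}(f) \text{ and } f(h(\alpha, \xi)) \neq f(h(\beta, \xi))\} \]
is dense in $P$ (and it is in $M$). In fact, let distinct $\alpha$ and $\beta$ be given, and suppose that
$f \in \mathrm{Fn}(\kappa, 2, \lambda)$. Now $\{\xi : h(\alpha, \xi) \in \mathrm{dmn}(f) \text{ or } h(\beta, \xi) \in \mathrm{dmn}(f)\}$ has size less than $\lambda$, so choose
$\xi \in \lambda$ not in this set. Thus $h(\alpha, \xi), h(\beta, \xi) \notin \mathrm{dmn}(f)$. Let $f' = f \cup \{(h(\alpha, \xi), 0), (h(\beta, \xi), 1)\}$. Then $f'$
extends $f$ and is in the above set, as desired.
It follows that $G$ contains a member of this set. Hence $a_\alpha \neq a_\beta$. Thus we have now
shown that $(2^{\lambda})^{M[G]} \geq \kappa$.
For the other inequality, note by Lemma 18.19 that $P$ has the $(2^{<\lambda})^{+}$-cc, and by
hypothesis $(2^{<\lambda})^{+} = \lambda^{+}$. By the assumption that $\kappa^{\lambda} = \kappa$ we also have $|P| = \kappa$. Hence by
Proposition 18.9 the other inequality follows, since $((\kappa^{<\lambda^{+}})^{\lambda})^M = (\kappa^{\lambda})^M = \kappa$. Thus we have finished the proof of (i).
For (ii), assume the hypothesis. If $f \in M[G]$ and $f : \mu \to \nu$, then $f \in M$ by Theorem
18.15. Hence (ii) follows.
Finally, for (iii), suppose that $\mu$ is a cardinal of $M$ such that $\mu \geq \lambda$. By Proposition
18.9 with $\lambda$ replaced by $\lambda^{+}$ we have $(2^{\mu})^{M[G]} \leq (\kappa^{\mu})^M$. Now $(\kappa^{\mu})^M \leq (\kappa^{\mu})^{M[G]} =
((2^{\lambda})^{\mu})^{M[G]} = (2^{\mu})^{M[G]}$, so (iii) holds.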
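Corollary 18.24. Suppose that $M$ is a c.t.m. of ZFC + GCH, and in $M$ we have
cardinals $\kappa, \lambda$, both regular, with $\lambda < \kappa$. Let $P = \mathrm{Fn}(\kappa, 2, \lambda)$ ordered by $\supseteq$. Then $P$
preserves cofinalities and cardinalities. Let $G$ be $P$-generic over $M$. Then for any infinite
cardinal $\mu$,
\[ (2^{\mu})^{M[G]} = \begin{cases} \mu^{+} & \text{if } \mu < \lambda, \\ \kappa & \text{if } \lambda \leq \mu < \kappa, \\ \mu^{+} & \text{if } \kappa \leq \mu. \end{cases} \]
Proof. Immediate from Theorem 18.23.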
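The three cases of this corollary can be tabulated mechanically. Here is a minimal Python sketch, assuming (beyond what the corollary states) that all cardinals involved are of the form aleph_n with n finite, so that they can be represented by their indices; the function name `power_index` is hypothetical.

```python
def power_index(m: int, l: int, k: int) -> int:
    """Index j with 2^(aleph_m) = aleph_j in M[G], where lambda = aleph_l,
    kappa = aleph_k, l < k, and the ground model M satisfies GCH."""
    if not 0 <= l < k:
        raise ValueError("need lambda < kappa")
    if m < 0:
        raise ValueError("m must be a natural number")
    if m < l or m >= k:
        return m + 1  # below lambda or from kappa on, the GCH value mu^+ survives
    return k          # lambda <= mu < kappa: 2^mu is blown up to kappa
```

For example, with lambda = aleph_1 and kappa = aleph_3 the call `power_index(1, 1, 3)` returns 3, while `power_index(0, 1, 3)` returns 1, so CH still holds in that extension.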
Theorem 18.23 gives quite a bit of control over what can happen to powers $2^{\lambda}$ for $\lambda$ regular.
We can apply this theorem to obtain a considerable generalization of it.
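Theorem 18.25. Suppose that $n \in \omega$ and $M$ is a c.t.m. of ZFC. Also assume the
following:
(i) $\lambda_1 < \cdots < \lambda_n$ are regular cardinals in $M$.
(ii) $\kappa_1 \leq \cdots \leq \kappa_n$ are cardinals in $M$.
(iii) $(\mathrm{cf}(\kappa_i) > \lambda_i)^M$ for each $i = 1, \ldots, n$.
(iv) $(2^{<\lambda_i} = \lambda_i)^M$ for each $i = 1, \ldots, n$.
(v) $(\kappa_i^{\lambda_i})^M = \kappa_i$ for each $i = 1, \ldots, n$.
Then there is a c.t.m. $N \supseteq M$ with the same cofinalities and cardinals such that:
(vi) $(2^{\lambda_i} = \kappa_i)^N$ for each $i = 1, \ldots, n$.
(vii) $(2^{\mu})^N = (\kappa_n^{\mu})^M$ for all $\mu > \lambda_n$.
Proof. The statement vacuously holds for $n = 0$. Suppose that it holds for $n - 1$, and
the hypothesis holds for $n$, where $n$ is a positive integer. Let $P_n = \mathrm{Fn}(\kappa_n, 2, \lambda_n)$. Then by
Lemma 18.19, $P_n$ has the $(2^{<\lambda_n})^{+}$-cc, i.e., by (iv) it has the $\lambda_n^{+}$-cc. By Lemma 18.20 it is
$\lambda_n$-closed. So by Propositions 18.4 and 18.16, $P_n$ preserves all cofinalities and cardinalities.
Let $G$ be $P_n$-generic over $M$. By Theorem 18.23, $(2^{\lambda_n} = \kappa_n)^{M[G]}$, $(2^{\mu})^{M[G]} = (\kappa_n^{\mu})^M$ for
all $\mu > \lambda_n$, and also conditions (i)-(v) hold for $M[G]$ for $i = 1, \ldots, n - 1$. Hence by the
inductive hypothesis, there is a c.t.m. $N$ with $M[G] \subseteq N$ such that
(1) $(2^{\lambda_i} = \kappa_i)^N$ for each $i = 1, \ldots, n - 1$.
(2) $(2^{\mu})^N = (\kappa_{n-1}^{\mu})^{M[G]}$ for all $\mu > \lambda_{n-1}$.
In particular,
\[ (2^{\lambda_n})^N = (\kappa_{n-1}^{\lambda_n})^{M[G]} \leq (\kappa_n^{\lambda_n})^{M[G]} = ((2^{\lambda_n})^{\lambda_n})^{M[G]} = (2^{\lambda_n})^{M[G]} = \kappa_n = (2^{\lambda_n})^{M[G]} \leq (2^{\lambda_n})^N. \]
Thus $(2^{\lambda_n})^N = \kappa_n$. Furthermore, if $\mu > \lambda_n$ then
\[ (2^{\mu})^N = (\kappa_{n-1}^{\mu})^{M[G]} \leq (\kappa_n^{\mu})^{M[G]} = ((2^{\lambda_n})^{\mu})^{M[G]} = (2^{\mu})^{M[G]} = (\kappa_n^{\mu})^M \leq (\kappa_n^{\mu})^N = ((2^{\lambda_n})^{\mu})^N = (2^{\mu})^N. \]
It follows that $(2^{\mu})^N = (\kappa_n^{\mu})^M$. This completes the inductive proof.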
Corollary 18.26. Suppose that $n \in \omega$ and $M$ is a c.t.m. of ZFC + GCH. Also
assume the following:
(i) $\lambda_1 < \cdots < \lambda_n$ are regular cardinals in $M$.
(ii) $\kappa_1 \leq \cdots \leq \kappa_n$ are cardinals in $M$.
(iii) $(\mathrm{cf}(\kappa_i) > \lambda_i)^M$ for each $i = 1, \ldots, n$.
Then there is a c.t.m. $N \supseteq M$ with the same cofinalities and cardinals such that:
(iv) $(2^{\lambda_i} = \kappa_i)^N$ for each $i = 1, \ldots, n$.
(v) $(2^{\mu})^N = (\kappa_n^{\mu})^M$ for all $\mu > \lambda_n$.
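Corollary 18.27. If ZFC is consistent, then so is each of the following:
(i) ZFC + $2^{\aleph_0} = 2^{\aleph_1} = \aleph_3$.
(ii) ZFC $\cup\ \{2^{\aleph_n} = \aleph_{n+2} : n < 100\}$.
(iii) ZFC $\cup\ \{2^{\aleph_n} = \aleph_{\omega+1} : n < 300\}$.
(iv) ZFC $\cup\ \{2^{\aleph_n} = \aleph_{\omega+n} : 0 < n < 33\}$.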
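Hypotheses (i)-(iii) of Corollary 18.26 are easy to check mechanically for patterns like (i) and (ii) above. The following Python sketch is an illustration only, under the assumption that every lambda_i and kappa_i is aleph_n for a finite n (so each is regular and is represented by its index); the name `hypotheses_hold` is hypothetical.

```python
def hypotheses_hold(lam: list[int], kap: list[int]) -> bool:
    """Check (i)-(iii) of Corollary 18.26 for lambda_i = aleph_{lam[i]}
    and kappa_i = aleph_{kap[i]}, all indices being natural numbers."""
    if len(lam) != len(kap):
        return False
    increasing = all(a < b for a, b in zip(lam, lam[1:]))   # (i)
    monotone = all(a <= b for a, b in zip(kap, kap[1:]))    # (ii)
    # (iii): aleph_k with k finite is regular, so cf(kappa_i) > lambda_i
    # just says that the index of kappa_i exceeds the index of lambda_i.
    cofinality_ok = all(k > l for l, k in zip(lam, kap))
    return increasing and monotone and cofinality_ok
```

Pattern (i) corresponds to `hypotheses_hold([0, 1], [3, 3])` and pattern (ii) to the lists `range(100)` and `n + 2` for `n` in `range(100)`; patterns involving aleph_omega or beyond fall outside this finite-index sketch.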
Corollary 18.28. If it is consistent with ZFC that there is an uncountable regular
limit cardinal, then the following is consistent:
ZFC $\cup\ \{2^{\aleph_n}$ is the first regular limit cardinal $: n < 1000\}$.
Theorem 18.25 can itself be generalized; the following is the ultimate generalization, in
some sense. We do not give the proof.
Theorem. (Easton) Suppose that $M$ is a c.t.m. of ZFC, and that in $M$, $E$ is a class
function whose domain is the class of all regular cardinals, and whose range is contained
in the class of cardinals of $M$. Also assume the following in $M$:
(i) For any regular cardinal $\kappa$, $\mathrm{cf}(E(\kappa)) > \kappa$.
(ii) If $\kappa < \lambda$ are regular cardinals, then $E(\kappa) \leq E(\lambda)$.
Then there is a generic extension $M[G]$ of $M$ preserving cofinalities and cardinals
such that in $M[G]$, $2^{\kappa} = E(\kappa)$ for every regular $\kappa$.
Note that we have always been concerned with $2^{\kappa}$ for $\kappa$ regular; $2^{\kappa}$ when $\kappa$ is singular can
be computed on the basis of what has been done for regular cardinals. It is difficult to
directly do something about $2^{\kappa}$ when $\kappa$ is singular, and there are even hard open problems
remaining concerning this. An application of PCF theory, treated late in these notes, goes
into this.
EXERCISES
E18.1. Show that $\mathrm{fin}(\omega, \omega_1)$ does not have ccc.
E18.2. Show that if $G$ is $\mathrm{fin}(\omega, \omega_1)$-generic over $M$, then $\omega_1^M$ is a countable ordinal of
$M[G]$.
E18.3. Show that $\mathrm{fin}(\omega, \omega_1)$ preserves cardinals $\geq \omega_2$.
E18.4. Suppose that $\kappa$ is an uncountable regular cardinal of $M$, and $P \in M$ is a $\kappa$-cc
quasi-order. Assume that $C$ is club in $\kappa$, with $C \in M[G]$. Show that there is a $C' \subseteq C$
such that $C' \in M$ and $C'$ is club in $\kappa$. Hint: in $M[G]$ let $f : \kappa \to \kappa$ be such that
$\forall \alpha < \kappa\,[\alpha < f(\alpha) \in C]$. Apply Theorem 18.3.
E18.5. Suppose that $\kappa$ is an uncountable regular cardinal of $M$, and $P \in M$ is a $\kappa$-cc
quasi-order. Assume that $S \in M$ is stationary in $\kappa$, in the sense of $M$. Show that it
remains stationary in $M[G]$.
E18.6. Suppose that $\kappa$ is an uncountable regular cardinal of $M$, and $P \in M$ is a $\kappa$-closed
quasi-order. Assume that $S \in M$ is stationary in $\kappa$, in the sense of $M$. Show that it
remains stationary in $M[G]$.
E18.7. Let $M$ be a c.t.m. of ZFC, and in $M$ let $P = \mathrm{Fn}(\kappa, 2, \omega_1)$ with $\kappa$ a cardinal of
$M$ which is $\geq \omega_1$. Show that $M[G]$ satisfies CH, whether or not $M$ does. Moreover,
$\omega_1^M = \omega_1^{M[G]}$.
E18.8. Here we work only in ZFC (or in a fixed model of it). Suppose that $(X, <)$ is a linear
order. Let $P$ be the set of all pairs $(p, n)$ such that $n \in \omega$ and $p \subseteq X \times \mathscr{P}(n)$ is a finite function.
Define $(p, n) \leq (q, m)$ iff $m \leq n$, $\mathrm{dmn}(q) \subseteq \mathrm{dmn}(p)$, $\forall x \in \mathrm{dmn}(q)\,[p(x) \cap m = q(x)]$, and
$\forall x, y \in \mathrm{dmn}(q)$, if $x < y$ then $p(x) \setminus p(y) \subseteq m$.
Show that $P$ has ccc.
E18.9. Continuing exercise E18.8, suppose that we are working in a c.t.m. $M$ of ZFC. Let
$G$ be $P$-generic over $M$. For each $x \in X$ let
\[ a_x = \bigcup \{p(x) : (p, n) \in G \text{ for some } n \in \omega, \text{ with } x \in \mathrm{dmn}(p)\}. \]
Thus $a_x \subseteq \omega$. Show that if $x < y$, then $a_x \setminus a_y$ is finite.
E18.10. Continuing exercises E18.8 and E18.9, show that if $x < y$, then $a_y \setminus a_x$ is infinite.
Hint: for each $i < \omega$ let
\[ E^i = \{(p, n) : x, y \in \mathrm{dmn}(p) \text{ and } |p(y) \setminus p(x)| \geq i\}, \]
and show that $E^i$ is dense.
E18.11. Prove that if ZFC is consistent, then so is ZFC + GCH + $\neg(V = L)$.

19. Relative constructibility


In the next two chapters we give the proof of the consistency of $\neg$AC. Roughly speaking,
that proof runs as follows: start with a c.t.m. $M$ of ZFC, take a generic extension $M[G]$
such that in it $\mathscr{P}(\omega)$ is homogeneous (in a certain sense), and then take the submodel $N$ of
$M[G]$ obtained by applying the constructibility process starting with $(\mathscr{P}(\omega))^{M[G]}$ rather
than the empty set. So we need to go into two matters, each of interest independent of the
proof of consistency of $\neg$AC: (1) relative constructibility, starting with an arbitrary set;
(2) homogeneity. This chapter is devoted to (1).
We also go into the theory of ordinal definable sets.
Relative constructibility is just a mild generalization of ordinary constructibility, which
was covered earlier. So we can go over this material rapidly. While doing so, we may as
well introduce another mild generalization: allowing parameters from some set. So we
imagine given two sets B and C; we construct starting with B, and as we go along we
allow parameters from C.
We begin with some simple operations on sets.
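$\mathrm{Rel}(A, C, n, i) = \{s \in {}^{n}A : s(i) \in C\}$ for $i < n < \omega$.
$\mathrm{Proj}(A, R, n) = \{s \in {}^{n}A : \exists t \in R\,[t \restriction n = s]\}$ for $n \in \omega$.
$\mathrm{Diag}_{\in}(A, n, i, j) = \{s \in {}^{n}A : s(i) \in s(j)\}$ for $i, j < n < \omega$.
$\mathrm{Diag}_{=}(A, n, i, j) = \{s \in {}^{n}A : s(i) = s(j)\}$ for $i, j < n < \omega$.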
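For a finite set A of hereditarily finite sets these four operations can be computed directly. The following Python sketch is an illustration under that finiteness assumption: sets are modeled as frozensets, members of ${}^{n}A$ as tuples of length n, and the lowercase function names are hypothetical.

```python
from itertools import product


def tuples(A, n):
    """All sequences of length n from A, standing in for the set of functions from n to A."""
    return set(product(A, repeat=n))


def rel(A, C, n, i):
    return {s for s in tuples(A, n) if s[i] in C}


def proj(A, R, n):
    # s is kept when some t in R restricts to s on its first n coordinates.
    return {s for s in tuples(A, n) if any(t[:n] == s for t in R)}


def diag_in(A, n, i, j):
    return {s for s in tuples(A, n) if s[i] in s[j]}


def diag_eq(A, n, i, j):
    return {s for s in tuples(A, n) if s[i] == s[j]}
```

With A consisting of the empty set and its singleton, `diag_in(A, 2, 0, 1)` contains exactly the pair (empty set, singleton), and `diag_in(A, 1, 0, 0)` is empty, which is the case n > 0 used in Lemma 19.1.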
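These basic functions are recursively applied by the following definition, where $n \in \omega$. Note
that $\mathrm{Df}'(k, A, C, n)$ is defined for fixed $A$ and $C$ and all $n$ simultaneously by recursion on
$k$. Each set $\mathrm{Df}'(k, A, C, n)$ is a collection of subsets of ${}^{n}A$, and so is $\mathrm{Df}(A, C, n)$.
$\mathrm{Df}'(0, A, C, n) = \{\mathrm{Rel}(A, C, n, i) : i < n\} \cup \{\mathrm{Diag}_{\in}(A, n, i, j) : i, j < n\}$
$\cup\ \{\mathrm{Diag}_{=}(A, n, i, j) : i, j < n\}$;
$\mathrm{Df}'(k + 1, A, C, n) = \mathrm{Df}'(k, A, C, n) \cup \{{}^{n}A \setminus R : R \in \mathrm{Df}'(k, A, C, n)\}$
$\cup\ \{R \cap S : R, S \in \mathrm{Df}'(k, A, C, n)\}$
$\cup\ \{\mathrm{Proj}(A, R, n) : R \in \mathrm{Df}'(k, A, C, n + 1)\}$;
$\mathrm{Df}(A, C, n) = \bigcup_{k \in \omega} \mathrm{Df}'(k, A, C, n)$.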
The following rather trivial fact will be technically useful in what follows.
Lemma 19.1. $\emptyset \in \mathrm{Df}(A, C, n)$ for any sets $A, C$ and any natural number $n$.
Proof. Clearly $\emptyset \in \mathrm{Df}'(k, A, C, 0)$ for $k \geq 1$, and for $n > 0$, $\emptyset = \mathrm{Diag}_{\in}(A, n, 0, 0)$, so the result
follows.
Let $\mathcal{L}$ be the first order language for set theory augmented by a unary relation symbol $R$.
Given two sets $A, C$, a formula $\varphi(v_0, \ldots, v_{n-1})$ and elements $a_0, \ldots, a_{n-1}$ of $A$, we denote
by $\varphi^{A,C}(a_0, \ldots, a_{n-1})$ the statement that $\varphi$ holds with quantifiers relativized to $A$ and a
subformula $Rv_i$ interpreted as saying that $a_i \in C$. It is easy to give a recursive definition
of $\varphi^{A,C}(a_0, \ldots, a_{n-1})$; we can conceive it as associating with a formula $\varphi$ another formula
$\varphi^{A,C}(a_0, \ldots, a_{n-1})$.
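Lemma 19.2. Let $\varphi(v_0, \ldots, v_{n-1})$ be a formula of $\mathcal{L}$, and let $A, C$ be any sets. Then
$\{s \in {}^{n}A : \varphi^{A,C}(s(0), \ldots, s(n-1))\} \in \mathrm{Df}(A, C, n)$.
Proof. Induction on $\varphi$:
If $\varphi$ is $Rv_i$, then
$\{s \in {}^{n}A : \varphi^{A,C}(s(0), \ldots, s(n-1))\} = \{s \in {}^{n}A : s(i) \in C\}$
$= \mathrm{Rel}(A, C, n, i)$
$\in \mathrm{Df}'(0, A, C, n)$
$\subseteq \mathrm{Df}(A, C, n)$.
If $\varphi$ is $v_i \in v_j$, then
$\{s \in {}^{n}A : \varphi^{A,C}(s(0), \ldots, s(n-1))\} = \{s \in {}^{n}A : s(i) \in s(j)\}$
$= \mathrm{Diag}_{\in}(A, n, i, j)$
$\in \mathrm{Df}'(0, A, C, n)$
$\subseteq \mathrm{Df}(A, C, n)$.
If $\varphi$ is $v_i = v_j$, then
$\{s \in {}^{n}A : \varphi^{A,C}(s(0), \ldots, s(n-1))\} = \{s \in {}^{n}A : s(i) = s(j)\}$
$= \mathrm{Diag}_{=}(A, n, i, j)$
$\in \mathrm{Df}'(0, A, C, n)$
$\subseteq \mathrm{Df}(A, C, n)$.
Suppose that $\varphi$ is $\neg\psi$, where
$\{s \in {}^{n}A : \psi^{A,C}(s(0), \ldots, s(n-1))\} \in \mathrm{Df}(A, C, n)$.
Then
$\{s \in {}^{n}A : \varphi^{A,C}(s(0), \ldots, s(n-1))\} = {}^{n}A \setminus \{s \in {}^{n}A : \psi^{A,C}(s(0), \ldots, s(n-1))\}$
$\in \mathrm{Df}(A, C, n)$.
Suppose that $\varphi$ is $\psi \wedge \chi$, where
$\{s \in {}^{n}A : \psi^{A,C}(s(0), \ldots, s(n-1))\} \in \mathrm{Df}(A, C, n)$
and $\{s \in {}^{n}A : \chi^{A,C}(s(0), \ldots, s(n-1))\} \in \mathrm{Df}(A, C, n)$.
Then
$\{s \in {}^{n}A : \varphi^{A,C}(s(0), \ldots, s(n-1))\}$
$= \{s \in {}^{n}A : \psi^{A,C}(s(0), \ldots, s(n-1))\} \cap \{s \in {}^{n}A : \chi^{A,C}(s(0), \ldots, s(n-1))\}$
$\in \mathrm{Df}(A, C, n)$.
Finally, suppose that $\varphi$ is $\exists v_k \psi$. Then there is a formula $\chi$ with free variables among
$v_0, \ldots, v_n$ such that $\varphi$ is logically equivalent to $\exists v_n \chi$. We assume inductively that
$\{s \in {}^{n+1}A : \chi^{A,C}(s(0), \ldots, s(n))\} \in \mathrm{Df}(A, C, n + 1)$.
Then
$\{s \in {}^{n}A : \varphi^{A,C}(s(0), \ldots, s(n-1))\}$
$= \{s \in {}^{n}A : (\exists v_n \chi)^{A,C}(s(0), \ldots, s(n-1))\}$ (as is easily seen)
$= \mathrm{Proj}(A, \{s \in {}^{n+1}A : \chi^{A,C}(s(0), \ldots, s(n))\}, n)$
$\in \mathrm{Df}(A, C, n)$.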
Now we prove a sequence of lemmas leading up to the fact that Df is absolute for transitive
models of ZF. To do this, we have to extend the definitions of our functions above so that
they are defined for all sets, since absoluteness was developed only for such functions. We
do this by just letting the values be 0 for arguments not in the domain of the original
functions.
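Lemma 19.3. The function Rel is absolute for transitive models of ZF.
Proof. $x = \mathrm{Rel}(A, C, n, i)$ iff ($n \notin \omega$ and $x = 0$), or ($n \in \omega$, $i \notin n$, and $x = 0$), or $n \in \omega$, $i < n$, and the following condition holds:
\[ \forall s \in x\,[s \in {}^nA \text{ and } s(i) \in C] \text{ and } \forall s \in {}^nA\,[s(i) \in C \rightarrow s \in x]. \]
Lemma 19.4. The function Proj is absolute for transitive models of ZF.
Proof. $x = \mathrm{Proj}(A, R, n)$ iff $n \notin \omega$ and $x = 0$, or $n \in \omega$ and the following condition holds:
\[ \forall s \in x\,[s \in {}^nA \wedge \exists t \in R\,(t \upharpoonright n = s)] \wedge \forall s \in {}^nA\,[\exists t \in R\,(t \upharpoonright n = s) \rightarrow s \in x]. \]
Lemma 19.5. The function $\mathrm{Diag}_\in$ is absolute for transitive models of ZF.
Proof. $x = \mathrm{Diag}_\in(A, n, i, j)$ iff not($n \in \omega$ and $i, j < n$) and $x = 0$, or $n \in \omega$, $i, j < n$, and the following condition holds:
\[ \forall s \in x\,[s \in {}^nA \wedge s(i) \in s(j)] \wedge \forall s \in {}^nA\,[s(i) \in s(j) \rightarrow s \in x]. \]
Lemma 19.6. The function $\mathrm{Diag}_=$ is absolute for transitive models of ZF.
Proof. Similar to that of Lemma 19.5.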
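Lemma 19.7. For any sets $A$ and $C$ and any natural number $n$ let
\[ T_1(A, C, n) = \{\mathrm{Rel}(A, C, n, i) : i < n\} \cup \{\mathrm{Diag}_\in(A, n, i, j) : i, j < n\} \cup \{\mathrm{Diag}_=(A, n, i, j) : i, j < n\}. \]
Then $T_1$ is absolute for transitive models of ZF.
Proof. $x = T_1(A, C, n)$ iff not($n \in \omega$) and $x = 0$, or $n \in \omega$ and the following condition holds:
\[
\begin{aligned}
&\forall y \in x\, \exists i, j < n\,[y = \mathrm{Rel}(A, C, n, i) \vee y = \mathrm{Diag}_\in(A, n, i, j) \vee y = \mathrm{Diag}_=(A, n, i, j)] \\
&\wedge \forall i < n\, \forall y \in \{\mathrm{Rel}(A, C, n, i)\}\,[y \in x] \\
&\wedge \forall i, j < n\, \forall y \in \{\mathrm{Diag}_\in(A, n, i, j)\}\,[y \in x] \\
&\wedge \forall i, j < n\, \forall y \in \{\mathrm{Diag}_=(A, n, i, j)\}\,[y \in x].
\end{aligned}
\]
Lemma 19.8. For any sets $A, L$ and any natural number $n$ let
\[ T_2(A, n, L) = \{{}^nA \setminus R : R \in L\}. \]
Then $T_2$ is absolute for transitive models of ZF.
Proof. $x = T_2(A, n, L)$ iff not($n \in \omega$) and $x = 0$, or $n \in \omega$ and the following condition holds:
\[ \forall y \in x\, \exists R \in L\,[y = {}^nA \setminus R] \wedge \forall z\,[\exists R \in L\,(z = {}^nA \setminus R) \rightarrow z \in x]. \]
Here we need a little argument, because of the unbounded quantifier $\forall z$. Let $M$ be a transitive model of ZF. Suppose that $A, n, L \in M$, $n \in \omega$, $z$ is a set, $R \in L$, and $z = {}^nA \setminus R$; we would like to show that $z \in M$. There is a $w \in M$ such that $w = ({}^nA \setminus R)^M$, since $M$ is a model of ZF. By absoluteness, $z = w \in M$, as desired.
Lemma 19.9. For any set $X$, let $T_3(X) = \{R \cap S : R, S \in X\}$. Then $T_3$ is absolute for transitive models of ZF.
Proof.
\[ y = T_3(X) \quad\text{iff}\quad \forall z \in y\, \exists R, S \in X\,[z = R \cap S] \wedge \forall R, S \in X\,[R \cap S \in y]. \]
Lemma 19.10. For any sets $A, X$ and any natural number $n$, let $T_4(A, n, X)$ be the set $\{\mathrm{Proj}(A, R, n) : R \in X\}$. Then $T_4$ is absolute for transitive models of ZF.
Proof. $x = T_4(A, n, X)$ iff not($n \in \omega$) and $x = 0$, or $n \in \omega$ and the following condition holds:
\[ \forall R \in X\,[\mathrm{Proj}(A, R, n) \in x] \wedge \forall z\,[\exists R \in X\,(z = \mathrm{Proj}(A, R, n)) \leftrightarrow z \in x]. \]
This is shown to be absolute as in the proof of Lemma 19.8.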


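Lemma 19.11. Let $\mathbf{B} = \omega \times V \times V \times \omega$, and define $S \subseteq \mathbf{B} \times \mathbf{B}$ as follows:
\[ (k, A, C, n)\, S\, (k', A', C', n') \quad\text{iff}\quad k, k', n, n' \in \omega \text{ and } A = A' \text{ and } C = C' \text{ and } k < k'. \]
Then $S$ is well-founded and set-like on $\mathbf{B}$.

Lemma 19.12. Let $\mathbf{B}$ and $S$ be as in Lemma 19.11. Define $F : \mathbf{B} \times V \to V$ as follows. Let $k, n \in \omega$ and let $A, C, f$ be any sets. If $f$ is not a function with domain $\mathrm{pred}(\mathbf{B}, (k, A, C, n), S)$, let $F((k, A, C, n), f) = 0$. If $f$ is such a function, let
\[
F((k, A, C, n), f) =
\begin{cases}
T_1(A, C, n) & \text{if } k = 0, \\[4pt]
f(k', A, C, n) \cup T_2(A, n, f(k', A, C, n)) \\
\quad \cup\, T_3(f(k', A, C, n)) \cup T_4(A, n, f(k', A, C, n + 1)) & \text{if } k = k' + 1.
\end{cases}
\]
Then $F$ is absolute for transitive models of ZF.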
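Lemma 19.13. $\mathrm{Df}'$ is absolute for transitive models of ZF.
Proof. It suffices to check that $\mathrm{Df}'$ is obtained from the function $F$ of Lemma 19.12 by the recursion theorem applied to $\mathbf{B}$ and $S$. Let $G$ be the function so obtained. Then for any sets $A, C$ and any natural number $n$,
\[
\begin{aligned}
G(0, A, C, n) &= F((0, A, C, n), G \upharpoonright \mathrm{pred}(\mathbf{B}, (0, A, C, n), S)) \\
&= F((0, A, C, n), 0) = T_1(A, C, n) \\
&= \{\mathrm{Rel}(A, C, n, i) : i < n\} \cup \{\mathrm{Diag}_\in(A, n, i, j) : i, j < n\} \\
&\quad \cup \{\mathrm{Diag}_=(A, n, i, j) : i, j < n\},
\end{aligned}
\]
as desired. Now take any $k \in \omega$. Then, with $f = G \upharpoonright \mathrm{pred}(\mathbf{B}, (k + 1, A, C, n), S)$,
\[
\begin{aligned}
G(k + 1, A, C, n) &= F((k + 1, A, C, n), G \upharpoonright \mathrm{pred}(\mathbf{B}, (k + 1, A, C, n), S)) \\
&= F((k + 1, A, C, n), f) \\
&= f(k, A, C, n) \cup T_2(A, n, f(k, A, C, n)) \\
&\quad \cup T_3(f(k, A, C, n)) \cup T_4(A, n, f(k, A, C, n + 1)) \\
&= G(k, A, C, n) \cup \{{}^nA \setminus R : R \in G(k, A, C, n)\} \\
&\quad \cup \{R \cap S : R, S \in G(k, A, C, n)\} \\
&\quad \cup \{\mathrm{Proj}(A, R, n) : R \in G(k, A, C, n + 1)\}.
\end{aligned}
\]
Lemma 19.14. Df is absolute for all transitive models of ZF.
Proof. $x = \mathrm{Df}(A, C, n)$ iff
\[ \forall y \in x\, \exists k \in \omega\,[y \in \mathrm{Df}'(k, A, C, n)] \wedge \forall k \in \omega\, \forall y \in \mathrm{Df}'(k, A, C, n)\,[y \in x]. \]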

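The following is the definable power set operation: For any sets $A, C$,
\[ D(A, C) = \{X \subseteq A : \exists n \in \omega\, \exists s \in {}^nA\, \exists R \in \mathrm{Df}(A, C, n + 1)\,[X = \{x \in A : s^\frown\langle x\rangle \in R\}]\}. \]
Here $s^\frown\langle x\rangle$ is the member $t$ of ${}^{n+1}A$ such that $s \subseteq t$ and $t(n) = x$.

Lemma 19.15. $D(A, C)$ is absolute for all transitive models of ZF.

Proof. $D = D(A, C)$ iff
\[
\begin{aligned}
&\forall X \in D\, \exists n \in \omega\, \exists s \in {}^nA\, \exists R \in \mathrm{Df}(A, C, n + 1) \\
&\qquad [\forall x \in X\,(x \in A \wedge s^\frown\langle x\rangle \in R) \wedge \forall x \in A\,(s^\frown\langle x\rangle \in R \rightarrow x \in X)] \\
&\wedge \forall X\,[\exists n \in \omega\, \exists s \in {}^nA\, \exists R \in \mathrm{Df}(A, C, n + 1) \\
&\qquad [\forall x \in X\,(x \in A \wedge s^\frown\langle x\rangle \in R) \wedge \forall x \in A\,(s^\frown\langle x\rangle \in R \rightarrow x \in X)] \rightarrow X \in D].
\end{aligned}
\]
Here there is an unbounded quantifier $\forall X$, and one must check that if the statement holds in a transitive class model of ZF, then it really holds. This is clear.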
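Lemma 19.16. Let $\varphi(v_0, \ldots, v_{n-1}, x)$ be a formula of $\mathcal{L}$ with the indicated free variables. Then
\[ \forall A\, \forall C\, \forall v_0, \ldots, v_{n-1} \in A\,[\{x \in A : \varphi^{A,C}(v_0, \ldots, v_{n-1}, x)\} \in D(A, C)]. \]
Proof. Let $v_0, \ldots, v_{n-1} \in A$ and $R = \{s \in {}^{n+1}A : \varphi^{A,C}(s(0), \ldots, s(n))\}$. Then by Lemma 19.2, $R \in \mathrm{Df}(A, C, n + 1)$. Clearly
\[ \{x \in A : \varphi^{A,C}(v_0, \ldots, v_{n-1}, x)\} = \{x \in A : v^\frown\langle x\rangle \in R\}, \]
where $v = \langle v_0, \ldots, v_{n-1}\rangle$, and hence $\{x \in A : \varphi^{A,C}(v_0, \ldots, v_{n-1}, x)\} \in D(A, C)$.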
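Lemma 19.17. For any sets $A, C$ and any $n \in \omega$, $|\mathrm{Df}(A, C, n)| \leq \omega$.
Lemma 19.18. Let $A$ and $C$ be any sets. Then:
(i) $D(A, C) \subseteq \mathcal{P}(A)$.
(ii) If $A$ is transitive, then $A \subseteq D(A, C)$.
(iii) If $X \in [A]^{<\omega}$, then $X \in D(A, C)$.
(iv) If $A$ is infinite, then $|D(A, C)| = |A|$.
(v) $C \cap A \in D(A, C)$.
Proof. (i) is obvious. For (ii), let $\varphi(v, x)$ be the formula $x \in v$. Then for any $v \in A$ we have $v = \{x \in A : x \in v\}$ by the transitivity of $A$, and so $v \in D(A, C)$ by Lemma 19.16.
For (iii), suppose that $X \in [A]^{<\omega}$. Then there exist an $n \in \omega$ and an $s : n \to A$ with $\mathrm{rng}(s) = X$. For each $i < n$ we have
\[ \{t \in {}^{n+1}A : t(n) = t(i)\} = \mathrm{Diag}_=(A, n + 1, i, n) \in \mathrm{Df}(A, C, n + 1). \]
Hence
\[ R \overset{\mathrm{def}}{=} \{t \in {}^{n+1}A : t(n) \in \mathrm{rng}(t \upharpoonright n)\} = \bigcup_{i<n} \{t \in {}^{n+1}A : t(n) = t(i)\} \in \mathrm{Df}(A, C, n + 1). \]
Hence
\[ X = \{x \in A : s^\frown\langle x\rangle \in R\} \in D(A, C), \]
as desired.
For (iv), note that
\[ D(A, C) = \bigcup_{n \in \omega} \bigcup_{s \in {}^nA} \{\{x \in A : s^\frown\langle x\rangle \in R\} : R \in \mathrm{Df}(A, C, n + 1)\}. \]
Hence
\[ |D(A, C)| \leq \sum_{n \in \omega} \left(|{}^nA| \cdot |\mathrm{Df}(A, C, n + 1)|\right) \leq \omega \cdot |A| = |A|. \]
On the other hand, $\{a\} \in D(A, C)$ for each $a \in A$ by (iii), so $|A| \leq |D(A, C)|$. So (iv) holds.
Finally, for (v), $C \cap A = \{x \in A : x \in C\} \in D(A, C)$ by Lemma 19.16, applied to the formula $Rx$.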
Now we define the constructible hierarchy. Recall that for any set $B$, $\mathrm{trcl}(B)$ is the transitive closure of $B$, i.e., it is the smallest transitive set containing $B$. It can be defined recursively using the intuition $\mathrm{trcl}(B) = B \cup \bigcup B \cup \bigcup\bigcup B \cup \ldots$.
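For hereditarily finite sets the transitive closure can be computed by following the intuition $B \cup \bigcup B \cup \bigcup\bigcup B \cup \ldots$ until nothing new appears. The sketch below assumes sets are encoded as nested Python frozensets; the names `trcl` and `is_transitive` are illustrative.

```python
def trcl(B):
    """Transitive closure of a hereditarily finite set B (nested frozensets)."""
    closure = set(B)
    frontier = list(B)
    while frontier:
        x = frontier.pop()
        for y in x:                 # every member of a member must be included
            if y not in closure:
                closure.add(y)
                frontier.append(y)
    return frozenset(closure)

def is_transitive(A):
    """A is transitive iff every member of a member of A is a member of A."""
    return all(y in A for x in A for y in x)

# Example: B = {{{0}}} where 0 is the empty set.
e0 = frozenset()
e1 = frozenset({e0})
e2 = frozenset({e1})
B = frozenset({e2})
print(is_transitive(B), is_transitive(trcl(B)))
```

In this example $B$ itself is not transitive, while its closure $\{\{\{0\}\}, \{0\}, 0\}$ is.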
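\[
\begin{aligned}
L^{BC}_0 &= \mathrm{trcl}(B); \\
L^{BC}_{\alpha+1} &= D(L^{BC}_\alpha, C); \\
L^{BC}_\gamma &= \bigcup_{\alpha < \gamma} L^{BC}_\alpha \quad \text{for } \gamma \text{ limit}; \\
L_\alpha(B) &= L^{\{B\}\emptyset}_\alpha; \\
L_\alpha[C] &= L^{\emptyset C}_\alpha; \\
L^{BC} &= \bigcup_{\alpha \in \mathrm{On}} L^{BC}_\alpha; \\
L &= \bigcup_{\alpha \in \mathrm{On}} L_\alpha; \\
L(B) &= \bigcup_{\alpha \in \mathrm{On}} L_\alpha(B); \\
L[C] &= \bigcup_{\alpha \in \mathrm{On}} L_\alpha[C].
\end{aligned}
\]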
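Lemma 19.19. For any ordinal $\alpha$,
(i) $L^{BC}_\alpha$ is transitive.
(ii) $L^{BC}_\beta \subseteq L^{BC}_\alpha$ for all $\beta < \alpha$.
Proof. We prove both statements simultaneously by induction on $\alpha$. Both statements are clear for $\alpha = 0$. Now assume them for $\alpha$. By (i) for $\alpha$ and Lemma 19.18(ii), it follows that $L^{BC}_\alpha \subseteq D(L^{BC}_\alpha, C) = L^{BC}_{\alpha+1}$, and this easily gives (ii) for $\alpha + 1$. If $x \in y \in L^{BC}_{\alpha+1}$, then $y \in D(L^{BC}_\alpha, C) \subseteq \mathcal{P}(L^{BC}_\alpha)$, so $x \in L^{BC}_\alpha \subseteq L^{BC}_{\alpha+1}$. So $L^{BC}_{\alpha+1}$ is transitive.
If $\gamma$ is a limit ordinal and (i) and (ii) hold for all $\alpha < \gamma$, clearly they hold for $\gamma$ too.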
Now we have a notion of rank for constructible sets too. Let $B, C$ be sets, with $B$ transitive. For each $x \in L^{BC}$, its $L^{BC}$-rank is the least ordinal $\rho^{BC}(x) = \alpha$ such that $x \in L^{BC}_{\alpha+1}$.
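Theorem 19.20. Suppose that $B$ and $C$ are sets, with $B$ transitive. Let $x \in L^{BC}$ and let $\alpha$ be an ordinal. Then
(i) $L^{BC}_\alpha = \{y \in L^{BC} : \rho^{BC}(y) < \alpha\}$.
(ii) For all $y \in x$ we have $y \in L^{BC}$, and $\rho^{BC}(y) < \rho^{BC}(x)$.
(iii) $\alpha \in L^{\emptyset C}$, and $\rho^{\emptyset C}(\alpha) = \alpha$.
(iv) $L^{\emptyset C}_\alpha \cap \mathrm{On} = \alpha$.
(v) $L^{BC}_\alpha \in L^{BC}_{\alpha+1}$.
(vi) $L_\alpha \subseteq V_\alpha$ for all $\alpha$.
(vii) $[L_\alpha]^{<\omega} \subseteq L_{\alpha+1}$ for every $\alpha$.
(viii) $L_n = V_n$ for every $n \in \omega$.
(ix) $L_\omega = V_\omega$.
(x) $B \in L^{BC}$.
(xi) $C \cap L^{BC} \in L^{BC}$.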
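Proof. (i): Suppose that $y \in L^{BC}_\alpha$. Then $\alpha \neq 0$. If $\alpha$ is a successor ordinal $\beta + 1$, then $\rho^{BC}(y) \leq \beta < \alpha$. If $\alpha$ is a limit ordinal, then $y \in L^{BC}_\beta$ for some $\beta < \alpha$, hence $y \in L^{BC}_{\beta+1}$ also, so $\rho^{BC}(y) \leq \beta < \alpha$. This proves $\subseteq$.
For $\supseteq$, suppose that $\beta \overset{\mathrm{def}}{=} \rho^{BC}(y) < \alpha$. Then $y \in L^{BC}_{\beta+1} \subseteq L^{BC}_\alpha$, as desired.
(ii): Assume that $y \in x$. Let $\rho^{BC}(x) = \beta$. Then $x \in L^{BC}_{\beta+1} = D(L^{BC}_\beta, C) \subseteq \mathcal{P}(L^{BC}_\beta)$, and so $y \in L^{BC}_\beta$. Hence $\rho^{BC}(y) < \rho^{BC}(x)$.
(iv): We prove this by induction on $\alpha$. It is obvious for $\alpha = 0$, and the inductive step when $\alpha$ is limit is clear. So, suppose that we know that $L^{\emptyset C}_\beta \cap \mathrm{On} = \beta$, and $\alpha = \beta + 1$. If $\gamma \in L^{\emptyset C}_\alpha \cap \mathrm{On}$, then $\gamma \in D(L^{\emptyset C}_\beta, C) \subseteq \mathcal{P}(L^{\emptyset C}_\beta)$, so $\gamma \subseteq L^{\emptyset C}_\beta \cap \mathrm{On} = \beta$; hence $\gamma \leq \beta$. This shows that $L^{\emptyset C}_\alpha \cap \mathrm{On} \subseteq \alpha$. If $\gamma < \beta$, then $\gamma \in L^{\emptyset C}_\beta \cap \mathrm{On} \subseteq L^{\emptyset C}_\alpha \cap \mathrm{On}$. Thus it remains only to show that $\beta \in L^{\emptyset C}_\alpha$. Now there is a natural $\Delta_0$ formula $\varphi(x)$ which expresses that $x$ is an ordinal:
\[ \forall y \in x\, \forall z \in y\,(z \in x) \wedge \forall y \in x\, \forall z \in y\, \forall w \in z\,(w \in y); \]
this just says that $x$ is transitive and every member of $x$ is transitive. Since $L^{\emptyset C}_\beta$ is transitive, $\varphi(x)$ is absolute for it. Hence
\[ \beta = L^{\emptyset C}_\beta \cap \mathrm{On} = \{x \in L^{\emptyset C}_\beta : \varphi^{L^{\emptyset C}_\beta}(x)\}. \]
Hence $\beta \in D(L^{\emptyset C}_\beta, C) = L^{\emptyset C}_\alpha$, as desired.
(iii): By (iv) we have $L^{\emptyset C}_{\alpha+1} \cap \mathrm{On} = \alpha + 1$, and hence $\alpha \in \alpha + 1 \subseteq L^{\emptyset C}_{\alpha+1}$, so that $\alpha \in L^{\emptyset C}$ and $\rho^{\emptyset C}(\alpha) \leq \alpha$. By (iv) again, we cannot have $\alpha \in L^{\emptyset C}_\alpha$, so $\rho^{\emptyset C}(\alpha) = \alpha$.
(v): $L^{BC}_\alpha = \{x \in L^{BC}_\alpha : (x = x)^{L^{BC}_\alpha}\} \in D(L^{BC}_\alpha, C) = L^{BC}_{\alpha+1}$.
(vi): An easy induction on $\alpha$.
(vii): Clearly $[L^{BC}_\alpha]^{<\omega} \subseteq D(L^{BC}_\alpha, C) = L^{BC}_{\alpha+1}$, by Lemma 19.18(iii).
(viii): By induction on $n$. It is clear for $n = 0$. Assume that $L_n = V_n$. Thus $L_n$ is finite. Hence by (vii) and (vi),
\[ V_{n+1} = \mathcal{P}(V_n) = \mathcal{P}(L_n) = [L_n]^{<\omega} \subseteq L_{n+1} \subseteq V_{n+1}, \]
as desired.
(ix): Immediate from (viii).
(x): Obvious.
(xi): For each $c \in C \cap L^{BC}$ let $\beta(c)$ be an ordinal such that $c \in L^{BC}_{\beta(c)}$, and let $\alpha = \bigcup_{c \in C \cap L^{BC}} \beta(c)$. Thus $C \cap L^{BC} \subseteq L^{BC}_\alpha$. Then by Lemma 19.16,
\[ C \cap L^{BC} = C \cap L^{BC}_\alpha = \{x \in L^{BC}_\alpha : x \in C\} \in D(L^{BC}_\alpha, C) = L^{BC}_{\alpha+1} \subseteq L^{BC}. \]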
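Item (viii) identifies the finite levels of the hierarchy with the levels $V_n$, and the $V_n$ side is easy to compute. The sketch below only builds $V_0, \ldots, V_4$ by iterating the full power set and checks transitivity and monotonicity; it does not compute the definable power set $D$. Sets are encoded as nested frozensets, and the names `powerset`, `V`, and `is_transitive` are illustrative.

```python
from itertools import combinations

def powerset(A):
    """The full power set of a finite set A, as a frozenset of frozensets."""
    items = list(A)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )

def V(n):
    """The level V_n of the cumulative hierarchy, for small natural numbers n."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

def is_transitive(A):
    """A is transitive iff every member of a member of A is a member of A."""
    return all(y in A for x in A for y in x)

sizes = [len(V(n)) for n in range(5)]
print(sizes)
```

Only small $n$ are feasible here, since $|V_{n+1}| = 2^{|V_n|}$.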

In order to show that $L^{BC}$ is a model of ZF, we need another fact about absoluteness which was covered in the first semester.
Lemma 19.21. Suppose that $M$ and $N$ are classes with $M \subseteq N$. Let $\varphi_0, \ldots, \varphi_n$ be a list of formulas such that if $i \leq n$ and $\psi$ is a subformula of $\varphi_i$, then there is a $j \leq n$ such that $\varphi_j$ is $\psi$. Then the following conditions are equivalent:
(i) Each $\varphi_i$ is absolute for $M, N$.
(ii) If $i \leq n$ and $\varphi_i$ has the form $\exists x \varphi_j(x, y_1, \ldots, y_t)$ with $x, y_1, \ldots, y_t$ exactly all the free variables of $\varphi_j$, then
\[ \forall y_1, \ldots, y_t \in M\,[\exists x \in N\, \varphi_j^N(x, y_1, \ldots, y_t) \rightarrow \exists x \in M\, \varphi_j^N(x, y_1, \ldots, y_t)]. \]
Note the seemingly minor respects in which (ii) differs from the definition of absoluteness: the implication goes only one direction, and on the right side of the implication we relativize $\varphi_j$ to $N$ rather than $M$.
Proof. (i)$\Rightarrow$(ii): Assume (i) and the hypothesis of (ii). Suppose that $y_1, \ldots, y_t \in M$ and $\exists x \in N\, \varphi_j^N(x, y_1, \ldots, y_t)$. Thus by absoluteness $\exists x \in M\, \varphi_j^M(x, y_1, \ldots, y_t)$; choose $x \in M$ such that $\varphi_j^M(x, y_1, \ldots, y_t)$. Hence by absoluteness again, $\varphi_j^N(x, y_1, \ldots, y_t)$. Hence $\exists x \in M\, \varphi_j^N(x, y_1, \ldots, y_t)$, as desired.
(ii)$\Rightarrow$(i): Assume (ii). We prove that $\varphi_i$ is absolute for $M, N$ by induction on the length of $\varphi_i$, where we consider formulas to be built up using $\neg$, $\wedge$, $\exists$. This is clear if $\varphi_i$ is atomic, and it easily follows inductively if $\varphi_i$ has the form $\neg\varphi_j$ or $\varphi_j \wedge \varphi_k$. Now suppose that $\varphi_i$ is $\exists x \varphi_j(x, y_1, \ldots, y_t)$, and $y_1, \ldots, y_t \in M$. Then
\[
\begin{aligned}
\varphi_i^M(y_1, \ldots, y_t) &\leftrightarrow \exists x \in M\, \varphi_j^M(x, y_1, \ldots, y_t) && \text{(definition of relativization)} \\
&\leftrightarrow \exists x \in M\, \varphi_j^N(x, y_1, \ldots, y_t) && \text{(induction hypothesis)} \\
&\leftrightarrow \exists x \in N\, \varphi_j^N(x, y_1, \ldots, y_t) && \text{(by (ii))} \\
&\leftrightarrow \varphi_i^N(y_1, \ldots, y_t) && \text{(definition of relativization)}
\end{aligned}
\]

Theorem 19.22. Suppose that $Z(\alpha)$ is a set for every ordinal $\alpha$, and the following conditions hold:
(i) If $\alpha < \beta$, then $Z(\alpha) \subseteq Z(\beta)$.
(ii) If $\gamma$ is a limit ordinal, then $Z(\gamma) = \bigcup_{\alpha < \gamma} Z(\alpha)$.
Let $Z = \bigcup_{\alpha \in \mathrm{On}} Z(\alpha)$. Then for any formulas $\varphi_0, \ldots, \varphi_{n-1}$,
\[ \forall \alpha\, \exists \beta > \alpha\,[\varphi_0, \ldots, \varphi_{n-1} \text{ are absolute for } Z(\beta), Z]. \]
Proof. Assume the hypothesis, and let an ordinal $\alpha$ be given. We are going to apply Lemma 19.21 with $N = Z$, and we need to find an appropriate $\beta > \alpha$ so that we can take $M = Z(\beta)$ in 19.21.
We may assume that $\varphi_0, \ldots, \varphi_{n-1}$ is subformula-closed; i.e., if $i < n$, then every subformula of $\varphi_i$ is in the list. Let $A$ be the set of all $i < n$ such that $\varphi_i$ begins with an existential quantifier. Suppose that $i \in A$ and $\varphi_i$ is the formula $\exists x \varphi_j(x, y_1, \ldots, y_t)$, where $x, y_1, \ldots, y_t$ are exactly all the free variables of $\varphi_j$. We now define a class function $G_i$ as follows. For any sets $y_1, \ldots, y_t$,
\[
G_i(y_1, \ldots, y_t) =
\begin{cases}
\text{the least } \eta \text{ such that } \exists x \in Z(\eta)\, \varphi_j^Z(x, y_1, \ldots, y_t) & \text{if there is such,} \\
0 & \text{otherwise.}
\end{cases}
\]
Then for each ordinal $\xi$ we define
\[ F_i(\xi) = \sup\{G_i(y_1, \ldots, y_t) : y_1, \ldots, y_t \in Z(\xi)\}; \]
note that this supremum exists by the replacement axiom.
Now we define a sequence $\gamma_0, \ldots, \gamma_p, \ldots$ of ordinals by induction on $p \in \omega$. Let $\gamma_0 = \alpha + 1$. Having defined $\gamma_p$, let
\[ \gamma_{p+1} = \max(\gamma_p + 1, \sup\{F_i(\xi) : i \in A, \xi \leq \gamma_p\} + 1). \]
Finally, let $\beta = \sup_{p \in \omega} \gamma_p$. Clearly $\alpha < \beta$ and $\beta$ is a limit ordinal.
(1) If $i \in A$ is as above, $y_1, \ldots, y_t \in Z(\beta)$, and $\exists x \in Z\, \varphi_j^Z(x, y_1, \ldots, y_t)$, then there is an $x \in Z(\beta)$ such that $\varphi_j^Z(x, y_1, \ldots, y_t)$.
In fact, choose $p$ such that $y_1, \ldots, y_t \in Z(\gamma_p)$. Then $G_i(y_1, \ldots, y_t) \leq F_i(\gamma_p) < \gamma_{p+1}$. Hence an $x$ as in (1) exists, with $x \in Z(\gamma_{p+1})$.
Now the theorem follows from Lemma 19.21.
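Theorem 19.23. $L^{BC}$ is a model of ZF.
Proof. We take the axioms one by one, using the results of Chapter 1. Extensionality holds since $L^{BC}$ is transitive (by Lemma 19.19); see Theorem 1.6.
According to Theorem 1.7, to verify that the comprehension axioms hold in $L^{BC}$ it suffices to take any formula $\varphi$ with free variables among $x, z, w_1, \ldots, w_n$, assume that $z, w_1, \ldots, w_n \in L^{BC}$, and prove that
\[ (1) \qquad \{x \in z : \varphi^{L^{BC}}(x, z, w_1, \ldots, w_n)\} \in L^{BC}. \]
Clearly there is an ordinal $\alpha$ such that $z, w_1, \ldots, w_n \in L^{BC}_\alpha$. By Theorem 19.22, choose an ordinal $\beta > \alpha$ such that the formula $x \in z \wedge \varphi$ is absolute for $L^{BC}_\beta, L^{BC}$. Then
\[
\begin{aligned}
\{x \in z : \varphi^{L^{BC}}(x, z, w_1, \ldots, w_n)\}
&= \{x \in L^{BC}_\beta : (x \in z \wedge \varphi^{L^{BC}}(x, z, w_1, \ldots, w_n))\} \\
&= \{x \in L^{BC}_\beta : (x \in z \wedge \varphi^{L^{BC}_\beta}(x, z, w_1, \ldots, w_n))\} && \text{(absoluteness)} \\
&\in D(L^{BC}_\beta, C) && \text{(by Lemma 19.16)} \\
&= L^{BC}_{\beta+1},
\end{aligned}
\]
and (1) holds.
Pairing: See Theorem 1.8. Suppose that $x, y \in L^{BC}$. Choose $\alpha$ so that $x, y \in L^{BC}_\alpha$. Then by Lemma 19.16,
\[ \{x, y\} = \{z \in L^{BC}_\alpha : (z = x \vee z = y)^{L^{BC}_\alpha}\} \in D(L^{BC}_\alpha, C) = L^{BC}_{\alpha+1}, \]
as desired.
Union: See Theorem 1.9. Suppose that $x \in L^{BC}$. Choose $\alpha$ so that $x \in L^{BC}_\alpha$. Then
\[
\begin{aligned}
\bigcup x &= \{z : \exists u\,(z \in u \wedge u \in x)\} \\
&= \{z \in L^{BC}_\alpha : (\exists u\,(z \in u \wedge u \in x))^{L^{BC}_\alpha}\} && \text{(since } L^{BC}_\alpha \text{ is transitive)} \\
&\in D(L^{BC}_\alpha, C) = L^{BC}_{\alpha+1},
\end{aligned}
\]
as desired.
Power set: See Theorem 1.10. Suppose that $x \in L^{BC}$. For each $z \in L^{BC}$ such that $z \subseteq x$ choose $\beta_z$ such that $z \in L^{BC}_{\beta_z}$. Let $\gamma = \sup_{z \subseteq x} \beta_z$. Then, using Theorem 19.20(v),
\[ \mathcal{P}(x) \cap L^{BC} = \{z \in L^{BC} : z \subseteq x\} \subseteq L^{BC}_\gamma \in L^{BC}_{\gamma+1} \subseteq L^{BC}, \]
as desired.
Replacement: See Theorem 1.11. Suppose that $\varphi$ is a formula with free variables among $x, y, A, w_1, \ldots, w_n$, we are given $A, w_1, \ldots, w_n \in L^{BC}$, and
\[ (1) \qquad \forall x \in A\, \exists! y\,[y \in L^{BC} \wedge \varphi^{L^{BC}}(x, y, A, w_1, \ldots, w_n)]. \]
For each $x \in A$ let $z_x$ be such that $z_x \in L^{BC}$ and $\varphi^{L^{BC}}(x, z_x, A, w_1, \ldots, w_n)$; we are using the replacement axiom here. Then for each $x \in A$ choose $\beta_x$ so that $z_x \in L^{BC}_{\beta_x}$. Let $\gamma = \sup_{x \in A} \beta_x$. Suppose now that $y \in L^{BC}$ and $\varphi^{L^{BC}}(x, y, A, w_1, \ldots, w_n)$ for some $x \in A$. Then by (1), $y = z_x$, and hence $y \in L^{BC}_{\beta_x} \subseteq L^{BC}_\gamma$. This proves that
\[ \{y \in L^{BC} : \exists x \in A\, \varphi^{L^{BC}}(x, y, A, w_1, \ldots, w_n)\} \subseteq L^{BC}_\gamma. \]
Since $L^{BC}_\gamma \in L^{BC}_{\gamma+1} \subseteq L^{BC}$, this is as desired.
Foundation: holds by Theorem 1.12.
Infinity: Since $\omega \in L^{BC}_{\omega+1} \subseteq L^{BC}$ by Theorem 19.20(iv), the infinity axiom holds by Theorem 1.21.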
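Theorem 19.24. Suppose that $M$ is a transitive proper class model of ZF, $B \in M$, and $C \cap M \in M$. Then $L^{BC} = (L^{BC})^M \subseteq M$.
Proof. Take any ordinal $\alpha$. Then $M \not\subseteq L^{BC}_\alpha$, since $M$ is a proper class; so choose $x \in M \setminus L^{BC}_\alpha$. Then $\mathrm{rank}^{BC}(x) \geq \alpha$. Now $\mathrm{rank}^{BC}(x) = (\mathrm{rank}^{BC})^M(x)$ by absoluteness, so $\mathrm{rank}^{BC}(x) \in M$, and hence $\alpha \in M$. This proves that $\mathrm{On} \subseteq M$.
It follows by absoluteness of $L^{BC}_\alpha$ that
\[ (L^{BC})^M = \{x \in M : (\exists \alpha\,(x \in L^{BC}_\alpha))^M\} = \bigcup_{\alpha \in \mathrm{On}} (L^{BC}_\alpha)^M = \bigcup_{\alpha \in \mathrm{On}} L^{BC}_\alpha = L^{BC}. \]
Hence $L^{BC} = (L^{BC})^M \subseteq M$.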

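Applying Theorem 19.24 within a c.t.m. $M$ of ZFC, we obtain the following corollary:
Corollary 19.25. Suppose that $M$ is a c.t.m. of ZFC, $B, C \in M$, and $N \subseteq M$ is a c.t.m. of ZF. Then:
(i) If $B \in N$, then $(L(B))^M \subseteq N$.
(ii) If $C \cap N \in N$, then $(L[C])^M \subseteq N$.
By Lemma 19.15 and Theorem 1.24 we have:
Corollary 19.26. The function $\langle L_\alpha(B) : B \in V, \alpha \in \mathrm{On}\rangle$ is absolute for transitive class models of ZFC, and so is the function $\langle L(B) : B \in V\rangle$.
Corollary 19.27. $L(B)$ is a model of ZF + $V = L(B)$.
Proof. We want to prove that $\forall x \in L(B)\, \exists \alpha \in L(B)\,(x \in L_\alpha(B))^{L(B)}$. So, let $x \in L(B)$. Choose $\alpha$ such that $x \in L_\alpha(B)$. Now $\alpha \in L(B)$ and $x \in (L_\alpha(B))^{L(B)}$ by Corollary 19.26.
Ordinal definability
Let $T$ be a transitive set. Then $\mathrm{OD}(T)$ is the class of all sets $a$ such that
\[ \exists \beta > \mathrm{rank}(a)\, \exists n \in \omega\, \exists s \in {}^n\beta\, \exists R \in \mathrm{Df}(V_\beta, T, n + 1)\, \forall x \in V_\beta\,[s^\frown\langle x\rangle \in R \leftrightarrow x = a]. \]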
EXERCISES
E19.1. Show that the axiom of choice holds in L(B) iff trcl({B}) can be well-ordered in
L(B).
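E19.2. Recall from elementary set theory the following definition of the standard well-ordering of $\mathrm{On} \times \mathrm{On}$:
\[
\begin{aligned}
(\alpha, \beta) \prec (\gamma, \delta) \quad\text{iff}\quad & (\alpha \cup \beta < \gamma \cup \delta) \\
& \text{or } (\alpha \cup \beta = \gamma \cup \delta \text{ and } \alpha < \gamma) \\
& \text{or } (\alpha \cup \beta = \gamma \cup \delta \text{ and } \alpha = \gamma \text{ and } \beta < \delta).
\end{aligned}
\]
Prove that $\prec$ is absolute for transitive class models of ZF.
Now define $\Gamma : \mathrm{On} \to \mathrm{On} \times \mathrm{On}$ by recursion as follows:
\[
\begin{aligned}
\Gamma(0) &= (0, 0); \\
\Gamma(\xi + 1) &=
\begin{cases}
(\alpha, \beta + 1) & \text{if } \Gamma(\xi) = (\alpha, \beta) \text{ and } \beta < \alpha, \\
(0, \alpha + 1) & \text{if } \Gamma(\xi) = (\alpha, \beta) \text{ and } \alpha = \beta, \\
(\alpha + 1, \beta) & \text{if } \Gamma(\xi) = (\alpha, \beta) \text{ and } \alpha + 1 < \beta, \\
(\beta, 0) & \text{if } \Gamma(\xi) = (\alpha, \beta) \text{ and } \alpha + 1 = \beta;
\end{cases} \\
\Gamma(\gamma) &= \prec\text{-least } (\alpha, \beta) \text{ such that } \forall \xi < \gamma\,[\Gamma(\xi) \prec (\alpha, \beta)] \quad \text{if } \gamma \text{ is limit.}
\end{aligned}
\]
Prove:
(1) If $\xi < \eta$, then $\Gamma(\xi) \prec \Gamma(\eta)$.
(2) $\Gamma$ maps onto $\mathrm{On} \times \mathrm{On}$.
(3) $\Gamma$ is absolute for transitive class models of ZF.
(4) $\Gamma^{-1}$ is absolute for transitive class models of ZF.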
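On finite ordinals the standard well-ordering of pairs (compare maxima first, then first coordinates, then second coordinates) and the successor clause of the recursion can be tried out directly. The following is a sketch restricted to natural numbers, where $\alpha \cup \beta$ is just `max`; the names `key` and `successor` are illustrative, and nothing here addresses the limit clause or absoluteness.

```python
from itertools import product

def key(pair):
    """Sort key realizing the standard ordering on pairs of natural numbers:
    compare max(a, b) first, then a, then b."""
    a, b = pair
    return (max(a, b), a, b)

def successor(pair):
    """The immediate successor of (a, b) in the standard ordering."""
    a, b = pair
    if b < a:
        return (a, b + 1)
    if a == b:
        return (0, a + 1)
    if a + 1 < b:
        return (a + 1, b)
    return (b, 0)  # remaining case: a + 1 == b

def enumerate_pairs(count):
    """The first `count` pairs, starting from (0, 0) and taking successors."""
    out, current = [], (0, 0)
    for _ in range(count):
        out.append(current)
        current = successor(current)
    return out

print(enumerate_pairs(9))
```

The pairs with both coordinates below $m$ form an initial segment of the ordering, so the first $m^2$ values of the enumeration should agree with sorting those pairs by `key`.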

E19.3. Suppose that $M$ is a transitive class model of ZFC, and every set of ordinals is in $M$. Show that $M = V$. Hint: take any set $X$. Let $\kappa = |\mathrm{trcl}(\{X\})|$, and let $f : \kappa \to \mathrm{trcl}(\{X\})$ be a bijection. Define $\alpha E \beta$ iff $\alpha, \beta < \kappa$ and $f(\alpha) \in f(\beta)$. Use exercise E19.2 to show that $E \in M$. Take the Mostowski collapse of $(\kappa, E)$ in $M$, and infer that $X \in M$.
E19.4. Show that if $X \subseteq \omega_1$ then CH holds in $L(X)$. Hint: show that if $A \in \mathcal{P}(\omega)$ in $L(X)$, then there are $\alpha, \beta < \omega_1$ such that $A \in L_\alpha(X \cap \beta)$.
E19.5. Show that if $X \subseteq \omega_1$ then GCH holds in $L(X)$.


20. Isomorphisms and $\neg$AC

In this chapter we prove that if ZF is consistent, then so is ZF + $\neg$AC. First, however, we go into the relationship of isomorphisms of quasi-orders to forcing and generic sets; this is needed for the consistency proof, and is independently interesting and important.
Let $P$ and $Q$ be quasi-orders, and suppose that $f : P \to Q$. We define a function $f_*$ with domain $V^P$ by recursion by setting, for any $\tau \in V^P$,
\[ f_*(\tau) = \{(f_*(\sigma), f(p)) : (\sigma, p) \in \tau\}. \]
Proposition 20.1. Suppose that $P$ and $Q$ are quasi-orders and $f$ is a function mapping $P$ into $Q$. Then
(i) $f_*(\tau) \in V^Q$ for any $\tau \in V^P$.
(ii) $f_*$ is absolute for c.t.m. of ZFC.
(iii) If $M$ is a c.t.m. of ZFC, then $f_*$ maps $M^P$ into $M^Q$.
Proof. (i) is easily proved by induction on $\tau$. (ii) follows from absoluteness of recursive definitions. (iii) follows from (i), (ii).
Again, let $P$ and $Q$ be quasi-orders. An isomorphism from $P$ to $Q$ is a bijection $f$ from $P$ onto $Q$ such that $f(1_P) = 1_Q$, and for any $p, r \in P$, $p \leq_P r$ iff $f(p) \leq_Q f(r)$.
As with other mathematical notions of isomorphisms, an isomorphism of quasi-orders
extends in a routine way to mappings of structures derived from the quasi-orders. We give
several results which carry out this routine analysis.
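Lemma 20.2. If $P$ and $Q$ are quasi-orders and $f$ is an isomorphism from $P$ to $Q$, then $f_*(\check{x}^P) = \check{x}^Q$ for every set $x$.
Proof. The proof is by $\in$-induction:
\[
\begin{aligned}
f_*(\check{x}^P) &= \{(f_*(\sigma), f(p)) : (\sigma, p) \in \check{x}^P\} \\
&= \{(f_*(\check{y}^P), f(1_P)) : y \in x\} \\
&= \{(\check{y}^Q, 1_Q) : y \in x\} \\
&= \check{x}^Q.
\end{aligned}
\]
Lemma 20.3. Suppose that $P, Q, R$ are quasi-orders, $g$ is an isomorphism from $P$ to $Q$, and $f$ is an isomorphism from $Q$ to $R$. Then $f \circ g$ is an isomorphism from $P$ to $R$, and $f_*(g_*(\tau)) = (f \circ g)_*(\tau)$ for every $\tau \in V^P$.
Proof. Obviously $f \circ g$ is an isomorphism from $P$ to $R$. We prove the second statement by induction:
\[
\begin{aligned}
f_*(g_*(\tau)) &= \{(f_*(\sigma), f(p)) : (\sigma, p) \in g_*(\tau)\} \\
&= \{(f_*(g_*(\rho)), f(g(q))) : (\rho, q) \in \tau\} \\
&= \{((f \circ g)_*(\rho), (f \circ g)(q)) : (\rho, q) \in \tau\} \\
&= (f \circ g)_*(\tau).
\end{aligned}
\]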

Corollary 20.4. If $f$ is an isomorphism from a quasi-order $P$ to a quasi-order $Q$, then $f_*$ is a bijection from $V^P$ to $V^Q$.
If $f$ is an isomorphism of $P$ with $Q$, then in the terminology of Chapter 2 it is easy to see that $e_Q \circ f$ satisfies the following conditions:
(1) $e_Q[f[P]]$ is dense in $\mathrm{RO}(Q)$.
(2) For all $p, q \in P$, if $p \leq q$ then $e_Q(f(p)) \leq e_Q(f(q))$.
(3) For any $p, q \in P$, $p \perp q$ iff $e_Q(f(p)) \cdot e_Q(f(q)) = 0$.
Hence from Theorem 2.22 it follows that there is a unique isomorphism $f^*$ of $\mathrm{RO}(P)$ onto $\mathrm{RO}(Q)$ such that $f^* \circ e_P = e_Q \circ f$.
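Lemma 20.5. Let $f$ be an isomorphism from a quasi-order $P$ onto a quasi-order $Q$. Then
\[ f^*([[\varphi(\sigma_0, \ldots, \sigma_{n-1})]]_{\mathrm{RO}(P)}) = [[\varphi(f_*(\sigma_0), \ldots, f_*(\sigma_{n-1}))]]_{\mathrm{RO}(Q)}. \]
Proof. We omit the subscripts $P$ and $Q$.
First take atomic equality formulas, by well-founded induction:
\[
\begin{aligned}
f^*([[\sigma = \tau]])
&= f^*\Bigl(\prod_{(\xi,p) \in \tau}\Bigl(-e(p) + \sum_{(\rho,q) \in \sigma}(e(q) \cdot [[\rho = \xi]])\Bigr) \cdot \prod_{(\rho,q) \in \sigma}\Bigl(-e(q) + \sum_{(\xi,p) \in \tau}(e(p) \cdot [[\rho = \xi]])\Bigr)\Bigr) \\
&= \prod_{(\xi,p) \in \tau}\Bigl(-f^*(e(p)) + \sum_{(\rho,q) \in \sigma}(f^*(e(q)) \cdot f^*([[\rho = \xi]]))\Bigr) \\
&\qquad \cdot \prod_{(\rho,q) \in \sigma}\Bigl(-f^*(e(q)) + \sum_{(\xi,p) \in \tau}(f^*(e(p)) \cdot f^*([[\rho = \xi]]))\Bigr) \\
&= \prod_{(\xi,p) \in \tau}\Bigl(-e(f(p)) + \sum_{(\rho,q) \in \sigma}(e(f(q)) \cdot [[f_*(\rho) = f_*(\xi)]])\Bigr) \\
&\qquad \cdot \prod_{(\rho,q) \in \sigma}\Bigl(-e(f(q)) + \sum_{(\xi,p) \in \tau}(e(f(p)) \cdot [[f_*(\rho) = f_*(\xi)]])\Bigr) \\
&= \prod_{(\xi,p) \in f_*(\tau)}\Bigl(-e(p) + \sum_{(\rho,q) \in f_*(\sigma)}(e(q) \cdot [[\rho = \xi]])\Bigr) \cdot \prod_{(\rho,q) \in f_*(\sigma)}\Bigl(-e(q) + \sum_{(\xi,p) \in f_*(\tau)}(e(p) \cdot [[\rho = \xi]])\Bigr) \\
&= [[f_*(\sigma) = f_*(\tau)]].
\end{aligned}
\]
Now we prove the lemma itself by induction on $\varphi$, thus officially outside our usual mathematical language. The atomic equality case has already been treated. Here is the remaining argument, with obvious inductive assumptions:
\[
\begin{aligned}
f^*([[\sigma \in \tau]]) &= f^*\Bigl(\sum_{(\xi,p) \in \tau}(e(p) \cdot [[\sigma = \xi]])\Bigr) \\
&= \sum_{(\xi,p) \in \tau}(f^*(e(p)) \cdot f^*([[\sigma = \xi]])) \\
&= \sum_{(\xi,p) \in \tau}(e(f(p)) \cdot [[f_*(\sigma) = f_*(\xi)]]) \\
&= \sum_{(\xi,q) \in f_*(\tau)}(e(q) \cdot [[f_*(\sigma) = \xi]]) \\
&= [[f_*(\sigma) \in f_*(\tau)]];
\end{aligned}
\]
\[
\begin{aligned}
f^*([[\neg\varphi(\sigma_0, \ldots, \sigma_{n-1})]]) &= f^*(-[[\varphi(\sigma_0, \ldots, \sigma_{n-1})]]) \\
&= -f^*([[\varphi(\sigma_0, \ldots, \sigma_{n-1})]]) \\
&= -[[\varphi(f_*(\sigma_0), \ldots, f_*(\sigma_{n-1}))]] \\
&= [[\neg\varphi(f_*(\sigma_0), \ldots, f_*(\sigma_{n-1}))]];
\end{aligned}
\]
\[
\begin{aligned}
&f^*([[\varphi(\sigma_0, \ldots, \sigma_{n-1}) \vee \psi(\sigma_0, \ldots, \sigma_{n-1})]]) \\
&\quad = f^*([[\varphi(\sigma_0, \ldots, \sigma_{n-1})]] + [[\psi(\sigma_0, \ldots, \sigma_{n-1})]]) \\
&\quad = f^*([[\varphi(\sigma_0, \ldots, \sigma_{n-1})]]) + f^*([[\psi(\sigma_0, \ldots, \sigma_{n-1})]]) \\
&\quad = [[\varphi(f_*(\sigma_0), \ldots, f_*(\sigma_{n-1}))]] + [[\psi(f_*(\sigma_0), \ldots, f_*(\sigma_{n-1}))]] \\
&\quad = [[\varphi(f_*(\sigma_0), \ldots, f_*(\sigma_{n-1})) \vee \psi(f_*(\sigma_0), \ldots, f_*(\sigma_{n-1}))]];
\end{aligned}
\]
\[
\begin{aligned}
f^*([[\exists x \varphi(\sigma_0, \ldots, \sigma_{n-1}, x)]]) &= f^*\Bigl(\sum_{\tau \in V^P} [[\varphi(\sigma_0, \ldots, \sigma_{n-1}, \tau)]]\Bigr) \\
&= \sum_{\tau \in V^P} f^*([[\varphi(\sigma_0, \ldots, \sigma_{n-1}, \tau)]]) \\
&= \sum_{\tau \in V^P} [[\varphi(f_*(\sigma_0), \ldots, f_*(\sigma_{n-1}), f_*(\tau))]] \\
&= \sum_{\rho \in V^Q} [[\varphi(f_*(\sigma_0), \ldots, f_*(\sigma_{n-1}), \rho)]] \\
&= [[\exists x \varphi(f_*(\sigma_0), \ldots, f_*(\sigma_{n-1}), x)]].
\end{aligned}
\]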

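Lemma 20.6. If $P$ and $Q$ are quasi-orders and $f$ is an isomorphism of $P$ to $Q$, then for any $p \in P$,
\[ p \Vdash_P \varphi(\sigma_0, \ldots, \sigma_{n-1}) \quad\text{iff}\quad f(p) \Vdash_Q \varphi(f_*(\sigma_0), \ldots, f_*(\sigma_{n-1})). \]
Proof.
\[
\begin{aligned}
p \Vdash \varphi(\sigma_0, \ldots, \sigma_{n-1}) \quad&\text{iff}\quad e(p) \leq [[\varphi(\sigma_0, \ldots, \sigma_{n-1})]] \\
&\text{iff}\quad f^*(e(p)) \leq f^*([[\varphi(\sigma_0, \ldots, \sigma_{n-1})]]) \\
&\text{iff}\quad e(f(p)) \leq [[\varphi(f_*(\sigma_0), \ldots, f_*(\sigma_{n-1}))]] \\
&\text{iff}\quad f(p) \Vdash \varphi(f_*(\sigma_0), \ldots, f_*(\sigma_{n-1})).
\end{aligned}
\]
Corollary 20.7. If $P$ and $Q$ are quasi-orders and $f$ is an isomorphism of $P$ to $Q$, then for any $p \in P$,
\[ p \Vdash \varphi(\check{x}_0, \ldots, \check{x}_{n-1}) \quad\text{iff}\quad f(p) \Vdash \varphi(\check{x}_0, \ldots, \check{x}_{n-1}). \]
Proof. By Lemmas 20.2 and 20.6.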

Lemma 20.8. Suppose that $M$ is a c.t.m. of ZFC, $P, Q \in M$ are quasi-orders, $f \in M$ is an isomorphism of $P$ to $Q$, and $G$ is $P$-generic over $M$. Then $f[G]$ is $Q$-generic over $M$. Moreover, if $\sigma_G \in \tau_G$, then $(f_*(\sigma))_{f[G]} \in (f_*(\tau))_{f[G]}$, and if $\sigma_G = \tau_G$, then $(f_*(\sigma))_{f[G]} = (f_*(\tau))_{f[G]}$.
Proof. We skip some details. If $D \subseteq Q$ is dense and $D \in M$, clearly $f^{-1}[D] \in M$ is dense in $P$; choose $p \in f^{-1}[D] \cap G$. Then $f(p) \in D \cap f[G]$. So $f[G]$ is $Q$-generic over $M$.
Now suppose that $\sigma_G \in \tau_G$. Choose $p \in G$ such that $p \Vdash \sigma \in \tau$. Then $f(p) \in f[G]$ and $f(p) \Vdash f_*(\sigma) \in f_*(\tau)$ by Lemma 20.6. Hence $(f_*(\sigma))_{f[G]} \in (f_*(\tau))_{f[G]}$.
Similarly, $\sigma_G = \tau_G$ implies that $(f_*(\sigma))_{f[G]} = (f_*(\tau))_{f[G]}$.
Lemma 20.9. Suppose that $M$ is a c.t.m. of ZFC, $P, Q \in M$ are quasi-orders, $f \in M$ is an isomorphism of $P$ to $Q$, and $G$ is $P$-generic over $M$. Then $M[G] = M[f[G]]$, and there is a bijection $\hat{f} : M[G] \to M[G]$ such that $\hat{f}(\sigma_G) = (f_*(\sigma))_{f[G]}$ for every $\sigma \in M^P$. Moreover, for $x, y \in M[G]$ we have $x \in y$ iff $\hat{f}(x) \in \hat{f}(y)$.
Proof. Clearly $f[G] \in M[G]$. Hence by Lemma 3.5, $M[f[G]] \subseteq M[G]$. Applying this to $f^{-1}$, we get $M[G] = M[f^{-1}[f[G]]] \subseteq M[f[G]]$. So $M[G] = M[f[G]]$.
By Lemma 20.8 there is a function $\hat{f} : M[G] \to M[G]$ such that $\hat{f}(\sigma_G) = (f_*(\sigma))_{f[G]}$ for every $\sigma \in M^P$. $\hat{f}$ is a bijection since
\[ \widehat{f^{-1}}(\hat{f}(\sigma_G)) = \widehat{f^{-1}}((f_*(\sigma))_{f[G]}) = ((f^{-1})_*(f_*(\sigma)))_{f^{-1}[f[G]]} = \sigma_G, \]
so that $\widehat{f^{-1}} \circ \hat{f}$ is the identity on $M[G]$; and similarly $\hat{f} \circ \widehat{f^{-1}}$ is the identity on $M[f[G]] = M[G]$. Finally, by Lemma 20.8, $\sigma_G \in \tau_G$ iff $\hat{f}(\sigma_G) \in \hat{f}(\tau_G)$.
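This finishes our general discussion of isomorphisms. We now turn to more special considerations needed for our proof of consistency of $\neg$AC.
$P$ is almost homogeneous iff for all $p, q \in P$ there is an automorphism $f$ of $P$ such that $f(p)$ and $q$ are compatible.
Lemma 20.10. Let $P$ be an almost homogeneous quasi-order. Then
(i) If there is a $p$ such that $p \Vdash \varphi(\check{x}_0, \ldots, \check{x}_{n-1})$, then $1 \Vdash \varphi(\check{x}_0, \ldots, \check{x}_{n-1})$.
(ii) Either $1 \Vdash \varphi(\check{x}_0, \ldots, \check{x}_{n-1})$ or $1 \Vdash \neg\varphi(\check{x}_0, \ldots, \check{x}_{n-1})$.
Proof. (i): Assume that $p \Vdash \varphi(\check{x}_0, \ldots, \check{x}_{n-1})$, but suppose that $1 \not\Vdash \varphi(\check{x}_0, \ldots, \check{x}_{n-1})$. Thus $[[\varphi(\check{x}_0, \ldots, \check{x}_{n-1})]] \neq 1$, so there is a $q$ such that $e(q) \leq -[[\varphi(\check{x}_0, \ldots, \check{x}_{n-1})]]$; so $q \Vdash \neg\varphi(\check{x}_0, \ldots, \check{x}_{n-1})$. Let $f$ be an automorphism such that $f(p)$ and $q$ are compatible. By Corollary 20.7, $f(p) \Vdash \varphi(\check{x}_0, \ldots, \check{x}_{n-1})$. If $r \leq f(p), q$ we then have $e(r) \leq [[\varphi(\check{x}_0, \ldots, \check{x}_{n-1})]] \cdot [[\neg\varphi(\check{x}_0, \ldots, \check{x}_{n-1})]]$, contradiction.
(ii): Suppose that $1 \not\Vdash \varphi(\check{x}_0, \ldots, \check{x}_{n-1})$. By (i), $p \not\Vdash \varphi(\check{x}_0, \ldots, \check{x}_{n-1})$ for all $p$. Hence by Lemma 3.13(ix), $1 \Vdash \neg\varphi(\check{x}_0, \ldots, \check{x}_{n-1})$.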
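Lemma 20.11. Suppose that $I$ and $J$ are sets, $\lambda$ is an infinite cardinal, and $\pi = \langle \pi_i : i \in I\rangle$ is a system of permutations of $J$. For each $f \in \mathrm{Fn}(I, J, \lambda)$ define $\pi^*(f)$ to be the function given as follows:
\[ \mathrm{dmn}(\pi^*(f)) = \mathrm{dmn}(f); \qquad (\pi^*(f))(i) = \pi_i(f(i)) \text{ for each } i \in \mathrm{dmn}(f). \]
Then $\pi^*$ is an automorphism of $\mathrm{Fn}(I, J, \lambda)$.
Proof. Clearly $\pi^*(f) \in \mathrm{Fn}(I, J, \lambda)$ for any $f \in \mathrm{Fn}(I, J, \lambda)$. Now $\pi^*$ is one-one: suppose that $\pi^*(f) = \pi^*(g)$. Then
\[ \mathrm{dmn}(f) = \mathrm{dmn}(\pi^*(f)) = \mathrm{dmn}(\pi^*(g)) = \mathrm{dmn}(g), \]
and for any $i \in \mathrm{dmn}(f)$,
\[ f(i) = \pi_i^{-1}(\pi_i(f(i))) = \pi_i^{-1}((\pi^*(f))(i)) = \pi_i^{-1}((\pi^*(g))(i)) = \pi_i^{-1}(\pi_i(g(i))) = g(i). \]
So $f = g$.
Next, $\pi^*$ maps onto $\mathrm{Fn}(I, J, \lambda)$. For, let $g \in \mathrm{Fn}(I, J, \lambda)$. Let $f(i) = \pi_i^{-1}(g(i))$ for all $i \in \mathrm{dmn}(g)$, with $\mathrm{dmn}(f) = \mathrm{dmn}(g)$. Clearly $\pi^*(f) = g$.
Now suppose that $f, g \in \mathrm{Fn}(I, J, \lambda)$ and $f \leq g$, i.e., $g \subseteq f$. Then for $i \in \mathrm{dmn}(g)$ we have $(\pi^*(f))(i) = \pi_i(f(i)) = \pi_i(g(i)) = (\pi^*(g))(i)$. So $\pi^*(f) \leq \pi^*(g)$. The other implication is proved similarly.
Lemma 20.12. $\mathrm{Fn}(I, J, \lambda)$ is almost homogeneous.
Proof. Suppose that $f, g \in \mathrm{Fn}(I, J, \lambda)$. For each $i \in I$ let $\pi_i$ be the following permutation of $J$: if $i \notin \mathrm{dmn}(f) \cap \mathrm{dmn}(g)$, then $\pi_i$ is the identity on $J$. If $i \in \mathrm{dmn}(f) \cap \mathrm{dmn}(g)$, then $\pi_i$ is the transposition $(f(i), g(i))$ (which may be the identity). Then we claim that $\pi^*(f)$ and $g$ are compatible. For, suppose that $i \in \mathrm{dmn}(\pi^*(f)) \cap \mathrm{dmn}(g) = \mathrm{dmn}(f) \cap \mathrm{dmn}(g)$. Then $(\pi^*(f))(i) = \pi_i(f(i)) = g(i)$, as desired.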
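The construction in the proof of almost homogeneity can be tried on finite partial functions. The sketch below assumes conditions are Python dicts (finite partial functions from $I$ to $J$), ordered by reverse inclusion; the helper names `transposition`, `make_system`, `pi_star`, `compatible`, and `extends` are illustrative.

```python
def transposition(a, b):
    """The permutation of J swapping a and b (the identity if a == b)."""
    def swap(x):
        if x == a:
            return b
        if x == b:
            return a
        return x
    return swap

def make_system(f, g):
    """The system of permutations from the proof: the transposition
    (f(i), g(i)) on the common domain, the identity elsewhere."""
    common = f.keys() & g.keys()
    def pi(i):
        if i in common:
            return transposition(f[i], g[i])
        return lambda x: x
    return pi

def pi_star(pi, f):
    """Apply the induced map: same domain, values moved by pi_i."""
    return {i: pi(i)(v) for i, v in f.items()}

def compatible(f, g):
    """Two finite partial functions are compatible iff they agree on the common domain."""
    return all(f[i] == g[i] for i in f.keys() & g.keys())

def extends(f, g):
    """f <= g in the forcing order, i.e. f extends g as a function."""
    return all(i in f and f[i] == v for i, v in g.items())

f = {0: "a", 1: "b"}
g = {1: "c", 2: "a"}
pi = make_system(f, g)
print(compatible(f, g), compatible(pi_star(pi, f), g))
```

Here `f` and `g` disagree at the index 1, and after applying the induced map the image of `f` takes the value of `g` there, so the two become compatible.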
We now need a basic result about product forcing; this result will be useful also when
discussing iterated forcing later.
Let $P$ and $Q$ be quasi-orders. Their product $P \times Q$ is defined to be $(P \times Q, \leq, 1)$, where
\[ (p_1, q_1) \leq (p_2, q_2) \quad\text{iff}\quad p_1 \leq_P p_2 \text{ and } q_1 \leq_Q q_2; \qquad 1 = (1_P, 1_Q). \]
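The product order is just the coordinatewise order, which can be written down directly. In the sketch below a quasi-order is represented by a Python predicate `leq(x, y)`, and the example factors are finite sets ordered by reverse inclusion with the empty set as largest element; these representation choices and the name `product_leq` are assumptions made for illustration.

```python
def product_leq(leq_P, leq_Q):
    """Return the coordinatewise order on pairs (p, q)."""
    def leq(x, y):
        (p1, q1), (p2, q2) = x, y
        return leq_P(p1, p2) and leq_Q(q1, q2)
    return leq

# Example factors: finite sets ordered by reverse inclusion (bigger set = stronger condition).
def reverse_inclusion(a, b):
    return a >= b

leq = product_leq(reverse_inclusion, reverse_inclusion)
one = (frozenset(), frozenset())  # the pair (1_P, 1_Q)

x = (frozenset({1, 2}), frozenset({3}))
y = (frozenset({1}), frozenset())
print(leq(x, y), leq(y, x), leq(x, one))
```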

Theorem 20.13. Suppose that $M$ is a c.t.m. of ZFC, $P, Q \in M$ are quasi-orders, $G_0 \subseteq P$, and $G_1 \subseteq Q$. Then the following are equivalent:
(i) $G_0 \times G_1$ is $(P \times Q)$-generic over $M$.
(ii) $G_0$ is $P$-generic over $M$ and $G_1$ is $Q$-generic over $M[G_0]$.
(iii) $G_1$ is $Q$-generic over $M$ and $G_0$ is $P$-generic over $M[G_1]$.
Moreover, if one of (i)-(iii) holds, then $M[G_0 \times G_1] = M[G_0][G_1] = M[G_1][G_0]$.
Proof. (i)$\Rightarrow$(ii): Assume that $G_0 \times G_1$ is $(P \times Q)$-generic over $M$. Suppose that $p \in G_0$ and $p \leq p' \in P$. Then $(p, 1) \in G_0 \times G_1$ and $(p, 1) \leq (p', 1)$, so $(p', 1) \in G_0 \times G_1$; so $p' \in G_0$.
Suppose that $p, p' \in G_0$. Then $(p, 1), (p', 1) \in G_0 \times G_1$, so they are compatible. Clearly this implies that $p$ and $p'$ are compatible.
Let $D \in M$ be a dense subset of $P$. Define $E = \{(p, q) \in P \times Q : p \in D\}$. Then $E \in M$ and it is clearly dense in $P \times Q$. Hence we can choose $(p, q) \in E \cap (G_0 \times G_1)$. So $p \in D \cap G_0$, as desired.
Thus $G_0$ is $P$-generic over $M$.
Now by arguments similar to the above, $G_1$ satisfies the conditions to be $Q$-generic over $M[G_0]$ except possibly the denseness condition. So, suppose that $D \in M[G_0]$ is a dense subset of $Q$. Take $\tau \in M^P$ such that $\tau_{G_0} = D$. Then there is a $p \in G_0$ such that
\[ (1) \qquad p \Vdash \tau \text{ is dense in } \check{Q}. \]
Define
\[ E = \{(p', q) : p' \leq p \text{ and } p' \Vdash \check{q} \in \tau\}. \]
Thus $E$ is a subset of $P \times Q$; we claim that it is dense below $(p, 1)$. To prove this, take any $(p', q') \leq (p, 1)$. Now $p' \leq p$, so by (1),
\[ p' \Vdash \forall x \in \check{Q}\, \exists y \in \check{Q}\,(y \in \tau \text{ and } y \leq x), \]
and hence
\[ p' \Vdash \exists y \in \check{Q}\,(y \in \tau \text{ and } y \leq \check{q}'). \]
Hence by Proposition 3.17(i), there is a $p'' \leq p'$ and a $q'' \in Q$ such that
\[ p'' \Vdash \check{q}'' \in \tau \text{ and } \check{q}'' \leq \check{q}'. \]
Then by Lemma 3.23, $q'' \leq q'$. Hence $p'' \leq p' \leq p$ and $p'' \Vdash \check{q}'' \in \tau$, so $(p'', q'') \in E$. Also $(p'', q'') \leq (p', q')$. So this proves our claim that $E$ is dense below $(p, 1)$.
Since $p \in G_0$, we have $(p, 1) \in G_0 \times G_1$, so by the genericity of $G_0 \times G_1$ we get $(G_0 \times G_1) \cap E \neq \emptyset$; say that $(r, s) \in (G_0 \times G_1) \cap E$. Thus $r \in G_0$, $s \in G_1$, $r \leq p$, and $r \Vdash \check{s} \in \tau$. Hence $s \in \tau_{G_0} = D$, so $D \cap G_1 \neq \emptyset$. So we have proved (ii).
(ii)$\Rightarrow$(i): Assume (ii). First we check that $G_0 \times G_1$ is a filter. Suppose that $(p, q) \in G_0 \times G_1$ and $(p, q) \leq (p', q')$. Then $p \leq p'$, hence $p' \in G_0$; similarly, $q' \in G_1$. So $(p', q') \in G_0 \times G_1$. Now suppose that $(p, q), (p', q') \in G_0 \times G_1$. Thus $p, p' \in G_0$, so there is an $r \in G_0$ such that $r \leq p, p'$. Similarly we get an $s \in G_1$ such that $s \leq q, q'$. So $(r, s) \in G_0 \times G_1$ and $(r, s) \leq (p, q), (p', q')$. So $G_0 \times G_1$ is a filter.
Now suppose that $D \subseteq P \times Q$ is dense and is in $M$. Let
\[ D' = \{q \in Q : \text{there is a } p \in G_0 \text{ such that } (p, q) \in D\}. \]
Thus $D' \in M[G_0]$. Note that if $q \in D' \cap G_1$, then with $p$ as in the definition of $D'$ we get $(p, q) \in D \cap (G_0 \times G_1)$. Thus it suffices to show that $D'$ is dense in $Q$. To this end, suppose that $q \in Q$. Let
\[ E = \{p \in P : \text{there is a } q' \leq q \text{ such that } (p, q') \in D\}. \]
Clearly $E \in M$. Also, $E$ is dense in $P$: if $p \in P$, choose $(p', q') \in D$ such that $(p', q') \leq (p, q)$; then $p' \in E$ and $p' \leq p$, as desired. Now since $G_0$ is $P$-generic over $M$, choose $p \in G_0 \cap E$. Then by the definition of $E$, choose $q' \leq q$ such that $(p, q') \in D$. Thus $q' \in D'$ and $q' \leq q$, as desired. This proves (i).
By symmetry, (i)$\Leftrightarrow$(iii).
Now assume that one of (i)-(iii) holds, and hence all three hold. Now $M \subseteq M[G_0][G_1]$ and $G_0 \times G_1 \in M[G_0][G_1]$, so by Lemma 3.5, $M[G_0 \times G_1] \subseteq M[G_0][G_1]$. On the other hand, $M \subseteq M[G_0 \times G_1]$ and $G_0 \in M[G_0 \times G_1]$, so by Lemma 3.5, $M[G_0] \subseteq M[G_0 \times G_1]$. And $G_1 \in M[G_0 \times G_1]$, so by Lemma 3.5 yet again, $M[G_0][G_1] \subseteq M[G_0 \times G_1]$. This proves that $M[G_0 \times G_1] = M[G_0][G_1]$. By symmetry, $M[G_0 \times G_1] = M[G_1][G_0]$.
Lemma 20.14. Suppose that M is a c.t.m. of ZFC, and I, J M are uncountable
(in the sense of M ). Let P and Q be the partial orders Fin(I, 2) and Fin(J, 2) respectively.
Then for any formula (x) and any ordinal ,
1P P ((
P )L(P())

iff

1Q Q ((
Q )L(P()) .

Proof. By symmetry, say (|I| |J|)M . Let R be the partial


order Fn(I, J, 1), and
S
let H be R-generic over M . Then by the usual argument, H is a function mapping I
onto J in M [H]. Thus
(1) (|I| = |J|)M [H] .
Next,
(2) If G is P-generic over M [H], then G is P-generic over M and M [G] M [H][G] =
M [G][H].
In fact, assume that G is P-generic over M [H]. Obviously then G is P-generic over M . By
Lemma 3.5 we have M [G] M [H][G]. By Theorem 20.13, M [H][G] = M [G][H].
Recall that P preserves cardinalities. So M [G] has the same cardinals as M .
Now by Lemma 4.26, we know that R is 1 -closed in M .
(3) R is 1 -closed in M [G].
In fact, working in M [G] suppose that < 1 , p = hp : < i is a system of members
of R, and p p whenever < < . Let be a P -name with G = p, and let q G
be such that

q is a function with domain


1 and range R
< ].
, < [
269

In particular we have
R[
() = s],
q < s
and so by Proposition 3.17, for each < we can choose r G and s R such that
r q and r ()
= s . Then
(4) If < < , then s s .
In fact, choose t r , r ; this is possible since r , r G. Then
t (
) ()
()
= s (
) = s ,
hence t s s , hence by Lemma 3.23, s s , so that (4) holds.
Thus in M we have a decreasing sequence s = hs : < i, and so there is a t R
such that t s for all < . It follows that r t ()
for all < , and hence
t p for all < , as desired for (3).
(5) $(\mathscr{P}(\omega))^{M[G]} = (\mathscr{P}(\omega))^{M[G][H]}$.
For, if $f \in {}^{\omega}2$ and $f \in M[G][H]$, then by (3) and Theorem 4.16 we have $f \in M[G]$. So (5)
holds.
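(6) $1_{\mathbb{P}} \Vdash_{\mathbb{P},M} (\varphi(\check{\alpha}))^{L(\mathscr{P}(\omega))}$ iff $1_{\mathbb{P}} \Vdash_{\mathbb{P},M[H]} (\varphi(\check{\alpha}))^{L(\mathscr{P}(\omega))}$.
In fact, first suppose that $1_{\mathbb{P}} \Vdash_{\mathbb{P},M} (\varphi(\check{\alpha}))^{L(\mathscr{P}(\omega))}$; but suppose also that
$1_{\mathbb{P}} \not\Vdash_{\mathbb{P},M[H]} (\varphi(\check{\alpha}))^{L(\mathscr{P}(\omega))}$. Then by Lemma 3.13(ix) there is a
$p \in P$ such that $p \Vdash_{\mathbb{P},M[H]} \neg(\varphi(\check{\alpha}))^{L(\mathscr{P}(\omega))}$. Let $G$ be $\mathbb{P}$-generic over $M[H]$ with $p \in G$.
Hence $(\neg\varphi(\alpha))^{L(\mathscr{P}(\omega))}$ holds in $M[H][G]$. By absoluteness (Corollary 5.26) and (5),
$(\neg\varphi(\alpha))^{L(\mathscr{P}(\omega))}$ holds in $M[G]$. But $G$ is $\mathbb{P}$-generic over $M$, so this contradicts the supposition.
Second, suppose that $1_{\mathbb{P}} \Vdash_{\mathbb{P},M[H]} (\varphi(\check{\alpha}))^{L(\mathscr{P}(\omega))}$. Take any $G$ which is $\mathbb{P}$-generic over
$M[H]$. Then $(\varphi(\alpha))^{L(\mathscr{P}(\omega))}$ holds in $M[H][G]$, and hence in $M[G]$ by absoluteness (Corollary
5.26) and (5). Thus $G$ is $\mathbb{P}$-generic over $M$ and $(\varphi(\alpha))^{L(\mathscr{P}(\omega))}$ holds in $M[G]$, so there is
a $p \in G$ such that $p \Vdash_{\mathbb{P},M} (\varphi(\check{\alpha}))^{L(\mathscr{P}(\omega))}$. Hence by Lemmas 20.10 and 20.12 applied to
$\mathrm{Fin}(I, 2) = \mathrm{Fn}(I, 2, \omega)$ we get $1_{\mathbb{P}} \Vdash_{\mathbb{P},M} (\varphi(\check{\alpha}))^{L(\mathscr{P}(\omega))}$. This proves (6).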
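By symmetry we have
(7) $1_{\mathbb{Q}} \Vdash_{\mathbb{Q},M} (\varphi(\check{\alpha}))^{L(\mathscr{P}(\omega))}$ iff $1_{\mathbb{Q}} \Vdash_{\mathbb{Q},M[H]} (\varphi(\check{\alpha}))^{L(\mathscr{P}(\omega))}$.
Now in $M[H]$ we have $|I| = |J|$, as noted above. Hence in $M[H]$, the partial orders $\mathbb{P}$
and $\mathbb{Q}$ are isomorphic. Hence the conclusion of the lemma follows from (6), (7), and
Corollary 20.7.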
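Theorem 20.15. If ZF is consistent, then so is ZF + $\neg$AC.
Proof. Assume that ZF is consistent. By the theory of constructibility we know that
also ZFC is consistent, so we take a c.t.m. $M$ of ZFC. Let $\mathbb{P} = \mathrm{Fin}(\omega_1, 2)$, and let $G$ be
$\mathbb{P}$-generic over $M$, and let $N = L(\mathscr{P}(\omega))^{M[G]}$. By Theorem 5.23, $N$ is a model of ZF. We
claim that AC fails in $N$, as desired.
For, suppose that AC holds in $N$, and in $N$ let $\kappa = |\mathscr{P}(\omega)|$. Thus $(\kappa = |\mathscr{P}(\omega)|)^{L(\mathscr{P}(\omega))}$
holds in $M[G]$, and so there is a $p \in G$ such that $p \Vdash (|\check{\kappa}| = |\mathscr{P}(\omega)|)^{L(\mathscr{P}(\omega))}$. Hence
$1 \Vdash_{\mathbb{P}} (|\check{\kappa}| = |\mathscr{P}(\omega)|)^{L(\mathscr{P}(\omega))}$ by Lemmas 20.10 and 20.12.
Now let $\mathbb{Q}$ be the partial order $\mathrm{Fin}(|\kappa|^+, 2)$. By Lemma 20.14 and the preceding
paragraph we have $1 \Vdash_{\mathbb{Q}} (|\check{\kappa}| = |\mathscr{P}(\omega)|)^{L(\mathscr{P}(\omega))}$. Let $H$ be $\mathbb{Q}$-generic over $M$. Then
$(|\kappa| = |\mathscr{P}(\omega)|)^{L(\mathscr{P}(\omega))}$ holds in $M[H]$. This means that there is a bijection from $\kappa$ to
$\mathscr{P}(\omega)$ in $M[H]$. But the argument used in Cohen forcing shows that $\omega$ has at least $|\kappa|^+$
subsets in $M[H]$, contradiction.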
EXERCISES
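E20.1. Show that for any infinite cardinal $\kappa$, the partial order $\mathrm{fin}(\kappa, 2)$ is isomorphic to
$\mathrm{fin}(\kappa \times \omega, 2)$.
E20.2. Prove that if $\mathbb{P}$ and $\mathbb{Q}$ are isomorphic quasi-orders, then $\mathrm{RO}(\mathbb{P})$ and $\mathrm{RO}(\mathbb{Q})$ are
isomorphic Boolean algebras.
E20.3. Give an example of non-isomorphic quasi-orders $\mathbb{P}$ and $\mathbb{Q}$ such that $\mathrm{RO}(\mathbb{P})$ and
$\mathrm{RO}(\mathbb{Q})$ are isomorphic Boolean algebras.
E20.4. For any system $\langle \mathbb{P}_i : i \in I \rangle$, we define the weak product $\prod^w_{i \in I} \mathbb{P}_i$ as follows: the
underlying set is $\{f \in \prod_{i \in I} P_i : \{i \in I : f(i) \ne 1\}$ is finite$\}$, with $f \le g$ iff $f(i) \le_{\mathbb{P}_i} g(i)$
for all $i \in I$. Prove that for any infinite cardinal $\kappa$, the quasi-order $\mathrm{fin}(\kappa, 2)$ is isomorphic
to $\prod^w_{\alpha < \kappa} \mathbb{P}_\alpha$, where each $\mathbb{P}_\alpha$ is equal to $\mathrm{fin}(\omega, 2)$.
E20.5. Show that in the model $N$ of the proof of Theorem 20.15, $\mathscr{P}(\omega)$ cannot be well-ordered.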
E20.6. Show that the following statement is equivalent to the axiom of choice:
For every relation $R$ there is a function $f \subseteq R$ such that $\mathrm{dmn}(f) = \mathrm{dmn}(R)$.
E20.7. Show that the following statement is equivalent to the axiom of choice:
For every function $f$ there is a function $g$ such that $\mathrm{dmn}(g) = \mathrm{rng}(f)$, $\mathrm{rng}(g) \subseteq \mathrm{dmn}(f)$,
and $f(g(x)) = x$ for every $x \in \mathrm{dmn}(g)$.
E20.8. Show that the following statement is equivalent to the axiom of choice:
For any sets $A, B$, either there is a one-one function mapping $A$ into $B$ or there is a one-one
function mapping $B$ into $A$.
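For finite relations and functions the statements in the two exercises on relations and right inverses need no choice at all, which makes them easy to experiment with. The following Python sketch is only an illustration under that finiteness assumption; the names `uniformize` and `right_inverse` are hypothetical and do not come from the notes.

```python
def uniformize(relation):
    """Return a dict f with f a subset of `relation` and dmn(f) = dmn(relation)."""
    f = {}
    for x, y in sorted(relation):
        f.setdefault(x, y)  # keep the first y listed for each x
    return f


def right_inverse(func):
    """Given a function as a dict, return g with dmn(g) = rng(func) and func[g[y]] == y."""
    g = {}
    for x in sorted(func):
        g.setdefault(func[x], x)  # keep the first preimage found for each value
    return g


# A finite relation and a finite non-injective function (both made up for the example).
R = {(0, "a"), (0, "b"), (1, "b"), (2, "c")}
f = uniformize(R)
h = {0: "u", 1: "u", 2: "v"}
g = right_inverse(h)
```

In the infinite case the loop "pick the first witness" is exactly where a choice principle is needed, which is the content of the exercises.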

The remaining exercises are concerned with generalizations of the main theorem, Theorem
20.15, of this chapter. Proofs of the consistency of CH date from the 1930s, long before
forcing, and shortly before constructibility. The proofs were not relative to ZFC; one had
to admit many Urelemente (elements with no members). But the basic idea of those
proofs can be adapted to ZFC, and Theorem 20.15 is of this sort. We give some exercises
which form an introduction to the rather extensive work that has been done in this area.
Most of this work is to show that such-and-such a statement, while a consequence of ZFC,
cannot be proved in ZF, but does not imply AC either.
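E20.9. We expand the language of set theory by adding an individual constant $\emptyset$. An
Urelement is an object $a$ such that $a \ne \emptyset$ but $a$ does not have any elements. (Plural is
Urelemente.) A set is an object $x$ which is either $\emptyset$ or has an element. Both of these are
just definitions, formally like this:
$$Ur(a) \leftrightarrow a \ne \emptyset \wedge \forall x(x \notin a);$$
$$Set(x) \leftrightarrow x = \emptyset \vee \exists y(y \in x).$$
Now we let ZFU be the following set of axioms in this language:
All the axioms of ZF except extensionality and foundation.
$\forall x[\neg(x \in \emptyset)]$.
$\forall x, y[Set(x) \wedge Set(y) \wedge \forall z(z \in x \leftrightarrow z \in y) \rightarrow x = y]$.
$\forall x[Set(x) \wedge x \ne \emptyset \rightarrow \exists y \in x \forall z(z \in x \rightarrow z \notin y)]$.
We also reformulate the axiom of choice for ZFU; it is the following statement:
$$\forall \mathscr{A} \{Set(\mathscr{A}) \wedge \forall x \in \mathscr{A}[Set(x) \wedge x \ne \emptyset] \wedge \forall x \in \mathscr{A} \forall y \in \mathscr{A}[x \ne y \rightarrow \forall z[\neg(z \in x \wedge z \in y)]] \rightarrow \exists B \forall x \in \mathscr{A} \exists! y(y \in x \wedge y \in B)\}.$$
We let ZFCU be all of these axioms.
One can adapt most of elementary set theory to use these axioms; browsing through a
rigorous introduction to set theory (like the first few chapters of my m6730 notes) should
convince one of this. In this exercise, give a new definition of ordinal.
Also, show that if we add the axiom $\neg\exists a[Ur(a)]$ we get a theory equivalent to ZF.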
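E20.10. Let $\kappa$ be an infinite cardinal. Let $\alpha$ be an infinite ordinal such that $\kappa \le |V_\alpha|$. Since
$|V_{\alpha+1}| = 2^{|V_\alpha|}$, we also have $\kappa \le |V_{\alpha+1} \setminus V_\alpha|$. Let $U$ be a subset of $V_{\alpha+1} \setminus V_\alpha$ of size $\kappa$. Let
$Z$ be any element of $U$, fixed for what follows. We define $\langle W_\beta : \beta \in \mathrm{Ord} \rangle$ by recursion:
$$W_0 = U;$$
$$W_{\beta+1} = W_\beta \cup (\mathscr{P}(W_\beta) \setminus \{\emptyset\});$$
$$W_\gamma = \bigcup_{\beta < \gamma} W_\beta \text{ for } \gamma \text{ limit};$$
$$W = \bigcup_{\beta \in \mathrm{Ord}} W_\beta.$$
Prove the following:
(1) If $\beta < \gamma$, then $W_\beta \subseteq W_\gamma$.
(2) If $x \in y \in W_\beta \setminus U$, then $x \in W_\beta$. (Thus $y \in W \setminus U$ implies that $y \subseteq W$.)
(3) $W_\beta \cap V_\alpha = \emptyset$ for all $\beta$.
Now for each $x \in W$ we define its rank $\mathrm{rank}_W(x)$ in this new hierarchy. Let $\beta$ be minimum
such that $x \in W_\beta$. If $\beta = 0$, let $\mathrm{rank}_W(x) = -1$. Otherwise, $\beta$ is a successor ordinal $\gamma + 1$
and we define $\mathrm{rank}_W(x) = \gamma$.
(4) $W_\beta = \{x \in W : \mathrm{rank}_W(x) < \beta\}$.
(5) If $x, y \in W$ and $x \in y$, then $\mathrm{rank}_W(x) < \mathrm{rank}_W(y)$.
(6) If $x \in W \setminus U$, then $\mathrm{rank}_W(x) = \sup_{y \in x}(\mathrm{rank}_W(y) + 1)$.
(7) If $x \in W$, then $\mathrm{rank}(x) = \alpha + 1 + \mathrm{rank}_W(x)$.
(8) If $a \in W \setminus U$, then $W \cap a \ne \emptyset$.
(9) For any $a \in W$ we have $Ur^W(a)$ iff $a \in U \setminus \{Z\}$.
(10) For any $a \in W$ we have $Set^W(a)$ iff $a \notin U \setminus \{Z\}$.
This is clear from (9).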
E20.11. (Continuing E20.10) Show that $(W, \in, Z)$ is a model of ZFCU.
E20.12. (Continuing E20.11) Let $f$ be a permutation of $U \setminus \{Z\}$. Show that $f$ extends to
an automorphism $f^+$ of the structure $(W, \in, Z)$ in a natural way, so that $f^+(a) = \{f^+(b) : b \in W, b \in a\}$
for every $a \in W \setminus U$.
E20.13. (Continuing E20.12) An element $a$ of $W$ is $W$-transitive iff for all $b, c \in W$, if
$b \in c \in a$ then $b \in a$. Note that each member of $U$, and even each set of members of $U$,
are symmetric. Show that for any $a \in W \setminus U$ there is a smallest $W$-transitive set $T$ such
that either $T = a \subseteq U$ or $a \not\subseteq U$ and $a \subseteq T$. We call this set (which is clearly unique) the
$W$-transitive closure of $a$.
Also show that if $f$ is a permutation of $U \setminus \{Z\}$ and $a \in W \setminus U$, then $f^+$ maps the
$W$-transitive closure of $a$ onto the $W$-transitive closure of $f^+(a)$.
E20.14. (Continuing E20.13) An element $a$ of $W$ is symmetric iff there is a finite subset $F$
of $U \setminus \{Z\}$ such that $f^+(a) = a$ for every permutation $f$ of $U \setminus \{Z\}$ such that $f(x) = x$ for
all $x \in F$. Then we call $a$ hereditarily symmetric iff every $b$ in the $W$-transitive closure of
$\{a\}$ is symmetric. Let $H$ be the class of all hereditarily symmetric elements of $W$. Prove:
(i) Every element of $U$ is hereditarily symmetric.
(ii) Prove that if $a$ is symmetric and $f$ is any permutation of $U \setminus \{Z\}$, then $f^+(a)$ is
symmetric.
(iii) Prove that if $f$ is any permutation of $U \setminus \{Z\}$, then $f^+ \upharpoonright H$ is an automorphism
of $(H, \in, Z)$.
(iv) Prove that if $\varphi(v_0, \ldots, v_{n-1})$ is any formula, $v_0, \ldots, v_{n-1} \in H$, $\varphi^H(v_0, \ldots, v_{n-1})$ holds, and $f$
is any automorphism of $(H, \in, Z)$, then $\varphi^H(f^+(v_0), \ldots, f^+(v_{n-1}))$.
(v) Prove that $(H, \in, Z)$ is a model of ZFU, where $H$ is the class of all hereditarily
symmetric elements of $W$.
E20.15. (Continuing E20.14) (i) We make a metalanguage definition, associating with each
natural number $m$ a term $\overline{m}$ in a definitional extension of the language for ZFU: $\overline{0} = \emptyset$,
and $\overline{m + 1} = \overline{m} \cup \{\overline{m}\}$. Prove that ZFU $\models \overline{m} \in \omega$ for all $m \in \omega$.

(ii) Prove that if $m < n < \omega$, then ZFU $\models \overline{m} \in \overline{n} \wedge \overline{m} \ne \overline{n}$.
(iii) Let ZFUI be the theory ZFU together with each of the following sentences, for
$m \in \omega$:
$$\exists v_0 \ldots \exists v_m \left[ \bigwedge_{i \le m} Ur(v_i) \wedge \bigwedge_{0 \le i < j \le m} [\neg(v_i = v_j)] \right].$$
Prove that AC cannot be proved in ZFUI.
Hint: In fact, show that ZFUI cannot prove that there is a one-one function mapping
$\omega$ into $Ur$. For this, take the above model $(H, \in, Z)$ with $\kappa$ infinite, hence with $U$ infinite.
Assume that $f \in H$ is such that
$(H, \in, Z) \models f$ is a one-one function mapping $\omega$ into $Ur$,
and get a contradiction.
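The extension of a permutation of the Urelemente to all sets built over them, and the notion of a symmetric element, can be tried out on a small finite universe. The sketch below is a toy model only: atoms are Python strings, sets are `frozenset`s, the distinguished element $Z$ is ignored, and the function names `extend` and `is_symmetric` are hypothetical. In a finite universe every element is symmetric with the full set of atoms as support, so the example only illustrates how a given support does or does not work.

```python
from itertools import permutations


def extend(f, a):
    """Toy version of f+: atoms are moved by f, sets are moved elementwise."""
    if isinstance(a, frozenset):
        return frozenset(extend(f, b) for b in a)
    return f.get(a, a)


def is_symmetric(a, atoms, support):
    """True if every permutation of `atoms` fixing `support` pointwise fixes a under f+."""
    movable = [u for u in atoms if u not in support]
    for image in permutations(movable):
        f = dict(zip(movable, image))
        if extend(f, a) != a:
            return False
    return True


atoms = ["u0", "u1", "u2", "u3"]
pair = frozenset({"u0", "u1"})
all_atoms = frozenset(atoms)
swap = {"u0": "u1", "u1": "u0"}
```

Here `all_atoms` is fixed by every permutation, while `pair` is fixed only by permutations that map `{u0, u1}` to itself, for instance those that fix `u0` and `u1` pointwise.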


21. Embeddings, iterated forcing, and Martin's axiom


In this chapter we mainly develop iterated forcing. The idea of iterated forcing is to
construct in succession $M[G_0]$, $M[G_0][G_1]$, etc., continuing transfinitely, but stopping at
some stage $M[G_0][G_1] \ldots [G_\alpha]$. Here $G_0$ is $\mathbb{P}_0$-generic over $M$, $G_1$ is $\mathbb{P}_1$-generic over $M[G_0]$,
etc., where $\mathbb{P}_0$ is a quasi-order in $M$, $\mathbb{P}_1$ is a quasi-order in $M[G_0]$, etc.
Note that after the famous . . . in such an iteration one cannot simply take the union
of preceding models. This has already been observed in exercise E3.18 on page 55. We
give here a solution of that exercise, since this helps motivate the way that iterated forcing
is defined. Take the simple case in which we are given a quasi-order $\mathbb{P}$ in $M$ such that for
every $p \in P$ there are incompatible $q, r \le p$, and form
$$M = M_0 \subseteq M_1 \subseteq M_2 \subseteq \cdots$$
where for each $n$, $M_{n+1} = M_n[G_n]$ for some $G_n$ which is $\mathbb{P}$-generic over $M_n$. We claim
that $\bigcup_{n \in \omega} M_n$ does not satisfy the power set axiom. For, assume that $R = \bigcup_{n \in \omega} M_n$ does
satisfy the power set axiom. Then $R \models \exists y \forall z(z \subseteq P \rightarrow z \in y)$. Choose $y \in R$ so that
$R \models \forall z(z \subseteq P \rightarrow z \in y)$. Say $y \in M_n$. Then $R \models G_n \subseteq P \rightarrow G_n \in y$. By absoluteness,
$R \models G_n \subseteq P$. So $R \models G_n \in y$, hence $G_n \in y \in M_n$. This contradicts Lemma 3.2.
Thus care must be taken at limit steps in an iteration.
A remarkable fact about iteration is that the final stage can be defined as a simple
generic extension of M with respect to a (complicated) quasi-order. In fact, the official
definition of iterated forcing will have this property built-in.
Usually a single step in an iteration is the most important, with the gluing together of
all the single steps a technical matter. Such a single step amounts to seeing what happens
in the situation M [G][H], and we first deal with that in detail.
Since we will be dealing with at least two quasi-orders at the same time, it is important
to be rather precise with the notation. So we return to the official notation $\mathbb{P} = (P, \le, 1)$
for quasi-orders introduced in Chapter 2. Now we want to be even more precise, and write
$\mathbb{P} = (P_{\mathbb{P}}, \le_{\mathbb{P}}, 1_{\mathbb{P}})$ to indicate the dependence on the particular quasi-order.
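Let $M$ be a c.t.m. of ZFC, and let $\mathbb{P} \in M$ be a quasi-order. A $\mathbb{P}$-name for a quasi-order
is a $\mathbb{P}$-name $\pi = \mathrm{op}(\mathrm{op}(\pi^0, \pi^1), \pi^2)$ such that $\pi^2 \in \mathrm{dmn}(\pi^0)$ and
$$1_{\mathbb{P}} \Vdash_{\mathbb{P}} \pi^2 \in \pi^0 \text{ and } \pi^1 \text{ is a quasi-order on } \pi^0 \text{ with largest element } \pi^2.$$
Sometimes we denote $\pi^1$, $\pi^2$ by $\le_\pi$ and $1_\pi$ respectively. Recall from page 51 the definition
of op.
Thus if $G$ is $\mathbb{P}$-generic over $M$, then $\pi_G$ is a quasi-order in $M[G]$. We now want to
define a single quasi-order in $M$ which embodies both $\mathbb{P}$ and $\pi_G$, in a sense. So we define
a quasi-order $\mathbb{P} * \pi$ in $M$. The underlying set of this quasi-order is
$$\{(p, \tau) : p \in P_{\mathbb{P}},\ \tau \in \mathrm{dmn}(\pi^0), \text{ and } p \Vdash_{\mathbb{P}} \tau \in \pi^0\}.$$
The order in $\mathbb{P} * \pi$ is given by
$$(p, \tau) \le_{\mathbb{P} * \pi} (q, \sigma) \quad\text{iff}\quad p \le_{\mathbb{P}} q \text{ and } p \Vdash_{\mathbb{P}} \tau \le_\pi \sigma.$$
Finally, we let $1_{\mathbb{P} * \pi} = (1_{\mathbb{P}}, 1_\pi)$.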


We have given the definition here replete with all necessary subscripts. But from now
on we omit some subscripts when no confusion is likely. We illustrate this simplification
by giving the proof of the following theorem first without subscripts and then with them.
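Proposition 21.1. Under the above notation, $\mathbb{P} * \pi$ is a quasi-order in $M$.
Proof. Suppose that $(p, \tau) \in P * \pi$. Then $1 \Vdash \forall x \in \pi^0 (x \le x)$ and $p \Vdash \tau \in \pi^0$, so by
Lemma 3.13(xvi) $p \Vdash \tau \le \tau$. So $(p, \tau) \le (p, \tau)$.
Suppose that $(p, \tau) \le (q, \sigma) \le (r, \rho)$. Then $p \le q \le r$, so $p \le r$. Also, $p \Vdash \tau \le \sigma$ and
$q \Vdash \sigma \le \rho$, so, since $p \le q$, $p \Vdash \tau \le \rho$. Thus $(p, \tau) \le (r, \rho)$.
If $(p, \tau) \in P * \pi$, then $p \le 1$ and $p \Vdash \tau \le 1$, so $(p, \tau) \le (1, 1)$.
Proof with subscripts. Suppose that $(p, \tau) \in P_{\mathbb{P} * \pi}$. Then $1_{\mathbb{P}} \Vdash_{\mathbb{P}} \forall x \in \pi^0 (x \le_\pi x)$
and $p \Vdash_{\mathbb{P}} \tau \in \pi^0$, so $p \Vdash_{\mathbb{P}} \tau \le_\pi \tau$. So $(p, \tau) \le_{\mathbb{P} * \pi} (p, \tau)$.
Suppose that $(p, \tau) \le_{\mathbb{P} * \pi} (q, \sigma) \le_{\mathbb{P} * \pi} (r, \rho)$. Then $p \le_{\mathbb{P}} q \le_{\mathbb{P}} r$, so $p \le_{\mathbb{P}} r$. Also,
$p \Vdash_{\mathbb{P}} \tau \le_\pi \sigma$ and $q \Vdash_{\mathbb{P}} \sigma \le_\pi \rho$, so, since $p \le_{\mathbb{P}} q$, $p \Vdash_{\mathbb{P}} \tau \le_\pi \rho$. Thus $(p, \tau) \le_{\mathbb{P} * \pi} (r, \rho)$.
If $(p, \tau) \in P_{\mathbb{P} * \pi}$, then $p \le_{\mathbb{P}} 1_{\mathbb{P}}$ and $p \Vdash_{\mathbb{P}} \tau \le_\pi 1_\pi$, so $(p, \tau) \le_{\mathbb{P} * \pi} (1_{\mathbb{P}}, 1_\pi)$.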
We now need to digress into more of the general theory of quasi-orders and forcing. If $\mathbb{P}$
and $\mathbb{Q}$ are quasi-orders, a function $i : P_{\mathbb{P}} \to P_{\mathbb{Q}}$ is a complete embedding iff the following
conditions hold:
(1) For all $p, p' \in P_{\mathbb{P}}$, if $p' \le p$ then $i(p') \le i(p)$.
(2) For all $p, p' \in P_{\mathbb{P}}$, $p \perp p'$ iff $i(p) \perp i(p')$.
(3) For any $q \in P_{\mathbb{Q}}$ there is a $p \in P_{\mathbb{P}}$, which is called a reduction of $q$ to $\mathbb{P}$, such that for
all $p' \in P_{\mathbb{P}}$, if $p' \le p$ then $i(p')$ and $q$ are compatible.
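For finite quasi-orders the three conditions in the definition of a complete embedding can be checked by brute force. The following sketch does this for posets of finite partial functions into $\{0, 1\}$, ordered by reverse inclusion; the representation of conditions as `frozenset`s of pairs and all function names are assumptions made for the illustration, not part of the notes.

```python
from itertools import product


def conditions(index_set):
    """All partial functions from index_set to {0, 1}, as frozensets of (index, value) pairs."""
    result = []
    for choice in product([None, 0, 1], repeat=len(index_set)):
        result.append(frozenset((i, v) for i, v in zip(index_set, choice) if v is not None))
    return result


def leq(p, q):
    """p <= q (p is stronger than q) iff p extends q as a function."""
    return p >= q


def compatible(poset, p, q):
    """p and q are compatible in poset iff some member of poset lies below both."""
    return any(leq(r, p) and leq(r, q) for r in poset)


def is_complete_embedding(P, Q, i):
    """Brute-force check of conditions (1), (2), (3) for a map i from P into Q."""
    order = all(leq(i(p1), i(p)) for p in P for p1 in P if leq(p1, p))
    perp = all(compatible(P, p, p1) == compatible(Q, i(p), i(p1)) for p in P for p1 in P)
    reduction = all(
        any(all(compatible(Q, i(p1), q) for p1 in P if leq(p1, p)) for p in P)
        for q in Q
    )
    return order and perp and reduction


P = conditions([0])
Q = conditions([0, 1])
inclusion = lambda p: p           # conditions on {0} are also conditions on {0, 1}
collapse = lambda p: frozenset()  # sends everything to the largest element
```

The inclusion map passes all three checks (a reduction of $q$ is its restriction to index 0), whereas the constant map preserves order but sends incompatible conditions to compatible ones, so it fails (2).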
Theorem 21.2. Suppose that $M$ is a c.t.m. of ZFC, and in $M$, $i$ is a complete
embedding of a quasi-order $\mathbb{P}$ into a quasi-order $\mathbb{Q}$. Suppose that $H$ is $\mathbb{Q}$-generic over $M$.
Then $i^{-1}[H]$ is $\mathbb{P}$-generic over $M$, and $M[i^{-1}[H]] \subseteq M[H]$.
Proof. To show that $i^{-1}[H]$ is $\mathbb{P}$-generic over $M$ we will apply Proposition 3.3 and
show that $i^{-1}[H]$ is upward closed, any two members of it are compatible, and it intersects
every dense set which is in $M$.
Suppose that $p \ge p' \in i^{-1}[H]$. Thus by (1), $i(p) \ge i(p') \in H$, so $i(p) \in H$ and hence
$p \in i^{-1}[H]$.
Suppose that $p, q \in i^{-1}[H]$. Thus $i(p), i(q) \in H$, so $i(p)$ and $i(q)$ are compatible.
Hence $p$ and $q$ are compatible by (2).
Suppose that $D \subseteq P_{\mathbb{P}}$ is dense and is in $M$. Let
$$E = \{q \in P_{\mathbb{Q}} : \text{there is a } u \in D \text{ such that } q \le i(u)\}.$$
Clearly $E$ is in $M$. We claim that it is dense in $\mathbb{Q}$. To prove this, suppose that $s \in P_{\mathbb{Q}}$.
Let $t \in P_{\mathbb{P}}$ be a reduction of $s$ to $\mathbb{P}$. Choose $u \in D$ such that $u \le t$. Then by the definition
of reduction it follows that $i(u)$ and $s$ are compatible. Say $q \le i(u), s$. Then $q \in E$ and
$q \le s$, as desired.
So, choose $q \in E \cap H$. Then there is a $u \in D$ such that $q \le i(u)$. It follows that
$i(u) \in H$, and hence $u \in i^{-1}[H] \cap D$, as desired.
So we have checked that $i^{-1}[H]$ is $\mathbb{P}$-generic over $M$.
Now $i \in M \subseteq M[H]$, so $i^{-1}[H] \in M[H]$ by absoluteness. It follows from Lemma 3.5
that $M[i^{-1}[H]] \subseteq M[H]$.
For the next theorem, recall the definition of $i_*$ from Chapter 6.
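Theorem 21.3. Suppose that $\mathbb{P}$ and $\mathbb{Q}$ are quasi-orders and $i$ is a function mapping
$P$ into $Q$. Suppose that $M$ is a c.t.m. of ZFC. Then
(i) If $H \subseteq Q$ and $\tau \in M^{\mathbb{P}}$, then $\mathrm{val}(\tau, i^{-1}[H]) = \mathrm{val}(i_*(\tau), H)$.
(ii) Assume that $i$ is a complete embedding of $\mathbb{P}$ into $\mathbb{Q}$. Suppose that $H$ is $\mathbb{Q}$-generic
over $M$ and $\varphi(x_1, \ldots, x_n)$ is a formula which is absolute for c.t.m. of ZFC. Then for any
$p \in P$,
$$p \Vdash_{\mathbb{P}} \varphi(\tau_1, \ldots, \tau_n) \quad\text{iff}\quad i(p) \Vdash_{\mathbb{Q}} \varphi(i_*(\tau_1), \ldots, i_*(\tau_n)).$$
Proof. (i): by induction on $\tau$. First suppose that $x \in \mathrm{val}(\tau, i^{-1}[H])$. Choose $(\sigma, p) \in \tau$
such that $p \in i^{-1}[H]$ and $x = \mathrm{val}(\sigma, i^{-1}[H])$. Then $i(p) \in H$ and, by the inductive
hypothesis, $x = \mathrm{val}(i_*(\sigma), H)$. Thus $(i_*(\sigma), i(p)) \in i_*(\tau)$. So $x \in \mathrm{val}(i_*(\tau), H)$.
Conversely, suppose that $x \in \mathrm{val}(i_*(\tau), H)$. Choose $(\sigma, p) \in \tau$ so that $i(p) \in H$ and
$x = \mathrm{val}(i_*(\sigma), H)$. Then $p \in i^{-1}[H]$ and, by the inductive hypothesis, $x = \mathrm{val}(\sigma, i^{-1}[H])$.
So $x \in \mathrm{val}(\tau, i^{-1}[H])$.
(ii): For $\Rightarrow$, assume that $p \Vdash_{\mathbb{P}} \varphi(\tau_1, \ldots, \tau_n)$. Let $H$ be $\mathbb{Q}$-generic over $M$ with $i(p) \in H$.
Then $p \in i^{-1}[H]$, and $i^{-1}[H]$ is $\mathbb{P}$-generic over $M$ by 21.2. Hence by the external definition
of forcing,
$$\varphi(\mathrm{val}(\tau_1, i^{-1}[H]), \ldots, \mathrm{val}(\tau_n, i^{-1}[H]))$$
holds in $M[i^{-1}[H]]$. Now by (i), $\mathrm{val}(\tau_j, i^{-1}[H]) = \mathrm{val}(i_*(\tau_j), H)$ for each $j = 1, \ldots, n$, and
$M[i^{-1}[H]] \subseteq M[H]$ by 21.2, so by absoluteness we see that
$$\varphi(\mathrm{val}(i_*(\tau_1), H), \ldots, \mathrm{val}(i_*(\tau_n), H))$$
holds in $M[H]$. Hence by the definition of forcing, $i(p) \Vdash_{\mathbb{Q}} \varphi(i_*(\tau_1), \ldots, i_*(\tau_n))$.
For $\Leftarrow$, suppose that it is not the case that $p \Vdash_{\mathbb{P}} \varphi(\tau_1, \ldots, \tau_n)$. Then there is
a $q \le p$ such that $q \Vdash_{\mathbb{P}} \neg\varphi(\tau_1, \ldots, \tau_n)$. By the direction $\Rightarrow$, we then have
$i(q) \Vdash_{\mathbb{Q}} \neg\varphi(i_*(\tau_1), \ldots, i_*(\tau_n))$. Since $i(q) \le i(p)$, it follows that it is not the case that
$i(p) \Vdash_{\mathbb{Q}} \varphi(i_*(\tau_1), \ldots, i_*(\tau_n))$.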
We return to our discussion of two-stage iterations. With the notation introduced above,
define $i(p) = (p, 1_\pi)$.
Proposition 21.4. Under the above notation, $i$ is a complete embedding of $\mathbb{P}$ into
$\mathbb{P} * \pi$.
Proof. (1) and (2) are easy to check. For (3), let $(p, \tau) \in P * \pi$ be given. We claim that
$p$ is a reduction of $(p, \tau)$ to $\mathbb{P}$. For, suppose that $q \le p$. Then $i(q) = (q, 1)$ is compatible
with $(p, \tau)$, since $(q, \tau) \in P * \pi$ and $(q, \tau) \le (q, 1), (p, \tau)$.

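Proposition 21.5. Again assume the above notation. Suppose that $G$ is $\mathbb{P}$-generic
over $M$, $\rho$ is a $\mathbb{P}$-name, and $\rho_G \in \pi^0_G$. Then there is a $(r, \sigma) \in P * \pi$ such that $r \in G$ and
$r \Vdash \rho = \sigma$.
Proof. Choose $q \in G$ such that $q \Vdash \rho \in \pi^0$. Then by Lemma 3.13(v), choose
$r \le q$ with $r \in G$ such that for some $(\sigma, s) \in \pi^0$ we have $r \le s$ and $r \Vdash \rho = \sigma$. Since
$r \le q$, we have $r \Vdash \rho \in \pi^0$, so by the external definition of forcing, $r \Vdash \sigma \in \pi^0$. Clearly
$(r, \sigma) \in P * \pi$.
Next, suppose that $G$ is $\mathbb{P}$-generic over $M$ and $H \subseteq \pi^0_G$. Then we define
$$G * H = \{(p, \tau) \in P * \pi : p \in G \text{ and } \tau_G \in H\}.$$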


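Theorem 21.6. Suppose that $M$ is a c.t.m. of ZFC, $\mathbb{P}$ is a quasi-order in $M$, and
$\pi$ is a $\mathbb{P}$-name for a quasi-order in $M$.
If $G$ is $\mathbb{P}$-generic over $M$ and $H$ is $\pi_G$-generic over $M[G]$, then $G * H$ is $(\mathbb{P} * \pi)$-generic
over $M$, and $M[G * H] = M[G][H]$.
Proof. Suppose that $(p, \tau) \in G * H$ and $(p, \tau) \le (q, \sigma)$. Thus $p \in G$, $\tau_G \in H$, $p \le q$,
and $p \Vdash \tau \le \sigma$. Hence $q \in G$. Also, $\tau_G \le \sigma_G$, and so $\sigma_G \in H$. Hence $(q, \sigma) \in G * H$.
Suppose that $(p, \tau), (q, \sigma) \in G * H$. Then $p \in G$, $\tau_G \in H$, $q \in G$, and $\sigma_G \in H$. Choose
$\rho_G \in H$ such that $\rho_G \le \tau_G, \sigma_G$. By Proposition 21.5 choose $(r, \rho') \in P * \pi$ such that $r \in G$
and $r \Vdash \rho = \rho'$. Also choose $s, t \in G$ so that $s \Vdash \rho \le \tau$ and $t \Vdash \rho \le \sigma$. Finally, take
$u \in G$ with $u \le p, q, r, s, t$. Then $(u, \rho') \in P * \pi$ and $(u, \rho') \le (p, \tau), (q, \sigma)$.
Now suppose that $D \subseteq P * \pi$ is dense in $\mathbb{P} * \pi$. Let
$$F = \{\sigma_G : (q, \sigma) \in D \text{ for some } q \in G \text{ and some } \sigma\}.$$
We claim that $F$ is dense in $\pi_G$. To check this, let $x \in \pi^0_G$; say $x = \tau_G$ with $(\tau, q) \in \pi^0$ and
$q \in G$. Thus $(q, \tau) \in P * \pi$. Now we claim that
$$K \overset{\mathrm{def}}{=} \{s \in P : (s, \sigma) \le (q, \tau) \text{ for some } \sigma \text{ such that } (s, \sigma) \in D\}$$
is dense below $q$. For, suppose that $r \le q$. Then $r \Vdash \tau \in \pi^0$, so $(r, \tau) \in P * \pi$. Choose
$(s, \sigma) \in D$ such that $(s, \sigma) \le (r, \tau)$. Thus $s \in K$ and $s \le r$, as desired.
Now let $s \in G \cap K$; say $(s, \sigma) \le (q, \tau)$ with $(s, \sigma) \in D$. Then $s \Vdash \sigma \le \tau$, so $\sigma_G \le \tau_G$.
Since $s \in G$, this shows that $F$ is dense.
Choose $\sigma_G \in F \cap H$. Say $(q, \sigma) \in D$ with $q \in G$. Then $(q, \sigma) \in D \cap (G * H)$, as desired.
For the final statement of the theorem, note that $G \in M[G * H]$, since $p \in G$ iff
$(p, 1) \in G * H$. Hence $M[G] \subseteq M[G * H]$ by Lemma 3.5. Also, $H \in M[G * H]$, since
$$H = \{x : \text{there is a } (q, \tau) \in G * H \text{ such that } x = \tau_G\}.$$
Hence $M[G][H] \subseteq M[G * H]$. Conversely, clearly $G * H \in M[G][H]$, so $M[G * H] \subseteq M[G][H]$.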
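Theorem 21.7. Suppose that $M$ is a c.t.m. of ZFC, $\mathbb{P}$ is a quasi-order in $M$, and
$\pi$ is a $\mathbb{P}$-name for a quasi-order in $M$. Let $i$ be the complete embedding defined above.
Suppose that $K$ is $(\mathbb{P} * \pi)$-generic over $M$. Define $G = i^{-1}[K]$ and
$$H = \{\tau_G : \tau \in \mathrm{dmn}(\pi^0) \text{ and } (q, \tau) \in K \text{ for some } q\}.$$
Then $G$ is $\mathbb{P}$-generic over $M$, $H$ is $\pi_G$-generic over $M[G]$, $K = G * H$, and $M[K] = M[G][H]$.
Proof. By Theorem 21.2, $G$ is $\mathbb{P}$-generic over $M$ and $M[G] \subseteq M[K]$.
Now suppose that $x \in H$ and $x \le y$; we want to show that $y \in H$. Write $x = \tau_G$
with $\tau \in \mathrm{dmn}(\pi^0)$ and $(q, \tau) \in K$ for some $q$. We are assuming that $y \in \pi^0_G$. So there
is an $(\sigma, s) \in \pi^0$ such that $s \in G$ and $y = \sigma_G$. Since $\tau_G \le \sigma_G$, choose $p \in G$ such that
$p \Vdash \tau \le \sigma$. Since $p \in G$, we have $(p, 1) \in K$. Also $(q, \tau) \in K$, so we can choose $(r, \rho) \in K$
such that $(r, \rho) \le (p, 1), (q, \tau)$. Thus $r \le p$, so $r \Vdash \tau \le \sigma$. Also, from $(r, \rho) \le (q, \tau)$ we see
that $r \Vdash \rho \le \tau$. So $r \Vdash \rho \le \sigma$. Hence $(r, \rho) \le (r, \sigma)$, and hence $(r, \sigma) \in K$. This shows
that $y = \sigma_G \in H$.
Next suppose that $x, y \in H$; we want to find a common extension. Write $x = \tau_G$ with
$\tau \in \mathrm{dmn}(\pi^0)$ and $(q, \tau) \in K$, and $y = \sigma_G$ with $\sigma \in \mathrm{dmn}(\pi^0)$ and $(s, \sigma) \in K$. Choose
$(p, \rho) \le (q, \tau), (s, \sigma)$ with $(p, \rho) \in K$. Then also $(p, 1) \in K$, so $p \in G$. Also, $p \Vdash \rho \le \tau$, so
$\rho_G \le \tau_G = x$. Similarly, $\rho_G \le \sigma_G = y$.
Next let $D \in M[G]$ be a dense subset of $\pi_G$; we want to show that $D \cap H \ne \emptyset$. Let
$\delta$ be a $\mathbb{P}$-name such that $\delta_G = D$. Then there is a $p \in G$ such that
$$p \Vdash \delta \text{ is dense in } \pi.$$
Let
$$D' = \{(q, \tau) \in P * \pi : q \Vdash \tau \in \delta\}.$$
We claim that $D'$ is dense below $(p, 1)$. For, suppose that $(r, \sigma) \le (p, 1)$. Then $r \Vdash \delta$
is dense in $\pi$, and $r \Vdash \sigma \in \pi^0$, so
$$r \Vdash \exists x \in \pi^0 (x \in \delta \wedge x \le \sigma).$$
Hence by Proposition 3.18 there exist $s \le r$ and $\tau \in \mathrm{dmn}(\pi^0)$ such that
$$s \Vdash \tau \in \pi^0 \wedge \tau \in \delta \wedge \tau \le \sigma.$$
Thus $(s, \tau) \in P * \pi$, $(s, \tau) \le (r, \sigma)$, and $s \Vdash \tau \in \delta$. Thus $(s, \tau) \in D'$ and $(s, \tau) \le (r, \sigma)$. This
proves that $D'$ is dense below $(p, 1)$.
Now $p \in G$, so $(p, 1) \in K$. Hence there is a $(q, \tau) \in D' \cap K$. Hence also $(q, 1) \in K$, so
$q \in G$. Since $q \Vdash \tau \in \delta$, it follows that $\tau_G \in \delta_G = D$. Clearly also $\tau_G \in H$. This finishes
the proof that $H$ is $\pi_G$-generic over $M[G]$.
Next we show that $K \subseteq G * H$. If $(p, \tau) \in K$, then also $(p, 1) \in K$, and so $p \in G$.
Clearly also $\tau_G \in H$, so $(p, \tau) \in G * H$.
Now we show that $G * H \subseteq K$. Let $(p, \tau) \in G * H$. Thus $p \in G$ and $\tau_G \in H$. Hence
$(p, 1) \in K$. By the definition of $H$, there exist $\sigma \in \mathrm{dmn}(\pi^0)$ and $q$ such that $(q, \sigma) \in K$
and $\tau_G = \sigma_G$. Choose $r \in G$ such that $r \Vdash \tau = \sigma$. So $(r, 1) \in K$. Let $(s, \rho) \in K$ be such
that $(s, \rho) \le (p, 1), (q, \sigma), (r, 1)$. Since $(s, \rho) \le (q, \sigma)$, we have $s \Vdash \rho \le \sigma$. Also, $s \le r$, so
$s \Vdash \tau = \sigma$. Hence $s \Vdash \rho \le \tau$. Hence $(s, \rho) \le (p, \tau)$, and so $(p, \tau) \in K$, as desired.
The last part of the theorem follows from Theorem 21.6.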
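Lemma 21.8. Suppose that $M$ is a c.t.m. of ZFC, $\kappa$ is a regular cardinal of $M$, $\mathbb{P}$ is
a quasi-order in $M$, and $\mathbb{P}$ satisfies $\kappa$-cc in $M$. Suppose that $\sigma$ is a $\mathbb{P}$-name, and
$$1 \Vdash \sigma \subseteq \check{\kappa} \wedge |\sigma| < \check{\kappa}.$$
Then there is a $\beta < \kappa$ such that $1 \Vdash \sigma \subseteq \check{\beta}$.
Proof. First we work in $M$. Let
$$E = \left\{\alpha < \kappa : \text{there is a } p \in P \text{ such that } p \Vdash \check{\alpha} = \bigcup \sigma\right\}.$$
For each $\alpha \in E$, pick $p_\alpha \in P$ such that $p_\alpha \Vdash \check{\alpha} = \bigcup \sigma$.
(1) $\{p_\alpha : \alpha \in E\}$ is an antichain in $\mathbb{P}$.
For, suppose that $\alpha$ and $\beta$ are distinct elements of $E$. If $q \le p_\alpha, p_\beta$, then $q \Vdash \check{\alpha} = \check{\beta}$,
contradicting Lemma 3.24.
Thus by $\kappa$-cc, $|E| < \kappa$. Hence there is a $\beta < \kappa$ such that $E \subseteq \beta$.
This finishes our argument inside $M$. Now if $G$ is $\mathbb{P}$-generic over $M$, then $\kappa$ is regular
in $M[G]$ by Proposition 4.4. Since $|\sigma_G| < \kappa$, it follows that $\bigcup \sigma_G < \kappa$. Let $\alpha = \bigcup \sigma_G$, and
choose $p \in G$ such that $p \Vdash \check{\alpha} = \bigcup \sigma$. Thus $\alpha \in E$, and hence $\alpha < \beta$. Thus $\bigcup \sigma_G < \beta$, and
so $\sigma_G \subseteq \beta$. Since this is true for any generic $G$, it follows that $1 \Vdash \sigma \subseteq \check{\beta}$.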


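Theorem 21.9. Suppose that $M$ is a c.t.m. of ZFC, and in $M$, $\pi$ is a $\mathbb{P}$-name for a
quasi-order, $\kappa$ is a regular cardinal, $\mathbb{P}$ is $\kappa$-cc, and $1 \Vdash \pi$ is $\check{\kappa}$-cc.
Then $\mathbb{P} * \pi$ is $\kappa$-cc.
Proof. Suppose not, and let $\langle (p_\xi, \tau_\xi) : \xi < \kappa \rangle$ be an antichain in $\mathbb{P} * \pi$ in $M$. Let
$\sigma = \{(\check{\xi}, p_\xi) : \xi < \kappa\}$. Thus $\sigma$ is a $\mathbb{P}$-name in $M$.
Now let $G$ be $\mathbb{P}$-generic over $M$. Then
(1) $\sigma_G = \{\xi : p_\xi \in G\}$.
In fact, $x \in \sigma_G$ iff there is a $\xi < \kappa$ such that $p_\xi \in G$ and $x = \xi$, so (1) holds.
We claim now that if $\xi < \eta$ and both are in $\sigma_G$, then $(\tau_\xi)_G$ and $(\tau_\eta)_G$ are incompatible.
For, the hypothesis yields $p_\xi, p_\eta \in G$. Suppose that $x \le (\tau_\xi)_G, (\tau_\eta)_G$. Then $x \in \pi^0_G$, so
there exists a $(\rho, q) \in \pi^0$ such that $q \in G$ and $x = \rho_G$. Clearly $q \Vdash \rho \in \pi^0$. Also, there are
$s, t \in G$ such that $s \Vdash \rho \le \tau_\xi$ and $t \Vdash \rho \le \tau_\eta$. Let $u \in G$ be such that $u \le p_\xi, p_\eta, q, s, t$.
Then $(u, \rho) \in P * \pi$ and $(u, \rho) \le (p_\xi, \tau_\xi), (p_\eta, \tau_\eta)$, contradiction. This proves our claim.
However, $1 \Vdash \pi$ is $\check{\kappa}$-cc, so $\pi_G$ is $\kappa$-cc. Hence by the preceding paragraph, $|\sigma_G| < \kappa$.
Thus our argument with an arbitrary generic $G$ has shown that $1 \Vdash |\sigma| < \check{\kappa}$. Hence
by Lemma 21.8 there is a $\beta < \kappa$ such that $1 \Vdash \sigma \subseteq \check{\beta}$. But clearly $p_\beta \Vdash \check{\beta} \in \sigma$,
contradiction.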
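We are now ready for the definition of iterated forcing. Suppose that $I$ is any set. An
ideal of subsets of $I$ is a collection $\mathscr{I}$ of subsets of $I$ such that $\emptyset \in \mathscr{I}$, $\mathscr{I}$ is closed under
$\cup$, and if $x \in \mathscr{I}$ and $y \subseteq x$ then $y \in \mathscr{I}$. Let $M$ be a c.t.m. of ZFC, $\alpha$ an ordinal, and $\mathscr{I}$
an ideal of subsets of $\alpha$ containing all finite subsets of $\alpha$. If $\langle \mathbb{P}_\xi : \xi < \alpha \rangle$ is a sequence of
quasi-orders and $p \in \prod_{\xi < \alpha} P_\xi$, then the support of $p$ is the set
$$\mathrm{supp}(p) \overset{\mathrm{def}}{=} \{\xi < \alpha : p(\xi) \ne 1_\xi\}.$$
An $\alpha$-stage iterated forcing construction with supports in $\mathscr{I}$ is an ordered pair $(\mathbb{P}, \pi)$ in $M$
with the following properties:
(K1) $\mathbb{P}$ is a sequence $\langle \mathbb{P}_\xi : \xi \le \alpha \rangle$ of length $\alpha + 1$ of quasi-orders.
(K2) $\pi$ is a sequence of length $\alpha + 1$; each $\pi_\xi$ is a $\mathbb{P}_\xi$-name for a quasi-order.
(K3) For each $\xi \le \alpha$, $P_\xi$ is a collection of sequences of length $\xi$.
(K4) If $\xi < \eta \le \alpha$ and $p \in P_\eta$, then $p \upharpoonright \xi \in P_\xi$.
(K5) If $\xi < \eta \le \alpha$ and $p \in P_\eta$, then $p(\xi) \in \mathrm{dmn}(\pi_\xi^0)$.
(K6) If $\xi \le \alpha$, then $1_{\mathbb{P}_\xi} = \langle 1_{\pi_\eta} : \eta < \xi \rangle$.
(K7) $\mathbb{P}_0 = (\{0\}, 0, 0)$.
(K8) For every $\xi < \alpha$ and every $(\xi + 1)$-termed sequence $p$,
$$p \in P_{\xi+1} \quad\text{iff}\quad p \upharpoonright \xi \in P_\xi,\ p(\xi) \in \mathrm{dmn}(\pi_\xi^0), \text{ and } p \upharpoonright \xi \Vdash_{\mathbb{P}_\xi} p(\xi) \in \pi_\xi^0.$$
(K9) For all $\xi < \alpha$ and all $p, p' \in P_{\xi+1}$,
$$p \le_{\mathbb{P}_{\xi+1}} p' \quad\text{iff}\quad p \upharpoonright \xi \le_{\mathbb{P}_\xi} p' \upharpoonright \xi \text{ and } p \upharpoonright \xi \Vdash_{\mathbb{P}_\xi} p(\xi) \le_{\pi_\xi} p'(\xi).$$
(K10) If $\xi \le \alpha$ is a limit ordinal and $p$ is a $\xi$-termed sequence, then
$$p \in P_\xi \quad\text{iff}\quad p \upharpoonright \eta \in P_\eta \text{ for all } \eta < \xi \text{ and } \mathrm{supp}(p) \in \mathscr{I}.$$
(K11) If $\xi \le \alpha$ is a limit ordinal and $p, p' \in P_\xi$, then
$$p \le_{\mathbb{P}_\xi} p' \quad\text{iff}\quad p \upharpoonright \eta \le_{\mathbb{P}_\eta} p' \upharpoonright \eta \text{ for every } \eta < \xi.$$
Given this situation, if $\xi \le \eta \le \alpha$ we define a function $i_{\xi\eta}$ with domain $P_\xi$ as follows. For
each $p \in P_\xi$, the sequence $i_{\xi\eta}(p)$ is such that $(i_{\xi\eta}(p)) \upharpoonright \xi = p$ and $(i_{\xi\eta}(p))(\zeta) = 1_{\pi_\zeta}$ for all
$\zeta \in [\xi, \eta)$.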
Now we give some elementary properties of iterated forcing constructions.
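Theorem 21.10. Let an iterated forcing construction be given, with notation as
above.
(i) For every $\xi \le \alpha$ and every $p \in P_\xi$, the set $\mathrm{supp}(p)$ is in $\mathscr{I}$.
(ii) For each $\xi < \alpha$, the quasi-order $\mathbb{P}_{\xi+1}$ is isomorphic to $\mathbb{P}_\xi * \pi_\xi$.
(iii) For $\xi \le \eta \le \alpha$, the function $i_{\xi\eta}$ maps $P_\xi$ into $P_\eta$.
(iv) If $\xi \le \eta \le \zeta \le \alpha$, then $i_{\xi\zeta} = i_{\eta\zeta} \circ i_{\xi\eta}$.
(v) If $\xi \le \eta \le \alpha$, then $i_{\xi\eta}(1_{\mathbb{P}_\xi}) = 1_{\mathbb{P}_\eta}$.
(vi) If $\xi \le \eta \le \alpha$, $p, p' \in P_\eta$, and $p \le p'$, then $p \upharpoonright \xi \le p' \upharpoonright \xi$.
(vii) If $\xi \le \eta \le \alpha$, $p, p' \in P_\xi$, and $p \le p'$, then $i_{\xi\eta}(p) \le i_{\xi\eta}(p')$.
(viii) If $\xi \le \eta \le \alpha$, $p, q \in P_\eta$, and $p \upharpoonright \xi \perp q \upharpoonright \xi$, then $p \perp q$.
(ix) If $\xi < \eta \le \alpha$, $p, q \in P_\eta$, and $\mathrm{supp}(p) \cap \mathrm{supp}(q) \subseteq \xi$, then $p \upharpoonright \xi \perp q \upharpoonright \xi$ iff $p \perp q$.
(x) If $\xi \le \eta \le \alpha$ and $p, q \in P_\xi$, then $p \perp q$ iff $i_{\xi\eta}(p) \perp i_{\xi\eta}(q)$.
(xi) If $\xi \le \eta \le \alpha$, then $i_{\xi\eta}$ is a complete embedding of $\mathbb{P}_\xi$ into $\mathbb{P}_\eta$.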
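Proof. (i): An easy induction on $\xi$.
(ii): By (K2), $\pi_\xi$ is a $\mathbb{P}_\xi$-name for a quasi-order, so that $\mathbb{P}_\xi * \pi_\xi$ is defined. For each
$(p, \tau) \in P_\xi * \pi_\xi$ let $f(p, \tau)$ be the sequence of length $\xi + 1$ such that $(f(p, \tau)) \upharpoonright \xi = p$
and $(f(p, \tau))(\xi) = \tau$. Thus $f(p, \tau) \in P_{\xi+1}$ by the definition of $\mathbb{P}_\xi * \pi_\xi$ and (K8). Also
it is clear that $f$ is a bijection. The definitions also make clear that $(p, \tau) \le (q, \sigma)$ iff
$f(p, \tau) \le f(q, \sigma)$. Finally, $f(1_{\mathbb{P}_\xi * \pi_\xi}) = 1_{\mathbb{P}_{\xi+1}}$ by (K6).
(iii): By induction on $\eta$, with $\xi$ fixed.
(iv): Obvious.
(v): Obvious.
(vi): We prove this by induction on $\eta$, with $\xi$ fixed. So, we assume that $\xi < \eta$ and (vi)
holds for all $\eta' < \eta$. Suppose that $p, p' \in P_\eta$ and $p \le p'$. If $\eta = \eta' + 1$, then $p \upharpoonright \eta' \le p' \upharpoonright \eta'$
by (K9), and hence $p \upharpoonright \xi \le p' \upharpoonright \xi$ by the inductive hypothesis. If $\eta$ is a limit ordinal, then
$p \upharpoonright \xi \le p' \upharpoonright \xi$ by (K11).
(vii): We prove this by induction on $\eta$, with $\xi$ fixed. So, we assume that $\xi < \eta$
and (vii) holds for all $\eta' < \eta$. Suppose that $p, p' \in P_\xi$ and $p \le p'$. If $\eta = \eta' + 1$, then
$i_{\xi\eta'}(p) \le i_{\xi\eta'}(p')$ by the inductive hypothesis, and then $i_{\xi\eta}(p) \le i_{\xi\eta}(p')$ by (K9), since
$i_{\xi\eta}(p) \upharpoonright \eta' = i_{\xi\eta'}(p)$, $i_{\xi\eta}(p') \upharpoonright \eta' = i_{\xi\eta'}(p')$,
$i_{\xi\eta'}(p) \Vdash 1 \le 1$, $(i_{\xi\eta}(p))(\eta') = 1$, and $(i_{\xi\eta}(p'))(\eta') = 1$.
For $\eta$ limit, the desired conclusion is immediate from (K11) and the inductive hypothesis.
(viii): immediate from (vi).
(ix): $\Rightarrow$ holds by (viii). For $\Leftarrow$, suppose that $r \in P_\xi$ and $r \le p \upharpoonright \xi, q \upharpoonright \xi$; we want to
show that $p$ and $q$ are compatible.
Define $s$ with domain $\eta$ by setting, for each $\zeta < \eta$,
$$s(\zeta) = \begin{cases} r(\zeta) & \text{if } \zeta < \xi, \\ p(\zeta) & \text{if } \xi \le \zeta \in \mathrm{supp}(p), \\ q(\zeta) & \text{if } \xi \le \zeta \in \mathrm{supp}(q) \setminus \mathrm{supp}(p), \\ 1 & \text{otherwise.} \end{cases}$$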
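The case split defining the amalgamated sequence can be mirrored on plain finite tuples. The sketch below is purely combinatorial: entries are arbitrary labels rather than names, `ONE` stands for the largest element at a coordinate, `cut` plays the role of the ordinal below which the sequence $r$ is used, and the names `supp` and `merge` are hypothetical. It shows only how the supports combine, not the forcing relations checked in the proof.

```python
ONE = None  # stands for the largest element 1 at a coordinate


def supp(p):
    """Set of coordinates at which p is not the largest element."""
    return {xi for xi, value in enumerate(p) if value is not ONE}


def merge(r, p, q, cut):
    """Use r below `cut`; above it use p on supp(p), else q on supp(q), else ONE."""
    s = []
    for xi in range(len(p)):
        if xi < cut:
            s.append(r[xi])
        elif xi in supp(p):
            s.append(p[xi])
        elif xi in supp(q):
            s.append(q[xi])
        else:
            s.append(ONE)
    return tuple(s)


# supp(p) and supp(q) meet only below the cut, as the hypothesis on supports requires.
p = ("a", ONE, "b", ONE, ONE)
q = ("c", ONE, ONE, "d", ONE)
r = ("e", "f")
s = merge(r, p, q, cut=2)
```

Because the supports of `p` and `q` are disjoint above the cut, no coordinate has to choose between them, and the support of the result is contained in the union of the three supports.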
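Now it suffices to prove the following statement:
(*) For all $\gamma \le \eta$ we have $s \upharpoonright \gamma \in P_\gamma$, $s \upharpoonright \gamma \le p \upharpoonright \gamma$, and $s \upharpoonright \gamma \le q \upharpoonright \gamma$.
We prove (*) by induction on $\gamma$. If $\gamma \le \xi$, then $s \upharpoonright \gamma = r \upharpoonright \gamma \in P_\gamma$ by (K4), and
$s \upharpoonright \gamma \le p \upharpoonright \gamma, q \upharpoonright \gamma$ since $r \le p \upharpoonright \xi, q \upharpoonright \xi$, by (vi).
Now assume inductively that $\xi < \gamma \le \eta$. First suppose that $\gamma$ is a successor ordinal
$\delta + 1$. Then $s \upharpoonright \delta \in P_\delta$ by the inductive hypothesis. Now we consider several cases.
Case 1. $\delta \in \mathrm{supp}(p)$. Then $s(\delta) = p(\delta) \in \mathrm{dmn}(\pi_\delta^0)$. Moreover, by the inductive
hypothesis $s \upharpoonright \delta \le p \upharpoonright \delta$, and $p \upharpoonright \delta \Vdash p(\delta) \in \pi_\delta^0$. It follows that $s \upharpoonright \delta \Vdash s(\delta) \in \pi_\delta^0$.
Thus $s \upharpoonright \gamma \in P_\gamma$ by (K8).
Case 2. $\delta \in \mathrm{supp}(q) \setminus \mathrm{supp}(p)$. This is treated similarly to Case 1.
Case 3. $\delta \notin \mathrm{supp}(p) \cup \mathrm{supp}(q)$. Then $s(\delta) = 1$, and hence clearly $s \upharpoonright \gamma \in P_\gamma$ by
(K8).
So, we have shown that $s \upharpoonright \gamma \in P_\gamma$ in any case.
To show that $s \upharpoonright \gamma \le p \upharpoonright \gamma$, first note that $s \upharpoonright \delta \le p \upharpoonright \delta$ by the inductive hypothesis.
If $\delta \in \mathrm{supp}(p)$, then $s(\delta) = p(\delta)$ and so obviously $s \upharpoonright \delta \Vdash s(\delta) \le p(\delta)$ and hence
$s \upharpoonright \gamma \le p \upharpoonright \gamma$ by (K9). If $\delta \notin \mathrm{supp}(p)$, then $p(\delta) = 1$ and again the desired conclusion
holds. Thus $s \upharpoonright \gamma \le p \upharpoonright \gamma$.
For $s \upharpoonright \gamma \le q \upharpoonright \gamma$, first note that $s \upharpoonright \delta \le q \upharpoonright \delta$ by the inductive hypothesis. If
$\delta \in \mathrm{supp}(q)$, then $\delta \notin \mathrm{supp}(p)$ by the hypothesis of (ix), since $\xi \le \delta$. Hence the proof
can continue as for $p$.
This finishes the successor case $\gamma = \delta + 1$. Now suppose that $\gamma$ is a limit ordinal. By
the inductive hypothesis, $s \upharpoonright \delta \in P_\delta$ for each $\delta < \gamma$. Since clearly $\mathrm{supp}(s \upharpoonright \gamma) \subseteq \mathrm{supp}(r) \cup
\mathrm{supp}(p) \cup \mathrm{supp}(q)$, we have $\mathrm{supp}(s \upharpoonright \gamma) \in \mathscr{I}$. Hence $s \upharpoonright \gamma \in P_\gamma$ by (K10). Finally, $s \upharpoonright \gamma \le p \upharpoonright \gamma, q \upharpoonright \gamma$
by the inductive hypothesis and (K11).
This finishes the proof of (ix).
(x): Immediate from (iii) and (ix).
(xi): Conditions (1) and (2) hold by (vii) and (x). For (3), suppose that $q \in P_\eta$.
Then $q \upharpoonright \xi \in P_\xi$ by (K4); we claim that it is a reduction of $q$ to $\mathbb{P}_\xi$. For, suppose that
$p \le q \upharpoonright \xi$. Then $\mathrm{supp}(q) \cap \mathrm{supp}(i_{\xi\eta}(p)) \subseteq \xi$, and $q \upharpoonright \xi$ and $(i_{\xi\eta}(p)) \upharpoonright \xi = p$ are compatible
since $p \le q \upharpoonright \xi$. So by (ix), $q$ and $i_{\xi\eta}(p)$ are compatible, as desired.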
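Lemma 21.11. Suppose that an iterated forcing construction is given, with notation
as above. Also suppose that $\kappa$ is an uncountable regular cardinal, and $\mathscr{I}$ is the collection
of all finite subsets of $\alpha$. Suppose that for each $\xi < \alpha$, $1 \Vdash_{\mathbb{P}_\xi} (\pi_\xi$ is $\check{\kappa}$-cc$)$. Then for each
$\xi \le \alpha$ the quasi-order $\mathbb{P}_\xi$ is $\kappa$-cc in $M$.
Proof. We proceed by induction on $\xi$. It is trivially true for $\xi = 0$, by (K7). The
inductive step from $\xi$ to $\xi + 1$ follows from Theorem 21.10(ii) and Theorem 21.9. Now
suppose that $\xi$ is limit and the assertion is true for all $\eta < \xi$. Suppose that $\langle p_\beta : \beta < \kappa \rangle$ is
an antichain in $\mathbb{P}_\xi$. Let $\Gamma \in [\kappa]^\kappa$ be such that $\langle \mathrm{supp}(p_\beta) : \beta \in \Gamma \rangle$ is a $\Delta$-system, say with
root $r$. Choose $\eta < \xi$ such that $r \subseteq \eta$. Then by Theorem 21.10(ix), $\langle p_\beta \upharpoonright \eta : \beta \in \Gamma \rangle$ is a
system of incompatible elements of $\mathbb{P}_\eta$, contradiction.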
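The limit step rests on extracting a $\Delta$-system from the finite supports: a subfamily whose pairwise intersections all equal one fixed root. For small finite families this can be found by exhaustive search, as in the following sketch; the function name `find_delta_system` is hypothetical, and the search only illustrates the notion, since the lemma itself needs the $\Delta$-system lemma for uncountable families.

```python
from itertools import combinations


def find_delta_system(sets, size):
    """Return (subfamily, root) with `size` members whose pairwise intersections
    all equal root, or None if no such subfamily exists."""
    for family in combinations(sets, size):
        root = frozenset.intersection(*family)
        if all(a & b == root for a, b in combinations(family, 2)):
            return list(family), root
    return None


supports = [frozenset({0, 1}), frozenset({0, 2}), frozenset({0, 3}), frozenset({1, 2})]
found = find_delta_system(supports, 3)

triangle = [frozenset({0, 1}), frozenset({1, 2}), frozenset({0, 2})]
missing = find_delta_system(triangle, 3)
```

In the first family the three supports containing 0 form a $\Delta$-system with root $\{0\}$; the second family has pairwise intersections $\{1\}$, $\{2\}$, $\{0\}$, which do not agree, so no such subfamily of size three exists.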
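Lemma 21.12. Suppose that an iterated forcing construction is given, with notation
as above.
(i) Suppose that $G$ is $\mathbb{P}_\alpha$-generic over $M$. For each $\xi \le \alpha$ let $G_\xi = i_{\xi\alpha}^{-1}[G]$. Then
(a) For each $\xi \le \alpha$, the set $G_\xi$ is $\mathbb{P}_\xi$-generic over $M$.
(b) If $\xi \le \eta \le \alpha$, then $M[G_\xi] \subseteq M[G_\eta] \subseteq M[G]$.
(ii) Let $\xi < \alpha$. Define
$$\mathbb{Q}_\xi = (\pi_\xi)_{G_\xi};$$
$$H_\xi = \{\rho_{G_\xi} : \rho \in \mathrm{dmn}(\pi_\xi^0) \text{ and } \exists p(p^\frown \langle \rho \rangle \in G_{\xi+1})\}.$$
Then $H_\xi \in M[G_{\xi+1}]$ and $H_\xi$ is $\mathbb{Q}_\xi$-generic over $M[G_\xi]$.
Proof. (i)(a) holds by Theorem 21.10(xi) and Theorem 21.2; and (i)(b) follows from
these theorems too.
To prove (ii) we are going to apply Theorem 21.7 with $\mathbb{P}$ and $\pi$ replaced by $\mathbb{P}_\xi$ and
$\pi_\xi$; by (K2), $\pi_\xi$ is a $\mathbb{P}_\xi$-name for a quasi-order. Let $j$ be the complete embedding of $\mathbb{P}_\xi$
into $\mathbb{P}_\xi * \pi_\xi$ given by $j(p) = (p, 1)$; this corresponds to $i$ in Theorem 21.7. Now $G_{\xi+1}$
is $\mathbb{P}_{\xi+1}$-generic over $M$ by (i). Let $f$ be the isomorphism of $\mathbb{P}_\xi * \pi_\xi$ with $\mathbb{P}_{\xi+1}$ given
in the proof of Theorem 21.10(ii). Clearly then $f^{-1}[G_{\xi+1}]$ is $(\mathbb{P}_\xi * \pi_\xi)$-generic over $M$,
and we apply Theorem 21.7 with it in place of $K$. Note that $f \circ j = i_{\xi,\xi+1}$, and hence
$j^{-1}[f^{-1}[G_{\xi+1}]] = G_\xi$; so $G_\xi$ is the $G$ in Theorem 21.7. Next,
$$H_\xi = \{\rho_{G_\xi} : \rho \in \mathrm{dmn}(\pi_\xi^0) \text{ and } \exists p(p^\frown \langle \rho \rangle \in G_{\xi+1})\}
= \{\rho_{G_\xi} : \rho \in \mathrm{dmn}(\pi_\xi^0) \text{ and } \exists p((p, \rho) \in f^{-1}[G_{\xi+1}])\},$$
so that Theorem 21.7 applies to yield that $G_\xi$ is $\mathbb{P}_\xi$-generic over $M$ (we already know this
by (i)) and $H_\xi$ is $\mathbb{Q}_\xi$-generic over $M[G_\xi]$. Clearly $H_\xi \in M[G_{\xi+1}]$.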
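Lemma 21.13. Suppose that an iterated forcing construction is given, with notation
as above, with $\alpha$ limit, and $\mathscr{I}$ the collection of all finite subsets of $\alpha$. Suppose that $G$ is
$\mathbb{P}_\alpha$-generic over $M$, $S \in M$, $X \subseteq S$, $X \in M[G]$, and $(|S| < \mathrm{cf}(\alpha))^{M[G]}$.
Then there is an $\eta < \alpha$ such that $X \in M[i_{\eta\alpha}^{-1}[G]]$.
Proof. Let $\sigma$ be a $\mathbb{P}_\alpha$-name such that $X = \sigma_G$. Thus for any $s \in S$, $s \in X$ iff there is
a $p \in G$ such that $p \Vdash_{\mathbb{P}_\alpha} \check{s} \in \sigma$. Now clearly $P_\alpha = \bigcup_{\xi < \alpha} i_{\xi\alpha}[P_\xi]$, and $G = \bigcup_{\xi < \alpha} i_{\xi\alpha}[i_{\xi\alpha}^{-1}[G]]$.
Hence for each $s \in X$ we can find $\xi(s) < \alpha$ such that there is a $p \in i_{\xi(s)\alpha}^{-1}[G]$ such that
$i_{\xi(s)\alpha}(p) \Vdash_{\mathbb{P}_\alpha} \check{s} \in \sigma$. Let $\eta = \sup_{s \in X} \xi(s)$; so $\eta < \alpha$ by assumption.
Thus $X = \{s \in S : \exists p \in G_\eta (i_{\eta\alpha}(p) \Vdash_{\mathbb{P}_\alpha} \check{s} \in \sigma)\}$. Hence $X \in M[G_\eta]$.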
This finishes our general exposition of iterated forcing. The main application of this
method, which forms a starting point of further applications, is to the consistency of
Martin's axiom with $\neg$CH. Before turning to this, however, there is another general fact
about forcing which will be needed. This fact could have been proved earlier; but we have
not needed it up till this point.
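Lemma 21.14. Suppose that $M$ is a c.t.m. of ZFC and in $M$ we have a quasi-order
$\mathbb{P}$, an antichain $A$ of $\mathbb{P}$, and a system $\langle \sigma_q : q \in A \rangle$ of members of $M^{\mathbb{P}}$. Then there is a
name $\pi \in M^{\mathbb{P}}$ such that $q \Vdash \pi = \sigma_q$ for every $q \in A$.
Proof. We define
$$(\tau, r) \in \pi \quad\text{iff}\quad (\tau, r) \in M^{\mathbb{P}} \times P \text{ and there is a } q \in A \text{ such that } r \le q \text{ and } r \Vdash \tau \in \sigma_q \text{ and } \tau \in \mathrm{dmn}(\sigma_q).$$
Fix $q \in A$ and fix a generic $G$ for $\mathbb{P}$ over $M$ such that $q \in G$; we want to show that
$\pi_G = (\sigma_q)_G$.
First suppose that $x \in \pi_G$. Choose $(\tau, r) \in \pi$ such that $r \in G$ and $x = \tau_G$. By the
definition of $\pi$, there is a $q' \in A$ such that $r \le q'$, $r \Vdash \tau \in \sigma_{q'}$, and $\tau \in \mathrm{dmn}(\sigma_{q'})$. Since
$r \in G$, also $q' \in G$. But $A$ is an antichain, $q, q' \in A$, and $q \in G$, so $q = q'$. So $r \Vdash \tau \in \sigma_q$,
and since $r \in G$ it follows that $\tau_G \in (\sigma_q)_G$.
Second, suppose that $y \in (\sigma_q)_G$. Choose $(\tau, r) \in \sigma_q$ such that $r \in G$ and $y = \tau_G$.
Since $\tau_G \in (\sigma_q)_G$, there is a $p \in G$ such that $p \Vdash \tau \in \sigma_q$. Also $q \in G$, so let $s \in G$ be such
that $s \le p, q$. Then $(\tau, s) \in \pi$, and so $y = \tau_G \in \pi_G$.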
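Theorem 21.15. (maximal principle) Suppose that $M$ is a c.t.m. of ZFC, $\mathbb{P} \in M$
is a quasi-order, $\tau_1, \ldots, \tau_n \in M^{\mathbb{P}}$, $p \in P$, and $p \Vdash \exists x \varphi(x, \tau_1, \ldots, \tau_n)$. Then there is a
$\sigma \in M^{\mathbb{P}}$ such that $p \Vdash \varphi(\sigma, \tau_1, \ldots, \tau_n)$.
Proof. This argument takes place in $M$, unless otherwise indicated. By Zorn's lemma,
let $A$ be an antichain, maximal with respect to the property
(1) For all $q \in A$, $q \le p$ and $q \Vdash \varphi(\sigma', \tau_1, \ldots, \tau_n)$ for some $\sigma' \in M^{\mathbb{P}}$.
By the axiom of choice, for each $q \in A$ let $\sigma_q \in M^{\mathbb{P}}$ be such that $q \Vdash \varphi(\sigma_q, \tau_1, \ldots, \tau_n)$.
By Lemma 21.14, let $\sigma \in M^{\mathbb{P}}$ be such that $q \Vdash \sigma = \sigma_q$ for every $q \in A$. Since also
$q \Vdash \varphi(\sigma_q, \tau_1, \ldots, \tau_n)$, an easy argument using the definition of forcing, thus external to $M$,
shows that $q \Vdash \varphi(\sigma, \tau_1, \ldots, \tau_n)$.
Now we show that $p \Vdash \varphi(\sigma, \tau_1, \ldots, \tau_n)$. To this end we argue outside $M$. Suppose
that $G$ is $\mathbb{P}$-generic over $M$ with $p \in G$. We claim that $G \cap A \ne \emptyset$. In fact, by Theorem 3.13(xi), the
set
(2) $\{r \le p : \text{there is a } \sigma' \in M^{\mathbb{P}} \text{ such that } r \Vdash \varphi(\sigma', \tau_1, \ldots, \tau_n)\}$
is dense below $p$, and hence there is an $r \in G$ which is also in (2). By Proposition 3.3(iii),
if $G \cap A = \emptyset$, then there is an element $q \in G$ incompatible with each member of $A$; in this
case, choose $s \in G$ with $s \le r, q$. Then $s$ is in (2) and $s$ is incompatible with each element
of $A$, contradicting the maximality of $A$. So $G \cap A \ne \emptyset$.
Say $q \in G \cap A$. Choose $r \in G$ such that $r \le p, q$. Since $q \Vdash \varphi(\sigma, \tau_1, \ldots, \tau_n)$, also
$r \Vdash \varphi(\sigma, \tau_1, \ldots, \tau_n)$, and hence $\varphi(\sigma_G, (\tau_1)_G, \ldots, (\tau_n)_G)$ holds in $M[G]$, as desired.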
Now we give a fact about Martin's axiom which is used below; it is an exercise in Chapter
9.
Lemma 21.16. MA($\kappa$) is equivalent to MA($\kappa$) restricted to ccc quasi-orders of cardinality $\le \kappa$.
Proof. We assume the indicated special form of MA($\kappa$), and assume given a ccc
quasi-order $\mathbb{P}$ and a family $\mathcal{D}$ of at most $\kappa$ dense sets in $\mathbb{P}$; we want to find a filter on $\mathbb{P}$
intersecting each member of $\mathcal{D}$. We introduce some operations on $P$. For each $D \in \mathcal{D}$
define $f_D : P \to P$ by setting, for each $p \in P$, $f_D(p)$ to be some element of $D$ which is
$\le p$. Also we define $g : P \times P \to P$ by setting, for all $p, q \in P$,
$$g(p, q) = \begin{cases} p & \text{if } p \text{ and } q \text{ are incompatible,}\\ r \text{ with } r \le p, q & \text{if there is such an } r.\end{cases}$$
Here, as in the definition of $f_D$, we are implicitly using the axiom of choice; for $g$, we
choose any $r$ of the indicated form.
We may assume that $\mathcal{D} \ne \emptyset$. Choose $D \in \mathcal{D}$, and choose $s \in D$. Now let $Q$ be the
intersection of all subsets of $P$ which have $s$ as a member and are closed under all of the
operations $f_D$ and $g$. We take the order on $Q$ to be the order induced from $P$.
(1) $|Q| \le \kappa$.
To prove this, we give an alternative definition of $Q$. Define
$R_0 = \{s\}$;
$R_{n+1} = R_n \cup \{g(a, b) : a, b \in R_n\} \cup \{f_D(a) : D \in \mathcal{D} \text{ and } a \in R_n\}$.
Clearly $\bigcup_{n \in \omega} R_n = Q$. By induction, $|R_n| \le \kappa$ for all $n \in \omega$, and hence $|Q| \le \kappa$, as desired
in (1).
We also need to check that $Q$ is ccc. Suppose that $X$ is a collection of pairwise
incompatible elements of $Q$. Then these elements are also incompatible in $P$, since $x, y \in X$
with $x, y$ compatible in $P$ implies that $g(x, y) \le x, y$ and $g(x, y) \in Q$, so that $x, y$ are
compatible in $Q$. It follows that $X$ is countable. So $Q$ is ccc.
Next we claim that if $D \in \mathcal{D}$ then $D \cap Q$ is dense in $Q$. For, suppose $p \in Q$. Then
$f_D(p) \in D \cap Q$ and $f_D(p) \le p$, as desired.
Now we can apply our special case of MA($\kappa$) to $Q$ and $\{D \cap Q : D \in \mathcal{D}\}$; we obtain a
filter $G$ on $Q$ such that $G \cap D \cap Q \ne \emptyset$ for all $D \in \mathcal{D}$. Let
$G' = \{p \in P : q \le p \text{ for some } q \in G\}$.
We claim that $G'$ is the desired filter on $\mathbb{P}$ intersecting each $D \in \mathcal{D}$.
Clearly if $p \in G'$ and $p \le r$, then $r \in G'$.
Suppose that $p_1, p_2 \in G'$. Choose $q_1, q_2 \in G$ such that $q_i \le p_i$ for each $i = 1, 2$. Then
there is an $r \in G$ such that $r \le q_1, q_2$. Then $r \in G'$ and $r \le p_1, p_2$. So $G'$ is a filter on $\mathbb{P}$.
Now take any $D \in \mathcal{D}$, and take $q \in G \cap D \cap Q$. Then $q \in G' \cap D$, as desired.
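The stage-by-stage description of $Q$ in this proof can be illustrated on a toy example. The following is only a sketch under assumptions that are not in the text: the quasi-order is taken to be finite binary strings ordered by extension, the dense sets are the hypothetical $D_k$ = strings of length at least $k$, and the choice functions `make_f` and `g` are one arbitrary way of making the choices that the proof obtains from the axiom of choice.

```python
from itertools import product

def leq(p, q):
    """p <= q in the toy poset of binary strings: p extends q."""
    return p.startswith(q)

def compatible(p, q):
    """Two strings have a common extension iff one extends the other."""
    return leq(p, q) or leq(q, p)

def g(p, q):
    """A chosen common extension of p and q if one exists, else p."""
    if compatible(p, q):
        return p if len(p) >= len(q) else q
    return p

def make_f(k):
    """A choice function f_D for D_k = {strings of length >= k}: pad with zeros."""
    return lambda p: p if len(p) >= k else p + "0" * (k - len(p))

def closure(start, unary_ops, binary_op):
    """Union of the stages R_0 = {start}, R_{n+1} = R_n plus all images."""
    stage = {start}
    while True:
        nxt = set(stage)
        nxt |= {binary_op(a, b) for a, b in product(stage, repeat=2)}
        nxt |= {f(a) for f in unary_ops for a in stage}
        if nxt == stage:  # no new elements: the closure has been reached
            return stage
        stage = nxt

# Closure of {""} under g and the four choice functions f_{D_1}, ..., f_{D_4}.
Q = closure("", [make_f(k) for k in range(1, 5)], g)
```

In this toy case the closure is tiny even though the ambient poset is infinite, which is the point of the lemma: $Q$ is no larger than the number of operations requires.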
Theorem 21.17. Suppose that $M$ is a c.t.m. of ZFC, and in $M$ we have an uncountable
regular cardinal $\kappa$ such that $\sum_{\lambda < \kappa} 2^{\lambda} = \kappa$.
Then there is a quasi-order $\mathbb{P}$ in $M$ such that $\mathbb{P}$ satisfies ccc, and for any $\mathbb{P}$-generic $G$
over $M$, the extension $M[G]$ satisfies MA and $2^{\omega} = \kappa$.
Proof. The overall idea of the proof runs like this. We do an iterated forcing which
has the effect of producing a chain
$M = M_0 \subseteq M_1 \subseteq \cdots \subseteq M_\alpha \subseteq M_{\alpha+1} \subseteq \cdots \subseteq M_\kappa$
of length $\kappa + 1$ of c.t.m.s of ZFC. We carry along in the construction a list of names of
quasi-orders. This list is of length $\kappa$. At the step from $M_\alpha$ to $M_{\alpha+1}$ we take care of one
entry in this list, say $Q$, by taking a $Q$-generic filter $G$ and setting $M_{\alpha+1} = M_\alpha[G]$, and
we add to our list all names of quasi-orders in $M_\alpha$. By proper coding, we can do this so
that at the end we have taken care of all quasi-orders in any model $M_\alpha$. Then we show that
any ccc quasi-order in $M_\kappa$ appeared already in an earlier stage and so a generic filter for
it was added.

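We begin by defining the coding which will be used. Some elementary notation: For
any ordered pair $(a, b)$, $1^{\mathrm{st}}(a, b) = a$ and $2^{\mathrm{nd}}(a, b) = b$.
Claim. There is a function $f$ in $M$ with the following properties:
(1) $f : \kappa \to \kappa \times \kappa$.
(2) For all $\alpha, \beta, \gamma < \kappa$ there is a $\delta > \alpha$ such that $f(\delta) = (\beta, \gamma)$.
(3) $1^{\mathrm{st}}(f(\alpha)) \le \alpha$ for all $\alpha < \kappa$.
Proof of Claim. Let $g : \kappa \to \kappa \times \kappa \times \kappa$ be a bijection. For each $\alpha < \kappa$ let $g(\alpha) = (\xi, \beta, \gamma)$,
and set
$$f(\alpha) = \begin{cases} (\beta, \gamma) & \text{if } \beta \le \alpha,\\ (0, 0) & \text{otherwise.}\end{cases}$$
So (1) and (3) obviously hold. For (2), suppose that $\alpha, \beta, \gamma < \kappa$. Now $g^{-1}[\{(\xi, \beta, \gamma) : \xi < \kappa\}]$
has size $\kappa$, so there is a $\delta \in g^{-1}[\{(\xi, \beta, \gamma) : \xi < \kappa\}]$ such that $\alpha, \beta < \delta$. Say
$g(\delta) = (\xi, \beta, \gamma)$. Then $\beta < \delta$, so $f(\delta) = (\beta, \gamma)$ and $\alpha < \delta$, as desired.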
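The bookkeeping function of the Claim can be imitated with $\omega$ in place of $\kappa$. The sketch below rests on assumptions that are not in the text: the bijection $g$ is built from the Cantor pairing function, and `later_stage` is a hypothetical helper that exhibits a witness for property (2). With this particular $g$ the "otherwise" case of $f$ never fires, but it is kept to mirror the definition.

```python
from math import isqrt

def pair(x, y):
    """Cantor pairing: a bijection from pairs of naturals onto the naturals."""
    s = x + y
    return s * (s + 1) // 2 + y

def unpair(n):
    """Inverse of the Cantor pairing."""
    w = (isqrt(8 * n + 1) - 1) // 2
    y = n - w * (w + 1) // 2
    return w - y, y

def g(n):
    """A bijection from the naturals onto triples of naturals."""
    a, m = unpair(n)
    b, c = unpair(m)
    return a, b, c

def f(n):
    """Bookkeeping function: hand out (b, c) at stage n only if b <= n."""
    _, b, c = g(n)
    return (b, c) if b <= n else (0, 0)

def later_stage(alpha, beta, gamma):
    """A stage delta > alpha with f(delta) == (beta, gamma), as in property (2)."""
    return pair(alpha + 1, pair(beta, gamma))
```

Property (3) says that the entry handled at stage $n$ was already listed at some stage $\le n$; property (2) says that every listed entry is handled at arbitrarily late stages.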
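Another preliminary is a cardinality bound. Note that if $\lambda < \kappa$, then $\kappa^{\lambda} = \kappa$, since, using
regularity and $\sum_{\mu < \kappa} 2^{\mu} = \kappa$, we have
$$\kappa^{\lambda} = \Bigl|\bigcup_{\alpha < \kappa} {}^{\lambda}\alpha\Bigr| \le \sum_{\alpha < \kappa} |\alpha|^{\lambda} \le \sum_{\alpha < \kappa} 2^{|\alpha| \cdot \lambda} \le \kappa.$$
(4) If $Q$ is a ccc quasi-order in $M$ of size less than $\kappa$, then there are at most $\kappa$ pairs $(\lambda, \sigma)$
such that $\lambda < \kappa$ and $\sigma$ is a nice $Q$-name for a subset of $(\lambda \times \lambda)\check{\ }$.
To prove (4), recall that a nice $Q$-name for a subset of $(\lambda \times \lambda)\check{\ }$ is a set of the form
$$\bigcup \{\{\check{a}\} \times A_a : a \in \lambda \times \lambda\}$$
where for each $a \in \lambda \times \lambda$, $A_a$ is an antichain in $Q$. Now by ccc the number of antichains
in $Q$ is at most $|Q|^{\omega} \le \kappa$. So for a fixed $\lambda < \kappa$ the number of sets of the indicated form is at
most $\kappa^{\lambda} = \kappa$. Hence (4) holds.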
For brevity, we let pord(, W ) abbreviate the statement that W is the order relation of a
ccc quasi-order on the set , with largest element 0.
Now we are going to define by recursion functions P, , , and with domain . Let I
be the collection of all finite subsets of .
Let P0 be the trivial partial order ({0}, 0, 0).
Now suppose that P has been defined, so that it is a ccc quasi-order in M . We now
define , , , and P+1 . By (4), the set of all pairs (, ) such that < and

is a nice P -name for a subset of ( ) has size at most . We let {(


, ) : < }
enumerate all of them. This defines and . Now let f () = (, ). So , and
hence and are defined. We consider the complete embedding i given in Theorem
21.10(xi). By Proposition 6.1, i ( ) is a P -name.

(5) There is a P -name such that


1P P pord(( ), ) and [pord(( ), i ( )) = i ( )].
In fact, clearly
1P P W (pord(( ), W ) and [pord(( ), i ( )) W = i ( )],
so (5) follows by the maximal principle, Theorem 21.15.
We now let
= op(op( ), ), 0),
which is a P -name for a quasi-order. Finally, P+1 is determined by (K8) and (K9).
For limit we define P by (K10) and (K11).
This finishes the construction.
By Lemma 21.11, each quasi-order P for satisfies ccc.
Now take any P -generic G over M ; we want to show that MA() holds in M [G] for
every < . (Later we show that 2 = in M [G].) Note that, by ccc, P preserves
cofinalities and cardinalities. Let G = i1
[G] for each < .
Suppose that Q is a ccc quasi-order in M [G], |Q| , and D is a family of at most
subsets of Q dense in Q, with D M [G]. By taking an isomorphic image, we may assume
that Q is an ordinal less than , and it has maximal element 0.
(6) There are , , < such that f () = (, ), Q = , Q = ( )G , and D M [G ].
In fact, we have pord(, Q ). Applying Lemma 21.13 with in place of S and Q in
place of X, we see that Q is a member of some M [G ] with < . Let D = {D : < }.
Then we can apply Lemma 21.13 to the set {(, ) : , < and D } to infer that
D M [G ] for some < , and we may assume that = . By Proposition 4.9, there is
a nice name P -name for a subset of such that G =Q . By construction, we can
then choose < such that (, ) = ( , ). Next, choose such that f () = (, ).
Thus (6) holds.
Now we consider the construction of P+1 . In this construction we chose a name as in
(5). Now we know that pord(, Q ) (in M [G]). By absoluteness, this also holds in M [G ].
(It is still ccc, as otherwise it would fail to be ccc in M [G].) We have Q = = (( ))G
and Q = G = ( )G . By Theorem 21.3(i), ( )G = (i ( ))G . Take as in (5).
Then G = (i ( ))G . Thus = op(op( ), ), 0). Then ( )G = Q. Let
H = {G : dmn(0 ) and p hi G+1 for some p}.
Then H is ( )G -generic over M [G ] by Lemma 21.12. Since D M [G ], we get
H D 6= for all D D, as desired.
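Since MA($\lambda$) holds for every $\lambda < \kappa$, it follows from Lemma 21.16 that in $M[G]$, $2^{\omega} \ge \kappa$.
Now in $M$ we have $\kappa^{\omega} = \kappa$, as observed early in this proof. Hence by Proposition 4.10
it follows that $2^{\omega} \le \kappa$ in $M[G]$. Thus $2^{\omega} = \kappa$ in $M[G]$.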

EXERCISES
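E21.1. Let $f$ be a complete embedding of $\mathbb{P}$ into $\mathbb{Q}$. Show that there is an isomorphism
$g$ of RO($\mathbb{P}$) into RO($\mathbb{Q}$) such that for any $X \subseteq \mathrm{RO}(\mathbb{P})$,
$g\bigl(\sum^{\mathrm{RO}(\mathbb{P})} X\bigr) = \sum^{\mathrm{RO}(\mathbb{Q})}_{x \in X} g(x)$.
Furthermore, show that the following diagram commutes, where $f : P \to Q$ is the top
arrow, $e_P : P \to \mathrm{RO}(\mathbb{P})$ and $e_Q : Q \to \mathrm{RO}(\mathbb{Q})$ are the vertical arrows, and
$g : \mathrm{RO}(\mathbb{P}) \to \mathrm{RO}(\mathbb{Q})$ is the bottom arrow; that is, $g \circ e_P = e_Q \circ f$.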

E21.2. Prove that a composition of complete embeddings is a complete embedding.


E21.3. Suppose that f is a complete embedding of P into Q. Also suppose that p
P q, r p(q r). Show that p Qq, r p(q r).
E21.4. Prove that every isomorphism is a complete embedding.
E21.5. Give an example of a complete embedding which is not an isomorphism.
E21.6. A dense embedding of $\mathbb{P}$ into $\mathbb{Q}$ is a function $f : P \to Q$ such that the following
conditions hold:
(i) $\forall p, q \in P\,[p \le q \to f(p) \le f(q)]$.
(ii) $\forall p, q \in P\,[p \perp q \to f(p) \perp f(q)]$.
(iii) For any $q \in Q$ there is a $p \in P$ such that $f(p) \le q$.
Show that every dense embedding is a complete embedding, and every isomorphism is a
dense embedding.
E21.7. Prove that if $f$ is a dense embedding of $\mathbb{P}$ into $\mathbb{Q}$, then RO($\mathbb{P}$) and RO($\mathbb{Q}$) are
isomorphic.
E21.8. Let P and Q be quasi-orders in M . Recall the notion of product from Chapter 6.
Q ), 1Q ). Note that is with respect to P here; see the definition on
Let Q = op(op(PQ ,
page 38. Show that Q is a P-name for a quasi-order, and P Q is isomorphic to P Q .
E21.9. Give an example of a partial order P and a P-name for a partial order such that
P is not a partial order. Hint: Let P be fin(, 2). Let p = {(0, 0)} and q = {(0, 1)}.
Now define
0 = {(, p), ({(, q)}, p), (, q)};
2 = ;
1 = up(op(, ), op(, )).

E21.10. Let be an infinite cardinal. For f, g we write f < g iff |{ < : f ()


g()}| < . We say that F is almost unbounded iff there is no g such that
f < g for all f F . Clearly itself is almost unbounded; it has size 2
Show that if is a regular cardinal, then any almost unbounded subset of has size
at least + .
E21.11. Suppose that is an infinite cardinal and MA() holds. Suppose that F
and |F | = . Then there is a g such that f < g for all f F . Hint: let P be the
set of all pairs (p, F ) such that p is a finite function contained in and F is a finite
subset of F . Define (p, F ) (q, G) iff p q, F G, and
f Gn (dmn(p)\dmn(q))[p(n) > f (n)].
E21.12. We begin exercises giving another application of iterated forcing.
(i) Show that there is a c.t.m. M of ZFC + 2 = 1 + 21 = 3 .
(ii) Show that if Q is a ccc quasi-order of size 1 in the model M of (i), then there
are at most 1 nice Q-names for subsets of ( ).
E21.13. (Continuing E21.12) Now we are going to define by recursion functions P, , and
with domain 2 .
Let P0 be the trivial partial order ({0}, 0, 0).
Now suppose that P has been defined, so that it is a ccc quasi-order in M of size at
most 1 . We now define , , and P+1 . By E21.12(ii), the set of all nice P -names
for subsets of ( ) has size at most 1 . We let { : < 1 } enumerate all of them.
Prove:
(iii) For every < 1 there is a P -name such that
1P P :

and [ :

implies that = ].

E21.14. (Continuing E21.13) For each H [1 ]< we define


H = {( , 1P ) : H}.
So
H is a P -name.
Next, define
<
}.
p,
0 = {(op(
H ), 1) : p fin(, ) and H [1 ]

Let G be P -generic over M . Prove:


(iv) (0 )G = {(p, K) : p fin(, ) and K [ ]< }.
E21.15. (Continuing E21.14) Next, we define

1 = {(op(op(
p,
p ,
H ), op(
H )), q) : p, p fin(, ),

H, H [1 ]< , p p, H H, q P , and for all H


and all n dmn(p)\dmn(p ), q P (
n) < (p(n))}.
Again, suppose that G is P -generic over M . Prove:
(v)

(1 )G = {((p, K), (p, K )) : (p, K), (p , K ) (0 )G , p p, K K,


and for all f K and all n dmn(p)\dmn(p ), f (n) < p(n)}.

E21.16. (Continuing E21.15) Next, we let 2 = {(op(0, 0), 1P )}. Then for any generic
G, (2 )G = (0, 0). Finally, let = op(op(0 , 1 ), 2 ). This finishes the definition of .
Prove:
(vi) 1P P is
1 cc.
E21.17. (Continuing E21.16) Now P+1 is determined. The limit stages are clear. So the
construction is finished, and P is ccc.
Let G be P -generic over M . Prove
(vii) In M [G], if F and |F | < 2 , then there is a g such that f < g for all
f F.
E21.18. (Continuing E21.17) Show that if ZFC is consistent, then there is a c.t.m. of ZFC
with the following properties:
(i) 2 = 2 .
(ii) 21 = 3 .
(iii) Every almost unbounded set of functions from to has size 2 .
(iv) MA(1 ) fails.


22. Various forcing partial orders


In this section we briefly survey various partial orders which have been used in forcing
arguments. Many of them give rise to new real numbers, i.e., new subsets of $\omega$. (It is
customary to identify real numbers with subsets of $\omega$, since these are simpler objects than
Dedekind cuts; and a bijection in the ground model between $\mathbb{R}$ and $\mathcal{P}(\omega)$ transfers the
newness to "real" real numbers.) For each kind of forcing we give a reference for further
results concerning it. Of course our list of forcing partial orders is not complete, but we
hope the treatment here can be a guide to further study.
Cohen forcing
The forcing used in section 10 is, as indicated there, called Cohen forcing. If $M$ is a
c.t.m. of ZFC, $\mathbb{P}$ is $\mathrm{Fin}(\omega, 2)$, and $G$ is $\mathbb{P}$-generic over $M$, then $\bigcup G$ is a Cohen real. More
generally, if $N$ is a c.t.m. of ZFC and $M \subseteq N$, then a Cohen real in $N$ is a function
$f : \omega \to 2$ in $N$ such that there is a $\mathbb{P}$-generic filter $G$ over $M$ such that $M[G] \subseteq N$ and
$f = \bigcup G$.
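Theorem 22.1. Suppose that $M$ is a c.t.m. of ZFC, $I \in M$, $I = J_0 \cup J_1$ with
$J_0 \cap J_1 = \emptyset$, and $G$ is $\mathrm{Fin}(I, 2)$-generic over $M$.
(i) Let $H_0 = G \cap \mathrm{Fin}(J_0, 2)$. Then $H_0$ is $\mathrm{Fin}(J_0, 2)$-generic over $M$.
(ii) Let $H_1 = G \cap \mathrm{Fin}(J_1, 2)$. Then $H_1$ is $\mathrm{Fin}(J_1, 2)$-generic over $M[H_0]$.
(iii) $M[G] = M[H_0][H_1]$.
Proof. We are going to use 12.14. Let $\mathbb{P}$ be the partial order $\mathrm{Fin}(J_0, 2)$ and $\mathbb{Q}$
the partial order $\mathrm{Fin}(J_1, 2)$. We claim that $\mathrm{Fin}(I, 2)$ is isomorphic to $\mathbb{P} \times \mathbb{Q}$. Define
$f(p) = (p \upharpoonright J_0, p \upharpoonright J_1)$. Clearly this is an isomorphism. We claim that $f[G] = H_0 \times H_1$.
For, suppose that $p \in G$. Then $p \upharpoonright J_0 \subseteq p$, so $p \upharpoonright J_0 \in G$, and hence $p \upharpoonright J_0 \in H_0$. Similarly,
$p \upharpoonright J_1 \in H_1$. So $f(p) \in H_0 \times H_1$. Conversely, if $(p, q) \in H_0 \times H_1$, then $p \in G$ and $q \in G$, so
there is an $r \in G$ such that $p, q \subseteq r$. Now $p \cup q \subseteq r$, so $p \cup q \in G$. Clearly $f(p \cup q) = (p, q)$.
So this proves that $f[G] = H_0 \times H_1$.
It follows that $H_0 \times H_1$ is $\mathbb{P} \times \mathbb{Q}$-generic over $M$, and $M[G] = M[H_0 \times H_1]$. Now we
can apply 12.14 to get:
(1) $H_0$ is $\mathbb{P}$-generic over $M$.
(2) $H_1$ is $\mathbb{Q}$-generic over $M[H_0]$.
(3) $M[G] = M[H_0][H_1]$.
This proves our theorem.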
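The isomorphism $f(p) = (p \upharpoonright J_0, p \upharpoonright J_1)$ used in this proof is easy to experiment with. The sketch below is illustrative only: finite partial functions are modelled as Python dictionaries, and the names `split`, `merge` and `extends` are hypothetical helpers, not notation from the text.

```python
def split(p, J0):
    """Send a finite partial function p to the pair (p restricted to J0, the rest)."""
    p0 = {i: v for i, v in p.items() if i in J0}
    p1 = {i: v for i, v in p.items() if i not in J0}
    return p0, p1

def merge(p0, p1):
    """Inverse of split: the union of two partial functions with disjoint domains."""
    return {**p0, **p1}

def extends(p, q):
    """p <= q in Fin(I, 2): p contains q as a set of ordered pairs."""
    return all(i in p and p[i] == v for i, v in q.items())

# Example with I = {0, ..., 5}, J0 = {0, 1, 2}, J1 = {3, 4, 5}.
J0 = {0, 1, 2}
p = {0: 1, 3: 0, 4: 1}
q = {0: 1, 3: 0}
p0, p1 = split(p, J0)
q0, q1 = split(q, J0)
```

Here `p` extends `q`, and correspondingly each component of `split(p, J0)` extends the matching component of `split(q, J0)`, which is what being an isomorphism onto the product order amounts to.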
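It follows that all of the subsets of $\omega$ given in the proof of 10.3 are Cohen reals:
Corollary 22.2. Let $M$ be a c.t.m. of ZFC and let $\kappa$ be a cardinal of $M$ such that
$\kappa^{\omega} = \kappa$. Let $\mathbb{P} = \mathrm{Fin}(\kappa, 2)$ in $M$, let $G$ be $\mathbb{P}$-generic over $M$, and let $g = \bigcup G$. Let
$h : \kappa \times \omega \to \kappa$ be a bijection in $M$. Then for each $\alpha < \kappa$, the set $\{m \in \omega : g(h(\alpha, m)) = 1\}$
is a Cohen real.
Proof. Remember that subsets of $\omega$ and their characteristic functions are both considered
as reals. Implicitly, one is a Cohen real iff the other is, by definition. So we will
show that the function $l \stackrel{\mathrm{def}}{=} \langle g(h(\alpha, m)) : m \in \omega\rangle$ is a Cohen real.
Fix $\alpha < \kappa$, and let $J = \{\beta < \kappa : h^{-1}(\beta) \text{ has the form } (\alpha, m) \text{ for some } m \in \omega\}$. Let
$k(m) = h(\alpha, m)$ for all $m \in \omega$. Then $k$ is a bijection from $\omega$ onto $J$. By 22.1, $G \cap \mathrm{Fin}(J, 2)$
is $\mathrm{Fin}(J, 2)$-generic over $M$. Define $k' : \mathrm{Fin}(J, 2) \to \mathrm{Fin}(\omega, 2)$ by setting $k'(p) = p \circ k$ for
any $p \in \mathrm{Fin}(J, 2)$. So $k'$ is an isomorphism from $\mathrm{Fin}(J, 2)$ onto $\mathrm{Fin}(\omega, 2)$. Clearly then
$k'[G \cap \mathrm{Fin}(J, 2)]$ is $\mathrm{Fin}(\omega, 2)$-generic over $M$. So the proof is completed by checking that
$\bigcup k'[G \cap \mathrm{Fin}(J, 2)] = l$. Take any $m \in \omega$ and $\varepsilon \in 2$. Then
$(m, \varepsilon) \in \bigcup k'[G \cap \mathrm{Fin}(J, 2)]$
iff there is a $p \in k'[G \cap \mathrm{Fin}(J, 2)]$ such that $(m, \varepsilon) \in p$
iff there is a $q \in G \cap \mathrm{Fin}(J, 2)$ such that $(m, \varepsilon) \in k'(q)$
iff there is a $q \in G \cap \mathrm{Fin}(J, 2)$ such that $(m, \varepsilon) \in q \circ k$
iff $g(k(m)) = \varepsilon$
iff $g(h(\alpha, m)) = \varepsilon$
iff $(m, \varepsilon) \in l$.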

Theorem
S 22.3. Suppose that M is a c.t.m. of ZFC and G is Fin(, 2)-generic over
M . Let g = G (so that g is a Cohen real). Then for any f which is in M , the set
{m : f (m) < g(m)} is infinite.
Proof. For each n let in M
Dn = {h Fin(, 2) : there is an m > n such that m dmn(h) and f (m) < h(m)}.
Clearly Dn is dense. Hence the desired result follows.
Cohen reals are widely used in set theory. A couple of interesting facts are: adding a
Cohen real adds a Suslin tree. If one starts with a model of CH and adds Cohen reals,
then in the extension there is a maximal chain of elements different from 1 in the Boolean
algebra $\mathcal{P}(\omega)/\mathrm{fin}$.
Roitman, J. [79] Adding a random or a Cohen real. . . Fund. Math. 103 (1979), 47-60.
Random forcing
The general idea of random forcing is to take a $\sigma$-algebra of measurable sets with respect
to some measure, divide by the ideal of sets of measure zero, obtaining a complete Boolean
algebra, and use it as the forcing algebra; the partially ordered set of nonzero elements is
the forcing partial order.
We give more details in the simplest case: we are going to define a measure on certain
subsets of ${}^{\omega}2$ and form a partial order in a natural way, without giving complete details
on how this fits into the above framework. We refer to Halmos, Measure Theory, for
standard facts about measures.
Let finseq be the set of all finite sequences of 0s and 1s; thus
$$\mathrm{finseq} = \bigcup_{n \in \omega} {}^{n}2.$$
For each $f \in \mathrm{finseq}$ we define
$U_f = \{g \in {}^{\omega}2 : f \subseteq g\}$.
Also, let $F$ be the set of all finite unions of $U_f$s. We count the empty set among the finite
unions, and ${}^{\omega}2$ itself is in $F$, since ${}^{\omega}2 = U_{\emptyset}$.
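Lemma 22.4.
(i) If $n \in \omega$, then $\bigcup_{f \in {}^{n}2} U_f = {}^{\omega}2$.
(ii) If $m \le n < \omega$ and $f \in {}^{m}2$, then $U_f = \bigcup\{U_g : g \in {}^{n}2 \text{ and } f \subseteq g\}$.
(iii) If $a \in F$, then there is an $m \in \omega$ and a set $\mathcal{F} \subseteq {}^{m}2$ such that $a = \bigcup_{f \in \mathcal{F}} U_f$.
(iv) If $m \in \omega$, $\mathcal{F}, \mathcal{G} \subseteq {}^{m}2$, and $\bigcup_{f \in \mathcal{F}} U_f \subseteq \bigcup_{g \in \mathcal{G}} U_g$, then $\mathcal{F} \subseteq \mathcal{G}$.
(v) If $m \in \omega$, $\mathcal{F}, \mathcal{G} \subseteq {}^{m}2$, and $\bigcup_{f \in \mathcal{F}} U_f = \bigcup_{g \in \mathcal{G}} U_g$, then $\mathcal{F} = \mathcal{G}$.
(vi) $F$ forms a field of sets, i.e., it is closed under union and complementation.
Proof. (i): For any $g \in {}^{\omega}2$ we have $g \in U_{g \upharpoonright n}$.
(ii): $\supseteq$ obviously holds. Now suppose that $h \in U_f$. Then $h \in U_{h \upharpoonright n}$ and $f \subseteq h \upharpoonright n$.
(iii): Let $a = U_{f_0} \cup \ldots \cup U_{f_{m-1}}$. Let $n_i$ be the domain of $f_i$ for each $i < m$. Let
$p = \max\{n_i : i < m\}$ and $\mathcal{F} = \{h \in {}^{p}2 : f_i \subseteq h \text{ for some } i < m\}$. Then, by (ii),
$$a = U_{f_0} \cup \ldots \cup U_{f_{m-1}} = \bigcup\{U_g : g \in {}^{p}2 \text{ and } f_0 \subseteq g\} \cup \ldots \cup \bigcup\{U_g : g \in {}^{p}2 \text{ and } f_{m-1} \subseteq g\} = \bigcup_{f \in \mathcal{F}} U_f.$$
(iv): Assume the hypothesis, and suppose that $f \in \mathcal{F}$. Take any $h \in U_f$. Then there
is a $g \in \mathcal{G}$ such that $h \in U_g$. Then $g = h \upharpoonright m = f$, so $f \in \mathcal{G}$. Thus $\mathcal{F} \subseteq \mathcal{G}$.
(v): immediate from (iv).
(vi): Obviously $F$ is closed under unions. For complementation, write $a \in F$ as in (iii).
Then the element
$$\bigcup_{f \in {}^{m}2 \setminus \mathcal{F}} U_f$$
is clearly disjoint from $a$, and its union with $a$ is ${}^{\omega}2$ by (i), so it is the complement of $a$.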
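The normal form of Lemma 22.4(iii) makes the field $F$ easy to compute with. The following sketch uses a representation chosen here for illustration, not taken from the text: an element of $F$ is stored as a pair `(m, fam)` where `fam` is a set of 0/1 tuples of length `m`, standing for the union of the corresponding $U_f$.

```python
from itertools import product

def refine(m, fam, n):
    """Rewrite the union of U_f (f in fam, all of length m) at a level n >= m,
    as in Lemma 22.4(ii): replace each f by all of its extensions of length n."""
    return {f + ext for f in fam for ext in product((0, 1), repeat=n - m)}

def union(a, b):
    """Union of two elements of F, computed at a common level."""
    (m, fa), (n, fb) = a, b
    p = max(m, n)
    return p, refine(m, fa, p) | refine(n, fb, p)

def complement(a):
    """Complement in F, as in the proof of Lemma 22.4(vi)."""
    m, fam = a
    return m, set(product((0, 1), repeat=m)) - fam

# U_<0> union U_<1,0>, and its complement U_<1,1>.
u = union((1, {(0,)}), (2, {(1, 0)}))
v = complement(u)
```

By Lemma 22.4(v), two such pairs at the same level represent the same set exactly when their families are equal, so equality in $F$ can be tested after refining both sides to a common level.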

To check one of the properties of measure it is convenient to introduce another standard
notion in the theory of Boolean algebras. If $A$ is a Boolean algebra, an ultrafilter on $A$ is
a subset $G$ of $A$ satisfying the following conditions:
(1) $0 \notin G$.
(2) $1 \in G$.
(3) If $a \in G$ and $a \le b$, then $b \in G$.
(4) If $a, b \in G$, then $a \cdot b \in G$.
(5) For any $a \in A$, either $a \in G$ or $-a \in G$.
A subset $X$ of a Boolean algebra $A$ satisfies the finite intersection property (fip) iff
$\prod F \ne 0$ for every finite $F \subseteq X$.
Lemma 22.5. If $X$ is a subset of a BA $A$ which satisfies fip, then $X \subseteq G$ for some
ultrafilter $G$.
Proof. Let $\mathcal{A} = \{Y \subseteq A : X \subseteq Y \text{ and } Y \text{ has fip}\}$. We want to apply Zorn's lemma
to $\mathcal{A}$, partially ordered by inclusion.
Suppose that $\mathcal{B}$ is a nonempty subset of $\mathcal{A}$ simply
ordered by inclusion. Then $\bigcup \mathcal{B}$ has fip; for if $F$ is a finite subset of $\bigcup \mathcal{B}$, for each $a \in F$ let
$Z_a \in \mathcal{B}$ be such that $a \in Z_a$. Since $\mathcal{B}$ is simply ordered by inclusion, there is an $a \in F$
such that $Z_b \subseteq Z_a$ for all $b \in F$. Hence $F \subseteq Z_a$, so $\prod F \ne 0$ since $Z_a \in \mathcal{B} \subseteq \mathcal{A}$.
So we apply Zorn's lemma and get a maximal member $G$ of $\mathcal{A}$. Thus $X \subseteq G$, and we
claim that $G$ is an ultrafilter on $A$. Since $G$ itself has fip, we have $0 \notin G$. Clearly $G \cup \{1\}$
has fip, so by the maximality of $G$ we have $1 \in G$. If $x \in G$ and $x \le y$, again it is clear
that $G \cup \{y\}$ has fip, so $y \in G$ by maximality. Next, suppose that $a, b \in G$. Once again it
is clear that $G \cup \{a \cdot b\}$ has fip, and so $a \cdot b \in G$. Finally, suppose that $a \in A$ but $a \notin G$;
we show that $-a \in G$. By the maximality of $G$ it follows that $G \cup \{a\}$ does not have fip.
So there is a finite subset $F$ of $G$ such that $a \cdot \prod F = 0$. Then $\prod F \in G$ and $\prod F \le -a$,
so $-a \in G$ by the above, as desired.
Lemma 22.6. If $G$ is an ultrafilter on a Boolean algebra $A$, $F \in [A]^{<\omega}$, and $\sum F \in G$,
then $a \in G$ for some $a \in F$.
Proof. Suppose that $a \notin G$ for all $a \in F$. Thus $-a \in G$ for all $a \in F$, so $\prod_{a \in F} -a \in G$.
Then $0 = (\sum F) \cdot \prod_{a \in F} -a \in G$, contradiction.
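Conditions (1) to (5) and Lemma 22.6 can be checked by brute force in a small finite Boolean algebra. The sketch below is an illustration under assumptions made here: the algebra is the power set of $\{0, 1, 2\}$, and the ultrafilter is the principal one generated by the point 1. In condition (5) the code tests that exactly one of $a$, $-a$ lies in $G$, which is what (5) amounts to in the presence of (1) and (4).

```python
from itertools import combinations

S = frozenset({0, 1, 2})
# All subsets of S: the elements of the Boolean algebra.
algebra = [frozenset(c) for r in range(len(S) + 1)
           for c in combinations(sorted(S), r)]
# The principal ultrafilter of all sets containing the point 1.
G = {a for a in algebra if 1 in a}

def is_ultrafilter(G, algebra, top):
    """Check conditions (1)-(5) of the definition by exhaustive search."""
    return (frozenset() not in G
            and top in G
            and all(b in G for a in G for b in algebra if a <= b)
            and all(a & b in G for a in G for b in G)
            and all((a in G) != (top - a in G) for a in algebra))

def lemma_22_6_holds(G, algebra):
    """If the sum (union) of a finite family lies in G, so does some member."""
    for r in range(1, len(algebra) + 1):
        for family in combinations(algebra, r):
            total = frozenset().union(*family)
            if total in G and not any(a in G for a in family):
                return False
    return True
```

In a finite algebra every ultrafilter is principal, so this check covers all ultrafilters of the example up to the choice of the point.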
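The following lemma actually follows from the topological duality theory for Boolean
algebras. To avoid going into this theory, we give a direct proof.
Lemma 22.7. If $\langle a_i : i \in I\rangle$ is a system of elements of $F$ and $b \stackrel{\mathrm{def}}{=} \bigcup_{i \in I} a_i$ is in $F$,
then there is a finite subset $F'$ of $I$ such that $b = \bigcup_{i \in F'} a_i$.
Proof. Suppose this is not true. Then the set
(1) $\{b\} \cup \{-a_i : i \in I\}$
has fip. In fact, otherwise there is a finite $F' \subseteq I$ such that $b \cap \bigcap_{i \in F'} -a_i = 0$, and hence
$b \subseteq \bigcup_{i \in F'} a_i \subseteq \bigcup_{i \in I} a_i = b$, so that $b = \bigcup_{i \in F'} a_i$, contrary to our supposition.
Hence by 22.5 let $D$ be an ultrafilter on $F$ which contains the set (1). Now for each
$n \in \omega$ we have $\bigcup_{g \in {}^{n}2} U_g = 1 \in D$, so by 22.6, there is a $g \in {}^{n}2$ such that $U_g \in D$. Since
$U_k \cap U_h = \emptyset$ for distinct $k, h \in {}^{n}2$, this $g$ is unique; call it $g^{(n)}$. If $m \le n$, then $U_{g^{(m)}} \cap U_{g^{(n)}} \in D$,
hence $U_{g^{(m)}} \cap U_{g^{(n)}} \ne \emptyset$, hence $g^{(m)} \subseteq g^{(n)}$. Let $f = \bigcup_{m \in \omega} g^{(m)}$. So $f \in {}^{\omega}2$.
Now write $b = \bigcup_{h \in \mathcal{H}} U_h$ with $\mathcal{H} \subseteq {}^{p}2$ for some $p \in \omega$, by 22.4(iii). Since $b \in D$,
by 22.6, $U_h \in D$ for some $h \in \mathcal{H}$. Since also $U_{g^{(p)}} \in D$, clearly $h = g^{(p)}$. It follows that
$f \in U_h \subseteq b = \bigcup_{i \in I} a_i$, so there is an $i \in I$ such that $f \in a_i$.
Now write $-a_i = \bigcup_{k \in \mathcal{K}} U_k$ with $\mathcal{K} \subseteq {}^{q}2$ for some $q \in \omega$, by 22.4(iii). Since $-a_i \in D$, as above we
then get $g^{(q)} \in \mathcal{K}$, and so $f \in -a_i$. This is a contradiction.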
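Now we turn to the actual measure.
Lemma 22.8. There is a measure $\mu$ on $F$ such that if $m \in \omega$ and $\mathcal{F} \subseteq {}^{m}2$, then
$$\mu\Bigl(\bigcup_{f \in \mathcal{F}} U_f\Bigr) = \frac{|\mathcal{F}|}{2^m}.$$
Proof. Note first that the indicated condition on $\mu$ is not quite a definition, since a
given element can be written as a union in many ways. So we first show
(1) If $m, n \in \omega$, $\mathcal{F} \subseteq {}^{m}2$, $\mathcal{G} \subseteq {}^{n}2$ and $\bigcup_{f \in \mathcal{F}} U_f = \bigcup_{g \in \mathcal{G}} U_g$, then $\dfrac{|\mathcal{F}|}{2^m} = \dfrac{|\mathcal{G}|}{2^n}$.
To prove this, say by symmetry that $m \le n$. For each $f \in \mathcal{F}$ let $\mathcal{H}_f = \{h \in {}^{n}2 : f \subseteq h\}$.
Thus $|\mathcal{H}_f| = 2^{n-m}$ and $U_f = \bigcup_{h \in \mathcal{H}_f} U_h$ by 22.4(ii). Let $\mathcal{K} = \{k \in {}^{n}2 : f \subseteq k \text{ for some } f \in \mathcal{F}\}$. Then
$$\bigcup_{k \in \mathcal{K}} U_k = \bigcup_{f \in \mathcal{F}} \bigcup_{h \in \mathcal{H}_f} U_h = \bigcup_{f \in \mathcal{F}} U_f$$
and
$$|\mathcal{K}| = \sum_{f \in \mathcal{F}} |\mathcal{H}_f| = \sum_{f \in \mathcal{F}} 2^{n-m} = |\mathcal{F}| \cdot 2^{n-m},$$
and hence
$$\frac{|\mathcal{K}|}{2^n} = \frac{|\mathcal{F}| \cdot 2^{n-m}}{2^n} = \frac{|\mathcal{F}|}{2^m}.$$
Thus
$$\bigcup_{k \in \mathcal{K}} U_k = \bigcup_{f \in \mathcal{F}} U_f = \bigcup_{g \in \mathcal{G}} U_g,$$
so by 22.4(v), $\mathcal{K} = \mathcal{G}$. And hence by the above
$$\frac{|\mathcal{F}|}{2^m} = \frac{|\mathcal{K}|}{2^n} = \frac{|\mathcal{G}|}{2^n},$$
as desired in (1).
Thus the condition on $\mu$ in the lemma can be taken as a definition, and 22.4(iii) implies
that $\mu$ is defined on all of $F$. Obviously $\mu$ takes on only nonnegative values, and $\mu(0) = 0$.
Note that $\mu(a) \ne 0$ if $a \ne 0$, although we will not use this. Clearly $0 \le \mu(a) \le 1$ for all
$a \in F$. Also, $\mu({}^{\omega}2) = 1$, since
$$\mu({}^{\omega}2) = \mu\Bigl(\bigcup_{f \in {}^{0}2} U_f\Bigr) = 1.$$
Now suppose that $a$ and $b$ are disjoint elements of $F$; we want to show that $\mu(a \cup b) =
\mu(a) + \mu(b)$. By 22.4(ii) and (iii) there exist $m \in \omega$ and $\mathcal{F}, \mathcal{G} \subseteq {}^{m}2$ such that $a = \bigcup_{f \in \mathcal{F}} U_f$ and
$b = \bigcup_{g \in \mathcal{G}} U_g$. Since $a \cap b = \emptyset$, we also have $\mathcal{F} \cap \mathcal{G} = \emptyset$. Hence
$$\mu(a \cup b) = \mu\Bigl(\bigcup_{f \in \mathcal{F}} U_f \cup \bigcup_{g \in \mathcal{G}} U_g\Bigr) = \mu\Bigl(\bigcup_{f \in \mathcal{F} \cup \mathcal{G}} U_f\Bigr) = \frac{|\mathcal{F} \cup \mathcal{G}|}{2^m} = \frac{|\mathcal{F}| + |\mathcal{G}|}{2^m} = \frac{|\mathcal{F}|}{2^m} + \frac{|\mathcal{G}|}{2^m} = \mu(a) + \mu(b).$$
Finally, countable additivity follows trivially from 22.7, since no infinite union of nonempty
pairwise disjoint members of $F$ is in $F$.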
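Statement (1) in this proof, that $|\mathcal{F}|/2^m$ does not depend on the level at which a set is written, can be spot-checked numerically. This is a sketch under the same illustrative representation as before (a level `m` together with a set of 0/1 tuples of length `m`); `measure` and `refine` are names introduced here, and exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction
from itertools import product

def refine(m, fam, n):
    """Rewrite the union of U_f (f in fam, length m) at level n >= m."""
    return {f + ext for f in fam for ext in product((0, 1), repeat=n - m)}

def measure(m, fam):
    """mu of the union of U_f for f in fam, with fam a set of tuples of length m."""
    return Fraction(len(fam), 2 ** m)

# The same set U_<0>, written at level 1 and at level 3.
low = (1, {(0,)})
high = (3, refine(1, {(0,)}, 3))

# Two disjoint sets at level 2 and their union.
a = {(0, 0)}
b = {(1, 0), (1, 1)}
```

Refining from level $m$ to level $n$ multiplies both the number of sequences and the denominator by $2^{n-m}$, which is exactly the computation $|\mathcal{K}| = |\mathcal{F}| \cdot 2^{n-m}$ in the proof.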

We can extend the measure given in Lemma 22.8 to the $\sigma$-field $F'$ generated by $F$, and
the extended measure is still denoted by $\mu$. (See, e.g., Halmos, Theorem A in section 22.)
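Lemma 22.9. If $a \in F'$, $\mu(a) > 0$, and $\varepsilon > 0$, then there is a $b \in F'$ such that $b \subseteq a$
and $0 < \mu(b) < \varepsilon$.
Proof. We have $\mu(-a) < 1$, so by standard measure theory results, there is a system
$\langle b_i : i \in \omega\rangle$ of elements of $F$ such that $-a \subseteq \bigcup_{i \in \omega} b_i$ and $\mu\bigl(\bigcup_{i \in \omega} b_i\bigr) < 1$. (See, e.g., the
top of page 57 of Halmos.) For each $i \in \omega$ let $c_i = b_i \setminus \bigcup_{j < i} b_j$. Then $\bigcup_{i \in \omega} c_i = \bigcup_{i \in \omega} b_i$.
In fact, $\subseteq$ is clear, and if $f \in \bigcup_{i \in \omega} b_i$, choose $i$ minimum such that $f \in b_i$; then $f \in c_i$.
Now by 22.4(iii) we can choose an $m$ such that $\frac{1}{2^m} < \varepsilon$ and $c_0 = \bigcup_{g \in \mathcal{G}} U_g$ for some
$\mathcal{G} \subseteq {}^{m}2$.
(1) There is a $g \in {}^{m}2 \setminus \mathcal{G}$ such that $\mu\Bigl(U_g \cap \bigcup_{0 < i} c_i\Bigr) < \dfrac{1}{2^m}$.
In fact, otherwise $\mu\bigl(U_g \cap \bigcup_{0 < i} c_i\bigr) = \mu(U_g)$ for every $g \in {}^{m}2 \setminus \mathcal{G}$, and since $\bigcup_{0 < i} c_i$ is
disjoint from $c_0$ we have
$$1 = \mu\Bigl(\bigcup_{g \in {}^{m}2} U_g\Bigr) = \sum_{g \in \mathcal{G}} \mu(U_g) + \sum_{g \in {}^{m}2 \setminus \mathcal{G}} \mu(U_g) = \mu(c_0) + \sum_{g \in {}^{m}2 \setminus \mathcal{G}} \mu\Bigl(U_g \cap \bigcup_{0 < i} c_i\Bigr) = \mu\Bigl(\bigcup_{i \in \omega} c_i\Bigr) < 1,$$
contradiction.
Now we choose $g$ as in (1), and let
$$d = \bigcup_{h \in {}^{m}2 \setminus \{g\}} U_h \cup \bigcup_{0 < i} c_i.$$
Then clearly $-a \subseteq \bigcup_{i \in \omega} c_i \subseteq d$, hence $-d \subseteq a$, and
$$0 \ne \mu(-d) = \mu\Bigl(U_g \setminus \bigcup_{0 < i} c_i\Bigr) \le \frac{1}{2^m} < \varepsilon.$$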

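Lemma 22.10. Suppose that $a \in F'$ with $\mu(a) > 0$ and $n \in \omega$. Then there is a
$b \in F'$ and an $h \in \mathrm{finseq}$ such that $\mathrm{dmn}(h) \ge n$, $b \subseteq a$, $0 < \mu(b)$, and $b \subseteq U_h$.
Proof. By 22.9 choose $c \in F'$ such that $c \subseteq a$ and $0 < \mu(c) < \frac{1}{2^n}$. Then by
measure theory, there is a sequence $\langle d_i : i \in \omega\rangle$ of elements of $F$ such that $c \subseteq \bigcup_{i \in \omega} d_i$ and
$\sum_{i \in \omega} \mu(d_i) < \frac{1}{2^n}$. By 22.4(ii) we may assume that each $d_i$ has the form $U_{h_i}$. Now
$$c = \bigcup_{i \in \omega} (c \cap U_{h_i}), \text{ hence } \mu(c) \le \sum_{i \in \omega} \mu(c \cap U_{h_i}),$$
so we can choose $i \in \omega$ such that $0 < \mu(c \cap U_{h_i})$, giving the desired result.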
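Now we define $P_r$ to be the collection of all $a \in F'$ of positive measure, ordered by inclusion.
Theorem 22.11. Let $M$ be a c.t.m. of ZFC, and suppose that $F$, $F'$, $P_r$ are taken
in the sense of $M$. Let $G$ be $P_r$-generic over $M$. Then there is an $f \in {}^{\omega}2$ such that
$\bigcap G = \{f\}$.
Proof. By Lemma 22.10, the set
$D_n = \{b \in P_r : \text{there is an } h \in \mathrm{finseq} \text{ such that } b \subseteq U_h \text{ and } \mathrm{dmn}(h) \ge n\}$
is dense in $P_r$ for each $n \in \omega$. Choose $b^{(n)} \in D_n \cap G$ for each $n \in \omega$, and choose
$h^{(n)} \in \mathrm{finseq}$ such that $b^{(n)} \subseteq U_{h^{(n)}}$ and $\mathrm{dmn}(h^{(n)}) \ge n$. So $U_{h^{(n)}} \in G$. Clearly $h^{(n)}$ and $h^{(m)}$ are compatible
for all $m, n \in \omega$, so $f \stackrel{\mathrm{def}}{=} \bigcup_{n \in \omega} h^{(n)}$ is a function mapping $\omega$ into 2. Clearly $f \in \bigcap G$. If
$g \in \bigcap G$, then $g \in U_{h^{(n)}}$ for each $n$, and hence $g = f$.
In the situation of Theorem 22.11, $f$, or $\{i \in \omega : f(i) = 1\}$, is called a random real.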
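Lemma 22.12. $P_r$ has ccc.
Proof. Suppose to the contrary that $X \subseteq P_r$ is an uncountable family of pairwise
incompatible elements. Then $\mu(a \cap b) = 0$ for all distinct $a, b \in X$, since otherwise $a \cap b$
would be a common extension of $a$ and $b$. Now
$$X = \bigcup_{n \in \omega} \Bigl\{x \in X : \mu(x) > \frac{1}{n+1}\Bigr\},$$
so there is an $n \in \omega$ such that $Y \stackrel{\mathrm{def}}{=} \{x \in X : \mu(x) > \frac{1}{n+1}\}$ is uncountable. If $Z$
is a denumerable subset of $Y$, we get $\mu(\bigcup Z) = \sum_{x \in Z} \mu(x)$, the pairwise intersections having
measure 0, which is impossible since $\mu(\bigcup Z) \le 1$ and $\mu(x) > \frac{1}{n+1}$ for all $x \in Z$.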

We now give a little more of the general theory of Boolean algebras. If $A$ is a BA, a subset
$I$ of $A$ is an ideal iff the following conditions hold:
(1) $0 \in I$.
(2) if $x \in I$ and $y \le x$, then $y \in I$.
(3) if $x, y \in I$, then $x + y \in I$.
Given an ideal $I$, we define a relation $\equiv_I \subseteq A \times A$ by:
$a \equiv_I b$ iff $a \triangle b \in I$.
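Lemma 22.13. Let $I$ be an ideal on a BA $A$.
(i) $\equiv_I$ is an equivalence relation on $A$.
(ii) If $a \equiv_I a'$ and $b \equiv_I b'$, then $a + b \equiv_I a' + b'$.
(iii) If $a \equiv_I a'$ and $b \equiv_I b'$, then $a \cdot b \equiv_I a' \cdot b'$.
(iv) If $a \equiv_I a'$, then $-a \equiv_I -a'$.
Proof. (i): $a \triangle a = 0 \in I$, so $a \equiv_I a$ for any $a \in A$. Suppose that $a \equiv_I b$. Thus
$a \triangle b \in I$. Since $a \triangle b = b \triangle a$, we get $b \equiv_I a$. Suppose that $a \equiv_I b \equiv_I c$. Then $a \triangle b \in I$ and
$b \triangle c \in I$. Hence
$a \triangle c = a \cdot -c + c \cdot -a$
$= a \cdot b \cdot -c + a \cdot -b \cdot -c + c \cdot b \cdot -a + c \cdot -b \cdot -a$
$\le a \cdot -b + b \cdot -c + b \cdot -a + c \cdot -b$
$= (a \triangle b) + (b \triangle c)$
$\in I$.
So $a \triangle c \in I$ and hence $a \equiv_I c$.
(ii): Assume that $a \equiv_I a'$ and $b \equiv_I b'$. Then
$(a + b) \triangle (a' + b') = (a + b) \cdot -(a' + b') + (a' + b') \cdot -(a + b)$
$= a \cdot -a' \cdot -b' + b \cdot -a' \cdot -b' + a' \cdot -a \cdot -b + b' \cdot -a \cdot -b$
$\le (a \triangle a') + (b \triangle b')$
$\in I$;
hence $a + b \equiv_I a' + b'$.
(iii): Assume that $a \equiv_I a'$ and $b \equiv_I b'$. Then
$(a \cdot b) \triangle (a' \cdot b') = a \cdot b \cdot -(a' \cdot b') + a' \cdot b' \cdot -(a \cdot b)$
$= a \cdot b \cdot -a' + a \cdot b \cdot -b' + a' \cdot b' \cdot -a + a' \cdot b' \cdot -b$
$\le (a \triangle a') + (b \triangle b')$
$\in I$;
hence $a \cdot b \equiv_I a' \cdot b'$.
(iv): Clearly $(-a) \triangle (-a') = a \triangle a'$, so (iv) holds.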
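The congruence properties of Lemma 22.13 can be verified exhaustively in a small example. The sketch below makes choices that are not in the text: the Boolean algebra is the power set of $\{0, 1, 2, 3\}$ and the ideal consists of the subsets of $\{0, 1\}$, so that two sets are equivalent exactly when they agree on $\{2, 3\}$.

```python
from itertools import combinations, product

universe = frozenset(range(4))
algebra = [frozenset(c) for r in range(len(universe) + 1)
           for c in combinations(sorted(universe), r)]
# The ideal of all subsets of {0, 1}.
ideal = {a for a in algebra if a <= frozenset({0, 1})}

def equiv(a, b):
    """a and b are equivalent modulo the ideal iff their symmetric difference is in it."""
    return (a ^ b) in ideal

def is_congruence():
    """Check (ii), (iii), (iv) of Lemma 22.13 for all choices of a, a', b, b'."""
    for a, a2, b, b2 in product(algebra, repeat=4):
        if equiv(a, a2) and equiv(b, b2):
            if not equiv(a | b, a2 | b2):
                return False
            if not equiv(a & b, a2 & b2):
                return False
            if not equiv(universe - a, universe - a2):
                return False
    return True

# Number of equivalence classes, i.e. the size of the quotient algebra.
classes = {frozenset(b for b in algebra if equiv(a, b)) for a in algebra}
```

In this example the quotient has four classes, matching the power set of $\{2, 3\}$.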
This lemma enables us to define the quotient $A/I$. Its elements are the equivalence classes
under $\equiv_I$, and the operations satisfy the following conditions; the lemma says that this is
possible:
$[a] + [b] = [a + b]$;
$[a] \cdot [b] = [a \cdot b]$;
$-[a] = [-a]$;
$0 = [0]$;
$1 = [1]$.
The algebraic structure $A/I$ is clearly a BA, since the axioms are easily checked.
We apply this in our case as follows. Let $I$ be the set of all $a \in F'$ of measure 0. It is
easily checked that $I$ is an ideal in $F'$. We define $\mathbb{M} = F'/I$.
A BA $A$ is $\sigma$-complete iff any countable subset of $A$ has a sum.
Lemma 22.14. If $A$ is a $\sigma$-complete BA satisfying ccc, then $A$ is complete.
Proof. Let $X$ be any subset of $A$; we want to show that it has a sum. By Zorn's
lemma, let $Y$ be a maximal set subject to the following conditions: $Y$ consists of pairwise
disjoint elements, and for any $y \in Y$ there is an $x \in X$ such that $y \le x$. By ccc, $Y$ is
countable, and so $\sum Y$ exists. We claim that $\sum Y$ is the least upper bound of $X$.
Suppose that $x \in X$ and $x \not\le \sum Y$. Then $x \cdot -\sum Y \ne 0$, and $Y \cup \{x \cdot -\sum Y\}$ properly
contains $Y$ and satisfies both of the conditions defining $Y$, contradiction. Hence $x \le \sum Y$.
So $\sum Y$ is an upper bound for $X$.
Suppose that $z$ is any upper bound for $X$, but suppose that $\sum Y \not\le z$. Thus $\sum Y \cdot -z \ne
0$, so by 2.2 there is a $y \in Y$ such that $y \cdot -z \ne 0$. Choose $x \in X$ such that $y \le x$. Now
$x \le z$, so $y \le z$, hence $y \cdot -z = 0$, contradiction.
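Lemma 22.15. $\mathbb{M}$ satisfies ccc and is complete.
Proof. Suppose that $X$ is an uncountable set of nonzero pairwise disjoint elements
of $\mathbb{M}$. We can write $X = \{[y] : y \in Y\}$ for some $Y \subseteq F'$ such that $[y] \cdot [z] = 0$ for
$y \ne z$. Let $\langle y_\alpha : \alpha < \omega_1\rangle$ be a sequence of distinct elements of $Y$. For each $\alpha < \omega_1$ let
$z_\alpha = y_\alpha \setminus \bigcup_{\beta < \alpha} y_\beta$. For $\beta < \alpha < \omega_1$ we have $[y_\alpha] \cdot [y_\beta] = 0$, hence $\mu(y_\alpha \cap y_\beta) = 0$, so
$\mu\bigl(\bigcup_{\beta < \alpha} (y_\alpha \cap y_\beta)\bigr) = 0$. Hence $\mu(z_\alpha) = \mu(y_\alpha) \ne 0$. So $\langle z_\alpha : \alpha < \omega_1\rangle$ is a system of pairwise
disjoint elements of $P_r$, contradicting 22.12. Hence $\mathbb{M}$ satisfies ccc.
To show that $\mathbb{M}$ is complete, by Lemma 22.14 it suffices to show that it is $\sigma$-complete.
So, suppose that $X$ is a countable subset of $\mathbb{M}$. We can write $X = \{[y] : y \in Y\}$ for some
countable subset $Y$ of $F'$. We claim that $[\bigcup Y]$ is the least upper bound for $X$. For, if
$x \in X$, choose $y \in Y$ such that $x = [y]$. Then $y \subseteq \bigcup Y$, so $x \le [\bigcup Y]$. Now suppose that
$[z]$ is any upper bound of $X$. Then $[y] \le [z]$ for any $y \in Y$, so $y \setminus z \in I$, i.e., $\mu(y \setminus z) = 0$,
for any $y \in Y$. Hence
$$\mu\Bigl(\bigcup Y \setminus z\Bigr) \le \sum_{y \in Y} \mu(y \setminus z) = 0;$$
so $[\bigcup Y] \le [z]$, as desired.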

Theorem 22.16. There is an isomorphism $f$ of RO($P_r$) onto $\mathbb{M}$ such that $f(i(a)) =
[a]$ for every $a \in P_r$, where $i$ is as in the definition of RO.
Proof. Define $j : P_r \to \mathbb{M}$ by setting $j(a) = [a]$ for all $a \in P_r$. Then the following
conditions are clear:
(1) $j[P_r]$ is dense in $\mathbb{M}$. (In fact, $j[P_r]$ consists of all nonzero elements of $\mathbb{M}$.)
(2) If $a, b \in P_r$ and $a \subseteq b$, then $j(a) \le j(b)$.
(3) If $a, b \in P_r$ and $a \perp b$, then $j(a) \cdot j(b) = 0$.
Hence our theorem follows from 3.8.
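Theorem 22.17. Suppose that $M$ is a c.t.m. of ZFC, and $F$, $F'$, $P_r$, and $\mathbb{M}$ are
considered in $M$. Suppose that $G$ is $P_r$-generic over $M$, and $f \in {}^{\omega}\omega$ in $M[G]$. Then there
is an $h \in {}^{\omega}\omega \cap M$ such that $f(n) < h(n)$ for all $n \in \omega$.
Proof. Let $\dot{f}$ be a name and $p \in P_r$ a condition such that $p \Vdash \dot{f} : \omega \to \omega$. We claim
that
$$E \stackrel{\mathrm{def}}{=} \{q \in P_r : \text{there is an } h \in {}^{\omega}\omega \text{ such that } q \Vdash \forall n \in \omega\,(\dot{f}(n) < \check{h}(n))\}$$
is dense below $p$. Clearly this gives the conclusion of the theorem.
To prove this, take any $r \le p$; we want to find $q \in E$ such that $q \le r$. Let $k$ be the
isomorphism from RO($P_r$) to $\mathbb{M}$ given by 22.16. Now temporarily fix $n \in \omega$. Then
$$i(r) \le [\![\exists m \in \omega\,(\dot{f}(\check{n}) < m)]\!] = \sum_{m \in \omega} [\![\dot{f}(\check{n}) < \check{m}]\!],$$
using 7.8(i). Applying $k$, we get
(1) $[r] \le \sum_{m \in \omega} k([\![\dot{f}(\check{n}) < \check{m}]\!])$.
Let $[r] \cdot k([\![\dot{f}(\check{n}) < \check{m}]\!]) = [a_m]$ for each $m \in \omega$. Now clearly if $m < p$, then $r \Vdash \dot{f}(\check{n}) <
\check{m} \to \dot{f}(\check{n}) < \check{p}$, so $[a_m] \le [a_p]$. Let $b_m = \bigcup_{p \le m} a_p$ for each $m \in \omega$. Then $[a_m] = [b_m]$ for
each $m$.
(2) $\sum_{m \in \omega} [b_m] = \bigl[\bigcup_{m \in \omega} b_m\bigr]$.
In fact, $\bigl[\bigcup_{m \in \omega} b_m\bigr]$ is clearly an upper bound for $\{[b_m] : m \in \omega\}$. If $[c]$ is any upper bound,
then $\mu(b_m \setminus c) = 0$ for each $m$, and hence $\mu\bigl(\bigcup_{m \in \omega} b_m \setminus c\bigr) = 0$, so that $\bigl[\bigcup_{m \in \omega} b_m\bigr] \le [c]$. So
(2) holds.
Note that $[r] = \bigl[\bigcup_{m \in \omega} b_m\bigr]$; so $\mu(r) = \mu\bigl(\bigcup_{m \in \omega} b_m\bigr)$. By Theorem D on page 38
of Halmos we get $\mu(r) = \sup\{\mu(b_m) : m \in \omega\}$. So we can choose $m$ such that
$\mu(b_m) \ge \mu(r) - \frac{1}{2^{n+2}} \mu(r)$. Let $h(n)$ be the least such $m$. Thus
(3) $\mu(r \setminus b_{h(n)}) = \mu(r) - \mu(b_{h(n)}) \le \dfrac{1}{2^{n+2}} \mu(r)$.
Now let $n$ vary (here $b_{h(n)}$ denotes the set $b_m$ obtained for the given $n$ with $m = h(n)$). We have
$$r \setminus \bigcap_{n \in \omega} b_{h(n)} = \bigcup_{n \in \omega} (r \setminus b_{h(n)}).$$
It follows that
$$\mu\Bigl(r \setminus \bigcap_{n \in \omega} b_{h(n)}\Bigr) \le \sum_{n \in \omega} \mu(r \setminus b_{h(n)}) \le \sum_{n \in \omega} \frac{1}{2^{n+2}} \mu(r) = \frac{1}{2} \mu(r),$$
hence
$$\mu\Bigl(r \cap \bigcap_{n \in \omega} b_{h(n)}\Bigr) > 0,$$
and this shows that $E$ is dense below $p$.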


Corollary 22.18. Suppose that M is a c.t.m. of ZFC, and Pr is considered in M .
Suppose that G is Pr -generic over M . Then no f in M [G] is a Cohen real.
Proof. By 22.17 and 22.3.
Thus we may say that adding a random real does not add a Cohen real.
Roitman, J. [79] Adding a random or a Cohen real. . . Fund. Math. 103 (1979), 47-60.
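Sacks forcing
Let Seq be the set of all finite sequences of 0s and 1s. A perfect tree is a nonempty subset
$T$ of Seq with the following properties:
(1) If $t \in T$ and $m < \mathrm{dmn}(t)$, then $t \upharpoonright m \in T$.
(2) For any $t \in T$ there is an $s \in T$ such that $t \subseteq s$ and $s^\frown\langle 0\rangle, s^\frown\langle 1\rangle \in T$.
Thus Seq itself is a perfect tree. Sacks forcing is the collection $Q$ of all perfect trees,
ordered by $\subseteq$ (not by $\supseteq$).
Note that an intersection of perfect trees does not have to be perfect. For example
(with $\varepsilon_1, \varepsilon_2, \ldots$ any members of 2):
$p = \{\emptyset, \langle 0\rangle, \langle 0\varepsilon_1\rangle, \langle 0\varepsilon_1\varepsilon_2\rangle, \ldots\}$;
$q = \{\emptyset, \langle 1\rangle, \langle 1\varepsilon_1\rangle, \langle 1\varepsilon_1\varepsilon_2\rangle, \ldots\}$.
Also, one can have $p, q$ perfect, $p \cap q$ not perfect, but $r \subseteq p \cap q$ for some perfect $r$:
$p = \{\emptyset, \langle 1\rangle, \langle 1\varepsilon_1\rangle, \langle 1\varepsilon_1\varepsilon_2\rangle, \ldots, \langle 0\rangle, \langle 01\rangle, \langle 01\varepsilon_2\rangle, \langle 01\varepsilon_2\varepsilon_3\rangle, \ldots\}$;
$q = \{\emptyset, \langle 1\rangle, \langle 1\varepsilon_1\rangle, \langle 1\varepsilon_1\varepsilon_2\rangle, \ldots, \langle 0\rangle, \langle 00\rangle, \langle 00\varepsilon_2\rangle, \langle 00\varepsilon_2\varepsilon_3\rangle, \ldots\}$;
$r = \{\emptyset, \langle 1\rangle, \langle 1\varepsilon_1\rangle, \langle 1\varepsilon_1\varepsilon_2\rangle, \ldots\}$.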

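Theorem 22.19. Suppose that $M$ is a c.t.m. of ZFC. Consider $Q$ within $M$, and let $G$ be $Q$-generic over $M$. Then the set
$$\bigcup\{s \in \mathrm{Seq} : s \in p \text{ for all } p \in G\}$$
is a function from $\omega$ into 2.
Proof. For each $n \in \omega$ let
$$D_n = \{p \in Q : \text{there is an } s \in \mathrm{Seq} \text{ such that } \mathrm{dmn}(s) = n \text{ and } s \subseteq t \text{ or } t \subseteq s \text{ for all } t \in p\}.$$
Then $D_n$ is dense: if $q \in Q$, choose any $s \in q$ such that $\mathrm{dmn}(s) = n$, and let $p = \{t \in q : s \subseteq t \text{ or } t \subseteq s\}$. Clearly $p \in D_n$ and $p \subseteq q$.
Now for each $n \in \omega$ let $p^{(n)}$ be a member of $G \cap D_n$, and choose $s^{(n)}$ accordingly.
(1) If $m < n$, then $s^{(m)} \subseteq s^{(n)}$.
In fact, choose $r \in G$ such that $r \subseteq p^{(m)} \cap p^{(n)}$. Then $s^{(m)} \subseteq t$ and $s^{(n)} \subseteq t$ for all $t \in r$ with $\mathrm{dmn}(t) \ge n$, so $s^{(m)} \subseteq s^{(n)}$.
(2) $s^{(m)} \in q$ for all $q \in G$.
In fact, let $q \in G$, and choose $r \in G$ such that $r \subseteq q$ and $r \subseteq p^{(m)}$. Take $t \in r$ with $\mathrm{dmn}(t) = m$. Then $t = s^{(m)}$ since $r \subseteq p^{(m)}$. Thus $s^{(m)} \in q$ since $r \subseteq q$.
(3) If $t \in q$ for all $q \in G$, then $t = s^{(m)}$ for some $m$.
For, let $\mathrm{dmn}(t) = m$. Since $t \in p^{(m)}$, we have $t = s^{(m)}$.
From (1)-(3) the conclusion of the theorem follows.
The function described in Theorem 22.19 is called a Sacks real.
If $p \in Q$, a member $f$ of $p$ is a branching point iff $f^\frown\langle 0\rangle, f^\frown\langle 1\rangle \in p$.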
Sacks forcing does not satisfy ccc:
Proposition 22.20. There is a family of $2^{\omega}$ pairwise incompatible members of $Q$.
Proof. Let $\mathcal{A}$ be a family of $2^{\omega}$ infinite pairwise almost disjoint subsets of $\omega$. With each $A \in \mathcal{A}$ we define a sequence $\langle P_{A,n} : n \in \omega\rangle$ of subsets of Seq, by recursion:
$$P_{A,0} = \{\emptyset\}; \qquad P_{A,n+1} = \begin{cases} \{f^\frown\langle 0\rangle : f \in P_{A,n}\} & \text{if } n \notin A, \\ \{f^\frown\langle 0\rangle : f \in P_{A,n}\} \cup \{f^\frown\langle 1\rangle : f \in P_{A,n}\} & \text{if } n \in A. \end{cases}$$
Note that all members of $P_{A,n}$ have domain $n$. We set $p_A = \bigcup_{n\in\omega} P_{A,n}$. We claim that $p_A$ is a perfect tree. Condition (1) is clear. For (2), suppose that $f \in p_A$; say $f \in P_{A,n}$. Let $m$ be the least member of $A$ greater than $n$. If $g$ extends $f$ by adjoining 0s from $n$ to $m - 1$, then $g^\frown\langle 0\rangle, g^\frown\langle 1\rangle \in p_A$, as desired in (2).
We claim that if $A, B \in \mathcal{A}$ and $A \ne B$, then $p_A$ and $p_B$ are incompatible. For, suppose that $q$ is a perfect tree and $q \subseteq p_A, p_B$. Now $A \cap B$ is finite. Let $m$ be an integer greater than each member of $A \cap B$. Let $f$ be a branching point of $q$ with $\mathrm{dmn}(f) \ge m$; it exists by (2) in the definition of perfect tree. Let $\mathrm{dmn}(f) = n$. Then $f \in P_{A,n}$ and $f^\frown\langle 0\rangle, f^\frown\langle 1\rangle \in P_{A,n+1}$, so $n \in A$ by construction. Similarly, $n \in B$, contradiction.
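The recursion defining the levels $P_{A,n}$ can be tried out on finite data. The following is only a sketch: the finite sets `A` and `B` are hypothetical stand-ins for two almost disjoint infinite sets, the function names are ours, and only the first few levels are computed. It illustrates that $p_A$ branches exactly at the levels in $A$, so a tree inside both $p_A$ and $p_B$ can branch only at levels in $A \cap B$.

```python
def levels(A, depth):
    """Return the list [P_{A,0}, ..., P_{A,depth}] of sets of 0/1 tuples."""
    out = [{()}]                      # P_{A,0} = {empty sequence}
    for n in range(depth):
        nxt = {f + (0,) for f in out[-1]}
        if n in A:                    # add the 1-successors only when n is in A
            nxt |= {f + (1,) for f in out[-1]}
        out.append(nxt)
    return out


def branching_levels(P):
    """Levels n at which some f in P[n] has both successors in P[n+1]."""
    return {
        n
        for n in range(len(P) - 1)
        if any(f + (0,) in P[n + 1] and f + (1,) in P[n + 1] for f in P[n])
    }


# Finite stand-ins for two almost disjoint sets (hypothetical example data).
A = {0, 2, 4, 6}
B = {0, 3, 6}
PA, PB = levels(A, 8), levels(B, 8)
common = [PA[n] & PB[n] for n in range(9)]

print(branching_levels(PA))      # levels where p_A branches
print(branching_levels(common))  # levels where both trees can branch
```

Since the true sets are infinite with finite intersection, the common part eventually stops branching, which is why no perfect tree fits inside both.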
Proposition 22.21. $Q$ is not $\omega_1$-closed.
Proof. For each $n \in \omega$ let
$$p_n = \{f \in \mathrm{Seq} : f(i) = 0 \text{ for all } i < n \text{ with } i \in \mathrm{dmn}(f)\}.$$
Clearly $p_n$ is perfect, $p_n \subseteq p_m$ if $n > m$, and $\bigcap_{n\in\omega} p_n$ is the set of all $f \in \mathrm{Seq}$ with $f(i) = 0$ for all $i \in \mathrm{dmn}(f)$, which has no branching points, so that the descending sequence $\langle p_n : n \in \omega\rangle$ does not have any member of $Q$ below it.
By 22.20 and 22.21, the methods of section 11 cannot be used to show that forcing with $Q$ preserves cardinals, even if we assume CH in the ground model. Nevertheless, we will show that it does preserve cardinals. To do this we will prove a modified version of $\omega_1$-closure.
If $p$ is a perfect tree, an $n$-th branching point of $p$ is a branching point $f$ of $p$ such that there are exactly $n$ branching points $g$ such that $g \subseteq f$. Thus $n > 0$. For perfect trees $p, q$ and $n$ a positive integer, we write $p \le_n q$ iff $p \subseteq q$ and every $n$-th branching point of $q$ is a branching point of $p$. Also we write $p \le_0 q$ iff $p \subseteq q$.
Lemma 22.22. Suppose that $p \subseteq q$ are perfect trees, and $n \in \omega$. Then:
(i) If $p \le_n q$, then $p \le_i q$ for every $i < n$.
(ii) If $p \le_n q$ and $f$ is an $n$-th branching point of $q$, then $f$ is an $n$-th branching point of $p$.
(iii) For each positive integer $n$ there is an $f \in p$ such that $f$ is an $n$-th branching point of $q$.
(iv) The following conditions are equivalent:
(a) $p \le_n q$.
(b) For every $f \in \mathrm{Seq}$, if $f$ is an $n$-th branching point of $q$, then $f^\frown\langle 0\rangle, f^\frown\langle 1\rangle \in p$.
(v) For each positive integer $n$ there are exactly $2^{n-1}$ $n$-th branching points of a perfect tree $p$.
(vi) If $p$ and $q$ are perfect trees, then so is $p \cup q$.
(vii) If $p$ and $q$ are perfect trees, then $\{r : r \text{ is a perfect tree and } r \subseteq p \text{ or } r \subseteq q\}$ is dense below $p \cup q$.
Proof. (i): Assume that $p \le_n q$, $i < n$, and $f$ is an $i$-th branching point of $q$. Then since $q$ is perfect there are $n$-th branching points $g, h$ of $q$ such that $f^\frown\langle 0\rangle \subseteq g$ and $f^\frown\langle 1\rangle \subseteq h$. So $g, h \in p$, hence $f^\frown\langle 0\rangle, f^\frown\langle 1\rangle \in p$ because $p$ is closed under initial segments, and so $f$ is a branching point of $p$. This shows that $p \le_i q$.
(ii): Suppose that $p \le_n q$ and $f$ is an $n$-th branching point of $q$. Let $r_0, \ldots, r_{n-1}$ be all of the branching points $g$ of $q$ such that $g \subseteq f$. Then by (i), $r_0, \ldots, r_{n-1}$ are all branching points of $p$. Hence $f$ is an $n$-th branching point of $p$.
(iii): Let $f$ be an $n$-th branching point of $p$. Then it is an $m$-th branching point of $q$ for some $m \ge n$. Let $r$ be an $n$-th branching point of $q$ below $f$. Then $r \in p$, as desired. [But $r$ might not be a branching point of $p$.]
(iv), (v), (vi): Immediate from the definitions.
(vii): Suppose that $p, q, t$ are perfect trees and $t \subseteq p \cup q$; we want to find a perfect tree $r \subseteq t$ such that $r \subseteq p$ or $r \subseteq q$. If $t \subseteq p \cap q$, then $r = t$ works. Otherwise, there is some member $f$ of $t$ which is not in both $p$ and $q$; say $f \in p \setminus q$. Then $r \overset{\mathrm{def}}{=} \{g \in t : g \subseteq f \text{ or } f \subseteq g\}$ is a perfect tree with $r \subseteq t$ and $r \subseteq p$.
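The count in Lemma 22.22(v) can be checked on a finite truncation of the full tree Seq. This is a sketch under stated assumptions: a tree is represented by the set of its nodes of length at most `depth` (so nodes at the last level never count as branching), and the function names are ours.

```python
from itertools import product


def is_branching(tree, f):
    """f is a branching point iff both one-step extensions are in the tree."""
    return f + (0,) in tree and f + (1,) in tree


def nth_branching_points(tree, n):
    """Branching points f with exactly n branching points g among the initial segments of f."""
    result = set()
    for f in tree:
        if is_branching(tree, f):
            below = sum(1 for k in range(len(f) + 1) if is_branching(tree, f[:k]))
            if below == n:
                result.add(f)
    return result


depth = 6
# All 0/1 sequences of length <= depth: a truncation of Seq.
seq = {t for k in range(depth + 1) for t in product((0, 1), repeat=k)}
counts = [len(nth_branching_points(seq, n)) for n in range(1, depth + 1)]
print(counts)
```

In the truncated full tree the $n$-th branching points are exactly the nodes of length $n - 1$, which matches the value $2^{n-1}$ stated in the lemma.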
Lemma 22.23. (Fusion lemma) If $\langle p_n : n \in \omega\rangle$ is a sequence of perfect trees and
$$\cdots \le_n p_n \le_{n-1} \cdots \le_2 p_2 \le_1 p_1 \le_0 p_0,$$
then $q \overset{\mathrm{def}}{=} \bigcap_{n\in\omega} p_n$ is a perfect tree, and $q \le_n p_n$ for all $n \in \omega$.
Proof. Let $n$ be a positive integer, and let $s$ be an $n$-th branching point of $p_n$. If $n \le m$, then $p_m \le_n p_n$ by 22.22(i), so $s$ is an $n$-th branching point of $p_m$; hence $s, s^\frown\langle 0\rangle, s^\frown\langle 1\rangle \in p_m$. It follows that $s, s^\frown\langle 0\rangle, s^\frown\langle 1\rangle \in q$, and $s$ is a branching point of $q$. Thus we just need to show that $q$ is a perfect tree.
Clearly if $t \in q$ and $n < \mathrm{dmn}(t)$, then $t \restriction n \in q$. Now suppose that $s \in q$; we want to find a $t \in q$ with $s \subseteq t$ and $t$ a branching point of $q$. Let $n = \mathrm{dmn}(s)$. Now $s \in p_n$, and $p_n$ has fewer than $n$ elements less than $s$, so $p_n$ has an $n$-th branching point $t \supseteq s$. By the first paragraph, $t \in q$.
Let $p$ be a perfect tree and $s \in p$. We define
$$p \restriction s = \{t \in p : t \subseteq s \text{ or } s \subseteq t\}.$$
Clearly $p \restriction s$ is still a perfect tree. Now for any positive integer $n$, let $t_0, \ldots, t_{2^n - 1}$ be the collection of all immediate successors of $n$-th branching points of $p$. (Recall 22.22(v).) Suppose that for each $i < 2^n$ we have a perfect tree $q_i \subseteq p \restriction t_i$. Then we define the amalgamation of $\{q_i : i < 2^n\}$ into $p$ to be the set $\bigcup_{i<2^n} q_i$.
Lemma 22.24. Under the above assumptions, the amalgamation $r$ of $\{q_i : i < 2^n\}$ into $p$ has the following properties:
(i) $r$ is a perfect tree.
(ii) $r \le_n p$.
Proof. (i): Suppose that $f \in r$, $g \in \mathrm{Seq}$, and $g \subseteq f$. Say $f \in q_i$ with $i < 2^n$. Then $g \in q_i$, so $g \in r$. Now suppose that $f \in r$; we want to find a branching point of $r$ above $f$. Say $f \in q_i$. Let $g$ be a branching point of $q_i$ with $f \subseteq g$. Clearly $g$ is a branching point of $r$.
(ii): Suppose that $f$ is an $n$-th branching point of $p$. Then there exist $i, j < 2^n$ such that $f^\frown\langle 0\rangle = t_i$ and $f^\frown\langle 1\rangle = t_j$. So $f^\frown\langle 0\rangle = t_i \in q_i \subseteq r$ and $f^\frown\langle 1\rangle = t_j \in q_j \subseteq r$, and so $f$ is a branching point of $r$.
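Lemma 22.25. Suppose that $M$ is a c.t.m. of ZFC and we consider the Sacks partial order $Q$ within $M$. Suppose that $B \in M$, $\sigma \in M^Q$, $p \in Q$, and $p \Vdash \sigma : \check\omega \to \check B$. Then there is a $q \le p$ and a function $F : \omega \to [B]^{<\omega}$ in $M$ such that $q \Vdash \sigma(\check n) \in \check F_n$ for every $n \in \omega$.
Proof. We work entirely within $M$, except as indicated. We construct two sequences $\langle q_n : n \in \omega\rangle$ and $\langle F_n : n \in \omega\rangle$ by recursion. Let $q_0 = p$. Suppose that $q_n$ has been defined; we define $F_n$ and $q_{n+1}$. Assume that $q_n \le p$. Then $q_n \Vdash \sigma : \check\omega \to \check B$, so $q_n \Vdash \exists x \in \check B\,(\sigma(\check n) = x)$. Let $t_0, \ldots, t_{2^n - 1}$ list all of the functions $f^\frown\langle 0\rangle$ and $f^\frown\langle 1\rangle$ such that $f$ is an $n$-th branching point of $q_n$. (Recall 22.22(v).) Then for each $i < 2^n$ we have $q_n \restriction t_i \le q_n$, and so $q_n \restriction t_i \Vdash \exists x \in \check B\,(\sigma(\check n) = x)$. Hence there exist an $r_i \le q_n \restriction t_i$ and a $b_i \in B$ such that $r_i \Vdash \sigma(\check n) = \check b_i$. Let $q_{n+1}$ be the amalgamation of $\{r_i : i < 2^n\}$ into $q_n$, and let $F_n = \{b_i : i < 2^n\}$. Thus $q_{n+1} \le_n q_n$ by 22.24. Moreover:
(1) $q_{n+1} \Vdash \sigma(\check n) \in \check F_n$.
In fact, let $G$ be $Q$-generic over $M$ with $q_{n+1} \in G$. By 22.22(vii), there is an $i$ such that $r_i \in G$. Since $r_i \Vdash \sigma(\check n) = \check b_i$, it follows that $\sigma_G(n) \in F_n$, as desired in (1).
Now with (1) the construction is complete.
By the fusion lemma 22.23 we get $s \le_n q_n$ for each $n$, where $s = \bigcap_{n\in\omega} q_n$. Hence the conclusion of the lemma follows.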
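Theorem 22.26. If $M$ is a c.t.m. of ZFC + CH and $Q \in M$ is the Sacks forcing partial order, and if $G$ is $Q$-generic over $M$, then cofinalities and cardinals are preserved in $M[G]$.
Proof. Since $|Q| \le 2^{\omega} = \omega_1$ by CH, the poset $Q$ satisfies the $\omega_2$-chain condition, and so preserves cofinalities and cardinals $\ge \omega_2$ by 9.4. Hence by 9.2 it suffices to show that $\omega_1^M$ remains regular in $M[G]$. Suppose not: then there is a function $f : \omega \to \omega_1^M$ in $M[G]$ such that $\mathrm{rng}(f)$ is cofinal in $\omega_1^M$. Hence there is a name $\sigma$ such that $f = \sigma_G$, and hence there is a $p \in G$ such that $p \Vdash \sigma : \check\omega \to \check\omega_1^M$. By Lemma 22.25, choose $q \le p$ and $F : \omega \to [\omega_1^M]^{<\omega}$ in $M$ such that $q \Vdash \sigma(\check n) \in \check F_n$ for every $n \in \omega$. Take $\alpha < \omega_1^M$ such that $\bigcup_{n\in\omega} F_n < \alpha$. Now $q \Vdash \exists n \in \omega\,(\check\alpha < \sigma(n))$, so there exist an $r \le q$ and an $n \in \omega$ such that $r \Vdash \check\alpha < \sigma(\check n)$. So we have:
(2) $r \Vdash \sigma(\check n) \in \check F_n$;
(3) $\bigcup_{n\in\omega} F_n < \alpha$;
(4) $r \Vdash \check\alpha < \sigma(\check n)$.
These three conditions give the contradiction $r \Vdash \sigma(\check n) < \sigma(\check n)$.
Baumgartner, J.; Laver, R. [79] Iterated perfect-set forcing. Ann. Math. Logic 17 (1979), 271-288.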
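Hechler MAD forcing
Infinite subsets $a, b$ of $\omega$ are almost disjoint iff $a \cap b$ is finite.
Theorem 22.27. There is a collection $\mathcal{A}$ of infinite pairwise almost disjoint subsets of $\omega$ such that $|\mathcal{A}| = 2^{\omega}$.
Proof. For each $f \in {}^{\omega}\omega$ let $a_f = \{f \restriction n : n \in \omega\}$. If $f, g \in {}^{\omega}\omega$ and $f \ne g$, then $a_f \cap a_g$ is finite. In fact, choose $i$ such that $f(i) \ne g(i)$. Then $a_f \cap a_g \subseteq \{f \restriction j : j \le i\}$. Now note that the collection $A$ of all finite sequences of members of $\omega$ has size $\omega$. So there is a bijection $F$ from $A$ to $\omega$. Let
$$\mathcal{A} = \{F[a_f] : f \in {}^{\omega}\omega\}.$$
Clearly $\mathcal{A}$ is as desired.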
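The key step of this proof, that the sets of initial segments of two distinct functions share only finitely many members, can be illustrated on finite data. This is a sketch only: the functions `f` and `g` are hypothetical examples, only prefixes of length below `N` are generated, and the coding bijection $F$ is omitted.

```python
def prefixes(f, N):
    """The finite part {f|n : n < N} of a_f, with f a function on the naturals."""
    return {tuple(f(k) for k in range(n)) for n in range(N)}


def f(k):
    return k % 2


def g(k):
    # Agrees with f below 5 and differs from it from 5 on.
    return k % 2 if k < 5 else 7


N = 50
i = next(k for k in range(N) if f(k) != g(k))   # first place where f and g differ
common = prefixes(f, N) & prefixes(g, N)
print(i, len(common))
```

The common prefixes are exactly those of length at most $i$, in line with the inclusion $a_f \cap a_g \subseteq \{f \restriction j : j \le i\}$ used in the proof.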
A family $\mathcal{A}$ of infinite subsets of $\omega$ is maximal almost disjoint (MAD) iff any two members of $\mathcal{A}$ are almost disjoint, and $\mathcal{A}$ is maximal with this property. By Theorem 22.27, there is a MAD family of size $2^{\omega}$. (Apply Zorn's lemma.)
Theorem 22.28. Every infinite MAD family of infinite subsets of $\omega$ is uncountable.
Proof. Suppose that $\mathcal{A}$ is a denumerable pairwise almost disjoint family of infinite subsets of $\omega$; we want to extend it. Write $\mathcal{A} = \{A_n : n \in \omega\}$, the $A_n$'s distinct. We define $\langle a_n : n \in \omega\rangle$ by recursion. Suppose that $a_m$ has been defined for all $m < n$. Now $\bigcup_{m<n}(A_m \cap A_n)$ is finite, so we can choose
$$a_n \in A_n \setminus \Bigl(\{a_m : m < n\} \cup \bigcup_{m<n}(A_m \cap A_n)\Bigr).$$
Note that then $a_n \notin A_m$ for any $m < n$. Let $B = \{a_n : n \in \omega\}$. Then $B$ is infinite, and $B \cap A_n \subseteq \{a_m : m \le n\}$.
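The diagonal choice of the $a_n$ can be run on a concrete family. The following sketch assumes a hypothetical almost disjoint family $A_n = \{0, \ldots, n\} \cup \{2^n(2k+1) : k \in \omega\}$, chosen only for illustration; the function names are ours, and only the first few $a_n$ are computed.

```python
def val2(x):
    """2-adic valuation of a positive integer x."""
    v = 0
    while x % 2 == 0:
        x //= 2
        v += 1
    return v


def in_A(n, x):
    """Membership in the hypothetical set A_n = {0,...,n} ∪ {2^n (2k+1) : k >= 0}."""
    return x <= n or (x > 0 and val2(x) == n)


def diagonal(count):
    """Choose a_n in A_n, different from earlier choices and outside every A_m with m < n."""
    chosen = []
    for n in range(count):
        x = 0
        while not (in_A(n, x)
                   and x not in chosen
                   and not any(in_A(m, x) for m in range(n))):
            x += 1
        chosen.append(x)
    return chosen


B = diagonal(6)
print(B)
```

For an element of $A_n$, lying outside every earlier $A_m$ is the same as lying outside $\bigcup_{m<n}(A_m \cap A_n)$, so the loop makes the choice described in the proof.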
Also recall that Martin's axiom implies that every MAD family has size $2^{\omega}$; a proof is in Chapter 2 of Kunen. We now want to introduce a forcing which will make a MAD family of size $\omega_1$, with $\neg$CH.
The members of our partial order $H$ will be certain pairs $(p, q)$; we define $(p, q) \in H$ iff the following conditions hold:
(1) $p$ is a function from a finite subset of $\omega_1$ into ${}^{n}2$ for some $n \in \omega$. We write $n = n_p$.
(2) $q$ is a function with domain contained in $[\mathrm{dmn}(p)]^2$ and range contained in $n_p$.
(3) If $\{\alpha, \beta\} \in \mathrm{dmn}(q)$ and $q(\{\alpha, \beta\}) = m$, then for every $i$ with $m \le i < n_p$ we have $(p(\alpha))(i) = 0$ or $(p(\beta))(i) = 0$.
Furthermore, for $(p_1, q_1), (p_2, q_2) \in H$ we define $(p_1, q_1) \le (p_2, q_2)$ iff the following conditions hold:
(4) $\mathrm{dmn}(p_1) \supseteq \mathrm{dmn}(p_2)$.
(5) $p_1(\alpha) \supseteq p_2(\alpha)$ for all $\alpha \in \mathrm{dmn}(p_2)$.
(6) $q_1 \supseteq q_2$.
Note that (5) implies that $n_{p_2} \le n_{p_1}$.
The idea here is to produce almost disjoint sets $a_\alpha$ for $\alpha < \omega_1$; $p(\alpha)$ is the characteristic function of $a_\alpha \cap n_p$, and $a_\alpha \cap a_\beta \subseteq q(\{\alpha, \beta\})$.
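Conditions (1)-(6) are finite checks, so they can be sketched directly. In the following illustration (an assumption-laden sketch, not part of the text's development) countable ordinals are represented by small natural numbers, $p$ is a dict from ordinals to 0/1 tuples, $q$ is a dict from two-element frozensets to natural numbers, and the names `in_H` and `extends` are ours.

```python
def n_of(p):
    """Common length n_p of the values of p, or None if the lengths differ."""
    lengths = {len(v) for v in p.values()}
    if len(lengths) > 1:
        return None
    return lengths.pop() if lengths else 0


def in_H(p, q):
    """Check conditions (1)-(3) for the pair (p, q)."""
    n = n_of(p)
    if n is None:
        return False
    for pair, m in q.items():
        # (2): pairs come from dmn(p) and values lie below n_p
        if len(pair) != 2 or not pair <= set(p) or not (0 <= m < n):
            return False
        a, b = tuple(pair)
        # (3): from position m on, the two characteristic functions never both take value 1
        if any(p[a][i] == 1 and p[b][i] == 1 for i in range(m, n)):
            return False
    return True


def extends(c1, c2):
    """Check conditions (4)-(6), i.e. c1 <= c2."""
    (p1, q1), (p2, q2) = c1, c2
    return (set(p2) <= set(p1)
            and all(p1[a][:len(p2[a])] == p2[a] for a in p2)
            and all(k in q1 and q1[k] == q2[k] for k in q2))


weak = ({0: (1, 0), 1: (1, 0)}, {})
strong = ({0: (1, 0, 1), 1: (1, 0, 0), 2: (0, 0, 0)}, {frozenset({0, 1}): 1})
bad = ({0: (1, 0, 1), 1: (1, 0, 1)}, {frozenset({0, 1}): 1})
print(in_H(*weak), in_H(*strong), in_H(*bad), extends(strong, weak))
```

Here `strong` promises that $a_0 \cap a_1 \subseteq 1$, while `bad` violates (3) at position 2.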
Lemma 22.29. Suppose that $(p_3, q_3) \le (p_1, q_1), (p_2, q_2)$. Then $(p_3, q_1 \cup q_2) \in H$, and $(p_3, q_3) \le (p_3, q_1 \cup q_2) \le (p_1, q_1), (p_2, q_2)$.
Proof. Condition (1) clearly holds for $(p_3, q_1 \cup q_2)$, since it only involves $p_3$. Clearly $q_1 \cup q_2$ is a relation with domain contained in $[\mathrm{dmn}(p_1)]^2 \cup [\mathrm{dmn}(p_2)]^2 \subseteq [\mathrm{dmn}(p_3)]^2$. To show that it is a function, suppose that $\{\alpha, \beta\} \in \mathrm{dmn}(q_1) \cap \mathrm{dmn}(q_2)$. Then $q_1(\{\alpha, \beta\}) = q_3(\{\alpha, \beta\}) = q_2(\{\alpha, \beta\})$. So $q_1 \cup q_2$ is a function, and it clearly maps into $\max(n_{p_1}, n_{p_2}) \le n_{p_3}$. Hence (2) holds for $(p_3, q_1 \cup q_2)$. Finally, suppose that $\{\alpha, \beta\} \in \mathrm{dmn}(q_1 \cup q_2)$. By symmetry, say $\{\alpha, \beta\} \in \mathrm{dmn}(q_1)$. Let $q_1(\{\alpha, \beta\}) = m$, and suppose that $m \le i < n_{p_3}$. Then $q_3(\{\alpha, \beta\}) = q_1(\{\alpha, \beta\}) = m$, so $(p_3(\alpha))(i) = 0$ or $(p_3(\beta))(i) = 0$. So (3) holds. The final inequalities are clear.
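Lemma 22.30. $H$ satisfies ccc.
Proof. Suppose that $N$ is an uncountable subset of $H$; we want to find two compatible members of $N$. Now $\langle \mathrm{dmn}(p) : (p, q) \in N\rangle$ is an uncountable system of finite sets, so there exist an uncountable $N' \subseteq N$ and a finite subset $H$ of $\omega_1$ such that $\langle \mathrm{dmn}(p) : (p, q) \in N'\rangle$ is a $\Delta$-system with root $H$. Next,
$$N' = \bigcup_{(f,g)\in J} \{(p, q) \in N' : p \restriction H = f \text{ and } q \restriction [H]^2 = g\},$$
where
$$J = \{(f, g) : f : H \to {}^{<\omega}2,\ g \text{ is a function},\ \mathrm{dmn}(g) \subseteq [H]^2, \text{ and } \mathrm{rng}(g) \subseteq \omega\}.$$
Since $J$ is countable, let $(f, g) \in J$ be such that $N'' \overset{\mathrm{def}}{=} \{(p, q) \in N' : p \restriction H = f \text{ and } q \restriction [H]^2 = g\}$ is uncountable. Now we claim that any two members $(p_1, q_1)$ and $(p_2, q_2)$ of $N''$ are compatible. Since $p_1 \restriction H = p_2 \restriction H$ and $\mathrm{dmn}(p_1) \cap \mathrm{dmn}(p_2) = H$, the relation $p_1 \cup p_2$ is a function. Say $n_{p_1} \le n_{p_2}$. We now define a function $p_3$ with domain $\mathrm{dmn}(p_1) \cup \mathrm{dmn}(p_2)$. Let $\alpha \in \mathrm{dmn}(p_1) \cup \mathrm{dmn}(p_2)$. Then we define $p_3(\alpha) : n_{p_2} \to 2$ by setting, for any $i < n_{p_2}$,
$$(p_3(\alpha))(i) = \begin{cases} (p_2(\alpha))(i) & \text{if } \alpha \in \mathrm{dmn}(p_2), \\ (p_1(\alpha))(i) & \text{if } \alpha \in \mathrm{dmn}(p_1) \setminus \mathrm{dmn}(p_2) \text{ and } i < n_{p_1}, \\ 0 & \text{otherwise.} \end{cases}$$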

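To check that $(p_3, q_1 \cup q_2) \in H$, first note that (1) is clear. To show that $q_1 \cup q_2$ is a function, suppose that $\{\alpha, \beta\} \in \mathrm{dmn}(q_1) \cap \mathrm{dmn}(q_2)$. Then $\mathrm{dmn}(q_1) \cap \mathrm{dmn}(q_2) \subseteq [\mathrm{dmn}(p_1)]^2 \cap [\mathrm{dmn}(p_2)]^2 = [H]^2$, and it follows that $q_1(\{\alpha, \beta\}) = q_2(\{\alpha, \beta\})$. Thus $q_1 \cup q_2$ is a function. Furthermore,
$$\mathrm{dmn}(q_1 \cup q_2) = \mathrm{dmn}(q_1) \cup \mathrm{dmn}(q_2) \subseteq [\mathrm{dmn}(p_1)]^2 \cup [\mathrm{dmn}(p_2)]^2 \subseteq [\mathrm{dmn}(p_1) \cup \mathrm{dmn}(p_2)]^2 = [\mathrm{dmn}(p_3)]^2.$$
The range of $q_1 \cup q_2$ is clearly contained in $n_{p_2}$. So we have checked (2). For (3), suppose that $\{\alpha, \beta\} \in \mathrm{dmn}(q_1 \cup q_2)$, $(q_1 \cup q_2)(\{\alpha, \beta\}) = m$, and $m \le i < n_{p_2}$. We consider some cases:
Case 1. $\{\alpha, \beta\} \in \mathrm{dmn}(q_2)$. Then $\alpha, \beta \in \mathrm{dmn}(p_2)$, so $p_3(\alpha) = p_2(\alpha)$ and $p_3(\beta) = p_2(\beta)$. Hence $(p_3(\alpha))(i) = 0$ or $(p_3(\beta))(i) = 0$, as desired.
Case 2. $\{\alpha, \beta\} \in \mathrm{dmn}(q_1) \setminus \mathrm{dmn}(q_2)$ and $i < n_{p_1}$. Thus $\alpha, \beta \in \mathrm{dmn}(p_1)$. If $\alpha \in \mathrm{dmn}(p_2)$, then $p_1(\alpha) = p_2(\alpha)$, and so $(p_3(\alpha))(i) = (p_1(\alpha))(i)$. If $\alpha \notin \mathrm{dmn}(p_2)$, still $(p_3(\alpha))(i) = (p_1(\alpha))(i)$. Similarly for $\beta$, so the desired conclusion follows.
Case 3. $\{\alpha, \beta\} \in \mathrm{dmn}(q_1) \setminus \mathrm{dmn}(q_2)$ and $n_{p_1} \le i$. Thus again $\alpha, \beta \in \mathrm{dmn}(p_1)$. If one of $\alpha, \beta$ is not in $\mathrm{dmn}(p_2)$, it follows that one of $(p_3(\alpha))(i)$ or $(p_3(\beta))(i)$ is 0, as desired. Suppose that both are in $\mathrm{dmn}(p_2)$. Then $\{\alpha, \beta\} \subseteq \mathrm{dmn}(p_1) \cap \mathrm{dmn}(p_2) = H$, and hence $\{\alpha, \beta\} \in \mathrm{dmn}(q_2)$, contradiction.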
Theorem 22.31. Let $M$ be a c.t.m. of ZFC, and consider $H$ in $M$. Let $G$ be $H$-generic over $M$. Then cofinalities and cardinals are preserved in $M[G]$, and in $M[G]$ there is a MAD family of size $\omega_1$.
Proof. Cofinalities and cardinals are preserved by 22.30 and 9.4. For each $\alpha < \omega_1$, let
$$x_\alpha = \bigcup\{p(\alpha) : (p, q) \in G \text{ for some } q, \text{ and } \alpha \in \mathrm{dmn}(p)\}.$$
We claim that $x_\alpha$ is a function. For, suppose that $(a, b), (a, c) \in x_\alpha$. By the definition, choose $(p_1, q_1), (p_2, q_2) \in G$ such that $\alpha \in \mathrm{dmn}(p_1)$, $\alpha \in \mathrm{dmn}(p_2)$, $(a, b) \in p_1(\alpha)$, and $(a, c) \in p_2(\alpha)$. Then choose $(p_3, q_3) \in G$ such that $(p_3, q_3) \le (p_1, q_1), (p_2, q_2)$. By (4) in the definition of $H$ we have $\alpha \in \mathrm{dmn}(p_3)$, and by (5) we have $(a, b), (a, c) \in p_3(\alpha)$, so $b = c$.
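Next we claim that in fact $x_\alpha$ has domain $\omega$. (Its domain is clearly a subset of $\omega$.) For, take any $m \in \omega$. It suffices to show that the set
$$D^\alpha_m \overset{\mathrm{def}}{=} \{(p, q) \in H : \alpha \in \mathrm{dmn}(p) \text{ and } m \in \mathrm{dmn}(p(\alpha))\}$$
is dense. So, suppose that $(r, s) \in H$. If $\alpha \in \mathrm{dmn}(r)$, let $t = r$. Suppose that $\alpha \notin \mathrm{dmn}(r)$. Extend $r$ to $t$ by adding the ordered pair $(\alpha, \langle 0 : i < n_r\rangle)$. Clearly $(t, s) \in H$ and $(t, s) \le (r, s)$. If $m < n_t$, then $(t, s) \in D^\alpha_m$, as desired. Suppose that $n_t \le m$. We now define
$$p = \{(\beta, g) : \beta \in \mathrm{dmn}(t),\ g \in {}^{m+1}2,\ t(\beta) \subseteq g, \text{ and } g(i) = 0 \text{ for all } i \in [n_t, m]\}.$$
Clearly $(p, s) \in H$, in fact $(p, s) \in D^\alpha_m$, and $(p, s) \le (t, s) \le (r, s)$, as desired.
So $D^\alpha_m$ is dense, and hence each $x_\alpha$ is a function mapping $\omega$ into 2. We define $a_\alpha = \{m \in \omega : x_\alpha(m) = 1\}$. We claim that $\langle a_\alpha : \alpha < \omega_1\rangle$ is our desired MAD family.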
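Now we show that each $a_\alpha$ is infinite. For each $m \in \omega$ let
$$E^\alpha_m = \{(p, q) \in H : \alpha \in \mathrm{dmn}(p),\ m < n_p, \text{ and there is an } i \in [m, n_p) \text{ such that } (p(\alpha))(i) = 1\}.$$
Clearly in order to show that $a_\alpha$ is infinite it suffices to show that each set $E^\alpha_m$ is dense. So, suppose that $(r, s) \in H$. First choose $(t, u) \le (r, s)$ with $(t, u) \in D^\alpha_0$. This is done just to make sure that $\alpha$ is in the domain of $t$. Let $k$ be the maximum of $n_t + 1$ and $m + 1$. Define the function $p$ as follows. $\mathrm{dmn}(p) = \mathrm{dmn}(t)$. For any $\beta \in \mathrm{dmn}(t)$ and any $i < k$, let
$$(p(\beta))(i) = \begin{cases} (t(\beta))(i) & \text{if } i < n_t, \\ 0 & \text{if } n_t \le i \text{ and } \beta \ne \alpha, \\ 1 & \text{if } n_t \le i \text{ and } \beta = \alpha. \end{cases}$$
It is easy to check that $(p, u) \in H$, in fact $(p, u) \in E^\alpha_m$, and $(p, u) \le (r, s)$, as desired. So each $a_\alpha$ is infinite.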
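Next we show that distinct $a_\alpha, a_\beta$ are almost disjoint. Suppose that $\alpha, \beta < \omega_1$ with $\alpha \ne \beta$. Since $D^\alpha_0$ and $D^\beta_0$ are dense, there are $(p_1, q_1), (p_2, q_2) \in G$ with $\alpha \in \mathrm{dmn}(p_1)$ and $\beta \in \mathrm{dmn}(p_2)$. Choose $(p_3, q_3) \in G$ such that $(p_3, q_3) \le (p_1, q_1), (p_2, q_2)$. Thus $\alpha, \beta \in \mathrm{dmn}(p_3)$. Next we claim:
$$F \overset{\mathrm{def}}{=} \{(r, s) : \{\alpha, \beta\} \in \mathrm{dmn}(s)\}$$
is dense below $(p_3, q_3)$. In fact, suppose that $(t, u) \le (p_3, q_3)$. We may assume that $\{\alpha, \beta\} \notin \mathrm{dmn}(u)$. Let $\mathrm{dmn}(r) = \mathrm{dmn}(t)$, and for any $\gamma \in \mathrm{dmn}(r)$ let $r(\gamma)$ be the function with domain $n_t + 1$ such that $t(\gamma) \subseteq r(\gamma)$ and $(r(\gamma))(n_t) = 0$. Let $\mathrm{dmn}(s) = \mathrm{dmn}(u) \cup \{\{\alpha, \beta\}\}$, with $u \subseteq s$ and $s(\{\alpha, \beta\}) = n_t$. It is easily checked that $(r, s) \in H$, in fact $(r, s) \in F$, and $(r, s) \le (t, u)$. So, as claimed, $F$ is dense below $(p_3, q_3)$. Choose $(p_4, q_4) \in F \cap G$.
We claim that $a_\alpha \cap a_\beta \subseteq q_4(\{\alpha, \beta\})$. To prove this, assume that $m \in a_\alpha \cap a_\beta$, but suppose that $q_4(\{\alpha, \beta\}) \le m$. Thus $x_\alpha(m) = 1 = x_\beta(m)$, so there are $(e, b), (c, d) \in G$ such that $\alpha \in \mathrm{dmn}(e)$, $m \in \mathrm{dmn}(e(\alpha))$, $(e(\alpha))(m) = 1$, and $\beta \in \mathrm{dmn}(c)$, $m \in \mathrm{dmn}(c(\beta))$, and $(c(\beta))(m) = 1$. Choose $(p_5, q_5) \in G$ with $(p_5, q_5) \le (p_4, q_4), (e, b), (c, d)$. Then $\{\alpha, \beta\} \in \mathrm{dmn}(q_5)$, $q_5(\{\alpha, \beta\}) = q_4(\{\alpha, \beta\}) \le m < n_{p_5}$, $(p_5(\alpha))(m) = (e(\alpha))(m) = 1$, and $(p_5(\beta))(m) = (c(\beta))(m) = 1$, contradiction. So we have shown that $\langle a_\alpha : \alpha < \omega_1\rangle$ is an almost disjoint family.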
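To show that $\langle a_\alpha : \alpha < \omega_1\rangle$ is MAD, suppose to the contrary that $b$ is an infinite subset of $\omega$ such that $b \cap a_\alpha$ is finite for all $\alpha < \omega_1$. Let $\sigma$ be a name such that $\sigma_G = b$. For each $\alpha < \omega_1$ let
$$\tau_\alpha = \{(\check i, (p, q)) : i \in \omega,\ (p, q) \in H,\ \alpha \in \mathrm{dmn}(p),\ i \in \mathrm{dmn}(p(\alpha)), \text{ and } (p(\alpha))(i) = 1\}.$$
Clearly $(\tau_\alpha)_G = a_\alpha$. For each $n \in \omega$ let $A_n$ be maximal subject to the following conditions:
(1) $A_n$ is a collection of pairwise incompatible members of $H$.
(2) For each $(p, q) \in A_n$, $(p, q) \Vdash \check n \in \sigma$ or $(p, q) \Vdash \check n \notin \sigma$.
Then
(3) $A_n$ is maximal pairwise incompatible.
In fact, suppose that $(r, s) \perp (p, q)$ for all $(p, q) \in A_n$. Now $(r, s) \Vdash \check n \in \sigma \vee \check n \notin \sigma$, so there is a $(t, u) \le (r, s)$ such that $(t, u) \Vdash \check n \in \sigma$ or $(t, u) \Vdash \check n \notin \sigma$. Then $A_n \cup \{(t, u)\}$ still satisfies (1) and (2), and $(t, u) \notin A_n$, contradiction.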
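Now choose
$$\alpha \in \omega_1 \setminus \bigcup_{n\in\omega,\ (p,q)\in A_n} \mathrm{dmn}(p).$$
Let $m \in \omega$ be such that $b \cap a_\alpha \subseteq m$. Choose $(p_1, q_1) \in G$ such that $(p_1, q_1) \Vdash \sigma \cap \tau_\alpha \subseteq \check m$. Using $D^\alpha_0$, we may assume that $\alpha \in \mathrm{dmn}(p_1)$. Choose $n \in b$ with $n > m$ and $n \ge n_{p_1}$. Then take $(p_2, q_2) \in G \cap A_n$. Then $(p_2, q_2) \Vdash \check n \in \sigma$, since $n \in b = \sigma_G$. Choose $(p_3, q_3) \in G$ with $(p_3, q_3) \le (p_1, q_1), (p_2, q_2)$. Then by 22.29 we have $(p_3, q_1 \cup q_2) \in H$ and $(p_3, q_1 \cup q_2) \le (p_1, q_1), (p_2, q_2)$.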
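Now choose $k > \max(n_{p_3}, n)$, and define $p_4$ as follows. The domain of $p_4$ is $\mathrm{dmn}(p_3)$. For each $\beta \in \mathrm{dmn}(p_3)$ we define $p_4(\beta) : k \to 2$ by setting, for each $i < k$,
$$(p_4(\beta))(i) = \begin{cases} (p_3(\beta))(i) & \text{if } i < n_{p_3}, \\ 0 & \text{if } n_{p_3} \le i \text{ and } \beta \ne \alpha, \\ 0 & \text{if } n_{p_3} \le i,\ \beta = \alpha, \text{ and } i \ne n, \\ 1 & \text{if } \beta = \alpha \text{ and } i = n. \end{cases}$$
We check that $(p_4, q_1 \cup q_2) \in H$. Conditions (1) and (2) are clear. For (3), suppose that $\{\beta, \gamma\} \in \mathrm{dmn}(q_1 \cup q_2)$, and $(q_1 \cup q_2)(\{\beta, \gamma\}) \le i < n_{p_4}$. Remember that $n_{p_4}$ is $k$. If $i < n_{p_3}$, then the desired conclusion follows since $(p_3, q_1 \cup q_2) \in H$. If $n_{p_3} \le i$, then the desired conclusion follows since at least one of $\beta, \gamma$ is different from $\alpha$. Hence, indeed, $(p_4, q_1 \cup q_2) \in H$.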
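Clearly $(p_4, q_1 \cup q_2) \le (p_2, q_2)$, and so $(p_4, q_1 \cup q_2) \Vdash \check n \in \sigma$. It is also clear that $(p_4, q_1 \cup q_2) \le (p_1, q_1)$, so $(p_4, q_1 \cup q_2) \Vdash \sigma \cap \tau_\alpha \subseteq \check m$. Since $m < n$, it follows that $(p_4, q_1 \cup q_2) \Vdash \check n \notin \tau_\alpha$. But $(\check n, (p_4, q_1 \cup q_2))$ is clearly a member of $\tau_\alpha$, and hence $(p_4, q_1 \cup q_2) \Vdash \check n \in \tau_\alpha$, contradiction.
Miller, A. [03] A MAD Q-set. Fund. Math. 178 (2003), 271-281.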
Hechler real forcing
Let $(n, f) \in P$ iff $n \in \omega$ and $f \in {}^{\omega}\omega$, and order these pairs by $(n, f) \le (m, g)$ iff $n \ge m$, $f \restriction m = g \restriction m$, and $f(k) \ge g(k)$ for all $k$. If $G$ is $P$-generic over $M$, then $\bigcup_{(m,f)\in G} f \restriction m$ is a Hechler real.
Łabędzki, G.; Repický, M. Hechler reals. J. Symb. Logic 60 (1995), 444-458.
Prikry-Silver forcing
Let $P$ be the collection of all functions $f$ with domain a subset $D$ of $\omega$ such that $\omega \setminus D$ is infinite, and with range contained in 2, ordered by $\supseteq$. This is called Prikry-Silver forcing. If $G$ is $P$-generic over $M$, then $\bigcup_{p\in G} p$ is a function from $\omega$ to 2, called a Prikry-Silver real. The treatment of this forcing is similar to that of Sacks forcing.
Jech, T. Multiple forcing. Cambridge Univ. Press 1986, 136pp.

Mathias forcing
Let $P$ consist of all pairs $(s, A)$ such that $s$ is a finite subset of $\omega$, $A$ is an infinite subset of $\omega$, and each element of $s$ is less than $\min(A)$. Define $(s, A) \le (s', A')$ iff $s \supseteq s'$, $A \subseteq A'$, and $s \setminus s' \subseteq A'$. This is the Mathias forcing poset. If $G$ is generic, then $\bigcup_{(s,A)\in G} s$ is a Mathias real.
Bartoszyński, T.; Judah, H. Set theory. On the structure of the real line. A. K. Peters 1995. 546pp.
Laver forcing
A Laver tree is a subset $T$ of $\bigcup_{n\in\omega} {}^{n}\omega$ such that there is an $s \in T$ with the following properties:
(1) For all $t \in T$, if $n < \mathrm{dmn}(t)$ then $t \restriction n \in T$.
(2) If $t \in T$, then $s \subseteq t$ or $t \subseteq s$.
(3) For every $t \in T$ such that $s \subseteq t$, the set $\{n \in \omega : t^\frown\langle n\rangle \in T\}$ is infinite.
We call the element $s$ here, which is clearly uniquely determined by $T$, the stem of $T$. The Laver forcing poset consists of all Laver trees, partially ordered by inclusion. If $G$ is generic, then $\bigcup\{s : \text{there is a } T \in G \text{ such that } s \text{ is the stem of } T\}$ is a Laver real.
Pawlikowski, J. Laver's forcing and outer measure. Set theory. Contemporary Math. 192 (1996), Amer. Math. Soc., 71-76.
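Collapsing to $\omega_1$
Theorem 22.32. Let $M$ be a c.t.m. of ZFC, and in $M$ let $\kappa$ be an infinite cardinal, and set $\lambda = \kappa^+$ in $M$. Let $P$ be $\mathrm{fin}(\omega, \kappa)$, and let $G$ be $P$-generic over $M$. Then cardinals $\ge \lambda$ are preserved in going to $M[G]$, but $\omega_1^{M[G]} = \lambda$.
Thus we may say that all cardinals $\mu$ such that $\omega < \mu < \lambda$ become countable ordinals in $M[G]$.
Proof. Let $g = \bigcup G$. Clearly $g$ is a function with domain contained in $\omega$ and range contained in $\kappa$. We claim that actually its domain is $\omega$ and its range is $\kappa$. For, let $m \in \omega$ and $\alpha \in \kappa$. Let
$$D = \{f \in P : m \in \mathrm{dmn}(f) \text{ and } \alpha \in \mathrm{rng}(f)\}.$$
Clearly $D$ is dense. Hence $m \in \mathrm{dmn}(g)$ and $\alpha \in \mathrm{rng}(g)$, as desired. It follows that in $M[G]$, $|\kappa| = \omega$, and so the same is true for every ordinal $\alpha$ such that $\omega \le \alpha \le \kappa$.
Now we can finish the proof using 9.4 by showing in $M$ that $P$ has the $\lambda$-cc. Let $X \subseteq P$ with $|X| = \lambda$. Then $\langle \mathrm{dmn}(f) : f \in X\rangle$ is a system of $\lambda$ many finite sets, so by the $\Delta$-system lemma 10.1 with $\kappa, \lambda$ replaced by $\omega, \lambda$, there is an $N \in [X]^{\lambda}$ such that $\langle \mathrm{dmn}(f) : f \in N\rangle$ is a $\Delta$-system, say with root $r$. Since $|{}^{r}\kappa| < \lambda$, there are two members $f, g$ of $N$ such that $f \restriction r = g \restriction r$. So $f$ and $g$ are compatible, as desired.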
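The last step of this proof uses the fact that two finite partial functions are compatible as soon as they agree on the intersection of their domains, which in a $\Delta$-system is the root. A minimal sketch, with conditions represented as Python dicts and with helper names of our own choosing:

```python
def compatible(f, g):
    """Finite partial functions are compatible iff they agree on their common domain."""
    return all(g[k] == v for k, v in f.items() if k in g)


def common_extension(f, g):
    """Return the union of f and g as a condition, or None if they are incompatible."""
    if not compatible(f, g):
        return None
    return {**f, **g}


# Domains {0, 1} and {1, 2} have root {1}; f and g agree there, h does not.
f = {0: 5, 1: 7}
g = {1: 7, 2: 9}
h = {1: 8, 3: 0}
print(common_extension(f, g), common_extension(f, h))
```

Here the union of the two dicts plays the role of a common extension $f \cup g$ in the order by reverse inclusion.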

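We now want to do the same thing for regular limit cardinals $\kappa$. We introduce the Levy collapsing order:
$$\mathrm{Lv}_\kappa = \{p : p \text{ is a finite function},\ \mathrm{dmn}(p) \subseteq \kappa \times \omega, \text{ and for all } (\alpha, n) \in \mathrm{dmn}(p),\ p(\alpha, n) \in \alpha\}.$$
Again this set is ordered by $\supseteq$.
Lemma 22.33. For $\kappa$ regular uncountable, $\mathrm{Lv}_\kappa$ has the $\kappa$-cc.
Proof. Very similar to part of the proof of 22.32.
Theorem 22.34. Let $M$ be a c.t.m. of ZFC, and suppose that in $M$ $\kappa$ is regular and uncountable. Let $G$ be $\mathrm{Lv}_\kappa$-generic over $M$. Then cardinals $\ge \kappa$ are preserved in $M[G]$, and $\omega_1^{M[G]} = \kappa$.
Proof. Cardinals $\ge \kappa$ are preserved by 9.4 and 22.33. Suppose that $0 < \alpha < \kappa$; we will find a function mapping $\omega$ onto $\alpha$ in $M[G]$. Let $g = \bigcup G$. Clearly $g$ is a function. Now for each $\alpha < \kappa$ and $m \in \omega$ let
$$D_{\alpha m} = \{p \in \mathrm{Lv}_\kappa : (\alpha, m) \in \mathrm{dmn}(p)\}.$$
Clearly $D_{\alpha m}$ is dense, so $(\alpha, m) \in \mathrm{dmn}(g)$. Thus $\mathrm{dmn}(g) = \kappa \times \omega$. Now suppose that $\alpha < \kappa$ and $\xi < \alpha$. Let
$$E_{\alpha\xi} = \{p \in \mathrm{Lv}_\kappa : \text{there is an } m \in \omega \text{ such that } (\alpha, m) \in \mathrm{dmn}(p) \text{ and } p(\alpha, m) = \xi\}.$$
We claim that $E_{\alpha\xi}$ is dense. For, take any $q \in \mathrm{Lv}_\kappa$. Choose $m$ such that $(\alpha, m) \notin \mathrm{dmn}(q)$, and let $p = q \cup \{((\alpha, m), \xi)\}$. Clearly $p \in E_{\alpha\xi}$, as desired.
It follows that $\langle g(\alpha, m) : m \in \omega\rangle$ maps $\omega$ onto $\alpha$.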

23. Proper forcing


This section is concerned with the proper forcing axiom (PFA), a generalization of (part of) Martin's axiom. We give the basic definition of proper forcing, state the axiom, and give an application. The relative consistency of PFA requires large cardinals, but we at least state the relevant theorem.
Let $P$ be a quasi-order, $p \in P$, and $D \subseteq P$. We say that $D$ is predense below $p$ iff for every $q \le p$ there is an $r \in D$ such that $r$ and $q$ are compatible.
Our definition of proper forcing depends on a certain game of length $\omega$. Let $P$ be a quasi-order; we describe a game $\Gamma(P)$ played between players I and II. First I chooses $p_0 \in P$ and a maximal antichain $A_0$ of $P$. Then II chooses a countable subset $B^0_0$ of $A_0$. At the $n$-th pair of moves, I chooses a maximal antichain $A_n$ and then II chooses countable sets $B^n_i \subseteq A_i$ for each $i \le n$. Then we say that II wins iff there is a $q \le p_0$ such that for every $i \in \omega$, the set
$$\bigcup_{i \le n \in \omega} B^n_i$$
is predense below $q$. Finally, we say that $P$ is proper iff II has a winning strategy in this game.
We give a rigorous formulation of these ideas, not relying on informal notions of games. A play of the game $\Gamma(P)$ is an infinite sequence
$$\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_n, C_n, \ldots\rangle$$
satisfying the following conditions for each $n \in \omega$:
(1) $p_0 \in P$.
(2) $A_n$ is a maximal antichain of $P$.
(3) $C_n = \langle B^n_i : i \le n\rangle$, where each $B^n_i$ is a countable subset of $A_i$.
Given such a play, we say that II wins iff there is a $q \le p_0$ such that for every $i \in \omega$, the set
$$\bigcup_{i \le n \in \omega} B^n_i$$
is predense below $q$.
A partial play of length $m$ of $\Gamma(P)$ is a sequence
$$\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_{m-1}, C_{m-1}, A_m\rangle$$
satisfying the above conditions. Note that the partial play ends with one of the maximal antichains $A_m$. A strategy for II is a function $S$ whose domain is the set of all partial plays of $\Gamma(P)$, such that if $\mathcal{P}$ is a partial play as above, then $S(\mathcal{P})$ is a set $C_m$ satisfying the condition (3). A play is said to be according to $S$ iff for every $m$, $C_m = S(\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_{m-1}, C_{m-1}, A_m\rangle)$. The strategy $S$ is winning iff II wins every play which is played according to $S$.
This explains the notion of properness rigorously.

Proposition 23.1. Every ccc quasi-order is proper.
Proof. Let $P$ be a ccc quasi-order. The strategy of II is to always let $B^n_i = A_i$; since $A_i$ is countable by ccc, this strategy is legal. Given a play according to this strategy, with notation as above, let $q = p_0$, and take any $i \in \omega$. To show that $\bigcup_{i \le n \in \omega} B^n_i$ is predense below $p_0$, take any $r \le p_0$. Since $B^n_i = A_i$ is a maximal antichain, $r$ is compatible with some member of $B^n_i$.
Proposition 23.2. If $P$ is an $\omega_1$-closed quasi-order, then $P$ is proper.
Proof. Let $P$ be an $\omega_1$-closed quasi-order. We describe a pair $(S, f)$ by recursion; each of these is a function defined on all the partial plays in $\Gamma(P)$, and the induction is on the length of the partial plays. $S$ takes values as in the definition of strategy, while $f$ takes values in $P$. At the start of the recursion, given a partial play $\langle p_0, A_0\rangle$, we know that $p_0 \in P$ and $A_0$ is a maximal antichain of $P$. Let $q \in A_0$ be such that $p_0$ and $q$ are compatible, and choose $f(\langle p_0, A_0\rangle)$ to be a member of $P$ below both $p_0$ and $q$; and let $B^0_0 = \{q\}$, $S(\langle p_0, A_0\rangle) = \langle B^0_0\rangle$. Now suppose that a partial play
$$\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_{m-1}, C_{m-1}, A_m\rangle$$
is given, with $m > 0$. Let $q = f(\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_{m-1}\rangle)$. Now $A_m$ is a maximal antichain, so $q$ is compatible with some member $r$ of $A_m$; we choose
$$f(\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_{m-1}, C_{m-1}, A_m\rangle) \le q, r,$$
and set $B^m_m = \{r\}$ and $B^m_i = \emptyset$ for all $i < m$. This finishes the recursive definition.
Now let
$$\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_n, C_n, \ldots\rangle$$
be a play according to the strategy $S$, with associated function $f$. Then by construction we have
$$p_0 \ge f(\langle p_0, A_0\rangle) \ge f(\langle p_0, A_0, C_0, A_1\rangle) \ge \cdots,$$
so by $\omega_1$-closure there is a $q \in P$ such that $q \le f(\mathcal{P})$ for every partial play $\mathcal{P}$ of this play. In particular, $q \le p_0$. Now suppose that $i \in \omega$; we want to show that $D \overset{\mathrm{def}}{=} \bigcup_{i \le n \in \omega} B^n_i$ is predense below $q$. So take any $r \le q$. Let $B^i_i = \{s\}$. Then
$$r \le q \le f(\langle p_0, A_0, \ldots, A_i\rangle) \le s,$$
so of course $r$ and $s$ are compatible.
We give one more important kind of quasi-order which is proper. We say that a quasi-order $P$ satisfies Axiom A iff it is a partial order and there exist partial orders $\le_n$ on $P$ for each $n \in \omega$ such that the following conditions hold, for all $p, q \in P$:
(A1) $p \le_0 q$ iff $p \le q$.
(A2) If $p \le_{n+1} q$ then $p \le_n q$.
(A3) Suppose that $A$ is a maximal antichain in $P$ (with respect to $\le$), $p \in P$, and $n \in \omega$. Then there exist a $q \le_n p$ and a countable $A' \subseteq A$ such that $A'$ is predense below $q$ (with respect to $\le$).
(A4) If $p_{n+1} \le_n p_n$ for all $n \in \omega$, then there is a $q \in P$ such that $q \le p_n$ for all $n$.
Theorem 23.3. If $P$ satisfies Axiom A, then $P$ is proper.
Proof. Assume that $P$ satisfies Axiom A, with the above notation. Again we define two functions $S$ and $f$ on all partial plays, by recursion on the length of the partial play. For a smallest partial play $\langle p_0, A_0\rangle$, apply (A3) to $A_0$, $p_0$, and the integer 0 to obtain an element $f(\langle p_0, A_0\rangle) \le_0 p_0$ and a countable subset $B^0_0$ of $A_0$ such that $B^0_0$ is predense below $f(\langle p_0, A_0\rangle)$. Then let
$$S(\langle p_0, A_0\rangle) = \langle B^0_0\rangle.$$
Now suppose that
$$\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_{m-1}, C_{m-1}, A_m\rangle$$
is a partial play with $m \ge 1$. Apply (A3) to the maximal antichain $A_m$, the element $f(\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_{m-1}\rangle)$ of $P$, and the integer $m$, and obtain a countable $B^m_m \subseteq A_m$ and an element
$$q \overset{\mathrm{def}}{=} f(\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_{m-1}, C_{m-1}, A_m\rangle) \le_m f(\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_{m-1}\rangle)$$
such that $B^m_m$ is predense below $q$. Let $B^m_i = \emptyset$ for $i < m$, and let
$$S(\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_{m-1}, C_{m-1}, A_m\rangle) = \langle B^m_i : i \le m\rangle.$$
This finishes the definition of $S$ and $f$.
Now suppose that
$$\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_n, C_n, \ldots\rangle$$
is a play according to $S$, with associated function $f$. For each $m \in \omega$ let $p_{m+1} = f(\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_{m-1}, C_{m-1}, A_m\rangle)$. Then $p_{m+1} \le_m p_m$ for all $m \in \omega$, so by (A4) there is a $q \in P$ such that $q \le p_m$ for all $m$. Now take any $i \in \omega$. We claim that $B^i_i$ is predense below $q$, as desired. In fact, if $r \le q$, then since $B^i_i$ is predense below $p_{i+1}$ and $q \le p_{i+1}$, we get $s \in B^i_i$ such that $s$ and $r$ are compatible, as desired.
Now we give the proper forcing axiom:
(PFA) If $P$ is a proper poset and $\mathcal{D}$ is a collection of dense subsets of $P$ with $|\mathcal{D}| \le \omega_1$, then there is a filter $G$ on $P$ such that $G \cap D \ne \emptyset$ for all $D \in \mathcal{D}$.
Corollary 23.4. PFA implies MA($\omega_1$).
First of all, PFA is relatively consistent. This requires large cardinals, however. Rather high in the hierarchy of large cardinals are the supercompact cardinals. They are defined as follows. Let $\kappa$ be an uncountable cardinal, and let $A$ be a set such that $|A| \ge \kappa$. An ultrafilter $F$ on $[A]^{<\kappa}$ is normal iff the following conditions hold:
(1) For each $P \in [A]^{<\kappa}$, the set $\{Q \in [A]^{<\kappa} : P \subseteq Q\}$ is in $F$.
(2) $F$ is $\kappa$-complete, i.e., if $\langle X_i : i \in I\rangle$ is a system of members of $F$ and $|I| < \kappa$, then $\bigcap_{i\in I} X_i \in F$.
(3) $F$ is closed under diagonal intersections, i.e., if $\langle X_a : a \in A\rangle$ is a system of members of $F$, then the diagonal intersection
$$\triangle_{a\in A} X_a \overset{\mathrm{def}}{=} \{x \in [A]^{<\kappa} : x \in X_a \text{ for all } a \in x\}$$
is also in $F$.
Now we say that an uncountable cardinal $\kappa$ is supercompact iff for every $A$ such that $|A| \ge \kappa$ there is a normal measure on $[A]^{<\kappa}$.
As mentioned above, these are very large cardinals. The basic connection to PFA is the following theorem of Baumgartner (1979):
Theorem. If "ZFC + there is a supercompact cardinal" is consistent, then so is "ZFC + $2^{\omega} = \omega_2$ + PFA".
Another important fact about PFA is as follows:
PFA implies $\neg$(MA($\omega_2$)).
Now we mention various important applications of PFA, giving the basic definitions, but no background. After this, to conclude this section we give one application in detail.
An important algebra in set theory is $\mathcal{P}(\omega)/\mathrm{fin}$. Here $\mathcal{P}(\omega)$ is the collection of all subsets of $\omega$, and fin is the collection of finite subsets of $\omega$. Clearly fin is an ideal in the Boolean algebra $\mathcal{P}(\omega)$, so we are considering here the quotient algebra; see section 13. Let $A$ denote this quotient algebra.
Now let $\kappa$ and $\lambda$ be infinite cardinals. A $(\kappa, \lambda)$-gap is a pair $(f, g)$ such that the following conditions hold:
(1) $f \in {}^{\kappa}A$ and $g \in {}^{\lambda}A$.
(2) If $\alpha < \beta < \kappa$, then $f_\alpha < f_\beta$.
(3) If $\alpha < \beta < \lambda$, then $g_\alpha > g_\beta$.
(4) If $\alpha < \kappa$ and $\beta < \lambda$, then $f_\alpha < g_\beta$.
We say that the gap is unfilled iff there is no $a \in A$ such that $f_\alpha < a < g_\beta$ for all $\alpha < \kappa$ and $\beta < \lambda$.
Hausdorff proved (in ZFC) that there is an unfilled $(\omega_1, \omega_1)$-gap. Under PFA, every unfilled gap has one of the forms $(\omega_1, \omega_1)$, $(\kappa, \omega)$ with $\kappa \ge \omega_2$, or $(\omega, \kappa)$ with $\kappa \ge \omega_2$.
PFA implies that there is a linear ordering of size $2^{\omega}$ which cannot be embedded in $\mathcal{P}(\omega)/\mathrm{fin}$.
A set $A$ of real numbers is $\omega_1$-dense iff $A$ intersects every open interval in exactly $\omega_1$ points. Under PFA, any two $\omega_1$-dense sets of real numbers are order-isomorphic.
Under PFA, any uncountable Boolean algebra has an uncountable set of pairwise incomparable elements.
Under PFA, any uncountable family of subsets of $\omega$ contains an uncountable simply ordered subset or an uncountable family of pairwise incomparable elements.
PFA implies that every tree of height $\omega_2$ with all levels of size less than $\omega_2$ has a linearly ordered subset of size $\omega_2$.
Now we begin to develop a detailed application of PFA. The application concerns closed unbounded subsets of $\omega_1$, and the proof involves the notion of an indecomposable ordinal. Recall that an ordinal $\alpha$ is called indecomposable iff it is 0 or has the form $\omega^{\beta}$ for some $\beta$. First we state two elementary facts. A normal function on an infinite cardinal $\kappa$ is a strictly increasing continuous function from $\kappa$ into $\kappa$. Recall that a function $f : \kappa \to \kappa$ is continuous iff $f(\alpha) = \bigcup_{\beta<\alpha} f(\beta)$ for every limit ordinal $\alpha < \kappa$.
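Proposition 23.5. Let $\kappa$ be an uncountable regular cardinal, and $C \subseteq \kappa$. Then $C$ is
club in $\kappa$ iff there is a normal function $f$ on $\kappa$ such that $\mathrm{rng}(f) = C$.
Proof. $\Rightarrow$: Let $f$ be the strictly increasing enumeration of $C$. Thus $f : \kappa \to \kappa$ and $f$
is strictly increasing. If $\alpha$ is a limit ordinal less than $\kappa$, then $f(\beta) < f(\alpha)$ for every $\beta < \alpha$
since $f$ is strictly increasing. Hence $\bigcup_{\beta<\alpha} f(\beta) \le f(\alpha)$. But $\bigcup_{\beta<\alpha} f(\beta) \in C$ since $C$ is
closed, so there is some $\gamma < \kappa$ such that $f(\gamma) = \bigcup_{\beta<\alpha} f(\beta)$. Then $f(\beta) < f(\gamma)$ for all
$\beta < \alpha$. Hence $\alpha \le \gamma$. If $\alpha < \gamma$, then $f(\alpha) < f(\gamma)$, hence $f(\alpha) < f(\beta)$ for some $\beta < \alpha$,
contradiction. So $\gamma = \alpha$, as desired, showing that $f$ is continuous.
$\Leftarrow$: Suppose that $f$ is a normal function on $\kappa$ and $\mathrm{rng}(f) = C$. Then obviously $C$
is unbounded. To show that it is closed, suppose that $\alpha < \kappa$ is a limit ordinal and $C \cap \alpha$ is
unbounded in $\alpha$. Thus for every $\beta < \alpha$ there is a $\gamma_\beta < \kappa$ such that $\beta \le f(\gamma_\beta) < \alpha$.
Let $\delta = \bigcup_{\beta<\alpha} \gamma_\beta$. If $\beta < \alpha$, then $f(\gamma_\beta) \le f(\delta)$. Thus $\alpha \le f(\delta)$. Now $\delta$ is a limit
ordinal. For, if $\delta = \varepsilon + 1$, then $\varepsilon < \delta$, and hence there is a $\beta < \alpha$ such that $\varepsilon < \gamma_\beta$,
so $\delta \le \gamma_\beta$ and hence $\alpha \le f(\delta) \le f(\gamma_\beta) < \alpha$, contradiction. Now the continuity of $f$ gives
$f(\delta) = \bigcup_{\beta<\alpha} f(\gamma_\beta) \le \alpha$. Hence $f(\delta) = \alpha$, as desired.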
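As an illustration of the correspondence between clubs and normal functions (a sketch that is not part of the original text; the particular function $f$ is chosen here only as an example), the set of nonzero limit ordinals below $\omega_1$ is club in $\omega_1$, and its strictly increasing enumeration can be written with ordinal arithmetic:

```latex
% Illustrative example (our choice of f): kappa = omega_1 and
% C = the set of nonzero limit ordinals below omega_1.
\[
  f(\alpha) = \omega \cdot (1 + \alpha) \qquad (\alpha < \omega_1).
\]
% Strictly increasing: alpha < beta gives 1 + alpha < 1 + beta, hence
% omega (1 + alpha) < omega (1 + beta).
% Continuous: for a limit ordinal alpha, 1 + alpha is a limit ordinal, so
\[
  f(\alpha) = \omega \cdot (1 + \alpha)
            = \bigcup_{\beta < \alpha} \omega \cdot (1 + \beta)
            = \bigcup_{\beta < \alpha} f(\beta).
\]
% Range: { omega * gamma : 1 <= gamma < omega_1 }, i.e. exactly the
% nonzero limit ordinals below omega_1.  The first few values:
\[
  f(0) = \omega, \quad f(1) = \omega \cdot 2, \quad \dots, \quad
  f(\omega) = \omega \cdot (1 + \omega) = \omega \cdot \omega .
\]
```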
Proposition 23.6. If $\alpha < \omega_1$, then $\omega^{\alpha} < \omega_1$.
Proof. By induction on $\alpha$.
Corollary 23.7. The set of all indecomposable ordinals less than $\omega_1$ is club in $\omega_1$.
In particular, the union of a countable sequence of indecomposable ordinals is indecomposable.
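Now we define a partial order which we discuss for the rest of this section. Let $P_C$ be
the set of all $f \in \mathrm{fin}(\omega_1, \omega_1)$ such that $f \subseteq g$ for some normal function $g$ on $\omega_1$. The order
is $\supseteq$.
Lemma 23.8. If $p \in P_C$, $\beta < \gamma < \omega_1$ for all $\beta \in \mathrm{rng}(p)$, and $\gamma$ is indecomposable,
then $p \cup \{(\gamma, \gamma)\} \in P_C$.
Proof. We may assume that $p$ is nonempty. Let $f$ be a normal function on $\omega_1$ such
that $p \subseteq f$. Let $\beta$ be the largest member of $\mathrm{rng}(p)$. Then $(\beta \cap \mathrm{rng}(f)) \cup (\omega_1 \backslash \beta)$ is club
in $\omega_1$; let $g$ be a normal function with it as range. Choose $\alpha < \omega_1$ such that $f(\alpha) = \beta$.
Then $g(\alpha + \delta) = \beta + \delta$ for every ordinal $\delta < \omega_1$, and so $g(\gamma) = g(\alpha + \gamma) = \beta + \gamma = \gamma$. So
$p \cup \{(\gamma, \gamma)\} \subseteq g$, as desired.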
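The role of indecomposability in Lemma 23.8 can be seen on a small example. The following sketch is not from the original text; it reads the lemma as requiring every member of $\mathrm{rng}(p)$ to lie below the indecomposable ordinal $\gamma$, and the condition $p = \{(0, \omega)\}$ is a hypothetical choice made for illustration.

```latex
% Hypothetical example: p = {(0, omega)} and gamma = omega + 1, which is
% NOT indecomposable.  Any normal f with f(0) = omega satisfies
% f(n) >= omega + n for all n < omega, so
\[
  f(\omega) = \bigcup_{n < \omega} f(n) \ge \omega + \omega,
  \qquad
  f(\omega + 1) > f(\omega) \ge \omega \cdot 2 > \omega + 1 .
\]
% Hence p \cup {(omega+1, omega+1)} is not in P_C, although every member
% of rng(p) is below gamma.
%
% For the indecomposable gamma = omega^2 the computation in the proof
% goes through with alpha = 0 and beta = omega: here g(delta) = omega + delta, and
\[
  g(\omega^{2}) = g(0 + \omega^{2}) = \omega + \omega^{2} = \omega^{2} .
\]
```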
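Theorem 23.9. $P_C$ is proper.
Proof. For each $p \in P_C$, let $\alpha_p$ be the least ordinal such that $p \subseteq \alpha_p \times \alpha_p$.
We claim
(1) If $A$ is a maximal antichain in $P_C$ and $\alpha < \omega_1$, then there is an ordinal $\beta < \omega_1$ such
that for every $p \in P_C$ such that $p \subseteq \alpha \times \alpha$ there is a $q \in A$ such that $q$ and $p$ are compatible
and $q \subseteq \beta \times \beta$.
To prove this, let $A$ be a maximal antichain in $P_C$ and $\alpha < \omega_1$. For each $p \in P_C$ such that
$p \subseteq \alpha \times \alpha$ choose $q_p \in A$ such that $q_p$ and $p$ are compatible. Now $\{p \in P_C : p \subseteq \alpha \times \alpha\}$ is
countable, so we can choose $\beta < \omega_1$ such that $q_p \subseteq \beta \times \beta$ for all $p \in P_C$ such that $p \subseteq \alpha \times \alpha$.
Clearly $\beta$ is as desired in (1).
We let $g(A, \alpha)$ be the smallest $\beta$ satisfying (1). For each maximal antichain $A$, let
$C(A) = \{\alpha < \omega_1 : g(A, \beta) < \alpha$ for all $\beta < \alpha\}$. We claim that $C(A)$ is club in $\omega_1$. To show
that it is closed, suppose that $\alpha$ is a limit ordinal less than $\omega_1$ and $C(A) \cap \alpha$ is unbounded
in $\alpha$. Take any $\beta < \alpha$. Choose $\gamma \in C(A) \cap \alpha$ such that $\beta < \gamma$. Then $g(A, \beta) < \gamma < \alpha$. This
shows that $\alpha \in C(A)$, so $C(A)$ is closed. To show that it is unbounded, take any $\alpha < \omega_1$.
Let $\beta_0 = \alpha$, and if $\beta_n$ has been defined, let $\beta_{n+1}$ be an ordinal greater than $\beta_n$ such that
$g(A, \gamma) < \beta_{n+1}$ for all $\gamma < \beta_n$. Clearly $\bigcup_{n \in \omega} \beta_n \in C(A)$, as desired.
Let Ind be the set of all indecomposable ordinals. So $C(A) \cap \mathrm{Ind}$ is club for every
maximal antichain $A$.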
Now we define two functions S and f defined on all partial plays of the game (PC ).
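For a smallest play $\langle p_0, A_0 \rangle$, choose $f(\langle p_0, A_0 \rangle) \in \mathrm{Ind} \cap C(A_0)$ such that
$p_0 \subseteq f(\langle p_0, A_0 \rangle) \times f(\langle p_0, A_0 \rangle)$, and let
$$S(\langle p_0, A_0 \rangle) = \{p \in A_0 : p \subseteq f(\langle p_0, A_0 \rangle) \times f(\langle p_0, A_0 \rangle)\}.$$
Now suppose that a partial play
$$\mathcal{P} \stackrel{\mathrm{def}}{=} \langle p_0, A_0, C_0, A_1, C_1, \ldots, A_{m-1}, C_{m-1}, A_m \rangle$$
is given, with $m > 0$. Choose $f(\mathcal{P}) \in \mathrm{Ind} \cap \bigcap_{i \le m} C(A_i)$ with
$$f(\mathcal{P}) > f(\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_{m-1} \rangle).$$
Then define for each $i \le m$
$$B_i^m = \{p \in A_i : p \subseteq f(\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_m \rangle) \times f(\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_m \rangle)\}$$
and let $S(\mathcal{P}) = \langle B_i^m : i \le m \rangle$.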
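This finishes the construction of these functions. Thus $S$ is a strategy for II. We claim
that it is a winning strategy. To prove this, let
$$\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_n, C_n, \ldots \rangle$$
be a play according to $S$, with associated function $f$. For each $m \in \omega$ let
$\alpha_m = f(\langle p_0, A_0, C_0, A_1, C_1, \ldots, A_m \rangle)$. Let $\alpha = \bigcup_{m \in \omega} \alpha_m$. Thus $\alpha$ is an indecomposable
ordinal less than $\omega_1$. Since $p_0 \subseteq \alpha_0 \times \alpha_0$ and $\alpha_0 < \alpha$, it follows from 23.8 that
$q \stackrel{\mathrm{def}}{=} p_0 \cup \{(\alpha, \alpha)\} \in P_C$.
We claim that $q$ is what is needed to show that II has won. To prove this, suppose that
$i \in \omega$. To show that $\bigcup_{n \ge i} B_i^n$ is predense below $q$, take any $r \le q$. Let $r' = r \cap (\alpha \times \alpha)$.
Obviously $r' \in P_C$. Choose $n \in \omega$ with $n \ge i$ such that $r' \subseteq \alpha_n \times \alpha_n$. Since $\alpha_{n+1} \in C(A_i)$
and $\alpha_n < \alpha_{n+1}$, by definition of $C(A_i)$ we have $g(A_i, \alpha_n) < \alpha_{n+1}$, and it follows that there
is an $s \in A_i$ such that $s$ and $r'$ are compatible and $s \subseteq g(A_i, \alpha_n) \times g(A_i, \alpha_n)$. We claim
that $s$ and $r$ are compatible (as desired). For, choose normal functions $h$ and $k$ on $\omega_1$ such
that $s \cup r' \subseteq h$ and $r \subseteq k$. Now the set $(\mathrm{rng}(h) \cap \alpha_{n+1}) \cup (\alpha \backslash \alpha_{n+1}) \cup (\mathrm{rng}(k) \backslash \alpha)$ is club;
let $l$ be its strictly increasing enumeration. Choose $\beta < \omega_1$ such that $h[\beta] = \mathrm{rng}(h) \cap \alpha_{n+1}$.
Now since $r \le q$ we have $\alpha \in \mathrm{dmn}(r)$ and $r(\alpha) = \alpha$; hence $k(\alpha) = \alpha$. It follows that
$l(\xi) = h(\xi)$ for all $\xi < \beta$,
$l(\beta + \xi) = \alpha_{n+1} + \xi$ for all $\xi < \alpha$, and $l[[\beta, \alpha)] = \alpha \backslash \alpha_{n+1}$,
$l(\alpha + \xi) = k(\alpha + \xi)$ for all $\xi < \omega_1$.
Now suppose that $\xi \in \mathrm{dmn}(s)$. Then $(\xi, s(\xi)) \in g(A_i, \alpha_n) \times g(A_i, \alpha_n) \subseteq \alpha_{n+1} \times \alpha_{n+1}$, and
so $s(\xi) = h(\xi) = l(\xi)$. Next, suppose that $\xi \in \mathrm{dmn}(r)$ and $\xi < \alpha$. Since $r(\alpha) = \alpha$, it
follows that $r(\xi) < \alpha$. So $(\xi, r(\xi)) \in \alpha \times \alpha$ and hence $r(\xi) = r'(\xi)$ and $(\xi, r(\xi)) \in \alpha_n \times \alpha_n$.
So $r(\xi) = r'(\xi) = h(\xi) = l(\xi)$. Finally, suppose that $\xi \in \mathrm{dmn}(r)$ and $\alpha \le \xi$. Then
$r(\xi) = k(\xi) = l(\xi)$.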
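Proposition 23.10. $P_C$ does not satisfy Axiom A.
Proof. Suppose it does. We define three sequences $\langle A_n : n \in \omega \rangle$, $\langle p_n : n \in \omega \rangle$ and
$\langle C_n : n \in \omega \rangle$ by recursion. Let $p_0$ be any nonempty member of $P_C$, and let $A_0$ be a maximal
antichain such that each member $q$ of $A_0$ is nonempty with $\max(\mathrm{rng}(q)) > \max(\mathrm{rng}(p_0))$.
Having defined $A_n$, $p_n$, and all $C_m$ with $m < n$ we now define $A_{n+1}$, $p_{n+1}$ and $C_n$. By
condition (A3) in the definition of Axiom A, let $p_{n+1} \le_n p_n$ and $C_n \subseteq A_n$ be such that $C_n$
is a countable subset of $A_n$ and is predense below $p_{n+1}$. Let $A_{n+1}$ be a maximal antichain
such that each member $q$ of $A_{n+1}$ is nonempty with $\max(\mathrm{rng}(q)) > \max(\mathrm{rng}(p_{n+1}))$.
So, our sequences have been defined. By (A4), choose $q \in P_C$ such that $q \le_n p_n$ for
all $n \in \omega$. Since $q$ is finite, it follows that there is an $m \in \omega$ such that $q \supseteq p_m = p_n$ for
all $n \ge m$. Now let $\beta$ be the largest member of $\mathrm{rng}(p_m)$, say with $(\alpha, \beta) \in p_m$. Choose $\gamma$
greater than $\beta$ and each member of $\bigcup_{q \in C_{m+1}} \mathrm{rng}(q)$, and let $r = p_m \cup \{(\alpha + 1, \gamma)\}$. Clearly
$r \in P_C$. Now $r \le p_m = p_{m+1}$, so $r$ is compatible with some member $q$ of $C_{m+1} \subseteq A_{m+1}$.
Say $r \cup q \subseteq f$ with $f$ a normal function. Since $r(\alpha) = \beta$ and $r(\alpha + 1) = \gamma$, we have
$f(\alpha) = \beta$ and $f(\alpha + 1) = \gamma$. Let $\delta = \max(\mathrm{rng}(q))$; say $(\varepsilon, \delta) \in q$. Since $q \in A_{m+1}$, we
have $\delta > \max(\mathrm{rng}(p_{m+1})) = \max(\mathrm{rng}(p_m)) = \beta$. Thus $f(\varepsilon) = q(\varepsilon) = \delta > \beta = f(\alpha)$, so
$\alpha + 1 \le \varepsilon$. Hence $\gamma = f(\alpha + 1) \le f(\varepsilon) = \delta$, contradicting the choice of $\gamma$.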
Now we are ready for our one application of PFA.
Theorem 23.11. Assume PFA, and suppose that $\langle A_\alpha : \alpha < \omega_1 \rangle$ is a collection of
infinite subsets of $\omega_1$. Then there is a club $C$ in $\omega_1$ such that for all $\alpha < \omega_1$, $A_\alpha \not\subseteq C$.
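Proof. We are going to apply PFA to the partial order $P_C$ which we have been
discussing. By 23.9, $P_C$ is proper. We now define three families, each of size at most $\omega_1$,
of dense subsets of $P_C$. For each $\alpha < \omega_1$, let
$D_\alpha = \{p \in P_C$ : there is a $\beta \in A_\alpha$ such that one of the following holds:
(1) $0 \in \mathrm{dmn}(p)$ and $\beta < p(0)$;
(2) there is a $\gamma$ such that $\gamma, \gamma + 1 \in \mathrm{dmn}(p)$ and $p(\gamma) < \beta < p(\gamma + 1)\}$.
To see that $D_\alpha$ is dense in $P_C$, let $q \in P_C$. Say $q \subseteq f$ with $f$ a normal function on $\omega_1$. Let
$r = q \cup \{(0, f(0))\}$. Clearly $r \in P_C$ and $r \le q$. Let $\gamma$ be the largest member of $\mathrm{rng}(r)$, and
choose $\delta$ so that $r(\delta) = \gamma$.
If $\beta < f(0)$ for some $\beta \in A_\alpha$, then $r \in D_\alpha$, as desired. Suppose that $f(0) \le \beta$ for
every $\beta \in A_\alpha$.
Next, suppose that $A_\alpha \not\subseteq \mathrm{rng}(f)$. Choose $\beta \in A_\alpha \backslash \mathrm{rng}(f)$, and let $\varepsilon$ be minimum such
that $\beta < f(\varepsilon)$. By the continuity of $f$, $\varepsilon$ is a successor ordinal $\xi + 1$. Then $f(\xi) < \beta <
f(\xi + 1)$, and so $r \cup \{(\xi, f(\xi)), (\xi + 1, f(\xi + 1))\}$ is the desired member of $D_\alpha$ extending
$q$. So suppose that $A_\alpha \subseteq \mathrm{rng}(f)$.
Suppose that $\gamma < \beta$ for some $\beta \in A_\alpha$. Define $g : \omega_1 \to \omega_1$ by setting
$$g(\xi) = \begin{cases} f(\xi) & \text{if } \xi \le \delta, \\ \beta + \eta & \text{if } \xi = \delta + \eta \text{ with } \eta \ne 0. \end{cases}$$
Clearly $g$ is a normal function, and the function $s \stackrel{\mathrm{def}}{=} r \cup \{(\delta + 1, \beta + 1)\}$ is a subset of it.
We have $s(\delta) = r(\delta) = \gamma < \beta < \beta + 1 = s(\delta + 1)$, so $s \in D_\alpha$, as desired. Hence we may
assume that $A_\alpha \subseteq \gamma + 1$.
Let $\mathrm{dmn}(r) = \{\xi_i : i \le m\}$ with $0 = \xi_0 < \xi_1 < \cdots < \xi_m$ and $r(\xi_m) = \gamma$. The
infinite set $A_\alpha$ is contained in the finite union $\bigcup_{i<m} [r(\xi_i), r(\xi_{i+1})]$, so we can choose $i < m$
such that $A_\alpha \cap [r(\xi_i), r(\xi_{i+1})]$ is infinite. Let $\langle \eta_j : j < \omega \rangle$ be such that $\langle f(\eta_j) : j < \omega \rangle$
enumerates the first $\omega$ elements of $A_\alpha \cap (r(\xi_i), r(\xi_{i+1}))$. For each $j < \omega$ let $\theta_j$ be such that
$\eta_j + \theta_j = \xi_{i+1}$. Then for $j < k < \omega$ we clearly have $\theta_j \ge \theta_k$. Hence we can choose $k < \omega$
such that $\theta_k = \theta_j$ for all $j \ge k$. Now we define $g : \omega_1 \to \omega_1$ by:
$$g(\xi) = \begin{cases} f(\xi) & \text{if } \xi \le \eta_k, \\ f(\eta_{k+1} + \rho) & \text{if } \xi = \eta_k + \rho \text{ with } \rho \ne 0. \end{cases}$$
Clearly $g$ is a normal function on $\omega_1$. For any $j \le i$ we have $r(\xi_j) = f(\xi_j) = g(\xi_j)$. Now
suppose that $j \ge i + 1$, and write $\xi_j = \xi_{i+1} + \sigma$. Then
$$r(\xi_j) = f(\xi_j) = f(\xi_{i+1} + \sigma) = f(\eta_{k+1} + \theta_{k+1} + \sigma) = g(\eta_k + \theta_k + \sigma) = g(\xi_{i+1} + \sigma) = g(\xi_j).$$
Thus $s \stackrel{\mathrm{def}}{=} r \cup \{(\eta_k, g(\eta_k)), (\eta_k + 1, g(\eta_k + 1))\}$ is such that $s \subseteq g$, and
$$s(\eta_k) = g(\eta_k) = f(\eta_k) < f(\eta_{k+1}) < f(\eta_{k+1} + 1) = g(\eta_k + 1) = s(\eta_k + 1),$$
and $f(\eta_{k+1}) \in A_\alpha$, so $s \in D_\alpha$, as desired.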
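This finishes the proof that $D_\alpha$ is dense in $P_C$.
Next, for each ordinal $\alpha < \omega_1$ let
$$E_\alpha = \{p \in P_C : \alpha \in \mathrm{dmn}(p)\}.$$
Clearly each $E_\alpha$ is dense in $P_C$.
Finally, for each limit ordinal $\alpha < \omega_1$ let
$F_\alpha = \{p \in P_C$ : one of the following holds:
(1) $0 \in \mathrm{dmn}(p)$ and $\alpha < p(0)$,
(2) there is a $\beta \in \mathrm{dmn}(p)$ such that $p(\beta) = \alpha$,
(3) there is a $\beta < \omega_1$ such that $\beta, \beta + 1 \in \mathrm{dmn}(p)$
and $p(\beta) < \alpha < p(\beta + 1)\}$.
To show that $F_\alpha$ is dense in $P_C$, suppose that $q \in P_C$. Let $f$ be a normal function on $\omega_1$
such that $q \subseteq f$. Let $r = q \cup \{(0, f(0))\}$. So $r \in P_C$ and $r \le q$. If $\alpha < r(0)$, then $r \in F_\alpha$,
as desired. So, assume that $r(0) \le \alpha$. If $f(\beta) = \alpha$ for some $\beta$, then $r \cup \{(\beta, \alpha)\} \in F_\alpha$,
as desired. So assume that $\alpha \notin \mathrm{rng}(f)$. Let $\beta$ be smallest such that $\alpha < f(\beta)$. Then $\beta$
is a successor ordinal $\gamma + 1$ since $f$ is continuous, and $f(\gamma) < \alpha < f(\gamma + 1)$. Hence
$r \cup \{(\gamma, f(\gamma)), (\gamma + 1, f(\gamma + 1))\} \in F_\alpha$, as desired.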
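Now by PFA let $G$ be a filter on $P_C$ which intersects all of these dense sets. Let
$f = \bigcup G$. We claim that $\mathrm{rng}(f)$ is the desired club.
First, $f$ is a function. For, suppose that $(\alpha, \beta), (\alpha, \gamma) \in f$. Choose $p, q \in G$ such that
$(\alpha, \beta) \in p$ and $(\alpha, \gamma) \in q$. Since $G$ is a filter, $p \cup q \in P_C$, and so $\beta = \gamma$.
Since $G \cap E_\alpha \ne \emptyset$ for each $\alpha < \omega_1$, $f$ has domain $\omega_1$.
Next, $f$ is strictly increasing. For, if $\alpha < \beta$, we can easily find $p \in G$ such that
$\alpha, \beta \in \mathrm{dmn}(p)$, and hence $f(\alpha) = p(\alpha) < p(\beta) = f(\beta)$.
$f$ is continuous: suppose that $\alpha$ is a limit ordinal. Let $\beta = \bigcup_{\gamma<\alpha} f(\gamma)$. Since $f$ is
strictly increasing, $\beta$ is a limit ordinal too. Let $p \in G \cap F_\beta$. Then by the definition of $F_\beta$
there are three possibilities.
Case 1. $0 \in \mathrm{dmn}(p)$ and $\beta < p(0)$. But $p(0) = f(0) \le \beta$, contradiction.
Case 2. There is a $\gamma < \omega_1$ such that $\gamma, \gamma + 1 \in \mathrm{dmn}(p)$ and $p(\gamma) < \beta < p(\gamma + 1)$.
Choose $\delta < \alpha$ such that $f(\gamma) = p(\gamma) \le f(\delta)$. Then clearly $\gamma < \alpha$ so, since $\alpha$ is a limit
ordinal, also $\gamma + 1 < \alpha$, and hence $p(\gamma + 1) = f(\gamma + 1) \le \beta$, contradiction.
Case 3. There is a $\gamma \in \mathrm{dmn}(p)$ such that $p(\gamma) = \beta$. By Cases 1 and 2, this is the
only possibility left. We claim that $\gamma = \alpha$; this will prove continuity. In fact, if $\gamma < \alpha$,
then $\beta = p(\gamma) = f(\gamma) < f(\gamma + 1) \le \beta$, contradiction. Suppose that $\alpha < \gamma$. If $\delta < \alpha$, then
$f(\delta) < f(\alpha)$. So $\beta \le f(\alpha) < f(\gamma) = \beta$, contradiction. Thus $\gamma = \alpha$.
So now we know that $f$ is a normal function on $\omega_1$, and hence $\mathrm{rng}(f)$ is club in $\omega_1$.
Now suppose that $\alpha < \omega_1$; we want to show that $A_\alpha \not\subseteq \mathrm{rng}(f)$. Choose $p \in D_\alpha \cap G$.
Choose $\beta \in A_\alpha$ in accordance with the definition of $D_\alpha$. There are two possibilities. If
$0 \in \mathrm{dmn}(p)$ and $\beta < p(0)$, then $\beta < f(0)$, and so $\beta \notin \mathrm{rng}(f)$. If there is a $\gamma < \omega_1$ such that
$\gamma, \gamma + 1 \in \mathrm{dmn}(p)$ and $p(\gamma) < \beta < p(\gamma + 1)$, then $f(\gamma) < \beta < f(\gamma + 1)$ and again $\beta \notin \mathrm{rng}(f)$.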
24. More examples of iterated forcing


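We give some more examples of iterated forcing. These are concerned with a certain partial
order of functions. For any regular cardinal $\kappa$ we define
$$f <_\kappa g \quad \text{iff} \quad f, g \in {}^{\kappa}\kappa \text{ and there is an } \alpha < \kappa \text{ such that } f(\beta) < g(\beta) \text{ for all } \beta \in [\alpha, \kappa).$$
This is clearly a partial order on ${}^{\kappa}\kappa$. We say that $\mathscr{F} \subseteq {}^{\kappa}\kappa$ is almost unbounded iff there is
no $g \in {}^{\kappa}\kappa$ such that $f <_\kappa g$ for all $f \in \mathscr{F}$. Clearly ${}^{\kappa}\kappa$ itself is almost unbounded; it has
size $2^{\kappa}$.

Theorem 24.1. Let $\kappa$ be a regular cardinal. Then any almost unbounded subset of
${}^{\kappa}\kappa$ has size at least $\kappa^{+}$.

Proof. Let $\mathscr{F} \subseteq {}^{\kappa}\kappa$ have size $\le \kappa$; we want to find an almost bound for it. We may
assume that $\mathscr{F} \ne \emptyset$. Write $\mathscr{F} = \{f_\alpha : \alpha < \kappa\}$, possibly with repetitions. (Since maybe
$|\mathscr{F}| < \kappa$.) Define $g \in {}^{\kappa}\kappa$ by setting, for each $\alpha < \kappa$,
$$g(\alpha) = \Bigl(\sup_{\beta \le \alpha} f_\beta(\alpha)\Bigr) + 1.$$
If $\beta < \kappa$, then $\{\alpha < \kappa : g(\alpha) \le f_\beta(\alpha)\} \subseteq \beta$, and so $f_\beta <_\kappa g$.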
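The diagonal bound in this proof can be checked on a concrete countable family. The following is only a sketch: it takes $\kappa = \omega$, writes $<_\omega$ for the eventual-domination order (notation assumed here), and uses the hypothetical family $f_k(n) = k \cdot n$, which does not appear in the original text.

```latex
% Illustrative instance with kappa = omega and the (hypothetical) family
% f_k(n) = k * n for k < omega.
\[
  g(n) = \Bigl(\sup_{k \le n} f_k(n)\Bigr) + 1 = n \cdot n + 1 .
\]
% For a fixed k and every n >= k:
\[
  f_k(n) = k \cdot n \le n \cdot n < n \cdot n + 1 = g(n),
\]
% so { n : g(n) <= f_k(n) } is a subset of k = {0, ..., k-1},
% and therefore f_k <_omega g for every k.
```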


Thus under GCH the size of almost unbounded sets has been determined. We are interested
in what happens in the absence of GCH, more specifically, under $\neg$CH.
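Theorem 24.2. Suppose that $\kappa$ is an infinite cardinal and MA($\kappa$) holds. Suppose
that $\mathscr{F} \subseteq {}^{\omega}\omega$ and $|\mathscr{F}| = \kappa$. Then there is a $g \in {}^{\omega}\omega$ such that $f <_\omega g$ for all $f \in \mathscr{F}$.
Proof. Let $P = \{(p, F) : p \in \mathrm{fin}(\omega, \omega)$ and $F \in [\mathscr{F}]^{<\omega}\}$. We partially order $P$ by
setting $(p, F) \le (q, G)$ iff the following conditions hold:
(1) $p \supseteq q$.
(2) $F \supseteq G$.
(3) For all $f \in G$ and all $n \in \mathrm{dmn}(p) \backslash \mathrm{dmn}(q)$, $p(n) > f(n)$.
To check that this really is a partial order, suppose that $(p, F) \le (q, G) \le (h, H)$. Obviously
$p \supseteq h$ and $F \supseteq H$. Suppose that $f \in H$ and $n \in \mathrm{dmn}(p) \backslash \mathrm{dmn}(h)$. If $n \in \mathrm{dmn}(q)$,
then $p(n) = q(n) > f(n)$. If $n \notin \mathrm{dmn}(q)$, then $p(n) > f(n)$ since $f \in G$.
To show that $P$ has ccc, suppose that $X \subseteq P$ is uncountable. Since $\mathrm{fin}(\omega, \omega)$ is
countable, there are distinct $(p, F), (q, G) \in X$ with $p = q$. Then $(p, F \cup G) \in P$ and $(p, F \cup G) \le
(p, F), (p, G)$, as desired.
For each $h \in \mathscr{F}$ let $D_h = \{(p, F) \in P : h \in F\}$. Then $D_h$ is dense. In fact, let
$(q, G) \in P$ be given. Then $(q, G \cup \{h\}) \in P$ and $(q, G \cup \{h\}) \le (q, G)$, as desired.
For each $n \in \omega$ let $E_n = \{(p, F) \in P : n \in \mathrm{dmn}(p)\}$. Then $E_n$ is dense. In fact, let
$(q, G) \in P$ be given. We may assume that $n \notin \mathrm{dmn}(q)$. Choose $m > f(n)$ for each $f \in G$,
and let $p = q \cup \{(n, m)\}$. Clearly $(p, G) \in E_n$ and $(p, G) \le (q, G)$, as desired.
Now we apply MA($\kappa$) to get a filter $G$ on $P$ intersecting all of these dense sets. Since $G$
is a filter, the relation $g \stackrel{\mathrm{def}}{=} \bigcup_{(p,F) \in G} p$ is a function. Since $G \cap E_n \ne \emptyset$ for each $n \in \omega$, $g$ has
domain $\omega$. Let $f \in \mathscr{F}$. Choose $(p, F) \in G \cap D_f$. Let $m$ be greater than each member
of $\mathrm{dmn}(p)$. We claim that $f(n) < g(n)$ for all $n \ge m$. For, suppose that $n \ge m$. Choose
$(q, H) \in G$ such that $n \in \mathrm{dmn}(q)$, and choose $(r, K) \in G$ such that $(r, K) \le (p, F), (q, H)$.
Then $f \in K$ since $F \subseteq K$. Also, $n \in \mathrm{dmn}(r)$ since $q \subseteq r$. So $n \in \mathrm{dmn}(r) \backslash \mathrm{dmn}(p)$. Hence
from $(r, K) \le (p, F)$ and $f \in F$ we get $g(n) = r(n) > f(n)$.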
As another illustration of iterated forcing, we now show that it is relatively consistent that
every almost unbounded subset of ${}^{\omega}\omega$ has size $2^{\omega}$, while $\neg$MA holds. This follows from
the following theorem, using the fact, which we assume, that MA implies that $2^{\kappa} = 2^{\omega}$ for
every infinite cardinal $\kappa < 2^{\omega}$.
Theorem 24.3. There is a c.t.m. of ZFC with the following properties:
(i) $2^{\omega} = \omega_2$.
(ii) $2^{\omega_1} = \omega_3$.
(iii) Every almost unbounded set of functions from $\omega$ to $\omega$ has size $2^{\omega}$.
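Proof. Applying 11.9 to a model $N$ of GCH, with the two cardinal parameters there taken to be $\omega_1$ and $\omega_3$, we get a c.t.m.
$M$ of ZFC such that in $M$, $2^{\omega} = \omega_1$ and $2^{\omega_1} = \omega_3$. We are going to iterate within $M$, and
iterate $\omega_2$ times. At each successor step we will introduce a function almost greater than
each member of ${}^{\omega}\omega$ at that stage. In the end, any subset of ${}^{\omega}\omega$ of size less than $\omega_2$ appears
at an earlier stage, and is almost bounded.
(1) If $Q$ is a ccc quasi-order in $M$ of size at most $\omega_1$, then there are at most $\omega_1$ nice $Q$-names
for subsets of $(\omega \times \omega)^{\vee}$.
To prove (1), recall that a nice $Q$-name for a subset of $(\omega \times \omega)^{\vee}$ is a set of the form
$$\bigcup \{\{\check{a}\} \times A_a : a \in \omega \times \omega\}$$
where for each $a \in \omega \times \omega$, $A_a$ is an antichain in $Q$. Now by ccc the number of antichains in
$Q$ is at most $\sum_{\mu < \omega_1} |Q|^{\mu} \le \omega_1$ by CH in $M$. So the number of sets of the indicated form
is at most $\omega_1^{\omega} = \omega_1$. Hence (1) holds.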
Now we are going to define by recursion functions P, , and with domain 2 .
Let P0 be the trivial partial order ({0}, 0, 0).
Now suppose that P has been defined, so that it is a ccc quasi-order in M of size at
most 1 . We now define , , and P+1 . By (1), the set of all nice P -names for subsets
of ( ) has size at most 1 . We let { : < 1 } enumerate all of them.
(2) For every < 1 there is a P -name such that
1P P :

and [ :

implies that = ].
In fact, clearly
1P P W [W :

and [ :

implies that W = ]],


and so (2) follows from the maximal principle.


This defines .

Now for each H [1 ]< we define


H = {( , 1P ) : H}. So H is a P -name.
We now define
<
0 = {(op(
p,
}.
H ), 1) : p fin(, ) and H [1 ]

Let G be P -generic over M . Then


(3)

(0 )G = {(p, K) : p fin(, ) and K [ ]< }.

In fact, first suppose that x (0 )G . Then there exist p fin(, ) and H [1 ]< such

that x = (p, (
H )G ). Now (H )G = {( )G : H}, and ( )G for each by (2).
Thus x is in the right side of (3).
Second, suppose that p fin(, ) and K [ ]< . For each f K there is a

(f ) < 1 such that f = ((f


) )G . Let H = {(f ) : f K}. So H is a finite subset of 1 ,

and hence is in N . By (1) we have f = ((f


) )G for each f K. Now (H )G = K, and so
(p, K) (0 )G , as desired. So (3) holds.
Next, we define

1 = {(op(op(
p,
p ,
H ), op(
H )), q) : p, p fin(, ),
H, H [1 ]< , p p, H H, q P , and for all H

and all n dmn(p)\dmn(p ), q P (


n) < (p(n))}.
Again, suppose that G is P -generic over M . Then
(4)

(1 )G = {((p, K), (p, K )) : (p, K), (p , K ) (0 )G , p p, K K,


and for all f K and all n dmn(p)\dmn(p ), f (n) < p(n)}.

To prove this, first suppose that x (1 )G . Then there are q G, p, p fin(, ) and

H, H [1 ]< such that x = ((p, (


H )G ), (p , (H ))G ), p p, H H, and for all
H and all n dmn(p)\dmn(p ), q (
n) < (p(n)). Then with K = (
H )G and

K = (H ))G , the desired conditions clearly hold.


Second, suppose that p, p , K, K exist as on the right side of (4). Then by the def

inition of 0 , there are H, H [1 ]< such that K = (


H )G and K = (H )G . Then
K = {( )G : H }. Hence for every H and all n dmn(p)\dmn(p ) we have
( )G (n) < p(n). Since H and dmn(p)\dmn(p ) are finite, there is a q G such that for
every H and all n dmn(p)\dmn(p ) we have q P ( )(
n) < p(n). It follows now

1
that ((p, K), (p , K )) (p )G , as desired.
Next, we let 2 = {(op(0, 0), 1P )}. Then for any generic G, (2 )G = (0, 0). Finally,
let = op(op(0 , 1 ), 2 ). This finishes the definition of .
By the argument in the proof of 24.2 we have
(5) 1P P is
1 cc.
Now P+1 is determined by (I7) and (I8).


At limit stages we take direct limits. By 15.15, ccc is maintained. So the construction
is finished, and P is ccc.
Let G be P -generic over M .
(6) In N [G], if F and |F | < 2 , then there is a g such that f < g for all
f F.
For, let F = {f : < 1 }, possibly with repetitions. Let
F = {(, i, j) : < 1 , i, j , and f (i) = j}.
1
By 15.19 there is an < 2 such that F N [i1
2 [G]], and hence also F N [i2 [G]].
For brevity write G = i1
2 [G] for every < 2 . Let

H = {G : dmn(0 ) and p hi G+1 for some p}.


Let Q = ( )G . Thus by (3) and (4),
(7)
(8)

Q = {(p, K) : p fin(, ) and K [ ]< };


Q = {((p, K), (p, K )) : (p, K), (p , K ) (0 )G , p p, K K,
and for all f K and all n dmn(p)\dmn(p ), f (n) < p(n)}.

Then byS
15.18, G is P-generic over N , H N [G+1 ], and H is Q -generic over N [G ].
Let g = (p,F )H p. Clearly g is a function. For each m , let
Em = {(p, K) : Q : m dmn(p)}.
Then Em is dense. (See the proof of 24.2.) It follows that g .
def

Now take any f (in N [G ]). The set D = {(p, K) Q : f K} is dense, by


the proof of 24.2. Hence we can choose (p, K) D H . We claim that f (m) < g(m) for
all m such that m > n for each n dmn(p). For, suppose that such an m is given. Choose
(p , K ) Em H , and then choose (p , K ) H with (p , K ) (p, K), (p , K ). Now
m dmn(p ) dmn(p ), and f K. so from (p , K ) (p, K) and m
/ dmn(p) we get
f (m) < p (m) = g(m), as desired. This finishes the proof of (6).
By (6) we have 2 2 .
(8) |P | 1 for all < 2 .
We prove this by induction on . It is clear for = 0. Assume that |P | 1 . Clearly
|0 | = 1 , so by (I7), |P+1 | 1 . Suppose that is limit, and |P | 1 for all < .
Since P is the direct limit of previous P s, clearly |P | 1 , using 15.14.
(9) |P2 | 2 .
This is clear from (8), since P2 is the direct limit of earlier P s.
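Now by 9.6, with the three parameters there replaced by $\omega_2$, $\omega_1$, $\omega$, we get $2^{\omega} \le \omega_2$. So by the above,
$2^{\omega} = \omega_2$ in $N[G]$. By 9.6, with the three parameters there replaced by $\omega_2$, $\omega_1$, $\omega_1$, we get $2^{\omega_1} \le \omega_3$. Since
$2^{\omega_1} = \omega_3$ in $N$, it follows that $2^{\omega_1} = \omega_3$ in $N[G]$.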

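We want to generalize 24.3 to higher cardinals. This requires some preparation.

Lemma 24.4. Suppose that $M$ is a c.t.m. of ZFC, and in $M$ $\kappa$ is a regular cardinal,
$2^{<\kappa} = \kappa$, and $2^{\kappa} = \kappa^{+}$. We define a partial order $P$ in $M$ as follows:
$P = \{(p, F) : p \in \mathrm{Fn}(\kappa, \kappa, \kappa),\ F \in [{}^{\kappa}\kappa]^{<\kappa}\}$ (see before 11.5);
$(p, F) \le (q, G)$ iff $q \subseteq p$, $G \subseteq F$, and $\forall f \in G\, \forall \alpha \in \mathrm{dmn}(p) \backslash \mathrm{dmn}(q)\, (p(\alpha) > f(\alpha))$;
$1_P = (0, 0)$.
Then the following conditions hold.
(i) $|P| \le \kappa^{+}$.
(ii) $P$ is $\kappa$-closed.
(iii) $P$ has the $\kappa^{+}$-cc.
(iv) $P$ preserves cofinalities and cardinals.
(v) If $G$ is $P$-generic over $M$, then there is a function $g \in {}^{\kappa}\kappa$ in $M[G]$ such that $f <_\kappa g$
for all $f \in ({}^{\kappa}\kappa)^{M}$.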
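Proof. Clearly (i) holds.
$P$ satisfies the $\kappa^{+}$-c.c.: Suppose that $B \subseteq P$ with $|B| \ge \kappa^{+}$. Then since $|\mathrm{Fn}(\kappa, \kappa, \kappa)| =
\kappa$, wlog there is a $q$ such that $p = q$ for all $(p, F) \in B$, and so the $\kappa^{+}$-c.c. is clear.
$P$ is $\kappa$-closed: Suppose that $\langle (p_\alpha, F_\alpha) : \alpha < \gamma \rangle$ is decreasing, with $\gamma < \kappa$. Let $q = \bigcup_{\alpha<\gamma} p_\alpha$
and $G = \bigcup_{\alpha<\gamma} F_\alpha$. Suppose that $\alpha < \gamma$; we claim that $(q, G) \le (p_\alpha, F_\alpha)$. Suppose that
$f \in F_\alpha$ and $\beta \in \mathrm{dmn}(q) \backslash \mathrm{dmn}(p_\alpha)$. Then there is a $\delta < \gamma$ such that $\beta \in \mathrm{dmn}(p_\delta)$. We may
assume that $\alpha < \delta$. Hence $(p_\delta, F_\delta) \le (p_\alpha, F_\alpha)$, so $q(\beta) = p_\delta(\beta) > f(\beta)$, as desired.
Now it follows from sections 9, 11 that (iv) holds.
Now suppose that $G$ is $P$-generic over $M$. Define
$$g = \bigcup_{(p,F) \in G} p.$$
Clearly $g$ is a function with domain and range included in $\kappa$. To show that $g$ has domain
$\kappa$, take any $\alpha < \kappa$. Let $D = \{(p, F) \in P : \alpha \in \mathrm{dmn}(p)\}$. Then $D$ is dense. In fact, suppose
that $(q, H) \in P$. Wlog $\alpha \notin \mathrm{dmn}(q)$. Let $p$ be the extension of $q$ by adding $\alpha$ to its domain
and defining $p(\alpha)$ to be any ordinal less than $\kappa$ which is greater than each $f(\alpha)$ for $f \in H$.
Clearly $(p, H) \le (q, H)$ and $(p, H) \in D$. So $g$ has domain $\kappa$.
Finally, we claim that $f <_\kappa g$ for all $f \in {}^{\kappa}\kappa \cap M$. In fact, clearly $E \stackrel{\mathrm{def}}{=} \{(p, F) \in P : f \in
F\}$ is dense, and so we can choose $(p, F) \in E \cap G$. Take $\alpha < \kappa$ such that $\sup(\mathrm{dmn}(p)) < \alpha$.
Take any $\beta \in (\alpha, \kappa)$. Choose $(q, H) \in G$ such that $\beta \in \mathrm{dmn}(q)$. Then choose $(r, K) \in G$ such
that $(r, K) \le (p, F), (q, H)$. Then $\beta \in \mathrm{dmn}(r) \backslash \mathrm{dmn}(p)$, and $f \in F$, so $g(\beta) = r(\beta) > f(\beta)$.
This shows that $f <_\kappa g$.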
If is a P-name for a p.o., then we say that is full for -sequences iff the following
conditions (a)(d) imply condition (e):
(a) p P.
(b) < .
(c) dmn( 0 ) for each < .
(d) for all , < , if < , then p ( 0 ) ( 0 ) ( ).
(e) There is a dmn( 0 ) such that p and p for each < .
Lemma 24.5. Let M be a c.t.m. of ZFC, and an infinite cardinal in M . Let I be
the ideal in P() consisting of all sets of size less than . In M , let (P, ) be an -stage
iterated forcing construction with supports in I (Kunens sense). Suppose that for each
< , the P -name is full for -sequences. Then P is -closed.
Proof. Let hp : < i be a sequence of elements of P such that p p if
< < , and < . We will define p = hp : < i by recursion so that the following
condition holds:
For all < , p = hp : < i P and < (p p ) and
[
supp(p ) =
supp(p ).
<

The induction step to a limit ordinal is clear, as is the case = 0. Now we define p ,
given p . By fullness we get dmn( 0 ) such that
p and p for each < .
Clearly p is as desired.
Here is our generalization of 24.3.
Theorem 24.6. Let $M$ be a c.t.m. of GCH, and let $\kappa$ be an uncountable regular
cardinal in $M$. Then there is a generic extension $N$ of $M$ preserving cofinalities and
cardinals such that in $N$ the following hold:
(i) $2^{\kappa} = \kappa^{++}$.
(ii) $2^{(\kappa^{+})} = \kappa^{+++}$.
(iii) Every subset of ${}^{\kappa}\kappa$ of size less than $2^{\kappa}$ is almost bounded.
Proof. First we apply 11.9 with = + and = +++ to get a generic extension M
+
of M preserving cofinalities and cardinals in which 2< = , 2 = + , and 2 = +++ .
We are going to iterate within M , and iterate ++ times. At each successor step we
will introduce a function almost greater than each member of at that stage. In the end,
any subset of of size less than ++ appears at an earlier stage, and is almost bounded.
(1) If Q is a + -cc quasi-order in M of size less + , then there are at most + nice
Q-names for subsets of ( ).
To prove (1), recall that a nice Q-name for a subset of ( )is a set of the form
[
{{
a} A : a }
where for each aP , Aa is an antichain in Q. Now by + -cc, the number of antichains
in Q is at most <+ |Q| + by 2 = + . So the number of sets of the indicated form
is at most ( + ) = + . Hence (1) holds.
Now we are going to define by recursion functions P, , and with domain ++ .
Let P0 be the trivial partial order ({0}, 0, 0).
Now suppose that P has been defined, so that it is a + -cc quasi-order in M of size
at most + , it is -closed, and every element has support of size less than . Also we
assume that has been defined for every < so that is a P -name for a quasi-order,
and it is full for -sequences. We now define , , and P+1 . By (1), the set of all nice
P -names for subsets of ( ) has size at most + . We let { : < + } enumerate all
of them.
(2) For every < 1 there is a P -name such that
1P P : and [ : implies that = ].
In fact, clearly
1P P W [W : and [ : implies that W = ]],
and so (2) follows from the maximal principle.
This defines .

Now for each H [ + ]< we define


H = {( , 1P ) : H}. So H is a P -name.
We now define
+ <
0 = {(op(
p,
}.
H ), 1) : p Fn(, , ) and H [ ]

Let G be P -generic over M . Then


(3)

([ + ]< )M = ([ + ]< )M

[G]

In fact, is clear. Now suppose that L ([ + ]< )M [G] . Then there exist an ordinal <
and a bijection f from onto L. Since P is -closed, by 11.1 we have f M , and hence
L M , as desired in (3). Similarly,
(4)
(5)

(Fn(, , ))M = (Fn(, , ))M

[G]

(0 )G = {(p, K) : p Fn(, , ) and K [ ]< }.

In fact, first suppose that x (0 )G . Then there exist p Fn(, , ) and H [ + ]< such

that x = (p, (
H )G ). Now (H )G = {( )G : H}, and ( )G for each by (2).
Thus x is in the right side of (3).
Second, suppose that p Fn(, , ) and K [ ]< . For each f K there is a

+
(f ) < + such that f = ((f
of
) )G . Let H = {(f ) : f K}. So H is a subset of

size less than . By (2) we have f = ((f ) )G for each f K. Now (H )G = K, and so
(p, K) (0 )G , as desired. So (5) holds.
Next, we define

1 = {(op(op(
p,
p ,
H ), op(
H )), q) : p, p Fn(, , ),
H, H [ + ]< , p p, H H, q P , and for all H
< (p())}.
and all dmn(p)\dmn(p ), q P ()

Again, suppose that G is P -generic over M . Then


(6)

(1 )G = {((p, K), (p , K )) : (p, K), (p , K ) (0 )G , p p, K K,


and for all f K and all dmn(p)\dmn(p ), f () < p()}.

To prove this, first suppose that x (1 )G . Then there are q G, p, p Fn(, , )

and H, H [ + ]< such that x = ((p, (


H )G ), (p , (H ))G ), p p, H H, and for all


H and all n dmn(p)\dmn(p ), q () < (p()). Then with K = (
H )G and

K = (H ))G , the desired conditions clearly hold.


Second, suppose that p, p , K, K exist as on the right side of (4). Then by the

definition of 0 , there are H, H [ + ]< such that K = (


H )G and K = (H )G .
Then K = {( )G : H }. Hence for every H and all dmn(p)\dmn(p )
we have ( )G () < p(). Let h( , ) : < i enumerate all pairs (, ) such that
dmn(p)\dmn(p ) and H , with < , limit. Now we define a system hq : i
of members of P by recursion. Let q0 = 1. Suppose that q has been defined so that
q G. Now there is an r G such that r ( ) < (p( )). Let q+1 G be such
that q+1 r, q . At limit stages we use that -closed property of P to continue.
Clearly q is as desired, showing that ((p, K), (p, K )) (p1 )G .
Next, we let 2 = {(op(0, 0), 1P )}. Then for any generic G, (2 )G = (0, 0). Finally,
let = op(op(0 , 1 ), 2 ). This finishes the definition of .
Using (5) and (6) it is clear that is a P -name for a quasi-order. To verify that it
is full for -sequences, suppose that
(7)
(8)

p P , < , dmn(0 ) for each < ,


and if , < and < , then p P ( 0 ) ( 0 ) ( ).

We want to find dmn(0 ) such that


(9)

p 0 and p for each < .

Since dmn(0 ), there exist a q Fn(, , ) and an H [ + ]< such that


S =

op(
q , H ). Now if < < , then p ; hence q q . Let r = <
S
and K = < H . Thus r Fn(, , ) and K [ + ]< . Let = op(
r ,
K ). Clearly
0
dmn( ). Suppose that < . To show that p q , suppose that p G with

G P -generic over M . Then G = (r, (
K )G ), and clearly (K )G ) = K. Suppose that
H and dmn(r)\dmn(q ). Say dmn(q ) with < . Clearly < . Since
( )G ( )G by (8), we have r() = q () > ( )G (). This proves that G ( )G ,
and so (9) holds.
Now P+1 is defined by (I7) and (I8) in the definition of iteration. We now want to
show that P+1 is + -cc, and for this we will apply 15.10. we are assuming that P is
+ -cc, so it suffices to prove that 1 P cc. So, let G be P -generic over M . As
above, 2< = in M [G]. Now |P | + by assumption. Hence 2 = + in M [G] by 9.6
(with , , replaced by + , + , respectively). Hence G is + -cc by (5) and (6). So
P+1 is + -cc by 15.10.
P+1 is -closed by 24.5, since we have proved that is full for -sequences. This
finishes the recursion step from to + 1.
Now suppose that is a limit ordinal ++ . We let

P = {p :p is a function with domain , p P for all <


and |{ < : p 6= 1}| < }.
and for p, q P , p q iff p q for all < .
Now we show that P has the + -cc. Suppose that hp : < + i is a system of
members of P . Then we can apply the -system theorem 10.1 to the system hsupp(p ) :
+
< + i, with , replaced by , + respectively. This gives us a set L [ + ] and a
set K such that for all distinct , L, supp(p ) supp(p ) = K. For L and K
we have p () 6= 1, so we can write p () = op(
q , ) with q Fn(, , ). Now for any
Q
L, the function ha : Ki is a member of K Fn(, , ), which has size at most
. So there exist L [L]+ and r such that hq : Ki = hr : Ki for all L .
Now it is clear that p and p are compatible for all , L , as desired.
By 24.5, P is -closed. Clearly, for < ++ P has size at most + .
This finishes the construction. For brevity let R = P++ .
Let G be R-generic over M .
(10) In M [G], if F and |F | < ++ , then there is a g such that f < g for all
f F.
For, let F = {f : < + }, possibly with repetitions. Let
F = {(, i, j) : < 1 , i, j , and f (i) = j}.
By 15.19 there is an < ++ such that F M [i1
[G]], and hence also F
++
1
1
M [i++ [G]]. For brevity write G = i++ [G] for every < ++ . Let
H = {G : dmn(0 ) and p hi G+1 for some p}.
Let Q = ( )G . Thus by (5) and (6),
(11)
(12)

Q = {(p, K) : p fin(, ) and K [ ]< };


Q = {((p, K), (p , K )) : (p, K), (p , K ) (0 )G , p p, K K,
and for all f K and all dmn(p)\dmn(p ), f () < p()}.
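Now (10) follows from 24.4.
With the three parameters in 9.6 replaced by $\kappa^{++}$, $\kappa^{+}$, $\kappa$ respectively, we get $2^{\kappa} \le \kappa^{++}$ in $M[G]$. Hence
by (10), $2^{\kappa} = \kappa^{++}$ in $M[G]$.
With the three parameters in 9.6 replaced by $\kappa^{++}$, $\kappa^{+}$, $\kappa^{+}$ respectively, we get $2^{\kappa^{+}} \le \kappa^{+++}$ in $M[G]$.
Since $2^{\kappa^{+}} = \kappa^{+++}$ in $M$, it follows that $2^{\kappa^{+}} = \kappa^{+++}$ in $M[G]$.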

25. Cofinality of posets


We begin the study of possible cofinalities of partially ordered sets: the PCF theory. In
this chapter we develop some combinatorial principles needed for the main results.
Ordinal-valued functions and their orderings
A filter on a set $A$ is a collection $F$ of subsets of $A$ with the following properties:
(1) $A \in F$.
(2) If $X \in F$ and $X \subseteq Y \subseteq A$, then $Y \in F$.
(3) If $X, Y \in F$ then $X \cap Y \in F$.
A filter $F$ is proper iff $F \ne \mathscr{P}(A)$.
Suppose that $F$ is a filter on a set $A$ and $R \subseteq \mathrm{Ord} \times \mathrm{Ord}$. Then for functions
$f, g \in {}^{A}\mathrm{Ord}$ we define
$$f \mathrel{R_F} g \quad \text{iff} \quad \{i \in A : f(i) \mathrel{R} g(i)\} \in F.$$
The most important cases of this notion that we will deal with are $f <_F g$, $f \le_F g$, and
$f =_F g$. Thus
$f <_F g$ iff $\{i \in A : f(i) < g(i)\} \in F$;
$f \le_F g$ iff $\{i \in A : f(i) \le g(i)\} \in F$;
$f =_F g$ iff $\{i \in A : f(i) = g(i)\} \in F$.
Sometimes we use this notation for ideals rather than filters, using the duality between
ideals and filters, which we now describe. An ideal on a set $A$ is a collection $I$ of subsets
of $A$ such that the following conditions hold:
(4) $\emptyset \in I$.
(5) If $X \subseteq Y \in I$ then $X \in I$.
(6) If $X, Y \in I$ then $X \cup Y \in I$.
An ideal $I$ is proper iff $I \ne \mathscr{P}(A)$.
If $F$ is a filter on $A$, let $F^{*} = \{X \subseteq A : A \backslash X \in F\}$. Then $F^{*}$ is an ideal on $A$. If $I$ is
an ideal on $A$, let $I^{*} = \{X \subseteq A : A \backslash X \in I\}$. Then $I^{*}$ is a filter on $A$. If $F$ is a filter on
$A$, then $F^{**} = F$. If $I$ is an ideal on $A$, then $I^{**} = I$.
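The duality statements can be checked by unwinding the definitions. In the sketch below, $F^{*}$ denotes the set of complements (relative to $A$) of members of $F$; the star notation is an assumption made for this illustration, since the symbol used in the source is not recoverable.

```latex
% Double dual of a filter F on A (notation F^* assumed, see lead-in):
\[
  X \in (F^{*})^{*}
  \iff A \setminus X \in F^{*}
  \iff A \setminus (A \setminus X) \in F
  \iff X \in F
  \qquad (X \subseteq A).
\]
% F^* is closed under finite unions, by De Morgan and filter property (3):
\[
  X, Y \in F^{*}
  \implies A \setminus (X \cup Y) = (A \setminus X) \cap (A \setminus Y) \in F
  \implies X \cup Y \in F^{*} .
\]
% F^* is closed under subsets, by filter property (2):
\[
  X \subseteq Y \in F^{*}
  \implies A \setminus Y \subseteq A \setminus X \subseteq A,\ A \setminus Y \in F
  \implies A \setminus X \in F
  \implies X \in F^{*} .
\]
```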
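Now if $I$ is an ideal on $A$, then
$$f \mathrel{R_I} g \quad \text{iff} \quad \{i \in A : \neg(f(i) \mathrel{R} g(i))\} \in I;$$
$f <_I g$ iff $\{i \in A : f(i) \ge g(i)\} \in I$;
$f \le_I g$ iff $\{i \in A : f(i) > g(i)\} \in I$;
$f =_I g$ iff $\{i \in A : f(i) \ne g(i)\} \in I$.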

Some more notation: RI (f, g) = {i I : f (i) R g(i)}. In particular, <I (f, g) = {i I :


f (i) < g(i)} and I (f, g) = {i I : f (i) g(i)}.
The following trivial proposition is nevertheless important in what follows.
Proposition 25.2. Let $F$ be a proper filter on $A$. Then
(i) $<_F$ is irreflexive and transitive.
(ii) $\le_F$ is reflexive on ${}^{A}\mathrm{Ord}$, and it is transitive.
(iii) $f \le_F g <_F h$ implies that $f <_F h$.
(iv) $f <_F g \le_F h$ implies that $f <_F h$.
(v) $f <_F g$ or $f =_F g$ implies $f \le_F g$.
(vi) If $f =_F g$, then $f \le_F g$.
(vii) If $f \le_F g \le_F f$, then $f =_F g$.
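Although the proposition is routine, the pattern of its proof is used repeatedly, so a sketch of two representative clauses may help. The derivation below is an illustration added here, using only the filter properties (2) and (3) and properness.

```latex
% Clause (iii): assume f <=_F g and g <_F h.  Pointwise,
% f(i) <= g(i) and g(i) < h(i) imply f(i) < h(i), so
\[
  \{i \in A : f(i) \le g(i)\} \cap \{i \in A : g(i) < h(i)\}
  \subseteq \{i \in A : f(i) < h(i)\} .
\]
% The left side is in F by (3), hence the right side is in F by (2),
% i.e. f <_F h.  Transitivity in (i) and (ii), and clause (iv), follow
% the same way.
%
% Irreflexivity in (i) is where properness is needed:
\[
  \{i \in A : f(i) < f(i)\} = \emptyset \notin F ,
\]
% since a filter containing the empty set equals P(A) by (2).
```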
Some care must be taken in working with these notions. The following examples illustrate
this.
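(1) An example with $f \le_F g$ but neither $f <_F g$ nor $f =_F g$ nor $f = g$: Let $A = \omega$, $F = \{A\}$, and define $f, g$ by setting $f(n) = n$ for all $n$ and
$$g(n) = \begin{cases} n & \text{if } n \text{ is even,}\\ n + 1 & \text{if } n \text{ is odd.}\end{cases}$$
(2) An example where $f =_F g$ but neither $f <_F g$ nor $f = g$: Let $A = \omega$ and let $F$ consist of all subsets of $\omega$ that contain all even natural numbers. Define $f$ and $g$ by
$$f(n) = \begin{cases} n & \text{if } n \text{ is even,}\\ 1 & \text{if } n \text{ is odd;}\end{cases} \qquad g(n) = \begin{cases} n & \text{if } n \text{ is even,}\\ 0 & \text{if } n \text{ is odd.}\end{cases}$$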
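The two examples can be sanity-checked on a finite initial segment. The sketch below is only an illustration under stated assumptions: $\omega$ is replaced by the finite set $\{0, \dots, 7\}$, a filter is represented by a membership test rather than as a set of sets, and the names `rel_mod_filter`, `in_F1`, `in_F2` are hypothetical.

```python
def rel_mod_filter(rel, f, g, A, in_filter):
    """f R_F g iff {i in A : f(i) R g(i)} belongs to F (F given as a membership test)."""
    return in_filter(frozenset(i for i in A if rel(f(i), g(i))))

A = frozenset(range(8))  # finite stand-in for omega (assumption)
lt = lambda x, y: x < y
le = lambda x, y: x <= y
eq = lambda x, y: x == y

# Example (1): F = {A}.
in_F1 = lambda X: X == A
f1 = lambda n: n
g1 = lambda n: n if n % 2 == 0 else n + 1

# Example (2): F = all subsets of A containing every even number.
evens = frozenset(n for n in A if n % 2 == 0)
in_F2 = lambda X: evens <= X
f2 = lambda n: n if n % 2 == 0 else 1
g2 = lambda n: n if n % 2 == 0 else 0

ex1 = (rel_mod_filter(le, f1, g1, A, in_F1),
       rel_mod_filter(lt, f1, g1, A, in_F1),
       rel_mod_filter(eq, f1, g1, A, in_F1))
ex2 = (rel_mod_filter(eq, f2, g2, A, in_F2),
       rel_mod_filter(lt, f2, g2, A, in_F2),
       any(f2(n) != g2(n) for n in A))
```

In `ex1` only the first component holds, matching example (1); in `ex2` the functions are equal modulo the filter, not strictly ordered, and differ at some point, matching example (2).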
Products and reduced products
In the preceding section we were considering ordering-type relations on the proper classes ${}^A\mathrm{Ord}$. Now we restrict ourselves to sets. Suppose that $h \in {}^A\mathrm{Ord}$. We specialize the general notion by considering $\prod_{a \in A} h(a) \subseteq {}^A\mathrm{Ord}$. To eliminate trivialities, we usually assume that $h(a)$ is a limit ordinal for every $a \in A$; then we call $h$ non-trivial.
Proposition 25.3. If $F$ is a proper filter on $A$, $g, h \in {}^A\mathrm{Ord}$, $h$ is non-trivial, and $g <_F h$, then there is a $k \in \prod_{a \in A} h(a)$ such that $g =_F k$.
Proof. For any $a \in A$ let
$$k(a) = \begin{cases} g(a) & \text{if } g(a) < h(a),\\ 0 & \text{otherwise.}\end{cases}$$
Thus $k \in \prod_{a \in A} h(a)$. Moreover,
$$\{a \in A : g(a) = k(a)\} \supseteq \{a \in A : g(a) < h(a)\} \in F,$$
so $g =_F k$.

We will frequently consider the structure $(\prod_{a \in A} h(a), <_F, \le_F)$ in what follows. For most considerations it is equivalent to consider the associated reduced product, which we define as follows. Note that $=_F$ is an equivalence relation on the set $\prod_{a \in A} h(a)$. We define the underlying set of the reduced product to be the collection of all equivalence classes under $=_F$; it is denoted by $\prod_{a \in A} h(a)/F$. Further, we define, for $x, y \in \prod_{a \in A} h(a)/F$,
$$\begin{aligned}
x <_F y &\quad\text{iff}\quad \exists f, g \in \prod_{a \in A} h(a)\,[x = [f],\ y = [g],\ \text{and } f <_F g];\\
x \le_F y &\quad\text{iff}\quad \exists f, g \in \prod_{a \in A} h(a)\,[x = [f],\ y = [g],\ \text{and } f \le_F g].
\end{aligned}$$
Here $[k]$ denotes the equivalence class of $k \in \prod_{a \in A} h(a)$ under $=_F$.

Proposition 25.4. Suppose that $h \in {}^A\mathrm{Ord}$ is nontrivial, and $f, g \in \prod_{a \in A} h(a)$. Then
(i) $[f] <_F [g]$ iff $f <_F g$.
(ii) $[f] \le_F [g]$ iff $f \le_F g$.
Proof. (i): The direction $\Leftarrow$ is obvious. Now suppose that $[f] <_F [g]$. Then there are $f', g' \in \prod_{a \in A} h(a)$ such that $[f] = [f']$, $[g] = [g']$, and $f' <_F g'$. Hence
$$\{a \in A : f(a) = f'(a)\} \cap \{a \in A : g(a) = g'(a)\} \cap \{a \in A : f'(a) < g'(a)\} \subseteq \{a \in A : f(a) < g(a)\},$$
and it follows that $\{a \in A : f(a) < g(a)\} \in F$, and so $f <_F g$.
(ii): similarly.
A filter $F$ on $A$ is an ultrafilter iff $F$ is proper and is maximal among the proper filters on $A$. Equivalently, $F$ is proper, and for any $X \subseteq A$, either $X \in F$ or $A \setminus X \in F$. The dual notion to an ultrafilter is a maximal ideal.
If $F$ is an ultrafilter on $A$, then $\prod_{a \in A} h(a)/F$ is an ultraproduct of $h$.
Proposition 25.5. If $h \in {}^A\mathrm{Ord}$ is nontrivial and $F$ is an ultrafilter on $A$, then $<_F$ is a linear order on $\prod_{a \in A} h(a)/F$, and $[f] \le_F [g]$ iff $[f] <_F [g]$ or $[f] = [g]$.
Proof. By Proposition 25.2(iii) and Proposition 25.4, $<_F$ is transitive. Also, from Proposition 25.4 it is clear that $<_F$ is irreflexive. Now suppose that $f, g \in \prod_{a \in A} h(a)$; we want to show that $[f]$ and $[g]$ are comparable. Assume that $[f] \ne [g]$. Thus $\{a \in A : f(a) = g(a)\} \notin F$, so $\{a \in A : f(a) \ne g(a)\} \in F$. Since
$$\{a \in A : f(a) \ne g(a)\} = \{a \in A : f(a) < g(a)\} \cup \{a \in A : g(a) < f(a)\}$$
and $F$ is an ultrafilter, one of the two sets on the right is in $F$, so $[f] <_F [g]$ or $[g] <_F [f]$. Thus $<_F$ is a linear order on $\prod_{a \in A} h(a)/F$.
Next,
$$\{a \in A : f(a) \le g(a)\} = \{a \in A : f(a) = g(a)\} \cup \{a \in A : f(a) < g(a)\},$$
so it follows by Proposition 25.4 that $[f] \le_F [g]$ iff $[f] = [g]$ or $[f] <_F [g]$.
Basic cofinality notions
In this section we allow our quasi-orders P to be proper classes. From now on we
remove our standing restriction that quasi-orders have a largest element. It is also convenient to change the notion of partial order slightly. A (strict) partial ordering is a pair $(P, <_P)$ consisting of a nonempty class $P$ and an irreflexive transitive relation $<_P$ which is a subset of $P \times P$. Many times we omit the subscript $P$ if it is clear from the context which class is meant. And we may speak of a partial ordering $P$ if the relation $<_P$ is clear from the context. The essential equivalence of this notion of a partial ordering with the $\le$ version is expressed in the easy exercise 2.15.
A double ordering is a system $(P, \le_P, <_P)$ such that $(P, \le_P)$ is a quasi-ordering, $(P, <_P)$ is a partial ordering, and the following connections between the two hold:
(1) If $p <_P q$ or $p = q$, then $p \le_P q$.
(2) If $p <_P q \le_P r$ or $p \le_P q <_P r$, then $p <_P r$.
(3) For every $p \in P$ there is a $q \in P$ such that $p <_P q$.
Proposition 25.6. For any set $A$ and proper filter $F$ on $A$, the system $({}^A\mathrm{Ord}, \le_F, <_F)$ is a double ordering.
Proposition 25.7. Let $h \in {}^A\mathrm{Ord}$, with $h$ taking only limit ordinal values, and let $F$ be a proper filter on $A$. Then
(i) $(\prod_{a \in A} h(a), \le_F, <_F)$ is a double ordering.
(ii) $(\prod_{a \in A} h(a)/F, \le_F, <_F)$ is a double ordering.
We now give some general definitions, applying to any double ordering $(P, \le_P, <_P)$ unless otherwise indicated.
A subclass $X \subseteq P$ is cofinal in $P$ iff $\forall p \in P\, \exists q \in X\,(p \le_P q)$. By condition (3) above, this is equivalent to saying that $X$ is cofinal in $P$ iff $\forall p \in P\, \exists q \in X\,(p <_P q)$.
Since clearly $P$ itself is cofinal in $P$, we can make the basic definition of the cofinality $\mathrm{cf}(P)$ of $P$, for a set $P$:
$$\mathrm{cf}(P) = \min\{|X| : X \text{ is cofinal in } P\}.$$
We note that $\mathrm{cf}(P)$ can be singular. For example, if we take $P$ to be the disjoint union of copies of $\omega_n$ for $n \in \omega$, we get cofinality $\aleph_\omega$.
A sequence $\langle p_\xi : \xi < \lambda\rangle$ of elements of $P$ is $<_P$-increasing iff $\forall \xi, \eta < \lambda\,(\xi < \eta \rightarrow p_\xi <_P p_\eta)$. Similarly for $\le_P$-increasing.
Suppose that $P$ is a double order and is a set. We say that $P$ has true cofinality iff $P$ has a linearly ordered subset which is cofinal.
Proposition 25.8. Suppose that a set $P$ is a double order, and $\langle p_\alpha : \alpha < \lambda\rangle$ is strictly increasing in the sense of $P$, is cofinal in $P$, and $\lambda$ is regular. Then $P$ has true cofinality, and its cofinality is $\lambda$.
Proof. Obviously $P$ has true cofinality. If $X$ is a subset of $P$ of size less than $\lambda$, for each $q \in X$ choose $\alpha_q < \lambda$ such that $q < p_{\alpha_q}$. Let $\beta = \sup_{q \in X} \alpha_q$. Then $\beta < \lambda$ since $\lambda$ is regular. For any $q \in X$ we have $q < p_\beta$, so no member of $X$ lies above $p_\beta$ and $X$ is not cofinal in $P$. This argument shows that $\mathrm{cf}(P) = \lambda$.

Proposition 25.9. Suppose that $P$ is a double ordering, $P$ a set, and $P$ has true cofinality. Then:
(i) $\mathrm{cf}(P)$ is regular.
(ii) $\mathrm{cf}(P)$ is the least size of a linearly ordered subset which is cofinal in $P$.
(iii) There is a $<_P$-increasing, cofinal sequence in $P$ of length $\mathrm{cf}(P)$.
Proof. Let $X$ be a linearly ordered subset of $P$ which is cofinal in $P$, and let $\{y_\alpha : \alpha < \mathrm{cf}(P)\}$ be a subset of $P$ which is cofinal in $P$; we do not assume that $\langle y_\alpha : \alpha < \mathrm{cf}(P)\rangle$ is $<_P$- or $\le_P$-increasing.
(iii): We define a sequence $\langle x_\alpha : \alpha < \mathrm{cf}(P)\rangle$ by recursion. Let $x_0$ be any element of $X$. If $x_\alpha$ has been defined, let $x_{\alpha+1} \in X$ be such that $x_\alpha, y_\alpha < x_{\alpha+1}$; it exists since $X$ is cofinal, using condition (3). Now suppose that $\alpha < \mathrm{cf}(P)$ is limit and $x_\beta$ has been defined for all $\beta < \alpha$. Then $\{x_\beta : \beta < \alpha\}$ is not cofinal in $P$, so there is a $z \in P$ such that $z \not\le x_\beta$ for all $\beta < \alpha$. Choose $x_\alpha \in X$ so that $z < x_\alpha$. Since $X$ is linearly ordered, we must then have $x_\beta < x_\alpha$ for all $\beta < \alpha$. This finishes the construction. Since $y_\alpha < x_{\alpha+1}$ for all $\alpha < \mathrm{cf}(P)$, it follows that $\{x_\alpha : \alpha < \mathrm{cf}(P)\}$ is cofinal in $P$. So (iii) holds.
(i): Suppose that $\mathrm{cf}(P)$ is singular, and let $\langle \alpha_\xi : \xi < \mathrm{cf}(\mathrm{cf}(P))\rangle$ be a strictly increasing sequence cofinal in $\mathrm{cf}(P)$. With $\langle x_\alpha : \alpha < \mathrm{cf}(P)\rangle$ as in (iii), it is then clear that $\{x_{\alpha_\xi} : \xi < \mathrm{cf}(\mathrm{cf}(P))\}$ is cofinal in $P$, contradiction (since $\mathrm{cf}(\mathrm{cf}(P)) < \mathrm{cf}(P)$ because $\mathrm{cf}(P)$ is singular).
(ii): By (iii), there is a linearly ordered subset of $P$ of size $\mathrm{cf}(P)$ which is cofinal in $P$; by the definition of cofinality, there cannot be one of smaller size.
For $P$ with true cofinality, the cardinal $\mathrm{cf}(P)$ is called the true cofinality of $P$, and is denoted by $\mathrm{tcf}(P)$. It does not always exist; the example given before Proposition 25.8 shows this. When it does exist, it coincides with the cofinality. We write
$$\mathrm{tcf}(P) = \lambda$$
to mean that $P$ has true cofinality, and it is equal to $\lambda$.
$P$ is $\lambda$-directed iff for any subset $Q$ of $P$ such that $|Q| < \lambda$ there is a $p \in P$ such that $q \le_P p$ for all $q \in Q$; equivalently, there is a $p \in P$ such that $q <_P p$ for all $q \in Q$.
Proposition 25.10. (Pouzet) Assume that $P$ is a double poset. For any infinite cardinal $\lambda$, we have $\mathrm{tcf}(P) = \lambda$ iff the following two conditions hold:
(i) $P$ has a cofinal subset of size $\lambda$.
(ii) $P$ is $\lambda$-directed.
Proof. $\Rightarrow$ is clear, remembering that $\lambda$ is regular. Now assume that (i) and (ii) hold, and let $X$ be a cofinal subset of $P$ of size $\lambda$.
First we show that $\lambda$ is regular. Suppose that it is singular. Write $X = \bigcup_{\alpha < \mathrm{cf}(\lambda)} Y_\alpha$ with $|Y_\alpha| < \lambda$ for each $\alpha < \mathrm{cf}(\lambda)$. Let $p_\alpha$ be an upper bound for $Y_\alpha$ for each $\alpha < \mathrm{cf}(\lambda)$, and let $q$ be an upper bound for $\{p_\alpha : \alpha < \mathrm{cf}(\lambda)\}$. Choose $r > q$. Then choose $s \in X$ with $r \le s$. Say $s \in Y_\alpha$. Then $s \le p_\alpha \le q < r \le s$, contradiction.
So, $\lambda$ is regular. Let $X = \{r_\alpha : \alpha < \lambda\}$. Now we define a sequence $\langle p_\alpha : \alpha < \lambda\rangle$ by recursion. Having defined $p_\beta$ for all $\beta < \alpha$, by (ii) let $p_\alpha$ be such that $p_\beta < p_\alpha$ for all $\beta < \alpha$, and $r_\beta < p_\alpha$ for all $\beta < \alpha$. Clearly this sequence shows that $\mathrm{tcf}(P, <_P) = \lambda$.

Proposition 25.11. Let $P$ be a set. If $G$ is a cofinal subset of $P$, then $\mathrm{cf}(P) = \mathrm{cf}(G)$. Moreover, $\mathrm{tcf}(P) = \mathrm{tcf}(G)$, in the sense that if one of them exists then so does the other, and they are equal. (That is what we mean in the future too when we assert the equality of true cofinalities.)
Proof. Let $H$ be a cofinal subset of $P$ of size $\mathrm{cf}(P)$. For each $p \in H$ choose $q_p \in G$ such that $p \le_P q_p$. Then $\{q_p : p \in H\}$ is cofinal in $G$. In fact, if $r \in G$, choose $p \in H$ such that $r \le_P p$. Then $r \le_P q_p$, as desired. This shows that $\mathrm{cf}(G) \le \mathrm{cf}(P)$.
Now suppose that $K$ is a cofinal subset of $G$. Then it is also cofinal in $P$. For, if $p \in P$ choose $q \in G$ such that $p \le_P q$, and then choose $r \in K$ such that $q \le_P r$. So $p \le_P r$, as desired. This shows the other inequality, $\mathrm{cf}(P) \le \mathrm{cf}(G)$.
For the true cofinality, we apply Proposition 25.10. So suppose that $P$ has true cofinality $\lambda$. By Proposition 25.10 and the first part of this proof, $G$ has a cofinal subset of size $\lambda$, since cofinality is the same as true cofinality when the latter exists. Now suppose that $X \subseteq G$ is of size $< \lambda$. Choose an upper bound $p$ for it in $P$. Then choose $q \in G$ such that $p \le_P q$. So $q$ is an upper bound for $X$, as desired. Thus since Proposition 25.10(i) and 25.10(ii) hold for $G$, it follows from that proposition that $\mathrm{tcf}(G) = \lambda$.
The other implication, that the existence of $\mathrm{tcf}(G, <)$ implies that of $\mathrm{tcf}(P, <)$ and their equality, is even easier, since a sequence cofinal in $G$ is also cofinal in $P$.
A sequence $\langle p_\xi : \xi < \lambda\rangle$ of elements of $P$ is persistently cofinal iff
$$\forall h \in P\, \exists \xi_0 < \lambda\, \forall \xi\,(\xi_0 \le \xi < \lambda \rightarrow h <_P p_\xi).$$
Proposition 25.12. (i) If $\langle p_\xi : \xi < \lambda\rangle$ is $<_P$-increasing and cofinal in $P$, then it is persistently cofinal.
(ii) If $\langle p_\xi : \xi < \lambda\rangle$ and $\langle p'_\xi : \xi < \lambda\rangle$ are two sequences of members of $P$, $\langle p_\xi : \xi < \lambda\rangle$ is persistently cofinal in $P$, and $p_\xi \le_P p'_\xi$ for all $\xi < \lambda$, then also $\langle p'_\xi : \xi < \lambda\rangle$ is persistently cofinal in $P$.
If $X \subseteq P$, then an upper bound for $X$ is an element $p \in P$ such that $q \le_P p$ for all $q \in X$.
If $X \subseteq P$, then a least upper bound for $X$ is an upper bound $a$ for $X$ such that $a \le_P a'$ for every upper bound $a'$ for $X$. So if $a$ and $b$ are least upper bounds for $X$, then $a \le_P b \le_P a$, but it is not necessarily true that $a = b$; see exercise E25.1.
If $X \subseteq P$, then a minimal upper bound for $X$ is an upper bound $a$ for $X$ such that if $b$ is an upper bound for $X$ and $b \le_P a$, then $a \le_P b$.
Proposition 25.13. If $X \subseteq P$ and $a$ is a least upper bound for $X$, then $a$ is a minimal upper bound for $X$.
The converse of this proposition does not hold in general. In fact, let $P$ be the partial order consisting of $\omega$ under the usual order with two additional incomparable elements which are declared to be greater than each natural number. Both of the new elements are minimal upper bounds for $\omega$ but not least upper bounds.
Now we come to an ordering notion which is basic for pcf theory.
If $X \subseteq P$ and for every $x \in X$ there is an $x' \in X$ such that $x <_P x'$, then an element $a \in P$ is an exact upper bound of $X$ provided
(1) $a$ is a least upper bound for $X$, and
(2) $X$ is cofinal in $\{p \in P : p <_P a\}$.
Note that under the hypothesis here, $a \notin X$, and hence $x <_P a$ for all $x \in X$ by (1).
In general, there are sets which have least upper bounds but no exact upper bounds. For example, take two copies of $\omega$ with a single new element $a$ greater than each member of the two copies. If $X$ is one of the copies of $\omega$, then $a$ is the least upper bound of $X$. But $X$ does not have an exact upper bound.
Ordinal-valued functions and exact upper bounds
In this section we give some simple facts about exact upper bounds in the case of most interest to us: the partial ordering of ordinal-valued functions.
First we note that the rough equivalence between products and reduced products continues to hold for the cofinality notions introduced above. We state this for the most important properties above:
Proposition 25.14. Suppose that $h \in {}^A\mathrm{Ord}$, and $h$ takes only limit ordinal values. Then
(i) If $X \subseteq \prod_{a \in A} h(a)$, then $X$ is cofinal in $(\prod_{a \in A} h(a), <_I, \le_I)$ iff $\{[f] : f \in X\}$ is cofinal in $(\prod_{a \in A} h(a)/I, <_I, \le_I)$.
(ii) $\mathrm{cf}(\prod_{a \in A} h(a), <_I, \le_I) = \mathrm{cf}(\prod_{a \in A} h(a)/I, <_I, \le_I)$.
(iii) $\mathrm{tcf}(\prod_{a \in A} h(a), <_I, \le_I) = \mathrm{tcf}(\prod_{a \in A} h(a)/I, <_I, \le_I)$.
(iv) If $X \subseteq \prod_{a \in A} h(a)$ and $f \in \prod_{a \in A} h(a)$, then $f$ is an exact upper bound for $X$ iff $[f]$ is an exact upper bound for $\{[g] : g \in X\}$.
Proof. (i) is immediate from Proposition 25.5. For (ii), if $X$ is cofinal in the system $(\prod_{a \in A} h(a), <_I, \le_I)$, then clearly $\{[f] : f \in X\}$ is cofinal in $(\prod_{a \in A} h(a)/I, <_I, \le_I)$, by Proposition 25.5 again; so $\ge$ holds. Now suppose that $\{[f] : f \in Y\}$ is cofinal in $(\prod_{a \in A} h(a)/I, <_I, \le_I)$. Given $g \in \prod_{a \in A} h(a)$, choose $f \in Y$ such that $[g] <_I [f]$. Then $g <_I f$. So $Y$ is cofinal in $(\prod_{a \in A} h(a), <_I, \le_I)$, and $\le$ holds.
(iii) and (iv) are proved similarly.
The following obvious proposition will be useful.
Proposition 25.15. Suppose that $F \cup \{f, g\} \subseteq {}^A\mathrm{Ord}$, $I$ is an ideal on $A$, and $f =_I g$. Suppose that $f$ is an upper bound, least upper bound, minimal upper bound, or exact upper bound for $F$ under $\le_I$. Then also $g$ is an upper bound, least upper bound, minimal upper bound, or exact upper bound for $F$ under $\le_I$, respectively.
Here is our simplest existence theorem for exact upper bounds.
If $X$ is a collection of members of ${}^A\mathrm{Ord}$, then $\sup X \in {}^A\mathrm{Ord}$ is defined by
$$(\sup X)(a) = \sup\{f(a) : f \in X\}.$$
Proposition 25.16. Suppose that $\lambda > |A|$ is a regular cardinal, and $f = \langle f_\xi : \xi < \lambda\rangle$ is an increasing sequence of members of ${}^A\mathrm{Ord}$ in the partial ordering $<$ of everywhere dominance. (That is, $f < g$ iff $f(a) < g(a)$ for all $a \in A$.) Then $\sup f$ is an exact upper bound for $f$, and $\mathrm{cf}((\sup f)(a)) = \lambda$ for every $a \in A$.
Proof. For brevity let $h = \sup f$. Then clearly $h$ is an upper bound for $f$. Now suppose that $f_\xi \le g \in {}^A\mathrm{Ord}$ for all $\xi < \lambda$. Then for any $a \in A$ we have $h(a) = \sup_{\xi < \lambda} f_\xi(a) \le g(a)$, so $h \le g$. Thus $h$ is a least upper bound for $f$. Now suppose that $k \in {}^A\mathrm{Ord}$ and $k < h$. Then for every $a \in A$ we have $k(a) < h(a)$, and hence there is a $\xi_a < \lambda$ such that $k(a) < f_{\xi_a}(a)$. Let $\eta = \sup_{a \in A} \xi_a$. So $\eta < \lambda$ since $\lambda$ is regular and greater than $|A|$. Clearly $k < f_\eta$, as desired.
The next proposition gives equivalent definitions of least upper bounds for our special
partial order.
Proposition 25.17. Suppose that $I$ is a proper ideal on $A$, $F \subseteq {}^A\mathrm{Ord}$, and $f \in {}^A\mathrm{Ord}$. Then the following conditions are equivalent.
(i) $f$ is a least upper bound of $F$ under $\le_I$.
(ii) $f$ is an upper bound of $F$ under $\le_I$, and for any $f' \in {}^A\mathrm{Ord}$, if $f'$ is an upper bound of $F$ under $\le_I$ and $f' \le_I f$, then $f =_I f'$.
(iii) $f$ is a minimal upper bound of $F$ under $\le_I$.
Proof. (i)$\Rightarrow$(ii): Assume (i) and the hypotheses of (ii). Hence $f \le_I f'$, so $f =_I f'$ by Proposition 25.2(vii).
(ii)$\Rightarrow$(iii): Assume (ii), and suppose that $g \in {}^A\mathrm{Ord}$ is an upper bound for $F$ and $g \le_I f$. Then $g =_I f$ by (ii), so $f \le_I g$.
(iii)$\Rightarrow$(i): Assume (iii). Let $g \in {}^A\mathrm{Ord}$ be any upper bound for $F$. Define $h(a) = \min(f(a), g(a))$ for all $a \in A$. Then $h$ is an upper bound for $F$, since if $k \in F$, then $\{a \in A : k(a) > f(a)\} \in I$ and also $\{a \in A : k(a) > g(a)\} \in I$, and
$$\{a \in A : k(a) > \min(f(a), g(a))\} \subseteq \{a \in A : k(a) > f(a)\} \cup \{a \in A : k(a) > g(a)\} \in I,$$
so $k \le_I h$. Also, clearly $h \le_I f$. So by (iii), $f \le_I h$, and hence $f \le_I g$, as desired.
In the next proposition we see that in the definition of exact upper bound we can weaken
the condition (1), under a mild restriction on the set in question.
Proposition 25.18. Suppose that $F$ is a nonempty set of functions in ${}^A\mathrm{Ord}$ and $\forall f \in F\, \exists f' \in F\,[f <_I f']$. Suppose that $h$ is an upper bound of $F$, and $\forall g \in {}^A\mathrm{Ord}$, if $g <_I h$ then there is an $f \in F$ such that $g <_I f$. Then $h$ is an exact upper bound for $F$.
Proof. First note that $\{a \in A : h(a) = 0\} \in I$. In fact, choose $f \in F$. Then $f <_I h$, and so $\{a \in A : h(a) = 0\} \subseteq \{a \in A : f(a) \ge h(a)\} \in I$, as desired.
Now we show that $h$ is a least upper bound for $F$. Let $k$ be any upper bound. Let
$$l(a) = \begin{cases} k(a) & \text{if } k(a) < h(a),\\ 0 & \text{otherwise.}\end{cases}$$
Since $\{a \in A : l(a) \ge h(a)\} \subseteq \{a \in A : h(a) = 0\}$, it follows by the above that $\{a \in A : l(a) \ge h(a)\} \in I$, and so $l <_I h$. So by assumption, choose $f \in F$ such that $l <_I f$. Now $f \le_I k$, so $l <_I k$ and hence
$$\{a \in A : k(a) < h(a)\} \subseteq \{a \in A : l(a) \ge k(a)\} \in I,$$
so $h \le_I k$, as desired.
For the other property in the definition of exact upper bound, suppose that $g <_I h$. Then by assumption there is an $f \in F$ such that $g <_I f$, as desired.
Corollary 25.19. If $h \in {}^A\mathrm{Ord}$ is non-trivial and $F \subseteq \prod_{a \in A} h(a)$, then $h$ is an exact upper bound of $F$ with respect to an ideal $I$ on $A$ iff $F$ is cofinal in $\prod_{a \in A} h(a)$.
In the next proposition we use the standard notation $I^+$ for $\mathscr{P}(A) \setminus I$. The proposition shows that exact upper bounds restrict to smaller sets $A$.
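Proposition 25.20. Suppose that $F$ is a nonempty subset of ${}^A\mathrm{Ord}$, $I$ is a proper ideal on $A$, $h$ is an exact upper bound for $F$ with respect to $I$, and $\forall f \in F\, \exists f' \in F\,(f <_I f')$. Also, suppose that $A_0 \in I^+$. Then:
(i) $J \overset{\mathrm{def}}{=} I \cap \mathscr{P}(A_0)$ is a proper ideal on $A_0$.
(ii) For any $f, f' \in {}^A\mathrm{Ord}$, if $f <_I f'$ then $f \upharpoonright A_0 <_J f' \upharpoonright A_0$.
(iii) $h \upharpoonright A_0$ is an exact upper bound for $\{f \upharpoonright A_0 : f \in F\}$.
Proof. (i) is clear. Assume the hypotheses of (ii). Then
$$\{a \in A_0 : f(a) \ge f'(a)\} \subseteq \{a \in A : f(a) \ge f'(a)\} \in I,$$
and so $f \upharpoonright A_0 <_J f' \upharpoonright A_0$.
For (iii), by (ii) we see that $h \upharpoonright A_0$ is an upper bound for $\{f \upharpoonright A_0 : f \in F\}$. To see that it is an exact upper bound, we will apply Proposition 25.18. So, suppose that $k <_J h \upharpoonright A_0$. Fix $f \in F$. Now define $g \in {}^A\mathrm{Ord}$ by setting
$$g(a) = \begin{cases} f(a) & \text{if } a \in A \setminus A_0,\\ k(a) & \text{if } a \in A_0.\end{cases}$$
Then
$$\{a \in A : g(a) \ge h(a)\} \subseteq \{a \in A : f(a) \ge h(a)\} \cup \{a \in A_0 : k(a) \ge h(a)\} \in I,$$
so $g <_I h$. Hence there is an $l \in F$ such that $g <_I l$. Hence
$$\{a \in A_0 : k(a) \ge l(a)\} \subseteq \{a \in A : g(a) \ge l(a)\} \in I,$$
so $k <_J l \upharpoonright A_0$, as desired.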
Next, increasing the ideal maintains exact upper bounds:
Proposition 25.21. Suppose that $F$ is a nonempty subset of ${}^A\mathrm{Ord}$, $I$ is a proper ideal on $A$, $h$ is an exact upper bound for $F$ with respect to $I$, and $\forall f \in F\, \exists f' \in F\,(f <_I f')$. Let $J$ be a proper ideal on $A$ such that $I \subseteq J$. Then $h$ is an exact upper bound for $F$ with respect to $J$.
Proof. We will apply Proposition 25.18. Note that $h$ is clearly an upper bound for $F$ with respect to $J$. Now suppose that $g <_J h$. Let $f \in F$. Define $g'$ by
$$g'(a) = \begin{cases} g(a) & \text{if } g(a) < h(a),\\ f(a) & \text{otherwise.}\end{cases}$$
Then $\{a \in A : g'(a) \ge h(a)\} \subseteq \{a \in A : f(a) \ge h(a)\} \in I$, since $f <_I h$. So $g' <_I h$. Hence by the exactness of $h$ there is a $k \in F$ such that $g' <_I k$. So
$$\begin{aligned}
\{a \in A : g(a) \ge k(a)\} &\subseteq \{a \in A : h(a) > g(a) \ge k(a)\} \cup \{a \in A : h(a) \le g(a)\}\\
&\subseteq \{a \in A : g'(a) \ge k(a)\} \cup \{a \in A : h(a) \le g(a)\},
\end{aligned}$$
and this union is in $J$ since the first set is in $I$ and the second one is in $J$. Hence $g <_J k$, as desired.
Again we turn from the general case of proper classes ${}^A\mathrm{Ord}$ to the sets $\prod_{a \in A} h(a)$, where $h \in {}^A\mathrm{Ord}$ has only limit ordinal values. We prove some results which show that under a weak hypothesis we can restrict attention to $\prod A$ for $A$ a nonempty set of infinite regular cardinals instead of $\prod_{a \in A} h(a)$, as far as cofinality notions are concerned. Here $\prod A$ consists of all choice functions $f$ with domain $A$; $f(a) \in a$ for all $a \in A$.
Proposition 25.22. Suppose that $h \in {}^A\mathrm{Ord}$ and $h(a)$ is a limit ordinal for every $a \in A$. For each $a \in A$, let $S(a) \subseteq h(a)$ be cofinal in $h(a)$ with order type $\mathrm{cf}(h(a))$. Suppose that $I$ is a proper ideal on $A$. Then
(i) $\mathrm{cf}(\prod_{a \in A} h(a), <_I) = \mathrm{cf}(\prod_{a \in A} S(a), <_I)$ and
(ii) $\mathrm{tcf}(\prod_{a \in A} h(a), <_I) = \mathrm{tcf}(\prod_{a \in A} S(a), <_I)$.
Proof. Write $\prod h$ for $\prod_{a \in A} h(a)$. For each $f \in \prod h$ define $g_f \in \prod_{a \in A} S(a)$ by setting
$$g_f(a) = \text{least } \alpha \in S(a) \text{ such that } f(a) \le \alpha.$$
We prove (i): suppose that $X \subseteq \prod h$ and $X$ is cofinal in $(\prod h, <_I)$; we show that $\{g_f : f \in X\}$ is cofinal in $(\prod_{a \in A} S(a), <_I)$, and this will prove $\ge$. So, let $k \in \prod_{a \in A} S(a)$. Thus $k \in \prod h$, so there is an $f \in X$ such that $k <_I f$. Since $f \le g_f$, it follows that $k <_I g_f$, as desired. Conversely, suppose that $Y \subseteq \prod_{a \in A} S(a)$ and $Y$ is cofinal in $(\prod_{a \in A} S(a), <_I)$; we show that also $Y$ is cofinal in $\prod h$, and this will prove $\le$ of the claim. Let $f \in \prod h$. Then $f \le g_f$, and there is a $k \in Y$ such that $g_f <_I k$; so $f <_I k$, as desired.
This finishes the proof of (i).
For (ii), first suppose that $\mathrm{tcf}(\prod h, <_I)$ exists; call it $\lambda$. Thus $\lambda$ is an infinite regular cardinal. Let $\langle f_i : i < \lambda\rangle$ be a $<_I$-increasing cofinal sequence in $\prod h$. We claim that $g_{f_i} \le_I g_{f_j}$ if $i < j < \lambda$. In fact, if $a \in A$ and $f_i(a) < f_j(a)$, then $f_i(a) < f_j(a) \le g_{f_j}(a) \in S(a)$, and so by the definition of $g_{f_i}$ we get $g_{f_i}(a) \le g_{f_j}(a)$. This implies that $g_{f_i} \le_I g_{f_j}$. Now $\mathrm{cf}(\prod h, <_I) = \lambda$, so for any $B \in [\lambda]^{<\lambda}$ there is a $j < \lambda$ such that $g_{f_i} <_I f_j \le g_{f_j}$ for all $i \in B$. It follows that we can take a subsequence of $\langle g_{f_i} : i < \lambda\rangle$ which is strictly increasing modulo $I$; it is also clearly cofinal, and hence $\lambda = \mathrm{tcf}(\prod_{a \in A} S(a), <_I)$ by Proposition 25.25.
Conversely, suppose that $\mathrm{tcf}(\prod_{a \in A} S(a), <_I)$ exists; call it $\lambda$. Let $\langle f_i : i < \lambda\rangle$ be a $<_I$-increasing cofinal sequence in $\prod_{a \in A} S(a)$. Then it is also a sequence showing that $\mathrm{tcf}(\prod h, <_I)$ exists and equals $\mathrm{tcf}(\prod_{a \in A} S(a), <_I)$.
Proposition 25.23. Suppose that $\langle L_a : a \in A\rangle$ and $\langle M_a : a \in A\rangle$ are systems of linearly ordered sets such that each $L_a$ and $M_a$ has no last element. Suppose that $L_a$ is isomorphic to $M_a$ for all $a \in A$. Let $I$ be any ideal on $A$. Then
$$\left(\prod_{a \in A} L_a, <_I, \le_I\right) \cong \left(\prod_{a \in A} M_a, <_I, \le_I\right).$$
Putting the last two propositions together, we see that to determine cofinality and true cofinality of $(\prod h, <_I, \le_I)$, where $h \in {}^A\mathrm{Ord}$ and $h(a)$ is a limit ordinal for all $a \in A$, it suffices to take the case in which each $h(a)$ is an infinite regular cardinal. (One passes from $h(a)$ to $S(a)$ and then to $\mathrm{cf}(h(a))$.) We can still make a further reduction, given in the following useful lemma.
Lemma 25.24. (Rudin-Keisler) Suppose that $c$ maps the set $A$ into the class of regular cardinals, and $B = \{c(a) : a \in A\}$ is its range. For any ideal $I$ over $A$, define its Rudin-Keisler projection $J$ on $B$ by
$$X \in J \quad\text{iff}\quad X \subseteq B \text{ and } c^{-1}[X] \in I.$$
Then $J$ is an ideal on $B$, and there is an isomorphism $h$ of $\prod B/J$ into $\prod_{a \in A} c(a)/I$ such that for any $e \in \prod B$ we have $h(e/J) = \langle e(c(a)) : a \in A\rangle/I$.
If $|A| < \min(B)$, then the range of $h$ is cofinal in $\prod_{a \in A} c(a)/I$, and we have
(i) $\mathrm{cf}(\prod B/J) = \mathrm{cf}(\prod_{a \in A} c(a)/I)$ and
(ii) $\mathrm{tcf}(\prod B/J) = \mathrm{tcf}(\prod_{a \in A} c(a)/I)$.
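Proof. Clearly $J$ is an ideal. Next, for any $e \in \prod B$ let $\bar{e} = \langle e(c(a)) : a \in A\rangle$. Then for any $e_1, e_2 \in \prod B$ we have
$$\begin{aligned}
e_1 =_J e_2 &\quad\text{iff}\quad \{b \in B : e_1(b) \ne e_2(b)\} \in J\\
&\quad\text{iff}\quad c^{-1}[\{b \in B : e_1(b) \ne e_2(b)\}] \in I\\
&\quad\text{iff}\quad \{a \in A : e_1(c(a)) \ne e_2(c(a))\} \in I\\
&\quad\text{iff}\quad \bar{e}_1 =_I \bar{e}_2.
\end{aligned}$$
This shows that $h$ exists as indicated and is one-one. Similarly, $h$ preserves $<$ in each direction. So the first part of the lemma holds.
Now suppose that $|A| < \min(B)$. Let $G$ be the range of $h$. By Proposition 25.11, (i) and (ii) follow from $G$ being cofinal in $\prod_{a \in A} c(a)/I$. Let $g \in \prod_{a \in A} c(a)$. Define $e$ by setting, for any $b \in B$,
$$e(b) = \sup\{g(a) : a \in A \text{ and } c(a) = b\}.$$
The additional supposition implies that $e \in \prod B$, since each $b \in B$ is a regular cardinal greater than $|A|$. Now note that $\{a \in A : g(a) > e(c(a))\} = \emptyset \in I$, so that $g/I \le h(e/J)$, as desired.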
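The definition of the Rudin-Keisler projection and the equivalence $e_1 =_J e_2$ iff $\bar{e}_1 =_I \bar{e}_2$ can be traced on a toy instance. The following is a minimal sketch under loud assumptions: all sets are finite, the values of $c$ are arbitrary labels rather than regular cardinals, and the helper names (`rk_projection`, `equal_mod_ideal`, `lift`) are hypothetical. It only illustrates the combinatorics of the definition, not the cofinality statements.

```python
from itertools import combinations

def powerset(S):
    """All subsets of the finite set S, as frozensets."""
    items = list(S)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def rk_projection(I, c, A):
    """Return (B, J): B is the range of c, and X is in J iff c^{-1}[X] is in I."""
    B = frozenset(c[a] for a in A)
    J = {X for X in powerset(B)
         if frozenset(a for a in A if c[a] in X) in I}
    return B, J

def equal_mod_ideal(f, g, domain, ideal):
    """f =_I g iff the set where f and g differ belongs to the ideal."""
    return frozenset(x for x in domain if f[x] != g[x]) in ideal

def lift(e, c, A):
    """The function a -> e(c(a)), written e-bar in the proof."""
    return {a: e[c[a]] for a in A}

A = {0, 1, 2, 3}
c = {0: "x", 1: "x", 2: "y", 3: "z"}   # labels stand in for cardinals (assumption)
I = powerset({0, 2})                   # the ideal of all subsets of {0, 2}
B, J = rk_projection(I, c, A)

e1 = {"x": 0, "y": 0, "z": 0}
e2 = {"x": 0, "y": 5, "z": 0}          # differs from e1 only at "y"
e3 = {"x": 1, "y": 0, "z": 0}          # differs from e1 only at "x"
```

Here `J` consists of the empty set and `{"y"}`, because only the preimage `{2}` lies in `I`. Accordingly `e1` and `e2` agree modulo `J` and their lifts agree modulo `I`, while `e1` and `e3` are distinct in both senses.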
According to these last propositions, the calculation of true cofinalities for partial orders of the form $(\prod_{a \in A} h(a), <_I)$, with $h \in {}^A\mathrm{Ord}$ and $h(a)$ a limit ordinal for every $a \in A$, and with $|A| < \min\{\mathrm{cf}(h(a)) : a \in A\}$, reduces to the calculation of true cofinalities of partial orders of the form $(\prod B, <_J)$ with $B$ a set of regular cardinals with $|B| < \min(B)$.
For the next lemma, note that an ultraproduct of linear orders is a linear order. In fact, given $[f] \ne [g]$ in $\prod_{i \in I} L_i/F$ with each $L_i$ a linear order, we have
$$\{i \in I : f(i) \ne g(i)\} = \{i \in I : f(i) < g(i)\} \cup \{i \in I : g(i) < f(i)\},$$
and since the set on the left is in $F$, it follows that one of the sets on the right is in $F$; hence $[f] < [g]$ or $[g] < [f]$.
Lemma 25.25. If $(P_i, <_i)$ is a partial order with true cofinality $\lambda_i$ for each $i \in I$ and $D$ is an ultrafilter on $I$, then $\mathrm{tcf}(\prod_{i \in I} \lambda_i/D) = \mathrm{tcf}(\prod_{i \in I} P_i/D)$.
Proof. Note that $\prod_{i \in I} \lambda_i/D$ is a linear order, and so its true cofinality $\mu$ exists and equals its cofinality. So the lemma is asserting that the ultraproduct $\prod_{i \in I} P_i/D$ has $\mu$ as true cofinality.
Let $\langle g_\xi : \xi < \mu\rangle$ be a sequence of members of $\prod_{i \in I} \lambda_i$ such that $\langle g_\xi/D : \xi < \mu\rangle$ is strictly increasing and cofinal in $\prod_{i \in I} \lambda_i/D$. For each $i \in I$ let $\langle f_{\xi,i} : \xi < \lambda_i\rangle$ be strictly increasing and cofinal in $(P_i, <_i)$. For each $\xi < \mu$ define $h_\xi \in \prod_{i \in I} P_i$ by setting $h_\xi(i) = f_{g_\xi(i),i}$. We claim that $\langle h_\xi/D : \xi < \mu\rangle$ is strictly increasing and cofinal in $\prod_{i \in I} P_i/D$ (as desired).
To prove this, first suppose that $\xi < \eta < \mu$. Then
$$\{i \in I : h_\xi(i) <_i h_\eta(i)\} = \{i \in I : f_{g_\xi(i),i} <_i f_{g_\eta(i),i}\} = \{i \in I : g_\xi(i) < g_\eta(i)\} \in D;$$
so $h_\xi/D < h_\eta/D$.
Now suppose that $k \in \prod_{i \in I} P_i$; we want to find $\xi < \mu$ such that $k/D < h_\xi/D$. Define $l \in \prod_{i \in I} \lambda_i$ by letting $l(i)$ be the least $\xi < \lambda_i$ such that $k(i) <_i f_{\xi,i}$. Choose $\xi < \mu$ such that $l/D < g_\xi/D$. Now if $l(i) < g_\xi(i)$, then $k(i) <_i f_{l(i),i} <_i f_{g_\xi(i),i} = h_\xi(i)$. So $k/D < h_\xi/D$.
Existence of exact upper bounds
We introduce several notions leading up to an existence theorem for exact upper bounds:
projections, strongly increasing sequences, a partition property, and the bounding projection property.
We start with the important notion of projections. By a projection framework we mean a triple $(A, I, S)$ consisting of a nonempty set $A$, an ideal $I$ on $A$, and a sequence $\langle S_a : a \in A\rangle$ of nonempty sets of ordinals. Suppose that we are given such a framework. We define $\sup S$ in the natural way: it is a function with domain $A$, and $(\sup S)(a) = \sup(S_a)$ for every $a \in A$. Thus $\sup S \in {}^A\mathrm{Ord}$. Now suppose also that we have a function $f \in {}^A\mathrm{Ord}$. Then we define the projection of $f$ onto $\prod_{a \in A} S_a$, denoted by $f^+ = \mathrm{proj}(f, S)$, by setting, for any $a \in A$,
$$f^+(a) = \begin{cases} \min(S_a \setminus f(a)) & \text{if } f(a) < \sup(S_a),\\ \min(S_a) & \text{otherwise.}\end{cases}$$
Thus
$$f^+(a) = \begin{cases} f(a) & \text{if } f(a) \in S_a \text{ and } f(a) \text{ is not the largest element of } S_a,\\ \text{least } x \in S_a \text{ such that } f(a) < x & \text{if } f(a) \notin S_a \text{ and } f(a) < \sup(S_a),\\ \min(S_a) & \text{if } \sup(S_a) \le f(a).\end{cases}$$
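The case analysis for $f^+$ can be followed on a small instance. The sketch below is an illustration only, under the assumption that every $S_a$ is a finite nonempty set of natural numbers, so that $\sup(S_a)$ is simply its maximum; the function name `proj` mirrors the notation $\mathrm{proj}(f, S)$, and the sample data are hypothetical.

```python
def proj(f, S):
    """Projection f^+ of f onto the product of the sets S[a].

    Assumes each S[a] is a finite nonempty set of natural numbers,
    so sup(S[a]) equals max(S[a]).
    """
    out = {}
    for a, Sa in S.items():
        if f[a] < max(Sa):
            # min(S_a \ f(a)): the least element of S_a that is >= f(a)
            out[a] = min(x for x in Sa if x >= f[a])
        else:
            out[a] = min(Sa)
    return out

S = {"a": {2, 5, 9}, "b": {1, 4}, "c": {3, 7}}
f = {"a": 5, "b": 2, "c": 7}
f_plus = proj(f, S)
```

The three coordinates hit the three cases in turn: at `"a"` the value 5 lies in $S_a$ and is not its largest element, so it is kept; at `"b"` the value 2 is not in $S_b$ and is rounded up to 4; at `"c"` the value 7 equals $\sup(S_c)$, so the projection falls back to $\min(S_c) = 3$.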

Proposition 25.26. Let a projection framework be given, with the notation above.
(i) If $f \in {}^A\mathrm{Ord}$, then $f^+ \in \prod_{a \in A} S_a$.
(ii) If $f_1, f_2 \in {}^A\mathrm{Ord}$ and $f_1 =_I f_2$, then $f_1^+ =_I f_2^+$.
(iii) If $f \in {}^A\mathrm{Ord}$ and $f <_I \sup S$, then $f \le_I f^+$, and for every $g \in \prod_{a \in A} S_a$, if $f \le_I g$ then $f^+ \le_I g$.
Proof. (i) and (ii) are clear. For (iii), suppose that $f \in {}^A\mathrm{Ord}$ and $f <_I \sup S$. Then if $f(a) > f^+(a)$ we must have $f(a) \ge \sup(S_a)$. Hence $f \le_I f^+$. Now suppose that $g \in \prod_{a \in A} S_a$ and $f \le_I g$. If $f(a) \le g(a)$ and $f(a) < \sup(S_a)$, then $f^+(a) \le g(a)$. Hence
$$\{a \in A : g(a) < f^+(a)\} \subseteq \{a \in A : f(a) > g(a)\} \cup \{a \in A : f(a) \ge \sup(S_a)\} \in I,$$
so $f^+ \le_I g$.
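Another important notion in discussing exact upper bounds is as follows. Let $I$ be an ideal over $A$, $L$ a set of ordinals, and $f = \langle f_\xi : \xi \in L\rangle$ a sequence of members of ${}^A\mathrm{Ord}$. Then we say that $f$ is strongly increasing under $I$ iff there is a system $\langle Z_\xi : \xi \in L\rangle$ of members of $I$ such that
$$\forall \xi, \eta \in L\,[\xi < \eta \rightarrow \forall a \in A \setminus (Z_\xi \cup Z_\eta)\,[f_\xi(a) < f_\eta(a)]].$$
Under the same assumptions we say that $f$ is very strongly increasing under $I$ iff there is a system $\langle Z_\xi : \xi \in L\rangle$ of members of $I$ such that
$$\forall \xi, \eta \in L\,[\xi < \eta \rightarrow \forall a \in A \setminus Z_\eta\,[f_\xi(a) < f_\eta(a)]].$$
Proposition 25.27. Under the above assumptions, $f$ is very strongly increasing under $I$ iff for every $\xi \in L$ we have
$$(\ast)\qquad \sup\{f_\alpha + 1 : \alpha \in L \cap \xi\} \le_I f_\xi.$$
Proof. $\Rightarrow$: suppose that $f$ is very strongly increasing under $I$, with sets $Z_\xi$ as indicated. Let $\xi \in L$. Suppose that $a \in A \setminus Z_\xi$. Then for any $\alpha \in L \cap \xi$ we have $f_\alpha(a) < f_\xi(a)$, and so $\sup\{f_\alpha(a) + 1 : \alpha \in L \cap \xi\} \le f_\xi(a)$; it follows that $(\ast)$ holds.
$\Leftarrow$: suppose that $(\ast)$ holds for each $\xi \in L$. For each $\xi \in L$ let
$$Z_\xi = \{a \in A : \sup\{f_\alpha(a) + 1 : \alpha \in L \cap \xi\} > f_\xi(a)\};$$
it follows that $Z_\xi \in I$. Now suppose that $\alpha, \xi \in L$ and $\alpha < \xi$. Suppose that $a \in A \setminus Z_\xi$. Then $f_\alpha(a) < f_\alpha(a) + 1 \le \sup\{f_\beta(a) + 1 : \beta \in L \cap \xi\} \le f_\xi(a)$, as desired.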
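Lemma 25.28. (The sandwich argument) Suppose that $h = \langle h_\xi : \xi \in L\rangle$ is strongly increasing under $I$, $L$ has no largest element, and $\xi'$ is the successor in $L$ of $\xi$ for every $\xi \in L$. Also suppose that $f_\xi \in {}^A\mathrm{Ord}$ is such that
$$h_\xi <_I f_\xi \le_I h_{\xi'} \quad\text{for every } \xi \in L.$$
Then $\langle f_\xi : \xi \in L\rangle$ is also strongly increasing under $I$.
Proof. Let $\langle Z_\xi : \xi \in L\rangle$ testify that $h$ is strongly increasing under $I$. For every $\xi \in L$ let
$$W_\xi = \{a \in A : h_\xi(a) \ge f_\xi(a) \text{ or } f_\xi(a) > h_{\xi'}(a)\}.$$
Thus by hypothesis we have $W_\xi \in I$. Let $Z^\xi = W_\xi \cup Z_\xi \cup Z_{\xi'}$ for every $\xi \in L$; so $Z^\xi \in I$. Then if $\xi_1 < \xi_2$, both in $L$, and if $a \in A \setminus (Z^{\xi_1} \cup Z^{\xi_2})$, then
$$f_{\xi_1}(a) \le h_{\xi_1'}(a) \le h_{\xi_2}(a) < f_{\xi_2}(a);$$
these three inequalities hold because $a \in A \setminus W_{\xi_1}$, $a \in A \setminus (Z_{\xi_1'} \cup Z_{\xi_2})$, and $a \in A \setminus W_{\xi_2}$ respectively.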
Now we give a proposition connecting the notion of strongly increasing sequence with the
existence of exact upper bounds.
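Proposition 25.29. Let $I$ be a proper ideal over $A$, let $\lambda > |A|$ be a regular cardinal, and let $f = \langle f_\xi : \xi < \lambda\rangle$ be a $<_I$-increasing sequence of functions in ${}^A\mathrm{Ord}$. Then the following conditions are equivalent:
(i) $f$ has a strongly increasing subsequence of length $\lambda$ under $I$.
(ii) $f$ has an exact upper bound $h$ such that $\{a \in A : \mathrm{cf}(h(a)) \ne \lambda\} \in I$.
(iii) $f$ has an exact upper bound $h$ such that $\mathrm{cf}(h(a)) = \lambda$ for all $a \in A$.
(iv) There is a sequence $g = \langle g_\xi : \xi < \lambda\rangle$ such that $g_\xi < g_\eta$ (everywhere) for $\xi < \eta$, and $f$ is cofinally equivalent to $g$, in the sense that $\forall \xi < \lambda\, \exists \eta < \lambda\,(f_\xi <_I g_\eta)$ and $\forall \xi < \lambda\, \exists \eta < \lambda\,(g_\xi <_I f_\eta)$.
Proof. (i)$\Rightarrow$(ii): Let $\langle \eta(\xi) : \xi < \lambda\rangle$ be a strictly increasing sequence of ordinals less than $\lambda$, thus with supremum $\lambda$ since $\lambda$ is regular, and assume that $\langle f_{\eta(\xi)} : \xi < \lambda\rangle$ is strongly increasing under $I$. Hence for each $\xi < \lambda$ let $Z_\xi \in I$ be chosen correspondingly. We define for each $a \in A$
$$h(a) = \sup\{f_{\eta(\xi)}(a) : \xi < \lambda,\ a \notin Z_\xi\}.$$
To see that $h$ is an exact upper bound for $f$, we are going to apply Proposition 25.18. If $f_{\eta(\xi)}(a) > h(a)$, then $a \in Z_\xi \in I$. Hence $f_{\eta(\xi)} \le_I h$ for each $\xi < \lambda$. Then for any $\xi < \lambda$ we have $f_\xi \le_I f_{\eta(\xi)} \le_I h$, so $h$ bounds every $f_\xi$. Now suppose that $d <_I h$. Let $M = \{a \in A : d(a) \ge h(a)\}$; so $M \in I$. For each $a \in A \setminus M$ we have $d(a) < h(a)$, and so there is a $\xi_a < \lambda$ such that $d(a) < f_{\eta(\xi_a)}(a)$ and $a \notin Z_{\xi_a}$. Since $|A| < \lambda$ and $\lambda$ is regular, the ordinal $\rho \overset{\mathrm{def}}{=} \sup_{a \in A \setminus M} \xi_a$ is less than $\lambda$. We claim that $d <_I f_{\eta(\rho)}$. In fact, suppose that $a \in A \setminus (M \cup Z_\rho)$. Then $a \in A \setminus (Z_{\xi_a} \cup Z_\rho)$, and so $d(a) < f_{\eta(\xi_a)}(a) \le f_{\eta(\rho)}(a)$. Thus $d <_I f_{\eta(\rho)}$, as claimed. Now it follows easily from Proposition 25.18 that $h$ is an exact upper bound for $f$.
For the final portion of (ii), it suffices to show
(1) There is a $W \in I$ such that $\mathrm{cf}(h(a)) = \lambda$ for all $a \in A \setminus W$.
In fact, let
$$W = \{a \in A : \exists \xi_a < \lambda\, \forall \nu \in [\xi_a, \lambda)\,[a \in Z_\nu]\}.$$
Since $|A| < \lambda$, the ordinal $\rho \overset{\mathrm{def}}{=} \sup_{a \in W} \xi_a$ is less than $\lambda$. Clearly $W \subseteq Z_\rho$, so $W \in I$. For $a \in A \setminus W$ we have $\forall \xi < \lambda\, \exists \nu \in [\xi, \lambda)\,[a \notin Z_\nu]$. This gives an increasing sequence $\langle \nu_\xi : \xi < \lambda\rangle$ of ordinals less than $\lambda$ such that $a \notin Z_{\nu_\xi}$ for all $\xi < \lambda$. By the strong increasing property it follows that $f_{\eta(\nu_0)}(a) < f_{\eta(\nu_1)}(a) < \cdots$, and so $h(a)$ has cofinality $\lambda$. This proves (1), and with it, (ii).
(ii)$\Rightarrow$(iii): Let $W = \{a \in A : \mathrm{cf}(h(a)) \ne \lambda\}$; so $W \in I$ by (ii). Since $I$ is a proper ideal, choose $a_0 \in A \setminus W$, and define
$$h'(a) = \begin{cases} h(a) & \text{if } a \in A \setminus W,\\ h(a_0) & \text{if } a \in W.\end{cases}$$
Then $h =_I h'$, and it follows that $h'$ satisfies the properties needed.
(iii)$\Rightarrow$(iv): For each $a \in A$, let $\langle \mu^a_\xi : \xi < \lambda\rangle$ be a strictly increasing sequence of ordinals with supremum $h(a)$. Define $g_\xi(a) = \mu^a_\xi$ for all $a \in A$ and $\xi < \lambda$. Clearly $g_\xi < g_\eta$ if $\xi < \eta < \lambda$. Now suppose that $\xi < \lambda$. Then $f_\xi <_I h$. For each $a \in A$ such that $f_\xi(a) < h(a)$ choose $\nu_a < \lambda$ such that $f_\xi(a) < \mu^a_{\nu_a}$. Since $|A| < \lambda$, choose $\eta < \lambda$ such that $\nu_a < \eta$ for all such $a \in A$. Then for any $a \in A$ such that $f_\xi(a) < h(a)$ we have $f_\xi(a) < \mu^a_\eta = g_\eta(a)$. Hence $f_\xi <_I g_\eta$, which is half of what is desired in (iv).
Now suppose that $\xi < \lambda$. Then $g_\xi < h$, so by the exactness of $h$, there is an $\eta < \lambda$ such that $g_\xi <_I f_\eta$, as desired.
(iv)$\Rightarrow$(i): Assume (iv). Define strictly increasing continuous sequences $\langle \eta(\xi) : \xi < \lambda\rangle$ and $\langle \rho(\xi) : \xi < \lambda\rangle$ of ordinals less than $\lambda$ as follows. Let $\eta(0) = 0$, and choose $\rho(0)$ so that $g_0 <_I f_{\rho(0)}$. If $\eta(\xi)$ and $\rho(\xi)$ have been defined, choose $\eta(\xi + 1) > \eta(\xi)$ such that $f_{\rho(\xi)} \le_I g_{\eta(\xi+1)}$, and choose $\rho(\xi + 1) > \rho(\xi)$ such that $g_{\eta(\xi+1)} <_I f_{\rho(\xi+1)}$. Thus for every $\xi < \lambda$ we have
$$g_{\eta(\xi)} <_I f_{\rho(\xi)} \le_I g_{\eta(\xi+1)}.$$
Since obviously $\langle g_{\eta(\xi)} : \xi < \lambda\rangle$ is strongly increasing under $I$, Lemma 25.28 gives (i).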
The notion of a strongly increasing sequence is clarified by giving an example of a sequence
such that no subsequence is strongly increasing. This example depends on the following
well-known lemma.
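Lemma 25.30. If $\kappa$ is a regular cardinal and $I$ is the ideal $[\kappa]^{<\kappa}$ on $\kappa$, then there is a sequence $f \overset{\mathrm{def}}{=} \langle f_\xi : \xi < \kappa^+\rangle$ of members of ${}^\kappa\kappa$ such that $f_\xi <_I f_\eta$ whenever $\xi < \eta < \kappa^+$.
Proof. We construct the sequence by recursion. Let $f_0(\alpha) = 0$ for all $\alpha < \kappa$. If $f_\xi$ has been defined, let $f_{\xi+1}(\alpha) = f_\xi(\alpha) + 1$ for all $\alpha < \kappa$. Now suppose that $\xi < \kappa^+$ is a limit ordinal, and $f_\eta$ has been defined for every $\eta < \xi$. Let $\langle \eta(\beta) : \beta < \gamma\rangle$ be a strictly increasing sequence of ordinals with supremum $\xi$, where $\gamma = \mathrm{cf}(\xi)$. Thus $\gamma \le \kappa$. Define
$$f_\xi(\alpha) = \Bigl(\sup_{\beta < \gamma,\ \beta \le \alpha} f_{\eta(\beta)}(\alpha)\Bigr) + 1.$$
The sequence constructed this way is as desired. For example, if $\xi$ is a limit ordinal as above, then for each $\beta < \gamma$ we have $\{\alpha < \kappa : f_{\eta(\beta)}(\alpha) \ge f_\xi(\alpha)\} \subseteq \beta$, and so $f_{\eta(\beta)} <_I f_\xi$.
Now let $A = \kappa$ and let $I$ and $f$ be as in the lemma. Suppose that $f$ has a strongly increasing subsequence of length $\kappa^+$ under $I$. Then by Proposition 25.29, $f$ has an exact upper bound $h$ such that $\mathrm{cf}(h(\alpha)) = \kappa^+$ for all $\alpha < \kappa$. Now the function $k$ with domain $\kappa$ taking the constant value $\kappa$ is clearly an upper bound for $f$. Hence $h \le_I k$. Hence there is an $\alpha < \kappa$ such that $h(\alpha) \le k(\alpha) = \kappa$, contradiction.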
A further fact along these lines is as follows.
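Lemma 25.31. Suppose that $I = [\omega]^{<\omega}$ and $f \overset{\text{def}}{=} \langle f_\xi : \xi < \lambda\rangle$ is a $<_I$-increasing sequence of members of ${}^\omega\omega$ which has an exact upper bound $h$, where $\lambda$ is an infinite cardinal. Then $\langle f_\xi : \xi < \lambda\rangle$ is a scale, i.e., for any $g \in {}^\omega\omega$ there is a $\xi < \lambda$ such that $g <_I f_\xi$.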
Proof. Let $k(m) = \omega$ for all $m < \omega$. Then $k$ is an upper bound for $f$ under $<_I$, and so $h \le_I k$. Letting $h'(m) = \min(h(m), k(m))$ for all $m \in \omega$, we thus get $h =_I h'$. So by Proposition 25.15, $h'$ is also an exact upper bound for $f$. Hence we may assume that $h(m) \le \omega$ for every $m < \omega$. Now we claim
(1) $\exists n < \omega\ \forall p \ge n\ (0 < h(p))$.
In fact, the set $\{p \in \omega : f_0(p) \ge h(p)\}$ is in $I$, so there is an $n$ such that $f_0(p) < h(p)$ for all $p \ge n$, as desired in (1).
Let n0 be as in (1).
(2) $M \overset{\text{def}}{=} \{p \in \omega : h(p) \ne \omega\}$ is finite.
For, suppose that $M$ is infinite. Define
$$l(p) = \begin{cases} h(p) - 1 & \text{if } 0 < h(p) < \omega, \\ 0 & \text{otherwise.} \end{cases}$$
We claim that $l <_I h$. For, $\{p : l(p) \ge h(p)\} \subseteq \{p : h(p) = 0\} \in I$. So our claim holds. Now by exactness, choose $\xi < \lambda$ such that $l <_I f_\xi$. Then we can choose $p \in M$ such that $l(p) < f_\xi(p) < h(p)$, contradiction.
Thus M is finite. Hence by Proposition 25.15 we may assume that h(p) = for all p,
and the desired conclusion of the lemma follows.
Now there is a model $M$ of ZFC in which there are no scales (see for example Blass []), and yet it is easy to see that there is a sequence $f \overset{\text{def}}{=} \langle f_\xi : \xi < \omega_1\rangle$ which is $<_I$-increasing. Hence by Lemma 25.31, this sequence does not have an exact upper bound.
Another fact which helps the intuition on exact upper bounds is as follows.
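Lemma 25.32. Let $\kappa$ be a regular cardinal, and let $I = [\kappa]^{<\kappa}$. For each $\xi < \kappa$ let $f_\xi \in {}^\kappa\kappa$ be defined by $f_\xi(\alpha) = \xi$ for all $\alpha < \kappa$. Thus $f \overset{\text{def}}{=} \langle f_\xi : \xi < \kappa\rangle$ is increasing everywhere. Claim: $f$ does not have a least upper bound under $<_I$. (Hence it does not have an exact upper bound.)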
Proof. Suppose that h is an upper bound for f under <I . We find another upper
bound k for f under <I such that h is not I k. First we claim
(1) < < ( h()).
In fact, otherwise we get a < such that for all < there is a > such that
> h(). But then |{ < : f () > h()}| = , contradiction.
By (1) there is a strictly increasing sequence h : < i of ordinals less than such
that for all < and all we have < h(). Now we define k by setting, for
each < ,


if +1 < +2 ,
k() =
h() otherwise.
To see that k is an upper bound for f under <I , take any < . If +1 , then
h() + 1, and hence k() = f (), as desired. For each < we have k(+1 ) =
< h(+1 ), so h is not I k.
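Now we define a partition property. Suppose that $I$ is an ideal over a set $A$, $\lambda$ is an uncountable regular cardinal $> |A|$, $f = \langle f_\xi : \xi < \lambda\rangle$ is a $<_I$-increasing sequence of members of ${}^A\mathrm{Ord}$, and $\kappa$ is a regular cardinal such that $|A| < \kappa \le \lambda$. The following property of these things is denoted by $(*)_\kappa$:
$(*)_\kappa$ For all unbounded $X \subseteq \lambda$ there is an $X_0 \subseteq X$ of order type $\kappa$ such that $\langle f_\xi : \xi \in X_0\rangle$ is strongly increasing under $I$.
Proposition 25.33. Assume the above notation, with $\kappa < \lambda$. Then $(*)_\kappa$ holds iff the set
$$\{\delta < \lambda : \mathrm{cf}(\delta) = \kappa \text{ and } \langle f_\xi : \xi \in X_0\rangle \text{ is strongly increasing under } I \text{ for some unbounded } X_0 \subseteq \delta\}$$
is stationary in $\lambda$.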
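Proof. Let $S$ be the indicated set of ordinals $\delta$.
$\Rightarrow$: Assume $(*)_\kappa$ and suppose that $C \subseteq \lambda$ is a club. Choose $C_0 \subseteq C$ of order type $\kappa$ such that $\langle f_\xi : \xi \in C_0\rangle$ is strongly increasing under $I$. Let $\delta = \sup(C_0)$. Clearly $\delta \in C \cap S$.
$\Leftarrow$: Assume that $S$ is stationary in $\lambda$, and suppose that $X \subseteq \lambda$ is unbounded. Define
$$C = \{\alpha \in \lambda : \alpha \text{ is a limit ordinal and } X \cap \alpha \text{ is unbounded in } \alpha\}.$$
We check that $C$ is club in $\lambda$. For closure, suppose that $\alpha < \lambda$ is a limit ordinal and $C \cap \alpha$ is unbounded in $\alpha$; we want to show that $\alpha \in C$. So, we need to show that $X \cap \alpha$ is unbounded in $\alpha$. To this end, take any $\beta < \alpha$; we want to find $\gamma \in X \cap \alpha$ such that $\beta < \gamma$. Since $C \cap \alpha$ is unbounded in $\alpha$, choose $\delta \in C \cap \alpha$ such that $\beta < \delta$. By the definition of $C$ we have that $X \cap \delta$ is unbounded in $\delta$. So we can choose $\gamma \in X \cap \delta$ such that $\beta < \gamma$. Since $\gamma < \delta < \alpha$, $\gamma$ is as desired. So, indeed, $C$ is closed.
To show that $C$ is unbounded in $\lambda$, take any $\alpha < \lambda$; we want to find a $\beta \in C$ such that $\alpha < \beta$. Since $X$ is unbounded in $\lambda$, we can choose a sequence $\gamma_0 < \gamma_1 < \cdots$ of elements of $X$ with $\alpha < \gamma_0$. Now $\lambda$ is uncountable and regular, so $\sup_{n \in \omega} \gamma_n < \lambda$, and it is the member of $C$ we need.
Now choose $\delta \in C \cap S$. This gives us an unbounded set $X_0$ in $\delta$ such that $\langle f_\xi : \xi \in X_0\rangle$ is strongly increasing under $I$. Now also $X \cap \delta$ is unbounded, since $\delta \in C$. Hence we can define by induction two increasing sequences $\langle \eta(\nu) : \nu < \kappa\rangle$ and $\langle \rho(\nu) : \nu < \kappa\rangle$ such that each $\eta(\nu)$ is in $X_0$, each $\rho(\nu)$ is in $X$, and $\eta(\nu) < \rho(\nu) \le \eta(\nu+1)$ for all $\nu < \kappa$. It follows by the sandwich argument, Lemma 25.28, that $X_1 \overset{\text{def}}{=} \{\rho(\nu) : \nu < \kappa\}$ is a subset of $X$ as desired in $(*)_\kappa$.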
Finally, we introduce the bounding projection property.
Suppose that $f = \langle f_\xi : \xi < \lambda\rangle$ is a $<_I$-increasing sequence of functions in ${}^A\mathrm{Ord}$, with $\lambda$ a regular cardinal $> |A|$. Also suppose that $\kappa$ is a regular cardinal and $|A| < \kappa \le \lambda$. We say that $f$ has the bounding projection property for $\kappa$ iff whenever $\langle S(a) : a \in A\rangle$ is a system of nonempty sets of ordinals such that each $|S(a)| < \kappa$ and for each $\xi < \lambda$ we have $f_\xi <_I \sup(S)$, then for some $\xi < \lambda$, the function $\mathrm{proj}(f_\xi, \langle S(a) : a \in A\rangle)$ $<_I$-bounds $f$.
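To make the projection concrete, here is a minimal finite sketch in Python. It assumes that natural numbers stand in for ordinals, that $A$ and every $S(a)$ are finite, and that $\mathrm{proj}(f, S)(a) = \min(S(a) \setminus f(a))$ whenever $f(a) < \sup S(a)$, which is how the projection is computed later in this section. The function name `proj` and the choice to return `None` at coordinates where $f(a) \ge \sup S(a)$ are illustrative assumptions, not part of the text.

```python
from typing import Dict, Hashable, Optional, Set


def proj(f: Dict[Hashable, int], S: Dict[Hashable, Set[int]]) -> Dict[Hashable, Optional[int]]:
    """Finite stand-in for proj(f, S): at each a, the least member of S(a) that is >= f(a).

    Since an ordinal is the set of smaller ordinals, S(a) \\ f(a) is the set of
    members of S(a) that are not below f(a). For a finite S(a), sup S(a) = max S(a).
    Returning None where f(a) >= sup S(a) is an assumption made for this sketch.
    """
    result: Dict[Hashable, Optional[int]] = {}
    for a, value in f.items():
        if value < max(S[a]):
            result[a] = min(s for s in S[a] if s >= value)
        else:
            result[a] = None
    return result


f = {"a": 3, "b": 0, "c": 9}
S = {"a": {1, 5, 8}, "b": {2}, "c": {4, 9}}
projected = proj(f, S)
```

Where the projection is defined it lies in $S(a)$ and is at least $f(a)$; so it can take at most $|S(a)|$ different values at the coordinate $a$, and this is the fact the counting arguments below exploit.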
We need the following simple result related to Proposition 25.15.
Proposition 25.34. Suppose that $f = \langle f_\xi : \xi < \lambda\rangle$ is a $<_I$-increasing sequence of functions in ${}^A\mathrm{Ord}$, with $\lambda$ a regular cardinal $> |A|$. Also suppose that $\kappa$ is a regular cardinal and $|A| < \kappa \le \lambda$. Assume that $f$ has the bounding projection property for $\kappa$. Also suppose that $f' = \langle f'_\xi : \xi < \lambda\rangle$ is a sequence of functions in ${}^A\mathrm{Ord}$, and $f_\xi =_I f'_\xi$ for every $\xi < \lambda$.
Then $f'$ has the bounding projection property for $\kappa$.
Proof. Clearly f is <I -increasing, so that the setup for the bounding projection
property holds. Now suppose that hS(a) : a Ai is a system of nonempty sets of ordinals
such that each |S(a)| < and for each < we have f <I sup(S). Then the same
is true for f , so by the bounding projection property for f we can choose < such
that the function proj(f , hS(a) : a Ai) <I -bounds f . Now suppose that < . Then
f I proj(f , hS(a) : a Ai). Hence f I proj(f , hS(a) : a Ai), and by Proposition
25.26(ii), proj(f , hS(a) : a Ai) = proj(f , hS(a) : a Ai), as desired.
The following proposition shows that we can weaken the bounding projection property somewhat, by replacing $<_I$ by $<$ (everywhere).
Proposition 25.35. Suppose that $f = \langle f_\xi : \xi < \lambda\rangle$ is a $<_I$-increasing sequence of functions in ${}^A\mathrm{Ord}$, with $\lambda$ a regular cardinal $> |A|$. Also suppose that $\kappa$ is a regular cardinal and $|A| < \kappa \le \lambda$. Then the following conditions are equivalent:
(i) $f$ has the bounding projection property for $\kappa$.
(ii) If $\langle S(a) : a \in A\rangle$ is a system of nonempty sets of ordinals such that each $|S(a)| < \kappa$ and for each $\xi < \lambda$ we have $f_\xi < \sup(S)$ (everywhere), then for some $\xi < \lambda$, the function $\mathrm{proj}(f_\xi, \langle S(a) : a \in A\rangle)$ $<_I$-bounds $f$.
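Proof. Obviously (i)$\Rightarrow$(ii). Now assume that (ii) holds, and suppose that $\langle S(a) : a \in A\rangle$ is a system of sets of ordinals such that each $|S(a)| < \kappa$ and for each $\xi < \lambda$ we have $f_\xi <_I \sup(S)$. Now for each $a \in A$ let
$$\beta(a) = \begin{cases} \sup\{f_\xi(a) + 1 : \xi < \lambda \text{ and } f_\xi(a) \ge \sup(S(a))\} & \text{if this set is nonempty,} \\ \sup(S(a)) + 1 & \text{otherwise;} \end{cases}$$
$$S'(a) = S(a) \cup \{\beta(a)\}.$$
Note that $f_\xi < \sup(S')$ everywhere. Hence by (ii), there is a $\xi < \lambda$ such that the function $\mathrm{proj}(f_\xi, \langle S'(a) : a \in A\rangle)$ $<_I$-bounds $f$. Now let $\eta < \lambda$. If $f_\xi(a) < \sup(S(a))$ and $f_\eta(a) < (\mathrm{proj}(f_\xi, \langle S'(a) : a \in A\rangle))(a)$, then
$$(\mathrm{proj}(f_\xi, \langle S'(a) : a \in A\rangle))(a) = \min(S'(a) \setminus f_\xi(a)) = \min(S(a) \setminus f_\xi(a)) = (\mathrm{proj}(f_\xi, \langle S(a) : a \in A\rangle))(a).$$
Hence $f_\eta <_I \mathrm{proj}(f_\xi, \langle S(a) : a \in A\rangle)$, as desired.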
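The enlargement of each $S(a)$ by one extra point in the proof above can be checked on a finite toy column. The sketch below is only an illustration under stated assumptions: natural numbers stand in for ordinals, a single coordinate $a$ is fixed, the list `column` plays the role of the values $f_\xi(a)$ for finitely many $\xi$, and the helper names `extra_point` and `least_at_or_above` are hypothetical.

```python
from typing import List, Set


def extra_point(column: List[int], s: Set[int]) -> int:
    """The added point at one coordinate: the supremum of f_xi(a) + 1 over the
    values that are >= sup S(a), or sup S(a) + 1 if there are no such values."""
    top = max(s)  # sup of a finite nonempty set
    overshoot = [v + 1 for v in column if v >= top]
    return max(overshoot) if overshoot else top + 1


def least_at_or_above(s: Set[int], v: int) -> int:
    """min(S \\ v): the least member of s that is not below v."""
    return min(x for x in s if x >= v)


column = [0, 2, 4, 7, 9]   # hypothetical values f_xi(a)
s = {3, 6}                 # hypothetical S(a)
b = extra_point(column, s)
s_prime = s | {b}

# Every value is now strictly below sup S'(a).
strictly_bounded = all(v < max(s_prime) for v in column)
# Where a value was already below sup S(a), projecting onto S or S' agrees.
agree = all(
    least_at_or_above(s, v) == least_at_or_above(s_prime, v)
    for v in column
    if v < max(s)
)
```

The new point sits above everything in $S(a)$, so it is never the minimum picked by a projection whose argument is already below $\sup S(a)$; that is why the two projections coincide there.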
Lemma 25.36. (Bounding projection lemma) Suppose that $I$ is an ideal over $A$, $\lambda > |A|$ is a regular cardinal, $f = \langle f_\xi : \xi < \lambda\rangle$ is a $<_I$-increasing sequence satisfying $(*)_\kappa$ for a regular cardinal $\kappa$ such that $|A| < \kappa \le \lambda$. Then $f$ has the bounding projection property for $\kappa$.
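Proof. Assume the hypothesis of the lemma and of the bounding projection property for $\kappa$. For every $\xi < \lambda$ let
$$f_\xi^+ = \mathrm{proj}(f_\xi, S).$$
Suppose that the conclusion of the bounding projection property fails. Then for every $\xi < \lambda$, the function $f_\xi^+$ is not a bound for $f$, and so there is a $\xi' < \lambda$ such that $f_{\xi'} \not\le_I f_\xi^+$. Since $f_\xi \le_I f_\xi^+$, we must have $\xi < \xi'$. Clearly for any $\xi'' \ge \xi'$ we have $f_{\xi''} \not\le_I f_\xi^+$. Thus for every $\xi'' \ge \xi'$ we have $<(f_\xi^+, f_{\xi''}) \in I^+$. Now we define a sequence $\langle \xi(\mu) : \mu < \lambda\rangle$ of elements of $\lambda$ by recursion. Let $\xi(0) = 0$. Suppose that $\xi(\mu)$ has been defined. Choose $\xi(\mu+1) > \xi(\mu)$ so that $<(f_{\xi(\mu)}^+, f_\eta) \in I^+$ for every $\eta \ge \xi(\mu+1)$. If $\nu$ is limit and $\xi(\rho)$ has been defined for all $\rho < \nu$, let $\xi(\nu) = \sup_{\rho < \nu} \xi(\rho)$. Then let $X$ be the range of this sequence. Thus
if $\xi, \eta \in X$ and $\xi < \eta$, then $<(f_\xi^+, f_\eta) \in I^+$.
Since $(*)_\kappa$ holds, there is a subset $X_0 \subseteq X$ of order type $\kappa$ such that $\langle f_\xi : \xi \in X_0\rangle$ is strongly increasing under $I$. Let $\langle Z_\xi : \xi \in X_0\rangle$ be as in the definition of strongly increasing under $I$.
For every $\xi \in X_0$, let $\xi'$ be the successor of $\xi$ in $X_0$. Note that
$$<(f_\xi^+, f_{\xi'}) \setminus (Z_\xi \cup Z_{\xi'} \cup \{a \in A : f_\xi(a) \ge \sup(S(a))\}) \in I^+,$$
and hence it is nonempty. So, choose
$$a_\xi \in {<(f_\xi^+, f_{\xi'})} \setminus (Z_\xi \cup Z_{\xi'} \cup \{a \in A : f_\xi(a) \ge \sup(S(a))\}).$$
Note that this implies that $f_\xi^+(a_\xi) \in S(a_\xi)$. Since $\kappa > |A|$, we can find a single $a \in A$ such that $a = a_\xi$ for all $\xi$ in a subset $X_1$ of $X_0$ of size $\kappa$. Now for $\xi_1 < \xi_2$ with both in $X_1$, we have
$$f_{\xi_1}^+(a) < f_{\xi_1'}(a) \le f_{\xi_2}(a) \le f_{\xi_2}^+(a).$$
[The first inequality is a consequence of $a = a_{\xi_1} \in {<(f_{\xi_1}^+, f_{\xi_1'})}$, the second follows from $\xi_1' \le \xi_2$ and the fact that
$$a = a_{\xi_1} = a_{\xi_2} \in A \setminus (Z_{\xi_1'} \cup Z_{\xi_2}),$$
and the third is true by the definition of $f_{\xi_2}^+$.]
Thus $\langle f_\xi^+(a) : \xi \in X_1\rangle$ is a strictly increasing sequence of members of $S(a)$. This contradicts our assumption that $|S(a)| < \kappa$.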
The next lemma reduces the problem of finding an exact upper bound to that of finding a
least upper bound.
Lemma 25.37. Suppose that $I$ is a proper ideal over $A$, $\lambda \ge |A|^+$ is a regular cardinal, and $f = \langle f_\xi : \xi \in \lambda\rangle$ is a $<_I$-increasing sequence of functions in ${}^A\mathrm{Ord}$ satisfying the bounding projection property for $|A|^+$. Suppose that $h$ is a least upper bound for $f$. Then $h$ is an exact upper bound.

Proof. Assume the hypotheses, and suppose that $g <_I h$; we want to find $\xi < \lambda$ such that $g <_I f_\xi$. By increasing $h$ on a subset of $A$ in the ideal, we may assume that $g < h$ everywhere. Define $S_a = \{g(a), h(a)\}$ for every $a \in A$. By the bounding projection property we get a $\xi < \lambda$ such that $f_\xi^+ \overset{\text{def}}{=} \mathrm{proj}(f_\xi, \langle S_a : a \in A\rangle)$ is an upper bound for $f$. We shall prove that $g <_I f_\xi$, as required.
Since $h$ is a least upper bound, it follows that $h \le_I f_\xi^+$. Thus $M \overset{\text{def}}{=} \{a \in A : h(a) > f_\xi^+(a)\} \in I$. Also, the set $N \overset{\text{def}}{=} \{a \in A : f_\xi(a) \ge \sup S_a\}$ is in $I$. Suppose that $a \in A \setminus (M \cup N)$. Then $g(a) < h(a) \le f_\xi^+(a) = \min(S_a \setminus f_\xi(a))$, and since $g(a) \in S_a$, this implies that $g(a) < f_\xi(a)$. So $g <_I f_\xi$, as desired.
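The last step of this proof rests on a small case distinction for the two-point sets $S_a = \{g(a), h(a)\}$, which can be checked exhaustively on small numbers. This is a sketch under stated assumptions: natural numbers stand in for ordinals, one coordinate is considered at a time, and `proj_two_point` is a hypothetical helper name.

```python
def proj_two_point(f_a: int, g_a: int, h_a: int) -> int:
    """min({g(a), h(a)} \\ f(a)), assuming g(a) < h(a) and f(a) < h(a) = sup S_a."""
    assert g_a < h_a and f_a < h_a
    return min(x for x in (g_a, h_a) if x >= f_a)


checked = 0
for g_a in range(6):
    for h_a in range(g_a + 1, 7):
        for f_a in range(h_a):
            value = proj_two_point(f_a, g_a, h_a)
            if value >= h_a:
                # The projection reached h(a), so g(a) was skipped: g(a) < f(a).
                assert g_a < f_a
            else:
                # Otherwise the projection stopped at g(a), so f(a) <= g(a).
                assert value == g_a and f_a <= g_a
            checked += 1
```

So at any coordinate outside $M \cup N$, where $h(a) \le f_\xi^+(a)$, the projection must have passed over $g(a)$, which is exactly the inequality $g(a) < f_\xi(a)$ used in the proof.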
Here is our first existence theorem for exact upper bounds.
Theorem 25.38. (Existence of exact upper bounds) Suppose that $I$ is a proper ideal over $A$, $\lambda > |A|^+$ is a regular cardinal, and $f = \langle f_\xi : \xi \in \lambda\rangle$ is a $<_I$-increasing sequence of functions in ${}^A\mathrm{Ord}$ that satisfies the bounding projection property for $|A|^+$. Then $f$ has an exact upper bound.
Proof. Assume the hypotheses. By Lemma 25.37 it suffices to show that f has a
least upper bound, and to do this we will apply Proposition 25.17. Suppose that f does
not have a least upper bound. Since it obviously has an upper bound, this means, by
Proposition 25.17:
(1) For every upper bound $h \in {}^A\mathrm{Ord}$ for $f$ there is another upper bound $h'$ for $f$ such that $h' \le_I h$ and $\{a \in A : h'(a) < h(a)\} \in I^+$.
In fact, Proposition 25.17(ii) says that there is another upper bound $h'$ for $f$ such that $h' \le_I h$ and it is not true that $h =_I h'$. Hence $\{a \in A : h(a) < h'(a)\} \in I$ and $\{a \in A : h(a) \ne h'(a)\} \in I^+$. So
$$\{a \in A : h(a) \ne h'(a)\} \setminus \{a \in A : h(a) < h'(a)\} \in I^+ \quad\text{and}$$
$$\{a \in A : h(a) \ne h'(a)\} \setminus \{a \in A : h(a) < h'(a)\} = \{a \in A : h'(a) < h(a)\},$$
so (1) follows.
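Now we shall define by induction on $\alpha < |A|^+$ a sequence $S^\alpha = \langle S^\alpha(a) : a \in A\rangle$ of sets of ordinals satisfying the following conditions:
(2) $0 < |S^\alpha(a)| \le |A|$ for each $a \in A$;
(3) $f_\xi(a) < \sup S^\alpha(a)$ for all $\xi \in \lambda$ and $a \in A$;
(4) If $\alpha < \beta$, then $S^\alpha(a) \subseteq S^\beta(a)$, and if $\delta$ is a limit ordinal, then $S^\delta(a) = \bigcup_{\alpha < \delta} S^\alpha(a)$.
We also define sequences $\langle h_\alpha : \alpha < |A|^+\rangle$ and $\langle h'_\alpha : \alpha < |A|^+\rangle$ of functions and $\langle \xi(\alpha) : \alpha < |A|^+\rangle$ of ordinals.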
The definition of $S^\delta$ for limit $\delta$ is fixed by (4), and the conditions (2)–(4) continue to hold. To define $S^0$, pick any function $k$ that bounds $f$ (everywhere) and define $S^0(a) = \{k(a)\}$ for all $a \in A$; so (2)–(4) hold.
Suppose that $S^\alpha = \langle S^\alpha(a) : a \in A\rangle$ has been defined, satisfying (2)–(4); we define $S^{\alpha+1}$. By the bounding projection property for $|A|^+$, there is a $\xi(\alpha) < \lambda$ such that $h_\alpha \overset{\text{def}}{=} \mathrm{proj}(f_{\xi(\alpha)}, S^\alpha)$ is an upper bound for $f$ under $<_I$. Then
(5) if $\xi(\alpha) < \xi < \lambda$, then $h_\alpha =_I \mathrm{proj}(f_\xi, S^\alpha)$.
In fact, recall that h (a) = min(S (a)\f() (a)) for every a A, using (3). Now suppose
that () < < . Let M = {a A : f() (a) f (a)}. So M I. For any a A\M
we have f() (a) < f (a), and hence
min(S (a)\f() (a)) min(S (a)\f (a));
it follows that h I proj(f , S ). For the other direction, recall that h is an upper
bound for f under <I . So f I h . If a is any element of A such that f (a) h (a)
then, since h (a) S (a), we get min(S (a)\f (a)) h (a). Thus proj(f , S ) I h .
This checks (5).
Now we apply (1) to get an upper bound h for f such that h I h and < (h , h )
I + . We now define S +1 (a) = S (a) {h (a)} for any a A.
(6) If () < , then proj(f , S +1 ) =I h .
For, we have f I h and, by (5), h =I proj(f , S ). If a A is such that f (a) h (a),
h (a) h (a), and h (a) = proj(f , S )(a), then min(S (a)\f (a)) = h (a) h (a)
f (a), and hence
proj(f , S +1 )(a) = min(S +1 (a)\f (a)) = h (a).
It follows that proj(f , S +1 ) =I h , as desired in (6).
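Now since $|A|^+ < \lambda$, let $\xi < \lambda$ be greater than each $\xi(\alpha)$ for $\alpha < |A|^+$. Define $H_\alpha = \mathrm{proj}(f_\xi, S^\alpha)$ for each $\alpha < |A|^+$. Since $\xi > \xi(\alpha)$, we have $H_\alpha =_I h_\alpha$ by (5). Note that $H_{\alpha+1} = \mathrm{proj}(f_\xi, S^{\alpha+1}) =_I h'_\alpha$; so $<(H_{\alpha+1}, H_\alpha) \in I^+$. Now clearly by the construction we have $S^{\alpha_1}(a) \subseteq S^{\alpha_2}(a)$ for all $a \in A$ when $\alpha_1 < \alpha_2 < |A|^+$. Hence we get
(7) if $\alpha_1 < \alpha_2 < |A|^+$, then $H_{\alpha_2} \le H_{\alpha_1}$, and $<(H_{\alpha_2}, H_{\alpha_1}) \in I^+$.
Now for every $\alpha < |A|^+$ pick $a_\alpha \in A$ such that $H_{\alpha+1}(a_\alpha) < H_\alpha(a_\alpha)$. We have $a_\alpha = a_\beta$ for all $\alpha, \beta$ in some subset of $|A|^+$ of size $|A|^+$, and this gives an infinite decreasing sequence of ordinals, contradiction.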
The following lemma gives a slight extension of Theorem 25.38.
Lemma 25.39. Suppose that $I$ is a proper ideal over $A$, $\lambda \ge |A|^+$ is a regular cardinal, $f = \langle f_\xi : \xi < \lambda\rangle$ is a $<_I$-increasing sequence of functions in ${}^A\mathrm{Ord}$, $|A|^+ \le \kappa \le \lambda$, $f$ satisfies the bounding projection property for $\kappa$, and $g$ is an exact upper bound for $f$. Then
$$\{a \in A : g(a) \text{ is non-limit, or } \mathrm{cf}(g(a)) < \kappa\} \in I.$$
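Proof. Let $P = \{a \in A : g(a) \text{ is non-limit, or } \mathrm{cf}(g(a)) < \kappa\}$. If $a \in P$ and $g(a)$ is a limit ordinal, choose $S(a) \subseteq g(a)$ cofinal in $g(a)$ and of order type $< \kappa$. If $g(a) = 0$ let $S(a) = \{0\}$, and if $g(a) = \beta + 1$ for some $\beta$ let $S(a) = \{\beta\}$. Finally, if $g(a)$ is limit but $a$ is not in $P$, let $S(a) = \{g(a)\}$.
Now for any $\xi < \lambda$ let
$$N_\xi = \{a \in A : f_\xi(a) \ge f_{\xi+1}(a)\} \quad\text{and}\quad Q_\xi = \{a \in A : f_{\xi+1}(a) \ge g(a)\}.$$
Then clearly
$(\star)$ If $a \in A \setminus (N_\xi \cup Q_\xi)$, then $f_\xi(a) < \sup(S(a))$.
It follows that $\{a \in A : f_\xi(a) \ge \sup S(a)\} \subseteq N_\xi \cup Q_\xi \in I$. Hence the hypothesis of the bounding projection property holds. Applying it, we get $\xi < \lambda$ such that $f_\xi^+ \overset{\text{def}}{=} \mathrm{proj}(f_\xi, \langle S(a) : a \in A\rangle)$ $<_I$-bounds $f$. Since $g$ is a least upper bound for $f$, we get $g \le_I f_\xi^+$, and hence $M \overset{\text{def}}{=} \{a \in A : f_\xi^+(a) < g(a)\} \in I$. By $(\star)$, for any $a \in P \setminus (N_\xi \cup Q_\xi)$ we have $f_\xi^+(a) = \min(S(a) \setminus f_\xi(a)) < g(a)$. This shows that $P \setminus (N_\xi \cup Q_\xi) \subseteq M$, hence $P \subseteq N_\xi \cup Q_\xi \cup M \in I$, so $P \in I$, as desired.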
Now we give our main theorem on the existence of exact upper bounds.
Theorem 25.40. Suppose that $I$ is a proper ideal over $A$, $\lambda > |A|^+$ is a regular cardinal, $f = \langle f_\xi : \xi < \lambda\rangle$ is a $<_I$-increasing sequence of functions in ${}^A\mathrm{Ord}$, and $|A|^+ \le \kappa \le \lambda$. Then the following are equivalent:
(i) $(*)_\kappa$ holds for $f$.
(ii) $f$ satisfies the bounding projection property for $\kappa$.
(iii) $f$ has an exact upper bound $g$ such that
$$\{a \in A : g(a) \text{ is non-limit, or } \mathrm{cf}(g(a)) < \kappa\} \in I.$$

Proof. (i)$\Rightarrow$(ii): By the bounding projection lemma, Lemma 25.36.
(ii)$\Rightarrow$(iii): Since $(*)_\kappa$ clearly implies $(*)_{|A|^+}$, this implication is true by Theorem 25.38 and Lemma 25.39.
(iii)$\Rightarrow$(i): Assume (iii). By modifying $g$ on a set in the ideal we may assume that $g(a)$ is a limit ordinal and $\mathrm{cf}(g(a)) \ge \kappa$ for all $a \in A$. Choose a club $S(a) \subseteq g(a)$ of order type $\mathrm{cf}(g(a))$. Thus the order type of $S(a)$ is $\ge \kappa$. We prove that $(*)_\kappa$ holds. So, assume that $X \subseteq \lambda$ is unbounded; we want to find $X_0 \subseteq X$ of order type $\kappa$ over which $f$ is strongly increasing under $I$. To do this, we intend to define by induction on $\alpha < \kappa$ a function $h_\alpha \in \prod S$ and an index $\xi(\alpha) \in X$ such that
(1) $h_\alpha <_I f_{\xi(\alpha)} \le_I h_{\alpha+1}$.
(2) The sequence $\langle h_\alpha : \alpha < \kappa\rangle$ is $<$-increasing (increasing everywhere; and hence it certainly is strongly increasing under $I$).
(3) $\langle \xi(\alpha) : \alpha < \kappa\rangle$ is strictly increasing.
After we have done this, the sandwich argument (Lemma 25.28) shows that $\langle f_{\xi(\alpha)} : \alpha < \kappa\rangle$ is strongly increasing under $I$ and of order type $\kappa$, giving the desired result.
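The functions $h_\alpha$ are defined as follows.
$h_0 \in \prod S$ is arbitrary.
For a limit ordinal $\delta < \kappa$ let $h_\delta = \sup_{\alpha < \delta} h_\alpha$.
Having defined $h_\alpha$, we define $h_{\alpha+1}$ as follows. Since $g$ is an exact upper bound and $h_\alpha < g$, choose $\xi(\alpha)$ greater than all $\xi(\beta)$ for $\beta < \alpha$ such that $h_\alpha <_I f_{\xi(\alpha)}$. Also, since $f_\xi <_I g$ for all $\xi < \lambda$, the projections $f_\xi^+ = \mathrm{proj}(f_\xi, S)$ are defined. We define
$$h_{\alpha+1}(a) = \begin{cases} \max(h_\alpha(a), f_{\xi(\alpha)}^+(a)) + 1 & \text{if } f_{\xi(\alpha)}(a) < g(a), \\ h_\alpha(a) + 1 & \text{if } f_{\xi(\alpha)}(a) \ge g(a). \end{cases}$$
Thus we have
$$h_\alpha <_I f_{\xi(\alpha)} \le_I h_{\alpha+1}, \text{ for every } \alpha.$$
So conditions (1)–(3) hold.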
Now we apply some infinite combinatorics to get information about () .
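Lemma 25.41. Suppose that:
(i) $I$ is an ideal over $A$.
(ii) $\kappa$ and $\lambda$ are regular cardinals such that $|A| < \kappa$ and $\kappa^{++} < \lambda$.
(iii) $f = \langle f_\xi : \xi < \lambda\rangle$ is a sequence of length $\lambda$ of functions in ${}^A\mathrm{Ord}$ that is $<_I$-increasing and satisfies the following condition:
For every $\delta < \lambda$ with $\mathrm{cf}(\delta) = \kappa^{++}$ there is a club $E_\delta \subseteq \delta$ such that for some $\delta'$ with $\delta \le \delta' < \lambda$,
$$(\star)\qquad \sup\{f_\alpha : \alpha \in E_\delta\} \le_I f_{\delta'}.$$
Under these assumptions, $(*)_\kappa$ holds for $f$.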


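Proof. Assume the hypotheses. Let $S = S_\kappa^{\kappa^{++}}$; so $S$ is stationary in $\kappa^{++}$. By Theorem 25.1, let $\langle C_\delta : \delta \in S\rangle$ be a club guessing sequence for $S$; thus
(1) For every $\delta \in S$, the set $C_\delta \subseteq \delta$ is a club of order type $\kappa$.
(2) For every club $D \subseteq \kappa^{++}$ there is a $\delta \in D \cap S$ such that $C_\delta \subseteq D$.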
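Now let $U \subseteq \lambda$ be unbounded; we want to find $X_0 \subseteq U$ of order type $\kappa$ such that $\langle f_\xi : \xi \in X_0\rangle$ is strongly increasing under $I$. To do this we first define an increasing continuous sequence $\langle \xi(i) : i < \kappa^{++}\rangle \in {}^{\kappa^{++}}\lambda$ recursively.
Let $\xi(0) = 0$. For $i$ limit, let $\xi(i) = \sup_{k<i} \xi(k)$.
Now suppose for some $i < \kappa^{++}$ that $\xi(k)$ has been defined for every $k \le i$; we define $\xi(i+1)$. For each $\alpha \in S$ we define
$$h_\alpha = \sup\{f_\eta : \eta \in \xi[C_\alpha \cap (i+1)]\} \quad\text{and}$$
$$\sigma_\alpha = \begin{cases} \text{least } \sigma \in (\xi(i), \lambda) \text{ such that } h_\alpha \le_I f_\sigma & \text{if there is such a } \sigma, \\ \xi(i) + 1 & \text{otherwise.} \end{cases}$$
Now we let $\xi(i+1)$ be the least member of $U$ which is greater than $\sup\{\sigma_\alpha : \alpha \in S\}$. It follows that
(3) If $\alpha \in S$ and the first case in the definition of $\sigma_\alpha$ holds, then $h_\alpha <_I f_{\xi(i+1)}$.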
def

Now the set F = {(k) : k ++ } is closed, and has order type ++ . Let = sup F .
Then F is a club of , and cf() = ++ . Hence by the hypothesis (iii) of the lemma, there
is a club E and a [, ) such that () in the lemma holds. Note that F E is
club in .
Let D = 1 [F E ]. Since is strictly increasing and continuous, it follows that D
is club in ++ . Hence by (2) there is an D S such that C D. Hence
def

C = [C ] F E
is club in () of order type . Then by () we have
sup{f : C } I f .
Now
(5) For every < both in C , we have sup{f : C ( + 1)} <I f .
To prove this, note that there is an i < ++ such that = (i). Now follow the definition of
(i + 1). There C was considered (among all other closed unbounded sets in the guessing
sequence), and h was formed at that stage. Now
h = sup{f : [C (i + 1)]} sup{f : [C ]} = sup{f : C } I f ,
so the first case in the definition of holds. Thus by (3), h <I f(i+1) . Clearly
(i + 1) , so (5) follows.
Now let h() : < i be the strictly increasing enumeration of C , and set
X0 = {( + 2m) : < , 0 < m }.
Suppose that X0 . Say = ( + 2m) with < and 0 < m . If X0 ,


then < ( + 2m 1) < , all in C , so
sup{f + 1 : X0 } I f(+2m1)
sup{f : C (( + 2m 1) + 1)}
<I f by (5)
Hence by Proposition 25.27, hf : Xi is very strongly increasing under I.
Now we need a purely combinatorial proposition.
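Proposition 25.42. Suppose that $\kappa$ and $\lambda$ are regular cardinals, and $\kappa^{++} < \lambda$. Suppose that $F$ is a function with domain contained in $[\lambda]^{<\kappa}$ and range contained in $\lambda$. Suppose that for every $\delta \in S_{\kappa^{++}}^\lambda$ there is a closed unbounded set $E_\delta \subseteq \delta$ such that $[E_\delta]^{<\kappa} \subseteq \mathrm{dmn}(F)$. Then the following set is stationary:
$$\{\alpha \in S_\kappa^\lambda : \text{there is a closed unbounded } D \subseteq \alpha \text{ such that for any } a, b \in D \text{ with } a < b,\ \{d \in D : d \le a\} \in \mathrm{dmn}(F) \text{ and } F(\{d \in D : d \le a\}) < b\}$$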
Proof. We follow the proof of Lemma 25.41 closely. Call the indicated set $T$. Let $U$ be a closed unbounded subset of $\lambda$. We want to find a member of $T \cap U$.
++
Let S = S ; so S is stationary in ++ . By Theorem 25.1, let hC : Si be a club
guessing sequence for S; thus
(1) For every S, the set C is a club of order type .
(2) For every club D ++ there is a D S such that C D.
++

We define an increasing continuous sequence h(i) : i < ++ i recursively.


Let (0) be the least member of U . For i limit, let (i) = supk<i (k).
Now suppose for some i < ++ that (k) has been defined for every k i; we define
(i + 1). For each S we consider two possibilities. If [C (i + 1)] dmn(F ), we
let be any ordinal greater than both (i) and F ([C (i + 1)]). Otherwise, we let
= (i) + 1. Since |S| < , we can let (i + 1) be the least member of U greater than all
for S. Hence
(3) If S and the first case in the definition of holds, then [C (i + 1)] dmn(F )
and F ([C (i + 1)]) < (i + 1).
Now the set G = rng() is closed and has order type ++ . Let = sup(G). Hence
by the hypothesis of the proposition, there is a closed unbounded set E such that
[E ]< dmn(F ). Note that G E is also closed unbounded in .
Let $H = \xi^{-1}[G \cap E_\delta]$. Thus $H$ is club in $\kappa^{++}$. Hence by (2) there is an $\alpha \in H \cap S$ such that $C_\alpha \subseteq H$. Hence
def
C = [C ] G E
is club in () of order type . We claim that C is as desired in the proposition. For,
suppose that a, b C and a < b. Write a = (i). Then {d C : d a} = [C ( +
1)] E , and so (3) gives the desired conclusion.
Next we give a condition under which $(*)_\kappa$ holds.
Lemma 25.43. Suppose that $I$ is a proper ideal over a set $A$ of regular cardinals such that $|A| < \min(A)$. Assume that $\lambda > |A|$ is a regular cardinal such that $(\prod A, <_I)$ is $\lambda$-directed, and $\langle g_\xi : \xi < \lambda\rangle$ is a sequence of members of $\prod A$.
Then there is a $<_I$-increasing sequence $f = \langle f_\xi : \xi < \lambda\rangle$ of length $\lambda$ in $\prod A$ such that:
(i) $g_\xi < f_{\xi+1}$ for every $\xi < \lambda$.
(ii) $(*)_\kappa$ holds for $f$, for every regular cardinal $\kappa$ such that $\kappa^{++} < \lambda$ and $\{a \in A : a \le \kappa^{++}\} \in I$.
Proof. Let $f_0$ be any member of $\prod A$. At successor stages, if $f_\xi$ is defined, let $f_{\xi+1}$ be any function in $\prod A$ that $<$-extends $f_\xi$ and $g_\xi$.
At limit stages $\delta$, there are three cases. In the first case, $\mathrm{cf}(\delta) \le |A|$. Fix some $E_\delta \subseteq \delta$ club of order type $\mathrm{cf}(\delta)$, and define
$$f_\delta = \sup\{f_i : i \in E_\delta\}.$$
For any $a \in A$ we have $\mathrm{cf}(\delta) \le |A| < \min(A) \le a$, and so $f_\delta(a) < a$. Thus $f_\delta \in \prod A$.
In the second case, $\mathrm{cf}(\delta) = \kappa^{++}$, where $\kappa$ is regular, $|A| < \kappa$, and $\{a \in A : a \le \kappa^{++}\} \in I$. Then we define $f'_\delta$ as in the first case. Then for any $a \in A$ with $a > \kappa^{++}$ we have $f'_\delta(a) < a$, and so $\{a \in A : a \le f'_\delta(a)\} \in I$, and we can modify $f'_\delta$ on this set which is in $I$ to obtain our desired $f_\delta$.
In the third case, neither of the first two cases holds. Then we let $f_\delta$ be any $\le_I$-upper bound of $\{f_\xi : \xi < \delta\}$; it exists by the $\lambda$-directedness assumption.
This completes the construction. Obviously (i) holds. For (ii), suppose that $\kappa$ is a regular cardinal such that $\kappa^{++} < \lambda$ and $\{a \in A : a \le \kappa^{++}\} \in I$. If $|A| < \kappa$, the desired conclusion follows by Lemma 25.41. In case $\kappa \le |A|$, note that $\langle f_\xi : \xi < \kappa\rangle$ is $<$-increasing, and so is certainly strongly increasing under $I$.
Now we apply these results to the determination of true cofinality for some important
concrete partial orders.
Notation. For any set $X$ of cardinals, let
$$X^{(+)} = \{\alpha^+ : \alpha \in X\}.$$
Theorem 25.44. (Representation of $\mu^+$ as a true cofinality, I) Suppose that $\mu$ is a singular cardinal with uncountable cofinality. Then there is a club $C$ in $\mu$ such that $C$ has order type $\mathrm{cf}(\mu)$, every element of $C$ is greater than $\mathrm{cf}(\mu)$, and
$$\mu^+ = \mathrm{tcf}\left(\prod C^{(+)}, <_{J^{bd}}\right),$$
where $J^{bd}$ is the ideal of all bounded subsets of $C^{(+)}$.


Proof. Let $C_0$ be any closed unbounded set of limit cardinals less than $\mu$ such that $|C_0| = \mathrm{cf}(\mu)$ and all cardinals in $C_0$ are above $\mathrm{cf}(\mu)$. Then
(1) all members of $C_0$ which are limit points of $C_0$ are singular.
In fact, suppose on the contrary that $\alpha \in C_0$, $\alpha$ is a limit point of $C_0$, and $\alpha$ is regular. Thus $C_0 \cap \alpha$ is unbounded in $\alpha$, so $|C_0 \cap \alpha| = \alpha$. But $\mathrm{cf}(\mu) < \alpha$ and $|C_0| = \mathrm{cf}(\mu)$, contradiction. So (1) holds. Hence wlog every member of $C_0$ is singular.
Now we claim
(2) $(\prod C_0^{(+)}, <_{J^{bd}})$ is $\mu$-directed.
In fact, suppose that $F \subseteq \prod C_0^{(+)}$ and $|F| < \mu$. For $a \in C_0^{(+)}$ with $|F| < a$ let $h(a) = \sup_{f \in F} f(a)$; so $h(a) \in a$. For $a \in C_0^{(+)}$ with $a \le |F|$ let $h(a) = 0$. Clearly $f \le_{J^{bd}} h$ for all $f \in F$. So (2) holds.
(3) $(\prod C_0^{(+)}, <_{J^{bd}})$ is $\mu^+$-directed.
In fact, by (2) it suffices to find a bound for a subset $F$ of $\prod C_0^{(+)}$ such that $|F| = \mu$. Write $F = \bigcup_{\alpha < \mathrm{cf}(\mu)} G_\alpha$, with $|G_\alpha| < \mu$ for each $\alpha < \mathrm{cf}(\mu)$. By (2), each $G_\alpha$ has an upper bound $k_\alpha$ under $<_{J^{bd}}$. Then $\{k_\alpha : \alpha < \mathrm{cf}(\mu)\}$ has an upper bound $h$ under $<_{J^{bd}}$. Clearly $h$ is an upper bound for $F$.
Now we are going to apply Lemma 25.43 to $J^{bd}$, $C_0^{(+)}$, and $\mu^+$ in place of $I$, $A$, and $\lambda$; and with anything for $g$. Clearly the hypotheses hold, so we get a $<_{J^{bd}}$-increasing sequence $f = \langle f_\xi : \xi < \mu^+\rangle$ in $\prod C_0^{(+)}$ such that $(*)_\kappa$ holds for $f$ and the bounding projection property holds for $\kappa$, for every regular cardinal $\kappa < \mu$. It also follows that the bounding projection property holds for $|A|^+$, and hence by Theorem 25.38, $f$ has an exact upper bound $h$. Then by Lemma 25.39, for every regular $\kappa < \mu$ we have
$$(\star)\qquad \{a \in C_0^{(+)} : h(a) \text{ is non-limit, or } \mathrm{cf}(h(a)) < \kappa\} \in J^{bd}.$$
Now the identity function $k$ on $C_0^{(+)}$ is obviously an upper bound for $f$, so $h \le_{J^{bd}} k$. By modifying $h$ on a set in $J^{bd}$ we may assume that $h(a) \le a$ for all $a \in C_0^{(+)}$. Now we claim
$(\star\star)$ The set $C_1 \overset{\text{def}}{=} \{\alpha \in C_0 : h(\alpha^+) = \alpha^+\}$ contains a club of $\mu$.
Assume otherwise. Then for every club $K \subseteq \mu$, $K \cap (\mu \setminus C_1) \ne 0$. This means that $\mu \setminus C_1$ is stationary, and hence $S \overset{\text{def}}{=} C_0 \setminus C_1$ is stationary. For each $\alpha \in S$ we have $h(\alpha^+) < \alpha^+$. Hence $\mathrm{cf}(h(\alpha^+)) < \alpha$ since $\alpha$ is singular. Hence by Fodor's theorem $\langle \mathrm{cf}(h(\alpha^+)) : \alpha \in C_0\rangle$ is bounded by some $\kappa < \mu$ on a stationary subset of $S$. This contradicts $(\star)$.
Thus $(\star\star)$ holds, and so there is a club $C \subseteq C_0$ such that $h(\alpha^+) = \alpha^+$ for all $\alpha \in C$. Now $\langle f_\xi \restriction C^{(+)} : \xi < \mu^+\rangle$ is $<_{J^{bd}}$-increasing. We claim that it is cofinal in $(\prod C^{(+)}, <_{J^{bd}})$. For, suppose that $g \in \prod C^{(+)}$. Let $g'$ be the extension of $g$ to $\prod C_0^{(+)}$ such that $g'(a) = 0$ for any $a \in C_0^{(+)} \setminus C^{(+)}$. Then $g' <_{J^{bd}} h$, and so there is a $\xi < \mu^+$ such that $g' <_{J^{bd}} f_\xi$. So $g <_{J^{bd}} f_\xi \restriction C^{(+)}$, as desired. This shows that $\mu^+ = \mathrm{tcf}(\prod C^{(+)}, <_{J^{bd}})$.
Theorem 25.45. (Representation of $\mu^+$ as a true cofinality, II) If $\mu$ is a singular cardinal of countable cofinality, then there is an unbounded set $D \subseteq \mu$ of regular cardinals such that
$$\mu^+ = \mathrm{tcf}\left(\prod D, <_{J^{bd}}\right).$$
Proof. Let $C_0$ be a set of uncountable regular cardinals with supremum $\mu$, of order type $\omega$.
(1) $\prod C_0 / J^{bd}$ is $\mu$-directed.
For, let $X \subseteq \prod C_0$ with $|X| < \mu$. For each $a \in C_0$ such that $|X| < a$, let $h(a) = \sup\{f(a) : f \in X\}$, and extend $h$ to all of $C_0$ in any way. Clearly $h \in \prod C_0$ and it is an upper bound in the $<_{J^{bd}}$ sense for $X$.
From (1) it is clear that $\prod C_0 / J^{bd}$ is also $\mu^+$-directed. By Lemma 25.43 we then get a $<_{J^{bd}}$-increasing sequence $\langle f_\xi : \xi < \mu^+\rangle$ which satisfies $(*)_\kappa$ for every regular $\kappa < \mu^+$. By Theorems 25.38 and 25.40, $f$ has an exact upper bound $h$ such that $\{a \in C_0 : h(a) \text{ is non-limit or } \mathrm{cf}(h(a)) < \kappa\} \in J^{bd}$ for every regular $\kappa < \mu^+$. We may assume that $h(a) \le a$ for all $a \in C_0$, since the identity function is clearly an upper bound for $f$; and we may assume that each $h(a)$ is a limit ordinal of uncountable cofinality since $\{a \in C_0 : \mathrm{cf}(h(a)) < \omega_1\} \in J^{bd}$.
(2) $\mathrm{tcf}\left(\prod_{a \in C_0} \mathrm{cf}(h(a)), <_{J^{bd}}\right) = \mu^+$.
To prove this, for each $a \in C_0$ let $D_a$ be club in $h(a)$ of order type $\mathrm{cf}(h(a))$, and let $\langle \eta^a_\nu : \nu < \mathrm{cf}(h(a))\rangle$ be the strictly increasing enumeration of $D_a$. For each $\xi < \mu^+$ we define $f'_\xi \in \prod_{a \in C_0} \mathrm{cf}(h(a))$ as follows. Since $f_\xi <_{J^{bd}} h$, the set $\{a \in C_0 : f_\xi(a) \ge h(a)\}$ is bounded, so choose $a_0 \in C_0$ such that for all $b \in C_0$ with $a_0 \le b$ we have $f_\xi(b) < h(b)$. For such a $b$ we define $f'_\xi(b)$ to be the least $\nu$ such that $f_\xi(b) < \eta^b_\nu$. Then we extend $f'_\xi$ in any way to a member of $\prod_{a \in C_0} \mathrm{cf}(h(a))$.
(3) $\xi < \sigma < \mu^+$ implies that $f'_\xi \le_{J^{bd}} f'_\sigma$.
This is clear by the definitions.
Now for each $l \in \prod_{a \in C_0} \mathrm{cf}(h(a))$ define $k_l \in \prod C_0$ by setting $k_l(a) = \eta^a_{l(a)}$ for all $a$. So $k_l < h$. Since $h$ is an exact upper bound for $f$, choose $\xi < \mu^+$ such that $k_l <_{J^{bd}} f_\xi$. Choose $a$ such that $k_l(b) < f_\xi(b) < h(b)$ for all $b \ge a$. Then for all $b \ge a$, $\eta^b_{l(b)} < \eta^b_{f'_\xi(b)}$, and hence $l(b) < f'_\xi(b)$. This proves that $l <_{J^{bd}} f'_\xi$. This proves the following statement.
(4) $\{f'_\xi : \xi < \mu^+\}$ is cofinal in $\left(\prod_{a \in C_0} \mathrm{cf}(h(a)), <_{J^{bd}}\right)$.
Now (3) and (4) yield (2).
Now let B = {cf(h(a)) : a C0 }. Define
X J iff X B and h1 [cf 1 [X]] J bd .
Q
By Lemma 25.24 we get tcf( B/J) = + . It suffices now to show that J is the ideal of
bounded subsets of B. Suppose that X J, and choose a C0 such that h1 [cf 1 [X]]
{b C0 : b < a}. Thus X {b A : cf(h(b)) < a} J bd , so X is bounded. Conversely, if
X is bounded, choose a B such that X {b B : b a}. Now
h1 [cf 1 [X]] = {b C0 : cf(h(b)) X}
{b C0 : cf(h(b)) a},
and this is bounded by the choice of h.

EXERCISES
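E25.1. Let $\kappa$ and $\lambda$ be regular cardinals, with $\kappa < \lambda$. We define two statements $CG(\kappa, \lambda)$ and $MAJ(\kappa, \lambda)$:
$CG(\kappa, \lambda)$ iff there is a sequence $\langle C_\alpha : \alpha \in S_\kappa^\lambda\rangle$ such that:
(i) $\forall \alpha \in S_\kappa^\lambda$ [$C_\alpha \subseteq \alpha$ is club, of order type $\kappa$];
(ii) $\forall D \subseteq \lambda$ [$D$ club implies that $\exists \alpha \in S_\kappa^\lambda\ (C_\alpha \subseteq D)$];
$MAJ(\kappa, \lambda)$ iff there is a sequence $\langle f_\alpha : \alpha \in S_\kappa^\lambda\rangle$ such that:
(i) $\forall \alpha \in S_\kappa^\lambda$ [$f_\alpha : \kappa \to \alpha$ is strictly increasing];
(ii) $\forall g : {}^{<\kappa}\lambda \to \lambda\ \exists \alpha \in S_\kappa^\lambda\ \forall \xi < \kappa$ [$g(\langle f_\alpha(\eta) : \eta < \xi\rangle) < f_\alpha(\xi)$].
Thus Theorem 25.1 implies that $CG(\kappa, \kappa^{++})$ holds for any regular cardinal $\kappa$.
Prove that $CG(\kappa, \lambda)$ implies $MAJ(\kappa, \lambda)$.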
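E25.2. Recall the condition $\diamondsuit$:
There are sets $A_\alpha \subseteq \alpha$ for each $\alpha < \omega_1$ such that for every $A \subseteq \omega_1$ the set $\{\alpha < \omega_1 : A \cap \alpha = A_\alpha\}$ is stationary.
We do not need any properties of $\diamondsuit$, but it is of interest that it follows from $V = L$ and implies CH.
Prove that $\diamondsuit$ implies $CG(\omega, \omega_1)$. (It is consistent that $CG(\omega, \omega_1)$ fails, but this is not part of this exercise.)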
E25.3. Suppose that $A$ is infinite, $h \in {}^A\mathrm{Ord}$ is a function having only limit ordinals as values, and $F$ is a nonprincipal ultrafilter on $A$. [This means that each cofinite subset of $A$ is in $F$; cofinite = complement of finite.] Prove that $\prod_{a \in A} h(a)/F$ is infinite.
E25.4. An ultrafilter $F$ on a set $A$ is countably incomplete iff there is a countably infinite partition $P$ of $A$ such that $A \setminus a \in F$ for every $a \in P$. Show that if $F$ is countably incomplete, then $\prod_{a \in A} h(a)/F$ has size at least $2^\omega$. Hint: Let $\langle p_i : i \in \omega\rangle$ enumerate $P$, and for each $i \in \omega$ let $b_i = \bigcup_{i \le j < \omega} p_j$; thus $b_i \in F$. For each $a \in A$, let $c_a = \{i \in \omega : a \in b_i\}$; hence $c_a$ is a finite set. Use these sets to solve the problem.
E25.5. (Continuing E25.3) Suppose that $F$ is a countably incomplete ultrafilter on $A$. Show that $\prod_{a \in A} h(a)/F$ is not well-ordered.
E25.6. Let F = {, \1}, a filter on . Define a subset X of which has two distinct
least upper bounds under F .
E25.7. Give an example of a set A, a collection F A Ord, and an ideal I on A, such that
there is a subset X of F which has a unique least upper bound under <I , but no exact
upper bound. (See the example on page 121.)
E25.8. Suppose that $P$ and $Q$ are partially ordered sets. Show that the following two conditions are equivalent:
(i) There is a function $f : P \to Q$ such that $\forall q \in Q\ \exists p \in P\ \forall r \in P$ [$p \le r$ implies that $q \le f(r)$].
(ii) There is a function $g : Q \to P$ such that $\forall X \subseteq Q$ [$X$ unbounded in $Q$ implies that $g[X]$ is unbounded in $P$].
E25.9. (Continuing E25.8) A partially ordered set P is directed iff p, q P r P [p r
and q r]. Suppose that P and Q are directed. If either of the conditions of E25.8 hold,
we write P Q.
Assume that P Q P . Show that there exist f : P Q and g : Q P such
that for any p P and q Q the following conditions hold:
(a) If g (q) p, then q f (p).
(b) If f (p) q, then p g (q).
E25.10. (Continuing E25.9) Suppose that P and Q are directed partially ordered sets.
Show that the following conditions are equivalent:
(a) P Q and Q P .
(b) There is a partially ordered set R such that both P and Q can be embedded in R
as cofinal subsets. That is, there is an injection f : P R such that p, p P [p P p iff
f (p) R f (p ), with f [P ] cofinal in R, and similarly for Q: there is an injection g : Q R
such that q, q Q[q P q iff g(q) R g(q ), with g[Q] cofinal in R.
Hint: (b)(a) is easy. For (a)(b), assume (a), and suppose that f , g are as in E25.9.
Also assume wlog that P Q = , let R = P Q, and let the order on R extend both of
the orders on P and Q, and in addition write, for p P and q Q,
p R q iff
q R p iff

p P p[f (p ) Q q];
q Q q[g (q ) P p].

Then R is a quasiorder on R , and one can let R be the associated partial order.
E25.11. Suppose that , , are cardinals such that (1) = 1 or is infinite; (2) ;
(3) and are infinite. For f, g we define f g iff |{ < : f () > g()}| < .
Let
b.. = min{|B| : B is a -unbounded subset of };
d,, = min{|B| : B is a -cofinal subset of }.
Prove that b,, is regular, and b,, cf(d,, ).
E25.12. Suppose that > |A|+ is a regular cardinal and f = hf : < i is a <I increasing sequence. Consider the following property of f and a regular cardinal such
that |A| < :
Bad : There exist:
(a) nonempty sets Sa of ordinals for a A, each of size less than , such that f <I
hsup(Sa ) : a Ai for all < , and
(b) an ultrafilter D over A extending the dual of I
such that for every < there is a < such that proj(f , S) <D f .
Show that the bounding projection property for implies Bad .

E25.13. (Continuing E25.12) Suppose that > |A|+ is a regular cardinal and f = hf : <
i is a <I -increasing sequence. Also suppose that is a regular cardinal and |A| < .
Let Ugly be the following statement:
There exists a function g A Ord such that, defining t = {a A : g(a) < f (a)}, the
sequence ht : < i does not stabilize modulo I. That is, for every there is a > in
such that t \t I + .
Show that if Ugly holds, then ht : < i is I -increasing, i.e., if < < then
t \t I.
Also show that if the bounding projection property holds for , then Ugly.
E25.14. (Continuing E25.13) Suppose that > |A|+ is a regular cardinal and f = hf : <
i is a <I -increasing sequence. Also suppose that is a regular cardinal, with |A| < .
Suppose that Bad and Ugly. Show that the bounding projection property holds for
.
Hint: Suppose that it does not hold. For brevity write f+ for proj(f , S). For all
, < let t = {a A : f+ (a) < f (a)}. Prove:
(1) For every < there is a > such that t I + and for all > we have
t \t I.
Now by (1) define strictly increasing sequences h() : < i and h() : < i such that
()
() ()
for all < , t() I + , () < (), () < () if < < , and t \t() I for all
> (). Prove:
(2) If < < , then
o
 n
 

() ()
()
()
()
+
+
t() t() t() t() \t() a A : f() (a) < f() (a) .
Next, prove
( )

( )

I +.
(3) If 1 < < m < , then t(11 ) . . . t(m
m)
()

By (3), the set I {t() : < } has fip, and hence is contained in an ultrafilter D. This
easily leads to a contradiction.
E25.15 (Continuing E25.14; The Trichotomy theorem) Suppose that > |A|+ is a regular
cardinal and f = hf : < i is a <I -increasing sequence. Also let be a regular cardinal
such that |A| < . Let Good be the statement that there exists an exact upper
bound g for f such that cf(g(a)) for every a A.
Prove that Bad , Ugly, or Good .
References
Abraham, U. and Magidor, M. Cardinal arithmetic. To appear in the Handbook of Set
Theory. 86pp.
Holz, M., Steffens, K., Weitz, E. Introduction to cardinal arithmetic. Birkhäuser 1999, 304pp.
Shelah, S. Cardinal arithmetic. Oxford Univ. Press 1994. 481pp.

26. Basic properties of PCF


For any set $A$ of regular cardinals define
\[ \mathrm{pcf}(A) = \Bigl\{ \mathrm{cf}\Bigl(\prod A/D\Bigr) : D \text{ is an ultrafilter on } A \Bigr\}. \]
By definition, $\mathrm{pcf}(\emptyset) = \emptyset$. We begin with a very easy proposition which will be used a lot in what follows.
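Proposition 26.1. Let $A$ and $B$ be sets of regular cardinals.
(i) $A \subseteq \mathrm{pcf}(A)$.
(ii) If $A \subseteq B$, then $\mathrm{pcf}(A) \subseteq \mathrm{pcf}(B)$.
(iii) $\mathrm{pcf}(A \cup B) = \mathrm{pcf}(A) \cup \mathrm{pcf}(B)$.
(iv) If $B \subseteq A$, then $\mathrm{pcf}(A) \setminus \mathrm{pcf}(B) \subseteq \mathrm{pcf}(A \setminus B)$.
(v) If $A$ is finite, then $\mathrm{pcf}(A) = A$.
(vi) If $B \subseteq A$, $B$ is finite, and $A$ is infinite, then $\mathrm{pcf}(A) = \mathrm{pcf}(A \setminus B) \cup B$.
(vii) $\min(A) = \min(\mathrm{pcf}(A))$.
(viii) If $A$ is infinite, then the first $\omega$ members of $A$ are the same as the first $\omega$ members of $\mathrm{pcf}(A)$.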
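Proof. (i): For each $a \in A$, the principal ultrafilter with $\{a\}$ as a member shows that $a \in \mathrm{pcf}(A)$.
(ii): Any ultrafilter $F$ on $A$ can be extended to an ultrafilter $G$ on $B$. The mapping $[f] \mapsto [f]$ is easily seen to be an isomorphism of $\prod A/F$ onto $\prod B/G$. Note here that $[f]$ is used in two senses, one for an element of $\prod A/F$, where each member of $[f]$ is in $\prod A$, and the other for an element of $\prod B/G$, with members in the larger set $\prod B$.
(iii): $\supseteq$ holds by (ii). Now suppose that $D$ is an ultrafilter on $A \cup B$. Then $A \in D$ or $B \in D$, and this proves $\subseteq$.
(iv): Suppose that $B \subseteq A$ and $\lambda \in \mathrm{pcf}(A) \setminus \mathrm{pcf}(B)$. Let $D$ be an ultrafilter on $A$ such that $\lambda = \mathrm{cf}(\prod A/D)$. Then $B \notin D$, as otherwise $\lambda \in \mathrm{pcf}(B)$. So $A \setminus B \in D$, and so $\lambda \in \mathrm{pcf}(A \setminus B)$.
(v): If $A$ is finite, then every ultrafilter on $A$ is principal.
(vi): We have
\[ \mathrm{pcf}(A) = \mathrm{pcf}(A \setminus B) \cup \mathrm{pcf}(B) \ \text{ by (iii)} \quad = \mathrm{pcf}(A \setminus B) \cup B \ \text{ by (v)}. \]
(vii): Let $a = \min(A)$. Thus $a \in \mathrm{pcf}(A)$ by (i). Suppose that $\lambda \in \mathrm{pcf}(A)$ with $\lambda < a$; we want to get a contradiction. Say $\langle [g_\xi] : \xi < \lambda \rangle$ is strictly increasing and cofinal in $\prod A/D$. Now define $h \in \prod A$ as follows: for any $b \in A$, $h(b) = \sup\{g_\xi(b) + 1 : \xi < \lambda\}$. Thus $[g_\xi] < [h]$ for all $\xi < \lambda$, contradiction.
(viii): Suppose that $\lambda \in \mathrm{pcf}(A) \setminus A$. Suppose that $A \cap \lambda$ is finite, and let $a = \min(A \setminus \lambda)$. So $\lambda \le a$, and if $b \in A \cap a$ then $b < \lambda$. Thus $A \cap \lambda = A \cap a$. Hence $\mathrm{pcf}(A) = \mathrm{pcf}(A \setminus a) \cup (A \cap \lambda)$ by (vi), and so $a \le \lambda$ by (vii). So $\lambda = a$, contradiction. Thus $A \cap \lambda$ is infinite, and this proves (viii).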
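As a quick illustration of parts (v), (vii) and (viii), here is a hedged worked example in LaTeX notation. The sets $A_0$ and $A_1$ are chosen only for illustration and are not taken from the text.

```latex
% Finite case, by Proposition 26.1(v):
\[ A_0 = \{\aleph_1, \aleph_2, \aleph_3\}
   \quad\Longrightarrow\quad \mathrm{pcf}(A_0) = A_0 . \]
% Infinite case: A_1 = \{\aleph_n : 1 \le n < \omega\}.
% By (i) and (vii):
\[ A_1 \subseteq \mathrm{pcf}(A_1), \qquad \min(\mathrm{pcf}(A_1)) = \aleph_1 . \]
% By (viii) the first \omega members of pcf(A_1) are exactly the \aleph_n,
% so every further member is a regular cardinal above \aleph_\omega:
\[ \lambda \in \mathrm{pcf}(A_1) \setminus A_1
   \quad\Longrightarrow\quad \lambda > \aleph_\omega . \]
```

Nothing here says which regular cardinals above $\aleph_\omega$ actually occur; that depends on the results developed later in the chapter.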
The following result gives a connection with earlier material; of course there will be more
connections shortly.

Proposition 26.2. If $A$ is a collection of regular cardinals, $F$ is a proper filter on $A$, and $\lambda = \mathrm{tcf}(\prod A/F)$, then $\lambda \in \mathrm{pcf}(A)$.
Proof. Let $\langle f_\xi : \xi < \lambda \rangle$ be a $<_F$-increasing cofinal sequence in $\prod A/F$. Let $D$ be any ultrafilter containing $F$. Then clearly $\langle f_\xi : \xi < \lambda \rangle$ is a $<_D$-increasing cofinal sequence in $\prod A/D$.
Definitions. A set $A$ is progressive iff $A$ is an infinite set of regular cardinals and $|A| < \min(A)$.
If $\alpha < \beta$ are ordinals, then $(\alpha, \beta)_{\mathrm{reg}}$ is the set of all regular cardinals $\kappa$ such that $\alpha < \kappa < \beta$. Similarly for $[\alpha, \beta)_{\mathrm{reg}}$, etc. All such sets are called intervals of regular cardinals.
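A small worked example may help fix the definition of "progressive" as the condition $|A| < \min(A)$ on an infinite set of regular cardinals. The two sets below are illustrative choices, not examples from the text.

```latex
% Progressive: |A| = \aleph_0 < \aleph_1 = \min(A).
\[ A = (\aleph_0, \aleph_\omega)_{\mathrm{reg}} = \{\aleph_n : 1 \le n < \omega\} \]
% Not progressive: |A'| = \aleph_0 = \min(A'), so |A'| < \min(A') fails.
\[ A' = [\aleph_0, \aleph_\omega)_{\mathrm{reg}} = \{\aleph_n : n < \omega\} \]
```

Removing the single element $\aleph_0$ from $A'$ restores progressiveness, which is the kind of finite adjustment that Proposition 26.1(vi) is designed to handle.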
Proposition 26.3. Assume that $A$ is a progressive set. Then:
(i) Every infinite subset of $A$ is progressive.
(ii) If $\alpha$ is an ordinal and $A \cap \alpha$ is unbounded in $\alpha$, then $\alpha$ is a singular cardinal.
(iii) If $A$ is an interval of regular cardinals, then $A$ does not have any weakly inaccessible cardinal as a member, except possibly its first element. Moreover, there is a singular cardinal $\lambda$ such that $A \cap \lambda$ is unbounded in $\lambda$ and $A \setminus \lambda$ is finite.
Proof. (i): Obvious.
(ii): Obviously $\alpha$ is a cardinal. Now $A \cap \alpha$ is cofinal in $\alpha$ and $|A \cap \alpha| \le |A| < \min(A) < \alpha$. Hence $\alpha$ is singular.
(iii): If $\kappa \in A$, then by (ii), $A \cap \kappa$ cannot be unbounded in $\kappa$; hence $\kappa$ is a successor cardinal, or is the first element of $A$. For the second assertion of (iii), let $\sup(A) = \aleph_{\alpha+n}$ with $\alpha$ a limit ordinal. Since $A$ is an infinite interval of regular cardinals, it follows that $A \cap \aleph_\alpha$ is unbounded in $\aleph_\alpha$, and hence by (ii), $\aleph_\alpha$ is singular. Hence the desired conclusion follows.
Theorem 26.4. (Directed set theorem) Suppose that $A$ is a progressive set, and $\lambda$ is a regular cardinal such that $\sup(A) < \lambda$. Suppose that $I$ is a proper ideal over $A$ containing all proper initial segments of $A$ and such that $(\prod A, <_I)$ is $\lambda$-directed. Then there exist a set $A'$ of regular cardinals and a proper ideal $J$ over $A'$ such that the following conditions hold:
(i) $A' \subseteq [\min(A), \sup(A))$ and $A'$ is cofinal in $\sup(A)$.
(ii) $|A'| \le |A|$.
(iii) $J$ contains all bounded subsets of $A'$.
(iv) $\lambda = \mathrm{tcf}(\prod A', <_J)$.
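Proof. First we note:
$(\star)$ $A$ does not have a largest element.
For, suppose that $a$ is the largest element of $A$. Note that then $I = \mathcal{P}(A \setminus \{a\})$. For each $\xi < a$ define $f_\xi \in \prod A$ by setting
\[ f_\xi(b) = \begin{cases} 0 & \text{if } b \ne a, \\ \xi & \text{if } b = a. \end{cases} \]
Since $a < \lambda$, choose $g \in \prod A$ such that $f_\xi <_I g$ for all $\xi < a$. Thus $\{b \in A : f_\xi(b) \ge g(b)\} \in I$, so $f_\xi(a) < g(a)$ for all $\xi < a$. This is clearly impossible. So $(\star)$ holds.
Now by Lemma 8.43 there is a $<_I$-increasing sequence $f = \langle f_\xi : \xi < \lambda \rangle$ in $\prod A$ which satisfies $(*)_\kappa$ for every $\kappa \in A$. Hence by 8.38-8.40, $f$ has an exact upper bound $h \in {}^{A}\mathrm{Ord}$ such that
(1) $\{a \in A : h(a) \text{ is non-limit or } \mathrm{cf}(h(a)) < \kappa\} \in I$
for every $\kappa \in A$. Now the identity function $k$ on $A$ is clearly an upper bound for $f$, so $h \le_I k$; and by (1), $\{a \in A : h(a) \text{ is non-limit or } \mathrm{cf}(h(a)) < \min(A)\} \in I$. Hence by changing $h$ on a set in the ideal we may assume that
(2) $\min(A) \le \mathrm{cf}(h(a)) \le a$ for all $a \in A$.
Now $f$ shows that $(\prod h, <_I)$ has true cofinality $\lambda$. Let $A' = \{\mathrm{cf}(h(a)) : a \in A\}$. By Lemma 8.24, there is a proper ideal $J$ on $A'$ such that $(\prod A', <_J)$ has true cofinality $\lambda$; namely,
\[ X \in J \quad\text{iff}\quad X \subseteq A' \text{ and } h^{-1}[\mathrm{cf}^{-1}[X]] \in I. \]
Clearly (ii) and (iv) hold. By (2) we have $A' \subseteq [\min(A), \sup(A))$. Now to show that $A'$ is cofinal in $\sup(A)$, suppose that $\kappa \in A$; we find $\mu \in A'$ such that $\kappa \le \mu$. In fact, $\{a \in A : \mathrm{cf}(h(a)) < \kappa\} \in I$ by (1). Let $X = \{b \in A' : b < \kappa\}$. Then
\[ h^{-1}[\mathrm{cf}^{-1}[X]] = \{a \in A : \mathrm{cf}(h(a)) < \kappa\} \in I, \]
and so $X \in J$. Taking any $\mu \in A' \setminus X$ we get $\kappa \le \mu$. Thus (i) holds. Finally, for (iii), suppose that $\mu \in A'$; we want to show that $Y \overset{\mathrm{def}}{=} \{b \in A' : b < \mu\} \in J$. By (i), choose $\kappa \in A$ such that $\mu \le \kappa$. Then $Y \subseteq \{b \in A' : b < \kappa\}$, and by the argument just given, the latter set is in $J$. So (iii) holds.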
Corollary 26.5. Suppose that $A$ is progressive, is an interval of regular cardinals, and $\lambda$ is a regular cardinal $> \sup(A)$. Assume that $I$ is a proper ideal over $A$ such that $(\prod A, <_I)$ is $\lambda$-directed. Then $\lambda \in \mathrm{pcf}(A)$.
Proof. We may assume that $I$ contains all proper initial segments of $A$. For, suppose that this is not true. Then there is a proper initial segment $B$ of $A$ such that $B \notin I$. With $a \in A \setminus B$ we then have $B \subseteq A \cap a$, and so $A \cap a \notin I$. Let $a$ be the smallest element of $A$ such that $A \cap a \notin I$. Then $J \overset{\mathrm{def}}{=} I \cap \mathcal{P}(A \cap a)$ is a proper ideal that contains all proper initial segments of $A \cap a$. We claim that $(\prod (A \cap a), <_J)$ is $\lambda$-directed. For, suppose that $X \subseteq \prod (A \cap a)$ with $|X| < \lambda$. For each $g \in X$ let $g^+ \in \prod A$ be such that $g^+ \supseteq g$ and $g^+(b) = 0$ for all $b \in A \setminus a$. Choose $f \in \prod A$ such that $g^+ \le_I f$ for all $g \in X$. So if $g \in X$ we have
\[ \{b \in A \cap a : g(b) > f(b)\} = \{b \in A : g^+(b) > f(b)\} \in I \cap \mathcal{P}(A \cap a), \]
and so $g \le_J (f \upharpoonright (A \cap a))$ for all $g \in X$, as desired.
Now the corollary follows from the theorem.


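The ideal $J_{<\lambda}$
Let $A$ be a set of regular cardinals. We define
\[ J_{<\lambda}[A] = \{X \subseteq A : \mathrm{pcf}(X) \subseteq \lambda\}. \]
In words, $X \in J_{<\lambda}[A]$ iff $X$ is a subset of $A$ such that for any ultrafilter $D$ over $A$, if $X \in D$, then $\mathrm{cf}(\prod A, <_D) < \lambda$. Thus $X$ "forces" the cofinalities of ultraproducts to be below $\lambda$.
Clearly $J_{<\lambda}[A]$ is an ideal of $A$. If $\lambda < \min(A)$, then $J_{<\lambda}[A] = \{\emptyset\}$ by 26.1(vii). If $\lambda < \mu$, then $J_{<\lambda}[A] \subseteq J_{<\mu}[A]$. If $\lambda \notin \mathrm{pcf}(A)$, then $J_{<\lambda}[A] = J_{<\lambda^+}[A]$. If $\lambda$ is greater than each member of $\mathrm{pcf}(A)$, then $J_{<\lambda}[A]$ is the improper ideal $\mathcal{P}(A)$. If $\lambda \in \mathrm{pcf}(A)$, then $A \notin J_{<\lambda}[A]$.
If $A$ is clear from the context, we simply write $J_{<\lambda}$.
If $I$ and $J$ are ideals on a set $A$, then $I + J$ is the smallest ideal on $A$ which contains $I \cup J$; it consists of all $X$ such that $X \subseteq Y \cup Z$ for some $Y \in I$ and $Z \in J$.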
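To see how the ideals grow with the index, here is a hedged worked example, reading the definition above as $J_{<\lambda}[A] = \{X \subseteq A : \mathrm{pcf}(X) \subseteq \lambda\}$. The two-element set $A$ is an illustrative choice only; since it is finite, $\mathrm{pcf}(X) = X$ for every $X \subseteq A$ by Proposition 26.1(v).

```latex
% A = \{\aleph_1, \aleph_2\}; a subset X lies in J_{<\lambda}[A]
% exactly when every member of X is below \lambda.
\begin{align*}
J_{<\aleph_1}[A] &= \{\emptyset\}, \\
J_{<\aleph_2}[A] &= \{\emptyset, \{\aleph_1\}\}, \\
J_{<\aleph_3}[A] &= \mathcal{P}(A).
\end{align*}
% The chain is increasing, and it becomes improper as soon as the index
% exceeds every member of pcf(A) = A.
```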
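Lemma 26.6. If $A$ is an infinite set of regular cardinals and $B$ is a finite subset of $A$, then for any cardinal $\lambda$ we have
\[ J_{<\lambda}[A] = J_{<\lambda}[A \setminus B] + \mathcal{P}(B \cap \lambda). \]
Proof. Let $X \in J_{<\lambda}[A]$. Thus $\mathrm{pcf}(X) \subseteq \lambda$. Using 26.1(vi) we have $\mathrm{pcf}(X) = \mathrm{pcf}(X \setminus B) \cup (X \cap B)$, so $X \setminus B \in J_{<\lambda}[A \setminus B]$ and $X \cap B \subseteq B \cap \lambda$, and it follows that $X \in J_{<\lambda}[A \setminus B] + \mathcal{P}(B \cap \lambda)$.
Now suppose that $X \in J_{<\lambda}[A \setminus B] + \mathcal{P}(B \cap \lambda)$. Then there is a $Y \in J_{<\lambda}[A \setminus B]$ such that $X \subseteq Y \cup (B \cap \lambda)$. Hence by 26.1(vi) again, $\mathrm{pcf}(X) \subseteq \mathrm{pcf}(Y) \cup (B \cap \lambda) \subseteq \lambda$, so $X \in J_{<\lambda}[A]$.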
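Proposition 26.7. If $A$ is a collection of regular cardinals and $\lambda$ is a cardinal, then
\[ J^*_{<\lambda}[A] = \bigcap \Bigl\{ D : D \text{ is an ultrafilter and } \mathrm{cf}\Bigl(\prod A/D\Bigr) \ge \lambda \Bigr\}. \]
The intersection is to be understood as being equal to $\mathcal{P}(A)$ if there is no ultrafilter $D$ such that $\mathrm{cf}(\prod A/D) \ge \lambda$.
Proof. Note that for any $X \subseteq A$, $X \in J^*_{<\lambda}[A]$ iff $A \setminus X \in J_{<\lambda}[A]$ iff $\mathrm{pcf}(A \setminus X) \subseteq \lambda$.
Now suppose that $X \in J^*_{<\lambda}[A]$ and $D$ is an ultrafilter such that $\mathrm{cf}(\prod A/D) \ge \lambda$. If $X \notin D$, then $A \setminus X \in D$ and hence $\mathrm{pcf}(A \setminus X) \not\subseteq \lambda$, contradiction. Thus $X$ is in the indicated intersection.
If $X$ is in the indicated intersection, we want to show that $A \setminus X \in J_{<\lambda}[A]$. To this end, suppose that $D$ is an ultrafilter such that $A \setminus X \in D$, and to get a contradiction suppose that $\mathrm{cf}(\prod A/D) \ge \lambda$. Then $X \in D$ by assumption, contradiction.
Note that the argument gives the desired result in case there are no ultrafilters $D$ as indicated in the intersection; in this case, $\mathrm{pcf}(A \setminus X) \subseteq \lambda$ for every $X \subseteq A$, and so $J^*_{<\lambda}[A] = \mathcal{P}(A)$.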


Theorem 26.8. ($\lambda$-directedness) Assume that $A$ is progressive. Then for every cardinal $\lambda$, the partial order $(\prod A, <_{J_{<\lambda}[A]})$ is $\lambda$-directed.
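Proof. We may assume that there are infinitely many members of $A$ less than $\lambda$. For, suppose not. Let $F \subseteq \prod A$ with $|F| < \lambda$. We define $g \in \prod A$ by setting, for any $a \in A$,
\[ g(a) = \begin{cases} \sup\{f(a) : f \in F\} & \text{if } |F| < a, \\ 0 & \text{otherwise.} \end{cases} \]
We claim that $f \le g \bmod J_{<\lambda}[A]$ for all $f \in F$. For, if $f(a) > g(a)$, then $\lambda > |F| \ge a$; thus $\{a : f(a) > g(a)\} \subseteq \lambda \cap A$. Now $\mathrm{pcf}(\lambda \cap A) = \lambda \cap A \subseteq \lambda$, so $\{a : f(a) > g(a)\} \in J_{<\lambda}[A]$.
So, we make the indicated assumption. It follows that $X \overset{\mathrm{def}}{=} A \cap \{|A|^+, |A|^{++}, |A|^{+++}\} \in J_{<\lambda}[A]$. Hence it is easy to see that $\prod A/J_{<\lambda} \cong \prod (A \setminus X)/(J_{<\lambda}[A] \cap \mathcal{P}(A \setminus X))$. Now
\[ Y \in J_{<\lambda}[A] \cap \mathcal{P}(A \setminus X) \quad\text{iff}\quad \mathrm{pcf}(Y) \subseteq \lambda \text{ and } Y \subseteq A \setminus X \quad\text{iff}\quad Y \in J_{<\lambda}[A \setminus X]. \]
Hence we may assume that $|A|^{+3} < \min(A)$.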


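Now we prove by induction on the cardinal $\lambda_0$ that if $\lambda_0 < \lambda$ and $F = \{f_i : i < \lambda_0\} \subseteq \prod A$ is a family of functions of size $\lambda_0$, then $F$ has an upper bound in $(\prod A, <_{J_{<\lambda}})$. So, we assume that this is true for all cardinals less than $\lambda_0$. If $\lambda_0 < \min(A)$, then $\sup(F)$ is as desired. So, assume that $\min(A) \le \lambda_0$.
First suppose that $\lambda_0$ is singular. Let $\langle \alpha_i : i < \mathrm{cf}(\lambda_0) \rangle$ be increasing and cofinal in $\lambda_0$, each $\alpha_i$ a cardinal. By the inductive hypothesis, let $g_i$ be a bound for $\{f_\xi : \xi < \alpha_i\}$ for each $i < \mathrm{cf}(\lambda_0)$, and then let $h$ be a bound for $\{g_i : i < \mathrm{cf}(\lambda_0)\}$. Clearly $h$ is a bound for $F$.
So assume that $\lambda_0$ is regular. We are now going to define a $<_{J_{<\lambda}}$-increasing sequence $\langle f'_\xi : \xi < \lambda_0 \rangle$ which satisfies $(*)_\kappa$, with $\kappa = |A|^+$, and such that $f_i \le f'_i$ for all $i < \lambda_0$. To do this choose, for every $\delta \in S^{\lambda_0}_{\kappa^{++}}$, a club $E_\delta \subseteq \delta$ of order type $\kappa^{++}$. Now for such a $\delta$ we define
\[ f'_\delta = \sup(\{f'_j : j \in E_\delta\} \cup \{f_\delta\}). \]
For ordinals $\delta < \lambda_0$ of cofinality $\ne \kappa^{++}$ we apply the inductive hypothesis to get $f'_\delta$ such that $f'_\xi <_{J_{<\lambda}} f'_\delta$ for every $\xi < \delta$ and also $f_\delta <_{J_{<\lambda}} f'_\delta$.
This finishes the construction. By Lemma 8.41, $(*)_{|A|^+}$ holds for $f'$, and hence by Theorem 8.40, $f'$ has an exact upper bound $g \in {}^{A}\mathrm{Ord}$ with respect to $<_{J_{<\lambda}}$. The identity function on $A$ is an upper bound for $f'$, so we may assume that $g(a) \le a$ for all $a \in A$. Now we shall prove that $B \overset{\mathrm{def}}{=} \{a \in A : g(a) = a\} \in J_{<\lambda}[A]$, so a further modification of $g$ yields the desired upper bound for $f'$.
To get a contradiction, suppose that $B \notin J_{<\lambda}[A]$. Hence $\mathrm{pcf}(B) \not\subseteq \lambda$, and so there is an ultrafilter $D$ over $A$ such that $B \in D$ and $\mathrm{cf}(\prod A/D) \ge \lambda$. Clearly $D \cap J_{<\lambda}[A] = \emptyset$, as otherwise $\mathrm{cf}(\prod A/D) < \lambda$. Now $f'$ has length $\lambda_0 < \lambda$, and so it is bounded in $\prod A/D$; say that $f'_i <_D h \in \prod A$ for all $i < \lambda_0$. Thus $h(a) < a = g(a)$ for all $a \in B$. Now we define $h' \in \prod A$ by
\[ h'(a) = \begin{cases} h(a) & \text{if } a \in B, \\ 0 & \text{otherwise.} \end{cases} \]
Then $h' <_{J_{<\lambda}} g$, since
\[ \{a \in A : h'(a) \ge g(a)\} = \{a \in A : g(a) = 0\} \subseteq \{a \in A : f'_0(a) \ge g(a)\} \in J_{<\lambda}. \]
Hence by the exactness of $g$ it follows that $h' <_{J_{<\lambda}} f'_i$ for some $i < \lambda_0$. But $B \in D$ and hence $h =_D h'$. So $h <_D f'_i$, contradiction.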
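Corollary 26.9. Suppose that $A$ is progressive, $D$ is an ultrafilter over $A$, and $\lambda$ is a cardinal. Then:
(i) $\mathrm{cf}(\prod A/D) < \lambda$ iff $J_{<\lambda}[A] \cap D \ne \emptyset$.
(ii) $\mathrm{cf}(\prod A/D) = \lambda$ iff $J_{<\lambda^+} \cap D \ne \emptyset = J_{<\lambda} \cap D$.
(iii) $\mathrm{cf}(\prod A/D) = \lambda$ iff $\lambda^+$ is the first cardinal $\mu$ such that $J_{<\mu} \cap D \ne \emptyset$.
Proof. (i): $\Rightarrow$: Assuming that $J_{<\lambda}[A] \cap D = \emptyset$, the fact from 10.3 that $<_{J_{<\lambda}}$ is $\lambda$-directed implies that also $\prod A/D$ is $\lambda$-directed, and hence $\mathrm{cf}(\prod A/D) \ge \lambda$.
$\Leftarrow$: Assume that $J_{<\lambda}[A] \cap D \ne \emptyset$. Choose $X \in J_{<\lambda} \cap D$. Then by definition, $\mathrm{pcf}(X) \subseteq \lambda$, and hence $\mathrm{cf}(\prod A/D) < \lambda$.
(ii): Immediate from (i).
(iii): Immediate from (ii).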
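The step "(ii) is immediate from (i)" can be spelled out as a short chain of equivalences. The following is a sketch in LaTeX, writing $\lambda$ for the cardinal of the corollary and applying (i) once to $\lambda^+$ and once to $\lambda$; the abbreviation $c$ is introduced here only for readability.

```latex
% c abbreviates cf(\prod A / D).
\begin{align*}
c = \lambda
  &\iff c < \lambda^{+} \ \text{ and not } \ c < \lambda \\
  &\iff J_{<\lambda^{+}}[A] \cap D \neq \emptyset
        \ \text{ and } \ J_{<\lambda}[A] \cap D = \emptyset .
\end{align*}
```

The first equivalence uses only that $c$ is a cardinal; the second is two applications of (i).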
We now give two important theorems about pcf.
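Theorem 26.10. If $A$ is progressive, then $|\mathrm{pcf}(A)| \le 2^{|A|}$.
Proof. By Corollary 26.9, for each $\lambda \in \mathrm{pcf}(A)$ we can select an element $f(\lambda) \in J_{<\lambda^+} \setminus J_{<\lambda}$. Clearly $f$ is a one-one function from $\mathrm{pcf}(A)$ into $\mathcal{P}(A)$.
Notation. We write $J_{\le\lambda}$ in place of $J_{<\lambda^+}$.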


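Theorem 26.11. (The max pcf theorem) If $A$ is progressive, then $\mathrm{pcf}(A)$ has a largest element.
Proof. Let
\[ I = \bigcup_{\lambda \in \mathrm{pcf}(A)} J_{<\lambda}[A]. \]
Now clearly each ideal $J_{<\lambda}$ is proper (since for example $A \notin J_{<\lambda}$), so $I$ is also proper. Extend the dual of $I$ to an ultrafilter $D$, and let $\mu = \mathrm{cf}(\prod A/D)$. Then for each $\lambda \in \mathrm{pcf}(A)$ we have $J_{<\lambda} \cap D = \emptyset$ since $I \cap D = \emptyset$, and by Corollary 26.9 this means that $\lambda \le \mu$.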
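Corollary 26.12. Suppose that $A$ is progressive. If $\lambda$ is a limit cardinal, then
\[ J_{<\lambda}[A] = \bigcup_{\theta < \lambda} J_{<\theta}[A]. \]
Proof. The inclusion $\supseteq$ is clear. Now suppose that $X \in J_{<\lambda}[A]$. Thus $\mathrm{pcf}(X) \subseteq \lambda$. Let $\mu$ be the largest element of $\mathrm{pcf}(X)$. Then $\mu < \lambda$, and $\mathrm{pcf}(X) \subseteq \mu^+$, so $X \in J_{<\mu^+}$, and the latter is a subset of the right side.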
Theorem 26.13. (The interval theorem) If $A$ is a progressive interval of regular cardinals, then $\mathrm{pcf}(A)$ is an interval of regular cardinals.
Proof. Let $\mu = \sup(A)$. By 26.3(iii) and 26.1(vi) we may assume that $\mu$ is singular. By Theorem 26.11 let $\lambda_0 = \max(\mathrm{pcf}(A))$. Thus we want to show that every regular cardinal $\lambda$ in $(\mu, \lambda_0)$ is in $\mathrm{pcf}(A)$. By Theorem 26.8, the partial order $(\prod A, <_{J_{<\lambda}})$ is $\lambda$-directed. Clearly $J_{<\lambda}$ is a proper ideal, so $\lambda \in \mathrm{pcf}(A)$ by Corollary 26.5.
Definition. If $\kappa$ is a cardinal $\le |A|$, then we define
\[ \mathrm{pcf}_\kappa(A) = \bigcup \{\mathrm{pcf}(X) : X \subseteq A \text{ and } |X| = \kappa\}. \]

Theorem 26.14. If $A$ is an interval of regular cardinals and $\kappa < \min(A)$, then $\mathrm{pcf}_\kappa(A)$ is an interval of regular cardinals.
Note here that we do not assume that $A$ is progressive.
Proof. Let $\lambda_0 = \sup \mathrm{pcf}_\kappa(A)$. Note that each subset $X$ of $A$ of cardinality $\kappa$ is progressive, and so $\max(\mathrm{pcf}(X))$ exists by Theorem 26.11. Thus
\[ \lambda_0 = \sup\{\max(\mathrm{pcf}(X)) : X \subseteq A \text{ and } |X| = \kappa\}. \]
To prove the theorem it suffices to take any regular cardinal $\lambda$ such that $\min(A) < \lambda < \lambda_0$ and show that $\lambda \in \mathrm{pcf}_\kappa(A)$. In fact, this will show that $\mathrm{pcf}_\kappa(A)$ is an interval of regular cardinals, whether or not $\lambda_0$ is regular. Since $\lambda < \lambda_0$, there is an $X \subseteq A$ of size $\kappa$ such that $\lambda \le \max(\mathrm{pcf}(X))$. Hence $X \notin J_{<\lambda}[X]$. If there is a proper initial segment $Y$ of $X$ which is not in $J_{<\lambda}[X]$, we can choose the smallest $a \in X$ such that $X \cap a \notin J_{<\lambda}[X]$ and work with $X \cap a$ rather than $X$. So we may assume that every proper initial segment of $X$ is in $J_{<\lambda}[X]$. Since $(\prod X, <_{J_{<\lambda}[X]})$ is $\lambda$-directed by Theorem 26.8, we can apply 26.4 to obtain $\lambda \in \mathrm{pcf}(X)$, and hence $\lambda \in \mathrm{pcf}_\kappa(A)$, as desired.
Another of the central results of pcf theory is as follows.
Theorem 26.15. (Closure theorem.) Suppose that $A$ is progressive, $B \subseteq \mathrm{pcf}(A)$, and $B$ is progressive. Then $\mathrm{pcf}(B) \subseteq \mathrm{pcf}(A)$. In particular, if $\mathrm{pcf}(A)$ itself is progressive, then $\mathrm{pcf}(\mathrm{pcf}(A)) = \mathrm{pcf}(A)$.
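Proof. Suppose that $\mu \in \mathrm{pcf}(B)$, and let $E$ be an ultrafilter on $B$ such that $\mu = \mathrm{cf}(\prod B/E)$. For every $b \in B$ fix an ultrafilter $D_b$ on $A$ such that $b = \mathrm{cf}(\prod A/D_b)$. Define $F$ by
\[ X \in F \quad\text{iff}\quad X \subseteq A \text{ and } \{b \in B : X \in D_b\} \in E. \]
It is straightforward to check that $F$ is an ultrafilter on $A$. The rest of the proof consists in showing that $\mu = \mathrm{cf}(\prod A/F)$.
Now we apply Lemma 8.25, with $I$, $\langle \lambda_i : i \in I \rangle$, $\langle P_i : i \in I \rangle$ replaced by $B$, $\langle b : b \in B \rangle$, $\langle \prod A/D_b : b \in B \rangle$ respectively, to obtain
\[ \mu = \mathrm{cf}\Bigl( \prod_{b \in B} \Bigl( \prod A/D_b \Bigr) \Big/ E \Bigr). \]
Hence it suffices by Proposition 8.11 to show that $\prod A/F$ is isomorphic to a cofinal subset of this iterated ultraproduct. To do this, we consider the Cartesian product $B \times A$ and define
\[ H \in P \quad\text{iff}\quad H \subseteq B \times A \text{ and } \{b \in B : \{a \in A : (b, a) \in H\} \in D_b\} \in E. \]
Again it is straightforward to check that $P$ is an ultrafilter over $B \times A$. Let $r(b, a) = a$ for any $(b, a) \in B \times A$. Then
\[ (\star) \qquad \prod_{(b,a) \in B \times A} a \Big/ P \;\cong\; \prod_{b \in B} \Bigl( \prod A/D_b \Bigr) \Big/ E. \]
To prove $(\star)$, for any $f \in \prod_{(b,a) \in B \times A} a$ we define $f' \in \prod_{b \in B} (\prod A/D_b)$ by setting
\[ f'(b) = \langle f(b, a) : a \in A \rangle / D_b. \]
Then for any $f, g \in \prod_{(b,a) \in B \times A} a$ we have
\begin{align*}
f =_P g \quad &\text{iff}\quad \{(b, a) : f(b, a) = g(b, a)\} \in P \\
&\text{iff}\quad \{b : \{a : f(b, a) = g(b, a)\} \in D_b\} \in E \\
&\text{iff}\quad \{b : f'(b) = g'(b)\} \in E \\
&\text{iff}\quad f' =_E g'.
\end{align*}
Hence we can define $k(f/P) = f'/E$, and we get a one-one function. To show that it is a surjection, suppose that $h \in \prod_{b \in B} (\prod A/D_b)$. For each $b \in B$ write $h(b) = h_b/D_b$ with $h_b \in \prod A$. Then define $f(b, a) = h_b(a)$. Then
\[ f'(b) = \langle f(b, a) : a \in A \rangle / D_b = \langle h_b(a) : a \in A \rangle / D_b = h_b/D_b = h(b), \]
as desired. Finally, $k$ preserves order, since
\begin{align*}
f/P < g/P \quad &\text{iff}\quad \{(b, a) : f(b, a) < g(b, a)\} \in P \\
&\text{iff}\quad \{b : \{a : f(b, a) < g(b, a)\} \in D_b\} \in E \\
&\text{iff}\quad \{b : f'(b) < g'(b)\} \in E \\
&\text{iff}\quad k(f/P) < k(g/P).
\end{align*}
So $(\star)$ holds.
Now we apply 7.9, with $r$, $B \times A$, $A$, $P$ in place of $c$, $A$, $B$, $I$ respectively. Then $F$ is the Rudin-Keisler projection on $A$, since for any $X \subseteq A$,
\begin{align*}
X \in F \quad &\text{iff}\quad \{b \in B : X \in D_b\} \in E \\
&\text{iff}\quad \{b \in B : \{a \in A : r(b, a) \in X\} \in D_b\} \in E \\
&\text{iff}\quad \{b \in B : \{a \in A : (b, a) \in r^{-1}[X]\} \in D_b\} \in E \\
&\text{iff}\quad r^{-1}[X] \in P.
\end{align*}
Thus by Lemma 8.24 we get an isomorphism $h$ of $\prod A/F$ into $\prod_{(b,a) \in B \times A} a/P$ such that $h(e/F) = \langle e(r(b, a)) : (b, a) \in B \times A \rangle / P$ for any $e \in \prod A$. So now it suffices to show that the range of $h$ is cofinal in $\prod_{(b,a) \in B \times A} a/P$. Let $g \in \prod_{(b,a) \in B \times A} a$. For every $b \in B$ define $g_b \in \prod A$ by $g_b(a) = g(b, a)$. Let $\lambda = \min(B)$. Since $B$ is progressive, we have $|B| < \lambda$. Hence by the $\lambda$-directedness of $\prod A/J_{<\lambda}[A]$ (Theorem 26.8), there is a function $k \in \prod A$ such that $g_b <_{J_{<\lambda}} k$ for each $b \in B$. Now $\lambda \le b$ for all $b \in B$, so $J_{<\lambda} \cap D_b = \emptyset$, and so $g_b <_{D_b} k$. It follows that $g/P <_P h(k/F)$. In fact, let $H = \{(b, a) : g(b, a) < k(r(b, a))\}$. Then
\[ \{b \in B : \{a \in A : (b, a) \in H\} \in D_b\} = \{b \in B : \{a \in A : g_b(a) < k(a)\} \in D_b\} = B \in E, \]
as desired.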
Generators for $J_{<\lambda}$
If $I$ is an ideal on a set $A$ and $B \subseteq A$, then $I + B$ is the ideal generated by $I \cup \{B\}$; that is, it is the intersection of all ideals $J$ on $A$ such that $I \cup \{B\} \subseteq J$.
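Proposition 26.16. Suppose that $I$ is an ideal on $A$ and $B, X \subseteq A$. Then the following conditions are equivalent:
(i) $X \in I + B$.
(ii) There is a $Y \in I$ such that $X \subseteq Y \cup B$.
(iii) $X \setminus B \in I$.
Proof. Clearly (ii)$\Rightarrow$(i). The set
\[ \{Z \subseteq A : \exists Y \in I\, [Z \subseteq Y \cup B]\} \]
is clearly an ideal containing $I \cup \{B\}$, so (i)$\Rightarrow$(ii). If $Y$ is as in (ii), then $X \setminus B \subseteq Y$, and hence $X \setminus B \in I$; so (ii)$\Rightarrow$(iii). If $X \setminus B \in I$, then $X \subseteq (X \setminus B) \cup B$, so $X$ satisfies the condition of (ii). So (iii)$\Rightarrow$(ii).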
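A tiny finite example may make the three equivalent conditions concrete. The set $A$, the ideal $I$ and the set $B$ below are hypothetical choices made for illustration only.

```latex
% A = \{0,1,2\}, \quad I = \{\emptyset, \{0\}\}, \quad B = \{1\}.
\[ I + B = \{\emptyset, \{0\}, \{1\}, \{0,1\}\} \]
% Checking condition (iii) on two sets:
\begin{align*}
X = \{0,1\}: &\quad X \setminus B = \{0\} \in I, \ \text{ so } X \in I + B; \\
X = \{2\}:   &\quad X \setminus B = \{2\} \notin I, \ \text{ so } X \notin I + B.
\end{align*}
```

Condition (iii) is the one used repeatedly below, since it reduces membership in $I + B$ to membership in $I$.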
The following easy lemma will be useful later.
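Lemma 26.17. Suppose that $A$ is progressive and $B \subseteq A$.
(i) $\mathcal{P}(B) \cap J_{<\lambda}[A] = J_{<\lambda}[B]$.
(ii) If $f, g \in \prod A$ and $f <_{J_{<\lambda}[A]} g$, then $(f \upharpoonright B) <_{J_{<\lambda}[B]} (g \upharpoonright B)$.
Proof. (i): Suppose that $X \in \mathcal{P}(B) \cap J_{<\lambda}[A]$ and $X \in D$, an ultrafilter on $B$. Extend $D$ to an ultrafilter $E$ on $A$. Then $\prod B/D \cong \prod A/E$, and $\mathrm{cf}(\prod A/E) < \lambda$. So $X \in J_{<\lambda}[B]$. The converse is proved similarly.
(ii): Assume that $f, g \in \prod A$ and $f <_{J_{<\lambda}[A]} g$. Then
\[ \{a \in B : g(a) \le f(a)\} \in \mathcal{P}(B) \cap J_{<\lambda}[A] = J_{<\lambda}[B] \]
by (i), as desired.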
Definitions. If there is a set $X$ such that $J_{\le\lambda}[A] = J_{<\lambda} + X$, then we say that $\lambda$ is normal.
Let $A$ be a set of regular cardinals, and $\lambda$ a cardinal. A subset $B \subseteq A$ is a $\lambda$-generator over $A$ iff $J_{\le\lambda}[A] = J_{<\lambda}[A] + B$. We omit the qualifier "over $A$" if $A$ is understood from the context.
Suppose that $\lambda \in \mathrm{pcf}(A)$. A universal sequence for $\lambda$ is a sequence $f = \langle f_\xi : \xi < \lambda \rangle$ which is $<_{J_{<\lambda}[A]}$-increasing, and for every ultrafilter $D$ over $A$ such that $\mathrm{cf}(\prod A/D) = \lambda$, the sequence $f$ is cofinal in $\prod A/D$.
Theorem 26.18. (Universal sequences) Suppose that $A$ is progressive. Then every $\lambda \in \mathrm{pcf}(A)$ has a universal sequence.
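Proof. First,
(1) We may assume that $|A|^+ < \min(A) < \lambda$.
In fact, suppose that we have proved the theorem under the assumption (1), and now take the general situation. If $\lambda = \min(A)$, define $f_\xi \in \prod A$, for $\xi < \lambda$, by $f_\xi(a) = \xi$ for all $a \in A$. Thus $f$ is $<$-increasing, hence $<_{J_{<\lambda}[A]}$-increasing. Suppose that $D$ is an ultrafilter on $A$ such that $\mathrm{cf}(\prod A/D) = \lambda$. Then $\{\min(A)\} \in D$, as otherwise $A \setminus \{\min(A)\} \in D$ and hence $\mathrm{cf}(\prod A/D) > \lambda$ by Proposition 26.1(vii). Thus for any $g \in \prod A$, let $\xi = g(\min(A)) + 1$. Then $\{a \in A : g(a) < f_\xi(a)\} \supseteq \{\min(A)\} \in D$, so $[g] < [f_\xi]$. Hence $\langle [f_\xi] : \xi < \lambda \rangle$ is cofinal in $\prod A/D$.
Now suppose that $\min(A) < \lambda$. Let $a_0 = \min A$. Let $A' = A \setminus \{a_0\}$. If $D$ is an ultrafilter such that $\lambda = \mathrm{cf}(\prod A/D)$, then $A' \in D$ since $a_0 < \lambda$, hence $\{a_0\} \notin D$. It follows that $\lambda \in \mathrm{pcf}(A')$. Clearly $|A'|^+ < \min A'$. Hence by assumption we get a system $\langle f_\xi : \xi < \lambda \rangle$ of members of $\prod A'$ which is increasing in $<_{J_{<\lambda}[A']}$ such that for every ultrafilter $D$ over $A'$ such that $\lambda = \mathrm{cf}(\prod A'/D)$, $f$ is cofinal in $\prod A'/D$. Extend each $f_\xi$ to $g_\xi \in \prod A$ by setting $g_\xi(a_0) = 0$. If $\xi < \eta < \lambda$, then
\[ \{a \in A : g_\xi(a) \ge g_\eta(a)\} \subseteq \{a \in A' : f_\xi(a) \ge f_\eta(a)\} \cup \{a_0\}, \]
and $\{a \in A' : f_\xi(a) \ge f_\eta(a)\} \in J_{<\lambda}[A'] \subseteq J_{<\lambda}[A]$ and also $\{a_0\} \in J_{<\lambda}[A]$ since $a_0 < \lambda$, so $g_\xi <_{J_{<\lambda}} g_\eta$. Now let $D$ be an ultrafilter over $A$ such that $\lambda = \mathrm{cf}(\prod A/D)$. As above, $A' \in D$; let $D' = D \cap \mathcal{P}(A')$. Then $\lambda = \mathrm{cf}(\prod A'/D')$. To show that $g$ is cofinal in $\prod A/D$, take any $h \in \prod A$. Choose $\xi < \lambda$ such that $(h \upharpoonright A')/D' < f_\xi/D'$. Then
\[ \{a \in A : h(a) \ge g_\xi(a)\} \subseteq \{a \in A' : h(a) \ge f_\xi(a)\} \cup \{a_0\}, \]
and the set on the right is not in $D$, so $h/D < g_\xi/D$, as desired.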
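Thus we can make the assumption as in (1). Suppose that there is no universal sequence for $\lambda$. Thus
(2) For every $<_{J_{<\lambda}}$-increasing sequence $f = \langle f_\xi : \xi < \lambda \rangle$ there is an ultrafilter $D$ over $A$ such that $\mathrm{cf}(\prod A/D) = \lambda$ but $f$ is bounded in $\prod A/D$.
We are now going to construct a $<_{J_{<\lambda}}$-increasing sequence $f^\alpha = \langle f^\alpha_\xi : \xi < \lambda \rangle$ for each $\alpha < |A|^+$. We use the fact that $\prod A/J_{<\lambda}$ is $\lambda$-directed (Theorem 26.8).
Using this directedness, we start with any $<_{J_{<\lambda}}$-increasing sequence $f^0 = \langle f^0_\xi : \xi < \lambda \rangle$.
For limit $\delta < |A|^+$ we are going to define $f^\delta_\xi$ by induction on $\xi$ so that the following conditions hold:
(3) $f^\delta_i <_{J_{<\lambda}} f^\delta_\xi$ for $i < \xi$,
(4) $\sup\{f^\alpha_\xi : \alpha < \delta\} \le f^\delta_\xi$.
Suppose that $f^\delta_i$ has been defined for all $i < \xi$. By $\lambda$-directedness, choose $g$ such that $f^\delta_i <_{J_{<\lambda}} g$ for all $i < \xi$. Now for any $a \in A$ we have $\sup\{f^\alpha_\xi(a) : \alpha < \delta\} < a$, since $\delta < |A|^+ < \min A \le a$. Hence we can define
\[ f^\delta_\xi(a) = \max\{g(a), \sup\{f^\alpha_\xi(a) : \alpha < \delta\}\}. \]
Clearly the conditions (3), (4) hold.
Now suppose that $f^\alpha$ has been defined and is $<_{J_{<\lambda}}$-increasing; we define $f^{\alpha+1}$. By (2), choose an ultrafilter $D_\alpha$ over $A$ such that
(5) $\mathrm{cf}(\prod A/D_\alpha) = \lambda$;
(6) The sequence $f^\alpha$ is bounded in $<_{D_\alpha}$.
By (6), choose $f^{\alpha+1}_0$ which bounds $f^\alpha$ in $<_{D_\alpha}$; in addition, $f^{\alpha+1}_0 \ge f^\alpha_0$. Let $\langle h_\xi/D_\alpha : \xi < \lambda \rangle$ be strictly increasing and cofinal in $\prod A/D_\alpha$. Now we define $f^{\alpha+1}_\xi$ by induction on $\xi$ when $\xi > 0$. First, by $\lambda$-directedness, choose $k$ such that $f^{\alpha+1}_i <_{J_{<\lambda}} k$ for all $i < \xi$. Then for any $a \in A$ let
\[ f^{\alpha+1}_\xi(a) = \max(k(a), h_\xi(a), f^\alpha_\xi(a)). \]
Then the following conditions hold:
(7) $f^{\alpha+1}$ is strictly increasing and cofinal in $\prod A/D_\alpha$;
(8) $f^{\alpha+1}_i \ge f^\alpha_i$ for every $i < \lambda$.
This finishes the construction. Clearly we then have
(9) If $i < \lambda$ and $\alpha_1 < \alpha_2 < |A|^+$, then $f^{\alpha_1}_i \le f^{\alpha_2}_i$.
(10) $f^\alpha$ is bounded in $\prod A/D_\alpha$ by $f^{\alpha+1}_0$.
(11) $f^{\alpha+1}$ is cofinal in $\prod A/D_\alpha$.
Now let $h = \sup_{\alpha < |A|^+} f^\alpha_0$. Then $h \in \prod A$, since $|A|^+ < \min(A)$. By (11), for each $\alpha < |A|^+$ choose $i_\alpha < \lambda$ such that $h <_{D_\alpha} f^{\alpha+1}_{i_\alpha}$. Since $\lambda > |A|^+$ is regular, we can choose $i < \lambda$ such that $i_\alpha < i$ for all $\alpha < |A|^+$. Now define
\[ A^\alpha = \{a \in A : h(a) \le f^\alpha_i(a)\}. \]
By (9) we have $A^\alpha \subseteq A^\beta$ for $\alpha < \beta < |A|^+$. We are going to get a contradiction by showing that $A^\alpha \subset A^{\alpha+1}$ for every $\alpha < |A|^+$.
In fact, this follows from the following two statements.
(12) $A^\alpha \notin D_\alpha$.
This holds because $f^\alpha_i <_{D_\alpha} f^{\alpha+1}_0 \le h$.
(13) $A^{\alpha+1} \in D_\alpha$.
This holds because $h <_{D_\alpha} f^{\alpha+1}_i$ by the choice of $i$ and (7).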
Proposition 26.19. If $A$ is a set of regular cardinals, $\lambda$ is the largest member of $\mathrm{pcf}(A)$, and $\langle f_\xi : \xi < \lambda \rangle$ is universal for $\lambda$, then it is cofinal in $(\prod A, <_{J_{<\lambda}})$.
Proof. Assume the hypotheses. Fix $g \in \prod A$; we want to find $\xi < \lambda$ such that $g <_{J_{<\lambda}} f_\xi$. Suppose that no such $\xi$ exists. Then, we claim, the set
(1) $J^*_{<\lambda} \cup \{\{a \in A : g(a) \ge f_\xi(a)\} : \xi < \lambda\}$
has fip. For, suppose that it does not have fip. Then there is a finite nonempty subset $F$ of $\lambda$ such that
(2) $\bigcup_{\xi \in F} \{a \in A : g(a) < f_\xi(a)\} \in J^*_{<\lambda}$.
Let $\eta$ be the largest member of $F$. Note that the set
\[ \{a \in A : f_\xi(a) < f_\rho(a) \text{ for all } \xi, \rho \in F \text{ such that } \xi < \rho\} \]
is also a member of $J^*_{<\lambda}$; intersecting this set with the set of (2), we get a member of $J^*_{<\lambda}$ which is a subset of $\{a \in A : g(a) < f_\eta(a)\}$, so that $g <_{J_{<\lambda}} f_\eta$, contradiction.
Thus the set (1) has fip. Let $D$ be an ultrafilter containing it. Then $\mathrm{cf}(\prod A/D) = \lambda$, so by hypothesis there is a $\xi < \lambda$ such that $g <_D f_\xi$. Thus $\{a \in A : g(a) < f_\xi(a)\} \in D$. But also $\{a \in A : g(a) \ge f_\xi(a)\} \in D$, contradiction.
Theorem 26.20. If $A$ is progressive, then $\mathrm{cf}(\prod A, <) = \max(\mathrm{pcf}(A))$. In particular, $\mathrm{cf}(\prod A, <)$ is regular.
Proof. First we prove $\ge$. Let $\lambda = \max(\mathrm{pcf}(A))$, and let $D$ be an ultrafilter on $A$ such that $\lambda = \mathrm{cf}(\prod A/D)$. Now for any $f, g \in \prod A$, if $f < g$ then $f <_D g$. Hence any cofinal set in $(\prod A, <)$ is also cofinal in $(\prod A, <_D)$, and so $\lambda = \mathrm{cf}(\prod A, <_D) \le \mathrm{cf}(\prod A, <)$.
To prove $\le$, we exhibit a cofinal subset of $(\prod A, <)$ of size $\lambda$. For every $\mu \in \mathrm{pcf}(A)$ fix a universal sequence $f^\mu = \langle f^\mu_i : i < \mu \rangle$ for $\mu$, by Theorem 26.18. Let $F$ be the set of all functions of the form
\[ \sup\{f^{\mu_1}_{i_1}, f^{\mu_2}_{i_2}, \ldots, f^{\mu_n}_{i_n}\}, \]
where $\mu_1, \mu_2, \ldots, \mu_n$ is a finite sequence of members of $\mathrm{pcf}(A)$, possibly with repetitions, and $i_k < \mu_k$ for each $k = 1, \ldots, n$. We claim that $F$ is cofinal in $(\prod A, <)$; this will complete the proof.
To prove this claim, let $g \in \prod A$. Let
\[ I = \{{>}(f, g) : f \in F\}. \]
(Recall that ${>}(f, g) = \{a \in A : f(a) > g(a)\}$.) Now $I$ is closed under unions, since
\[ {>}(f_1, g) \cup {>}(f_2, g) = {>}(\sup(f_1, f_2), g). \]
If $A \in I$, then $A = {>}(f, g)$ for some $f \in F$, as desired. So, suppose that $A \notin I$. Now $J \overset{\mathrm{def}}{=} \{A \setminus X : X \in I\}$ has fip since $I$ is closed under unions, and so this set can be extended to an ultrafilter $D$ over $A$. Let $\mu = \mathrm{cf}(\prod A/D)$. Then $f^\mu$ is cofinal in $(\prod A, <_D)$ since it is universal for $\mu$. But $f^\mu_i \le_D g$ for all $i < \mu$, since $f^\mu_i \in F$ and so ${>}(f^\mu_i, g) \in I$. This is a contradiction.
Note that Theorem 26.20 is not talking about true cofinality. In fact, clearly any increasing sequence of elements of $\prod A$ under $<$ must have order type at most $\min(A)$, and so true cofinality does not exist if $A$ has more than one element.
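The bound on the length of $<$-increasing sequences can be written out in one line. This sketch assumes that $<$ on $\prod A$ is the everywhere-pointwise order, that is, $f < g$ iff $f(a) < g(a)$ for all $a \in A$; the names $a_0$ and $\theta$ are introduced here for illustration.

```latex
% Let a_0 = \min(A) and let \langle f_\xi : \xi < \theta \rangle be <-increasing.
% Evaluating at a_0 gives a strictly increasing map into the ordinal a_0:
\[ \xi < \eta < \theta \ \Longrightarrow\ f_\xi(a_0) < f_\eta(a_0) < a_0 ,
   \qquad\text{hence}\qquad \theta \le a_0 = \min(A). \]
```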
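Lemma 26.21. Suppose that $A$ is progressive, $\lambda \in \mathrm{pcf}(A)$, and $f = \langle f_\xi : \xi < \lambda \rangle$ is a universal sequence for $\lambda$. Suppose that $f' = \langle f'_\xi : \xi < \lambda \rangle$ is $<_{J_{<\lambda}}$-increasing, and for every $\xi < \lambda$ there is an $\eta < \lambda$ such that $f_\xi \le_{J_{<\lambda}} f'_\eta$. Then $f'$ is universal for $\lambda$.
Proof. This is clear, since for any ultrafilter $D$ over $A$ such that $\mathrm{cf}(\prod A/D) = \lambda$ we have $D \cap J_{<\lambda} = \emptyset$, and hence $f_\xi \le_{J_{<\lambda}} f'_\eta$ implies that $f_\xi \le_D f'_\eta$.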
For the next result, note that if $A$ is progressive, then $|A| < \min(A)$, and hence $|A|^+ \le \min(A)$. So $A \cap |A|^+ = \emptyset \in J_{<\lambda}$ for any $\lambda$. So if $\mu$ is an ordinal and $A \cap \mu \notin J_{<\lambda}$, then $|A|^+ < \mu$.
Lemma 26.22. Suppose that $A$ is a progressive set of regular cardinals and $\lambda \in \mathrm{pcf}(A)$.
(i) Let $\mu$ be the least ordinal such that $A \cap \mu \notin J_{<\lambda}[A]$. Then there is a universal sequence for $\lambda$ that satisfies $(*)_\kappa$ for every regular cardinal $\kappa$ such that $\kappa < \mu$.
(ii) There is a universal sequence for $\lambda$ that satisfies $(*)_{|A|^+}$.
Proof. First note that (ii) follows from (i) by the remark preceding this lemma. Now
we prove (i). Note by the minimality of that either = + 1 for some A, or is a
limit cardinal and A is unbounded in .
(1) + 1.

Q
For, let D be an ultrafilter such thatQ = cf( A/D). Then A ( + 1) D, as otherwise
{a A : < a} D, and so cf( A/D) > by 26.1(vii), contradiction. Thus
pcf(A ( + 1)), and hence pcf((A ( + 1)) 6 , proving (1).
(2) 6= .
For, |A| < min(A) , so A is bounded in because is regular. Hence 6= by an
initial remark of this proof.
Now we can complete the proof for the case in which is + 1 for some A. In
this case, actually = . For, we have A J< [A] while
/ J< [A]. Let D
Q A ( + 1)
be an ultrafilter on A such that A ( + 1) D and cf( A/D) . Then A
/ D,
since A J< [A], so {} D, and so . By (1) we then have = .
Now define, for < and a A,
f (a) =


if a < ,
if a.

Q
Thus f A. The sequence hf : < i is <J< [A] -increasing, since if < < then
{a A : f (a) f (a)} A Q
J< [A]. It is also universal forQ. For, suppose that D is
an ultrafilter on A such that cf( A/D) = . Suppose that g A. Now |A| < min(A)
def
, so = (supaA g(a)) + 1 is less than . Now {a A : g(a) < f (a)} = A\ D, so
g <D f , as desired. Finally, hf : < i satisfies () , since it is itself strongly increasing
under J< [A]. In fact, if < < and a A\, then f (a) = < = f (a), and
A J< [A].
Hence the case remains in which < and A is unbounded in . Let hf : < i be
any universal
sequence for . We now apply Lemma 8.43 with I replaced by J< [A]. (Recall
Q
that ( A, I< [A] is -directed by Theorem 26.8.) This gives us a <J< [A] -increasing
sequence f = hf : < i such that f < f+1 for every < and () holds for f , for
every regular cardinal such that ++ < and {a A : a ++ } J< [A]. Clearly
then f is universal for . If is a regular cardinal less than , then ++ < < , and
{a A : a ++ } J< [A] by the minimality of , so the conclusion of the lemma
holds.
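Lemma 26.23. Suppose that $A$ is a progressive set of regular cardinals, $B \subseteq A$, and $\lambda$ is a regular cardinal. Then the following conditions are equivalent:
(i) $J_{\le\lambda}[A] = J_{<\lambda}[A] + B$.
(ii) $B \in J_{\le\lambda}[A]$, and for every ultrafilter $D$ on $A$, if $\mathrm{cf}(\prod A/D) = \lambda$, then $B \in D$.
Proof. (i)$\Rightarrow$(ii): Assume (i). Obviously, then, $B \in J_{\le\lambda}[A]$. Now suppose that $D$ is an ultrafilter on $A$ and $\mathrm{cf}(\prod A/D) = \lambda$. By Corollary 26.9(ii) we have $J_{\le\lambda}[A] \cap D \ne \emptyset = J_{<\lambda}[A] \cap D$. Choose $X \in J_{\le\lambda}[A] \cap D$. Then by Proposition 26.16, $X \setminus B \in J_{<\lambda}[A]$, so since $J_{<\lambda}[A] \cap D = \emptyset$, we get $B \in D$.
(ii)$\Rightarrow$(i): $\supseteq$ is clear. Now suppose that $X \in J_{\le\lambda}[A]$. If $X \subseteq B$, then obviously $X \in J_{<\lambda}[A] + B$. Suppose that $X \not\subseteq B$, and let $D$ be any ultrafilter such that $X \setminus B \in D$. Then $\mathrm{cf}(\prod A/D) \le \lambda$ since $\mathrm{pcf}(X) \subseteq \lambda^+$, and so $\mathrm{cf}(\prod A/D) < \lambda$ by the second assumption in (ii). This shows that $\mathrm{pcf}(X \setminus B) \subseteq \lambda$, so $X \setminus B \in J_{<\lambda}[A]$, and hence $X \in J_{<\lambda}[A] + B$ by Proposition 26.16.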
Theorem 26.24. If A is progressive, then every member of pcf(A) has a generator.
Proof. First suppose that we have shown the theorem if |A|+ < min(A). We show
how it follows when |A|+ = min(A). The least member of pcf(A) is |A|+ by 26.1(vii).
We have J<|A|+ [A] = {} and J|A|+ [A] = {, {|A|+}} = J<|A|+ [A] + |A|+ , so |A|+ is a
|A|+ -generator. Now suppose that pcf(A) with > |A|+ . Let A = A\{|A|+}. By
26.1(vi) we also have pcf(A ). By the supposed result there is a b A such that
J [A ] = J< [A ] + b. Hence, applying Lemma 26.6 to + and {|A|+ },
J [A] = J [A ] + {|A|+ }
= J< [A ] + b + {|A|+ }
= J< [A] + b,
as desired.
377

Thus we assume henceforth that |A|+ < min(A). Suppose that pcf(A). First we
take the case = |A|++ . Hence by Lemma 26.1(vii) we have A. Clearly
J [A] = {, {}} = {} + {} = J< [A] + {},
so has a generator in this case. So henceforth we assume that |A|++ < .
By Lemma 26.22, there is a universal sequence f = hf : < i for such that ()|A|+
holds. Hence by Lemma 8.40, f has an exact upper bound h with respect to <J< . Since
h is a least upper bound for f and the identity function on A is an upper bound for f , we
may assume that h(a) a for all a A. We now define
B = {a A : h(a) = a}.
Thus we can finish the proof by showing that
()

J [A] = J< [A] + B

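First we show that $B \in J_{\le\lambda}[A]$, i.e., that $\mathrm{pcf}(B) \subseteq \lambda^+$. Let $D$ be any ultrafilter over $A$ having $B$ as an element; we want to show that $\mathrm{cf}(\prod A/D) \le \lambda$. If $D \cap J_{<\lambda} \ne \emptyset$, then $\mathrm{cf}(\prod A/D) < \lambda$ by the definition of $J_{<\lambda}$. Suppose that $D \cap J_{<\lambda} = \emptyset$. Now since $f$ is $<_{J_{<\lambda}}$-increasing and $D \cap J_{<\lambda} = \emptyset$, the sequence $f$ is also $<_D$-increasing. It is also cofinal; for let $g \in \prod A$. Define
\[ g'(a) = \begin{cases} g(a) & \text{if } a \in B,\\ 0 & \text{otherwise.}\end{cases} \]
Then $\{a \in A : g'(a) \ge h(a)\} \subseteq \{a \in A : h(a) = 0\} \subseteq \{a \in A : f_0(a) \ge h(a)\} \in J_{<\lambda}$. So $g' <_{J_{<\lambda}} h$. Since $h$ is an exact upper bound for $f$, there is hence a $\xi < \lambda$ such that $g' <_{J_{<\lambda}} f_\xi$. Hence $g' <_D f_\xi$, and clearly $g =_D g'$, so $g <_D f_\xi$. This proves that $\mathrm{cf}(\prod A/D) = \lambda$. So we have proved $\supseteq$ in $(\star)$.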
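For $\subseteq$, we argue by contradiction and suppose that there is an $X \in J_{\le\lambda}$ such that $X \notin J_{<\lambda}[A] + B$. Hence (by Proposition 26.16), $X\setminus B \notin J_{<\lambda}$. Hence $J_{<\lambda}^* \cup \{X\setminus B\}$ has fip, so we extend it to an ultrafilter $D$. Since $D \cap J_{<\lambda} = \emptyset$, we have $\mathrm{cf}(\prod A/D) \ge \lambda$. But also $X \in D$ since $X\setminus B \in D$, and $X \in J_{\le\lambda}$, so $\mathrm{cf}(\prod A/D) = \lambda$. By the universality of $f$ it follows that $f$ is cofinal in $\prod A/D$. But $A\setminus B \in D$, so $\{a \in A : h(a) < a\} \in D$, and so there is a $\xi < \lambda$ such that $h <_D f_\xi$. This contradicts the fact that $h$ is an upper bound of $f$ under $<_{J_{<\lambda}}$.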
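Now we state some important properties of generators.
Lemma 26.25. Suppose that $A$ is progressive, $\lambda \in \mathrm{pcf}(A)$, and $B \subseteq A$.
(i) If $B$ is a $\lambda$-generator, $D$ is an ultrafilter on $A$, and $\mathrm{cf}(\prod A/D) = \lambda$, then $B \in D$.
(ii) If $B$ is a $\lambda$-generator, then $\lambda \notin \mathrm{pcf}(A\setminus B)$.
(iii) If $B \in J_{\le\lambda}$ and $\lambda \notin \mathrm{pcf}(A\setminus B)$, then $B$ is a $\lambda$-generator.
(iv) If $\lambda = \max(\mathrm{pcf}(A))$, then $A$ is a $\lambda$-generator on $A$.
(v) If $B$ is a $\lambda$-generator, then the restrictions to $B$ of any universal sequence for $\lambda$ are cofinal in $(\prod B, <_{J_{<\lambda}[B]})$.
(vi) If $B$ is a $\lambda$-generator, then $\mathrm{tcf}(\prod B, <_{J_{<\lambda}[B]}) = \lambda$.
(vii) If $B$ is a $\lambda$-generator on $A$, then $\lambda = \max(\mathrm{pcf}(B))$.
(viii) If $B$ is a $\lambda$-generator on $A$ and $D$ is an ultrafilter on $A$, then $\mathrm{cf}(\prod A/D) = \lambda$ iff $B \in D$ and $D \cap J_{<\lambda} = \emptyset$.
(ix) If $B$ is a $\lambda$-generator on $A$ and $B =_{J_{<\lambda}} C$, then $C$ is a $\lambda$-generator on $A$. [Here $X =_I Y$ means that the symmetric difference of $X$ and $Y$ is in $I$, for any ideal $I$.]
(x) If $B$ is a $\lambda$-generator, then so is $B \cap (\lambda + 1)$.
(xi) If $B$ and $C$ are $\lambda$-generators, then $B =_{J_{<\lambda}} C$.
(xii) If $\lambda = \max(\mathrm{pcf}(A))$ and $B$ is a $\lambda$-generator, then $A\setminus B \in J_{<\lambda}$.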
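Proof. (i): By Corollary 26.9(ii), choose $C \in J_{\le\lambda} \cap D$. Hence $C \subseteq X \cup B$ for some $X \in J_{<\lambda}$. By Corollary 26.9(ii) again, $J_{<\lambda} \cap D = \emptyset$, so $X \notin D$. Thus $C\setminus X \subseteq B$ and $C\setminus X \in D$, so $B \in D$.
(ii): Clear by (i).
(iii): Assume the hypothesis. We need to show that every member $C$ of $J_{\le\lambda}$ is a member of $J_{<\lambda} + B$. Now $\mathrm{pcf}(C) \subseteq \lambda^+$. Hence $\mathrm{pcf}(C\setminus B) \subseteq \lambda$, so $C\setminus B \in J_{<\lambda}$, and the desired conclusion follows from Proposition 26.16.
(iv): By (iii).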
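(v): Suppose not. Let $f = \langle f_\xi : \xi < \lambda\rangle$ be a universal sequence for $\lambda$ such that there is an $h \in \prod B$ such that $h$ is not bounded by any $f_\xi \restriction B$. Thus $\leq(f_\xi \restriction B, h) \notin J_{<\lambda}[B]$ for all $\xi < \lambda$. Now suppose that $\xi < \eta < \lambda$. Then
\[ \leq(f_\eta \restriction B, h) \setminus \bigl(\leq(f_\xi \restriction B, h)\bigr) = \{a \in B : f_\eta(a) \le h(a) < f_\xi(a)\} \subseteq \{a \in A : f_\eta(a) < f_\xi(a)\} \in J_{<\lambda}[A]. \]
Hence by Lemma 26.17(i) we have $\leq(f_\eta \restriction B, h) \setminus (\leq(f_\xi \restriction B, h)) \in J_{<\lambda}[B]$. It follows that if $N$ is a finite subset of $\lambda$ with largest element $\eta$, then
\[ (*)\qquad \bigl(\leq(f_\eta \restriction B, h)\bigr) \setminus \bigcap_{\xi \in N} \bigl(\leq(f_\xi \restriction B, h)\bigr) \in J_{<\lambda}[B]. \]
We claim now that
\[ M \stackrel{\mathrm{def}}{=} \{\leq(f_\xi \restriction B, h) : \xi < \lambda\} \cup (J_{<\lambda}[B])^* \]
has fip. Otherwise, there is a finite subset $N$ of $\lambda$ and a set $C \in J_{<\lambda}[B]$ such that
\[ \bigcap_{\xi \in N} \leq(f_\xi \restriction B, h) \cap (B\setminus C) = \emptyset; \]
hence if $\eta$ is the largest member of $N$ we get $\leq(f_\eta \restriction B, h) \in J_{<\lambda}[B]$ by $(*)$, contradiction.
So we extend the set $M$ to an ultrafilter $D$ on $B$, then to an ultrafilter $E$ on $A$. Note that $B \in E$. Also, $E \cap J_{<\lambda}[A] = \emptyset$. In fact, if $X \in E \cap J_{<\lambda}[A]$, then $X \cap B \in J_{<\lambda}[A]$, so $X \cap B \in D \cap J_{<\lambda}[B]$ by Lemma 26.17(i). But $D \cap J_{<\lambda}[B] = \emptyset$ by construction. Now $B \in E \cap J_{\le\lambda}[A]$, so $\mathrm{cf}(\prod A/E) = \lambda$, and $h$ bounds all $f_\xi$ in this ultraproduct, contradicting the universality of $f$.
(vi): By Lemma 26.17 and (v).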
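(vii): By (i) we have $\lambda \in \mathrm{pcf}(B)$. Now $B \in J_{\le\lambda}[A]$, so $\mathrm{pcf}(B) \subseteq \lambda^+$. The desired conclusion follows.
(viii): For $\Rightarrow$, suppose that $\mathrm{cf}(\prod A/D) = \lambda$. Then $B \in D$ by (i), and obviously $D \cap J_{<\lambda} = \emptyset$. For $\Leftarrow$, assume that $B \in D$ and $D \cap J_{<\lambda} = \emptyset$. Now $B \in J_{\le\lambda}$, so $\mathrm{cf}(\prod A/D) = \lambda$ by Corollary 26.9(ii).
(ix): We have $B \in J_{\le\lambda}$ and $C = (C\setminus B) \cup (C \cap B)$, so $C \in J_{\le\lambda}$. Suppose that $\lambda \in \mathrm{pcf}(A\setminus C)$. Let $D$ be an ultrafilter on $A$ such that $\mathrm{cf}(\prod A/D) = \lambda$ and $A\setminus C \in D$. Now $B \in D$ by (i), so $B\setminus C \in D$. This contradicts $B\setminus C \in J_{<\lambda}$. So $\lambda \notin \mathrm{pcf}(A\setminus C)$. Hence $C$ is a $\lambda$-generator, by (iii).
(x): Let $B' = B \cap (\lambda + 1)$. Clearly $B' \in J_{\le\lambda}$. Suppose that $\lambda \in \mathrm{pcf}(A\setminus B')$. Say $\lambda = \mathrm{cf}(\prod A/D)$ with $A\setminus B' \in D$. Also $A \cap (\lambda + 1) \in D$, since $A\setminus(\lambda + 1) \in D$ would imply that $\mathrm{cf}(\prod A/D) > \lambda$ by Proposition 26.1(vii). Since clearly
\[ (A\setminus B') \cap (A \cap (\lambda + 1)) \subseteq A\setminus B, \]
this yields $A\setminus B \in D$, contradicting (ii). Therefore, $\lambda \notin \mathrm{pcf}(A\setminus B')$. So $B'$ is a $\lambda$-generator, by (iii).
(xi): This is clear from Proposition 26.16.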
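Lemma 26.26. Suppose that $A$ is a progressive set, $F$ is a proper filter over $A$, and $\lambda$ is a cardinal. Then the following are equivalent.
(i) $\mathrm{tcf}(\prod A/F) = \lambda$.
(ii) $\lambda \in \mathrm{pcf}(A)$, $F$ has a $\lambda$-generator on $A$ as an element, and $J_{<\lambda}^* \subseteq F$.
(iii) $\mathrm{cf}(\prod A/D) = \lambda$ for every ultrafilter $D$ extending $F$.
Proof. (i)$\Rightarrow$(iii): obvious.
(iii)$\Rightarrow$(ii): Obviously $\lambda \in \mathrm{pcf}(A)$. Let $B$ be a $\lambda$-generator on $A$. Suppose that $B \notin F$. Then there is an ultrafilter $D$ on $A$ such that $A\setminus B \in D$ and $D$ extends $F$. Then $\mathrm{cf}(\prod A/D) = \lambda$ by (iii), and this contradicts Lemma 26.25(i).
(ii)$\Rightarrow$(i): Let $B \in F$ be a $\lambda$-generator. By Lemma 26.25(vi) we have $\mathrm{tcf}(\prod B/J_{<\lambda}) = \lambda$, and hence $\mathrm{tcf}(\prod A/F) = \lambda$ since $B \in F$ and $J_{<\lambda}^* \subseteq F$.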
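Proposition 26.27. Suppose that $A$ is a progressive set of regular cardinals, and $\lambda$ is any cardinal. Then the following conditions are equivalent:
(i) $\lambda = \max(\mathrm{pcf}(A))$.
(ii) $\lambda = \mathrm{tcf}(\prod A/J_{<\lambda}[A])$.
(iii) $\lambda = \mathrm{cf}(\prod A/J_{<\lambda}[A])$.
Proof. (i)$\Rightarrow$(ii): By Lemma 26.25(iv),(vi).
(ii)$\Rightarrow$(iii): Obvious.
(iii)$\Rightarrow$(ii): Assume (iii). Suppose that $D$ is an ultrafilter on $A$ such that $J_{<\lambda} \cap D = \emptyset$, and let $\mu = \mathrm{cf}(\prod A/D)$. Let $X$ be cofinal in $(\prod A, <_{J_{<\lambda}})$ with $|X| = \lambda$. Then $X$ is also cofinal in $(\prod A, <_D)$, so $\mu \le \lambda$. By Theorem 26.8, $(\prod A, <_{J_{<\lambda}})$ is $\lambda$-directed, so $\mu = \lambda$. Hence (ii) follows by Lemma 26.26.
(ii)$\Rightarrow$(i): Assume (ii). By Lemma 26.26, $\lambda \in \mathrm{pcf}(A)$. Suppose that $\mu \in \mathrm{pcf}(A)$ and $\lambda \le \mu$. Let $D$ be an ultrafilter on $A$ such that $\mathrm{cf}(\prod A/D) = \mu$. Then $D \cap J_{<\lambda} = \emptyset$ since $\lambda \le \mu$. By the equivalence of (i) and (iii) in Lemma 26.26 it follows that $\mu = \lambda$.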
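Lemma 26.28. Suppose that $A$ is progressive, $A_0 \subseteq A$, and $\lambda \in \mathrm{pcf}(A_0)$. Suppose that $B$ is a $\lambda$-generator on $A$. Then $B \cap A_0$ is a $\lambda$-generator on $A_0$.
Proof. Since $B \in J_{\le\lambda}[A]$, we have $\mathrm{pcf}(B) \subseteq \lambda^+$ and hence $\mathrm{pcf}(B \cap A_0) \subseteq \lambda^+$ and so $B \cap A_0 \in J_{\le\lambda}[A_0]$. If $\lambda \in \mathrm{pcf}(A_0\setminus B)$, then also $\lambda \in \mathrm{pcf}(A\setminus B)$, and this contradicts Lemma 26.25(ii). Hence $\lambda \notin \mathrm{pcf}(A_0\setminus B)$, and hence $B \cap A_0$ is a $\lambda$-generator for $A_0$ by Lemma 26.25(iii).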
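Definition. If $A$ is progressive, a generating sequence for $A$ is a sequence $\langle B_\lambda : \lambda \in \mathrm{pcf}(A)\rangle$ such that $B_\lambda$ is a $\lambda$-generator on $A$ for each $\lambda \in \mathrm{pcf}(A)$.
Theorem 26.29. Suppose that $A$ is progressive, $\langle B_\lambda : \lambda \in \mathrm{pcf}(A)\rangle$ is a generating sequence for $A$, and $X \subseteq A$. Then there is a finite subset $N$ of $\mathrm{pcf}(X)$ such that $X \subseteq \bigcup_{\mu \in N} B_\mu$.
Proof. We show that for all $X \subseteq A$, if $\lambda = \max(\mathrm{pcf}(X))$, then there is a finite subset $N$ as indicated, using induction on $\lambda$. So, suppose that this is true for every cardinal $\mu < \lambda$, and now suppose that $X \subseteq A$ and $\max(\mathrm{pcf}(X)) = \lambda$. Then $\lambda \notin \mathrm{pcf}(X\setminus B_\lambda)$ by Lemma 26.25(ii), and so $\mathrm{pcf}(X\setminus B_\lambda) \subseteq \lambda$. Hence $\max(\mathrm{pcf}(X\setminus B_\lambda)) < \lambda$. Hence by the inductive hypothesis there is a finite subset $N$ of $\mathrm{pcf}(X\setminus B_\lambda)$ such that $X\setminus B_\lambda \subseteq \bigcup_{\mu \in N} B_\mu$. Hence
\[ X \subseteq B_\lambda \cup \bigcup_{\mu \in N} B_\mu, \]
and $\{\lambda\} \cup N \subseteq \mathrm{pcf}(X)$.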
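Corollary 26.30. Suppose that $A$ is progressive, $\langle B_\lambda : \lambda \in \mathrm{pcf}(A)\rangle$ is a generating sequence for $A$, and $X \subseteq A$. Suppose that $\lambda$ is any infinite cardinal. Then $X \in J_{<\lambda}[A]$ iff $X \subseteq \bigcup_{\mu \in N} B_\mu$ for some finite subset $N$ of $\lambda \cap \mathrm{pcf}(A)$.
Proof. $\Rightarrow$: Assume that $X \in J_{<\lambda}[A]$. Thus $\mathrm{pcf}(X) \subseteq \lambda$, and Theorem 26.29 gives the desired conclusion.
$\Leftarrow$: Assume that a set $N$ is given as indicated. Suppose that $\mu \in \mathrm{pcf}(X)$. Say $\mu = \mathrm{cf}(\prod A/D)$ with $X \in D$. Then $B_\nu \in D$ for some $\nu \in N$. By the definition of generator, $B_\nu \in J_{\le\nu}[A]$, and hence $\mu \le \nu < \lambda$. Thus we have shown that $\mathrm{pcf}(X) \subseteq \lambda$, so $X \in J_{<\lambda}[A]$.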
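As a sanity check on the definitions of generator and generating sequence, one can work out the degenerate case of a finite set. The computation below is only an illustrative sketch and is not taken from the text; it assumes that $A$ is a finite nonempty set of regular cardinals (so $A$ is progressive and every ultrafilter on $A$ is principal), and it uses the description of $J + B$ as $\{X : X\setminus B \in J\}$.

```latex
% Assumption (not in the source): A is a finite nonempty set of regular cardinals.
% A principal ultrafilter at a gives an ultraproduct isomorphic to a, of cofinality a.
\begin{align*}
\mathrm{pcf}(X) &= X \quad\text{for every } X \subseteq A,\\
J_{<\lambda}[A] &= \{X \subseteq A : X \subseteq \lambda\} = \mathcal{P}(A \cap \lambda),\\
J_{\le\lambda}[A] &= \mathcal{P}(A \cap \lambda^+)
   = \mathcal{P}\bigl((A \cap \lambda) \cup \{\lambda\}\bigr)
   = J_{<\lambda}[A] + \{\lambda\} \quad\text{for } \lambda \in A.
\end{align*}
% So B_\lambda = \{\lambda\} is a \lambda-generator for each \lambda in pcf(A) = A, and
\[
X \in J_{<\lambda}[A] \iff X \subseteq \bigcup_{\mu \in N} B_\mu = N
\text{ for some finite } N \subseteq \lambda \cap \mathrm{pcf}(A), \text{ namely } N = X .
\]
```

In this case Theorem 26.29 and Corollary 26.30 are trivial; the content of the general results is that finitely many generators still suffice when $A$ is infinite.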
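Lemma 26.31. Suppose that $A$ is progressive and $\langle B_\lambda : \lambda \in \mathrm{pcf}(A)\rangle$ is a generating sequence for $A$. Suppose that $D$ is an ultrafilter on $A$. Then there is a $\lambda \in \mathrm{pcf}(A)$ such that $B_\lambda \in D$, and if $\lambda$ is minimum with this property, then $\lambda = \mathrm{cf}(\prod A/D)$.
Proof. Let $\mu = \mathrm{cf}(\prod A/D)$. Then $\mu \in \mathrm{pcf}(A)$ and $B_\mu \in D$ by Lemma 26.25(i). Suppose that $B_\lambda \in D$ with $\lambda < \mu$. Now $B_\lambda \in J_{\le\lambda} \subseteq J_{<\mu}$, contradicting Lemma 26.25(viii), applied to $\mu$.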
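Lemma 26.32. If $A$ is progressive and also $\mathrm{pcf}(A)$ is progressive, and if $\lambda \in \mathrm{pcf}(A)$ and $B$ is a $\lambda$-generator for $A$, then $\mathrm{pcf}(B)$ is a $\lambda$-generator for $\mathrm{pcf}(A)$.
Proof. Note by Theorem 26.15 that $\mathrm{pcf}(\mathrm{pcf}(B)) = \mathrm{pcf}(B)$ and $\mathrm{pcf}(\mathrm{pcf}(A\setminus B)) = \mathrm{pcf}(A\setminus B)$. Since $B \in J_{\le\lambda}[A]$, we have $\mathrm{pcf}(B) \subseteq \lambda^+$, and hence $\mathrm{pcf}(\mathrm{pcf}(B)) \subseteq \lambda^+$ and so $\mathrm{pcf}(B) \in J_{\le\lambda}[\mathrm{pcf}(A)]$. Now suppose that $\lambda \in \mathrm{pcf}(\mathrm{pcf}(A)\setminus\mathrm{pcf}(B))$. Then by Lemma 26.1(iv) we have $\lambda \in \mathrm{pcf}(\mathrm{pcf}(A\setminus B)) = \mathrm{pcf}(A\setminus B)$, contradicting Lemma 26.25(ii). So $\lambda \notin \mathrm{pcf}(\mathrm{pcf}(A)\setminus\mathrm{pcf}(B))$. It now follows by Lemma 26.25(iii) that $\mathrm{pcf}(B)$ is a $\lambda$-generator for $\mathrm{pcf}(A)$.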
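The following result is relevant to Theorem 8.44. Taking $C$ as given there, we can apply Lemma 26.26 with $\lambda$ replaced by $\mu^+$ and $A$ by $C^{(+)}$ to infer that $J_{<\mu^+}[C^{(+)}]$ is a subset of the ideal of bounded subsets of $\mu$. So the following is apparently a generalization of Theorem 8.44, noticing also that $J_{<\mu} = J_{<\mu^+}$.
Lemma 26.33. If $\mu$ is a singular cardinal of uncountable cofinality, then there is a club $C \subseteq \mu$ such that $\mathrm{tcf}(\prod C^{(+)}/J_{<\mu}[C^{(+)}]) = \mu^+$.
Proof. Let $C_0$ be a club in $\mu$ such that $\mu^+ = \mathrm{tcf}(\prod C_0^{(+)}/J^{bd})$, by Theorem 8.44. Thus $\mu^+ \in \mathrm{pcf}(C_0^{(+)})$ by Lemma 26.26. Let $B$ be a $\mu^+$-generator for $C_0^{(+)}$. Define $C = \{\delta \in C_0 : \delta^+ \in B\}$. Now $C_0\setminus C$ is bounded. Otherwise, let $X = C_0^{(+)}\setminus B = (C_0\setminus C)^{(+)}$. So $X$ is unbounded, and hence clearly $\mu^+ = \mathrm{tcf}(\prod X/J^{bd})$. Hence $\mu^+ \in \mathrm{pcf}(X)$. This contradicts Lemma 26.25(ii).
So, choose $\varepsilon < \mu$ such that $C_0\setminus C \subseteq \varepsilon$. Hence $C_0\setminus\varepsilon \subseteq C\setminus\varepsilon \subseteq C_0\setminus\varepsilon$, so $C_0\setminus\varepsilon = C\setminus\varepsilon$. Clearly $\mu^+ = \mathrm{tcf}(\prod (C_0\setminus\varepsilon)^{(+)}/J^{bd})$, so $\mu^+ \in \mathrm{pcf}((C_0\setminus\varepsilon)^{(+)})$. We claim that $\mathrm{tcf}(\prod (C_0\setminus\varepsilon)^{(+)}/J_{<\mu}[(C_0\setminus\varepsilon)^{(+)}]) = \mu^+$ (as desired). To show this, we apply Lemma 26.26. Suppose that $D$ is any ultrafilter on $(C_0\setminus\varepsilon)^{(+)}$ such that $J_{<\mu}[(C_0\setminus\varepsilon)^{(+)}] \cap D = \emptyset$. Now by Lemma 26.28, $B \cap (C_0\setminus\varepsilon)^{(+)}$ is a $\mu^+$-generator for $(C_0\setminus\varepsilon)^{(+)}$. But $B \cap (C_0\setminus\varepsilon)^{(+)} = B \cap (C\setminus\varepsilon)^{(+)} = (C\setminus\varepsilon)^{(+)}$. It follows by Lemma 26.25(viii) that $\mathrm{cf}(\prod (C_0\setminus\varepsilon)^{(+)}/D) = \mu^+$.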
Lemma 26.34. Assume that $\mu$ is a singular cardinal, that $C$ is a club in $\mu$, and that $\mathrm{tcf}(\prod C^{(+)}/J_{<\mu}[C^{(+)}]) = \mu^+$. Then $\max(\mathrm{pcf}(C^{(+)})) = \mu^+$.
Proof. Since $J_{<\mu}[C^{(+)}] = J_{\le\mu}[C^{(+)}]$, this is immediate from Proposition 26.27.
Corollary 26.35. If $\mu$ is a singular cardinal of uncountable cofinality, then there is a club $C \subseteq \mu$ such that $\max(\mathrm{pcf}(C^{(+)})) = \mu^+$.
By essentially the same proof as for Lemma 26.33 we get
Lemma 26.36. If $\mu$ is a singular cardinal of countable cofinality, then there is an unbounded subset $C$ of $\mu$ consisting of regular cardinals such that $\mathrm{tcf}(\prod C/J_{<\mu}[C]) = \mu^+$.
Lemma 26.37. Assume that $\mu$ is a singular cardinal, that $C$ is an unbounded set of regular cardinals less than $\mu$, and that $\mathrm{tcf}(\prod C/J_{<\mu}[C]) = \mu^+$. Then $\max(\mathrm{pcf}(C)) = \mu^+$.
Proof. Since $J_{<\mu}[C] = J_{\le\mu}[C]$, this is immediate from Proposition 26.27.
Corollary 26.38. If $\mu$ is a singular cardinal of countable cofinality, then there is an unbounded $C \subseteq \mu$ consisting of regular cardinals such that $\max(\mathrm{pcf}(C)) = \mu^+$.
Proposition 26.39. Suppose that $F$ is a proper filter over a progressive set $A$ of regular cardinals. Define
\[ \mathrm{pcf}_F(A) = \Bigl\{\mathrm{cf}\Bigl(\prod A/D\Bigr) : D \text{ is an ultrafilter extending } F\Bigr\}. \]
Then:
(i) $\max(\mathrm{pcf}_F(A))$ exists.
(ii) $\mathrm{cf}(\prod A/F) = \max(\mathrm{pcf}_F(A))$.
(iii) If $B \subseteq \mathrm{pcf}_F(A)$ is progressive, then $\mathrm{pcf}(B) \subseteq \mathrm{pcf}_F(A)$.
(iv) If $A$ is a progressive interval of regular cardinals with no largest element, and
\[ F = \{X \subseteq A : A\setminus X \text{ is bounded}\} \]
is the filter of co-bounded subsets of $A$, then $\mathrm{pcf}_F(A)$ is an interval of regular cardinals.
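Proof. (i): Clearly $\mathrm{pcf}_F(A) \subseteq \mathrm{pcf}(A)$, and so if $\nu = \max(\mathrm{pcf}(A))$, then $A \in F \cap J_{<\nu^+}[A]$. Hence we can choose $\mu$ minimum such that $F \cap J_{<\mu}[A] \ne \emptyset$. By Corollary 26.12, $\mu$ is not a limit cardinal; write $\mu = \lambda^+$. Then $F \cap J_{<\lambda} = \emptyset$, and so $F \cup J_{<\lambda}^*$ has fip; let $D$ be an ultrafilter containing this set. Then $D \cap J_{\le\lambda} \supseteq F \cap J_{\le\lambda} \ne \emptyset$, while $D \cap J_{<\lambda} = \emptyset$. Hence $\mathrm{cf}(\prod A/D) = \lambda$ by Corollary 26.26. On the other hand, since $F \cap J_{\le\lambda}[A] \ne \emptyset$, any ultrafilter $E$ containing $F$ must be such that $\mathrm{cf}(\prod A/E) \le \lambda$.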
(ii): Cf. the proof of Theorem 26.20. Let $\lambda = \max(\mathrm{pcf}_F(A))$, and let $D$ be an ultrafilter extending $F$ such that $\lambda = \mathrm{cf}(\prod A/D)$. Let $\langle f_\xi : \xi < \lambda\rangle$ be strictly increasing and cofinal mod $D$. Now if $g < h$ mod $F$, then also $g < h$ mod $D$. So a cofinal subset of $\prod A$ mod $F$ is also a cofinal subset mod $D$, so $\lambda \le \mathrm{cf}(\prod A/F)$. Hence it suffices to exhibit a cofinal subset of $\prod A$ mod $F$ of size $\lambda$. For every $\mu \in \mathrm{pcf}_F(A)$ fix a universal sequence $f^\mu = \langle f^\mu_i : i < \mu\rangle$ for $\mu$, by Theorem 26.18. Let $G$ be the set of all functions of the form
\[ \sup\{f^{\mu_1}_{i_1}, f^{\mu_2}_{i_2}, \ldots, f^{\mu_n}_{i_n}\}, \]
where $\mu_1, \mu_2, \ldots, \mu_n$ is a finite sequence of members of $\mathrm{pcf}_F(A)$, possibly with repetitions, and $i_k < \mu_k$ for each $k = 1, \ldots, n$. We claim that $G$ is cofinal in $(\prod A, <_F)$; this will complete the proof of (ii).
To prove this claim, let $g \in \prod A$. Suppose that $g \not< f$ mod $F$ for all $f \in G$. Then, we claim, the set
\[ (*)\qquad F \cup \{\{a \in A : f(a) \le g(a)\} : f \in G\} \]
has fip. For, suppose not. Then there is a finite subset $G'$ of $G$ such that $\bigcup_{f \in G'} \{a \in A : g(a) < f(a)\} \in F$. Let $h = \sup_{f \in G'} f$. Then $g < h$ mod $F$ and $h \in G$, contradiction. Thus $(*)$ has fip, and we let $D$ be an ultrafilter containing it. Let $\mu = \mathrm{cf}(\prod A/D)$. Then $\mu \in \mathrm{pcf}_F(A)$, and $f \le g$ mod $D$ for all $f \in G$. Since the members of a universal sequence for $\mu$ are in $G$, this is a contradiction. This completes the proof of (ii).
For (iii), we look at the proof of Theorem 26.15. Let $F'$ be the ultrafilter named $F$ at the beginning of that proof. Since $B \subseteq \mathrm{pcf}_F(A)$, each $b \in B$ is in $\mathrm{pcf}_F(A)$, and hence the ultrafilters $D_b$ can be taken to extend $F$. Hence $F \subseteq F'$, and so $\mu \in \mathrm{pcf}_F(A)$, as desired in (iii).
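Finally, we prove (iv). Let $\lambda_0 = \min(\mathrm{pcf}_F(A))$ and $\lambda_1 = \max(\mathrm{pcf}_F(A))$, and suppose that $\lambda$ is a regular cardinal such that $\lambda_0 < \lambda < \lambda_1$. Let $D$ be an ultrafilter such that $F \subseteq D$ and $\mathrm{cf}(\prod A/D) = \lambda_1$. Then by Corollary 26.9(ii), $D \cap J_{<\lambda_1} = \emptyset$, so $J_{<\lambda_1}^* \subseteq D$. Thus $F \cup J_{<\lambda}^* \subseteq F \cup J_{<\lambda_1}^* \subseteq D$, so $F \cup J_{<\lambda}^*$ generates a proper filter $G$. Since $(\prod A, <_{J_{<\lambda}})$ is $\lambda$-directed by Theorem 26.8, so is $(\prod A, <_G)$. Note that if $a \in A$, then $\{b \in A : a < b\} \in F$. It follows that $\sup(A) \le \lambda_0 < \lambda$. Hence we can apply Theorem 26.4 and get a subset $A'$ of $A$ (since $A$ is an interval of regular cardinals) and a proper ideal $K$ over $A'$ such that $A'$ is cofinal in $A$, $K$ contains all proper initial segments of $A'$, and $\mathrm{tcf}(\prod A', <_K) = \lambda$. Let $\langle f_\xi : \xi < \lambda\rangle$ be strictly increasing and cofinal mod $K$. Extend the dual of $K$ to a filter $L$ on $A$, and extend each function $f_\xi$ to a function $f_\xi^+$ on $A$. Then clearly $\langle f_\xi^+ : \xi < \lambda\rangle$ is strictly increasing and cofinal mod $L$, and $L$ contains $F$. This shows that $\lambda \in \mathrm{pcf}_F(A)$.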
EXERCISES
E26.1. A set $A$ of regular cardinals is almost progressive iff $A$ is infinite, and $A \cap |A|$ is finite. Prove the following:
(i) Every progressive set is almost progressive.
(ii) If $A$ is an infinite set of regular cardinals and $|A| < \aleph_\omega$, then $A$ is almost progressive.
(iii) Every infinite subset of an almost progressive set is almost progressive.
(iv) If $A$ is almost progressive, then $A\setminus|A|^+$ is progressive, $A \cap |A|^+$ is finite, and $A = (A\setminus|A|^+) \cup (A \cap |A|^+)$.
(v) If $\alpha$ is an ordinal, $A$ is almost progressive, and $A \cap \alpha$ is unbounded in $\alpha$, then $\alpha$ is a singular cardinal.
(vi) If $A$ is an almost progressive interval of regular cardinals, then $A$ does not have any weak inaccessible as a member, except possibly its first element. If in addition $A$ is infinite, then there is a singular cardinal $\lambda$ such that $A \cap \lambda$ is unbounded in $\lambda$ and $A\setminus\lambda$ is finite.
E26.2. Show that Theorem 26.4 and Corollary 26.5 also hold if A is almost progressive.
E26.3. Suppose that A is progressive and |A| is regular. Show that sup(pcf (A))
sup(A) .
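E26.4. Suppose that $A$ is progressive, and $\kappa$ is a cardinal such that $A \cap \kappa$ is unbounded in $\kappa$. Show that $\kappa^+ \in \mathrm{pcf}(A)$.
E26.5. Assume GCH and suppose that $A$ is progressive. Show that
\[ \mathrm{pcf}(A) = A \cup \{\kappa^+ : \kappa \text{ a cardinal},\ A \cap \kappa \text{ unbounded in } \kappa\}. \]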
E26.6. Suppose that is a limit ordinal and A is an infinite set such that |A| < cf().
Determine all regular cardinals such that l = cf(A /D) for some ultrafilter D on A.
E26.7. Suppose that $\kappa$ is a regular cardinal and $D$ is an ultrafilter on $\kappa$ such that $\kappa\setminus\alpha \in D$ for every $\alpha < \kappa$. Show that $\mathrm{cf}({}^\kappa\kappa/D) > \kappa$.
E26.8. Let A be a progressive set of regular cardinals and an infinite cardinal. Suppose
that J is an ideal on A. Show that the following conditions are equivalent:
(i) J = I< [A].
(ii) J is the intersection of all ideals K on A which satisfy the following condition:
ForQeach X A with X
/ K there is an ultrafilter D on X such that D K = and
cf( X/D) .
E26.9. Suppose that $A$ is progressive and $J$ is a proper ideal on $A$.
(i) Show that if $X \in \mathcal{P}(A)\setminus J$, then $J + (A\setminus X)$ is proper.
(ii) Show that there is an $X \in \mathcal{P}(A)\setminus J$ such that $\mathrm{tcf}(\prod A, <_{J+(A\setminus X)})$ exists.
E26.10. Suppose that A is progressive and is an infinite cardinal with |A|. Then
|pcf (A)| |A| .


27. Main cofinality theorems

The sets $H_\Psi$
We will shortly give several proofs involving the important general idea of making elementary chains inside the sets $H_\Psi$. Recall that $H_\Psi$, for an infinite cardinal $\Psi$, is the collection of all sets hereditarily of size less than $\Psi$, i.e., with transitive closure of size less than $\Psi$. We consider $H_\Psi$ as a structure with $\in$ together with a well-ordering $<^*$ of it, possibly with other relations or functions, and consider elementary substructures of such structures.
Recall that $A$ is an elementary substructure of $B$ iff $A$ is a subset of $B$, and for every formula $\varphi(x_0, \ldots, x_{m-1})$ and all $a_0, \ldots, a_{m-1} \in A$, $A \models \varphi(a_0, \ldots, a_{m-1})$ iff $B \models \varphi(a_0, \ldots, a_{m-1})$.
The basic downward Löwenheim-Skolem theorem will be used a lot. This theorem depends on the following lemma.
Lemma 27.1. (Tarski) Suppose that $A$ and $B$ are first-order structures in the same language, with $A$ a substructure of $B$. Then the following conditions are equivalent:
(i) $A$ is an elementary substructure of $B$.
(ii) For every formula of the form $\exists y\varphi(x_0, \ldots, x_{m-1}, y)$ and all $a_0, \ldots, a_{m-1} \in A$, if $B \models \exists y\varphi(a_0, \ldots, a_{m-1}, y)$ then there is a $b \in A$ such that $B \models \varphi(a_0, \ldots, a_{m-1}, b)$.
Proof. (i)$\Rightarrow$(ii): Assume (i) and the hypotheses of (ii). Then by (i) we see that $A \models \exists y\varphi(a_0, \ldots, a_{m-1}, y)$, so we can choose $b \in A$ such that $A \models \varphi(a_0, \ldots, a_{m-1}, b)$. Hence $B \models \varphi(a_0, \ldots, a_{m-1}, b)$, as desired.
(ii)$\Rightarrow$(i): Assume (ii). We show that for any formula $\varphi(x_0, \ldots, x_{m-1})$ and any elements $a_0, \ldots, a_{m-1} \in A$, $A \models \varphi(a_0, \ldots, a_{m-1})$ iff $B \models \varphi(a_0, \ldots, a_{m-1})$, by induction on $\varphi$. It is true for $\varphi$ atomic by our assumption that $A$ is a substructure of $B$. The induction steps involving $\neg$ and $\vee$ are clear. Now suppose that $A \models \exists y\varphi(a_0, \ldots, a_{m-1}, y)$, with $a_0, \ldots, a_{m-1} \in A$. Choose $b \in A$ such that $A \models \varphi(a_0, \ldots, a_{m-1}, b)$. By the inductive assumption, $B \models \varphi(a_0, \ldots, a_{m-1}, b)$. Hence $B \models \exists y\varphi(a_0, \ldots, a_{m-1}, y)$, as desired.
Conversely, suppose that $B \models \exists y\varphi(a_0, \ldots, a_{m-1}, y)$. By (ii), choose $b \in A$ such that $B \models \varphi(a_0, \ldots, a_{m-1}, b)$. By the inductive assumption, $A \models \varphi(a_0, \ldots, a_{m-1}, b)$. Hence $A \models \exists y\varphi(a_0, \ldots, a_{m-1}, y)$, as desired.
Theorem 27.2. Suppose that $A$ is an $L$-structure, $X$ is a subset of $A$, $\kappa$ is an infinite cardinal, and $\kappa$ is $\ge$ both $|X|$ and the number of formulas of $L$, while $\kappa \le |A|$. Then $A$ has an elementary substructure $B$ such that $X \subseteq B$ and $|B| = \kappa$.
Proof. Let a well-order $\prec$ of $A$ be given. We define $\langle C_n : n \in \omega\rangle$ by recursion. Let $C_0$ be a subset of $A$ of size $\kappa$ with $X \subseteq C_0$. Now suppose that $C_n$ has been defined. Let $M_n$ be the collection of all pairs of the form $(\exists y\varphi(x_0, \ldots, x_{m-1}, y), a)$ such that $a$ is a sequence of elements of $C_n$ of length $m$. For each such pair we define $f(\exists y\varphi(x_0, \ldots, x_{m-1}, y), a)$ to be the $\prec$-least element $b$ of $A$ such that $A \models \varphi(a_0, \ldots, a_{m-1}, b)$, if there is such an element, and otherwise let it be the least element of $C_n$. Then we define
\[ C_{n+1} = C_n \cup \{f(\exists y\varphi(x_0, \ldots, x_{m-1}, y), a) : (\exists y\varphi(x_0, \ldots, x_{m-1}, y), a) \in M_n\}. \]
Finally, let $B = \bigcup_{n \in \omega} C_n$.
By induction it is clear that $|C_n| = \kappa$ for all $n \in \omega$, and so also $|B| = \kappa$.
Now to show that $B$ is an elementary substructure of $A$ we apply Lemma 27.1. It is easy to see that $B$ is a substructure of $A$. (Closure under fundamental operations of the language follows using the construction.) Now suppose that we are given a formula of the form $\exists y\varphi(x_0, \ldots, x_{m-1}, y)$ and elements $a_0, \ldots, a_{m-1}$ of $B$, and $A \models \exists y\varphi(a_0, \ldots, a_{m-1}, y)$. Clearly there is an $n \in \omega$ such that $a_0, \ldots, a_{m-1} \in C_n$. Then $(\exists y\varphi(x_0, \ldots, x_{m-1}, y), a) \in M_n$, and $f(\exists y\varphi(x_0, \ldots, x_{m-1}, y), a)$ is an element $b$ of $C_{n+1} \subseteq B$ such that $A \models \varphi(a_0, \ldots, a_{m-1}, b)$. This is as desired in Lemma 27.1.
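The closure construction in this proof can be imitated on a finite toy structure. The following is a minimal sketch, not the construction of the theorem itself: it assumes a finite universe given as a Python list (whose order stands in for the well-order), and it represents each existential formula by a hypothetical pair `(m, phi)`, where `phi(args, y)` decides whether `y` is a witness for the `m`-tuple `args`. The name `skolem_hull` and the sample structure are illustrative only.

```python
from itertools import product

def skolem_hull(universe, start, formulas):
    """Close `start` under least witnesses, mirroring C_0, C_1, ... in the proof.

    universe: list of elements; list order plays the role of the well-order.
    start:    iterable of elements of the universe (the set X).
    formulas: list of (m, phi) pairs; phi(args, y) -> bool says whether y
              witnesses the existential formula for the m-tuple args.
    """
    current = set(start)
    while True:
        new = set(current)
        for m, phi in formulas:
            for args in product(tuple(current), repeat=m):
                # Least witness in the well-order, if one exists.  (The proof's
                # default value lies in C_n already, so it adds nothing new.)
                witness = next((y for y in universe if phi(args, y)), None)
                if witness is not None:
                    new.add(witness)
        if new == current:  # fixed point reached: this plays the role of B
            return current
        current = new

# Hypothetical toy structure: integers mod 12 with the formula "y = a + b".
universe = list(range(12))
addition = (2, lambda args, y: y == (args[0] + args[1]) % 12)
hull = skolem_hull(universe, {4}, [addition])
```

In a finite universe the loop stops at a fixed point after finitely many rounds; in the theorem the universe is infinite, so the union over all $n \in \omega$ is taken instead.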
Given an elementary substructure $A$ of a set $H_\Psi$, we will frequently use an argument of the following kind. A set theoretic formula holds in the real world, and involves only sets in $A$. By absoluteness, it holds in $H_\Psi$, and hence it holds in $A$. Thus we can transfer a statement to $A$ even though $A$ may not be transitive; and the procedure can be reversed.
To carry this out, we need some facts about transitive closures first of all.
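Lemma 27.3. (i) If $X \subseteq A$, then $\mathrm{tr\,cl}(X) \subseteq \mathrm{tr\,cl}(A)$.
(ii) $\mathrm{tr\,cl}(\mathcal{P}(A)) = \mathcal{P}(A) \cup \mathrm{tr\,cl}(A)$.
(iii) If $\mathrm{tr\,cl}(A)$ is infinite, then $|\mathrm{tr\,cl}(\mathcal{P}(A))| \le 2^{|\mathrm{tr\,cl}(A)|}$.
(iv) $\mathrm{tr\,cl}(A \cup B) = \mathrm{tr\,cl}(A) \cup \mathrm{tr\,cl}(B)$.
(v) $\mathrm{tr\,cl}(A \times B) = (A \times B) \cup \{\{a\} : a \in A\} \cup \{\{a, b\} : a \in A, b \in B\} \cup \mathrm{tr\,cl}(A) \cup \mathrm{tr\,cl}(B)$.
(vi) If $\mathrm{tr\,cl}(A)$ or $\mathrm{tr\,cl}(B)$ is infinite, then $|\mathrm{tr\,cl}(A \times B)| \le \max(|\mathrm{tr\,cl}(A)|, |\mathrm{tr\,cl}(B)|)$.
(vii) $\mathrm{tr\,cl}({}^A B) \subseteq ({}^A B) \cup \mathrm{tr\,cl}(A \times B)$.
(viii) If $\mathrm{tr\,cl}(A)$ or $\mathrm{tr\,cl}(B)$ is infinite, then $|\mathrm{tr\,cl}({}^A B)| \le 2^{\max(|\mathrm{tr\,cl}(A)|, |\mathrm{tr\,cl}(B)|)}$.
(ix) If $\mathrm{tr\,cl}(A)$ is infinite, then $|\mathrm{tr\,cl}(\prod A)| \le 2^{|\mathrm{tr\,cl}(A)|}$.
(x) If $\mathrm{tr\,cl}(A)$ or $\mathrm{tr\,cl}(B)$ is infinite, then $|\mathrm{tr\,cl}({}^A(\prod B))| \le 2^{2^{\max(|\mathrm{tr\,cl}(A)|, |\mathrm{tr\,cl}(B)|)}}$.
(xi) If $A$ is an infinite set of regular cardinals, then $|\mathrm{tr\,cl}(\mathrm{pcf}(A))| \le 2^{|\mathrm{tr\,cl}(A)|}$.
Proof. (i)-(viii) are clear. For (ix), note that $\prod A \subseteq {}^A(\bigcup A)$, so (ix) follows from (viii). For (x),
\[ \Bigl|\mathrm{tr\,cl}\Bigl({}^A\Bigl(\prod B\Bigr)\Bigr)\Bigr| \le 2^{\max(|\mathrm{tr\,cl}(A)|, |\mathrm{tr\,cl}(\prod B)|)} \le 2^{\max(|\mathrm{tr\,cl}(A)|, 2^{|\mathrm{tr\,cl}(B)|})} \le 2^{2^{\max(|\mathrm{tr\,cl}(A)|, |\mathrm{tr\,cl}(B)|)}}, \]
where the first inequality holds by (viii).
Finally, for (xi), note that $\mathrm{tr\,cl}(\mathrm{pcf}(A)) = \mathrm{pcf}(A) \cup \bigcup \mathrm{pcf}(A)$. Now $|\mathrm{pcf}(A)| \le 2^{|A|} \le 2^{|\mathrm{tr\,cl}(A)|}$ by Theorem 9.27. If $\lambda \in \mathrm{pcf}(A)$, then clearly $\lambda \le |\prod A|$, so the desired conclusion now follows from (ix).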
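Parts (ii) and (iv) of Lemma 27.3 can be checked mechanically on hereditarily finite sets. The sketch below is only an illustration under stated assumptions: sets are modelled as nested `frozenset` values, and `tr_cl(x)` is taken to be the smallest transitive set including `x` as a subset (the convention that makes (ii) hold as stated). The helper names `tr_cl` and `powerset` are hypothetical and not from the text.

```python
from itertools import chain, combinations

def tr_cl(x):
    """Transitive closure of a hereditarily finite set x (nested frozensets):
    the members of x, their members, and so on."""
    closure = set()
    stack = list(x)
    while stack:
        y = stack.pop()
        if y not in closure:
            closure.add(y)
            stack.extend(y)  # also visit the members of y
    return frozenset(closure)

def powerset(a):
    """The set of all subsets of the finite set a."""
    elems = list(a)
    subsets = chain.from_iterable(
        combinations(elems, r) for r in range(len(elems) + 1))
    return frozenset(frozenset(s) for s in subsets)

# The von Neumann ordinals 0, 1, 2 and two sample sets built from them.
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
A = frozenset({two})
B = frozenset({frozenset({one})})

part_ii_holds = tr_cl(powerset(A)) == powerset(A) | tr_cl(A)
part_iv_holds = tr_cl(A | B) == tr_cl(A) | tr_cl(B)
```

Both checks come out true on this sample; the lemma is of course only needed for infinite sets, where the cardinality bounds in (iii), (vi) and (viii) carry the real content.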
We also need the fact that some rather complicated formulas and functions are absolute for sets $H_\Psi$. Note that $H_\Psi$ is transitive. Many of the indicated formulas are not absolute for $H_\Psi$ in general, but only under the assumptions given that $\Psi$ is much larger than the sets in question.
Lemma 27.4. Suppose that $\Psi$ is an uncountable regular cardinal. Then the following formulas (as detailed in the proof) are absolute for $H_\Psi$.
(i) $B = \mathcal{P}(A)$.
(ii) $D$ is an ultrafilter on $A$.
(iii) $\kappa$ is a cardinal.
(iv) $\kappa$ is a regular cardinal.
(v) $\kappa$ and $\lambda$ are cardinals, and $\lambda = \kappa^+$.
(vi) $\kappa = |A|$.
(vii) $B = \prod A$.
(viii) $A = {}^B C$.
(ix) $A$ is infinite, if $\Psi$ is uncountable.
(x) $A$ is an infinite set of regular cardinals and $D$ is an ultrafilter on $A$ and $\lambda$ is a regular cardinal and $f \in {}^\lambda\prod A$ and $f$ is strictly increasing and cofinal modulo $D$, provided that $2^{|\mathrm{tr\,cl}(A)|} < \Psi$.
(xi) $A$ is an infinite set of regular cardinals, and $B = \mathrm{pcf}(A)$, if $2^{|\mathrm{tr\,cl}(A)|} < \Psi$.
(xii) $A$ is an infinite set of regular cardinals and $f = \langle J_{<\lambda}[A] : \lambda \in \mathrm{pcf}(A)\rangle$, provided that $2^{|\mathrm{tr\,cl}(A)|} < \Psi$.
(xiii) $A$ is an infinite set of regular cardinals and $B = \langle B_\lambda : \lambda \in \mathrm{pcf}(A)\rangle$ and $\forall\lambda \in \mathrm{pcf}(A)$ ($B_\lambda$ is a $\lambda$-generator), if $2^{2^{|\mathrm{tr\,cl}(A)|}} < \Psi$.
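Proof. Absoluteness follows by easy arguments upon producing suitable formulas, as follows.
(i): Suppose that $A, B \in H_\Psi$. We may take the formula $B = \mathcal{P}(A)$ to be
\[ \forall x \in B[\forall y \in x(y \in A)] \wedge \forall x[\forall y \in x(y \in A) \to x \in B]. \]
The first part is obviously absolute for $H_\Psi$. If the second part holds in $V$ it clearly holds in $H_\Psi$. Now suppose that the second part holds in $H_\Psi$. Suppose that $x \subseteq A$. Hence $x \in H_\Psi$ and it follows that $x \in B$.
(ii): Assume that $A, D \in H_\Psi$. We can take the statement "$D$ is an ultrafilter on $A$" to be the following statement:
\[ \forall X \in D(X \subseteq A) \wedge A \in D \wedge \forall X, Y \in D(X \cap Y \in D) \wedge \emptyset \notin D \wedge \forall Y \forall X \in D[X \subseteq Y \wedge Y \subseteq A \to Y \in D] \wedge \forall Y[Y \subseteq A \to Y \in D \vee (A\setminus Y) \in D]. \]
Again this is absolute because $Y \subseteq A$ implies that $Y \in H_\Psi$.
(iii): Suppose that $\kappa \in H_\Psi$. Then $\kappa$ is a cardinal iff $\kappa$ is an ordinal and
\[ \forall f[f \text{ is a function and } \mathrm{dmn}(f) = \kappa \text{ and } \mathrm{rng}(f) \in \kappa \to f \text{ is not one-to-one}]. \]
Note here that if $f$ is a function with $\mathrm{dmn}(f) = \kappa$ and $\mathrm{rng}(f) \in \kappa$, then $f \subseteq \kappa \times \kappa$, and hence $f \in H_\Psi$.
(iv): Assume that $\kappa \in H_\Psi$. Then $\kappa$ is a regular cardinal iff $\kappa$ is a cardinal, $1 < \kappa$, and
\[ \forall f[f \text{ is a function and } \mathrm{dmn}(f) \in \kappa \text{ and } \mathrm{rng}(f) \subseteq \kappa \text{ and } \forall\xi, \eta \in \mathrm{dmn}(f)(\xi < \eta \to f(\xi) < f(\eta)) \to \exists\gamma < \kappa\,\forall\xi \in \mathrm{dmn}(f)(f(\xi) \in \gamma)]. \]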
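(v): Assume that $\kappa, \lambda \in H_\Psi$. Then ($\kappa$ and $\lambda$ are cardinals and $\lambda = \kappa^+$) iff $\kappa$ is a cardinal and $\lambda$ is a cardinal and $\kappa < \lambda$ and
\[ \forall\alpha < \lambda[\kappa < \alpha \to \exists f[f \text{ is a function and } \mathrm{dmn}(f) = \kappa \text{ and } f \text{ is one-one and } \mathrm{rng}(f) = \alpha]]. \]
(vi): Suppose that $\kappa, A \in H_\Psi$. Then $\kappa = |A|$ iff $\kappa$ is a cardinal and
\[ \exists f[f \text{ is a function and } \mathrm{dmn}(f) = \kappa \text{ and } \mathrm{rng}(f) = A \text{ and } f \text{ is one-to-one}]. \]
(vii): Assume that $A, B \in H_\Psi$. Then $B = \prod A$ iff
\[ \forall f \in B[f \text{ is a function and } \mathrm{dmn}(f) = A \text{ and } \forall x \in A[f(x) \in x]] \text{ and } \forall f[f \text{ is a function and } \mathrm{dmn}(f) = A \text{ and } \forall x \in A[f(x) \in x] \to f \in B]. \]
Note that if $f$ is a function with domain $A$ and $f(x) \in x$ for all $x \in A$, then $f \subseteq A \times \bigcup A$ and hence $f \in H_\Psi$.
(viii): Suppose that $A, B, C \in H_\Psi$. Then $A = {}^B C$ iff
\[ \forall f \in A[f \text{ is a function and } \mathrm{dmn}(f) = B \text{ and } \mathrm{rng}(f) \subseteq C] \text{ and } \forall f[f \text{ is a function and } \mathrm{dmn}(f) = B \text{ and } \mathrm{rng}(f) \subseteq C \to f \in A]. \]
(ix): $A$ is infinite iff $\exists f$ ($f$ is a one-one function, $\mathrm{dmn}(f) = \omega$, and $\mathrm{rng}(f) \subseteq A$).
(x): Suppose that $A, D, \lambda, f \in H_\Psi$, and $2^{|\mathrm{tr\,cl}(A)|} < \Psi$. Then $\prod A \in H_\Psi$ by Lemma 27.3(ix). Now
$A$ is an infinite set of regular cardinals and $D$ is an ultrafilter on $A$ and $\lambda$ is a regular cardinal and $f \in {}^\lambda\prod A$ and $f$ is strictly increasing and cofinal modulo $D$
iff
$A$ is infinite and $\forall x \in A$[$x$ is a regular cardinal] and $D$ is an ultrafilter on $A$ and $\lambda$ is a regular cardinal and
\[ \exists B\Bigl[B = \prod A \text{ and } f \text{ is a function and } \mathrm{dmn}(f) = \lambda \text{ and } \mathrm{rng}(f) \subseteq B \text{ and } \forall\xi, \eta < \lambda[\xi < \eta \to \exists X \subseteq A[\forall a \in A[a \in X \leftrightarrow f_\xi(a) < f_\eta(a)] \wedge X \in D]] \text{ and } \forall g \in B\,\exists\xi < \lambda\,\exists X \subseteq A[\forall a \in A[a \in X \leftrightarrow g(a) < f_\xi(a)] \wedge X \in D]\Bigr]. \]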
(xi): Assume that 2|tr cl(A)|) < , and A, B H . Let (A, D, , f ) be the statement
of (x). Note:
(1) If (A, D, , f ), then D, , f H , and max(, |tr cl(A)|) 2|tr cl(A)| .
In fact, D P(A), so tr cl(D) tr cl(P(A)) = P(A) tr cl(A), and so |tr cl(D)|
Q <
by Lemma
27.3(iii); so D H . Now f is a one-one function from into
A,
Q
Q so
|tr cl(A)|
| A| < , and hence H and max(, |tr cl(A)|) 2
. Finally, f A,
so it follows that f H .
Thus (1) holds. Hence the following equivalence shows the absoluteness of the statement in (xi):
A is an infinite set of regular cardinals and B = pcf(A)
iff
A is infinite, and A( is a regular cardinal) BDf (A, D, , f )
Df [(A, D, , f ) B].
(xii): Assume that 2|tr cl(A)|) < . By Lemma 27.3(xi) we have pcf(A) H . Hence
A is an infinite set of regular cardinals f = hJ< [A] : pcf(A)i
iff
A is infinite and A( is a regular cardinal and
f is a function and B[B = pcf(A) B = dmn(f )]
dmn(f )X A[A f () iff C[C = pcf(X) C ]]
|tr cl(A)|

< , and A, B H . Note as above that pcf(A)


(xiii): Assume that 22
H . Note that for any cardinal we have J< [A] P(A) and, with f as in (xi),
f pcf(A) P(P(A)); so f H . Let (f, A) be the formula of (xii). Thus
A is a set of regular cardinals and B = hB : pcf(A)i
and pcf(A)(B is a -generator)
iff
B is a function and C[C = pcf(A) C = dmn(B)] f [(f, A)
dmn(B) dmn(B)[ is a cardinal and is a cardinal and
= + B f () X A[X f () iff X\B f ()]]]
Now we turn to the consideration of elementary substructures of $H_\Psi$. The following lemma gives basic facts used below.
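Lemma 27.5. Suppose that $\Psi$ is an uncountable cardinal, and $N$ is an elementary substructure of $H_\Psi$ (under $\in$ and a well-order of $H_\Psi$).
(i) For every ordinal $\alpha$, $\alpha \in N$ iff $\alpha + 1 \in N$.
(ii) $\omega \subseteq N$.
(iii) If $a \in N$, then $\{a\} \in N$.
(iv) If $a, b \in N$, then $\{a, b\}, (a, b) \in N$.
(v) If $A, B \in N$, then $A \times B \in N$.
(vi) If $A \in N$ then $\bigcup A \in N$.
(vii) If $f \in N$ is a function, then $\mathrm{dmn}(f), \mathrm{rng}(f) \in N$.
(viii) If $f \in N$ is a function and $a \in N \cap \mathrm{dmn}(f)$, then $f(a) \in N$.
(ix) If $X, Y \in N$, $X \subseteq N$, and $|Y| \le |X|$, then $Y \subseteq N$.
(x) If $X \in N$ and $X \ne \emptyset$, then $X \cap N \ne \emptyset$.
(xi) $\mathcal{P}(A) \in N$ if $A \in N$ and $2^{|\mathrm{tr\,cl}(A)|} < \Psi$.
(xii) If $\alpha$ is an infinite ordinal, $|\alpha|^+ < \Psi$, and $\alpha \in N$, then $|\alpha| \in N$ and $|\alpha|^+ \in N$.
(xiii) If $A \in N$, then $\prod A \in N$ if $2^{|\mathrm{tr\,cl}(A)|} < \Psi$.
(xiv) If $A \in N$, $A$ is a set of regular cardinals, and $A \in H_\Psi$, then $\mathrm{pcf}(A) \in N$ if $2^{|\mathrm{tr\,cl}(A)|} < \Psi$.
(xv) If $A \in N$, $A$ is a set of regular cardinals, then $\langle J_{<\lambda}[A] : \lambda \in \mathrm{pcf}(A)\rangle \in N$ if $2^{2^{|\mathrm{tr\,cl}(A)|}} < \Psi$.
(xvi) If $A \in N$ and $A$ is a set of regular cardinals, then there is a function $\langle B_\lambda : \lambda \in \mathrm{pcf}(A)\rangle \in N$, where for each $\lambda \in \mathrm{pcf}(A)$, the set $B_\lambda$ is a $\lambda$-generator, if $2^{2^{|\mathrm{tr\,cl}(A)|}} < \Psi$.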
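Proof. (i): Let $\alpha$ be an ordinal, and suppose that $\alpha \in N$. Then $\alpha \in H_\Psi$, and hence $\alpha \cup \{\alpha\} \in H_\Psi$. By absoluteness, $H_\Psi \models \exists x(x = \alpha \cup \{\alpha\})$, so $N \models \exists x(x = \alpha \cup \{\alpha\})$. Choose $b \in N$ such that $N \models b = \alpha \cup \{\alpha\}$. Then $H_\Psi \models b = \alpha \cup \{\alpha\}$, so by absoluteness, $b = \alpha \cup \{\alpha\}$. This proves that $\alpha \cup \{\alpha\} \in N$.
The method used in proving (i) can be used in the other parts; so it suffices in most other cases just to indicate a formula which can be used.
(ii): An easy induction, using the formulas $\exists x\forall y \in x(y \ne y)$ and $\exists x[a \subseteq x \wedge a \in x \wedge \forall y \in x[y \in a \vee y = a]]$.
(iii): Use the formula $\exists x[\forall y \in x(y = a) \wedge a \in x]$.
(iv): Similar to (ii).
(v): Use the formula
\[ \exists C[\forall a \in A\,\forall b \in B[(a, b) \in C] \wedge \forall x \in C\,\exists a \in A\,\exists b \in B[x = (a, b)]]. \]
(vi): Use the formula $\exists B[\forall x \in A[x \subseteq B] \wedge \forall y \in B\,\exists x \in A(y \in x)]$.
(vii): Use the formula $\exists A[\forall x\forall y[(x, y) \in f \to x \in A] \wedge \forall x \in A\,\exists y[(x, y) \in f]]$. Note that this formula is absolute for $H_\Psi$; for example, $(x, y) \in f \in H_\Psi$ implies that $x, y \in H_\Psi$.
(viii): Use the formula $\exists x[(a, x) \in f]$.
(ix): Let $f$ be a function mapping $X$ onto $Y$ (assuming, as we may, that $Y \ne \emptyset$). Then $f \in H_\Psi$, so by the above method, we get another function $g \in N$ which maps $X$ onto $Y$. Now (viii) gives the conclusion of (ix).
(x): Use the formula $\exists x \in X[x = x]$.
(xi): $\mathcal{P}(A) \in H_\Psi$ by Lemma 27.3(iii). Hence we can use the formula
\[ \exists B[\forall x \in B(x \subseteq A) \wedge \forall x[x \subseteq A \to x \in B]]. \]
(xii): Assume that $\alpha$ is an infinite ordinal and $\alpha \in N$. Then
\[ H_\Psi \models \exists\beta\,\exists f[(f : \beta \to \alpha, \text{ a bijection}) \wedge \forall\gamma\,\forall g[(g : \gamma \to \alpha, \text{ a bijection}) \to \beta \le \gamma]]. \]
Hence by the standard argument, there are $\beta, f \in N$ such that
\[ H_\Psi \models f : \beta \to \alpha \text{ is a bijection} \wedge \forall\gamma\,\forall g[(g : \gamma \to \alpha, \text{ a bijection}) \to \beta \le \gamma]. \]
Clearly then $\beta = |\alpha|$.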
For ||+ , use the formula

f [f is a bijection from onto ]
f [f is a bijection from onto ]
[ 
=
.
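(xiii): Note that $\prod A \in H_\Psi$ by Lemma 27.3(ix). Then use the formula
\[ \exists B\bigl[\forall f \in B(f \text{ is a function} \wedge \mathrm{dmn}(f) = A \wedge \forall a \in A(f(a) \in a)) \wedge \forall f[f \text{ is a function} \wedge \mathrm{dmn}(f) = A \wedge \forall a \in A(f(a) \in a) \to f \in B]\bigr]. \]
(xiv): $\mathrm{pcf}(A) \in H_\Psi$ by Lemma 27.3(xi), so by Lemma 27.4(xi) we can use the formula $\exists B[B = \mathrm{pcf}(A)]$.
(xv): We have $\mathrm{pcf}(A) \in H_\Psi$ and $\mathcal{P}(\mathcal{P}(A)) \in H_\Psi$ by Lemma 27.3(iii),(xi). It follows that $\langle J_{<\lambda}[A] : \lambda \in \mathrm{pcf}(A)\rangle \in H_\Psi$. Hence by Lemma 27.4(xii) we can use the formula $\exists f[f = \langle J_{<\lambda}[A] : \lambda \in \mathrm{pcf}(A)\rangle]$.
(xvi): By Lemma 27.3(iii),(xi) and Lemma 27.4(xiii) we can use the formula
\[ \exists B[B : \mathrm{pcf}(A) \to \mathcal{P}(A) \wedge \forall\lambda \in \mathrm{pcf}(A)[B_\lambda \text{ is a } \lambda\text{-generator for } A]]. \]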
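Definition. Let $\kappa$ be a regular cardinal. An elementary substructure $N$ of $H_\Psi$ is $\kappa$-presentable iff there is an increasing and continuous chain $\langle N_\alpha : \alpha < \kappa\rangle$ of elementary substructures of $H_\Psi$ such that:
(1) $|N| = \kappa$ and $\kappa + 1 \subseteq N$.
(2) $N = \bigcup_{\alpha < \kappa} N_\alpha$.
(3) For every $\alpha < \kappa$, the function $\langle N_\beta : \beta \le \alpha\rangle$ is a member of $N_{\alpha+1}$.
It is obvious how to construct a $\kappa$-presentable substructure of $H_\Psi$.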
Lemma 27.6. If N is a -presentable substructure of H , with notation as above,
and if < , then + N N+1 .
Proof. First we show that N for all < , by induction. It is trivial for = 0,
and the successor step is immediate from the induction hypothesis and Lemma 27.5(vii).
The limit step is clear.
Now it follows that + N by an inductive argument using Lemma 27.5(i).
Finally, N N+1 by (3) and Lemma 27.5(viii).
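For any set $M$, we let $\overline{M}$ be the set of all ordinals $\alpha$ such that $\alpha \in M$ or $M \cap \alpha$ is unbounded in $\alpha$.
Lemma 27.7. If $N$ is a $\kappa$-presentable substructure of $H_\Psi$, with notation as above, then
(i) If $\alpha < \kappa$, then $\overline{N_\alpha} \subseteq N$.
(ii) If $\kappa < \gamma \in \overline{N}\setminus N$, then $\gamma$ is a limit ordinal and $\mathrm{cf}(\gamma) = \kappa$, and in fact there is a closed unbounded subset $E$ of $\gamma$ such that $E \subseteq N$ and $E$ has order type $\kappa$.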
Proof. First we consider (i). Suppose that N . We may assume that
/ N .
Case 1. = sup(N Ord). Then
H |= [( N ) [( N ) ]];
in fact, our given is the unique for which this holds. Hence this statement holds in
N , as desired.
Case 2. N ( < ). We may assume that is minimum with this property. Now
for any N we can let () be the supremum of all ordinals in N which are less than
. So () = . By absoluteness we get
H |= N [ N ( < < )
[ N ( < < ) ]];
Hence N models this formula too; applying it to in place of , we get N such that
N |= N ( < < )
[ N ( < < ) ].
Thus = N , as desired. This proves (i).
For (ii), suppose that < N \N . Let E = {sup( N ) : < }. Note that if
< , then by (i), sup( N ) N . So E N . It is clearly closed in . It is unbounded,
since for any N there is a < such that N , and so sup(N ) N .
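For any set $N$ we define the characteristic function of $N$; it is defined for each regular cardinal $\mu$ as follows:
\[ \mathrm{Ch}_N(\mu) = \sup(N \cap \mu). \]
Proposition 27.8. Let $\kappa$ be a regular cardinal, let $N$ be a $\kappa$-presentable substructure of $H_\Psi$, and let $\mu$ be a regular cardinal.
(i) If $\mu \le \kappa$, then $\mathrm{Ch}_N(\mu) = \mu \in N$.
(ii) If $\kappa < \mu$, then $\mathrm{Ch}_N(\mu) \notin N$, $\mathrm{Ch}_N(\mu) < \mu$, and $\mathrm{Ch}_N(\mu)$ has cofinality $\kappa$.
(iii) For every $\gamma \in N \cap \mu$ we have $\gamma \le \mathrm{Ch}_N(\mu)$.
Proof. (i): True since $\kappa + 1 \subseteq N$.
(ii): Since $|N| = \kappa < \mu$ and $\mu$ is regular, we must have $\mathrm{Ch}_N(\mu) \notin N$ and $\mathrm{Ch}_N(\mu) < \mu$. Then $\mathrm{Ch}_N(\mu)$ has cofinality $\kappa$ by Lemma 27.7.
(iii): clear.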
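Theorem 27.9. Suppose that $M$ and $N$ are elementary substructures of $H_\Psi$ and $\kappa < \mu$ are cardinals, with $\mu < \Psi$.
(i) If $M \cap \kappa \subseteq N \cap \kappa$ and $\sup(M \cap \nu^+) = \sup(M \cap N \cap \nu^+)$ for every successor cardinal $\nu^+ \le \mu$ such that $\nu^+ \in M$, then $M \cap \mu \subseteq N \cap \mu$.
(ii) If $M$ and $N$ are both $\kappa$-presentable and if $\sup(M \cap \nu^+) = \sup(N \cap \nu^+)$ for every successor cardinal $\nu^+ \le \mu$ such that $\nu^+ \in M$, then $M \cap \mu = N \cap \mu$.
Proof. (i): Assume the hypothesis. We prove by induction on cardinals $\delta$ in the interval $[\kappa, \mu]$ that $M \cap \delta \subseteq N \cap \delta$. This is given for $\delta = \kappa$. If, inductively, $\delta$ is a limit cardinal, then the desired conclusion is clear. So assume now that $\delta$ is a cardinal, $\kappa \le \delta < \mu$, and $M \cap \delta \subseteq N \cap \delta$. If $\delta^+ \notin M$, then by Lemma 27.5(xii), $[\delta, \delta^+] \cap M = \emptyset$, so the desired conclusion is immediate from the inductive hypothesis. So, assume that $\delta^+ \in M$. Then the hypothesis of (i) implies that there are ordinals in $[\delta, \delta^+]$ which are in $M \cap N$, and hence by Lemma 27.5(xii) again, $\delta^+ \in N$. Now to show that $M \cap [\delta, \delta^+] \subseteq N \cap [\delta, \delta^+]$, take any ordinal $\gamma \in M \cap [\delta, \delta^+]$. We may assume that $\gamma < \delta^+$. Since $\sup(M \cap \delta^+) = \sup(M \cap N \cap \delta^+)$ by assumption, we can choose $\beta \in M \cap N \cap \delta^+$ such that $\gamma < \beta$. Let $f$ be the $<^*$-smallest bijection from $\beta$ to $\delta$. So $f \in M \cap N$. Since $\gamma \in M$, we also have $f(\gamma) \in M$ by Lemma 27.5(viii). Now $f(\gamma) < \delta$, so by the inductive assumption that $M \cap \delta \subseteq N \cap \delta$, we have $f(\gamma) \in N$. Since $f \in N$, so is $f^{-1}$, and $f^{-1}(f(\gamma)) = \gamma \in N$, as desired. This finishes the proof of (i).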
(ii): Assume the hypothesis. Now we want to check the hypothesis of (i). By the
definition of -presentable we have = M = N . Now suppose that is a cardinal
and + with + M . We may assume that < + . Let = ChM ( + ); this is
the same as ChN ( + ) by the hypothesis of (ii). By Lemma 27.8 we have
/ M N;
hence by Lemma 27.7 there are clubs P, Q in such that P M and Q N . Hence
sup(M + ) = sup(M + ) = sup(M N + ). This verifies the hypothesis of (i) for
the pair M, N and also for the pair N, M . So our conclusion follows.
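Minimally obedient sequences
Suppose that $A$ is progressive, $\lambda \in \mathrm{pcf}(A)$, and $B$ is a $\lambda$-generator for $A$. A sequence $\langle f_\xi : \xi < \lambda \rangle$ of members of $\prod A$ is called persistently cofinal for $\lambda, B$ provided that $\langle (f_\xi \restriction B) : \xi < \lambda \rangle$ is persistently cofinal in $(\prod B, <_{J_{<\lambda}[B]})$. Recall from page 120 that this means that for all $h \in \prod B$ there is a $\xi_0 < \lambda$ such that for all $\xi$, if $\xi_0 \le \xi < \lambda$, then $h <_{J_{<\lambda}[B]} (f_\xi \restriction B)$.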
Lemma 27.10. Suppose that $A$ is progressive, $\lambda \in \mathrm{pcf}(A)$, and $B$ and $C$ are $\lambda$-generators for $A$. A sequence $\langle f_\xi : \xi < \lambda \rangle$ of members of $\prod A$ is persistently cofinal for $\lambda, B$ iff it is persistently cofinal for $\lambda, C$.
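Proof. Suppose that $\langle f_\xi : \xi < \lambda \rangle$ is persistently cofinal for $\lambda, B$, and suppose that $h \in \prod C$. Let $k \in \prod B$ be any function such that $h \restriction (B \cap C) = k \restriction (B \cap C)$. Choose $\xi_0 < \lambda$ such that for all $\xi \in [\xi_0, \lambda)$ we have $k <_{J_{<\lambda}[B]} (f_\xi \restriction B)$. Then for any $\xi \in [\xi_0, \lambda)$ we have
\[ \{a \in C : h(a) \ge f_\xi(a)\} = \{a \in B \cap C : h(a) \ge f_\xi(a)\} \cup \{a \in C \setminus B : h(a) \ge f_\xi(a)\} \subseteq \{a \in B : k(a) \ge f_\xi(a)\} \cup (C \setminus B). \]
Now $(C \setminus B) \in J_{<\lambda}[A]$ by Lemma 9.25(xi), so $h <_{J_{<\lambda}[C]} (f_\xi \restriction C)$. By symmetry the lemma follows.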
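Because of this lemma we say that $f$ is persistently cofinal for $\lambda$ iff it is persistently cofinal for $\lambda, B$ for some $\lambda$-generator $B$.
Lemma 27.11. Suppose that $A$ is progressive, $\lambda \in \mathrm{pcf}(A)$, and $f \overset{\mathrm{def}}{=} \langle f_\xi : \xi < \lambda \rangle$ is universal for $\lambda$. Then $f$ is persistently cofinal for $\lambda$.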
Proof. Let $B$ be a $\lambda$-generator. Then by Lemma 9.25(vii), $\lambda$ is the largest member of $\mathrm{pcf}(B)$. By Lemma 9.17, $\langle (f_\xi \restriction B) : \xi < \lambda \rangle$ is strictly increasing under $<_{J_{<\lambda}[B]}$, and by Lemma 9.25(v) it is cofinal in $(\prod B, <_{J_{<\lambda}[B]})$. By Proposition 8.12, it is thus persistently cofinal in $(\prod B, <_{J_{<\lambda}[B]})$.
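Lemma 27.12. Suppose that $A$ is progressive, $\lambda \in \mathrm{pcf}(A)$, and $A \in N$, where $N$ is a $\kappa$-presentable elementary substructure of $H_\Psi$, with $|A| < \kappa < \min(A)$ and $2^{|\mathrm{tr\,cl}(A)|} < \Psi$. Suppose that $f = \langle f_\xi : \xi < \lambda \rangle$ is a sequence of functions in $\prod A$.
Then for every $\xi < \lambda$ there is an $\alpha < \kappa$ such that for any $a \in A$,
\[ f_\xi(a) < \mathrm{Ch}_N(a) \quad\text{iff}\quad f_\xi(a) < \mathrm{Ch}_{N_\alpha}(a). \]
Proof.
\[ \mathrm{Ch}_N(a) = \sup(N \cap a) = \bigcup (N \cap a) = \bigcup_{\alpha < \kappa} \bigcup (N_\alpha \cap a) = \bigcup_{\alpha < \kappa} \mathrm{Ch}_{N_\alpha}(a). \]
Hence for every $a \in A$ for which $f_\xi(a) < \mathrm{Ch}_N(a)$, there is an $\alpha_a < \kappa$ such that $f_\xi(a) < \mathrm{Ch}_{N_{\alpha_a}}(a)$. Hence the existence of $\alpha$ as indicated follows.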
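The last step of this proof can be spelled out as follows. This is a sketch under the assumptions that the chain presenting $N$ is written $\langle N_\alpha : \alpha < \kappa \rangle$ and is increasing under inclusion, that $\kappa$ is regular, and that $\xi$ names the index fixed in the statement; the letter names are chosen here for illustration.

```latex
% For each a in A with f_\xi(a) < Ch_N(a), pick \alpha_a < \kappa
% with f_\xi(a) < Ch_{N_{\alpha_a}}(a), as in the proof.
\begin{align*}
\alpha &:= \sup\{\alpha_a : a \in A,\ f_\xi(a) < \mathrm{Ch}_N(a)\} < \kappa
  && \text{since } |A| < \kappa \text{ and } \kappa \text{ is regular}, \\
f_\xi(a) &< \mathrm{Ch}_{N_{\alpha_a}}(a) \le \mathrm{Ch}_{N_\alpha}(a)
  && \text{since } N_{\alpha_a} \subseteq N_\alpha, \\
\mathrm{Ch}_{N_\alpha}(a) &\le \mathrm{Ch}_N(a)
  && \text{since } N_\alpha \subseteq N.
\end{align*}
```

The second line gives the implication from left to right in the lemma, and the third line gives the converse.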
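Lemma 27.13. Suppose that $A$ is progressive, $\kappa$ is regular, $\lambda \in \mathrm{pcf}(A)$, and $A, \lambda \in N$, where $N$ is a $\kappa$-presentable elementary substructure of $H_\Psi$, with $|A| < \kappa < \min(A)$ and $\Psi$ is big. Suppose that $f = \langle f_\xi : \xi < \lambda \rangle \in N$ is a sequence of functions in $\prod A$ which is persistently cofinal in $\lambda$. Then for every $\xi \ge \mathrm{Ch}_N(\lambda)$ the set
\[ \{a \in A : \mathrm{Ch}_N(a) \le f_\xi(a)\} \]
is a $\lambda$-generator for $A$.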
Proof. Assume the hypothesis, including ChN (). Let be as in Lemma
27.12. We are going to apply Lemma 9.25(ix). Since A, f, N , we may assume that
A, f, N0 , by renumbering the elementary chain if necessary. Now N , and |A| < ,
so we easily see that there is a bijection f N mapping an ordinal < onto A; hence
A N by Lemma 27.5(viii), and so A N for some < . We may assume that A N0 .
By Lemma 27.5(xvi),(viii), there is a -generator Q
B which is in N0 .
Now the sequence f is persistently cofinal in B/J< , and hence
Y
H |= h
B < [h B <J< f B]; hence
Y
N |= h
B < [h B <J< f B];
Q
Hence for every h N , if h
B then there is an < with N Q
such that
N |= [h B <J< f B]; going up, we see that really for every h N A there
is an h N such that for all with h we have h B <J< f B. Since , as
given in Q
the statement of the Lemma, is each member of N , hence h for each
h N A, we see that
Q
(1)
h B <J< f B for every h N A.
Now we can apply (1) to h = ChN , since this function is clearly in N . So ChN
B <J< [B] f B. Hence by the choice of (see Lemma 27.12)
(2)

ChN B J< [B] f B.

Note that (2) says that B\{a A : ChN (a) f (a)} J< [A].
Now
/ pcf(A\B) by
Q Lemma 9.25(ii), and hence J< [A\B] = J [A\B]. So by
Theorem 9.8 we see Q
that (A\B)/J< [A\B] is + -directed, so hf (A\B) : < i has
an upper bound h (A\B). We may assume that h N , by the usual argument. Hence
f (A\B) <J< [A\B] h < ChN (A\B);
hence {a A\B : ChN (a) f (a)} J< [A], and together with (2) and using Lemma
9.25(ix) this finishes the proof.
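Now suppose that $A$ is progressive, $\delta$ is a limit ordinal, $f = \langle f_\xi : \xi < \delta \rangle$ is a sequence of members of $\prod A$, $|A|^+ \le \mathrm{cf}(\delta) < \min(A)$, and $E$ is a club of $\delta$ of order type $\mathrm{cf}(\delta)$. Then we define
\[ h_E = \sup\{f_\xi : \xi \in E\}. \]
We call $h_E$ the supremum along $E$ of $f$. Thus $h_E \in \prod A$, since $\mathrm{cf}(\delta) < \min(A)$. Note that if $E_1 \subseteq E_2$ then $h_{E_1} \le h_{E_2}$.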
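The two remarks about the supremum along a club can be checked pointwise. The following is a short sketch, assuming that the members of $A$ are regular cardinals, that $\delta$ denotes the length of the sequence $f$, and that $a$ ranges over $A$; these letter names are chosen here for illustration.

```latex
% Pointwise form of the supremum along E.
\begin{align*}
h_E(a) &= \sup\{f_\xi(a) : \xi \in E\}, \\
h_E(a) &< a
  && \text{since } |E| = \mathrm{cf}(\delta) < \min(A) \le a
     \text{ and } a \text{ is regular}, \\
E_1 \subseteq E_2 \ &\Longrightarrow\
  h_{E_1}(a) = \sup_{\xi \in E_1} f_\xi(a)
  \le \sup_{\xi \in E_2} f_\xi(a) = h_{E_2}(a).
\end{align*}
```

The second line is the reason the supremum stays inside the product: fewer than $a$ many ordinals below a regular cardinal $a$ have supremum below $a$.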

Lemma 27.14. Let $A, \delta, f$ be as above. Then there is a unique function $g$ in $\prod A$ such that the following two conditions hold.
(i) There is a club $C$ of $\delta$ of order type $\mathrm{cf}(\delta)$ such that $g = h_C$.
(ii) If $E$ is any club of $C$ of order type $\mathrm{cf}(\delta)$, then $g \le h_E$.
Proof. Clearly such a function $g$ is unique if it exists.
Now suppose that there is no such function $g$. Then for every club $C$ of $\delta$ of order type $\mathrm{cf}(\delta)$ there is a club $D$ of order type $\mathrm{cf}(\delta)$ such that $h_C \not\le h_D$, hence $h_C \not\le h_{C \cap D}$. Hence there is a decreasing sequence $\langle E_\alpha : \alpha < |A|^+ \rangle$ of clubs of $\delta$ such that for every $\alpha < |A|^+$ we have $h_{E_\alpha} \not\le h_{E_{\alpha+1}}$. Now note that
\[ |A|^+ = \bigcup_{a \in A} \{\alpha < |A|^+ : h_{E_\alpha}(a) > h_{E_{\alpha+1}}(a)\}. \]
Hence there is an $a \in A$ such that $M \overset{\mathrm{def}}{=} \{\alpha < |A|^+ : h_{E_\alpha}(a) > h_{E_{\alpha+1}}(a)\}$ has size $|A|^+$. Now $h_{E_\alpha}(a) \ge h_{E_\beta}(a)$ whenever $\alpha < \beta < |A|^+$, so this gives an infinite decreasing sequence of ordinals, contradiction.
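The counting step at the end of this proof can be written out as below. This is a sketch, assuming the decreasing sequence of clubs is indexed as $\langle E_\alpha : \alpha < |A|^+ \rangle$ and writing $M_a$ (a name introduced here) for the set attached to $a \in A$.

```latex
\begin{align*}
|A|^+ &= \bigcup_{a \in A} M_a, \qquad
  M_a := \{\alpha < |A|^+ : h_{E_\alpha}(a) > h_{E_{\alpha+1}}(a)\}, \\
|M_a| &= |A|^+ \ \text{for some } a \in A
  && \text{since a union of } |A| \text{ sets of size} \le |A|
     \text{ has size} \le |A|, \\
\alpha < \beta \text{ in } M_a \ &\Longrightarrow\
  h_{E_\alpha}(a) > h_{E_{\alpha+1}}(a) \ge h_{E_\beta}(a)
  && \text{since } E_\beta \subseteq E_{\alpha+1}.
\end{align*}
```

So the values $h_{E_\alpha}(a)$ for $\alpha \in M_a$ form a strictly decreasing sequence of ordinals, which is impossible.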
The function $g$ of this lemma is called the minimal club-obedient bound of $f$.
Corollary 27.15. Suppose that $A$ is progressive, $\delta$ is a limit ordinal, $f = \langle f_\xi : \xi < \delta \rangle$ is a sequence of members of $\prod A$, $|A|^+ \le \mathrm{cf}(\delta) < \min(A)$, $J$ is an ideal on $A$, and $f$ is $<_J$-increasing. Let $g$ be the minimal club-obedient bound of $f$. Then $g$ is a $\le_J$-bound for $f$.
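Now suppose that $A$ is progressive, $\lambda \in \mathrm{pcf}(A)$, and $\kappa$ is a regular cardinal such that $|A| < \kappa < \min(A)$. We say that $f = \langle f_\xi : \xi < \lambda \rangle$ is $\kappa$-minimally obedient for $\lambda$ iff $f$ is a universal sequence for $\lambda$ and for every $\delta < \lambda$ of cofinality $\kappa$, $f_\delta$ is the minimal club-obedient bound of $\langle f_\xi : \xi < \delta \rangle$.
A sequence $f$ is minimally obedient for $\lambda$ iff $|A|^+ < \min(A)$ and $f$ is $\kappa$-minimally obedient for $\lambda$ for every regular $\kappa$ such that $|A| < \kappa < \min(A)$.
Lemma 27.16. Suppose that $|A|^+ < \min(A)$ and $\lambda \in \mathrm{pcf}(A)$. Then there is a minimally obedient sequence for $\lambda$.
Proof. By Theorem 9.18, let $\langle f^0_\xi : \xi < \lambda \rangle$ be a universal sequence for $\lambda$. Now by induction we define functions $f_\xi$ for $\xi < \lambda$. Let $f_0 = f^0_0$, and choose $f_{\xi+1}$ so that $\max(f_\xi, f^0_\xi) < f_{\xi+1}$.
For limit $\delta < \lambda$ such that $|A| < \mathrm{cf}(\delta) < \min(A)$, let $f_\delta$ be the minimal club-obedient bound of $\langle f_\xi : \xi < \delta \rangle$.
For other limit $\delta < \lambda$, use the $\lambda$-directedness (Theorem 9.8) to get $f_\delta$ as a $<_{J_{<\lambda}}$-bound of $\langle f_\xi : \xi < \delta \rangle$.
Thus we have assured the minimally obedient property, and it is clear that $\langle f_\xi : \xi < \lambda \rangle$ is universal.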
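Lemma 27.17. Suppose that $A$ is progressive, and $\kappa$ is a regular cardinal such that $|A| < \kappa < \min(A)$. Also assume the following:
(i) $\lambda \in \mathrm{pcf}(A)$.
(ii) $f = \langle f_\xi : \xi < \lambda \rangle$ is a $\kappa$-minimally obedient sequence for $\lambda$.
(iii) $N$ is a $\kappa$-presentable elementary substructure of $H_\Psi$, with $\Psi$ large, such that $\lambda, f, A \in N$.
Then the following conditions hold:
(iv) For every $\gamma \in (\overline{N} \cap \lambda) \setminus N$ we have:
(a) $\mathrm{cf}(\gamma) = \kappa$.
(b) There is a club $C$ of $\gamma$ of order type $\kappa$ such that $f_\gamma = \sup\{f_\xi : \xi \in C\}$ and $C \subseteq N$.
(c) $f_\gamma(a) \in \overline{N} \cap a$ for every $a \in A$.
(v) If $\gamma = \mathrm{Ch}_N(\lambda)$, then:
(a) $\gamma \in \overline{N} \setminus N$; hence we let $C$ be as in (iv)(b), with $f_\gamma = \sup\{f_\xi : \xi \in C\}$.
(b) $f_\xi \in N$ for each $\xi \in C$.
(c) $f_\gamma \le (\mathrm{Ch}_N \restriction A)$.
(vi) If $\gamma = \mathrm{Ch}_N(\lambda)$ and $C$ is as in (iv)(b), with $f_\gamma = \sup\{f_\xi : \xi \in C\}$, and $B$ is a $\lambda$-generator, then for every $h \in N \cap \prod A$ there is a $\xi \in C$ such that $(h \restriction B) <_{J_{<\lambda}} (f_\xi \restriction B)$.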
Proof. Assume (i)(iii). Note that A N , by Lemma 27.5(ix).
For (iv), suppose also that N \N . Then by Lemma 27.7 we have cf() = ,
and there is a club E in of order type such that E N . By (ii), we have f = fC for
some club C of of order type . By the minimally obedient property we have fC = fCE ,
and thus we may assume that C E. For any C and a A we have f (a) N by
Lemma 27.5(viii). So (iv) holds.
For (v), suppose that = ChN (). Then N \N because |N | = < min(A) .
For each C we have f N by Lemma 27.5(viii). For (c), if a A, then f (a) =
supC f (a) ChN (a), since f (a) N a for all C.
Next, assume the hypotheses of (vi). By Lemma 27.11, f is persistently cofinal in ,
so by Lemma 27.13, B is a -generator. By Lemma 9.25(v) there is a C such that
h B <J< f B . Now B =J< [A] B by Lemma 9.25(xi), so
{a B : h(a) f (b)} (B\B ) {a B : h(a) f (b)} J< [A].
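We now define some abbreviations.
$H_1(A, \kappa, N, \Psi)$ abbreviates
$A$ is a progressive set of regular cardinals, $\kappa$ is a regular cardinal such that $|A| < \kappa < \min(A)$, and $N$ is a $\kappa$-presentable elementary substructure of $H_\Psi$, with $\Psi$ big and $A \in N$.
$H_2(A, \kappa, N, \Psi, \lambda, f, \gamma)$ abbreviates
$H_1(A, \kappa, N, \Psi)$, $\lambda \in \mathrm{pcf}(A)$, $f = \langle f_\xi : \xi < \lambda \rangle$ is a sequence of members of $\prod A$, $f \in N$, and $\gamma = \mathrm{Ch}_N(\lambda)$.
$P_1(A, \kappa, N, \Psi, \lambda, f, \gamma)$ abbreviates
$H_2(A, \kappa, N, \Psi, \lambda, f, \gamma)$ and $\{a \in A : \mathrm{Ch}_N(a) \le f_\gamma(a)\}$ is a $\lambda$-generator.
$P_2(A, \kappa, N, \Psi, \lambda, f, \gamma)$ abbreviates
$H_2(A, \kappa, N, \Psi, \lambda, f, \gamma)$ and the following hold:
(i) $f_\gamma \le (\mathrm{Ch}_N \restriction A)$.
(ii) For every $h \in N \cap \prod A$ there is a $d \in N \cap \prod A$ such that for any $\lambda$-generator $B$, $(h \restriction B) <_{J_{<\lambda}} (d \restriction B)$ and $d \le f_\gamma$.
Thus $H_1(A, \kappa, N, \Psi)$ is part of the hypothesis of Lemma 27.17, and $H_2(A, \kappa, N, \Psi, \lambda, f, \gamma)$ is a part of the hypotheses of Lemma 27.17(v).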
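Lemma 27.18. If $H_2(A, \kappa, N, \Psi, \lambda, f, \gamma)$ holds and $f$ is persistently cofinal for $\lambda$, then $P_1(A, \kappa, N, \Psi, \lambda, f, \gamma)$ holds.
Proof. This follows immediately from Lemma 27.13.
Lemma 27.19. If $H_2(A, \kappa, N, \Psi, \lambda, f, \gamma)$ holds and $f$ is $\kappa$-minimally obedient for $\lambda$, then both $P_1(A, \kappa, N, \Psi, \lambda, f, \gamma)$ and $P_2(A, \kappa, N, \Psi, \lambda, f, \gamma)$ hold.
Proof. Since $f$ is $\kappa$-minimally obedient for $\lambda$, it is a universal sequence for $\lambda$, by definition. Hence by Lemma 27.11 $f$ is persistently cofinal for $\lambda$, and so property $P_1$ follows from Lemma 27.18.
For $P_2$, note that $\lambda, A \in N$ since $f \in N$, by Lemma 27.5(vii),(ix). Hence the hypotheses of Lemma 27.17(v) hold. So (i) in $P_2$ holds by Lemma 27.17(v)(c). For condition (ii), suppose that $h \in N \cap \prod A$. Take $B$ and $C$ as in Lemma 27.17(vi), and choose $\xi \in C$ such that $h \restriction B <_{J_{<\lambda}} f_\xi \restriction B$. Let $d = f_\xi$. Then $d \in N$ by Lemma 27.17(v)(b), and $d \le f_\gamma$ because $f_\gamma = \sup\{f_\xi : \xi \in C\}$. Clearly this proves condition (ii).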
The following obvious extension of Lemma 27.19 will be useful below.
Lemma 27.20. Assume H1 (A, , N, ), and also assume that = ChN () and
def

(i) f = hf : pcf(A)i is a sequence of sequences hf : < i each of which is a


-minimally obedient for .
Then for each N pcf(A), P1 (A, , N, , , f , ) and P2 (A, , N, , , f , ) hold.
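Lemma 27.21. Suppose that $P_1(A, \kappa, N, \Psi, \lambda, f, \gamma)$ and $P_2(A, \kappa, N, \Psi, \lambda, f, \gamma)$ hold. Then
(i) $\{a \in A : \mathrm{Ch}_N(a) = f_\gamma(a)\}$ is a $\lambda$-generator.
(ii) If $\lambda = \max(\mathrm{pcf}(A))$, then
\[ <(f_\gamma, \mathrm{Ch}_N \restriction A) = \{a \in A : f_\gamma(a) < \mathrm{Ch}_N(a)\} \in J_{<\lambda}[A]. \]
Proof. By (i) of $P_2(A, \kappa, N, \Psi, \lambda, f, \gamma)$ we have $f_\gamma \le (\mathrm{Ch}_N \restriction A)$, so (i) holds by $P_1(A, \kappa, N, \Psi, \lambda, f, \gamma)$. (ii) follows from $P_1(A, \kappa, N, \Psi, \lambda, f, \gamma)$ and Lemma 9.25(xii).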
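Lemma 27.22. Assume that $P_1(A, \kappa, N, \Psi, \lambda, f, \gamma)$ and $P_2(A, \kappa, N, \Psi, \lambda, f, \gamma)$ hold. Let
\[ b = \{a \in A : \mathrm{Ch}_N(a) = f_\gamma(a)\}. \]
Then
(i) $b$ is a $\lambda$-generator.
(ii) There is a set $b' \subseteq b$ such that:
(a) $b' \in N$;
(b) $b \setminus b' \in J_{<\lambda}[A]$;
(c) $b'$ is a $\lambda$-generator.
Proof. (i) holds by Lemma 27.21(i). For (ii), by Lemma 27.12 choose $\alpha < \kappa$ such that, for every $a \in A$,
(1) $f_\gamma(a) < \mathrm{Ch}_N(a)$ iff $f_\gamma(a) < \mathrm{Ch}_{N_\alpha}(a)$.
Now by (i) of $P_2(A, \kappa, N, \Psi, \lambda, f, \gamma)$ we have $f_\gamma \le (\mathrm{Ch}_N \restriction A)$. Hence by (1) we see that for every $a \in A$,
(2) $a \in b$ iff $\mathrm{Ch}_{N_\alpha}(a) \le f_\gamma(a)$.
Now by (ii) of $P_2(A, \kappa, N, \Psi, \lambda, f, \gamma)$ applied to $h = \mathrm{Ch}_{N_\alpha} \restriction A$, there is a $d \in N \cap \prod A$ such that the following conditions hold:
(3) $(\mathrm{Ch}_{N_\alpha} \restriction b) <_{J_{<\lambda}} (d \restriction b)$.
(4) $d \le f_\gamma$.
Now we define
\[ b' = \{a \in A : \mathrm{Ch}_{N_\alpha}(a) \le d(a)\}. \]
Clearly $b' \in N$. Also, by (3),
\[ b \setminus b' = \{a \in b : d(a) < \mathrm{Ch}_{N_\alpha}(a)\} \in J_{<\lambda}, \]
and so (ii)(b) holds. Thus $b \subseteq_{J_{<\lambda}} b'$. If $a \in b'$, then $\mathrm{Ch}_{N_\alpha}(a) \le d(a) \le f_\gamma(a)$ by (4), so $a \in b$ by (2). Thus $b' \subseteq b$. Now (ii)(c) holds by Lemma 9.25(ix).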
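The passage from (1) to (2) in this proof is a chain of equivalences. The sketch below writes it out, assuming $\gamma = \mathrm{Ch}_N(\lambda)$ is the index from the hypotheses and $\alpha$ is the ordinal chosen via Lemma 27.12; the letter names are chosen here for illustration.

```latex
\begin{align*}
a \in b
 &\iff \mathrm{Ch}_N(a) = f_\gamma(a)
   && \text{definition of } b \\
 &\iff \neg\bigl(f_\gamma(a) < \mathrm{Ch}_N(a)\bigr)
   && \text{since } f_\gamma \le \mathrm{Ch}_N \restriction A \\
 &\iff \neg\bigl(f_\gamma(a) < \mathrm{Ch}_{N_\alpha}(a)\bigr)
   && \text{by (1)} \\
 &\iff \mathrm{Ch}_{N_\alpha}(a) \le f_\gamma(a).
\end{align*}
```

The point of the rewriting is that the final condition mentions only $N_\alpha$, which is what allows the set built from it to be an element of $N$.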
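Lemma 27.23. Assume $H_1(A, \kappa, N, \Psi)$ and $A \in N$. Suppose that $\langle f^\lambda : \lambda \in \mathrm{pcf}(A) \rangle \in N$ is an array of sequences $\langle f^\lambda_\xi : \xi < \lambda \rangle$ with each $f^\lambda_\xi \in \prod A$. Also assume that for every $\lambda \in N \cap \mathrm{pcf}(A)$, both $P_1(A, \kappa, N, \Psi, \lambda, f^\lambda, \gamma(\lambda))$ and $P_2(A, \kappa, N, \Psi, \lambda, f^\lambda, \gamma(\lambda))$ hold.
Then there exist cardinals $\lambda_0 > \lambda_1 > \cdots > \lambda_n$ in $\mathrm{pcf}(A) \cap N$ such that
\[ (\mathrm{Ch}_N \restriction A) = \sup\{f^{\lambda_0}_{\gamma(\lambda_0)}, \ldots, f^{\lambda_n}_{\gamma(\lambda_n)}\}. \]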

Proof. We will define by induction a descending sequence of cardinals i pcf(A)N


and sets Ai P(A) N (strictly decreasing under inclusion as i grows) such that if Ai 6=
then i = max(pcf(Ai )) and
(1)

0
i
(ChN (A\Ai+1 )) = sup{(f(
(A\Ai+1 )), . . . , (f(
(A\Ai+1 ))}.
0)
i)

Since the cardinals are decreasing, there is a first i such that Ai+1 = , and then the lemma
is proved. To start, A0 = A and 0 = max(pcf(A)). Clearly 0 N . Now suppose that
i and Ai are defined, with Ai 6= 0. By Lemma 27.22(i) and Lemma 9.25(x), the set
i
{a A (i + 1) : ChN (a) = f(
(a)}
i)

is a i -generator. Hence by Lemma 27.22(ii) we get another i -generator bi such that


(2) bi N .
i
(a)}.
(3) bi {a A (i + 1) : ChN (a) = f(
i)

Note that bi 6= . Let Ai+1 = Ai \bi . Thus Ai+1 N . Furthermore,


(4) A\Ai+1 = (A\Ai ) b1 .
Now by Lemma 9.25(ii) and i = max(pcf(Ai )) we have i
/ pcf(Ai+1 ). If Ai+1 6= , we
let i+1 = max(pcf(Ai+1 )). Now by (i) of P2 (A, , N, , , f j , (j )) we have

j
(5) f(
(ChN A) for all j i.
j)

Now suppose that a A\Ai+1 . If a Ai , then by (4), a b1 , and so by (3), ChN (a) =
i
f(
(a), and (1) holds for a. If a
/ Ai , then A 6= Ai , so i 6= 0. Hence by the inductive
1)
hypothesis for (1),
i1
0
(a)},
(a), . . . , f(
ChN (a) = sup{f(
i1 )
0)
and (1) for a follows by (5).
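The cofinality of $([\mu]^\kappa, \subseteq)$
First we give some simple properties of the sets $[\mu]^\kappa$, not involving pcf theory.
Proposition 27.24. If $\kappa \le \mu$ are infinite cardinals, then
\[ (*) \qquad |[\mu]^\kappa| = \mathrm{cf}([\mu]^\kappa, \subseteq) \cdot 2^\kappa. \]
Proof. Let $\lambda = \mathrm{cf}([\mu]^\kappa, \subseteq)$, and let $\langle Y_i : i < \lambda \rangle$ be an enumeration of a cofinal subset of $([\mu]^\kappa, \subseteq)$. For each $i < \lambda$ let $f_i$ be a bijection from $Y_i$ to $\kappa$. Now the inequality $\ge$ in $(*)$ is clear. For the other direction, we define an injection $g$ of $[\mu]^\kappa$ into $\lambda \times \mathcal{P}(\kappa)$, as follows. Given $E \in [\mu]^\kappa$, let $i < \lambda$ be minimum such that $E \subseteq Y_i$, and define $g(E) = (i, f_i[E])$. Clearly $g$ is one-one.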
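Both inequalities behind Proposition 27.24 can be laid out as a short computation. This is a sketch, writing $\lambda$ for the cofinality in question, $\kappa \le \mu$ for the two infinite cardinals, and $g$ for the injection from the proof; the letter names are chosen here for illustration.

```latex
% \lambda = cf([\mu]^\kappa, \subseteq), with \kappa \le \mu infinite.
\begin{align*}
|[\mu]^\kappa| &\le |\lambda \times \mathcal{P}(\kappa)| = \lambda \cdot 2^\kappa
  && \text{via the injection } g, \\
\lambda &\le |[\mu]^\kappa|
  && \text{a cofinal family is a subset of } [\mu]^\kappa, \\
2^\kappa &= |[\kappa]^\kappa| \le |[\mu]^\kappa|
  && \text{since } [\kappa]^\kappa \subseteq [\mu]^\kappa, \\
\lambda \cdot 2^\kappa &= \max(\lambda, 2^\kappa) \le |[\mu]^\kappa|.
\end{align*}
```

The first line is the direction proved via $g$; the remaining lines justify the direction the proof calls clear.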
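Proposition 27.25. (i) If $\kappa_1 < \kappa_2 \le \mu$, then
\[ \mathrm{cf}([\mu]^{\kappa_1}, \subseteq) \le \mathrm{cf}([\mu]^{\kappa_2}, \subseteq) \cdot \mathrm{cf}([\kappa_2]^{\kappa_1}, \subseteq). \]
(ii) $\mathrm{cf}([\kappa^+]^\kappa, \subseteq) = \kappa^+$.
(iii) If $\kappa^+ \le \mu$, then $\mathrm{cf}([\mu]^\kappa, \subseteq) \le \mathrm{cf}([\mu]^{\kappa^+}, \subseteq) \cdot \kappa^+$.
(iv) If $\kappa \le \mu_1 < \mu_2$, then $\mathrm{cf}([\mu_1]^\kappa, \subseteq) \le \mathrm{cf}([\mu_2]^\kappa, \subseteq)$.
(v) If $\kappa \le \mu$, then $\mathrm{cf}([\mu^+]^\kappa, \subseteq) \le \mathrm{cf}([\mu]^\kappa, \subseteq) \cdot \mu^+$.
(vi) $\mathrm{cf}([\aleph_0]^{\aleph_0}, \subseteq) = 1$, while for $m \in \omega \setminus 1$, $\mathrm{cf}([\aleph_m]^{\aleph_0}, \subseteq) = \aleph_m$.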
(vii) cf([] , ) = cf([] , ).
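Proof. (i): Let $M \subseteq [\mu]^{\kappa_2}$ be cofinal in $([\mu]^{\kappa_2}, \subseteq)$ of size $\mathrm{cf}([\mu]^{\kappa_2}, \subseteq)$, and let $N \subseteq [\kappa_2]^{\kappa_1}$ be cofinal in $([\kappa_2]^{\kappa_1}, \subseteq)$ of size $\mathrm{cf}([\kappa_2]^{\kappa_1}, \subseteq)$. For each $X \in M$ let $f_X : \kappa_2 \to X$ be a bijection. It suffices now to show that $\{f_X[Y] : X \in M, Y \in N\}$ is cofinal in $([\mu]^{\kappa_1}, \subseteq)$. Suppose that $W \in [\mu]^{\kappa_1}$. Choose $X \in M$ such that $W \subseteq X$. Then $f_X^{-1}[W] \in [\kappa_2]^{\kappa_1}$, so there is a $Y \in N$ such that $f_X^{-1}[W] \subseteq Y$. Then $W \subseteq f_X[Y]$, as desired.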
+
(ii): The set { < + : |\| = } isSclearly cofinal in
S ([ ] . If M is a nonempty
+
+
subset of [ ] of size less than , then | M | = , and ( M ) + 1 is a member of [+ ]
not covered by any member of M . So (ii) holds.
(iii): Immediate from (i) and (ii).
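(iv): Let $M \subseteq [\mu_2]^\kappa$ be cofinal of size $\mathrm{cf}([\mu_2]^\kappa, \subseteq)$. Let $N = \{X \cap \mu_1 : X \in M\} \setminus [\mu_1]^{<\kappa}$. It suffices to show that $N$ is cofinal in $([\mu_1]^\kappa, \subseteq)$. Suppose that $X \in [\mu_1]^\kappa$. Then also $X \in [\mu_2]^\kappa$, so we can choose $Y \in M$ such that $X \subseteq Y$. Clearly $X \subseteq Y \cap \mu_1 \in N$, as desired.
(v): For each $\gamma \in [\mu, \mu^+)$ let $f_\gamma$ be a bijection from $\gamma$ to $\mu$. Let $E \subseteq [\mu]^\kappa$ be cofinal in $([\mu]^\kappa, \subseteq)$ and of size $\mathrm{cf}([\mu]^\kappa, \subseteq)$. It suffices to show that $\{f_\gamma^{-1}[X] : \gamma \in [\mu, \mu^+), X \in E\}$ is cofinal in $([\mu^+]^\kappa, \subseteq)$. So, take any $Y \in [\mu^+]^\kappa$. Choose $\gamma \in [\mu, \mu^+)$ such that $Y \subseteq \gamma$. Then $f_\gamma[Y] \in [\mu]^\kappa$, so we can choose $X \in E$ such that $f_\gamma[Y] \subseteq X$. Then $Y \subseteq f_\gamma^{-1}[X]$, as desired.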
0
(vi): Clearly cf([0 ]0 , ) = 1. By induction it is clear from (v) that cf([
m ] ) m .
S
For m > 0 equality must hold,Ssince if X [m ]0 and |X| < m , then X < m , and
no denumerable subset of m \ X is contained in a member of X.
(vii): Clear.

The following elementary lemmas will also be needed.


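Lemma 27.26. If $\alpha < \beta$ are limit ordinals, then
\[ |[\alpha, \beta]| = |\{\gamma : \alpha < \gamma < \beta,\ \gamma \text{ a successor ordinal}\}|. \]
Proof. For every $\gamma \in [\alpha, \beta)$ let $f(\gamma) = \gamma + 1$. Then $f$ is a one-one function from $[\alpha, \beta)$ onto $\{\gamma : \alpha < \gamma < \beta,\ \gamma \text{ a successor ordinal}\}$.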
Lemma 27.27. If < with limit, then
|[, ]| = |{ : , a successor ordinal}|.

Proof. Write = + m with limit and m . Then


[, ] = [, + ) [ + , ] (, ],
and the desired conclusion follows easily from Lemma 27.26.
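Theorem 27.28. Suppose that $\mu$ is singular and $\kappa < \mu$ is an uncountable regular cardinal such that $A \overset{\mathrm{def}}{=} (\kappa, \mu)_{\mathrm{reg}}$ has size $< \kappa$. Then
\[ \mathrm{cf}([\mu]^\kappa, \subseteq) = \max(\mathrm{pcf}(A)). \]
Proof. Note by the progressiveness of $A$ that every limit cardinal in the interval $(\kappa, \mu)$ is singular, and hence every member of $A$ is a successor cardinal.
First we prove $\ge$. Suppose to the contrary that $\mathrm{cf}([\mu]^\kappa, \subseteq) < \max(\mathrm{pcf}(A))$. For brevity write $\max(\mathrm{pcf}(A)) = \lambda$. Let $\{X_i : i \in I\} \subseteq [\mu]^\kappa$ be cofinal and of cardinality less than $\lambda$. Pick a universal sequence $\langle f_\xi : \xi < \lambda \rangle$ for $\lambda$ by Theorem 9.18. For every $\xi < \lambda$, $\mathrm{rng}(f_\xi)$ is a subset of $\mu$ of size $\le |A| \le \kappa$, and hence $\mathrm{rng}(f_\xi)$ is covered by some $X_i$. Thus $\lambda = \bigcup_{i \in I} \{\xi < \lambda : \mathrm{rng}(f_\xi) \subseteq X_i\}$, so by $|I| < \lambda$ and the regularity of $\lambda$ we get an $i \in I$ such that $|\{\xi < \lambda : \mathrm{rng}(f_\xi) \subseteq X_i\}| = \lambda$. Now define for any $a \in A$,
\[ h(a) = \sup(a \cap X_i). \]
Since $\kappa < a$ for each $a \in A$, we have $h \in \prod A$. Now the sequence $\langle f_\xi : \xi < \lambda \rangle$ is cofinal in $\prod A$ under $<_{J_{<\lambda}}$ by Lemma 9.25(v),(iv). Since the set of $\xi$ with $\mathrm{rng}(f_\xi) \subseteq X_i$ is unbounded in $\lambda$, there is such a $\xi < \lambda$ with $h <_{J_{<\lambda}} f_\xi$. Thus there is an $a \in A$ such that $h(a) < f_\xi(a) \in X_i$, contradicting the definition of $h$.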
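Second we prove $\le$, by exhibiting a cofinal subset of $[\mu]^\kappa$ of size at most $\max(\mathrm{pcf}(A))$. Take $N$ and $\Psi$ so that $H_1(A, \kappa, N, \Psi)$. Let $\mathcal{M}$ be the set of all $\kappa$-presentable elementary substructures $M$ of $H_\Psi$ such that $A \in M$, and let
\[ F = \{M \cap \mu : M \in \mathcal{M}\} \setminus [\mu]^{<\kappa}. \]
Since $|M| = \kappa$, we have $|M \cap \mu| \le \kappa$, and so $M \cap \mu \in F$ iff $|M \cap \mu| = \kappa$.
(1) $F$ is cofinal in $[\mu]^\kappa$.
In fact, for any $X \in [\mu]^\kappa$ we can find $M \in \mathcal{M}$ such that $X \subseteq M$, and (1) follows.
By (1) it suffices to prove that $|F| \le \max(\mathrm{pcf}(A))$.
Claim. If $M, N \in \mathcal{M}$ are such that $\mathrm{Ch}_M \restriction A = \mathrm{Ch}_N \restriction A$, then $M \cap \mu = N \cap \mu$.
For, if $\nu^+$ is a successor cardinal $\le \mu$, then $\sup(M \cap \nu^+) = \mathrm{Ch}_M(\nu^+) = \mathrm{Ch}_N(\nu^+) = \sup(N \cap \nu^+)$. So the claim holds by Theorem 27.9.
Now for each $M \in \mathcal{M}$, let $g(M)$ be the sequence $\langle (\lambda_0, \gamma_0), \ldots, (\lambda_n, \gamma_n) \rangle$ given by Lemma 27.23. Clearly the range of $g$ has size $\le \max(\mathrm{pcf}(A))$. Now for each $X \in F$, choose $M_X \in \mathcal{M}$ such that $X = M_X \cap \mu$. Then for $X, Y \in F$ and $X \ne Y$ we have $M_X \cap \mu \ne M_Y \cap \mu$, hence by the claim $\mathrm{Ch}_{M_X} \restriction A \ne \mathrm{Ch}_{M_Y} \restriction A$, and hence by Lemma 27.23, $g(M_X) \ne g(M_Y)$. This proves that $|F| \le \max(\mathrm{pcf}(A))$.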
Corollary 27.29. Let A = {m : 1 < m < }. Then for any m we have
cf([ ]m ) = max(pcf(A)).
Proof. Immediate from Lemma 9.1(vi) and Theorem 27.28.

Elevations and transitive generators


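We start with some simple general notions about cardinals. If $B$ is a set of cardinals, then a walk in $B$ is a sequence $\lambda_0 > \lambda_1 > \cdots > \lambda_n$ of members of $B$. Such a walk is necessarily finite. Given cardinals $\lambda_0 > \lambda$ in $B$, a walk from $\lambda_0$ to $\lambda$ is a walk as above with $\lambda_n = \lambda$. We denote by $F_{\lambda_0, \lambda}(B)$ the set of all walks from $\lambda_0$ to $\lambda$.
Now suppose that $A$ is progressive and $\lambda_0 \in \mathrm{pcf}(A)$. A special walk from $\lambda_0$ to $\lambda_n$ in $\mathrm{pcf}(A)$ is a walk $\lambda_0 > \cdots > \lambda_n$ in $\mathrm{pcf}(A)$ such that $\lambda_i \in A$ for all $i > 0$. We denote by $F'_{\lambda_0, \lambda}(A)$ the collection of all special walks from $\lambda_0$ to $\lambda$ in $\mathrm{pcf}(A)$.
Next, suppose in addition that $f \overset{\mathrm{def}}{=} \langle f^\lambda : \lambda \in \mathrm{pcf}(A) \rangle$ is a sequence of sequences, where each $f^\lambda$ is a sequence $\langle f^\lambda_\xi : \xi < \lambda \rangle$ of members of $\prod A$. If $\lambda_0 > \cdots > \lambda_n$ is a special walk in $\mathrm{pcf}(A)$, and $\gamma_0 \in \lambda_0$, then we define an associated sequence of ordinals by setting
\[ \gamma_{i+1} = f^{\lambda_i}_{\gamma_i}(\lambda_{i+1}) \]
for all $i < n$. Note that $\gamma_i < \lambda_i$ for all $i = 0, \ldots, n$. Then we define
\[ \mathrm{El}_{\lambda_0, \ldots, \lambda_n}(\gamma_0) = \gamma_n. \]
Now we define the elevation of the sequence $f$, denoted by $f^e \overset{\mathrm{def}}{=} \langle f^{\lambda, e} : \lambda \in \mathrm{pcf}(A) \rangle$, by setting, for any $\lambda_0 \in \mathrm{pcf}(A)$, any $\gamma_0 \in \lambda_0$, and any $\lambda \in A$,
\[ f^{\lambda_0, e}_{\gamma_0}(\lambda) = \begin{cases} f^{\lambda_0}_{\gamma_0}(\lambda) & \text{if } \lambda_0 \le \lambda, \\ \max(\{\mathrm{El}_{\lambda_0, \ldots, \lambda_n}(\gamma_0) : (\lambda_0, \ldots, \lambda_n) \in F'_{\lambda_0, \lambda}\}) & \text{if } \lambda < \lambda_0 \text{ and this maximum exists}, \\ f^{\lambda_0}_{\gamma_0}(\lambda) & \text{if } \lambda < \lambda_0, \text{ otherwise.} \end{cases} \]
Note here that the superscript $e$ is only notational, standing for elevated.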
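To make the recursion behind an elevation concrete, here is a sketch for a special walk of length two. It assumes the walk is written $\lambda_0 > \lambda_1 > \lambda_2$ with $\lambda_1, \lambda_2 \in A$, the starting ordinal is $\gamma_0 < \lambda_0$, and the array is indexed as $f^\lambda_\xi$; these names are chosen here for illustration.

```latex
% Special walk \lambda_0 > \lambda_1 > \lambda_2 and starting ordinal \gamma_0 < \lambda_0.
\begin{align*}
\gamma_1 &= f^{\lambda_0}_{\gamma_0}(\lambda_1) < \lambda_1, \\
\gamma_2 &= f^{\lambda_1}_{\gamma_1}(\lambda_2) < \lambda_2, \\
\mathrm{El}_{\lambda_0,\lambda_1,\lambda_2}(\gamma_0) &= \gamma_2
  = f^{\lambda_1}_{f^{\lambda_0}_{\gamma_0}(\lambda_1)}(\lambda_2).
\end{align*}
```

For comparison, the one-step walk $\lambda_0 > \lambda_2$ yields $\mathrm{El}_{\lambda_0,\lambda_2}(\gamma_0) = f^{\lambda_0}_{\gamma_0}(\lambda_2)$, so the elevated value at $\lambda_2$, when the maximum exists, is at least the original value.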

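Lemma 27.30. Assume the above notation. Then $f^{\lambda_0}_{\gamma_0} \le f^{\lambda_0, e}_{\gamma_0}$ for all $\lambda_0 \in \mathrm{pcf}(A)$ and all $\gamma_0 \in \lambda_0$.
Proof. Take any $\gamma_0 \in \lambda_0$ and any $\lambda \in A$. If $\lambda_0 \le \lambda$, then $f^{\lambda_0, e}_{\gamma_0}(\lambda) = f^{\lambda_0}_{\gamma_0}(\lambda)$. Suppose that $\lambda < \lambda_0$. If the above maximum does not exist, then again $f^{\lambda_0, e}_{\gamma_0}(\lambda) = f^{\lambda_0}_{\gamma_0}(\lambda)$. Suppose the maximum exists. Now $(\lambda_0, \lambda) \in F'_{\lambda_0, \lambda}(A)$, so
\[ f^{\lambda_0}_{\gamma_0}(\lambda) = \mathrm{El}_{\lambda_0, \lambda}(\gamma_0) \le \max(\{\mathrm{El}_{\lambda_0, \ldots, \lambda_n}(\gamma_0) : (\lambda_0, \ldots, \lambda_n) \in F'_{\lambda_0, \lambda}\}) = f^{\lambda_0, e}_{\gamma_0}(\lambda). \]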
Lemma 27.31. Suppose that A is progressive, is a regular cardinal such that
def
|A| < < min(A), and f = hf : pcf(A)i is a sequence of sequences f such that f
is -minimally obedient for . Assume also H1 (A, , N, ) and f N .
Then also f e N .
Proof. The proof is a more complicated instance of our standard procedure for going
from V to H to N and then back. We sketch the details.
Assume the hypotheses. In particular, A N . Hence also pcf(A) N . Also, |A| < ,
so A N . Now clearly F N . Also, El N . (Note that El depends upon A.) Then by
absoluteness,
H |= g g is a function, dmn(g) = pcf(A) 0

f00 ()

max({El0 ,...,n (0 ) : (0 , . . . , n ) F 0 , })
g() =

0
f0 ()

pcf(A)0 0 A
if 0 ,
if < 0 ,
and this maximum exists,
if < 0 , otherwise.

Now the usual procedure can be applied.


Lemma 27.32. Suppose that A is progressive, is a regular cardinal such that
def
|A| < < min(A), and f = hf : pcf(A)i is a sequence of sequences f such that f
is -minimally obedient for . Assume H1 (A, , N, ) and f N .
Suppose that 0 pcf(A) N , and let 0 = ChN (0 ).
(i) If 0 > > n is a special walk in pcf(A), and 1 , . . . , n are formed as above,
then i N for all i = 0, . . . , n.
(ii) For every A 0 we have f00 ,e () N .
Proof. (i): By Lemma 27.17(iv)(c), f00 () N , and (i) follows by induction using
Lemma 27.17(iv)(c).
(ii): immediate from (i).
Lemma 27.33. Assume the hypotheses of Lemma 27.32. Then
(i) For any special walk 0 > > n = in F 0 , , we have
El0 ,...,n (0 ) ChN ().
(ii) f00 ,e ChN A for every 0 < 0 .
(iii) If there is a special walk 0 > > n = in F 0 , such that
El0 ,...,n (0 ) = ChN (),
then
ChN () = f00 ,e ().
(iv) Suppose that ChN () = f00 ,e () = . If there is an a A such that f,e (a) =
ChN (a), then also f00 ,e (a) = ChN (a).
Proof. (i) is immediate from Lemma 27.32(i) and Lemma 27.8(iii). (ii) and (iii) follow
from (i). For (iv), by Lemma 27.32(i) and (i) there are special walks 0 > > n =
and = 0 > > m = a such that
f00 ,e () = ChN () = El0 ,...,n (0 )
f,e (a) = ChN (a) = El0 ,...,m (a).
and

It follows that
El0 ,...,n ,1 ,...,a (0 ) = ChN (a),
and (iii) then gives f00 ,e (a) = ChN (a).
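Definition. Suppose that $A$ is progressive and $A \subseteq P \subseteq \mathrm{pcf}(A)$. A system $\langle b_\lambda : \lambda \in P \rangle$ of subsets of $A$ is transitive iff for all $\lambda \in P$ and all $\mu \in b_\lambda$ we have $b_\mu \subseteq b_\lambda$.
Theorem 27.34. Suppose that $H_1(A, \kappa, N, \Psi)$, and $f = \langle f^\lambda : \lambda \in \mathrm{pcf}(A) \rangle$ is a system of functions, and each $f^\lambda$ is $\kappa$-minimally obedient for $\lambda$. Let $f^e$ be the derived elevated array. For every $\lambda_0 \in \mathrm{pcf}(A) \cap N$ put $\gamma_0 = \mathrm{Ch}_N(\lambda_0)$ and define
\[ b_{\lambda_0} = \{a \in A : \mathrm{Ch}_N(a) = f^{\lambda_0, e}_{\gamma_0}(a)\}. \]
Then the following hold for each $\lambda_0 \in \mathrm{pcf}(A) \cap N$:
(i) $b_{\lambda_0}$ is a $\lambda_0$-generator.
(ii) There is a $b'_{\lambda_0} \subseteq b_{\lambda_0}$ such that
(a) $b_{\lambda_0} \setminus b'_{\lambda_0} \in J_{<\lambda_0}[A]$.
(b) $b'_{\lambda_0} \in N$ (each one individually, not the sequence).
(c) $b'_{\lambda_0}$ is a $\lambda_0$-generator.
(iii) The system $\langle b_\lambda : \lambda \in \mathrm{pcf}(A) \cap N \rangle$ is transitive.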
Proof. Note that H2 (A, , N, , 0 , f 0 ,e , 0 ) holds by Lemma 27.31. By definition,
minimally obedient implies universal, so f 0 is persistently cofinal by Lemma 27.11. Hence
by Lemma 27.24, f 0 ,e is persistently cofinal, and so P1 (A, , N, , 0 , f 0 ,e , 0 ) holds by
Lemma 27.18. Also, by Lemma 27.19 P2 (A, , N, , 0, f 0 , 0 ) holds, so the condition
P2 (A, , N, , 0, f 0 ,e , 0 ) holds by Lemmas 27.30 and 27.33(ii). Now (i) and (ii) hold by
Lemma 27.22.
Now suppose that 0 pcf(A) N and b0 . Thus
ChN () = f00 ,e (),
where 0 = ChN (0 ). Write = ChN (). We want to show that b b0 . Take any
a b . So ChN (a) = f,e (a). By Lemma 27.33(iv) we get f00 ,e (a) = ChN (a), so a b0 ,
as desired.
Localization
Theorem 27.35. Suppose that $A$ is a progressive set. Then there is no subset $B \subseteq \mathrm{pcf}(A)$ such that $|B| = |A|^+$ and, for every $b \in B$, $b > \max(\mathrm{pcf}(B \cap b))$.
Proof. Assume the contrary. We may assume that $|A|^+ < \min(A)$. In fact, if we know the result under this assumption, and now $|A|^+ = \min(A)$, suppose that $B \subseteq \mathrm{pcf}(A)$ with $|B| = |A|^+$ and $\forall b \in B\,[b > \max(\mathrm{pcf}(B \cap b))]$. Let $A' = A \setminus \{|A|^+\}$. Then let $B' = B \setminus \{|A|^+\}$. So by Proposition 9.1(vi) we have $B' \subseteq \mathrm{pcf}(A')$. Clearly $|B'| = |A'|^+$ and $\forall b \in B'\,[b > \max(\mathrm{pcf}(B' \cap b))]$, contradiction.
Also, clearly we may assume that $B$ has order type $|A|^+$.
Let E = A B. Then |E| < min(E). Let = |E|. By Lemma 27.16, we get an
array hf : pcf(E)i, with each f -minimally obedient for . Choose N and so
that H1 (A, , N, ), with N containing A, B, E, hf : pcf(E)i. Now let hb :
pcf(E) N i be the set of transitive generators as guaranteed by Theorem 27.34. Let
b N be such that b b and b \b J< .
Now let F be the function with domain {a A : B(a b )} such that for each
such a, F (a) is the least B such that a b . Define B0 = { B : a dmn(F )(
F (a)}. Thus B0 is an initial segment of B of size at most |A|. Clearly B0 N . We let
0 = min(B\B0 ); so B0 = B 0 .
Now we claim
(1) There exists a finite descending sequence 0 > > n of cardinals in N pcf(B0 )
such that B0 b0 . . . bn .
We prove more: we find a finite descending sequence 0 > > n of cardinals in
N pcf(B0 ) such that B0 b0 . . . bn . Let 0 = max(pcf(B0 )). Since B0 N ,
def

we clearly have 0 N and hence b0 N . So B1 = B0 \b0 N . Now suppose that


Bk B0 has been defined so that Bk N . If Bk = , the construction stops. Suppose that
def
Bk 6= . Let k = max(pcf(Bk )). Clearly k N , so bk N and B+1 = Bk \bk N .
Since B+1 = Bk \bk and bk is a k -generator, from Lemma 9.25(xii) it follows that
0 > 1 > . So the construction eventually stops; say that Bn+1 = . So Bn bn . So
B0 b0 (B0 \b0 )
= b0 B1
b0 b1 B2
.........
b0 b1 . . . Bn
b0 b1 . . . bn .
This proves (1).
Note that 0 > max(pcf(B 0 ) = max(pcf(B0 )) 0 , . . . , n by the initial assumption of the proof. Next, we claim
(2) b0 b0 . . . bn .
To prove this, first note that b0 A B0 . For, b0 E by definition, and E = A B;
b0 B = B0 , so indeed b0 A B0 . Also, B0 b0 . . . bn . So it suffices to prove
that b0 A b0 . . . bn .
Consider any cardinal a b0 A. Since 0 B, we have a dmn(F ), and since
0
/ B0 we have F (a) < 0 . Let = F (a). So a b , and < 0 , so by the minimality
of 0 , B0 . Since B0 b0 . . . bn , it follows that bi for some i = 0, . . . , n. But
transitivity implies that b bi , and hence a bi , as desired. So (2) holds.
Using Lemma 9.1(iii), by (2) we have
pcf(b0 ) pcf(b0 ) . . . pcf(bn ),
and hence by Lemma 9.25(vii) we get 0 = max(pcf(b0 )) max{i : i = 0, . . . , n} < 0 ,


contradiction.
Theorem 27.36. (Localization) Suppose that $A$ is a progressive set of regular cardinals. Suppose that $B \subseteq \mathrm{pcf}(A)$ is also progressive. Then for every $\lambda \in \mathrm{pcf}(B)$ there is a $B_0 \subseteq B$ such that $|B_0| \le |A|$ and $\lambda \in \mathrm{pcf}(B_0)$.
Proof. We prove by induction on that if A and B satisfy the hypotheses of the
theorem, then the conclusion holds. Let C be a -generator over B. Thus C B and
= max(pcf(C)) by Lemma 9.25(vii). Now C pcf(A) and C is progressive. It suffices
to find B0 C with |B0 | |A| and pcf(B0 ).
Let C0 = C and 0 = . Suppose that C0 Ci and 0 > > i have
been constructed so that = max(pcf(Ci )) and Ci is a -generator over B. If there is
no maximal element of pcf(Ci ) we stop the construction. Otherwise, let i+1 be that
maximum element, let Di+1 be a i+1 -generator over B, and let Ci+1 = Ci \Di+1 . Now
Di+1 Ji+1 [B] J< [B], so Ci+1 is still a -generator of B by Lemma 9.25(ix), and
= max(pcf(Ci+1 )) by Lemma 9.25(vii). Note that i+1
/ pcf(Ci+1 ), by Lemma 9.25(ii).
This construction must eventually stop, when Ci does not have a maximal element;
we fix the index i.
(1) There is an E pcf(Ci ) such that |E| |A| and pcf(E).
In fact, suppose that no such E exists. We now construct a strictly increasing sequence
hj : j < |A|+ i of elements of pcf(Ci ) such that k > max(pcf({j : j < k}i for all
k < |A|+ . (This contradicts Theorem 27.35.) Suppose that {j : j < k} = E has been
defined. Now
/ pcf(E) by the supposition after (1), and < max(pcf(E)) is impossible
since pcf(E) pcf(Ci ) and = max(pcf(Ci )). So > max(pcf(E)). Hence, because Ci
does not have a maximal element, we can choose k Ci such that k > max(pcf(E)),
as desired. Hence (1) holds.
We take E as in (1). Apply the inductive hypothesis to each E and to A, E
S in place
of A, B; we get a set G E such that |G | |A| and pcf(G ). Let H = E G .
Note that |H| |A|. Thus E pcf(H). Since pcf(E) pcf(H) by Theorem 9.15, we
have pcf(H), completing the inductive proof.
The size of pcf(A)
Theorem 27.37. If $A$ is a progressive interval of regular cardinals, then $|\mathrm{pcf}(A)| < |A|^{+4}$.
Proof. Assume that $A$ is a progressive interval of regular cardinals but $|\mathrm{pcf}(A)| \ge |A|^{+4}$. Let $\rho = |A|$. We will define a set $B$ of size $\rho^+$ consisting of cardinals in $\mathrm{pcf}(A)$ such that each cardinal $b$ in $B$ is greater than $\max(\mathrm{pcf}(B \cap b))$. This will contradict Theorem 27.35.
Let $S = S^{\rho^{+3}}_{\rho^+}$; so $S$ is a stationary subset of $\rho^{+3}$. By Theorem 8.1, let $\langle C_k : k \in S \rangle$ be a club guessing sequence. Thus
(1) $C_k$ is a club in $k$ of order type $\rho^+$, for each $k \in S$.
(2) If $D$ is a club in $\rho^{+3}$, then there is a $k \in D \cap S$ such that $C_k \subseteq D$.
Let be the ordinal such that = sup(A). Now pcf(A) is an interval of regular cardinals
by Theorem 9.13. So pcf(A) contains all regular cardinals in the set {+ : < +4 }.
Now we are going to define a strictly increasing continuous sequence hi : i < +3 i of
ordinals less than +4 .
1. Let 0 = +3 .
S
2. For i limit let i = j<i j .
3. Now suppose that j has been defined for all j i; we define i+1 . For each k S
(+)
(+)
let ek = {+j : j Ck (i + 1)}. Thus ek is a subset of pcf(A). If max(pcf(ek )) <
(+)
++4 , let k be an ordinal such that max(pcf(ek )) < +k and k < +4 ; otherwise
let k = 0. Let i+1 be greater than i and all k for k S, with i+1 < +4 . This is
possible because |S| = +3 . Thus
(+)

(+)

(3) For every k S, if max(pcf(ek )) < ++4 , then max(pcf(ek )) < +i+1 .
This finishes the definition of the sequence hi : i < +3 i. Let D = {i : i < +3 }, and
let = sup(D). Then D is club in . Let = + . Thus has cofinality +3 , and
it is singular since > 0 = +3 . Now we apply Corollary 9.35: there is a club C0 in
(+)
such that + = max(pcf(C0 )). We may assume that C0 [ , ). so we can write
C0 = {+i : i D0 } for some club D0 in . Let D1 = D0 D. So D1 is a club of . Let
E = {i +3 : i D1 }. It is clear that E is a club in +3 . So by (2) choose k E S
such that Ck E. Let Ck = { Ck : there is a largest Ck such that < }. Set
+

B = {+
+i : i Ck }. We claim that B is as desired. Clearly |B| = .

Take any j Ck . We want to show that


+
+
+j > max(pcf(B +j )).

()

Let i Ck be largest such that i < j. So i + 1 j. We consider the definition given above
of i+1 . We defined ek = {+l : l Ck (i + 1)}. Now
(+)

(4) B +
+j ek .
+

For, if b B +
+j , we can write b = +l with l Ck and l < j. Hence l i and so
(+)

b = +
+l ek . So (4) holds.
Now if l Ck (i + 1), then l E, and so l D1 D0 . Hence +l C0 . This
(+)
(+)
(+)
(+)
shows that ek C0 . So max(pcf(ek )) max(pcf(C0 )) = + < ++4 . Hence by
(+)
(3) we get max(pcf(ek )) < +i+1 . So
(+)

max(pcf(B +
+j )) max(pcf(ek )) by (4)
< +
+i+1
+
+j ,
which proves ().
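Theorem 27.38. If $\aleph_\delta$ is a singular cardinal such that $\delta < \aleph_\delta$, then
\[ \mathrm{cf}([\aleph_\delta]^{|\delta|}, \subseteq) < \aleph_{|\delta|^{+4}}. \]
Proof. Let $\kappa = |\delta|^+$ and $A = (\kappa, \aleph_\delta)_{\mathrm{reg}}$. By Proposition 27.25(iii) and Theorem 27.28,
\[ \mathrm{cf}([\aleph_\delta]^{|\delta|}, \subseteq) \le \max(|\delta|^+, \mathrm{cf}([\aleph_\delta]^{|\delta|^+}, \subseteq)) \le \max(|\delta|^+, \max(\mathrm{pcf}(A))). \]
Hence it suffices to show that $\max(\mathrm{pcf}(A)) < \aleph_{|\delta|^{+4}}$.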
By Theorem 27.37, |pcf(A)| < |A|+4 . Write max(pcf(A)) = and = . We want
to show that < ||+4 . Now pcf(A) = (, max(pcf(A))]reg = ( , ]reg . By Lemma
27.27, |(, )| = |pcf(A)| < |A|+4 ||+4 . Also, = = ||+ < ||+4 . So
|| < ||+4 , and hence < ||+4 .
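Theorem 27.39. If $\delta$ is a limit ordinal, then
\[ \aleph_\delta^{\mathrm{cf}(\delta)} < \max\left( \left(|\delta|^{\mathrm{cf}(\delta)}\right)^+, \aleph_{|\delta|^{+4}} \right). \]
Proof. If $\delta = \aleph_\delta$, then $|\delta| = \aleph_\delta$ and the conclusion is obvious. So assume that $\delta < \aleph_\delta$. Now
\[ (1) \qquad \aleph_\delta^{\mathrm{cf}(\delta)} \le |\delta|^{\mathrm{cf}(\delta)} \cdot \mathrm{cf}([\aleph_\delta]^{|\delta|}, \subseteq). \]
In fact, let $B \subseteq [\aleph_\delta]^{|\delta|}$ be cofinal and of size $\mathrm{cf}([\aleph_\delta]^{|\delta|}, \subseteq)$. Now $\mathrm{cf}(\delta) \le |\delta|$, so
\[ [\aleph_\delta]^{\mathrm{cf}(\delta)} = \bigcup_{Y \in B} [Y]^{\mathrm{cf}(\delta)}, \]
and (1) follows. Hence the theorem follows by Theorem 27.38.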
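The counting that turns the covering of $[\aleph_\delta]^{\mathrm{cf}(\delta)}$ into inequality (1) can be written out as follows. This is a sketch, assuming $B$ is the cofinal family fixed in the proof and using the standard fact that $|[\mu]^\kappa| = \mu^\kappa$ for infinite cardinals $\kappa \le \mu$.

```latex
% B \subseteq [\aleph_\delta]^{|\delta|} cofinal, |B| = cf([\aleph_\delta]^{|\delta|}, \subseteq).
\begin{align*}
\aleph_\delta^{\mathrm{cf}(\delta)}
  &= \bigl|[\aleph_\delta]^{\mathrm{cf}(\delta)}\bigr|
   = \Bigl|\bigcup_{Y \in B} [Y]^{\mathrm{cf}(\delta)}\Bigr| \\
  &\le \sum_{Y \in B} |Y|^{\mathrm{cf}(\delta)}
   = |B| \cdot |\delta|^{\mathrm{cf}(\delta)}
   = |\delta|^{\mathrm{cf}(\delta)} \cdot
     \mathrm{cf}([\aleph_\delta]^{|\delta|}, \subseteq).
\end{align*}
```

Each $Y \in B$ has size $|\delta|$, which is why every summand equals $|\delta|^{\mathrm{cf}(\delta)}$.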



Corollary 27.40. $\aleph_\omega^{\aleph_0} < \max\left( (2^{\aleph_0})^+, \aleph_{\omega_4} \right)$.
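The corollary is the instance of Theorem 27.39 at the limit ordinal $\omega$. The sketch below records the substitutions, writing $\delta$ for the ordinal in the theorem (a letter chosen here for illustration).

```latex
% Theorem 27.39 with \delta = \omega.
\begin{align*}
\mathrm{cf}(\omega) &= \omega = \aleph_0, \qquad |\omega| = \aleph_0, \\
|\omega|^{\mathrm{cf}(\omega)} &= \aleph_0^{\aleph_0} = 2^{\aleph_0}, \\
|\omega|^{+4} &= \aleph_4 = \omega_4, \\
\aleph_\omega^{\aleph_0} &< \max\bigl((2^{\aleph_0})^+,\ \aleph_{\omega_4}\bigr).
\end{align*}
```

In particular, if $2^{\aleph_0} < \aleph_\omega$ then the maximum is $\aleph_{\omega_4}$, so $\aleph_\omega^{\aleph_0} < \aleph_{\omega_4}$.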
